Portfolio

Projects with purpose.

These aren't just things I built — they're problems I solved. Each one has a story, measurable impact, and lessons that shaped how I approach the next challenge.

Generative AI · Databricks · Enterprise

AI-Powered HR Analytics Assistant

FrieslandCampina · 2025 – Present

HR data is rich, complex, and often locked behind technical barriers. Business users needed a way to ask questions about workforce data in plain language — without waiting for an analyst. At the same time, we needed to handle sensitive employee data responsibly.

70% · Query accuracy in testing
Self-Service · Natural language to insights
Privacy-First · Local LLM exploration

The Challenge

Stakeholders across HR, finance, and leadership needed quick answers from workforce data — headcount trends, retention patterns, diversity metrics. But these requests often bottlenecked through a small analytics team. We needed a solution that was both powerful and responsible.

The Approach

I architected a solution using Databricks Genie as the conversational interface layer, backed by structured HR data models. For sensitive use cases such as anonymising employee feedback, I explored locally hosted open-source models so that the data never leaves our environment.

Key Outcomes

Built a Genie-based assistant achieving 70% accuracy on natural-language HR queries in initial testing
Reduced dependency on manual reporting for common workforce questions
Established a proof-of-concept for privacy-preserving AI using local open-source LLMs for sensitive data
Improved Databricks environment usability and governance for the broader analytics team
Databricks Genie · Python · Azure · Open-Source LLMs · SQL

Data Engineering · Cost Optimisation · Azure

Data Lake Modernisation & Cost Optimisation

RoyalHaskoningDHV · 2023 – 2024

Our corporate data lake was growing expensive and slow. Storage costs were ballooning, and processing times were hurting downstream reporting. It was time for a fundamental rethink of how we stored and accessed data.

~90% · Reduction in Azure storage costs
Faster · Significantly improved processing
Single Source of Truth · For company-wide reporting

The Challenge

Data was stored in various formats across the Azure Data Lake. Some processes were brittle, redundant, and expensive to maintain. The finance team was asking questions about cloud costs, and the analytics team was waiting too long for data refreshes.

The Approach

I led the migration of core corporate datasets to Parquet, a columnar storage format that compresses well and speeds up reads that touch only a subset of columns. I also rebuilt fragile ETL automations in Python, replacing manual, error-prone processes with robust, testable pipelines.
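As a rough illustration of what "robust, testable" can mean for a single ETL step, the sketch below validates and normalises raw rows before they would be written out. It uses only the Python standard library. The field names (`project_id`, `hours`) and the `clean_rows` helper are hypothetical and not taken from the actual pipelines, and the Parquet write itself is left out.

```python
import math
from typing import Iterable


def clean_rows(raw_rows: Iterable[dict]) -> tuple[list[dict], list[dict]]:
    """Split raw rows into (clean, rejected).

    Hypothetical schema: every row needs a non-empty `project_id`
    and an `hours` value that parses as a finite, non-negative number.
    """
    clean, rejected = [], []
    for row in raw_rows:
        project_id = (row.get("project_id") or "").strip()
        try:
            hours = float(row.get("hours", ""))
        except (TypeError, ValueError):
            hours = None
        if not project_id or hours is None or not math.isfinite(hours) or hours < 0:
            # Keep bad rows for inspection instead of dropping them silently.
            rejected.append(row)
            continue
        clean.append({"project_id": project_id.upper(), "hours": hours})
    return clean, rejected


# Illustrative input, shaped like rows from csv.DictReader.
sample = [
    {"project_id": " ab12 ", "hours": "7.5"},
    {"project_id": "", "hours": "3"},        # missing id: rejected
    {"project_id": "cd34", "hours": "n/a"},  # unparseable hours: rejected
]
clean, rejected = clean_rows(sample)
```

Because the step is a pure function over rows, it can be unit-tested without touching the data lake, which is one common way to make a pipeline less fragile than a manual process.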

Key Outcomes

Migrated corporate data to Parquet format, cutting Azure Data Lake storage costs by nearly 90%
Improved processing speed significantly, enabling faster downstream reporting and analytics
Established a single source of truth for company-wide reporting and KPI tracking
Replaced fragile manual processes with robust Python-based ETL workflows
Python · Azure Data Lake · Parquet · Alteryx · Power BI · SQL

Decision Support · Sustainability · Research

Energy Renovation Decision Support Tool

Woonbedrijf / TU/e · 2020 – 2022

In the Netherlands, social housing makes up 31% of the housing stock, and by law, 70% of tenants must consent before energy renovations can proceed. Housing associations needed to understand what tenants actually wanted rather than assume it.

31% · Of Dutch housing is social housing
70% · Tenant consent required by law
Featured · On TU/e website

The Challenge

Two key challenges emerged: understanding diverse tenant preferences for different renovation measures, and giving housing associations a practical tool to evaluate renovation packages based on those preferences. The goal was to accelerate the energy transition while respecting tenant voice.

The Approach

I designed a stated preference survey to collect structured data on tenant priorities. Using discrete choice models in R, I quantified how different tenant segments valued various renovation features. The insights were packaged into an interactive R Shiny web application that lets housing association decision-makers visually explore trade-offs.
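For readers unfamiliar with discrete choice models, here is a minimal sketch of the multinomial logit form such models commonly take: each renovation package gets a utility from its attributes, and choice probabilities follow from a softmax over those utilities. The original analysis was done in R; this sketch is in Python purely for illustration, and the attribute names and coefficient values are invented, not estimates from the study.

```python
import math

# Hypothetical taste coefficients (NOT estimates from the study).
COEFFICIENTS = {
    "rent_increase_eur": -0.04,  # a higher monthly rent increase lowers utility
    "energy_saving_eur": 0.03,   # a higher monthly energy saving raises utility
    "nuisance_weeks": -0.10,     # longer construction nuisance lowers utility
}


def utility(package: dict) -> float:
    """Linear-in-attributes utility of one renovation package."""
    return sum(COEFFICIENTS[name] * package[name] for name in COEFFICIENTS)


def choice_probabilities(packages: list[dict]) -> list[float]:
    """Multinomial logit: P(i) = exp(V_i) / sum_j exp(V_j)."""
    utilities = [utility(p) for p in packages]
    max_u = max(utilities)  # subtract the max for numerical stability
    weights = [math.exp(u - max_u) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Two made-up renovation packages to compare.
packages = [
    {"rent_increase_eur": 20, "energy_saving_eur": 40, "nuisance_weeks": 4},
    {"rent_increase_eur": 10, "energy_saving_eur": 15, "nuisance_weeks": 1},
]
probabilities = choice_probabilities(packages)
```

In a real study the coefficients would be estimated from the survey responses, possibly separately per tenant segment; a decision-support tool can then re-run a calculation like this for any package a user configures.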

Key Outcomes

Quantified tenant preferences using rigorous discrete choice modelling methodology
Built an interactive R Shiny web application for visual evaluation of renovation packages
Project featured on TU/e's Smart Buildings & Cities programme website
Contributed to evidence-based policy for sustainable housing renovation
R · R Shiny · Discrete Choice Models · Survey Design · Data Visualisation

Supply Chain · Optimisation · Operations

Overseas Supply Chain Optimisation

MAHLE · 2016

Before data became my profession, I was already using it to solve operational problems. At MAHLE, I applied systematic analysis to a logistics challenge that was eating into margins.

~15% · Freight cost reduction
Milk Routes · Designed for efficiency
Feasibility Study · For new EGR line

The Challenge

MAHLE's overseas supply chain was suffering from high freight costs and inefficient transportation routes. The logistics team needed fresh eyes on route design and freight negotiations.

The Approach

I analysed supplier bases, customer locations, child part lists, volume consumption, bin sizes, distances, and transportation costs. Based on this data, I designed optimised milk routes and presented negotiation points to freight forwarders.
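A milk route (or milk run) is a single round trip that collects from several suppliers instead of sending one truck per supplier. The sketch below orders stops with a simple nearest-neighbour heuristic on straight-line distances. It only illustrates the idea and is not the method used at MAHLE: the coordinates are made up, the function names are hypothetical, and volumes, bin sizes, and freight rates are ignored.

```python
import math

Point = tuple[float, float]


def distance(a: Point, b: Point) -> float:
    """Straight-line distance between two points."""
    return math.dist(a, b)


def milk_route(depot: Point, suppliers: dict[str, Point]) -> list[str]:
    """Start at the depot, repeatedly visit the nearest unvisited supplier, then return."""
    route = ["DEPOT"]
    position = depot
    remaining = dict(suppliers)  # copy so the caller's dict is not modified
    while remaining:
        nearest = min(remaining, key=lambda name: distance(position, remaining[name]))
        route.append(nearest)
        position = remaining.pop(nearest)
    route.append("DEPOT")
    return route


def route_length(route: list[str], depot: Point, suppliers: dict[str, Point]) -> float:
    """Total length of a route given as a list of stop names."""
    points = [depot if stop == "DEPOT" else suppliers[stop] for stop in route]
    return sum(distance(p, q) for p, q in zip(points, points[1:]))


# Made-up coordinates for illustration only.
depot = (0.0, 0.0)
suppliers = {"A": (1.0, 0.0), "B": (5.0, 0.0), "C": (2.0, 0.0)}

route = milk_route(depot, suppliers)
combined = route_length(route, depot, suppliers)
separate = sum(2 * distance(depot, p) for p in suppliers.values())
```

For these made-up points the combined route covers 10 distance units against 16 for three separate round trips. Nearest-neighbour is a quick heuristic and does not guarantee the shortest possible route.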

Key Outcomes

Reduced freight and transportation costs by approximately 15% through route optimisation
Negotiated improved freight terms with overseas forwarders using data-backed arguments
Conducted feasibility analysis for a new EGR line in Noida in collaboration with BITS Pilani interns
Learned that data-driven decision-making works in any domain, not just software
Excel · Route Optimisation · Data Collection · Negotiation · Teamwork