Predicting French wine vintage quality using 10+ years of weather data, deep learning models, and geospatial preprocessing.
This project demonstrates end-to-end ownership of a machine learning pipeline, from raw data acquisition to model deployment and visualization.
It showcases expertise in data engineering, scraping, and merging complex datasets, along with geospatial processing to link wine regions with weather stations.
Applying deep learning to tabular data with modern architectures (MLP, FT-Transformer, TabNet) highlights practical ML engineering skills that carry over to real-world problems.
This work is relevant to roles in AI, ML engineering, and data science, including at large technology companies, because it shows the ability to take an ambiguous problem through to a deployable solution.
High-level architecture: categorical embeddings + numeric climate features → Feature Builder → MLP or FT-Transformer → quality class prediction.
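The Feature Builder stage in the architecture line above can be pictured as a lookup-and-concatenate step. The following is only a minimal sketch under assumptions: the column names (`region`, `gdd`, `rain_mm`), the embedding size, and the normalization statistics are all hypothetical, and the embeddings are random stand-ins for vectors that a real model would learn during training.

```python
import random

def make_embedding_table(categories, dim, seed=0):
    """Give each category a small random vector (a stand-in for a learned embedding)."""
    rng = random.Random(seed)
    return {c: [rng.gauss(0.0, 0.1) for _ in range(dim)] for c in categories}

def build_features(row, embedding_tables, numeric_stats):
    """Concatenate categorical embeddings with standardized numeric features."""
    features = []
    for column, table in embedding_tables.items():
        features.extend(table[row[column]])
    for column, (mean, std) in numeric_stats.items():
        features.append((row[column] - mean) / std)
    return features

# Hypothetical example row: one categorical column and two climate columns.
region_table = make_embedding_table(["Bordeaux", "Bourgogne", "Alsace"], dim=4)
row = {"region": "Bordeaux", "gdd": 1450.0, "rain_mm": 380.0}
vector = build_features(
    row,
    embedding_tables={"region": region_table},
    numeric_stats={"gdd": (1400.0, 100.0), "rain_mm": (400.0, 50.0)},
)
print(len(vector))  # 4 embedding values + 2 numeric values
```

The resulting flat vector is what an MLP would consume directly; an FT-Transformer would typically keep one token per feature instead of concatenating.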
Extract wine ratings, vintage years, and region data from the Vivino platform.
Match wine regions to official AOC (Appellation d'Origine Contrôlée) designations using fuzzy string matching.
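As an illustration of the fuzzy matching step, here is a minimal sketch that uses only the standard library's `difflib`. The project may well rely on a different matcher; the `normalize` and `match_aoc` helpers, the 0.8 cutoff, and the short AOC list are assumptions made for this example.

```python
import difflib
import unicodedata

def normalize(name):
    """Lowercase, strip accents, and treat hyphens as spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(stripped.lower().replace("-", " ").split())

def match_aoc(region, aoc_names, cutoff=0.8):
    """Return the closest official AOC name, or None if nothing is similar enough."""
    by_normalized = {normalize(name): name for name in aoc_names}
    hits = difflib.get_close_matches(
        normalize(region), list(by_normalized), n=1, cutoff=cutoff
    )
    return by_normalized[hits[0]] if hits else None

# A tiny illustrative subset of AOC names.
AOC_NAMES = ["Saint-Émilion", "Pauillac", "Châteauneuf-du-Pape", "Chablis"]

print(match_aoc("saint emilion", AOC_NAMES))
print(match_aoc("Pauilac", AOC_NAMES))
print(match_aoc("Napa Valley", AOC_NAMES))
```

Returning `None` below the cutoff keeps unmatched regions visible for manual review instead of silently assigning them to the wrong appellation.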
Clean and standardize Météo-France weather station data, handling missing values and outliers.
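One possible way to handle missing values and outliers in a daily series is sketched below. It assumes that readings outside a plausible physical range are sensor errors and that short gaps can be filled by linear interpolation; the bounds and the `clean_series` helper are hypothetical, not the project's actual rules.

```python
def clean_series(values, lower, upper):
    """Treat out-of-range readings as missing, then fill gaps by linear interpolation."""
    cleaned = [v if v is not None and lower <= v <= upper else None for v in values]
    known = [i for i, v in enumerate(cleaned) if v is not None]
    if not known:
        return cleaned  # nothing to interpolate from
    result = list(cleaned)
    for i, value in enumerate(cleaned):
        if value is not None:
            continue
        prev = max((k for k in known if k < i), default=None)
        nxt = min((k for k in known if k > i), default=None)
        if prev is None:
            result[i] = cleaned[nxt]   # leading gap: carry the first value backward
        elif nxt is None:
            result[i] = cleaned[prev]  # trailing gap: carry the last value forward
        else:
            weight = (i - prev) / (nxt - prev)
            result[i] = cleaned[prev] + weight * (cleaned[nxt] - cleaned[prev])
    return result

# Daily maximum temperatures in °C with one gap, one implausible spike, one trailing gap.
daily_tmax = [21.0, None, 25.0, 99.9, 27.0, None]
print(clean_series(daily_tmax, lower=-40.0, upper=50.0))
```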
Create meaningful climate features including temperature, precipitation, growing degree days, and seasonal aggregations.
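Growing degree days are commonly computed by summing, over the growing season, how far each day's mean temperature exceeds a base temperature. The sketch below assumes a 10 °C base and an April to October season, which are common choices for grapevines in the Northern Hemisphere; the project's exact definition may differ, and the `growing_degree_days` helper and sample records are illustrative.

```python
from datetime import date

def growing_degree_days(daily, base_temp=10.0, season=(4, 10)):
    """Sum max(0, mean daily temperature - base) over the months in `season`.

    `daily` is an iterable of (day, tmin, tmax) tuples with temperatures in °C.
    """
    first_month, last_month = season
    total = 0.0
    for day, tmin, tmax in daily:
        if first_month <= day.month <= last_month:
            total += max(0.0, (tmin + tmax) / 2.0 - base_temp)
    return total

records = [
    (date(2019, 3, 31), 8.0, 18.0),   # outside the season, ignored
    (date(2019, 4, 1), 6.0, 12.0),    # mean 9 °C, below base, contributes 0
    (date(2019, 7, 15), 16.0, 30.0),  # mean 23 °C, contributes 13
    (date(2019, 9, 1), 12.0, 24.0),   # mean 18 °C, contributes 8
]
gdd = growing_degree_days(records)
print(gdd)
```

The same loop structure extends to other seasonal aggregations, such as total rainfall per season, by swapping the per-day contribution.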
Merge wine vintage data with corresponding weather patterns using geospatial joins and temporal alignment.
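The merge step can be sketched as a nearest-station lookup followed by a join on year. This example assumes each region is represented by a single point and is linked to its closest station by great-circle distance; a real pipeline might use polygon geometries and a geospatial library instead. The station ids, approximate coordinates, and weather values below are made up for illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two (lat, lon) points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def nearest_station(lat, lon, stations):
    """Spatial join: pick the station closest to the given point."""
    return min(stations, key=lambda s: haversine_km(lat, lon, s["lat"], s["lon"]))

def attach_weather(vintages, regions, stations, weather_by_station_year):
    """Temporal alignment: join each vintage to its station's features for that year."""
    rows = []
    for vintage in vintages:
        region = regions[vintage["region"]]
        station = nearest_station(region["lat"], region["lon"], stations)
        features = weather_by_station_year.get((station["id"], vintage["year"]))
        if features is None:
            continue  # no weather coverage for this station and year
        rows.append({**vintage, "station_id": station["id"], **features})
    return rows

# Illustrative inputs only: approximate coordinates and invented values.
regions = {"Saint-Émilion": {"lat": 44.89, "lon": -0.16}}
stations = [
    {"id": "bordeaux", "lat": 44.83, "lon": -0.69},
    {"id": "dijon", "lat": 47.27, "lon": 5.09},
]
weather = {("bordeaux", 2018): {"gdd": 1500.0}}
vintages = [
    {"region": "Saint-Émilion", "year": 2018, "rating": 4.2},
    {"region": "Saint-Émilion", "year": 2005, "rating": 4.0},
]
merged = attach_weather(vintages, regions, stations, weather)
print(merged)
```

Vintages without matching weather are dropped here; logging them instead would make coverage gaps easier to audit.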
Train multiple deep learning architectures (MLP, FT-Transformer, TabNet) with cross-validation.
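To make the cross-validation step concrete, here is a small sketch of how folds might be generated. It assumes stratified splitting, so that each fold keeps roughly the same mix of quality classes; the source does not say which scheme is used, and the `stratified_kfold` helper is hypothetical. In practice a library splitter would usually replace it.

```python
import random
from collections import defaultdict

def stratified_kfold(labels, n_splits=5, seed=0):
    """Yield (train_indices, val_indices) pairs with classes spread evenly across folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    folds = [[] for _ in range(n_splits)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin so every fold gets a similar share.
        for position, index in enumerate(indices):
            folds[position % n_splits].append(index)

    for k in range(n_splits):
        val = sorted(folds[k])
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        yield train, val

labels = ["low"] * 6 + ["mid"] * 9 + ["high"] * 3
for train_idx, val_idx in stratified_kfold(labels, n_splits=3):
    # A model (MLP, FT-Transformer, or TabNet) would be fit on train_idx
    # and scored on val_idx here.
    print(len(train_idx), len(val_idx))
```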
Assess model performance with classification metrics and create interactive visualizations for insights.
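The classification metrics mentioned above can be computed as in the following sketch, which derives accuracy, per-class precision, recall, and F1, and macro-averaged F1 from paired label lists. The class names and predictions are invented for the example, and `classification_summary` is a hypothetical helper rather than the project's evaluation code.

```python
def classification_summary(y_true, y_pred):
    """Return (accuracy, macro_f1, per_class) for two equal-length label lists."""
    pairs = list(zip(y_true, y_pred))
    per_class = {}
    for c in sorted(set(y_true) | set(y_pred)):
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        per_class[c] = {"precision": precision, "recall": recall, "f1": f1}
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    macro_f1 = sum(m["f1"] for m in per_class.values()) / len(per_class)
    return accuracy, macro_f1, per_class

# Invented labels for illustration.
y_true = ["high", "high", "mid", "mid", "low", "low"]
y_pred = ["high", "mid", "mid", "mid", "low", "high"]
accuracy, macro_f1, per_class = classification_summary(y_true, y_pred)
print(round(accuracy, 3), round(macro_f1, 3))
```

Macro-averaging weights every quality class equally, which matters when some classes have few vintages.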