Wine Quality Forecasting

Predicting French wine vintage quality using 10+ years of weather data, deep learning models, and geospatial preprocessing.

Why This Project Matters

This project demonstrates end-to-end ownership of a complete machine learning pipeline, from raw data acquisition to model deployment and visualization.

It showcases expertise in data engineering, scraping, and merging complex datasets, along with geospatial processing to link wine regions with weather stations.

The implementation of deep learning on tabular data using modern architectures (MLP, FT-Transformer, TabNet) highlights practical ML engineering skills directly applicable to real-world problems.

This work is relevant to roles in AI, ML engineering, and data science, including at large technology companies. It demonstrates the ability to tackle ambiguous problems and deliver production-ready solutions.

Skills Demonstrated

Machine Learning

  • MLP, FT-Transformer, TabNet architectures
  • Cross-validation and hyperparameter tuning
  • Classification metrics and evaluation
  • Model interpretation and feature importance

Data Engineering

  • Polars and Pandas for data manipulation
  • Feature engineering and selection
  • Time-series data processing
  • Data cleaning and normalization

Scraping & Data Collection

  • Web scraping from the Vivino platform
  • API integration with Météo-France
  • Automated data pipeline construction
  • Robust error handling and retry logic

Geospatial Processing

  • AOC wine region coordinate mapping
  • Nearest weather station identification
  • Fuzzy matching for location data
  • Spatial joins and distance calculations

Visualization

  • Interactive Plotly maps
  • Data analytics dashboards
  • Model performance visualization
  • Feature correlation heatmaps

Model Architecture Preview

[Figure: model architecture diagram]

High-level architecture: categorical embeddings + numeric climate features → Feature Builder → MLP or FT-Transformer → quality class prediction.
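
The flow in the caption above can be sketched in a few lines of plain Python. This is only an illustration of the data path (embedding lookup, concatenation with numeric features, one hidden layer, softmax), not the project's actual model code. The field names, layer sizes, and random weights are all assumptions made for the example.

```python
import math
import random

def build_features(cat_ids, numeric, embeddings):
    """Feature Builder: concatenate one embedding per categorical field with the numeric features."""
    vec = []
    for field, idx in cat_ids.items():
        vec.extend(embeddings[field][idx])  # embedding lookup by category index
    vec.extend(numeric)
    return vec

def dense(x, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mlp_forward(x, params):
    """One hidden ReLU layer followed by a softmax over quality classes."""
    hidden = [max(0.0, h) for h in dense(x, params["w1"], params["b1"])]
    return softmax(dense(hidden, params["w2"], params["b2"]))

def random_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
# Hypothetical sizes: 3 regions, 2 wine colours, embedding width 4,
# 3 numeric climate features, 8 hidden units, 3 quality classes.
embeddings = {"region": random_matrix(3, 4, rng), "color": random_matrix(2, 4, rng)}
x = build_features({"region": 1, "color": 0}, [0.2, -1.1, 0.7], embeddings)
params = {
    "w1": random_matrix(8, len(x), rng), "b1": [0.0] * 8,
    "w2": random_matrix(3, 8, rng), "b2": [0.0] * 3,
}
probs = mlp_forward(x, params)
predicted_class = probs.index(max(probs))
```

In a real training setup the embeddings and weights would be learned jointly, and a deep learning framework would replace the hand-written layers.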

End-to-End Pipeline

01

Vivino Scraping

Extract wine ratings, vintage years, and region data from the Vivino platform.
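
Scraping steps like this one usually need retry logic for transient network failures. The helper below is a minimal sketch of retrying with exponential backoff. `fetch_with_retry`, its parameters, and the injected `fetch` callable are hypothetical names for illustration. No actual Vivino endpoint or page structure is assumed.

```python
import time

def fetch_with_retry(fetch, url, max_attempts=4, base_delay=1.0,
                     retry_on=(OSError,), sleep=time.sleep):
    """Call fetch(url), retrying on network-style errors with exponential backoff.

    OSError covers ConnectionError and TimeoutError in the standard library.
    The delay doubles after each failed attempt: base_delay, 2x, 4x, ...
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch(url)
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error to the caller
            sleep(base_delay * 2 ** (attempt - 1))
```

Passing `fetch` and `sleep` as arguments keeps the helper independent of any particular HTTP client and makes it easy to test without real network calls or waiting.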

02

AOC Fuzzy Matching

Match wine regions to official AOC (Appellation d'Origine Contrôlée) designations using fuzzy string matching.
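
One simple way to do this kind of matching uses the standard library's `difflib`. The sketch below normalizes accents and hyphens before comparing, and returns no match when similarity falls below a cutoff. The `AOC_NAMES` list, the function names, and the 0.8 cutoff are assumptions for illustration. The project may use a different matching library or threshold.

```python
import difflib
import unicodedata

def normalize(name):
    """Lowercase, strip accents, and treat hyphens as spaces."""
    decomposed = unicodedata.normalize("NFKD", name)
    no_accents = "".join(c for c in decomposed if not unicodedata.combining(c))
    return " ".join(no_accents.lower().replace("-", " ").split())

def match_aoc(region, aoc_names, cutoff=0.8):
    """Return the closest official AOC name, or None if nothing is similar enough."""
    lookup = {normalize(n): n for n in aoc_names}
    hits = difflib.get_close_matches(normalize(region), list(lookup), n=1, cutoff=cutoff)
    return lookup[hits[0]] if hits else None

# Small illustrative sample, not the full official AOC list.
AOC_NAMES = ["Saint-Émilion", "Pauillac", "Châteauneuf-du-Pape", "Chablis"]
```

Returning `None` below the cutoff lets unmatched regions be reviewed by hand instead of being silently assigned to the wrong appellation.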

03

Weather Station Cleaning

Clean and standardize Météo-France weather station data, handling missing values and outliers.
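
As a rough sketch of what handling missing values and outliers can look like for one station's daily series: readings outside a plausible physical range are treated as missing, then gaps are filled by linear interpolation between the nearest valid neighbours. The function name and the bounds are assumptions. The source does not say which cleaning rules the project applies.

```python
def clean_series(values, lower, upper):
    """Mask readings outside [lower, upper], then fill gaps by linear interpolation.

    Gaps at the start or end of the series are filled with the nearest valid value.
    """
    masked = [v if v is not None and lower <= v <= upper else None for v in values]
    known = [i for i, v in enumerate(masked) if v is not None]
    if not known:
        return masked  # nothing to interpolate from
    cleaned = list(masked)
    for i, v in enumerate(masked):
        if v is not None:
            continue
        prev = max((k for k in known if k < i), default=None)
        nxt = min((k for k in known if k > i), default=None)
        if prev is None:
            cleaned[i] = masked[nxt]
        elif nxt is None:
            cleaned[i] = masked[prev]
        else:
            frac = (i - prev) / (nxt - prev)
            cleaned[i] = masked[prev] + frac * (masked[nxt] - masked[prev])
    return cleaned
```

For example, `clean_series([12.0, None, 16.0, 99.9, 20.0], -40.0, 50.0)` treats 99.9 as an outlier and fills both gaps from their neighbours.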

04

Climate Feature Engineering

Create meaningful climate features including temperature, precipitation, growing degree days, and seasonal aggregations.
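
Growing degree days (GDD) are commonly computed as the sum over days of max(0, (Tmin + Tmax) / 2 - Tbase). The sketch below uses a 10 °C base and an April to October season, both common conventions in viticulture but assumptions here, since the source does not state the exact definitions used. The record keys (`month`, `tmin`, `tmax`, `precip_mm`) are also hypothetical.

```python
def growing_degree_days(tmin, tmax, base=10.0):
    """Sum of daily max(0, mean temperature - base), temperatures in degrees Celsius."""
    return sum(max(0.0, (lo + hi) / 2 - base) for lo, hi in zip(tmin, tmax))

def seasonal_features(daily, season_months=range(4, 11)):
    """Aggregate daily records into growing-season features for one station and year.

    daily: list of dicts with keys month, tmin, tmax, precip_mm (assumed layout).
    """
    season = [d for d in daily if d["month"] in season_months]
    return {
        "gdd": growing_degree_days([d["tmin"] for d in season],
                                   [d["tmax"] for d in season]),
        "season_precip_mm": sum(d["precip_mm"] for d in season),
        "season_mean_tmax": (sum(d["tmax"] for d in season) / len(season)
                             if season else None),
    }
```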

05

Wine–Weather Integration

Merge wine vintage data with corresponding weather patterns using geospatial joins and temporal alignment.
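
A minimal sketch of this step, assuming each region and station has a latitude and longitude and that weather features are keyed by station and year: find the nearest station by great-circle (haversine) distance, then attach that station's features for the vintage year. The record layouts and function names are hypothetical. A production pipeline would likely use a spatial index or a geospatial library instead of a linear scan.

```python
import math

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def nearest_station(region, stations):
    """Return the station closest to the region's coordinates."""
    return min(stations, key=lambda s: haversine_km(region["lat"], region["lon"],
                                                    s["lat"], s["lon"]))

def join_vintages_with_weather(vintages, regions, stations, weather_by_station_year):
    """Attach weather features to each vintage via (nearest station id, vintage year)."""
    rows = []
    for v in vintages:
        station = nearest_station(regions[v["region"]], stations)
        features = weather_by_station_year.get((station["id"], v["year"]))
        if features is not None:  # skip vintages with no weather coverage
            rows.append({**v, "station_id": station["id"], **features})
    return rows
```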

06

Deep Model Training

Train multiple deep learning architectures (MLP, FT-Transformer, TabNet) with cross-validation.
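
The cross-validation loop can be sketched independently of any particular model. The version below builds stratified folds, so each fold keeps roughly the same mix of quality classes, and averages a score across them. Stratification, the fold count, and the `train_and_score` callback are assumptions for illustration. The source only states that cross-validation is used.

```python
import random
from collections import defaultdict

def stratified_kfold(labels, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs with class proportions roughly preserved per fold."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for pos, i in enumerate(indices):
            folds[pos % k].append(i)  # deal each class round-robin across folds
    for f in range(k):
        val = sorted(folds[f])
        train = sorted(i for g in range(k) if g != f for i in folds[g])
        yield train, val

def cross_validate(train_and_score, labels, k=5, seed=0):
    """Average the score returned by train_and_score(train_idx, val_idx) over k folds."""
    scores = [train_and_score(train, val)
              for train, val in stratified_kfold(labels, k, seed)]
    return sum(scores) / len(scores)
```

Here `train_and_score` stands in for whatever fits an MLP, FT-Transformer, or TabNet model on the training indices and returns a validation metric.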

07

Evaluation & Visualization

Assess model performance with classification metrics and create interactive visualizations for insights.
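
The source does not list which classification metrics are reported. As an illustrative sketch, the code below computes a confusion matrix, accuracy, and macro-averaged F1 for integer class labels. Macro F1 is a common choice when quality classes are imbalanced, because it weights every class equally.

```python
def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    m = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        m[t][p] += 1
    return m

def accuracy(m):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in m)
    return sum(m[i][i] for i in range(len(m))) / total

def macro_f1(m):
    """Unweighted mean of per-class F1 scores, with F1 = 2TP / (2TP + FP + FN)."""
    scores = []
    for c in range(len(m)):
        tp = m[c][c]
        fp = sum(m[r][c] for r in range(len(m))) - tp
        fn = sum(m[c]) - tp
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```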