Alright, so I thought I’d share my little experiment with the Bordeaux wine classification. It was a fun dive into something a bit different for me.

It all started with me just being curious about how these wines get their rankings. I mean, you see these “Grand Cru” and “Premier Cru” labels, and I wanted to understand what’s behind it all. So, I figured, why not try to build something that predicts a wine’s classification from its other attributes?
First, I gathered the data. This was the most tedious part, honestly. I scoured the web for info on Bordeaux wines, things like grape varietals, region, vintage, critic scores, and price. I ended up cobbling together a dataset from a bunch of different sources. It was messy, for sure, but it was a start.
Then, I cleaned up the data. Man, there were missing values everywhere! I had to make some decisions about how to handle them. For some fields, I just filled the gaps with the column average. For others, I had to drop the rows altogether. It wasn’t perfect, but I got it into a usable state.
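
If you want to try the same kind of cleanup, here is a rough sketch in pandas. It runs on a few made-up rows, not my actual dataset, and the column names (`critic_score`, `price`, `classification`) and the choice of which columns get filled versus dropped are assumptions for illustration.

```python
import pandas as pd

# Hypothetical toy rows; column names and values are invented for illustration.
wines = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "critic_score": [92.0, None, 88.0, 95.0],
    "price": [45.0, 60.0, None, 120.0],
    "classification": ["Grand Cru", "Premier Cru", None, "Premier Cru"],
})

# Numeric gaps: fill each column with its own mean (NaNs are ignored by mean()).
numeric_cols = ["critic_score", "price"]
wines[numeric_cols] = wines[numeric_cols].fillna(wines[numeric_cols].mean())

# A row without a label cannot be used for training, so drop it instead.
cleaned = wines.dropna(subset=["classification"]).reset_index(drop=True)

print(cleaned)
```

Mean imputation is the simplest option, but it shrinks the spread of the filled column, so it is worth checking how many values were filled before trusting anything built on top of it.
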
Next up, feature engineering. I took the raw data and tried to create some more useful features. For example, I combined the region and vintage into a single “age_potential” score. I also created a “price_per_point” feature to see how much bang you get for your buck, based on critic scores.
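
As a quick illustration of the “price_per_point” idea, here is a minimal sketch. It assumes the feature is simply price divided by critic score, and the column names and numbers are made up. I have left “age_potential” out because the exact recipe depends on how you score regions and vintages.

```python
import pandas as pd

# Hypothetical prices and critic scores, for illustration only.
wines = pd.DataFrame({
    "price": [45.0, 60.0, 120.0],
    "critic_score": [90.0, 96.0, 100.0],
})

# Assumed definition: how much you pay per critic point.
wines["price_per_point"] = wines["price"] / wines["critic_score"]

print(wines)
```

A lower value means more points for the money, which is the “bang for your buck” reading.
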
Okay, now for the fun part: building the model. I decided to go with a simple Random Forest classifier. I didn’t want to overcomplicate things. I split the data into training and testing sets, and then I trained the model on the training data.

After training, I evaluated the model on the testing data. I think I got around 70% accuracy, which wasn’t amazing but was better than random chance. Not bad for a first try.
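
Here is roughly what the split, train, and evaluate steps look like with scikit-learn. This is a sketch on synthetic data, not my real dataset: the two features, the rule that generates the labels, and the hyperparameters are all assumptions. It also compares against a majority-class baseline, since “better than random chance” only means something relative to how balanced the classes are.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (hypothetical): two features and a noisy binary label.
rng = np.random.default_rng(0)
n = 300
critic_score = rng.uniform(80, 100, n)
price = rng.uniform(10, 300, n)
X = np.column_stack([critic_score, price])
y = (critic_score + price / 30 + rng.normal(0, 2, n) > 95).astype(int)

# Hold out 25% for testing, keeping class proportions the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# Baseline that always predicts the most common class in the training set.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)

model_acc = accuracy_score(y_test, model.predict(X_test))
baseline_acc = accuracy_score(y_test, baseline.predict(X_test))
print(f"model: {model_acc:.2f}, baseline: {baseline_acc:.2f}")
```

If the baseline already scores close to the model, the accuracy number is telling you more about class imbalance than about the model.
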
Finally, I tried to interpret the results. I looked at the feature importances to see which factors the model thought were most important. Critic scores and price were, unsurprisingly, the biggest predictors. But the “age_potential” feature I created also had some influence.
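
For anyone curious how to pull those importances out, here is a small sketch using scikit-learn’s `feature_importances_` attribute. The data is synthetic and the feature names (including a deliberately useless `noise` column) are assumptions for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data (hypothetical): the label depends on score and price only.
rng = np.random.default_rng(0)
n = 300
X = pd.DataFrame({
    "critic_score": rng.uniform(80, 100, n),
    "price": rng.uniform(10, 300, n),
    "noise": rng.normal(0, 1, n),  # unrelated to the label by construction
})
y = (X["critic_score"] + X["price"] / 30 > 95).astype(int)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)

# Impurity-based importances, one per feature; they sum to 1.
importances = pd.Series(model.feature_importances_, index=X.columns)
importances = importances.sort_values(ascending=False)
print(importances)
```

These impurity-based scores show what the forest leaned on, not what causes a classification, and they can be skewed when features are correlated (as price and critic score likely are). Permutation importance on the test set is a common cross-check.
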
Overall, it was a cool project. I learned a lot about Bordeaux wines and about the challenges of working with messy data. And who knows, maybe one day I’ll be able to impress my friends with my newfound knowledge of wine classifications.

Here’s a quick recap of what I did:

- Gathered data from various online sources
- Cleaned the data and handled missing values
- Engineered new features like “age_potential” and “price_per_point”
- Built a Random Forest classifier
- Evaluated the model’s accuracy
- Interpreted feature importances

It’s not perfect, but it was a good learning experience!