Wine through the lens of a data scientist

A recurring theme we've heard throughout our classes has been the need for data-driven decision making in the wine industry. I believe data can shift how wineries and distributors think about the types of wine they make, customer feedback, marketing tactics, packaging, and distribution channels. Companies in the wine industry have unprecedented access to industry data and the ability to collect more data on their own businesses and the sub-markets in which they operate. But how complex is the process of leveraging these data to identify impactful trends? Though it's no walk in the park, an analytical eye can find meaning in a giant array of data points. Here are a couple of interesting examples I came across of simple but powerful data analytics techniques providing relevant insights.

David Morrison, creator of the Wine Gourd (a wine data blog), analyzed the number of 9L cases of wine sold in the U.S. from 1960 to 2019.




This would seem to indicate a "strong but slowing trend" of growth. However, Morrison decomposes the graph into its linear and cyclical components by subtracting the positive linear trend from the above graph, to yield:


Considering the periodicity of the above graph, it suggests that U.S. wine market sales, though apparently increasing in the long term, could see a sharp decline over the next 10 years. Such information could help winemakers and distributors adjust their business strategies accordingly.
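To make the detrending step concrete, here is a minimal sketch of subtracting a linear trend from a yearly series. The numbers are synthetic and made up for illustration (they are not Morrison's data), and fitting the line by ordinary least squares is my assumption; the original analysis may have fit the trend differently.

```python
import math

# Synthetic, made-up series for illustration only (NOT Morrison's data):
# a straight-line trend plus one slow cycle, sampled yearly from 1960 to 2019.
years = list(range(1960, 2020))
sales = [
    100 + 5.0 * (y - 1960) + 30 * math.sin(2 * math.pi * (y - 1960) / 40)
    for y in years
]


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys = slope * xs + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(years, sales)

# The fitted line is the "linear component" of the series.
trend = [slope * y + intercept for y in years]

# What remains after subtracting the line is the "cyclical component".
residual = [s - t for s, t in zip(sales, trend)]

for y, r in zip(years[::10], residual[::10]):
    print(y, round(r, 1))
```

Plotting `residual` against `years` would give the kind of detrended curve described above. With only 60 yearly points, the residual can hold at most a cycle or two, so any period read off such a plot is a rough estimate.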

Another example I found comes from Label Analytics (a wine label analysis company) and its work on wine labels. I was amazed at how significant (and negative) an impact something as simple as a dot sticker added to the wine bottle had on customer preferences:



Again, a rather simple analysis, but with all sorts of implications for how much small differences in packaging and labeling influence the purchasing decisions of potential customers. 

Another area that lends itself to rigorous analytics is perceived wine quality as a function of numerous factors. The UCI Machine Learning Repository has a publicly available wine quality data set that's often used as an introduction to pairwise correlations, and tutorials like this one allow amateur data scientists to play with the data and draw conclusions based on the correlations they see. A graph like the one below, though seemingly complicated, can very quickly tell a lot about which variables matter most in making a "quality wine" (apparently and unsurprisingly, high alcohol content is pretty essential).
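For readers who haven't met pairwise correlations before, here is a small sketch of how such a matrix is computed. The six rows are invented for illustration and are not taken from the UCI data set; only the column names echo it. The `pearson` helper is my own and not part of any tutorial.

```python
import math

# Made-up rows for illustration; NOT taken from the UCI wine quality data set.
wines = {
    "alcohol":          [9.4, 9.8, 10.5, 11.2, 12.0, 12.8],
    "volatile acidity": [0.70, 0.66, 0.58, 0.40, 0.36, 0.30],
    "quality":          [5, 5, 6, 6, 7, 7],
}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# One coefficient for every pair of columns: this is the correlation matrix.
names = list(wines)
corr = {(a, b): pearson(wines[a], wines[b]) for a in names for b in names}

# Print the matrix row by row, one signed coefficient per column.
for a in names:
    print(a.ljust(18), "  ".join(f"{corr[(a, b)]:+.2f}" for b in names))
```

Each value lies between -1 and +1. A strongly positive entry in the "quality" row means that variable tends to rise with the quality score, while a strongly negative one means it tends to fall. Correlation alone does not show that changing a variable would change the score.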



It's clear that well-executed data analysis can inform everything from a winery's decisions about making quality wine to a distributor's short- and long-term business strategies. Is there a business opportunity here? The company CustomerVineyard definitely believes so, and their research indicates that the ROI from consumer data analytics could be 10x. Vintners and retailers have already come on board for the first phase of the company, and many others have expressed strong interest in this one-stop shop for all their customer data analysis needs. Perhaps gleaning information from platforms like these is the best way for businesses in the wine industry to stay afloat amid the apparent slump in sales and changing trends.



2 comments:

  1. This is interesting stuff! I'm particularly drawn to the decomposed linear/cyclical graph that you added in with the conclusion that U.S. wine could be entering a downturn. While cyclicality can definitely indicate ups and downs, I notice that there's really only one up/down cycle shown on the graph since 1960. There are also some little dips that then continue an upward trend, so I'm not sure I'd be prepared to conclude from such a small sample that the wine industry is going to take a downturn. If I were working with a winery and advising them to cut production for the next decade, I think I would need more than this initial graph to make a compelling argument. Not to say a decline couldn't happen, but that more information is needed! Regardless, I love the more analytical approach you examined here.

  2. Similar thoughts to Janine - the approaches you've taken here are great and really interesting! I would love to get a sense of the data set being used in the correlation matrix and dig into how internally consistent each of the variables is, to get a sense of the variance among annotators. Similarly, I was struck at first by the cyclicality but hadn't yet checked the scale of the time axis - I would have interpreted that graph as being indicative of no real cyclical nature. I can't help but turn to causality in this case, and would love to figure out, if this can be interpreted as cyclical, what is driving that.
