Brouhaha: Visualizing Beer Styles


Details
The recent craft beer renaissance and its online communities of aficionados have created both a valuable digital resource and a headache for the lay beer drinker. On the one hand, these communities have crowdsourced a comprehensive set of individual beer reviews that can help the neophyte understand this rich space. On the other hand, they have produced such a huge volume of material, with such a specific lexicon of tastes, colors, mouthfeel, and aromatic qualities, that interested neophytes may need some visual guidance to navigate it.
Descriptive language is inherently high dimensional (a beer vocabulary may have thousands of words), but one need not read or understand every word to grasp the approximate relationships between beers: beer styles are a quick proxy for those relationships. A handful of subject matter experts (beer sommeliers) have already hand-crafted lists, decision trees, two-dimensional hierarchical trees, and two-dimensional graphs that illustrate beer styles with examples. Others have taken a data-driven approach, distilling the massive tangle of beer reviews into understandable images.
I'll discuss some common steps in this data-driven visualization approach and show examples of how you can turn beer review data into understandable 2D and 3D mappings of beer styles. Some steps use standard Natural Language Processing (NLP) techniques, such as tf-idf vectorization and topic modeling, while others use dimensionality reduction and embedding techniques such as t-distributed Stochastic Neighbor Embedding (t-SNE).
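For a flavor of the tf-idf step, here is a minimal sketch in plain Python. It is not the speaker's code: the four pooled "reviews" are invented toy data, and the names REVIEWS, tfidf_vectors, cosine, and nearest_style are hypothetical helpers made up for this illustration. It weights each word by how distinctive it is for a style, then compares styles by cosine similarity.

```python
import math
from collections import Counter

# Hypothetical toy data: one pooled review text per beer style (invented).
REVIEWS = {
    "IPA": "hoppy bitter citrus pine hoppy resin grapefruit bitter",
    "Double IPA": "hoppy bitter pine resin citrus boozy hoppy",
    "Stout": "roasty coffee chocolate dark creamy roasty",
    "Porter": "chocolate roasty dark coffee caramel smooth",
}


def tfidf_vectors(docs):
    """Return {style: {term: weight}} using tf * smoothed idf, L2-normalized."""
    tokens = {name: text.lower().split() for name, text in docs.items()}
    n_docs = len(tokens)
    # Document frequency: in how many styles does each term appear?
    doc_freq = Counter(t for words in tokens.values() for t in set(words))
    vectors = {}
    for name, words in tokens.items():
        counts = Counter(words)
        vec = {
            term: (count / len(words))
            * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        }
        norm = math.sqrt(sum(w * w for w in vec.values()))
        vectors[name] = {term: w / norm for term, w in vec.items()}
    return vectors


def cosine(a, b):
    """Cosine similarity of two L2-normalized sparse vectors."""
    return sum(w * b.get(term, 0.0) for term, w in a.items())


def nearest_style(name, vectors):
    """Return the other style whose tf-idf vector is most similar to `name`."""
    others = (s for s in vectors if s != name)
    return max(others, key=lambda s: cosine(vectors[name], vectors[s]))


vectors = tfidf_vectors(REVIEWS)
for style in REVIEWS:
    print(style, "->", nearest_style(style, vectors))
```

With real review data, the same style vectors would typically be handed to a dimensionality reduction method such as t-SNE to get 2D or 3D coordinates for plotting; that step is left out here to keep the sketch dependency-free.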
About Dr. Kaufhold:
Dr. Kaufhold is a data scientist and managing partner of Deep Learning Analytics, a data science company based in Arlington, VA. Prior to forming Deep Learning Analytics, Dr. Kaufhold investigated deep learning algorithms as a staff scientist at NIH. Prior to NIH, Dr. Kaufhold was a Technical Fellow at SAIC, serving as principal investigator or technical lead on a number of large government contracts funded by NIH, DARPA and IARPA, among others. Prior to joining SAIC, Dr. Kaufhold investigated machine learning algorithms for medical image analysis and image and video processing at GE's Global Research Center. Dr. Kaufhold earned his Ph.D. from Boston University's biomedical engineering department in 2001.
Agenda:
• 6:30 - Doors open, networking
• 7ish - Introductions & Announcements
• 7:15 - Visualizing Beer
• 8:30 - Data Drinks (Secret Code Announced)
Data Visualization DC Sponsors:
Cloudera, SynglyphX (http://www.synglyphx.com/), Statistics.com (http://statistics.com/sna), General Assembly, Continuum Analytics
Hadoop-leader Cloudera (http://datacommunitydc.us3.list-manage1.com/track/click?u=75fc125999b198f97fe860a8d&id=2a18cfe96b&e=263865b99d) is an Organizational sponsor of Data Community DC!
Transform Data to Knowledge, Faster.
Use the code "DC2" at Statistics.com (http://statistics.com/sna/) for 15% off their web-based courses!
General Assembly (http://datacommunitydc.us3.list-manage.com/track/click?u=75fc125999b198f97fe860a8d&id=8eac8031e2&e=263865b99d) has upcoming DC-based courses, including Data Science, Business, and Design!
DDL runs hands-on Data Science Workshops, Courses, and Projects
Want to sponsor Data Community DC, support our events, and help build this professional community? Get in touch!
