From Data Points to Data Dan. Sundar Dorai-Raj -- Google


From Data Points to Data Dan: Combining Log Analysis, Survey Analysis and Interviews to Segment Google Analytics Customers

Google Analytics has a wide user base, from hobbyist bloggers to employees of Fortune 100 corporations. To better understand our users, and to estimate more precisely what proportion of our customer base each user type makes up, we embarked on a customer segmentation project. This long-term research project used both qualitative and quantitative methods to scope and define customer “use cases,” the particular tasks that direct the front-end interactions of a user’s session. Our quantitative approach consisted of collecting all front-end user interactions and performing Latent Dirichlet Allocation (LDA) to arrive at 25 use case groupings, as well as conducting a survey to investigate how users’ backgrounds affect their usage. In parallel, our qualitative approach included over 50 subject interviews to understand which use cases were important from the user’s perspective. We drew on this research, together with input from product subject matter experts, to label each of our use case parameter groupings. Using the labeled LDA topics, we measured each user’s engagement with every topic and performed k-means clustering on individual users to arrive at 12 user segments. The qualitative interpretation of these clusters through 40 interviews led to a set of personas, which will provide further inspiration for product development.
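The final quantitative step, clustering users by their engagement with each labeled use case, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the three use case columns, the six users and their counts, the normalization of counts into shares, and the deterministic farthest-first seeding. The abstract does not say how engagement was measured or how k-means was initialized, and the project itself worked with 25 use cases and 12 segments rather than the toy sizes here.

```python
from math import dist

def normalize(counts):
    """Turn raw per-use-case interaction counts into engagement shares."""
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)

def farthest_first(points, k):
    """Deterministic seeding (an assumption): start at the first point, then
    repeatedly pick the point farthest from the centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    return centroids

def kmeans(points, k, max_iter=100):
    """Plain Lloyd's algorithm: assign each point to its nearest centroid,
    recompute centroids as cluster means, stop when assignments settle."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        new_labels = [
            min(range(k), key=lambda j: dist(p, centroids[j])) for p in points
        ]
        if new_labels == labels:
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical interaction counts per user across three made-up use cases.
user_counts = {
    "u1": [40, 2, 1],
    "u2": [35, 5, 0],
    "u3": [1, 30, 4],
    "u4": [0, 25, 5],
    "u5": [2, 3, 45],
    "u6": [1, 1, 20],
}

user_ids = list(user_counts)
engagement = [normalize(user_counts[u]) for u in user_ids]
labels, centroids = kmeans(engagement, k=3)
segments = dict(zip(user_ids, labels))
```

In this toy data, users whose activity concentrates on the same use case end up in the same segment. Normalizing to shares means heavy and light users with the same usage mix cluster together, which is one possible design choice among several.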

Sundar Dorai-Raj is the lead data scientist and manager for Google Analytics, a free web and app measurement service for advertisers and publishers. Since 2009, Sundar has held similar roles within Google at YouTube, Video Ads and Fiber. He has a Ph.D. in Statistics from Virginia Tech and a Master’s in Applied Math from the University of Alabama.