May 24, 2013 · 12:30 PM
Speaker: Aaron Siegel
The operation of the Twitter platform generates an enormous amount of data, on the order of petabytes per month. I'll describe our ongoing efforts to build a pipeline to process and aggregate this data stream in real time. Along the way, I'll discuss how Twitter's technologies for streaming analytics have evolved over the past several years, from simple counting engines to complex systems that can run a range of computations over tens of billions of events per day.
Aaron Siegel manages the platform analytics team at Twitter, helping to organize, provide, and understand the vast body of data created by the operation of the Twitter platform. Prior to joining Twitter, he worked as a postdoc in combinatorial game theory and spent four years as a partner at Berkeley Quantitative, a technology-driven financial trading company. He holds a Ph.D. in mathematics from UC Berkeley and a B.S. in electrical and computer engineering from Carnegie Mellon.