The analytics platform at Twitter has experienced tremendous growth over the past few years in terms of size, complexity, number of users, and variety of use cases. This talk will discuss the evolution of the Twitter infrastructure and the development of capabilities for data mining on “big data”. We’ll share experiences as a case study, but make recommendations for best practices and point out opportunities for future work.
About the speaker:
Jimmy Lin is an associate professor in the iSchool at the University of Maryland, with appointments in the Institute for Advanced Computer Studies (UMIACS) and the Department of Computer Science. He works on "big data", with a particular focus on large-scale distributed algorithms for text processing. His research lies at the intersection of natural language processing (NLP) and information retrieval (IR). Recently, Jimmy spent an extended sabbatical (from 2010 to 2012) at Twitter working on large-scale data analytics. Previously, he has also done work for Cloudera, the enterprise Hadoop company.