On January 31st, we’re hosting the first Spark User Meetup at Klout in San Francisco. Spark is a cluster programming framework that provides in-memory computing for iterative and interactive analytics and a high-level programming interface in the Scala language. This meetup will be a chance to learn Spark from the developers, hear about other people’s experiences with Spark, network, and find out about future development plans.
At the first meeting, we’re planning to have a Spark tutorial by Matei Zaharia followed by a presentation from Karthik Thiyagarajan of Quantifind on their experience using Spark to replace Pig for predictive analytics (details below).
If you want to follow along, we strongly recommend you sign up for Amazon EC2 and make sure you can launch instances. This way you’ll be able to bring up your own Mesos/Spark cluster on EC2 and play with a Wikipedia dataset in real time!
Time: Tuesday, January 31. Pizza and beer at 6:30 PM, talks at 7 PM.
Location: Klout, 77 Stillman St, San Francisco, CA 94107
Quantifind is a small startup focused on predictive analytics. We are building a platform that automatically identifies relevant signals in both structured and unstructured content, contextualizes them based on what matters most to the customer, and derives insights about past and future events in a way that is directly relevant and actionable.
The talk highlights our experience using Spark in various ways, ranging from a distributed batch processing framework powering our analytics pipeline to an interactive computing infrastructure serving some of our internal exploratory tools.
Over the course of moving our analytics pipeline from Pig to Spark, we realized that Spark's low latency and interactivity make it a quick way to build new computing services. So we experimented with creating web services on top of Spark that answer queries in real time by performing operations on a cached RDD. Today, Spark acts as a sharded in-memory infrastructure for many of the services we use internally, helping us explore our data and prototype algorithms in an agile manner.
Future Meetups: The meetup will rotate among locations in San Francisco, Silicon Valley and Berkeley.