Spark at Thomson Reuters and Project Tungsten

Name: Spark at Thomson Reuters and Project Tungsten
Start: 2015-10-13T18:30:00-07:00
End: 2015-10-13T21:30:00-07:00
Location: Galvanize

Hosted by Scott W.

Bay Area Spark Meetup

Details

Tonight we will have two talks: the first will discuss Spark at Thomson Reuters, and the second will be a talk on Project Tungsten from Databricks. Detailed abstracts are below. We will be filming the talks and posting them to the Apache Spark YouTube page.

Agenda:

6:30: Mingling

7-7:05: Intros

7:05-8:15: Technical Talks

8:15: Mingling

Adam Baron, Director of Big Data Quantitative Research

StarMine, a Thomson Reuters brand, began using Hadoop in 2011 and built a home-grown quantitative finance research environment that relies heavily on MapReduce, Hive and Mahout. In 2014, they started using Spark, with a strong reliance on Spark SQL for data manipulation and Spark MLlib for machine learning. StarMine has also dabbled in Sparkling Water for algorithms that are not yet available in Spark MLlib, such as Deep Learning. Adam will speak about the steps involved in going from raw text to a predictive quantitative finance model. He will highlight the technologies involved, share some Spark examples and give insight into how quants approach Big Data.

Josh Rosen, Spark Committer and Software Engineer at Databricks

Project Tungsten focuses on substantially improving the efficiency of memory and CPU for Spark applications, to push performance closer to the limits of modern hardware. In this talk, we will give an update on the Project Tungsten improvements included in Spark 1.5.0 and dive into some of the technical challenges we are solving.
