Apache Spark as Cross-Over Hit for Data Science

Hosted By
Mike S.

Details

We're excited to host Sean Owen (https://twitter.com/sean_r_owen)! Sean is Director of Data Science at Cloudera, based in London. Before Cloudera, he founded Myrrix Ltd, a company commercializing large-scale real-time recommender systems on Apache Hadoop. He has been a primary committer and VP for Apache Mahout, and co-author of Mahout in Action. Previously, Sean was a senior engineer at Google.

Apache Spark as Cross-Over Hit for Data Science

Spark is getting a lot of buzz lately. It is now a top-level Apache project and is integrated with Hadoop. Its programming model is attractively simple compared to MapReduce, and for the iterative computations often found in machine learning, it can be much faster.

It has the potential to be a 'cross-over' platform for both exploratory and operational data scientists, and offers elements familiar to Java, Hadoop, R, and Python developers.

This talk will briefly introduce Spark and the features that make it attractive to different data science "camps".

Chicago AI/LLMs/ML Developers Group
Matilda
3101 N Sheffield Ave · Chicago, IL