Spark, Scala, and the Berkeley Data Analytics Stack.

Hosted by Andy Hicks

Details

IMPORTANT: Please register at Skills Matter:

http://skillsmatter.com/event-details/home/spark-scala-and-the-berkeley-data-analytics-stack

Spark, Scala, and the Berkeley Data Analytics Stack.

with

Patrick Wendell (http://www.pwendell.com/)

This talk will introduce Apache Spark (http://spark.incubator.apache.org/). Spark is a cluster computing engine that lets users concisely express a wide range of applications through APIs in Scala, Java and Python. Under the hood, Spark is written primarily in Scala. Spark supports streaming, batch and interactive analytics on very large datasets. Thanks to its support for in-memory storage and general operator graphs, it can run up to 100x faster than Hadoop MapReduce for complex algorithms such as machine learning and graph processing.

This talk will give an overview of Spark and offer reflections on writing a large production application in Scala. Spark has spawned a variety of related projects, which will also be covered briefly, including a SQL execution engine (Shark (https://github.com/amplab/shark/wiki)), a graph computing library (GraphX (https://amplab.cs.berkeley.edu/publication/graphx-grades/)), and a machine learning library (MLlib (http://www.mlbase.org/)).

Patrick Wendell is a committer on Apache Spark (http://spark.incubator.apache.org/) and a co-founder of Databricks. Before Databricks, he was pursuing a Ph.D. in the UC Berkeley AMPLab, advised by Ion Stoica. His research focused on scalable, low-latency scheduling for data processing frameworks. In the past, he has contributed to several Hadoop projects, including Apache Flume and Apache Avro. He holds a B.S. in Computer Science from Princeton University and an M.S. in Computer Science from UC Berkeley.

We will, as always, also be heading to the Slaughtered Lamb (http://www.theslaughteredlambpub.com/) pub afterwards.

**IMPORTANT: READ ME TO REGISTER**

Skills Matter are hosting this event and handling attendance, so it is essential that you confirm your place at this link:

http://skillsmatter.com/event-details/home/spark-scala-and-the-berkeley-data-analytics-stack

Failure to do so may mean you do not get a seat. Please use the Meetup.com "I'm going" button only to let the others in the group know you're going.

If this is your first visit to Skills Matter, directions are here: http://skillsmatter.com/go/find-us

London Scala User Group
The Skills Matter eXchange
116-120 Goswell Road, EC1V 7DP · London