Spark, Scala, and the Berkeley Data Analytics Stack.


Details
IMPORTANT: Please register at Skills Matter:
http://skillsmatter.com/event-details/home/spark-scala-and-the-berkeley-data-analytics-stack
Spark, Scala, and the Berkeley Data Analytics Stack, with Patrick Wendell (http://www.pwendell.com/)
This talk will introduce Apache Spark (http://spark.incubator.apache.org/). Spark is a cluster computing engine that lets users concisely express a wide range of applications through APIs in Scala, Java, and Python. Under the hood, Spark is written primarily in Scala. Spark supports streaming, batch, and interactive analytics on very large datasets. Because it supports in-memory storage and general operator graphs, it can run up to 100x faster than Hadoop MapReduce for complex algorithms such as machine learning and graph processing.
This talk will give an overview of Spark and offer reflections on writing a large production application in Scala. Spark has spawned a variety of related projects, which will also be covered briefly, including a SQL execution engine (Shark (https://github.com/amplab/shark/wiki)), a graph computing library (GraphX (https://amplab.cs.berkeley.edu/publication/graphx-grades/)), and a machine learning library (MLlib (http://www.mlbase.org/)).
Patrick Wendell is a committer on Apache Spark (http://spark.incubator.apache.org/) and a co-founder of Databricks. Before Databricks, he was pursuing a Ph.D. at the UC Berkeley AMPLab, advised by Ion Stoica. His research focused on scalable, low-latency scheduling for data processing frameworks. In the past, he has contributed to several Hadoop projects, including Apache Flume and Apache Avro. He holds a B.S. in Computer Science from Princeton University and an M.S. in Computer Science from UC Berkeley.
We will, as always, also be heading to the Slaughtered Lamb (http://www.theslaughteredlambpub.com/) pub afterwards.
**IMPORTANT: READ ME TO REGISTER**
Skills Matter is hosting this event and handling attendance, so it is essential that you confirm your place at this link:
http://skillsmatter.com/event-details/home/spark-scala-and-the-berkeley-data-analytics-stack
Failure to do so may mean you do not get a seat. Please use the Meetup.com "I'm going" RSVP only to let the others in the group know you're going.
If this is your first time at Skills Matter, directions are here: http://skillsmatter.com/go/find-us
