Apache Spark Night - show and tell

Name: Apache Spark Night - show and tell
Start: 2014-09-02T19:00:00-05:00
End: 2014-09-02T22:00:00-05:00
Location: Rackspace

Hosted by Austin Data Geeks

Details

An evening of Spark demos and use cases! Doing something interesting with Spark? Share it with the group: send us a note to join the presentations. New to Apache Spark? See "About Spark" below.

Among the presentations in store for this evening:

Chance Coble will present a use case for analyzing telecommunications data. He will show how he used Spark's low latency to quickly integrate new information, in the form of call records from HDFS, into in-memory structures. He will then show how to build pattern recognition on top of those data structures with Spark's concise functional API, identifying network problems much faster than the company had been able to before.

UPDATE:
Sandeep Parikh, who had planned to present on using MongoDB and Spark for recommendations with Spark MLlib's collaborative filtering, won't be able to make it. We'll have him next time.

Cody Koeninger, software engineer at MediaCrossing, will discuss the use of Spark SQL, Parquet, and GraphX in their work related to advertising exchanges.

About Spark

Apache Spark is an open-source cluster computing framework for data analytics, originally developed in the AMPLab at UC Berkeley. Spark fits into the Hadoop open-source community, building on top of the Hadoop Distributed File System (HDFS). However, Spark is not tied to the two-stage MapReduce paradigm and promises performance up to 100 times faster than Hadoop MapReduce for certain applications. Spark provides primitives for in-memory cluster computing that allow user programs to load data into a cluster's memory and query it repeatedly, making it well suited to machine learning algorithms.
