Past Meetup

Apache Spark - Easier and Faster Big Data

50 people went

Details

You are invited to attend the FIRST Spark Users' Group meeting (sponsored by Cloudera).

Apache Spark is an open-source cluster computing framework for data analytics. Spark fits into the Hadoop open-source ecosystem and can run on top of the Hadoop Distributed File System (HDFS). However, Spark is not tied to the two-stage MapReduce paradigm, and promises performance up to 100 times faster than Hadoop MapReduce for certain applications. Spark provides primitives for in-memory cluster computing that allow user programs to load data into a cluster's memory and query it repeatedly, making it well suited to iterative workloads such as machine learning algorithms.
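To make the "load once, query repeatedly" idea concrete, here is a minimal single-machine sketch in plain Python. It is an analogy only and does not use any Spark API: the class `InMemoryDataset`, the function `expensive_load` (standing in for a read from HDFS), and `fit_mean` are all hypothetical names invented for illustration. It shows an iterative algorithm making many passes over the data while the slow load happens just once.

```python
# Plain-Python analogy only: no Spark API is used here.
# All names below are hypothetical and exist purely for illustration.
from typing import Callable, Iterable, List, Optional


class InMemoryDataset:
    """Loads data on first access, then serves every later pass from memory."""

    def __init__(self, loader: Callable[[], Iterable[float]]) -> None:
        self._loader = loader
        self._cache: Optional[List[float]] = None
        self.load_count = 0  # how many times the slow loader actually ran

    def data(self) -> List[float]:
        if self._cache is None:
            # First access: pay the loading cost once and keep the result.
            self._cache = list(self._loader())
            self.load_count += 1
        return self._cache


def expensive_load() -> Iterable[float]:
    # Assumption: stands in for a slow read from distributed storage.
    return [1.0, 2.0, 3.0, 4.0]


def fit_mean(dataset: InMemoryDataset, steps: int = 200, lr: float = 0.1) -> float:
    """Estimate the mean by gradient descent, one full data pass per step."""
    estimate = 0.0
    for _ in range(steps):
        points = dataset.data()  # served from memory after the first step
        gradient = sum(estimate - x for x in points) / len(points)
        estimate -= lr * gradient
    return estimate


dataset = InMemoryDataset(expensive_load)
mean_estimate = fit_mean(dataset)
print(f"estimate={mean_estimate:.4f}, loads={dataset.load_count}")
```

The 200 passes in `fit_mean` trigger only one call to the loader. A design that re-read its input on every pass would pay that cost 200 times, which is the kind of overhead in-memory computing is meant to avoid for iterative workloads.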

PRESENTERS:

Amr Awadallah

CTO & Co-Founder, Cloudera

Before co-founding Cloudera in 2008, Amr (@awadallah) was an Entrepreneur-in-Residence at Accel Partners. Prior to joining Accel, he served as Vice President of Product Intelligence Engineering at Yahoo!, where he ran one of the very first organisations to use Hadoop for data analysis and business intelligence. Amr joined Yahoo! after it acquired his first startup, VivaSmart, in July 2000. Amr holds Bachelor's and Master's degrees in Electrical Engineering from Cairo University, Egypt, and a Doctorate in Electrical Engineering from Stanford University.

Carlo Piva

Solution Architect, Cloudera

Carlo is a highly accomplished architect, technologist and data scientist with over 12 years of experience designing, implementing, and managing highly scalable software systems. Carlo has worked in the United States, Europe and Australia implementing Big Data solutions for start-ups, financial institutions, commercial firms, R&D firms, retail firms, and non-profit financial regulators. He has architecture experience across topics such as Hadoop, Big Data, machine learning, scientific algorithms, computer science, financial markets, and high-volume, low-latency back-end software. Carlo holds a Master's degree in Computer Science.

ABSTRACT

Amr will give an overview of the Cloudera roadmap for Spark. Carlo will then introduce the open-source project, demo the high-level libraries (including streaming, SQL, and machine learning), and expand on how Spark can help you make better decisions more easily and quickly.