
Apache Spark Concepts Primer

Hosted by Adam Doyle

Details

Apache Spark is an open-source, lightning-fast cluster computing framework for data processing. It is built for speed, ease of use, and sophisticated analytics, including machine learning. Since its beginnings as a graduate research project at UC Berkeley, Spark has seen rapid global adoption by academia and enterprises, including IT giants such as eBay, Netflix, LinkedIn, Yahoo, and Microsoft.

Understanding Spark from the ground up is an important step toward using it successfully. In this talk, I will walk you through the building blocks of Spark, how Spark works, what makes Spark great, what Spark is not (hint: it is not a modified version of Hadoop), why you should use DataFrames instead of RDDs, and much more. As a cherry on top, I will conclude the talk with a live Spark streaming demo using the best enterprise version of Spark, Azure Databricks. In the demo, we will also incorporate Microsoft Cognitive Services to analyze the sentiment of tweets in near real time.

Bio: I am an award-winning data scientist with comprehensive experience and training in computer engineering, computer science, statistics, and data systems. I am a non-conventional IT knowledge worker in the sense that I enjoy working with business and other IT folks (in addition to working quietly to solve data problems on my own). I believe there is a motivation behind everything we do with data, and my ability to build a story and shape business value comes from years of exposure to constructive questions in research. I am currently a Data Solution Architect at Microsoft, working mainly in predictive maintenance, which involves IoT, big data, and ML.

Call 314.529.4155 if you need to be let in.

Agenda

6:30-7:00 - Food and Networking

7:00-8:00 - Presentation

8:00-8:30 - Questions

STL Big Data - Innovation, Data Engineering, Analytics Group
Daugherty Business Solutions
3 Cityplace Drive · Saint Louis, MO