Apache Drill and Apache Spark

Hosted by Eric Christenson

Details

At our May event, we'll have speakers from IBM and MapR presenting on some great tools:

Self-Service Data Exploration and Nested Data Analytics on Hadoop - Introduction to Apache Drill, Presented by MapR Technologies

and

Apache Spark Overview, Presented by IBM

Come out and join us for some beers, meet others in the data science community, and learn about these great tools.

Descriptions of the topics follow.

Self-Service Data Exploration and Nested Data Analytics on Hadoop - Introduction to Apache Drill

Presented by Andrew Goade

Abstract:
SQL is one of the most widely used languages for accessing, analyzing, and manipulating structured data. As Hadoop gains traction within enterprise data architectures across industries, the need for SQL over both structured and loosely structured data on Hadoop is growing rapidly. Apache Drill started off with the audacious goal of delivering consistent, millisecond ANSI SQL query capability across a wide range of data formats. At a high level, this translates to two key requirements: schema flexibility and performance. Apache Drill lets users interact with big data on Hadoop much faster and far more easily using the familiar SQL language. Users no longer depend on central IT teams and DBAs to produce schemas and then maintain them when the structure changes for a few records. Drill removes the pain of structuring unstructured data before gaining any insight by providing a simple mechanism to query any dataset on Hadoop, whether flat files, Parquet or JSON files, or HBase tables. This session will give an overview of several use cases for which enterprises are testing Drill.

Apache Spark Overview

Presented by Bruce Fischer

• What is Apache Spark

• Spark Resilient Distributed Dataset (i.e. RDD)

• Spark SQL

• Spark Data Frames

• Spark Machine Learning

Speaker Bios:

Andrew Goade - Sales Engineer at MapR Technologies

Andrew Goade is a Sales Engineer for MapR. Prior to MapR, Andrew was a Solution Architect and Product Specialist for Forsythe, one of the largest independent IT integrators in North America, where he designed and scoped enterprise architectures for IBM Power Systems environments. Earlier in his career, Andrew was the Director of IT Production Services for Hub Group, a transportation management company. He began his career as a Programmer Analyst for McLeodUSA, where he was responsible for application design and development. Andrew holds a B.S.B. in Computer Management from Eastern Illinois University and an M.B.A. from Elmhurst College.

http://photos1.meetupstatic.com/photos/event/d/c/b/5/600_447716501.jpeg

Bruce Fischer, IBM Spark Technologist

I am a technology professional with more than 30 years of experience in all areas of information technology. I have a broad base of experience, from z/OS system programming and database administration to Hadoop, Cloud Data Services, and Big Data wrangling using Apache Spark. My goal as an IBM Spark Technologist is to help organizations understand the benefits of Apache Spark and its use within analytics.

http://photos3.meetupstatic.com/photos/event/3/9/1/e/600_448634622.jpeg

BAM-Big Data, Advanced Analytics, Machine Learning
Titletown Tap Room
320 N. Broadway · Green Bay, WI