Hands-on Spark workshop for beginners

Details

Apache Spark Workshop (generously sponsored by Cloudera (http://www.cloudera.com))

This is a half-day workshop on Apache Spark (http://spark.apache.org/) for beginners, led by Venkat Ankam (http://linkedin.com/in/ankamv), Big Data Architect at Centurylink (http://www.centurylink.com). In this workshop, Venkat will cover the basics of Apache Spark through hands-on lab exercises.

Format:

· Half-day course (4 hours)

· Theory and hands-on lab exercises

Requirements:

· You must bring your own laptop. Laptop requirements:

· RAM: 8 GB or more

· Windows laptops: the latest VMware Player must be installed

· Mac laptops: VMware Fusion must be installed

· Virtualization must be enabled in the BIOS

Audience:

· Developers, analysts, data scientists, and architects who have basic knowledge of programming and Linux and want to get started with Apache Spark.

· Lab exercises are based on Python (PySpark), so knowledge of Python is helpful but not mandatory.

· Hadoop knowledge (especially HDFS) will be helpful, but it’s not mandatory. Some exercises are based on HDFS and YARN.

Goals:

After attending this course, participants will be comfortable doing the following:

· Start using Apache Spark on the Hadoop platform

· Understand Spark and RDD concepts

· Use the Spark shell

· Write applications

· Get started with Spark SQL

· Get started with Spark Streaming

· If time permits, get started with MLlib

Course Outline:

· Introduction to Apache Spark

· Getting started with the Spark shell and applications

· Parallel programming with Spark

· Loading and Saving Data

· Introduction to the Unified Platform (Spark SQL, Spark Streaming, MLlib, GraphX)

About Presenter:

Venkat Ankam is a Big Data Architect in the Data and Analytics practice at Centurylink (http://www.centurylink.com), Denver. He has over 16 years of IT experience and 3 years in Big Data/Hadoop, working with customers to design scalable data architectures and applications. Having worked with multiple clients globally, he has extensive experience in data warehousing and application programming. He is a founder and presenter of a few Hadoop meetup groups globally, and he loves to share knowledge with the community.

Boulder/Denver Data + AI Meetup Group
Spark Boulder
1310 E College Ave #100 · Boulder, CO