Hands-on Introduction to Apache Spark & Apache Zeppelin

Name: Hands-on Introduction to Apache Spark & Apache Zeppelin
Start: 2016-09-28T18:30:00+02:00
End: 2016-09-28T21:00:00+02:00
Location: Inkubator STARTER Gdańsk

Host: Robert

Future of Data: Gdansk

Details

Yameo, an IT company from Gdansk, is pleased to host a Data Advocate from Hortonworks, who will demonstrate a 100% open source Big Data platform for processing data in motion and data at rest using cutting-edge tools such as Apache Spark and Apache Zeppelin.

Agenda

18:30 - 19:00: Food, mingling
19:00 - 19:45: Lecture
19:45 - 21:00: Hands-on Lab

Apache Spark is a unified framework for big data analytics. Spark provides one integrated API that developers, data scientists, and analysts can use to perform diverse tasks, such as batch analytics, stream processing, and statistical modeling, that previously required separate processing engines. Spark supports a wide range of popular languages, including Python, R, Scala, SQL, and Java. Spark can read from diverse data sources and scale to thousands of nodes.

Overview Video: http://bit.ly/2cJ81pw

A short introductory lecture will cover Spark and its Spark SQL module and how they fit within the wider Hadoop ecosystem. The lecture will be followed by a demo and a hands-on lab in Apache Zeppelin with examples of basic data manipulation, ETL, and visualization. Zeppelin provides a notebook-style environment for data exploration, wrangling, analytics, and data science.

Attendees can follow along with the hands-on labs using one of these environments:

Hortonworks Sandbox (preconfigured HDP 2.5) on a Virtual Machine (VM) where you have full control of the environment. No data center, no cloud service, and no internet connection needed! http://hortonworks.com/products/sandbox/#install

Hortonworks Cloud on Amazon AWS for more control and pre-configured multi-node cluster deployments (slightly more advanced). Includes the latest releases, such as Spark 2.0: http://hortonworks.github.io/hdp-aws/

About the Speaker

Robert Hryniewicz has over 10 years of experience working on Machine Learning, AI, Robotics, cloud products, and more. He is currently a Data Scientist and Advocate at Hortonworks (a 100% open-source public Big Data company). Previously, Robert was a principal consultant at TiVo, CTO at a Singularity Labs company, and Sr. Engineer at Cisco, NASA, Concurrent, and others. Robert has been developing in Apache Spark since 2014. As a consultant he developed several products, including a Graph Analytics platform as well as multiple Machine Learning and IoT prototypes.

Robert’s interests range from distributed systems to advanced analytics, deep learning, NLP, general AI, robotics, VR, DNA sequencing, personalized medicine, vertical farms, blockchain, and technologies that promote more open, democratic, peaceful, and cooperative societies.

He comes up with his best ideas when hiking close to home in Yosemite or when traveling to various remote regions around the world.

Contact Info

Email: rhryniewicz@hortonworks.com
Twitter: @RobH8z
LinkedIn: https://www.linkedin.com/in/roberthryniewicz
