Real Time Transactional SQL and Real Time Data Ingestion on Hadoop


Details
The second meetup in the Future of Data series features speakers from Pivotal and Hortonworks, this time with a hands-on lab on HAWQ and more real-time Hadoop goodness! Scroll down to find a link to prepare for the hands-on lab.
We thank Dell for kindly hosting us in their awesome office in the Amsterdam Sloterdijk area.
Agenda:
17:00 - Arrive, drink, eat
17:45 - Presentations
- Install and Admin of Apache HAWQ on Hortonworks with Apache Ambari
Learn how to install and manage Apache HAWQ on a Hadoop Cluster.
Speaker: Tony van Büüren van Heijst, Pivotal
If you bring your laptop and complete the installation exercise, you will leave with a functioning HAWQ instance in a Hortonworks Sandbox for you to play around with on your own time.
Apache HAWQ (http://hawq.incubator.apache.org/) is an elastic, parallel-processing query engine that operates on all your data directly within Hadoop. It provides the highest degree of ANSI-SQL completeness to execute sophisticated queries for advanced analytics and data science.
In this session we'll cover:
- Installation of a baseline Hortonworks Sandbox
- Installation and configuration of Apache HAWQ using Ambari
- A tour of the administrative capabilities of HAWQ using Ambari
- Running a smoke test by executing queries with HAWQ (a client-side sketch follows this list)
- Connecting Apache Zeppelin to your HAWQ instance
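Since HAWQ speaks the PostgreSQL wire protocol, the smoke-test queries can also be run from any libpq-compatible client, not only from Zeppelin. Below is a minimal Python sketch using psycopg2; the host, port, user, and database are assumptions based on common sandbox defaults, not values taken from the lab instructions, so adjust them to your setup.

    # A minimal client-side smoke test, assuming HAWQ's PostgreSQL-compatible
    # wire protocol. The connection details below are common sandbox defaults,
    # not values documented on this page - adjust as needed.
    import psycopg2  # pip install psycopg2-binary

    conn = psycopg2.connect(
        host="localhost",   # assumed: sandbox VM ports forwarded to the host
        port=5432,          # assumed: default HAWQ master port
        user="gpadmin",     # assumed: default HAWQ administrative user
        dbname="postgres",
    )

    with conn.cursor() as cur:
        # Basic connectivity check.
        cur.execute("SELECT version();")
        print(cur.fetchone()[0])

        # A small ANSI-SQL window-function query of the kind HAWQ is built
        # to run; the inline VALUES list stands in for a real
        # Hadoop-resident table.
        cur.execute("""
            SELECT x, sum(x) OVER (ORDER BY x) AS running_total
            FROM (VALUES (1), (2), (3), (4)) AS t(x);
        """)
        for row in cur.fetchall():
            print(row)

    conn.close()

If the connection succeeds, the same host, port, and user details can be reused when pointing Zeppelin at your instance in the final exercise.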
System prerequisites
To participate in the hands-on portion, please bring a laptop with the following:
- VirtualBox 4.2 or later, or VMware 5.0 or later, installed
- The pre-downloaded Sandbox VM with HAWQ
Please download the lab image VM and uncompress it ahead of time.
Download size: 6 GB. Uncompressed size: 13 GB.
You will need about 20 GB of free drive capacity in total (6 GB download plus 13 GB uncompressed); once the VM is expanded, you can delete the download file to reclaim 6 GB.
Please download the lab VM at: https://drive.google.com/file/d/0BzjW8doIt2F-T3J0LTN6dDRBS2c/view?usp=sharing
- Apache NiFi, Kafka & Storm - Better Together
Speaker: Hellmar Becker, Hortonworks
The second talk shows how to connect Apache NiFi, Kafka, and Storm to build a scalable data flow based on telemetry data from trucks (a small ingestion sketch follows below).
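As a rough sketch of what such a flow ingests, the snippet below pushes synthetic truck telemetry into Kafka using the kafka-python client; in a typical NiFi/Kafka/Storm setup, NiFi routes events from the source into Kafka and Storm consumes them downstream. The broker address, topic name, and event schema here are illustrative assumptions, not details from the talk.

    # A hedged sketch of the ingestion side of such a pipeline: a small Kafka
    # producer emitting synthetic truck telemetry via the kafka-python client.
    # Broker address, topic name, and schema are illustrative assumptions.
    import json
    import random
    import time

    from kafka import KafkaProducer  # pip install kafka-python

    producer = KafkaProducer(
        bootstrap_servers="localhost:9092",  # assumed broker address
        value_serializer=lambda v: json.dumps(v).encode("utf-8"),
    )

    for _ in range(10):
        event = {
            "truck_id": random.randint(1, 50),             # hypothetical schema
            "speed_kmh": round(random.uniform(60, 120), 1),
            "ts_ms": int(time.time() * 1000),
        }
        producer.send("truck-telemetry", value=event)      # hypothetical topic

    producer.flush()
    producer.close()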
19:30 - Drinks and Networking
20:30 - Everybody out
