
Details

What is covered?

Ingest data and transform it for processing using Pig and Hive. By the end of this session, attendees will have a working data set and hands-on experience querying it from Hive.
Enable Tez for faster queries, and learn the role of ORC files in compression and better filter push-down.
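
As a rough preview of the Tez and ORC part of the session, the sketch below builds the HiveQL statements one might run in the Hive shell. The table and column names (trips_raw, trips_orc, trip_id, fare) are hypothetical placeholders and not part of the session material. The SET keys and the STORED AS ORC clause are standard Hive options, but whether they are needed depends on the sandbox version, so treat this as an illustration rather than the exact steps covered.

```python
# Minimal sketch: generate HiveQL that switches the session to Tez and
# copies an existing table into an ORC-backed table.
# Table and column names passed in are hypothetical examples.


def tez_orc_statements(source_table, orc_table, columns):
    """Return HiveQL statements that enable Tez and create an ORC copy."""
    column_list = ", ".join(columns)
    return [
        # Run queries on Tez instead of classic MapReduce.
        "SET hive.execution.engine=tez;",
        # Allow filters to be pushed down towards the storage layer.
        "SET hive.optimize.ppd=true;",
        # Let ORC use its built-in indexes to skip non-matching data.
        "SET hive.optimize.index.filter=true;",
        # Create a compressed ORC table from the source table.
        f"CREATE TABLE {orc_table} STORED AS ORC "
        f'TBLPROPERTIES ("orc.compress"="ZLIB") '
        f"AS SELECT {column_list} FROM {source_table};",
    ]


if __name__ == "__main__":
    for statement in tez_orc_statements("trips_raw", "trips_orc", ["trip_id", "fare"]):
        print(statement)
```

The printed statements can be pasted into the Hive shell once a source table has been loaded.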

Requirements for Hadoop Sessions:
Attendees should have a 64-bit machine with internet connectivity, 8 GB of RAM, and 50 GB of free disk space.
Attendees should complete the following steps before the second session begins:

  1. Install VMware Player or VirtualBox on your laptop/PC.
  2. Download the Hortonworks Sandbox from http://hortonworks.com/hdp/downloads/.
  3. Load the image into your virtualization tool of choice (VirtualBox or VMware Player).
  4. Start the sandbox image and log in.
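
Before downloading the sandbox image, it may help to confirm that the laptop meets the hardware requirements listed above. The sketch below is an assumption-laden convenience check, not an official tool: it tests whether the running Python build is 64-bit (which implies a 64-bit operating system when true) and whether the chosen drive has 50 GB free. RAM is not checked, because the Python standard library has no portable call for it.

```python
import shutil
import struct

# Free disk space asked for in the session requirements.
REQUIRED_FREE_GB = 50


def is_64_bit():
    """True if this Python build uses 64-bit pointers."""
    return struct.calcsize("P") * 8 == 64


def free_disk_gb(path="."):
    """Free space, in GiB, on the drive that contains `path`."""
    return shutil.disk_usage(path).free / 1024**3


def meets_requirements(path="."):
    """Check the 64-bit and free-disk requirements (RAM is not checked)."""
    return is_64_bit() and free_disk_gb(path) >= REQUIRED_FREE_GB


if __name__ == "__main__":
    print("64-bit Python build:", is_64_bit())
    print("Free disk space (GiB):", round(free_disk_gb("."), 1))
    print("Meets checked requirements:", meets_requirements("."))
```

Run it from the drive where the virtual machine image will be stored so that the free-space figure is meaningful.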
