Get Started with Apache Spark & Apache Hive on the AWS Cloud

Hosted by Future of Data

Details

Get started with running Apache Hadoop, Apache Spark and Apache Hive built for the AWS cloud (https://aws.amazon.com/marketplace/pp/B01LXOQBOU). In this meetup, you will learn how this cloud service addresses prescriptive, ephemeral use cases around Spark and Hive. It is offered on the AWS Marketplace with Pay-As-You-Go (PAYG) pricing and “Optional Free Community Support”.

Join us to learn how to get started: quickly creating a cluster, analyzing S3 data, and using Hive LLAP (https://cwiki.apache.org/confluence/display/Hive/LLAP) for EDW projects or Spark & Apache Zeppelin for data science projects. A live demo by Ram Venkatash, Apache Committer, will show how to quickly deploy Apache Spark and Apache Hive clusters for processing and analyzing data in the cloud. You’ll also learn about Amazon S3 improvements across HDFS, Hive, & Tez.

Watch this video to see how to deploy a cluster in 15 minutes:

https://www.youtube.com/watch?v=7NjJnNZ6t_4

Check out the docs for additional help in launching a cluster:

https://hor.tn/HDCloud-docs

Speakers

Mingliang Liu is an Apache Hadoop committer. He works on HDFS and cloud storage at Hortonworks. Before that, he was a software development engineer at Amazon Web Services. He received his PhD from Tsinghua University, China, in 2014. His interests include distributed storage systems, high-performance computing, and compilers. His username is liuml07, and you can find him on Twitter, on GitHub, and in the Apache community.

Alan Gates is a founder of Hortonworks and an original member of the engineering team that took Pig from a Yahoo! Labs research project to a successful Apache open source project. Alan is a PMC member of Apache Hive, Pig, and many other Apache projects. As part of the Apache Incubator PMC, he has mentored many new Apache communities. At Hortonworks he is part of the architecture team, helping design new features and products.

Gopal Vijayaraghavan is a performance specialist and an active contributor to, and PMC member of, the Apache Tez and Apache Hive projects. He is currently working on the Stinger.next initiative to improve SQL performance for popular BI tools, across deployments both on-prem and in the cloud.

Future of Data: Silicon Valley
Hortonworks HQ
5470 Great America Parkway · Santa Clara, CA