The state of SQL-on-Hadoop in the Cloud by Nicolas Poggi

Name: The state of SQL-on-Hadoop in the Cloud by Nicolas Poggi
Start: 2016-12-01T19:00:00+01:00
End: 2016-12-01T22:00:00+01:00
Location: Trovit Search

Hosted by Nico P. and Ferran Galí i R.

Big Data Operations On Performance (BDOOP)

Details

Managed Hadoop in the cloud, especially SQL-on-Hadoop, has been gaining attention recently. On Platform-as-a-Service (PaaS) offerings, analytical services like Hive and Spark come preconfigured for general-purpose use and ready to run, giving users a quick entry point and on-demand deployment of ready-made SQL-like solutions. This talk evaluates the main PaaS services from an end-user perspective using a popular Hive benchmark. Results focus on the performance, readiness, scalability, and price of the tested providers, including:

• Microsoft Azure HDInsight (HDI)

• Amazon Web Services Elastic MapReduce (EMR)

• Google Dataproc

• Rackspace Cloud Big Data (CBD)

The talk highlights the main performance trends related to both hardware and software configuration, along with the pricing, similarities, and architectural differences of the cloud providers, and compares them to an on-premises commodity cluster. Results also show the importance of application-level tuning and how keeping hardware and software stacks up to date can influence performance even more than replicating the on-premises model in the cloud.

Agenda:

19:00 - Arrive at Trovit and meet other members

19:15 - Main talk starts

20:00 - Discussion, beers, and pizzas courtesy of Trovit Search

About the speaker:

Nicolas Poggi (@ni_po, http://www.twitter.com/ni_po) is an IT researcher with a focus on the performance and scalability of data-intensive applications and infrastructures. He is currently leading a research project on upcoming architectures for Big Data at the Barcelona Supercomputing Center (BSC) and Microsoft Research joint center. Nicolas received his PhD in Distributed Systems and Computer Architecture at UPC/BarcelonaTech, where he is part of the HPC and the Data Centric Computing research groups. He has also been a Research Scholar at IBM Watson, working on Big Data and system performance topics. Publications can be found at: http://personals.ac.upc.edu/npoggi/
