Why Pig? + Pig on Spark Update (Strata + Hadoop World meetup)

Name: Why Pig? + Pig on Spark Update (Strata + Hadoop World meetup)
Start: 2014-10-15T18:30:00-04:00
End: 2014-10-15T20:30:00-04:00
Location: Outbrain HQ

Hosted by John M.

NYC Pig User Group

Details

It's been a while since we got together, but we've got two great talks lined up during this year's Strata + Hadoop World.

Our first speaker is Jonathan Coveney. Jonathan is a senior software engineer at Twitter and an Apache Pig committer and PMC member. Follow him on Twitter: @jco (https://twitter.com/jco)

Says Jonathan: "Pig is a useful tool for data analytics used at many companies, but there are competitors. Scalding, Hive, etc., all have active communities and provide an alternative to Pig. I will look at the areas where I think Pig shines, as well as the areas where I think improvements need to be made if it's not going to be left in the dust. Bring questions."

Next we'll hear from Sigmoid Analytics CTO Mayur Rustagi (@mayur_rustagi, https://twitter.com/mayur_rustagi) about the Pig on Spark project:

"With big data processing geared towards low latency, Pig on Spark aims to make ETL faster by using Spark as the execution engine instead of MapReduce in a Hadoop cluster. Spark, an open source data analytics cluster computing framework, is a more natural fit for the query plan produced by Pig. With optimized and shorter query plan graphs, Pig-on-Spark delivers huge performance improvements by executing the entire script within one YARN application as a single DAG and avoiding intermediate storage in HDFS. Spark offers the competitive advantage of high velocity analytics by way of stream processing large volumes of data, versus what has been traditionally a more heavily 'batch-oriented' approach to data processing as seen with Hadoop. Spark also provides a more inclusive framework allowing for multiple analytics processing options including: fast interactive queries, streaming analytics, graph analytics and machine learning."

Food and drink generously sponsored by Cloudera. See you there!
