Data processing mayhem

Hosted by Thijs Schnitger

Details

We start off the first Software Circus meetup of 2018 with a hot topic: data processing.
Two excellent speakers are lined up for you, covering data processing from different angles: Adrian Ionescu from Databricks on MPP databases, MapReduce & Spark SQL, and Rokesh Jankie from Google on Spark & Hadoop.

• Big Data Analytics: What goes around comes around
This talk covers the differences between MPP databases and big data systems like MapReduce, and how they have been converging over the last couple of years, with Spark SQL at the intersection.

Speaker: Adrian Ionescu - Databricks
Adrian is a VU alumnus with an MSc degree in High Performance Distributed Computing. He built a prototype for an MPP database during a 6-month internship at Vectorwise, under the guidance of Peter Boncz, and worked there for another 4 years, taking that prototype into production. He joined Databricks about a year ago, where he works on improving the performance of Spark using MPP database techniques.

• Moving your Spark and Hadoop Workloads to Google Cloud Platform
For those who use popular open source data processing tools like Apache Spark and Hadoop, several steps can stand in the way of focusing on the data itself: creating a cluster to run the tools, finding a software package to install and manage them easily, then finding people to create jobs and applications and to operate, maintain and scale your clusters. In this session, you'll learn how to use Google Cloud's managed Spark and Hadoop service, Google Cloud Dataproc, to take advantage of your existing investments in Spark and Hadoop. You'll see how easily your existing data and code can be migrated to Google Cloud Platform (GCP), and how Cloud Dataproc clusters can be right-sized, run ephemerally and cleanly separated to make the most of your invested resources.

Speaker: Rokesh Jankie - Google Cloud Customer Engineer
Google enthusiast/believer since 2004 (launch of Gmail).
Rokesh Jankie is passionate about Google technology and enjoys his work as a Google Cloud Customer Engineer at Google Amsterdam.
Before this, Rokesh was CTO of QAFE Inc., a subsidiary of Qualogy, where he created his own product, which has been commercialized. At Qualogy, Rokesh was also head of R&D.
Before his role at QAFE, Rokesh was a consultant and worked for a startup in the finance industry.

Software Circus - Amsterdam
De Ruijterkade 143 · Amsterdam