It's PyData time again.
In an unexpected twist, a flood of H2O folks are visiting town, bringing goodies and knowledge, and they asked us if we wanted to hear about the latest things boiling in the H2O cauldron. eBay/Marktplaats were kind enough to offer a location for the meetup. Thank you, folks!
Melanie, Vincent, Marcel, Gabriele, and Giovanni
18:00 doors open up
19:05 doors close - be on time!
19:15 first talk - Sparkling Water 2.0
20:30 second talk - Deep Water
Sparkling Water 2.0 - Jakub Háva (45 mins)
Sparkling Water integrates the H2O open source distributed machine learning platform with the capabilities of Apache Spark. It allows users to leverage H2O’s machine learning algorithms within Apache Spark applications via Scala, Python, R, or H2O’s Flow GUI, which makes Sparkling Water a great enterprise solution. Sparkling Water 2.0 was built to coincide with the release of Apache Spark 2.0 and introduces several new features. These include the ability to use H2O frames as an Apache Spark SQL data source, transparent integration into Apache Spark machine learning pipelines, the power to use Apache Spark algorithms via the Flow GUI, and easier deployment of Sparkling Water in a Python environment. In this talk we will introduce the basic architecture of Sparkling Water and provide an overview of the new features available in Sparkling Water 2.0. The talk will also include a live demo showing how to integrate H2O algorithms into Apache Spark pipelines – no terminal needed!
Jakub (or “Kuba”) finished his bachelor’s degree in computer science at Charles University in Prague and is currently finishing his master’s in software engineering. For his bachelor’s thesis, Kuba wrote a small platform for distributed computing of tasks of any type. In his current master’s studies he is developing a cluster monitoring tool for JVM-based languages, which should make debugging and reasoning about the performance of distributed systems easier using a concept called distributed stack traces. At H2O, Kuba mostly works on the Sparkling Water project.
Project “Deep Water” (H2O integration with other deep learning libraries) - Arno Candel (30 mins)
The “Deep Water” project is about integrating our H2O platform with other open-source deep learning libraries such as TensorFlow, MXNet, and Caffe. I will talk about the motivation and potential benefits of this project and then carry out a live demo using MXNet as the GPU backend.
Dr. Arno Candel is the Chief Architect at H2O.ai. Arno is also the main author of H2O’s Deep Learning and a key contributor to H2O’s GBM and DRF algorithms. Arno spent the last 5 years designing and implementing high-performance machine-learning algorithms. Previously, he spent a decade in high-performance computing and ran his code on the world’s largest supercomputers as a staff scientist at SLAC National Accelerator Laboratory, where he participated in US DOE scientific computing initiatives and collaborated with CERN on next-generation particle accelerators.