Machine Learning with H2O
At our next meetup you can attend many exciting talks about the H2O machine learning platform. Telenor will talk about how they use H2O, RapidMiner will present their H2O integration project, and we will have two special guests from H2O.ai who will mainly talk about the new Sparkling Water 2.0.
See you on September 2 at Prezi House of Ideas! As this is a Friday, we are starting earlier than usual. Gates open at 18:00 with pizza and socializing. Talks will start around 18:30.
Jo-Fai Chow, Data Scientist, H2O.ai
Joe will briefly introduce H2O for those unfamiliar with the platform and set the stage for the talks afterwards.
Jo-fai (or Joe) is a data scientist at H2O.ai. Before joining H2O, he was in the business intelligence team at Virgin Media, where he developed data products to enable quick and smart business decisions. He also worked part-time for Domino Data Lab as a data science evangelist, promoting products via blogging and giving talks at meetups. Joe has a background in water engineering. Before his data science journey, he was an EngD researcher at the STREAM Industrial Doctorate Centre, working on machine learning techniques for drainage design optimization. Prior to that, he was an asset management consultant specializing in data mining and constrained optimization for the utilities sector in the UK and abroad. He also holds an MSc in Environmental Management and a BEng in Civil Engineering.
Integrating H2O into RapidMiner Studio
Zsolt Toth, Software Engineer, RapidMiner
At RapidMiner, we decided to integrate H2O in version 7.2 because of the high accuracy, good performance, and ease of use of its machine learning algorithms, such as GLM, GBT and Deep Learning. As the target audiences for H2O and RapidMiner are quite different, we have simplified the H2O parameters and improved the error messages to make the algorithms more user-friendly. The talk will explain the challenges along the road, for example matching two very different architectures and presenting them in a unified user interface.
Zsolt has been a software engineer at RapidMiner for 3 years. He mainly works on the award-winning Big Data analytics product called RapidMiner Radoop. He loves working on the edge with the newest technologies and features in the Hadoop ecosystem, focusing on Apache Spark, Apache Hive and H2O. Zsolt has a master’s degree in computer science from the Budapest University of Technology.
How H2O is Sparkling @ Telenor: Experiences in moving from a traditional data mining architecture to a distributed one
Norbert Liki, Advanced Analytics Manager, Telenor
In this talk Norbert will briefly show how the changes in the ICT market affect the traditional telecom analytical/BI architecture and how they are coping with this challenge at Telenor Hungary using Big Data and H2O. He will give you insight into the aspects that convinced them to choose this solution in the first place. Furthermore, he will share their hands-on experiences in integrating H2O into their data mining process, elaborating on its strengths and weaknesses from their point of view.
Norbert has a background in Economics and Finance. He joined Telenor Hungary 5 years ago. Throughout his career he has worked closely with the business side to solve problems with increasingly advanced tools and techniques. As part of the Advanced Analytics Team, he is responsible for tasks coming from every department of the company that involve machine learning and Big Data.
Sparkling Water 2.0
Jakub Háva, Software Engineer, H2O.ai
Sparkling Water integrates the H2O open source distributed machine learning platform with the capabilities of Apache Spark. It allows users to leverage H2O’s machine learning algorithms in Apache Spark applications via Scala, Python, R or H2O’s Flow GUI, which makes Sparkling Water a great enterprise solution. Sparkling Water 2.0 was built to coincide with the release of Apache Spark 2.0 and introduces several new features. These include the ability to use H2O frames as an Apache Spark SQL data source, transparent integration into Apache Spark machine learning pipelines, the power to use Apache Spark algorithms via the Flow GUI, and easier deployment of Sparkling Water in a Python environment. In this talk we will introduce the basic architecture of Sparkling Water and provide an overview of the new features available in Sparkling Water 2.0. The talk will also include a live demo showing how to integrate H2O algorithms into Apache Spark pipelines – no terminal needed!
Jakub (or “Kuba”) finished his bachelor’s degree in computer science at Charles University in Prague, and is currently finishing his master’s in software engineering there as well. For his bachelor’s thesis, Kuba wrote a small platform for distributed computing of tasks of any type. In his current master’s studies he is developing a cluster monitoring tool for JVM-based languages, which should make debugging and reasoning about the performance of distributed systems easier using a concept called distributed stack traces. At H2O, Kuba mostly works on the Sparkling Water project.
Closing: Beers at Apacuka