English speakers: unlike our regular meetings, this meetup will be held in English, so don’t hesitate to come!
Do you have a pile of data and want to run seriously powerful machine learning algorithms using nothing but R? Then this meetup is for you!
H2O is a software platform for running the most advanced predictive algorithms in a distributed fashion. It is optimized for “big data” environments and interfaces with R through the “h2o” package. To help you discover this platform in more detail, we have brought the developers themselves straight from London to present the fruit of their labor! On top of that, a user will give us a demonstration on a concrete use case.
Unlike our usual gatherings, this special meetup will be in English. But as always, there will be food and drink in the warmest of atmospheres!
Evening program:
• 18h30-19h Welcome, networking and snacks (to keep you going until the pizzas!)
• 19h-19h30 Introduction to Machine Learning with H2O - Jo-fai Chow
In this talk, I will give you an overview of our company (H2O.ai), our open-source machine learning platform (H2O) as well as our new projects (e.g. Deep Water and Steam). This will be useful for attendees who are not familiar with H2O.
• 19h30-20h15 Automatic Machine Learning using H2O - Jiqiong QIU
In this talk, I will present a general machine learning pipeline using H2O for a Kaggle competition (a data mining competition). Participating in Kaggle competitions can be very time-consuming. For this reason, I would like to build an automatic machine learning solution that gives reasonable results on most datasets. This talk also gives an overview of the main functions in H2O as well as real-world use cases.
• 20h15-21h Sparkling Water 2.0 - Jakub Háva
Sparkling Water integrates the H2O open-source distributed machine learning platform with the capabilities of Apache Spark. It allows users to leverage H2O’s machine learning algorithms within Apache Spark applications via Scala, Python, R or H2O’s Flow GUI, which makes Sparkling Water a great enterprise solution. Sparkling Water 2.0 was built to coincide with the release of Apache Spark 2.0 and introduces several new features. These include the ability to use H2O frames as an Apache Spark SQL data source, transparent integration into Apache Spark machine learning pipelines, the power to use Apache Spark algorithms via the Flow GUI, and easier deployment of Sparkling Water in a Python environment. In this talk we will introduce the basic architecture of Sparkling Water and provide an overview of the new features available in Sparkling Water 2.0. The talk will also include a live demo showing how to integrate H2O algorithms into Apache Spark pipelines – no terminal needed!
• 21h until the end of the night:
Jo-fai (or Joe) is a data scientist at H2O.ai. Before joining H2O, he was in the business intelligence team at Virgin Media in the UK, where he developed data products to enable quick and smart business decisions. He also worked remotely for Domino Data Lab in the US as a data science evangelist, promoting products via blogging and giving talks at meetups. Joe has a background in water engineering. Before his data science journey, he was an EngD research engineer at the STREAM Industrial Doctorate Centre, working on machine learning techniques for drainage design optimization. Prior to that, he was an asset management consultant specializing in data mining and constrained optimization for the utilities sector in the UK and abroad. He also holds an MSc in Environmental Management and a BEng in Civil Engineering.
Jakub (or “Kuba”) finished his bachelor’s degree in computer science at Charles University in Prague, and is currently finishing his master’s in software engineering there as well. For his bachelor’s thesis, Kuba wrote a small platform for distributed computing of tasks of any type. In his current master’s studies, he is developing a cluster monitoring tool for JVM-based languages, which should make debugging and reasoning about the performance of distributed systems easier using a concept called distributed stack traces. At H2O, Kuba mostly works on the Sparkling Water project.