SOLD OUT - Spark & Machine Learning Meetup

Details
Join the Brussels Data Science Community, Spark Summit Europe (https://spark-summit.org/eu-2016/) attendees, and Spark ML and machine learning experts Nick Pentreath and Jean-François Puget for a talk on a Spark-based end-to-end machine learning system. A round of Apache Spark™ and machine learning lightning talks will follow. Here is the agenda:
18:30 Welcome, Food & Drink
18:40 Update on the preparation and workshops for the Data4Good Hackathon (www.denguehack.org) - Philippe Van Impe, Founder, European Data Innovation Hub & Brussels Data Science Community.
18:50 Introduction - Berni Schiefer, IBM Fellow. https://spark-summit.org/eu-2016/speakers/berni-schiefer/
19:00 Creating an end-to-end Recommender System with Spark ML
- Nick Pentreath, Principal Engineer at the IBM Spark Technology Center, Apache Spark PMC member, and author of Machine Learning with Spark. https://spark-summit.org/eu-2016/speakers/nick-pentreath/
- Jean-François Puget, Distinguished Engineer, Machine Learning and Optimization, IBM Analytics. https://www.linkedin.com/in/jfpuget
There are many resources available for building basic recommendation models using Spark. But how does a practitioner go from the basics to creating an end-to-end machine learning system, including deployment and management of models for real-time serving? In this session, we will demonstrate how to build such a system based on Spark ML and Elasticsearch. In particular, we will focus on how to go from data ingestion to model training to a real-time predictive system.
19:45 Lightning Talks
10-minute Spark and machine learning talks, including new projects from Belgium.
-
Data Science as a Team Sport
Today, data science is very often an individual sport. Data scientists and data engineers choose their own tools or flavor and work on their own.
Learn how Data Science Experience can make data science a team sport, bringing data scientists and data engineers together to make data science and machine learning available to everyone. Presenter: Juergen Schaeck
-
Telco data stream simulation, processing and visualization
Koen will discuss the development of a prototype for processing data coming from cell towers, carried out for a telco operator in the Middle East. An added difficulty was that the customer could not provide real data. In the end, he developed a data generator in Scala/Akka, a data processor with Spark Streaming, and a visualization front-end with Node.js. Presenter: Koen Dejonghe
-
Hyperparameter Optimization - when scikit-learn meets PySpark
Spark is not only useful when you have big data problems. If you have a relatively small data set, you might still have a big computational problem. One such problem is the search for optimal parameters for ML algorithms.
Normally, a data scientist has a laptop with 4 cores (8 threads), which means it will take some time to perform a grid search… However, Spark makes it possible to carry out the grid search on a cluster with a higher degree of parallelism. Presenter: Sven Hafeneger
-
A data scientist, a BI expert and a big data engineer walk into a bar: how 3 different worlds come together with Spark
Because of its general-purpose nature, Spark is being used by a wide variety of data professionals, each with their own background. The data warehouse / data lake of a large organisation is a spot where those 3 worlds collide. We've experienced the good, the bad and the ugly of those encounters first-hand. In this lightning talk, we share what each group can learn from the others, how they can collaborate, and which are the recipes for disaster. Presenter: Kris Peeters - Data Minded
-
Writing Spark applications, the easy way: how to focus on your data pipelines and forget about the rest
Even though Spark offers intuitive and high-level APIs, writing production-ready Spark data pipelines involves non-trivial challenges for data scientists without an expert background in software development and devops matters. In this short talk, I'll present how we tackled these issues at Real Impact Analytics by developing an intuitive framework for writing dataflows, offering convenient data exploration and testing facilities while hiding devops-related complexity. Presenter: Pierre Borckmans - Real Impact Analytics
-
A very brief introduction to extending Spark ML for custom models: Talk + Demo
Spark ML pipelines, inspired by scikit-learn, have the potential to make our machine learning tasks much easier. This talk looks at how to extend Spark ML with your own custom model types when the built-in options don't meet your needs. Presenter: Holden Karau
20:45 Networking & Refreshments
