FOD Meetup #3 : Data Science, Spark with RapidMiner and Serverless

Details

Hello everyone,

The next Future Of Data Meetup will take place on June 8, starting at 18h30, at D2SI (http://www.d2-si.fr/). Thanks to D2SI for hosting us at their offices, and thanks to RapidMiner (https://rapidminer.com/) and Hortonworks (https://www.hortonworks.com/), who are providing the closing buffet.

Agenda

On the program: three presentations in French, followed by networking drinks:

18h30 – 19h00 : Welcome of participants

19h00 – 19h30 : What methodology to adopt for a Data Science project? (Amélie Groud - FastConnect)

19h30 – 20h00 : Apache Spark with RapidMiner demo (Jess Kilubu - RapidMiner and Dalila Messedi - Itecor)

20h00 – 20h30 : Ooso: MapReduce the Serverless way (Nicolas Monchy and Othmane Nahyl - D2SI)

20h30 – 21h00 : Drinks and networking

Abstracts

What methodology to adopt for a Data Science project? - Amélie Groud - Data Scientist, FastConnect

The key to a successful Data Science project is mastering the data source. You need to understand the structure of the data and the relationships between the variables in order to build the best mathematical model. Many statistical tools are available for this, but you need the right methodology to avoid common pitfalls.

Apache Spark with RapidMiner demo - Jess Kilubu - Inside Sales Rep, RapidMiner / Dalila Messedi - Senior Data Consultant, Itecor

Learn how SparkRM improves performance and increases productivity when working in Hadoop clusters, with parallel loops and the ability to bootstrap algorithms.

Ooso: MapReduce the Serverless way - Nicolas Monchy - Data engineering consultant, D2SI / Othmane Nahyl - Data engineering intern, D2SI

What if we could mix Serverless and Big Data? Serverless offers huge scalability and parallelization, which sounds like something we could use in Big Data. This is why Ooso was developed. It is a Java library that lets you run MapReduce in a serverless way, based on AWS Lambda and Amazon S3 (https://github.com/d2si-oss/ooso). All you need to implement is your map and reduce functions. What about performance and limitations? The talk will address these questions and include a demo.

Future of Data: Paris
D2SI
29 bis rue d’Astorg, 75008 Paris