FOD Meetup #3 : Data Science, Spark with RapidMiner and Serverless

Details
Hello everyone,
The next Future Of Data Meetup will take place on June 8, starting at 18h30, at D2SI (http://www.d2-si.fr/). Thanks to D2SI for hosting us in their offices, and thanks to RapidMiner (https://rapidminer.com/) and Hortonworks (https://www.hortonworks.com/) for sponsoring the closing buffet.
Agenda
On the program: three presentations in French, followed by networking drinks:
18h30 – 19h00 : Welcome of participants
19h00 – 19h30 : What methodology to adopt for a Data Science project? (Amélie Groud - FastConnect)
19h30 – 20h00 : Apache Spark with RapidMiner demo (Jess Kilubu - RapidMiner and Dalila Messedi - Itecor)
20h00 – 20h30 : Ooso: MapReduce the Serverless way (Nicolas Monchy and Othmane Nahyl - D2SI)
20h30 – 21h00 : Drinks and networking
Abstracts
What methodology to adopt for a Data Science project? - Amélie Groud - Data Scientist, FastConnect
The key to a successful Data Science project is mastering the data source. It is important to know the structure of the data and the relationships between the different variables in order to build the best mathematical model. Many statistical tools are available for this, but you need the right methodology to avoid certain pitfalls.
Apache Spark with RapidMiner demo - Jess Kilubu - Inside Sales Rep, RapidMiner / Dalila Messedi - Senior Data Consultant, Itecor
Learn how SparkRM improves performance and increases productivity when working in Hadoop clusters, with parallel loops and the ability to bootstrap algorithms.
Ooso: MapReduce the Serverless way - Nicolas Monchy - Data engineering consultant, D2SI / Othmane Nahyl - Data engineering intern, D2SI
What if we could mix Serverless and Big Data? Serverless offers huge scalability and parallelization, which sounds like something Big Data could use. This is why Ooso was developed. It is a Java library that lets you run MapReduce in a serverless way, based on AWS Lambda and Amazon S3 (https://github.com/d2si-oss/ooso). All you need to implement is your map and reduce functions. What about performance and limitations? Both will be discussed during the talk, along with a demo.
