Big Data & Data Science

Hosted By
Rémi L.
Big Data & Data Science

Details

We invite you to the next Meetup on Streaming Analytics, Hadoop, Spark, Jaro-Winkler, and more. We hope to see many of you at this new meetup session dedicated to Big Data and Data Science, which, as usual, will give pride of place to illustrations and demonstrations.

October 2 at Le Village by CA (http://www.levillagebyca.com/evenement/meetup-ibm-big-data-data-science/)

Here is the proposed agenda (detailed agenda to follow):

1/ Apache NiFi: latest developments for flow management at scale

Apache NiFi is an integrated flow management platform for automating the movement of data between disparate systems. Its real-time architecture makes it easy to integrate with a variety of systems and to move data in a secure, scalable and governed fashion. In this meetup, we will give an introduction to NiFi and its architecture. We will then cover the developments and features brought by the latest release, such as Change Data Capture, SQL queries on streams, and Record readers/writers. We will also discuss the NiFi roadmap and future projects.

Speaker: Abdelkrim Hadjidj, Solution Engineer, Hortonworks

2/ Giving a boost to the Hadoop and Spark ecosystems with in-memory technologies

Hadoop and Spark are fast, but in-memory technologies are faster still. Today, a requirement to speed up data access means doing something around HDFS, adding more nodes to the cluster, purchasing more expensive hardware, or looking for solutions in Hadoop's ecosystem of technologies. But what if we had some extra RAM in our data warehouse? Could we make use of it? We will present how to speed up existing Hadoop and Spark deployments by making Apache Ignite responsible for RAM utilization. No code modifications, no new architecture from scratch! Specifically, this presentation will cover: Hadoop Accelerator, HDFS-compliant In-Memory File System, MapReduce Accelerator, Spark Shared RDDs, and Spark SQL boost.

Speaker: Akmal Chaudhri, PhD, Big Data Expert, GridGain Systems

3/ Parallelizing the "Jaro-Winkler" similarity algorithm

Jaro-Winkler is an algorithm that measures the similarity between two character strings for fuzzy matching. In this talk, we present comparative performance results for parallelized implementations of Jaro-Winkler with Spark and BigSQL.

Speakers: Zied Abidi, Data Scientist, and Jacques Milman, Architect
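
For readers unfamiliar with the metric, here is a minimal single-machine sketch of the textbook Jaro-Winkler similarity in Python. It is not the speakers' Spark or BigSQL implementation; the function names and the defaults `prefix_scale=0.1` and `max_prefix=4` are the commonly used conventions, assumed here for illustration.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means identical strings."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0

    # Characters match only if they are equal and no farther apart than this window.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0

    # Walk both strings' matched characters in order; each out-of-order
    # pair counts as half a transposition.
    k = 0
    half_transpositions = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                half_transpositions += 1
            k += 1
    t = half_transpositions / 2

    return (matches / len1 + matches / len2 + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted for strings that share a common prefix."""
    base = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return base + prefix * prefix_scale * (1 - base)
```

Because each pair of strings is scored independently with no shared state, a function like this can be mapped over many pairs at once, which is what makes the algorithm a natural candidate for parallelization. For the classic example pair "MARTHA" / "MARHTA", `jaro_winkler` returns approximately 0.961.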

4/ Streaming Analytics In Real Life

From probing deep space to predicting crowd movements at a festival, by way of fish reproduction, Streaming Analytics is everywhere! Lessons learned from large-scale, real-time analytics use cases.

Speaker: Johan Picard, Big Data Expert

5/ Q&A - Closing & networking cocktail

Our meetup will start at 18:30 sharp. Participants will be welcomed from 18:00.

============ PRACTICAL INFORMATION =============

Talk schedule: Start: 18:30 - Expected end: 20:00 + cocktail

Address: Le Village By CA (https://www.facebook.com/levillagebycaparis/), 55 rue de la Boëtie, Paris

How to get there?
Metro: Miromesnil, line 9 or line 13
Bus: 28, 32, 52, 80, 83, 93
Parking: 7 min from the Haussmann-Berri car park

Data, Cloud and AI in Paris