
Details

⚠️ Please remember to provide your full name and an identity document (national ID card, passport, etc.) in the dedicated question on the registration form. You will be required to present that document on the day of the event in order to access the venue.

We want to spread the word about Spark, the most active Apache project, which is changing Big Data at great speed. Platforms based on Hadoop and HDFS are still relevant, but the performance and speed of Spark-based platforms make new uses and applications possible. If you are in Madrid and interested in Spark, this is the place to learn and to share interests and ideas. Let's spark things up!

Agenda:

18:15 Introduction and welcome
18:30 Meni Shmueli (Co-founder & CEO at DataFlint), Distributed Computing Unpacked
19:20 Ángel Álvarez Pascua (Dataiku), Data Forensics with Spark, Zingg and GraphFrames
20:00 Networking (courtesy of DataFlint)

Abstract:

Meni Shmueli (DataFlint): Distributed computing, especially with Apache Spark, can feel unnecessarily complex, but under the hood it's a beautiful system of trade-offs. In this talk, Meni walks through how distributed engines optimize for speed, fault tolerance, and scalability, and how understanding these mechanics can help you fix real-world performance issues in your Spark jobs.

Ángel Álvarez Pascua (Dataiku): Your data is living a double life, probably a triple one. In this session, we play digital detective by using Zingg to sniff out suspicious similarities and GraphFrames to map the entire "criminal" network of duplicate profiles. You'll see how to move from "I think these are the same person" to "I have the graph proof," all while keeping your Spark cluster from breaking a sweat.
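
As a rough illustration of the idea behind this session (not the speaker's code), the step from pairwise "these look like the same person" matches to whole networks of duplicates amounts to finding connected components in a graph whose edges are the matches. The sketch below does this in plain Python with a union-find structure. The function name `cluster_duplicates` and the sample record ids are hypothetical; in a real pipeline the pairs would presumably come from a matching tool such as Zingg, and the grouping would run on the cluster, for example with GraphFrames' `connectedComponents()`.

```python
from collections import defaultdict


def cluster_duplicates(pairs):
    """Group record ids into clusters, given pairwise 'same entity' matches.

    Hypothetical helper for illustration: `pairs` is an iterable of
    (id_a, id_b) tuples, assumed to be the output of a matching step.
    """
    parent = {}

    def find(x):
        # Return the representative of x's cluster, registering x if new.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        # Merge the clusters containing a and b.
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    for a, b in pairs:
        union(a, b)

    clusters = defaultdict(set)
    for record in list(parent):
        clusters[find(record)].add(record)
    return sorted(sorted(c) for c in clusters.values())


# Made-up sample matches: p1~p2 and p2~p3 imply p1, p2, p3 are one profile.
matched_pairs = [("p1", "p2"), ("p2", "p3"), ("p4", "p5")]
print(cluster_duplicates(matched_pairs))
# [['p1', 'p2', 'p3'], ['p4', 'p5']]
```

Note that p1 and p3 end up in the same cluster even though they were never matched directly, which is the "graph proof" the abstract alludes to.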

Bio:

Meni Shmueli - Co-founder & CEO at DataFlint

Ángel Álvarez Pascua - Dataiku

Related topics

Apache Spark
Big Data
Data Analytics
Python
Scala
