Pre-holiday Spark des fêtes


Details
Exams over? Projects delivered? It's time for another Spark meetup before the holidays. Note that the language of the talks is still to be determined, at the discretion of the speaker(s).
Doors open at 6 PM and the talks start at 6:30 PM.
Title (1st talk)
Discovery and discussion of Ludia's big data stack and an unusual use case @Ludia
Speaker (1st talk)
Bonie Ntanke
Big data Analytics developer @Ludia
https://www.linkedin.com/in/bonie-ntanke-04a84511b/
Description (1st talk)
Ludia's developers will present their use of Spark in two parts:
- Their stack: S3, Spark, Oozie, Zeppelin, Impala ...
- The difference between Impala and Spark on Zeppelin and why Impala is much better, as well as accounting reports on Spark
Title (2nd talk)
Compiling openCypher graph queries with Spark Catalyst
(In English)
Speaker (2nd talk)
Gabor Szarnyas, PhD student @ Budapest University of Technology and Economics,
visiting researcher @ McGill
-----------------------------
Title (1st talk)
Discoveries and discussions on Ludia's Big Data Stack.
Speaker (1st talk)
Jean-Philippe Malette
Data Analyst @Ludia
https://www.linkedin.com/in/jeanphilippemallette/
Description (1st talk)
Ludia's developers will present their use of Spark in two parts:
- Their data stack: S3, Spark, Oozie, Zeppelin, Impala ...
- The difference between Impala and Spark on Zeppelin, and why Impala is a lot better. Also a discussion of the accounting reports built with Spark.
Title (2nd talk)
Compiling openCypher graph queries with Spark Catalyst
(In English)
Speaker (2nd talk)
Gabor Szarnyas, PhD student @ Budapest University of Technology and Economics, visiting researcher @ McGill
Description (2nd talk)
In the early 2010s, graph databases were an odd branch of the NoSQL family, with cumbersome APIs and limited practical applications. This changed significantly when the Neo4j graph database introduced Cypher, a SQL-like declarative graph query language that allows users to express their queries using an intuitive and readable formalism.
Two years ago, Neo4j released openCypher, an open specification of the Cypher query language. openCypher has since been adopted by both industry and academia, most notably in the SAP HANA database and in multiple research prototypes, such as Graphflow (developed at the University of Waterloo) and our ingraph project.
The goal of the ingraph project is to provide live query evaluation using incremental view maintenance – a common technique used by relational databases. However, compiling openCypher queries to an incrementally maintainable representation is a challenging task that involves multiple transformation steps. Our approach uses Spark's Catalyst framework, originally intended for representing and optimizing relational algebra expressions for Spark SQL, and extends it with graph-specific operators.
The talk will outline the context of the problem and walk through the steps required to compile and evaluate openCypher queries, with a particular emphasis on the Spark-specific components.
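For readers unfamiliar with incremental view maintenance, here is a minimal Python sketch of the idea. It is not ingraph's implementation and does not use Spark or Catalyst. The class `TwoHopView` and the function `recompute` are hypothetical names introduced for illustration, and the sketch assumes an insert-only graph without parallel edges. It keeps the result of the query `MATCH (a)-->(b)-->(c) RETURN a, c` up to date by applying only the change caused by each new edge, instead of re-running the query.

```python
from collections import Counter, defaultdict


class TwoHopView:
    """Materialized result of MATCH (a)-->(b)-->(c) RETURN a, c (hypothetical sketch)."""

    def __init__(self):
        self.succ = defaultdict(set)  # vertex -> direct successors
        self.pred = defaultdict(set)  # vertex -> direct predecessors
        self.rows = Counter()         # (a, c) -> number of two-hop paths from a to c

    def insert_edge(self, u, v):
        """Apply the delta of one new edge u -> v to the view."""
        if v in self.succ[u]:
            return  # assumption: no parallel edges, so a repeated insert is a no-op
        # Delta rule for the self-join E x E:
        #   new rows = (dE join E_old) + (E_old join dE) + (dE join dE)
        for c in self.succ[v]:        # new edge used as the first hop
            self.rows[(u, c)] += 1
        for a in self.pred[u]:        # new edge used as the second hop
            self.rows[(a, v)] += 1
        if u == v:                    # a self-loop can serve as both hops
            self.rows[(u, u)] += 1
        self.succ[u].add(v)
        self.pred[v].add(u)


def recompute(edges):
    """Evaluate the same query from scratch, for comparison with the view."""
    edge_set = set(edges)
    rows = Counter()
    for a, b in edge_set:
        for b2, c in edge_set:
            if b == b2:
                rows[(a, c)] += 1
    return rows


edges = [(1, 2), (2, 3), (2, 4), (3, 3), (4, 1), (1, 2)]
view = TwoHopView()
for u, v in edges:
    view.insert_edge(u, v)
```

Each insert touches only the neighbours of the new edge, whereas `recompute` joins the whole edge set with itself. Supporting deletions and arbitrary openCypher patterns is considerably harder, which is the kind of problem the abstract describes.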
