Apache Spark & Co. - 5th Meetup in Berlin!


Details
The Spark Meetup Berlin returns for a fifth round: on Thursday, 5th October at 7pm, we will meet for two exciting talks in the meetup space at Deloitte Analytics Institute, Hohenzollerndamm 150-151, top floor.
Our first speaker is Christophe Schmitz from Instaclustr; please see below for a short bio. The second talk is still to be confirmed.
About the first talk:
At Instaclustr, we manage 1000+ Cassandra and Spark nodes for our customers. Part of our solution is Instametrics, a Spark + Cassandra cluster we use to store and process 50K+ metrics per second. In this talk, I will introduce the main concepts of Cassandra and show how we can leverage Spark to process data stored in Cassandra, based on a real-world use case.
This talk will cover:
- General introduction to Cassandra, including data modelling concepts
- Cassandra + Spark architecture
- Writing Spark jobs to query Cassandra
- How to optimize Spark + Cassandra jobs
- Tips and tricks learned along the journey
About the speaker:
Christophe has been an engineer and consultant at Instaclustr for more than two years. He has worked on several Cassandra projects, including Instametrics, an internal Cassandra cluster used to collect metrics for all Instaclustr-managed nodes; a government document repository project in the health sector; and a global digital platform for a Fortune 500 company.
