Enter the Snake Pit for Fast and Easy Spark

Hosted by Munich Cassandra Users - by DataStax -

Public group

This is a past event

33 people attended

Details

We are joining forces with our friends from NoSQL Munich Meetup Group (http://www.meetup.com/munich-nosql-meetup/) for this great Meetup with Jon Haddad, who is coming all the way from America to meet us!

-----

Agenda:

18:30 - Doors Open (Pizza to be served at this time)
19:00 - Jon Haddad: Enter the Snake Pit for Fast and Easy Spark
19:45 - Alex Petrov: Optimising Cassandra Usage for Data Crunching backends
20:30 - Networking

-----

Speaker:
Jon Haddad

Bio:
Jon has 15 years of experience in both development and operations. For the last 10 years he's worked at various startups in southern California. For the last 2 years he's been the maintainer of cqlengine, the Python object mapper for Cassandra, now integrated into the native Cassandra driver. He's a Technical Evangelist at DataStax, continuing to focus on advancing Cassandra in the Python, operations and data science communities. Jon holds a degree in Computer Science from the University of Vermont.

Title:
Enter the Snake Pit for Fast and Easy Spark

Abstract:
Everyone knows that Python isn’t suitable for massive scale analytics, right? Wrong. Spark 1.3 introduced DataFrames, which allow for high performance Spark batch jobs, streaming, and machine learning over massive datasets. In this talk you’ll learn how to combine Cassandra, a highly scalable, always-on OLTP data store, with PySpark, a framework for distributed computation. You'll learn how PySpark matches the performance that you’d see from either Scala or Java, something that was impossible only a year ago. I’ll show you how to unleash the full power of SQL with Cassandra data: that means joins, aggregations, and complex where clauses. Lastly, I’ll show you how to wrap up your queries with rich visualizations. So let’s get started and put the death wrap on your data.

-----

Speaker:
Alex Petrov

Bio:
Alex is a polyglot programmer, Lead Data Infrastructure Engineer at Instana Inc, a long-term Cassandra user, 2015 Cassandra MVP, and Project Reactor committer. He works on data processing pipelines, distributed systems and near-realtime processing backends.

Title:
Optimising Cassandra Usage for Data Crunching backends

Abstract:
Alex is going to talk about using Cassandra for data processing backends: how to run complex offline aggregate queries on large datasets with stream fusion, a great concept that arose from research at the University of New South Wales. The talk will also cover how to pack your models and how to access them in a fast, concurrent manner.