Analytics with Cassandra and PySpark

Hosted by Friso van Vollenhoven

Details

Update: eBay have kindly offered to host our meetup at their Amsterdam office. They will also be sponsoring food and drinks for the evening.

Please note that we will not be serving a full meal at this meetup. There will be snacks and drinks, thanks to eBay.

For the next meetup, we welcome Frens Jan Rumph as a speaker, and perhaps a second speaker to be announced.

We are working on a location; this will be announced soon. We will also raise the RSVP limit once the location is secured.

Agenda:

• 18.00: Arrive, mingle, etc.

• 19.00: Introduction from your humble organisers and the evening sponsor

• 19.20: Talks:

Analytics with Cassandra and PySpark, by Frens Jan Rumph

Spark is a great tool for working with Python on a cluster. Cassandra is a powerful storage engine, and Spark extends it with a rich set of processing capabilities. In this talk, Frens Jan will show how these tools work together through PySpark Cassandra. He will show how to set up Spark batch and streaming applications with Cassandra, using real-world use cases in entity resolution and earthquake monitoring as a backdrop. He will also provide insight into the performance characteristics of this toolchain.

About PySpark Cassandra

PySpark Cassandra is an open-source project by Target Holding that wraps the Spark Cassandra Connector by DataStax in order to support all major batch and stream processing capabilities in Python.

About Frens Jan Rumph

Frens Jan is a database and processing architect at Target Holding, where he works with machine learning and IT experts to deliver intelligent applications in media, utilities and health care.

• 20.00: Drink some more

• ??.??: Everybody out!

Data Council Amsterdam - NL Data Engineering & Science