We are excited to invite you to another Data Engineering Workshop. This time, we will continue working with Python, Scala, and Apache Spark, both locally on the computer and in the cloud.
Some experience in the above areas is recommended.
Dublin Tech Summit has provided two free tickets to their conference on 10-11 April 2019. We will hold a raffle for these tickets during this event. Make sure to provide your full name during the event. More information about the conference is at https://dublintechsummit.com/
If you want to buy a ticket at a discount, use this link: https://ti.to/dts/dts19/with/b01nzi0cdmi
9:30 - 10:30 Data processing with Apache Spark by Eren
10:45 - 11:45 Apache Spark in practice by Sahil
12:00 - 12:45 ETL pipelines with Spark by Sahil
Eren will be going through the following topics:
Modern Data Processing Patterns,
Apache Spark Architecture,
Scala Structured API Transformation Examples on Databricks Community Edition,
Real-life use cases from a production perspective.
Eren Avşaroğulları (https://www.linkedin.com/in/erenavsarogullari/) holds both B.Sc. and M.Sc. degrees in Electronics & Control Engineering. Currently, he works at Workday on Data Analytics as a Sr. Data Engineer. He is also an open source contributor at the Apache Software Foundation (Apache Spark, Pulsar, Heron).
Sahil Dadia (https://www.linkedin.com/in/sahil-dadia-a77773109/) holds a Master's in Data Science and Analytics from Maynooth University. He works with Python, R, AWS, and Apache Spark in a Linux environment. Previously, he developed computer vision software for self-driving cars at Swaayatt Robots.
Sahil will cover these areas:
- Using PySpark for the ETL cycle.
- Understanding Spark internals for optimization.
- Writing to and reading from Cassandra and PostgreSQL databases using PySpark.
- Deploying PySpark code on AWS EMR.
We recommend bringing a fully charged laptop.
Let Roman Golovnya know if you are keen to host an event and/or present at future meetups. You can contact him via Meetup messages or email [masked].