Mastering Data Science Engineering

Hosted By
Roman G. and 3 others

Details

Hi All,

We are excited to invite you to another informative Saturday morning of coding.

Agenda:

9:30 - 10:00 Mastering Data Science Engineering by Roman

10:00 - 11:00 Building data pipelines by Sahil

11:00 - 11:15 Break, networking

11:15 - 12:00 Apache Spark Best Practices by Eren

12:00 - 12:45 Training Deep Learning Models on AWS Spot Instances using Spotty by Oleg

We recommend bringing a fully charged laptop.

Roman will review the Data Science Engineering Club's events, projects and results from the last 12 months. He will then outline plans for the next 3 months.

Sahil will show various use cases of data pipelines and the components required to create one. He will then give a live demo of building a house price prediction pipeline: ETL from AWS S3, data cleaning, data warehousing in a Postgres database, and a visualization dashboard built with Superset.
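
For anyone who wants a feel for the stages before the session, here is a minimal sketch of an extract, clean, load and query flow. It is not Sahil's demo code. To stay runnable with only the Python standard library, it assumes an inline CSV string in place of a file on S3 and an in-memory SQLite database in place of Postgres. The sample rows, the column names and the `houses` table are hypothetical, and the prediction and Superset steps are left out.

```python
import csv
import io
import sqlite3

# Hypothetical sample standing in for a CSV file downloaded from S3.
RAW_CSV = """address,bedrooms,price
1 Main St,3,350000
2 High St,,420000
3 Low Rd,2,not available
4 Park Ave,4,610000
"""


def extract(text):
    """Parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def clean(rows):
    """Keep only rows whose bedrooms and price parse as integers."""
    cleaned = []
    for row in rows:
        try:
            cleaned.append((row["address"], int(row["bedrooms"]), int(row["price"])))
        except ValueError:
            continue  # drop rows with missing or malformed numbers
    return cleaned


def load(conn, rows):
    """Write cleaned rows into a warehouse table (SQLite stands in for Postgres)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS houses (address TEXT, bedrooms INTEGER, price INTEGER)"
    )
    conn.executemany("INSERT INTO houses VALUES (?, ?, ?)", rows)
    conn.commit()


def average_price_by_bedrooms(conn):
    """Run an aggregate query of the kind a dashboard might chart."""
    cur = conn.execute(
        "SELECT bedrooms, AVG(price) FROM houses GROUP BY bedrooms ORDER BY bedrooms"
    )
    return cur.fetchall()


conn = sqlite3.connect(":memory:")
load(conn, clean(extract(RAW_CSV)))
print(average_price_by_bedrooms(conn))
```

Keeping each stage as its own function means the S3 download or the Postgres connection can be swapped in later without touching the cleaning logic.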

Eren will go through Apache Spark performance and tuning takeaways, focusing on Data Structures, Persistency, Partitioning, Event Sourcing on Transformations and Checkpointing.

Oleg will present Spotty, a Python package that simplifies training deep learning models on AWS. If you would like to follow Oleg's live demo, you will need an AWS account and the AWS CLI tool installed and configured (https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-install.html, https://docs.aws.amazon.com/cli/latest/userguide/cli-chap-configure.html). You will also need an environment with Python >= 3.5.

https://github.com/apls777/spotty
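
If you want to check your laptop before Oleg's demo, a small script along these lines may help. It is only a sketch: the function name `check_prerequisites` is our own, and it assumes the AWS CLI executable is called `aws` and is on your PATH. It does not verify that the CLI is configured with credentials or that Spotty itself is installed.

```python
import shutil
import sys

MIN_PYTHON = (3, 5)  # minimum version stated in the announcement


def check_prerequisites(version_info=None, which=shutil.which):
    """Return a list of problems found; an empty list means the basics are in place."""
    if version_info is None:
        version_info = sys.version_info
    problems = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        problems.append(
            "Python {}.{} found, but >= 3.5 is required".format(
                version_info[0], version_info[1]
            )
        )
    # Assumption: the AWS CLI is installed under the executable name "aws".
    if which("aws") is None:
        problems.append("AWS CLI ('aws') was not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print("MISSING: " + issue)
    if not issues:
        print("Python and the AWS CLI look ready for the demo.")
```

The script avoids f-strings on purpose so that it also runs on Python 3.5.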

Roman Golovnya (https://www.linkedin.com/in/romangolovnya) graduated with a degree in Finance and IT. Currently, he works at Hertz as a Data Engineer. He develops scalable data solutions using Python, Apache Spark and AWS services. In his free time, he organises the events for this meetup group, plays table tennis and participates in Kaggle competitions.

Sahil Dadia (https://www.linkedin.com/in/sahil-dadia-a77773109/) holds a Master's in Data Science and Analytics from Maynooth University. He works with Python, R, AWS and Apache Spark in a Linux environment. Previously, he developed computer vision software for self-driving cars at Swaayatt Robots.

Eren Avşaroğulları (https://www.linkedin.com/in/erenavsarogullari/) holds both B.Sc. and M.Sc. degrees in Electronics & Control Engineering. Currently, he works at Workday on Data Analytics as a Sr. Data Engineer. He is also an open source contributor at the Apache Software Foundation (Apache Spark, Pulsar, Heron).

Oleg Polosin (https://www.linkedin.com/in/polosin/) holds an M.Sc. degree in Informatics and Applied Mathematics. He is a Machine Learning Engineer at Zalando SE.

Let Roman Golovnya know if you are keen to host an event and/or present at future meetups. You can contact him via meetup messages or by email at roman.golovnya@gmail.com.

Data Science and Engineering Club
Bank of Ireland Trinity branch,
Hamilton Building, Trinity Campus Dublin · Dublin