Big Data Meetup - October

Hosted by Valentina Crisan

Details

Hello everybody,

I am happy to restart our online meetings this year with the October meetup. Our guests: Tudor from GoPro will talk about their journey upgrading to Spark 3, and Marius from eMAG will take us through building a simple BI solution for startups. Many thanks to both for taking the time to present at our meetup.

Please note that we will use Zoom; the meeting registration and details will be visible only to participants who RSVP Yes.

Details of the agenda:

6:00 PM - 6:10 PM gathering
6:10 PM - 6:50 PM GoPro's Spark 3 Upgrade Journey, Tudor Mihordea (https://www.linkedin.com/in/tudor-mihordea-16649469/)

Spark 3 was released in June 2020 with the promise of significant improvements in performance and ML features. This generated a lot of interest at GoPro in upgrading as soon as possible. Since we use Databricks for Spark cluster management, it looked like it should be an easy upgrade. Moreover, since we do not have a single big cluster but many smaller ones that we start on demand, it seemed we could do the upgrade incrementally. When we started to dig into what actually needed to be done, we found things were a bit more complicated, mainly because we needed to upgrade our metastore. The presentation will focus on the steps we went through for the upgrade, as well as the lessons learned along the way.

Tudor has been with GoPro's Data Science Engineering team for the last 2 years, mainly in charge of the Spark infrastructure and Spark data pipelines. He has more than 5 years of experience using big data technologies and is a big fan of functional programming and the Scala programming language.

6:50 PM - 7:00 PM break

7:00 PM - 7:40 PM Creating a scalable & cost efficient BI infrastructure for a startup in the AWS cloud, Marius Costin (https://www.linkedin.com/in/mariuscostin/)

How we created an efficient BI solution that can easily be used by a startup, using the AWS cloud environment. Using Python, we can easily import, process, and store data in Amazon S3 from different data sources, including RabbitMQ, BigQuery, MySQL, etc. From there, taking advantage of the power of Dremio as a query engine and the scalability of S3, you can quickly create beautiful dashboards in Tableau to kickstart a data journey in a startup.

About Marius: I joined eMAG in 2016 as a DWH Developer. In 2017 we decided that it was time to revamp eMAG’s DWH, so I accepted the position of DWH Architect, which I held until May 2021. In May, a new opportunity and challenge presented itself: creating an AWS-based BI infrastructure for a startup in eMAG’s portfolio.

7:40 PM - 8:00 PM Q&A and closing

See you all soon virtually,
Valentina

Bucharest Big Data Meetup