Macedonian Data Scientists, it's a great pleasure to announce our third Data Science meetup. It will be held on the 12th of September from 18:00 to 21:00 at Cineplexx, Conference Hall 5.
This meetup will feature three interesting talks on different data science topics.
Data engineering showcase: delivering clean and reliable data - Ondrej Vesely and Jakub Kohut (kiwi.com)
Three years ago our data pipeline consisted of one developer manually querying various production databases in order to create a simple report from a JSON hell. Now we have 40 analysts and data scientists who expect clear, consistent data delivered in near real time to feed their models. We’ll talk about the combination of open source and cloud technologies we have chosen to process data in batch and streaming modes, and present what we have learned along the way. Namely: Apache Airflow, asyncio, Redis, Protocol Buffers, Google Cloud Dataflow and Google BigQuery. We’ll also talk about the sadness & despair we felt during the process.
Ondrej handles data engineering for analytical, business intelligence and machine learning purposes. This usually means building microservices for streaming and real-time analytics (stuff like asyncio, Kafka, Redis, Pub/Sub, protobuf) and batch processing (PostgreSQL, Python) across Amazon and Google cloud. Ondrej also teaches Python for Czechitas and gladly shares his technical know-how from the FlowerChecker and Plant.id services.
Jakub works as a BI Developer and Data Engineer at Kiwi.com. He entered the world of data engineering relatively recently but is very enthusiastic and eager to explore everything it has to offer. He has experience in ETL design using technologies like PostgreSQL, Apache Airflow, Redis and SQL Server, just to name a few. He’s very excited to share his knowledge with the audience and meet interesting people on his journey through the data world.
Automated Machine Learning for Time-Series predictions - Darko Matovski (causaLens)
Predicting the future has enormous business value. Measurements of economic activity are primarily represented as time-series data. Building predictive models from this type of data is difficult and time-consuming because of its temporal characteristics.
There are situations in which a large number of adaptable models need to be built. However, current approaches are unsustainable and inadequate for this. In this talk we will present how the technology we are developing at causaLens is changing that.
Darko is the CEO of causaLens. He has also worked for cutting-edge hedge funds and research institutions, including Man Group in London and the National Physical Laboratory in London (where Alan Turing worked). Darko has a PhD in Machine Learning and an MBA.
Processing Heterogeneous Data Streams using Dynamic State Configuration - Gjorgji Madjarov (Elevate-Global)
A data stream is an endless, continuous flow of data generated by thousands of connected devices, IoT sensors, and other data generators. Data stream processing enables users to filter, aggregate, and cleanse the data in flow, within a short time of receiving it. The key strength of stream processing is that it can provide insights faster, often in near real time. This talk outlines a newly developed platform for stateful stream processing of heterogeneous data, using dynamic configuration for real-time analytics and event-driven applications.
Gjorgji is the CTO of Elevate-Global and an Associate Professor at the Faculty of Computer Science and Engineering, “St. Cyril and Methodius” University in Skopje. He is currently working on real-time data stream processing, learning from multiple heterogeneous information sources, anomaly detection, and time-series prediction and analysis.
Free snacks and drinks, provided by kiwi.com!