Online event - hosted on Zoom platform and streamed on YouTube.
Talk: MATS stack for machine learning
Abstract: At Avast we complete over 17 million phishing detections a day, providing crucial online protection against this type of attack. Data scientists frequently face many challenges along the way, such as inconsistent environments, the transition from research to production, access management, and model deployment.
In this talk, Joao Da Silva and Yury Kasimov will present the MATS stack for productionizing machine learning and their journey of integrating model tracking, storage, cross-system orchestration, and E2E model deployments into complete, modern machine learning pipelines. They will discuss how they scale machine learning operations at Avast using popular technologies such as MLflow, Airflow, TensorFlow, and Spark (MATS).
They will show the tooling they developed to improve Avast's MLOps maturity level. It binds MATS into a declarative interface that abstracts infrastructure away from the researcher and produces consistent environments that can be executed on the researcher's laptop, on-prem, or on cloud clusters. They will also show how they automate deployments.
Yury Kasimov
Data engineer at Avast with a focus on machine learning pipelines and with experience in network security and ML research.
Joao Da Silva
Lead Data Engineer at Avast and long-time Scala developer with a strong focus on big data and machine learning pipelines.
Talk: Feature store as a part of modern data lakehouse
Abstract: Over years of experience, DataSentics, like many other companies working with big data, has come to appreciate structure within its data platforms. We therefore matured from a data lake to a data lakehouse architecture, with a feature store as the cherry on top. Seemingly a minor component, the feature store covers a whole ecosystem of tools and practices, from a storage layer to an SDK, an API, and a web interface. In a nutshell, the feature store aims to increase the reusability of features, automate their computation, backfills, and logging, enable feature versioning and lineage, and much more. At the end of the day, though, its ultimate purpose is to let data serve real-world analytical and business use cases, systematically and efficiently.
In this talk, DataSentics wants to share its experience of building a feature store layer on top of a modern data lakehouse from scratch. We will share how we approached the challenge, the technical decisions we made along the way, as well as demands we face with our
Jiri leads development of the daipe.ai platform, which aims to simplify the productionalization of data science notebooks. Before that, he worked with various enterprise clients on both data engineering and machine learning projects.
Machine Learning Meetups (MLMU) is an independent platform for people interested in Machine Learning, Information Retrieval, Natural Language Processing, Computer Vision, Pattern Recognition, Data Journalism, Artificial Intelligence, Agent Systems, and all related topics. MLMU is a regular community meeting, usually consisting of a talk, a discussion, and subsequent networking. Besides Prague, MLMU has also spread to Brno, Bratislava, and Košice.