PyData Berlin 2023 February Edition


Details
Welcome to the February edition of the PyData Berlin meetup!
For everybody to feel safer, we recommend that you test yourself for COVID-19 before coming to the event. A self-test or a rapid antigen test will suffice. Please refrain from coming to the event if you feel unwell.
Doors open at 18:45 and pizza will be served at 19:00. For security reasons, GetYourGuide will only be able to welcome people until 19:30. If you arrive after 19:30, you will not be able to join the meetup. Please be on time!
***
Talks:
Andrey Cheptsov: Bringing GitOps to ML experiments with dstack
GitOps is an emerging DevOps practice that helps teams ship software faster by applying version control and collaborative workflows. In ML, GitOps is still at the trial stage and hasn’t been adopted at scale yet.
Andrey Cheptsov, the creator of dstack, will show how to use this open-source tool to bring collaborative software development practices to ML experiments.
Speaker bio:
Andrey Cheptsov is the creator of dstack. He is passionate about open-source and developer tools. Previously, Andrey worked at JetBrains with the PyCharm team.
Talk: Using Graph Neural Networks to predict traffic accidents
In this talk, we will present our project on estimating the frequency of traffic accidents in Berlin. We compiled historical accident data from the Statistisches Bundesamt, together with urban area usage and population density from Berlin Open Data, and integrated them on the street network of Berlin, which we obtained from OpenStreetMap (OSM) and represent as a graph. We then built a Message Passing Neural Network (MPNN) model using PyTorch Geometric and trained it on neighborhoods extracted from the graph, so that the model can learn the relationships between accident frequencies and OSM features such as speed limits and road type, external features such as population density or surrounding area use, and the intrinsic network features. Combining neural networks with a graph model enables automating feature selection, capturing non-linear relationships, and extrapolating these across similar neighborhoods. We will conclude by presenting our Berlin-wide predictions in comparison to the observed accidents, and a simple web app that predicts accident frequencies along the shortest path between two points in Berlin.
Speaker bios:
Anand Deshpande: Data Scientist with a strong background in mathematics. Combines a solid command of complex mathematical ideas with programming skills to solve intriguing problems in data science.
Jan Meyer: Data Scientist, keen on modeling human behaviour, with 5 years of industry experience. Background in psychology and behavioral economics.
Onur Kerimoglu: Data Scientist with a passion for coding and a background in biogeochemical modelling research.
***
NumFOCUS Code of Conduct
THE SHORT VERSION
Be kind to others. Do not insult or put down others. Behave professionally. Remember that harassment and sexist, racist, or exclusionary jokes are not appropriate for NumFOCUS.
All communication should be appropriate for a professional audience including people of many different backgrounds. Sexual language and imagery are not appropriate.
NumFOCUS is dedicated to providing a harassment-free community for everyone, regardless of gender, sexual orientation, gender identity and expression, disability, physical appearance, body size, race, or religion. We do not tolerate harassment of community members in any form.
Thank you for helping make this a welcoming, friendly community for all.
If you haven't yet, please read the detailed version here: https://numfocus.org/code-of-conduct
***
SPONSORS:
GetYourGuide is the best place to unlock unforgettable travel experiences all over the world. See our open positions on careers.getyourguide.com or get a behind-the-scenes look at life at GetYourGuide on our blog: inside.getyourguide.com.
