PyData 2022 October Edition


Details
Welcome to the October PyData Berlin edition!
For everybody to feel safer, we recommend you test yourself against COVID-19 before coming to the event. A self-test or a rapid antigen test would suffice. And please refrain from coming to the event if you feel unwell.
Doors open at 18:45 and pizza will be served at 19:00. For security reasons, GetYourGuide will only be able to welcome people until 19:30. If you arrive after 19:30, you will not be able to join the meetup. Please be on time!
***
Talks:
Pieter Luitjens
Deploying Transformers at Scale: Addressing Challenges and Increasing Performance
Transformer networks have taken the NLP world by storm, powering everything from sentiment analysis to chatbots. However, the sheer size of these networks presents new challenges for deployment, such as how to provide acceptable latency and unit economics. The de-identification tasks that Private AI handles rely heavily on Transformer networks and involve processing large amounts of data. In this talk, I will go over the challenges we faced and how we managed to improve the latency and throughput of our Transformer networks, allowing our system to process terabytes of data easily and cost-effectively.
Bio
Pieter Luitjens has a Bachelor of Science in Physics and Mathematics and a Bachelor of Engineering from the University of Western Australia, as well as a Master's from the University of Toronto. He worked on software for Mercedes-Benz and developed the first deep-learning algorithms for traffic sign recognition deployed in cars made by one of the most prestigious car manufacturers in the world. He has over 10 years of engineering experience, with code deployed in multi-billion-dollar industrial projects. Pieter specializes in ML edge deployment and model optimization for resource-constrained environments.
Break
Ismail Benbrahim
Causal Inference on time series - making sure you can trust your model
Quantitatively assessing the impact of a marketing campaign or a new product launch on a company's performance is not straightforward, as many parameters can influence the target metric (e.g. tickets sold, new customers acquired).
Libraries such as causalImpact are very helpful in such use cases. However, making sure their assumptions hold can be treacherous, and getting it wrong can, in the worst case, lead to bad business decisions. That is why, within Flix, we have built and are continuously improving a framework that provides data teams with best practices and tools to build trustworthy recommendations.
Bio
Ismail Benbrahim is a data scientist at Flix, where he works in the data science incubation team. He supports product and business teams in adding value to Flix by developing machine-learning-enabled solutions. Outside of work, he enjoys bouldering and reading sci-fi novels.
***
NumFOCUS Code of Conduct
THE SHORT VERSION
Be kind to others. Do not insult or put down others. Behave professionally. Remember that harassment and sexist, racist, or exclusionary jokes are not appropriate for NumFOCUS.
All communication should be appropriate for a professional audience including people of many different backgrounds. Sexual language and imagery are not appropriate.
NumFOCUS is dedicated to providing a harassment-free community for everyone, regardless of gender, sexual orientation, gender identity and expression, disability, physical appearance, body size, race, or religion. We do not tolerate harassment of community members in any form.
Thank you for helping make this a welcoming, friendly community for all.
If you haven't yet, please read the detailed version here: https://numfocus.org/code-of-conduct
***
SPONSORS:
GetYourGuide is the best place to unlock unforgettable travel experiences all over the world. See our open positions on careers.getyourguide.com or get a behind-the-scenes look at life at GetYourGuide on our blog: inside.getyourguide.com.