**IMPORTANT**
This event is in person at the Georgetown University School of Continuing Studies auditorium in the Mt. Vernon Square area of Washington, DC. Space is limited. You must register on Meetup at least two days in advance to attend, because we need to provide the venue with a list of attendees by then. Please ensure your Meetup username contains your first and last name.
Please register only if you are able to attend in person, and cancel your registration a few days before the event if your plans change.
The talk will kick off at 7 pm ET.
Description
The “science” part of data science requires us to reproduce and understand the results of our models. The problem is that our training data (often stored in relational databases) are fundamentally disconnected from the models themselves. How can we diagnose and analyze our models if our instances can be modified or deleted from the underlying database at any time? When we deploy models to production, the problem gets even more complex, as the models start generating new instances and feedback signals. Data scientists and engineers must monitor how models are used in production and ensure that they remain as up to date and as predictive as possible. The rise of tools like feature stores, vector databases, model registries, and eventing platforms is evidence that the way we've been doing things is changing.
It’s no wonder that MLOps is seen primarily as an engineering discipline rather than part of data science, and that disconnect means that fewer and fewer of our models are actually making it into production. As data scientists, we must shift our perspective away from static data sets and towards real-time analytics. In this talk, we’ll explore how event streams can influence reproducibility in model training, how asynchronous inferencing can improve the rate at which we get models into production, and how real-time data can lead to more insights and better models.
Bio
Dr. Benjamin Bengfort is the co-founder and CEO of Rotational Labs. Benjamin is an experienced data scientist, systems engineer, and open-source programmer; he is one of the creators and maintainers of the Yellowbrick machine learning diagnostics Python package and the author of several O'Reilly data science books, including Data Analytics with Hadoop and Applied Text Analysis with Python. Driven by a desire to build large systems with many users that have a global impact, he takes pride in solutions where many small interactions combine to create complex dynamics. At Rotational Labs, he aims to apply advanced distributed computing, networking, open-source software, education, and machine learning to solutions that allow us to collaborate more effectively worldwide to solve big problems.