Tue, Oct 28 · 6:00 PM EDT
PLEASE REGISTER ON LU.MA SO WE KNOW HOW MANY PEOPLE TO EXPECT: https://luma.com/ug69dcx6
Come see our awesome lightning speakers!
Speaker: Alan Feder
Title: Why It's Time to Ditch Pandas for Polars, Conda for UV, and Matplotlib for Altair
Abstract: For years, the pandas, conda, and matplotlib trio has been the undisputed foundation of Python data science. But is it still the best? This talk argues that the modern data landscape demands a modern stack. I will show in a head-to-head comparison how switching to Polars can dramatically accelerate your data manipulation with tidyverse-style syntax, how Altair's grammar can create visualizations with ggplot-style code, and how uv can solve your dependency nightmares with blazing-fast environment management.
Speaker: John Sobanski
Title: Building a Machine Learning Algorithm from Scratch
Abstract: John shares his experience building a machine learning algorithm from scratch. He outlines the core idea, walks through the math, logic, and implementations, and addresses challenges in feature engineering, hyperparameter tuning, and debugging odd behavior. He shows quick runs in Octave, R, Pandas, and Polars, pointing out lessons from each environment. His goal: spark new ways of thinking about the algorithms we use every day and what becomes clear when you work through the details yourself.
Speaker: Jonathan Conrad
Title: AI for Analytics: How to Integrate Gen AI into Your Next Data Project
Abstract: Gen AI is often touted as a miracle solution to data-related problems, but results frequently fall short of expectations. In this talk, we will discuss how to integrate commercial AI models into your data projects to drive superior results at a fraction of the cost. This talk centers on a recent project where we used OpenAI's GPT-4o mini model to scrape and classify data from a series of unstandardized documents. I will discuss the general architecture of a data pipeline that performs these operations, the labor and computation costs, and the best practices we discovered from scrutinizing and refining the process.
Speaker: Abigail Haddad
Title: Hot Claude Summer: My Three-Month Adventure With Claude Code
Abstract: I used to envy developers because they could build websites someone might want to use. But this summer, I was able to use Claude Code to layer data pipelines and web development onto my data science skills, and suddenly found myself debugging at 3 AM after realizing my pipeline was silently deleting data. I'll talk about how spending more time on fixes is currently the cost of using AI-assisted coding, how you can set up a real test/prod framework for your data pipelines for free via GitHub, and what I still do from scratch.