Welcome to 2021, PyData Boston-Cambridge! For our first event of the year, some of our own core organizers will walk through some fascinating topics.

AGENDA

7pm | Opening Remarks + Announcements

7:10pm | Cesar Hernandez, Geo data pipeline processing for crowdfree.me

7:55pm | Andrew Therriault, Predicting History: Behind the Scenes of the Bloomberg News 2020 Voter Turnout Forecasts

------
TALK DETAILS
Geo data pipeline processing for crowdfree.me

Crowdfree.me is an app created to help people avoid crowds during COVID-19. You can search for a place and see the relative density of people at different times and days of the week. The app was developed by Tripadvisor in collaboration with multiple companies that provided cloud computing, data, and applications. In this talk, we will discuss how we took 10 TB of data per day and transformed it using PySpark (EMR) and AWS Athena, with full orchestration in Airflow.

2nd TALK
Predicting History: Behind the Scenes of the Bloomberg News 2020 Voter Turnout Forecasts

The coronavirus pandemic has turned the world upside down in countless ways. One you almost certainly never thought of: how the media covers election night vote counting. The traditional “precincts reported” metric doesn’t make sense when most votes are cast before election day, so instead, Bloomberg News brought on Andrew Therriault to build machine learning models to estimate how many votes to expect. In this talk, Therriault walks through the mechanics of making these models and the challenges of forecasting a historic event with no good precedent to compare to. He’ll also discuss how his models incorporated uncertainty into their forecasts, how they did on election night, and how they could be improved in the future.

SPEAKER BIOS

Cesar Hernandez, ML Engineer at Tripadvisor
Cesar has worked on multiple projects from data pipelines to ML development. Cesar has worked in multiple industries, starting at Deloitte in 2012, then Vistaprint and now Tripadvisor. Cesar is passionate about data.

Andrew Therriault, Data Scientist and Founder at Civin
Andrew has spent the past decade building and leading data science programs for organizations across the political, government, and technology sectors. He co-founded and leads Civin, a civic tech consultancy that works with technology companies and public sector organizations, and he holds a doctorate in political science from New York University. Therriault’s previous career highlights include serving as the City of Boston’s first Chief Data Officer, launching the Democratic National Committee's data science team, and teaching graduate courses in data science at Harvard and Northeastern.
