Data Engineering for Data Scientists

Name: Data Engineering for Data Scientists
Start: 2022-11-16T19:30:00-05:00
End: 2022-11-16T21:00:00-05:00
Location: Code & Supply Coworking

Hosted by Colin D. and 2 others

PyData Pittsburgh

Details

Please join PyData Pittsburgh for the presentation Data Engineering for Data Scientists by Pete Fein!

In this fast-paced talk, you’ll learn how adopting data engineering best practices and tools can improve your data science projects and empower you to deliver better, more reliable results in record time. We’ll discuss data architecture and design principles and explore open source tools you can use today, including:

Run Jupyter notebooks in production using Papermill and nbdev
Improve data quality with Great Expectations and monitor models with Evidently.ai
Write unit tests for your pandas and Spark DataFrames with pandera
Write reusable SQL with dbt, an exciting new tool for data transformation that’s transforming data teams
Orchestrate workflows with Apache Airflow, a better approach than fragile and frustrating cron jobs or Lambdas
Version control your data alongside your code with DVC
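
As a rough illustration of the orchestration idea in the list above, here is a minimal sketch of running tasks in dependency order, which is the core thing a scheduler adds over independent cron jobs. This is not Airflow's API: it uses only Python's standard-library graphlib, and the task names (extract, transform, load) and the run_pipeline helper are hypothetical, chosen for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def run_pipeline(tasks, dependencies):
    """Run each task only after the tasks it depends on; return the run order.

    tasks: dict mapping a task name to a zero-argument callable.
    dependencies: dict mapping a task name to the set of tasks it depends on.
    """
    order = []
    for name in TopologicalSorter(dependencies).static_order():
        tasks[name]()
        order.append(name)
    return order


# Hypothetical three-step pipeline; each task just records that it ran.
log = []
tasks = {
    "extract": lambda: log.append("extracted"),
    "transform": lambda: log.append("transformed"),
    "load": lambda: log.append("loaded"),
}
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
}

order = run_pipeline(tasks, dependencies)
print(order)
```

With separate cron jobs, "transform" would start at its scheduled time whether or not "extract" had finished. Declaring the dependencies explicitly lets the runner decide the order, and a real orchestrator builds retries, scheduling, and monitoring on top of the same idea.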

Special thanks to Code & Supply for hosting us!

Attendees are welcome to use the parking lot associated with the building off St Clair Street. The front door on Friendship Avenue will be open but is stairs-only. There's an elevator by the parking lot entrance. Head to the third floor and look for signs pointing to the presentation room, where the event will be held. All doors should be unlocked and open, so you're welcome to come right in!

Sponsors

NumFOCUS