Processing Large Data with Python Pandas

Name: Processing Large Data with Python Pandas
Start: 2019-08-27T18:00:00-04:00
End: 2019-08-27T20:00:00-04:00
Location: Smart Data

Hosted by Dayton Women Code Together

Details

Data sets can grow quickly. You can go from a few hundred rows and a handful of columns to a million rows and hundreds of columns. Python's pandas library is a great tool for handling and processing data. Pandas is fast, powerful, and flexible, and it does an amazing job of cleaning messy, real-world data. It can quickly parse data and help you make meaningful plots. But it was designed to handle data sets of less than about 100 MB.

So what do you do when you have a few gigabytes of real-world data that you need to explore on a laptop? This talk will show you how to reduce the amount of memory your data takes up by as much as 90%, so that you can read a multi-gigabyte CSV file on a laptop and process it with RAM to spare.
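
As a rough illustration of the kind of memory saving described above, here is a minimal sketch that shrinks a DataFrame by choosing smaller column types. It is not taken from the talk. The helper name shrink, the column names, and the sample data are all made up for this example, and the saving you see will depend on your own data and pandas version.

```python
import numpy as np
import pandas as pd


def shrink(df, max_unique_ratio=0.5):
    """Return a copy of df with smaller column types where possible.

    Hypothetical helper for illustration only, not a pandas function.
    """
    out = df.copy()
    for col in out.columns:
        kind = out[col].dtype.kind
        if kind == "i":
            # Use the smallest signed integer type that holds every value.
            out[col] = pd.to_numeric(out[col], downcast="integer")
        elif kind == "f":
            # Ask for float32; this roughly halves memory but keeps
            # fewer significant digits than float64.
            out[col] = pd.to_numeric(out[col], downcast="float")
        elif kind == "O":
            # Text columns with few distinct values store far more
            # compactly as categories.
            if out[col].nunique() <= max_unique_ratio * len(out):
                out[col] = out[col].astype("category")
    return out


# Made-up sample data: 100,000 rows, three columns.
n = 100_000
idx = np.arange(n, dtype=np.int64)
df = pd.DataFrame(
    {
        "sensor_id": idx % 100,             # small integers stored as int64
        "reading": (idx % 400) * 0.25,      # floats stored as float64
        "state": [("ok", "warn", "fail")[i % 3] for i in range(n)],
    }
)

small = shrink(df)

# deep=True also counts the memory used by the strings themselves.
before = int(df.memory_usage(deep=True).sum())
after = int(small.memory_usage(deep=True).sum())
print(f"before: {before:,} bytes")
print(f"after:  {after:,} bytes")
print(f"saved:  {1 - after / before:.0%}")
```

Once you know which types work for a file, you can pass them to pandas.read_csv through its dtype parameter, so the data is never loaded at the wider types in the first place.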

This interactive talk will be presented by Evelyn Boettcher. Please bring a laptop with Python (version 3.x) and pandas installed.

Python
https://www.python.org/downloads/release/python-374/

Pandas
https://pandas.pydata.org/

Follow along with Evelyn's talk:
https://github.com/DiDacTexGit/Talk-ProcessingLargeDatawithPandas
