Python is quickly becoming the go-to language for data analysis. With so many tools available, getting started can be overwhelming for those who are new to analyzing data in Python. In this presentation, I’ll discuss several of the best tools for working with data, how to structure a data analysis workflow, and which tools are appropriate for handling different kinds of data. You’ll leave with a good understanding of different data analysis techniques in Python and some ideas to try on your own.
I’ll show you examples of each of the following:
* Data preprocessing
* Using scikit-learn for machine learning
* Using the Natural Language Toolkit for natural language processing
* Running MapReduce jobs with MRjob
* Visualizing our results with matplotlib