Pandas: documentation and bokeh (continued)

Hosted by Pav A.
Details

According to a recent blog post by Stack Overflow (https://stackoverflow.com/), Python is the fastest-growing major programming language, and 10% of the credit for that growth is attributed to the pandas (http://pandas.pydata.org/) library.

In this sprint we'll have two different groups:

Beginners: We will improve pandas documentation

Gitter: https://gitter.im/py-sprints/pandas-doc

The idea is to improve the API documentation. So we will transform a page like:

http://pandas.pydata.org/pandas-docs/version/0.20.3/generated/pandas.DataFrame.reset_index.html

to a page like:

https://pandas-docs.github.io/pandas-docs-travis/generated/pandas.DataFrame.reset_index.html

More information on how to contribute to Pandas documentation can be found here:

https://pandas.pydata.org/pandas-docs/stable/contributing.html#contributing-to-the-documentation
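As a rough illustration of the kind of docstring this work aims for, here is a minimal sketch in the numpydoc layout that pandas docstrings follow. The function `clip_values` is hypothetical and is not part of pandas; it only shows the sections (summary, Parameters, Returns, Examples) that an improved API page would be built from.

```python
def clip_values(values, lower, upper):
    """
    Limit each value to the closed interval [lower, upper].

    Parameters
    ----------
    values : list of float
        Numbers to clip.
    lower : float
        Smallest value allowed in the result.
    upper : float
        Largest value allowed in the result.

    Returns
    -------
    list of float
        A new list with the same length as ``values``.

    Examples
    --------
    >>> clip_values([1, 5, 10], lower=2, upper=8)
    [2, 5, 8]
    """
    # Raise values below `lower`, then cap values above `upper`.
    return [min(max(v, lower), upper) for v in values]


if __name__ == "__main__":
    # Run the example in the docstring as a test.
    import doctest
    doctest.testmod()
```

The Examples section doubles as a doctest, so the documented output can be checked automatically.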

Intermediate / advanced: We will continue implementing Bokeh as a plotting backend for pandas. This is described next.

Gitter: https://gitter.im/py-sprints/pandas-bokeh

One of the popular features of pandas is that it can directly plot the data it contains (in a Series or DataFrame). For example:

https://secure.meetupstatic.com/photos/event/b/7/a/0/600_464567008.jpeg
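In text form, built-in plotting might look like the minimal sketch below. The column names and values are made up for illustration, and matplotlib is assumed to be installed, since it is the library pandas draws with here.

```python
import pandas as pd

# Illustrative data: two numeric columns over four days.
df = pd.DataFrame(
    {"a": [1, 3, 2, 5], "b": [4, 2, 6, 3]},
    index=pd.date_range("2018-01-01", periods=4),
)

# DataFrame.plot draws one line per column and returns a matplotlib Axes.
ax = df.plot(title="Example data")

# In a notebook the chart renders inline; in a script,
# matplotlib.pyplot.show() would display it.
```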

When this feature was implemented, matplotlib (https://matplotlib.org/) was the standard plotting library in Python. Things have changed since then, and many good plotting libraries are now available. One of the most popular is Bokeh (https://bokeh.pydata.org/en/latest/), which generates interactive charts in the style of D3.js.

Plotting pandas data in Bokeh is quite straightforward:

https://secure.meetupstatic.com/photos/event/b/7/d/c/600_464567068.jpeg
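A minimal sketch of plotting a DataFrame with Bokeh directly, assuming Bokeh is installed. The data and labels are made up for illustration; the point is that the DataFrame has to be wrapped and the figure built by hand, rather than going through `.plot()`.

```python
import pandas as pd
from bokeh.models import ColumnDataSource
from bokeh.plotting import figure

# Illustrative data.
df = pd.DataFrame({"x": [1, 2, 3, 4], "y": [4, 2, 6, 3]})

# Bokeh reads columns by name from a ColumnDataSource built from the DataFrame.
source = ColumnDataSource(df)

p = figure(title="Example data", x_axis_label="x", y_axis_label="y")
renderer = p.line(x="x", y="y", source=source, line_width=2)

# bokeh.plotting.show(p) would open the interactive chart in a browser.
```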

It would be more efficient and consistent if pandas could be configured to use a different backend, such as Bokeh, so that the existing pandas plotting methods work with your favorite library. The result with Bokeh would be:

https://secure.meetupstatic.com/photos/event/b/8/2/9/600_464567145.jpeg

Pandas is already well prepared to be integrated with other backends, since all the matplotlib logic lives in a single plotting directory (https://github.com/pandas-dev/pandas/tree/master/pandas/plotting).

Some work still needs to be done: adding a setting to define the backend, and further decoupling the plotting logic.

In addition, a new package, pandas-bokeh, needs to be created so that it can be called from the pandas .plot() methods.
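The idea of a configurable backend can be sketched in plain Python as below. This is not pandas code and does not reflect an existing pandas API: `PlotDispatcher`, `matplotlib_stub` and `bokeh_stub` are hypothetical names, and the stubs return a description string instead of drawing anything, so the sketch runs without any plotting library.

```python
from typing import Callable, Dict, Sequence

# A backend is any callable that takes the data and the chart kind.
PlotFunc = Callable[[Sequence[float], str], str]


class PlotDispatcher:
    """Hypothetical stand-in for a 'plotting backend' setting."""

    def __init__(self, default: str) -> None:
        self._backends: Dict[str, PlotFunc] = {}
        self.backend = default

    def register(self, name: str, func: PlotFunc) -> None:
        # A separate package (for example pandas-bokeh) would register itself here.
        self._backends[name] = func

    def set_backend(self, name: str) -> None:
        if name not in self._backends:
            raise ValueError(f"unknown plotting backend: {name!r}")
        self.backend = name

    def plot(self, data: Sequence[float], kind: str = "line") -> str:
        # The caller always uses plot(); the active backend does the work.
        return self._backends[self.backend](data, kind)


def matplotlib_stub(data: Sequence[float], kind: str) -> str:
    return f"matplotlib: {kind} chart of {len(data)} points"


def bokeh_stub(data: Sequence[float], kind: str) -> str:
    return f"bokeh: {kind} chart of {len(data)} points"


dispatcher = PlotDispatcher(default="matplotlib")
dispatcher.register("matplotlib", matplotlib_stub)
dispatcher.register("bokeh", bokeh_stub)

print(dispatcher.plot([1, 2, 3]))
dispatcher.set_backend("bokeh")
print(dispatcher.plot([1, 2, 3], kind="bar"))
```

The calling code stays the same whichever backend is active, which is the consistency the sprint is after.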

In this sprint we will write this new module (which can later be added to pandas), and we will send pull requests for the required changes to pandas.

Our sponsor

https://www.touchsurgery.com/img/logo-colour.svg

Thanks to Touch Surgery (https://www.touchsurgery.com/jobs.html) for providing the venue, and the pizza and drinks for the night.

Set up instructions:

  1. Get a pandas development repository

Fork the pandas repository by clicking the button in the top right at:

https://github.com/pandas-dev/pandas

After the fork completes, run the following in your terminal:

$ git clone https://github.com/<your-github-username>/pandas

$ cd pandas

$ python setup.py build_ext --inplace

  2. Download and install Anaconda from:

https://www.anaconda.com/download/

After restarting the terminal, run:

$ conda config --add channels conda-forge

$ conda create -n pandas_dev --file ci/requirements_dev.txt

$ source activate pandas_dev
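Once the environment is active, a quick check like the sketch below can show which pandas installation Python actually picks up. The helper `describe_install` is a hypothetical name, not part of pandas or conda; when run from the cloned repository, the reported path should point into that directory if the build succeeded.

```python
import importlib
from typing import Optional, Tuple


def describe_install(module_name: str) -> Optional[Tuple[str, str]]:
    """Return (version, location) for a module, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    version = getattr(module, "__version__", "unknown")
    location = getattr(module, "__file__", None) or "built-in"
    return str(version), str(location)


info = describe_install("pandas")
if info is None:
    print("pandas is not importable from this environment")
else:
    print("pandas %s loaded from %s" % info)
```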

London Python Sprints
Touch Surgery
230 City Rd, EC1V 1JT · London