40th meetup



NOTE: a valid photo ID is required



Note: Please use your full real name when signing up; otherwise you may be refused entry!

As always, there'll be free beer and pizza, generously provided by our host AHL.

We are still experimenting with issuing tickets via a lottery. If you want to be in with a chance of a place, sign up for the waitlist! The lottery will be run approximately one week before the meetup. Closer to the event, we will either re-run the lottery or use the waitlist to fill any spaces that free up.


Alexey Simonov on Programming a self driving car using Python

We will see how anyone with a suitable vehicle, a drive-by-wire kit and cameras would go about programming a real self-driving car using Python, ROS and TensorFlow. It is an easy task to make the car follow a set of waypoints, detect traffic lights and accelerate or stop as required. This is an implementation of the capstone project on Udacity’s Self-Driving Car Engineer Nanodegree.


Víctor Zabalza on Lens: Data exploration with Dask and Jupyter widgets

The first step in any data-intensive project is understanding the available data. To this end, data scientists spend a significant part of their time carrying out data quality assessments and data exploration. In spite of this being a crucial step, it usually requires repeating a series of menial tasks before the data scientist gains an understanding of the dataset and can progress to the next steps in the project.

In this talk I will present Lens (https://github.com/asidatascience/lens), a Python package which automates this drudge work, enables efficient data exploration, and kickstarts data science projects. A summary is generated for each dataset, including:
- General information about the dataset, including data quality of each of the columns;
- Distribution of each of the columns through statistics and plots (histogram, CDF, KDE), optionally grouped by other categorical variables;
- 2D distribution between pairs of columns;
- Correlation coefficient matrix for all numerical columns.
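
To give a feel for the kind of summary listed above, here is a minimal sketch in plain Python. It is not Lens's API or implementation: the function names (`summarise_column`, `pearson`, `correlation_matrix`) and the sample table are made up for illustration, and it assumes a dataset is simply a dict mapping column names to lists, with `None` marking a missing value.

```python
import math
from statistics import mean, median


def is_numeric(values):
    """True if the column has at least one value and all present values are numbers."""
    present = [v for v in values if v is not None]
    return bool(present) and all(isinstance(v, (int, float)) for v in present)


def summarise_column(values):
    """Basic data-quality and distribution statistics for one column."""
    present = [v for v in values if v is not None]
    summary = {
        "count": len(values),
        "missing": len(values) - len(present),
        "unique": len(set(present)),
    }
    if is_numeric(values):
        summary.update(
            min=min(present),
            max=max(present),
            mean=mean(present),
            median=median(present),
        )
    return summary


def pearson(xs, ys):
    """Pearson correlation over rows where both values are present."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    if len(pairs) < 2:
        return float("nan")
    mx = mean([x for x, _ in pairs])
    my = mean([y for _, y in pairs])
    cov = sum((x - mx) * (y - my) for x, y in pairs)
    sx = math.sqrt(sum((x - mx) ** 2 for x, _ in pairs))
    sy = math.sqrt(sum((y - my) ** 2 for _, y in pairs))
    if sx == 0 or sy == 0:
        return float("nan")  # correlation is undefined for a constant column
    return cov / (sx * sy)


def correlation_matrix(table):
    """Nested dict of pairwise correlations between all numerical columns."""
    numeric = [name for name, col in table.items() if is_numeric(col)]
    return {a: {b: pearson(table[a], table[b]) for b in numeric} for a in numeric}


# Hypothetical sample data, for illustration only.
table = {
    "height": [1.0, 2.0, 3.0, 4.0],
    "weight": [2.0, 4.0, 6.0, 8.0],
    "city": ["a", "b", None, "a"],
}

if __name__ == "__main__":
    for name, column in table.items():
        print(name, summarise_column(column))
    print(correlation_matrix(table))
```

A real tool would compute these per-column and per-pair pieces in parallel and add plots; the sketch only shows the shape of the output.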

Building this tool has provided a unique view into the full Python data stack, from the parallelised analysis of a dataframe within a Dask custom execution graph, to the interactive visualisation with Jupyter widgets and Plotly. During the talk, I will also introduce how Dask works, and demonstrate how to migrate data pipelines to take advantage of its scalable capabilities.


A lightning talk will be given by Gary Collier (Man AHL).


Doors open at 6.30 pm (get there early as you have to sign in via AHL's security), talks start at 7 pm, beers from 9 pm in the bar. We normally have more than 200 folks in the room, so there are plenty of people to discuss data science questions with!

Please unRSVP in good time if you realise you can't make it. We're limited by building security on the number of attendees, so please free up your place for your fellow community members!

Follow @pydatalondon (https://twitter.com/pydatalondon) for updates and early announcements. See you on the 5th!