PyData London - 46th Meetup


NOTE: A valid photo ID is required by building security. Please use your full real names when signing up, otherwise you may be refused entry!


As always, there'll be free food & drinks, generously provided by our host, AHL.

We are issuing tickets via a lottery: if you want a chance of a place, sign up for the waitlist! The lottery will be run approximately one week before the meetup. Closer to the event, we will fill any spaces that free up either by re-running the lottery or by drawing from the waitlist.


Main Talks:

Jason Byrne on "Digital Image Processing: Introduction and Applications":

Digital image processing may be considered as a set of algorithmic tools to derive information from digital images using a computer's processing power. It is an extension of signal processing, where the signal is generally a sequence of numbers that represent samples of a continuous variable in a domain such as time, space or frequency (e.g. econometrics, seismology, audio & speech recognition). An image is typically represented as a two-dimensional grid of pixels, which is essentially a matrix or array of values indicating brightness. For a series of images we can consider the signal as three-dimensional, with the third dimension being time in the case of video frames. Applications are numerous: for example, computer & machine vision in industrial robots and autonomous cars; medical imaging such as ultrasound and MRI; and astronomy, to study satellite and telescope data. Indeed, many people will have some familiarity with its application in software like Photoshop and in the facial recognition and image filters of social media like Facebook, Instagram and Snapchat. Here at Royal Mail, an optical character recognition system is relied upon to read the addresses on the vast number of mail items passing through our network.

In this talk I will give an introduction to the field of image processing and demonstrate some example applications, drawing upon my experiences in employing such techniques during my academic and data science projects to date. This should be interesting and useful to any researcher, data scientist or developer who has had limited exposure to signal or image processing techniques, especially if they would like to pursue further learning in the fields of computer and machine vision.
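
For anyone new to the idea of an image as a matrix of brightness values, here is a minimal sketch in plain Python. It is not taken from the talk: the pixel values are made up, and the two helper functions (`threshold` and `box_blur`) are hypothetical names chosen for illustration. Real projects would normally use a library rather than nested lists.

```python
# A tiny 4x4 grayscale "image": each value is a pixel brightness in 0..255.
# The values are invented purely for illustration.
image = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]


def threshold(img, cutoff):
    """Return a binary image: 1 where a pixel is brighter than cutoff, else 0."""
    return [[1 if px > cutoff else 0 for px in row] for row in img]


def box_blur(img):
    """Replace each pixel with the mean of its 3x3 neighbourhood.

    Neighbourhoods are clipped at the image border, so edge pixels
    average over fewer values.
    """
    height, width = len(img), len(img[0])
    out = []
    for r in range(height):
        row = []
        for c in range(width):
            neighbours = [
                img[rr][cc]
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, c - 1), min(width, c + 2))
            ]
            row.append(sum(neighbours) / len(neighbours))
        out.append(row)
    return out


binary = threshold(image, 128)   # separates the dark half from the bright half
blurred = box_blur(image)        # softens the sharp vertical edge
```

Thresholding and smoothing are two of the simplest operations in the field; both treat the image as nothing more than an array of numbers.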


Michael Craig, Lawrence Phillips and Vid Stojevic on "Machine Learning on molecular data":

At GTN we are combining ideas from quantum physics and chemistry with machine learning to aid the process of discovering new medicines. In this talk I will discuss the challenges of applying machine learning to molecular datasets. Issues of data representation are starkly different from those for, say, image- or text-based data, and I will describe various ways to represent molecules, starting with simple representations such as SMILES strings and chemical fingerprints, and going on to more advanced graph-based and quantum mechanical representations. In going beyond text- or matrix-based representations to graphs, standard convolutional or recurrent networks are no longer appropriate, and recently developed architectures specifically tailored for graph data need to be utilised. I will describe recent advances in so-called "graph convolutional networks" that have generated best-in-class results on chemistry datasets, and will demo how one can curate molecular datasets from public sources, most prominently ChEMBL, and run advanced ML algorithms on them using publicly available Python libraries such as RDKit and DeepChem.
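
To give a flavour of what a graph-based representation looks like, here is a minimal sketch in plain Python. It is an illustration only, not the speakers' method and not the RDKit or DeepChem API: the molecule (ethanol, SMILES `CCO`) is written out by hand, the element vocabulary and the function names are hypothetical, and the "convolution" is reduced to a single neighbourhood-averaging step with no learned weights or non-linearity.

```python
# Ethanol, SMILES "CCO": heavy atoms only, hydrogens left implicit.
atoms = ["C", "C", "O"]
bonds = [(0, 1), (1, 2)]  # undirected bonds between atom indices

# One-hot node features over a small, hypothetical element vocabulary.
ELEMENTS = ["C", "N", "O"]
features = [[1.0 if atom == el else 0.0 for el in ELEMENTS] for atom in atoms]


def build_adjacency(n_atoms, bonds):
    """Return a dict mapping each atom index to the indices it is bonded to."""
    adj = {i: [] for i in range(n_atoms)}
    for i, j in bonds:
        adj[i].append(j)
        adj[j].append(i)
    return adj


def aggregate(features, adj):
    """One simplified message-passing step.

    Each atom's new feature vector is the mean of its own vector and
    those of its bonded neighbours. A real graph convolutional layer
    would also apply learned weights and a non-linearity.
    """
    out = []
    for i, own in enumerate(features):
        group = [own] + [features[j] for j in adj[i]]
        out.append([sum(col) / len(group) for col in zip(*group)])
    return out


adj = build_adjacency(len(atoms), bonds)
updated = aggregate(features, adj)
# After one step, the middle carbon's vector mixes in its oxygen neighbour.
```

The point of the sketch is that information flows along bonds rather than across a fixed grid, which is why architectures built for images or sequences do not transfer directly.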


Lightning Talks:

Raphael Holca on "Mimica"

We are building software that automates computer-based work simply by observing it. Our algorithms record clicks and keystrokes over a few weeks, process this data using ML/AI, and then generate automations for parts of the work.

Peter Bleackley on "Is it a mushroom or is it a toadstool?"

Using the UCI machine learning Mushroom Classification dataset from Kaggle, I demonstrate a simple Bayesian Belief Network that predicts whether fungi are edible or toxic.



Doors open at 6.30 pm (get there early as you have to sign in via AHL's security), talks start at 7 pm, drinks from 9 pm in the bar. We normally have >200 folks in the room, so there are plenty of people to discuss data science questions with!

Please unRSVP in good time if you realize you can't make it. We're limited by building security on the number of attendees, so please free up your place for your fellow community members!

Follow @pydatalondon for updates and early announcements. See you on the 3rd!