In this talk, we will cover three aspects of an Entity Resolution (ER) project done in partnership with Anidata and the Fulton County District Attorney’s Office to combat human trafficking. We will start by demonstrating how we collected data using a Python web scraper behind a Tor proxy. Then we will discuss the data science used to perform ER via unsupervised network clustering. Finally, we will conclude by demonstrating the power of Luigi as a scheduling tool for automating jobs.
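For readers new to ER, here is a minimal sketch of one simple form of network clustering: treat each scraped record as a node, link records that share an identifier, and call each connected component one entity. This is an illustration under stated assumptions, not necessarily the method presented in the talk; the record ids, the identifier strings, and the resolve_entities helper are all hypothetical.

```python
from collections import defaultdict


def resolve_entities(records):
    """Group record ids into entities.

    `records` is a list of (record_id, set_of_identifiers) pairs.
    Two records are linked when they share any identifier; each
    connected component of the resulting graph is one entity.
    """
    # Union-find structure: every record starts as its own root.
    parent = {rid: rid for rid, _ in records}

    def find(x):
        # Walk up to the root, compressing the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[max(ra, rb)] = min(ra, rb)

    # Link each record to the first record seen with the same identifier.
    first_seen = {}
    for rid, identifiers in records:
        for ident in identifiers:
            if ident in first_seen:
                union(rid, first_seen[ident])
            else:
                first_seen[ident] = rid

    # Collect records by root to form the clusters.
    clusters = defaultdict(set)
    for rid in parent:
        clusters[find(rid)].add(rid)
    return sorted(sorted(c) for c in clusters.values())


# Hypothetical records: made-up ids and identifiers, for illustration only.
records = [
    ("ad1", {"phone:555-0100"}),
    ("ad2", {"phone:555-0100", "email:a@example.com"}),
    ("ad3", {"email:a@example.com"}),
    ("ad4", {"phone:555-0199"}),
]
entities = resolve_entities(records)
print(entities)
```

In this toy input, ad1 and ad3 share no identifier directly but end up in the same entity because both are linked to ad2. That transitive linking is what makes the graph view useful for ER.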
The slides (https://docs.google.com/presentation/d/1r3SAci6PRdddPec606zLMXRONWDL9xjbQMWy7oDeEVU/pub?start=false&loop=false&delayms=60000) and code (https://github.com/gte620v/graph_entity_resolution/tree/master/201612_PyDataATL) for Bob's talk are available online.
Please make sure to also sign up here (https://generalassemb.ly/education/pydata-and-ga-present-dark-data-and-improving-human-rights-in-fulton-county/atlanta/32089) with our partner venue and drinks sponsor, General Assembly.
About our speaker:
Bob Baxley is the Chief Engineer at Bastille, where he helps build systems to sift through massive amounts of radio frequency data. He joined Bastille in 2014 shortly after the company was founded and leads the Data Science and Signal Processing teams.
Bob has more than 10 years of experience implementing machine learning systems with an emphasis on cognitive radios. Prior to joining Bastille, Bob was the Director of the Software Defined Radio Lab at the Georgia Tech Research Institute. In that role, Bob led GTRI’s team to second place out of 90 international competitors in the DARPA Spectrum Challenge.
In 2008, Bob earned his PhD in Electrical Engineering from Georgia Tech. During his graduate work, he was recognized with the Sigma Xi Best Thesis award, the CSIP PhD Research Award, and the NSF GRFP Award. He has co-authored over 90 peer-reviewed papers, is the inventor of 17 patents, and formerly served as an Associate Editor for Digital Signal Processing.
In addition to his role at Bastille, Bob is an Adjunct Faculty member in the School of ECE at Georgia Tech. He also volunteers as a Board Member of the data science non-profit organization, Anidata.
Doors will open at 7 for mixing, networking, and refreshments.
Food and drinks will be provided. The main talk will start at 7:30, followed by a town-hall-style Q&A session. We will then have several short lightning talks.
As always, please consider signing up for a lightning talk yourself: http://www.meetup.com/PyData-Atlanta/pages/20893440/Info_about_giving_a_lightning_talk/
Food is sponsored by IBM Data Science (http://datascience.ibm.com/)