Past Meetup

PyData Meetup - May 2016 - Luigi + Scikit-Learn, and Airline Crash Prediction


140 people went

Details

Agenda

• 6:45pm - 7:00pm Networking

• 7:00pm - 7:55pm Presentation: Using Luigi and Scikit-Learn to create a Machine Learning Pipeline which trains a model and serves predictions through a REST API, by Atreya Biswas

Synopsis: A Machine Learning Pipeline can be broadly thought of as a sequence of tasks: Data Ingestion, Data Cleaning, Feature Extraction, Model Training, Hyperparameter Optimization, Model Evaluation and Model Deployment. Luigi is Spotify's open-source Python framework for batch data processing; it handles dependency resolution, workflow management, visualisation, failure handling and monitoring. Scikit-Learn is the most popular and widely used Machine Learning library in Python. We will demonstrate how Luigi and Scikit-Learn can be used to orchestrate these tasks into a cohesive Machine Learning Pipeline.
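
To make the orchestration idea concrete, here is a minimal standard-library sketch of the pattern the synopsis describes: each task declares what it requires and what file it produces, and a small runner resolves the dependencies. It is not the speaker's code and it does not use Luigi or Scikit-Learn; the `Task` base class, the `build` runner, the toy data and the threshold "model" are all hypothetical stand-ins chosen so the example runs anywhere.

```python
import json
import os
import tempfile


def write_json(path, obj):
    with open(path, "w") as fh:
        json.dump(obj, fh)


def read_json(path):
    with open(path) as fh:
        return json.load(fh)


class Task:
    """Hypothetical base class: a task names its inputs and its output file."""

    def __init__(self, workdir):
        self.workdir = workdir

    def requires(self):
        return []  # upstream tasks that must finish first

    def output(self):
        return os.path.join(self.workdir, type(self).__name__ + ".json")

    def complete(self):
        # A task counts as done once its output file exists.
        return os.path.exists(self.output())


class IngestData(Task):
    def run(self):
        # Made-up (feature, label) rows; one row has a missing feature.
        rows = [[1.0, 0], [1.5, 0], [None, 0], [2.0, 0],
                [4.0, 1], [4.5, 1], [5.0, 1]]
        write_json(self.output(), rows)


class CleanData(Task):
    def requires(self):
        return [IngestData(self.workdir)]

    def run(self):
        rows = read_json(self.requires()[0].output())
        write_json(self.output(), [r for r in rows if r[0] is not None])


class TrainModel(Task):
    def requires(self):
        return [CleanData(self.workdir)]

    def run(self):
        rows = read_json(self.requires()[0].output())
        means = []
        for label in (0, 1):
            xs = [x for x, y in rows if y == label]
            means.append(sum(xs) / len(xs))
        # Toy "model": predict 1 above the midpoint of the two class means.
        write_json(self.output(), {"threshold": sum(means) / 2})


class EvaluateModel(Task):
    def requires(self):
        return [CleanData(self.workdir), TrainModel(self.workdir)]

    def run(self):
        rows, model = (read_json(t.output()) for t in self.requires())
        hits = sum(int(x > model["threshold"]) == y for x, y in rows)
        # Accuracy on the training rows, for illustration only.
        write_json(self.output(), {"accuracy": hits / len(rows)})


def build(task, log=None):
    """Run upstream tasks first, then the task itself; skip finished tasks."""
    log = [] if log is None else log
    if task.complete():
        return log
    for dep in task.requires():
        build(dep, log)
    task.run()
    log.append(type(task).__name__)
    return log


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        print(build(EvaluateModel(workdir)))  # names of the tasks that ran
        print(build(EvaluateModel(workdir)))  # outputs exist, nothing reruns
```

Because completion is tied to an output file, a rerun after a failure picks up from the first missing output rather than starting over. In a real pipeline the task bodies would call an actual learning library and the runner would be the workflow framework itself.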

Speaker: Atreya currently works as a Data Scientist at Pocketmath, a digital advertisement buying platform with Real Time Bidding. Day to day, he processes TBs of data using Hadoop and Spark and applies machine learning techniques. Before joining Pocketmath, he pursued his Master's in Enterprise Business Analytics at the National University of Singapore while working as a Machine Learning Associate with Newcleus, a CRM Data Analytics Platform. At Newcleus, he was responsible for productising a Machine Learning platform which ingests CRM data from Salesforce, cleans it and applies Machine Learning. His final-year thesis was done in association with Dailymotion, a video platform for web and mobile. At Dailymotion he was exposed to the world of Natural Language Processing and Text Mining, working on Twitter data to improve their existing recommendation system using Twitter trending topics. He has 2 years of experience in the Research and Development team at SAP Labs, creating Enterprise Applications in the Mobile and Big Data space. He has been using Python for almost 2.5 years for data analysis and backend development. Some of the libraries he uses in his day-to-day tasks are numpy, scipy, pandas, scikit-learn, luigi, hyperopt and flask.

Apart from work and technology, he is a football aficionado and an amateur wine connoisseur who loves travelling to new places and reading comics.

• 8:00pm - 8:20pm Presentation: Predicting Next Fatal Airline Crash Due to Bad Weather Conditions by Pawel Lachowicz

Synopsis: Instrument meteorological conditions (IMC) is an aviation flight category describing weather conditions that require pilots to fly primarily by reference to instruments, and therefore under instrument flight rules (IFR), rather than by outside visual references. We study the NTSB aviation accident database, which contains information on civil aviation accidents from 1962 onwards, looking at all fatal air crashes in the world due to bad weather conditions (IFR-based). Using classical and Bayesian statistics, we build a model for calculating the probability of the next accident for the major airlines operating Airbus, Boeing, Embraer, and McDonnell Douglas aircraft.
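
As a rough illustration of how classical and Bayesian estimates of a "next accident" probability can differ, the sketch below treats accidents as a Poisson process. This is an assumed textbook model, not the speaker's: the classical version plugs in the maximum-likelihood rate, while the Bayesian version averages over a Gamma posterior for the rate. The function names, the Gamma(1, 1) prior and the event counts in the demo are hypothetical and are not taken from the NTSB data.

```python
import math


def classical_prob(events, exposure_years, horizon_years=1.0):
    """P(at least one event in the horizon), rate fixed at its MLE."""
    rate = events / exposure_years
    return 1.0 - math.exp(-rate * horizon_years)


def bayesian_prob(events, exposure_years, horizon_years=1.0,
                  prior_shape=1.0, prior_rate=1.0):
    """Same probability, averaged over a Gamma posterior for the rate.

    A Gamma(shape, rate) prior with Poisson counts gives a
    Gamma(shape + events, rate + exposure) posterior, and the
    posterior-predictive chance of zero events in the horizon is
    (rate / (rate + horizon)) ** shape.
    """
    shape = prior_shape + events
    rate = prior_rate + exposure_years
    return 1.0 - (rate / (rate + horizon_years)) ** shape


if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    for events, years in [(0, 10), (3, 9), (12, 30)]:
        print(events, years,
              round(classical_prob(events, years), 4),
              round(bayesian_prob(events, years), 4))
```

One visible difference: with zero recorded events the classical estimate is exactly zero, whereas the Bayesian estimate stays positive because the prior still allows a nonzero rate.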

Speaker: Dr. Pawel Lachowicz (Sydney, Australia) received his PhD from the Polish Academy of Sciences in 2007 for applying novel signal processing techniques in astrophysics. He then worked at Temasek Laboratories and NUS in Singapore. He is a leading expert in data analysis covering financial markets, an educator, and an author of books on finance, data processing, and applied programming. He is also the founder of and a writer at QuantAtRisk.com. He specializes in Python for finance and big data.

• 8:20pm - 8:25pm Lucky Draw sponsored by O'Reilly

Updates

• PyCon Singapore 2016 is happening on June 23-25. See https://pycon.sg/

Join us on Facebook and Twitter

https://www.facebook.com/groups/pydatasg/
https://www.twitter.com/pydatasg
