PyData Amsterdam - Schiphol Group

PyData Amsterdam

Schiphol Real Estate

Evert van de Beekstraat 202 · Schiphol

How to find us

>>>> LOCATION OF THE MEETUP (otherwise you'll end up at the wrong place) https://goo.gl/maps/wxwi6kcJpgaQJFVT7 >>>>

Details

Parking: you can use P12. Please show your ticket at the event to one of the hosts/hostesses from Schiphol Group so they can stamp it for you.

Hi PyData folks,

Time for a post-conference meetup! This time it is hosted by our sponsor Schiphol Group at their headquarters.

# Program:
18:00 Doors open. Come to mingle and have some food and drink!
18:45 Welcome by Schiphol Group
18:50 Talks by Schiphol Group:
- Aircraft turnaround insights powered by Deep Learning (Jori van Lier)
- Schiphol Runway: take off to the cloud(s)! (Daniel van der Ende & Tim van Cann)
19:40 ---- Small break ----
19:50 Labeling a country-wide dataset on a budget with active learning (Rok Mihevc)
20:20 More time to mingle and drink!
Around 21:15 the bar will close (and the building as well!)

# Abstracts and bios
## Aircraft turnaround insights powered by Deep Learning
When an aircraft arrives at a gate at Schiphol, the turnaround process begins: de-boarding the passengers, unloading bags and cargo, cleaning, fueling, catering, etc., all as fast as possible so that the aircraft can take off again with minimal delay. Many things can and do go wrong during this process. One of the top priorities at Schiphol is to improve On-Time Performance. Unfortunately, we were missing key data points about this process and were not able to analyse it effectively. To solve that, we have started to automatically process video streams using Computer Vision and Deep Learning techniques. We are now able to recognise vehicles and other objects on the aircraft stand, and can determine whether key process milestones have started on time.

Bio Jori van Lier:
Jori van Lier is a Senior Data Scientist at Schiphol (ad interim). He has worked on various data products and models to predict and improve airport operations.

## Schiphol Runway: take off to the cloud(s)!
Developing and deploying data products often involves several steps, especially when using cloud infrastructure. Tying this all together can take a lot of time, and can also result in a lot of different ways to achieve the same goal throughout an organisation. At Schiphol, we've built a library, dubbed Runway, that we use for deployment of data products to the Schiphol Data Hub (SDH) on Microsoft Azure. Runway allows us to abstract away the details of how to interact with the various components and systems needed to put a full-fledged data product live: deploying Spark jobs on Databricks, publishing custom packages to PyPI, setting up Azure Event Hubs consumers, deploying APIs on Kubernetes, and much more. Runway itself is written in Python, and we'll give you a glimpse into how we built it.

Bio Daniel van der Ende & Tim van Cann:
Daniel and Tim are both data engineers at GoDataDriven. They enjoy building scalable machine learning solutions and automating all the things.

## Labeling a country-wide dataset on a budget with active learning
Labeling data is a tedious, expensive and time-consuming activity. It typically needs to be done before a project has proof of value and can present a significant business risk.
We were tasked with creating a catalog of all Slovenian dolines (sinkhole-like geographical features). Our starting dataset was a LIDAR scan of the entire country's surface with a resolution of 1 m^2 and no labels. We estimated manual labeling would have taken about 4 months of non-stop expert work.
We looked into ways to best leverage manual work while maintaining accuracy.
The labeling library we developed for the project has been open-sourced and I will present it alongside our approach and results.

Bio Rok Mihevc:
Rok is a physicist turned freelance data scientist. Open source contributor, interested in data science tooling and building complete data pipelines.
Currently working for UC Berkeley on DARPA's Data-Driven Discovery of Models (D3M) program.