Past Meetup

Big Data and Machine Learning - London - Meetup #7


62 people went



Meetup #7

PLEASE NOTE: Limit of 140 attendees (see below)

Welcome to Meetup #7, and what we hope will be another interesting evening of presentations and lightning talks. You are encouraged to participate in the Q&A sessions, and we hope that the networking gives you the opportunity to meet the presenters, other attendees and the organisers of the Meetup.

The agenda is listed below, followed by further details about the main presentations and their presenters.

We look forward to meeting many of you at this Meetup, but for those who are unable to join us, we hope to see you at one of our other meetups throughout the year.

Please note that there is a maximum limit of 140 attendees for this event. However, in common with other Meetups, we unfortunately see a high no-show rate (despite pleading with people to release their places if they find they are unable to attend). We have raised the number of attendees who can register in the hope of getting closer to a full house, but we will have to stick to a maximum of 140 through the doors. On the night it will be first come, first served, so make sure you turn up early to guarantee entry!

If you have already RSVP’d “YES”, but find you are no longer able to attend, PLEASE make the effort to release your space ASAP to enable those on the Waiting List the opportunity to attend – THANKS!

Should you wish to contact me, email me at [masked].

Kindest regards




18:30 Doors open and networking

18:55 Welcome

Mark Whalley

(5 mins)

19:00 ML + Firebase = ❤️

Luiz Gustavo Martins

(30 mins)

19:30 Predicting default payments for a bank, from raw data preparation to machine learning using Dataiku and Vertica

Alexandre Hubert

(30 mins)

20:00 The Lab Series

Mark Whalley

(30 mins)

20:30 Networking / Beer & Pizza

21:30 Close


ML + Firebase = ❤️

With the democratization of cameras it's much easier today to take pictures anywhere and anytime.

With all this data comes the need to understand it and extract insights from it.

In this talk I'll show you how you can use Firebase and Google Cloud ML to build a pipeline for acquiring data, training a model and running on-device ML algorithms, and how to add new functionality that gives super-powers to your app's users.

Luiz Gustavo Martins

Over 15 years, Gus has gone from working on financial systems to Android apps.

Now he's a Developer Advocate at Google where he helps people create even better apps using Firebase and Android while learning more about Machine Learning.

When he's not at the keyboard, you might find him teaching Capoeira.

twitter: @gusthema

g+: +LuizGustavoMartins


Predicting default payments for a bank, from raw data preparation to machine learning using Dataiku and Vertica

Designing and deploying an effective predictive analytics model that is integrated into a company’s daily business operations can be very challenging. Data scientists often use complex machine learning models to exploit large volumes of data from multiple environments and technologies to deliver analytics that the business needs.

Join us as we walk you through the data science journey applied to default payment prediction and learn how you can automate the entire data science workflow.
See how the integration of Dataiku, the collaborative data science platform, and Vertica, the ultra-fast analytics database platform with built-in machine learning, can help you speed the deployment of data-intensive predictive analytics.

Learn how to:

· Design connections to existing data sources with Dataiku

· Understand your datasets with built-in charting capabilities and data pre-aggregation

· Reduce the time it takes for the data preparation phase

· Leverage the scalability and speed of Vertica

Alexandre Hubert

Lead Data Scientist at Dataiku UK.


The Lab Series

In this continuation of The Lab Series, Mark will outline the next phase of the Mini Project for capturing Automatic Dependent Surveillance-Broadcast (ADS-B) data to track the position, speed and other metrics of commercial aircraft.

So far we have discussed the background to ADS-B, and how to install and configure DUMP1090 to capture and decode the digital broadcast signals from aircraft transponders using a Raspberry Pi. This was followed by sessions which introduced the various outputs from DUMP1090, outlined a method for transforming and feeding this data into Kafka topics and how, using Vertica’s inbuilt tools for integrating with Apache Kafka, we can populate a series of Vertica tables in near real time, and make this data immediately available for querying and analytics. With streaming data being loaded into Vertica in near real-time, we then demonstrated how it could be immediately interrogated to provide visualisations of that data, including using geospatial elements and the Google Maps API to plot positions of aircraft.
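To give a flavour of the transform step described above, here is a minimal sketch of parsing one message from DUMP1090's decoded output before it is fed into a Kafka topic. It assumes the commonly documented SBS-1/BaseStation CSV format that dump1090 emits on TCP port 30003; the field names and sample line are illustrative, not taken from the project's actual code.

```python
# Minimal sketch: parse one SBS-1 (BaseStation) CSV line as emitted by
# dump1090 on TCP port 30003, ready to be serialised into a Kafka topic.
# Field positions follow the commonly documented SBS-1 layout; the record
# keys are illustrative, not the Meetup project's actual schema.

def parse_sbs_line(line: str) -> dict:
    """Extract the fields most useful for position tracking."""
    f = line.strip().split(",")

    def num(s):
        return float(s) if s else None

    return {
        "msg_type": f[1],                     # transmission type (3 = airborne position)
        "icao": f[4],                         # 24-bit aircraft address (hex)
        "callsign": f[10].strip() or None,
        "altitude_ft": num(f[11]),
        "ground_speed_kt": num(f[12]),
        "track_deg": num(f[13]),
        "lat": num(f[14]),
        "lon": num(f[15]),
    }

# A made-up airborne-position message for illustration
sample = ("MSG,3,1,1,4CA2D6,1,2018/10/04,19:02:10.000,2018/10/04,"
          "19:02:10.000,BAW123  ,37000,420.0,95.0,51.4700,-0.4543,0,,,,,0")
print(parse_sbs_line(sample))
```

Each parsed record can then be serialised (e.g. as JSON) and published to a Kafka topic, from which Vertica's Kafka integration loads it into tables in near real time, as described above.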

We have now moved on to the next Phase in this Mini Project where we start to look at how we Measure and Prepare our data, prior to moving on to the final Phase where we look at how to Build and Deploy Machine Learning models.

In this first session of the “Measure and Prepare” Phase, we will look at some of the many functions built into Vertica for preparing data. These include: Gap Filling and Interpolation (GFI) of time series data, and Outlier Detection.
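For a flavour of what these two preparation steps do, here is an illustrative pure-Python analogue: linear interpolation across gaps in a time series, and simple z-score outlier detection. This mimics the concepts only; Vertica exposes them through SQL (e.g. its TIMESERIES clause), and the threshold below is arbitrary.

```python
# Illustrative sketch only: linear gap-filling over a series with missing
# samples, and z-score outlier detection. Not Vertica's implementation.

def gap_fill(series):
    """Linearly interpolate None gaps between known values (interior gaps only)."""
    out = list(series)
    i = 0
    while i < len(out):
        if out[i] is None:
            j = i
            while j < len(out) and out[j] is None:
                j += 1
            lo, hi = out[i - 1], out[j]          # assumes the gap is interior
            step = (hi - lo) / (j - i + 1)
            for k in range(i, j):
                out[k] = lo + step * (k - i + 1)
            i = j
        else:
            i += 1
    return out

def outliers(series, z=2.0):
    """Indices of values whose z-score exceeds the threshold."""
    n = len(series)
    mean = sum(series) / n
    sd = (sum((x - mean) ** 2 for x in series) / n) ** 0.5
    return [i for i, x in enumerate(series) if sd and abs(x - mean) / sd > z]

# Hypothetical altitude readings (ft) with two missing samples and one glitch
altitudes = [36000, None, None, 36300, 36400, 52000, 36500]
filled = gap_fill(altitudes)
print(filled)            # gaps replaced by 36100.0 and 36200.0
print(outliers(filled))  # index of the implausible 52000 ft reading
```

In Vertica these operations run in SQL over the streamed ADS-B tables, so the prepared series is available for the Machine Learning phase without leaving the database.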

Mark Whalley

From the early 1980s, Mark worked with Michael Stonebraker's Ingres RDBMS and then column-store big data analytic technologies. In 2016, he joined HPE Big Data Platform as a Systems Engineer specialising in Vertica and Vertica SQL in Hadoop, and from September 2017 followed Vertica as it moved over to Micro Focus.