ML in production @ iZettle, Continuous Analytics w Arcon, Provenance & Lineage

This is a past event

70 people went

Details

Announcing our next Hops ML meetup. This time with iZettle.

NOTICE: Doors lock a few minutes before 18:00 for facility security reasons. Please be there on time.

The venue is a bit smaller than usual, so there are fewer spots. Make sure to keep your attendance status up to date.

We have a strong agenda with real-world deep dives in machine learning from our friends at iZettle, updates from the lead of the "Continuous Deep Analytics" joint group at RI.SE and KTH, and news on data provenance and lineage in machine learning with the Hopsworks platform.

Agenda
17:30 Doors Open
18:00 - 18:25: Logical Clocks: Data Provenance and Lineage in Machine Learning in Hopsworks
18:25 - 19:10: Pizza and Drinks (Thank you iZettle!)
19:10 - 19:35: iZettle: Running ML in production at iZettle
19:35 - 20:00: Paris Carbone: Continuous and Deep Analytics with Arcon

Welcome!

Details

Logical Clocks: Data Provenance and Lineage in Machine Learning in Hopsworks
Alex Ormenisan, Logical Clocks

Data lineage has gained popularity in the Machine Learning community as a way to make models and datasets easier to interpret and to help developers debug their ML pipelines by enabling them to go from a model to the dataset it was trained on or the user who trained it. Data provenance and lineage is the process of building up the history of how a data artifact came to be. This history of derivations and interactions can provide better context for data discovery, debugging, and auditing. Others, such as Google, have made small steps in this area.

In the Hopsworks approach presented here, provenance information is collected implicitly through unobtrusive instrumentation of Jupyter notebooks and Python code, which we call 'implicit provenance'.
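To give a rough feel for the idea, here is a minimal Python sketch of collecting lineage as a side effect of running ordinary pipeline code, and then walking back from a model to its inputs and the users involved. It is an illustration only and not the Hopsworks implementation or API: the `tracked` decorator, the `LINEAGE` dictionary, the `trace` helper, and all artifact and user names are hypothetical.

```python
import functools

# Hypothetical in-memory provenance log: artifact name -> how it was produced.
LINEAGE = {}

def tracked(func):
    """Record inputs, operation and user for each call, leaving func's body untouched."""
    @functools.wraps(func)
    def wrapper(output_name, *input_names, user="unknown"):
        result = func(output_name, *input_names)
        LINEAGE[output_name] = {
            "operation": func.__name__,
            "inputs": list(input_names),
            "user": user,
        }
        return result
    return wrapper

@tracked
def build_features(output_name, *input_names):
    # Stand-in for real feature engineering code.
    return f"features derived from {', '.join(input_names)}"

@tracked
def train_model(output_name, *input_names):
    # Stand-in for real training code.
    return f"model fitted on {', '.join(input_names)}"

def trace(artifact):
    """Walk back from an artifact through every recorded derivation step."""
    history, pending, seen = [], [artifact], set()
    while pending:
        name = pending.pop()
        record = LINEAGE.get(name)
        if record is None or name in seen:
            continue  # raw input with no recorded history, or already visited
        seen.add(name)
        history.append((name, record["operation"], record["user"]))
        pending.extend(record["inputs"])
    return history

# Hypothetical pipeline run: the pipeline code itself never mentions provenance.
build_features("features_v1", "raw_transactions.csv", user="alice")
train_model("fraud_model_v1", "features_v1", user="bob")
print(trace("fraud_model_v1"))
```

The pipeline functions contain no provenance logic of their own; the history is gathered by the wrapper around them, which is the sense in which the collection is implicit in this sketch.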

Alex Ormenisan is a software engineer at Logical Clocks AB, the main developers of Hops (www.hops.io) and Hopsworks. Alex is a PhD Student at the School of Electrical Engineering and Computer Science, KTH Royal Institute of Technology with an interest in using data lineage to improve ML pipelines and large data transfers.

---
iZettle: Rethinking how we deploy machine learning solutions at iZettle
Björn Herder, Data Scientist, iZettle

iZettle is on a mission to help small businesses succeed in a world of giants. Founded in Stockholm in 2010, they revolutionised mobile payments with the world’s first mini chip card reader and software for mobile devices. In a world where they live and breathe data, how do they approach it with Machine Learning?

Björn is a Data Scientist at iZettle, creating and implementing machine learning solutions across the company. His previous work includes machine learning projects in sales, online marketing and the transportation industry.

---
Continuous and Deep Analytics with Arcon
Paris Carbone, Senior Researcher, RISE

There's a framework available for every need in today's data analytics, each specializing in a specific data model, e.g., TensorFlow, Flink, Spark, Neo4j. Many issues arise, though, when we try to put these pieces of software together to build continuous end-to-end pipelines. Examples are different modes of operation (batch/streaming), mismatched processing guarantees, diverse source languages (Scala, Python, Java) or target languages (CUDA, LLVM), and excessive data materialization across framework boundaries. Optimizing and tuning such pipelines to deal with real-time processing demands is one of the biggest missed opportunities.

Introducing Arcon, a system we are currently building at RISE and KTH to solve this complex problem. Arcon is equipped with a powerful compiler that can take code from any frontend or language and translate it into Arc, a general-purpose, data-driven intermediate language.

Paris Carbone is a senior researcher at RISE, currently leading the "Continuous Deep Analytics" joint group at RISE and KTH. Paris holds a PhD in Distributed Computing from KTH.