Auto Product Classification | Sharing Jupyter with Binder

USE CASE: 10 Years of Automated Category Classification for Product Data at billiger.de
Johannes Knopp (solute)

COLLABORATION: Sharing Jupyter notebooks with Binder
Tim Head (Binderproject)

----

17:30 – Doors open & Networking
18:00 – Welcome / Opening
18:10 – 10 Years of Automated Category Classification for Product Data at billiger.de
19:00 – Break with refreshments
19:45 – Sharing Jupyter notebooks with Binder
20:30 – Networking
21:15 – End

Lightning talks welcome: ping pydata-lightningtalk@koenigsweg.com.

Many thanks to the speakers, to KÖNIGSWEG for organizing, and to KPMG AG Wirtschaftsprüfungsgesellschaft for hosting this PyData Frankfurt.

This event will be in English. Questions? python@koenigsweg.com or Telegram https://t.me/joinchat/CeKOXBACWgvtkjpz8z7hQA

----
Johannes Knopp: 10 Years of Automated Category Classification for Product Data at billiger.de

10 years ago we built a classifier for categorizing product data. Let's take a journey through the lessons we learned over the years about building, maintaining, and modernizing the category classifier.

solute is in the price comparison business, and its mission is to make
sense of product data. Crucial to fulfilling that mission is figuring
out the category of each offer. We have been tackling this problem with
machine learning algorithms for over 10(!) years now.

We invite you on a journey through our history of building,
maintaining, and modernizing our category classification system. It
starts back in the days when people used BlackBerry phones and
scikit-learn hadn't even been invented yet. You will learn about the
rise of our SVM classifier and the well-motivated decisions that led to
a successful system that just needed some tweaking over the years,
until this approach didn't suffice any more. We will share our most
interesting mistakes, misconceptions, and design flaws, and how we moved
forward with the rework of the solute machine learning infrastructure
and the introduction of a neural-network-based category classifier. No
previous knowledge of machine learning algorithms is required.

Johannes Knopp studied Natural Language Processing and Computer Science and worked for some years at a university. After that he ended up as a software developer at solute GmbH in Karlsruhe, where he gets to play around with millions of product data records every day and try to make sense of them. He likes podcasts, board games, and wordplay.

Tim Head: Sharing Jupyter notebooks with Binder

Tim Head is one of the core developers of Binder. He is a well-known contributor in the Python data ecosystem and has contributed to the development of Project Jupyter and other PyData projects for several years. He has extensive experience using and developing Python and C++ for data science applications, is one of the maintainers of scikit-optimize, a Python library for black-box optimization, and has contributed to scikit-learn. Tim has given many talks at small and large international conferences such as PyCon and EuroSciPy, and he co-organizes the PyData meetup in Zurich. He holds a PhD in experimental particle physics from the University of Manchester.

Sponsors

Pioneers Hub

Organizing

Hessian AI

Hosting

KPMG AG Wirtschaftsprüfungsgesellschaft

Co-Host

Denic eG

Event-Host
