Christmas Tech Talks: Dive into DuckDB & Hopsworks

Hosted By
Marianna R. and 3 others
Details

Christmas is just around the corner, and what better way to end the year than with talks about DuckDB?

Our last meetup of the year welcomes Max from DuckDB Labs and Fabio from Hopsworks. Max will introduce DuckDB, an innovative embedded data management system optimized for analytical SQL workloads. Fabio will introduce feature stores and share the challenges and lessons learned from integrating DuckDB and Arrow Flight into the Hopsworks platform.

Agenda:

17:30 - 18:00: Doors open
18:00 - 18:10: Welcome
18:10 - 18:40: DuckDB: Transforming Data Management and Analytics
18:40 - 19:10: Snacks & Refreshments
19:10 - 19:40: MLOps on the fly: Optimizing a feature store with DuckDB and ArrowFlight
19:40 - 20:30: Networking

Presentations:

DuckDB: Transforming Data Management and Analytics
Max Gabrielsson - Software Engineer at DuckDB Labs

In this talk we present DuckDB, a novel embedded data management system designed for analytical SQL workloads. By incorporating decades of clever techniques and algorithms from the database research community, DuckDB empowers data engineering on a single machine to reach a whole new level of scale and performance, without the hassle and operational overhead commonly associated with traditional database and data warehouse systems.

One way DuckDB aims to achieve this goal is through its deep integration with Python, which allows seamless interoperability with the existing data science ecosystem through familiar APIs and zero-copy data sharing with staple libraries like NumPy, Pandas and Polars. This makes DuckDB an essential tool for the practical data scientist looking to squeeze the most out of their system without having to leave their comfort zone.

We will introduce and explain the main strengths and characteristics of DuckDB, such as its parallelized, vectorized query execution engine, out-of-core (larger-than-memory) capabilities and transparent compression, and demonstrate how these features can be leveraged in a typical Python-based data science workflow through a series of examples mixing both SQL and dataframes. We will also showcase DuckDB's flexible extension system and illustrate how it can be used to bridge different data sources and domains.

Speaker Bio:
Max Gabrielsson is a software engineer at DuckDB Labs where he works on the DuckDB database system. While he generally tries to not stay confined to any specific part of the stack, he has a particular interest in geospatial data management and is the primary maintainer of the DuckDB spatial GIS extension. Max holds a BSc in Computer Science from Uppsala University and hopes to one day finish his MSc with a thesis on the topic of database systems. In his spare time he enjoys kickboxing and hacking on side projects, usually involving compilers, databases or cartography.

MLOps on the fly: Optimizing a feature store with DuckDB and ArrowFlight
Fabio Buso - VP of Engineering at Hopsworks

Feature Stores are a vital part of the MLOps stack for managing machine learning features and ensuring data consistency. This talk introduces Feature Stores and the underlying data management architecture. We’ll then discuss the challenges and learnings of integrating DuckDB and Arrow Flight into our Feature Store platform, and share benchmarks showing up to 30x speedups compared to Spark/Hive. Discover how DuckDB and Arrow Flight can also speed up your data management and machine learning pipelines.

Speaker Bio:
Fabio Buso is VP of Engineering at Hopsworks, leading the Feature Store development team. Fabio holds a master’s degree in Cloud Computing and Services with a focus on data intensive applications.

About the event

Date: December 14th, 17:30 - 20:30
Location: Hopsworks Office (Åsögatan 119, Plan 2, 116 24 Stockholm)
This time the venue is the Hopsworks Office. As the office is sometimes difficult to locate, we have made this map for everyone to follow. See you then!
Directions: 2-minute walk from Medborgarplatsen.
Tickets: Sign-up required. Anyone who is not on the list will not get in. The event is free of charge.
Capacity: Space is limited to 70 participants. If you are signed up but unable to attend, please let us know by December 13th.
Food and drinks: Snacks and drinks will be provided.
Questions: Please contact the meetup organizers.

Code of Conduct
The NumFOCUS Code of Conduct applies to this event; please familiarize yourself with it before attending. If you have any questions or concerns regarding the Code of Conduct, please contact the organizers.

PyData Stockholm