Details

This time we dive into Polars, a powerful Python DataFrame library with a highly efficient, multi-threaded implementation in Rust using Apache Arrow. It supports SQL syntax as well as native Python syntax and, while efficient and complete on its own, works well with Pandas and DuckDB, offering nearly seamless conversion of data structures to and from those libraries. We are excited about the benchmarks, which show significant performance improvements over Pandas and, to a lesser extent, DuckDB, and we also love the intuitive Python syntax in comparison.

That is not all: we will also cover a basic machine learning (more specifically, deep learning) use case. After exploring the Polars library, we will put it to the test in a fun, practical exercise: preprocessing our data efficiently in order to train a neural network.

You will hack your way through at least the following topics:
* Reading and writing large datasets from and into Parquet files
* Using Polars for data analysis and visualisation
* Lazy loading of data and lazy evaluation of queries in Polars
* Migration from Pandas and speed comparisons between the two
* Training a neural network with preprocessing performed using Polars

Agenda:
- 09:00 the office opens
- 09:30 the hackathon kicks off
- 12:00 lunch
- 16:00 end of the hackathon and time for some drinks

Lunch and snacks are provided. Dinner is provided if people are interested!

The hackathon will be at the Dataworkz office.

The address is:
Tractieweg 41, Studio E
3534 AP Utrecht

If you can't find it, call Sigrid: +31 6 121 030 72

https://github.com/codebeez/polars_hackathon
