Details

Welcome to our April meeting! Python WA is for Python enthusiasts of all skill levels. We will be meeting at Spacecubed in the Town Hall. We'll be starting off with food and non-alcoholic beverages at 5:30 pm, and talks begin at 6:00 pm. Hope to see you there! Please update your RSVP if you signed up and are unable to attend.

This month we'll be focusing on handling extra-large datasets with Xarray.

Planetary Scale Data and Xarray
by Charles Turner

There's big data, and then there's planetary scale data. In weather and climate science, it's common to deal with the latter: small datasets are regularly terabytes, and the larger ones can reach into the petabyte range. This presents some unique challenges, not least making the data tractable for researchers who don't want to learn the ins and outs of distributed computing and efficient data engineering.

Enter xarray - a high-level pure Python library for working with n-dimensional labelled arrays. I'll outline how it lets you join thousands of files in a single line of code (no SQL required!), how it scales seamlessly from a laptop to a cluster with hundreds of gigabytes of memory, and how some clever tricks built on top of its data model let you stream gigabytes of data from the cloud into interactive maps or machine learning models in real time, with no servers, databases, or sharding.

Although we'll probably get deep into the weeds, we'll start with the basics of data modelling, serialisation, and reading files, so there's (hopefully) something for everyone.

Charles works at ACCESS-NRI, where he builds open source tools to make analysing climate model data less painful. He previously built similar tooling for air quality and observational oceanography.

Second Talk TBA
Food sponsor: Horizon Digital
Venue sponsor: Spacecubed
Hosting sponsor: Ben Fitzhardinge
