Rescheduled October Data Engineering Meetup, Sydney


Details
Airtasker have kindly offered to host us this month.
We have 3 awesome speakers:
- Dan Gooden
- Claire Carroll
- Nick Wienholt
******************************
1st Talk - Dan Gooden:
Testing Patterns in Code Driven SQL Data Pipelines
Consistent and automated testing builds confidence in datasets, catches changes in upstream systems, and ensures reliability so you can build more complex models safely.
In this talk I'll cover ideas I've developed over the past few years about useful testing patterns for small, fast-moving data teams writing code driven SQL pipelines.
Dan Gooden is the Data Lead at Airtasker, where he is responsible for ensuring the company leverages data internally to discover valuable insights, and externally for the benefit of the platform's users. He has a keen interest in ensuring data has a meaningful relationship to the activities that companies undertake in the world.
Before Airtasker, Dan worked for the Domain Group as the Data Engineering Platform Lead, where he was responsible for creating and managing a team that built the data warehouse. Prior to that he contracted for many years in the DW & BI space.
******************************
2nd Talk - Claire Carroll:
Sharing beautiful data documentation
One of the hardest parts of building a data-driven culture is making sure everyone is speaking the same language – in essence, answering the question “what does this number mean, and where does it come from?”
Attempts to share this knowledge usually take the form of a “databook”, either built as a bespoke solution or put together with off-the-shelf products like Confluence.
In this talk, I’m going to demonstrate how the open source tool dbt has solved this problem.
Claire is a Data Analyst at Airtasker, and Community Manager for dbt.
******************************
3rd Talk - Nick Wienholt:
Designing and implementing an automated trading system based on many disparate data sources, using multiple machine learning models and executing across multiple exchanges, is an interesting engineering challenge, and one in which reference architectures are still very much at the embryonic stage.
In this presentation, Nick will present a complete architecture based on a number of open-source tools including Redis, Kafka and Spark, and examine a number of the possible design approaches.
Nick is a consulting data and quantitative engineer based in Sydney. With a focus on high volume trading systems based on machine learning and alternative data, Nick enjoys working with a variety of clients on both the buy- and sell-side in the financial markets and gaming industry.
******************************
We have our own Slack group and website. You can find more details here: https://sydneydataengineers.github.io/
