Spark Meetup July @ Macquarie


Details
Flexible and scalable machine learning using Apache MXNet
Guy Needham - Servian
Apache MXNet is rapidly gaining momentum as a multi-language machine learning framework. I'll introduce the framework and explain how it works with Spark to scale not only model scoring but also model training across a compute cluster. I will compare and contrast different methodologies for training machine learning models on large data sets using Spark, focussing on the differences between TensorFlow and MXNet.
Introducing Arc: Predictable, repeatable and manageable data pipelines
Mike Seddon - AGL Energy
Arc is an opinionated, open-source framework for defining predictable, repeatable and manageable data transformation pipelines. This talk will introduce Arc and walk through the process of defining a declarative data pipeline that performs common extract-transform-load (ETL) tasks: data typing, aggregation, dealing with schema evolution, and deployment.
