
Big Data & Machine Learning Pipelines: A Tale of Lambdas, Kappas and Pancakes

Hosted By
Kelsey I. and Jarrod L.

Details

6:00pm - Arrival, Mingling & Pizza eating!
6:20pm - Introductions & Presentation
7:30pm - Open Discussions & Networking
8:00pm - Event Concludes

ACL welcomes us back to their wonderful space, with an in-house presenter speaking on "Big Data & Machine Learning Pipelines: A Tale of Lambdas, Kappas and Pancakes".

Level: Beginner to Intermediate

Presenter: Osama Khan, Big Data Engineer, ACL

Bio: Osama Khan is a software engineer interested in distributed systems, machine learning, complexity and game theory. Currently Osama is a Big Data Engineer at ACL, working on the core big data platform.

Presentation Description:
Data Lake, Business Intelligence, Enterprise Data Warehouse, Big Data Pipeline, Online Machine Learning, Lambda Architecture, Streaming, Spark, Kafka, Storm, Flink, Hadoop, Mesos and the SMACK stack are some of the terms you hear when you start building a data pipeline. The Big Data Landscape cannot fit on a single screen, as this chart shows:
http://mattturck.com/wp-content/uploads/2017/05/Matt-Turck-FirstMark-2017-Big-Data-Landscape.png

This is in addition to all the Big Data & Machine Learning offerings AWS has introduced over the past few years, which address many pain points highlighted by the various communities and help you get up and running faster.

The objective of this talk is to give the audience a framework that helps them define their pipeline problems, isolate components and pick the right tool for each job.

We will talk about:

  1. A consistent definition of BIG in big data

  2. The lineage of fundamental tools in the ecosystem

  3. First principles of a big data pipeline based on the lambda (not lambda functions) and kappa architectures

  4. Distinguishing between big data and online machine learning pipelines

  5. Technology choices based on first principles, open source solutions and AWS offerings

  6. Demo: Serverless, Managed Big Data Pipeline and real-time dashboard on AWS (orchestrated via Terraform)

Vancouver Amazon Web Services User Group