
Kubernetes and Databricks Spark Streaming

Hosted By
John K. and 3 others

Details

Agenda below. Hosted by Armada House - https://armadahousebristol.com/ - a gorgeous new events venue in the heart of Bristol’s Old City.

Location: Armada House, Telephone Avenue, Bristol BS1 4BQ

AGENDA

18:00 - 18:20 Meet & Greet

--------------

18:20 - 19:10 A Deep Dive into Kubernetes by Andrew Pruski

With the rise of containers came the rise of container orchestrators. There are several available, but Kubernetes really did win the "orchestrator war".

It's now the de facto standard for managing large numbers of containers in production, but Kubernetes does have a lot of moving parts.

Join Microsoft Data Platform MVP, Andrew Pruski, in this session to dive into Kubernetes and learn about its various components.

First off we'll look at the various control node components: -
- The apiserver, the front end for the Kubernetes control plane
- etcd, a key-value store for all the cluster data
- The kube-scheduler, which assigns pods to nodes

Kubernetes can have multiple control nodes, and we'll go over the considerations for making the Kubernetes control plane highly available.

Then we'll dive into the different worker node components: -
- The kubelet, responsible for ensuring containers are running in a pod
- The container runtime: which options do we have (Docker, containerd, etc.)?

Once we've gone over all the node components we will then cover deploying applications to Kubernetes, using SQL Server as an example: -
- What is the difference between deployments, daemonsets, and statefulsets?
- How to connect to our app in Kubernetes (services)
- Adjusting pod eviction timings for high availability
- Persisting data for SQL Server

This session is for anyone who wants to learn more about the internals of Kubernetes.

--------------

19:10 - 19:40 Pizza and Networking

--------------

19:40 - 20:30 Turbo Charge your Lake House with Spark Streaming on Azure Databricks by Niall Langley

Streaming is one of the buzzwords used when talking about the Lakehouse. It promises to give us real-time analytics by enabling a continual flow of data into our analytics platforms. It's being used to power real-time processes as diverse as fraud detection, recommendation engines, stock trading, GPS tracking and social media feeds. However, for data engineers used to working with batch jobs, this can be a big paradigm shift.

In this session we take a look at Spark Structured Streaming:
- How it is architected
- What it can ingest
- How it handles state and late-arriving data
- What latency and performance to expect
- Stateless vs stateful joins

At the end of the session you'll have a good idea of what the hype around streaming actually means for your pipelines: can you improve latency and resiliency, or reduce costs, by implementing streaming pipelines?

Niall is an independent consultant specialising in data engineering & platform architecture. He has been working with the Microsoft Data Platform tools for over 13 years. These days Niall helps clients build robust, scalable data pipelines around Databricks and the Lakehouse architecture.

Niall is actively engaged in the data community, blogging occasionally and regularly speaking at user groups and conferences in the UK, including SQLBits in 2019 & 2020. Niall is a committee member for Data Relay, and has been a helper at SQLBits for many years.

--------------

20:30 - Pub

--------------
About Armada House - https://armadahousebristol.com/

Armada House is a gorgeous new events venue in the heart of Bristol’s Old City.

Step into the grand Edwardian foyer of Armada House, with its glittering chandelier, marble flooring and unique original features. The building is steeped in history, boasting two antique fireplaces and a remarkable past as the site where Queen Elizabeth II made the first non-operator telephone call in the UK. With a range of stunning events spaces and meeting rooms, Armada House combines historic charm with modern amenities for an outstanding event hosting experience.

----
Photos
We ask that you do NOT take photos at this meetup.

We will invite people to be included in a group photo/s during the event. Speakers will let you know if it's okay to photograph their presentation (excluding other attendees).

You may see organisers taking photos during the talks. These will be of speakers, if they have agreed to this, and will not include faces of attendees.

COVID-19 safety measures

Event will be indoors
Data Bristol