Cloud Native London, February 2024


Details
Hi folks!
Welcome to our February Cloud Native London meetup! Join us to hear from our three great speakers and network with your fellow techies over pizza and drinks, or alternatively chat and follow along on YouTube!
6:00 Pizza and drinks
6:30 Welcome
6:45 There’s no such thing as not innovative enough (Josh Greencroft, DEFRA)
7:15 Kubernetes Multi-Cluster Made Easy (Marco Ballerini and Federica Ciuffo, AWS)
7:45 Break
8:00 Challenges in serving self-hosted Large Language Model (LLM) applications (Fergus Finn, TitanML)
8:30 Wrap up
See you there!
Cheryl (@oicheryl)
There’s no such thing as not innovative enough (Josh Greencroft, DEFRA)
The PlatUI team at HMRC, led by Josh Greencroft, discovered how embracing “innovation for innovation’s sake” through regular ideation sprints was integral to working around a common pitfall: platform teams not innovating for themselves. Josh shares how, by brainstorming ideas, voting on favourites, and rapidly building prototypes, they created a culture where innovation was no longer seen as optional, but essential. You will learn about their process and key lessons learned along the way, and explore how there is no such thing as “not innovative enough”, even for infrastructure teams.
Josh has over 10 years’ experience in delivering complex technical products for Central Government departments. Having managed the team delivering the new Divorce Law in 2022, he is now working on transforming the Animal and Plant Health Agency. Outside of work, Josh lives in Folkestone, and spends his time doing CrossFit, climbing with his son, and doing whatever his daughter tells him to do.
Kubernetes Multi-Cluster Made Easy (Marco Ballerini and Federica Ciuffo, AWS)
A Kubernetes multi-cluster configuration can be the answer to a number of load balancing, scaling and availability problems. However, this has always been challenging to implement and manage. Most of the current solutions on the market don’t offer the right mix of configurability and maintainability, and they also involve a number of moving pieces and multiple points of failure. In this talk we will analyse the pros and cons of the most common approaches, and finally show how AWS can make this as easy as a few lines of YAML. You will learn how to manage cross-cluster connectivity with integrated security and observability, without additional infrastructure such as front-end load balancers, sidecar proxies, or lower-level network connectivity between virtual private networks.
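For a flavour of what “a few lines of YAML” can look like, here is a sketch using the upstream Kubernetes Multi-Cluster Services (MCS) API, which AWS implements via the Cloud Map MCS Controller; the namespace and Service name are invented for illustration, and the talk may demonstrate a different mechanism:

```yaml
# Export an existing Service "checkout" in namespace "demo" so that
# peer clusters in the same ClusterSet can discover and reach it.
apiVersion: multicluster.x-k8s.io/v1alpha1
kind: ServiceExport
metadata:
  namespace: demo
  name: checkout   # must match the name of the Service being exported
```

Importing clusters then resolve the service via the ClusterSet DNS name (e.g. `checkout.demo.svc.clusterset.local`), with no load balancers or sidecars to operate.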
An IT Engineer with more than 10 years of field experience, Marco is currently enjoying the public cloud: with a few lines of code he can avoid lifting a 2U server across the datacenter, and his applications can scale without him being paged on a Saturday night at 5AM. He likes to think of himself as one of the frontmen of the cloud revolution, one of the heroes behind the curtains of automation, a cloud champion with a lead role in the fight against inefficiency and obsolescence. He is also very modest.
Federica is a Solutions Architect at Amazon Web Services. She specializes in container services and is passionate about building infrastructure with code. Outside of the office, she enjoys reading, drawing, and spending time with her friends, preferably in restaurants trying out new dishes from different cuisines. Find her at: https://beacons.ai/fedeciuffo
Challenges in serving self-hosted Large Language Model (LLM) applications (Fergus Finn, TitanML)
In this talk, we will explore the challenges and solutions in effectively designing, serving, and scaling Large Language Model (LLM) applications. We’ll focus on the practical challenges we experienced when designing our efficient Takeoff inference server, including how to manage workloads with different batching strategies and performance enhancements like quantization and caching. We’ll also discuss how to efficiently enforce LLM output formats, such as JSON or a regex, which are vital for getting predictable and repeatable outputs in applications. This session aims to provide actionable insights for professionals looking to optimize their LLM applications for large-scale operations.
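The output-format enforcement the abstract mentions is often implemented as masked decoding: at each step, the server intersects the model’s token preferences with the set of tokens a format automaton still allows. Below is a toy, self-contained Python sketch of that idea; the vocabulary, DFA, and fake logits are all invented for illustration, and this is not Takeoff’s API:

```python
# A hand-written DFA enforcing the output shape {"n":<digits>} .
# transitions[state][token] -> next state; tokens absent from the
# current state's dict are masked out (disallowed).
TRANSITIONS = {
    "start":  {'{': "open"},
    "open":   {'"n"': "key"},
    "key":    {':': "colon"},
    "colon":  {d: "digits" for d in "0127"},
    "digits": {**{d: "digits" for d in "0127"}, '}': "done"},
}
ACCEPTING = {"done"}

# Stand-in "model": a fixed score per token. A real server would use
# LLM logits here; note the unconstrained model would happily pick 'cat'.
FAKE_LOGITS = {'{': 1.0, '}': 1.5, '"n"': 1.2, ':': 1.1,
               '0': 0.4, '1': 0.5, '2': 0.6, '7': 0.9,
               'cat': 5.0, 'dog': 4.0}

def constrained_greedy_decode(max_steps=10):
    """Greedy decoding restricted to tokens the DFA permits."""
    state, out = "start", []
    while state not in ACCEPTING and len(out) < max_steps:
        allowed = TRANSITIONS[state]                 # mask: DFA-legal tokens only
        best = max(allowed, key=FAKE_LOGITS.get)     # greedy pick within the mask
        out.append(best)
        state = allowed[best]
    return "".join(out)

print(constrained_greedy_decode())  # -> {"n":7}
```

Production systems compile the automaton from a JSON schema or regex and apply the mask to real logits before sampling, but the control flow is the same.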
Fergus Finn is co-founder & CTO of TitanML. He leads a team of AI/ML infrastructure developers devoted to hyper-optimizing LLMs for large scale production in enterprise. Previously, he earned a PhD in theoretical quantum condensed matter physics, worked on applications of high performance computing to simulating quantum computers, and worked to bring quantum-inspired techniques to AI optimization.
Check out https://www.oicheryl.com/cloudnativelondon if you're interested in speaking or sponsoring.
