DevOps September meetup

Hosted by Richard Fojta

Details

Talk 1: Sirius Ivlev
Lead of the DevTools SRE Team
AI in Engineering: The Trial-and-Error Method

For almost three years now, we've been waiting for AI to replace us. But until that glorious day arrives, what real value can it deliver? Let's take a closer look at the pros and cons, review research findings, and share the results of our own experiments in integrating AI tooling into our daily workflows.

***

Talk 2: Evgeny Arhipov
Head of scheduler services at Nebius
Managed Soperator: A modern, democratic approach to Slurm-based supercomputing

Pretraining and fine-tuning tasks often require a significant amount of interconnected processing power, also known as supercomputing. The industry's go-to tool is Slurm, a project dating as far back as 1994. We will discuss how we turned the traditionally very expensive and complex endeavour of running a Slurm cluster at scale into a breeze. The result is a managed solution built on a modern tech stack using open source tools, some already existing and some newly built by Nebius and contributed back to the community.

Talk 3: Building an LLM Observability Stack on AWS

In this talk, I'll share how we built the foundational infrastructure for integrating an LLM observability SaaS into our internal co-pilot product. I'll walk through how we designed the stack from scratch using AWS-native services: VPC networking, load balancers, MSK (Kafka), and RDS (PostgreSQL), together with ClickHouse Cloud accessed cross-region via AWS PrivateLink.

I'll cover how we automated Kafka topic creation with Lambda, provisioned secure, production-ready infrastructure with Terraform, and connected it all to enable real-time prompt tracing. We will discuss pitfalls and lessons learned. If you're interested in building AI-adjacent systems without drowning in abstraction, this is a practical, honest look at what it takes.

This was a greenfield project developed in pre-production, offering insights into designing for scale, reliability, and future extensibility without the pressure of live customer data. Ideal for DevOps engineers curious about AI system architecture in the cloud.

Prague DevOps Meetup
WeWork DRN
Narodni 135/14 · Prague
FREE