Google NY Site Reliability Engineering (SRE) Tech Talks, 24 Jun 2025


Details
Google SRE NYC proudly announces our next event in the Google SRE NYC Tech Talk series.
Join us for an hour of interactive short talks on Site Reliability and DevOps topics with an opportunity to mingle with the speakers and attendees over some light snacks and beverages.
The event will take place on Tuesday, 24th of June 2025 at 6 PM at our Chelsea Markets office in NYC. The doors will open at 5:30 PM. Please RSVP only if you're able to attend in person; there will be no live streaming.
When RSVP'ing to this event, please enter your full name exactly as it appears on your government-issued ID. You will be required to present your ID at check-in.
Agenda:
Gideon Lapshun – Senior Solutions Engineer at Rootly
Vibe Coding and Site Reliability
We’ll explore how vibe coding impacts SRE teams. Attendees will learn how this shift affects reliability and incident response and the challenges it introduces, such as reduced familiarity with codebases among developers and the loss of subject matter expertise.
We’ll discuss why “incident vibing” (leveraging automation and AI-driven features to tackle increased incident volume) is crucial.
The audience will learn practical strategies for:
-Accelerating incident response using AI-generated incident briefings and automated post-mortem drafts.
-Streamlining root cause analysis and resolution through AI-powered anomaly detection and contextual data ingestion.
-Mitigating the limitations of AI systems, such as hallucinations and a lack of context.
Ultimately, this talk is about turning a risk into a competitive advantage: not only empowering SRE teams to handle the growing challenges of AI-driven development, but also helping them achieve the elusive "six nines" of reliability.
Gideon is a Senior Solutions Engineer at Rootly, helping organizations modernize their incident management and on-call practices. Previously at Sentry, he led technical strategy for strategic clients, built internal tools to streamline operations, and partnered closely with product teams to drive customer-centric solutions. With a background in DevOps, observability, and cloud infrastructure, Gideon brings a pragmatic, collaborative approach to solving complex reliability challenges. Outside of work, he’s a sports enthusiast, F1 fan, and lover of all things teamwork—on and off the field.
Rudi Chiarito - Former Research Engineer at Meta and a former SRE at Google
The platform behind a generic noninvasive neuromotor interface for human-computer interaction
In 2024, the CTRL-labs team at Meta Reality Labs submitted a paper for journal review, introducing the science behind a new neural input device worn on the wrist. This talk will cover the custom Kubernetes-based platform underlying both the research/ML workloads and the data collection. We'll talk about the challenges of serving "only" hundreds of internal scientists and engineers, while also supporting data collection from thousands of volunteers. We'll cover the evolution of the services and codebase, the reliability tradeoffs, the growing pains and the custom tools that we had to build.
Rudi was a Research Engineer at Meta for five years after the acquisition of CTRL-labs. Earlier, he led infrastructure at Clarifai, where he was the very first person to schedule GPUs on Kubernetes, as he authored the code to support them. Before that, he was a Google SRE on storage and various user-facing projects, serving billions of end users. He is an avid runner and an expert in useless esoteric trivia about music.
Ronaldo Arrudas - Digital Development Studio Leader at Nearsure
Automated Observability and Incident Response in GCP
Many SRE teams still rely on manual intervention for incident handling; automation can improve response times and reduce toil.
We will cover:
-Setting up comprehensive observability: Cloud Logging, Cloud Monitoring, and OpenTelemetry
-Incident automation strategies: Runbooks, Auto-Healing, and ChatOps
-Lessons from AWS CloudWatch and Azure Monitor applied to GCP
-Case study: Reducing MTTR (Mean Time to Resolution) through automated detection and remediation
Ronaldo is the Digital Development Studio Leader at Nearsure. He is a gifted individual (AH/SD) and a Mensa member with exceptional analytical and strategic abilities, which he leverages as a leader to precisely execute complex, multidisciplinary projects. His practical, results-driven approach is evident in successful initiatives like the Fractal Initiative and Solution Insights program, demonstrating his impact on organizational transformation. An INTP, he relentlessly pursues innovative and efficient solutions, proactively addressing challenges—as shown by his internationally recognized work on the Coca-Cola GO! Platform. He champions transparency, direct communication, and neurodiversity, fostering inclusive teams through programs like “Talk to Ronaldo,” and applies a pragmatic approach to negotiations and strategic decisions, consistently focusing on innovation and sustainable value creation.
Our Tech Talks series is for professional development and networking: no recruiters, sales, or press, please! Google is committed to providing a harassment-free and inclusive conference experience for everyone, and all participants must follow our Event Community Guidelines. The event will be photographed and video recorded.
Event space is limited! A reservation is required to attend. Reserve your spot today and share the event details with your SRE/DevOps friends 🙂
