Google NY Site Reliability Engineering (SRE) Tech Talks, 12 Dec 2024

Name: Google NY Site Reliability Engineering (SRE) Tech Talks, 12 Dec 2024
Start: 2024-12-12T18:00:00-05:00
End: 2024-12-12T20:00:00-05:00
Location: Chelsea Market

Hosted By

Vlad L.

Details

Google SRE NYC proudly announces our fourth and final Tech Talk event in 2024.

The in-person-only event will take place on Thursday, December 12, 2024, at 6:00 PM at our Chelsea Market office in NYC. Doors open at 5:30 PM.

We invite you to join us for an hour of short talks on Site Reliability and DevOps topics with an opportunity to meet and talk with fellow engineers over light snacks and beverages.

When RSVP'ing to this event, please enter your full first and last name. It must match the government-issued ID you will be required to present at check-in.

Agenda:

Sal Furino - Customer Reliability Engineer at Bloomberg
9 SLIs ... OH MY!

After years of working with and coaching teams to implement SLOs, it’s becoming incredibly clear to me that the greatest challenge engineering and product teams face is finding the right SLIs. SLOs are hard to get right, and it generally takes time and multiple iterations to tweak, tune, and adjust them so that they tell us when we need to take action to defend the reliability of our systems. However, all of this rests on an underlying assumption that the SLI itself has been providing value.
As hard as SLOs are to get right, thinking of a good SLI is also difficult. This especially complicates things for engineering teams that don’t have a product person. As a result, they often struggle to identify their key user / customer journeys. This talk will attempt to provide attendees with additional guidance to help them think more clearly about SLIs and create better ones.
We’ll break SLIs up into three (3) categories: Customer / User Experience, Supporting Services, and Management / Reputation. For each category, I’ll discuss three relevant SLIs (e.g., application metrics, network metrics, public sentiment), some best practices, common pitfalls, and how the signal for each of the nine (9) metrics can be developed further to become more mature over time.
Sal Furino is a Customer Reliability Engineer at Bloomberg. During his career he’s worked as a TPM, SRE, Developer, Sys Admin, and IT support. While not working he enjoys cooking, gaming, and traveling. Sal lives in Queens and has a BS in Applied Mathematics from Marist College.

Daniel Paulus - VP of Engineering at Checkly
Shifting observability left: How to get Frontend Engineers to build monitoring checks

This talk is about using synthetic monitoring to significantly reduce MTTD and MTTR and achieve high DevOps maturity.
Daniel is a big believer in synthetic monitoring as a concept for building reliable production services. If engineers are supposed to run what they build, they need monitoring tools that work for them. He has built his own custom solutions in the past using Jenkins or GitHub Actions and later used SaaS tools for this. He would like to share his experience getting frontend engineers to build monitoring and getting everyone on an engineering team to care about production system reliability.
Daniel Paulus has taken a unique journey from military officer to tech leader, and he’s now the VP of Engineering at Checkly. Along the way, he’s worn many hats, from engineering lead to director, learning how to build strong teams and solve tough challenges. Outside of work, Daniel lives near Berlin with his family and four kids, while also finding time to maintain an open-source project. Whether it’s scaling teams or debugging code, he’s passionate about technology and enjoys sharing his knowledge with others.

Theo Klein - Senior Site Reliability Engineer for Google Maps
A Safer Future with STPA
Want to prevent outages before they happen? Traditional SRE methods focus on component failures, but a whole class of outages stems from unexpected system interactions. We found a solution.
On our team, we use System-Theoretic Process Analysis (STPA) to identify and fix system-level vulnerabilities before they cause outages. By applying STPA during the design phase, we've prevented major incidents and saved countless engineering hours.
This talk will show you how STPA can transform your approach to reliability. We'll share a real-world example where STPA caught critical design flaws that traditional methods missed, saving us months of costly rework.
Don't wait for outages to happen. Learn how STPA can help you build more resilient systems and become a 1000x engineer.
Theo is a Senior Site Reliability Engineer for Google Maps. He is leading a program to improve road closure data safety. Previously, he led a program identifying risky dependencies within Google Maps. In his spare time, he hosts supper clubs.

Our Tech Talks series is focused on professional development and networking: no recruiters, sales, or press are allowed. Google is committed to providing a harassment-free and inclusive conference experience for everyone, and all participants must follow our Event Community Guidelines. The event will be photographed and video recorded.
Event space is limited! A reservation is required to attend. Reserve your spot today and share the event details with your SRE/DevOps friends 🙂

New York Site Reliability Engineering Tech Talks

Thursday, December 12, 2024, 6:00 PM to 8:00 PM EST

Chelsea Market

75 9th Ave · New York, NY
