The 3AM Page That Convinced Me AI Alone Won't Fix Ops
Details
Note: please fill out this simple form in advance of attending the meeting to allow for smoother flow at the front door, and help us with organizing this and upcoming events. Thank you!
Abstract:
Every operations-heavy company runs the same loop: 24x7 follow-the-sun rotations, alarm fatigue, 5+ hour incident bridges, and a huge expectation that AI will fix all of it. Most of the AI SRE agents and products shipping today won't.
In this talk, I will walk through a 3AM page that taught me why context, not signal, is the missing ingredient in modern incident response. We will look at where the hyperscalers actually are with their AI Ops investments (AWS DevOps Agent, Azure SRE Agent) and what they are still missing. We will also examine a production-data deletion this past weekend that cost a small business its database in 10 seconds, and I will lay out a 3-step framework for AI + Human Ops that compresses MTTR without replacing the on-call.
The talk draws on 6 years running a 50,000-node, 100-region SRE organization. Expect 30 minutes of material plus extended Q&A.
Speaker Bio:
Sridhar Rajarao spent 6 years leading Site Reliability Engineering at hyperscale, most recently as Director of SRE for Oracle Cloud Infrastructure (OCI) Object Storage, where he ran a 45-member organization, scaled the service from 4 to 100+ global regions, and delivered a custom GPU-optimized storage platform for a Tier-1 AI customer in under three months.
Earlier in his career, he led Technical Program Management at Oracle and at Kohl's/Skava, where he focused on mobile and omnichannel platforms for Fortune 500 retail.
His recent LinkedIn article "The 3AM Page That Convinced Me AI Alone Won't Fix Ops" is the basis for this talk. He lives in Pleasanton, CA.
Timing:
7:00-7:30 Social Session
7:30-7:45 Announcements & Introductions
7:45-ish Presentation
