Join us for our monthly gathering to discuss everything from incident response to cross-team collaboration to automation frameworks. RSVP to make sure you receive the email with video call instructions!
Operating within Normal Parameters: Monitoring Kubernetes
After Kubernetes takes over your data centers, how can you be sure that it's operating within normal parameters? What does "normal" even mean? By formalizing your expected quality of service, you can measure and compare against known targets with open source tools like Prometheus. In this talk, we'll use Kubernetes as a case study for introducing service level objectives (SLOs) to guide monitoring efforts. Come learn the how and why of metric selection for monitoring Kubernetes quality of service, what gaps exist in the open source Kubernetes monitoring ecosystem, how to use Prometheus and its exporters to establish predictability and "normal" baselines, and how to use this telemetry to debug service degradations in a Kubernetes cluster.
Elana Hashman currently works for Red Hat as a Principal Site Reliability Engineer, serving as a technical lead on the Azure Red Hat OpenShift managed service. She chairs the Kubernetes Instrumentation SIG, where she focuses on benchmarking and metrics usability. In the wider FOSS community, she is a Director of the Open Source Initiative, a Python Software Foundation Fellow, and a Debian Developer (uploading).