Queueing & The Cliff: Why fast systems suddenly slow under load

Your service handles traffic beautifully at 60% utilization. At 80%, things feel a little slower. At 95%, latency explodes and everything melts down - even though you're "not at 100% yet." Why?

In this session, we'll dig into queueing theory, the math behind one of the most counterintuitive behaviors in distributed systems:

  • Why latency doesn't grow linearly with load - it grows like a hockey stick
  • Utilization vs. throughput vs. latency, and why the last 20% of capacity is a trap
  • Where queues hide in real systems: thread pools, connection pools, load balancers, message brokers
  • Practical takeaways: capacity headroom, backpressure, load shedding, and setting sane autoscaling thresholds
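To make the hockey stick concrete, here is a minimal sketch using the textbook M/M/1 model, where mean time in the system is the service time divided by (1 - utilization). The model choice (one server, random arrivals, exponentially distributed service times), the function name mm1_mean_latency, and the 10 ms service time are all assumptions picked for illustration, not figures from the session.

```python
def mm1_mean_latency(service_time_s: float, utilization: float) -> float:
    """Mean time in system (waiting + service) for an M/M/1 queue.

    Uses the standard result W = S / (1 - rho), where S is the mean
    service time and rho is the utilization (arrival rate * S).
    """
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1) for a stable queue")
    return service_time_s / (1.0 - utilization)


if __name__ == "__main__":
    service_time_s = 0.010  # assumed: 10 ms of actual work per request
    for rho in (0.60, 0.80, 0.95, 0.99):
        latency_ms = mm1_mean_latency(service_time_s, rho) * 1000
        print(f"utilization {rho:.0%}: mean latency ~{latency_ms:.0f} ms")
```

Under these assumptions, going from 60% to 80% utilization doubles mean latency (25 ms to 50 ms), while going from 80% to 95% quadruples it (50 ms to 200 ms). Real systems have multiple workers and different traffic patterns, so the exact numbers will differ, but the 1 / (1 - utilization) shape is what makes the last stretch of capacity so expensive.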

No heavy math required - we'll build the intuition with visuals and real-world examples from production systems.

Who's this for: Engineers, SREs, and architects who've ever wondered why their "fast" system fell over during a traffic spike.

Related topics

Distributed Systems
Software Architecture
Performance Engineering
Site Reliability Engineering (SRE)