Details

“It happened once. We couldn’t reproduce it. We fixed what we think was the issue.”

🎟️ Please RSVP here to join in for this conversation in person.

(Note: To ensure a spot, please register through the link)
This line captures a hard truth of microservices: the most serious failures rarely come from obvious bugs. They emerge from subtle interactions between timing, retries, partial failures, and networks. These incidents are rare, fleeting, and often disappear when we try to observe them. In distributed systems, they’re called Heisenbugs not because they’re imaginary, but because they resist repeatable observation.

Microservices inherit this failure profile even when individual services are simple and well-tested. As systems become more asynchronous and failure-aware, the space of possible behaviors explodes. Chaos engineering encourages us to inject failures and observe outcomes, but chaos is inherently non-deterministic. When something breaks, we usually can’t replay the exact sequence that caused it — leaving teams to reason probabilistically and ship fixes with limited confidence.

This talk doesn’t introduce a new framework or silver bullet. Instead, it introduces an idea: applying deterministic simulation techniques from distributed systems research to microservices. Through a small simulation, we’ll explore what changes when we can control time, message ordering, retries, and failures — and replay the same incident until we truly understand it.
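To make the idea concrete, here is a minimal sketch of the principle (not any specific framework): if every source of nondeterminism (message ordering, drops, retries) is driven by a single seeded random generator, the same seed replays the exact same incident. All names (`simulate`, the drop and retry probabilities) are illustrative assumptions.

```python
import random

def simulate(seed, n_messages=10):
    """Deterministic simulation of message delivery with injected faults.

    All nondeterminism flows from one seeded RNG, so re-running with
    the same seed reproduces the identical sequence of events.
    This is an illustrative sketch, not a real framework.
    """
    rng = random.Random(seed)
    pending = list(range(n_messages))
    rng.shuffle(pending)                 # simulated network reordering
    trace = []
    for msg in pending:
        if rng.random() < 0.2:           # simulated drop, triggers a retry
            trace.append(("drop", msg))
            if rng.random() < 0.5:       # the retry may also fail
                trace.append(("lost", msg))
                continue
        trace.append(("delivered", msg))
    return trace

# The same seed yields the same trace: a rare failure found once
# can be replayed until it is understood.
assert simulate(42) == simulate(42)
```

In a real deterministic simulator the same discipline extends to clocks, timers, and thread scheduling, which is what makes an entire incident replayable rather than just one random stream.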

The goal isn’t to replace chaos, but to complement it. By making rare failures reproducible, deterministic simulation changes how we debug systems, reason about correctness, and validate fixes. Attendees will leave with a mental model and design questions that influence how microservices are tested, evolved, and trusted in the real world.

Related topics

Events in Chennai, IN
Software Development
Software Engineering
Software QA and Testing
