We're back. :)
It's been far too long since we did a BashoChats. On the heels of RICON West, it's time to pick things back up. We've got two exceptional talks lined up for this, the 008 installment. See below for full details. Also, we'll be starting talks at 7PM sharp, so try to show up at 6:30 to enjoy the food and drinks before things get going.
Jesse, Tomaz, and the team at Layer have offered us their beautiful office in the Mission for the meetup. (See below for full address details.) They will also be sponsoring the pizza and beer. Make sure to say thanks when you arrive and ask them about what they are building...
See you on December 3rd. Ping me if you've got any questions.
7:00PM - Erasure Coding and Riak (full talk details coming soon)
Basho's Chief Architect Andy Gross returns to BashoChats to talk about some of the work he's been doing with erasure coding and Riak.
8:00PM - A State Machine Datastore in the Wild
Richard Crowley, Betable
Betable, provider of gambling-as-a-service and my employer, recently
made the jump from single-player, single-event games of chance like
slot machines into multi-player, multi-event games like blackjack.
This new service broke a lot of fundamental assumptions and resulted
in a new WebSocket-based API, new game services and a new datastore.
This is a story about that datastore.
We used to store everything in Cassandra and Cassandra was good to us,
especially operationally. Naturally, we first explored how we would
use Cassandra to support multi-event games. We prototyped two designs
based on Cassandra. Both left a lot to be desired, performance-wise,
on the bursty and relatively high-concurrency workload we tested.
So we went to the whiteboard to design our way out of "death by
roundtrip." It was time to move computation to the data. The plan
became to implement a distributed, replicated state machine in Go.
I'll go over the interesting parts of the design: the data model, disk
and wire protocols, replication and disaster recovery, secondary
indexes and instrumentation, too. We made a lot of mistakes, but the
fundamentals were sound so we were always able to recover, usually
after extreme panic.
I'll also go into detail on the human factors. We skipped the
honeymoon phase of the project in which we were "on time," and
generally handled our deadlines and expectation-setting poorly. We
further complicated matters by splitting what would typically be one
service and one dumb datastore into two services, and developed them
in tandem. Despite these hurdles we shipped everything without
This is the story of what went right and what went wrong as we
developed, deployed and operated this new service for our game
developers and their players.