Talk by Caskey Dickson: while (true) do; how hard can it be to keep running?

Caskey Dickson of Google

At Google we have more than a handful of servers and must leverage our administration time as effectively as possible. Between custom in-house software and off-the-shelf daemons, there are many parts to running a reliable, distributed, redundant service. Most fundamental is running the software and keeping it running. Through reboots, crashes, upgrades, downgrades, bugs, canaries and outages, myriad forces conspire to end your process and keep it stopped or, worse, keep it alive but not functioning.


init, upstart, rc scripts, cron, at and more all provide mechanisms to run programs unattended, but each of them can fail in different ways. When you have dozens or hundreds of servers, they will fail in many different ways. This talk will discuss the obvious and not-so-obvious failure modes of popular packages like upstart and cron, as well as how we’ve worked with and around them to ensure that when we run a daemon it stays running. Special emphasis will be given to how virtual hosts create new challenges that can trip up launch strategies and services written for bare metal.
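
As a rough illustration of the loop in the talk's title, here is a minimal Python sketch of a foreground supervisor. It is not taken from the talk or from any Google tooling; the function name `supervise`, its parameters, and the restart cap are all hypothetical. It assumes the supervised program runs in the foreground as a direct child, so its exit is observed through `wait()` rather than through a pid file.

```python
import subprocess
import sys
import time


def supervise(argv, max_restarts=3, base_delay=0.01, max_delay=1.0):
    """Run argv as a direct child and start it again each time it exits.

    Returns the list of exit codes observed. A real supervisor would loop
    forever; max_restarts is a hypothetical cap so this sketch terminates.
    """
    exit_codes = []
    delay = base_delay
    for attempt in range(max_restarts + 1):
        # The child stays attached to us as its parent, so we learn about
        # its death from wait() instead of polling a pid file.
        child = subprocess.Popen(argv)
        exit_codes.append(child.wait())
        if attempt < max_restarts:
            # Back off between restarts so a crash loop does not spin the CPU.
            time.sleep(delay)
            delay = min(delay * 2, max_delay)
    return exit_codes


if __name__ == "__main__":
    # Hypothetical flaky "daemon": a Python one-liner that exits with status 1.
    flaky = [sys.executable, "-c", "import sys; sys.exit(1)"]
    print(supervise(flaky, max_restarts=2))
```

This sketch only handles a process that exits. It does nothing about the harder case named in the abstract, a process that stays alive but is not functioning, which would need some form of health check on top.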

 

About the speaker

Caskey Dickson is a Site Reliability Engineer/Software Engineer at Google, where he works on infrastructure systems, writing and maintaining monitoring services that operate at Google scale. He has worked in online service development and system administration since 1995. Before coming to Google he was a senior developer at Symantec, wrote software for various internet startups such as CitySearch, Cars Direct and WeddingChannel, ran a consulting company for several years, and even spent half a decade teaching undergraduate and graduate computer science at Loyola Marymount University. He has an undergraduate degree in Computer Science, a Masters in Systems Engineering and an MBA from Loyola Marymount.

 

Parking and Transportation; Getting In; Etc.

There is no on-site guest parking but free street parking is available on surrounding blocks.  Also, the Google office is convenient to several bus stops. Use the main entrance on Main St.  It's the giant binoculars, hard to miss.

You will have to get a printed name tag and then you'll be escorted to the room.

Please be on time.

You can come as early as 6:30 PM.  We'll have pizza and drinks.

After the meeting, Caskey will join us for drinks down the street at O'Briens Pub.


  • Steve M B.

    Just like to add my Excellent talk also. And can't wait for part II, monitoring ;-). Also, I remembered what the alternative to upstart was, systemd, and mailed a quick blurb about it on the la-lopsa list:
    https://lists.lopsa.org/pipermail/lopsa-us-ca-la/2013-June/000084.html

    June 26, 2013

  • Hashim C.

    Great talk Caskey.

    June 24, 2013

  • Matti S

    Thanks! I really enjoyed the presentation. Caskey let us know when the notes are ready for us to review.

    June 23, 2013

  • Ralf Q.

    Just saw this blog post matching to our "after party" conversation at O'Briens...
    http://www.theregister.co.uk/2013/06/20/google_hiring_procedures/

    1 · June 20, 2013

    • Christopher H.

      I'm glad to see some reflection and sanity here from inside Google. The rep as an overly academic environment will take years to recover from.

      June 21, 2013

  • Jayme

    Very informative talk - out of my normal scope of work, but still very worthwhile. Thanks for hosting!

    June 19, 2013

  • caskey

    Thanks for all the kind comments. I appreciate you all letting me subject you to my presentation draft, oh and to berate you about pid files.

    Extra thanks to Aleksey who made it all happen in the first place.

    2 · June 19, 2013

  • Rony F.

    Awesome talk, Caskey. Very informative, especially for a Jr. Sysadmin trying to avoid any major missteps as I'm getting started. Also, a few of you guys mentioned that you were hiring. Any chance there's an opening for a Jr.?

    June 18, 2013

    • caskey

      absolutely, drop me an email caskey@google and I can give you the run down

      June 19, 2013

  • Ralf Q.

    though in range beyond my everyday work, very interesting talk and interesting group of people

    June 19, 2013

  • James M.

    Fun, informative. Talk matched well with the audience. Caskey was clearly knowledgeable and authoritative. He happily shared information that was relevant and immediately applicable. Takeaway: virtualization makes trivial race conditions non-trivial; don't disown processes you care about; try deploying configurations as OS packages; pid files suck

    June 19, 2013

  • Oleg B.

    Great talk! some very good insights and practical tips... And yes, no more process daemonization, no need to deal with PIDs any more... :-)

    1 · June 19, 2013

  • Patrick O.

    Great talk! Thanks for the insight! Where can I learn more about PID files? :-P

    June 19, 2013

  • Carolyn O.

    Great talk, very technical - I learned a lot.

    June 19, 2013

  • Christopher H.

    interesting

    June 19, 2013

  • Kazutaka U.

    i am sorry i don't think i can make it this time...

    June 18, 2013

  • Daniel V.

    not going to make it out tonight, just getting off work.

    June 18, 2013

  • A former member

    Not gonna make it after all. Please publish slides!

    June 18, 2013

  • Michael S.

    If anyone needs a ride from downtown today, contact me.

    June 18, 2013

  • Steven B. C.

    I've been working in Hollywood for thirty years helping to build and deploy the digital tools that we use today. I look forward to the presentation and meeting everyone.

    Steven B. Cohen
    Aberdeen LLC

    June 17, 2013

  • Daniel V.

    SUPER STOKED!

    June 14, 2013

  • Hashim C.

    Should be fun

    June 12, 2013

  • Aleksey T.

    Hi, Christopher. Yes, the June one is in Venice. The address is 340 Main Street, Venice, CA. I updated the meetup page AFTER I talked to you. I think you might have hit it just as I was updating it, so you saw the old address.

    May 29, 2013

  • Keith E.

    Hi Caskey. I am sorry that I can not attend this meeting because I am 2000 miles away and I will not be back until June 10th or so.

    Keith

    May 29, 2013

  • Thomas S.

    Caskey actually suggested we hold one at Google at our last meeting, as it turns out he could hook us up with a meeting room. The June meeting arrangement is with someone else, I think.

    May 29, 2013

  • Christopher H.

    Did I mishear? I thought you said the June one was in Venice.

    May 28, 2013
