
"A Working Theory of Monitoring" by Caskey Dickson of Google

LOPSA-LA and UUASC present:

A Working Theory of Monitoring 

by Caskey L. Dickson, Site Reliability Engineer, Google Inc.

At Google we have discovered many common pitfalls and false simplifications that cause frustration and blind spots in monitoring systems. We run our own home-grown monitoring systems internally, but to move beyond a hit-and-miss approach to monitoring we have developed a formal model for such systems. We use this model as a framework for developing, evaluating, and evolving monitoring systems at Google that are suitable for operating at scale.

We will present our model, show how existing open source solutions fit (and don't fit!) into that model, and invite attendees to contrast it with their own experiences. The goal is to encourage a larger discussion of the theory of monitoring and of how current solutions can be evolved into more effective tools for operators of large systems.

Caskey Dickson is a Site Reliability Engineer/Software Engineer at Google, where he writes and maintains monitoring services that operate at "Google scale." He has worked in online service development since 1995. Before coming to Google he was a senior developer at Symantec, wrote software for various internet startups such as CitySearch and CarsDirect, ran a consulting company, and even taught undergraduate and graduate computer science at Loyola Marymount University. He has an undergraduate degree in Computer Science, a Masters in Systems Engineering, and an M.B.A. from Loyola Marymount.

PARKING AND DIRECTIONS

* No on-site parking; please use street parking and the public lots surrounding the Google facility.

* The entrance is through a vehicle gate on the south-east corner of the building (corner of Sunset and Hampton). Do not come to the main entrance; they will just send you around back.

* Google security will be at the gate entrance. State that you are here for the 'LOPSA meetup' and they will admit you.


  • Thomas

    Always fun attending these meetups and getting to geek out with fellow administrators.

    October 16, 2013

  • Carolyn O.

    Nice overview/comparison of all the most common monitoring alternatives.

    1 · October 16, 2013

  • Jarrett I.

    Also, in all seriousness, Linux is severely lacking a proper stats framework that all applications on a local box can report to. That was one of the most important tools in the Windows arsenal: Perfmon, a one-stop shop for seeing how all your applications are running through a single endpoint (in this case RPC, yuk!). Nonetheless, it beats having to search for or write your own program or script that literally scrapes metrics from a flat file, socket, or web page and cobbles it all together. If the community can come up with a framework that makes it easy to publish metrics and to pull them from one location, it would be a huge win for all.

    October 15, 2013

  • Jarrett I.

    Great talk. One thing I wanted to mention but did not get the chance to is the Kale stack that Etsy has opened up on GitHub. It has two parts: Skyline and Oculus. Skyline covers the part of the monitoring stack that applies smart thresholds based on preset algorithms, while Oculus correlates graphs with similar patterns. For example: "OMG our web traffic dropped!!!! Oh, it's because of the network drop on this switch!"

    October 15, 2013

  • Andrew G.

    Also check out monitorama.com (@monitorama and #monitoringlove on Twitter) to see some of the current development in FOSS monitoring tools and mindset. It is a great conference plus hackathon that gets ops and devs together to monitor all the things. Examples: Riemann, sensuapp, graphite (whisper db), statsd, logstash. Look at the speaker names from the 2013 Boston conference, a great who's who for further research: Jason Dixon, John Allspaw, John Vincent, Jordan Sissel, Kyle "@aphyr" Kingsbury.

    October 15, 2013

  • A former member

    Silly security theater. My name is Tommi Virtanen.

    October 12, 2013

    • A former member

      That NDA quote is (probably) not about the Google office, it sounds like http://www.meetup.com... -- which I have incidentally decided to not go to, because that's just a douche move. Casual NDAs like this most definitely don't help to protect actual trade secrets, they're just ego stroking.

      2 · October 15, 2013

    • Ralf Q.

      @Kristian: This was not in regards to the event @Google, but another (DevOps) event. And IMHO, this is a typical sign of a company taking itself too seriously.
      Meetups should be about meeting people/networking and sharing information/ideas, not willfully restricting this!
      If they are afraid of people gleaning confidential information in a meeting room or such, they simply don't have the right venue. And that recruiter thing is beyond silly as well....

      1 · October 15, 2013

  • Jordan S.

    I am going to attempt to livestream tonight's presentation, and shoot
    video as well.

    The livestream link is ....

    https://new.livestream.com/accounts/5492092/events/2474342/

    and I will try to bring it up at 7PM.

    This is a new setup and a new location so I am not overly optimistic
    that this will work.

    Thanks,

    Jordan

    October 15, 2013

    • A former member

      Awesome!

      October 15, 2013

    • A former member

      Thanks for attempting this, Jordan. It's a topic I'd like to know more about, but something came up and I won't be able to make it tonight. I might be able to watch the livestream in the background.

      October 15, 2013

  • Tully R.

    Name: Tully Rankin +1 Kurtis Velarde

    October 15, 2013

  • Darius

    Darius Clarke

    October 15, 2013

  • Dahlia

    My +1 is Bill Hughes. I am Dahlia van Gelder.

    October 12, 2013

  • Techivist

    For badge access, since I go by Techivist here & in other places
    NAME: Miguel Hernandez

    October 11, 2013

  • Thomas

    Say, if we're bringing a +1, do we need to supply their names for a badge in advance as well?

    October 10, 2013

  • A former member

    I'm very interested in seeing how you guys deal with monitoring on such a large scale, but unfortunately I will be out of town. By any chance will a recording of the talk be made available? Or, failing that, maybe a copy of the slide deck at least?

    October 10, 2013

  • Louie S.

    Hi there. My guests are Artur Hovsepian and Matt Muramoto.

    October 11, 2013

  • Patrick O.

    Another talk from Caskey, it's sure to be great!

    2 · October 9, 2013

  • A former member

    Sounds fascinating and I will definitely be there.

    1 · September 17, 2013
