Getting REKd at DigitalOcean
From server health checks to network monitoring to customer activity events -- logs are everywhere at DigitalOcean. In a single day, we collect more than a terabyte of real-time event data across our entire operations infrastructure. Buried in that non-stop stream of data is everything we need to know to keep DigitalOcean's cloud services up and running. But without a centralized system to collect and analyze our logs, how can we know what's happening across thousands of nodes in our data centers? How can we identify and track important trends and patterns hidden in all that data? How can we respond to critical network events before our customers even notice?
In this talk, we'll introduce the logging and analysis tools we use to keep an eye on all those DigitalOcean droplets you depend on. We'll talk about where we were before, what we changed, and why we changed it. And we'll show how we now use open-source tools, including Rsyslog, Elasticsearch, and Kibana (we call it the REK stack), to manage our operations in real time and at scale.
Gnome Fighter/Illusionist at DigitalOcean. Software engineer with expertise in logging, metrics, distributed messaging, and cloud operations. Open-source contributor (ZeroMQ and Rsyslog).
RVA Data Hackers meets monthly to hear, present and discuss topics in machine learning & big data. Come on down and help us build Richmond's Data Hacking community.