- Cassandra Traffic Management at Instagram | Cassandra and K8s with Instaclustr
Join us for the first meetup of 2019! This meetup will take place at Instagram HQ and will include talks from Instagram and Instaclustr on all things Cassandra. Food, drinks and giveaways will be provided.
Due to security at Instagram, please also register on Eventbrite (https://goo.gl/idoZMw). You will need to be registered on Eventbrite to get into the meetup.
First talk: Cassandra Traffic Management at Instagram
Cassandra has been deployed for many years at Instagram and is still growing fast. Over the years, we’ve constantly improved the design of our infrastructure: we’ve introduced Cassandra proxy nodes to decouple the processing and storage workloads, and we’ve developed Rocksandra, a new storage engine relying on RocksDB that reduced GC pressure and improved the efficiency of our clusters. Lately, we’ve added an intermediate layer to our Cassandra infrastructure, where we can add many traffic optimizations to further improve the efficiency and reliability of our Cassandra clusters. This talk will explain this new component in detail, along with the results we’ve observed.
Bio: Michaël Figuière is a software engineer at Instagram, where he focuses on improving the efficiency and reliability of its Cassandra infrastructure. Previously, he worked on other large-scale Cassandra deployments at Netflix and Apple, and he led the Cassandra drivers team at DataStax.
Second talk: Cassandra and Kubernetes
Kubernetes has become the most popular container orchestration and management API, with cloud-native support from AWS, GCP and Azure and a growing enterprise support ecosystem. Leveraging Kubernetes to provide tested, repeatable deployment patterns that follow best practices is a win for both developers and operators. In this talk, Adam Zegelin, Co-Founder of Instaclustr, will introduce the Cassandra Kubernetes Operator, a Cassandra controller that provides robust, managed Cassandra deployments on Kubernetes. By adopting Kubernetes and Cassandra, you can provide DBaaS-like services rapidly and easily to the rest of your team and have a simple on-ramp to true multi-cloud capabilities in your environment.
Several lightning talks will follow.
Schedule:
6:00 - 7:00: Food and Networking
7:00 - 8:00: Main Presentations
8:00 - 8:30: Lightning Talks
8:30 - 9:00: Q&A/Networking
- Apache Cassandra at Uber and Netflix on new features in 4.0
Join us at the Uber SF office for a night of Apache Cassandra awesomeness. We have excellent talks lined up. First we'll look at some of the new features coming in 4.0, followed by experiences of running Cassandra at scale with Apache Mesos. This will be a night of real users talking about open source Cassandra at large, successful companies. Definitely not to be missed.
6pm: Doors open
6pm-7pm: Socialize around some food and drinks
7.00pm-7.15pm: "Cassandra Compression and the ZStandard Algorithm in RocksDB"
7.15pm-7.55pm: "A glimpse of Cassandra 4.0 features"
7.55pm-8.30pm: "Cassandra At Uber: Cassandra Operations and Lessons Learned"
Sushma Devendrappa: Instagram
Instagram's RocksDB storage engine now supports ZStandard (ZSTD) compression. This compression algorithm helps clusters with high disk usage, providing disk savings of roughly 25-30%. In this talk, we'll do a quick refresher on the status of the RocksDB storage engine for Apache Cassandra, cover the compression algorithms supported by Cassandra, and discuss the technical implementation of ZSTD in the RocksDB engine. We'll show disk space versus CPU utilization, the parameters to consider when moving to a different compression algorithm, and the disk savings of ZSTD versus LZ4.
Vinay Chella: Cloud Data Architect at Netflix
A glimpse of Cassandra 4.0 features
There are a lot of exciting features coming in 4.0, and this talk covers some of the ones that we at Netflix are particularly excited about and looking forward to. We present an overview of just some of the many improvements shipping soon in 4.0.
Ankur Bansal and Jaydeepkumar Chovatia: Uber
Cassandra At Uber: Cassandra Operations and Lessons Learned
As Uber continues to scale, our internal systems generate lots of real-time data and require highly available, reliable storage. Since 2016, Cassandra has been a key piece of Uber Engineering’s multi-datacenter infrastructure. In this talk we will discuss our architecture, technical challenges, learnings and key use cases, and how a blend of open source infrastructure (Apache Cassandra and Mesos) and in-house technologies has helped Uber scale.
- Cassandra rocks! Instagram's migration to C3.0 and the new storage engine.
Join us at Instagram HQ and learn about the biggest takeaways from Instagram's migration to Cassandra 3.0 and the observed performance of the RocksDB storage engine. We will start the event with a series of lightning talks (5 minutes each). Please reach out to a host if you would like to present something.
Register at the Building 24 lobby and mention "Cassandra meetup". Self parking is available. Using Lyft/Uber is recommended. Those who have provided name/email/company information will have their badges ready to pick up at the lobby reception (thank you!). If you haven't pre-registered, arrive earlier to allow time for the check-in process.
Agenda (more details to come):
* 6pm: Doors open
* 6pm-7pm: Socialize around some food and drinks
* 7pm-7.20pm: Lightning talks
* 7.20pm-7.40pm: "Getting Instagram on C3.0, when is it your turn?"
* 7.40pm-8.00pm: "Cassandra on RocksDB"
* 8.00pm-8.30pm: Saying goodbye
Getting Instagram on C3.0
We're all guilty of snoozing software upgrades: 'Remind me tomorrow' for so many days, or until we're faced with a mandatory upgrade. The hesitation may stem from the risks of upgrading. Will it lose my data? Will it be incompatible with dependent systems? Will it be slower? However, when the risks are mitigated and the benefits are significant, the decision to upgrade is a no-brainer. Cassandra 3.0 introduces major changes: a storage engine rewrite, materialized views and file-based hint storage, to name a few. As Instagram serves much of its core functionality on Cassandra, the risks of upgrading to this major version weighed heavily from the start. In this talk we'll cover the testing process we employed to detect regressions and data inconsistency, the patches we deployed to address performance issues, and the lessons we learned from upgrading our production clusters to 3.0.
Andrew Whang is a software engineer at Instagram. He enjoys working on infrastructure challenges at work almost as much as he enjoys the croissant section in the cafeterias.
Cassandra on RocksDB
Instagram runs one of the largest Cassandra deployments. This year, the Cassandra team at Instagram has been working on a very interesting project to make Apache Cassandra’s storage engine pluggable and to implement a new RocksDB-based storage engine for Cassandra. The new storage engine can improve the performance of Apache Cassandra significantly. In this talk, we will describe the motivation, the different approaches we considered, the high-level design of the solution we chose, and the performance metrics in benchmark and production environments.
Pengchao Wang (aka @wpc) is a software engineer on the Instagram Cassandra team. Before Instagram, he worked for 11 years at a consulting company, helping build high-quality software products and services. He is a father of two and an open source enthusiast. His hobbies are coding and tea.
- Polyglot Persistence At Netflix
Netflix’s architecture involves thousands of microservices built to serve unique business needs. As this architecture grew, it became clear that the data storage and query needs were unique to each area; there is no silver bullet that fits the data needs of all microservices. CDE (the Cloud Database Engineering team) offers polyglot persistence, which promises ideal matches between problem spaces and persistence solutions. In this meetup you will get a deep dive into the self-service platform, our solution for repairing Cassandra data reliably across different datacenters, Memcached Flash and cross-region replication, and the evolution of graph databases at Netflix.
Agenda:
6:00 - 7:00 Registration, Food/Drink & Networking
7:00 - 8:30 Talks
8:30 - 9:00 Q&A
Talk Details:
1. CDE Service & Data Explorer (20 mins)
Join us to learn about CDE Service, a central hub for managing our large-scale fleet of polyglot datastores, and how it empowers Netflix engineers across the company to get onboarded and access the operational insights they care about. We'll also demo the Netflix Data Explorer for Cassandra and Dynomite: see how Netflix engineering explores data in our persistent stores with tools that encourage best practices.
2. Repair Service (20 mins)
Anti-entropy repair in C* has long been one of the most painful operational overheads in providing C* as a service. To solve this pain, we built a fully decentralized, self-scheduling, self-healing and self-monitoring repair service that keeps data consistent across nodes and datacenters and solves this problem once and for all. In this meetup, we will share the design internals and the production wins our repair service brought to hundreds of C* clusters and thousands of C* nodes.
3. Memcached Flash and Cross-Region Replication (20 mins)
Memcached Flash is the next-gen storage solution used by EVCache, which uses SSD (Flash) to store data. We are going to talk about how we were able to scale the storage from GBs to TBs without compromising speed or throughput and at a significantly reduced cost. Coherency in a distributed cache is a tough problem to solve. Doing this at scale across multiple AWS regions is challenging. We are going to talk about our approach and our solution.
4. What’s Next? Graph Database (10 mins)
Graph databases optimize for use cases driven by many-to-many relationships and the need for fast, flexible, interactive traversals of those relationships. At Netflix we have identified use cases that require a flexible, fine-grained data model, and decided to leverage a framework based on our Cassandra deployment. Therefore, we have integrated JanusGraph with the Netflix ecosystem. In this talk, we are going to go over a few of the use cases that leverage JanusGraph, what motivated us to use JanusGraph, and the migration path from TitanDB.