GTS Episode 13: Building data pipelines in Shopee with DEC

Details

Imagine that you're working on an awesome backend project. It's well defined and well decoupled from the rest of the subsystems in the company. At some point, however, another team asks you to send them a stream of some of the data that your project owns. This request doesn't really fit your project's definition, implementation or API, but the data must be shared as per a product requirement. So you add a seemingly random, out-of-place piece of code that does exactly that: it sends a very specific stream of data to some external system.

Over time, your backend project accumulates more and more of these "additional" data-streaming pieces of code, bloating the project and making it hard to track and manage. You can't wait to get rid of them, but you don't really know where to move this code...

Now imagine that, a few days later, your project manager tells you to expect peak load from a massive marketing campaign starting soon. As a responsible developer, you run stress tests and find the bottleneck. You can fix it by moving some locking queries out of a transaction and processing them asynchronously somewhere else. But where?...

DEC to the rescue!

DEC (Data Event Center) is Shopee's new programmable data pipeline. It is a platform that tracks database events asynchronously and triggers actions upon those events using a built-in scripting engine. It solves the problem of streaming data changes from databases to any given destination, be it another database, a queue or k/v storage, with the ability to modify the stream of data as it flows from the source DB to the destination sinks. Applications range from DB replication tasks and asynchronous processing of DB updates to DB-to-cache synchronisation and generating task queues based on DB changes.
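To make the idea concrete, here is a minimal sketch of the general pattern described above: a database change event passes through a user-supplied transform and is then fanned out to one or more sinks. This is not DEC's actual API or scripting language. Every name here (`ChangeEvent`, `Pipeline`, `order_to_cache_entry`, the `orders` table and the in-memory sinks) is hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ChangeEvent:
    """A hypothetical database change event (not DEC's real event format)."""
    table: str
    op: str    # "insert", "update" or "delete"
    row: dict


# A transform may reshape an event into a record, or drop it by returning None.
Transform = Callable[[ChangeEvent], Optional[dict]]
# A sink receives the transformed record (another DB, a queue, a cache, ...).
Sink = Callable[[dict], None]


class Pipeline:
    """Applies one transform to each event and fans the result out to sinks."""

    def __init__(self, transform: Transform, sinks: list[Sink]) -> None:
        self.transform = transform
        self.sinks = sinks

    def handle(self, event: ChangeEvent) -> None:
        record = self.transform(event)
        if record is None:
            return  # the transform filtered this event out
        for sink in self.sinks:
            sink(record)


def order_to_cache_entry(event: ChangeEvent) -> Optional[dict]:
    """Keep only non-delete changes to the (hypothetical) orders table."""
    if event.table != "orders" or event.op == "delete":
        return None
    return {"key": f"order:{event.row['id']}", "value": event.row["status"]}


# In-memory stand-ins for a k/v cache and a task queue.
cache: dict = {}
queue: list = []


def cache_sink(record: dict) -> None:
    cache[record["key"]] = record["value"]


def queue_sink(record: dict) -> None:
    queue.append(record)


pipeline = Pipeline(order_to_cache_entry, [cache_sink, queue_sink])
pipeline.handle(ChangeEvent("orders", "update", {"id": 7, "status": "paid"}))
pipeline.handle(ChangeEvent("users", "insert", {"id": 1}))  # filtered out
```

In this sketch the backend service itself contains no streaming code: the transform and the sinks live in the pipeline, which is the separation of concerns the scenarios above are asking for.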

This talk will cover the basic concepts and architecture behind DEC, and how it is used at Shopee to solve these problems.

Speaker: Rim Zaydullin

Rim is a senior expert software engineer at Shopee and has been working on infrastructure projects in the internet industry for the last 12 years. He's particularly passionate about data storage systems of all flavors, from in-memory-only storage to distributed filesystems. Some of his previous work includes:

- Yandex Cocaine Cloud (https://techcrunch.com/2013/10/16/search-engine-giant-yandex-launches-cocaine-a-cloud-service-to-compete-with-google-app-engine/)
- CocCoc Bigdata project (https://bigdata.coccoc.com/en)

Schedule

7:00 PM Registration, mingling and snacks

7:30 PM The Talk

8:30 PM Q&A and more mingling