[Workshop] Storage engines that power our databases
![[Workshop] Storage engines that power our databases](https://secure.meetupstatic.com/photos/event/6/0/a/f/highres_513264751.webp?w=750)
Details
Click the link below to Register (MANDATORY)
https://forms.gle/VRUikZmkTx2jsL9o8
(Note: By clicking directly on "Attend" at the bottom, you will receive an empty invite that does not confirm your registration. So please register using the above link.)
The ThoughtWorks Pune engineering centre will host a series of workshops and talks on storage engine technology over the next few months.
We're glad to announce the first event of the series on storage engines.
About:
Database systems play a key role in any application architecture, whether in the cloud or otherwise. They range from traditional relational databases like PostgreSQL and MySQL, to NewSQL databases such as YugabyteDB and CockroachDB, to NoSQL databases such as Cassandra and MongoDB.
The key component of these databases is the storage engine, which manages the persistence of data on disk and provides efficient access to it. Because of the nature of storage hardware, from mechanical hard drives to the latest NVMe SSDs, two data structures dominate storage engine design: the B+Tree and the LSM Tree.
Understanding these data structures is key to making database choices for your application architecture. In this workshop you will get hands-on experience building these two styles of storage engines. We will also discuss how they are used in popular mainstream databases like Cassandra, MongoDB and Postgres.
Key takeaways:
- Get a code-level understanding of B+Tree and LSM Tree based storage engines.
- Get familiar with the storage engine code of mainstream databases like Cassandra, YugabyteDB, MongoDB etc.
- Understand how data storage engines like RocksDB or BadgerDB are built.
- Get an understanding of how to choose a data storage engine for your application architectures.
This workshop is ideal for: Developers, Technical Architects
Date: Saturday, 17-June-2023
Time: 10:00 AM to 05:00 PM
Venue: ThoughtWorks Technologies India Pvt. Ltd., 6th Floor, Binarius Building, Deepak Complex, National Games Road Beside Sales Tax Office, Shastrinagar, Yerawada, Pune, Maharashtra 411006
This is an in-person hands-on workshop. Lunch will be provided at the venue.
Hosts:
Unmesh Joshi
Principal Consultant, ThoughtWorks
Unmesh is an application developer at ThoughtWorks. He is documenting Patterns of distributed systems, which is published periodically at https://martinfowler.com/articles/patterns-of-distributed-systems/
Sarthak Makhija
Lead Consultant, ThoughtWorks
Sarthak is an application developer at ThoughtWorks and has worked for Citigroup and TCS in the past. He has a keen interest in designing storage engines and databases. He is currently building an LFU cache in Rust and plans to build a write-optimized storage engine using an LSM tree, with read-optimized paths.
Prerequisites:
- Set up golang on your machine (Version 1.17)
- Have an IDE ready to code in golang
- Write a basic test in golang and ensure that it works
- Clone the repository (https://github.com/SarthakMakhija/storage-engine-workshop-template.git)
- (Optional) If you can find some time to read about golang, go ahead and do it
This workshop will cover the following topics:
Introducing Storage engine
- What is a storage engine?
- Forms of block IO
- Data structures for a storage engine
- Role of B+Tree in MongoDB and Postgres
LSM
- Understanding the LSM tree and its components
- Tradeoffs between LSM tree and B+tree
- Building Segmented Write-ahead log
- Introducing memtable based on ConcurrentHashMap
- Building Memtable with "put" and "get"
- Role of LSM-tree in BadgerDB
- Implementation of Z-set in Redis
SSTable and Bloom filter
- Introducing SSTable
- Converting Memtable to SSTable
- Introducing Bloom filter
- Writing a bloom filter and attaching it to SSTable
Compaction
- Searching in SSTable
- Introducing Compaction
- Connecting all the dots in our storage engine
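To give a flavour of the topics above, here is a minimal Go sketch of a memtable with "put" and "get" and a Bloom filter. It is not taken from the workshop repository: the names Memtable, BloomFilter, NewMemtable, NewBloomFilter and MayContain are hypothetical, the memtable is assumed to be a mutex-guarded map, and the filter is assumed to use double hashing over a single FNV-1a hash. It avoids generics so that it builds with Go 1.17.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sync"
)

// Memtable is a hypothetical in-memory table guarded by a read/write lock.
type Memtable struct {
	mu      sync.RWMutex
	entries map[string][]byte
}

// NewMemtable returns an empty memtable.
func NewMemtable() *Memtable {
	return &Memtable{entries: make(map[string][]byte)}
}

// Put stores or overwrites the value for key.
func (m *Memtable) Put(key string, value []byte) {
	m.mu.Lock()
	defer m.mu.Unlock()
	m.entries[key] = value
}

// Get returns the value for key and whether the key was present.
func (m *Memtable) Get(key string) ([]byte, bool) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	value, ok := m.entries[key]
	return value, ok
}

// BloomFilter is a hypothetical fixed-size filter. It can report false
// positives but never false negatives.
type BloomFilter struct {
	bits   []bool
	hashes uint32
}

// NewBloomFilter assumes size > 0 and hashes > 0.
func NewBloomFilter(size int, hashes uint32) *BloomFilter {
	return &BloomFilter{bits: make([]bool, size), hashes: hashes}
}

// positions derives the bit positions for key as h1 + i*h2, where h1 and h2
// are the two halves of one 64-bit FNV-1a hash (h2 is forced to be odd).
func (b *BloomFilter) positions(key string) []uint32 {
	h := fnv.New64a()
	h.Write([]byte(key))
	sum := h.Sum64()
	h1, h2 := uint32(sum), uint32(sum>>32)|1
	out := make([]uint32, b.hashes)
	for i := uint32(0); i < b.hashes; i++ {
		out[i] = (h1 + i*h2) % uint32(len(b.bits))
	}
	return out
}

// Add sets every bit position derived from key.
func (b *BloomFilter) Add(key string) {
	for _, p := range b.positions(key) {
		b.bits[p] = true
	}
}

// MayContain returns false only if key was definitely never added.
func (b *BloomFilter) MayContain(key string) bool {
	for _, p := range b.positions(key) {
		if !b.bits[p] {
			return false
		}
	}
	return true
}

func main() {
	memtable := NewMemtable()
	filter := NewBloomFilter(1024, 3)

	memtable.Put("storage", []byte("engine"))
	filter.Add("storage")

	value, ok := memtable.Get("storage")
	fmt.Println(string(value), ok, filter.MayContain("storage"))
}
```

In an LSM-style engine, a filter like this would typically be attached to each SSTable so that a read can skip tables that definitely do not hold the key. The engine built in the workshop may be structured quite differently.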