TBDEG - Project Nessie and Lakehouse Catalog Versioning by Alex Merced!


Details
This meetup is a monthly chat for our community to discuss the latest and greatest in data engineering. We'll cover interesting topics, techniques, and tools through general open discussions and focused presentations.
This month, Alex Merced will present "Project Nessie and Lakehouse Catalog Versioning" - A deep-dive into leveraging Project Nessie for effective catalog versioning in a Lakehouse setup.
Project Nessie is an open-source project that provides a Git-like approach to version control for data lakehouse tables. This makes it possible to track data changes over time and revert to previous versions if necessary.
In a lakehouse environment, catalog versioning is essential for ensuring the accuracy and reliability of data. By tracking changes to the catalog, you can ensure that everyone is working with the same data version. This can help to prevent errors and inconsistencies.
Project Nessie can be used to implement catalog versioning in a lakehouse environment. This can be done by creating a Nessie repository for the catalog and then tracking changes to the repository using Git-like commits, branches, and tags.
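To make the Git-like idea concrete, here is a minimal sketch in plain Python. It is a toy in-memory model, not the Nessie API: the class and method names (ToyCatalog, create_branch, commit, merge, rollback) and the example metadata path are all hypothetical, chosen only to show how branching, committing, merging, and reverting a catalog could fit together.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Commit:
    """An immutable catalog snapshot: table name -> metadata location."""
    tables: Dict[str, str]
    message: str
    parent: Optional["Commit"] = None


class ToyCatalog:
    """Hypothetical in-memory versioned catalog (illustration only, not Nessie)."""

    def __init__(self) -> None:
        # Every branch name points at the commit that is its current head.
        self.branches: Dict[str, Commit] = {
            "main": Commit({}, "initial empty catalog")
        }

    def create_branch(self, name: str, source: str = "main") -> None:
        # A new branch starts at the same commit as its source branch.
        self.branches[name] = self.branches[source]

    def commit(self, branch: str, table: str, location: str, message: str) -> None:
        # Copy the current snapshot, change one table pointer, advance the head.
        head = self.branches[branch]
        tables = dict(head.tables)
        tables[table] = location
        self.branches[branch] = Commit(tables, message, parent=head)

    def merge(self, source: str, target: str = "main") -> None:
        # Assumption: this toy model only supports fast-forward merges.
        if not self._is_ancestor(self.branches[target], self.branches[source]):
            raise ValueError("target has diverged; only fast-forward is supported")
        self.branches[target] = self.branches[source]

    def rollback(self, branch: str) -> None:
        # Move the branch head back to the previous commit.
        head = self.branches[branch]
        if head.parent is None:
            raise ValueError("no earlier version to roll back to")
        self.branches[branch] = head.parent

    @staticmethod
    def _is_ancestor(ancestor: Commit, descendant: Commit) -> bool:
        # Walk the parent chain of descendant looking for ancestor.
        node: Optional[Commit] = descendant
        while node is not None:
            if node is ancestor:
                return True
            node = node.parent
        return False


catalog = ToyCatalog()
catalog.create_branch("etl")
# The bucket path below is a made-up placeholder.
catalog.commit("etl", "sales", "s3://example-bucket/sales/v1.metadata.json", "add sales")
catalog.merge("etl")  # main now sees the sales table
```

Work done on the etl branch stays invisible to readers of main until the merge, and rollback("main") would return main to its previous snapshot. That isolation and the ability to revert are the behaviors the talk description attributes to catalog versioning.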
This presentation will discuss the benefits of using Project Nessie for catalog versioning in a lakehouse environment. We will also discuss how to implement catalog versioning using Project Nessie.
Key takeaways:
- Project Nessie can be used to track changes to data over time in a lakehouse environment.
- Catalog versioning is essential for ensuring the accuracy and reliability of data in a lakehouse environment.
- Project Nessie can be used to implement catalog versioning in a lakehouse environment.
Speaker Bio:
Alex Merced is a developer advocate for Dremio, a developer, and a seasoned instructor with a rich professional background, having worked with companies like GenEd Systems, Crossfield Digital, CampusGuard, and General Assembly.
Alex is a co-author of the O'Reilly book "Apache Iceberg: The Definitive Guide." With a deep understanding of the subject matter, Alex has shared his insights as a speaker at events including Data Day Texas, OSA Con, P99Conf, and Data Council.
Driven by a profound passion for technology, Alex has been instrumental in disseminating his knowledge through various platforms. His tech content can be found in blogs, videos, and his podcasts, Datanation and Web Dev 101.
Moreover, Alex Merced has made contributions to the JavaScript and Python communities by developing a range of libraries. Notable examples include SencilloDB, CoquitoJS, and dremio-simple-query, among others.
twitter: amdatalakehouse
threads: alexmercedcoder
mastodon: @alexmerced@data-folks.masto.host
linkedin: /in/alexmerced
Looking forward to seeing you all soon!
