Flyte Community Talk: Cross Platform Lineage with OpenLineage


Details
This is another jam-packed session that we hope you can join or rewatch:
Talk #1: Cross-platform lineage using OpenLineage by Harel Shein
Abstract:
There are more data tools available than ever before, and it's easier to build a pipeline than it's ever been. This has resulted in an explosion of innovation, but it also means that data within today's organizations has become increasingly distributed. It can't be contained within a single brain, a single team, or a single platform.
Data lineage can help by tracing the relationships between datasets and providing a map of your entire data universe. OpenLineage provides a standard for lineage collection that spans multiple platforms, including Apache Airflow, Apache Spark, Apache Flink, and dbt. This empowers teams to diagnose and address widespread data quality and efficiency issues in real time.
In this session, Harel Shein from Datadog will show how to trace data lineage across Apache Spark and Apache Airflow. He will walk through the OpenLineage architecture and provide a reference that might be useful for the Flyte community.
Talk #2: Flyte platform monitoring with Prometheus and Grafana by Shivay Lamba
Abstract:
Join this session to learn how to set up an observability stack for a Flyte environment, giving you better insights to plan resources, identify bottlenecks, and optimize your ML/Data pipeline platform.
--
Direct Zoom link: https://zoom-lfx.platform.linuxfoundation.org/meeting/97630681189?password=af2582b0-4453-45d5-93fe-fbf30ff7f3ed
All levels of experience are welcome. You're also welcome to join just to listen in.
We hope you learn something new every time.
See you there!
