Flyte Community Talk: Cross Platform Lineage with OpenLineage

Hosted By
Sage E.

Details

This is another jam-packed session that we hope you can join or rewatch:

Talk #1: Cross-platform lineage using OpenLineage by Harel Shein

Abstract:
There are more data tools available than ever before, and it's easier to build a pipeline than it's ever been. This has resulted in an explosion of innovation, but it also means that data within today's organizations has become increasingly distributed. It can't be contained within a single brain, a single team, or a single platform.

Data lineage can help by tracing the relationships between datasets and providing a map of your entire data universe. OpenLineage provides a standard for lineage collection that spans multiple platforms, including Apache Airflow, Apache Spark, Apache Flink, and dbt. This empowers teams to diagnose and address widespread data quality and efficiency issues in real time.

In this session, Harel Shein from Datadog will show how to trace data lineage across Apache Spark and Apache Airflow. He will walk through the OpenLineage architecture and provide a reference that might be useful for the Flyte community.

Talk #2: Flyte platform monitoring with Prometheus and Grafana by Shivay Lamba

Abstract:
Join this session to learn how to set up an observability stack for a Flyte environment, giving you the insight you need to plan resources, identify bottlenecks, and optimize your ML/data pipeline platform.

--
Direct Zoom link: https://zoom-lfx.platform.linuxfoundation.org/meeting/97630681189?password=af2582b0-4453-45d5-93fe-fbf30ff7f3ed

All levels of experience are welcome. It's also fine to join to listen in.
We hope you learn something new every time.

See you there!

Building AI Together - Seattle