Details

Data engineers and pipeline managers know that producing data lineage – end-to-end pipeline metadata instrumented at runtime or parsed at design time – is a heavy lift without a shared standard for lineage metadata. Each pipeline tool has to duplicate the effort, and deploying a new tool can break existing lineage workflows. Getting useful lineage can seem like a Sisyphean task.

Enter OpenLineage, an increasingly adopted open standard for lineage metadata collection. It defines a generic model of run, job, and dataset entities identified using consistent naming strategies. The core model can be extended by defining facets that enrich those entities.
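
For readers new to the model, here is a minimal sketch of what a run event with run, job, and dataset entities might look like, built with plain Python dictionaries rather than any client library. The top-level field names follow our reading of the OpenLineage spec, but the facet shown is simplified (the published spec requires additional metadata keys on each facet), and the namespaces, job name, dataset names, and producer URL are all hypothetical. Check the published spec before relying on any of it.

```python
import json
import uuid
from datetime import datetime, timezone


def dataset(namespace, name, facets=None):
    """Describe a dataset by namespace and name, with optional facets."""
    return {"namespace": namespace, "name": name, "facets": facets or {}}


def make_run_event(event_type, job_namespace, job_name, inputs, outputs, run_id=None):
    """Build a dict shaped like an OpenLineage run event (illustrative only)."""
    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        # Hypothetical producer URL identifying the code that emitted the event.
        "producer": "https://example.com/hypothetical-producer",
        "run": {"runId": run_id or str(uuid.uuid4()), "facets": {}},
        "job": {"namespace": job_namespace, "name": job_name, "facets": {}},
        "inputs": inputs,
        "outputs": outputs,
    }


# A simplified schema-style facet enriching the output dataset (assumed shape).
orders_schema = {"schema": {"fields": [{"name": "order_id", "type": "BIGINT"}]}}

event = make_run_event(
    "COMPLETE",
    job_namespace="example-scheduler",
    job_name="daily_orders",
    inputs=[dataset("postgres://example-host:5432", "public.raw_orders")],
    outputs=[dataset("postgres://example-host:5432", "public.orders", orders_schema)],
)

print(json.dumps(event, indent=2))
```

The same job and dataset names would appear in every event a tool emits, which is what lets lineage from different tools be stitched together.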

Time: 5 PM ET
Place:
Canarts Media Studio, 600 Bay St. #410
Toronto, ON M5G 1M6
Venue phone: 416-805-2286

The tentative agenda:

  1. Intros
  2. Evolution of spec presentation/discussion (project background/history)
  3. State of the community
  4. Integrating OpenLineage with Metaphor (by special guests Ye & Ivan)
  5. Spark/Column lineage update
  6. Airflow Provider update
  7. Roadmap Discussion
  8. Action items review/next steps

Join our Slack community: https://bit.ly/lineageslack

Sponsors

Astronomer Inc

Supercharge Airflow with our modern data orchestration platform