Details

Data engineers and pipeline managers know that producing data lineage (end-to-end pipeline metadata instrumented at runtime or parsed at design time) is a heavy lift without a shared standard for lineage metadata. It requires duplicated effort across pipeline tooling, and deploying new tools can break existing lineage workflows. Getting useful lineage can seem like a Sisyphean task.

Enter OpenLineage, an increasingly adopted open standard for lineage metadata collection. It defines a generic model of run, job, and dataset entities identified using consistent naming strategies. The core lineage model can be extended by defining facets that enrich those entities.
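
For anyone new to the model, here is a rough sketch of how a run, a job, and datasets might fit together in a single lineage event. It uses plain Python dictionaries rather than any client library, and it is simplified: the namespaces, dataset names, job name, and producer URL are all made up for illustration, and fields a spec-valid event would also need (such as schema URLs on the event and its facets) are left out.

```python
import json
import uuid
from datetime import datetime, timezone


def dataset(namespace, name, facets=None):
    """A dataset entity, identified by namespace + name and enriched by facets."""
    return {"namespace": namespace, "name": name, "facets": facets or {}}


def build_run_event(event_type, job_namespace, job_name, run_id, inputs, outputs):
    """Assemble a simplified run event linking one run of a job to its datasets."""
    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "producer": "https://example.com/hypothetical-producer",  # assumption
        "run": {"runId": run_id, "facets": {}},
        "job": {"namespace": job_namespace, "name": job_name, "facets": {}},
        "inputs": inputs,
        "outputs": outputs,
    }


# Hypothetical input dataset, enriched with a schema facet listing its columns.
orders = dataset(
    "postgres://db.example.com:5432",
    "public.orders",
    facets={"schema": {"fields": [{"name": "id", "type": "INTEGER"}]}},
)

# Hypothetical output dataset with no facets attached.
summary = dataset("postgres://db.example.com:5432", "public.orders_summary")

event = build_run_event(
    "COMPLETE",
    "example_namespace",      # hypothetical job namespace
    "daily_orders_summary",   # hypothetical job name
    str(uuid.uuid4()),        # each run gets its own unique ID
    [orders],
    [summary],
)

print(json.dumps(event, indent=2))
```

The point of the sketch is the shape: the same namespace and name always refer to the same job or dataset, while facets carry the optional extra detail.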

Time: 5 PM ET
Place:
Canarts Media Studio, 600 Bay St. #410
Toronto, ON, M5G1M6
Venue phone: 416-805-2286

The tentative agenda:

  1. Intros
  2. Evolution of spec presentation/discussion (project background/history)
  3. State of the community
  4. Integrating OpenLineage with Metaphor (by special guests Ye & Ivan)
  5. Spark/Column lineage update
  6. Airflow Provider update
  7. Roadmap Discussion
  8. Action items review/next steps

Join our Slack community: https://bit.ly/lineageslack
