Skip to content
Vol. I · No. 251
Mon · 8 Jun
A Daily Lexicon of Trustworthy Data
No. 245
245·02 · Owner MissingNo. 245 · 23 May 2026 · 2 min

Lineage is mandatory for the audit and partial in practice.

A graph that stops at the warehouse door explains everything except where the number came from.

FiledThe EditorSource Notes

Every regulator wants the same thing: trace the number in the report back to where it began. Most lineage diagrams answer a narrower question, which is where the number was last seen.

What happened is that lineage matured as a standard faster than it matured as coverage. OpenLineage is an open framework for collecting lineage metadata across pipelines, with Marquez as its reference implementation under the Linux Foundation's LF AI & Data. The tooling to record provenance is real, governed, and adopted. The question is how far up the chain the recording actually reaches.

It matters because the audit need is specific. Column-level lineage tracks how an individual field was computed and transformed, not merely that one table fed another. Table-level lineage tells an auditor that customer data reached a report; column-level tells them which field, through which transformation, produced the figure under review. The gap between those two sentences is the entire point of the exercise.

What it reveals about the field is a familiar boundary problem. The Basel Committee's risk-data principles, in force since 2013, require banks to aggregate risk data fully and accurately, and a 2023 progress report found additional work required at all assessed banks to attain or sustain full compliance. Lineage that begins where data lands in the warehouse quietly excludes the spreadsheets, hand-edits, and source systems where the number was actually shaped.

What to watch is where the graph terminates. A lineage program that stops at the warehouse door is documenting custody, not origin. The reveal: lineage explains everything except the one thing the audit asked, which is where the number came from and who is accountable for the transformation that made it.

The takeaway

Lineage that ends at the warehouse documents custody, not origin. Audit accountability lives in the transformations it skips.

The claim, mapped
  1. OpenLineage is an open standard for lineage metadata collection, with Marquez as its reference implementation under LF AI & Data.

    supports01
  2. Column-level lineage traces how a specific field was computed and transformed, where table-level lineage only shows that tables are connected.

    supports02
  3. Regulators require banks to trace and aggregate risk data accurately, and lineage provides the audit trail used to verify it.

    supports0304
  4. Nearly a decade after the principles took effect, a 2023 review found additional work required at all assessed banks to reach or sustain full compliance.

    context03
Sources
01
OpenLineage (LF AI & Data) — OpenLineage — an open framework for data lineage collection and analysis2024-01-01 · Tier 2 · primaryDescribes OpenLineage as an open standard for lineage metadata collection, with Marquez as the reference implementation, an LF AI & Data project, to collect, aggregate, and visualize ecosystem metadata.
02
Dawiso — What Is Column-Level Lineage? Data Tracing at Field Granularity2024-06-01 · Tier 4 · practitionerContrasts table-level lineage, which shows tables connect, with column-level lineage, which traces how a specific field was computed and transformed across each pipeline step.
03
Basel Committee on Banking Supervision (BIS) — Principles for effective risk data aggregation and risk reporting (BCBS 239)2013-01-09 · Tier 1 · primaryRequires banks to strengthen risk-data aggregation capabilities so exposures can be identified fully, quickly, and accurately; issued after banks proved unable to do so.
04
Collibra — Four ways data lineage powers BCBS 239 compliance2025-08-20 · Tier 4 · vendorStates that data lineage provides the audit trail letting regulators verify data is managed per BCBS 239 principles, so consumers can understand the origin and transformation of report data.
Mark this entry
Marginalia · 0 notes

No notes yet. The margin is open.

Sign in to add a note. The margin is moderated — we keep it useful, not cruel.

Related entries
Business Sense Required
New AI rules ask you to govern data you never classified. The bill comes due first.

The obligation assumes an inventory the organization skipped. The inventory is the project.

Owner Missing
NIST Asked Where the Data Came From; The Pipeline Went Quiet

The Generative AI Profile treats provenance as a control — but admits most builders cannot say what they trained on.

Owner Missing
Everyone shipped data products. The decision rights never left the building.

The catalog filled with products. The org chart did not move an inch.