Skip to content
Vol. I · No. 251
Mon · 8 Jun
A Daily Lexicon of Trustworthy Data
No. 243
243·01 · Platform Tunnel VisionNo. 243 · 21 May 2026 · 2 min

The table format went open. The lock-in just moved upstairs to the catalog.

Storage is interoperable now. The decisions about it are still proprietary.

FiledThe EditorShiny Object Parking Lot

Three open table formats spent years competing to store your data in a way no single vendor could fence off. They mostly succeeded. So the fence quietly moved to the floor above, where the catalog lives.

The formats converged. Apache Iceberg, Delta Lake, and Apache Hudi grew into the leading open table formats, their feature sets drifting toward one another — row-level deletes, deletion vectors, similar transaction guarantees. In June 2024 Databricks agreed to acquire Tabular, founded by Iceberg's original creators, pledging format compatibility and a single open standard. The storage-layer format war was effectively called.

Then the contest moved up a layer, to the catalog. The same week, Snowflake introduced Polaris Catalog, open-source and built on Iceberg's REST API, pitched as control over your data without copying it between engines. "Open" became the headline adjective everywhere at once. Notice what the catalog holds: governance, access control, and the version history deciding who may read and trust each table.

Here is the reveal under the press releases. Open at the storage layer is not open at the decision layer. A file format anyone can read does not say who owns the table, which definition of "revenue" it encodes, or who is accountable when two engines disagree. Neutral explainers note governance is migrating into the catalog — where the next lock-in accrues.

Watch the catalog, not the file format. The question is no longer "which open format" but "who controls the catalog, the definitions, and the access decisions, and can we leave with them." Portable bytes under a proprietary decision layer is a sophisticated cage, not an open field. Confirm the governance and semantics travel too — otherwise you moved the format and kept the dependency.

The takeaway

"Open" at the storage layer is not "open" at the decision layer. Check who owns the catalog and the definitions before you call it freedom.

The claim, mapped
  1. Iceberg, Delta Lake, and Hudi are the leading open table formats and their feature sets are converging.

    supports03
  2. In June 2024 Databricks agreed to acquire Tabular, founded by Iceberg's original creators, and committed to format compatibility and a single open standard.

    supports01
  3. Snowflake introduced Polaris Catalog, an open-source catalog on Iceberg's REST API positioned to give choice and avoid copying data between engines.

    supports02
  4. As storage formats open up, governance and control are migrating to the catalog layer, where lock-in and ownership gaps can re-form.

    context03
Sources
01
Databricks — Databricks Agrees to Acquire Tabular, the Company Founded by the Original Creators of Apache Iceberg2024-06-04 · Tier 1 · primaryDatabricks states it will bring format compatibility to the lakehouse — near term via Delta Lake UniForm, long term by evolving toward a single, open, common standard (paraphrased).
02
Snowflake — Polaris Catalog: An Open Source Catalog for Apache Iceberg2024-06-03 · Tier 1 · primarySnowflake introduces Polaris, an open catalog on Iceberg's REST API, so many engines can interoperate on a single copy of data instead of moving and copying it (paraphrased).
03
Dremio — Exploring the Architecture of Apache Iceberg, Delta Lake, and Apache Hudi2023-09-22 · Tier 3 · vendorNeutral architecture explainer: the three formats are converging on similar features while the catalog layer carries table governance and cross-engine interoperability (paraphrased).
Mark this entry
Marginalia · 0 notes

No notes yet. The margin is open.

Sign in to add a note. The margin is moderated — we keep it useful, not cruel.

Related entries
Shiny Object Pursuit
Active metadata is everywhere. The people it describes are still somewhere else.

The catalog scanned every table. It has not been opened since the rollout.

Definition Drift
Everyone Agreed on the YAML. Nobody Agreed on the Owner.

A real standard for data contracts now exists. The argument it was supposed to settle has simply moved up a layer.

Owner Missing
Everyone shipped data products. The decision rights never left the building.

The catalog filled with products. The org chart did not move an inch.