The table format went open. The lock-in just moved upstairs to the catalog.
Storage is interoperable now. The decisions about it are still proprietary.
Three open table formats spent years competing to store your data in a way no single vendor could fence off. They mostly succeeded. So the fence quietly moved to the floor above, where the catalog lives.
The formats converged. Apache Iceberg, Delta Lake, and Apache Hudi grew into the leading open table formats, their feature sets drifting toward one another: row-level deletes, deletion vectors, similar transaction guarantees. In June 2024 Databricks agreed to acquire Tabular, the company founded by Iceberg's original creators, pledging format compatibility and a single open standard. The storage-layer format war was effectively called.
Then the contest moved up a layer, to the catalog. The same week, Snowflake introduced Polaris Catalog, open-source and built on Apache Iceberg's REST catalog API, pitched as control over your data without copying it between engines. "Open" became the headline adjective everywhere at once. Notice what the catalog holds: governance, access control, and version history, the things that decide who may read each table and whether to trust it.
Here is the reveal under the press releases. Open at the storage layer is not open at the decision layer. A file format anyone can read does not say who owns the table, which definition of "revenue" it encodes, or who is accountable when two engines disagree. Vendor-neutral explainers note that governance is migrating into the catalog, and that is where the next lock-in accrues.
Watch the catalog, not the file format. The question is no longer "which open format" but "who controls the catalog, the definitions, and the access decisions, and can we leave with them." Portable bytes under a proprietary decision layer are a sophisticated cage, not an open field. Confirm the governance and semantics travel too. Otherwise you moved the format and kept the dependency.
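One way to make that check concrete is a small exit audit. The sketch below is illustrative only: the asset names, the Asset fields, and the exit_audit and can_leave helpers are all hypothetical, not any catalog's API. It assumes you inventory by hand whether each decision-layer asset can be exported and whether that export is in a documented, vendor-neutral format.

```python
from dataclasses import dataclass

# Hypothetical list of decision-layer assets the article says must travel.
# These labels are illustrative; they are not fields of any real catalog.
DECISION_LAYER = (
    "access_policies",
    "metric_definitions",
    "ownership",
    "version_history",
)


@dataclass
class Asset:
    name: str
    exportable: bool   # can it be exported from the current catalog at all?
    open_format: bool  # is the export documented and vendor-neutral?


def exit_audit(assets: list[Asset]) -> dict[str, str]:
    """Label each decision-layer asset 'travels', 'stuck', or 'unknown'."""
    by_name = {a.name: a for a in assets}
    report = {}
    for name in DECISION_LAYER:
        asset = by_name.get(name)
        if asset is None:
            # Nobody inventoried it, so nobody knows whether it can leave.
            report[name] = "unknown"
        elif asset.exportable and asset.open_format:
            report[name] = "travels"
        else:
            report[name] = "stuck"
    return report


def can_leave(report: dict[str, str]) -> bool:
    """True only if every decision-layer asset travels with the data."""
    return all(status == "travels" for status in report.values())


# Hypothetical inventory, filled in by hand for one platform.
inventory = [
    Asset("access_policies", exportable=True, open_format=False),
    Asset("ownership", exportable=True, open_format=True),
    Asset("version_history", exportable=True, open_format=True),
]
report = exit_audit(inventory)
```

Anything marked stuck or unknown is a dependency that stays behind when the files move. In this made-up inventory the table bytes are portable, but can_leave(report) is false because the access policies export only in a proprietary form and nobody has checked the metric definitions.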
"Open" at the storage layer is not "open" at the decision layer. Check who owns the catalog and the definitions before you call it freedom.
Iceberg, Delta Lake, and Hudi are the leading open table formats and their feature sets are converging.
In June 2024 Databricks agreed to acquire Tabular, founded by Iceberg's original creators, and committed to format compatibility and a single open standard.
Snowflake introduced Polaris Catalog, an open-source catalog on Iceberg's REST API positioned to give choice and avoid copying data between engines.
As storage formats open up, governance and control are migrating to the catalog layer, where lock-in and ownership gaps can re-form.