The Schema History Topic Remembers What Nobody Wrote Down
Debezium 3.0 ships a connector that refuses to guess. The org still guesses.
Change data capture promises the database's truth, streamed live. It delivers the database's truth and, separately, the unanswered question of whether anyone agreed on what the columns mean.
On October 2, 2024, the Debezium project released 3.0.0.Final of its long-running open-source change data capture engine, which turns database commit logs into event streams for MySQL, Postgres, Oracle, and others. The release raised the runtime baseline to Java 17, moved onto Kafka 3.8, and removed the deprecated additional-condition field, asking everyone to update their scripts and workflows accordingly.
The detail worth dwelling on is older and quieter: Debezium keeps a schema history topic and reads it back so that it can apply the correct schema to each captured row. Per its own docs, if a table's schema is not present in that history, the connector is unable to capture the table, and an error results. The tool will not invent a definition. It would rather stop than lie about what a column was.
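The refusal is easier to appreciate in miniature. What follows is a toy model of the behavior, not Debezium's implementation: `SchemaHistory`, `capture_row`, and `UnknownSchemaError` are hypothetical names invented for illustration, and the history is an in-memory dictionary rather than a Kafka topic.

```python
class UnknownSchemaError(Exception):
    """Raised when a row cannot be matched to a recorded schema."""


class SchemaHistory:
    """Hypothetical in-memory stand-in for a schema history store."""

    def __init__(self):
        self._schemas = {}  # table name -> ordered list of column names

    def record(self, table, columns):
        """Remember the column layout for a table."""
        self._schemas[table] = list(columns)

    def schema_for(self, table):
        """Return the recorded columns, or refuse if none were recorded."""
        try:
            return self._schemas[table]
        except KeyError:
            raise UnknownSchemaError(f"no schema recorded for {table}") from None


def capture_row(history, table, values):
    """Pair raw values with recorded column names; never guess a shape."""
    columns = history.schema_for(table)
    if len(columns) != len(values):
        raise UnknownSchemaError(
            f"row shape does not match recorded schema for {table}"
        )
    return dict(zip(columns, values))
```

With a schema recorded for `inventory.orders`, `capture_row(history, "inventory.orders", [1, "shipped"])` returns a named row. Asking for a table that was never recorded raises `UnknownSchemaError` instead of returning a best guess, which is the whole point.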
That is a remarkable standard, and it is held by the connector, not the company. The software refuses to emit a row whose shape it cannot account for. The organization downstream has no such scruple: it will happily wire dashboards, alerts, and a quarterly metric onto a freshly streamed field that three teams describe three different ways. The pipeline preserves provenance to the byte. The meaning was never written down to preserve.
Watch what happens the first time a source table adds a column at 2 a.m. and the stream carries it instantly. The latency graph stays flat and green. The new field arrives correct, on time, and completely undefined. Real-time did its job. The question of what "correct" means for that column is now a ticket, and the ticket is slower than the stream by several orders of magnitude.
If your CDC engine won't capture a table without a known schema, that's the bar. Apply the same refusal to the meaning, not just the shape, before the field is live.
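One way to picture that second refusal is a gate that sits between the stream and anything that consumes it. The sketch below is an assumption-laden illustration, not a feature of any CDC tool: `FieldDefinition`, `undefined_fields`, `publish_if_defined`, and the idea of a definitions registry keyed by field name are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldDefinition:
    """Hypothetical record of what a field means and who agreed to it."""
    description: str
    owner: str


class UndefinedFieldError(Exception):
    """Raised when an event carries a field nobody has defined."""


def undefined_fields(event, definitions):
    """Return, sorted, the field names that lack a non-empty definition."""
    return sorted(
        name
        for name in event
        if name not in definitions or not definitions[name].description.strip()
    )


def publish_if_defined(event, definitions):
    """Pass the event through only if every field has an agreed meaning."""
    missing = undefined_fields(event, definitions)
    if missing:
        raise UndefinedFieldError(
            "fields without an agreed definition: " + ", ".join(missing)
        )
    return event
```

A column that appears at 2 a.m. would fail this gate until someone writes down what it means and who owns that answer. Whether to block the field, quarantine it, or merely flag it is a policy choice; the sketch takes the strictest option to mirror the connector's own behavior.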
Debezium 3.0.0.Final was released October 2, 2024, requiring Java 17 and built on Kafka 3.8.
The 3.0 release removed the deprecated additional-condition field (DBZ-8278).
If a table's schema is not present in the schema history topic, the connector cannot capture the table and errors out.
CDC tooling preserves the shape of data precisely but does not establish or preserve the agreed meaning of fields.