The Lexicon
429 terms of data management, each led by the meeting-safe punchline. The formal definition still appears with the little glasses on.
access controls
A least-privilege model in which forty-one people have privilege, because requesting removal is harder than requesting access.
The mechanisms governing who may read, write, or export a given dataset, ideally on a least-privilege basis.access recertification
The quarterly ritual of approving every entry on a list too long to read, thereby renewing access nobody remembers granting.
The periodic review in which managers confirm that each person's system access still matches their current role.accountability
A quality every initiative is said to require and every org chart is careful never to locate.
The obligation of a single named party to answer for a decision and its consequences; not delegable, unlike the work itself.accuracy
Agreement with the source system, which everyone has quietly agreed to call reality.
The degree to which a recorded value correctly describes the real-world entity or event it represents.acting data steward
The only person who knows what the field means, formally prevented from being the one who knows what the field means.
A steward appointed on a temporary basis to maintain definitions and quality for a domain pending a formal assignment.actionable insight
Term struck pending the definitions of action, owner, and decision right it claims to carry; absent those, the insight is observed but not actionable.
Term struck pending the definitions of action, owner, and decision right it claims to carry; absent those, the insight is observed but not actionable.ad hoc analysis
A one-off query that becomes a weekly dependency the moment someone forwards the screenshot.
A one-off investigation answering a specific question outside the standing reporting set.additive measure
A number the dashboard sums across every dimension, including the two where summing it means nothing.
A fact that can be summed meaningfully across every dimension, as opposed to semi-additive or non-additive measures with restricted aggregation.agentic systems
Software empowered to act on decisions no human had been assigned to make.
Systems in which a model plans and executes multi-step actions across tools with limited human intervention.agentic workflow
A workflow that gained autonomy before the approval matrix did.
A process in which an AI system plans and performs a sequence of tasks with limited human direction.ai bill of materials
The parts list everyone agrees is essential and assembles the week after the audit is scheduled.
An inventory of the models, datasets, and dependencies that compose an AI system, kept for provenance and risk.ai center of excellence
The committee formed to govern the three models the other teams already shipped without it.
A central team chartered to set AI standards, build reusable capability, and advise teams on adoption.ai gateway
A toll booth on the model traffic that three teams are already routing around.
A control point that routes, logs, and rate-limits requests to language models across an organization.ai governance
Term struck pending the decision rights, review cadence, and accountability for training data and outcomes it presumes someone already assigned.
Term struck pending the decision rights, review cadence, and accountability for training data and outcomes it presumes someone already assigned.ai maturity
Term struck: cannot be measured until the field defines the quality, ownership, and monitoring the score is meant to rate.
Term struck: cannot be measured until the field defines the quality, ownership, and monitoring the score is meant to rate.ai pipeline strategy
Term struck pending the definitions of ingestion, ownership, idempotency, lineage, and a data freshness SLA it depends on.
Term struck pending the definitions of ingestion, ownership, idempotency, lineage, and a data freshness SLA it depends on.ai policy
A document that prohibits the thing three teams shipped last quarter, dated next quarter.
A written standard governing acceptable model uses, data handling, disclosure, and human review.ai readiness
A green slide presented before anyone has profiled the data the slide describes.
The degree to which an organization's data is profiled, documented, owned, and fit for a stated model use.ai readiness of our data
Struck: the possessive is the only part established; classification, retention, lawful basis, and ownership are not.
Struck: the possessive is the only part established; classification, retention, lawful basis, and ownership are not.ai strategy
Term struck pending the definition of quality, ownership, lineage, and decision rights it depends on.
Term struck pending the definition of quality, ownership, lineage, and decision rights it depends on.ai-readiness assessment
Struck. It rates the ownership, quality, lineage, and decision rights it quietly presumes already exist.
Struck. It rates the ownership, quality, lineage, and decision rights it quietly presumes already exist.ai-ready data model
Term struck pending the definitions of grain, ownership, conformed dimensions, and metric meaning it claims to already possess.
Term struck pending the definitions of grain, ownership, conformed dimensions, and metric meaning it claims to already possess.alert fatigue
A channel so loud it has achieved the silence it was built to prevent.
The diminished responsiveness that results when a pipeline emits so many low-value alerts that operators stop distinguishing the actionable ones.alignment
The state of having agreed on the word 'alignment.'
A documented, shared agreement among stakeholders on a definition, priority, or decision, with the agreement recorded.analyst relations
The budget that moves the dot on the quadrant without moving the product.
The function that manages a company's standing with the industry research firms whose ratings influence buyers.anomaly detection
A system that reliably notices the unusual and forwards it to an inbox that treats everything as usual.
The automated identification of data points or patterns that deviate materially from an established expectation or historical baseline.anonymization
A claim made about a dataset roughly until the first person re-identifies someone in it for a conference talk.
Irreversibly altering data so that no individual can be identified, removing it from the scope of privacy law.approved use
The use described in the request form, as distinct from the use, which began the following Tuesday.
The set of purposes for which a dataset may lawfully and contractually be used, recorded at the point of access.audit finding
The annual rediscovery of last year's finding, now with a new tracking number.
A documented gap between a control's stated design and its observed operation, raised by an independent reviewer.audit trail
A complete and immutable record of every action, retained for exactly thirty days, then quietly rotated out.
A tamper-evident chronological record of who accessed or changed data, and when.backfill
Re-running history to match the present, and discovering history had a different opinion about what 'customer' meant.
Reprocessing historical data through a pipeline to populate or correct records for periods before a job existed or after it failed.batch processing
Running it overnight, so the data is ready by morning and wrong by lunch.
Processing data in bounded, scheduled groups rather than continuously, typically optimizing throughput over latency.benchmark
The industry average chosen because the company was already above it.
A reference value, often external, against which a metric is compared to judge performance.benchmark saturation
Everyone got an A, so the test resigned.
The point at which models score near-perfectly on a benchmark, so it no longer distinguishes their real capability.best practice
Someone else's context, repackaged as a rule and applied without the conditions that made it work for them.
A method shown to produce good results in comparable settings, adopted to avoid relearning what is already known.bounded context
The fence inside which customer means one thing, drawn precisely so the fight happens at the gate.
A boundary within which a particular domain model and its terms are defined and consistent, borrowed from domain-driven design.breach notification
The seventy-two-hour clock that starts the moment someone decides whether it counts as a breach.
The obligation to inform regulators and affected individuals of a personal-data breach within a defined window.breach response plan
A binder, last opened during the tabletop exercise, that names a coordinator who left in the spring.
The documented procedure assigning roles, decisions, and timelines for handling a data breach.break
A difference between two numbers, resolved by writing down why it is fine and changing neither.
In reconciliation, a discrepancy between two datasets that should agree; each break must be identified, explained, and resolved.break-glass access
An emergency-only door, propped open since the incident in March, with the review still pending.
An emergency mechanism granting exceptional access in a crisis, designed to be heavily logged and reviewed afterward.bridge table
The table built to connect two things that were promised never to relate that way.
An intermediary table resolving a many-to-many relationship between a fact and a dimension, often carrying an allocation weight.brittle job
A job documented entirely in the muscle memory of the one person who knows which step to re-run first.
A pipeline task that fails on minor, expected variation in its inputs or environment, requiring manual intervention to recover.bronze layer
The pile, renamed to suggest a podium finish.
The raw landing zone of a medallion design, holding source data as ingested with minimal transformation.brownfield architecture
Where the architecture meets the people who depend on it and loses.
A design that must integrate with and evolve existing legacy systems, accommodating their constraints, data, and dependencies.bus factor
Usually one, usually unaware, usually about to take leave.
The number of people who would have to be lost before a system or definition became unmaintainable; a lower number is a higher risk.business glossary
A glossary in which the term 'owner' is the only entry without one.
The authoritative, owned register of agreed business term definitions and their relationships to data.business intelligence
Two words that survived the rebrand to analytics, the rebrand to data science, and the rebrand back.
The practice and tooling that turn operational data into reports, dashboards, and analysis for decisions.business key
The identifier the business uses in conversation, which is to say a different one in each conversation.
The natural identifier the business recognizes for an entity, carried alongside the surrogate to preserve traceability to the source.business logic drift
The slow distance between what the query computes and what anyone still means by it, measured in pull requests no one reviewed.
The gradual divergence between the rules encoded in a transformation and the current definitions the business actually uses, accumulating as logic is copied without review.canonical data model
The one true model every system maps to, plus the per-system exceptions that quietly restore the chaos it replaced.
A single agreed representation of a business concept that systems translate to and from, so integrations share one definition rather than negotiating pairwise.center of excellence
Three people, a shared inbox, and a name chosen before the headcount was approved.
A central team that sets practices, builds reusable capability, and advises domains on a governance discipline.centralization
What the last reorg was moving away from and the next one is named after.
Consolidating data ownership, modeling, and decision authority into a single team or platform for consistency and control.certified report
A report stamped trustworthy by whoever was last willing to put their name on it.
A report formally reviewed and endorsed as accurate, with a named owner accountable for its definitions.change data capture
Capturing every change the source makes, including the ones it would prefer you had not noticed.
A technique that detects and propagates row-level changes from a source database, usually by reading its transaction log, rather than re-copying full tables.change management
The workstream that explains the new tool to the people who will still lack the authority the old tool was missing.
The discipline of helping people adopt a new way of working: communication, training, and support that make a change stick.chief data officer
An accountability spread across an average tenure shorter than a remediation plan.
The executive accountable for an organization's data strategy, governance, quality, and value realization.chunking
Cutting the policy in half exactly where the exception lived.
Splitting source documents into passages sized for retrieval and model context windows.circle back
To place a decision in a holding pattern phrased as motion. The circle is reliable; the back is theoretical.
To return to an item later, after gathering the input or authority a decision requires.citizen analyst
The role created so the analytics team could decline the request and still call it enablement.
A business user who builds reports and analyses without a formal analytics or data role.completeness
The percentage of fields that are populated, of which a smaller, unmeasured percentage is populated correctly.
The extent to which all required values are present in a dataset, typically measured as the proportion of non-null entries against an expected population.completeness check
A test confirming the rows that arrived are all there, while remaining serenely unaware of the rows that did not.
A test verifying that all expected records and required fields are present for a given load or period.compliance attestation
A signed affirmation that the controls work, dated the same week three of them were turned off for the migration.
A formal statement, often annual, in which an organization affirms it meets a defined set of controls.compliance gate
A checkbox that has never once been left unchecked.
A mandatory checkpoint in a delivery process at which work must demonstrate conformance before proceeding.compliant by default
Struck: 'by default' names no classification, no access control, and no retention to be compliant against.
Struck: 'by default' names no classification, no access control, and no retention to be compliant against.composable data stack
Some assembly required, sold as a philosophy.
An architecture assembled from interchangeable, best-of-breed components connected through open standards rather than a single vendor suite.concept drift
The thing being measured moved; the dashboard, helpfully, stayed green.
A shift in the underlying relationship a model predicts, so the same input now implies a different outcome.conformed dimension
A dimension every department agreed to share, then quietly forked the moment the meeting ended.
A dimension shared with identical structure and meaning across multiple fact tables, enabling consistent reporting across business processes.connector
A component marked 'maintained' until the source changes its API, after which it is marked 'a meeting'.
A reusable component that handles authentication, extraction, and protocol details for a specific source or destination system.consent
A freely given, specific, informed agreement obtained through a banner engineered so that 'Accept All' is the only restful option.
A freely given, specific, informed, and unambiguous indication of a data subject's agreement to processing.consistency
The state of two systems agreeing, usually because no one has joined them yet.
The degree to which the same fact holds the same value across systems, records, and points in time without contradiction.context engineering
Requirements-gathering, performed live, in a text box, by whoever is holding the laptop.
The practice of assembling the instructions, examples, and retrieved facts a model needs for a task within its context window.context window
A budget the organization fills with everything except the one document that decided the matter.
The maximum span of tokens a model can attend to in a single request, bounding how much it can consider at once.control attestation
A signature confirming that someone read the sentence describing the control, which is legally distinct from the control existing.
A signed assertion by an accountable owner that a stated control operated as designed over a defined period.controlled vocabulary
The aspiration; the practice is mostly the uncontrolled kind, which is why this exists.
A curated, governed set of approved terms that constrains how concepts are named and used to keep meaning stable.copilot
A second opinion on data the org never gave a first opinion on.
An assistant embedded in a workflow that suggests or drafts actions for a human to accept or edit.cron job
A line of asterisks on a server in the closet, quietly running the business since a tenure no one present can confirm.
A task scheduled to run automatically at fixed times or intervals, defined by a crontab expression on a host.cross-border transfer
A transfer mechanism on file, and a logging endpoint in a region nobody filed.
The movement of personal data across jurisdictions, lawful only under an approved transfer mechanism.dag
Acyclic in the diagram. The on-call rotation reports a cycle.
Directed acyclic graph: a representation of pipeline tasks and their dependencies in which no task can depend on itself, directly or transitively.dark data
Most of it, kept on the theory that volume is a strategy.
Data an organization collects and stores but never uses, classifies, or governs.dark pattern
The 'Manage Preferences' button rendered in the exact shade that the eye has been trained to skip.
An interface designed to steer users toward choices against their interest, such as broader data sharing.dashboard
A wall of charts no one is assigned to act on.
A curated visual surface that presents selected metrics for monitoring and decision support.dashboard adoption
Measured by logins, because measuring whether it changed a decision was harder.
The degree to which intended users actually open and act on a dashboard after it ships.dashboard graveyard
Where dashboards go when no one will decide to delete them and no one will open them.
The accumulation of unused dashboards that remain published, indexed, and occasionally cited.data as a product
Term struck pending the named owner, consumer, quality SLA, and decision rights that distinguish a product from a table with ambition.
Term struck pending the named owner, consumer, quality SLA, and decision rights that distinguish a product from a table with ambition.data catalog
A complete index of assets nobody has agreed are correct.
A searchable inventory of an organization's data assets, enriched with descriptions, ownership, lineage, and usage to aid discovery and trust.data certification
A badge granted on a Tuesday and never re-checked against the Wednesday data.
A formal attestation that a dataset has met defined quality criteria and is approved for a stated class of use.data champion
A title awarded in lieu of time, headcount, or authority, on the understanding that enthusiasm is a kind of budget.
A volunteer within a team who promotes good data practice and links the central program to local reality.data citizenship
A responsibility assigned to everyone, which is the established method of assigning it to no one.
The shared responsibility of all employees to handle, interpret, and share data according to agreed standards.data classification
A four-tier scheme under which everything is filed as 'Internal' to avoid the conversation.
The assignment of data to sensitivity tiers, such as public or restricted, to drive handling, access, and retention rules.data cleansing
Repairing the data instead of the system that keeps breaking it, monthly, in perpetuity.
The correction or removal of inaccurate, incomplete, or improperly formatted records to bring a dataset within its quality requirements.data contract
A written agreement about what the source will send, honored until the source has a deadline.
An agreed, enforceable specification of the schema, semantics, and quality a data producer guarantees to consumers, versioned and tested at the pipeline boundary.data contract
A promise with a schema, signed by no one with the authority to keep it.
A versioned, enforceable agreement specifying the schema, semantics, quality, and SLAs a producer guarantees to its consumers.data culture
The thing a poster is bought to fix, on the model of curing a leak by framing a photograph of a dry ceiling.
The shared habits and incentives by which an organization actually defines, trusts, and acts on its data, beyond what any policy states.data democracy
Struck: absent an access policy, an accountable owner, and decision rights, it is an unmanaged free-for-all with a hopeful name.
Struck: absent an access policy, an accountable owner, and decision rights, it is an unmanaged free-for-all with a hopeful name.data democratization
Granting everyone access to the warehouse, then asking why everyone got a different total.
Broadening access to data and analytics tools beyond a central specialist team.data dictionary
A faithful description of the column named 'cust_status_2_final', whose meaning died with its author.
A catalog of a system's data elements — their names, types, formats, allowed values, and meanings — describing structure as built.data discovery
The tool whose first scan is reclassified internally as an incident.
The automated scanning of systems to locate and classify sensitive data wherever it has accumulated.data disposition
The least-rehearsed verb in the data lifecycle, and the first one a regulator asks you to demonstrate.
The defensible destruction or transfer of records at the end of their retention period.data domain
A boundary three teams drew differently and one team enforces.
A bounded business area — customer, order, payment — that organizes data ownership and modeling around a coherent subject.data domain owner
Accountable for the domain, consulted on the budget, informed of the deadline.
The accountable leader for a data domain, holding authority over its definitions, access, and quality priorities.data downtime
The hours the data was wrong, billed entirely to whoever used it, not whoever produced it.
The cumulative period during which data is missing, late, or wrong — the data analogue of system downtime, measured in hours of untrustworthy state.data evangelist
One who brings the good news that the data exists, and is not asked whether it is correct.
A role focused on internal promotion of data initiatives and tooling, measured by enthusiasm generated rather than data quality delivered.data fabric
The word chosen because every other textile was taken.
An integration layer that uses active metadata, knowledge graphs, and automation to connect data across distributed sources for unified access.data for ai
Struck: it claims the warehouse already holds the profiling, ownership, rights clearance, and fitness-for-use no one has done.
Struck: it claims the warehouse already holds the profiling, ownership, rights clearance, and fitness-for-use no one has done.data freshness
The timestamp everyone trusts and no one checks.
How recently the data in a report reflects the source system, relative to the decision it supports.data freshness sla
A promise about how recent the data is, signed by no one and measured by whoever is annoyed first.
A committed bound on how stale data may be, stating the maximum acceptable lag between an event in the source and its availability downstream.data governance
The meeting where everyone agrees data is important and no one agrees who owns it.
The exercise of decision rights and accountability over data assets, expressed through agreed policies, roles, and standards.data governance charter
A two-page mandate granting a committee the authority to request authority.
A founding document defining a governance program's scope, authority, membership, and escalation path.data governance council
A monthly quorum convened to ratify decisions no attendee has the authority to make.
A standing cross-functional body chartered to set data policy, arbitrate disputes, and assign accountability across domains.data governance framework
A wheel diagram printed, framed, and never opened past the wheel diagram.
A reference structure of domains, roles, and practices, such as DAMA-DMBOK or DCAM, used to organize a governance program.data governance tooling
Bought to make the decisions; capable only of cataloguing the fact that no one will.
The platforms that catalog, classify, and track data and policy, supporting governance work but not performing its decisions.data labeling
The step where the team discovers it never agreed on the categories.
The process of attaching agreed categories or values to examples so a model has something to learn from.data lake
A folder, but you capitalized it.
A repository holding raw data in native formats at scale, schema applied on read rather than on write.data lineage
A family tree for numbers, requested immediately after the audit discovers everyone was adopted.
A record of where data came from, how it changed, and where it moved across systems and processes.data lineage for audit
A diagram that is accurate up to the join where someone exported it to a spreadsheet and emailed it.
A traceable record of where personal data originated, how it moved, and where it now resides, for accountability.data literacy
The training program assigned to readers to explain why the numbers never matched.
The ability to read, interpret, and reason about data and the limits of a given measure.data literacy program
Training that teaches everyone to read the dashboard and no one to ask who built it.
An organized effort to raise staff competence in reading, interpreting, and reasoning with data.data marketplace
A storefront stocked with products that have no buyers and a returns desk no one staffs.
An internal or external venue where data products are published, discovered, and subscribed to under defined terms.data mart
The version of the numbers a team built so it would stop having to agree with the other teams.
A curated, subject-area subset of the warehouse, modeled and maintained for a specific team or function's reporting needs.data masking
The careful obscuring of the production data, in the one environment, that a different team had already copied somewhere else.
The replacement of sensitive values with realistic but non-sensitive substitutes for use in lower environments.data maturity, level 3
The level every organization rates itself: precisely one above wherever it actually is.
A self-assessed stage on a capability model indicating defined, repeatable practice across the organization.data mesh
A reorganization announced as an architecture.
A sociotechnical approach that distributes ownership of analytical data to the domains that produce it, supported by self-serve platform infrastructure and federated governance.data minimization
The principle that lost, by unanimous vote, to a roadmap item called 'data we might need to train on later.'
The principle that collection be limited to data adequate, relevant, and necessary for a stated purpose.data observability
The ability to watch the data break in real time, in high resolution, on a screen no one is assigned to.
The practice of monitoring data systems for freshness, volume, schema, and distribution so that incidents are detected before consumers are affected.data onboarding
The two weeks a new hire spends discovering which of the four dashboards is the real one.
The process by which a new team member gains the access and context to work productively with the data.data owner
A field on a slide, populated last quarter with the name of whoever wasn't there to refuse.
The accountable party with authority to approve access, definitions, and acceptable use for a data asset, and who answers for its outcomes.data pipeline
A sequence of steps that worked when one person built it and has been load-bearing ever since.
A defined sequence of steps that moves and transforms data from a source to a destination on a schedule.data pipeline architecture
A diagram of arrows, one of which is doing something nobody documented.
The structural arrangement of stages — extract, transform, load, orchestrate — through which data moves from source to consumer.data platform
It did not fail to create ownership. It succeeded at revealing there was none.
An integrated set of services for storing, processing, governing, and serving data, offered to internal teams as shared infrastructure.data policy
A document whose enforcement mechanism is the hope that someone reads it.
A documented, enforceable rule governing how data may be defined, classified, accessed, retained, or shared.data product
A dataset with a landing page.
A reusable dataset with named ownership, a quality SLA, documented consumers, and decision rights attached.data product
A dashboard given a roadmap so no one has to retire it.
In analytics, a maintained dataset or metric set with a named owner, an SLA, and documented definitions for consumers.data product manager
A title invented so a dataset could have someone to attend its roadmap review.
The role accountable for a data product's consumers, roadmap, quality, and value.data profiling
The one-time exercise of learning what the data actually contains, scheduled for after launch.
The systematic examination of a dataset's structure, content, and statistics to discover its true shape, ranges, patterns, and anomalies.data protection impact assessment
An assessment performed before the activity begins, scheduled three weeks after the activity began.
A structured analysis of privacy risks for a high-risk processing activity, performed before that activity begins.data quality
A program everyone funds, no one owns, and the dashboard reports as green.
The degree to which a dataset is fit for its intended use, measured across dimensions such as accuracy, completeness, timeliness, consistency, validity, and uniqueness.data quality debt
The interest paid every quarter on the validation skipped to make one deadline two years ago.
The accumulated future cost of deferred quality work — unfixed defects, skipped validation, and undefined rules — that compounds as systems build on top of it.data quality dimension
One of six words a vendor promises to measure and a scorecard promises to average into a single reassuring number.
A named axis along which quality is assessed — accuracy, completeness, timeliness, consistency, validity, or uniqueness — each with its own rules and measures.data quality initiative
The quarterly observation that the data has quality, followed by the closing of the observation.
A time-boxed program to identify, prioritize, and remediate data-quality defects across one or more domains.data quality rule
A test that fires, raises a ticket, and waits to learn whether anyone meant it.
A codified, testable assertion about data — a constraint, range, or relationship — whose pass or fail is evaluated against incoming records.data quality scorecard
A panel of green squares whose thresholds were set after the squares were already green.
A periodic summary of data-quality metrics—completeness, validity, timeliness—reported to stakeholders by domain.data residency
A map showing where the data lives, drawn before anyone asked the CDN where it caches.
A requirement that data be stored within a specified geographic or jurisdictional boundary.data retention schedule
A document specifying when data will be deleted, ratified by an organization that has never deleted anything.
A documented policy stating how long each record class is kept and when it must be disposed.data sovereignty
A principle observed everywhere except the vendor's support tier, which is global and reads everything.
The principle that data is subject to the laws of the nation in which it is collected or stored.data standard
One of the several standards each team independently considers the standard.
An agreed, documented specification for how a class of data is to be named, formatted, classified, or validated.data steward
The one person who reads the policy, applied as a layer on top of an existing full-time job.
A named individual accountable for the definition, quality, and correct use of a defined set of data within their domain.data stewardship
A responsibility added to a job description, never to a calendar.
The assigned operational responsibility for the quality, definition, and correct use of a defined set of data within its domain.data storytelling
The discipline of making a number persuasive before it is correct.
The practice of framing analysis as a narrative so an audience can follow the finding and its implication.data subject access request
A request that, for the first time, forces the organization to find out where it actually keeps that person.
A formal request by which an individual obtains the personal data an organization holds about them.data swamp
Term struck pending the definitions of quality, ownership, catalog, and lineage it was supposed to replace.
Term struck pending the definitions of quality, ownership, catalog, and lineage it was supposed to replace.data virtualization
The integration you postponed, now happening live, in front of the user.
A technique that provides unified, real-time access to data across sources through an abstraction layer, without physically moving or copying it.data warehouse
Where data goes to be agreed upon, eventually, by people who have since left.
A centralized store of integrated, modeled, historical data optimized for query and reporting across subjects.data-driven culture
Term struck pending the definitions of decision right, metric ownership, and the willingness to act against a preference the data contradicts.
Term struck pending the definitions of decision right, metric ownership, and the willingness to act against a preference the data contradicts.dbt model
A SQL file that turned a folder of opinions about 'revenue' into a folder of opinions under version control.
A version-controlled SQL select statement that defines a transformation, materialized by dbt as a table or view with declared dependencies and tests.dead-letter queue
Where the records that could not be processed go to be inspected, in the same sense that a junk drawer is a filing system.
A holding location for messages or records a pipeline could not process, set aside for inspection rather than discarded.decentralization
The half of the pendulum currently swinging away from the slide that recommended the other half.
Distributing data ownership and decision authority to teams closer to the data, in place of a single central function.decision intelligence
The newest word for who decides, sold so the question can be procured instead of answered.
The discipline of designing how decisions are framed, made, and improved using data and models.decision log
The artifact whose existence is confirmed only by the meeting spent looking for it.
A maintained record of significant decisions, their rationale, owner, and date, kept so they can be revisited rather than relitigated.decision rights
Who may decide, within what bounds — the one field every RACI leaves to a future workshop.
The explicit allocation of who may decide what, within which bounds, and who must be consulted before they do.deduplication
The quarterly project to merge the duplicate customers, which itself runs in three teams under two names.
The identification and removal or merging of records that represent the same entity, reducing a set to its distinct members.default value pollution
The reason a company believes a meaningful share of its customers were born on the first of January, 1900.
The degradation of accuracy that occurs when placeholder or default entries — 01/01/1900, 99999, 'N/A' — accumulate and are later treated as real values.definition of done
A list that names tests, deploys, and docs, and goes quiet on whether the number is right.
The agreed checklist a unit of work must satisfy before it is considered complete.degenerate dimension
A dimension demoted to living in the fact table, named as though the modeling, and not the org, were at fault.
A dimension attribute, such as an invoice or order number, stored in the fact table itself because it has no other attributes to warrant its own dimension.denormalization
Copying the value into six tables so the dashboard loads faster, then learning which copy is wrong six tables at a time.
The deliberate introduction of redundancy into a schema to improve read performance, accepting update complexity in exchange for fewer joins.digital transformation
A standing program whose deliverable is its own continuation. It does not finish; it renews, like a season pass to a process no one owns.
A coordinated change to how an organization operates, using technology to do work that was previously manual, slow, or impossible.dimension table
Where the company keeps every adjective it has ever applied to a customer, none of them governed.
A table of descriptive attributes that give business context to facts and supply the labels used to filter and group them.dimensional modeling
Deciding what the business measures and what it measures by, then discovering nobody agrees on either.
A design technique that organizes data into facts and dimensions to optimize a warehouse for query and analysis rather than transaction processing.do not sell
A right whose definition of 'sell' the company has worked harder to interpret than to honor.
A consumer right to opt out of the sale or sharing of their personal data, with a clear mechanism to exercise it.documentation debt
The distance between the wiki and the warehouse, compounding every sprint and paid by whoever onboards next.
The accumulating gap between how a system actually behaves and what its documentation describes.domain-oriented ownership
A diagram in which every box owns the data and every arrow owns the blame.
The principle that the business domain producing data owns its modeling, quality, and lifecycle rather than a central team.dotted line
The line on the chart that carries all of the responsibility and none of the headcount. It is dotted because solid would imply someone could be held to it.
A secondary reporting relationship: influence or coordination without direct authority over a person or function.drill-down
The feature requested in every demo and used the day after the number is already wrong.
Navigation from an aggregate figure into its underlying detail to locate the source of a change.drill-through
The path from the green number to the spreadsheet that disagrees with it.
A link from a summary visual to a detailed report filtered to the selected context.ediscovery
The annual event at which the org learns, under oath, how many places it kept the same message.
The identification, collection, and production of electronically stored information for legal proceedings.elt
ETL, but the part nobody documented now runs where the compute bill can see it.
Extract, load, transform: raw data lands in the warehouse first and is reshaped in place, deferring transformation to the destination.embeddings
Coordinates that faithfully encode whatever bias the source corpus never declared.
Numeric vector representations of text or other data, positioned so that similar items sit near each other.enablement
The function that hands teams a self-service portal to a decision that still requires three approvals they are not on.
The work of equipping teams to do something for themselves: training, documentation, templates, and support.encryption at rest
The control most cited in the questionnaire and least relevant to the breach, which came in through a login.
The encoding of stored data so that it is unreadable without a key, protecting it if the storage is compromised.entity resolution
Deciding whether these four customers are one customer, a question the org would rather bill four times than answer once.
The process of determining which records across systems refer to the same real-world entity and linking or merging them.error budget
A budget you only discover you had when someone explains it was already spent.
The maximum tolerable amount of quality failure allowed within a period before remediation must take priority over new work.escalation
Sending a decision upward in search of an owner, where it is acknowledged, admired for its complexity, and sent back down for alignment.
The act of raising an unresolved issue to a level with the authority to decide it.escalation path
A documented route to the person who will say it isn't their decision either.
The defined sequence of authorities a governance issue follows until it reaches someone empowered to resolve it.etl
Three verbs, four teams, and no agreement on which one owns the part that broke.
Extract, transform, load: a pattern that moves data from source systems, reshapes it, then writes it to a destination warehouse.eval harness
A scoreboard everyone trusts and no one has read the rules of.
The reusable apparatus that runs a model against fixed test cases and records scored outputs.eval set
The exam the model is allowed to study before every retake.
A fixed, labeled collection of cases used to measure a model's quality consistently across versions.event-driven architecture
Decoupling, sold as a feature, until someone asks which service owns the order.
A design in which components communicate by producing and reacting to events, decoupling producers from consumers in time and identity.exactly-once
A guarantee printed on the slide and approximated in the system.
A delivery guarantee that each event is processed one time and one time only, with no loss and no duplication, typically requiring coordinated state and deduplication.executive buy-in
Term struck pending a definition of the thing bought into, the price paid, and the moment the buyer would be asked to refund it.
Term struck pending a definition of the thing bought into, the price paid, and the moment the buyer would be asked to refund it.executive dashboard
The one dashboard that must never be red while the report is open.
A summary view of high-level indicators built for senior leadership to track performance at a glance.executive sponsor
The name on the kickoff slide, available again at the one-year retrospective.
The senior leader who funds a governance program, owns its mandate, and is accountable for its outcomes at the executive level.executive summary
The only slide that gets read, written by the person who least wanted to read the others.
A short framing of an analysis that states the finding and recommendation before the detail.explainability
A chart explaining a decision no one had defined well enough to question.
The degree to which a model's outputs can be attributed to understandable factors a human can audit.fact table
The one table everyone trusts, named for a property no one has verified.
A table holding the measurements of a business process at a declared grain, with foreign keys to its dimensions.fail fast
A slogan repeated until the first failure, after which it is replaced by the question of who approved the experiment.
A practice of testing ideas in small, cheap experiments so the wrong ones are abandoned early and the learning is kept.false positive rate
The reason the real alert was muted along with the eighty before it.
The proportion of quality alerts that flag a problem where none exists, eroding trust in the detection system that raised them.feature leakage
The model knew the answer because the answer was in the question.
When information unavailable at prediction time slips into training, inflating measured accuracy.feature store
A shared shelf where four teams store four columns named 'revenue'.
A managed repository of curated model inputs with consistent definitions shared across training and serving.federated governance
Central authority for writing the policy, local authority for ignoring it.
A model in which global standards and policies are set centrally while domains apply and enforce them locally, balancing autonomy and consistency.federation
The word for centralization once the central team admits it cannot make anyone do anything.
An arrangement in which independent systems or teams retain local control while agreeing on shared standards for interoperation.fine-tuning
Teaching the model the house style before deciding the house had one.
Further training a pretrained model on task-specific examples to specialize its behavior.fitness for use
The clause that lets data fail every dimension and still pass, because no one wrote down the use.
The principle that quality is defined relative to a specific consumer and purpose, not as an absolute property of the data.foundation model
A general-purpose engine bought to solve a problem the org had not yet agreed to define.
A large model pretrained on broad data that serves as a base for many downstream tasks.freshness
How recently the pipeline ran, which is not the same as how recently it was right.
A measure of how recently a dataset was last updated relative to the cadence its consumers expect.freshness check
A test that confirms the data is old, then emails someone who has muted the channel.
An automated test that flags a table as stale when its most recent record is older than a defined threshold.freshness sla
A promise about data age that is measured precisely, breached quietly, and reported only in the incident review.
A committed maximum age for data in a table, beyond which it is considered stale and consumers are notified.full audit coverage
Struck: there is no covering a data inventory, audit trail, and lineage that were never drawn.
Struck: there is no covering a data inventory, audit trail, and lineage that were never drawn.glossary owner
A role defined in the glossary it was supposed to be assigned in.
The named party accountable for the completeness, accuracy, and approval workflow of business glossary entries.gold layer
The tier the dashboard reads from, which is why it is always green.
The business-ready, aggregated tier of a medallion design, modeled for direct consumption by reports and applications.golden record
The one true version of the customer, assembled from five wrong ones by a rule no one in the room can recite.
The single best, reconciled version of an entity assembled from multiple sources according to defined survivorship rules.governance by committee
A method of distributing a decision until no single person can be found to have made it.
An approach that routes governance decisions through group consensus rather than assigning them to accountable individuals.governance council meeting
The governance council met. The definition did not. Minutes were taken; a decision was scheduled for the meeting that takes minutes about this one.
A recurring forum where data leaders ratify standards, settle disputes over definitions, and assign ownership.governance office
The function accountable for everything and authorized for the calendar invites.
The standing function that coordinates policy, stewardship, issue management, and reporting across a governance program.governance scorecard
A dashboard reporting every metric except whether anything was decided.
A periodic summary of governance metrics, such as stewardship coverage, issue aging, and policy conformance.grain
The question 'what does one row mean?', asked far too late and answered by whoever wrote the query.
The precise level of detail represented by a single row in a fact table; the first and most consequential modeling decision.green dashboard
Green because nobody had agreed what red would have required someone to do.
A quality monitoring display showing all indicators within tolerance, signaling that no metric currently requires intervention.greenfield architecture
A clean design, defined by the absence of anyone using it yet.
A design built from scratch with no constraints from existing systems, free to adopt preferred patterns and technologies.ground truth
One analyst's spreadsheet, promoted to truth because no one else volunteered.
The reference labels treated as correct, against which a model's outputs are measured.grounding
Asking the model to cite a source, which works only if the organization keeps one.
Constraining a model's output to retrieved, citable evidence rather than its parametric memory alone.guardrails
Rules that enforce a policy, written for the policy the org has not yet enforced anywhere else.
Programmatic constraints that filter or block model inputs and outputs against defined safety and policy rules.hallucination
The model answering a question the organization had also never answered, only faster.
A model output that is fluent and confident but not supported by its inputs or any verifiable source.handoff
The moment a responsibility becomes a diagram of an arrow, with no one standing at either end of it.
The transfer of responsibility for a task or system from one team or person to another.headless bi
Business intelligence with the part that asked the business removed.
An architecture that exposes governed metric definitions through an API, decoupling the metrics layer from any single visualization tool.hierarchy
The official rollup of the org, redrawn each reorg, never matching the one Finance is actually using.
An ordered parent-child structure within a dimension — such as date or organization — that supports rollup and drill-down across levels.hippo
The variable the dashboard was quietly fitted to predict.
Highest Paid Person's Opinion: the decision input that overrides the analysis when the two disagree.human in the loop
The named human who will be blamed for the output they were given two seconds to approve.
A design in which a person reviews or approves model outputs before they take effect.idempotency
The property everyone assumed the job had until the day it ran twice.
The property that running a pipeline operation multiple times produces the same result as running it once, so retries and replays do not duplicate or corrupt data.idempotent load
A load that is safe to rerun any number of times, which is why it has now been rerun any number of times.
A load designed so that running it multiple times produces the same result as running it once.incident postmortem
A document that names the cause precisely, assigns the corrective action vaguely, and recurs on schedule.
A structured review after a quality incident that documents the timeline, impact, root cause, and corrective actions to prevent recurrence.incremental load
Processing only what changed, defined by a watermark everyone trusts and no one can explain.
A load strategy that processes only new or changed records since the last run, rather than reloading the full dataset.inference
The part priced per call, unlike the definitions it relies on, which are free and missing.
The act of running a trained model on new inputs to produce predictions in production.ingestion
Accepting whatever a source sends, on the working theory that the schema it sent last week is the schema it will send today.
The process of bringing data from external or upstream systems into a data platform for storage and processing.ingestion layer
The door, drawn as a layer to justify a team.
The tier responsible for acquiring data from sources and landing it into the platform, batch or streaming, with basic validation.insight backlog
The pile of insights that were actionable until everyone agreed they would not act.
The queue of analyses that have been produced but not yet acted on by a decision owner.insights dashboard
A dashboard renamed 'insights' the year charts stopped impressing the board.
A dashboard positioned to surface findings and recommendations rather than raw measures.inter-annotator agreement
The metric that quietly reports how undefined the term was all along.
A measure of how consistently independent labelers assign the same label to the same example.interim owner
The permanent owner, pending the search that was never opened.
A person designated to hold accountability for an asset or process until a permanent owner is identified.interoperability
Term struck pending the shared definitions, contracts, and identifiers two systems must agree on before they can be said to operate together at all.
Term struck pending the shared definitions, contracts, and identifiers two systems must agree on before they can be said to operate together at all.issue log
A spreadsheet sorted by date opened, scrolled to the bottom only during audits.
The authoritative record of open and closed data issues, each with an owner, severity, and target resolution date.issue management
Moving a problem from a person's worry to a system's backlog, where it is now safe.
The disciplined intake, triage, ownership, and resolution of data problems against tracked service levels.it runs on dave's laptop
The architecture diagram for a critical daily process, redrawn every time Dave takes a vacation.
An informal note that a production-relevant process executes on an individual workstation rather than managed infrastructure, with no redundancy or oversight.junk dimension
The drawer where the model keeps every yes/no flag a stakeholder swore was critical and never queried.
A dimension that consolidates assorted low-cardinality flags and indicators into one table to keep the fact table narrow.keep everything for ai
Data minimization's opposite, sponsored at the executive level, filed under strategy rather than under risk.
An informal retention posture in which deletion is deferred indefinitely on the theory that any data may later have model value.key performance indicator
A number promoted to 'key' the quarter it finally looked good.
A quantified measure tied to an objective, used to judge whether performance is on track.keys and identity
The unanswered question of whether two rows are the same person, deferred until a regulator asks it for you.
The discipline of deciding what uniquely distinguishes one record from another, and which attribute the model trusts to do so over time.knowledge graph
A graph of how every entity relates to every other, built atop a 'customer' table no two systems define the same way.
A graph-structured model of entities and their relationships, typically grounded in an ontology, that links facts across a domain for query and inference.label drift
Definition drift, now with a confidence interval.
Change over time in how a target category is defined or applied, degrading models trained on the older meaning.lagging indicator
The number that tells you what happened, reported the week after you could have changed it.
A measure that confirms an outcome after it has occurred, useful for accountability but not steering.lakehouse
A lake that read the warehouse's performance review and panicked.
An architecture that adds warehouse-style management — transactions, schema, governance — over low-cost lake storage, aiming to serve both analytics and machine learning from one tier.land and expand
The pricing model in which the second invoice is the strategy.
A sales motion that begins with a small, low-friction deployment and grows usage and spend over time.late-arriving data
Data that shows up after the meeting that needed it, with a perfectly good explanation no one waited to hear.
Records that reach the pipeline after the time window they belong to has already been processed, requiring late updates or reprocessing.lawful basis
The ground selected after processing begins, by choosing whichever one the activity already happens to fit.
The legal ground that legitimizes a given processing activity, selected before processing begins.leading indicator
A metric called 'leading' until the outcome it predicted failed to follow.
A measure that tends to change before an outcome, used to anticipate rather than confirm.least privilege
A principle interpreted as 'the least we can grant without anyone filing a ticket,' which trends upward.
The principle that each identity receives only the access strictly required to perform its function.legal hold
The one instruction that the retention schedule, otherwise inert, has always obeyed instantly.
A directive suspending normal deletion of records relevant to anticipated or active litigation.lineage
A map of where the number came from, consulted exclusively after it is already wrong.
A traced record of how data flows through transformations from source to destination, showing each column's upstream dependencies.llm-as-judge
Outsourcing whether the answer is good to a second system that also does not know.
Using one language model to score another model's outputs against a rubric, in place of human grading.logical data model
The clean diagram approved at kickoff, preserved unchanged precisely because the database stopped resembling it in week two.
A technology-independent representation of entities, attributes, and relationships that captures business meaning before physical implementation.master data
The handful of nouns the entire company depends on and no single function will admit it owns.
The shared core entities of a business — customers, products, suppliers, accounts — maintained for consistent use across processes and systems.master data management
A program funded to decide who owns 'customer', staffed by everyone who is certain it is not them.
The discipline and tooling that govern master data — its definitions, stewardship, survivorship rules, and distribution — to keep core entities consistent.maturity assessment
A survey that asks a team to grade itself and is then surprised by the grade.
A structured evaluation of a program against a maturity model, producing a current-state score and improvement targets.maturity model
A five-level scale on which every organization assesses itself at a confident level three.
A staged framework rating a capability's development from ad hoc to optimized, used to plan and benchmark improvement.medallion architecture
A medal ceremony for data that has not yet competed.
A layered design that refines data through successive bronze, silver, and gold zones of increasing cleanliness and business readiness.metadata
Everything we know about the data except whether anyone agreed what it means.
Data that describes other data — its meaning, structure, origin, ownership, and quality — enabling discovery, governance, and trust.metadata layer
Where the ownership field is required and the ownership is not.
The tier that captures and serves information about data — schemas, lineage, ownership, definitions — to people and systems that need context.metadata management
Writing down what the data is so future employees do not have to summon the one person who remembers.
The discipline of maintaining descriptive, technical, operational, and business context about data assets.metric certification
A badge that certifies who to ask, not whether the number is right.
A formal review that endorses a metric's definition and computation as the approved version.metric definition
The document everyone cites and no one has read since the analyst who wrote it left.
The exact logic, filters, and grain that determine how a measure is computed and what it excludes.metric layer
A single place to disagree about 'active user', now with version control.
A centralized service that defines metrics once and serves consistent values to every downstream report and tool.metric sprawl
Nine definitions of 'churn', each defended by the team that pays for the tool that computes it.
The accumulation of redundant or near-duplicate metrics that lack a single agreed definition.metric tree
A diagram proving the north star depends on twelve numbers no one owns.
A hierarchy that decomposes a top-line metric into the inputs that drive it.metrics store
A store for the definitions everyone agreed to centralize and continued to override locally.
A centralized layer that defines, computes, and serves business metrics consistently across tools from a single set of definitions.minimum viable governance
The minimum, identified correctly, then funded at half.
The smallest set of roles, policies, and decision rights needed to manage data risk without stalling delivery.mlops
The operating discipline a model needs, requested only after the model is already in production.
The discipline of versioning, deploying, monitoring, and retraining models with reproducible, owned pipelines.model card
The honest paragraph about limitations that the deployment slide forgot to copy across.
A short document stating a model's intended use, training data, evaluation, and known limitations.model collapse
The org photocopied the photocopy until 'customer' meant nothing at all.
Progressive degradation when models are trained largely on the outputs of earlier models rather than original data.model drift
The model did not change. The world did, and no one was watching the gap.
Degradation of a deployed model's accuracy as live data diverges from the data it was trained on.model evaluation
The benchmark the model passes, chosen after seeing which benchmark it passes.
The structured measurement of a model's accuracy, calibration, and failure modes against a held-out reference.model monitoring
A panel that shows the model is healthy by reporting every signal except whether it is still right.
Ongoing measurement of a deployed model's inputs, outputs, and accuracy against expected behavior.model registry
A list of every model in production except the three a team is quietly running from a notebook.
A catalog of model versions with their lineage, metrics, approvals, and deployment status.modern data stack
A stack whose defining feature is the year you bought it.
A loosely coupled set of cloud-native, mostly managed tools — ingestion, warehouse, transformation, BI — assembled around a central store.natural key
A key the business swore was unique and permanent, right up until the merger, the typo, and the reissued account number.
An identifier drawn from a record's real-world business attributes, as opposed to a system-generated surrogate.normalization
Storing each fact exactly once so that no one can find it, then joining nine tables to prove it was there all along.
The process of organizing tables to reduce redundancy and dependency, so each fact is stored once and updated in one place.north star
A fixed point chosen for navigation and consulted by no one steering. It is admired at the offsite and unreachable from the desk, which is rather the point of a star.
A single guiding metric or aim meant to keep distributed teams pointed at the same long-term outcome.north-star architecture
A destination chosen for its property of never being arrived at.
An aspirational end-state design used to align investment and sequencing toward a shared long-term target.north-star metric
The metric everyone steers by until the next reorg picks a different star.
A single metric chosen to align an organization on the value it delivers to customers.offsite
The annual event where a year of deferred decisions is recast as a fresh set of priorities and deferred again, now with a venue.
A multi-day gathering held away from the workplace to set strategy, resolve tension, and produce decisions hard to reach in the daily flow.okr
Last quarter's aspirations, reformatted as this quarter's data model.
Objectives and Key Results: a goal-setting frame pairing a qualitative aim with measurable outcomes.okr-as-data
Automating the status update so the goal can drift in real time without anyone typing it.
The practice of wiring OKR key results directly to live metrics so progress updates automatically.okrs
A quarterly ritual that converts work into numbers, then converts the numbers back into the work that was always going to happen anyway. Graded green by the same team that set them.
Objectives and Key Results: a method for setting an ambitious aim and the few measurable outcomes that would prove it reached.on-call rotation
The one place the org reliably assigns pipeline ownership: at 3 a.m., to whoever is holding the pager.
A schedule assigning responsibility for responding to pipeline incidents to a named person during a given window.ontology
The most rigorous way to model a business while declining to define a single word the business uses today.
A formal specification of concepts, their properties, and the relationships among them within a domain, expressive enough to support inference.operating model
Called 'federated' when no one wanted to fund 'centralized.'
The chosen arrangement of governance authority across the organization, typically centralized, federated, or hybrid.operating rhythm
A schedule of recurring meetings mistaken for the work they were meant to coordinate. The rhythm is reliable; the operating is pending.
The regular cadence of reviews and decisions by which an organization stays coordinated week to week.orchestration
The discipline of deciding which job runs first, now practiced by three tools that each believe they are in charge.
The coordinated scheduling and sequencing of pipeline tasks, including dependencies, retries, and execution order.orchestrator sprawl
The state of having three tools to schedule jobs and a fourth, undocumented one to schedule the other three.
The accumulation of multiple, overlapping orchestration tools across an organization, each scheduling a subset of jobs with no single source of truth.perennial q3 item
An item with the rare distinction of being permanently next.
A planned deliverable that is repeatedly deferred to a future quarter without ever being cancelled.personally identifiable information
The fields everyone agreed were sensitive, in the one table nobody scanned.
Data that, alone or combined with other data, can be linked to a specific individual.physical data model
What was actually built, which the logical model has agreed never to look at directly.
The implementation-specific model defining tables, columns, types, keys, indexes, and constraints as deployed in a given platform.pipeline monitoring
A wall of green panels confirming the job ran, and silent on whether it was right.
The instrumentation that tracks pipeline runs, durations, failures, and data volumes, surfacing problems before consumers report them.pipeline ownership
A field in the catalog set to the name of someone who left in March.
Named accountability for a pipeline's correctness, uptime, and downstream contracts, including who responds when it fails.platform engineering
Building the self-service so thoroughly that no one notices service was never assigned.
The discipline of building and operating internal self-service platforms that abstract infrastructure so product teams can ship faster.platform migration
Rebuilding every pipeline on the new platform, then keeping the old one running for the four jobs nobody can find.
The project of moving pipelines from one orchestration or processing platform to another, typically motivated by cost, scale, or vendor consolidation.poc purgatory
Where a project is funded forever for almost working.
The state in which a proof of concept neither fails nor reaches production, persisting indefinitely as a pilot.policy exception
A temporary exemption now entering its fourth year of being temporary.
A documented, time-bound waiver permitting a deviation from policy, granted with a remediation date and an accountable approver.privacy by design
A principle implemented, in practice, as privacy by retrofit, two sprints after the feature shipped.
The practice of building privacy safeguards into systems and processes from inception rather than bolting them on later.privacy notice
A document written to be technically complete and structurally unreadable, which is treated as the same achievement.
A statement informing individuals what data is collected, why, and how it will be used and shared.proactive data quality
Term struck pending a single budgeted owner — until prevention has a name and a calendar, it is firefighting described in the future tense.
Term struck pending a single budgeted owner — until prevention has a name and a calendar, it is firefighting described in the future tense.prompt engineering
Writing the requirements doc you skipped, one sentence at a time, into a text box.
The practice of structuring instructions, context, and examples to steer a model toward reliable outputs.prompt injection
Discovering the assistant trusts the document more than the policy that approved it.
An attack that smuggles adversarial instructions into a model's input so it overrides its intended task.prompt library
The folder everyone copies from and no one was assigned to keep working.
A shared, versioned collection of approved prompts for reuse across teams.protected health information
Data protected by everyone who assumed someone else was protecting it.
Individually identifiable health data held or transmitted by a covered entity, subject to statutory safeguards.purpose creep
The reason the privacy notice had to be written broad enough to survive the next idea.
The gradual reuse of data for ends beyond the purpose for which it was originally collected and consented.purpose limitation
The requirement that the stated purpose be broad enough to cover any future purpose nobody has thought of yet.
The requirement that data collected for one purpose not be reused for an incompatible one without a fresh basis.quality scorecard
A page that averages six things no one acts on into one number no one questions.
A summary view that aggregates quality measures across dimensions and datasets into scores intended to guide attention and investment.quality sla
A promise about data with a number, a period, and no clause describing what happens when it breaks.
A service-level agreement that commits a data producer to specified quality targets — freshness, completeness, accuracy — over a stated period.quality slo
The target you report against when you would rather not name the agreement you report to.
A service-level objective: an internal, measurable quality target an organization holds itself to, typically expressed as a percentage over a window.quality threshold
A number set just below the current measurement, so the check passes on the day it ships.
The boundary value at which a quality measure is deemed acceptable or unacceptable, separating a passing dataset from a failing one.quick win
A dashboard delivered fast to defer the slow question of who acts on it. The win is quick because the hard part was left out of scope.
A small, visible result delivered early to build momentum and confidence for the larger effort behind it.raci matrix
A grid that makes authority explicit by listing four people beside a decision none of them can make alone.
A grid mapping who is Responsible, Accountable, Consulted, and Informed for each task, so authority and answerability are explicit.raci theater
A grid in which every cell is Consulted and no cell is ever Accountable.
The production of a responsibility-assignment matrix as the deliverable, after which the assignments are neither enforced nor revisited.rag status
Amber, the color of a team that has seen the number and would prefer not to be quoted on it.
A red-amber-green indicator summarizing whether a metric or initiative is on track.re-identification
The quiet proof that the word 'anonymized' was doing a great deal of unsupervised work.
The process of matching supposedly anonymous records back to the individuals they describe.real-time dashboard
Term struck pending the definition of the decision its latency serves; until a choice depends on the second, real-time is a refresh rate, not a requirement.
Term struck pending the definition of the decision its latency serves; until a choice depends on the second, real-time is a refresh rate, not a requirement.reconciliation
The monthly ritual of explaining why two numbers differ, after which both are kept.
The process of comparing two or more datasets that should agree and identifying, explaining, and resolving every difference between them.records management
A lifecycle with a thorough creation stage, a generous storage stage, and a disposition stage observed mainly in theory.
The discipline of controlling records through their lifecycle, from creation to authorized disposition.red-teaming theater
A documented attempt to break the model, scoped to end just before it succeeds.
Adversarial testing of an AI system performed and documented mainly to satisfy a review rather than to change the system.reference architecture
A diagram everyone references and no system resembles.
A reusable template capturing recommended structures, patterns, and component choices for a class of systems.reference customer
The one deployment that worked, available for the call, unavailable for the details.
An existing customer a vendor cites as evidence its product succeeds in production.reference data
The list of approved values, maintained in a spreadsheet, owned by an analyst who left in 2019.
Standardized sets of permissible values — country codes, currencies, status codes — used to classify and constrain other data.referential integrity
The guarantee that every reference points somewhere real, enforced everywhere except the warehouse where it was switched off for performance.
The guarantee that every foreign key points to a row that actually exists in the referenced table.refresh cadence
How often the numbers change, set independently of how often anyone looks.
The scheduled interval at which a report's underlying data is reloaded and recomputed.regulatory reporting
The one report whose definition of 'customer' is finally agreed, because the penalty for drift is statutory.
The periodic submission of mandated data to a regulator in a prescribed format and cadence.remediation
Fixing the rows now and scheduling the cause for a quarter that does not arrive.
The corrective work that brings a defective dataset back within its quality requirements and, ideally, addresses the cause that produced the defect.remediation backlog
The list of things everyone agrees are broken, sorted by how long they have been agreed.
The accumulated queue of identified quality defects awaiting correction, prioritized against competing engineering demands.remediation plan
A plan to fix the finding, due the week after the auditor's next visit.
A documented set of corrective actions, owners, and dates intended to close a finding or resolve a data issue.renewal
The only product review with a hard deadline, run by the one team that never opened the tool.
The point at which a customer decides whether to continue paying for a product for another term.reorg
The act of redrawing the boxes to avoid defining what goes in them. The ambiguity is conserved; only its address changes.
A restructuring of reporting lines and responsibilities intended to fix how an organization works.replay
Living the same afternoon again, this time with the join written correctly.
Reprocessing a recorded stream of events from a stored offset to recover from failure or apply corrected logic.report sprawl
Forty dashboards answering one question, each slightly, confidently wrong.
The uncontrolled proliferation of overlapping reports and dashboards across teams and tools.responsible ai
Term struck pending a named owner for each of fairness, accountability, transparency, and review — qualities currently assigned to everyone, which means no one.
Term struck pending a named owner for each of fairness, accountability, transparency, and review — qualities currently assigned to everyone, which means no one.retrieval augmented generation
A method for making the model read the documents the organization has not finished governing.
A pattern that grounds a model's answer by retrieving relevant context from an external corpus before generation.retrieval-augmented generation
A pipeline that retrieves the company's contradictory documents and asks the model to sound certain about all of them.
An architecture that retrieves relevant documents at query time and conditions a model's answer on them.retry storm
A pipeline responding to a small problem with the enthusiasm of a much larger one.
A cascading overload in which many failed tasks retry simultaneously, amplifying load on an already-degraded system and prolonging the outage.reverse etl
Sending the warehouse's answer back to the system that asked, so the two can be wrong in unison, at last.
The practice of moving modeled data from the warehouse back into operational systems such as CRMs and ad platforms to power workflows.rfp
A two-hundred-question instrument for selecting a tool to fix a problem the questions never required anyone to define.
A request for proposal: a structured document inviting suppliers to bid against stated requirements so options can be compared on the merits.right to be forgotten
A right honored in production, in staging, in the warehouse, in the lake, in the backups, and in the seventeen exports nobody mapped.
A data subject's right to have personal data erased where there is no overriding lawful ground to retain it.right-sized retention
Term struck pending the definition of the retention schedule and disposition practice that would make any size defensible.
Term struck pending the definition of the retention schedule and disposition practice that would make any size defensible.roadmap
A map drawn for a vehicle that has not been built, on a road that resets each quarter. Everything important lives just past the visible horizon.
A sequenced plan of what an organization intends to deliver and roughly when, used to set expectations and coordinate work.root cause analysis
The search for the underlying cause, which terminates the moment it reaches a team that can defend itself.
The disciplined investigation that traces a quality failure back from its symptom to the underlying defect in process, system, or definition that produced it.rubber-stamp review
A gate with a hinge and no latch.
An approval step required by process that, in practice, grants approval without substantive scrutiny.scd type 2
The full audit trail of every value the business changed its mind about, kept forever, consulted never.
A slowly-changing-dimension method that preserves history by inserting a new versioned row on each change, with effective-date and current-flag columns.scheduler
The component that decides when work happens, currently outvoted by a cron entry no one will admit to writing.
A component that triggers pipeline runs based on time intervals, calendar rules, or upstream events.schema design
An afternoon of careful structure followed by two years of columns named after the ticket that demanded them.
The deliberate structuring of tables, columns, types, and relationships to represent a domain and serve its access patterns.schema drift
The source renamed a column and told no one, in the same spirit it once promised the contract was stable.
Unannounced changes to a source's structure — added, removed, or retyped fields — that a downstream pipeline was not designed to accommodate.schema evolution
The fossil record of the data model: every layer a column someone added and no one was ever allowed to remove.
The managed change of a schema over time — adding, deprecating, or altering fields — while preserving compatibility for existing consumers.scorecard
A scorecard that keeps the score but never names the player who lost.
A compact tabular view that tracks a fixed set of metrics against their targets over time.self-describing schema
Term struck pending the business glossary, owner, and agreed meanings a schema cannot describe about itself.
Term struck pending the business glossary, owner, and agreed meanings a schema cannot describe about itself.self-service analytics
Unsupported.
A model that lets business users build their own queries and reports without filing a request to a central team.self-service governance
Self-service, until two of the services certify the same number differently.
A model where analysts build freely within shared definitions, certifications, and review enforced by the platform.semantic cache
A faster way to return last week's answer to a definition that changed on Tuesday.
A store of prior model responses keyed by meaning, returned to avoid recomputing similar queries.semantic drift
The slow process by which 'customer' comes to mean six things, none of them on file.
The gradual divergence of a term's working meaning across teams and time, eroding trust in shared data.semantic layer
The place where the company agreed what 'revenue' means, except for the four teams that didn't get the invite.
A modeling tier that maps physical tables to shared business definitions so every tool computes a metric the same way.semantic model
The layer where the business finally agrees what 'revenue' means, defined four times to match four dashboards.
A layer that maps physical tables to business concepts, metrics, and dimensions so consumers query meaning rather than raw schema.semantic search
Search that understands the question better than the org understands the answer.
Retrieval that ranks results by learned meaning similarity rather than exact keyword match.serving layer
The exit, drawn as a layer to balance the door.
The tier that exposes prepared data to consumers — applications, reports, APIs — optimized for fast, concurrent reads.shadow ai
The actual deployment, running one floor below the AI strategy describing it.
Models or assistants adopted by teams outside any inventory, review, or governance process.shadow analytics
The numbers the business actually uses, kept in the one place governance agreed not to look.
Reporting and metrics produced outside the governed platform, typically in spreadsheets, that inform real decisions.shadow copy
The copy that the deletion request, the lineage diagram, and the retention schedule all politely declined to know about.
An unmanaged duplicate of governed data, created outside official systems and absent from the inventory.shadow pipeline
The pipeline the org does not know it depends on, discovered the week the person who ran it stops answering.
An undocumented pipeline built outside the governed platform, often a spreadsheet macro or personal script, that feeds production reporting.single pane of glass
One pane of glass over eleven dashboards — the same view, now load-bearing and impossible to close.
A unified interface that consolidates metrics from many systems into one consistent view.single source of truth
Term struck pending the definitions of ownership, metric, and certification it depends on; without them the source is single but the truth is plural.
Term struck pending the definitions of ownership, metric, and certification it depends on; without them the source is single but the truth is plural.single version of the truth
The truth that everyone agreed to, and then maintained a personal copy of, just in case.
The architectural goal of one authoritative, agreed source for each business fact, eliminating conflicting figures across reports and systems.slowly changing dimension
An attribute the business called permanent in the requirements meeting and changed twice before launch.
A dimension whose attribute values change over time, managed by a defined policy for whether history is overwritten, versioned, or preserved.snowflake schema
What a star schema becomes after each team adds the one extra table that only their report needs.
A star schema whose dimensions are normalized into additional related tables, reducing redundancy at the cost of more joins.stakeholder
A title that grants a seat at the review and immunity from the result. Every stakeholder holds the stake; no stakeholder holds the decision.
A person or group with a legitimate interest in an outcome, whose needs a decision must account for.staleness
The state a dashboard reaches quietly, while still rendering its last good number with full confidence.
The condition of data that has aged past its acceptable freshness window and may no longer reflect the current state of the world.star schema
A clean diagram at the center of the deck, surrounded on all sides by the joins that did not fit on the slide.
A dimensional design in which a central fact table joins directly to denormalized dimension tables, forming a star-shaped query path.steering committee
A body that steers by approving the direction already taken.
A senior body that sets governance direction, approves funding, and removes organizational obstacles for the program.stewardship RACI
Three columns fully staffed; the Accountable column reserved for the hire the org is still recruiting.
A matrix mapping who is Responsible, Accountable, Consulted, and Informed for each governance activity, designed to remove ambiguity from ownership.stewardship coverage
Reported at ninety percent because the unassigned ten percent had no one to report it.
The proportion of critical data elements or domains that have a named, active steward assigned.storage and compute separation
Two invoices where there used to be one.
A design that decouples persistent storage from processing resources so each can scale and be billed independently.strategic priority
An item declared a top priority alongside the other nine top priorities. Priority is a count of one; the plural is the tell.
An aim ranked high enough to attract scarce time, money, and attention ahead of competing work.stream processing
Real-time delivery of a number no downstream meeting is scheduled to act on for another quarter.
Processing data continuously as unbounded events arrive, optimizing for low latency over batch throughput.streaming architecture
Real-time delivery of a number no one has agreed how to define yet.
A design that processes data continuously as it arrives, supporting low-latency analytics and reaction over unbounded streams.subprocessor
A name on page nine of a list nobody re-reads, who in turn keeps a list of their own that you will never see.
A third party engaged by a processor to carry out specific processing on the controller's behalf.surrogate key
A number we invented so we would stop arguing about which real-world thing it points to. The argument moved one table over.
A system-generated identifier with no business meaning, used as a stable primary key independent of changing natural attributes.survivorship
The logic that picks which of three addresses is real, optimized so the survivor is always the one entered most recently and least correctly.
The rules that decide which attribute value wins when merging duplicate records into a single golden record.survivorship rule
The logic that picks which department was right, then routes the complaint to the other one.
A defined policy that decides which value wins when multiple sources disagree about the same attribute of an entity.synergy
The named benefit of a merger, claimed before either side has agreed what customer means in their respective tables.
The combined effect of two efforts exceeding the sum of their separate results.synthetic data
Invented to fill a gap, and shaped exactly like the assumption that left it.
Artificially generated records used to augment or stand in for scarce or sensitive real data.system of record
Term struck pending the definitions of accuracy, ownership, and reconciliation it claims to settle — a record is not authoritative because it was named first.
Term struck pending the definitions of accuracy, ownership, and reconciliation it claims to settle — a record is not authoritative because it was named first.take it offline
The polite retirement of the only question that mattered, to a venue that is never named and a time that never comes.
To move a detailed or contentious topic out of a meeting into a smaller venue with the right people.target-state architecture
A state the organization is permanently transitioning toward, on a slide updated quarterly.
The documented future-state design an organization intends to reach, against which current-state gaps and a transition plan are measured.task dependency
The reason the report is late, traced backward through eleven tasks to a file someone renamed.
A declared relationship requiring one pipeline task to complete successfully before a downstream task begins.taxonomy
A tree of categories the steering committee approved in March and routed around by April.
A hierarchical classification scheme that arranges concepts into broader and narrower categories with defined membership.temporary job
Term struck pending the removal date promised at creation, now three years past and load-bearing in four reports.
Term struck pending the removal date promised at creation, now three years past and load-bearing in four reports.the abstract owner
Everyone agreed it was owned. No one could say by whom. The asset has a guardian the way the weather has a manager.
A placeholder of accountability assigned to a team, function, or committee rather than a person who can be reached on a Tuesday.the business
The party to whom accountability is assigned precisely because it is not a person. The requirement came from the business, was approved by the business, and was owned, in the end, by no one who answers to that name.
The operating side of an organization that uses data to make and sell things, as distinct from the functions that supply and govern that data.third normal form
A standard the schema fully met on the day of the design review and has been drifting away from ever since.
A normalization level in which every non-key attribute depends on the key, the whole key, and nothing but the key.third-party data sharing
A defined legal basis covering the vendor, the vendor's vendor, and the analytics tag nobody put in the contract.
The disclosure of personal data to an external party under a defined legal basis and contract.thought leadership
Term struck pending the discovery of either the thought or the leadership; the field reports only the hyphen.
Term struck pending the discovery of either the thought or the leadership; the field reports only the hyphen.tiger team
The same five people, summoned again, because the org has confused availability with capacity.
A small group pulled from their regular duties to resolve an urgent, high-visibility problem under compressed timelines.timeliness
The window between when the data was right and when anyone looked at it.
The degree to which data is available within the window required for its intended decision or process.token
The smallest thing in the program with a clearly assigned cost and owner.
The sub-word unit a model reads and generates, and the unit by which usage is billed.tokenization
A reversible privacy control whose entire safety rests on a vault that three integrations have read access to 'temporarily.'
Substituting a sensitive value with a non-sensitive token, with the mapping held in a separately controlled vault.tool sprawl
Forty tools the agent may call, eleven that do the same thing, none that anyone will deprecate.
The accumulation of overlapping tools an agent can call, added without a registry of what already exists.town hall
A broadcast styled as a conversation: the questions are pre-submitted, the hard one is read aloud, and the answer is taken offline, where it remains.
An all-hands gathering where leadership shares direction and the wider organization can ask questions.training data
Whatever was easiest to export, now load-bearing.
The labeled or observed examples a model learns from, whose coverage and quality bound everything the model can do.transformation
Where the business logic lives, defined by whoever happened to write the join and unavailable for comment.
The step in a pipeline that applies business logic to reshape, clean, join, or aggregate raw data into a form fit for analysis.tribal knowledge
The real documentation, sharded across the people most likely to resign.
Operational understanding held informally in people's heads rather than in documentation or systems.trusted dataset
A dataset trusted because it was labeled trusted, by the team that did the labeling.
A dataset designated as reliable for decision-making on the basis of documented quality controls, lineage, and ownership.uniqueness
The dimension that is perfect right up to the first merger.
The degree to which each real-world entity is represented by exactly one record, with no unintended duplicates.universal data model
Term struck pending the definition of a single business term it claims to model for every business at once.
Term struck pending the definition of a single business term it claims to model for every business at once.validation
Confirming the data is the right shape, and inferring nothing about whether it is true.
The act of checking data against defined rules at the point of entry, transformation, or load to prevent invalid records from propagating.validity
Proof that a value followed the rules, and no proof at all that the rules described anything true.
The degree to which a value conforms to its defined format, type, range, or domain of allowed values.vanity funnel
A funnel widened at the top until the conversion rate finally cleared the target.
A conversion funnel whose stages are defined to flatter the figure rather than to model the buyer.vanity metric
A number that goes up to make a slide feel better and a quarter look busier.
A figure that reliably trends upward but does not inform any decision or change any behavior.vanity quality metric
The metric that goes up while trust goes down.
A quality measure chosen because it reliably reads well rather than because it reflects fitness for use or drives any decision.vector database
The new system of record for documents no one would govern in the old system of record.
A store optimized to index embeddings and return nearest-neighbor matches for similarity search.vendor demo
A film in which your problem is played by a stand-in with clean data, good lighting, and a definition of customer everyone in the scene shares.
A guided presentation in which a supplier shows its product solving a problem, ideally one the buyer actually has.vocabulary court
The hearing where five teams arrive certain that 'active user' is obvious and leave with six definitions.
An informal review forum where conflicting definitions of a contested term are reconciled into one agreed entry.war room
The room convened at the cost of the ownership that would have prevented the incident. The reality tax, billed in conference-room hours.
A temporary, intensive gathering of the right people to resolve a critical problem in real time.watermark
A single stored number that is the difference between an incremental load and a four-hour incident.
A tracked marker, often a timestamp or sequence value, recording how far a pipeline has processed so the next run knows where to resume.working group
A standing invitation that survives the problem it was formed to solve, named for the one activity it most reliably defers.
A small cross-functional team convened to make progress on a specific problem between formal decisions.write-audit-publish
A pattern that writes, audits, and publishes, in which the audit step is a boolean permanently pinned to true.
A pattern that writes data to a staging area, runs quality checks, and publishes only if those checks pass.zero-copy architecture
Zero copies of the data and the same number of agreements about what it means.
A design that grants access to data in place — through sharing, views, or references — without duplicating it across systems.zero-copy clone
A free instant copy of production, of which there are now forty, each owned by someone who has left.
A storage feature that creates an independent, writable copy of a dataset by sharing underlying data until it is changed.zero-defect data
Term struck pending the definition of fitness-for-use it depends on — there is no defect without a stated requirement to violate.
Term struck pending the definition of fitness-for-use it depends on — there is no defect without a stated requirement to violate.zero-etl
Term struck pending an account of where the transform went, who owns the source schema now, and what happens to the join when it drifts.
Term struck pending an account of where the transform went, who owns the source schema now, and what happens to the join when it drifts.