The number nobody agrees on.
Two reports show different values for the same metric — and both are right. No one ever defined what the metric means, so each team computes its own version and the meeting argues about whose number is real. Why this is a definition problem, not a data problem, and why an AI built on top of it picks a side silently.
Two people pull the same metric and get two different numbers. Sales says there were 1,200 active accounts last month; Finance says 980. Both ran a query. Both queries were correct. Both numbers are defensible. The meeting then spends forty minutes not on the decision the metric was supposed to inform, but on whose number is real — and usually ends with someone promising to 'reconcile offline,' which means the disagreement survives to the next meeting. This is not a data-quality problem in the usual sense. The data is fine. The arithmetic is fine. What is missing is upstream of both: a single, owned definition of what the metric actually means.
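To make the Sales-versus-Finance gap concrete, here is a minimal sketch of how two correct queries can diverge over identical rows. The `Account` fields, the sample rows, and both team rules are hypothetical, chosen only to illustrate the pattern; they are not taken from any real system.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Account:
    account_id: int
    last_login: date
    is_paying: bool
    is_internal: bool


# Hypothetical sample rows; both "teams" read exactly the same data.
ACCOUNTS = [
    Account(1, date(2024, 5, 20), True, False),
    Account(2, date(2024, 5, 3), False, False),  # logged in, but not paying
    Account(3, date(2024, 5, 28), True, True),   # internal test account
    Account(4, date(2024, 3, 11), True, False),  # paying, but no login in May
    Account(5, date(2024, 5, 15), True, False),
]

MONTH_START, MONTH_END = date(2024, 5, 1), date(2024, 5, 31)


def logged_in_this_month(account):
    return MONTH_START <= account.last_login <= MONTH_END


def active_accounts_sales(accounts):
    """Assumed Sales rule: any login during the month counts."""
    return sum(logged_in_this_month(a) for a in accounts)


def active_accounts_finance(accounts):
    """Assumed Finance rule: paying, external, and logged in during the month."""
    return sum(
        a.is_paying and not a.is_internal and logged_in_this_month(a)
        for a in accounts
    )


print(active_accounts_sales(ACCOUNTS))    # 4
print(active_accounts_finance(ACCOUNTS))  # 2
```

Neither function has a bug. They disagree because each one encodes a different, unwritten answer to the question of what "active" means.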
Metric-definition drift is invisible precisely because every individual number is correct. There is no bad row to find, no null to fix, no pipeline that failed. Each report is internally consistent; they simply encode different assumptions about the same word. The gap only appears when two of them are placed side by side, which is exactly the moment a decision is on the line. It almost always takes one of six forms:
1. Same word, different math. 'Revenue' is gross in one report, net of refunds in another, and recognized-on-a-schedule in a third. 'Churn' counts logos in one place and dollars in another. The word is shared; the formula behind it is not, and nothing on the chart says which one you are reading.
2. The definition lives in a person, not the warehouse. One analyst knows the 'real' way to calculate the number: which accounts to include, which edge cases to drop. That knowledge is the definition. When they are on vacation, or they leave, the number quietly drifts, because the rule was never written down anywhere a query could read it.
3. Two teams built the same metric twice. Sales and Finance each wrote their own 'active customer' logic, months apart, neither aware the other existed. Both are in production, both feed dashboards executives trust, and they will never agree because they were never the same calculation.
4. The silent filter. One report excludes test accounts, internal users, or refunded orders; the other does not. The exclusion is real and often correct, but it is undocumented, so two reports of 'the same' metric are quietly counting two different populations.
5. The clock and the calendar disagree. One report bins by UTC, another by local time; one uses fiscal weeks, another calendar months; one attributes an event to when it happened, another to when it was recorded. 'Last month' is a different window depending on who you ask, so the totals never line up.
6. 'Active' means five different things. The adjectives that drive the business (active user, qualified lead, closed deal, at-risk account) each carry a threshold that differs by team, and no canonical definition arbitrates. Everyone says the same word in the meeting and means something different by it.
Notice what none of these are: none is a broken pipeline, a dirty table, or a wrong calculation. They are disagreements about meaning, not correctness. That is why throwing more data engineering at the problem does not fix it: you can have pristine, fully tested, perfectly fresh data and still have two teams confidently reporting different truths, because the fix lives at the definition layer, not the data layer.
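The clock-and-calendar form is the easiest one to reproduce in a few lines. The sketch below bins the same four events by month twice, once in UTC and once in a local zone. The timestamps and the fixed UTC-5 offset are hypothetical stand-ins, not real data.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# Hypothetical fixed UTC-5 offset standing in for one team's local time zone.
LOCAL = timezone(timedelta(hours=-5))

# Hypothetical event timestamps, stored in UTC.
EVENTS_UTC = [
    datetime(2024, 1, 31, 18, 0, tzinfo=timezone.utc),
    datetime(2024, 2, 1, 2, 30, tzinfo=timezone.utc),  # still Jan 31 at UTC-5
    datetime(2024, 2, 1, 4, 59, tzinfo=timezone.utc),  # still Jan 31 at UTC-5
    datetime(2024, 2, 1, 9, 0, tzinfo=timezone.utc),
]


def count_by_month(events, tz):
    """Count events per (year, month) after converting each one to tz."""
    counts = Counter()
    for event in events:
        local_time = event.astimezone(tz)
        counts[(local_time.year, local_time.month)] += 1
    return counts


utc_counts = count_by_month(EVENTS_UTC, timezone.utc)
local_counts = count_by_month(EVENTS_UTC, LOCAL)

print(utc_counts[(2024, 1)], utc_counts[(2024, 2)])      # 1 3
print(local_counts[(2024, 1)], local_counts[(2024, 2)])  # 3 1
```

Both reports see four events in total and both are internally consistent, yet their January figures differ by a factor of three. Only a stated time basis settles which January is the one the business means.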
Why this matters before you build AI
A human analyst, handed two conflicting definitions, at least feels the friction: they hesitate, they ask which number you mean, they footnote the assumption. A model, left to its defaults, does none of that. Ask it for revenue and it will reach for whichever table it was pointed at, apply whatever definition is implicit in that data, and answer in a confident sentence with no asterisk. It cannot tell that 'revenue' is contested, because the contest never made it into the data; it lived in the heads of the people who are now out of the loop. So the model does the worst possible thing with an ambiguous metric: it silently picks one definition and applies it everywhere, at machine scale, with the authority of a single clean answer. Definitional disagreement that used to surface as a visible argument in a meeting becomes invisible, automated, and consistent, and what it is consistent about is a definition no one signed off on.
The fix is a definition, not a dashboard
The cure for metric-definition drift is not another BI tool or another cleanup project; it is naming the metric once. For each number the business actually steers by, write down a single canonical definition (the exact formula, the included and excluded population, the time basis), give it one owner, and put it in one place that the warehouse and every downstream report read from instead of re-implementing it. This is what a semantic or metrics layer is for, but the tooling is secondary; the discipline is what matters. Most organizations have only a handful of metrics that genuinely drive decisions. Defining those few, out loud and in one place, removes most of the cross-report disagreements that burn meetings, and it is the prerequisite, not the afterthought, for any AI you intend to let answer questions about your business.
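As a rough illustration of "one definition, one owner, one home", the sketch below keeps each metric in a single registry that refuses duplicates and refuses to guess. `MetricDefinition`, `MetricRegistry`, the field names, and the example values are all hypothetical; a real semantic or metrics layer will have its own schema and tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetricDefinition:
    name: str
    owner: str        # the one team accountable for this definition
    formula: str      # the exact calculation, written out
    population: str   # what is included and what is excluded
    time_basis: str   # time zone, calendar, and which timestamp is used


class MetricRegistry:
    """Hypothetical single home for canonical metric definitions."""

    def __init__(self):
        self._metrics = {}

    def register(self, metric):
        # A second definition under the same name is exactly the drift to prevent.
        if metric.name in self._metrics:
            existing_owner = self._metrics[metric.name].owner
            raise ValueError(
                f"{metric.name!r} is already defined (owner: {existing_owner})"
            )
        self._metrics[metric.name] = metric

    def get(self, name):
        # Fail loudly instead of letting a caller silently pick its own meaning.
        if name not in self._metrics:
            raise LookupError(f"no canonical definition for {name!r}")
        return self._metrics[name]


REGISTRY = MetricRegistry()
REGISTRY.register(MetricDefinition(
    name="active_accounts",
    owner="finance-analytics",  # hypothetical owning team
    formula="COUNT(DISTINCT account_id)",
    population="paying accounts, excluding internal and test accounts",
    time_basis="calendar month, UTC, by event time",
))

print(REGISTRY.get("active_accounts").owner)  # finance-analytics
```

The useful property is the refusal. A report, or an AI assistant, that asks for a metric with no registered definition gets an error it has to deal with, not a plausible number computed under an assumption nobody approved.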
Data quality asks whether the number is right. This asks whether everyone is even computing the same number. An AI on top of undefined metrics does not resolve the disagreement — it picks a side, silently, and scales it.
So before the next dashboard rebuild or the next AI assistant that answers questions over your data, try the cheap test: ask two different teams for your single most important number, and see whether they match. If they do not, the gap is not in the data — it is in the definition, and that is the thing to fix first. Name your handful of decision-critical metrics, give each one an owner and a single home, and the arguments about whose number is real stop being a standing agenda item.
Would your teams report the same number?
The Data-Quality Scorecard walks the dimensions of data trust — including whether your key metrics have a single owned definition — and shows you, in about five minutes, where two teams would answer the same question with two different numbers. No call required.