Data Quality

Your team keeps a spreadsheet next to the system.

The official system is the system of record. The shadow spreadsheet beside it is the system of truth. Every one of those workbooks marks a place your data isn’t trusted — and an AI project built on the same data inherits the gap.

Zak Data Solutions · June 18, 2026

The most honest data audit in most companies is not in the warehouse or the BI tool. It is in the spreadsheets people keep open beside the official system — the workbook the finance team actually closes the month in, the tracker operations really runs on, the tab a manager rebuilds every Monday because the report cannot be trusted as-is. The official system is the system of record. The spreadsheet is the system of truth. When those two are different files, that gap is the most useful thing you can know about your data.

There is a one-question test for it. Go to the people who do the work and ask: what do you keep in a spreadsheet that the system is supposed to handle for you? You will not get blank looks. You will get a list, and every item on it marks a place the official system is not trusted. A side spreadsheet is never laziness. It is a workaround someone built on purpose, and it is telling you exactly one of six things:

  1. The system's numbers are wrong often enough to rebuild by hand. Someone reconciles the report against reality every cycle because the report has been wrong before and they got burned. The spreadsheet is a trust gap with a column header.
  2. The system cannot answer the question people actually ask. The data is in there, but not in a shape anyone can use, so they export it and reassemble it. The model of the business in the system does not match the model in people's heads.
  3. Getting data out is harder than re-keying it. When the official path to a number runs through three screens and a ticket, a spreadsheet someone types by hand wins. That is an access gap, and it quietly doubles the work and the error surface.
  4. Nobody trusts the system's version of history. The spreadsheet is the real audit trail, the place where what changed, when, and why is actually legible. If the system cannot reconstruct last quarter's number, people keep their own record of it.
  5. The process changed and the system did not. The business moved; the schema did not. The spreadsheet absorbs the new step, the new status, the new exception the system has no field for. It is a backlog of changes the system never caught up to.
  6. The spreadsheet is faster to fix than the official path. When correcting a record means a change request and a release, and fixing the spreadsheet means typing, the spreadsheet wins every time. That is a governance gap: the sanctioned path is slower than the unsanctioned one.

None of those six are a people problem, and none of them get fixed by banning spreadsheets. They are a map. Each shadow workbook is a free, precise specification of something the official system should do and does not — written by the person who felt the pain sharply enough to build the workaround.

Why this matters before you build AI

Here is the part that turns a quirk into a risk: you cannot build reliable automation or AI on data the people closest to it quietly route around. If the team keeps a parallel spreadsheet, then a model trained on — or fed by — the official data inherits the exact same gap that drove them to the spreadsheet, but without the human who knew to keep one. The reconciliation, the missing field, the correction that only ever happened in the workbook: the model never sees it. It learns the version of the business the team already decided not to trust.

So the side-spreadsheet test doubles as an AI-readiness test. Wherever a shadow spreadsheet exists, an AI initiative built on the upstream data will be confidently wrong in the same place — and, unlike the analyst with the workbook, it will not know to flag it.
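
One way to make that gap visible before any model sees the data is to reconcile the official export against the workbook. The sketch below is a minimal illustration, not a prescribed method. The `reconcile` function, the `Discrepancy` record, the invoice IDs, the amounts, and the tolerance are all hypothetical, and it assumes both sources can be reduced to one numeric value per record ID.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Discrepancy:
    key: str
    kind: str  # "mismatch", "missing_in_system", or "missing_in_workbook"
    system_value: Optional[float]
    workbook_value: Optional[float]


def reconcile(system, workbook, tolerance=0.005):
    """Compare two {record_id: value} mappings and list every disagreement."""
    found = []
    # Walk the union of IDs so records present on only one side are reported too.
    for key in sorted(system.keys() | workbook.keys()):
        s = system.get(key)
        w = workbook.get(key)
        if s is None:
            found.append(Discrepancy(key, "missing_in_system", None, w))
        elif w is None:
            found.append(Discrepancy(key, "missing_in_workbook", s, None))
        elif abs(s - w) > tolerance:
            found.append(Discrepancy(key, "mismatch", s, w))
    return found


# Hypothetical sample data: invoice totals keyed by invoice ID.
system_totals = {"INV-001": 1200.00, "INV-002": 450.00, "INV-003": 980.00}
workbook_totals = {"INV-001": 1200.00, "INV-002": 475.00, "INV-004": 310.00}

discrepancies = reconcile(system_totals, workbook_totals)
for d in discrepancies:
    print(d.key, d.kind, d.system_value, d.workbook_value)
```

Every row this reports is a place where data fed to a model would disagree with what the team actually works from. A check like this only surfaces the disagreement. Deciding which side is right still takes the person who keeps the workbook.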

What to do about it

Do not ban the spreadsheets; read them. Inventory them, starting with the ones the most people depend on, because that is where distrust is costing the most. For each, name which of the six gaps it represents, then close that gap in the system: fix the number that keeps getting rebuilt, add the field the process grew, make the export a query, give corrections a path faster than retyping. Migrate the spreadsheet's logic back into the system, earn the trust that the workaround replaced, and the spreadsheet retires itself. You do not have to decommission shadow workbooks. A system that deserves the data gets it back.
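
The inventory itself can start as a very small data structure. The following is a rough sketch under stated assumptions, not a recommended tool. The `Gap` labels paraphrase the six gaps listed earlier, and the `Workbook` fields, the `prioritize` and `gap_counts` helpers, and the sample entries are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Gap(Enum):
    # Labels paraphrase the six gaps; the names are illustrative only.
    WRONG_NUMBERS = "numbers rebuilt by hand"
    WRONG_SHAPE = "system cannot answer the question asked"
    ACCESS = "getting data out is harder than re-keying"
    HISTORY = "system's version of history is not trusted"
    PROCESS_DRIFT = "process changed, system did not"
    GOVERNANCE = "spreadsheet is faster to fix"


@dataclass(frozen=True)
class Workbook:
    name: str
    owner_team: str
    dependents: int  # number of people who rely on this workbook
    gap: Gap


def prioritize(inventory):
    """Order workbooks so the ones most people depend on come first."""
    return sorted(inventory, key=lambda w: w.dependents, reverse=True)


def gap_counts(inventory):
    """Count how many workbooks point at each kind of gap."""
    return Counter(w.gap for w in inventory)


# Hypothetical inventory gathered by asking the one-question test.
inventory = [
    Workbook("Month-end close", "Finance", 12, Gap.WRONG_NUMBERS),
    Workbook("Ops tracker", "Operations", 30, Gap.PROCESS_DRIFT),
    Workbook("Monday report rebuild", "Sales", 4, Gap.WRONG_NUMBERS),
]

ranked = prioritize(inventory)
counts = gap_counts(inventory)
for w in ranked:
    print(f"{w.dependents:>3}  {w.name}: {w.gap.value}")
```

Sorting by dependents puts the costliest distrust at the top of the list, and the per-gap counts hint at whether the fix is mostly about correctness, access, history, or governance. Assigning a single gap per workbook follows the framing above; a real inventory might need to record more than one.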

Every shadow spreadsheet is a system someone built because they did not trust yours.

A system worth trusting earns its way back into the workflow by being right, accessible, and auditable — not by a policy that forbids the workaround. Find the spreadsheets, read what they are telling you, and fix the gaps underneath them. That is also the cheapest data-quality program you will ever run: the work has already been scoped, in someone else's workbook.

Where are your side spreadsheets?

The Data-Quality Scorecard walks the same gaps a shadow spreadsheet exposes (trust, access, history, and governance) and tells you, in about two minutes, where the official system has lost your team's confidence.