Your Jira Data Is Lying to You: A Data Quality Primer for CTOs
Green dashboards hide unreliable data. Explore three dimensions of execution data quality and why confidence metrics matter more than status colours.
Your Jira dashboard is green. Your velocity charts trend upward. Your burn-down looks healthy. And none of it may be telling you the truth.
This is not a criticism of Jira. It is a statement about the fundamental gap between what project management data records and what leadership needs to know. The data in your execution tools reflects what people entered, when they entered it, and how consistently they maintained it. It does not inherently reflect reality.
For operational decisions (sprint planning, capacity allocation, release coordination), this data is usually good enough. But the moment you try to use it for strategic questions (“Are we aligned? Are we on track? Where should we invest?”), you are building decisions on a foundation you have never inspected.
The Green Dashboard Problem
Every experienced technology leader has encountered this pattern: the dashboard says everything is fine, but something feels wrong. Deadlines slip despite healthy velocity. Features ship but do not solve the problem they were meant to address. Teams are busy but progress feels slow.
The instinct is to blame the teams or the process. The more likely culprit is the data itself.
Project management tools are input-dependent systems. They reflect what is entered into them, which is shaped by human behaviour, team culture, and the incentive structures around reporting. None of these are optimised for accuracy. They are optimised for workflow convenience.
A ticket marked “in progress” might have been untouched for two weeks. A story point estimate might reflect political negotiation rather than technical assessment. An epic marked “on track” might have five of its twenty stories completed, with the remaining fifteen quietly growing in complexity.
The dashboard aggregates these inputs and presents them as fact. The colour green does not mean “this is true.” It means “the data, as entered, does not breach the thresholds you configured.”
Three Dimensions of Data Quality
When we assess execution data for strategic alignment purposes, we evaluate three dimensions. Each reveals a different kind of unreliability.
Completeness: What Is Missing?
Completeness measures the proportion of actual work that is represented in the data. In a perfectly complete dataset, every piece of work being performed would have a corresponding, accurately described entry in the system.
In practice, completeness problems are pervasive:
Untracked work. Engineers routinely perform work that never appears in Jira: exploratory investigations, ad-hoc support requests, infrastructure maintenance, and technical debt remediation that is folded into feature work rather than tracked separately. In our experience, 15-30% of actual engineering effort is invisible in project management data.
Partially described work. Tickets exist but lack the information needed for strategic analysis. A ticket titled “Fix auth bug” tells you nothing about which strategic objective it serves, what system it affects, or how significant the work is. When tickets lack descriptions, acceptance criteria, or proper categorisation, they are technically present but analytically useless.
Orphaned work. Tickets that exist outside any epic, initiative, or strategic hierarchy. They represent real effort but have no connection to strategic objectives, making it impossible to assess their alignment. In a typical Jira instance, 20-40% of tickets in active sprints have no clear hierarchical connection to a strategic objective.
Completeness matters because any analysis based on incomplete data will systematically undercount certain types of work. If untracked work is disproportionately operational (which it usually is), your data will overstate the proportion of effort that is strategically aligned.
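The overstatement can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name and the default assumption that none of the untracked effort is strategically aligned are ours, not a measured result. It blends the alignment figure for tracked work with an assumed figure for the invisible share.

```python
def adjusted_alignment(tracked_aligned: float,
                       untracked_share: float,
                       untracked_aligned: float = 0.0) -> float:
    """Estimate alignment across ALL effort, tracked and untracked.

    tracked_aligned:   share of tracked effort that is aligned (0..1)
    untracked_share:   share of total effort missing from the tracker (0..1)
    untracked_aligned: assumed aligned share of the untracked effort (0..1);
                       defaults to 0.0, a deliberately pessimistic assumption
    """
    for name, value in (("tracked_aligned", tracked_aligned),
                        ("untracked_share", untracked_share),
                        ("untracked_aligned", untracked_aligned)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")

    tracked_share = 1.0 - untracked_share
    # Weighted average of the two pools of effort.
    return tracked_aligned * tracked_share + untracked_aligned * untracked_share


if __name__ == "__main__":
    reported = 0.72  # alignment as the dashboard reports it (tracked work only)
    for untracked in (0.15, 0.30):
        estimate = adjusted_alignment(reported, untracked)
        print(f"untracked {untracked:.0%}: alignment is closer to {estimate:.1%}")
```

Under that pessimistic assumption, a reported 72% drops to roughly 61% when 15% of effort is untracked and to roughly 50% when 30% is untracked. If some untracked work is in fact aligned, pass a non-zero `untracked_aligned` to soften the adjustment.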
Freshness: When Was This Last True?
Freshness measures how recently the data was updated relative to the reality it represents. A ticket that was accurate when created but has not been updated in three weeks is stale data, regardless of what the fields say.
Freshness problems manifest in several ways:
Status lag. Tickets remain in states that no longer reflect reality. “In progress” tickets that are actually blocked. “In review” tickets that were merged days ago. “To do” tickets that someone started working on but did not move. The longer the lag between reality and data, the less reliable any analysis built on status fields.
Estimate decay. Story point estimates or time estimates are typically set at creation and rarely updated as understanding evolves. A ticket estimated at 3 points that has consumed two weeks of effort is still showing 3 points in every report. Velocity calculations based on stale estimates produce misleading throughput metrics.
Priority drift. Priorities set during planning may no longer reflect actual organisational priorities, but the data still shows the original values. When priority fields are not maintained, any analysis that filters or weights by priority is working with outdated assumptions.
The practical effect of freshness problems is that your data describes the past, not the present. Decisions made on stale data are decisions made on how things were, not how things are.
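Status lag is the easiest of these to quantify. The following is a minimal sketch, assuming tickets have already been exported as dictionaries with hypothetical `key`, `status`, and `updated` fields (this is not a Jira API client), and using a 14-day cutoff as an illustrative threshold.

```python
from datetime import datetime, timedelta


def stale_share(tickets, now, max_age=timedelta(days=14)):
    """Return the fraction of in-progress tickets not updated within max_age."""
    in_progress = [t for t in tickets if t["status"] == "In Progress"]
    if not in_progress:
        return 0.0  # nothing in progress, so nothing can be stale
    stale = [t for t in in_progress if now - t["updated"] > max_age]
    return len(stale) / len(in_progress)


# Hypothetical export; keys and dates are invented for illustration.
now = datetime(2024, 6, 30)
sample_tickets = [
    {"key": "APP-1", "status": "In Progress", "updated": now - timedelta(days=20)},
    {"key": "APP-2", "status": "In Progress", "updated": now - timedelta(days=3)},
    {"key": "APP-3", "status": "In Progress", "updated": now - timedelta(days=15)},
    {"key": "APP-4", "status": "Done",        "updated": now - timedelta(days=40)},
]

if __name__ == "__main__":
    print(f"Stale in-progress tickets: {stale_share(sample_tickets, now):.0%}")
```

Passing `now` explicitly rather than calling `datetime.now()` inside the function keeps the check reproducible, which matters if you want to compare the figure from one week to the next.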
Consistency: Are We Measuring the Same Things?
Consistency measures whether data is structured and categorised in a uniform way across teams, projects, and time periods. Inconsistent data can be complete and fresh but still produce misleading analysis.
Common consistency problems include:
Varying granularity. One team creates fine-grained tickets (one per function), another creates coarse-grained tickets (one per feature). Comparing ticket counts or velocity across these teams is meaningless, but dashboards do it routinely.
Inconsistent categorisation. Custom fields, labels, and components are used differently across teams. “Bug” means different things to different teams. “Technical debt” might be a label, a component, a ticket type, or simply not tracked. Any cross-team analysis that relies on categorisation is comparing unlike quantities.
Schema evolution. The way your Jira instance is configured changes over time. Fields are added, renamed, or repurposed. Workflows are modified. What was tracked as a “task” last year might be tracked as a “story” this year. Longitudinal analysis (tracking trends over time) is compromised when the measurement basis changes underneath it.
Consistency problems are particularly dangerous because they are invisible in single-team views. Each team’s data looks fine internally. The problems only appear when you try to aggregate or compare across teams, which is exactly what strategic analysis requires.
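The granularity problem shows up as soon as ticket counts are placed next to a unit that means the same thing for every team. The sketch below assumes each ticket carries a hypothetical `feature` field (in practice this might be an epic or parent link) and uses invented team names; it reports raw ticket counts alongside distinct features delivered.

```python
from collections import defaultdict


def throughput_by_team(tickets):
    """Count completed tickets and distinct completed features per team."""
    ticket_counts = defaultdict(int)
    features = defaultdict(set)
    for t in tickets:
        ticket_counts[t["team"]] += 1
        features[t["team"]].add(t["feature"])
    return {
        team: {"tickets": ticket_counts[team], "features": len(features[team])}
        for team in ticket_counts
    }


# Hypothetical data: both teams ship four features, but one team splits
# each feature into five tickets and the other uses a single ticket.
completed = (
    [{"team": "coarse-grained", "feature": f"F{i}"} for i in range(4)]
    + [{"team": "fine-grained", "feature": f"G{i}"}
       for i in range(4) for _ in range(5)]
)

if __name__ == "__main__":
    for team, stats in throughput_by_team(completed).items():
        print(f"{team}: {stats['tickets']} tickets, {stats['features']} features")
```

In this invented dataset the raw counts differ by a factor of five while the feature counts are identical. Features are not a perfect unit either, since they vary in size, but the comparison makes the distortion visible.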
Real Decisions Distorted by Bad Data
These are not theoretical concerns. Here are patterns we encounter regularly:
“We are investing heavily in platform reliability.” Analysis reveals that 80% of tickets tagged “reliability” are actually feature work with a reliability label added to secure approval. Actual reliability work is 12% of effort, not the 35% that leadership believes.
“Team A is twice as productive as Team B.” Team A creates one ticket per feature. Team B creates five tickets per feature. Their actual output is comparable, but velocity metrics make Team A look dramatically more productive. Staffing decisions based on this comparison would be wrong.
“We are ahead of schedule on the migration.” Status fields show 60% of migration tickets as “done.” Closer inspection reveals that the completed tickets represent the easy 60%: well-understood, low-risk items. The remaining 40% includes the complex integration work that carries most of the project risk. Progress is not 60%. It is “the easy part is finished.”
Each of these situations involves data that is technically accurate. The tickets exist. The fields are populated. The statuses are set. But the conclusions drawn from the data are wrong because the data quality was never assessed.
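The migration pattern is easy to reproduce with numbers. This sketch compares progress measured by ticket count with progress weighted by effort; the `effort` values are hypothetical weights chosen for illustration, and any consistent proxy (re-estimated points, risk scores) could stand in for them.

```python
def progress(tickets):
    """Return (share done by ticket count, share done by effort weight)."""
    if not tickets:
        return 0.0, 0.0
    done = [t for t in tickets if t["done"]]
    by_count = len(done) / len(tickets)
    total_effort = sum(t["effort"] for t in tickets)
    if total_effort == 0:
        return by_count, 0.0
    by_effort = sum(t["effort"] for t in done) / total_effort
    return by_count, by_effort


# Hypothetical migration: six easy tickets are done, four hard ones remain.
migration = (
    [{"done": True, "effort": 1} for _ in range(6)]
    + [{"done": False, "effort": 6} for _ in range(4)]
)

if __name__ == "__main__":
    by_count, by_effort = progress(migration)
    print(f"Done by ticket count: {by_count:.0%}")
    print(f"Done by effort:       {by_effort:.0%}")
```

With these invented weights, the same project is 60% done by count and 20% done by effort. The exact figure depends entirely on the weights, which is the point: a count-based percentage silently assumes every ticket is the same size.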
Why Confidence Levels Matter
The alternative to treating execution data as ground truth is to treat it as evidence with varying reliability. This is where confidence levels become essential.
A confidence level is a measure of how much you should trust a particular data point or analysis. It accounts for completeness, freshness, and consistency, and it changes the conversation fundamentally.
Without confidence levels, an alignment report says: “72% of effort is aligned to strategic objectives.” This sounds precise. It invites action. It also might be wrong.
With confidence levels, the same report says: “72% of effort appears aligned to strategic objectives, with moderate confidence (data completeness: 65%, freshness: 80%, consistency: 70%). The true figure likely falls between 58% and 82%.”
This is less satisfying. It is also more honest, and it leads to better decisions. A leader looking at the first report might conclude the situation is acceptable. A leader looking at the second might conclude that the data needs improvement before the situation can be properly assessed, which is the correct conclusion.
Confidence levels also highlight where to invest in data quality. If completeness is your weakest dimension, the priority is capturing untracked work. If freshness is the problem, the priority is workflow hygiene. If consistency is the issue, the priority is cross-team standardisation.
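A confidence-annotated report of this kind can be sketched in a few lines. Everything about the scoring here is an assumption made for illustration: treating overall confidence as the weakest dimension, and the 0.8 and 0.6 cut-offs for the labels, are hypothetical choices rather than an established method. The mapping from weakest dimension to priority follows the paragraph above.

```python
# Where to invest, keyed by the weakest data quality dimension.
PRIORITY = {
    "completeness": "capture untracked work",
    "freshness": "improve workflow hygiene",
    "consistency": "standardise across teams",
}


def confidence_summary(aligned, scores):
    """Attach a confidence label and an improvement priority to a headline figure.

    aligned: headline alignment figure (0..1)
    scores:  dict with 'completeness', 'freshness', 'consistency' (each 0..1)
    """
    weakest = min(scores, key=scores.get)
    overall = scores[weakest]  # assumption: confidence is capped by the weakest dimension
    if overall >= 0.8:         # hypothetical thresholds
        label = "high"
    elif overall >= 0.6:
        label = "moderate"
    else:
        label = "low"
    return {
        "aligned": aligned,
        "confidence": label,
        "weakest": weakest,
        "priority": PRIORITY[weakest],
    }


if __name__ == "__main__":
    summary = confidence_summary(
        0.72, {"completeness": 0.65, "freshness": 0.80, "consistency": 0.70}
    )
    print(f"{summary['aligned']:.0%} of effort appears aligned "
          f"({summary['confidence']} confidence). "
          f"Weakest dimension: {summary['weakest']}; "
          f"priority: {summary['priority']}.")
```

Using the minimum rather than an average is a conservative design choice: one badly degraded dimension is enough to undermine an analysis, and an average would hide it.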
How to Assess Your Own Data
You do not need a tool to start evaluating your execution data quality. Here is a practical assessment you can run this week:
For completeness: Pick three engineers at random. Ask them to list everything they worked on last week. Compare their answers with the tickets assigned to them. The gap between what they did and what Jira shows is your completeness problem.
For freshness: Pull all tickets currently in “in progress” status. Check the last update date on each. Any ticket that has been “in progress” for more than two weeks without an update is stale. Calculate the percentage. That is your freshness problem.
For consistency: Compare how two different teams track the same type of work (say, bug fixes). Look at ticket structure, field usage, estimation approach, and workflow states. The differences you find are your consistency problem.
These are rough measures, not rigorous audits. But they will tell you whether your execution data is reliable enough to support strategic decisions, and in most cases the answer will be illuminating.
The point is not that Jira data is useless. It is that Jira data is unverified, and unverified data should not be the foundation for strategic decisions. The first step toward evidence-based leadership is knowing how much to trust your evidence.
BAS TrustDesk includes automated data quality assessment across your execution tools, providing confidence-weighted alignment measurement. If you want to know how reliable your data actually is, start a conversation.