Platform Engineering · 11 min read

Signal Integrity in platform engineering — what it means and why DORA alone is not enough

Signal Integrity: metrics that resist gaming and evolve with tooling. Three properties DORA alone cannot provide, and the adversarial review that builds them.

TL;DR. Signal Integrity is the capability to produce measurements that inform decisions accurately, resist gaming, and evolve ahead of the tools they describe. All three properties are required. A deployment-frequency metric that was honest in 2018, before feature flags decoupled deploys from releases, is theater in 2026 unless someone audits the stack on a schedule. DORA alone gives you the numbers, not the integrity.

Ask most engineering organizations how they know their platform is improving, and they will point to a dashboard. Deployment frequency: up. Lead time: down. Change failure rate: stable. The numbers exist. The question worth asking is whether the numbers are measuring what the labels say they are measuring.

Signal Integrity is the second pillar of the Foundations Framework. It is the capability to produce measurements that inform decisions accurately, resist gaming, and evolve ahead of the tools they describe. It is not the same as data quality, observability, or monitoring, though it depends on all three. It is the discipline that makes measurement instruments honest enough to be trusted when engineering leaders make investment decisions based on them.

This post defines Signal Integrity precisely, distinguishes it from adjacent concepts, explains why DORA metrics alone cannot provide it, and describes the adversarial review pattern that makes it operational.

What Signal Integrity actually requires — three properties, all mandatory

Signal Integrity has three required properties. A measurement practice that lacks any one of these properties is not producing signals with integrity, regardless of how sophisticated the instrumentation is.

The first is that decisions are informed accurately. The metric measures what the label claims to measure, and the measurements are consistent enough across time and across engineers that decisions made from the data can be trusted. Two engineers measuring the same system with the same data should produce the same number. If they do not, the measurement protocol is broken and the numbers are artifacts of the instrumentation.

The second is that signals resist gaming. A metric that can be optimized without improving the underlying outcome it claims to measure has negative Signal Integrity. Optimizing it makes the dashboard look better while the system gets worse. Deployment frequency measured per microservice and per automated dependency update commit is gameable in ways that deployment frequency measured per customer-facing capability change is not. Signal Integrity requires choosing metric definitions that are resistant to Goodhart's Law: when a measure becomes a target, it ceases to be a good measure.

The third is that signals evolve ahead of the tools. Metric definitions drift silently as tooling generations change. A deployment frequency measurement that was honest in 2018 (before feature flags decoupled deploys from releases) is theater in 2026. Without a quarterly audit of which definitions new tools have made misleading, dashboards report numbers that no longer measure what their labels claim. Signal Integrity requires that the measurement team audit the stack proactively, not reactively.

All three properties are required. A measurement practice that is accurate and gaming-resistant but does not evolve will degrade silently within twelve months of a major tooling shift. A practice that is accurate and evolves but is gameable will produce correct answers until someone has an incentive to optimize the metric. A practice that resists gaming and evolves but is not accurate never had Signal Integrity to begin with.

How Signal Integrity differs from data quality, observability, and monitoring

Signal Integrity is related to but distinct from three concepts that engineering organizations often conflate with it.

Data quality concerns the completeness, consistency, and accuracy of data at the storage and pipeline layer. High data quality means the data that exists is correct and complete. It does not address whether the metric definitions above the data are honest. An organization can have high data quality and low Signal Integrity simultaneously: the data is accurate, but the metrics built on it are measuring the wrong things.

Observability concerns the capability to understand system state from external outputs. A highly observable system can be interrogated without prior knowledge of its failure modes. Observability is a prerequisite for Signal Integrity but not a substitute for it. Observability answers "can we see what is happening?" Signal Integrity answers "are the things we are measuring the right things to measure, and are we measuring them the same way twice?"

Monitoring concerns the capability to detect known failure states and alert on them. Monitoring is narrower than observability and much narrower than Signal Integrity. Monitoring dashboards can show all green while Signal Integrity is low: all monitored conditions are within bounds, but the conditions being monitored are not the conditions that matter.

The distinction matters for investment decisions. An organization that conflates Signal Integrity with observability will invest in observability tooling and discover that their measurement problems persist. Observability gives them more data. Signal Integrity requires them to ask harder questions about what the data is actually measuring.

Why DORA metrics are a foundation for Signal Integrity but not a substitute for it

The DORA four keys (deployment frequency, lead time for changes, change failure rate, and mean time to restore) are among the most rigorously validated metrics in software engineering research. The research behind Accelerate (Forsgren, Humble, Kim, 2018) established these four as predictors of organizational performance, a relationship observed in survey data from thousands of organizations.

None of that makes them Signal Integrity by themselves. Three failure modes occur when DORA metrics are treated as sufficient rather than as a foundation.

The first is definition inconsistency. The DORA researchers defined these metrics precisely. Most organizations implement them imprecisely. Deployment frequency measured per artifact is different from deployment frequency measured per customer-visible capability change. Lead time measured from code commit is different from lead time measured from ticket creation. Change failure rate measured as any degradation is different from change failure rate measured as user-impacting incidents. These definition differences are not minor. They can produce factor-of-three differences in the reported number from the same underlying system.
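The lead time example can be made concrete with a small sketch. The change records and field names below (`ticket_created`, `first_commit`, `deployed`) are hypothetical, not taken from any real tracker or from the DORA definitions; the point is only that the same three changes yield very different numbers depending on which start event the definition picks.

```python
from datetime import datetime
from statistics import median

# Hypothetical change records; the field names are illustrative assumptions.
changes = [
    {"ticket_created": datetime(2026, 1, 5, 9),
     "first_commit": datetime(2026, 1, 12, 9),
     "deployed": datetime(2026, 1, 13, 9)},
    {"ticket_created": datetime(2026, 1, 6, 9),
     "first_commit": datetime(2026, 1, 8, 9),
     "deployed": datetime(2026, 1, 10, 9)},
    {"ticket_created": datetime(2026, 1, 7, 9),
     "first_commit": datetime(2026, 1, 14, 9),
     "deployed": datetime(2026, 1, 14, 21)},
]


def median_lead_time_hours(records, start_field):
    """Median hours from the chosen start event to production deploy."""
    hours = [
        (r["deployed"] - r[start_field]).total_seconds() / 3600
        for r in records
    ]
    return median(hours)


from_commit = median_lead_time_hours(changes, "first_commit")
from_ticket = median_lead_time_hours(changes, "ticket_created")
print(f"lead time from commit: {from_commit:.0f}h")
print(f"lead time from ticket: {from_ticket:.0f}h")
```

With this invented data the median is 24 hours from first commit and 180 hours from ticket creation. Neither number is wrong; they answer different questions, which is why the definition has to be written down before the dashboard is built.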

The DORA 2025 report, State of AI-assisted Software Development, found that AI acts as an amplifier of existing engineering conditions. An organization with low Signal Integrity that adds AI tooling produces more activity: more deploys, shorter lead times on individual commits, more changes flowing through the pipeline. Whether those metrics reflect improved customer outcomes depends entirely on whether the metric definitions are capturing the right things. Low Signal Integrity organizations add AI and watch their DORA numbers improve while their customer outcomes do not.

The second failure mode is gaming. DORA metrics are gameable. Deployment frequency can be inflated by automating dependency updates and counting each as a deployment. Lead time can be compressed by breaking work into smaller commits without reducing the time from idea to user value. Change failure rate can be deflated by classifying incidents as planned maintenance or by setting alert thresholds that miss user-impacting degradations. None of these games require malice. They emerge from the same mechanism Goodhart's Law describes: when engineers are evaluated on DORA metrics, they optimize DORA metrics. Signal Integrity requires metric designs that make gaming harder and measurement practices that detect gaming when it occurs.
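A minimal sketch of the deployment frequency case, assuming a hypothetical one-week deploy log in which each entry carries a `kind` label. The label values, and the choice to treat `feature` and `bugfix` as customer-visible, are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical one-week deploy log; "kind" values are illustrative assumptions.
deploys = [
    {"service": "checkout", "kind": "feature"},
    {"service": "checkout", "kind": "dependency_bump"},
    {"service": "search", "kind": "dependency_bump"},
    {"service": "search", "kind": "bugfix"},
    {"service": "billing", "kind": "dependency_bump"},
    {"service": "billing", "kind": "dependency_bump"},
    {"service": "profile", "kind": "feature"},
    {"service": "profile", "kind": "dependency_bump"},
]

# Assumption: only these kinds change something a customer can observe.
CUSTOMER_VISIBLE_KINDS = {"feature", "bugfix"}


def deploy_counts(log):
    """Return (all deploys, customer-visible deploys) for the log."""
    by_kind = Counter(entry["kind"] for entry in log)
    total = sum(by_kind.values())
    visible = sum(by_kind[kind] for kind in CUSTOMER_VISIBLE_KINDS)
    return total, visible


total, visible = deploy_counts(deploys)
print(f"deploys counted per artifact: {total}")
print(f"customer-visible deploys:     {visible}")
```

Here the dashboard number would be 8 while the customer-visible number is 3. Nobody in this log acted in bad faith; the bot-driven updates simply dominate the count.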

The third failure mode is tool-generation obsolescence. Feature flags decoupled deployment frequency from release frequency. AI-assisted development compressed lead time on individual tasks while adding new verification costs that the metric does not capture. Microservice architectures changed the denominator of every DORA metric in ways the original research did not anticipate. A metric calibrated to a pre-feature-flag, pre-AI, monolithic deployment context measures something different when applied to a modern polyglot microservice platform with AI-assisted development. Signal Integrity requires quarterly audit of each metric against current tooling.

The adversarial review: running quarterly reviews designed to find where the numbers are lying

Adversarial review is the practice of regularly attempting to find ways in which your current metrics are wrong, misleading, or gameable. It is the operational implementation of Signal Integrity.

Most engineering organizations do the opposite: they run quarterly reviews designed to show that the numbers are moving in the right direction. An adversarial review is designed to find the ways in which the numbers might be lying.

The review follows a structured protocol. For each metric, attempt to describe a scenario in which the metric improves while the system gets worse. If you cannot describe such a scenario, the metric is likely gaming-resistant for your current tooling. If you can describe the scenario, assess whether it is occurring or plausible in your current context.

For each metric, ask whether two engineers measuring the same system would produce the same number. If not, identify the definitional ambiguity and resolve it. Unresolved ambiguity means the metric is measuring engineer interpretation, not system state.

For each metric, identify what tool change in the last twelve months might have made the definition obsolete. AI coding assistants, feature flags, new CI/CD tooling, new deployment patterns: each is a candidate for having changed the relationship between the metric and the concept it claims to measure.

For each metric, identify the second independent signal that triangulates it. A metric without a triangulating signal is a single point of measurement failure. If the triangulating signal disagrees with the primary metric, the disagreement is a finding, not a problem to dismiss.

The adversarial review produces a measurement stack health report: which metrics are currently trustworthy, which have known limitations being actively managed, which are candidates for obsolescence, and which have been superseded by more accurate instruments.
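One way to record the four questions and derive the health report is sketched below. The `MetricReview` fields, the `Health` categories as an enum, and especially the rule that maps answers to a category are assumptions for illustration; the protocol above does not prescribe a scoring rule, and a real review would carry written findings alongside these flags.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Health(Enum):
    TRUSTWORTHY = "trustworthy"
    KNOWN_LIMITATIONS = "known limitations, actively managed"
    OBSOLESCENCE_CANDIDATE = "candidate for obsolescence"
    SUPERSEDED = "superseded by a more accurate instrument"


@dataclass
class MetricReview:
    name: str
    gaming_scenario_plausible: bool       # question 1
    engineers_disagree: bool              # question 2
    tooling_changed_meaning: bool         # question 3
    triangulating_signal: Optional[str]   # question 4
    replaced_by: Optional[str] = None


def classify(review: MetricReview) -> Health:
    """Hypothetical mapping from review answers to a report category."""
    if review.replaced_by is not None:
        return Health.SUPERSEDED
    if review.tooling_changed_meaning:
        return Health.OBSOLESCENCE_CANDIDATE
    if (review.gaming_scenario_plausible
            or review.engineers_disagree
            or review.triangulating_signal is None):
        return Health.KNOWN_LIMITATIONS
    return Health.TRUSTWORTHY


reviews = [
    MetricReview("deployment_frequency", True, False, False,
                 "customer_visible_change_frequency"),
    MetricReview("lead_time", False, False, True, "wip_age_at_merge"),
    MetricReview("change_failure_rate", False, False, False,
                 "escaped_defect_rate"),
]

report = {r.name: classify(r).value for r in reviews}
for name, status in report.items():
    print(f"{name}: {status}")
```

The ordering of the checks is a design choice: a metric whose meaning has shifted under new tooling is flagged for obsolescence even if it also has a gaming scenario, because the definition needs rework before the gaming question is worth answering.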

Why Delivery Reliability must come first — Signal Integrity built on noise produces noise

In practice, Signal Integrity depends on both data quality and observability as prerequisites, but builds above them.

The Foundations Framework positions Signal Integrity as the second pillar and Delivery Reliability as the first. The sequencing is deliberate. Delivery Reliability produces the consistent infrastructure state that makes measurement possible. Without reliable deployments, deployment frequency numbers are noisy in ways that obscure rather than illuminate. Without reliable incident response, mean time to restore numbers are artifacts of which incidents get counted rather than reflections of organizational capability.

Signal Integrity is built once Delivery Reliability is established. The organization that attempts to achieve Signal Integrity while its delivery reliability is low produces measurement noise that looks like signal. Investment decisions made from that noise are worse than decisions made from no data, because they have the confidence of data without the accuracy.

The practical implication: if your DORA metrics vary significantly week to week without obvious external causes, your Delivery Reliability is probably too low to support Signal Integrity work. Stabilize the pipeline before optimizing the measurement.
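A rough way to put a number on "vary significantly week to week" is the coefficient of variation of a weekly series. This is a sketch under assumptions: the text does not name a statistic or a cutoff, so both the choice of coefficient of variation and the `CV_THRESHOLD` value of 0.25 are illustrative, and the weekly counts are invented.

```python
from statistics import mean, pstdev

# Assumption: an illustrative cutoff, not a value from the framework.
CV_THRESHOLD = 0.25


def coefficient_of_variation(weekly_values):
    """Population standard deviation divided by the mean."""
    average = mean(weekly_values)
    if average == 0:
        raise ValueError("mean is zero; coefficient of variation is undefined")
    return pstdev(weekly_values) / average


# Hypothetical weekly deployment counts for two teams.
stable_team = [40, 42, 38, 41, 39, 40]
noisy_team = [40, 12, 65, 20, 58, 9]

for label, series in (("stable", stable_team), ("noisy", noisy_team)):
    cv = coefficient_of_variation(series)
    verdict = "stabilize the pipeline first" if cv > CV_THRESHOLD else "ok to proceed"
    print(f"{label}: cv={cv:.2f} -> {verdict}")
```

A high value only says the series is noisy. Ruling out the "obvious external causes" the text mentions (holidays, freezes, a large launch) still takes a human looking at the calendar.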

Four indicators that tell you whether your measurement practice itself has integrity

Signal Integrity is itself measurable. Four indicators, tracked quarterly, tell you whether your measurement practice has integrity.

Definition consistency rate is the percentage of your active metrics for which two engineers independently measuring the same system for the same period would produce the same number within five percent. Baseline this by having two engineers independently calculate each metric for a past quarter, then compare. A rate below 80 percent indicates systemic definitional ambiguity.
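The baseline exercise can be scored with a few lines of code. In this sketch the measurement values are invented, and "within five percent" is interpreted as a relative difference against the larger of the two values; that interpretation is an assumption, since the text does not say what the five percent is relative to.

```python
def definition_consistency_rate(measurements, tolerance=0.05):
    """Share of metrics where two independent calculations agree.

    `measurements` maps a metric name to (engineer A value, engineer B value).
    Assumption: agreement means the relative difference, measured against
    the larger of the two values, is at most `tolerance`.
    """
    if not measurements:
        raise ValueError("no metrics to compare")
    consistent = 0
    for value_a, value_b in measurements.values():
        largest = max(abs(value_a), abs(value_b))
        if largest == 0 or abs(value_a - value_b) / largest <= tolerance:
            consistent += 1
    return consistent / len(measurements)


# Hypothetical results from two engineers measuring the same past quarter.
past_quarter = {
    "deployment_frequency_per_week": (42, 41),
    "lead_time_hours": (24, 72),
    "change_failure_rate": (0.08, 0.15),
    "time_to_restore_hours": (3.0, 3.1),
}

rate = definition_consistency_rate(past_quarter)
print(f"definition consistency rate: {rate:.0%}")
```

With these numbers two of the four metrics agree, so the rate is 50 percent, well under the 80 percent line. The disagreeing pairs are the useful output: each one points at a definition that needs to be pinned down.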

Gaming detection rate is the percentage of your active metrics for which you have an identified triangulating signal that would surface gaming if it occurred. For deployment frequency: paired with customer-visible change frequency. For lead time: paired with WIP age at merge. For change failure rate: paired with escaped defect rate. A low gaming detection rate means your dashboard can improve without your system improving.
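As a sketch, the rate is a simple count over a pairing table. The three pairings below come from the paragraph above; the fourth metric, left unpaired, and all identifier names are assumptions added for illustration.

```python
# Primary metric -> triangulating signal (None means no pairing exists yet).
# The first three pairings follow the text; the unpaired entry is hypothetical.
pairings = {
    "deployment_frequency": "customer_visible_change_frequency",
    "lead_time": "wip_age_at_merge",
    "change_failure_rate": "escaped_defect_rate",
    "time_to_restore": None,
}


def gaming_detection_rate(metric_pairings):
    """Share of active metrics that have a triangulating signal."""
    if not metric_pairings:
        raise ValueError("no active metrics")
    paired = sum(1 for signal in metric_pairings.values() if signal)
    return paired / len(metric_pairings)


unpaired = [name for name, signal in pairings.items() if not signal]
print(f"gaming detection rate: {gaming_detection_rate(pairings):.0%}")
print(f"metrics without a triangulating signal: {unpaired}")
```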

Metric age since audit is the number of quarters each metric has been in use without an adversarial review. Metrics older than four quarters without review are candidates for tool-generation obsolescence. In a period of rapid tooling change (AI coding tools, new deployment patterns), the threshold should be two quarters.
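A minimal sketch of the age check, assuming age is counted in calendar quarters between the last adversarial review and today. The review dates are invented, and counting by calendar quarter rather than by elapsed days is an assumption.

```python
from datetime import date


def quarters_between(earlier, later):
    """Number of calendar-quarter boundaries crossed between two dates."""
    earlier_quarter = (earlier.month - 1) // 3
    later_quarter = (later.month - 1) // 3
    return (later.year - earlier.year) * 4 + (later_quarter - earlier_quarter)


def overdue_metrics(last_reviewed, today, threshold_quarters=4):
    """Names of metrics older than the threshold without a review."""
    return sorted(
        name
        for name, reviewed_on in last_reviewed.items()
        if quarters_between(reviewed_on, today) > threshold_quarters
    )


# Hypothetical dates of each metric's most recent adversarial review.
last_reviewed = {
    "deployment_frequency": date(2025, 1, 20),
    "lead_time": date(2025, 10, 3),
    "change_failure_rate": date(2025, 7, 1),
}
today = date(2026, 4, 15)

print("overdue at four quarters:", overdue_metrics(last_reviewed, today))
print("overdue at two quarters: ", overdue_metrics(last_reviewed, today, 2))
```

Under the default threshold only `deployment_frequency` is flagged. Tightening to two quarters, as suggested for periods of rapid tooling change, also flags `change_failure_rate`.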

Disagreement rate between paired signals is the rate at which a primary metric and its triangulating signal disagree. Disagreement is not a problem; it is a finding. A 0 percent disagreement rate means either the two signals are measuring the same thing (redundant, not triangulating) or something is wrong with the data. A high disagreement rate means the system is in a state your primary metrics are not capturing.
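The text does not define what counts as a disagreement, so the sketch below picks one plausible definition as an assumption: the two signals disagree in a period when their period-over-period changes point in different directions. The weekly series are invented.

```python
def direction(previous, current):
    """Return 1 for an increase, -1 for a decrease, 0 for no change."""
    return (current > previous) - (current < previous)


def disagreement_rate(primary, triangulating):
    """Share of period-over-period moves where the two signals diverge.

    Assumption: both series cover the same periods in the same order, and
    the pair is expected to move in the same direction when healthy.
    """
    if len(primary) != len(triangulating) or len(primary) < 2:
        raise ValueError("need two equal-length series with at least two periods")
    moves = len(primary) - 1
    disagreements = sum(
        1
        for i in range(1, len(primary))
        if direction(primary[i - 1], primary[i])
        != direction(triangulating[i - 1], triangulating[i])
    )
    return disagreements / moves


# Hypothetical weekly series for one paired signal.
deploys_per_week = [30, 34, 40, 47, 55]
customer_visible_changes = [10, 11, 11, 10, 12]

rate = disagreement_rate(deploys_per_week, customer_visible_changes)
print(f"disagreement rate: {rate:.0%}")
```

In this example deploys rise every week while customer-visible changes rise, stall, fall, and rise again, giving a 50 percent disagreement rate. For a pair that is expected to move inversely, one of the series would need its sign flipped before comparison.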

Signal Integrity as the capability that makes every other platform pillar verifiable

Signal Integrity is the capability that lets everything else work.

Without Signal Integrity, platform investment decisions are guesses made with the appearance of data. A team can report cognitive load improvements for four quarters while deployment frequency drops twenty percent. The metric moved in the right direction while the system moved in the wrong one. That is not a measurement success.

With Signal Integrity, the platform team can answer the question that engineering leaders actually need answered: is the platform better than it was six months ago, and how do I know? The answer requires measurements that are accurate, gaming-resistant, and current. It requires an adversarial review practice that surfaces measurement problems before they become decision errors. It requires pairing primary metrics with triangulating signals so that disagreement surfaces findings.

Signal Integrity is the second pillar of the Foundations Framework because it is the capability that lets Cognitive Absorption measurement work, that lets Delivery Reliability improvements be verified, that lets platform investment ROI be calculated honestly.

Frequently asked questions

What is Signal Integrity, and how is it different from observability?

Signal Integrity is the capability to produce measurements that inform decisions accurately, resist gaming, and evolve ahead of the tools they describe. Observability is a prerequisite: it gives you the ability to see what is happening in a system. Signal Integrity sits above that. It asks whether what you are measuring is the right thing to measure, whether two engineers measuring the same system would get the same number, and whether the definition is still accurate after the tooling has changed. An organization can have excellent observability and low Signal Integrity simultaneously — the data is complete, but the metrics built on it are measuring the wrong things or measuring them inconsistently.

Why do DORA metrics lose Signal Integrity over time without audits?

Because tooling generations change what the metric denominators and numerators capture. A deployment frequency definition written in 2018 was calibrated to a world without feature flags, without AI-assisted development, without automated dependency update bots. Each of those changes shifts what gets counted. Feature flags decouple deploys from releases, so a team deploying behind a flag may show high deployment frequency while customers see no change. AI assistants generate commits at a different cadence than human developers, compressing lead time on individual tasks while adding verification costs the metric does not capture. Without a quarterly audit, the dashboard reports numbers from a definition that no longer matches the system.

What makes a metric gameable, and how does adversarial review catch it?

A metric is gameable when it can be optimized without improving the outcome it claims to measure. Deployment frequency becomes gameable when automated dependency updates count as deployments: the number rises without new code reaching users. Change failure rate becomes gameable when teams control what counts as a failure and have an incentive to classify as few incidents as possible as failures. The adversarial review catches gaming by requiring teams to explicitly describe a scenario in which each metric improves while the underlying system degrades. If they can describe such a scenario, they either redesign the metric or add a triangulating signal that would surface the gaming if it occurred. The exercise takes 30 to 60 minutes per metric per quarter and is the most cost-effective Signal Integrity investment most teams can make.


Read about the full Foundations Framework at dxclouditive.com/en/foundations/. For a structured assessment of where your Signal Integrity stands across all five dimensions, book a Foundations Assessment or take the free Platform Score.

For a deeper read on where DORA metrics specifically fail Signal Integrity requirements, read Why your DORA metrics are lying to you.

For the three measurement traps that make dashboards theater in the AI era, read three platform measurement traps that make your dashboards theater.

For the developer experience measurement signals that triangulate Signal Integrity against survey data, read developer experience measurement — beyond survey scores.


Matías Caniglia


Founder of Clouditive. 18+ years transforming engineering organizations across LATAM and globally through Developer Experience consulting.
