DORA Metrics vs. SPACE Framework: Which to Implement First
TL;DR. The DORA vs SPACE decision is not a competition. They measure different things at different altitudes. DORA measures software delivery: how fast, how often, how stable. SPACE measures developer experience: satisfaction, performance, activity, communication, efficiency. DORA tells you how your delivery system performs. SPACE tells you whether your developers can function well inside it. Most organizations should start with DORA because it produces actionable improvement loops faster. SPACE becomes essential once DORA metrics are stable and you need to understand why they move.
The question "DORA or SPACE" comes up in every engineering measurement conversation, usually in a context where someone has read about both frameworks and is unsure which to prioritize. Both have legitimate academic foundations. Both have real adoption across engineering organizations. The confusion is understandable — they appear to measure similar things (developer productivity) but actually measure at different levels of abstraction.
Getting clear on what each framework measures, and why the distinction matters, is more useful than a head-to-head comparison. The goal is not to pick the winner. It is to understand which framework gives you the most actionable signal at your current organizational maturity level.
What DORA actually measures
The DORA research program, now maintained at dora.dev, has been running since 2014. It is, by the research team's own description, the longest-running academically rigorous research into software delivery performance. The program's key finding, repeated across tens of thousands of survey responses over a decade, is that software delivery performance can be measured by a small set of metrics, and that organizations that perform well on those metrics also tend to perform better on organizational outcomes like profitability, market share, and employee satisfaction.
The current DORA model has evolved from the original four metrics to five, organized into two categories.
Throughput metrics measure how effectively the organization delivers software. Change lead time is the duration from code commit to that code running in production. Deployment frequency is how often deployments occur in a given period.
Stability metrics measure how reliably the software behaves once it is deployed. Change fail rate is the proportion of deployments that cause a degraded service and require intervention. Failed deployment recovery time measures how quickly the team restores normal service after a failure. The fifth metric, deployment rework rate, captures unplanned deployments triggered by production incidents — a signal of rework cost.
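The definitions above can be turned into arithmetic fairly directly. The following is a minimal sketch, not an official DORA implementation: the `Deployment` record shape, the field names, and the choice of the median as the summary statistic are all assumptions made for illustration, and the sample data is invented. Deployment rework rate is left out because it needs a way to tell planned from unplanned deployments.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import median
from typing import List, Optional


@dataclass
class Deployment:
    """One production deployment (hypothetical record shape)."""
    commit_time: datetime                     # when the change was committed
    deploy_time: datetime                     # when it was running in production
    failed: bool = False                      # degraded service, needed intervention
    restored_time: Optional[datetime] = None  # when normal service resumed, if it failed


def change_lead_time(deploys: List[Deployment]) -> timedelta:
    """Median commit-to-production duration (median is an assumption)."""
    seconds = [(d.deploy_time - d.commit_time).total_seconds() for d in deploys]
    return timedelta(seconds=median(seconds))


def deployment_frequency(deploys: List[Deployment], period_days: int) -> float:
    """Average deployments per day over the observation period."""
    return len(deploys) / period_days


def change_fail_rate(deploys: List[Deployment]) -> float:
    """Share of deployments that caused a failure requiring intervention."""
    return sum(d.failed for d in deploys) / len(deploys)


def failed_deployment_recovery_time(deploys: List[Deployment]) -> Optional[timedelta]:
    """Median time from a failed deployment to restored service, or None."""
    seconds = [
        (d.restored_time - d.deploy_time).total_seconds()
        for d in deploys
        if d.failed and d.restored_time is not None
    ]
    return timedelta(seconds=median(seconds)) if seconds else None


# Invented sample: four deployments observed over a seven-day window.
DEPLOYS = [
    Deployment(datetime(2024, 1, 1, 9), datetime(2024, 1, 1, 13)),
    Deployment(datetime(2024, 1, 2, 9), datetime(2024, 1, 2, 11),
               failed=True, restored_time=datetime(2024, 1, 2, 11, 30)),
    Deployment(datetime(2024, 1, 3, 9), datetime(2024, 1, 3, 15)),
    Deployment(datetime(2024, 1, 5, 9), datetime(2024, 1, 5, 17)),
]

print(change_lead_time(DEPLOYS))                 # 5:00:00
print(change_fail_rate(DEPLOYS))                 # 0.25
print(failed_deployment_recovery_time(DEPLOYS))  # 0:30:00
```

In practice the hard part is usually not the arithmetic but deciding what counts as a deployment and what counts as a failure, so agree on those definitions before comparing numbers across teams.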
What DORA does not measure: developer satisfaction, individual productivity, team collaboration quality, or cognitive load. These are real and important. They just are not what DORA is for.
The value of DORA metrics is their specificity. Each metric has a clear definition, a clear measurement method, and a clear improvement lever. If your change lead time is measured in weeks, the first lever to examine is automation in the deployment pipeline. If your change fail rate is high, the levers are test coverage and the quality of deployment automation. The metrics are closely tied to the practices that produce them, which makes them actionable in a way that more holistic measurement is not.
What SPACE actually measures
The SPACE framework was published in ACM Queue in February 2021 by Nicole Forsgren, Margaret-Anne Storey, Chandra Maddila, Tom Zimmermann, Brian Houck, and Jenna Butler — researchers from Microsoft Research, GitHub, and the University of Victoria. The paper's central argument is that developer productivity cannot be captured by any single metric or dimension, and that attempts to do so typically produce measurement pathologies: optimizing one dimension while degrading others.
SPACE is an acronym for five dimensions of developer productivity:
Satisfaction and well-being — how fulfilled developers feel in their work, team, and tools. This includes psychological safety, burnout indicators, and whether developers believe the work they do matters.
Performance — the outcomes of work, not the activity. Whether the code achieves its intended purpose, delivers value to customers, and meets quality standards. Performance is distinct from activity deliberately: commits and pull requests are activity, not performance.
Activity — the count of observable actions: commits, code reviews, deployments, incidents responded to. Activity metrics are useful as context for other dimensions but misleading in isolation. The paper explicitly cautions against using activity as a proxy for productivity.
Communication and collaboration — how teams coordinate and how knowledge flows. Pull request review times, onboarding ramp time, cross-team contribution patterns, and documentation quality are indicators here.
Efficiency and flow — the ability to complete work with minimal friction, interruption, or context-switching. This includes both individual focus time and system-level pipeline speed.
The paper's recommendation is that organizations measure across at least three dimensions simultaneously. Measuring a single dimension in isolation produces the optimization pathology the framework is designed to prevent: a team that maximizes deployment frequency (activity) while developer satisfaction collapses is not performing better. It is trading long-term capability for short-term throughput.
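One way to keep the three-dimension guideline honest is to check a proposed metric set against it before building any dashboards. The sketch below is illustrative only: the metric names and the mapping of each metric to a dimension are hypothetical, and the short dimension labels are shorthand for the five SPACE dimensions listed above.

```python
from typing import Dict, Set

# Shorthand labels for the five SPACE dimensions.
SPACE_DIMENSIONS = {
    "satisfaction", "performance", "activity", "communication", "efficiency",
}


def covered_dimensions(metric_dimensions: Dict[str, str]) -> Set[str]:
    """Return the set of SPACE dimensions that the chosen metrics touch."""
    dimensions = set(metric_dimensions.values())
    unknown = dimensions - SPACE_DIMENSIONS
    if unknown:
        raise ValueError(f"Not SPACE dimensions: {sorted(unknown)}")
    return dimensions


def covers_enough_dimensions(metric_dimensions: Dict[str, str], minimum: int = 3) -> bool:
    """True when the metric set spans at least `minimum` distinct dimensions."""
    return len(covered_dimensions(metric_dimensions)) >= minimum


# Hypothetical metric sets, each mapping a metric name to one dimension.
activity_only = {"deployment_count": "activity", "commit_count": "activity"}
balanced = {
    "deployment_count": "activity",
    "quarterly_satisfaction_survey": "satisfaction",
    "pr_review_time": "communication",
}

print(covers_enough_dimensions(activity_only))  # False
print(covers_enough_dimensions(balanced))       # True
```

The check counts distinct dimensions rather than metrics, so adding a fifth activity counter does not make an activity-only dashboard pass.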
The fundamental difference: system performance vs. developer experience
DORA measures the delivery system. SPACE measures the developers operating inside it.
This distinction matters because a delivery system that performs well on DORA metrics can still produce burned-out, unhappy developers — and eventually, burned-out developers produce delivery systems that stop performing well on DORA metrics. The causality runs in both directions.
A team with excellent DORA metrics and declining developer satisfaction is experiencing what the SPACE paper would identify as an activity-performance trade-off: the system produces throughput while degrading the conditions that sustain it. This is not detectable from DORA metrics alone.
A team with poor DORA metrics and high developer satisfaction is experiencing the opposite: developers who feel good about their work but are constrained by a delivery system that slows them down. This is immediately detectable from DORA metrics and actionable from their improvement levers.
The argument for starting with DORA is that the second case — constrained by a broken delivery system — is far more common than the first, and the improvement levers for DORA metrics are more specific and more actionable for most organizations at most maturity levels.
Why most organizations should start with DORA
The case for starting with DORA is not that SPACE is less important. It is that DORA has faster feedback loops.
When you establish DORA baselines and start working on the levers — deployment pipeline automation, test coverage, rollback reliability — you produce measurable changes in the metrics within weeks to months. The improvement loop is tight: make a change to the deployment pipeline, measure the lead time, observe the result. That tight feedback loop is what makes DORA useful for engineering leaders who need to demonstrate progress on a concrete timeline.
SPACE measurement, done correctly, requires establishing qualitative feedback mechanisms (developer surveys, pulse checks, shadowing sessions) alongside quantitative telemetry. This setup takes longer and the signal takes longer to interpret. A developer satisfaction survey conducted at the start of a platform initiative and repeated three months later will show movement, but the statistical significance and the attribution to specific changes require careful analysis. Most organizations are not ready to do that analysis rigorously until they have the quantitative foundation that DORA provides.
There is also a sequencing logic to the measurement. DORA tells you how the delivery system performs. Once you have stabilized delivery performance — deployment frequency is consistent, lead time is predictable, change fail rate is low — the remaining improvement opportunities tend to be human-system interaction problems rather than pipeline problems. That is when SPACE becomes essential: it identifies where the humans inside the well-functioning system are still struggling.
The organizational maturity at which each framework is most valuable
The DORA research itself categorizes organizations into performance clusters based on metric benchmarks. The clusters are typically labeled low, medium, high, and elite, and their number and boundaries shift from one annual report to the next, so treat any single year's thresholds as a reference point rather than a fixed target.
For low and medium performers — organizations where deployment frequency is measured in weeks or months, lead time is similarly extended, and change fail rates are high — DORA is the right starting point. The problems these organizations face are delivery system problems: manual deployment steps, insufficient test automation, deployment processes that cannot roll back reliably. DORA metrics quantify these problems directly, and the improvement levers are known.
For organizations that have achieved DORA high-performer status — frequent deployments, short lead times, low fail rates — the improvement ceiling on pure delivery metrics is lower. Further gains require understanding the developer experience inside the high-performing delivery system. Are developers experiencing high cognitive load despite good pipeline metrics? Are collaboration patterns creating bottlenecks that pipeline metrics do not capture? Is developer satisfaction declining in ways that will eventually appear in DORA metrics as key people leave? This is where SPACE measurement provides signal that DORA cannot.
Most engineering leaders reading this are working with teams that have significant DORA improvement opportunities before hitting the ceiling where SPACE becomes the primary signal. That is not universal — fast-growing teams with high hiring velocity, high onboarding rates, and strong delivery fundamentals may need SPACE sooner — but it is the common case.
How DORA and SPACE coexist: a practical implementation sequence
The frameworks are not mutually exclusive. The practical question is not whether to use both but in what sequence.
Start with DORA baselines. Instrument the key metrics (the original four, plus deployment rework rate if you can capture it) with real telemetry: not survey-based estimates, but actual measurement from your deployment pipeline and incident management system. The measurement setup takes two to four weeks for most organizations. Establish the baseline numbers before drawing any conclusions.
Run DORA improvement cycles. Identify the highest-leverage gaps in your current metrics. For most organizations, the first cycle addresses lead time and deployment frequency — the pipeline automation and test coverage investments that have the most direct impact. Measure the impact after each change. The DORA 2024 research notes that AI adoption increases individual productivity and flow but can negatively affect software delivery stability — which means DORA measurement is becoming more, not less, important as AI tooling proliferates.
Introduce SPACE measurement once DORA is stable. Once your delivery metrics are consistently in a range you understand and are actively managing, add the SPACE dimensions that are most relevant to your specific context. For most platform engineering contexts, the highest-value SPACE dimensions are satisfaction and well-being (to detect burnout before it becomes attrition), efficiency and flow (to identify cognitive load patterns that DORA doesn't surface), and communication and collaboration (to identify onboarding and cross-team friction).
Use SPACE to explain DORA movements. Once both are established, each helps explain movements in the other. A drop in deployment frequency that coincides with declining developer satisfaction signals a connection worth investigating. An improvement in lead time that does not produce a corresponding improvement in developer efficiency suggests the pipeline improvement is not translating to reduced cognitive load, a sign that friction exists outside the pipeline.
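A simple way to look for the kind of coincidence described in the last step is to correlate a DORA series with a SPACE series over the same periods. The sketch below assumes you already have weekly deployment counts and a weekly satisfaction pulse score on a 1 to 5 scale; both series are invented for illustration. A high correlation is a prompt for investigation, not evidence of cause.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two series of equal length with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Invented data: six consecutive weeks for one team.
weekly_deployments = [12, 11, 9, 8, 6, 5]
weekly_satisfaction = [4.2, 4.1, 3.8, 3.6, 3.3, 3.1]

r = pearson(weekly_deployments, weekly_satisfaction)
print(f"correlation: {r:.2f}")
if r > 0.7:  # threshold is an arbitrary choice for this sketch
    print("deployment frequency and satisfaction are moving together; investigate")
```

With only a handful of weekly points the coefficient is unstable, so treat it as a pointer toward a conversation with the team rather than a finding.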
Common mistakes when implementing either framework
Both frameworks get implemented badly, and the failure modes are predictable.
DORA mistake: measuring without improving. Organizations that establish DORA baselines and then do not change anything have created a reporting process, not a measurement practice. The value of DORA metrics is the improvement loop. Baseline, intervene, measure again. Organizations that measure quarterly without clear ownership of improvement cycles produce dashboards that leadership reviews and developers ignore.
DORA mistake: using activity as a proxy. Deployment count is not a productivity metric. An organization that increases deployment frequency by automating trivial deployments while deferring substantive changes is gaming the metric, not improving the system. DORA metrics are only meaningful when the denominator — deployable changes — is stable and representative of real product development.
SPACE mistake: survey without action. Developer satisfaction surveys that produce no visible response from leadership are worse than no surveys. They signal that the organization asks about experience without intending to change it, which reduces psychological safety and future survey participation. Every SPACE measurement cycle should include a visible response to the findings: here is what we heard, here is what we are doing about it.
SPACE mistake: measuring individuals. The SPACE paper lists example metrics at the individual, team, and system level, but that is not an invitation to score individual engineers. Applying SPACE dimensions to evaluate individual engineers' productivity produces surveillance dynamics that destroy psychological safety and drive the best engineers, the ones with the most options, toward exit. See also 5 signs your platform team is stuck in ad-hoc mode for organizational signals that SPACE measurement would surface.
Both frameworks: skipping the baseline. Neither framework provides value without a clear baseline established before any intervention. Organizations that adopt both frameworks simultaneously without baselines find themselves three months later unable to distinguish noise from signal. Sequence the measurement setup: DORA baseline first, SPACE baseline second, improvement cycles with both.
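To make the noise-versus-signal point concrete, here is a deliberately crude sketch: it flags a change only when the post-intervention average falls outside a band around the baseline average. The band width `k`, the function name, and the sample lead times are assumptions for illustration, and this heuristic is not a substitute for a proper statistical test.

```python
from statistics import mean, stdev
from typing import Sequence


def outside_baseline_band(baseline: Sequence[float],
                          after: Sequence[float],
                          k: float = 2.0) -> bool:
    """True if the post-intervention mean is more than k baseline standard
    deviations away from the baseline mean."""
    if len(baseline) < 2:
        raise ValueError("need at least two baseline observations")
    if not after:
        raise ValueError("need at least one post-intervention observation")
    center = mean(baseline)
    spread = stdev(baseline)  # sample standard deviation of the baseline
    return abs(mean(after) - center) > k * spread


# Invented weekly change lead times, in hours.
baseline_lead_time = [50, 46, 54, 48, 52]
after_pipeline_work = [30, 28, 33]
after_no_real_change = [49, 53, 47]

print(outside_baseline_band(baseline_lead_time, after_pipeline_work))   # True
print(outside_baseline_band(baseline_lead_time, after_no_real_change))  # False
```

Without the baseline series there is nothing to compute the band from, which is the practical reason the baseline has to come before the intervention.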
For the full treatment of DORA measurement setup and the specific signal integrity issues that produce misleading DORA metrics, read Why your DORA metrics are lying to you (and how to fix it).
The measurement infrastructure that both frameworks require
Both DORA and SPACE require measurement infrastructure before they produce reliable signal. This infrastructure is worth building once correctly rather than twice inadequately.
DORA requires deployment telemetry connected to incident management. Specifically: a deployment event stream from your CI/CD system with timestamps and outcome data (success/failure), and an incident event stream from your incident management tool with resolution timestamps. Without these data sources, DORA metrics are estimated rather than measured, and estimated metrics are imprecise at exactly the moments when precision matters most — when you are trying to determine whether an intervention worked.
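Connecting the two event streams usually means attributing each incident to a deployment. The sketch below uses one possible rule, stated here as an assumption: an incident is attributed to the most recent deployment that preceded it, provided that deployment happened within a fixed window (24 hours here). The function name, the tuple shapes, and the timestamps are hypothetical; real tools may offer an explicit link between incident and deployment, which is preferable when available.

```python
from bisect import bisect_right
from datetime import datetime, timedelta
from typing import Dict, List, Sequence, Tuple

Incident = Tuple[datetime, datetime]  # (opened, resolved)


def attribute_incidents(deploy_times: Sequence[datetime],
                        incidents: Sequence[Incident],
                        window: timedelta = timedelta(hours=24),
                        ) -> Dict[int, List[timedelta]]:
    """Map deployment index (in time order) to the recovery durations of the
    incidents attributed to it."""
    ordered = sorted(deploy_times)
    attributed: Dict[int, List[timedelta]] = {}
    for opened, resolved in incidents:
        # Index of the latest deployment at or before the incident opened.
        i = bisect_right(ordered, opened) - 1
        if i >= 0 and opened - ordered[i] <= window:
            attributed.setdefault(i, []).append(resolved - opened)
    return attributed


# Invented event streams.
deploy_times = [
    datetime(2024, 3, 1, 10), datetime(2024, 3, 2, 10),
    datetime(2024, 3, 3, 10), datetime(2024, 3, 4, 10),
]
incidents = [
    (datetime(2024, 3, 2, 10, 20), datetime(2024, 3, 2, 11, 5)),  # soon after a deploy
    (datetime(2024, 3, 6, 9, 0), datetime(2024, 3, 6, 10, 0)),    # no deploy within 24h
]

failures = attribute_incidents(deploy_times, incidents)
fail_rate = len(failures) / len(deploy_times)
print(failures)   # {1: [datetime.timedelta(seconds=2700)]}
print(fail_rate)  # 0.25
```

The window length materially changes the resulting change fail rate, so record the rule you chose alongside the metric.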
SPACE requires both quantitative telemetry and qualitative feedback infrastructure. The quantitative side includes pull request cycle time from your version control system, pipeline wait times from your CI system, and incident response time from your incident tool. These can often reuse the same data sources as DORA with different aggregations. The qualitative side requires a developer survey infrastructure: a regular cadence, consistent question sets, and a process for reviewing and acting on the results.
For organizations building this infrastructure from scratch, the Foundations Assessment establishes both measurement dimensions as part of the engagement — the assessment defines what the baselines should be and what infrastructure needs to be in place before either framework produces reliable signal.
Frequently Asked Questions
Q: Can we use both DORA and SPACE from day one?
You can, but the setup work for both simultaneously is significant and the risk is that neither is done well. DORA setup typically takes two to four weeks if your deployment pipeline has the instrumentation needed for accurate telemetry. SPACE setup takes longer because it requires establishing both quantitative telemetry and the qualitative survey infrastructure. If you try to do both simultaneously while managing other platform work, one or both will be done superficially. Start with DORA, which has more specific data requirements and faster feedback loops. Introduce SPACE measurement three to six months later once DORA baselines are established and improvement cycles are running.
Q: Who invented the SPACE framework and why does the authorship matter?
The SPACE framework was published in February 2021 in ACM Queue by Nicole Forsgren (lead author of the DORA research program and one of the authors of Accelerate), Margaret-Anne Storey (University of Victoria), Chandra Maddila, Tom Zimmermann, Brian Houck, and Jenna Butler (Microsoft Research and GitHub). The authorship matters because Forsgren is the same researcher who established the DORA research program's methodology. SPACE was not developed in opposition to DORA — it was developed by one of DORA's principal architects to address the dimensions of developer productivity that delivery metrics do not capture. The frameworks are designed to be complementary.
Q: What if our DORA metrics look good but developers say they are unhappy?
This is precisely the scenario where SPACE becomes essential. Good DORA metrics combined with declining developer satisfaction are a leading indicator of trouble: the delivery system performs well today because the team is absorbing high cognitive load to keep it running. That is not sustainable. SPACE measurement in this scenario typically surfaces efficiency and flow problems (context-switching, tool fragmentation, on-call burden) and satisfaction and well-being signals (burnout indicators, erosion of psychological safety) that will eventually appear in DORA metrics as key people leave and delivery quality degrades. The AI Amplifier: DORA 2025 and Platform Readiness covers how AI tooling is changing this relationship specifically.
If you need to establish measurement baselines for your platform before deciding which framework to prioritize, the Foundations Assessment defines both DORA and SPACE measurement requirements as part of the platform gap analysis.

Mat Caniglia
Founder of Clouditive. 18+ years transforming engineering organizations across LATAM and globally through Developer Experience consulting.