Most internal developer platforms do not fail at launch. They fail quietly, over six to eighteen months, while the platform team keeps shipping features that nobody uses. By the time leadership notices, the investment is already sunk and the engineering org is building workarounds on top of a platform they were supposed to adopt.
Port's 2025 survey of 300 IT professionals found that 90% of organizations report using an IDP. The same survey found that 75% of developers lose 6 to 15 hours per week to tool sprawl. Those two numbers do not belong together unless most of those platforms are not working.
Across the assessments I have run with platform organizations in the US and Latin America, the same three patterns appear in nearly every failed IDP. They are not exotic. They are not caused by the wrong tech stack. They are structural, and they start on day one.
Pattern one: built without a baseline
The most common failure I see is a platform that cannot prove it changed anything.
A team spends six months building a golden path. They reduce the number of steps to deploy from fourteen to three. They are proud of this. When asked to show the business impact, they cannot, because they never measured the fourteen steps before they started.
No P50 or P95 lead time data from before the platform. No incident count by service from before. No developer survey score. No cognitive load proxy. Nothing.
This is not carelessness. It is a planning habit that the industry has normalized. Platform teams treat measurement as a reporting activity that happens after delivery, not a design input that shapes what to build.
The consequence is that the platform has no feedback loop. When adoption stalls, the team cannot tell whether the golden path is actually faster, or just different. They add more features. Adoption continues to stall.
The Foundations Framework addresses this at the Horizon phase, before any design work starts. We establish baselines for the five pillars: delivery reliability, signal integrity, cognitive absorption, security and compliance posture, and operational accountability. The work after that is measured against those baselines, not against subjective impressions of improvement.
If you cannot answer "what does deploy lead time look like today, for the teams we are building for," you are not ready to build. You are ready to measure.
Pattern two: built without users
The second pattern is a platform that was designed by the platform team, for the platform team.
This is not a cynical observation. Platform teams are highly capable engineers who want to build good things. The problem is that they spend most of their time with each other, and they build for the developer experience they know, which is their own.
Application teams have different constraints. They work under delivery pressure that the platform team often does not feel directly. They have existing workflows that took months to establish. They have tribal knowledge embedded in scripts and Makefiles that nobody documented. They have opinions.
When platform teams skip discovery with application teams, the platform reflects the platform team's mental model of how software should be delivered. That model may be correct in the abstract. It is almost never correct for the specific teams that have to adopt it.
Matthew Skelton and Manuel Pais documented this problem in Team Topologies. Cognitive load is not just a function of how many steps a process has. It is a function of how unfamiliar those steps are, and how much existing knowledge they displace. A platform that requires an application team to unlearn six years of muscle memory is generating cognitive load even if the new process is objectively simpler.
The symptom is predictable: low voluntary adoption. The platform team responds by making adoption mandatory. Engineering directors start receiving complaints. The platform becomes a compliance exercise rather than a capability.
The fix is not a better onboarding doc. The fix is bringing application teams into the design process before a single line of platform code is written. Their friction points are the platform's requirements. Their workarounds are the feature list.
Agarwal and Karahanna's construct of cognitive absorption, from their 2000 MIS Quarterly paper, is useful here. Users adopt tools when they reach a state of deep engagement with the work the tool enables, not with the tool itself. A platform that puts friction before value never lets users reach that state: they bounce before they ever experience the capability.
Pattern three: built without metrics
The State of Platform Engineering Vol 4 report found that 29.6% of platform teams measure nothing. Not deploy frequency. Not change failure rate. Not developer satisfaction. Nothing.
This is the most structurally dangerous pattern because it makes the other two invisible. If you are not measuring adoption, you cannot detect that nobody is using the platform. If you are not measuring lead time, you cannot detect that the platform is slower than the manual process it replaced. If you are not measuring incident rates, you cannot detect that the platform introduced new failure modes.
Platform teams that do not measure operate on belief. They believe the platform is helping. They believe adoption is growing. They believe the investment is justified. When questioned, they produce anecdotes: "the team in payments said it was much faster." Anecdotes are not wrong. They are just not sufficient to make decisions at the organizational level.
The contrast is significant. The same State of Platform Engineering report found that mature platform organizations achieve 3.5 times the deploy frequency and 4 times the lead time improvement of immature ones. That gap does not appear by accident. It appears because mature platform organizations instrument everything from the beginning, and they use that data to make decisions about what to build next.
Metrics also change the political dynamics around a platform. A platform team that can show week-over-week improvement in P50 lead time is having a different conversation with engineering leadership than a team that is asking for continued investment on the strength of their roadmap. One conversation is about evidence. The other is about trust.
Why AI makes all three patterns worse
DORA 2025 framed AI adoption in engineering as an amplifier, not a solution. Where the platform is already mature, AI tools raise throughput and quality. Where the platform is weak, AI tools amplify the chaos.
The three patterns above define a weak platform. When you add AI coding assistants to an IDP that was built without a baseline, you cannot measure whether the AI is helping or hurting. When you add AI to a platform that was built without users, you accelerate the production of code that the platform was not designed to handle. When you add AI to a platform that does not measure outcomes, you add another variable to an already uncontrolled system.
I have seen this in the field. A team adopts an AI coding assistant and sees a short-term spike in PR volume. Without platform observability, they cannot tell whether those PRs are passing CI, how long they stay open, or what their change failure rate looks like. Six months later, incident frequency has increased and nobody can explain why. The AI gets blamed. The real cause is the platform.
The DORA finding is specific: when AI adoption increases 25% in organizations with weak platforms, delivery stability drops 7.2% and throughput drops 1.5%. AI does not rescue a weak platform. It reveals it faster.
The signals that appear before it is too late
Each of the three patterns has early signals. They appear before the platform team declares failure, if you know what to look for.
For built without a baseline: the platform team cannot answer basic questions about current state without checking with individual teams. There is no shared data model for delivery performance. The assessment phase was skipped or compressed. Success criteria for the platform are defined in terms of features shipped, not outcomes changed.
For built without users: the first version of the golden path was designed in a planning session that included only platform engineers. Application teams were informed, not consulted. Adoption targets are defined by the platform team, not co-created with the teams expected to adopt. Feedback mechanisms were added after the platform launched, not before.
For built without metrics: there is no dashboard that shows platform adoption, platform uptime, or developer time-to-first-success. DORA metrics are collected at the org level but not disaggregated by platform usage. The platform team reports on features delivered, not outcomes produced.
These signals are visible within the first four to six weeks of a platform initiative. Catching them at week four is a planning conversation. Catching them at month eighteen is a postmortem.
What a baseline-first approach looks like
Before designing anything, measure where the teams you are building for are today. Four data points are enough to start: P50 and P95 deploy lead time, change failure rate, time to restore service after an incident, and a short developer survey that captures friction points and context-switching cost.
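As a minimal sketch of what collecting the first three of those numbers can look like, assuming you can export deployment and incident records from your CI/CD and incident tooling. The record shapes and field names here are hypothetical, not any specific tool's API; the developer survey is the fourth data point and is collected separately.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import quantiles

@dataclass
class Deploy:
    commit_at: datetime    # first commit in the change
    deployed_at: datetime  # change running in production
    failed: bool           # caused a rollback or incident

@dataclass
class Incident:
    opened_at: datetime
    restored_at: datetime

def baseline(deploys: list[Deploy], incidents: list[Incident]) -> dict:
    """Compute the three delivery numbers for the baseline, in hours."""
    lead_times_h = sorted(
        (d.deployed_at - d.commit_at).total_seconds() / 3600 for d in deploys
    )
    # quantiles(n=20) returns 19 cut points; index 9 is P50, index 18 is P95
    cuts = quantiles(lead_times_h, n=20)
    restore_h = [
        (i.restored_at - i.opened_at).total_seconds() / 3600 for i in incidents
    ]
    return {
        "p50_lead_time_h": cuts[9],
        "p95_lead_time_h": cuts[18],
        "change_failure_rate": sum(d.failed for d in deploys) / len(deploys),
        "mean_time_to_restore_h": sum(restore_h) / len(restore_h),
    }
```

The point is not the script. It is that these numbers exist, are agreed on, and are captured before any design work starts.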
Then define what success means in terms of those four numbers. Not "we want to reduce friction." Specifically: P50 lead time from 4 days to 1 day, change failure rate from 12% to under 5%, developer survey score on cognitive load from 6.2 to 8.0. Measurable, time-bounded, agreed on with the teams affected.
Then build with those targets as design constraints, not aspirations. At the end of each phase, measure again. If the numbers are not moving in the right direction, the platform has a problem that more features will not solve.
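One way to make that concrete is to encode the agreed baselines and targets as data and re-run the same measurement at the end of each phase. A minimal sketch, using the example numbers above and continuing the hypothetical metric names from the earlier snippet:

```python
from dataclasses import dataclass

@dataclass
class Target:
    metric: str
    baseline: float
    target: float
    lower_is_better: bool = True

# Example targets from the text, agreed with the affected teams, not set unilaterally.
TARGETS = [
    Target("p50_lead_time_h", baseline=96.0, target=24.0),  # 4 days -> 1 day
    Target("change_failure_rate", baseline=0.12, target=0.05),
    Target("cognitive_load_survey", baseline=6.2, target=8.0, lower_is_better=False),
]

def phase_review(measured: dict[str, float]) -> list[str]:
    """Compare end-of-phase measurements against baselines; flag anything not moving."""
    flags = []
    for t in TARGETS:
        value = measured.get(t.metric)
        if value is None:
            flags.append(f"{t.metric}: not measured this phase")
            continue
        improved = value < t.baseline if t.lower_is_better else value > t.baseline
        if not improved:
            flags.append(f"{t.metric}: {value} has not improved on baseline {t.baseline}")
    return flags
```

A non-empty list at the end of a phase is the signal that the platform has a problem more features will not solve.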
This is what the Horizon phase is for in the Foundations Framework. Not architecture. Not tooling decisions. Baseline measurement and success criteria, agreed on before a single design decision is made.
The Foundations Assessment is designed specifically for organizations that want to measure before they build. It produces a baseline across the five pillars of platform maturity and identifies which of the three patterns, if any, are already present in the current initiative. For teams that have already launched an IDP and are seeing adoption problems, it surfaces what went wrong and where the leverage points are.
Most platform failures are diagnosable. The evidence is usually there. It just was not collected when it would have been useful.

Mat Caniglia
Founder of Clouditive. 18+ years transforming engineering organizations across LATAM and globally through Developer Experience consulting.