How a Traditional Bank Transformed Its Engineering Without a Big-Bang Rewrite
TL;DR. A 200-person banking engineering organization went from twice-monthly deployment windows to over 300 deployments per month in three years — without a big-bang rewrite and with their compliance function's active support. The mechanism was not a transformation program. It was a sequence: instrument what the existing process actually costs, fix the highest-visibility constraint first, earn credibility, pick the next constraint. The compliance-speed trade-off turned out to be a false one once incident data showed that large-batch deployments were generating most of their incidents.
When the Head of Engineering at this bank described their deployment process to me in our first meeting, she was matter-of-fact about it. Twice a month. Friday nights. A six-hour change window with a 15-person call bridge. An explicit rollback plan documented for every change. An on-call rotation that engineers dreaded.
"We know it's not great," she said. "But this is banking. We can't afford mistakes."
The argument that high-stakes industries require slow, careful, manual deployment processes sounds reasonable until you look at the data. The organizations with the fastest deployment frequency and highest reliability are not startups building side projects. They're companies like Google, Amazon, and Netflix, where the cost of failure is real and the volume of deployments is enormous. The correlation between deployment frequency and system reliability runs counter to intuition: deploying more often, with smaller changes, produces fewer failures, not more.
Convincing the bank's leadership of this was not a conversation. It was a 14-month process.
Why we started with the deployment process — even though it was not the highest-leverage problem
The strategy from the beginning was not to propose a transformation. It was to fix specific, visible problems.
The first target was the deployment process itself. Not because it was the highest-leverage problem, but because it was the most visible and universally resented. When engineers spend every other Friday night on a six-hour call, they know the process is broken. That shared frustration is organizational energy that can be redirected.
We started by instrumenting what was actually happening during those deployment windows. Of the six hours, roughly four were waiting: for health checks to pass, for someone to confirm a service was stable, for someone else to get off the previous call to join this one. Only about 90 minutes was actual deployment work.
The first improvement was automated health checks. This sounds trivial. It saved an average of 90 minutes per deployment window and removed the need for three specific people to be present on the call. Within two cycles, the mood around deployments had shifted perceptibly. Engineers who had been vocal skeptics about process changes were asking when we'd tackle the next thing.
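To make "automated health checks" concrete, here is a minimal sketch of the kind of gate that replaces a person confirming stability on the call. It is not the bank's implementation: the function name `wait_until_healthy`, the probe callables, and the timeout values are all assumptions for illustration. In practice each probe would call a service's own health endpoint.

```python
import time
from typing import Callable, Dict


def wait_until_healthy(
    probes: Dict[str, Callable[[], bool]],
    timeout_s: float = 300.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, bool]:
    """Poll every probe until all pass or the timeout expires.

    `probes` maps a (hypothetical) service name to a zero-argument callable
    that returns True when the service is healthy. Returns the last observed
    status per service so the caller can decide whether to roll back.
    """
    deadline = clock() + timeout_s
    status = {name: False for name in probes}
    while True:
        for name, probe in probes.items():
            try:
                status[name] = bool(probe())
            except Exception:
                # A probe that raises counts as unhealthy rather than
                # crashing the whole check.
                status[name] = False
        if all(status.values()) or clock() >= deadline:
            return status
        sleep(interval_s)
```

The point of the design is that nobody waits on a bridge for a verbal "looks stable": the pipeline either sees every probe pass within the timeout or it has a per-service record of what did not.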
That early win accomplished something beyond the time saved. It established that improvement was possible without a complete rewrite of the process. The engineers who had been most cynical about change, the ones who had been through previous "transformation" initiatives that delivered slide decks rather than results, saw something specific and measurable improve and updated their priors accordingly. That credibility was the prerequisite for everything that followed.
The constraint-by-constraint sequence that made each improvement possible
What followed over the next 18 months was a deliberate sequence: pick the constraint that most limits deployment confidence, fix it, measure the improvement, pick the next constraint.
Automated testing was the second major investment. The codebase had some tests, but they were slow, flaky, and not treated as blocking. A test suite that takes 45 minutes to run and fails randomly 20% of the time is not a safety net. It's a tax. We worked with the teams to identify the highest-value test coverage and made test reliability a hard constraint: if a test was flaky, it was disabled and addressed before being re-enabled.
The flaky test policy was the most controversial change of the entire transformation. Engineers had become accustomed to ignoring test failures they'd seen before. The discipline of treating every test failure as meaningful, rather than as background noise, required a behavioral shift that some engineers found frustrating in the short term. But the data was unambiguous: as flaky tests were eliminated, false positives dropped, and engineers started trusting the test results again. The build was no longer a green light you ignored or a red light you investigated based on whether you recognized the failure pattern. It was a signal.
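One way to make the flaky test policy mechanical rather than a matter of opinion is to define "flaky" from run history. The sketch below assumes a definition the article does not spell out: a test is flaky if it both passed and failed on the same commit. The names `RunRecord` and `find_flaky_tests` are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple, Set


class RunRecord(NamedTuple):
    """One execution of one test against one commit (hypothetical shape)."""
    test: str
    commit: str
    passed: bool


def find_flaky_tests(runs: Iterable[RunRecord]) -> Set[str]:
    """Return the tests that both passed and failed on the same commit.

    A test that fails on one commit and passes on another is not flagged:
    the code changed, so the differing result may be legitimate.
    """
    outcomes = defaultdict(set)
    for run in runs:
        outcomes[(run.test, run.commit)].add(run.passed)
    return {
        test
        for (test, _commit), seen in outcomes.items()
        if seen == {True, False}
    }
```

Anything this returns would go on the quarantine list: disabled, fixed, then re-enabled. Everything that remains blocking is, by construction, a test whose failure means something.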
After testing came observability. The bank had logging, but correlating a production incident to a specific deployment required hours of forensic work. We introduced structured logging and basic distributed tracing for the five services responsible for the highest-volume customer flows. After that, when something broke, the on-call engineer could typically identify the source within minutes instead of hours. Mean time to restore dropped by about 65% on those services.
The choice to start with five services rather than the full estate was deliberate. The observability work would not have been credible as a broad initiative. By demonstrating dramatic improvement on the highest-risk services first, we created a business case for expanding the investment that was grounded in actual incident data rather than theoretical benefits.
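The structured logging described above is what turns "which deployment caused this?" from forensic work into a query. As a rough sketch using only Python's standard `logging` module: every log line is emitted as JSON and stamped with the service and deployment that produced it. The field names (`service`, `deployment_id`, `trace_id`) and the helper `build_logger` are assumptions for illustration, not the bank's schema.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "service": getattr(record, "service", None),
            "deployment_id": getattr(record, "deployment_id", None),
            "trace_id": getattr(record, "trace_id", None),
            "message": record.getMessage(),
        }
        return json.dumps(payload)


class DeploymentContext(logging.Filter):
    """Stamp every record with the service and deployment that emitted it."""

    def __init__(self, service: str, deployment_id: str) -> None:
        super().__init__()
        self.service = service
        self.deployment_id = deployment_id

    def filter(self, record: logging.LogRecord) -> bool:
        record.service = self.service
        record.deployment_id = self.deployment_id
        return True  # never drop the record, only annotate it


def build_logger(service: str, deployment_id: str, stream=None) -> logging.Logger:
    """Hypothetical helper: a logger whose output is tied to one deployment."""
    logger = logging.getLogger(f"{service}.{deployment_id}")
    logger.handlers.clear()
    logger.filters.clear()
    handler = logging.StreamHandler(stream)  # defaults to stderr
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    logger.addFilter(DeploymentContext(service, deployment_id))
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A call such as `build_logger("payments", "deploy-42").info("transfer failed", extra={"trace_id": "t-1"})` then produces a line that can be filtered by `deployment_id` directly, with the per-request `trace_id` linking it to the trace.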
Deployment frequency followed naturally. When you trust your tests, when each deployment is quick and largely automated, and when you can detect and respond to failures fast, there is no reason to batch changes into a twice-monthly window. A big-batch deployment carries far more risk than a small, isolated change, because more things change at once and any failure is harder to trace to its cause. Once the team had internalized this by watching it work, rather than from a presentation, the change in behavior was self-sustaining.
The organizational changes — working with compliance, not around it
The technical changes were the easier part. The organizational changes were slower and more uncomfortable.
Change Advisory Board (CAB) processes exist in highly regulated industries for legitimate reasons. Risk management is not theater; there are real compliance requirements around documentation and approval for certain classes of changes. The challenge was that the CAB had become a default bottleneck for everything rather than a targeted control for high-risk changes.
Working with the risk and compliance team rather than around them was essential. We helped them develop a risk tiering model: low-risk changes (updates to internal services with no customer data exposure) could be deployed with standard automated controls. Medium-risk changes required peer review and a deployment plan. High-risk changes retained the full change management process. The result was that roughly 70% of deployments moved to a fast track, while the CAB retained meaningful control over the 30% of changes where the oversight was actually warranted.
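A tiering model like this is easiest to defend when it is written down as rules rather than left to judgment on each change. The sketch below encodes the three tiers and their controls as described above. The low-risk criterion comes from the text; the `flagged_high_risk` attribute is a placeholder, because the actual criteria for high-risk changes were the compliance team's to define and are not given here.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class RiskTier(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


@dataclass(frozen=True)
class Change:
    internal_only: bool          # touches only internal services
    touches_customer_data: bool  # any customer data exposure
    flagged_high_risk: bool      # placeholder for the real high-risk criteria


# Controls per tier, as described in the text.
REQUIRED_CONTROLS = {
    RiskTier.LOW: ("standard automated controls",),
    RiskTier.MEDIUM: ("peer review", "deployment plan"),
    RiskTier.HIGH: ("full change management process",),
}


def classify(change: Change) -> RiskTier:
    """Assign a tier; anything not clearly low or high defaults to medium."""
    if change.flagged_high_risk:
        return RiskTier.HIGH
    if change.internal_only and not change.touches_customer_data:
        return RiskTier.LOW
    return RiskTier.MEDIUM


def required_controls(change: Change) -> Tuple[str, ...]:
    return REQUIRED_CONTROLS[classify(change)]
```

Checking the high-risk flag first is a deliberate choice in this sketch: a change can never reach the fast track simply because it also happens to be internal.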
This required compliance leadership to accept that "every change goes through CAB" was not equivalent to "high-risk changes are well-controlled." That was not an easy conversation. But it became possible once the CISO saw the data: the high-friction process had not prevented any of the incidents in the past 24 months and had caused several through deployment complexity.
The key was framing the change in terms of risk management rather than speed. The CAB's mandate was risk reduction, not change approval. If the existing process was not reducing risk effectively, it was failing its mandate regardless of how many changes it reviewed. That framing opened a conversation that "we want to deploy faster" would not have.
The culture shift: when "deployment" stopped being a word that caused a stress response
Three years of sustained transformation effort produced predictable changes in deployment metrics. The culture change lagged the technical changes and was hard to see until it was unmistakable.
The signal I look for when assessing whether engineering culture has genuinely shifted is how the team talks about deployments in casual conversation. Before the transformation, the word "deployment" at this bank produced a visible stress response in engineers. It was associated with Friday nights, six-hour calls, things going wrong, and being accountable for failures in front of a large audience.
Two years into the transformation, deployments were mentioned in passing. "Oh, we deployed that fix yesterday." No elaboration required. The same event that had been a significant organizational milestone had become a routine operational act.
This change in how engineers relate to their work is not just cultural; it is economically significant. Teams that fear deployment avoid it, batch changes, and accumulate risk. Teams that treat deployment as a routine act deploy confidently, deploy frequently, and catch problems earlier. The culture change is what sustains the technical improvements after the consulting engagement ends.
What the numbers looked like three years in
Deployments per month went from approximately 8 to over 300. Change failure rate dropped from an estimated 25% to under 4%. Mean time to restore went from eight hours to under 45 minutes. The Friday night deployment window is gone.
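Numbers like these are only comparable before and after if they are computed the same way each time. As a rough sketch of how change failure rate and mean time to restore could be derived from deployment records: the `Deployment` record shape is hypothetical, and measuring restore time from the moment of deployment is a simplification (a stricter definition would measure from when the incident began).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Sequence


@dataclass(frozen=True)
class Deployment:
    deployed_at: datetime
    failed: bool = False                    # did this change cause a failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def change_failure_rate(deploys: Sequence[Deployment]) -> float:
    """Fraction of deployments that caused a failure (0.0 if there were none)."""
    if not deploys:
        return 0.0
    return sum(d.failed for d in deploys) / len(deploys)


def mean_time_to_restore(deploys: Sequence[Deployment]) -> Optional[timedelta]:
    """Average deploy-to-restore time over failed, restored deployments."""
    durations = [
        d.restored_at - d.deployed_at
        for d in deploys
        if d.failed and d.restored_at is not None
    ]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)
```

Whatever definitions an organization settles on, the important part is that the same functions run against the "before" data and the "after" data.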
The more interesting change is cultural. When I talk to engineers at that bank now, they describe their work differently. They're not waiting for a deployment window. They're not afraid of releasing. The deployment process is not an event to be survived. It's a non-event that happens several times a day.
That shift in experience, from deployment as a high-stakes ritual to deployment as a routine, unremarkable act, is what sustained improvement looks like. You can't produce it by announcing it. You can only produce it by making it true, one fixed constraint at a time.
The bank's Head of Engineering described the outcome this way: "We used to think that being careful meant being slow. Now we understand that being slow was itself a form of carelessness. We were accumulating risk with every large batch, every manual step, every deployment window that required 15 people to coordinate. The new way is more careful, not less."
How the CISO and CRO came to see automated deployment pipelines as better risk management than manual controls
One of the most instructive moments in the transformation was a conversation with the bank's chief risk officer (CRO) about six months into the process. She had been skeptical initially, concerned that the movement toward more frequent deployments contradicted the regulatory expectation of rigorous change management.
The conversation shifted when she reviewed the incident data from the two years before the transformation began. During that period, 15 significant production incidents had occurred. Of those 15, 11 had been directly caused by large batch deployments: services with unexpected interdependencies, manual steps executed incorrectly under time pressure, changes that had been in the pipeline too long and were no longer well-understood by anyone still at the organization.
The large batch deployment window that felt like the safe option was actually the mechanism generating most of their incidents. This was not intuitive but was empirically clear from the data.
The regulatory framework was not designed to require large batch deployments. It was designed to require evidence of risk management. A well-instrumented, highly automated deployment pipeline with comprehensive testing, automated rollback capability, and observable deployment outcomes provides better evidence of risk management than a 15-person call bridge reviewing a checklist of manual steps. Once the CISO and CRO understood this distinction, the regulatory conversation changed from an obstacle to a lever.
What the engineers discovered about their own capabilities when the constraints changed
The most unexpected outcome of the transformation was what it revealed about the team's own capabilities. When engineers had been operating in a high-friction, twice-monthly deployment cycle, many of them had developed a set of beliefs about what was possible that were shaped by the constraints they had always worked within.
They believed that deployments were inherently risky and required extensive manual verification. They believed that high-stakes systems required slow processes. They believed that the complexity of their codebase made automation difficult to implement reliably.
Each of these beliefs was reasonable given the environment they had been working in. And each was wrong in ways that the transformation revealed experientially.
The engineers who had been most skeptical of the changes became, in many cases, the most enthusiastic advocates after they experienced the new process. Pushing a change on a Tuesday afternoon, watching the automated tests pass, deploying to production, seeing the health checks pass, and moving on to the next task was qualitatively different from attending a six-hour call. They knew this was better not because someone told them so, but because they had lived both versions.
This experiential learning is not something that can be transmitted through training or documentation. It can only be created by building the environment and letting people work in it. The transformation succeeded in large part because it created the conditions for engineers to update their beliefs through direct experience rather than through persuasion.
The cost accounting exercise that changed the investment conversation
One of the most valuable exercises we did at the beginning of the bank engagement was a cost accounting of the existing deployment process: not just the direct engineering time, but the full cost, including coordination overhead, the productivity recovery time after interrupted work, the incidents caused by deployment complexity, and the retention impact of engineers who found the on-call rotation unsustainable.
The total came to significantly more than leadership had assumed. The Friday night deployment windows alone were consuming approximately 800 engineering-hours per year in direct participation time. The incidents caused or worsened by the large-batch deployment process were adding roughly 600 hours of remediation work annually. The turnover in the on-call rotation, higher than the industry average and partially attributable to the on-call burden, was producing approximately $300,000 per year in recruiting and onboarding costs.
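The arithmetic behind that kind of total is simple once the inputs are on the table. The sketch below uses only the three figures quoted above; it leaves out coordination overhead and recovery time after interrupted work, which were part of the full accounting but are not quantified here. The loaded hourly rate is a placeholder parameter, not a figure from the engagement.

```python
# Figures quoted in the text (per year).
WINDOW_HOURS = 800          # direct participation in Friday-night windows
REMEDIATION_HOURS = 600     # incidents caused or worsened by large batches
TURNOVER_COST_USD = 300_000 # recruiting and onboarding from on-call attrition


def annual_status_quo_cost(loaded_hourly_rate_usd: float) -> float:
    """Annual cost of the old process for a given (assumed) loaded hourly rate."""
    engineering_hours = WINDOW_HOURS + REMEDIATION_HOURS
    return engineering_hours * loaded_hourly_rate_usd + TURNOVER_COST_USD


if __name__ == "__main__":
    # 100 USD/hour is purely illustrative; substitute your own loaded rate.
    print(annual_status_quo_cost(100))
```

With the illustrative rate of 100 USD per hour, the 1,400 engineering-hours contribute 140,000 USD, and the total comes to 440,000 USD per year before any of the unquantified costs are counted.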
When the Head of Engineering brought this analysis to the CTO, the conversation changed from "can we afford to invest in improving the deployment process?" to "can we afford not to?" The investment required to automate the deployment pipeline and build the CI infrastructure to support it was a fraction of the annual cost of the status quo.
This cost accounting exercise is the work that most organizations skip because it is uncomfortable to do. Making the cost of a broken engineering process legible requires acknowledging that the process was broken and that the cost was being paid silently for years. But the discomfort is worth it. The investment conversation on the other side of that acknowledgment is fundamentally different from the investment conversation that does not have the data.
Why compliance and deployment speed are not in tension — especially in regulated industries
The most persistent myth in regulated-industry engineering is that compliance and deployment speed are in fundamental tension: that moving faster inherently means taking on more regulatory risk. The bank's experience shows this is false, and DORA data across regulated industries broadly supports that conclusion.
The compliance requirements that regulators care about in software deployment are traceability, auditability, and evidence of risk management. An automated deployment pipeline that requires every change to pass a defined set of automated tests, that records every deployment with a full audit trail, and that requires explicit approval for high-risk changes provides better compliance evidence than a manual process whose actual execution varies with the individuals involved.
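To illustrate why an automated gate produces better evidence than a checklist, here is a minimal sketch of a pipeline step that enforces the two rules named above (tests must pass, high-risk changes need explicit approval) and writes an audit entry for every decision, allowed or not. The record shape, the `evaluate` function, and the in-memory list standing in for an audit store are all assumptions for illustration.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import List, Optional


@dataclass(frozen=True)
class DeploymentRequest:
    change_id: str
    author: str
    tests_passed: bool
    high_risk: bool
    approved_by: Optional[str] = None


def evaluate(
    request: DeploymentRequest,
    audit_log: List[str],
    now: Optional[datetime] = None,
) -> bool:
    """Decide whether a change may deploy and record the decision.

    Every evaluation is appended to `audit_log` as a JSON line, including
    refusals, so the trail shows what was blocked as well as what shipped.
    """
    reasons = []
    if not request.tests_passed:
        reasons.append("automated tests did not pass")
    if request.high_risk and not request.approved_by:
        reasons.append("high-risk change lacks explicit approval")

    allowed = not reasons
    entry = {
        **asdict(request),
        "allowed": allowed,
        "reasons": reasons,
        "evaluated_at": (now or datetime.now(timezone.utc)).isoformat(),
    }
    audit_log.append(json.dumps(entry, sort_keys=True))
    return allowed
```

The same rules run identically for every change, and the log records who requested what, what was checked, and who approved it. A manual process can aim for that consistency but cannot demonstrate it.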
The banks, insurance companies, and healthcare organizations that have moved to continuous delivery have not done so by reducing compliance standards. They have done so by demonstrating to their compliance functions that automated controls provide better risk management than manual controls. This argument is now supported by years of industry data. The organizations that have made this argument successfully have compliance leadership that understands software risk management empirically rather than procedurally.
Regulated industries are not exceptions to the principle that small, frequent deployments are safer than large, infrequent ones. They are the industries where that principle matters most and where the case for it is most clearly supported by incident data.
Frequently asked questions
How do you start a DevOps transformation in a risk-averse organization?
Start with the constraint that is most visible and most universally resented — not necessarily the highest-leverage one. At this bank, the Friday-night deployment windows were universally understood to be broken. Fixing something specific and visible in the first two cycles earned credibility with the engineers who had been through previous transformation initiatives that delivered slide decks instead of results. That credibility is the prerequisite for everything harder.
Can a bank or regulated organization actually achieve continuous delivery?
Yes, and the compliance function can be a lever rather than an obstacle. The key is reframing: compliance requirements are about traceability, auditability, and evidence of risk management — not about manual processes or slow deployments. An automated pipeline that requires every change to pass defined tests, records every deployment with a full audit trail, and requires explicit approval for high-risk changes provides better compliance evidence than a manual 15-person call bridge. Once the CISO sees incident data showing that large-batch deployments caused most of the previous incidents, the conversation changes.
How long does a genuine DevOps transformation take?
The bank went from twice-monthly deployment windows to over 300 deployments per month in three years. The technical changes were the faster part. The culture change — the point where engineers mentioned deployments in passing rather than with visible stress — took about two years. The signal I use to assess whether culture has genuinely shifted is how the team talks about deployments in casual conversation, not in retrospectives or surveys.
What change failure rate improvement is realistic to expect?
At this bank, change failure rate dropped from an estimated 25% to under 4% over three years. The mechanism was the same as the deployment frequency improvement: fix the constraint that generates the most failures (large-batch deployments with manual steps and poor test coverage), then measure, then fix the next constraint. DORA research across industries supports the finding that high-performing organizations sustain change failure rates below 15% while deploying at high frequency.
Related reading
- DORA metrics implementation guide — the metrics that tracked the bank's improvement: deployment frequency, change failure rate, lead time, and mean time to restore, with consistent definitions.
- Why DORA metrics are lying to you — signal integrity in platform engineering — the signal integrity work that must accompany the measurement: ensuring the before and after numbers are using the same definitions.
- SRE for growth-stage engineering — when you need it and what to build first — the reliability foundation (actionable alerting, runbooks, blameless postmortems) that the bank built as part of the same sequence.
- Platform engineering ROI — what to measure and how to defend it — the cost accounting exercise the bank ran (800 engineering-hours in deployment windows, $300k in on-call attrition) is exactly the ROI methodology described here.
If your organization is in a similar place, the path forward is not a transformation roadmap. It is a diagnostic: understanding where the real constraints are and which one to address first. Start there. For more on the DORA metrics that track this kind of improvement, read the complete implementation guide.

Mat Caniglia
Founder of Clouditive. 18+ years transforming engineering organizations across LATAM and globally through Developer Experience consulting.