The Algorithm — Insights: Cloud Architecture
financial-services · 14 min read · 2025-10-22

Core Banking Modernization Without Downtime: A Migration Playbook

Core banking replacement projects have a poor track record. High-profile failures at TSB, CommBank, and others have made boards cautious and regulators watchful. The engineering challenge is genuine: migrating decades of accumulated business logic, customer data, and real-time transaction processing without a single moment of customer-visible failure. This article presents the strangler fig, parallel ledger, and event sourcing patterns that make zero-downtime core migration achievable rather than aspirational.

The core banking system is the ledger at the centre of every bank. It records every account, every transaction, every balance, and every relationship between them. Most banks are running core systems that were implemented between 1970 and 2005, layered with decades of customisation, interfacing with hundreds of downstream systems, and processing transactions around the clock. Replacing them is the most consequential infrastructure project in financial services, and the failure rate is high enough that several regulators have begun requiring banks to demonstrate operational resilience specifically in the context of technology change programmes of this scale.

Why Core Banking Projects Fail

The canonical failure mode is the big-bang cutover. The bank runs the old and new systems in parallel for a defined period, then switches all traffic to the new system in a single event — typically over a long weekend. The TSB migration of April 2018 is the most publicly documented example of what can go wrong: 1.9 million customers locked out of their accounts for days, weeks of degraded service, a fine from the FCA and PRA of 48.65 million pounds, and a permanent reputational cost. The engineering post-mortem identified insufficient testing, inadequate data migration validation, and a cutover plan that did not account for the failure modes of the new system under production load as the primary causes.

The underlying issue is not that big-bang cutovers are uniquely risky — it is that they concentrate all the risk into a single irreversible event. If the cutover fails, the rollback plan is always more complex and more damaging than the plan anticipated, because the migration has changed state on both sides. Banks that plan for rollback as a last resort rather than an engineering requirement consistently find it unavailable when they need it.

The Engineering Reality

Regulators, particularly the PRA and ECB, now expect to see evidence of operational resilience in core banking change programmes before they proceed to cutover. The question is not just whether the new system can handle production load, but whether the bank can tolerate the new system failing, can identify the failure within its impact tolerance, and can recover within that tolerance window. Most core banking project plans do not address this.

The Strangler Fig Pattern for Core Banking

The strangler fig pattern — taking its name from the tree that gradually surrounds and replaces a host tree — migrates functionality incrementally rather than all at once. In a core banking context, individual product lines or business functions are migrated to the new platform while the legacy system continues to handle everything else. Traffic is routed to the appropriate system based on product type or customer segment, with a routing layer (sometimes called an integration bus or middleware hub) sitting between the channels and the two core systems.

This approach requires careful API design: every product function the new core handles must also be callable through the legacy interface for systems that have not yet been updated to call the new system directly. This creates a brief period of dual-interface maintenance but eliminates the need to migrate all downstream integrations before any customer can be moved. The strangler fig is slower than a big-bang cutover but allows the bank to demonstrate the new system working in production before committing to it fully.
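The routing layer described above can be sketched in a few lines. This is a minimal illustration with hypothetical names (`CoreRouter`, `PaymentRequest`, `migrate_product` are inventions for this sketch, not a real product's API): requests are dispatched to the legacy or new core based on product type, so individual product lines can be flipped to the new platform independently.

```python
# Hypothetical sketch of a strangler-fig routing layer for a core migration.
# Requests are dispatched to the legacy or new core based on product type.
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class PaymentRequest:
    account_id: str
    product_type: str   # e.g. "current_account", "mortgage"
    amount_cents: int

class CoreRouter:
    def __init__(self,
                 legacy_core: Callable[[PaymentRequest], Any],
                 new_core: Callable[[PaymentRequest], Any]):
        self.legacy_core = legacy_core
        self.new_core = new_core
        self.migrated_products: set[str] = set()

    def migrate_product(self, product_type: str) -> None:
        """Flip a single product line to the new core."""
        self.migrated_products.add(product_type)

    def route(self, request: PaymentRequest) -> Any:
        # Anything not explicitly migrated stays on the legacy system.
        if request.product_type in self.migrated_products:
            return self.new_core(request)
        return self.legacy_core(request)

# Usage: only mortgages have been migrated so far.
router = CoreRouter(legacy_core=lambda r: ("legacy", r.account_id),
                    new_core=lambda r: ("new", r.account_id))
router.migrate_product("mortgage")
```

The key design property is that migration is a routing-table change, not a deployment: moving a product line back to the legacy core is the same one-line operation in reverse, which is what makes rollback an engineering primitive rather than a contingency plan.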

Parallel Ledger and Reconciliation

An alternative to incremental functional migration is running parallel ledgers — maintaining equivalent account and transaction data in both the legacy and new core systems simultaneously, with automated reconciliation to verify that they agree. The parallel ledger approach allows the new system to be validated at scale before it becomes the system of record, because every transaction is processed by both systems and the results are compared.

The engineering challenge is the reconciliation framework itself. Ledger positions must agree to the cent, in real time, across all account types. Differences must be investigated and resolved before the end of day. The tolerance for unresolved reconciliation breaks must be zero for a migration of this kind — any unexplained difference is potentially a data integrity issue in the new system. Building a reconciliation framework with this level of rigour requires significant engineering investment, but it is the investment that allows the migration to proceed with evidence rather than faith.

Event Sourcing as a Migration Foundation

Event sourcing — the practice of persisting all state changes as an immutable sequence of events rather than updating a single current-state record — provides a particularly useful foundation for core banking migration. If the current-state balance in any account can be reconstructed by replaying the transaction event log, the migration question becomes: can the new system correctly replay the event history and arrive at the same current state as the legacy system? This is a testable, automatable assertion. Banks that have introduced event sourcing into their core platform before beginning the migration have found it dramatically simplifies the validation problem.
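The replay assertion can be expressed directly in code. This is a deliberately simplified sketch with assumed event shapes (real core banking events carry far more structure): fold the immutable event log into current-state balances and assert they match the legacy system of record.

```python
# Sketch of the event-sourcing replay assertion (assumed event shapes).
# Replaying the full event log must reproduce the legacy balances exactly.
from collections import defaultdict

events = [
    {"account": "ACC-1", "type": "deposit",    "amount_cents": 50_000},
    {"account": "ACC-1", "type": "withdrawal", "amount_cents": 12_500},
    {"account": "ACC-2", "type": "deposit",    "amount_cents": 7_000},
]

def replay(event_log: list[dict]) -> dict[str, int]:
    """Fold the immutable event log into current-state balances."""
    balances: dict[str, int] = defaultdict(int)
    for e in event_log:
        sign = 1 if e["type"] == "deposit" else -1
        balances[e["account"]] += sign * e["amount_cents"]
    return dict(balances)

# Balances held by the legacy system of record (illustrative figures).
legacy_balances = {"ACC-1": 37_500, "ACC-2": 7_000}

# The testable, automatable migration assertion.
assert replay(events) == legacy_balances
```

Because the assertion is mechanical, it can run continuously against the full account base during the parallel period rather than against a sampled subset, which is what turns validation from an audit exercise into a build gate.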

Regulatory Notification and Change Management

In the UK, the PRA's Supervisory Statement SS3/19 requires firms to notify the regulator in advance of material changes to their technology infrastructure, including core banking replacements. The ECB's SREP process includes technology risk as an explicit assessment dimension. Regulators expect to see a detailed testing strategy, a rollback plan with defined decision points, an operational resilience assessment covering the impact tolerance for the service, and post-go-live monitoring protocols. Banks that engage with their supervisors early in the programme design phase — sharing the migration methodology and the evidence that supports it — find the regulatory engagement less adversarial than those who present a cutover plan as a fait accompli.
