Failed software implementations have signatures. By the end of the first week of a vendor recovery engagement, experienced engineers can identify whether the codebase is salvageable or whether a rebuild is the only path to production. The distinction matters because the answer determines the recovery architecture — and a misdiagnosis in the first two weeks is the most common reason a 12-week recovery takes 18 months.
These are the eight failure patterns we've seen across vendor recovery engagements, and the triage framework for each.
The Eight Failure Patterns
Pattern 1: The Scope Creep Collapse. The original scope was deliverable. The delivered scope is not. The codebase has 3-5x the original feature surface, none of it complete, and the core functionality is buried under half-built extensions. Diagnosis: map what was in the original SOW. Everything outside it is a write-off unless it's load-bearing for the core. Prognosis: salvageable if the core is intact.
Pattern 2: The Integration Fantasy. The vendor built the application assuming integrations would behave in ways they don't. The data model assumes a format the upstream system doesn't produce. The API calls assume response structures the provider never returns. Diagnosis: test every integration contract in the first week. Prognosis: usually salvageable if the application logic is sound, but integration rebuild can be 40-60% of recovery effort.
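The first-week contract test can be as small as a shape check against a captured upstream response. A minimal sketch, assuming a JSON payload; the field names and the `check_contract` helper are illustrative, not from any particular system:

```python
# Minimal integration contract check: does the upstream response actually
# match the shape the application's data model assumes?
import json

def check_contract(response_body: str, required_fields: dict) -> list:
    """Return a list of contract violations (missing or mistyped fields)."""
    payload = json.loads(response_body)
    violations = []
    for field, expected_type in required_fields.items():
        if field not in payload:
            violations.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            violations.append(
                f"{field}: expected {expected_type.__name__}, "
                f"got {type(payload[field]).__name__}"
            )
    return violations

# The application assumes 'customer_id' is an int; the upstream sends a string.
upstream = '{"customer_id": "C-1042", "created_at": "2024-01-03"}'
print(check_contract(upstream, {"customer_id": int, "created_at": str}))
# → ['customer_id: expected int, got str']
```

Run one of these per integration in week one and you have a concrete inventory of which contracts are fantasy, before any application code is touched.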
Pattern 3: The Compliance Afterthought. The system works, but compliance was "for the next sprint" for 18 sprints. Audit logs don't exist. Access controls are role-based in the UI but not in the database. Encryption is present in production but not enforced in staging. Diagnosis: run the compliance checklist for the applicable framework in the first two days. Prognosis: salvageable, but compliance retrofit adds 4-6 weeks to any recovery timeline.
Pattern 4: The Performance Cliff. The system works with synthetic data in demo environments. It fails under realistic data volumes. Diagnosis: load test with production-scale data immediately. Prognosis: depends on whether the performance problem is architectural (data model, N+1 queries, synchronous processing where async is required) or implementation (missing indexes, unoptimised queries). Architectural performance problems often require partial rebuild.
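The architectural-versus-implementation distinction often shows up directly in query counts. A sketch of the N+1 signature using an in-memory SQLite database; table and column names are illustrative:

```python
# Demonstrates why N+1 queries are an architectural performance problem:
# query count grows linearly with data volume, so demos pass and production fails.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
""")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(i, f"c{i}") for i in range(100)])
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(i, i % 100) for i in range(1000)])

query_count = 0
def q(sql, params=()):
    global query_count
    query_count += 1
    return conn.execute(sql, params).fetchall()

# N+1 shape: one query for the orders, then one per order for its customer.
orders = q("SELECT id, customer_id FROM orders")
for _oid, cid in orders:
    q("SELECT name FROM customers WHERE id = ?", (cid,))
n_plus_one = query_count  # 1001 queries for 1000 orders

query_count = 0
# The fix is a single join: query count stays flat as data grows.
q("SELECT o.id, c.name FROM orders o JOIN customers c ON c.id = o.customer_id")
joined = query_count  # 1 query
print(n_plus_one, joined)
# → 1001 1
```

With 50 demo rows the N+1 version feels fine; with production volumes it falls off the cliff, which is why the load test must use production-scale data.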
Pattern 5: The Documentation Desert. The system runs. No one knows why. No architecture decisions are documented. No infrastructure is in code — it was clicked through the console. Diagnosis: reverse-engineer the architecture before touching anything. Prognosis: salvageable but high-risk. Any change has unknown blast radius.
Pattern 6: The Dependency Timebomb. The system uses pinned dependencies with known CVEs. Or it uses package versions that are no longer maintained. Or it runs on an EOL runtime version. Diagnosis: automated dependency scan in the first hour. Prognosis: usually fixable, but in regulated environments (PCI DSS, HIPAA, FedRAMP) the dependency update itself must go through a change management process.
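The first-hour scan can start as nothing more than matching pinned versions against an advisory feed. A sketch assuming a requirements.txt-style pin format; the advisory data below is illustrative, not real CVE data, and in practice you would feed this from a scanner such as pip-audit or an OSV database export:

```python
# First-hour dependency triage: flag pinned versions that appear on a
# known-vulnerable list. KNOWN_BAD stands in for a real advisory feed.
import re

KNOWN_BAD = {  # illustrative entries only, not actual advisories
    ("requests", "2.5.0"): "example CVE in requests < 2.20",
    ("pyyaml", "3.12"): "example unsafe-load advisory",
}

def scan(requirements_text: str) -> list:
    """Return (name, version, advisory) for every pinned, flagged package."""
    findings = []
    for line in requirements_text.splitlines():
        m = re.match(r"\s*([A-Za-z0-9_.-]+)==([\w.]+)", line)
        if m:
            key = (m.group(1).lower(), m.group(2))
            if key in KNOWN_BAD:
                findings.append((m.group(1), m.group(2), KNOWN_BAD[key]))
    return findings

reqs = "requests==2.5.0\nflask==2.3.2\nPyYAML==3.12\n"
for name, version, advisory in scan(reqs):
    print(f"{name}=={version}: {advisory}")
```

The output of this scan is also the input to the change management process in regulated environments: a written inventory of what must be updated and why.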
Pattern 7: The Test Void. The system has test files. The tests don't test anything meaningful — snapshot tests of UI state, tests that mock every dependency, tests that always pass. Coverage metrics are meaningless. Diagnosis: run the test suite and examine what it actually asserts. Prognosis: the test void is a symptom, not the problem. The problem it reveals is that the codebase was never validated by anyone who understood the domain.
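The "tests that always pass" signature is easy to demonstrate. A sketch with an illustrative domain function: both tests below run green today, but only the second encodes a business rule that would fail on regression:

```python
# The test-void signature: a suite that runs green without validating anything.
def calculate_late_fee(days_overdue: int) -> float:
    """Illustrative domain rule: $1.50 per day overdue, never negative."""
    return max(0, days_overdue) * 1.50

def test_void_style():
    # Asserts only that something came back: green no matter what the fee is.
    result = calculate_late_fee(10)
    assert result is not None

def test_meaningful():
    # Encodes the actual domain rule: fails if the logic regresses.
    assert calculate_late_fee(0) == 0.0
    assert calculate_late_fee(10) == 15.0
    assert calculate_late_fee(-3) == 0.0  # no negative fees

test_void_style()
test_meaningful()
print("both suites green")
```

A suite full of the first kind can report 90% coverage while validating nothing, which is why the diagnosis step reads the assertions, not the coverage number.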
Pattern 8: The Vendor Abandonment. The previous vendor has disappeared, become unresponsive, or is holding code or credentials hostage. Diagnosis: establish what you actually control — code, infrastructure credentials, domain registrations, SSL certificates, deployment pipelines. Prognosis: depends on what you own. If you own the code and credentials, recovery is straightforward. If you don't, recovery requires rebuilding from a dump of production data.
The salvageability decision is not primarily technical — it's economic. A codebase that would take 14 weeks to bring to production through recovery might take 10 weeks to rebuild. The rebuild gives you a clean architecture, proper compliance foundations, and maintainable code. The recovery gives you faster deployment of a system that still has the original technical debt. The right answer depends on the organisation's risk appetite and the cost of the additional 4 weeks.
The 12-Week Recovery Architecture
- Week 1-2: Triage and environment stabilisation. Own the credentials. Freeze deployments from the previous vendor. Document what exists.
- Week 2-3: Compliance and security gap analysis. Run the checklist. Identify what must be fixed before production can go live.
- Week 3-6: Core feature stabilisation. Fix the integration contracts. Fix the performance issues. Do not add features.
- Week 6-8: Compliance remediation. Implement the controls that were missing. This is non-negotiable in regulated environments.
- Week 8-10: Testing and validation. Build the test coverage that was missing. Not to hit a coverage number — to validate that the system does what it must do.
- Week 10-11: Staged deployment. Production with limited user population. Monitor aggressively.
- Week 11-12: Full production and handover. Documentation, runbooks, incident response procedures.