The FCA's Consumer Duty (PS22/9, effective July 31, 2023) created a new standard for financial services firms: the requirement to deliver good outcomes for retail customers. When those products and services are delivered through AI — algorithmic credit decisions, AI-driven insurance pricing, robo-advisory investment recommendations — the Consumer Duty's fair outcomes requirement creates specific technical obligations that most ML engineering teams have not designed for.
PS23/16, published November 2023, provides FCA guidance on Consumer Duty implementation for existing products and services. The guidance makes explicit that firms using AI to make decisions that affect consumers must be able to demonstrate that the AI's decision-making process does not produce outcomes that are unfair to consumers. "Demonstrate" in an FCA examination context means providing evidence that satisfies a technically competent examiner — not an executive presentation, and not a SHAP plot.
What the FCA Means by Explainability
The PRA's model risk management supervisory statement (SS1/23, published May 2023) defines model explainability requirements in terms of the model's intended use. For a credit decision model, explainability means: the firm can explain to a specific customer, in plain language, why their application was declined and what they could do to improve their outcome. A feature importance score showing "credit utilization was the most important feature" does not explain why this customer's specific credit utilization level led to a decline in terms the customer can understand and act upon.
The technical architecture that satisfies this: counterfactual explanations, implemented at the inference layer, that produce specific actionable guidance for each declined decision. "If your credit card utilization were below 30%, your application would have been approved" is a counterfactual explanation. "Credit utilization contributed 0.23 to the model output" is not. Building counterfactual explanation generation requires either a model architecture that supports it natively (decision trees, rule-based models) or a post-hoc counterfactual generation layer (DiCE, Alibi) integrated into the inference pipeline.
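The shape of per-decision counterfactual generation can be sketched in a few lines. This is a minimal illustration against a hypothetical scorecard-style model with invented weights and an invented approval threshold — a production pipeline would run a library such as DiCE against the real model, but the contract is the same: for each decline, search for the smallest feature change that flips the decision and phrase it as customer-facing guidance.

```python
# Minimal counterfactual sketch. The scoring function, weights, and
# threshold are all illustrative assumptions, not a real scorecard.

def credit_score(features: dict) -> float:
    """Toy scoring model (illustrative weights only)."""
    return (
        0.6 * (1 - features["utilization"])             # lower utilization helps
        + 0.3 * min(features["history_years"], 10) / 10  # capped history credit
        + 0.1 * (1 if features["on_time_payments"] else 0)
    )

APPROVAL_THRESHOLD = 0.56  # illustrative cut-off

def counterfactual_utilization(features: dict, step: float = 0.01):
    """Search for the utilization level at which the decision flips.

    Returns an actionable, customer-facing explanation string, or None
    if the applicant is already approved or no utilization change alone
    would flip the decision.
    """
    if credit_score(features) >= APPROVAL_THRESHOLD:
        return None  # already approved; no counterfactual needed
    trial = dict(features)
    u = features["utilization"]
    while u > 0:
        u = round(u - step, 4)  # round each step to avoid float drift
        trial["utilization"] = u
        if credit_score(trial) >= APPROVAL_THRESHOLD:
            return (f"If your credit card utilization were {u:.0%} or lower, "
                    f"your application would have been approved.")
    return None  # utilization alone cannot flip this decision

applicant = {"utilization": 0.80, "history_years": 4, "on_time_payments": True}
print(counterfactual_utilization(applicant))
```

The key design point is that the search runs at inference time, per decision, and the output is logged alongside the decision itself — that logged pair is the evidence a reviewer later asks for.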
The FCA's fair outcomes requirement applies to the output distribution of AI decisions, not just to individual decisions. A credit model that performs well on average but produces systematically worse outcomes for a protected characteristic group fails the Consumer Duty even if each individual decision can be explained. SS1/23 requires that firms monitor model outputs for bias across protected characteristics on an ongoing basis and maintain evidence of that monitoring. Bias monitoring as a post-hoc audit exercise does not satisfy SS1/23 — it must be continuous, automated, and the results must feed back into the model governance process.
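Continuous monitoring of the output distribution reduces to a small, automatable check: compute per-group approval rates over the decision log and flag any group whose rate falls materially below the best-served group's. A sketch, assuming decisions are logged with a protected-characteristic field — the 0.8 ("four-fifths") ratio used here is an illustrative benchmark from disparate-impact practice, not an FCA-mandated threshold:

```python
# Sketch of automated outcome monitoring across groups. The group
# labels, log format, and 0.8 threshold are illustrative assumptions.

from collections import defaultdict

def approval_rates(decisions):
    """decisions: iterable of (group, approved: bool) pairs."""
    counts = defaultdict(lambda: [0, 0])  # group -> [approved, total]
    for group, approved in decisions:
        counts[group][0] += int(approved)
        counts[group][1] += 1
    return {g: a / t for g, (a, t) in counts.items()}

def disparity_alerts(decisions, min_ratio=0.8):
    """Flag groups whose approval rate falls below min_ratio of the
    best-served group's rate (the disparate impact ratio)."""
    rates = approval_rates(decisions)
    best = max(rates.values())
    return {g: r / best for g, r in rates.items() if r / best < min_ratio}

log = ([("A", True)] * 80 + [("A", False)] * 20
       + [("B", True)] * 55 + [("B", False)] * 45)
print(disparity_alerts(log))  # group B at 0.55/0.80 is below threshold
```

Run on a schedule over a rolling window of decisions, with any non-empty alert dict raising a governance ticket, this is the "continuous, automated, feeds back into governance" loop in executable form.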
The Model Risk Management Framework
SS1/23 defines a three-stage model risk management framework: model development, model validation, and model use. At development, the MRM framework requires documentation of the model's intended use, limitations, and assumptions. At validation, an independent model validation must assess the model's conceptual soundness, data quality, and performance against the intended use. At model use, ongoing performance monitoring must track the model against the validation metrics and trigger a review when performance degrades. For AI models, the "performance" dimension must include fairness metrics, not just accuracy metrics.
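The "model use" stage described above has a natural data structure: a validation baseline recorded at sign-off, and a monitoring job that compares live metrics against it and fires governance triggers on degradation. A sketch with fairness alongside accuracy — the metric names and tolerances here are illustrative assumptions, not values taken from SS1/23:

```python
# Sketch of ongoing performance monitoring against the validation
# baseline. Metric names and tolerances are illustrative assumptions.

from dataclasses import dataclass

@dataclass
class ValidationBaseline:
    auc: float                  # discriminatory power recorded at validation
    approval_rate_ratio: float  # worst-group / best-group approval ratio
    auc_tolerance: float = 0.03  # allowed drift before review
    ratio_floor: float = 0.80    # minimum acceptable fairness ratio

def review_triggers(baseline, observed_auc, observed_ratio):
    """Return the list of governance triggers fired by this monitoring run."""
    triggers = []
    if observed_auc < baseline.auc - baseline.auc_tolerance:
        triggers.append("accuracy_degradation")
    if observed_ratio < baseline.ratio_floor:
        triggers.append("fairness_degradation")
    return triggers

baseline = ValidationBaseline(auc=0.82, approval_rate_ratio=0.91)
print(review_triggers(baseline, observed_auc=0.80, observed_ratio=0.74))
```

The point of the structure is that a fairness breach triggers the same review process as an accuracy breach — neither can degrade silently while the other holds.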
Algorithmic Accountability: The UK GDPR Interaction
UK GDPR Article 22 restricts solely automated decisions that significantly affect individuals — including credit decisions, insurance pricing, and investment recommendations — and gives the individual the right to obtain human intervention, to express their point of view, and to contest the decision. An AI decision system with no human review pathway for contested decisions violates UK GDPR Article 22 regardless of how accurate the model is. The human review pathway must be staffed and operational — not just documented in the privacy notice.
- Implement counterfactual explanation generation at the inference layer — not post-hoc analysis, but per-decision explanation generation
- Deploy automated bias monitoring across all Consumer Duty-relevant protected characteristics — make it continuous, not periodic
- Establish an independent model validation function — SS1/23 requires independence, not just internal review
- Build the UK GDPR Article 22 human review pathway operationally — it must be staffed and accessible before the model goes live
- Document model limitations and assumptions in terms an FCA examiner can evaluate — not in terms an ML practitioner would use
- Include fairness metrics alongside accuracy metrics in the model validation report — accuracy alone does not satisfy SS1/23
The engineering behind this article is available as a service.
We have done this work — not advised on it, not reviewed documentation about it. If the problem in this article is your problem, the first call is with a senior engineer who has solved it.