Prompt Engineering for Fraud Detection Models
AI Prompting for Real Banking Systems
Fraud detection in banking is a data problem before it is a model problem. The features fed into a classifier, the hypotheses that guide anomaly investigation, and the explanations surfaced to compliance teams all depend on someone asking the right questions of the right data — quickly enough to matter. For data engineers and ML engineers working on fraud systems, prompt engineering with Claude has become a practical accelerator across three specific workflows: feature engineering, anomaly detection hypothesis generation, and model explainability.
This article covers each workflow with concrete prompt patterns you can adapt directly to your environment.
Feature selection is where fraud models are won or lost. A classifier trained on the wrong features will produce confident, wrong predictions — and in a high-stakes environment, confident and wrong is worse than uncertain. The challenge is that feature ideation is time-consuming: it requires domain knowledge, familiarity with transaction patterns, and the ability to reason about how fraudsters adapt to existing controls.
Claude accelerates this process by acting as a structured thinking partner. The prompt pattern is straightforward: provide the transaction schema, describe the fraud typology you are targeting, and ask for candidate features with explicit reasoning for each.
You are a fraud detection engineer working on a card-not-present fraud model for a European retail bank. The transaction schema includes: merchant_category_code, transaction_amount, timestamp, device_fingerprint, ip_geolocation, time_since_last_transaction, and customer_tenure_days.
Generate 10 candidate features for detecting account takeover fraud. For each feature, explain the fraud signal it encodes and any data quality risks that would make it unreliable in production.
The instruction to include data quality risks is important. It forces the output to be operationally useful rather than theoretically complete — a distinction that matters when you are deciding what to build versus what to document as a future consideration.
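As a minimal sketch of how the pattern above could be made reusable, the function below assembles the same prompt from a schema list and a fraud typology. The function name `build_feature_prompt` and its parameters are illustrative assumptions, not part of any library, and the sketch only builds the prompt string; how it is sent to Claude is left to your own client code.

```python
from typing import Sequence


def build_feature_prompt(
    role: str,
    schema_fields: Sequence[str],
    fraud_typology: str,
    n_features: int = 10,
) -> str:
    """Assemble a feature-ideation prompt (hypothetical helper, not a library API)."""
    if not schema_fields:
        raise ValueError("schema_fields must not be empty")
    if n_features < 1:
        raise ValueError("n_features must be at least 1")

    field_list = ", ".join(schema_fields)
    return (
        f"You are {role}. The transaction schema includes: {field_list}.\n\n"
        f"Generate {n_features} candidate features for detecting {fraud_typology}. "
        "For each feature, explain the fraud signal it encodes and any data "
        "quality risks that would make it unreliable in production."
    )


# Field names are taken from the example prompt in the article.
feature_prompt = build_feature_prompt(
    role=(
        "a fraud detection engineer working on a card-not-present fraud "
        "model for a European retail bank"
    ),
    schema_fields=[
        "merchant_category_code",
        "transaction_amount",
        "timestamp",
        "device_fingerprint",
        "ip_geolocation",
        "time_since_last_transaction",
        "customer_tenure_days",
    ],
    fraud_typology="account takeover fraud",
)
print(feature_prompt)
```

Keeping the data quality instruction inside the template, rather than leaving it to each engineer to remember, is one way to make sure every feature request carries it.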
Important: Before sending any schema or transaction data to Claude, ensure all fields are anonymised or synthetic. No real customer identifiers, account numbers, or PII should appear in prompts. This is a non-negotiable requirement under GDPR, regardless of how the LLM processes or retains input data.
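One way to back this rule with a mechanical check is to screen prompt text for identifier-like strings before it leaves your environment. The sketch below is an assumption-heavy illustration: the three regular expressions are deliberately simple, will produce both false positives and false negatives, and are no substitute for proper anonymisation upstream. The names `PII_PATTERNS` and `find_pii` are hypothetical.

```python
import re

# Illustrative patterns only (assumption): a real deployment needs a vetted,
# much broader set and should not rely on regular expressions alone.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "iban": re.compile(r"\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b"),
    "card_number": re.compile(r"\b(?:\d[ -]?){13,19}\b"),
}


def find_pii(text: str) -> dict[str, list[str]]:
    """Return identifier-like matches found in text, keyed by pattern name."""
    hits: dict[str, list[str]] = {}
    for name, pattern in PII_PATTERNS.items():
        found = pattern.findall(text)
        if found:
            hits[name] = found
    return hits


def assert_prompt_is_clean(prompt: str) -> None:
    """Raise before a prompt containing identifier-like strings is sent."""
    hits = find_pii(prompt)
    if hits:
        raise ValueError(f"Prompt blocked, possible PII detected: {sorted(hits)}")


# A schema-only prompt passes; a prompt containing an IBAN-like string is blocked.
assert_prompt_is_clean("The schema includes: transaction_amount, timestamp.")
print(find_pii("Transfer to DE89370400440532013000 was flagged."))
```

A check like this is best treated as a last line of defence that fails closed, with anonymisation or synthetic data generation doing the real work earlier in the pipeline.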
Anomaly detection models surface outliers — but outliers require interpretation. A transaction flagged as anomalous is not inherently fraudulent; it may reflect a legitimate behavioural change, a data pipeline error, or a genuine fraud pattern. The investigative bottleneck is generating plausible hypotheses fast enough to prioritise the queue.
Claude is effective here when given a specific anomaly description and asked to generate competing explanations ranked by likelihood. The prompt structure should constrain the output to the fraud typologies relevant to your institution and explicitly request that benign explanations be included alongside malicious ones. Without that request, the hypotheses skew toward malicious readings, which pushes analysts toward false positives.
A customer with 8 years of tenure and a stable transaction history made 4 international transfers to new beneficiaries within 90 minutes, totalling €14,000. The transactions originated from a device not previously associated with the account. The customer has not contacted support.
Generate 5 hypotheses to explain this pattern, ranked by likelihood. Include at least one benign explanation. For each hypothesis, identify the additional data points that would confirm or rule it out.
This pattern is useful for building investigation playbooks as well as for real-time triage. The output should be treated as a starting point for analyst review — not a decision. LLMs can produce plausible-sounding but incorrect reasoning, and in a fraud context, an AI-generated hypothesis that goes unverified before action is a liability, not an asset.
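The two constraints described above, a fixed list of typologies and a mandatory benign explanation, can be enforced in the template itself. The following is a rough sketch under stated assumptions: `build_hypothesis_prompt` is a hypothetical helper, and the typology list in the usage example is illustrative rather than a recommendation for any particular institution.

```python
from typing import Sequence


def build_hypothesis_prompt(
    anomaly_description: str,
    typologies: Sequence[str],
    n_hypotheses: int = 5,
    min_benign: int = 1,
) -> str:
    """Build a triage prompt that always asks for benign explanations."""
    if not typologies:
        raise ValueError("provide at least one fraud typology")
    # Refuse to build a prompt that would allow only malicious hypotheses.
    if min_benign < 1:
        raise ValueError("min_benign must be at least 1")
    if min_benign > n_hypotheses:
        raise ValueError("min_benign cannot exceed n_hypotheses")

    typology_list = ", ".join(typologies)
    return (
        f"{anomaly_description.strip()}\n\n"
        f"Generate {n_hypotheses} hypotheses to explain this pattern, ranked by "
        f"likelihood. Restrict fraud hypotheses to these typologies: "
        f"{typology_list}. Include at least {min_benign} benign explanation(s). "
        "For each hypothesis, identify the additional data points that would "
        "confirm or rule it out."
    )


hypothesis_prompt = build_hypothesis_prompt(
    anomaly_description=(
        "A customer with 8 years of tenure made 4 international transfers to "
        "new beneficiaries within 90 minutes from a previously unseen device."
    ),
    # Illustrative typologies (assumption); substitute your institution's own.
    typologies=["account takeover", "authorised push payment fraud"],
)
print(hypothesis_prompt)
```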
Regulators and internal audit functions increasingly expect fraud model decisions to be explainable in plain language. This applies particularly to adverse decisions — declined transactions, blocked accounts, SAR filings — where the institution must be able to articulate why the model flagged a case. ML engineers are often responsible for producing these explanations, which creates a translation problem: SHAP values and feature importances are technically correct but not audit-ready.
Claude can convert model output into structured, human-readable explanations when given the right input.
A fraud detection model assigned a risk score of 0.91 to a transaction. The top contributing SHAP features were: ip_geolocation_mismatch (+0.34), time_since_last_transaction_seconds (+0.28), new_device (+0.19), transaction_amount_vs_30d_average (+0.11).
Write a plain-language explanation of why this transaction was flagged, suitable for inclusion in an internal audit log. The explanation should reference only the features provided, make no causal claims beyond what the feature values support, and avoid language that implies certainty.
Two requirements in that prompt are doing critical work. First, restricting the explanation to provided features prevents Claude from fabricating plausible-sounding but unsupported reasoning — a real risk when LLMs are asked to explain decisions with incomplete context. Second, prohibiting certainty language keeps the output defensible: audit logs that overstate model confidence create compliance exposure.
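In practice the SHAP contributions arrive as model output rather than hand-typed text, so the prompt is usually generated. The sketch below assumes the contributions are already available as a plain mapping of feature name to SHAP value; `build_explanation_prompt` is a hypothetical helper and does not call the SHAP library or any Claude API.

```python
from typing import Mapping


def build_explanation_prompt(
    risk_score: float,
    shap_contributions: Mapping[str, float],
    top_k: int = 4,
) -> str:
    """Render a risk score and its top SHAP contributions into an audit prompt."""
    if not shap_contributions:
        raise ValueError("shap_contributions must not be empty")

    # Rank by absolute contribution so strong negative drivers are not dropped.
    ranked = sorted(
        shap_contributions.items(), key=lambda item: abs(item[1]), reverse=True
    )[:top_k]
    feature_text = ", ".join(f"{name} ({value:+.2f})" for name, value in ranked)

    return (
        f"A fraud detection model assigned a risk score of {risk_score:.2f} to a "
        f"transaction. The top contributing SHAP features were: {feature_text}.\n\n"
        "Write a plain-language explanation of why this transaction was flagged, "
        "suitable for inclusion in an internal audit log. The explanation should "
        "reference only the features provided, make no causal claims beyond what "
        "the feature values support, and avoid language that implies certainty."
    )


# Values mirror the example prompt in the article; input order is irrelevant.
explanation_prompt = build_explanation_prompt(
    risk_score=0.91,
    shap_contributions={
        "new_device": 0.19,
        "ip_geolocation_mismatch": 0.34,
        "transaction_amount_vs_30d_average": 0.11,
        "time_since_last_transaction_seconds": 0.28,
    },
)
print(explanation_prompt)
```

Passing only the top-ranked features keeps the restriction to provided features meaningful: anything not in the prompt is, by construction, something the explanation should not mention.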
Any Claude-generated explanation going into an audit log or regulatory submission must be reviewed by a qualified analyst before it is recorded. LLM output in this context is a drafting aid, not a final artefact.
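A lightweight pre-review check can help the analyst by highlighting wording that overstates confidence. The sketch below is only an aid to that review, not a replacement for it: the term list is an illustrative assumption that would need tuning with compliance input, and `flag_certainty_language` is a hypothetical helper.

```python
import re

# Illustrative term list (assumption); extend and tune with compliance input.
CERTAINTY_TERMS = (
    "definitely",
    "certainly",
    "proves",
    "confirms",
    "guarantees",
    "without doubt",
)


def flag_certainty_language(explanation: str) -> list[str]:
    """Return the certainty terms that appear as whole words in the explanation."""
    lowered = explanation.lower()
    return [
        term
        for term in CERTAINTY_TERMS
        if re.search(rf"\b{re.escape(term)}\b", lowered)
    ]


draft = "The new device definitely proves the account was taken over."
flagged = flag_certainty_language(draft)
if flagged:
    print(f"Route to analyst with highlighted terms: {flagged}")
```

An empty result means only that none of the listed terms appeared. The draft still goes to a qualified analyst either way.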
A note on prompt injection: In automated pipelines where transaction data is passed directly into prompts, there is a risk that adversarially crafted input — data fields containing instruction-like text — could alter model behaviour. Sanitise and validate all data fields before they enter a prompt template. Treat LLM inputs in automated fraud pipelines with the same discipline applied to any external input in a production system.
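A minimal sketch of field-level sanitisation might look like the following. It assumes free-text fields (for example a merchant name) are the main injection surface, and it combines whitespace normalisation, a length cap, and a small denylist of instruction-like phrases. The denylist and the 200-character limit are illustrative assumptions. Pattern matching of this kind is easy to evade, so strict allowlist validation of each field's expected format is the stronger control where it is feasible.

```python
import re
import unicodedata

# Illustrative denylist (assumption); it catches only crude injection attempts.
INSTRUCTION_LIKE = re.compile(
    r"ignore (all |any )?(previous|prior|above) instructions"
    r"|system prompt"
    r"|you are now",
    re.IGNORECASE,
)
MAX_FIELD_LENGTH = 200  # assumed cap; set per field in a real pipeline


def sanitise_field(value: str, max_length: int = MAX_FIELD_LENGTH) -> str:
    """Normalise a free-text field and reject it if it looks like an instruction."""
    # Replace control and format characters (Unicode categories C*) with spaces.
    cleaned = "".join(
        " " if unicodedata.category(ch).startswith("C") else ch for ch in value
    )
    # Collapse runs of whitespace so the field stays on a single line.
    cleaned = " ".join(cleaned.split())

    if len(cleaned) > max_length:
        raise ValueError("field exceeds maximum length")
    if INSTRUCTION_LIKE.search(cleaned):
        raise ValueError("field contains instruction-like text")
    return cleaned


print(sanitise_field("ACME\tStore\n Ltd"))
```

Rejected fields are better routed to a manual queue than silently dropped, since an injection attempt in transaction data is itself a signal worth investigating.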
Two frameworks are directly relevant to this work. The EU AI Act's Annex III classifies certain AI systems used in financial services as high-risk, with credit scoring explicitly named. Fraud detection sits in a more nuanced position: the Annex III creditworthiness entry carves out AI systems used to detect financial fraud, yet depending on implementation and the decisions it informs, a fraud system may still attract similar obligations around transparency, human oversight, and documentation. Institutions should obtain a specific legal assessment rather than assuming it falls under the same classification as credit scoring.
On explainability specifically: the applicable accountability standards come from EBA guidelines on internal governance (GL/2021/05) and, where internal models inform risk decisions, the ECB Guide to Internal Models. These set expectations for how model-driven decisions are documented and challenged. DORA is sometimes cited in this context, but it governs ICT risk and operational resilience and does not address ML explainability.
The prompt patterns here are starting points, not production templates. Effective use of Claude in a fraud engineering context requires establishing institution-specific prompt libraries, testing outputs against known fraud cases, and defining clear human review gates — particularly for anything that touches audit logs or regulatory reporting. The engineers who get the most from this tooling are those who treat Claude as a structured reasoning accelerator, not an autonomous decision-maker.