Computer-aided diagnosis (CADx) software does not just flag possible findings. It characterises them as benign, malignant, or a specific diagnosis. Under Annex VIII Rule 11, CADx is typically classified as Class IIb or higher because the information it provides is directly used for diagnostic decisions with potentially irreversible consequences. The clinical evidence bar and notified body scrutiny are substantially higher than for CADe.
By Tibor Zechmeister and Felix Lenhard.
TL;DR
- CADx provides diagnostic characterisation (e.g., "this lesion is malignant"), while CADe only marks regions of interest for clinician attention.
- Under MDR Annex VIII Rule 11, CADx is rarely Class IIa; Class IIb is the realistic baseline, and Class III applies where an incorrect diagnosis could lead to death or irreversible deterioration.
- The clinical evidence expectation for CADx is prospective performance data on the intended population, not only retrospective benchmark accuracy.
- Notified body scrutiny for CADx in oncology, cardiology, and other high-severity domains is among the strictest software review categories under MDR.
- MDCG 2019-11 Rev.1 is the reference for Rule 11 application; see also the companion post on CADe for the Class IIa baseline case.
- Subtract to Ship applied here means being honest about whether your tool is truly CADx or could be scoped as CADe with a narrower intended purpose.
Why this matters
A founder shows us a tumour classification demo. The model looks strong: 94% sensitivity on a retrospective test set. "We're Class IIa, right? It's software, it gives information, Rule 11." They want a six-month notified body review and a CE mark before their Series A.
No. This is CADx, and for a tool that says "malignant" or "benign," Class IIb is the baseline, Class III is possible, and the clinical evidence work they need has barely started. The 94% number on a retrospective academic dataset is not clinical evidence of safe intended-purpose performance. The notified body will not argue about this.
The gap between CADe and CADx is the gap between "look here" and "here is your answer." MDR treats that difference as fundamental. Founders who collapse the two end up either understating their regulatory obligations or, worse, shipping something in a pilot that was never designed to carry the diagnostic weight it's being given.
What MDR actually says
Article 2(1). Software is a medical device when it is intended for one of the medical purposes listed, including diagnosis. Article 2(12) binds the manufacturer to whatever purpose is stated on the label, in the IFU, or in promotional material.
Annex VIII Rule 11, first indent. Software intended to provide information which is used to take decisions with diagnosis or therapeutic purposes is classified as Class IIa, except if such decisions have an impact that may cause:
- death or an irreversible deterioration of a person's state of health, in which case it is in Class III; or
- a serious deterioration of a person's state of health or a surgical intervention, in which case it is classified as Class IIb.
Read that carefully. Class IIa is the exception, not the rule, the moment you are talking about actual diagnosis. For a CADx tool whose output directly characterises a finding as malignant vs. benign, ask: could a wrong output cause serious deterioration or a surgical intervention? If yes, and for most oncology, cardiology, and neuro applications the answer is clearly yes, you are at Class IIb. If the wrong output could cause death or irreversible deterioration, Class III.
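That first-indent logic reads like a small decision procedure, which is worth making explicit. A minimal illustrative sketch only; the legal text, not code, is authoritative, and the enum and function names here are my own, not MDR terminology:

```python
from enum import Enum


class WorstCaseHarm(Enum):
    """Worst plausible consequence of a clinician acting on a wrong output."""
    DEATH_OR_IRREVERSIBLE = "death or irreversible deterioration"
    SERIOUS_OR_SURGICAL = "serious deterioration or surgical intervention"
    NEITHER = "neither of the above"


def rule_11_first_indent(worst_case: WorstCaseHarm) -> str:
    """Illustrative mapping of Rule 11, first indent, for software whose
    information is used to take diagnostic or therapeutic decisions."""
    if worst_case is WorstCaseHarm.DEATH_OR_IRREVERSIBLE:
        return "Class III"
    if worst_case is WorstCaseHarm.SERIOUS_OR_SURGICAL:
        return "Class IIb"
    # Class IIa is the residual case, i.e. the exception for real CADx
    return "Class IIa"


# A malignant-vs-benign nodule call can drive biopsy decisions:
print(rule_11_first_indent(WorstCaseHarm.SERIOUS_OR_SURGICAL))  # Class IIb
```

The point of writing it out this way is that the Class IIa branch is reached only when neither harm condition holds, which is exactly the "exception, not the rule" reading above.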
Annex XIV and Article 61 set the clinical evaluation framework. For any Class IIb or Class III software, you are in the territory where a literature-only clinical evaluation is rarely sufficient, and for Class III implantables and certain Class III software you need clinical investigation data under Article 61(4) unless a specific exemption applies.
MDCG 2019-11 Rev.1 (June 2025). The guidance walks through the qualification and classification logic of Rule 11 and provides worked examples. It confirms that diagnostic decision support in high-severity contexts is Class IIb or III, not IIa.
MDCG 2019-11 Rev.1 and the CADe/CADx distinction. The distinction matters because CADe (computer-aided detection, which marks regions for clinician attention and leaves diagnosis to the human) has a different risk profile than CADx (computer-aided diagnosis, which delivers a diagnostic characterisation). The CADe post handles the lower-risk case. This post is about what changes when you cross into characterisation.
A worked example
A startup has a chest CT algorithm. Two intended purposes:
Version A. Lung nodule CADe. Intended purpose: "Detects and marks regions of interest in chest CT that may contain pulmonary nodules, for review by a radiologist. The device does not provide diagnostic characterisation. All clinical interpretation is performed by the reader." Marketing matches. The output is a bounding box with a confidence score labelled as detection confidence, not malignancy probability.
This is CADe. Under Rule 11 and the MDCG worked examples, Class IIa is defensible. Clinical evaluation focuses on detection performance and reader workflow impact.
Version B. Lung nodule CADx. Intended purpose: "Characterises detected pulmonary nodules as likely benign or likely malignant, to support clinician decisions on biopsy or follow-up." Marketing says "AI-powered cancer risk assessment."
This is CADx. The output is used to take diagnostic decisions with potentially irreversible consequences (a missed malignancy leading to progressed cancer; an unnecessary biopsy leading to complications). Class IIb is the realistic baseline. In some framings and depending on the population and severity profile, an auditor will push toward Class III.
Now compare the evidence work:
- CADe evidence: standalone detection performance, reader studies showing non-inferior or improved detection with the tool, literature support for the use case, PMCF plan.
- CADx evidence: prospective clinical performance on the intended population, head-to-head against the standard of care, subgroup analyses for age, sex, ethnicity, comorbidities, rare presentations, and failure mode analysis. Expect a clinical investigation discussion with the notified body. Expect questions about dataset provenance, site diversity, and ground truth adjudication. Expect a PMCF plan that includes real-world monitoring of false negative rates.
The model weights might be identical. The product is completely different, because the intended purpose is completely different.
The Subtract to Ship playbook
Step 1. Decide honestly: CADe or CADx? If the clinical user is meant to act on your output as a diagnostic statement, you are CADx. If your output only directs the user's attention and the user independently forms the diagnosis, you are CADe. There is no middle ground that survives an auditor. "Probability of malignancy" is a CADx claim.
Step 2. Match marketing to intended purpose. If your landing page says "AI that tells you if it's cancer," you do not get to file a technical dossier that says "decision support." Either change the marketing or accept the higher class. This is the Article 2(12) discipline.
Step 3. Scope the population narrowly. CADx evidence requirements scale with population breadth. A tool validated for "screening mammography in women aged 50–69 with dense breast tissue" is a different regulatory project than "breast cancer AI for radiology." Start narrow. Expand after CE.
Step 4. Plan the clinical investigation conversation early. For Class IIb CADx in high-severity domains, start talking to your notified body about clinical evidence expectations before you lock your study design. Waiting until the technical file review to learn that your retrospective benchmark is not sufficient will cost you a year.
Step 5. Build your risk file around diagnostic error modes. Under EN ISO 14971, the hazards to identify are not model-level metrics; they are clinical error scenarios. False negative on a malignant lesion leading to delayed treatment. False positive leading to unnecessary biopsy. Demographic subgroup underperformance. Label drift as imaging hardware evolves. Each needs a risk estimate and controls.
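One way to keep those clinical error scenarios auditable is a simple structured risk register. A minimal sketch under stated assumptions: the field names and the severity-times-probability scoring are illustrative conventions of my own, not requirements of EN ISO 14971:

```python
from dataclasses import dataclass, field


@dataclass
class HazardScenario:
    """One clinical error scenario in the risk file (illustrative fields)."""
    error_mode: str            # e.g. false negative on a malignant lesion
    clinical_consequence: str  # the downstream harm, not the model metric
    severity: int              # 1 (negligible) .. 5 (catastrophic)
    probability: int           # 1 (improbable) .. 5 (frequent)
    controls: list[str] = field(default_factory=list)

    @property
    def risk_score(self) -> int:
        # Simple pre-control severity x probability matrix
        return self.severity * self.probability


register = [
    HazardScenario(
        error_mode="False negative on malignant lesion",
        clinical_consequence="Delayed cancer treatment",
        severity=5, probability=2,
        controls=["Mandatory radiologist over-read", "PMCF FN-rate monitoring"],
    ),
    HazardScenario(
        error_mode="False positive on benign lesion",
        clinical_consequence="Unnecessary biopsy and complications",
        severity=3, probability=3,
        controls=["Confidence threshold", "IFU warning on specificity limits"],
    ),
]

# Review the register worst-first
for h in sorted(register, key=lambda h: h.risk_score, reverse=True):
    print(h.risk_score, h.error_mode)
```

The design choice worth noting: each entry names a clinical consequence, not an AUC delta, which is the framing an auditor will expect from the risk file.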
Step 6. Instrument your PMCF from day one. For CADx, PMCF is not an afterthought. Plan how you will monitor real-world performance, detect drift, and feed signals back into your risk file and clinical evaluation. This is a Rule 11 Class IIb/III obligation, not a nice-to-have.
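A PMCF drift signal does not need heavy infrastructure to start. Even a periodic check of adjudicated cases against the false-negative rate assumed in your risk file gives you a trigger. A minimal sketch; the function name and the 5% threshold are hypothetical, and your own threshold should come from the clinical evaluation:

```python
def fn_rate_alert(confirmed_fn: int, adjudicated_positives: int,
                  threshold: float = 0.05) -> bool:
    """Flag when the real-world false-negative rate among adjudicated
    positive cases exceeds the rate assumed in the risk file.

    Hypothetical threshold; derive yours from the clinical evaluation.
    """
    if adjudicated_positives == 0:
        return False  # no adjudicated positives yet; nothing to compare
    return confirmed_fn / adjudicated_positives > threshold


# 6 missed malignancies out of 100 adjudicated positives -> above 5%
print(fn_rate_alert(confirmed_fn=6, adjudicated_positives=100))  # True
```

An alert like this should feed back into the risk file and clinical evaluation, closing the loop the step above describes.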
Step 7. Subtract the claims you cannot defend. The hardest and most valuable decision in a CADx startup is choosing what not to claim. Every claim you drop is a subgroup you do not have to evidence, a failure mode you do not have to monitor, a notified body question you do not have to answer. Subtract to Ship is not about doing less regulatory work; it is about doing the right regulatory work for honest claims.
Reality Check
- If your marketing says "diagnosis," "classification," "benign/malignant," or "risk score," have you accepted that you are CADx and not CADe?
- Can you articulate, in one sentence, whether a wrong output from your tool could plausibly cause serious deterioration or a surgical intervention, and do you have that analysis in your risk file?
- Is your clinical evidence strategy built on prospective performance on your intended population, or on retrospective benchmarks?
- Have you discussed your clinical evaluation and investigation plans with a notified body before locking the protocol?
- Does your intended population match the data you trained and validated on, by age, sex, ethnicity, comorbidity distribution, and imaging hardware?
- Is your PMCF plan designed to catch real-world false negatives, or is it a box-ticking survey?
- If a notified body asked tomorrow for your Rule 11 classification justification and the evidence that led to Class IIb vs. III, could you produce it in writing?
- Which claims are you willing to drop to make the regulatory path honest and tractable?
Frequently Asked Questions
Is every AI imaging tool CADx? No. Detection tools that mark regions of interest without diagnostic characterisation are CADe. Triage tools that prioritise cases without diagnosing them are a different category again. Intended purpose decides, and the distinction matters enormously for classification.
Can a CADx tool ever be Class IIa under Rule 11? It is very rare. For diagnostic decision support to remain Class IIa under Rule 11, the wrong output must not plausibly cause serious deterioration or a surgical intervention. In most clinically meaningful CADx applications, that condition is not met.
Do we need a clinical investigation for a Class IIb CADx device? Not always, but frequently. Article 61 and Annex XIV set the framework; whether existing clinical data is sufficient depends on intended purpose, novelty, and equivalence arguments. For novel CADx in high-severity domains, plan for prospective data.
What about CADx for benign conditions with low severity? If the intended purpose genuinely involves only low-severity decisions, Class IIa may apply. But be honest: most founders pitching "benign" CADx still end up making claims that touch high-severity outcomes downstream.
How does a Predetermined Change Control Plan (PCCP) fit here? Under current MDR practice, changes to AI/ML devices require significant change assessment. PCCP concepts are emerging in EU discussions but are not yet a formal MDR mechanism the way they are in FDA guidance. Track MDCG publications closely.
Does EU AI Act high-risk classification change our MDR obligations? No. The AI Act is additive. A Class IIb CADx device is almost certainly a high-risk AI system as well, and you must meet both regimes.
Related reading
- Computer-Aided Detection (CADe) under MDR – the Class IIa baseline case and how it differs from CADx.
- MDR Classification Rule 11: Software Deep Dive – the full rule with worked examples.
- Classification of AI/ML Software under Rule 11 – how AI-specific considerations map onto Rule 11.
- Clinical Evaluation for AI/ML Medical Devices – evidence expectations for learning systems.
- AI Imaging Analysis in Radiology under MDR – the broader imaging AI regulatory landscape.
Sources
- Regulation (EU) 2017/745 on medical devices, consolidated text. Article 2(1), Article 2(12), Annex VIII Rule 11, Article 61, Annex XIV.
- MDCG 2019-11 Rev.1 (October 2019, Rev.1 June 2025). Guidance on qualification and classification of software in Regulation (EU) 2017/745.
- EN 62304:2006+A1:2015. Medical device software. Software life cycle processes.
- EN ISO 14971:2019+A11:2021. Application of risk management to medical devices.
- EN ISO 14155:2020+A11:2024. Clinical investigation of medical devices for human subjects. Good clinical practice.