Computer-Aided Detection (CADe) software is medical device software that marks or highlights regions in clinical data — typically medical images — as potentially abnormal, for review by a clinician. Under MDR, CADe qualifies as a medical device when its intended purpose matches Article 2(1), and it is classified under Annex VIII Rule 11. Most CADe products land in Class IIa or IIb, depending on the severity of the condition being flagged and how the output is used in the clinical workflow. Locked algorithms, a full EN 62304 lifecycle, training data governance, and clinical performance evidence on the intended use population are the four pillars a Notified Body will expect to see.

By Tibor Zechmeister and Felix Lenhard. Last updated 10 April 2026.


TL;DR

  • CADe is detection software — it marks candidate findings for clinician review. The clinician remains the decision maker.
  • CADe is not CADx. Detection flags where something might be. Diagnosis characterises what it is. The distinction drives classification and evidence expectations.
  • CADe qualifies as a medical device under MDR Article 2(1) when its intended purpose is diagnostic — detection is a diagnostic function even when the clinician makes the final call.
  • Classification sits under Annex VIII Rule 11. Routine screening contexts typically land at Class IIa; severe-condition contexts (cancer, intracranial haemorrhage, pulmonary embolism) push to Class IIb.
  • Notified Body involvement is required from Class IIa upward. Self-certification is not realistic for CADe.
  • Evidence expectations include sensitivity and specificity on a representative independent test set, subgroup performance, and reader studies showing the clinician-plus-CADe workflow does not degrade the unaided clinician baseline.

The CADe category and why it matters

CADe sits at the oldest and most commercially established corner of AI in medical imaging. Mammography CAD systems that mark suspicious regions for a radiologist have been on the market for over two decades. Pulmonary nodule detection on chest CT, polyp detection on colonoscopy video, intracranial haemorrhage flagging on head CT, tuberculosis screening on chest X-ray — the category is deep, the clinical literature is substantial, and the regulatory path is better understood than for most AI product categories.

That maturity cuts both ways. A founder building a CADe product in 2026 walks into a field where the Notified Body has seen the pattern before, the expectations are reasonably clear, and the clinical evidence templates exist. The same founder also walks into a field where the bar is not low. The existing products in this category have set a reference point for sensitivity, specificity, and reader study design, and a new entrant has to meet that bar to be taken seriously by regulators, clinicians, and reimbursement bodies.

This post covers the MDR side. What CADe is, how to qualify it, how to classify it, what evidence a Notified Body expects, and the mistakes we see startups make when they enter this category assuming it is simpler than it is.

What CADe actually is

CADe — Computer-Aided Detection — is software that analyses clinical data and marks locations or instances where a defined finding may be present. The output is a marker, a bounding box, a heatmap, a list of candidate regions, or a binary "suspicious/not suspicious" flag. The clinician reviews the marks and makes the final clinical judgment.

The defining property of CADe is that it does not characterise. A polyp detection system flags a region on colonoscopy video that looks like a polyp. It does not say whether the polyp is benign or malignant. A nodule detection system flags a region on a chest CT that looks like a nodule. It does not say whether the nodule is cancer. A haemorrhage detection system flags a CT scan as possibly containing intracranial blood. It does not grade the severity or recommend a treatment. The detection is the output. The diagnosis is downstream and belongs to the clinician.
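
To make that boundary concrete, here is a minimal sketch of what a pure-detection output can look like as a data structure. The schema is a hypothetical illustration, not a standard; the names `Finding`, `bounding_box`, and `confidence` are our assumptions. The point is what the structure deliberately omits: no malignancy field, no grade, no disease label.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    """One candidate region flagged by a hypothetical CADe system.

    Note what is absent: no malignancy score, no grade, no disease
    label. A pure detection output localises; it does not characterise.
    """
    series_id: str                           # which image series the mark belongs to
    slice_index: int                         # where in the volume the mark sits
    bounding_box: tuple[int, int, int, int]  # (x, y, width, height) in pixels
    confidence: float                        # detector confidence, 0.0 to 1.0

# A CADe result is simply a list of candidate regions for clinician review.
findings: list[Finding] = [
    Finding(series_id="series-001", slice_index=42,
            bounding_box=(118, 206, 14, 14), confidence=0.91),
]
```

The moment a malignancy-probability field appears in a structure like that, the product is doing CADx as well, and the intended purpose has to say so.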

This is not a semantic distinction. It drives the classification, the evidence expectations, the clinical workflow integration, and the risk profile. A system that detects incorrectly — a missed finding or a false alarm — has a different risk profile than a system that misdiagnoses. Both matter. They are not the same.

CADe versus CADx — the distinction the Regulation cares about

CADx — Computer-Aided Diagnosis — characterises. Given a finding (whether flagged by CADe, marked by a clinician, or identified in some other way), CADx returns a classification: benign versus malignant, high grade versus low grade, specific disease category, probability of a particular condition.

The MDR text does not use the terms CADe and CADx. Those terms come from the clinical and regulatory literature, and they are useful because they separate two different intended purposes that carry different regulatory implications under the same Rule 11.

The practical effect is this. A pure CADe product with an intended purpose of "flagging candidate regions for radiologist review" describes a detection function. A CADx product with an intended purpose of "providing a malignancy probability score to support diagnosis" describes a diagnostic characterisation function. Both are regulated under MDR. Both usually fall under Rule 11. The class they land in, the evidence they need, and the reader study design expected by a Notified Body differ because the clinical question being automated is different.

The other reason the distinction matters is that many products mix the two. A system that flags candidate nodules and also returns a malignancy probability for each candidate is doing CADe and CADx simultaneously. The intended purpose has to state both functions plainly. Trying to describe a mixed product as "just detection" to keep the evidence expectations lighter is a recipe for a difficult Notified Body review. We have a separate post on CADx characterisation software under MDR that walks through the diagnostic side in detail.

Qualification under Article 2(1)

MDR Article 2(1) defines a medical device by intended purpose. Software qualifies as a medical device when the manufacturer intends it to be used for a medical purpose listed in Article 2(1) — including diagnosis, prevention, monitoring, prediction, prognosis, treatment, or alleviation of disease.

Detection for the purpose of identifying disease or potential disease is a diagnostic function within the meaning of Article 2(1). A CADe product whose intended purpose is to highlight potential pulmonary nodules on chest CT for radiologist review is performing a diagnostic function — the detection is part of a diagnostic workflow, and the flagged regions feed a diagnostic decision even when the final call belongs to the clinician. The fact that a human is in the loop does not remove the device from the regulatory scope. MDCG 2019-11 Rev.1 (June 2025) is clear on this — software that provides information used to take decisions for diagnostic or therapeutic purposes is medical device software, and the role of a human in the workflow does not cancel that qualification.

A CADe product whose intended purpose is narrowly non-diagnostic — for example, a research-only tool that marks regions in anonymised images for academic study, with no clinical claim — might sit outside the medical device scope. In practice, most commercial CADe products cannot credibly make that claim. The marketing story, the clinical validation story, and the reimbursement story all point back to clinical use, and the intended purpose has to match.

For the foundational view on qualification, see our post on what software as a medical device means under MDR and on MDCG 2019-11 Rev.1.

Classification under Annex VIII Rule 11

Annex VIII Rule 11 is the rule that governs CADe classification under MDR. Rule 11 applies to software intended to provide information that is used to take decisions with diagnosis or therapeutic purposes. The default class under Rule 11 is IIa. The class moves up to IIb when the decisions supported by the information can cause serious deterioration of a person's state of health or require a surgical intervention. The class moves up to III when those decisions can cause death or an irreversible deterioration of health. Software intended to monitor physiological processes sits at IIa or IIb depending on criticality. The residual "other" category is Class I.

For CADe, the class follows the severity of the condition being flagged and the role of the output in the clinical decision.

A CADe product flagging candidate regions in a routine screening context — for example, a general-purpose lesion detector that supports a radiologist's routine read — typically lands at Class IIa. The information supports a diagnostic decision. The decision does not, in the ordinary path, cause serious deterioration of health directly, because the clinician reviews and confirms.

A CADe product flagging findings in a severe-condition context — intracranial haemorrhage on head CT, large vessel occlusion in acute stroke, pulmonary embolism on chest CT, cancer screening — lands at Class IIb. The information supports decisions where a missed finding or a false alarm can cause serious deterioration of health. Acute stroke triage is the clearest example: a missed large vessel occlusion delays thrombectomy, and the delay itself is the harm.

Class III for CADe is rare but not impossible — a CADe product whose missed detection would directly and unavoidably cause death or irreversible harm, in a workflow with no human backstop, could in principle reach Class III. Most real products have a clinician in the loop and do not reach that threshold. For most CADe founders, the live question is whether the product moves from IIa up to IIb, not whether it reaches III.

MDCG 2019-11 Rev.1 provides the authoritative interpretation of Rule 11 for software. It is the document to read alongside the MDR text, not instead of it. For a deeper walk-through, see our posts on MDR classification Rule 11 for software and Rule 11 deep dive.

Evidence expectations

A Notified Body assessing a CADe product at Class IIa or IIb expects the technical documentation to include, at minimum, the following evidence layers.

Analytical performance on a representative independent test set. The test set has to be independent of the training data, representative of the intended use population (age, sex, imaging vendor, acquisition protocol, disease prevalence), and large enough to produce statistically meaningful sensitivity and specificity estimates with defined confidence intervals. A test set drawn from the same institution and the same scanner as the training set does not satisfy this — the model needs to demonstrate generalisation.
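
As a rough sketch of what "defined confidence intervals" means in practice, the snippet below computes sensitivity and specificity with Wilson score intervals from raw test set counts. The counts are invented for illustration, and the Wilson interval is one common choice among several, not a regulatory requirement.

```python
import math

def wilson_ci(successes: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 for 95%)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2))
    return centre - half, centre + half

# Invented test set counts, for illustration only.
tp, fn = 190, 10    # 200 true findings in the test set
tn, fp = 1800, 200  # 2000 true-negative regions

lo, hi = wilson_ci(tp, tp + fn)
print(f"sensitivity {tp / (tp + fn):.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
lo, hi = wilson_ci(tn, tn + fp)
print(f"specificity {tn / (tn + fp):.3f}, 95% CI [{lo:.3f}, {hi:.3f}]")
```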

Subgroup performance. Sensitivity and specificity broken down by relevant clinical and demographic subgroups. If the model performs well overall but degrades in a subgroup where the disease is clinically important, the technical file has to surface that fact, and the intended purpose and labelling have to account for it.

Reader studies where the product affects clinical workflow. For CADe products intended to be used by a radiologist or other reader, the expected evidence includes a reader study showing that the clinician-plus-CADe workflow performs at least as well as the unaided clinician baseline — preferably better. Designs vary (standalone performance, sequential reading, concurrent reading, multi-reader multi-case), and the choice matters. A Notified Body will look at whether the design reflects the actual clinical deployment context.

Clinical evaluation under MDR Article 61. The evidence above has to be structured into a clinical evaluation that demonstrates conformity with the relevant general safety and performance requirements in Annex I. Literature, equivalence, and the manufacturer's own clinical investigation are the three evidence routes, and for a novel CADe product the own-investigation component (the reader study and the independent test set analysis) is usually the centre of the evidence.

Full EN 62304:2006+A1:2015 lifecycle. The software lifecycle standard referenced by MDR Annex I applies to CADe the same way it applies to any other medical device software. Software safety classification, requirements, architecture, unit testing, integration testing, system testing, problem resolution, and configuration management are all expected. EN 62304 does not have an AI-specific track, but the process discipline applies and is not negotiable.

For the broader AI medical device context this CADe post sits inside, see the pillar post on AI in medical devices under MDR and the post on clinical decision support under MDR.

Sensitivity versus specificity — the reporting trap

This is the place where CADe founders most often trip.

Sensitivity is the proportion of actual positives the system catches. A sensitivity of 95% on pulmonary nodules means the system flags 95 out of every 100 nodules that are actually there. Specificity is the proportion of actual negatives the system correctly leaves unflagged. A specificity of 90% means the system correctly ignores 90 out of every 100 non-nodule regions. The two numbers trade off against each other — a more aggressive detector catches more true positives at the cost of more false alarms, and vice versa.
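
A toy illustration of that trade-off, using invented detector scores: moving the threshold down catches more true findings and flags more false alarms. Every shipped product sits at one chosen point on this curve, which is why the operating point has to be reported.

```python
# Invented detector scores for illustration: higher score = more suspicious.
positives = [0.92, 0.85, 0.78, 0.66, 0.45, 0.31]                # true findings
negatives = [0.70, 0.40, 0.35, 0.22, 0.15, 0.08, 0.05, 0.02]    # non-findings

for threshold in (0.6, 0.3):
    tp = sum(s >= threshold for s in positives)
    fp = sum(s >= threshold for s in negatives)
    se = tp / len(positives)
    sp = (len(negatives) - fp) / len(negatives)
    print(f"threshold {threshold}: sensitivity {se:.2f}, specificity {sp:.2f}")
# threshold 0.6: sensitivity 0.67, specificity 0.88
# threshold 0.3: sensitivity 1.00, specificity 0.62
```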

The reporting trap has three forms.

Reporting only the favourable number. A marketing page that quotes "95% sensitivity" without the corresponding specificity, or vice versa, tells the reader almost nothing about real-world behaviour. A Notified Body does not accept this. The clinical evaluation must report the operating point (or points) the product is actually shipped at, with both numbers, with confidence intervals, and with the test set composition that produced them.

Reporting on a test set that is not the intended use population. A nodule detector validated on a balanced dataset with 30% nodule prevalence behaves very differently in deployment on a screening population with 2% prevalence: sensitivity and specificity may hold, but the positive predictive value collapses. The technical file has to show the numbers on a test set that reflects the actual clinical deployment prevalence, or has to explain carefully why a different set is valid.
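
The arithmetic behind that collapse is worth running once. Sensitivity and specificity can stay fixed while positive predictive value swings with prevalence; the sketch below applies Bayes' theorem with the illustrative 95%/90% figures used earlier.

```python
def ppv(sensitivity: float, specificity: float, prevalence: float) -> float:
    """Positive predictive value via Bayes' theorem."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    return true_pos / (true_pos + false_pos)

# Same detector, two deployment contexts.
print(f"balanced validation set (30% prevalence): PPV {ppv(0.95, 0.90, 0.30):.2f}")  # 0.80
print(f"screening population (2% prevalence):     PPV {ppv(0.95, 0.90, 0.02):.2f}")  # 0.16
```

Four out of five flags are real findings in the first context, roughly one in six in the second. Same model, different world.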

Reporting without subgroup analysis. Aggregate sensitivity can hide subgroup failure. A detector that is 95% sensitive overall but 70% sensitive on small nodules is not a 95% sensitive product from the perspective of a radiologist looking for early-stage disease. The technical file must include the subgroup breakdown.
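
A minimal sketch of the breakdown meant here, assuming a per-finding results table with an invented size_group column; the column names and numbers are illustrative only.

```python
import pandas as pd

# Invented per-finding detection results: 1 = detected, 0 = missed.
results = pd.DataFrame({
    "size_group": ["small"] * 20 + ["large"] * 80,
    "detected":   [1] * 14 + [0] * 6 + [1] * 78 + [0] * 2,
})

# The aggregate number looks strong ...
print(f"overall sensitivity: {results['detected'].mean():.2f}")  # 0.92

# ... but the subgroup view is what the technical file has to surface.
print(results.groupby("size_group")["detected"].mean())
# large    0.975
# small    0.700
```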

The principle is simple. Report the numbers honestly, at the operating point the product actually uses, on a test set that looks like the real world, with subgroup breakdown, with confidence intervals, and with the comparator (standalone or with-reader) explicit. A product that cannot survive honest reporting is not ready for the Notified Body.

Common mistakes startups make with CADe

  • Calling a mixed CADe-CADx product "just detection" to keep the class down. The Notified Body reads the intended purpose, the marketing materials, and the output semantics, and will classify the product on what it actually does. Misrepresenting the function wastes months.
  • Training on one institution's data and validating on data from the same institution. Generalisation to other scanners, protocols, and populations is a core evidence expectation. An independent test set from a different institution is not optional.
  • Designing a reader study that does not reflect actual clinical deployment. A sequential-read design when the product will be deployed concurrent-read, or a single-reader design when the product will support a two-reader workflow, produces evidence that the Notified Body will discount.
  • Under-specifying the intended use population in the intended purpose statement. Leaving the population vague so the product can be sold broadly later causes the Notified Body to demand broader evidence, not less.
  • Skipping subgroup analysis because the aggregate numbers look good. Subgroup failure is where real-world harm happens, and it is where the Notified Body looks closely.
  • Treating EN 62304 as paperwork. The lifecycle standard is not a formality. Software safety classification, traceability from requirements to tests, and configuration management are what allow the Notified Body to trust the evidence at all.

The Subtract to Ship angle for CADe

The Subtract to Ship framework for MDR applies here too.

The Purpose Pass asks whether the intended purpose can be written narrowly enough to reduce scope without misrepresenting the product. A CADe product that is honestly a screening-assist tool for one specific modality in one specific clinical context has a narrower evidence burden than a general-purpose detector that promises to flag everything interesting in every modality. Narrowing the intended purpose is legitimate and often reduces the evidence set by a meaningful margin.

The Classification Pass walks Rule 11 precisely. A CADe product in a routine screening context can defensibly sit at IIa. Pushing it to IIb because "AI sounds risky" is subtraction failure in the other direction. Class follows severity and workflow role, not vibes.

The Evidence Pass asks what the minimum defensible evidence set looks like. For CADe, that is typically a representative independent test set, a pre-specified operating point, subgroup analysis, and a reader study sized for the actual workflow. Adding a second reader study on a second dataset before the first is mature is usually waste.

The Operations Pass asks what the QMS, PMS, and drift detection stack looks like for a CADe product specifically. PMS for CADe has to include monitoring of real-world sensitivity and specificity proxies, complaint triage for missed findings, and a defined revalidation trigger for scanner or protocol changes in the deployed base.
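
As a hedged sketch of what a defined revalidation trigger can look like in code: the metric, window size, and floor below are all illustrative assumptions, and the real values belong in the PMS plan, pre-specified and justified.

```python
from collections import deque

class SensitivityProxyMonitor:
    """Illustrative PMS drift monitor. Tracks a rolling field proxy for
    sensitivity (e.g. the share of clinician-confirmed findings that the
    device had flagged) and fires a revalidation trigger when it falls
    below a pre-specified floor. All parameters here are invented examples."""

    def __init__(self, floor: float = 0.90, window_size: int = 200):
        self.floor = floor
        self.window: deque[bool] = deque(maxlen=window_size)

    def record(self, device_flagged_confirmed_finding: bool) -> bool:
        """Record one confirmed finding; return True when the rolling
        proxy breaches the floor and revalidation is due."""
        self.window.append(device_flagged_confirmed_finding)
        if len(self.window) < self.window.maxlen:
            return False  # not enough field data yet
        return sum(self.window) / len(self.window) < self.floor
```

The same pattern extends to acquisition-parameter drift: log scanner model and protocol per study, and fire the trigger when the deployed mix moves outside the validated envelope.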

Reality Check — Where do you stand?

  1. Can you state your CADe intended purpose in one sentence that distinguishes clearly between detection and diagnosis?
  2. Have you classified the product under Annex VIII Rule 11 with the specific sub-clause and severity justification documented?
  3. Does your test set come from institutions and scanners that were not in the training set?
  4. Do you report sensitivity and specificity at a pre-specified operating point, with confidence intervals, on an intended-use-population prevalence?
  5. Have you run a reader study that matches the clinical workflow the product will actually be deployed in?
  6. Have you broken performance down by the subgroups where clinical importance is highest (small findings, edge demographics, underrepresented scanners)?
  7. Does your EN 62304 file trace every software requirement to a test, and does your software safety classification match your risk analysis?
  8. Does your PMS plan include active monitoring of field sensitivity proxies and a defined revalidation trigger for acquisition-parameter drift?

Frequently Asked Questions

Is CADe a medical device under MDR? Yes, when its intended purpose is diagnostic. Detection in a clinical workflow is a diagnostic function within Article 2(1), and the presence of a clinician reviewing the output does not remove the device from the regulatory scope. MDCG 2019-11 Rev.1 confirms that software providing information used to take diagnostic decisions is medical device software.

What class is CADe under MDR? Most CADe products fall under Annex VIII Rule 11 and land at Class IIa or Class IIb. Routine screening contexts typically sit at IIa. Severe-condition contexts where a missed finding or a false alarm can cause serious deterioration of health — intracranial haemorrhage, acute stroke, pulmonary embolism, cancer — push to IIb. Class III for CADe is rare in practice because the clinician in the loop is the usual backstop.

Can I self-certify a CADe product? Not in practice. Self-certification is limited to Class I, and CADe almost never reaches Class I because its intended purpose supports diagnostic decisions. Notified Body involvement is required from Class IIa upward under MDR Article 52.

How is CADe different from CADx under MDR? CADe flags where something might be. CADx characterises what it is. Both are usually regulated under Rule 11, but they have different evidence expectations and often different classes. A product that does both has to declare both functions in the intended purpose and meet the evidence bar for each.

What evidence does a Notified Body expect for a CADe product? Analytical performance on a representative independent test set with sensitivity and specificity reported at the shipped operating point; subgroup performance; a reader study that matches the deployed clinical workflow; a clinical evaluation under MDR Article 61 that structures the evidence into a conformity argument; and a full EN 62304:2006+A1:2015 software lifecycle with traceable requirements, tests, and configuration management.

Do I need a reader study, or are standalone performance numbers enough? For CADe products intended to be used by a clinician reviewing the output, a reader study is usually expected. Standalone numbers describe the model in isolation; the clinical question is how the clinician-plus-CADe workflow performs relative to the unaided clinician baseline. That is the number a Notified Body and a clinical user care about.

Sources

  1. Regulation (EU) 2017/745 of the European Parliament and of the Council of 5 April 2017 on medical devices, Article 2(1) (definition of medical device) and Annex VIII Rule 11 (classification of software). Official Journal L 117, 5.5.2017.
  2. MDCG 2019-11 Rev.1 — Guidance on Qualification and Classification of Software in Regulation (EU) 2017/745 — MDR and Regulation (EU) 2017/746 — IVDR, October 2019, Revision 1 June 2025.
  3. EN 62304:2006 + A1:2015 — Medical device software — Software life-cycle processes.

This post is part of the AI, Machine Learning and Algorithmic Devices category in the Subtract to Ship: MDR blog. Authored by Felix Lenhard and Tibor Zechmeister. CADe is one of the oldest AI-in-medicine categories and also one of the most unforgiving — the evidence bar is set by a mature field, and the Notified Body has seen the shortcuts before. If your CADe product sits at a boundary the general framing here does not resolve, that is exactly the point where a sparring partner who has walked other CADe founders through the same Notified Body conversation earns their keep.