Risk acceptability criteria must be defined before risk analysis begins, not fitted to the results afterwards. Two buckets labelled "low" and "medium" are not enough to survive a notified body audit under MDR. Defensible criteria are quantitative where possible, tied to clinical consequence, and documented with their rationale in the risk management plan required by EN ISO 14971:2019+A11:2021 clause 4.4 and MDR Annex I §3.
By Tibor Zechmeister and Felix Lenhard.
TL;DR
- Acceptability criteria live in the risk management plan and are approved before risk analysis starts.
- "Low" and "medium" buckets without quantitative anchors are the single most common auditor pushback on startup risk files.
- Criteria must be tied to clinical consequence, not to internal convenience.
- The same criteria apply at initial evaluation, after risk control, and during post-market review.
- Criteria cannot be edited mid-project to turn a red cell green. That is the fastest way to lose credibility with a notified body.
- Defensible criteria cite state of the art, benefit-risk balance, and comparable devices on the market.
Why this matters
Tibor's audit work surfaces a recurring pattern. A startup shows up with a two-bucket matrix: "low" and "medium" risk, both implicitly acceptable, nothing labelled "high". When the auditor asks what "low" means, the answer is a hand gesture. When the auditor asks where the threshold between low and medium is anchored, there is no anchor. When the auditor asks how the criteria were defined, the answer is "we took them from a template".
Criteria without anchors are not criteria. They are vibes. MDR risk evaluation cannot rest on vibes, and notified bodies know it.
Felix sees the founder-facing version of the problem. A CTO opens the risk management software, sees a default 5x5 matrix with green, yellow and red cells, and assumes the tool has already answered the acceptability question. It has not. The tool is a container. The acceptability judgment is the manufacturer's.
This post is the bridge between those two views. It describes how to write acceptability criteria a startup can defend, and why the most common shortcuts do not survive contact with an auditor.
What MDR actually says
MDR Annex I §3 places the criteria in a specific document. The risk management plan is the first required output of the risk management system, and among other things it must contain the criteria for risk acceptability. This is mirrored by EN ISO 14971:2019+A11:2021 clause 4.4, which is the clause notified bodies read when they open the risk management plan.
Clause 4.4 of EN ISO 14971:2019+A11:2021 requires the risk management plan to include the criteria for risk acceptability, based on the manufacturer's policy for establishing those criteria (clause 4.2), along with the method for evaluating the overall residual risk and the criteria for acceptability of the overall residual risk. The policy for establishing criteria is where most startup plans are silent. A policy is not the same as a matrix. The policy is the reasoning that leads to the matrix.
MDR Annex I §8 is the constraint on the criteria. The generally acknowledged state of the art, the benefit-risk balance, and the specific characteristics of the device and its intended purpose must all be reflected. Criteria that ignore state of the art are not MDR compliant even if they are internally consistent.
MDR Annex I §4, covered in depth in risk evaluation under MDR, creates the second constraint. Acceptability is not the stopping rule. It is an input to the continuing reduction obligation. Criteria must be written with that in mind, otherwise teams will read the criteria as permission to stop and miss the reduction requirement.
What a "policy for establishing criteria" looks like
A credible policy names its inputs. Typical inputs Tibor looks for in audits:
- Clinical severity scales drawn from published harm descriptions, not from internal scoring habits.
- Probability scales anchored to frequency data from comparable marketed devices or from clinical literature, with the sources cited.
- A benefit-risk reasoning clause that ties acceptability back to the device's intended purpose under MDR Article 2(12).
- A state-of-the-art review of how similar devices in the same classification set their criteria.
- A top management approval step, because the risk management plan has to be signed at management level.
A policy that says "the criteria were set by the risk manager based on engineering judgment" is not a policy. It is an abdication.
A worked example
A wearable cardiac monitor, Class IIa. The team drafts its first acceptability matrix.
Version 1. A 3x3 matrix. Severity: low, medium, high. Probability: rare, occasional, frequent. Green cells anywhere severity is low. Yellow cells for medium severity + occasional or rarer. Red cells for high severity or frequent medium severity.
Tibor's audit critique would be immediate. "Low severity" is not defined. "Rare" is not quantified. "The device produces false positives" could land in any cell depending on who is reading the matrix.
Version 2. The team adds quantitative anchors. Severity S1 is no harm or reversible discomfort. S2 is reversible harm requiring medical attention but no hospitalisation. S3 is reversible harm requiring hospitalisation. S4 is irreversible harm. S5 is life-threatening or fatal. Probability P1 is less than one event per 10,000 uses. P2 is between 1 per 10,000 and 1 per 1,000. P3 is between 1 per 1,000 and 1 per 100. P4 is between 1 per 100 and 1 per 10. P5 is more frequent than 1 per 10.
A 5x5 matrix now exists with named cells. Green, yellow and red regions are drawn based on device context: a cardiac monitor can afford fewer acceptable high-severity cells than a thermometer because the clinical use case is different.
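The Version 2 matrix is concrete enough to express as a data structure. A minimal Python sketch, with the caveat that the zone assignments below are purely illustrative: neither MDR nor ISO 14971 prescribes them, and each manufacturer defines its own in the plan.

```python
# Probability bands: upper bound on events per use (P1 is rarest).
# Thresholds mirror the Version 2 anchors from the worked example.
PROBABILITY_BANDS = {
    "P1": 1 / 10_000,   # fewer than 1 event per 10,000 uses
    "P2": 1 / 1_000,
    "P3": 1 / 100,
    "P4": 1 / 10,
    "P5": 1.0,          # more frequent than 1 per 10 uses
}

def probability_band(event_rate: float) -> str:
    """Map an observed event rate (events per use) to a P-band."""
    for band, upper_bound in PROBABILITY_BANDS.items():
        if event_rate < upper_bound:
            return band
    return "P5"

# Acceptability zones: rows are severity S1..S5, columns P1..P5.
# "green" = acceptable, "yellow" = acceptable with documented
# benefit-risk rationale, "red" = unacceptable. Illustrative only.
MATRIX = {
    "S1": ["green",  "green",  "green",  "green", "yellow"],
    "S2": ["green",  "green",  "yellow", "yellow", "red"],
    "S3": ["green",  "yellow", "yellow", "red",    "red"],
    "S4": ["yellow", "yellow", "red",    "red",    "red"],
    "S5": ["yellow", "red",    "red",    "red",    "red"],
}

def evaluate(severity: str, event_rate: float) -> str:
    """Look up the acceptability zone for a severity/rate pair."""
    column = int(probability_band(event_rate)[1]) - 1
    return MATRIX[severity][column]
```

A false-positive rate of 1 per 5,000 uses on an S3 harm lands in P2 and, under this illustrative matrix, in a yellow cell: acceptable only with a documented benefit-risk rationale.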
Version 3. The team adds the policy paragraph. It cites the state of the art by referencing two comparable marketed cardiac monitors and the false-positive rates disclosed in their public literature. It cites benefit-risk reasoning by stating that the device's benefit (early arrhythmia detection) justifies accepting certain P2-S2 cells that would be unacceptable in a lower-benefit device. Top management signs the plan.
That third version is defensible. The first two are not.
Tibor has seen a variant of Version 1 in the field more times than any other failure mode. A team building a handheld device with prolonged skin contact ran with a two-bucket "acceptable / unacceptable" scheme. Post-market feedback surfaced skin irritations that were nominally in the acceptable bucket. The notified body reopened the risk file at the next surveillance audit and spent a full day inside it. The cost of writing Version 3 acceptability criteria from the start is less than one day of that audit time.
The Subtract to Ship playbook
Felix's coaching pattern for startup acceptability criteria compresses into six steps.
Step 1. Write the policy first. One page in the risk management plan. What inputs define the scales. What state of the art was consulted. What benefit-risk reasoning ties the criteria to the intended purpose. Without this page the matrix is decoration.
Step 2. Anchor severity to clinical consequence. Severity scales must be described in terms a clinician would recognise, not in terms engineers invented. "Low" is not a clinical consequence. "Reversible harm requiring medical attention but no hospitalisation" is.
Step 3. Anchor probability to numbers where possible. Quantitative ranges (1 per 10,000 uses, 1 per 1,000 uses) are always stronger than qualitative labels (rare, occasional). Where data is unavailable, qualitative labels are acceptable but must be defined with reference to comparable devices or published data.
Step 4. Design the matrix around the device, not around a template. A 3x3 matrix is enough for some simple devices. For software, wearables, implantables, connected devices, a 5x5 matrix usually earns its extra rows by surfacing distinctions that matter. The risk matrix design post walks through when 3x3 suffices and when it does not.
Step 5. Lock the criteria before analysis begins. The risk management plan is approved. The criteria are frozen. If new evidence forces a change, that change is itself a documented deviation with a rationale and a signature. Criteria cannot drift silently. Criteria cannot be edited mid-project to turn a red cell green. Notified bodies check version history.
Step 6. Reapply the criteria at every lifecycle stage. Initial evaluation, after risk control, after production data, after post-market surveillance. The same criteria apply. If post-market data shifts probabilities, risks migrate across the matrix and the evaluation reopens. The criteria themselves do not move.
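The discipline in Steps 5 and 6 can be sketched in a few lines of Python: the criteria are an immutable object, and re-evaluation applies that same object to updated post-market data. All names and cell assignments here are hypothetical, chosen only to illustrate the pattern.

```python
from dataclasses import dataclass

@dataclass(frozen=True)  # frozen=True: no silent mid-project edits
class Criteria:
    version: str
    approved_by: str
    unacceptable: frozenset  # set of (severity, P-band) red cells

# Locked at plan approval, before the first risk analysis session.
CRITERIA = Criteria(
    version="1.0",
    approved_by="top management",
    unacceptable=frozenset({("S2", "P5"), ("S3", "P4"), ("S3", "P5")}),
)

def reevaluate(severity: str, p_band: str) -> str:
    """Apply the same locked criteria at any lifecycle stage."""
    if (severity, p_band) in CRITERIA.unacceptable:
        return "unacceptable"
    return "acceptable"

# At design review, a skin-irritation risk sits at S2/P2: acceptable.
# Post-market data later shifts the estimate to S2/P5: the risk
# migrates into a red cell and the evaluation reopens. The criteria
# object never changed; only the probability estimate did.
```

Attempting to assign to a field of a frozen dataclass raises `FrozenInstanceError`, which is a reasonable stand-in for the audit-trail point: any change to the criteria must be a new, signed version, not an edit in place.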
Reality Check
- Is there a written policy in your risk management plan explaining how the acceptability criteria were established, or does the plan simply present a matrix?
- Are severity levels defined in clinical consequence language a clinician could recognise?
- Are probability levels anchored to quantitative ranges or to named comparable devices, or are they labelled "rare" and "frequent" with no anchor?
- Could you show a notified body a version history proving the criteria were locked before the first risk analysis session?
- Does the plan explain why this specific matrix size (3x3, 4x4, 5x5) was chosen for this specific device?
- Was the risk management plan signed by top management, as required by EN ISO 14971 clause 4.2?
- If your post-market surveillance data showed a risk shifting from "acceptable" to "unacceptable", does your process reopen the evaluation automatically, or does it wait for the next scheduled review?
- If an auditor asked how you handle benefit-risk reasoning for cells that sit on the acceptable-unacceptable boundary, is the answer documented anywhere?
Frequently Asked Questions
Is a two-bucket "acceptable / unacceptable" scheme ever enough? Only for the simplest possible devices, and even then it is risky. The two-bucket scheme collapses severity and probability into a single judgment that the auditor cannot independently verify. Tibor has never seen a two-bucket scheme survive a notified body audit on anything above a Class I mechanical device, and even there the weakness shows when post-market data arrives.
Who approves the acceptability criteria? Top management, per EN ISO 14971:2019+A11:2021 clause 4.2 on management responsibility. This is not a formality. The signature carries the organisational commitment to apply the criteria consistently.
Can the criteria be different for different product lines? Yes, provided each product has its own risk management plan with its own criteria, and the criteria reflect the specific device and intended purpose. What is not acceptable is applying one product's criteria to another product without documented justification.
What if we genuinely have no probability data? State that. Use qualitative scales with explicit reference to comparable devices, published clinical literature, or expert judgment. Document the uncertainty. Flag the items where data is weakest as priorities for post-market data collection. An honest qualitative scale with documented uncertainty is stronger than a quantitative scale built on invented numbers.
Can AI help write acceptability criteria? Tibor's view: AI is useful for creative hazard identification where teams miss failure modes. For acceptability criteria specifically, AI can draft the policy paragraph and suggest scale definitions, but the clinical anchoring and the state-of-the-art review must be done by humans who know the device and the clinical context. The criteria carry a management signature. That signature cannot be delegated to a model.
Do we need to redo our criteria if a competitor publishes new safety data? Potentially yes. State of the art under MDR Annex I §8 is not static. If a competitor establishes a higher bar for the same risk category, the question is whether your current criteria still represent state of the art. The answer goes into the next risk management review.
Related reading
- Risk evaluation under MDR – the reduction obligation that acceptability criteria must not be allowed to override.
- The ISO 14971 Annex Z trap – why the harmonised amendment matters when writing the plan.
- The risk matrix: designing one that works for your device type – how to pick a matrix size and shape that fits.
- MDR Annex I and the GSPRs – the requirements the criteria must align with.
Sources
- Regulation (EU) 2017/745 on medical devices, consolidated text. Annex I, Chapter I §2, §3, §4, §8.
- EN ISO 14971:2019+A11:2021, Medical devices – Application of risk management to medical devices, clauses 4.2 and 4.4 on management responsibility and the risk management plan.
- EN ISO 14971:2019+A11:2021, Annex Z mapping to Regulation (EU) 2017/745.