---
title: "MDR Summative Usability Evaluation: The Final Validation Test"
description: How MDR summative usability evaluation works under EN 62366-1:2015+A1:2020 clause 5.9, with intended users, realistic environments, and recorded evidence.
authors: Tibor Zechmeister, Felix Lenhard
category: AI, ML & Algorithmic Devices
primary_keyword: MDR summative usability evaluation IEC 62366
canonical_url: https://zechmeister-solutions.com/en/blog/mdr-summative-usability-evaluation-iec-62366
source: zechmeister-solutions.com
license: All rights reserved. Content may be cited with attribution and a link to the canonical URL.
---

# MDR Summative Usability Evaluation: The Final Validation Test

*By Tibor Zechmeister (EU MDR Expert, Notified Body Lead Auditor) and Felix Lenhard.*

> **MDR summative usability evaluation is the final validation test required by EN 62366-1:2015+A1:2020 clause 5.9. It uses intended users, a real or realistic use environment, recorded observations, and documented outcomes to prove that the frozen design is safe and effective in the hands of the actual operator. Notified bodies expect this evidence in the technical documentation under MDR Annex II.**

## TL;DR
- Summative evaluation is the formal validation step under EN 62366-1:2015+A1:2020 clause 5.9, run once on a frozen design.
- It requires representative intended users, a realistic use environment, documented scenarios, and recorded observations.
- Testing with engineers, sales staff, friendly customers, or key opinion leaders (KOLs) is not summative. Notified bodies will reject it.
- Summative findings feed back into risk management under EN ISO 14971:2019+A11:2021 and may trigger design changes, IFU updates, or additional risk controls.
- The summative evaluation report is part of the usability engineering file referenced by MDR Annex II and reviewed by the notified body during conformity assessment.
- For mobile, handheld, and software-only devices, a simulated environment can be valid. For devices requiring clinical or hospital conditions, there is no alternative to real-environment testing.

## Why this matters more than any other usability step

Tibor has seen summative evaluation surface issues that no internal review, no formative session, and no simulation ever caught. The reason is structural. Development teams have blind spots about their own designs. They know the device too well. They hold it the way they built it. They read the screen the way they drew it. Real users, recruited from the intended population, do not share those assumptions. They hold the device differently. They read the screen differently. And they reveal what the team could not see.

This is why summative evaluation cannot be replaced by anything else. It is not a better formative session. It is not a longer design review. It is a structurally different activity with a structurally different purpose: to validate that the finished design, as it will be manufactured and shipped, is safe and effective when handed to the real user in the real environment.

For MedTech startups, summative evaluation is also the single most expensive usability activity. Tibor notes that for most device categories, the dominant usability cost is summative. Budget realism matters. Underestimating summative is a common founder mistake that surfaces three months before the notified body submission, when there is no time to run it properly and no money to do it twice.

The regulatory stakes are high. MDR (EU) 2017/745 Annex I §5 requires that devices be designed and manufactured in such a way as to reduce, as far as possible, the risks associated with the ergonomic features of the device and the environment in which the device is intended to be used. Annex I §22 covers protection against the risks posed by devices intended for use by lay users. The harmonised standard EN 62366-1:2015+A1:2020 provides the presumption of conformity, and clause 5.9 is the validation step that proves the reduction happened.

## What EN 62366-1:2015+A1:2020 actually requires

Clause 5.9 of EN 62366-1:2015+A1:2020 requires summative evaluation of the user interface to validate the safety of use. The clause sets out specific expectations.

First, the participants must be representative of the intended user groups identified in the use specification under clause 5.1. If the device has multiple user groups, for example trained clinicians and lay home users, summative evaluation must cover each group separately. Testing only the easier group is not compliant.

Second, the use environment must be representative of the intended use environment. For a home-care device, that means a simulated home. For an ambulance device, that means realistic ambulance conditions. For a hospital device, that means hospital conditions or a credible simulation. The standard does not mandate the real environment in every case, but it does mandate realism. A conference room is not a realistic home. An engineering lab is not a realistic clinic.

Third, the hazard-related use scenarios identified under clause 5.4 and selected under clause 5.5 must be covered. Summative evaluation is not a general usability review. It is a targeted validation that the specific scenarios that could lead to harm are safely handled by the intended user. Every selected hazard-related use scenario must appear in the summative test protocol.

Fourth, observations must be recorded. Video, structured notes, and think-aloud protocols are all acceptable methods. The records form part of the usability engineering file under clause 4.2 and feed into the summative evaluation report.

Fifth, the outcomes must be documented and evaluated. Every observed use error, close call, or difficulty must be assessed against the risk analysis. If the observation indicates a new or unmitigated hazard, the design, the risk controls, or the user interface must be updated. Summative evaluation is not a pass/fail exam. It is a validation activity with structured follow-up.

Sixth, the sample size must be justified. The usual convention is at least fifteen participants per distinct user group, drawn from FDA human factors guidance and widely applied in the EU context. EN 62366-1:2015+A1:2020 itself does not mandate a specific number, but notified bodies expect a defensible rationale for the sample size.
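The fifteen-participant convention can be motivated by a simple binomial model: the chance that at least one of n participants encounters a use problem affecting a fraction p of the user population is 1 − (1 − p)^n. A minimal sketch follows; the model and the example prevalence of 18% are illustrative assumptions, not requirements of the standard:

```python
# Illustrative binomial model behind the 15-participant convention.
# p = fraction of the intended user population affected by a use problem,
# n = number of summative participants. This is a modelling assumption,
# not a requirement of EN 62366-1:2015+A1:2020.

def detection_probability(p: float, n: int) -> float:
    """Probability the problem is observed at least once: 1 - (1 - p)^n."""
    return 1.0 - (1.0 - p) ** n

if __name__ == "__main__":
    # A problem affecting roughly 18% of users is caught with about 95%
    # probability by 15 participants, but only about 63% by 5 participants.
    for n in (5, 10, 15, 20):
        print(f"n={n:2d}: detection probability {detection_probability(0.18, n):.3f}")
```

Adding participants beyond fifteen yields diminishing returns for common problems, which is why notified bodies review the rationale behind the number rather than the raw number itself.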

## A worked example: the left-handed display that only surfaced in summative

Tibor recounts a case that illustrates why summative matters. A small development team built a handheld medical device with a touchscreen. The engineering team was mostly left-handed. Without realising, they designed the display orientation and the grip ergonomics around their own natural hold. Internal reviews raised no issues. Formative sessions with friendly users, who happened to also be left-handed or ambidextrous, raised no issues. The device entered summative evaluation confident that usability was clean.

The first summative session with a right-handed user surfaced the problem within minutes. The user held the device in the natural right-hand grip, looked at the display, and said the screen was upside down. Subsequent sessions confirmed the pattern across the right-handed population. The fix was a software iteration that let the user flip display orientation on first use, plus a corresponding update to the IFU and the risk management file.

The lesson Tibor draws from this story is not that the team was negligent. It is that summative evaluation is the only point in the development cycle where the team's blind spots become visible. No amount of internal review would have caught this. No formative session with convenience-recruited users would have caught it. Only a structured summative evaluation with representative right-handed participants in a realistic holding posture surfaced the issue. The fix came late, but it came before market release. That is what summative evaluation is designed to do.

A second case, also from Tibor's casebook, involved a tongue-controlled mouthpiece for a wheelchair used by quadriplegic patients. The mouthpiece colour was chosen during indoor testing. Summative evaluation included outdoor use scenarios, and the auditors raised the question of whether the chosen colour might attract insects. It did. The colour attracted bees. An outdoor patient with the mouthpiece could be stung while unable to remove it. The fix was a colour change and a documented outdoor-use scenario in the use specification. Without summative evaluation in a realistic environment, that hazard would have reached the market.

## The Subtract to Ship playbook for summative evaluation

Felix's Subtract to Ship approach treats summative evaluation as a high-stakes event that must be planned like one. The playbook has seven steps.

**Step 1: Freeze the design before you start.** Summative evaluation runs on the exact device that will be manufactured. Changes to hardware, firmware, software, or IFU during summative invalidate the test. Plan the design freeze as a project milestone and hold it.

**Step 2: Write the summative evaluation plan.** The plan identifies the intended user groups, the use environment, the hazard-related use scenarios to be covered, the participant count, the data capture method, and the analysis approach. This is the user interface evaluation plan required by EN 62366-1:2015+A1:2020 clause 5.7, and it goes into the usability engineering file under clause 4.2.

**Step 3: Recruit real users.** Tibor is emphatic on this point. Engineers, sales staff, friendly customers, and KOLs are not representative. Budget for participant recruitment through a professional recruiter or a clinical network. The recruitment brief must match the intended user profile, including age, experience, language, and relevant disabilities or impairments.

**Step 4: Build a realistic environment.** For mobile, handheld, and software-only devices, a simulated environment can be valid, and this is where startups can save money without corner-cutting. For devices requiring clinical or hospital conditions, there is no shortcut. The environment must be realistic enough that the user's behaviour matches real use.

**Step 5: Run the sessions with structured observation.** Use video, think-aloud protocols, and structured observer notes. Capture every use error, every hesitation, every question. Do not coach the participant. Do not answer questions during the scenarios. If the user gets stuck, the session reveals the problem.

**Step 6: Analyse findings against the risk file.** Every observed use error is evaluated against the hazard analysis under EN ISO 14971:2019+A11:2021. New hazards trigger new risk controls. Confirmed hazards with adequate controls are documented as validated. Unresolved hazards block the submission until they are addressed.
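The three outcomes in Step 6 can be sketched as a simple traceability check: every observation is matched against the hazard analysis, and anything without an adequate risk control blocks the submission. The data model and field names below are hypothetical, chosen for illustration; they are not prescribed by EN 62366-1 or EN ISO 14971.

```python
# Hypothetical sketch of the Step 6 triage: match summative observations
# against the risk file and sort them into the three outcomes. Field names
# (hazard_id, risk_control, session) are illustrative assumptions.
from dataclasses import dataclass
from typing import List, Optional, Tuple

@dataclass
class Hazard:
    hazard_id: str
    risk_control: Optional[str]  # None = no risk control defined yet

@dataclass
class Observation:
    session: str
    description: str
    hazard_id: Optional[str]  # None = no matching hazard in the risk file

def triage(
    observations: List[Observation], hazards: List[Hazard]
) -> Tuple[List[Observation], List[Observation], List[Observation]]:
    """Sort observations into (new hazards, validated, unresolved)."""
    by_id = {h.hazard_id: h for h in hazards}
    new, validated, unresolved = [], [], []
    for obs in observations:
        hazard = by_id.get(obs.hazard_id) if obs.hazard_id else None
        if hazard is None:
            new.append(obs)        # new hazard -> new risk control needed
        elif hazard.risk_control:
            validated.append(obs)  # existing control confirmed as adequate
        else:
            unresolved.append(obs) # known hazard, no adequate control: blocks
    return new, validated, unresolved
```

The point of the sketch is the discipline, not the tooling: an unresolved entry in the third bucket is exactly the kind of gap a notified body reviewer will find if the manufacturer does not find it first.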

**Step 7: Document the summative evaluation report.** The report is the primary evidence the notified body will review. It must describe the participants, the environment, the scenarios, the observations, the analysis, and the conclusions. The report becomes part of the technical documentation under MDR Annex II.

One additional Felix observation: the recruited participants for summative evaluation can sometimes become early customers. Done ethically, with clear separation between the test session and any commercial conversation, summative evaluation can produce dual value for the startup: regulatory compliance and early customer signal. This is not a reason to cut corners on the test, but it is a reason to take recruitment seriously.

## Reality Check

1. Is your design frozen for summative evaluation, or are you still making changes that will invalidate the test?
2. Have you identified every intended user group under EN 62366-1:2015+A1:2020 clause 5.1, and does your summative plan cover each one separately?
3. Is your test environment realistic enough that the user's behaviour will match real use, or are you testing in a conference room?
4. Have you recruited participants through a recruiter or clinical network, or are you relying on employees, friends, and KOLs?
5. Does your summative plan cover every hazard-related use scenario selected under clause 5.5?
6. Do you have a defensible, statistically grounded rationale for your participant count that a notified body will accept?
7. Have you budgeted for a repeat summative run if the first pass surfaces a design-breaking issue?
8. Is your summative evaluation report structured to become part of the usability engineering file and reviewed by the notified body without follow-up questions?

## Frequently Asked Questions

**What is the minimum number of participants for summative usability evaluation?**
EN 62366-1:2015+A1:2020 clause 5.9 does not specify a minimum. The widely accepted convention, drawn from FDA human factors guidance and applied pragmatically in the EU, is fifteen participants per distinct user group. Notified bodies expect a defensible rationale.

**Can summative evaluation be run in a simulated environment?**
Yes, if the simulation is realistic enough to elicit representative user behaviour. For mobile, handheld, and software-only devices, a simulated home or clinic can be valid. For devices requiring genuine clinical conditions, the environment must be real or very closely simulated.

**Do software-only medical devices and apps need summative evaluation?**
Yes. EN 62366-1:2015+A1:2020 applies to the user interface of the device regardless of its physical form. Connected devices that combine hardware and a mobile app must evaluate the full user journey, including app download, configuration, and use.

**What happens if summative evaluation reveals a new hazard?**
The hazard is fed into the risk management file under EN ISO 14971:2019+A11:2021. If the existing risk controls are insufficient, the design, the IFU, or the user interface must be updated, and the affected scenarios must be re-tested. This can delay certification, which is why formative evaluation matters.

**Is a design review meeting summative evaluation?**
No. A team reviewing the device around a table is not summative evaluation. Tibor is explicit that this is one of the most common startup audit findings. Summative requires recruited representative users, a realistic environment, and recorded observations.

**How does summative evaluation relate to the notified body audit?**
The summative evaluation report is part of the usability engineering file referenced in the technical documentation under MDR Annex II. Notified body reviewers read it in detail as part of the conformity assessment. Gaps, weak methodology, or non-representative participants are common findings.

**Can summative evaluation findings update the IFU after testing?**
Yes. If summative evaluation shows that users need additional guidance, the IFU can be updated, provided the update is then validated either through a targeted re-test or through structured justification in the risk file. IFU updates are a legitimate risk control under EN ISO 14971:2019+A11:2021, though information for safety is the lowest-priority control in the hierarchy.

## Related reading
- [Formative Usability Evaluation: How to Test Early and Often as a Startup](/blog/formative-usability-evaluation-startup) explains the iterative testing that precedes summative.
- [How to Plan and Execute Usability Tests for Medical Devices](/blog/plan-execute-usability-tests) covers the protocol and logistics for both formative and summative sessions.
- [IEC 60601-1-6 Usability Cross-Reference](/blog/iec-60601-1-6-usability-cross-reference) connects the general usability standard to medical electrical equipment.
- [Risk Management and Usability Engineering: How They Link](/blog/risk-management-usability-engineering-link) shows how summative findings feed the ISO 14971 risk file.
- [MDR Annex I GSPR](/blog/mdr-annex-i-gspr) covers the general safety and performance requirements that usability engineering validates.

## Sources
1. Regulation (EU) 2017/745 on medical devices, consolidated text. Annex I §5, §14, §22. Annex II.
2. EN 62366-1:2015+A1:2020, Medical devices, Part 1, Application of usability engineering to medical devices. Clauses 4.2, 5.1, 5.4, 5.5, 5.7, 5.8, 5.9.
3. EN ISO 14971:2019+A11:2021, Medical devices, Application of risk management to medical devices.
4. EN 60601-1:2006+A1+A12+A2+A13:2024, Medical electrical equipment, Part 1, General requirements for basic safety and essential performance.

---

*This post is part of the [AI, ML & Algorithmic Devices](https://zechmeister-solutions.com/en/blog/category/ai-ml-devices) cluster in the [Subtract to Ship: MDR Blog](https://zechmeister-solutions.com/en/blog). For EU MDR certification consulting, see [zechmeister-solutions.com](https://zechmeister-solutions.com).*
