---
title: Common Usability Engineering Audit Findings
description: The usability patterns that fail notified body audits: insider-only testing, formative called summative, 150-page IFUs, thin use specifications, and blaming users.
authors: Tibor Zechmeister, Felix Lenhard
category: Usability Under MDR
primary_keyword: usability audit findings MDR
canonical_url: https://zechmeister-solutions.com/en/blog/common-usability-audit-findings
source: zechmeister-solutions.com
license: All rights reserved. Content may be cited with attribution and a link to the canonical URL.
---

# Common Usability Engineering Audit Findings

*By Tibor Zechmeister (EU MDR Expert, Notified Body Lead Auditor) and Felix Lenhard.*

> **Five usability patterns account for most of the nonconformities notified bodies raise on early MedTech startups. Insider-only testing that substitutes engineers and sales staff for representative users. A formative group review that gets called a summative evaluation. A 150-page IFU that no real user can read. A thin use specification that skips procedures. Use errors that get explained away as training failures. Each pattern has a direct fix, and each fix traces back to specific clauses in EN 62366-1:2015+A1:2020 and MDR Annex I.**


## TL;DR
- Insider-only testing is the single most common usability nonconformity. Engineers, sales staff, friendly customers, and KOLs are not representative users and do not satisfy EN 62366-1:2015+A1:2020.
- Formative-called-summative is the second most common. A group review is formative at best, and a notified body will push back the moment it reads the file.
- A 150-page IFU is not a usability solution. It is a usability failure disguised as documentation, and the standard response is to test whether a real user can complete the task with the IFU and no coaching.
- Thin use specifications cause hazard analysis to miss entire scenarios. The fix is to decompose every real-world procedure: cleaning, transport, installation, normal operation, edge cases.
- Use errors framed as training failures violate the EN 62366-1 risk control hierarchy. Training is the last resort, not the first.
- All five patterns produce the same downstream result: change control after market entry, notified body re-engagement at the next surveillance audit, and a cost curve that would have been flat if the team had invested in usability earlier.

## Why one post covers all five patterns

This post is a summary for the usability cluster. Tibor has audited enough early-stage MedTech startups to see the same findings recur in a roughly stable distribution. Felix confirms the pattern across the 44 startups he has coached through Subtract to Ship: founders who check their own files against this list and fix the weakest pattern first cut their usability rework bill in half.

The regulatory anchors are the same across all five. MDR Annex I §5 requires reduction of use-related risks as far as possible. Annex I §22 imposes additional requirements for lay users. Annex I Chapter III §23 governs the information supplied with the device. EN 62366-1:2015+A1:2020 operationalizes these requirements through the usability engineering process, with EN ISO 14971:2019+A11:2021 providing the risk management linkage.

## Pattern 1: insider-only testing

The pattern looks generous on paper. A startup runs a usability test with eight participants, collects video, writes up the results, and files the evidence. The problem is the participant list. Two are engineers. Two are sales staff. Two are friendly customers. Two are key opinion leaders.

Every one of those participants is too skilled, too familiar, or too motivated to represent the intended user population. Engineers find the device intuitive because they designed it. Sales staff rehearse it daily. KOLs are clinical experts far above the representative user level. The real user, a 70-year-old patient at home or a nurse on a busy ward, would not find it intuitive at all.

The cost shows up later: complaints from users who cannot complete the task, change control, notified body re-engagement at the next surveillance audit. Remediation at that stage costs ten times what recruiting the right users would have cost up front. The fix is disciplined recruiting against the intended user group defined in the use specification. Recruiting costs money. The alternative costs more money later.

## Pattern 2: formative evaluation called summative

The second pattern is definitional confusion. A startup runs a group review of the prototype. The development team passes the device around a conference table. Everyone agrees it is intuitive. The team writes a short memo and files it under "summative evaluation". When the notified body reads the file, it sees a group review and pushes back.

Formative and summative are different instruments in EN 62366-1:2015+A1:2020. Formative is early, iterative, diagnostic; it can be informal and small-sample. Summative is late, final, confirmatory. It requires recruited users representative of the intended user group, a real or simulated use environment, recorded observations, and documented outcomes tied to the hazard-related use scenarios.

A group review cannot be a summative evaluation. It fails all four non-negotiable criteria: the participants are the development team, the environment is the engineering office, the observations are informal memories, and the outcomes are not tied to hazard-related use scenarios. A notified body identifies the gap in under five minutes.
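The four criteria lend themselves to a mechanical self-check before the file goes anywhere near an auditor. A minimal Python sketch (the field names are illustrative assumptions, not terms from the standard):

```python
from dataclasses import dataclass


@dataclass
class SummativeEvaluation:
    """Record of a claimed summative evaluation (illustrative fields only)."""
    participants_recruited_representative: bool
    environment_real_or_simulated: bool
    observations_recorded: bool
    outcomes_tied_to_hazard_scenarios: bool


def missing_criteria(ev: SummativeEvaluation) -> list[str]:
    """Return which of the four non-negotiable criteria are unmet."""
    checks = {
        "recruited representative users": ev.participants_recruited_representative,
        "real or simulated use environment": ev.environment_real_or_simulated,
        "recorded observations": ev.observations_recorded,
        "outcomes tied to hazard-related use scenarios": ev.outcomes_tied_to_hazard_scenarios,
    }
    return [name for name, met in checks.items() if not met]


# The group-review-called-summative pattern fails every criterion:
group_review = SummativeEvaluation(False, False, False, False)
```

Running `missing_criteria(group_review)` returns all four criteria, which is exactly the gap a notified body identifies in under five minutes.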

The fix is to budget for a real summative evaluation early in the program, with recruited participants, environment simulation, structured observation, and a written protocol aligned to the hazard-related use scenarios in the use specification.

## Pattern 3: the 150-page IFU as a usability solution

Tibor has seen the pattern on more than one device. The development team knows the device has several use-related risks. Instead of designing them out, the team adds warnings, cautions, and step-by-step instructions to the instructions for use. The IFU grows from 20 pages to 80 to 150. The team considers the risks addressed because the information is supplied.

This is not how EN 62366-1 treats safety by information. Safety by information is the last resort in the risk control hierarchy, below inherent safety by design and below protective measures by functionality. A risk that could have been designed out is not acceptably controlled just because a warning appears on page 93 of the IFU. MDR Annex I §5 asks for risk reduction as far as possible, and a 150-page IFU is not the ceiling of what is possible.

The test for IFU usability is exact and unforgiving. During summative evaluation, the recruited user receives the device and the IFU, with no coaching and no questions. The user attempts to complete the task. If the user cannot complete the task using the IFU as written, the IFU failed. A 150-page IFU almost always fails this test because a real user does not read 150 pages before first use, and a device that depends on the user reading 150 pages is designed wrong.

The fix is to do the hard work in the design. Reduce the number of use-related risks through inherent design choices. Add protective measures for the risks that remain. Only then turn to the IFU, which should be short, visual, and task-focused. Tibor has watched startups cut an IFU from 120 pages to 18 pages by addressing the underlying design and removing the warning text that was papering over design flaws. The notified body received that shortened IFU with enthusiasm, because it was evidence of a governed usability process, not a weaker one.

## Pattern 4: thin use specification that skips procedures

The fourth pattern is upstream of the other four. A startup writes a use specification that describes the intended user as "clinician" and the use environment as "clinical setting". The document is one page long. The team believes this satisfies the use specification requirement in EN 62366-1. It does not.

The use specification is the most-skipped and most-important artefact in the usability process. Tibor emphasizes this consistently. The right approach is to divide and conquer. Do not simply write "the clinician uses the device". Decompose the use into every real-world procedure the device will be subjected to, including cleaning, transport, initial installation, normal operation, edge cases, and maintenance. Each procedure has its own intended users, its own use environment, its own potential hazards. Granular procedures make hazards visible. Without that decomposition, hazard analysis misses scenarios, and the scenarios it misses are exactly the ones that surface as complaints after market entry.

The outdoor-use hazard that surfaced on a tongue-controlled wheelchair is the canonical example. The use specification implicitly assumed indoor use. When the auditor asked about outdoor use, an entirely new class of hazards appeared, including insect attraction to the mouthpiece color. A use specification that had decomposed the use environment into indoor and outdoor conditions from the start would have surfaced the outdoor hazards during hazard analysis, not during an audit.

The fix is to write a use specification that reads like a detailed operational manual for a stranger. Every real-world procedure gets its own section. Every use environment gets explicit description. Every intended user group gets characterized with age, skill, training, and condition. The document is longer than one page. It is almost always longer than ten. It is the foundation on which every downstream usability deliverable rests.
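The decomposition above can be pictured as a simple data structure: one entry per real-world procedure, each carrying its own users, environments, and hazards. A sketch in Python, assuming hypothetical example entries (the wheelchair hazard is from the source; the rest are illustrative):

```python
from dataclasses import dataclass


@dataclass
class Procedure:
    """One real-world procedure in the use specification (illustrative)."""
    name: str
    intended_users: list[str]    # in practice: age, skill, training, condition
    use_environments: list[str]  # lighting, noise, climate, indoor/outdoor
    potential_hazards: list[str]


# Decomposing beyond "the clinician uses the device":
use_specification = [
    Procedure("cleaning", ["ward nurse"], ["utility room"],
              ["residual disinfectant exposure"]),
    Procedure("transport", ["porter"], ["corridor, elevator"],
              ["drop damage", "connector strain"]),
    Procedure("normal operation", ["nurse on a busy ward"],
              ["ward, day and night lighting"],
              ["misreading display under low light"]),
    Procedure("outdoor use", ["patient"], ["sunlight, rain, insects"],
              ["insect attraction to mouthpiece color"]),
]

# Empty hazard lists flag scenarios the hazard analysis is about to miss.
gaps = [p.name for p in use_specification if not p.potential_hazards]
```

The point of the structure is the forcing function: a procedure with an empty user, environment, or hazard list is visibly incomplete before hazard analysis starts, rather than invisibly incomplete during an audit.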

## Pattern 5: use errors framed as training failures

The fifth pattern is a philosophical mistake that shows up in writing. A user makes an error. The manufacturer investigates. The investigation concludes that the user was not adequately trained and that additional training material will fix the issue. The risk control in the risk file becomes "enhanced training".

This framing violates the risk control hierarchy in EN ISO 14971:2019+A11:2021. The hierarchy is explicit. First, inherent safety by design. Second, protective measures by functionality. Third, safety by information, including labels, warnings, and training. Training is the third tier, not the first.

Tibor has seen the pattern fail audits consistently. The auditor asks why the risk was not controlled by design. The team answers that design change was too expensive. The auditor then asks for evidence that the design change was evaluated and documented as infeasible. Most teams do not have that evidence.

The fix is to force the team to walk through the hierarchy for every use-related risk. Could this risk be eliminated by design? If not, could it be controlled by a protective measure? Only after those questions have been answered with evidence does training or information become an acceptable primary control. Tibor has watched startups cut their use-related residual risk list in half by running this walk-through rigorously.
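The walk-through above is a strict decision procedure, and it can be sketched as one. A minimal Python version, assuming hypothetical dictionary keys for the evidence in the risk file (the keys are illustrative, not terms from EN ISO 14971):

```python
def primary_control(risk: dict) -> str:
    """
    Walk the risk control hierarchy for one use-related risk.
    Each tier must be evaluated, and documented infeasibility evidence is
    required before falling through to the next tier.
    """
    if risk.get("design_control_feasible"):
        return "inherent safety by design"
    if not risk.get("design_infeasibility_documented"):
        # The audit failure mode: "too expensive" with no documented evaluation.
        raise ValueError("design option not evaluated and documented as infeasible")
    if risk.get("protective_measure_feasible"):
        return "protective measure"
    if not risk.get("protective_infeasibility_documented"):
        raise ValueError("protective measure not evaluated and documented as infeasible")
    # Only now is information/training an acceptable primary control.
    return "information for safety (incl. training)"
```

Calling `primary_control({})` on a risk whose file jumps straight to "enhanced training" raises immediately, which mirrors the auditor's question: where is the evidence that the design change was evaluated and documented as infeasible?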

## The Subtract to Ship playbook for avoiding all five patterns

Write the use specification first and write it long. Every real-world procedure, every intended user group with real demographic and skill detail, every environment condition including lighting, noise, and climate. If it takes two weeks, it is worth two weeks. Every downstream usability deliverable is faster and cleaner when the use specification is complete.

Budget summative evaluation as a funded line item in the plan, not as a late-stage compliance activity. Recruit participants who actually match the intended user group. Pay them. Schedule the test. Record the sessions. Tie the outcomes back to hazard-related use scenarios in the use specification.

Resist the IFU as a safety control until the design has been exhausted. Every page added to the IFU should be justified against the question "could this risk have been designed out instead". For the risks that pass that test, keep the IFU short, visual, and task-focused.

Run the risk control hierarchy on every use-related risk. Document that the walk-through was performed. If a risk is ultimately controlled by training, the file must show why the first two tiers were infeasible.

Treat insider testing as a temptation to avoid, not a shortcut to accept. Friends, engineers, sales staff, KOLs, and employees cannot substitute for recruited representative users under EN 62366-1. The moment a team notices it is cutting costs by using insiders, it is accumulating change-control debt that will come due later.

## Reality Check

1. Could your summative evaluation satisfy the four non-negotiable criteria of recruited representative users, realistic environment, recorded observations, and documented outcomes?
2. Are any of your test participants employees, sales staff, friendly customers, or KOLs instead of representative users?
3. Does your use specification decompose the use into every real-world procedure, including cleaning, transport, installation, normal operation, maintenance, and edge cases?
4. If you handed your IFU to a recruited user with no coaching, could they complete a typical task, and have you actually tested this?
5. Does your risk file show the risk control hierarchy walk-through for every use-related risk, with training appearing only where design and protective measures were infeasible?
6. How many pages is your IFU, and for each page, could the underlying risk have been designed out instead?
7. Have you confused formative evaluation with summative evaluation anywhere in your technical file?

## Frequently Asked Questions

**How many participants are needed for summative evaluation?**
EN 62366-1:2015+A1:2020 does not prescribe a fixed number, but common practice is at least 15 representative users per distinct user group. A device with two distinct user groups, for example patients and caregivers, needs 15 per group.

**Can a formative evaluation use colleagues as participants?**
Formative evaluation is more flexible on participant selection, and colleagues are acceptable for early iterations. Summative evaluation is not flexible. Representative users are required.

**Is a 150-page IFU ever acceptable?**
Rarely, and only for complex capital equipment where the length is justified by the genuine operational complexity of the device. Even then, the IFU must be structured so that a user can find the relevant section quickly without reading the whole document, and the summative evaluation must still demonstrate that users can complete tasks with the IFU.

**What is the difference between a use error and a use-related risk?**
A use error is a user action, or lack of action, that leads to a result different from what the manufacturer intended or the user expected; it may or may not lead to harm. A use-related risk is the combination of the probability of harm arising from a use error and the severity of that harm. EN 62366-1 addresses use errors; EN ISO 14971 addresses the risks that follow from them.

**Can training ever be the primary risk control?**
Only for risks that cannot feasibly be controlled by inherent design or protective measures, and only when the infeasibility is documented. Training as a default choice violates the hierarchy.

**Does this list apply to software-only medical devices?**
Yes. Every pattern appears in software-only devices as frequently as in hardware. Insider-only testing, formative-called-summative, overlong in-app help text that replaces a 150-page IFU, thin use specifications, and training-as-first-resort all occur in software programs.

## Related reading
- [What Is Usability Engineering for Medical Devices? A Startup Introduction](/blog/usability-engineering-medical-devices-startup) The parent usability primer.
- [MDR Color Coding and Visual Design](/blog/mdr-color-coding-visual-design-iec-62366) The color-specific sibling post in this cluster.
- [MDR Alarm Management: Where Usability Meets IEC 60601-1-8](/blog/mdr-alarm-management-iec-60601-1-8-usability) The alarm-specific sibling post in this cluster.
- [IEC 60601-1-6 Usability Cross-Reference](/blog/iec-60601-1-6-usability-cross-reference) How electrical safety hands off to usability.
- [Risk Management and Usability Engineering Link](/blog/risk-management-usability-engineering-link) The EN ISO 14971 and EN 62366-1 integration.

## Sources
1. Regulation (EU) 2017/745 on medical devices, consolidated text. Annex I §5, §22, Chapter III §23.
2. EN 62366-1:2015+A1:2020, Medical devices. Part 1: Application of usability engineering to medical devices.
3. EN ISO 14971:2019+A11:2021, Medical devices. Application of risk management to medical devices.
4. EN 60601-1:2006+A1+A12+A2+A13:2024, Medical electrical equipment. Part 1: General requirements for basic safety and essential performance.

---

*This post is part of the [Usability Under MDR](https://zechmeister-solutions.com/en/blog/category/usability) cluster in the [Subtract to Ship: MDR Blog](https://zechmeister-solutions.com/en/blog). For EU MDR certification consulting, see [zechmeister-solutions.com](https://zechmeister-solutions.com).*
