Remote usability testing is defensible under EN 62366-1:2015+A1:2020 only when the intended use environment can be genuinely reproduced at the user's location and when the hazard-related use scenarios do not depend on clinical infrastructure the remote setting cannot provide. For software-only, mobile, and home-use devices, a remote protocol is often credible. For devices tied to clinical workflow or physical infrastructure, it collapses.

By Tibor Zechmeister and Felix Lenhard.

TL;DR

  • Remote usability testing became operationally mainstream during the pandemic and remains a legitimate modality where the intended use environment is reproducible at the user's location.
  • EN 62366-1:2015+A1:2020 does not prohibit remote testing, but it requires the same rigour as in-person testing: recruited representative users, adequate environmental simulation, recorded observations, and documented outcomes.
  • Moderated remote testing with screen sharing, video observation, and a trained moderator is the typical defensible form; unmoderated remote testing is rarely sufficient for summative evaluation.
  • For home-use devices intended for lay users under MDR Annex I §22, remote testing in the user's actual home can be more representative than a lab environment, not less.
  • For devices requiring a clinical field of view, an instrument tray, a surgical team, or a specific physical infrastructure, remote testing cannot be defended.
  • A notified body will review the justification for the remote modality, not the modality itself.

Why this matters after the pandemic

Tibor has seen the usability engineering landscape shift since 2020. Before the pandemic, summative evaluations were almost always in-person: a prepared room, a moderator on site, a recording setup, recruited users brought into the facility. The pandemic forced the industry to adapt. Moderated remote protocols were developed, tested, and accepted. Notified bodies reviewed and accepted remote usability files where the justification held. Many of those protocols did not go away when the travel restrictions lifted. They became part of the standard toolbox.

The consequence for startups is double-edged. Remote testing is now a legitimate option, but that does not mean it is the right option for every device. The temptation for a resource-constrained founder is to assume that remote is always cheaper and therefore always preferable. Tibor's observation is that this assumption is where the second wave of usability nonconformities has come from: founders run remote testing on devices where remote was never defensible, and the notified body pushes back on the justification.

Felix's coaching experience is that the question founders should ask is not "Can we do this remotely?" but "What environment does the device require the user to be in, and can that environment be reproduced at the user's location?" The answer decides the modality. It is not a matter of preference. It is a matter of whether the hazard-related use scenarios from EN 62366-1:2015+A1:2020 can be observed under the chosen conditions.

What EN 62366-1 allows

EN 62366-1:2015+A1:2020 is environment-agnostic in its text. The standard does not specify in-person or remote testing. It specifies recruited representative users from the intended user population, a use environment that adequately simulates or reproduces the intended use environment, recorded observations sufficient for analysis, and documented outcomes that feed the residual risk evaluation and the usability engineering file.

Every one of those requirements can, in principle, be met remotely. Recruited users can participate from anywhere. Screen sharing tools, two-way video, and ambient recording can capture observations. Structured protocols run by a trained moderator can collect the same data as an in-person session. Outputs can flow into the usability engineering file in the same form.

The question is whether the environment at the user's location reproduces the intended use environment. That question has different answers for different devices.

For a software-only device running on the user's own phone, the answer is usually yes. The intended use environment is the user's own phone, in the user's own hand, in a context of the user's choosing. A remote session with video observation of the user's face, screen recording of the application, and moderated prompts can reproduce the intended use environment more faithfully than any lab ever could.

For a home-use device for a lay user under MDR Annex I §22, the answer is often yes, and sometimes better than in-person. Observing a 70-year-old user in their own kitchen, with their own lighting, their own reading glasses, their own distractions, is closer to the intended use environment than bringing the same user into a clinical lab. Tibor has seen remote protocols for home-use devices produce richer observations than the in-person equivalents.

For a handheld diagnostic device used in a professional but non-hospital setting, such as a general practitioner's consulting room or a community pharmacy, remote testing can work if the user can be observed over video at their actual workplace, with the device shipped in advance. The logistics are heavier, but the justification holds.

For a device requiring clinical infrastructure, such as a surgical tool used in an operating theatre, an imaging accessory used on a scanner, a device integrated with hospital patient monitoring, or a tool requiring a multidisciplinary clinical team, the answer is no. The intended use environment cannot be reproduced at the user's remote location. Remote testing cannot observe the hazards that the clinical environment would surface.

A worked example: three startups, three decisions

Consider three startups planning their summative evaluations under EN 62366-1:2015+A1:2020.

Startup A builds a smartphone application for patient symptom tracking that qualifies as a medical device. The intended users are adult patients with a chronic condition, using the app at home. Startup A decides on a moderated remote protocol with eight recruited users across an age range from 35 to 80. Each session is scheduled via video call, the user installs the application on their own phone following the real onboarding flow, the moderator guides them through hazard-related scenarios while observing their face and their screen, and the session is recorded for analysis. The justification written into the usability engineering plan references the fact that the intended use environment is the user's own phone and home, and that the remote modality reproduces this environment more faithfully than a lab. The notified body accepts the justification. The file clears review.

Startup B builds a home dialysis accessory. The intended user is a lay user in their own home performing a multi-step procedure with real physical components. Startup B considers remote testing. The hazard-related scenarios include aseptic handling, component orientation, and the correct sequencing of physical steps. Remote observation through a webcam cannot reliably capture the critical hand motions or the aseptic technique. Startup B decides on a hybrid approach: formative evaluations are run remotely with the device shipped to participants, but the summative is run in person in a simulated home environment. The justification in the usability engineering plan distinguishes between the two phases. The notified body accepts this.

Startup C builds a laparoscopic surgical instrument. The intended user is a surgeon in an operating theatre. Startup C briefly considers remote testing because the budget is tight. Tibor's feedback in a consulting conversation is direct: the hazards this device presents only exist in a real operating theatre environment with a real instrument tray, a real sterile field, a real surgical team, and real patient conditions. No remote protocol can reproduce these. The summative must be run in a cadaveric lab or an equivalent real-conditions setting. Startup C reprioritises the budget. The device eventually clears notified body review. If Startup C had run a remote summative and filed it anyway, the notified body would have reopened the usability engineering file and delayed CE marking by months.

The three startups illustrate the decision rule: the intended use environment and the hazard-related scenarios determine whether remote is defensible, not the founder's budget.

The Subtract to Ship playbook

The Subtract to Ship approach to remote usability testing has four components that a startup can apply during planning under EN 62366-1:2015+A1:2020.

Component one: write the environment reproduction test into the usability engineering plan. Before committing to any modality, answer the question: can the intended use environment be reproduced at the user's location well enough that the hazard-related use scenarios will surface? Document the answer. If the answer is yes, remote is defensible. If the answer is no, remote is not.
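
Written down, the test can be as small as a structured record plus the decision rule it implies. The sketch below is illustrative only: the field names, the Modality categories, and the helper are assumptions made for this article, not terms from EN 62366-1:2015+A1:2020.

    from dataclasses import dataclass
    from enum import Enum

    class Modality(Enum):
        REMOTE_MODERATED = "moderated remote"
        IN_PERSON = "in person or simulated-use lab"

    @dataclass
    class EnvironmentReproductionTest:
        """Hypothetical record of the environment reproduction question
        as answered in the usability engineering plan."""
        intended_use_environment: str
        hazard_related_scenarios: list[str]
        scenarios_needing_clinical_infrastructure: list[str]
        reproducible_at_user_location: bool
        rationale: str

        def defensible_modality(self) -> Modality:
            # Decision rule from this playbook: remote is defensible only when
            # the environment is reproducible at the user's location and no
            # hazard-related scenario depends on clinical infrastructure.
            if self.reproducible_at_user_location and not self.scenarios_needing_clinical_infrastructure:
                return Modality.REMOTE_MODERATED
            return Modality.IN_PERSON

    # Startup A from the worked example (values illustrative)
    startup_a = EnvironmentReproductionTest(
        intended_use_environment="patient's own phone, at home",
        hazard_related_scenarios=["onboarding", "symptom entry", "alert handling"],
        scenarios_needing_clinical_infrastructure=[],
        reproducible_at_user_location=True,
        rationale="The intended use environment is the user's own phone and home.",
    )
    assert startup_a.defensible_modality() is Modality.REMOTE_MODERATED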

Component two: default to moderated protocols, not unmoderated ones. Unmoderated remote testing, where the user completes tasks on their own and submits a recording or a log, is rarely sufficient for summative evaluation. The moderator provides structured probing, captures observations in real time, and can follow up on ambiguous moments. Unmoderated testing can be useful for exploratory formative phases, but it does not substitute for a summative evaluation.

Component three: plan for the remote-specific failure modes. Remote sessions can fail in ways in-person sessions cannot: network drops, camera angle problems, lighting problems, device shipping problems, the user being interrupted mid-session by a real-world event. The protocol must include fallback steps, a minimum data quality threshold, and a rule for when a session is discarded and rerun.
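
The data quality threshold and the rerun rule are easier to defend when they are concrete, checkable criteria rather than intent. A minimal sketch, assuming a per-session checklist completed by the moderator; the specific criteria are assumptions to be adapted to the actual protocol.

    from dataclasses import dataclass

    @dataclass
    class RemoteSessionRecord:
        """Hypothetical post-session checklist for one remote participant."""
        video_of_user_captured: bool        # face and hands visible during hazard scenarios
        screen_or_device_visible: bool      # application screen or physical device in frame
        audio_intelligible: bool            # moderator prompts and user responses audible
        all_hazard_scenarios_completed: bool
        no_invalidating_interruption: bool  # interruptions did not invalidate a scenario

    def session_counts_toward_summative(s: RemoteSessionRecord) -> bool:
        """Rerun rule: the session counts only if every criterion is met;
        otherwise it is discarded and rerun with a replacement slot."""
        return all([
            s.video_of_user_captured,
            s.screen_or_device_visible,
            s.audio_intelligible,
            s.all_hazard_scenarios_completed,
            s.no_invalidating_interruption,
        ])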

Component four: verify the demographic match. Tibor's audit finding is that remote testing can tempt a startup to recruit users from whoever signs up online. Online recruitment skews toward younger, more technically literate users. For a device intended for a 70-year-old lay user population, the recruitment must include real representatives of the target demographic, which often means partnering with community organisations, patient groups, or recruitment services that can reach non-digital populations.
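
One way to guard against that skew is to define recruitment quotas per demographic cell up front and check the recruited cohort against them before the summative begins. A minimal sketch; the quota dimensions, bands, and numbers are made-up placeholders, not recommendations.

    from collections import Counter

    def recruitment_shortfalls(participants: list[dict], quotas: dict[tuple, int]) -> dict:
        """Return the demographic cells that are still short of their minimum quota."""
        counts = Counter((p["age_band"], p["tech_literacy"]) for p in participants)
        return {cell: minimum - counts.get(cell, 0)
                for cell, minimum in quotas.items()
                if counts.get(cell, 0) < minimum}

    # Hypothetical quotas for a lay-user home device with an older target population
    quotas = {("65+", "low"): 3, ("65+", "medium"): 2, ("35-64", "low"): 2}
    recruited = [
        {"age_band": "65+", "tech_literacy": "low"},
        {"age_band": "35-64", "tech_literacy": "high"},
    ]
    print(recruitment_shortfalls(recruited, quotas))
    # {('65+', 'low'): 2, ('65+', 'medium'): 2, ('35-64', 'low'): 2}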

Felix adds a fifth component from the business perspective: the cost of a moderated remote protocol is often comparable to an in-person protocol once the moderator time, the recruitment, the equipment shipping, and the data analysis are added up. The savings come from the travel and facility costs, not from cutting scope. A founder who expects remote to be dramatically cheaper has not read the invoice yet.
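
A back-of-the-envelope comparison makes the point. Every figure below is a hypothetical placeholder, not a real quote; what matters is the structure: the shared line items dominate, and only travel and facility costs drop out.

    # Hypothetical costs for an eight-participant moderated summative (placeholders only).
    shared = {
        "moderator_time_and_analysis": 12_000,  # protocol runs, note-taking, report
        "recruitment": 4_000,                   # representative users, incl. non-digital outreach
    }
    remote_only = {
        "device_shipping_and_returns": 1_500,
        "remote_platform_and_recording": 500,
    }
    in_person_only = {
        "facility_hire": 3_000,
        "travel_and_accommodation": 4_000,
    }

    remote_total = sum(shared.values()) + sum(remote_only.values())        # 18_000
    in_person_total = sum(shared.values()) + sum(in_person_only.values())  # 23_000
    # A saving, not an order of magnitude.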

Reality Check

  1. Has the usability engineering plan documented an explicit justification for the chosen modality, referencing the environment reproduction question?
  2. Is the summative evaluation moderated, with a trained moderator present throughout each session, rather than unmoderated?
  3. For a remote protocol, has the recruitment process been verified to produce users who match the real demographic of the intended user population?
  4. Do the hazard-related use scenarios include any steps that depend on clinical infrastructure, multidisciplinary teams, or physical conditions that cannot be reproduced at the user's location?
  5. Does the protocol include a minimum data quality threshold and a rule for rerunning sessions that fail to meet it?
  6. Have formative evaluations been run in advance, remotely or in person, so that the summative is not the first time the team has observed a user handling the device?
  7. If the device requires a real clinical environment, has the team accepted that remote testing is not a defensible option and budgeted accordingly?

Frequently Asked Questions

Is remote usability testing accepted by notified bodies in the post-pandemic period? Yes, where the justification holds. Notified bodies review the justification in the usability engineering plan, not the modality itself. A remote file with a weak justification will be pushed back. An in-person file with a weak justification will also be pushed back.

Can unmoderated remote testing be used for summative evaluation? Very rarely. Unmoderated testing lacks the structured probing and real-time observation that EN 62366-1:2015+A1:2020 requires for the summative evaluation. It can have a role in early formative phases.

Does remote testing save money? Sometimes. The savings come from reduced travel and facility costs. The moderator time, recruitment cost, equipment shipping, and analysis cost do not disappear. For some devices the total cost is comparable to in-person testing.

What equipment does a remote session need? At minimum: a video call platform that allows recording, a way to observe the user's screen or physical environment, a structured protocol document, and a trained moderator. For physical devices, the device itself must be shipped in advance with clear instructions for safe handling and return.

Can a remote session capture aseptic technique or fine motor skills? Only partially. Webcam observation is inferior to in-person observation for fine motor skills and aseptic technique. Devices that depend on these are better evaluated in person.

How does remote testing interact with MDR Annex I §22 devices for lay users? For lay-user devices, remote testing in the user's actual home can be more representative than a lab. The Annex I §22 bar is not lowered, but the environmental realism of the remote modality can strengthen the file rather than weaken it.

Sources

  1. Regulation (EU) 2017/745 on medical devices, consolidated text. Annex I §5, §22.
  2. EN 62366-1:2015+A1:2020, Medical devices, Part 1, Application of usability engineering to medical devices.
  3. EN ISO 14971:2019+A11:2021, Medical devices, Application of risk management to medical devices.