---
title: "MDR Software Test Documentation: Protocols and Reports via EN 62304"
description: How to write software test protocols and reports under EN 62304 clauses 5.5-5.7 that survive an MDR notified body review.
authors: Tibor Zechmeister, Felix Lenhard
category: Software as a Medical Device
primary_keyword: software test documentation IEC 62304 MDR
canonical_url: https://zechmeister-solutions.com/en/blog/software-test-documentation-iec-62304
source: zechmeister-solutions.com
license: All rights reserved. Content may be cited with attribution and a link to the canonical URL.
---

# MDR Software Test Documentation: Protocols and Reports via EN 62304

*By Tibor Zechmeister (EU MDR Expert, Notified Body Lead Auditor) and Felix Lenhard.*

> **Software test documentation under MDR means writing protocols and reports that satisfy EN 62304:2006+A1:2015 clauses 5.5 (unit), 5.6 (integration) and 5.7 (system) testing, with evidence traceable to requirements, design, and risk controls. The notified body reads the records, not the code — if the records do not stand alone, the tests effectively did not happen.**


## TL;DR
- MDR Annex I §17.2 requires software to be developed and manufactured in accordance with the state of the art; EN 62304:2006+A1:2015 provides the presumption of conformity for the software lifecycle.
- Clauses 5.5, 5.6, and 5.7 of EN 62304 require documented unit, integration, and system testing respectively, each with protocols, records of execution, and pass/fail interpretation.
- A test protocol must define purpose, setup, inputs, expected results, acceptance criteria, and traceability anchors before the test runs. A test report captures actual results, deviations, and a reasoned pass/fail verdict.
- Every test record must trace upward to a software requirement (clause 5.2) and, where applicable, to a risk control measure derived from EN ISO 14971:2019+A11:2021.
- Auditors do not accept "it passed CI" as evidence. They accept reviewable records with timestamps, versions, tester identity, and configuration under version control.
- Software safety class (A, B, or C) scales the depth of documentation, not whether documentation exists.

## Why software test documentation is where SaMD startups stall

We see the same pattern in almost every pre-audit review. The team has a green dashboard. The pipeline runs thousands of tests. Coverage looks healthy. Then a notified body auditor asks: "Show me the test protocol for software requirement SRS-0142, the actual execution record, who signed off on the interpretation, and the version of the software under test." Silence.

The code works. The evidence does not. Under MDR Annex II, the technical documentation must contain verification and validation results that demonstrate conformity with the general safety and performance requirements. A log file from yesterday's CI run, without traceability and without human interpretation, is not verification evidence in the sense the regulation expects. It is raw data. The work of turning raw data into evidence is what EN 62304 clauses 5.5, 5.6, and 5.7 describe.

This matters more for software than for many hardware disciplines because software verification can look complete from a developer seat and still be invisible from an auditor seat. Protocols and reports are the bridge.

## What MDR and EN 62304 actually say

MDR Annex I §17.2 is the anchor: software shall be developed and manufactured in accordance with the state of the art, taking into account the principles of the development life cycle, risk management (including information security), and verification and validation. The text is short. The state of the art it points to is EN 62304:2006+A1:2015, listed among the harmonised standards supporting presumption of conformity for the software lifecycle.

EN 62304 splits verification into three layers. Clause 5.5 (software unit implementation and verification) requires that each software unit is verified against its acceptance criteria, with records kept. Clause 5.6 (software integration and integration testing) requires integration of units according to the integration plan and documentation of integration test results. Clause 5.7 (software system testing) requires testing the software system against the software requirements, with documented results and regression testing for changes.

For all three, the standard requires that:
- tests are planned (a protocol exists before execution);
- acceptance criteria are defined in advance;
- results are recorded;
- deviations and anomalies are evaluated;
- a pass/fail conclusion is documented;
- traceability is maintained between requirements, tests, and risk controls.

The software safety class (A, B, or C) per clause 4.3 modulates rigour. Class A has the lightest documentation floor. Class C is the heaviest. But the structural elements — protocol, execution record, interpretation, traceability — are present at every class.

MDR Annex II then requires all of this to sit in the technical documentation so that a notified body can review it during conformity assessment.

## A worked example: a Class B SaMD triage tool

A startup builds a Class IIa SaMD per Annex VIII Rule 11 that triages referral letters and suggests urgency. Software safety class B under EN 62304. Requirement SRS-0142 reads: *"The system shall flag any referral letter containing the term 'chest pain' with urgency level HIGH within 2 seconds."*

The unit test protocol for the keyword matcher function (clause 5.5) looks like this:

- **Protocol ID:** UT-MATCHER-017, v1.2
- **Traces to:** SRS-0142, RISK-CTRL-044 (mitigation for hazard "missed time-critical symptom")
- **Purpose:** Verify the keyword matcher returns HIGH for any input containing "chest pain" (case-insensitive, with common punctuation).
- **Environment:** Build SHA a9f3b12, Python 3.11.6, test harness v0.8.0
- **Inputs:** 14 predefined strings (positive and negative cases, listed in full in appendix A)
- **Expected results:** For each string, the expected urgency label
- **Acceptance criteria:** 100% match with expected labels; no exceptions raised
- **Tester:** (to be completed at execution)
- **Date:** (to be completed at execution)

After execution, the test report records: the actual outputs for each input, the pass/fail verdict per case, any deviations observed, the interpretation ("All 14 cases returned expected urgency labels; protocol UT-MATCHER-017 passes"), the tester name and date, and the build SHA. The protocol, the report, and the code under test are all under version control and referenced from the traceability matrix.
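The executable portion of such a protocol can be sketched in a few lines. Everything below is illustrative: `classify_urgency` is a hypothetical stand-in for the keyword matcher under test, and the cases shown are a small invented subset, not the 14 strings from appendix A.

```python
# Illustrative sketch of the executable portion of a protocol like UT-MATCHER-017.
# classify_urgency is a hypothetical stand-in for the matcher under test.

def classify_urgency(text: str) -> str:
    """Stand-in matcher: returns HIGH for any input containing 'chest pain',
    case-insensitively and tolerant of common punctuation."""
    normalized = "".join(c.lower() if c.isalnum() or c.isspace() else " " for c in text)
    return "HIGH" if "chest pain" in " ".join(normalized.split()) else "ROUTINE"

# Invented input/expected pairs standing in for the predefined cases of appendix A.
CASES = [
    ("Patient reports chest pain on exertion.", "HIGH"),
    ("CHEST PAIN, radiating to left arm", "HIGH"),
    ("Chest-pain since yesterday", "HIGH"),
    ("No cardiac symptoms reported.", "ROUTINE"),
]

def run_protocol() -> list[dict]:
    """Execute each case and return per-case verdicts for the test report."""
    results = []
    for text, expected in CASES:
        actual = classify_urgency(text)
        results.append({"input": text, "expected": expected, "actual": actual,
                        "verdict": "PASS" if actual == expected else "FAIL"})
    return results
```

The point is not the code; it is that the inputs, expected results, and acceptance criterion (100% PASS) existed in the protocol before this function ever ran, and the per-case verdicts flow into the report rather than disappearing into a log.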

Integration testing (clause 5.6) for the same requirement might verify that when the intake parser hands a message to the matcher and the matcher returns HIGH, the routing module assigns the correct queue — same structure: protocol, inputs, expected, actual, interpretation, traceability.

System testing (clause 5.7) verifies SRS-0142 end-to-end in the production-like environment: a referral letter goes in, the user sees a HIGH urgency flag within 2 seconds. This is the test that most directly maps back to the software requirement the user will read in the clinical evaluation.

Three layers, three protocols, three reports, all traced to the same requirement and to the same risk control. Any auditor can now follow the chain.

## The Subtract to Ship playbook for test documentation

The mistake startups make is writing beautiful test strategies and never writing the actual protocols. The second mistake is writing protocols after execution — which turns them into fiction, not plans.

Here is the lean path. Every item traces to an EN 62304 clause.

**1. One protocol template. One report template. Nothing more.**
Build one protocol template with: ID, version, traces-to (requirements, risk controls, design), purpose, environment, prerequisites, inputs, expected results, acceptance criteria, tester, date. Build one report template: the same header plus actual results, deviations, interpretation, conclusion, signatures. Use them for unit, integration, and system testing. Stop designing templates. Start writing tests.
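The two templates can be captured as a pair of structures. This is a minimal sketch, not a prescribed format; the field names simply mirror the template items above, and EN 62304 mandates the content, not the shape.

```python
from dataclasses import dataclass

# Sketch of the single protocol/report template pair described above.
# Field names are illustrative; EN 62304 prescribes the content, not this shape.

@dataclass
class TestProtocol:
    protocol_id: str
    version: str
    traces_to: list[str]        # requirement, risk control, and design IDs
    purpose: str
    environment: str            # build SHA, runtime, harness version
    prerequisites: str
    inputs: list[str]
    expected_results: list[str]
    acceptance_criteria: str
    # tester and date stay empty until execution

@dataclass
class TestReport:
    protocol: TestProtocol      # same header, carried over unchanged
    actual_results: list[str]
    deviations: list[str]
    interpretation: str         # the human-written verdict
    tester: str
    date: str
```

Note the asymmetry: the report embeds the protocol rather than copying selected fields, so the record an auditor reads always carries its own plan.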

**2. Write the protocol before the code, or at minimum before the execution.**
This is non-negotiable under EN 62304 clauses 5.5-5.7. A protocol written after execution is not a protocol; it is a transcription. Auditors detect this instantly by checking whether the protocol version predates the execution timestamp.

**3. Automate execution. Do not automate interpretation.**
CI pipelines are excellent for running unit tests thousands of times. They are poor at producing a human-reviewed, signed verdict. The fix: the CI run produces a machine-readable result artifact; a human reviews the artifact against the protocol and writes the interpretation into the report. Yes, for every release candidate. That is what "verification" means under EN 62304.
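The handover can be made structural. The sketch below assumes a hypothetical JSON artifact format emitted by CI; the helper refuses to produce a finished report until a human verdict and signer are supplied, which is exactly the gap a bare pipeline log leaves open.

```python
import json

# Sketch of the handover from automated execution to human interpretation.
# The artifact format is an assumption for illustration, not a standard.

def draft_report(artifact_json: str, interpretation: str = "", tester: str = "") -> dict:
    """Turn a machine-readable CI result artifact into a report record,
    but only once a human interpretation and tester identity exist."""
    artifact = json.loads(artifact_json)
    if not interpretation or not tester:
        raise ValueError("Report incomplete: human interpretation and tester signature required")
    return {
        "protocol_id": artifact["protocol_id"],
        "build": artifact["build"],          # software version under test
        "results": artifact["results"],      # per-case machine verdicts
        "interpretation": interpretation,    # the human-written conclusion
        "tester": tester,
    }
```

The CI output remains an input to the record; the record itself only exists once a person has read the artifact against the protocol and committed to a verdict.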

**4. Traceability anchors live in the protocol header, not in a separate spreadsheet.**
When a requirement changes, you search the protocols for its ID and know exactly which tests to re-run. When a risk control changes, same. A standalone traceability matrix that is updated weekly is a liability. An anchor inside the protocol is self-healing.
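With the anchor in the header, the impact search is trivial to automate. The sketch below assumes plain-text protocols containing a "Traces to:" line, as in the worked example; the file layout is illustrative.

```python
# Sketch of the "anchor in the header" search: given a changed requirement or
# risk control ID, find every protocol whose traces-to header mentions it.
# The protocol text layout is an assumption for illustration.

def impacted_protocols(protocol_texts: dict[str, str], changed_id: str) -> list[str]:
    """Return IDs of protocols whose 'Traces to:' header references changed_id."""
    hits = []
    for protocol_id, text in protocol_texts.items():
        for line in text.splitlines():
            if line.strip().lower().startswith("traces to:") and changed_id in line:
                hits.append(protocol_id)
                break
    return sorted(hits)
```

Because the anchor travels with the protocol, the search result is current by construction; there is no separate matrix to fall out of date.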

**5. Keep deviations visible.**
Tests fail. Some failures are genuine defects; some are protocol errors; some are environment issues. Record the deviation, the root cause, and the resolution in the report. Do not rewrite the protocol silently to make the test pass. That is the fastest way to lose an audit.

**6. Regression testing for every change.**
EN 62304 clause 5.7.3 is explicit: regression testing is required for software system testing when changes are made. The practical form: when a software requirement or a risk control changes, identify the impacted protocols through traceability and re-execute. Document the regression decision — including a written justification when you choose not to re-run.

**7. Scale depth to software safety class, not to ambition.**
Class A SaMD does not need the same depth as Class C. Read clauses 5.5-5.7 and the class tables carefully; do what the class requires and no more. Subtraction is not laziness — it is reading the standard and stopping at its boundary.

## Reality Check

1. Can you show an auditor a unit test protocol that was version-controlled before its first execution? If not, you have a timestamp problem.
2. For a randomly chosen software requirement, can you trace forward to at least one unit test, one integration test (where applicable), and one system test — and backward from each test to the requirement?
3. Does every test report contain a human interpretation and a human signature, or only machine output?
4. When a test fails, is there a recorded deviation with root cause and resolution, or is the protocol silently rewritten?
5. Is the software version under test recorded in every test report (SHA, build number, or equivalent)?
6. Do your protocols for software safety class B or C items contain acceptance criteria defined in advance, not derived after the fact?
7. When a software requirement changes, can you identify every affected test protocol within an hour?
8. Could a new regulatory engineer, with access only to your documentation, reconstruct what was tested, how, and whether it passed?

If more than two of these return "no," your software test documentation is not ready for notified body review yet.

## Frequently Asked Questions

**Do I need separate protocols for unit, integration, and system tests?**
Yes. EN 62304 treats them as distinct activities in clauses 5.5, 5.6, and 5.7, with different inputs, different environments, and different traceability targets. One protocol cannot serve all three layers without losing the specificity each requires.

**Can I use CI test output as my unit test records?**
The raw output can be an input to the record, but it is not the record itself. The record is the protocol, the actual results attributed to a specific build, the interpretation, and the signature. Pipeline logs alone do not meet clause 5.5 because they lack the protocol context and the human verdict.

**What if my software is safety class A? Do I still need all this?**
At a lighter depth, yes. EN 62304 exempts Class A software from most of the clause 5.5-5.7 verification activities, but the MDR Annex II technical documentation still needs verification evidence, so you still need records showing what was tested and that it passed. The difference is rigour and scope, not the existence of records.

**How does risk management connect to test documentation?**
Every risk control implemented in software should trace to a test that verifies it. EN ISO 14971:2019+A11:2021 requires verification of risk control measures; in software, that verification lives in your EN 62304 test records. The traceability header on your protocol is where the connection is made visible.

**How many test protocols is "enough"?**
Enough to cover every software requirement and every software risk control at the depth required by your software safety class. Counting protocols is the wrong metric. Counting unverified requirements and unverified risk controls is the right one.

**Do I need to rerun all tests for every release?**
System testing per EN 62304 clause 5.7 requires regression testing for changes. The scope is driven by impact analysis, not by a blanket rule. What matters is that the regression decision is documented and defensible.

## Related reading
- [Software verification: unit testing under EN 62304](/blog/software-verification-unit-testing-iec-62304) — the clause 5.5 activity in detail.
- [Software integration testing under EN 62304](/blog/software-integration-testing-iec-62304) — clause 5.6, where units become subsystems.
- [Software system testing under EN 62304](/blog/software-system-testing-iec-62304) — clause 5.7, the layer the notified body examines most closely.
- [Software traceability: design, tests, risks](/blog/software-traceability-requirements-design-tests-risks) — how the anchors in your protocols connect the whole lifecycle.
- [Software documentation in the technical file](/blog/software-documentation-technical-file) — where test records live under MDR Annex II.

## Sources
1. Regulation (EU) 2017/745 on medical devices, consolidated text. Annex I §17.2, Annex II.
2. EN 62304:2006+A1:2015, Medical device software — Software life cycle processes. Clauses 5.5, 5.6, 5.7.
3. EN ISO 14971:2019+A11:2021, Medical devices — Application of risk management to medical devices.

---

*This post is part of the [Software as a Medical Device](https://zechmeister-solutions.com/en/blog/category/samd) cluster in the [Subtract to Ship: MDR Blog](https://zechmeister-solutions.com/en/blog). For EU MDR certification consulting, see [zechmeister-solutions.com](https://zechmeister-solutions.com).*
