---
title: "MDR Software System Testing: Validating the Complete System via IEC 62304"
description: System testing under EN 62304 Section 5.7 verifies the integrated software against its software requirements. Here is how to scope, execute, and document it.
authors: Tibor Zechmeister, Felix Lenhard
category: Software as a Medical Device
primary_keyword: software system testing IEC 62304 MDR
canonical_url: https://zechmeister-solutions.com/en/blog/software-system-testing-iec-62304
source: zechmeister-solutions.com
license: All rights reserved. Content may be cited with attribution and a link to the canonical URL.
---

# MDR Software System Testing: Validating the Complete System via IEC 62304

*By Tibor Zechmeister (EU MDR Expert, Notified Body Lead Auditor) and Felix Lenhard.*

> **Software system testing under EN 62304:2006+A1:2015 Section 5.7 is the activity where a manufacturer verifies the complete, integrated software against its software requirements. It is required for all software safety classes, though the depth of rigour scales with the class. The test set must cover every software requirement, the evidence must be captured in a reproducible way, anomalies must be evaluated and either resolved or justified under the risk-management process, and the full record — plan, cases, results, anomalies, regression runs — lands in the technical documentation under MDR Annex II. The MDR is the North Star. EN 62304:2006+A1:2015 is the tool that operationalises the system-testing obligation under MDR Annex I Section 17.2.**

**Last updated 10 April 2026.**

---

## TL;DR

- EN 62304:2006+A1:2015 Section 5.7 defines software system testing as the activity of verifying the integrated software against its software requirements specification.
- System testing is required for all software safety classes (A, B, and C) under EN 62304:2006+A1:2015. What scales with the class is the rigour of planning, documentation, and regression coverage — not whether the activity happens at all.
- Every software requirement must be covered by at least one system test case, and the traceability from requirement to test case to test result must close without gaps.
- Anomalies found during system testing enter the software problem-resolution process and are evaluated against the EN ISO 14971:2019+A11:2021 risk-management file before release.
- Regression testing after changes is part of system testing, not a separate activity. The regression strategy is defined in the test plan and documented per release.
- The record that lands in the MDR Annex II technical documentation is the test plan, the test cases, the test results with evidence, the anomaly log, and the final release statement tying it all together.

---

## Why system testing is the activity Notified Bodies look at first

Software system testing is the place where a Notified Body can, in one sitting, see whether the whole lifecycle you documented actually holds together. The test plan tells them what you intended to verify. The traceability tells them whether your requirements are real. The results tell them whether the software does what you claim. The anomaly log tells them what you found and what you did about it. Every other lifecycle artefact — planning, requirements, architecture, design, unit testing, integration testing — funnels into this one activity.

Every startup we see that passes system testing passes it the same way, and every startup that fails fails it the same way. The ones who pass treated system testing as a first-class engineering activity with evidence captured from CI, traceability maintained continuously, and anomalies processed into the risk file as they appeared. The ones who fail wrote a test report two weeks before the Notified Body visit and tried to reconstruct coverage from memory. The first kind walks out of the audit with findings on the margins. The second kind walks out with a corrective action that pushes certification by a quarter.

System testing is also the activity where the subtract-to-ship discipline pays the highest dividend. A lean requirements set produces a lean test set. A bloated requirements set — the kind a team writes when it treats requirements as documentation rather than engineering — produces a bloated test set that takes months to run and months to re-run after every change. The system-testing cost is a direct function of the requirements scope, and that is a design decision, not a testing problem.

## EN 62304:2006+A1:2015 Section 5.7 — what the standard requires

Section 5.7 of EN 62304:2006+A1:2015 is the software system testing clause. Read literally, it requires the manufacturer to establish and carry out a set of tests, expressed as input stimuli, expected outcomes, and pass/fail criteria, to verify that the integrated software performs as specified by the software requirements. The standard also requires that the test results be evaluated, that anomalies be entered into the problem-resolution process, that the tests be repeatable, and that regression testing be performed when changes are made to the software. The clause is mandatory for all software safety classes under the standard.

What scales with the software safety class is the documentation depth and the rigour of the evaluation. For Class A items, the activity still happens — the standard does not exempt Class A from system testing — but the required evidence is lighter. For Class B and Class C items, the standard adds explicit requirements around the independence of the testing, the documentation of the test procedures, and the regression strategy after changes. The higher the class, the less you can leave implicit.

The system-testing activity sits inside the MDR chain through Annex I Section 17.2, which requires software to be developed and manufactured in accordance with the state of the art taking into account the principles of development life cycle, risk management, verification, and validation. (Regulation (EU) 2017/745, Annex I, Section 17.2.) System testing is the verification step that closes the loop between the requirements and the finished software. The record of system testing is part of the technical documentation described in MDR Annex II.

## Scoping and planning the system tests

System-test planning happens inside the software development plan and the software verification plan — not in a separate document written the week before testing starts. The plan covers the scope (which requirements are in scope for this release), the test environment (hardware, operating system, data sets, dependencies), the pass/fail criteria, the anomaly-handling procedure, the regression strategy, and the evidence capture mechanism.

The test environment deserves its own attention. The Notified Body will want to see that the environment in which the system tests run is representative of the intended clinical environment, or that any differences are understood and justified. For a desktop SaMD this can be as simple as a documented operating-system matrix and a set of realistic test data. For a cloud-hosted SaMD it means a staging environment that mirrors production. For a device with embedded software, it means either the target hardware or a validated simulator. The rule is that the environment is documented, controlled, and reproducible — not that it is expensive.

The evidence capture mechanism is the part startups most often underinvest in. The regulatory artefact of a system test is not a screenshot pasted into a Word document. It is a reproducible record that ties an input, an execution, an output, and a pass/fail judgement together with a timestamp and a software version identifier. CI test runners produce this record natively. A hand-run test with a screenshot does not, and you will end up reconstructing half the evidence the week before the audit. Build the capture mechanism in sprint one.
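What such a reproducible record could look like in practice: the sketch below bundles one test execution into a JSON-serialisable record with a version identifier, a UTC timestamp, and a content hash. The helper name, field names, and hashing choice are illustrative assumptions, not prescribed by the standard — the point is that every element named above (input, execution, output, judgement, timestamp, version) is captured together, automatically.

```python
import hashlib
import json
from datetime import datetime, timezone


def make_evidence_record(test_id: str, software_version: str,
                         inputs: dict, outputs: dict, passed: bool) -> dict:
    """Bundle one system-test execution into a reproducible evidence record."""
    record = {
        "test_id": test_id,                    # ties back to the test case
        "software_version": software_version,  # e.g. a git tag or build number
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "inputs": inputs,
        "outputs": outputs,
        "result": "pass" if passed else "fail",
    }
    # A content hash makes accidental edits or tampering detectable later.
    payload = json.dumps(record, sort_keys=True)
    record["checksum"] = hashlib.sha256(payload.encode()).hexdigest()
    return record
```

A CI job can emit one such record per test (one JSON line each) and archive the file as a build artifact — that archive, keyed by software version, is the evidence trail a reviewer can re-run against.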

## Traceability from requirements to test cases

Traceability is the mechanism by which the Notified Body checks that the system-test set actually covers the software requirements. The rule is simple: every software requirement must be covered by at least one system test case, and every system test case must trace back to at least one requirement. No orphan requirements. No orphan test cases.

In practice this means the traceability matrix is a live artefact maintained alongside the requirements and the test cases, not a document generated at the end. A missing link in either direction is a finding. A requirement with no test is an uncovered requirement — you are claiming something the software does without verifying it. A test with no requirement is either testing behaviour you have not specified (which means your requirements are incomplete) or testing behaviour that is not in scope (which means the test set is bloated).

Traceability is the single biggest source of rework at audit time for startups that did not maintain it continuously. The work of reconstructing a traceability matrix across hundreds of requirements and test cases is larger than the work of maintaining it in the first place, and the reconstructed matrix is always less trusted by the auditor. The engineering-team-wide practice that makes traceability cheap is to store requirements as structured text in the same repository as the test cases, with identifiers that both sides reference. The matrix then becomes a queryable output of the repository, not a hand-maintained spreadsheet.
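The no-orphans rule is mechanically checkable once identifiers live with the artefacts. A minimal sketch, assuming a hypothetical `REQ-nnn` identifier scheme and in-memory text (a real version would read the repository files):

```python
import re

# Hypothetical requirement-identifier scheme; adapt to your own convention.
REQ_ID = re.compile(r"REQ-\d+")


def find_orphans(requirements: dict[str, str], test_cases: dict[str, str]):
    """Check traceability in both directions.

    requirements: {"REQ-001": "requirement text", ...}
    test_cases:   {"TC-001": "Verifies REQ-001 by ...", ...}
    Returns (uncovered requirement IDs, untraced test case IDs).
    """
    covered: set[str] = set()
    untraced = []
    for tc_id, tc_text in test_cases.items():
        refs = set(REQ_ID.findall(tc_text)) & requirements.keys()
        if refs:
            covered |= refs
        else:
            untraced.append(tc_id)  # test with no requirement behind it
    uncovered = sorted(requirements.keys() - covered)  # claims with no test
    return uncovered, sorted(untraced)
```

Run as a CI gate, a non-empty result in either direction fails the build — which is exactly the point at which a missing link is cheapest to fix.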

## Designing the test cases

A system test case is defined by four things: the inputs, the initial state, the expected outputs, and the pass/fail criteria. EN 62304:2006+A1:2015 Section 5.7 requires all four to be present; a test case that leaves any of them implicit is not a test case the auditor will accept.
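One way to keep those elements from going implicit is to make them mandatory in the test-case structure itself. A minimal sketch — the class and field names are illustrative, not taken from the standard:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SystemTestCase:
    """A system test case carrying the elements Section 5.7 expects explicit."""
    test_id: str
    requirement_ids: list[str]  # traceability back to the requirements
    inputs: dict                # the stimuli applied to the software
    initial_state: str          # preconditions / starting configuration
    expected_outputs: dict      # what the software must produce
    pass_criteria: str          # the explicit pass/fail judgement rule

    def __post_init__(self):
        # Refuse to construct a test case that leaves any element implicit.
        for name in ("requirement_ids", "inputs", "initial_state",
                     "expected_outputs", "pass_criteria"):
            if not getattr(self, name):
                raise ValueError(f"test case {self.test_id}: missing {name}")
```

Whether the structure lives in code, YAML, or a requirements tool matters less than that a test case with a missing element cannot silently exist.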

The design of the test set itself is driven by the requirements. Functional requirements produce functional test cases. Performance requirements produce performance test cases with numeric thresholds. Interface requirements produce interface-boundary test cases. Safety requirements derived from the risk file produce safety-specific test cases that explicitly reference the hazard they guard against. The test set should cover normal operation, boundary conditions, error handling, and the safety-critical paths surfaced by the risk analysis.

What the test set does not need to be is exhaustive at the unit level. System testing is not unit testing. Unit-level coverage belongs to the activities under Section 5.5 of the standard. System testing is the level at which the integrated software is exercised against its requirements as a whole — user-visible behaviour, end-to-end flows, interface contracts, real data. Teams that conflate the levels end up either writing unit tests and calling them system tests, or writing system tests that drill into unit behaviour while missing the end-to-end coverage. Keep the levels separate and let each one do its job.

For safety-classified items, the test cases that cover safety-related requirements should be marked as such in the test set, and the link to the specific hazard or risk control from the EN ISO 14971:2019+A11:2021 risk file should be explicit. The Notified Body will look for this connection, and a safety test that does not reference the risk it is guarding against is a test the auditor cannot weight correctly.

## Anomaly handling during and after system testing

When a system test fails, the failure becomes an anomaly under the software problem-resolution process (Section 9 of EN 62304:2006+A1:2015). The anomaly is logged, investigated, and one of three things happens: it is fixed and the test is re-run, it is accepted with a documented justification that is evaluated against the risk file, or it is deferred to a future release with a documented rationale. All three outcomes are legitimate if they are documented and justified. What is not legitimate is an anomaly that disappears from the log without a resolution.
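The three-outcomes rule can be enforced in the anomaly log itself, so an anomaly literally cannot be closed without one of the documented resolutions. A minimal sketch — names and the rationale rule are our illustration of the principle, not wording from the standard:

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Resolution(Enum):
    FIXED = "fixed and re-tested"
    ACCEPTED = "accepted with documented justification"
    DEFERRED = "deferred to a future release with documented rationale"


@dataclass
class Anomaly:
    anomaly_id: str
    description: str
    resolution: Optional[Resolution] = None  # open until explicitly closed
    rationale: str = ""

    def close(self, resolution: Resolution, rationale: str = "") -> None:
        # An anomaly never just disappears: the non-fix outcomes require
        # a written rationale before the log will accept the closure.
        if resolution is not Resolution.FIXED and not rationale.strip():
            raise ValueError(
                f"{self.anomaly_id}: {resolution.name} requires a rationale")
        self.resolution = resolution
        self.rationale = rationale
```

The risk-file evaluation described below still has to happen as an activity; what the structure buys you is that the log can never show an anomaly that vanished without a recorded outcome.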

The evaluation against the risk file is the part that startups most often underweight. An anomaly in a safety-critical path is not the same as an anomaly in a cosmetic feature, and the software problem-resolution process has to treat them differently. The evaluation asks: does this anomaly change the risk profile of the software? Does it introduce a new hazard, or increase the probability of an existing one? If yes, the risk file is updated and the risk controls are re-evaluated before release. If no, the rationale is documented and the release proceeds. This is the integration point where EN 62304:2006+A1:2015 and EN ISO 14971:2019+A11:2021 meet in practice, and the Notified Body will look for the evidence that the integration actually happened.

## Regression testing — the part that eats release cycles if you let it

Every change to the software — bug fix, feature addition, dependency update, configuration tweak — raises the question of which previously-passing tests might now fail. EN 62304:2006+A1:2015 Section 5.7 requires regression testing after changes, and the regression strategy is defined in the test plan.

The lean regression strategy is the one most startups eventually converge on after paying for the bloated version at least once. It has three elements. First, all system tests are automated so a full regression run is the default, not an exception. Second, the tests are fast enough that a full run fits inside a nightly or per-commit CI job — if a full run takes a day, the team will stop running it, and the regression coverage will silently erode. Third, the tests are deterministic enough that a failure is a signal, not noise — flaky tests destroy the credibility of the whole suite and lead to the team ignoring real failures. A regression suite that runs on every change, passes reliably, and captures its evidence automatically is the cheapest insurance the lifecycle offers.

For changes that are too large to regression-test with the existing suite — a major architectural change, a new class of input — the test plan should specify an impact analysis step that identifies which new tests are needed and which existing tests need to be updated. The impact analysis is documented and the updated test set is re-run. This is the part of the standard that prevents "we changed everything but the regression suite still passes" being a valid release argument.

## Documenting the system testing for the technical file

The record that lands in the MDR Annex II technical documentation has five components. The test plan covers the scope, environment, criteria, and regression strategy. The test cases define the inputs, states, expected outputs, and pass/fail criteria, with traceability to the requirements. The test results capture what actually ran, when, on which software version, with what evidence. The anomaly log captures every failure and its resolution. The release statement ties the four together and confirms that the software version being released has passed its system tests or has documented, justified acceptances for any that did not.
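The release statement's core check — every test passed, or every failure carries a documented acceptance or deferral — is small enough to automate. A sketch under assumed data shapes (the function name and dictionary layout are ours, not from the standard):

```python
def release_blockers(test_results: dict[str, str],
                     anomaly_log: dict[str, dict]) -> list[str]:
    """Return the test IDs that block release of this software version.

    test_results: {"TC-001": "pass" or "fail", ...}
    anomaly_log:  {"TC-002": {"outcome": "accepted", "rationale": "..."}, ...}
    A failed test blocks release unless the anomaly log carries a documented
    acceptance or deferral for it, with a non-empty rationale.
    """
    blockers = []
    for tc_id, outcome in test_results.items():
        if outcome == "pass":
            continue
        entry = anomaly_log.get(tc_id)
        justified = (entry is not None
                     and entry.get("outcome") in ("accepted", "deferred")
                     and entry.get("rationale", "").strip())
        if not justified:
            blockers.append(tc_id)
    return sorted(blockers)
```

Wired into the release pipeline, a non-empty result stops the release; an empty result is one input to the human-signed release statement, not a substitute for it.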

The principle for the file is: everything a reviewer needs to reconstruct the testing activity from scratch is in the file. Nothing that is not needed is in the file. A Notified Body reviewer who cannot find a specific test result has reason to doubt the rest. A reviewer who has to wade through irrelevant artefacts to find the relevant ones loses time and goodwill. Lean and complete is the target.

## Common mistakes startups make with system testing

- Treating system testing as a phase at the end of development rather than an activity that runs continuously from the first requirement. The result is a system-test crunch in the last weeks before release and a reconstructed evidence trail that does not convince auditors.
- Writing system tests that duplicate unit tests. System testing is not about drilling into units — it is about verifying the integrated software against its requirements. If your system tests look like unit tests, your unit tests are doing something else and your coverage has a gap.
- Skipping traceability until audit prep. Reconstructing traceability from memory is more expensive than maintaining it in-flight, and the reconstructed matrix is always less trusted.
- Capturing evidence by screenshot. Screenshots are not reproducible records. They are snapshots of one run. The auditor will ask you to re-run the test, and if re-running produces a different result you have a problem.
- Treating flaky tests as acceptable noise. A test suite that sometimes passes and sometimes fails for reasons unrelated to the code under test is a suite the team will stop trusting, and a suite the team does not trust is a suite the auditor cannot trust either.
- Assigning Class A to software items to avoid the system-testing obligation. Class A does not exempt you from system testing. It only reduces some of the surrounding documentation depth. If the risk analysis supports Class B, assigning Class A is a misrepresentation the auditor will find.

## The Subtract to Ship angle

System testing is a pure subtract-to-ship exercise. The default response to the requirement is to add test cases until the suite is large enough to look impressive, and then to spend months running a suite that was built to impress rather than to verify. The competent response is to build exactly the test set the requirements demand, no more and no fewer, and to run it automatically on every change.

The subtraction moves are these. One — drive the test set from the requirements, not from the code. A test that does not trace to a requirement either points at a missing requirement or at a test that should not exist. Two — automate the evidence capture from day one, so a test run produces its own regulatory artefact. Three — keep the test set fast enough to run on every change, so regression is free and the coverage does not erode. Four — link safety-critical tests explicitly to the risk file, so the Notified Body can see the integration without asking. Five — accept anomalies honestly, with documented rationale, rather than hiding them or burying them in a backlog. Six — keep the test plan short. A plan the team reads and follows is worth more than a plan that sits in a shared drive.

The principle traces back to the broader framework: every activity earns its place by linking to a specific clause of EN 62304:2006+A1:2015 and, through the standard, to MDR Annex I Section 17.2. Activities that do not link come out. For the framework applied to MDR as a whole, see post 065.

## Reality Check — Is your software system testing ready for a Notified Body review?

1. Do you have a written software verification plan that defines scope, environment, pass/fail criteria, anomaly handling, and regression strategy for system testing?
2. Does every software requirement trace to at least one system test case, and does every system test case trace back to at least one requirement, with no orphans in either direction?
3. Are your system test results captured automatically from CI, with inputs, outputs, software version, and timestamp, in a form a reviewer can re-run?
4. Are your safety-critical test cases explicitly linked to the hazards and risk controls in the EN ISO 14971:2019+A11:2021 risk file?
5. Is every anomaly found during system testing logged, investigated, and resolved with one of three documented outcomes — fixed, accepted with justification, or deferred with rationale?
6. Is your regression test suite automated, fast enough to run on every change, and deterministic enough that a failure is a real signal?
7. Does your release statement tie the test plan, test cases, test results, and anomaly log together for the specific software version being released?
8. Does the system-testing documentation reflect what the engineering team actually does, or does it describe a process that exists only on paper?

Any question you cannot answer with a clear yes is a gap between your current practice and what the Notified Body will expect to see. System testing is the activity where those gaps surface first, and the earlier you close them, the cheaper they are to close.

## Frequently Asked Questions

**Is software system testing required for Class A software under EN 62304:2006+A1:2015?**
Yes. Section 5.7 of EN 62304:2006+A1:2015 applies to all software safety classes. What scales with the class is the rigour of the documentation and the evaluation, not whether the activity happens. A Class A item still has to be tested against its requirements, with recorded evidence and traceability.

**What is the difference between software system testing and software integration testing?**
Integration testing (Section 5.6 of the standard) verifies that software items connect correctly and that their interfaces work as designed. System testing (Section 5.7) verifies that the fully integrated software meets its software requirements as a whole. Integration testing asks "do the parts fit together." System testing asks "does the finished software do what the requirements say it does." Both are required for Class B and Class C; system testing is required for Class A as well.

**Can I use automated CI test runs as the sole evidence of system testing?**
Yes, and this is the lean approach. The regulatory artefact is a reproducible record of the test run — inputs, outputs, software version, timestamp, pass/fail — and a CI log produces exactly that natively. The requirements are that the tests are documented, the evidence is captured, the traceability is maintained, and anomalies are processed through the problem-resolution process. All of that can be wired into CI.

**What happens if a system test fails but we ship anyway?**
You can ship with a failed test only if the failure has been evaluated against the risk file, the acceptance has been documented with a rationale, and the software problem-resolution process has recorded the deferral. This is legitimate when the failure is low-impact and the fix is scheduled. It is not legitimate when the failure touches a safety-critical path or when the rationale is not documented. The Notified Body will read every deferred anomaly and ask how it was evaluated.

**How much regression testing is enough after a change?**
Enough to catch failures the change could plausibly introduce. For a small, localised change, running the tests around the affected area may be sufficient if the impact analysis supports it. For larger changes, a full regression run is the default. The lean approach is to make the full regression cheap enough — fast, automated, deterministic — that running it on every change is the default, and impact analysis becomes the exception rather than the norm.

**Where does system testing evidence live in the MDR technical file?**
In the software documentation section of the technical file required by MDR Annex II. The expected contents are the test plan, the test cases with traceability to requirements, the test results with evidence, the anomaly log, and the release statement. (Regulation (EU) 2017/745, Annex II.) The form can be structured markdown, PDFs, or a combination, as long as a reviewer can navigate it and reproduce the testing activity from the record.

## Related reading

- [What Is Software as a Medical Device (SaMD)? The MDR Definition for Startups](/blog/what-is-software-as-medical-device-samd-mdr) — the category pillar this post sits under.
- [MDR Software Lifecycle Requirements: How IEC 62304 Helps You Demonstrate Conformity](/blog/mdr-software-lifecycle-iec-62304) — the lifecycle overview that frames where system testing fits.
- [MDR Software Requirements: Using IEC 62304 to Write Compliant Requirements](/blog/software-requirements-iec-62304) — the requirements activity that drives the system test set.
- [Software Architectural Design Under EN 62304](/blog/software-architectural-design-en-62304) — the architecture activity that precedes verification.
- [Software Unit Verification Under EN 62304](/blog/software-unit-verification-en-62304) — the unit-level verification that feeds into system testing.
- [Software Integration Testing Under EN 62304](/blog/software-integration-testing-en-62304) — the integration activity immediately before system testing.
- [Software Release Under EN 62304](/blog/software-release-en-62304) — the release activity that consumes the system-testing record.
- [Software Problem Resolution Under EN 62304](/blog/software-problem-resolution-en-62304) — the process that handles anomalies surfaced in system testing.
- [Software Maintenance Under EN 62304](/blog/software-maintenance-en-62304) — the post-release process that re-runs regression testing after changes.
- [The Subtract to Ship Framework for MDR Compliance](/blog/subtract-to-ship-framework-mdr) — the methodology pillar this post applies to system testing.

## Sources

1. Regulation (EU) 2017/745 of the European Parliament and of the Council of 5 April 2017 on medical devices, Annex I Section 17.1 and Section 17.2; Annex II (Technical Documentation). Official Journal L 117, 5.5.2017.
2. EN 62304:2006+A1:2015 — Medical device software — Software life-cycle processes (IEC 62304:2006 + IEC 62304:2006/A1:2015), Section 5.7 — Software system testing. Harmonised standard referenced for the software lifecycle under MDR Annex I Section 17.2.
3. EN ISO 14971:2019+A11:2021 — Medical devices — Application of risk management to medical devices. Harmonised standard referenced for risk management under MDR Annex I, integrated with EN 62304:2006+A1:2015 for software risk management and anomaly evaluation.
4. EN ISO 13485:2016+A11:2021 — Medical devices — Quality management systems — Requirements for regulatory purposes. Harmonised standard referenced for the QMS that wraps the software lifecycle.

---

*This post is a category-9 spoke in the Subtract to Ship: MDR blog, focused on software system testing as the verification activity that closes the loop between software requirements and the integrated software under EN 62304:2006+A1:2015 Section 5.7. Authored by Felix Lenhard and Tibor Zechmeister. The MDR is the North Star for every claim in this post — EN 62304:2006+A1:2015 is the harmonised tool that operationalises the system-testing obligation under MDR Annex I Section 17.2, not an independent authority. For startup-specific regulatory support on software verification planning, traceability, and audit-ready evidence capture, Zechmeister Strategic Solutions is where this work is done in practice.*

---

*This post is part of the [Software as a Medical Device](https://zechmeister-solutions.com/en/blog/category/samd) cluster in the [Subtract to Ship: MDR Blog](https://zechmeister-solutions.com/en/blog). For EU MDR certification consulting, see [zechmeister-solutions.com](https://zechmeister-solutions.com).*
