Software integration and integration testing under EN 62304:2006+A1:2015 Section 5.6 is the activity where verified software units are combined into software items and the integrated result is tested against the integration plan and the software architecture. For Class B and Class C software, the manufacturer must plan the integration, integrate the units according to that plan, test the integration, evaluate the results against documented criteria, and handle anomalies through the problem-resolution process. Regression is part of the activity, not an add-on: whenever integrated software changes, the integration tests re-run and the evidence is re-captured. The MDR is the North Star. EN 62304:2006+A1:2015 is the tool that operationalises the integration-level verification obligation under MDR Annex I Section 17.2.
By Tibor Zechmeister and Felix Lenhard. Last updated 10 April 2026.
TL;DR
- EN 62304:2006+A1:2015 Section 5.6 defines software integration and integration testing as the activity where verified software units are combined and the integrated result is tested against the integration plan and the architecture.
- The activity is required for Class B and Class C software under EN 62304:2006+A1:2015. Class A is not subject to the full integration-testing obligation at this clause, though Class A software is still verified at the system level under Section 5.7.
- Integration testing is not the same as system testing. Integration testing verifies that software items interact correctly with each other and with the underlying platform. System testing verifies the integrated software against the software requirements.
- The integration plan is written before the integration runs. It defines the sequence in which units and items are integrated, the test environment, the evaluation criteria, and the regression approach when changes occur.
- Anomalies found during integration testing are handled through the problem-resolution process under Section 9 — logged, evaluated for safety impact against the risk file, resolved or justified, and re-tested.
- Evidence is captured reproducibly from CI where possible: logs with version identifiers, timestamps, test inputs and outputs, pass/fail judgements, and regression results after each change.
- Traceability runs from the software architecture to the integration plan to the integration test cases to the results. Gaps in this chain are gaps in the file.
Why integration testing is where the architecture meets reality
Unit tests prove that each unit behaves as its detailed design says it should. System tests prove that the whole software behaves as its requirements say it should. Integration testing lives in between — and it is where most of the hard bugs in regulated software actually come from. Interfaces between modules get timing wrong. Shared state drifts. A library update changes a default and the caller does not notice. A message queue reorders under load. A dependency injects a different object than the test assumed. None of these are unit-level defects, and none of them are requirement-level defects. They are integration defects, and the integration activity under EN 62304:2006+A1:2015 Section 5.6 exists to catch them before they reach the system test and the patient.
Every team we work with has a moment during their first audit preparation where the difference between "we have tests" and "we have integration evidence" becomes sharp. The unit tests pass. The end-to-end tests pass on a clean environment. But there is no record of the integration sequence, no evaluation criteria that distinguish integration outcomes from unit outcomes, no regression evidence when a shared component changes. The auditor asks how the team knows two items work together after a refactor, and the answer comes back as "we ran the tests." That is not the answer Section 5.6 asks for.
The good news is that if the team already uses modern CI, integration testing can be formalised with very little added work. The work is defining what counts as an integration, what counts as evidence, and what counts as a regression — and then letting the CI pipeline carry the weight.
EN 62304:2006+A1:2015 Section 5.6 — what the standard actually requires
Section 5.6 of EN 62304:2006+A1:2015 is the software integration and integration testing clause. It has four components worth reading carefully.
The first component is the integration plan. The manufacturer plans the integration of software items — the sequence in which units are combined, the build environment, the dependencies, the order in which items are brought together, and the way the integration activity relates to the architecture and the detailed design. The plan is written before the integration happens, not reconstructed after the fact.
The second component is the integration itself. The manufacturer integrates the software units and software items according to the plan. "Integration" here means combining verified units into items and integrated software, not merging feature branches in a git repository. The engineering mechanism may well be the branch merge, but the regulatory activity is the combination of verified units into verified items, recorded against the architecture.
The third component is the integration testing. The manufacturer tests the integrated software items against the integration plan and against the architecture. The tests address the interfaces between items, the behaviour of the integrated system at the item-to-item boundary, the interaction with the underlying platform and external interfaces, and the effect of integrating items that were each correct in isolation. The evaluation criteria for each test are documented before the test runs.
The fourth component is the handling of test results and anomalies. Results are evaluated against the documented criteria. Anomalies are logged and handled through the problem-resolution process defined in Section 9 — evaluated for safety impact against the risk file, resolved or justified, and re-tested after resolution. The standard also expects regression testing whenever integrated software changes: the integration tests re-run, and the evidence is re-captured.
The whole activity lands back in the MDR chain through Annex I Section 17.2, which requires software to be developed in accordance with the state of the art taking into account the principles of development life cycle, risk management, verification, and validation. (Regulation (EU) 2017/745, Annex I, Section 17.2.) The record of the integration activity is part of the technical documentation required under MDR Annex II.
Integration testing is not system testing
The single most common confusion in this area is collapsing integration testing and system testing into one activity. EN 62304:2006+A1:2015 keeps them separate for a reason. Integration testing verifies that the pieces work together — that items correctly exchange data, handle error conditions at interfaces, share state safely, and behave correctly when combined into larger items. System testing, under Section 5.7, verifies the integrated software against its software requirements specification — that the fully integrated system does what the requirements say it does.
A practical example. A SaMD product has a signal-processing item, a patient-state classifier, and a user-interface item. Unit tests verify each item's internal logic. Integration tests verify that the signal-processing output feeds the classifier correctly under all expected and edge-case inputs, that the classifier output drives the user-interface state machine correctly, and that error conditions propagate across the interfaces without data loss or silent failures. System tests then verify that the complete, integrated software meets the software requirement "the system shall display a red alert within two seconds when signal X crosses threshold Y." Integration tests check the plumbing. System tests check the promise.
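The interface checks described in this example can be sketched as executable tests. The items, signatures, and threshold below are illustrative stand-ins rather than any real product's interfaces; the point the sketch makes is that each integration test states its evaluation criterion before it runs, and that a dropped frame must fail loudly at the interface rather than classify silently.

```python
# Hedged sketch of two item-to-item integration tests. The items
# `process_signal` and `classify`, and the 0.8 threshold, are hypothetical.

def process_signal(raw_samples):
    """Stand-in signal-processing item: returns a feature value, or None
    when the frame is unusable (dropped or empty)."""
    if not raw_samples:
        return None  # a dropped frame propagates as an explicit sentinel
    return sum(raw_samples) / len(raw_samples)

def classify(feature):
    """Stand-in classifier item: maps a feature to a patient state and
    must reject a missing feature rather than reuse a stale result."""
    if feature is None:
        raise ValueError("missing feature: upstream frame was dropped")
    return "ALERT" if feature > 0.8 else "NORMAL"

def test_interface_normal_path():
    # Evaluation criterion (documented before the run): a valid frame
    # yields a definite classification at the interface, never None.
    assert classify(process_signal([0.9, 0.95, 0.92])) == "ALERT"

def test_interface_dropped_frame():
    # Evaluation criterion: a dropped frame surfaces as an error at the
    # interface, not as a silently reused or stale classification.
    try:
        classify(process_signal([]))
        assert False, "dropped frame must not classify silently"
    except ValueError:
        pass
```

Note that neither test exercises internal logic; both exercise the boundary between the two items, which is exactly the coverage unit tests do not provide.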
Merging the two levels usually means either leaving integration gaps uncovered or duplicating system tests at a level where they do not produce useful evidence.
The integration plan — the part teams skip
Most teams that have a verification plan do not have an integration plan. Section 5.6 expects one, and a good integration plan is short. It answers four questions. In what sequence are the software units and items combined? What is the test environment — platforms, dependencies, configurations, test data? What evaluation criteria does each integration test measure against? What is the regression strategy when an integrated item changes?
The sequence matters more than most teams realise. Integrating bottom-up — starting with low-level items and adding higher-level items that depend on them — produces different failure modes than top-down integration or a "big bang" approach where everything comes together at once. The standard does not mandate a specific sequence, but it expects the chosen sequence to be documented and followed, so that when an integration failure surfaces, the team knows which combination of items introduced it.
The regression strategy is where the plan pays for itself. Regulated software is never static — a dependency update, a refactor, a bug fix all change the integrated software. The plan specifies which integration tests re-run on which kinds of change. For most teams the right answer is: all integration tests, on every change, automatically in CI. The discipline is less about test selection and more about making the re-run automatic.
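A minimal sketch of the "run everything, every time" regression gate the paragraph describes, assuming a suite expressed as named zero-argument callables. All names are hypothetical; the design point is that the gate takes no description of what changed, because the selection strategy is the whole suite on every change.

```python
# Illustrative regression gate: run every integration test, report per-test
# outcomes, and let CI block the merge unless everything passed.

def run_regression(suite):
    """Run every integration test and return (all_passed, per-test results).

    `suite` maps test name -> zero-argument callable that raises
    AssertionError on failure. There is deliberately no parameter for
    "what changed": everything re-runs on every change.
    """
    results = {}
    for name, test in suite.items():
        try:
            test()
            results[name] = "pass"
        except AssertionError:
            results[name] = "fail"
    return all(r == "pass" for r in results.values()), results
```

A CI step would call `run_regression` over the real suite and fail the build whenever the first element of the result is False, which makes the re-run automatic rather than a matter of discipline.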
Anomalies and the link to risk management
An integration test that fails is not just an engineering problem. Under EN 62304:2006+A1:2015 it triggers the problem-resolution process in Section 9, and the anomaly is evaluated for safety impact against the risk file maintained under EN ISO 14971:2019+A11:2021. The evaluation asks whether the failure reveals a previously unknown hazardous situation, whether an existing risk control is ineffective, or whether a new risk control is needed. The answer feeds back into the risk file and, where relevant, into the software requirements.
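The discipline described above can be made concrete as a record structure. The field names below are assumptions for illustration, not fields prescribed by EN 62304 or EN ISO 14971; the sketch encodes one rule: an anomaly cannot close without a recorded safety evaluation, a resolution or documented justification, and a re-test.

```python
# Hypothetical integration-anomaly record with the Section 9 fields the
# text describes. Nothing here is a prescribed schema.
from dataclasses import dataclass
from typing import Optional

@dataclass
class IntegrationAnomaly:
    test_id: str
    commit: str
    description: str
    safety_impact_evaluated: bool = False   # risk-file evaluation recorded?
    new_hazardous_situation: bool = False   # did it reveal a new hazard?
    risk_control_ineffective: bool = False  # did it defeat an existing control?
    resolution: Optional[str] = None        # fix, or documented justification
    retested: bool = False                  # re-run after resolution?

    def may_close(self) -> bool:
        # No closure without safety evaluation, resolution, and re-test.
        return (self.safety_impact_evaluated
                and self.resolution is not None
                and self.retested)
```

Whether this lives as code, as fields in an issue tracker, or as a form in the QMS is an implementation choice; the closure predicate is the part that matters at audit.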
The link matters because integration failures are often where safety-relevant interactions first show up. A classifier that worked perfectly in unit tests but returns a stale result when the signal-processing item drops a frame is an integration defect with potential safety impact. The discipline is to route every integration anomaly through the problem-resolution process, not to fix it quietly in a side commit.
CI/CD — what "good" looks like in practice
The practice that makes integration evidence cheap is wiring the integration activity into the CI pipeline from the start. A concrete shape that works for most SaMD teams looks like this.
Every pull request that touches regulated code triggers a pipeline that builds the software items from their sources, runs the unit tests at the item level, assembles the items into the integrated software according to the integration plan's sequence, and runs the integration test suite against the assembled result. Integration tests use a reproducible test environment — containerised dependencies, pinned versions, deterministic test data — so the result of any run can be reproduced from the pipeline metadata alone. The pipeline records, for every run, the commit hash, the dependency lock file, the test inputs, the outputs, the pass/fail judgement against each evaluation criterion, and the duration. Failed runs open anomaly tickets automatically, linked to the failing test and the commit that introduced the failure. Merges to the regulated-code branch are blocked unless the integration suite is green.
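The per-run evidence record the pipeline writes can be sketched in a few lines. The function and field names below are assumptions, and the commit hash and lock-file contents are assumed to be supplied by the CI environment; the sketch is not tied to any CI product.

```python
# Illustrative evidence capture for one integration run: version identifier,
# pinned-dependency digest, timestamp, per-test outcome, overall judgement.
import hashlib
import json
import time

def capture_evidence(commit: str, lockfile_text: str, results: dict) -> str:
    """Build one run's evidence record as JSON.

    `results` maps test id -> {"criterion": str, "outcome": "pass" | "fail"}.
    An empty result set is treated as a failing run, never a passing one.
    """
    record = {
        "commit": commit,
        "lockfile_sha256": hashlib.sha256(lockfile_text.encode()).hexdigest(),
        "timestamp_utc": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "results": results,
        "overall": "pass" if results and all(
            r["outcome"] == "pass" for r in results.values()) else "fail",
    }
    return json.dumps(record, indent=2, sort_keys=True)
```

Archiving this record as a pipeline artefact, keyed by commit, is what makes the later claim "the technical file contains the integration evidence for exactly that build" literally true.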
On main-branch merges, the pipeline re-runs the full integration suite and publishes the result as a release-candidate artefact. Release artefacts carry the full integration record with them, so the technical file for a given release contains the integration evidence for exactly that build — not a reconstructed approximation.
None of this is exotic engineering. It is what most modern teams do anyway. The difference is labelling the outputs as regulatory evidence and capturing them in a form that survives audit review.
Scaling by safety class — A, B, and C
The depth of the integration activity scales with the software safety class under EN 62304:2006+A1:2015. For Class A software, the standard does not mandate the full Section 5.6 activity: Class A software is still verified at the system level under Section 5.7, and integration still happens in engineering terms, but the formal integration plan and integration test evidence are not required.
For Class B software, the integration activity is required. The integration plan is written, the tests run, the evaluation criteria are documented, and anomalies are handled through problem resolution. The rigour is proportionate to the class, but the activity is present.
For Class C software, the integration activity is required at full depth. The integration plan is more detailed, the test coverage of item-to-item interfaces is more exhaustive, the regression approach is more conservative, and the evidence is reviewed with a sceptical eye for interface-level failure modes. Class C is where integration testing carries the most weight, because the interactions between items in a Class C system are often where the highest-severity failure modes live.
The class must be assigned at the item level based on the EN ISO 14971:2019+A11:2021 risk analysis — not bulk-assigned for convenience in either direction.
Common mistakes startups make with integration testing
- Collapsing integration testing into system testing, and producing only end-to-end evidence. The interface-level failure modes never get a dedicated test and surface later as field complaints.
- Running integration tests without an integration plan. The tests may be useful engineering but they do not constitute the Section 5.6 activity.
- Treating regression as optional. An integration suite that runs only on release candidates misses the defects that were introduced between candidates.
- Capturing integration evidence in ad-hoc form — screenshots, developer notes, retrospective logs — rather than in CI artefacts that carry version, timestamp, and criterion.
- Fixing integration failures silently without routing them through problem resolution. The link to the risk file is lost and the anomaly history becomes unauditable.
- Using a non-reproducible test environment. A test that passes on a developer machine and fails in CI is not verification evidence either way until the environment is pinned.
- Conflating "the branches merged cleanly" with "the items integrated successfully." The git operation is not the regulatory activity.
The Subtract to Ship angle
The integration-testing activity under EN 62304:2006+A1:2015 rewards early investment and punishes bolt-on catch-up more sharply than almost any other software-lifecycle activity. The moves that work are these.
One — write the integration plan early and keep it short. Four answers: sequence, environment, evaluation criteria, regression strategy. A two-page plan that is actually followed beats a twenty-page plan that nobody reads.
Two — wire the integration suite into CI from the first commit intended to become regulated code. Every merge runs the suite. Every failure opens a ticket. Every pass produces an artefact.
Three — make the test environment reproducible with pinned dependencies and containerised services. A non-reproducible integration run is not evidence.
Four — route every integration anomaly through the Section 9 problem-resolution process, with an explicit safety evaluation against the EN ISO 14971:2019+A11:2021 risk file. No silent fixes.
Five — keep integration testing and system testing separate in both the plan and the evidence. Section 5.6 and Section 5.7 are different activities and produce different records.
Every activity traces back to a clause of EN 62304:2006+A1:2015 and through the standard to MDR Annex I Section 17.2. Activities that do not trace come out. For the broader framework, see post 065.
Reality Check — Is your software integration testing ready for a Notified Body review?
- Do you have a written integration plan that defines the integration sequence, the test environment, the evaluation criteria, and the regression strategy?
- Is the integration activity distinct from unit verification (Section 5.5) and system testing (Section 5.7), with separate plans, cases, and evidence?
- Are integration tests running automatically in CI on every change to regulated code, with evidence captured reproducibly — version, timestamp, inputs, outputs, pass/fail?
- Is the integration test environment reproducible from pipeline metadata alone, with pinned dependencies and deterministic test data?
- Are integration test failures routed through the Section 9 problem-resolution process, with a documented safety evaluation against the EN ISO 14971:2019+A11:2021 risk file?
- Does every integration test have a documented evaluation criterion, defined before the test runs, that the result is measured against?
- Does traceability close from the software architecture to the integration plan to the integration test cases to the results, without gaps?
- When a shared dependency or integrated item changes, do the integration tests re-run automatically and is the regression evidence captured in the release artefact?
Any question you cannot answer with a clear yes is a gap between your current practice and what the Notified Body will expect to see.
Frequently Asked Questions
What is software integration testing under EN 62304:2006+A1:2015? Integration testing under Section 5.6 is the activity where verified software units and items are combined according to an integration plan and the integrated result is tested against that plan and the software architecture. It is distinct from unit verification at Section 5.5 and from system testing at Section 5.7. Its purpose is to catch defects that only surface when items interact — interface mismatches, shared-state issues, timing defects, dependency conflicts.
Is integration testing required for Class A software? The full Section 5.6 integration and integration-testing activity is required for Class B and Class C software under EN 62304:2006+A1:2015. Class A software is still verified at the system level under Section 5.7, and integration happens in engineering practice, but the formal integration plan and integration test evidence required at Section 5.6 are not mandated for Class A. Most teams still do integration testing for Class A because it is cheap insurance and because it protects the system test from failing on interface defects.
How is integration testing different from system testing? Integration testing verifies that software items work together — that interfaces behave correctly, that shared state is handled safely, that the integrated software is consistent with the architecture. System testing under Section 5.7 verifies the integrated software against its software requirements specification. Integration testing checks the plumbing; system testing checks the promise.
Does the standard require regression testing on every change? Section 5.6 expects the integration tests to re-run when integrated software changes, and the evidence to be re-captured. The standard does not prescribe a specific selection strategy, but for most teams the practical answer is to re-run the full integration suite automatically in CI on every change and capture the result. Selective regression is possible but harder to justify at audit than running everything on every change.
Where does integration test evidence live in the MDR technical file? In the software documentation section of the technical file required by MDR Annex II. The expected contents are the integration plan, the integration test cases with their evaluation criteria, the test results with version and timestamp, the anomaly records and their problem-resolution history, and the traceability matrix linking the architecture to the tests. (Regulation (EU) 2017/745, Annex II.) Pipeline artefacts are acceptable as long as a reviewer can reproduce the activity from the record.
Can I use the same test framework for unit, integration, and system testing? Yes, and most teams do. The standard does not prescribe or exclude any framework. What it requires is that the three activities remain distinct in plans, cases, evidence, and evaluation criteria — so a reviewer can tell which test is verifying what. The framework is a tool; the activity distinction is a regulatory obligation.
Related reading
- MDR Software Verification: Unit Testing for Medical Software Using IEC 62304 — the unit-level verification that feeds verified units into integration.
- MDR Software System Testing: Validating the Complete System via IEC 62304 — the system-level activity that the integration activity feeds into.
- DevOps for Medical Software Under MDR — how CI/CD practice maps onto the regulated software lifecycle.
- Software Architectural Design Under EN 62304 — the architecture the integration plan traces back to.
- Software Problem Resolution Under EN 62304 — the Section 9 process that handles integration anomalies.
- The Subtract to Ship Framework for MDR Compliance — the methodology pillar this post applies to integration testing.
Sources
- Regulation (EU) 2017/745 of the European Parliament and of the Council of 5 April 2017 on medical devices, Annex I Section 17.1 and Section 17.2; Annex II (Technical Documentation). Official Journal L 117, 5.5.2017.
- EN 62304:2006+A1:2015 — Medical device software — Software life-cycle processes (IEC 62304:2006 + IEC 62304:2006/A1:2015), Section 5.6 — Software integration and integration testing; Section 9 — Software problem resolution process. Harmonised standard referenced for the software lifecycle under MDR Annex I Section 17.2.
- EN ISO 14971:2019+A11:2021 — Medical devices — Application of risk management to medical devices. Harmonised standard referenced for risk management under MDR Annex I, integrated with EN 62304:2006+A1:2015 for the safety evaluation of integration anomalies.
- EN ISO 13485:2016+A11:2021 — Medical devices — Quality management systems — Requirements for regulatory purposes. Harmonised standard referenced for the QMS that wraps the software lifecycle.
This post is a category-9 spoke in the Subtract to Ship: MDR blog, focused on software integration and integration testing under EN 62304:2006+A1:2015 Section 5.6. Authored by Felix Lenhard and Tibor Zechmeister. The MDR is the North Star for every claim in this post — EN 62304:2006+A1:2015 is the harmonised tool that operationalises the integration-level verification obligation under MDR Annex I Section 17.2, not an independent authority. For startup-specific regulatory support on integration planning, CI pipeline design for regulated software, and audit-ready evidence capture, Zechmeister Strategic Solutions is where this work is done in practice.