Regression testing for medical software under MDR means rerunning enough of your existing verification to prove a change did not silently break a safety-relevant behaviour. EN 62304 requires you to analyse the impact of each change, select tests based on that analysis, and retain the evidence — not rerun everything blindly, and not skip tests because the diff looks small.
By Tibor Zechmeister and Felix Lenhard.
TL;DR
- EN 62304:2006+A1:2015 treats regression testing as part of the maintenance process (clause 6) and problem resolution (clause 9), triggered by change requests and driven by change-impact analysis.
- You do not have to rerun every test for every change. You have to justify, in writing, which tests you reran and why.
- Every software change on a released SaMD must be classified: is it a "significant change" under MDR Article 120, or a maintenance change handled under your QMS?
- Regression evidence is audit evidence. If a notified body cannot trace a released version back to a test run, the release is effectively unverified.
- The startup mistake is not "too little regression testing." It is "regression testing without a change-impact record." The second one fails audits faster.
Why regression testing is where small teams get hurt
A two-person team ships a fix. Ten lines. CI goes green. They deploy. Three weeks later a user reports a number on a results screen is off by a decimal place. The ten-line change touched a unit conversion nobody listed as affected. There was no change-impact analysis. The tests that would have caught it existed — they were just not selected to run.
Under EN 62304 and EN ISO 13485:2016+A11:2021, that sequence is a verification failure on a released medical device. If the error had clinical consequences, it becomes a vigilance event under MDR Articles 87–92.
Regression testing exists to prevent this. Not by running every test every time — neither required nor realistic — but by deciding, for each change, which parts of the verification set still apply, which need to be rerun, and which need new tests. The discipline is the analysis. The execution is follow-through.
What EN 62304 actually says about regression testing
EN 62304:2006+A1:2015 does not use the word "regression" as a headline requirement. It does something stricter: it builds the expectation into the maintenance and problem-resolution processes.
Clause 5.6 — Integration and integration testing. Clause 5.6 requires that when software items are integrated, the manufacturer tests the integrated software, and clause 5.6.6 explicitly requires regression testing appropriate to demonstrate that defects have not been introduced into previously integrated software. In practice, every change that re-integrates code against existing modules triggers this expectation for the affected integration boundaries.
Clause 6 — Software maintenance process. Clause 6 applies from the moment your software is released. It requires a documented maintenance plan, analysis of each modification (including its effect on existing software), and verification of modifications. The standard is explicit that you must evaluate the impact of a change on existing software and redo verification and validation activities to the extent affected. That is regression testing, even though the word does not appear.
Clause 9 — Software problem resolution. Clause 9 governs what happens when a problem is reported against released software. It requires you to investigate, determine the impact, and — critically — verify that the fix resolves the problem and does not introduce new ones. Again: regression, by another name.
MDR Annex I §17.2. Annex I §17.2 of Regulation (EU) 2017/745 requires that software be developed and manufactured according to the state of the art, taking into account the principles of development lifecycle, risk management, and verification and validation. EN 62304 is the harmonised standard that gives you the presumption of conformity for that clause.
MDR Article 120 — significant change. Article 120 governs the transitional regime and, importantly, the concept of "significant change" to design or intended purpose for legacy MDD devices. For devices already under MDR, the analogous concept lives in your conformity assessment route and your QMS change-control process: some software changes require renewed notified body involvement; others do not. Regression evidence is how you demonstrate that a non-significant change genuinely stayed within the boundaries you claimed.
The combined picture: EN 62304 tells you to analyse impact and re-verify to the extent affected. ISO 13485 tells you to control changes and keep records. MDR tells you that if the change is significant, your notified body needs to hear about it before you ship.
Regression tests vs new tests — the distinction that gets missed
One of the cleanest ways to lose a notified body audit is to conflate regression tests with new tests.
A regression test is an existing, already-approved test case that you rerun to confirm a previously verified behaviour still holds after a change. It has a test ID, an expected result, and a history.
A new test is a test case created specifically because the change introduced a new behaviour, a new requirement, or a new risk control — or because a defect revealed that an old requirement was not adequately covered.
Both are required in different situations. EN 62304 clause 6 expects that when a change introduces new behaviour, you add the tests needed to cover it. Clause 9 expects that when a defect is fixed, you add a test that would have caught the defect — so the same problem does not reappear silently.
If your "regression run" consists of tests written the same week as the change, that is not regression. That is initial verification. Call it what it is, and document both activities separately in the change record.
Change-impact analysis: the document that drives test selection
Change-impact analysis (CIA) is the artefact EN 62304 implicitly requires when it says "evaluate the effect on existing software." It is also the document most startups do not write — and the first thing a thorough auditor asks for after a software change.
A usable CIA for a startup contains, at minimum:
- Change identifier — the change request, ticket, or problem report ID.
- Description of the change — what is actually being modified, in plain language plus a pointer to the diff or commit range.
- Affected software items — which modules, units, or components are touched directly.
- Indirectly affected items — which modules depend on the affected items (call graph, shared state, shared data formats, shared interfaces).
- Affected requirements — which software requirements, system requirements, and risk controls are in scope.
- Affected risks — which hazard analysis entries need review under EN ISO 14971:2019+A11:2021.
- Test selection rationale — which existing tests will be rerun, which new tests will be added, which tests are explicitly not rerun and why.
- Result — pass/fail, linked test execution records, date, reviewer.
Points 4 and 7 are where small teams cut corners. Point 4 requires you to actually know your dependency structure. Point 7 requires you to defend what you did not test. Both are muscles that get stronger with practice and weaker with haste.
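The eight sections above map naturally onto a small structured record. Here is a minimal sketch in Python, assuming you keep the CIA as data alongside the prose form; all field names are illustrative, not prescribed by any standard or template:

```python
from dataclasses import dataclass, field

@dataclass
class ChangeImpactAnalysis:
    """One-page CIA record. Field names are illustrative only."""
    change_id: str                      # change request / ticket / problem report ID
    description: str                    # plain language plus commit range
    affected_items: list[str]           # modules touched directly
    indirect_items: list[str]           # modules depending on the affected items
    affected_requirements: list[str]    # requirement IDs in scope
    affected_risks: list[str]           # hazard analysis entries to review
    tests_rerun: list[str] = field(default_factory=list)
    tests_added: list[str] = field(default_factory=list)
    tests_skipped: dict[str, str] = field(default_factory=dict)  # test ID -> written rationale
    result: str = "pending"             # pass/fail plus links to execution records

    def ready_for_release(self) -> bool:
        # Complete only when the run passed and every skipped test
        # has a non-empty written rationale (point 7 above).
        return self.result == "pass" and all(self.tests_skipped.values())
```

The point of the `ready_for_release` check is point 7 from the list: an empty rationale for a skipped test blocks the release just as a failing test does.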
A worked example: the unit-conversion fix that looked safe
A startup ships a Class IIa SaMD that computes dosing ranges from patient weight. The software is classified Class B under EN 62304 clause 4.3. A developer fixes a rounding bug in a shared utility function convertUnits(). The diff is 12 lines. CI is green.
Without CIA, the test set run by CI covers about 60 percent of the units that depend on convertUnits(). The other 40 percent — including the dosing calculation — rely on integration tests that run only nightly, and those integration tests do not cover the specific rounding edge case that was fixed.
With CIA, the team lists the affected items: convertUnits() is called by doseCalculator, reportFormatter, exportService, and auditLogger. Under "indirectly affected items," they note that doseCalculator feeds the primary clinical output and is linked to risk control RC-17 (dose miscalculation). The CIA flags that:
- All unit tests on convertUnits() must be rerun — including the new test for the rounding bug.
- Integration tests on doseCalculator must be rerun, not deferred to nightly.
- A targeted new test covering the specific rounding edge case must be added at the unit and integration level.
- reportFormatter and exportService require a rerun of their integration tests because they format values derived from the utility.
- auditLogger is out of scope because it logs raw input, not converted values. That decision is written down.
The team reruns the selected tests. doseCalculator integration tests fail: the rounding fix exposes a latent assumption in the dosing logic. The release is held. The latent bug is fixed. A new test is added. The release ships with a clean CIA, a clean rerun, and two new tests.
That entire sequence — including the failure — is audit-positive. A reviewer sees a team that found its own problem before a patient did.
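The selection step in this example can be sketched as a traversal of the reverse call graph, with out-of-scope decisions made explicitly rather than by omission. A minimal sketch, where the graph, module names, and test-suite IDs simply echo the hypothetical example above:

```python
from collections import deque

# Reverse dependencies: module -> modules that call it (illustrative).
CALLERS = {
    "convertUnits": ["doseCalculator", "reportFormatter",
                     "exportService", "auditLogger"],
}
# Test suites bound to each module via traceability data (illustrative IDs).
TESTS = {
    "convertUnits": ["UT-CONV-*"],
    "doseCalculator": ["IT-DOSE-*"],
    "reportFormatter": ["IT-REPORT-*"],
    "exportService": ["IT-EXPORT-*"],
    "auditLogger": ["IT-AUDIT-*"],
}

def impacted_modules(changed: str) -> set[str]:
    """All modules reachable from the change via the reverse call graph."""
    seen, queue = {changed}, deque([changed])
    while queue:
        for caller in CALLERS.get(queue.popleft(), []):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen

def select_tests(changed: str, out_of_scope: set[str] = frozenset()) -> set[str]:
    """Union of test suites for impacted modules, minus documented exclusions."""
    return {t for m in impacted_modules(changed) - out_of_scope
            for t in TESTS.get(m, [])}
```

Excluding auditLogger here is a deliberate argument to `select_tests`, not a module the traversal happened to miss — which is exactly the "written down" decision the CIA requires.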
The Subtract to Ship playbook for regression testing
Startups do not need a 40-page regression strategy. They need a small number of habits enforced rigorously.
1. Bind regression to the change-control process, not to CI. Your QMS change-control procedure (ISO 13485 clause 7.3.9) is the gate. CI is a tool inside that gate, and as software used in the QMS it itself needs validation under clause 4.1.6. If CI is your only regression trigger, you have no change-impact record and no paper trail.
2. Write a one-page CIA template and use it for every released-software change. One page. Eight sections. No change merges to the release branch without it. For pre-release internal changes, you can be lighter — but anything that touches a released version needs the full form.
3. Tag every automated test with the requirement and risk control it verifies. Traceability from requirements to tests is already required by EN 62304 clause 5.1.1 and by MDR Annex II. If your tests are tagged, test selection becomes a database query instead of an argument.
4. Keep regression test evidence for the retention period required by ISO 13485 clause 4.2.5 — which, for MDR devices, is at least the lifetime of the device plus the duration specified in MDR Article 10(8). Test logs, not just pass/fail summaries. The logs are what an auditor will ask for.
5. Decide "significant change" in writing, not in a Slack thread. For every change batch that ships to users, document whether it is a significant change in the sense of MDR Article 120 and your conformity assessment (Annex IX / X / XI). If in doubt, consult your notified body before shipping, not after.
6. Treat regression failures as information, not embarrassment. A regression failure caught in your own pipeline is a PMS/CAPA win (ISO 13485 clause 8.5.2, MDR Articles 83–86). A regression failure caught by a user is a vigilance event. The gap between those two outcomes is the entire value of regression testing.
Reality Check
Answer these honestly. Each "no" is a gap, not a judgement.
- For your last ten released software changes, can you produce a written change-impact analysis for each one — today, without rebuilding it from memory?
- Do your automated tests trace back to specific software requirements and risk controls, or are they organised only by code module?
- When a test fails in CI, is there a documented decision on whether the failure is release-blocking, and who made the decision?
- Can you tell an auditor, for any given released version, which test execution records correspond to that exact build?
- For changes you classified as "not significant" under MDR Article 120 or your conformity assessment, is the classification rationale written down and reviewed?
- When was the last time a regression test caught a real defect? If the answer is "never," are your tests actually covering the risk-controlled behaviours?
- Does your maintenance plan under EN 62304 clause 6 describe how regression testing is scoped, selected, and evidenced — or does it just say "regression tests will be performed"?
Frequently Asked Questions
Do I have to rerun every test for every change? No. EN 62304 clause 6 requires you to redo verification to the extent affected by the change. The obligation is to analyse the impact and select tests on that basis, not to rerun everything. But the analysis has to exist in writing.
Is CI pipeline green enough evidence of regression testing? Not by itself. CI output is a technical artefact; audit evidence is CI output plus a change-impact analysis, a release record, and traceability from the change to the tests that were selected and to the requirements they cover.
What counts as a "significant change" for software under MDR? For devices placed on the market under MDR, the concept lives in your conformity assessment (Annex IX, X, or XI) and your notified body's change-notification rules. For legacy devices under the Article 120 transitional regime, "significant changes in the design or intended purpose" are not permitted. For new software changes, the pragmatic test is: does the change alter intended purpose, introduce new risks, or affect the basis on which the notified body issued the certificate? If yes, talk to your notified body before shipping.
Can I skip regression testing for urgent security patches? You can compress the scope, but you cannot skip the analysis. A security patch still requires a change-impact record and evidence that the affected behaviours were re-verified. EN IEC 81001-5-1:2022 adds cybersecurity-specific expectations on top.
How does regression testing relate to problem resolution under EN 62304 clause 9? Clause 9 requires that when a problem is fixed, the fix is verified and that the fix does not introduce new problems. That verification is regression testing scoped by the problem-resolution record. The same CIA discipline applies — it is just triggered by a problem report instead of a change request.
What should test evidence include for an audit? Test case ID, software version, date, operator, environment, input reference, expected and actual result, pass/fail, and a link back to the change record.
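A minimal machine-readable form of such a record, archived per released build, might look like the following sketch; every value is illustrative:

```python
import json

# One test execution record with the fields listed above (illustrative values).
record = {
    "test_case_id": "IT-DOSE-004",
    "software_version": "2.3.1+build.412",
    "date": "2025-01-15T10:42:00Z",
    "operator": "ci-runner-03",
    "environment": "ubuntu-22.04 / python-3.11",
    "input_ref": "fixtures/dose_edge_cases.json",
    "expected": "18.5 mg",
    "actual": "18.5 mg",
    "result": "pass",
    "change_record": "CR-101",   # link back to the change / CIA record
}
# One JSON line per execution makes the log both archivable and queryable.
evidence_line = json.dumps(record, sort_keys=True)
```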
Related reading
- MDR Software Maintenance under IEC 62304 — the maintenance process that regression testing sits inside.
- Significant Change for Software under MDR — how to classify a change before it ships.
- Software Problem Resolution under IEC 62304 — the clause 9 process that drives regression on defect fixes.
- Software Unit Testing under IEC 62304 — the test layer most commonly affected by regression scope.
- Software Traceability: Requirements, Tests, Risks — the traceability that makes test selection a query instead of an argument.
Sources
- Regulation (EU) 2017/745 on medical devices, consolidated text. Article 120; Annex I §17.2; Annex II.
- EN 62304:2006+A1:2015 — Medical device software — Software lifecycle processes. Clauses 5.1.1, 5.6, 6, 9.
- EN ISO 13485:2016+A11:2021 — Medical devices — Quality management systems. Clauses 4.1.6, 4.2.5, 7.3.9, 8.5.2.
- EN ISO 14971:2019+A11:2021 — Medical devices — Application of risk management to medical devices.