---
title: Federated Learning for Medical Devices: MDR and GDPR
description: How federated learning changes AI medical device development under MDR Annex VIII Rule 11, with GDPR data minimisation and EN 62304 evidence expectations.
authors: Tibor Zechmeister, Felix Lenhard
category: AI, ML & Algorithmic Devices
primary_keyword: federated learning medical devices MDR GDPR
canonical_url: https://zechmeister-solutions.com/en/blog/federated-learning-medical-devices
source: zechmeister-solutions.com
license: All rights reserved. Content may be cited with attribution and a link to the canonical URL.
---

# Federated Learning for Medical Devices: MDR and GDPR

*By Tibor Zechmeister (EU MDR Expert, Notified Body Lead Auditor) and Felix Lenhard.*

> **Federated learning is a training architecture, not a regulatory shortcut. Under MDR Annex VIII Rule 11 the device is still classified on its intended purpose and the information it provides for clinical decisions. What federated learning changes is how the manufacturer controls training data and evidence. What it does not change is the obligation to demonstrate safety, performance, and lifecycle rigour.**


## TL;DR
- Federated learning trains a shared model across hospitals without pooling raw patient data on one server; only model updates move.
- Under MDR, classification is driven by intended purpose and Rule 11, not by training architecture. Federated training does not down-classify a Class IIa or IIb device.
- GDPR data minimisation aligns well with federated learning, but model updates can still constitute personal data in specific cases. Flag this for your DPO.
- The manufacturer remains responsible for the full EN 62304 lifecycle, including verification of the aggregation server, the local training code, and the global model.
- Evidence challenges are real: distribution shift between sites, client drift, reproducibility, and auditability of a training run that never happened in one place.
- Keep the global model locked for the first CE mark. Defer any form of continuous learning until post-market infrastructure is in place.

## Why this matters

A founder emailed us last month with a pitch that sounded familiar. "We do federated learning, so we do not touch patient data, so GDPR is solved and our notified body review will be faster." Three sentences, three misconceptions.

Federated learning is a powerful and, in many cases, the right architecture for medical AI. Hospitals are rightly protective of their data. Centralising millions of images in one bucket is legally and politically painful, and sometimes impossible. Training locally and exchanging only model updates can unblock access to data that would otherwise be unreachable.

But federated learning does not change what a medical device is. It changes where the gradients live. The regulator cares about the device, its intended purpose, and the evidence behind it. If your software provides information used for diagnosis, monitoring, or treatment decisions, it is a medical device whether it was trained on one server in Frankfurt or on fifty servers in fifty hospitals.

This post walks through what actually changes when you choose federated learning, what does not, and how to build a defensible technical file around a decentralised training pipeline.

## What MDR actually says

MDR Article 2(12) defines intended purpose as *"the use for which a device is intended according to the data supplied by the manufacturer on the label, in the instructions for use or in promotional or sales materials or statements and as specified by the manufacturer in the clinical evaluation"*. That definition is architecture-agnostic. A federated model with an intended purpose of "automatic detection of lung nodules in adult chest CT to support radiologist review" is the same regulatory object as an equivalent centrally trained model.

Classification for software flows through MDR Annex VIII Rule 11. Software intended to provide information used to take decisions with diagnosis or therapeutic purposes is Class IIa, unless such decisions may cause death or irreversible deterioration (Class III) or serious deterioration or surgical intervention (Class IIb). Software intended to monitor physiological processes is generally Class IIa, unless monitoring vital physiological parameters where variations could result in immediate danger, in which case Class IIb. Everything else intended as software for medical purposes falls to Class I. The training architecture does not appear anywhere in Rule 11.

For the lifecycle, MDR Annex I §17.2 requires that software devices be developed and manufactured in accordance with the state of the art, taking into account the principles of development lifecycle, risk management, verification, and validation. The harmonised standard is EN 62304:2006+A1:2015, which defines the software lifecycle processes and the software safety classification (A, B, C). EN ISO 14971:2019+A11:2021 governs risk management. Annex II defines the technical documentation the manufacturer must produce.

GDPR is adjacent law, not part of the MDR framework. The relevant articles on data minimisation and lawfulness of processing sit in Regulation (EU) 2016/679. For this post, the useful framing is that GDPR compliance is necessary but not sufficient for MDR compliance, and MDR compliance does not satisfy GDPR.

## A worked example

Consider a startup building a Class IIa SaMD that detects diabetic retinopathy from fundus photographs. Intended purpose: "To provide information to ophthalmologists and screening clinicians to support referral decisions in adult patients with diabetes." Under Rule 11, Class IIa. Under EN 62304, the team assigns software safety class B after risk analysis shows that a missed referral could contribute to serious deterioration but is mitigated by the physician-in-the-loop workflow.

The team has access to three hospital networks in Austria, Germany, and the Netherlands. None will export raw images. The startup implements federated averaging: local training on each site, weights sent to a central aggregation server, averaging, global model redistributed.
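The aggregation step the team implements is plain federated averaging (FedAvg): each site's weights are averaged into the global model, weighted by the site's sample count. A minimal sketch with toy weights; the array shapes and site sizes are illustrative, not the team's actual stack:

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """Weighted average of client model weights (FedAvg).

    client_weights: one list of np.ndarray layers per site.
    client_sizes: training examples per site, used as averaging weights.
    """
    total = sum(client_sizes)
    n_layers = len(client_weights[0])
    return [
        sum(w[layer] * (n / total) for w, n in zip(client_weights, client_sizes))
        for layer in range(n_layers)
    ]

# Three hypothetical sites sharing a toy two-layer model
sites = [
    [np.ones((2, 2)) * 1.0, np.ones(2) * 1.0],
    [np.ones((2, 2)) * 2.0, np.ones(2) * 2.0],
    [np.ones((2, 2)) * 3.0, np.ones(2) * 3.0],
]
sizes = [100, 100, 200]  # the third site contributes twice the data
global_model = federated_average(sites, sizes)
# (1*100 + 2*100 + 3*200) / 400 = 2.25
print(global_model[0][0, 0])
```

Note the design implication for the technical file: the sample-count weighting means one large site can dominate the global model, which is exactly the kind of behaviour the risk analysis below has to address.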

What stays the same:
- The intended purpose and classification. Still Class IIa, still Rule 11.
- The technical file structure per Annex II.
- The need for clinical evaluation against MDR Article 61 and Annex XIV.
- The full EN 62304 lifecycle: planning, requirements, architecture, implementation, verification, release.
- The risk management file under EN ISO 14971.

What changes:
- The training data documentation. Instead of one dataset description, the manufacturer now documents three site-specific datasets: demographics, camera models, label protocols, inclusion and exclusion criteria, label noise estimates per site.
- The software architecture documentation per EN 62304 §5.3 expands to describe the client training code, the aggregation server, the communication protocol, and the trust model.
- Verification. The manufacturer must verify not only the final global model but also that the training process is deterministic enough to be reproducible and auditable. "We ran federated averaging for 200 rounds and this is what came out" is not a verification record.
- Risk analysis. New hazards appear: client drift (one site's data shifts and poisons the global model), aggregation server failure, communication channel compromise, label protocol divergence between sites.
- Bias and performance validation. Performance must be characterised per site and on an external hold-out. A federated model that performs beautifully on average can hide a site where it performs badly.
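The verification and auditability point can be made concrete cheaply: hash every client update and every aggregated result per round, so a shipped model version can later be traced to, and re-verified against, the archived updates. A minimal sketch, where the site names and the plain-list weight format are hypothetical:

```python
import hashlib
import json

def update_digest(weights):
    """Deterministic SHA-256 digest of a model update (nested float lists)."""
    payload = json.dumps(weights, sort_keys=True).encode()
    return hashlib.sha256(payload).hexdigest()

def log_round(round_num, site_updates, global_weights, audit_log):
    """Append one aggregation round to an audit trail.

    Storing digests rather than raw weights keeps the log small while
    still letting you detect tampering and re-verify archived updates
    against the recorded hashes.
    """
    audit_log.append({
        "round": round_num,
        "site_digests": {s: update_digest(w) for s, w in site_updates.items()},
        "global_digest": update_digest(global_weights),
    })

audit_log = []
log_round(1,
          {"vienna": [[0.1, 0.2]], "berlin": [[0.3, 0.4]]},
          [[0.2, 0.3]],
          audit_log)
print(audit_log[0]["round"], len(audit_log[0]["site_digests"]))
```

A log like this, archived with the client updates themselves, is what turns "we ran 200 rounds" into a verification record a reviewer can follow.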

On the GDPR side, the team's DPO confirms that raw images never leave the hospitals, which helps with data minimisation. But the team also learns that model updates can, under certain conditions, leak information about training examples. This is not a hypothetical in the academic literature. The mitigation is secure aggregation and, where appropriate, differential privacy. Both need to be documented as part of the cybersecurity and data protection strategy.
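Of those mitigations, differential privacy is the one that touches the client code directly: each update is clipped to a fixed L2 norm and perturbed with Gaussian noise before it leaves the site. A sketch of that per-client step; the clip norm and noise multiplier are illustrative placeholders, not values from a real privacy accounting:

```python
import numpy as np

def privatise_update(update, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Clip a client update to a fixed L2 norm and add Gaussian noise.

    This is the per-client step of DP-style federated training (Gaussian
    mechanism). Real parameter values must come from a formal
    epsilon/delta privacy accounting, not from this sketch.
    """
    rng = rng or np.random.default_rng(0)
    flat = np.concatenate([w.ravel() for w in update])
    norm = np.linalg.norm(flat)
    scale = min(1.0, clip_norm / max(norm, 1e-12))
    clipped = [w * scale for w in update]        # bound each client's influence
    noise_std = noise_multiplier * clip_norm     # noise calibrated to the clip
    return [w + rng.normal(0.0, noise_std, size=w.shape) for w in clipped]

update = [np.array([3.0, 4.0])]  # L2 norm 5.0, so it gets clipped to 1.0
private = privatise_update(update)
```

The clipping bounds any single patient record's influence on the update; the noise is what makes the formal privacy guarantee possible. Both parameters belong in your documented data protection strategy.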

## The Subtract to Ship playbook

The instinct with federated learning is to add complexity. Resist that. Subtract until what is left is defensible.

**1. Lock the model before CE mark.** Do not combine federated learning with continuous learning for your first submission. Freeze the global model, version it, and submit it as a locked algorithm under MDR. Continuous retraining is a separate regulatory conversation and belongs in a predetermined change control plan, which European notified bodies are still working through. See our post on locked vs adaptive AI algorithms.

**2. Treat each site as a qualified supplier.** The hospitals running local training are, in practice, executing part of your training pipeline. Apply supplier controls from your QMS. Contract terms, data provenance, label protocols, software versions on the client side. If a site upgrades their PACS or changes their labelling protocol mid-study, you need to know.

**3. Document the dataset like a clinical study.** For each site: patient demographics, device mix, acquisition protocols, label methodology, inter-rater reliability, exclusion criteria, sample sizes by relevant subgroup. This is how a notified body assesses whether your training data supports your intended purpose across the target population.

**4. Verify the pipeline, not just the model.** Write software requirements for the client training code, the aggregation server, and the orchestration layer. Run unit tests, integration tests, and system tests per EN 62304. The aggregation server is part of your medical device software unless you can justify otherwise. Most startups cannot.

**5. Test for distribution shift explicitly.** Hold out one site entirely. Train on the other two. Measure performance on the held-out site. If it collapses, your federated model has memorised site-specific shortcuts, not medical signal. This is not optional. It is the minimum evidence a reviewer will expect.
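The held-out-site test can be automated as a leave-one-site-out loop. In this sketch, `train_fn` and `eval_fn` are stand-ins for the real training and scoring pipeline, and the toy numbers give one site a shifted distribution so the gap is visible:

```python
def leave_one_site_out(sites, train_fn, eval_fn):
    """For each site, train on the remaining sites and evaluate on the
    held-out one. A large gap between in-federation and held-out scores
    signals site-specific shortcut learning rather than medical signal.
    """
    results = {}
    for held_out in sites:
        train_sites = [s for s in sites if s is not held_out]
        model = train_fn([s["data"] for s in train_sites])
        results[held_out["name"]] = eval_fn(model, held_out["data"])
    return results

# Toy stand-ins: "training" computes a mean, "evaluation" measures how
# far the held-out site's mean sits from the model.
sites = [
    {"name": "vienna",  "data": [0.80, 0.82]},
    {"name": "berlin",  "data": [0.79, 0.81]},
    {"name": "utrecht", "data": [0.60, 0.62]},  # shifted distribution
]
train = lambda datasets: sum(sum(d) / len(d) for d in datasets) / len(datasets)
evaluate = lambda model, data: round(abs(model - sum(data) / len(data)), 3)
results = leave_one_site_out(sites, train, evaluate)
print(results)  # utrecht shows the largest gap
```

Report the per-site results, not just the average: the whole point of the exercise is to surface the one site where the model falls over.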

**6. Engage your DPO early and in writing.** GDPR treatment of model updates is not fully settled in case law. Get your data protection position documented, shared with the hospitals, and referenced in your risk file. When the notified body asks about privacy, you want to point to a coherent position, not improvise.

**7. Plan post-market monitoring per site.** Under MDR Article 83, post-market surveillance is mandatory. For federated models, monitor per deployment site where possible. Site-level performance drift is the early warning signal that your training distribution no longer matches reality.
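Per-site monitoring can start as simply as a rolling-window comparison against the validated baseline. The window size, baseline, and tolerance below are illustrative; the real thresholds belong in your PMS plan:

```python
from collections import deque

class SiteDriftMonitor:
    """Rolling-window performance monitor for one deployment site.

    Flags drift when the recent mean score falls more than `tolerance`
    below the validated baseline. Parameters are illustrative defaults,
    not recommended values.
    """
    def __init__(self, baseline, tolerance=0.05, window=100):
        self.baseline = baseline
        self.tolerance = tolerance
        self.scores = deque(maxlen=window)

    def record(self, score):
        self.scores.append(score)

    def drifting(self):
        if len(self.scores) < self.scores.maxlen:
            return False  # not enough data for a stable estimate yet
        mean = sum(self.scores) / len(self.scores)
        return mean < self.baseline - self.tolerance

monitor = SiteDriftMonitor(baseline=0.90, tolerance=0.05, window=5)
for score in [0.88, 0.85, 0.84, 0.82, 0.80]:  # steadily degrading site
    monitor.record(score)
print(monitor.drifting())  # True
```

One monitor instance per site is the point: a global average can stay green while one hospital's distribution quietly walks away from your training data.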

The core discipline is this: federated learning buys you access to data you could not otherwise reach. It costs you complexity in verification, documentation, and risk management. Make sure the trade is worth it before you commit.

## Reality Check

1. Can you state your intended purpose in one sentence and map it to a specific Rule 11 sub-clause?
2. Do you have a documented dataset description for each participating site, including demographics and label protocols?
3. Have you written software requirements for the aggregation server and client training code, or are they treated as "infrastructure"?
4. Can you reproduce a specific global model version from archived client updates and aggregation logs?
5. Have you tested the model on a completely held-out site that was not part of training?
6. Does your risk management file identify hazards specific to federated training (client drift, aggregation failure, update leakage)?
7. Has your DPO provided a written position on whether model updates constitute personal data in your setup?
8. Is your first CE submission for a locked model, with continuous learning deferred to a later planned change?

Any "no" is a gap to close before you submit.

## Frequently Asked Questions

**Does federated learning lower my device class?**
No. Classification follows intended purpose and Rule 11. The training architecture is invisible to the classification logic.

**If the data never leaves the hospital, is GDPR automatically satisfied?**
No. Model updates can encode information about training examples. You still need a lawful basis for processing, data minimisation analysis, and, depending on the threat model, technical measures like secure aggregation or differential privacy. Get a written DPO position.

**Can I use federated learning to avoid a clinical investigation?**
Training architecture has no effect on clinical evidence requirements. Clinical evaluation under MDR Article 61 and Annex XIV is driven by claims, intended purpose, and state of the art, not by how weights were computed.

**Is the aggregation server part of my medical device software?**
Almost certainly yes, if it influences the final model that ships to users. Treat it as medical device software under EN 62304, with requirements, architecture, verification, and change control.

**What software safety class applies to the training code?**
EN 62304 classifies the device software based on the harm the device can cause. The training code contributes to the behaviour of the device, so it sits within the same lifecycle. Your safety classification rationale should cover it explicitly.

**Can we do continuous federated learning after CE mark?**
Not without a predetermined change control plan that the notified body has reviewed. For a first submission, lock the model.

## Related reading
- [Rule 11 classification for AI/ML software](/blog/classification-ai-ml-software-rule-11) — how intended purpose drives classification regardless of training architecture.
- [Training data requirements for AI medical devices](/blog/training-data-requirements-ai-medical-devices) — what your notified body expects to see in the dataset section.
- [Data quality and bias in AI medical devices](/blog/data-quality-bias-ai-medical-devices) — why per-site characterisation matters in federated setups.
- [Locked vs adaptive AI algorithms under MDR](/blog/locked-vs-adaptive-ai-algorithms-mdr) — why your first submission should be a locked model.
- [Clinical evaluation for AI/ML medical devices](/blog/clinical-evaluation-ai-ml-medical-devices) — clinical evidence expectations for AI SaMD.

## Sources
1. Regulation (EU) 2017/745 on medical devices, consolidated text. Article 2(12), Annex I §17.2, Annex II, Annex VIII Rule 11, Annex XIV.
2. EN 62304:2006+A1:2015 — Medical device software — Software lifecycle processes.
3. EN ISO 14971:2019+A11:2021 — Medical devices — Application of risk management to medical devices.
4. MDCG 2019-11 Rev.1 (June 2025) — Guidance on qualification and classification of software.
5. Regulation (EU) 2016/679 (GDPR) — general reference for data minimisation and lawfulness of processing.

---

*This post is part of the [AI, ML & Algorithmic Devices](https://zechmeister-solutions.com/en/blog/category/ai-ml-devices) cluster in the [Subtract to Ship: MDR Blog](https://zechmeister-solutions.com/en/blog). For EU MDR certification consulting, see [zechmeister-solutions.com](https://zechmeister-solutions.com).*
