Style Guides that Pass Audits: Applying a Regulatory English Style Guide for FDA and EMA to SaMD Narratives
Do your SaMD narratives trigger avoidable audit questions on terminology, claims, or traceability? In this lesson, you’ll learn to apply a regulator-calibrated English style guide that aligns with FDA/EMA expectations—so your writing is clear, consistent, and evidence-anchored. You’ll get precise rules, annotated examples, and a checklist-driven workflow, plus short practice items to harden skills. Finish with a repeatable approach that reduces queries, speeds reviews, and standardizes your team’s voice across US/EU submissions.
Step 1: Orienting to Audit Expectations and the Role of a Regulatory English Style Guide
A regulatory English style guide for FDA and EMA is a practical tool that converts regulatory expectations into concrete writing and formatting rules for Software as a Medical Device (SaMD) narratives. It operationalizes three qualities auditors look for first: clarity, consistency, and traceability. Clarity ensures that reviewers can understand your device, claims, and evidence without guessing. Consistency aligns your terminology, formatting, and phrasing across documents so that nothing appears contradictory or ad hoc. Traceability connects requirements to verification, validation, and postmarket evidence, enabling auditors to follow every claim back to its source.
This kind of style guide is not simply about grammar or tone. It encodes regulatory semantics. For AI/ML-enabled SaMD, semantics include what you call your model states (locked versus adaptive), what you mean by training versus tuning versus validation datasets, how you present performance claims and confidence intervals, and how you represent risk language that matches ISO 14971 and Good Machine Learning Practice (GMLP). When these semantics are stable and explicitly defined in the style guide, your narratives become auditable because every term and claim is unambiguous and mappable to evidence.
Auditors often begin with rapid checks that expose deeper weaknesses. They look for consistent terminology that matches your submission sections and referenced standards; they scan for unambiguous claims and calibrated benefit–risk language; they verify traceability from requirements to evidence; they confirm proper citations to standards and guidance; and they check your change/version discipline. If these basic markers are missing or inconsistent, the reviewer anticipates problems and may expand questions, leading to delays or deficiencies. A strong style guide brings these markers into your default writing practice so that even early drafts are audit-aligned.
To serve SaMD AI/ML narratives, the guide should have explicit, modular sections. A terminology control list defines approved terms and their precise definitions, including allowed regional variants. Naming conventions standardize the identifiers for models, datasets, and versions. Approved verbs and modal verbs define how you express requirements, recommendations, and commitments. Tense and voice rules set expectations for when to use present versus past and active versus passive. Evidence-citation formatting defines how to reference standards and guidance and how to anchor claims to unique evidence IDs. Metric and reporting templates specify what performance statistics to include, in what order, and with which parameters. Tables and figure conventions dictate captions, sources, and versioning. Risk language codifies hazard-to-harm phrasing and residual risk acceptability statements. Abbreviations and inclusivity/bias language guard against unclear or problematic wording. Finally, change-log policies ensure that every narrative is linked to version control and that readers can understand the history of modifications.
In essence, the style guide becomes a shared contract across regulatory writers, clinicians, data scientists, quality assurance, and regulatory affairs. It reduces subjective debate during drafting and review, because the rules have already been agreed and documented. This shared contract accelerates writing, strengthens review outcomes, and minimizes audit risk.
Step 2: Core Rules to Standardize SaMD Narratives (Minimum Viable Style Layer)
The minimum viable style layer contains core rules that deliver immediate audit value. The first rule is controlled terminology. Maintain a term base aligned with IMDRF SaMD definitions, ISO 13485/14971, GMLP, the FDA’s Predetermined Change Control Plan (PCCP), and EMA expectations for AI transparency. Controlled terminology eliminates drift between “intended use” (FDA context) and “intended purpose” (EU context) by prescribing which term to use where. It distinguishes “data drift” from “concept drift,” and fixes the usage of “locked model” versus “adaptive model.” The term base is not a glossary afterthought; it is the governing source that authors consult while drafting. Because auditors will check for definition alignment with cited standards, your term base must include authoritative references and note any regional synonyms.
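One way to make the term base operational rather than a glossary afterthought is a simple lint that flags non-approved regional variants in a draft. The sketch below assumes a tiny illustrative term base (the `TERM_BASE` mapping and the region keys are ours, not from any standard); a real term base would live in document control and carry authoritative references.

```python
import re

# Hypothetical term-base entries: non-approved term -> approved term, per region.
# These pairs are illustrative; a production term base would be aligned to
# IMDRF/ISO definitions and maintained under document control.
TERM_BASE = {
    "FDA": {"intended purpose": "intended use"},
    "EU": {"intended use": "intended purpose"},
}

def lint_terminology(text: str, region: str) -> list[str]:
    """Return a warning for each non-approved term found for the given region."""
    warnings = []
    for bad, good in TERM_BASE[region].items():
        for match in re.finditer(re.escape(bad), text, flags=re.IGNORECASE):
            warnings.append(f"Replace '{bad}' with '{good}' (pos {match.start()})")
    return warnings

draft = "The intended purpose of the device is pneumonia triage."
print(lint_terminology(draft, "FDA"))
```

Loading such a check into the authoring environment turns terminology drift into a flagged deviation instead of an audit finding.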
A second rule concerns claims and the modality of certainty. You should calibrate modal verbs to reflect the commitment strength. Use “shall” for requirements that your device or process must meet. Reserve “must” for legal or regulatory obligations, because overuse weakens precision. Use “will” to describe sponsor actions and “should” for recommendations. Avoid “may” in performance claims, because it signals uncertainty that undermines verifiability. Replace vague statements with specific, testable claims that include named metrics, confidence intervals, dataset splits, and references to pre-specified analysis plans. This calibration keeps claims auditable and keeps commitments within your control.
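The "avoid 'may' in performance claims" rule can also be automated heuristically. The sketch below flags sentences that look like performance claims (using an assumed keyword list, which you would tune to your own metric names) and contain the weak modal:

```python
import re

# Hypothetical rule: flag the weak modal "may" in sentences that look like
# performance claims. The keyword list is a heuristic assumption of ours.
CLAIM_KEYWORDS = ("sensitivity", "specificity", "accuracy", "AUROC", "performance")

def flag_weak_claims(text: str) -> list[str]:
    """Return claim-like sentences that use 'may' and thus need recalibration."""
    flagged = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        is_claim = any(k.lower() in sentence.lower() for k in CLAIM_KEYWORDS)
        if is_claim and re.search(r"\bmay\b", sentence, flags=re.IGNORECASE):
            flagged.append(sentence.strip())
    return flagged

draft = ("The model may improve sensitivity in older adults. "
         "The model demonstrates AUROC 0.92 (95% CI 0.90-0.94; N=3,200).")
print(flag_weak_claims(draft))
```

The flagged sentence would then be rewritten as a specific, testable claim with metrics and evidence anchors, per the rule above.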
Tense and voice rules provide additional clarity. Use present tense for enduring truths, device characterization, intended use/purpose, and algorithm description that remain valid across time. Use past tense for completed studies and analyses, such as validation conducted on a particular dataset at a particular time point. Prefer active voice to reveal responsibility: specify who performed the activity and what was done. Passive voice often obscures accountability and invites audit questions. By consistently articulating agency, you improve the reviewer’s ability to locate the responsible process or group.
Evidence citation and traceability must be mechanical and consistent. Mandate in-text anchors to unique evidence IDs that connect to a traceability matrix. The matrix links requirements (REQ-###) to verification (VER-###), validation (VAL-###), and reports (REP-###). Make the anchor part of the sentence containing the claim, not a footnote at the end, so that a reviewer can immediately see the evidence path. Use a consistent citation format for standards and guidance (e.g., FDA GMLP, year; ISO 14971:2019) and maintain a reference list that records the exact edition and publication year. Require table captions and figure captions that include a data source identifier and artifact version. These practices turn your narrative into a navigable map rather than a story that must be believed on trust.
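Because the anchor grammar is mechanical, it can be checked mechanically. The sketch below assumes the bracketed `[VAL-198; SAP-017]` anchor style used in this lesson's examples and a toy registry of known IDs (the registry contents are illustrative):

```python
import re

# Assumed evidence-anchor grammar: bracketed, semicolon-separated IDs such as
# [VAL-198; SAP-017], matching the examples in this lesson.
ANCHOR_RE = re.compile(r"\[((?:[A-Z]{2,4}-\d{3})(?:;\s*[A-Z]{2,4}-\d{3})*)\]")

# Illustrative registry of IDs known to document control.
REGISTRY = {"VAL-198", "SAP-017", "REP-552"}

def check_anchors(text: str) -> dict:
    """Extract in-line evidence IDs and report any not found in the registry."""
    ids = []
    for match in ANCHOR_RE.finditer(text):
        ids.extend(part.strip() for part in match.group(1).split(";"))
    return {"found": ids, "unknown": [i for i in ids if i not in REGISTRY]}

claim = "Validation met the primary endpoint (observed 0.91) [VAL-198; SAP-017]."
print(check_anchors(claim))
```

An "unknown" ID here is exactly the kind of traceability break an auditor spots in minutes, so catching it at drafting time is cheap insurance.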
Quantitative reporting needs tight standardization. Name metrics precisely and define them in the term base. Present them in a consistent order, with primary endpoints first, secondary endpoints next, and exploratory metrics clearly flagged. Require confidence intervals, sample sizes, dataset split definitions (training, tuning, locked validation, external validation), and a pointer to the pre-specified analysis plan. For bias and fairness, establish subgroup reporting conventions by demographic and clinical segments relevant to intended use, and define thresholds that trigger risk language or mitigation steps. This structure demonstrates statistical discipline and guards against cherry-picked results.
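A completeness check for quantitative claims can enforce the minimum reporting set before review. The sketch below uses assumed surface patterns ("95% CI" and "N=") that match this lesson's examples; you would extend it for your own metric templates:

```python
import re

# Heuristic completeness check for a quantitative claim sentence: it should
# carry a confidence interval and a sample size. Patterns are our assumptions.
def metric_is_complete(sentence: str) -> list[str]:
    """Return the list of required elements missing from a claim sentence."""
    missing = []
    if not re.search(r"\b95%\s*CI\b", sentence):
        missing.append("confidence interval")
    if not re.search(r"\bN\s*=\s*[\d,]+", sentence):
        missing.append("sample size (N)")
    return missing

good = "Validation achieved AUROC 0.92 (95% CI 0.90-0.94; N=3,200) [VAL-214]."
bad = "Validation achieved AUROC 0.92."
print(metric_is_complete(good))  # expected: []
print(metric_is_complete(bad))
```

A non-empty result blocks the claim at the self-check gate until the CI and N are reported.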
Risk and benefit language must follow a structured format that aligns with ISO 14971 practices. Articulate the hazard, the sequence of events, the resulting harm, the risk control measure, and the residual risk acceptability with a reference to your predefined criteria. Use consistent phrasing for acceptability thresholds and link each decision back to the risk management file. Avoid generic endorsements of acceptability; tie every statement to criteria and evidence.
Version control supports the audit trail. Use a naming convention that encodes artifact type and semantic versioning, such as ALG-###-vX.Y.Z for models, DS-###-vA.B for datasets, and DOC-###-vM.N for documents. Maintain change logs that map edits to the appropriate governance mechanism, including the PCCP where applicable. Ensure that the narrative’s version metadata lists the versions of all referenced artifacts. This closes the loop between text and technical assets, so auditors can reconcile claims with the exact versions used to generate evidence.
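The naming conventions above are regular enough to validate automatically. A minimal sketch, using the patterns named in this section (`ALG-###-vX.Y.Z`, `DS-###-vA.B`, `DOC-###-vM.N`):

```python
import re

# Naming conventions from the guide: ALG-###-vX.Y.Z for models,
# DS-###-vA.B for datasets, DOC-###-vM.N for documents.
PATTERNS = {
    "model": re.compile(r"ALG-\d{3}-v\d+\.\d+\.\d+"),
    "dataset": re.compile(r"DS-\d{3}-v\d+\.\d+"),
    "document": re.compile(r"DOC-\d{3}-v\d+\.\d+"),
}

def validate_artifact_id(artifact_type: str, identifier: str) -> bool:
    """True if the identifier matches the naming convention for its type."""
    return bool(PATTERNS[artifact_type].fullmatch(identifier))

print(validate_artifact_id("model", "ALG-042-v1.3.0"))   # expected: True
print(validate_artifact_id("dataset", "DS-007-v2"))      # expected: False (minor version missing)
```

Running this over a narrative's version-metadata appendix confirms that every referenced artifact is named in the convention before the document is frozen.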
Step 3: Adapting the Guide to Your SaMD and Region, and Applying It with a Checklist-Driven Workflow
Adaptation is necessary because devices, indications for use/purpose, and regulatory routes vary. Begin by mapping terminology to your device’s intended use/purpose and regulatory route, whether 510(k), De Novo, or CE marking under the MDR. Localize region-sensitive terms explicitly: use “intended use” for FDA submissions and “intended purpose” for EU MDR documents. Harmonize “benefit–risk” versus “risk–benefit” to match agency-preferred ordering, and ensure your term base lists both with notes on context.
Customize metric templates to the clinical role of your SaMD. Diagnostic devices will emphasize sensitivity, specificity, likelihood ratios, and ROC area; monitoring devices may prioritize time-to-alert, alert fatigue rates, and missed-event rates; decision-support tools might report calibration metrics, decision impact, and user override rates. Align these metrics with your clinical evaluation plan and define which are primary and secondary. Predefine subgroup categories based on clinical relevance and regulatory expectations for fairness and inclusivity. Incorporate threshold values and escalation rules for when performance drifts or subgroup disparities appear.
Define your AI/ML model taxonomy and data assets. Clarify model classes (e.g., locked versus adaptive), life cycle phases (development, verification, validation, deployment, monitoring), and data categories (training, tuning, locked validation, external validation, real-world performance). Approve abbreviations and ensure each abbreviation expands to a unique, unambiguous term in the term base. If you use internal names or codenames, document their mappings to submission-friendly labels.
Approve a verbs list for claims and commitments. For instance, reserve “demonstrates” for claims supported by evidence that meets pre-specified criteria, “supports” for associative or exploratory findings, and “will” for future actions governed by a plan. Tie verb choices to evidence levels. This removes ambiguity in how strongly you present findings.
Apply the guide through a disciplined authoring workflow:
- Pre-author checklist: Confirm the document’s scope and intended audience (FDA reviewer, notified body, or both). List applicable guidances and standards, and ensure you are using the current style guide version. Load the term base and abbreviations into your authoring environment, so automatic checks can flag deviations. This prevents misalignment from the first line of drafting.
- Draft using templates: Insert approved section headings and boilerplate language before writing body text. Place metric tables and traceability anchors while drafting claims, not as an afterthought. Use figure and table caption templates that require data source and version fields. This approach bakes compliance into the narrative rather than layering it on later.
- Self-check pass with seven gates: Run a style checklist focused on terminology conformance, calibrated claims modality, correct tense and active voice, completeness of metrics (including confidence intervals and N), presence and correctness of evidence anchors, structured risk phrasing, and presence of version metadata. Document any deviations with proposed corrections and rationales.
- Peer QA pass: A second reviewer repeats the same checklist and conducts a “red team” search for ambiguity. They look for vague quantifiers, hedging verbs, and unanchored claims. They also scan tables and figures for missing sources or version tags. All issues go into a defect log keyed to document line items or paragraph IDs. The defect log becomes a record that supports continuous improvement and informs training for authors.
- Traceability verification: Cross-walk the narrative against the requirements and evidence matrix. Confirm that every claim maps to at least one piece of evidence and that every requirement appears in the narrative with a clear touchpoint. Verify that any cited standard is the correct edition and that the evidence IDs match what is stored in your document control system. This step is critical, because traceability errors are common and immediately visible to auditors.
- Finalization: Update the change log with a concise summary of what changed and why. Generate a style compliance summary page that shows the checklist scores and any waivers with justification. Freeze versions of cited artifacts and list them in an appendix so that the narrative, evidence, and datasets are locked together. This final step turns your document into a fixed, auditable record.
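The traceability-verification gate in particular is a mechanical cross-walk and can be scripted. The sketch below assumes a toy matrix (the REQ/VER/VAL contents are illustrative) and checks both directions: requirements never mentioned in the narrative, and cited IDs absent from the matrix.

```python
import re

# Illustrative requirements-to-evidence matrix; a real one would be exported
# from your document control or requirements management system.
MATRIX = {
    "REQ-100": {"VER-145", "VAL-210"},
    "REQ-101": {"VER-146", "VAL-211"},
}

def crosswalk(narrative: str) -> dict:
    """Cross-walk a narrative against the matrix in both directions."""
    cited = set(re.findall(r"\b(?:REQ|VER|VAL|REP)-\d{3}\b", narrative))
    known = set(MATRIX) | {e for evidence in MATRIX.values() for e in evidence}
    missing_requirements = [r for r in MATRIX if r not in cited]  # never mentioned
    orphan_ids = sorted(cited - known)  # cited but absent from the matrix
    return {"missing_requirements": missing_requirements, "orphan_ids": orphan_ids}

text = "REQ-100 is verified [VER-145] and validated [VAL-210; REP-601]."
print(crosswalk(text))
```

Here the check would surface `REQ-101` as unaddressed and `REP-601` as an orphan ID to reconcile with document control before finalization.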
Together, adaptation and workflow turn the style guide from a static document into a living practice. They make your narratives robust to reviewer scrutiny and reduce the time you spend resolving ambiguities during query rounds.
Step 4: Short Practice and Assessment for Mastery
To internalize these principles, targeted practice is essential. Focus on rewriting weak or ambiguous sentences into audit-ready statements using the style rules. Emphasize subgroup metrics with confidence intervals, specify agents and datasets for completed activities, and anchor every claim to evidence IDs. Practice labeling terms as approved or non-approved according to your term base to reinforce controlled terminology. Finally, perform quick exit checks to select the correct citation format, attach the right traceability anchor, and choose the appropriate modal verb for commitments. These short, frequent exercises help authors build automaticity, so that style compliance becomes a fluent, default behavior rather than a laborious effort.
By aligning your SaMD narratives with a regulatory English style guide tuned to FDA and EMA expectations, you create documents that are clear in meaning, consistent in usage, and traceable to evidence. You reduce audit risk, accelerate reviews, and build trust with regulators. Most importantly, you equip your team with a repeatable, checklist-driven process that scales across documents and iterations, even as your AI/ML models and datasets evolve. This is how style guides move beyond aesthetics and become instruments of compliance and quality in the SaMD lifecycle.
- Build and use a controlled terminology and semantics layer (e.g., intended use vs. intended purpose; locked vs. adaptive; dataset splits) aligned to standards to ensure clarity, consistency, and traceability.
- Calibrate modality, tense, and voice: use shall/must/will/should precisely; avoid may in claims; use present for enduring facts and past for completed studies; prefer active voice to show responsibility.
- Anchor every claim in-line to unique evidence IDs and maintain a traceability matrix linking requirements to verification, validation, reports, and cited standards with exact editions.
- Standardize quantitative reporting, risk language, and version control with templates (metrics with CI, N, predefined endpoints; ISO 14971 hazard-to-harm phrasing; semantic versioning and change logs) and apply the guide via a checklist-driven workflow.
Example Sentences
- The diagnostic model is a locked model that supports pneumonia detection in adult chest X-rays (INT-USE-001), and validation achieved AUROC 0.92 (95% CI 0.90–0.94; N=3,200) [VAL-214; REP-552].
- External validation was conducted on a pre-specified dataset from Site B in May 2024 and met the primary endpoint of sensitivity ≥0.90 (observed 0.91, 95% CI 0.88–0.93) [VAL-198; SAP-017].
- Each requirement in the Predetermined Change Control Plan shall map to at least one verification artifact and one validation report [PCCP-003; REQ-100→VER-145→VAL-210].
- We will monitor subgroup performance quarterly and initiate mitigation if any demographic subgroup shows absolute sensitivity delta >5 percentage points versus the overall rate [MON-012; RMP-040].
- Residual risk for false negatives remains acceptable per ISO 14971 criteria because control RM-07 reduced the probability from probable to remote, with clinical escalation in place [RISK-073; PROC-221; ISO 14971:2019].
Example Dialogue
[Alex]: Our draft says the model may improve triage speed, but that verb is too weak for a claim.
[Ben]: Agreed—let's change it to demonstrates only if we meet the pre-specified threshold; otherwise, we say supports and anchor it.
[Alex]: Good call. The external validation achieved a 12% reduction in median review time (95% CI 9–15; N=840) and met the threshold, so we can write demonstrates and cite VAL-233 and REP-601.
[Ben]: Also, switch to past tense for the study sentence and add the evidence IDs in-line, not as footnotes.
[Alex]: Done. And I updated intended purpose to intended use for the FDA version and kept the PCCP link in the same paragraph.
[Ben]: Perfect—now the terminology, modality, tense, and traceability all align with the style guide.
Exercises
Multiple Choice
1. Which sentence best follows the calibrated modality rule for a performance claim backed by pre-specified criteria and in-line evidence anchors?
- The model may improve detection accuracy in older adults.
- The model should improve detection accuracy in older adults.
- The model demonstrates improved detection accuracy in older adults (12% absolute gain, 95% CI 9–15; N=840) [VAL-233; REP-601].
- The model will improve detection accuracy in older adults.
Show Answer & Explanation
Correct Answer: The model demonstrates improved detection accuracy in older adults (12% absolute gain, 95% CI 9–15; N=840) [VAL-233; REP-601].
Explanation: Use demonstrates for claims supported by evidence that meets pre-specified criteria, and include metrics, confidence intervals, N, and in-line evidence IDs. Avoid may for performance claims as it signals uncertainty.
2. Choose the sentence that correctly applies tense/voice and traceability for completed validation work.
- Validation is conducted on Site B data and results are acceptable.
- Validation was conducted on Site B data; results meet the primary endpoint (sensitivity ≥0.90; observed 0.91, 95% CI 0.88–0.93) [VAL-198; SAP-017].
- Validation will be conducted on Site B data and should meet the primary endpoint.
- Validation may be conducted on Site B data (see appendix).
Show Answer & Explanation
Correct Answer: Validation was conducted on Site B data; results meet the primary endpoint (sensitivity ≥0.90; observed 0.91, 95% CI 0.88–0.93) [VAL-198; SAP-017].
Explanation: Use past tense for completed studies, include specific metrics with CI and N when available, and place in-line anchors to evidence IDs. Active, specific phrasing supports audit traceability.
Fill in the Blanks
For FDA submissions, use the term ___ to describe what the device is intended to do; for EU MDR documents, use intended purpose.
Show Answer & Explanation
Correct Answer: intended use
Explanation: Controlled terminology prevents drift between regions. The guide prescribes intended use for FDA and intended purpose for EU MDR.
Each requirement (REQ-###) shall map to at least one verification artifact (VER-###) and one validation report (VAL-###) to ensure ___.
Show Answer & Explanation
Correct Answer: traceability
Explanation: Traceability links requirements to verification and validation evidence, enabling auditors to follow each claim back to its source.
Error Correction
Incorrect: The adaptive model will be validated on an external dataset last quarter and the results are cited in a footnote.
Show Correction & Explanation
Correct Sentence: The adaptive model was validated on an external dataset in Q2 2025, and results are anchored in-line [VAL-310; REP-722].
Explanation: Use past tense for completed activities, specify time, and place in-line evidence anchors rather than footnotes to support audit navigation.
Incorrect: We must monitor subgroup performance monthly and may initiate mitigation if disparities are observed.
Show Correction & Explanation
Correct Sentence: We will monitor subgroup performance monthly and shall initiate mitigation if any subgroup shows an absolute sensitivity delta >5 percentage points versus the overall rate [MON-012; RMP-040].
Explanation: Calibrate modality: will for sponsor actions; shall for requirements. Avoid may in commitments that need enforceability; specify the trigger and anchor to governance artifacts.