Strategic English for RWE Communication: Precise Phrasing for Missing Data, Bias, and Confounding in HTA Dossiers
Worried that vague wording in RWE sections is delaying HTA decisions? This lesson will give you precise, regulator-ready phrasing so you can clearly explain missing data, bias, confounding, and external controls in HTA dossiers. You’ll get concise, stepwise guidance, real-world example sentences and dialogue, and short exercises to practice drafting HTA‑grade paragraphs that point reviewers directly to methods and sensitivity analyses. The tone is practical and executive‑grade—rigorous, minimal, and built for fast integration into dossiers and appendices.
Step 1 — Set the context and audience
When writing for Health Technology Assessment (HTA) dossiers, always begin by situating the reader. Typical audiences are HTA reviewers, payer medical officers, reimbursement committees, and methodologists who need to judge whether submitted evidence supports a coverage or pricing decision. These readers expect concise, reproducible descriptions that permit independent assessment of credibility, bias risk, and generalizability. Precise language builds trust: it makes assumptions explicit, indicates where judgment was required, and points reviewers to the methods and data that justify conclusions.
Open the dossier section with a brief orientation: the data source(s) used, the analytic intent (comparative effectiveness, safety, or supportive real-world evidence), and the specific questions addressed (e.g., comparative overall survival vs. external control). This sets expectations so reviewers can immediately map statements to their evaluation criteria.
A short checklist framed for the reviewer helps both writer and reader. Use a bullet list near the start that identifies the core elements reviewers look for: data provenance (origin, collection methods, versioning), eligibility alignment (how populations were matched to trial criteria), endpoint definitions (operational and censoring rules), handling of missingness (extent, assumptions, methods), bias and confounding assessment (identified risks, direction and likely magnitude), and sensitivity or robustness analyses. When these items are signposted explicitly, the language required later can be compact because readers already know what to expect and where to find details in appendices.
Emphasize why precise phrasing matters. HTA decisions often hinge on small but meaningful differences in effectiveness estimates. Ambiguous language about sample selection, endpoint measurement, or how missing data were treated invites skepticism and can lead to requests for reanalysis. Defensible phrasing reduces iterative clarification requests and supports quicker, more confident decisions.
Step 2 — External controls and endpoint phrasing
When describing external control arms, use a compact template that structures the information the reviewer needs to evaluate credibility. The template should follow this logical order: (1) data source description, (2) eligibility alignment statement, (3) selection or matching methods, (4) key baseline balance metrics, and (5) limitations. Each element should be precise and refer to a methods appendix or table for full detail.
Begin the external control paragraph with a clear source description: identify the database or registry, its geographic scope, data capture period, and a brief statement of data quality (e.g., routinely collected electronic health records with validated mortality linkage). Avoid vague terms like "real-world data" without specification. Define the data cut-off date and any notable completeness features (e.g., 98% capture of hospitalization events).
Follow with eligibility alignment. State how trial inclusion/exclusion criteria were operationalized in the real-world data: which criteria could be matched exactly, which required proxy variables, and which could not be applied. Use defensible signals such as "eligibility criteria were aligned by applying variables A–C; criterion D was proxied using E (rationale: X); criterion F could not be applied and was addressed through sensitivity analyses". This communicates both transparency and the areas most likely to affect comparability.
Describe the selection, matching, or weighting methods next. Use standardized language to name the method (e.g., propensity-score matching, inverse probability of treatment weighting, entropy balancing) and summarize the implementation: covariates included, matching ratio, caliper, variable selection strategy (pre-specified vs. data-driven). Always note whether baseline balance was assessed using standardized mean differences (SMD) and the pre-specified threshold for acceptable balance (commonly SMD <0.10). Point readers to a table that shows the pre- and post-adjustment balance.
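The SMD balance check described above is straightforward to compute. A minimal sketch in Python, using the common pooled-SD definition of the standardized mean difference; the covariate summaries below are invented purely for illustration:

```python
import math

def smd(mean_t, sd_t, mean_c, sd_c):
    """Standardized mean difference for a continuous covariate:
    (mean_t - mean_c) / pooled SD, with pooled SD = sqrt((sd_t^2 + sd_c^2) / 2)."""
    pooled_sd = math.sqrt((sd_t**2 + sd_c**2) / 2)
    return (mean_t - mean_c) / pooled_sd

# Hypothetical post-match summaries (treated vs external control)
balance = {
    "age": smd(62.1, 9.8, 63.0, 10.2),
    "prior_lines": smd(1.4, 0.9, 1.5, 1.0),
}

# Flag covariates that breach the pre-specified threshold (SMD >= 0.10)
flagged = [name for name, d in balance.items() if abs(d) >= 0.10]
```

In a dossier, these values would populate the pre- and post-adjustment balance table that the paragraph cross-references.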
Conclude with a candid limitations sentence that quantifies residual imbalance or missing covariates and the potential direction of bias (e.g., unmeasured confounding likely to bias treatment effect toward the null or away from the null). This forces clarity about the remaining uncertainties and signals that subsequent sensitivity analyses will probe these risks.
Endpoint description must be operational, succinct, and unambiguous. For progression-free survival (PFS), overall survival (OS), and hazard ratios (HRs), follow a short, repeatable template: operational definition, censoring rules, analysis method, and differences between trial and real-world measurements.
The operational definition states exactly how the endpoint was measured in the data source: for OS, identify the death registry linkage and define date of death used; for PFS, define the event (radiographic progression, clinical progression, initiation of next therapy) and how it was captured. Avoid soft terms like "progression was recorded"; instead specify the codes or clinical triggers used. State the follow-up window and how follow-up time was calculated (e.g., from index date to event, last contact, or administrative censoring).
Censoring rules must be explicit: state who was censored, at what date, and why (loss to follow-up, data cut-off). If differing censoring rules were used between trial and real-world cohorts, describe those differences and the analytic strategy to harmonise them (e.g., truncation of follow-up at X months to align windows).
Specify the analysis method plainly: which model was used for time-to-event analysis (Kaplan–Meier with log-rank test, Cox proportional hazards model), covariate adjustment strategy, and whether proportional hazards assumptions were tested. If using non-parametric or flexible approaches for irregular real-world follow-up, state that and give rationale.
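The product-limit (Kaplan-Meier) estimate referenced above can be sketched in a few lines. This is a minimal illustration of the calculation, not dossier analysis code; in practice a validated package (e.g., R's survival or Python's lifelines) would be used:

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.
    times: follow-up times; events: 1 = event observed, 0 = censored.
    At each distinct event time t: S(t) *= (1 - d_t / n_t),
    where d_t = events at t and n_t = number still at risk."""
    pairs = sorted(zip(times, events))
    survival, curve = 1.0, []
    n_at_risk = len(pairs)
    i = 0
    while i < len(pairs):
        t = pairs[i][0]
        deaths = removed = 0
        # Group all subjects with the same follow-up time
        while i < len(pairs) and pairs[i][0] == t:
            deaths += pairs[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed  # events and censorings both leave the risk set
    return curve
```

Note how a censored observation reduces the risk set without stepping the curve down, which is exactly why explicit censoring rules matter for comparability.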
Finally, explicitly note differences between trial and real-world endpoints and why they matter. For example: real-world progression is often less frequent and more heterogeneous in definition, which can attenuate observed PFS differences. State how this may affect interpretation of HRs and what sensitivity checks were performed.
Step 3 — Precise phrasing for missing data, bias, confounding, and modifiers
Missing data: Use modular sentences that first quantify extent and pattern, then state the assumption, and finally describe the handling approach. Start with a clear headline sentence: the proportion of missingness for key variables and the pattern (monotone, intermittent, differential by treatment group). Follow with the assumption (e.g., missing at random, MAR; missing completely at random, MCAR; or missing not at random, MNAR) and justify it briefly with observed correlates. Then specify the method used (complete-case analysis with reasoning; multiple imputation with number of imputations, variables included, model type; inverse-probability-of-censoring weighting), and where the imputation model is documented. Conclude with a short statement that sensitivity analyses evaluated MNAR scenarios and their impact on effect estimates.
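When multiple imputation is used, the m per-imputation estimates are combined with Rubin's rules before reporting a single pooled effect. A minimal sketch; the estimates and variances below are hypothetical:

```python
import statistics

def rubin_pool(estimates, variances):
    """Rubin's rules for pooling m imputed-data analyses:
    pooled estimate = mean of the m estimates;
    total variance  = within-imputation W + (1 + 1/m) * between-imputation B."""
    m = len(estimates)
    q_bar = statistics.mean(estimates)       # pooled point estimate
    w = statistics.mean(variances)           # within-imputation variance W
    b = statistics.variance(estimates)       # between-imputation variance B (n-1 denominator)
    total_var = w + (1 + 1 / m) * b
    return q_bar, total_var

# Hypothetical log-HR estimates and variances from m = 3 imputed datasets
pooled, var = rubin_pool([1.0, 1.2, 0.8], [0.04, 0.05, 0.06])
```

The between-imputation term is what makes the pooled confidence interval honestly wider when imputations disagree, which is the quantity MNAR sensitivity analyses then stress-test.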
Bias: Present bias language that names the bias, its likely direction, and an estimate of magnitude where possible. Use concise phrasing: "Confounding by indication is plausible and expected to bias the observed treatment effect away from the null because treated patients were younger and had fewer prior therapies. Quantitatively, residual confounding after adjustment is estimated to shift the HR by up to X% (see E-value or bias modelling in Appendix)." This structure—label, direction, quantification, and pointer—keeps statements defensible and actionable.
Confounding: For measured confounding, list identified covariates and the adjustment approach in a compact sentence: "Measured confounders (age, performance status, prior lines of therapy, comorbidity index) were included in the propensity-score model; matching achieved post-match SMDs <0.10 for all listed variables." For unmeasured confounding, explicitly acknowledge gaps and quantify potential effects using sensitivity analyses (E-values, negative controls, quantitative bias analysis). Use phrasing that separates what was adjusted for from what remains a risk.
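The E-value mentioned above has a closed form: for a risk ratio RR > 1, E = RR + sqrt(RR * (RR - 1)); for protective ratios, the reciprocal is used first. A minimal sketch:

```python
import math

def e_value(rr):
    """E-value for a point estimate on the risk-ratio scale:
    the minimum strength of association an unmeasured confounder would need
    with both treatment and outcome to fully explain the observed RR."""
    if rr < 1:
        rr = 1 / rr  # protective estimates: work with the reciprocal
    return rr + math.sqrt(rr * (rr - 1))
```

For example, an observed RR of 1.8 yields an E-value of 3.0, which supports phrasing such as "an unmeasured confounder with RR >= 3.0 for both treatment and outcome would be required to fully explain the observed association."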
Treatment effect modifiers: Clarify whether modifiers were pre-specified or post-hoc and how interactions were tested. Use language such as: "Pre-specified potential effect modifiers (tumor histology, prior biologic exposure) were tested using interaction terms in adjusted Cox models; heterogeneity was assessed using stratified estimates and formal interaction tests with alpha set at 0.10 for exploratory subgroup assessment." If any modifiers materially changed estimates, state the direction and whether subgroup findings are hypothesis-generating.
Throughout these sections, link each risk statement to a sensitivity or robustness analysis. Use short connector phrases: "This risk was explored in sensitivity analyses", "We modelled MNAR using delta-based imputation (Appendix) to quantify impact", or "E-value analysis suggests that an unmeasured confounder with RR = X would be required to nullify the observed association." Such phrasing lets HTA reviewers find the evidence backing the claim.
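The delta-based MNAR check referenced above can be illustrated as a toy tipping-point scan: shift the imputed values by progressively larger deltas and report how far the pooled estimate moves. All values here are simulated, not from any real cohort, and a single-mean outcome stands in for the dossier's actual effect estimate:

```python
import random
import statistics

random.seed(7)

# Hypothetical cohort: 80 observed outcome values; 20 further patients missing
observed = [random.gauss(10, 2) for _ in range(80)]
missing_n = 20

def delta_adjusted_mean(observed, missing_n, delta):
    """Delta-based MNAR sensitivity check: impute each missing value as the
    MAR estimate (here, the observed mean) shifted by delta, then pool."""
    mar_imputed = statistics.mean(observed)
    values = observed + [mar_imputed + delta] * missing_n
    return statistics.mean(values)

# Tipping-point scan: how far must delta go before conclusions change?
estimates = {delta: delta_adjusted_mean(observed, missing_n, delta)
             for delta in (0.0, -1.0, -2.0)}
```

In a dossier appendix, the reported quantity would be the delta at which the treatment-effect conclusion tips, which is exactly the evidence the connector phrases point reviewers toward.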
Step 4 — Practical drafting and integration
Assemble the above elements into a concise HTA paragraph of 100–200 words by maintaining a clear structure: one sentence or clause for data source and eligibility alignment, one for selection and balance, one for endpoint operationalization and censoring, and one for missingness, bias/confounding risk, and reference to sensitivity analyses. Keep tone measured and transparent: avoid overstatement and defensive qualifiers. Use active, precise verbs ("applied", "aligned", "adjusted", "imputed") and neutral signal phrases ("was addressed by", "was explored in sensitivity analysis").
Signposting and tone: Use short headings to direct readers to key sections and use sparing emphasis (bold or italics) only for critical phrases (e.g., "eligibility aligned using pre-specified criteria"). Cross-reference methods and appendices with parenthetical notes or table numbers to avoid clutter. The tone should be factual and non-defensive: acknowledge limitations, quantify risk where possible, and immediately indicate steps taken to assess or mitigate them.
SEO-aware checklist: Conclude the section with an internal checklist to ensure the primary SEO phrase appears naturally and usefully: include "phrasing for missing data, bias, confounding" in a sentence that explains why such phrasing matters—e.g., "Consistent phrasing for missing data, bias, confounding enables reviewers to rapidly evaluate residual risk and the robustness of conclusions." Place this sentence where it serves the reader rather than as an SEO insertion.
By following this four-step flow—context and expectations, structured external-control and endpoint templates, modular risk phrasing, and concise integration—writers can craft HTA-ready RWE text that is defensible, transparent, and easy for payers to evaluate. Precise phrasing is not stylistic: it is methodological clarity that materially affects decision-making.
- Open dossier sections by stating the data source(s), analytic intent, and specific questions up front so reviewers can immediately map claims to evaluation criteria.
- Describe external controls and endpoints with a compact template: source description, eligibility alignment, selection/matching methods with balance metrics (e.g., SMD <0.10), and explicit limitations.
- For missing data, bias, and confounding, use modular, explicit sentences: quantify missingness and pattern, state the missingness assumption and handling method, and name biases with likely direction and quantitative sensitivity assessments.
- Keep the final paragraph concise (100–200 words), use active precise verbs, signpost methods/appendices, and link each risk statement to sensitivity or robustness analyses.
Example Sentences
- The external-control cohort was drawn from the national EHR registry (2015–2022; 98% hospitalization capture) and eligibility was aligned by applying trial criteria A–C, proxying criterion D with medication-based proxies (rationale in Appendix Table A1).
- Propensity-score matching (1:2 nearest-neighbor, caliper 0.2 SD) was applied using pre-specified covariates; post-match balance achieved SMDs <0.10 for all variables (see Table 2).
- Overall survival was defined as time from index date to date of death per national mortality linkage; patients were censored at last contact or administrative cut-off (2022-12-31) to harmonise follow-up windows with the trial.
- Missingness in performance status was 18% and differed by treatment group (higher in the real-world comparator); we assumed MAR conditional on age and prior therapy and implemented multiple imputation (m = 20) including all outcomes and predictors (details in Appendix).
- Confounding by indication is plausible and likely biases the treatment effect away from the null given treated patients were younger; E-value analysis suggests an unmeasured confounder with RR ≥ 1.8 would be required to fully explain the observed HR (Appendix B).
Example Dialogue
Alex: We used the oncology registry as an external control—eligibility aligned to trial criteria A–C, but criterion D had to be proxied by prior chemotherapy codes (see Appendix).
Ben: Did you check baseline balance after weighting?
Alex: Yes—entropy balancing on age, performance status, and prior lines achieved SMDs <0.10; residual imbalance on comorbidity was quantified and explored in sensitivity analyses, including an MNAR delta-based imputation.
Ben: Good. And how did you handle progression differences between settings?
Alex: PFS was operationalised as radiographic progression or start of next therapy, with follow-up truncated at 24 months to align with the trial and a Cox model adjusted for the balancing weights; differences and robustness checks are summarised in Table 3.
Exercises
Multiple Choice
1. Which of the following is the best way to open an HTA dossier section describing an external-control cohort?
- Start with general claims about using 'real-world data' and then describe results.
- Begin with the data source description, analytic intent, and the specific questions addressed.
- Open with a long narrative about the clinical context and patient stories.
Show Answer & Explanation
Correct Answer: Begin with the data source description, analytic intent, and the specific questions addressed.
Explanation: The lesson advises situating the reader immediately by stating the data source(s), analytic intent, and specific questions. This orients HTA reviewers so they can map statements to evaluation criteria; vague generalities or narratives do not provide the required reproducible context.
2. When reporting balance after propensity-score matching, which phrasing follows the guidance most precisely?
- Post-match balance showed similar groups.
- Post-match balance achieved standardized mean differences (SMDs) <0.10 for all pre-specified covariates (see Table 2).
- Balance was acceptable based on visual inspection of covariates.
Show Answer & Explanation
Correct Answer: Post-match balance achieved standardized mean differences (SMDs) <0.10 for all pre-specified covariates (see Table 2).
Explanation: The guidance recommends precise, quantitative statements (SMDs and thresholds like <0.10) and cross-referencing tables. Vague statements or subjective assessments ("similar" or "visual inspection") lack reproducibility and do not meet HTA expectations.
Fill in the Blanks
When describing missingness, first quantify extent and pattern, then state the assumed mechanism (e.g., ___) and justify it with observed correlates.
Show Answer & Explanation
Correct Answer: MAR
Explanation: The lesson recommends stating the missing-data assumption explicitly—common options are MAR (missing at random), MCAR, or MNAR. MAR is the correct fill here as an example listed in the guidance.
A concise external-control paragraph should include data source, eligibility alignment, selection or matching methods, key baseline balance metrics, and a candid statement of ___.
Show Answer & Explanation
Correct Answer: limitations
Explanation: Step 2 specifies ending the external-control paragraph with limitations that quantify residual imbalance or missing covariates and indicate potential bias direction; thus 'limitations' completes the checklist structure.
Error Correction
Incorrect: We used 'real-world data' to build an external control but did not specify the database, capture period, or data quality.
Show Correction & Explanation
Correct Sentence: We used the national oncology EHR database (2015–2022; validated mortality linkage, 98% hospitalization capture) as the external control.
Explanation: The guidance warns against vague phrases like 'real-world data' without specification. HTA reviewers need precise source description (database name or type), time frame, and data-quality indicators to assess credibility.
Incorrect: Missingness in performance status was handled by saying it was low and therefore ignored.
Show Correction & Explanation
Correct Sentence: Missingness in performance status was 18% and differed by treatment group; we assumed MAR conditional on age and prior therapy and implemented multiple imputation (m = 20).
Explanation: The lesson requires quantifying missingness, stating the assumed mechanism with justification, and describing the handling method. Simply claiming missingness is 'low' and ignoring it is not acceptable for HTA standards.