Written by Susan Miller*

EQUATOR for Pharmacoepidemiology: RECORD‑PE Template Wording to Elevate RWE Manuscripts

Struggling to turn rigorous RWD analyses into reviewer‑proof prose? In this lesson, you’ll learn to map every high‑risk element of a pharmacoepidemiology study to precise, RECORD‑PE‑aligned wording—covering data provenance, exposure/outcome algorithms, confounding control, linkage, missingness, and robustness. You’ll find focused explanations, adaptable title‑to‑discussion templates, mini‑case examples, and targeted exercises to test your understanding. The result: defensible, journal‑ready language that shortens peer‑review cycles and elevates your RWE manuscript while respecting strict data confidentiality.

Why RECORD‑PE matters and how it relates to STROBE

Pharmacoepidemiology studies often use routinely collected health data (RCD)—such as electronic health records, insurance claims, registries, and prescribing databases—to evaluate how medicines are used and what effects they have in real‑world populations. These data are powerful because they are large, timely, and reflect clinical practice. At the same time, they carry risks: coding choices can change results, linkage errors can bias associations, and missing information can be hard to detect. The EQUATOR Network provides reporting guidelines to improve clarity and reproducibility across health research. STROBE is the foundational guideline for observational studies, but it is not detailed enough for the special challenges of RCD. RECORD (REporting of studies Conducted using Observational Routinely collected health Data) extends STROBE for RCD generally. RECORD‑PE is the pharmacoepidemiology‑specific extension that adds further detail on exposures, outcomes, confounding, and bias in medicine‑focused analyses.

The purpose of RECORD‑PE is straightforward: if your study uses RCD to examine the use or effects of medications, RECORD‑PE tells you exactly what to report so that readers can understand how the data were constructed, how algorithms defined exposures and outcomes, how biases were anticipated and handled, and how reproducible the work is. It complements STROBE by keeping the backbone of observational reporting—design, setting, participants, variables, bias, statistical methods—and adding the pharmacoepidemiology specifics: data provenance and transformations, linkage processes, exposure measurement windows, outcome validation, confounding strategies, and sensitivity analyses. In short, STROBE is the chassis; RECORD‑PE is the instrumentation panel you need when working with RWD and RWE.

Importantly, RECORD‑PE is not just a checklist to satisfy; it is a practical map of what expert reviewers look for. When reviewers cannot find details on coding algorithms, cohort entry rules, or how stockpiling of prescriptions was handled, they lose confidence in the results. By aligning each manuscript section with RECORD‑PE items and using standardized, transparent wording, you reduce back‑and‑forth during peer review, avoid ambiguous phrasing, and make replication possible. This alignment converts documentation from an afterthought into a central evidence‑building step.

Section‑by‑section template wording mapped to high‑impact RECORD‑PE items

Below are targeted wording templates you can adapt to your study. Each subsection notes the reporting intent and the RECORD‑PE focus.

Title and Abstract

Goal: Signal data type, design, setting, and main medication‑outcome focus clearly and early.

  • Template wording for title: “Association between [drug/class] and [outcome] in [population] using [data source type, e.g., claims/EHR/registry]—a [design: cohort/case‑control/self‑controlled] study.”
  • Template wording for abstract background/objective: “We evaluated the association between [drug exposure] and [outcome] using routinely collected [claims/EHR/registry] data, adhering to STROBE and RECORD‑PE reporting.”
  • Template wording for abstract methods: “In a [design] using [named database(s), years], we defined exposure via [codes/dispensing records] and outcomes via [diagnosis/procedure codes/validation algorithm]. We controlled confounding using [propensity score matching/weighting/active comparator/new‑user design].”
  • Template wording for abstract results: “Among [N] eligible [patients/person‑time], the adjusted [measure: HR/OR/IRR] for [outcome] comparing [exposed] with [comparator] was [estimate (95% CI)].”
  • Template wording for abstract conclusions: “Findings should be interpreted considering [misclassification/linkage/missing data] and the observational nature of routinely collected data.”

These elements satisfy RECORD‑PE expectations to identify the use of RCD, specify the type of database, and outline core design decisions that affect validity.

Introduction

Goal: Motivate the pharmacoepidemiologic question and justify RCD as appropriate.

  • Template wording: “We investigated [drug‑outcome] because [clinical/public health rationale]. Routinely collected [claims/EHR/registry] data enable evaluation in large, diverse populations with longitudinal capture of [dispensing/diagnosis/procedures], supporting real‑world evidence generation where randomized trials are limited.”

This framing ties clinical relevance to the strengths and limitations of RCD.

Methods: Data sources and provenance

Goal: Describe origin, capture processes, periods of coverage, update cadence, and transformations.

  • Template wording: “We used the [database name], which includes [population coverage, e.g., commercially insured adults across the US] with [types of data: enrollment, pharmacy dispensing, inpatient/outpatient diagnoses and procedures, laboratory results where available]. Data are collected for administrative/clinical purposes and compiled by [data holder]. The study period was [years]. We applied [version] of the common data model and executed standardized extract‑transform‑load (ETL) procedures documented in [protocol/appendix].”
  • Template wording on access and quality: “Investigators had secure access to de‑identified data; data completeness and consistency checks were performed using [described metrics], with results in [supplement].”

This maps to RECORD‑PE emphasis on describing data provenance and any transformations that could influence variables and outcomes.

Methods: Study design and setting

Goal: Identify the design, observation periods, and eligibility windows.

  • Template wording: “We conducted a [new‑user active‑comparator cohort/case‑control/self‑controlled case series] study. Cohort entry (index date) was defined as the first dispensing/administration of [drug] after at least [X months] of continuous enrollment to capture baseline covariates. Follow‑up began on [index + X] and continued until [outcome, disenrollment, death, treatment discontinuation, or end of study].”

This level of detail helps readers understand time‑at‑risk definitions and aligns with RECORD‑PE’s need for design clarity.
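If it helps to see how such time‑at‑risk wording translates into cohort construction, the sketch below derives each person’s follow‑up end as the earliest of the candidate censoring events. It is a minimal illustration in Python/pandas; the column names (index_date, outcome_date, disenroll_date, death_date) and the study end date are hypothetical placeholders, not part of any specific data model.

```python
import pandas as pd

# Minimal sketch: derive follow-up end as the earliest censoring event.
# Column names and dates are hypothetical placeholders.
cohort = pd.DataFrame({
    "patient_id": [1, 2],
    "index_date": pd.to_datetime(["2020-01-15", "2020-03-01"]),
    "outcome_date": pd.to_datetime(["2020-06-01", pd.NaT]),
    "disenroll_date": pd.to_datetime([pd.NaT, "2021-01-01"]),
    "death_date": pd.to_datetime([pd.NaT, pd.NaT]),
})
STUDY_END = pd.Timestamp("2022-12-31")

# Earliest of outcome, disenrollment, death, or administrative end of study.
cohort["followup_end"] = (
    cohort[["outcome_date", "disenroll_date", "death_date"]]
    .min(axis=1, skipna=True)
    .fillna(STUDY_END)
    .clip(upper=STUDY_END)
)
cohort["person_days"] = (cohort["followup_end"] - cohort["index_date"]).dt.days
print(cohort[["patient_id", "followup_end", "person_days"]])
```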

Methods: Participant selection

Goal: Show inclusion/exclusion criteria and create transparency for generalizability.

  • Template wording: “We included adults aged [range] with [condition] identified by [diagnosis codes/algorithms] and excluded those with [contraindications/prior outcome] in the [look‑back] period. Detailed code lists and algorithm logic are provided in [supplement/repository].”

Such precise wording helps prevent ambiguity in denominator construction.

Methods: Exposure definition and measurement

Goal: Specify coding systems, exposure windows, stockpiling, switching, and adherence assumptions.

  • Template wording: “Exposure to [drug] was ascertained using [NDC/ATC/procedure codes]. Days’ supply was taken from dispensing records; overlapping fills were handled by stockpiling up to [X days]. We defined on‑treatment risk windows as [start/stop rules], with a [grace period] to account for imperfect adherence. Switching to [other agents] was treated as [censoring/time‑varying exposure/new cohort entry], as detailed in [appendix].”

This responds to RECORD‑PE’s requirement to report exposure algorithms and their assumptions.
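To make the stockpiling and grace‑period rules concrete, here is a minimal sketch of how dispensing records might be collapsed into continuous exposure episodes. The 30‑day grace period, 15‑day stockpile cap, and column names (fill_date, days_supply) are illustrative assumptions, not prescribed values.

```python
from datetime import timedelta

import pandas as pd

GRACE_DAYS = 30      # illustrative grace period
STOCKPILE_CAP = 15   # illustrative cap on carried-over supply

def build_episodes(fills: pd.DataFrame) -> list:
    """Collapse one patient's dispensings into continuous exposure episodes."""
    fills = fills.sort_values("fill_date")
    episodes = []
    start = end = None
    for row in fills.itertuples():
        supply = timedelta(days=int(row.days_supply))
        if start is None:
            start, end = row.fill_date, row.fill_date + supply
        elif row.fill_date <= end + timedelta(days=GRACE_DAYS):
            # Carry forward unused days from the previous fill, capped at the stockpile limit.
            leftover = max((end - row.fill_date).days, 0)
            stockpile = timedelta(days=min(leftover, STOCKPILE_CAP))
            end = max(end, row.fill_date + supply + stockpile)
        else:
            # Gap exceeded the grace period: close the episode and start a new one.
            episodes.append((start, end))
            start, end = row.fill_date, row.fill_date + supply
    if start is not None:
        episodes.append((start, end))
    return episodes

# Example: two overlapping 30-day fills, then a refill after a long gap.
fills = pd.DataFrame({
    "fill_date": pd.to_datetime(["2021-01-01", "2021-01-25", "2021-05-01"]),
    "days_supply": [30, 30, 30],
})
print(build_episodes(fills))
```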

Methods: Outcome definition and validation

Goal: Report coding, algorithms, and any validation performance.

  • Template wording: “The primary outcome was [clinical event], identified using [ICD/OPCS/LOINC/CPT] codes in [setting]. Where available, we applied a validated algorithm with reported positive predictive value [PPV] of [X%] in comparable data. Full code lists and validation citations are provided in [supplement].”

Providing validation metrics directly addresses reviewer concerns about outcome misclassification.
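As a hypothetical illustration of what such a metric means: if an outcome algorithm flags 200 potential cases and chart review confirms 176 of them, the PPV is 176/200 = 88%, so roughly one in eight algorithm‑identified cases would be a false positive.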

Methods: Covariates and confounding control

Goal: Describe confounder selection and analytic strategy.

  • Template wording: “We measured baseline covariates during the [look‑back] period, including demographics, comorbidities (via [Charlson/Elixhauser/custom algorithm]), medication history, healthcare utilization, and proxies for disease severity. Confounding was addressed using [propensity score matching/weighting/stratification] with variables prespecified in the protocol. Covariate balance after adjustment was evaluated by standardized differences; values <0.1 indicated acceptable balance.”

This aligns with RECORD‑PE’s focus on transparent confounding control and diagnostics.
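For readers who want to see the diagnostic itself, the sketch below computes a standardized mean difference for one continuous covariate; the 0.1 threshold mentioned above is the conventional benchmark. The data are synthetic and the covariate (baseline age) is only an example.

```python
import numpy as np

def standardized_difference(x_exposed: np.ndarray, x_comparator: np.ndarray) -> float:
    """SMD for a continuous covariate: difference in means over the pooled SD."""
    pooled_sd = np.sqrt((x_exposed.var(ddof=1) + x_comparator.var(ddof=1)) / 2)
    return (x_exposed.mean() - x_comparator.mean()) / pooled_sd

# Synthetic example: baseline age in exposed vs. comparator groups.
rng = np.random.default_rng(0)
age_exposed = rng.normal(62, 10, 5000)
age_comparator = rng.normal(60, 10, 5000)
print(round(standardized_difference(age_exposed, age_comparator), 3))  # ~0.2, above the 0.1 benchmark
```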

Methods: Data linkage

Goal: Explain whether multiple data sources were linked and how linkage quality was assessed.

  • Template wording: “We linked [claims] with [EHR/registry/mortality data] using [deterministic/probabilistic] methods based on [identifiers]. Linkage quality was evaluated by [match rates, clerical review, or sensitivity analyses restricting to high‑confidence links], as detailed in [supplement].”

Since linkage can produce selection bias and misclassification, RECORD‑PE calls for explicit description of the process and evaluation.
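As a rough illustration of what deterministic linkage and a simple match‑rate check might look like, consider the sketch below. The identifier fields, hashing scheme, and data sources are hypothetical; real linkage is performed under the data holders’ governance and typically involves far more elaborate quality evaluation.

```python
import hashlib

import pandas as pd

def pseudonymize(value: str) -> str:
    """Hash an identifier so linkage uses a pseudonymized key."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()

# Hypothetical source tables.
claims = pd.DataFrame({"member_id": ["A1", "A2", "A3"], "drug": ["x", "y", "z"]})
mortality = pd.DataFrame({"member_id": ["A1", "A3"],
                          "death_date": ["2022-05-01", "2023-01-10"]})

for df in (claims, mortality):
    df["link_key"] = df["member_id"].map(pseudonymize)

# Deterministic linkage on the hashed key.
linked = claims.merge(mortality[["link_key", "death_date"]], on="link_key", how="left")

# Proportion of claims records that linked to a mortality record.
match_rate = linked["death_date"].notna().mean()
print(f"Linked: {match_rate:.0%} of claims records")
```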

Methods: Missing data

Goal: Clarify which variables had missingness and how it was handled.

  • Template wording: “We quantified missingness for [variables]. Where appropriate, we used [multiple imputation by chained equations/complete case analysis/indicator methods], assuming [missing at random/other], and included [list of variables] in the imputation model to preserve associations.”

Providing the assumed mechanism and technique satisfies transparency requirements.
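A minimal sketch of a chained‑equations‑style imputation, assuming missingness at random, is shown below using scikit‑learn’s IterativeImputer. The variables and values are synthetic; in practice the imputation model should include the exposure and outcome to preserve associations, and several imputed datasets are combined with Rubin’s rules.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer

# Synthetic covariate matrix: age, sex indicator, BMI (with missing values).
X = np.array([
    [62.0, 1.0, 27.5],
    [55.0, 0.0, np.nan],   # missing BMI
    [np.nan, 1.0, 31.2],   # missing age
    [70.0, 0.0, 24.8],
])

# Chained-equations-style imputation; repeat with different seeds
# to obtain multiple imputed datasets.
imputer = IterativeImputer(sample_posterior=True, random_state=0)
X_imputed = imputer.fit_transform(X)
print(np.round(X_imputed, 1))
```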

Methods: Statistical analysis

Goal: Present primary and secondary analyses, model specifications, and robustness checks.

  • Template wording: “Primary analyses estimated [HR/OR/IRR] using [Cox/logistic/Poisson/negative binomial/marginal structural models], with [time scale]. We accounted for [clustering/repeated measures] via [robust variance/random effects]. Secondary analyses included [dose‑response, subgroup, or interaction tests]. Sensitivity analyses evaluated [alternative outcome definitions, exposure windows, grace periods, washout, competing risks] and [quantitative bias analyses or E‑values] to appraise unmeasured confounding.”

This meets RECORD‑PE expectations for analytic clarity and planned robustness evaluations.
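One of the robustness tools mentioned above, the E‑value, has a simple closed form for a risk ratio (VanderWeele and Ding, 2017): E = RR + sqrt(RR × (RR − 1)) for RR ≥ 1, with estimates below 1 first inverted. A small sketch:

```python
import math

def e_value(rr: float) -> float:
    """Minimum confounder strength needed to fully explain away a risk ratio."""
    rr = rr if rr >= 1 else 1 / rr  # work on the >= 1 scale
    return rr + math.sqrt(rr * (rr - 1))

print(round(e_value(1.8), 2))  # an observed RR of 1.8 gives an E-value of about 3.0
```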

Results: Participant flow and characteristics

Goal: Show how the cohort was formed and whether balance was achieved.

  • Template wording: “From [N] eligible individuals, [N] entered the cohort after applying inclusion/exclusion criteria (Figure X). Baseline characteristics are shown [before and after adjustment], with standardized differences indicating [adequate] balance.”

Results: Main outcomes and estimates

Goal: Present effect measures with precision and context.

  • Template wording: “During [person‑time], we observed [events] of [outcome]. The adjusted [HR/OR/IRR] comparing [exposed] to [comparator] was [estimate (95% CI)]. Kaplan–Meier/cumulative incidence curves illustrate time‑to‑event differences (Figure Y).”

Results: Sensitivity, subgroup, and ancillary analyses

Goal: Demonstrate robustness and explore heterogeneity while avoiding data dredging.

  • Template wording: “Results were consistent across sensitivity analyses varying [exposure/outcome definitions] and [modeling assumptions]. Prespecified subgroup analyses by [age/sex/comorbidity] showed [pattern], with interaction p‑values of [values].”

Discussion: Interpretation with bias considerations

Goal: Integrate findings with mechanisms, prior literature, and bias assessment.

  • Template wording: “Our findings suggest [direction/size] of association between [drug] and [outcome] in routine care. Strengths include large sample size, new‑user active‑comparator design, and validated outcome algorithms. Limitations include potential residual confounding (e.g., [disease severity/health behaviors]), exposure misclassification from imperfect adherence, and incomplete capture of [outcome] outside [settings]. Sensitivity analyses and quantitative bias assessment indicate that an unmeasured confounder would need [strength] to fully explain the association. Results align/contrast with prior studies [citations], potentially due to differences in [population, coding, risk window].”

This paragraph meets RECORD‑PE’s call to discuss bias sources tied to RCD, not just generic limitations.

Discussion: Generalizability and applicability

Goal: Clarify for whom the results apply and where caution is needed.

  • Template wording: “Because the database captures [insured/region‑specific] populations, generalizability may be limited for [uninsured/other regions]. However, similar prescribing patterns and coding systems suggest applicability to [related settings].”

Data, code, and registration statements

Goal: Enhance reproducibility and transparency.

  • Template wording: “Protocol, analysis plan, and code lists are available at [persistent repository/DOI]. Analytic code to construct cohorts and reproduce models is deposited at [URL], with instructions to run in [software version]. Data are available from [data holder] under license; qualified researchers can request access per [link]. The study was prospectively registered at [registry, ID] and deviations from the registered plan are documented in [supplement].”

These statements directly satisfy RECORD‑PE’s emphasis on data provenance and reproducibility, while acknowledging licensing constraints typical of commercial or clinical RCD.

Applying template wording to a mini‑case: how to substitute effectively

When adapting templates, focus on four substitution layers that map to high‑impact RECORD‑PE items:

  • Data provenance substitutions: Replace generic database descriptors with precise names, coverage, and governance. Swap in the exact coding systems and ETL details used. If multiple sources are linked, include the linkage method and quality metrics. This ensures readers can judge capture and completeness.
  • Algorithmic substitutions: Specify exposure windows (initiation, grace periods, stockpiling) and outcome algorithms (codes, settings, validation PPV). State how switching, augmentation, or discontinuation is handled. These choices are common reviewer pain points; precise language reduces ambiguity.
  • Confounding and design substitutions: Identify the comparator choice (active comparator versus nonuser), justify the new‑user design, and list the covariates used in propensity modeling or other control methods. Provide balance diagnostics and articulate time‑at‑risk. This aligns strongly with RECORD‑PE guidance on bias control.
  • Robustness substitutions: Insert the exact sensitivity analyses you performed—alternative exposure/outcome definitions, different grace periods, and negative control outcomes or exposures. Include any quantitative bias analyses. This anticipates reviewer requests for proof that results are not an artifact of specific assumptions.

Use consistent terminology across sections. If you call your index date the “first qualifying dispensing” in Methods, keep the same wording in Results and Discussion. If you define a 30‑day grace period, repeat “30‑day grace period” rather than “about a month.” Consistency is a subtle but powerful signal of rigor and helps readers connect definitions to estimates.

Quick self‑check and exportable checklist for alignment

Before submission, run a streamlined self‑check aligned with RECORD‑PE:

  • Title/Abstract: Does the title name the data type and design? Does the abstract state adherence to STROBE and RECORD‑PE, name the database(s), design, primary exposure/outcome definitions, and confounding strategy?
  • Data provenance: Are the data sources named with coverage periods, content domains, update frequency, ETL steps, and access conditions? If linked, is the linkage method and quality described?
  • Algorithms: Are full exposure and outcome code lists available in a public or accessible repository? Are validation metrics cited or provided? Are stockpiling, grace periods, and switching rules explicit?
  • Cohort construction: Are eligibility criteria, look‑back periods, and censoring rules clear? Is time‑at‑risk defined? Is a flow diagram provided?
  • Confounding: Are confounders justified and prespecified? Are diagnostics (e.g., standardized differences) reported? Is the comparator choice justified?
  • Missing data: Are missingness patterns reported and handling methods justified with assumptions?
  • Analysis: Are model types, time scales, variance estimators, and clustering methods specified? Are sensitivity and subgroup analyses prespecified and reported?
  • Results alignment: Do tables/figures reflect the definitions in Methods? Are effect sizes presented with confidence intervals and event counts?
  • Bias discussion: Are key biases—misclassification, confounding, selection, immortal time, and linkage error—explicitly evaluated? Are robustness checks interpreted in relation to these biases?
  • Reproducibility: Are protocol, code, and code lists shared or described with access routes? Is registration stated? Are deviations explained?

To make this check actionable, many teams maintain an exportable mapping table: one column for RECORD‑PE items, one for the exact manuscript sentence(s) that address each item, and one for the location (section/line/table). This table becomes both an internal quality control tool and, if requested, a supplementary file for reviewers. It ensures every high‑risk element—data source description, exposure/outcome algorithms, confounding control, data linkage, missing data, and sensitivity analyses—appears clearly and consistently.
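A minimal sketch of such a mapping table, written out as a CSV, might look like the following; the item labels, example sentences, and file name are placeholders to adapt to your own checklist workflow.

```python
import csv

# Illustrative rows: one per RECORD-PE item, with the manuscript sentence and its location.
rows = [
    {"record_pe_item": "Data source and provenance",
     "manuscript_text": "We used the [database name], covering [population], [years].",
     "location": "Methods, Data sources, paragraph 1"},
    {"record_pe_item": "Exposure algorithm",
     "manuscript_text": "Exposure was ascertained from NDC dispensing records with a 30-day grace period.",
     "location": "Methods, Exposure definition, paragraph 1"},
]

with open("record_pe_mapping.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=["record_pe_item", "manuscript_text", "location"])
    writer.writeheader()
    writer.writerows(rows)
```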

By treating RECORD‑PE as both a guidance framework and a library of template sentences, authors can move from abstract compliance to concrete clarity. The result is a manuscript that communicates design choices, documents algorithms, and anticipates reviewer questions, ultimately accelerating peer review and strengthening the credibility of real‑world evidence. With disciplined use of standardized wording across Title/Abstract, Methods, Results, Discussion, and the Data/Code/Registration statements, your paper will demonstrate transparency, reproducibility, and thoughtful bias control—the central goals of EQUATOR‑aligned reporting in pharmacoepidemiology.

  • RECORD‑PE extends STROBE for pharmacoepidemiology using routinely collected data, requiring transparent reporting of data provenance, algorithms, and bias handling.
  • Methods must precisely define exposure and outcome algorithms (codes, windows, stockpiling/grace periods, switching rules) and report any validation metrics (e.g., PPV).
  • Clearly describe study design, cohort construction, confounding control (e.g., propensity scores with balance diagnostics), linkage processes, missing data handling, and planned sensitivity analyses.
  • Ensure reproducibility and clarity by naming databases, maintaining consistent terminology across sections, sharing code lists/analytic code, and stating registration and access conditions.

Example Sentences

  • We conducted a new-user active-comparator cohort study using US commercial claims (2015–2023) and reported methods in accordance with STROBE and RECORD‑PE.
  • Exposure to GLP‑1 receptor agonists was defined from NDC dispensing records with a 30‑day grace period and stockpiling up to 15 days to account for overlapping fills.
  • The primary outcome—hospitalization for acute pancreatitis—was identified via ICD‑10 codes in inpatient claims using a validated algorithm with a reported PPV of 88%.
  • Confounding was addressed through propensity score weighting based on demographics, Elixhauser comorbidities, prior medication use, and healthcare utilization during a 12‑month look‑back.
  • We linked claims to a state mortality registry using deterministic matching on encrypted identifiers and evaluated linkage quality via match rates and sensitivity analyses restricted to high‑confidence pairs.

Example Dialogue

Alex: I’m drafting the abstract—do I really need to name the database and say we followed RECORD‑PE?

Ben: Yes. RECORD‑PE expects you to flag the RCD type, the design, and the confounding strategy right up front.

Alex: Okay, so I’ll write: “A new‑user active‑comparator cohort in Optum Clinformatics, 2016–2022, with exposure from NDCs and outcomes from validated ICD‑10 algorithms; confounding controlled by propensity score weighting.”

Ben: Perfect, and don’t forget to define the time‑at‑risk and the 30‑day grace period in Methods.

Alex: Got it—plus we’ll cite the pancreatitis algorithm’s PPV and share code lists in the repository.

Ben: Exactly; that alignment with RECORD‑PE will make reviewers trust the RWE.

Exercises

Multiple Choice

1. Which statement best captures how RECORD‑PE relates to STROBE in pharmacoepidemiology studies using routinely collected data (RCD)?

  • RECORD‑PE replaces STROBE for all observational studies.
  • RECORD‑PE is a separate guideline for randomized trials only.
  • RECORD‑PE extends STROBE with medication‑specific details like exposure algorithms, outcome validation, and confounding strategies.
  • RECORD‑PE is only a data‑sharing policy, not a reporting guideline.

Correct Answer: RECORD‑PE extends STROBE with medication‑specific details like exposure algorithms, outcome validation, and confounding strategies.

Explanation: STROBE provides the backbone for observational reporting; RECORD adds RCD details, and RECORD‑PE further specifies pharmacoepidemiology elements such as exposure measurement windows, outcome validation, and confounding control.

2. A reviewer cannot find details on stockpiling rules for overlapping prescriptions. According to RECORD‑PE, where should authors make this explicit to maintain transparency and reproducibility?

  • In the Discussion only, as a limitation.
  • In Methods under Exposure definition and measurement, with exact rules (e.g., stockpiling up to 15 days).
  • In the Title, immediately after the study design.
  • Nowhere; stockpiling details are optional.

Correct Answer: In Methods under Exposure definition and measurement, with exact rules (e.g., stockpiling up to 15 days).

Explanation: RECORD‑PE asks for precise exposure algorithms, including stockpiling, grace periods, and switching rules, reported in the Methods so readers can reproduce the cohort construction.

Fill in the Blanks

We conducted a ___ active‑comparator cohort using US claims (2016–2022), adhering to STROBE and RECORD‑PE.


Correct Answer: new‑user

Explanation: RECORD‑PE encourages clear design labeling. “New‑user active‑comparator cohort” specifies incident exposure and a comparator strategy, both central to confounding control.

The primary outcome was identified by inpatient ICD‑10 codes using a validated algorithm with a reported ___ of 88%.


Correct Answer: positive predictive value (PPV)

Explanation: RECORD‑PE recommends reporting outcome validation metrics; PPV quantifies how often algorithm‑identified cases are true positives.

Error Correction

Incorrect: We used an EHR database but did not specify coding systems because RECORD‑PE focuses only on results.


Correct Sentence: We used an EHR database and specified the coding systems (e.g., ICD‑10, CPT) used for outcomes and procedures, as required by RECORD‑PE.

Explanation: RECORD‑PE emphasizes transparent algorithms and code systems for exposures and outcomes; it is not limited to results reporting.

Incorrect: Missing data were present, but we assumed they did not matter and did not report the mechanism or handling.


Correct Sentence: We quantified missingness, stated the assumed mechanism (missing at random), and handled it using multiple imputation with prespecified variables.

Explanation: RECORD‑PE calls for reporting which variables have missingness, the assumed mechanism, and the chosen handling method to preserve validity and reproducibility.