Results Summary
What was the project about?
This project aimed to improve the methods that researchers use to compare how treatments affect different patients. When researchers use data from patients’ health records to compare treatments, it’s often hard to know whether changes in a patient’s health are from the treatment or something else. Factors other than the treatment may affect the patient’s health, including
- A patient’s traits, such as age, gender, or other health problems
- Group-level factors, such as where patients get care or where they live
To address this problem, researchers rely on statistical methods. Existing methods use data from patients who have similar traits but received different treatments. But they may not work well if some group-level factors affect both the treatment and patients’ health. In this study, the research team created two new ways of including group-level factors in the methods they use to find similar patients.
What did the research team do?
The research team made test data to look like patient health records. The test data included information on patient traits, group-level factors, and treatments. The team picked one group-level factor, such as hospital size, that can affect treatment results. Then the team created two new ways of including the group-level factor in the methods they use to find similar patients. To see which ways worked best, the team compared findings from the two new ways to those from an existing way of including group-level factors. Finally, the team compared findings from all three ways of including the group-level factor with findings from not including the group-level factor.
One parent, one clinician, and one researcher helped design the study.
What were the results?
Including group-level factors in the analysis produced more accurate results than not including them. Among the three ways of including group-level factors, the existing way and one of the new ways provided more accurate findings about how treatments work.
What were the limits of the project?
The test data set had only one group-level factor. Results might have been different with real health record data with more than one group-level factor.
Future research could create and test ways of including more than one group-level factor.
How can people use the results?
Researchers can use methods that include group-level factors to get more accurate findings about how treatments work.
Professional Abstract
Background
Researchers can use secondary data sources such as electronic health records (EHRs) or registries to study a representative sample of patients over a long period of time. However, studies that rely on secondary data are at increased risk of bias from confounding variables. For example, a patient’s prognostic factors or a physician’s experience can affect both the choice of treatment and the outcome; if treatment and comparison groups are differentially affected by these variables, the findings may be biased.
Propensity score (PS) methods can address this potential source of bias. Applying PS methods involves two steps. First, researchers identify a set of variables to calculate the PS, or the probability that a patient with certain characteristics will be assigned to the treatment group. Second, researchers estimate treatment effects using a subsample of patients who are matched using the PS across treatment and comparison groups.
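The two steps above can be illustrated with a minimal sketch. It is not the research team's implementation. It assumes a single categorical covariate `x`, estimates the PS as the share of treated patients within each covariate value (real analyses typically fit a model such as logistic regression), and uses greedy 1:1 nearest-neighbor matching with a caliper. The record fields, the caliper value, and all function names are hypothetical.

```python
from collections import defaultdict

def estimate_ps(records):
    """Step 1 (toy version): PS = share of treated patients among
    patients who have the same covariate value x."""
    treated = defaultdict(int)
    total = defaultdict(int)
    for r in records:
        total[r["x"]] += 1
        treated[r["x"]] += r["treated"]
    return [treated[r["x"]] / total[r["x"]] for r in records]

def greedy_match(ps, treated_flags, caliper=0.1):
    """Step 2a: 1:1 nearest-neighbor matching on the PS, without replacement.
    A treated patient stays unmatched if no control is within the caliper."""
    controls = [i for i, t in enumerate(treated_flags) if not t]
    pairs = []
    for i, t in enumerate(treated_flags):
        if not t or not controls:
            continue
        j = min(controls, key=lambda c: abs(ps[c] - ps[i]))
        if abs(ps[j] - ps[i]) <= caliper:
            pairs.append((i, j))
            controls.remove(j)
    return pairs

def matched_effect(records, pairs):
    """Step 2b: mean outcome difference (treated minus control) over matched pairs."""
    diffs = [records[i]["y"] - records[j]["y"] for i, j in pairs]
    return sum(diffs) / len(diffs)

# Hypothetical toy data: x is a patient trait, y is the outcome.
records = [
    {"x": "old", "treated": 1, "y": 5.0},
    {"x": "old", "treated": 1, "y": 6.0},
    {"x": "old", "treated": 0, "y": 4.0},
    {"x": "young", "treated": 1, "y": 3.0},
    {"x": "young", "treated": 0, "y": 2.0},
    {"x": "young", "treated": 0, "y": 1.0},
]
ps = estimate_ps(records)
pairs = greedy_match(ps, [r["treated"] for r in records])
effect = matched_effect(records, pairs)
```

In this toy data, one treated patient has no control within the caliper and is dropped, which is why matched sample size matters when comparing matching methods.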
Data from EHRs or registries generally include some hierarchical, or clustered, structures. For example, patient data may reflect clustering by hospitals or geographical area. Ignoring cluster-level confounders such as hospital characteristics in PS matching could bias the results. Existing PS methods do not adequately address the hierarchical structure of secondary data.
Objective
To develop PS matching methods for hierarchical data
Study Design
Design Elements | Description |
---|---|
Design | Simulation studies, empirical analyses |
Data Sources and Data Sets | Simulations based on real-world comparative effectiveness research (CER) studies, data from two existing CER studies |
Analytic Approach | PS matching (across-cluster, within-cluster, stratified, and hierarchical) |
Outcomes | Bias, RMSE, and matched sample size |
This abstract summarizes selected findings from this project.
Methods
The research team simulated data to mimic the characteristics of real-world hierarchical data. They considered four PS matching methods:
- Across-cluster matching, which ignored the hierarchical data structure
- Within-cluster matching, which accounted for the hierarchical data structure
- Stratified matching, which stratified data by a cluster-level variable, such as hospital size, and then matched patients across clusters within strata to ensure closer matches
- Hierarchical matching, which grouped clusters with similar cluster-level variables within each stratum and matched patients across clusters within each group
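As a rough sketch, the first three methods in the list above differ mainly in which pool of patients a treated patient may be matched within. The code below assumes that propensity scores are already estimated and uses greedy 1:1 nearest-neighbor matching with a caliper. The field names, the caliper value, and the helpers `match_pool` and `match_by` are hypothetical. Hierarchical matching is left out because it also needs a rule for grouping similar clusters, which this abstract does not specify.

```python
from collections import defaultdict

def match_pool(pool, caliper):
    """Greedy 1:1 nearest-neighbor PS matching inside one pool of patients."""
    controls = [p for p in pool if not p["treated"]]
    pairs = []
    for t in (p for p in pool if p["treated"]):
        if not controls:
            break
        c = min(controls, key=lambda p: abs(p["ps"] - t["ps"]))
        if abs(c["ps"] - t["ps"]) <= caliper:
            pairs.append((t["id"], c["id"]))
            controls.remove(c)
    return pairs

def match_by(patients, key, caliper=0.05):
    """Split patients into pools using key(patient), then match within each pool."""
    pools = defaultdict(list)
    for p in patients:
        pools[key(p)].append(p)
    pairs = []
    for pool in pools.values():
        pairs.extend(match_pool(pool, caliper))
    return pairs

# Hypothetical patients from four hospitals; "size" is the cluster-level variable.
patients = [
    {"id": 1, "hospital": "A", "size": "small", "treated": 1, "ps": 0.60},
    {"id": 2, "hospital": "A", "size": "small", "treated": 0, "ps": 0.30},
    {"id": 3, "hospital": "B", "size": "small", "treated": 0, "ps": 0.62},
    {"id": 4, "hospital": "C", "size": "large", "treated": 1, "ps": 0.40},
    {"id": 5, "hospital": "C", "size": "large", "treated": 0, "ps": 0.41},
    {"id": 6, "hospital": "D", "size": "large", "treated": 0, "ps": 0.59},
]

across = match_by(patients, key=lambda p: "all")          # one pool, ignores clusters
within = match_by(patients, key=lambda p: p["hospital"])  # same hospital only
stratified = match_by(patients, key=lambda p: p["size"])  # same hospital-size stratum
```

In this toy data, across-cluster matching pairs patient 1 (small hospital) with patient 6 (large hospital), within-cluster matching finds only one pair, and stratified matching finds two pairs while keeping each pair inside the same hospital-size stratum.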
To assess the performance of each method, the research team examined bias and the root mean squared error (RMSE) of the estimated treatment effects. The team also compared matched sample sizes to determine if the matching methods would produce sufficiently large matched samples for PS analysis.
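For reference, bias and RMSE in a simulation are usually computed from the estimates obtained across simulated data sets and the known true effect. The sketch below uses the standard definitions. The function names and numbers are hypothetical, and reporting bias as a percentage of the true effect is an assumption about how a figure such as "less than 8%" might be expressed.

```python
import math

def bias(estimates, true_effect):
    """Mean estimate minus the true effect."""
    return sum(estimates) / len(estimates) - true_effect

def relative_bias_pct(estimates, true_effect):
    """Bias as a percentage of the true effect (assumed reporting convention)."""
    return 100.0 * bias(estimates, true_effect) / true_effect

def rmse(estimates, true_effect):
    """Root mean squared error of the estimates around the true effect."""
    mse = sum((e - true_effect) ** 2 for e in estimates) / len(estimates)
    return math.sqrt(mse)

# Hypothetical estimates from two simulated data sets; the true effect is 2.0.
estimates = [1.0, 3.0]
b = bias(estimates, 2.0)
r = rmse(estimates, 2.0)
```

Here the two estimates miss the truth by the same amount in opposite directions, so the bias is zero while the RMSE is not. This is why the two measures are reported separately.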
A patient and caregiver advocate, a clinician researcher, and a research foundation representative helped design and conduct the study.
Results
Compared with across-cluster matching, PS matching methods accounting for hierarchical data structure performed better on all outcomes, including
- Bias. Among the four methods, within-cluster matching yielded the least bias (less than 8%). Hierarchical matching yielded a smaller bias than across-cluster matching and stratified matching.
- RMSE. RMSEs were smallest with hierarchical matching.
- Matched sample size. Hierarchical matching yielded 76% more matched pairs than within-cluster matching.
Limitations
The methods addressed confounding issues for only a single cluster-level variable and may not apply to all hierarchical data structures. Also, the study looked at only binary and continuous outcomes and did not consider survival or bivariate outcome types.
Conclusions and Relevance
Accounting for hierarchical data structures can help build PS models that appropriately adjust for confounding bias in studies that use secondary data sources.
Future Research Needs
Future research could develop methods to accommodate multiple cluster-level variables. Researchers could also explore how cluster sizes affect the feasibility of these alternative matching methods.
Final Research Report
View this project's final research report.
Peer-Review Summary
Peer review of PCORI-funded research helps make sure the report presents complete, balanced, and useful information about the research. It also assesses how the project addressed PCORI’s Methodology Standards. During peer review, experts read a draft report of the research and provide comments about the report. These experts may include a scientist focused on the research topic, a specialist in research methods, a patient or caregiver, and a healthcare professional. These reviewers cannot have conflicts of interest with the study.
The peer reviewers point out where the draft report may need revision. For example, they may suggest ways to improve descriptions of the conduct of the study or to clarify the connection between results and conclusions. Sometimes, awardees revise their draft reports twice or more to address all of the reviewers’ comments.
Peer reviewers commented and the researchers made changes or provided responses. Those comments and responses included the following:
- The reviewers asked for a statistical description of the overlap of estimated propensity scores within each cluster group. The researchers added figures showing the distribution of estimated propensity scores to illustrate the overlaps in the simulations and data examples.
- The reviewers asked that the limitations sections in the abstract and discussion sections be expanded to include some of the statistical limitations they had described. The researchers did so, and added a description of the usefulness of also examining full matching to the limitations section.
- The reviewers asked why particular previously published methods, including preferential matching, were not used in simulations. The researchers explained that one previously published method was not used in simulations here because of differences in the data types used. Regarding preferential matching, the researchers added text to the background section explaining how preferential matching could be integrated with the hierarchical-matching method that performed best in this study. The researchers said that in future work they will compare preferential matching with preferential implementation of the hierarchical-matching method they used in this report.
Project Information
^Mi-Ok Kim, PhD, was affiliated with Cincinnati Children's Hospital Medical Center when the project was initially awarded.