Results Summary
What was the research about?
Comparative effectiveness research compares two or more treatments to see which one works best for which patients. Information from health insurance claims could be useful for this type of research. These claims include data on how well patients respond to treatments. But many things—not just treatments—affect whether patients’ health improves.
How well patients respond to treatments could depend on patients’ ages or medicines they take. It could also depend on how many health problems a patient has and how severe the problems are. Also, a doctor may suggest one treatment instead of another because of a patient’s personal situation and health. Researchers need ways to figure out whether changes in patients’ health result from treatment or something else.
Comparing treatments is hard in small studies with only a few patients. When there are few patients in a study, researchers can study only a few events. An event is an outcome related to the health problem or treatment researchers are studying. When there are few events and many things that could affect treatment results, it is hard to figure out what causes changes in patients’ health. To address this problem, researchers use different statistical methods to account for all the things that could affect treatment results. But researchers don’t know which methods might work best in studies with few events. In this study, the research team compared several methods to see which ones worked best.
What were the results?
The research team found that certain statistical methods worked better than others to account for all the things that could affect treatment results in studies with few events.
What did the research team do?
The research team wanted to see which statistical methods worked best to account for things that could affect treatment results. To do this, the team made a test set of health insurance claims using real patient data. The team made sure the set had only a few events and many things other than treatment that could affect the results. The team also made sure the test set had information on what happened after each patient got treatment. The test set made it possible to see which methods worked best.
During the study, patients gave input to the research team about the issues that are important to them in research that uses health insurance claims.
What were the limits of the study?
This study compared different statistical methods using data created by the research team. Studies using different data may have different results. Also, the results may not apply to all types of data.
How can people use the results?
This study can help researchers understand which statistical methods to use when doing a study with few events and many things that could affect treatment results. Knowing which methods work best can help researchers use health insurance claims to get information that patients can use to choose between treatment options.
Professional Abstract
Objective
To evaluate and improve analytic approaches for variable selection and treatment-effect estimation in nonrandomized studies with few outcome events and many confounders
Study Design
Design Element | Description |
---|---|
Design | 2 simulation studies |
Data Sources and Data Sets | Simulations based on 3 previously published pharmacoepidemiologic cohorts |
Analytic Approach | Simulations based on the plasmode framework were used to examine variable-selection approaches (hdPS and regularized regression) and PS-based estimators of the treatment effect |
Outcomes | Bias, mean squared error, and precision |
Nonrandomized studies are essential for identifying promising new treatments early, studying rare diseases, and comparing treatment effects in population subgroups often excluded from randomized trials (e.g., children and older adults). Nonrandomized studies may have few outcome events and numerous confounding variables (i.e., variables associated with both treatment and outcomes). Such studies present significant challenges to drawing causal inferences.
Researchers often use propensity-score (PS) models to control for many measured confounders in estimating the causal effects of treatment. The PS is a patient's estimated probability of receiving the treatment given the patient's measured characteristics. Applying PS models involves two steps. First, researchers identify a set of variables or confounders and use them to calculate the PS for each patient. Then, researchers use the PS to estimate treatment effects. Researchers typically apply PS methods in analyzing data with many confounders, but PS methods can be unstable when there are few outcome events. Few studies have explored which PS approaches offer the greatest control for confounding in such scenarios.
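The two steps can be illustrated with a minimal sketch. It is not the research team's code: the function names, the Newton-Raphson logistic fit, and the choice of PS stratification into quintiles as the second step are all assumptions made for illustration.

```python
import numpy as np

def fit_propensity(X, z, n_iter=25):
    """Step 1: logistic regression of treatment z on covariates X (Newton-Raphson)."""
    Xd = np.column_stack([np.ones(len(X)), X])      # add an intercept column
    beta = np.zeros(Xd.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xd @ beta))
        w = p * (1.0 - p)
        grad = Xd.T @ (z - p)
        # Tiny ridge term keeps the Hessian invertible (an illustrative safeguard).
        hess = Xd.T @ (Xd * w[:, None]) + 1e-8 * np.eye(Xd.shape[1])
        beta = beta + np.linalg.solve(hess, grad)
    return 1.0 / (1.0 + np.exp(-Xd @ beta))         # estimated PS per patient

def stratified_risk_difference(ps, z, y, n_strata=5):
    """Step 2: size-weighted average of within-stratum risk differences."""
    edges = np.quantile(ps, np.linspace(0, 1, n_strata + 1))
    strata = np.clip(np.searchsorted(edges, ps, side="right") - 1, 0, n_strata - 1)
    total, weight = 0.0, 0
    for s in range(n_strata):
        in_s = strata == s
        treated, control = in_s & (z == 1), in_s & (z == 0)
        if treated.any() and control.any():         # skip strata lacking one group
            total += in_s.sum() * (y[treated].mean() - y[control].mean())
            weight += in_s.sum()
    return total / weight
```

With few outcome events, some strata may contain no events at all, which is one way the instability described above can show up in practice.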
Researchers conducted two simulation studies evaluating PS models that have been proposed in the literature. Researchers based the simulations on three previously published cohort datasets using the plasmode framework. The plasmode framework creates realistic simulated datasets that mimic traits found in real nonrandomized cohort studies based on large healthcare datasets.
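A rough sketch of the plasmode idea follows, under stated assumptions. The function `plasmode_dataset` and its parameters are hypothetical; in particular, `coef` stands in for outcome-model coefficients that would be estimated from the real cohort, and the intercept calibration by bisection is an illustrative choice rather than the study's procedure.

```python
import numpy as np

def plasmode_dataset(X, z, coef, true_log_or, event_rate, size, rng):
    """Resample real (X, z) rows and simulate outcomes with a known treatment effect."""
    # Bootstrap rows so covariate-treatment relationships stay as observed.
    idx = rng.integers(0, len(z), size=size)
    Xs, zs = X[idx], z[idx]
    lin = Xs @ coef + true_log_or * zs              # investigator-set true effect
    # Bisection on the intercept so the expected event rate hits the target.
    lo, hi = -20.0, 20.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if (1.0 / (1.0 + np.exp(-(mid + lin)))).mean() < event_rate:
            lo = mid
        else:
            hi = mid
    intercept = (lo + hi) / 2.0
    p = 1.0 / (1.0 + np.exp(-(intercept + lin)))
    ys = rng.binomial(1, p)                         # simulated outcome events
    return Xs, zs, ys
```

Because the true effect (`true_log_or`) is chosen by the investigator, estimates from any analytic method can be compared directly with the known truth, while a low `event_rate` reproduces the few-events setting.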
One simulation compared the high-dimensional propensity score (hdPS) algorithm with regularized regression approaches, such as ridge regression and lasso regression. The hdPS algorithm prioritizes a subset of potential confounders to include in the PS model. In contrast, regularized regression approaches adjust for all potential confounders when modeling the outcomes.
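The prioritization step of hdPS can be sketched roughly as follows. This is a simplified illustration that assumes binary covariates and ranks them by the absolute log of the Bross bias multiplier; the function name and the 0.5 continuity correction (added so that sparse event counts do not cause division by zero) are assumptions, and the other steps of the full algorithm are omitted.

```python
import numpy as np

def bross_prioritize(C, z, y, k):
    """Return indices of the k binary covariates with the largest |log(bias multiplier)|."""
    C = np.asarray(C, dtype=float)
    z = np.asarray(z)
    y = np.asarray(y, dtype=float)
    p_c1 = C[z == 1].mean(axis=0)                   # covariate prevalence, treated
    p_c0 = C[z == 0].mean(axis=0)                   # covariate prevalence, untreated
    n_with = C.sum(axis=0)                          # patients with each covariate
    events_with = C.T @ y                           # events among those patients
    # Covariate-outcome relative risk with an assumed continuity correction.
    risk_with = (events_with + 0.5) / (n_with + 1.0)
    risk_without = (y.sum() - events_with + 0.5) / (len(y) - n_with + 1.0)
    rr = risk_with / risk_without
    rr = np.where(rr < 1.0, 1.0 / rr, rr)           # use 1/RR for protective covariates
    bias = (p_c1 * (rr - 1.0) + 1.0) / (p_c0 * (rr - 1.0) + 1.0)
    return np.argsort(-np.abs(np.log(bias)))[:k]
```

The selected columns would then enter an ordinary PS model, whereas a regularized outcome regression would keep every column and shrink its coefficients.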
The other simulation compared a variety of PS-based estimators of the treatment effect across different conditions. These conditions included whether treatment effects were heterogeneous, which means a treatment’s effect differed for different patients, or homogeneous, which means a treatment’s effect was the same across patients.
Researchers used bias and mean squared error of the estimated effects to assess performance.
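As a small illustration of these metrics, the sketch below summarizes a set of simulated estimates against a known true effect. The function name `performance` is hypothetical; the variance uses the population form so that mean squared error equals squared bias plus variance exactly.

```python
import numpy as np

def performance(estimates, truth):
    """Summarize estimates from repeated simulations against the known true effect."""
    est = np.asarray(estimates, dtype=float)
    bias = est.mean() - truth                       # average error
    variance = est.var()                            # spread of estimates (ddof=0)
    mse = np.mean((est - truth) ** 2)               # equals bias**2 + variance
    return {"bias": bias, "variance": variance, "mse": mse}
```

A method can have low bias yet high mean squared error if its estimates vary widely across simulated datasets, which is a particular concern when outcome events are rare.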
Patient representatives provided input during the study about issues related to nonrandomized research that were important to them.
Results
In the first simulation, the hdPS variable-selection algorithm generally performed better than regularized-regression approaches across conditions. However, using lasso regression for variable selection in a regular PS model also performed well.
In the second simulation, regression adjustment for the PS using one nonlinear spline (a method allowing for nonlinear associations among confounders, PS, and outcomes) and matching weights provided lower bias and mean squared error in the context of rare outcomes. Regression adjustment for the PS using one nonlinear spline provided robust inference when the PS model was misspecified, but it introduced bias when treatment effect was heterogeneous. Matching weights provided robust inference for heterogeneous treatment effect, but the robustness depended on correct specification of the PS model. Therefore, researchers should evaluate their data to determine whether treatment effect is likely to be heterogeneous when choosing which approach to use.
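For readers unfamiliar with matching weights, the sketch below shows one common formulation, in which each patient's weight is the smaller of the PS and one minus the PS, divided by the probability of the treatment the patient actually received. The function names are hypothetical, and the weighted risk difference is only one possible effect measure; this is not the study's implementation.

```python
import numpy as np

def matching_weights(ps, z):
    """Weight = min(ps, 1 - ps) / P(received own treatment)."""
    ps = np.asarray(ps, dtype=float)
    z = np.asarray(z)
    return np.minimum(ps, 1.0 - ps) / np.where(z == 1, ps, 1.0 - ps)

def weighted_risk_difference(y, z, w):
    """Difference in weighted outcome means between treated and untreated."""
    y, z, w = np.asarray(y, dtype=float), np.asarray(z), np.asarray(w, dtype=float)
    treated = z == 1
    return (np.average(y[treated], weights=w[treated])
            - np.average(y[~treated], weights=w[~treated]))
```

The weights never exceed 1, so no patient is up-weighted, and every patient and outcome event stays in the analysis rather than being discarded as in one-to-one matching.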
Limitations
The research team used simulated datasets to explore a variety of realistic scenarios. However, the use of simulated datasets that differ from those the research team used could produce different results. Moreover, the simulated datasets may differ from actual data, and the results derived from this study may not apply to all types of data.
Conclusions and Relevance
Automated variable selection methods, such as hdPS and lasso regression, can help build PS models that appropriately adjust for confounding in comparative effectiveness studies using healthcare databases. However, regularized-regression approaches are not appropriate for simultaneously selecting variables and adjusting for confounding via the outcome model. Treatment-effect-estimation approaches that focus on effects in the feasible population while preserving study size and number of outcomes are likely to lead to better estimates of treatment effect than other popular approaches. Applying these findings to the analysis of nonrandomized healthcare datasets should improve information available to support patient-physician decision making.
Future Research Needs
Future research could explore the performance of these analytical approaches when there are important unmeasured confounding variables or when there is uncertainty in model specification. Future work could also focus on improving understanding of how the relative performance of approaches varies as the number of observed outcome events increases. Finally, there is a strong need to evaluate these approaches for survival outcomes.
Peer-Review Summary
Peer review of PCORI-funded research helps make sure the report presents complete, balanced, and useful information about the research. It also assesses how the project addressed PCORI’s Methodology Standards. During peer review, experts read a draft report of the research and provide comments about the report. These experts may include a scientist focused on the research topic, a specialist in research methods, a patient or caregiver, and a healthcare professional. These reviewers cannot have conflicts of interest with the study.
The peer reviewers point out where the draft report may need revision. For example, they may suggest ways to improve descriptions of the conduct of the study or to clarify the connection between results and conclusions. Sometimes, awardees revise their draft reports twice or more to address all of the reviewers’ comments.
Reviewers’ comments and the investigator’s changes in response included the following:
- In response to reviewer requests, the awardee added figures to an appendix showing results for simulation 1 as well as for the other two covariate specifications. The awardee also added a list of variables to the appendix.
- Responding to reviewer feedback, the awardee clarified that simulation 1, unlike simulation 2, is generalizable because it included prevalent and rare outcome scenarios.
- The awardee expanded the discussion of how results differ across the simulation scenarios and provided guidance for researchers on determining which results are applicable for their own studies.