Results Summary
What was the research about?
Comparative effectiveness research compares two or more treatments to see which one works better for which patients. Electronic healthcare data are useful for this type of research. These data come from medical records and insurance claims. The data include information about how well patients respond to treatments. But many things—not just treatments—affect whether a patient’s health improves.
How well a patient responds to a treatment may depend on the patient’s age or what medicines the patient takes. It could also depend on what other health problems a patient has and how severe those problems are. Or a doctor may suggest one treatment instead of another because of a patient’s personal situation and health. Researchers need ways to determine whether changes in a patient’s health result from a certain treatment or something else.
Different statistical methods help researchers account for the various things that can affect treatment results. But researchers don’t know which methods work best. This study compared several methods. The team looked at how well the methods worked to predict patients’ responses to treatment, taking into account their personal situations and health. The team then created a computer program to help researchers use the methods.
What were the results?
No statistical method worked best in all cases to predict how well patients responded to treatment, taking into account their personal situations and health. Each method had pros and cons. But the research team found that one combination of two methods worked well in many cases.
What did the research team do?
The team used electronic healthcare data from three studies. The researchers first used the data from the studies to test several statistical methods. Then, they made more data sets based on the real data to test the methods further. The team created a computer program to help researchers use the methods.
During the study, patients gave input to the research team about key problems they have had in getting health care and what questions they see as most important for research to answer.
What were the limits of the study?
The research team could not say for sure which method will work best in specific cases to account for the things that could affect treatment results. Future research could continue to look at which methods might work best when doing a study using electronic healthcare data.
How can people use the results?
Researchers can use the results to understand which statistical methods might be useful when doing a study using electronic healthcare data. The software that the team developed can help researchers use the statistical methods in their research.
Professional Abstract
Objective
To compare data-adaptive algorithmic approaches for improving confounding control in comparative effectiveness research that uses electronic healthcare databases
Study Design
| Design Elements | Description |
|---|---|
| Design | Empirical analyses and simulation studies |
| Data Sources and Data Sets | Novel Oral Anticoagulant Prescribing Study data set, Nonsteroidal Anti-inflammatory Drugs Study data set, and Vytorin Study data set |
| Analytic Approach | Empirical analyses and simulations to examine 4 algorithmic approaches: the high-dimensional propensity score (hdPS) algorithm alone, a combination of super learner prediction modeling and the hdPS, a combination of a scalable version of collaborative targeted maximum likelihood estimation (CTMLE) and the hdPS, and a combination of penalized regression (lasso) and the hdPS |
| Outcomes | Outcomes informing variable selection, propensity score (PS) estimation, and causal inference |
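For readers unfamiliar with the hdPS algorithm that anchors each analytic approach in the table above, its central step prioritizes candidate covariates (generated from diagnosis, procedure, and drug codes) by their potential to confound the treatment-outcome association, using the Bross bias formula. The sketch below illustrates only that prioritization step under simplifying assumptions (binary candidate covariates in a NumPy array, each occurring in both outcome groups); the function name and the default number of covariates retained are illustrative, not the study's software.

```python
import numpy as np

def hdps_prioritize(candidates, treatment, outcome, top_k=200):
    """Bross-formula prioritization step of the hdPS algorithm: rank binary
    candidate covariates by how much bias each could introduce if left
    uncontrolled, then keep the top_k. Returns selected column indices."""
    scores = []
    for j in range(candidates.shape[1]):
        c = candidates[:, j]
        p1 = c[treatment == 1].mean()  # covariate prevalence among treated
        p0 = c[treatment == 0].mean()  # covariate prevalence among untreated
        # Crude covariate-outcome relative risk, floored to avoid division by zero.
        rr = max(outcome[c == 1].mean(), 1e-6) / max(outcome[c == 0].mean(), 1e-6)
        rr = max(rr, 1.0 / rr)  # use the direction-free value for ranking
        # Multiplicative bias the covariate could cause if left uncontrolled.
        bias = (p1 * (rr - 1) + 1) / (p0 * (rr - 1) + 1)
        scores.append(abs(np.log(bias)))
    return np.argsort(scores)[::-1][:top_k]
```

The sensitivity noted in the Results section corresponds to the choice of top_k here: retaining too many covariates can overfit the PS model, especially in small samples.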
Analyzing data from electronic healthcare databases, such as electronic health records, to gain generalizable knowledge of the effectiveness of medical interventions in routine care can improve patient care and outcomes, particularly for populations that are often excluded from randomized trials. However, researchers underuse these data for generating evidence on treatment effects, in part because of concerns about bias. Bias may arise in the data because clinicians selectively base their prescribing decisions on such factors as disease severity and patient prognosis. Current approaches to minimize such bias rely on the investigator to specify all potential confounding factors. New analytic approaches propose using algorithms to maximize control of confounding factors. However, researchers do not know how well these algorithms perform when applied to electronic healthcare data, particularly for special populations and small samples. Further, researchers have lacked readily available software to facilitate use of the algorithms.
This study evaluated the performance of several algorithms for variable selection, propensity score (PS) estimation, and causal inference. Researchers performed simulations using the plasmode framework, which combines simulated and empirical data to more accurately reflect complex relations that typically exist among baseline covariates. The research team then used three healthcare data sets in conjunction with plasmode simulations to evaluate the ability of each algorithm to effectively control for confounding. The team considered different scenarios by varying outcome incidence, treatment prevalence, sample size, and treatment effect. To help researchers use the algorithms, the team developed software and accompanying guidance.
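To make the plasmode idea concrete, the following minimal sketch shows one way such a simulation can be organized: empirical covariates and treatment assignments are retained, outcomes are regenerated from a model fitted to the real data with a known (injected) treatment effect, and estimator bias and mean squared error are then measured against that known truth. The function name, the logistic outcome model, and the use of Python with scikit-learn are illustrative assumptions, not the study's actual software.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def plasmode_replicates(X, treatment, outcome, true_log_or, n_reps=500, seed=0):
    """Plasmode-style simulation sketch: keep the empirical covariates and
    treatment assignments, regenerate binary outcomes from a model fitted to
    the real data with a known treatment effect, and measure the bias and
    mean squared error of a candidate estimator against that ground truth."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    design = np.column_stack([treatment, X])

    # Fit the outcome model to real data to preserve covariate-outcome relations.
    base = LogisticRegression(max_iter=1000).fit(design, outcome)
    coefs = base.coef_.ravel().copy()
    coefs[0] = true_log_or  # overwrite the treatment coefficient with the truth
    logits = base.intercept_[0] + design @ coefs

    estimates = []
    for _ in range(n_reps):
        idx = rng.integers(0, n, size=n)  # resample subjects with replacement
        y_sim = rng.binomial(1, 1.0 / (1.0 + np.exp(-logits[idx])))
        # Any candidate adjustment method can be plugged in here; a covariate-
        # adjusted logistic regression stands in for the algorithms under study.
        fit = LogisticRegression(max_iter=1000).fit(design[idx], y_sim)
        estimates.append(fit.coef_.ravel()[0])

    estimates = np.asarray(estimates)
    return estimates.mean() - true_log_or, np.mean((estimates - true_log_or) ** 2)
```

Shifting the intercept, changing the resampling scheme, or altering true_log_or corresponds to the outcome-incidence, sample-size, and treatment-effect scenarios the team varied.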
During the study, the research team met with patient representatives, who identified key problems they encounter in healthcare delivery and provided input on what potential research questions are of greatest interest.
Results
- Overall, in settings with many covariates, the high-dimensional propensity score (hdPS) algorithm alone performed as well as or better than other algorithms for automated variable selection. However, the hdPS algorithm can be sensitive to the number of covariates included for adjustment, and severe overfitting of the PS model can negatively affect the properties of effect estimates, particularly for small samples.
- Combining the hdPS algorithm and the scalable version of collaborative targeted maximum likelihood estimation (CTMLE) performed well for many of the scenarios considered, but this combination was sensitive to parameter specifications within the algorithm.
- Combining the hdPS algorithm with super learner prediction modeling performed well across a broad range of settings and conditions and was the most consistent strategy in terms of reducing bias and mean squared error in the effect estimates. This approach seems especially promising for use in early periods of drug approval, when small samples and rare exposures are common (a sketch of this combination follows the list).
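One way to picture the hdPS plus super learner combination: hdPS-selected covariates feed a cross-validated stacking ensemble that estimates the propensity score, which can then be used for weighting. The sketch below uses scikit-learn's StackingClassifier as a rough stand-in for super learner; the candidate learners, function names, and the inverse-probability-weighted estimator are illustrative assumptions, not the study's specification.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression

def super_learner_ps(X_hdps, treatment):
    """Estimate propensity scores with a cross-validated stacking ensemble
    over hdPS-selected covariates (a rough stand-in for super learner)."""
    ensemble = StackingClassifier(
        estimators=[
            ("logit", LogisticRegression(max_iter=1000)),
            ("forest", RandomForestClassifier(n_estimators=200, random_state=0)),
        ],
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,  # out-of-fold predictions train the meta-learner
        stack_method="predict_proba",
    )
    ensemble.fit(X_hdps, treatment)
    return ensemble.predict_proba(X_hdps)[:, 1]

def iptw_risk_difference(ps, treatment, outcome):
    """Hajek-style inverse-probability-of-treatment-weighted risk difference."""
    w1, w0 = treatment / ps, (1 - treatment) / (1 - ps)
    return np.sum(w1 * outcome) / np.sum(w1) - np.sum(w0 * outcome) / np.sum(w0)
```

Because the meta-learner is trained on out-of-fold predictions, the ensemble weights each candidate model by its cross-validated performance, which is the core idea behind super learner's robustness across settings.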
Limitations
A framework that incorporates empirical data and observed variable relationships with simulated data allowed the research team to evaluate the algorithms in settings that reflect real-world practice. However, simulations that use specific data may have limited generalizability to other settings. In addition, assumptions made in applying the super learner algorithm may have influenced results. Finally, this study relied solely on statistical algorithms to control confounding without investigator input. It is unclear whether investigator input might influence results.
Conclusions and Relevance
Among the data-adaptive algorithms assessed in this study, no single algorithm was optimal across all data sets and scenarios. The widely used hdPS functioned well in most scenarios, although the combination of the hdPS with super learner prediction modeling outperformed hdPS alone under certain conditions. These approaches may be effective in reducing bias due to confounding when estimating treatment effects using healthcare databases.
Future Research Needs
Future studies could determine whether similar findings emerge when investigators identify confounders to include in the model a priori. Additional research could also explore factors that influence the performance of these algorithms.
Peer-Review Summary
Peer review of PCORI-funded research helps make sure the report presents complete, balanced, and useful information about the research. It also assesses how the project addressed PCORI’s Methodology Standards. During peer review, experts read a draft report of the research and provide comments about the report. These experts may include a scientist focused on the research topic, a specialist in research methods, a patient or caregiver, and a healthcare professional. These reviewers cannot have conflicts of interest with the study.
The peer reviewers point out where the draft report may need revision. For example, they may suggest ways to improve descriptions of the conduct of the study or to clarify the connection between results and conclusions. Sometimes, awardees revise their draft reports twice or more to address all of the reviewers’ comments.
Peer review identified the following strengths and limitations in the report:
- Reviewers commended the researchers on a methodologically sound study focused on improving statistical methods for confounder adjustment needed for causal inference models in patient-centered outcomes research. Many of the comments asked the investigators to explain their highly technical methods and results in language that a general clinical researcher could understand.
- Reviewers questioned the generalizability of the statistical approaches tested in this study, as simulation studies often have limited generalizability. The researchers acknowledged this problem but responded that they used three different data sets for the simulations, which would improve overall generalizability. One specific advantage of their simulation approach, the authors stated, was that it preserved the complex relationships among baseline covariates, which would not be possible in other simulation frameworks.