Developing and Testing New Methods for Comparing Outcomes Across Groups of Patients in Observational Studies

Results Summary

What was the project about?

One goal of comparative effectiveness research is to find out which treatments work best for different groups of patients. For example, treatments may work differently for patients with only one health problem than for those with more than one health problem.

In observational studies, researchers look at health outcomes when patients and their doctors choose the treatments. These studies often use data from electronic health records, or EHRs. Researchers can apply propensity score, or PS, methods to look at different groups of patients. With PS methods, researchers create groups of patients with similar traits who had different treatments. But PS methods require researchers to have data on all patient traits that could affect how well the treatment works. EHRs may be missing data on some patient traits, like health problems. When such data are missing, using current PS methods in observational studies may lead to biased results.

In this study, the research team created new guidance for using PS methods with EHR data to look at the effects of treatment in different groups of patients. The team also created and tested new PS methods to make groups of patients with similar traits.

What did the research team do?

The research team created graphs to show how similar patient traits are within groups when using current PS methods. They used the graphs to develop guidance for planning ways to compare treatments within different groups of patients. Next, the team developed a new method called OW-pLASSO for creating groups of patients with similar traits. The team used test data created by a computer to compare OW-pLASSO with current methods.

Then the research team applied OW-pLASSO and two current PS methods to real data from patients who received one of two treatments for uterine fibroids. One year after treatment, the team looked at patient quality of life and symptoms within 35 groups of patients.

Doctors provided input during the study.

What were the results?

With the test data, OW-pLASSO worked better than current methods to create groups of patients with similar traits.

With the real patient data, OW-pLASSO found different treatment effects between patient groups that the current methods didn’t find. For example, patients with mild symptoms before treatment had similar quality of life after getting either treatment. But for patients with moderate to severe symptoms, one treatment worked better than the other to improve quality of life.

What were the limits of the project?

The research team tested OW-pLASSO using data from patients with uterine fibroids. Results may differ with data from patients with other health problems.

Future research could test how well OW-pLASSO works for patients with other health problems.

How can people use the results?

Researchers can use the results when comparing treatments in patient groups using EHR data. Results from such studies may help patients and their doctors make treatment decisions.

Professional Abstract

Background

Subgroup analyses help determine how well treatments work for different groups of patients. In observational studies, propensity score (PS) methods create comparable patient subgroups to distinguish the effects of treatment from confounding factors, such as patient characteristics like age or co-occurring conditions. However, existing PS methods assume that all patient characteristics are measured and balanced across treatment groups. With unmeasured confounding factors, creating comparable subgroups is difficult and can lead to biased findings about treatment effectiveness. New methods may help improve the accuracy of PS methods for subgroup analyses.

Objective

To develop and test new methods and tools for subgroup analysis in observational studies

Study Design

Design Element	Description
Design	Methods development; simulation studies; empirical analysis
Data Sources and Data Sets	COMPARE-UF registry data collected between November 11, 2015, and April 18, 2019, from 1,430 women ages 30 and older with uterine fibroids
Analytic Approach	Developing statistical method: OW-pLASSO algorithm; testing new methods: simulation analysis and empirical analysis
Outcomes	Simulation studies: absolute standardized mean difference, relative bias, root mean squared error; empirical analysis: mean difference in quality of life summary and symptom severity score

Methods

First, the research team created graphical diagnostic plots to help visualize subgroup covariate precision and balance, which is the similarity of baseline characteristics between different treatment groups. The team developed guidance, based on the diagnostic plots, for choosing which PS method to use when designing a subgroup analysis.
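Covariate balance within subgroups can be summarized numerically before it is plotted. The following is a minimal sketch, not the team's diagnostic software, that computes the absolute standardized mean difference (the balance metric named in this abstract) for one covariate within each subgroup. The records, subgroup labels, and function names are hypothetical, and the pooled standard deviation shown is one common convention rather than one confirmed by the source.

```python
from math import sqrt
from statistics import mean, variance


def asmd(treated, control):
    """Absolute standardized mean difference for a single covariate.

    Uses the square root of the average of the two sample variances
    as the standardizing denominator (an assumed convention).
    """
    pooled_sd = sqrt((variance(treated) + variance(control)) / 2)
    return abs(mean(treated) - mean(control)) / pooled_sd


# Hypothetical records: (subgroup label, treatment indicator, age)
records = [
    ("mild", 1, 40), ("mild", 1, 44), ("mild", 0, 41), ("mild", 0, 45),
    ("severe", 1, 50), ("severe", 1, 54), ("severe", 0, 38), ("severe", 0, 42),
]


def asmd_by_subgroup(rows):
    """Return {subgroup label: ASMD of the covariate within that subgroup}."""
    result = {}
    for label in sorted({row[0] for row in rows}):
        treated = [row[2] for row in rows if row[0] == label and row[1] == 1]
        control = [row[2] for row in rows if row[0] == label and row[1] == 0]
        result[label] = asmd(treated, control)
    return result


balance = asmd_by_subgroup(records)
for label, value in balance.items():
    print(f"{label}: ASMD = {value:.3f}")
```

In this made-up data set, the "severe" subgroup is far less balanced on age than the "mild" subgroup, which is the kind of pattern a per-subgroup diagnostic is meant to expose.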

Next, the team developed a new method called OW-pLASSO by combining the post-LASSO propensity score model with overlap weighting (OW) to improve covariate balance within subgroups and the precision of estimated propensity scores. The team then conducted simulations to compare OW-pLASSO with existing PS methods, such as machine learning models and inverse probability weighting. They measured covariate balance using the absolute standardized mean difference and measured the precision of estimates for different subgroups using relative bias and root mean squared error.
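The overlap weighting half of this combination can be illustrated with a small sketch. It is not the OW-pLASSO algorithm: it omits the LASSO selection step entirely and fits a plain logistic propensity model on a single hypothetical covariate. Overlap weights assign each treated patient a weight of 1 minus the propensity score and each control patient a weight equal to the propensity score. When the propensity score comes from a logistic regression fitted by maximum likelihood, the weighted means of the covariates in that model are equal across treatment groups. All data and function names below are assumptions made for illustration.

```python
from math import exp


def fit_logistic(x, z, iters=25):
    """Fit logit P(z=1 | x) = a + b*x by Newton-Raphson (one covariate)."""
    a, b = 0.0, 0.0
    for _ in range(iters):
        e = [1 / (1 + exp(-(a + b * xi))) for xi in x]
        # Gradient of the log-likelihood with respect to (a, b)
        g0 = sum(zi - ei for zi, ei in zip(z, e))
        g1 = sum((zi - ei) * xi for zi, ei, xi in zip(z, e, x))
        # Entries of the 2x2 information matrix (negative Hessian)
        v = [ei * (1 - ei) for ei in e]
        h00 = sum(v)
        h01 = sum(vi * xi for vi, xi in zip(v, x))
        h11 = sum(vi * xi * xi for vi, xi in zip(v, x))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system in closed form
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


def overlap_weights(e, z):
    """Treated patients get 1 - e; control patients get e."""
    return [1 - ei if zi == 1 else ei for ei, zi in zip(e, z)]


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical covariate (x) and treatment indicator (z)
x = [1, 2, 3, 4, 5, 6, 7, 8]
z = [0, 0, 1, 0, 1, 0, 1, 1]

a, b = fit_logistic(x, z)
e = [1 / (1 + exp(-(a + b * xi))) for xi in x]
w = overlap_weights(e, z)

mean_treated = weighted_mean(
    [xi for xi, zi in zip(x, z) if zi == 1],
    [wi for wi, zi in zip(w, z) if zi == 1],
)
mean_control = weighted_mean(
    [xi for xi, zi in zip(x, z) if zi == 0],
    [wi for wi, zi in zip(w, z) if zi == 0],
)
# The two weighted means agree up to floating-point error
imbalance = abs(mean_treated - mean_control)
print(f"weighted mean difference in x: {imbalance:.2e}")
```

Balance within a subgroup would additionally require the propensity model to contain the relevant covariate-subgroup interaction terms. According to the peer-review summary for this project, selecting those interactions is the role LASSO plays in the team's method.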

To test the new method, the research team applied OW-pLASSO to data from patients who received a myomectomy or a hysterectomy in the Comparing Options for Management: Patient-Centered Results for Uterine Fibroids (COMPARE-UF) registry. They estimated the effect of each treatment one year post-procedure on mean quality of life score and symptom severity within 35 patient subgroups. The team also compared the estimates from OW-pLASSO with estimates from two existing PS methods.

Clinicians and researchers provided input during the study.

Results

In simulations, OW-pLASSO achieved exact within-subgroup mean balance and greater precision in estimating subgroup effects across most scenarios compared with existing methods. However, when subgroups were small and had little overlap in covariate values between treatment groups, OW-pLASSO produced mean balance similar to that of existing methods.
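The simulation comparisons rest on two summary metrics named in this abstract, relative bias and root mean squared error. The sketch below shows one standard way to compute them from repeated estimates of a known true effect. The estimates and the true value are invented for illustration and are not results from the study.

```python
from math import sqrt


def relative_bias(estimates, truth):
    """Average estimate minus the truth, as a fraction of the truth."""
    return (sum(estimates) / len(estimates) - truth) / truth


def rmse(estimates, truth):
    """Root mean squared error of the estimates around the truth."""
    return sqrt(sum((est - truth) ** 2 for est in estimates) / len(estimates))


# Hypothetical subgroup effect estimates from four simulated data sets
estimates = [1.8, 2.1, 2.4, 1.9]
true_effect = 2.0

rb = relative_bias(estimates, true_effect)
err = rmse(estimates, true_effect)
print(f"relative bias = {rb:.3f}, RMSE = {err:.3f}")
```

Relative bias captures systematic error in one direction, while root mean squared error also reflects how much the estimates vary from one simulated data set to the next.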

With COMPARE-UF data, OW-pLASSO detected different treatment effects between subgroups with different baseline symptom severity. For example, patients with mild symptom severity at baseline had similar quality of life outcomes following a hysterectomy or myomectomy. In contrast, in patients with moderate to severe baseline symptoms, hysterectomy worked better than myomectomy at improving quality of life. Existing methods were unable to detect such differences.

Limitations

The research team tested OW-pLASSO using data from patients with uterine fibroids. Results may differ for other health conditions.

Conclusions and Relevance

OW-pLASSO accurately estimated treatment effects in subgroups. Researchers can use OW-pLASSO to conduct subgroup analysis using observational data.

Future Research Needs

Future research could test the methods with different health conditions.

COVID-19-Related Study

Methods to Improve Comparative Effectiveness Analyses for COVID-19 Treatments


Results Summary

In response to the COVID-19 public health crisis in 2020, PCORI launched an initiative to enhance existing research projects so that they could offer findings related to COVID-19. The initiative funded this study and others.

What was the project about?

Researchers can use data from health records to find out which treatments work better for COVID-19. But it can be hard to know whether changes in a patient’s health are from the treatment or something else, like patient traits such as their age or other health problems.

To address this problem, researchers can use a statistical method known as propensity scoring, or PS. PS helps researchers group patients based on their traits to compare how treatments work. But analyses that use PS may not be accurate. Errors may occur if PS misses some patient traits in creating patient groups. Researchers need better methods for using PS.

In this project, the research team created new methods to improve the use of PS for comparing how well COVID-19 treatments work for different patients.

What did the research team do?

The research team created methods to help researchers find out how likely it is for a PS analysis to have errors. The team designed two tools to help researchers use the methods: a study design assessment and a computer program.

Next, the research team tested the tools in two analyses of how well a medicine called dexamethasone worked to treat COVID-19. They looked at health record data from:

Patients with COVID-19 who were admitted to the hospital. The team compared the risk of death or discharge to hospice for patients who did and didn't receive dexamethasone.
Patients with COVID-19 who came to the hospital but weren't admitted. The team compared the risk of future hospital admission for patients who did and didn't receive dexamethasone.

In both analyses, the research team used the new methods and tools to check for errors in how they used PS to create patient groups for analysis.

Doctors helped design the study.

What were the results?

Improving use of PS methods. The study design assessment helped the research team know if errors were likely in the PS analysis. The computer program created graphs that showed whether errors in using PS analysis were likely when comparing patient groups.

Comparing patients with COVID-19 who did and didn’t receive dexamethasone. Results from both analyses suggest that:

For patients admitted to the hospital, dexamethasone was helpful among those who stayed in the hospital for at least two days.
For patients who came to the hospital but weren’t sick enough to be admitted, dexamethasone could be harmful.

What were the limits of the project?

The new methods don’t account for errors that happen if health records don’t include data that affect treatment decisions, like patient preferences.

How can people use the results?

Researchers can use the new methods to reduce errors when using data from health records to find out how well COVID-19 treatments work for patients with different traits.

Professional Abstract

Background

Propensity score (PS) methods can be useful for researchers conducting observational studies to compare treatments for COVID-19. PS methods create comparable patient subgroups that enable researchers to distinguish treatment effects from confounding factors, such as patient characteristics like age or co-occurring conditions. However, with unmeasured confounding factors, creating comparable subgroups is difficult and can lead to biased findings about treatment effectiveness. Researchers need new methods that improve the accuracy of PS methods for subgroup analyses.

Objective

To develop methods that improve the design of PS-based subgroup analyses for comparing treatments for COVID-19

Study Design

Design Element	Description
Design	Methods development, observational cohort study, empirical (comparative effectiveness) analysis
Data Sources and Data Sets	HCA Healthcare data: electronic health records for 102,697 patients diagnosed with COVID-19 infection and admitted to 1 of 176 HCA Healthcare-affiliated facilities across the United States between July 2020 and October 2021; CHC data: pharmacy and medical claims with procedure and diagnosis codes for 22,170 patients with COVID-19 who were treated in the hospital and immediately discharged
Analytic Approach	PS modeling to adjust for confounding factors; HCA analysis: subgroup analysis based on age, race, ethnicity, Charlson Comorbidity Index, diabetes, remdesivir, baricitinib, and daily oxygenation flow; CHC analysis: subgroup analysis based on Medicare Advantage enrollment, diabetes, and kidney disease
Outcomes	HCA analysis: in-hospital mortality or discharge to hospice; CHC analysis: admission to hospital

Methods

The research team developed two approaches to improve PS-based subgroup analyses in observational studies. First, they adapted a tool to assess risk of bias in the study design. Second, they developed software to visually compare covariate balance across patient subgroups. Next, the team tested the approaches using multicenter data in two observational studies that used PS methods to evaluate outcomes for patients who received dexamethasone.

The first analysis used Hospital Corporation of America (HCA) electronic health record data from 176 facilities across the United States. The research team compared the effect of starting dexamethasone within two days of hospital arrival for COVID-19, versus not starting it within two days, on in-hospital mortality or discharge to hospice. They examined whether outcomes varied based on receipt of oxygen support overall and within subgroups based on patient characteristics and confounding factors.

The second analysis used Change Healthcare (CHC) claims data. The research team compared the effect of receiving dexamethasone, versus not receiving it, on subsequent hospitalization in patients with COVID-19 who arrived at the hospital and left the same day.

For both analyses, the research team evaluated the risk of bias and used the software to examine the accuracy of PS methods in comparing treatment effectiveness across different subgroups.

Clinicians helped design the study.

Results

In both analyses, the new methods helped measure the risk of bias in the study design. The software’s visual plots showed the covariate balance across patient subgroups and helped adjust PS weights to improve the accuracy of the analysis.

In the HCA analysis, dexamethasone was associated with lower risk of inpatient mortality or discharge to hospice for patients who did not receive oxygen (odds ratio [OR]=0.90; 95% confidence interval [CI]: 0.78, 1.03), although this confidence interval included 1, and for patients who received oxygen (OR=0.92; 95% CI: 0.86, 0.98). In the CHC analysis, dexamethasone was associated with a 1.2% higher rate of hospitalization in patients who arrived at the hospital and left the same day; this finding was not statistically significant.
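One way to read the two HCA odds ratios is to check whether each 95% confidence interval contains the null value of 1 and to recover an approximate standard error on the log-odds scale. The sketch below does this under the assumption that the intervals are Wald-type intervals, symmetric on the log scale. The source does not state how the intervals were computed, so the standard errors are rough approximations only. The function names are hypothetical.

```python
from math import log


def ci_includes_null(lower, upper, null=1.0):
    """True when the interval contains the null odds ratio."""
    return lower <= null <= upper


def approx_se_log_or(lower, upper, z_crit=1.96):
    """Approximate SE of log(OR), assuming a symmetric Wald 95% interval."""
    return (log(upper) - log(lower)) / (2 * z_crit)


# (odds ratio, lower bound, upper bound) as reported in the HCA analysis
subgroups = {
    "no oxygen": (0.90, 0.78, 1.03),
    "oxygen": (0.92, 0.86, 0.98),
}

summary = {}
for name, (odds_ratio, lower, upper) in subgroups.items():
    summary[name] = {
        "includes_null": ci_includes_null(lower, upper),
        "se_log_or": approx_se_log_or(lower, upper),
    }
    print(name, summary[name])
```

The wider interval for patients who did not receive oxygen corresponds to a larger approximate standard error, so that estimate is less precise than the one for patients who received oxygen.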

Results from both analyses suggest that dexamethasone was moderately helpful for patients who stayed at the hospital for at least two days, but potentially harmful for patients who did not require hospital admission.

Limitations

The new methods do not account for errors due to lack of information in the data sources, such as patient preferences, which may affect treatment decisions.

Conclusions and Relevance

The assessment and software optimize the use of PS methods to accurately compare treatments for COVID-19.

Peer Review Summary

The Peer-Review Summary for this COVID-19 study will be posted here soon.

Final Enhancement Report

This COVID-19 study's final enhancement report is expected to be available by July 2024.

Enhancement Budget:

$349,999

Final Research Report

This project's final research report is expected to be available by Sept. 2024.

Peer-Review Summary

Peer review of PCORI-funded research helps make sure the report presents complete, balanced, and useful information about the research. It also assesses how the project addressed PCORI’s Methodology Standards. During peer review, experts read a draft report of the research and provide comments about the report. These experts may include a scientist focused on the research topic, a specialist in research methods, a patient or caregiver, and a healthcare professional. These reviewers cannot have conflicts of interest with the study.

The peer reviewers point out where the draft report may need revision. For example, they may suggest ways to improve descriptions of the conduct of the study or to clarify the connection between results and conclusions. Sometimes, awardees revise their draft reports twice or more to address all of the reviewers’ comments.

Peer reviewers commented and the researchers made changes or provided responses. Those comments and responses included the following:

The reviewers asked whether the statistical methods developed in this project could be applied to randomized trials as well as to observational studies. The researchers agreed that the methods could benefit randomized trials, particularly in subgroup analyses where imbalances between groups could significantly affect the outcome. The researchers noted that this approach was not part of the current research project, but they did write a paper about this possible use of the new methods and credited this PCORI-funded project for contributing to the ideas relayed in that paper.
The reviewers asked for more rationale for the researchers’ use of the LASSO (least absolute shrinkage and selection operator) technique in choosing interaction terms for their statistical model but not using the technique for selecting the main effects of covariates in the model. The reviewers felt that reducing the number of covariates in the model by removing those with little or no effect on the outcome would reduce the statistical model’s complexity. The researchers added an explanation to the report, clarifying that their goal was to specify a propensity score model that took into consideration the potential for covariate-subgroup interaction rather than assuming that the interaction would be 0. The LASSO technique was therefore used to identify the most salient covariate-subgroup interactions to include in their propensity score models.

Conflict of Interest Disclosures

View the COI Disclosure Form

Project Information

Principal Investigator:

Laine E. Thomas, PhD

Organization:

Duke University

Project Budget:

$1,081,267

Project Title:

Methods for the Design and Conduct of Subgroup Analysis in Observational Studies

Key Dates

Approval Date:

April 2019

Project End Date:

September 2023

Year Awarded:

2019

Year Completed:

2023

Study Registration Information

HSR Project Number:

HSRP20194228

Journal Citations

Related Journal Citations

Dexamethasone for Inpatients With COVID-19 in a National Cohort

Covariate adjustment in subgroup analyses of randomized clinical trials: A propensity score approach

Propensity score weighting for causal subgroup analysis
