Patient-Centered Outcomes Research Institute

This project has results

Methods for Improving Confounding Control in Comparative Effectiveness Research Using Electronic Healthcare Databases

Results Summary

What was the research about?

Comparative effectiveness research compares two or more treatments to see which one works better for which patients. Electronic healthcare data are useful for this type of research. These data come from medical records and insurance claims. The data include information about how well patients respond to treatments. But many things—not just treatments—affect whether a patient’s health improves.

How well a patient responds to a treatment may depend on the patient’s age or what medicines the patient takes. It could also depend on what other health problems a patient has and how severe those problems are. Or a doctor may suggest one treatment instead of another because of a patient’s personal situation and health. Researchers need ways to determine whether changes in a patient’s health result from a certain treatment or something else.

Different statistical methods help researchers account for the various things that can affect treatment results. But researchers don’t know which methods work best. This study compared several methods. The team looked at how well the methods worked to predict patients’ responses to treatment, taking into account their personal situations and health. The team then created a computer program to help researchers use the methods.

What were the results?

No statistical method worked best in all cases to predict how well patients responded to treatment, taking into account their personal situations and health. Each method had pros and cons. But the research team found that one combination of two methods worked well in many cases.

What did the research team do?

The team used electronic healthcare data from three studies. The researchers first used the data from the studies to test several statistical methods. Then, they made more data sets based on the real data to test the methods further. The team created a computer program to help researchers use the methods.

During the study, patients gave input to the research team about key problems they have had in getting health care and what questions they see as most important for research to answer.

What were the limits of the study?

The research team could not say for sure which method will work best in specific cases to account for the things that could affect treatment results. Future research could continue to look at which methods might work best when doing a study using electronic healthcare data.

How can people use the results?

Researchers can use the results to understand which statistical methods might be useful when doing a study using electronic healthcare data. The software that the team developed can help researchers use the statistical methods in their research.

Professional Abstract

Objective

To compare data-adaptive algorithmic approaches for improving confounding control in comparative effectiveness research that uses electronic healthcare databases

Study Design

| Design Elements | Description |
| --- | --- |
| Design | Empirical analyses and simulation studies |
| Data Sources and Data Sets | Novel Oral Anticoagulant Prescribing Study data set, Nonsteroidal Anti-inflammatory Drugs Study data set, and Vytorin Study data set |
| Analytic Approach | Empirical analyses and simulations to examine 4 different algorithmic approaches: the high-dimensional propensity score (hdPS) alone, a combination of super learner prediction modeling and the hdPS, a combination of a scalable version of collaborative targeted maximum likelihood estimation (CTMLE) and the hdPS, and a combination of penalized regression (lasso) and the hdPS |
| Outcomes | Outcomes informing variable selection, propensity score (PS) estimation, and causal inference |

Analyzing data from electronic healthcare databases, such as electronic health records, to gain generalizable knowledge of the effectiveness of medical interventions in routine care can improve patient care and outcomes, particularly for populations that are often excluded from randomized trials. However, researchers underuse these data for generating evidence on treatment effects, in part because of concerns about bias. Bias may arise in the data because clinicians selectively base their prescribing decisions on such factors as disease severity and patient prognosis. Current approaches to minimize such bias rely on the investigator to specify all potential confounding factors. New analytic approaches propose using algorithms to maximize control of confounding factors. However, researchers do not know how well these algorithms perform when applied to electronic healthcare data, particularly for special populations and small samples. Further, researchers have lacked readily available software to facilitate use of the algorithms.
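
To make the bias mechanism concrete, the following is a minimal sketch using invented counts, not data from the study. It assumes a hypothetical setting in which clinicians give treatment A mostly to severely ill patients. A crude comparison then makes A look harmful, while a comparison that standardizes over severity shows A lowering risk. All names and numbers here are illustrative assumptions.

```python
# Hypothetical counts, invented for illustration only (not study data).
# Key: (severity, treatment) -> (number of patients, number with a bad outcome)
counts = {
    ("severe", "A"): (800, 240),  # stratum risk 0.30
    ("severe", "B"): (200, 80),   # stratum risk 0.40
    ("mild", "A"): (200, 10),     # stratum risk 0.05
    ("mild", "B"): (800, 80),     # stratum risk 0.10
}


def crude_risk(treatment):
    """Risk among everyone who received `treatment`, ignoring severity."""
    n = sum(v[0] for (s, t), v in counts.items() if t == treatment)
    events = sum(v[1] for (s, t), v in counts.items() if t == treatment)
    return events / n


def standardized_risk(treatment):
    """Risk if everyone received `treatment`: stratum-specific risks
    averaged over the severity mix of the whole population."""
    total = sum(n for n, _ in counts.values())
    result = 0.0
    for severity in ("severe", "mild"):
        stratum_size = sum(v[0] for (s, t), v in counts.items() if s == severity)
        n, events = counts[(severity, treatment)]
        result += (stratum_size / total) * (events / n)
    return result


crude_rd = crude_risk("A") - crude_risk("B")                  # positive: A looks worse
adjusted_rd = standardized_risk("A") - standardized_risk("B")  # negative: A is better
print(f"crude risk difference:    {crude_rd:+.3f}")
print(f"adjusted risk difference: {adjusted_rd:+.3f}")
```

The sign flip between the two estimates is the kind of confounding by disease severity that the approaches in this study try to control.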

This study evaluated the performance of several algorithms for variable selection, propensity score (PS) estimation, and causal inference. Researchers performed simulations using the plasmode framework, which combines simulated and empirical data to more accurately reflect complex relations that typically exist among baseline covariates. The research team then used three healthcare data sets in conjunction with plasmode simulations to evaluate the ability of each algorithm to effectively control for confounding. The team considered different scenarios by varying outcome incidence, treatment prevalence, sample size, and treatment effect. To help researchers use the algorithms, the team developed software and accompanying guidance.
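
The evaluation loop described above can be sketched roughly as follows. This is a simplified illustration, not the study's software: the cohort is a randomly generated stand-in for real database rows, the outcome model coefficients are assumed rather than fitted to data, and the two estimators (crude and severity-stratified differences in means) are placeholders for the algorithms the team compared. The key plasmode idea it shows is that covariates and treatment are resampled from observed rows, while outcomes are simulated with a known, injected treatment effect so bias and mean squared error can be measured.

```python
import random
import statistics


def make_stand_in_cohort(n, rng):
    """Hypothetical stand-in for an empirical cohort of (severe, treated) rows.
    In a real plasmode study these rows come from the observed database."""
    cohort = []
    for _ in range(n):
        severe = 1 if rng.random() < 0.4 else 0
        p_treat = 0.7 if severe else 0.3  # sicker patients are treated more often
        treated = 1 if rng.random() < p_treat else 0
        cohort.append((severe, treated))
    return cohort


def simulate_outcomes(rows, true_effect, rng):
    """Attach a simulated outcome with a known treatment effect.
    The severity coefficient (2.0) is an assumed value, not a fitted one."""
    return [(s, t, true_effect * t + 2.0 * s + rng.gauss(0.0, 1.0)) for s, t in rows]


def crude_estimate(data):
    """Unadjusted difference in mean outcome, treated minus untreated."""
    treated = [y for _, t, y in data if t == 1]
    control = [y for _, t, y in data if t == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


def stratified_estimate(data):
    """Difference in means within severity strata, weighted by stratum size."""
    estimate = 0.0
    for level in (0, 1):
        stratum = [row for row in data if row[0] == level]
        estimate += len(stratum) / len(data) * crude_estimate(stratum)
    return estimate


def evaluate(estimator, cohort, true_effect, replicates, rng):
    """Return (bias, mean squared error) of an estimator over resampled data sets."""
    errors = []
    for _ in range(replicates):
        resampled = rng.choices(cohort, k=len(cohort))  # rows drawn with replacement
        data = simulate_outcomes(resampled, true_effect, rng)
        errors.append(estimator(data) - true_effect)
    return statistics.fmean(errors), statistics.fmean(e * e for e in errors)


rng = random.Random(0)
cohort = make_stand_in_cohort(1000, rng)
crude_bias, crude_mse = evaluate(crude_estimate, cohort, 1.0, 100, rng)
adj_bias, adj_mse = evaluate(stratified_estimate, cohort, 1.0, 100, rng)
print(f"crude:      bias={crude_bias:.3f}, mse={crude_mse:.3f}")
print(f"stratified: bias={adj_bias:.3f}, mse={adj_mse:.3f}")
```

Varying `true_effect`, the cohort size, and the treatment and outcome probabilities corresponds to the different scenarios the team considered.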

During the study, the research team met with patient representatives, who identified key problems they encounter in healthcare delivery and provided input on what potential research questions are of greatest interest.

Results

  • Overall, in settings with many covariates, the high-dimensional propensity score (hdPS) algorithm alone performed as well as or better than other algorithms for automated variable selection. However, the hdPS algorithm can be sensitive to the number of covariates included for adjustment, and severe overfitting of the PS model can negatively affect the properties of effect estimates, particularly for small samples.
  • Combining the hdPS algorithm and the scalable version of collaborative targeted maximum likelihood estimation (CTMLE) performed well for many of the scenarios considered, but this combination was sensitive to parameter specifications within the algorithm.
  • Combining the hdPS algorithm with super learner prediction modeling performed well across a broad range of settings and conditions and was the most consistent strategy in terms of reducing bias and mean squared error in the effect estimates. This approach seems especially promising for use in early periods of drug approval, where small samples and rare exposures are common.
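
The results above note that hdPS is sensitive to how many covariates are kept for adjustment. This page does not describe how covariates are ranked, so the sketch below is an assumption based on published descriptions of hdPS, which prioritize binary candidate covariates by the absolute log of the Bross bias multiplier. The covariate names, prevalences, and relative risks are hypothetical, and `top_k` stands in for the "number of covariates" setting.

```python
import math


def bross_bias(p_c1, p_c0, rr_cd):
    """Multiplicative bias from leaving a binary covariate unadjusted.

    p_c1: prevalence of the covariate among the exposed
    p_c0: prevalence of the covariate among the unexposed
    rr_cd: relative risk linking the covariate to the outcome
    """
    return (p_c1 * (rr_cd - 1.0) + 1.0) / (p_c0 * (rr_cd - 1.0) + 1.0)


def rank_covariates(candidates, top_k):
    """Return the names of the top_k covariates by |log(bias multiplier)|."""
    scored = []
    for name, (p_c1, p_c0, rr_cd) in candidates.items():
        score = abs(math.log(bross_bias(p_c1, p_c0, rr_cd)))
        scored.append((score, name))
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


# Hypothetical candidates: name -> (p_c1, p_c0, rr_cd)
candidates = {
    "code_A": (0.30, 0.10, 2.0),  # imbalanced and related to the outcome
    "code_B": (0.20, 0.20, 3.0),  # balanced across groups, so no bias
    "code_C": (0.50, 0.05, 1.0),  # unrelated to the outcome, so no bias
    "code_D": (0.05, 0.25, 1.5),  # imbalanced in the other direction
}

selected = rank_covariates(candidates, top_k=2)
print(selected)
```

A larger `top_k` admits more weakly ranked covariates into the PS model, which is where the overfitting concern for small samples comes from.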

Limitations

A framework that incorporates empirical data and observed variable relationships with simulated data allowed the research team to evaluate the algorithms in settings that reflect real-world practice. However, simulations that use specific data may have limited generalizability to other settings. In addition, assumptions made in applying the super learner algorithm may have influenced results. Finally, this study relied solely on statistical algorithms to control confounding without investigator input. It is unclear whether investigator input might influence results.

Conclusions and Relevance

Among the data-adaptive algorithms assessed in this study, no single algorithm was optimal across all data sets and scenarios. The widely used hdPS functioned well in most scenarios, although the combination of the hdPS with super learner prediction modeling outperformed hdPS alone under certain conditions. These approaches may be effective in reducing bias due to confounding when estimating treatment effects using healthcare databases.
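
As a rough illustration of the super learner idea mentioned above, the sketch below shows only its combination step: given held-out treatment probabilities from two candidate models, it searches for the convex weight that minimizes log loss. This is a simplified sketch, not the study's implementation. The predictions are invented, a real super learner would obtain them through cross-validation over a larger library of learners, and the grid search is an assumed stand-in for a proper optimizer.

```python
import math


def log_loss(y_true, probs):
    """Average negative log-likelihood for binary labels."""
    eps = 1e-12  # guard against log(0)
    total = 0.0
    for y, p in zip(y_true, probs):
        total += y * math.log(max(p, eps)) + (1 - y) * math.log(max(1 - p, eps))
    return -total / len(y_true)


def blend(preds_a, preds_b, weight_a):
    """Convex combination of two sets of predicted probabilities."""
    return [weight_a * a + (1 - weight_a) * b for a, b in zip(preds_a, preds_b)]


def best_weight(y_true, preds_a, preds_b, steps=20):
    """Grid search for the weight on model A that minimizes log loss."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda w: log_loss(y_true, blend(preds_a, preds_b, w)))


# Hypothetical held-out data: observed treatment and two models' predictions.
treated = [1, 1, 1, 0, 0, 0]
preds_a = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6]  # e.g., from a flexible learner
preds_b = [0.6, 0.7, 0.7, 0.5, 0.4, 0.3]  # e.g., from a logistic model

weight = best_weight(treated, preds_a, preds_b)
ensemble = blend(preds_a, preds_b, weight)
print(f"weight on model A: {weight:.2f}")
print(f"ensemble log loss: {log_loss(treated, ensemble):.4f}")
```

Because the grid includes weights 0 and 1, the blended predictions can never have a higher held-out loss than the better single candidate, which is one intuition for why the combination behaved consistently across scenarios.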

Future Research Needs

Future studies could determine whether similar findings emerge when investigators identify confounders to include in the model a priori. Additional research could also explore factors that influence the performance of these algorithms.

Final Research Report

View this project's final research report.

Journal Articles

Related Articles

European Journal of Epidemiology

Theory meets practice: a commentary on VanderWeele's 'principles of confounder selection'

Journal of Applied Statistics

Propensity score prediction for electronic healthcare databases using super learner and high-dimensional propensity score methods

Statistical Methods in Medical Research

Scalable collaborative targeted learning for high-dimensional data

Clinical Epidemiology

Automated data-adaptive analytics for electronic healthcare data to study causal treatment effects

Nature Communications

Network-based approach to prediction and population-based validation of in silico drug repurposing

Epidemiology

Using Super Learner Prediction Modeling to Improve High-dimensional Propensity Score Estimation

Statistical Methods in Medical Research

Collaborative-controlled LASSO for constructing propensity score-based estimators in high-dimensional data

Epidemiology

Variable Selection for Confounding Adjustment in High-dimensional Covariate Spaces When Analyzing Healthcare Databases

More on this Project  

Peer-Review Summary

Peer review of PCORI-funded research helps make sure the report presents complete, balanced, and useful information about the research. It also assesses how the project addressed PCORI’s Methodology Standards. During peer review, experts read a draft report of the research and provide comments about the report. These experts may include a scientist focused on the research topic, a specialist in research methods, a patient or caregiver, and a healthcare professional. These reviewers cannot have conflicts of interest with the study.

The peer reviewers point out where the draft report may need revision. For example, they may suggest ways to improve descriptions of the conduct of the study or to clarify the connection between results and conclusions. Sometimes, awardees revise their draft reports twice or more to address all of the reviewers’ comments.

Peer review identified the following strengths and limitations in the report:

  • Reviewers commended the researchers on a methodologically sound study focused on improving statistical methods for confounder adjustment needed for causal inference models in patient-centered outcomes research. Many of the comments were requests for investigators to explain their highly technical methods and results in language that a general clinical researcher would understand.
  • Reviewers questioned the generalizability of the statistical approaches tested in this study, as simulation studies often have limited generalizability. The researchers acknowledged this problem but responded that they used three different data sets for the simulations, which would improve the overall generalizability. One specific advantage of their simulation approach, the authors stated, was that it preserved the complex relationships among baseline covariates, which would not be possible in other simulation frameworks.

Conflict of Interest Disclosures

View the COI disclosure form.

Project Details

Principal Investigator: Sebastian Schneeweiss, MD, MS, ScD
Project Status: Completed; PCORI Public and Professional Abstracts, and Final Research Report Posted
Project Title: Causal Inference for Effectiveness Research in Using Secondary Data
Board Approval Date: September 2013
Project End Date: July 2018
Organization: Brigham and Women's Hospital
Year Awarded: 2013
State: Massachusetts
Year Completed: 2018
Project Type: Research Project
Health Conditions: Cancer, Cardiovascular Diseases, Atrial Fibrillation, Stroke
Intervention Strategies: Drug Interventions
Funding Announcement: Improving Methods for Conducting Patient-Centered Outcomes Research
Project Budget: $1,102,522
DOI (Digital Object Identifier): 10.25302/7.2019.ME.13035638
Study Registration Information: HSRP20143575
Page Last Updated: September 10, 2019

Patient-Centered Outcomes
Research Institute

1828 L Street, NW, Suite 900
Washington, DC 20036
Phone: (202) 827-7700 | Fax: (202) 355-9558


© 2011-2021 Patient-Centered Outcomes Research Institute. All Rights Reserved.
