Project Summary

When doctors and researchers want to compare two treatments to see which leads to better outcomes for patients with a disease, one approach is to design and conduct a randomized controlled trial. In such trials, patients with a disease are randomly assigned to one of the treatments and then followed over time to measure changes in their health. However, there are times when conducting a randomized trial to compare treatments is very difficult or impossible. These include situations where a disease or outcome is very rare or when researchers want to understand the effectiveness of treatments for patients who may be less likely to enroll in a randomized trial, such as those who are frail or who live far from an institution running the trial. Randomized trials can also be difficult to carry out if the questions they ask do not attract investment from pharmaceutical companies, even though the questions may be very important to patients.   

Another way to compare the effectiveness of two treatments is to conduct an observational study: researchers review the medical records of patients who received each treatment in routine care from their doctors and analyze differences in outcomes associated with each treatment. However, observational research is vulnerable to a statistical problem called confounding, in which patient characteristics influence both which treatment the patient receives and how well the patient does with the treatment. For example, a patient with cancer might be prescribed a treatment with fewer side effects than an alternative because the patient is frail. Since patients who are frail often have poorer prognoses, the first treatment could artificially appear less effective than the alternative unless the patient’s frailty is taken into account.

Observational research therefore requires methods to adjust for these types of confounding characteristics. A traditional way to do this is to select a list of such characteristics, measure them for each patient, and then use them to estimate the probability that the patient would have received each treatment. Two patients who had similar probabilities of receiving one treatment, but who actually received different treatments, can then be matched and their outcomes compared. This is called propensity score modeling. Traditional propensity score modeling has challenges, including the fact that some important patient characteristics, such as level of frailty, are not recorded as structured values in the electronic health record. Capturing these characteristics can therefore require manually reviewing large quantities of text reports. A patient’s clinical status also shifts along the trajectory of a disease, so flexible methods are needed to capture patient characteristics at different moments in time.
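As a rough illustration of the matching idea, the Python sketch below runs on an entirely simulated cohort, not study data. Everything in it is an assumption made for illustration: the single confounder (a frailty score between 0 and 1), the simulated treatment and outcome rules, the 0.05 caliper, and the helper names `fit_propensity_model` and `match_pairs`. It fits a simple logistic model for the probability of receiving a treatment, pairs each treated patient with the untreated patient whose estimated probability is closest, and compares outcomes within pairs. It is not the study team's method.

```python
import math
import random


def propensity(weights, features):
    """Estimated probability of receiving the treatment under a logistic model."""
    z = weights[0] + sum(w * x for w, x in zip(weights[1:], features))
    return 1.0 / (1.0 + math.exp(-z))


def fit_propensity_model(X, treated, lr=1.0, epochs=1000):
    """Fit logistic regression by batch gradient descent (bias is weights[0])."""
    n, d = len(X), len(X[0])
    weights = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for x, t in zip(X, treated):
            err = propensity(weights, x) - t
            grad[0] += err
            for j in range(d):
                grad[j + 1] += err * x[j]
        weights = [w - lr * g / n for w, g in zip(weights, grad)]
    return weights


def match_pairs(scores, treated, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on the score, without replacement."""
    controls = [i for i, t in enumerate(treated) if t == 0]
    pairs = []
    for i, t in enumerate(treated):
        if t != 1 or not controls:
            continue
        j = min(controls, key=lambda c: abs(scores[c] - scores[i]))
        if abs(scores[j] - scores[i]) <= caliper:
            pairs.append((i, j))
            controls.remove(j)
    return pairs


def mean(values):
    return sum(values) / len(values)


# Simulated toy cohort (hypothetical): frail patients are more likely to get
# the treatment and also tend to do worse. The treatment itself has NO effect.
rng = random.Random(0)
n = 400
frailty = [rng.random() for _ in range(n)]
treated = [1 if rng.random() < 1 / (1 + math.exp(-3 * (f - 0.5))) else 0
           for f in frailty]
outcome = [10 - 5 * f + rng.gauss(0, 0.5) for f in frailty]

X = [[f] for f in frailty]
weights = fit_propensity_model(X, treated)
scores = [propensity(weights, x) for x in X]
pairs = match_pairs(scores, treated)

# Unadjusted comparison versus comparison within propensity-matched pairs.
naive_diff = (mean([y for y, t in zip(outcome, treated) if t == 1])
              - mean([y for y, t in zip(outcome, treated) if t == 0]))
matched_diff = mean([outcome[i] - outcome[j] for i, j in pairs])

print(f"naive difference:   {naive_diff:+.2f}")
print(f"matched difference: {matched_diff:+.2f} ({len(pairs)} pairs)")
```

Because the simulated treatment has no real effect, any gap in the unadjusted comparison comes from frailty alone, and the matched comparison should land closer to zero. The sketch works only because frailty is available as a number, which is the limitation the paragraph above describes for real records.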

To facilitate observational comparative clinical effectiveness research, this study will develop artificial intelligence (AI) methods for ascertaining patient characteristics from all available electronic health record data. Deep learning models will be trained on such data to predict later events, including which treatment the patient receives next and the patient’s prognosis, and in doing so will learn representations of how well a patient is doing at any given time. With a focus on patients being treated for cancer, researchers will use this method to compare the effectiveness of different chemotherapy drugs for patients with advanced disease. The study team will also collaborate with colleagues at a large, diverse health system to evaluate how well these approaches work when another health system applies them to its own data.

Project Information

Kenneth Kehl, M.D., MPH
Dana-Farber Cancer Institute
$1,061,984 *

Key Dates

36 months *
November 2023

*All proposed projects, including requested budgets and project periods, are approved subject to a programmatic and budget review by PCORI staff and the negotiation of a formal award contract.


Last updated: January 24, 2024