Observational studies play an important role in patient-centered outcomes research (PCOR) and comparative effectiveness research (CER), with the potential to address questions that are unlikely to be evaluated in clinical trials. For example, the COMPARE-UF registry was designed to evaluate treatment options for women with uterine fibroids, specifically myomectomy versus hysterectomy. As in all observational studies, procedural comparisons in COMPARE-UF may be unfair because of differences in the types of patients who select each intervention. Suppose women with worse symptoms, who are also more likely to have worse outcomes, tend to choose hysterectomy over myomectomy. Hysterectomy may then appear relatively harmful for reasons unrelated to the treatment itself.
To facilitate a fair comparison, COMPARE-UF used propensity score (PS) methods to make groups comparable, or balanced, on their measured characteristics. In addition, COMPARE-UF was designed to answer the question of what works best for whom. Subgroup analyses (SGA) were planned to determine whether certain subgroups of patients should receive different treatment recommendations because the comparative results of myomectomy versus hysterectomy differ among them. Investigators and stakeholders identified priority subgroups based on age, race, body mass index, baseline quality of life, prior procedures for uterine fibroids, reproductive history, and uterine size. For example, African-American women have higher rates of procedural complications, but whether this corresponds to different treatment effects is unknown.
SGA in COMPARE-UF could, for example, provide evidence that African-American women have better outcomes with myomectomy, but Caucasian women have better outcomes with hysterectomy. Incorrect or biased results in these SGAs could lead to a large group of women being advised to undergo hysterectomy when in fact myomectomy would be better (or vice versa). Despite the impact of SGA, methodological approaches and guidance for observational SGA are lacking.
This research will address a key reason why SGA may be biased: lack of balance (comparability) in the measured patient characteristics despite the use of PS methods. Standard assessment of balance, used to check the adequacy of PS adjustment overall, provides no guarantee of good comparability within subgroups. Therefore, conclusions about the treatment effect in subgroups are vulnerable to error. On the other hand, steps to improve subgroup balance will tend to increase uncertainty (that is, decrease precision) in the analysis. Without methods and guidance to address this tradeoff, incorrect results may be replaced by inconclusive results, failing to inform patients where better methods could provide conclusive recommendations.
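The gap between overall and subgroup balance, and the precision cost of weighting, can be illustrated with a minimal Python sketch. It is not the project's proposed methodology: the function names (`weighted_smd`, `effective_sample_size`), the choice of the absolute standardized mean difference as the balance measure, and the Kish effective sample size as a rough proxy for precision are all assumptions made for illustration, and the toy data are invented and have no connection to COMPARE-UF.

```python
import numpy as np

def weighted_smd(x, treated, w):
    """Absolute standardized mean difference of covariate x between arms under weights w."""
    x, treated, w = (np.asarray(a) for a in (x, treated, w))
    t = treated == 1
    m1 = np.average(x[t], weights=w[t])
    m0 = np.average(x[~t], weights=w[~t])
    # Pool the unweighted arm variances so the denominator does not depend on the weights
    s = np.sqrt((x[t].var(ddof=1) + x[~t].var(ddof=1)) / 2)
    return abs(m1 - m0) / s

def effective_sample_size(w):
    """Kish effective sample size: equals n for equal weights, shrinks as weights vary."""
    w = np.asarray(w, dtype=float)
    return w.sum() ** 2 / (w ** 2).sum()

# Invented toy data: one covariate, two subgroups (g = 0 or 1), unit weights
x = np.array([1, 2, 3, 4, 3, 4, 1, 2], dtype=float)
treated = np.array([1, 1, 0, 0, 1, 1, 0, 0])
g = np.array([0, 0, 0, 0, 1, 1, 1, 1])
w = np.ones_like(x)

smd_overall = weighted_smd(x, treated, w)
smd_by_subgroup = {
    int(k): weighted_smd(x[g == k], treated[g == k], w[g == k]) for k in np.unique(g)
}
ess_equal = effective_sample_size(w)
ess_unequal = effective_sample_size([3.0, 1.0])

print("overall SMD:", smd_overall)
print("subgroup SMDs:", smd_by_subgroup)
print("ESS, equal weights:", ess_equal, "| ESS, weights (3, 1):", ess_unequal)
```

In this constructed example the arms have identical covariate means overall, yet within each subgroup the arms differ sharply, so an overall balance check would miss the problem. The effective sample size shows the other side of the tradeoff: weights that vary more (as weights tuned for subgroup balance often do) reduce the information available for estimation.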
To address these gaps, we will develop measures to evaluate balance and precision during the design stage of SGA, increasing transparency and guiding the selection of PS methodology. Additionally, we will develop new PS weighting methods that incorporate machine learning to better optimize the competing goals in the conduct of SGA. The proposed research will advance methods for observational SGA, helping to ensure that PCOR/CER studies provide patients, physicians, and policy makers with correct and precise information regarding what works best for whom.
*All proposed projects, including requested budgets and project periods, are approved subject to a programmatic and budget review by PCORI staff and the negotiation of a formal award contract.