A goal in comparative effectiveness research (CER) is to understand the mechanisms through which treatments affect patient outcomes, especially when those mechanisms explain heterogeneous treatment effects (i.e., why a treatment may work well for some patients but not for others). Causal mediation analysis provides a framework for learning such mechanisms from data. It decomposes the total effect of a treatment on an outcome into the portion that operates through intermediate variables, called mediators (the indirect effect), and the portion that operates through all other mechanisms (the direct effect).
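One common way to write this decomposition, shown here only as an illustrative sketch, uses counterfactual notation for a binary treatment. The symbols are assumptions introduced for this example and do not appear in the text: Y(a, m) denotes the outcome that would be observed under treatment level a and mediator level m, and M(a) denotes the mediator value that would be observed under treatment level a. Under this notation, the so-called natural direct and indirect effects sum to the total effect; other definitions of direct and indirect effects exist and may be better suited to particular designs.

```latex
\begin{align*}
\underbrace{E[Y(1, M(1))] - E[Y(0, M(0))]}_{\text{total effect}}
  &= \underbrace{E[Y(1, M(1))] - E[Y(1, M(0))]}_{\text{indirect effect}} \\
  &\quad + \underbrace{E[Y(1, M(0))] - E[Y(0, M(0))]}_{\text{direct effect}}
\end{align*}
```

The identity holds by adding and subtracting the term E[Y(1, M(0))], which is the mean outcome under treatment when the mediator is held at the value it would have taken without treatment.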
Mediation analysis may reveal multiple pathways that offset each other in the aggregate, thereby explaining weak or null overall effects. When these causal pathways differ across patient subgroups, they may also reveal why a treatment works well for some patients but not others. Such mechanistic knowledge is critical for developing targeted, optimal strategies for patient-centered treatment.
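The cancellation described above can be illustrated with a small simulation. This is a minimal sketch under strong simplifying assumptions that are not taken from the text: a randomized binary treatment, a single mediator, linear models with no interactions, and no unmeasured confounding. The function name and the coefficients a, b, and c are hypothetical values chosen so that a positive direct effect (c = 0.5) and a negative indirect effect (a × b = −0.5) offset each other. It uses the simple product-of-coefficients estimator, not the flexible methodology this proposal aims to develop.

```python
import numpy as np


def simulate_and_decompose(n=200_000, a=1.0, b=-0.5, c=0.5, seed=0):
    """Simulate treatment A, mediator M, and outcome Y under a hypothetical
    linear model, then estimate direct, indirect, and total effects by OLS."""
    rng = np.random.default_rng(seed)
    A = rng.integers(0, 2, size=n).astype(float)  # randomized binary treatment
    M = a * A + rng.normal(size=n)                # mediator: affected by A
    Y = c * A + b * M + rng.normal(size=n)        # outcome: affected by A and M

    ones = np.ones(n)
    # Mediator regression M ~ 1 + A gives the effect of A on M.
    a_hat = np.linalg.lstsq(np.column_stack([ones, A]), M, rcond=None)[0][1]
    # Outcome regression Y ~ 1 + A + M gives the direct effect and the M slope.
    coef = np.linalg.lstsq(np.column_stack([ones, A, M]), Y, rcond=None)[0]
    c_hat, b_hat = coef[1], coef[2]
    # Regression Y ~ 1 + A gives the total effect.
    total_hat = np.linalg.lstsq(np.column_stack([ones, A]), Y, rcond=None)[0][1]

    return {"direct": c_hat, "indirect": a_hat * b_hat, "total": total_hat}


if __name__ == "__main__":
    for name, value in simulate_and_decompose().items():
        print(f"{name:>8}: {value: .3f}")
```

In this setup the estimated total effect is close to zero even though both pathways are active, which is the pattern a total-effect analysis alone would miss. If the linear models were misspecified, the same estimator could give misleading answers.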
Four key barriers hamper the reliability and accuracy of existing mediation methods.
First, methods for estimating heterogeneous mediation effects, which are the most useful for patient-level decision-making, are scarce.
Second, methods for mediation analysis in longitudinal studies are also scarce, and the few available methods impose strong structural assumptions that frequently conflict with reality. For example, available approaches do not allow for time-varying confounders measured before and after the mediator, fail to capture all relevant paths through which the treatment may operate, or both.
Third, mediation methods that leverage instrumental variables (IVs) for confounding control are in their infancy, even though treatment assignment is a ubiquitous, valid instrument in randomized trials.
Fourth, most existing approaches require strong modeling assumptions to reduce the complexity of the relations among confounders, exposures, mediators, and outcomes. These assumptions permit the use of standard regression tools such as logistic or linear regression, but if the models are wrong, the resulting analyses can lead to incorrect conclusions. In particular, parametric models may miss or mischaracterize complex heterogeneous treatment effects that involve interacting variables. Advances in flexible model fitting using machine learning can reduce reliance on correct model specification, but they are underutilized in causal inference in general and in mediation analysis in particular.
Consequently, there is a critical need to develop general mediation methodology compatible with real-world longitudinal data that integrates state-of-the-science flexible machine learning algorithms into model fitting. The rationale is that the development of such methodology will allow researchers to answer the scientific question at hand without assuming away true complexity in their data.
Our goal is to develop a general, flexible, and robust approach to estimating mediational direct and indirect effects that is appropriate for common study designs and that can identify patient heterogeneity in mediation mechanisms. We will apply the resulting methods to understand differential effects in a comparative effectiveness trial for the treatment of opioid use disorder.