Project Summary
With the broad adoption of electronic health records (EHRs), there has been a push to make use of them in health services applications. This has enabled the widespread construction and utilization of risk models, which are designed to help assess trends in patient health for a health system and to screen high-risk patients into interventions. Health system risk models such as hospital readmission risk models play a major role in this process, yet building accurate risk models is challenging and by and large these models have not shown sufficient performance for clinical use.
Several factors contribute to the lack of success in constructing accurate health system risk models. First, risk models are constructed in a one-size-fits-all fashion from large and diverse populations; however, the factors that drive risk vary substantially for patients with different medical histories. For example, the nature of the prior hospitalization plays a significant role in determining what impacts future health risk. Models should thus be designed separately for each subgroup of patients with clinically similar diagnoses; however, sample sizes within each subgroup are often small for a given health system. The high dimensionality of EHR data exacerbates the estimation issues that arise from small sample sizes within subgroups. Second, health risk models often binarize the outcome into an indicator of an event occurring within a window of time, such as readmission within 30 days. This practice ignores nuances in whether, when, and how frequently a patient has an adverse health event, and it ignores the relationship between the risk of the event and the risk of death.
To fill these methodological gaps, the study team will build on its prior work in risk modeling for heterogeneous populations from high-dimensional data. With this proposal, the study team seeks to improve the model-building process for health system risk models by designing statistical procedures that allow for subgroup-specific models and that address the small sample size issue by borrowing strength across related diagnosis groups in a data-driven manner. The study team also seeks to improve health system risk models by developing a statistical approach that jointly models time to the event of interest and time to death.
In particular, for aim 1, the study team proposes a framework that allows for subgroup-specific models and lets the data determine which subgroups to collapse into a single model, improving accuracy and interpretability. For aim 2, the study team proposes a joint modeling framework for both the event of interest and death, so that the risk of the event is characterized in terms of the time until the event occurs. This framework will also be designed to handle high-dimensional data and will allow for borrowing strength across the event and death models.
The frameworks proposed in these aims can be combined into a single framework that handles both patient heterogeneity and joint modeling of death and an event of interest, such as readmission. The research described in this application will build innovative statistical methods to produce accurate, reliable, and descriptive risk models for health system populations. The resulting risk models can be incorporated into the EHR to help clinicians and patients better understand patient health risk and what drives risk for a given patient.