Project Summary

One of PCORI’s goals is to improve the methods that researchers use for patient-centered outcomes research. PCORI funds methods projects like this one to better understand and advance the use of research methods that improve the strength and quality of comparative effectiveness research.

This research project is in progress. PCORI will post the research findings on this page within 90 days after the results are final.

What is the research about?

Data from electronic health records, or EHRs, are a helpful resource for researchers. By looking at information from many patients, researchers can see how treatments work over time. They can also compare the health of people who get different types of treatment. But EHR data often have errors. If a patient’s weight, medicine dose, or treatment start date is recorded incorrectly, studies using those data will also have errors.

The best way to get accurate EHR data is to review patients’ records closely. But reviewing all patient records is costly and takes time. In this study, the research team is working on ways to make studies with EHR data more accurate by reviewing some, but not all, records.

Who can this research help?

Results of this study can help researchers plan and carry out studies using data from EHRs.

What is the research team doing?

In this study, the research team is developing statistical methods that can estimate and account for possible errors in a large set of EHR data. The methods are based on the errors found when a small number of these records are reviewed. The team is also looking at which records, and how many records, to review. In addition, they are creating guides and analysis tools to help other researchers decide which records to review and how to get accurate study estimates after reviewing only part of the EHR data.
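
The general idea of using a reviewed subsample to correct a full data set can be shown with a very simple sketch. This is an illustration only and is not the research team's method, which addresses correlated errors in time-to-event outcomes and covariates. The sketch assumes a simple random sample of records is reviewed and uses a basic difference estimator for an average. The function names, the simulated weights, and the size of the errors are all hypothetical.

```python
import random
from statistics import mean


def difference_estimate(recorded, validated):
    """Correct the average of error-prone values using a reviewed subsample.

    recorded:  list of unreviewed EHR values, one per record.
    validated: dict mapping a record's index to its reviewed (true) value.
    Assumes the reviewed records are a simple random sample of all records.
    """
    if not validated:
        raise ValueError("need at least one validated record")
    naive = mean(recorded)  # average using every record, errors included
    # Average error (recorded minus true) seen in the reviewed records.
    avg_error = mean(recorded[i] - true for i, true in validated.items())
    return naive - avg_error


def simulate(n_records=5000, n_validated=500, seed=0):
    """Hypothetical data: true weights plus a systematic recording error."""
    rng = random.Random(seed)
    truth = [rng.gauss(70.0, 12.0) for _ in range(n_records)]
    # Assumed error pattern: values recorded about 2 units too high, plus noise.
    recorded = [t + 2.0 + rng.gauss(0.0, 3.0) for t in truth]
    audit_ids = rng.sample(range(n_records), n_validated)
    validated = {i: truth[i] for i in audit_ids}
    return mean(truth), mean(recorded), difference_estimate(recorded, validated)


if __name__ == "__main__":
    true_mean, naive_mean, corrected_mean = simulate()
    print(f"true {true_mean:.2f}  naive {naive_mean:.2f}  corrected {corrected_mean:.2f}")
```

In this sketch, reviewing 500 of 5,000 records is enough to remove most of the systematic error from the overall average. Choosing which records to review, rather than sampling at random, is the kind of design question the project studies.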

The research team is applying the statistical methods in a study that uses a large set of EHR data. That study looks for factors, such as mothers’ weight, that might affect children’s risk for obesity. After reviewing part of the data, the team is applying what it learns to improve estimates from the full data set.

Other medical researchers are helping the research team with this project. The team is also working with parents from the obesity study.

Research methods at a glance

Design Element: Description
Goal: Improve validity and efficiency of analyses of EHR-derived data with novel statistical methods that use data validation in targeted subsamples
  • Develop novel statistical methods that reduce or eliminate bias caused by correlated errors in time-to-event outcomes and covariates
  • Design optimal multi-wave validation/audit strategies and develop an audit/validation design tool kit
  • Apply methods and designs to a study of risk factors for early childhood obesity

Project Information

Bryan E. Shepherd, PhD
Pamela Shaw, PhD
Vanderbilt University Medical Center
Statistical Methods and Designs for Addressing Correlated Errors in Outcomes and Covariates in Studies Using Electronic Health Records Data

Key Dates

August 2016
June 2022

Last updated: March 4, 2022