Results Summary

What was the project about?

Researchers often combine data from different sources, such as insurance claims and health records, to get a better picture of patients’ health and use of health care. Researchers use unique identifiers, like Social Security numbers, to connect patient records and make them more complete. But sometimes this approach doesn’t work well, especially when records don’t have much personal information. Having limited personal data can lead to errors when linking records.

In this study, the research team created new methods to link data sets with limited personal information. Then they compared the new methods with existing ones. They also applied the new methods to real patient data.

What did the research team do?

The research team created two new methods to link records from two sources:

  • BRLVOF, which uses extra patient information from one source along with identifiers in both sources to link records
  • MLBRL, which matches information about patients and groups of patients, like patients who go to the same doctor
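To make the idea of linking on identifiers concrete, here is a minimal sketch of a classic field-by-field agreement score (the Fellegi-Sunter match weight from the record linkage literature). It is not BRLVOF or MLBRL, and this summary does not say which scoring the team used. The field names, the example records, and the probabilities in FIELD_PARAMS are all hypothetical values chosen for illustration.

```python
import math

# Hypothetical per-field probabilities, chosen only for illustration:
#   m = P(field agrees | the two records belong to the same patient)
#   u = P(field agrees | the two records belong to different patients)
FIELD_PARAMS = {
    "birth_year": (0.95, 0.02),
    "zip_code": (0.90, 0.01),
    "sex": (0.98, 0.50),
}


def match_weight(rec_a, rec_b, params=FIELD_PARAMS):
    """Sum the log-likelihood-ratio weights over all linking fields."""
    total = 0.0
    for field, (m, u) in params.items():
        if rec_a[field] == rec_b[field]:
            total += math.log2(m / u)              # agreement adds evidence for a link
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement subtracts evidence
    return total


# Made-up records: one from each source, plus an unrelated record.
claim = {"birth_year": 1941, "zip_code": "02903", "sex": "F"}
same_patient = {"birth_year": 1941, "zip_code": "02903", "sex": "F"}
other_patient = {"birth_year": 1955, "zip_code": "02912", "sex": "M"}

print(match_weight(claim, same_patient))   # positive: likely the same patient
print(match_weight(claim, other_patient))  # negative: likely different patients
```

A field like sex carries little weight because unrelated patients often agree on it by chance. This is one reason records with only a few such fields are hard to link.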

Then the research team created test data to see how the new methods perform in different situations. For example, they added mistakes to the personal information and then changed how many mistakes there were. They compared the new methods to current methods, which use only linking identifiers. The team looked at how well each method linked patient records in each situation.
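The testing approach described above can be sketched in a few lines. This is an illustration under stated assumptions, not the team's simulation design: the fields, the error mechanism (replacing a value with "?"), and the simple "most agreeing fields" linking rule are all hypothetical stand-ins.

```python
import itertools
import random

FIELDS = ("birth_year", "zip_code", "sex")


def make_records(n):
    """Build n made-up records whose identifier combinations are all distinct."""
    combos = itertools.product(range(1930, 1960), ["02903", "02906", "02912"], ["F", "M"])
    return [dict(zip(FIELDS, combo)) for combo in itertools.islice(combos, n)]


def add_errors(records, error_rate, rng):
    """Copy the records, blanking each field with probability error_rate."""
    noisy = []
    for rec in records:
        copy = dict(rec)
        for field in FIELDS:
            if rng.random() < error_rate:
                copy[field] = "?"  # stand-in for a missing or wrong value
        noisy.append(copy)
    return noisy


def link(file_a, file_b):
    """For each record in file_a, pick the file_b record that agrees on the most fields."""
    links = []
    for rec_a in file_a:
        scores = [sum(rec_a[f] == rec_b[f] for f in FIELDS) for rec_b in file_b]
        links.append(scores.index(max(scores)))  # ties go to the earliest record
    return links


def accuracy(error_rate, n=100, seed=0):
    """Share of records linked to their true partner at a given error rate."""
    rng = random.Random(seed)
    file_a = make_records(n)
    file_b = add_errors(file_a, error_rate, rng)  # same order, so record i pairs with i
    links = link(file_a, file_b)
    return sum(i == j for i, j in enumerate(links)) / n


for rate in (0.0, 0.1, 0.3, 0.5):
    print(rate, accuracy(rate))
```

Running the loop with different error rates shows how a method that relies only on linking identifiers degrades as the identifiers become less reliable.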

Using real patient data, the research team used the new methods to link a national injury list to Medicare data for patients who had a brain injury. They linked the two data sets to see if certain patient traits before the injury were related to patient recovery at a medical center.

Doctors gave input during the study.

What were the results?

With the test data, BRLVOF did better than the current methods at linking patient records when patient data were missing or wrong. MLBRL did better than the current methods at linking patient records within groups of patients getting care from the same doctor.

When the research team used the new methods to link real patient data, the results showed that patient traits before a brain injury were not related to patient health outcomes at care centers.

What were the limits of the project?

The research team tested the new methods only with data from patients who have Medicare. Results may differ for patients with other types of insurance.

Future research can check how well the new methods work for linking different data sources and data for patients with other health issues.

How can people use the results?

Researchers can use the new methods to link patient data from data sets that have limited personal information.

Final Research Report

This project's final research report is expected to be available by July 2024.

Peer-Review Summary

The Peer-Review Summary for this project will be posted here soon.

Project Information

Roee Gutman, PhD, MS, BSc
Brown University
Improving CER/PCOR Methods for Analyzing Linked Data Sources in the Absence of Unique Identifiers

Key Dates

August 2018
September 2023

Last updated: October 12, 2023