Results Summary
What was the project about?
Data from healthcare systems, patients and communities, and health plans can support health research. Two types of data sources are
- Patient-powered research networks, or PPRNs. In PPRNs, patients, families, caregivers, and community members share health data with the network. They work closely with researchers to plan and conduct research.
- Health plan research networks, or HPRNs. In HPRNs, networks of health plans have access to health claims data from members for research.
By linking patient records across PPRNs and HPRNs, researchers may be able to do more robust research. To link records, researchers use computer programs to connect the records of people in a PPRN with their claims data in an HPRN. Current methods to link records require use of personal information, such as names and dates of birth. But patients may not want to share this information.
In this project, the research team developed methods for linking data from PPRNs and HPRNs without using patients’ personal information.
What did the research team do?
The research team developed new methods that protected patients’ privacy. The team linked data from four PPRNs and an HPRN. For seven health problems, the team checked whether claims data in the HPRN confirmed the diagnoses that patients reported in PPRN data. The team then compared patients who had linked data with patients without linked data.
Next, the research team talked to nine patients from seven PPRNs. The team asked patients their views about taking part in research being done by HPRNs.
What were the results?
The methods linked data from 4,487 of the 21,616 PPRN patients with claims data in the HPRN.
For 50 to 75 percent of patients with linked records, claims data confirmed diagnoses that patients shared in PPRNs. The longer patients were enrolled in their health plan, the more likely claims data could confirm diagnoses in linked data.
Compared with patients without linked data, patients with linked data were younger, more likely to be women, and less likely to have several health problems.
In interviews, patients noted barriers to taking part in HPRN research:
- Changes in health plan enrollment
- Lack of trust in health plans
- Fear of data breaches or reduced insurance benefits
What were the limits of the project?
Errors in claims data could affect this study’s findings. The nine interviews may not represent all views on taking part in HPRN research.
Future research could compare the methods in this study with methods that use personal information with proper permissions from patients.
How can people use the results?
To maintain privacy, researchers can use the methods to link data between PPRNs and HPRNs without using patients’ personal information.
Professional Abstract
Background
Comparative effectiveness research can use data gathered from various sources. These sources include
- Patient-powered research networks (PPRNs). Patients, caregivers, and community members provide self-reported data and contribute to all aspects of research.
- Health plan research networks (HPRNs). Through medical, pharmacy, and laboratory health claims, health plans provide member data for research.
Linking patient records from PPRNs and HPRNs enables robust research. To link records from these sources, researchers often use patient identifiers, such as name and birthdate. However, using personal data without member permission may raise privacy concerns. Privacy-preserving record linkage methods can support robust research without the exchange of personal information.
Objective
(1) To develop a privacy-preserving data linkage methodology between PPRNs and HPRNs; (2) To understand patient perceptions of HPRN health outcomes research.
Study Design
Design Elements | Description |
---|---|
Design | Empirical analysis |
Data Sources and Data Sets | 4 disease-specific PPRNs with data from 21,616 members who have, or support people with, one of the following conditions: breast or ovarian cancer, inflammatory arthritis, multiple sclerosis, or vasculitis; longitudinal administrative health claims data within the HealthCore Integrated Research Environment for 60 million patients with commercial health insurance; semistructured interviews with 9 patients from 7 PPRNs |
Analytic Approach | Hashing of identifiable information for anonymous linkage between PPRN and HPRN data; descriptive statistics to analyze overlap in linked data; comparison of linked data with unmatched reference group; Poisson regression; thematic content analysis of semistructured interviews |
Outcomes | Patient counts; claims-based diagnosis confirmation rates; patient perceptions |
Methods
Using the data exchange technique called hashing, the research team developed a privacy-preserving data linkage method that anonymously linked and identified overlapping participants within four PPRNs and an HPRN.
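As a rough illustration of how hash-based linkage can work in general, the sketch below has each network turn identifiers into a keyed hash locally and then compare only the resulting tokens. The abstract states only that hashing was used; the choice of HMAC-SHA256, the fields hashed, the normalization rules, the shared key, and the names `normalize`, `link_token`, and `link_records` are all assumptions made for this example, not the study's actual method. The people and dates are invented.

```python
import hashlib
import hmac
import unicodedata

# Hypothetical shared secret (an assumption for this sketch); both networks
# would need the same key for their tokens to be comparable.
SHARED_KEY = b"example-key-not-for-production"


def normalize(value: str) -> str:
    """Strip accents, lowercase, and keep only letters and digits so that
    trivial formatting differences do not change the hash."""
    decomposed = unicodedata.normalize("NFKD", value)
    ascii_only = decomposed.encode("ascii", "ignore").decode("ascii")
    return "".join(ch for ch in ascii_only.lower() if ch.isalnum())


def link_token(first: str, last: str, birthdate: str) -> str:
    """Return a keyed hash (hex string) of the normalized identifiers."""
    message = "|".join(normalize(part) for part in (first, last, birthdate))
    return hmac.new(SHARED_KEY, message.encode("utf-8"), hashlib.sha256).hexdigest()


def link_records(pprn_rows, hprn_rows):
    """Match rows on token only. Each row is (local_id, token); the result
    is a list of (pprn_id, hprn_id) pairs with equal tokens."""
    hprn_by_token = {token: hprn_id for hprn_id, token in hprn_rows}
    return [
        (pprn_id, hprn_by_token[token])
        for pprn_id, token in pprn_rows
        if token in hprn_by_token
    ]


# Each network hashes its own records locally and shares only the tokens.
pprn = [
    ("P1", link_token("María", "O'Neil", "1970-03-05")),
    ("P2", link_token("Sam", "Lee", "1982-11-30")),
]
hprn = [
    ("H9", link_token("maria", "ONeil", "1970-03-05")),
    ("H7", link_token("Alex", "Kim", "1990-01-01")),
]
linked = link_records(pprn, hprn)
```

In this sketch the two spellings of the first person's name normalize to the same string, so their tokens match, while neither side sees the other's names or birthdates. Exact-match hashing of this kind cannot tolerate typos or changed names, which is one reason real linkage systems may add further steps.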
The research team calculated confirmation rates that measured agreement between PPRN self-reported diagnoses and HPRN claims-based diagnoses for seven clinical conditions. The team evaluated how length of enrollment in a health plan could affect confirmation rates. The team also compared participants with linked data to a reference group of HPRN participants with the same claims-based diagnoses whose data did not link with PPRN participants.
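One simple way to express a confirmation rate is the share of linked participants whose self-reported diagnosis also appears in their claims, optionally split by length of enrollment. The sketch below assumes that definition; the records, the 12-month cutoff, and the names `confirmation_rate` and `rate_by_enrollment` are invented for illustration and do not come from the study, which also used Poisson regression that is not shown here.

```python
from collections import defaultdict

# Invented linked records: (self-reported condition, set of claims-based
# conditions, months of continuous health plan enrollment).
linked_records = [
    ("multiple sclerosis", {"multiple sclerosis"}, 36),
    ("vasculitis", set(), 4),
    ("vasculitis", {"vasculitis"}, 30),
    ("breast cancer", {"breast cancer", "hypertension"}, 18),
    ("multiple sclerosis", {"multiple sclerosis"}, 6),
    ("breast cancer", set(), 2),
    ("inflammatory arthritis", set(), 24),
    ("ovarian cancer", set(), 3),
]


def confirmation_rate(records):
    """Share of records whose self-reported condition appears in claims."""
    if not records:
        return None
    confirmed = sum(1 for reported, claims, _ in records if reported in claims)
    return confirmed / len(records)


def rate_by_enrollment(records, cutoff_months=12):
    """Confirmation rate for enrollment below vs. at or above the cutoff."""
    groups = defaultdict(list)
    for record in records:
        key = "long" if record[2] >= cutoff_months else "short"
        groups[key].append(record)
    return {key: confirmation_rate(rows) for key, rows in groups.items()}


overall = confirmation_rate(linked_records)
by_enrollment = rate_by_enrollment(linked_records)
```

Splitting by enrollment length reflects the intuition in the text: a participant enrolled for only a few months has had less time to generate a claim carrying the diagnosis, so a missing claim does not necessarily mean the self-report is wrong.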
To understand patient perspectives on HPRN research outreach, the research team conducted nine semistructured telephone interviews with patients from seven PPRNs.
PPRN leadership and patients helped develop the method.
Results
The method linked data from 4,487 of the 21,616 PPRN participants to administrative health claims data from an HPRN. Diagnosis confirmation rates for the linked participants ranged between 50% and 75%. Confirmation rates increased with longer health plan enrollment.
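For context, the reported counts imply that roughly one in five PPRN participants could be linked. The short calculation below uses only the two counts given above; the variable names are illustrative.

```python
linked = 4_487   # PPRN participants linked to HPRN claims data
total = 21_616   # all PPRN participants in the four networks

linkage_rate = linked / total
unlinked = total - linked

print(f"Linked: {linkage_rate:.1%} ({linked:,} of {total:,})")
print(f"Not linked: {unlinked:,}")
```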
Compared with the reference group, PPRN participants with linked data were younger (50 versus 56 years, p<0.001), more likely to be female (90% versus 76%, p<0.001), and less likely to have multiple comorbid conditions (p<0.001).
In interviews, barriers to HPRN outreach included health plan enrollment changes, lack of trust in HPRNs, and patients’ fears of data breaches and reduced healthcare benefits.
Limitations
Claims data may have coding errors that could cause misclassifications and inaccurate estimates of disease states. The study included nine interviews with PPRN patients; findings may not represent all perspectives.
Conclusions and Relevance
Researchers can use the privacy-preserving data linkage methods developed in this study to link PPRN and HPRN data.
Future Research Needs
Future research could compare the linkage methods used in this study with fully identifiable record linkage methods that have patients’ permissions.
Final Research Report
View this project's final research report.
Peer-Review Summary
Peer review of PCORI-funded research helps make sure the report presents complete, balanced, and useful information about the research. It also assesses how the project addressed PCORI’s Methodology Standards. During peer review, experts read a draft report of the research and provide comments about the report. These experts may include a scientist focused on the research topic, a specialist in research methods, a patient or caregiver, and a healthcare professional. These reviewers cannot have conflicts of interest with the study.
The peer reviewers point out where the draft report may need revision. For example, they may suggest ways to improve descriptions of the conduct of the study or to clarify the connection between results and conclusions. Sometimes, awardees revise their draft reports twice or more to address all of the reviewers’ comments.
Peer reviewers commented and the researchers made changes or provided responses. Those comments and responses included the following:
- The reviewers asked why the researchers did not prespecify minimum continuous health plan enrollment periods for their study participants. The reviewers noted that allowing for any enrollment period might have affected the researchers’ ability to confirm self-reported diagnoses through health plan records. The researchers responded that they did not set a minimum enrollment period because they wanted to include as many patient-powered research network members as possible whose health plan data could be linked. To address the reviewers’ concern, they reported the duration of continuous health plan coverage.
- The reviewers asked for more details about how the researchers selected patient-powered research network representatives for interviews and whether they considered diversity, such as in gender and age. The researchers explained that, given limits in budget and time, they sought a single patient interviewee from each of the seven research networks of interest to ensure representation from each group. Because the researchers did not feel well positioned to identify patient representatives themselves, they asked research network leaders to help identify potential interviewees. The researchers received a list of nine patient representatives from the seven networks who were experienced research participants, and they interviewed all nine. The researchers commented that future work could broaden the representation of patient representatives. At this stage, the researchers said they were most interested in assessing the views of highly engaged patient network representatives.