Results Summary

What was the project about?

Clinical notes in electronic health records, or EHRs, can help researchers study treatments. For example, EHR notes may contain information about whether patients take their medicines as directed. But it takes researchers a lot of time to find this information.

Natural language processing, or NLP, methods can help researchers find information in EHR notes. With NLP, computer programs read written language and identify key information, making it easier to sort and study. But current NLP methods don’t work well to find and label text about medicine use.

In this study, the research team created and tested a new NLP method to find and label EHR notes on patients’ medicine use.

What did the research team do?

First, the research team created a new NLP method to find and label text on medicine use in EHR notes. To train the method, the team used 2,250 sentences from EHR notes that a researcher had labeled as describing medicine use. The team then tested the method on 600 different sentences.
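The train-then-test workflow described above can be illustrated with a toy sketch. This is not the study's method (which used a BERT-based model); it stands in a simple, hypothetical keyword labeler and a handful of made-up test sentences to show how accuracy on held-out sentences is measured.

```python
# Illustrative sketch only: a toy keyword-based labeler standing in for the
# study's NLP method, evaluated on held-out test sentences as described above.
# The keywords and sentences are hypothetical, not from the study's data.

NONADHERENCE_KEYWORDS = {"missed", "stopped", "refuses", "ran out"}

def label_sentence(sentence: str) -> bool:
    """Return True if the sentence appears to describe medicine use."""
    text = sentence.lower()
    return any(kw in text for kw in NONADHERENCE_KEYWORDS)

# Hypothetical held-out test sentences with gold labels
# (True = describes medicine use).
test_set = [
    ("Patient missed several doses last week.", True),
    ("She stopped taking lisinopril due to cough.", True),
    ("Blood pressure well controlled today.", False),
    ("Patient ran out of metformin refills.", True),
]

correct = sum(label_sentence(s) == gold for s, gold in test_set)
accuracy = correct / len(test_set)
print(f"accuracy = {accuracy:.2f}")
```

In the study, the labeler was a trained model rather than a keyword list, and the test set held 600 sentences rather than four, but the evaluation logic is the same: compare predicted labels against a researcher's gold labels on sentences the method never saw during training.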

Next, the research team tested how well the new NLP method worked. They compared the method’s findings on medicine use with prescription claims data and with readmissions data for psychiatric care. Then they checked the method’s accuracy using EHR notes from another data set, which had 1,018 sentences from EHR notes for 923 patients.

Patients, doctors, a caregiver, and a patient advocate helped design the study.

What were the results?

The new NLP method could accurately find and label text on medicine use in EHR notes. This text included how often a patient took medicine as directed and reasons for not taking medicine. Across test data, the method had a high level of accuracy in labeling text.

When comparing the results of the new NLP method with findings from other data sources, the research team found that:

  • Only one-third of patients with prescription claims data had EHR notes that documented that they took the medicine as directed.
  • Among patients who received psychiatric care, use of medicine as directed decreased as readmissions for psychiatric care increased.

The new NLP method also showed a high level of accuracy when using EHR notes from another EHR data set.

What were the limits of the project?

The new NLP method may incorrectly label text as describing medicine use when it doesn’t, such as text about wheelchair use. Such mislabeling would make results less accurate.

Future studies could improve the method by training it using EHR notes from patients who have other health problems and receive different types of treatments.

How can people use the results?

Researchers can use this method in studies using EHR notes to look at patients’ medicine use.

Final Research Report

This project's final research report is expected to be available by December 2024.

Peer-Review Summary

Peer review of PCORI-funded research helps make sure the report presents complete, balanced, and useful information about the research. It also assesses how the project addressed PCORI’s Methodology Standards. During peer review, experts read a draft report of the research and provide comments about the report. These experts may include a scientist focused on the research topic, a specialist in research methods, a patient or caregiver, and a healthcare professional. These reviewers cannot have conflicts of interest with the study.

The peer reviewers point out where the draft report may need revision. For example, they may suggest ways to improve descriptions of the conduct of the study or to clarify the connection between results and conclusions. Sometimes, awardees revise their draft reports twice or more to address all of the reviewers’ comments.

Peer reviewers commented and the researchers made changes or provided responses. Those comments and responses included the following:

  • The reviewers questioned the researchers’ use of hospital readmission as a proxy for medication adherence, noting that there are other potential reasons for readmission. The researchers stated that they did not intend to convey the message that medication nonadherence was the primary cause for readmission and edited the report to clarify that medication nonadherence was one of the likely factors for hospital readmission.
  • The reviewers noted that in two of the evaluation experiments, the natural language processing (NLP) tool had limited agreement with other proxy measures of medication adherence. While the limitations of the proxy measures were discussed, the reviewers asked the researchers to comment on the limitations of the NLP tool for identifying medication adherence. The researchers added to their discussions of the three experiments, explaining that the NLP tool could only identify medications mentioned in the same sentence as a nonadherence keyword and often providers do not refer to a specific medication in their notes. In addition, the researchers noted that at times the NLP tool would tag adherence language that did not refer to medication adherence at all. Future research would need to identify a classifier that could differentiate between treatment adherence and other types of adherence.
  • The reviewers questioned the generalizability of the information gained from focus groups of patients who were hospitalized for depression. They noted that individuals in inpatient care were likely to be least adherent to medication. The researchers acknowledged that their recruitment in inpatients could have biased the results, but also noted that if they had recruited individuals with less severe depression or better controlled depression, they might have missed some more serious themes related to nonadherence.
  • The reviewers asked the researchers to expand the information they provided about developing the NLP instrument, including the methods used and the terminology related to NLP. The researchers noted that the primary model used, BERT (Bidirectional Encoder Representations from Transformers), is very complex, with millions of parameters, and felt that a full explanation would be too technical to be useful for the report’s intended audience. They did, however, define other machine-learning terminology for a lay audience.
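The same-sentence constraint discussed in the reviewer comments above can be sketched in a few lines. This is an illustration under assumptions, not the study's code: the drug names and nonadherence keywords are hypothetical examples, and the study's actual tool used a trained model rather than fixed lists.

```python
# Illustrative sketch of the same-sentence constraint described above: a
# medication is linked to nonadherence only when both appear in one sentence.
# Drug names and keywords are hypothetical, not the study's lexicon.
import re

MEDICATIONS = {"metformin", "lisinopril", "sertraline"}
NONADHERENCE = {"missed", "stopped", "nonadherent", "ran out"}

def linked_medications(note: str) -> list[str]:
    """Return medications co-occurring with a nonadherence keyword in a sentence."""
    hits = []
    # Naive sentence split on terminal punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", note):
        text = sentence.lower()
        if any(kw in text for kw in NONADHERENCE):
            hits.extend(med for med in MEDICATIONS if med in text)
    return hits

note = ("Patient stopped sertraline two weeks ago. "
        "Continues metformin without issue.")
print(linked_medications(note))  # sertraline is linked; metformin is not
```

The sketch also shows why the constraint causes misses: a note saying only "Patient has not been taking the medication" names no drug, so nothing in `MEDICATIONS` can be linked even though nonadherence is clearly documented.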

Conflict of Interest Disclosures

Project Information

Kirk Roberts, PhD
The University of Texas Health Science Center at Houston
NLP for Medication Adherence: Complex Semantics and Negation

Key Dates

March 2023

Study Registration Information


Last updated: March 15, 2024