Project Summary

One of PCORI’s goals is to improve the methods that researchers use for patient-centered outcomes research. PCORI funds methods projects like this one to better understand and advance the use of research methods that improve the strength and quality of comparative effectiveness research.

What is the project about?

Electronic health records, or EHRs, contain data that researchers can use to test treatments and improve patient care. Some of the most detailed information in EHRs is in clinicians’ notes. But clinicians don’t write notes in a standard way. For example, a single sentence in a note may cover one topic or several. As a result, researchers may have a hard time getting accurate information from these notes using current methods.

In this study, the research team is developing new methods that use natural language processing, or NLP, to extract data from clinicians’ notes for use in research. In NLP, computer programs interpret written language and make it easier to sort and study. The new methods help researchers correctly identify complex concepts in clinicians’ notes. For example, the methods help identify where a topic starts and ends. They can also tell whether a statement is positive or negative, that is, whether it affirms something or negates it.
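The two ideas above, finding where a topic starts and ends and telling positive statements from negative ones, can be illustrated with a toy sketch. This is not the research team's method: the project uses supervised NLP, while the sketch below uses simple hand-written rules. The cue list, the function names, and the sample sentence are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical negation cues for illustration only; not taken from the project.
NEGATION_CUES = ("no", "not", "denies", "denied", "never", "without")


def split_topics(sentence):
    """Naively split a sentence into topic segments at ';' or the word 'but'."""
    parts = re.split(r";|\bbut\b", sentence)
    # Drop surrounding spaces and punctuation, and skip empty pieces.
    return [p.strip(" .,") for p in parts if p.strip(" .,")]


def is_negated(segment):
    """Return True if the segment contains a negation cue as a whole word."""
    words = re.findall(r"[a-z']+", segment.lower())
    return any(w in NEGATION_CUES for w in words)


def label_note(sentence):
    """Pair each topic segment with a 'positive' or 'negative' label."""
    return [
        (seg, "negative" if is_negated(seg) else "positive")
        for seg in split_topics(sentence)
    ]


# Made-up example sentence covering two topics.
note = "Patient denies smoking; currently taking a statin for high cholesterol."
for segment, polarity in label_note(note):
    print(f"{polarity}: {segment}")
# prints:
# negative: Patient denies smoking
# positive: currently taking a statin for high cholesterol
```

Fixed rules like these break down quickly on real clinicians’ notes, where topics are not cleanly separated by punctuation and a negation may apply to only part of a sentence. That difficulty is what the project's methods aim to address.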

How can this project help improve research methods?

Researchers can use the results to obtain more accurate data from clinicians’ notes in EHRs.

What is the research team doing?

The research team is using two types of NLP software to create the new methods and is evaluating how well each method works. The team is developing the methods using three clinical examples drawn from EHR data from a healthcare system in Massachusetts:

  • Medicine use in patients with high cholesterol
  • Patient and provider discussion of surgery to help patients lose weight
  • Patient history of smoking

To see if the new methods work with data from different healthcare organizations, the research team is testing the methods with EHR data from a healthcare system in Maryland. The team is also comparing the new methods with existing NLP methods.

Research methods at a glance

Design Element | Description
Goal | To improve the accuracy of NLP extraction of EHR data by developing and comparing methods for improving identification of (1) the start and end of topics and (2) negative vs. positive statements
Approach | Supervised NLP technology

Project Information

Alexander Turchin, MD, MS
Brigham and Women's Hospital
Using Topic Segmentation to Enhance Concept Parsing and Identification of Negations

Key Dates

November 2019
April 2024

Has Results
Last updated: March 14, 2024