DN-1: Requirements for the design and features of data networks

Data networks established for conducting PCOR must have the following characteristics to facilitate valid, useable data and to ensure appropriate privacy, confidentiality, and intellectual property protections:

  1. Data Integration Strategy—In order for equivalent data elements from different sources to be harmonized (treated as equivalent), processes should be created and documented that describe the quality and completeness of the data integration. Processes should also be created and documented that either 1) transform data elements prior to analysis or 2) make transformation logic available that can be executed when data are extracted. The selected approach should be based on an understanding of the research domain of interest.
  2. Risk Assessment Strategy—If data are exchanged between data partners, data custodians should develop policies for the management of the risk of use of the data other than the agreed-upon use. This should include agreements for how data will be handled and how time limits on the data will be enforced.
  3. Identity Management and Authentication of Individual Researchers—Develop reliable processes for verifying credentials of researchers who are granted access to a distributed research network and for authenticating them.
  4. Intellectual Property Policies—A research network should develop policies for the handling and dissemination of intellectual property (IP); networks should also have an ongoing process for reviewing and refreshing those policies. IP can include data, research databases, papers, reports, patents, and/or products resulting from research using the network. Guidelines should balance 1) minimizing impediments to innovation in research processes, 2) determining whether or how IP belongs to the patients or research participants, and 3) making the results of research widely accessible, particularly to the people who need them the most.
  5. Standardized Terminology Encoding of Data Content—The data contents should be represented with standardized terminology systems to ensure that their meaning is unambiguously and consistently understood by parties using the data.
  6. Metadata Annotation of Data Content—Semantic and administrative aspects of data contents should be annotated with a set of metadata items. Metadata annotation helps to correctly identify the intended meaning of a data element and facilitates an automated compatibility check among data elements.
  7. Common Data Model—Individual data items should be assembled into a contextual environment that shows close or distant association among data. A common data model (CDM) specifies necessary data items that need to be collected and shared across participating institutes, clearly represents these associations and relationships among data elements, and promotes correct interpretation of the data content.
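
As an illustration only, and not part of the Standard, the second option under Data Integration Strategy (transformation logic that can be executed when data are extracted) might be sketched as follows. The site names, field names, and codes are hypothetical. The sketch assumes each data partner documents a mapping from its local element to a common element, along with any known limits to the equivalence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ElementMapping:
    """Documented rule for harmonizing one source element into a common element."""
    source: str        # hypothetical data partner name
    source_field: str  # field name in the partner's local schema
    target_field: str  # field name in the (hypothetical) common data model
    value_map: dict    # local code -> common code
    note: str = ""     # known limits of the equivalence


# Hypothetical mappings for two partners that encode the same concept differently.
MAPPINGS = [
    ElementMapping("site_a", "gender", "sex", {"M": "male", "F": "female"},
                   note="site_a records administrative gender only"),
    ElementMapping("site_b", "sex_cd", "sex", {"1": "male", "2": "female"}),
]


def transform(source, record, mappings=MAPPINGS):
    """Apply the documented mappings for one source at extraction time."""
    out = {}
    for m in mappings:
        if m.source != source or m.source_field not in record:
            continue
        # Unmapped local codes become None so gaps stay visible rather than guessed.
        out[m.target_field] = m.value_map.get(record[m.source_field])
    return out


def completeness(source, records, target_field, mappings=MAPPINGS):
    """Fraction of records whose target field is populated after transformation."""
    if not records:
        return 0.0
    filled = sum(
        transform(source, r, mappings).get(target_field) is not None
        for r in records
    )
    return filled / len(records)
```

Keeping the mapping as data rather than burying it in code means the same object can serve as both the executable logic and part of the process documentation. The `completeness` helper gives one simple way to report how much of a source survived harmonization.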

Public comments

With respect to this Standard, AcademyHealth would recommend that PCORI make a few modest changes and address the following points that are in need of clarification: Within the “Data Integration Strategy” bullet, in clause 2, PCORI’s request for researchers to “make transformation logic” available is not so easily done. Often, it involves individuals needing both the code to transform the data as well as significant process documentation to define mapping strategies. Although this isn’t a simple process, a note from PCORI within this Standard to specify what it means by ‘logic’ could be helpful for researchers undertaking this process. Next, we would ask PCORI for additional illumination on the following bullet regarding “Risk Assessment Strategy.” AcademyHealth assumes the greatest issue on this point is the handling risk of personal health information being released. However, in such a case, these issues are addressed in a data use agreement (DUA). Barring the DUA case, what is PCORI’s threshold for a ‘policy?’ If PCORI is merely specifying that a DUA should be in place, and that it includes the aforementioned issues, we recommend that PCORI be precise in the Standard language. Finally, when PCORI refers to the “standardized terminology systems” in the fifth bullet, it’s unclear to which system(s) PCORI is referring—the ONC-endorsed standardized terminologies or some other resource providing standardized nomenclature? There is a vast array of standardized terminologies in the health care ether, and more specific guidelines from PCORI on its preferred set of terminology standards—whether SNOMED, LOINC, or RxNORM—would be useful.

Lisa Simpson, AcademyHealth, Stakeholder - Other, 04/11/2016 - 4:44pm

• DN-1: A. The Data Integration Strategy: The standards do not specifically address considerations about assessment of the validity of the data sources, which should be integrated to a data network. For example, a data source might be technically feasible to be part of a network, but the data quality/integrity might not be sufficient to avoid compromising results when that data might be used in conjunction with data from the other sources of the data network. Therefore, we recommend PCORI adds some language about a data quality/integrity assessment prior to integration (e.g. completeness of data, data quality assurance measures implemented by the source) to this area of the standard.
• In addition, we recommend an additional characteristic under DN-1 emphasizing that the foundation of any data integration strategy is a clear description of the equivalence assessment of the data items. There should be documentation, which assesses the reason why a data item is judged to be equivalent to the same data item in another data source and any limitations to that equivalence. Suggested language: DN-1: Data Quality and Equivalence Evaluation: In order to assure a robust foundation of the data network, the data equivalence evaluation for all involved data sources against each other should be documented and any limitations should be clearly outlined. Data Quality assurance measures of the data sources should be assessed and documented. Any limitations imposed on the Data Network due to quality limitations of single data sources should be evaluated and documented.
• DN-1: B. Risk Assessment Strategy: This standard should be expanded, or a new one should be added, to help researchers consider and address the physical security of the data and data platforms used to access and utilize data from data networks.
• DN-1: B. Risk Assessment Strategy: In addition to the risk of re-identification concerns covered in this part of the standard, additional privacy concerns could be addressed. Suggested language: Data custodians should assure that data privacy/consents of the original data source cover the intended usage of the data through the data network.

Eli Lilly and Company, Industry, 03/30/2016 - 2:23pm

Question: Do you think that under (G.) Common Data Model that this is an acceptable definition of a CDM? It seems correct, but not very well constructed. (I don't have a specific alternative to offer on this one. Please look at and decide.) {EBort}

{EBort} Merck & Co Inc, Industry, 02/23/2016 - 2:38pm

DN-2: Selection and use of data networks

Researchers planning PCOR studies relying on data networks must ensure that these networks meet the requirements contained in Standard DN-1, and they must document each required feature of the data network(s) to be used (e.g., in an appendix to the funding application or study protocol). Deviations from the requirements should be justified by explaining why a required feature is not feasible or not necessary to achieve the overall goals of Standard DN-1.

Public comments

No comments.

General feedback on the Standards for Data Networks as Research-Facilitating Structures

Public comments

As a formatting note, this standard along with Standards 3, 4, and 6 contain information and best practices for how to manage data prior to analysis and ensuring data integrity. Although it is beneficial to introduce concepts of data quality and data integrity for both management and analysis (and these are certainly related), consider having one complete standard for addressing all aspects of handling data prior to analysis.

Eli Lilly and Company, Industry, 03/30/2016 - 2:23pm
