Submitted public comments follow the text of the proposed methodology standard
Data management is a critical phase in clinical research that contributes to the generation of high-quality, reliable, and statistically sound data from clinical trials and observational studies. The underlying motivation for good data management practice is to ensure that the data are accessible, sustainable, and reproducible, both for future investigators who may want to use them and, importantly, for the original research team. This standard applies to both the quantitative and qualitative data collected in a study.
A data management plan (DMP) is a document that describes the data to be generated by a research study, how those data will be managed and stored, who will have access, what documentation and metadata will be created with the data, how the data will be preserved, and how the data will be shared in support of future scientific inquiries. DMPs are distinct from statistical analysis plans, which articulate the planned statistical analyses associated with the study (e.g., the statistical tests to be used to analyze the data and how missing data will be accounted for in the analysis).
Study investigators should monitor their own data management procedures to ensure quality control. This includes checks that manually entered subject numbers conform to study-defined site/subject number format rules, as well as real-time review of data to verify their accuracy and validity.
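As an illustration of the format check described above, the following is a minimal sketch in Python. The format rule (a three-digit site code, a hyphen, and a four-digit subject number) and the function name are assumptions made for this example only; an actual study would substitute the rule defined in its own protocol.

```python
import re

# Hypothetical format rule for illustration: three-digit site code,
# a hyphen, then a four-digit subject number (e.g., "001-0042").
SUBJECT_ID_PATTERN = re.compile(r"[0-9]{3}-[0-9]{4}")


def find_invalid_subject_ids(subject_ids):
    """Return the entries that do not match the assumed site/subject format.

    Leading and trailing whitespace is ignored, since it is a common
    artifact of manual entry rather than a true format violation.
    """
    return [
        sid for sid in subject_ids
        if not SUBJECT_ID_PATTERN.fullmatch(sid.strip())
    ]


entered = ["001-0042", "01-0042", "001-00A2", " 002-0007 "]
print(find_invalid_subject_ids(entered))  # ['01-0042', '001-00A2']
```

A check of this kind could be run at the point of data entry or as part of routine data review, with flagged entries returned to the site for correction.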
DMPs should include language that, at a minimum, addresses each of the following considerations:
- Collecting data: Based on the hypotheses and sampling plan, describe the data that will be generated, and how it will be collected. Provide descriptive documentation of the data collection rationale and methods, and any relevant contextual information.
- Organizing data: Decide and document how data will be organized within a file, what file formats will be used, and the types of data products that will be generated.
- Handling data: Describe and document who is responsible for managing the data, how version control will be managed, the data handling rules, the method and frequency for backing up the data, and how confidentiality and personal privacy will be protected.
- Describing data: Describe how a data dictionary and metadata record will be produced, including the metadata standard and tools that will be used.
- Storing and preserving data: Implement a data storage and preservation plan that ensures that both the raw data and analytic files can be recovered in the event of file loss. The data storage and preservation plan, including the approach to data recovery (e.g., storing data routinely in different locations), should be documented.
- Maintaining data: Develop a plan for maintaining the data in a data repository, consistent with PCORI’s policy on Data Access and Data Sharing.
- Sharing data: Develop a plan for sharing data with the project team, with other collaborators, and with the broader scientific community, consistent with PCORI’s policy on Data Access and Data Sharing.
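One way to confirm that raw data and analytic files stored in a second location remain recoverable is to compare checksums between the primary and backup copies. The sketch below is illustrative only: the function names, the directory layout, and the choice of SHA-256 are assumptions for this example, not requirements of this standard.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Compute the SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(directory):
    """Map each file's path (relative to the directory) to its checksum."""
    root = Path(directory)
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def compare_to_backup(primary_dir, backup_dir):
    """Return (files missing from the backup, files whose contents differ)."""
    primary = build_manifest(primary_dir)
    backup = build_manifest(backup_dir)
    missing = sorted(set(primary) - set(backup))
    mismatched = sorted(
        name for name in primary.keys() & backup.keys()
        if primary[name] != backup[name]
    )
    return missing, mismatched


# Demonstration with temporary directories standing in for the primary
# and backup storage locations (hypothetical file names and contents).
with tempfile.TemporaryDirectory() as primary, tempfile.TemporaryDirectory() as backup:
    Path(primary, "raw.csv").write_text("id,value\n1,10\n")
    Path(primary, "analytic.csv").write_text("id,score\n1,0.5\n")
    Path(backup, "raw.csv").write_text("id,value\n1,99\n")  # simulated corruption
    missing, mismatched = compare_to_backup(primary, backup)

print(missing)     # ['analytic.csv']
print(mismatched)  # ['raw.csv']
```

Running such a comparison on the same schedule as the backups themselves gives the study team early notice when a backup copy is incomplete or has diverged from the primary files.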
Consistent with the Guideline for Good Clinical Practice (GCP), the investigator/institution should maintain adequate and accurate source documents, including the DMP. The DMP should be attributable, contemporaneous, original, accurate, and complete. Changes to the DMP should be traceable, should not obscure the original entry, and should be explained if necessary (e.g., via an audit trail).
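The traceability expectation above can be illustrated with an append-only change log, in which each revision records who made it, when, what the text was before and after, and why. The following is a minimal sketch under stated assumptions: the class and field names are hypothetical, and a production system would also need access controls and durable storage, which are omitted here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)  # frozen: a recorded entry cannot be altered afterward
class ChangeRecord:
    timestamp: str      # when the change was made (contemporaneous)
    author: str         # who made the change (attributable)
    section: str        # which part of the DMP was changed
    previous_text: str  # original entry, preserved rather than overwritten
    new_text: str       # revised entry
    reason: str         # explanation for the change


class DmpAuditTrail:
    """Append-only log of DMP revisions (illustrative sketch)."""

    def __init__(self):
        self._records = []

    def record_change(self, author, section, previous_text, new_text, reason):
        entry = ChangeRecord(
            timestamp=datetime.now(timezone.utc).isoformat(),
            author=author,
            section=section,
            previous_text=previous_text,
            new_text=new_text,
            reason=reason,
        )
        self._records.append(entry)
        return entry

    def history(self, section=None):
        """Return all entries, or only those for one section, oldest first."""
        return tuple(
            r for r in self._records
            if section is None or r.section == section
        )


trail = DmpAuditTrail()
trail.record_change(
    author="J. Doe",  # hypothetical team member
    section="Storing and preserving data",
    previous_text="Weekly backups to one server.",
    new_text="Daily backups to two locations.",
    reason="Reduce risk of data loss.",
)
for entry in trail.history():
    print(entry.section, "|", entry.previous_text, "->", entry.new_text)
```

Because each entry keeps the previous text alongside the new text, the original entry is never obscured, and the stated reason supplies the explanation when one is needed.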