UMMS Affiliation
UMass Center for Clinical and Translational Science
Publication Date
2021-01-23
Document Type
Article Preprint
Disciplines
Epidemiology | Health Information Technology | Infectious Disease | Translational Medical Research | Virus Diseases
Abstract
Background: The majority of U.S. reports of COVID-19 clinical characteristics, disease course, and treatments are from single health systems or focused on one domain. Here we report the creation of the National COVID Cohort Collaborative (N3C), a centralized, harmonized, high-granularity electronic health record repository that is the largest, most representative U.S. cohort of COVID-19 cases and controls to date. This multi-center dataset supports robust evidence-based development of predictive and diagnostic tools and informs critical care and policy.
Methods and Findings: In a retrospective cohort study of 1,926,526 patients from 34 medical centers nationwide, we stratified patients using a World Health Organization COVID-19 severity scale and demographics; we then evaluated differences between groups over time using multivariable logistic regression. We established vital signs and laboratory values among COVID-19 patients with different severities, providing the foundation for predictive analytics. The cohort included 174,568 adults with severe acute respiratory syndrome associated with SARS-CoV-2 (PCR >99% or antigen
Conclusions: This is the first description of an ongoing longitudinal observational study of patients seen in diverse clinical settings and geographical regions and is the largest COVID-19 cohort in the United States. Such data are the foundation for machine learning (ML) models that can be the basis for generalizable clinical decision support tools. The N3C Data Enclave is unique in providing transparent, reproducible, easily shared, versioned, and fully auditable data and analytic provenance for national-scale patient-level EHR data. The N3C is built for intensive ML analyses by academic, industry, and citizen scientists internationally. Many observational correlations can inform trial designs and care guidelines for this new disease.
Keywords
Health Informatics, COVID-19, U.S., cohort, electronic health record repository, risk factors, severity, National COVID Cohort Collaborative (N3C)
Rights and Permissions
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
DOI of Published Version
10.1101/2021.01.12.21249511
Source
medRxiv 2021.01.12.21249511; doi: https://doi.org/10.1101/2021.01.12.21249511
Journal/Book/Conference Title
medRxiv
Repository Citation
Bennett TD, Chute CG, N3C Consortium. (2021). The National COVID Cohort Collaborative: Clinical Characterization and Early Severity Prediction [preprint]. UMass Center for Clinical and Translational Science Supported Publications. https://doi.org/10.1101/2021.01.12.21249511. Retrieved from https://escholarship.umassmed.edu/umccts_pubs/236
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Comments
This article is a preprint. Preprints are preliminary reports of work that have not been certified by peer review.
University of Massachusetts Medical School Worcester (UL1TR001453: University of Massachusetts Center for Clinical and Translational Science) was a funder of this study.
Full author list omitted for brevity. For the full list of authors, see preprint.