UMass Center for Clinical and Translational Science
Epidemiology | Health Information Technology | Infectious Disease | Translational Medical Research | Virus Diseases
Background: Most U.S. reports of COVID-19 clinical characteristics, disease course, and treatments come from single health systems or focus on one domain. Here we report the creation of the National COVID Cohort Collaborative (N3C), a centralized, harmonized, high-granularity electronic health record repository that is the largest, most representative U.S. cohort of COVID-19 cases and controls to date. This multi-center dataset supports robust evidence-based development of predictive and diagnostic tools and informs critical care and policy.
Methods and Findings: In a retrospective cohort study of 1,926,526 patients from 34 medical centers nationwide, we stratified patients using a World Health Organization COVID-19 severity scale and demographics; we then evaluated differences between groups over time using multivariable logistic regression. We established vital signs and laboratory values among COVID-19 patients with different severities, providing the foundation for predictive analytics. The cohort included 174,568 adults with severe acute respiratory syndrome associated with SARS-CoV-2 (PCR >99% or antigen
Conclusions: This is the first description of an ongoing longitudinal observational study of patients seen in diverse clinical settings and geographical regions and is the largest COVID-19 cohort in the United States. Such data are the foundation for machine learning (ML) models that can underpin generalizable clinical decision support tools. The N3C Data Enclave is unique in providing transparent, reproducible, easily shared, versioned, and fully auditable data and analytic provenance for national-scale patient-level EHR data. The N3C is built for intensive ML analyses by academic, industry, and citizen scientists internationally. Many observational correlations can inform trial designs and care guidelines for this new disease.
Keywords
Health Informatics, COVID-19, U.S., cohort, electronic health record repository, risk factors, severity, National COVID Cohort Collaborative (N3C)
Rights and Permissions
The copyright holder for this preprint is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.
DOI of Published Version
medRxiv 2021.01.12.21249511; doi: https://doi.org/10.1101/2021.01.12.21249511
Bennett TD, Chute CG, N3C Consortium. (2021). The National COVID Cohort Collaborative: Clinical Characterization and Early Severity Prediction [preprint]. UMass Center for Clinical and Translational Science Supported Publications. https://doi.org/10.1101/2021.01.12.21249511. Retrieved from https://escholarship.umassmed.edu/umccts_pubs/236