Article Type

EScience in Action

Publication Date





Objective: Data curation is becoming widely accepted as a necessary component of data sharing. Yet, as there are so many different types of data with various curation needs, the Data Curation Network (DCN) project anticipated that a collaborative approach to data curation across a network of repositories would expand what any single institution might offer alone. Now, halfway through a three-year implementation phase, we’re testing our assumptions using one year of data from the DCN.

Methods: Ten institutions participated in the implementation phase of a shared staffing model for curating research data. For 12 months starting on January 1, 2019, we tracked the number of data sets submitted to the DCN, along with their file types and the disciplines they represented. Participating curators were matched to data sets based on their self-reported curation expertise. We also tracked and analyzed aspects such as curation time, level of satisfaction with the assignment, and lack of appropriate expertise in the network.

Results: Seventy-four data sets were submitted to the DCN in year one, and seventy-one of them were successfully curated by DCN curators. Each curation assignment took 2.4 hours on average, and data sets took a median of three days to pass through the network. By analyzing the domains and file types of first-year submissions, we found that our coverage extends well across domains and that our capacity is higher than the demand. However, we also observed that the higher volume of data containing software code relied on certain curators' expertise more often than others, creating a potential imbalance.

Conclusions: The data from year one of the DCN pilot have verified key assumptions about our collaborative approach to data curation, and these results have raised additional questions about capacity, equitable use of network resources, and sustained growth that we hope to answer by the end of this implementation phase.


Keywords

data curation, collaboration, data repositories

Data Availability

Data associated with this article are available from the Data Repository for the University of Minnesota at: https://doi.org/10.13020/ak4d-ge34.


Acknowledgments

We would like to thank the Alfred P. Sloan Foundation for its generous support of our project (Primary Award: G-2018-10072; Planning Award: G-2016-7044). We would like to thank all past and current contributors to the Data Curation Network and acknowledge the time and ongoing commitment of this amazing community: Aditya Ranganath, Alexis Logsdon, Alicia Hofelich Mohr, Andrew Battista, Ashley Hetrick, Chen Chiu, Claire Stewart, Cynthia Hudson-Vitale, Dave Fearon, Debra Fagan, Dorris Scott, Elizabeth Hull, Erica Johns, Erin Clary, Henrik Spoon, Hannah Hadley, Heidi Imker, Hoa Luong, Jake Carlson, Janice Jaguszewski, Jennifer Darragh, Jennifer Moore, Joel Herndon, John Russell, Katie Wilson, Katie Wissel, Mara Blake, Marley Kalt, Melinda Kernik, Nathan Piekielek, Rachel Woodbrook, Robert Olendorf, Rich Yaxley, Sarah Wright, Seth Erickson, Shanda Hunt, Sophia Lafferty-Hess, Susan Borda, Tim McGeary, Tracy Teal, Valerie Collins, Wanda Marsolek, Wendy Kozlowski, and Xuying Xin.

Corresponding Author

Elizabeth Coburn, University of Minnesota, Science/Engineering Library, 108 Walter Library, Minneapolis, MN 55105; ecoburn@umn.edu

Rights and Permissions

© 2020 Coburn and Johnston. This is an open access article licensed under the terms of the Creative Commons Attribution-Noncommercial-Share Alike License.

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 4.0 License.