Objectives: (1) to identify common errors in data organization and metadata completeness that would preclude a “reader” from being able to interpret and re-use the data for a new purpose; and (2) to develop a set of best practices derived from these common errors that would guide researchers in creating more usable data products that could be readily shared, interpreted, and used.
Methods: We used directed qualitative content analysis to assess and categorize data and metadata errors identified by peer reviewers of data papers published in the Ecological Society of America’s (ESA) Ecological Archives. Descriptive statistics provided the relative frequency of the errors identified during the peer review process.
Results: There were seven overarching error categories: Collection & Organization, Assure, Description, Preserve, Discover, Integrate, and Analyze/Visualize. These categories represent errors researchers regularly make at each stage of the Data Life Cycle. Collection & Organization and Description errors were among the most common, each occurring in over 90% of the papers.
Conclusions: Publishing data for sharing and reuse is error-prone, and each stage of the Data Life Cycle presents opportunities for mistakes. The most common errors occurred when the researcher did not provide adequate metadata to enable others to interpret and potentially re-use the data. Fortunately, these mistakes can be minimized by carefully recording all details about study context, data collection, QA/QC, and analytical procedures from the beginning of a research project and then including this descriptive information in the metadata.
Keywords: Ecology, data publication, data management, data sharing, best practices
Kervin, Karina E., William K. Michener, and Robert B. Cook. 2013. "Common Errors in Ecological Data Sharing." Journal of eScience Librarianship 2(2): e1024. http://dx.doi.org/10.7191/jeslib.2013.1024
Creative Commons License
This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.
Figure 1: Average number of errors per paper by year by Data Life Cycle Element Category.
life_cycle_errors_percent.png (81 kB)
Figure 2: Percent of data papers with errors in each Data Life Cycle Element Category.
life_cycle_errors_mean.png (69 kB)
Figure 3: Mean number of errors in a given Data Life Cycle Element Category.
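The two descriptive statistics named in the figure captions (the percent of papers with at least one error in a category, and the mean number of errors in a category) can be illustrated with a minimal Python sketch. It is not the authors' analysis code. The `papers` list and the `summarize` function are hypothetical, the counts are invented for illustration, and the mean is assumed to be taken over all papers, including those with no errors in the category.

```python
from collections import defaultdict

# Hypothetical coded review data: one dict per data paper, mapping a
# Data Life Cycle category to the number of reviewer-identified errors.
# These counts are invented for illustration; they are not the study's data.
papers = [
    {"Collection & Organization": 3, "Description": 5, "Assure": 1},
    {"Collection & Organization": 1, "Description": 2},
    {"Description": 4, "Preserve": 1},
    {"Collection & Organization": 2, "Assure": 2, "Analyze/Visualize": 1},
]


def summarize(papers):
    """Return {category: (percent_of_papers_with_error, mean_errors_per_paper)}.

    Assumption: the mean is computed over all papers, so a paper with no
    errors in a category contributes zero to that category's mean.
    """
    n = len(papers)
    papers_with_error = defaultdict(int)
    total_errors = defaultdict(int)
    for paper in papers:
        for category, count in paper.items():
            if count > 0:
                papers_with_error[category] += 1
                total_errors[category] += count
    return {
        category: (
            100.0 * papers_with_error[category] / n,
            total_errors[category] / n,
        )
        for category in papers_with_error
    }


summary = summarize(papers)
for category, (pct, mean) in sorted(summary.items()):
    print(f"{category}: {pct:.0f}% of papers, mean {mean:.2f} errors per paper")
```

Keeping one record per paper, rather than a single pooled tally, is what allows both statistics to be derived from the same coded data.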