Publication Date

4-23-2013

Article Type

Full-Length Paper

Abstract

Objectives: (1) to identify common errors in data organization and metadata completeness that would preclude a “reader” from being able to interpret and re-use the data for a new purpose; and (2) to develop a set of best practices derived from these common errors that would guide researchers in creating more usable data products that could be readily shared, interpreted, and used.

Methods: We used directed qualitative content analysis to assess and categorize data and metadata errors identified by peer reviewers of data papers published in the Ecological Society of America’s (ESA) Ecological Archives. Descriptive statistics provided the relative frequency of the errors identified during the peer review process.

Results: We identified seven overarching error categories: Collection & Organization, Assure, Description, Preserve, Discover, Integrate, and Analyze/Visualize. These categories represent errors researchers regularly make at each stage of the Data Life Cycle. Collection & Organization errors and Description errors were among the most common, each occurring in over 90% of the papers.

Conclusions: Publishing data for sharing and reuse is error-prone, and each stage of the Data Life Cycle presents opportunities for mistakes. The most common errors occurred when the researcher did not provide adequate metadata to enable others to interpret and potentially re-use the data. Fortunately, researchers can minimize these mistakes by carefully recording all details about study context, data collection, QA/QC, and analytical procedures from the beginning of a research project and then including this descriptive information in the metadata.

Keywords

Ecology, data publication, data management, data sharing, best practices

Creative Commons License


This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.

life_cycle_errors_year.png (91 kB)
Figure 1: Average number of errors per paper by year by Data Life Cycle Element Category.

life_cycle_errors_percent.png (81 kB)
Figure 2: Percent of data papers with errors in each Data Life Cycle Element Category.

life_cycle_errors_mean.png (69 kB)
Figure 3: Mean number of errors in a given Data Life Cycle Element Category.
