Objective: This article analyzes twenty cited or downloaded datasets and the repositories that house them, to produce insights that academic libraries can use to encourage discovery and reuse of research data in institutional repositories.
Methods: Using Thomson Reuters’ Data Citation Index and repository download statistics, we identified twenty cited/downloaded datasets. We documented the characteristics of the cited/downloaded datasets and their corresponding repositories in a self-designed rubric. The rubric includes six major categories: basic information; funding agency and journal information; linking and sharing; factors to encourage reuse; repository characteristics; and data description.
Results: Our small-scale study suggests that cited/downloaded datasets generally comply with basic recommendations for facilitating reuse: data are well documented, formatted for use with a variety of software, and shared in established, open access repositories. Three significant factors also appear to contribute to dataset discovery: publishing in discipline-specific repositories, indexing in more than one location on the web, and using persistent identifiers. The cited/downloaded datasets in our analysis came from a few specific disciplines and tended to be funded by agencies with data publication mandates.
Conclusions: The results of this exploratory research provide insights that can inform academic librarians as they work to encourage discovery and reuse of institutional datasets. Our analysis also suggests areas in which academic librarians can target open data advocacy in their communities, to begin building open data success stories that will fuel future advocacy efforts.
Keywords: open data, data discovery, data reuse, institutional data repositories
Mannheimer, Sara, Leila B. Sterman, and Susan Borda. 2016. "Discovery and Reuse of Open Datasets: An Exploratory Study." Journal of eScience Librarianship 5(1): e1091. http://dx.doi.org/10.7191/jeslib.2016.1091
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.
Figure 1: Median Citations by Repository
Table 1: Characteristics of Cited/Downloaded Datasets
Figure 2: Academic Disciplines
Figure 3: Data Repository Preservation Policies
Appendix A: Data Repository and Dataset Analysis Rubric