Research Data Services in Academic Libraries: Data Intensive Roles for the Future?

Objectives: The primary objectives of this study are to gauge the levels of research data services (RDS) that academic libraries provide based on demographic factors, to gauge RDS growth since 2011, and to identify obstacles that may prevent expansion or growth of services. Methods: A survey of academic institutions was administered to a stratified random sample of ACRL library directors across the U.S. and Canada. Frequencies and chi-square analysis were applied, with some responses grouped into broader categories for analysis. Results: There was minimal to no change in the services offered between survey years, and interviews with library directors were conducted to help explain this lack of change. Conclusion: Further analysis is forthcoming from a study of librarians to help explain possible discrepancies between organizational objectives and librarian sentiments about RDS. Correspondence: Carol Tenopir: ctenopir@utk.edu


Introduction
The growth of data intensive science, coupled with funding mandates for data management plans and government open data, has led to a growing emphasis on data management across all academic disciplines. At the same time, the roles of academic libraries have changed dramatically within the last decade. Academic librarians are now often integrated as partners in all aspects of the research process, from data collection to publication and preservation of research output. A suite of research data services (RDS) needed by academic communities is emerging in academic libraries in response to the growth of data intensive research, changing roles of libraries, and the recognition of a need for research data management.
In 2011, an assessment of research data services (RDS) in North American academic libraries by the Usability & Assessment Working Group of the NSF-funded DataONE project found that many academic research libraries had added research data services and many more were planning to offer a variety of research data services in the future. The respondents came from a panel of library directors from all types of academic libraries in Canada and the United States that was put together by the Association of College & Research Libraries (ACRL). Results from the 2011 baseline survey were published as an ACRL white paper (Tenopir, Birch & Allard 2012). In order to measure changes in practice over time, a second survey was planned at the outset; it was conducted in 2014 and is reported here. There have been other assessments of the role that libraries play in providing RDS, but few have been conducted with the intent of long-term, periodic assessment of RDS within libraries (Soehner, Steeves & Ward 2010; Antell et al. 2013).
These assessments of RDS in libraries are an integral part of the DataONE project (dataone.org), which has a mission to ensure the preservation and accessibility of science data. DataONE is multi-scale, multi-national, and cross-disciplinary in scope, and provides cyberinfrastructure for data discovery, as well as tools and educational materials to help build a culture and capability of research data management. One of the tasks of the Usability & Assessment Working Group of DataONE is to build an understanding of perceptions, attitudes, behaviors, and requirements with respect to data and data activities within stakeholder communities (Tenopir, Birch & Allard 2012). Academic libraries are an important stakeholder community that is playing a part in building a culture and infrastructure for RDS (Soehner, Steeves & Ward 2010). Except where we are describing the terms used by other authors in their research, in this article we use the term data "management" to refer to the broad suite of services or processes involving data, including services that assist with data management planning, finding repositories for both accessing and depositing data, metadata description, and preservation.

Related Research
Recognition of the need for good data management is now widespread, and the resource requirements to accomplish these tasks are broadly discussed in scholarly and popular literature (Tenopir et al. 2011). As is the case with other cultural and scientific artifacts, academic libraries can play a role in helping researchers find, describe, and preserve data, and in helping to implement these data management requirements. Some believe that libraries have the mandate and may have the capacity to curate data to serve the advancement of science and society as a whole (Heidorn 2011). The Association of Research Libraries (ARL) E-Science Working Group conducted a survey in 2010 to determine how ARL libraries have been approaching this task of providing RDS to their patrons (Soehner, Steeves & Ward 2010). Although the majority of institutions lacked designated departments for providing RDS, over one-third had conducted assessments of their researchers' needs for data services. The DataONE 2011 baseline assessment of ACRL library directors identified that well under half of libraries surveyed offered some form of RDS; however, more libraries were planning to offer such services within the next two years (Tenopir, Birch & Allard 2012). Most of the libraries already offering RDS were in research-focused/PhD-granting institutions, such as those that are likely members of ARL.
Institutional assessments of RDS needs are plentiful and continue to be performed and published. Many libraries have conducted surveys (Parsons, Grimshaw & Williamson 2013; Parham et al. 2012), while others have taken more personal approaches, such as interviews and focus groups (Carlson 2012; Mclure et al. 2014). Regardless of assessment methods employed, academic librarians are seeking to understand the needs of the researchers at their institutions and develop targeted strategies to meet those needs. Studies show that researchers' needs vary from discipline to discipline, so one solution may not be sufficient (Akers & Doty 2013; Weller & Monroe-Gulick 2014).
Libraries are also discovering the need for workforce development and training regarding RDS. The baseline assessment discovered that most libraries are shifting current staff into data positions instead of hiring new data professionals (Tenopir, Birch & Allard 2012). A recent survey of ARL science librarians found that fewer than one-quarter of these librarians felt that they have the skills to help scientists with data management (Antell et al. 2013). Training for current staff may be on-the-job or more formalized (MacMillan 2015; Tenopir et al. 2013). Library and information science (LIS) programs are also developing courses and programs to provide information professionals with a solid foundation in data management and curation (Varvel, Bammerlin & Palmer 2012).
In some scientific disciplines, data managers emerged decades ago to support the need to organize, preserve, and curate data on behalf of field and bench researchers (Tayi & Ballou 1998; Gray et al. 2005); however, it is only more recently that academic libraries have recognized the need to have staff with research data skillsets (Mayernik et al. 2014). Librarians with the full spectrum of RDS skills may be difficult for academic libraries to find and retain. Collaboration with groups outside of the library or with other institutions will be essential for fully developing RDS for an academic community (Norman & Stanton 2014; Deards 2013). However, the 2010 ARL survey found that just under half of responding libraries had built RDS collaborations with other institutions (Soehner, Steeves & Ward 2010). Data generated by academic researchers are varied and require a wide range of knowledge about data and metadata standards (Heidorn 2011). Collaborations are also recommended for developing stable cyberinfrastructure to support data curation in the long term (Tansley & Tolle 2009).

Methodology
The survey instrument was built in the Qualtrics software and housed on the servers at the University of Tennessee, Knoxville. The survey, approved by the University of Tennessee Institutional Review Board for Human Subjects, was administered to a stratified random sample panel of ACRL library directors across the United States and Canada. Three hundred and fifty invitations to participate with a link to the questionnaire were sent on March 11, 2014 via Qualtrics Mailer. Thirteen email addresses were unreachable, yielding a final total of 337 valid invitations. The survey closed on April 12, 2014 with 146 responses, 128 of which were usable, for a 38% response rate. The participants were asked to respond on behalf of their institutions.
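The response-rate arithmetic in this paragraph can be retraced with a short sketch. The counts are the ones reported above; the variable names are illustrative only and do not come from the study's own analysis files.

```python
# Counts reported in the Methodology paragraph; variable names are illustrative.
invitations_sent = 350
unreachable_addresses = 13
responses_received = 146
usable_responses = 128

# Valid invitations are those that actually reached a mailbox.
valid_invitations = invitations_sent - unreachable_addresses

# The reported response rate is based on usable responses only.
response_rate = usable_responses / valid_invitations

print(f"Valid invitations: {valid_invitations}")
print(f"Usable responses: {usable_responses} of {responses_received} received")
print(f"Usable response rate: {response_rate:.1%}")
```

Basing the rate on usable rather than received responses is what yields the reported figure of about 38%.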
Responses were imported from Qualtrics and analyzed in SPSS. To run chi-square analysis, survey questions offering several date choices for when future services were planned were recoded into "yes," "no, but plan to," and "no" responses. Some demographic choices were recoded and consolidated in a similar manner: full-time equivalent (FTE) student enrollment was recoded into "fewer than 5,000 students" and "5,000 or more students," and tenure/tenure-track faculty into "fewer than 100 faculty" and "100 faculty or more."
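The recoding step described above might look roughly like the following sketch. The study performed this step in SPSS; the raw answer labels and function names here are hypothetical stand-ins (the actual questionnaire wording is in Appendix A), and only the recoded categories and cut-points come from the text.

```python
# Hypothetical raw answer choices (NOT the actual survey wording).
# Only the three target categories are taken from the article.
SERVICE_RECODE = {
    "Currently offer": "yes",
    "Plan to offer within 12 months": "no, but plan to",
    "Plan to offer in 1-2 years": "no, but plan to",
    "Plan to offer in more than 2 years": "no, but plan to",
    "No plans to offer": "no",
}


def recode_service(raw_answer: str) -> str:
    """Collapse the date-based choices into the three analysis categories."""
    return SERVICE_RECODE[raw_answer]


def recode_fte(fte_students: int) -> str:
    """Collapse FTE enrollment into the two groups used for chi-square tests."""
    if fte_students < 5000:
        return "fewer than 5,000 students"
    return "5,000 or more students"


def recode_faculty(faculty_count: int) -> str:
    """Collapse tenure/tenure-track faculty counts into two groups."""
    if faculty_count < 100:
        return "fewer than 100 faculty"
    return "100 faculty or more"


print(recode_service("Plan to offer in 1-2 years"))
print(recode_fte(4200))
print(recode_faculty(100))
```

Collapsing sparse answer choices in this way raises the expected count in each cell, which is usually the reason for recoding before a chi-square test.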
The questionnaire included demographic questions as well as a range of items on current and planned RDS and workforce development (see Appendix A).
To better understand the survey results, semi-structured interviews of five academic library directors were conducted after the survey. Respondents from the ACRL panel were invited to participate based on how knowledgeable they were about RDS. Responses were imported into NVivo software to help determine common patterns in interview responses.

Overall Results
Of the identifiable institution types, over half of the responding institutions (53.7% of n=128) are research universities. Two-year institutions represented 17.9% of identifiable institution types, while four-year institutions represented 28.4%.
Well over half (65%) of the respondents have fewer than 5,000 full-time equivalent students (Table 1), and 43% of institutions employ fewer than 100 tenured or tenure-track faculty (Table 2). Three-quarters of campuses (76%) receive less than $10 million in external research grants, with noticeably fewer receiving more than $10 million annually (Table 3).
The majority of institutional respondents do not currently offer, nor do they plan to offer, most types of RDS (Figure 1 & Figure 2). In general, more institutions currently offer or plan to offer what could be defined as informational/consultative services (Figure 1) rather than what could be defined as technical services (Figure 2). Providing reference support for finding and citing data (29.7%) and creating web guides for data and data repositories (21.5%) are the most frequently offered services. This is not surprising, as consultative RDS align well with traditional reference or liaison librarian services.
The survey also gathered information about who in the library, if anyone, provides reference/consultation/instructional RDS to researchers and who has primary leadership responsibility for plans and programs for RDS. Individual discipline librarians (i.e., subject specialists or liaison librarians) or staff members are the largest group providing RDS informational/consultative services in the responding libraries (Table 4). Only a small percentage (6.7%) of libraries have dedicated data librarians to provide RDS.
Almost three-quarters of respondents indicated that their library is not involved in RDS. Of those libraries that are involved in RDS, responsibility for planning and developing programs differs from library to library. Some libraries have a single individual who is responsible, some have a group or committee that is responsible, and others have a combination of individuals, committees, and/or departments responsible for RDS planning (Table 5).
Table 4: Who in the library provides research data reference/consultation/instruction services to researchers?
Respondents were also asked how their libraries have developed staff capacity for RDS. Of the roughly one-quarter of libraries that offer RDS, a little over half (54.5%, or 12 of 22) indicated that they have reassigned existing staff (Table 6). Only a few (6) have hired new staff. Although their libraries may not yet offer RDS, many of the library directors agree that issues of research data are important. Respondents were asked to report how much they agree or disagree with a number of statements relating to their opinion on library involvement in RDS (Table 7).
Table 7: Statements about which respondents were asked to agree or disagree regarding their opinion about library involvement in RDS. Level of agreement was based on a five-point Likert scale with one equal to strongly agree and five equal to strongly disagree. n=86

Differences by Institution Type
The survey revealed some statistically significant differences among two-year, four-year, and research institutions with respect to the RDS offered or planned to be offered, the staff responsible, and the development of RDS capacity. Not surprisingly, librarians at four-year and research universities are more likely than those at two-year institutions to consult with faculty, staff, or students on data management plans (Table 8). Libraries at four-year and research universities are also more likely than those at two-year institutions to have plans to discuss RDS with other librarians, RDS professionals, or others on campus (Table 9).

Differences by Annual External Funding Amount
Just as type of institution reflects a research focus, and therefore a greater focus on research data, institutions that receive more grant money can be expected to focus more on research data. Not surprisingly, significantly higher percentages of libraries at institutions receiving $50 million or more in external funding currently offer more RDS (Figure 3). Additionally, a higher percentage of these institutions indicated that dedicated data librarians, as opposed to individual discipline librarians, provide research data reference, consultation and instruction services. Institutions with more research funding are also more likely to provide opportunities for their staff to develop RDS skills, have policies and procedures in place for RDS, and collaborate with others, both within their institution and with other institutions (Table 10 & 11).
[...] services based on institution's approximate annual external funding. * indicates significant difference based on a standardized residual with an absolute value of at least 1.6. χ²(2)=19.187, n=41, p<0.001
Table 11: Percentage of libraries at institutions with less than $50 million and $50 million or more in annual external funding whose respondents answered in the affirmative to the survey statements shown. All differences are significant based on Fisher's Exact Test. *p<0.001, †p=0.007, •p<0.001, ‡p=0.003
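The chi-square test and the standardized-residual criterion mentioned in the caption can be illustrated with a minimal sketch. The counts below are hypothetical and are NOT the study's data (they were chosen only so that n=41 and the table has 2 degrees of freedom); the residual is assumed to be the Pearson form, (observed - expected) / sqrt(expected), which is what SPSS labels "standardized".

```python
import math

# Hypothetical counts (NOT the study's data). Rows are funding levels;
# columns are the recoded answers "yes", "no, but plan to", "no".
observed = [
    [4, 6, 18],  # less than $50 million
    [9, 3, 1],   # $50 million or more
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Expected count under independence: row total * column total / n.
expected = [[r * c / n for c in col_totals] for r in row_totals]

# Pearson chi-square statistic summed over all cells.
chi_square = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)

# Standardized (Pearson) residual for each cell.
std_residuals = [
    [(o - e) / math.sqrt(e) for o, e in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]

# Cells whose residual magnitude meets the 1.6 threshold used in the caption.
flagged = [
    (i, j)
    for i, row in enumerate(std_residuals)
    for j, r in enumerate(row)
    if abs(r) >= 1.6
]

print(f"chi-square({degrees_of_freedom}) = {chi_square:.3f}, n = {n}")
print("cells with |standardized residual| >= 1.6:", flagged)
```

The overall statistic says only that the distribution of answers differs by funding level; the residuals show which cells drive that difference.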

Differences by Full-Time Equivalent Student Populations
More institutions with larger student populations currently offer RDS compared to institutions with smaller populations. More libraries at institutions with 5,000 or more FTE students than at institutions with fewer than 5,000 FTE students responded that they currently provide outreach and collaboration with other RDS providers either on or off campus (23.5% compared to 6.5%), directly participate with researchers on a project (27.3% compared to 10.0%), and train co-workers in their library, or across campus, on RDS (25.0% compared to 6.6%). Libraries at institutions with larger student populations are also more likely to have dedicated data librarians (16.1% compared to 0%), and have a higher rate of collaboration with other departments or units within their institutions regarding RDS (35.5% compared to 15.5%). Respondents from universities with larger student populations were more likely to agree with the following statement: The library needs to offer research data services (RDS) to remain relevant to the institution (F(4,81)=3.097, p=0.020). A Tukey post-hoc test revealed the level of agreement was significantly greater (that is, the mean score was lower, since one equals strongly agree) for libraries at institutions with 25,000 or more FTE students (1.20 ± 0.447) compared to libraries at institutions with 5,000-9,999 FTE students (3.00 ± 1.000, p=0.041) and up to 1,999 students (3.07 ± 1.307, p=0.012).
Based on the survey responses, there has been little change in the percentages of libraries that currently offer, plan to offer, and do not plan to offer most of the specific RDS from 2011 to 2014 (Table 15). In 2014, however, significantly fewer libraries say they have plans to provide reference support for finding and citing data (Table 16).

Differences by Faculty Size between 2011 and 2014
While there are no significant differences between 2011 and 2014 for the entire survey population regarding identifying data that could be candidates for repositories, differences exist for this service when the population is broken down by faculty size. Multi-dimensional chi-square tests based on the type of RDS offered, survey year, and faculty size indicate that, in 2014, a significantly smaller percentage of libraries at institutions with fewer than 100 faculty members reported that they plan to offer this service, compared to 2011 (χ²(2)=8.452, n=124, p=0.015) (7). By contrast, there was no significant change in percentages between 2011 and 2014 for libraries at institutions with more than 100 faculty members.
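As a quick consistency check on the chi-square results reported in this article, the p-value for a statistic with exactly 2 degrees of freedom has a closed form: the upper-tail probability is exp(-x/2). The sketch below applies it to the two statistics quoted in the text; the function name and dictionary labels are illustrative, and the shortcut holds only for 2 degrees of freedom.

```python
import math


def chi2_upper_tail_df2(statistic: float) -> float:
    """Upper-tail probability for a chi-square variable with exactly 2 df.

    For 2 degrees of freedom the survival function reduces to exp(-x / 2).
    This shortcut does not apply to any other number of degrees of freedom.
    """
    return math.exp(-statistic / 2.0)


# Statistics as reported in the article; labels are illustrative.
reported = {
    "external funding (n=41)": 19.187,
    "faculty size, 2011 vs 2014 (n=124)": 8.452,
}

for label, statistic in reported.items():
    p_value = chi2_upper_tail_df2(statistic)
    print(f"{label}: chi-square(2) = {statistic}, p = {p_value:.5f}")
```

Both results agree with the reported values: the first is well below 0.001 and the second rounds to 0.015.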

Discussion & Conclusions
The role of the academic library must adapt over time to changes in the research landscape. As we move into a new paradigm of "data-intensive" research (Tansley, Stewart & Tolle 2009), academic libraries are working to determine what, if any, new research data services they will need to offer to continue to support their faculty, staff, and students. The majority of the ACRL institutions surveyed are not offering RDS, although research universities are more likely to offer a range of data services than libraries at other types of academic institutions. Plans for the future of RDS are still being considered by academic libraries.
In order to fully offer technical RDS, libraries need to have technologically skilled staff or greatly increase opportunities for technology training for their existing staff, which might not be feasible due to resource constraints.
Interviews provided deeper insights into the challenges of offering technical RDS. There is some sentiment that these services should be offered by other institutional departments with larger technical capacity. Some interviewees suggested partnering with their institution's Information Technology department, as well as offices that are involved with sponsored programs, grants, and research. Another example of collaboration to grow RDS is to create a "research data network," including researchers with varied expertise, to address RDS issues based on multiple discipline-specific needs. One such discipline-specific network mentioned is a digital humanities working group, which provides a platform for exchanging ideas, keeping up with trends in the fields and focusing on RDS issues. Similar sentiments about collaboration were reported in an earlier study (Pinfield, Cox & Smith 2014). If libraries collaborate with other departments or other institutions, they may be able to offer a more complete suite of RDS for their institution.
As academic libraries begin to decide on which types of RDS they plan to provide in-house, they also need to decide who exactly will be providing those services. Library leaders need to decide whether they will invest in developing current librarians and staff, or if they will hire data librarians. There are many data-related skills that librarians and library staff need to acquire to keep pace with the changing landscape, and the costs related to workforce development are not insignificant. For many reasons, few academic libraries have hired data librarians. For example, there may not be enough of a perceived demand for RDS to warrant a full-time data librarian. In the interviews, some library directors commented on the lack of demand at their institution for RDS, which supports the findings of an earlier study that found there is little or no demand for RDS from patrons at many institutions (Tenopir, Birch & Allard 2012). In some instances, patrons may be unaware that the library could offer these services. Some interviewees mentioned that faculty, staff, and students wished they had known earlier that the library offered RDS, as they mistakenly believed RDS were beyond the suite of services associated with libraries.
To improve awareness of RDS, some interviewees suggested marketing library-offered RDS by meeting with various campus groups. Some also mentioned the importance of meeting with university leaders about implementing RDS. Until libraries and their institutions see more demand, they may not feel that it is worth investing in RDS. Many academic libraries may not have the resources or administrative support to hire a research data services librarian.
Institutions with larger student populations and higher levels of external funding are more likely to have full-time data librarians. The 2014 survey also found that some institutions have hired data librarians and are also reassigning others within the library to take on data-related responsibilities. A single person may not be able to keep up with all of the different standards and best practices for data across all disciplines. According to Heidorn, data at research and teaching institutions are very heterogeneous, thereby making it difficult for any one person to have all the necessary skills to provide researchers at his/her institution with RDS (2011). Instead, a team approach may be best for implementing RDS. Pinfield, Cox and Smith describe how important it is for libraries to maintain awareness of the various disciplinary cultures and practices with respect to RDS within their institutions (2014). Therefore, it may make sense to have subject librarians or liaisons take on the role of RDS provider for their discipline because, ultimately, they will have the best understanding of the cultures within that domain. These librarians will likely understand the best way to approach researchers about sensitive topics such as data sharing.
Despite the low percentage of ACRL academic libraries surveyed that are actively providing RDS, many respondents agree that libraries should be involved in data curation and that data are important for future scholarship. There may be an inconsistency between librarians' feelings about the importance of the library's involvement in RDS versus the motivation to move forward. If the library's leadership does not perceive a disadvantage to their institutions' research capabilities without library-led RDS, there may be little incentive to start offering these services. The need for RDS may be externally imposed. Many libraries that have started to offer RDS are doing so in response to funding agency Open Data (OSTP) mandates surrounding research data.
However, interviewees noted that it is important to portray RDS as not just a means of compliance, but also as services that will directly benefit the researchers themselves. This is especially true because there is still some doubt about the enforcement of funder requirements and the consequences of not complying with those requirements. Therefore, solely relying on the "compliance" argument may make it difficult for the library to "sell" RDS to researchers (Pinfield, Cox & Smith 2014). This means responding to regulatory mandates is not a sufficient basis for a successful RDS program. Libraries have the responsibility to advocate for good research data management practices.

Development of an RDS program requires both a top-down and bottom-up approach. Leadership and engagement from the librarians and staff who will provide and promote the services is essential; however, leadership and resources are also required from the top-level administrators. Interviewees mentioned that this support is a necessity for RDS implementation. Some indicated that campus-wide support is important for securing funding for RDS since libraries do not always have the resources to implement such a program on their own. Similarly, authority figures who are interested in creating partnerships to implement campus-wide RDS have secured more institutional funding for the library.
A lack of institutional support may be one of the reasons there has not been a faster library adoption rate of RDS (Pinfield, Cox & Smith 2014). In 2011, 15 to 35 percent of libraries surveyed indicated they were planning to offer most types of RDS. Therefore, we anticipated that we would find an increase in the percentage of libraries offering RDS in 2014; however, this was not the case. Instead of an increase, the percentage of libraries offering RDS remained fairly stable and in many cases declined slightly. The interviews helped us understand why libraries did not follow through on their plans.
Without top-level support from their institutions, even if library directors had wanted to implement RDS, it might not have been possible. Academic institutions are slow-moving. They require a lot of time to make substantial changes, and different departments move at different paces. In cases where libraries are coordinating RDS efforts with other departments, difficulties in enacting change may occur. Some respondents stated that time is a very limited resource when it comes to implementing RDS. Specifically, library staff, researchers, and office partners have limited time to either implement or use RDS. Interviewees also suggested that libraries need to find ways to make the RDS process more automatic and less time-consuming for researchers. For example, a central repository could be created where researchers could both compile and deposit data, as opposed to moving data at a later time.
Libraries are proceeding cautiously. Some researchers have "massive expectations" when it comes to RDS. Librarians and upper-level administrators realize that the amount of work could quickly get out of control. With concerns over resources and support from various institutional levels, some are hesitant to get involved. During the interviews, some library directors described their experiences of curbing the expectations of researchers on what and how much the library can do to help. Some researchers may assume that the library is going to do data management for them, whereas RDS may simply mean assisting researchers with data management. Additionally, if the RDS offered become too successful, there is concern the library will not have resources to support the demands. For example, one interviewee described concerns about data storage and the library not being able to keep up with demand for server space. The interviewee also mentioned repository traffic concerns if users start "taking out all of our data left and right."
Implementing RDS in an academic library requires resources, be it personnel, time, skills, money, or support. One solution for dealing with limited resources is for libraries to partner with others on campus or with other universities and share resources. One example of collaboration within an institution is where the institutional office of research provides data management training, while the library provides a repository for completed data management plans. An inter-institutional example is the Data Management Planning tool developed by the University of California's California Digital Library in collaboration with other universities (https://dmp.cdlib.org).
RDS requires a skilled workforce and motivation from library leadership and library staff to transform the library's role in support of academic scholarship. The results of this survey of academic library directors reveal the degree to which their libraries are engaged in RDS. To understand individual librarians' attitudes towards these new roles, we also surveyed librarians who work in academic libraries. The results of the 2011 baseline survey of academic librarians showed that many academic librarians do not feel prepared to take on these new roles in spite of plans by their libraries to offer RDS (Tenopir, Birch & Allard 2012). To see if this mismatch between organizational RDS objectives and the readiness of individual librarians continues, we conducted a follow-up survey of academic librarians in 2014; results are forthcoming.

Supplemental Content
Appendices A and B
An online supplement to this article can be found at http://dx.doi.org/10.7191/jeslib.2015.1085 under "Additional Files".