Page 1 of 16
Contents
Executive Summary
  o Selected Activities of Note
  o Key Findings and Recommendations
      1: Formalize the Program for Data Management/Curation
      2: Foster Collaborations to Grow the Culture of Data Management Needs
Background for Charging the Data Management/Curation Task Force
Year One Activities: Overview, Current Status, and Recommendations for Year Two
Appendices:
  o Proposal for Digital Humanities Library Group
  o Data Survey Results
Executive Summary
The first year of the Data Management/Curation Task Force (DMCTF) focused on assessment, information gathering, planning, and development activities from January to December 2013. 1 The DMCTF has been enormously successful in many regards, as detailed in this report. However, based on the information gathered over the past year and on ongoing trends, a great deal needs to be done. Further, it needs to be done as quickly as possible given internal and external pressures for data supports.

Selected Activities of Note
The DMCTF has been extremely active in a variety of areas, with details in this report. Activities of particular importance and success include:
Events and Trainings
  o Graduate Student Events: many focused on data outreach and training
  o Conducted trainings on data reference and the Data Management Plan Tool (DMPTool)
  o SobekCM Digital Repository Training Series: in-person and webinar trainings with videos online for future use; faculty and graduate student attendees reported interest in SobekCM for research project data for independent and collaborative projects
Tools Developed
  o Data Management Plan Tool (DMPTool), customized for writing data management plans for grants, implemented for UF (preparing for DMPTool 2.0, coming in spring 2014)
  o Website at http://www.uflib.ufl.edu/datamgmt/ built around the research life cycle, including links to storage options on campus and training activities
Assessment and Data Gathering
  o Campus-wide Data Survey in fall 2013; over 280 responses confirming the critical data needs in many areas
  o Facilitated meetings and discussions with departments, centers, and institutes on data
  o Evaluation of the Dataverse Network (DVN) with the Southeastern Universities Research Association (SURA), with data from the process supporting Research Computing implementation of a new data storage system
Building Collaboration and Awareness
  o Facilitated meetings and discussions with other campus service providers for integrated services with computing, sponsored programs, the Graduate School, and others
  o Coordinated visit by FSU Libraries for UF Research Computing Day
  o Coordinated presentation by Bess de Farber to the Research Computing Advisory Committee (delivered in Jan. 2014) to discuss collaborative processes and events, including CoLABs
  o DMCTF presentation and update to the Libraries in fall 2013

1 DMCTF charge: http://ufdc.ufl.edu/AA00014835/00011
Key Findings and Recommendations
[...] is not feasible without additional staffing. For future phases of work to support [...], DMCTF findings and recommendations fall within three primary areas:
1. Formalizing the program for data management
   a. Hiring critically needed new personnel
   b. Creating a formalized structure in the Libraries, which will serve as a connected node with other groups in the Libraries and across campus
   c. Developing full, standardized supports for common needs where possible and applicable
2. Fostering collaborations across campus to grow the culture of data management
3. Conducting research, development, and assessment necessary for data needs

Key Recommendation 1: Formalize the Program for Data Management/Curation
Critically needed personnel include, at minimum, a Data Curation Coordinator Librarian and a Data IT Expert. Both will need to be positioned to collaborate throughout the Libraries and with Research Computing for integration wherever possible:
The Data IT Expert could report through the Digital Development Unit, be embedded within the data team, and act as the technical contact for collaboration with Library IT, Research Computing, and other campus IT units.
The Data IT Expert is needed to support testing and setup of any new tools; integrate and add additional data supports to the IR@UF; collaborate with Research Computing to integrate the SobekCM front-end data/digital repository system supported by the Libraries with the back-end computational processing and granularly controlled storage access from Research Computing; and support other data needs.
The Data Librarian would serve on project teams and, more importantly, support other librarians in engaging in data management and curation issues, 2 helping to lay the foundation for creating a culture of data management as part of the human and technical infrastructure work necessary to support modern research needs and radical collaboration. 3
Further formalization of the program for data management/curation is needed once the new positions are hired. [...] data units within their libraries. Structured in various ways, 4 they are designed to connect various areas and experts from existing library structures and support new data activities. For UF, the most successful structure would connect existing experts, serve as the home for new positions and data activities, and serve as a critical hub to connect with other groups in the Libraries and across campus. The DMCTF is already working to build this hub structure through connecting with other Library groups, as explained in the next recommendation.

2 Carlson, Jake (2013). "Opportunities and Barriers for Librarians in Exploring Data: Observations from the Data Curation Profile Workshops." Journal of eScience Librarianship 2(2): http://dx.doi.org/10.7191/jeslib.2013.1042
3 For radical collaboration defined in relation to UF and research, see: Vision for Research Computing at UF (Feb 2011), http://www.it.ufl.edu/wp-content/uploads/2012/03/research-computing-vision.pdf
4 Examples include: Publishing & Curation Services, Office of Digital Scholarly Publishing, Data Management Consulting Group, Scholarly Commons

Key Recommendation 2: Foster Collaborations to Grow the Culture of Data Management
With or without additional personnel, the DMCTF has a critical role to play in developing a culture of data management within the Libraries and extending it, through collaboration with other campus service groups, to the full campus. Ideally, the DMCTF will continue to focus efforts on specific work by the group and on work done in collaboration with various library groups:
Collaborating on data-related activities: Library Instruction Committee for data instruction.
Identifying and activating other best-fit collaborations: Born Digital Archival Content Working Group, Grants Committee, 5 and others.
Providing an excellent hub framework and configuration for supporting new groups for data: new Digital Humanities Library Group. 6
In addition to the DMCTF, data-specific groups, and core collaborators outside of the libraries like Research Computing, all Liaison Librarians have a critical role for data as the primary contacts and leaders on consultative teams within the libraries for data and other digital scholarship needs. 7 To further foster collaboration and grow the culture of data management, the Libraries need ongoing work internally with Liaisons and through to external groups. For external group collaboration, the DMCTF recommends approaching best-fit, targeted groups to build collaboration on data, with details in this report.
Additionally, the Libraries have a wealth of expertise to offer for all aspects of data management and curation, including support for attribution and fair-cite initiatives that credit data scientists and computational team members on data projects.

Key Recommendation 3: Conduct Research, Development, and Assessment Necessary for Data Needs
Because data management and curation impact all types of scholarly work, all scholarly areas and fields, and all aspects of academic institutions, ongoing research, development, and assessment are needed to ensure support for immediate campus data needs and to build support for future needs. Currently, the DMCTF can best support this ongoing work in collaboration with other library and campus groups. For 2014, the DMCTF can best conduct assessment and outreach on data in collaboration with Research Computing in their efforts for outreach and promotion on Research Computing and HiPerGator.

5 The DMCTF is collaborating with the Libraries [...] discussing next steps on data and grant-related supports. In December 2013, DMCTF members, others in the Libraries, and the Grants Manager met with the Division of Sponsored Programs on collaboration related to data, grants, etc.
6 The Digital Humanities Library Group (http://ufdc.ufl.edu/AA00014835/00030/pdf) developed from the DMCTF.
7 For liaison roles on data and digital scholarship project teams, see: http://ufdc.ufl.edu/AA00017119/00021/pdf
Background for Charging the Data Management/Curation Task Force
[...] plans and activities, and significant ongoing changes to research and teaching practices in the digital or the data age make data management a priority. Data management needs parallel and intersect with changes in academic libraries and librarianship, which are enabling new types of data support. 8 Academic institutions are facing a critical, urgent need for data management alongside needs and opportunities with academic libraries for creating, sustaining, and transforming cultures of data management.
Recognizing the need and opportunity, in 2011 UF created Research Computing to collaborate with other campus groups to enable radical collaboration, 9 and the Research Computing Advisory Committee tasked a subcommittee on data lifecycle management. Research Computing continues to develop technological infrastructure at UF and, in collaboration, for the State and beyond. In 2012, the UF Smathers Libraries and Research Computing collaboratively developed the Science and BIGDATA [...] to address the human infrastructure at UF and beyond, to create a culture of data management, and to enable radical collaboration.
Following this, the Data Management/Curation Task Force (DMCTF) began in 2013 with representatives from the Libraries, Research Computing, and the Office of Research. The DMCTF was charged to assess needs, make recommendations, and develop support for the role of the Libraries in campus-wide data management and curation. While the initial DMCTF charge focused on assessment and information gathering, the DMCTF members agreed that the charge had to be extended. The extension was to better support campus data needs (immediate and long term) with currently available resources, begin building additional supports in the best manner possible as integrated with other groups across campus, and undertake activities and develop plans for growing the full campus-wide culture of data management. 10

8 For a longer review, see: http://www.clir.org/pubs/reports/pub160/pub160.pdf
9 http://www.it.ufl.edu/wp-content/uploads/2012/03/research-computing-vision.pdf
10 Q1 Report: http://ufdc.ufl.edu/l/AA00014835/00001/pdf
Year One Activities: Overview, Current Status, and Recommendations for Year Two
The DMCTF charge included advisory and operational activities. Those activities, the status, and next steps are detailed below. Where the charge listed specific advisory and operational activities as separate groupings, the activities needed to support data management blend assessment, advisory, testing, and operational activities. Because of this, the advisory and operational activities are grouped together, where useful.

Charge: Formally assess, through surveys, interviews, and focus groups, campus-wide data management needs and current support resources and activities
Status: Formal and informal methods were used to help bridge translation gaps (e.g., some researchers [...]) and included: focused group discussions; formal interviews; informal information gathering through trainings, workshops, facilitated discussions, reference meetings [...] campus-wide survey with 288 responses)
Next steps:
Implement an annual survey in fall on data (add questions on Research Computing HiPerGator)
Continue to assess and develop recommendations for data needs and to foster a culture of, and comprehensive support for, data management
Continue to assess and develop recommendations for specific areas of need; identify appropriate projects and develop plans to build to recommendations
Continue work on project:
  o Dinky databases 11 are a known problem where researchers have small datasets and databases on personal, departmental, and other servers. With the current uneven support, the problem is undefined. With the problem defined, the next steps would be to assess these needs to develop support options and a plan for implementing support that continues to build toward the long-term goals for providing better researcher support and better centralized data support for possible new opportunities (e.g., for some fields, this is a project to define a service as with ArcServer for GIS).
  o The DMCTF recommends that Library Liaisons be asked to contact their departments and researchers on this problem, and to collect and share information with the DMCTF 12 and also with Research Computing.
  o After liaison contact, the DMCTF could coordinate and establish the needed collaborative teams, including collaborating with the appropriate IT (Library and Research Computing) units to migrate databases to the IR@UF when appropriate, inform and collaborate on development for SobekCM (which powers the IR@UF) for appropriate database support, and collaborate on additional supports as appropriate, which may include developing other centralized solutions. The DMCTF could then collaboratively support migrating the databases for permanent support and leveraging of capacity with the appropriate solutions.

11 http://ufdc.ufl.edu/l/AA00014835/00022/pdf and https://docs.google.com/spreadsheet/ccc?key=0AoYPOTobTSykdDNUeEprN01DQzlBdE10T19BRnRLRUE#gid=0
12 See example call for information: http://cms.uflib.ufl.edu/datamgmt/contactus

Charge: Review and consider the best practices and models of peer institutions
Status: The DMCTF reviewed the rich variety and abundance of activities by peer institutions. Programmatic activities, plans, and best practices are still largely in development, with wide variation based on institutional particularities and resource availability. Many institutions are engaged in activities and have developed structures that are informative for best practices. For instance, Purdue has an integrated grant process so that when grant proposals are submitted, the librarians are notified and can follow up regarding data management and work to ensure ongoing integrated technical operations and open communication. 13 The Libraries at Emory University 14 and Notre Dame University 15 created new centers for data support (providing central integrated support for data management/curation, GIS, digital scholarship, and related). Similarly, Pennsylvania State University created the new Publishing and Curation Services Unit. 16 Each of these library units offers consultative services, and the units act as central nodes that connect to other campus areas for collaboration and to provide integrated campus-wide support.
In the Fiscal Year 2013-2014 Budget Review report for RCM, the UF Libraries identified the core need for a Data Curation Librarian and IT Experts within the libraries, 17 noting that Libraries fill the role of the intellectual ombudsman as they bring disciplines together in a rapidly changing environment.
The need for these positions continues to grow. Based on peer institutions, the preferred model includes new data positions and a formalized library unit that provides data and related services. Aside from new personnel, best practices from academic libraries include integration with other campus groups and within the libraries themselves.
Next steps:
Pursue integration with the Division of Sponsored Programs; specifically:
  o DMPTool Training: promote and integrate with DSP trainings
  o IR@UF implementation of an authority system for ORCID and other identifiers 18
Integrate data management whenever possible with existing groups:
  o Library Instruction (for trainings and possible for-credit data courses)
  o Library Liaisons/Selectors (for integrated reference/consultation, referral, and promotion of existing data resources, information gathering for new needs, and overall communication with all of UF for data needs)
  o Born Digital Archival Content Working Group
  o Grants Committee
  o SobekCM user group
  o Authors@UF for data associated with publications
Support developing new groups as needed:
  o Digital Humanities Library Group (new interest group launched in 2014)

13 This has been discussed with the Division of Sponsored Programs, and should be available in mid-2014, following a system upgrade by DSP for their data systems.
14 See: http://digitalscholarship.emory.edu/about/ and http://digitalscholarship.emory.edu/research/data%20management.html
15 See: http://library.nd.edu/cds/ and http://library.nd.edu/cds/expertise/DataManagement.shtml
16 See: http://www.libraries.psu.edu/psul/pubcur.html
17 See page 27: http://ufdc.ufl.edu/IR00001359/00001

Charge: Develop and implement templates and support training and services for the DMPTool (Data Management Plan Tool) and other resources
Status: The DMPTool provides a structured guide for the process of writing a data management plan, with tailored supports for different funding agencies and programs. The DMCTF collaborated with others in the Libraries to deliver trainings and instruction focused on and featuring the DMPTool. 19 The DMCTF collected example data management plans 20 from UF for use by others as models in writing their plans.
Next Steps:
Continue to deliver hands-on trainings with the DMPTool, which are needed on an ongoing basis 21
Continue development of the DMPTool with the version 2 release in 2014
Continue to collaborate with the Library Instruction Committee on data training and instruction for librarians, faculty, and students.
  o Specific training of interest includes training on the DMPTool (interest in overviews on the tool, and in hands-on, guided use of the tool to create a data management plan for a specific granting opportunity).
Trainings needed include in-person and online trainings, as well as ongoing work to ensure DMPTool training is part of an integrated data management training program from the Libraries, integrated with Library Instruction overall and with other campus groups involved in data management, including the Division of Sponsored Programs.

18 Discussed in the meeting with DSP, with Emerging Technologies grant proposal targeted for 2014.
19 Presentation slides: http://ufdc.ufl.edu/AA00017906/00007/pdf
20 Example UF DMPs: http://ufdc.ufl.edu/contains/?t=%22Data+Management+Plan+(+DMP+)%22&f=SU
21 The DMCTF discussed the DMPTool trainings with the Office of Research and Research Computing. Both will refer and promote this centralized service as provided by the Libraries.
Charge: Recommend a framework for liaisons and subject specialists to incorporate data instruction and consultation into their workflows
Charge: Develop materials and sessions for training of liaisons, subject specialists, and other library staff to prepare them to support campus data management services
Charge: Develop training and outreach materials to be used by liaisons, subject specialists, and other library staff in their work with clients
Recommendation for a Framework: In the age of Big Data, the roles for librarians (as functional, technical, and subject experts) are being blurred, with librarians adding fluency in the other areas to their primary expertise. With broad and deep areas of knowledge, librarians are expert collaborators for the academic and research community, and libraries serve to support communities of expert collaborators with the space and other resources required to support connecting to information and other resources for research. Data instruction and consultation logically extend from existing expertise and activities. The DMCTF recommends an integrative framework that recognizes the existing strengths, leveraging and building on known areas of expertise, workflows, activities, products, and capacities. The DMCTF recommends developing this integrative framework by focusing on specific activities (explained in the next sections) to strengthen the foundation for next steps in development and expansion.
Status for Sessions, Training, and Outreach Materials: The Data website 22 is a core resource for supporting campus data management services. The DMCTF hosted a number of trainings, workshops, and instruction sessions on [...] 23 ) and created additional data orientation and overview materials. 24
Recommendation: The DMCTF recommends a variety of activities to incorporate data into existing Liaison activities. The DMCTF has created a variety of resources to support these activities. 25
The DMCTF recommends that all librarians:
Incorporate the slides (as refined to be most appropriate) on data into all appropriate library presentations and instruction.
The DMCTF recommends that all Liaison and Subject Specialists update their LibGuides and appropriate websites to add a tab or page for data with the links and materials below:
  o Data website, http://library.uflib.ufl.edu/datamgmt/
  o IR@UF, http://library.ufl.edu/ufir
  o DMPTool
  o Research Computing, http://rc.ufl.edu
  o HiPerGator, http://www.hpc.ufl.edu/2013/06/hipergator/
  o Appropriate subject repositories (or DataBib for finding resources)
  o Additional, appropriate resources for data identified by the Liaisons
Incorporate links for relevant data management resources into all presentations and trainings, including introductions and overviews, as with the new graduate student and postdoc orientations, 26 as well as into all resource materials, including basic handouts. 27
Contact their departments to provide an update on data services from the libraries (draft text 28)
Contact their departments to request information and feedback regarding their small [...] 29 ).
  o Following contact regarding the dinky databases, the DMCTF recommends that the DMCTF collaborate with librarians who receive information on dinky databases and data projects to create a data project page, with short entries on a variety of data projects and associated researchers. The simple narrative (not directory listing) information and the data project page 30 would provide an accessible sampler guide to different types of data projects across campus and would provide an accessible entry point for thin[...] means for different research fields, options for managing it, and a way to connect/provide context for case studies. 31
Include and integrate data into existing events whenever possible, explicitly mentioning data support from the libraries and Research Computing resources. In many cases, this may simply require acknowledging that data is already included and integral to the event. For instance, various events in InfoCommons, GIS Day, Digital Humanities Day, and other Digital Humanities events 32 may not always mention data explicitly as data, but data is a central concern and part of both.

22 http://library.ufl.edu/datamgmt
23 Announcement: http://ufdc.ufl.edu/AA00016055/00001/pdf ; http://ufdc.ufl.edu/AA00017906/ ; Zotero LibGuide: http://guides.uflib.ufl.edu/profile.php?uid=2284
24 Data curation exploration slides: http://ufdc.ufl.edu/AA00013885/00001/pdf and short promotional handout: http://ufdc.ufl.edu/AA00017342/00001/downloads (PSD file)
25 For instance, see this draft text on data support and services: http://ufdc.ufl.edu/AA00019190 Other resources include standard, template slides and handouts on data management resources from the Libraries and Research Computing, and a guide for supporting data at the reference desk.
Include and integrate data into existing and new courses taught by librarians. The DMCTF will collaborate with the Instruction Committee and individual librarians to add information on data management support from the Libraries and Research Computing to existing courses, which could include: GIS, Introduction to Library and Internet Research (IDS4930), African Studies Bibliography Graduate Course, 33 Anthropology Bibliography Graduate Course, Preserving History Undergraduate Internship Course in Archives, etc.

26 http://guides.uflib.ufl.edu/grad_orientation and http://guides.uflib.ufl.edu/postdoc
27 http://guides.uflib.ufl.edu/content.php?pid=49813&sid=435652
28 http://ufdc.ufl.edu/AA00019190/00001
29 http://ufdc.ufl.edu/AA00014835/00022/pdf
30 Possibly like that for DH: http://cms.uflib.ufl.edu/DigitalHumanities/UFDigitalHumanitiesProjects
31 The DMCTF could expand selected entries into case studies to show projects where data is excellently managed, to explain the different aspects of data management, with links to resources on campus providing support, etc. This could help to communicate what the technologies do and how the resources connect. This would also support the Data Science Projects course in CSE, which requires a list of projects with half-page explanations for students to use in finding data projects to work on for the class.
32 http://library.uflib.ufl.edu/DigitalHumanities/DigitalHumanitiesWorkingGroup

Pursue new presentation and training opportunities on data in collaboration with resource experts like the IR@UF Coordinator, Scholarly Communications Librarian, etc.
Contact their faculty regarding data-related courses to share data management support resources from the Libraries specifically relevant for the course. For instance, Computer Science Engineering will begin a new Data Science 3-course sequence (each course is 3 credit hours) in spring 2014. The notes towards the sequence currently list the courses as: Elements of Data Science; Advanced Topics in Data Science; Projects in Data Science. 34 In Digital Worlds, an existing course is focused on research computing and data: Digital Worlds: Interdisciplinary Research Seminar, an introduction to research computing (DIG6840C)
Watch the SobekCM training videos to then support data needs using SobekCM when appropriate. The Smathers Libraries use and support the SobekCM Open Source Software for digital collections, digital libraries, data collections, and data sets. SobekCM support for data, user needs, and curator and collection manager needs makes it an excellent choice for data management, and an excellent system to know for comparison when needed. 35
  o Join the SobekCM user group to best support data, digital collections, and user needs. 36
Review the data sets in the IR@UF, where data sets accompanying electronic theses and dissertations (ETDs) are submitted directly into the [...] along with all ETDs 37
The DMCTF recommends ongoing work by the DMCTF in collaboration with others to:
Investigate creating a new data course.
An introductory course on data, data science, and research computing (what could be called a course on data literacy) does not yet exist at UF. The Research Computing Advisory Committee (RCAC) has identified that a primary data course is needed. 38 The DMCTF recommends that the DMCTF and Library Instruction Committee collaborate on the possibility of developing and teaching the new data literacy course. 39
Host additional data-focused events, ideally with additional data activities included. The DMCTF recommends a new event, possibly a poster session, and possibly modeled [...] program. For additional data activities for events like this, the DMCTF recommends specifically including the IR@UF, perhaps requiring all participants to submit their materials to the IR@UF. 40

33 http://guides.uflib.ufl.edu/content.php?pid=6493&sid=1480100
34 http://www.it.ufl.edu/wp-content/uploads/2012/03/RCAC_minutes_2013Nov4.pdf
35 SobekCM Training Series: http://ufdc.ufl.edu/AA00019186/00001/pdf
36 SobekCM User Group: https://groups.google.com/forum/#!categories/sobekcm-discuss/general
37 http://ufdc.ufl.edu/ufetd
38 http://www.it.ufl.edu/wp-content/uploads/2012/03/RCAC_minutes_2013Nov4.pdf
39 This includes consideration of whether a series of online modules or a credit course is best, and of the process, where it could be that a series of online modules is used to build into a credit course or is found sufficient on its own, etc.

Future Recommendations: Pursue new opportunities with any new hires, system integrations, and existing processes for review and evaluation of librarians given added data responsibility needs, including:
With the new IR@UF Coordinator, the DMCTF initiated discussions on IR@UF trainings that present the IR@UF within the context of an integrated service. For instance, training on resources from the Libraries for grants could include: grant databases for finding funding opportunities, research skills for developing the proposal, the DMPTool for data plans, responsible conduct in research and ethics, author rights and copyright for publications, the IR@UF for grant products (posters, publications, data, etc.), and more.
The Application Engineers for SobekCM are developing additional data support 41 and will be [...] front-end interface with world-class computational power back end from Research Computing. In the near future, the DMCTF should coordinate scheduling for presentations, facilitated discussions on user needs and design concerns, as well as trainings on data support in SobekCM, with these for internal and then external groups.
The DMCTF should support librarians and supervisors in considering the role of data for current and expected near-future needs to determine if the annual assignment goals or position description should be revised to include the data-related roles.

Charge: Recommend the role of the Institutional Repository (IR@UF) and Research Computing in storing, finding, and accessing working and final data, and linking publications to supporting data
Status: Research Computing is bringing up a new data system with expected support at end of Q1 2014.
The new system will provide basic, functional support for access controls and permissions (including supporting UF researchers and their collaborators outside of UF), and other core supports needed for data management. 42 The IR@UF will need to be connected to the Research Computing data store, integrating the systems as part of a connected infrastructure. The integrated system will solve many needs, and will inform how to best solve other needs. Already identified needs include enhanced support in SobekCM for provenance to provide the data collection and tracking needed for reproducible research.

Other UF systems also need to be integrated and connected. For instance, the IR@UF needs to be integrated with the systems for the Division of Sponsored Programs (DSP), at least for author disambiguation. The Libraries applied for an ORCID implementation grant [...] authors to the IR@UF along with a full authority structure for authors and other entities, which was not funded. The Libraries are now reworking the proposal to focus on the needed authority system work, with plans to submit the proposal as an internal emerging technologies grant in early 2014. Integrating, connecting, and enhancing core and continuing UF systems is ideal, whenever possible and appropriate. This follows the recommendations for the rest of the report, which focus on leveraging and building from existing expertise and capacity.

Recommendation: Recommended next steps include author disambiguation and visualization support:
• Submit the grant for name disambiguation support by integrating an authority system in SobekCM (for the IR@UF and all UF resources), with future plans for integration with the Division of Sponsored Programs.
• In addition to work to connect the IR@UF and Research Computing, and the IR@UF and DSP through ORCID IDs, the DMCTF recommends continuing to pursue work on the IR@UF, specifically adding integration with simple visualization APIs. The data would be stored in the IR@UF, and users could also display the data as a graph or chart in the IR@UF (through the use of the API) without needing to download the data. This will support users for some needs, and will support users in thinking through how they want to store, access, and use stored data. The DMCTF also recommends, if possible and useful, that the IT staff for the IR@UF be in contact with Research Computing, ICBR, and/or another central IT group during the simple visualization API connection. This would seed discussion on connecting the IR@UF repository storage with tools on the other central IT group systems, and on how to further expose data in the IR@UF for visualization.
• The DMCTF should continue ongoing research to identify applicable tools and systems to connect, interlink, or support separately when appropriate. For this, the DMCTF may work from specific use cases, functional and non-functional requirements specifications, and other information as it applies to operational, planned, and research activities. The aim is to ensure all recommended technologies are accompanied with as much support as possible for defining best practices, appropriateness or fitness for purpose, and more, to define when, where, what, and how the technology best supports the academic goals, and how the technology relates and connects to others. With each new planned or available technology, outreach, promotion, and usability testing will be needed.

40 Other events could include the IR@UF to focus on the need and availability to support posters from poster sessions and grey literature. Conference posters are often cited in published articles based on the abstracts alone with no access to the poster; researchers are cited less often with no access to their content from posters; etc. The IR@UF component would also include showing all presenters and attendees how to use the IR@UF specifically for their posters, offering to meet and assist those with many posters, and offering to visit departments to share on the IR@UF as a data management resource.
41 http://ufdc.ufl.edu/AA00017907/00001/pdf ; http://ufdc.ufl.edu/AA00019155/00001/pdf ; and data support already in place: http://ufdc.ufl.edu/AA00017119/00001/allvolumes
42 The DMCTF evaluated many Open Source software data packages. None emerged as a clear best or good solution for a majority of needs. SobekCM emerged as a best fit for some of the identified needs, provided there was additional support for computation and a backend data store, which Research Computing is currently implementing. Additional work will be required to integrate these systems. To identify data needs, the DMCTF evaluated the Dataverse Network (DVN) data repository software with SURA ASERL. The DVN software is a poor match for current needs. While it has additional functionality, the functionality is implemented in complicated ways and impacts usability. Further, planned uses [...] without an additional dedicated person on the DVN alone, and this would not lend itself to leveraging or integration with other systems and so would be an added cost/requirement.

Charge: Develop means to enhance and expand the librarian liaison model with the goal of making librarians partners in research activities

Status: Librarians are unevenly and inconsistently included as partners in research activities. The DMCTF [...] 43 approach for data and digital scholarship project needs, in addition to integrating support for data workshops within existing liaison activities. For some areas, this may be an integrative approach that builds on and from existing areas of expertise. For instance, the GIS expertise in the Libraries can be reframed as visualization (GIS and other) to expand the roles and opportunities for the Libraries in regards to visualization and data overall. Additional work is needed to enhance and expand the librarian liaison model with the goal of making librarians partners in research activities, and further work is needed to ensure all librarians are fully supported for this work.

Recommendations on Visualization and GIS: The Libraries should build on existing data visualization expertise and activities in GIS for greater awareness and outreach that extends further into other visualization areas.
For this, the GIS Librarian welcomes the opportunity for this expanded role, which would include many areas. Those could initially be:
• Providing expert consultation, contributing narrative, and participating as an investigator on grants from across campus
• Developing a list of campus plotter services for providing referrals, with this as a service not intended for support within the Libraries
• Collaborating with the Libraries on the 3D printing service for referral support and any connected technical and research concerns
• Developing and maintaining a list of available visualization tools and supports for awareness, and then developing additional trainings, materials, and activities based on the available and relevant tools for research needs
• Developing and providing support for additional visualization tools as they become available (researcher requests include technologies for HPC simulation/video rendering, remote sensing, etc.)
• Establishing a [...] other appropriate space), with set lab hours for specific days/times each week for at least one full semester, with the GIS Librarian holding these open hours (possibly with an expert from Research Computing and other visualization experts, when possible) to provide expertise, hands-on consultation, training, and support

43 See: http://ufdc.ufl.edu/AA00017119/00021/pdf
   o [...] integrated and connected service. For instance, the Libraries and Research Computing could co-author a short news release explaining how this new service is deeply connected with the Libraries and Research Computing for current and future work, noting that students (and those teaching and needing to support their students) can now access ArcGIS through UFApps, and that all faculty will soon have access to ArcGIS through RCApps, which the Libraries and Research Computing look forward to [...]
   o For this, success will be measured in terms of promoting awareness of the expertise and support available for visualization and GIS. At the end of the first [...] level of promotion, awareness, and outreach. At the end of the first semester, the approach may be refined or other methods may be selected for this work. Additionally, if both the GIS Librarian and an expert from Research Computing are available for the open labs, this pilot will be assessed on how well it facilitates and supports further collaboration and community building across the Libraries and Research Computing.

Recommendations on Grants: The DMCTF is working to help build a culture of data management, and the grants program is working to build a culture of grantsmanship for the full project, from connecting to collaborators before the project idea through project completion and on to new phases.
With many related activities, the DMCTF is in discussion with the Grants Committee for collaborative work on overall shared concerns and specifically for making librarians partners in research activities.

[...] level role in support of data management and curation

Charge: Propose a corresponding framework and resources for library support of the data life cycle

Status: Faculty, students, and staff from all across UF are in need of data management and curation support now and will have greater needs in the future. The Libraries must address immediate needs, build towards long-term support, and work towards growing the larger culture to increase the overall institutional capacity. The Libraries already have a campus-level role in data management and curation because the Libraries are campus-level leaders in collaboration, human infrastructure, technical infrastructure for curation, and so much more. The DMCTF has worked to grow and develop this collaborative connector role in collaboration with Research Computing.

Recommendation: The DMCTF recommends that the Libraries continue to strengthen the collaboration with Research Computing and work to cross-promote Research Computing as a core, library-related resource. Building on this collaboration, the DMCTF recommends extending it to connect with campus IT groups and campus IT professionals supporting data. For this, the DMCTF recommends targeting specific groups for shared data needs: 44
• Selected IT groups and individuals: As information connectors, supporters, teachers, and collaborators, academic libraries have a close affinity with their campus counterparts in information technology, as is clearly seen in the collaboration between the Libraries and Research Computing. The collaboration needs to continue on to connect with other groups to grow the culture of and capacity for data management across the groups. This is needed for all groups and areas where data management is done.
• Selected campus units/groups: For instance, the Samuel Proctor Oral History Program (currently hiring for a Digital Humanities Academic Production Specialist), ICBR, and the Informatics Institute. Ideally, in selecting campus units/groups, the DMCTF will connect the new group and also strengthen existing collaborations with other campus groups on data.
Recommendations for Marketing and Outreach

The DMCTF has additional specific recommendations in regards to marketing and outreach:

Recommendations:
• Request a link for Data to be added to the UF Smathers Libraries homepage, which is soon to update to the new design (Requested November 2013)
• Collaboratively write a brief (1 page) news article with Research Computing for UFIT news
• Collaborate with Research Computing on outreach for resources and services from Research Computing and HiPerGator
• Integrate data with existing events, trainings, and activities, including:
   o Establishing open hours
   o Developing IR@UF training that presents the IR@UF as an integrated service
   o Updating all LibGuides and appropriate websites to add a tab/page on data resources
   o Integrating data outreach and promotion into existing events (e.g., InfoCommons, GIS Day, Digital Humanities Day, etc.) for promotion and awareness of resources from the Libraries and Research Computing
   o Integrating data into existing and new courses taught by librarians
   o Integrating data into reference support
   o Integrating data into overall library services and resources promoted by Liaisons

44 [...] re positioned to enable radical collaboration. The age of Big Data blends previously separated areas for librarians (e.g., functional, technical, and subject expertise) in a parallel manner to the blended expertise necessary for research computing (e.g., systems administration, programming, database administration, experimental processes, concerns and needs specific to academic research as with understanding the difference in acceptable error rates in research versus commercial web searching, and how to support the rigorous requirements for research). The [...] http://llc.oxfordjournals.org/content/26/2/217.full ).