Citation
Data Management / Curation Working Group Year End Report (2019)

Material Information

Title:
Data Management / Curation Working Group Year End Report (2019)
Series Title:
Data Management/Curation Working Group Agenda/Notes
Creator:
Smith, Plato
Norton, Hannah
Deumens, Erik
Hawley, Haven
Durant, Fletcher
Gonzalez, Sara
Kiszka, Sara
Gitzendanner, Matt
VanKleeck, David
Neu, Ed
Taylor, Laurie
Leonard, Michelle
Maxwell, Dan
Place of Publication:
Gainesville, FL
Publisher:
George A. Smathers Libraries, University of Florida
Language:
English
Physical Description:
Notes

Subjects

Subjects / Keywords:
Data management
Data curation
Training

Notes

Abstract:
Meeting notes for the UF Data Management / Curation Working Group.

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
The author dedicated the work to the Commons by waiving all of his or her rights to the work worldwide under copyright law and all related or neighboring legal rights he or she had in the work, to the extent allowable by law.

Full Text

PAGE 1

Data Management/Curation Working Group Report Page 1 of 16 Updated: 16 January 2020

Data Management/Curation Working Group, Year End Report (2019)

Data Management/Working Group Charge: Data Management/Curation Working Group Charge (2016) [1]
Responsible Executive: Senior Associate Dean of Scholarly Resources & Services, George A. Smathers Libraries
Contact: Data Management/Curation Working Group Email: datamgmt-l [at] lists [dot] ufl [dot] edu
Superseded Documents: Data Management/Curation Working Group, Year End Report (2018) [2]; Data Management/Curation Task Force Charge (2014) [3]; Data Management/Curation Task Force: Year One Report and Recommendations [4]
Reporting Period: 2019-01-01 to 2019-12-31
Date of Report: January 15, 2020

1. Executive Summary ..... 1
2. Key Accomplishments and Recommendations ..... 3
3. Appendices ..... 5

1. Executive Summary

In its sixth year, the Data Management/Curation Task Force (DMCTF), now operating as the Data Management/Curation Working Group (DMCWG), continues to develop collaborations, connections, and strategic partnerships across campus, including the UF Office of Research, UFRC, UFII, UF/IFAS Nature Coast Biological Station (NCBS), UFCISE, UF Clinical and Translational Science Institute (CTS-IT), and UF Innovation Hub. The DMCWG promulgates the FAIR (findable, accessible, interoperable, reusable) data guiding principles, which include development of strong data management planning (including documentation of provenance), increased access and sharing, and long-term preservation. The group also recommends further development of research data science and informatics as key opportunities in 2020.

Outputs
Total DMCWG Meetings (2019-01-01 to 2019-12-31): 7
Research Bazaar at UF (ResBaz@UF), https://resbaz.github.io/resbaz2019/gainesville/: 1
Grant-funded projects and/or sub-awards: 4
International Conference Presentation, http://www.textrelease.com/gl21program.html: 1
Data management meetings with senior UF stakeholders, including UF Office of Research: 6
Total DMCWG subscribers as of 12/31/19: 54

[1] Data Management/Curation Working Group Charge. (2016). http://ufdc.ufl.edu/AA00014835/00076.
[2] Data Management/Curation Working Group, Year End Report (2018). http://ufdc.ufl.edu/AA00014835/000135.
[3] Data Management/Curation Task Force Charge. (2014). http://ufdc.ufl.edu/AA00014835/00048.
[4] Data Management/Curation Task Force: Year One Report and Recommendations. (2014). http://ufdc.ufl.edu/AA00014835/00032.

PAGE 2

Purpose and Scope

The purpose of this report is to provide a compilation of the DMCWG activities carried out this year for review by multiple stakeholders across campus. The scope of this report covers DMCWG program building, collaborations, outreach, and goals in support of the UF George A. Smathers Libraries’ strategic directions.

Selected Accomplishments of Note

Activities, Consultations, Events, Meetings, and Trainings

The DMCWG continues developing socio-technical (goals/metrics, people, infrastructure, technology) data management collaborations across multiple communities of practice and stakeholders through meetings, outreach, projects, and workshops. Some key accomplishments in 2019 include the following:

a. Presented the DMCWG collaboration with UFCEHT and UF ICBR to develop a Data Management and Analysis Core (DMAC) for an NIH P42 grant proposal (RFA-ES-15-019) at the 21st International Conference on Grey Literature (GL21) at the German National Library of Science and Technology, Hannover, Germany – 10/22/19 to 10/23/19
b. Presented ARCS: Data Management at the UF/IFAS Open House at the Plant Diagnostic Center Affiliated Library at 2570 Hull Rd. (See: WCJB 20 http://tinyurl.com/rsnnk7e) – 10/10/19
c. Presented at the Research Bazaar (ResBaz) Gainesville – 9/13/19
d. Presented Introductory Data Management – Developing, Archiving, Sharing Data for Current/Future Use for a UF/IFAS Assistant Professor as part of the UF/IFAS Global Feed the Future – Haiti program project (See: https://ufdc.ufl.edu/IR00010927/00001) – 8/6/19
e. Demonstrated the Zenodo general data repository for data deposit at the UF Anesthesiology Research Executive Committee meeting – 5/10/19
f. Discussed a data management plan for the Nature Coast Biological Station (NCBS) at Cedar Keys with the Director and project team members – 5/8/2019
g. Developed and led the 1st Data Management introductory graduate course (INSC 590 003/004) at the University of Tennessee, Knoxville iSchool (See: students’ projects) – 1/6/19 to 4/24/19
h. Attended the National Science Foundation (NSF) CAREER workshop at NSF Headquarters in Alexandria, VA (See: http://cisecareerworkshop.web.unc.edu/agenda/) – 4/8/19 to 4/9/19
i. Presented a data management guest lecture for CAP5108: Research Methods for Human-centered Computing (See: https://ufdc.ufl.edu/AA00014835/00143) – 4/18/19
j. Presented DMCWG data management updates to the UF Research and Scholarship Council (SCORS) (See: http://tinyurl.com/ve84rn3) – 3/26/19
k. Met with UF Legal Counsel and FL DEP to discuss SEACAR data, MOU, and strategies – 3/12/19
l. Met with the FACETS project data coordinator to discuss data management – 1/17/2019

Event                            Format  Participants  Discipline  Type      Semester
a. GL21 Conference Germany       F2F     30            Multiple    Specific  FA 19
b. Plant Diagnostic Ctr.         F2F     40+           IFAS        Specific  FA 19
c. ResBaz Gainesville            F2F     8             Multiple    General   FA 19
d. Global Feed the Future        F2F     12            IFAS        Specific  SPR 19
e. UF Anesthesiology Lab         F2F     5             Medicine    Specific  SPR 19
f. UF/IFAS NCBS                  F2F     4             IFAS        Specific  SPR 19
g. UTK iSchool INSC 590          F2F     18            LIS         Specific  SPR 19
h. NSF CAREER Workshop           F2F     100+          CISE        Specific  SPR 19
i. UF CISE CAP5108               F2F     12            CISE        Specific  SPR 19
j. UF Research Council (SCORS)   F2F     8             Research    Specific  SPR 19
k. SEACAR (See Appendix D)       F2F     3             IFAS        Specific  SPR 19
l. UF/IFAS FACETS                F2F     1             IFAS        Specific  SPR 19

PAGE 3

Building Collaboration and Awareness

o Developed university-wide data management and electronic lab notebook (ELN) conversations with Penn State University and Carnegie Mellon University, respectively
o Continued developing collaborative partnerships with UF Office of Research, Research Computing, UF CTS-IT, UFII, IFAS, UF Innovation Square, and the Carpentries @ UF to develop outreach, support services, and workshops (e.g. GPC, ResBaz, Symposiums)
o Reviewed for the 2019 UF International Center (UFIC) Global Fellows Program – 10/29/19
o Facilitated University of Florida George A. Smathers Libraries Earth Science Information Partners (ESIP) membership in 2019

Establishing International Collaboration and Awareness

o Invited to and participated in the GL21 Program Planning Committee, GreyNet International

2. Key Accomplishments and Recommendations

Key Accomplishments: Developed high-level approaches to data management with senior stakeholders

o Initiated by the UF Office of Research at the end of 2018 and developed by UF Research Computing (UFRC), UF Clinical and Translational Science Informatics Consulting (CTS-IT), and UF Libraries, the 1st draft proposal for Data Management Infrastructure at UF was developed in 2019. The DMCWG will continue to support socio-technical data management collaborations with campus stakeholders across multiple units to develop data management capacity, infrastructure, and resources that sustain University-wide Research Data Management conversations [5] at UF before, during, and after the funding award. This is an interdisciplinary joint effort pending broad support.
o Facilitated development of the Data Management at UF proposal (Accepted)
o Partnership between UF Office of Research, UFRC, CTS-IT, and UF Libraries DMCWG
o Facilitated development of the HiPerGator access for students – A proposal (See Appendix E)
o Proposed support partnership between UF George A. Smathers Libraries and UF Information Technology

Recommendation 1: Facilitate development of the Libraries as Connector and Resources Broker

The DMCWG adopts a socio-technical systems theory [6] perspective on data management to address research data management support services as a way to better meet stakeholders’ data management and sharing needs. In continued efforts to support UF’s data needs as they relate to evolving funder-mandated data management and sharing requirements, the DMCWG recommendations include but are not limited to:

1. Organize outreach, programs, and education for data management across all disciplines
   a. Develop partnerships with multiple high-level stakeholders for university-wide impact
   b. Develop partnerships with UFIT Data Governance, Research Computing Advisory Committee (RCAC), Faculty Senate Research & Scholarship Council (SCORS), Information Security Advisory Committee (ISAC), and other data-related committees and groups
   c. Develop adaptable, pedagogical, and relevant training modules, videos, & workshops
2. Develop collaborative programs to enhance curriculum, policy, and strategic directions
3. Develop collaborative proposals that build capacity, infrastructure, and resources across units
   a. Work with UF Office of Research, UFRC, and UF CTS-IT on data management at UF

[5] Erway, R. (2013). Starting the Conversation: University-wide Research Data Management Policy. Accessed January 14, 2020 from http://tinyurl.com/tjzlrk8.
[6] University of Leeds. (2020). Leeds University Business School. Socio-technical Centre. Socio-technical systems theory. Accessed January 14, 2020 from http://tinyurl.com/vfjt87y.

PAGE 4

Recommendation 2: Develop Collaborative Programs to Enhance Data Management Services & Support

The UFRC, CTS-IT, UFII, and DMCWG training workshop groups will need to collectively align, coordinate, and disseminate “research data science” education across communities of practice for better integration of services university-wide. According to the Committee on Data of the International Council for Science (CODATA) and the Research Data Alliance (RDA), “research data science” requires an ensemble of skills that includes (1) principles and practices of Open Science and research data management and curation, including data repositories, (2) the use of a range of data platforms and infrastructures, (3) large-scale analysis, (4) statistics, (5) visualization and modeling techniques, (6) software development and annotation, and (7) more [7]. The DMCWG will work with campus partners on developing the ‘ensemble of skills’ required for research data science in collaboration with various data-related campus groups, funded projects, labs, units, and library groups. Two recommendations to promote this effort are:

o Collaborate on data-related activities, funded projects, and research data science across units
o Provide an interdisciplinary socio-technical data management support framework that scales

The DMCWG works in collaboration with the Health Science Center Libraries (HSCL), data-related groups, and core non-libraries collaborators such as Research Computing, UFII, UF/IFAS, and CTS-IT, to name a few. In addition, all Liaison Librarians play a critical role as primary contacts for data inquiries and also serve as leaders in consultative capacities for project teams within and outside of the libraries. The DMCWG will continue to develop the data management program in partnership with collaborators, library Liaisons/Academic Research Consulting & Services (ARCS), and external stakeholders. For external group collaborations, the DMCWG will continue to pursue relevant expertise within the Libraries/domains for data lifecycle management, from data scientists and computational team members to the social sciences/humanities, for all data projects across diverse communities of practice.

Recommendation 3: Develop Research Data Management Guidance, Models, and Success Metrics

The National Institutes of Health (NIH) defines scientific data as “[t]he recorded factual material commonly accepted in the scientific community as necessary to validate and replicate research findings, regardless of whether the data are used to support scholarly publications” [8]. Research data can be categorized as (1) observational, (2) experimental, (3) simulation, (4) derived or compiled, and (5) reference or canonical [9]. The Data Asset Framework Implementation Guide (2009, p. 3) [10] diagrams the alignment of various stakeholders’ roles to the responsibilities of data management and curation in the DCC Curation Lifecycle Model [11]. The DCC Curation Lifecycle Model provides a high-level, graphical overview of the stages required for successful data management, curation, and preservation, from creation through the iterative data lifecycle process. The DMCWG, working in partnership with the UF Office of Research, UF Research Computing, UF Clinical and Translational Science Informatics Consulting (CTS-IT), and UF Informatics Institute (UFII), will develop collaborations, recommendations, and strategies for data management resources, support, and training.

o Assist faculty and researchers with funding support for capacity, infrastructure, and resources

[7] CODATA. (n.d.). CODATA-RDA School of Research Data Science. Accessed January 14, 2020 from http://tinyurl.com/wvqewgn.
[8] NIH. (November 2019). DRAFT NIH Policy for Data Management and Sharing. Accessed January 14, 2020 from http://tinyurl.com/u53btbl.
[9] University of Edinburgh. (2011). Edinburgh University Data Library Research Data Management Handbook, p. 5. Accessed January 16, 2020 from http://www.docs.is.ed.ac.uk/docs/data-library/EUDL_RDM_Handbook.pdf.
[10] JISC, University of Glasgow HATII, & DCC. (October 2009). Data Asset Framework Implementation Guide. Accessed January 14, 2020 from https://www.data-audit.eu/docs/DAF_Implementation_Guide.pdf.
[11] DCC. (2007). The DCC Curation Lifecycle Model. Accessed January 14, 2020 from http://tinyurl.com/2699oob.

PAGE 5

o Facilitate UF data-related groups’ collaboration, communication, and integration of data management efforts within domains and across multiple communities of practice
o Collaborate with consortiums and institutions, and develop new international partners for 2020

2020 goals – carryover from 2019

i. The Scholarly Communications Librarian, in partnership with Library Technology Services and UF Identity Management, will develop ORCID API integration with select campus publishing platforms:
   o Potential myUFL use case (e.g. PeopleSoft platform)
   o Potential Libraries use case (e.g. SobekCM (IR@UF platform))
   o Potential UF/IFAS use case (e.g. EDIS publishing platform)
ii. Develop institution-wide data management and sharing training, support, and guidance for funded projects in collaboration with multiple campus partners
iii. Explore a data repository solution that integrates with the institutional repository (e.g., CKAN (https://ckan.org/), Invenio (https://invenio-software.org/))
iv. The UF College of Business leads the effort to support GitHub initiatives, through an Applications Developer Analyst in one of its units. The DMCWG will contribute support to GitHub organization and institutional code/software repository efforts/initiatives via campus collaborations.
   o Promote a University-wide license and/or organization of all GitHub instances and data for monitoring/tracking UF data assets, particularly federally funded research data in GitHub repos (modeled after University of Minnesota)
   o Investigate a University-wide electronic lab notebook (ELN) solution
   o Explore REDCap integration with ELN
v. Participate in data sharing/transfer agreements/MOUs for cooperative agreements and multi-institutional collaborative projects, including funded projects
vi. Develop preservation workflows for raw, processed, and publicly available datasets
   o Will require varied repositories and preservation infrastructure
     - Institutional repository
     - Discipline-specific repository
     - General data repository
vii. Collaborate/partner with other UF data groups to further develop Data Management @ UF
   o Develop support for funded projects and centers to successfully meet funding mandates
   o Develop a proposal for research and education that supports 1a. CyberTraining (18-516)
viii. Promulgate ORCID API development, integration, and implementation across multiple UF units
ix. Articulate funding, outreach, services, tools, and workshops in collaboration with other units
   o Develop data management guidelines for doctoral students (See: Univ. of Bath, Leiden U.)
   o Develop training for Library Liaisons for faculty interface/outreach
   o Develop a Sloan Community and/or Training grant that supports 3a. & 3b. (See: D&CR)

3. Appendices

A. Data Management @ UF meeting w/UF Office of Research VP brief notes (10/31/19)
B. Supplemental DRAFT Guidance: Elements of a NIH Data Management and Sharing Plan (Plan) comments submitted to NIH (1/9/2020)
C. Select Funded Projects
D. Data Transfer Cooperative Agreement Sample (Draft) for SEACAR
E. HiPerGator access for students – A proposal (Accepted)

PAGE 6

Appendix A – Data Management @ UF meeting w/UF Office of Research VP brief notes (10/31/19)

The Director of Research Computing, the Director of Clinical Translational Science IT, and the DMCWG Chair met with the UF Office of Research VP to discuss next steps for the Data Management @ UF Proposal (9/10/19). One of the major goals of this meeting was to seek agreement that the initial draft proposal was on target: it aims to develop a strategy that leverages current resources at UF to better address funders’ increasing data management and sharing requirements, which challenge researchers’ capacities, infrastructures, and resources to fulfill those mandates. Below are a few highlights and action items:

Highlights:

Discussed the need to provide proper guidance on UF resources to fulfill funders’ increasing data management mandates
Discussed the need for a responsible organization, not necessarily a new one, to handle brokering of data management resources and services at the university level
   o Note: Develop a matrix of services and responsible entities
Discussed the need to develop an organizational chart that represents the key stakeholders, roles, and responsibilities (e.g. the UNSW Data Governance Policy comes to mind. See: https://www.datagovernance.unsw.edu.au/node/12)
Discussed that the DM org (when developed) will be consultatively governed by the Research Computing Advisory Committee
Agreed that Research Computing and CTS-IT will co-chair development of the DM org
Discussed the need for a business model to support the DM org, including new resources such as an ELN solution (e.g. modeled after the CTS-IT REDCap business model, with the first year of service free and a small nominal fee thereafter)
Agreed to leverage current UF capacities, infrastructure, and resources to address university-wide data management support first
Discussed the role of the libraries in this endeavor
   o The Libraries will serve as a resources and services broker between UFRC and CTS-IT with respect to data management infrastructure at UF.
   o The Libraries will provide outreach and training support in alignment with the DM org (UFRC and CTS-IT).
   o The Libraries will assist in university-wide data management support efforts such as electronic lab notebooks (ELN), research computing access for students, and training (e.g. CITI Program)
Requested intelligence on data management infrastructure at peer/other large research institutions
   o Note: the following institutions were specifically suggested as good starting points for investigating data management infrastructure at other institutions.
     1. UCLA
     2. Berkeley
     3. Michigan
     4. Penn State (Contacted Associate CIO for Research 12/3/19)
Discussed the need for a use case to move the data management infrastructure proposal from proof of concept to pilot

PAGE 7

Discussed the new NSF mandate request for the MagLab to make data accessible in the future
   o This is a huge challenge for MagLab and affiliates.
   o A MagLab data management use case was recommended as a first pilot of this data management infrastructure proposal idea.

Action items:

Develop DM org chart
Investigate DM at peer institutions (see 1-4 above)
Schedule next meeting (led by UF Research Computing)

These were the major points, but not all points, covered during the meeting. The major takeaway was that the UF Office of Research supports the initial Data Management at UF proposal but requires more data to move to the next step. In conclusion, both UFRC and CTS-IT agreed the meeting was a success.

PAGE 8

Appendix B – Supplemental DRAFT Guidance: Elements of a NIH Data Management and Sharing Plan (Plan) [12] comments submitted to NIH (1/9/2020)

University of Florida George A. Smathers Libraries Comments to the National Institutes of Health Supplemental DRAFT Guidance: Elements of a NIH Data Management and Sharing Plan

1. Data Type:

The supplemental draft guidance is clear in articulating key requirements for a NIH Data Management and Sharing Plan. The addition of (1) categories of metadata, (2) a definition of supplemental data resulting in generation of scientific data, (3) a definition of FAIR data, and (4) the following suggestions could further enhance the supplemental draft for researchers.

There are categories in the identification of metadata, other relevant data, and any associated data documentation, such as descriptive, preservation, technical, structural, and administrative. The inclusion of these categories after “Identifying metadata (e.g., descriptive, preservation, technical, structural, and administrative)” in the third bullet point under Data Type in Supplemental DRAFT Guidance: Elements of a NIH Data Management and Sharing Plan (Plan) may clarify the types of metadata for researchers.

The inclusion of a definition for supplemental data outputs (i.e. raw data, temporary data, processed data, and analyzed data) may clarify types of data for researchers.

i. Supplemental Data: Supplemental data (e.g. raw data, temporary data, processed data, and analyzed data) result in the generation of scientific data. Supplemental data may be represented as non-standard public records, electronic theses and dissertations (ETDs), open source intelligence, unpublished research data (i.e. raw data, temporary data, processed data, analyzed data), web interfaces, laboratory notebooks, preliminary analyses, completed case report forms, drafts of scientific papers, plans for future research, peer reviews, communications with colleagues, or physical objects, such as laboratory specimens.

ii. FAIR (Findable, Accessible, Interoperable, and Reusable) Data: FAIR data is data that supports the FAIR guiding data principles for scientific data management and stewardship. The 15 data principles covering four categories refer to three types of entities: data (or any digital object), metadata (information about that digital object), and infrastructure. Data is considered FAIR if reasonable efforts have been made to make the data findable, accessible, interoperable, and reusable. Recommend inclusion of FAIR data examples, scenarios, and use cases to educate researchers. See: https://www.go-fair.org/fair-principles/

[12] NIH. (2019). Request for Public Comments on a DRAFT NIH Policy for Data Management and Sharing and Supplemental DRAFT Guidance. https://www.govinfo.gov/content/pkg/FR-2019-11-08/pdf/2019-24529.pdf.

PAGE 9

Clarifying the sharing and deposit of raw data, processed data, and analyzed data, factoring in format, size, and copyright, may clarify deposit of data types.

Added guidance may include recommendations for reproducibility best practices such as db-reproducibility (See: http://db-reproducibility.seas.harvard.edu/). The goal is to enable reproducibility of the raw data and relevant plots that the authors used to draw their conclusions. Authors should provide a complete set of scripts to (1) install the system, (2) produce the data, (3) run experiments, and (4) produce the resulting graphs, along with a detailed Readme file that describes the process systematically for reproducibility by a reviewer or other researchers.

2. Related Tools/Software and/or Code:

Guidance may suggest that created code and scripts following an established best practice (i.e. community standard, best practices, recommendation) be shared with the data to make the data replicable. See: http://db-reproducibility.seas.harvard.edu/

Recommend the use of APIs, open source software, collaborative tools, and version control, with science scenarios and use cases, to make data FAIR. See: https://open.fda.gov/

Provide examples of communities of practice, computational tools & services, data & information assets, management, policy & standards, and science inputs, with links to successful examples in each category.

3. Standards:

Guidance, in the form of examples, on identifying, understanding, and using appropriate metadata standards (i.e. discipline-specific, general), ontologies, schemas, and semantics for scientific data to be created, aggregated, represented, disseminated, and preserved during the useful life of the data may be useful. See: https://www.go-fair.org/fair-principles/

4. Data Preservation, Access, and Associated Timelines:

Added guidelines with recommendations on the selection of acceptable data repositories for data deposit may be useful to researchers.

Provide a list of NIH-approved data repositories (i.e. institutional, general, and discipline-specific), including both free and fee-based repositories. For fee-based repositories, provide guidance on properly budgeting for data deposits.
   i. NIH Data Repositories and Trusted Partners: http://tinyurl.com/yhkfyyxn

Recommend a Data Services & Developer Tools guidelines resource (e.g. https://www.osti.gov/data-services-developer-tools) to explain the creation of OAI-PMH metadata records (e.g. https://www.osti.gov/oairecords) that allow linking of a metadata record with data in an external data repository if the data cannot be deposited in a repository.
   i. Include guidance for a metadata repository option that links to a data repository, to help make data FAIR given capacity, infrastructure, and resources.

Provide clarification on types and trust of repositories. An institutional repository is not a data repository. A data repository is not a trusted data depository.

PAGE 10

Provide guidance on trusted data repositories. See: https://www.coretrustseal.org/

Provide examples of trusted repositories for education, guidance, and reference.
   i. NIH Data Repositories and Trusted Partners: http://tinyurl.com/yhkfyyxn
   ii. Acceptable Digital Repositories for USGS: http://tinyurl.com/y6e5d35n
   iii. Data Repositories Conformant with DOT Public Access Plan: http://tinyurl.com/yen5fw2z

5. Data Sharing Agreements, Licenses, and Other Use Limitations:

Provide guidance on development of an acceptable memorandum of understanding (MOU) for NIH-funded projects involving collaborators, partners, and researchers within and across organizations.

Recommend examples, exemplars, and references involving data use, reuse, and data governance policies (e.g. Creative Commons Zero, Open Data Commons).

6. Oversight of Data Management:

The socio-technical management of research data at an institution requires collaboration of diverse stakeholders across multiple units, involving the allocation of resources, responsibilities, and support.

Recommend NIH resources (e.g. exemplars, use cases) on guidelines for developing collaborative partnerships between stakeholders and Libraries (e.g. NIH P42 Data Management and Analysis Core (DMAC)) in coordinating institution-wide initiatives to educate and support researchers in sustainable compliance with data management and sharing practices, policies, and procedures throughout the life of the funded project and beyond. See: https://er.educause.edu/articles/2013/12/starting-the-conversation-universitywide-research-data-management-policy

PAGE 11

Appendix C – Select Funded Projects

1. Using Story-Driven Strategies to Teach Data Science in Precision Health Settings (cash: $49,915)
The project team seeks to develop and evaluate the effectiveness of “story-driven” strategies for teaching data science in precision public health settings. Specifically, the research team seeks to study the efficacy of the data story method of instruction to convey statistical and computational concepts to the clinical translational workforce. A secondary objective is to create an integrated set of engaging data science learning experiences. (Project Team: D. Maxwell (PI), with S. Meyer, K. Crippen (College of Education), and M. Rethlefsen) (start date: 6/1/19; end date: 11/30/20) UF Clinical and Translational Science Institute – Pilot Project Award

2. Educating for Reproducibility: Pathways to Research Integrity – PI (UF HSCL Assoc. Dean); Co-PIs (two members of the DMCWG) (1-year US Dept. of Health and Human Services project) $46,882

3. TREC Library Collections Review and Digitization Project (cash: $5,000)
The UF/IFAS Tropical Research Education Center (TREC) project will fund two temporary employees at TREC to review the book collection and the historical management and data records at TREC, and to establish a cataloging and digitization plan for library and data materials available at TREC within the UF library system. The temporary employees will then work together to establish a digital repository of historical records and data hosted by TREC and educate TREC on the importance of diverse library collections and the best practices for library cataloging and data management. (Project Team: P. Smith (PI), with S. Coates and TREC staff members) (start date: 1/1/2020; end date: 12/31/2020) Strategic Opportunities Program

4. dLOC as Data (subaward to the Libraries: $14,209)
The project team, in partnership with Florida International University (applicant), seeks to enhance access to its existing Caribbean newspaper collections by making texts available for bulk download to its users. The team will demonstrate the potential of newspaper data by creating a pilot thematic toolkit focused on hurricanes and tropical cyclones. (Project Team: P. Collins (PI), with L. Perry, C. Dinsmore, and L. Taylor) (start date: 1/1/2020; end date: 3/30/21) Andrew W. Mellon Foundation – Collections as Data: Part to Whole (Recently submitted)

5. Facilitating Rural Access to Quality Health Information (cash: $15,000)
By partnering on a Little Free Libraries (LFL) initiative with Florida’s Okeechobee County Library (OCL), the project team aims to improve rural residents’ health literacy and understanding of precision medicine by facilitating awareness of and access to NLM’s consumer health resources and NIH’s All of Us Research Program. This project supports NNLM’s mission to improve public health by empowering underserved groups to make effective use of information for health decision making. (Project Team: J. Morgan Daniel (PI), with L. Adkins (co-PI), M. Rethlefsen, M. Ansell, S. Harnett, and Sonya Chapa (co-PI) and Kresta King of Okeechobee County Library) (start date: 1/1/2020; end date: 4/30/20) National Network of Libraries of Medicine – All of Us Community Engagement Project (Recently submitted)

6. Smart Path - Grower-directed convergence of nanotechnology and smart decision analytics for irrigation water quality management related to pathogens (UF Libraries’ subaward cash: $57,439) (year 2 of a 5-year USDA-NIFA funded project)

7. Resource Assessment Tools to Inform – Advancing understanding of coastal ecosystem function and dynamics in the coupled natural-human system of the U.S. Gulf Coast (Project team member: DMCWG Chair) – Pending Sponsor Review


Appendix D – Data Transfer Cooperative Agreement Sample (Draft) for SEACAR

DATA TRANSFER AGREEMENT: STATEWIDE ECOSYSTEM ASSESSMENT OF COASTAL AND AQUATIC RESOURCES (SEACAR)

PARTIES:
UF: The University of Florida Board of Trustees
Address
City, State Zip
INSTITUTE: [University of South Florida Water Institute]
Address
City, State Zip

WHEREAS, Institute has entered into a separate agreement with the State of Florida [Dept./Unit] to create a database for data from projects relevant to the Statewide Ecosystem Assessment of Coastal and Aquatic Resources (the SEACAR Database);

WHEREAS, UF is willing to submit certain scientific data to the Institute consistent with the terms of this Agreement; and

WHEREAS, in support of their cooperative efforts, the parties now wish to enter into this data transfer agreement in order to further define their roles and responsibilities, and to provide lines of accountability regarding the sharing of data toward their respective missions;

NOW THEREFORE, the parties agree as follows:

1. In their cooperative roles with respect to the SEACAR Database, each party may provide the other parties with: research data from fieldwork or their existing databases, laboratory notebooks, or any other primary records as necessary for the reconstruction and evaluation of reported results of research and the events and processes leading to those results, in verbal, written, digital or other media (hereinafter “DATA”).
2. The receiving party will only use or make the DATA available to third parties to be used for research purposes in support of the objectives of SEACAR, the Florida Coastal Management Program or the research work of University.
3. Each party further agrees to the following regarding any DATA used by such party pursuant to this Agreement:
   a. The DATA is the property of the providing party and is made available as a service to the other party and the public, in service of their respective missions and related use cases.
   b. THE PROVIDING PARTY GIVES NO WARRANTIES OR GUARANTEES, EXPRESS OR IMPLIED, FOR THE MATERIAL/DATA, INCLUDING MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
   c. Each receiving party agrees to use the DATA in compliance with all applicable laws, regulations, and policies, and to maintain any confidential or sensitive data using appropriate security measures.
   d. The DATA is provided by each party at no cost.
   e. The parties do not control use of the DATA by unaffiliated third parties, but in making the DATA available to such parties, the parties will inform any third party users that the DATA is being provided for the purposes described in Section 2 above.
4. All DATA provided by a party and labeled according to Section 5 below is deemed Confidential Information, except for DATA that:
   a. have been published or otherwise made publicly available at the time of disclosure to the receiving party;
   b. were in the possession of or were readily available to the receiving party without being subject to a confidentiality obligation from another source prior to the disclosure;
   c. have become publicly known, by publication or otherwise, not due to any unauthorized act of the receiving party;
   d. The receiving party can demonstrate it developed independently, or acquired without reference to, or reliance upon, such Confidential Information; or


   e. are required to be disclosed by law, regulation, or court order, including without limitation Florida public records laws.
5. In order to be deemed confidential under this Agreement, written information must be clearly marked "CONFIDENTIAL" by the providing party. Confidential information will be maintained in confidence by the receiving party for a period of [one to five (1-5) years] from its receipt of the Confidential Information. In order to be considered confidential, any information that is orally disclosed must be reduced to writing and marked "CONFIDENTIAL" by the providing party and such notice must be provided to the receiving party within thirty (30) calendar days of the oral disclosure.
6. The parties recognize that it is a function of some or all of the parties to timely publish information. Accordingly, the parties’ researchers will not be restricted from presenting at symposia, national, or regional professional meetings, or from publishing in abstracts, journals, theses, or dissertations, or otherwise, whether in printed or in electronic media, methods and results of their work relating to the Unit, subject to Section 5 above with respect to any Confidential Information. The Institute will inform third party users of the SEACAR Database that all printed or electronic publications or presentations incorporating the DATA, including those that modify or create derivatives of the DATA, must include acknowledgement and attribution to any data creators identified in the SEACAR Database with respect to the DATA being incorporated or modified.
7. The parties shall meet to determine inventorship if an invention should arise during another party’s work with the DATA.
8. The parties may enter into separate agreements with different terms regarding transfer of data for specific projects, including but not limited to grant agreements. To the extent of a conflict between this data transfer agreement and a separate written agreement executed by two or more of the parties hereto, the separate, executed agreement will control with respect to any specific project addressed therein.
9. The provisions of this Agreement are deemed to be severable and the invalidity, illegality or unenforceability of one or more of such provisions shall not affect the validity, legality or enforceability of the remaining provisions.
10. This agreement is effective on the date of last signature and shall continue in force until terminated by mutual agreement, or until termination of the Cooperative Agreement. In the event this Agreement is terminated, the parties shall promptly return to the providing party or, at the latter’s option, or where return is impracticable, destroy all copies of DATA. Upon a providing party's request, the receiving party shall confirm in writing as to such destruction.


11. This Agreement for the Transfer of Data shall be governed by and interpreted under the laws of the State of Florida, without reference to its conflicts of laws principles, and the jurisdiction/venue for any litigation, special proceeding or other proceeding as between the parties that may be brought, or arise, in connection with, or by reason of the Agreement shall be in Alachua County, Florida.

Approvals:

_______________________________________________ Date:
University of Florida Board of Trustees

_______________________________________________ Date:
Institute


Appendix E – HiPerGator access for students – A proposal (Accepted)

HiPerGator access for students – A proposal

Background and context

HiPerGator is used by numerous faculty and their students, both undergraduate and graduate, as well as postdoctoral associates and remote collaborators. All this work is done within the context of research groups and research projects. UFIT Research Computing (RC) staff provides the training and support for this use.

HiPerGator is used in several classes every semester with great success.
1. RC provides the resources on HiPerGator as an allocation that lasts the semester. All accounts are deleted one week after finals.
2. To make the support scale, there is a clear division of responsibility between the RC staff and the faculty and teaching assistants (TAs) teaching the course. Faculty and the team of TAs agree to provide support for the students, with RC staff supporting the TAs.

HiPerGator is a great resource that could be useful for students in a more general, individualized context. The RC, UF Informatics Institute, and George A. Smathers Libraries staffs have received queries about availability of such access.

Proposed support partnership between Smathers Libraries and UFIT

The sponsors of the project are:
1. Dean, George A. Smathers Libraries
2. VP and CIO, Information Technology

The responsibilities are divided as follows:
1. The CIO has agreed to find funding for the investments in cores, storage, and GPUs needed for supporting this group of students.
2. The Smathers Libraries faculty and staff with data analytics and computing expertise will provide training, guidance, and support for the students in their use of HiPerGator.
3. RC staff will provide training materials and support for the library faculty and staff on the second-tier issues encountered when using HiPerGator.
   a. RC provides the basic training with online material and its regular training classes.
   b. The faculty and staff in the libraries can provide additional training and Q&A sessions and office hours.
4. The Libraries are responsible for organizing the engagement effort with students to make the support model a success. For example, students with similar interests and/or tasks may be grouped together to optimize training as well as to create the opportunity for students to collaborate and help each other.
5. The activities are organized on a semester schedule, as that seems appropriate for a student-facing engagement; students can engage during multiple semesters for more complex projects like a thesis.
6. The participation of the students in the program will be tracked to measure success.