DMPTool and Data Management Basics: Training within the Core Data for Reference Services


Material Information

Title:
DMPTool and Data Management Basics: Training within the Core Data for Reference Services
Series Title:
Data Management / Curation Task Force Materials
Physical Description:
Meeting agenda
Language:
English
Creator:
Norton, Hannah
Publisher:
George A. Smathers Libraries, University of Florida
Place of Publication:
Gainesville, FL
Publication Date:

Subjects

Subjects / Keywords:
Data management
Data curation
Training

Notes

Abstract:
Training session slides for the Core Data for Reference Training Series for the UF Data Management / Curation Task Force.

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:

The author dedicated the work to the Commons by waiving all of his or her rights to the work worldwide under copyright law and all related or neighboring legal rights he or she had in the work, to the extent allowable by law.
System ID:
AA00014835:00050


Full Text

PAGE 1

DMPTool and Data Management Basics
Hannah Norton
July 29, 2014
Image modified from: http://www.flickr.com/photos/blprnt/3642742876/in/photostream/

PAGE 2

Background: the Data Lifecycle
[Figure: data lifecycle diagram with the stages Study Concept, Data Collection, Data Processing, Data Distribution, Data Archiving, Data Discovery, Data Analysis, and Repurposing, plus the label "Data Management Planning"]
Based on the Data Documentation Initiative (DDI) version 3.0 Combined Life Cycle Model

PAGE 3

What is a data management plan (DMP)?
A clear description of how you plan to address data management issues in your research.
A way to communicate your data management efforts to members of your team and others (especially funders).
A data management plan gives a concise description of the who, what, where, and when of your data throughout its life cycle.

PAGE 4

Why do researchers need a Data Management Plan (DMP)?
For all the same reasons you should take care of your data:
To ensure that valuable data resources will be accessible in the future to members of the research team and the broader community.
To make life easier: by planning ahead and documenting data throughout its life cycle, researchers can save time and focus on research.
To increase the visibility of research.
To satisfy funders' requirements.

PAGE 5

Components of a DMP
Project description
Data collection:
  Types of data
  Data and metadata standards to be used
Legal and ethical issues:
  Privacy and confidentiality
  Intellectual property rights
Policies for data sharing and reuse
Data preservation (long term)
Who is responsible for data management
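The component list above can double as a completeness checklist for a draft plan. The following is a minimal Python sketch of that idea, not part of the original slides: the function name `missing_components` and the sample draft are hypothetical, and the component names are simply copied from the slide.

```python
# Component names copied from the "Components of a DMP" slide.
DMP_COMPONENTS = [
    "Project description",
    "Data collection",
    "Legal and ethical issues",
    "Policies for data sharing and reuse",
    "Data preservation (long term)",
    "Who is responsible for data management",
]


def missing_components(draft_sections):
    """Return the components that have no non-empty section in the draft.

    draft_sections maps a component name to the text written for it.
    (Hypothetical helper, for illustration only.)
    """
    filled = {name for name, text in draft_sections.items() if text.strip()}
    return [name for name in DMP_COMPONENTS if name not in filled]


# Hypothetical draft: two sections written, one left blank, three absent.
draft = {
    "Project description": "Survey of library data services.",
    "Data collection": "Interview transcripts and spreadsheets.",
    "Data preservation (long term)": "",
}

for name in missing_components(draft):
    print("Still to write:", name)
```

A blank section counts as missing here, which matches how a reviewer would read an empty heading in a plan.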

PAGE 6

http://dmptool.org

PAGE 7

Log in to DMPTool with Gatorlink

PAGE 8

Funders with DMPTool Templates
Alfred P. Sloan Foundation
Gordon and Betty Moore Foundation
Gulf of Mexico Research Initiative
Institute of Education Sciences (US Dept of Education)
Institute of Museum and Library Services
Joint Fire Science Program
National Institutes of Health
National Endowment for the Humanities Office of Digital Humanities
National Science Foundation (General and 11 Directorates)
U.S. Geological Survey

PAGE 10

http://library.ufl.edu/datamgmt

PAGE 11

http://guides.uflib.ufl.edu/datamanagement

PAGE 14

Sample DMPs from UF
Example text in the IR@UF: http://ufdc.ufl.edu/AA00014694/00001/
Research Computing guidance on Data Management Plans (includes links to UF College of Engineering and Department of Astronomy guides): http://www.hpc.ufl.edu/research/proposalsupport/data management plan/

PAGE 17

Components of a DMP
Project description
Data collection
Legal and ethical issues
Policies for data sharing and reuse
Data preservation (long term)
Who is responsible for data management

PAGE 18

Example data collection questions
What file formats will you use for your data, and why? What metadata/documentation will be submitted alongside the data? (NIH)
Describe the data to be collected (actual observations) during your research including amount (if known). Name the type of data, the instrument or collection approach, and how the data will be sampled. (NSF BIO)
Give a short description of the data, including amount (estimated amount or known amount) and content. Data types could include XML spreadsheets, interview transcripts, text files, historical documents, diaries, field notes, geospatial data, citations, software code, algorithms, etc. (NEH)

PAGE 19

Data generated throughout the lifecycle has different needs
Raw data: some must be kept forever; others can be discarded after the project is complete
Intermediate data for analyzing and processing: can often be discarded at the end of the computation, but computational methods should be kept for reproducibility
Final data: should be made available indefinitely to the community

PAGE 20

File formats
Formats with the following characteristics are considered relatively stable and better for long-term preservation:
  open documentation
  support across a range of software platforms
  wide adoption
  no compression (or lossless compression)
  no embedded files or embedded programs/scripts
  nonproprietary format
See the following for preferred and accepted file formats for the IR@UF: http://ufdc.ufl.edu/AA00017119/00011
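One way to apply the characteristics above is a quick screen of file names before deposit. The sketch below is illustrative only: the two extension sets are assumptions chosen for the example, not the IR@UF's preferred or accepted list (consult the linked document for that), and `classify_format` is a hypothetical helper.

```python
from pathlib import Path

# ASSUMPTION: example extensions only, not the IR@UF list.
# Openly documented, widely supported formats.
LIKELY_STABLE = {".csv", ".txt", ".xml", ".json", ".tif", ".tiff", ".png"}
# Formats tied to a specific vendor's software.
REVIEW_FIRST = {".xls", ".doc", ".sav", ".psd"}


def classify_format(filename):
    """Return a rough preservation label based on the file extension."""
    ext = Path(filename).suffix.lower()
    if ext in LIKELY_STABLE:
        return "likely stable"
    if ext in REVIEW_FIRST:
        return "review before deposit"
    return "unknown"


for name in ["Results.CSV", "survey.sav", "notes.xyz"]:
    print(name, "->", classify_format(name))
```

An extension check says nothing about compression or embedded scripts inside a file, so it is a first pass at best; the other characteristics on the slide still need a manual look.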

PAGE 21

What exactly is metadata again?
Descriptive information that helps you and others understand your data
Data about data that acts as a surrogate for your data when you or others are trying to:
  Find the data later
  Know what the data is later
  Share the data later

PAGE 22

Metadata across the disciplines
Basic information to keep:
  Descriptive: What is it about?
    Title, time, author, keywords
    Relations to other data objects
  Administrative:
    Ownership and use permissions
  Provenance: Where does it come from?
    History of changes to the data, versions
More specific information varies by discipline
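The three groups of basic information above map naturally onto a small structured record stored next to the data. The following Python sketch shows one possible layout under stated assumptions: the field names, the `make_metadata_record` helper, and every sample value are hypothetical, and it does not follow any particular metadata standard.

```python
import json


def make_metadata_record(title, author, date, keywords, related,
                         owner, permissions, source, versions):
    """Group basic metadata into the slide's three categories.

    Hypothetical helper; field names are illustrative, not a standard.
    """
    return {
        "descriptive": {
            "title": title,
            "author": author,
            "date": date,
            "keywords": list(keywords),
            "related_objects": list(related),
        },
        "administrative": {
            "owner": owner,
            "use_permissions": permissions,
        },
        "provenance": {
            "source": source,
            "versions": list(versions),
        },
    }


# All values below are made up for illustration.
record = make_metadata_record(
    title="Example interview transcripts",
    author="A. Researcher",
    date="2014-07-29",
    keywords=["data management", "training"],
    related=["example-codebook.txt"],
    owner="Example Lab",
    permissions="Contact owner before reuse",
    source="Collected by the project team",
    versions=["v1: raw transcripts", "v2: identifiers removed"],
)

print(json.dumps(record, indent=2))
```

Writing the record as JSON keeps it in an openly documented text format, in line with the file format guidance earlier in the deck; discipline-specific fields would be added on top of these basics.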

PAGE 23

Components of a DMP
Project description
Data collection
Legal and ethical issues
Policies for data sharing and reuse
Data preservation (long term)
Who is responsible for data management

PAGE 24

Example legal/ethical questions
Procedures for managing and for maintaining the confidentiality of the data to be shared (IES)
Will any permission restrictions need to be placed on the data? (NSF BIO)
Policies for public access and sharing should be described, including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements. (NEH)

PAGE 25

Components of a DMP
Project description
Data collection
Legal and ethical issues
Policies for data sharing and reuse
Data preservation (long term)
Who is responsible for data management

PAGE 26

Example data sharing questions
Will you share data via a repository, handle requests directly or use another mechanism? (IES)
What transformations will be necessary to prepare data for preservation/data sharing? (NIH)
How long will the original data collector/creator/principal investigator retain the right to use the data before opening it up to wider use? (NEH)

PAGE 27

Example data preservation/archiving questions
If your method of sharing is with an archive, which archive/repository/database have you identified as a place to deposit data? (IES)
What is the long-term strategy for maintaining, curating and archiving the data? (NSF BIO)
The Data Management Plan should describe physical and cyber resources and facilities that will be used for the effective preservation and storage of research data. These can include third-party facilities and repositories. (NEH)

PAGE 28

Finding a home for your data
Data storage, both short term and long term, can take place in 3 types of places:
  Locally, within the lab or research environment
  Within the institution
  Within a national/discipline-based repository
See the following guide to find discipline-based repositories: http://guides.uflib.ufl.edu/datasets

PAGE 29

http://www.hpc.ufl.edu/

PAGE 30

Repositories
Advantages of an institutional repository:
  Linked to your institution: intellectual capital of the institution in one place
  You can put all your datasets together
  Some guarantee of support from the university
  Some domain repositories may go out of business once their funding ends
Advantages of a domain repository:
  Your data will be stored with similar datasets
  Researchers in your discipline may find your data more easily
  The repository will understand what your data needs in terms of storage, archiving and preservation
  Computational tools may be developed to crunch a critical mass of data of a certain kind
Adapted from: http://libraries.mit.edu/guides/subjects/datamanagement/Managing%20Research%20Data%20101.pdf

PAGE 31

Benefits of sharing data
Data can be used by other researchers with different objectives
Accelerate the time of discovery by building upon previous research
Results can be reproduced more easily and accurately
Researchers receive the credit they're due
Data producers have a new channel by which to promote their work (increase impact of research)

PAGE 32

Components of a DMP
Project description
Data collection
Legal and ethical issues
Policies for data sharing and reuse
Data preservation (long term)
Who is responsible for data management

PAGE 33

Example data management responsibility questions
Roles and responsibilities of project or institutional staff in the management and retention of research data (IES)
Who will be responsible for data management and for monitoring the data management plan? How will adherence to this data management plan be checked or demonstrated? (NSF BIO)
Who will have responsibility over time for decisions about the data once the original personnel are no longer available? (NEH)

PAGE 36

A cautionary tale
From NYU Health Science Center Libraries: http://youtu.be/N2zK3sAtr 4

PAGE 37

Questions?
Feel free to contact the Data Management/Curation Task Force: datamgmt l@lists.ufl.edu
Or me: Hannah Norton, nortonh@ufl.edu, 352-273-8412