NSF Data Management Plan Generic Astronomy Draft Version 2

Ford, Eric
Department of Astronomy, University of Florida
Gainesville, FL
Data management plan


Data Management Plan (DMP)


Sample data management plan for NSF (National Science Foundation).

University of Florida
All rights reserved by the source institution.


Data Management and Access Plan

I. Products of the Research

(a) Observational data may include images, spectra and/or photometric time series of astronomical objects, calibration data (e.g., flat fields, observations of reference stars) and associated metadata necessary for the proposed research.
(b) Results of data/statistical analyses (e.g., parameter estimates and associated uncertainties) are derived from observations/simulations and summarize the results relevant to the proposed research.
(c) Simulated data used for publications is compared to observations to aid in the interpretation of results.
(d) Software used for publications will be developed to reduce observations, model observations of astronomical objects and/or perform statistical analyses.
(e) Curriculum materials may contain web pages, handouts, PowerPoint/Keynote/OpenOffice slides, multimedia presentations and assessment materials associated with the proposed broader impact activities.

Preliminary data, drafts of scientific papers, plans for future research, peer reviews, communications with colleagues and physical samples are not included in this plan, as set forth by the US Office of Management and Budget.

II. Data Formats

(a) We plan to store both raw and final reduced observational data in standard FITS files, with standard metadata (e.g., time, position, instrument settings) in FITS headers and supplemental metadata in plain ASCII (e.g., observing logs with observing conditions and other notes).
(b) Results of statistical analyses are typically recorded in standard ASCII-based formats (e.g., plain ASCII, LaTeX table). Large results (e.g., posterior parameter distributions) may be compressed using standard open-source compression software (e.g., gzip) and/or stored in a binary format, provided that software is provided to extract binary data into ASCII form.
(c) Simulated data will be stored either in plain ASCII formats or in a binary format, provided that software is provided to extract data into ASCII form. Visualizations of simulated data may be stored in standard graphics formats (e.g., eps, jpg, mov, mpg, and/or wmv). The Institutional Repository at UF (IR@UF) “will migrate items to new formats as necessary.”
(d) Software source code and associated documentation will be stored in plain ASCII files and may be packaged using standard open-source tools (e.g., tar, gzip). A version control system (i.e., CVS, Git, or Subversion) will be used to support collaboration, version control and recovery of previous versions. Documentation will be embedded in source code, in separate ASCII files (e.g., plain ASCII, AsciiDoc, HTML, LaTeX) and/or in formatted files (e.g., HTML generated via Doxygen and/or PDF generated from LaTeX source).
(e) Curriculum materials will be stored in standard formats (e.g., html, pdf, swf, ppt, odp) for which IR@UF provides at least a basic level of preservation support.

III. Access to Data and Data Sharing Practices and Policies

The primary results of the proposed research, including the results of associated statistical analyses, will be disseminated primarily through publication in journals (including online-only supplements for extended tables, animations, etc.), the arXiv preprint server, conference presentations and student theses (long-term open access via IR@UF). We will work to provide electronic data derived from this project upon request, in a timely fashion and on a nondiscriminatory basis. Depending on the size of the request, data may be provided via email, the UF Astronomy web/FTP servers, IR@UF or the cloud. Data will be preserved for at least three years beyond the award period, as required by NSF guidelines. We note details specific to certain types of data below:

(a) Observatories generally provide access to raw observational data via web/FTP after the observatory proprietary period (e.g., 1 year for GTC, 18-month default for Keck).
(a-d) The raw observational data, final reduced observational data, results of statistical analyses, simulated data used for publications and software used for publications will be available for data sharing upon request. We reserve the right to maintain a proprietary period for observational data we collect as part of the proposed research, but voluntarily limit that period to the lesser of the time until publication or 12 months after the data become available to us. We will respect the restrictions imposed by data owners for any data obtained by unfunded collaborators or as part of collaborations (e.g., SDSS-III).
(e) Curriculum materials will be made available via the UF Astronomy department web site, IR@UF and/or the national Multimedia Educational Resource for Learning and Online Teaching (MERLOT) repository, where they will be peer reviewed. To increase visibility, we will request a link to these materials on the Science Information for Florida Teachers Guide to Everything website.

IV. Policies for Re-Use, Re-Distribution, and Production of Derivatives

Published data will be available in print or electronically from publishers, subject to subscription/printing charges and copyrights. We will work to provide other data with as few restrictions as possible. For example, preprints will be posted to arXiv so as to ensure results are generally accessible without regard to journal subscriptions. Source code will be made available under the GNU General Public License (GPL). Other data provided via UF websites will include a request to cite the most relevant publication(s) and notice of any copyright restrictions (e.g., how to obtain permission to reuse figures published in ApJ).

As the proposed research does not involve the acquisition of either animal or human subjects data, we do not anticipate any privacy or ethical issues associated with the data. We do not anticipate that there will be any significant intellectual property issues involved with the acquisition of the data. In the event that discoveries or inventions are made in direct connection with these data, access to the data will be granted upon request once appropriate invention disclosures and/or provisional patent filings are made.

V. Archiving of Data

Electronic data will be preserved using multiple on-site copies, with all servers using RAID hard drive arrays. Final versions of software source code and curriculum materials will be stored in home directories for which UF astronomy system administrators provide secure off-site backup.
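
Section II commits to supplying software that extracts any binary data into ASCII form. As an illustration only, the following is a minimal Python sketch of such an extraction tool, not the project's actual software. The binary layout (a PST1 magic string, row and column counts, then 64-bit floats), the function names write_binary and binary_to_ascii, and any file names are hypothetical assumptions made for this example. It uses only the Python standard library.

```python
import gzip
import struct

# Hypothetical binary layout (an assumption for illustration, not a standard):
#   4-byte magic b"PST1", uint32 n_rows, uint32 n_cols (all little-endian),
#   followed by n_rows * n_cols float64 values in row-major order.
MAGIC = b"PST1"
HEADER = struct.Struct("<4sII")


def write_binary(path, rows):
    """Write equal-length rows of floats using the hypothetical layout above."""
    n_cols = len(rows[0]) if rows else 0
    with open(path, "wb") as f:
        f.write(HEADER.pack(MAGIC, len(rows), n_cols))
        for row in rows:
            if len(row) != n_cols:
                raise ValueError("all rows must have the same length")
            f.write(struct.pack(f"<{n_cols}d", *row))


def binary_to_ascii(bin_path, txt_gz_path, column_names=None):
    """Extract the binary table into a gzip-compressed, space-delimited ASCII file.

    Returns the number of data rows written.
    """
    with open(bin_path, "rb") as f:
        magic, n_rows, n_cols = HEADER.unpack(f.read(HEADER.size))
        if magic != MAGIC:
            raise ValueError("unrecognized file format")
        row_struct = struct.Struct(f"<{n_cols}d")
        with gzip.open(txt_gz_path, "wt", encoding="ascii") as out:
            if column_names:
                # Optional comment line naming the columns.
                out.write("# " + " ".join(column_names) + "\n")
            for _ in range(n_rows):
                chunk = f.read(row_struct.size)
                if len(chunk) != row_struct.size:
                    raise ValueError("file is truncated")
                values = row_struct.unpack(chunk)
                # repr() of a Python float round-trips exactly, so no
                # precision is lost in the ASCII copy.
                out.write(" ".join(repr(v) for v in values) + "\n")
    return n_rows
```

For example, calling binary_to_ascii("posterior.bin", "posterior.txt.gz", ["mass", "ecc"]) on a file written by write_binary would produce one comment line with the column names followed by one line per sample. Pairing the binary writer with its ASCII extractor in the same source file is one simple way to keep the two in step under version control.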


Notes to PIs:

The above example needs to be customized for your project. See for NSF AST-specific suggestions. Other general suggestions are below.

Provide a description of the data, identifying what will be measured, computed, or modeled. Include some description of how these tasks are to be performed.
Provide an explanation as to why the data are important and necessary for the work.
Provide an explanation regarding the nature of the data. Data include laboratory notebooks, survey reports, computer files created by data acquisition systems, images, etc. Include an estimate of the quantity of data. Include metadata (laboratory notebook entries, software codes, etc.) that will be needed to interpret the data within the context of the project.
Provide a description of how the data will be preserved, including data storage and backup.
Provide instructions on how others may access the data if a request is submitted.
Provide affirmation that the data will be preserved for at least three years after the award ends, as is required by NSF.
If there are privacy/ethical issues associated with the data, describe how these will be addressed.
If there are intellectual property/patent issues associated with the data, describe how these will be addressed.

In case these might be useful, these are examples of data management plans from other universities for their recent NSF grant applications:

Rice University (samples begin on page 2):
Yale: http://odai.researc ajasekaranDataManagementPlan.pdf
University of Virginia: Data %20Management%20Sample%20Plan.pdf
University of New Mexico (examples linked on the left):
University of Minnesota: example
University of California San Diego:
University of North Carolina, Odum Institute: Data %20Management%20Sample%20Plan.pdf