PAGE 1
Data management 101
Rolando Garcia-Milian and Hannah Norton
UF Health Sciences Center Library
General Guidelines for Effective Data Management
UF Research Computing Day - April 25, 2012
Rolando.milian@ufl.edu / nortonh@ufl.edu

PAGE 2
General Guidelines / Best Practices
- Planning (DMP, Norton's presentation)
- Metadata
- Formatting
- Storing
- Security
- Copyright
- Sharing

PAGE 3
Benefits of proper data management
- Data is evidence supporting/refuting models in science
- Efficient use of resources
- Effective protection
- Preservation and re-use through data sharing and collaboration
- High-quality results
- Research excellence
- Advancing science

PAGE 4
Challenges of data management
- Planning
- Organization
- Documenting
- Formatting
- Submitting
- Answer questions?
- Data errors/mistakes?
- Being scooped?
- Public resistance?

PAGE 5
Tools for data management

PAGE 6
Results of poor data management
From: Horner J., and Minifie F.D. 2011. Research Ethics II: Mentoring, Collaboration, Peer Review, and Data. Journal of Speech, Language, and Hearing Research 54: S330-S345

PAGE 7
Metadata
Annotation
Documenting

PAGE 8
Metadata (Annotation/ Documenting)
Metadata
Information about data: the information required to understand data, context, quality, structure, and accessibility (Michener et al., 1997)
- Who, what, when, where, and how about every aspect of the data.

PAGE 9
Metadata (Annotation/ Documenting)
Benefits of proper metadata
- Reuse and data sharing are facilitated
- Data discovery
- Expand the scale of study
- Addresses unanticipated questions
- Integrate data
http://www.flickr.com/photos/boojee/3743753784/in/photostream/

PAGE 10
Metadata (Annotation/ Documenting)
Use standardized taxonomies and controlled vocabularies, including domain, national, and international standards, in the capture, management and archiving of data.

PAGE 11
Metadata (Annotation/ Documenting)
Automatic addition of metadata
- Some is automatically added during the data collection or analysis process, i.e.
date, time
- Some software (e.g., the R statistical package, MATLAB, SAS, Galaxy) uses analysis scripts that record the various steps involved in processing and analyzing data, and so provide a form of analytical metadata.
- Always leave a record of what you did with your data.

PAGE 12
Metadata (Annotation/ Documenting)
User interface-driven analysis
- Changes to data are made by selecting steps from drop-down menus, followed by a "run", "execute", or "OK" button
- These workflows rarely leave a clear accounting of exactly what you have done

PAGE 13
Metadata (Annotation/ Documenting)
Manually added metadata
About the project
- Title, people, key dates, funders and grants
About the data
- Title, key dates, creator(s), subjects, rights, included files, format(s), versions
- Interpretive aids: codebooks, data dictionaries, algorithms, code

PAGE 14
Metadata (Annotation/ Documenting)
Keep a README file for each data file
- Plain text files
- Short description of what data it includes
- Who collected the data and whom to contact with questions
- Column headings for any tabular data
- Units of measurement used
- Symbols used
- Specialized formats or abbreviations used
http://datadryad.org/handle/10255/dryad.8525

PAGE 15
Formatting Your Data
http://www.ehow.co.uk/how_8510149_make-excel-spreadsheets-look-good.html

PAGE 16
Formatting Your Data
File formats in which data is created depend on:
- Software in which research data are created and digitized
- How researchers plan to analyze data
- Hardware used
- Availability of software
- Discipline-specific conventions

PAGE 17
Formatting Your Data
Organizing Files and Folders:
- Essential for accessibility
- Makes it easier to find and keep track of data files.
- Develop a system that works for your project
- Be consistent
http://jdorganizer.blogspot.com/2008/03/file-folders-declare-that-youare.html

PAGE 18
Formatting Your Data
File names:
- Use file names to classify broad types of files
- Create meaningful but brief names, e.g. Corvallis_VegBiodiv_2007 rather than Year01 or Fall03
- Capitalize each word to differentiate it.
- Avoid using special characters in a file name: \ / : ? < > | [ ] & $

PAGE 19
Formatting Your Data
File names:
- Use underscores (_) or hyphens (-) instead of spaces
- Capture place, time, and theme: extremely useful, even if done in a highly abbreviated manner
- Reverse dates so they sort usefully: YYYYMMDD, e.g. filenaming_20080507
- Capture document version control: v01, v02, v03 instead of filenaming_latestversion

PAGE 20
Formatting Your Data for Storage
Store data in nonproprietary software formats (e.g., comma delimited text file, .csv); proprietary software (e.g., Excel, Access) may become unavailable, whereas text files can always be read.
NOTE: When data are converted from one format to another, certain changes may occur to the data. After conversions, data should be checked for errors or changes that may be caused by this process.

PAGE 21
Formatting Your Data for Storage
Recommended File Formats for Preservation
Textual formats (file extensions):
- Acrobat PDF/A: .pdf
- Comma-Separated Values: .csv
- Open Office Formats: .odt, .ods, .odp
- Plain Text (US-ASCII, UTF-8): .txt
- XML: .xml
Image/graphic formats (file extensions):
- JPEG: .jpg
- JPEG2000: .jp2
- PNG: .png
- SVG 1.1 (no Java binding): .svg
- TIFF: .tif, .tiff
Audio formats (file extensions):
- AIFF: .aif, .aiff
- WAVE: .wav
Video formats (file extensions):
- AVI (uncompressed): .avi
- Motion JPEG2000: .mj2, .mjp2
Recommended File Formats for Preservation.
University of Texas http://repositories.lib.utexas.edu/recommended_file_formats

PAGE 22
Storing Your Data
http://blog.brickhousesecurity.com/wpcontent/uploads/mystica_usb_flash_drive.png

PAGE 23
Storing Your Data
- Store data in nonproprietary hardware formats
- Formats can rapidly become obsolete: valuable data are essentially lost when they are trapped on old formats such as 5.25-inch floppy disks
- CD/DVD experiential life expectancy is 2 to 5 years, even though published life expectancies are often cited as 10 years, 25 years, or longer
- Manufacturers claim that CD-R and DVD-R discs have a shelf life of 5 to 10 years before recording on them (U.S. National Archives)

PAGE 24
Storing Your Data
Always store an uncorrected version of the data file (the original data set), or master version:
- Do not make any corrections to this file
- Make corrections using a scripted language.
- Consider making your original data file read-only
- Limit access to this file

PAGE 25
Storing Your Data
- Whenever possible, use online storage (e.g., Dropbox) or institutional resources
http://www.hpc.ufl.edu/about/newStorage.php

PAGE 26
Storing Your Data
Regular back-ups protect against accidental data loss:
- hardware failure
- software or media faults
- virus infection or malicious hacking
- power failure
- human errors
Ensure that areas and rooms for data storage are structurally sound, and free from the risk of flood and fire
http://www.mathworks.com/matlabcentral/fileexchange/25464-virtual-backup-using-matlab

PAGE 27
Data Security
http://www.icc-service.net/wpcontent/uploads/2010/07/data-storage.jpg

PAGE 28
Security
UF IT Data Security Standard
http://www.it.ufl.edu/policies/security/uf-it-sec-data.html
- Unrestricted Data: if available to the public, will not harm an individual, group, or institution
- Sensitive Data: if available to unauthorized users, may harm an individual, a group or institution
- Restricted Data: highest level of protection, e.g.
patient data, student data, security-related data such as passwords and risk assessments, and intellectual property

PAGE 29
Security
DATA SECURITY AND ACCESS
- Physical security
- Security of computer systems and files
- Network security
http://www.icc-service.net/wpcontent/uploads/2010/07/data-storage.jpg
http://mrcheckout.net/wpcontent/uploads/2010/11/datasecurity.jpg

PAGE 30
Security
When working with Restricted Data AVOID:
- Storing data on workstations, portable devices or removable media.
- Sending data in email or instant messages.
- Using data on unapproved web sites.
- Removing data from UF premises.
Modified from Bergsma K. UF restricted data required training http://infosec.ufl.edu/restricted-data/data-security-slides.pdf

PAGE 31
Security
392-2061
ufirt@ufl.edu
http://infosec.ufl.edu/

PAGE 32
Security
UF Privacy Office
Susan Blair, Chief Privacy Officer
Office phone: 392-2094
Privacy Hotline: 866-876-4472
Email: privacy@ufl.edu
Web: http://privacy.ufl.edu/

PAGE 33
Security
DATA DISPOSAL
- For hard drives, simply deleting does not erase a file on most systems. Files need to be overwritten to ensure they are effectively scrambled
- External hard drives at the end of their life can be removed from their casings and disposed of securely through physical destruction
- Shredders certified to an appropriate security level should be used for destroying paper and CD/DVD discs
- Contact your IT person
http://www.spectrumdatarecovery.com.au/content.aspx?cid=23&m=3

PAGE 34
Security
http://infosec.ufl.edu/restricted-data/data-security-slides.pdf

PAGE 35
Copyright
http://blog.unl.edu/dixon/files/2012/01/copyright.jpg

PAGE 36
Copyright
In the case of collaborative research, copyright may be held jointly by various researchers or institutions.
Secondary users of data must obtain copyright clearance from the rights holder before data can be reproduced.
Give credit to the data source used, the data distributor and the copyright holder.
Data can be copied for non-commercial teaching or research purposes without infringing copyright, under the fair dealing concept, providing that the owner of the data is acknowledged.

PAGE 37
Copyright
UF Intellectual Property Policy http://www.research.ufl.edu/otl/pdf/ipp.pdf
UF Office of Technology Licensing http://www.research.ufl.edu/otl/index.html
Christine Ross, Copyright on Campus http://guides.uflib.ufl.edu/copyright

PAGE 38
Sharing Your Data
http://www.amazon.com/Sharing-Toddler-Tools-ElizabethVerdick/dp/1575423146/ref=sr_1_1?s=books&ie=UTF8&qid=1335134736&sr=1-1

PAGE 39
Sharing Your Data
WHY SHARE RESEARCH DATA
- Encourages scientific debate
- Promotes potential new uses of data
- New collaborations
- Improvement and validation of research methods
- Increases impact and visibility of research
- Promotes the research study and its outcomes
- Required by journals/funding agencies
- Provides direct credit to the researcher

PAGE 40
Sharing Your Data
http://www.amazon.com/Sharing-Toddler-Tools-ElizabethVerdick/dp/1575423146/ref=sr_1_1?s=books&ie=UTF8&qid=1335134736&sr=1-1

PAGE 41
Sharing Your Data
HOW TO SHARE YOUR RESEARCH DATA
- Depositing with a specialist or discipline-specific data repository
- Submitting to a journal to support a publication
- Depositing in an institutional repository
- Available online via a project or institutional website
- Available informally between researchers on a peer-to-peer basis

PAGE 42
Sharing Your Data
A comprehensive list of data repositories by disciplines
http://oad.simmons.edu/oadwiki/Data_repositories

PAGE 43
Sharing Your Data

PAGE 44
Sharing Your Data
Advantages of depositing data with a data repository
- Assurance that data meet set quality standards
- Safe-keeping of data in a secure environment with the ability to control access where required
- Standardized citation mechanism to acknowledge data
- Promotion of data to many users
- Online resource discovery of data through data catalogues
- Monitoring of the secondary usage of data

PAGE 45
http://www.ithenticate.com/Portals/92785/images/researcher-science-plagiarism.jpg
UF Health Sciences Center Library
UF Office of Technology Licensing
UF High Performance Computing Center
UF Institutional Repository
UF Intellectual Property Policy
UF Information Security Office

PAGE 46
References
Bergsma K. UF Restricted Data Required Training. Slide presentation. Available at http://infosec.ufl.edu/restricted-data/data-security-slides.pdf
Borer E.T., Seabloom E.W., Jones M.B., and Schildhauer M. 2009. Some simple guidelines for effective data management. Bulletin of the Ecological Society of America 205-214
Data Repositories. http://oad.simmons.edu/oadwiki/Data_repositories
Frequently Asked Questions (FAQs) about Optical Storage Media: Storing Temporary Records on CDs and DVDs. Record managers. U.S. National Archives http://www.archives.gov/recordsmgmt/initiatives/temp-opmedia-faq.html
Horner J., and Minifie F.D. 2011. Research Ethics II: Mentoring, Collaboration, Peer Review, and Data. Journal of Speech, Language, and Hearing Research 54: S330-S345

PAGE 47
References
Jones S., Ross S., and Ruusalepp R. Data Audit Framework Methodology, draft for discussion, version 1.8 (Glasgow, HATII, May 2009)
Kruse R.L., and Mehr D.R. 2008. Data management for prospective research studies using SAS Software. BMC Medical Research Methodology 8: 61
Michener W.K., Brunt J.W., Helly J., Kirchner T.B., Stafford S.G. 1997. Non-geospatial metadata for the ecological sciences. Ecological Applications 7: 330-342
Michener W.K. 2006. Meta-information concepts for ecological data management. Ecological Informatics 1(1): 3-7
North Carolina Gov.
Records Branch - Best practices for file-naming www.records.ncdcr.gov/erecords/filenaming_20080508_final.pdf

PAGE 48
References
Recommended file formats for long-term preservation. University of Texas http://repositories.lib.utexas.edu/recommended_file_formats
Savage J.C., Vickers A.J. 2009. Empirical Study of Data Sharing by Authors Publishing in PLoS Journals. PLoS ONE 4(9): e7078. doi:10.1371/journal.pone.0007078
UK Joint. Info. Sys. Comm. - Choosing a file name www.jiscdigitalmedia.ac.uk/crossmedia/advice/choosing-a-file-name
University of Edinburgh Records Management Section, Standard Naming Conventions For Electronic Records: The Rules, www.recordsmanagement.ed.ac.uk/InfoStaff/RMstaff/RMprojects/PP/FileNameRules/Rules.htm
Van den Eynden V., Corti L., Woollard M., Bishop L., Horton L. 2011. Managing and sharing data. Best practice for researchers. University of Essex, U.K. http://www.dataarchive.ac.uk/media/2894/managingsharing.pdf
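
Example: file naming
The file-naming guidance on the "Formatting Your Data" slides (PAGE 18 and PAGE 19) can be sketched in a few lines of Python. This is only an illustration under assumptions, not something taken from the presentation: the function names `clean_part` and `build_file_name` and the Place_Theme_YYYYMMDD_vNN layout are hypothetical choices made here to combine the rules about special characters, underscores, reversed dates, and version numbers.

```python
import re
from datetime import date

# Characters the slides say to avoid in file names, plus whitespace.
_UNSAFE = re.compile(r'[\\/:?<>|\[\]&$\s]+')


def clean_part(text: str) -> str:
    """Replace runs of spaces and special characters with one underscore."""
    return _UNSAFE.sub("_", text.strip()).strip("_")


def build_file_name(place: str, theme: str, day: date, version: int, ext: str) -> str:
    """Build Place_Theme_YYYYMMDD_vNN.ext (a hypothetical layout, not from the slides)."""
    parts = [
        clean_part(place),
        clean_part(theme),
        day.strftime("%Y%m%d"),  # reversed date, so names sort chronologically
        f"v{version:02d}",       # explicit version number instead of "latestversion"
    ]
    return "_".join(parts) + "." + ext.lstrip(".")


if __name__ == "__main__":
    # Prints: Corvallis_Veg_Biodiv_20070507_v01.csv
    print(build_file_name("Corvallis", "Veg Biodiv", date(2007, 5, 7), 1, "csv"))
```

Zero-padding the version number (v01, v02, ...) keeps file listings in order once a project passes nine versions, for the same reason the date is written as YYYYMMDD.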