Data Management/Curation Task Force [1]

Wed., Dec. 11, 2013, 1-2pm; HSC Library C2-41

Members: Hannah Norton, Laurie Taylor, Rolando Garcia-Milian, Denise Bennett, Val Minson, Joe Aufmuth, David Schwieder, Blake Landor, Mark Sullivan, Sara Russell Gonzalez, Erik Deumens, Robert Ferl, and Cecilia Botero
Invited: Matt Gitzendanner and Aaron Gardner

[1] Data Management: http://www.uflib.ufl.edu/datamgmt & DMCTF resources: http://ufdc.ufl.edu/AA00014835/

Draft Agenda

Updates and Discussion
  Discussion/updates from work and activities of interest at UF and external
  12/11 meeting: Mark Sullivan presenting on IR@UF data support now, and possibilities for the future
    o Question/discussion topic: Tool/portal to provision dinky databases and data websites (similar to what REDCap does for clinical trial surveys)
  Work towards Year One Report: strategic recommendations, specific problems/projects, etc.
  New resources in support of Year One Report:
    o Data Services, Text for Outreach and Promotion: http://ufdc.ufl.edu/l/AA00019190/00001/pdf
    o Fact Sheet/Overview on Data Management Support from the UF Libraries with the IR@UF & More: http://ufdc.ufl.edu/l/AA00017119/00018/pdf
    o IR@UF: Loading Large Files & Data Sets: http://ufdc.ufl.edu/l/AA00017119/00016/pdf
    o Draft text for requesting information on dinky databases: http://ufdc.ufl.edu/AA00014835/00022/pdf
    o Research Computing Vision; note radical collaboration: http://ufdc.ufl.edu/l/AA00014835/00023
    o IR@UF: Theses and Dissertations (includes section on supplemental data): http://ufdc.ufl.edu/l/AA00017119/00002/pdf

Upcoming events scheduled and to be discussed/planned
  Zotero workshops (citation management software for data in bibliographic databases; connects to many tools for text/data mining)
  DMPTool, scheduling hands-on training
  Workshop for outreach for HiPerGator Resources

Meetings: alt. Wed.; HSC Library C2-41, Library West 429, Marston Science Library L107

Ongoing
  Planning and supporting different informational, training, and outreach activities and events on data and related resources like HiPerGator
  Workshops (types for different groups: researchers and data service providers); known needs:
    o DMPTool for Librarians (and other Data Liaisons/Supporters to be identified)
    o DMPTool and creating a plan
    o Possible workshop: Primer on Data Management, 2-hour version; expanded primer within 2-day workshop, co-taught with teaching faculty in field; expanded primer within lab-style courses as with research and methods courses, etc.

Deadlines/Events
  November:
    o Presenting to libraries; work on survey result analysis; RC Day; GIS Day
    o Work towards larger Year One report and strategic directions/recommendations
    o Quarterly report due for July-September
  2014 January:
    o Quarterly report due for October-December
    o Year One Report, draft due to group [2]
  2014 February:
    o Year One Report due to Deans of the Libraries
    o Future surveys/data gathering for feedback on data needs, with possible questions [3]

[2] See charge and notes:
  Draft proposed recommendations as whitepapers for review/approval/implementation, to include: level role in support of data management and curation; proposing a corresponding framework and resources for library support of the data life cycle; recommending the role of the institutional repository and research computing in storing, finding, and accessing working and final data, and linking publications to supporting data; and recommending a framework for liaisons and subject specialists to incorporate data instruction and consultation into their workflows.
  Outline with detailed plan for training and other supports based on information gathered during Focus Groups, survey, and other activities; plan for ideal (more resources) and for conservative (current resources).
  Outline with detailed information on how the IR fits in the overall supports for data, and the same for other applicable resources that can be used/leveraged as is now, with detailed information on how to enhance them or make the best fit.

[3] Possible questions:
  - How would you like authenticated users to be able to interact with the data online, if you were to make it available? [Download only; Search on site, no download; Run statistical analysis across my data; etc.]
  - What type of data visualizations would you like authenticated users to have access to regarding your data online? [A, B, C, D, etc., write in]
  - If you (or other authenticated users) could add individual records through a form on the online system, would you transfer the data to the system and rely on it for working access and long-term preservation?

Initial Draft for Discussion

The initial draft notes below are towards a possible course to aid in translation competency with data (for working with Data Scientists; no prereqs, not necessarily heavily technical, etc.). The course could draw on theories of the database age, procedural rhetoric, and data provenance for reproducible research, and help frame questions and learning for changes in working, thinking, and doing scholarship and research overall in the Data Age. Readings could include Manovich and Bogost (Persuasive Games).

Introductory Concepts in Research Computing
3 Credits
Fall/Spring, or Summer A/B compressed course
Undergraduate/Graduate sections possible (at what level?)

Purpose
[...] sets enables [...] because it involves harnessing computer power to examine more sources than is possible by any individual or team. This course is an introduction to the basic concepts that will enable students to collaborate with computer scientists to develop or support computational research projects in different fields. The primary goals of the course are to help researchers determine what types of data modeling tools to use for their research, and to provide an introduction to associated computing concepts. This course will not teach or involve computer programming.

Prerequisites: None. Anyone interested in using computers for research is encouraged to attend.

Format
Classes will be part lecture, discussion, and guided inquiry, with hands-on examples to work through different concepts and learn different programs. Students will produce a computational research proposal at the end of the course.

Course Content
  Overview of Research Computing, Common Uses and Tools
  What are Data, and Where Do They Come From?
  Computer Simulations
  The Monte Carlo Method
  GIS
  Data Mining and OCR
  Visualizations and Everything Else
  Unit Operations
  Procedural Rhetoric
  Grounded Theory
  Approaches to Analysis (Functional and non-Functional Requirements)
  Introduction to *nix and Shell Scripting
  Other Systems Operations
  Overview of Applications, Programming Languages, and Libraries used in Research
  Commercial Software Examples
  Open Source Software
  Package Managers
  Scripting Languages
  High Performance Compiled Languages
  Brief Overview of Parallel Computing Techniques and Resources
  GPUs and Moving Data
  On Campus resources: HiPerGator
  Data Management
  Data Storage and Curation
  Ethics of Big Data