Research data needs assessment and program planning
Hannah F. Norton, Rolando Garcia Milian, Michele R. Tennant*, Cecilia Botero
Health Science Center Libraries and *UF Genetics Institute
University of Florida
Background
- Growing interest in research data and how to provide support
- Partnership with Research Computing/High Performance Computing Center (HPCC)
- Involvement in ARL E-Science Institute
- Desire to expand our understanding of our Clinical and Translational Science Institute (CTSI) researchers' information needs, including those related to data and e-science
UF High Performance Computing Center (HPCC)
- Mission: providing high-performance computing resources and support to the faculty whose research depends on large-scale computing
- Services:
  - Compute capacity
  - Storage
  - Support
  - Biocomputing Cluster/Galaxy
ARL E-Science Institute
- The Institute's definition of e-science: e-science covers the breadth of e-research activities applied across all disciplines, including interdisciplinary research, but with a particular focus on the sciences
- Designed to help participating institutions develop a local strategic agenda for e-science support and connect with others engaged in e-science program planning
- Participants from UF: Associate Dean of Libraries/Director of Health Science Center Library, Director of HPCC, Associate Dean of Libraries for Scholarly Resources and Research Services
Clinical & Translational Science Basics
- BENCH: Basic Science Research
- BEDSIDE: Clinical Research
- COMMUNITY: Population-based Research
CTSI Researchers' Information Needs Assessment: Project Goals
1. Identify the information needs of the clinical and translational researchers at the University of Florida
2. Identify the services provided to clinical and translational researchers at other institutions
3. Develop librarian expertise in the areas of assessing and documenting research impact and assisting with the CTSA renewal process
4. Enhance the HSCL's efforts in bioinformatics support
5. Establish the HSCL's data services program
6. Educate HSCL librarians in the systematic review process
Methods: Online Assessment
- 20-question online assessment
- Questions crafted with help from collaborators at the HPCC and Digital Library Center
- Sent to > 800 investigators affiliated with CTSI
- Open for 1 month
- 59 respondents (7.1% response rate)
What college are you in? (n=54)
- Agricultural and Life Sciences: 7%
- Dentistry: 7%
- Education: 2%
- Journalism: 2%
- Liberal Arts & Sciences: 4%
- Medicine: 59%
- Pharmacy: 6%
- Public Health & Health Professions: 9%
- Veterinary Medicine: 2%
- Multiple: 2%
What types of data do you generate? (percentage of respondents, n=52)
- Numerical data: 61.5%
- Text: 38.5%
- Still images: 30.8%
- Audio files: 11.5%
- Video files: 21.2%
- Medical data: 69.2%
- Molecular data: 42.3%
- Tabulated data: 48.1%
- Other: 5.8%
How are your data labeled or annotated? (percentage of respondents, n=52)
- Automatically, through data collection tool: 32.7%
- Manually, by a member of my research team: 78.8%
- Referentially, with an associated codebook: 21.2%
- My data are not annotated: 17.3%
How do you store your data? (percentage of respondents, n=52)
- Personal laptop/desktop: 38.5%
- External hard drive/CDs/DVDs: 34.6%
- Online (e.g. Dropbox/Google Docs/Amazon cloud): 17.3%
- College or departmental computer network: 78.8%
- Institutional storage: 30.8%
- Professional organization/association storage (e.g. ICPSR, available with published findings): 1.9%
- Discipline-specific database, e.g. NCBI (National Center for Biotechnology Information): 7.7%
- Other: 9.6%
How long do you need your data stored? (percentage of respondents)
- Raw data (n=50): less than a year 0.0%; 1-5 years 20.0%; 6-10 years 42.0%; more than 10 years 16.0%; forever 22.0%
- Intermediate/working data (n=48): less than a year 6.3%; 1-5 years 43.8%; 6-10 years 29.2%; more than 10 years 12.5%; forever 8.3%
- Processed data (n=49): less than a year 2.0%; 1-5 years 18.4%; 6-10 years 42.9%; more than 10 years 18.4%; forever 18.4%
Who are you willing to share your data with? (percentage of respondents, n=48)
- Immediate collaborators: 95.8%
- Others in my department or institute: 35.4%
- Others in my field: 35.4%
- Others outside of my field: 16.7%
- Anyone: 6.3%
How are you sharing or planning to share your data? (percentage of respondents, n=50)
- Depositing them in a discipline-specific data center or repository: 26.0%
- Submitting them to a journal to support a publication: 68.0%
- Depositing them in UF's Institutional Repository (http://ufdc.ufl.edu/ir): 4.0%
- Making them available online via a project or institutional website: 22.0%
- Making them available informally to peers on request: 46.0%
- I do not share data: 10.0%
What resources outside of your department do you need to best manage and analyze your data? (percentage of respondents, n=45)
- Training on data management: 44.4%
- Storage capacity: 53.3%
- Data/digital management system for organizing data: 51.1%
- Computing capacity for analysis: 40.0%
- Computing expertise or software: 62.2%
- Data management service to outsource some of the work to: 31.1%
- Other external expertise (e.g. statistician, informatician): 37.8%
- Other: 15.6%
Methods: Interviews
- Interviewed 4 campus administrators involved in the research and IT enterprise
- Each interview lasted approximately one hour
Methods: Interviews
Sample questions:
- Do you think (your researchers/the libraries) currently have an understanding of e-science and the role it plays in current research?
- In what ways does the campus coordinate research efforts?
- What are the five- and ten-year research goals of the university?
- Are there general policies for how funding requests are made or distributed?
- Where does data from your project go?
- How do you find people to collaborate with you on grants and how do they find you?
Interview Results: Broad Themes
- Interdisciplinary collaboration within the institution can be accomplished by leveraging existing partnerships and careful consideration of team composition.
- Regional and national collaboration is vital in solving problems related to e-science.
- The ability to effectively participate in e-science at UF is strongly influenced by organizational structure.
- All interviewees expressed a willingness to partner with the library but not a clear idea of how.
Interview Results: Existing Data Projects
- Collaboration between regional organizations on data issues (Southeastern Universities Research Association/Association of Southeastern Research Libraries)
- REDCap software for database creation and clinical research data capture
- Integrated Data Repository being developed to connect research and clinical data and allow cohort discovery
- Patient registry being built using IRB data; IRB submission process going electronic
Next Steps
- Library/HPCC collaboration in data management planning workshops and resources
- Additional interviews with CTSI investigators
- Further development of partnerships with campus IT and Office of Research leadership
Conclusions
- UF researchers have varied data and varied expectations of how they should manage and store those data.
- Future roles for the library include education and training and help in developing systems to better organize data.
- High-level administrators recognize e-science and data issues as highly important but haven't yet developed institution-wide strategies to deal with them.
- The libraries are seen as good partners, but must be proactive in providing specific solutions.
Acknowledgements
Thank you to collaborators at UF, including faculty and staff from:
- Clinical and Translational Science Institute
- High Performance Computing Center
- Digital Library Center
This project has been funded in part with federal funds from the National Library of Medicine, National Institutes of Health, under Contract # HHS-N-276-2011-00004-C.
This presentation is available for reuse under a Creative Commons Attribution license.