
UFIR



National Data Management Trends and Challenges
http://www.ufl.edu (Publisher's URL)
Full Citation
Permanent Link: http://ufdc.ufl.edu/IR00001990/00001
 Material Information
Title: National Data Management Trends and Challenges
Physical Description: Powerpoint Presentation
Creator: Conlon, Michael
Publisher: University of Florida
Place of Publication: Hershey, PA
Publication Date: March 26, 2013
 Notes
Acquisition: Collected for University of Florida's Institutional Repository by the UFIR Self-Submittal tool. Submitted by Michael Conlon.
 Record Information
Source Institution: University of Florida Institutional Repository
Holding Location: University of Florida
Rights Management: All rights reserved by the submitter.
System ID: IR00001990:00001


Full Text

PAGE 1

National Data Management Trends and Challenges
Mike Conlon, PhD
Director of Biomedical Informatics, Chief Operating Officer
UF Clinical and Translational Science Institute
The UF CTSI is supported in part by NIH awards UL1 TR000064, KL2 TR000065 and TL1 TR000066

PAGE 2

Data Production

PAGE 3

We represent the natural world in computer systems as data

Provider Name Speciality Cert
1640 Anansi Urology 872312
1751 OBGYN 909832
453123123 PO None

MRN Lab Order Fulfill
333234432 33214 7 20130212 20130214
434234234 36637 1 20130213 20130214
453123123 23322 1 PO 20130214

MRN Diagnosis Date Provider
123123132 J41.1 20130112 1640
123123132 H18.12 20100714 1751
453123123 PO 1834

MRN CPT Date Assess
123123132 99396 20130312 Normal
123123132 99321 20130312 Poor
453123123 PO Normal

MRN Name Address Insurer
123123132 23423 432342342 Smith, Apt 34433
453123123 PO None
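The tables on this slide are tied together by shared identifiers: each diagnosis row carries an MRN and a provider number. The following is a minimal Python sketch of that linkage, using only the two provider rows and two diagnosis rows that are legible in the slide text. The dictionary layout and the function name `diagnoses_with_specialty` are assumptions made for illustration and are not part of the presentation.

```python
# Provider rows from the slide, keyed by provider number.
# The name for provider 1751 is missing in the slide text, so it stays None.
providers = {
    1640: {"name": "Anansi", "specialty": "Urology", "cert": 872312},
    1751: {"name": None, "specialty": "OBGYN", "cert": 909832},
}

# Diagnosis rows from the slide:
# (MRN, diagnosis code, date as YYYYMMDD, provider number).
diagnoses = [
    ("123123132", "J41.1", "20130112", 1640),
    ("123123132", "H18.12", "20100714", 1751),
]


def diagnoses_with_specialty(mrn):
    """Join one patient's diagnoses to the specialty of the diagnosing provider."""
    return [
        (code, date, providers[provider]["specialty"])
        for (row_mrn, code, date, provider) in diagnoses
        if row_mrn == mrn
    ]


for row in diagnoses_with_specialty("123123132"):
    print(row)
```

The join works only because both tables use the same provider numbers, which is the kind of shared representation the later slides on data representation argue for.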

PAGE 4

We collect data for a purpose
In research: to answer a question
In mining: to find patterns
In clinical settings: to treat and bill

PAGE 5

Data Management
Data Representation
Data Capture
Data Flow
Data Preservation
Data Curation
Data Citation
Data Management Plans

PAGE 6

Data management (and data management plans) have a presumption of reuse. Perhaps we should call the plans data reuse plans.

PAGE 7

Data Representation
Is the key to reuse
Use standard vocabularies and ontologies
The development and choice of vocabularies and ontologies is difficult

PAGE 8

Data Preservation is typically beyond the capability of the research team.
Engage specialists (Data Coordinating Center, Informatics Team, Library).
The economics of preservation are troubling.

PAGE 9

Data Curation
Improving the quality of the data: correcting errors, resolving discrepancies, matching to vocabularies
Making decisions about ongoing preservation
Requires domain expertise/collaboration

PAGE 10

Data Curation does not end when the project ends
The economics of data curation are very troubling

PAGE 11

Data Citation

PAGE 13

Some Projects

PAGE 14

Integrated Data Repository

PAGE 15

I2B2 for Cohort Discovery

PAGE 16

Consent2Share
Initiated on 9/11/12
Consent form given with admissions packet (patient-specific bar code)
Consent asks 2 questions:
Can we store your excess tissue with PHI?
Can we re-contact you for a future study?
Collected by admissions clerk, data entered into EPIC, consent form scanned with other documents
Informed Consent Hotline to answer initial questions
CTSI patient research advocate for more detailed queries
Results to date (>8,000): 86% patients returned signed forms recontact

PAGE 17

Data Flow for Consent2Share
Clinic: Paper Form
EMR: Scanned Image
IDR: Data Elements and Dates
Next: Continue rollout to additional clinics

PAGE 18

CTSI Biorepository
Biospecimen collection, processing and storage. Stored biospecimens can be used by any researcher with IRB-approved protocols.
Prospective biospecimen collection to fulfill investigator needs for IRB-approved protocols.
Storage for biospecimens collected by investigators. Stored biospecimens belong solely to the investigator.
Oversight of the release of biospecimens from the UF Department of Pathology for other IRB-approved research protocols.
Pathology services including those provided by the Molecular Pathology Core and confirmation of diagnosis by a board-certified pathologist upon request.
One of two Hamilton -80 °C automated sample management systems (robotic freezers).
The biorepository also has eight Forma Thermo Scientific -80 °C freezers with backup CO2 and Sensaphone alarm systems including backup storage space, centrifuge for basic bodily fluid processing, QiaCube for small volume RNA, DNA and protein purification, Agilent Bioanalyzer for RNA, DNA and protein quality control analysis, OnCore BioSpecimen Management

PAGE 19

Biorepository Data Flow
Bar codes Collection information Sample Vials & worksheets All sample information OnCore BMS Sample information linked to phenotypic information IDR

PAGE 20

Three in One Continue to build integrated data repository

PAGE 21

VIVO: An International Resource for Scholarship

PAGE 22

UF Co-Funded Network 2008-2012
Main Component: what changes have occurred between 2008 and 2012?
More of the Health Science Center comes under the CTSI umbrella
The CTSI has a broader reach in the whole network
Increasingly the CTSI incorporates all researchers in relevant areas (areas not relevant to CTSI research fields naturally remain out of its network)
Chris McCarty & Raffaele Vacca, UF Bureau of Business and Economic Research
Data source: UF Division of Sponsored Research (DSR) database
Each node represents one Contract PI, Project PI or Co-PI linked by a common PeopleSoft Contract number
Nodes are sized by Total Awarded in UF fiscal year (July-June)

PAGE 23

Collaboration and Coordination
At UF Libraries, CTSI, AHC IT, Office of Research, Enterprise Systems, Registrar, Business Services
Federal Partners - Symplectic Pivot, Elsevier, Thomson Reuters, ORCID, CiteSeer CrossRef OCLC, DuraSpace CNI, Total Ontology EuroCRIS CASRAI, NCBO, Eagle I, CTSAconnect
Professional Societies
International Semantic Web community DERI, Tim Berners-Lee, MyExperiment Concept Web Alliance, Open Phacts
Social Network Analysis Community
Universities - Melbourne, Duke, Penn, Colorado, Eindhoven, Pittsburgh, Leicester, Cambridge, Stony Brook, Weill, Indiana, Scripps, Washington U, Ponce, Northwestern, Iowa, Harvard, UCSF, Florida, Stanford, MIT, Brown, Johns Hopkins, OHSU, Minnesota, and the CTSA consortium
Application and service providers over 100
Software downloads (over 30,000) and contact list (over 1,600)

PAGE 24

Study Registry and StudyConnect
Study Registry: All (9,400) human subject studies approved by 4 UF IRBs from 2008 to date.
StudyConnect: Web site with 400 active studies for potential research participants to find opportunities

PAGE 25

Two Purposes: 1) Research management: understand the collection of human subject studies 2) Clinical research: facilitate recruitment

PAGE 26

The Translational Research Continuum at UF: An Expansion of T3 and T4 Research
Basic Discovery, Clinical Efficacy, Clinical Effectiveness, Clinical Practice
T1: What works under controlled conditions? (Translation to Humans)
T2: What works in real world settings? (Translation to Patients)
T3: How can we change practice? (Translation to Practice)
T4: What is the effect on population health? (Translation to Population Health)
2011 (n=912) 2008 (n=862)
20% 27% 37% 16% 14% 21% 48% 18%
CTSI Study Registry includes 9,401 human subject research protocols across four IRBs (medical and non-medical)
4,229 / 9,401 are considered medical and/or health related protocols
3,422 / 4,229 are categorized as translational research (T1-T4)
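The registry counts on this slide are given as raw fractions. As a small worked example, the Python sketch below turns them into percentages. The three counts come from the slide; the variable names and the helper `percent` are assumptions introduced here, and the last figure (translational protocols as a share of the whole registry) is derived from the slide's numbers and does not appear on the slide itself.

```python
total_protocols = 9401   # all human subject protocols in the Study Registry
medical = 4229           # medical and/or health related protocols
translational = 3422     # protocols categorized as T1-T4


def percent(part, whole):
    """Share of `whole` represented by `part`, as a percentage with one decimal."""
    return round(100 * part / whole, 1)


print("medical share of registry:", percent(medical, total_protocols))
print("translational share of medical:", percent(translational, medical))
print("translational share of registry:", percent(translational, total_protocols))
```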

PAGE 27

CTSI REDCap Services
No charge, unlimited self-service access to REDCap and REDCap Survey
Training in REDCap data entry and study setup
Support Services
Configuration Service
Participation in the national consortium
REDCap created for standard sign-on method (GatorLink InCommon/Shibboleth)

PAGE 28

REDCap and data management
it easy to
Time and event calendar, case report forms represent research visits and primary capture naturally
Future work will introduce vocabularies and ontologies
Future work will interface REDCap to electronic medical records
Future work will interface REDCap to devices for data capture

PAGE 29

https://ufandshands.org/news/2012/uf delivers promise personalized medicine heart patients#!/ 1 / Personalized Medicine Program Launched June 25, 2012

PAGE 30

Personalized Medicine Information Flow
Challenge: genetic polymorphism of CYP2C19 leads to reduced ability to activate clopidogrel (Plavix) and increased risk of cardiovascular complication
UF Pathology Lab IT Hospital Orders & Receipts Hospital Best Practice Alerts Team IDR Team CTS IT

PAGE 31

Molecular Medicine

PAGE 32

Data Got Big (IDR Slide) Big Data Management

PAGE 34

Clinical research challenge
Create a connected, accessible, efficient, effective, compliant environment for CR that speeds development, execution and reporting of CR studies
Weill Cornell reported that their clinical research faculty were required to use more than 40 systems to conduct a CR study

PAGE 35

Clinical Research Environment
Focus CTSAs on conducting only high-quality translational research
Increase efficiency and decrease cost through institution-wide accountability and transparency
Clinical study oversight, timely recruitment, effective enrollment, and follow-up, retention of participants, data quality and security, availability of highly trained study staff and ancillary services, and timely submission of adverse events
Broad culture of responsibility for safe and ethical conduct of human subjects research throughout the institutions participating in the CTSA
Appropriateness of specific study designs
Development of realistic recruitment goals
Steps to ensure timely feasibility assessment and closure of studies that
Efficient institutional workflows for conduct of translational studies
Expectation of prompt analysis of results and dissemination of those results
Address tracking of all clinical studies at participating institutions from inception through review, activation, conduct, closure, analysis, and dissemination of results
Identify and address inefficiencies, troubled studies and bottlenecks
Ensure registration of all applicable trials with ClinicalTrials.gov
Facilitate participation by their research teams in multi-site studies, including willingness to adopt centralized IRB arrangements similar to those utilized in many NIH networks
Describe processes for clinical trial agreements and contracts and timelines for completion

PAGE 36

Next Step: Internal Planning Grant, The Clinical Research Initiative

PAGE 37

Data Sharing

PAGE 39

The Semantic Web: A Web of Data
Information is stored using the Resource Description Framework (RDF) as subject-predicate-object
Subject: Jane Smith
Predicates: professor in, author of, has affiliation with
Objects: Dept. of Genetics, College of Medicine, Journal article, Book chapter, Book, Genetics Institute
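To make the subject-predicate-object idea concrete, here is a minimal Python sketch that holds the slide's example as plain tuples and answers pattern queries over them. It is not how VIVO stores RDF. Which object each predicate points to is an assumption, because the arrows of the slide's diagram did not survive extraction, and the "part of" predicate and the function name `match` are hypothetical additions for illustration.

```python
# Triples read off the slide's diagram. The pairing of predicates with objects
# is assumed; the "part of" predicate is hypothetical and not on the slide.
triples = [
    ("Jane Smith", "professor in", "Dept. of Genetics"),
    ("Jane Smith", "has affiliation with", "Genetics Institute"),
    ("Jane Smith", "author of", "Journal article"),
    ("Jane Smith", "author of", "Book chapter"),
    ("Jane Smith", "author of", "Book"),
    ("Dept. of Genetics", "part of", "College of Medicine"),
]


def match(subject=None, predicate=None, obj=None):
    """Return all triples that fit the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if subject in (None, s) and predicate in (None, p) and obj in (None, o)
    ]


# Everything Jane Smith is the author of:
for _, _, work in match("Jane Smith", "author of"):
    print(work)
```

Because every statement has the same three-part shape, one generic query function covers all questions about the data, which is the property that lets independent sources be combined into a web of data.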

PAGE 40

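library(XML)    # xmlParse, getNodeSet, xmlValue, xmlAttrs
library(RCurl)  # getURI

# Recursively read an organization's RDF/XML and return its name
# together with the entries collected from its sub-organizations.
processOrg <- function(uri) {
  x <- xmlParse(uri)
  u <- NULL
  name <- xmlValue(getNodeSet(x, "//rdfs:label")[[1]])
  subs <- getNodeSet(x, "//j.1:hasSubOrganization")
  if (length(subs) == 0) {
    list(name = name, subs = NULL)
  } else {
    for (i in seq_along(subs)) {
      # Fetch the RDF named by the sub-organization's "resource" attribute
      sub.uri <- getURI(xmlAttrs(subs[[i]])["resource"])
      # c() appends the sub-organization's name and subs entries to u
      u <- c(u, processOrg(sub.uri))
    }
    list(name = name, subs = u)
  }
}

VIVO produces human and machine readable formats
Software reads RDF from VIVO and displays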

PAGE 41

A Consumption Scenario
Find all faculty members whose genetic work is implicated in breast cancer
VIVO will store information about faculty and associate to genes.
Diseasome associates genes to diseases.
Query resolves across VIVO and data sources it links to.
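The scenario amounts to a join across two sources: faculty to genes on the VIVO side, and genes to diseases in the gene-to-disease source named on the slide. A minimal Python sketch of that join follows. Both dictionaries are hypothetical stand-ins with placeholder faculty names, not real VIVO data or query results, and `faculty_for_disease` is a name invented for this illustration. A real deployment would resolve the query over linked RDF data, not Python dictionaries.

```python
# Hypothetical stand-in for VIVO: faculty member -> genes their work concerns.
vivo_faculty_genes = {
    "Faculty A": {"BRCA1", "TP53"},
    "Faculty B": {"CFTR"},
    "Faculty C": {"BRCA2"},
}

# Hypothetical stand-in for the external gene-to-disease source.
gene_diseases = {
    "BRCA1": {"breast cancer", "ovarian cancer"},
    "BRCA2": {"breast cancer"},
    "TP53": {"Li-Fraumeni syndrome"},
    "CFTR": {"cystic fibrosis"},
}


def faculty_for_disease(disease):
    """Faculty whose genes of interest are associated with the given disease."""
    # Step 1: resolve the disease to genes in the external source.
    genes = {gene for gene, diseases in gene_diseases.items() if disease in diseases}
    # Step 2: resolve those genes to faculty in the VIVO stand-in.
    return sorted(
        person
        for person, person_genes in vivo_faculty_genes.items()
        if person_genes & genes
    )


print(faculty_for_disease("breast cancer"))
```

Neither source alone can answer the question; the shared gene identifiers are what allow the query to cross from one dataset to the other.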

PAGE 42

Data Reasoning
datasets that describe the relationships between gene, protein, interaction, pathway, target, drug, disease and patient and currently consist of more than 5 billion RDF statements. The dataset interconnects more than 20 complete previously unrelated data from heterogeneous knowledge
From the LarKC (Large Knowledge Collider) http://www.larkc.eu/overview/

PAGE 44

Questions? mconlon@ufl.edu