
UFIR



Some Thoughts on Data and eScience
www.ufl.edu ( Publisher's URL )
Full Citation
Permanent Link: http://ufdc.ufl.edu/IR00000708/00001
 Material Information
Title: Some Thoughts on Data and eScience
Physical Description: Presentation slides
Creator: Conlon, Michael
Publisher: University of Florida
Place of Publication: UC Davis
Publication Date: December 6, 2011
 Notes
Acquisition: Collected for University of Florida's Institutional Repository by the UFIR Self-Submittal tool. Submitted by Michael Conlon.
 Record Information
Source Institution: University of Florida Institutional Repository
Holding Location: University of Florida
Rights Management: All rights reserved by the submitter.
System ID: IR00000708:00001



Full Text

PAGE 1

Some Thoughts on Data and eScience
Mike Conlon
University of Florida
mconlon@ufl.edu

PAGE 2

What Does Data Look Like?

PAGE 5

5. Harvesting

PAGE 6

Get a world map showing temperature sensors

PAGE 8

What are the Data Processes?

PAGE 10

Producing Data

PAGE 11

Data Sharing
Photograph by J. G. Park, Flickr.com
Photograph by Ell Brown, Flickr.com

PAGE 12

Creative Commons

PAGE 13

Data Archive

PAGE 14

The Role of the Archive
Collate data, final semantics, ready for consumption

PAGE 15

A Consumption Scenario
Find all faculty members whose genetic work is implicated in breast cancer.
VIVO will store information about faculty and associate them with genes.
Diseasome associates genes with diseases.
The query resolves across VIVO and the data sources it links to.
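
The cross-source query in this scenario can be pictured as a join on a shared gene identifier. The sketch below is only illustrative and uses base R: the two small tables stand in for VIVO (faculty to gene) and Diseasome (gene to disease), and the people, column names, and the helper `findFaculty` are made up for the example. A real query would run over RDF, for instance with SPARQL, rather than over data frames.

```r
# Hypothetical stand-in for VIVO: which faculty member works on which gene.
vivo <- data.frame(
  faculty = c("Jane Smith", "John Doe", "Ann Lee"),
  gene    = c("BRCA1", "TP53", "CFTR"),
  stringsAsFactors = FALSE
)

# Hypothetical stand-in for Diseasome: which gene is associated with which disease.
diseasome <- data.frame(
  gene    = c("BRCA1", "TP53", "CFTR"),
  disease = c("breast cancer", "breast cancer", "cystic fibrosis"),
  stringsAsFactors = FALSE
)

# Resolve the query across both sources by joining on the shared gene column.
linked <- merge(vivo, diseasome, by = "gene")

# Return the faculty whose linked genes are associated with the given disease.
findFaculty <- function(linked, disease) {
  sort(unique(linked$faculty[linked$disease == disease]))
}

print(findFaculty(linked, "breast cancer"))
```

The join works only because both tables use the same gene identifiers, which is the role shared URIs play in linked data.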

PAGE 16

Data Reasoning
Data integration continues to be a serious bottleneck for the expectations of increased productivity in the pharmaceutical and biotechnology domain [...] relationships between gene, protein, interaction, pathway, target, drug, disease and patient and currently consist of more than 5 billion RDF statements. The dataset interconnects more than 20 complete data sources and previously unrelated data from heterogeneous knowledge.
From LarKC (Large Knowledge Collider), http://www.larkc.eu/overview/

PAGE 17

Public, structured linked data about investigators' interests, activities and accomplishments, and tools to use that data to advance science

PAGE 18

Information is stored using the Resource Description Framework (RDF) as subject-predicate-object triples.
[Diagram: a web of data centered on "Jane Smith". Predicates: professor in, author of, has affiliation with. Objects: Dept. of Genetics, College of Medicine, Journal article, Book chapter, Book, Genetics Institute. Column labels: Subject, Predicate, Object.]
A Web of Data
The Semantic Web
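
As a rough illustration of the subject, predicate, object model, a few triples can be held in a three-column table and filtered by pattern. This is a sketch in base R, not how VIVO actually stores data. The pairing of predicates with objects is assumed for the example, since the slide's diagram does not survive in the text, and `matchTriples` is a hypothetical helper.

```r
# Each row is one triple: subject, predicate, object.
# The pairings below are assumed for illustration.
triples <- data.frame(
  subject   = c("Jane Smith", "Jane Smith", "Jane Smith", "Jane Smith"),
  predicate = c("professor in", "author of", "author of", "has affiliation with"),
  object    = c("Dept. of Genetics", "Journal article", "Book chapter", "Genetics Institute"),
  stringsAsFactors = FALSE
)

# Return the triples matching a pattern; a NULL argument acts as a wildcard.
matchTriples <- function(triples, subject = NULL, predicate = NULL, object = NULL) {
  keep <- rep(TRUE, nrow(triples))
  if (!is.null(subject))   keep <- keep & triples$subject == subject
  if (!is.null(predicate)) keep <- keep & triples$predicate == predicate
  if (!is.null(object))    keep <- keep & triples$object == object
  triples[keep, , drop = FALSE]
}

# Everything Jane Smith is recorded as the author of.
authored <- matchTriples(triples, subject = "Jane Smith", predicate = "author of")
print(authored$object)
```

Leaving any position unspecified mirrors the variable slots in a triple pattern: the same helper answers "what did this person author?" and "who is affiliated with this institute?".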

PAGE 21

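    library(XML)    # xmlParse, getNodeSet, xmlValue, xmlAttrs
    library(RCurl)  # getURI

    # Recursively read an organization's RDF/XML from VIVO and return
    # its name together with the results for its sub-organizations.
    processOrg <- function(uri) {
      x <- xmlParse(uri)
      u <- NULL
      # The organization's display name is its first rdfs:label.
      name <- xmlValue(getNodeSet(x, "//rdfs:label")[[1]])
      # "j.1" is the namespace prefix as it appears in the RDF/XML.
      subs <- getNodeSet(x, "//j.1:hasSubOrganization")
      if (length(subs) == 0) {
        list(name = name, subs = NULL)
      } else {
        for (i in seq_along(subs)) {
          # Fetch the RDF for each sub-organization and recurse into it.
          sub.uri <- getURI(xmlAttrs(subs[[i]])["resource"])
          u <- c(u, processOrg(sub.uri))
        }
        list(name = name, subs = u)
      }
    }

VIVO produces human and machine readable formats
Software reads RDF from VIVO and displays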

PAGE 23

VIVO Searchlight

PAGE 24

Some Questions Regarding Data Processes

PAGE 25

Shared Understanding of Data

PAGE 26

Provenance

PAGE 27

Who pays?

PAGE 28

http://vivo.ufl.edu/individual/mconlon