Leveraging XSD's for Reflective, Live Dataset Support in Institutional Repositories

Material Information

Title:
Leveraging XSD's for Reflective, Live Dataset Support in Institutional Repositories
Abbreviated Title:
Presentation Proposal for Code4lib Conference 2014
Physical Description:
Conference paper proposal
Language:
English
Creator:
Sullivan, Mark V.
Publisher:
George A. Smathers Libraries, University of Florida
Place of Publication:
Gainesville, FL
Publication Date:

Subjects

Subjects / Keywords:
IR@UF
Data Management
Data set
Data Curation
METS
SobekCM
Genre:
Spatial Coverage:

Notes

Abstract:
The University of Florida Libraries are currently adding support for active datasets into our METS-based institutional repository software, SobekCM. This ongoing project enables the library to be a partner in current, or long-running, data-driven projects around the university by providing tangible short-term and long-term benefits to the projects. The system assists project teams by storing and providing access to their data, while supporting online filtering and sorting of the data, custom queries, and adding and editing of the data by authorized users. We are also exploring simple data visualizations to allow users to perform basic graphical and geographic queries. Several different schemas were explored including DDI and EML, but ultimately the streamlined approach of using XSD's with some custom attributes was chosen, with all other data residing in the METS file portions. Currently the system is being developed using XSD's describing XML datasets, but this model should easily scale to support SQL datasets or large datasets supported by Hadoop or iRODS. This work is being integrated into the open-source SobekCM Digital Content Management System, which is built on a pair-tree structure of METS resources with rich metadata support including DC, MODS, MARC, VRACore, DarwinCore, IEEE-LOM, GML/KML, schema.org microdata, and many other standard schemas. The system has emphasized online, distributed creation and maintenance of resources including geo-placement and geographic searching of resources, building structure maps (table of contents) visually online, and a broad suite of curator tools. This work is presented as a model which could be implemented in other systems as well. We will demonstrate current support and discuss our upcoming roadmap to provide complete support.
General Note:
Related to the Data Management/Curation Task Force (DMCTF) and the need to support small research datasets and databases with online searching and other supports. DMCTF has labeled this the "dinky database problem" and is assessing needs for this as of Fall 2013.

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
Applicable rights reserved.
System ID:
AA00019155:00001

Full Text

Leveraging XSD's for Reflective, Live Dataset Support in Institutional Repositories

Mark Sullivan
Library Information Technology, University of Florida

The University of Florida Libraries are currently adding support for active datasets into our METS-based institutional repository software, SobekCM. This ongoing project enables the library to be a partner in current, or long-running, data-driven projects around the university by providing tangible short-term and long-term benefits to the projects. The system assists project teams by storing and providing access to their data, while supporting online filtering and sorting of the data, custom queries, and adding and editing of the data by authorized users. We are also exploring simple data visualizations to allow users to perform basic graphical and geographic queries. Several different schemas were explored including DDI and EML, but ultimately the streamlined approach of using XSD's with some custom attributes was chosen, with all other data residing in the METS file portions. Currently the system is being developed using XSD's describing XML datasets, but this model should easily scale to support SQL datasets or large datasets supported by Hadoop or iRODS.

This work is being integrated into the open-source SobekCM Digital Content Management System [1], which is built on a pair-tree structure of METS resources with rich metadata support including DC, MODS, MARC, VRACore, DarwinCore, IEEE-LOM, GML/KML, schema.org microdata, and many other standard schemas. The system has emphasized online, distributed creation and maintenance of resources including geo-placement and geographic searching of resources, building structure maps (table of contents) visually online, and a broad suite of curator tools. This work is presented as a model which could be implemented in other systems as well. We will demonstrate current support and discuss our upcoming roadmap to provide complete support.

[1] http://sobek.ufl.edu
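The proposal does not show what an XSD "with some custom attributes" looks like, so the following is only an illustrative sketch of the general idea, not SobekCM's implementation. It assumes a hypothetical annotation namespace (urn:example:dataset-hints) with two made-up attributes, ex:label and ex:filterable, placed on ordinary xs:element declarations, and it uses invented sample data. The sketch reads the column definitions from the XSD, loads an XML dataset accordingly, and then filters and sorts the rows.

```python
import xml.etree.ElementTree as ET

XS = "{http://www.w3.org/2001/XMLSchema}"
EX = "{urn:example:dataset-hints}"  # hypothetical namespace, not from the proposal

# Assumed schema: one flat record type whose columns carry custom hint attributes.
XSD_TEXT = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           xmlns:ex="urn:example:dataset-hints">
  <xs:element name="observation">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="site" type="xs:string" ex:label="Site" ex:filterable="true"/>
        <xs:element name="count" type="xs:integer" ex:label="Count" ex:filterable="false"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

# Invented sample dataset matching the schema above.
XML_TEXT = """<observations>
  <observation><site>Gainesville</site><count>12</count></observation>
  <observation><site>Ocala</site><count>3</count></observation>
  <observation><site>Gainesville</site><count>7</count></observation>
</observations>"""

# Map a few XSD simple types to Python converters; anything else stays a string.
CASTS = {"xs:integer": int, "xs:decimal": float}


def read_columns(xsd_text):
    """Return {column name: {label, filterable, cast}} from typed xs:element declarations."""
    columns = {}
    for el in ET.fromstring(xsd_text).iter(XS + "element"):
        xsd_type = el.get("type")
        if xsd_type is None:
            continue  # untyped element is the record container, not a column
        columns[el.get("name")] = {
            "label": el.get(EX + "label", el.get("name")),
            "filterable": el.get(EX + "filterable") == "true",
            "cast": CASTS.get(xsd_type, str),
        }
    return columns


def load_rows(xml_text, columns):
    """Convert each record element into a dict, casting values per the schema."""
    return [
        {name: col["cast"](record.findtext(name)) for name, col in columns.items()}
        for record in ET.fromstring(xml_text)
    ]


def filter_and_sort(rows, columns, filter_by, value, sort_by):
    """Keep rows where filter_by equals value, ordered by sort_by."""
    if not columns[filter_by]["filterable"]:
        raise ValueError(f"column {filter_by!r} is not marked filterable")
    return sorted((r for r in rows if r[filter_by] == value), key=lambda r: r[sort_by])


columns = read_columns(XSD_TEXT)
rows = load_rows(XML_TEXT, columns)
result = filter_and_sort(rows, columns, "site", "Gainesville", "count")
print([(r["site"], r["count"]) for r in result])
```

Keeping the display and filtering hints inside the XSD means the same file both describes the data and drives the interface, which is one plausible reading of the "reflective" support named in the title. XML Schema permits attributes from other namespaces on its own elements, so a schema annotated this way remains usable by standard validators.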