UF Digital Preservation Support & Services
The University of Florida Libraries are committed to long-term digital preservation of all materials in the UF Digital Collections and in UF-supported collaborative projects such as the Digital Library of the Caribbean (dLOC), including UF-supported digital gallery, library, archive, and museum (GLAM) projects as well as digital scholarship projects. Redundant digital archives, adherence to proven standards, and rigorous quality control methods protect digital objects. The UF Digital Collections provide a comprehensive approach to digital preservation, including technical support, reference services for both online and offline archived files, and training and consultation on digitization standards for long-term digital preservation.
The University of Florida Libraries maintain redundant servers with copies of all online files, with an additional tape backup as a ready-access archive. As a practice applied consistently to all University of Florida Digital Collections and projects, separate redundant digital archives are maintained by the Florida Digital Archive (FDA; http://fclaweb.fcla.edu/fda).
Information about the archival processing for all digital objects, both online and offline or "dark" archived objects, is tracked and maintained within the SobekCM Management and Reporting Tool (SMaRT) as well as in the SobekCM online system under "Work History". The "Work History" view includes a "History" field, which lists the workflow name (the name of the archive and the process, e.g., FDA ingest), the date the workflow occurred, and location/notes (e.g., the FDA IEID). "Work History" also includes an "Archives" field, which lists all of the archived files with filename, size, last write date, and archived date.
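As an illustration only, the two kinds of "Work History" records described above could be modeled as follows. This is a minimal sketch, not the SobekCM schema: the class names, field names, and helper methods are hypothetical, chosen to mirror the fields listed in the text.

```python
from dataclasses import dataclass, field
from datetime import date, datetime
from typing import List, Optional


@dataclass
class HistoryEntry:
    """One row of the "History" list (hypothetical field names)."""
    workflow: str       # name of the archive and process, e.g. "FDA ingest"
    occurred_on: date   # date the workflow occurred
    note: str = ""      # location/notes, e.g. the FDA IEID


@dataclass
class ArchivedFile:
    """One row of the "Archives" list (hypothetical field names)."""
    filename: str
    size_bytes: int
    last_write: datetime
    archived: datetime


@dataclass
class WorkHistory:
    """Both lists for a single digital object."""
    history: List[HistoryEntry] = field(default_factory=list)
    archives: List[ArchivedFile] = field(default_factory=list)

    def latest(self, workflow: str) -> Optional[HistoryEntry]:
        """Return the most recent entry for a workflow, or None if absent."""
        matches = [e for e in self.history if e.workflow == workflow]
        return max(matches, key=lambda e: e.occurred_on) if matches else None

    def total_archived_bytes(self) -> int:
        """Sum the sizes of all archived files."""
        return sum(f.size_bytes for f in self.archives)
```

With records of this shape, a question such as "when was this object last ingested into the FDA, and under which identifier?" reduces to a call like `work_history.latest("FDA ingest")`.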
SobekCM also includes tools for preparing files directly for submission to the FDA without loading them to an online system. These functions are supported by the SobekCM METS Editor and the DLC Toolbox. The FDA preparation process creates the Submission Information Package (SIP) file with the metadata and in the format for submission to the FDA, including MD5 checksums, file format and version information, and administrative and bibliographic metadata.
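The checksum portion of SIP preparation can be sketched in a few lines. The example below is an assumption-laden illustration, not the SobekCM or DLC Toolbox implementation: the function names `md5_of_file` and `build_manifest` and the shape of the returned dictionary are hypothetical. It walks a package directory and records each file's size and MD5 digest using Python's standard `hashlib` module.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(package_dir):
    """Map each file path (relative to package_dir) to its size and MD5 digest.

    Hypothetical helper: a real SIP records this information in METS,
    not in a Python dictionary.
    """
    root = Path(package_dir)
    manifest = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            manifest[path.relative_to(root).as_posix()] = {
                "size": path.stat().st_size,
                "md5": md5_of_file(path),
            }
    return manifest
```

Recording the digest at packaging time is what makes later verification possible: the archive can recompute the digest on receipt and at any later date and compare it with the submitted value.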
Florida Digital Archive (FDA)
The information below is from the Florida Digital Archive (FDA) Policy and Procedures Guide, with much more extensive information available in the full guide and in related FDA documentation (http://fclaweb.fcla.edu/FDA_documentation).
The mission of the Florida Digital Archive (FDA; http://fclaweb.fcla.edu/fda) is to provide a cost-effective, long-term preservation repository for digital materials in support of teaching and learning, scholarship, and research in the state of Florida. In support of this mission, the FDA guarantees that all files deposited by agreement with its Affiliates remain available, unaltered, and readable from media. For supported formats, the FDA will maintain a usable version using the best format migration tools available.
Planning for the FDA began in 2001 in response to the need, identified by the directors of the libraries of the public universities of Florida, to ensure the permanent availability of digital library materials such as electronic dissertations. Development was expedited by the award in 2002 of a three-year grant from the Institute of Museum and Library Services (IMLS), which concluded in September 2005. To implement the FDA, staff designed and developed the DAITSS application (Dark Archive in the Sunshine State). In November 2005, an early version of the DAITSS application that lacked dissemination and withdrawal functions went into production for the Florida Digital Archive. In December 2006, the first version of DAITSS with all major planned functionality was completed. The software was released as open source under the GPL in 2007. In April 2011, a wholly rearchitected and recoded version of DAITSS was installed as DAITSS 2.
The technical design, procedures, and policies of the FDA are based on the Open Archival Information System (OAIS) Reference Model (ISO 14721:2003) and on ongoing work to define and certify trusted digital repositories, including Trusted Digital Repositories: Attributes and Responsibilities (RLG, May 2002), the RLG/NARA Audit Checklist for Certifying Digital Repositories (RLG, August 2005), and Trustworthy Repositories Audit & Certification: Criteria and Checklist (NARA, et al., February 2007).
For every file in each digital object (as specified in the archival information package, or AIP, created for each SIP), two master copies are written. One copy is stored at the UF Computing & Network Services facility in Gainesville (CNS) and one copy is stored at the Northwest Regional Data Center in Tallahassee (NWRDC). The two master copies are treated as a single file by DAITSS, the repository software application underlying the FDA. This means that when any action is performed on a file, it must be successfully performed on both master copies to be considered complete. For example, a fixity check involves calculating a message digest over the bits of a file and comparing this to a previously stored message digest. For a fixity check to be complete, message digests must be calculated for both of the master copies of the file and verified to match the stored message digest. In addition to the master copies, traditional backup copies on tape are maintained in Gainesville and Tallahassee.
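The fixity rule described above, in which a check counts as complete only when every master copy matches the stored digest, can be illustrated with a short sketch. This is not DAITSS code: the function names are hypothetical, and the choice of SHA-1 as the default algorithm is an assumption, since the text above does not name the message digest algorithm.

```python
import hashlib


def message_digest(path, algorithm="sha1", chunk_size=1 << 20):
    """Calculate a hex message digest over the bits of a file."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_check(copy_paths, stored_digest, algorithm="sha1"):
    """Check every master copy against the previously stored digest.

    Returns (complete, results). `complete` is True only when at least one
    copy was given and every copy matches; `results` maps each path to
    whether that copy matched.
    """
    expected = stored_digest.lower()
    results = {}
    for path in copy_paths:
        results[str(path)] = message_digest(path, algorithm) == expected
    return bool(results) and all(results.values()), results
```

Treating the copies as a unit means a mismatch on either one fails the whole check, while the per-copy results still show which copy needs repair from the other.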
Additional Background Information from the FDA website (http://fclaweb.fcla.edu/content/fda-background-information)
The FDA is a partner in the TIPR (Towards Interoperable Preservation Repositories) project (http://wiki.fcla.edu:8000/TIPR), funded by a National Leadership Grant from the Institute of Museum and Library Services. The TIPR project developed and tested a model for repository-to-repository transfer among three partners: Cornell University, New York University, and the FDA. Version 2 of the DAITSS software (http://daitss.fcla.edu) went into production in April 2011. DAITSS 2 is fully PREMIS-conformant, and implements a flexible, modular, web-services architecture. The FDA has developed PREMIS tools for the preservation community under contract to the Library of Congress.
The text below is an example statement for use in grants where all materials will be accessible and archived in the UF Digital Collections, including the IR@UF.
The University of Florida Libraries are committed to long-term digital preservation of all materials in the UF Digital Collections, including the IR@UF, and in UF-supported collaborative projects such as the Digital Library of the Caribbean (dLOC). Redundant digital archives, adherence to proven standards, and rigorous quality control methods protect digital objects. The UF Digital Collections provide a comprehensive approach to digital preservation, including technical support, reference services for both online and offline archived files, and training and consultation on digitization standards for long-term digital preservation. The UF Libraries dedicate staff time to ensuring support for digital preservation and access from the Digital Library Center staff, IR@UF Manager, Digital Development & Web Services Team, and Digital Librarian.
The UF Libraries support locally created digital resources powered by and hosted in the SobekCM Open Source Repository Software, including the UF Digital Collections, which contain over 381,000 digital objects with over 30 million files (as of February 2014). The UF Libraries create METS/MODS metadata for all materials. Citation information for each digital object is also automatically transformed by the SobekCM software into MARCXML and Dublin Core. These records are widely distributed through library networks and through search engine optimization to ensure broad public access to all online materials.
As a practice applied consistently to all digital projects and materials supported by the UF Libraries, redundant copies are maintained for all online and offline files. The digital archive is the Florida Digital Archive (FDA) (http://fclaweb.fcla.edu/fda), which was completed in 2005 and is available at no cost to Florida’s public university libraries. The software programmed to support the FDA is modeled on the widely accepted Open Archival Information System (OAIS) reference model. It is a dark archive, and no public access functions are provided. It supports the preservation functions of format normalization, mass format migration, and migration on request.
As items are processed into the UF Digital Collections (UFDC) for public access, a command in the METS header directs a copy of the files to the Florida Digital Archive (FDA). The process of forwarding original files to the FDA is the key component in UF’s plan to store, maintain and protect electronic data for the long term. If items are not directed to load for public access, they do not load online and are instead loaded directly to the FDA.
SobekCM Integration with Preservation Systems
SobekCM offers support for integration with internal and external digital preservation systems, including automatic integration with the Florida Digital Archive (FDA). Details on this configuration are explained above.
Additional technical information is available from the community site for the SobekCM Open Source Software: