Enhancing the Legacy Digital Collections of the SPOHP for Improved User Access

Material Information

Enhancing the Legacy Digital Collections of the SPOHP for Improved User Access
Ma, Xiaoli
Birch, Stephanie
de Farber, Bess
University of Florida
Publication Date:
Physical Description:
Grant proposal and Reports


This project aims at improving the accessibility and discoverability of SPOHP's digital collections by assessing the quality of existing transcripts, adding item-level metadata, and restructuring the collections using the latest available tools: Descript and Data Harmony. Descript transcribes text from audio recordings, while the Data Harmony Suite reads the transcribed text and generates subject terms based on a selected thesaurus. This project will further improve the collections' usability and discoverability through updates to SPOHP's landing pages in the UF Digital Collections database and the creation of reference materials.
Collected for University of Florida's Institutional Repository by the UFIR Self-Submittal tool. Submitted by Danielle Sessions.
General Note:
Funding: Strategic Opportunities Program; awarded $4,954

Record Information

Source Institution:
University of Florida Institutional Repository
Holding Location:
University of Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.




Project Description

The project team seeks $4,954 to leverage the expertise of the African American Studies Librarian and SPOHP staff to comprehensively optimize the content discoverability of the digital oral history collections, using the latest available tools at UFDC. The SPOHP Collection is an established multimedia digital collection in the University of Florida Digital Collections (UFDC), with close to 5,000 items. In collaboration with SPOHP staff, the project will plan the collection structure change, prioritize tasks for the item metadata, and add subjects using machine-aided indexing tools that have been newly adapted by the Smathers Libraries. The project will also assess the quality of digitized transcripts (excluding born-digital items) to identify items that need to be re-transcribed by SPOHP for accuracy and in accordance with current oral history standards.

Upon completion, the project will yield: an infographic or other visualization of the changes to the collection structure and the metadata, illustrating the impact and value of the project work; a report on the collection structure changes as well as the concerns and considerations for replicating these processes for other collections; a tailored thesaurus for effective, standardized indexing of SPOHP's Black Oral History collections as Program staff continue to add items to these collections; and a collection metadata guide that will help SPOHP staff locate the collections where new content will be added and facilitate the building of new collections.

Project Importance

Starting with digitization in the 1990s, SPOHP's collection has grown to over 100 collections in UFDC.
However, these collections were created largely to meet production process needs. They therefore cannot currently support the work of revealing the themes, intellectual content, or collection goals, because the collections were grouped mainly by production needs and the assigned metadata were not sufficient for exploring the richness of the content. The inappropriate collection structure and the nonexistence of important information retrieval points impede the creation of functional supports like LibGuides, creating obstacles for successful outreach and limiting external grant funding opportunities. With its current capacities, the Libraries staff would not create the SPOHP collections today in the way they currently exist. This is because the present aggregation model allows for much greater flexibility, internet speeds have increased across the world, and digital library standards now emphasize findability and usability. The fruits of using this newer approach can be seen in the Judaica Digital Collections, where the exceptional materials are more easily discovered. Unlike Judaica, the current SPOHP collection structure and assigned metadata cannot support further exploration of its content. This proposed project will target these issues by improving the organization of the content, adding item-level metadata, and greatly increasing the access points of the records.

This is a timely project, as 2019 marks the 10th anniversary of the African American History Project at the University of Florida. In March 2019, SPOHP and the Libraries will host a symposium for scholars, students, and researchers to celebrate the newly dedicated Joel Buchanan Archive of African American Oral History. Higher usage and greater exposure of SPOHP digital collections are expected around that time.


This project will also directly benefit the Digital Publishing on Black Life and History Collaborative Meeting, another proposed Strategic Opportunities Program project, which, if awarded, will be held during Summer 2019 and is led by the African American Studies Librarian. The meeting will enhance collaborative partnerships in Black-centered research and digital scholarship across North Florida. SPOHP's digital collections are important resources, which will be promoted at the collaborative meeting as a publicly accessible resource for Black Floridian, Southern US, and Caribbean oral histories. The meeting and the SPOHP symposium provide an impetus and immediacy to the project, as well as an opportunity to collect user feedback on the collection's structure and metadata status.

Innovative Components

This project is innovative because it takes a maintenance approach to digital collections while using the latest available technology. Maintenance Studies emerged as a robust and growing field following the 2015 start of the Maintainers [i] and subsequent conferences, publications, and other activities. Maintenance is a critical and often overlooked aspect of innovation because innovation alone will create changes, but those changes will not necessarily be solutions, since needs evolve over time. This balance of need for innovation and maintenance is evident in the work of UFDC. Maintenance Studies offers a framework for innovation while making the working system more robust. The focus is to shape a process of making metadata that can support research and discovery in the long run and to provide a direct and necessary complement to the study of innovation by asking how we create, sustain, and engage with change in meaningful ways.

This project will at the same time test Data Harmony tools that pull out suggested terms from texts based on rules built around the selected thesaurus terms.
Transcripts of audio and video are ideal for exploring small-scale thesaurus building and adding machine-generated subject terms. Additionally, this project will examine the Descript software for efficacy of automated transcription and process streamlining, the results of which may contribute to the nationwide discussion on oral history digital collection building standards [ii].

This project will also seek to hire a Humanities major who is keen on digital tools, for instance, a recommended student from TRACE [iii], a research group that works at the intersection of writing studies, digital media studies, and ecocriticism in the English department of the University of Florida. This type of collaboration between UFDC and the Humanities will create opportunities for future projects that require participants to possess expert knowledge of a Humanities area, as well as skills in adapting to digital tools.

Lastly, this project has the potential to pioneer technical methods of restructuring established collection levels. Its success will shed light on solving the collection structure issues of digital collections built in the early days. The whole experience will provide valuable information, for instance, whether the production site's performance will be affected by batch data updates of over 1,000 items at a time and whether any obstacles emerge in carrying out this type of change. Additionally, this project will collect benchmark data that can contribute to preparing future batch metadata updates of this kind.

Project Comparison

The Samuel Proctor Oral History Program Collection is one of the largest oral history digital collections listed by the Oral History Association [iv]. Close to 5,000 items of audio, video, and related materials have been made available online for research and study.
Applying a collection structure that surfaces the core theme of this collection and supplying more access points will further present the value of this collection to the public.


UFDC stands out among other digital collections of academic libraries for its scale (over 600,000 items) and the large variety of content types across all disciplines, which includes materials from archives, special collections, herbariums, etc. Above all, the content lives in one system. Many libraries operate multiple systems (e.g., digital library, archive, IR, etc.), directing different material types to different systems. These require a high level of labor for soliciting materials and for building and maintaining digital collections. UF is recognized for being different in kind from these, and this has led to opportunities to collaborate on innovative approaches to metadata.

Project Resources

A workstation in Library West or the Interim Library Facility for the OPS worker.

Plan of Action

Jan 2018
  Description: Prepare the project
  Responsible Party: PI, Co-PI

Jan. 2018
  Description: Hold team meeting; share updated report on remaining aggregations, and plan to execute correction processes
  Responsible Party: PI, Co-PI, project team

Feb 2018 - Mar. 2019
  Description: Hire OPS student worker; correct aggregation structure; correct metadata for African American Oral Histories, in support of the March Symposium; create LibGuides for some of the SPOHP collections, especially for the African American Collections
  Responsible Party: PI, Co-PI, student worker, project team

Mar. - Aug. 2019
  Description: Continue work, focusing on the African American collections until completed, and working systematically through other top-priority collections; start building the thesaurus and indexing content; support new oral histories from the Symposium with the new aggregation structure and metadata standards
  Responsible Party: PI, Co-PI, student worker, project team

Sept. - Oct. 2019
  Description: Continue work; produce Metadata Collection Guide
  Responsible Party: PI, Co-PI, student worker, project team


Oct. - Nov. 2019
  Description: Produce a short report on the steps taken for the work, and concerns/considerations for repeating this for other collections, which could be especially useful for supporting needed changes in the other collections in UFDC; provide an infographic or other visualization of changes to collections and metadata, to illustrate the impact and value of curatorial work
  Responsible Party: PI, Co-PI, student worker, project team

Copyright and Hosting

The collections exist in the UF Digital Collections in partnership with SPOHP. There are no copyright issues with the permissions process in place, and the collection will be held in the UF Digital Collections. UF has committed to releasing all catalog records with a Creative Commons License, as these will be.

Assessment

Assessment of the project will occur through regular project meetings with team members and SPOHP staff, to discuss:
  - Number of collections removed;
  - Number of metadata records edited;
  - Number of collection landing pages edited;
  - Number of collections with new catalog records at the collection level;
  - LibGuide design to support findability and discoverability; and
  - Creation of templates, series titles, and other standardized elements for existing collections as updated records and for ongoing use in new records.

For evaluation of this project and in support of future projects, the project team will produce a final report and an infographic of the changes to the collection structure and metadata to document project processes and illustrate the impact and value of curatorial work. The report will include concerns and considerations for using this project as a model for future collection structure changes in UFDC, which will be especially useful for supporting immediate needs of the Florida Collections and the preparation for the building of a portal site to collate all Floridian content.
Dissemination

The PI and Co-PIs will collaborate with the Library West team, the Director of Communications, and the Social Media Manager to support promotion of the LibGuide, collections, and specific collection items. The Co-PI will promote project activities and results to students in her liaison departments. The first results of the project, with improvements to the African American Oral History Collections, will be promoted through the March 2019 Symposium.

Long Term Financial Implications

This project will enable greater ease in supporting and fostering collaboration with SPOHP both now and in the future. While there will not be ongoing financial costs, this project will have long-term financial implications for positive outputs and outcomes, including better positioning for grant and foundation funding proposals.

Plan for Equipment/Supplies

No equipment or supplies will be purchased.


Strategic Opportunities Grant Budget Form 2018-2019

Please add lines to table as needed. If you need help completing this form, please contact Bess de Farber, PH# 273-2519.

1. Salaries and Fringe

Name of Person    % of effort                    Grant Funds  Cost Share  Total
Ma, Xiaoli        5                              $0.00        $3,615.00   $3,615.00
Birch, Stephanie  1                              $0.00        $707.00     $707.00
Taylor, Laurie    1                              $0.00        $1,312.00   $1,312.00
OPS Worker        100 ($15/hr X 10hrs/wk X 30 w  $4,594.00    $0.00       $4,594.00
SUBTOTAL                                         $4,594.00    $5,634.00   $10,228.00

2. Equipment

Item              Quantity times Cost            Grant Funds  Cost Share  Total
SUBTOTAL                                         $0.00        $0.00       $0.00

3. Supplies

Item              Quantity times Cost            Grant Funds  Cost Share  Total
SUBTOTAL                                         $0.00        $0.00       $0.00

4. Travel

From/To           # of people/# of days          Grant Funds  Cost Share  Total
SUBTOTAL                                         $0.00        $0.00       $0.00

5. Other (Vendor costs, etc. Provide detail in Budget Narrative section.)

Item                             Quantity times cost       Grant Funds  Cost Share  Total
Descript Transcription Services  2,400 mins X $0.15/min    $360.00      $0.00       $360.00
SUBTOTAL                                                   $360.00      $0.00       $360.00

                                                           Grant Funds  Cost Share  Total
Total Direct Costs (add subtotals of items 1-5)            $4,954.00    $5,634.00   $10,588.00
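The totals in the budget form can be cross-checked with a short script. The sketch below is only an illustration: the figures are copied from the form, while the variable names and the way the rows are grouped are assumptions made here. It also shows that the OPS worker line ($4,594) exceeds the base wages of $15/hour for 10 hours over 30 weeks by $94, which is consistent with the narrative's statement that the amount includes fringe benefits; treating that difference as the fringe amount is an inference, not something the form states.

```python
from decimal import Decimal

# Cost-share salaries, copied from section 1 of the budget form.
cost_share = {
    "Ma, Xiaoli": Decimal("3615.00"),
    "Birch, Stephanie": Decimal("707.00"),
    "Taylor, Laurie": Decimal("1312.00"),
}

# Grant-funded lines, copied from sections 1 and 5 of the form.
ops_worker_grant = Decimal("4594.00")
descript_minutes = 2400
descript_rate = Decimal("0.15")  # dollars per minute

descript_cost = descript_minutes * descript_rate
grant_total = ops_worker_grant + descript_cost
cost_share_total = sum(cost_share.values(), Decimal("0"))
project_total = grant_total + cost_share_total

# Base wages without fringe: $15/hour x 10 hours/week x 30 weeks.
ops_base_wages = Decimal("15") * 10 * 30
# Assumption: the remainder of the OPS line is the fringe amount.
implied_fringe = ops_worker_grant - ops_base_wages

print(f"Descript cost:    ${descript_cost}")
print(f"Grant funds:      ${grant_total}")
print(f"Cost share:       ${cost_share_total}")
print(f"Total direct:     ${project_total}")
print(f"Implied fringe:   ${implied_fringe}")
```

Using `Decimal` rather than floating point keeps the dollar amounts exact, which matters when comparing computed totals with the figures printed on a form.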


Project Budget Narrative

Expense Calculation

Funding ($4,954) for this proposed project will be expended as follows:
  - Descript transcription service: pricing for forty hours of interviews (2,400 minutes) is $.15/minute; total: $360
  - OPS student worker for metadata creation, enhancement, and correction; LibGuide authoring and editing; and editing of collection home pages and banners: 1 student at $15/hour for 10 hours per week for 30 weeks, including fringe benefits; total: $4,594

Expense Justification

The funds for Descript will support the critical transcription of oral histories where there are only brief records and where transcripts are not available for creating records. These funds and this work will also support evaluating Descript for larger system and workflow integration. The student worker is needed to make the manual updates to create and enhance metadata at the item level, update the landing pages for collections that will be retained, create and update banner images for the collections, and support creating guides and pages via Springshare to support collection discoverability and usability.

Principal Investigator Role

The PI will lead all activities and will serve as the primary supervisor for the student worker for metadata work and landing page editing. The Co-PI will serve as the primary supervisor for the student worker for the LibGuide work.

Contributed Cost Share

Stephanie Birch (Co-PI, 1%) will provide expert knowledge of SPOHP's content and African American studies, assist in training and supervising the student worker, participate in project assessment, and promote project activities. Project team members will support collection and item data changes using SobekCM, liaise between the project team and the Libraries production team, and ensure collection changes are reflected in WorldCat and Aleph. Team members are Patrick Lee Stanley, Laura Perry, and David Van Kleeck, each contributing less than 1%.
Patrick will only be conducting tasks within his regular job responsibilities of maintaining the SobekCM database and won't provide customized development work. Laurie Taylor will serve as a project consultant, contributing >1%, to liaise on technical needs for UFDC and support the PI and Co-PI.

[i] Details at
[ii] Details at history in the digital age/
[iii] Details at
[iv] The full list is at and collections/


Letters of Commitment from Team Members

Dear Grants Management Committee,

Please accept this as a letter of support for Xiaoli's Reviving the Samuel Proctor Oral History Program (SPOHP) Legacy Collections. The current structure of the SPOHP digital collections has become very unwieldy. With over 100 sub-collections, it is difficult for Digital Support Services staff to determine which collection(s) to add new content to, as well as to ensure the metadata is cohesive and the items are discoverable. This project has the potential for changing the way we structure new and preexisting collections and has my full support.

Thank you,
Laura Perry
Manager, Digital Support Services

Xiaoli,

I am writing in support of, and commitment to, the Reviving the Samuel Proctor Oral History Program (SPOHP) Legacy Collections project. This highly collaborative project will strengthen ties between the public and technical work of the Libraries and will enhance digital humanities scholarship in African American Studies and oral history programs. The Data Harmony technology involved is cutting edge and its use in this project will further the Libraries' implementation of it. The interdisciplinary nature of the SPOHP makes it an ideal subject for a project like this, one aimed at improving online content discoverability in order to meet researcher needs.

Sincerely,
Dave
David Van Kleeck
Chair, Cataloging and Discovery Services
George A. Smathers Libraries
University of Florida
352-273-2863

Xiaoli,


We can commit a maximum of 5% of Patrick's time for this project for the period January 1, 2019 through October 31, 2019. This equates to an average of 2 hours per work week, but no more than 8 hours in any given calendar month. This time includes consultation, programming, and any other tasks related to the project, such as meetings or other administrative tasks.

Richmond, Clifford
Supervisor Web & Programming, LB SYSTEMS DEPARTMENT

Letters of Support from Scholars

From Dr. Ryan Morini, SPOHP Associate Program Director

Stephanie,

I definitely support the UF Libraries: Reviving the Samuel Proctor Oral History Program (SPOHP) Legacy Collections project. SPOHP includes a veritable wealth of material, but the fact is that much of it is difficult to find or to search through because of a lack of metadata. This is particularly an issue when interview collections serve as the primary indicator of subject matter (e.g. Alachua County collection), but do not contain additional searchable information noting, for instance, when a given narrator is an African American elder from the town of High Springs, or was a lifelong school teacher, etc. In a particularly dramatic example, we at SPOHP only recently learned that one of our Pinellas County interviews is with a gentile man who describes in detail his experiences living in Poland during the Nazi occupation. This is a powerful interview that was hidden in plain sight.

This project will also help to deprovincialize Florida history; the more that researchers are easily able to see thematic connections with other regions, the more readily the collections will be substantively utilized by researchers working outside of Florida or on topics that are not specific to Florida.

This project will be of great benefit and I am strongly in support.

Ryan Morini


From Julian Chambliss, Professor of History at Rollins College in Winter Park, Florida

Ms. Birch,

I am writing to express my support for the proposed project, "Reviving the Samuel Proctor Oral History Program (SPOHP) Legacy Collections." The Proctor collection represents a unique historical resource, and the possibility of enhancing its discoverability and usability through your project is exciting. As a scholar of the black experience, I realize the oral history within the collection provides scholars, students, and the public a valuable resource. I look forward to supporting your efforts.

Julian Chambliss

From Jeffrey Pufahl, Lecturer, UF Center for Arts in Medicine

Hi Stephanie!

This project is vital to researchers who are seeking access or searching for specific information in the archive. The current state of the archive makes it very difficult to search for specific topics (e.g., health information, location-specific or time-specific information, etc.). The addition of Descript will greatly enhance the speed and accuracy of digitizing old interviews as well as transcribing new interviews. These additions will greatly enhance the work that I do with SPOHP and I strongly recommend this project receive funding.

Jeffrey Pufahl
Lecturer, UF Center for Arts in Medicine


Reviving the Samuel Proctor Oral History Program (SPOHP) Legacy Collections
Responses to GMC Questions

Who will be responsible for and how will Quality Control occur for reviewing the accuracy of the transcriptions?

Xiaoli Ma will be responsible for quality control of the transcriptions and for ensuring their accuracy. During the project preparation stage in January, Xiaoli will take the lead in organizing tests of Descript, the transcription tool, with 10 UFDC audio materials: ideally, five materials with existing transcripts and five without. Stephanie and Laurie will participate and share the labor if their schedules allow. After the test, the results will be reviewed to see whether Descript provides transcripts of equal or better quality than the work previously conducted by human workers; whether the speakers' accents or other factors could cause poorer results; and whether Descript provides other valuable add-on services, for instance, subject term suggestions for the given materials. Meanwhile, the results will be compared to other products and services that are widely used by the cultural heritage community. The results and new discoveries will be shared and discussed across the whole team.

Xiaoli will then prepare OPS worker instructions based on what was learned and the feedback received. After the OPS worker is hired, Xiaoli will train the worker to make sure the tool is properly used. The worker will start with small batches of 5-10 items, so any improper use of the tool can be identified easily without affecting the quality of a bigger body of work. After the worker becomes familiar with every aspect of the tool, bigger batches of 10-50 items will be assigned and reviewed gradually.

What will be the role for Laurie Taylor in this project, especially vis-à-vis UFDC?

Laurie Taylor will guide the team with her vision.
She will provide the team with historical knowledge of the SPOHP collection and also lead the team to understand the collection's significance as an important resource for oral history and African American studies. Laurie gathered the collection data and did the initial analysis of the collection structure. The information gained from those data inspired this SOP project. She will continue working with the team as an adviser to make sure the team's work is in line with related events and activities on campus and within the digital humanities field. Her direct participation will shed light on future collaboration on metadata projects between UFDC and Digital Partnerships and Strategies, the department that Laurie chairs.
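The quality-control response above describes comparing Descript output with transcripts previously produced by human workers, but it does not name a metric. One common option for this kind of comparison is word error rate (WER): the number of word-level substitutions, insertions, and deletions needed to turn the machine transcript into the reference, divided by the reference length. The following is a minimal sketch under that assumption; the function name and the sample sentences are invented for illustration and are not taken from the project.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript is empty")

    # Levenshtein distance over words, keeping only one previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Made-up sample: human transcript versus an automated one.
human = "the bus boycott began in tallahassee"
machine = "the bus boycott begun in tallahassee"
print(f"WER: {word_error_rate(human, machine):.3f}")
```

A lower value means the automated transcript is closer to the human one. This simple version ignores punctuation handling and speaker labels, which a real review of oral history transcripts would likely need to account for.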


George A. Smathers Libraries
Strategic Opportunities Program
Awarded Grant Final Report Form (Spring Cycle)

Date: Jan. 30, 2019
PI: Xiaoli Ma (PI), Stephanie Birch (Co-PI)
Project Title: Enhancing the Legacy Digital Collections of the SPOHP for Improved User Access
Funds Requested: $4,954
Cost Share: $5,635
Total Funds Expended: $4,482.49
Funds Remaining: $471.51

Brief Description of Project:

Laurie Taylor, Senior Director for Library Technology and Digital Strategies, observed during her long-term collaboration with the Samuel Proctor Oral History Program (SPOHP) that users of UFDC (University of Florida Digital Collections) are not able to locate what they know is included in the digital collections due to inconsistent metadata. On top of this inconsistency is an overcomplicated aggregation structure where content is organized into multi-level hierarchies. To improve discoverability, the project team, composed of library staff from Digital Support Services (the team that now maintains UFDC) and Library West, as well as two OPS workers, focused on flattening the aggregation/collection structure, standardizing the use of metadata fields to achieve consistency across all records, and grouping content with consistently assigned subjects. A LibGuide was also designed to further facilitate searching and browsing of the SPOHP content. Last but not least, Descript, an online automated transcription assistance tool, was assessed to understand its potential as a replacement for the conventional method that uses foot pedals.


Results:

The New Aggregation Structure

In order to restructure the content of 59 multi-level aggregations into 20 two-level aggregations, the aggregation code combinations of 2,721 items were manually edited, one after another, to form the new structure by two OPS workers, Indica Mattson and Corinne Futch, as well as Xiaoli Ma, Metadata Librarian at DSS. Other DSS staff also helped finish this task at the final stage (detailed aggregation list attached as Appendix 1). Sixty-three aggregations were deleted and four were moved under new parent collections by Laura Perry, DSS manager (detailed aggregation list attached as Appendix 2). This is also a manual process at multiple UFDC locations.

This new structure is based on suggestions from SPOHP staff, who know how users want to retrieve the content. In other words, it is intended to be a user-centered structure. As of now, the last piece of this structure change lies with library IT: relocating the following aggregations.
  - Move Addiction Oral History Project to be under SPOHP Internship
  - Move Mississippi Delta Freedom, St. Augustine Civil Rights, and Fifth Avenue African American to be under Joel Buchanan African American Oral History Archive

It is worth mentioning that, though the aggregation structure has been changed, the original grouping was preserved. The names of all sub-aggregations are now included as below.

A Metadata Guideline

A Metadata Guideline, attached to this report as Appendix 3, was drafted. The use of each metadata field was defined with examples. This document was also closely followed when the project team reformatted the existing metadata fields, which include Title, Creator, and SPOHP Identifier. This guideline is now also in use by SPOHP staff to create consistent metadata for future content.

A New Set of Subjects

Across all SPOHP aggregations, subjects were compiled throughout the years, project by project. These legacy subjects have now been updated.
When selecting subjects, Stephanie Birch, African American Studies Librarian and Co-PI of this grant project, provided a good set of subjects that are commonly used by scholars. This effort builds a foundation for using commonly accepted subjects consistently across all SPOHP aggregations. The updating process replaced obsolete terms, merged spelling variations of the same concept, split long strings of subjects linked by hyphens, and removed vague and repetitive terms.
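The four kinds of subject updates described above (replacing obsolete terms, merging spelling variations, splitting hyphen-linked strings, and removing vague or repetitive terms) can be sketched as a small normalization routine. This is an illustration only, not the project's actual procedure, which the report describes as largely manual. The lookup tables, the sample terms, and the assumption that multi-concept strings are joined by a double hyphen are all hypothetical.

```python
import re
from typing import Dict, List, Set

# Hypothetical lookup: obsolete terms and spelling variants -> preferred term.
PREFERRED: Dict[str, str] = {
    "afro-americans": "African Americans",
    "african-americans": "African Americans",
}

# Hypothetical list of terms considered too vague to keep.
VAGUE: Set[str] = {"misc.", "general"}


def clean_subjects(subjects: List[str]) -> List[str]:
    """Split, normalize, filter, and de-duplicate a record's subject strings."""
    cleaned: List[str] = []
    for raw in subjects:
        # Assumption: multi-concept strings are joined with a double hyphen.
        for part in re.split(r"\s*--\s*", raw):
            term = part.strip()
            if not term:
                continue
            # Replace obsolete terms and merge spelling variations.
            term = PREFERRED.get(term.casefold(), term)
            # Drop vague terms and skip repeats.
            if term.casefold() in VAGUE or term in cleaned:
                continue
            cleaned.append(term)
    return cleaned


example = ["African Americans--History--Alabama", "Afro-Americans", "Misc."]
print(clean_subjects(example))
```

Keeping the replacement table separate from the splitting logic means a librarian can review and extend the list of preferred terms without touching the code that applies it.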


Subjects can be sorted alphabetically or by the number of items affiliated with them. In addition, subjects are hyperlinked. Clicking a subject launches a search with that subject as the search keyword. In other words, these subjects group the items [...] Joel Buchanan African American Oral History Archive, one aggregation of the SPOHP collection, holds 1033 items, which indicates [...] 287) items in order [...] limitation of doing this type of update, however, because the content of the same topics


now groups under the same subjects, it becomes clear where efforts should be focused to make sure that the subjects can reach their full power of grouping content.

This process also merged spelling variations of the same concept. For instance, now the 1939 [...] ar II, 1939 [...] As well, the same process split subjects, 21 in total, that hold multiple concepts into single ones so items that share the same topics in different contexts could be grouped [...] -History --Alabama -History -ome [...] multiple affiliated items can then be grouped and browsed both by the topic and the geographic information. This change enables users to discover the items in multiple ways. [...] to define. In total, this project replaced 8 subjects, split 21 subjects into multiple, merged 8 and [...]

A LibGuide

The Oral History @ UF LibGuide contains general library resources on oral history, as well as highlighted collection content from the Buchanan Archive digital collection. The LibGuide was created on April 30, 2019 and has since been viewed 240 times. The guide supplements the landing pages of the SPOHP digital collections, providing users with general information about collection and sub-collection content and organization. The guide also features themed pages highlighting interviews on the Tallahassee Bus Boycott and African American veteran service in US wars. These pages connect users directly to related oral history interviews and secondary library collection materials.

Descript

A total of 10 hours (Xiaoli, 3 hours; Indica, 7 hours) were spent evaluating Descript. Xiaoli and Indica independently used Descript to transcribe audio and video materials. They also selected the tested materials on their own. While Xiaoli assessed Descript as a new user with no experience of transcribing, Indica evaluated Descript as an


experienced transcriptionist who was well versed in the traditional foot pedal method. Together they discussed their observations. [...] anscription process intuitively by having most of the audio content appear as text on the screen while the audio replays. This automation largely reduces the stress created by switching between the multiple tasks of the traditional foot pedal method, which include typing the audio content and pressing the foot pedal to stop or play the audio recording. In the long run, after the initial learning curve is over, Descript should improve the efficiency of transcription.

Presentations

Stephanie presented this project at DLF 2019 in Tampa. Presentation is at [...] Xiaoli will present this project at the VRA 2020 conference in Baltimore. Both DLF and VRA 2020 are national conferences.

Lessons Learned:

A project is not an aggregation

One SPOHP project, one UFDC aggregation: that is, whenever a new oral history project was finished, a new UFDC aggregation was built to hold the products of that project. This has been the primary principle followed as SPOHP content has come to UFDC over the past 15 years. While it is straightforward for projects, it leads to a dysfunctional collection structure for end users. Content was dispersed in siloed aggregations with no shared subjects to build up the relationships between aggregations. At the same time, when end users come to UFDC to search across all collections, they care least about which project made an item available and more about what can be found under one topic. In other words, a gap exists between how end users expect to discover the content and how data contributors organize the content.

[...] cover broad areas where users would like to know more. Individual SPOHP oral history projects usually hold only a small number of interviews that touch upon a specific topic. This organization method ignored the fact that many projects share the same topic.
Individual aggregations only provide a fragment of what SPOHP can offer on a specific topic. On top of that, because aggregations were built individually as the project results came in,


most of the time, aggregations of the same topic were treated differently. Take Subjects for example: spelling variations of the sa

Though processing aggregations by project without considering how to relate them to others created the current findability and discoverability issues, grouping by project is still essential, especially to SPOHP staff, the data contributor. This grouping pulls together records created as original batches, helps trace the historical work around those batches, and enables batch updates. Therefore, when the aggregation structure was deconstructed, the original by-project grouping was preserved. At this moment, to realize this grouping, the subject field, which should be dedicated to keywords about the item, is temporarily used to hold the original project/aggregation names. This temporary use has already overburdened the field. Project names are usually long and cannot fully display on one row in the narrow left panel, which leads to long lists of subjects jammed together (layout shown in the above screenshot). This overuse also reveals that UFDC is missing a field dedicated to preserving project information and grouping project items. Project names should be stored and displayed in a dedicated field that is also hyperlinked, like subjects, to a search query that brings all items from the same project together.

The project team, with support from the DSS staff, spent hundreds of hours of manual work and tens of hours of batch work overhauling the collection/aggregation structure, not to mention the time and effort spent analyzing the situation and planning the changes. While this project proves that the collection/aggregation structure on UFDC can be changed, the amount of work is a warning that every decision on a new aggregation should be made with extreme caution to avoid later structural changes.
Subjects should be shared

Subjects should be shared across all UFDC aggregations so that similar items can be discovered together. Data contributors may easily lose sight of their overall content across aggregations and have little knowledge of similar content contributed by others, so DSS staff should take responsibility for helping data contributors pick commonly accepted subject terms and assign them consistently. Items of one aggregation can then be linked with items from other aggregations and with content from other data contributors. Only this collaboration can produce a UFDC whose content can be discovered as a whole, not only as individual aggregations.

SPOHP is not the aggregation for all oral history content


At a later stage of the project, it was noticed that the Publisher of many SPOHP content confirmed 13 out of 50 are not affiliated with SPOHP. A detailed list is attached as Appendix 5. Why and how these 13 publishers (113 items) made it into SPOHP is a lost history, but more or less, these publishers assumed that SPOHP is the home for all oral history, which is not the case. These non from SPOHP ones, which creates extra browsing and searching issues. all oral history content on UFDC. This information informs future UFDC interface design. To avoid future wrong assumptions, SPOHP staff agreed that all SPOHP content, when transcription; after created, the digital files should be sent to SPOHP staff, who will archive a copy of the files and then submit them to DSS. The metadata should also be sent to Digital Support Services to be reviewed and enhanced, if applicable, by metadata professionals.

Sobek Inefficiency: No Dedicated Tool for Subject

Sobek powers UFDC. Sobek includes a set of tools that build up UFDC, however it tremendously the analysis of the subject terms assigned across aggregations, which is crucial to understanding the scope of this grant project. It was possible to prepare a customized report by querying the Sobek database directly, but doing so takes a long time and a lot of effort to produce a complete report. In other words, no documentation of the database structure is available, nor are there existing queries that can be used for this purpose. Generating customized reports could itself be a standalone project.

Due to the above limitation, Xiaoli depended on the built-in subject function on UFDC. On UFDC, subjects are all hyperlinked with a query that pulls out all items assigned the same subject. This query, th import, download and report tool, can bring together METS XMLs of these items and download them. Then these XMLs can be batch updated in the XML editor Oxygen. all item list page ( page lists 3485 items ( ).In jects.
This failure blocks further batch XML updating steps.
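The kind of subject analysis described above could, in principle, be run outside Sobek on the downloaded METS files. The following is only a minimal sketch, not a tool used in this project. It assumes that each METS file carries its subjects as MODS subject elements in the standard MODS namespace; the function names and the normalization rule are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Standard MODS namespace; assumed to be how subjects are encoded in the METS files.
MODS_NS = "http://www.loc.gov/mods/v3"


def subject_terms(mets_xml):
    """Return one string per MODS <subject>, joining its parts with ' -- '."""
    root = ET.fromstring(mets_xml)
    terms = []
    for subject in root.iter("{%s}subject" % MODS_NS):
        parts = [child.text.strip() for child in subject
                 if child.text and child.text.strip()]
        if parts:
            terms.append(" -- ".join(parts))
    return terms


def variant_key(term):
    """Hypothetical normalization: lowercase, collapse punctuation and spacing."""
    return re.sub(r"[^a-z0-9]+", " ", term.lower()).strip()


def tally(documents):
    """Count subject use across documents and group likely spelling variants.

    Returns (counts, variants): counts maps each subject string to the number
    of times it occurs; variants maps a normalized key to the set of distinct
    spellings that share it (only keys with more than one spelling are kept).
    """
    counts = Counter()
    groups = defaultdict(set)
    for doc in documents:
        for term in subject_terms(doc):
            counts[term] += 1
            groups[variant_key(term)].add(term)
    variants = {key: terms for key, terms in groups.items() if len(terms) > 1}
    return counts, variants
```

To use it, read each downloaded XML file into a string and pass the list of strings to tally. The counts give a complete list of subjects, not limited to the top 100, and the variants dictionary points to terms that are candidates for merging. Any merge or split decision would still need human review.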


Besides queries that cannot group the targeted subjects, UFDC pages display only the top 100 subjects for each aggregation, ordered either by the number of items affiliated with them or alphabetically. This indicates that an unknown number of subjects, not removed and merged subjects, so the total number of Subjects should have changed. However, still only one hundred subjects show on the left panel for users. aggregation collection as mentioned above. To summarize, Sobek has no dedicated tools to analyze the use of subjects or to update subjects by batch.

Sobek Inefficiency: Builder Rejects

After being updated, XMLs need to be processed again by the Sobek builder to register the change. However, the builder rejects the updated XMLs when the XML Record Status is not This behavior, not documented, led to ingestion errors. Recently, the library IT has from this new solution.

Budget:

Expenses Categories            Cost
Personnel Budget               $4466.29
Other Operating Expense LTD    $16.20
Total                          $4482.49

Total actual costs including cost share: $15,752.49 ($5,635 cost share + $4,482.49 awarded funds expended)

Still to be completed:


Publisher cleanup

Updated Timeline:

Activity July Aug Sept Oct Nov Dec
Aggregation code manual updates x x
Descript Evaluation x x
Reporting progress to talks. x x x
Batch Subject Updates x x x
Presenting at DLF 2019 x
Final Report x