Page 1 of 5
September 14, 2010 DLC-IT Meeting
Randall Renner (meeting Maureen at 2:30pm)
1. UFDC Server Hardware
a. UFDC Web Server
1. configured for ufdcweb3
2. submitting a change to support logon through ufdc.uflib.ufl.edu
Shibboleth is currently working on the ufdcweb3 url. Tonight Mark will
configure Shibboleth to work with cleaner ufdc url, create the
metadata file for submission to campus, and then restore the current
configuration so Shibboleth continues to work as it does now.
Winston suggested we use the prod URN for the new URL, rather than
the test URN. (Great suggestion). Mark will do this tonight at the
ii. Email issue resolved
iii. Mark will also check ufdcweb1 and confirm that it can be taken offline.
b. Caching Server
i. Having the AppFabric service running on the same server as the database, including
full-text indexing, causes throttling of caching service as SQL memory usage expands
beyond the limits of the server
1. First we will expand memory on the caching server to 6GB
2. Otherwise, we could move the database to the worker server (lib-ufdc-ws), which
uses little memory
Per Logan, the caching server also has only 1 CPU by default. Per Logan,
there is no cost to add additional CPUs, so 1 CPU will be added tonight
during the short downtime. This may help prevent problems.
We will not add any particular constraints on SQL at this time, but
Winston and Mark will do some research on constraints for full-text
indexing, etc. We will examine behavior after raising the memory.
Per Logan, can't do a hot-add for RAM. Logan will schedule a short
(5-10 minute) downtime for 11pm tonight for the additional RAM to be
added. This will bring the caching server to 6GB of RAM.
c. OCR servers
i. Prime Recognition
1. OCR2 is working correctly but with possible automation issue
One issue was that the OCR automator was running in two instances. Automation
appears to be okay, although network issues sometimes cause things to
2. OCR4 on old hardware with a hardware key occasionally hangs requiring Prime
restart or full reboot
OCR4 is on dying hardware, has a hardware license key, and is out of
maintenance. Based on prior estimates, updating this will be a minimum
of several thousand dollars if the Kirtas-ABBYY-ALTO OCR can't be
moved into production in its place.
1. Working but needs automation
2. OCR-ing all the image files, not just TIFFs
3. Randall and Maureen working on specifications for automation
4. Will is looking at the maintenance agreement, since a year has almost expired since
Randall and Maureen are meeting today at 2:30pm to create
specifications document. This will be submitted in 1 week and used to
plan the necessary work.
Will did not attend the meeting, so there are no updates on his
progress on the maintenance agreement. Per Laurie, the last estimate
showed the cost at over three thousand dollars per year. Per Laurie,
this is another impact that heightens the priority, along with OCR4,
for having this working in production.
Logan was out sick, but will soon begin researching the 2 suggested solution options
e. Tivoli Archiving Server
i. Number of files archived, and separate key for each BibID, cause GUI to have problems
building the dictionary. Currently takes about 15 minutes to open.
ii. Need to move to command-line retrieval, as we have done command-line submittal
Mark will add this to his work queue. Once implemented, this will streamline
work for both IT and the DLC. This will directly support IT in removing the
need to do this work. This will directly benefit DLC in supporting fast turn-
around for the daily patron requests for files.
f. DLC Production SAN backup
i. Update backup documentation
Logan will send update.
g. Automation / PreQC server GROVER 31960
This will be closed and a new ticket will be opened and assigned to Mark. The new
ticket is to use the worker server (lib-ufdc-ws) to automate the processing for
PreQC and to make the processing for self-submitted items happen automatically
(splitting/normalizing PDFs into TIFFs, creating derivatives).
Update: Laurie noted and closed the old ticket.
Update: Laurie submitted the new ticket, 42994.
h. Hardware IDs
i. Old = 10948, 10949, 10950, 10951
ii. Need HWIDs for newer virtual servers
3. lib-ufdc-gs (Linux Greenstone box)
4. lib-ufdc-ws (worker)
5. lib-ufdc-arc (archiving server)
Logan will create HWIDs for all the UFDC virtual servers
i. UFDC server documentation / disaster recovery
i. Test server
ii. Documentation sign off
Logan is reviewing documentation with Amy and Cynthia and will be ready to
test on Friday at 2pm.
Friday at 2pm, testing will cover installing the VMs: C drives and configs. Not
testing Celerra because CNS manages it.
Friday meeting: Winston will be out.
Friday meeting: Laurie stopped by Will's office after this meeting and he will
also be out.
j. Web server
Laurie asked about the web server move, since it may impact DLC with people the
DLC supports asking about any moves.
Logan will notify the Libraries prior to the web server move. The web server will be
unchanged in the move and is already being mirrored via Replistor. The move will be
scheduled for the next slow weekend, hopefully with the switchover happening in
k. GIS server
Logan created a fresh server with fresh install of ArcGIS. Joe is currently
configuring and testing. NT13 is still up and running and will be until Joe is satisfied
with new server. At that time, Joe will tell Logan to repoint NT13 to the new
2. Software Development
Only the items in italics are not done. The rest are complete.
a. Current Work
i. SobekCM/UFDC Web Application
1. Simplified all URLs to move much of the query string data into the URL and
utilize URL rewriting to keep the original URL in the user's browser
2. Moved the search stop words into a database table for easy modifications in
3. Formalized URL Portals concept
a. Now pulled from the database
b. Matched by the base URL for the request
c. Includes default aggregation, default interface, and abbreviation for
the library (e.g., UFDC, dLOC)
d. Removed all previous hard coding for dLOC/UFDC/etc.
4. Modified search queries to force SQL to not retain the execution plan for the
actual full-text searching subqueries
5. Online Templates
a. Corrected main thumbnail element
b. Modified templates to include a help button after each element during
edit and submission
6. Added ability to edit the IP Restrictions through the admin form
7. Added support to export to Excel using new software library
a. Software library needs to be strongly-named assembly
Winston recommended Tom Bielicke for this.
Mark checked with Tom, but Tom is working on CSV-only and
not this. Mark will strongly-name the assembly for web
publishing and will provide Winston documentation on this fairly
8. Add ability to browse by metadata lists (i.e., list of all publishers in an item
9. Update admin user editing for new user groups ability
10. Add ability to delete/edit the tags for other users if you have correct rights
11. Add internal header including searching by bib/vid/identifier/etc.
12. Troubleshoot lack of graceful failover with caching failure
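Illustrative note on item 3 (URL Portals): a minimal sketch of matching a request
to a portal by its base URL, assuming the portal rows have already been read from
the database. The Portal fields, the example host names, and the function name
match_portal are all hypothetical and are not taken from the SobekCM code. The
sketch is in Python for brevity.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Portal:
    base_url: str             # host plus optional path prefix (hypothetical field)
    abbreviation: str         # library abbreviation, e.g. "UFDC" or "dLOC"
    default_aggregation: str  # hypothetical value
    default_interface: str    # hypothetical value


# Stand-in for rows pulled from the database; hosts are placeholders.
PORTALS = [
    Portal("ufdc.example.edu", "UFDC", "all", "ufdc"),
    Portal("dloc.example.org", "dLOC", "dloc", "dloc"),
]


def match_portal(request_url, portals, fallback):
    """Return the portal whose base URL is the longest prefix of the request."""
    parts = urlsplit(request_url)
    target = ((parts.hostname or "") + parts.path).lower().rstrip("/")
    best, best_len = None, -1
    for portal in portals:
        base = portal.base_url.lower().rstrip("/")
        # Match the base exactly or as a whole leading path segment.
        if target == base or target.startswith(base + "/"):
            if len(base) > best_len:
                best, best_len = portal, len(base)
    return best if best is not None else fallback
```

With this arrangement, adding a portal means adding a row rather than changing
code, which is the point of removing the hard coding noted in item 3d.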
ii. UFDC Manager
1. Update the manager for the new URLs
2. Add an ability to pull URLs
Add the pre-qc functionality to run with the UFDC building/loading,
part of ticket 42994.
iii. UFDC Builder
1. Incorporate new code from UFDC web server and navigation changes
2. Update Greenstone building routines
iv. METS Editor (for internal and state-wide use)
1. New Elements (Frequency, Other Titles, Publication Place, Publication Status,
FCLA flags)
2. DMDID in the struct map is blank
3. View data as METS and MARC without having to save first
4. Automatically add TIFFS, JPEGs, and JPEG2000s (and Text files) from the
directory to the page images, not just the TIFFs
5. Adds importing from EAD as option
6. When a user adds a file that is in a different directory, move it into the digital
7. Show a progress bar when creating checksums as this takes a period of time
8. Convert templates to template schema
9. Brand new setup file, rather than old product code also change install
10. View with files to the right and divisions on the left to allow files to be dragged
from the directory to the division
11. Include list of files added in the METS but not in the page images as an
additional tree view under the page image tree view in the structure map. This
will often include PDFs
12. Automatic page numbering
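Illustrative note on items 4 and 7: a minimal sketch of picking up all page files
in a directory (not just the TIFFs) and reporting progress while checksums are
created. The extension list, the function names, and the choice of MD5 are
assumptions for illustration only; the METS Editor's actual implementation is not
described in these minutes. The sketch is in Python for brevity.

```python
import hashlib
from pathlib import Path

# Assumed extensions for TIFF, JPEG, JPEG2000, and text files.
PAGE_FILE_EXTENSIONS = {".tif", ".tiff", ".jpg", ".jpeg", ".jp2", ".txt"}


def find_page_files(directory):
    """Return page files in the directory, sorted by path, matched by extension."""
    return sorted(
        path
        for path in Path(directory).iterdir()
        if path.is_file() and path.suffix.lower() in PAGE_FILE_EXTENSIONS
    )


def checksum_files(paths, on_progress=None, algorithm="md5"):
    """Checksum each file, calling on_progress(done, total) after each one."""
    results = {}
    total = len(paths)
    for done, path in enumerate(paths, start=1):
        digest = hashlib.new(algorithm)
        with open(path, "rb") as handle:
            # Read in 1 MB chunks so large images are not loaded whole.
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        results[path.name] = digest.hexdigest()
        if on_progress is not None:
            on_progress(done, total)
    return results
```

A progress bar would hook into on_progress, for example by setting the bar's
value to done out of total after each file.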
v. Mini-grants and digital resource creation
1. EAC/EAD mini-grant, display within UFDC
2. Historic Newspaper Catalog, need to finish loading data
4. Aerials on hold pending coordinates from Joe
b. Future Pending
i. Tracking -> UFDC
1. Work will begin on the merging of tracking and ufdc shortly by building the
requirements and discussing the system with all portions of the DLC workflow
process (necessary to remove tracking as part of work on Fedora, and
necessary for damage control with institutional memory loss from retirement)
Mark stressed that the importance of this work cannot be overstated.
In doing this work, we will also be working with Nelda to
incorporate her spreadsheets, separate disposition database, and workflow
into tracking, so we need to start this with plenty of time before her
Will ask for Winston's help next week to make a fresh copy of the UFDC
database to look at the immediate issues that will be discovered when
importing all of the tracking data into UFDC
ii. Update dLOC Toolkit
iii. Fedora (no work yet)
iv. Google map use for entering coordinate information
v. Complete image server/ALTO-aware