Title: Data Collection and Online Access to Map Collections: A discussion of technologies and methodologies discovered during the Sanborn® Map Digitization Project at the University of Florida.
Permanent Link: http://ufdc.ufl.edu/UF00095390/00001
 Material Information
Title: Data Collection and Online Access to Map Collections: A discussion of technologies and methodologies discovered during the Sanborn® Map Digitization Project at the University of Florida.
Physical Description: Report
Language: English
Creator: Sullivan, Mark V.
Publisher: UF Libraries
Place of Publication: Gainesville, FL
Publication Date: 2004
Copyright Date: 2004
 Record Information
Bibliographic ID: UF00095390
Volume ID: VID00001
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.















Data Collection and Online Access to Map Collections



A discussion of technologies and methodologies discovered during the
Sanborn Map Digitization Project at the University of Florida.























Mark V Sullivan
System Programmer
Digital Library Center
University of Florida Libraries




"Sanborn", "Sanborn Map", "Sanborn Map Company", and "Sanborn Fire Insurance Maps" are recognized
trademarks of the Sanborn Map Company, a subsidiary of Environmental Data Resources, Inc. (EDR). The
presentation of historic Sanborn Fire Insurance Company maps of Florida is in no way connected with either the
Sanborn Map Company or Environmental Data Resources, Inc.


© Copyright 2004, Mark Sullivan, University of Florida Digital Library Center.









Section 1: Database Design


The data for the Sanborn maps is stored in a Microsoft Access database. The database
contains only three tables, which hold information about the counties, the cities covered, and
the map sets themselves.

Counties Database Table

The first table contains all the counties in Florida. The current name, FIPS code assigned, and
any county symbols previously used are included for each county. Below is the schema for this
table, as well as a few rows as an example of the type of data stored.


Field         Type    Description
CountyName    Text    Name of this county
FIPS          Number  FIPS number associated with this county [Primary Key]
fk_StateKey   Number  Foreign key will reference the StateKey in the States table [Unused currently]
CountySymbol  Text    County symbol(s) for this county (pre-FIPS)

Figure 1a: Schema for the Counties table in the Maps Access Database


CountyName  FIPS   (remaining columns illegible in source)
Charlotte   12015

Figure 1b: An example of data from the Counties table in the Maps Access Database


Cities Database Table

The second table lists all the cities in Florida. Each city is associated with its county by the use
of the foreign key FIPS, which references the FIPS field in the Counties table. This table is
fairly comprehensive in our database, including more cities than were actually covered in the
map collection. However, this table only needs to hold the cities for which maps exist, or have
been digitized. Again, below is the schema for this table, as well as some rows to be used as
examples.














Field    Type        Description
CityKey  AutoNumber  Primary key for list of cities in Florida
City     Text        Name of the city
FIPS     Number      FIPS code is a foreign key which links this city to a county in the Counties table

Figure 2a: Schema for the Cities table in the Maps Access Database


CityKey  City             FIPS
1929     Fullers          12095
2653     Fullerville      12127
2161     Fussells Corner  12105
2477     Gabriella        12117
811      Gainesville      12001
2162     Gall             12105
1869     Galliver         12091
2163     Galloway         12105

Figure 2b: An example of data from the Cities table in the Maps Access Database


Map_Series Database Table

The final table contains all of the maps digitized in the project. The first three fields in
the table are identifiers for the electronic and physical resources. The fk_CityKey field
links each map to a city in the Cities table. The Title field remains from the original import
that created this table; however, the fk_CityKey field should always be used when determining
the city covered by a particular map. As before, the schema and some example rows appear below.


Field       Type    Description
BibID       Text    Bib ID for this map series
LTQF        Text    LTQF cataloging number (unique per each BibID)
LTUF        Text    LTUF cataloging number (unique per each BibID)
Title       Text    Title of this map series (compound data type)
MapYear     Number  The year of this map series
fk_CityKey  Number  Foreign key references the Cities table
Sheets      Number  Number of sheets in this map series

Figure 3a: Schema for the Map_Series table in the Maps Access Database











BibID       LTQF     LTUF         Title                               MapYear
UF70000001  AAA3486  [illegible]  Gainesville                         1884
UF70000002  AAA3639  AMJ1654      West Palm Beach, Palm Beach County  1920
UF70000003  AAA3487  ALR9865      Gainesville                         1887
UF70000004  AAA3640  AMB5120      Cocoa, Brevard County               1919
UF70000005  AAA3641  ALS7217      Orlando, Orange County              1919
UF70000006  AAA3642  AMH7511      Mulberry, Polk County               1919
UF70000007  AAA3488  ALR9864      Gainesville, Alachua Co.            1892
UF70000008  AAA3643  AMJ1217      Palm Beach, Palm Beach County       1919
UF70000009  AAA3644  AMJ1225      Plant City, Hillsborough County     1919

Figure 3b: An example of data from the Map Series table in the Maps Access Database


Database Relationships

The overall relationship between the three tables is shown in the diagram below. Each line
represents a primary key-foreign key relationship.


Figure 4: Relationship of tables in the Maps Access Database
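
As a rough illustration of the three-table design and its key relationships, the sketch below recreates the schema in SQLite through Python's standard sqlite3 module. The project itself used Microsoft Access, so the SQL column types, the BibID name for the first Map_Series field, the link from the 1887 Gainesville map to CityKey 811, and the helper names are all assumptions made for illustration. The sample values are taken from Figures 2b and 3b.

```python
import sqlite3

# Assumed SQLite rendering of the Access schema shown in Figures 1a, 2a and 3a.
SCHEMA = """
CREATE TABLE Counties (
    CountyName   TEXT,
    FIPS         INTEGER PRIMARY KEY,
    fk_StateKey  INTEGER,              -- unused currently
    CountySymbol TEXT
);
CREATE TABLE Cities (
    CityKey INTEGER PRIMARY KEY AUTOINCREMENT,
    City    TEXT,
    FIPS    INTEGER REFERENCES Counties(FIPS)
);
CREATE TABLE Map_Series (
    BibID      TEXT PRIMARY KEY,
    LTQF       TEXT,
    LTUF       TEXT,
    Title      TEXT,
    MapYear    INTEGER,
    fk_CityKey INTEGER REFERENCES Cities(CityKey),
    Sheets     INTEGER
);
"""


def build_demo_db():
    """Create an in-memory database holding one county, one city and one map series."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO Counties VALUES ('Alachua', 12001, NULL, NULL)")
    conn.execute("INSERT INTO Cities VALUES (811, 'Gainesville', 12001)")
    conn.execute(
        "INSERT INTO Map_Series VALUES "
        "('UF70000003', 'AAA3487', 'ALR9865', 'Gainesville', 1887, 811, NULL)"
    )
    conn.commit()
    return conn


def maps_for_county(conn, county_name):
    """Follow both foreign keys to list the map series that cover a county."""
    rows = conn.execute(
        """SELECT m.BibID, c.City, m.MapYear
           FROM Map_Series AS m
           JOIN Cities AS c ON c.CityKey = m.fk_CityKey
           JOIN Counties AS k ON k.FIPS = c.FIPS
           WHERE k.CountyName = ?
           ORDER BY m.MapYear""",
        (county_name,),
    )
    return rows.fetchall()
```

Calling maps_for_county(build_demo_db(), "Alachua") walks from the county through the city to the map series, which is the same path the year, city, and county searches rely on.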


Additional Considerations


We are currently completing the indexing of each of the maps in this project. This will result in
the addition of new tables in this database to store street information.

Additionally, this database will probably be transitioned to MS SQL at that point. The table
designs and data should remain the same during this transition.









Section 2: Architecture


Current Architecture

The original, raw TIFF image of each map is digitally archived on location at the University of
Florida [UF] library, as well as off-site by the Florida Center for Library Automation [FCLA].
After image capture, the raw image is digitally restored and a new, processed TIFF file is also
saved. It is from this digitally restored image that the image for the web will be generated.

We are currently using MrSID Geospatial Encoder from LizardTech to create the images for the
web. These images are loaded on a SID Server at FCLA. This server allows a web user to zoom
in on the image and view any section in detail, without being forced to install a plug-in. For each
individual map sheet, a persistent uniform resource locator [PURL] is generated and is pointed to
the image on the SID Server.

At the same time the SID files are created, JPEG thumbnails are created which are approximately
200 pixels wide. These thumbnails are created to allow the user to browse through all the sheets
for one map. A web server at UF hosts the JPEG thumbnail files.
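
The thumbnail sizing amounts to simple proportional scaling. The sketch below shows only that arithmetic; the function name and the example sheet dimensions are hypothetical, and the actual resizing in the project was done by imaging software that this report does not name.

```python
def thumbnail_size(width, height, target_width=200):
    """Return (width, height) scaled to target_width, preserving the aspect ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    # Scale the height by the same factor applied to the width.
    scaled_height = round(height * target_width / width)
    return target_width, max(1, scaled_height)
```

For a hypothetical scan of 4000 by 5000 pixels, thumbnail_size(4000, 5000) gives a thumbnail of 200 by 250 pixels.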

With this architecture in place, a user over the internet now has access to both the detailed SID
file and the JPEG thumbnail. Next, the search web pages must be created, which will be hosted
by the same web server at UF.

Future Considerations

FCLA is currently testing several new servers to replace its current SID server. The new
server will support JPEG2000, an open ISO standard, in place of the proprietary LizardTech
MrSID compression. Once the new server is in place at FCLA, we will create JPEG2000 files
from the digitally restored images. Images saved as JPEG2000 files appear to provide increased
image quality, while continuing to allow online viewing without the installation of any plug-ins.











Section 3: Web Design and Creation


Web Design

For reasons of simplicity and speed, static web pages were created; these pages do not have any
direct connection to the database. This allowed the pages to be written completely in HTML,
without the need for any client-side JavaScript. The choice to create static web pages does have
an inherent disadvantage: the user cannot perform more complicated, or Boolean, searches.
The searches available to a user are limited to year, city, and county.

Once a user selects the main type of search, a page with matching items is shown and the
user must choose a map set. When a single map set has been selected, all of the sheets that are
part of that map are displayed, with the 200 pixel-wide JPEG thumbnails. Again, all of the pages
are static, pre-generated HTML pages.

From the thumbnails page, the user can select a single sheet to view. When a thumbnail is
clicked, the PURL generated for that sheet redirects the web browser to the detailed SID
derivative on the SID server at FCLA.

Since all of the pages hosted by UF are static, more insight on their design can be gained
by viewing the HTML source from any web browser.


Web Creation

With so many static web pages needing to be constructed, it was obvious that some automation
would need to be used. A program, written in C#, generates the HTML pages. This application
uses the database described in Section 1. Additionally, it connects to the SID Server, via FTP,
and confirms the presence of each map sheet, before adding it to the HTML.
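
The generation step might look roughly like the following sketch. It is written in Python rather than C#, and it is not the project's code: the function name, the tuple layout, and the example.org addresses used with it are hypothetical. The FTP check is represented by a set of file names assumed to have been confirmed on the SID server beforehand.

```python
from html import escape


def build_map_set_page(title, year, sheets, available_files):
    """Return static HTML listing the thumbnails for one map set.

    sheets: list of (sheet_number, thumbnail_file, sid_file, purl) tuples.
    available_files: set of SID file names already confirmed on the server.
    """
    lines = [
        "<html>",
        f"<head><title>{escape(title)} ({year})</title></head>",
        "<body>",
        f"<h1>{escape(title)}, {year}</h1>",
    ]
    for number, thumb, sid_file, purl in sheets:
        if sid_file not in available_files:
            continue  # leave out sheets whose image is not on the server
        lines.append(
            f'<a href="{escape(purl, quote=True)}">'
            f'<img src="{escape(thumb, quote=True)}" width="200" '
            f'alt="Sheet {number}"></a>'
        )
    lines += ["</body>", "</html>"]
    return "\n".join(lines)
```

Each thumbnail links to the sheet's PURL, so the generated page needs no scripting and no database connection once it has been written to disk.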

The code for this application is available upon request, and could be customized for any
other institution that is generating similar web pages from a similar database.


Future Considerations

None of the currently available searching allows a user to search by street or by feature. Once
this data is collected, the web page will need to be modified to allow this detailed type of
searching. This will almost certainly require the use of database queries to provide result sets.
However, before this is confronted, all of this data must be collected and entered.













Section 4: Creating Indexes for the Maps


Index Preparation


We are currently creating advanced indexes for the maps. The index associates a certain stretch
of a street, or a feature, to a particular sheet in the map series. Once complete, a user will be able
to search for a street address and receive the particular sheets upon which that address appears.
Users will also be able to search by features, such as churches or businesses, and obtain the sheet
number, in the map series, which contains the feature.


This process of advanced indexing is simplified for the Sanborn maps because the larger
sets already have a printed index that appears on one of the first sheets. Cropping each complete
image creates an image of the textual indexes for each map. These indexes are then sent through
Optical Character Recognition engines. A student then checks the resulting text, first with the
assistance of another custom application and then manually.


The custom application assists students by providing an initial level of automated cleanup
and quality control. The application, written in C#, steps through the textual output from the
OCR engine one line at a time. Any long string of punctuation is replaced with a simple
delimiter ('|'). Some degree of intelligence is used to separate the street range information from
the sheet number. If Microsoft Word is present on the computer, each bit of text is spell-checked.
The student can change any words that do not pass the spell checking. Once completed, the
resulting text is saved for manual checking against the visual index.
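
The first two cleanup steps can be sketched as below. This is a Python illustration under stated assumptions, not the C# application itself: the threshold of three or more punctuation characters for a "long string", the exact set of punctuation characters, and the function name are guesses, and the spell-checking step is omitted.

```python
import re

# A run of three or more punctuation characters, plus any stray punctuation
# or spaces touching it, is treated as the leader between index columns.
PUNCT_RUN = re.compile(r"[\s.,:;_~-]*[.,:;_~-]{3,}[\s.,:;_~-]*")

# A trailing whitespace-separated integer is assumed to be the sheet number.
TRAILING_SHEET = re.compile(r"^(.*\S)\s+(\d+)$")


def clean_ocr_line(line):
    """Replace dot leaders with '|' and split off a trailing sheet number."""
    line = PUNCT_RUN.sub("|", line.strip())
    match = TRAILING_SHEET.match(line)
    if match:
        line = f"{match.group(1)}|{match.group(2)}"
    return line
```

A street line such as "Concert,,.............. 101-126 4" comes out as "Concert|101-126|4", which a student can then compare against the printed index.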


[Figure 5 image: printed index from a map sheet, with a STREETS section listing street names,
address ranges, and sheet numbers, and a SPECIALS section listing features such as hotels,
churches, and mills with their sheet numbers.]


Figure 5: Example index from a Sanborn Insurance Map that contains both streets and features.













The final text that has been checked and is ready to be parsed appears as below:

STREETS
Concert|101-126|4
Court|15-28|4
Factory|1-19|1
"|301-326|2
Garden|301-332|3
"|401-412|4



SPECIALS:
Alachua Hotel|3
Arlington Hotel|3
Baptist Church|3
Central City Ice Co|1
County Court House|3
Jail|2
Dutton & Co., Cotton Ginnery|1
East Florida Seminary|4



This text file is now ready to be parsed into the database. This parsing is completed with
another C# application which is just undergoing testing at this time.
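
A parser for the checked text could be structured along these lines. This is a hedged Python sketch, not the C# application mentioned above: the function name and the returned tuple layout are hypothetical, it assumes the fields are separated by a '|' delimiter, and it assumes that a ditto mark (") in the name column repeats the street from the previous line.

```python
def parse_index(text):
    """Parse checked index text into street and feature records.

    Returns (streets, features), where streets holds
    (name, start, end, sheet) tuples and features holds (name, sheet) tuples.
    Lines that do not fit either pattern are skipped.
    """
    streets, features = [], []
    section = None
    last_street = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        heading = line.rstrip(":").upper()
        if heading in ("STREETS", "SPECIALS"):
            section = heading
            continue
        fields = [field.strip() for field in line.split("|")]
        if section == "STREETS" and len(fields) == 3:
            name, number_range, sheet = fields
            if name == '"':  # ditto mark: assumed to repeat the previous street
                name = last_street
            last_street = name
            start, _, end = number_range.partition("-")
            streets.append((name, int(start), int(end or start), int(sheet)))
        elif section == "SPECIALS" and len(fields) == 2:
            features.append((fields[0], int(fields[1])))
    return streets, features
```

The resulting tuples map directly onto rows in the street and feature tables described in the next subsection.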



Database Changes

To accommodate the new information, new tables must be added to the database. In all, five
new tables are added to the database to accommodate the addition of features and streets.

Streets pose a particular challenge since the names of streets have changed fairly often in
Florida. Initially, it might seem the identifier for the streets should be some combination of the
street name and the map series. In this way, University Ave from Gainesville's 1884 map would
be distinct from Gainesville 1911's University Ave. This is a good example because in this
case, they are very different roads. The older road ran north-south and the later road runs east-
west. However, identification of a street as a combination of the map series and street name
makes linking of streets with different names and between different maps more difficult.

With additional resources it would be desirable to obtain a list of historic names for
streets. In this way, a name authority could be created for each street, chronicling the historic
name changes. How to use this information is somewhat up for debate, however. If a user
searches for University Avenue in Gainesville, multiple different roads will appear with multiple
different names. A user will be surprised to see NE 2nd Ave, Mechanic, and Union all listed in
two separate groupings, for the two historic roads.















Streets table
Field                Type        Description
CurrStreetName       Text        Most recently known name for the street
fk_CityKey           Number      Foreign key references the city of this street

Street Instances table
Field                Type        Description
[illegible]          [illegible] [illegible]
[illegible]          [illegible] Foreign key references the map series
[illegible]          [illegible] Name of this street, as it occurs on the map

Street ranges table
Field                Type        Description
RangeID              AutoNumber  Primary key references this range for this street
fk_StreetInstanceID  Number      Foreign key references the instance of the street for which this range or note applies
StartRange           Number      First street number in the range indicated
EndRange             Number      Last street number in the range indicated
StreetNote           Text        Note indicated (e.g. 'between Masonic and Mechanic') that does not fit into street numbers
Sheet                Number      Sheet number where this street segment appears in the map series

Figure 6a: Schema for tables added to Maps database to include advanced street indexing.



The database design shown here might seem unduly complicated, and until the history of
streets is researched, data will be duplicated between the Streets and Street Instances tables.
However, this design lends itself to the incorporation of historic street names and changes over
time, and it will make the changes needed to incorporate street histories elementary.


Figure 6b: Sample data in the new tables added to the Maps database for advanced street indexing.
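
To show how an address search could use these tables, the sketch below builds them in SQLite through Python's sqlite3 module and looks up the sheets for a street number. Several names are assumptions: StreetID, StreetInstanceID, fk_StreetID, and the StreetInstances and StreetRanges table names are not legible in the schema above, and pairing the Factory street ranges with map series UF70000003 is purely illustrative.

```python
import sqlite3


def build_street_index():
    """Create the three street tables in memory and load two sample ranges."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Streets (
            StreetID       INTEGER PRIMARY KEY,  -- hypothetical column name
            CurrStreetName TEXT,
            fk_CityKey     INTEGER
        );
        CREATE TABLE StreetInstances (
            StreetInstanceID INTEGER PRIMARY KEY,  -- hypothetical column name
            fk_StreetID      INTEGER REFERENCES Streets(StreetID),
            fk_BibID         TEXT,
            StreetName       TEXT
        );
        CREATE TABLE StreetRanges (
            RangeID             INTEGER PRIMARY KEY,
            fk_StreetInstanceID INTEGER REFERENCES StreetInstances(StreetInstanceID),
            StartRange          INTEGER,
            EndRange            INTEGER,
            StreetNote          TEXT,
            Sheet               INTEGER
        );
    """)
    conn.execute("INSERT INTO Streets VALUES (1, 'Factory', 811)")
    conn.execute("INSERT INTO StreetInstances VALUES (1, 1, 'UF70000003', 'Factory')")
    conn.executemany(
        "INSERT INTO StreetRanges VALUES (?, 1, ?, ?, NULL, ?)",
        [(1, 1, 19, 1), (2, 301, 326, 2)],
    )
    conn.commit()
    return conn


def sheets_for_address(conn, bib_id, street_name, number):
    """Return the sheet numbers whose range contains the given street number."""
    rows = conn.execute(
        """SELECT DISTINCT r.Sheet
           FROM StreetRanges AS r
           JOIN StreetInstances AS i
             ON i.StreetInstanceID = r.fk_StreetInstanceID
           WHERE i.fk_BibID = ? AND i.StreetName = ?
             AND ? BETWEEN r.StartRange AND r.EndRange
           ORDER BY r.Sheet""",
        (bib_id, street_name, number),
    )
    return [row[0] for row in rows]
```

Because the search goes through the street instance rather than the current street name, the same query keeps working once historic name changes are recorded in the Streets table.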












The design modifications required to accommodate features are a bit simpler. Features do
not seem to change names as often as streets, and when they do change names, they can easily be
considered different features. Also, features are usually present in one location, not a range of
locations.


Features table
Field         Type         Description
[illegible]   [illegible]  Name of the feature being indexed
[illegible]   [illegible]  Type of use this feature has
[illegible]   [illegible]  Foreign key references the city for this feature

Feature Appearances table
Field         Type         Description
AppearanceID  AutoNumber   Primary key references this appearance of a feature
fk_FeatureID  Number       Foreign key references the feature for which this appearance applies
fk_BibID      Text         Foreign key references the map series upon which this feature appears
Sheet         Number       Sheet number where this feature appears in the map series


Figure 7a: Schema for tables added to Maps database to include advanced feature indexing.


AppearanceID  fk_FeatureID  fk_BibID    Sheet
[illegible]   2             UF70000003  3
4             3             UF70000003  3
5             4             UF70000003  1
6             5             UF70000003  3
7             6             UF70000003  2
[illegible]   7             UF70000003  1
9             8             UF70000003  4


Figure 7b: Sample data in the new tables added to the Maps database for advanced feature indexing.











The overall relationship between the tables in the database appears in the figure below.


Figure 8: Relationship of tables in the Maps Access Database after adding tables for advanced indexing.


Website Changes

Changes will need to be made to the website to allow users to search via this new information.
These changes will include both the layout and appearance of the web pages, as well as changing
the navigation around the page. We are currently determining the types of searches we need to
allow users to perform, and the type of responses we should provide. The pages will be entirely
rewritten to be more tightly bound to the database, which will also be transitioned to MS SQL
from Access.











Section 5: Geo-Referencing and Geo-Indexing

Map Geo-Referencing

Geo-referencing a map means determining the spatial footprint of the map in an established
coordinate system. A map is geo-referenced by finding landmarks in the historic map that have
known locations. Each of these landmarks is referred to as a control point. Generally, the more
control points that can be determined, the better the geo-referencing. Additionally, street maps
can be compared to modern geo-referenced aerial photographs. Historic maps often will need to
be slightly distorted to compensate for errors made during drawing. Once a map is geo-referenced,
you can determine
latitude and longitude for any point or object on the map.
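
One simple way to turn control points into a pixel-to-coordinate mapping is a least-squares affine fit, sketched below in plain Python. This is an illustrative assumption, not the method used in the project: GIS software normally performs this step, an affine model cannot express the local distortion mentioned above, and all function names and coordinates here are hypothetical.

```python
def solve_linear(matrix, vector):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(vector)
    rows = [list(matrix[i]) + [vector[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        if abs(rows[pivot][col]) < 1e-12:
            raise ValueError("control points are collinear or too few")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            for c in range(col, n + 1):
                rows[r][c] -= factor * rows[col][c]
    solution = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(rows[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (rows[i][n] - known) / rows[i][i]
    return solution


def fit_affine(control_points):
    """Fit lon = a*x + b*y + c and lat = d*x + e*y + f by least squares.

    control_points: list of ((x, y), (lon, lat)) pairs, at least three,
    where (x, y) is a pixel position on the scanned map.
    """
    if len(control_points) < 3:
        raise ValueError("need at least three control points")
    design = [(x, y, 1.0) for (x, y), _ in control_points]
    # Normal equations: (A^T A) p = A^T t, solved once per output coordinate.
    normal = [[sum(row[i] * row[j] for row in design) for j in range(3)]
              for i in range(3)]
    coeffs = []
    for k in range(2):  # 0 = longitude, 1 = latitude
        targets = [world[k] for _, world in control_points]
        rhs = [sum(row[i] * t for row, t in zip(design, targets))
               for i in range(3)]
        coeffs.append(solve_linear(normal, rhs))
    return coeffs


def pixel_to_world(coeffs, x, y):
    """Apply the fitted transform to one pixel position."""
    (a, b, c), (d, e, f) = coeffs
    return a * x + b * y + c, d * x + e * y + f
```

With more than three control points the fit averages out small placement errors, which matches the observation that more control points generally give better geo-referencing.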

At this time, we are not systematically geo-referencing the Sanborn maps. However, we
are collecting sets of geo-referenced maps, as they are completed for use in other projects or by
other institutions.


Map Geo-Indexing

Once a map is geo-referenced, an index can be created for each building or object on the
map. The geo-index is created by placing a point over each building on a map and manually
entering information about that building, such as the address and listed use. Once the complete
index is created, a user can be shown a particular building on a map. In this way, a researcher can
be taken to an exact location on a map.


Figure 9: Block from Gainesville 1884 map with geo-indexing points displayed for each building.













    Use Desc                      Street Num     Location D           Bldg ID
70  Dry Goods / Dentist / Photo   116 Union      n E. Main            G1884_71
71  Ware House                    116 1/2 Union  n E. Main            G1884_72
72  Grocery                       117 Union      n E. Main            G1884_73
73  Cigar Factory & Roper Hall /  119 Union      c E. Main            G1884_74
74  Shed                          117 1/2        center of block      G1884_75
75  Shed                          309 E. Main    n Union              G1884_76
76  Shed                          308 E. Main    n Union              G1884_77
77  Carpenter                     306 E. Main    c Masonic            G1884_78
78  unknown                       18 Masonic     n E. Main            G1884_79
79  D                             17 Masonic     n W. Main            G1884_80
80  Oliver House                  15 Masonic     c W. Main            G1884_81
81  Office                        214 W. Main    bet Masonic & Union  G1884_82
82  Cobbler                       215 W. Main    n Union              G1884_83
83  Saloon                        217 W. Main    c Union              G1884_84

Figure 10: Table of attributes for geo-indexed buildings from a Gainesville 1884 map.
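
A geo-index record pairs a point on the map with the attributes entered for that building. The sketch below models this in Python and shows a simple keyword lookup; the class and function names are hypothetical, and the x and y values are placeholder coordinates, since Figure 10 does not list the point locations. The attribute values come from Figure 10.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildingPoint:
    bldg_id: str
    use: str
    address: str
    x: float  # placeholder map coordinate of the indexing point
    y: float  # placeholder map coordinate of the indexing point


def find_buildings(points, keyword):
    """Return the points whose use or address contains the keyword."""
    needle = keyword.lower()
    return [p for p in points
            if needle in p.use.lower() or needle in p.address.lower()]


# Three rows from Figure 10, with made-up coordinates.
POINTS = [
    BuildingPoint("G1884_73", "Grocery", "117 Union", 10.0, 20.0),
    BuildingPoint("G1884_83", "Cobbler", "215 W. Main", 30.0, 40.0),
    BuildingPoint("G1884_84", "Saloon", "217 W. Main", 32.0, 40.0),
]
```

A search such as find_buildings(POINTS, "saloon") returns the matching record, whose coordinates can then be used to position the map viewer on that building.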



We are currently geo-indexing maps of Gainesville, Tampa, and Key West for the
Ephemeral Cities project. This allows us to collect information about a single building over a
range of twenty years. Under the Ephemeral Cities grant, we will compile a database of the
businesses and individuals that worked and lived in each building from 1890 to 1910 in the three
cities. In addition to the database, a user will be able to browse to related images and textual
materials. Once all maps are geo-referenced and geo-indexed, another web interface will be
developed to allow a user to navigate through the maps geographically.

