Citation
Design of an extensible, object-oriented GIS framework with reactive capability

Material Information

Title:
Design of an extensible, object-oriented GIS framework with reactive capability
Creator:
Arctur, David K
Publication Date:
Language:
English
Physical Description:
ix, 108 leaves : ill. ; 29 cm.

Subjects

Subjects / Keywords:
Computer programming ( jstor )
Database design ( jstor )
Databases ( jstor )
Expert systems ( jstor )
Geographic information systems ( jstor )
Information attributes ( jstor )
Libraries ( jstor )
Metadata ( jstor )
Software ( jstor )
Topology ( jstor )
Dissertations, Academic -- Urban and Regional Planning -- UF
Urban and Regional Planning thesis, Ph. D
Genre:
bibliography ( marcgt )
non-fiction ( marcgt )

Notes

Thesis:
Thesis (Ph. D.)--University of Florida, 1996.
Bibliography:
Includes bibliographical references (leaves 98-107).
General Note:
Typescript.
General Note:
Vita.
Statement of Responsibility:
by David K. Arctur.

Record Information

Source Institution:
University of Florida
Rights Management:
Copyright David K. Arctur. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Resource Identifier:
022859148 ( ALEPH )
34944454 ( OCLC )
















DESIGN OF AN EXTENSIBLE, OBJECT-ORIENTED GIS FRAMEWORK WITH REACTIVE CAPABILITY








By

DAVID K. ARCTUR











A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

1996
































Copyright 1996

by

David K. Arctur







ACKNOWLEDGMENTS


This work seems like the result of my whole life of living, working, growing, hurting and learning. There is no way to list all those in that vast web of involvement, though my parents' and friends' ready encouragement and support through difficult times come quickly to mind.

This research was funded by the U.S. Naval Research Laboratory and the Defense Mapping Agency. I am very grateful to Kevin Shaw, Principal Investigator at NRL, as well as to Dr. Maria Cobb and Miyi Chung at NRL, for their faith, support and guidance throughout this project. Dr. John Alexander's creativity, insights and guidance at numerous stages of the project were invaluable, as was Dr. Paul Zwick's understanding of GIS concepts and applications. Dr. Sharma Chakravarthy and Eman Anwar provided considerable information and assistance with regard to active databases and the rule-base framework. Dr. Joe Wilson's perspective and understanding of object-oriented concepts helped guide portions of the OVPF design. Dr. Earl Starnes motivated me to find ways to apply these findings in urban and regional planning applications. This research may not have started without Bob Williams' initial work on OFM. Dr. Max Egenhofer, Dr. John Herring, Dr. David Abel, and Dr. Daniel Karnes all provided timely and useful reference materials and guidance. And I cannot leave out Adobe FrameMaker book publishing software, without which this thesis would have been much more difficult to create and assemble.

Without a question, I am grateful again to John for luring me out of Silicon Valley, California, to embark on this adventure.












TABLE OF CONTENTS

page

ACKNOWLEDGMENTS .......... iii

LIST OF FIGURES .......... vi

ABSTRACT .......... viii

INTRODUCTION .......... 1

Goal and Objectives of the Research .......... 7
Managing Complex Interdependencies Among Geographic Features .......... 8
Supporting Very Large Geographic Databases .......... 10
Supporting Reactive Capability in the GIS .......... 10
Vector Product Format Database Structure .......... 12
Historical Technological Developments .......... 18
Large-Scale Urban Models .......... 19
Object-Oriented Programming Systems .......... 20
Relational and Object-Oriented Database Management Systems .......... 22
Knowledge-Based Systems .......... 24
Geographical Information Systems .......... 30
Object-Oriented GIS .......... 32
Expert Systems with GIS .......... 35
Importance and Contributions of This Thesis .......... 37

MATERIALS AND METHODS .......... 40

Object-Oriented Software Development Tools .......... 40
Smalltalk Programming Environment .......... 40
Source Code Configuration Management Facility .......... 41
Object-Oriented Database Management System .......... 42
Computer Platform .......... 43
Approaches Used in Building OVPF Components .......... 44
Introducing Some Object-Oriented Terms .......... 44
Conversion of Source Data from Vector Product Format to Smalltalk Objects .......... 46
Representation of Metadata Objects .......... 48
Representation of Geo-Feature Objects .......... 51
Representation of Graphical Primitives and Topological Relationships .......... 55
Design of an Object-Oriented Spatial Index .......... 60
Organization of Object Webs in ODBMS Repository .......... 61
Design of a Rule-Base Framework to Support Geographic Feature Editing .......... 65
Event Objects .......... 66
Rule Objects .......... 67
Event Detection Mechanism .......... 68

OVPF Application Overview .......... 70
Transformation of Relational Vector Product Format Data to an Object Web .......... 71
Displaying Spatial Features .......... 71
Migrating Object Webs to ODBMS .......... 78
Applying the Rule-Base Framework for Feature Editing .......... 79

DISCUSSION .......... 85

Implications of Research for Meeting Initial Objectives .......... 85
Supporting Complex Interdependencies Among Geographic Features .......... 85
Supporting Very Large Databases .......... 87
Supporting Potential for Expert System Applications .......... 90
Limitations of the Present Application .......... 90
Feature Class Definitions .......... 90
Spatial Index .......... 91
GIS Functionality .......... 92
Rule-Based Framework .......... 93
Smalltalk Language .......... 94
Future Directions .......... 95
Summary .......... 96

REFERENCES .......... 98

BIOGRAPHICAL SKETCH .......... 108


































LIST OF FIGURES


Figure page

1 Libraries for Chesapeake Bay Area (DNC01) Database .......... 13

2 DNC01 Database Directory Structure (partial) .......... 14

3 Winged-Edge Topology Components .......... 16

4 Development Hardware Configuration .......... 43

5 Development Software Configuration .......... 43

6 Vector Product Format Data Types .......... 49

7 Example of Feature Table with Header and Records .......... 50

8 VPFTableHeader and VPFSchemaColumn Class Definitions and Example Instance Values .......... 50

9 Steps to Create Metadata Web .......... 52

10 Representation of Geo-Features in OVPF .......... 56

11 Object Definitional Hierarchy for Representing VPF Graphical Primitives with Spatial Topology .......... 59

12 Principal Classes and Behavior for Quadtree Spatial Index .......... 62

13 Event Class Hierarchy .......... 66

14 Structure of a Rule Object .......... 67

15 VPFFeatureConstructor Hierarchy .......... 69

16 Principal OVPF Components .......... 70

17 Key Definitions and Relationships for OVPF Database Classes .......... 72

18 Sample OVPF Data for a DNC Coastline Feature .......... 74

19 Transfer of Spatial Features from VPF to OVPF .......... 76

20 OVPF Map Display of Multiple Coverages in Norfolk Approach Library .......... 78

21 Persistency and Linkages of Principal OVPF Components .......... 79

22 Example Rule and Event Objects .......... 80

23 Key Components of Event Detection Framework .......... 81

24 Flow of Control and Behavior For Rule-Event Example .......... 82























































Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

DESIGN OF AN EXTENSIBLE, OBJECT-ORIENTED GIS FRAMEWORK WITH REACTIVE CAPABILITY

By

David K. Arctur

May 1996


Chairman: John F. Alexander
Major Department: Urban and Regional Planning


This thesis reports on the design of an "analysis-enabling framework" for more productive use of geographic information systems (GIS) by planners and decision makers, through the integration of object-oriented programming and database management with knowledge-based systems and active database technology. This extends the information management system to support its own customization by those who use it, for reflective adaptation of the GIS framework to applications beyond the domains or scope anticipated by its creators.

The Smalltalk programming environment is used, along with an object-oriented database management system (ODBMS), running on a Unix computer platform. This system works with the Defense Mapping Agency's Vector Product Format (VPF) digital geographic databases. The Smalltalk application, called OVPF, converts source data from a georelational database structure to Smalltalk objects. Full spatial topology for point, line and area graphics is supported using the winged-edge algorithm, as well as many-to-many relationships between geographic features and graphical primitives.










OVPF incorporates a quadtree spatial index implemented in Smalltalk. The quadtree object itself is placed in the ODBMS repository, and serves all queries to the geo-feature objects. A rule-based framework and event detection mechanism provide a reactive or "triggering" capability for enforcing application-based data integrity and interdependency constraints on requests to update geo-features. This effectively transforms the ODBMS into an "active database."

OVPF provides a graphical user interface (GUI) for direct interaction to view and edit geo-feature objects. Spatial topology is maintained during feature editing, and OVPF encapsulates database operations within atomic transactions. Additions and changes can be made to the rule base at run-time that take effect immediately.

The importance and contribution of this research is in the use of Smalltalk's unique object-oriented data modeling capabilities for a GIS framework, in combination with a rule-based active repository for spatial and nonspatial data. This approach supports complex interdependencies among geographic features in potentially very large databases. It provides an extensible metadata framework, and the potential for supporting expert system applications.































INTRODUCTION


The management and analysis of geographic information is becoming increasingly important to many sectors of our industrial and technological society. Local, state and federal governments use geographic data to study and forecast population and demographic growth patterns, as well as to develop comprehensive plans for urban infrastructure and development (Budic 1994). Utility companies use geographic information to plan the building and expansion of electrical, gas, water and communications facilities. Private industry and commercial businesses use geographic information to study, plan and monitor their marketing, production and distribution strategies. The military uses geographic information for strategic and tactical mission planning, mission rehearsals, logistics, navigation, and many other applications. The uses and interdependencies of geographic information are growing rapidly, as are the sources of this information (Maguire et al. 1991; Laurini and Thompson 1992).

A number of approaches for geographic information systems (GIS) have been developed to support access to, and management of, such data. A GIS encompasses, in part, the integrated computer hardware and software required to store, retrieve and update both spatial and nonspatial attributes associated with a database of geographic features. Each GIS also is a function of the context in which it is used, and embodies a set of principles and procedures for the collection, analysis, display and plotting of geographic data.






All these functions, however, can be grouped into two main categories of responsibilities for a GIS: information management and information analysis. The substantive goal of GIS technology is to support spatial analysis (Goodchild 1987) and synthesis for the purpose of understanding and predicting patterns of behavior among human and other natural communities (Wellar 1989; Wellar et al. 1994). But the sheer volume of data, and the complexity of interactions among geographic entities, have necessitated considerable effort to develop the means for simply handling the data, which has often seemed to overshadow the efforts for analyzing it (Ding and Fotheringham 1992; Lober 1995).¹

However, rapidly accelerating advances in computer software and hardware may be changing this picture. The evolution of software technologies, specifically object-oriented (OO) programming, knowledge-based (KB) systems, and active databases, together with the exponentially increasing power and storage capacity of affordable computers, has led to the development of analysis-enabling frameworks for more productive use of the information by planners and decision makers. These enabling technologies could be viewed as an extension of an information management system, but an important distinction is that information management systems to a great extent are designed to serve a broad range of applications from business to engineering, while knowledge-based enabling technologies are designed around the requirements of a particular user community.


1. This division of effort into information management and analysis categories parallels the
division of effort in urban planning itself into procedural and substantive matters (see Faludi
1973). While substantive issues generally seem the most important, lack of attention to the
procedural issues can preclude substantive progress. So it is in using information.







This is a report of an enabling-technology research project in which the core functionality of an object-oriented geographic information system (OOGIS) was implemented using the Smalltalk programming language, with a commercial object-oriented database management system (ODBMS) for the repository. The OOGIS incorporates a knowledge-base framework which effectively transforms the ODBMS into an "active database" (these terms will be defined shortly). This is not the first OOGIS framework to be developed, nor is it the first KBGIS or active database to be developed. It does, however, seem to be a unique application of OO principles and techniques (thanks to Smalltalk) to the design of a KB framework for GIS that is simple, "light-weight" and extensible.

This development was the result of research funded by the U. S. Naval Research Laboratory (NRL) and the U. S. Defense Mapping Agency (DMA), to study alternative ways of representing and managing Vector Product Format (VPF) digital geographic databases. VPF is a specification developed by DMA for a family of database products (DMA 1993a) that has become part of the DIGEST international standard formats for representing geographical data (DMA 1994a). VPF is a "georelational" specification which uses a relational data framework (Date 1995) for storing both spatial and nonspatial attribute information about the geographic features represented. A number of database products representing refinements to the VPF standard have been developed, such as the Digital Nautical Chart (DMA 1993b), Vector Smart Map (DMA 1993c), Urban Vector Smart Map (DMA 1994b), World Vector Shoreline (DMA 1995), and others. Each of these VPF derivatives has different purposes, and thus different sets of geographic features and attributes; in some cases even different metadata (database schema) structures.





4

The purpose behind the development of VPF was two-fold: (1) to provide a public specification for exchange of geographic data across computer platforms and GIS software products, and (2) to support direct viewing capability without the need for proprietary GIS software (DMA 1993a). In its first purpose, VPF occupies a similar role to the Spatial Data Transfer Standard (SDTS) developed jointly by the U. S. Census Bureau and the U. S. Geological Survey (see National Institute of Standards and Technology 1992; Lazar 1992; Fegeas et al. 1992; Davis et al. 1992; Milne et al. 1993). SDTS now defines the format used for Census Bureau TIGER data (Klosterman and Lew 1992).

VPF and its derivative specifications have evolved over several years, and are just now maturing to the point that large-scale production and distribution of geographic databases on CD-ROM are taking place. One of the problems these database products present, however, is that the feature data is very difficult to edit due to its inherent complexity. That is, once a VPF database has been created (usually by exporting coverage data from a commercial GIS product), it is very difficult to modify the feature data while maintaining referential integrity of the many linkages between features, attributes, graphical primitives, and the various indexes and join tables. The single greatest source of complexity is the attempt to represent the locational and topological aspects of geographic features using the relational database model. Commercial GIS products have typically dealt with this representation through the use of proprietary, non-relational data structures and techniques. In following standard rules for normalization (Date 1995), the VPF specification has possibly reached the practical limits of what can be represented and maintained using relational database technology.






In its search for ways to address these issues, the Digital Mapping Program (DMAP) at NRL sought the help of the GeoPlan Center in the Department of Urban and Regional Planning, and sponsored the current research project to develop an in-house (DMA-owned) viewer/editor capable of displaying and modifying VPF source data. NRL's decision to sponsor the project here was based on the results of prior work at GeoPlan on an OOGIS product to support automated mapping and facilities management applications (commonly abbreviated AM/FM) for large, regional electric utility companies. This project, called Object GPG (Alexander et al. 1991) and later OFM (for Objective Facilities Management), had developed a core set of object classes and methods in Smalltalk that could be used without license restrictions as a starting point for NRL's VPF viewer/editor. I was in a position to lead development of the VPF viewer/editor, having worked on OFM for the previous year.

As the DNC specification (DMA 1993b) is one of the most complex, it was chosen by NRL to be studied first. By the end of the first project year, I had completed a viewer/editor prototype in Smalltalk that was capable of displaying and editing DNC feature data. This prototype was called ODNC, with the results appearing in a refereed conference the following year (Arctur et al. 1995c). At this stage, the project received much more attention and support from DMA. Our equipment was upgraded from aging workstations to Sun SPARCstation 20s. Two more programmers were added to the project at NRL, and one to two additional graduate assistants worked on the project at GeoPlan. By the end of the second year of the project, we had integrated World Vector Shoreline, Vector Smart Map Level 0, and Urban Vector Smart Map database definitions and feature data into the object-oriented framework (now called Object-Vector Product Format, or OVPF), and had incorporated the use of ObjectStore by Object Design, Inc., as an object-oriented database repository for VPF source data. We also had implemented full spatial topology (missing from the first-year prototype; see Chung et al. 1995); a novel splay-tree indexing mechanism to improve spatial query performance (Cobb et al. 1995b); and a rule-base framework for enforcing logical constraints on feature updates (Arctur et al. 1995d).

Although the OVPF data model and Smalltalk application were developed to address specific needs of DMA, the techniques and lessons learned also seem well suited to GIS applications for urban and environmental analysis and planning, as well as facilities management applications in the various public utility industries such as electrical, gas, water and communications. My main interest is to show the implications and significance of this OO-KB-GIS framework for various GIS users. However, much of the thesis is necessarily devoted to explaining the historical context and the development of the framework itself. The remainder of this Introduction is in four main parts: (1) a statement of the specific technical goal and objectives for this research; (2) a more detailed look at the VPF data structures; (3) a review of the technological context and state of the art from which the current research proceeds; and (4) a synthesis of the preceding review to show how the various technologies intersect in this project.

Following the Introduction, the Materials and Methods section describes specific tasks which were addressed in working toward the final product, essentially "recipes" for constructing the key components of an OO-KB-GIS framework. As the OVPF program is rather large (about 400 classes and 4500 methods, with over a megabyte of source code), it is impractical to describe its complete structure here. Therefore I will focus on the key constituent frameworks within the overall design that most directly contribute to the stated objectives. The Results section then provides an illustration of the integrated design in action, to show how the finished program meets each of the stated objectives. The Discussion section concludes this work by addressing the implications of the findings, as well as the limitations and directions for future work with this approach.


Goal and Objectives of the Research


This work began in response to the perception that traditional GIS data modeling and analysis approaches are becoming inadequate to support the complex and interdependent nature of geographic data. Data models once thought to be fairly flexible now are seen to impose significant constraints on the way the data can be used. This is becoming an increasing problem as our acquisition of huge amounts of detailed data accelerates.

In a similar way, the GIS tools themselves can be difficult to apply to a given problem as a result of the often brittle way in which they are designed. The GIS software used for most governmental and military applications requires extensive training and continuous practice to develop and maintain proficiency. Then, even with proficiency it can be difficult, time consuming and frustrating to apply to a given problem. It seems that even one of the most sophisticated GIS software products, Arc/Info by Environmental Systems Research Institute (ESRI), has fairly rigid data structures that work well for single-theme maps, but do not support the kind of interdependencies that exist, for example, among geographic features in facilities management applications for electrical and other utilities industries. It seems inevitable to me that, with the rapidly changing environmental and societal conditions facing us, people will continue to find or create perfectly reasonable GIS applications for which existing software systems and tools are not easily adapted.

The goal of this research project is to design and demonstrate ways of representing and working with the complexity of both the geographic information and the GIS tools that permit flexible adaptation to changes in requirements for the data models over time.

To this end, the objectives that the GIS tools need to meet include

* representing potentially detailed and complex interdependencies among geographic features in a database, in a user-extensible way;

* supporting very large geographic databases, potentially distributed over a network of computers; and

* providing the capability to incorporate expert-system rules and behavior.

Each of these objectives will be discussed briefly below.


Managing Complex Interdependencies Among Geographic Features


Each application domain comes with its own set of rules. In planning an electrical service extension for a new urban subdivision, the electric utility engineer has to take into account a myriad of prerequisites and corequisites in order to place any one electrical system facility, such as a high-voltage circuit breaker. Even to plan a simple facility such as an overhead capacitor, the engineer must first determine that a power pole is in place, and that such a pole is rated for the capacitor, and that the capacitor is placed on the proper circuit, and so on. Telecommunications equipment and circuits can involve even more complex interconnections than electrical utility services. Thus, in addition to supporting the drawing of maps of electrical and communications circuits, it is increasingly important that the GIS tools help the engineer build the maps completely and correctly. Essentially, a means of incorporating a facility engineer's books of policies and practices into the GIS tools would be a tremendous aid for such a user. An additional benefit the GIS can provide is to help manage the inventory and accounting of installed facilities.
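The overhead-capacitor prerequisites mentioned above can be sketched as a small validation routine. This is only an illustration of the idea of encoding an engineer's placement rules; it is written in Python rather than Smalltalk, and the class names, fields, and messages (`Pole`, `CapacitorRequest`, `check_capacitor_placement`) are hypothetical and are not part of OVPF or OFM.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Pole:
    """A power pole; the fields are hypothetical, chosen for illustration."""
    pole_id: str
    rated_for_capacitor: bool
    circuit_id: str


@dataclass
class CapacitorRequest:
    """A request to place an overhead capacitor on a given circuit."""
    circuit_id: str
    pole: Optional[Pole]  # None means no pole exists at the chosen location


def check_capacitor_placement(request: CapacitorRequest) -> List[str]:
    """Return the violated prerequisites; an empty list means placement is allowed."""
    problems: List[str] = []
    if request.pole is None:
        # The remaining checks are meaningless without a pole.
        problems.append("no power pole in place")
        return problems
    if not request.pole.rated_for_capacitor:
        problems.append("pole is not rated for a capacitor")
    if request.pole.circuit_id != request.circuit_id:
        problems.append("capacitor is not on the proper circuit")
    return problems
```

Returning the full list of violations, rather than stopping at the first one, lets an editing tool report everything the engineer must correct in a single pass.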

This could also be said for a quite different application domain such as land use and building codes enforcement in a city or county government jurisdiction. For example, when a developer wishes to build or modify a residential or commercial establishment, many detailed conditions must be met, such as the number of parking spaces required based on square footage of the buildings; setbacks and easements based on proximity to roads or power lines; and so on. While these rules are generally mastered reasonably quickly by those responsible for their enforcement, this is less true in older and denser urban areas. Furthermore, many rules are open to interpretation, and are not always applied uniformly but can be applied strictly or loosely depending on an official's preference. In addition, turnover among code enforcement officers is very high in many offices, resulting in frequent retraining (Heikkila and Blewett 1992).

Most of the existing software used in map production and GIS is not well suited for adaptation to handling a rule base for describing complex interdependencies. Presently, there is no way to represent application-domain-specific dependency rules among geographic features with, for example, AutoCAD by AutoDesk, an inexpensive and popular technical drawing software product which is often used in place of a GIS. With sophisticated GIS software such as Arc/Info and Intergraph, this is still not easily done. One approach which works in very limited situations is to assign an "impedance" (resistance to flow) between two proximate geographic features, but this does not capture enough of the semantics of, say, an electrical power network to be useful.


Supporting Very Large Geographic Databases


A single geographic database can vary from a few megabytes to several gigabytes and even terabytes. The volume of data to be generated by remote-sensing satellites will reach a level of terabytes per day in a few more years. When attempting to perform queries and analysis with multiple databases in combination, it may not be practical for all this data to reside on a single host computer or even on a single local network. The GIS framework needs to support access to an arbitrarily large collection of data that may be distributed across an entire wide-area network.


Supporting Reactive Capability in the GIS


As the complexity of interdependencies and size of databases increase, it will become steadily more important to find ways for the GIS to assist the analyst and the database administrator in maintaining data consistency and integrity. This could be in the form of data integrity constraint support, as well as support for more complex decision processes typical of expert system applications. (Several examples of these will be presented shortly.)

To minimize the technical overhead of such assistance on the user, it is helpful and in some cases necessary for the GIS to have a flexible, consistent, and automatic way of reacting during attempts to access and modify geographic feature data. It would also be important for the reactive capability to be based on conditions anywhere in the complete database, and not just on the portions of the database currently loaded into the computer's active memory. This is essentially what is meant here by the term active database: such a database is designed so that application-defined events can trigger rule-based actions based on conditions in any part of the disk-resident database (Widom and Ceri 1996, Chapter 1).

For example, suppose a homeowner wished to get a building permit to expand her house on her own property. Suppose also that her property contained species of plants that cause her property to be considered wetlands, which might very well preclude her right to build any further on her property, on legal grounds. The municipal building codes enforcement officer needs to be aware of all such rules, as well as the particulars with respect to each applicant's property, to apply the rules in an objective and accurate way. Assuming pertinent, accurate data exists in the GIS database on which a correct decision could be based, the GIS could notify the codes official immediately of all such pertinent conditions at the earliest opportunity, thus precluding potentially costly re-evaluation at a later date due to obscure conditions that were not noticed earlier.
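The reactive behavior described here, in which an application-defined event triggers a rule whose condition is evaluated against the database, can be illustrated with a minimal event-condition-action sketch. It is written in Python for brevity and is not the OVPF rule-base design; the names `Event`, `Rule`, `RuleBase`, and `signal`, and the dictionary standing in for the disk-resident repository, are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List


@dataclass
class Event:
    """An application-defined event, e.g. a building permit request."""
    name: str
    payload: Dict[str, Any]


@dataclass
class Rule:
    """An event-condition-action rule; all names here are illustrative."""
    event_name: str
    condition: Callable[[Event, Dict[str, Any]], bool]
    action: Callable[[Event, Dict[str, Any]], str]


class RuleBase:
    """Holds the rules and evaluates them against the whole database."""

    def __init__(self, database: Dict[str, Any]) -> None:
        self.database = database  # stand-in for the complete repository
        self.rules: List[Rule] = []

    def add_rule(self, rule: Rule) -> None:
        # Rules added at run time take effect on the next signalled event.
        self.rules.append(rule)

    def signal(self, event: Event) -> List[str]:
        """Run the action of every rule that matches the event and whose condition holds."""
        results = []
        for rule in self.rules:
            if rule.event_name == event.name and rule.condition(event, self.database):
                results.append(rule.action(event, self.database))
        return results


# Hypothetical data: parcel P-17 is classified as wetlands, P-18 is not.
database = {"parcels": {"P-17": {"wetlands": True}, "P-18": {"wetlands": False}}}
rule_base = RuleBase(database)
rule_base.add_rule(Rule(
    event_name="permit_requested",
    condition=lambda e, db: db["parcels"][e.payload["parcel"]]["wetlands"],
    action=lambda e, db: "notify official: parcel %s is wetlands" % e.payload["parcel"],
))
notices = rule_base.signal(Event("permit_requested", {"parcel": "P-17"}))
```

In this sketch the condition consults the database rather than only the event payload, which mirrors the requirement that reactions depend on conditions anywhere in the stored data.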

Zoning for land use is becoming an increasingly complex concern for city and county planners. Increasing land scarcity and values raise the costs and potential for litigation as competition and undesirable interactions among different zones in close proximity become more common. Expert systems that can take into account the various constituents' preferences in anticipation of future problems could be a very useful tool for planners. Again, the GIS needs some form of reactive capability to support this.

This concludes the discussion of goals and objectives for my research. The next section presents an overview of the VPF database structure, which serves as the data source for the proof of concept described in this thesis to address the above objectives. Following the VPF description is my review of the technological evolution which has led to the development of the principal concepts, tools and techniques which are integrated in this project.


Vector Product Format Database Structure


For purposes of discussion, the Digital Nautical Chart (DNC) is considered representative of the general structure of all Vector Product Format (VPF) products, and serves throughout this thesis as the concrete example of VPF to illustrate the definitions and linkages among features, attributes, and primitive graphical elements. One of the more complex VPF products, DNC was specifically designed to support GIS applications such as marine navigation. As with other VPF products, DNC geographic data is organized for distribution on CD-ROM disks where each disk or disk set contains the database of geographic information for a particular region. For example (see Figure 1), the Chesapeake Bay area surrounding Norfolk, Virginia has been coded as database DNC01.² This database is organized using the hierarchical directory structure shown in Figure 2. The name of the topmost directory is also the name of the database. The DNC01 directory contains two files: the Database Header Table (DHT), which provides general information about the database (source, date of creation, revision level, etc.), and the Library Attribute Table (LAT), which provides the boundaries of each library in terms of decimal degrees of latitude and longitude. As defined in the VPF specification and illustrated in Figure 1, a library defines a geographic boundary and scale, where a larger scale implies a closer-in view and a smaller scale implies a further-out view. Thus, the Norfolk Approach (A0108280) library has a smaller scale and presumably lesser accuracy and precision of data than the Norfolk Harbor (H0108280) library. A given database will have one or more library directories; Figure 2, for example, shows portions of two of the library directories, A0108280 and H0108280.

2. Note: Sans serif typeface is used, e.g., DNC01, throughout this thesis to represent actual filenames or parts of filenames of database components on the CD-ROM, as well as to represent Smalltalk programming code.





[Figure 1 image not reproduced. Labeled libraries: A0108170 (Ocean City Approach), H0108280 (Norfolk Harbor), A0108280 (Norfolk Approach), GEN01 (General), COA01 (Coastal).]

Figure 1. Libraries for Chesapeake Bay Area (DNC01) Database

A library subdirectory is further divided into coverages, each of which contains the data for logically- and spatially-organized groups of geographic features. For example, the Cultural Landmarks coverage (CUL) includes buildings, power lines, streets, railroads and other feature classes. The Inland Waterways coverage (IWY) includes features such as canals, lakes, rivers and dams.


DNC01/ (Norfolk, Virginia harbor area map)
    DHT (Database Header Table)
    LAT (Library Attribute Table)
    A0108280/ (Norfolk Approach Library) [unexpanded directory]
    H0108280/ (Norfolk Harbor Library)
        LHT (Library Header Table)
        GRT (Geographic Reference Table)
        CAT (Coverage Attribute Table)
        IWY/ (Inland Waterways Coverage) [unexpanded directory]
        CUL/ (Cultural Landmarks Coverage)
            FCA (Feature Class Attribute Table)
            FCS (Feature Class Schema Table)
            INT.VDT (Integer Attribute Value Descriptions)
            CHAR.VDT (Character Attribute Value Descriptions)
            EDG.FIT (Edge Feature Index Table)
            END.FIT (Entity Node Feature Index Table)
            FAC.FIT (Face Feature Index Table)
            BUILDNGP.PFT (Building Points Feature Table)
            POWERL.LFT (Power Lines Feature Table)
            BUILDNGA.AFT (Building Areas Feature Table)
            GJPG4545/ (Spatial Tile Subregion) [unexpanded directory]
            GJPH3000/ (Spatial Tile Subregion)
                END (Entity Node Primitive Table)
                CND (Connected Node Primitive Table)
                EDG (Edge Primitive Table)
                EBR (Edge Bounding Rectangle Table)
                EDX (Edge Primitive Index Table)
                FAC (Face Primitive Table)
                FBR (Face Bounding Rectangle Table)
                RNG (Ring Table)

Annotations in the figure: FCA, FCS, INT.VDT and CHAR.VDT apply to all features in the coverage; the FIT files apply to all features by type; the feature tables (PFT, LFT, AFT) define attributes' values for each feature. Names ending in "/" are directories; all other entries are files.

Figure 2. DNC01 Database Directory Structure (partial). Source: (Arctur 1995c)

Within a given coverage directory, the geographic feature data is divided into two main groups of files: those that describe feature attributes, and those that describe feature locations. The files describing feature attributes (for example building type, road type, accuracy level and so on) are stored in the coverage directory. The files describing feature locations are stored in tile subdirectories of the coverage directory, where a tile corresponds to a rectangular subregion within the library's boundary. Tile size is a function of the library's scale. For example, tiles are 15 minutes (0.25 degrees) of latitude or longitude on each side for Harbor libraries; 30 minutes (0.5 degrees) on each side for Approach libraries; and 3 degrees on each side for Coastal and General libraries.
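The relationship between library type and tile size can be illustrated with a short sketch. Python is used here for brevity; the function name `tile_bounds` and the assumption that tiles are aligned to whole multiples of the tile size measured from 0 degrees are illustrative only and are not taken from the VPF or DNC specifications.

```python
import math

# Tile edge length in degrees for each library type, as given in the text.
TILE_SIZE_DEG = {
    "Harbor": 0.25,    # 15 minutes
    "Approach": 0.5,   # 30 minutes
    "Coastal": 3.0,
    "General": 3.0,
}


def tile_bounds(library_type, lon, lat):
    """Return (west, south, east, north) of the tile containing a point.

    Assumption (not from the specification): tile edges fall on whole
    multiples of the tile size, counted from 0 degrees.
    """
    size = TILE_SIZE_DEG[library_type]
    west = math.floor(lon / size) * size
    south = math.floor(lat / size) * size
    return (west, south, west + size, south + size)
```

Under these assumptions, a point near Norfolk at longitude -76.3, latitude 36.9 falls in a quarter-degree tile in a Harbor library and in a three-degree tile in a Coastal library.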

As shown in Figure 2, the files describing feature attributes are further grouped according to their level of generality. The files for Feature Class Attributes (FCA), Integer Value Description Table (INT.VDT), and Character Value Description Table (CHAR.VDT) contain descriptive information concerning all feature attributes. The Feature Class Schema (FCS) file contains table-join relationships for many-to-many relationships that may exist between feature tables, associated notes and other tables (notes tables are omitted from Figure 2 for simplicity). The feature-specific attribute value detail is stored in the Building Points Feature Table (BUILDNGP.PFT), Power Lines Feature Table (POWERL.LFT), and other such feature tables. The most important join tables are the Entity Node Feature Index Table (END.FIT), Edge Feature Index Table (EDG.FIT), Face Feature Index Table (FAC.FIT) and Text Feature Index Table (TXT.FIT). The Feature Index Tables (FIT files) are provided to relate each record of the Feature Tables (PFT, LFT, AFT, TFT files) to their associated graphic primitives in one or more of the tile subdirectories. Other join tables and index files may also be employed, as defined in the VPF and derivative product specifications.

A major aspect of geographic data and the VPF specification that complicates its relational structure is spatial topology. Any single geographic feature (such as a river) might consist of multiple line segments, called graphical primitives. A given line segment, in turn, might be a part of more than one spatial feature (such as part of the river and an adjacent property boundary). The topology (adjacency and contiguity) properties of VPF features are stored and managed at the graphical primitive level, within the tile subdirectories. VPF specifies a "winged-edge topology" model (see Figure 3) to provide "line network and face topology, and also to maintain seamless coverages across a physical partition of tiles" (DMA 1993a, Appendix B, p. 105).



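[Figure 3 image not reproduced. Labeled components: an edge with its Start Node and End Node, Left Face and Right Face, and Left Edge and Right Edge; legend symbols for Connected Node and Entity Node.]

Figure 3. Winged-Edge Topology Components. Source: DMA 1993a, p. 106.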


In terms of the database files, the geographic coordinate data is organized into Entity Node (END), Connected Node (CND), Edge or polyline (EDG), Face or polygon (FAC), and Text (TXT) files. Entity Node records have a foreign key to their containing face primitive record; Connected Node records have a foreign key to their starting edge primitive record; and Edge records have foreign keys to their start node, end node, left face, right face, left edge and right edge primitive records. Face primitive records include a foreign key to the Ring (RNG) table, which indicates the starting edge primitive for each face. The winged-edge topology algorithm (DMA 1993a, Appendix B, pp. 108-111) describes the procedure by which a face primitive is assembled by tracing its constituent edge primitives.
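The way the edge foreign keys allow a face boundary to be traced can be sketched as follows. This is a simplified illustration in Python, not the algorithm published in DMA 1993a: it assumes that an edge's right edge is the next edge around its right face and its left edge is the next edge around its left face, and it ignores dangling edges that have the same face on both sides. The class and function names and the triangle data are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Edge:
    id: int
    start_node: int
    end_node: int
    left_face: int
    right_face: int
    left_edge: int    # assumed: next edge when walking around left_face
    right_edge: int   # assumed: next edge when walking around right_face


def trace_ring(edges, face_id, start_edge_id):
    """Collect the ids of the edges bounding face_id, in traversal order."""
    ring = []
    current = start_edge_id
    while True:
        edge = edges[current]
        ring.append(edge.id)
        if edge.right_face == face_id:
            current = edge.right_edge
        elif edge.left_face == face_id:
            current = edge.left_edge
        else:
            raise ValueError(f"edge {edge.id} does not bound face {face_id}")
        if current == start_edge_id:
            return ring
        if len(ring) >= len(edges):
            raise ValueError("ring does not close")


# Hypothetical triangle: face 1 is inside, face 0 is the surrounding face.
triangle = {
    1: Edge(1, 1, 2, left_face=0, right_face=1, left_edge=3, right_edge=2),
    2: Edge(2, 2, 3, left_face=0, right_face=1, left_edge=1, right_edge=3),
    3: Edge(3, 3, 1, left_face=0, right_face=1, left_edge=2, right_edge=1),
}
```

Tracing face 1 from edge 1 visits the edges in forward order, while tracing the surrounding face 0 from the same edge visits them in the opposite order. In VPF the starting edge for each ring would come from the Ring (RNG) table.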

Text primitives consist of a textual label and a shape line that describes the location and path along which the text label is to be displayed. Text features are not topological structures, but simple cartographic elements for identifying certain features at an arbitrary location, such as the name "Chesapeake Bay."

Edge and Text primitives have variable-length records, as they may consist of an arbitrary number of locational points. To facilitate faster access to these primitives, VPF specifies additional index files, the Edge Index (EDX) and Text Index (TXX) respectively, to be stored in each tile subdirectory. The direct byte offset for each record of the Edge or Text primitive file is stored in the associated Edge Index or Text Index file, sorted by the primary key in the tile-level Edge or Text primitive file.
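The benefit of a byte-offset index for variable-length records can be shown with a small sketch. The record layout below (an integer id, a point count, then coordinate pairs) is a hypothetical one chosen for illustration and is not the EDG or EDX layout defined by VPF; the index is held as an in-memory dictionary rather than a separate file.

```python
import io
import struct


def write_edges(edges):
    """Serialize variable-length edge records and build a byte-offset index.

    Hypothetical layout per record: id (int32), point count (int32),
    then that many (x, y) pairs of float64 values.
    """
    data = io.BytesIO()
    index = {}
    for edge_id, points in sorted(edges.items()):
        index[edge_id] = data.tell()          # offset where this record starts
        data.write(struct.pack("<ii", edge_id, len(points)))
        for x, y in points:
            data.write(struct.pack("<dd", x, y))
    return data.getvalue(), index


def read_edge(data, index, edge_id):
    """Seek directly to one record instead of scanning all earlier records."""
    stream = io.BytesIO(data)
    stream.seek(index[edge_id])
    stored_id, count = struct.unpack("<ii", stream.read(8))
    points = [struct.unpack("<dd", stream.read(16)) for _ in range(count)]
    return stored_id, points
```

Because each record may hold a different number of points, the position of record n cannot be computed from n alone; the index replaces a sequential scan with a single seek.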

See "Representation of Graphical Primitives and Topological Relationships" starting on page 55 of this thesis, for more details on the implementation of topology in OVPF. See (Chung et al. 1995) for issues and techniques concerning maintenance of topology during geo-feature editing in OVPF.

There is still another layer of data required to represent VPF features, which is a spatial index for each coverage. VPF specifies an adaptive binary tree framework for managing spatial indexes of point, edge and face primitives (DMA 1993a, Appendix F). Spatial tree cells' keys are stored in additional index files for association with their contained features for use in spatial queries and display. The OVPF prototype viewer currently uses a more efficient quadtree spatial data manager (after Samet 1994) instead of the adaptive binary tree system. See "Design of an Object-Oriented Spatial Index" starting on page 60 of this thesis for more details on the quadtree implementation in OVPF.
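For readers unfamiliar with quadtrees, the general idea can be sketched as below. This is a generic point quadtree written in Python for illustration; it is not the OVPF spatial data manager, and the class name, the `capacity` and `max_depth` parameters, and the half-open cell boundaries are all assumptions of this sketch.

```python
class QuadTree:
    """Point quadtree: a cell splits into four quadrants when it fills up."""

    def __init__(self, west, south, east, north,
                 capacity=4, depth=0, max_depth=16):
        self.bounds = (west, south, east, north)
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth
        self.items = []        # (x, y, payload) tuples held by a leaf
        self.children = None   # four sub-cells once the leaf has split

    def _contains(self, x, y):
        w, s, e, n = self.bounds
        return w <= x < e and s <= y < n   # half-open cell

    def _insert_into_child(self, x, y, payload):
        return any(c.insert(x, y, payload) for c in self.children)

    def _split(self):
        w, s, e, n = self.bounds
        mx, my = (w + e) / 2, (s + n) / 2
        args = (self.capacity, self.depth + 1, self.max_depth)
        self.children = [
            QuadTree(w, s, mx, my, *args), QuadTree(mx, s, e, my, *args),
            QuadTree(w, my, mx, n, *args), QuadTree(mx, my, e, n, *args),
        ]
        items, self.items = self.items, []
        for x, y, payload in items:
            self._insert_into_child(x, y, payload)

    def insert(self, x, y, payload):
        """Store a point; return False if it lies outside this cell."""
        if not self._contains(x, y):
            return False
        if self.children is None:
            if len(self.items) < self.capacity or self.depth >= self.max_depth:
                self.items.append((x, y, payload))
                return True
            self._split()
        return self._insert_into_child(x, y, payload)

    def query(self, west, south, east, north):
        """Return payloads of all points inside the closed query rectangle."""
        w, s, e, n = self.bounds
        if east < w or west >= e or north < s or south >= n:
            return []   # query rectangle misses this cell entirely
        found = [p for x, y, p in self.items
                 if west <= x <= east and south <= y <= north]
        for child in self.children or []:
            found.extend(child.query(west, south, east, north))
        return found
```

A window query only descends into cells that overlap the window, which is why such an index speeds up display and spatial selection of features within a tile.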

A useful aspect of the VPF specification is that the relational files have their schema description within each file's header. This facilitates dynamic interpretation and processing of feature data, as well as a means of coping with some of the differences in structural specifications among the various VPF products.

To give an idea of the complexity of a single VPF database, one 8-megabyte library for the Norfolk Harbor database has about 25,000 geographic features (considered a relatively small data set), and uses over 1500 separate files to describe the location, topology, and other attributes of these features (this seems like a much larger number of files than one might expect, for just eight megabytes of data). Given the high degree of interdependency among features and graphical primitives, it is thus difficult to manage even simple changes to the location of a spatial feature, while assuring referential integrity throughout the coverage.


Historical Technological Developments


Many threads of development have contributed to the present research, which will be loosely categorized according to the period and technology represented. These start with pre-GIS large-scale urban models, object-oriented programming systems, both relational and object-oriented database management systems, and knowledge-based systems (including active database systems). These are followed by discussion of various approaches to GIS, including proprietary and relational GIS, object-oriented GIS, and knowledge-based GIS. The concluding section presents a synthesis of the work in these fields as it pertains to the current research.


Large-Scale Urban Models


Due to the need to store, relate, and manipulate large amounts of spatial, temporal and topical data, computers have been used to support geographic applications since the 1950s (Budic 1994). Large-scale optimization, econometric, and simulation models of urban and regional systems were developed by the 1960s, but these began to lose favor in the U.S. by the early 1970s (Klosterman 1994). Lee voiced specific and influential concerns in 1973 which are important to keep in mind as we look at the various modeling approaches in this review. He referred to these as the "seven sins of large-scale models" (Klosterman 1994, p. 4): (1) hypercomprehensiveness, or trying to serve too many purposes at once; (2) grossness, providing information too coarse to be useful; (3) hungriness, requiring enormous amounts of data (the management of which is error prone in itself); (4) wrongheadedness, that the models suffered from "substantial and largely unrecognized deviations between the behavior claimed for them and the variables and equations that actually determined their behavior" (Klosterman 1994, p. 4);³ (5) complicatedness, that the models' complexity and internally-generated errors resulted in the need to "massage" the models to produce reasonable-looking output; (6) mechanicalness, that the models could produce large, unknowable errors due to iteration and rounding; and (7) expensiveness, that the models' costs were often so high as to require large federal grants just to put to use. As we will see from experiences with other approaches, these issues are not unique to urban models.

3. Klosterman writes (1994, p. 4): "As an example, Lee points out that data for an entire metropolitan area were often used to derive model parameters that were then applied to specific neighborhoods -- a computerized version of the ecological fallacy."


Object-Oriented Programming Systems


Starting in the late 1960s and continuing into the early 1980s, compact engineering workstations with the first windowed graphical user interfaces (GUIs) were being developed at the Xerox Palo Alto Research Center (PARC). A new operating system called Smalltalk was among those being developed to take advantage of these advanced processing architectures (Goldberg 1988). Starting with a heritage from the Simula programming language, Smalltalk was a research project of PARC's Learning Research Group (later called the Software Concepts Group) to work toward "a vision of the ways different people might effectively and joyfully use computing power" (Goldberg and Robson 1983, p. vii). While it has long since surrendered its role as a complete operating system,⁴ it has retained many features from that legacy⁵ and remains one of the most powerful and extensible programming languages and software development environments today. It has also been the principal catalyst in the widespread development of the object-oriented (OO) paradigm and object-oriented programming systems (OOPS). These humorous acronyms may have been no accident: early Smalltalk developers tell stories of trying to work on experimental workstations with an MTBF (mean time between failures) of about twenty minutes.⁶ However, some of Smalltalk's best features as a software development environment⁷ emerged in direct response to this rather hostile environment.⁸

4. One of the first Macintosh operating systems was based on Smalltalk, and an early model of Sun workstation was able to boot up under Smalltalk, but no more. However, one or more competing versions of Smalltalk now run on UNIX, IBM MVS, AS/400, OS/2, MS DOS, MS Windows, and Macintosh platforms. Some versions such as ParcPlace VisualWorks support cross-platform portability; i.e., the same program can be run on any of the supported platforms without recompiling, regardless of which computer system was used to create it.

A number of very good books are available on programming with Smalltalk (LaLonde 1994; Howard 1995; Smith 1991, 1995; Lorenz 1995). While the early years of Smalltalk were focused on defining the language and convincing the software industry of the value of the object-oriented paradigm, attention has shifted since the late 1980s to refining the methods used for analysis and design of object-oriented programs. Numerous approaches have been presented; of these I have preferred Booch and Rumbaugh's work, which essentially integrates a number of distinct techniques, each of which is more or less suited to different specific stages in the software development life cycle (Rumbaugh et al. 1991; Booch 1994; Booch and Rumbaugh 1995). In just the last two years, another focus of attention has been on the study and use of design patterns in object-oriented programming (Gamma et al. 1995; Coplien and Schmidt 1995). These are based on the work of architect and professor Christopher Alexander in his study of design patterns in urban and natural architecture (Alexander et al. 1977). Design patterns provide a very concise vocabulary for discussing object-oriented programming design constructs, which also serves to aid in documenting a program. These are used only slightly in this thesis due to their recent appearance in the literature. Given time and resources, I would like to review OVPF again with the goal of identifying the specific design patterns which occur in the program.

5. Smalltalk helped pioneer the use of lightweight process threads. It also incorporates the use of semaphores and non-preemptive, priority-based process scheduling. It includes a number of other advanced programming features as well; see (Goldberg and Robson 1983, 1989; ParcPlace Systems 1994a, b).

6. Personal communication with Russ Pencin, ParcPlace Systems, 1989.

7. References to "Smalltalk" as a language as well as a programming environment may seem confusing at first, but I will try to distinguish these different usages by context. In fact, Smalltalk represents at times a program organization philosophy; a language with rules of syntax and semantics; and an interactive, graphical, software development environment with a rich set of tools for developing, cross-referencing, debugging, versioning and documenting programs. Most of the tools for building Smalltalk programs are also written in Smalltalk, forming an inherently user-extensible language and development environment.

Finally, a very useful book has recently appeared which is directed to supporting "technical managers in organizations to be successful in the use of object-oriented technology" (Goldberg and Rubin 1995, p. v). This book distills decades of experience with Smalltalk and other object-oriented systems to address the many issues of effective project management.


Relational and Object-Oriented Database Management Systems


In the late 1970s and early 1980s, relational database management systems (RDBMS) came out of the research laboratories and began to find general commercial use in corporate minicomputer-based⁹ information systems applications such as finance and accounting. It was also about this time that personal computers (PCs) began to enter the office. Both of these technologies caught the interest and budgets of planners and other users of GIS. A fairly thorough guide to concepts and issues of RDBMS may be found in Date (1995), with additional perspectives provided in Stonebraker (1988).

8. One of the earliest Smalltalk development utilities from ParcPlace Systems was a disk-based audit trail called the "change list" of all programming code as it was written, along with a crash recovery tool to roll-forward changes made by the programmer since the last saved version of the complete program. This soon evolved into a facility to support merging multiple programmers' code.

An important complementary technology to RDBMS was the development of object-oriented database management systems (ODBMS) by the middle to late 1980s (see Zdonik and Maier 1990). Initially these were created to meet the needs of complex applications such as computer-aided drawing, engineering or manufacturing (CAD, CAE and CAM), which have traditionally been poorly served by the relational database model. Companies offering commercial ODBMS products with Smalltalk interfaces include GemStone, Versant, ObjectStore, Objectivity, and UniSQL. We chose GemStone (GemStone Systems Inc. 1995) and ObjectStore (Object Design Inc. 1995) for evaluation in the current research project because they offer different client-server architectures, and we did not have the resources to examine more than two of these. For more information on client-server issues among ODBMS architectures, see (DeWitt et al. 1992; Cobb et al. 1995a). Another interesting paper outlines its authors' proposed "object-oriented database system manifesto" of issues that need to be properly addressed when working within the object-oriented paradigm (Atkinson et al. 1992).

While the entire ODBMS market today is probably smaller than any one of the major RDBMS companies' customer lists, its influence is being felt. Many corporate information systems managers are switching to ODBMS for standard business applications that were traditionally based on RDBMS. And most major RDBMS software companies now have a strategy in place for supporting ODBMS applications in the present or near future.

9. "Minicomputers" fill a middle ground between personal workstations and mainframes for multi-user systems. There are hundreds of models in wide use today by Sun, IBM, HP, DEC, and many others. Minis and mainframes have become smaller as workstations have become more powerful, to the point that these distinctions are sometimes hard to make now.


Knowledge-Based Systems


Another field of technological development which has a bearing on my research started in the 1960s with artificial intelligence (AI) and expert systems (ES). There was considerable initial excitement over the possibility of capturing the reasoning and heuristics (rules of thumb) of experts in a complex problem domain for use in computer models. This excitement cooled by the early 1970s due to the failure of the technology to follow through on its early hype and promise, but the field seemed to be saved from demise by the introduction of microcomputers. By the early 1980s, through the massive dissemination of affordable machines capable of meeting the heavy computational requirements of AI, "expert systems were even appearing as part of the most basic educational software packages" (Batty and Yeh 1991, p. 103). Numerous proposals and case studies regarding the use of expert systems in non-GIS urban and environmental applications have appeared in the literature since the mid-1980s (Dickey et al. 1986; Ortolano and Perman 1987; Davis et al. 1987; Sharpe et al. 1991; Heikkila and Blewett 1992). Special issues of industry journals have been devoted to expert systems in urban and environmental planning and design (Sharpe et al. 1987; Batty and Yeh 1991). Collections of case studies covering a wide range of applications may be found in (Kim et al. 1990; Wright et al. 1993). Leung (1988) provides a theoretical foundation for the use of "fuzzy sets" to represent imprecision in spatial analysis and planning.






25

So what are expert systems? A good way of describing them is as ...

... decision aids which represent knowledge about the problem domain in terms of rule-based structures. As such, they are models of the problem-solving process which enable conditional syllogisms in the IF-THEN form to be executed in sequence. In fact, the problem domain is usually represented by a network of such rules and the expert system processes these rules by searching this network to find the ultimate conclusions or the original premises which represent the basic outputs and inputs which drive the system. These systems are organised into a ... knowledge base which contains the data in a form which can be operated upon by the system's inference engine which contains the search procedures. Searching is usually accomplished by forward chaining from premise to conclusion or backward chaining from conclusion to premise. (Batty and Yeh 1991, p. 103)
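The two search strategies named in the quotation can be sketched in a few lines of Python. The rule format (a tuple of premises and one conclusion) and the wetlands rules, which echo the building-permit scenario earlier in this chapter, are hypothetical and serve only as an illustration; no particular expert system shell works exactly this way.

```python
# Hypothetical IF-THEN rules: (premises, conclusion).
RULES = [
    (("has_wetland_plants",), "is_wetland"),
    (("is_wetland", "permit_requested"), "flag_for_review"),
]


def forward_chain(facts, rules):
    """Premise to conclusion: fire rules until no new fact is derived."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if conclusion not in known and set(premises) <= known:
                known.add(conclusion)
                changed = True
    return known


def backward_chain(goal, facts, rules, seen=frozenset()):
    """Conclusion to premise: try to prove goal from the given facts."""
    if goal in facts:
        return True
    if goal in seen:          # avoid looping on circular rules
        return False
    for premises, conclusion in rules:
        if conclusion == goal and all(
            backward_chain(p, facts, rules, seen | {goal}) for p in premises
        ):
            return True
    return False
```

Forward chaining starts from the known facts and derives everything that follows from them, whereas backward chaining starts from a single question and examines only the rules that could answer it.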

In the design of expert systems frameworks, technology has branched in three main directions: (1) pure production-rule systems such as OPS5 (Brownston et al. 1985); (2) first-order logic systems such as Prolog and its derivatives (Torsun 1995); and (3) active databases (Chakravarthy 1992; Jaeger and Freytag 1995; Widom and Ceri 1996). Each of these will be described briefly, even though the first two were found to be unsuitable for the problem domain in this research project. Lessons are gained from all three approaches.

Production-rule systems

The basic architecture of production-rule systems, sometimes called simply "production systems," consists of three main components: (1) a data store or working memory, containing a global database of symbols representing facts and assertions about the problem; (2) a set of rules, which constitutes the program, stored in production memory or rule memory; and (3) an inference engine to execute the rules. Rules have two parts: a condition to be tested, and an action to execute if the condition proves to be true (Brownston et al. 1985, pp. 6-7). Both forward chaining and backward chaining are supported. Production systems proceed computationally by examining and matching the states of all the data against all the rule conditions in each program cycle. This is well suited to applications in which the program must respond adaptively to frequent, unpredictable changes in its environment. Unfortunately for our case, there is no means of supporting interactions or sequencing among rules. This would not work well with data models of interdependent geographic features in which the order of rule processing is often important, such as in facilities management applications for utilities. Also, the considerable overhead involved in examining the data store and the rule base in each program cycle is an unnecessary price to pay "when efficient and provably correct algorithms or even close approximation algorithms exist for a task. . . . In general, if the problem and the solution to the problem are well structured or highly structured, it is unlikely that the best computer representation to the problem will be a production-system program" (Brownston et al. 1985, p. 26). In the current project, our problems and solutions tend to be highly structured.

First-order logic systems

First-order (also called predicate) logic programming approaches such as Prolog introduce formal semantics and provable correctness of theorems as the means of solving problems. These are generally backward-chaining systems, in which the system seeks to determine the premise to a given conclusion through exhaustive proofs of applicable theorems which could apply to the resolution of a given inference rule. As with production systems, these have significant shortcomings for dealing with very large databases. As Torsun writes (1995, p. 455): "the use of logic programming routinely in industrial/commercial applications is severely hindered by a serious drawback. This drawback is the inefficiency of logic languages in applications where the problem is complex, large, or both.... logic programming is domain independent and the search methods are undirected, but for efficiency to be achieved, proof servers need to be more focused." Another serious shortcoming of logic programming is that these languages do not include tools for building sophisticated windowed GUIs capable of managing tens of thousands of points and vectors at once, so they would have to be integrated with a GUI toolkit to provide this functionality.

An application area where predicate logic appears to be well suited is that of programming code generation.¹⁰ This is an important area offering increased productivity of programming effort in certain application domains. In this case, the problem domain is limited to the syntax and semantics of the input scripting language and of the generated output code, as far as the logic system is concerned. However, in facilities management, land-use zoning decision making, and many other applications of GIS, the bounds and semantics of the problem domain are too broad and complex to fit the limitations of logic systems.

Active databases

The field of active databases has emerged since the mid- to late-1980s as a very promising technology. "Active database systems are able to recognize specific situations (in the database and beyond) and to react to them without direct explicit user or application requests" (Gatziu and Dittrich 1992, p. 23). This represents in some ways an extension of the traditional "passive" database management system, and in some ways an extension of the OPS5 production-rule system. Active databases are superior to passive databases for enforcing general integrity constraints and enabling triggers, as well as for supporting data-intensive expert systems and workflow management applications, since the rule base does not have to fit completely in memory (Widom and Ceri 1996).

10. Personal communication with Dr. Sharma Chakravarthy.

Common to most active database systems are the notions of events (or situations) and actions, associated via rules, as in SAMOS (Gatziu and Dittrich 1992). This is often referred to as an "ER" (for event-rule) framework. An event might be the creation, modification or removal of (in our case) a geographic feature object or a graphical primitive object. A rule might associate the removal of an object with an action to check the user's authorization privileges before allowing the event to proceed. Another rule might associate the creation of a new feature object with an action to check and enforce the data integrity constraints for that feature's location or attributes.

An extension of this approach developed for the HiPAC project (Chakravarthy et al. 1989; Dayal et al. 1996) and used in Snoop (Chakravarthy and Mishra 1993) adds the notion of being able to check arbitrary conditions, potentially having to do with objects not related to the triggering event, before firing the associated rule's action. This is often referred to as an "ECA" (for event-condition-action) framework. For example, with this approach we might condition the insertion of a new bridge feature object to depend on the prior existence of a nearby road feature with which it can establish application-dependent associations. This kind of rule-encoded interdependency among geographic features would be very useful in facilities management and other complex GIS applications.
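The bridge and road example can be expressed as a single event-condition-action rule. The sketch below is in Python and is purely illustrative: the `Rule` and `FeatureStore` classes, the `before_insert` event name, the point-based feature positions and the distance threshold are all assumptions of this sketch and do not reproduce HiPAC, Snoop or the OVPF event system.

```python
import math
from dataclasses import dataclass
from typing import Callable


@dataclass
class Rule:
    event: str                                # name of the triggering event
    condition: Callable[[dict, dict], bool]   # may inspect any stored feature
    action: Callable[[dict, dict], None]      # runs only if condition holds


class FeatureStore:
    """Toy feature repository that signals an event before each insert."""

    def __init__(self):
        self.features = {}   # feature id -> feature dict
        self.rules = []

    def signal(self, event, feature):
        for rule in self.rules:
            if rule.event == event and rule.condition(self.features, feature):
                rule.action(self.features, feature)

    def insert(self, feature):
        self.signal("before_insert", feature)
        self.features[feature["id"]] = feature


def bridge_lacks_road(features, new, max_dist=0.01):
    """Condition: a new bridge with no road within max_dist degrees."""
    if new["kind"] != "bridge":
        return False
    return not any(
        f["kind"] == "road" and math.dist(f["pos"], new["pos"]) <= max_dist
        for f in features.values()
    )


def reject_insert(features, new):
    """Action: abort the insert by raising an error."""
    raise ValueError(f"bridge {new['id']} has no nearby road")
```

Registering `Rule("before_insert", bridge_lacks_road, reject_insert)` on a store causes a bridge to be rejected until a road has been inserted close to it. Note that the condition examines features other than the one that triggered the event, which is the distinguishing point of the ECA approach described above.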

Active databases can be based on either relational or object-oriented database models, and depending on their design, can support forward chaining, backward chaining, or both. Most of the earlier research and commercial database products applied reactive capability to RDBMS (Chakravarthy et al. 1989; Stonebraker et al. 1988; Widom and Finkelstein 1990; Darnovsky and Bowman 1990; InterBase 1990). More recently, others have attempted to incorporate event and rule support into an ODBMS (Gehani et al. 1992; Gehani and Jagadish 1991; Diaz et al. 1991; Chakravarthy et al. 1993; Medeiros and Pfeffer 1990; Su et al. 1989; Anwar 1992).

Anwar et al. (1993) examine the implications of the shift from a relational to an object-oriented DBMS, and point out the greater flexibility from using an ODBMS for the active database. In particular, it is noted: "In contrast to a fixed number of pre-defined primitive events in the relational model, every method/message is a potential event" (p. 99). This is an important distinction. For example, a trigger in a typical RDBMS might be set up to take effect on update of a record in a given table, but it is not possible to trigger only on update of a certain field in the table; the trigger will always take effect for an update to the record no matter which field was the one updated.¹¹ In an ODBMS, it is possible to have triggers defined at any granularity of an object's structure. Another finding from Anwar et al. (1993) is that by appropriate specification, parameterized rules can be associated either with a class object (in which case the rule would be in effect for all instances of the class) or with an individual instance.

The Snoop model introduced the notion of complex events, which could be defined either as a sequence of specific primitive events, or as a Boolean composite of multiple primitive events. Taken together, these form a surprisingly simple and powerful set of constructs, which are the basis of the event system now in our OVPF application.
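As a rough illustration of sequence and Boolean composite events, the following Python sketch evaluates event expressions against a recorded history of primitive event names. It is a simplification and does not reproduce the Snoop event algebra or the OVPF event classes; the class names and the event names (moveVertex, splitEdge, save, delete) are hypothetical.

```python
from typing import List


class Primitive:
    """A primitive event, identified by name (for example, a message selector)."""

    def __init__(self, name: str):
        self.name = name

    def occurred(self, history: List[str]) -> bool:
        return self.name in history


class And:
    """Boolean composite: every component event has occurred."""

    def __init__(self, *parts):
        self.parts = parts

    def occurred(self, history: List[str]) -> bool:
        return all(p.occurred(history) for p in self.parts)


class Or:
    """Boolean composite: at least one component event has occurred."""

    def __init__(self, *parts):
        self.parts = parts

    def occurred(self, history: List[str]) -> bool:
        return any(p.occurred(history) for p in self.parts)


class Sequence:
    """The named primitive events occurred in this order (not necessarily adjacent)."""

    def __init__(self, *names: str):
        self.names = names

    def occurred(self, history: List[str]) -> bool:
        remaining = iter(history)
        # 'name in remaining' consumes the iterator up to the first match,
        # so each later name must be found after the earlier ones.
        return all(name in remaining for name in self.names)


history = ["moveVertex", "splitEdge", "save"]
in_order = Sequence("moveVertex", "save").occurred(history)
out_of_order = Sequence("save", "moveVertex").occurred(history)
composite = And(Primitive("splitEdge"),
                Or(Primitive("delete"), Primitive("save"))).occurred(history)
```

Because composites accept other composites as parts, arbitrarily nested expressions can be built from these few constructs.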




11. One exception is Sybase. This RDBMS is capable of limiting an update trigger to fire only on a change to a specified field.






The point of building active functionality like this into the database system itself is to ensure consistent usage and high performance. SAMOS (Gatziu and Dittrich 1992) is an example of an active database layer implemented on top of ObjectStore, the same ODBMS we are using for our OOGIS repository. It was found in the SAMOS project that some performance is inevitably lost when the active capability is added on top of the ODBMS rather than being built into the kernel from the beginning. The OVPF application will share this fate, but for the present we are only concerned with prototyping advanced capabilities, and found this an acceptable trade-off, given there are no commercially available ODBMSs with reactive capability.


Geographical Information Systems


There are now numerous textbooks and references on GIS. Two of the more comprehensive books with which I am familiar are (Maguire et al. 1991) and (Laurini and Thompson 1992). These discuss the key issues and current approaches for creating and using geographic databases. A useful introductory guide to GIS also seems to be (Garson and Biggs 1992).

Briefly, a GIS provides (in varying levels of quality and ease of use, according to the system's manufacturer): (1) a database of graphical, locational information for a set of geographic features; (2) a synchronized database of nonspatial attributes for the same set of geographic features; (3) a graphical user interface (GUI) with query and update capabilities allowing a user to access and modify the feature data; and (4) analytical capabilities allowing the user to conduct studies taking advantage of the geometrical or topological properties of the geographic data. Some examples of spatial analysis supported by GIS that were previously not feasible include "estimating runoff volume in specific areas, locating areas with scenic amenity, and searching for paths through three-dimensional space that satisfy certain conditions, such as minimizing distance or construction costs or avoiding major obstacles" (Han and Kim 1989, p. 298). A number of publications have appeared which enumerate the functionality which should or could be found in a GIS (for example, see Goodchild 1988, 1994; Tomlin 1990). In our OVPF project, we have not yet reached the stage of implementing analytical functions such as these; hence we usually refer to OVPF simply as a viewer/editor.

The data models used, and much of the analysis performed, with vector-based GISs depend on graph theory (Harary 1969) and planar topology (Alexandrov 1957, 1965; Munkres 1966; Spanier 1966; Simmons 1963). For the reader interested in probing the mathematics of topology in more depth (according to one well-informed source12), "the standard reference is Alexandrov (1965). More mathematical, and a fine book, is Munkres (1966). THE reference for the insider is Spanier (1966)." However, it is unnecessary for our purposes to explore the range of methods for representing topology, as the VPF specification defines a particular manner in which primitive spatial objects shall be represented and associated with each other. The VPF "winged-edge topology" model (DMA 1993a, Appendix B) is a form of the point-line-polygon model typical in existing GIS systems (Worboys 1994). Our OVPF application provides complete support for the VPF winged-edge topology model, as will be described in the Materials and Methods section.



12. Personal communication with Dr. Max Egenhofer.






Another important aspect of a GIS is the choice of spatial indexing algorithm. While VPF specified a particular spatial index approach (the adaptive binary tree), we felt less bound to follow this guideline, for a couple of reasons: (1) the adaptive binary tree could not perform as well as many other structures; and (2) the choice of spatial index is critical to the overall performance of the system for queries and analysis. The version of the Objective Facilities Management (OFM) program from which OVPF evolved used a simple quadtree approach based on (Samet 1994). Other popular approaches include range trees (e.g., R tree, R+ tree, and R* tree; see also Samet 1994; Beckmann et al. 1990; Brinkhoff et al. 1993). We decided on the quadtree partly because: (1) the quadtree approach yields unique spatial index keys, whereas the range tree approach does not (this would be important for storing spatial index keys on disk for later use); and (2) it was already implemented in OFM, and had reasonable performance with our prototype. One problem of range trees occurs in the case of overlapping regions for a given spatial object: it is not possible to determine a unique and repeatable index key for the spatial object. The VPF specification, however, provided for storing the spatial index key values as part of the georelational file structure, for the purpose of allowing faster access to features. The use of range trees would preclude our ability to store a repeatable spatial index key with a given feature object in the VPF file structure.
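To illustrate why a quadtree can yield a unique and repeatable key, the following Python sketch assigns each bounding box the digit string of the smallest quadtree cell that fully contains it. This is a generic illustration under stated assumptions, not the OFM or OVPF implementation; the function name quadtree_key, the 1024-unit extent, the depth limit, and the quadrant numbering (0 = SW, 1 = SE, 2 = NW, 3 = NE) are all hypothetical choices.

```python
def quadtree_key(box, extent=(0.0, 0.0, 1024.0, 1024.0), max_depth=8):
    """Return the key of the smallest quadtree cell that fully contains box.

    box and extent are (xmin, ymin, xmax, ymax) tuples. The key depends only
    on the box and the fixed extent, never on insertion order, so it can be
    stored with the feature and recomputed later with the same result.
    """
    xmin, ymin, xmax, ymax = extent
    bx0, by0, bx1, by1 = box
    key = ""
    for _ in range(max_depth):
        xmid = (xmin + xmax) / 2.0
        ymid = (ymin + ymax) / 2.0
        if bx1 <= xmid:
            east = 0
        elif bx0 >= xmid:
            east = 1
        else:
            break  # box straddles the vertical split: stop at this cell
        if by1 <= ymid:
            north = 0
        elif by0 >= ymid:
            north = 1
        else:
            break  # box straddles the horizontal split: stop at this cell
        key += str(2 * north + east)
        # Narrow the current cell to the chosen quadrant.
        xmin, xmax = (xmid, xmax) if east else (xmin, xmid)
        ymin, ymax = (ymid, ymax) if north else (ymin, ymid)
    return key


small_box_key = quadtree_key((600, 100, 700, 200))
center_box_key = quadtree_key((500, 500, 520, 520))
```

A box that straddles the first split receives the empty key, meaning it is held at the root cell. An index whose node regions depend on the data inserted so far offers no comparable fixed decomposition from which to derive a stored key.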


Object-Oriented GIS


Since 1987, numerous object-oriented approaches and data models for GIS have been proposed and examined in the research literature (Egenhofer and Frank 1987; Dueker and Kjerne 1987; Abel 1989; Egenhofer and Frank 1992; Herring 1992; Worboys 1994). It is noted in Egenhofer and Frank (1992, p. 16) that "Object-oriented programming languages will be needed to implement the future GIS most efficiently.... because it naturally supports the treatment of complex, in this case geometric, objects (Kjerne and Dueker 1990). Compared with conventional data models, an object-oriented design is more flexible and better-suited to describe complex data structures." With regard to ODBMS, the same authors continue (1989, p. 16), "By using a database management system, data are treated by their properties; the object-oriented approach groups these properties into possibly complex objects and corresponding operations."

Two recent doctoral dissertations have been directed to the potential for using object-oriented concepts in GIS (Feuchtwanger 1993; Karnes 1995). Feuchtwanger proposes a geographic semantic database model, incorporating notions of both structural and behavioral aspects of stored information. Karnes implements a Smalltalk-based prototype for modeling land parcel networks in a cadastral cartography application. Karnes' work is especially interesting, as he explores the use of object-oriented programming as a means of modeling and creating novel metaphors for real-world representations in cartographic and geographic domains. It is the flexibility of object-oriented technology (and Smalltalk's development environment) in supporting complex representations that facilitates this application.

Commercial OOGIS products have emerged in the last few years, including Arcview/Avenue from Environmental Systems Research Institute (ESRI 1995a, 1995b), Magik from Smallworld Systems (Smallworld 1995), Gothic Application Development Environment (Gothic ADE) from Laser-Scan (LSL 1995a, 1995b), and of course Objective Facilities Management (OFM 1996).






Avenue is an object-oriented scripting language for supporting Arcview applications. It appears to draw much of its inspiration from both Smalltalk and C++, in terms of syntax and semantics. However, it has some serious shortcomings for use in GIS: (1) it is a closed system, that is, the user may not create new classes or class hierarchies, but can only use the classes provided with Avenue; and (2) because Arcview is not designed or intended to be used to edit Arc/Info coverage data, Avenue cannot support editing of Arc/Info coverages either.

Smallworld Magik is more powerful in some ways than Arcview with Avenue, providing a full-featured object-oriented language with much of the semantics of Smalltalk. The user can create classes and hierarchies of geographic features, and can conduct many useful analytical operations with the system. Smallworld provides a proprietary relational database system for both spatial and nonspatial data, as well as supporting access to Oracle and other commercial RDBMS repositories. Smallworld has so far focussed on facilities management applications for electrical, gas, water and telecommunications industries.

Laser-Scan Gothic ADE is also a powerful object-oriented GIS, providing both scripting language capability and a proprietary object-oriented database system capable of holding spatial and nonspatial geographic data on the order of terabytes in size. Gothic ADE uses an interesting combination of C-language libraries and a high-level scripting language called Lull, to achieve what they claim is higher performance of processing than is possible with Smallworld's Magik system. Laser-Scan has so far concentrated on the market for large-scale map production systems.





Expert Systems with GIS


There has been considerable progress incorporating geographic data into urban and regional zoning and other policy formation efforts (Maguire et al. 1991; Budic 1994). Research and practice with artificial intelligence (AI) and expert systems (ES) in recent years have resulted in the proposal and development of several models for supporting urban and regional policy studies and implementation (see Dickey et al. 1986; Ortolano and Perman 1987; Davis et al. 1987; Batty and Yeh 1991; Sharpe et al. 1991; Yan et al. 1991). However, in none of these cases was GIS data directly incorporated into the design or use of an expert system. Furthermore, some of the experience papers draw attention to significant difficulties and shortcomings in applying AI or ES technology to urban planning applications (Dickey et al. 1986; Sharpe et al. 1991). Another thoughtful paper discusses numerous legal and ethical issues regarding the use of ES in planning (Wigan 1987).

Other research has focussed on applying AI, ES and DSS approaches to work specifically with GIS as an enabling technology for spatial queries and information analysis (Peuquet 1987; Taylor 1991; Han et al. 1991; Webster et al. 1991; Worboys 1994; Chen et al. 1994; as well as several of the papers from Kim et al. 1990; Wright et al. 1993).

A very interesting work from the mid-1980s was KBGIS-II (Smith et al. 1987). This was a project at UC Santa Barbara to develop a knowledge-based GIS system, which was based on Common Lisp, Pascal and C. It included a means of representing both vector and raster (pixel-based) data, and defined a spatial object language. It had: (1) a query mode supporting a simple but versatile set of query forms; (2) a learn mode in which the system could modify and augment its knowledge base; (3) an edit mode in which the user could modify and augment the spatial object language, as well as the knowledge base; and (4) a trace mode in which the user could follow the processing steps being executed by the system. This work represents more complete development in terms of supporting queries and analysis than has so far been achieved in the current research. However, the thrust of my thesis is toward proof of concept that a completely object-oriented approach has merit for implementing a GIS framework. The KBGIS-II project helps inform the current research of some aspects of the overall framework that should receive attention.

With LOBSTER, Egenhofer and Frank (1990) present an interesting approach to building a Prolog-based spatial query language, resulting in progress toward a high level abstraction of spatial data and geometric operations, but note some significant difficulties. For instance, "Prolog contains no provisions to prevent the entry of invalid or contradictory data. . . . Such errors are extremely difficult to detect. If the database contains large numbers of facts, visual inspection by browsing is not possible anymore." (p. 924) Another source of problems was that "Some Prolog programs rely on the order in which facts and rules are entered into the database." Both these issues relate to the difficulties of trying to apply a strictly logic-based approach in a problem domain requiring more procedural control.

Some additional insight into the issues and difficulties of developing expert systems is brought out by Han and Kim (1989). In this paper they discuss some of the distinctions between standard database management systems (DBMS) and decision support systems (DSS) in urban planning (p. 298):

The problems dealt with by DSS are generally different from those dealt with by DBMS.
DBMS is suited for structured problems that have a standard operational procedure,
decision rules, and clear output formats, such as those used in identifying low income
districts or in determining the median income of a city. DSS, on the other hand, is intended
for unstructured or semistructured problems, such as estimating fiscal and other impacts of
land development proposals, to provide quantitative support to the decision maker.

Han and Kim go on to inquire as to the reasons why urban planning complicates the use of DSS and more sophisticated expert systems. Among their findings is a list of suggested guidelines for identification of tasks suited to expert systems approaches. These were accumulated through a number of sources (Han and Kim 1989, p. 300):

1. Genuine experts exist who can articulate their problem solving methods;

2. Experts agree on solutions;

3. The task is not poorly understood;

4. The problem typically takes a few minutes to a few hours to solve;

5. No controversy over problem domain rules exists;

6. The problem is clearly specifiable and well-bounded; and

7. The problem solving should be judgmental in nature, not numerical.

For those who are familiar with some of the battlegrounds in urban planning, these conditions will seem simplistic and naive. Some of the reasons they are suggested are to support repeatable results and to allow objective validation of solutions found. In any case we must start somewhere, and progress is being made. It is part of my goal with the current research to contribute to this progress.


Importance and Contributions of This Thesis


We have now looked very briefly at the main technological threads which come together in the current research: urban system modeling, database management, GIS, object-oriented programming, and knowledge-based systems. Thus far, all major commercial GIS products (both relational and object-oriented) except Objective Facilities Management (OFM) have their own proprietary programming or scripting language and database system repository, although they generally support one or more of the major RDBMS products as well. It is certainly understandable why this would be so: by controlling the language with which a GIS user accesses and modifies the data, the GIS software manufacturer has fewer problems to cope with in system development and integration, as the products and users' applications grow and change.

However, it is my perspective that this approach inhibits users' ability to develop innovative solutions to meet their needs, and greatly limits the number of trained programmers in the marketplace who might have experience with a given GIS product. Considerable talent and effort have been directed to the development of each major programming language such as Cobol, Fortran, Pascal, C, C++, Smalltalk and others. The advances being made yearly with Smalltalk, C++, Java, and other emerging systems are almost staggering. Similarly, RDBMS and ODBMS each represent very significant areas of intense research and development in their own rights, independent of the applications for which they are used. It is inconceivable that any one of the software manufacturers in the GIS field can compete with the functionality, robustness, interoperability on different computer platforms, and tools for development and debugging that are now expected of most modern programming languages and database systems. Nor can the GIS industry easily tap into the larger workforce of experienced programmers and consultants using these other languages and systems. It is my perspective that systems such as OFM and OVPF represent the kind of approach which can combine the capabilities needed in a GIS with the strengths and other advantages provided by using industry-standard programming languages and ODBMS. While OVPF represents a proof-of-concept at this stage, exhibiting no particular spatial analysis functionality, that kind of capability can be implemented with Smalltalk or another language, and integrated closely with the geo-data handling capabilities already present in OVPF.

In addition to addressing the more traditional aspects of GIS, the current research provides the essential framework for supporting expert systems applications, without having to carry along the significant overhead of a complete "expert system shell." This is done through implementation of a simple, elegant and extensible rule-based framework and event-detection mechanism in Smalltalk, as part of the core functionality for creating and modifying geographic features. Because of certain features in Smalltalk such as dynamic binding (Goldberg and Robson 1989), it is quite straightforward to design an application which can modify its own structure and behavior at runtime, at the user's request. Such a system can also be designed to be capable of adding, modifying, and removing rules based on input from multiple simultaneous users in real time. This is a powerful capability that could conceivably lead to development of expert systems which can learn and adapt to changing conditions--necessary functionality for use in increasingly complex urban planning activities.













MATERIALS AND METHODS


This section is in two parts. The first part describes the software environment which was used to conduct the programming (the "materials"), and the second part describes the various issues encountered and approaches used to carry out the programming tasks.


Object-Oriented Software Development Tools


The development of this GIS framework was greatly facilitated by access to excellent tools: the Smalltalk development environment, a source code management system, an object-oriented database management system (ODBMS), and of course the computer platform itself. These will each be described briefly below.


Smalltalk Programming Environment


Smalltalk was chosen as the development platform for the current project (OVPF) initially because that was the language used for Objective Facilities Management (OFM), its "parent" program. However, there are many reasons for its use in OFM and its continuance in OVPF: its rich development and debugging environment, extensible nature, hooks to commercially available ODBMSs, and scalability for working with both small and large geographic data sets (these varied from 15MB to over 300MB for each complete Vector Product Format source database).



The specific version of Smalltalk chosen was VisualWorks from ParcPlace-Digitalk, Inc.1 This product included the Smalltalk language editor, compiler, user interface building tools, cross-reference system for variables, objects and methods, and various browsers (rather like having an encyclopedia of the program built for the programmer by the system), all integrated with a graphical interface. The browsers provide lookup capability for (1) all methods that send a given message; (2) all methods that implement a given message; and (3) all references to a given instance variable, class variable, class-instance variable, or global object such as a class itself. The runtime debugger supports examination of the process stack of currently-active methods at any point in time. The debugger also allows the user to edit and recompile a method, then continue execution of the current process stack from the recompiled method, without having to stop and restart the program. These were invaluable tools throughout the development of OVPF.


Source Code Configuration Management Facility


In addition to VisualWorks, we acquired licenses for ENVY/Developer by Object Technology International (OTI) of Ottawa, Canada (the license is purchased through ParcPlace). This is a sophisticated source code versioning and configuration management facility which has been developed to support all the major brands of Smalltalk (including IBM, Digitalk, and Enfin, besides ParcPlace). ENVY supports team programming by allowing multiple programmers to share a common library of Smalltalk source code. The library management and security is seamlessly integrated into the programming editor, compiler and browsers, precluding the need for the user to always remember to follow proper library checkout/checkin procedures as is typical with other programming source code managers. Multiple programmers can even divide and work on different portions of the same object class without conflict. This facility was critical for the management of OVPF's hundreds of classes and thousands of methods, developed at an intense pace during the two years, with geographic separation among the team members. ENVY consists of two modules: one for the server computer, and one for each client programmer. The library manager is installed on a host server that is accessible to all team members (access can even be physically distributed over the Internet, though performance suffers). Each programmer works with a Smalltalk image initially provided with ENVY, that has the library management subsystem integrated with the rest of the Smalltalk development system.


1. The company was called ParcPlace Systems Inc. through most of the duration of this project. Digitalk Inc. was ParcPlace Systems' chief competitor until they merged in August 1995. References to their separate products in this thesis are now obsolete, but will be made nevertheless.


Object-Oriented Database Management System


The third major software component was the ODBMS. While the research project included evaluation and development with both GemStone (GemStone Systems Inc. 1995) and ObjectStore (Object Design Inc. 1995), I will limit this discussion to the design and implementation of OVPF with ObjectStore, for simplicity and clarity. ObjectStore includes both a server module and a client module. The server module must be running on the host computer having the ODBMS repository. Each client programmer then works with a Smalltalk image which has been customized to include hooks for accessing the ObjectStore server over the network, much like ENVY. All of these layers can be envisioned together as shown in Figure 4 and Figure 5.



Figure 4. Development Hardware
(Diagram: a Sun Sparc 20 host with ENVY/Manager and ObjectStore ODBMS servers and clients, connected to additional client programming workstation(s). The host's computer operating system, Smalltalk source code library, and VPF and ODBMS repositories are marked "best if located on separate hard disks.")







Figure 5. Development Software
(Diagram of software layers, top to bottom: VisualWorks Smalltalk client virtual image; ENVY/Manager source code library server, Smalltalk client Object Engine (OE) with hooks for ENVY and ObjectStore, and ObjectStore ODBMS server; Sun Solaris 2.4 UNIX operating system.)




Computer Platform



For this project, we used the Sun Solaris 2.4 operating system on a Sparc 20 workstation for the Smalltalk and ODBMS host server. This was connected on a local token-ring network to other workstations which could serve as clients. Each of the major software subsystems requires considerable processor and data transfer resources. As noted in Figure 4, it is recommended to have each of the major subsystems on its own hard disk, to improve overall performance.


Approaches Used in Building OVPF Components


The remainder of the Materials and Methods section presents the substantive aspects of building the components for OVPF. This was a very large undertaking, and it is beyond the scope of this thesis to describe it in its entirety. Instead, I will focus attention on the portions of OVPF which have the most bearing on the goals and objectives of the research.


Introducing Some Object-Oriented Terms


In the following discussions, it may be helpful to be acquainted with certain common terms used in describing object-oriented designs. The reader is referred to one of the references on Smalltalk for more detailed explanations of object-oriented concepts (Goldberg and Robson 1989; LaLonde 1994). The term abstract superclass is used to represent a definitional abstraction, such as definition of variables and/or behavior to be shared by its subclasses. Instances are not normally created from abstract classes. Concrete classes, on the other hand, are those that are expected to have instances made from them. These terms are mainly used to aid in learning about a class hierarchy; to call a class abstract simply implies that it lacks behavior needed for creation of a useful instance-object.







Instance variables are data structures for which each instance-object has its own private copy. Instance variable definitions in one class are inherited as part of the definition by all of its subclasses. In Smalltalk, instance variable names begin with a lowercase letter. Class-instance variables are data structures for which the class object and each of its subclasses are defined to have a private copy of the variable. A class-instance variable can be used, for example, to hold a subclass-specific default value for a constant that can be accessed with the same name from any class in the hierarchy. This helps reduce the program's "variable-name vocabulary," which is one of the benefits of object-oriented design.

Class variables are data structures for which the defining class has a single copy, that can be directly accessed by all of its instances. Per Smalltalk convention, class variables (and other shared objects including classes) have names that begin with an uppercase letter. Class variables are generally used either to hold (1) application-specific constants, or (2) collections of specific instances of a class.

In an object-oriented system all actions are the result of sending a message to an object. The receiver-object then responds by executing a method by that name. For improved readability in this thesis, I use the terms message and method interchangeably. However, these terms have distinct meanings; i.e., for a given message there may be one or more methods defined, as any number of objects can have a method with the same name.
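For readers more familiar with other languages, the terms above can be approximated in Python. The analogy is loose: a Python class attribute that a subclass does not override falls back to the superclass's value, whereas a Smalltalk class-instance variable is a separate slot in each class. The class names Feature, Road and Bridge, the attribute names, and the symbols used are hypothetical and do not come from OVPF.

```python
class Feature:
    # Class variable analogue: a single copy shared by the whole hierarchy,
    # used here to hold a collection of instances.
    Registry = []

    # Class-instance variable analogue: each subclass supplies its own value
    # under the same name, keeping the variable-name vocabulary small.
    default_symbol = "?"

    def __init__(self, name):
        self.name = name               # instance variable: private to each object
        Feature.Registry.append(self)

    def describe(self):
        # The method that runs when an object receives the message 'describe'.
        return type(self).default_symbol + " " + self.name


class Road(Feature):
    default_symbol = "="


class Bridge(Feature):
    default_symbol = ")("

    def describe(self):
        # Same message name, different method: the receiver's class decides
        # which method is executed.
        return "bridge: " + super().describe()


features = [Road("A1"), Bridge("B7")]
replies = [f.describe() for f in features]
```

Sending the same message to a Road and to a Bridge executes two different methods, which is the distinction between message and method drawn above.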







Conversion of Source Data from Vector Product Format to Smalltalk Objects


One of the first steps that was required for OVPF was to build a translator in Smalltalk capable of reading Vector Product Format (VPF) data files. With a few exceptions, these are well specified as having the schema information (metadata) for a given data file contained in the header of that file. The file header is organized in three main sections, and the actual geo-data follows immediately after these sections. This organization is described in (DMA 1993a, section 3.6.1, pp. 15-20). To summarize, the first header section consists of the following fields:

1. Header length: 4-byte integer representing the number of bytes in the header.

2. Byte order flag: 'L' for least-significant byte first, and 'M' for most-significant byte first.2 (Ironically, this flag must be known before the preceding numeric field can be interpreted.)

The second header section contains only one field:

3. Table description: up to 80 characters of textual information.

The third and final header section contains the actual schema, which is the essential part for parsing the table's data content. This consists of repetitions of the following fields:

4. Column name: up to 16 characters of textual information.

5. Field type: a single character defining the data type (one of those listed in the first column of Figure 6 below).

6. Number of elements: an integer value representing either (a) the number of textual characters, or (b) the number of occurrences of the specified numeric field type.

7. Key type: a single character for the type of key field represented by the column (one of P-primary, U-unique, or N-none).

8. Column description: up to 80 characters of textual information.

9. Value description table: up to 12 characters for a DOS-compatible filename (either INT.VDT or CHAR.VDT) for the file containing textual descriptions of the different values the column in each data record could have. This will occur when the column is a nonspatial attribute of a given geo-feature.

By knowing the schema for a given table, the program can loop through all the data records in that table, interpreting each field (column value) according to the schema. An example of a feature table with header and records is shown in Figure 7.


2. Each integer and floating-point number requires 2, 4 or 8 bytes for its representation. The byte order specifies which end of the bytes comes first. This is normally determined by the operating system. PC DOS, for example, is a little-endian platform (least-significant byte first), while Unix and Macintosh are big-endian (most significant byte first). Since VPF data is intended to be read on any of these platforms, the GIS software needs to be written to translate VPF integer and floating-point numbers appropriately for that platform.

Because most of the VPF database tables follow this schema specification, it was straightforward to create a generalized VPF table reader procedure. To implement this, I created two main reader classes, VPFTableHeader and VPFSchemaColumn (see Figure 8), as well as a hierarchy of classes to implement the specific properties and behavior of the various data types listed in Figure 6. The data type classes were used to translate the data values' byte representations between VPF and Smalltalk, as well as to maintain data integrity (e.g., ensuring that text field values did not exceed their schema-specified length). There was an added dimension of translating from the byte-order of the VPF source data (so far, this has always been little-endian) to that used by the operating system platform on which OVPF was running.
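A generalized header reader of this kind might be sketched as follows. The sketch is in Python rather than Smalltalk and is not the OVPF code; the names SchemaColumn, parse_header and build_sample are hypothetical. It assumes a byte layout modeled on Figure 7: a 4-byte length (taken here to count the text bytes that follow it), then a text header with ';' between sections and ':' after each column definition. Consult DMA (1993a) for the authoritative layout. The byte order flag is read first, since it governs how the preceding length field must be decoded.

```python
import struct
from dataclasses import dataclass


@dataclass
class SchemaColumn:
    name: str
    type: str
    count: str        # number of elements, or "*" for variable length
    key_type: str
    description: str
    vdt_file: str


def parse_header(raw: bytes):
    """Return (table description, list of SchemaColumn) from raw header bytes."""
    # The flag sits just after the 4-byte length but must be read first.
    flag = raw[4:5].decode("ascii")
    order = "<" if flag == "L" else ">"      # '<' little-endian, '>' big-endian
    (length,) = struct.unpack(order + "i", raw[:4])
    text = raw[4:4 + length].decode("ascii")

    sections = text.split(";")               # flag; description; narrative; columns; ''
    description = sections[1]
    columns = []
    for definition in sections[3].split(":"):
        if not definition:
            continue                         # skip the empty piece after the last ':'
        name, rest = definition.split("=", 1)
        # Simplification: assumes the column description contains no commas.
        ftype, count, key, desc, vdt = rest.split(",")[:5]
        columns.append(SchemaColumn(name, ftype, count, key, desc, vdt))
    return description, columns


def build_sample(order_flag: str = "L") -> bytes:
    """Build a synthetic header resembling Figure 7, for demonstration only."""
    text = (order_flag + ";ENVAREA.AFT, Environment Area Feature Table;-;"
            "ID=I,1,P,Row ID,-,-,:"
            "F_CODE=T,5,N,FACC Code,CHAR.VDT,-,:"
            "VAV=I,1,N,Variation Anomaly Value,INT.VDT,-,:;").encode("ascii")
    fmt = "<i" if order_flag == "L" else ">i"
    return struct.pack(fmt, len(text)) + text


description, columns = parse_header(build_sample("L"))
```

The same parse succeeds for a header written most-significant byte first, because the only byte-order-dependent step is decoding the length.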

Because the mechanics of reading VPF tables are very straightforward computationally once their format is known and an object structure is chosen, I will not go into any further detail on this particular task. It should suffice to say that Smalltalk was capable of reading and interpreting all VPF data files, including those which did not have their schema in the header: these included variable-length index files, spatial index files, and thematic index files (see DMA 1993a, sections 5.4.1.3, 5.4.2 and 5.4.3, pp. 77-83). The Triplet ID field was particularly troublesome, as it is a variable-length array of one or more integer values, whose length and content are determined by decoding the bits of the first byte (DMA 1993a, section 5.4.6, p. 87). Nevertheless, all these are handled within the OVPF classes just mentioned.
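The kind of bit decoding involved in the Triplet ID can be sketched as follows, in Python. The bit layout used here is an assumption that should be checked against DMA (1993a, section 5.4.6): the first byte is taken to hold three 2-bit length codes (high bits first, lowest two bits unused), with code values 0 to 3 meaning 0, 1, 2 or 4 bytes for each component. This assumption is at least consistent with the 1 to 13 byte range given for the type in Figure 6. The function name decode_triplet_id is hypothetical.

```python
def decode_triplet_id(data: bytes, little_endian: bool = True):
    """Decode one triplet id; return (list of three values, bytes consumed).

    A component whose length code is 0 is absent and reported as None.
    """
    sizes = (0, 1, 2, 4)                 # assumed meaning of each 2-bit code
    byte_order = "little" if little_endian else "big"
    type_byte = data[0]
    position = 1                         # components start after the type byte
    values = []
    for shift in (6, 4, 2):              # three 2-bit fields, high bits first
        size = sizes[(type_byte >> shift) & 0b11]
        if size == 0:
            values.append(None)
        else:
            chunk = data[position:position + size]
            values.append(int.from_bytes(chunk, byte_order))
            position += size
    return values, position


# Type byte 0b01100000: 1-byte first component, 2-byte second, third absent.
values, consumed = decode_triplet_id(bytes([0x60, 7, 0x34, 0x12]))
```

Because the consumed length is only known after the first byte is examined, a table reader cannot skip over such a field without decoding it, which is part of what makes the type awkward to handle.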


Representation of Metadata Objects


The term metadata is used here to represent the parts of a VPF source database that define the actual geo-feature data. There is a substantial amount of definitional content in a given VPF database, thanks to its open specification. However the metadata is quite fragmented among numerous files, and must first be assembled and organized in some manner before it is possible to start reading the actual feature data with it.

The approach taken with OVPF is to initialize a metadata object web for each VPF database to be accessed. This is a one-time procedure for each database, after which the metadata web is kept in the ODBMS repository for future use in reading VPF source data from the CD-ROM. Figure 9 summarizes the steps involved in processing the source












Abbrv.  Type                                        Length (Bytes)
T,n     Fixed-length text                           n
T,*     Variable-length text                        n + 4
F,1     Short floating point                        4
R,1     Long floating point                         8
S,1     Short integer                               2
I,1     Long integer                                4
C,n     2-coordinate array, short floating point    8n
C,*     2-coordinate string                         8n + 4
B,n     2-coordinate array, long floating point     16n
B,*     2-coordinate string                         16n + 4
Z,n     3-coordinate array, short floating point    12n
Z,*     3-coordinate string                         12n + 4
Y,n     3-coordinate array, long floating point     24n
Y,*     3-coordinate array                          24n + 4
D,1     Date and time                               20
X,1     Null field                                  (none)
K,1     Triplet id                                  1 - 13

Figure 6. Vector Product Format Data Types
Source: after (DMA 1993a, Table 56, p. 86)









(Header length and byte order);\
ENVAREA.AFT, Environment Area Feature Table;-;\
ID=I,1,P,Row ID,-,-,:\
F_CODE=T,5,N,FACC Code,CHAR.VDT,-,:\
VAV=I,1,N,Variation Anomaly Value,INT.VDT,-,:;

1   ZC040   2
2   ZC040   1

Figure 7. Example of Feature Table with Header and Records Source: (DMA 1993a, Table 6, p. 20)




VPFTableHeader
  Instance Variables:
    tableDesc       ->  ENVAREA.AFT, Environment Area...
    headerLength    ->  149 (bytes)
    byteOrder       ->  L (least-significant byte first)
    schema          ->  Collection of 3 VPFSchemaColumn instances
  Operations:
    buildSchema
    initializeFromStream: aFileStream
    skipOverHeaderInStream: aFileStream


VPFSchemaColumn
  Instance Variables:
    name            ->  F_CODE
    description     ->  FACC Code
    type            ->  T (fixed-length text)
    length          ->  5 (characters)
    keyType         ->  N (none)
    vdtFile         ->  CHAR.INT
  Operations:
    (accessing methods for instance variables)
    byteLengthFromStream: aFileStream
    datumValueFromStream: aFileStream
    putDatumValueInStreamUsingHeader: aVPFTableHeader


Figure 8. VPFTableHeader and VPFSchemaColumn Class Definitions and Example Instance Values
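As a rough illustration of how a table header might be turned into schema column objects, the following Python sketch parses header text shaped like the example in Figure 7. It is not the OVPF buildSchema method: the `SchemaColumn` dataclass, the `parse_header` function and the field order are assumptions for this example, the sample string follows Figure 7 as best it can be read (without the printed line-continuation backslashes), and the binary header length and byte order prefix are left out.

```python
from dataclasses import dataclass

@dataclass
class SchemaColumn:
    name: str
    type: str
    length: str        # a count such as "5", or "*" for variable length
    key_type: str
    description: str
    vdt_file: str

def parse_header(text):
    """Split header text into (table description, list of SchemaColumn)."""
    table_desc, _narrative, columns_text = text.rstrip(";").split(";", 2)
    columns = []
    for definition in columns_text.split(":"):
        if not definition.strip():
            continue                      # skip the empty piece after the last ":"
        name, rest = definition.split("=", 1)
        fields = [field.strip() for field in rest.split(",")]
        columns.append(SchemaColumn(name.strip(), *fields[:5]))
    return table_desc, columns

header = ("ENVAREA.AFT, Environment Area Feature Table;-;"
          "ID=I,1,P,Row ID,-,-,:"
          "F_CODE=T,5,N,FACC Code,CHAR.VDT,-,:"
          "VAV=I,1,N,Variation Anomaly Value,INT.VDT,-,:;")
desc, schema = parse_header(header)
```

This simple splitter assumes that descriptions contain no colons or semicolons; a production reader would need to handle escaping as the specification requires.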






database metadata in creating a Smalltalk object web for this data. Notice that each of the OVPF metadata classes has pointers to two other metadata classes. This is a way of representing hierarchical containment, or aggregation, with both forward and backward pointers. For example, the libraries instance variable of VPFDatabase holds onto a collection of VPFLibrary instances, each of which holds onto a "back-pointer" to its VPFDatabase container. With this structure (starting from the bottom of the pointers in Figure 9), an individual VPFFeatureDef instance can quickly traverse its lineage to access coverage-, library-, and database-level metadata as needed.
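The forward and backward pointers can be sketched in a few lines of Python. This is only an illustration of the containment pattern, not the OVPF classes themselves: the `add_...` helper methods and the `lineage` method are hypothetical, and the library and coverage names in the usage line are illustrative sample values.

```python
class VPFDatabase:
    def __init__(self, name):
        self.name = name
        self.libraries = []                  # forward pointers

    def add_library(self, name):
        library = VPFLibrary(name, database=self)
        self.libraries.append(library)
        return library

class VPFLibrary:
    def __init__(self, name, database):
        self.name = name
        self.database = database             # back-pointer to the container
        self.coverages = []

    def add_coverage(self, name):
        coverage = VPFCoverage(name, library=self)
        self.coverages.append(coverage)
        return coverage

class VPFCoverage:
    def __init__(self, name, library):
        self.name = name
        self.library = library
        self.feature_defs = []

    def add_feature_def(self, fclass):
        feature_def = VPFFeatureDef(fclass, coverage=self)
        self.feature_defs.append(feature_def)
        return feature_def

class VPFFeatureDef:
    def __init__(self, fclass, coverage):
        self.fclass = fclass
        self.coverage = coverage

    def lineage(self):
        """Walk the back-pointers from feature definition up to the database."""
        coverage = self.coverage
        library = coverage.library
        return (library.database.name, library.name, coverage.name, self.fclass)

coastl = (VPFDatabase("DNC01").add_library("A0108280")
          .add_coverage("ECR").add_feature_def("COASTL"))
```

Each container creates its children itself, so the forward collection and the back-pointer are always set together and cannot drift apart.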

Only a couple of method names are shown in Figure 9. The VPFDatabase class has a set of methods for initializing the metadata object web (represented here by the method "initializeVPFProductFrom: pathname"). Interpretation of actual feature data based on the metadata has been made the responsibility of VPFCoverage, as this corresponds to the level at which features and graphical primitives are linked in the VPF source database.


Representation of Geo-Feature Objects


As should now be all too apparent, the complete set of data representing each geo-feature in Vector Product Format's file structure is very fragmented. One of the benefits of the object-oriented approach is to tie the pieces together with object pointers instead of join tables, for more direct access and control. The greatest single cause of the fragmentation within a given coverage is the need to represent and synchronize both spatial and nonspatial attributes of the features. This subsection describes the assembly of nonspatial aspects of each feature, while the next subsection describes the handling of spatial and topological attributes.

















Figure 9. Steps to Create Metadata Web
(Please also refer to Figure 2 on page 14)

1. Process database-level files:
- create VPFDatabase instance;
- assign rdbPath variable to hold the directory pathname
for the source database;
- store the VPFTableHeader instances created for reading the
Database Header Table (DHT) and Library Attributes Table (LAT)
in VPFDatabase instance variables.

2. Process library-level files:
- loop through Library Attributes Table (LAT) to initialize all
VPFLibraries for this database (name, bounds, scales, tile names);
- store the VPFTableHeader instances created for reading the
Library Header Table (LHT), Geographic Reference Table (GRT),
and Coverage Attributes Table (CAT) in VPFLibrary instance
variables.

3. Process coverage-level files:
- loop through Coverage Attributes Table (CAT) to initialize all
VPFCoverages for this library (name, Value Description Table (VDT)
headers, Feature Index Table (FIT) headers, all primitive table
headers, and feature-notes headers).

4. Process feature-level files:
- loop through Feature Class Attributes (FCA) to initialize all
VPFFeatureDefs for this coverage (name, Feature Table (FT) header,
prim header, prim-join headers, Value Description Table (VDT) entries that are valid for this featureDef, coverage and library).












(A) VPFDatabase
      Instance Variables:
        rdbPath
        libraries
        dhtHeader
        latHeader
      Operations:
        initializeVPFProductFrom: pathName

(B) VPFLibrary
      Instance Variables:
        database
        coverages
        tiles
        lhtHeader
        grtHeader
        catHeader

(C) VPFCoverage
      Instance Variables:
        library
        featureDefs
        spatialIndex
        level
        featureNotes
        fcaHeader
        fcsHeader
        chaHeader
        intHeader
        notHeader
        endFitHeader
        cndFitHeader
        edgFitHeader
        facFitHeader
        txtFitHeader
        endPrimHeader
        cndPrimHeader
        edgPrimHeader
        ebrPrimHeader
        facPrimHeader
        fbrPrimHeader
        rngPrimHeader
        txtPrimHeader
      Operations:
        importRelationalCoverage

(D) VPFFeatureDef
      Instance Variables:
        coverage
        features
        class
        attribsVDT
        fcsjoinDict
        ftPath
        ftHeader
        primPath
        primHeader
        primjtPath
        primjtHeader
        njtPath
        njtHeader
        notejoins






The definitional organization of feature attributes in OVPF is depicted in Figure 10. The VPFFeature hierarchy handles the nonspatial aspects of features, while the VPFFeatureSymbol class hierarchy handles the spatial aspects. The featureDef instance variable defined in VPFFeature class provides the link for each feature instance to its complete set of metadata just described (see Figure 9).

The methods shown in Figure 10 are a small subset of the full set of procedures implemented, but these are sufficient for the present discussion. Notice the ReadWriteStream object (Figure 10B) which is used to represent the attributes instance variable of each geo-feature object. The ReadWriteStream is an important object that is part of the Smalltalk system class library and is used to represent and manage sequentially-accessed data collections, much like one might think of accessing data on a magnetic tape. The contents instance variable of this object is used in this case to hold all nonspatial attribute values in a single collection of bytes, which is essentially a direct copy of the geo-feature's source data record from the VPF feature table. Two simple but important methods in VPFFeature are "valueForAttribute: aName" and "putValue: aValue forAttribute: aName." These methods provide a generalized means of accessing and modifying any one of a feature's nonspatial attributes (such as F_CODE or VAV from Figure 7 on page 50). Essentially, these methods look up the attribute's data type and position from the feature table schema (held in the ftHeader instance variable of the VPFFeatureDef metadata object), then perform the selected action on those bytes in the contents of the ReadWriteStream instance. Other methods in the VPFFeature class hierarchy not shown here include means of maintaining the correct feature-primitive linkages during changes in topology.
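The attribute lookup just described can be sketched as follows. This Python is an analogy, not the Smalltalk implementation: an `io.BytesIO` buffer plays the part of the ReadWriteStream, the `FT_SCHEMA` list stands in for the feature table schema held by the metadata object, and the method names are Python renderings of "valueForAttribute:" and "putValue:forAttribute:". Only fixed-length columns are handled here, which is an assumption made to keep the example short.

```python
import io
import struct

# (name, struct format, byte length) per column, in record order.
# Hypothetical stand-in for the schema held in the ftHeader.
FT_SCHEMA = [("ID", "<i", 4), ("F_CODE", "5s", 5), ("VAV", "<i", 4)]

class Feature:
    def __init__(self, record_bytes, schema=FT_SCHEMA):
        self.schema = schema
        self.attributes = io.BytesIO(record_bytes)   # copy of the source record

    def _locate(self, name):
        # Find the attribute's byte offset, format and size from the schema.
        offset = 0
        for column, fmt, size in self.schema:
            if column == name:
                return offset, fmt, size
            offset += size
        raise KeyError(name)

    def value_for_attribute(self, name):
        offset, fmt, size = self._locate(name)
        self.attributes.seek(offset)
        (value,) = struct.unpack(fmt, self.attributes.read(size))
        return value.decode("ascii") if isinstance(value, bytes) else value

    def put_value(self, value, name):
        offset, fmt, size = self._locate(name)
        if isinstance(value, str):
            value = value.encode("ascii")
        self.attributes.seek(offset)
        self.attributes.write(struct.pack(fmt, value))

feature = Feature(struct.pack("<i5si", 1, b"ZC040", 2))
feature.put_value(7, "VAV")
```

Because the record stays in its source byte layout, exporting an edited feature back to a relational table amounts to writing the buffer out again.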






VPFFeatureSymbol objects hold onto the actual graphic primitives in their graphicElements instance variable (this is the subject of the next topic). VPFFeatureSymbols also respond to display-related requests from the graphical user interface.


Representation of Graphical Primitives and Topological Relationships


One of the more intriguing issues was deciding how to represent spatial topology (adjacency and contiguity of graphical primitives) within the Smalltalk object-oriented data model. Each feature object is associated with a set of latitude-longitude coordinates, referred to as graphical primitives. Point features are associated with entity- and connected-node primitives. Line features are associated with edge primitives. An area feature is associated with a face primitive consisting of a ring of edge primitives, and text features are associated with text primitives.

Because any one line feature object may consist of multiple edges, and any single node, edge or face primitive could be used by more than one feature object, great care must be taken to maintain the correct linkages between the features and primitives. Topological relationships among the primitives must also be maintained across all features within a given coverage and tile, according to the VPF specification.

OVPF's predecessor, Objective Facilities Management (OFM), introduced an object known as a DrawOrders (see footnote 3). This is a very simple structure whose inspiration is drawn from Digitalk's Smalltalk/V for OS/2 Presentation Manager (Digitalk 1989, p. 464).


3. The DrawOrders class and related GraphicsEngine were initially developed by Bob Williams
for OFM.
















Figure 10. Representation of Geo-Features in OVPF

(A) VPFFeature abstract class hierarchy:
- VPFFeature class provides shared definition and methods for
accessing and modifying attributes and defaultColor.
- VPFLineFeature class provides shared definition of
defaultLineType.
- These classes are not instantiated, but are abstract superclasses
of concrete feature classes.
- Each subclass has its own private copy of a value for
defaultColor, defaultLineType and defaultAreaPattern
(where defined).
- The featureDef instance variable holds onto an object
pointer to the metadata objects for each feature class.

(B) Instances of ReadWriteStream class are used to hold
nonspatial attributes of each instance of a VPFFeature subclass; ReadWriteStream instances understand how to read ("next" message), write ("nextPut" message),
and reposition themselves ("reset" message and others).

(C) VPFFeatureSymbol class hierarchy:
- These classes provide shared definition and methods for
accessing and modifying the graphical primitives for a
given feature (see Figure 11 below).
- VPFFeatureSymbol is an abstract class with no instances,
but instances are made from each of its subclasses.










(A) VPFFeature
      Class Instance Variables:
        defaultColor
      Instance Variables:
        id
        featureDef
        attributes
        notes
        symbol
      Operations:
        valueForAttribute: aName
        putValue: aValue forAttribute: aName

    Subclasses of VPFFeature:
      VPFPointFeature
      VPFLineFeature
        Class Instance Variables: defaultLineType
      VPFAreaFeature
        Class Instance Variables: defaultAreaPattern
      VPFTextFeature

(B) ReadWriteStream
      Instance Variables:
        contents
        position
      Operations:
        next: anInteger
        nextPut: anObject
        nextPutAll: aCollection
        reset

(C) VPFFeatureSymbol
      Instance Variables:
        feature
        graphicElements
        boundingBox
        color
        isHilighted
      Operations:
        beErased
        beHilighted
        beUnhilighted

    Subclasses of VPFFeatureSymbol:
      VPFPointFeatureSymbol
      VPFLineFeatureSymbol
        Instance Variables: lineType
      VPFAreaFeatureSymbol
        Instance Variables: areaPattern






The structure contains a variable-length array of bytes (the contents attribute). Each contents array has the following implicit organization:

* opcode -- a single byte whose integer value (0 - 255) represents an operation code, such as set polyline, continue line, set color, etc.

* byte length -- a single byte whose integer value (0 - 255) represents the number of bytes remaining in this draw order.

* data bytes -- the bytes whose integer or floating-point values represent the location points, the line-color index, etc. for this draw order.

The contents byte-arrays from several DrawOrders can be concatenated into a single DrawOrders instance, to include an arbitrary number of instructions for displaying complex graphical objects. This structure is not only versatile, it is very compact and efficient for representing variable-length locational coordinate data.
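The byte layout described above can be sketched in Python as follows. The opcode values, the `order` constructor and the use of 4-byte floats for coordinates are assumptions for this example; the source does not list the actual operation codes or coordinate encoding used by OFM or OVPF.

```python
import struct

# Hypothetical opcode values; the source does not give the real codes.
SET_COLOR = 1
SET_POLYLINE = 2

class DrawOrders:
    def __init__(self, contents=b""):
        self.contents = bytes(contents)

    @classmethod
    def order(cls, opcode, data):
        """Build a single draw order: opcode byte, length byte, data bytes."""
        if len(data) > 255:
            raise ValueError("a single draw order holds at most 255 data bytes")
        return cls(bytes([opcode, len(data)]) + data)

    def __add__(self, other):
        # Concatenating contents yields one instance with several instructions.
        return DrawOrders(self.contents + other.contents)

    def __iter__(self):
        # Walk the byte array: opcode, byte length, then that many data bytes.
        pos = 0
        while pos < len(self.contents):
            opcode, length = self.contents[pos], self.contents[pos + 1]
            yield opcode, self.contents[pos + 2 : pos + 2 + length]
            pos += 2 + length

color = DrawOrders.order(SET_COLOR, bytes([4]))
line = DrawOrders.order(SET_POLYLINE, struct.pack("<4f", -76.3, 36.9, -76.2, 37.0))
symbol = color + line
decoded = list(symbol)
```

The length byte is what allows a renderer to skip over orders it does not understand, which is one reason such a format stays compact while remaining extensible.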

Even without the need to manage spatial topology, DrawOrders are useful objects for handling graphical data and operations. However, supporting VPF graphical primitives with full spatial topology requires a refinement of this definition, so these primitives are implemented by subclassing the DrawOrders class. A straightforward example would be to have EntityNode, ConnectedNode, Edge, Face, and Ring classes defined as direct subclasses of DrawOrders. In this way, each subclass would inherit the DrawOrders contents instance variable, and add its own specific topological attributes as needed. It is also important, however, for each graphical object to hold onto a collection of the OVPF feature objects that use that graphical object. This is handled in OVPF by defining the TopologicalStructure class as a subclass of DrawOrders and as a superclass of each VPF graphical primitive class (see Figure 11). The TopologicalStructure's features attribute is






handled as a collection of VPFFeatures because a given unique graphical primitive may be used to help draw any number of VPF geo-features. Each feature object holds onto an identity-pointer to its corresponding collection of graphical primitive objects, thus enabling both features and primitives to have access to each other.

In addition, the primId (primitive ID) and tileId attributes of TopologicalStructure are inherited by each subclass, providing a holding place for primary-key data from the relational-VPF files. For simplicity of supporting both import and export operations with the relational-VPF data, the primId, tileId, and topological attributes are assigned the VPF record ID value of the corresponding graphical primitive objects, rather than unique object-identity pointers. As features are added, deleted, and moved with respect to each other, these primId values are maintained just as they would be in a relational GIS framework.


VPFDrawOrder                          (superclass)
  Instance Variables: contents

  VPFTopologicalStructure             (subclass of VPFDrawOrder)
    Instance Variables: features, primId, tileId

    VPFEntityNode
      Instance Variables: containingFace
    VPFConnectedNode
      Instance Variables: firstEdge
    VPFEdge
      Instance Variables: startNode, endNode, leftEdge, rightEdge,
                          leftFace, rightFace
    VPFFace
      Instance Variables: ringPtr
    VPFRing
      Instance Variables: firstEdge
    VPFTextPrim
      Instance Variables: text, shapeLine


Figure 11. Object Definitional Hierarchy for Representing VPF Graphical Primitives with Spatial Topology Source: after (Arctur et al. 1995b, p. 14)






Presently, all source data comes to OVPF from relational-VPF databases. At this stage in the prototype development, we have assumed that all feature attribute, location, and topological relationships in the source data are initially correct. Thus we can focus our attention on developing full build and clean topological support (ESRI 1994) in a stepwise manner, beginning with simply maintaining topology locally during individual feature changes. We now have the capability to interactively add, delete, and change location coordinates of a single point, line or area feature at a time within a given tile, while maintaining correct topological relationships with adjacent and contiguous features (Chung et al. 1995). This is handled with the help of a graphical user interface (GUI) that requires the user to accept and commit changes to each topological relationship.


Design of an Object-Oriented Spatial Index


The spatial index framework in OVPF is implemented with just two main classes (this is a slight simplification for purposes of discussion). These classes are the VPFSpatialDataManager and VPFSpatialDataCell, shown in Figure 12. This framework presently uses a quadtree organization (after Samet 1994; see footnote 4), in which each level of the tree represents a rectangular geographic area and can be divided into four equally-sized quadrants. Each quadrant in turn can be subdivided, continuing recursively until some predetermined limit is met. Within this structure, each geo-feature is inserted into the smallest quadtree cell which can completely contain it. This cell is the feature's spatial




4. The quadtree structure used in OVPF is a simplified form of a spatial index structure originally
implemented by Bob Williams for OFM.






index. Insertion and queries start from the root (topCell) and progress recursively, until one of two conditions is met: (1) the smallest cell has been found, or (2) the maximum number of levels allowed has been reached. A maximum depth is necessary to limit the extent of recursion for very small geo-features such as points. In OVPF we have used a maximum level of 20. Note that features are indexed, not graphical primitives. This is to reduce the complexity and computational overhead of insertion and retrieval.
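The insertion rule can be illustrated with a compact Python sketch. It is not the OVPF Smalltalk code: the `Cell` class, its method names, and the tuple representation of rectangles as (x0, y0, x1, y1) are assumptions for this example, and subcells are created on demand as one possible reading of the design.

```python
def contains(outer, box):
    """True if rectangle outer = (x0, y0, x1, y1) fully encloses box."""
    return (outer[0] <= box[0] and outer[1] <= box[1]
            and box[2] <= outer[2] and box[3] <= outer[3])

def overlaps(a, b):
    """True if rectangles a and b share any area or edge."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

class Cell:
    def __init__(self, bounds, level=0, max_level=20):
        self.bounds, self.level, self.max_level = bounds, level, max_level
        self.features = []            # (feature, bounding box) pairs
        self.subcells = [None] * 4    # quadrants, created on demand

    def quadrant(self, index):
        x0, y0, x1, y1 = self.bounds
        xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
        return [(x0, y0, xm, ym), (xm, y0, x1, ym),
                (x0, ym, xm, y1), (xm, ym, x1, y1)][index]

    def container_for(self, box):
        """Descend to the smallest cell that completely contains box."""
        if self.level < self.max_level:
            for index in range(4):
                if contains(self.quadrant(index), box):
                    if self.subcells[index] is None:
                        self.subcells[index] = Cell(
                            self.quadrant(index), self.level + 1, self.max_level)
                    return self.subcells[index].container_for(box)
        return self

    def insert(self, feature, box):
        cell = self.container_for(box)
        cell.features.append((feature, box))
        return cell

    def query(self, box):
        """Return features whose bounding boxes overlap box, at any level."""
        if not overlaps(self.bounds, box):
            return []
        found = [f for f, fbox in self.features if overlaps(fbox, box)]
        for sub in self.subcells:
            if sub is not None:
                found.extend(sub.query(box))
        return found

top = Cell((0, 0, 16, 16), max_level=3)
road_cell = top.insert("road", (1, 1, 3, 3))
lake_cell = top.insert("lake", (6, 6, 10, 10))
buoy_cell = top.insert("buoy", (5, 5, 5, 5))
```

In this run the lake straddles the center of the map and therefore stays in the top cell, while the point feature descends until the depth limit stops the recursion, which is the situation the maximum level is meant to handle.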

This design has some very interesting implications and potentials, which are brought out in the final Discussion section. One point to mention now, however, is that because each VPFSpatialDataCell holds onto direct object pointers to the features which fit within its boundaries, the quadtree is more than just an index; it is an efficient, general purpose container structure for all the geo-features. This proves useful in the design of the ODBMS repository, which is the next topic.


Organization of Object Webs in ODBMS Repository


There are two main groupings of database objects in OVPF: the metadata objects (table headers, schema definitions, value descriptions, and others); and the geo-feature objects and primitives. Each of these object groups needs to be stored in the ODBMS repository, that is, "made persistent." Another category of OVPF objects includes the user interface classes. These are the support classes which present the map on the computer screen and allow interaction with the user. It is very important that the user interface classes are not made persistent, for reasons that will be presented shortly.

Typically, a complete web of objects is made persistent by reference to some root or parent object for the set during the course of a database transaction. This root object















Figure 12. Principal Classes and Behavior for Quadtree Spatial Index

(A) VPFSpatialDataManager is a subclass of Object, and
understands how to:
- create and initialize a quadtree;
- pass a geo-feature to the quadtree for insertion;
- ask the quadtree to remove a given geo-feature; and
- ask the quadtree for all features within a given area.

(B) VPFSpatialDataCell is a subclass of Array, having four
indexed slots in addition to the named instance variables.
These indexed slots each hold onto an object pointer to
another instance of VPFSpatialDataCell.

Each cell understands how to:
- determine if it is the smallest cell capable of containing the
rectangular area requested;
- propagate the request for the smallest cell recursively
to the next lower-level cell;
- propagate the request back up one level if it cannot hold
the requested rectangle; and
- gather and return pointers to all features contained within
a given rectangle, regardless of the number of levels
involved.












VPFSpatialDataManager
Instance Variables:
coverage
topCell
maxLevel
Operations:
initializeMin: minPt max: maxPt
collectionOfContainersFor: aRectangle
containerFor: aRectangle
returnSetOfIntersectingFeatures: aRectangle

(A)





(Array)
VPFSpatialDataCell
Instance Variables:
id
superCell
level
manager
origin
corner
width
features
(four array slots for subcells)
Operations:
canContainBoundingBox: aRectangle
containedLowerLevelContainerForBoundingBox: aRectangle
containerForBoundingBox: aRectangle
createLowerLevelCellForIndex: index
lowerLevelContainerForBoundingBox: aRectangle
upperLevelContainerForBoundingBox: aRectangle

(B)







can provide a named entry point to the persistent object web for future access by other application programs. In the case of the metadata object web, the root is a collection object of all initialized databases, keyed by their database name. For example, this collection has a member called 'DNC01' which points to the persistent database for Norfolk Harbor.

For the feature objects, one logical root object is the spatial tree manager, which holds a pointer to the linked list of spatial tree cells, each of which holds pointers to the features whose bounding rectangle falls within the cell's boundaries. Each feature object (instance of a VPFFeature subclass) holds onto its attributes stream and its symbol (instance of a VPFFeatureSymbol subclass). Since each coverage has its own spatial index, the spatialIndex instance variable of VPFCoverage was defined to hold the persistent pointer to the VPFSpatialDataManager instance in charge of the coverage's quadtree (see Figure 9 on page 52).

Another logical root object for feature objects is the instance of VPFFeatureDef which defines a given feature class. Providing access to features via their VPFFeatureDef instance would be useful in certain query optimizations. The features instance variable was thus defined for VPFFeatureDef class, to hold a second set of direct pointers to the persistent feature objects (also shown in Figure 9).

Establishing cut-points in object webs

Normally, a request to make an object persistent results in migrating the complete transitive closure (see footnote 5) of all objects to which the requested object points, into the external database. A case where this is not desirable is where links to the user interface are held by persistent objects. One reason a user interface object should not be made persistent is that it contains numerous references to transient objects that can only be assigned and changed






by the host operating system, such as window handles, file handles, and so on. The other main reason a user interface object should not be made persistent is that it touches so much of the Smalltalk run-time environment objects that it would essentially pull the entire Smalltalk memory image into the external database with it.

ObjectStore provides means of resolving this issue with the notion of cut-points. By adding a particular method to each of the user interface classes, the transitive closure operation can be made to insert a cut-object in place of the reference to the user interface object itself. This cut-object reference is then replaced at run-time by the "live" object reference when needed.
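The cut-point idea has a close analogue in Python's standard pickle module, which is used below purely as an illustration; it is not the ObjectStore mechanism or its API. The `persistent_id` and `persistent_load` hooks are real pickle features, while the `MapWindow` class, the "map-window" token and the dictionary used as a feature record are hypothetical stand-ins.

```python
import io
import pickle

class MapWindow:
    """Stands in for a user interface object holding OS resources."""
    def __init__(self, handle):
        self.handle = handle

class CutPickler(pickle.Pickler):
    def persistent_id(self, obj):
        # Returning a non-None token stores the token instead of the object,
        # so nothing reachable only through the window is written out.
        return "map-window" if isinstance(obj, MapWindow) else None

class CutUnpickler(pickle.Unpickler):
    def __init__(self, file, live_window):
        super().__init__(file)
        self.live_window = live_window

    def persistent_load(self, pid):
        # Swap the token for the live object of the current session.
        if pid == "map-window":
            return self.live_window
        raise pickle.UnpicklingError(f"unknown cut-object {pid!r}")

# A persistent record that happens to hold a link to the user interface.
feature = {"name": "COASTL", "window": MapWindow(handle=101)}

buffer = io.BytesIO()
CutPickler(buffer).dump(feature)
buffer.seek(0)

new_window = MapWindow(handle=202)          # window of a later session
restored = CutUnpickler(buffer, new_window).load()
```

The stored bytes never contain the window or anything reachable only through it, so the closure stops at the cut, and the reference is rebound to whatever live window exists when the data is read back.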


Design of a Rule-Base Framework to Support Geographic Feature Editing


The rule-based framework was added to OVPF in the second year of the project as a means to help enforce data integrity constraints on features during interactive updates. Rules in this framework can be defined to "fire" upon occurrence of a particular event, subject to arbitrary conditions anywhere in the database. Should one of the rules be triggered and its associated conditions hold true, then a predefined action would be carried out. The following discussion shows how this is implemented in OVPF.










5. Transitive closure is a term from graph theory, denoting the set of all pairs of nodes directly
or indirectly connected by a sequence of edges. In the case of object webs, it refers to all
objects connected by association or containment from a given root object (after Rumbaugh
et al. 1991, p. 57)







Event Objects


Events are first-class objects in this framework as they have significant state and behavior (Arctur et al. 1995d). The PrimitiveEvent class in Figure 13 defines an "eventMsg" attribute which is inherited by all its subclasses. For each new instance of any event, this attribute is assigned the name of the message for which the event is raised. The ComplexEvent class defines further attributes used by its own subclasses.




PrimitiveEvent                        (superclass)
  Instance Variable:
    eventMsg
  Operations:
    notify:

  ComplexEvent                        (subclass of PrimitiveEvent)
    Instance Variables:
      event1
      event2
      event1Occurred
      event2Occurred

    ConjunctionEvent                  (subclass of ComplexEvent)
      Operations: notify:
    DisjunctionEvent                  (subclass of ComplexEvent)
      Operations: notify:
    SequenceEvent                     (subclass of ComplexEvent)
      Operations: notify:

Figure 13. Event Class Hierarchy


The key method for each of the event classes is notify:. This method takes only one argument which specifies the name of the message which causes an event to be raised. The event is raised when the object(s) associated with the event object receives that message. For PrimitiveEvents the notify: method simply compares the argument to its own eventMsg attribute value and returns true if they match. For each ComplexEvent subclass, the notify:






method also examines a particular combination of the status of its other attributes, before returning true or false. An event instance is typically created at the time of rule creation.
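A minimal Python sketch of the event hierarchy follows. The class and attribute names come from Figure 13, but the exact combination logic of each ComplexEvent subclass is not spelled out in the text, so the semantics below (both seen, either seen, first then second) are assumptions, as are the message names used in the example.

```python
class PrimitiveEvent:
    def __init__(self, event_msg):
        self.event_msg = event_msg

    def notify(self, message):
        # Raised when the message matches the one this event was created for.
        return message == self.event_msg

class ComplexEvent(PrimitiveEvent):
    def __init__(self, event1, event2):
        super().__init__(None)
        self.event1, self.event2 = event1, event2
        self.event1_occurred = self.event2_occurred = False

    def _update(self, message):
        # Remember which component events have been seen so far.
        self.event1_occurred |= self.event1.notify(message)
        self.event2_occurred |= self.event2.notify(message)

class ConjunctionEvent(ComplexEvent):
    def notify(self, message):
        self._update(message)
        return self.event1_occurred and self.event2_occurred

class DisjunctionEvent(ComplexEvent):
    def notify(self, message):
        return self.event1.notify(message) or self.event2.notify(message)

class SequenceEvent(ComplexEvent):
    def notify(self, message):
        # event2 counts only if event1 was seen on an earlier message.
        if self.event1_occurred and self.event2.notify(message):
            self.event2_occurred = True
            return True
        self.event1_occurred |= self.event1.notify(message)
        return False

moved = PrimitiveEvent("location:")      # hypothetical message names
deleted = PrimitiveEvent("delete")
sequence = SequenceEvent(moved, deleted)
results = [sequence.notify(m) for m in ("delete", "location:", "delete")]
```

Because complex events keep state between notifications, one event instance per rule is the natural arrangement, which fits the remark that events are typically created at rule creation time.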


Rule Objects


VPFRule objects have the structure shown in Figure 14. A single class suffices for defining all rules. The feature instance variable may be assigned a pointer to either a single geographic-feature instance, such as a road or lake, or to a feature class, such as the defining class for roads or lakes. In the former case, the rule will be applied only to a particular instance, whereas in the latter, the rule will be applied to all instances of the defining class.




VPFRule
  Instance Variables, with descriptions:
    feature         pointer to a geographic-feature class or feature instance
    event           pointer to an instance of PrimitiveEvent or one of its
                    subclasses
    condition       condition-test method name
    action          action method name
    actionPriority  integer value 1 (low) to 100 (high)
    preOrPost       flag specifying if condition is tested before or after
                    the message raising the event

Figure 14. Structure of a Rule Object



The event instance variable is assigned a pointer to a specific event instance (introduced above), which could be either a PrimitiveEvent or a ComplexEvent. The condition attribute is assigned the name of a method to be executed at the time the event is signalled, which will return true if the condition is met and false otherwise. The action method is then executed if the condition evaluates to true. The preOrPost attribute specifies the relative timing for execution of the condition method with respect to the message raising






the event. The condition may be evaluated either before the event message is executed, or upon completion and return from the event message execution. The actionPriority attribute value is used to help mediate in situations where multiple rules fire at the same time.


Event Detection Mechanism


The final component of this framework is the mechanism by which events are detected and rules are fired. In the OVPF viewer/editor tool, all changes to geo-feature objects are handled through the use of FeatureConstructor objects, which use a script-based framework with a state machine, supporting asynchronous events for flexibility in working with runtime-dependent constraints on changes to a given feature (see footnote 6). This framework has the potential for extending its own semantics at runtime. See Figure 15 for a simplified representation of the VPFFeatureConstructor hierarchy.

With this framework, a user request to create or modify a geo-feature via the GUI is forwarded to the appropriate PointFeatureConstructor, LineFeatureConstructor or AreaFeatureConstructor. The constructor is given the name of the geo-feature class, which it instantiates with default values for all attributes. In the case of creating a new geo-feature object, the constructor then prompts the user for the feature's location. At this point, the constructor notifies the new feature object of the intended action. This notification results in a lookup to the feature's rule base. Any rules having events defined for the current operation will have the opportunity to check for any particular conditions in




6. The FeatureConstructor framework was first developed by Bob Williams for OFM. Very
minor changes were needed to accommodate the rule-based capability.










VPFFeatureConstructor
  Instance Variables:
    feature
    nextAction
    point
  Operations:
    stopCreateFeature:

  VPFPointFeatureConstructor
    Operations:
      point1:

  VPFLineFeatureConstructor
    Operations:
      point1:
      point2:

  VPFAreaFeatureConstructor
    Operations:
      points:

Figure 15. VPFFeatureConstructor Hierarchy



the database that are of interest. After all rules' conditions have been checked, those which evaluated true are sorted in priority order, and their respective actions are performed. An example of usage is provided in the Results section following.
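The check-then-sort-then-act sequence can be sketched briefly in Python. This is a simplification under stated assumptions, not the OVPF mechanism: the event is reduced to a plain message name rather than an event object, the pre/post timing flag is omitted, conditions and actions are Python callables rather than method names, and "priority order" is taken to mean highest priority first.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Rule:
    event_msg: str                       # message that raises the event
    condition: Callable[[dict], bool]    # condition test
    action: Callable[[dict], None]       # action to perform
    action_priority: int = 1             # 1 (low) to 100 (high)

def fire_rules(rules, feature, message):
    """Check every rule raised by message, then act in priority order."""
    triggered = [rule for rule in rules
                 if rule.event_msg == message and rule.condition(feature)]
    triggered.sort(key=lambda rule: rule.action_priority, reverse=True)
    for rule in triggered:
        rule.action(feature)
    return triggered

# Hypothetical rules on a feature represented as a dictionary.
rules = [
    Rule("location:", lambda f: f["depth"] < 0,
         lambda f: f["log"].append("warn"), 10),
    Rule("location:", lambda f: True,
         lambda f: f["log"].append("reindex"), 90),
    Rule("delete", lambda f: True,
         lambda f: f["log"].append("unlink"), 50),
]
buoy = {"depth": -2, "log": []}
fired = fire_rules(rules, buoy, "location:")
```

All conditions are evaluated before any action runs, so an early action cannot change the outcome of a later rule's condition within the same event.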














RESULTS


This section presents a summary of findings from this research and development. This is in two parts: the first part shows examples of general usage, and the second part describes the operation of the rule-based framework.


OVPF Application Overview


The diagram in Figure 16 shows functional relationships among the principal modules of the OVPF application. The two main points of control in this figure are the graphical user interface (GUI) and the metadata framework. The GUI provides the user




[Figure 16 diagram. Components: Metadata (Schemata); VPF Geo-Relational Data Files; Geo-Feature Objects; ObjectStore Geo-Object DBMS; Graphical User Interface (GUI); Quadtree Spatial Index. Legend: functional associations; data flow.]

Figure 16. Principal OVPF Components






the menus and programming access with which to direct the operation of OVPF. The metadata framework carries out the bulk of the processing of VPF source data for migration to the internal object model and to the ODBMS. The metadata model is also responsible for exporting edited OVPF data back to the relational VPF file structure.


Transformation of Relational Vector Product Format Data to an Object Web


As described in the preceding Materials and Methods section, the import of source data into the OVPF application at runtime is accomplished in stages. First, the metadata object web for a given VPF database is initialized, after which the geo-feature data can be interpreted and displayed on the screen. As the geo-feature data is read by OVPF, it is inserted into the quadtree spatial index structure. Figure 17 shows all the significant definitional relationships among the metadata, the feature objects, and the spatial index structure. Figure 18 shows the dynamic associations among the runtime instances of metadata, feature, and spatial index objects.


Displaying Spatial Features


An overview of the main steps in reading, indexing and displaying a Vector Product Format map (from either the relational source files or from the ODBMS) is depicted in Figure 19 below. Figure 20 shows a "screen capture" of the OVPF map window display with a portion of the Norfolk Approach library.






















Figure 17. Key Definitions and Relationships for OVPF Database Classes

(A) Metadata Classes

(B) Feature and Spatial Index Classes

Source: after (Arctur et al. 1995a; Cobb et al. 1995a)














[Figure 17 diagram. (A) Metadata classes: VPFDatabase, VPFLibrary, VPFCoverage, VPFFeatureDef, VPFTableHeader. (B) Feature and spatial index classes: VPFSpatialDataMgr, VPFSpatialDataCell, VPFFeature, VPFFeatureSymbol, VPFDrawOrder, ReadWriteStream.]






















Figure 18. Sample OVPF Data for a DNC Coastline Feature

(A) Metadata Objects

(B) Feature and Spatial Index Objects

Source: after (Arctur et al. 1995a; Cobb et al. 1995a)




















[Figure 18 diagram: sample object instances for a DNC coastline feature, arranged in panels (A) and (B). Legible values include a DNCDatabase with databaseName DNC01, a DNCFeatureDef with fclass COASTL and descr "Coastline Lines", a feature with isHilighted false, and a top-level quadtree cell with level 0 and a 4-cell subCells array.]
















Figure 19. Transfer of Spatial Features from VPF to OVPF

Geo-features are

(1) imported either from relational VPF files or from ODBMS, then

(2) placed in quadtree, and

(3) rendered on screen.

Note that the ODBMS contains whole features, while 4 or more georelational files are required to define each individual feature.

Source: after (Arctur et al. 1995c)

















[Figure 19 diagram: georelational disk files and the ODBMS repository supply feature objects, which are placed in the quadtree and then rendered on the screen display.]






[Figure 20 screen capture: the OVPF map window for the DNC01 database, with selection lists for coverages and features, display toggles for Point, Line, Area and Text features, and a map shown at scale 1:276717.]


Figure 20. OVPF Map Display of Multiple Coverages in Norfolk Approach Library of DNC01





Migrating Object Webs to ODBMS



As mentioned in the Methods section, not all of the OVPF data should be placed in the ODBMS repository. In particular, the GUI objects should not be allowed to migrate to the persistent data store, as this would inevitably result in migrating most of the Smalltalk development environment through the transitive closure from the GUI root objects. Figure 21 shows the relationships among the main groupings of objects in OVPF, and which of them are managed by the ODBMS.
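The transitive-closure concern can be illustrated with a small reachability sketch. It is written in Python rather than Smalltalk, and the object names and the reference graph are hypothetical; it is not OVPF or ODBMS code. It only shows that anything reachable from a root object is pulled along with it, so a GUI root can drag in far more than a data root does.

```python
from collections import deque


def transitive_closure(roots, references):
    """Return every object reachable from roots by following references."""
    reached = set(roots)
    queue = deque(roots)
    while queue:
        obj = queue.popleft()
        for target in references.get(obj, ()):
            if target not in reached:
                reached.add(target)
                queue.append(target)
    return reached


# Hypothetical object graph: each key refers to the objects in its list.
references = {
    "mapWindow": ["mapPane", "windowSystem"],
    "mapPane": ["quadtree", "graphicsEngine"],
    "windowSystem": ["developmentEnvironment"],
    "quadtree": ["cell", "feature"],
    "cell": ["feature"],
    "feature": ["featureDef"],
}

from_gui = transitive_closure(["mapWindow"], references)
from_data = transitive_closure(["quadtree"], references)
```

Starting from the hypothetical GUI root, the closure includes the development environment; starting from the quadtree, it stays within the feature data.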











[Figure 21 diagram: the principal groups of OVPF objects and their linkages, divided into panels (A) and (B).]

OVPF User Interface - instances of:
- VPFMapWindow subclasses
- VPFMapPane subclasses
- VPFFeatureEditor subclasses
- VPFGraphicsEngine

Metadata - instances of:
- VPFDatabase
- VPFLibrary
- VPFCoverage
- VPFTableHeader

Spatial Tree - instances of:
- VPFSpatialDataManager
- VPFSpatialDataCell

Feature Data - instances of:
- VPFFeature subclasses
- VPFFeatureSymbol subclasses

Graphic Primitive - instances of:
- VPFDrawOrders subclasses


Figure 21. Persistency and Linkages of Principal OVPF Components
(A) Non-persistent objects
(B) Persistent objects
Source: after (Arctur et al. 1995a; Cobb et al. 1995a)


Applying the Rule-Base Framework for Feature Editing


In order to demonstrate the rule-based framework, an example rule to prevent any BuildingPoint geographic features from being placed over water was implemented and tested. Sample VPFRule and VPFPrimitiveEvent object structures are shown in Figure 22. In this case, the VPFRule instance is associated with the BuildingPoint class and thus will be applied to all instances of that class. Alternatively, the user may associate the rule with a particular BuildingPoint instance. Due to the setting of the preOrPost attribute, the condition method onWater: is evaluated before the Event's eventMsg (the newPoint: method) is carried out. If the condition method onWater: returns true, the action method stopCreateFeature will then be executed, which will prevent the eventMsg method newPoint: from being performed. The actionPriority setting ensures this action will have the highest priority among any other VPFRules which may also fire.



[Figure 22 diagram: a VPFRule object with the attributes feature, event, condition ('onWater:'), action ('stopCreateFeature'), actionPriority and preOrPost, shown together with the DNCBuildingPoint class and a PrimitiveEvent whose eventMsg is 'newPoint:'. A legend identifies attributes, association pointers and value objects.]

Figure 22. Example Rule and Event Objects
Source: (Arctur et al. 1995d)

At this point we need to introduce the rest of the framework in which Events are detected and Rules are fired. In the OVPF viewer/editor tool, all changes to geographic-feature objects are handled through the use of FeatureConstructor objects (see Figure 23). With reference to our example for creating a new BuildingPoint feature, we assume a Rule-Event pair has already been created (for checking if a new point feature is over water) and stored in the DNCBuildingPoint's rules dictionary (a class instance variable defined in the VPFFeature class, Figure 23B). This rule base is actually stored physically in the ODBMS.





VPFFeatureConstructor
    Instance Variables: feature, nextAction, point
    Operations: onWater:, stopCreateFeature
    Subclasses:
        VPFAreaFeatureConstructor
        VPFLineFeatureConstructor
        VPFPointFeatureConstructor
            Operations: point1:

(A)


VPFFeature
    Class Instance Variables: rules
    Operations: notify:argList:preOrPost:from:, newPoint:

(B)


Figure 23. Key Components of Event Detection Framework
(A) Partial FeatureConstructor Class Hierarchy
(B) Partial Feature Class Hierarchy
Source: after (Arctur et al. 1995d)


The following sequence of events could then take place at the user's initiation (step numbers correspond to those in Figure 24):

1. The user chooses the appropriate OVPF menu option to add a new geographic feature, and selects BuildingPoint from a list of available feature classes.

2. The OVPF graphical user interface (GUI) creates a PointFeatureConstructor.





[Figure 24 diagram: the "Direction of Messages" column traces the messages passed among the User, the OVPF GUI, the PointFeatureConstructor, the BuildingPoint feature, its Rules, and the Spatial Quadtree. The "Action Summary" column reads as follows.]

1. User chooses menu option to add a new feature.

2. GUI creates a constructor for the new feature object.

3. Constructor creates a default instance of BuildingPoint, and requests coordinate point from GUI.

4. GUI returns user-defined location coordinates for new feature.

5. Constructor sends message: feature notify: 'newPoint:' argList: (point) preOrPost: 'pre' from: self

6. Feature scans rule base for rules with event message 'newPoint:'.

7. Feature finds rule and evaluates condition message: constructor perform: 'onWater:'; constructor then queries ODBMS and returns true or false.

8. If condition evaluates true, feature sends message: constructor perform: 'stopCreateFeature'

9. If constructor has to stopCreateFeature, then constructor assigns 'stop' value to its nextAction attribute.

10. If constructor's nextAction is 'stop' it discards the new feature.

11. If constructor's nextAction was not 'stop' then it sends the message: feature newPoint: point, and finally inserts the new feature in the quadtree.

Figure 24. Flow of Control and Behavior For Rule-Event Example
Source: after (Arctur et al. 1995d)





3. The Constructor creates a default BuildingPoint feature object, and initiates a request to the GUI for a user-selected location coordinate point, to be returned via the point1: message.

4. On instruction from the GUI, the user chooses a location on the map with the mouse, and the GUI returns it as the argument in the point1: message to the Constructor.

5. Within its point1: method, the Constructor notifies the new BuildingPoint feature instance of an impending Event via the parameterized notify:argList:preOrPost:from: message.

6. The new BuildingPoint object executes the inherited notify:argList:preOrPost:from: method, which checks the rule base for all Rule-Event pairs whose eventMsg matches the notify: argument, in this case newPoint:.

7. If a matching Rule-Event pair is found, then the Rule's condition value (onWater:) is sent as a message to the Constructor to perform. The Constructor's onWater: method checks the database for any water-related features within a given tolerance of the user-selected coordinates, and returns true or false. By user's preference, this check can be performed either on just the features currently being displayed, or on features from all coverages in the ODBMS.

8. If the onWater: method returns true (a coincident water feature was found), the Rule's action message is then sent to the Constructor. In this case, if water features were found, the message stopCreateFeature would be the action message sent to the Constructor. Note that in the present framework, all applicable conditions are evaluated before any actions are performed. If multiple conditions return true, their action messages are sent to the PointFeatureConstructor in order of decreasing actionPriority.

9. If the Constructor receives the message stopCreateFeature, it will set its nextAction attribute to 'stop'.

10. Upon completion of all applicable conditions and actions, the new BuildingPoint object returns from executing the notify:argList:preOrPost:from: method. The thread of control reverts to the Constructor's point1: method, which then checks its nextAction setting. If it is 'stop' then the new default BuildingPoint feature is discarded, and control returns to the user with a descriptive dialog message.

11. If the nextAction is not 'stop' then the Constructor sends the newPoint: message to the new BuildingPoint, inserts it in the spatial quadtree, and presents the user with a dialog window to fill in any BuildingPoint feature attributes needed.

This simple example can easily be extended to encompass multiple rules for a given geographic feature, as well as to handle multiple features. In addition to the "either-or" situation represented in this example, a rule could be based on prerequisite and corequisite existence of other features, even occurring in a particular temporal sequence or logical combination. It is simply necessary for the FeatureConstructor class or one of its subclasses to mediate all requests for changes or additions to geographic features by the user, and for the FeatureConstructor method invoked to check the affected feature's rule base.
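The flow of control in steps 1 through 11 can be sketched compactly. The sketch is written in Python rather than Smalltalk and is not the OVPF source: the snake_case names mirror the messages named in the text (notify:argList:preOrPost:from:, onWater:, stopCreateFeature, newPoint:, point1:), while the list of water points, the tolerance value and the plain list standing in for the quadtree are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    event_msg: str        # message that triggers the rule, e.g. "newPoint:"
    condition: str        # constructor method evaluated as the condition
    action: str           # constructor method run when the condition holds
    action_priority: int  # higher values are performed first
    pre_or_post: str      # "pre" or "post" relative to the event message


class Feature:
    rules = []  # stands in for the class-level rules dictionary

    def __init__(self):
        self.point = None

    def notify(self, event_msg, arg_list, pre_or_post, sender):
        matching = [r for r in type(self).rules
                    if r.event_msg == event_msg and r.pre_or_post == pre_or_post]
        # All applicable conditions are evaluated before any action is performed.
        fired = [r for r in matching if getattr(sender, r.condition)(*arg_list)]
        for rule in sorted(fired, key=lambda r: r.action_priority, reverse=True):
            getattr(sender, rule.action)()

    def new_point(self, point):
        self.point = point


class BuildingPoint(Feature):
    rules = [Rule("newPoint:", "on_water", "stop_create_feature", 10, "pre")]


class PointFeatureConstructor:
    def __init__(self, feature_class, water_points, quadtree, tolerance=1.0):
        self.feature = feature_class()
        self.water_points = water_points  # hypothetical stand-in for the ODBMS query
        self.quadtree = quadtree          # a plain list in this sketch
        self.tolerance = tolerance
        self.next_action = None

    def on_water(self, point):
        return any(abs(point[0] - wx) <= self.tolerance
                   and abs(point[1] - wy) <= self.tolerance
                   for wx, wy in self.water_points)

    def stop_create_feature(self):
        self.next_action = "stop"

    def point1(self, point):
        self.feature.notify("newPoint:", [point], "pre", self)
        if self.next_action == "stop":
            return None  # the default feature is discarded
        self.feature.new_point(point)
        self.quadtree.append(self.feature)
        return self.feature


quadtree = []
water = [(5.0, 5.0)]
rejected = PointFeatureConstructor(BuildingPoint, water, quadtree).point1((5.2, 4.9))
accepted = PointFeatureConstructor(BuildingPoint, water, quadtree).point1((20.0, 20.0))
```

In this sketch the first request falls within the tolerance of a water point and is discarded, while the second is accepted and inserted. Because the condition and action are looked up by name on the constructor, another constructor class could reuse them without change.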













DISCUSSION


This research has a number of implications. In the following sections, I will first address this work in terms of its initial objectives, followed by a discussion of the limitations so far recognized in the technologies and designs used. A look at future directions and a summary conclude this thesis.


Implications of Research for Meeting Initial Objectives


The objectives stated on page 8 include supporting (1) complex interdependencies among geographic features, (2) very large databases, and (3) the potential for expert system applications. Each of these will be discussed in turn.


Supporting Complex Interdependencies Among Geographic Features


The descriptions of Vector Product Format file structures for representing geo-feature data, and the object webs created in OVPF to capture this information, show that this Smalltalk-based, object-oriented data model is very versatile and expressive. In related development work for the Naval Research Laboratory, this framework has been adapted to import and display geographic data from four different kinds of VPF product databases simultaneously (Digital Nautical Chart, World Vector Shoreline, Vector Smart Map, and Urban Vector Smart Map).





In addition to the extensibility of the metadata structure facilitated by the object-oriented class hierarchy, the rule-based framework described here provides another level of extensibility. Because the Smalltalk language supports incremental dynamic compilation, rule-based actions could actually trigger the creation of additional classes and methods at runtime, according to the needs of the application. This might be done, for example, to augment the behavior of existing geographic feature objects to respond to new conditions in their environment that might only affect some of the features and not others. It is also conceivable that a given geographic feature might evolve into a different kind of feature over time; the framework described here could support such an evolution.

This rule-based approach can be used in three distinct situations: (1) immediate mode, to execute rules immediately before or after some state change; (2) deferred mode, to execute rules at the end of several changes; and (3) detached mode, to perform rule-based actions separately from the state changes. Furthermore, it has an advantage over traditional inference-engine approaches in that it will work with an arbitrarily large database of persistent objects, rather than being limited to those objects which can fit in memory. This approach should support the types of complex interdependencies commonly found in facilities management applications, such as with public utilities networks.
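The three modes can be contrasted with a small dispatcher sketch. It is written in Python rather than Smalltalk, and the RuleDispatcher class, its method names and the logged strings are hypothetical illustrations, not part of OVPF. The sketch only shows when an action runs in each mode: at once, at the end of a group of changes, or in a separate later pass.

```python
class RuleDispatcher:
    """Runs rule actions in immediate, deferred or detached mode (illustrative only)."""

    def __init__(self):
        self.deferred = []  # actions held until the current group of changes ends
        self.detached = []  # actions held for separate processing later
        self.log = []       # record of the actions that have run

    def signal(self, action, mode):
        if mode == "immediate":
            action(self.log)
        elif mode == "deferred":
            self.deferred.append(action)
        elif mode == "detached":
            self.detached.append(action)
        else:
            raise ValueError("unknown mode: " + mode)

    def end_of_changes(self):
        # Run everything that was deferred during the group of changes.
        pending, self.deferred = self.deferred, []
        for action in pending:
            action(self.log)

    def run_detached(self):
        # Run detached actions independently of the changes that raised them.
        pending, self.detached = self.detached, []
        for action in pending:
            action(self.log)


dispatcher = RuleDispatcher()
dispatcher.signal(lambda log: log.append("immediate check"), "immediate")
dispatcher.signal(lambda log: log.append("deferred check"), "deferred")
dispatcher.signal(lambda log: log.append("detached check"), "detached")
after_signals = list(dispatcher.log)
dispatcher.end_of_changes()
after_changes = list(dispatcher.log)
dispatcher.run_detached()
after_detached = list(dispatcher.log)
```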

The rule-event framework and procedures were surprisingly simple to implement. The FeatureConstructor classes, together with a single supporting method in VPFFeature class (notify:argList:preOrPost:from:), provide a simple and flexible event detection and rule processing system. While it introduces some processing overhead, all but the spatial query to the ODBMS (see step 7, Figure 24 on page 82) are very fast operations. An important benefit of this object-oriented framework is the potential for direct reuse by other FeatureConstructors of condition checks and actions such as the onWater: and stopCreateFeature methods. Furthermore, with this system provision can be made for adding and changing rules at runtime.

The design presented here is easily extended to trigger on any kind of change (create, modify, delete) to geographic-feature objects, as well as to specific feature attributes and spatial coordinates of a given feature object. This could be a significant advantage over the triggers supported by many commercial relational and even hybrid object-relational DBMSs. Except for Sybase, these DBMSs can typically trigger only on insert, update or delete of a complete feature record, rather than being able to discriminate on changes made to a single feature attribute.

It might be noted that this rule-based framework is not limited to implementation in an object-oriented system. While the object-oriented properties of hierarchical definition and inheritance in Smalltalk facilitated a simple design, the same functionality could be achieved in a non-object-oriented language with appropriate data structures and procedures. It seems likely that rule-based frameworks like this could find their way into many more kinds of applications in the future.


Supporting Very Large Databases


As a result of using a commercial ODBMS for the geo-data repository, we can immediately start to consider working with very large, distributed databases. ObjectStore has already been demonstrated to support terabyte-sized databases, and its client-server architecture with such features as shared-page caching is well suited to multi-user applications. Other ODBMSs are also likely candidates for large applications like this. It was found that spatial queries were as much as hundreds of times faster when reading from the ODBMS repository than from the relational VPF source files. While further tuning of either approach is no doubt possible, the ODBMS interface was far simpler to design and implement, especially given the need to support topology.

However, storing a large database is only part of the problem; another is providing access to it with reasonable performance. The technique demonstrated here of placing each coverage in a separate spatial index shows very good potential for helping to manage the visibility of unnecessary data while the user is trying to identify a region of interest. The rule-based concepts implemented here could also be applied to facilitate query optimization across multiple, heterogeneous databases distributed over a wide-area network. For example, rules could be defined to check the visibility or access privileges of one or more portions of a distributed database before starting a potentially long transaction. To support this, a client ObjectStore application could be set up on the host server. This client application could serve as the effective host application to the actual users, filtering the data prior to shipping it over the network. The GemStone ODBMS already is organized to support this separation of work between a host application process and each client process.

It was pointed out that each quadtree cell (VPFSpatialDataCell instance) holds a collection of direct object pointers to its geo-features (see "Design of an Object-Oriented Spatial Index" starting on page 60). This makes the quadtree more than just a spatial index: it is also an efficient, general-purpose container structure for the features. This design is essentially independent of the actual VPF feature structure, and allows us to modify the implementation of the spatial tree at any time without affecting the rest of OVPF or the source data. Thus, in the future we could easily substitute a range tree (Samet 1994; Beckmann et al. 1990; Brinkhoff et al. 1993), or special optimizing techniques in place of the present quadtree approach (this will be discussed further below). This design could also support the simultaneous implementation of multiple spatial indexing schemes, to allow choice of the most efficient spatial tree design for a given source database. This might be thought of as pluggable spatial indexing.
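The idea of pluggable spatial indexing can be sketched as two interchangeable index classes behind the same pair of operations. The sketch is in Python rather than Smalltalk; the class names LinearScanIndex, GridIndex and Coverage, the insert and query operations, and the bounding-box tuples (xmin, ymin, xmax, ymax) are all assumptions for illustration, and neither index is the OVPF quadtree.

```python
def overlaps(a, b):
    """True when boxes (xmin, ymin, xmax, ymax) a and b intersect."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


class LinearScanIndex:
    """Simplest possible index: test every stored box."""

    def __init__(self):
        self.items = []

    def insert(self, feature, bbox):
        self.items.append((bbox, feature))

    def query(self, bbox):
        return [f for b, f in self.items if overlaps(b, bbox)]


class GridIndex:
    """Fixed-size grid: each feature is filed under every cell its box touches."""

    def __init__(self, cell_size=10.0):
        self.cell_size = cell_size
        self.cells = {}

    def _keys(self, bbox):
        x0, y0, x1, y1 = (int(v // self.cell_size) for v in bbox)
        return [(i, j) for i in range(x0, x1 + 1) for j in range(y0, y1 + 1)]

    def insert(self, feature, bbox):
        for key in self._keys(bbox):
            self.cells.setdefault(key, []).append((bbox, feature))

    def query(self, bbox):
        found = {}
        for key in self._keys(bbox):
            for item_bbox, feature in self.cells.get(key, ()):
                if overlaps(item_bbox, bbox):
                    found[id(feature)] = feature  # avoid duplicates across cells
        return list(found.values())


class Coverage:
    """A coverage delegates all spatial access to whichever index it is given."""

    def __init__(self, name, index):
        self.name = name
        self.index = index

    def add(self, feature, bbox):
        self.index.insert(feature, bbox)

    def features_in(self, bbox):
        return self.index.query(bbox)


def build(index):
    coverage = Coverage("demo", index)
    coverage.add("coastline", (0, 0, 5, 5))
    coverage.add("buoy", (50, 50, 51, 51))
    return coverage


scan_hits = build(LinearScanIndex()).features_in((4, 4, 12, 12))
grid_hits = build(GridIndex()).features_in((4, 4, 12, 12))
```

Because the coverage only calls insert and query, either index can be substituted without changing the rest of the code, which is the property the design above relies on.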

Another issue addressed by this design that becomes more important over time has to do with changes to the nonspatial attributes of geo-features in a given feature class or coverage. As business requirements and data sources evolve and change, it is often necessary to add, remove, or change the value range of nonspatial attributes for one or more feature classes. (In a complex database specification such as VPF, this must be done with care, to ensure consistency of attribute usage and values across similar feature classes in different coverages and libraries.) As described in the Materials and Methods section, page 54, OVPF makes use of Smalltalk ReadWriteStream objects to hold all such attributes in a single byte-stream, in which each attribute's position and length in the byte-stream is known via the feature class' schema. This rather non-object-oriented way of aggregating many small pieces of information is much more memory-efficient than if we had created separate instance variables and value-objects for each geo-feature attribute. It also supports changes in attribute structure for a feature class without affecting the object class definitions. This means that attributes can be added to, or removed from, the feature class definition without having to redesign the OVPF structure. This could even be done at runtime. The issue of updating older data with the previous attribute structure to the new structure must still be addressed, which could be difficult for large databases.
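A minimal sketch of schema-driven attribute storage in a single byte buffer follows. It is in Python rather than Smalltalk, with a bytearray standing in for the ReadWriteStream; the FeatureRecord class, the attribute names, the offsets and lengths, and the space-padded ASCII text encoding are assumptions chosen for illustration, not the OVPF layout.

```python
# Hypothetical schema: attribute name -> (byte offset, byte length).
SCHEMA = {"id": (0, 4), "fclass": (4, 8), "descr": (12, 20)}


class FeatureRecord:
    """Holds all fixed-length attributes of one feature in a single byte buffer."""

    def __init__(self, schema):
        self.schema = schema
        size = max(offset + length for offset, length in schema.values())
        self.contents = bytearray(b" " * size)

    def put_value(self, value, name):
        offset, length = self.schema[name]
        data = value.encode("ascii")
        if len(data) > length:
            raise ValueError(name + " exceeds its fixed length")
        # Pad with spaces so the slice keeps exactly its declared length.
        self.contents[offset:offset + length] = data.ljust(length)

    def value_for(self, name):
        offset, length = self.schema[name]
        raw = bytes(self.contents[offset:offset + length])
        return raw.decode("ascii").rstrip()


record = FeatureRecord(SCHEMA)
record.put_value("COASTL", "fclass")
record.put_value("Coastline Lines", "descr")
fclass = record.value_for("fclass")
descr = record.value_for("descr")
```

Adding an attribute in this sketch means adding one schema entry; the record class itself is unchanged, which mirrors the flexibility described above. Existing buffers written under an older schema would still need to be migrated.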







Supporting Potential for Expert System Applications


Attempts and progress are being made to cope with the complexity of decision making faced by planners through the use of expert systems. The framework presented here seems to be a good candidate for further research in this area. With Smalltalk's reflective and dynamic compilation capabilities, there are no inherent limits to the ability to create and modify a rule-base of events, conditions and actions at runtime. This implies the system could have the capability to learn and evolve its semantics (general behavior) as a function of usage patterns and environmental conditions.


Limitations of the Present Application


The various frameworks described in this thesis present numerous possibilities for extensions and enhancements. However, there remain some significant limitations in the present implementation of OVPF. These are grouped here according to (1) feature class definitions, (2) spatial index, (3) GIS functionality, (4) the rule-based framework, and (5) the Smalltalk language.


Feature Class Definitions


The present design for OVPF lacks support for variable-length feature attributes. It has been found that some feature classes in the Vector Smart Map (DMA 1993c) and Urban Vector Smart Map (DMA 1994b) specifications have variable-length text attributes, whereas all other nonspatial attributes in VPF databases have been fixed-length in nature. To accommodate this, we would most likely follow the VPF specification for storing such data in the feature tables, by including the integer-length of the attribute as the first four bytes of the attribute's value. Since the feature table schema states which attributes have variable length, it would be a straightforward matter to modify the standard accessing protocol in VPFFeature ("valueForAttribute: aName" and "putValue: aValue forAttribute: aName" methods) to properly handle these exceptions.
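The length-prefix convention described above can be sketched as follows. The sketch is in Python rather than Smalltalk; the function names are hypothetical, and the little-endian byte order and ASCII encoding are assumptions made for the example rather than details taken from OVPF.

```python
import struct


def encode_variable_text(text):
    """Return a 4-byte integer length followed by the text bytes."""
    data = text.encode("ascii")
    # "<i" is a little-endian 4-byte signed integer (an assumed byte order).
    return struct.pack("<i", len(data)) + data


def decode_variable_text(buffer, offset=0):
    """Read one length-prefixed text value; return (text, next offset)."""
    (length,) = struct.unpack_from("<i", buffer, offset)
    start = offset + 4
    text = buffer[start:start + length].decode("ascii")
    return text, start + length


encoded = encode_variable_text("Harbor") + encode_variable_text("Light")
first, next_offset = decode_variable_text(encoded)
second, end_offset = decode_variable_text(encoded, next_offset)
```

Because each value carries its own length, a reader can step through consecutive variable-length attributes without consulting a fixed offset table.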


Spatial Index


The simplistic quadtree implemented so far has a number of shortcomings, as observed by Cobb et al. (1995b). For example, all spatial queries begin from the root (top) of the tree. Cobb's spatial splay tree approach addresses this issue, by storing pointers to the most recently accessed quadtree cells at or near the root of the splay tree. This has been found to result in significant improvement in query performance.

Another issue, however, is that of managing insertion of geo-feature objects in the index that fall on a quadtree cell boundary. When this happens, the geo-feature pointer is moved back up one level in the quadtree, to the next-larger cell. With small or sparse geographic databases this is not a problem, but with dense coverages this can degrade access performance. One way of overcoming this, while preserving the advantages of the quadtree approach, is to use overlapping quadtree cells. In this case, each quadtree cell could overlap its neighbor by up to 25 percent, to hold any geo-features which coincide with its boundary. By traversing the cells at a given level in a consistent order, say clockwise, each feature would still have a unique index key. While I have not seen any technical documentation on this approach, both Laser-Scan (1995a) and Smallworld (1995) use it for their spatial indexes.
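The boundary behavior of the non-overlapping quadtree can be illustrated with a short sketch. It is in Python rather than Smalltalk, and the Cell class, its fields and the max_level limit are assumptions for illustration, not the VPFSpatialDataCell implementation. A feature descends only while its bounding box fits wholly inside one quadrant; a box that touches a dividing line stays in the larger cell, which has the same effect as moving the pointer back up one level.

```python
class Cell:
    """Square quadtree cell with origin (x, y) and side length size."""

    def __init__(self, x, y, size, level=0, max_level=4):
        self.x, self.y, self.size = x, y, size
        self.level, self.max_level = level, max_level
        self.features = []
        self.sub_cells = None  # four children, created on demand

    def _child_for(self, bbox):
        xmin, ymin, xmax, ymax = bbox
        half = self.size / 2
        mid_x, mid_y = self.x + half, self.y + half
        # A box that touches either dividing line belongs to this cell.
        if xmin <= mid_x <= xmax or ymin <= mid_y <= ymax:
            return None
        if self.sub_cells is None:
            self.sub_cells = [
                Cell(self.x + dx * half, self.y + dy * half, half,
                     self.level + 1, self.max_level)
                for dy in (0, 1) for dx in (0, 1)
            ]
        col = 1 if xmin > mid_x else 0
        row = 1 if ymin > mid_y else 0
        return self.sub_cells[row * 2 + col]

    def insert(self, feature, bbox):
        """Store the feature and return the level at which it was placed."""
        if self.level < self.max_level:
            child = self._child_for(bbox)
            if child is not None:
                return child.insert(feature, bbox)
        self.features.append(feature)
        return self.level


root = Cell(0, 0, 16, max_level=3)
buoy_level = root.insert("buoy", (1, 1, 1.5, 1.5))
pier_level = root.insert("pier", (5, 5, 6, 6))
coastline_level = root.insert("coastline", (7, 1, 9, 2))
```

In this example the small buoy box reaches the deepest level, while the coastline box crosses the root's vertical dividing line and so remains in the root cell. In a dense coverage many such features would accumulate in the large cells, which is the performance concern raised above.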





PAGE 4

TABLE OF CONTENTS page ACKNOWLEDGMENTS in LIST OF FIGURES vi ABSTRACT viii INTRODUCTION 1 Goal and Objectives of the Research 7 Managing Complex Interdependences Among Geographic Features 8 Supporting Very Large Geographic Databases 10 Supporting Reactive Capability in the GIS 10 Vector Product Format Database Structure 12 Historical Technological Developments 18 Large-Scale Urban Models 19 Object-Oriented Programming Systems 20 Relational and Object-Oriented Database Management Systems 22 Knowledge-Based Systems 24 Geographical Information Systems 30 Object-Oriented GIS 32 Expert Systems with GIS 35 Importance and Contributions of This Thesis 37 MATERIALS AND METHODS 40 Object-Oriented Software Development Tools 40 Smalltalk Programming Environment 40 Source Code Configuration Management Facility 41 Object-Oriented Database Management System 42 Computer Platform 43 Approaches Used in Building OVPF Components 44 Introducing Some Object-Oriented Terms 44 Conversion of Source Data from Vector Product Format to Smalltalk Objects. . 46 Representation of Metadata Objects 48 Representation of Geo-Feature Objects 51 Representation of Graphical Primitives and Topological Relationships 55 Design of an Object-Oriented Spatial Index 60 Organization of Object Webs in ODBMS Repository 61 Design of a Rule-Base Framework to Support Geographic Feature Editing 65 Event Objects 66 Rule Objects .67 Event Detection Mechanism 68 iv

PAGE 5

OVPF Application Overview 70 Transformation of Relational Vector Product Format Data to an Object Web . . 71 Displaying Spatial Features 71 Migrating Object Webs to ODBMS 78 Applying the Rule-Base Framework for Feature Editing 79 DISCUSSION 85 Implications of Research for Meeting Initial Objectives 85 Supporting Complex Interdependencies Among Geographic Features 85 Supporting Very Large Databases 87 Supporting Potential for Expert System Applications 90 Limitations of the Present Application 90 Feature Class Definitions 90 Spatial Index 91 GIS Functionality 92 Rule-Based Framework 93 Smalltalk Language 94 Future Directions 95 Summary 96 REFERENCES 98 BIOGRAPHICAL SKETCH 108 v

PAGE 6

LIST OF FIGURES Figure page 1 Libraries for Chesapeake Bay Area (DNC01) Database 13 2 DNC01 Database Directory Structure (partial) 14 3 Winged-Edge Topology Components 16 4 Development Hardware Configuration 43 5 Development Software Configuration 43 6 Vector Product Format Data Types 49 7 Example of Feature Table with Header and Records 50 8 VPFTable Header and VPFSchemaColumn Class Definitions and Example Instance Values 50 9 Steps to Create Metadata Web 52 10 Representation of Geo-Features in OVPF 56 1 1 Object Definitional Hierarchy for Representing VPF Graphical Primitives with Spatial Topology 59 12 Principal Classes and Behavior for Quadtree Spatial Index 62 13 Event Class Hierarchy 66 14 Structure of a Rule Object 67 15 VPFFeatureConstructor Hierarchy 69 16 Principal OVPF Components 70 17 Key Definitions and Relationships for OVPF Database Classes 72 1 8 Sample OVPF Data for a DNC Coastline Feature 74 1 9 Transfer of Spatial Features from VPF to OVPF 76 20 OVPF Map Display of Multiple Coverages in Norfolk Approach Library. ... 78 vi

PAGE 7

2 1 Persistency and Linkages of Principal OVPF Components 79 22 Example Rule and Event Objects 80 23 Key Components of Event Detection Framework 81 24 Flow of Control and Behavior For Rule-Event Example 82 vii

PAGE 8

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy DESIGN OF AN EXTENSIBLE, OBJECT-ORIENTED GIS FRAMEWORK WITH REACTIVE CAPABILITY By David K. Arctur May 1996 Chairman: John F. Alexander Major Department: Urban and Regional Planning This thesis reports on the design of an "analysis-enabling framework" for more productive use of geographic information systems (GIS) by planners and decision makers, through the integration of object-oriented programming and database management with knowledge-based systems and active database technology. This extends the information management system to support its own customization by those who use it, for reflective adaptation of the GIS framework to applications beyond the domains or scope anticipated by its creators. The Smalltalk programming environment is used, along with an object-oriented database management system (ODBMS), running on a Unix computer platform. This system works with the Defense Mapping Agency's Vector Product Format (VPF) digital geographic databases. The Smalltalk application, called OVPF, converts source data from a georelational database structure to Smalltalk objects. Full spatial topology for point, line and area graphics is supported using the winged-edge algorithm, as well as many-to-many relationships between geographic features and graphical primitives. viii

PAGE 9

OVPF incorporates a quadtree spatial index implemented in Smalltalk. The quadtree object itself is placed in the ODBMS repository, and serves all queries to the geofeature objects. A rule based framework and event detection mechanism provide a reactive or "triggering" capability for enforcing application-based data integrity and interdependency constraints on requests to update geo-features. This effectively transforms the ODBMS into an "active database." OVPF provides a graphical user interface (GUI) for direct interaction to view and edit geo-feature objects. Spatial topology is maintained during feature editing, and OVPF encapsulates database operations within atomic transactions. Additions and changes can be made to the rule base at run-time that take effect immediately. The importance and contribution of this research is in the use of Smalltalk's unique object-oriented data modeling capabilities for a GIS framework, in combination with a rule-based active repository for spatial and nonspatial data. This approach supports complex interdependencies among geographic features in potentially very large databases. It provides an extensible metadata framework, and the potential for supporting expert system applications. ix

PAGE 10

INTRODUCTION The management and analysis of geographic information is becoming increasingly important to many sectors of our industrial and technological society. Local, state and federal governments use geographic data to study and forecast population and demographic growth patterns, as well as to develop comprehensive plans for urban infrastructure and development (Budic 1994). Utility companies use geographic information to plan the building and expansion of electrical, gas, water and communications facilities. Private industry and commercial businesses use geographic information to study, plan and monitor their marketing, production and distribution strategies. The military uses geographic information for strategic and tactical mission planning, mission rehearsals, logistics, navigation, and many other applications. The uses and interdependencies of geographic information are growing rapidly, as are the sources of this information (Maguire et al. 1991; Laurini and Thompson 1992). A number of approaches for geographic information systems (GIS) have been developed to support access to, and management of, such data. A GIS encompasses, in part, the integrated computer hardware and software required to store, retrieve and update both spatial and nonspatial attributes associated with a database of geographic features. Each GIS also is a function of the context in which it is used, and embodies a set of principles and procedures for the collection, analysis, display and plotting of geographic data. 1


All these functions, however, can be grouped into two main categories of responsibilities for a GIS: information management, and information analysis. The substantive goal of GIS technology is to support spatial analysis (Goodchild 1987) and synthesis for the purpose of understanding and predicting patterns of behavior among human and other natural communities (Wellar 1989; Wellar et al. 1994). But the sheer volume of data, and the complexity of interactions among geographic entities, has necessitated considerable effort to develop the means for simply handling the data, which has often seemed to overshadow the efforts for analyzing it (Ding and Fotheringham 1992; Lober 1995).¹

However, rapidly accelerating advances in computer software and hardware may be changing this picture. The evolution of software technologies, specifically object-oriented (OO) programming, knowledge-based (KB) systems, and active databases, together with the exponentially increasing power and storage capacity of affordable computers, has led to the development of analysis-enabling frameworks for more productive use of the information by planners and decision makers. These enabling technologies could be viewed as an extension of an information management system, but an important distinction is that information management systems to a great extent are designed to serve a broad range of applications from business to engineering, while knowledge-based enabling technologies are designed around the requirements of a particular user community.

1. This division of effort into information management and analysis categories parallels the division of effort in urban planning itself into procedural and substantive matters (see Faludi 1973). While substantive issues generally seem the most important, lack of attention to the procedural issues can preclude substantive progress. So it is in using information.


This is a report of an enabling-technology research project in which the core functionality of an object-oriented geographic information system (OOGIS) was implemented using the Smalltalk programming language, with a commercial object-oriented database management system (ODBMS) for the repository. The OOGIS incorporates a knowledge-base framework which effectively transforms the ODBMS into an "active database" (these terms will be defined shortly). This is not the first OOGIS framework to be developed, nor is it the first KBGIS or active database to be developed. It does, however, seem to be a unique application of OO principles and techniques (thanks to Smalltalk) to the design of a KB framework for GIS that is simple, "light-weight" and extensible.

This development was the result of research funded by the U. S. Naval Research Laboratory (NRL) and the U. S. Defense Mapping Agency (DMA), to study alternative ways of representing and managing Vector Product Format (VPF) digital geographic databases. VPF is a specification developed by DMA for a family of database products (DMA 1993a) that has become part of the DIGEST international standard formats for representing geographical data (DMA 1994a). VPF is a "georelational" specification which uses a relational data framework (Date 1995) for storing both spatial and nonspatial attribute information about the geographic features represented. A number of database products representing refinements to the VPF standard have been developed, such as the Digital Nautical Chart (DMA 1993b), Vector Smart Map (DMA 1993c), Urban Vector Smart Map (DMA 1994b), World Vector Shoreline (DMA 1995), and others. Each of these VPF derivatives has different purposes, and thus different sets of geographic features and attributes; in some cases even different metadata (database schema) structures.


The purpose behind the development of VPF was two-fold: (1) to provide a public specification for exchange of geographic data across computer platforms and GIS software products, and (2) to support direct viewing capability without the need for proprietary GIS software (DMA 1993a). In its first purpose, VPF occupies a similar role to the Spatial Data Transfer Standard (SDTS) developed jointly by the U. S. Census Bureau and the U. S. Geological Survey (see National Institute of Standards and Technology 1992; Lazar 1992; Fegeas et al. 1992; Davis et al. 1992; Milne et al. 1993). SDTS now defines the format used for Census Bureau TIGER data (Klosterman and Lew 1992). VPF and its derivative specifications have evolved over several years, and are just now maturing to the point that large-scale production and distribution of geographic databases on CD-ROM are taking place. One of the problems these database products present however, is that the feature data is very difficult to edit due to its inherent complexity. That is, once a VPF database has been created (usually by exporting coverage data from a commercial GIS product), it is very difficult to modify the feature data while maintaining referential integrity of the many linkages between features, attributes, graphical primitives, and the various indexes and join tables. The single greatest source of complexity is the attempt to represent the locational and topological aspects of geographic features using the relational database model. Commercial GIS products have typically dealt with this representation through the use of proprietary, non-relational data structures and techniques. In following standard rules for normalization (Date 1995), the VPF specification has possibly reached the practical limits of what can be represented and maintained using relational database technology.
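The referential-integrity problem described above can be illustrated with a minimal sketch. It is not taken from the VPF specification or from OVPF: the three tables, their contents, and all function names are hypothetical and heavily simplified, and Python is used only for brevity. The sketch assumes a feature table, an edge primitive table, and a join table linking the two, and shows how removing one primitive without updating the join table leaves dangling references.

```python
# Hypothetical, heavily simplified stand-ins for three VPF-style tables.
# Real VPF tables carry many more columns; every id and value here is invented.
feature_table = {1: {"name": "river"}, 2: {"name": "property boundary"}}
edge_table = {
    10: {"coords": [(0, 0), (1, 1)]},
    11: {"coords": [(1, 1), (2, 1)]},
}
# Join table rows: (feature id, edge primitive id). Edge 11 is shared by both features.
feature_index = [(1, 10), (1, 11), (2, 11)]


def dangling_references(features, edges, index):
    """Return the join rows whose feature or edge primitive no longer exists."""
    return [(f, e) for (f, e) in index if f not in features or e not in edges]


def delete_edge_naively(edges, edge_id):
    """Remove an edge primitive without touching any other table."""
    remaining = dict(edges)
    del remaining[edge_id]
    return remaining


def delete_edge_with_cascade(edges, index, edge_id):
    """Remove an edge primitive and every join row that refers to it."""
    remaining = delete_edge_naively(edges, edge_id)
    pruned = [(f, e) for (f, e) in index if e != edge_id]
    return remaining, pruned


# A naive delete breaks two join rows, one for each feature sharing the edge.
broken_edges = delete_edge_naively(edge_table, 11)
print(dangling_references(feature_table, broken_edges, feature_index))
# → [(1, 11), (2, 11)]

# A cascading delete keeps the join table consistent with the primitive table.
fixed_edges, fixed_index = delete_edge_with_cascade(edge_table, feature_index, 11)
print(dangling_references(feature_table, fixed_edges, fixed_index))
# → []
```

Even in this toy case a single edit touches two tables. An actual VPF coverage adds attribute tables, several primitive types, and index files, so the bookkeeping grows accordingly.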


In its search for ways to address these issues, the Digital Mapping Program (DMAP) at NRL sought the help of the GeoPlan Center in the Department of Urban and Regional Planning, and sponsored the current research project to develop an in-house (DMA-owned) viewer/editor capable of displaying and modifying VPF source data. NRL's decision to sponsor the project here was based on the results of prior work at GeoPlan on an OOGIS product to support automated mapping and facilities management applications (commonly abbreviated AM/FM) for large, regional electric utility companies. This project, called Object GPG (Alexander et al. 1991) and later OFM (for Objective Facilities Management), had developed a core set of object classes and methods in Smalltalk that could be used without license restrictions as a starting point for NRL's VPF viewer/editor. I was in a position to lead development of the VPF viewer/editor, having worked on OFM for the previous year.

As the DNC specification (DMA 1993b) is one of the most complex, it was chosen by NRL to be studied first. By the end of the first project year, I had completed a viewer/editor prototype in Smalltalk that was capable of displaying and editing DNC feature data. This prototype was called ODNC, with the results appearing in a refereed conference the following year (Arctur et al. 1995c). At this stage, the project received much more attention and support from DMA. Our equipment was upgraded from aging workstations to Sun SPARCstation 20s. Two more programmers were added to the project at NRL, and one to two additional graduate assistants worked on the project at GeoPlan. By the end of the second year of the project, we had integrated World Vector Shoreline, Vector Smart Map Level 0, and Urban Vector Smart Map database definitions and feature data into the object-oriented framework (now called Object Vector Product Format, or OVPF), and had


incorporated the use of ObjectStore by Object Design, Inc., as an object-oriented database repository for VPF source data. We also had implemented full spatial topology (missing from the first-year prototype; see Chung et al. 1995); a novel splay-tree indexing mechanism to improve spatial query performance (Cobb et al. 1995b); and a rule-base framework for enforcing logical constraints on feature updates (Arctur et al. 1995d).

Although the OVPF data model and Smalltalk application were developed to address specific needs of DMA, the techniques and lessons learned also seem well suited to GIS applications for urban and environmental analysis and planning, as well as facilities management applications in the various public utility industries such as electrical, gas, water and communications. My main interest is to show the implications and significance of this OO-KB-GIS framework for various GIS users. However, much of the thesis is necessarily devoted to explaining the historical context and the development of the framework itself.

The remainder of this Introduction is in four main parts: (1) a statement of the specific technical goal and objectives for this research; (2) a more detailed look at the VPF data structures; (3) a review of the technological context and state of the art from which the current research proceeds; and (4) a synthesis of the preceding review to show how the various technologies intersect in this project. Following the Introduction, the Materials and Methods section describes specific tasks which were addressed in working toward the final product, essentially "recipes" for constructing the key components of an OO-KB-GIS framework. As the OVPF program is rather large (about 400 classes and 4500 methods, with over a megabyte of source code), it is impractical to describe its complete structure here. Therefore I will focus on the key constituent frameworks within the overall design that most directly contribute to the stated


objectives. The Results section then provides an illustration of the integrated design in action, to show how the finished program meets each of the stated objectives. The Discussion section concludes this work by addressing the implications of the findings, as well as the limitations and directions for future work with this approach.

Goal and Objectives of the Research

This work began in response to the perception that traditional GIS data modeling and analysis approaches are becoming inadequate to support the complex and interdependent nature of geographic data. Data models once thought to be fairly flexible now are seen to impose significant constraints on the way the data can be used. This is becoming an increasing problem as our acquisition of huge amounts of detailed data accelerates. In a similar way, the GIS tools themselves can be difficult to apply to a given problem as a result of the often brittle way in which they are designed. The GIS software used for most governmental and military applications requires extensive training and continuous practice to develop and maintain proficiency. Then, even with proficiency it can be difficult, time consuming and frustrating to apply to a given problem.

It seems that even one of the most sophisticated GIS software products, Arc/Info by Environmental Systems Research Institute (ESRI), has fairly rigid data structures that work well for single-theme maps, but do not support the kind of interdependencies that exist, for example, among geographic features in facilities management applications for electrical and other utilities industries. It seems inevitable to me that, with the rapidly changing environmental and societal conditions facing us, people will continue to find or create


perfectly reasonable GIS applications for which existing software systems and tools are not easily adapted.

The goal of this research project is to design and demonstrate ways of representing and working with the complexity of both the geographic information and the GIS tools that permit flexible adaptation to changes in requirements for the data models over time. To this end, the objectives the GIS tools need to meet include

• representing potentially detailed and complex interdependencies among geographic features in a database, in a user-extensible way;
• supporting very large geographic databases, potentially distributed over a network of computers; and
• providing the capability to incorporate expert-system rules and behavior.

Each of these objectives will be discussed briefly below.

Managing Complex Interdependencies Among Geographic Features

Each application domain comes with its own set of rules. In planning an electrical service extension for a new urban subdivision, the electric utility engineer has to take into account a myriad of prerequisites and corequisites in order to place any one electrical system facility, such as a high-voltage circuit breaker. Even to plan a simple facility such as an overhead capacitor, the engineer must first determine that a power pole is in place, and that such a pole is rated for the capacitor, and that the capacitor is placed on the proper circuit, and so on. Telecommunications equipment and circuits can involve even more complex interconnections than electrical utility services. Thus, in addition to supporting the drawing of maps of electrical and communications circuits, it is increasingly important


that the GIS tools help the engineer build the maps completely and correctly. Essentially, a means of incorporating a facility engineer's books of policies and practices into the GIS tools would be a tremendous aid for such a user. An additional benefit the GIS can provide is to help manage the inventory and accounting of installed facilities.

This could also be said for a quite different application domain such as land use and building codes enforcement in a city or county government jurisdiction. For example, when a developer wishes to build or modify a residential or commercial establishment, many detailed conditions must be met, such as the number of parking spaces required based on square footage of the buildings; setbacks and easements based on proximity to roads or power lines; and so on. While these rules are generally mastered reasonably quickly by those responsible for their enforcement, this is less true in older and denser urban areas. Furthermore, many rules are open to interpretation, and are not always applied uniformly but can be applied strictly or loosely depending on an official's preference. In addition, turnover among code enforcement officers is very high in many offices, resulting in frequent retraining (Heikkila and Blewett 1992).

Most of the existing software used in map production and GIS is not well suited for adaptation to handling a rule base for describing complex interdependencies. Presently, there is no way to represent application-domain-specific dependency rules among geographic features with, for example, AutoCAD by AutoDesk, an inexpensive and popular technical drawing software product which is often used in place of a GIS. With sophisticated GIS software such as Arc/Info and Intergraph, this is still not easily done. One approach which works in very limited situations is to assign an "impedance"


(resistance to flow) between two proximate geographic features, but this does not capture enough of the semantics of, say, an electrical power network to be useful.

Supporting Very Large Geographic Databases

A single geographic database can vary from a few megabytes to several gigabytes and even terabytes. The volume of data to be generated by remote-sensing satellites will reach a level of terabytes per day in a few more years. When attempting to perform queries and analysis with multiple databases in combination, it may not be practical for all this data to reside on a single host computer or even on a single local network. The GIS framework needs to support access to an arbitrarily large collection of data that may be distributed across an entire wide-area network.

Supporting Reactive Capability in the GIS

As the complexity of interdependencies and size of databases increase, it will become steadily more important to find ways for the GIS to assist the analyst and the database administrator in maintaining data consistency and integrity. This could be in the form of data integrity constraint support, as well as support for more complex decision processes typical of expert system applications. (Several examples of these will be presented shortly.) To minimize the technical overhead of such assistance on the user, it is helpful and in some cases necessary for the GIS to have a flexible, consistent, and automatic way of reacting during attempts to access and modify geographic feature data. It would also be important for the reactive capability to be based on conditions anywhere in the complete


database, and not just on the portions of the database currently loaded into the computer's active memory. This is essentially what is meant here by the term active database; these are designed in such a way that application-defined events can trigger rule-based actions based on conditions in any part of the disk-resident database (Widom and Ceri 1996, Chapter 1).

For example, suppose a homeowner wished to get a building permit to expand her house on her own property. Suppose also that her property contained species of plants that cause her property to be considered wetlands, which might very well preclude her right to build any further on her property, on legal grounds. The municipal building codes enforcement officer needs to be aware of all such rules, as well as the particulars with respect to each applicant's property, to apply the rules in an objective and accurate way. Assuming pertinent, accurate data exists in the GIS database on which a correct decision could be based, the GIS could notify the codes official immediately of all such pertinent conditions at the earliest opportunity, thus precluding potentially costly re-evaluation at a later date due to obscure conditions that were not noticed earlier.

Zoning for land use is becoming an increasingly complex concern for city and county planners. Increasing land scarcity and values raise the costs and potential for litigation as competition and undesirable interactions among different zones in close proximity become more common. Expert systems that can take into account the various constituents' preferences in anticipation of future problems could be a very useful tool for planners. Again, the GIS needs some form of reactive capability to support this.

This concludes the discussion of goals and objectives for my research. The next section presents an overview of the VPF database structure, which serves as the data


source for the proof of concept described in this thesis to address the above objectives. Following the VPF description is my review of the technological evolution which has led to the development of the principal concepts, tools and techniques which are integrated in this project.

Vector Product Format Database Structure

For purposes of discussion, the Digital Nautical Chart (DNC) is considered representative of the general structure of all Vector Product Format (VPF) products, and serves throughout this thesis as the concrete example of VPF to illustrate the definitions and linkages among features, attributes, and primitive graphical elements. One of the more complex VPF products, DNC was specifically designed to support GIS applications such as marine navigation. As with other VPF products, DNC geographic data is organized for distribution on CD-ROM disks where each disk or disk set contains the database of geographic information for a particular region. For example (see Figure 1), the Chesapeake Bay area surrounding Norfolk, Virginia has been coded as database DNC01.² This database is organized using the hierarchical directory structure shown in Figure 2. The name of the topmost directory is also the name of the database. The DNC01 directory contains two files: the Database Header Table (DHT), which provides general information about the database (source, date of creation, revision level, etc.), and the Library Attribute Table (LAT), which provides the boundaries of each library in terms of decimal degrees of

2. Note: Sans serif typeface is used, e.g., DNC01, throughout this thesis to represent actual filenames or parts of filenames of database components on the CD-ROM, as well as to represent Smalltalk programming code.


latitude and longitude. As defined in the VPF specification and illustrated in Figure 1, a library defines a geographic boundary and scale, where a larger scale implies a closer-in view and a smaller scale implies a further-out view. Thus, the Norfolk Approach (A0108280) library has a smaller scale and presumably lesser accuracy and precision of data than the Norfolk Harbor (H0108280) library. A given database will have one or more library directories; Figure 2, for example, shows portions of two of the library directories, A0108280 and H0108280.

[Figure 1. Libraries for Chesapeake Bay Area (DNC01) Database. Map labels: A0108170 (Ocean City Approach), H0108280 (Norfolk Harbor), A0108280 (Norfolk Approach), GEN01 (General), COA01 (Coastal).]

A library subdirectory is further divided into coverages, each of which contains the data for logically- and spatially-organized groups of geographic features. For example, the Cultural Landmarks coverage (CUL) includes buildings, power lines, streets, railroads and


other feature classes. The Inland Waterways coverage (IWY) includes features such as canals, lakes, rivers and dams.

[Figure 2. DNC01 Database Directory Structure (partial). Source: (Arctur 1995c). Entries shown in the figure, in order:
Database directory: DNC01 (Norfolk, Virginia harbor area map).
Database-level files: DHT (Database Header Table); LAT (Library Attribute Table).
Library directories: A0108280 (Norfolk Approach Library); H0108280 (Norfolk Harbor Library).
Library-level files: LHT (Library Header Table); GRT (Geographic Reference Table); CAT (Coverage Attribute Table).
Coverage directories: IWY (Inland Waterways Coverage); CUL (Cultural Landmarks Coverage).
Coverage-level files: FCA (Feature Class Attribute Table); FCS (Feature Class Schema Table); INT.VDT (Integer Attribute Value Descriptions); CHAR.VDT (Character Attribute Value Descriptions); EDG.FIT (Edge Feature Index Table); END.FIT (Entity Node Feature Index Table); FAC.FIT (Face Feature Index Table); BUILDNGP.PFT (Building Points Feature Table); POWERL.LFT (Power Lines Feature Table); BUILDNGA.AFT (Building Areas Feature Table).
Tile subdirectories: GJPG4545 (Spatial Tile Subregion); GJPH3000 (Spatial Tile Subregion).
Tile-level files: END (Entity Node Primitive Table); CND (Connected Node Primitive Table); EDG (Edge Primitive Table); EBR (Edge Bounding Rectangle Table); EDX (Edge Primitive Index Table); FAC (Face Primitive Table); FBR (Face Bounding Rectangle Table); RNG (Ring Table).
Annotations on groups of coverage-level files: "Apply to all features in coverage"; "Apply to all features by type"; "Define attribute values for each feature".
Legend: unexpanded directory; expanded directory; file.]

Within a given coverage directory, the geographic feature data is divided into two main groups of files: those that describe feature attributes, and those that describe feature locations. Those files describing feature attributes, for example building type, road type, accuracy level and so on, are stored in the coverage directory. The files describing feature


locations are stored in tile subdirectories of the coverage directory, where a tile corresponds to a rectangular subregion within the library's boundary. Tile size is a function of the library's scale. For example, tiles are 15 minutes (0.25 degrees) of latitude or longitude on each side for Harbor libraries; 30 minutes (0.5 degrees) on each side for Approach libraries; and 3 degrees on each side for Coastal and General libraries.

As shown in Figure 2, the files describing feature attributes are further grouped according to their level of generality. The files for Feature Class Attributes (FCA), Integer Value Description Table (INT.VDT), and Character Value Description Table (CHAR.VDT) contain descriptive information concerning all feature attributes. The Feature Class Schema (FCS) file contains table-join relationships for many-to-many relationships that may exist between feature tables, associated notes and other tables (notes tables are omitted from Figure 2 for simplicity). The feature-specific attribute value detail is stored in the Building Points Feature Table (BUILDNGP.PFT), Power Lines Feature Table (POWERL.LFT), and other such feature tables.

The most important join tables are the Entity Node Feature Index Table (END.FIT), Edge Feature Index Table (EDG.FIT), Face Feature Index Table (FAC.FIT) and Text Feature Index Table (TXT.FIT). The Feature Index Tables (FIT files) are provided to relate each record of the Feature Tables (PFT, LFT, AFT, TFT files) to their associated graphic primitives in one or more of the tile subdirectories. Other join tables and index files may also be employed, as defined in the VPF and derivative product specifications.

A major aspect of geographic data and the VPF specification that complicates its relational structure is spatial topology. Any single geographic feature (such as a river) might consist of multiple line segments, called graphical primitives. A given line


segment, in turn, might be a part of more than one spatial feature (such as part of the river and an adjacent property boundary). The topology (adjacency and contiguity) properties of VPF features are stored and managed at the graphical primitive level, within the tile subdirectories. VPF specifies a "winged-edge topology" model (see Figure 3) to provide "line network and face topology, and also to maintain seamless coverages across a physical partition of tiles" (DMA 1993a, Appendix B, p. 105).

[Figure 3. Winged-Edge Topology Components. Source: DMA 1993a, p. 106.]

In terms of the database files, the geographic coordinate data is organized into Entity Node (END), Connected Node (CND), Edge or polyline (EDG), Face or polygon (FAC), and Text (TXT) files. Entity Node records have a foreign key to their containing face primitive record; Connected Node records have a foreign key to their starting edge primitive record; and Edge records have foreign keys to their start node, end node, left


face, right face, left edge and right edge primitive records. Face primitive records include a foreign key to the Ring (RNG) table, which indicates the starting edge primitive for each face. The winged-edge topology algorithm (DMA 1993a, Appendix B, pp. 108-111) describes the procedure by which a face primitive is assembled by tracing its constituent edge primitives.

Text primitives consist of a textual label and a shape line that describes the location and path along which the text label is to be displayed. Text features are not topological structures, but simple cartographic elements for identifying certain features at an arbitrary location, such as the name "Chesapeake Bay."

Edge and Text primitives have variable-length records, as they may consist of an arbitrary number of locational points. To facilitate faster access to these primitives, VPF specifies additional index files, named for Edge Index (EDX) and Text Index (TXX) respectively, to be stored in each tile subdirectory. The direct byte offset for each record of the Edge or Text primitive file is stored in the associated Edge Index or Text Index file, sorted by the primary key in the tile-level Edge or Text primitive file. See "Representation of Graphical Primitives and Topological Relationships" starting on page 55 of this thesis, for more details on the implementation of topology in OVPF. See (Chung et al. 1995) for issues and techniques concerning maintenance of topology during geo-feature editing in OVPF.

There is still another layer of data required to represent VPF features, which is a spatial index for each coverage. VPF specifies an adaptive binary tree framework for managing spatial indexes of point, edge and face primitives (DMA 1993a, Appendix F). Spatial tree cells' keys are stored in additional index files for association with their


contained features for use in spatial queries and display. The OVPF prototype viewer currently uses a more efficient quadtree spatial data manager (after Samet 1994) instead of the adaptive binary tree system. See "Design of an Object-Oriented Spatial Index" starting on page 60 of this thesis for more details on the quadtree implementation in OVPF.

A useful aspect of the VPF specification is that the relational files have their schema description within each file's header. This facilitates dynamic interpretation and processing of feature data, as well as a means of coping with some of the differences in structural specifications among the various VPF products.

To give an idea of the complexity of a single VPF database, one 8-megabyte library for the Norfolk Harbor database has about 25,000 geographic features (considered a relatively small data set), and uses over 1500 separate files to describe the location, topology, and other attributes of these features (this seems like a much larger number of files than one might expect, for just eight megabytes of data). Given the high degree of interdependency among features and graphical primitives, it is thus difficult to manage even simple changes to the location of a spatial feature, while assuring referential integrity throughout the coverage.

Historical Technological Developments

Many threads of development have contributed to the present research, which will be loosely categorized according to the period and technology represented. These start with pre-GIS large-scale urban models, object-oriented programming systems, both relational and object-oriented database management systems, and knowledge-based


systems (including active database systems). These are followed by discussion of various approaches to GIS, including proprietary and relational GIS, object-oriented GIS, and knowledge-based GIS. The concluding section presents a synthesis of the work in these fields as it pertains to the current research.

Large-Scale Urban Models

Due to the need to store, relate, and manipulate large amounts of spatial, temporal and topical data, computers have been used to support geographic applications since the 1950s (Budic 1994). Large-scale optimization, econometric, and simulation models of urban and regional systems were developed by the 1960s, but these began to lose favor in the U.S. by the early 1970s (Klosterman 1994). Lee voiced specific and influential concerns in 1973 which are important to keep in mind as we look at the various modeling approaches in this review. He referred to these as the "seven sins of large-scale models" (Klosterman 1994, p. 4): (1) hypercomprehensiveness, or trying to serve too many purposes at once; (2) grossness, providing information too coarse to be useful; (3) hungriness, requiring enormous amounts of data (the management of which is error prone in itself); (4) wrongheadedness, that the models suffered from "substantial and largely unrecognized deviations between the behavior claimed for them and the variables and equations that actually determined their behavior" (Klosterman 1994, p. 4);³ (5) complicatedness, that the models' complexity and internally-generated errors resulted

3. Klosterman writes (1994, p. 4): "As an example, Lee points out that data for an entire metropolitan area were often used to derive model parameters that were then applied to specific neighborhoods -- a computerized version of the ecological fallacy."


in the need to "massage" the models to produce reasonable-looking output; (6) mechanicalness, that the models could produce large, unknowable errors due to iteration and rounding; and (7) expensiveness, that the models' costs were often so high as to require large federal grants just to put to use. As we will see from experiences with other approaches, these issues are not unique to urban models.

Object-Oriented Programming Systems

Starting in the late 1960s and continuing into the early 1980s, compact engineering workstations with the first windowed graphical user interfaces (GUIs) were being developed at the Xerox Palo Alto Research Center (PARC). A new operating system called Smalltalk was among those being developed to take advantage of these advanced processing architectures (Goldberg 1988). Starting with a heritage from the Simula programming language, Smalltalk was a research project of PARC's Learning Research Group (later called the Software Concepts Group) to work toward "a vision of the ways different people might effectively and joyfully use computing power" (Goldberg and Robson 1983, p. vii). While it has long since surrendered its role as a complete operating system,⁴ it has retained many features from that legacy⁵ and remains one of the most powerful and extensible programming languages and software development environments today. It has also been the principal catalyst in the widespread development of the object-oriented (OO) paradigm and object-oriented programming systems (OOPS). These humorous acronyms may have been no accident; early Smalltalk developers tell stories of trying to work on experimental workstations with an MTBF (mean time between failures) of about twenty minutes.⁶ However, some of Smalltalk's best features as a software development environment⁷ emerged in direct response to this rather hostile environment.⁸ A number of very good books are available on programming with Smalltalk (LaLonde 1994; Howard 1995; Smith 1991, 1995; Lorenz 1995).

While the early years of Smalltalk were focussed on defining the language and convincing the software industry of the value of the object-oriented paradigm, attention has shifted since the late 1980s to refining the methods used for analysis and design of object-oriented programs. Numerous approaches have been presented; of these I have preferred Booch and Rumbaugh's work, which essentially integrates a number of distinct techniques, each of which is more or less suited to different specific stages in the software development life cycle (Rumbaugh et al. 1991; Booch 1994; Booch and Rumbaugh 1995). In just the last two years, another focus of attention has been on the study and use of design patterns in object-oriented

4. One of the first Macintosh operating systems was based on Smalltalk, and an early model of Sun workstation was able to boot up under Smalltalk, but no more. However, one or more competing versions of Smalltalk now run on UNIX, IBM MVS, AS/400, OS/2, MS DOS, MS Windows, and Macintosh platforms. Some versions such as ParcPlace VisualWorks support cross-platform portability; i.e., the same program can be run on any of the supported platforms without recompiling, regardless of which computer system was used to create it.

5. Smalltalk helped pioneer the use of lightweight process threads. It also incorporates the use of semaphores and non-preemptive, priority-based process scheduling. It includes a number of other advanced programming features as well; see (Goldberg and Robson 1983, 1989; ParcPlace Systems 1994a, b).

6. Personal communication with Russ Pencin, ParcPlace Systems, 1989.

7. References to "Smalltalk" as a language as well as a programming environment may seem confusing at first, but I will try to distinguish these different usages by context. In fact, Smalltalk represents at times a program organization philosophy; a language with rules of syntax and semantics; and an interactive, graphical, software development environment with a rich set of tools for developing, cross-referencing, debugging, versioning and documenting programs. Most of the tools for building Smalltalk programs are also written in Smalltalk, forming an inherently user-extensible language and development environment.

programming (Gamma et al. 1995; Coplien and Schmidt 1995). These are based on the work of architect and professor Christopher Alexander in his study of design patterns in urban and natural architecture (Alexander et al. 1977). Design patterns provide a very concise vocabulary for discussing object-oriented programming design constructs, which also serves to aid in documenting a program. These are used only slightly in this thesis due to their recent appearance in the literature. Given time and resources, I would like to review OVPF again with the goal of identifying the specific design patterns which occur in the program.

Finally, a very useful book has recently appeared which is directed to supporting "technical managers in organizations to be successful in the use of object-oriented technology" (Goldberg and Rubin 1995, p. v). This book distills decades of experience with Smalltalk and other object-oriented systems to address the many issues of effective project management.

Relational and Object-Oriented Database Management Systems

In the late 1970s and early 1980s, relational database management systems (RDBMS) came out of the research laboratories and began to find general commercial use in corporate minicomputer-based⁹ information systems applications such as finance and accounting. It was also about this time that personal computers (PCs) began to enter the

8. One of the earliest Smalltalk development utilities from ParcPlace Systems was a disk-based audit trail called the "change list" of all programming code as it was written, along with a crash recovery tool to roll-forward changes made by the programmer since the last saved version of the complete program. This soon evolved into a facility to support merging multiple programmers' code.

office. Both of these technologies caught the interest and budgets of planners and other users of GIS. A fairly thorough guide to concepts and issues of RDBMS may be found in Date (1995), with additional perspectives provided in Stonebraker (1988).

An important complementary technology to RDBMS was the development of object-oriented database management systems (ODBMS) by the mid- to late 1980s (see Zdonik and Maier 1990). Initially these were created to meet the needs of complex applications such as computer-aided drawing, engineering or manufacturing (CAD, CAE and CAM), which have traditionally not been well served by the relational database model. Companies offering commercial ODBMS products with Smalltalk interfaces include GemStone, Versant, ObjectStore, Objectivity, and UniSQL. We chose GemStone (GemStone Systems Inc. 1995) and ObjectStore (Object Design Inc. 1995) for evaluation in the current research project because they offer different client-server architectures, and we did not have the resources to examine more than two of these. For more information on client-server issues among ODBMS architectures, see (DeWitt et al. 1992; Cobb et al. 1995a). Another interesting paper outlines its authors' proposed "object-oriented database system manifesto" of issues that need to be properly addressed when working within the object-oriented paradigm (Atkinson et al. 1992).

While the entire ODBMS market today is probably smaller than any one of the major RDBMS companies' customer lists, its influence is being felt. Many corporate information systems managers are switching to ODBMS for standard business

9. "Minicomputers" fill a middle ground between personal workstations and mainframes for multi-user systems. There are hundreds of models in wide use today by Sun, IBM, HP, DEC, and many others. Minis and mainframes have become smaller as workstations have become more powerful, to the point that these distinctions are sometimes hard to make now.

applications that were traditionally based on RDBMS. And most major RDBMS software companies now have a strategy in place for supporting ODBMS applications in the present or near future.

Knowledge-Based Systems

Another field of technological development which has a bearing on my research started in the 1960s with artificial intelligence (AI) and expert systems (ES). There was considerable initial excitement over the possibility of capturing the reasoning and heuristics (rules of thumb) of experts in a complex problem domain for use in computer models. This excitement cooled by the early 1970s due to the failure of the technology to follow through on its early hype and promise, but the field seemed to be saved from demise by the introduction of microcomputers. By the early 1980s, through the massive dissemination of affordable machines capable of meeting the heavy computational requirements of AI, "expert systems were even appearing as part of the most basic educational software packages" (Batty and Yeh 1991, p. 103).

Numerous proposals and case studies regarding the use of expert systems in non-GIS urban and environmental applications have appeared in the literature since the mid-1980s (Dickey et al. 1986; Ortolano and Perman 1987; Davis et al. 1987; Sharpe et al. 1991; Heikkila and Blewett 1992). Special issues of industry journals have been devoted to expert systems in urban and environmental planning and design (Sharpe et al. 1987; Batty and Yeh 1991). Collections of case studies covering a wide range of applications may be found in (Kim et al. 1990; Wright et al. 1993). Leung (1988) provides a theoretical foundation for the use of "fuzzy sets" to represent imprecision in spatial analysis and planning.

So what are expert systems? A good way of describing them is as

... decision aids which represent knowledge about the problem domain in terms of rule-based structures. As such, they are models of the problem-solving process which enable conditional syllogisms in the IF-THEN form to be executed in sequence. In fact, the problem domain is usually represented by a network of such rules and the expert system processes these rules by searching this network to find the ultimate conclusions or the original premises which represent the basic outputs and inputs which drive the system. These systems are organised into a ... knowledge base which contains the data in a form which can be operated upon by the system's inference engine which contains the search procedures. Searching is usually accomplished by forward chaining from premise to conclusion or backward chaining from conclusion to premise. (Batty and Yeh 1991, p. 103)

In the design of expert systems frameworks, technology has branched in three main directions: (1) pure production-rule systems such as OPS5 (Brownston et al. 1985); (2) first-order logic systems such as Prolog and its derivatives (Torsun 1995); and (3) active databases (Chakravarthy 1992; Jaeger and Freytag 1995; Widom and Ceri 1996). Each of these will be described briefly, even though the first two were found to be unsuitable for the problem domain in this research project. Lessons are gained from all three approaches.

Production-rule systems

The basic architecture of production-rule systems, sometimes called simply "production systems," consists of three main components: (1) a data store or working memory, containing a global database of symbols representing facts and assertions about the problem; (2) a set of rules, which constitutes the program, stored in production memory or rule memory; and (3) an inference engine to execute the rules.
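The three components just listed can be illustrated with a minimal sketch. It is written in Python rather than OPS5 or the Smalltalk used in this project, and it is not taken from any of the cited systems: the fact strings, the `Rule` class and the `run` function are hypothetical names chosen for illustration, and each rule is assumed to pair a condition with an action. It shows only a naive forward-chaining cycle in which every rule condition is tested against working memory on every pass.

```python
from dataclasses import dataclass
from typing import Callable, List, Set

# Working memory: a global store of symbolic facts (hypothetical representation).
WorkingMemory = Set[str]


@dataclass
class Rule:
    """One production: a condition to test and an action to run when it holds."""
    name: str
    condition: Callable[[WorkingMemory], bool]
    action: Callable[[WorkingMemory], None]


def run(memory: WorkingMemory, rules: List[Rule], max_cycles: int = 100) -> List[str]:
    """Naive inference engine: test every rule on every cycle until nothing changes."""
    fired = []
    for _ in range(max_cycles):
        changed = False
        for rule in rules:
            if rule.condition(memory):
                before = set(memory)          # snapshot to detect new facts
                rule.action(memory)
                if memory != before:
                    fired.append(rule.name)
                    changed = True
        if not changed:                       # quiescence: no rule added a fact
            break
    return fired


# Rule memory (made-up example rules).
rules = [
    Rule("needs-permit",
         lambda m: "parcel is wetland" in m,
         lambda m: m.add("permit required")),
    Rule("notify-agency",
         lambda m: "permit required" in m,
         lambda m: m.add("agency notified")),
]

memory = {"parcel is wetland"}
fired = run(memory, rules)
print(fired)
print(sorted(memory))
```

Even in this toy form, the cost is visible: every cycle re-examines the whole rule base, whether or not the relevant facts have changed.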
Rules have two parts: a condition to be tested, and an action to execute if the condition proves to be true (Brownston et al. 1985, pp. 6-7). Both forward chaining and backward chaining are supported. Production systems proceed computationally by examining and matching the states of all the data against all the rule conditions in each program cycle. This is well

suited to applications in which the program must respond adaptively to frequent, unpredictable changes in its environment. Unfortunately for our case, there is no means of supporting interactions or sequencing among rules. This would not work well with data models of interdependent geographic features in which the order of rule processing is often important, such as in facilities management applications for utilities. Also, the considerable overhead involved in examining the data store and the rule base in each programming cycle is an unnecessary price to pay "when efficient and provably correct algorithms or even close approximation algorithms exist for a task. ... In general, if the problem and the solution to the problem are well structured or highly structured, it is unlikely that the best computer representation to the problem will be a production-system program." (Brownston et al. 1985, p. 26) In the current project, our problems and solutions tend to be highly structured.

First-order logic systems

First-order (also called predicate) logic programming approaches such as Prolog introduce formal semantics and provable correctness of theorems as the means of solving problems. These are generally backward-chaining systems, in which the system seeks to determine the premise to a given conclusion through exhaustive proofs of applicable theorems which could apply to the resolution of a given inference rule. As with production systems, these have significant shortcomings for dealing with very large databases. As Torsun writes (1995, p. 455): "the use of logic programming routinely in industrial/commercial applications is severely hindered by a serious drawback. This drawback is the inefficiency of logic languages in applications where the problem is complex, large, or both. ... logic programming is domain independent and the search methods are

undirected, but for efficiency to be achieved, proof servers need to be more focused." Another serious shortcoming of logic programming is that these languages do not include tools for building sophisticated windowed GUIs capable of managing tens of thousands of points and vectors at once, so they would have to be somehow integrated with a GUI toolkit for this functionality.

An application area where predicate logic appears to be well suited is that of programming code generation.¹⁰ This is an important area offering increased productivity of programming effort in certain application domains. In this case, the problem domain is limited to the syntax and semantics of the input scripting language and of the generated output code, as far as the logic system is concerned. However, in facilities management, land-use zoning decision making, and many other applications of GIS, the bounds and semantics of the problem domain are too broad and complex to fit the limitations of logic systems.

Active databases

The field of active databases has emerged since the mid- to late 1980s as a very promising technology. "Active database systems are able to recognize specific situations (in the database and beyond) and to react to them without direct explicit user or application requests." (Gatziu and Dittrich 1992, p. 23) This represents in some ways an extension of the traditional "passive" database management system, and in some ways an extension of the OPS5 production-rule system. Active databases are superior to passive databases for enforcing general integrity constraints and enabling triggers, as well as for

10. Personal communication with Dr. Sharma Chakravarthy.

supporting data-intensive expert systems and workflow management applications, since the rule base does not have to fit completely in memory (Widom and Ceri 1996).

Common to most active database systems are the notions of events (or situations) and actions, associated via rules, as in SAMOS (Gatziu and Dittrich 1992). This is often referred to as an "ER" (for event-rule) framework. An event might be the creation, modification or removal of (in our case) a geographic feature object or a graphical primitive object. A rule might associate the removal of an object with an action to check the user's authorization privileges before allowing the event to proceed. Another rule might associate the creation of a new feature object with an action to check and enforce the data integrity constraints for that feature's location or attributes.

An extension of this approach developed for the HiPAC project (Chakravarthy et al. 1989; Dayal et al. 1996) and used in Snoop (Chakravarthy and Mishra 1993) adds the notion of being able to check arbitrary conditions, potentially having to do with objects not related to the triggering event, before firing the associated rule's action. This is often referred to as an "ECA" (for event-condition-action) framework. For example, with this approach we might condition the insertion of a new bridge feature object to depend on the prior existence of a nearby road feature with which it can establish application-dependent associations. These kinds of rule-encoded interdependencies among geographic features would be very useful in facilities management and other complex GIS applications.

Active databases can be based on either relational or object-oriented database models, and depending on their design, can support forward chaining, backward chaining, or both. Most of the earlier research and commercial database products applied reactive capability to RDBMS (Chakravarthy et al. 1989; Stonebraker et al. 1988; Widom and

Finkelstein 1990; Darnovsky and Bowman 1990; InterBase 1990). More recently, others have attempted to incorporate event and rule support into an ODBMS (Gehani et al. 1992; Gehani and Jagadish 1991; Diaz et al. 1991; Chakravarthy et al. 1993; Medeiros and Pfeffer 1990; Su et al. 1989; Anwar 1992). Anwar et al. (1993) examine the implications of the shift from a relational to an object-oriented DBMS, and point out the greater flexibility from using an ODBMS for the active database. In particular, it is noted: "In contrast to a fixed number of pre-defined primitive events in the relational model, every method/message is a potential event" (p. 99). This is an important distinction. For example, a trigger in a typical RDBMS might be set up to take effect on update of a record in a given table, but it is not possible to trigger only on update of a certain field in the table; the trigger will always take effect for an update to the record no matter which field was the one updated.¹¹ In an ODBMS, it is possible to have triggers defined at any granularity of an object's structure.

Another finding from Anwar et al. (1993) is that by appropriate specification, parameterized rules can be associated either with a class object (in which case the rule would be in effect for all instances of the class) or with an individual instance. The Snoop model introduced the notion of complex events, which could be defined either as a sequence of specific primitive events, or as a Boolean composite of multiple primitive events. Taken together, these form a surprisingly simple and powerful set of constructs, which are the basis of the event system now in our OVPF application.

11. One exception is Sybase. This RDBMS is capable of limiting an update trigger to fire only on a change to a specified field.
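The event-condition-action pattern and the sequence form of complex event described above can be sketched as follows. This is a Python illustration under stated assumptions, not code from HiPAC, Snoop, SAMOS or OVPF: `EcaRule`, `SequenceEvent`, `EventManager`, the event names and the road name are all hypothetical, and Boolean composites are omitted for brevity. The example rule follows the bridge-and-road case from the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class EcaRule:
    event: str                           # name of the triggering event
    condition: Callable[[Dict], bool]    # may inspect any state, not only the triggering object
    action: Callable[[Dict], None]


@dataclass
class SequenceEvent:
    """Composite event signalled when its primitive events occur in the given order.

    Simplified: events that do not match the next expected step are ignored.
    """
    name: str
    steps: List[str]
    _position: int = 0

    def observe(self, primitive: str) -> bool:
        if primitive == self.steps[self._position]:
            self._position += 1
            if self._position == len(self.steps):
                self._position = 0       # reset so the sequence can be detected again
                return True
        return False


class EventManager:
    def __init__(self) -> None:
        self.rules: List[EcaRule] = []
        self.composites: List[SequenceEvent] = []

    def signal(self, event: str, context: Dict) -> None:
        self._fire(event, context)
        for composite in self.composites:
            if composite.observe(event):
                self._fire(composite.name, context)

    def _fire(self, event: str, context: Dict) -> None:
        for rule in self.rules:
            if rule.event == event and rule.condition(context):
                rule.action(context)


# Made-up example data: one existing road, no bridges yet.
features = {"roads": ["US-441"], "bridges": []}
log: List[str] = []

manager = EventManager()
# ECA rule: a bridge is inserted only if some road already exists.
manager.rules.append(EcaRule(
    event="insert bridge",
    condition=lambda ctx: len(ctx["roads"]) > 0,
    action=lambda ctx: ctx["bridges"].append("new bridge"),
))
# Composite event: "open map" followed by "insert bridge".
manager.composites.append(SequenceEvent("edit session", ["open map", "insert bridge"]))
manager.rules.append(EcaRule("edit session", lambda ctx: True,
                             lambda ctx: log.append("session logged")))

manager.signal("open map", features)
manager.signal("insert bridge", features)
```

Because the condition receives the whole context rather than only the triggering object, it can consult unrelated features, which is the point of the ECA extension over a plain event-rule pairing.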

The point of building active functionality like this into the database system itself is to ensure consistent usage and high performance. SAMOS (Gatziu and Dittrich 1992) is an example of an active database layer implemented on top of ObjectStore, the same ODBMS we are using for our OOGIS repository. It was found in the SAMOS project that some performance is inevitably lost when the active capability is added on top of the ODBMS rather than being built into the kernel from the beginning. The OVPF application will share this fate, but for the present we are only concerned with prototyping advanced capabilities, and found this an acceptable trade-off, given there are no commercially available ODBMSs with reactive capability.

Geographical Information Systems

There are now numerous textbooks and references on GIS. Two of the more comprehensive books with which I am familiar are (Maguire et al. 1991) and (Laurini and Thompson 1992). These discuss the key issues and current approaches for creating and using geographic databases. Garson and Biggs (1992) also appears to be a useful introductory guide to GIS. Briefly, a GIS provides (in varying levels of quality and ease of use, according to the system's manufacturer): (1) a database of graphical, locational information for a set of geographic features; (2) a synchronized database of nonspatial attributes for the same set of geographic features; (3) a graphical user interface (GUI) with query and update capabilities allowing a user to access and modify the feature data; and (4) analytical capabilities allowing the user to conduct studies taking advantage of the geometrical or topological properties of the geographic data. Some examples of spatial analysis

supported by GIS that were previously not feasible include "estimating runoff volume in specific areas, locating areas with scenic amenity, and searching for paths through three-dimensional space that satisfy certain conditions, such as minimizing distance or construction costs or avoiding major obstacles." (Han and Kim 1989, p. 298) A number of publications have appeared which enumerate the functionality which should or could be found in a GIS (for example, see Goodchild 1988, 1994; Tomlin 1990). In our OVPF project, we have not yet reached the stage of implementing analytical functions such as these; hence we usually refer to OVPF simply as a viewer/editor.

The data models used, and much of the analysis performed, with vector-based GISs depend on graph theory (Harary 1969) and planar topology (Alexandrov 1957, 1965; Munkres 1966; Spanier 1966; Simmons 1963). For the reader interested in probing the mathematics of topology in more depth (according to one well-informed source¹²), "the standard reference is Alexandrov (1965). More mathematical, and a fine book, is Munkres (1966). THE reference for the insider is Spanier (1966)." However, it is unnecessary for our purposes to explore the range of methods for representing topology, as the VPF specification defines a particular manner in which primitive spatial objects shall be represented and associated with each other. The VPF "winged-edge topology" model (DMA 1993a, Appendix B) is a form of the point-line-polygon model typical in existing GIS systems (Worboys 1994). Our OVPF application provides complete support for the VPF winged-edge topology model, as will be described in the Materials and Methods section.

12. Personal communication with Dr. Max Egenhofer.
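As a rough illustration of what a winged-edge record holds, the Python sketch below stores for each edge its two end nodes, the faces on its left and right, and one neighbouring edge at each end. The field names are illustrative assumptions loosely modeled on the winged-edge idea; they are not copied from the VPF specification, and `adjacent_faces` is a hypothetical helper. It shows one query that such records make cheap: finding the faces that share a boundary edge with a given face.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set


@dataclass(frozen=True)
class Edge:
    """Illustrative winged-edge record (field names are assumptions)."""
    id: int
    start_node: int
    end_node: int
    left_face: int
    right_face: int
    left_edge: Optional[int] = None    # neighbouring edge met at one end
    right_edge: Optional[int] = None   # neighbouring edge met at the other end


def adjacent_faces(edges: Dict[int, Edge], face: int) -> Set[int]:
    """Return the faces that share at least one edge with the given face."""
    neighbours = set()
    for edge in edges.values():
        if edge.left_face == face:
            neighbours.add(edge.right_face)
        elif edge.right_face == face:
            neighbours.add(edge.left_face)
    neighbours.discard(face)           # a dangling edge has the same face on both sides
    return neighbours


# Two faces (1 and 2) sharing edge 1; face 0 stands for the surrounding "universe" face.
edges = {
    1: Edge(id=1, start_node=10, end_node=11, left_face=1, right_face=2),
    2: Edge(id=2, start_node=11, end_node=10, left_face=1, right_face=0),
    3: Edge(id=3, start_node=10, end_node=11, left_face=2, right_face=0),
}

print(adjacent_faces(edges, 1))
```

The adjacency is read directly from the stored face references, with no geometric computation on coordinates.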

Another important aspect of a GIS is the choice of spatial indexing algorithm. While VPF specified a particular spatial index approach (the adaptive binary tree), we felt less bound to follow this guideline, for a couple of reasons: (1) the adaptive binary tree could not perform as well as many other structures; and (2) the choice of spatial index is critical to the overall performance of the system for queries and analysis. The version of the Objective Facilities Management (OFM) program from which OVPF evolved used a simple quadtree approach based on (Samet 1994). Other popular approaches include range trees (e.g., R-tree, R+-tree, and R*-tree; see also Samet 1994; Beckmann et al. 1990; Brinkhoff et al. 1993). We decided on the quadtree partly because: (1) the quadtree approach yields unique spatial index keys, whereas the range tree approach does not (this would be important for storing spatial index keys on disk for later use); and (2) it was already implemented in OFM, and had reasonable performance with our prototype. One problem of range trees occurs in the case of overlapping regions for a given spatial object: it is not possible to determine a unique and repeatable index key for the spatial object. The VPF specification, however, provided for storing the spatial index key values as part of the georelational file structure, for the purpose of allowing faster access to features. The use of range trees would preclude our ability to store a repeatable spatial index key with a given feature object in the VPF file structure.

Object-Oriented GIS

Since 1987, numerous object-oriented approaches and data models for GIS have been proposed and examined in the research literature (Egenhofer and Frank 1987; Dueker and Kjerne 1987; Abel 1989; Egenhofer and Frank 1992; Herring 1992; Worboys

1994). It is noted in Egenhofer and Frank (1992, p. 16) that "Object-oriented programming languages will be needed to implement the future GIS most efficiently. ... because it naturally supports the treatment of complex, in this case geometric, objects (Kjerne and Dueker 1990). Compared with conventional data models, an object-oriented design is more flexible and better-suited to describe complex data structures." With regard to ODBMS, the same authors continue (1989, p. 16), "By using a database management system, data are treated by their properties; the object-oriented approach groups these properties into possibly complex objects and corresponding operations."

Two recent doctoral dissertations have been directed to the potential for using object-oriented concepts in GIS (Feuchtwanger 1993; Karnes 1995). Feuchtwanger proposes a geographic semantic database model, incorporating notions of both structural and behavioral aspects of stored information. Karnes implements a Smalltalk-based prototype for modeling land parcel networks in a cadastral cartography application. Karnes' work is especially interesting, as he explores the use of object-oriented programming as a means of modeling and creating novel metaphors for real-world representations in cartographic and geographic domains. It is the flexibility of object-oriented technology (and Smalltalk's development environment) in supporting complex representations that facilitates this application.

Commercial OOGIS products have emerged in the last few years, including Arcview/Avenue from Environmental Systems Research Institute (ESRI 1995a, 1995b), Magik from Smallworld Systems (Smallworld 1995), Gothic Application Development Environment (Gothic ADE) from Laser-Scan (LSL 1995a, 1995b), and of course Objective Facilities Management (OFM 1996).

Avenue is an object-oriented scripting language for supporting Arcview applications. It appears to draw much of its inspiration from both Smalltalk and C++, in terms of syntax and semantics. However, it has some serious shortcomings for use in GIS: (1) it is a closed system, that is, the user may not create new classes or class hierarchies, but can only use the classes provided with Avenue; and (2) because Arcview is not designed or intended to be used to edit Arc/Info coverage data, Avenue cannot support editing of Arc/Info coverages either.

Smallworld Magik is more powerful in some ways than Arcview with Avenue, providing a full-featured object-oriented language with much of the semantics of Smalltalk. The user can create classes and hierarchies of geographic features, and can conduct many useful analytical operations with the system. Smallworld provides a proprietary relational database system for both spatial and nonspatial data, as well as supporting access to Oracle and other commercial RDBMS repositories. Smallworld has so far focussed on facilities management applications for electrical, gas, water and telecommunications industries.

Laser-Scan Gothic ADE is also a powerful object-oriented GIS, providing both scripting language capability and a proprietary object-oriented database system capable of holding spatial and nonspatial geographic data on the order of terabytes in size. Gothic ADE uses an interesting combination of C-language libraries and a high-level scripting language called Lull to achieve what Laser-Scan claims is higher processing performance than is possible with Smallworld's Magik system. Laser-Scan has so far concentrated on the market for large-scale map production systems.

Expert Systems with GIS

There has been considerable progress incorporating geographic data into urban and regional zoning and other policy formation efforts (Maguire et al. 1991; Budic 1994). Research and practice with artificial intelligence (AI) and expert systems (ES) in recent years has resulted in the proposal and development of several models for supporting urban and regional policy studies and implementation (see Dickey et al. 1986; Ortolano and Perman 1987; Davis et al. 1987; Batty and Yeh 1991; Sharpe et al. 1991; Yan et al. 1991). However, in none of these cases was GIS data directly incorporated into the design or use of an expert system. Furthermore, some of the experience papers draw attention to significant difficulties and shortcomings in applying AI or ES technology to urban planning applications (Dickey et al. 1986; Sharpe et al. 1991). Another thoughtful paper discusses numerous legal and ethical issues regarding the use of ES in planning (Wigan 1987).

Other research has focussed on applying AI, ES and DSS approaches to work specifically with GIS as an enabling technology for spatial queries and information analysis (Peuquet 1987; Taylor 1991; Han et al. 1991; Webster et al. 1991; Worboys 1994; Chen et al. 1994; as well as several of the papers from Kim et al. 1990; Wright et al. 1993). A very interesting work from the mid-1980s was KBGIS-II (Smith et al. 1987). This was a project at UC Santa Barbara to develop a knowledge-based GIS system, which was based on Common Lisp, Pascal and C. It included a means of representing both vector and raster (pixel-based) data, and defined a spatial object language. It had: (1) a query mode supporting a simple but versatile set of query forms; (2) a learn mode in which the system could modify and augment its knowledge base; (3) an edit mode in which the user

could modify and augment the spatial object language, as well as the knowledge base; and (4) a trace mode in which the user could follow the processing steps being executed by the system. This work represents more complete development in terms of supporting queries and analysis than has so far been achieved in the current research. However, the thrust of my thesis is toward proof of concept that a completely object-oriented approach has merit for implementing a GIS framework. The KBGIS-II project helps inform the current research of some aspects of the overall framework that should receive attention.

With LOBSTER, Egenhofer and Frank (1990) present an interesting approach to building a Prolog-based spatial query language, resulting in progress toward a high level abstraction of spatial data and geometric operations, but note some significant difficulties. For instance, "Prolog contains no provisions to prevent the entry of invalid or contradictory data. ... Such errors are extremely difficult to detect. If the database contains large numbers of facts, visual inspection by browsing is not possible anymore." (p. 924) Another source of problems was that "Some Prolog programs rely on the order in which facts and rules are entered into the database." Both these issues relate to the difficulties of trying to apply a strictly logic-based approach in a problem domain requiring more procedural control.

Some additional insight into the issues and difficulties of developing expert systems is brought out by Han and Kim (1989). In this paper they discuss some of the distinctions between standard database management systems (DBMS) and decision support systems (DSS) in urban planning (p. 298):

The problems dealt with by DSS are generally different from those dealt with by DBMS. DBMS is suited for structured problems that have a standard operational procedure, decision rules, and clear output formats, such as those used in identifying low income districts or in determining the median income of a city. DSS, on the other hand, is intended

for unstructured or semistructured problems, such as estimating fiscal and other impacts of land development proposals, to provide quantitative support to the decision maker.

Han and Kim go on to inquire as to the reasons why urban planning complicates the use of DSS and more sophisticated expert systems. Among their findings is a list of suggested guidelines for identification of tasks suited to expert systems approaches. These were accumulated through a number of sources (Han and Kim 1989, p. 300):

1. Genuine experts exist who can articulate their problem solving methods;
2. Experts agree on solutions;
3. The task is not poorly understood;
4. The problem typically takes a few minutes to a few hours to solve;
5. No controversy over problem domain rules exists;
6. The problem is clearly specifiable and well-bounded; and
7. The problem solving should be judgmental in nature, not numerical.

For those who are familiar with some of the battlegrounds in urban planning, these conditions will seem simplistic and naive. Some of the reasons they are suggested are to support repeatable results and to allow objective validation of solutions found. In any case we must start somewhere, and progress is being made. It is part of my goal with the current research to contribute to this progress.

Importance and Contributions of This Thesis

We have now looked very briefly at the main technological threads which come together in the current research: urban system modeling, database management, GIS, object-oriented programming, and knowledge-based systems. Thus far, all major commercial GIS products (both relational and object-oriented) except Objective Facilities

Management (OFM) have their own proprietary programming or scripting language and database system repository, although they generally support one or more of the major RDBMS products as well. It is certainly understandable why this would be so: by controlling the language with which a GIS user accesses and modifies the data, the GIS software manufacturer has fewer problems to cope with in system development and integration, as the products and users' applications grow and change. However, it is my perspective that this approach inhibits users' ability to develop innovative solutions to meet their needs, and greatly limits the number of trained programmers in the marketplace who might have experience with a given GIS product.

Considerable talent and effort have been directed to the development of each major programming language such as Cobol, Fortran, Pascal, C, C++, Smalltalk and others. The advances being made yearly with Smalltalk, C++, Java, and other emerging systems are almost staggering. Similarly, RDBMS and ODBMS each represent very significant areas of intense research and development in their own rights, independent of the applications for which they are used. It is inconceivable that any one of the software manufacturers in the GIS field can compete with the functionality, robustness, interoperability on different computer platforms, and tools for development and debugging that are now expected of most modern programming languages and database systems. Nor can the GIS industry easily tap into the larger workforce of experienced programmers and consultants using these other languages and systems. It is my perspective that systems such as OFM and OVPF represent the kind of approach which can combine the capabilities needed in a GIS with the strengths and other advantages provided by using industry-standard programming languages and ODBMS. While OVPF represents a proof-of-concept at this stage,

exhibiting no particular spatial analysis functionality, that kind of capability can be implemented with Smalltalk or another language, and integrated closely with the geo-data handling capabilities already present in OVPF.

In addition to addressing the more traditional aspects of GIS, the current research provides the essential framework for supporting expert systems applications, without having to carry along the significant overhead of a complete "expert system shell." This is done through implementation of a simple, elegant and extensible rule-based framework and event-detection mechanism in Smalltalk, as part of the core functionality for creating and modifying geographic features. Because of certain features in Smalltalk such as dynamic binding (Goldberg and Robson 1989), it is quite straightforward to design an application which can modify its own structure and behavior at runtime, at the user's request. Such a system can also be designed to be capable of adding, modifying, and removing rules based on input from multiple simultaneous users in real time. This is a powerful capability that could conceivably lead to development of expert systems which can learn and adapt to changing conditions, which is necessary functionality for use in increasingly complex urban planning activities.
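As a rough illustration of what adding and removing rules at runtime can look like, the following minimal sketch uses Python rather than Smalltalk. It is not OVPF's rule framework: the names Rule, RuleBase, signal, and the "flag-shallow" rule are all hypothetical, chosen only to show rules being registered, fired by an event, and removed while the program keeps running.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class Rule:
    """One reactive rule: when `event` occurs and `condition` holds, run `action`."""
    name: str
    event: str
    condition: Callable[[Any], bool]
    action: Callable[[Any], None]


@dataclass
class RuleBase:
    """Hypothetical rule registry that can be edited while the program runs."""
    rules: Dict[str, Rule] = field(default_factory=dict)

    def add(self, rule: Rule) -> None:
        self.rules[rule.name] = rule           # an existing name is replaced

    def remove(self, name: str) -> None:
        self.rules.pop(name, None)

    def signal(self, event: str, subject: Any) -> List[str]:
        """Event-detection hook: fire every matching rule, return their names."""
        fired = []
        for rule in list(self.rules.values()):  # copy, so actions may edit the rules
            if rule.event == event and rule.condition(subject):
                rule.action(subject)
                fired.append(rule.name)
        return fired


flagged = []
rule_base = RuleBase()
rule_base.add(Rule("flag-shallow", "changed",
                   condition=lambda f: f["depth"] < 5,
                   action=lambda f: flagged.append(f["id"])))
first = rule_base.signal("changed", {"id": 1, "depth": 3})    # the rule fires
rule_base.remove("flag-shallow")                               # removed at runtime
second = rule_base.signal("changed", {"id": 2, "depth": 3})   # nothing fires now
```

In Smalltalk the same effect follows from dynamic binding, since methods and classes can themselves be changed in a running image; the dictionary of callables above is only the nearest Python analogue.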

MATERIALS AND METHODS

This section is in two parts. The first part describes the software environment which was used to conduct the programming (the "materials"), and the second part describes the various issues encountered and approaches used to carry out the programming tasks.

Object-Oriented Software Development Tools

The development of this GIS framework was greatly facilitated by access to excellent tools: the Smalltalk development environment, a source code management system, an object-oriented database management system (ODBMS), and of course the computer platform itself. These will each be described briefly below.

Smalltalk Programming Environment

Smalltalk was chosen as the development platform for the current project (OVPF) initially because that was the language used for Objective Facilities Management (OFM), its "parent" program. However, there are many reasons for its use in OFM and its continuance in OVPF: its rich development and debugging environment, extensible nature, hooks to commercially available ODBMSs, and scalability for working with both small and large geographic data sets (these varied from 15MB to over 300MB for each complete Vector Product Format source database).

The specific version of Smalltalk chosen was VisualWorks from ParcPlace-Digitalk, Inc. [1] This product included the Smalltalk language editor, compiler, user interface building tools, cross-reference system for variables, objects and methods, and various browsers (rather like having an encyclopedia of the program built for the programmer by the system), all integrated with a graphical interface. The browsers provide lookup capability for (1) all methods that send a given message; (2) all methods that implement a given message; and (3) all references to a given instance variable, class variable, class-instance variable, or global object such as a class itself. The runtime debugger supports examination of the process stack of currently-active methods at any point in time. The debugger also allows the user to edit and recompile a method, then continue execution of the current process stack from the recompiled method, without having to stop and restart the program. These were invaluable tools throughout the development of OVPF.

[1] The company was called ParcPlace Systems Inc. through most of the duration of this project. Digitalk Inc. was ParcPlace Systems' chief competitor until they merged in August 1995. References to their separate products in this thesis are now obsolete, but will be made nevertheless.

Source Code Configuration Management Facility

In addition to VisualWorks, we acquired licenses for ENVY/Developer by Object Technology International (OTI) of Ottawa, Canada (the license is purchased through ParcPlace). This is a sophisticated source code versioning and configuration management facility which has been developed to support all the major brands of Smalltalk (including IBM, Digitalk, and Enfin, besides ParcPlace). ENVY supports team programming by

allowing multiple programmers to share a common library of Smalltalk source code. The library management and security are seamlessly integrated into the programming editor, compiler and browsers, precluding the need for the user to always remember to follow proper library checkout/checkin procedures as is typical with other programming source code managers. Multiple programmers can even divide and work on different portions of the same object class without conflict. This facility was critical for the management of OVPF's hundreds of classes and thousands of methods, developed at an intense pace during the two years, with geographic separation among the team members.

ENVY consists of two modules: one for the server computer, and one for each client programmer. The library manager is installed on a host server that is accessible to all team members (access can even be physically distributed over the Internet, though performance suffers). Each programmer works with a Smalltalk image initially provided with ENVY, that has the library management subsystem integrated with the rest of the Smalltalk development system.

Object-Oriented Database Management System

The third major software component was the ODBMS. While the research project included evaluation and development with both GemStone (GemStone Systems Inc. 1995) and ObjectStore (Object Design Inc. 1995), I will limit this discussion to the design and implementation of OVPF with ObjectStore, for simplicity and clarity. ObjectStore includes both a server module and a client module. The server module must be running on the host computer having the ODBMS repository. Each client programmer then works with a Smalltalk image which has been customized to include hooks for accessing the

ObjectStore server over the network, much like ENVY. All of these layers can be envisioned together as shown in Figure 4 and Figure 5.

Figure 4. Development Hardware
(Diagram labels: Sun Sparc 20 host with ENVY/Manager and ObjectStore ODBMS servers and clients; computer operating system; Smalltalk source code library; VPF and ODBMS repositories, best if located on separate hard disks; additional client programming workstation.)

Figure 5. Development Software
(Diagram labels: VisualWorks Smalltalk client virtual image; ENVY/Manager source code library server; Smalltalk client Object Engine (OE) with hooks for ENVY and ObjectStore; ObjectStore ODBMS server; Sun Solaris 2.4 UNIX operating system.)

Computer Platform

For this project, we used the Sun Solaris 2.4 operating system on a Sparc 20 workstation for the Smalltalk and ODBMS host server. This was connected on a local

token-ring network to other workstations which could serve as clients. Each of the major software subsystems requires considerable processor and data transfer resources. As noted in Figure 4, it is recommended to have each of the major subsystems on its own hard disk, to improve overall performance.

Approaches Used in Building OVPF Components

The remainder of the Materials and Methods section presents the substantive aspects of building the components for OVPF. This was a very large undertaking, and it is beyond the scope of this thesis to describe it in its entirety. Instead, I will focus attention on the portions of OVPF which have the most bearing on the goals and objectives of the research.

Introducing Some Object-Oriented Terms

In the following discussions, it may be helpful to be acquainted with certain common terms used in describing object-oriented designs. The reader is referred to one of the references on Smalltalk for more detailed explanations of object-oriented concepts (Goldberg and Robson 1989; LaLonde 1994).

The term abstract superclass is used to represent a definitional abstraction, such as definition of variables and/or behavior to be shared by its subclasses. Instances are not normally created from abstract classes. Concrete classes, on the other hand, are those that are expected to have instances made from them. These terms are mainly used to aid in learning about a class hierarchy; to call a class abstract simply implies that it lacks behavior needed for creation of a useful instance-object.

Instance variables are data structures for which each instance-object has its own private copy. Instance variable definitions in one class are inherited as part of the definition by all of its subclasses. In Smalltalk, instance variable names begin with a lowercase letter.

Class-instance variables are data structures for which the class object and each of its subclasses are defined to have a private copy of the variable. A class-instance variable can be used, for example, to hold a subclass-specific default value for a constant that can be accessed with the same name from any class in the hierarchy. This helps reduce the program's "variable-name vocabulary," which is one of the benefits of object-oriented design.

Class variables are data structures for which the defining class has a single copy, that can be directly accessed by all of its instances. Per Smalltalk convention, class variables (and other shared objects including classes) have names that begin with an uppercase letter. Class variables are generally used either to hold (1) application-specific constants, or (2) collections of specific instances of a class.

In an object-oriented system all actions are the result of sending a message to an object. The receiver-object then responds by executing a method by that name. For improved readability in this thesis, I use the terms message and method interchangeably. However, these terms have distinct meanings; i.e., for a given message there may be one or more methods defined, as any number of objects can have a method with the same name.
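The three kinds of variable and the message/method distinction can be approximated in Python, with the caveat that Python's class attributes are not the same mechanism as Smalltalk's class and class-instance variables. In this sketch all class and attribute names are hypothetical: `registry` plays the role of a class variable (one shared copy), `default_color` is overridden per subclass in the way a class-instance variable allows, and `ident` is an ordinary instance variable.

```python
class Feature:
    registry = []             # shared by the whole hierarchy, like a class variable
    default_color = "black"   # each subclass may hold its own value under one name

    def __init__(self, ident):
        self.ident = ident    # instance variable: private to each object
        Feature.registry.append(self)

    def describe(self):
        return f"{type(self).__name__} {self.ident} in {self.default_color}"


class LineFeature(Feature):
    default_color = "blue"

    def describe(self):       # same message name, a different method
        return "line: " + super().describe()


class AreaFeature(Feature):
    default_color = "green"


road = LineFeature(1)
park = AreaFeature(2)
descriptions = [f.describe() for f in Feature.registry]
```

Sending the one message `describe` to both objects runs two different methods, which is the sense in which a single message may have several methods defined for it.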

Conversion of Source Data from Vector Product Format to Smalltalk Objects

One of the first steps that was required for OVPF was to build a translator in Smalltalk capable of reading Vector Product Format (VPF) data files. With a few exceptions, these are well specified as having the schema information (metadata) for a given data file contained in the header of that file. The file header is organized in three main sections, and the actual geo-data follows immediately after these sections. This organization is described in (DMA 1993a, section 3.6.1, pp. 15-20). To summarize, the first header section consists of the following fields:

1. Header length: 4-byte integer representing the number of bytes in the header.
2. Byte order flag: 'L' for least-significant byte first, and 'M' for most-significant byte first. [2] (Ironically, this flag must be known before the preceding numeric field can be interpreted.)

[2] Each integer and floating-point number requires 2, 4 or 8 bytes for its representation. The byte order specifies which end of the bytes comes first. This is normally determined by the operating system. PC DOS, for example, is a little-endian platform (least-significant byte first), while Unix and Macintosh are big-endian (most-significant byte first). Since VPF data is intended to be read on any of these platforms, the GIS software needs to be written to translate VPF integer and floating-point numbers appropriately for that platform.

The second header section contains only one field:

3. Table description: up to 80 characters of textual information.

The third and final header section contains the actual schema, which is the essential part for parsing the table's data content. This consists of repetitions of the following fields:

4. Column name: up to 16 characters of textual information.

5. Field type: a single character defining the data type (one of those listed in the first column of Figure 6 below).
6. Number of elements: an integer value representing either (a) the number of textual characters, or (b) the number of occurrences of the specified numeric field type.
7. Key type: a single character for the type of key field represented by the column (one of P-primary, U-unique, or N-none).
8. Column description: up to 80 characters of textual information.
9. Value description table: up to 12 characters for a DOS-compatible filename (either INT.VDT or CHAR.VDT) for the file containing textual descriptions of the different values the column in each data record could have. This will occur when the column is a nonspatial attribute of a given geo-feature.

By knowing the schema for a given table, the program can loop through all the data records in that table, interpreting each field (column value) according to the schema. An example of a feature table with header and records is shown in Figure 7.

Because most of the VPF database tables follow this schema specification, it was straightforward to create a generalized VPF table reader procedure. To implement this, I created two main reader classes, VPFTableHeader and VPFSchemaColumn (see Figure 8), as well as a hierarchy of classes to implement the specific properties and behavior of the various data types listed in Figure 6. The data type classes were used to translate the data values' byte representations between VPF and Smalltalk, as well as to maintain data integrity (e.g., ensuring that text field values did not exceed their schema-specified length). There was an added dimension of translating from the byte-order of the VPF

source data (so far, this has always been little-endian) to that used by the operating system platform on which OVPF was running.

Because the mechanics of reading VPF tables are very straightforward computationally once their format is known and an object structure is chosen, I will not go into any further detail on this particular task. It should suffice to say that Smalltalk was capable of reading and interpreting all VPF data files, including those which did not have their schema in the header; these included variable-length index files, spatial index files, and thematic index files (see DMA 1993a, sections 5.4.1.3, 5.4.2 and 5.4.3, pp. 77-83). The Triplet ID field was particularly troublesome, as it is a variable-length array of one or more integer values, whose length and content are determined by decoding the bits of the first byte (DMA 1993a, section 5.4.6, p. 87). Nevertheless, all these are handled within the OVPF classes just mentioned.

Representation of Metadata Objects

The term metadata is used here to represent the parts of a VPF source database that define the actual geo-feature data. There is a substantial amount of definitional content in a given VPF database, thanks to its open specification. However, the metadata is quite fragmented among numerous files, and must first be assembled and organized in some manner before it is possible to start reading the actual feature data with it.

The approach taken with OVPF is to initialize a metadata object web for each VPF database to be accessed. This is a one-time procedure for each database, after which the metadata web is kept in the ODBMS repository for future use in reading VPF source data from the CD-ROM. Figure 9 summarizes the steps involved in processing the source

Type Abbrv.   Column Type                                Length (Bytes)
T,n           Fixed-length text                          n
T,*           Variable-length text                       n + 4
F,1           Short floating point                       4
R,1           Long floating point                        8
S,1           Short integer                              2
I,1           Long integer                               4
C,n           2-coordinate array, short floating point   8n
C,*           2-coordinate string                        8n + 4
B,n           2-coordinate array, long floating point    16n
B,*           2-coordinate string                        16n + 4
Z,n           3-coordinate array, short floating point   12n
Z,*           3-coordinate string                        12n + 4
Y,n           3-coordinate array, long floating point    24n
Y,*           3-coordinate string                        24n + 4
D,1           Date and time                              20
X,1           Null field                                 (none)
K,1           Triplet id                                 1-13

Figure 6. Vector Product Format Data Types
Source: after (DMA 1993a, Table 56, p. 86)
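The lengths in Figure 6 translate fairly directly into a lookup table. The sketch below is in Python, not the Smalltalk data type classes used in OVPF, and its function names are hypothetical. It assumes that a variable-length field ("*") begins with a 4-byte element count, which is how the "+ 4" entries in Figure 6 are read here. The triplet id layout (a type byte holding 2-bit width codes for up to three integers) follows my reading of the VPF standard and should be checked against DMA 1993a, section 5.4.6, before being relied on.

```python
import struct

# Bytes per element for each type code in Figure 6 ("K" is handled separately).
ELEMENT_SIZE = {"T": 1, "F": 4, "R": 8, "S": 2, "I": 4,
                "C": 8, "B": 16, "Z": 12, "Y": 24, "D": 20, "X": 0}

_WIDTHS = (0, 1, 2, 4)                 # assumed meaning of each 2-bit width code
_FORMATS = {1: "B", 2: "H", 4: "I"}    # unsigned integers of 1, 2, 4 bytes


def decode_triplet_id(data, offset=0, byte_order="<"):
    """Return ((id, tile_id, ext_id), bytes_consumed) for a triplet id field."""
    type_byte = data[offset]
    pos = offset + 1
    values = []
    for shift in (6, 4, 2):            # three 2-bit codes, high bits first
        width = _WIDTHS[(type_byte >> shift) & 3]
        if width == 0:
            values.append(0)           # component absent
        else:
            (value,) = struct.unpack_from(byte_order + _FORMATS[width], data, pos)
            values.append(value)
            pos += width
    return tuple(values), pos - offset


def field_byte_length(type_code, count, data, offset=0, byte_order="<"):
    """Number of bytes occupied by the field that starts at `offset`."""
    if type_code == "K":
        return decode_triplet_id(data, offset, byte_order)[1]
    size = ELEMENT_SIZE[type_code]
    if count == "*":                   # variable length: 4-byte count comes first
        (n,) = struct.unpack_from(byte_order + "i", data, offset)
        return 4 + n * size
    return int(count) * size


triplet = decode_triplet_id(bytes([0x60, 7, 0x34, 0x12]))
text_len = field_byte_length("T", "*", struct.pack("<i", 3) + b"abc")
```

The largest possible triplet is one type byte plus three 4-byte integers, which matches the 1-13 byte range given in Figure 6.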

(Header length and byte order);\
ENVAREA.AFT,Environment Area Feature Table;-;\
ID=I,1,P,Row ID,-,-,:\
F_CODE=T,5,N,FACC Code,CHAR.VDT,-,:\
VAV=I,1,N,Variation Anomaly Value,INT.VDT,-,:;
1 ZC040 2
2 ZC040 1

Figure 7. Example of Feature Table with Header and Records
Source: (DMA 1993a, Table 6, p. 20)

VPFTableHeader
  Instance Variables (with example values):
    tableDesc      ENVAREA.AFT,Environment Area ...
    headerLength   149 (bytes)
    byteOrder      L (least-significant byte first)
    schema         Collection of 3 VPFSchemaColumn instances
  Operations:
    buildSchema
    initializeFromStream: aFileStream
    skipOverHeaderInStream: aFileStream

VPFSchemaColumn
  Instance Variables (with example values):
    name           F_CODE
    description    FACC Code
    type           T (fixed-length text)
    length         5 (characters)
    keyType        N (none)
    vdtFile        CHAR.VDT
  Operations:
    (accessing methods for instance variables)
    byteLengthFromStream: aFileStream
    datumValueFromStream: aFileStream
    putDatumValueInStreamUsingHeader: aVPFTableHeader

Figure 8. VPFTableHeader and VPFSchemaColumn Class Definitions and Example Instance Values
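A table header laid out as in Figure 7 can be parsed in a few steps. The following is a simplified Python sketch, not the VPFTableHeader implementation: the class and function names are hypothetical, the header length is assumed to count the bytes that follow the 4-byte length field, and column descriptions are assumed to contain no commas. It peeks at the byte order flag first, since that flag governs how the length field that precedes it must be decoded.

```python
import struct
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SchemaColumn:
    name: str
    type_code: str
    count: str                  # a number, or "*" for variable length
    key_type: str
    description: str
    vdt_file: Optional[str]     # None when the header gives "-"


@dataclass
class TableHeader:
    length: int
    byte_order: str
    description: str
    columns: List[SchemaColumn]


def parse_column(text: str) -> SchemaColumn:
    name, rest = text.split("=", 1)
    type_code, count, key_type, description, vdt = rest.split(",")[:5]
    return SchemaColumn(name, type_code, count, key_type, description,
                        None if vdt == "-" else vdt)


def parse_header(raw: bytes) -> TableHeader:
    flag = raw[4:5].decode("ascii")            # peek past the length field
    prefix = ">" if flag == "M" else "<"
    (length,) = struct.unpack_from(prefix + "i", raw, 0)
    text = raw[4:4 + length].decode("ascii")
    byte_order, description, _narrative, column_text = text.split(";")[:4]
    columns = [parse_column(c) for c in column_text.split(":") if c]
    return TableHeader(length, byte_order, description, columns)


sample_text = ("L;ENVAREA.AFT,Environment Area Feature Table;-;"
               "ID=I,1,P,Row ID,-,-,:"
               "F_CODE=T,5,N,FACC Code,CHAR.VDT,-,:"
               "VAV=I,1,N,Variation Anomaly Value,INT.VDT,-,:;").encode("ascii")
header = parse_header(struct.pack("<i", len(sample_text)) + sample_text)
```

Once the columns are known, each record can be read field by field using the type code and element count of each column.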

database metadata in creating a Smalltalk object web for this data. Notice that each of the OVPF metadata classes has pointers to two other metadata classes. This is a way of representing hierarchical containment, or aggregation, with both forward and backward pointers. For example, the libraries instance variable of VPFDatabase holds onto a collection of VPFLibrary instances, each of which holds onto a "back-pointer" to its VPFDatabase container. With this structure (starting from the bottom of the pointers in Figure 9), an individual VPFFeatureDef instance can quickly traverse its lineage to access coverage-, library-, and database-level metadata as needed.

Only a couple of method names have been shown in Figure 9. VPFDatabase class has a set of methods for initializing the metadata object web (represented here by the method "initializeVPFProductFrom: pathname"). Interpretation of actual feature data based on the metadata has been made the responsibility of VPFCoverage, as this corresponds to the level at which features and graphical primitives are linked in the VPF source database.

Representation of Geo-Feature Objects

As should now be all too apparent, the complete set of data representing each geo-feature in Vector Product Format's file structure is very fragmented. One of the benefits of the object-oriented approach is to tie the pieces together with object pointers instead of join tables, for more direct access and control. The greatest single cause of the fragmentation within a given coverage is the need to represent and synchronize both spatial and nonspatial attributes of the features. This subsection describes the assembly of nonspatial aspects of each feature, while the next subsection describes the handling of spatial and topological attributes.

Figure 9. Steps to Create Metadata Web (Please also refer to Figure 2)

1. Process database-level files: create VPFDatabase instance; assign rdbPath variable to hold the directory pathname for the source database; store the VPFTableHeader instances created for reading the Database Header Table (DHT) and Library Attributes Table (LAT) in VPFDatabase instance variables.
2. Process library-level files: loop through Library Attributes Table (LAT) to initialize all VPFLibraries for this database (name, bounds, scales, tile names); store the VPFTableHeader instances created for reading the Library Header Table (LHT), Geographic Reference Table (GRT), and Coverage Attributes Table (CAT) in VPFLibrary instance variables.
3. Process coverage-level files: loop through Coverage Attributes Table (CAT) to initialize all VPFCoverages for this library (name, Value Description Table (VDT) headers, Feature Index Table (FIT) headers, all primitive table headers, and feature-notes headers).
4. Process feature-level files: loop through Feature Class Attributes (FCA) to initialize all VPFFeatureDefs for this coverage (name, Feature Table (FT) header, prim header, prim-join headers, Value Description Table (VDT) entries that are valid for this featureDef, coverage and library).

VPFDatabase
  Instance Variables: rdbPath, libraries, dhtHeader, latHeader
  Operations: initializeVPFProductFrom: pathName

VPFLibrary
  Instance Variables: database, coverages, tiles, lhtHeader, grtHeader, catHeader

VPFCoverage
  Instance Variables: library, featureDefs, spatialIndex, level, featureNotes, fcaHeader, fcsHeader, chaHeader, intHeader, notHeader, endFitHeader, cndFitHeader, edgFitHeader, facFitHeader, txtFitHeader, endPrimHeader, cndPrimHeader, edgPrimHeader, ebrPrimHeader, facPrimHeader, fbrPrimHeader, rngPrimHeader, txtPrimHeader
  Operations: importRelationalCoverage

VPFFeatureDef
  Instance Variables: coverage, features, fclass, attribsVDT, fcsJoinDict, ftPath, ftHeader, primPath, primHeader, primjtPath, primjtHeader, njtPath, njtHeader, notejoins
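The forward and backward pointers of the metadata web can be pictured with a small Python sketch. This is not the OVPF class code: only the containment relationships (libraries/database, coverages/library, featureDefs/coverage) come from Figure 9, while the add_ methods, the lineage method, and the sample names are hypothetical.

```python
class Database:
    def __init__(self, rdb_path):
        self.rdb_path = rdb_path
        self.libraries = []                 # forward pointers

    def add_library(self, name):
        library = Library(name, self)
        self.libraries.append(library)
        return library


class Library:
    def __init__(self, name, database):
        self.name = name
        self.database = database            # back-pointer to the container
        self.coverages = []

    def add_coverage(self, name):
        coverage = Coverage(name, self)
        self.coverages.append(coverage)
        return coverage


class Coverage:
    def __init__(self, name, library):
        self.name = name
        self.library = library
        self.feature_defs = []

    def add_feature_def(self, fclass):
        feature_def = FeatureDef(fclass, self)
        self.feature_defs.append(feature_def)
        return feature_def


class FeatureDef:
    def __init__(self, fclass, coverage):
        self.fclass = fclass
        self.coverage = coverage

    def lineage(self):
        """Walk the back-pointers up to the database level."""
        library = self.coverage.library
        return (library.database, library, self.coverage)


database = Database("/hypothetical/vpf/source")     # placeholder path
library = database.add_library("harbor")
coverage = library.add_coverage("env")
feature_def = coverage.add_feature_def("ENVAREA")
```

Because every level holds both directions, a feature definition reaches database-level metadata in three pointer hops, with no table joins or searches.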

The definitional organization of feature attributes in OVPF is depicted in Figure 10. The VPFFeature hierarchy handles the nonspatial aspects of features, while the VPFFeatureSymbol class hierarchy handles the spatial aspects. The featureDef instance variable defined in VPFFeature class provides the link for each feature instance to its complete set of metadata just described (see Figure 9). The methods shown in Figure 10 are a small subset of the full set of procedures implemented, but these are sufficient for the present discussion.

Notice the ReadWriteStream object (Figure 10B) which is used to represent the attributes instance variable of each geo-feature object. The ReadWriteStream is an important object that is part of the Smalltalk system class library and is used to represent and manage sequentially-accessed data collections, much like one might think of accessing data on a magnetic tape. The contents instance variable of this object is used in this case to hold all nonspatial attribute values in a single collection of bytes, which is essentially a direct copy of the geo-feature's source data record from the VPF feature table.

Two simple but important methods in VPFFeature are "valueForAttribute: aName" and "putValue: aValue forAttribute: aName." These methods provide a generalized means of accessing and modifying any one of a feature's nonspatial attributes (such as F_CODE or VAV from Figure 7). Essentially, these methods look up the attribute's data type and position from the feature table schema (held in the ftHeader instance variable of the VPFFeatureDef metadata object), then perform the selected action on those bytes in the contents of the ReadWriteStream instance. Other methods in the VPFFeature class hierarchy not shown here include means of maintaining the correct feature-primitive linkages during changes in topology.
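The schema-driven attribute access just described can be sketched in Python as follows. This is an illustration under stated assumptions, not the VPFFeature code: the class FeatureRecord and its method names are hypothetical, only the I (4-byte integer) and T (fixed-length text) types are handled, and a bytearray stands in for the contents of the ReadWriteStream.

```python
import struct

# (column name, type code, element count), as in the Figure 7 example table.
SCHEMA = [("ID", "I", 1), ("F_CODE", "T", 5), ("VAV", "I", 1)]


class FeatureRecord:
    """Holds one feature's nonspatial attributes as raw record bytes."""

    def __init__(self, schema, record, byte_order="<"):
        self.schema = schema
        self.record = bytearray(record)
        self.byte_order = byte_order

    def _locate(self, name):
        """Return (offset, struct format) for the named column."""
        offset = 0
        for column, type_code, count in self.schema:
            fmt = self.byte_order + (f"{count}s" if type_code == "T" else "i")
            if column == name:
                return offset, fmt
            offset += struct.calcsize(fmt)
        raise KeyError(name)

    def value_for_attribute(self, name):
        offset, fmt = self._locate(name)
        (value,) = struct.unpack_from(fmt, self.record, offset)
        return value.decode("ascii") if isinstance(value, bytes) else value

    def put_value(self, value, name):
        offset, fmt = self._locate(name)
        if isinstance(value, str):
            width = struct.calcsize(fmt)
            if len(value) > width:          # keep text within its schema length
                raise ValueError(f"{name} holds at most {width} characters")
            value = value.ljust(width).encode("ascii")
        struct.pack_into(fmt, self.record, offset, value)


feature = FeatureRecord(SCHEMA, struct.pack("<i5si", 1, b"ZC040", 2))
code_before = feature.value_for_attribute("F_CODE")
feature.put_value(7, "VAV")
```

The record bytes stay in their source layout throughout, so writing the feature back to a VPF table needs no conversion beyond byte order.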

VPFFeatureSymbol objects hold onto the actual graphic primitives in their graphicElements instance variable (this is the subject of the next topic). VPFFeatureSymbols also respond to display-related requests from the graphical user interface.

Representation of Graphical Primitives and Topological Relationships

One of the more intriguing issues was deciding how to represent spatial topology (adjacency and contiguity of graphical primitives) within the Smalltalk object-oriented data model. Each feature object is associated with a set of latitude-longitude coordinates, referred to as graphical primitives. Point features are associated with entity- and connected-node primitives. Line features are associated with edge primitives. An area feature is associated with a face primitive consisting of a ring of edge primitives, and text features are associated with text primitives. Because any one line feature object may consist of multiple edges, and any single node, edge or face primitive could be used by more than one feature object, great care must be taken to maintain the correct linkages between the features and primitives. Topological relationships among the primitives must also be maintained across all features within a given coverage and tile, according to the VPF specification.

OVPF's predecessor, Objective Facilities Management (OFM), introduced an object known as a DrawOrders. [3] This is a very simple structure whose inspiration is drawn from Digitalk's Smalltalk/V for OS/2 Presentation Manager (Digitalk 1989, p. 464).

[3] The DrawOrders class and related GraphicsEngine were initially developed by Bob Williams for OFM.

Figure 10. Representation of Geo-Features in OVPF

(A) VPFFeature abstract class hierarchy: VPFFeature class provides shared definition and methods for accessing and modifying attributes and defaultColor. VPFLineFeature class provides shared definition of defaultLineType. These classes are not instantiated, but are abstract superclasses of concrete feature classes. Each subclass has its own private copy of a value for defaultColor, defaultLineType and defaultAreaPattern (where defined). The featureDef instance variable holds onto an object pointer to the metadata objects for each feature class.

(B) Instances of ReadWriteStream class are used to hold nonspatial attributes of each instance of a VPFFeature subclass; ReadWriteStream instances understand how to read ("next" message), write ("nextPut" message), and reposition themselves ("reset" message and others).

(C) VPFFeatureSymbol class hierarchy: These classes provide shared definition and methods for accessing and modifying the graphical primitives for a given feature (see Figure 11 below). VPFFeatureSymbol is an abstract class with no instances, but instances are made from each of its subclasses.

(A) VPFFeature
      Class Instance Variables: defaultColor
      Instance Variables: id, featureDef, attributes, notes, symbol
      Operations:
        valueForAttribute: aName
        putValue: aValue forAttribute: aName
    Classes below VPFFeature in the hierarchy:
      VPFLineFeature (Class Instance Variables: defaultLineType)
      VPFAreaFeature (Class Instance Variables: defaultAreaPattern)
      VPFPointFeature
      VPFTextFeature

(B) ReadWriteStream
      Instance Variables: contents, position
      Operations:
        next: anInteger
        nextPut: anObject
        nextPutAll: aCollection
        reset

(C) VPFFeatureSymbol
      Instance Variables: feature, graphicElements, boundingBox, color, isHilighted
      Operations: beErased, beHilighted, beUnhilighted
    Classes below VPFFeatureSymbol in the hierarchy:
      VPFPointFeatureSymbol
      VPFLineFeatureSymbol (Instance Variables: lineType)
      VPFAreaFeatureSymbol (Instance Variables: areaPattern)

The structure contains a variable-length array of bytes (the contents attribute). Each contents array has the following implicit organization:

• opcode: a single byte whose integer value (0-255) represents an operation code, such as set polyline, continue line, set color, etc.
• byte length: a single byte whose integer value (0-255) represents the number of bytes remaining in this draw order.
• data bytes: the bytes whose integer or floating-point values represent the location points, the line-color index, etc. for this draw order.

The contents byte-arrays from several DrawOrders can be concatenated into a single DrawOrders instance, to include an arbitrary number of instructions for displaying complex graphical objects. This structure is not only versatile, it is very compact and efficient for representing variable-length locational coordinate data.

Even without the need to manage spatial topology, DrawOrders are useful objects for handling graphical data and operations. However, supporting VPF graphical primitives with full spatial topology requires a refinement of this definition, so these primitives are implemented by subclassing the DrawOrders class. A straightforward example would be to have EntityNode, ConnectedNode, Edge, Face, and Ring classes defined as direct subclasses of DrawOrders. In this way, each subclass would inherit the DrawOrders contents instance variable, and add its own specific topological attributes as needed.

It is also important, however, for each graphical object to hold onto a collection of the OVPF feature objects that use that graphical object. This is handled in OVPF by defining the TopologicalStructure class as a subclass of DrawOrders and as a superclass of each VPF graphical primitive class (see Figure 11). The TopologicalStructure's features attribute is

handled as a collection of VPFFeatures because a given unique graphical primitive may be used to help draw any number of VPF geo-features. Each feature object holds onto an identity-pointer to its corresponding collection of graphical primitive objects, thus enabling both features and primitives to have access to each other. In addition, the primId (primitive ID) and tileId attributes of TopologicalStructure are inherited by each subclass, providing a holding place for primary-key data from the relational-VPF files. For simplicity of supporting both import and export operations with the relational-VPF data, the primId, tileId, and topological attributes are assigned the VPF record ID value of the corresponding graphical primitive objects, rather than unique object-identity pointers. As features are added, deleted, and moved with respect to each other, these primId values are maintained just as they would be in a relational GIS framework.

VPFDrawOrder (Instance Variables: contents)
  VPFTopologicalStructure (Instance Variables: features, primId, tileId)
    VPFEntityNode (Instance Variables: containingFace)
    VPFConnectedNode (Instance Variables: firstEdge)
    VPFEdge (Instance Variables: startNode, endNode, leftEdge, rightEdge, leftFace, rightFace)
    VPFFace (Instance Variables: ringPtr)
    VPFRing (Instance Variables: firstEdge)
    VPFTextPrim (Instance Variables: text, shapeLine)

Figure 11. Object Definitional Hierarchy for Representing VPF Graphical Primitives with Spatial Topology
Source: after (Arctur et al. 1995b, p. 14)
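The opcode / byte length / data bytes layout described for DrawOrders lends itself to a short Python sketch. The opcode numbers, function names, and the choice of little-endian 4-byte floats for coordinates are all assumptions made for illustration; the thesis does not give the actual opcode values or coordinate encoding used in OFM and OVPF.

```python
import struct

SET_COLOR, SET_POLYLINE = 1, 2          # hypothetical opcode values


def make_order(opcode, data):
    """Encode one draw order as opcode byte, length byte, then data bytes."""
    if len(data) > 255:
        raise ValueError("a single draw order carries at most 255 data bytes")
    return bytes([opcode, len(data)]) + data


def polyline(points):
    """Encode (x, y) pairs as a polyline draw order of 4-byte floats."""
    flat = [coordinate for point in points for coordinate in point]
    return make_order(SET_POLYLINE, struct.pack(f"<{len(flat)}f", *flat))


def iter_orders(contents):
    """Walk a concatenated contents array, yielding (opcode, data) pairs."""
    pos = 0
    while pos < len(contents):
        opcode, length = contents[pos], contents[pos + 1]
        yield opcode, contents[pos + 2:pos + 2 + length]
        pos += 2 + length


# Two draw orders concatenated into one contents array.
contents = make_order(SET_COLOR, bytes([3])) + polyline([(1.0, 2.0), (3.5, 4.0)])
orders = list(iter_orders(contents))
```

Because each order carries its own length, orders can be appended or skipped without parsing their data, which is what makes concatenation into a single contents array cheap.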

Presently, all source data comes to OVPF from relational-VPF databases. At this stage in the prototype development, we have assumed that all feature attribute, location, and topological relationships in the source data are initially correct. Thus we can focus our attention on developing full build and clean topological support (ESRI 1994) in a stepwise manner, beginning with simply maintaining topology locally during individual feature changes. We now have the capability to interactively add, delete, and change location coordinates of a single point, line or area feature at a time within a given tile, while maintaining correct topological relationships with adjacent and contiguous features (Chung et al. 1995). This is handled with the help of a graphical user interface (GUI) that requires the user to accept and commit changes to each topological relationship.

Design of an Object-Oriented Spatial Index

The spatial index framework in OVPF is implemented with just two main classes (this is a slight simplification for purposes of discussion). These classes are the VPFSpatialDataManager and VPFSpatialDataCell, shown in Figure 12. This framework presently uses a quadtree organization (after Samet 1994) [4], in which each level of the tree represents a rectangular geographic area and can be divided into four equally-sized quadrants. Each quadrant in turn can be subdivided, continuing recursively until some predetermined limit is met. Within this structure, each geo-feature is inserted into the smallest quadtree cell which can completely contain it. This cell is the feature's spatial

[4] The quadtree structure used in OVPF is a simplified form of a spatial index structure originally implemented by Bob Williams for OFM.

index. Insertion and queries start from the root (topCell) and progress recursively, until one of two conditions is met: (1) the smallest cell has been found, or (2) the maximum number of levels allowed has been reached. A maximum depth is necessary to limit the extent of recursion for very small geo-features such as points. In OVPF we have used a maximum level of 20. Note that features are indexed, not graphical primitives. This is to reduce the complexity and computational overhead of insertion and retrieval.

This design has some very interesting implications and potentials, which are brought out in the final Discussion section. One point to mention now, however, is that because each VPFSpatialDataCell holds onto direct object pointers to the features which fit within its boundaries, the quadtree is more than just an index; it is an efficient, general purpose container structure for all the geo-features. This proves useful in the design of the ODBMS repository, which is the next topic.

Organization of Object Webs in ODBMS Repository

There are two main groupings of database objects in OVPF: the metadata objects (table headers, schema definitions, value descriptions, and others); and the geo-feature objects and primitives. Each of these object groups needs to be stored in the ODBMS repository, that is, "made persistent." Another category of OVPF objects includes the user interface classes. These are the support classes which present the map on the computer screen and allow interaction with the user. It is very important that the user interface classes are not made persistent, for reasons that will be presented shortly.

Typically, a complete web of objects is made persistent by reference to some root or parent object for the set during the course of a database transaction. This root object

Figure 12. Principal Classes and Behavior for Quadtree Spatial Index
(A) VPFSpatialDataManager is a subclass of Object, and understands how to: create and initialize a quadtree; pass a geo-feature to the quadtree for insertion; ask the quadtree to remove a given geo-feature; and ask the quadtree for all features within a given area.
(B) VPFSpatialDataCell is a subclass of Array, having four indexed slots in addition to the named instance variables. These indexed slots each hold onto an object pointer to another instance of VPFSpatialDataCell. Each cell understands how to: determine if it is the smallest cell capable of containing the rectangular area requested; propagate the request for the smallest cell recursively to the next lower-level cell; propagate the request back up one level if it cannot hold the requested rectangle; and gather and return pointers to all features contained within a given rectangle, regardless of the number of levels involved.

(A) VPFSpatialDataManager
    Instance Variables: coverage, topCell, maxLevel
    Operations:
        initializeMin: minPt max: maxPt
        collectionOfContainersFor: aRectangle
        containerFor: aRectangle
        returnSetOfIntersectingFeatures: aRectangle

(B) VPFSpatialDataCell (subclass of Array)
    Instance Variables: id, superCell, level, manager, origin, corner, width, features (plus four array slots for subcells)
    Operations:
        canContainBoundingBox: aRectangle
        containedLowerLevelContainerForBoundingBox: aRectangle
        containerForBoundingBox: aRectangle
        createLowerLevelCellForIndex: index
        lowerLevelContainerForBoundingBox: aRectangle
        upperLevelContainerForBoundingBox: aRectangle
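The insertion and query behavior described for the quadtree can be illustrated with a short sketch. It is written in Python rather than the Smalltalk used for OVPF, and it is not the OVPF implementation: the names Rect, Cell, insert, query, and MAX_LEVEL are hypothetical stand-ins. It assumes closed rectangle boundaries and shows only the two stopping conditions (smallest containing cell found, or maximum level reached) and the collection of features across levels.

```python
from dataclasses import dataclass

MAX_LEVEL = 20  # the maximum depth reported for OVPF


@dataclass(frozen=True)
class Rect:
    xmin: float
    ymin: float
    xmax: float
    ymax: float

    def contains(self, other):
        return (self.xmin <= other.xmin and self.ymin <= other.ymin
                and other.xmax <= self.xmax and other.ymax <= self.ymax)

    def intersects(self, other):
        return not (other.xmax < self.xmin or other.xmin > self.xmax
                    or other.ymax < self.ymin or other.ymin > self.ymax)

    def quadrants(self):
        xm = (self.xmin + self.xmax) / 2
        ym = (self.ymin + self.ymax) / 2
        return [Rect(self.xmin, self.ymin, xm, ym),
                Rect(xm, self.ymin, self.xmax, ym),
                Rect(self.xmin, ym, xm, self.ymax),
                Rect(xm, ym, self.xmax, self.ymax)]


class Cell:
    def __init__(self, bounds, level=0):
        self.bounds = bounds
        self.level = level
        self.features = []          # (name, bounding box) pairs held directly
        self.subcells = [None] * 4  # four slots, created only when needed

    def insert(self, name, box, max_level=MAX_LEVEL):
        """Place the feature in the smallest cell that completely contains it."""
        if self.level < max_level:
            for i, quad in enumerate(self.bounds.quadrants()):
                if quad.contains(box):
                    if self.subcells[i] is None:
                        self.subcells[i] = Cell(quad, self.level + 1)
                    return self.subcells[i].insert(name, box, max_level)
        # No quadrant can hold the box, or the depth limit was reached.
        self.features.append((name, box))
        return self

    def query(self, area, found=None):
        """Collect names of all features whose box intersects the area."""
        found = [] if found is None else found
        if not self.bounds.intersects(area):
            return found
        found.extend(n for n, b in self.features if b.intersects(area))
        for sub in self.subcells:
            if sub is not None:
                sub.query(area, found)
        return found
```

In this sketch a feature that straddles a quadrant boundary stays in the parent cell, while a point feature descends until the depth limit stops the recursion.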

can provide a named entry point to the persistent object web for future access by other application programs. In the case of the metadata object web, the root is a collection object of all initialized databases, keyed by their database name. For example, this collection has a member called 'DNC01' which points to the persistent database for Norfolk Harbor.

For the feature objects, one logical root object is the spatial tree manager, which holds a pointer to the linked list of spatial tree cells, each of which holds pointers to the features whose bounding rectangle falls within the cell's boundaries. Each feature object (instance of a VPFFeature subclass) holds onto its attributes stream and its symbol (instance of a VPFFeatureSymbol subclass). Since each coverage has its own spatial index, the spatialIndex instance variable of VPFCoverage was defined to hold the persistent pointer to the VPFSpatialDataManager instance in charge of the coverage's quadtree (see Figure 9 on page 52).

Another logical root object for feature objects is the instance of VPFFeatureDef which defines a given feature class. Providing access to features via their VPFFeatureDef instance would be useful in certain query optimizations. The features instance variable was thus defined for the VPFFeatureDef class, to hold a second set of direct pointers to the persistent feature objects (also shown in Figure 9).

Establishing cut-points in object webs

Normally, a request to make an object persistent results in migrating the complete transitive closure⁵ of all objects to which the requested object points, into the external database. A case where this is not desirable is where links to the user interface are held by persistent objects. One reason a user interface object should not be made persistent is that it contains numerous references to transient objects that can only be assigned and changed

by the host operating system, such as window handles, file handles, and so on. The other main reason a user interface object should not be made persistent is that it touches so many of the Smalltalk run-time environment objects that it would essentially pull the entire Smalltalk memory image into the external database with it.

ObjectStore provides a means of resolving this issue with the notion of cut-points. By adding a particular method to each of the user interface classes, the transitive closure operation can be made to insert a cut-object in place of the reference to the user interface object itself. This cut-object reference is then replaced at run-time by the "live" object reference when needed.

Design of a Rule-Base Framework to Support Geographic Feature Editing

The rule-based framework was added to OVPF in the second year of the project as a means to help enforce data integrity constraints on features during interactive updates. Rules in this framework can be defined to "fire" upon occurrence of a particular event, subject to arbitrary conditions anywhere in the database. Should one of the rules be triggered and its associated conditions hold true, then a predefined action would be carried out. The following discussion shows how this is implemented in OVPF.

5. Transitive closure is a term from graph theory, denoting the set of all pairs of nodes directly or indirectly connected by a sequence of edges. In the case of object webs, it refers to all objects connected by association or containment from a given root object (after Rumbaugh et al. 1991, p. 57).

Event Objects

Events are first-class objects in this framework as they have significant state and behavior (Arctur et al. 1995d). The PrimitiveEvent class in Figure 13 defines an "eventMsg" attribute which is inherited by all its subclasses. For each new instance of any event, this attribute is assigned the name of the message for which the event is raised. The ComplexEvent class defines further attributes used by its own subclasses.

Figure 13. Event Class Hierarchy
PrimitiveEvent
    Instance Variable: eventMsg
    Operations: notify:
    ComplexEvent (subclass of PrimitiveEvent)
        Instance Variables: event1, event2, event1Occurred, event2Occurred
        ConjunctionEvent (subclass of ComplexEvent)
            Operations: notify:
        DisjunctionEvent (subclass of ComplexEvent)
            Operations: notify:
        SequenceEvent (subclass of ComplexEvent)
            Operations: notify:

The key method for each of the event classes is notify:. This method takes only one argument which specifies the name of the message which causes an event to be raised. The event is raised when the object(s) associated with the event object receives that message. For PrimitiveEvents the notify: method simply compares the argument to its own eventMsg attribute value and returns true if they match. For each ComplexEvent subclass, the notify:

method also examines a particular combination of the status of its other attributes, before returning true or false. An event instance is typically created at the time of rule creation.

Rule Objects

VPFRule objects have the structure shown in Figure 14. A single class suffices for defining all rules.

Figure 14. Structure of a Rule Object
VPFRule
    Instance Variables: feature, event, condition, action, actionPriority, preOrPost
    Description of rule attributes:
        feature: pointer to a geographic-feature class or feature instance
        event: pointer to an instance of PrimitiveEvent or one of its subclasses
        condition: condition-test method name
        action: action method name
        actionPriority: integer value 1 (low) to 100 (high)
        preOrPost: flag specifying if condition is tested before or after the message raising the event

The feature instance variable may be assigned a pointer to either a single geographic-feature instance, such as a road or lake; or to a feature class, such as the defining class for roads or lakes. In the former case, the rule will be applied only to a particular instance, whereas in the latter, the rule will be applied to all instances of the defining class. The event instance variable is assigned a pointer to a specific event instance (introduced above), which could be either a PrimitiveEvent or a ComplexEvent. The condition attribute is assigned the name of a method to be executed at the time the event is signalled, which will return true if the condition is met and false otherwise. The action method is then executed if the condition evaluates to true. The preOrPost attribute specifies the relative timing for execution of the condition method with respect to the message raising

the event. The condition may be evaluated either before the event message is executed, or upon completion and return from the event message execution. The actionPriority attribute value is used to help mediate in situations where multiple rules fire at the same time.

Event Detection Mechanism

The final component of this framework is the mechanism by which events are detected and rules are fired. In the OVPF viewer/editor tool, all changes to geo-feature objects are handled through the use of FeatureConstructor objects, which use a script-based framework with a state machine, supporting asynchronous events for flexibility in working with runtime-dependent constraints on changes to a given feature.⁶ This framework has the potential for extending its own semantics at runtime. See Figure 15 for a simplified representation of the VPFFeatureConstructor hierarchy.

6. The FeatureConstructor framework was first developed by Bob Williams for OFM. Very minor changes were needed to accommodate the rule-based capability.

With this framework, a user request to create or modify a geo-feature via the GUI is forwarded to the appropriate PointFeatureConstructor, LineFeatureConstructor or AreaFeatureConstructor. The constructor is given the name of the geo-feature class, which it instantiates with default values for all attributes. In the case of creating a new geo-feature object, the constructor then prompts the user for the feature's location. At this point, the constructor notifies the new feature object of the intended action. This notification results in a lookup to the feature's rule base. Any rules having events defined for the current operation will have the opportunity to check for any particular conditions in

the database that are of interest. After all rules' conditions have been checked, those which evaluated true are sorted in priority order, and their respective actions are performed. An example of usage is provided in the Results section following.

Figure 15. VPFFeatureConstructor Hierarchy
VPFFeatureConstructor
    Instance Variables: feature, nextAction, point
    Operations: stopCreateFeature
    VPFPointFeatureConstructor (subclass)
        Operations: point1:
    VPFLineFeatureConstructor (subclass)
        Operations: point1:point2:
    VPFAreaFeatureConstructor (subclass)
        Operations: points:
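The notify: behavior of the event classes can be sketched as follows. This is a Python illustration, not the OVPF Smalltalk code. The thesis states only that each ComplexEvent subclass examines "a particular combination" of its attributes, so the specific semantics here are assumptions: a conjunction fires once both sub-events have been seen, a disjunction fires when either is seen, and a sequence fires when event2 is seen after event1. Resetting the occurrence flags after firing is likewise an assumption.

```python
class PrimitiveEvent:
    def __init__(self, event_msg):
        self.event_msg = event_msg  # name of the message that raises the event

    def notify(self, msg):
        return msg == self.event_msg


class ComplexEvent(PrimitiveEvent):
    def __init__(self, event1, event2):
        super().__init__(None)
        self.event1 = event1
        self.event2 = event2
        self.event1_occurred = False
        self.event2_occurred = False

    def _reset(self):
        self.event1_occurred = False
        self.event2_occurred = False


class ConjunctionEvent(ComplexEvent):
    def notify(self, msg):
        # Assumed: true once both sub-events have occurred, in any order.
        self.event1_occurred = self.event1_occurred or self.event1.notify(msg)
        self.event2_occurred = self.event2_occurred or self.event2.notify(msg)
        if self.event1_occurred and self.event2_occurred:
            self._reset()
            return True
        return False


class DisjunctionEvent(ComplexEvent):
    def notify(self, msg):
        # Assumed: true when either sub-event matches the message.
        hit1 = self.event1.notify(msg)
        hit2 = self.event2.notify(msg)
        return hit1 or hit2


class SequenceEvent(ComplexEvent):
    def notify(self, msg):
        # Assumed: true only when event2 arrives after event1 has occurred.
        if self.event1_occurred and self.event2.notify(msg):
            self._reset()
            return True
        if self.event1.notify(msg):
            self.event1_occurred = True
        return False
```

Because every class answers the same notify call, a rule can hold any of these event objects without knowing which kind it has.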

RESULTS

This section presents a summary of findings from this research and development. It is in two parts: the first part shows examples of general usage, and the second part describes the operation of the rule-based framework.

OVPF Application Overview

The diagram in Figure 16 shows functional relationships among the principal modules of the OVPF application.

Figure 16. Principal OVPF Components
Components shown: Graphical User Interface (GUI); Metadata (Schemata); VPF Geo-Relational Data Files; Quadtree Spatial Index; Geo-Feature Objects; ObjectStore Geo-Object DBMS. The legend distinguishes functional associations from data flow.

The two main points of control in this figure are the graphical user interface (GUI) and the metadata framework. The GUI provides the user

the menus and programming access with which to direct the operation of OVPF. The metadata framework carries out the bulk of the processing of VPF source data for migration to the internal object model and to the ODBMS. The metadata model is also responsible for exporting edited OVPF data back to the relational VPF file structure.

Transformation of Relational Vector Product Format Data to an Object Web

As described in the preceding Materials and Methods section, the import of source data into the OVPF application at runtime is accomplished in stages. First, the metadata object web for a given VPF database is initialized, after which the geo-feature data can be interpreted and displayed on the screen. As the geo-feature data is read by OVPF, it is inserted into the quadtree spatial index structure. Figure 17 shows all the significant definitional relationships among the metadata, the feature objects, and the spatial index structure. Figure 18 shows the dynamic associations among the runtime instances of metadata, feature, and spatial index objects.

Displaying Spatial Features

An overview of the main steps in reading, indexing and displaying a Vector Product Format map (from either the relational source files or from the ODBMS) is depicted in Figure 19 below. Figure 20 shows a "screen capture" of the OVPF map window display with a portion of the Norfolk Approach library.

Figure 19. Transfer of Spatial Features from VPF to OVPF
Geo-features are (1) imported either from relational VPF files or from ODBMS, then (2) placed in quadtree, and (3) rendered on screen. Note that the ODBMS contains whole features, while 4 or more geo-relational files are required to define each individual feature.
Source: after (Arctur et al. 1995c)

Figure 20. OVPF Map Display of Multiple Coverages in Norfolk Approach Library of DNC01

Migrating Object Webs to ODBMS

As mentioned in the Methods section, not all of the OVPF data should be placed in the ODBMS repository. In particular, the GUI objects should not be allowed to migrate to the persistent data store, as this would inevitably result in migrating most of the Smalltalk development environment through the transitive closure from the GUI root objects. Figure 21 shows the relationships among the main groupings of objects in OVPF, and which of them are managed by the ODBMS.

Figure 21. Persistency and Linkages of Principal OVPF Components
(A) Non-persistent objects
    OVPF User Interface, instances of: VPFMapWindow subclasses; VPFMapPane subclasses; VPFFeatureEditor subclasses; VPFGraphicsEngine
(B) Persistent objects
    Metadata, instances of: VPFDatabase; VPFLibrary; VPFCoverage; VPFFeatureDef; VPFTableHeader
    Spatial Tree, instances of: VPFSpatialDataManager; VPFSpatialDataCell
    Feature Data, instances of: VPFFeature subclasses; VPFFeatureSymbol subclasses
    Graphic Primitives, instances of: VPFDrawOrders subclasses
Source: after (Arctur et al. 1995a; Cobb et al. 1995a)

Applying the Rule-Base Framework for Feature Editing

In order to demonstrate the rule-based framework, an example rule to prevent any BuildingPoint geographic features from being placed over water was implemented and tested. Sample VPFRule and VPFPrimitiveEvent object structures are shown in Figure 22. In this case, the VPFRule instance is associated with the BuildingPoint class and thus will be applied to all instances of that class. Alternatively, the user may associate the rule with

a particular BuildingPoint instance. Due to the setting of the preOrPost attribute, the condition method onWater: is evaluated before the Event's eventMsg (the newPoint: method) is carried out. If the condition method onWater: returns true, the action method stopCreateFeature will then be executed, which will prevent the eventMsg method newPoint: from being performed. The actionPriority setting ensures this action will have highest priority among any other VPFRules which may also fire.

Figure 22. Example Rule and Event Objects
aVPFRule
    feature: pointer to the DNCBuildingPoint class
    event: pointer to a PrimitiveEvent, whose eventMsg is 'newPoint:'
    condition: 'onWater:'
    action: 'stopCreateFeature'
    actionPriority
    preOrPost
Source: (Arctur et al. 1995d)

At this point we need to introduce the rest of the framework in which Events are detected and Rules are fired. In the OVPF viewer/editor tool, all changes to geographic-feature objects are handled through the use of FeatureConstructor objects (see Figure 23). With reference to our example for creating a new BuildingPoint feature, we assume a Rule-Event pair has already been created (for checking if a new point feature is over water) and stored in the DNCBuildingPoint's rules dictionary (class instance variable defined in

VPFFeature class, Figure 23B). This rule base is actually stored physically in the ODBMS.

Figure 23. Key Components of Event Detection Framework
(A) Partial FeatureConstructor Class Hierarchy
    VPFFeatureConstructor
        Instance Variables: feature, nextAction, point
        Operations: onWater:, stopCreateFeature
        Subclasses: VPFAreaFeatureConstructor; VPFLineFeatureConstructor; VPFPointFeatureConstructor (Operations: point1:)
(B) Partial Feature Class Hierarchy
    VPFFeature
        Class Instance Variables: rules
        Operations: notify:argList:preOrPost:from:, newPoint:
        Subclass: DNCBuildingPoint
Source: after (Arctur et al. 1995d)

The following sequence of events could then take place at the user's initiation (step numbers correspond to those in Figure 24):

1. The user chooses the appropriate OVPF menu option to add a new geographic feature, and selects BuildingPoint from a list of available feature classes.
2. The OVPF graphical user interface (GUI) creates a PointFeatureConstructor.

Figure 24. Flow of Control and Behavior for Rule-Event Example
Source: after (Arctur et al. 1995d)

Action Summary:
1. User chooses menu option to add a new feature.
2. GUI creates a constructor for the new feature object.
3. Constructor creates a default instance of BuildingPoint, and requests coordinate point from GUI.
4. GUI returns user-defined location coordinates for new feature.
5. Constructor sends message: feature notify: 'newPoint:' argList: (point) preOrPost: 'pre' from: self
6. Feature scans rule base for rules with event message 'newPoint:'.
7. Feature finds rule and evaluates condition message: constructor perform: 'onWater:'; constructor then queries ODBMS and returns true or false.
8. If condition evaluates true, feature sends message: constructor perform: stopCreateFeature
9. If constructor has to stopCreateFeature, then constructor assigns 'stop' value to its nextAction attribute.
10. If constructor's nextAction is 'stop' it discards the new feature.
11. If constructor's nextAction was not 'stop' then it sends the message: feature newPoint: point, and finally inserts the new feature in the quadtree.

Direction of Messages (top to bottom in the figure): User; OVPF GUI; PointFeatureConstructor; OVPF GUI; PointFeatureConstructor; BuildingPoint; Rules; PointFeatureConstructor (steps 7, 8); ODBMS; BuildingPoint (steps 10, 11); Spatial Quadtree (step 11).

3. The Constructor creates a default BuildingPoint feature object, and initiates a request to the GUI for a user-selected location coordinate point, to be returned via the point1: message.
4. On instruction from the GUI, the user chooses a location on the map with the mouse, and the GUI returns it as the argument in the point1: message to the Constructor.
5. Within its point1: method, the Constructor notifies the new BuildingPoint feature instance of an impending Event via the parameterized notify:argList:preOrPost:from: message.
6. The new BuildingPoint object executes the inherited notify:argList:preOrPost:from: method, which checks the rule base for all Rule-Event pairs whose eventMsg matches the notify: argument, in this case newPoint:.
7. If a matching Rule-Event pair is found, then the Rule's condition value (onWater:) is sent as a message to the Constructor to perform. The Constructor's onWater: method checks the database for any water-related features within a given tolerance of the user-selected coordinates, and returns true or false. By user's preference, this check can be performed either on just the features currently being displayed, or on features from all coverages in the ODBMS.
8. If the onWater: method returns true (coincident water feature was found), the Rule's action message is then sent to the Constructor. In this case if water features were found, the message stopCreateFeature would be the action message sent to the Constructor. Note that in the present framework, all

applicable conditions are evaluated before any actions are performed. If multiple conditions return true, their action messages are sent to the PointFeatureConstructor in order of decreasing actionPriority.
9. If the Constructor receives the message stopCreateFeature, it will set its nextAction attribute to 'stop'.
10. Upon completion of all applicable conditions and actions, the new BuildingPoint object returns from executing the notify:argList:preOrPost:from: method. The thread of control reverts to the Constructor's point1: method, which then checks its nextAction setting. If it is 'stop' then the new default BuildingPoint feature is discarded, and control returns to the user with a descriptive dialog message.
11. If the nextAction is not 'stop' then the Constructor sends the newPoint: message to the new BuildingPoint, inserts it in the spatial quadtree, and presents the user with a dialog window to fill in any BuildingPoint feature attributes needed.

This simple example can easily be extended to encompass multiple rules for a given geographic feature, as well as to handle multiple features. In addition to the "either-or" situation represented in this example, a rule could be based on prerequisite and corequisite existence of other features, even occurring in a particular temporal sequence or logical combination. It is simply necessary for the FeatureConstructor class or one of its subclasses to mediate all requests for changes or additions to geographic features by the user, and for the FeatureConstructor method invoked to check the affected feature's rule base.
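The flow of control in steps 5 through 11 can be condensed into a small sketch. It is written in Python, not Smalltalk, and is not the OVPF code. The class and method names mirror those in the text (notify, on_water, stop_create_feature, point1, next_action), but the set of water locations and the plain list used as the index are hypothetical stand-ins for the ODBMS spatial query and the quadtree, and the rule matches on the message name directly instead of holding an event object.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    event_msg: str         # message name that raises the event
    condition: str         # name of the condition method on the constructor
    action: str            # name of the action method on the constructor
    action_priority: int   # 1 (low) to 100 (high)
    pre_or_post: str = "pre"


class Feature:
    rules = []  # rule base held per feature class

    def notify(self, event_msg, arg_list, pre_or_post, sender):
        # Evaluate every applicable condition before performing any action.
        fired = [r for r in self.rules
                 if r.event_msg == event_msg
                 and r.pre_or_post == pre_or_post
                 and getattr(sender, r.condition)(*arg_list)]
        # Then send the actions in order of decreasing priority.
        for rule in sorted(fired, key=lambda r: r.action_priority, reverse=True):
            getattr(sender, rule.action)()

    def new_point(self, point):
        self.location = point


class BuildingPoint(Feature):
    rules = [Rule("newPoint:", "on_water", "stop_create_feature", 100)]


class PointFeatureConstructor:
    def __init__(self, feature_class, water_locations, index):
        self.feature = feature_class()          # default instance (step 3)
        self.next_action = None
        self.water_locations = water_locations  # stand-in for the ODBMS query
        self.index = index                      # stand-in for the quadtree

    def on_water(self, point):
        return point in self.water_locations

    def stop_create_feature(self):
        self.next_action = "stop"               # step 9

    def point1(self, point):
        self.feature.notify("newPoint:", [point], "pre", self)  # step 5
        if self.next_action == "stop":          # step 10: discard the feature
            return None
        self.feature.new_point(point)           # step 11
        self.index.append(self.feature)
        return self.feature
```

Under these assumptions, calling point1 with a location in water_locations returns None and leaves the index unchanged, while any other location yields a feature that has been placed in the index.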

DISCUSSION

A number of implications can be drawn from this research. In the following sections, I will first address this work in terms of its initial objectives, followed by a discussion of the limitations so far recognized in the technologies and designs used. A look at future directions and summary conclude this thesis.

Implications of Research for Meeting Initial Objectives

The objectives stated on page 8 include supporting (1) complex interdependencies among geographic features, (2) very large databases, and (3) the potential for expert system applications. Each of these will be discussed in turn.

Supporting Complex Interdependencies Among Geographic Features

The descriptions of Vector Product Format file structures for representing geo-feature data, and the object webs created in OVPF to capture this information, show that this Smalltalk-based, object-oriented data model is very versatile and expressive. In related development work for the Naval Research Laboratory, this framework has been adapted to import and display geographic data from four different kinds of VPF product databases simultaneously (Digital Nautical Chart, World Vector Shoreline, Vector Smart Map, and Urban Vector Smart Map).

In addition to the extensibility of the metadata structure facilitated by the object-oriented class hierarchy, the rule-based framework described here provides another level of extensibility. Because the Smalltalk language supports incremental dynamic compilation, rule-based actions could actually trigger the creation of additional classes and methods at runtime, according to the needs of the application. This might be done, for example, to augment the behavior of existing geographic feature objects to respond to new conditions in their environment that might only affect some of the features and not others. It is also conceivable that a given geographic feature might evolve into a different kind of feature over time; the framework described here could support such an evolution.

This rule-based approach can be used in three distinct situations: (1) immediate mode, to execute rules immediately before or after some state change; (2) deferred mode, to execute rules at the end of several changes; and (3) detached mode, to perform rule-based actions separately from the state changes. Furthermore, it has the advantage over traditional inference-engine approaches that it will work with an arbitrarily large database of persistent objects, rather than being limited to those objects which can fit in memory. This approach should support the types of complex interdependencies commonly found in facilities management applications, such as with public utilities networks.

The rule-event framework and procedures were surprisingly simple to implement. The FeatureConstructor classes, together with a single supporting method in VPFFeature class (notify:argList:preOrPost:from:), provide a simple and flexible event detection and rule processing system. While it introduces some processing overhead, all but the spatial query to the ODBMS (see step 7, Figure 24 on page 82) are very fast operations.
An important benefit of this object-oriented framework is the potential for direct reuse by

other FeatureConstructors of condition checks and actions such as the onWater: and stopCreateFeature methods. Furthermore, with this system provision can be made for adding and changing rules at runtime.

The design presented here is easily extended to trigger on any kind of change (create, modify, delete) to geographic-feature objects, as well as to specific feature attributes and spatial coordinates of a given feature object. This could be a significant advantage over the triggers supported by many commercial relational and even hybrid object-relational DBMSs. Except for Sybase, these DBMSs can typically trigger only on insert, update or delete of a complete feature record, rather than being able to discriminate on changes made to a single feature attribute.

It might be noted that this rule-based framework is not limited to implementation in an object-oriented system. While the object-oriented properties of hierarchical definition and inheritance in Smalltalk facilitated a simple design, the same functionality could be achieved in a non-object-oriented language with appropriate data structures and procedures. It seems likely that rule-based frameworks like this could find their way into many more kinds of applications in the future.

Supporting Very Large Databases

As a result of using a commercial ODBMS for the geo-data repository, we can immediately start to consider working with very large, distributed databases. ObjectStore has been demonstrated already to support terabyte-sized databases, and its client-server architecture with such features as shared-page caching is well suited to multi-user applications. Other ODBMSs are also likely candidates for large applications like this. It

was found that spatial queries were as much as hundreds of times faster when reading from the ODBMS repository than from the relational VPF source files. While further tuning of either approach is no doubt possible, the ODBMS interface was far simpler to design and implement, especially given the need to support topology.

However, storing a large database is only part of the problem; another is providing access to it with reasonable performance. The technique demonstrated here of placing each coverage in a separate spatial index shows very good potential for helping to manage the visibility of unnecessary data while the user is trying to identify a region of interest.

The rule-based concepts implemented here could also be applied to facilitate query optimization across multiple, heterogeneous databases distributed over a wide-area network. For example, rules could be defined to check the visibility or access privileges of one or more portions of a distributed database before starting a potentially long transaction. To support this, a client ObjectStore application could be set up on the host server. This client application could serve as the effective host application to the actual users, filtering the data prior to shipping it over the network. The GemStone ODBMS already is organized to support this separation of work between a host application process and each client process.

It was pointed out that each quadtree cell (VPFSpatialDataCell instance) holds a collection of direct object pointers to its geo-features (see "Design of an Object-Oriented Spatial Index" starting on page 60). This makes the quadtree not just a spatial index but also an efficient, general purpose container structure for the features. This design is essentially independent of the actual VPF feature structure, and allows us to modify the implementation of the spatial tree at any time without affecting the rest of

OVPF or the source data. Thus, in the future we could easily substitute a range tree (Samet 1994; Beckmann et al. 1990; Brinkhoff et al. 1993), or special optimizing techniques in place of the present quadtree approach (this will be discussed further below). This design could also support the simultaneous implementation of multiple spatial indexing schemes, to allow choice of the most efficient spatial tree design for a given source database. This might be thought of as pluggable spatial indexing.

Another issue addressed by this design that becomes more important over time has to do with changes to the nonspatial attributes of geo-features in a given feature class or coverage. As business requirements and data sources evolve and change, it is often necessary to add, remove, or change the value range of nonspatial attributes for one or more feature classes. (In a complex database specification such as VPF, this must be done with care, to ensure consistency of attribute usage and values across similar feature classes in different coverages and libraries.) As described in the Materials and Methods section, page 54, OVPF makes use of Smalltalk ReadWriteStream objects to hold all such attributes in a single byte-stream, in which each attribute's position and length in the byte-stream is known via the feature class's schema. This rather non-object-oriented way of aggregating many small pieces of information is much more memory-efficient than if we had created separate instance variables and value-objects for each geo-feature attribute. It also supports changes in attribute structure for a feature class without affecting the object class definitions. This means that attributes can be added to, or removed from, the feature class definition without having to redesign the OVPF structure. This could even be done at runtime.
The issue of migrating older data from the previous attribute structure to the new structure must still be addressed, and this could be difficult for large databases.
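The byte-stream approach to attribute storage can be illustrated with a brief sketch. It uses a Python bytearray in place of a Smalltalk ReadWriteStream and is not the OVPF code; the schema contents, attribute names, field widths, space padding, and ASCII encoding are all assumptions made for illustration. The function names loosely follow the valueForAttribute: and putValue:forAttribute: accessing protocol of VPFFeature.

```python
# Hypothetical schema: attribute name -> (offset, length) in the byte-stream.
SCHEMA = {"name": (0, 12), "height": (12, 4), "status": (16, 2)}
RECORD_LENGTH = 18


def new_record(length=RECORD_LENGTH):
    """Create a blank, space-padded attribute stream."""
    return bytearray(b" " * length)


def put_value(record, schema, name, value):
    """Write a value into the slot that the schema assigns to the attribute."""
    offset, length = schema[name]
    data = str(value).encode("ascii")
    if len(data) > length:
        raise ValueError(f"{name!r} value exceeds {length} bytes")
    record[offset:offset + length] = data.ljust(length)


def value_for_attribute(record, schema, name):
    """Interpret the bytes in the attribute's slot and strip the padding."""
    offset, length = schema[name]
    return bytes(record[offset:offset + length]).decode("ascii").rstrip()


# Adding an attribute changes only the schema and the stream length;
# no class definition is touched.
SCHEMA_V2 = dict(SCHEMA, owner=(18, 8))


def upgrade(record):
    """Extend an old-format record so it fits the hypothetical SCHEMA_V2."""
    record.extend(b" " * 8)
    return record
```

The upgrade function hints at the migration cost noted above: every stored record written under the old schema would need to be visited and extended.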

Supporting Potential for Expert System Applications

Progress is being made in coping with the complexity of decision making faced by planners through the use of expert systems. The framework presented here seems to be a good candidate for further research in this area. With Smalltalk's reflective and dynamic compilation capabilities, there are no inherent limits to the ability to create and modify a rule-base of events, conditions and actions at runtime. This implies the system could have the capability to learn and evolve its semantics (general behavior) as a function of usage patterns and environmental conditions.

Limitations of the Present Application

The various frameworks described in this thesis present numerous possibilities for extensions and enhancements. However, there remain some significant limitations in the present implementation of OVPF. These are grouped here according to (1) feature class definitions, (2) spatial index, (3) GIS functionality, (4) the rule-based framework, and (5) the Smalltalk language.

Feature Class Definitions

The present design for OVPF lacks support for variable-length feature attributes. It has been found that some feature classes in the Vector Smart Map (DMA 1993c) and Urban Vector Smart Map (DMA 1994b) specifications have variable-length text attributes, whereas all other nonspatial attributes in VPF databases have been fixed-length in nature. To accommodate this, we would most likely follow the VPF specification for storing such data in the feature tables, by including the integer-length of the attribute as the first four

PAGE 100

91 bytes of the attribute's value. Since the feature table schema states which attributes have variable length, it would be a straightforward matter to modify the standard accessing protocol in VPFFeature ("valueForAttribute: aName" and "putValue: aValue forAttribute: aName" methods) to properly handle these exceptions. Spatial Index The simplistic quadtree implemented so far has a number of shortcomings, as observed by Cobb et al. (1995b). For example, all spatial queries begin from the root (top) of the tree. Cobb's spatial splay tree approach addresses this issue, by storing pointers to the most recently-accessed quadtree cells at or near the root of the splay tree. This has been found to result in significant improvement in query performance. Another issue however, is that of managing insertion of geo-feature objects in the index that fall on a quadtree cell boundary. When this happens, the geo-feature pointer is moved back up one level in the quadtree, to the next-larger cell. With small or sparse geographic databases this is not a problem, but with dense coverages this can degrade access performance. One way of overcoming this, while preserving the advantages of the quadtree approach, is to use overlapping quadtree cells. In this case, each quadtree cell could overlap its neighbor by up to 25 percent, to hold any geo-features which coincide with its boundary. By traversing the cells at a given level in a consistent order, say clockwise, each feature would still have a unique index key. While I have not seen any technical documentation on this approach, both Laser-Scan (1995a) and Smallworld (1995) use it for their spatial indexes.
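
The boundary problem described above can be made concrete with a small sketch, again in Python for illustration only and not taken from OVPF. The QuadtreeCell class, its method names, the bounding-box tuples (xmin, ymin, xmax, ymax), and the fixed maximum depth are assumptions; for brevity the sketch always descends as far as it can and does not implement the overlapping-cell remedy.

```python
class QuadtreeCell:
    """Hypothetical quadtree cell covering a square region."""

    def __init__(self, x, y, size, depth=0, max_depth=4):
        self.x, self.y, self.size = x, y, size   # lower-left corner, side
        self.depth, self.max_depth = depth, max_depth
        self.features = []                       # (feature, box) pairs
        self.children = None

    def _child_index(self, box):
        """Index of the child that wholly contains box, else None."""
        xmin, ymin, xmax, ymax = box
        half = self.size / 2
        cx, cy = self.x + half, self.y + half
        if xmax < cx:
            col = 0
        elif xmin > cx:
            col = 1
        else:
            return None          # box touches or crosses the vertical split
        if ymax < cy:
            row = 0
        elif ymin > cy:
            row = 1
        else:
            return None          # box touches or crosses the horizontal split
        return row * 2 + col

    def _split(self):
        half = self.size / 2
        self.children = [
            QuadtreeCell(self.x + col * half, self.y + row * half, half,
                         self.depth + 1, self.max_depth)
            for row in (0, 1) for col in (0, 1)
        ]

    def insert(self, feature, box):
        """Store the feature and return the depth at which it was stored."""
        if self.depth < self.max_depth:
            index = self._child_index(box)
            if index is not None:
                if self.children is None:
                    self._split()
                return self.children[index].insert(feature, box)
        # Boundary-straddling features stay in this larger cell.
        self.features.append((feature, box))
        return self.depth

    def query(self, window):
        """Return features whose boxes intersect the query window."""
        wxmin, wymin, wxmax, wymax = window
        hits = [f for f, (xmin, ymin, xmax, ymax) in self.features
                if xmin <= wxmax and xmax >= wxmin
                and ymin <= wymax and ymax >= wymin]
        for child in self.children or []:
            if (child.x <= wxmax and child.x + child.size >= wxmin
                    and child.y <= wymax and child.y + child.size >= wymin):
                hits.extend(child.query(window))
        return hits
```

A very small feature that happens to cross the root's centre line is stored at depth 0 and is therefore examined by every query, which is the performance degradation the text attributes to dense coverages.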

GIS Functionality

While a significant amount of work has been done, much more functionality is required for the OVPF application to represent a GIS according to standard industry guidelines. So far, changes to topology have limited support. While the data structures are essentially complete, further work is required to allow any kind of change in topology to be correctly handled. Chung et al. (1995) describe the operations which are now supported and those which are still needed.

Topology support affects not just data integrity, but also the range of spatial analysis functions which can be performed. Presently OVPF incorporates very few of the "usual" spatial analysis procedures (Goodchild 1988, 1994; Tomlin 1990). For example, it lacks functions such as edgematching, dissolving lines and merging attributes, line thinning, weeding and smoothing, centroid calculation, and others. Those which are implemented are part of the standard Smalltalk library, such as raster-vector transformations and point-in-polygon testing, and even these may need further optimization.

Another shortcoming is that OVPF presently supports only very simple queries. The protocol for spatial queries is limited to returning all geo-features within a specified area, without regard for filtering criteria based on nonspatial attributes. The versatile ReadWriteStream approach for storing attributes has the drawback that we must always interpret the byte stream to perform comparisons on any values stored in the stream. For faster performance on queries, it may be helpful to create individual indexes on specific nonspatial attributes, for instance by storing such attribute values and geo-feature object pointers in a hash dictionary or binary-tree structure.
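
One way to picture the suggested per-attribute index is the following sketch, written in Python for illustration rather than in Smalltalk. It is not part of OVPF: the AttributeIndex class, the filter_spatial_hits helper, and the use of feature identifiers in place of object pointers are all assumptions.

```python
from collections import defaultdict


class AttributeIndex:
    """Hypothetical hash index from one attribute's values to feature ids."""

    def __init__(self, attribute_name):
        self.attribute_name = attribute_name
        self._by_value = defaultdict(set)

    def add(self, feature_id, value):
        self._by_value[value].add(feature_id)

    def remove(self, feature_id, value):
        # Must be called with the old value whenever the attribute changes.
        self._by_value[value].discard(feature_id)

    def lookup(self, value):
        # Return a copy so callers cannot alter the index by accident.
        return set(self._by_value.get(value, ()))


def filter_spatial_hits(spatial_hits, index, value):
    """Keep only the spatial query results whose attribute equals value."""
    return set(spatial_hits) & index.lookup(value)
```

With such an index, an attribute filter becomes a set intersection with the spatial query result, so the byte stream of each candidate feature need not be decoded. The cost is that the index must be kept in step with every attribute update.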

Rule-Based Framework

There are a number of issues related to supporting rules that have not been addressed as yet. The number of rules defined in an application can become very large and may be defined by various users at different points in time. This can lead to the problem of having inconsistent or conflicting rules present within the application. For example, user A may define a rule R1 whose action may trigger rule R2 defined by user B. Suppose rule R2's action results in triggering rule R1, thereby yielding an infinite loop. From this scenario, it is evident that a mechanism for establishing the consistency or correctness of rules must be an inherent part of any active system. This involves writing algorithms which statically detect rule conflicts as well as algorithms which dynamically detect problems such as infinite rule triggering (Arctur et al. 1995d).

Further work is also needed in the development of a user interface for development and modification of the rule base. This interface could provide a graphical representation of a state machine allowing the user to define and modify the events, conditions and actions for each rule. This could be a component of a graphical flowchart-like facility allowing the user to create and modify step-by-step feature construction scripts.

The rule-based framework implemented so far represents a data-driven (also called bottom-up or forward chaining) system with very primitive inferencing capability. Goal-driven (top-down or backward chaining) capability would also be important for application in an expert system, and this will require further research and development efforts.

An important next step would be to extend the proof-of-concept developed here into a larger study of the rule-based framework with more realistic data from various GIS application domains. Navinchandra (1993, pp. 87-89) points out a number of issues related to the application of artificial intelligence techniques to GIS. One of the key issues is the potential infeasibility of applications to scale up adequately from the research prototype to realistic production models. This is particularly true with a moderate to large rule base, due to interactions among the rules that become difficult to anticipate and test.

Smalltalk Language

This entire application has been implemented in Smalltalk. This includes such significant components as the relational data import and export facilities, metadata object hierarchy, the feature data hierarchy, topology hierarchy, spatial quadtree index, graphical user interface, low-level graphical object representations and operations, and byte-level data format conversions. The performance of this system has so far been acceptable with geographic databases up to about 30 megabytes in size. Significant tuning can be performed using Smalltalk's own capabilities; however, it is likely that some portions of the system such as data conversions and graphical rendering could be better implemented in a platform-dependent manner using C or even Assembler, which can be invoked from Smalltalk. ParcPlace's VisualWorks Smalltalk code is platform-portable, however, which is a great advantage for both development and maintenance.

One limiting factor inherent in VisualWorks has been a practical upper-bound on the size of hash dictionaries, which are (generally) efficient accessing structures provided as part of the Smalltalk system class library. These dictionaries lose their effectiveness when attempting to store more than about 16,000 elements. This limits their usefulness for indexing geo-feature nonspatial attributes, as there can be many more than 16,000 instances of a given feature class within a coverage. Possible workarounds for this limitation include subdividing large collections into smaller groupings, so that no single grouping has more than 16,000 elements. Perhaps a later version of Smalltalk will overcome this issue.

Future Directions

With the change to support variable-length attributes described above, it is conceivable that the feature attribute structure described here could be extended to store time-series or other temporally-based versions of nonspatial data for a given attribute. For example, suppose that a particular road or bridge were seasonally out of operation due to high water. The attribute describing its operational status could be stored with multiple values associated with different time periods. This is not part of the VPF specification at present, but could be accommodated by OVPF's structure.

A more difficult problem would be to store and manage multiple editions of graphical primitives based on temporal data. For example, to store a set of shorelines or hydrographic depths representing different tidal levels would require much more consideration in design than is needed for nonspatial attributes.

A promising application for the rule-based framework described here could be in maintaining spatial topology following changes to geo-feature locations. One possibility would be to extend or mimic the FeatureConstructor concept to support the use of GraphicalPrimitiveConstructors. These objects could be actuated to carry out the necessary steps for splitting and splicing Edge and Face primitives, creating Nodes at Edge and Face intersections, and other such operations. The optimal sequencing of topology maintenance operations can become complicated with dense coverages. By integrating a rule-based framework with GraphicalPrimitiveConstructors, it might be possible to direct the sequence of topology-building steps based on runtime conditions and relative priority of the mathematical graph-related procedures. Due to the computational intensity of this kind of processing, it would be essential to tune such a framework for maximum performance.

As already-large databases get much larger, it will be increasingly important to fully exploit parallel processing and data channel capabilities. It would be interesting to explore ways of using the rule-based framework to help determine optimal loading of computational resources to improve performance of both queries and data updates.

Summary

The research and development described in this thesis represents a unique application in a number of ways. It is one of the first frameworks for GIS designed and implemented with Smalltalk, and shows the great leverage, versatility and expressiveness of this language and development environment. The object-oriented data model has demonstrated itself to be extensible enough to accommodate multiple simultaneous database schemata. It is also one of the first GIS frameworks that uses a commercial ODBMS to store all the spatial and nonspatial data in a consistent and extensible manner. Finally, it demonstrates a comparatively "low-overhead" approach to integrating a rule-based framework and active database capability in a GIS. While much further work is needed to carry it beyond a "proof of concept" stage to be useful in spatial analysis, this framework offers great promise for such efforts.

It is my belief that object-oriented GIS (OOGIS) can facilitate collaborative decision-making on difficult issues of global scope, such as deforestation, toxic waste handling, and distribution of physical and financial resources. No single government or multinational corporation can possibly have full understanding of all the consequences of a given decision and course of action with respect to such global issues. Nor should any one government or other organization be solely responsible for implementation of policies and activities that may be needed. Thus, collaboration at many levels across national, organizational and class boundaries is required. Others have done work which could be useful in this direction, such as Nyerges (1993) and Karnes (1995), and significant work has been reported on technologies for groupware and computer-supported cooperative work (Baecker 1993; Furuta and Neuwirth 1994). With the rapidly accelerating acceptance and use of the Internet, combined with advanced GIS facilities, it should be possible to support continuous, real-time communications and updates among GIS databases for use by participating domain experts and decision makers in geographically dispersed locations around the world. To this end, the OOGIS and knowledge-base capabilities mentioned in this thesis seem well suited. It is my hope that the work started here can continue in this direction.

REFERENCES

Abel, D. J., S. K. Yap, R. Ackland, M. A. Cameron, D. F. Smith, and G. Walker 1992. "Environmental Decision Support System Project: An Exploration of Alternative Architectures for Geographical Information Systems," International Journal of Geographical Information Systems 6:3, pp. 193-204.

Alexander, C., S. Ishikawa, M. Silverstein, M. Jacobson, I. Fiksdahl-King, and S. Angel 1977. A Pattern Language. New York: Oxford University Press.

Alexander, J. F., R. E. Williams, and T. G. Curtis 1991. An Object-Oriented Geographic Information System: Object-GPG, Technical Report of GeoPlan Center, University of Florida, Gainesville.

Alexandroff, P. S. 1957. Combinatorial Topology, Vols. 1, 2. Rochester, NY: Graylock Press.

Alexandroff, P. S. 1965. Elementary Concepts of Topology. New York: Frederick Ungar Publishing Co.

Arctur, D. K., J. F. Alexander, K. Shaw, M. Chung, and M. Cobb 1995a. "OVPF Report: Object-Oriented Database Design Issues." Internal research report for the Naval Research Laboratory, Stennis Space Center, MS.

Arctur, D. K., J. F. Alexander, M. Cobb, M. Chung, and K. Shaw 1995b. "OVPF Report: Issues and Approaches for Spatial Topology in GIS." Internal research report for the Naval Research Laboratory, Stennis Space Center, MS.

Arctur, D. K., E. Anwar, J. Alexander, S. Chakravarthy, M. Chung, M. Cobb, and K. Shaw 1995c. "Comparison and Benchmarks for Import of VPF Geographic Data from Object-Oriented and Relational Database Files," Proceedings of the Fourth Symposium on Spatial Databases, SSD 95. New York: Springer-Verlag, pp. 368-384.

Arctur, D. K., E. Anwar, S. Chakravarthy, M. Cobb, M. Chung, K. Shaw, and J. Alexander 1995d. "Implementation of a Rule-Based Framework for Managing Updates in an Object-Oriented VPF Database," Proceedings of Geographic Information Systems/Land Information Systems Conference, GIS/LIS 95. Bethesda MD: American Society for Photogrammetry and Remote Sensing, pp. 1-10.

Anwar, E. 1992. Supporting Complex Events and Rules in an OODBMS: A Seamless Approach. Master's Thesis, University of Florida, Nov. 1992.

Anwar, E., L. Maugis, and S. Chakravarthy 1993. "A New Perspective on Rule Support for Object-Oriented Databases," Proceedings of the 1993 ACM SIGMOD Conference, New York: ACM Press, pp. 99-108.

Atkinson, M., F. Bancilhon, D. DeWitt, K. Dittrich, D. Maier, and S. Zdonik 1992. "The Object-Oriented Database System Manifesto." Building an Object-Oriented Database System: The Story of O2. San Mateo, CA: Morgan Kaufmann, pp. 3-20.

Baecker, R. M., ed. 1993. Readings in Groupware and Computer-Supported Cooperative Work: Assisting Human-Human Collaboration. San Mateo, CA: Morgan Kaufmann.

Batty, M., and T. Yeh 1991. "The Promise of Expert Systems for Urban Planning," Computers, Environment and Urban Systems 15:3, pp. 101-108.

Beckmann, N., H. Kriegel, R. Schneider, and B. Seeger 1990. "The R*-tree: An Efficient and Robust Access Method for Points and Rectangles," Proceedings of the 1990 ACM SIGMOD Conference, New York: ACM Press, pp. 322-331.

Booch, G. 1994. Object-Oriented Analysis and Design, with Applications. New York: Benjamin/Cummings.

Booch, G., and J. Rumbaugh 1995. Unified Method for Object-Oriented Development, Version 0.8. Notes from workshop at OOPSLA95 Conference, Austin, Texas, October 1995. Santa Clara, CA: Rational Software Corp.

Brinkhoff, T., H. Kriegel, and B. Seeger 1993. "Efficient Processing of Spatial Joins Using R-trees," Proceedings of the 1993 ACM SIGMOD Conference, New York: ACM Press, pp. 237-246.

Brownston, L., R. Farrell, E. Kant, and N. Martin 1985. Programming Expert Systems in OPS5: An Introduction to Rule-Based Programming. Menlo Park, CA: Addison-Wesley.

Budic, Z. D. 1994. "Effectiveness of Geographic Information Systems in Local Planning," Journal of the American Planning Association 60:2, pp. 244-263.

Chakravarthy, S. 1989. "Rule Management and Evaluation: An Active DBMS Perspective." SIGMOD Record 18:3, September 1989, pp. 20-28.

Chakravarthy, S., ed. 1992. Special Issue on Active Databases, Data Engineering Bulletin 15, December 1992. IEEE Computer Society.

Chakravarthy, S., and D. Mishra 1993. "Snoop: An Expressive Event Specification Language for Active Databases," Data and Knowledge Engineering 14:3, October 1994, pp. 1-26.

Chakravarthy, S., V. Krishnaprasad, E. Anwar, and S. Kim 1993. "Composite Events for Active Databases: Semantics, Context, and Detection," Proceedings of 20th VLDB Conference, Los Altos CA: Morgan Kaufmann, pp. 606-617.

Chakravarthy, S. 1995. "Architectures and Monitoring Techniques for Active Databases: An Evaluation," Data and Knowledge Engineering 16, pp. 1-26.

Chakravarthy, S., B. Blaustein, A. P. Buchmann, M. Carey, U. Dayal, D. Goldhirsch, M. Hsu, R. Jauhari, R. Ladin, M. Livny, D. McCarthy, R. McKee, A. Rosenthal 1989. "HiPAC: A Research Project in Active, Time-Constrained Database Management." Technical Report XAIT-89-02, XAIT Reference Number 187. Cambridge, MA: Xerox Advanced Information Technology, July 1989.

Chakravarthy, S., E. Hanson, and S. Y. W. Su 1992. "Active Database/Knowledge Base Research at the University of Florida," Data Engineering Bulletin 15, December, pp. 35-39.

Chen, J., R. T. Newkirk, and G. Davidson 1994. "The Development of a Knowledge-Based Geographical Information System for the Zoning of Rural Areas," Environment and Planning B: Planning and Design 21:2, pp. 179-190.

Chung, M., M. Cobb, K. Shaw, and D. Arctur 1995. "An Object-Oriented Approach for Handling Topology in VPF Products," Proceedings of Geographic Information Systems/Land Information Systems Conference, GIS/LIS 95. Bethesda MD: American Society for Photogrammetry and Remote Sensing, pp. 163-174.

Cobb, M., D. Arctur, M. Chung and K. Shaw 1995a. "Object-Oriented Database Design and Implementation Issues for Object Vector Product Format (OVPF)." Internal research report for the Naval Research Laboratory, Stennis Space Center, MS. (NRL Technical Report No. NRL/FR/7441-95-9641, in press)

Cobb, M., M. Chung, K. Shaw, and D. Arctur 1995b. "A Self-Adjusting Indexing Structure for Spatial Data," Proceedings of Geographic Information Systems/Land Information Systems Conference, GIS/LIS 95. Bethesda MD: American Society for Photogrammetry and Remote Sensing, pp. 182-192.

Coplien, J. O., and D. C. Schmidt 1995. Pattern Languages of Program Design. New York: Addison-Wesley.

Darnovsky, M., and J. Bowman 1990. TRANSACT-SQL User's Guide. Release 4.2. Document 3231-2.1, Sybase Inc.

Date, C. J. 1995. An Introduction to Database Systems. New York: Addison-Wesley.

Davis, J. R., P. T. Compagnoni, and P. M. Nanninga 1987. "Roles for Knowledge-Based Systems in Environmental Planning," Environment and Planning B: Planning and Design 14:3, pp. 239-254.

Dayal, U., A. P. Buchmann, and S. Chakravarthy 1996. "The HiPAC Project," Widom, J., and S. Ceri, Active Database Systems: Triggers and Rules for Advanced Database Processing. San Francisco: Morgan Kaufmann, pp. 177-206.

Defense Mapping Agency 1993a. Military Standard: Vector Product Format. Draft Document No. MIL-STD-2407. Fairfax, VA: Author.

Defense Mapping Agency 1993b. Product Specifications for Digital Nautical Chart. Draft Document No. MIL-D-89023. Fairfax, VA: Author.

Defense Mapping Agency 1993c. Military Specifications for Vector Smart Map (VMap) Level 0. Document No. MIL-V-89039. Fairfax, VA: Author.

Defense Mapping Agency 1994a. The Digital Geographic Information Exchange Standard (DIGEST), Edition 1.2, January 1994. Fairfax, VA: Author.

Defense Mapping Agency 1994b. Draft Military Specification for Urban Vector Smart Map (UVMap) Databases. Document No. MIL-U-89035. Fairfax, VA: Author.

Defense Mapping Agency 1995. Draft Military Specification for World Vector Shoreline (WVS PLUS). Document No. MIL-W-89012A. Fairfax, VA: Author.

DeWitt, D. J., P. Futtersack, D. Maier, and F. Velez 1992. "Three Alternative Workstation-Server Architectures." Building an Object-Oriented Database System: The Story of O2. San Mateo, CA: Morgan Kaufmann, pp. 411-446.

Diaz, O., N. Paton, and P. Gray 1991. "Rule Management in Object-Oriented Databases: A Unified Approach," Proceedings of 17th VLDB Conference, Los Altos CA: Morgan Kaufmann.

Dickey, J. W., E. Mumby, and J. Doughty 1986. "Computer Consultant Systems: An Application to Assessment of an Urban Housing Cooperative," B. Hutchinson and M. Batty, eds. Advances in Urban Systems Modelling, New York: Elsevier Science Publishers, pp. 351-371.

Digitalk, Inc. 1989. Smalltalk/V PM Object-Oriented Programming System: Tutorial and Programming Handbook. Los Angeles: Author.

Ding, Y., and A. S. Fotheringham 1992. "The Integration of Spatial Analysis and GIS," Computers, Environment and Urban Systems 16:1, pp. 3-19.

Dueker, K. J., and D. Kjerne 1987. "Application of the Object-Oriented Paradigm to Problems in Geographic Information Systems," International Geographic Information Systems (IGIS) Symposium Proceedings. Washington DC: Association of American Geographers, pp. (II) 79-87.

Egenhofer, M. J., and A. U. Frank 1987. "Object-Oriented Databases: Database Requirements for GIS," International Geographic Information Systems (IGIS) Symposium Proceedings. Washington DC: Association of American Geographers, pp. (II) 189-211.

Egenhofer, M. J., and A. U. Frank 1990. "Object-Oriented Modeling in GIS: Inheritance and Propagation," Auto Carto 9, Symposium on Computer-Assisted Cartography. Bethesda MD: American Society for Photogrammetry and Remote Sensing, pp. 588-598.

Egenhofer, M. J., and A. U. Frank 1992. "Object-Oriented Modeling for GIS," URISA Journal 4:2, pp. 3-19.

Environmental Systems Research Institute 1994. Arc/Info Users Guide, Release 7. Redlands, CA: Author.

Environmental Systems Research Institute 1995a. Arcview Users Guide, Version 2. Redlands, CA: Author.

Environmental Systems Research Institute 1995b. Avenue Programmers Guide. Redlands, CA: Author.

Faludi, A. 1973. Planning Theory. New York: Pergamon Press.

Fegeas, R. G., J. Cascio, and R. Lazar 1992. "An Overview of FIPS 173, The Spatial Data Transfer Standard." Cartography and Geographic Information Systems 19, 5: 278-293.

Feuchtwanger, M. 1993. Towards a Geographic Semantic Database Model. Ph.D. Dissertation, Simon Fraser University.

Furuta, R., and C. Neuwirth, eds. 1994. Proceedings of ACM Conference on Computer Supported Cooperative Work, October. New York: ACM Press.

Gamma, E., R. Helm, R. Johnson, J. Vlissides 1995. Design Patterns: Elements of Reusable Object-Oriented Software. New York: Addison-Wesley.

Garson, G. D., and R. S. Biggs 1992. Analytic Mapping and Geographic Databases. Quantitative Applications in the Social Sciences Series No. 87. London: Sage Publications.

Gatziu, S., and K. R. Dittrich 1992. "SAMOS: An Active Object-Oriented Database System," Data Engineering Bulletin 15, December, pp. 23-26.

Gehani, N. H., and H. V. Jagadish 1991. "Ode as an Active Database: Constraints and Triggers," Proceedings of 17th VLDB Conference, Los Altos CA: Morgan Kaufmann, pp. 327-336.

Gehani, N. H., H. V. Jagadish and O. Shmueli 1992. "Event Specification in an Active Object-Oriented Database," Proceedings of the 1992 ACM SIGMOD Conference, New York: ACM Press, pp. 81-90.

GemStone Systems, Inc. 1995. GemStone Reference Manual. Beaverton OR: Author.

Goldberg, A., ed. 1988. A History of Personal Workstations. New York: ACM Press.

Goldberg, A., and D. Robson 1983. Smalltalk-80: The Language and Its Implementation. New York: Addison-Wesley.

Goldberg, A., and D. Robson 1989. Smalltalk-80: The Language. New York: Addison-Wesley.

Goldberg, A., and K. S. Rubin 1995. Succeeding with Objects: Decision Frameworks for Project Management. New York: Addison-Wesley.

Goodchild, M. F. 1987. "Towards an Enumeration and Classification of GIS Functions," International Geographic Information Systems (IGIS) Symposium Proceedings. Washington DC: Association of American Geographers, pp. (II) 67-77.

Goodchild, M. F. 1988. "Modeling Error in Objects and Fields," M. Goodchild and S. Gopal, eds. The Accuracy of Spatial Databases. New York: Taylor & Francis, pp. 107-113.

Goodchild, M. F. 1994. Introduction to Spatial Analysis. Materials from GIS/LIS Workshop. Washington DC: Urban and Regional Information Systems Association.

Han, S., and T. J. Kim 1989. "Can Expert Systems Help with Planning?," Journal of the American Planning Association 55:3, pp. 296-308.

Han, S., T. J. Kim, and I. Adiguzel 1991. "Integration of Programming Models and Expert Systems: An Application to Facility Planning and Management," Computers, Environment and Urban Systems 15:3, pp. 189-201.

Harary, F. 1969. Graph Theory. Menlo Park, CA: Addison-Wesley.

Heikkila, E. J., and E. J. Blewett 1992. "Using Expert Systems to Check Compliance with Municipal Building Codes," Journal of the American Planning Association 58:1, pp. 72-80.

Herring, J. R. 1992. "TIGRIS: A Data Model for an Object-Oriented Geographic Information System," Computers and Geosciences 18:4, pp. 443-452.

Howard, T. 1995. The Smalltalk Developer's Guide to VisualWorks. New York: SIGS Books.

Interbase 1990. InterBase DDL Reference Manual, InterBase Version 3.0. InterBase Software Corporation, Bedford, MA.

Jaeger, U., and J. C. Freytag 1995. "An Annotated Bibliography on Active Databases (Short Version)," SIGMOD Record 24:1, pp. 58-69.

Karnes, D. 1995. Modeling and Mapping New Metaphors: Toward Pluralistic Cartographies Using Object-Oriented Geographic Information Systems. Ph.D. Dissertation, University of Washington.

Kim, T. J., L. L. Wiggins, and J. R. Wright, eds. 1990. Expert Systems: Applications to Urban Planning. New York: Springer-Verlag.

Klosterman, R. E. 1994. "Large-Scale Urban Models: Retrospect and Prospect," Journal of the American Planning Association 60:1, pp. 3-6.

LaLonde, W. 1994. Discovering Smalltalk. New York: Benjamin/Cummings.

Laser-Scan, Ltd. 1995a. Introduction to APE Concepts. Cambridge, UK: Author.

Laser-Scan, Ltd. 1995b. Open Systems Map & Chart Production Software: LAMPS2 v1.1, Technical Product Description. Cambridge, UK: Author.

Laurini, R., and D. Thompson 1992. Fundamentals of Spatial Information Systems. New York: Academic Press.

Lazar, R. A. 1992. "The SDTS Topological Vector Standard." Cartography and Geographic Information Systems 19, 5: 296-299.

Leung, Y. 1988. Spatial Analysis and Planning under Imprecision. New York: North-Holland.

Lober, D. J. 1995. "Resolving the Siting Impasse: Modeling Social and Environmental Locational Criteria with a Geographic Information System," Journal of the American Planning Association 61:4, pp. 482-495.

Lorenz, M. 1995. Rapid Software Development with Smalltalk. New York: SIGS Books.

Maguire, D. J., M. F. Goodchild, and D. W. Rhind, eds. 1991. Geographical Information Systems: Principles and Applications, Vol. 1. New York: John Wiley.

Medeiros, C. B., and P. Pfeffer 1990. "A Mechanism for Managing Rules in an Object-Oriented Database." Technical report, GIP Altair.

Milne, P., S. Milton, and J. L. Smith 1993. "Geographical Object-Oriented Databases - A Case Study," International Journal of Geographical Information Systems 7:1, pp. 39-55.

Munkres, J. 1966. Elementary Differential Topology. Princeton NJ: Princeton University Press.

National Institute of Standards and Technology 1992. Spatial Data Transfer Standard (SDTS). FIPS Publication 173, August 1992. Springfield, VA: National Technical Information Service.

Navinchandra, D. 1993. "Observations on the Role of Artificial Intelligence Techniques in Geographic Information Processing." Wright et al., eds. Expert Systems in Environmental Planning. New York: Springer-Verlag.

Nyerges, T. L. 1993. "Use of GIS for Collaborative Spatial Decision Making: In Search of a Theoretical Framework," Proceedings of Workshop on Geographic Information and Society, November, Friday Harbor, WA.

Object Design, Inc. 1995. ObjectStore Reference Manual. Burlington MA: Author.

Objective Facilities Management, Inc. 1996. OFM Programmer's Guide. Gainesville FL: Author.

Ortolano, L., and C. D. Perman 1987. "A Planner's Introduction to Expert Systems," Journal of the American Planning Association 53:1, pp. 98-103.

ParcPlace Systems, Inc. 1994a. VisualWorks Reference Guide, Release 2.0. Sunnyvale CA: Author.

ParcPlace Systems, Inc. 1994b. VisualWorks User's Guide, Release 2.0. Sunnyvale CA: Author.

Peuquet, D. J. 1987. "Research Issues in Artificial Intelligence and Geographic Information Systems." International Geographic Information Systems (IGIS) Symposium Proceedings. Washington DC: Association of American Geographers, pp. (I) 119-127.

Rumbaugh, J., M. Blaha, W. Premerlani, F. Eddy, and W. Lorensen 1991. Object-Oriented Modeling and Design. Englewood Cliffs, NJ: Prentice-Hall.

Samet, H. 1994. The Design and Analysis of Spatial Data Structures. New York: Addison-Wesley.

Sharpe, R., B. S. Marksjo, and J. V. Thomson, eds. 1987. Special Issue on Expert Systems in Planning and Design, Environment and Planning B: Planning and Design 14:3.

Sharpe, R., and B. Marksjo 1991. "Expert Systems for Urban and Building Planning and Management." Computers, Environment and Urban Systems 15:3, pp. 109-124.

Simmons, G. F. 1963. Introduction to Topology and Modern Analysis. New York: McGraw-Hill.

Smallworld Systems, Ltd. 1995. Smallworld Magik Programming Guide. Cambridge, UK: Author.

Smith, D. N. 1991. Concepts of Object-Oriented Programming. New York: McGraw-Hill.

Smith, D. N. 1995. IBM Smalltalk: The Language. Redwood City, CA: Benjamin/Cummings.

Smith, T., D. Peuquet, S. Menon, and P. Agarwal 1987. "KBGIS-II: A Knowledge-Based Geographical Information System," International Journal of Geographical Information Systems 1:2, pp. 149-172.

Spanier, E. 1966. Algebraic Topology. New York: McGraw-Hill.

Stonebraker, M., ed. 1988. Readings in Database Systems. San Mateo CA: Morgan Kaufmann.

Stonebraker, M., M. Hanson, and S. Potamianos 1988. "The POSTGRES Rule Manager," IEEE Transactions on Software Engineering 14:7, pp. 897-907.

Su, S. Y. W., V. Krishnamurthy, and H. Lam 1989. "An Object-Oriented Semantic Association Model (OSAM*)," Theoretical Issues and Applications in Industrial Engineering and Manufacturing, pp. 242-251.

Taylor, M. A. P. 1991. "Traffic Planning by a Desktop Expert," Computers, Environment and Urban Systems 15:3, pp. 165-177.

Tomlin, D. 1990. Geographic Information Systems and Cartographic Modeling. New York: Prentice Hall.

Torsun, I. S. 1995. Foundations of Intelligent Knowledge-Based Systems. New York: Academic Press.

Webster, C. J., C. S. Ho, and T. Wislocki 1991. "Text Animation or Knowledge Engineering?: Two Approaches to Expert System Design in Urban Planning," Computers, Environment and Urban Systems 15:3, pp. 151-164.

Wellar, B. 1989. "Emerging Trends in Structuring and Directing GIS Research," Conference Proceedings, Challenge for the 1990's: Geographic Information Systems (GIS). Ottawa: Energy, Mines and Resources Canada and the Canadian Institute of Surveying and Mapping, pp. 601-608.

Wellar, B., N. Cameron, and M. Sawada 1994. "Progress in Building Linkages Between GIS and Methods and Techniques of Scientific Inquiry," Computers, Environment and Urban Systems 18:2, pp. 67-80.

Widom, J., and S. Ceri 1996. Active Database Systems: Triggers and Rules for Advanced Database Processing. San Francisco: Morgan Kaufmann.

Widom, J., and S. Finkelstein 1990. "Set-Oriented Production Rules in Relational Database Systems," Proceedings of the 1990 ACM SIGMOD Conference, New York: ACM Press, pp. 259-270.

Wigan, M. R. 1987. "Legal and Ethical Issues in Expert Systems Used in Planning," Environment and Planning B: Planning and Design 14:3, pp. 305-321.

Worboys, M. F. 1994. "Object-Oriented Approaches to Geo-Referenced Information," International Journal of Geographical Information Systems 8:4, pp. 385-399.

Wright, J. R., L. L. Wiggins, R. K. Jain, and T. J. Kim, eds. 1993. Expert Systems in Environmental Planning. New York: Springer-Verlag.

Yan, W., E. Shimizu, and H. Nakamura 1991. "A Knowledge-Based Computer System for Zoning." Computers, Environment and Urban Systems 15:3, pp. 125-140.

Zdonik, S. B., and D. Maier 1990. Readings in Object-Oriented Database Systems. San Mateo CA: Morgan Kaufmann.


BIOGRAPHICAL SKETCH

David Arctur received his B.S. (1975) and M.S. (1979) in electrical engineering at the University of Texas at Austin. Prior to his studies at the University of Florida, David developed and taught intensive courses in object-oriented programming (1989-92) and relational database management systems programming (1984-88). His interest in database modeling and programming began with his work in developing simulation models for energy supply and demand forecasting for the Electric Power Research Institute and SRI International (1977-83). His interest in energy modeling was in turn sparked by his experience in the field of petroleum exploration in Africa with Schlumberger Wireline Services (1975-77). As part of his undergraduate education, David had engineering internships with Southwestern Bell and NASA. His present work in developing object-oriented GIS technology benefits from all his experience in these diverse fields.

David is a member of URISA (Urban and Regional Information Systems Association), ACSP (Association of Collegiate Schools of Planning), IEEE (Institute of Electrical and Electronics Engineers), ACM (Association for Computing Machinery), and CPSR (Computer Professionals for Social Responsibility). He has been vice-chair and chair of the Silicon Valley chapter of the IEEE Society on Social Implications of Technology.


I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

F. Alexander, Chair
Professor of Urban and Regional Planning

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Starnes
Professor Emeritus of Urban and Regional Planning

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Paul Zwick
Associate Scientist of Urban and Regional Planning

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Sharma Chakravarthy
Associate Professor of Computer and Information Science

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Joseph N. Wilson
Assistant Professor of Computer and Information Science


This dissertation was submitted to the Graduate Faculty of the College of Architecture and to the Graduate School and was accepted as partial fulfillment of the requirements for the degree of Doctor of Philosophy.

May 1996

Dean, Graduate School

