Group Title: extensible data model and extensible system architecture for building advanced knowledge base management systems
Title: An extensible data model and extensible system architecture for building advanced knowledge base management systems
Permanent Link: http://ufdc.ufl.edu/UF00082180/00001
 Material Information
Title: An extensible data model and extensible system architecture for building advanced knowledge base management systems
Physical Description: xi, 216 leaves : ill. ; 28 cm.
Language: English
Creator: Yaseen, Rahim Mohamed
Publication Date: 1991
 Subjects
Subject: Data structures (Computer science)   ( lcsh )
Expert systems (Computer science)   ( lcsh )
Electrical Engineering thesis Ph. D
Dissertations, Academic -- Electrical Engineering -- UF
Genre: bibliography   ( marcgt )
non-fiction   ( marcgt )
 Notes
Thesis: Thesis (Ph. D.)--University of Florida, 1991.
Bibliography: Includes bibliographical references (leaves 208-215).
Statement of Responsibility: by Rahim Mohamed Yaseen.
General Note: Typescript.
General Note: Vita.
 Record Information
Bibliographic ID: UF00082180
Volume ID: VID00001
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
Resource Identifier: aleph - 001722550
oclc - 25764302
notis - AJD5029

AN EXTENSIBLE DATA MODEL AND EXTENSIBLE
SYSTEM ARCHITECTURE FOR BUILDING ADVANCED KNOWLEDGE
BASE MANAGEMENT SYSTEMS














By
RAHIM MOHAMED YASEEN


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA


1991



























Copyright 1991

by

Rahim Mohamed Yaseen























To the memory of my father



And to my mother, Jubeda,
for being so courageous in life














ACKNOWLEDGMENTS


I wish to thank the following people for their contributions throughout the duration of my graduate studies. My sincere appreciation goes out to Dr. Stanley Su for giving me the opportunity to work in the challenging area of object-oriented and extensible knowledge base management systems. Without his thought-provoking guidance, inspiring technical discussions, and steadfast support this work could not have been realized. I am deeply grateful to Dr. Herman Lam, the co-chairman of my supervisory committee, for valuable technical guidance, for continuous encouragement and thought-provoking suggestions, and for always being there. I thank Dr. Sham Navathe for fruitful technical discussions and for access to his personal library. I would like to thank Dr. Keith Doty and Dr. Richard Newmann-Wolfe for their time and for agreeing to serve on my supervisory committee. I also wish to thank Dr. Sharma Chakravarthy for valuable discussions on various technical issues. My special thanks go to Sharon Grant whose help has been simply immense. On a personal note, I thank all my friends, too numerous to list, for making my stay in Gainesville a pleasant and memorable one. I take this opportunity to thank Dr. Tom Bullock for being my mentor at the beginning of my stay in Gainesville. My heartfelt gratitude goes out to my family for always being there when I needed them, especially my brother, Iqbal, for his financial help and for being an inspiring force in my life. Finally, I wish to thank Mohini for so gracefully putting up with all the ups and downs of the last few years.








This research was supported by the National Science Foundation (Grant # DMC-8814989), IBM (Grant # S919FM81), National Institute of Standards and Technology (Grant # 60NANB7D0714), and the Florida High Technology and Industry Council (Grant # UPN 90090708).















TABLE OF CONTENTS




ACKNOWLEDGEMENTS ............................................... iv

ABSTRACT ....................................................... x

CHAPTERS

1 INTRODUCTION ................................................. 1

2 A SURVEY OF RELATED WORK ..................................... 10

  2.1 KBMS and DBMS Architectures .............................. 10
      2.1.1 Traditional DBMS Architectures ..................... 10
      2.1.2 Object-oriented DBMS Architectures ................. 11
      2.1.3 Extensible DBMS Architectures ...................... 13
  2.2 Object Management Techniques ............................. 17
  2.3 Query Processing in Object-oriented Databases ............ 18
  2.4 Methodologies for Software Development ................... 20
  2.5 Meta-level and Reflexive Architectures ................... 21

3 A FRAMEWORK FOR EXTENSIBILITY ................................ 23

  3.1 Key Features of our Approach ............................. 23
  3.2 Overview of our Approach ................................. 24

4 AN EXTENSIBLE KERNEL OBJECT MODEL (XKOM) ..................... 36

  4.1 The Extensible Data Model Approach ....................... 36
      4.1.1 Benefits of the Approach ........................... 37
      4.1.2 Data Model Requirements ............................ 37
  4.2 Basic Data Modeling Constructs: Structure ................ 38
      4.2.1 Object, Class, and Instance ........................ 38
      4.2.2 Identity ........................................... 41
      4.2.3 Association ........................................ 42
      4.2.4 Class Definition ................................... 43
  4.3 An Example Database ...................................... 45
      4.3.1 Database Schema .................................... 45
      4.3.2 Database Extension ................................. 48
  4.4 Basic Data Modeling Constructs: Behavior ................. 50
  4.5 Language Components ...................................... 50
      4.5.1 Queries and Methods ................................ 53
      4.5.2 Rules .............................................. 58
  4.6 Model Extensibility ...................................... 61
      4.6.1 Model Reflexivity .................................. 63
      4.6.2 Class Extensibility ................................ 68
      4.6.3 Association Extensibility .......................... 70

5 MANAGEMENT OF META-INFORMATION ............................... 73

  5.1 Using Reflexivity for Managing Meta-information .......... 73
  5.2 Model and System Meta-information ........................ 75
      5.2.1 The Bootstrap Process .............................. 75
      5.2.2 Parameterization ................................... 78
  5.3 Application Meta-information ............................. 80
  5.4 Dictionary Access Functions .............................. 81
  5.5 Data Dictionary Module ................................... 82

6 MANAGEMENT OF KERNEL OBJECTS ................................. 83

  6.1 Basic Requirements ....................................... 83
  6.2 Storage Issues for Kernel Objects ........................ 84
      6.2.1 Models of Storage: Static Storage Model vs
            Distributed Storage Model .......................... 84
      6.2.2 Logical Storage Structure of an Instance ........... 87
      6.2.3 Value Attributes ................................... 89
      6.2.4 Object References .................................. 92
      6.2.5 Identity ........................................... 97
      6.2.6 Identity-Address Mappings .......................... 98
      6.2.7 Homogeneous and Heterogeneous Clustering of Objects  99
  6.3 Processing Kernel Objects ................................ 101
      6.3.1 Association-based Access ........................... 103
      6.3.2 Value-based Access ................................. 104
      6.3.3 Kernel Object Manipulation Operators ............... 104
      6.3.4 Algorithms for Processing Kernel Object
            Manipulation Operators ............................. 111
  6.4 Implementation of the Kernel Object Management System .... 111
      6.4.1 System and Module Configuration .................... 111
      6.4.2 Interface Classes .................................. 113
      6.4.3 Class System ....................................... 114
      6.4.4 System Extensibility ............................... 118

7 A GRAPH-BASED APPROACH TO QUERY AND RULE PROCESSING .......... 120

  7.1 Rationale for a Graph-based Approach ..................... 120
      7.1.1 The Closure Property ............................... 120
      7.1.2 Pattern-based Query Formulation .................... 121
      7.1.3 Using Graphs as a Basis ............................ 121
  7.2 The Query Language and Algebra ........................... 126
      7.2.1 Query Language ..................................... 126
      7.2.2 Association Algebra ................................ 126
  7.3 Representation Schemes for a Subdatabase ................. 131
      7.3.1 Adjacency Matrix vs Linked List .................... 132
      7.3.2 A Three-Dimensional Matrix Structure ............... 133
      7.3.3 An Optimized Data Structure ........................ 136
  7.4 Processing Sets of Graphs ................................ 143
      7.4.1 A Class Index for Sets of Instance Graphs .......... 143
      7.4.2 Processing Algebraic Query Operators ............... 143
      7.4.3 Algorithms for Algebraic Query Operators ........... 147
      7.4.4 Processing Query Execution Plans ................... 147
  7.5 Extending the Graph-based Approach to Rule Processing .... 154
      7.5.1 Rule Trees ......................................... 154
      7.5.2 Deriving New Associations .......................... 156
  7.6 Implementation of Query Processor ........................ 158
      7.6.1 System and Module Configuration .................... 158
      7.6.2 Class System ....................................... 162

8 USING RULES TO ACHIEVE MODEL EXTENSIBILITY ................... 166

  8.1 The Object-oriented Semantic Association Model (OSAM*) ... 166
  8.2 Specification and Processing of Rules .................... 169
      8.2.1 Rule Types ......................................... 169
      8.2.2 Rule Storage and Binding Strategies ................ 174
  8.3 Realizing Association Extensibility ...................... 176
      8.3.1 Specifying Parameterized Rules ..................... 176
      8.3.2 Extending the Meta-class System .................... 178

9 SUMMARY AND FUTURE RESEARCH .................................. 184

  9.1 Summary .................................................. 184
  9.2 Future Work .............................................. 186

A DICTIONARY ACCESS FUNCTIONS .................................. 189

B ALGORITHMS FOR KERNEL OBJECT OPERATORS ....................... 191

  B.1 Create ................................................... 191
  B.2 Insert ................................................... 191
      B.2.1 Insert_Instance .................................... 191
      B.2.2 Insert_Object ...................................... 192
  B.3 Update ................................................... 193
      B.3.1 Update_Instance .................................... 193
      B.3.2 Update_Object ...................................... 194
      B.3.3 Associate .......................................... 195
      B.3.4 Dissociate ......................................... 196
  B.4 Delete ................................................... 196
      B.4.1 Delete_Instance .................................... 196
      B.4.2 Delete_Object ...................................... 197
      B.4.3 Destroy_Object ..................................... 198
  B.5 Retrieve ................................................. 198
      B.5.1 Retrieve ........................................... 198
      B.5.2 Select ............................................. 200
      B.5.3 Star ............................................... 201
      B.5.4 NonStar ............................................ 202

C ALGORITHMS FOR QUERY PROCESSING .............................. 203

  C.1 Associate (*) ............................................ 203
  C.2 NonAssociate (!) ......................................... 205
  C.3 Union (+) ................................................ 206
  C.4 Intersect (•) ............................................ 206

REFERENCES ..................................................... 208

BIOGRAPHICAL SKETCH ............................................ 216














Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy


AN EXTENSIBLE DATA MODEL AND EXTENSIBLE
SYSTEM ARCHITECTURE FOR BUILDING ADVANCED KNOWLEDGE
BASE MANAGEMENT SYSTEMS

By
Rahim Mohamed Yaseen

December 1991

Chairman: Dr. Stanley Y. W. Su
Cochairman: Dr. Herman Lam
Major Department: Electrical Engineering

Traditional data models and the corresponding monolithic database management system architectures have been found to be inadequate for supporting the complex and diverse requirements found in many advanced application domains. More advanced data models and system architectures are needed for developing new-generation Database Management Systems (DBMSs) and Knowledge Base Management Systems (KBMSs). Clearly, if a data model with a fixed set of modeling constructs and a fixed system architecture are used, they cannot accommodate the diverse and dynamically changing requirements that must be supported. This dissertation describes an approach of coupling an extensible data model with a corresponding extensible system architecture for the development of more advanced KBMSs.

An eXtensible Kernel Object Model (XKOM) is proposed. XKOM consists of a set of core data modeling structural abstractions which are commonly available in existing object-oriented data models and two powerful behavioral abstractions which are expressible by rules and methods. Data model extensions are realized by using a novel technique called model reflexivity. In model reflexivity, the data model (XKOM) is used to reflexively model itself, resulting in a "model of the data model." Extensions are achieved by modifying or extending this "model of the model" using the structural and behavioral abstractions provided by the data model. In particular, we use knowledge rules as a powerful and declarative means of specifying data model extensions.

The extensible system architecture is realized by using an open, modular, and layered framework coupled with a reflexive approach to system design and implementation. A middle-out approach is used in the layered architecture where the data model (XKOM) serves as a basis for developing an intermediate level of abstraction. The architecture can then be (a) upwardly extended to support appropriate high-level, end-user, data modeling constructs for diverse application domains, and (b) downwardly extended to different physical data organizations, access paths, processing strategies and storage sub-systems. The object-oriented paradigm is used to model functionally distinct system components in various layers, resulting in a "model of the system." Functional extensions are then carried out by appropriate modifications to the sub-schema corresponding to specific functional modules.

To illustrate the extensible approach to data modeling and system architecture, several model and system components including a kernel object manager, a data dictionary module, and a query processor have been implemented. The kernel object manager corresponds to an implementation of the data model (XKOM). The data dictionary module manages the meta-data of the model and the system. The query processor uses a pattern-based approach for formulating queries. A novel graph-based technique using an adjacency-matrix data structure for processing such queries is also described.














CHAPTER 1
INTRODUCTION

In recent years, an increasing number of complex and advanced application domains, such as Computer Integrated Manufacturing (CIM), VLSI Design, Computer-Aided Software Engineering (CASE), and Engineering Design, have been addressed in the context of database technology. Existing Database Management Systems (DBMSs) such as relational, network, and hierarchical systems have been primarily designed for use in traditional business-oriented applications and have been found to be inadequate in supporting the diverse and demanding requirements of these emerging non-traditional application domains. To support the needs of such diverse and complex application domains, the research presented in this dissertation addresses many aspects pertaining to a framework and architecture for developing next-generation Knowledge Base Management Systems (KBMSs). A KBMS is a system which provides all the functionality of a DBMS and also incorporates additional facilities for knowledge management. Since the capabilities of a KBMS encompass those of a DBMS we will, in general, use the term KBMS in the broader sense to refer to both KBMSs and DBMSs.

To identify the requirements that pertain to developing next-generation KBMSs, we analyze such requirements from two aspects: first, the data model aspect, which deals with identifying an appropriate data model for supporting advanced application domains; and second, the system architecture aspect, which deals with defining a suitable system architecture for realizing a system based on the data model.

To analyze the data model aspect, we review some important remarks made by Dittrich [DIT86] on the notion of a semantic gap that occurs when an application is mapped to a corresponding database. A database which corresponds to a real-world application captures some specific semantics of the application as accurately and completely as possible. The term mini-world or Universe of Discourse is used to refer to the part of the real world corresponding to the application. As Dittrich points out, two classes of semantics can be distinguished. The first class of semantics, which we shall denote by S, is the semantics of the mini-world itself. That is, the semantics of the application in the real world. The second class of semantics, which we shall denote by Ŝ, represents the semantics of the mini-world as captured in the corresponding database. That is, the semantics of the application as captured in the database.

In order to capture or express the mini-world semantics, a mechanism for abstraction and representation is needed. A data model represents such an abstraction and representation mechanism. Thus, Ŝ depends on the data model used to capture the semantics of the application. Usually Ŝ < S, since the imperfect nature of data modeling constructs implies that all the required semantics of a mini-world cannot be precisely abstracted and represented in the database. Thus, a semantic gap (S - Ŝ) always exists due to the difference between these two classes of semantics. The semantics not captured by a data model, that is (S - Ŝ), must be provided by an application program and/or by an implicit interpretation on the part of end-users that semantics Ŝ imply semantics S. As application domains become increasingly complex, the semantic gap widens and complicated application programs are needed to capture those semantics that the data model is unable to capture. Clearly, the semantic gap cannot be totally eliminated since this would require data models to be perfect. In view of this, an important issue is to define a suitable framework for data modeling (that is, a data model) for supporting advanced and complex application domains such that the semantic gap is minimized.
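
The semantic-gap argument can be restated compactly. What follows is only a sketch in set notation, assuming (beyond what the text states) that semantics can be treated as sets of facts. Here S is the mini-world semantics and \hat{S}_M is the semantics captured in the database under a data model M; the subscript M, the gap symbol G_M, and the informal size measure |.| are illustrative symbols introduced for this restatement, and the "less than" relation between the two classes of semantics is read as proper inclusion.

```latex
\[
  \hat{S}_M \subset S, \qquad
  G_M \;=\; S \setminus \hat{S}_M \;\neq\; \emptyset, \qquad
  M^{*} \;=\; \arg\min_{M} \, \lvert G_M \rvert .
\]
```

Under this reading, G_M is exactly the part of the semantics that application programs or end-user interpretation must supply, and the data modeling issue is to choose M so that G_M is as small as possible.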

To address this issue, we enumerate two key factors that contribute towards an increased semantic gap when using existing data models. First, the data model may be semantically inadequate. That is, the data modeling constructs may not be semantically rich enough to capture the semantics of various application domains. Second, the data model may be fixed. That is, a given application can only be modeled using a fixed set of data model constructs even though diverse applications have different needs. These factors which have contributed to the semantic gap are an important aspect of the problem that this dissertation addresses, and techniques that aim to reduce the semantic gap are an important part of the solution that we propose. As discussed below, these factors have influenced the development of data models in recent years.

Traditionally, data models such as relational, hierarchical, and network were primarily designed for use in business-oriented applications. When used in more complex and non-traditional application domains, such data models were found to be semantically inadequate. This resulted in the development of early semantic data models such as the Binary Relationship Model [ABR74], Aggregation and Generalization data modeling constructs [SMI77], the Entity-Relationship (ER) Model [CHE76], the Semantic Data Model (SDM) [HAM81], the Functional Data Model (DAPLEX) [SHI81], and Semantic Association Model (SAM*) [SU83]. In comparison with traditional models, these semantic data models incorporated additional semantics into the data modeling constructs so as to reduce the semantic gap.
The emergence of the object-oriented paradigm from Artificial Intelligence (AI) knowledge representation concepts and its incorporation into languages such as Simula [BIR74], Smalltalk [GOL83], Loops [BOB83], and CLU [LIS86] led to the merging of semantic data models and object-oriented concepts, resulting in a class of new and advanced data models referred to as object-oriented (OO) or object-oriented semantic data models. In addition to providing advanced structural data model abstractions, object-oriented data models also provide a strong behavioral abstraction component through the use of abstract data types and methods--a feature absent in previous data modeling paradigms. Examples of such data models can be found in NIAM [VER82], IDEF1X [LOO86], IFO [ABI87], STDM [COP84], IRIS [FIS87], Galileo [ALB85], TAXIS [MYL80], DAMOKLES [DIT87], and OSAM* [SU89]. A survey of such data models can be found in Hull and King [HULL87] and Peckham and Maryanski [PEC88]. A more powerful behavioral abstraction, namely declarative rules, has also been recognized by and incorporated into some of these advanced data models [BOB83, BRO84, KER84, MOR84, SU89]. High-level declarative rules enhance the data management capabilities by providing knowledge management capabilities--capabilities that include truth maintenance, constraint management, deduction and inferencing, and the ability to trigger or activate actions under user-specified conditions.

In order to minimize the semantic gap, object-oriented data models provide a large set of rich data modeling constructs. These include specialized structural constructs which encode high-level structural semantics. Furthermore, behavioral abstractions such as methods are provided and enable application programs (normally used for enforcing additional semantics) to be directly incorporated as part of the database. However, these models suffer from two shortcomings. First, these data models do not address the second factor that contributes towards an increased semantic gap. That is, in spite of their advanced data modeling constructs, these models only provide a fixed set of data modeling constructs. Second, the combination of the large number of constructs and the semantically rich nature of the constructs leads to unwieldy and inefficient implementations of such models.

Clearly, a fixed data model cannot support a diverse range of applications even if it possesses many semantically rich constructs. The semantics of different applications require specialized constructs, the absence of which leads to the usage of complicated application programs for enforcing such semantics. This analysis of data models leads us to conclude that an extensible data model approach is needed to overcome the fixed nature of existing data models so that many diverse application domains may be supported. An extensible data model approach allows the semantic gap to be reduced since the data model can be tailored to a specific application rather than forcing an application to use a specific data model.

We now analyze system architecture considerations for next-generation KBMSs. A KBMS represents the implementation or realization of a specific data or knowledge model. Initially, many early data models served as conceptual database design and modeling tools in various application domains. The maturing of these application domains resulted in an increasing need for developing KBMSs or DBMSs corresponding to the various data models so that databases or knowledge bases corresponding to these applications could be realized. Several approaches have been suggested [ABI86, AGR89, ALB85, ONT88, BAN87a, BAN87b, COP84, DAD86, MAI86a, FIS87, SER87, SKA86].

One approach is to build layers of software on top of existing traditional DBMSs to implement the semantics of a particular data model. While this approach enables the use of existing DBMSs, the large semantic mismatch between the semantics of advanced data models and that of the underlying database engine entails the development of extensive layers of system software to perform numerous mappings which are typically complex, cumbersome, and inefficient. An additional disadvantage of this approach is that the monolithic nature of traditional DBMS architectures is not conducive to functional extensions. A second approach is to develop a dedicated system for each advanced data model and for the corresponding application domain. While this approach would eliminate any semantic mismatch and provide high performance, it is not feasible due to the excessive monetary costs and development time it would entail. The two solutions enumerated above represent two extremes in the solution space of the problem. In pursuit of a reasonable middle ground, two independent approaches have been proposed, namely, the object-oriented database system approach and the extensible database system approach.
The object-oriented database system approach [AGR89, ONT88, BAN87a, BAN87b, COP84, DAD86, DIT87, FIS87, MAI86a, SKA86] proposes the development of a single (and usually fixed) DBMS architecture based on a specific object-oriented data model. A key disadvantage with this approach is that the system is fixed and not easily extendable. That is, only a fixed set of higher order data modeling constructs is supported. This approach implicitly assumes that a single object-oriented data model suffices to support all applications, with not much provision for applications and data modeling constructs whose requirements fall outside the original model and system specifications. Furthermore, the issue of defining a single object-oriented data model is both subjective and controversial. Unlike relational DBMSs, which are all based on a common relational data model, it is not possible to do the same for object-oriented database systems because no clear consensus exists on a common high-level object-oriented data model [ATK89]. Consequently, differences and incompatibilities may exist even if the same application utilizes different object-oriented database systems since each system supports its own notion of a high-level object model.

The extensible database system approach [BAT88, CAR86b, LIN87, STO86] proposes the development of an extensible DBMS architecture, an architecture which is deliberately designed to provide explicit mechanisms for functional extensions to the system architecture. This allows the database system to be customized or tailored to support specific application domains. In Postgres [STO86] and Starburst [LIN87] extensibility is provided within the framework of a relational data model. The approach taken by Genesis [BAT88] is characterized by a network model basis and uses a strict building-blocks framework for extensibility. Exodus [CAR86b] couples a fixed, powerful object-based storage manager with a set of tools to achieve extensibility. These different extensible systems are characterized by the different models on which they are based and the approaches taken to achieve extensibility. A shortcoming in many of these systems is that, though the system architecture is designed to be extensible, the model on which the system is based is fixed and not extensible. Thus, system extensibility is usually provided only within the framework of a fixed data model.

Based on the analysis of data models and system architectures presented above, we conclude that the combination of a single (fixed) data model and a fixed system architecture is clearly unable to support the needs of advanced application domains. In this dissertation, we present our approach to the development of next-generation KBMSs for supporting advanced application domains. We propose the use of an extensible kernel object model as a basis coupled with an open, layered, and modular framework for achieving system extensibility. Our approach builds upon an integration of the object-oriented DBMS approach and the extensible system approach thereby overcoming the limitations of each of the individual approaches. It is based on the concept of a generalized, extensible, and object-oriented architecture for Knowledge Base Management Systems. In comparison with the approaches described previously, our approach aims to provide a unified and integrated framework for extending both the data model and the underlying system architecture.

Unlike object-oriented DBMSs which are based on a fixed set of high-level object-

oriented data modeling constructs, our approach begins with a set of basic and
generalized data modeling constructs common to the range of advanced semantic and

object-oriented data models. Collectively, this set of basic data modeling constructs
is used to define an eXtensible Kernel Object Model (XKOM), which serves as a

basis for our approach and is used to realize a kernel system. By making the data

model extensible, the semantic gap caused by the fixed nature of data models is

alleviated.









The data model also incorporates the specification of knowledge rules since an

important requirement for next-generation KBMS architectures is an integrated ap-
proach to knowledge management [RAS87, RAS88, SU85]. Consequently, rule spec-

ification and processing are integrated into the architecture of the system at both

representational and functional levels. Rules are available for use by end-user data
models and application domains but more importantly are reflexively used in the

system architecture as a basic mechanism for extending the semantics of the data

model.

To achieve system extensibility, we propose an open, layered, and modular frame-

work coupled with a reflexive approach to system implementation. An open and

modular architecture is realized by using the object-oriented paradigm to model

functionally distinct system components in various layers. Functional extensions to

the system architecture are carried out by appropriate modifications to the schemata

which correspond to specific functional modules. In designing the layered framework,

a middle-out approach is used in which XKOM serves as the basis for a well-defined

intermediate level of abstraction. This model (XKOM) can then be (a) upwardly extended
to support appropriate high-level, end-user data model constructs useful for
advanced application domains and (b) downwardly extended to different and possibly

alternative physical data organizations, access paths, and storage sub-systems.

To demonstrate the feasibility of the concept of an extensible data model coupled

with an extensible system architecture, this research also investigates implementa-

tion techniques for a kernel object manager, a data dictionary module for managing

meta-information, and a query/rule processor [LAW91, LU91, YAS91]. Query/rule

processing is emphasized since the ability to specify and process high-level declar-

ative queries is an important factor which has contributed greatly to the success









of database systems. In view of this, we investigate a generalized graph-based ap-

proach to query processing for object-oriented databases using adjacency matrix

data structures.
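
As a rough illustration of the general idea only (it is not the algorithm developed later in this dissertation), the following Python sketch holds the associations between objects of different classes in Boolean adjacency matrices and answers a simple path query by composing two matrices. The class names, object identifiers, and the helper functions `compose` and `related_pairs` are hypothetical and chosen purely for illustration.

```python
# Hypothetical sketch: associations between objects of two classes are held
# in Boolean adjacency matrices, and a path query is a matrix composition.

def compose(a, b):
    """Boolean matrix product: result[i][k] is True if some j links i to k."""
    rows, inner, cols = len(a), len(b), len(b[0])
    return [[any(a[i][j] and b[j][k] for j in range(inner))
             for k in range(cols)]
            for i in range(rows)]


def related_pairs(matrix, row_ids, col_ids):
    """List the (row object, column object) pairs marked in the matrix."""
    return [(r, c) for i, r in enumerate(row_ids)
            for j, c in enumerate(col_ids) if matrix[i][j]]


# Object identifiers for three illustrative classes.
students = ["s1", "s2"]
courses = ["c1", "c2", "c3"]
teachers = ["t1", "t2"]

# enrolls[i][j] is True if students[i] is associated with courses[j].
enrolls = [[True, False, True],
           [False, True, False]]
# taught_by[j][k] is True if courses[j] is associated with teachers[k].
taught_by = [[True, False],
             [False, True],
             [True, False]]

# Query: which students are connected to which teachers through some course?
student_teacher = compose(enrolls, taught_by)
print(related_pairs(student_teacher, students, teachers))
```

In this sketch the result of the query is itself a matrix over two classes, so it can be composed again with further associations.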

This dissertation is organized as follows. In Chapter 2, a survey of relevant

literature pertaining to several areas that have a bearing on this research is pre-
sented. Among areas surveyed are DBMS architectures, object management tech-

niques, techniques for query processing in object-oriented databases, methodologies

for software development, and meta-level and reflexive architectures. Chapter 3

presents an overview of the approach that this dissertation proposes for developing

advanced KBMSs. In Chapter 4, the eXtensible Kernel Object Model (XKOM)

having the properties of generality, extensibility and reflexivity is presented. The

management of meta-information using a reflexive model of XKOM as a basis is

the focus of Chapter 5. In Chapter 6, we study the management of kernel objects:

objects which are defined using XKOM as a model of abstraction. Conceptual and

implementation issues relating to query and rule processing in our KBMS architec-

ture are addressed in Chapter 7. Chapter 8 presents further details on achieving

data model extensions in the light of the work presented in the previous chapters.

To demonstrate the proposed notions of extensibility, the Object-oriented Seman-

tic Association Model (OSAM*) [SU89] is used as an example end-user target data

model. Finally, Chapter 9 summarizes the research presented in this dissertation,

presents a conclusion on the results achieved, and suggests some areas for future

research.














CHAPTER 2
A SURVEY OF RELATED WORK

In this chapter, we present a survey of several topics which are related to the re-
search presented in this dissertation. The topics surveyed include different categories

of KBMS and DBMS architectures, object management techniques, query processing

techniques for object-oriented databases, methodologies for software development,

and meta-level and reflexive architectures.

2.1 KBMS and DBMS Architectures

In this section, we begin with a retrospective on early database system architec-

tures which are the forerunners of current system architectures. Then, we focus on
current approaches for supporting advanced application domains: object-oriented

database system architectures and extensible database system architectures.

2.1.1 Traditional DBMS Architectures

In the early stages of database system development, system architectures were

not well-defined, and database software consisted mostly of system code developed

by system programmers.

With the advent of the relational data model [COD70], two well-designed database
system architectures emerged, namely INGRES [STO76] and System R [AST76].

In particular, System R proposed a two-level architecture consisting of a Relational

Data System (RDS) and a Relational Storage System (RSS) having well-defined

interfaces called the Relational Data Interface (RDI) and the Relational Storage In-

terface (RSI), respectively. The separation of logical and physical database aspects








in different layers contributed to an architecture which has produced widely accepted
results in the domain of relational database system architectures.

2.1.2 Object-oriented DBMS Architectures

As described in Chapter 1, the semantic inadequacies of traditional data models
and their corresponding systems (e.g., relational) led to the development of object-
oriented data models and corresponding system architectures.
Object-oriented DBMSs aim to provide a single system having powerful capabil-
ities to handle the requirements of advanced application domains. In particular, the
system is based on a single high-level object-oriented data model. This approach
usually requires an application to use a specific (fixed) data model and makes no
allowance for application domains whose requirements fall outside the original spec-

ification of the data model and underlying system architecture. We now briefly
describe some of the more prominent systems using two different categories to clas-
sify such systems.
In the first category of object-oriented systems, systems such as Gemstone [MAI86a],
Vbase [ONT88], and ONTOS [ONT90] have been developed as object-oriented en-
vironments which provide a specific data model together with persistence and a
database programming language for defining and manipulating objects. Gemstone
is based on the Smalltalk Data Model (STDM) [COP84]. It introduces the notion of
persistent objects into the Smalltalk Data Model by providing a large, extended virtual
memory together with a Smalltalk-like database language called OPAL [SER87]
for manipulating objects. Vbase provides its own object-oriented data model together
with two languages: a Type Definition Language (TDL) for defining user-defined
data types, and a language called COP for writing methods associated with
a class or type. ONTOS, the successor to Vbase, is based on the C++ data model,
and provides persistence to objects in the C++ programming language. The C++
language is used both for defining types and as a database programming language.









The systems in this first category have not been designed and developed as com-

plete DBMSs. Rather, these systems provide an environment in which application

specific classes can be developed by application programmers. A library of system
pre-defined classes (e.g., sets, arrays, bags, etc.) is provided to build user-defined

classes. Typical database functionalities such as set-oriented queries are usually not

provided.

In the second category, object-oriented database systems such as Orion [BAN87a,

BAN87b], Iris [FIS87], and O2 [LEC88, LEC89, VEL89] have been designed as more

complete database systems based on specific object-oriented data models. Unlike

the environment-oriented systems described above, these systems not only combine a

data model with a persistent programming language but also incorporate many data-

base functionalities such as querying, schema modification, and transactions. Orion

is an object-oriented prototype database system developed to support advanced application
domains. The main features of Orion are support for complex/composite

objects, schema evolution, and versioning. Orion regards such features as being key

requirements for advanced applications. Iris is a research prototype being developed

for exploring new database functionalities and features to support advanced applica-

tion domains. Iris uses the concept of an object manager built on top of a relational

storage system to support many different interfaces to external applications. More

recently, the O2 research prototype has been developed as a complete object-oriented

database system for supporting advanced applications. It provides the functionality

of a DBMS (persistence, disk management, sharing, and query language) together

with object-oriented features such as complex objects, object identity, encapsulation,

typing, and inheritance.
All the above approaches to object-oriented DBMSs have focused on a fixed

implementation based on a specific data model. A key design criterion has been to

support the object-oriented paradigm whereas extensibility has not been a major








design criterion. Thus, these systems are not designed or intended to be explicitly

extensible. Such systems do provide a basic form of extensibility in that the systems

allow the notion of user-defined abstract data types and user-defined operations:

a feature that is attributed to object-oriented data models. However, in all these

systems, system components (modules) are not explicitly available for modification.

2.1.3 Extensible DBMS Architectures

The notion of an extensible database system architecture has been derived from

the argument that a single fixed system is unable to support diverse advanced ap-

plication domains which have varying requirements. Thus, the focus of extensible

systems is to provide a base system together with an extensibility mechanism such

that the base system can be extended or reconfigured to realize a system that meets

the requirements of an application domain.

To better analyze extensible system architectures, we propose the following clas-

sification for extensible systems.

Data Model Extensibility: In data model extensibility, the constructs which concep-

tually define a data model are extensible, that is, existing constructs can be

modified or new constructs can be added.

System Extensibility: In system extensibility, the architecture of a database system

can be extended in order to realize functional extensions. From a traditional

database system perspective, system extensibility can be broadly categorized

into logical extensibility and physical extensibility. Logical extensibility refers

to extensions to the upper (logical) layers of a system architecture, and includes

extensions such as support for different data modeling constructs, user-defined

abstract data types and operations, query languages, transaction models, etc.









Physical extensibility refers to extensions to the lower (physical) layers of a system
architecture, and includes extensions such as different data organizations,
file structures, access paths, and other storage related aspects.

We note that data model extensibility and logical extensibility are two distinct

concepts: the former is an abstract concept related to data modeling and the lat-
ter is a physical concept related to a system architecture. To support data model

extensibility, an underlying system architecture must possess logical extensibility.

The reverse is not always true since a system can possess some forms of logical
extensibility in the context of a fixed data model (e.g., the relational model).

While traditional DBMSs cater to two kinds of users, namely, Database Admin-
istrators (DBAs) and end-users, it has been proposed in [BAT88, CAR86b] that

extensible systems require a third kind of user: a DBMS Implementor (DBI) or

DBMS Customizer (DBC). The DBI or DBC is a highly skilled professional, whose
task is loosely defined to be that of customizing or configuring an extensible system

to meet the requirements of specific application domains.

The extensible system approach is characterized by (a) the data model used

as a basis and the degree to which data model extensibility is provided, (b) the

mechanisms proposed for extensibility, and (c) the various forms and levels of
logical/physical extensibility supported by the system. We now analyze several existing

extensible systems in the context of the above criteria.

Exodus [CAR86a, CAR86b] is designed around a toolkit approach, in which

kernel DBMS facilities and a set of software tools for semi-automatic generation of

application-specific DBMSs are provided. It builds upon a fixed storage manager

which uses a kernel model of unstructured and untyped objects represented as byte

sequences. The kernel does not support any features present in object-oriented data

models. At the logical level, Exodus provides for application specific Abstract Data

Types (ADTs). A programming language called E (an extension of C) is the tool








or mechanism by which extensibility is provided; it is the language that is used by
the DBC or DBI for customizing the system. In order to provide extensibility, an
extensible library of type-independent access methods is provided and a rule-based

query optimizer and compiler has also been proposed. More recently, an effort is

underway [CAR88] to incorporate an extensible data model on top of the existing
storage system. In Exodus, the initial focus has been mostly on physical extensibility.

Logical extensibility has been limited to features such as abstract data types, and

an extensible query optimizer. Data model extensions have not been investigated.

Genesis [BAT88] is designed as a modular system which relies on database com-

ponents whose interfaces have been standardized in such a manner that they become
interchangeable. The concept of rapidly configuring a storage architecture by writing

a storage architecture specification program has been proposed. The system focuses

mainly on physical extensibility and is based on a model called the Unifying Model

(a variant of the network model). The storage components of a DBMS are considered
to be building blocks or modules that realize files (set of records), linksets (links be-

tween records), and elementary transformations (conceptual to internal mappings).

A rigorous notion of data definition mappings and operation mappings from a logical
to a physical level is presented. Storage architectures are considered to be compo-

sitions of such building blocks, and reconfiguration is carried out by synthesizing a
storage architecture using pre-existing blocks from an extensive library.
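
The building-block idea can be pictured with a small Python sketch. It is only an illustration under our own assumptions and does not reproduce the Genesis specification language or its actual components: the interface `Block` and the classes `HeapFile` and `RenameFields` are hypothetical names. Every block exposes the same two operations, so a file block and a transformation block can be composed freely.

```python
from typing import Dict, Iterable, Iterator, List, Protocol


class Block(Protocol):
    """Standardized interface shared by every (hypothetical) building block."""

    def insert(self, record: Dict[str, object]) -> None: ...
    def scan(self) -> Iterable[Dict[str, object]]: ...


class HeapFile:
    """A file block: an unordered collection of records."""

    def __init__(self) -> None:
        self._records: List[Dict[str, object]] = []

    def insert(self, record: Dict[str, object]) -> None:
        self._records.append(dict(record))

    def scan(self) -> Iterator[Dict[str, object]]:
        return iter([dict(r) for r in self._records])


class RenameFields:
    """A transformation block: maps conceptual field names to internal ones."""

    def __init__(self, inner: Block, mapping: Dict[str, str]) -> None:
        self._inner = inner
        self._to_internal = dict(mapping)
        self._to_conceptual = {v: k for k, v in mapping.items()}

    def insert(self, record: Dict[str, object]) -> None:
        # Data definition mapping applied on the way down.
        self._inner.insert(
            {self._to_internal.get(k, k): v for k, v in record.items()})

    def scan(self) -> Iterator[Dict[str, object]]:
        # Inverse mapping applied on the way back up.
        for r in self._inner.scan():
            yield {self._to_conceptual.get(k, k): v for k, v in r.items()}


# A storage architecture is a composition of blocks.
heap = HeapFile()
architecture = RenameFields(heap, {"name": "n", "salary": "s"})
architecture.insert({"name": "Ada", "salary": 100})
print(list(heap.scan()))          # internal representation
print(list(architecture.scan()))  # conceptual representation
```

Because `RenameFields` depends only on the shared interface, the heap file could be replaced by any other block with the same operations without changing the layer above it.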

Postgres [STO86, STO87] is an extensible system which adds extensibility to the
relational data model. It is based on the relational model, with the stated goal of

making minimal changes to the relational model. It extends the relational model, at

the logical level, with the notion of user-defined ADTs. Thus, the logical extensibility
issues addressed are the definition and use of ADTs and user-defined operations

within the context of the relational data model. At the physical level, Postgres con-

siders the addition and extension of access methods. Postgres supports Postquel (an








extended query language), complex objects, triggers, alerters and inferencing. The

basic extensibility mechanism is ADTs.

Starburst [SCH86, LIN87] is a system being built as a successor to System R

[AST76]. It is based on the relational data model and addresses extensions within

the relational model. Starburst proposes two notions of extensibility: user-defined

extensions and data management extensions. User-defined extensions (logical ex-

tensibility, in our terminology) are considered as support for user-defined ADTs

and functions as fields of database records. Data Management Extensions (physical

extensibility, in our terminology) are considered as support for alternate implemen-

tations of database storage and access paths. Starburst considers two forms of Data

Management Extensions: storage methods and attachments. Storage methods deal

with alternate ways (e.g., sequential, B-tree, foreign files at a remote site) of storing

relations and provide a well-defined set of operations such as delete, insert, and update.
Attachments are used to define and implement access paths, integrity constraints,
and triggers. Examples of attachment types include B-tree indices, hash tables, etc.

Attachments, like storage methods, provide a well-defined set of operations. Unlike

storage methods, attachments are invoked as side effects of modification operations

on relations. The attachment concept provides a more general concept than merely

the concept of extensible access paths. Currently, a query language, rule-based optimizer,
and query re-write mechanism are under development [LOH88, HAA89]. Like

Postgres, Starburst provides extensibility within the framework of the relational data
model.
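
The distinction between storage methods and attachments can be sketched in Python as follows. This is a minimal illustration under our own assumptions, not Starburst's actual interface: the classes `ListStorage`, `PositiveSalary`, `HashIndex`, and `Relation`, and the method names `on_insert` and `on_delete`, are hypothetical. The storage method is called directly to store rows, whereas the attachments are invoked as side effects of each modification.

```python
class ListStorage:
    """Hypothetical storage method: keeps the rows of a relation in a list."""

    def __init__(self):
        self.rows = []

    def insert(self, row):
        self.rows.append(row)

    def delete(self, row):
        self.rows.remove(row)


class PositiveSalary:
    """Hypothetical attachment acting as an integrity constraint."""

    def on_insert(self, row):
        if row["salary"] <= 0:
            raise ValueError("salary must be positive")

    def on_delete(self, row):
        pass  # nothing to check when a row is removed


class HashIndex:
    """Hypothetical attachment acting as an access path on one field."""

    def __init__(self, field):
        self.field = field
        self.buckets = {}

    def on_insert(self, row):
        self.buckets.setdefault(row[self.field], []).append(row)

    def on_delete(self, row):
        self.buckets[row[self.field]].remove(row)

    def lookup(self, value):
        return list(self.buckets.get(value, []))


class Relation:
    """Routes each modification to the storage method and its attachments."""

    def __init__(self, storage, attachments):
        self.storage = storage
        # Constraints are listed before indices so that a rejected row
        # never reaches an access path or the storage method.
        self.attachments = attachments

    def insert(self, row):
        for attachment in self.attachments:
            attachment.on_insert(row)  # invoked as a side effect
        self.storage.insert(row)

    def delete(self, row):
        self.storage.delete(row)
        for attachment in self.attachments:
            attachment.on_delete(row)  # invoked as a side effect


by_dept = HashIndex("dept")
employees = Relation(ListStorage(), [PositiveSalary(), by_dept])
employees.insert({"name": "Ada", "dept": "db", "salary": 10})
print(by_dept.lookup("db"))
```

The same hook serves both the index and the constraint, which reflects why the attachment concept is more general than extensible access paths alone.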

The Darmstadt Kernel System [PAU87] is an extensible, multi-layer system

which aims to support advanced application domains by developing a common kernel

DBMS, and allowing application specific layers to be defined on top of this kernel.

The kernel is a layer that supports the storage level functionalities of the system.









Extensibility is achieved by using the common kernel for all applications and ex-

tending the kernel through the use of application-specific layers on top of the kernel.
Thus, the concept of extensibility is "add-on-top" extensibility, with the lower layers

being largely fixed. This implies that physical extensibility or extensions to the lower
layers cannot be easily carried out. At the upper (logical) layers, the system does

not support the concept of tailoring a data model to different applications. Instead,

the system supports some forms of logical extensibility such as user-defined ADTs

and complex objects.
TI's Open OODB [THO89] represents a combination of the object-oriented ap-

proach and the extensible system approach. This approach aims to build an object-

oriented database system using an open, modular and extensible architecture. The

system has adopted the C++ data model as a reference model due to the widespread

use of the C++ language. It does not support the concept of data model extensions.
Open OODB supports the concept of system extensibility by the use of an open
and modular approach. An attempt is made to identify orthogonal functionalities

and develop a set of generalized modules for object-oriented database systems cor-
responding to such functionalities. The modules themselves are not extensible. The
approach proposed to achieve extensibility is applicable both at logical and physical

levels. The goals of this approach parallel the goals that this dissertation seeks to

achieve. However, in our approach, we propose an integrated and unified approach

to achieve concepts of model extensibility and system extensibility by reflexively
modeling data model components and system components (modules) as objects and

classes using the object-oriented paradigm.

2.2 Object Management Techniques

Techniques for the management of objects are important to the development of

any KBMS. Many object managers such as WiSS [CHO85], Observer [SKA86], Exodus
[CAR86a], and O2 [VEL89] have been proposed. Many of these so-called object









managers deal with objects whose structural representation is a low-level storage

representation such as records, fields and files. Many of these object managers are
more appropriately called storage managers. The interface to these storage level
managers is a low-level interface (such as get/put, read/write) based on the storage
level representation of objects.

In contrast, in this dissertation we will address object management in the context

of objects modeled at a level of abstraction which incorporates a set of basic seman-
tics. Ideally, a storage layer must deal only with abstractions and issues that are
relevant to the storage level. Consequently, storage managers should not incorporate

functionalities relating to high-level object representations, high-level operators, and

set-oriented retrievals, but rather provide adequate support for such functionalities.
Instead, management of objects at a higher level of abstraction should be carried
out in a layer above the storage layer.
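
The separation argued for above can be sketched in Python. The sketch is illustrative only and is not the kernel object manager developed in this dissertation: the classes `StorageManager` and `ObjectManager`, their methods, and the use of JSON as a record encoding are all assumptions made for the example. The storage layer sees only record identifiers and bytes, while class membership, attributes, and set-oriented retrieval live in the layer above it.

```python
import json


class StorageManager:
    """Low-level layer: knows only record identifiers and byte strings."""

    def __init__(self):
        self._records = {}
        self._next_id = 0

    def put(self, data: bytes) -> int:
        rid = self._next_id
        self._next_id += 1
        self._records[rid] = data
        return rid

    def get(self, rid: int) -> bytes:
        return self._records[rid]

    def record_ids(self):
        return list(self._records)


class ObjectManager:
    """Higher layer: objects with a class name and named attributes."""

    def __init__(self, storage: StorageManager):
        self._storage = storage

    def create(self, class_name: str, **attributes) -> int:
        payload = json.dumps({"class": class_name, "attrs": attributes})
        return self._storage.put(payload.encode("utf-8"))

    def fetch(self, oid: int) -> dict:
        return json.loads(self._storage.get(oid).decode("utf-8"))

    def select(self, class_name: str, predicate) -> list:
        """Set-oriented retrieval, implemented above the storage layer."""
        result = []
        for rid in self._storage.record_ids():
            obj = self.fetch(rid)
            if obj["class"] == class_name and predicate(obj["attrs"]):
                result.append(rid)
        return result


objects = ObjectManager(StorageManager())
objects.create("Part", weight=5)
heavy = objects.create("Part", weight=20)
objects.create("Supplier", city="Gainesville")
print(objects.select("Part", lambda attrs: attrs["weight"] > 10))
```

The storage manager here never inspects the contents of a record, so it could be replaced by a different storage sub-system without affecting the object-level operations.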

2.3 Query Processing in Object-oriented Databases

An important functionality needed to support advanced application domains

in any database system architecture is query processing. Since our approach is
based on the object-oriented paradigm, we review approaches to query processing

in object-oriented databases. Several query languages and processing techniques
[SHI81, ZAN83, MAI86a, STO86, ONT88, ROW87, BAN88, CAR88, KIM89a] have
been proposed for object-oriented database systems. We discuss existing approaches
to query processing by classifying the techniques used based on the formal data
structures proposed as a basis for computing queries.
Several systems, such as Vbase [ONT88], Iris [FIS87], Postgres [STO86, ROW87],

and Exodus [CAR88] have combined relational query processing structures with

object-oriented databases. Such systems use tabular (relation-like) structures for

internal representation, and process operations such as joins to compute the results.

Queries are typically specified using OSQL-like languages or variations of QUEL.









Some systems such as Vbase [ONT88] merely provide an SQL interface, while sys-
tems such as Iris [FIS87] provide an object version of SQL (OSQL). Systems using
QUEL-based languages, such as Postgres [STO86, ROW87] and Exodus [CAR88]
use an extended dot operator to perform various types of joins. Some of these sys-
tems including [DAD86] use relations or nested relations as a basis. Using data
structures such as nested relations can result in a semantic mismatch between the
object-oriented data model and the relational data model.

Other systems, such as ODE [AGR89], Galileo [ALB85], and OPAL [SER87],
have taken the approach of providing query capabilities through the use of Database

Programming Languages (DBPLs). Such DBPLs lack a declarative set-oriented
query capability and manipulate objects singly using programming language con-
structs. Accordingly, a query result containing objects and associations that span
across many classes cannot be represented except as a user-defined program struc-
ture.
Another technique used for query processing in object-oriented databases is to
express the results of a query as objects from a single class (the anchor or target
class). Objects from other classes connected to the anchor class may then be further
accessed via navigational operators (e.g., a dot operator) on the anchor class. For
this category, a single set of objects (or oids) is sufficient to serve as an internal
structure for computing queries. However, this is restrictive, since only a given class

of queries can be computed. In particular, queries that return objects from multiple
classes in their results are excluded. Examples of systems which use this technique

include Orion [BAN88], Daplex [SHI81], and GEM [ZAN83].
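
The anchor-class technique can be pictured with a short Python sketch. The classes `Employee` and `Department` and the sample data are hypothetical, and the sketch does not reproduce the query language of any of the systems cited above. The result of the query is a single set of objects drawn from the anchor class; objects of other classes are reached only by navigating from those objects with the dot operator.

```python
from dataclasses import dataclass


@dataclass
class Department:
    name: str


@dataclass
class Employee:
    name: str
    salary: int
    dept: Department


db_group = Department("Databases")
os_group = Department("Operating Systems")
employees = [
    Employee("Ada", 90, db_group),
    Employee("Bob", 40, os_group),
    Employee("Eve", 75, db_group),
]

# The query result is a single set of objects drawn from the anchor class.
anchor_result = [e for e in employees if e.salary > 50]

# Objects of other classes are reached only by navigating from the anchor.
for e in anchor_result:
    print(e.name, e.dept.name)
```

The result contains only `Employee` objects, which illustrates the restriction noted above: a query whose answer must itself contain objects from several classes does not fit this structure.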
Kim [KIM89a] proposes a technique in which a new class is created to hold the
results of a query. For example, a "join" between two objects from two different
classes produces a third object of a newly created class. In this case, the system









must address the issue of defining precisely the semantics of the newly created class

and object.

2.4 Methodologies for Software Development

In this section, we review design methodologies that apply to the development

of database system software since our approach emphasizes that the system soft-

ware itself must be developed in an extensible manner in order to realize system

extensibility.

Early approaches to software development were largely ad-hoc processes under-
going many iterations of informal design, coding, and testing. A more methodical

approach to software development was introduced through the concept of "structured

methods." Examples of structured methods or structured design include Structured

Design [CON79], Structured System Analysis [GAN79], and Jackson System De-
velopment [JAC83]. In the structured method or structured design approach, the

system is designed from a functional viewpoint. First, a high-level view of the sys-

tem is developed, and then, using functional decomposition, this view is decomposed

into a more detailed design in a step-wise manner.
More recently, the object-oriented paradigm having several desirable features
such as encapsulation, data abstraction, and information hiding, has been proposed

as a methodology for software design. In this approach, the system is viewed as a

collection of objects rather than a collection of functions. Here, messages which are

passed from one object to another take the place of functions. Each object has a
set of operations which are invoked through messages. Liskov and Guttag [LIS86]

present a comprehensive approach to data and function abstraction, and a guide to

program development using such abstractions.

Booch [BOO86] focuses on the design of software systems using object-oriented

development. Booch describes a system structuring technique based on an object-

oriented decomposition using data or functional abstractions and decompositions.








The technique illustrates factoring the system into objects using small case studies

of application-specific software. The paper does not specifically deal with applying

the approach to the system software of the general class of software systems, or even

to a specific class of software systems (for example, database management systems).

While most examples have applied object-oriented concepts to simple application

software, a more complex task involves determining how the paradigm can be applied

to large software systems. Clearly, it is feasible to apply these notions of abstraction,

and the object-oriented paradigm for developing the system software of large software

systems. In our effort, we plan to apply these notions to one class of software systems,

namely, KBMSs.

2.5 Meta-level and Reflexive Architectures

In this section, we review related work in the area of meta-level and reflexive

architectures for object-oriented systems. Currently, this area is receiving increasing

interest in the context of programming languages. Broadly speaking, a programming

language is considered reflexive if it can operate on itself. While some work has been

done in the area of reflexive architectures for procedure-based languages, logic-based

languages, and rule-based languages, we will focus on reflection in object-oriented

languages and systems.

In the object-oriented paradigm, reflexivity is usually addressed in the context

of meta-objects and meta-classes. Traditionally, the term meta-class has been used

to refer to a class whose instances themselves are classes [COI87, GOL83]. However,

the concept of meta-classes and meta-objects has evolved and today, these terms are

used to describe classes and objects which describe the specification and/or behavior

of languages or systems.
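
Both senses of the term can be illustrated with Python's metaclass facility. The sketch below is only an illustration in a present-day language and is not drawn from the systems cited in this section; the names `DescribedClass` and `Person` are hypothetical. The meta-class is a class whose instances are themselves classes, and it also records a description of each class it creates.

```python
class DescribedClass(type):
    """Hypothetical meta-class: every instance of it is itself a class."""

    catalog = {}  # class name -> names of the attributes it declares

    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        # Record the non-special names declared in the class body.
        declared = sorted(k for k in namespace if not k.startswith("__"))
        mcls.catalog[name] = declared
        return cls


class Person(metaclass=DescribedClass):
    name = None
    age = None


# Person is an instance of the meta-class, and ada is an instance of Person.
ada = Person()
print(isinstance(Person, DescribedClass), isinstance(ada, Person))

# The meta-level holds a description of the class that programs can inspect.
print(DescribedClass.catalog["Person"])
```

The catalog kept by the meta-class is a simple form of self-description: the running program holds a structure that describes its own classes.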

Maes [MAE87] proposes a fairly comprehensive set of concepts to describe re-

flection in object-oriented languages. Computational reflection is defined as "the

activity performed by a computational system when doing computation about (and








by that possibly affecting) its own computation." The paper also defines structural
reflection or self-representation as the property of a system which incorporates struc-

tures representing (aspects of) itself. The paper also identifies various issues that

a language interpreter must address in order to achieve the concept of reflexivity.

A uniform and reflexive definition for a system supporting ObjVlisp is described in
[COI87]. This system investigates the use of meta-classes, and studies the transi-
tions from uniform representations to reflexive representations, and the use of such

properties for extensibility. An ObjVlisp Model is developed using classes and ob-

jects at a meta-level and operational semantics are expressed in Lisp. In [KIC91], an

approach to programming using meta-objects is presented. Meta-objects are used
to provide the ability to extend the behavior and implementation of the language.
The approach is verified using meta-objects in the Common Lisp Object System

(CLOS).
In the approaches mentioned above, reflexivity is addressed in the context of

programming languages, with emphasis on the effect of computational reflection on

language interpreters. In contrast, reflexivity in the context of database systems has
not been a well-studied area. In this research, we explore the concept of reflexivity in
database systems with emphasis on using reflexivity as a basis for achieving extensi-
bility. Instead of focusing on computational reflexivity and language interpreters, we
initially deal with structural reflexivity or self-representation in database systems.














CHAPTER 3
A FRAMEWORK FOR EXTENSIBILITY

In this chapter, we outline the framework and software architecture that we

propose for developing advanced Knowledge Base Management Systems (KBMSs).

A clarification of some terminology is in order. We use the term "framework" to

mean the overall approach and methodology. The terms "software architecture" or

"architecture" will be used to refer to the abstract representation of the software

configuration of a complex system such as a Knowledge Base Management System.

The terms "Knowledge Base Management System" or "KBMS" will be used to refer

to the physical realization of this architecture.

First, we list key features of our approach. Then, we present an overview of the

proposed approach and discuss the importance of these features in the context of a

next-generation KBMS architecture. In the overview, we also explain the relevance

of the material to be presented in Chapters 4 through 8 to the proposed approach.

3.1 Key Features of our Approach

The key features of the approach we propose for developing next-generation

KBMSs are as follows:

1. An extensible, object-oriented base model called eXtensible Kernel Object

Model (XKOM).

2. An open, layered and modular framework for defining an extensible system

architecture.

3. Mechanisms for model extensibility.









4. Mechanisms for system extensibility.

5. A novel graph-based approach for query/rule processing in object-oriented
databases.

6. An integrated approach to knowledge management.

3.2 Overview of our Approach

In Chapter 1, we analyzed the task of developing a next-generation KBMS from

two aspects: the data model aspect and the system architecture aspect. Based on this

analysis, we propose an approach that addresses these two aspects in an integrated

manner.
To address the first aspect, we define an appropriate data model to serve as a
basis for the proposed extensible system. In this regard, we were greatly influenced

by many desirable features that object-oriented data models possess over traditional

data models such as relational, network, and hierarchical. These include features

such as data abstraction, encapsulation, polymorphism, inheritance, etc. Object-oriented
(OO) data models provide two forms of abstraction: structural abstractions
and behavioral abstractions [DIT86]. Thus, a class in an OO data model encapsulates

structural properties of an object such as its relationships with other objects

together with behavioral abstractions such as methods that operate on the object.
However, many high-level object-oriented data models capture advanced semantics

in complex structural constructs. Thus, different object-oriented data models may

provide similar semantics by using various structural constructs in different ways.

To unify the semantics corresponding to the diverse structural constructs in dif-

ferent models, we propose an eXtensible Kernel Object Model (XKOM) to serve

as the basis for our approach. XKOM is a generalized base model which contains

the basic structural abstractions that form a common denominator among the ex-

isting object-oriented and semantic data models. A key design philosophy is to








provide minimal semantics in basic structural constructs and extended semantics

via powerful behavioral abstractions. This allows the base model to be extended to

incorporate additional semantics provided by high-level data models for supporting

advanced application domains. To provide this capability, XKOM includes behav-

ioral abstractions in the form of methods and rules. Similar to methods, XKOM
incorporates rule specification as part of the class definition thereby providing an

integrated approach to knowledge management [RAS87, RAS88, SU85].

Using this extensible data model as a basis, we then propose an open, layered,

and modular framework for defining an extensible system architecture. By coupling

an extensible data model with an extensible system architecture, we aim to support

advanced application domains which require (a) advanced data model abstractions

such as specialized structural constructs, and facilities for defining behavioral abstractions

such as rules and methods, (b) specialized interfaces such as query languages,

data definition and manipulation languages, graphical user interfaces, and

database programming languages, and (c) storage and processing techniques such as

specialized file organizations and access methods. In particular, an open, layered,

and modular framework allows the incorporation of such features at different levels
(logical and physical) of abstraction in an integrated and unified manner.

The integration of the two concepts described above, the base model (XKOM),
and an open, layered and modular framework is carried out as follows. In the lay-

ered framework, a middle-out approach is used with XKOM serving as the basis
for a well-defined intermediate level of abstraction. Figure 3.1 illustrates this lay-

ered framework, with the different layers and the corresponding mappings between

such layers. Logical extensibility or extensions to upper levels of abstraction are

carried out by upwardly extending the intermediate layer to support appropriate

high-level, end-user, data modeling constructs to be used in advanced application















LOGICAL REPRESENTATION LAYER
    High-level, logical (end-user) data modeling constructs
    End-user interfaces such as query languages, etc.
    Other logical end-user abstractions (new association types),
    constructs, application-related features/constructs, etc.

    MAPPING
    Mappings from logical (end-user) data models, constructs,
    abstractions, and query languages to the underlying eXtensible
    Kernel Object Model (XKOM) layer

KERNEL OBJECT MANAGEMENT LAYER
    eXtensible Kernel Object Model (XKOM): a basic, canonical,
    object-based model serving as an intermediate level of
    abstraction
    Provides extensibility mechanisms
    Provides a base set of extensible object-level operations
    for kernel objects

    MAPPING
    Defines the implementation of XKOM
    Mapping of the Kernel Object Model to underlying storage
    structures and implementation of the base set of object-level
    operations

UNDERLYING STORAGE LAYER
    Physical storage of kernel objects



Figure 3.1: A Layered Framework for Extensibility









domains. Physical extensibility or extensions to lower levels of abstraction are car-

ried out by downwardly extending the intermediate layer to different and possibly
alternative physical data organizations, access paths, and storage sub-systems. We
note that XKOM also serves as an internal model of implementation for the system.

The use of levels of abstraction provides a systematic design and implementation
philosophy, enabling the system designer to visualize many different aspects of the
system architecture. The open, layered and modular framework that we propose

realizes an "open system architecture"--an architecture that enables the system to

be incrementally extended at any level of abstraction.
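
As a rough illustration of the layering described above, the following Python sketch places a kernel layer between a logical layer and two interchangeable storage layers. All class and method names (DictStore, LogStore, KernelObjectManager, RecordInterface) are hypothetical and are not taken from the system described in later chapters; the sketch only shows how a stable intermediate layer lets the layers above and below it vary independently.

```python
class DictStore:
    """Storage layer, variant 1: a hash table keyed by object identifier."""
    def __init__(self):
        self._data = {}

    def put(self, oid, state):
        self._data[oid] = dict(state)

    def get(self, oid):
        return dict(self._data[oid])


class LogStore:
    """Storage layer, variant 2: an append-only log; the latest entry wins."""
    def __init__(self):
        self._log = []

    def put(self, oid, state):
        self._log.append((oid, dict(state)))

    def get(self, oid):
        for logged_oid, state in reversed(self._log):
            if logged_oid == oid:
                return dict(state)
        raise KeyError(oid)


class KernelObjectManager:
    """Intermediate layer: object-level operations, independent of storage."""
    def __init__(self, store):
        self._store = store
        self._next_oid = 1

    def create(self, state):
        oid = self._next_oid
        self._next_oid += 1
        self._store.put(oid, state)
        return oid

    def update(self, oid, **changes):
        state = self._store.get(oid)
        state.update(changes)
        self._store.put(oid, state)

    def fetch(self, oid):
        return self._store.get(oid)


class RecordInterface:
    """Logical layer: an end-user 'record' construct mapped onto kernel objects."""
    def __init__(self, kernel):
        self._kernel = kernel

    def insert(self, record_type, **fields):
        return self._kernel.create({"type": record_type, **fields})

    def lookup(self, oid):
        return self._kernel.fetch(oid)


def demo(store):
    """Run the same logical-level operations over a given storage layer."""
    ui = RecordInterface(KernelObjectManager(store))
    oid = ui.insert("Course", title="Databases")
    return ui.lookup(oid)
```

Calling demo(DictStore()) and demo(LogStore()) yields the same record, which is the point of the middle-out design: the logical layer never sees which physical organization is in use.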

Based on the framework described above, a kernel KBMS is realized--a system

which serves as a core from which more sophisticated KBMSs can be built. Similar

to the notion of a DBC or DBI, we assume that extending and configuring the kernel

KBMS is the task of a KBMS Customizer (KBC).

In Figure 3.2, we illustrate the different steps involved in configuring and using a

kernel KBMS. If necessary, the KBC must configure or extend the kernel KBMS to

satisfy the requirements of an end-user data model to be used in a specific application

domain, and generate a final KBMS having the desired functionality. (A special

case is where the kernel KBMS is used in its basic form). The Knowledge Base

Administrator (KBA) then uses the final KBMS to develop an application knowledge
base, which can then be used by various end-users. Note that in the figure, the kernel

KBMS is portrayed as a black box, and a KBMS derived from the kernel KBMS

is portrayed as a larger black box. Later on, we will show that these black boxes

represent a well-defined specification and implementation of the model and system,

and that our approach provides a well-defined methodology to customize the model
and system. As a result, the tasks of the KBC, like the tasks of a KBA, are fairly

well-defined. In contrast, other approaches to extensible systems have not made an

attempt to clearly identify the tasks of a DBI or DBC.












[Figure 3.2 (diagram): the KBMS Customizer (KBC) customizes the source (kernel) KBMS to produce a target KBMS; the KBMS Administrator (KBA) then implements a given application on the target KBMS to produce an application knowledge base. Special case: target KBMS = source KBMS = kernel KBMS.]

Figure 3.2: The Proposed Extensible KBMS Scenario

We now outline the techniques we propose to achieve extensibility. The basic

technique we propose is to reflexively apply data modeling techniques to model the

data model itself, and to model the software architecture of the system. Figure 3.3
illustrates this technique. When a data model is used to model an application such as

a University database, a schema or class system is produced representing an abstract

specification of the "application world." Using the same technique, the model can

be used to model itself, resulting in a schema or class system which represents a

"model of the data model," and to model the software architecture of the KBMS

which results in a "model of the system." We note that the black boxes referred to

previously and shown in Figure 3.2, represent these model and system schemata.

Data model extensions are realized using Model Reflexivity. In Model Reflexivity,

a data model (XKOM) is used to reflexively model itself. This results in a meta-

model schema--a set of (meta) classes which are used to describe and implement the

semantics of the data model (XKOM). Initially this meta-model schema represents a

basic and generalized model. To illustrate this concept, Figure 3.4 shows a high-level

(not detailed) schema of the "model of the basic XKOM model." Model Extensibil-
ity then involves tailoring or customization of the meta-model schema to extend the

basic model (XKOM) into a high-level end-user data model. The extension to the

meta-model schema is carried out using a number of extension mechanisms, namely

user-defined data types, association types, methods (operations), and knowledge

rules. Rules provide a convenient and declarative mechanism for specifying con-

straints associated with customized data model semantics. The task of tailoring the

data model schema is the responsibility of the KBC. Data model extensions such as

user-defined class types and association types are investigated in this research.
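
The idea of a model that describes itself can be made concrete with a small sketch. The Python fragment below is a minimal illustration under stated assumptions, not the meta-model schema of XKOM: the names KernelClass, schema, define, and meta_object, and the particular attributes given to the meta-classes CLASS and ASSOCIATION, are all hypothetical. It shows one mechanism (define) being used both for an application class and for the classes that describe classes.

```python
from dataclasses import dataclass, field


@dataclass
class KernelClass:
    name: str
    class_type: str                                   # e.g. "E-class" or "D-class"
    aggregations: dict = field(default_factory=dict)  # association name -> domain class


schema = {}  # class name -> KernelClass


def define(name, class_type, aggregations=None):
    """Add a class to the schema; used for meta-classes and application classes alike."""
    schema[name] = KernelClass(name, class_type, dict(aggregations or {}))
    return schema[name]


# Meta-model schema: classes whose instances describe data model constructs.
define("String", "D-class")
define("CLASS", "E-class",
       {"name": "String", "class_type": "String", "aggregations": "ASSOCIATION"})
define("ASSOCIATION", "E-class", {"name": "String", "domain": "CLASS"})

# Application schema: defined with exactly the same mechanism.
define("Person", "E-class", {"name": "String"})


def meta_object(class_name):
    """Return the instance of CLASS that describes the named class.

    The attributes reported are those that CLASS itself declares, so the
    description of a class is driven by the meta-model schema.
    """
    described = schema[class_name]
    template = schema["CLASS"].aggregations
    return {attr: getattr(described, attr) for attr in template}
```

Here meta_object("Person") describes an application class, while meta_object("CLASS") describes the meta-class using its own declared attributes, which is the reflexive step.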
Extensions to the software architecture of the system are realized by using an

open and modular software architecture. The object-oriented paradigm is used to













[Figure 3.3 (diagram): modeling maps a "real world" to an abstract specification and implementation of that "real world". Three cases are shown: a university application is modeled as a University Schema (class system; model of a user application); the data model is modeled as a Meta-model Schema (class system; model of the data model); and the KBMS software architecture is modeled as a System Architecture Schema (class system; model of the system).]




Figure 3.3: Applying data modeling techniques for model and system specification









































Figure 3.4: Class System for "Model of the Model" (top level view with no details)









generically model functionally distinct system components (modules) in various lay-

ers resulting in a system meta-architecture schema--a set of (system) classes which

are used to describe and implement the software architecture of the system. To illus-
trate this concept, Figure 3.5 shows a high-level (not detailed) schema of the "model

of the system." Functional extensions are carried out through appropriate modifica-

tions to the sub-schema corresponding to specific functional modules by using data

types, association types, methods, and rules. In this dissertation, we will illustrate
this approach using three important system modules, namely, a Kernel Object Man-
ager, a Query/Rule Processor, and a Meta-model/Meta-information Management

Module. However, the same approach can be used in developing other modules such

as Storage Managers, Transaction Managers, etc.
Each chapter of this dissertation deals with some aspect of the extensible data

model and the open, modular and layered framework and architecture for next-gen-

eration KBMSs.

The first feature of our approach is defining an eXtensible Kernel Object Model

(XKOM) to serve as a basis for the approach. Chapter 4 describes the eXtensible

Kernel Object Model (XKOM). We identify a set of basic or generalized data model

structural abstractions which include the generic notions of object, instance, class,

identity, and association. XKOM also provides behavioral abstractions in the form

of rules and methods. The specification of high-level declarative rules is included

as part of class definition. This allows for an integrated approach to knowledge

management. Another important aspect of the approach which is related to the

data model, namely, Model Reflexivity, is also described in this chapter.

The class system that results from reflexively modeling the data model also

serves as a natural basis for managing meta-information. Chapter 5 describes the

use of this class system in the management of meta-information. In the approach

we propose for managing meta-information, an important feature is that data and








































Figure 3.5: Class System for "Model of the System" (top level view with no details)








meta-data (including model, system, and application meta-information) are managed
in a uniform and integrated manner based on the object-oriented paradigm. In

contrast, many existing DBMSs treat meta-data differently from application data.
For example, the notion of catalog relations in relational systems treats the defi-

nition of meta-data in a manner similar to application data, but specialized (and

non-extensible) access routines are embedded in the system for accessing meta-data

differently from application data. Similarly, in many object-oriented DBMSs, meta-

data is not treated like application data but rather embedded in internal structures

and access routines. In the approach we propose, the necessary access methods for

manipulating meta-information are realized by specifying appropriate methods in

the classes that comprise the meta-model schema. A bootstrap process is used to

instantiate the meta-model class system and generate the initial set of meta-objects

which define the semantics of the initial data model. Also, the same class system is
used to parameterize classes, associations and other data model concepts.

The material presented in Chapter 4 and Chapter 5 provides a broad perspective

on the data model (XKOM), and the usefulness of the concept of Model Reflexivity.

Issues relating to the realization of a kernel KBMS corresponding to XKOM are
addressed in Chapter 6, which deals with the management of kernel objects. The
kernel KBMS, which deals with the management of kernel objects, must address issues
relating to the storage and processing of kernel objects. A distributed model of

storage for kernel objects, which has better characteristics for inheritance, is proposed.

In order to support the notion of associations, strong support is provided for pro-

cessing object references. Access patterns in object-oriented databases are classified

as either value-based access or association-based access, and processing strategies

are defined accordingly. Based on these factors, a generalized set of kernel object

manipulation operators is defined as an interface for an extensible kernel object

management system. The software configuration and class system corresponding to









this kernel object management system are used to illustrate the concept of modeling

the software components of the system architecture. The implementation of the

kernel object management system is also described.
An important feature of a next-generation KBMS is the capability to specify and
process high-level declarative queries and rules. This aspect is the focus of Chapter

7. A generalized approach to query processing for object-oriented databases, based on

adjacency matrix data structures, is adopted. The approach is based on the concept

that query processing in object-oriented databases involves the manipulation and
processing of large sets of instance graphs or patterns. A representation scheme using

adjacency-matrix-based data structures is proposed and algorithms for manipulating

sets of graphs are also investigated. A classification of query types using graphs as a

basis is exploited to generate and execute query execution plans (QEPs). The same

technique of modeling software components of the system architecture is also used

in the case of the query processor and illustrates the feasibility of our approach for

an extensible system architecture. The implementation of a query processor based

on these concepts is described.
Chapter 8 provides a further study on Model Extensibility. Two important forms

of model extensibility, namely class extensibility and association extensibility, are

investigated. Class extensibility relates to the ability to define different types of

classes, that is, different types of objects. For example, a particular application
domain or data model may require the notion of design objects having certain specific
characteristics or semantics. Association extensibility relates to the ability to define

different types of associations or relationships among object classes. In this case

too, a particular application domain or data model may require the notion of a

specialized association or relationship such as Interaction [SU89]. To illustrate these

concepts, we use the Object-oriented Semantic Association Model (OSAM*) [SU89]

as an end-user target model at the logical level.















CHAPTER 4
AN EXTENSIBLE KERNEL OBJECT MODEL (XKOM)

In this chapter, we address the data model aspect of next-generation KBMSs by
proposing an eXtensible Kernel Object Model (XKOM). This chapter is organized

as follows. In Section 4.1, we introduce the concept of an extensible data model

approach, outline the benefits of such an approach, and specify some requirements

for defining the extensible data model (XKOM). The structural aspects of XKOM

are presented in Section 4.2, and in Section 4.3, we use the data model to illus-
trate the object-oriented view of an example database. The behavioral aspects of

the data model are discussed in Section 4.4. In Section 4.5, we introduce language

components which are necessary to define and use the various structural and behav-

ioral abstractions. Finally, Section 4.6 addresses issues related to extending the data

model.

4.1 The Extensible Data Model Approach

To overcome the limitations of existing data models, we propose an extensible

data model approach which preserves many of the advantages of existing high-level
object-oriented data models and yet overcomes many of their limitations. We now

describe the philosophy underlying our approach.
Our approach proposes that the data model be extensible, unlike other ap-
proaches which assume that the data model represents a fixed set of constructs

(that is, a rigid framework). To do so, we begin with a kernel data model having
a set of core data model constructs and adopt a building-blocks approach towards

extending the kernel data model to realize a customized model.









4.1.1 Benefits of the Approach

The benefits of using an extensible data model approach are enumerated below:

1. The concept of extensibility alleviates the semantic gap since the customized
data model can include appropriate data model constructs to more accurately

model the mini-world.

2. The concept of extensibility ensures that emerging application domains having

varying and dynamic requirements for data modeling can be accommodated.

3. Each application uses a specific data model consisting of the set of kernel

constructs plus any additional constructs it requires. Thus, the application

does not have to be burdened with the entire complex data model which may

include constructs not needed by the application. This feature implies a leaner
and more efficient system.

4. The implementation complexity of the kernel system is reduced since the basic

nature of the kernel concept ensures that a small and optimized implementation

of the core or kernel system can be realized.

4.1.2 Data Model Requirements

To define an extensible data model, we now outline three key requirements that

the data model must satisfy.

1. Basic Structural Abstractions: The data model must provide a set of basic

structural constructs. Ideally, these constructs must represent a set of core

concepts that are common to the range of existing object-oriented and semantic

data models. The resulting model can then unify these existing data models

by serving as a common core to all such models.









2. Powerful Behavioral Abstractions: The data model must provide powerful be-

havioral abstractions which include methods and rules. Methods serve as an

efficient and procedural approach to specifying behavioral semantics and can
be used by the system for developing and modifying system software as well as

by end-user applications. Rules are important because they serve as a powerful

declarative approach for specifying behavioral semantics. The declarative na-

ture of rules aids extensibility since rules can be used in a declarative manner

for specifying and tailoring system and application semantics.

3. Extensibility: The data model must be extensible. That is, it must provide the

necessary constructs to specify and extend the data model. Consequently, the

system underlying such a model can be made extensible, so as to realize the

data model extensions.

4.2 Basic Data Modeling Constructs: Structure

In this section, we address the first requirement that the data model must satisfy,

namely, that the model provides a set of basic structural abstractions. To support

the kernel or core concept, a deliberate attempt is made to identify constructs that
are neither semantically too specialized nor too primitive.

4.2.1 Object, Class, and Instance

Object

Since an object is a fundamental notion underlying all object data models,

XKOM supports the notion of objects.


Kernel Object: is a basic unit that models the abstract representation of any entity.

Unlike many data models and systems which tend to view objects mostly from

an application point of view, we take the view that everything is an object. Thus,









data model constructs (e.g., classes, associations, methods, rules), system software

(e.g., layers, modules), and application data are all uniformly treated as objects.
An important concept that is proposed is the notion of different types of objects.

In its kernel or core form, XKOM supports two fundamental types of objects: self-

named objects and system-named objects.

self-named object: is a kernel object whose value serves as the only mechanism for

identifying and referencing the object.

system-named object: is a kernel object to which a globally unique identifier is as-

signed.

In the case of self-named objects, the value of the object is used to identify,
reference, store and process the object. Such objects are used typically to model

values corresponding to basic data types such as the integer 5, the list [2, 3, 4] or the

string "John." Self-named objects usually have an underlying domain corresponding

to their values such as Integers, Reals, Strings, etc. While it is always possible to
model a self-named object as a system-named object, it may not be semantically

meaningful to do so and can lead to efficiency problems. Self-named objects are
usually embedded as part of system-named objects.

System-named objects are used to model entities in the mini-world that have
a need for independent existence. Examples of these mini-world entities include

physical objects (e.g., a Person), abstract concepts (e.g., a Company), relationships

(e.g., a Marriage), events (e.g., an Earthquake), software (e.g., a Module, a Method),
etc. The identifiers assigned to such objects are used for referencing such objects,

and in the storage and processing of such objects.
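
The distinction between the two kinds of objects can be sketched in a few lines of Python. This is only an illustration: the class names SelfNamed and SystemNamed, and the use of a simple counter as the source of identifiers, are assumptions made for the example and say nothing about how identifiers are actually generated in the system.

```python
import itertools
from dataclasses import dataclass, field

_oid_counter = itertools.count(1)  # hypothetical source of unique identifiers


@dataclass(frozen=True)
class SelfNamed:
    """An object identified purely by its value (e.g. 5, "John")."""
    value: object


@dataclass(eq=False)
class SystemNamed:
    """An object identified by a system-assigned identifier, not by its state."""
    state: dict
    oid: int = field(default_factory=lambda: next(_oid_counter))

    def __eq__(self, other):
        return isinstance(other, SystemNamed) and self.oid == other.oid

    def __hash__(self):
        return hash(self.oid)


# Two self-named objects with the same value are the same object.
same_value = SelfNamed(5) == SelfNamed(5)

# Two system-named objects with identical state remain distinct entities.
# Self-named objects are embedded as part of their state.
john_a = SystemNamed({"name": SelfNamed("John")})
john_b = SystemNamed({"name": SelfNamed("John")})
distinct_entities = john_a != john_b
```

Both same_value and distinct_entities evaluate to True: value equality decides identity for self-named objects, whereas two persons who happen to share a name are still two persons.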

Class

Since class is a fundamental notion underlying all object data models, XKOM
supports the notion of classes.









Kernel Class: is an abstraction that describes the structural and behavioral seman-

tics of a set of like objects.


Classification or the notion of a class provides a specification and representa-

tion for a group of objects having similar semantics. In XKOM, the class is itself

considered to be a system-named object.

An important concept that is proposed is the notion of different types of classes

corresponding to different types of objects (Class Extensibility). In its kernel or core

form, XKOM supports two basic types of classes: Entity (E-)classes and Domain

(D-)classes.


E-class: is a class representing objects which are independently accessible and have

system-assigned identifiers (that is, system-named objects).


D-class: is a class representing objects which use values as the mechanism for iden-

tification and for reference (that is, self-named objects).


Thus, E-classes have an associated set of materialized objects whereas D-classes

serve to declare domains of values for materializing E-class objects.

Instance

Kernel Instance: is the representation or materialization of a kernel object in a particular class.


A kernel object may participate (have a representation) in more than one class.

Consequently, the complete representation of a single kernel object is partitioned

across as many classes as it belongs to. Effectively, the complete representation

(CR) of a kernel object is the union (both structurally and behaviorally) of all its

instances.









Thus, $CR(O_k) = \bigcup_{i=1}^{p} I_{ki}$
where $O_k$ is a kernel object,
$I_{ki}$ is an instance of $O_k$ in class $i$, and
$p$ is the number of classes to which $O_k$ belongs.

This partitioned or distributed view of objects and object instances will dictate

the mechanisms and strategies used for supporting the notions of generalization
and inheritance. Such issues relating to the implementation of kernel objects and

instances are addressed in Chapter 6.
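
The partitioned view can be illustrated with a short Python sketch that assembles the complete representation of one object from its per-class instances. The function name complete_representation, the dictionary layout, and the sample classes and property values are hypothetical; the sketch also assumes that a property name is not given conflicting values in two classes, and it covers only structural properties.

```python
def complete_representation(instances):
    """Union the per-class instances of a single kernel object.

    `instances` maps a class name to the properties that the object's
    instance holds in that class.
    """
    cr = {}
    for class_name, properties in instances.items():
        for prop, value in properties.items():
            if prop in cr and cr[prop] != value:
                raise ValueError(f"conflicting values for {prop!r} in {class_name}")
            cr[prop] = value
    return cr


# One kernel object that has an instance in each of three classes.
john_instances = {
    "Person": {"name": "John"},
    "Student": {"gpa": 3.5},
    "Father": {"children": 2},
}
john = complete_representation(john_instances)
```

The result combines the properties held in Person, Student, and Father into a single view of the object, while each class stores only its own part.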

4.2.2 Identity

XKOM incorporates a strong notion of identity, based on tagged surrogates for

system-named objects. We propose the use of system-defined surrogates which can
enforce both the immutability and uniqueness requirements of identity. Since we are
taking a layered approach, the use of surrogates rather than (logical) disk addresses

serves to insulate the upper layers from having to know the address formats used by

lower layers (e.g., storage managers), thus providing data independence.

object identity (oid): a unique k-bit integer assigned to each system-named object

class identity (cid): a unique k-bit integer assigned to each class in the system, that
is, the object identifier of the system-named class object

instance identity (iid): a (k+k)-bit integer used to identify the instantiation of a

specific object in a particular class

Thus, instance identity (iid) = (cid)(oid), that is, the cid followed by the oid.


Finally, we note that identity itself is a self-named object since identity is repre-

sented as a particular value. Thus, different and extensible formats for identity can

be realized by using different domains of values for identity. These issues relating to
the implementation of identity are addressed in Chapter 6.
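
A small worked example may help fix the relationship among the three identifiers. The Python sketch below reads "(cid) (oid)" as bitwise concatenation, with the cid in the high-order k bits; that reading, the function names make_iid and split_iid, and the default width of 32 bits are assumptions made for illustration, since the actual formats are left to Chapter 6.

```python
K = 32  # assumed identifier width in bits; the text leaves k unspecified


def make_iid(cid, oid, k=K):
    """Build a (k+k)-bit instance identifier from a class id and an object id."""
    limit = 1 << k
    if not (0 <= cid < limit and 0 <= oid < limit):
        raise ValueError("cid and oid must each fit in k bits")
    return (cid << k) | oid


def split_iid(iid, k=K):
    """Recover (cid, oid) from an instance identifier."""
    return iid >> k, iid & ((1 << k) - 1)


# With k = 8, class 3 and object 7 give 3 * 256 + 7.
example_iid = make_iid(3, 7, k=8)
```

Because the oid occupies the same bits in every iid, all instances of one object across its classes share the low-order part, and all instances within one class share the high-order part.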









4.2.3 Association

In data modeling, an important concept is the notion of associations or rela-

tionships [CHE76, SU83, SU89] among classes and the corresponding associations

or relationships among object instances. Thus, XKOM supports the fundamental

notion of Association.

An important concept that we propose is the notion of different types of associa-

tions corresponding to the semantics of different types of associations or relationships
in the real world (this notion of Association Extensibility is described later in Section

4.6). The notion of different types of associations is analogous to and complements
the notion of different types of classes. In its core or kernel form, XKOM supports

two fundamental and orthogonal types of associations, namely Generalization (G)

and Aggregation (A) [SMI77].

Generalization

Generalization represents the super-sub class relationships between two classes.
In XKOM, each class can have a number of superclasses and every class must have

at least one superclass except the root class ("OBJECT") which has no superclass.

A class inherits all the structural abstractions of its superclasses as well as the
behavioral abstractions (rules and methods) defined for its superclasses. A class can
itself be a subclass of several classes. This gives rise to a superclass-subclass lattice

or network. The notion of multiple inheritance is supported for such a generalization

lattice or network.
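
To make the generalization lattice concrete, the following Python sketch collects the properties a class inherits through multiple superclasses. The class names other than OBJECT, the property names, and the breadth-first traversal order are hypothetical choices for the example; how conflicts between inherited properties are resolved is not addressed here.

```python
# Each class lists its direct superclasses; only the root has none.
SUPERCLASSES = {
    "OBJECT": [],
    "Person": ["OBJECT"],
    "Student": ["Person"],
    "Teacher": ["Person"],
    "TA": ["Student", "Teacher"],   # multiple inheritance
}

# Properties (associations, methods, rules) defined directly on each class.
OWN_PROPERTIES = {
    "Person": {"name"},
    "Student": {"gpa"},
    "Teacher": {"salary"},
    "TA": {"section"},
}


def all_superclasses(cls):
    """Return every direct and indirect superclass, nearest first, without repeats."""
    seen = []
    queue = list(SUPERCLASSES[cls])
    while queue:
        current = queue.pop(0)
        if current not in seen:
            seen.append(current)
            queue.extend(SUPERCLASSES[current])
    return seen


def visible_properties(cls):
    """Own properties plus everything inherited through the lattice."""
    props = set(OWN_PROPERTIES.get(cls, ()))
    for sup in all_superclasses(cls):
        props |= OWN_PROPERTIES.get(sup, set())
    return props
```

For the class TA, the traversal reaches Person through both Student and Teacher but visits it once, so the inherited property name appears a single time.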

Aggregation

Aggregation is an association that defines a characterization relationship between
a defining class and a constituent class. An object in the defining class is character-

ized or described by an object in the constituent class. In XKOM, an aggregation as-

sociation is identified by a named mapping from the defining class to the constituent









class. Multiple aggregation associations are generally used to describe/characterize

a class in terms of its associations with other classes. Consequently, an instance

of the defining class consists of the instances or references to the instances of some

constituent classes. Aggregation associations between an E-class and one or more

D-classes are used to describe the descriptive attributes of the E-class. An aggre-

gation association may have a cardinality constraint which specifies the mapping

relationship between objects of the defining class and the constituent class to be 1:1,

1:n, n:1, or n:m.

Issues and details relating to the implementation of associations are described in

Chapter 6.
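
As an illustration of a cardinality constraint on an aggregation association, the Python sketch below checks a set of instance-level links against a declared mapping. It assumes the notation is read as defining:constituent, so that "1:n" lets one defining object refer to many constituent objects while each constituent object is referred to by at most one defining object. That reading, the function name satisfies_cardinality, and the sample identifiers are assumptions for the example only.

```python
from collections import defaultdict


def satisfies_cardinality(links, cardinality):
    """Check (defining_oid, constituent_oid) links against "1:1", "1:n", "n:1" or "n:m"."""
    constituents_of = defaultdict(set)   # defining object -> its constituents
    definers_of = defaultdict(set)       # constituent object -> its defining objects
    for defining, constituent in links:
        constituents_of[defining].add(constituent)
        definers_of[constituent].add(defining)

    max_out = max((len(s) for s in constituents_of.values()), default=0)
    max_in = max((len(s) for s in definers_of.values()), default=0)

    defining_side, constituent_side = cardinality.split(":")
    # "1" on the defining side: a constituent has at most one defining object.
    ok_defining = defining_side != "1" or max_in <= 1
    # "1" on the constituent side: a defining object has at most one constituent.
    ok_constituent = constituent_side != "1" or max_out <= 1
    return ok_defining and ok_constituent


# One defining object linked to two constituent objects.
links = [("p1", "c1"), ("p1", "c2")]
```

With these links, the "1:n" and "n:m" constraints hold, while "n:1" and "1:1" are violated because p1 refers to two constituents.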

4.2.4 Class Definition

Based on the above description of the data model, we now specify the structure

of a class in XKOM. A class consists of a specification and an implementation. In

addition, an E-class may have a set of persistent instances associated with it.

In Figure 4.1, we show a template representing the specification of a class in

XKOM. Each class specifies a system-wide unique class name, a class type, a set of

associations, a set of methods, and a set of rules.

The class type is specified as "E-class", "D-class" or any other user-defined class

type. A variable in the form of a character string is used to specify the class type.

This allows the parameterization of class types since no changes need be made in

the DDL (Data Definition Language) when a new user-defined class type is added.

The associations section of the class specification serves to define the set of struc-

tural associations for the class. Each association type is specified as "generalization,"

"aggregation" or any other user-defined association type. Corresponding to each as-

sociation type, the class specifies the set of associations of that type. For each

association of that type, an association name and the domain corresponding to the

constituent classes for the association are specified. The complete set of structural















CLASS: class-name
    Class-Type = <class type> ;
    /* e.g. "E-class" */

ASSOCIATION SECTION:
    Association-Type = "generalization";
    { SUPERCLASSES : <class names> ;
      SUBCLASSES : <class names> ;
    }

    Association-Type = <association type> ;
    /* e.g. "aggregation" */
    { assoc_name : <domain> ;
      assoc_name : <domain> ;
      assoc_name : <type constructor> ;
      ...
    }

METHODS SECTION:
    {
      Specifications of Methods
    }

RULES SECTION:
    {
      Specifications of Rules
    }


Figure 4.1: Specification of a Class








associations of a class is the sum of associations of various association types. Simi-

lar to class type, a variable in the form of a character string is used to specify the

association type. In this case too, this parameterization of association types ensures

that no changes are necessary in the DDL when a new user-defined association is

added.
The method section of the class specification serves to define the set of methods

applicable to the objects of that class, and the rule section serves to define the set

of rules applicable to the objects of that class.

The implementation of a class refers to the creation and processing of objects cor-

responding to the class, and the implementation of the methods and rules applicable

to the objects of that class.
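
As a minimal sketch (written in Python, which is not the notation used in this work), the class template of Figure 4.1 can be pictured as a record in which the class type and the association types are plain strings. Under that assumption, a user-defined type is just another string value and the definition language itself does not change. The names ClassSpec and Association, and the "interaction" association type, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Association:
    name: str    # association (attribute) name
    domain: str  # name of the constituent class


@dataclass
class ClassSpec:
    name: str        # system-wide unique class name
    class_type: str  # "E-class", "D-class", or any user-defined type
    associations: dict = field(default_factory=dict)  # association type -> list of Association
    methods: list = field(default_factory=list)
    rules: list = field(default_factory=list)

    def add_association(self, assoc_type: str, assoc: Association) -> None:
        # The association type is only a string key, so a user-defined
        # type is accepted without changing this definition.
        self.associations.setdefault(assoc_type, []).append(assoc)

    def all_associations(self) -> list:
        # Complete set of structural associations: the union over all types.
        return [a for group in self.associations.values() for a in group]


course = ClassSpec("Course", "E-class")
course.add_association("aggregation", Association("title", "title"))
course.add_association("aggregation", Association("dept", "dept"))
# Hypothetical user-defined association type:
course.add_association("interaction", Association("prerequisite", "Course"))
```

Keeping the type as data rather than as a keyword is what the parameterization described above amounts to in this sketch.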

4.3 An Example Database

We illustrate the object-oriented view of an example database as modeled by

XKOM. This object-oriented view of a database can be viewed at two levels. First,

the database schema presents a class level view of the database. Second, the

extensional view presents an object or instance level view of the
database.

4.3.1 Database Schema

A database schema is represented as a network of associated (inter-related)

classes. We use a graphical/visual representation called Semantic diagrams (or

S-diagrams) [SU89] to depict a database schema. In S-diagrams, classes are rep-

resented as nodes and associations are represented as links between these nodes.

The S-diagram corresponding to an example database is shown in Figure 4.2.

E-classes are shown as rectangular nodes with the name of the class within the

node, and D-classes are shown as circular nodes with the name alongside or below

the node. A semantic association is represented as an outgoing edge from a defining
















[S-diagram; surviving attribute labels: section#, enrolment, class-period, semester]

Figure 4.2: An example database schema corresponding to a university application









class to the constituent classes. From each class, all outgoing links or edges of a

given association type are grouped together and labeled using a letter that denotes

the association type. The name of the association (or attribute) is shown beside the

link, and if no name is specified, the name of the underlying domain is used as the

default association/attribute name.

In Figure 4.2, E-classes such as Person, Father or Course contain objects

that are abstract representations of corresponding objects (that is, persons, fathers,

and courses) in the real world. The classes Father and Mother are subclasses of

class Person, and represent persons in the real world who are fathers and mothers

respectively. The generalization or superclass-subclass association models the fact

that a person can be a father or mother, and that a father or mother object inherits

the structure and behavior of person objects from the class Person. The S-diagram

also illustrates D-classes such as degree, title, etc. which are used to define the

structure of E-classes. Descriptive data attributes of E-classes represent aggregation

associations defined over D-classes and are named by attributes such as title, dept

and course# defined for the E-class Course. An aggregation association defined
over another E-class is illustrated in the class Father which defines a multi-valued

aggregation association called father-of whose data type is SET-OF Person and
having the E-class Person as the domain or constituent class. This aggregation is a

multi-valued association since a father can have a set of children.

An n-ary relationship can be represented by defining an E-class having n binary

(aggregation) associations over the n classes participating in the relationship. The
properties and constraints of the n-ary relationship are associated with objects in

the defining E-class and objects in the constituent E-classes. The example schema

illustrates how a ternary relationship may be defined in the class Section. The

class Section represents this ternary relationship by using three binary (aggregation)

associations. Constraints relating to the ternary relationship can then be specified







on such a set of binary associations. For example, a particular course must be taught

by a teacher at a location (class-room). Such a relationship is representative of a

typical class scheduling process in a university environment.
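
The representation of a ternary relationship by three binary associations can be sketched as follows. This is an illustrative Python fragment under stated assumptions: the association names course, teacher, and location are hypothetical, as is the instance s40, and the constraint shown (every association must reference at least one object) is only one plausible reading of the scheduling requirement.

```python
# Section models the ternary relationship among Course, Teacher, and Location
# as three binary aggregation associations (association name -> constituent class).
section_associations = {
    "course": "Course",
    "teacher": "Teacher",
    "location": "Location",
}


def is_scheduled(section: dict) -> bool:
    """Constraint over the set of binary associations: every participating
    association must reference at least one object."""
    return all(section.get(name) for name in section_associations)


s33 = {"course": {"c38"}, "teacher": {"t32", "t58"}, "location": {"l34"}}
s40 = {"course": {"c38"}, "teacher": set(), "location": {"l34"}}  # hypothetical instance
```

Here s33 satisfies the constraint, while s40 does not because no teacher is referenced.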

4.3.2 Database Extension

A database extension is represented as a network of associated object instances.

We use a graphical/visual representation called an Object Diagram to depict the

database extension. A dot in an object diagram denotes an object instance identified

by its unique IID, and a solid edge between instances indicates an object reference.

An oval containing a set of dots is used to illustrate a set (or subset) of instances from

a given class. In Figure 4.3, we illustrate an example object diagram corresponding
to part of the database schema of Figure 4.2. Note that in an object diagram, we do

not distinguish between the types of links or object references between objects since

the S-diagram captures this aspect through relationships or associations at the class

level. However, for the sake of clarity, we use a dotted edge between instances of the

class Person and instances of the class Father in Figure 4.3 to distinguish between the

two different associations among the Person and Father classes. The dotted line is
used to indicate the father-of association. Similarly, a dotted line between instances
of class Person and instances of class Mother indicates the mother-of association.

For simplicity, we will use lower case alphabetic letters taken from the name of

the class to indicate class identity, and integers to indicate object identity. Thus, an

instance is represented by a dot having an alpha-numeric instance identifier.
We see that instance s33 in class Section is associated with instances t32 and t58

in class Teacher, instance l34 in class Location, and instance c38 in class Course.

This represents the fact that a section (s33) corresponding to a course (c38) is taught

by teachers (t32, t58). To illustrate a G-association, we observe that instance p32

in class Person represents a person who is a father and a teacher. Thus, there are

corresponding instances (f32, t32) in class Father, and in class Teacher respectively.












[Object diagram; class labels: PERSON, TEACHER, SECTION, FATHER, MOTHER, LOCATION, COURSE]

Figure 4.3: An object diagram showing part of the database extension








That is, an object having the same oid can instantiate several classes. Incidentally,

instance f32 in class Father is the father of p58 as indicated in the association between

f32 and p58. Then, since t58 is the instance of class Teacher corresponding to the

person p58, we can infer from the object diagram that the two teachers that are

teaching section s33 are a father and a child.
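
The inference just described can be reproduced with a small Python sketch. It assumes, as in the naming convention above, that an instance identifier is a one-letter class prefix followed by the object identity, so instances of different classes with the same number denote the same object. The dictionaries teaches and father_of and the function names are hypothetical.

```python
def iid(class_letter: str, oid: int) -> str:
    # An instance identifier pairs class identity with object identity.
    return f"{class_letter}{oid}"


def oid_of(instance_id: str) -> int:
    # Strip the one-letter class prefix to recover the object identity.
    return int(instance_id[1:])


# Object references taken from the discussion of Figure 4.3.
teaches = {"s33": {"t32", "t58"}}  # Section -> Teacher
father_of = {"f32": {"p58"}}       # Father -> Person


def father_child_pairs(section: str) -> set:
    """Teacher pairs (father, child) among the teachers of one section."""
    teacher_oids = {oid_of(t) for t in teaches[section]}
    pairs = set()
    for father, children in father_of.items():
        for child in children:
            if oid_of(father) in teacher_oids and oid_of(child) in teacher_oids:
                pairs.add((iid("t", oid_of(father)), iid("t", oid_of(child))))
    return pairs
```

For section s33 the function yields the single pair (t32, t58), matching the conclusion drawn from the object diagram.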

Conceptually, an entire database represents one giant object diagram containing

many complex object association patterns. Thus, specifying association patterns of

interest is an important concept for the query and rule languages, as well as for

the language used to define methods. This aspect will be evident later, when the

language components of the data model are described.

4.4 Basic Data Modeling Constructs: Behavior

In this section, we address the second requirement that the data model must

satisfy, namely, that the model provides a set of powerful behavioral abstractions.

XKOM provides two forms of behavioral abstractions as outlined below.

1. Methods: The data model provides methods as a procedural form of behav-

ioral abstraction. Methods are specified as part of a class definition and are

applicable to the set of objects which represent the extension of the class.

2. Rules: The data model provides rules as a declarative form of behavioral ab-

straction. Like methods, rules are specified as part of a class definition and

are applicable to the set of objects which represent the extension of the class.

However, rules provide a powerful form of behavioral abstraction since their

declarative nature allows for convenient and flexible specification of behavioral

semantics.

4.5 Language Components

To specify the structural and behavioral abstractions described in Sections 4.3

and 4.4, a language or a set of languages is needed. We discuss the various language









components that are needed to specify and process the behavioral as well as struc-

tural semantics of a class. We will use rules and methods to illustrate behavioral

semantics in the context of persistent data (object instances) based on the example
schema of Figure 4.2.

In this discussion, we shall focus on the notion of behavioral and structural
abstractions using one or more language components to specify and process such ab-

stractions. For the sake of completeness, we will also consider language components,

such as query languages, which are used for object retrieval and manipulation. The

specifications of such languages are outside the scope of this dissertation, and any

language that supports these concepts can be used. For XKOM, we will adapt lan-

guage components from previous and on-going research [ALA89a, ALA89b, DS088,

PUR88, SHY91, SU91] of the Database Systems Research and Development Center,

University of Florida.

The language components that are needed are enumerated below:


1. KDL: A Knowledge Definition Language (analogous to a DDL, or Data Definition

Language) component which is used to specify class definitions, define

class hierarchies, and knowledge schemata in the data model. Integral to this

notion of a KDL component is the notion of a Knowledge or Data Dictionary

used to maintain meta-information. As part of knowledge definition, a rule

specification language component is also needed.

2. RSL: A Rule Specification Language component that is used to specify the

knowledge rules associated with instances and classes. The rule language com-

ponent must be declarative, and expressive enough for specifying a variety of

semantic constraints.

3. KML: A Knowledge Manipulation Language component that is used to manip-

ulate the persistent objects in the knowledge base. The KML must include a









Query Language (QL) component. This language component must be declar-

ative, provide set-oriented access, and provide an appropriate execution model

and concepts for query processing and object manipulation.

4. KBPL: A Knowledge Base Programming Language (KBPL) component that
is used for coding the methods of a given class. A KBPL must be compu-
tationally complete (that is, possess the full capabilities of a general-purpose

programming language), facilitate integrated access to persistent objects, and

provide an appropriate model of execution (preferably, object-oriented). It is

also possible for the first three language components described above to be

incorporated into a single KBPL, thus providing an integrated language to

specify and process classes, rules, methods, and persistent objects.


An important issue is that these languages or language components be designed

to be compatible with and complementary to one another, and to support the struc-

tural and behavioral abstractions of the data model. We note that it is possible to

design these four language components as individual languages or as a single integrated

language, as long as the same functionality is achieved. In O-O programming

languages such as Smalltalk [GOL83] or C++ [STR86], the language represents an
integration of data definition and programming language components, with class def-

inition being carried out separately from method implementation. Such languages

lack a data manipulation or query language component since the languages do not
support persistent objects, and a rule language component is not provided since the
languages do not support rules. In Database Programming Languages (DBPLs) such

as Opal [SER87] and Galileo [ALB85], the programming language component is

integrated with a data manipulation component to process persistent objects. Such

languages do not support rules and lack the ability to specify declarative queries due

to the absence of a query language component.







The four language components described above represent the specification and

processing of structural semantics, specification and processing of behavioral seman-

tics, and also, object querying and manipulation. Very few existing systems possess

the complete set of these language components. In related research at the Database

Systems Research and Development Center, an Object-oriented Semantic Associa-
tion Model (OSAM*) [SU89] has been developed as a powerful end-user data model

for supporting complex application domains. Various language components have

been designed and developed, including a KDL component, a KML component and

a RSL component. In this research, we will adapt and use the KDL, KML, and

RSL components from OSAM*. For the interested reader, details of the query and

rule language, and the rules system can be found in [ALA89a, ALA89b, CHU90,
SIN90, SU91]. An association algebra which serves as the underlying formal

basis for the pattern-based languages is described in [GUO90, GUO91]. An integrated

knowledge-base programming language design is presented in [SHY91]. In the

interim, the programming language component will be addressed in the context of the

C++ programming language [STR86] to specify and write methods. The behavioral
aspects of XKOM are now illustrated using these language components.

4.5.1 Queries and Methods

The KML language component of XKOM is illustrated by addressing methods

in the context of queries since queries form an important mechanism for retrieving

the data that a method acts upon. A simple example will be used to illustrate this

concept.

The query model and language developed in [ALA89a] are based upon two key

concepts. First, queries are specified based upon association patterns of interest to

the user. Second, the language satisfies the property of closure. That is, queries

issued against a database produce a result that is structured and modeled using the









same data model as the original database. The result of a query can then be further

operated upon by other queries.

In the Object-oriented Query Language (OQL), a query is considered to be

a function which when applied to a database (or a sub-database) returns a
subdatabase, which is composed of some selected object classes (with each object class
containing a sub-set of its instances), and some selected associations between these

classes. A subdatabase forms a "context" under which further processing such as

the invocation of user/system defined methods (e.g., update, print) can be carried

out. The execution of the query can be considered as having two phases. First, the

establishment of a subdatabase context. Second, the processing of system or user

defined methods, e.g., retrieving the descriptive data for the query.

Specifying queries

A query is specified in OQL using the following query structure:

CONTEXT (Association Pattern Expression)
WHERE (inter-class conditions)
SELECT (object classes and/or attributes)
(class: operation-name parameter-list)


In the CONTEXT clause, the user specifies a subdatabase of interest using an
Association Pattern Expression (APE). The CONTEXT clause causes a set of data-

base extensional patterns which satisfy the CONTEXT clause to be extracted from

the database. Associated with the CONTEXT clause are the optional WHERE and

SELECT sub-clauses. The WHERE sub-clause specifies some inter-class conditions

based on attribute values which further causes some extensional patterns that do

not satisfy the where condition to be dropped from the subdatabase. The SELECT

clause provides the capability for the user to further select some classes from the

subdatabase and project out descriptive attributes of objects from such classes. The

operation clause allows the user to specify a set of messages to invoke a sequence of








methods (defined for classes comprising the subdatabase), which are to be executed

on objects in the subdatabase.

The key component of a query block is the context clause which specifies a
subdatabase. The establishment of a subdatabase context can be viewed as a "fil-

tration" process in which objects of interest and the association patterns of interest
are "filtered out" from the original database.

The APE specifies a subdatabase by the use of several operators such as the

association operator (*), and the non-association operator (!). When the association

operator (*) is applied to two directly associated E-classes A and B in the database

(i.e., using A*B as the APE), it returns a subdatabase whose intensional pattern

consists of the classes A and B, and their association. The resulting subdatabase

contains a set of extensional patterns containing objects from class A and objects

from class B, such that every object selected from class A is directly associated with

(i.e., has object reference to) at least one object from class B and vice versa. When

the non-association operator (!) is applied to two directly associated E-classes A

and B in the database (i.e., using A!B as the APE), it returns a subdatabase whose

intensional pattern consists of the classes A and B, and their association. However,

the resulting subdatabase contains a set of extensional patterns containing objects

from class A and objects from class B, such that every object selected from class A

is not associated with (i.e., has no object reference to) any object from class B and

vice versa.
An APE specifies branches in an association pattern by the use of AND, OR

operators, in conjunction with the * and ! operators. A fork class is the class at which

the AND or OR condition is used to indicate a branching of associations. The AND

operator, when used in conjunction with the * operator, implies that in the resulting

subdatabase, each instance of the fork class must be associated with at least one

instance from every branch class. An OR operator used in conjunction with the *









operator implies that in the resulting subdatabase, each instance of the fork class
must be associated with at least one instance from at least one of the branch classes.

An equivalent interpretation applies to the use of the AND, OR operators with the
! operator. Further details of the query language are available in [ALA89a], and
details of the association algebra are available in [GUO90].
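
The operator semantics described above can be approximated with plain set operations. The following Python sketch is illustrative only: it represents the object references between two classes as a set of (a, b) pairs, which is an assumption and not the association algebra of [GUO90]. The function names and all instance identifiers other than those of Figure 4.3 are hypothetical.

```python
def associate(a_objs, b_objs, links):
    """A * B: extensional patterns (a, b) for directly associated objects."""
    return {(a, b) for (a, b) in links if a in a_objs and b in b_objs}


def non_associate(a_objs, b_objs, links):
    """A ! B: objects of A with no reference to any object of B, and vice versa."""
    linked_a = {a for (a, b) in links if b in b_objs}
    linked_b = {b for (a, b) in links if a in a_objs}
    return (a_objs - linked_a, b_objs - linked_b)


def fork(fork_objs, branches, mode):
    """fork *AND (...) or fork *OR (...): keep fork instances associated with
    every branch (AND) or with at least one branch (OR).
    `branches` is a list of link sets, one per branch class."""
    combine = all if mode == "AND" else any
    return {f for f in fork_objs
            if combine(any(a == f for (a, _) in links) for links in branches)}


sections = {"s33", "s34", "s35", "s36"}
teachers = {"t32", "t58", "t60"}
teacher_links = {("s33", "t32"), ("s33", "t58"), ("s34", "t32")}
location_links = {("s33", "l34"), ("s35", "l34")}
```

With this toy data, only s33 survives the AND branch over teachers and locations, whereas the OR branch also keeps s34 and s35.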

Example query

To illustrate how methods can be specified and executed against a subdatabase,

we use an example query which invokes a method.

Query1: Print a schedule for all CIS graduate classes which have a current offering
and have been assigned teacher(s) and a location.

The corresponding OQL query is:

CONTEXT Course[dept='CIS' AND course# > 5000] * Section
*AND (Teacher, Location)
PrintSchedule(Section, ...);


The Intensional Association Pattern (IAP) corresponding to the subdatabase

generated by this query is shown in Figure 4.4(a). The IAP shows the E-classes

(Section, Teacher, Location and Course) and the associations of interest. Corre-
sponding to a given IAP, there exists a set of Extensional Association Patterns

(EAPs) at the instance level. Each EAP represents a pattern (network) of instances

and associations belonging to the classes and associations specified by the IAP. The

set of EAPs corresponding to a subdatabase are shown using an extensional dia-

gram in Figure 4.4(b). The method printSchedule defined for the class Section is

now invoked on the relevant context that has been "filtered" out of the database.

The query thus provides a high level declarative means of specifying the printing of

a schedule to meet the user requirements.
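
The two phases of this query (establishing the subdatabase context, then invoking a method on it) can be mimicked in a few lines of Python. This is a sketch under stated assumptions, not the OQL processor: the data values (course numbers, room identifiers, and all instances other than s33, c38, t32, t58, and l34) are invented for illustration, and print_schedule is a hypothetical stand-in for the method defined on class Section.

```python
courses = {
    "c38": {"dept": "CIS", "course#": 6120},
    "c12": {"dept": "CIS", "course#": 3020},
    "c77": {"dept": "EE", "course#": 6500},
}
# Object references held by each Section instance.
sections = {
    "s33": {"course": "c38", "teachers": ["t32", "t58"], "location": "l34"},
    "s41": {"course": "c38", "teachers": [], "location": "l34"},
    "s52": {"course": "c12", "teachers": ["t32"], "location": "l20"},
    "s60": {"course": "c77", "teachers": ["t58"], "location": "l20"},
}


def establish_context():
    """Phase 1: filter out the sections that match the association pattern."""
    context = {}
    for sid, s in sections.items():
        c = courses[s["course"]]
        if (c["dept"] == "CIS" and c["course#"] > 5000
                and s["teachers"] and s["location"]):
            context[sid] = s
    return context


def print_schedule(context):
    """Phase 2: stand-in for the method invoked on the subdatabase."""
    lines = [
        f"{sid}: course {s['course']}, teachers {', '.join(s['teachers'])}, room {s['location']}"
        for sid, s in sorted(context.items())
    ]
    print("\n".join(lines))
    return lines
```

Only s33 remains in the context: s41 has no teacher, s52 is not a graduate course, and s60 belongs to another department.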












[Diagram; class labels in both panels: SECTION, TEACHER, COURSE, LOCATION]

Figure 4.4: Intensional Association Pattern and Extensional Diagram for
an example subdatabase
a) Intensional Association Pattern (IAP); b) Extensional Diagram









4.5.2 Rules

The rule language [ALA89a, SU91] is also pattern-based, and is well-integrated

with the query language. It combines a set-oriented approach (which databases
require), with a declarative syntax. The structure of a rule is as follows:

RULE rule-id
Trigger-cond(Trigger-time, Trigger-operation)
Rule body
Corrective-action
END


The rule-id is a unique identifier assigned to each rule. A trigger condition

associated with each rule specifies the conditions under which a rule is fired. A

trigger condition may specify a combination of trigger time (such as before, after or

in parallel) and the operation (such as insert, delete) that triggers the rule.

The rule body is normally specified as an IF-THEN-ELSE clause. The exact

form of the rule body is determined by the type of the rule. Rules are classified as

state, operational, and deductive rules [CHU90, SIN90]. We illustrate each of these

rule types with the following examples.

State rules

A state rule is a knowledge rule designed to ensure semantic correctness and

consistency of a knowledge base by declaring the valid states that must exist in the

database at the time the rule is triggered. The IF clause of the rule body contains

an OQL Context Expression that looks for certain association patterns that exist

in a particular subdatabase context. The THEN part may contain an OQL context

expression which the subdatabase must satisfy, thus enforcing certain existential

constraints among patterns. For an unconditional state rule, the IF part is omitted

and only the predicate corresponding to the THEN part is specified. An example

state rule is now described:









RULE fm101
Trigger-cond(After InsertObject(Father), After InsertObject(Mother))
IF context Person * Father
THEN Person ! Mother
Corrective-action Message ("A father and mother cannot
be the same person")
END

The above state rule, which is invoked after an object is inserted in the class

Father or class Mother is used to specify the constraint that the same person cannot

be both a father and a mother. Thus, the IF part of the rule body of rule fm101 identifies

association patterns corresponding to persons who are fathers, and the THEN part

specifies the constraint that these persons should not be mothers.
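
The effect of such a state rule can be sketched in Python as a check that runs after each insert. The sketch assumes that the extensions of Father and Mother are held as sets of object identities (so that the same person has the same identity in both classes); the function names and the sample identities are hypothetical, and the real rule is evaluated by the rule system rather than by hand-written code.

```python
def check_fm101(fathers: set, mothers: set) -> list:
    """IF part: persons who are fathers. THEN part: those persons must not
    be mothers. Returns one corrective message per violating object."""
    violations = sorted(fathers & mothers)
    return [f"object {oid}: A father and mother cannot be the same person"
            for oid in violations]


def insert_object(target: set, oid: int, fathers: set, mothers: set) -> list:
    # The rule is checked after the insert, mirroring Trigger-cond(After ...).
    target.add(oid)
    return check_fm101(fathers, mothers)


fathers, mothers = {32}, {41}
```

Inserting a new identity into Mother produces no message, while inserting identity 32 (already a father) yields the corrective message.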

Operational rules

An operational rule is a rule that performs an operation if a particular state in

the knowledge base exists when the rule is triggered. Although an operational rule

can serve to correct undesirable states that exist in the knowledge base by specifying

corrective actions which enable the knowledge base to return to a consistent state,

in general, any operation can be triggered. The IF clause of the rule body contains

an OQL Context Expression that specifies a particular state in terms of certain
association patterns that must exist. The THEN clause contains operations that

must be triggered when the IF part is satisfied. An example operational rule is now
described:

RULE s-121
Trigger-cond(After DeleteObject(Teacher),
After DeleteObject(Course), After DeleteObject(Location))
IF context Section !OR (Teacher, Course, Location)
THEN DeleteObject(Section)
END


The above operational rule, which is invoked after an object is deleted from the

class Teacher, class Course or class Location, is used to enforce the constraint that









a Section object can exist only if it is associated with Teacher, Course and Location
objects. In the rule body, the IF clause checks if there is a Section object
which lacks an associated Teacher, Course or Location object. The THEN
clause causes those objects from the class identified in the IF clause to be deleted.
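
A rough Python analogue of this operational rule is given below. It is a sketch under stated assumptions: each Section is a dictionary of object references, the !OR pattern is read as "at least one of the three associations is missing", and the instance s41 and course c40 are hypothetical.

```python
def apply_s121(sections: dict, teachers: set, courses: set, locations: set) -> list:
    """After a delete, remove every Section that has lost its Teacher,
    Course, or Location (the !OR pattern), and return the deleted ids."""
    def orphaned(s):
        return (not (set(s["teachers"]) & teachers)
                or s["course"] not in courses
                or s["location"] not in locations)

    doomed = sorted(sid for sid, s in sections.items() if orphaned(s))
    for sid in doomed:
        del sections[sid]  # THEN DeleteObject(Section)
    return doomed


sections = {
    "s33": {"teachers": ["t32", "t58"], "course": "c38", "location": "l34"},
    "s41": {"teachers": ["t32"], "course": "c40", "location": "l34"},  # hypothetical
}
teachers, courses, locations = {"t32", "t58"}, {"c38", "c40"}, {"l34"}
courses.discard("c40")  # DeleteObject(Course) triggers the rule
deleted = apply_s121(sections, teachers, courses, locations)
```

Deleting course c40 leaves s41 without a course, so the rule removes s41 and keeps s33.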

Deductive rules

A deductive rule is a knowledge rule that derives new information that is not

explicitly stored in the knowledge base. In a deductive rule the IF clause of the

rule body contains an OQL Context Expression that identifies certain association
patterns that exist in the knowledge base. The THEN clause then derives a new

subdatabase, new associations or a combination of both, based on the patterns
identified in the IF clause.

To illustrate deductive rules, we use a commonly used example in logic and

logic-based languages such as Prolog. We use the generation query as an example. Such
rules can infer information such as grandfathers from existing clauses (facts) such as
father, mother. In logic, such rules are expressed as:

Grandfather(?X, ?Y) :- Father(?X, ?Z), Father(?Z, ?Y)
Grandfather(?X, ?Y) :- Father(?X, ?Z), Mother(?Z, ?Y)

The equivalent deductive rule is:

RULE f141
IF (Father *[father-of] Person OR (Father-1 *[father-of],
Mother *[mother-of]) Person-1)
THEN Grandfather (Father, Person-1)
END

The rule specifies the deduction as follows. Starting from the class Father, the

corresponding children are determined by finding all objects in class Person which
are associated with the father objects through the father-of association. The next

level of association determines which of these children are fathers or mothers by

checking the G association. Finally, the children corresponding to these fathers or









mothers are determined through the father-of or mother-of association to the class

Person. Thus, a new association has been derived between the class Father and the

class Person, where the new association represents the Grandfather relationship.
In Figure 4.5, the extensional diagram showing the derivation of associations

based on the IF part of the rule is shown (the dotted line indicates the derived

associations or links). The corresponding intensional and extensional patterns for

the derived subdatabase are shown in Figure 4.6. Unlike logic based languages, the

emphasis here is on set-oriented deductive rules.
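
The set-oriented character of this deduction can be illustrated with a short Python sketch. It assumes that the father-of and mother-of associations are stored as mappings from an object identity to the set of identities of the children, and that a person who is also a father or mother keeps the same identity across classes (the G association). The function name and all identities are hypothetical.

```python
def derive_grandfather(father_of: dict, mother_of: dict) -> set:
    """Derive (grandfather, grandchild) pairs in one set-oriented pass."""
    derived = set()
    for grandfather, children in father_of.items():
        for child in children:
            # Children who are themselves fathers or mothers.
            grandchildren = father_of.get(child, set()) | mother_of.get(child, set())
            derived |= {(grandfather, gc) for gc in grandchildren}
    return derived


father_of = {32: {58, 60}, 58: {71}}  # hypothetical object identities
mother_of = {60: {72, 73}}
```

The whole set of derived pairs is produced at once, here (32, 71), (32, 72), and (32, 73), rather than one binding at a time as in a typical logic-language evaluation.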


Figure 4.5: An extensional diagram showing the derivation of associations


4.6 Model Extensibility


In this section, we present techniques which allow the data model (XKOM) to be

extended to cater to various application domains. First, we use the data model to





















[Diagram; class labels in both panels: FATHER, PERSON]

Figure 4.6: The Intensional Association Pattern and Extensional
Diagram for a subdatabase derived by a rule
a) Intensional Association Pattern; b) Extensional Diagram









model itself resulting in a set of model meta-classes. Next, we show that extending

the data model involves extending the schema corresponding to the set of model
meta-classes.

4.6.1 Model Reflexivity

We define Model Reflexivity as the capability of a given data model to model

itself. The use of a data model to explicitly model itself results in a model meta-class

system: a set of meta-classes which provide a specification and implementation of the

data model constructs. We use the term meta-class to refer to a class which is used

to define data model semantics. The distinction between classes and meta-classes is

transparent since classes and meta-classes are treated exactly the same. We will use

the terms model meta-class system, model meta-schema or model meta-architecture

interchangeably to refer to this model of the model.

We now describe the meta-class system that results when XKOM is used to

model itself. Figure 4.7 illustrates a simplified version of the model meta-schema in

which many details have been omitted so as to illustrate the basic concepts of model

reflexivity. Model constructs such as classes, associations, methods, and rules are

treated as first class objects.
The root of the class hierarchy is the class OBJECT which represents the concept

that everything in the model (and underlying system) is modeled (treated) as an

object. Since every object in the system is either a system-named object or self-

named object, classes E-CLASS OBJECT and D-CLASS OBJECT are subclassed

from class OBJECT to represent this concept. Thus, objects of class E-CLASS

OBJECT represent all objects that are system-named objects and objects of class

D-CLASS OBJECT represent all objects that are self-named objects.
The class CLASS represents classes as objects. Such class objects may include

application classes, system classes or meta-classes. Since class objects are system-
named objects, the class CLASS is subclassed from class E-CLASS OBJECT. The












[Meta-schema diagram; surviving labels: OBJECT, D-CLASS OBJECT, application schema, COURSE, PERSON, STRING, FLOAT, INTEGER, CHAR, CID, OID, RAW, SHORT RAW, LONG RAW; link labels G and A]

Figure 4.7: Modeling the data model (Model Reflexivity)









meta-class CLASS is an important meta-class since its specification defines the struc-

tural properties of a class in the data model, and its implementation defined in terms

of methods and rules realizes the behavioral properties of a class in the data model.
Thus, modifying the specification and implementation of this class changes the se-

mantics of a class in the data model.

In XKOM, the definition of a class includes a class name, a set of associations,

a set of methods, and a set of rules. Corresponding to this concept of a class in

XKOM, the class CLASS is defined to have the following attributes: classID, class-

Name, setOfAssociations, setOfRules, and setOfMethods. The set of associations as-

sociated with a class includes GENERALIZATION associations, AGGREGATION

associations or any other type of association that may be defined. In its core or

kernel form, XKOM supports two types of classes: D-classes and E-classes. Thus, a

class object may represent an E-class or D-class. Corresponding to this concept, the

class CLASS is sub-classed into class E-CLASS and class D-CLASS. Class E-CLASS

represents those class objects that are E-classes, and class D-CLASS represents those

class objects that are D-classes.

Similar to class CLASS, class ASSOCIATION represents associations or relation-

ships between classes as objects. Such association objects may include relationships

such as generalization, aggregation or any other types of associations. Since asso-
ciation objects are system-named objects, the class ASSOCIATION is subclassed

from class E-CLASS OBJECT. The meta-class ASSOCIATION is an important

meta-class since its specification defines the structural and behavioral properties

of a relationship or association in the data model, and its implementation realizes

these properties. Thus, modifying the specification and implementation of this class

changes the semantics of an association in the data model.

In XKOM, the definition of an association includes an association name, a defin-

ing class (the class that defines the relationship), and one or more constituent classes









(the classes which constitute the domain of the relationship). In binary relationships,
an association has a single constituent class whereas in n-ary relationships, an asso-

ciation has multiple constituent classes. Corresponding to this concept of an associa-

tion in XKOM, the class ASSOCIATION is defined to have the following attributes:

associationID, associationName, definingClass, and setOfConstituentClasses. In its

core or kernel form, XKOM supports two types of associations: aggregation and gen-

eralization. Thus, an association object may represent an aggregation association

or a generalization association. Corresponding to this concept, the class ASSOCI-

ATION is sub-classed into class AGGREGATION and class GENERALIZATION.

Class AGGREGATION represents those association objects that are aggregation

associations, and class GENERALIZATION represents those association objects that

are generalization associations.

In XKOM, objects corresponding to primitive data types (D-classes) are rep-

resented as self-named objects. Thus, primitive data types are represented as sub-

classes of the class D-CLASS OBJECT. Accordingly, class INTEGER, class FLOAT,

class CHAR, and class STRING are sub-classed from class D-CLASS OBJECT, and

represent D-class objects corresponding to primitive data types such as integers, re-

als, characters, and strings. In a similar manner, class IID, class CID, and class OID,

which are sub-classed from class D-CLASS OBJECT represent self-named objects

corresponding to instance identity, class identity, and object identity. An additional

primitive data type called raw, and two forms of the raw data type (short-raw, long-

raw) are represented by class RAW, class SHORT-RAW, and class LONG-RAW,

which are sub-classed from class D-CLASS OBJECT.

In a manner similar to classes and associations, objects in the class METHOD

represent method objects and objects in the class RULE represent rule objects. In

XKOM, since methods and rules are treated as system-named objects, class RULE

and class METHOD are sub-classed from class E-CLASS OBJECT.








The correspondence between the classes that comprise an application schema and

the model meta-schema is as follows. Top-level E-classes such as Person, Section,

Location, and Course, in the application schema shown in Figure 4.2, are immediate
sub-classes of the class E-CLASS OBJECT. Other E-classes such as Teacher or

Mother which are sub-classes of class Person are indirectly sub-classed from the
class E-CLASS OBJECT through the class Person. D-classes in the application

either represent primitive data types or are sub-classed from the class D-CLASS

OBJECT. In Figure 4.7, this correspondence is illustrated for a few classes taken

from the example schema of Figure 4.2.

The model meta-schema serves as a powerful tool for model extensibility. Model

Extensibility encompasses data model extensions such as adding multiple inheritance,
adding abstract data types, adding user-defined class types, and adding user-defined
association types. Model Extensibility is achieved by modifying the

model meta-schema. The model meta-schema serves not only as a specification of the

semantics of end-user data model constructs but also as a basis for the implementa-

tion of such constructs. Methods (functions) and/or rules defined in the meta-classes

that comprise the model meta-schema are used to implement data model constructs.

By modifying the specification and implementation of the class CLASS, the seman-

tics of a class in the data model can be changed. We illustrate some scenarios with

respect to Figure 4.7. Consider the case where the class CLASS does not have the

attributes setOfRules and setOfMethods. This implies a model that is structurally

object-oriented but not behaviorally object-oriented [DIT86]. Another example is

the case where a data model (e.g., the relational data model) does not allow entities

to be defined in terms of other entities. That is, no abstract data types are allowed,
and an entity must be defined in terms of primitive or pre-defined data types. In

this case, the setOfAssociations attribute of class CLASS is constrained to have ob-

jects of class ASSOCIATION which represent association objects having D-classes








as constituent classes. Similarly, appropriate constraints on the setOfAssociations
attribute can determine the nature of inheritance supported by the data model (no

inheritance, single or multiple inheritance). For example, if a data model does not
allow generalization hierarchies or inheritance, then the setOfAssociations attribute

defined for class CLASS will not include an object from the class GENERALIZA-
TION. If the data model allows multiple inheritance, then the setOfAssociations

attribute will include an object from the class GENERALIZATION which models

the fact that each class is associated with more than one superclass.
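The constraint on setOfAssociations described above can be read as a simple check. The following C++ sketch is one hypothetical way to express it; the type names, the `InheritancePolicy` enumeration, and the assumption that each generalization record lists the superclasses of the class that owns it are ours and are not part of the model meta-schema as published.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

enum class AssocKind { Aggregation, Generalization };
enum class InheritancePolicy { None, Single, Multiple };

// Simplified stand-in for an association object held in setOfAssociations.
struct AssocRecord {
    AssocKind kind;
    // Assumption: for a generalization, these are the superclasses.
    std::vector<std::string> constituentClasses;
};

// Count the superclasses reachable through the generalization
// associations of a single class.
std::size_t superclassCount(const std::vector<AssocRecord>& setOfAssociations) {
    std::size_t n = 0;
    for (const AssocRecord& a : setOfAssociations) {
        if (a.kind == AssocKind::Generalization) n += a.constituentClasses.size();
    }
    return n;
}

// Does this class's setOfAssociations respect the model's inheritance policy?
bool satisfiesPolicy(const std::vector<AssocRecord>& setOfAssociations,
                     InheritancePolicy policy) {
    const std::size_t n = superclassCount(setOfAssociations);
    switch (policy) {
        case InheritancePolicy::None:     return n == 0;  // no generalization at all
        case InheritancePolicy::Single:   return n <= 1;  // at most one superclass
        case InheritancePolicy::Multiple: return true;    // any number of superclasses
    }
    return false;
}
```

In a reflexive implementation such a check would more naturally be attached as a rule to the meta-class CLASS; a free function is used here only to keep the sketch short.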
Consequently, one well-defined task of the KBC is to tailor the data model to a
given application by modifying the model meta-schema. Two aspects

of such model extensibility are of main interest to us, namely, class extensibility and

association extensibility, which are described below. We note that the model meta-

schema, in addition to providing a powerful basis for extensibility, also serves as
a powerful basis for parameterization and management of meta-information. This

aspect dealing with parameterization, meta-information, and the implementation of

the meta-class system will be presented in Chapter 5.

4.6.2 Class Extensibility

Class Extensibility is a form of extensibility whereby a model can be extended by
adding new class types or extending existing class types of a data model. Thus, it is

possible to define new class types in addition to the two basic class types offered by

XKOM: E-classes and D-classes. It is also possible to extend the data model by sub-
classing existing class types. We use an example to illustrate the usage of different

class types in different applications and discuss the usefulness of Class Extensibility

for such applications.









Design objects

In applications such as Engineering Design, an important concept is the notion

of design objects [BAT85, LOR83]. A design object corresponds to a distinct type of

object having certain well-defined characteristics, and classes used to specify design
objects can be considered as design object classes. The class type corresponding

to design object classes can be considered to be a special type of E-class. Such

specialized characteristics of a design object must be used to specify the semantics of

design objects. This can result in specialized methods such as methods for clustering

or methods for accessing and staging design objects in memory. Clearly, such methods

and rules that govern design objects must be incorporated into the model meta-

schema.

Achieving class extensibility

To achieve the concept of class extensibility, we start by illustrating how the

model of the model is used to support the existing class types. Consider the notion of

E-class and objects belonging to an E-class. An E-class explicitly stores an extension,

and it may be necessary to select a particular object or objects from this extension.

Consequently, an operation such as select can be defined and implemented for the
class E-CLASS, and can be invoked on any E-class (that is, the object that represents

an E-class). If an application defines an E-class, say Person, then class Person is

made a subclass of E-CLASS OBJECT. The class Person inherits methods and rules

that apply to all E-class objects, and these methods and rules will apply to all objects

in the class Person. At a meta-level, methods defined in E-CLASS will apply to the

class Person (the object representing the class Person), since the object representing

the class Person is an instance of the class E-CLASS. Thus, to select a sub-set of the

extension of class Person, a message called select is sent to the instance of E-CLASS

that represents the Person class.
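To make the meta-level reading concrete, the following C++ sketch models an E-class as an object that owns its extension and answers a select message. It is a minimal illustration under our own assumptions: the `Instance` representation (a map from attribute names to string values), the predicate-based signature of `select`, and the member names are hypothetical and do not reproduce the actual E-CLASS interface.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Assumed representation of an object: attribute name -> value.
using Instance = std::map<std::string, std::string>;

// One EClass value stands for one E-class (for example, Person) at the meta-level.
class EClass {
public:
    explicit EClass(std::string name) : className_(std::move(name)) {}

    // Add an object to the explicitly stored extension.
    void insert(Instance obj) { extension_.push_back(std::move(obj)); }

    // "select" is defined once here and can be invoked on any E-class.
    std::vector<Instance> select(
        const std::function<bool(const Instance&)>& predicate) const {
        std::vector<Instance> result;
        for (const Instance& obj : extension_) {
            if (predicate(obj)) result.push_back(obj);
        }
        return result;
    }

    const std::string& className() const { return className_; }

private:
    std::string className_;
    std::vector<Instance> extension_;
};
```

Selecting a sub-set of the extension of Person then amounts to sending select to the `EClass` value that represents Person, with a predicate over its attributes.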









Class Extensibility is carried out as follows. If a new class type called X is re-

quired, then X-CLASS OBJECT is defined as a sub-class of Object, and X-CLASS

is defined as a sub-class of CLASS. The semantics that apply to all X-CLASS ob-
jects are defined in the class X-CLASS OBJECT, and the semantics that apply to

all X-classes are defined in the class X-CLASS. When the application declares an
application class (Ci) to be of type X, objects of class Ci inherit methods and rules

from the class X-CLASS OBJECT. The class itself (Ci) is subjected to methods and
rules in the class X-CLASS.
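The two-sided nature of this procedure (one sub-class on the object side, one on the meta side) can be sketched in C++ as follows. The sketch takes X to be a design-object class type; the names `DesignClassObject`, `DesignClass`, `stage`, `clusteringHint`, and the application class `Gear` are hypothetical and serve only to illustrate the pattern.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Object side: semantics shared by all objects.
struct Object {
    virtual ~Object() = default;
    virtual std::string describe() const { return "object"; }
};

// Meta side: semantics shared by all classes.
struct Class {
    explicit Class(std::string n) : name(std::move(n)) {}
    virtual ~Class() = default;
    virtual std::string classType() const { return "CLASS"; }
    std::string name;
};

// X-CLASS OBJECT: semantics that apply to every object of the new class type.
struct DesignClassObject : Object {
    std::string describe() const override { return "design object"; }
    void stage() { stagedInMemory = true; }  // hypothetical staging method
    bool stagedInMemory = false;
};

// X-CLASS: semantics that apply to every class of the new class type.
struct DesignClass : Class {
    explicit DesignClass(std::string n) : Class(std::move(n)) {}
    std::string classType() const override { return "DESIGN-CLASS"; }
    // Hypothetical class-level method, e.g. a clustering policy.
    std::string clusteringHint() const { return "cluster-by-assembly"; }
};

// An application class Ci declared to be of type X: its objects inherit
// from the object-side class, while the class itself is described by a
// DesignClass value.
struct Gear : DesignClassObject {};
```

Objects of `Gear` pick up `stage()` from the object side, whereas a `DesignClass("Gear")` value carries the class-level behavior.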

Existing class types (e.g., E-class, D-class) may also be extended by sub-classing

the meta-classes corresponding to such existing class types.

4.6.3 Association Extensibility

Association Extensibility is a form of extensibility whereby a model can be ex-

tended by adding new association types or extending existing association types of a

data model. Thus, it is possible to define new association types in addition to the

two basic association types offered by XKOM: aggregation and generalization. It

is also possible to extend the data model by sub-classing existing association types.
We use two examples to illustrate the usage of different association types in different

applications and discuss the usefulness of Association Extensibility for such applica-

tions. The following two association types can be defined as new association types
or as sub-types of the aggregation association.

Interaction association

A commonly used concept in many applications is the Interaction association

[SU89]. The interaction association models the concept that an object of a given class

represents a relationship or interaction between objects of two or more constituent
classes. Unlike aggregation, the interaction association implies that the
object which specifies the interaction association is defined as a result of an n-ary








relationship with other objects. The object defining the interaction cannot exist

unless all the objects (from the constituent classes) participating in the relationship

exist.

Uses or message-passing association

An important association that we propose for achieving a higher form of abstrac-

tion in software engineering is the notion of the Uses or Message-passing association.

In this dissertation, we will use this association type extensively in modeling system

software.

The Uses or Message-passing association is a form of functional aggregation (in

contrast to structural aggregation), in which messages can flow from the defining

class to the constituent classes. The association models the relationship among

classes in which one class (the defining class) invokes methods in another class by

sending messages to objects of the constituent classes. An object involved in a

Uses relationship may play a role as an actor object which sends messages to other

objects, as a server object which receives messages from other objects, or as both

actor and server.
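The three roles can be derived mechanically from the direction of message flow. The following C++ sketch classifies an object from a list of observed messages; the `Message` record, the `Role` enumeration, and the object names used with it are assumptions made for illustration and are not part of the model.

```cpp
#include <cassert>
#include <string>
#include <vector>

enum class Role { None, Actor, Server, ActorAndServer };

// One message flowing along a Uses association (illustrative record).
struct Message {
    std::string sender;    // object of the defining class
    std::string receiver;  // object of a constituent class
    std::string method;    // method invoked on the receiver
};

// Classify an object by whether it sends messages, receives them, or both.
Role roleOf(const std::string& object, const std::vector<Message>& trace) {
    bool sends = false;
    bool receives = false;
    for (const Message& m : trace) {
        if (m.sender == object) sends = true;
        if (m.receiver == object) receives = true;
    }
    if (sends && receives) return Role::ActorAndServer;
    if (sends) return Role::Actor;
    if (receives) return Role::Server;
    return Role::None;
}
```

For a trace in which a query processor calls a dictionary, which in turn calls a storage manager, the dictionary would be classified as both actor and server.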

Achieving association extensibility

To achieve the concept of association extensibility, we start by illustrating how

the model of the model is used to support the existing association types. In Figure

4.7, class AGGREGATION and class GENERALIZATION which are sub-classed

from class ASSOCIATION model the aggregation and generalization associations

respectively. If a new association type is to be added, a new class corresponding to

this association type is defined as a sub-class of ASSOCIATION, similar to the exist-

ing association types. If an existing association type is to be refined or constrained,

a new class corresponding to the new association type is defined as a sub-class of the

class corresponding to the association type that is being refined. Methods and rules








are then defined for this new class to enforce the semantics of the new association

type.
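As a hedged illustration of this procedure, the C++ sketch below refines the aggregation association into an interaction association and attaches the existence rule described earlier as an overridden method. The `mayExist` hook, the `Oid` alias, and the representation of live objects as a set are assumptions of this sketch, not features of the actual meta-class system.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <utility>
#include <vector>

using Oid = std::uint32_t;  // hypothetical object identifier

struct Association {
    explicit Association(std::string n) : associationName(std::move(n)) {}
    virtual ~Association() = default;

    // Rule hook: may an object defined through this association exist,
    // given the participating objects and the set of live objects?
    // The base association imposes no constraint.
    virtual bool mayExist(const std::vector<Oid>&, const std::set<Oid>&) const {
        return true;
    }
    std::string associationName;
};

struct Aggregation : Association {
    explicit Aggregation(std::string n) : Association(std::move(n)) {}
};

// New association type obtained by sub-classing an existing one.
struct Interaction : Aggregation {
    explicit Interaction(std::string n) : Aggregation(std::move(n)) {}

    // The interaction object cannot exist unless all participating
    // objects exist; at least two participants are required here.
    bool mayExist(const std::vector<Oid>& participants,
                  const std::set<Oid>& live) const override {
        if (participants.size() < 2) return false;
        for (Oid o : participants) {
            if (live.count(o) == 0) return false;
        }
        return true;
    }
};
```

A Uses association could be added in the same way, either directly under `Association` or as a refinement of `Aggregation`.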














CHAPTER 5
MANAGEMENT OF META-INFORMATION

The data dictionary module which is responsible for the management of meta-

information is considered to be the "heart" of a KBMS. The capabilities of the

system depend largely on the functionality that this module affords. As such, the

design and implementation techniques underlying the data dictionary module are

critical to the success of the KBMS. In this chapter, we describe the approach we

propose for managing meta-information.
This chapter is organized as follows. In Section 5.1, we discuss the important

role that model reflexivity plays in the management of meta-information. Concepts

for the management of meta-information pertaining to the model and system it-

self are presented in Section 5.2, while Section 5.3 describes the management of

meta-information pertaining to application data. In Section 5.4, we focus on some

important dictionary access functions. Finally, in Section 5.5, we briefly outline

the notion of a data dictionary module: the implementation of this module will be

discussed later in Chapter 6.

5.1 Using Reflexivity for Managing Meta-information

The concept of Model Reflexivity introduced in Chapter 4 has two important

uses. First, it can be used to specify data model extensions as described in Section

4.5. Second, it can be used for managing meta-information. In this section, we focus

on this latter use of model reflexivity.
A key characteristic of the traditional meta-information management technique

is that meta-information is accessed through specialized access methods and rou-

tines rather than through regular data manipulation operators. Although it may be


possible to use data manipulation operators to access meta-data, it is usually not

practical or efficient to do so. Instead, specialized access routines are necessary in

order to satisfy the specialized access needs of clients which use the database dic-
tionary module and/or to satisfy performance requirements since meta-information

is accessed very frequently. Consequently, the design of any meta-data management

module must consider both the structural aspect of the meta-data itself, and the

operational requirements of the meta-data such as specialized access routines.

A shortcoming in many existing DBMSs is that meta-data is treated differently

from data itself. If meta-data is treated differently from data, there is an implicit

assumption that the DBMS is recommended for managing (application) data but

is not good enough for representing and processing its own meta-data. Consider

for example the notion of catalog relations in relational DBMSs. From a structural

viewpoint, meta-data is treated like data since catalog relations are defined in a

manner similar to data by using the same modeling constructs, that is, relations,

columns, and tuples. While this meta-data is modeled as relations, it is not ma-

nipulated by regular relational data manipulation operators. Instead, specialized

access routines are used to access meta-data differently from data. Such routines or

methods are not part of the relational data model since the relational model does

not allow functional or operational specification as part of the model. Thus, from a

functional or behavioral viewpoint, meta-data is treated differently from data since

user-defined routines on the (relational) data cannot be defined in a manner similar

to the access-routines on the (relational) meta-data. As a result, access routines are

embedded within the system software and cannot be easily modified or extended.

Instead, external application programs must be written on top.

Now, let us consider object-oriented (O-O) DBMSs. From a structural viewpoint,
we consider an O-O DBMS as treating meta-data like data only if meta-data

is structurally represented as first-class objects in the system. From a behavioral







viewpoint, we consider an O-O DBMS as treating meta-data like data only if the
specialized access routines are defined as methods for processing (first-class) objects

that represent meta-data in a manner similar to user-defined methods which are de-

fined for processing objects that represent data. The use of such a reflexive concept

in which the structural and behavioral abstractions of the data model are themselves

used to model and manage meta-data is the basic premise of our approach. In this

regard, some existing object-oriented DBMSs do not treat meta-data like data from
a structural viewpoint, a behavioral viewpoint, or both. Often, meta-data

is embedded in internal data structures and/or accessed using hard-coded internal

routines.

In the approach we propose, data and meta-data (including model, system, and

application meta-information) are treated exactly alike: data and meta-data are

modeled and processed in a uniform and integrated manner based on the
object-oriented paradigm. The class system that results when the data model is used to

reflexively model itself is used as a natural and powerful basis for managing the

first class objects that represent meta-data. Specialized access methods for manipu-

lating meta-information are then realized using appropriate behavioral abstractions

(methods or rules) in the classes representing meta-data.

5.2 Model and System Meta-information

We now describe the usage of the meta-class system of the model, shown earlier

in Figure 4.7, in the management of meta-information.

5.2.1 The Bootstrap Process

When using the model to model itself, a bootstrap process is required to initially

populate the model schema with information of the model and system. We define the

bootstrap process as that process which generates the meta-class system and creates

the initial set of objects that define and represent the data model and system.









We illustrate this concept by using an important bootstrap class, the meta-class
CLASS, as an example. To bootstrap the meta-class CLASS, the meta-class CLASS

must be first defined, and the first object that must be instantiated in this class
is the object that represents the meta-class CLASS itself. This object is the first

class in the system and is also the first object of the first class. For the sake of

clarity, we will refer to this first object as the CLASS-object. The CLASS-object is the
only object in the system whose object-id is identical to the class-id of the class

it belongs to. The instance-id is thus a concatenation of two 32-bit integers, both
of which have the same value. In the model of the model, the attribute called
setOfInstances, defined for E-CLASS, represents the fact that all E-classes have an
extension. The meta-class CLASS is also an E-class, since instances of class CLASS

represent system-named objects. Hence, CLASS-object must also be instantiated

into the meta-class E-CLASS. The extension of the meta-class CLASS is represented

by the setOfInstances attribute of the instance in E-CLASS corresponding to the

CLASS-object. Thus, if we iterate the extension of the meta-class CLASS, we would

get all classes in the system.
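The identity property of the CLASS-object can be sketched as follows. This C++ fragment assumes, purely for illustration, that the class-id occupies the high 32 bits of the instance-id and the object-id the low 32 bits; the text only states that the two 32-bit values are concatenated, so the ordering and all function names here are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

using Cid = std::uint32_t;  // class identity
using Oid = std::uint32_t;  // object identity
using Iid = std::uint64_t;  // instance identity: two 32-bit values concatenated

// Assumed layout: class-id in the high half, object-id in the low half.
constexpr Iid makeIid(Cid cid, Oid oid) {
    return (static_cast<Iid>(cid) << 32) | static_cast<Iid>(oid);
}

constexpr Cid cidOf(Iid iid) { return static_cast<Cid>(iid >> 32); }
constexpr Oid oidOf(Iid iid) { return static_cast<Oid>(iid & 0xFFFFFFFFu); }

// The CLASS-object is the only object whose object-id equals the
// class-id of the class it belongs to, so both halves coincide.
constexpr bool isClassObject(Iid iid) { return cidOf(iid) == oidOf(iid); }
```

Under this layout, if the meta-class CLASS were given the identifier 1, the instance-id of the CLASS-object would be `makeIid(1, 1)`.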
The bootstrap process proceeds by setting up the class hierarchy that defines

the model, setting up the first object of class CLASS, and then, instantiating meta-
classes such as CLASS, ASSOCIATION, etc., with all necessary objects that describe

the model itself (that is, classes and associations that describe the model). In Figure

5.1, we show some objects and meta-classes that represent model meta-information

after the bootstrap process is completed. Once the bootstrap process is completed,

the model and system objects have been created and the model is ready for use by

an application.

Within the context of reflexivity, we also use the term "bootstrap" in the sense
that another model or system must be used to generate the first version of the target
model. In our case, the C++ programming language [STR86] is used to perform
bootstrapping.

Figure 5.1: Meta-classes as object instances
(links show the G-association between object instances)

5.2.2 Parameterization

An important effect of using the set of meta-classes to represent meta-information
is the ability to achieve parameterization. Parameterization refers to specification
and manipulation of modeling constructs such as classes and associations as parame-

ters or variables. Thus, we support the notion of class as a parameter and association
as a parameter. A parameter or variable which represents a class or association ob-
ject can be assigned, passed as an argument or have a message passed to it. This

property of parameterization is used to access and manipulate meta-information,
and is illustrated using some examples.

In the first example, we illustrate how all classes in the system and all associations
in the system can be accessed as objects.

IID x; /* x represents an IID object */
IID y; /* y represents an IID object */
IID-ARRAY r; /* r represents an array of IIDs */
IID-ARRAY s; /* s represents an array of IIDs */
ECLASS* p; /* p represents a pointer to an E-class object */
ECLASS* q; /* q represents a pointer to an E-class object */

/* Assign to x the E-class instance representing class CLASS. That is, x is now
the iid of an instance of E-CLASS representing the class CLASS object */
x = select("ECLASS", className= "CLASS");

/* Assign to y the E-class instance representing class ASSOCIATION. That is, y is
now the iid of an instance of E-CLASS representing the class ASSOCIATION object */
y = select("ECLASS", associationName="ASSOCIATION");

/* p is the virtual memory pointer to the E-class instance of class CLASS object */
p = #x;

/* Get virtual pointer to the E-class instance of class ASSOCIATION object */
q = #y;


/* r now represents the set of all classes in the system */
r = p -> getSetOfInstances();

/* s now represents the set of all associations in the system */
s = q -> getSetOfInstances();

In this example, the first two steps are to access the instances of class ECLASS

which correspond to class CLASS and class ASSOCIATION (Note that instances of
class ECLASS represent all E-classes in the data model). This is done by using a
select operator to select instances of ECLASS which satisfy the selection condition

(name = "string"). In the next two steps, a lookup is performed using the instance
identifiers of the selected instances. A lookup operator (#) is used to transform the

instance identifier to the corresponding virtual memory pointer (this lookup operator
will be discussed later in Chapter 6). Now, the variables p and q point to instances of
ECLASS which represent the E-classes CLASS and ASSOCIATION respectively. In

the last two steps, we examine the setOfInstances attribute of these instances. This
gives us all the classes and all the associations in the model. After the bootstrap
process terminates, the only classes and associations are those belonging to the
meta-class system created by the bootstrap process.
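The three-step pattern of this example (select, lookup, message send) can be mimicked in plain C++ with an in-memory table. The sketch below is a stand-in under our own assumptions: `Dictionary`, its `add`, `select`, and `lookup` members, and the use of a `std::map` in place of the real object store are all hypothetical, and `lookup` merely plays the role of the # operator.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

using IID = std::uint64_t;

// Simplified instance of ECLASS: one per E-class in the system.
struct EClassInstance {
    std::string className;
    std::vector<IID> setOfInstances;
    const std::vector<IID>& getSetOfInstances() const { return setOfInstances; }
};

class Dictionary {
public:
    // Register an ECLASS instance and return its (assumed) identifier.
    IID add(EClassInstance e) {
        const IID iid = next_++;
        table_.emplace(iid, std::move(e));
        return iid;
    }

    // Stand-in for select("ECLASS", className = ...): matching identifiers.
    std::vector<IID> select(const std::string& className) const {
        std::vector<IID> hits;
        for (const auto& entry : table_) {
            if (entry.second.className == className) hits.push_back(entry.first);
        }
        return hits;
    }

    // Stand-in for the lookup operator (#): identifier -> memory pointer.
    const EClassInstance* lookup(IID iid) const {
        const auto it = table_.find(iid);
        return it == table_.end() ? nullptr : &it->second;
    }

private:
    std::map<IID, EClassInstance> table_;
    IID next_ = 1;
};
```

With a populated `Dictionary d`, the listing's first, third, and fifth steps correspond roughly to `d.select("CLASS")`, `d.lookup(x)`, and `p->getSetOfInstances()`.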
We now use a more complex example to illustrate the access of a combination
of classes and associations. In this example, we show how all the associations of a
given class can be accessed by retrieving the association objects associated with the
class object.

IID x; /* x represents an IID object */
CLASS* y; /* y represents a pointer to a class object */
IID-ARRAY z; /* z represents an array of IIDs */

/* Assign to x the class object representing class CLASS */
/* x is now the iid of the Class CLASS object */
x = select("CLASS", name="CLASS");

/* Get virtual pointer to class CLASS object */
y = #x;

/* Get all associations of class CLASS */
z = y -> getSetOfAssociations();

In this example, the association objects associated with the class CLASS object
are retrieved. In the first step, the instance identifier of the class CLASS object is
looked up (using a select operator). Then, a lookup operator (#) is used to map from
the instance identifier (IID) to a virtual memory pointer (y). The virtual memory
pointer (y) now points to the class CLASS object. The IIDs of the association

objects associated with the class CLASS object (represented by z) are now accessed
by sending an appropriate message of class CLASS to the class CLASS object (y).
The approach and techniques described above to achieve parameterization can be
extended to rules and methods. Thus, rule and method objects can be parameterized
and manipulated in a manner similar to classes and associations.

5.3 Application Meta-information

The meta-information that relates to an application is manipulated in exactly
the same manner as the meta-information that relates to the model itself. The same

technique and approach described above is used. However, instead of manipulating
classes and associations that belong to the data model, we manipulate classes and as-
sociations that belong to the application. We illustrate our approach to management
of application meta-information using the following examples.
In the first example, we illustrate how the set of all instances of an application
class can be accessed. This is the mechanism by which the extension of a class is
stored and accessed.

IID x; /* x represents an IID object */
IID-ARRAY r; /* r represents an array of IIDs */
ECLASS* p; /* p represents a pointer to an E-class object */

/* Assign to x the E-class instance representing class TA. That is, x is now
the iid of an instance of E-CLASS representing the class TA object */
x = select("ECLASS", className="TA");

/* Get virtual pointer to the E-class instance of class TA object */
p = #x;

/* r now represents the set of all instances of class TA */
r = p -> getSetOfInstances();

In the second example, we illustrate how the superclasses of an application class
can be accessed. First, all the associations of the class are accessed and then, the
superclass (generalization) association is singled out.

IID x; /* x represents an IID object */
CLASS* y; /* y represents a pointer to a class object */
IID p; /* p represents an IID object */
ASSOCIATION* q; /* q represents a pointer to an association object */
IID-ARRAY z; /* z represents an array of IIDs */

/* Assign to x the class object representing class TA */
/* x is now the iid of the Class TA object */
x = select("CLASS", name="TA");

/* Get virtual pointer to class TA object */
y = #x;

/* Get all associations of class TA */
z = y -> getSetOfAssociations();

/* Iterate this set of associations to determine the superclass association */
FOR each element in AssociationSet (z)
BEGIN
    p = z -> getNextElement();
    q = #p;
    IF (q -> getAssocType() = "superclass")
        return q;
END
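The iteration in the second example can also be written as a small C++ function. This is a sketch under stated assumptions: `AssociationObject`, the string-valued association type, and the `std::map` used to resolve identifiers (standing in for the # lookup operator) are hypothetical simplifications of the dictionary described in this chapter.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

using IID = std::uint64_t;

// Simplified association object as stored in the dictionary.
struct AssociationObject {
    std::string assocType;                  // e.g. "superclass" or "aggregation"
    std::vector<std::string> constituents;  // names of the constituent classes
    const std::string& getAssocType() const { return assocType; }
};

// Walk the identifiers in z, resolve each one through the store, and
// return the first superclass association, or nullptr if there is none.
const AssociationObject* findSuperclassAssociation(
    const std::vector<IID>& z,
    const std::map<IID, AssociationObject>& store) {
    for (IID p : z) {
        const auto it = store.find(p);
        if (it != store.end() && it->second.getAssocType() == "superclass") {
            return &it->second;
        }
    }
    return nullptr;
}
```

Returning a null pointer when no generalization association is found covers top-level classes, a case the pseudocode above leaves implicit.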


5.4 Dictionary Access Functions

Due to the reflexive nature of the approach we propose, the model of the model
is realized through a series of incremental steps or extensions. Accordingly, the
dictionary access functions are extended or added to, as necessary. In Appendix A,







we show a relevant sub-set of dictionary access functions that are currently supported

after performing the initial bootstrap process.

5.5 Data Dictionary Module

The data dictionary module and the model meta-schema management module

are represented as one integrated module in the architecture we propose. Classes
and objects that describe and implement the data model semantics are defined here.

The module also provides a set of access methods used to access run-time meta-

information of the data model or of an application schema. The implementation of

this module is presented in Chapter 6 as part of the implementation of a Kernel

Object Management System (KOMS).














CHAPTER 6
MANAGEMENT OF KERNEL OBJECTS

In this chapter, we address storage and processing issues relating to the manage-
ment of kernel objects: objects whose abstractions are described by XKOM. Cor-

respondingly, the design and development of a Kernel Object Management System

(KOMS) is also presented.

This chapter is organized as follows. In Section 6.1 we outline a set of basic

requirements that serve as a basis for the management of kernel objects, and influence

the design and implementation of KOMS. Based on these requirements, Section 6.2

deals with various issues related to the storage of kernel objects and Section 6.3

deals with various issues related to the processing of kernel objects. In Section

6.4, we describe the implementation of the Kernel Object Management System and

techniques used to realize extensibility.

6.1 Basic Requirements

The basic requirements that influence the management of kernel objects, and

play an important role in the design and development of KOMS are enumerated

below.


1. Storage Requirements: The storage needs of KOMS are primarily influenced

by the basic structural constructs of the data model. In this regard, KOMS

must support the set of core data model constructs that XKOM incorporates:

objects, classes, instances, identity, and associations.

2. Processing Requirements: The processing needs of KOMS are primarily influ-

enced by the processing needs of clients in the upper layers. To support a










generic class of clients, KOMS must support set-oriented processing, value-
based processing, and association-based processing.

3. Extensibility Requirement: In order to support the various requirements of
clients in the upper layers, the software modules of KOMS must be designed
and developed in an extensible manner.

6.2 Storage Issues for Kernel Objects

In KOMS, an important issue is a generalized mapping of kernel objects to the un-
derlying storage layer. To facilitate such a mapping, a well-defined model of storage

for kernel objects is developed, and issues relating to such a storage representation

and mapping are investigated.

6.2.1 Models of Storage: Static Storage Model vs Distributed Storage Model

In implementing the structural aspects of XKOM, the storage and processing of

objects in a generalization or super-sub class hierarchy is an important consideration.

This factor determines the storage and processing of inherited attributes.

We consider two possible strategies for objects in a Generalization (G) hierarchy:

a Static Storage Model (SSM), and a Distributed Storage Model (DSM). These
strategies are illustrated in Figure 6.1.

In SSM, an object physically exists in exactly one class of the class hierarchy. In

this case, an object is "pushed" to the lowest class it belongs to in the hierarchy, and

each object stores direct attributes of that class, and also stores statically inherited
attributes from all superclasses. Thus, for example, a teacher object (based on the

schema of Figure 4.2) contains storage slots or fields for direct attributes (degree,

salary), and for inherited attributes (name, age, ssn).

Figure 6.1: Models of storage
a) Static Storage Model (SSM);
b) Distributed Storage Model (DSM)
(o1 represents a Person object, o2 a Teacher object, and o3 a TA object)

In DSM, the structural properties of an object are physically distributed into as
many classes as it belongs to. An instance is thus the corresponding partition of
an object in a given class. Each instance only stores the attributes defined for that
class. Thus, for example, the same teacher object has a representation (instance)
in the Teacher class and a representation (instance) in the Person class, as well. A
dotted line in Figure 6.1(b) shows that they represent the same object.
The main advantage of SSM is that since all the attributes (direct and inherited)

are clustered together, creation, update, and deletion of objects is straightforward

and efficient. For retrieving all attributes of a given object, this approach is fast. In

SSM, a main disadvantage is that scan-based access is inefficient, and indexing can be

complicated. For example, retrieving all persons whose age > 50 is not very efficient.

This requires scanning all objects in the class Person, and scanning all objects in all

subclasses of Person. Similarly, indexing becomes complicated, requiring strategies

such as class hierarchy indexes [KIM89b, MAI86b]. A more serious shortcoming is
the inability to represent an object that spans more than one branch of a hierarchy.

For example, the case where a person is a parent and a teacher cannot be handled.

For DSM, the main advantages are faster scan-based access, such as selection,
and the ability to represent the instantiation of an object across many classes

(including the case where the object spans more than one branch of the generaliza-
tion hierarchy). Both these factors are important in database applications, and thus

we opted for using DSM. The main disadvantage in DSM is the added complexity

and overhead in creating, updating, and deleting an object that is partitioned. Also,

retrieving all attributes of a single object involves accessing all instances of the ob-

ject resulting in an increased cost of access. The cost of inserts, updates, deletes,

and accessing all attributes of a single object is higher than SSM, but can be reduced

by clustering together all instances of an object.

The DSM approach also allows for more flexible clustering options since it allows
instances to be clustered on a class basis (all instances of a class clustered together) or

on a class hierarchy basis (all distributed instances of an object clustered together).

In SSM, the latter technique is built in, and cannot be changed.
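
The trade-off between the two models can be illustrated with a minimal in-memory sketch. It is not the system's implementation: the two-class hierarchy, the attribute lists, and the names SSMStore, DSMStore, ancestry, select, and fetch are all assumptions made for illustration. The sketch shows why a selection on Person touches several extents under SSM but only one under DSM, and why inserting or reassembling an object costs more under DSM.

```python
from collections import defaultdict

# Hypothetical two-class hierarchy; each class lists only its direct attributes.
DIRECT_ATTRS = {"Person": ["ssn", "name", "age"], "Teacher": ["salary", "degree"]}
SUPERCLASS = {"Person": None, "Teacher": "Person"}


def ancestry(cls):
    """Return cls followed by its superclasses, most specific first."""
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = SUPERCLASS[cls]
    return chain


class SSMStore:
    """SSM: one record per object, stored in its own class with all attributes."""

    def __init__(self):
        self.extent = defaultdict(dict)  # class name -> {object id: full record}

    def insert(self, oid, cls, values):
        self.extent[cls][oid] = dict(values)

    def select(self, cls, attr, pred):
        # A selection on cls has to scan cls and every subclass extent.
        hits = []
        for c, records in self.extent.items():
            if cls in ancestry(c):
                hits.extend(o for o, rec in records.items() if pred(rec[attr]))
        return sorted(hits)


class DSMStore:
    """DSM: one instance per class in the object's ancestry, direct attributes only."""

    def __init__(self):
        self.extent = defaultdict(dict)  # class name -> {object id: partial record}

    def insert(self, oid, cls, values):
        # Insertion is costlier: the object is partitioned across several extents.
        for c in ancestry(cls):
            self.extent[c][oid] = {a: values[a] for a in DIRECT_ATTRS[c]}

    def select(self, cls, attr, pred):
        # attr is assumed to be a direct attribute of cls; one extent is scanned.
        return sorted(o for o, rec in self.extent[cls].items() if pred(rec[attr]))

    def fetch(self, oid):
        # Reassembling an object touches every extent that holds an instance of it.
        merged = {}
        for records in self.extent.values():
            merged.update(records.get(oid, {}))
        return merged


ssm, dsm = SSMStore(), DSMStore()
for store in (ssm, dsm):
    store.insert("o1", "Person", {"ssn": 1, "name": "Ann", "age": 62})
    store.insert("o2", "Teacher", {"ssn": 2, "name": "Bob", "age": 55,
                                   "salary": 4000, "degree": "PhD"})
    store.insert("o3", "Teacher", {"ssn": 3, "name": "Cy", "age": 31,
                                   "salary": 3000, "degree": "MS"})

over_50 = lambda age: age > 50
print(ssm.select("Person", "age", over_50))  # scans Person and Teacher extents
print(dsm.select("Person", "age", over_50))  # scans the Person extent only
print(dsm.fetch("o2"))                       # merges the Person and Teacher instances
```

Both stores return the same answer to the selection; they differ only in how many extents are visited and in how much work an insert or a full-object fetch requires.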









6.2.2 Logical Storage Structure of an Instance

Based on the Distributed Storage Model (DSM), we now investigate the storage

representation of a generic instance. In object-oriented data models, various types

of objects such as complex/composite objects, structured objects, and large/small

unstructured objects have been identified and studied. The proposed storage representation

must encompass these various types of objects. To do so, we discuss below

the characteristics of the various types of objects, and how they can be

supported in a single, integrated storage representation.


1. Structured Objects: These objects are "regular" objects recognized by most
object-oriented data models. Such objects have a regular structure described

by the class that they belong to, and can be typically considered as objects

described in terms of other objects. The structural characteristic of this type

of object is that the object usually contains two components. First, it contains

"descriptive data," which are data values embedded directly in fields of the

storage object, and are usually expressed using data types such as character,
integer, string or other complex data types. Second, it contains an object reference

component (association data), which represents object references (links)
between this object and other objects. A key operational characteristic for this

kind of object is set-oriented processing.

2. Large/Small unstructured objects: These objects have a structure

that is typically expressed as a sequence of uninterpreted bytes; their size
can range from 1 KByte to 100+ MBytes. A key characteristic of the large

unstructured objects is the need to uniformly and independently access and

manipulate parts of such large objects. Examples of such storage objects are

large (arbitrarily sized) text, images (from satellites), bit maps, etc. In Exodus

[CAR86a] all storage objects are treated as uninterpreted bytes and issues

related to the processing of large, uninterpreted storage objects are studied.

3. Complex/Composite objects: The primary feature of these objects
is that they represent a composition of several objects from several classes
that collectively define one complex object. A key characteristic of this type of

object is the need to access and manipulate the entire composition of objects as

one entity. Such objects may be found in discrete manufacturing or engineering

design: for example, a jet engine that represents one "complex" entity.

In order to provide a unified and integrated representation for structured objects,

large/small unstructured objects, and complex/composite objects, we propose two

types of attributes: (a) object reference attributes and (b) value attributes.

Using these two types of attributes allows for uniform representation of all types

of objects. On the one hand, a large uninterpreted object could be stored in a value

attribute using a data type called "raw": raw represents an uninterpreted sequence

of bytes. Processing the raw data type could exploit techniques and algorithms
proposed by Exodus [CAR86a] for processing large uninterpreted objects. On the other
hand, complex/composite objects can be represented using a combination of object

reference and value attributes. Such objects, for example, could utilize clustering

techniques and algorithms developed in application areas such as CAD/CAM for

processing complex/composite objects.
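
As a hedged illustration of how the two attribute types can cover these cases, the sketch below models a composite object as instances linked by object reference attributes, with an uninterpreted byte string held in a value attribute. The dictionary layout, the attribute names ("parts", "name", "scan"), and the function composition are assumptions made for this example; no clustering or Exodus-style large-object handling is shown.

```python
# Hypothetical in-memory store: iid -> instance. Each instance has "refs"
# (object reference attributes) and "values" (value attributes).
store = {
    1: {"refs": {"parts": [2, 3]}, "values": {"name": "jet engine"}},
    2: {"refs": {"parts": [4]}, "values": {"name": "compressor"}},
    # "scan" stands in for a raw value: an uninterpreted sequence of bytes.
    3: {"refs": {}, "values": {"name": "turbine", "scan": b"\x89\x50\x4e\x47"}},
    4: {"refs": {}, "values": {"name": "blade"}},
}


def composition(store, root_iid):
    """Return the iids reachable from root_iid via object references (depth first)."""
    seen, order, stack = set(), [], [root_iid]
    while stack:
        iid = stack.pop()
        if iid in seen:
            continue
        seen.add(iid)
        order.append(iid)
        for target in store[iid]["refs"].values():
            # A reference attribute holds either one iid or a list of iids.
            targets = target if isinstance(target, list) else [target]
            stack.extend(reversed(targets))
    return order


engine_parts = composition(store, 1)
print([store[i]["values"]["name"] for i in engine_parts])
```

Treating the result of composition as one unit is what lets the whole composite be read, copied, or clustered together, while each member remains an ordinary instance.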

The storage format of an instance is a structure having the following fields:

instance-id field: Each instance stores its instance identifier in the first field.

object reference fields: Each instance can have several object reference attributes,

each of which is stored as an object reference field. An object reference field

stores one or more instance identifiers (iids) of the objects referred to by this

instance, and is specified as IID or as IID-ARRAY. The former supports

1:1 references and the latter supports 1:m references.

value fields: Each instance may have several value attributes, each of which is stored

as a value field.


The storage representation of an instance corresponding to this format is shown

in Figure 6.2.


[Figure 6.2 layout of a storage instance, top to bottom:]
instance id of instance
reference attributes of instance
value attributes of instance:
    value (primitive data type)
    value (D class object)
    value (complex D class object)
    ...

Figure 6.2: A generic storage instance
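
The three-part format can be sketched as a small record type. This is only an illustration under stated assumptions: iids are modeled as integers, an IID field as a single int, an IID-ARRAY field as a list of ints, and the names Instance, refs, values, and referenced_iids are hypothetical rather than taken from the system.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Union

IID = int                          # assumption: an instance identifier is an integer
Reference = Union[IID, List[IID]]  # IID (1:1) or IID-ARRAY (1:m)


@dataclass
class Instance:
    """Generic storage instance: iid first, then reference fields, then value fields."""
    iid: IID
    refs: Dict[str, Reference] = field(default_factory=dict)
    values: Dict[str, object] = field(default_factory=dict)

    def referenced_iids(self, attr: str) -> List[IID]:
        """Return the iids stored in one object reference field as a list."""
        target = self.refs[attr]
        return list(target) if isinstance(target, list) else [target]


# Hypothetical Teacher instance: one 1:1 reference, one 1:m reference,
# a primitive value, and a raw (uninterpreted bytes) value.
teacher = Instance(
    iid=3,
    refs={"department": 7, "teaches": [11, 12]},
    values={"salary": 4000, "photo": b"\x00\x01\x02"},
)
print(teacher.referenced_iids("department"))
print(teacher.referenced_iids("teaches"))
```

Returning a list from referenced_iids in both cases lets callers traverse 1:1 and 1:m references with the same loop.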



6.2.3 Value Attributes

Value attributes of an instance are used to store the actual value of other object

instances as part of the instance. In Figure 6.3, we illustrate the notion of different

types of value attributes by showing a single instance of the class Person.

The different types of value attributes include:


1. primitive data types (e.g., integer, string)



