
Group Title: K
Title: K : an object-oriented knowledge-base programming language for software development and prototyping
Full Citation
Permanent Link: http://ufdc.ufl.edu/UF00097385/00001
 Material Information
Title: K : an object-oriented knowledge-base programming language for software development and prototyping
Physical Description: vi, 177 leaves : ill. ; 29 cm.
Language: English
Creator: Shyy, Yuh-Ming, 1959-
Publisher: University of Florida
Place of Publication: Gainesville, Fla.
Publication Date: 1992
Copyright Date: 1992
Subject: Database management   ( lcsh )
Object-oriented databases   ( lcsh )
Computer and Information Sciences thesis Ph. D
Dissertations, Academic -- Computer and Information Sciences -- UF
Genre: bibliography   ( marcgt )
non-fiction   ( marcgt )
Thesis: Thesis (Ph. D.)--University of Florida, 1992.
Bibliography: Includes bibliographical references (leaves 166-176).
Additional Physical Form: Also available on World Wide Web
General Note: Typescript.
General Note: Vita.
Statement of Responsibility: by Yuh-Ming Shyy.
 Record Information
Bibliographic ID: UF00097385
Volume ID: VID00001
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
Resource Identifier: alephbibnum - 001806311
oclc - 27804290
notis - AJN0144



Table of Contents
    Pages i, i-a, ii
    Table of Contents (pages iii-vi, 1-12)
    Survey of related works (pages 13-20)
    Language overview (pages 21-44)
    Structural abstraction mechanisms (pages 45-74)
    Behavioral abstraction mechanisms (pages 75-91)
    Computation model (pages 92-101)
    KBMS-supported evolutionary prototyping (pages 102-122)
    System architecture and implementation (pages 123-141)
    Conclusion and future research directions (pages 142-151)
    Appendix A: Syntax summary of K (pages 152-159)
    Appendix B: Parts knowledge-base example (pages 160-176)
    Biographical sketch (pages 177-179)
Full Text




1992




I am grateful to a number of individuals for their

contributions during the development of this dissertation. In

particular, I would like to thank my advisor, Prof. Stanley

Y.W. Su, for his continuous encouragement, guidance, and

support throughout the course of this research work. I would

also like to thank Prof. Herman Lam, Prof. Sharma

Chakravarthy, Prof. Manuel Bermudez, and Prof. Thomas Bullock

for their participation on my supervisory committee and

careful review of my dissertation.

My special appreciation goes to my colleague Javier

Arroyo for his many fruitful suggestions and the

implementation of the K.1 prototype. I also thank Sharon

Grant, the data-base center secretary, for her constant help.

Finally, and most importantly, I would like to thank my

dear wife, Ching-Jung Wu, and my parents, Yo-Ching Shyy and

Ching-Mei Lee Shyy, for their encouragement, patience, and

understanding during the period of my Ph.D. study.

This research was supported by National Science

Foundation Grant #DMC-8814989 and Florida High Technology and

Industrial Council Grant #UPN90090708.






ABSTRACT . . . v

1.1 Motivation
1.2 Language Design Principles
1.3 System Overview
1.4 Dissertation Organization

2 SURVEY OF RELATED WORKS . . . 13

2.1 Data-base Programming Language
2.2 Structural and Behavioral Modeling
2.3 KBMS-supported Software Development System

3 LANGUAGE OVERVIEW . . . 21

Knowledge Abstractions
Model Extensibility and Reflexivity
Persistence
Type System
KBMS Operations

4 STRUCTURAL ABSTRACTION MECHANISMS . . . 45

Association Definition
Association Pattern
Context Looping Statement and Navigation
Existential and Universal Quantifiers

5 BEHAVIORAL ABSTRACTION MECHANISMS . . . 75

5.1 Method Definition
5.2 Rule Definition

6 COMPUTATION MODEL . . . 92

6.1 Overview . . . 92
6.2 Extended Object-oriented Computation . . . 95

7 KBMS-SUPPORTED EVOLUTIONARY PROTOTYPING . . . 102

7.1 Overview . . . 102
7.2 Method Model and Control Associations . . . 107

8 SYSTEM ARCHITECTURE AND IMPLEMENTATION . . . 123

8.1 System Architecture . . . 123
8.2 Implementation: Mapping from K.1 to C++ . . . 128

9 CONCLUSION AND FUTURE RESEARCH DIRECTIONS . . . 142

9.1 Conclusion . . . 142
9.2 Future Research Directions . . . 145

A SYNTAX SUMMARY OF K . . . 152

B PARTS KNOWLEDGE-BASE EXAMPLE . . . 160

REFERENCES . . . 166

BIOGRAPHICAL SKETCH . . . 177

Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy



Yuh-Ming Shyy

August, 1992

Chairperson: Dr. Stanley Y.W. Su
Major Department: Computer and Information Sciences

The OSAM*.KBMS is a research prototype of a knowledge-

base management system, or the so-called "next generation

data-base management system," for nontraditional data- and

knowledge-intensive applications. In order to solve the

"impedance mismatch" problems between data-base languages

(which include data definition languages, query languages, and

rule languages) and traditional programming languages, we have

developed an object-oriented knowledge-base programming

language called K to serve as the high-level interface of

OSAM*.KBMS for defining, querying, and manipulating the

knowledge base, as well as for writing code to implement any

application system. In addition to the well-known object-

oriented virtues such as abstract data types, information

hiding, complex objects, relationships, inheritance, reusable

code, etc., K provides six important features.

(1) Powerful abstraction mechanisms for supporting the

underlying knowledge model which captures any application

domain knowledge in terms of the structural associations (such

as generalization and aggregation), functional associations

(such as the client-server relationship), methods, and

knowledge rules (such as constraints and triggers).

(2) A strong notion of address-independent object

identifiers (oid) instead of physical pointers.

(3) A persistence mechanism for supporting both

persistent and transient objects uniformly.

(4) A flexible type system that supports both static type

checking and multiple views of objects in multiple classes.

(5) A declarative knowledge retrieval mechanism based on

object association patterns for querying the knowledge base.

(6) Multi-paradigm programming constructs for specifying

procedural and rule-based computations.

K can be used for both the specification and

implementation of any application system to any level of

detail and therefore also facilitates KBMS-supported

evolutionary prototyping of software systems. This

dissertation presents the design and implementation of K in

terms of its underlying knowledge model, linguistic

facilities, implementation architecture, and application to

evolutionary prototyping.


1.1 Motivation

With a view to widening the applicability of data-base

technology to nontraditional application domains such as

Computer-Aided Software Engineering (CASE), Computer-Aided

Design and Manufacturing (CAD/CAM), Office Information

Systems, and Knowledge Representation Systems, many so-called

"Next-Generation Data-Base Management Systems (DBMS)" [ATK90,

SIG90, ACM91] have been proposed in recent years. In general,

a next-generation DBMS extends the functionalities of

traditional DBMSs (such as persistent data management, query

processing, concurrency control, and recovery) in either or

both of the following aspects. Firstly, object-oriented data

modeling constructs are introduced to model complex

application domains and the behavioral specifications are also

incorporated into the domain and functionality of a DBMS in

terms of user-defined methods. Secondly, rule management

facilities are introduced to manage and process a large number

of knowledge rules. The structure and behavior of a

complex system are often subject to design, operation, and

system rules. If these rules can be explicitly specified in


an application system, they can be used for automatically

maintaining system constraints and/or triggering predefined

actions when certain events occur. Note that, although the

semantics represented by rules can be implemented in methods,

high-level declarative rules make it much easier for a data-

base designer to clearly capture the semantics and thus

simplify the tasks of implementation, debugging, and

maintenance. In order to define, query, and manipulate the

data-base and to avoid the impedance mismatch problems [COP84,

MAI89] between data-base languages (which include data

definition languages, query languages, and/or rule languages)

and traditional programming languages, next-generation data-

base programming languages are also needed. In this

dissertation, we shall use the terms "knowledge-base management

system" (KBMS) and "knowledge-base programming language"

(KBPL) to refer to such a next-generation data-base management

system and data-base programming language, respectively.
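The contrast drawn above, between burying rule semantics inside methods and stating them as declarative rules, can be illustrated with a minimal Python sketch. It is not K syntax and does not reproduce the OSAM*.KBMS rule processor. The names Rule, RuleBase, signal, and update, as well as the "update" event and the part attributes, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Rule:
    """A declarative event-condition-action rule (hypothetical structure)."""
    event: str                          # name of the triggering event
    condition: Callable[[dict], bool]   # predicate over the affected object
    action: Callable[[dict], None]      # what to do when the condition holds


@dataclass
class RuleBase:
    """Keeps rules apart from method code and fires them when events occur."""
    rules: List[Rule] = field(default_factory=list)

    def signal(self, event: str, obj: dict) -> int:
        """Fire every rule registered for `event`; return how many fired."""
        fired = 0
        for rule in self.rules:
            if rule.event == event and rule.condition(obj):
                rule.action(obj)
                fired += 1
        return fired


def update(rule_base: RuleBase, obj: dict, **changes) -> None:
    """A generic update method; it contains no constraint logic of its own."""
    obj.update(changes)
    rule_base.signal("update", obj)


rule_base = RuleBase()
# Constraint-style rule: stock may never be negative.
rule_base.rules.append(Rule(
    event="update",
    condition=lambda part: part["stock"] < 0,
    action=lambda part: part.update(stock=0),
))
# Trigger-style rule: flag a part for reordering when stock runs low.
rule_base.rules.append(Rule(
    event="update",
    condition=lambda part: part["stock"] < 10,
    action=lambda part: part.update(reorder=True),
))

part = {"name": "bolt", "stock": 50, "reorder": False}
update(rule_base, part, stock=-5)
```

In this sketch the update method stays unchanged when a rule is added or removed, which is the maintenance benefit the paragraph above attributes to declarative rules.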

The OSAM*.KBMS [LAM89a,b, YAS91] is a research prototype

of KBMS that is based on an object-oriented semantic

association model OSAM* [SU86,89a,b,92] and developed at the

Database Systems Research and Development Center of the

University of Florida. In the past several years, an

object-oriented query language (OQL) [ALA89, GUO91] and constraint

language [ALA90, SU91] have been developed for specifying

queries and rules. However, the implementation of methods

still needs to be done in such traditional programming


languages as C++ [STR86]. Because the implementation language

does not directly support the OSAM* knowledge model, all of

the classical impedance mismatch problems still exist. To

solve the problems, we have developed a single integrated

object-oriented knowledge-base programming language called K

[SHY91] to serve as the high-level interface of OSAM*.KBMS for

defining, querying, and manipulating the knowledge-base, as

well as to code methods of any data/knowledge-intensive

application system. In addition to such well-known object-

oriented virtues as abstract data types, information hiding,

complex objects, relationships, inheritance, reusable code,

etc., K provides the following six important features.

(1) Powerful abstraction mechanisms for supporting the

underlying knowledge model which captures any application

domain knowledge in terms of the structural associations (such

as generalization and aggregation), functional associations

(such as the client-server relationship), methods, and

knowledge rules (such as constraints and triggers) in an

integrated fashion.

(2) A strong notion of address-independent object

identifiers (oid) instead of physical pointers.

(3) A persistence mechanism for supporting both

persistent and transient objects uniformly.

(4) A flexible type system which supports both static

type checking and multiple representations of objects in

multiple classes.


(5) A declarative knowledge retrieval mechanism based on

object association patterns for querying the knowledge base.

(6) Basic data structures (set, list, and array) and

multi-paradigm programming constructs for specifying

procedural and rule-based computations.

K can be used for both the specification and

implementation of an application system to any level of detail

and therefore facilitates KBMS-supported evolutionary

prototyping [SU92], which will be described in Chapter 7.
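Features (2) and (3) can be pictured with a small Python sketch, given here only as an illustration under stated assumptions. The ObjectManager class and its methods (create, deref, relocate, select, commit) are hypothetical names and are not the OSAM*.KBMS interface. The sketch shows objects referring to one another by oid rather than by address, and a query that treats persistent and transient objects alike.

```python
import itertools
from typing import Any, Callable, Dict, List, Set


class ObjectManager:
    """Maps address-independent oids to objects (hypothetical sketch)."""

    def __init__(self) -> None:
        self._next_oid = itertools.count(1)
        self._objects: Dict[int, Any] = {}
        self._persistent: Set[int] = set()

    def create(self, value: Any, persistent: bool = False) -> int:
        """Register an object and return its oid."""
        oid = next(self._next_oid)
        self._objects[oid] = value
        if persistent:
            self._persistent.add(oid)
        return oid

    def deref(self, oid: int) -> Any:
        """Resolve an oid to the object it currently denotes."""
        return self._objects[oid]

    def relocate(self, oid: int, new_copy: Any) -> None:
        """Simulate moving an object in memory; its oid does not change."""
        self._objects[oid] = new_copy

    def select(self, predicate: Callable[[Any], bool]) -> List[int]:
        """Query all objects, persistent or transient, in the same way."""
        return sorted(o for o, v in self._objects.items() if predicate(v))

    def commit(self) -> Dict[int, Any]:
        """Return the objects that would be written to storage."""
        return {oid: self._objects[oid] for oid in self._persistent}


om = ObjectManager()
engine = om.create({"kind": "part", "name": "engine"}, persistent=True)
draft = om.create({"kind": "part", "name": "draft-wheel"})  # transient
car = om.create({"kind": "assembly", "parts": [engine, draft]}, persistent=True)

# The engine is copied to a new address; references held by `car` stay valid.
om.relocate(engine, dict(om.deref(engine)))
parts = om.select(lambda v: v["kind"] == "part")
```

Because `car` stores oids instead of addresses, relocating the engine object does not invalidate the assembly, and the query returns both the persistent and the transient part.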

1.2 Language Design Principles

The design of K is guided by the following general design

principles. More specific design rationales will be given

throughout this dissertation.

(1) Direct support of the OSAM*.KBMS kernel knowledge

model. K should provide the knowledge abstraction mechanisms

to support the extensible and reflexive OSAM*.KBMS kernel

knowledge model, which will be described in Chapter 3. All the

semantic constructs such as classes, associations, methods,

and rules should be treated as first class objects in the same

way as any other objects in K.

(2) Wide-spectrum for both specification and

implementation. K should be a uniform language for knowledge

definition, knowledge retrieval, knowledge manipulation, and

general computation with persistent/transient objects.


(3) Computationally complete. K should provide all of

the basic data structures (set, array, and list), control

structures (sequence, repetition, and condition), and rule

specification constructs for the users to implement any

algorithm and to perform any computation.

(4) Maintainability and readability. The software written

in K should have readable syntax and stable semantics so that

it can be easily understood and maintained.

(5) Seamless incorporation of query/rule language.

Instead of just embedding the existing query and rule language

of OSAM*.KBMS into K, a uniform and well-integrated syntax is

necessary to provide set-oriented and declarative query and

rule specification facilities without any conflict or

ambiguity with other programming constructs of K. New

constructs should be introduced only if we can demonstrate one

or more of the following points: readability, new concept, and

conciseness. Besides, new constructs must satisfy the

orthogonality principle, i.e., any combination of the

programming constructs is allowed.

(6) Strongly typed. As K is to be used for the

development of complex software systems, it should be a

strongly typed language so that as many type errors as

possible can be checked by static type checking at compile

time. On the other hand, the type system should be flexible

enough to support the distributed view, or multiple


representations, of OSAM* objects, as will be discussed in

Chapter 3.

(7) More emphasis on functionalities rather than

efficiency. As a high-level programming language, K should put

more emphasis on its functionalities than efficiency so that

complex application systems can be rapidly constructed by the

use of those high-level facilities of K. With the rate of

hardware progress, we do not feel that efficiency will be a

serious concern in the future.

1.3 System Overview

K is part of a KBMS-supported software development system

whose layer structure is shown in Figure 1.1. Starting from

the middle layer, i.e., the abstraction layer, we have

developed an extensible and reflexive object-oriented

knowledge model which (i) provides powerful abstraction

mechanisms for explicitly capturing the structural and

behavioral properties of all software systems and the

application domain objects that these systems deal with in

terms of the structural associations, methods, and knowledge

rules of object classes, (ii) reflexively models itself as a

kernel model so that all the meta information (i.e.,

structural and behavioral properties of objects) can also be

modeled as object classes. This knowledge model serves as the

underlying model of K, which is represented by the language

layer. In order to establish the proper computing environment


for the developers, a set of well-integrated tools will be

provided at the environment layer for the users to (i) define,

browse, modify, query, and test the software systems, (ii)

extract information from the execution of the software

systems, (iii) select, compose, and reuse existing codes

written in K for rapidly constructing the software system, and

(iv) perform validation of the target system. The OSAM*.KBMS

is represented by the control layer, which includes the

facilities for persistent and transient object management,

query processing, transaction management, and rule processing.

Finally, the control layer is mapped to the storage layer,

which includes the facilities for physical storage management

(e.g., access methods, file management, and data organization)

and low-level transaction management (e.g., concurrency

control and recovery).

Note that the OSAM*.KBMS is used not only to model,

process, and manage data of the application world, as past

and present DBMSs do, but also to model, process, and manage all

software systems as well as their related meta information.

All are modeled by the same knowledge model as object classes

and processed under the control of the underlying computation

model of the KBMS. It is not necessary to make traditional

distinctions among software systems (e.g., application

systems, operating systems, and DBMS) because all of them are

object classes of the underlying universal object-oriented

knowledge base as shown in Figure 1.2. The structural and


behavioral properties of all object classes that model

programs and application objects can be shared and reused

among the users.
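The idea that application objects, software systems, and meta information all live in one knowledge base and are handled by the same mechanisms can be sketched as follows. This is a toy Python illustration, not the OSAM* kernel model: the store, the functions define_class, new, and query, and the decision to describe "Class" as an instance of itself are assumptions made for the example.

```python
from typing import Callable, Dict, List

# One store holds application objects and the objects that describe classes.
store: List[Dict] = []


def define_class(name: str, superclass=None, attributes=()) -> Dict:
    """Record a class definition as an ordinary object of class 'Class'."""
    meta = {
        "class": "Class",
        "name": name,
        "superclass": superclass,
        "attributes": list(attributes),
    }
    store.append(meta)
    return meta


def new(class_name: str, **attrs) -> Dict:
    """Create an application object of the given class."""
    obj = {"class": class_name, **attrs}
    store.append(obj)
    return obj


def query(class_name: str, where: Callable[[Dict], bool] = lambda o: True) -> List[Dict]:
    """The same query function serves data and meta information."""
    return [o for o in store if o["class"] == class_name and where(o)]


# The kernel describes itself: "Class" is recorded as an instance of "Class".
define_class("Class", attributes=["name", "superclass", "attributes"])
define_class("Part", attributes=["name", "weight"])
define_class("Program", attributes=["name", "language"])

new("Part", name="bolt", weight=3)
new("Program", name="payroll", language="K")

heavy = query("Part", lambda o: o["weight"] > 1)      # application data
class_names = [c["name"] for c in query("Class")]     # meta information
```

The second query retrieves class definitions through exactly the same call as the first retrieves parts, which is the uniformity the paragraph above describes.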

A prototype version of K (K.1) and its supporting

OSAM*.KBMS have been implemented on a Sun 4 in C++ as a first

step toward the KBMS-supported software development system

described above. This dissertation presents the design and

implementation of K in terms of its (i) underlying knowledge

model (abstraction layer), (ii) linguistic facilities

(language layer), (iii) implementation architecture (control

layer), and (iv) application to evolutionary prototyping of

software systems. In general, the contribution of this

research lies in integrating techniques from data-base

management systems, programming languages, and software

engineering in an object-oriented framework toward

a KBMS-supported software development system. Specifically,

my contribution to this research is two-fold. First, on the

language aspect, I have designed the knowledge-base

programming language K as the high-level interface of the

KBMS-supported software development system. A detailed

description of K will be given in this dissertation. I have

also participated in the implementation of the first prototype

version of K on a Sun 4. An overview of the system architecture

and implementation strategy will be given in this

dissertation, and a detailed description can be found in

Arroyo [ARR92]. Second, on the software development


methodology aspect, I have extended the OSAM* model [SU86,89a,

YAS91] with control associations so that both structural and

behavioral properties of objects can be uniformly modeled,

managed, and processed by the KBMS to support a knowledge-base

modeling approach to evolutionary prototyping of software

systems [SU92]. This extension paves the way for a software

development system in which the developer can graphically

model, query, and execute any software system at any level of

abstraction. A detailed description of the knowledge model and

the KBMS-supported evolutionary prototyping approach will be

given in this dissertation.

1.4 Dissertation Organization

The rest of this dissertation is organized as follows.

Chapter 2 surveys the related works in the categories of data-

base programming languages, structural and behavioral

modeling, and KBMS-supported software development systems. In

Chapter 3, we give an overview of the knowledge-base

programming language K in terms of the underlying knowledge

model, type system, persistence, and KBMS operations.

Structural abstraction mechanisms of K are described in detail

in Chapter 4 in terms of the structural association

definition, association pattern, and the knowledge retrieval

facilities of K. Behavioral abstraction mechanisms of K are

described in detail in Chapter 5 in terms of methods and

rules. An extended object-oriented computation model for


supporting multi-paradigm computations is given in Chapter 6.

In Chapter 7, we present the KBMS-supported evolutionary

prototyping methodology. The system architecture and the

current implementation strategies are given in Chapter 8.

Finally, Chapter 9 gives our conclusion and outlines the

future research directions. A syntax summary of K in BNF form

is given in Appendix A. A parts knowledge-base example is

given in Appendix B to illustrate the expressiveness of K as

suggested in Atkinson and Buneman [ATK87].

Well-integrated Tools
* User Interface for the design, definition, browsing, and querying of knowledge base
* Tools for testing the prototype and information retrieval

* Knowledge Abstraction
* Knowledge Retrieval &
* Multi-paradigm

Knowledge Model
* Structural Abstraction
* Behavioral Abstraction
  Operations/Methods
  Knowledge Rules
* Extensible & Reflective

* Persistent Object
  Transaction Management
  Query Processing
  Rule Management

Storage Management
* Access Method
  File Management
* Data Organization & Placement
  Low-level concurrency control & recovery

Figure 1.1 The Layer Structure of a Future KBMS-supported
Software Development System

Figure 1.2 A Universal KBMS-Supported Software
Development System


2.1 Data-base Programming Language

Many "data-base programming languages" [ATK87, BLO87]

have been proposed in recent years (e.g., Pascal/R [SCH77],

Rigel [ROW79], Taxis [MYL80], Dial [HAM80], Plain [WAS81],

Adaplex [SMI83], PS-Algol [ATK83], GemStone [COP84, MAI86,

BUT91], Galileo [ALB85], Trellis/Owl [SCH86], Vbase [AND87],

E [RIC87], Orion [KIM88], Proquel [LIN88], O++ [AGR89], OQL[X]

[BLA90], Ontos [ONT91], IRIS [FIS87, WIL90b, ANN91],

ObjectStore [LAM91], and O2 [LEC89, DEU91]) to overcome the

infamous impedance mismatch problem between traditional

programming languages and DDL/DML [COP84, MAI89] by

integrating data definition, data manipulation, and general

computing facilities in a single language. A detailed survey

can be found in Atkinson and Buneman [ATK87].

Most of the existing works are based on relational,

functional, or object-oriented data models, with the extension

of persistence, the computation facilities of such traditional

programming languages as Pascal, Lisp, and C/C++, and

associative access (using either iterator or SQL-like

construct, both of which can be nested to arbitrary numbers


of levels). They generally do not provide the facility for

rule processing which is considered one of the major

requirements for next-generation data-base management systems

[ATK90, SIG90, SIL91]. While researchers in deductive data-

base systems (e.g., LDL [TSU86], LOGRES [CAC90], and Glue-Nail

[PHI91]) and active data-base systems (e.g., Postgres [ROW87,

STO91], Starburst [LOH91], Ariel [HAN89], HiPAC [CHA89,

DAY88], and OSAM*.KBMS [SU89a,b,91, LAM89a,b, CHU90, SIN90])

have tried to extend relational or object-oriented data-base

systems with rules, their results in general have provided

separate rule languages as an extension of their query

languages instead of integrated data-base programming

languages. The two existing DBPLs most closely related to

our work are discussed below.

Proquel [LIN88] is the only existing data-base

programming language that is specifically designed for the

prototyping of data-base applications. With the help of a set

of prototyping tools, one can use Proquel to specify, query,

and implement relations, events, and operations of an

application. Proquel is based on a relational data model and

does not properly support complex objects and abstract data

types for advanced data-base applications such as CAD/CAM,

Office Information Systems, and Software Engineering.

O++ [AGR89, GEH91] extends the traditional object-

oriented paradigm with rules and versioning, both of which

are useful for the development of complex software systems.


O++ extends C++ with the facilities for creating persistent

and versioned objects, defining sets, iterating over sets and

clusters of persistent objects, and associating constraints

and triggers with object classes.

Our proposed knowledge-base programming language K

supports the same object-oriented features found in O++, such

as a rich type system, structural associations, operational

specifications, rules, object identification, encapsulation,

and inheritance. Temporal and versioning facilities will be

incorporated in a later version of K. Unlike O++, which is

a superset of C++, K is designed to be a high-level

programming language. While O++ extends the C++ data model with

rules, K supports a high-level extensible and reflexive

object-oriented semantic association model [SU86,89a, YAS91]

where all things, including classes, associations, methods,

and rules, are uniformly treated as objects. For example, a

user can use the query facility of K to query the meta

information from the kernel schema in the same way as he/she

can query any application domain. New association types can

be defined in K and thus extend the model itself. The

implementation of K is also extensible based on an open and

modular architecture where each software component is also

modeled as an object class. Secondly, while O++ extends the

"for" loop construct to iterate over sets, K provides more

declarative and concise constructs for specifying queries and

rules based on object association patterns. Thirdly, K uses


address-independent object identifiers (soft pointers) as

object surrogates rather than using three different types of

physical address pointers (persistent, transient, and dual

pointers) as in O++. The distinction between persistent and

transient objects is transparent to the users, and both are treated alike. For

example, a query can retrieve both types of objects instead

of only persistent objects as in O++. Fourthly, K provides a

more flexible type system that supports both static type

checking and multiple representations of objects, which is not

possible in O++. Lastly, K puts more emphasis on readability

and maintainability by providing readable syntax rather than

the cryptic and non-intuitive C++ syntax. More specific

comparisons will be made throughout this dissertation.
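The difference between iterating over sets with nested loops and stating an object association pattern can be sketched in Python. This is an illustration only: it does not use K or O++ syntax, and the sample classes (Teacher, Section, Student), the links, and the helper functions associated and match are assumptions introduced for the example.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical data: object name -> class, plus undirected association links.
classes: Dict[str, str] = {
    "su": "Teacher", "lam": "Teacher",
    "db1": "Section", "pl1": "Section",
    "ann": "Student", "bob": "Student",
}
links: Set[Tuple[str, str]] = {
    ("su", "db1"), ("lam", "pl1"), ("db1", "ann"), ("db1", "bob"),
}


def associated(a: str, b: str) -> bool:
    """True if the two objects are linked in either direction."""
    return (a, b) in links or (b, a) in links


def match(pattern: List[str]) -> List[Tuple[str, ...]]:
    """Return every chain of linked objects whose classes follow `pattern`."""
    chains = [(o,) for o in sorted(classes) if classes[o] == pattern[0]]
    for class_name in pattern[1:]:
        chains = [
            chain + (o,)
            for chain in chains
            for o in sorted(classes)
            if classes[o] == class_name and associated(chain[-1], o)
        ]
    return chains


# Declarative form: state the pattern, not the iteration.
declarative = match(["Teacher", "Section", "Student"])

# Equivalent hand-written nested iteration over the same data.
nested = []
for t in sorted(classes):
    if classes[t] != "Teacher":
        continue
    for s in sorted(classes):
        if classes[s] != "Section" or not associated(t, s):
            continue
        for st in sorted(classes):
            if classes[st] == "Student" and associated(s, st):
                nested.append((t, s, st))
```

Both forms return the same chains, but the pattern form grows by one list element per additional class, whereas the loop form grows by one nesting level.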

2.2 Structural and Behavioral Modeling

As an extension to relational, semantic, and object-

oriented data models, knowledge rules have been incorporated

into many research works in next-generation data-base systems

such as HiPAC [CHA89], ODE [AGR89], OSAM* [SU89a,b], Postgres

[STO91], and Starburst [LOH91]. However, these models do not

provide facilities for explicitly modeling method

implementations at any level of detail.

The object-oriented data model provides a uniform framework

by encapsulating both the structural properties and part of

the behavioral properties (in terms of signature

specifications of methods) of a target system into object


classes. Nevertheless, the implementation part of each method

is still left as a blackbox and cannot be further modeled.

Because the specification of methods does not carry enough

behavioral information, the implementation is often prone to

errors. Several research projects have been done in an effort

to provide an integrated diagram notation for static and

dynamic aspects of software systems. Both Kung [KUN89] and

Markowitz [MAR90] tried to combine ER data-model and data-flow

oriented process specification as a single graphic design tool

for conceptual modeling. However, they do not explicitly model

process implementations. Besides, because behavioral properties

(processes) are not incorporated into an object-oriented

framework, these approaches cannot take advantage of object-oriented

features such as inheritance or of object-oriented data-base

system support.

Brodie and Ridjanovic [BRO83] proposed ACM/PCM (Active

and Passive Component Modeling) methodology for structural

and behavioral modeling of data-base applications using an

integrated object/behavior schema. Three types of control

abstractions (sequence/parallel, choice, and repetition) are

used to represent the behavioral relationships between an

operation and its constituent operations. Since behavioral

properties are explicitly modeled only at a gross level of

detail by relating operations to form high-level, composite

operations, there is not enough information for the behavioral


schema to be executable and evolve into the target system at

the implementation level.
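The three control abstractions named above (sequence, choice, and repetition) can be pictured as a tree that relates a composite operation to its constituent operations. The Python sketch below is not ACM/PCM notation. The class names, the execute function, and the order-processing example are hypothetical. It also suggests why detail matters: the tree becomes executable only because each primitive operation carries an implementation.

```python
from dataclasses import dataclass
from typing import Callable, List, Union

State = dict


@dataclass
class Primitive:
    name: str
    run: Callable[[State], None]   # the implementation detail of the leaf


@dataclass
class Sequence:
    steps: List["Operation"]


@dataclass
class Choice:
    test: Callable[[State], bool]
    if_true: "Operation"
    if_false: "Operation"


@dataclass
class Repetition:
    test: Callable[[State], bool]
    body: "Operation"


Operation = Union[Primitive, Sequence, Choice, Repetition]


def execute(op: Operation, state: State, trace: List[str]) -> None:
    """Walk the operation tree, recording each primitive that runs."""
    if isinstance(op, Primitive):
        op.run(state)
        trace.append(op.name)
    elif isinstance(op, Sequence):
        for step in op.steps:
            execute(step, state, trace)
    elif isinstance(op, Choice):
        execute(op.if_true if op.test(state) else op.if_false, state, trace)
    elif isinstance(op, Repetition):
        while op.test(state):
            execute(op.body, state, trace)


ship = Primitive("ship_item", lambda s: s.update(
    pending=s["pending"] - 1, shipped=s["shipped"] + 1))
bill = Primitive("bill", lambda s: s.update(billed=True))
reject = Primitive("reject", lambda s: s.update(rejected=True))

process_order = Choice(
    test=lambda s: s["credit_ok"],
    if_true=Sequence([Repetition(lambda s: s["pending"] > 0, ship), bill]),
    if_false=reject,
)

state = {"credit_ok": True, "pending": 2, "shipped": 0}
trace: List[str] = []
execute(process_order, state, trace)
```

If the `run` implementations were omitted, the tree would still document how process_order decomposes, but it could not be executed or evolved into a working system.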

Kappel and Schrefl [KAP91] proposed object/behavior

diagrams as a uniform graphic representation of object

structure and behavior based on a semantic data model and

Petri nets. Behavior diagrams are split into (i) life-cycle

diagrams, which identify possible update operations and their

possible execution sequences with synchronization constraints,

(ii) activity specification diagrams, which represent method

specifications, and (iii) activity realization diagrams, which

represent method implementations at any level of detail.

Though closely related to our work, the object/behavior

diagram is more of a graphic design tool than a formal

knowledge model. Because there is no kernel model to model

object/behavior diagrams themselves, software systems

represented by these diagrams cannot be uniformly modeled and

managed by some underlying KBMS. For example, a user will not

be able to query a data base about the structural and

behavioral properties of objects.

2.3 KBMS-supported Software Development System

The research in KBMS-supported software development is

closely related to our work as it incorporates some data-base

and knowledge-base technologies to support the development of

software systems.


The CHI system [SMI85] and TI project [BAL81,85, PAR83]

are knowledge-based programming systems supported by a main-

memory knowledge base and a wide-spectrum language called V

and GIST, respectively, to express all stages of the program

development process. V is used to describe conceptual models

of domains, formal requirements and specifications, programs,

derivation histories, transformation rules, relationships

between objects, and properties of objects. V covers high-

level program specification to low-level control constructs

as well as programming language knowledge (synthesis rules,

synthesis plans, and constraints on programs). V also serves

as the query and access language of the knowledge base. The

expressive capabilities of GIST include historical reference

to past process states, constraints, demons (asynchronous

process responding to defined stimuli), and a relational and

associative data model. The compilers of both V and GIST are

transformational systems and require human assistance to

perform complex implementation steps (they are not

automatically translatable into efficient code). Besides, both

works are supported by some in-memory knowledge base of

programming rules instead of a full-fledged and well-

integrated KBMS for persistency, secondary storage management,

as well as uniform modeling and management of the software

development environment.

The DAIDA project [JAR90, CHU91] is also a knowledge-

based software development system. It uses three different


languages for the description of the software at each stage:

(i) a temporal knowledge representation language--Telos--for

requirement specification and domain analysis, (ii) a design

language--TDL--for identifying and specifying the data and

procedural components of the system, and (iii) a data-base

programming language--DBPL--(which is an extension of

Modula-2) for implementation. Meta information such as design

decisions and rules can also be modeled in Telos, which is

supported by a KBMS called ConceptBase. From a software

development point of view, there are two problems with DAIDA.

First, instead of using a single wide-spectrum language for

both the specification and implementation, DAIDA uses

different languages at different stages. Second, while Telos

and TDL are based on an object-oriented data model, the

implementation language DBPL is based on a relational data

model and therefore a mapping from object-oriented design is



3.1 Knowledge Abstractions

3.1.1 Classes

We use classes as the knowledge definition facilities to

classify objects by their common structural and behavioral

properties in an integrated fashion. Classes are categorized

as entity classes (E_Class) and domain classes (D_Class). The

sole function of a domain class is to form a domain of

possible values from which descriptive attributes of objects

draw their values. Both primitive domain classes (e.g.,

integer, real, and string) and complex domain classes (e.g.,

date and address) are supported in K. An entity class, on the

other hand, forms a domain of objects that occur in an

application's world and can be physical entities, abstract

things, functions, events, processes, and relationships. The

structural properties of each object class (called the

defining class) and thus its instances are uniformly defined

in terms of its structural associations (e.g., aggregation and

generalization [SMI77]) with other object classes (called the

constituent classes). Each type of structural association

represents a set of generic rules that govern the knowledge-


base manipulation operations on the instances of those classes

that are defined by the association types. Functional

associations between object classes can also be specified by

such association types as "friend" [STR86] and "using" [BOO90]

to facilitate "programming in the large", as will be described

in Chapter 4. Manipulation of the structural properties of an

object instance is done through methods, and the execution of

methods is automatically governed by rules to maintain the

system in a consistent state or to trigger some pre-defined

actions when certain conditions become true. In other words,

the behavioral properties of each object class are defined as

methods and rules applicable to the instances of this class.

Since rules applicable to the instances of a class are defined

with the class, rules relevant to these instances are

naturally distributed and available for use when instances are

processed. The procedural information (algorithm) of methods

can be explicitly modeled using control associations, which

will be described in Chapter 7. Structural associations,

functional associations, and control associations are all

called "class associations" as each of them models the

relationships between the defining class and constituent

classes. A schema is defined as a set of class associations.

In general, a class definition consists of an association

section, a method section, a rule section, and an implementation


section (which contains the actual codes that implement the

methods specified in the method section) as follows:

entity class | domain class is

[associations: association_definition_statements]

[methods: method_definition_statements]

[rules: rule_definition_statements]

[implementation: method_implementation_statements]

end ;

Note that in this dissertation, we use square brackets to

denote optional occurrence of the enclosed construct, and

curly brackets to denote an arbitrary number (possibly zero)

of the occurrences of the enclosed construct. A sample entity

class definition of Student is given in Figure 3.1 to

illustrate the skeleton of a class definition. A detailed

description will be given in later chapters. Note that

while the implementation section is separated from the specification

sections as in abstract data types, it is still physically

part of the class definition for the following reasons. First,

we want to enforce modularity at the granularity of classes

and therefore to reduce the overhead of file management and

code generation. In other words, each class definition must

reside in exactly one file instead of one or more files as in

C++. Second, we want to replace the operating system concept

of files by the logic level concept of classes as much as

possible and eventually hide file management totally from the


user's point of view. When a complete programming environment

is available, all the specification and implementation of a

class will be done via a graphic user interface instead of by

editing files explicitly, and each class definition will be

translated into a corresponding K file internally for


3.1.2 Objects and Instances

Objects are categorized as domain class objects

(D_Class_Object) and entity class objects (E_Class_Object).

Domain class objects are self-named objects that are referred

to by their values. Entity class objects are system-named objects

each of which is given a unique object identifier (oid). We

adopt a distributed view of entity class objects to support

generalization and inheritance as in Lam and Alashqur [LAM89a]

and Yassen et al. [YAS91] by visualizing an instance of class

"X" as the representation (or view) of some object in class

"X." Each object can be instantiated (as an instance) in

different classes with different representations but with the

same oid. Each instance is identified by a unique instance

identifier (iid), which is the concatenation of cid and oid,

where cid is a unique number assigned for each class in the

system and is defined as the type of this instance. Given an

iid, the system will use its (i) cid part to refer to a

particular class, and (ii) oid part to refer to the

representation of a particular object in this class. For two


entity classes "A" and "B," if class "A" is a superclass of

class "B" (generalization association), then for each object

that has an instance in class "B," it must also have an

instance in class "A." Both instances have the same oid and

are conceptually connected by a generalization association

link. The advantages of the distributed view of objects are

four-fold. Firstly, as inheritance is handled by the object

manager at the KBMS layer rather than being built into the

storage layer, we achieve a higher level of abstraction and

better system independence. For example, it is possible to

replace the storage layer with a relational data-base engine

by modifying the mapping from the KBMS layer to the storage

layer without affecting the language layer. It is also

possible to extend the knowledge model at the abstraction

layer by modifying the mapping from the abstraction layer to

the KBMS layer without affecting the storage layer. Secondly,

we have more flexibility in supporting multiple

representations of objects. For example, one object can span

more than one generalization hierarchy path. At the

implementation level, it is easier to delete an instance

without extra copying of data or changing of addresses. We also

have the flexibility to cluster objects either by "oid" or by

class. Thirdly, it is more efficient on scan-based selection

as the size of each object is significantly reduced by

partitioning it into different instances. Fourthly, as the

manipulation of objects is done at the instance level, static


type checking can be easily supported. Each entity class is

associated with an extension, which is the set of all its


3.1.3 Encapsulation and Inheritance

We adopt the C++ three-level information hiding mechanism

[STR86] by classifying aggregation associations (which are

expressed as attributes, data members, or instance variables

in other object-oriented programming languages to describe the

state of an object instance) and methods as either "public,"

"private," or "protected." Note that all the rules are treated

as "protected" by definition. Public properties can be

accessed (i.e., visible) in the implementation of any class

in the system, while private properties can only be accessed

in the implementation of the defining class and any "friend"

of the defining class (specified by the "friend" association,

which will be described in Chapter 4). Protected properties

are similar to private properties except that they can also

be accessed in the implementation of any subclass of the

defining class (i.e., subclass-visible). At the class level,

all the (i) public and protected aggregations, (ii) public and

protected methods, and (iii) rules defined by a class are

inherited by its subclasses. At the instance level, an

instance of entity class "A" stores only the attributes

defined for "A," and it inherits (i.e., gets access to) all

the public and protected attributes from its corresponding


instances (with the same oid) of all the superclasses of "A."

Name conflict in multiple inheritance is resolved by requiring

the user to (i) specify from which superclass a particular

property is inherited, or (ii) cast the type (i.e., the cid

part of an iid) of an instance to that specified superclass

to refer to the corresponding instance explicitly as will be

discussed in Section 3.4.
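
The visibility rules described above can be summarized as a single predicate. The following Python sketch is only an illustration and is not part of K or its compiler: the names `Visibility`, `ClassInfo`, and `can_access` are hypothetical, and the "friend" association is modeled simply as a set of class names granted access by the defining class.

```python
from dataclasses import dataclass, field
from enum import Enum


class Visibility(Enum):
    PUBLIC = "public"
    PROTECTED = "protected"   # rules are treated as protected by definition
    PRIVATE = "private"


@dataclass
class ClassInfo:
    """Hypothetical schema record for one class (not K's actual dictionary)."""
    name: str
    superclasses: list = field(default_factory=list)  # immediate superclasses
    friends: set = field(default_factory=set)         # names of friend classes

    def is_subclass_of(self, other: "ClassInfo") -> bool:
        """True if `other` is an immediate or non-immediate superclass."""
        return any(s is other or s.is_subclass_of(other)
                   for s in self.superclasses)


def can_access(accessor: ClassInfo, definer: ClassInfo, vis: Visibility) -> bool:
    """May the implementation of `accessor` use a property defined by `definer`?"""
    if vis is Visibility.PUBLIC:
        return True
    if accessor is definer or accessor.name in definer.friends:
        return True  # private and protected: the defining class and its friends
    # Protected only: also visible in any subclass of the defining class.
    return vis is Visibility.PROTECTED and accessor.is_subclass_of(definer)
```

Under these assumptions, a class TA defined as a subclass of Student can use the protected properties of Student but not its private ones, while a class named as a friend of Student can use both.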

3.2 Model Extensibility and Reflexivity

Model extensibility is achieved via a reflexive kernel

model shown in Figure 3.2 in which all the data model

constructs described above such as classes, associations,

methods, and rules are modeled as first-class objects. One can

extend the data model by modifying this set of meta classes.

This kernel model also serves as the data dictionary as all

the object classes in the system are mapped into this class

structure. One can therefore browse and query any user-defined

schema as well as the dictionary uniformly. Note that the

kernel class structure is in turn mapped into a C++ class

structure at the implementation level that is hidden from KBCs

(Knowledge-Base Customizers, who customize the data model for

specific applications), KBAs (Knowledge-Base Administrators,

who use the model to model an application domain), and end

users. Note that Figure 3.2(a) illustrates the overall

generalization lattice, and Figure 3.2(b) shows the detailed

structural relationships among those kernel object classes,


as we will describe in the following sections. In our graphic

schema notation, (i) entity classes and domain classes are

represented as rectangular nodes and circular nodes,

respectively, (ii) a generalization association is represented

by a "G" link from a superclass to a subclass, and (iii) an

aggregation association is represented by an "A" link from the

defining class to a constituent class. Note that the root

class "Object" is represented by a special notation because

it is neither an entity class nor a domain class. The sole

function of class "Object" is to serve as the collection of

all the objects in the system. After compilation, any user-

defined class (e.g., "Person" and "Student" in Figure 3.2(a))

will be added to the class structure as an immediate or

non-immediate subclass of either "E_Class_Object" or

"D_Class_Object," while at the same time the objects

corresponding to the class definition, associations, methods,

and rules of the defining class will be created as instances

of the system-defined entity classes named "Class,"

"Association," "Method," and "Rule," respectively. Note that

this class structure is reflexive in the sense that we use the

model to model itself. For example, while any user-defined or

system-defined entity class is a subclass of "E_Class_Object,"

"E_Class_Object" itself is also an entity class (represented

by a rectangular node). Similarly, "D_Class_Object" itself is

also a domain class.
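
The idea that schema constructs are themselves stored as objects, so that the dictionary can be queried like any other data, can be illustrated with a small Python sketch. It is an assumption-laden toy rather than the kernel model of Figure 3.2: `MetaObject`, `Dictionary`, `add`, and `query` are hypothetical names, and the four meta classes are recorded, as a simplification, as direct subclasses of "E_Class_Object."

```python
class MetaObject:
    """A dictionary entry; every schema construct is stored as one of these."""

    def __init__(self, kind, name, **props):
        self.kind = kind    # name of the meta class this entry is an instance of
        self.name = name
        self.props = props


class Dictionary:
    """Toy data dictionary in which the kernel classes describe themselves."""

    def __init__(self):
        self.entries = []
        # Reflexivity: the meta classes are themselves entries of kind "Class".
        for meta in ("Class", "Association", "Method", "Rule"):
            self.add("Class", meta, superclass="E_Class_Object")

    def add(self, kind, name, **props):
        entry = MetaObject(kind, name, **props)
        self.entries.append(entry)
        return entry

    def query(self, kind, **where):
        """Names of entries of `kind` whose properties match `where`."""
        return [e.name for e in self.entries
                if e.kind == kind
                and all(e.props.get(k) == v for k, v in where.items())]
```

With such a structure, a question such as "which methods are defined by Student" and a question such as "which classes exist in the kernel" are answered by the same `query` call, which is the uniformity that the reflexive model is meant to provide.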


As any application domain (including the model itself)

is uniformly modeled and mapped into the kernel model, the

class structure can be further extended at any level of

abstraction. For example, one can use the kernel model to

incrementally extend the model itself by either (i) adding new

structural association types or introducing subtypes of

existing association types (e.g., "Interaction,"

"Composition," and "Crossproduct" [SU89b]) by specifying their

structural properties (in terms of existing structural

association types) and behavioral properties (in terms of

generic rules that govern the knowledge-base manipulation

operations on the instances of those classes defined by the

association types) or (ii) extending the definition of

existing association types (e.g., add new attributes

"default_value," "null_value," "optional," "unchangeable,"

and "dependent" [SHY91], as well as their corresponding

generic rules for the association type "Aggregation") so that

more semantics can be captured in the schema and maintained

by the KBMS instead of being buried in application codes. Once

a new association type is defined, it becomes a semantic

construct of the extended data model and can be used in the

definition of any object classes (including any other new

association type). In such a way, the data model itself can

be incrementally extended to meet the requirements of various

application domains.


3.3 Persistence

As a data-base programming language, K must support

persistence so that objects can live after the execution of

a program is terminated. Persistence in K is based on the

following rationales:

(1) Persistence is orthogonal to classes as in Exodus/E

[RIC87], ODE/O++ [AGR89a,b], ONTOS [ONT90], and OQL[X]

[BLA90], i.e., persistence is an instance property rather than

a class property. Any subclass of "E_Class_Object"

automatically inherits the persistence mechanism and it is up

to the user to specify each of its instances to be either

persistent or transient using the pnew or new operator when

creating a new object, respectively. By specifying an object

instance in a particular class to be persistent, we

automatically make all the instances of that object

persistent. A transient entity class object is similar to a

dynamically allocated heap object in C++, which exists in the

main memory until explicitly deleted or when the program

execution is terminated. Domain class objects can be

persistently stored in the data base only if they serve as

attribute values of some persistent objects. Note that

dangling references might exist after program execution is

terminated if some transient entity class instances are

referred to by persistent entity class instances. Following the

same philosophy of Richardson and Carey [RIC87], Agrawal and


Gehani [AGR89], and Blakeley [BLA90], K also assumes that it

is the user's responsibility to take care of this problem. In

a later version of K, we will consider solving this problem

by having the object manager delete all the transient

object instances (and thus all the associations with them from

persistent entity class instances) before closing the

knowledge base.

We do not further classify entity classes as persistent

entity classes and transient entity classes as proposed in

Richardson and Carey [RIC87] and Blakeley et al. [BLA90] for

the following reasons. Firstly, based on the rationale that

persistence is an instance property rather than a class

property, it is reasonable to make any distinction only at the

instance level rather than at the class level so that

persistence can be strictly orthogonal to classes. Secondly,

since we already provide the pnew and new operators, the

classification of persistent and transient entity classes

becomes unnecessary and would only reduce flexibility and add

management overhead.

(2) Persistence is orthogonal to oid/iid and any physical

address. As a high-level programming language, K provides the

user with exactly the same view of objects as the conceptual

knowledge model. Unlike other C++-based persistent languages

mentioned above, which mix the low-level concepts of pointers

(main memory and disk address) with logical oids to improve

efficiency, K uses (i) oids for entity class objects and iids


for their instances, and (ii) values for domain class objects,

no matter whether these objects are transient or persistent.

The advantages are three-fold. Firstly, the language provides

a better object-oriented flavor and data abstraction than C++-

based languages by enabling the users to (i) manipulate

objects at the logical level instead of going to the physical

level by following pointers, and (ii) navigate through the

data-base using oids/iids instead of pointer chasing.

Secondly, the concept of multiple representations (views) of

objects described in Section 3.1 is well supported at the

logic level and clearly separated from the low level

implementation. Thirdly, system-defined oid (and iid) is

independent of physical address and thus (i) enforces the

immutability and uniqueness requirements of identity [KHO86]

and (ii) makes persistence orthogonal to oids/iids. As a

result, unlike in most C++-based persistent languages

[AGR89a,b, RIC87,90, BLA90] that put extra burden of managing

two or three types of pointers (persistent, transient, and/or

dual pointers) on the users, persistence in K is transparent

to the user in the way of declaring an entity class instance

variable. The reason is that such a variable in K will be

bound to iid, which is independent of any physical location

or persistence property.

(3) Persistence is orthogonal to queries and any object

manipulation. Since persistence is a property that makes a

difference only after the program execution is terminated, we


feel that there should be no difference in queries and

manipulation of persistent and transient objects. For example,

a selection query over an entity class should return both its

persistent and transient instances as long as they satisfy the

selection condition as in Blakeley et al. [BLA90]. There is

also no difference in object manipulation because all the

entity class objects are manipulated using oid/iid, which is

independent of persistence property as described in the second


The overhead of implementing persistence in K is that for

each entity class, a persistent object table "PersTable"

(which itself is also persistent) and a transient object table

"TransTable" (which itself is also transient) are needed for

mapping oids of its persistent and transient object instances

to their main memory addresses, respectively. Note that the

address of a persistent object instance will be a special

value denoted by "INACTIVE" if it is not in the main memory.

During run time, the persistent object tables will be copied

into the main memory for fast retrieval. If a persistent

object instance is needed and its address is "INACTIVE," the

object manager will automatically call the ONTOS function

"OC DirectActivateObject" to retrieve it into the main memory.

A performance penalty is introduced due to this one-level

indirection of object references. Object tables are currently

implemented by using the indexed dictionaries to speed up this

process [ARR91]. The object manager will be responsible for


maintaining both object tables corresponding to various method

calls by updating these tables accordingly.
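
The one-level indirection through an object table can be sketched as follows. This Python fragment is a minimal illustration under stated assumptions, not the actual object manager: `ObjectTable`, `register`, and `lookup` are hypothetical names, and the `activate` callback merely stands in for the storage manager's activation routine (it is not the ONTOS interface).

```python
INACTIVE = object()  # sentinel: the instance is on disk, not in main memory


class ObjectTable:
    """Maps oids to in-memory instances for one entity class (illustrative)."""

    def __init__(self, activate):
        # `activate` is a hypothetical callback that takes an oid and returns
        # the instance after bringing it into main memory.
        self._activate = activate
        self._slots = {}  # oid -> instance, or INACTIVE

    def register(self, oid, instance=INACTIVE):
        """Record an oid; by default its instance is not yet in memory."""
        self._slots[oid] = instance

    def lookup(self, oid):
        """Resolve an oid, activating the instance on first use."""
        instance = self._slots[oid]
        if instance is INACTIVE:
            instance = self._activate(oid)
            self._slots[oid] = instance  # later lookups skip the activation
        return instance
```

Every reference pays for one dictionary lookup, which corresponds to the performance penalty mentioned above, but an inactive instance is fetched only once and only when it is actually needed.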

3.4 Type System

Unlike conventional object-oriented programming

languages, K directly manipulates objects at the distributed

instance level as described in Section 3.1, and instance

identity is used as a primitive for type management. In order

to support the design principle of static type checking, we

incorporate a strong notion of typing into the knowledge model

by defining the type of an instance as the class to which this

instance belongs. Every variable in K will be bound to some

instance and therefore must be declared to have a type. Every

expression in K will be evaluated to return an instance and

thus also has a type that will be identical to the type of

the returned instance. Note that static type checking is

supported in K because (i) K directly manipulates instances

rather than objects, and (ii) the type of an instance is

fixed. Some programming constructs for achieving the same

effect as late binding in conventional object-oriented

programming languages without sacrificing static type checking

will be discussed in Chapter 4.

The type of an expression in general can be detected by

the system type checker by textual inspection (static type

checking) to decide the type compatibility and thus prevent

an operation from being applied to a value of an inappropriate


type. Type compatibility means that a variable of type X can

only be assigned expressions that represent instances of class

X or any subclass of X. In the latter case, the system will

automatically convert, at run time, the type of the instance

returned by the right-hand-side expression to class X, so that

it is referred to as an instance of class X.

Method parameters and returned values are checked against the

method signature following the same rule as above. An instance

variable ranging over an entity class will be bound to some

iid from which the system uses its (i) cid part to refer to a

particular class (which is the type of this instance), and

(ii) oid part to refer to the representation of the particular

object in this class. Conceptually, an entity class object can

dynamically gain/lose types during program execution when its

corresponding instances are created/deleted as we will

illustrate in Section 3.5. An instance variable ranging over

a domain class will be bound to some value.
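
The assignment rule for entity class variables can be illustrated with a short Python sketch. Several simplifications are assumed: the check is performed at run time here, whereas K performs it statically; class names are used as cids, although the cid is a number in the system described above; and `SUPERCLASSES`, `Iid`, `is_subclass`, and `assign` are hypothetical names introduced only for this example.

```python
from typing import NamedTuple

# Hypothetical schema: class name -> list of immediate superclasses.
SUPERCLASSES = {
    "Person": [],
    "Student": ["Person"],
    "Employee": ["Person"],
    "TA": ["Student", "Employee"],
}


class Iid(NamedTuple):
    cid: str  # class part: the (fixed) type of this instance
    oid: int  # object part: identifies the object being represented


def is_subclass(cls: str, ancestor: str) -> bool:
    """True if `ancestor` is `cls` itself or a (transitive) superclass of it."""
    return cls == ancestor or any(is_subclass(s, ancestor)
                                  for s in SUPERCLASSES[cls])


def assign(declared_type: str, value: Iid) -> Iid:
    """Check an assignment and return the iid the variable is bound to."""
    if not is_subclass(value.cid, declared_type):
        raise TypeError(f"{value.cid} is not compatible with {declared_type}")
    # Upward conversion: same oid, viewed as an instance of the declared class.
    return Iid(declared_type, value.oid)
```

Assigning a TA instance to a Student variable therefore yields an iid with the cid of Student and the unchanged oid, while assigning a Person instance to a TA variable is rejected.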

If the type checker is not able to ascribe a type to an

expression, the user must use the cast operator '$' to specify

the type, written as a class name followed by '$' and the expression. The cast operator

is useful for the user to temporarily convert the type of an

expression or refer to different representations of the same

entity class object in different classes. For example,

real$(3+4) asserts that the type of (3+4) is real instead of

integer. Similarly, to resolve any name conflict in multiple

inheritance, one must specify from which superclass a


particular property is inherited by casting the type of an

expression to that specified superclass to refer to the

corresponding instance explicitly. For example, suppose both

classes Student and Employee define a method called

"evaluate," and both classes have class TA as a subclass. To

apply "evaluate" to a TA instance "t," one must use either

"Student$t.evaluate()" or "Employee$t.evaluate()" for the

system to unambiguously apply the correct method to

corresponding Student instance or Employee instance of "t,"

respectively. Note that in the cases when no name conflict

occurs, the system will automatically find the appropriate

superclass and perform the casting to support inheritance. In

other words, inheritance at run time is supported by casting

an instance of class "X" to be an instance of class "Y" (which

is a superclass of class "X") before accessing a property

defined by class "Y."
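
The resolution of an inherited method, including the case in which a cast is required, can be sketched in Python as follows. This is an illustration under assumptions and not the K compiler's algorithm: `SCHEMA`, `definers`, and `resolve` are hypothetical names, and the `cast_to` argument plays the role of the class named before '$'.

```python
from typing import Optional

# Hypothetical schema: class -> (immediate superclasses, locally defined methods).
SCHEMA = {
    "Person":   ([], {"name"}),
    "Student":  (["Person"], {"evaluate"}),
    "Employee": (["Person"], {"evaluate"}),
    "TA":       (["Student", "Employee"], set()),
}


def definers(cls: str, method: str) -> set:
    """Nearest classes, at or above `cls`, that define `method`."""
    supers, methods = SCHEMA[cls]
    if method in methods:
        return {cls}
    found = set()
    for sup in supers:
        found |= definers(sup, method)
    return found


def resolve(cls: str, method: str, cast_to: Optional[str] = None) -> str:
    """Return the class whose `method` applies to an instance of `cls`."""
    candidates = definers(cast_to or cls, method)
    if len(candidates) != 1:
        raise LookupError(
            f"{method!r} is ambiguous or undefined for {cls}: {sorted(candidates)}")
    return candidates.pop()
```

In this sketch, "name" resolves without a cast because both paths from TA lead to the same defining class, whereas "evaluate" is reported as ambiguous until the caller names Student or Employee.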

For primitive domain classes, type conversion can only

be done (i) from character to integer or string, (ii) from

integer to real, and (iii) from real to integer. For complex

domain classes, only upward conversion is allowed, i.e., a

value from domain class X can be converted to a value of any

superclass of X, but not vice versa. On the other hand, there

is no restriction for type conversion of entity class

instances because we allow an entity class object to have

different representations in different classes. The advantage

of this approach is more flexibility in modeling an


application domain, especially for the purpose of prototyping.

For example, before class Robot is defined (because its

properties are not finalized), one can still create an object

as both an instance of Person and an instance of Machine to

prototype its structural and behavioral properties.

Note that while upward casting to superclass is always

safe (an instance in class X guarantees a corresponding

instance with the same oid in every superclass of X because

of generalization association), null value might be returned

in other cases if there does not exist any corresponding

instance in the target class. In other words, one can use the

cast operator to test if an instance of class X also has a

corresponding instance in class Y. Also note that we do not

allow the use of casting to bypass and thus corrupt the

information hiding mechanism described in Section 3.1. In

other words, if property P of class A is not visible in class

B, then the expression A$x.P (where x is an entity class

instance variable) used in the implementation of class B will

be detected by the compiler as a semantic error.
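
The use of a cast as a membership test can be illustrated with a few lines of Python. The sketch assumes that each class keeps the set of oids that have an instance in it; `EXTENSIONS`, `Iid`, and `cast` are hypothetical names, class names stand in for cids, and Python's `None` plays the role of the null value.

```python
from typing import NamedTuple, Optional


class Iid(NamedTuple):
    cid: str  # class part
    oid: int  # object part


# Hypothetical extensions: class name -> oids that have an instance there.
EXTENSIONS = {
    "Person": {1, 2},
    "Student": {1, 2},
    "TA": {1},
    "RA": set(),
}


def cast(target: str, iid: Iid) -> Optional[Iid]:
    """Model of a cast: the same oid viewed in `target`, or None (null)."""
    if iid.oid in EXTENSIONS[target]:
        return Iid(target, iid.oid)
    return None
```

An upward cast always succeeds because the generalization association guarantees the superclass instance, while a cast to a sibling or subclass returns null when the object has no representation there. Visibility checking, which K performs at compile time, is outside the scope of this sketch.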

3.5 KBMS Operations

Entity class objects are directly manipulated at the

instance level in K. After an entity class is defined, we can

insert instances into this class. An instance can be created

from scratch by using the "new" or "pnew" operators followed

by a class name and a parenthesized list of attribute values to create a new


transient or persistent object along with an instance of this

object in the specified class, respectively. Figure 3.3

illustrates the basic KBMS operations and object/instance

concept using a simple schema where class "Student" is a

subclass of "Person" and a superclass of both "TA" and "RA."

Statements (1) and (2) first create two new objects with

their oids, insert their instances in class Person and class

Student, and return the iids to variables "p" and "s1,"

respectively. Note that by inserting a Student instance, the

system object manager automatically inserts a corresponding

Person instance (with the same oid) because of the

generalization association. The difference between the

instances referred to by "s1" and "p" is that the former is a

persistent instance and therefore any modification will be

written to the data-base while the latter is a transient

instance that resides only in main memory. Note that while

primitive domain class instances (e.g., integer and string)

can be directly referred to by their values, complex domain class

instances (e.g., date) are written as the class name followed by a

parenthesized list of attribute values, as shown in statement

(1). The creation of entity class object instances can also

be nested within the creation of other entity class object

instances as shown in statement (2). Statement (3) then

inserts a new TA instance of the object referred to by variable

"p" (assume we learn that this person is a TA) and returns the

iid to variable t. Note that (i) no new object is created and


the iid returned to t has the same oid as "p," and (ii) by

inserting a TA instance of the object referred to by "p," the

system object manager automatically inserts a Student instance

of the same object. Similarly, statement (4) inserts another

instance of the same object referred to by "p" in class RA and

returns the iid to variable "r." Statement (5) casts the TA

instance referred to by "t" as a Student instance whose iid is

then assigned to "s2." In statement (6), we delete the TA

instance "t" and the object manager will automatically delete

all the instances of the object referred to by "t" in all the

subclasses of TA (if any) following the generalization

association. This can be done by (i) identifying all the

immediate subclasses of TA that contain an instance of the

specified object, and (ii) performing the delete operation

recursively on the corresponding instances from these

subclasses. All the references (association links) to these

deleted instances from other object instances will be

automatically removed to maintain the referential integrity

constraint. Note that for this particular object, even though

it lost its instance in TA, it still has its instances in RA,

Student, and Person. For example, we can use the variable "s2"

from statement (5) to refer to its Student instance. In other

words, we allow an object to have different representations

in different classes (even spanning more than one branch of the

generalization lattice), and we can insert or delete these

representations dynamically. This is a property that cannot


be expressed in most object-oriented data-base programming

languages except those systems that support multiple views of

objects (e.g., Aspect [RIC91], IRIS [FIS87, WIL90a, ANN91],

and Clovers [STE89]). Note that we achieve this flexibility

without losing the advantages of static type checking by (i)

explicitly making the distinction between objects and

instances and (ii) directly manipulating instances rather than

objects. In statement (7), a destroy statement will

automatically delete all the instances of the object referred to

by variable "p." Note that it is impossible to restrict an

object to have instances along only one branch of the

generalization lattice when multiple inheritance is allowed.

Therefore, a "destroy" operation must be done by performing

a "delete" operation on the corresponding instance from the

root class E_Class_Object. A prototype implementation of the

object manager has been reported in Arroyo [ARR92].
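
The insert, delete, and destroy operations described in this section can be replayed with a toy object manager. The Python sketch below is a simplified model under stated assumptions and not the prototype reported in [ARR92]: `ObjectManager`, `Iid`, and `classes_of` are hypothetical names, class names stand in for cids, persistence (the distinction between new and pnew) is ignored, and the removal of association links to deleted instances is omitted.

```python
from typing import NamedTuple


class Iid(NamedTuple):
    cid: str  # class part
    oid: int  # object part


class ObjectManager:
    """Toy model of the distributed view: one instance per (class, oid) pair."""

    def __init__(self, superclasses):
        self.superclasses = superclasses                   # class -> immediate superclasses
        self.extension = {c: set() for c in superclasses}  # class -> oids present
        self._next_oid = 1

    def new(self, cls):
        """Create a fresh object and its instance in `cls` and all superclasses."""
        oid = self._next_oid
        self._next_oid += 1
        return self.insert(Iid(cls, oid), cls)

    def insert(self, iid, cls):
        """Add an instance of an existing object to `cls`; superclasses follow."""
        self.extension[cls].add(iid.oid)
        for sup in self.superclasses[cls]:
            self.insert(iid, sup)
        return Iid(cls, iid.oid)

    def delete(self, iid):
        """Remove the instance and, recursively, its instances in subclasses."""
        self.extension[iid.cid].discard(iid.oid)
        for cls, supers in self.superclasses.items():
            if iid.cid in supers and iid.oid in self.extension[cls]:
                self.delete(Iid(cls, iid.oid))

    def destroy(self, iid):
        """Remove every instance of the object by deleting from the root class."""
        self.delete(Iid("E_Class_Object", iid.oid))

    def classes_of(self, oid):
        """Classes in which the object currently has an instance."""
        return {c for c, oids in self.extension.items() if oid in oids}
```

Replaying statements (1) through (7) of Figure 3.3 against a schema in which TA and RA are subclasses of Student shows the behavior described above: after the TA instance is deleted, the object still has instances in RA, Student, and Person, and only the destroy operation removes all of them.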

entity class Student is
  specialization of Person;  /* Student is a subclass of Person */
  friend of Faculty;         /* authorize Faculty to access the private and protected properties */
  aggregation of
    public:                  /* definition of public attributes */
      enroll: set of Course;                  /* a student can enroll in a set of courses */
      college_report: array [4] of GPA_Value; /* annual report of every college year */
      major: Department;
    protected:               /* definition of protected attributes */
      S#: S#_Value;

  methods:                   /* the signature of methods */
    method eval_GPA() : GPA_Value;
    method suspend() : void;                  /* no return value */
    method inform_all_instructor() : void;

  rule CSrule1 is
    /* after updating the major of a student, if the new major is "CIS"
       then the GPA of this student must be greater than 3.0,
       otherwise we suspend this student */
    triggered after update major
    condition (this.major.name = "CIS" | this.eval_GPA() > 3.0) /* guarded condition */
    otherwise this.suspend()
  end CSrule1;

  rule Student::General_rule is
    /* after suspending a student, if this student enrolls in any course,
       then inform all the instructors of this student */
    triggered after suspend()
    condition exist c in this *>[enroll] c:Course /* existential quantifier */
    action this.inform_all_instructor()
  end General_rule;

  implementations:           /* actual coding of methods */
    method eval_GPA() : GPA_Value is
      local s1, s2 : real := 0; GPA : GPA_Value; /* local variable declarations */
      context this *<[student] t:Transcript *>[course] c:Course /* looping over a context */
      do begin                                  /* for each tuple, do the following */
        s1 := s1 + c.credits * t.grade_point;   /* calculate the accumulated grade points */
        s2 := s2 + c.credits;                   /* calculate the accumulated credit hours */
      end context;                              /* end of context looping */
      GPA := s1/s2;
      if GPA < 2.0
        then "GPA Below 2.0".display();
      end if;
      return GPA;
    end eval_GPA;
end Student;

Figure 3.1 The Class Definition of Entity Class Student in K




[Diagram omitted; legible class labels: Container, Real]

Figure 3.2 (a) Class Generalization Lattice of the
Extensible Kernel Model

local p:Person; s1,s2:Student; t:TA; r:RA;

(1) p := new Person(name := "Anne Rice",
                    birth := date(month := 1,
                                  day := 1,
                                  year := 1950));
(2) s1 := new Student(s# := "1234",
                      name := "Steven King",
                      guardian := new Person
                                  (name := "James Michener"));
(3) t := p insert TA (Course := "CIS6501");
(4) r := p insert RA (project := "OODB");
(5) s2 := Student$t; /* casting t to student */
(6) delete t;
(7) destroy p;

[Diagram omitted; legible instance labels: p:001 Person, s1:002 Student, t:001 TA, r:001 RA]

Figure 3.3 KBMS Operations Example in K


It has been shown in Rumbaugh [RUM87] and Wile [WIL90a]

that structural associations (or relationships) serve as an

important abstraction mechanism, which is missing in

traditional programming languages. In our work, structural

properties of objects are modeled by various structural

association types, which are uniformly modeled as first-class

object classes as shown in Figure 3.2. Note that while

conceptually, each type of structural association represents

a set of generic rules that govern the knowledge-base

manipulation operations on the instances of those classes that

are defined by the association types, kernel structural

association types are actually built into the object manager

for bootstrapping and better performance [LAW91, YAS91,

ARR92]. In a later version of K that supports generic rules,

we will also allow the user to use the linguistic facilities

to incrementally extend the knowledge model as will be

described in Chapter 9. We also provide the constructs for

specifying object association patterns based on which the

system can identify the corresponding sub-knowledge-bases that

satisfy these intensional patterns. Two applications of

association patterns, namely, a context looping construct for


manipulating the knowledge base, and existential/universal

quantifiers for posing logical questions upon the knowledge

base, are also described in this chapter.

4.1 Association Definition

As shown in Figure 3.1, associations are defined

following the key word "associations:" of a class definition.

Two kernel structural association types "generalization" and

"aggregation" are currently built into the object manager of

OSAM*.KBMS and supported in K.

4.1.1 Aggregation Association

For each object class, one can define a set of attributes

to describe the state of its instances in terms of their

associations with other classes by using the aggregation (A)

association type. The syntax for the Aggregation association

definition is the following.

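aggregation of
    public: | private: | protected: <aggregation_list>

    { public: | private: | protected: <aggregation_list> }

where <aggregation_list> is a list of specifications of the

form: "<attribute_name> : [<constructor> of] <constituent_class>; ...."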

Each aggregation specification corresponds to an instance of

the class "Aggregation" and also a named A-link from the


defining class to the constituent class in the structural

schema as we described in Section 3.2. The name of an

attribute must be unique within the defining class. The key

words public:, private:, and protected: are used to specify

different levels of information hiding described in Section

3.1. An aggregation association defines either (i) a value

attribute if its constituent_class is a domain class, or (ii)

a reference attribute if its constituent_class is an entity

class. At the instance level, we store values and iids for

value attributes and reference attributes, respectively.

Multi-valued attributes are specified as "<constructor> of

<constituent_class>" where <constructor> could be either "set,"

"list," or "array [<size>]." Note that in addition to the

constructor "set," which is critical in object-oriented data-

base systems, we also provide the constructors "list" and

"array" to capture the semantics of "order," which is useful

in real-world applications [ATK90, SIG90]. The size of an

array is specified by an integer, and the index of the first

element of an array is always 1. Also note that collection

constructors are implemented as domain classes as shown in

Figure 3.2(a). Each instance of a collection class corresponds

to a collection of iids or values. The semantic type checker

of K will use the information carried by each iid to ensure

the type compatibility of each collection operation. Note that

an aggregation association between two entity classes is

interpreted as a bi-directional link. For example, suppose


there is an aggregation association called "major" from entity

class "Student" to entity class "Department" having a

cardinality mapping of n-1, then the system will automatically

define and maintain a set-valued aggregation association

called "_KINVStudent_major" from "Department" to "Student"

(for each department, the system records all the students who

major in this particular department). The system will use this

information to support bi-directional navigation and to

maintain the referential integrity of the knowledge base. For

example, before deleting a Department instance, the system can

follow its "_KINVStudent_major" links to identify those

students who major in this department and remove their "major"

links to this particular Department instance. To access an

attribute value or a particular element from a list- or array-

valued attribute, we use the conventional dot notation

<instance>.<attribute> (e.g., "s.major") or

<instance>.<attribute> '[' <index> ']' (e.g.,

"s.college_report[1]"), respectively.

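The automatic maintenance of an inverse link and its use for referential integrity can be sketched as follows. This is only an illustration under our own assumptions: the class `Links` and its methods are hypothetical names, the iids are plain strings, and the system's actual storage structures are not shown in this chapter.

```python
from collections import defaultdict


class Links:
    """Toy store for one n-1 reference attribute ("major") and its inverse."""

    def __init__(self):
        self.forward = {}                # Student iid -> Department iid
        self.inverse = defaultdict(set)  # Department iid -> set of Student iids

    def set_major(self, student, dept):
        """Update the forward link and keep the inverse link consistent."""
        old = self.forward.get(student)
        if old is not None:
            self.inverse[old].discard(student)
        self.forward[student] = dept
        self.inverse[dept].add(student)

    def delete_department(self, dept):
        """Follow the inverse links and remove the dangling "major" links."""
        for student in self.inverse.pop(dept, set()):
            del self.forward[student]


links = Links()
links.set_major("s1", "CIS")
links.set_major("s2", "CIS")
links.set_major("s3", "EE")
majors_in_cis = set(links.inverse["CIS"])  # navigation from Department to Student
links.delete_department("CIS")
remaining = dict(links.forward)
```

Only the forward link is written by the caller; the inverse set plays the role of "_KINVStudent_major" and is what makes the cleanup before a Department deletion inexpensive.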
4.1.2 Generalization Association

For each object class, one can use generalization (G)

association to specify its immediate superclass or subclass.

The syntax for a Generalization association definition is as

follows:

generalization | specialization of <class_name> {, <class_name>};


Class "A" is said to be a superclass of class "B" (i.e., there

is a generalization association from "A" to "B") if for each

object that has an instance in class "B," it also has an

instance in class "A." Both instances have the same oid and

are conceptually connected by a G-link. Note that

generalization association is bi-directional and can be

specified in either direction. For example, to say "Student

is a specialization or subclass of Person" is equivalent to

saying "Person is a generalization or superclass of Student." In

general, specialization is used to construct object classes

in a top-down and step-wise refinement approach by giving more

and more structural and behavioral properties. Contrary to

most existing object-oriented models, we also allow the user

to define specializations of primitive system-defined domain

classes (e.g., a subset of integer) by using some constraint

rules to specify range, enumeration, or any other constraints.

4.1.3 Friend and Using Associations

As each class can be thought of as a reusable software

module in object-oriented software development, two types of

functional associations are provided in K to facilitate

"programming in the large." Functional associations can be

defined in the form of "<association_type> of <class_name>

(, <class_name>)," where <association_type> could be either

friend or using. The introduction of the "friend" and "using"

associations also illustrates the extensibility of the


knowledge model. In a later version of K that supports

generic rules, we shall allow the user to incrementally extend

the knowledge model as described in Section 3.2.

Friend. This association type is used to support the

three-level information hiding mechanism described in Section

3.1. A "friend" (F) association specifies that all the

constituent classes are "friends" of the defining class and

thus authorizes them to access the private and protected

properties of the defining class.

Using. Similar to the "#include" directive in C++, a "using"

(U) association specifies that all the public interfaces

defined by the constituent classes will be available to the

defining class (client-server relationship). Note that though

this information has been implicitly captured in parameter

specifications and method invocations, we include it at the

class level for better readability and maintainability of

complex software systems. For example, a user can easily

capture the overall structural and functional relationships

among system modules by just reading the association

definition or graphic display of the system schema rather than

going into the detailed codes of each method. Besides, the

compiler can make use of the semantic information provided by

the "using" associations in a system schema to automatically

include all the necessary classes for compilation. Note that

the "using" association provides a modular mechanism at a

larger granularity than ordinary classes as one can either (i)


functionally compose many classes into a big module structure

or (ii) functionally decompose a big module into smaller ones.


4.2 Association Pattern

Since K serves as the high-level interface of OSAM*.KBMS,

the development and execution of a K program would generally

involve the processing of a persistent knowledge base. For

knowledge-base retrieval and manipulation, a knowledge-base

programming language should include some knowledge

manipulation constructs in addition to general programming

constructs. In our work on K [SHY91], we use pattern-based

querying constructs for this purpose. We modify the context

expression of OQL [ALA89a,b, GUO91] as the primitive construct

for specifying structural association patterns based on which

the system can identify the corresponding contexts that

satisfy the intensional patterns. In the following, we will

describe various forms of association pattern in more detail.

4.2.1 Class Expression

The simplest form of association pattern is the

class expression, which specifies a single object

instance or a collection of homogeneous object instances (i.e.,

instances of the same class) of a particular class. A single

object instance is represented by an instance variable such

as the pseudo variable "this" (which denotes the receiver of


an incoming message as in C++) or another user-defined local

instance variable. A collection can be specified in one of the

following forms: (i) <class_name>, which is the name of an entity

class and represents the extension (the set of all persistent

and transient instances) of this entity class, (ii)

<class_name> '[' <condition> ']', which selects instances from

an entity class by using <condition> as the intra-class

selection condition, and (iii) <variable_name>, which is a

user-defined variable ranging over an entity class using the

constructor set, list, or array. For example, "s1 : set of

Student" defines a variable "s1" whose value will be a set of

Student instances. Note that both implicit sets (class

extensions) and explicit sets (user-defined sets) are

supported in K. While implicit sets are automatically managed

by the system, explicit sets can be manipulated by using set

operators "union" ('+'), "intersection" ('&'), and

"difference" ('-'). The two operands of a set operator must

be defined over the same entity class or domain class. We also

overload the '+' and '-' operators to represent the "add" and

"remove" operators, which can be used to add and remove a

single instance to and from a set, respectively. For example,

the statement "s1 := Student[age > 40] + this;" adds a student

instance denoted by "this" to the set of all students with age

greater than 40 and assigns the new set to variable "s1."

One can define a range variable over a collection of

instances in a class expression as "<variable>:<class_expression>" for


specifying context looping and quantifiers, as will be described

in later sections. For example, "s:Student[s.eval_GPA()

> 3.5]" specifies a set of Student instances, each of which

is denoted by the variable "s" and has a GPA greater than 3.5.

Note that while OQL [ALA89a,b] only allows simple comparisons

in a selection condition, K provides more expressiveness and

consistency by allowing any boolean expression expressible in

K as a selection condition to comply with the orthogonality

design principle.

4.2.2 Linear Association Pattern

A linear association pattern is specified as "<class_exp>

(<assoc_op> <class_exp>)" where <class_exp> represents a class expression

described in Section 4.2.1, and <assoc_op> is specified as

"<operator> <direction> '[' <association_name> ']'." Note that (i) <operator>

could be either an "associate" ("*") or a "non-associate"

("!") operator as in Alashqur and Lam [ALA89] and Guo and Lam

[GUO91], (ii) <direction> could be either ">" or "<" so that

the defining class of <association_name> is always at the open

side, i.e., the left-hand side of ">" or the right-hand side

of "<", and (iii) <association_name> could be either "G"

(which represents a generalization association) or the name

of an aggregation association. Note that the non-associate

("!") operator is useful in identifying instances at either

side of the operator that are not connected to any instance

at the other side. For example, "Student !>[advisor]


Professor" retrieves not only students who have no advisor but

also professors who are not advising any student. While the

former can be easily expressed as "Student[advisor = null]",

the latter cannot be concisely expressed without the use of the "!"

operator or quantifiers, which will be described in Section


As an example, "g:Grad[major.name = "CIS"] *>[advisor]

p:Professor !< [instructor] Course" specifies a sub-knowledge-

base that contains (i) all the graduate students of the CIS

department who have an advisor (i.e., there is an "advisor"

link connecting this student with a professor) who does not

teach any course (i.e., this professor is not connected

through the "instructor" association with any course

instance), (ii) all the professors who are the advisors of

some students but do not teach any course, and (iii) all the

courses that are not taught by those professors who are the

advisors of some students. Here, "g" and "p" are variables

that represent the graduate students and professors satisfying

the association pattern specification, respectively. Note that

each course selected in (iii) might be taught by some

professor who is not the advisor of any student. A context

can be thought of as a normalized relation whose columns are

defined over the participating classes and each of its tuples

represents an extensional pattern of iids that satisfy the

intensional pattern.


The evaluation of the above pattern is illustrated step

by step in Figure 4.1. Figure 4.1(a) shows the original

knowledge-base. Figure 4.1(b) shows the sub-knowledge-base

after we first evaluated "Student *>[advisor] Professor." Only

student instances and professor instances, both of which are

connected by "advisor" links, are retained. Professor "p3" is

dropped. Note that when an instance is dropped, all the links

connected to this instance are also dropped. Figure 4.1(c)

shows the sub-knowledge-base after we applied the "!"

operator. Only those (i) professor instances selected in

Figure 4.1(b) that are not connected by the "instructor" link

of "Course" with any course instance, and (ii) course

instances that are not connected by the "instructor" link of

"Course" with any professor instance selected in Figure 4.1(b)

are retained. Professor "p1" and course "c1" are dropped.

Because professor "p1" is dropped, student "s1" no longer

connects to any professor and therefore must also be dropped.

Note that conceptually, there is a "non-link" (represented by

dotted line in Figure 4.1(c)) connecting each pair of

professor and course instances that are not connected to each

other by the "instructor" association of "Course". In Figure

4.1(d), we normalize the resulting sub-knowledge-base to get

all the possible combinations of

instance bindings. A sub-knowledge-base is empty if all the

columns of its normalized relation are empty. Each tuple of

the normalized relation is a set of bindings, which can be


thought of as an extensional association pattern that

satisfies the intensional association pattern. Note that the

normalized relation as shown in Figure 4.1(d) serves only as

a data structure to temporarily store the sub-knowledge-base,

and the semantics of each of its tuples (e.g., the "associate"

or "non-associate" association between object instances) is

interpreted by the system based on the corresponding

intensional association pattern.
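
The four evaluation steps can be traced in a short Python sketch. The instance data below are hypothetical and were chosen only so that the outcome matches the narrative above (professor "p3", then professor "p1", course "c1", and student "s1" are dropped); the data of Figure 4.1 itself are not reproduced here, and the set-based representation is our own simplification.

```python
# Hypothetical knowledge base, step (a).
students = {"s1", "s2", "s3"}
professors = {"p1", "p2", "p3"}
courses = {"c1", "c2", "c3"}
advisor = {("s1", "p1"), ("s2", "p2"), ("s3", "p2")}  # defined by Student
instructor = {("c1", "p1"), ("c2", "p3")}             # defined by Course

# Step (b): Student *>[advisor] Professor keeps only linked instances.
adv = {(s, p) for (s, p) in advisor if s in students and p in professors}
sel_profs = {p for _, p in adv}

# Step (c): Professor !<[instructor] Course keeps the instances that have
# no "instructor" link to any instance retained on the other side.
taught = {(c, p) for (c, p) in instructor if p in sel_profs and c in courses}
kept_profs = sel_profs - {p for _, p in taught}
kept_courses = courses - {c for c, _ in taught}

# Dropping a professor drops its links, so students left without a
# retained professor are dropped as well.
adv = {(s, p) for (s, p) in adv if p in kept_profs}
kept_students = {s for s, _ in adv}

# Step (d): normalize into tuples of instance bindings.
relation = sorted((s, p, c) for (s, p) in adv for c in kept_courses)
```

Course "c2" survives even though it is taught by "p3", because "p3" advises no student and was already dropped in step (b).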

The use of aggregation association must comply with the

information hiding principle described in Section 3.1. An

association pattern "A *>[p] B" used in class "X" is valid if

and only if one of the following is true: (i) "p" is a public

aggregation association defined by class "A" or any of its

superclasses, (ii) "p" is a protected aggregation association

defined by class "A" or any of its superclasses, and class "X"

is either the defining class, a subclass of the defining

class, or a friend class of the above classes, or (iii) "p"

is a private aggregation association defined by class "A,"

and "X" is either the defining class "A" or a friend of class

"A."

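The three validity cases can be written as a small decision function. This is a sketch of the rule as stated above, not the semantic checker of K: the function `may_use`, its parameters, and the sample schema are hypothetical, and the lookup of the defining class of "p" in "A" or its superclasses is assumed to have been done by the caller.

```python
def may_use(visibility, defining_class, user_class, superclasses, friends):
    """Decide whether code in `user_class` may name an association in a pattern.

    superclasses: class -> set of all its (transitive) superclasses
    friends:      class -> set of classes it has declared as friends
    """
    if visibility == "public":
        return True
    if visibility == "private":
        family = {defining_class}
    else:  # "protected": the defining class and all of its subclasses
        family = {defining_class} | {
            cls for cls, sups in superclasses.items() if defining_class in sups
        }
    if user_class in family:
        return True
    # Friends of any class in the family are also authorized.
    return any(user_class in friends.get(cls, set()) for cls in family)


superclasses = {
    "Person": set(),
    "Student": {"Person"},
    "TA": {"Student", "Person"},
    "Faculty": set(),
    "Course": set(),
}
friends = {"Student": {"Faculty"}}  # "friend of Faculty" in Figure 3.1
users = ["Student", "TA", "Faculty", "Course"]

protected_access = {
    user: may_use("protected", "Student", user, superclasses, friends)
    for user in users
}
private_access = {
    user: may_use("private", "Student", user, superclasses, friends)
    for user in users
}
```

With this schema, a protected association of Student is usable in Student, TA, and Faculty but not in Course, while a private one is usable only in Student and Faculty.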
The semantics of generalization association specified in

an association pattern is slightly different from that of

aggregation associations. As mentioned in Section 3.1, if

class "A" is a superclass of class "B", then for each object

that has an instance in class "B," it must also have an

instance in class "A" and these two instances are implicitly


connected via a generalization link. For example, to find

those TAs who are also RAs, we can use "TA *<[G] Student *>[G]

RA", which specifies a sub-knowledge-base consisting of all

the <TA, Student, RA> triples such that each triple represents

three different instances of the same object. In other words,

we retrieve all the Student instances that are both TA and RA

along with their TA and RA instances. Note that we use "<[G]"

between TA and Student because TA is a specialization of

Student, i.e., Student is a generalization of TA. Similarly,

we use ">[G]" between Student and RA because Student is a

generalization of RA. Also note that based on our distributed

object storage mechanism, the class lattice is not closed

under intersection. For example, suppose there is a class

called "TA&RA", which is a subclass of both TA and RA. Even

though the presence of a TA&RA instance guarantees the

presence of a corresponding TA instance and an RA instance, the

converse is not always true. For example, one may insert an

object to class TA and class RA without having to insert this

object to class TA&RA. In other words, the intersection of TA

and RA is not always equal to TA&RA. The use of generalization

association in an association pattern is useful for

identifying different instances of an object in different

classes to support static type checking as will be described

in Section 4.3.
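
Because instances of the same object share one oid, the pattern "TA *<[G] Student *>[G] RA" amounts to matching oids across class extensions. The sketch below assumes, purely for illustration, that each extension is a set of oids; the oids are hypothetical.

```python
# Hypothetical class extensions, each holding the oids that have an instance there.
extent = {
    "Student": {"o1", "o2", "o3", "o4"},
    "TA": {"o1", "o2"},
    "RA": {"o2", "o3"},
    "TA&RA": set(),  # nobody was explicitly inserted into this subclass
}

# TA *<[G] Student *>[G] RA: objects with an instance in all three classes.
both = extent["TA"] & extent["Student"] & extent["RA"]

# One triple per object: its TA, Student, and RA instances.
triples = sorted((("TA", o), ("Student", o), ("RA", o)) for o in both)

# The lattice is not closed under intersection: "o2" is both a TA and an RA,
# yet it has no instance in the class TA&RA.
closed = extent["TA&RA"] == (extent["TA"] & extent["RA"])
```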


4.2.3 Direction Specification in Association Patterns

Unlike in OQL [ALA89], link direction in K is

explicitly specified for bi-directional navigation to comply

with the design principle of readability and maintainability.

By following the direction, a user or the system can

unambiguously identify which side is the defining class of the

association. For the above example, we use the ">" in "Student

*>[advisor] Professor" because "advisor" is an association

defined by class "Student." Similarly, we use the "<" in

"Professor !<[instructor] Course" because "instructor" is an

association defined by class "Course." We illustrate the

importance of explicit direction by the following two

examples. Firstly, suppose class Person defines an aggregation

association called "father" whose constituent class is Person

itself. The association pattern "this *>[father] Person"

specifies the sub-knowledge-base consisting of a particular

person instance denoted by "this" as well as the father of

this person. Similarly, the association pattern "this

*<[father] p:Person" specifies the sub-knowledge-base

consisting of this particular person instance as well as all

his children (person whose father is this particular person).

It is ambiguous if no direction is explicitly given. Secondly,

suppose both class Student and class Course define a set-

valued aggregation association called "enroll" from Student

to Course and from Course to Student, respectively. Similar


to the above example, direction must be explicitly given in

"this *>[enroll] Course" and "this *<[enroll] Course" because

their semantics are different. Note that the explicit

specification of direction allows the user to switch the order

of class expressions at the two sides of an associate or

non-associate operator by just changing the direction. For

example, "Student *>[enroll] Course" and "Course *<[enroll]

Student" are equivalent in semantics.
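
The "father" example can be made concrete with a few lines of Python. The names and the dictionary representation are hypothetical; the point is only that the same association yields different results depending on the direction in which it is followed.

```python
# Hypothetical "father" links, stored on the defining side: person -> father.
father = {"bob": "al", "cy": "al", "al": "zed"}


def follow_forward(this):
    """this *>[father] Person: the father of `this`, if any."""
    return {father[this]} if this in father else set()


def follow_backward(this):
    """this *<[father] p:Person: all persons whose father is `this`."""
    return {child for child, dad in father.items() if dad == this}


fathers_of_al = follow_forward("al")
children_of_al = follow_backward("al")
```

Without a direction, a pattern over "father" starting from "al" could mean either of these two sets.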

From a software engineering point of view, relying on the

system to infer the direction is also unacceptable for the

following two reasons. First, readability is decreased as a

reader of a K program must infer the direction himself/herself.

Secondly, maintainability is decreased because the semantics

of a valid association pattern would change unexpectedly as

the schema evolves gradually. Following the above example,

suppose a programmer uses "this *[enroll] Course" to capture

the semantics of "this *>[enroll] Course." Later on, the

schema is changed so that the "enroll" association from

Student to Course is deleted. Instead of reporting a semantic

error as it should, the system would automatically interpret

this pattern as "this *<[enroll] Course" and thus make the

application code difficult to read and maintain.

Comparatively, the use of explicit direction produces more

readable and reliable code.


4.2.4 Precedence in Association Pattern

Both the associate ("*") and non-associate ("!")

operators are of the same precedence and are evaluated from

left to right in K. To change the order of evaluation, one can

use parentheses "( )." Note that changing the order

of evaluation will change the semantics of an association

pattern only in the case that we want to first evaluate an

association pattern at the right-hand-side of a "!" operator

before we apply the "!" operator. For example, the following

two association patterns (i) "Student *>[advisor] Professor

!<[instructor] Course" shown in Figure 4.1 and (ii) "Student

*>[advisor] (Professor !<[instructor] Course)" shown in Figure

4.2 are different in the sense that while the former contains

all the courses, each of which is not taught by any professor

who is also an advisor of some student (but may be taught by

some professor who is not an advisor of any student), the

latter contains all the courses, each of which is simply not

taught by any professor. In other words, the set of courses

selected in (ii) is a subset of that selected in (i). Note

that because association operators are of the same precedence

and links can be specified in both directions, the association

pattern specified in Figure 4.2 can also be expressed as

"Course !>[instructor] Professor *<[advisor] Student" without

using any parentheses. Here, the evaluation of "Professor


*<[advisor] Student" will not affect the Course instances that

have been selected first.
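
The difference between patterns (i) and (ii) can be checked on a small data set. The instances below are hypothetical (Figures 4.1 and 4.2 are not reproduced), and `non_associate` is our own helper name for the "!" operator restricted to the "instructor" association.

```python
professors = {"p1", "p2", "p3"}
courses = {"c1", "c2", "c3"}
advisor = {("s1", "p1"), ("s2", "p2")}     # Student -> Professor
instructor = {("c1", "p1"), ("c2", "p3")}  # Course -> Professor


def non_associate(profs, crs):
    """Professor !<[instructor] Course over the given candidate sets."""
    linked = {(c, p) for (c, p) in instructor if p in profs and c in crs}
    return profs - {p for _, p in linked}, crs - {c for c, _ in linked}


# (i) Left to right: the "!" operator only sees professors who advise someone.
advising = {p for _, p in advisor if p in professors}
profs_i, courses_i = non_associate(advising, courses)

# (ii) Parenthesized: the "!" operator is applied to all professors first,
# and only then are the professors restricted to those who advise someone.
profs_all, courses_ii = non_associate(professors, courses)
profs_ii = profs_all & advising
```

Course "c2" is taught by "p3", who advises nobody, so it is retained by (i) but not by (ii); the courses of (ii) form a subset of those of (i).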

4.2.5 Branching Association Pattern

Tree-structured complex association patterns can be

specified using the branch operators "and" and "or" in the

form "<class_exp> <branch_op> (<assoc_op> <pattern>, <assoc_op> <pattern>,

...)." The left-hand class expression of a branch operator is

called the fork class expression. For the "and" operator, the

set of instances returned from the fork class is associated

(or non-associated, depending on each <assoc_op> specification)

with every (or at least one in the case of the "or" operator)

pattern on the right-hand side of the "and" operator. For

example, "Student and (*>[major] Department, !>[advisor]

Professor)" specifies a sub-knowledge-base consisting of (i)

all the students each of whom has a major and does not have

an advisor, (ii) all the departments that these students are

majoring in, and (iii) all the professors each of whom is not

advising any student selected in (i). The evaluation is shown

in Figure 4.3. Two sub-knowledge-bases specified by "Student

*>[major] Department" and "Student !>[advisor] Professor" are

first evaluated separately in Figure 4.3(b). The effect of the

"and" operator is to remove those student instances that

appear in only one of the sub-knowledge-bases as shown in

Figure 4.3(c). In Figure 4.3(d), we represent the resulting

sub-knowledge-base as a normalized relation, which can be


thought of as the concatenation of the two previous sub-

knowledge-bases in the sense that each tuple is either

<Student, Department> or <Student, Professor>.

Note that no department and professor instance will appear in

a tuple at the same time because there is no association

specified in the pattern between these two classes. In Section

4.3, we will describe how to iterate over the relation for

knowledge-base manipulations. Also note that branching can be

nested to form complex patterns. For example, "Student and

(*>[receiver_of] Fellowship, *>[G] Grad or (*>[G] TA, *>[G]

RA))" specifies a sub-knowledge-base consisting of (i) all the

students each of whom receives any fellowship and at the same

time is a graduate TA or RA, (ii) the fellowships that these

students receive, and (iii) the TA or RA instances of these

selected graduate students.
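
The evaluation of the "and" branch in "Student and (*>[major] Department, !>[advisor] Professor)" can be sketched as follows. The data are hypothetical, and the sketch follows the step-by-step description given for Figure 4.3: each branch is evaluated separately, and the "and" operator then removes the fork instances that do not appear in every branch.

```python
students = {"s1", "s2", "s3"}
professors = {"p1", "p2"}
major = {("s1", "CIS"), ("s2", "EE")}  # Student -> Department
advisor = {("s2", "p1")}               # Student -> Professor

# Branch 1: Student *>[major] Department
b1_students = {s for s, _ in major if s in students}

# Branch 2: Student !>[advisor] Professor
b2_students = students - {s for s, _ in advisor}
b2_profs = professors - {p for _, p in advisor}

# "and": keep only the fork (Student) instances present in every branch.
fork = b1_students & b2_students

# Normalized relation: each tuple pairs a student with a department or
# with a professor, never with both.
relation = sorted(
    {(s, d) for (s, d) in major if s in fork}
    | {(s, p) for s in fork for p in b2_profs}
)
```

For the "or" operator, the intersection in the sketch would be replaced by a union of the branch results.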

4.3 Context Looping Statement and Navigation

4.3.1 Context Looping Statement

As mentioned in Chapter 1, one of the design principles

of K is to seamlessly incorporate the high-level query

facilities into the language constructs for the retrieval and

manipulation of the knowledge base based on association

patterns mentioned above. To serve this purpose, K provides

the context looping statement in the form "context <pattern>

[where <condition>] [select <variables>] do <statements> end context,"


which iterates over each extensional pattern of the sub-

knowledge-base (context) specified by <pattern>. The optional

where-clause is used to specify a boolean expression as the

inter-class selection condition so that the iteration will

skip those tuples that do not satisfy the condition. One can

also use the optional select-clause to specify a projection

over only those columns (represented by their range variables)

that he/she is interested in and thus eliminate the resulting

redundant tuples. When no select-clause is given, implicit

projection will be performed over those columns, each of which

defines a range variable. For example, the following statement

will print the name of each professor who is younger

than at least one of his/her advised students. Note that the where-

clause is used to specify the inter-class selection

condition between Grad and Professor, and the select-clause

is used to project over Professor column and remove the

redundant tuples so that each qualified professor will appear

only once even if he/she advises more than one student.

context g:Grad *>[advisor] p:Professor where g.age > p.age
select p
do p.name.display();
end context;
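
The roles of the where-clause and the select-clause in the statement above can be imitated in Python as a filter followed by a duplicate-eliminating projection. The function `context` and the sample ages are hypothetical and only mimic the described behavior.

```python
grad_age = {"g1": 30, "g2": 45, "g3": 50, "g4": 25}
prof_age = {"p1": 40, "p2": 60}
advisor = [("g1", "p1"), ("g2", "p1"), ("g3", "p1"), ("g4", "p2")]  # Grad -> Professor


def context(tuples, where, select):
    """Iterate over tuples, skip those failing `where`, project with `select`,
    and drop the duplicates that the projection produces."""
    rows = []
    for g, p in tuples:
        if where(g, p):
            row = select(g, p)
            if row not in rows:
                rows.append(row)
    return rows


# context g:Grad *>[advisor] p:Professor where g.age > p.age select p
younger_profs = context(
    advisor,
    where=lambda g, p: grad_age[g] > prof_age[p],
    select=lambda g, p: p,
)
```

Professor "p1" qualifies through both "g2" and "g3" but is reported once, which is the effect of projecting over the Professor column.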

Note that by introducing the notion of range variables, no

alias class [OQL89b] is necessary in an association pattern.

For example, "a1:A *>[Pa] B *>[Pb] a2:A" defines two range

variables a1 and a2 over the same class A. As context looping


statement is just one of the control constructs provided by

K, it can be freely combined with other constructs or nested

in an arbitrary number of levels to comply with the principle

of orthogonality. This is one important issue that must be

addressed in the design of a knowledge-base programming

language but not necessarily in that of a query language. For

example, the following statement prints the name of the

chairman of each department followed by all the other faculty

members of that department. The first context statement

identifies the Department class, then the nested context

statement is applied to each department. For each looping of

the nested context statement, we identify all the professors

who are faculty members of the particular department, and

print his/her name if he/she is not the chairman.

context d:Department
do d.chairman.name.display();
context p:Professor *>[faculty_of] d
do if p != d.chairman
then p.name.display();
end if;
end context;
end context;

4.3.2 Navigation in Object-oriented Knowledge Base

In existing object-oriented data-base systems, navigation

is expressed by using the dot expression for implicit joins.

However, the use of dot expression is limited by the following

factors. First, navigation is done only in one direction

unless inverse attributes are supported and explicitly defined


in the system. Second, navigation cannot continue when a

multi-valued attribute is encountered. For example, the dot expression

"s1.friends.enroll" is not allowed if "s1" is a Student

instance, and both "friends" and "enroll" are set-valued

aggregation associations defined from Student to Student, and

from Student to Course, respectively. The reason is that the

expression "s1.friends" returns a set of Student instances,

and dot expression only allows "enroll" to be applied to a

single Student instance. Third, dot expression cannot express

navigation via negation, or the "non-association"

relationships. For example, it is not possible to express "all

the courses in which student s1 is not enrolled" using simple dot

expression. Last, it is impossible (except using dynamic

binding) to explicitly navigate via the generalization

association in order to access different representations of

the same object in different classes. K supports all the above

cases using the association patterns described in Section 4.2.

Bi-directional navigation. As mentioned in Section 4.1,

any aggregation association between two entity classes is

interpreted as a bi-directional link, and any generalization

association can also be thought of as a bi-directional link

from a superclass to a subclass (generalization) or vice versa

(specialization). Thus, one can navigate from one class to

another by following these bi-directional links to form a

linear association pattern of arbitrary length. For example,

one can use the following context looping statement to print


all the course titles that the advisor of "s1" is teaching.

Note that "instructor" is an aggregation association defined

by "Course" and therefore we use "*<[instructor]" to express

the navigation from Professor to Course.

context s1 *>[advisor] Professor *<[instructor] c:Course
do c.title.display();
end context;

Multi-valued navigation. Both single-valued and multi-

valued aggregation associations are expressed uniformly in an

association pattern. For example, one can use the following

context looping statement to print the names of all the

courses that student sl's friends enroll in. Note that here, we

use implicit projection to remove the redundant courses that

are taken by more than one of sl's friends.

context sl *>[friends] Student *>[enroll] c:Course
do c.name.display();
end context;
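As a rough analogy only (K is not Python), the multi-valued navigation above can be sketched as a set-valued traversal. The dictionaries `friends` and `enroll`, the helper `navigate`, and the sample data are hypothetical stand-ins for the aggregation associations; they are not part of K or of the dissertation.

```python
# Hypothetical sample data standing in for the set-valued "friends" and
# "enroll" aggregation associations.
friends = {"sl": {"s2", "s3"}, "s2": set(), "s3": set()}
enroll = {"sl": {"DB"}, "s2": {"DB", "OS"}, "s3": {"OS", "AI"}}


def navigate(start, *associations):
    """Follow each association in turn from a set of starting objects.

    Results are kept in a set, so an object reached along several paths
    appears only once (the "implicit projection" mentioned in the text).
    """
    current = set(start)
    for assoc in associations:
        current = {target for obj in current for target in assoc.get(obj, set())}
    return current


courses_of_friends = navigate({"sl"}, friends, enroll)
for name in sorted(courses_of_friends):
    print(name)
```

Here the course "OS" is taken by two of the friends but is reported once, which is the effect the text attributes to implicit projection.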

Navigation via negation. By using the non-associate ("!")

operator, one can traverse via the conceptual "non-associate"

links through the knowledge base. For example, the following

program prints the titles of all the courses in which "sl" is

not enrolled and collects those courses in a set Cl for

further manipulation. Note that

the association pattern returns (i) student "sl" if he/she

does not enroll in any course, and (ii) all the courses in which

"sl" is not enrolled. The normalized relation will contain one


of the following types of tuples: (i) , if "sl"

does not enroll in any course, (ii) , if "sl"

enrolls in some, but not all of the courses, and (iii)

, if "sl" enrolls in all the courses. We then use

the range variable "c" to loop over all (if any) the selected

Course instances.

local Cl: set of Course;
context sl !>[enroll] c:Course
do c.title.display();
Cl := Cl + c; /* add "c" into set "Cl" */
end context;
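One way to read the non-associate operator is as a set difference against the extent of the target class. The following Python sketch illustrates that reading under stated assumptions: `all_courses`, `enroll`, and `non_associated` are hypothetical names, and the sketch ignores the normalized-relation details described above.

```python
# Hypothetical extent of class Course and the "enroll" association.
all_courses = {"DB", "OS", "AI"}
enroll = {"sl": {"DB"}}


def non_associated(obj, assoc, target_extent):
    """Return the instances of the target class NOT linked to obj via assoc."""
    return target_extent - assoc.get(obj, set())


Cl = set()  # plays the role of the local set "Cl"
for c in sorted(non_associated("sl", enroll, all_courses)):
    print(c)
    Cl = Cl | {c}  # mirrors "Cl := Cl + c"
```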

Generalization navigation. In order to support static

(instead of dynamic) type checking, K allows the user to

explicitly express navigation via the generalization

association to access different representations of the same

object in different classes. For example, the following

statement will print the name of each student and, if this

student is also a graduate student, print the name of his/her

advisor (if any). Note that the normalized relation returned

by the context expression could contain three types of tuples:

(i) , if a particular student is not a

graduate student, (ii) , if a particular

student is a graduate student but does not have an advisor,

and (iii) , if a particular student

is a graduate student and has an advisor. It is the user's

responsibility to check for null values to avoid a run-time

error when sending a message to null, as in C++.


Also note that similar to CLU [LIS77], we use different range

variables to represent different instances of the same object

in different classes so that type checking can always be done

at compilation time.

context s:Student *>[G] g:Grad *>[advisor] p:Professor
do s.name.display();
if (g != null) and (p != null)
then p.name.display();
end if;
end context;

Similarly, one can use the cast operator "$" mentioned in

Chapter 3 to implement late binding. For example, suppose

"Employee" has two subclasses "TA" and "Professor," both of

which define a method called "raise." Then, one can use the

following context looping statement to give a raise to every

employee. Note that in the case statement, we use the cast

operator to test if an employee is either a TA or a Professor,

and apply the appropriate "raise" method to the corresponding

TA or Professor instance. In a later version of K that

supports virtual classes as in C++ [STR86], the testing and

casting will be performed automatically by the system.

context e:Employee
do case
when TA$e != null do TA$e.raise();
when Professor$e != null do Professor$e.raise();
end case;
end context;


4.4 Existential and Universal Quantifiers

Statements for the retrieval and manipulation of a

knowledge base may involve existential and universal

quantifiers in the form "exist in [suchthat

]" and "forall in suchthat

." Quantifiers based on association patterns make it

much easier for the users to declaratively pose logic

questions upon the knowledge base. An existential quantifier

returns true if there exists any object instance(s) denoted

by in the sub-knowledge-base specified by and

satisfies the optional . For example, the following

expression tests if there exists any student who is taught by

his/her own advisor: "exist s in s:Student and (*>[enroll]

Course *>[instructor] p1:Professor, *>[advisor] p2:Professor)

suchthat p1 = p2." Note that the "suchthat" clause of an

existential quantifier can be omitted if it can be specified

as an intra-class selection condition in the association

pattern. For example, one can ask if a student whose name is

"John Smith" takes any CIS course as "(exist c in Student[name

= "John Smith"] *>[enroll] c:Course[offered_by.name =

"CIS"])." It is also possible to specify more than one

variable in a quantifier. For example, "exist s,p in

s:Student !>[advisor] p:Professor" returns true if there

exist both a student who does not have an advisor and a

professor who is not advising any student. Note that the


result of "exist a in a:A !>[p] B" and "exist b in A !>[p]

b:B" is not necessarily the same. Similarly, a universal

quantifier returns true if all the object instances denoted

by in the sub-knowledge-base specified by

satisfy . For example, one can ask if all the

students have GPA greater than 3.5 as "(forall s in s:Student

suchthat s.GPA > 3.5)." As a special case, if no object

instance exists in the sub-knowledge-base, then a universal

quantifier returns false by default. Note that in simple

cases, quantifiers are not always necessary. For example, to

test if student "sl" has any advisor, one can use either

"sl.advisor != null" or "exist p in sl *>[advisor]

p:Professor." However, as pointed out in Section 4.3, the use

of dot expression has many limitations and cannot express many

complex cases concisely. For example, the following expression

tests if student "sl" and the friends of "sl" enroll in any

common course: "exist c in sl *>[friends] Student *>[enroll]

c:Course *<[enroll] sl." This question is difficult to

express using traditional dot expression and set operations.

Also note that we do not support the "in" operator to test the

membership of an object instance within a set in the first

version of K. The reason is that whatever the "in" operator

can express can always be expressed by using quantifiers, but

the converse is not always true due to the limitations of

dot expression described above. For example, to ask "if x


enrolls in course y," one can use the expression "exist c in

x *>[enroll] c:Course suchthat c = y."
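The truth conditions of the two quantifiers can be illustrated with a small Python sketch. The functions `exist` and `forall` below are hypothetical helpers written for this illustration, not K constructs; the list of bindings stands in for the object instances selected by an association pattern. Note the one deliberate difference from Python's built-in `all()`: following the text, `forall` over an empty sub-knowledge-base yields false.

```python
def exist(bindings, suchthat=lambda b: True):
    """True if at least one binding satisfies the optional condition."""
    return any(suchthat(b) for b in bindings)


def forall(bindings, suchthat):
    """True if every binding satisfies the condition.

    Following the text, an empty sub-knowledge-base yields False,
    unlike Python's built-in all(), which yields True.
    """
    bindings = list(bindings)
    return bool(bindings) and all(suchthat(b) for b in bindings)


# Hypothetical data: student name -> GPA.
gpa = {"Ann": 3.8, "Bob": 3.6}
print(forall(gpa, lambda s: gpa[s] > 3.5))
print(forall([], lambda s: True))
print(exist(gpa, lambda s: gpa[s] > 3.7))
```

The membership test mentioned above ("if x enrolls in course y") corresponds to `exist(courses_of_x, lambda c: c == y)` in this sketch, where `courses_of_x` is an assumed collection of the courses reached from x.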

As quantified expressions themselves are boolean

expressions too, they can be nested in an arbitrary number of

levels or combined with other constructs wherever boolean

expressions are allowed. For example, the following statement

will print the names of all the students who take all the

courses currently offered by their major departments.

context s:Student *>[major] d:Department
do if forall c1 in c1:Course *>[offered_by] d
suchthat exist c2 in s *>[enroll] c2:Course
suchthat (c1 = c2)
then s.name.display();
end if;
end context;
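For comparison, the nested quantification above can be approximated in Python as follows. This is only a sketch: the dictionaries `major`, `offered_by`, and `enroll` and their contents are hypothetical, and the emptiness check reflects the stated convention that a universal quantifier over an empty range returns false.

```python
# Hypothetical data for the associations used in the K statement.
major = {"Ann": "CIS", "Bob": "CIS", "Eve": "EE"}
offered_by = {"DB": "CIS", "OS": "CIS"}  # course -> offering department
enroll = {"Ann": {"DB", "OS"}, "Bob": {"DB"}, "Eve": set()}


def takes_all_major_courses(student):
    dept = major[student]
    dept_courses = {c for c, d in offered_by.items() if d == dept}
    # K's forall is false over an empty range, so require a non-empty set.
    return bool(dept_courses) and all(
        any(c1 == c2 for c2 in enroll[student]) for c1 in dept_courses
    )


result = [s for s in sorted(major) if takes_all_major_courses(s)]
print(result)
```

With this data only "Ann" qualifies: "Bob" misses one CIS course, and "Eve" is excluded because her department offers no courses.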



As mentioned in Chapter 3, behavioral properties of

objects are modeled by methods and rules. Corresponding to

each executable software system, the user also has to define

a named K program (similar to the "main" program of C++) as

the starting point of execution. Each program is defined in

the form of "program is (}

end [];." Note that such programs are defined in

parallel with object classes. A detailed description of

methods and rules will be given in this chapter.

5.1 Method Definition

At the language layer, each method is specified in the

method section and implemented in the implementation section

of its defining class in the following syntax:

public: | private: | protected:
; (;}
(public: | private: | protected:
; (;}

; (;}


Each is the signature of a method in

the form "method | operator ( <parameter> (,) ) :

[ of] ." Note that

could be a class name or "void" (which is used to specify that

there is no return value of this method). Each parameter is

specified in the same way as a variable declaration as "

: [ of] ," where is either

set, list, or array[], and is the type of the

parameter. Each is defined as

" is end [];,"

where represents a sequence of K computation

statements in the form "; (;)." One can

also overload any system-defined relational operator (>, <,

=, !=, >=, <=), arithmetic operator (+, -, *, /, mod), set

operator (+, -, &), logic operator (and, or), or unary

operator (+, -, not) by redefining the operator in a user-

defined entity class or domain class. Operator specifications

and implementations are syntactically the same as those of

other methods, except that the key word "method" is replaced by

"operator." The behavior of an operator therefore depends on the

object to which it is applied (operator polymorphism). The

difference between methods and operators is that while methods

are invoked generally by using the dot expression, operators

are invoked by using the traditional binary and unary

expressions. A detailed description of various types of

expressions will be given in the remainder of this section.


The body of a method is a sequence of K computation

statements, which can be categorized as follows:

(1) Expression. The simplest form of K statement is

"," which can be further categorized as follows:

(1.1) Single item. A single item could be (i) an

identifier (e.g., a local variable or attribute name in an

intra-class selection condition), (ii) a primitive domain

class object, i.e., integer (e.g., 12), real (e.g., 2.5),

boolean (e.g., true), character (e.g., 'A'), and string (e.g.,

"Smith"), (iii) a complex domain class object (e.g., date(mm

:= 12, dd := 31, yy := 1990)), or (iv) a method invocation

when the receiver of the method is omitted (when the receiver

is "this" in a method body or in an intra-class selection

condition). The syntax for a method invocation is

"( [(,)] )," where each

corresponds to a formal parameter defined by the

method. Note that an identifier starts with a letter or

special symbol ('_', '#'), followed by any number of letters,

digits, or special symbols.

(1.2) Dot expression. Traditional dot expression is

supported in K for simple navigation in one of the following

two forms: (i) ".," which is used to

access an attribute referred by of the object

instance returned by , or (ii)

".," where is a method invocation

applied to the object instance returned by . For


example, "x.advisor.name" returns the name of the advisor of

a student referred by "x." Similarly, "x.eval_GPA()" returns

the GPA value of the student referred by "x," where "eval_GPA"

is a method defined by class Student and takes no parameter.

(1.3) Assignment expression. One can use the assignment

expression to update an attribute of an object instance in

the form ". := ," or to

assign a value to a local variable in the form "

:= ". For example, the expression "x.age := 20"

updates the "age" attribute of a student instance referred by

"x" to be 20.

(1.4) Binary expression. Binary operators, which include

relational operator (>, <, =, !=, >=, <=), arithmetic operator

(+, -, *, /, mod), set operator (+, -, &), and logic operator

(and, or), can be used to form binary expressions in the form

" ." As mentioned in Section 3.4,

cast operator '$' can be used to ascribe a type (i.e., a class

name referred by an identifier) to an expression in the binary

form "$."

(1.5) Unary expression. Unary operators '+', '-', and

"not" can be used preceding an expression to form a unary

expression in the form . Note that unary

operators have higher precedence than binary operators.

(1.6) Array expression. One can use the array expression

"'['']'" to access a particular

element indexed by the second from an array


returned by the first . Multi-dimensional arrays

will be supported in the later version of K.

(1.7) Object expression. As mentioned in Section 3.5, one

can use the "new" and "pnew" operators to create transient and

persistent object instances of an entity class referred to

by an identifier and return the new iids in the form "new |

pnew ([ (,)] )." Note

that the optional assignment expressions are used to assign

values to some of the visible attributes, depending on where

this new object is created as described in Section 3.1. As

shown in Section 3.5, one can also use the "insert" operator

to create a new instance in a certain class for an existing

object instance referred by an expression and return the new

iid in the form " insert

([ (,)])."

(1.8) Quantifier expression. As mentioned in Section 4.4,

existential and universal quantifier expressions are boolean-

valued expressions in the form "exist <identifiers> in

[suchthat ]" and "forall <identifiers>

in suchthat ," where <identifiers> is

a list of variables in the form "{,}."

(1.9) Parenthesis expression. Any expression can be

enclosed in a pair of parentheses as () so that

it can be evaluated as a single expression without being

affected by its neighboring operators that have higher


precedence. We show the precedence priority of all the

operators in ascending order as follows:

level 1: :=
level 2: new, pnew
level 3: or
level 4: and
level 5: >, >=, <, <=, =, !=
level 6: +, - (binary plus/union, minus/difference),
& (intersection)
level 7: *, /, mod
level 8: not, +, - (unary plus and minus)
level 9: insert
level 10: . (dot operator)
level 11: $ (cast operator)

(2) Block statement. One can define a sequence of

statements as a block in the form "[local ] begin

end." Similar to Ada [DOD83], local variables can

be declared with a block following the key word "local" in the

form "; (;)," where each is a variable

declaration in the form {, ):

[ of] ." Note that the scope of a local

variable is limited to the end of the block in which the

variable is declared.

(3) Conditional statement. Corresponding to the Testing

(T) control association, two types of conditional statement

are provided: (i) if-then-else statement in the form "if

then [else ] end if,"

and (ii) case statement in the form "case when

do (when do ) [otherwise

do ] end case." Note that as conditional statements


can be nested, the use of the key words "end if" and

"end case" avoids the "dangling else" problem [AHO86] by

defining a clear scope for each statement. Also note that any

expression that returns a boolean value (including the

quantifier expressions described in Section 4.4) can be used

as the test condition.

(4) Repetitive statement. In addition to the context

looping statement "context [where ]

[select ] do end context" described in

Section 4.3, two types of repetitive statement are provided

for iteration: (i) for statement in the form "for

until [by ] do end for,"

and (ii) while statement in the form "while do

end while." The first following the

key word "for" will be an assignment expression that

initializes an iteration variable. The for-statement iterates

over until the value of the iteration variable

satisfies the following the key word "until."

After each iteration, the iteration variable is updated by

executing the assignment expression specified after the key

word "by." The default setting is "increased by 1." For

example, "for i := 1 until (i > 10) by i := i*2 do i.display()

end for;" will print 1, 2, 4, and 8. Note that the type of the

iteration variable can be any domain class that has the

following operators defined: plus (+) for increment, minus (-

) for decrement, assignment (:=) for initialization, and equal


(=) for comparison. The while statement will iterate over

as long as the expression that follows the key

word "while" returns "true."
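The behavior of the for statement can be mimicked with a short Python generator. This is a sketch under one assumption that the text does not state explicitly: the "until" test is checked before each iteration, including the first. The helper name `for_until` is hypothetical.

```python
def for_until(init, until, step=lambda i: i + 1):
    """Yield the iteration variable for as long as the "until" test is false.

    step defaults to "increase by 1", the default stated in the text.
    Assumption: the test is evaluated before every iteration.
    """
    i = init
    while not until(i):
        yield i
        i = step(i)


# Mirrors: for i := 1 until (i > 10) by i := i*2 do i.display() end for;
values = list(for_until(1, lambda i: i > 10, lambda i: i * 2))
print(values)
```

Under that assumption the sketch reproduces the sequence 1, 2, 4, 8 given in the example.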

(5) Flow statement. Inside a repetitive statement, one

can use the "break" statement or "continue" statement to alter

the control flow. The break statement causes the control to

exit the repetitive structure, while the continue statement

ignores the statements following the continue statement and

forces the control to go to the beginning of the repetitive

structure. One can also use the "return []"

statement to terminate the execution of a method normally and

return the control flow to the point where the method is invoked.


(6) Object statement. As mentioned in Section 4.5, one

can use the "delete " and "destroy "

statements to manipulate the knowledge base. The delete

statement deletes an entity class instance returned by

as well as all the instances that have the same

oid from all the subclasses of the entity class. For each

instance being deleted, any association with other entity

class instances is also deleted. The destroy statement, on the

other hand, recursively deletes not only the entity class

instance returned by , but also all the instances

that have the same oid as from all the

superclasses and subclasses of the entity class.


(7) Rule statement. Within a user program, one can use

the "deactivate " and "activate "

statements to deactivate/activate a rule whose rule name is

referred by . A detailed description of rules will

be given in Section 5.2.

(8) Abort statement. As objects are manipulated by

methods in the object-oriented paradigm [STE86], we feel that it

is natural to consider the execution of each method as

a single transaction, which commits when the method

terminates. We also provide the user with the "abort"

statement to undo any update to the knowledge base (i.e., the

states of persistent entity class instances) made before the

abort statement, and after either (i) the beginning of the

method execution (if no abort statement has been executed) or

(ii) the previous abort statement executed within the method

execution. Note that the abort statement has no effect on the

update to the value of any local variable itself or to the

state of any transient entity class instance. Also note that

the abort statement does not alter the normal control flow.

The execution of a method continues as a new transaction and

all the updates made after the last abort statement will be

committed to the knowledge base when the method terminates.

A more detailed description of the execution model will be

given in Chapter 6.
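The effect of the abort statement described in item (8) can be sketched with a toy Python model. The class `MethodTransaction` and its methods are hypothetical names for this illustration; the actual mechanism belongs to the KBMS storage layer and is not shown in this chapter.

```python
class MethodTransaction:
    """Toy model of one method execution treated as a transaction."""

    def __init__(self, knowledge_base):
        self.kb = knowledge_base  # committed persistent state
        self.pending = {}         # uncommitted persistent updates

    def update(self, key, value):
        self.pending[key] = value

    def read(self, key):
        return self.pending.get(key, self.kb.get(key))

    def abort(self):
        # Undo persistent updates made since the start or the last abort.
        self.pending.clear()

    def commit(self):
        # Called when the method terminates.
        self.kb.update(self.pending)
        self.pending.clear()


kb = {"age": 19}
tx = MethodTransaction(kb)
counter = 0           # a local variable, outside the knowledge base
tx.update("age", 20)
counter += 1
tx.abort()            # "age" reverts to 19; counter keeps its value
tx.update("age", 21)  # made after the abort, so it is kept
tx.commit()
print(kb["age"], counter)
```

The abort discards only the pending persistent update, control flow simply continues, and the update made after the abort is committed when the method ends.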


5.2 Rule Definition

Rules serve as a high-level mechanism for specifying

declarative knowledge that governs the manipulations of

objects made by KBMS operations, updates, and user-defined

methods. Rules are specified in the following syntax:

rule is
[condition ]
[action ]
[otherwise ]
end []

Each rule is given a name for its identification, which must

be unique within its defining class. Each rule is specified

by a set of trigger conditions and a rule body. Trigger

conditions are specified as " (;

)," where each consists

of a timing specification and a sequence of knowledge-base

event specification in the form " (,

}." Timing specification (or coupling mode) can be

"before," "after," or "immediate after." Event specification

can be a (1) KBMS operation (new, pnew, insert, delete, and

destroy), (2) update ("update [::] "),

where is an attribute defined or inherited by the

defining class of the rule, or (3) user-defined method

("[::] ()"), where is a

method defined or inherited by the defining class of the rule.

The two-colon operator ("::") is used to specify from which


superclass of the defining class that a particular attribute

or method is inherited when name conflict occurs. Note that

the two-colon operator is used only in the trigger conditions

of rules, not in the computation statements where the cast

operator ('$') should be used for conflict resolution as

described in Section 3.4. Internally, the system uses the

event specification as the key of an in-memory rule hash table

for fast retrieval of applicable rules as will be described

in Chapter 6.

Note that certain combinations of timing and event are

semantically invalid. For example, "before new/pnew/insert"

and "after delete/destroy" are meaningless because each rule

is applied to some instance of its defining class and

no such instance exists for the above trigger conditions.

For domain classes, "pnew/insert/delete/destroy" are not valid

events as they are not applicable to values. However, we allow

the use of "new/update" in a trigger condition to specify

constraints that govern the values in a range or enumeration.

It is important that the use of "new/pnew," "insert" and

"update " not be confused. In general, rules that

govern the values of a certain attribute should be specified by

using "update " as the event instead of using

"new/pnew" or "insert." The reason is that any assignment of

attribute value is treated as an update operation, even if it

is embedded in a new, pnew, or insert operation. Similarly,

for those rules that ensure that certain attributes can not


be null (i.e., non-optional) when an instance is created, one

should use "insert" instead of "new/pnew" for entity classes,

and use "new" for domain classes. The reason is that creating

a new entity class object of class "X" implies the insertion

of corresponding instances to class "X" as well as all the

superclasses of "X," while inserting an instance does not

imply creating a new object as described in Section 3.5.

We do not support "retrieve" as a valid KBMS operation

specification for the following reasons. First, as far as the

knowledge-base consistency is concerned, the retrieve (read)

operation will not change the state of the knowledge base.

Second, as we expect there are many retrieve operations in a

complex K program, the performance will be significantly

reduced if the system has to check rules for every retrieve

operation. Third, in the case that the user would like to

trigger some action when retrieving a certain attribute, he/she

can apply the object-oriented encapsulation principle by

defining this attribute as a "private" attribute and defining

a public/protected method as the interface to retrieve this

attribute. Rules associated with this method can then be

defined to perform the triggering.

The rule body consists of (i) "condition" clause that is

a guard expression, and (ii) "action," and "otherwise"

clauses, both of which can be a sequence of any K computation

statement described in Section 5.1. Each guard expression is

in the form "[ (, ) |] " and the


evaluation of a guard expression can return either (i) true:

if all the guards and the target (all of which are boolean

expressions by themselves) are true, (ii) skip: if any of the

guards is false when they are evaluated from left to right,

(iii) false: if all the guards are true but the target is

false. Note that the guard expression itself is not a boolean

expression, and it can be used only in the condition clause

of rule definitions, not in computation statements. Also

note that, although the semantics of a guard expression can

be implemented by nesting of if-then-else constructs, the

guard expression is a simpler and more concise construct to

use, particularly when the number of guards is large. Besides,

we feel that rules should be specified as declaratively as

possible, and we would like to make a clear distinction among

the "condition," "action," and "otherwise" parts of a rule

instead of mixing them in a nested if-then-else procedural

statement. This clear distinction also makes it possible for

different implementation strategies. For example, the

"condition" and "action" (or "otherwise") parts can be

executed as separate transactions as in HiPAC [HSU88, CHA89]

and ODE [GEH91]. Similar to method invocation, rule checking

is performed at the instance level, and the pseudo variable

"this" can be used in a rule body to represent a certain

instance of the defining class to which some event occurs as

shown in Figure 3.1.


We modified the syntax of the OQL rule language [ALA90]

so that it can be seamlessly incorporated into K. The

difference between the K rule language and the OQL rule

language is three-fold. First, we use a uniform syntax that

is similar to the ECA (event-condition-action) rule of

Chakravarthy [CHA89] to represent both state rules

(constraints) and operational rules (triggers) for simplicity.

Second, we use the guard expression to subsume the semantics

of "if <pattern> exists, then <pattern> must also exist,

otherwise do some corrective action" in the OQL rule language

so that we can provide more expressive power (e.g., any number

of guards can be specified, and any boolean expression

including quantifier can be used as a guard or target) and

avoid the confusion with the "if-then-else" computation

statement. Third, we subsume the implicit set-oriented

semantics in the OQL rule language "if context then

" and "if context then

corrective action ," where the use of the key word

"context" implies that this rule will be checked against all

the instances of the defining class of the rule. To specify

the same rule in K, quantifier expression can be used to test

the existence of certain patterns, and context looping

statement can be used in the action- or otherwise- clause to

iterate over certain context. Note that the OQL rule language

is more declarative in the sense that the user does not have

to deal with the detailed specification of control structure;


however, this feature is undesirable in an integrated

knowledge-base programming language because it overloads the

semantics of the "if-then-else" computation statement and

prohibits the user from having finer-grained control over the

computation at the instance level in terms of using various

control structures and variable bindings. Besides, when more

than one pattern is specified (e.g., in a guard expression),

the semantics of looping over several contexts at the same

time is undefined. To sum up, K provides consistent syntax,

more expressiveness, and finer-grained control of the rule

actions over OQL rules.

All the rules are assumed to be active when a user

session begins. However, during the execution of a user

program, one can use the "activate " or

"deactivate " statements to temporarily activate

or deactivate any particular rule referred by ,

respectively. Note that rules are treated as protected

properties and are encapsulated in their defining classes.

Therefore, each rule of class "X" can be referred by the

activate/deactivate statements only in those method bodies and

rule bodies of class "X" or any subclass of "X." For each

knowledge-base event that occurs to instance "this" of class "X,"

all the applicable rules will be triggered (i.e., their rule

bodies will be evaluated) according to the trigger

conditions of each rule either (i) before the triggering

event, (ii) immediately after the triggering event, or (iii)


not immediately after the triggering event, but at the end of

the parent event that causes the triggering event. A detailed

description of how to find applicable rules will be given in

Chapter 6. Note that the use of "after" mode allows for

temporary violation of constraints (which is likely to happen

when a constraint on an object depends on two inter-related

values and when one of the values is updated) by deferring the

rule checking until the end of a higher level operation. In

the case that multiple rules satisfy a trigger condition, such

rules will be triggered in some unspecified order, which is

dependent on the implementation. A later version of K will allow

the user to explicitly specify the triggering orders in terms

of rule associations, i.e., sequential, parallel, and

synchronization relationships among rules.

The rule body of each rule is evaluated as follows: (i)

if the condition-clause returns true, then the action-clause

(if provided) is executed, (ii) if the condition-clause

returns skip, then do-nothing, and (iii) if the condition-

clause returns false, then the otherwise-clause (if provided)

is executed. For example, the rule CIS_rulel specified in

Figure 3.1 will be executed at the end of those methods that

are applied to a student instance and update the major of this

particular student. The otherwise-clause will be executed if

this particular student is a CIS major (guard is true) and

his/her GPA is not greater than 3.0 (target condition is

false). Similarly, Generalrulel will be checked after the


method "suspend." A detailed description of the computation

model will be given in Chapter 6.
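The three-valued evaluation of a guard expression and the resulting choice between the action and otherwise clauses can be sketched in Python as follows. All names here (`Outcome`, `evaluate_condition`, `run_rule_body`, `check_gpa_rule`) are hypothetical, and the rule shown is only a stand-in with the guard "major is CIS" and the target "GPA > 3.0"; the real rule appears in Figure 3.1, which is not reproduced here.

```python
from enum import Enum


class Outcome(Enum):
    TRUE = "true"
    SKIP = "skip"
    FALSE = "false"


def evaluate_condition(guards, target):
    """Evaluate the guards left to right, then the target."""
    for guard in guards:
        if not guard():
            return Outcome.SKIP  # a false guard: the rule does not apply
    return Outcome.TRUE if target() else Outcome.FALSE


def run_rule_body(guards, target, action=None, otherwise=None):
    outcome = evaluate_condition(guards, target)
    if outcome is Outcome.TRUE and action is not None:
        action()
    elif outcome is Outcome.FALSE and otherwise is not None:
        otherwise()
    return outcome


log = []


def check_gpa_rule(student):
    # Hypothetical stand-in rule: guard "major is CIS", target "GPA > 3.0".
    return run_rule_body(
        guards=[lambda: student["major"] == "CIS"],
        target=lambda: student["GPA"] > 3.0,
        otherwise=lambda: log.append("corrective action for " + student["name"]),
    )


r1 = check_gpa_rule({"name": "Ann", "major": "CIS", "GPA": 2.8})
r2 = check_gpa_rule({"name": "Bob", "major": "EE", "GPA": 2.0})
r3 = check_gpa_rule({"name": "Eve", "major": "CIS", "GPA": 3.6})
print(r1, r2, r3, log)
```

Only the first student reaches the otherwise clause: the second is skipped because the guard fails, and the third satisfies the target, so the (absent) action clause would apply.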

As a final note, rules specified in a class definition

can conflict with other rules for the same knowledge-base

event or cause infinite looping during execution. In general,

it is not possible to automate the validation of rules in the

context of knowledge-base programming language. It is the

user's responsibility to make sure such logic errors do not

happen as in any programming language.


A computation model, akin to a virtual machine, provides

the operational semantics of a programming language. To meet

the various requirements of different application domains, a

multi-paradigm computation model is needed for supporting

object-oriented, parallel, non-deterministic, and rule-based

computations and, at the same time, satisfying the concurrency

control and recovery requirements of the knowledge-base

management system. In this chapter, we present the computation

model of K, which satisfies the above requirements in an

integrated fashion.

6.1 Overview

The computation model of K is based on the object-oriented

paradigm [STE86, ACM90] and nested transactions [MOS81, HSU88]

to model the behavior of the combined execution of methods and

triggered rules in an object-oriented framework.

The nested transaction model provides a mechanism for

coping with failures and introducing concurrency in

distributed applications [MOS81]. It also provides an ideal

framework for modeling cooperating users and long-lived

activities in a software development environment [DAY90]. The


execution of each method is considered as a transaction, i.e.,

an atomic action against the knowledge base: once invoked, it

either completes all its operations or behaves as if it were

never invoked. Transactions can be nested to an arbitrary

number of levels by defining a new method, which in turn

invokes other pre-defined methods. As a result, a transaction

may contain any number of nested transactions or sub-

transactions, some of which may be required to perform

sequentially, some concurrently, and all are organized as a

transaction tree whose root is the top-level transaction.

Changes to the knowledge base made by a nested transaction are

contingent upon the successful commitment of all of its

ancestral transactions. Aborting any of its ancestors

invalidates all of its changes. If a nested transaction

aborts, the knowledge-base state seen by its parent is the

same as it was immediately prior to starting the nested

transaction. Note that the abort of a transaction has no

effect on any update made to the value of a local variable

itself or to the state of any transient entity class instance.

Concurrency control and recovery will be supported by the

storage layer of the KBMS using the traditional pessimistic

transaction mechanism [ONT91].
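The commit and abort behavior of nested transactions described above can be illustrated with a small Python model. This is a simplified sketch, not the KBMS implementation: the class `Transaction` and its methods are hypothetical, only sequential sub-transactions are modeled, and concurrency control is ignored.

```python
class Transaction:
    """Toy nested transaction: changes reach the knowledge base only
    when every ancestor up to the top-level transaction commits."""

    def __init__(self, store, parent=None):
        self.store = store    # shared committed knowledge base (dict)
        self.parent = parent
        self.writes = {}      # this transaction's uncommitted changes

    def begin_nested(self):
        return Transaction(self.store, parent=self)

    def write(self, key, value):
        self.writes[key] = value

    def read(self, key):
        tx = self
        while tx is not None:
            if key in tx.writes:
                return tx.writes[key]
            tx = tx.parent
        return self.store.get(key)

    def commit(self):
        # A nested commit only hands the changes to the parent.
        target = self.parent.writes if self.parent else self.store
        target.update(self.writes)
        self.writes = {}

    def abort(self):
        self.writes = {}


store = {"x": 0}
top = Transaction(store)
child = top.begin_nested()
child.write("x", 1)
child.abort()
seen_after_child_abort = top.read("x")   # parent sees the prior state

child2 = top.begin_nested()
child2.write("x", 2)
child2.commit()
seen_after_child_commit = top.read("x")  # visible to the parent...
top.abort()                              # ...but lost when an ancestor aborts
print(seen_after_child_abort, seen_after_child_commit, store["x"])
```

The design choice in this sketch is that a nested commit merges into the parent's workspace rather than into the store, which is what makes a child's changes contingent on the commitment of all its ancestors.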

The triggering of a rule (i.e., the execution of a rule

body) is also considered as a sub-transaction of the

triggering transaction (i.e., the transaction that represents

the execution of a method that triggers the rules) or the
