The design and implementation techniques for an integrated knowledge base management system


Material Information

Title:
The design and implementation techniques for an integrated knowledge base management system
Physical Description:
ix, 177 leaves : ill. ; 28 cm.
Language:
English
Creator:
Raschid, Louiqa, 1958-
Publication Date:

Subjects

Subjects / Keywords:
Expert systems (Computer science)   ( lcsh )
Artificial intelligence   ( lcsh )
Database management   ( lcsh )
Genre:
bibliography   ( marcgt )
theses   ( marcgt )
non-fiction   ( marcgt )

Notes

Thesis:
Thesis (Ph. D.)--University of Florida, 1987.
Bibliography:
Includes bibliographical references (leaves 168-176).
Statement of Responsibility:
by Louiqa Raschid.
General Note:
Typescript.
General Note:
Vita.

Record Information

Source Institution:
University of Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
aleph - 001030983
notis - AFB3115
oclc - 18145350
System ID:
AA00003378:00001

Full Text













THE DESIGN AND IMPLEMENTATION TECHNIQUES FOR
AN INTEGRATED KNOWLEDGE BASE MANAGEMENT SYSTEM









By

LOUIQA RASCHID


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN
PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY



UNIVERSITY OF FLORIDA


1987









































Copyright 1987

by

Louiqa Raschid

















To my parents

Naima and Abdur Raschid


This was their dream before it was mine















ACKNOWLEDGEMENTS


I am deeply indebted to Dr. Stanley Su for introducing me to the subject area of my

doctoral research, for the long and invaluable hours devoted to guidance, discussion and even

argument, for serving as advisor par excellence, for chairing this supervisory committee and

for providing financial support. I also thank him and his wife Siew Phek for their precious

gift of friendship.

I am indebted to Dr. Sham Navathe for serving on this committee, for his technical

contributions to this research and for his encouragement and guidance both professional and

personal. I thank Dr. Keith Doty and Dr. Herman Lam for serving on this committee and

for many years of inspiration, encouragement and friendship. I am grateful to Dr. Doug

Dankel for serving on this committee and for his careful perusal of this manuscript.

I owe a debt of gratitude to Sharon Grant for her tireless efforts to provide a well-

administered research environment and for her much valued friendship and support. I thank

all my colleagues for their camaraderie, especially Mingsen (Mr.) Guo for many fruitful

discussions on diverse subjects, Clay for Unix expertise, and Vishu and Ashish for comic relief.

To all my friends both far and near, I could not have done it without you. To Geeta,

my sister in spirit, words cannot express how much I value your friendship. You prodded me

to tread on paths that I would otherwise have ignored. Thank you, John, for editorial assis-

tance, truffles and amaretto, encouragement and caring, Gary for the many things you

taught me about myself, your affection and support, Raj, Esther, Chand, Rekha, Vogel, ..

To those who are farther yet dearer, Mummy and Fatso, Liqa and Lulu and your loved

ones, Ferial and family, Izmeth, Ayesha, Faeez, Lafir, Seela and others, you held me in your

thoughts and made the agony endurable, thank you.










Finally, I can never repay a debt to those in Sri Lanka, India, everywhere, who

have so little and at whose expense I have consumed so much; some day we will be a part of

a world where a more equitable distribution prevails.















TABLE OF CONTENTS


Page

ACKNOWLEDGMENTS .......... iv

ABSTRACT .......... viii

CHAPTER

I INTRODUCTION .......... 1

II A SURVEY OF RELATED WORK .......... 7

    Knowledge Representation .......... 7
    Object-Oriented Programming Environments .......... 9
    The Role of Constraints in Knowledge Management .......... 10
    Other Techniques for Integrating DBMS and AI Technologies .......... 12

III ARCHITECTURE OF THE INTEGRATED OBJECT-ORIENTED KBMS .......... 14

IV EXPRESSING RULES IN A KNOWLEDGE MANIPULATION LANGUAGE .......... 20

    The Semantics Captured in Rules .......... 20
    Rules Expressed in a KML .......... 22
    Advantages of Classifying Rules .......... 31

V CAPTURING SEMANTIC FEATURES IN A KNOWLEDGE BASE .......... 36

    Useful Semantic Features for Modeling Knowledge .......... 36
    Representing the Attributes of an Object Type .......... 38
    Modeling the Use of Properties such as Transitivity .......... 43
    Grouping Objects into Generalization "Is-A" Hierarchies .......... 45
    Attribute Inheritance in the Generalization Hierarchy .......... 47
    Modeling Complex Object Types and their Interactions .......... 49
    Modeling Composite Object Types or the "Is-A-Part-Of" Relationship .......... 52

VI THE MECHANISM OF RULE PROCESSING IN AN INTEGRATED KBMS .......... 59

    The Match-Modify-Execute (MME) Cycle .......... 66
    Example Transaction Fragments in the MME Cycle .......... 73
    Simulating the MME Cycle Using a Production System .......... 77

VII A DBMS APPROACH TO PROCESSING KBMS TRANSACTIONS .......... 92

    A Performance Measure for the MME Cycle Implementation .......... 94
    Review of Available Techniques for Processing Rules .......... 99
    Structuring Rules Within Object Types .......... 103
    Identifying the Scope of a Rule .......... 110
    Issues Concerning the Concurrent Execution of Rules .......... 112

VIII A METHOD FOR THE EFFICIENT EVALUATION OF LINEAR RECURSIVE QUERIES IN THE INTEGRATED KBMS .......... 129

    Review of Methods for Evaluating Recursive Queries .......... 130
    The Impact of Query Processing Techniques .......... 133
    Our Method for Efficiently Evaluating Linear Recursive Queries .......... 135
    An Algorithm for Evaluating a Linear Recursion .......... 138
    Performance Evaluation .......... 141
    Summary and Extensions to the Evaluation .......... 150

IX SUMMARY AND FUTURE RESEARCH .......... 165

    Summary .......... 165
    Future Research .......... 166

REFERENCES .......... 168

BIOGRAPHICAL SKETCH .......... 177

















Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy


THE DESIGN AND IMPLEMENTATION TECHNIQUES FOR
AN INTEGRATED KNOWLEDGE BASE MANAGEMENT SYSTEM


By

LOUIQA RASCHID

August 1987

Chairman: Dr. Stanley Y.W. Su
Major Department: Electrical Engineering

A knowledge base management system (KBMS) represents an innovative technology

necessary for supporting applications that require knowledge to reason about large quantities

of data. The integration of artificial intelligence and database management technologies is

critical to the success of KBMS technology and is the focus of our research. A design metho-

dology for the integration process and several implementation techniques for a knowledge

base management system (KBMS) have been studied and the results are presented in this

dissertation.

Three important elements for integration are identified in our research. First, an

object-oriented knowledge representation model is used to define object types by (a) their

structural relationships with other object types, (b) operations that are executed against the

occurrences of the object type and (c) rule-based knowledge that captures constraints, rules

of inference, expert knowledge, etc., relevant to the object type and its occurrences. Thus,

the model integrates facts and rules within the object types of the knowledge base. Next, we

use the constructs of a single knowledge manipulation language (KML) to specify both










operations and rules for object types and we identify the constructs of this KML. The third

element is a mechanism for applying the rule-based knowledge while compiling and executing

a KBMS transaction against the knowledge base. A match-modify-execute (MME) cycle uses

rules to modify KML operations in a KBMS transaction during compilation. The MME cy-

cle incorporates rules into KBMS transactions during both compilation and execution. The

operation of the MME cycle has been verified in a production system environment.

Our approach of using a single KML to specify operations and rules and the MME cycle

mechanism that executes these operations and rules against the knowledge base supports the

similarity in database and rule processing functions of the KBMS. Consequently, implemen-

tation techniques from existing database technology can be tailored for use in the integrated

KBMS and we have examined two techniques. The first technique is the interleaved execu-

tion of concurrent transactions and we studied its effect on the execution efficiency of KBMS

transactions. The second technique is query optimization exploiting decomposition, inter-

mediate result sharing and pipelined execution. An analytical study on the benefits of using

these optimization techniques while evaluating a linear recursive query is presented in this

dissertation.















CHAPTER I

INTRODUCTION



As individual technologies mature, it is often beneficial to integrate technologies either

to solve problems that cannot be solved by each individual technology or to solve these prob-

lems more efficiently. The research presented in this dissertation relates to the integration of

technologies leading to a new technology, namely knowledge base management systems.

A knowledge base management system (KBMS) represents a new technology that has

recently emerged from the merging of two existing technologies, namely artificial intelligence

(AI) and database management systems (DBMS). KBMS technology benefits from the

knowledge representation techniques, the deductive problem solving capability, the enhanced

query languages, the explanation facility that follows a particular line of reasoning, etc., sup-

ported by an AI reasoning system or an expert system. It also benefits from the efficient and

sophisticated management of a large database, the enforcement of reliability, security and

integrity, the efficient implementation techniques that exploit query optimization methods,

the use of parallel processing to support concurrent execution of database transactions, etc.,

supported by a DBMS. Other advantages of merging these technologies include the use of

semantic knowledge for query processing and optimization, the support of intelligent user

interfaces, etc., in the KBMS.

Various application domains that use expert knowledge when processing large quanti-

ties of data can benefit from the new KBMS technology. Computer aided design of VLSI cir-

cuits and the engineering design process in a manufacturing environment are two examples of

these domains. The layout of a single VLSI circuit or a design database in a manufacturing

environment involves several megabytes of data. Fragments of this data will be accessed and










modified by various users and tools and it would require all of the features provided by con-

ventional DBMS technology to manage these databases.

At the same time, the complex process of design and testing could benefit from AI tech-

nology as well. Expert knowledge guiding these tasks may be in the form of rules and con-

straints. AI reasoning techniques can be used to apply the relevant rules when processing the

data and to maintain consistency of the circuit layout or the design data with respect to the

constraints. Various heuristics supported by AI technology can also be used to make these

tasks more manageable and efficient.

Several approaches for merging AI and DBMS technologies have been suggested

[BRO84, GAL83, JAR84b, VAS85 and WHA87]. One approach is to build an interface or a

bridge between a database processor that manages a database of facts or assertions (the

extensional database) and an inference processor that manipulates deductive rules (the inten-

sional database). This approach is used in JAR84a, KEL82 and KEL84 and is discussed in

Chapter Two. The main disadvantage of this "separate but equal" approach to knowledge

management is the use of separate representation schemes for facts and rules; this makes it

cumbersome and expensive to use the rules for processing the extensional database.

Other approaches to implementing a KBMS either enhance a DBMS with deductive

power so that the search portion of a reasoning system can be moved into the DBMS

[STO83, WON84 and STO84] or enhance a logic programming system with database

facilities [WAR84]. Although these approaches may achieve some of the necessary functionality

of a KBMS, it is difficult to extend an existing system to handle requirements outside the ori-

ginal system specifications.

In this dissertation, we describe our approach for integrating these technologies for the

purpose of designing and implementing a KBMS. We follow the object-oriented paradigm as

proposed in KER84b, WIE83 and WIE84 and discussed in Chapter Two. A primary feature

of the KBMS is an object-oriented knowledge representation model which defines the struc-

ture, operations and rules for the object types of a single integrated knowledge base.










Problem solving knowledge is captured by the rules defined for the object types. The

object types provide a natural structuring for knowledge, i.e., they provide a binding

between facts and relevant rules; this has been identified as a desired property of a KBMS

[WIE83 and WIE84]. The integrated KBMS also supports a single knowledge manipulation

language (KML) which is used both to specify operations that manipulate facts and as a rule

language to express problem solving knowledge relevant to the facts.
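
The binding between facts and relevant rules can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the names `ObjectType`, `Rule` and `insert`, and the wire-width example, are hypothetical and are not constructs of the KML or of SAM*.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Rule:
    """A condition-action pair attached to an object type (hypothetical shape)."""
    name: str
    condition: Callable[[dict], bool]
    action: Callable[[dict], None]


@dataclass
class ObjectType:
    """Keeps the facts (occurrences), an operation and the rules in one unit."""
    name: str
    attributes: List[str]
    occurrences: List[dict] = field(default_factory=list)
    rules: List[Rule] = field(default_factory=list)

    def insert(self, **values) -> dict:
        """Operation: store a new occurrence, then apply the rules bound to this type."""
        unknown = set(values) - set(self.attributes)
        if unknown:
            raise ValueError(f"unknown attributes: {sorted(unknown)}")
        occurrence = dict(values)
        self.occurrences.append(occurrence)
        for rule in self.rules:
            if rule.condition(occurrence):
                rule.action(occurrence)
        return occurrence


# Assumed example: a design constraint that no wire may be narrower than 2 units.
wire = ObjectType("Wire", ["id", "width"])
wire.rules.append(Rule(
    "min_width",
    condition=lambda w: w["width"] < 2,
    action=lambda w: w.update(width=2),
))
narrow = wire.insert(id="w1", width=1)   # the rule fires and widens the wire
wide = wire.insert(id="w2", width=5)     # the rule does not fire
```

Because the rule is stored with the object type, an operation on that type finds the relevant rules without searching a separate rule base.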

In comparison with the other approaches, ours is closest to the meaning of integration.

We combine a DBMS and an AI rule processing system into a single integrated KBMS. This

integration takes place both at the representation and functional levels.

To elaborate, integration at the representation level means that the KBMS provides a

uniform representation framework capable of defining the facts corresponding to the DBMS

as well as the knowledge (rules) used by the AI reasoning system to solve problems. Addi-

tionally, a single knowledge manipulation language (KML) provides a unified scheme to

express both operations and rules defined for the object types.

Integration at the representation level leads to functional integration of the KBMS

components. The DBMS component that processes the operations and the AI reasoning

component that processes the rules use the same KML constructs to manipulate the object

types of the integrated knowledge base. This common characterization helps identify com-

mon functionality among the different components and this leads to functional integration.

Functional integration in the KBMS eliminates functional redundancy of the components. It

also leads to an efficient implementation of the KBMS since implementation techniques that

have been successful in either DBMS or AI technology can be applied to the functionally

integrated KBMS.

This dissertation is organized as follows: Chapter Two provides a brief survey of the

relevant literature. Chapter Three outlines the architecture of the integrated KBMS and

lists its desired features. We show how a technique of incorporating rules into the semantic

association model SAM* [SU83 and SU85], an object-oriented semantic data model currently










under development at the University of Florida Center for Database Research, is used to

provide an object-oriented framework for knowledge representation. Chapters Four through

Seven deal with several important aspects in the design of the integrated KBMS, as outlined

in Chapter Three.

Chapter Four deals with the semantics of the knowledge manipulation language (KML).

We describe the use of the KML to specify operations that manipulate the object types and

to express rules. We discuss different categories of KML constructs and show how rules

expressed using these constructs can explicitly specify declarative and operational (process

oriented or procedural) semantics of knowledge. We introduce two main categories of rules,

namely value independent rules and value dependent rules and discuss their differences and

the advantages of classifying rules. We also discuss how rules can be used to support both

forward and backward inference chains.

Chapter Five presents an example knowledge base. In this chapter, we describe how

semantic features found useful in modeling knowledge can be captured by the object types of

our knowledge representation model. First we discuss semantic features, from both DBMS

and AI literature, that have been found useful in modeling knowledge from diverse domains.

Next, we describe some of the different association types of SAM*, the semantic data model

that we use in the design of our knowledge representation model. We then show how these

association types and the rules defined for them can be used to construct the object types

which in turn capture the previously identified semantic features.

A mechanism for applying the rules that capture problem solving knowledge is critical

to the success of the integrated KBMS. We use a transaction oriented paradigm to charac-

terize processing in the KBMS. A transaction is typically a sequence of KML operations

which are executed against the object types of a knowledge base. A match-modify-execute

(MME) cycle represents the mechanism of applying rules, defined for these object types,

while executing this KBMS transaction, and it is described in Chapter Six.










The different categories of rules are treated differently in the MME cycle. For example,

value independent rules that capture operational semantics are matched against the transac-

tion and are used to modify the transaction prior to execution. The modifications can incor-

porate further operations or specify some operations to be conditionally executed. In con-

trast, value dependent rules that capture declarative semantics are incorporated into the

transaction to be executed against the knowledge base; these rules are explicitly selected for

execution.

During execution, too, the transaction can be modified and operations or value depen-

dent rules may be incorporated; this causes the MME cycle to be called recursively. Com-

mitting the changes made to the knowledge base by the transaction can result in certain con-

ditions being satisfied; this leads to the implicit selection and execution of other value depen-

dent rules that also capture declarative semantics. Details of the MME cycle are discussed in

Chapter Six. We also describe a prototype of the MME cycle implemented in the OPS5 pro-

duction system environment [FOR81].
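
The control flow described above can be sketched in Python. This is a deliberately reduced illustration, not the OPS5 prototype: operations are modeled as hypothetical `(verb, key, value)` tuples, value independent rules as functions that inspect only an operation, and value dependent rules as functions that inspect the knowledge base after the commit. The explicit incorporation of value dependent rules into the transaction is omitted.

```python
from typing import Callable, Dict, List, Tuple

Operation = Tuple[str, str, int]   # (verb, key, value): hypothetical stand-in for a KML operation
KnowledgeBase = Dict[str, int]
RewriteRule = Callable[[Operation], List[Operation]]      # value independent
TriggerRule = Callable[[KnowledgeBase], List[Operation]]  # value dependent


def mme_cycle(transaction: List[Operation], kb: KnowledgeBase,
              rewrite_rules: List[RewriteRule], trigger_rules: List[TriggerRule],
              depth: int = 0, max_depth: int = 10) -> KnowledgeBase:
    if depth > max_depth:
        raise RecursionError("rules did not settle within max_depth cycles")

    # Match and modify: rules add operations before anything is executed.
    modified: List[Operation] = []
    for op in transaction:
        modified.append(op)
        for rule in rewrite_rules:
            modified.extend(rule(op))

    # Execute the modified transaction against the knowledge base.
    for verb, key, value in modified:
        if verb == "set":
            kb[key] = value
        elif verb == "add":
            kb[key] = kb.get(key, 0) + value
        else:
            raise ValueError(f"unknown operation: {verb}")

    # After the commit, conditions on the stored values may select further rules;
    # their operations form a new transaction and the cycle is called recursively.
    follow_up: List[Operation] = []
    for rule in trigger_rules:
        follow_up.extend(rule(kb))
    if follow_up:
        mme_cycle(follow_up, kb, rewrite_rules, trigger_rules, depth + 1, max_depth)
    return kb


def count_stock_updates(op: Operation) -> List[Operation]:
    verb, key, _ = op
    return [("add", "updates", 1)] if verb == "set" and key == "stock" else []


def reorder_when_low(kb: KnowledgeBase) -> List[Operation]:
    if kb.get("stock", 0) < 10 and kb.get("reorder", 0) == 0:
        return [("set", "reorder", 1)]
    return []


kb = mme_cycle([("set", "stock", 4)], {}, [count_stock_updates], [reorder_when_low])
```

The `max_depth` guard is an assumption of the sketch; it stands in for whatever termination argument a real rule set would need.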

The OPS5 prototype of the MME cycle highlights several implementation issues which

lead to the discussions in the following chapters. In Chapter Seven, we first introduce a per-

formance measure for the MME cycle implementation and review available techniques to

process rules. We discuss how different categories of rules defined for an object type can be

effectively structured to reflect and exploit differences in usage of these rules in the MME

cycle. We also discuss how the context, i.e., those object types that are relevant to a rule,

can be determined.

The focus of Chapters Seven and Eight is the functional integration of the DBMS and

AI reasoning system components within the KBMS. One approach to functional integration

is to characterize these components using common functions. We characterize the execution

of rules using DBMS retrieval and storage manipulation functions. Now we can study the

migration of proven, efficient techniques from a DBMS to the functionally integrated

KBMS.










Chapter Seven deals with the increased efficiency resulting from the interleaved execu-

tion of concurrent transactions. If we can identify and isolate KBMS transactions that can

be executed in parallel, then we can benefit from concurrency in the KBMS. The approach

taken is, first, to identify sources of parallelism and isolate a set of independent transactions

that can be executed in parallel. Then, we extend the serializability criterion for the correct-

ness of concurrent DBMS transactions to the KBMS. We prove that the concurrent execu-

tion of a set of KBMS transactions is equivalent to a particular serial execution of the same

set. We also examine DBMS concurrency control algorithms, such as two phase locking,

from the viewpoint of KBMS transactions.
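
For readers unfamiliar with the serializability criterion, the following Python sketch shows the standard DBMS test for conflict serializability using a precedence graph. It is the textbook technique for read and write steps, not the extension to KBMS transactions developed in Chapter Seven; the schedule encoding is an assumption made for the example.

```python
from typing import Dict, List, Set, Tuple

Step = Tuple[str, str, str]   # (transaction id, "r" or "w", object name)


def precedence_graph(schedule: List[Step]) -> Dict[str, Set[str]]:
    """Add an edge T1 -> T2 when a step of T1 conflicts with a later step of T2."""
    edges: Dict[str, Set[str]] = {t: set() for t, _, _ in schedule}
    for i, (t1, a1, x1) in enumerate(schedule):
        for t2, a2, x2 in schedule[i + 1:]:
            # Two steps conflict if they touch the same object and one is a write.
            if t1 != t2 and x1 == x2 and "w" in (a1, a2):
                edges[t1].add(t2)
    return edges


def is_conflict_serializable(schedule: List[Step]) -> bool:
    """The schedule is equivalent to some serial order iff the graph is acyclic."""
    edges = precedence_graph(schedule)
    indegree = {t: 0 for t in edges}
    for targets in edges.values():
        for t in targets:
            indegree[t] += 1
    ready = [t for t, d in indegree.items() if d == 0]
    visited = 0
    while ready:                      # Kahn's topological sort
        t = ready.pop()
        visited += 1
        for u in edges[t]:
            indegree[u] -= 1
            if indegree[u] == 0:
                ready.append(u)
    return visited == len(edges)


independent = [("T1", "r", "A"), ("T2", "r", "B"), ("T1", "w", "A"), ("T2", "w", "B")]
entangled = [("T1", "r", "A"), ("T2", "w", "A"), ("T1", "w", "A")]
```

Here `independent` interleaves two transactions that touch disjoint objects, so it passes the test, while `entangled` produces edges in both directions between T1 and T2 and fails it.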

In Chapter Eight, we examine ways to further benefit from functional integration in the

KBMS through the use of DBMS query optimization techniques. A KBMS transaction that

generates the transitive closure of a relation is an example of the use of a deductive rule that

generates new information. Transitive closure is an example of a linear recursive query

which cannot be evaluated by a conventional DBMS. In Chapter Eight, we show that in the

functionally integrated KBMS, the set of resolvents generated by the rule processing system

of the KBMS using a linear recursive rule can be treated as a set of concurrent KBMS

retrievals. We then use query optimization techniques, based on query decomposition, inter-

mediate result sharing and pipelining, developed for use in a DBMS, to evaluate these

retrievals in a KBMS. This leads to an efficient evaluation strategy for linear recursive

queries and illustrates how DBMS strategies can be used to support processing of rules in an

integrated KBMS.
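
As a point of reference, the transitive closure of a binary relation can be computed by iterating a join until no new tuples appear. The Python sketch below uses the well-known semi-naive strategy, in which only the tuples found in the previous round are joined with the base relation. It illustrates what a linear recursive query computes; it is not the decomposition, result sharing and pipelining method of Chapter Eight.

```python
from typing import Set, Tuple

Pair = Tuple[str, str]


def transitive_closure(edges: Set[Pair]) -> Set[Pair]:
    """Evaluate  path(X, Y) :- edge(X, Y).
                 path(X, Y) :- path(X, Z), edge(Z, Y).
    The second rule is linear: the recursive relation appears once in its body."""
    closure = set(edges)
    delta = set(edges)            # tuples that were new in the previous round
    while delta:
        derived = {(a, d) for (a, b) in delta for (c, d) in edges if b == c}
        delta = derived - closure  # keep only tuples not seen before
        closure |= delta
    return closure


chain = {("a", "b"), ("b", "c"), ("c", "d")}
paths = transitive_closure(chain)
```

The loop stops when a round derives nothing new, which also guarantees termination on cyclic relations because the closure over a finite set of values is finite.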

Chapter Nine summarizes the research presented in this dissertation and suggests areas

for future research. The research described here introduces techniques for integrating data-

base management and artificial intelligence technologies so that intelligent systems with large

databases and rule bases can be built elegantly to run efficiently. An object-oriented KBMS

can be used as the foundation to model and build these intelligent systems.















CHAPTER II

A SURVEY OF RELATED WORK



Research in the integration of technologies requires surveying several related subject

areas. In our study of KBMS technology, we investigated database management systems and

techniques, artificial intelligence techniques, knowledge representation, expert systems and

constraint management, to name a few. A complete literature review would be considerably

longer than this entire dissertation. In this chapter, we survey a few topics of general

interest that have a bearing on the design of an object-oriented KBMS. We delay the discus-

sion of specific research to later chapters.

We review related research in knowledge representation, the object-oriented program-

ming paradigm, the role of constraints in knowledge management and approaches other than

our own for integrating DBMS and AI technologies.



2.1 Knowledge Representation


The representation of knowledge is a key issue in determining the structure and organi-

zation of knowledge bases that support efficient knowledge management. Various knowledge

representation techniques have been studied in artificial intelligence research and these

schemes can be broadly classified into either declarative schemes or procedural schemes.

The declarative schemes include logical, network and frame based representations. The

advantages of logical schemes [BRO84 and MYL81] are the simple syntax and well defined

semantics of logical formulae as well as the generality of inference and proof procedures that

manipulate the logical formulae. The disadvantages are the lack of organizational principles

resulting in unstructured knowledge and the inability to conveniently express procedural or










heuristic knowledge. An example of a network representation is a semantic net [QUI68 and

SCH76], which models objects and the (binary) relationships between objects. Network

schemes in general lack the formal semantics of logical schemes but have a natural graphical

representation and provide a means for organizing information. Frame based representation

schemes [WIN75] are used to model a stereotypical situation with complex structures and

provide a framework for developing other representation models. Various adaptations of

frames include FRL [GOL77], KRL [BOB77] and OWL [SZO77].

In contrast, the procedural or process oriented knowledge representation schemes, while

lacking in formal semantics, allow for direct and efficient interaction between knowledge

facts and rules. However, this same interaction results in meta-information being embedded

in the control structures of the system and the inability to easily understand and modify pro-

cedural schemes [KER84b and MYL81]. Pattern directed inference systems such as

PLANNER [HEW71 and HEW72] and CONNIVER [SUS72] are examples of procedural

knowledge representation systems. These systems can be classified on the basis of procedure

activation mechanisms and control structures offered. Production systems [DAV75, NEW73

and WAT79] such as OPS5 [FOR81] have sometimes been classified as procedural schemes

[BRA85 and MYL81]. However, production systems stress modularity of productions; hence,

they do not support direct interaction (communication or control) between productions.

The distinction between declarative and procedural or process oriented information is

revisited in MOR84. Knowledge and expertise are perceived as consisting of representation

and performance components. Paralleling the distinction made earlier in MYL81 and

STE80, the representation component contains the factual information and statements of

relationships that define the desired result [MOR84]. This can be equated to declarative

semantics. The performance component then deals with the strategy and tactics for manipu-

lating and combining this information to achieve results efficiently [MOR84]. This can be

viewed as the procedural or process oriented component.










A complete knowledge representation scheme must be able to structure and organize

knowledge. It must include a declarative framework to describe the relationships existing

between different knowledge components. On the other hand it must capture process

oriented knowledge that specifies the context, procedural methods, priorities, scheduling

information, etc., that is needed to make a system efficient. In this dissertation, we refer to

the process oriented or procedural component as the operational component of knowledge.

The object-oriented programming paradigm which we now discuss has the capability to

organize knowledge and to represent both declarative and operational information.



2.2 Object-Oriented Programming Environments


The object-oriented programming paradigm, first introduced in SIMULA 67 [DAH68]

and later popularized by SmallTalk-80 [GOL83], has had a considerable impact on modeling

and managing knowledge; one of its attractions is that it allows the incorporation of concepts

from other programming paradigms such as logic, functional programming, rule-based pro-

duction systems, etc. It has been proposed in KER84b that this feature can be exploited so

that object-oriented knowledge representation models can uniformly handle the representa-

tion of both declarative and process oriented or operational information.

Traditionally, a program was characterized as follows:

program = data structure + algorithm

The object-oriented paradigm emerged from the necessity to structure program

knowledge by encapsulating data structures and algorithms into complex objects and to sup-

port the direct representation and processing of these complex objects.

An object-oriented model has two main features:

(a) It separates the specification and representation of objects; this is the information

hiding aspect.

(b) It controls the interface to the objects through standardized access methods; this is the










encapsulation aspect. These methods correspond to high level operations that are

closer to real world operations than the lower level primitive data manipulation opera-

tions used to implement them.

An important organizational feature of an object-oriented model is its class hierarchy.

Objects are organized into classes; each class can be derived from a super-class and/or each

class can be further organized into sub-classes. The class hierarchy is an inheritance hierar-

chy, and the specification and representation can be inherited into a class from its super-

class(es). Each object class or object type can be instantiated to have several object

occurrences or instances. Objects can also be composed of other objects in a component

hierarchy which is separate from the inheritance hierarchy.
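
The difference between the two hierarchies can be shown with a short Python sketch. The classes `Component`, `Gate` and `Adder` are hypothetical examples chosen for the illustration; they do not come from any of the systems surveyed here.

```python
from typing import List, Sequence


class Component:
    """Super-class: every object type below inherits its specification."""

    def __init__(self, name: str, parts: Sequence["Component"] = ()):
        self.name = name
        self.parts: List["Component"] = list(parts)   # component hierarchy

    def describe(self) -> str:
        return f"component {self.name}"

    def all_parts(self) -> List[str]:
        """Walk the component hierarchy, which is independent of class inheritance."""
        found: List[str] = []
        for part in self.parts:
            found.append(part.name)
            found.extend(part.all_parts())
        return found


class Gate(Component):
    """Sub-class that refines the inherited description."""

    def __init__(self, name: str, inputs: int):
        super().__init__(name)
        self.inputs = inputs

    def describe(self) -> str:
        return f"gate {self.name} with {self.inputs} inputs"


class Adder(Component):
    """Sub-class that inherits everything unchanged."""


# Occurrences (instances) of the object types, assembled into a part-of structure.
half = Adder("half_adder", [Gate("nand1", 2), Gate("nand2", 2)])
full = Adder("full_adder", [half, Gate("or1", 2)])
```

`Gate` is related to `Component` by inheritance, whereas `full` is related to `half` by composition: the first relationship holds between object types, the second between occurrences.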

KEE [INT84], KL-ONE [MOS83 and SCH83], LOOPS [STE83], PRISM [KER84a and

KER84b] and STROBE [LAF84 and SMI83] are examples of systems that are object-

oriented.

Our knowledge representation model conforms with the object-oriented paradigm. We

exploit the organizational features, i.e., the inheritance and component hierarchies and the

encapsulation feature of this paradigm, as discussed in Chapter Three.



2.3 The Role of Constraints in Knowledge Management


The use of constraints as a unifying paradigm in expert systems, DBMS and knowledge

representation systems has been suggested in MOR84. Constraints provide a convenient way

for expressing relationships that must hold between different pieces of information. The role

of constraints as a means of expressing knowledge has been recognized and exploited in

several ways [BRO78, CHA84, FUT84, HAM75, HAM76, HAM80, KER84b, MIN83,

MOR84, RAS84 and XU83].

In DBMS, constraints can provide internal consistency of semantic data models, data

security and data integrity. Constraints can also act as a search heuristic and can be applied










to queries in order to generate semantically equivalent but more efficient queries or to ter-

minate queries that violate the constraint.

In artificial intelligence applications such as expert systems, constraints can specify rela-

tionships between problem specifications, goals, etc. For example, constraints are used

effectively in a truth maintenance system [DOY79]. In CHA84 and RAS84 we see the use of

integrity constraints in semantic query optimization. Based on the notion of subsumption

and partial subsumption [CHA84], integrity constraints are used to optimize queries.

The process of constraint management has several components. The first component is

involved with the specification of the constraint. The second concerns the mechanism used

for checking if the constraint has been violated and the third component is responsible for

maintaining the consistency of the database with the constraints.
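The three components of constraint management listed above can be illustrated with a minimal sketch. It is not taken from any of the systems surveyed here; the names Constraint, violations, maintain and non_negative_qty, and the record layout, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, int]


@dataclass
class Constraint:
    name: str
    holds: Callable[[Record], bool]     # component 1: specification
    repair: Callable[[Record], Record]  # component 3: maintenance action


def violations(db: List[Record], c: Constraint) -> List[int]:
    """Component 2: return the positions of records that violate c."""
    return [i for i, rec in enumerate(db) if not c.holds(rec)]


def maintain(db: List[Record], c: Constraint) -> None:
    """Component 3: repair each violating record in place."""
    for i in violations(db, c):
        db[i] = c.repair(db[i])


# Hypothetical constraint: a quantity may not be negative.
non_negative_qty = Constraint(
    name="non_negative_qty",
    holds=lambda rec: rec["qty"] >= 0,
    repair=lambda rec: {**rec, "qty": 0},
)

db = [{"part": 1, "qty": 5}, {"part": 2, "qty": -3}]
found = violations(db, non_negative_qty)  # checking
maintain(db, non_negative_qty)            # maintenance
```

Keeping the specification (holds) separate from the repair action mirrors the distinction between the declarative and the operational parts of a constraint that is drawn later in this chapter.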

In MOR84, Constraint Equations (CEs) are developed in conjunction with the KL-ONE

object-oriented knowledge representation system. The information expressed in the CEs is

the declarative component and for a subset of the CEs a prototype compiler automatically

generates a set of condition-action rules which maintain these constraints. In this system,

however, the CEs cannot always express in a declarative way the complete operational infor-

mation required to maintain consistency. To overcome this drawback, operational informa-

tion is implicitly expressed in the CEs. A more elegant solution is to explicitly specify both

the declarative and process oriented or operational semantics of the constraint; this is the

approach we have used in our specification of rules as will be discussed.

Constraints are also used in the PRISM architecture [KER84a and KER84b] which is

an object-oriented semantic net with a natural clustering of rules at the nodes. Constraints

are expressed in the Constraint Language (CL) and these constraints are used to define,

extend and populate the net in a consistent way. Advantages of this system are the explicit

specification of constraints and the uniform treatment of data and metadata in the semantic

net, e.g., an instance or occurrence of an object type is treated the same as the object type. The

user can extend the system (including the constraints) and the inference engine validates










system behavior by proving conjectures. New knowledge is added incrementally to the sys-

tem. Several aspects not addressed in this study include the use of deductive rules to derive

new information, the support of recursion and examples of the use of inference chains caused

by chaining constraints together. Also, there is no mention of how the rules are stored or

manipulated in the object-oriented system. These are issues that will be explored in this

dissertation.



2.4 Other Techniques for Integrating DBMS and AI Technologies


Several approaches for merging AI and DBMS technologies have been suggested in

BRO84, GAL83, JAR84b, VAS85 and WHA87. One approach is to build an interface or a

bridge between a database processor that manages a database of facts or assertions (the

extensional database) and an inference processor that manipulates deductive rules (the inten-

sional database). This approach is used in JAR84a, KEL82 and KEL84.

The disadvantages of this separation stem from the fact that there are different

representations for facts and rules and the two systems are functionally independent. The

specification of rules is independent of the database retrieval capability provided by the data-

base processor. As a result, the expressive power of the rules is limited by the rule

specification language; we cannot bring to bear the full capabilities of the database manage-

ment system to express the rules. The second drawback relates to efficiency considerations.

An inference plan is created using the intensional database and is verified using the exten-

sional database. Efficiency can be improved by verifying the inference plan at intermediate

stages and providing feedback. Such verification, however, is cumbersome because of the

separation of the rule base and fact base.

Another related problem, mentioned in KEL84, is that database operations, such as

aggregations, cannot be executed over deduced concepts not found in the fact base. As a

result, functions have to be duplicated in the two systems. Lastly, forward reasoning










requires intimate interaction between the fact base and the rule base since deductions are

made outward from the facts in the absence of a specific goal. This is difficult if the two

databases are on separate systems.

Another approach to implementing a KBMS is to either enhance a DBMS with deduc-

tive power so that the search portion of a reasoning system can be moved into the DBMS

[STO83, STO84 and WON84] or to enhance a logic programming system with database facilities

[WAR84]. Although these approaches may achieve some of the necessary functionalities

of a KBMS, it is difficult to extend an existing system to handle requirements outside the ori-

ginal system specifications.

Our approach to the design of a KBMS, which we outline in the next chapter, is closest

to the meaning of integration. It has neither the disadvantages of the first "interface"

approach nor the limitations of the second "extensions" approach.

The research topics to be reviewed in later chapters are as follows: In Chapter Five we

briefly discuss semantic data models and semantic features that are useful in modeling

knowledge. In Chapter Seven we review available techniques for implementing rules in the

OPS5 production system as well as in an extended version of the INGRES DBMS. We also

briefly discuss the serializability criterion of correctness for concurrently executed database

transactions. In Chapter Eight we discuss techniques for evaluating linear recursive queries

as well as database query optimization techniques.















CHAPTER III

ARCHITECTURE OF THE INTEGRATED OBJECT-ORIENTED KBMS


In this chapter, we outline the architecture of our integrated object-oriented KBMS.

We first list the features of the KBMS and discuss why the design of the KBMS is integrated

and object-oriented. We then outline the approach that is used in our design. Finally, we

explain the significance of the material to be presented in Chapters Four through Eight to

the design of the KBMS.

The features of an integrated object-oriented KBMS are as follows:

(1) A powerful object-oriented knowledge representation model. The model is capable of

defining the structure of the object types to be stored in the knowledge base. It also

supports high level operations by which the object types will be accessed and manipu-

lated and rules that capture problem solving knowledge.

(2) A powerful knowledge manipulation language (KML) to be used in conjunction with the

model. The KML is used to specify operations for accessing and manipulating the

object types in a standardized manner and to specify the semantic information captured

by the rules.

(3) A mechanism for applying the problem solving knowledge that is captured in the rules

defined for the object types, when processing a transaction in the KBMS.

(4) An implementation of the KBMS that fosters the functional integration of the DBMS and

the AI reasoning system components of the KBMS that process the knowledge base.

An object-oriented model defines a collection of object types and specifies the structure

of the facts to be stored with each object type, the structural relationships between object

types and the operations that access and manipulate each object type. An object-oriented

knowledge representation model must extend this definition to include a knowledge










component to capture problem solving knowledge relevant to each object type. Rules have

been used widely and effectively as the knowledge component in AI reasoning systems. We

propose that rules be used in our model, too, as the knowledge component of each object

type.

Unfortunately, there is a serious shortcoming in most rule based production systems or

clausal schemes such as Prolog; they do not structure the rules or clauses. This results in

inefficient inference mechanisms. For example, in FOR82, it is estimated that a production

system such as OPS5 spends 90% of its effort in selecting appropriate rules. However, both

the inheritance and component hierarchies of an object-oriented model can be used to con-

struct a framework in which to organize and provide structure to rules. As a result, rules

can be used very effectively as the knowledge component in our object-oriented model.

We extend the encapsulation feature that allows high level operations to be defined for

each object type and we allow the attachment of rules to the definition of the object types,

as well. Declarative, operational (or procedural) and heuristic knowledge is incorporated into

our model, via rules. For example, rules are used to describe declarative knowledge about

relationships that exist between object types. Rules also encapsulate operational information

and heuristic knowledge into abstract entities.

By using the object-oriented paradigm, we are able to structure and organize

knowledge. We also integrate the fact base and rule base within the object types of a single

knowledge base. Clustering operations and rules by object type defines the context for

applying these operations and rules. This clustering also provides a binding between facts

and relevant rules within the object type; this binding is an important property of a

knowledge representation model.

The integrated knowledge base is defined by a collection of object types. Each object

type is defined by its structure, operations and rules. Each object type can be instantiated to

have many instances or occurrences. Each occurrence of an object type will acquire the

definition of the object type; i.e., it will have the same structure, operations and rules that










define the corresponding object type. The object type and its occurrences are all knowledge

base objects.

Each object (type or occurrence) is unique and identifiable in the knowledge base; to this

end, each object is associated with a system-generated object identifier (OID). Rules

are themselves objects. Rules defined for an object type apply to all occurrences of the

object type.
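As a rough illustration of the definitions above, the following sketch models object types, occurrences and rules as objects that each carry an OID. It is only a sketch under stated assumptions: the class names, the counter-based new_oid generator and the EMPLOYEE example are hypothetical and are not part of the KBMS design.

```python
import itertools
from dataclasses import dataclass, field
from typing import Callable, Dict, List

_next_oid = itertools.count(1)


def new_oid() -> int:
    """Stand-in for the system-generated object identifier (OID)."""
    return next(_next_oid)


@dataclass
class Rule:
    name: str
    oid: int = field(default_factory=new_oid)  # rules are themselves objects


@dataclass
class ObjectType:
    name: str
    structure: List[str]
    operations: Dict[str, Callable] = field(default_factory=dict)
    rules: List[Rule] = field(default_factory=list)
    oid: int = field(default_factory=new_oid)

    def instantiate(self, values: Dict[str, object]) -> "Occurrence":
        if set(values) != set(self.structure):
            raise ValueError("values must match the structure of the object type")
        return Occurrence(object_type=self, values=values)


@dataclass
class Occurrence:
    object_type: ObjectType
    values: Dict[str, object]
    oid: int = field(default_factory=new_oid)

    @property
    def rules(self) -> List[Rule]:
        """An occurrence acquires the rules defined for its object type."""
        return self.object_type.rules


# Hypothetical object type with one rule and one occurrence.
salary_cap = Rule("salary_cap")
employee = ObjectType("EMPLOYEE", ["NAME", "SALARY"], rules=[salary_cap])
occ = employee.instantiate({"NAME": "Smith", "SALARY": 30000})
```

The occurrence does not copy the rules; it reaches them through its object type, which is one simple way to make a rule defined for the type apply to every occurrence.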

To provide this object-oriented framework for knowledge representation, we use a tech-

nique of incorporating rules into a semantic data model [RAS85]. The semantic association

model SAM* [SU83 and SU85], was designed as a semantic data model for engineering and

scientific/statistical databases. A DBMS implementation of SAM* is currently underway at

the University of Florida Center for Database Systems Research and Development.

SAM* derives its powerful representation capability from the variety of modeling con-

structs that it supports. SAM* recognizes basic data types such as integer, complex data

types such as set, and abstract data types such as COMPUTE (to encapsulate a sequence of

executable operations) and RULE (corresponding to a rule). SAM* identifies several seman-

tic properties found useful in modeling engineering databases and then defines seven associa-

tion types to model these semantic properties.

Each of the association types in the SAM* model is defined by its structure (to

represent the facts which it stores) and operations (to specify how the stored facts are mani-

pulated). To accommodate our knowledge representation model, we add a knowledge com-

ponent; each of the association types is now defined by its rules, as well.

SAM* association types are the system object types inherent to the knowledge

representation model and are used as building blocks. When a knowledge base is being

designed for a particular application domain, the user object types specific to that domain

will be built using the system object types (SAM* association types).

A user object type is modeled using either a single system object type or a network of

system object types. The user object type is a sub-class of the system object type(s) and










inherits the structure, operations and rules defined for the system object type(s). In order to

capture the semantics and problem solving knowledge specific to an application domain, the

designer will extend the inherited definition of a user object type by defining additional

operations and rules for it.

Each of the user object types is an object and will have an OID value. Similarly, each

of the user object types is instantiated and has several occurrences; each of these is an object

with an OID value. Object occurrences are instantiated with the same structure, operations

and rules that are defined for the corresponding user object type.

Just as problem solving knowledge can be specific to a particular application domain or

user object type, there can be knowledge that is specific to a particular occurrence of a user

object type. To accommodate this, the user object type corresponding to this object

occurrence is defined by an attribute whose data type is RULE; the value of this attribute

for any occurrence of the user object type is a rule that is specific to that particular

occurrence.

Each chapter of this dissertation deals with some aspect of the design and implementa-

tion of the integrated, object-oriented architecture for a KBMS which is outlined above.

Rules that are defined for the object types and occurrences of the knowledge base are

an important component of our model. Thus, the focus of Chapter Four is a knowledge

manipulation language (KML) for expressing these rules. In that chapter we introduce some

KML constructs for manipulating the objects. The rules are expressed using these KML con-

structs. The KML allows the rules to express (a) declarative semantics corresponding to

relationships between knowledge base object types and occurrences which must be main-

tained and (b) operational semantics corresponding to strategies to maintain these relation-

ships. The KML constructs can also be used to express (c) access methods or high level

operations that are a part of the object type definition. The use of a single KML to express

the rules and to define operations that manipulate objects is an important feature and its

consequences will be discussed later.










Chapter Five provides examples of user object types in an application domain modeled

using the system object types (SAM* association types) and their rules. Although by no

means exhaustive, these examples illustrate the technique of incorporating rules into the

semantic data model to obtain our knowledge representation model. Semantic features use-

ful in modeling knowledge from diverse domains are modeled by the user object types of the

example knowledge base. The user object types are built using the system object types and

extended by the domain specific rules defined for the user object types.

The third feature of our KBMS is a mechanism for applying the knowledge component

during transaction processing and Chapter Six describes this mechanism. A KBMS transac-

tion is a sequence of operations which are executed against the object types and occurrences

of the integrated knowledge base. The rules defined for these object types must be applied

while processing the transaction. This mechanism for applying rules comprises a match-

modify-execute (MME) cycle. The mechanism exploits the binding between facts and

relevant rules within the object types. It also exploits the fact that the rules are expressed

using the same KML constructs as are the operations of the transaction.

The final feature of our design is that the implementation fosters the functional integra-

tion of the DBMS component that processes operations and the AI reasoning component that

processes rules. Chapters Seven and Eight focus on implementation to support functional

integration.

Integration at the representation level of the model is based on the following features

(previously described):

(a) structuring rules using the inheritance and component hierarchies of our object-oriented

model

(b) binding facts and relevant rules within the object types of the integrated knowledge

base using the encapsulation feature

(c) using a single KML to both define operations that manipulate object types and to

express rules.










These features lead to functional integration; the DBMS and AI reasoning components

can be characterized using common functions. In our KBMS, the AI reasoning component is

characterized using DBMS retrieval and storage manipulation functions. Functional integra-

tion implies that either AI or DBMS techniques can be applied to the functionally integrated

KBMS.

Chapter Seven studies methods to (a) organize the object types and occurrences and

their rules in the knowledge base and (b) determine the context of a rule (the relevant object

types and occurrences), so as to support the mechanism for applying rules. Chapter Seven

also studies the increased efficiency resulting from the interleaved execution of concurrent

transactions in the KBMS. Chapter Eight examines ways to further benefit from functional

integration through the use of DBMS query optimization techniques while evaluating a

KBMS transaction. The approach taken is to treat the set of resolvents generated by a

linear recursive query as a set of concurrent KBMS retrievals and then to apply DBMS query

optimization techniques to efficiently evaluate these retrievals in a KBMS transaction.















CHAPTER IV

EXPRESSING RULES IN A KNOWLEDGE MANIPULATION LANGUAGE



In this chapter, we first discuss desirable semantics to be captured in the knowledge

component defined for the object types. We describe different categories of language con-

structs of a knowledge manipulation language (KML) and different categories of rules

(corresponding to these language constructs). We show how rules expressed in a KML can

capture these desirable semantics. Finally, we discuss the advantages of classifying the rules

and how these rules can support both forward and backward chains of inference.



4.1 The Semantics Captured in Rules


In Chapter Two we reviewed the dichotomy between declarative and operational (pro-

cedural) semantics from the viewpoint of knowledge representation schemes. A similar dis-

tinction can also apply to the rules that are a component of our knowledge representation

model. These rules must be able to express diverse semantics. This includes integrity con-

straints, deductive rules that generate new information, expert rules that capture problem

solving knowledge peculiar to an application domain, rules that support attribute inheri-

tance, etc.

Declarative semantics correspond to a well formed formula describing relationships that

must hold between object occurrences in the knowledge base and are the specification component

of semantics. In contrast, the information needed to check and maintain these relationships

constitutes the operational semantics. This operational component deals with the strategy and

tactics needed to achieve results efficiently [MOR84]; this includes specifying the context,

procedural methods, priorities, scheduling information, etc. Thus, to be adequate, the










knowledge component of an integrated KBMS must explicitly specify both declarative

and operational semantics [BRO84, KER84b and MYL81].

In KER84b, it is suggested that complete constraint formalisms must provide informa-

tion along the lines of WHAT, WHEN, WHERE and HOW. WHAT specifies the conditions

that must be satisfied by the knowledge base and corresponds closely to declarative semantics.

The WHEN and WHERE components specify when a constraint is to be examined and

in what context, and are related to the operational semantics of checking a constraint. The

HOW component specifies what actions must be taken to maintain the relationship and

requires information from both components. The same analogy can be extended to rules in

general.

To illustrate the importance of explicitly specifying both declarative and operational

semantics using rules, we use the example of a P*S-P object type with two attributes,

PART and SUB-PARTS. The object type models the relationship between a part and its

set of sub-parts. It is described in detail in Chapter Five and Figure 5.1 models this object

type.

P*S-P is subject to a uniqueness constraint based on the attribute PART, i.e., each

occurrence of P*S-P should have a distinct value for the attribute PART. This constraint

can be expressed declaratively in clausal first order predicate logic, using extensional predicates

corresponding to the object occurrences, e.g., P*S-P(x,y), and evaluable predicates for

functions evaluated against these occurrences, e.g., EQUAL(x,y), as described in CHA84 and

ULL85. The uniqueness constraint, expressed in this form, is as follows:


EQUAL(y,z) <-- P*S-P(x,y), P*S-P(p,z), EQUAL(x,p)

This well formed formula expresses only the WHAT information or the specification

component. The operational information that must be specified includes the WHEN and

WHERE components, e.g., this uniqueness constraint must be checked before an INSERT

operation executes against the P*S-P object type. The HOW component may be that if the










constraint is violated (when p equals x but y and z are unequal), then the strategy to maintain

the constraint is to combine P*S-P(x,y) and P*S-P(p,z) into a single occurrence

P*S-P(x,q), where q is the union of the sets y and z. All the information related to these

different components, must be made available in the form of rules and it could require

several rules to completely specify this information.
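The WHAT and HOW components of this uniqueness constraint can be sketched as follows. This is an illustrative sketch only: an occurrence is assumed to be a (PART, SUB-PARTS) pair, and the function names violates_uniqueness and insert_with_merge are hypothetical rather than KML constructs.

```python
from typing import FrozenSet, List, Tuple

# Assumed representation of a P*S-P occurrence: (PART, SUB-PARTS).
Occurrence = Tuple[str, FrozenSet[str]]


def violates_uniqueness(occs: List[Occurrence], new: Occurrence) -> bool:
    """WHAT: a stored occurrence has the same PART but a different SUB-PARTS set."""
    part, subs = new
    return any(p == part and s != subs for p, s in occs)


def insert_with_merge(occs: List[Occurrence], new: Occurrence) -> List[Occurrence]:
    """WHEN/WHERE: intended to run before an INSERT against P*S-P.
    HOW: combine occurrences with an equal PART by taking the union of their sets."""
    part, subs = new
    merged = frozenset(subs)
    kept: List[Occurrence] = []
    for p, s in occs:
        if p == part:
            merged |= s        # q is the union of y and z
        else:
            kept.append((p, s))
    kept.append((part, merged))
    return kept


store = [("engine", frozenset({"piston"})), ("wheel", frozenset({"rim"}))]
new = ("engine", frozenset({"valve"}))
violated = violates_uniqueness(store, new)
merged_store = insert_with_merge(store, new)
```

The check and the repair are deliberately separate functions, which reflects the point made above that several rules may be needed to specify the different components completely.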



4.2 Rules Expressed in a KML


In this section, we show that using a single knowledge manipulation language (KML) to

specify operations that manipulate objects and to express rules provides a powerful rule

language that can explicitly specify both declarative and operational semantics.

The KML constructs are an extension of conventional data manipulation language

(DML) constructs. The KML includes both set-theoretic and algebraic operations that can

be executed against the object occurrences of the knowledge base, either for retrieval or for

storage manipulation. Two new operations, namely the EXECUTE operation and the

DERIVE operation are introduced and described in this chapter. The KML supports a

variety of high level programming constructs such as FOR, WHILE and REPEAT loops, IF-THEN-ELSE

or CASE statements, etc. It also supports the use of evaluable functions, e.g.,

GREATER-THAN, EQUAL, etc., and set-oriented functions, e.g., MEMBER-OF,

SUBSET-OF, etc., that return a truth value after retrieving one or more object occurrences

from the knowledge base.

While the specification and design of a KML is important, it deals with issues in

language design and is beyond the scope of our research. We are concerned with using the

KML to specify rules, and so, we limit ourselves to the functionality of a language construct

rather than the specific construct.

The constructs of the KML are classified into different categories corresponding to the

functionality supported by each construct, as follows:










(1) Constructs which describe a relationship between object occurrences in the knowledge

base.

(2) Constructs which represent the execution of KML operations.

(3) Constructs which involve communication between the KBMS and the user or an appli-

cation program.

(4) Constructs which correspond to the execution of another rule.

The rules, expressed using KML constructs, have the structure (LHS, RHS), as often used in

production systems [DAV75 and NEW73]. The conditional left hand side (LHS) must be satisfied before

the right hand side (RHS) consequent can be applied. There are several differences between

these rules expressed in the KML and productions of a production system. These differences

are with respect to the interactions between the rules, as discussed in this chapter, and the

mechanism for selecting rules, as discussed in Chapter Six.

Both the conditional LHS and the consequent RHS of the rules are expressed using the

KML constructs in any of the above categories. Each category of KML constructs supports

a different functionality; consequently, rules expressed using these constructs capture

different semantics. Thus, based on the category of the KML constructs on either the LHS

or the RHS, we classify rules into different categories.

For example, depending on the category of the LHS conditional construct, rules are

categorized as follows:

(a) value dependent rules and

(b) value independent rules.

Value dependent rules capture declarative semantics describing a relationship between

knowledge base object occurrences. On the other hand, value independent rules capture

operational semantics. These rules test the execution status of KML operations or other

rules and capture triggering or scheduling information.










Based on the category of the RHS consequents, rules are categorized as follows:

(a) rules which derive new information (or deductive rules),

(b) rules which execute actions and

(c) rules which build explicit inference chains by executing other rules.

Several rules from different categories are used together to capture complex problem

solving information for a particular domain as will be seen in the next chapter. In the rest of

this chapter we describe the different categories of rules, in detail, and discuss some advan-

tages of classifying them.
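The classification by LHS construct can be sketched in a few lines. The enumeration and function names are hypothetical. Treating communication constructs as value dependent is an assumption based on the discussion in Section 4.2.1, where LHS conditions that test messages are described as part of value dependent rules.

```python
from enum import Enum


class Construct(Enum):
    RELATIONSHIP = 1    # category (1): relationship between object occurrences
    OPERATION = 2       # category (2): execution of KML operations
    COMMUNICATION = 3   # category (3): communication with user or program
    RULE_EXECUTION = 4  # category (4): execution of another rule


def classify_lhs(lhs: Construct) -> str:
    """Classify a rule by the category of its LHS conditional construct."""
    if lhs in (Construct.OPERATION, Construct.RULE_EXECUTION):
        return "value independent"  # tests an execution status (a trigger)
    return "value dependent"        # tests values held in the knowledge base


labels = {c.name: classify_lhs(c) for c in Construct}
```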



4.2.1 Value Dependent Rules


A rule whose LHS is a KML construct that expresses a condition or relationship

between object occurrences is classified as a value dependent rule. Its LHS corresponds to a

declarative formula which may be expressed in clausal form by extensional predicates and

evaluable predicates [CHA84 and ULL85], or equivalently, in a tuple calculus form as in the

QUEL or SQL data manipulation languages.

This declarative formula represents a condition that must be verified against the

knowledge base and is equivalent to a retrieval statement describing the object occurrences

that must be retrieved to satisfy the condition. Thus, we may use the complete retrieval

capability of the KML to describe this condition. A value dependent rule is as follows:

IF (condition or relationship between object occurrences)

THEN (RHS consequent)

The LHS of such a rule is true after retrieval of the required object occurrences, or

verification of the described condition. When a condition is verified for a truth value, then

the actual values of the object occurrences that satisfy the condition may not be required for

further use. Several examples of this category of rules are seen in the next chapter. A value










dependent rule which specifies a relationship between occurrences of P*S-P is as follows:

IF (there exist occurrences X and Y of P*S-P such that

EQUAL(X.PART, Y.PART))

THEN (RHS consequent)

We classify this rule as a value dependent rule because satisfying the LHS depends on

specific values of object occurrences stored in the knowledge base. The semantics captured

by such a rule are declarative (or descriptive) semantics. These rules must be executed

against the knowledge base, as will be discussed.
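The idea that the LHS of a value dependent rule is equivalent to a retrieval can be sketched as below. The sketch assumes that X and Y denote distinct occurrences, that occurrences are plain dictionaries, and that the RHS consequent is any callable; none of these choices comes from the KML itself.

```python
from itertools import combinations
from typing import Callable, Dict, List, Tuple

Occ = Dict[str, object]


def lhs_equal_part(occurrences: List[Occ]) -> List[Tuple[Occ, Occ]]:
    """Retrieve every pair of distinct occurrences X, Y with EQUAL(X.PART, Y.PART)."""
    return [(x, y) for x, y in combinations(occurrences, 2)
            if x["PART"] == y["PART"]]


def apply_rule(occurrences: List[Occ],
               rhs: Callable[[List[Tuple[Occ, Occ]]], None]) -> bool:
    """The LHS is true when the retrieval is non-empty; only then is the RHS applied."""
    matches = lhs_equal_part(occurrences)
    if matches:
        rhs(matches)
    return bool(matches)


kb = [
    {"PART": "engine", "SUB-PARTS": {"piston"}},
    {"PART": "wheel", "SUB-PARTS": {"rim"}},
    {"PART": "engine", "SUB-PARTS": {"valve"}},
]
fired_on: List[Tuple[Occ, Occ]] = []
fired = apply_rule(kb, fired_on.extend)
```

Here the retrieved pairs are passed on to the RHS; a rule that needs only the truth value of the condition could ignore them.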

The condition being verified or the values retrieved on the LHS of a value dependent

rule could also involve KML constructs that support communication between the KBMS and

the external world. The LHS condition could test the values of messages to and from an

operator, interrupts or control commands, error messages or abort flags, etc., and these rules

can be used to synchronize the KBMS with other programs.



4.2.2 Value Independent Rules


A rule whose LHS tests a KML operation (actually its execution status, as will be dis-

cussed) or the execution status of another rule is classified as a value independent rule. The

KML operations being tested may include retrieval operations and storage operations

(INSERT, DELETE, etc.) as well as user defined operations associated with an object type.

Examples of these rules are also seen in Chapter Five.

A rule which tests the execution status of a KML operation is as follows:

IF option (KML OPERATION against OBJECT TYPE)

THEN (RHS consequent)

The KML operation that is tested by this rule is called a triggering operation since it

triggers or causes the execution of the RHS consequent of this rule. These rules, also called

triggers, are value independent since they are independent of the values of object occurrences










stored in the knowledge base and their LHS is satisfied by the KML operations themselves.

These triggers modify a transaction which contains the KML operation by scheduling the

execution of the trigger's RHS consequent, within the transaction. Triggers capture opera-

tional semantics, i.e., they specify triggering and scheduling information.

Production systems generally support rules that resemble our value dependent rules.

The support of triggers that capture operational semantics, and are used to modify a tran-

saction, is a major difference from the production system approach. Triggers are introduced

in conjunction with the concept of a transaction and thus, have a major impact on the pro-

cess of selecting rules, as will be seen in Chapter Six.

The execution status of a KML operation identifies if the operation is waiting to be exe-

cuted, is being executed or has completed execution. There are three options that can be

specified in the trigger. The options specify the semantics that control the scheduling of the

triggering operations and the RHS consequent of the trigger, within a transaction.

In Figure 4.1, we describe the modifications to a transaction, when each of these

options is specified in a trigger. The first option is the pre-execution option which is satisfied

when the triggering operation is waiting to be executed. When this option is used, then the

RHS consequent of the trigger is scheduled to execute before the triggering operation. The

next option is the post-execution option which will be satisfied after the triggering operation

has completed execution. With the post-execution option, the RHS consequent of the trigger

is executed just after the triggering operation. The third option is the parallel-execution

option which will be satisfied during the execution of the triggering operation. With this

option the RHS consequent of the trigger and the triggering operation can be executed in

parallel. The keywords pre-exec, post-exec and par-exec will be used to specify these options.

Each of these options has its specific semantics. For example, a retrieval operation

could trigger a security constraint which may find the retrieval to violate the constraint and

may abort the retrieval. This is possible only if the security constraint is executed before the

operation. To support this, we use the pre-exec option with the triggering retrieval










operation on the LHS of the trigger. The security constraint will be the RHS consequent of

this trigger and it will be scheduled to precede the retrieval operation. The triggering opera-

tion will execute only if the RHS consequent does not set a flag to abort the operation.

Thus, with the pre-exec option, the outcome of the RHS consequent must be tested (for an

abort flag) before the triggering operation executes.
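The pre-exec behaviour described above, in which the RHS consequent runs first and may abort the triggering operation, can be sketched as follows. All names (SECRET_PARTS, security_constraint, run_pre_exec) and the clearance test are hypothetical; the sketch shows only the ordering and the abort flag, not an actual KBMS mechanism.

```python
from typing import Callable, Dict, Optional, Set

SECRET_PARTS: Set[str] = {"reactor"}  # hypothetical protected data


def security_constraint(user_cleared: bool, part: str) -> bool:
    """RHS consequent: return True (the abort flag) if access must be denied."""
    return part in SECRET_PARTS and not user_cleared


def run_pre_exec(operation: Callable[[], object],
                 consequent: Callable[[], bool]) -> Optional[object]:
    """Run the consequent first; execute the operation only if no abort flag is set."""
    if consequent():
        return None  # the triggering operation is aborted
    return operation()


store: Dict[str, Set[str]] = {"reactor": {"core"}, "engine": {"piston"}}

denied = run_pre_exec(lambda: store["reactor"],
                      lambda: security_constraint(False, "reactor"))
allowed = run_pre_exec(lambda: store["engine"],
                       lambda: security_constraint(False, "engine"))
```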

A constraint which maintains an existence dependency following a triggering deletion

operation may require that the constraint be executed after the deletion. Here the post-exec

option will be used on the LHS of the trigger and the constraint will be the RHS consequent.

The RHS consequent will execute only after the triggering operation completes execution.

Alternatively, a retrieval operation may trigger a deductive rule that derives new

occurrences that may satisfy the retrieval. In this case, the triggering retrieval operation and

the deductive rule can execute in parallel. However, the triggering operation must continue

execution as long as the deductive rule is deriving new occurrences. The par-exec option will

be used on the LHS of the trigger to capture these semantics with the deductive rule as the

RHS consequent. The triggering operation will not complete its execution until the RHS

consequent completes execution. A discussion on parallel execution will be postponed to later

chapters.
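
As an illustration only, the scheduling effect of the three options can be sketched in Python. The names Trigger and modify_transaction, and the representation of a schedule as a list of steps, are assumptions introduced for this sketch; they are not KML constructs. The test of the abort flag is omitted here.

```python
from dataclasses import dataclass


@dataclass
class Trigger:
    option: str      # "pre-exec", "post-exec" or "par-exec"
    operation: str   # triggering operation named on the LHS
    consequent: str  # RHS consequent to be scheduled


def modify_transaction(operation, triggers):
    """Return a schedule (hypothetical format): a list of steps,
    where each step is a tuple of items that may run together."""
    before, parallel, after = [], [], []
    for t in triggers:
        if t.operation != operation:
            continue
        if t.option == "pre-exec":
            before.append(t.consequent)
        elif t.option == "post-exec":
            after.append(t.consequent)
        elif t.option == "par-exec":
            parallel.append(t.consequent)
        else:
            raise ValueError(f"unknown option: {t.option}")
    schedule = [(c,) for c in before]
    schedule.append((operation, *parallel))  # operation plus parallel consequents
    schedule.extend((c,) for c in after)
    return schedule


triggers = [
    Trigger("pre-exec", "RETRIEVE", "security-constraint"),
    Trigger("par-exec", "RETRIEVE", "deductive-rule"),
    Trigger("post-exec", "DELETE", "existence-dependency"),
]
retrieve_schedule = modify_transaction("RETRIEVE", triggers)
delete_schedule = modify_transaction("DELETE", triggers)
```

In this sketch the security constraint precedes the retrieval, the deductive rule shares a step with the retrieval, and the existence dependency follows the deletion.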

A value independent rule or trigger, r1, can also test the execution status of another

rule, r2, defined for an object type, as follows:

trigger r1: IF option (EXECUTE rule r2)

THEN (RHS consequent)

Trigger r1 is a value independent rule since it does not directly test values of object

occurrences in the knowledge base. The rule r2 whose execution status is being tested by

trigger r1 is the triggering rule and it triggers the execution of the RHS consequent of r1.

Triggers such as r1 modify transactions containing the triggering rules. The triggering rules

that are in the transaction are always value dependent rules.










The trigger r1 will modify a transaction based on the option used on its LHS. Figure

4.2 describes the possible modifications to a triggering rule, in a transaction. There are four

different options involved. In addition to the pre-exec, post-exec and par-exec options dis-

cussed earlier with the KML operations, there is a fourth option, namely "successful" execu-

tion. For example,

trigger r1: IF succ-exec (EXECUTE rule r2)

THEN (RHS consequent)

In this situation, "successful" execution (succ-exec) guarantees that the LHS of rule r2, whose

execution status is being tested, evaluates to true and that the RHS of rule r2 is actually

executed before the RHS of r1 is scheduled for execution.
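
A minimal Python sketch of the succ-exec option follows. The functions execute_rule and run_succ_exec and the example rules are hypothetical; they only illustrate that the consequent of the trigger is scheduled when, and only when, the LHS of the triggering rule held and its RHS was executed.

```python
log = []


def execute_rule(lhs, rhs, occurrence):
    """Run a value dependent rule and report whether it executed 'successfully'."""
    if lhs(occurrence):
        rhs(occurrence)
        return True   # LHS held and the RHS was executed
    return False      # LHS failed, so the RHS never ran


def run_succ_exec(triggering_rule, consequent, occurrence):
    """Schedule the trigger's consequent only after a successful execution."""
    lhs, rhs = triggering_rule
    if execute_rule(lhs, rhs, occurrence):
        consequent(occurrence)


# Hypothetical rule r2 and the RHS consequent of trigger r1.
def r2_lhs(part):
    return part["qty"] < 10


def r2_rhs(part):
    log.append(f"reorder {part['name']}")


def r1_rhs(part):
    log.append(f"notify buyer about {part['name']}")


run_succ_exec((r2_lhs, r2_rhs), r1_rhs, {"name": "bolt", "qty": 3})
run_succ_exec((r2_lhs, r2_rhs), r1_rhs, {"name": "nut", "qty": 50})
```

Only the first call records anything, because the LHS of r2 fails for the second occurrence.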

The value independent rules that test the execution status of other rules actually cap-

ture meta-information; i.e., they reason about the execution of other rules. This information

can be used while structuring rules within the object types, as discussed in Chapter Seven.



4.2.3 Rules that Derive Information


Rules in the following categories are classified on the basis of their RHS constructs. A

construct describing a condition or relationship between object occurrences can occur on the

RHS of a rule. This means that some new information corresponding to the RHS consequent

is derived by applying the rule and rules such as these are called deductive rules. We use the

DERIVE operation of the KML to describe this new information, as follows:

IF (LHS condition)

THEN (DERIVE {description of derived object occurrences})

A rule such as this also captures declarative (descriptive) semantics. The syntax used

to describe the derived occurrences of a DERIVE operation is similar to an INSERT opera-

tion. However, the difference between them is with respect to the temporary or permanent

nature of the data. Data that are inserted are permanent but the existence of derived data is










dependent on pre-conditions, specified in the LHS, which may not always hold. Methods for

handling derived data have been investigated in NIC78. For our purposes, we will assume

that the derived occurrences will be differentiated from the (permanently) inserted

occurrences of an object type; i.e., derived occurrences will belong to a temporary object

type. Examples of these rules are seen in Chapter Five.
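
The distinction between inserted and derived occurrences can be sketched as follows. The class KnowledgeBase and its methods are assumptions made for this illustration and do not describe the prototype; the sketch only shows derived occurrences being kept in a temporary object type that can be discarded without touching the permanent data.

```python
class KnowledgeBase:
    def __init__(self):
        self.occurrences = {}   # object type name -> list of occurrences
        self.temporary = set()  # names of temporary (derived) object types

    def define(self, name, temporary=False):
        self.occurrences[name] = []
        if temporary:
            self.temporary.add(name)

    def insert(self, name, occurrence):
        # INSERT: permanent data only (an assumption of this sketch).
        if name in self.temporary:
            raise ValueError("INSERT targets permanent object types only")
        self.occurrences[name].append(occurrence)

    def derive(self, name, occurrence):
        # DERIVE: same shape as INSERT, but the result is temporary.
        if name not in self.temporary:
            raise ValueError("DERIVE targets temporary object types only")
        if occurrence not in self.occurrences[name]:
            self.occurrences[name].append(occurrence)

    def drop_derived(self):
        # Derived data can be discarded when its pre-conditions no longer hold.
        for name in self.temporary:
            self.occurrences[name].clear()


kb = KnowledgeBase()
kb.define("P*S-P")
kb.define("der-P*S-P", temporary=True)
kb.insert("P*S-P", ("engine", "piston"))
kb.derive("der-P*S-P", ("engine", "piston"))
```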



4.2.4 Rules that Execute Actions


Rules in this category are those rules whose RHS construct corresponds to executable

actions. The action is a KML operation and the rule is as follows:

IF (LHS condition)

THEN (KML OPERATION against OBJECT TYPE)

The semantics clearly indicate that the KML operation is a consequent action and must

be executed by the KBMS. A rule such as this captures operational semantics or the WHAT

information mentioned earlier. Note that we previously discussed the possibility that an

operation could be aborted (when a constraint is violated). Thus, before executing this KML

operation, the corresponding abort flag for the operation must be tested. When execution

completes, its status will reflect completion.

The RHS action of rules in this category could also correspond to constructs that facili-

tate communication between the KBMS and the external world. This action could involve

messages, flags, etc. When this construct appears on the RHS, it is an executable action

which allows the KBMS to synchronize with the external world.



4.2.5 Rules that Build Explicit Inference Chains


Rules that build explicit inference chains are those rules which explicitly execute other

rules. KML constructs corresponding to the execution of a rule can occur on the RHS of

another rule; i.e., a rule r1 can execute another rule r2, on the RHS, as a consequent action,

as follows:

rule r1: IF (LHS condition)

THEN (EXECUTE rule r2 (parameters))

This is called an "explicit inference chain." Rules such as these capture operational

semantics corresponding to the WHAT information previously mentioned. The support of

explicit inference chains by executing a rule on the RHS is another way in which rules

expressed in the KML differ from productions.

The rule r2 that is executed on the RHS must always be a value dependent rule. An

explicit inference chain is one way of selecting value dependent rules for execution and this

method for selecting rules is not generally supported by production systems. It can support

both forward and backward inference chains as will be discussed in the next section.

Explicit inference chains are seen in PLANNER [HEW71 and HEW72] and are con-

sidered to reduce the independence of the rules by embedding control information within a

rule. The alternative is to use a blackboard of shared variables as is common in most pro-

duction systems but this tends to hide information about the chain of execution. As an alter-

native to a blackboard to pass values between rules, values are passed as parameters between

rules in an explicit inference chain.

Before the rule r2 starts executing, the abort flag corresponding to it must be checked to

see if it has been set. After r2 completes execution, its status will indicate whether the execution

was "successful."
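
An explicit inference chain with parameter passing can be sketched in Python as shown below. The class Rule, its abort attribute and the status strings are assumptions of this sketch, not KML syntax; the sketch shows r1 executing r2 on its RHS, the abort flag being tested before r2 runs, and the status recorded afterwards.

```python
class Rule:
    def __init__(self, name, lhs, rhs):
        self.name = name
        self.lhs = lhs
        self.rhs = rhs
        self.abort = False             # may be set by a constraint
        self.status = "not executed"   # hypothetical status values

    def execute(self, *params):
        # The abort flag is tested before the rule starts executing.
        if self.abort:
            self.status = "aborted"
            return
        if self.lhs(*params):
            self.rhs(*params)
            self.status = "successful"
        else:
            self.status = "unsuccessful"


results = []

# r2 is a value dependent rule; it receives its value as a parameter.
r2 = Rule("r2",
          lhs=lambda qty: qty < 10,
          rhs=lambda qty: results.append(("low stock", qty)))

# r1 executes r2 on its RHS, passing a value instead of using a blackboard.
r1 = Rule("r1",
          lhs=lambda part: part["desc"] == "military equipment",
          rhs=lambda part: r2.execute(part["qty"]))

r1.execute({"desc": "military equipment", "qty": 4})
```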

An explicit inference chain also represents meta-information about the rules; i.e., one

rule controls the execution of another rule. This meta-information is useful in structuring

rules within object types, as is discussed later.










4.3 Advantages of Classifying Rules


We have seen that using the KML constructs to express rules increases the expressive

power of the rule language by allowing the explicit specification of both declarative and

operational components of problem solving knowledge. In order to improve the clarity of the

rules to the user or designer and their manageability by the KBMS, we require meta-

knowledge about these rules; classifying rules is one way to meet this requirement.

Meta-knowledge obtained from classifying rules occurs in many forms and includes the

category of the LHS and RHS constructs, the distinction between declarative and operational

semantics, the options used in conjunction with the LHS such as post-exec, the identification

of explicit inference chains on the RHS, etc.

Meta-knowledge about the category of the LHS constructs provides the ability to dis-

tinguish between value independent triggers and value dependent rules. This distinction will

assist the KBMS to select the appropriate subset of rules to be processed, thus increasing the

manageability of the rules. In Chapter Six, we describe the mechanism of applying rules in

the integrated KBMS and we identify the advantage of distinguishing between rules, based

on the LHS constructs.

For example, the value independent triggers which check the execution status of opera-

tions or rules on their LHS and capture operational information are used to modify transac-

tions before execution. The value dependent rules which express declarative semantics and

retrieve object occurrences from the knowledge base on their LHS are incorporated into the

transaction to be executed against the object occurrences. Being able to distinguish between

these categories of rules improves the efficiency of the mechanism for applying rules.

Meta-knowledge in the form of the different options, associated with testing the execu-

tion status of operations and rules, is used to schedule the execution of operations and rules

within a transaction. It is also used to determine when a rule or operation is conditionally










executed or when rules and operations can be executed in parallel. This is useful when

optimizing a transaction for the efficient implementation of the KBMS.

In MOR84, it was suggested that constraints activated in a chained fashion should be

grouped together to improve their manageability. In our system, using meta-knowledge about

the RHS EXECUTE constructs, explicit inference chains are easily detected. Clustering

rules that are explicitly chained together will facilitate the propagation of information along

the inference chain via parameter passing as well as be an aid to efficient implementation, as

will be seen in Chapter Seven.

The object-oriented paradigm provided structure to the knowledge by grouping data

and relevant rules within the object types. Meta-knowledge about the different categories of

rules can be used to further structure these rules defined for a single object type; this struc-

turing will reflect and exploit differences in applying these rules in the KBMS. This is also

discussed in Chapter Seven.

One final note is that any rule can be used for either forward chaining or backward

chaining. It is the inferencing mechanism which determines the direction of the chain by

binding variables on the appropriate side of a rule and selecting rules for execution, based on

their LHS conditions (forward chain) or the RHS consequent (backward chain).

Most deductive systems are characterized by one form of inferencing mechanism; Pro-

log uses a goal based backward chaining strategy while most production systems are exam-

ples of forward chaining.

In our KBMS, we use a transaction oriented mechanism for applying rules. As described

in Chapter Six, the process of matching value independent triggers against the triggering

operations and rules of the KBMS transaction follows a forward chain of inference. How-

ever, the explicit inference chains that (explicitly) control the execution of value dependent

rules support both forward and backward chaining strategies.










When a rule r1 executes another rule r2 on its RHS, r1 uses parameter passing to bind

variables in r2, as follows:

rule r1: IF (LHS condition)

THEN (EXECUTE rule r2 (parameters))

Depending on the direction of variable binding within rule r2, both forward and back-

ward inference chains can be supported. A backward inference chain will use the values

passed via the parameters to initially bind variables on the RHS of rule r2. A forward infer-

ence chain corresponds to using values passed via the parameters to initially bind variables

on the LHS of rule r2. This will be seen in the examples discussed in the next chapter. The

desired direction of inference is specified in these rules through the appropriate use of vari-

able names and parameter names. Consequently, the direction of inference is decided when

each rule is specified.







































triggering operation   trigger                               modified transaction
in transaction

INSERT into object     IF pre-exec (INSERT into object)      1. RHS consequent
                       THEN (RHS consequent)                 2. INSERT into object +

INSERT into object     IF post-exec (INSERT into object)     1. INSERT into object
                       THEN (RHS consequent)                 2. RHS consequent

INSERT into object     IF par-exec (INSERT into object)      INSERT into object and RHS consequent,
                       THEN (RHS consequent)                 executed in parallel

+ conditionally executed if no abort flag is set

Figure 4.1 Possible Modifications to a Triggering Operation
















triggering rule        trigger                               modified transaction
in transaction

EXECUTE rule           IF pre-exec (EXECUTE rule)            1. RHS consequent
                       THEN (RHS consequent)                 2. EXECUTE rule +

EXECUTE rule           IF post-exec (EXECUTE rule)           1. EXECUTE rule
                       THEN (RHS consequent)                 2. RHS consequent

EXECUTE rule           IF par-exec (EXECUTE rule)            EXECUTE rule and RHS consequent,
                       THEN (RHS consequent)                 executed in parallel

EXECUTE rule           IF succ-exec (EXECUTE rule)           1. EXECUTE rule
                       THEN (RHS consequent)                 2. RHS consequent ++

+ conditionally executed if no abort flag is set
++ conditionally executed if rule successfully executes

Figure 4.2 Possible Modifications to a Triggering Rule















CHAPTER V

CAPTURING SEMANTIC FEATURES IN A KNOWLEDGE BASE



In Chapter Three we described the architecture of the integrated, object-oriented

KBMS and in Chapter Four we outlined how rules could be used to capture declarative and

operational semantics. In this chapter, we show, by example, how semantic features found

useful in modeling knowledge can be captured by the user object types of the integrated

knowledge base.

First we discuss semantic features, from both DBMS and AI literature, that have been

found useful in modeling knowledge from diverse domains. We introduce the system object

types of our model, corresponding to the different association types of the underlying object-

oriented semantic data model, SAM*. We then show examples of user object types that are

built using the system object types of SAM*, extended by a knowledge component, i.e., the

rules defined for each user object type. The desirable semantic features are captured by

these user object types and their rules. Depending on the complexity of a semantic feature,

several system object types and rules may be needed to capture it.



5.1 Useful Semantic Features for Modeling Knowledge


Semantic models containing a number of general modeling constructs have been pro-

posed in DBMS research [BRO81, CHE76, COD79, HAM81, SCH75, SMI77, SU79 and

SU83]. The features captured in these semantic data models parallel those captured in AI

knowledge representation schemes [BOB77, FOX84, GOL77, INT84, MOS83, MYL81,

QUI68, SCH76, SCH83, STE83, SZO77 and WIN75].










The constructs in these models, enhancements to them, and other constructs that are

supported by our object-oriented knowledge representation model are summarized below.

An object type representing an entity or concept is described by its attributes. Object

types are often used to model complex hierarchical structures; thus, the concept of describing

an object type by its attributes must be extended to handle attributes that are themselves

object types. This could lead to recursively defined object types where an attribute of an

object type has the same definition as the object type itself.

Next, generalization or "is-a" hierarchies model classes of object types and their sub-

classes and are capable of supporting the inheritance of attributes and operations, along the

hierarchy. This feature must be extended to support the inheritance of rules as well. The

hierarchies must also be able to handle multiple inheritance when the hierarchy represents a

graph rather than a tree. This is an extension of the strict hierarchies of Smalltalk-80.

The "is-a-part-of" or component relationship must be extended to include complex

object types whose components are also complex object types. This, too, could lead to recur-

sively defined object types. The ability to group objects into a class and to identify attri-

butes of the class is used to represent summary functions such as the maximum, average,

etc., for all members of the class. Another useful feature is the ability to model facts or

events that result from the interactions of the occurrences of two (or more) independent

object types as well as the attributes that describe these interactions. It is also useful to

identify properties such as transitivity, symmetry, reflexivity, etc., to describe relationships

between object types and occurrences.

The membership or "as-a" relationship is used to distinguish between an object type

and the individual object occurrences (or instances) of that type. This feature must be

extended in order to associate knowledge with individual object occurrences.

Useful problem solving knowledge that is captured in both DBMS and AI systems is

often in the form of constraints or rules. This includes integrity and security constraints

common in DBMS and AI truth maintenance systems, deductive rules that describe how new










facts can be derived, often through the use of properties such as transitivity, and expert

problem solving rules defined for an application domain. This knowledge, too, must be

represented within the object types of the integrated knowledge base.

The knowledge component, in the form of rules which are also objects, may be defined

for a system object type, a user object type or for an occurrence of a user object type. A rule

defined for a system object type captures semantics which are inherent to the semantic

model; it will be inherited by user object types built using the system object type. A rule

defined for a user object type (or one of its occurrences) captures domain semantics relevant

to a particular application domain. Related work on this subject is described in RAS85.

Note that the terms SAM* association type and system object type will be used interchange-

ably in this chapter.



5.2 Representing the Attributes of an Object Type


An object type is usually defined by its attributes or characteristics; such a grouping of

a set of attributes to represent an object type must be captured in the knowledge base. In

Figure 5.1, PART-DEF is a user object type representing a collection of parts and is

described by its attributes PART-NAME, PART-DESC, etc. PART-DEF will be modeled

by an aggregation (A) association type (system object type). The A association type of

SAM* is a system object type that defines an object type by a set of characteristic attributes,

each of which is represented by an attribute type. Each occurrence of this object type is a

member of the cross product of the domains of the attribute types involved and is

represented by the values of its attributes.

The domain of an attribute can be defined by a membership (M) association type which

is used to group together similar atomic concepts; it is formed by a set of distinct elements of

the same data type. The data type can be a simple data type, e.g., integer, a complex data

type, e.g., vector, or an abstract data type, e.g., RULE.
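
The relationship between the M and A association types can be illustrated with a small Python sketch. The domain contents and the function is_valid_occurrence are assumptions chosen for this example; the sketch only shows that an occurrence of an aggregation is a member of the cross product of its attribute domains.

```python
from itertools import product

# M association types: each domain is a set of distinct elements of one data type.
# The values are illustrative assumptions, kept tiny on purpose.
PART_NAME = {"bolt", "nut"}
QTY_IN_STOCK = set(range(0, 3))

# A association type: defined by an ordered set of attribute domains.
domains = (PART_NAME, QTY_IN_STOCK)


def is_valid_occurrence(occurrence):
    """True if every attribute value is drawn from its domain."""
    return (len(occurrence) == len(domains)
            and all(value in domain for value, domain in zip(occurrence, domains)))


# Enumerating the cross product is feasible only because the domains are tiny.
all_possible = set(product(*domains))
```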










In Figure 5.1, M association types are used to define the domains of the attributes of

the user object type PART-DEF. These domains are PART-NAME and PART-DESC

whose data type is string, QTY-IN-STOCK of data type integer, PROD-COST and

MARK-UP whose data type is real and ORDER-PROC whose data type is RULE. P*S-P

is another user object type representing a part-subpart relationship. It is also modeled by an

A association type with attributes PART and SUB-PARTS.

In the figures in this section, a network representation using labelled nodes and arcs is

used. A labelled node is a concept defined in terms of other concepts pointed to by the

directed arcs leading from the node. Attributes are represented by arcs whose domains are

pointed to by the arrow. A labelled arc represents an attribute whose name is different from

the domain name. PART and SUB-PARTS have the same underlying domain

PART-NAME. The data type of the attribute may be different from the domain data type,

e.g., SUB-PARTS is a set of values obtained from the domain PART-NAME.

The knowledge component of user object types PART-DEF and P*S-P is in the form

of rules that are either inherited from the system object types or are specified for the user

object type. Rules that maintain the semantics of the system object type used to model the

user object type are inherited. For example, in addition to the object identification, OID,

which is used to distinguish all objects in the knowledge base, the A association type defines

a uniqueness property over one (or more) attributes; these attributes must have distinct

values for each occurrence of the type. The user object type PART-DEF, modeled by an A

association object type, inherits a uniqueness constraint, say on the attribute PART-NAME.

The user object type PART-DEF inherits a rule, unique-PART-NAME, which checks

each occurrence of PART-DEF for uniqueness of attribute PART-NAME. Rule

unique-PART-NAME is executed when an occurrence is inserted into object type

PART-DEF. PART-DEF also inherits a rule T1 which triggers the execution of

unique-PART-NAME when there is a corresponding INSERT operation. Since there is a

possibility that rule unique-PART-NAME can abort the insertion into PART-DEF (when










the constraint is not met), this rule must be applied before the insertion is executed by the

KBMS. We use the pre-exec option with the INSERT operation in rule T1 to provide the

KBMS with this scheduling information.

A rule such as T1, that tests the execution status of a KML (INSERT) operation exe-

cuting against an object type (PART-DEF), is a value independent trigger as described in

Chapter Four. Rule unique-PART-NAME is a value dependent rule which is explicitly

selected for execution by trigger T1. In this explicit inference chain, T1 passes the value, X,

of the inserted occurrence of PART-DEF as a parameter into unique-PART-NAME.

Note that insertion into PART-DEF also affects the attribute QTY-IN-STOCK and

its domain which is modeled by an M association type. This domain may be subject to user-

defined membership constraints specified by a range, or an enumeration, of acceptable values.

In this application, the desired constraint is that the attribute values of

QTY-IN-STOCK for each occurrence of PART-DEF must not exceed a maximum value of

10,000. A value dependent rule qty-constr-1, defined for QTY-IN-STOCK, maintains this

constraint and is executed by the value independent trigger T1', when an insertion into

PART-DEF is attempted. The rules T1, T1' and unique-PART-NAME, defined for

PART-DEF, and qty-constr-1, defined for QTY-IN-STOCK, are as follows:

T1 : PART-DEF
IF pre-exec(INSERT an occurrence X into PART-DEF)
THEN (EXECUTE unique-PART-NAME(X): PART-DEF)

unique-PART-NAME(X) : PART-DEF
IF (for an occurrence X of PART-DEF, there exists an occurrence Y of PART-DEF
such that EQUAL(X.PART-NAME, Y.PART-NAME) AND NOT-EQUAL(X,Y) )
THEN (alert the KBMS to reject X)


T1' : PART-DEF
IF pre-exec(INSERT an occurrence X into PART-DEF)
THEN (EXECUTE qty-constr-1(X.QTY-IN-STOCK) : QTY-IN-STOCK)

qty-constr-1(X) : QTY-IN-STOCK
IF (for an occurrence X of QTY-IN-STOCK, GREATER-THAN(X,10000))
THEN (alert the KBMS to reject X)
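
The effect of triggers T1 and T1' can be sketched in Python as follows. The list-of-dictionaries representation of PART-DEF and the function insert_part are assumptions of this sketch; each constraint function returns True when its LHS holds, i.e., when the KBMS should be alerted to reject X.

```python
MAX_QTY = 10_000


def unique_part_name(x, part_def):
    """LHS of unique-PART-NAME: another occurrence Y has the same PART-NAME."""
    return any(y["PART-NAME"] == x["PART-NAME"] and y is not x for y in part_def)


def qty_constr_1(qty):
    """LHS of qty-constr-1: the quantity exceeds the maximum value."""
    return qty > MAX_QTY


def insert_part(x, part_def):
    # T1 and T1' use pre-exec: both constraints run before the INSERT.
    abort = unique_part_name(x, part_def) or qty_constr_1(x["QTY-IN-STOCK"])
    if abort:
        return False       # the insertion is rejected
    part_def.append(x)     # the INSERT executes only if no abort flag is set
    return True


parts = []
first = insert_part({"PART-NAME": "bolt", "QTY-IN-STOCK": 100}, parts)
duplicate = insert_part({"PART-NAME": "bolt", "QTY-IN-STOCK": 5}, parts)
too_many = insert_part({"PART-NAME": "nut", "QTY-IN-STOCK": 10_001}, parts)
```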










The user object type PART-DEF may be subject to inter-occurrence and intra-

occurrence constraints. This is application specific knowledge which only applies to a partic-

ular user object type. For example, the user may wish to limit profits by maintaining a limit

on the attribute MARK-UP at a maximum of 50 percent of the PROD-COST. A value

dependent rule, lim-prof-1, maintains this intra-occurrence constraint by executing an

UPDATE operation when the constraint is violated by an insertion into PART-DEF. It is

executed by a trigger T2. Based on its RHS, lim-prof-1 is classified as a rule that executes

an action while T2 is a rule that builds an explicit inference chain. Both rules are defined by

the user for the object type PART-DEF. This expert rule lim-prof-1, unlike

unique-PART-NAME, does not abort the insertion into PART-DEF and may be scheduled

for execution after the insertion. We use the post-exec option with the INSERT operation in

trigger T2.


T2 : PART-DEF
IF post-exec(INSERT occurrence X into PART-DEF)
THEN (EXECUTE lim-prof-1(X): PART-DEF)

lim-prof-1(X): PART-DEF
IF (for an occurrence X of PART-DEF,
GREATER-THAN (X.MARK-UP, PRODUCT (0.5, X.PROD-COST)) )
THEN (UPDATE X such that X.MARK-UP =
PRODUCT (0.5, X.PROD-COST) )
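
A short Python sketch of trigger T2 and rule lim-prof-1 follows. The dictionary representation of an occurrence and the function insert_with_post_exec are assumptions of this sketch; it shows the rule running after the insertion and correcting, rather than aborting, the inserted occurrence.

```python
def lim_prof_1(x):
    """Cap MARK-UP at 50 percent of PROD-COST (the UPDATE on the RHS)."""
    limit = 0.5 * x["PROD-COST"]
    if x["MARK-UP"] > limit:
        x["MARK-UP"] = limit


def insert_with_post_exec(x, part_def):
    part_def.append(x)   # the INSERT completes first
    lim_prof_1(x)        # T2 (post-exec) then executes lim-prof-1


parts = []
high = {"PART-NAME": "bolt", "PROD-COST": 100.0, "MARK-UP": 80.0}
ok = {"PART-NAME": "nut", "PROD-COST": 100.0, "MARK-UP": 30.0}
insert_with_post_exec(high, parts)
insert_with_post_exec(ok, parts)
```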

As mentioned in Chapter Three, a rule can be defined for an object type, in which case it applies

to all occurrences of the object type, or it can be defined for a particular occurrence of an

object type. In the next example, each occurrence of PART-DEF has a different rule associ-

ated with it. This is modeled by an attribute ORDER-RULE of object type PART-DEF;

this attribute is of abstract data type, RULE. Then, for each occurrence of PART-DEF,

the value of this attribute is a rule associated with it.

For all occurrences of PART-DEF whose attribute PART-DESC has a value "military

equipment", the rule ORDER-RULE must be executed, after the corresponding

PART-DEF occurrences are retrieved. This information is captured by trigger T3 which










uses the post-exec option to execute rule milit-constr-1. Rule milit-constr-1 is a value

dependent rule which checks the value of attribute PART-DESC of each retrieved

occurrence of PART-DEF. If satisfied, milit-constr-1 executes the rule stored as a value of

ORDER-RULE with that occurrence of PART-DEF. The rules are as follows:

T3 : PART-DEF
IF post-exec(RETRIEVE an occurrence X of PART-DEF)
THEN (EXECUTE milit-constr-1(X): PART-DEF)

milit-constr-1(X): PART-DEF
IF (for an occurrence X of PART-DEF,
EQUAL (X.PART-DESC, "military equipment"))
THEN (EXECUTE X.ORDER-RULE)

an example value of ORDER-RULE :
IF (GREATER-THAN (PROD-COST, 1500) AND
LESS-THAN (QTY-IN-STOCK, 10) )
THEN (alert the KBMS)
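
A rule stored as an attribute value can be sketched in Python by storing a function in each occurrence. The names alerts, order_rule_example and retrieve are assumptions of this sketch; the alert is modeled simply by appending the part name to a list.

```python
alerts = []


def order_rule_example(x):
    """An example value of ORDER-RULE; it tests only its own occurrence."""
    if x["PROD-COST"] > 1500 and x["QTY-IN-STOCK"] < 10:
        alerts.append(x["PART-NAME"])   # stands in for "alert the KBMS"


def milit_constr_1(x):
    if x["PART-DESC"] == "military equipment":
        x["ORDER-RULE"](x)              # EXECUTE X.ORDER-RULE


def retrieve(part_def):
    result = []
    for x in part_def:
        result.append(x)                # the RETRIEVE of X
        milit_constr_1(x)               # T3 (post-exec) executes milit-constr-1
    return result


part_def = [
    {"PART-NAME": "radar", "PART-DESC": "military equipment",
     "PROD-COST": 2000, "QTY-IN-STOCK": 3, "ORDER-RULE": order_rule_example},
    {"PART-NAME": "bolt", "PART-DESC": "fastener",
     "PROD-COST": 2000, "QTY-IN-STOCK": 3, "ORDER-RULE": order_rule_example},
]
retrieved = retrieve(part_def)
```

The second occurrence satisfies the stored rule but is not military equipment, so its ORDER-RULE is never executed.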

Currently, there are certain restrictions imposed on rules which are attributes of type

RULE. One restriction is that the rules should be value dependent rules; i.e., they may not

test the execution status of operations or rules executed against other objects in the

knowledge base. The conditions that they test on the LHS should only involve the particular

object occurrence for which the rule is specified. This restriction makes the rule sensitive

only to the attribute values of the occurrence for which it is specified.

The RHS consequent action is restricted to execute against the particular occurrence

for which the rule is specified. The RHS action could also be part of an explicit inference

chain by executing another value dependent rule. However, the rule in the explicit inference

chain must also be an attribute of type RULE, specified for the same object occurrence.

As a result of these restrictions, the scope of the knowledge rule is restricted to the

occurrence for which it is defined. Conceptually, there is no necessity to impose these res-

trictions. The reason we currently impose these scoping restrictions is to simplify the KBMS

prototype; this will be discussed with strategies for applying rules, in Chapters Six and

Seven. Finally, since these rules are actually attributes of an object occurrence, they do not

have an execution status associated with them. Value independent rules may not test their










execution status or modify transactions that execute these rules. This, too, simplifies the

prototype of the KBMS.



5.3 Modeling the Use of Properties such as Transitivity


The user may wish to include knowledge that makes use of properties such as transi-

tivity. For example, if P*S-P is an object type that captures the part-subpart relationship,

then the transitive closure of this relationship will derive all the subparts of a part. All the

occurrences derived using the transitivity property will be stored in an object type

der-P*S-P. The object type der-P*S-P is also modeled by an A association type and is

similar in structure to P*S-P. We use separate object types so as to differentiate between

permanent occurrences of P*S-P, which are directly inserted, and the temporary occurrences

of der-P*S-P (its transitive closure), which are derived and whose existence depends on cer-

tain conditions which may not always be true in the knowledge base.

Value dependent rules trans-cl-1 and trans-cl-2, defined for the derived object type

der-P*S-P, define the transitive closure of P*S-P. Based on their RHS, they are classified

as deductive rules that derive new information. The two rules are executed by triggers T4

and T4', respectively, when occurrences of der-P*S-P are retrieved.

T4 : der-P*S-P
IF par-exec(RETRIEVE occurrence X from der-P*S-P)
THEN (EXECUTE trans-cl-1(X): der-P*S-P)

trans-cl-1(Z): der-P*S-P
IF (there exists an occurrence Y of P*S-P)
THEN (DERIVE occurrence Z of der-P*S-P such that Z = Y)


T4' : der-P*S-P
IF par-exec(RETRIEVE occurrence X from der-P*S-P)
THEN (EXECUTE trans-cl-2(X): der-P*S-P)

trans-cl-2(Z) : der-P*S-P
IF (there exist occurrences P and Q of der-P*S-P and P*S-P, respectively, such that
SET-MEMBER(Q.PART, P.SUB-PARTS))
THEN (DERIVE occurrence Z of der-P*S-P where Z.PART = P.PART
AND Z.SUB-PARTS = the union of P.SUB-PARTS and Q.SUB-PARTS)

This chain of inference, corresponding to triggers T4 and T4' executing value depen-

dent rules trans-cl-1 and trans-cl-2, is an example of a backward chain. In a backward

chain, the goal or triggering operation (the retrieval operation), initially binds occurrences on

the RHS of a rule. These bindings are then passed to the LHS of that rule. In the example,

triggers T4 and T4' pass the attributes of the X occurrences (to be retrieved) as parameters

into trans-cl-1 and trans-cl-2. There, they first bind with the attributes of the Z

occurrences on the RHS of these rules.

We use the special knowledge manipulation operation DERIVE to indicate that these

new occurrences are generated through the application of a deductive rule. Derived data are

not usually a permanent part of the knowledge base and are generated when required. Since

we specify the par-exec option on the LHS of the triggers T4 and T4', the rules trans-cl-1

and trans-cl-2 can be applied in parallel with the retrieval operation.

Transitive closure exhibits recursive properties, as is seen in this example. The trigger-

ing retrieval operation against der-P*S-P causes the execution of trans-cl-2. In trying to

satisfy its LHS, rule trans-cl-2 will attempt to retrieve appropriate P occurrences; this

corresponds to the execution of another retrieval operation against der-P*S-P. Triggers T4

and T4' will match against this new retrieval operation and will cause rules trans-cl-1 and

trans-cl-2 to be recursively executed.

The same rules trans-cl-1 and trans-cl-2 can also be used to compute the transitive

closure of P*S-P via a forward chain of inference. To support a forward chain, the triggers

must be changed to execute the rules when new occurrences are inserted into P*S-P or

derived into der-P*S-P. Now, the new occurrences will initially bind attributes on the LHS

of the rules.
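
A forward chain over the two rules can be sketched in Python as a naive fixpoint computation. This is an illustration under stated assumptions, not the evaluation strategy of the KBMS: P*S-P is represented as a dictionary from each PART to its set of direct SUB-PARTS, and each derivation by trans-cl-2 is folded into a single occurrence per part instead of producing a separate Z occurrence.

```python
def transitive_closure(psp):
    """psp maps each PART to the set of its direct SUB-PARTS (occurrences of P*S-P)."""
    # trans-cl-1: every occurrence of P*S-P is also an occurrence of der-P*S-P.
    der = {part: set(subs) for part, subs in psp.items()}
    changed = True
    while changed:
        changed = False
        # trans-cl-2: if Q.PART is a member of P.SUB-PARTS, derive the union
        # of P.SUB-PARTS and Q.SUB-PARTS for P.PART.
        for p_subs in der.values():
            for q_part, q_subs in psp.items():
                if q_part in p_subs and not q_subs <= p_subs:
                    p_subs |= q_subs
                    changed = True
    return der


psp = {
    "car": {"engine", "wheel"},
    "engine": {"piston", "valve"},
    "piston": {"ring"},
}
closure = transitive_closure(psp)
```

The loop stops when a full pass derives nothing new, which also guarantees termination for cyclic part structures because the sets can only grow and are bounded.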










The detection of recursive rules and evaluation strategies (when there are a large

number of occurrences that can satisfy the rules, thus, requiring efficient search strategies)

has been studied [BAY85, CHA81, HEN84, STO84, ULL85 and WON84] and is still an open

topic for research. Chapter Eight of this dissertation deals with the efficient evaluation of

linear recursive rules.



5.4 Grouping Objects into Generalization "Is-A" Hierarchies


Similar objects can be grouped together to form a generic type. The generic object

type and its constituent object types will form a generalization hierarchy. In the example of

Figure 5.2, GOVT-PROJECT is a generic object type of all government projects and it

groups together all members of its three constituent object types, NON-MILITARY-PROJ,

MILITARY-PROJ and TOP-SECRET-PROJ. An occurrence of a constituent object type

is also a member of the generic object type GOVT-PROJECT.

In our model, we use a generalization (G) association type to model a generic object

type. Thus, GOVT-PROJECT is modeled by a G association type. Its three constituents

are modeled by A association types. The constituent object types of a generic object type

could be dissimilar, i.e., the constituent object types could have different attributes, rules,

etc., to describe them. The occurrences of a user object type modeled by a G association

type are obtained by taking the outerjoin of the occurrences of its constituents. The generic

object type simply groups together existing object occurrences but does not create any new

object occurrences. Thus, the set of OID values for GOVT-PROJECT is formed by the

union of the OID values of its constituents.
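As a rough sketch of this grouping, the occurrences of each constituent can be held in a dictionary keyed by OID, and the generic type formed by merging them. The attribute names (AGENCY, BRANCH, CLEARANCE) and the OID values are invented for the example; an attribute that is absent from a merged occurrence corresponds to a null in the outerjoin.

```python
# Hypothetical constituent occurrences, keyed by OID.
non_military = {101: {"AGENCY": "NASA"}}
military = {202: {"BRANCH": "Navy"}, 303: {"BRANCH": "Army"}}
top_secret = {303: {"CLEARANCE": "Q"}}


def generic_occurrences(*constituents):
    """Merge constituent occurrences on OID, in the manner of an outerjoin."""
    merged = {}
    for constituent in constituents:
        for oid, attrs in constituent.items():
            # An existing OID is reused; no new object occurrence is created.
            merged.setdefault(oid, {}).update(attrs)
    return merged


govt_project = generic_occurrences(non_military, military, top_secret)
```

The key set of `govt_project` is the union of the constituents' OID values, and occurrence 303, which belongs to two constituents, appears only once.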

Although the generalization we have described is a strict hierarchy of user object types,

the occurrences themselves may not be hierarchically organized. For example, an occurrence

of TOP-SECRET-PROJ could also be an occurrence of MILITARY-PROJ. To model this,

four types of constraints, set exclusion, set equality, set subset and set intersection can be










specified for the generic object type. These constraints specify the set relationship between

the occurrences of a pair of constituent object types. The constraint is based on the OID

values of these occurrences.

The set constraints between the constituents of GOVT-PROJECT are shown in Fig-

ure 5.2. For any generic object type, the specific set constraints are application domain

knowledge; i.e., corresponding rules will not be inherited but must be defined for each user

object type. In this application, there is a set exclusion constraint between occurrences of

MILITARY-PROJ and NON-MILITARY-PROJ while occurrences of

TOP-SECRET-PROJ form a subset of MILITARY-PROJ. The generic object type

GOVT-PROJECT must be defined by rules that check for the appropriate set constraints.

These rules will be triggered when occurrences are inserted into the constituent object types.

GOVT-PROJECT has a value dependent rule gen-constr-1 which ensures set exclu-

sion; i.e., an occurrence that is inserted into TOP-SECRET-PROJ should not previously

have been inserted into NON-MILITARY-PROJ. Trigger T5, defined for

TOP-SECRET-PROJ, executes gen-constr-1 when there is an insertion into

TOP-SECRET-PROJ. Since gen-constr-1 can abort the insertion into

TOP-SECRET-PROJ, T5 uses the pre-exec option with the INSERT operation.

A rule gen-constr-2, also defined for GOVT-PROJECT, inserts a corresponding

occurrence into MILITARY-PROJ to satisfy the set subset constraint (if it is not already

satisfied). This occurrence will have the same OID value as the inserted occurrence of

TOP-SECRET-PROJ. Attribute values will be obtained from the user or set to null values.

Rule gen-constr-2 can be executed in parallel with the insertion into TOP-SECRET-PROJ.

A trigger, T6, defined for TOP-SECRET-PROJ, executes gen-constr-2 using the par-exec

option. The rules are as follows:

T5 : TOP-SECRET-PROJ
IF pre-exec(INSERT an occurrence X into TOP-SECRET-PROJ)
THEN (EXECUTE gen-constr-1(X): GOVT-PROJECT)

gen-constr-1(X): GOVT-PROJECT
IF (for an inserted occurrence X of TOP-SECRET-PROJ, there exists
an occurrence Y of NON-MILITARY-PROJ, such that EQUAL(X.OID, Y.OID))
THEN (alert the KBMS to reject occurrence X)

T6 : TOP-SECRET-PROJ
IF par-exec(INSERT an occurrence X into TOP-SECRET-PROJ)
THEN (EXECUTE gen-constr-2(X): GOVT-PROJECT)

gen-constr-2(X): GOVT-PROJECT
IF (for an inserted occurrence X of TOP-SECRET-PROJ,
NOT-SET-MEMBER(X.OID, MILITARY-PROJ.OID) )
THEN (INSERT an occurrence Z into MILITARY-PROJ where Z.OID = X.OID)

In the generalization hierarchy, each member of a constituent object type is also a

member of the generic object type. Thus, when an occurrence of TOP-SECRET-PROJ is

inserted, a corresponding occurrence of GOVT-PROJECT must be inserted (if it does not

already exist). This knowledge is inherent to the knowledge representation model; rules that

support this are inherited by user object types modeled by G association types. A rule

gen-hier-1 inherited by GOVT-PROJECT maintains this hierarchy by inserting a

corresponding occurrence into GOVT-PROJECT, when necessary. A value independent

trigger T7, also inherited by TOP-SECRET-PROJ, executes gen-hier-1 in parallel with an

insertion into TOP-SECRET-PROJ. They are as follows:


T7 : TOP-SECRET-PROJ
IF par-exec(INSERT an occurrence X into TOP-SECRET-PROJ)
THEN (EXECUTE gen-hier-1(X) : GOVT-PROJECT)

gen-hier-1(X) : GOVT-PROJECT
IF (for occurrence X inserted into a component of GOVT-PROJECT,
NOT-SET-MEMBER(X.OID, GOVT-PROJECT.OID))
THEN (INSERT an occurrence P into GOVT-PROJECT where P.OID = X.OID)
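The combined effect of triggers T5, T6 and T7 on an insertion into TOP-SECRET-PROJ can be sketched as follows. This is an illustration under simplifying assumptions, not the KBMS implementation: each object type is reduced to a set of OID values, the par-exec rules are run sequentially after the insertion rather than in parallel, and the class and exception names are hypothetical.

```python
class InsertionRejected(Exception):
    """Raised when a pre-exec rule aborts the triggering insertion."""


class GovtProjectKB:
    def __init__(self):
        # Each object type is reduced to its set of OID values.
        self.non_military = set()
        self.military = set()
        self.top_secret = set()
        self.govt_project = set()

    def gen_constr_1(self, oid):
        # Set exclusion between TOP-SECRET-PROJ and NON-MILITARY-PROJ.
        if oid in self.non_military:
            raise InsertionRejected(f"OID {oid} is already a non-military project")

    def gen_constr_2(self, oid):
        # Set subset: every top secret project is also a military project.
        if oid not in self.military:
            self.military.add(oid)

    def gen_hier_1(self, oid):
        # Every member of a constituent is a member of the generic type.
        if oid not in self.govt_project:
            self.govt_project.add(oid)

    def insert_top_secret(self, oid):
        self.gen_constr_1(oid)    # T5: pre-exec, may abort the insertion
        self.top_secret.add(oid)  # the triggering INSERT itself
        self.gen_constr_2(oid)    # T6: par-exec (run sequentially here)
        self.gen_hier_1(oid)      # T7: par-exec (run sequentially here)
```

Because gen-constr-1 runs before the insertion, a rejected occurrence never reaches TOP-SECRET-PROJ, which is why T5 needs the pre-exec option while T6 and T7 do not.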



5.5 Attribute Inheritance in the Generalization Hierarchy


Attributes that describe a generic object type are inherited by the occurrences of the

constituent object types. This semantic feature is called "attribute inheritance". For exam-

ple, two attributes, LOCATION and STATUS, describe all occurrences of the generic object










type GOVT-PROJECT and are inherited by the occurrences of its three constituent object

types.

Our model supports the inheritance of descriptive attributes defined for a generic user

object type. Rules to support attribute inheritance will be inherited by these user object

types.

In the previous section, a rule gen-hier-1 inserts an occurrence into the generic object

type GOVT-PROJECT, whenever an occurrence is inserted into TOP-SECRET-PROJ.

Before this insertion into GOVT-PROJECT is executed, values for the inherited attributes

must be obtained from the user. A value independent trigger T8, inherited by

GOVT-PROJECT, will obtain the values of LOCATION and STATUS from the user (for

the corresponding occurrence of TOP-SECRET-PROJ), before the insertion into

GOVT-PROJECT. Trigger T8 is as follows:

T8 : GOVT-PROJECT
IF pre-exec(INSERT an occurrence X into GOVT-PROJECT)
THEN (obtain values for attributes X.LOCATION and X.STATUS from user)

The occurrences of GOVT-PROJECT and its constituents may also be subject to

application specific knowledge defined by the user. There may be a constraint specifying

that the attribute LOCATION of all occurrences of TOP-SECRET-PROJ must have a

value "Virginia" if the attribute STATUS has a value "testing". A value dependent rule

loc-stat-1, defined for TOP-SECRET-PROJ, tests the occurrence of GOVT-PROJECT

corresponding to an occurrence of TOP-SECRET-PROJ, for this constraint. Note that the

constraint tests the values of inherited attributes. Rule loc-stat-1 updates the occurrence of

GOVT-PROJECT, if necessary, to maintain this constraint. A trigger T9, defined for

TOP-SECRET-PROJ, executes rule loc-stat-1 after an insertion into

TOP-SECRET-PROJ. Rules loc-stat-1 and T9 are as follows:

T9 : TOP-SECRET-PROJ
IF post-exec(INSERT occurrence X into TOP-SECRET-PROJ)
THEN (EXECUTE loc-stat-1(X) : GOVT-PROJECT)

loc-stat-1(X): GOVT-PROJECT
IF (for an inserted occurrence X of TOP-SECRET-PROJ,
there exists an occurrence Y of GOVT-PROJECT, such that EQUAL(X.OID, Y.OID)
AND EQUAL(Y.STATUS, "testing") AND
NOT-EQUAL(Y.LOCATION, "Virginia"))
THEN (UPDATE Y such that Y.LOCATION = "Virginia")
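The check performed by loc-stat-1 can be sketched as a function over a dictionary of GOVT-PROJECT occurrences keyed by OID. The dictionary representation and the sample values are assumptions made for the example; only the attribute names and the constraint itself come from the rule above.

```python
def loc_stat_1(top_secret_oid, govt_project):
    """Enforce LOCATION = "Virginia" when STATUS = "testing".

    Returns True if the corresponding GOVT-PROJECT occurrence was updated.
    """
    y = govt_project.get(top_secret_oid)  # occurrence Y with the same OID as X
    if (y is not None
            and y["STATUS"] == "testing"
            and y["LOCATION"] != "Virginia"):
        y["LOCATION"] = "Virginia"        # the UPDATE on the RHS of the rule
        return True
    return False


# Hypothetical occurrences of GOVT-PROJECT, keyed by OID.
projects = {
    303: {"LOCATION": "Florida", "STATUS": "testing"},
    404: {"LOCATION": "Florida", "STATUS": "design"},
}
updated_303 = loc_stat_1(303, projects)
updated_404 = loc_stat_1(404, projects)
```

The rule reads and updates only inherited attributes, so it operates on the GOVT-PROJECT occurrence even though the triggering insertion is against TOP-SECRET-PROJ.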

The G association system object type can also be used to model the concept of speciali-

zation. Instead of using the hierarchy to group similar constituent object types into the gen-

eric object type, the generic object type is specialized into different constituent object types.

In this case, all occurrences of the generic object type need not necessarily belong to some

constituent object type and occurrences can be directly inserted into the generic object type.

Different rules will support the specialization concept.



5.6 Modeling Complex Object Types and Their Interactions


Complex objects are described by attributes which are themselves occurrences of other

object types. For example, in Figure 5.3, an object type WORK-STATION is described by

several attributes, one of which is a set of occurrences of another object type WK-ST-OPS,

representing the operations that can be performed by WORK-STATION. In our model,

complex user object types are modeled by a combination of other object types; in this

instance an aggregation (A) hierarchy is used.

WORK-STATION is modeled by an A association type with attributes WK-ST-ID

and WK-ST-OPS. The attribute WK-ST-OPS is modeled as a set of occurrences of

another user object type WK-ST-OP which, in turn, is modeled by an A association type,

with two attributes OPERATION and OP-TIME.

A complex object type can also represent a set of facts or events which result from the

interaction among occurrences of independent object types. In Figure 5.3, PROD-JOB and

WORK-STATION are two user object types representing a collection of production jobs

and work stations, respectively. An interaction is said to exist between occurrences of these

two object types whenever a production job can be executed on a work station; this










interaction is represented by another user object type WK-ST-TASK which captures the

attributes describing the interaction. These attributes are OPERATION, the operation that

the work station can perform on the production job, and OP-TIME, the time to execute this

operation.

The system object type that models an interaction is the interaction (I) association

type. Thus, the user object type WK-ST-TASK is modeled by an I association type. Each

interaction occurrence of WK-ST-TASK is identified by an OID value and has two attri-

butes corresponding to the OID values of the interacting occurrences. For example,

WK-ST-TASK has attributes PROD-JOB.OID and WORK-STATION.OID to store the

OID values of the occurrences of PROD-JOB and WORK-STATION.

We already discussed modeling the complex user object type WORK-STATION. The

user object type PROD-JOB is modeled by an A association as seen in Figure 5.3; it has

attributes JOB-ID and OPERATION (which has the same domain as

WORK-STATION.WK-ST-OPS.OPERATION).

There are a number of constraints that can be defined for an I association type; they

include uniqueness constraints, intra-occurrence and inter-occurrence constraints similar to

those defined for an A association type, and a mapping constraint (1-1, 1-n and n-m) to

describe the relationship between the interacting occurrences. These constraints capture

domain specific knowledge about the interaction.

Rules can also be used to capture domain specific knowledge about the actual condi-

tions under which an interaction may occur. These rules are not inherited by the user object

type but must be specified by the user. The user object type WK-ST-TASK may be defined

by rules that automatically derive (or delete) occurrences of WK-ST-TASK in response to

changes to the underlying object types that interact with each other. This follows a forward

chain of inference.

In contrast, rules specified for WK-ST-TASK may derive its occurrences whenever

there is a retrieval request. This would correspond to a backward inference chain. For










example, a value dependent rule inter-rul-1 examines the object occurrences of PROD-JOB

and WORK-STATION for possible interactions. The rule identifies operations that the

work station can perform on the production job and derives corresponding interaction

occurrences. Based on its RHS, inter-rul-1 is a deductive rule that derives new occurrences.

Since these derived occurrences are temporary, i.e., their existence depends on conditions

specified on the LHS of inter-rul-1, they are stored as occurrences of der-WK-ST-TASK.

The temporary user object type der-WK-ST-TASK has the same structure as

WK-ST-TASK.

A value independent trigger T10, defined for der-WK-ST-TASK, executes inter-rul-1

whenever occurrences of der-WK-ST-TASK are retrieved. This is a backward inference

chain; T10 initially binds variables on the RHS of inter-rul-1.

T10 : der-WK-ST-TASK
IF par-exec(RETRIEVE occurrence X from der-WK-ST-TASK)
THEN (EXECUTE inter-rul-1(X) : der-WK-ST-TASK)

inter-rul-1(Z) : der-WK-ST-TASK
IF (there exist occurrences X and Y of PROD-JOB and WORK-STATION,
such that SET-MEMBER(X.OPERATION, Y.WK-ST-OPS.OPERATION))
THEN (DERIVE an occurrence Z into der-WK-ST-TASK
where Z.PROD-JOB.OID = X.OID and Z.WORK-STATION.OID = Y.OID)

WK-ST-TASK is described by two attributes OPERATION and OP-TIME. The

values of these attributes can also be derived from the corresponding occurrences of

PROD-JOB and WORK-STATION, by rule inter-rul-1. The RHS consequent of

inter-rul-1 will now be as follows:

...(DERIVE an occurrence Z into der-WK-ST-TASK
where Z.PROD-JOB.OID = X.OID and Z.WORK-STATION.OID = Y.OID
and Z.OPERATION = X.OPERATION and Z.OP-TIME = F.OP-TIME
where F is an occurrence in set Y.WK-ST-OPS
such that F.OPERATION = X.OPERATION )
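The derivation performed by the extended form of inter-rul-1 can be sketched as a join over the two interacting object types. The list-of-dictionaries representation and the sample occurrences are assumptions made for the example, and the system generated OID of each derived occurrence is omitted.

```python
# Hypothetical occurrences of PROD-JOB and WORK-STATION.
prod_jobs = [
    {"OID": 1, "JOB-ID": "J1", "OPERATION": "drill"},
    {"OID": 2, "JOB-ID": "J2", "OPERATION": "weld"},
]
work_stations = [
    {"OID": 10, "WK-ST-ID": "W1",
     "WK-ST-OPS": [{"OPERATION": "drill", "OP-TIME": 5},
                   {"OPERATION": "mill", "OP-TIME": 9}]},
]


def inter_rul_1(jobs, stations):
    """Derive the temporary occurrences of der-WK-ST-TASK (sketch only)."""
    derived = []
    for x in jobs:                    # occurrence X of PROD-JOB
        for y in stations:            # occurrence Y of WORK-STATION
            for f in y["WK-ST-OPS"]:  # occurrence F in the set Y.WK-ST-OPS
                if f["OPERATION"] == x["OPERATION"]:
                    derived.append({
                        "PROD-JOB.OID": x["OID"],
                        "WORK-STATION.OID": y["OID"],
                        "OPERATION": x["OPERATION"],
                        "OP-TIME": f["OP-TIME"],
                    })
    return derived


der_wk_st_task = inter_rul_1(prod_jobs, work_stations)
```

Since the derived occurrences are recomputed from the current contents of PROD-JOB and WORK-STATION on each retrieval, they need not be stored permanently.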










5.7 Modeling Composite Object Types or the "Is-A-Part-Of" Relationship


Engineering databases model composite objects composed of a collection of similar or

dissimilar objects. The semantic feature captured by these composite objects is the "is a

part of" relationship. The composite object has a structure corresponding to a set of sets,

since each component object is represented by a set of occurrences. Composite objects are

also used to represent a class of objects or a sub-database.

In Figure 5.4, PROD-DATA is a collection of the various components of product data

and results of tests performed on these products. The product data include descriptions and

specifications of products and the tests describe electrical, thermal and mechanical tests con-

ducted on these products. The various components of PROD-DATA are PROD-SPEC,

PROD-DESC, ELECT-TEST-DATA, THERMAL-TEST-DATA and

MECH-TEST-DATA.

In our model, a composition (C) association type models a composite object type. Each

component of the composite object type can be modeled by any other association type

(including the C association type, itself). Each component represents the entire set of

occurrences of that component object type and the composite object occurrence is a set of

sets. PROD-DATA is comprised of five sets of data, corresponding to its five components.

The "is a part of" semantics captured by the C association type is different from the G

association type in that the components of the former are a part of the composite object

type, whereas the constituents of the latter are members of the generic type. The C associa-

tion type also differs from the A association type in which attributes describe the object type.

In Figure 5.4, the user object type PROD-DATA is modeled by a C association type

and its component object types are modeled by A association types. For example, the com-

ponent ELECT-TEST-DATA is modeled by an A association type whose attributes describe

the electrical tests performed. ELECT-TEST-DATA has an attribute TEST-ID identifying

the test and attributes TEST-PARAM and PROD-SET describing the tests performed










rather than the products tested. PROD-SET is a set of values from the domain

PRODUCT-ID and this attribute specifies which products were tested. Similarly, com-

ponents THERMAL-TEST-DATA and MECH-TEST-DATA are also modeled by A associ-

ation types whose attributes describe the test performed. In contrast, components

PROD-SPEC and PROD-DESC describe all the products in the knowledge base, identified

by the value of attribute PRODUCT-ID, irrespective of the tests performed on them.

Any KML operation executed against a composite object type is actually executed

against its components. When data are to be retrieved from PROD-DATA, the retrieval

operations actually execute against its five components where each component is a set of

occurrences. To support this, each composite object type is defined by rules which generate

corresponding operations to be executed against its components.

The user may wish to retrieve from PROD-DATA all data relevant to a particular

product identified by a value for PRODUCT-ID. This RETRIEVE operation actually

involves five RETRIEVE operations executed against the five components. Each component

is defined by a value dependent rule comp-rul-1, ..., comp-rul-5, respectively, which is

responsible for the actual RETRIEVE operations. Triggers T11, T12, ..., T15, also defined

for PROD-DATA, execute the rules comp-rul-1, ..., comp-rul-5 in response to a

RETRIEVE operation against PROD-DATA.

Note that a single trigger could also have been specified to execute these rules. The

rules are as follows:

T11 : PROD-DATA
IF par-exec(RETRIEVE from PROD-DATA all data such that
EQUAL(PRODUCT-ID, "prod-no"))
THEN (EXECUTE comp-rul-1("prod-no") : ELECT-TEST-DATA)

comp-rul-1("prod-no") : ELECT-TEST-DATA
IF (there exists an occurrence X in ELECT-TEST-DATA such that
SET-MEMBER("prod-no", X.PROD-SET))
THEN (RETRIEVE (X.TEST-ID, X.TEST-PARAM) )

T15 : PROD-DATA
IF par-exec(RETRIEVE from PROD-DATA all data such that
EQUAL(PRODUCT-ID, "prod-no") )
THEN (EXECUTE comp-rul-5("prod-no"): PROD-DESC)

comp-rul-5("prod-no"): PROD-DESC
IF (there exists an occurrence X in PROD-DESC such that
EQUAL(X.PRODUCT-ID, "prod-no"))
THEN (RETRIEVE X)
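The way a single RETRIEVE against the composite object type fans out to its components can be sketched as a dispatch table from component names to rules. Only the two rules listed above are modeled; the remaining three components would be handled analogously but are left out because their rules are not shown. The sample occurrences, the DESCRIPTION attribute and the function names are hypothetical.

```python
# Hypothetical occurrences of two of the five components.
elect_test_data = [
    {"TEST-ID": "E1", "TEST-PARAM": "voltage", "PROD-SET": {"p1", "p2"}},
    {"TEST-ID": "E2", "TEST-PARAM": "current", "PROD-SET": {"p3"}},
]
prod_desc = [
    {"PRODUCT-ID": "p1", "DESCRIPTION": "relay"},
    {"PRODUCT-ID": "p3", "DESCRIPTION": "fuse"},
]


def comp_rul_1(prod_no):
    # Tests whose PROD-SET contains the requested product.
    return [(x["TEST-ID"], x["TEST-PARAM"])
            for x in elect_test_data if prod_no in x["PROD-SET"]]


def comp_rul_5(prod_no):
    # Descriptions of the requested product.
    return [x for x in prod_desc if x["PRODUCT-ID"] == prod_no]


# One entry per trigger; each maps a component to the rule it executes.
COMPONENT_RULES = {
    "ELECT-TEST-DATA": comp_rul_1,
    "PROD-DESC": comp_rul_5,
}


def retrieve_from_prod_data(prod_no):
    """Execute one retrieval per component and collect the results."""
    return {name: rule(prod_no) for name, rule in COMPONENT_RULES.items()}


result = retrieve_from_prod_data("p1")
```

Each component interprets the product number in its own way (set membership for the test data, equality for the descriptions), which is why a separate rule is defined per component.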

Other useful semantic features not dealt with in this chapter include 1) classes of

objects and their attributes, 2) operations representing high level access functions that can be

defined for object types, 3) recursive structures representing objects whose attributes are

drawn from the same object type, etc. These features can also be captured by the user

object types and knowledge rules defined for the object types.


























Figure 5.1 Describing an Object by its Attributes

(Figure not reproduced. Recoverable labels: PART, PARTNAME, MARKUP, PROD_COST, QTY_IN_STOCK, PART_DESC.)


Figure 5.2 A Generic Object and its Constituents

(Figure not reproduced. Recoverable labels: GOVT_PROJECT, LOCATION, STATUS, OID (system generated).)


Figure 5.3 Complex Objects and their Interactions

(Figure not reproduced. Recoverable labels: WORK_STATION, WK_ST_OP, OPERATION, OPTIME, OID.)


Figure 5.4 Composite Objects and their Components

(Figure not reproduced. Recoverable label: PROD_DATA.)















CHAPTER VI

THE MECHANISM OF RULE PROCESSING IN AN INTEGRATED KBMS



A mechanism for applying rules that capture problem solving knowledge is a critical

feature in the design of our KBMS and is the focus of this chapter. Processing in the

integrated KBMS is characterized using transactions. A match-modify-execute (MME) cycle

represents the mechanism used to apply the rules defined for the object types and

occurrences in the knowledge base, while executing a KBMS transaction.

This chapter describes various aspects of the MME cycle that executes a KBMS tran-

saction. We describe how value independent rules that capture operational semantics are

used to directly modify the KBMS transaction. We also describe two methods for selecting

value dependent rules that capture declarative semantics: explicit and implicit selection.

These value dependent rules are executed against the knowledge base. Examples of user

object types and their associated rules, from Chapter Five, are used to illustrate this

mechanism. A prototype of the MME cycle was developed using the OPS5 production sys-

tem language and is also described. It is this prototype that laid the groundwork for

developing implementation techniques that foster the functional integration of the DBMS

and the AI reasoning components, within the KBMS.

A transaction is defined as a unit of work; it is also a unit of recovery in that the data-

base must be in a consistent state both before and after the execution of the transaction

[DAT77]. A typical transaction consists of a sequence (or a tree) of KML operations to be

executed against the knowledge base. These operations could be either retrieval operations

or storage manipulation operations. A transaction can be thought of as being equivalent to a

sequence (or a tree) of goals. A RETRIEVAL operation in a database transaction resembles










a Prolog like goal and provides a declarative description of the required information. A

storage manipulation operation such as an UPDATE or DELETE does not directly

correspond to a Prolog like goal. Processing in a forward chaining production system (PS)

[NEW73] such as OPS5 is not goal oriented and there is no simple analog to the concept of a

KBMS transaction.

In contrast to transaction processing in a conventional DBMS, where the transaction is

executed and then either committed or aborted, in the KBMS, the execution of a transaction

is controlled by a match-modify-execute (MME) cycle. This cycle allows rules to be incor-

porated into the transaction and supports rule processing in the KBMS. The transaction is

matched against rules, which then modify the transaction prior to the execution of the

modified transaction. After the modified transaction is executed it will be committed.

Finally, the MME cycle selects and executes rules that have been made applicable as a conse-

quence of committing the (changes to the knowledge base made by the) modified transaction.

During the match phase, the operations of the initial transaction are matched against a

subset of rules defined for the object types, against which the operations are to be executed.

This subset consists of the value independent rules, introduced in Chapter Four, which

test operations on their LHS. These rules were identified as triggers and the matching opera-

tions (in the transaction) as triggering operations. The triggers are value independent since

their execution depends on the triggering operations rather than actual values of the

knowledge base object occurrences. They capture operational semantics and directly modify

the transaction without access to the object occurrences.

The modifications occur in the modify phase of the MME cycle and they depend on

both the RHS consequent and the LHS options specified in the triggers. The modifications

could incorporate either operations or value dependent rules into the transaction, as will be

discussed later. This means that the appended (new) parts of the transaction must be

matched further with appropriate value independent rules; this process continues until no

further modifications are possible.
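The repeated match and modify phases can be sketched as a worklist loop that runs until no trigger matches a newly appended item. This is an illustration under stated assumptions rather than the design of the MME cycle itself: transaction items are plain tuples, triggers are held in a dictionary keyed by the item they match, each item is incorporated at most once, and the rule named audit-rule is invented to show a trigger that matches a rule rather than an operation.

```python
from collections import deque


def match_modify(transaction, triggers):
    """Return the modified transaction as (item, option, triggered_by) entries."""
    modified = [(item, None, None) for item in transaction]
    pending = deque(transaction)   # parts of the transaction not yet matched
    seen = set(transaction)
    while pending:                 # continue until no further modifications
        item = pending.popleft()
        # Match phase: find the triggers whose LHS tests this item.
        for option, new_item in triggers.get(item, []):
            if new_item in seen:   # guard against cyclic trigger definitions
                continue
            seen.add(new_item)
            # Modify phase: incorporate the RHS into the transaction.
            modified.append((new_item, option, item))
            pending.append(new_item)  # appended parts must be matched as well
    return modified


# Hypothetical trigger table; the first entry loosely follows T5 and T7.
triggers = {
    ("INSERT", "TOP-SECRET-PROJ"): [
        ("pre-exec", ("EXECUTE", "gen-constr-1")),
        ("par-exec", ("EXECUTE", "gen-hier-1")),
    ],
    ("EXECUTE", "gen-hier-1"): [
        ("post-exec", ("EXECUTE", "audit-rule")),
    ],
}
modified = match_modify([("INSERT", "TOP-SECRET-PROJ")], triggers)
```

The loop never inspects object occurrences, which reflects the point that triggers modify the transaction without access to the knowledge base.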










The modified transaction which now consists of KML operations and value dependent

rules is then executed. The KML operations are simply executed. For rules, the LHS is

tested and if it is satisfied, then the RHS is executed. Testing the LHS of the rule can

modify the transaction; i.e., it introduces retrieval operations which must also go through the

match-modify phases before execution. Similarly, the RHS consequent conditionally intro-

duces operations or value dependent rules which may require further matching, etc.

The value independent rules which test the execution status of either KML operations

or other value dependent rules modify the transaction. In contrast, the value dependent

rules which test the values of knowledge base object occurrences, flags, messages, etc., are

directly executed against the knowledge base. These rules are selected for execution by two

different methods.

In the first method, value dependent rules are explicitly selected for execution by other

rules. For example, a value independent rule which matched against the transaction will be

part of an explicit inference chain and will explicitly select a value dependent rule for execu-

tion, using the EXECUTE construct on its RHS. This rule will be incorporated into the

KBMS transaction. In addition, a value dependent rule already in the transaction can, in

turn, explicitly select another value dependent rule. This technique for selecting value

dependent rules is called explicit selection and the selected rules are incorporated into the

KBMS transaction, for execution.

In the second method, some value dependent rules are selected during the execute phase

of the MME cycle. The execution of the operations of the modified KBMS transaction may

place the knowledge base in a state where certain conditions are satisfied by object

occurrences. Suppose these are the same conditions that are specified on the LHS of some

value dependent rules. Then, after the modified KBMS transaction is committed, these rules

must be selected for execution. The selection of these rules, in the execute phase of the

MME cycle, is not explicitly specified by some value independent rule, and is a value depen-

dent process. The operations executed by the modified KBMS transaction are used as a seed










or starter to select value dependent rules that are defined for the object types affected by

these operations. This technique is called implicit selection.

We have several reasons for supporting these two techniques for selecting value depen-

dent rules, in the KBMS. As discussed in Chapter Four, the EXECUTE construct is useful

when there is some a priori knowledge about explicit inference chains between rules. These

chains help in grouping related rules and in passing variables into rules and can be an aid to

an efficient implementation. However, there is a drawback in that control information is

embedded in the rules and the rules are not independent of each other. There is overhead

involved in ensuring that there are no dangling references; i.e., attempts to execute value depen-

dent rules that do not exist. There is also overhead in ensuring that all value dependent rules

occur in at least one inference chain (so as to be useful).

In some situations we will not know a priori which object type a value dependent rule

should be associated with or what operation should trigger its execution. Thus, explicit

selection fails when we do not have a priori information about inference chains. We may

wish to incrementally add rules to the knowledge base, perhaps on a trial basis. We may

also wish to specify value dependent rules that could be triggered by several operations.

Under these conditions, the implicit selection technique provides adequate support for rule

processing in the KBMS, since it selects value dependent rules whenever their LHS condi-

tions are satisfied.

To make these two selection techniques mutually exclusive and prevent overlap; i.e.,

selecting the same rule twice, both explicitly and implicitly, we place restrictions on the

selection. Only value dependent rules that are not selected in any explicit inference chains

will be candidates for implicit selection.
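The restriction can be sketched as a set difference: the candidates for implicit selection are the value dependent rules that no explicit inference chain selects. The same table of chains also allows a check for dangling references. The dictionary representation and the rule names trial-rule and missing-rule are hypothetical.

```python
def implicit_candidates(value_dependent_rules, explicit_chains):
    """Rules that are not selected in any explicit inference chain."""
    explicitly_selected = {
        target for targets in explicit_chains.values() for target in targets
    }
    return set(value_dependent_rules) - explicitly_selected


def dangling_references(value_dependent_rules, explicit_chains):
    """Rules named in an EXECUTE construct that do not exist."""
    return {
        target
        for targets in explicit_chains.values()
        for target in targets
        if target not in value_dependent_rules
    }


# Hypothetical rule base: each selecting rule maps to the rules it EXECUTEs.
rules = {"gen-constr-1", "gen-constr-2", "gen-hier-1", "loc-stat-1", "trial-rule"}
chains = {
    "T5": ["gen-constr-1"],
    "T6": ["gen-constr-2"],
    "T7": ["gen-hier-1"],
    "T9": ["loc-stat-1", "missing-rule"],
}
candidates = implicit_candidates(rules, chains)
dangling = dangling_references(rules, chains)
```

A rule added on a trial basis with no trigger, such as trial-rule here, is therefore the only one left to implicit selection.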

Before we provide a detailed description of the MME cycle, we make a few comments

about this approach. The mechanism we have briefly described for applying rules has the

characteristics of both the compiled approach as well as the interpreted approach. In the

presence of rules (the intensional database), a transaction or a query is compiled if it passes










through two distinct phases [REI78a]. The first is a compilation phase where the query is

processed against the intensional database alone, to produce some form of object code. The

second phase is an execution phase where the compiled object code is executed against the

extensional database alone without accessing the intensional database. If these two phases

are not distinct then the approach is defined to be interpretive [REI78a].

Modifying the transaction using the value independent rules and the explicit selection of

value dependent rules is comparable to compiling the transaction, since this process only

accesses the rules defined for the object types. The modified transaction can be considered

some form of "object" code to be executed against the object occurrences of the knowledge

base.

However, the execution of the modified transaction is not independent of rules. For

example, executing a value dependent rule in the transaction can further modify the transac-

tion by appending operations or value dependent rules. These operations and rules must also

be matched and this requires access to rules defined for the object types. Thus, the MME

cycle is no longer a strictly compiled approach. The implicit selection of value dependent

rules during the execute phase of the MME cycle also makes this interpretive rather than

compiled; both object occurrences and rules are involved.

One of the advantages of the MME cycle is that it is a simple mechanism which does

not require the services of sophisticated pieces of software such as a theorem prover, etc.

This is partly because operational semantics that specify triggering and scheduling informa-

tion are explicitly captured in the triggers. The only support the KBMS need provide is to

match the triggering operations and rules in the transaction against the LHS of the triggers.

Another reason for the simplicity of the MME cycle is that KML constructs are used to

specify value dependent rules as well as operations in the KBMS transaction; thus, value

dependent rules can be directly incorporated into the transaction when they are explicitly

selected. The MME cycle must only be able to modify the representation of the transaction

in order to support this. This requires the ability to append operations or value dependent










rules, either during the modify phase or during the execute phase. If this occurs during the

execute phase, then the MME cycle must be re-invoked so that the new operations and rules

can also be matched against the triggers.

The MME cycle must be able to halt the execution of an operation or rule if a

corresponding abort flag is set. Finally, the implicitly selected value dependent rules are

each treated as independent transactions to be executed. These aspects will be discussed in

this chapter which describes the design of the MME cycle and in the next chapter which

deals with KBMS implementation issues.

It is clear that processing a KBMS transaction is complex when compared to a DBMS

transaction. The MME cycle has some features of an AI rule processing system as well as a

DBMS and requires greater system support. Another important consideration is that its

design must support functional integration of the DBMS and the AI rule processing com-

ponents. Functional integration allows migration of techniques from either DBMS or AI

technology and their use in the MME cycle.

Consider conventional DBMS query optimization techniques. It would be advantageous

for the MME cycle to exploit these techniques. Conventional optimization techniques assume

that the DBMS transaction is not modified during execution and this simplifies the optimizer.

However, we have seen that the MME cycle is interpretive; this allows a KBMS transaction

to get modified during the execute phase too.

This dilemma would seemingly prevent the use of optimization techniques in the MME

cycle. However, one way to resolve this conflict is to identify those parts of a KBMS tran-

saction which have already been modified and will not be further modified during execution.

We can now apply conventional query optimization to those parts alone, wherever they are

identified in the KBMS transaction. Issues such as this that are relevant to functional

integration are dealt with in subsequent chapters.

Before we examine the details of the MME cycle, we compare the inference mechanisms

involved in the different systems we have discussed. Information propagates down an










inference chain by the process of variable binding and the direction of variable binding within

a rule will determine the direction of inference. The inference mechanism in a Prolog like

system is a backward chain; i.e., the system will work backwards from the goal and use the

goal to bind variables in the consequent of those clauses used to satisfy the goal.

In contrast, in the absence of a specific goal, a forward chaining PS such as OPS5 is in

a continuous cycle of match, selection and action. Conceptually, in each cycle, OPS5

matches all the working memory elements against the condition elements on the LHS of all

the rules to determine a conflict set of all applicable rules. Then some subset of this conflict

set is selected based on a conflict resolution strategy. In the action phase the working

memory elements are changed as specified in the RHS of the selected rules. This is an exam-

ple of a forward chain of inference since the variables of the condition elements on the LHS

of the rules are initially bound.

In our transaction oriented MME cycle, rules are applied using both forward and back-

ward chains of inference. When a triggering operation or rule matches against the LHS of a

value independent trigger, T, then variables in the LHS of T will be bound. This is a for-

ward chain of inference. Variable binding within T will be from the conditional LHS to the

consequent RHS.

In an explicit inference chain, rule r1 uses the EXECUTE construct to explicitly select

a value dependent rule, r2, and r1 will propagate information into r2 via parameters.

Depending on the parameters passed between r1 and r2, both forward and backward

inferencing are supported. For forward chaining, r1 will initially bind variables on the condi-

tional LHS of r2. For backward chaining, r1 will initially bind variables on the consequent

RHS of r2, as was seen in the transitive closure example of Chapter Five. The implicit selec-

tion of value dependent rules during the execute phase of the MME cycle is a forward chain;

object occurrences from the knowledge base are used to initially bind variables on the condi-

tional LHS of these rules in order to determine which of these rules may be applicable.










6.1 The Match-Modify-Execute (MME) Cycle


We now examine the details of the match, modify and execute phases of the MME

cycle. During the match phase, the operations of the initial transaction are matched against

the appropriate subset of value independent triggers. In the next modify phase, all matching

triggers modify the transactions based on their RHS consequents and the options specified on

their LHS. As seen in Figure 6.1, there are three ways in which a triggering operation can

be modified by a trigger. When the trigger uses the pre-exec option, on the LHS, then the

RHS consequent of the trigger is scheduled to execute before the triggering operation. If the

post-exec option is used, then the RHS consequent is scheduled to follow the operation. If

the par-exec option is specified, then the RHS consequent is scheduled for parallel execution.
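
The scheduling performed in the modify phase can be sketched as follows. This is a minimal
illustration, not the implementation described in this thesis; the function name modify, the
dictionary keys and the string encoding of the options are assumptions made for the example.

```python
def modify(operation, triggers):
    """Place each trigger's RHS consequent relative to the triggering operation.

    `triggers` is a list of (option, consequent) pairs for the triggers that
    matched `operation`. The option strings and the three slot names are
    illustrative assumptions, not KML syntax.
    """
    slot_for_option = {
        "pre-exec": "before",     # consequent runs before the operation
        "par-exec": "parallel",   # consequent runs alongside the operation
        "post-exec": "after",     # consequent runs after the operation
    }
    schedule = {"before": [], "parallel": [operation], "after": []}
    for option, consequent in triggers:
        schedule[slot_for_option[option]].append(consequent)
    return schedule


if __name__ == "__main__":
    matched = [("pre-exec", "A"), ("par-exec", "B"), ("post-exec", "C")]
    print(modify("INSERT", matched))
```

The triggering operation is placed in the parallel slot from the start, so that par-exec
consequents join it there while pre-exec and post-exec consequents bracket it.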

The RHS consequent of the trigger can either execute an operation or a value depen-

dent rule (specified using the EXECUTE construct). This operation or rule is directly incor-

porated into the transaction. The modification could also make the execution of the trigger-

ing operation conditional on the outcome of the RHS consequent of the trigger, as seen in the

figure. This must be noted and the operation marked so that conditional execution can be

supported later, during the execute phase. The modified transaction repeatedly goes through

the match and modify phases so that the rules and operations incorporated in the previous

modify phase can be matched. A "match level" is used as an indicator that the match phase

has been re-invoked and the match level is incremented whenever newly incorporated opera-

tions and rules are matched/modified.

Value dependent rules that get incorporated into the transaction are modified as seen in

Figure 6.2 where a value dependent rule, rule1, matches with a value independent trigger. If

the pre-exec, post-exec or par-exec options are specified, then the modification is similar to

Figure 6.1. If the succ-exec option is specified, then the RHS consequent of the trigger is

scheduled to follow rule1, and it will be conditionally executed depending on the execution

status of rule1.










Value dependent rules that are incorporated into the transaction could also be values of

attributes of abstract data type RULE, specified for a particular object occurrence. Since

these rules are actually the values of attributes, they are not associated with an execution

status. Thus, these rules are not matched against the value independent triggers.

We must guarantee that the match and modify phases of the MME cycle eventually

terminate. If the transaction is finite; i.e., it comprises a finite number of operations against

a finite number of object types, and if there are a finite number of value independent triggers

defined for these object types, then the match and modify phases will eventually terminate,

in the absence of cycles. If there are cycles, as in the following example:


transaction fragment: INSERT into obj-1


rules: IF option(INSERT into obj-1) THEN (INSERT into obj-2)

       IF option(INSERT into obj-2) THEN (INSERT into obj-1)

then, as soon as a cycle is detected, the match and modify phases corresponding to the cycle

are terminated. Cycles which have no termination condition that depends on values in the

knowledge base could mean endless execution. To avoid this, either all cycles must be elim-

inated or each cycle must be checked to make sure it can be broken. The latter is more

expensive since one must ensure that at least one operation in the cycle is conditionally exe-

cuted; i.e., it must match with a trigger whose LHS uses the pre-exec option and whose RHS

consequent can halt its execution. Alternately, one must ensure that there is at least one

value dependent rule included in the cycle.
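
The check described above can be sketched as a depth-first search over a "triggers" relation.
This is only an illustration under stated assumptions: the mapping from a triggering operation
or rule to the items its triggers would append, and the names find_cycle and can_be_broken, are
hypothetical and are not part of the KBMS design.

```python
def find_cycle(appends, start):
    """Return the nodes of a cycle reachable from `start`, or None.

    `appends` maps a triggering operation or rule to the list of operations
    or rules that its matching triggers would append (an assumed encoding).
    """
    path, on_path, finished = [], set(), set()

    def visit(node):
        if node in on_path:
            # The node is already on the current path: a cycle is closed.
            return path[path.index(node):]
        if node in finished:
            return None
        path.append(node)
        on_path.add(node)
        for appended in appends.get(node, []):
            cycle = visit(appended)
            if cycle is not None:
                return cycle
        path.pop()
        on_path.discard(node)
        finished.add(node)
        return None

    return visit(start)


def can_be_broken(cycle, value_dependent):
    """A cycle can terminate if it includes at least one value dependent rule."""
    return any(node in value_dependent for node in cycle)


if __name__ == "__main__":
    appends = {
        "INSERT obj-1": ["INSERT obj-2"],
        "INSERT obj-2": ["INSERT obj-1"],
    }
    cycle = find_cycle(appends, "INSERT obj-1")
    print(cycle, can_be_broken(cycle, value_dependent=set()))
```

The sketch covers only the second criterion given above (a value dependent rule in the cycle);
the pre-exec criterion would need an additional set of conditionally executed operations.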

As seen in Figures 6.1 and 6.2, when the pre-exec option is used by the triggers, trigger-

ing operations and rules are conditionally executed, depending on the outcome of the RHS

consequent of each trigger. Conversely, with the post-exec or succ-exec options, the RHS con-

sequents of the triggers are conditionally executed, depending on the execution status after

executing the triggering operations or rules. Thus, during the match-modify phases, the sys-

tem has to store some added information linking the RHS consequent of those triggers and










the corresponding triggering operations or rules. This could be in the form of abort flags

that halt the execution of operations or rules. The system will also be responsible for passing

values of parameters between the triggering operations or rules and the triggers.

A triggering operation (or rule) could simultaneously match with several triggers.

When this occurs there is a resulting opportunity for parallelism within a single transaction,

as will be seen in the examples of the next section. Parallelism is an important issue which

will be re-visited as we consider implementation techniques for the MME cycle in an

integrated KBMS.

The match-modify phases just described correspond to the compiled mode of the MME

cycle. Following the termination of the match-modify phases, the modified transaction is

executed; this is the execute phase of the MME cycle.

KML operations in the transaction are directly executed as in a conventional DBMS

with two exceptions. The first exception occurs when an operation is conditionally executed

as discussed. These operations are previously marked during the match-modify phases. Usu-

ally the RHS consequent of the corresponding trigger executes a value dependent rule. This

rule will precede the execution of the operation and will set an abort flag if the triggering

operation must be halted. Thus, the system must check if any abort flags are set before exe-

cuting marked KML operations.
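
A minimal sketch of this abort-flag check is given below. The item layout (dictionaries with
the keys kind, condition, guards, marked and action) and the function name run_transaction are
assumptions introduced for illustration; they do not reflect the actual KBMS data structures.

```python
def run_transaction(items, knowledge_base):
    """Execute items in order, honoring abort flags set by preceding rules.

    A "rule" item stands for a value dependent rule scheduled by a pre-exec
    trigger: if its condition holds, it sets an abort flag for the operation
    it guards. An "op" item that was marked during match-modify is skipped
    when its abort flag is set.
    """
    abort_flags = set()
    executed = []
    for item in items:
        if item["kind"] == "rule":
            if item["condition"](knowledge_base):
                abort_flags.add(item["guards"])
            executed.append(item["name"])
        else:
            if item.get("marked") and item["name"] in abort_flags:
                continue  # the triggering operation is halted
            item["action"](knowledge_base)
            executed.append(item["name"])
    return executed


if __name__ == "__main__":
    kb = {"projects": []}
    items = [
        {"name": "check", "kind": "rule", "guards": "insert",
         "condition": lambda kb: len(kb["projects"]) >= 1},
        {"name": "insert", "kind": "op", "marked": True,
         "action": lambda kb: kb["projects"].append("p")},
    ]
    print(run_transaction(items, kb))  # the insert runs
    print(run_transaction(items, kb))  # the insert is halted
```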

The second exception occurs when the triggering operation and the RHS consequent of

the trigger are scheduled to execute in parallel. The triggering operation and the RHS conse-

quent usually interact with each other in this situation. For example, the RHS consequent

may execute a value dependent rule that generates information for the triggering operation

to process. Thus, while the RHS consequent is being executed (and is generating informa-

tion), the triggering operation should not complete its execution.

The execution of a value dependent rule is more complicated. Figure 6.2 showed the

case of rules that are conditionally executed, associated with the pre-exec option. These










rules are marked during the modify phase. The corresponding abort flags are checked before

executing the rule.

Rule execution comprises two tasks; first, the LHS of the rule is verified before the RHS

consequent is executed. Verifying the LHS of a value dependent rule requires identifying

object occurrences that satisfy some condition and this introduces retrievals; i.e., the LHS of

the rule is replaced by appropriate retrieval operations. Before these retrieval operations are

executed, they, too, go through the match-modify process. If there are any triggers that

match with these retrieval operations, then the transaction is further modified before the

retrieval operations are executed. The match between the retrieval operations and the

triggers may also result in cycles, as in the example of transitive closure to be discussed in

detail in the next section and in Chapter Eight. However, the cycles in this example have a

termination condition since there are value dependent rules included in the cycles.

If the LHS of a value dependent rule is satisfied, then the RHS consequent is executed.

The RHS is itself composed of further operations (or rules) and they, too, must go through

the match-modify phases before execution. Executing the RHS of a value dependent rule is

equivalent to manipulating the object occurrences of the knowledge base, which is a function

of a conventional DBMS.

Figure 6.2 also shows the case where the succ-exec option is used with a triggering rule,

in a transaction. These triggering rules are marked during the modify phase, and the system

must confirm that the rule executed successfully; i.e., that its LHS was satisfied and its

RHS consequent indeed executed, before proceeding to the RHS consequent of the trigger.

We have seen that during the execution of a value dependent rule, new operations or

rules can be introduced; i.e., the transaction can be further modified. Thus, in addition to

appending these operations or rules, the MME cycle must be re-invoked so that the new

operations and rules can be further matched/modified before execution. To identify that

these operations or rules have been incorporated during execution and that the MME cycle

must be re-invoked, we make use of an "execution level."










To explain the use of the execution level, the initial transaction has an execution level

of 1. The operations and rules appended during the subsequent modify phases are also at

execution level = 1. After the match-modify phase terminates, the modified transaction at

execution level = 1 is executed. However, operations or rules that are appended during the

execute phase are marked at a higher execution level. For example, during execution of a

value dependent rule at execution level, k, new rules or operations which are appended by

evaluating this rule are marked at execution level (k+1).

Higher values for execution level take precedence, so the most recently appended

operations and rules are executed first. Since these new operations and rules need to be

matched, we suspend execution of the transaction; i.e., we suspend the execute phase of the

MME cycle at the current execution level, k. Next, we re-invoke the MME cycle, starting

with the match phase, with execution level set to (k+1).

When the execute phase of the MME cycle at execution level (k+1) concludes; i.e., the

transaction at level (k+1) completes execution, then we resume the suspended execute phase

of the MME cycle at execution level k; i.e., we resume execution of the transaction at level k.
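
The suspend and resume behavior across execution levels can be pictured as a recursive call.
The sketch below is an assumption-laden simplification: the function mme_cycle and the callback
expand, which returns whatever executing an item appends to the transaction, are hypothetical,
and the match and modify phases are reduced to a comment.

```python
def mme_cycle(items, expand, level=1, trace=None):
    """Run `items` at execution level `level` and return an execution trace.

    `expand(item)` returns the operations or rules that executing `item`
    appends to the transaction (an empty list for a plain operation).
    """
    if trace is None:
        trace = []
    # The match and modify phases for this level would run here.
    for item in items:
        trace.append((level, item))
        appended = expand(item)
        if appended:
            # Suspend the execute phase at this level, re-invoke the whole
            # cycle at level + 1, and resume once that call returns.
            mme_cycle(appended, expand, level + 1, trace)
    return trace


if __name__ == "__main__":
    appended_by = {"rule-r": ["op-b"]}
    trace = mme_cycle(["op-a", "rule-r", "op-c"],
                      lambda item: appended_by.get(item, []))
    for level, item in trace:
        print(level, item)
```

In the trace, op-b appears at level 2 between rule-r and op-c, which mirrors the rule that the
most recently appended items execute before the suspended level resumes.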

Suspending the execute phase of the MME cycle at level k and re-invoking the MME

cycle at level (k+1) to match new operations or rules against triggers no longer fits the com-

piled approach. It requires access to the rules defined for object types, during execution

against object occurrences, and it can be compared to switching between a compiler and an

interpreter.

We use the same KML constructs to express the rules and to specify operations in the

transaction and the match-modify phases (compilation phases) of the MME cycle do not use

sophisticated theorem provers, etc. Consequently, the cost of switching between the com-

piled and interpreted modes is not so excessive as to make the approach we have taken

infeasible. However, there is a certain overhead involved in switching; i.e., suspending the

execute phase, re-invoking it, etc., and it is beneficial to attempt to reduce the number of

times this switching must take place.










One way to reduce the number of times switching takes place is to pass the transaction

through a "look-ahead" process, before execution. The look-ahead process will attempt to

identify likely candidates to be further compiled; i.e., passed through the match and modify

phases. For example, executing the LHS of a value dependent rule at execution level k intro-

duces retrieval operations at level (k+1). These retrieval operations may match with value

independent rules, thus re-invoking the MME cycle at level (k+1). Similarly, executing the

RHS of value dependent rules also introduces operations or rules. This happens less fre-

quently since the RHS of a value dependent rule is conditionally executed whereas the LHS

of the rule is almost always executed, unless an abort flag is set for the rule. Thus, the LHS

retrieval operations are better candidates for the look-ahead process.

The look-ahead process looks ahead and compiles those retrieval operations that will

actually be appended later during execution of the LHS of the value dependent rules. The

look-ahead process will effectively pass these operations through the match-modify phases

before they are actually appended. Since the system is still in the compilation mode

corresponding to the current execution level, the look-ahead process reduces the number of

switches between the compiled and interpreted modes.

Thus, before execution at level k; i.e., before the execute phase at level k, the LHS of

the value dependent rules at level k are further processed. The corresponding retrieval

operations they will introduce at level (k+1) are passed through the look-ahead process which

corresponds to the match-modify phases of the MME cycle with execution level set to (k+1).

Now, during execution at level k, when the retrieval operations at level (k+1) are intro-

duced, they will already have passed through the match-modify phases. This results in con-

siderable savings as the system does not have to suspend the execute phase with execution

level k, re-invoke the MME cycle with execution level (k+1), etc.
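
The saving can be illustrated by counting mode switches with and without look-ahead. This is a
rough sketch only; look_ahead, execute_level, the lhs_retrievals mapping and the match_modify
callback are hypothetical names, and a "switch" is counted each time a retrieval has to be
compiled during execution.

```python
def look_ahead(rules, lhs_retrievals, match_modify):
    """Precompile the retrievals that the LHS of each rule will introduce."""
    compiled = {}
    for rule in rules:
        for retrieval in lhs_retrievals.get(rule, []):
            compiled[retrieval] = match_modify(retrieval)
    return compiled


def execute_level(rules, lhs_retrievals, match_modify, compiled=None):
    """Execute the rules at one level; return the number of mode switches."""
    compiled = dict(compiled or {})
    switches = 0
    for rule in rules:
        for retrieval in lhs_retrievals.get(rule, []):
            if retrieval not in compiled:
                # Not yet matched and modified: suspend execution and
                # re-invoke the match-modify phases for this retrieval.
                switches += 1
                compiled[retrieval] = match_modify(retrieval)
            # The compiled retrieval would be executed here.
    return switches


if __name__ == "__main__":
    rules = ["r1", "r2"]
    lhs_retrievals = {"r1": ["RETRIEVE a"], "r2": ["RETRIEVE b"]}
    match_modify = lambda retrieval: [retrieval]  # stand-in for real matching
    print(execute_level(rules, lhs_retrievals, match_modify))
    precompiled = look_ahead(rules, lhs_retrievals, match_modify)
    print(execute_level(rules, lhs_retrievals, match_modify, precompiled))
```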

Cycles which cross execution levels are identified during this look-ahead compilation

process. If these cycles are value dependent (they include value dependent rules) then they

are not eliminated. We note that if there are any value dependent rules in the transaction










that are attributes of data type RULE, they are not passed through the look-ahead process.

The next section has examples describing the behavior of the MME cycle, in the different

situations just described.

The MME cycle for any execution level k, where k > 1, completes when the

corresponding execute phase completes; i.e., when all the operations and rules in the modified

transaction with execution level k are executed. The modified KBMS transaction itself com-

pletes execution when all operations and rules with execution level = 1 are executed. Now

the modified KBMS transaction will either be committed or aborted.

Although the modified transaction is committed, the MME cycle (execution level = 1) is

still not complete; value dependent rules must be implicitly selected for execution. All value

dependent rules, defined for user object types, which are not explicitly selected for execution

in at least one explicit inference chain are candidates for implicit selection. Value dependent

rules defined for particular object occurrences as attributes of abstract data type RULE are

not implicitly selected since the cost of identifying if each occurs in explicit inference chains

is prohibitive.

The operations executed by the modified transaction are used as a seed or starter to

select among rules that are candidates for implicit selection. This selection strategy is simi-

lar to that used in the TREAT algorithm [MIR86] to select applicable production rules and

will be discussed in Chapter Seven, with other implementation issues. Once value dependent

rules are implicitly selected, they will be executed much as the explicitly selected rules.

Compared to explicit selection, the implicit selection process is expensive since it selects

all the value dependent rules that may apply. Of these rules, there may be several whose

LHS is not satisfied; these rules will not be executed, so selecting them is pure overhead.

Implicit selection of rules takes place in the MME execute phase (execution level = 1) but the

selected rules must themselves go through the MME cycle. In other words, implicit selection

requires access to the value independent triggers defined for object types. The selection pro-

cess no longer fits the definition of the compiled approach and switches between an










interpreter and a compiler. Implicit selection could take place continually during the execute

phase of the MME cycle, for different execution levels. However, this would require suspend-

ing the current execute phase, re-invoking the MME cycle, resuming the suspended execute

phase, etc. The more frequently this occurs, the greater the overhead. To reduce this cost,

we should reduce the frequency of this process.

Another consideration is that we must maintain the transaction oriented nature of the

MME cycle. Implicit selection occurs in response to the changes made to the knowledge base

object occurrences by the operations executed in the modified transaction. Since the

modified transaction commits all its operations collectively, implicit selection should also

respond to these changes collectively.

Thus, implicit selection is deferred until after the modified transaction completes execu-

tion and it is either committed or aborted. If it is committed, then we respond collectively to

all the changes made to the knowledge base and implicitly select and execute relevant rules.

All the operations (at all execution levels) are used collectively as a starter. Now, even if a

rule is implicitly selected by more than one operation, it will be executed just once.
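
Deferred, collective selection can be sketched as follows. The encoding is an assumption made
for illustration: each candidate rule is described by the set of object types its LHS tests,
each executed operation by a pair of operation name and object type, and the function name
implicit_selection is hypothetical. The actual selection strategy is discussed in Chapter Seven.

```python
def implicit_selection(executed_ops, candidate_rules, explicitly_selected):
    """Select value dependent rules once, after the transaction commits.

    executed_ops: (operation, object_type) pairs from all execution levels.
    candidate_rules: rule name -> set of object types tested by its LHS.
    explicitly_selected: rules that occur in some explicit inference chain.
    """
    # All operations of the committed transaction act together as the seed.
    affected = {object_type for _, object_type in executed_ops}
    selected = []
    for rule, tested in candidate_rules.items():
        if rule in explicitly_selected:
            continue  # not a candidate for implicit selection
        if tested & affected:
            # Selected at most once, however many operations touch it.
            selected.append(rule)
    return selected


if __name__ == "__main__":
    ops = [("INSERT", "A"), ("INSERT", "B")]
    rules = {"r1": {"A"}, "r2": {"A", "B"}, "r3": {"C"}, "r4": {"B"}}
    print(implicit_selection(ops, rules, explicitly_selected={"r4"}))
```

Here r2 tests both affected object types but appears only once in the result, and r4 is skipped
because it is already reachable through an explicit inference chain.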

If possible, all implicitly selected rules could be executed in parallel, as is discussed in

the next chapter. Since the operations of each KBMS transaction are used collectively, the

frequency of the implicit selection process and the corresponding number of switches is also

reduced.



6.2 Example Transaction Fragments in the MME Cycle


We examine a few examples of transaction fragments as they pass through the MME

cycle to illustrate the operation of the different phases. The examples describe transaction

modification, parallelism of rules and operations within a transaction, re-invocation of the

MME cycle and cycles of value dependent rules. The examples are based on the rules defined

for the objects described in Chapter Five.










Consider a transaction fragment with an INSERT operation against the object type

TOP-SECRET-PROJ. The rules relevant to this object type are in Figure 6.3. We now

follow this fragment through the different phases of the MME cycle as described in Figure

6.4. The operation matches, simultaneously, with several value independent triggers T5, T6,

T7 and T9. The X occurrence on the LHS of each trigger is bound by the inserted

occurrence. In the next modify phase, the triggers modify the transaction, based on the LHS

options and the RHS consequents.

All four of these triggers occur in explicit inference chains and they explicitly select

value dependent rules and append them to the transaction. Thus, rule T5 uses the pre-exec

option to schedule a value dependent rule gen-constr-1 to precede the INSERT operation.

Rules T6 and T7 both use the par-exec option and value dependent rules gen-constr-2 and

gen-hier-1 execute in parallel with the operation. Finally, rule T9 uses the post-exec option

and as a result, rule loc-stat-1 succeeds the INSERT operation. These explicit inference

chains are all forward chains; triggers T5, T6, T7 and T9 initially bind variables on the LHS

of rules gen-constr-1, gen-constr-2, etc. All operations and rules of the modified transaction

have their execution level set to 1.

The INSERT operation is to be conditionally executed and it is marked to indicate that

it has triggered a rule gen-constr-1, whose RHS consequent could halt its execution. Refer

to Figure 6.4 for a stepwise representation of a fragment of a KBMS transaction as it passes

through the match, modify and execute phases.

There are no more rules to match with the appended rules nor does the look-ahead pro-

cess successfully match the retrieval operations which will be introduced while verifying the

LHS of these appended rules. The MME cycle thus enters the execute phase. All operations

and rules of the modified transaction are executed.

If rule gen-constr-1 is successfully executed; i.e., a constraint is violated, then an

appropriate flag is set to abort the INSERT operation. After checking for this flag, the

INSERT operation is (conditionally) executed. In parallel with the INSERT operation, rules










gen-constr-2 and gen-hier-1 are also executed. If the LHS of either gen-constr-2 or

gen-hier-1 is satisfied, then the RHS consequent of that rule is executed. The RHS

consequents of these rules modify the transaction and append operations. Since these

appended operations may match against some value independent rules, the MME cycle has

to be re-invoked.

The RHS consequent of rule gen-hier-1 appends an INSERT operation against the

object GOVT-PROJECT. To identify that the transaction has been modified during execu-

tion, the INSERT operation is appended with execution level set to 2. Since a higher execu-

tion level has precedence and the INSERT operation has not been matched, the current exe-

cute phase is suspended and the MME cycle is re-invoked with an execution level set to 2.

The INSERT operation matches with trigger T8. In the next modify phase, a value depen-

dent rule attr-inh-1 precedes this INSERT operation; these operations and rules are marked

at execution level 2 (see Figure 6.4).

There will be no more matches at execution level 2 and the operations and rules at

level 2 are executed. First attr-inh-1 executes and obtains values from the user for the

inherited attributes LOCATION and STATUS. Then the INSERT into GOVT-PROJECT

executes. After all the operations and rules at level 2 complete, we resume the execute phase

with execution level = 1.

In this example, there are three parallel branches within the transaction at level 1,

corresponding to the INSERT into TOP-SECRET-PROJ, rule gen-constr-2, and the third

branch corresponding to rule gen-hier-1. This third branch caused the MME cycle to be re-

invoked at level 2. After the three parallel branches at level 1 complete execution, rule

loc-stat-1 at level 1 is executed.

We also use this example to describe the implicit selection of value dependent rules.

Suppose that rules gen-constr-2 and loc-stat-1 are not included in any explicit inference

chains; i.e., rules T6 and T9 are not defined for object type TOP-SECRET-PROJ. In this

case, when the transaction is modified, there will be two parallel branches at level 1,










corresponding to the INSERT into TOP-SECRET-PROJ and rule gen-hier-1. Neither

gen-constr-2 nor loc-stat-1 will be included in the modified transaction.

After the modified transaction completes execution, it will be committed. Next, the

operations of the committed transaction, INSERT into TOP-SECRET-PROJ and INSERT

into GOVT-PROJECT (conditional) are used in implicit selection. The two rules

gen-constr-2 and loc-stat-1 are now candidates for implicit selection. Both these rules test

occurrences of an affected object type, TOP-SECRET-PROJ, and they will be selected. See

Figure 6.4 for these implicitly selected rules (enclosed by the dotted lines). Note that rule

loc-stat-1 will also be selected by the INSERT into GOVT-PROJECT, but it will be exe-

cuted just once. The two implicitly selected rules can be treated as independent transactions

and are candidates for concurrent execution. This is discussed in Chapter Seven.

We now examine another example with a backward inference chain and a cycle of value

dependent rules. The relevant rules defined for the object type der-P*S-P are in Figure 6.5.

Consider a transaction fragment that executes a RETRIEVE operation against the

der-P*S-P object type. This operation matches with triggers T4 and T4' and two value

dependent rules trans-cl-1 and trans-cl-2 are appended to the transaction, to execute in

parallel with the RETRIEVE operation (see Figure 6.6a). This is an example of a backward

inference chain; T4 and T4' use the goal; i.e., the occurrences of der-P*S-P that are to be

retrieved, to bind the derived der-P*S-P occurrence, Z, on the RHS of trans-cl-1 and

trans-cl-2. These bindings will then be passed within the two rules from the RHS to the

LHS.

The rule trans-cl-2 causes a recursive cycle of value dependent rules. To satisfy its

LHS, trans-cl-2 appends a retrieval operation against der-P*S-P (see Figure 6.6b). This

retrieval operation matches with rules T4 and T4' which (recursively) append trans-cl-1 and

trans-cl-2 (see Figure 6.6c). This cycle will not cause endless execution; it includes value

dependent rules which terminate due to finite object occurrences.










This recursive cycle is actually detected during the look-ahead process. Prior to the

execute phase corresponding to an execution level = 1, the transaction passes through the

look-ahead process which processes the operations that will be appended at level 2 (by exe-

cuting the LHS of the value dependent rules at level 1). Rule trans-cl-2 at level 1 will

append a RETRIEVE operation against der-P*S-P at level 2. The look-ahead process re-

invokes the MME cycle at level 2, to match this RETRIEVE operation. The operation

matches with rules T4 and T4' and (recursively) appends rules trans-cl-1 and trans-cl-2, at

execution level 2. This cycle involving rule trans-cl-2 spans execution levels; it is detected

and the look-ahead processing for rule trans-cl-2 terminates. The cycle is not eliminated

since it includes value dependent rules. We use this example in Chapter Eight to discuss the

use of database query optimization techniques to efficiently process linear recursive rules.



6.3 Simulating the MME Cycle Using a Production System


The mechanism of applying rules in the KBMS was simulated using the OPS5 Produc-

tion System Language [FOR81]. We use the term simulation since the KBMS transaction

was not actually executed against object occurrences of a knowledge base. The objective

was to verify the operation of the various phases of the MME cycle and to obtain insights to

support an efficient implementation of an integrated KBMS.

The OPS5 system has a single processing paradigm of production rules. Thus, we use

production rules to model the system knowledge and the domain knowledge. The "domain"

production rules model the value independent and value dependent rules defined for the

object types and occurrences, specific to each application. On the other hand, the "system"

production rules control the different phases of the MME cycle. This includes matching the

transaction against a relevant subset of the value independent domain rules, modifying the

transaction by incorporating value dependent domain rules and executing the modified tran-

saction.










The OPS5 system was chosen because it provides excellent support for production rules

and pattern matching and because it has been widely used to implement several rule-based

expert systems. However, processing in a production system (PS) is significantly different

from the processing strategy for the KBMS that we described, resulting in several limitations

to our simulation. Ironically, these very limitations provided the greatest benefit, since they

provided insight into the functionality of the KBMS components and the MME cycle. The

different implementation issues and strategies discussed in the rest of this thesis originated

from the limitations of this OPS5 simulation.

Conceptually, there are many differences between the mechanism of applying rules in a

PS and in the KBMS. The OPS5 production system is in a continuous cycle of match, select

and action. During each cycle, all the productions or rules are candidates to be selected for

execution and the selected candidates form a conflict set. The language definition of OPS5

only allows a single production rule to be fired at any instant and there are two built-in

conflict resolution strategies, LEX and MEA, that determine which rule is to be chosen, from

the conflict set, for execution.

For the simulation of the control strategy of the MME cycle in a transaction oriented

KBMS, we had to circumvent the OPS5 selection strategy at times and implement our own

selection strategy so as to accurately model the operation of the MME cycle. For example,

during the match phase, the MME cycle only uses a subset of the rules; i.e., the value

independent triggers that match against the triggering operations and rules. During the

modify phase, the matching triggers are used to modify the transaction and to schedule

operations and value dependent rules for execution. The transaction can also be modified

during the execution of the value dependent rules. There is no equivalent to the concepts of

a transaction, modifications made to a transaction or the process of explicit selection of rules

in the OPS5 system. The implicit selection of value dependent rules in the execute phase of

the MME cycle most resembles the normal processing of the OPS5 system.










In addition to these differences, there are some further drawbacks to using OPS5.

OPS5 has neither a formal knowledge representation model nor a set oriented processing

strategy. The lack of a representation model was overcome by simulating a network

representation so that data and relevant rules could be grouped together within an object

type. The lack of set processing features was handled through the use of tags to group

together those operations which belong to a single set oriented operation.

The following is a brief description of our simulation of the MME cycle using OPS5.

The initial transaction is structured as a sequence of operations against objects. The match

level and the execution level of these operations are initialized to 1. The match level is used

to identify new operations and rules appended to the transaction during the modify phase,

after successfully matching against value independent triggers in the match phase. The exe-

cution level is used to identify new operations and rules appended to the transaction during

the execute phase of the MME cycle.

Operations from the initial transaction are matched individually against the value

independent triggers defined for the relevant object types. If the match is successful, then, in

the following modify phase, the operation (parent) is attached to a modification structure

(child) whose match level and execution level are initialized to 1. Parent and child pointers

are used with the modification structure. This modification structure is used to capture the

modifications to the (parent) operation of the transaction. The structure points to new

operations and rules appended in the modify phase. These operations and rules are properly

sequenced using the options specified on the LHS of the value independent triggers. The

matching triggers are also marked with the corresponding match and execution levels (= 1);

this is used later to detect cycles, as will be seen.
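
The bookkeeping described above can be illustrated with a short sketch. The simulation itself was written in OPS5; the Python below is only an approximation, and the names Item, Modification and attach_modification are hypothetical. It also assumes that an appended operation or rule carries the match level of the structure that appended it, which the text does not state explicitly.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)
class Item:
    """An operation or a value dependent rule in the transaction."""
    name: str
    match_level: int = 1
    execution_level: int = 1
    parent: Optional["Modification"] = None  # structure that appended this item
    child: Optional["Modification"] = None   # structure holding this item's modifications


@dataclass(eq=False)
class Modification:
    """Captures what the modify phase appended for one matched (parent) item."""
    parent: Item
    match_level: int
    execution_level: int
    appended: List[Item] = field(default_factory=list)


def attach_modification(item: Item, new_names: List[str]) -> Modification:
    """Attach a child modification structure to an item that matched a trigger."""
    # A root item gets a structure at match level 1; an item that was itself
    # appended gets a structure one match level above its parent structure.
    level = 1 if item.parent is None else item.parent.match_level + 1
    mod = Modification(item, level, item.execution_level)
    for name in new_names:
        # Assumption: appended items carry the level of the appending structure.
        mod.appended.append(Item(name, level, item.execution_level, parent=mod))
    item.child = mod
    return mod
```

Following the parent fields from any appended item leads back to the root operation of the initial transaction, which is the traversal used later for cycle detection.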

Each of the new operations and value dependent rules pointed to by the modification

structures must also be matched and modified; this is done recursively. If the match is suc-

cessful, then this new operation or rule (which already has a pointer from a parent

modification structure) will now point to its own child modification structure. The match










level of this child modification structure will be incremented by 1 from the previous match

level so that the child modification structures can be differentiated from the parent. The

execution level is still set to 1.

Potential cycles are easily detected when there is a match with a trigger which has

been marked to indicate a previous match at a lower match level. To detect actual cycles

(from potential cycles), the modification structures are traversed backwards, via the parent

pointers, until an actual cycle is detected or a root operation at match level = 1 is encoun-

tered. Once detected, these cycles must either be eliminated or examined to ensure they will

not result in endless execution. The match-modify phases terminate when no more matches

are possible.
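
The distinction between potential and actual cycles can be sketched as follows. This is a simplified illustration, not the OPS5 productions used in the simulation: the modification structure is collapsed into a direct parent pointer, and the names Node, is_potential_cycle and is_actual_cycle are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(eq=False)
class Node:
    """An operation or rule, with the trigger it matched (None if it matched none)."""
    name: str
    match_level: int
    trigger: Optional[str] = None
    parent: Optional["Node"] = None  # item whose modification appended this one


def is_potential_cycle(node: Node, marks: Dict[str, int]) -> bool:
    """marks maps a trigger to the lowest match level at which it has matched."""
    previous = marks.get(node.trigger)
    return previous is not None and previous < node.match_level


def is_actual_cycle(node: Node) -> bool:
    """Walk the parent pointers back towards the root operation (match level 1)."""
    if node.trigger is None:
        return False
    ancestor = node.parent
    while ancestor is not None:
        if ancestor.trigger == node.trigger:
            return True  # the same trigger matched earlier on this very chain
        ancestor = ancestor.parent
    return False
```

A trigger marked on a sibling branch gives a potential cycle only; the backward traversal confirms an actual cycle when the earlier match lies on the chain of ancestors.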

The modified transaction at the current execution level (=1) is now processed by the

look-ahead process which re-invokes the MME cycle at the next execution level (= 2), if

necessary. This process works as follows: For all value dependent rules at execution level =

1, the retrieval operations that will be appended to satisfy their LHS are marked at execu-

tion level = 2. The corresponding match level is initialized to 1. These retrieval operations

now pass through the match-modify phases of the MME cycle, with their execution level set to

2. If there is a match, then a modification structure, also at execution level = 2, is attached.

In addition to cycles with the same execution level, there is now a possibility of cycles

that cross execution levels. Again, such potential cycles are first identified when there is a

match with a trigger which is marked to indicate a match at a lower execution level. Actual

cycles are identified by traversing backwards via the modification structures, as before.

In the transitive closure example discussed in the previous section, a retrieval operation

executed on the LHS of rule trans_cl_2 (execution level = 1) matches with the same rule

trans_cl_2 (execution level = 2). These cycles do not have to be eliminated if they involve

value dependent rules; however, unless they are detected the look-ahead process will not ter-

minate.










After the look-ahead process terminates, the modified transaction (operations, rules and

modification structures) whose execution level is initialized to 1, is executed. First, the

operations and rules corresponding to match level 1 are executed. When a modification

structure is encountered, the match level is incremented and this structure is used to obtain

the new operations and rules. These too, will retain an execution level of 1.

During execution of the value dependent rules, the transaction may be modified and

new rules or operations may be introduced. However, the execution level of these new

operations or rules is incremented by 1. Operations and rules at higher execution levels have

precedence. Thus, execution at level e is halted when operations or rules at level (e+1) are

appended to the transaction.
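
The precedence of higher execution levels can be sketched as a nested execution, shown below. This is an illustrative approximation only: the function run and the appends table (a stand-in for whatever a rule's RHS appends) are hypothetical, and the sketch assumes the appended items contain no cycles.

```python
from typing import Dict, List, Tuple


def run(transaction: List[str], appends: Dict[str, List[str]]) -> List[Tuple[int, str]]:
    """Return the order of execution as (execution level, item) pairs.

    appends maps an item to the operations or rules that its execution
    appends to the transaction (hypothetical stand-in for a rule's RHS).
    """
    trace: List[Tuple[int, str]] = []

    def execute(items: List[str], level: int) -> None:
        for item in items:
            trace.append((level, item))
            new_items = appends.get(item, [])
            if new_items:
                # Level e is suspended until everything at level e+1 completes.
                execute(new_items, level + 1)

    execute(transaction, 1)
    return trace
```

For instance, if a rule at level 1 appends an insertion, the insertion runs at level 2 before the remaining level 1 items resume.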

Those operations at level (e+1) that have already passed through the look-ahead process

do not require further matching. If they had been modified during the look-ahead process,

they will be attached to a modification structure with execution level (e+1) and match level

= 1. For those operations that have not been through the look-ahead process, the MME

cycle is re-invoked with an execution level of (e+1) and with the match level re-initialized to

1.

After all the operations and rules of the modified transaction complete execution, the

changes made to the knowledge base are committed. To complete the execute phase of the

MME cycle with execution level 1 requires the completion of the implicit selection process.

The implicit selection and execution of value dependent rules during the execute phase does

not need elaborate support in OPS5, since this closely resembles the OPS5 strategy of select-

ing rules. The MME cycle completes when no more rules are implicitly selected.
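
The implicit selection process resembles a recognize-act loop that runs until no rule can be selected. The sketch below is a generic illustration and not OPS5 itself: facts are plain strings, one rule fires per cycle, and a rule is treated as selectable only if it would add something new (an assumption made here to guarantee termination). The name implicit_selection is hypothetical.

```python
from typing import Callable, List, Set, Tuple

Rule = Tuple[str, Callable[[Set[str]], bool], Callable[[Set[str]], Set[str]]]


def implicit_selection(facts: Set[str], rules: List[Rule]) -> List[str]:
    """Fire one selectable rule at a time until none remains; return firing order."""
    fired: List[str] = []
    while True:
        for name, condition, action in rules:
            if condition(facts):
                new_facts = action(facts) - facts
                if new_facts:             # only fire if something changes
                    facts |= new_facts    # update the knowledge base in place
                    fired.append(name)
                    break                 # single firing per cycle
        else:
            return fired                  # no rule was selected: the cycle completes
```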

OPS5 productions are compiled into a Rete network and the efficiency of OPS5 is attributed

to the efficiency of the Rete algorithm [FOR82]; however, there are several characteristics

of a KBMS that reduce the efficiency of the Rete algorithm. These are discussed in detail

with other implementation issues in the next chapter, where we suggest alternative

methods to structure and select rules.










The use of the object-oriented paradigm to structure rules and to provide a binding

between data and relevant rules is an important feature of our KBMS. However, the Rete

network does not allow such a structuring of rules. Although the Rete network does bind

data and relevant rules, it does this by structuring the data using the rules, in contrast to

structuring both data and rules using object types as is suggested for the object-oriented

KBMS.

One of the shortcomings of the Rete algorithm is the lack of support for set oriented

operations. This precludes the use of efficient set oriented DBMS strategies to support rule

processing in a KBMS. As a result, the functional integration of a DBMS with a rule-based

system, within the KBMS, which is a cornerstone of our research, could not be simulated.

Functional integration of the KBMS components is the topic of Chapter Seven.

OPS5 also does not permit a rule to execute another rule, whereas this is an important

feature of our scheme for building explicit inference chains, where a trigger explicitly selects

a value dependent rule during the modify phase. This feature had to be simulated in our

model using side effects and a blackboard.

OPS5 only allows a single production to be fired at any instant. Our simulated system

could not differentiate between system and domain production rules and always selected a

single rule for execution. As a result, we had to alternate between supporting the functions

of the MME cycle and the execution of the value dependent domain rules. For example, we

had to suspend the execute phase (at level k) when an operation or rule was appended which

required the re-invocation of the MME cycle at level (k+1). We could resume the execute

phase at level k only after the completion of the MME cycle at level (k+1). Thus, we could

not simulate possible parallelism between phases at different execution levels.

This also meant that the potential for parallelism within a single KBMS transaction

could not be simulated since the execution of the transaction was also modeled using

productions. The parallel execution of triggering operations or rules and the RHS consequent of the










triggers that was described in Figures 6.1 and 6.2 and which occurred in the example tran-

saction fragments of the previous section (see Figures 6.4 and 6.6) was not simulated.

The serialized execution of production rules in OPS5 (and most rule-based systems)

guarantees the correctness of rule execution. However, in the interests of execution

efficiency, it is useful to investigate the behavior of the MME cycle if this limitation were not

imposed. Research in the concurrent execution of DBMS transactions has resulted in a seri-

alizability criterion for correctness of an interleaved execution of concurrent transactions and

algorithms that guarantee serializability. In the next chapter we show that the serializability

criterion could also be applied to concurrent execution of rules in a KBMS.
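
For readers unfamiliar with the criterion, the standard DBMS test for conflict serializability builds a precedence graph over the interleaved schedule and checks it for cycles. The sketch below shows that textbook test applied to rule executions; it is not the scheme developed in the next chapter, and the schedule encoding and the name is_conflict_serializable are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple

# One schedule step: (rule or transaction name, 'r' or 'w', data item)
Step = Tuple[str, str, str]


def is_conflict_serializable(schedule: List[Step]) -> bool:
    """True if the precedence graph of the schedule is acyclic."""
    graph: Dict[str, Set[str]] = {}
    for i, (t1, a1, x1) in enumerate(schedule):
        for t2, a2, x2 in schedule[i + 1:]:
            # Two steps conflict if they come from different rules, touch the
            # same item, and at least one of them is a write.
            if t1 != t2 and x1 == x2 and "w" in (a1, a2):
                graph.setdefault(t1, set()).add(t2)

    WHITE, GREY, BLACK = 0, 1, 2
    color: Dict[str, int] = {}

    def has_cycle(u: str) -> bool:
        color[u] = GREY
        for v in graph.get(u, ()):
            c = color.get(v, WHITE)
            if c == GREY or (c == WHITE and has_cycle(v)):
                return True
        color[u] = BLACK
        return False

    return not any(color.get(u, WHITE) == WHITE and has_cycle(u) for u in list(graph))
```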

As mentioned earlier, it was from the limitations of our OPS5 simulation that we

gained much insight into the task of adequately supporting the MME cycle in the KBMS.

The simulation helped to identify several implementation issues that will be discussed next.



















Triggering operation    Trigger                               Modified transaction
in transaction

INSERT into object      IF pre-exec(INSERT into object)       RHS consequent
                        THEN (RHS consequent)                     |  +
                                                              INSERT into object

INSERT into object      IF post-exec(INSERT into object)      INSERT into object
                        THEN (RHS consequent)                     |
                                                              RHS consequent

INSERT into object      IF par-exec(INSERT into object)       INSERT into object  ||  RHS consequent
                        THEN (RHS consequent)                 (executed in parallel)

+  conditionally executed if no abort flag is set


Figure 6.1 Possible Modifications to a Triggering Operation












Triggering rule         Trigger                               Modified transaction
in transaction

EXECUTE rule            IF pre-exec(EXECUTE rule)             RHS consequent
                        THEN (RHS consequent)                     |  +
                                                              EXECUTE rule

EXECUTE rule            IF post-exec(EXECUTE rule)            EXECUTE rule
                        THEN (RHS consequent)                     |
                                                              RHS consequent

EXECUTE rule            IF par-exec(EXECUTE rule)             EXECUTE rule  ||  RHS consequent
                        THEN (RHS consequent)                 (executed in parallel)

EXECUTE rule            IF succ-exec(EXECUTE rule)            EXECUTE rule
                        THEN (RHS consequent)                     |  ++
                                                              RHS consequent

+   conditionally executed if no abort flag is set
++  conditionally executed if rule successfully executes


Figure 6.2 Possible Modifications to a Triggering Rule










T5 : TOP_SECRET_PROJ
    IF pre-exec(INSERT an occurrence X into TOP_SECRET_PROJ)
    THEN (EXECUTE gen_constr_1(X): GOVT_PROJECT)

gen_constr_1(X): GOVT_PROJECT
    IF (for an inserted occurrence X of TOP_SECRET_PROJ, there exists
        an occurrence Y of NON_MILITARY_PROJ, such that EQUAL(X.OID, Y.OID))
    THEN (alert the KBMS to reject occurrence X)


T6 : TOP_SECRET_PROJ
    IF par-exec(INSERT an occurrence X into TOP_SECRET_PROJ)
    THEN (EXECUTE gen_constr_2(X): GOVT_PROJECT)

gen_constr_2(X): GOVT_PROJECT
    IF (for an inserted occurrence X of TOP_SECRET_PROJ
        NOT_SET_MEMBER(X.OID, MILITARY_PROJ.OID))
    THEN (INSERT an occurrence Z into MILITARY_PROJ where Z.OID = X.OID)


T7 : TOP_SECRET_PROJ
    IF par-exec(INSERT an occurrence X into TOP_SECRET_PROJ)
    THEN (EXECUTE gen_hier_1(X): GOVT_PROJECT)

gen_hier_1(X): GOVT_PROJECT
    IF (for occurrence X inserted into a constituent of GOVT_PROJECT
        NOT_SET_MEMBER(X.OID, GOVT_PROJECT.OID))
    THEN (INSERT an occurrence P into GOVT_PROJECT
          where P.OID = X.OID)


T8 : GOVT_PROJECT
    IF pre-exec(INSERT an occurrence X into GOVT_PROJECT)
    THEN (obtain values for attributes X.LOCATION and X.STATUS from user)


T9 : TOP_SECRET_PROJ
    IF post-exec(INSERT occurrence X into TOP_SECRET_PROJ)
    THEN (EXECUTE loc_stat_1(X): GOVT_PROJECT)

loc_stat_1(X): GOVT_PROJECT
    IF (for an inserted occurrence X of TOP_SECRET_PROJ, there exists
        an occurrence Y of GOVT_PROJECT, such that EQUAL(X.OID, Y.OID) AND
        EQUAL(Y.STATUS, "testing") AND NOT_EQUAL(Y.LOCATION, "Virginia"))
    THEN (UPDATE Y such that Y.LOCATION = "Virginia")


Figure 6.3 Knowledge Rules Relevant to TOP_SECRET_PROJ










TRANSACTION FRAGMENT                      MME CYCLE PHASE                   execution level

INSERT X into TOP_SECRET_PROJ             MATCH with rules                  1
                                          T5, T6, T7, T9

                                          MODIFY                            1
gen_constr_1(X)                           selected rule
    |
#1  INSERT X into TOP_SECRET_PROJ         matched operation
#2  gen_constr_2(X)                       selected rule
#3  gen_hier_1(X)                         selected rule
    |
loc_stat_1(X)                             selected rule

                                          EXECUTE                           1
gen_constr_1(X)                           rule execution
    IF LHS = true then SET abort flag
    |  *                                  (test abort flag)
#1  INSERT X into TOP_SECRET_PROJ         parallel execution of rules
#2  gen_constr_2(X)                       and operations
#3  gen_hier_1(X)
    if LHS = true then APPEND operation   suspend execution of #3

*  conditionally executed if abort flag not set

Figure 6.4 Example of Transaction Fragment Passing
Through the MME Cycle











TRANSACTION FRAGMENT                      MME CYCLE PHASE                   execution level

INSERT P into GOVT_PROJECT                MATCH with rule T8                2

                                          MODIFY                            2
attr_inh_1(P)                             selected rule
    |
INSERT P into GOVT_PROJECT                matched operation

                                          EXECUTE                           2
attr_inh_1(P)                             rule execution
    if LHS = true then INPUT values from user
    |
INSERT P into GOVT_PROJECT                operation execution

                                          resume execution at
                                          execution level = 1

                                          EXECUTE                           1
loc_stat_1(X)                             rule execution
    if LHS = true then APPEND operation

                                          IMPLICIT selection of rules       1
loc_stat_1
gen_constr_2


Figure 6.4 (continued) Example of Transaction Fragment Passing
Through the MME Cycle










T4 : der_P*S_P
    IF par-exec(RETRIEVE occurrence X from der_P*S_P)
    THEN (EXECUTE trans_cl_1(X): der_P*S_P)

trans_cl_1(Z): der_P*S_P
    IF (there exists an occurrence Y of P*S_P)
    THEN (DERIVE occurrence Z of der_P*S_P such that Z = Y)


T4': der_P*S_P
    IF par-exec(RETRIEVE occurrence X from der_P*S_P)
    THEN (EXECUTE trans_cl_2(X): der_P*S_P)

trans_cl_2(Z): der_P*S_P
    IF (there exist occurrences P and Q of der_P*S_P and P*S_P, respectively, such that
        SET_MEMBER(Q.PART, P.SUB_PARTS))
    THEN (DERIVE occurrence Z of der_P*S_P where Z.PART = P.PART
          AND Z.SUB_PARTS = the union of P.SUB_PARTS and Q.SUB_PARTS)


Figure 6.5 Knowledge Rules Relevant to the Object der_P*S_P
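
The effect of the two rules of Figure 6.5 can be illustrated by a small fixpoint computation. This is a sketch under stated assumptions rather than the KBMS mechanism: P*S_P is represented as a dictionary from a part to the set of its immediate sub-parts, derived occurrences for the same part are merged into one set, and the function name derive_closure is hypothetical.

```python
from typing import Dict, Set


def derive_closure(base: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Derive der_P*S_P (all sub-parts of each part) from P*S_P (immediate sub-parts)."""
    # trans_cl_1: every occurrence of P*S_P is also an occurrence of der_P*S_P.
    derived = {part: set(subs) for part, subs in base.items()}

    # trans_cl_2: if Q.PART is among P.SUB_PARTS, extend P.SUB_PARTS with
    # Q.SUB_PARTS. Repeat until a pass derives nothing new.
    changed = True
    while changed:
        changed = False
        for subs in derived.values():
            for q_part, q_subs in base.items():
                if q_part in subs and not q_subs <= subs:
                    subs |= q_subs
                    changed = True
    return derived
```

The loop stops when a pass adds nothing, which mirrors the requirement that the cycle of value dependent rules be recognized so that processing terminates.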













[Diagram not reproduced. Nodes shown in the panels: RETRIEVE der_P*S_P, EXECUTE tr_cl_1, EXECUTE tr_cl_2, RETRIEVE P*S_P (LHS), DERIVE der_P*S_P (RHS), annotated with execution level = 1 and execution level = 2.]

(a) Modified transaction fragment at level 1

(b) Transaction fragment executing at level 1

(c) Transaction fragment executing at levels 1 and 2


Figure 6.6 Transaction Fragment with Cycle of Value Dependent Rules
a) Modified transaction fragment at level 1 b) Transaction fragment
executing at level 1 c) Transaction fragment executing at levels 1 and 2