Citation
Association algebra

Material Information

Title:
Association algebra: a mathematical foundation for object-oriented databases
Creator:
Guo, Mingsen, 1947-
Publication Date:
Language:
English
Physical Description:
viii, 159 leaves : ill. ; 29 cm.

Subjects

Subjects / Keywords:
Algebra ( jstor )
Data models ( jstor )
Database design ( jstor )
Databases ( jstor )
Departmental majors ( jstor )
Distributivity ( jstor )
Mathematics ( jstor )
Query languages ( jstor )
Relational database models ( jstor )
Undergraduate students ( jstor )
Genre:
bibliography ( marcgt )
theses ( marcgt )
non-fiction ( marcgt )

Notes

Thesis:
Thesis (Ph. D.)--University of Florida, 1990.
Bibliography:
Includes bibliographical references (leaves 135-140).
General Note:
Typescript.
General Note:
Vita.
Statement of Responsibility:
by Mingsen Guo.

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
Copyright [name of dissertation author]. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Resource Identifier:
025013339 ( ALEPH )
AHR3687 ( NOTIS )
24160849 ( OCLC )


Full Text












ASSOCIATION ALGEBRA:
A MATHEMATICAL FOUNDATION
FOR OBJECT-ORIENTED DATABASES








By

MINGSEN GUO


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY


UNIVERSITY OF FLORIDA


1990



























Copyright 1990

by

Mingsen Guo





















Dedicated to my dear wife Zhu (Susie)

and lovely daughter Jialan.


And to our parents

Jingcheng Guo and Ruiying Zhang

Shuyan Huang and Chuanxiang Chen,


this was their dream before it was mine.














ACKNOWLEDGEMENTS


I would like to express my sincere appreciation to Dr. Stanley Su, chairman of my supervisory committee, for giving me the opportunity to work on this interesting and important topic in the area of object-oriented database systems. Without his patient guidance and continuous support, this work could not have been completed. I am grateful to Dr. Herman Lam, cochairman of my supervisory committee, for his thought-provoking suggestions on this work. I thank Dr. Sham Navathe for his comments and his personal library. I thank Dr. Randy Chow for his encouragement throughout my graduate study. I would like to thank Dr. John Staudhammer for his time and for being on my supervisory committee.

My special thanks go to Sharon Grant, the secretary of the Database Systems Research and Development Center, whose help was always friendly and timely.

This research was supported by the National Science Foundation (DMC-8814989) and the National Institute of Standards and Technology (60NANB4D0017). The development effort is supported by the Florida High Technology and Industrial Council (UPN88092237).















TABLE OF CONTENTS


ACKNOWLEDGMENTS

ABSTRACT

CHAPTER

1 INTRODUCTION

2 A SURVEY OF RELATED WORK

2.1 Relational Model and Relational Algebra
2.2 Existing O-O Query Languages
2.3 ENCORE O-O Data Model and Its Underlying Query Algebra

3 OVERVIEW OF O-O DATABASES AND ASSOCIATION-BASED QUERY FORMULATION

3.1 Overview of O-O Databases
3.2 Pattern-based Query Formulation
3.3 Conclusion

4 ASSOCIATION ALGEBRA

4.1 Definitions
4.2 Relationship Between Two Patterns
4.3 Association Operators
4.4 Query Examples

5 MATHEMATICAL PROPERTIES OF OPERATORS AND THEIR APPLICATIONS IN QUERY OPTIMIZATION AND QUERY DECOMPOSITION

5.1 Conventional Algebraic Properties
5.2 Nesting of Two Unary Operators
5.3 Nesting of Binary Operator in Unary Operator
5.4 Cascading of Two Binary Operators
5.5 General Identities
5.6 Transformation of Operators
5.7 Applications in Query Optimization and Decomposition

6 COMPLETENESS OF THE A-ALGEBRA

7 CONCLUSION

REFERENCES

APPENDIX

BIOGRAPHICAL SKETCH
















Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy


ASSOCIATION ALGEBRA:
A MATHEMATICAL FOUNDATION
FOR OBJECT-ORIENTED DATABASES


By
Mingsen Guo

December 1990
Chairman: Dr. Stanley Y.W. Su
Major Department: Electrical Engineering

Existing O-O DBMSs lack a solid mathematical foundation for the manipulation of O-O databases, the optimization of queries, and the design and selection of storage structures for supporting O-O database manipulations. An association algebra (A-algebra) is proposed to serve as a mathematical foundation for processing O-O databases, analogous to the use of relational algebra for processing relational databases. In this algebra, objects and their associations in an O-O database are uniformly represented by association patterns, which are manipulated by a number of operators to produce other association patterns. Unlike the relational algebra, in which set operations operate on relations with union-compatible structures, the A-algebra operators can operate on association patterns of both homogeneous and heterogeneous structures. Unlike traditional record-based relational processing, the A-algebra allows very complex patterns of object associations to be directly manipulated. Pattern-based query formulation and the A-algebra operators are described. Some mathematical properties of the algebraic operators are presented together with their application in query decomposition and optimization. The completeness of the A-algebra is also defined and proven. The A-algebra has been used as the basis for the design and implementation of an object-oriented query language, OQL, which is the query language used in a prototype Knowledge Base Management System, OSAM*.KBMS.














CHAPTER 1
INTRODUCTION


In the past two decades, techniques of data modeling have gone through two major conceptual changes. First, in the early 1970s, E. F. Codd observed that future database systems should allow application programs and terminal users to remain unaffected by changes made to the internal data representation (or the storage structure) of a database. He introduced the relational data model [COD70] and proposed the relational algebra and relational calculus [COD72a] as the mathematical foundation for processing relational databases. The relational model provides two levels of data independence in a three-level architecture for a database management system, as shown in Figure 1.1 (figures of each chapter are placed at the end of the chapter). At the lower level, physical data independence is provided, i.e., the logical representation of a relational database is a set of relations (i.e., flat tables), which is independent of the physical (data and storage) structures in which data are stored. At the higher level, logical data independence is provided, i.e., the external view remains unchanged when the logical view of a database is modified (note that the external view remains unchanged only for some schema modifications). Besides simple logical representation and data independence, the fact that the relational model has a solid mathematical foundation is very important and has contributed to the success of the model and the existing relational database management systems.









However, the relational model and relational systems have some limitations. For example, the model captures rather limited structural properties of real-world entities or objects. The construct of aggregation hierarchy, which models complex objects, and the construct of generalization, which models the superclass-subclass relationship, are not provided. In the relational model, data which describe a complex object are scattered among a number of normalized relations, and accessing those data involves time-consuming traversal and assembly of data stored in multiple relations. The model also does not allow behavioral properties of entities/objects to be explicitly defined.

The second conceptual change of data modeling techniques occurred in the early 1980s. The object-oriented paradigm, first introduced in the programming language SIMULA [DAH67] and made very popular through the language SMALLTALK [GOL81], allows richer structural constructs and behavioral properties of objects to be specified at the logical level independent of their physical implementations. Several features of the paradigm, such as abstract data types, inheritance, encapsulation, information hiding, and polymorphism, have been shown to be useful for data modeling and system development. The object encapsulation concept adds a level of data independence between the physical and the logical independence introduced in the relational model, as depicted in Figure 1.2. It requires that the structural and behavioral properties of an object be (logically) encapsulated in its class in the conceptual view of an O-O database. Since then, a number of Object-Oriented (O-O) and semantic data models have been proposed [HAM81, BAT84, KIN84, ZAN85a, ZAN85b, DAD86, MAI86, MAN86, SU86, ZDO86, WOE86, BAN87, FIS87, HOR87, HUL87, KIM87, ROW87, CAR88, COL89, SU89], which offer more powerful constructs for modeling the structural and behavioral properties of objects found in advanced applications such as CAD/CAM, CASE, and decision support systems.

An O-O semantic data model can be structurally and/or behaviorally object-oriented [DIT86]. A structurally O-O data model is one that encompasses at least the following characteristics:

(1) It supports the unique identification of objects, that is, each object has a unique object identifier (surrogate) which is valid for the lifetime of the object.

(2) It categorizes those objects which can be described by the same set of characteristics (attributes) into an object class.

(3) It allows aggregation (association) hierarchies to be defined.

(4) It allows generalization (association) hierarchies to be defined.

The O-O view of an application world is represented in the form of a network of classes and associations. An object class can be either a primitive class, whose instances are of simple data types (e.g., string, integer), or a nonprimitive class (e.g., Part, Student, Teacher). At the extensional level, instances of different classes can be related (associated) with each other, forming patterns of object associations. A behaviorally object-oriented data model, on the other hand, is one in which operations that describe the behavior of the objects of a class can be defined and registered with that class. Programs or methods that implement the operations defined for an object are transparent to the user of the objects.









For these models to be truly useful, they must provide some object manipulation languages, which can take advantage of the expressive power of the models and provide the users with simple and powerful querying facilities. Recently, several query languages such as DAPLEX [SHI81], GEM [ZAN83, TSU84], ARIEL [MAC85], FAD [BAN87], POSTQUEL [ROW87], EXCESS [CAR88], and others reported in [DAD86, MAN86, SER86, BAN87, FIS87, BAN88, COL89, SHA90] have been proposed. These languages were developed based on different paradigms. For example, DAPLEX and the query language of [MAN86] are based on the functional paradigm. The query language of [BAN88] is based on the message passing paradigm. Other query languages are based on the relational paradigm: an extension of QUEL [ROW87, CAR88]; an extension of SQL [DAD86]; and an extension of the relational algebra [COL89]. The query language of [FIS87] is based on both functional and relational paradigms, allowing functions to be used in object-oriented SQL (OSQL) constructs.

The above languages have an O-O flavor and have taken significant steps towards the development of a powerful O-O query language. Query languages such as DAPLEX [SHI81], GEM [ZAN83], ARIEL [MAC85], and the object-oriented query language described in [BAN88] are based on the view of a database defined in terms of objects, object classes, and their associations. A query in these languages is formulated by specifying one class (usually a nonprimitive class, whose instances are real-world objects) in the schema as a central class with some path expressions. Each path expression starts from the central class and ends at another class (usually a primitive class, whose instances are of basic data types such as integer, string, set, etc.). A restriction condition can be specified on the class referenced at the end of a path expression. This class can also be specified in the list of attributes to be retrieved. The result of a query is a set of tuples, each of which corresponds to a single instance of the central class and contains values related to that instance which are collected from classes specified in the list.

A major drawback of these query languages is that they do not maintain the closure property [ALA89b]. A query language is said to be closed if the result of a query can be further queried by other queries specified in the same language. In the above-mentioned languages, the input to a query has an O-O representation (i.e., a network of objects, classes, and their associations), whereas its output is a relation which does not have the same structural and behavioral properties as the original objects. Consequently, the result of a query cannot be further processed by the same set of operators. The design of these languages is very much influenced by the relational model and relational languages, which are concerned mainly with retrieval and storage operations. In O-O processing, objects in different classes that satisfy some search conditions are subject to different user-defined operations. The idea of collecting data to form a resulting relation does not satisfy this processing model.
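
The closure property can be illustrated with a small sketch. The Python fragment below is not taken from any of the cited languages; the toy data and the function names `select_supplier` and `flatten_names` are assumptions made only for illustration. The first operator returns the same kind of structure that it consumes, so its output can be queried again, whereas the second returns a flat list that the first operator can no longer process.

```python
# Hypothetical toy data: each pattern is a (supplier, part) association.
patterns = {("s1", "p1"), ("s2", "p2"), ("s2", "p3")}


def select_supplier(pats, supplier):
    """Closed operator: takes a set of patterns and returns a set of patterns."""
    return {p for p in pats if p[0] == supplier}


def flatten_names(pats):
    """Non-closed operator: returns plain strings rather than patterns."""
    return sorted({supplier for supplier, _ in pats})


# The output of a closed operator can be fed to the same operator again.
first = select_supplier(patterns, "s2")
second = select_supplier(first, "s2")

# The flat result has lost the association structure.
names = flatten_names(patterns)
```

In this sketch `second` is still a set of associations, while `names` is only a list of supplier names and carries no information about parts.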

The query languages proposed in [DAD86, MAN86, BAN87, ROW87, CAR88, COL89] use nested relations as their logical views of O-O databases. Although these languages are closed, i.e., operators in these languages operate on nested relations to produce nested relations, the nested relation is not a proper logical representation for an O-O database, which is basically a network structure of object associations. Mapping from a network representation to nested relations is an additional process. Furthermore, in order to use nested relations to represent complex network structures, a considerable amount of data has to be introduced to relate these nested relations. It is our view that the query language and its underlying algebra should directly support the manipulation of network structures.

A query algebra [SHA90] was proposed recently based on the O-O model ENCORE [ELM89]. Although ENCORE models applications as networks of objects, object types, and their associations, the domain of the algebra is defined as sets of objects of the Tuple type, which is essentially the nested relation representation since it allows the nesting of tuples. Therefore, the mapping problem addressed above still remains. In this algebra, two identical queries or two identical operations in a single query do not give the same response, since each produces a new object in the database. To eliminate duplicated copies of the same newly created object, the algebra introduces operations like DupEliminate and Coalesce, which would not have been necessary if the algebra were to directly support the network-structured processing of O-O databases. We further observe that the union operation in this algebra may produce a collection of objects having the same data type but with different structures (e.g., the union of two collections of objects of the Tuple type with different arities). Nevertheless, the other operators introduced in the algebra are not defined to operate on collections of objects with heterogeneous structures.

A common limitation of many existing query languages is that they cannot easily express "non-association" relationships between objects, i.e., identify objects in two classes that are not associated with each other while their classes are. For example, in an O-O database, let us assume that Suppliers s1 and s2 supply Parts p1 and p2, respectively. GEM, POSTQUEL, and several other query languages provide the "dot" construct (Suppliers.Parts) and ARIEL provides the "of" construct (Parts of Suppliers) to navigate from the class Suppliers to the class Parts to produce object pairs (s1,p1 and s2,p2). However, they do not have a language construct for specifying the semantics that s1 does not supply p2 and s2 does not supply p1. Similarly, in functional languages, only the function Parts(Suppliers) is provided to specify the associations of s1,p1 and s2,p2 but not the non-association of suppliers and parts.
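
The supplier and part example can be made concrete with a short sketch. The Python fragment below only illustrates the distinction between associated and non-associated object pairs; it is not the definition of any operator in this dissertation, and the names `supplies` and `non_associated` are hypothetical.

```python
from itertools import product

# Toy extension of the example: two suppliers, two parts, two associations.
suppliers = {"s1", "s2"}
parts = {"p1", "p2"}
supplies = {("s1", "p1"), ("s2", "p2")}


def non_associated(left, right, associations):
    """Pairs drawn from the two classes that are NOT linked by an association."""
    return set(product(left, right)) - set(associations)


# Navigation (the "dot" or "of" construct) yields only the associated pairs.
associated_pairs = supplies

# The non-associated pairs must be computed against all possible pairs.
missing_pairs = non_associated(suppliers, parts, supplies)
```

Here `missing_pairs` contains (s1, p2) and (s2, p1), which is exactly the information that a navigation construct alone does not expose.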

In view of the disadvantages of the existing O-O query languages, we would like to stress the importance of using a graph as the logical representation of an O-O database at both intensional and extensional levels, as exemplified by O2 [LEC88], FAD [BAN87], and OSAM* [SU89]. The query language and its underlying algebra should provide constructs to directly process graphs with different degrees of complexity. They should also support the specification of non-associations and the processing of heterogeneous structures. Furthermore, the closure property should be maintained.

In this dissertation, we propose an association algebra (A-algebra) based on the graph representation of O-O databases and the association-based query formulation (refer to Chapter 3). Analogous to the development of the relational algebra for relational databases, the development of the A-algebra provides the formal foundation for query processing and optimization in O-O databases and for designing O-O query languages. Unlike the record(tuple)-based relational algebra [COD70 and COD72] and the query algebra [SHA90], the A-algebra is association-based, i.e., the domain of the algebra is sets of association patterns (e.g., linear structures, trees, lattices, networks, etc.) and processing an O-O database is based on the matching and manipulation of homogeneous as well as heterogeneous patterns of object associations. Operators of the A-algebra can be used to navigate a network of interconnected object classes along the path of interest to construct a complex pattern as the search condition. They can also be used to decompose a complicated pattern into simple ones. Ten operators have been defined for the algebra: three unary operators [A-Select (σ), A-Project (Π), and A-Integrate (f)], and seven binary operators [Associate (*), A-Complement (|), A-Union (+), A-Difference (-), A-Divide (÷), NonAssociate (!), and A-Intersect (•)], where the prefix A stands for "Association". Although many of these operators correspond to the relational algebra operators, they differ from them in that they can operate on complicated heterogeneous structures. In this respect, the A-algebra is more general than the relational algebra.

The rest of this dissertation is organized as follows. A detailed survey of the relational model and the relational algebra, the existing O-O query languages, and a recently proposed query algebra is provided in Chapter 2. The graphical representation of O-O databases and the association-based query formulation are described in Chapter 3 with the help of examples. Chapter 4 formally defines the concepts of Schema Graph (SG), Object Graph (OG), and association patterns. The formal definitions of the association operators and their simple mathematical properties are also presented. The A-algebra expressions for some example queries are given to demonstrate the utility of the algebra. Chapter 5 presents the mathematical properties of the association operators and their utility in query optimization and query decomposition. The proofs of the mathematical properties of the operators can be found in the Appendix. The completeness of the A-algebra is shown in Chapter 6 and the conclusion is given in Chapter 7.























Figure 1.1 Data independencies in relational databases (figure not reproduced; recoverable labels: logical data independence, physical data independence)






















Figure 1.2 Architecture of O-O databases (figure not reproduced; recoverable labels: logical data independence, encapsulation, physical data independence)















CHAPTER 2
A SURVEY OF RELATED RESEARCH


This chapter surveys some of the existing work related to the development of the A-algebra. Section 2.1 describes the relational model and the relational algebra, while Section 2.2 surveys some existing query languages designed for O-O semantic data models. The query algebra that recently appeared in the literature is surveyed in Section 2.3.



2.1 Relational Model and Relational Algebra


When the hierarchical and network data models were used extensively in information systems in the late 1960s, Codd [COD70] raised an interesting and important question: Can application programs and terminal activities remain invariant as the internal data representations (physical representations) change? He asserted that the future users of large data banks must be protected from having to know how the data were organized in the machine. Following this rationale, he conceived the notion of data independence, which suggests that the logical organization of data should be independent of its physical representation. Determined to demonstrate the validity of his data independence concept, he proposed a relational data model based on n-ary relations.










The scheme of a relation, $R$, of an entity set $\{E_1, E_2, \ldots, E_n\}$ is defined on a set of $m$ attributes $\{A_1, A_2, \ldots, A_m\}$ which correspond to $m$ domains $\{D_1, D_2, \ldots, D_m\}$ (not necessarily distinct). Each entity (an instance of the scheme) is represented by an $m$-ary tuple which has its first attribute value from $D_1$, its second attribute value from $D_2$, and so forth. A set of attributes of a relation is called a key if the entities of the relation can be uniquely identified by the values of these attributes.

For example, the information about suppliers, such as their names, addresses, the items they supply, and the prices of the items, can be represented by the relation SUPPLIERS with the following scheme

SUPPLIERS(SNAME, ADDRESS, ITEM, PRICE)

where the attributes SNAME and ITEM form a composite key. Data represented in this form, which intuitively is a flat table, is the logical view of an application world. It has nothing to do with the physical representation of the data.
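
The notion of a key can be checked mechanically on a sample instance. The sketch below is a minimal illustration in Python; the supplier names, addresses, and prices are invented, and `is_key` is a hypothetical helper. Note that it only tests one instance of the relation, whereas a key is a constraint on every legal instance of the scheme.

```python
from collections import namedtuple

# Tuple type mirroring the scheme SUPPLIERS(SNAME, ADDRESS, ITEM, PRICE).
Supplier = namedtuple("Supplier", ["SNAME", "ADDRESS", "ITEM", "PRICE"])

# Invented sample data, for illustration only.
SUPPLIERS = [
    Supplier("Acme", "12 Main St", "piston", 40),
    Supplier("Acme", "12 Main St", "valve", 15),
    Supplier("Bolt Co", "9 Oak Ave", "piston", 42),
]


def is_key(relation, attrs):
    """True if no two tuples of this instance agree on all attributes in attrs."""
    seen = set()
    for t in relation:
        values = tuple(getattr(t, a) for a in attrs)
        if values in seen:
            return False
        seen.add(values)
    return True


composite_ok = is_key(SUPPLIERS, ["SNAME", "ITEM"])
sname_alone_ok = is_key(SUPPLIERS, ["SNAME"])
```

On this sample, SNAME together with ITEM identifies every tuple, while SNAME alone does not, because one supplier appears once per item supplied.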

When designing a database using the relational model, one is often faced with a choice among alternative sets of relation schemes. Some choices are more favorable than others for various reasons. For example, the relation SUPPLIERS is not a desirable scheme because it has the following potential problems: (1) Redundancy: the address of the supplier is repeated once for each item supplied. (2) Potential inconsistency (update anomalies): as a consequence of the redundancy, updating the address of a supplier in one tuple will leave it inconsistent with the address in another tuple. (3) Insertion anomalies: the address of a supplier cannot be recorded if that supplier does not currently supply at least one item, since SNAME and ITEM form a composite key of the relation SUPPLIERS. (4) Deletion anomalies: the inverse of problem (3) is that, should all the items supplied by one supplier be deleted, we unintentionally lose the address of that supplier.

The causes of these problems and their solutions are related to the functional dependencies among the attributes of a relation [COD70, ULL82]. Suppose $X$ and $Y$ are two sets of attributes of a relation. $Y$ functionally depends on $X$ (or $X$ functionally determines $Y$), denoted by $X \to Y$, if any two tuples of the relation having the same values in the attributes of $X$ agree on the values of the attributes in $Y$. The above four problems emerge if $X \to Y$ and $X' \to Z$ hold simultaneously, where $X'$ stands for a proper subset of $X$ and $Z$ for a set of attributes of the relation.
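
The definition of a functional dependency translates directly into a test on a relation instance. The following Python sketch is illustrative only: the rows are invented sample data and `holds` is a hypothetical helper. It reports whether a dependency is violated in the given instance; it cannot prove that the dependency holds for the scheme in general.

```python
def holds(relation, lhs, rhs):
    """True if tuples that agree on lhs also agree on rhs in this instance."""
    seen = {}
    for row in relation:
        x = tuple(row[a] for a in lhs)
        y = tuple(row[a] for a in rhs)
        # Remember the first rhs value seen for each lhs value and compare.
        if seen.setdefault(x, y) != y:
            return False
    return True


# Invented sample instance of SUPPLIERS(SNAME, ADDRESS, ITEM, PRICE).
rows = [
    {"SNAME": "Acme", "ADDRESS": "12 Main St", "ITEM": "piston", "PRICE": 40},
    {"SNAME": "Acme", "ADDRESS": "12 Main St", "ITEM": "valve", "PRICE": 15},
    {"SNAME": "Bolt Co", "ADDRESS": "9 Oak Ave", "ITEM": "piston", "PRICE": 42},
]

key_determines_price = holds(rows, ["SNAME", "ITEM"], ["PRICE"])
name_determines_address = holds(rows, ["SNAME"], ["ADDRESS"])
item_determines_price = holds(rows, ["ITEM"], ["PRICE"])
```

In this sample the first two dependencies hold, and the second one has a proper subset of the key on its left-hand side, which is the situation described above. The third fails because the same item is offered at two prices.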

The solution to these problems is to decompose a relation based on the functional dependencies among attributes. For example, the functional dependencies among attributes of the relation SUPPLIERS are $(\text{SNAME}, \text{ITEM}) \to \text{PRICE}$ and $\text{SNAME} \to \text{ADDRESS}$, so the relation suffers from the redundancy, update, insertion, and deletion anomalies. It should be clear to the reader that these problems will be eliminated if the relation SUPPLIERS is decomposed into two relations

SA(SNAME, ADDRESS) and
SIP(SNAME, ITEM, PRICE).

There is, however, a disadvantage to the above decomposition: to find the address of a supplier who supplies the item "piston", a join operation has to be applied, since ADDRESS and ITEM are logically distributed in two relations.
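
The decomposition and the join it requires can be sketched as follows. This Python fragment uses invented sample data and a hand-written `natural_join` helper (an assumption for illustration, not part of any DBMS); it shows that joining SA and SIP on SNAME recovers the original tuples and answers the "piston" query.

```python
# Invented sample instance: (SNAME, ADDRESS, ITEM, PRICE).
SUPPLIERS = [
    ("Acme", "12 Main St", "piston", 40),
    ("Acme", "12 Main St", "valve", 15),
    ("Bolt Co", "9 Oak Ave", "piston", 42),
]

# Decomposition into SA(SNAME, ADDRESS) and SIP(SNAME, ITEM, PRICE).
SA = {(sname, address) for sname, address, _, _ in SUPPLIERS}
SIP = {(sname, item, price) for sname, _, item, price in SUPPLIERS}


def natural_join(sa, sip):
    """Join SA and SIP on their common attribute SNAME."""
    return {
        (sname, address, item, price)
        for sname, address in sa
        for sname2, item, price in sip
        if sname == sname2
    }


# Each address is now stored once per supplier instead of once per item.
rejoined = natural_join(SA, SIP)

# Addresses of suppliers who supply the item "piston" require the join.
piston_addresses = {addr for _, addr, item, _ in rejoined if item == "piston"}
```

On this sample, SA holds two tuples where SUPPLIERS repeated the Acme address twice, and the join reproduces the three original tuples.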










The decomposition of a relation based on the functional dependencies among its attributes is the subject of normalization in the relational model. Four types of normal forms, denoted by 1NF, 2NF, 3NF, and Boyce-Codd-NF, respectively, have been recognized in considering functional dependencies [COD70, ARM74, and BEE77]. The Boyce-Codd-NF is the strongest of these normal forms. Relations in these normal forms may have to be further decomposed into 4NF or 5NF to eliminate multivalued dependencies [FAG77, DEL78, and ZAN76] and join dependencies [AHO79]. This decomposition is needed to eliminate further redundancy and anomalies.

The success and popularity of the relational model and the relational database management systems (DBMSs) are due to the model's simplicity in structural (tabular) representation and its sound theoretical basis: the relational algebra and the relational calculus [COD72a]. The relational algebra defines five primitive operators, of which two are unary operators [Projection (Π) and Selection (σ)] and three are binary operators [Cross-product (×), Union (+), and Difference (-)]. Other operators such as Join, Natural-join, Set-intersection, and Set-division are also defined in the algebra. Although these latter operators are easy to use, they are not primitive since they can be expressed in terms of the primitive operators.
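
As a minimal sketch of how derived operators reduce to primitive ones, the Python fragment below models relations as sets of tuples and builds set intersection and a theta-join from the five primitives only. The function names and the positional treatment of attributes are assumptions made for illustration; they do not reproduce any particular system.

```python
# The five primitive operators, with relations modeled as sets of tuples.
def select(r, pred):
    return {t for t in r if pred(t)}


def project(r, positions):
    return {tuple(t[i] for i in positions) for t in r}


def cross(r, s):
    return {a + b for a in r for b in s}


def union(r, s):
    return r | s


def difference(r, s):
    return r - s


# Derived operators, expressed only through the primitives above.
def intersection(r, s):
    # R intersect S = R - (R - S)
    return difference(r, difference(r, s))


def theta_join(r, s, pred):
    # A join is a selection applied to the cross-product.
    return select(cross(r, s), pred)


common = intersection({(1,), (2,)}, {(2,), (3,)})
joined = theta_join({(1, "a"), (2, "b")}, {(1, "x"), (3, "y")},
                    lambda t: t[0] == t[2])
keys = project(joined, [0])
```

Here `common` is the single tuple shared by both relations, and `joined` pairs the tuples whose first attributes match.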

The relational algebra has the closure property, since every operator operates on one or more relations and produces a new relation. Operators of the relational algebra basically operate on the values of tuples in relations. Structurally speaking, they are defined to operate on tuples whose structures are union-compatible (homogeneous). The relational algebra is complete in the sense that it has expressive power equivalent to that of the relational calculus [COD72a and ULL82]. Because of this, it serves as the theoretical basis for the relational model. The relational algebra has been used for the following three purposes, although it has not been implemented in any existing DBMS exactly as defined [ULL82]:

(1) It creates a new class of query languages called algebraic languages. Based on

the relational algebra, languages that directly adopt the relational operators

can be developed, such as ISBL [TOD76] which is a close approximation to the

relational algebra. Although languages of this type are mostly procedural, it is

relatively easy to demonstrate their completeness along with the mathematical

properties of the relational algebra which can be readily applied to query

optimization and query decomposition.

(2) It not only serves as a benchmark for evaluating query languages in existing

systems, but also as the criterion for designing new languages for relational

DBMSs. A relational language will not have the necessary expressive power if

it is not relationally complete [ULL82].

(3) It provides a mathematical basis for transforming expressions in query decomposition and (logical or conceptual) query optimization. Because it is an algebra, the mathematical properties of the relational algebra can be explored precisely and systematically. For query languages construed as algebraic languages, these mathematical properties have a straightforward application [HAL76]. Query languages like SQUARE or SEQUEL, which have certain algebraic features, may also use these properties, since the parse of a query yields a tree in which some nodes represent relational algebra operators [AST76]. Even though a query language such as QUEL is a relational calculus language, its calculus-like expressions are translated into relational algebra expressions in the QUEL optimizer [WON76].

The total content proposed by Codd before 1979 on the relational model is referred to as Version 1 of the relational model (RM/V1), whose modeling capabilities were extended by Codd in 1979 [COD79] to version RM/T (T for Tasmania). Based on these two versions, Codd [COD90] introduces Version 2 of the relational model (RM/V2). The most important additional features in RM/V2 are as follows:

(1) A new treatment of items of data missing because they represent properties

that happen to be inapplicable to certain object instances.

(2) New features supporting all kinds of integrity constraints, especially user-defined integrity constraints.

(3) A more detailed account of view updatability.

(4) New features pertaining to the management of distributed databases.

It is important to recognize the fact that the hierarchical and network models as well as the relational model evolved during a time in which the primary applications of information systems were business-oriented. In attempts to apply these techniques to more complicated application areas such as CAD/CAM, CASE, and decision support, it has been found that the relational model is no longer adequate for modeling these advanced applications. The inadequacies of the relational model are summarized as follows. First, the relational model has limited modeling capabilities. When data are logically represented in the form of relations, the relationships among entities in these relations are represented by matching values of the attributes or keys in one relation with values of the attributes or foreign keys in other relations. The actual semantics among the data such as generalization and aggregation (the abstract data type) cannot be modeled by the relational model. Second, the relational model only models the structural aspects of entities and thus ignores their behavioral aspects (e.g., system-defined and user-defined operations). Third, in these advanced applications, the concept of data independence should be further extended to the concept of object encapsulation, i.e., not only should the logical representation of an object be separated from its physical representation, but its structural and behavioral properties should be logically encapsulated in its class. The object encapsulation concept cannot be realized in the relational model, since the data describing an entity may be logically scattered among several relations due to normalization [COD70, COD72b, BEE77, and ULL82]. Fourth, entities with complex structures and complicated relationships among entities are not representable by flat tables (relations). Finally, the relational model cannot represent and operate on entities with different (heterogeneous) structures.



2.2 Existing O-O Query Languages


An extensive literature search on query languages for accessing O-O databases

such as GEM [ZAN83, TSU84], ARIEL [MAC85], DAPLEX [SHI81], FAD

[BAN87], POSTQUEL [ROW87], EXCESS [CAR88], as well as other proposed

languages [STO84, DAD86, MAN86, SER86, BAN87, FIS87, BAN88, COL89,










SHA90] has been carried out. This section surveys a representative sample of

these languages. Most existing query languages have capabilities beyond those

provided by their theoretical bases. For example, the arithmetic operations and

aggregation functions provided by the relational languages are not available in the

relational algebra. Therefore, this survey is limited to those features which are

relevant to the proposed algebra.

To demonstrate the similarities and differences of these languages, the same

database schema as shown in Figure 2.1 is used for example queries written in

GEM, ARIEL, and DAPLEX. The sample schema of Figure 2.1 is for a government-owned

laboratory system where rectangles represent classes and edges (links)

represent attributes.

QUEL [STO76, WON76, ZOO77] is a tuple-calculus oriented query

language for the relational DBMS INGRES [STO76]. In order to avoid the ambiguity

which arises when two attributes of different relations having the same name are

addressed in a single query, QUEL uses a "dot" mechanism to qualify an attribute

of a relation (i.e., a dot is inserted between the name of the relation and the name

of the attribute). For example, Equipment.Name refers to the attribute Name of

the relation Equipment. Influenced by this mechanism, the existing O-O query

languages use similar notations for navigating the database schema from one class

to another or from one relation to other relations in systems which use relational

databases as their back-ends.

The language GEM [ZAN83,TSU84] is an extension of QUEL for the data

model DSIS which supports aggregation, generalization, and unique identification










of objects. In GEM, a class in an aggregation hierarchy that has a link emanating

to another class has the name of the latter class as the data type of one of its

attributes. For example, the class Lab has an attribute, Facility, of the type

Equipment, and has another attribute, Locality, of the type Location, and so forth. The

dot notation is used in GEM for navigating along the reference attributes (links) in

query formulation. The following GEM query retrieves the name of the manager,

the serial number of the equipment, and the address for each laboratory whose

headquarters is located in New York.


Range of Lab is Lab
Retrieve Lab.Manager.Name
Lab.Equipment.Serial#
Lab.Location.Address
Where Lab.Manager.Department.Headquarters.City = "New York"


This query returns a set of tuples in a tabular form. Each tuple contains

values for the manager's name, the equipment serial number, and the address of

the laboratory of interest.
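
The navigation expressed by the dot notation can be sketched in Python. This is a minimal illustration and not GEM itself: the dictionary-based objects, the sample values, and the helper `follow` are assumptions made for the example; only the attribute names are taken from the query above.

```python
# Hypothetical sample data: each object is a dict, and a reference
# attribute (link) simply holds another dict.
labs = [
    {
        "Manager": {"Name": "Smith",
                    "Department": {"Headquarters": {"City": "New York"}}},
        "Equipment": {"Serial#": "E-100"},
        "Location": {"Address": "12 Main St"},
    },
    {
        "Manager": {"Name": "Jones",
                    "Department": {"Headquarters": {"City": "Boston"}}},
        "Equipment": {"Serial#": "E-200"},
        "Location": {"Address": "7 Elm St"},
    },
]


def follow(obj, path):
    """Navigate from obj along a dotted path of attribute names."""
    for name in path.split("."):
        obj = obj[name]
    return obj


# Where-clause first, then the Retrieve list: the result is a flat
# table of tuples, not a structure of linked objects.
result = [
    (
        follow(lab, "Manager.Name"),
        follow(lab, "Equipment.Serial#"),
        follow(lab, "Location.Address"),
    )
    for lab in labs
    if follow(lab, "Manager.Department.Headquarters.City") == "New York"
]
print(result)
```

The point of the sketch is the shape of the result: the links followed during navigation are lost, and only a list of value tuples remains.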

In the approach described in Stonebraker et al. [STO84], the dot notation is

used in a manner similar to that found in GEM to implement the abstract data

type (ADT) concept. In addition, QUEL is used as a data type to facilitate the

navigation from one relation to another. A relation may have a field of type

QUEL which may contain expressions or commands (queries). Whenever the field

is addressed in a query, these expressions, in whole or in part, will be activated.

In general, if X is the tuple variable of the relation R1, Y is a field of type QUEL

in relation R1, and the query stored in Y retrieves field Z of another relation, R2,










then the expression X.Y.Z is a field in a collection of this view. In other words,

the expression will return the values of the Z field of tuples (in R2) that are

related to X through Y. For example, let the relation Manager have a field called

OfficeInfo of type QUEL which contains a query that retrieves the telephone

number of the relation Location. The expression Manager.OfficeInfo.Tel# returns

the telephone number for each manager in a tabular format. Clearly, the imple-

mentation of QUEL as a data type provides a way to relate data in two relations

without modifying the database schema.

Instead of using the dot notation, ARIEL [MAC85] takes advantage of the

"OF" notation. The example query described for GEM can be restated as


Range of Lab is Lab
Retrieve Name OF Manager OF Lab
Serial# OF Equipment OF Lab
Address OF Location OF Lab
Where City OF Headquarters OF Department OF Manager
OF Lab = "New York"


using the "OF" notation which is linguistically more natural than using the dot

notation. However, the result of this query is also represented by a flat table

(relation).

DAPLEX [SHI81] is a functional data language. The data retrieval com-

ponent of DAPLEX is similar to the languages described above, although it is

interpreted differently. In the functional paradigm, the class having a link (i.e.,

attribute) emanating to another class is considered as a function. The function

has, by default, the name of the class to which the link points. For example,










Location(Lab) and Department(Headquarters) represent the facts that Lab has

Location and Headquarters has Department as attribute, respectively. When the

function Location(Lab) is applied to an object of the class Lab, it returns a value

which is an object in the domain class over which the attribute is defined. If the

navigation is from one class to another through a sequence of classes, a nested

function is used. For instance, the expression Name(Manager(Lab)) specifies the

name of the manager of a laboratory for which the manager is responsible. For a

particular object of Lab, the manager of the laboratory is produced first; then, the

function Name() is applied to the returned manager and returns the name of the

manager. The example query can be expressed in DAPLEX as follows.


FOR EACH Lab
SUCH THAT City (Headquarters (Department (Manager (Lab))))
= "New York"
PRINT Name (Manager (Lab)),
Serial# (Equipment (Lab)),
Address (Location (Lab))


Even though DAPLEX is based on the functional paradigm, it returns data in the

form of a relation just like in GEM and in ARIEL.
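
The functional reading of attributes can be sketched as ordinary function composition in Python. The helper `attribute`, the dictionary-based objects, and the sample values are assumptions made for illustration; this is not DAPLEX syntax or semantics in full.

```python
def attribute(name):
    """Return a function that maps an object to the value of one attribute."""
    def apply(obj):
        return obj[name]
    return apply


# One function per link, named after the class the link points to.
Manager = attribute("Manager")
Department = attribute("Department")
Headquarters = attribute("Headquarters")
City = attribute("City")
Name = attribute("Name")

# Hypothetical Lab objects.
labs = [
    {"Manager": {"Name": "Smith",
                 "Department": {"Headquarters": {"City": "New York"}}}},
    {"Manager": {"Name": "Jones",
                 "Department": {"Headquarters": {"City": "Boston"}}}},
]

# FOR EACH Lab SUCH THAT City(Headquarters(Department(Manager(Lab)))) = "New York"
# PRINT Name(Manager(Lab))
names = [
    Name(Manager(lab))
    for lab in labs
    if City(Headquarters(Department(Manager(lab)))) == "New York"
]
print(names)
```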

Banerjee et al. [BAN88] introduce a query language based on message pass-

ing. In the message passing paradigm, the name of a link emanating from a class

is interpreted as the name of a message which is stored within that class. One can

assume there is actually a message created by the system and having, by default,

the same name as its corresponding attribute. When such a message is sent to an

instance of the class, it returns the value of the attribute. For example, the fol-










lowing is an expression for selecting a laboratory that has a manager who belongs

to a subordinate department of its New York headquarters.


(Lab SELECT :S (:S Manager Department
Headquarters City = "New York"))


SELECT in this expression is a message sent to the class Lab. The first

argument of SELECT is :S, an iteration variable. The SELECT message iterates

over the instances of the class Lab with :S bound to one instance at a time. The

block of code within the parentheses is the second argument of SELECT, and is

executed for each value bound to :S. In this particular block, the message

Manager is sent to the instance bound to :S in order to return the related Manager

instance. Similarly, Department and Headquarters are messages. To elaborate,

Department is sent to the returned Manager instance, Headquarters is sent to the

returned Department instance, and City is sent to the returned Headquarters

instance. The sign "=" is also a message which has the argument "New

York". When this message is sent to the resulting City value, it returns

a logical object TRUE or FALSE. An instance of Lab is qualified for the above

expression, if and only if the returned logical object is TRUE. The logical AND

or OR message can be sent to this object with an argument that specifies some

other condition on the instance of Lab. In principle, though not described in Ban-

erjee et al. [BAN88], similar message-based expressions can be used to retrieve

attribute values of the resulting Lab instance. The result of a query which

involves such conditions is the set of the instances of Lab along with their attribute










values and is represented in a tabular form.
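
The evaluation of the SELECT expression can be sketched in Python as follows. The functions `send`, `select`, and `block`, the dictionary-based instances, and the `Id` attribute are hypothetical stand-ins chosen for this illustration; they are not the interface described in [BAN88].

```python
def send(receiver, message, *args):
    """Deliver a message; attribute names and "=" are the only messages here."""
    if message == "=":
        return receiver == args[0]
    return receiver[message]


def select(instances, block):
    """Bind each instance to the block's variable; keep those answering True."""
    return [s for s in instances if block(s)]


def block(s):
    # (:S Manager Department Headquarters City = "New York")
    receiver = s
    for message in ("Manager", "Department", "Headquarters", "City"):
        receiver = send(receiver, message)
    return send(receiver, "=", "New York")


# Hypothetical instances of the class Lab.
labs = [
    {"Id": "lab1",
     "Manager": {"Department": {"Headquarters": {"City": "New York"}}}},
    {"Id": "lab2",
     "Manager": {"Department": {"Headquarters": {"City": "Boston"}}}},
]

selected = select(labs, block)
print([lab["Id"] for lab in selected])
```

Each message is sent to the object returned by the previous one, so the chain is evaluated from left to right before the comparison is made.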

As shown in the samples of these query languages, their query formulations,

though interpreted differently, are very similar to each other. This is evident in

the fact that the formulating of queries is accomplished by navigating the graphi-

cally represented database schema from class to class through their respective

links. In each of these languages, however, a query operates on a database that is

structurally represented using an O-O data model and returns a result whose

structure is represented in a tabular form. Consequently, the result of a query

cannot be further queried by other queries written in the same language. There-

fore, these languages are not closed.

Another drawback of these languages is seen in their navigation mechanisms

which can only formulate queries against classes (or relations) that are interre-

lated in simpler patterns like the linear and forest structures shown in Figure 2.2a.

However, in O-O databases, the graphical patterns in which objects are

interrelated with each other are basically networks which are not restricted to plane

graphs (a graph is a plane graph if it can be drawn on a plane without any inter-

section of two edges). They can be as complicated as surface graphs (a graph is a

surface graph if it can be drawn on a surface without any intersection of two

edges). Phrasing queries against classes that are interrelated in more complicated

patterns depicted in Figure 2.2b is beyond the capabilities of these languages.

A third drawback of these languages which renders their navigation mechanisms

insufficient is that only one type of the relationship (an object is related to

another object) between objects of two classes can be expressed. In fact, when










two classes are directly linked at the schema level, objects in these two classes

may have another type of relationship: an object is not related to another object.

This type of relationship represents the complement aspect of the semantics

specified for the two associated classes, such as not-a-part-of,

not-a-function-of, or is-not-a, which is often needed in querying the databases.

For example, "For each laboratory, list the equipment that is not available" is a

reasonable query.

The proposed query languages [DAD86, MAN86, BAN87, ROW87, CAR88,

COL89] use nested relations as their logical views of databases. A nested relation

is a generalized relation, i.e., a recursively defined relation: the attributes of a rela-

tion can be either atomic values or another relation in which the attributes can be

a third relation, and so forth. Figure 2.3 shows an example of a nested relation.

Nested relations are particularly suitable for representing data in forest structures.

The above languages are considered to be closed, since operators in these

languages operate on nested relations and produce nested relations. However,

they also have the drawbacks mentioned above, and it is our view that a nested

relation is not a proper logical representation for an O-O database, which consists of networks

of objects, object classes, and their associations. Using nested relations to

represent data in network structures introduces one level of indirection. Mapping

from a network representation to nested relations is an extra process. Further-

more, in order to use a nested relation to represent complex structures, a large

amount of data has to be replicated in the representation. Figure 2.4 shows an

example of using a nested relation to represent a graph having loops. Note that










vertex F has to be replicated three times.
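
The replication effect can be reproduced with a small Python sketch. The graph below is hypothetical (it is not the graph of Figure 2.4); it is chosen only so that one vertex, F, is shared by three other vertices. Unfolding the graph into a nested, tree-shaped value then copies F once per path that reaches it.

```python
from collections import Counter

# Hypothetical directed graph: F is shared by B, D, and E.
graph = {
    "A": ["B", "D", "E"],
    "B": ["F"],
    "D": ["F"],
    "E": ["F"],
    "F": [],
}


def nest(vertex):
    """Unfold the graph below vertex into a nested (tree-shaped) value."""
    return {"node": vertex, "children": [nest(c) for c in graph[vertex]]}


def count(nested, counter):
    """Count how many copies of each vertex the nested value contains."""
    counter[nested["node"]] += 1
    for child in nested["children"]:
        count(child, counter)
    return counter


copies = count(nest("A"), Counter())
print(copies["F"])  # → 3
```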



2.3 ENCORE O-O Data Model and Its Underlying Query Algebra


In spite of the popularity of the O-O paradigm and its application in the field

of database management, the existing O-O database management systems still

lack a solid mathematical foundation for the manipulation of an O-O database

and the optimization of queries. Recently, a query algebra [SHA90] was proposed

for the ENCORE O-O data model [ELM89]. This section surveys the query algebra

as well as the ENCORE model. It also serves as a comparison to the association

algebra proposed in this dissertation.



2.3.1 The ENCORE Model


The ENCORE O-O data model [ELM89] supports abstract data types, type inheritance,

typed collections of typed objects, objects with identity, and object encapsulation.

It models an application as networks of objects, object types, and their

associations. The definition of an abstract data type in this model includes the

Name of the type, a set of Properties defined for instances of the type, and a set of

Operations which can be applied to instances of the type. Properties reflect the

state of an object while operations may perform arbitrary actions. Properties are

typed objects that may be implemented as stored values, procedures, or functions.

The implementation of a property is invisible to the user and is assumed to return

an object of the correct type and to have no side-effects.









In addition to user-defined abstract data types and a collection of atomic

types such as Int, String, Boolean, etc. (i.e., primitive-classes), ENCORE provides

two parameterized types and a global Object type which is the supertype of all

other types. The parameterized type Set[T] defines T as the type, or supertype, of

objects in a collection having type Set, and T is called the member type of the set.

The parameterized tuple type associates types ($T_i$) with attribute names ($A_i$) and

defines properties Get-attribute-value and operations Set-attribute-value for each

attribute. The $T_i$'s can be any database types, thus allowing nesting of tuple types.

The value of a tuple is represented as $\langle (A_1, o_1), \ldots, (A_n, o_n) \rangle$, where the

A's are attributes of the tuple and the o's are objects of the corresponding types.

The global supertype Object defines a family of operations for equality called

i-equality where i indicates how "deeply" a comparison of two objects must search

before finding equality. Two objects are identical when they are the same object,

i.e., they have the same identity. Identical objects are 0-equal ($=_0$ or just $=$) and,

for $i > 0$, two objects are i-equal ($=_i$) if

(1) they are both collections of the same cardinality and there is a one-to-one

correspondence between the collections such that corresponding members are

$=_{i-1}$; or

(2) they both have the same type (not a collection type) and the values of

corresponding properties are $=_{i-1}$.

Type Object also defines a stronger notion of equality called id-equality.

Two objects are id-equal at depth i if they are i-equal and graphical representa-

tions of the objects are isomorphic.
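
The recursive definition of i-equality given above can be sketched in Python. The class `Obj`, the use of `frozenset` for collections, and the treatment of plain Python values as self-identifying atomic objects are assumptions made for this illustration. The one-to-one correspondence is found by brute force over permutations, which is only reasonable for very small collections.

```python
from itertools import permutations


class Obj:
    """A hypothetical non-collection object: a type name plus properties."""

    def __init__(self, type_name, **props):
        self.type_name = type_name
        self.props = props


def is_atomic(x):
    return not isinstance(x, (Obj, frozenset))


def i_equal(a, b, i):
    """Return True if a and b are i-equal in the sense sketched above."""
    if a is b:
        return True  # identical objects are equal at every depth
    if is_atomic(a) and is_atomic(b):
        return a == b  # assumption: atomic values are self-identifying
    if i == 0:
        return False
    if isinstance(a, frozenset) and isinstance(b, frozenset):
        if len(a) != len(b):
            return False
        members = list(a)
        # Look for a one-to-one correspondence of (i-1)-equal members.
        return any(
            all(i_equal(x, y, i - 1) for x, y in zip(members, perm))
            for perm in permutations(b)
        )
    if isinstance(a, Obj) and isinstance(b, Obj):
        return (
            a.type_name == b.type_name
            and a.props.keys() == b.props.keys()
            and all(i_equal(a.props[k], b.props[k], i - 1) for k in a.props)
        )
    return False


addr1 = Obj("Addr", City="Boston")
addr2 = Obj("Addr", City="Boston")
sup1 = Obj("Supplier", Address=addr1)
sup2 = Obj("Supplier", Address=addr2)

print(i_equal(addr1, addr2, 0), i_equal(addr1, addr2, 1))
print(i_equal(sup1, sup2, 1), i_equal(sup1, sup2, 2))
```

The two address objects are distinct but 1-equal, so the two suppliers that refer to them are not 1-equal but are 2-equal: each extra level of reference requires one more level of depth.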









2.3.2 The Underlying Query Algebra of ENCORE


The query algebra [SHA90] is proposed based on the O-O model ENCORE.

The domain of the query algebra is defined as a typed collection of typed objects.

A typed collection is of the parameterized type Set[T] and the objects in the collection

are of type T. If objects of a collection are collected from different types, T is

their most specific common supertype in the type lattice. For example, if object s is of

type S, object p is of type P, and S is a supertype of P, the collection of objects s

and p is of type Set[S]. The query algebra is closed since the operators of the

query algebra operate on collection(s) of objects with type Set[$T_i$] and produce a

collection with type Set[$T_k$], where type $T_k$ is defined by the query.

Similar to the languages surveyed in Section 2.2, the query algebra addresses

a property of an object using 'dot' notation (e.g., a.p.q where a is an object of type

T1, p is a property of a and is of type T2, and q is a property of p and is of type

T3).

Twelve operators are defined in this algebra. We give their brief definitions

followed by some example queries to illustrate the major concepts of this algebra.

(1) The Select operation creates a collection of objects which satisfy a selection

predicate.

$\mathrm{Select}(S, p) = \{\, s \mid (s \in S) \land p(s) \,\}$

where p is the predicate.

(2) The Image operation is used to return a single object for each object in the

queried collection and has the form:










$\mathrm{Image}(S, f : T) = \{\, f(s) \mid s \in S \,\}$

where S is a collection of objects and f returns an object of type T.

(3) The Project operation extends Image by allowing the application of many

functions to an object, thus supporting the creation and maintenance of

selected relationships between objects. The relationships are stored as tuples

with Tuple type.

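$\mathrm{Project}(S, \langle (A_1, f_1), \ldots, (A_n, f_n) \rangle) =$
$\{\, \langle (A_1, f_1(s)), \ldots, (A_n, f_n(s)) \rangle \mid s \in S \,\}$

where S is of type Set[T], the $A_i$'s are unique attribute names, and each $f_i$

takes a single input of type T and returns an object of type $T_i$. Project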

returns one tuple for each object in the collection being queried. Each newly

created tuple is a new object with unique object identifier.

(4) The Ojoin operator is an explicit join operator used to create relationships

which are not defined between objects of two collections in the database. It is

essentially a Cartesian product of collections of objects, followed by a selec-

tion of result tuples. For collections S and R, the Ojoin is defined as follows:

$\mathrm{Ojoin}(S, R, A_i, A_j, p) =$
$\{\, \langle (A_i, s), (A_j, r) \rangle \mid s \in S \land r \in R \land p(s, r) \,\}$

where p is a predicate (as in Select) defined over objects from S and R. The

Ojoin operation creates new tuples in the database to store the generated

relationships. The tuples created will have unique object identifiers.

(5) Union, Difference, and Intersection are the usual set operations with object

comparisons and set membership based on object identity ($=_0$). The result of










these operations is considered to be a collection of objects of type T, where T

is the most specific common supertype (in the type lattice) of the types of the

objects in the operands.

(6) The Flatten operation is used to restructure sets of sets, and Nest and UnNest

allow the representation of tuples as flat or nested relations.

(7) For the above operators, two identical operations do not give identical

responses, since each result collection is a newly identified object in the database

and the objects in a result collection may be either existing database

objects or new tuple objects created during the operation. Operators DupEl-

iminate and Coalesce are introduced to handle situations where equal objects

are created by a query.
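
The first four operators can be sketched in Python as follows. This is a simplified illustration under stated assumptions: Python lists stand in for typed sets, dictionaries stand in for tuple objects, object identifiers and type checking are not modeled, and the sample data (including the flat `City` attribute) are hypothetical.

```python
def select(S, p):
    """Select: the objects of S that satisfy the predicate p."""
    return [s for s in S if p(s)]


def image(S, f):
    """Image: one result object f(s) for each object s in S."""
    return [f(s) for s in S]


def project(S, fields):
    """Project: one tuple per object, with one attribute per (name, function)."""
    return [{name: f(s) for name, f in fields} for s in S]


def ojoin(S, R, a_s, a_r, p):
    """Ojoin: Cartesian product of S and R followed by a selection on p."""
    return [{a_s: s, a_r: r} for s in S for r in R if p(s, r)]


# Hypothetical sample data.
parts = [{"Num": "p1", "Color": "Red"}, {"Num": "p2", "Color": "Blue"}]
jobs = [{"Num": "j1", "City": "Boston"}]
suppliers = [{"Ident": "s1", "City": "Boston"},
             {"Ident": "s2", "City": "Miami"}]

red = select(parts, lambda p: p["Color"] == "Red")
nums = image(parts, lambda p: p["Num"])
pairs = project(parts, [("N", lambda p: p["Num"]),
                        ("C", lambda p: p["Color"])])
local = ojoin(jobs, suppliers, "J", "S",
              lambda j, s: j["City"] == s["City"])
print(red, nums, pairs, local, sep="\n")
```

Because every function takes collections and returns a collection, the result of one call can be passed to another, which is the closure property claimed for the algebra.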

The example queries are issued against the Supplier-Parts-Job database

shown in Figure 2.5. For the purpose of these examples, it is assumed that Type

Object is the only supertype for each of the given types.

Example 1: Find all red parts. Which suppliers can supply all of the red parts?

P-red := Select(Parts, λp p.color = "Red")
S-Pred := Select(Suppliers, λs P-red subset-of s.Inventory)

The first selection finds the red parts and the second selection finds all sup-

pliers for which the inventory includes that set of parts. The subset-of operation

is available since property Inventory and result P-red both have type Set[Part].

Example 2: What parts are needed by jobs in Boston?

BosJobs := Select(Jobs, λj j.address.city = "Boston")
BosJobParts := Project(BosJobs, λj <(J,j),(Pt,j.PartsNeeded)>)










The select operation finds the jobs in Boston and the project operation gives

information about which parts are needed for each job in Boston. The result of

the projection is of type Set[Tuple]. Note that operation NewPart (of type Job)

cannot be applied to members of BosJobParts, since they have type Tuple. How-

ever, it is appropriate for objects BosJobParts.J.

Example 3: Find all local suppliers for each job.

LocalS := Ojoin(Jobs, Suppliers, J, S, λj λs
j.address.city = s.address.city)

This Ojoin operation produces a set of tuples of type <(J,Job),(S,Supplier)>,

which is similar to a normalized relation. To get a set of suppliers for each job, a

Nest operation needs to be applied: Nest(LocalS, S).
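
The effect of the Nest step can be sketched in Python. The function `nest` below is an assumption about the intended behavior (grouping tuples that agree on all other attributes and collecting the values of the nested attribute); dictionaries stand in for tuples, and the job and supplier identifiers are hypothetical.

```python
def nest(tuples, attr):
    """Group tuples that agree on every attribute except attr."""
    groups = {}
    for t in tuples:
        key = tuple(sorted((k, v) for k, v in t.items() if k != attr))
        groups.setdefault(key, []).append(t[attr])
    return [dict(key, **{attr: members}) for key, members in groups.items()]


# A flat result in the shape produced by the Ojoin of Example 3.
LocalS = [
    {"J": "j1", "S": "s1"},
    {"J": "j1", "S": "s2"},
    {"J": "j2", "S": "s3"},
]

nested = nest(LocalS, "S")
print(nested)
```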

From the above description, we can see that the query algebra supports

many features of O-O databases and has taken significant steps towards a powerful

O-O query algebra to serve as the mathematical foundation for O-O databases.

However, it still has the following limitations.

(1) Although ENCORE models an application as networks of types, objects,

and their associations, the domain of its underlying query algebra is defined as

collections of objects having type Set[T], which is essentially a nested relation

representation, since the member type T of the set type can be a parameter-

ized Tuple type which may in turn contain attributes of Tuple types. There-

fore, the query algebra cannot represent network-structured relationships

among objects efficiently and the mapping problem addressed before still

remains.










(2) In this algebra, two identical expressions or two identical operations in a single

expression do not give identical responses, since each result collection is a

newly identified object in the database. To eliminate duplicated copies of the

same newly created object, the algebra introduces DupEliminate and

Coalesce operations, which would not be necessary if it directly supported the network

view of O-O databases.

(3) In this algebra, a collection may contain objects with heterogeneous struc-

tures. For example, two objects are both of Tuple type but with different

arities and the union of the two objects is also a collection of objects having

Tuple type. However, other operators in this algebra are not defined to

operate on such collections.

(4) Since the query algebra is developed for a specific model (i.e., ENCORE), it is

difficult to apply to other O-O models.























































Figure 2.1 A sample schema























Figure 2.2 Simple and complex query patterns: (a) simple query patterns; (b) complex query patterns (plane graphs, surface graphs)























































Figure 2.3 An example of a nested relation















Figure 2.4 Using a nested relation to represent a complex structure (vertex labels in the figure: A(a1), B(b2), D(d3), E(e2), F(f5), G(g1), H(h6))










Type Supplier
  properties:
    Ident: string
    Address: Addr
    Inventory: Set[Part]
  operations:
    RecvOrder: Supplier, Set[Part] --> Supplier


Type Job
  properties:
    Num: string
    Address: Addr
    PartsNeeded: Set[Part]
    Preferred_Suppliers: Ordered_list[Supp
  operations:
    NewPart: Job, Part --> Job


Type Part
  properties:
    Num: string
    Address: Addr
    Color: string
    Components: Set[Tuple[<(P, Part), (Qty, Int)>]]
    Plan: drawing
    BillofMaterial: list[Part]
  operations:
    Order: Part --> Part
    SamePart: Part, Part --> Boolean


Type Addr
  properties:
    Street: string
    City: string
    State: string


Figure 2.5 A Supplier-Parts-Job database













CHAPTER 3
OVERVIEW OF O-O DATABASES
AND ASSOCIATION-BASED QUERY FORMULATION


This chapter informally introduces the graphical view of O-O databases and

illustrates the association-based query formulation mechanism. The graphical

view captures the most important characteristics of O-O databases in which

object classes and their objects are associated with each other. Based on this

view, query formulation and processing can be made by specifying and manipulat-

ing association patterns in which objects are inter-related with each other, unlike

the traditional attribute-based query formulation and processing which match

values in different relations. Since the graphical view is suitable for many O-O

data models, the association algebra developed based on this view can be used as a

general algebra for supporting these O-O databases. The graphical view of O-O

databases is formalized in the next chapter.



3.1 Overview of O-O Databases


O-O semantic data models provide a conceptual basis for defining O-O databases.

Although each model has some unique constructs that distinguish one

model from the others, there are several common structural and behavioral pro-

perties based on which an algebra can be developed and used to support these

models:










First, objects are physical entities, abstract concepts, events, processes, func-

tions or anything that an application cares to capture and represent.

Second, objects having the same structural and behavioral properties are

grouped together to form an object class. Object classes can be categorized into

two general categories: (1) the nonprimitive-class which represents a set of objects

of interest in an application world, each of which is assigned a system-wide unique

object identifier (OID) and its data are explicitly entered in a database by the

user; and (2) the primitive-class which represents a class of self-named objects

serving as a domain for defining other object classes, such as a class of symbols or

numerical values. The behavioral properties of an object class are defined in

terms of system-defined or user-defined operations (e.g., retrieve, display, delete,

insert, rotate a design object, hire an employee, etc.), which can meaningfully

operate on its objects using their corresponding programs (or methods). The

structural properties of an object class and, thus, its objects consist of two types of

data: (1) descriptive data (or instance variables) which define the states of the

objects; and (2) association data which specify the relationships between its

objects and the objects of some related classes.

Third, different O-O models recognize different types of associations. Two of

the most commonly recognized associations are aggregation and generalization.

Aggregation models the a-part-of, a-function-of, or a-composition-of relation-

ship. For instance, a complex object can be modeled by an aggregation hierarchy

(abstract data type) in which a complex object is defined in terms of its associa-

tions with objects in other defined classes. Generalization models the is-a or the










superclass-subclass relationship in which an object in a subclass inherits both the

structural and the behavioral properties of its superclass(es).

Thus, from the algebra point of view, an O-O database can be viewed as a

collection of objects, grouped together in classes and interrelated through associa-

tions. It can be represented by graphs at both the intensional and the extensional

levels. At the intensional (schema) level, a database is defined by a collection of

inter-related object classes and is represented by a Schema Graph (SG). For

example, the SG for a university database is illustrated in Figure 3.1, in which

each rectangle denotes a nonprimitive-class such as a class of person objects or a

class of department objects, and each circle denotes a primitive-class such as a

class of names or ages. The associations among classes are represented by the

edges in SG. For example, there is an association between the class Course and

the class Department (an Aggregation association), and an association between the

class Person and the class Student (a Generalization association). Since the

semantic distinctions of these and other association types recognized by different

semantic models can be either hard-coded in a DBMS or declaratively specified by

some rules and used by a rule processor to govern the manipulation of the associ-

ated classes, the underlying algebra does not have to incorporate the semantics of

these association types. All it has to be concerned with is whether or not an

object class and its objects are associated with some other classes and their

objects, i.e., the edges (or associations) are type-less in SG. For example, the

semantics of inheritance can be incorporated in a query language translator which

translates a high-level language statement into its underlying algebraic representa-









tion. The algebra does not have to deal directly with the semantics of inheritance.

This is particularly important if the algebra is to be used as a general algebra for

supporting various O-O data models in which the semantics of an association type

may have slightly different meanings.

At the extensional (instance) level, a database can be viewed as a collection

of objects, grouped together in classes and inter-related through some type-less

associations; and as such it can be represented by an Object Graph (OG). For

example, the OG corresponding to a portion of the university schema graph is

shown in Figure 3.2. In this example, the Teacher object t4 is associated with two

Section objects; thereby representing the fact that he/she is teaching two sections,

sc3 and sc4. The Student object sl is associated with Undergrad object ul which,

in turn, is associated with Department object dl; thereby representing that sl is

an undergraduate student who minors in the department dl. Finally, the Section

object sc2 is not associated with any object of the Student class, which represents

the fact that it is not taken by any student. Object associations expressed by

different graph patterns represent the semantic relationships among these objects

in an application world.
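
The two graph levels can be sketched in Python as sets of type-less edges. The representation (frozensets for undirected edges, a `class_of` map, and the helper `associated`) is an assumption made for illustration, and only the objects and associations mentioned in the text above are included; the full graphs of Figures 3.1 and 3.2 are not reproduced.

```python
# Schema graph (intensional level): classes joined by type-less edges.
schema_edges = {
    frozenset(edge)
    for edge in [
        ("Teacher", "Section"),
        ("Section", "Student"),
        ("Student", "Undergrad"),
        ("Undergrad", "Department"),
    ]
}

# Object graph (extensional level): objects, their classes, and edges.
class_of = {
    "t4": "Teacher",
    "sc2": "Section", "sc3": "Section", "sc4": "Section",
    "s1": "Student", "u1": "Undergrad", "d1": "Department",
}
object_edges = {
    frozenset(edge)
    for edge in [("t4", "sc3"), ("t4", "sc4"), ("s1", "u1"), ("u1", "d1")]
}


def associated(obj, cls):
    """Return the objects of class cls that are directly associated with obj."""
    result = set()
    for edge in object_edges:
        if obj in edge:
            (other,) = edge - {obj}
            if class_of[other] == cls:
                result.add(other)
    return result


# Every object edge must join classes that are adjacent in the schema graph.
conforms = all(
    frozenset(class_of[o] for o in edge) in schema_edges
    for edge in object_edges
)

print(associated("t4", "Section"))   # the two sections taught by t4
print(associated("sc2", "Student"))  # empty: sc2 is not taken by any student
print(conforms)
```

Note that the absence of an edge is also information: the empty result for sc2 is what later allows queries about objects that are not associated with each other.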



3.2 Pattern-based Query Formulation


Based on this view of an O-O database, users can query the database by

specifying patterns of object associations as search conditions. Once these

objects are selected, they can be further processed by either system-defined

operations (Retrieval, Display, Update, Insert, Delete, etc.) or user-defined










operations (RotatePart, PurchasePart, HireFaculty, etc.). For example, the fol-

lowing queries can be issued against the university database as illustrated in Fig-

ures 3.1 and 3.2 (the algebraic expressions for these queries will be given in Section

4.4).


Query 1: For all sections, get the majors of students who are taking these
sections.

To satisfy this query, we can specify a linear pattern containing the classes

Section, Student, and Department as shown in Figure 3.3a. In this pattern, a cir-

cle represents a class and an edge represents that the objects of the two adjacent

circles (classes) must be associated with each other. This pattern is called an

intensional pattern which represents that sections taken by students who major in

some departments are to be identified. The answer to this query can be found in

Figure 3.2 by checking if the objects of these three classes satisfy such a pattern.

There are five object patterns (called extensional patterns) which satisfy the inten-

sional pattern as shown in Figure 3.3b. The Section object sc2 and the Student

object s3 do not appear in these extensional patterns, since sc2 is not taken by any

student and s3 does not have a major yet. These patterns can also be identified in

two sequential steps. First, get all the patterns in which the Section objects are

associated with the Student objects. Then, if a pattern generated in the first step

(i.e., a Section-Student pair) is further associated with an object of Department, a

new pattern consisting of three objects is constructed and retained in the result;

otherwise, the pair is dropped.
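
The two sequential steps described above can be sketched in Python. This is only a minimal illustration under assumed data: the edge lists and the function name match_linear_pattern are hypothetical and are not taken from Figure 3.2, although the object names follow the text.

```python
# Toy object graph. Object names follow the text; the edge lists are assumed
# for illustration and are not taken from Figure 3.2.
takes = {("sc1", "s1"), ("sc3", "s2"), ("sc3", "s3"),
         ("sc3", "s4"), ("sc3", "s5"), ("sc4", "s7")}   # Section-Student
majors = {("s1", "d1"), ("s2", "d3"), ("s4", "d3"),
          ("s5", "d4"), ("s7", "d6")}                   # Student-Department


def match_linear_pattern(takes, majors):
    """Find Section-Student-Department patterns in two sequential steps."""
    # Step 1: every Section-Student association is a candidate pair.
    pairs = sorted(takes)
    # Step 2: extend each pair with the departments its student majors in.
    # A pair whose student has no major yields nothing, i.e. it is dropped.
    return [(section, student, dept)
            for section, student in pairs
            for major_student, dept in sorted(majors)
            if major_student == student]


patterns = match_linear_pattern(takes, majors)
for pattern in patterns:
    print(pattern)
```

In this toy data the pair (sc3, s3) is produced in the first step but dropped in the second, because s3 has no Department edge.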










Once these objects (as well as their associations) have been identified,

different system-defined or user-defined operations defined on their corresponding

classes can be applied to these selected objects. For example, Inform(Department)

can be an operation defined on the class Department. It sends each of the selected

departments a letter concerning the majors of the students.

Suppose there is a rule in the university that a student cannot major and

minor in the same department. To check whether there is such a case in the

database, the following query can be issued.


Query 2: List students who major and minor in the same department.

The intensional pattern for this query is shown in Figure 3.3c. It can be

formed by starting from the class Student and navigating the schema in two

traversal paths (refer to Figure 3.1). One path is from Student to Department,

which means that a student majors in a certain department; and the other path is

from Student to Department through Undergrad, which means that a student is

an undergraduate and minors in a certain department (we can see from the SG

that only undergraduates may have minors). According to the query, a single stu-

dent should associate with objects in both Undergrad and Department and these

two paths should merge at Department, thereby forming a loop. This implies two

logical AND conditions, one at the Student class and the other at the Department

class. We use double arcs to denote such conditions as shown in Figure 3.3c.

From Figure 3.2, we can see that the student s1 has his major and minor in the

department d1. This extensional pattern is depicted in Figure 3.3d.
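
The loop pattern with its two AND conditions can be illustrated by a short Python sketch. The edge lists below are assumed toy data and the function name major_minor_loops is hypothetical; only the object names s1, u1, and d1 come from the text.

```python
# Assumed toy data: the major path Student-Department and the minor path
# Student-Undergrad-Department.
majors = {("s1", "d1"), ("s2", "d3"), ("s4", "d3")}
is_undergrad = {("s1", "u1"), ("s3", "u3"), ("s4", "u4")}
minors = {("u1", "d1"), ("u3", "d2"), ("u4", "d2")}


def major_minor_loops(majors, is_undergrad, minors):
    """Return (student, undergrad, department) loops of the Query 2 pattern."""
    loops = []
    for student, undergrad in sorted(is_undergrad):
        for minor_undergrad, dept in sorted(minors):
            # AND at Student: the same student starts both branches.
            # AND at Department: both branches end at the same object.
            if minor_undergrad == undergrad and (student, dept) in majors:
                loops.append((student, undergrad, dept))
    return loops


loops = major_minor_loops(majors, is_undergrad, minors)
print(loops)
```

Student s4 is rejected in this toy data because the two branches end at different Department objects (d3 and d2).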










Query 3: For those students taking section 300 and having majors and/or
minors, get their majors and/or minors.

There are several ways to form an intensional pattern for the query. We

may start from Section# and traverse to Student through Section and, then, navi-

gate the schema in two paths as we did for query 2. According to the query, a

student who either has a major or a minor should be included in the result (in this

database, it is assumed that graduate students do not have minors). This means

that either path of the navigation will construct a pattern that would satisfy the

query. Thus, a logical OR condition exists at Student. We use a single arc to

indicate the OR condition as shown in Figure 3.4a. Like Query 2, these two

branches merge at Department. However, this query does not require that they

merge at the same Department object. This is specified by the second OR condi-

tion at Department in Figure 3.4a.

The extensional patterns that satisfy this query have heterogeneous struc-

tures: two types of linear patterns as shown in Figure 3.4b. The first type includes

patterns that represent the minors of the undergraduates; and the second type

includes patterns that represent the majors of the students who are either

undergraduates or graduates. In both types of patterns, a student is associated with

section 300, which is assumed to be the Section# for sc3. Figure 3.4c will be

described later in Section 4.4.

We have given some example queries which specify how objects are associated

with one another. In the graphical representation of an O-O database, when

there is no edge between two objects even though there is one between their

classes, it implies that two objects are not associated with each other. This










represents the complement aspect of the semantics between two associated classes.

It is necessary to allow a user to retrieve this type of object non-association from a

database. The following query is such an example. It can also be specified by a

pattern.


Query 4: For each teacher, list the sections which he/she does not teach.

We use a dashed line to represent the fact that two objects are not associated

with each other. Therefore, the intensional pattern for this query can be drawn as

in Figure 3.4d. There are twelve extensional patterns that match the intensional

pattern. Figure 3.4e shows a portion of them. Non-association relationships

among objects are not explicitly stored in a database. However, they can be

derived during the processing of queries of this type.
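
Deriving non-associations at query time can be sketched as follows. The object lists, the teaches edges, and the function name non_associations are assumptions made for illustration; they are not the contents of Figure 3.2.

```python
from itertools import product

# Assumed toy extension of the Teacher and Section classes.
teachers = ["t1", "t2", "t3", "t4"]
sections = ["sc1", "sc2", "sc3", "sc4"]
teaches = {("t1", "sc1"), ("t2", "sc2"), ("t4", "sc3"), ("t4", "sc4")}


def non_associations(left, right, edges):
    """Pairs of objects from two associated classes that share no edge."""
    # Nothing is stored for a non-association; it is whatever remains of the
    # class-level cross product after the stored associations are removed.
    return [(a, b) for a, b in product(left, right) if (a, b) not in edges]


not_taught = non_associations(teachers, sections, teaches)
print(len(not_taught))
```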

Using the above examples, we hope that we have convinced the reader that

the pattern-based query formulation is suitable for query specification based on a

graphical view of an O-O database.



3.3 Conclusion

The (type-less) graphical representation of O-O databases is applicable to

most O-O data models, since it captures the essential characteristics of O-O data

models in which object classes as well as their objects are inter-related with each

other in different association patterns. Querying such databases can be done by

specifying patterns in which objects of interest are associated with each other. It

should be clear that this formulation is quite different from the attribute-based

query formulation in the existing relational query languages which is based on










matching the attributes (or the key or composite key) of one relation with the

attributes (foreign keys) in other relations. A query that requires the specification

of a complex pattern of object associations can be specified in a rather straightfor-

ward manner in an association-based language, whereas in an attribute-based

language, complex nestings of query blocks or multiple queries would be required

[ALA89a].

It is our view that an algebra developed for processing data based on the

graphical view of O-O databases and the pattern-based query formulation should

satisfy the following requirements. First, it should allow direct manipulation of

complex patterns of object associations. Second, the closure property should be

maintained. Third, both association and non-association relationships among

objects should be expressible as search conditions. Fourth, it should be complete

in the sense that it can be used to describe all possible patterns in a database.

Lastly, it must be able to represent and process patterns with both homogeneous

and heterogeneous structures.




































Figure 3.1 Schema graph of a university database














Figure 3.2 Object graph (classes shown: Teacher, Section, Section#, Student, Department)













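Figure 3.3 Pattern specifications for Query 1 and Query 2

(a) Query 1, intensional pattern: Section, Student, Dept
(b) Query 1, extensional patterns: (sc1, s1, d1), (sc3, s2, d3), (sc3, s4, d3), (sc3, s5, d4), (sc4, s7, d6)
(c) Query 2, intensional pattern: Student, Undergrad, Dept
(d) Query 2, extensional pattern: s1, u1, d1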












Figure 3.4 Pattern specifications for Query 3 and Query 4

(a) Query 3, intensional pattern: Section# [300], Section, Student, Undergrad, Dept
(b) Query 3, extensional patterns
(d) Query 4, intensional pattern: Teacher, Section
(e) Query 4, a portion of the extensional patterns














CHAPTER 4
ASSOCIATION ALGEBRA


The association algebra (A-algebra) is defined based on a uniform representation

of an O-O database in terms of objects, object classes, and type-less associations,

as described in Chapter 3. The algebra contains a number of operators

which operate on graph structures of object associations to produce graph struc-

tures. The closure property of the algebra ensures that the result of a query can

be further manipulated by other queries.



4.1 Definitions


First, we formally define an O-O database at both schema and object levels.

Schema Graph (the intensional database):

The schema graph of an O-O database is defined as SG(C,A), where $C=\{C_i\}$
is a set of vertices representing object classes; A is a set of edges, each of
which, $A_{ij}(k)$, represents an association between classes $C_i$ and $C_j$, where k is a
number for distinguishing the edges from one another when there is more
than one edge between two vertices.

Object Graph (the extensional database):

The object graph of an O-O database is defined as OG(O,E), where $O=\{O_{ij}\}$
is a set of vertices representing object instances ($j$th object in class $C_i$); and
$E=\{O_{ij} \stackrel{k}{-} O_{mn}\}$ is a set of edges representing the associations among object
instances. When one object instance is connected with another in the object
graph, a regular-edge (solid line) is drawn between the corresponding vertices
as $O_{ij} \stackrel{k}{-} O_{mn}$, which specifies that the $j$th object instance in class $C_i$ is
related to the $n$th object instance in class $C_m$ through the $k$th association of
classes $C_i$ and $C_m$. If two object instances $O_{ij}$ and $O_{mn}$ are not connected
in the object graph but their classes $C_i$ and $C_m$ in the corresponding SG are










directly connected, a complement-edge (dotted line) is drawn between them
and is denoted by $O_{ij} \cdots O_{mn}$.

In this O-O model, an object may participate in several classes (e.g., in a

generalization hierarchy). Its representation in a class is called an object instance.

Since in most cases in this dissertation, "object" and "object instance" can be used

interchangeably without any ambiguity, we shall use "object" unless a distinction

is required between the two.

The reason for explicitly introducing complement-edges into the OG is to

allow the A-algebra to manipulate both association and non-association between

objects of two adjacent classes. In an actual O-O database, it is not necessary to

explicitly store the complement-edges. Figure 4.1 illustrates the regular-edges and

complement-edges among the objects of three object classes. For example, we see

that section sc1 is taken by students s2 and s3 (regular-edges) and not taken by

students s1 and s4 (complement-edges).

The relationship between an OG and its corresponding SG is formally

described by the following proposition.

Proposition 1: An OG(O,E) is a morphism of its corresponding SG(C,A).
The mapping function $F_m$ is defined as

$F_{m1}: C_i \Rightarrow \{O_{ij}\}$, and
$F_{m2}: A_{im}(k) \Rightarrow \{O_{ij} \stackrel{k}{-} O_{mn}\}$.

The mapping between SG and OG is one-to-many, since a database is

dynamically changing and may have different instantiations at different times for

the same schema graph.










To define "association pattern", we first extend the concept of connected

graph in graph theory by treating complement-edges as edges, i.e., a connected

graph is a graph in which there exists at least one path between any two vertices

and each path may contain regular-edges, complement-edges, or a combination of

the two. We shall from now on use an upper-case letter to denote a class and the

corresponding lower-case letter with a subscript to denote an object instance in

that class. We shall assume that there is only one edge between any two vertices

in SG unless otherwise specified so as not to complicate the notation.


Association Pattern:

A connected subgraph of an OG is an association pattern (or pattern for
short).

By this definition, a single vertex (or object instance) in OG, which is a con-

nected subgraph, is also a pattern. We call it an Inner-association-pattern (or

Inner-pattern for short). It is algebraically represented by $(a_i)$ for a vertex of class

A in SG. Thus, object instances are treated as Inner-patterns in the A-algebra. A

regular-edge together with two vertices (i.e., two Inner-patterns) it connects is

called an Inter-association-pattern (or Inter-pattern) which is represented by (ai0b).

A complement-edge together with the two Inner-patterns it connects is called a

Complement-association-pattern (or Complement-pattern) and is represented by

(acbj). This pattern states that a, and b, are not associated with each other in OG.

If a path between vertices $a_i$ and $b_j$ consists of only regular-edges, it can be

represented by a Derived-inter-association-pattern (D-inter-pattern), denoted by

(aibj); otherwise, it can be represented by a Derived-complement-association-









pattern (D-complement-pattern), denoted by (aib,). When a path is represented

by a derived pattern, it simply means that two vertices are indirectly associated or

non-associated but how they are interrelated (the actual path) is of no importance.

A D-inter-pattern is treated as an Inter-pattern and a D-complement-pattern is

treated as a Complement-pattern in the algebraic operations.

The above five types of patterns are the primitive patterns, the latter four

being binary patterns. Their graphical and algebraic representations are summar-

ized in Figure 4.2a. All other connected subgraphs are called complex patterns.

For example, the complex pattern shown in Figure 4.2b1 contains three primitive

patterns: two Inter-patterns $(a_1 b_1)$ and $(b_1 d_1)$, and a Complement-pattern $(b_1 c_1)$. It

can be uniquely defined by its algebraic representation as a set of primitive

patterns, i.e., $(a_1 b_1, b_1 c_1, b_1 d_1)$. More examples of complex patterns are shown in Figure

4.2b. From these examples, one can observe that a complex pattern can be

decomposed into a set of binary patterns which cannot be further decomposed.

This implies that, in the algebraic representation of a complex pattern, an Inner-

pattern may not occur as an element and a binary pattern may appear only once.

A pattern in this algebraic format is called a normalized pattern, otherwise it is

called an unnormalized pattern. $(b_1, b_1 c_1)$, $(b_2, b_2)$, and $(a_1 b_1, b_1 c_2, a_1 b_1)$ are examples

of unnormalized patterns. During the process of constructing an association

pattern, we always normalize it by eliminating the duplicates. The above three

patterns have the normalized forms of $(b_1 c_1)$, $(b_2)$, and $(a_1 b_1, b_1 c_2)$, respectively.
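
Normalization can be sketched in Python under an assumed encoding that is not part of the algebra itself: an Inner-pattern is a 1-tuple such as ("b1",), and a binary pattern is a 3-tuple (kind, x, y) where kind is the hypothetical tag "inter" or "comp". The function name normalize is likewise an assumption.

```python
def normalize(primitives):
    """Drop duplicate binary patterns and Inner-patterns already covered."""
    binary, inner = set(), set()
    for prim in primitives:
        if len(prim) == 1:
            inner.add(prim[0])
        else:
            kind, x, y = prim
            # A pattern is non-directional, so (x, y) and (y, x) are the same.
            binary.add((kind,) + tuple(sorted((x, y))))
    # An Inner-pattern may not occur next to a binary pattern containing it.
    covered = {obj for _, x, y in binary for obj in (x, y)}
    return frozenset(binary | {(obj,) for obj in inner - covered})


print(normalize([("b1",), ("inter", "b1", "c1")]))
print(normalize([("b2",), ("b2",)]))
print(normalize([("inter", "a1", "b1"), ("comp", "b1", "c2"),
                 ("inter", "b1", "a1")]))
```

A set is used for the result because the order of primitive patterns within a pattern carries no meaning.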

The definitions of OG and association pattern imply that a pattern is a

non-directional graph, i.e., $(a_i b_j) = (b_j a_i)$, and that the sequence of primitive patterns in










the algebraic representation of a complex pattern is not important, hence

$(a_i b_j, b_j c_k) = (c_k b_j, a_i b_j)$.

Based on the above definition and notion of association pattern, we view an

OG as an Association Graph (AG) and all the association patterns in AG form the

domain of the A-algebra, denoted by $\mathcal{A}$.



4.2 Relationship Between Two Association Patterns


The operators of the A-algebra are defined based on the possible relationships

between two patterns in A, so that they can be used either to construct complex

patterns using simpler patterns or to decompose a complex pattern into several

patterns of simpler structures. There are four possible relationships between two

patterns $p^1$ and $p^2$: non-overlap, overlap, contain, and equal.

(1) Non-overlap: Two patterns are said to be non-overlap, denoted by p'DCp2,
if they have no common Inner-pattern.

(2) Overlap: Two patterns are said to be overlapped, denoted by pr p2, if they
have at least one common Inner-pattern.

(3) Contain: Contain is a special case of (2) when all the primitive patterns of
$p^1$ are contained in $p^2$. We say that $p^1$ is a subpattern of $p^2$ and denote this
relationship by $p^1 \subseteq p^2$.

(4) Equal: This is a special case of (3) when $p^1$ contains all the primitive
patterns of $p^2$, and vice versa. It is denoted by $p^1 = p^2$.
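
The four relationships can be checked mechanically. The sketch below assumes a hypothetical encoding in which an Inner-pattern is a 1-tuple and a binary pattern is a 3-tuple (kind, x, y); the function names parts and relationship are illustrative only. Containment is tested in one direction, namely whether the first pattern is a subpattern of the second.

```python
def parts(pattern):
    """Split a pattern into its Inner-patterns (vertices) and its edges."""
    verts, edges = set(), set()
    for prim in pattern:
        if len(prim) == 1:
            verts.add(prim[0])
        else:
            kind, x, y = prim
            verts.update((x, y))
            edges.add((kind, frozenset((x, y))))   # non-directional edge
    return verts, edges


def relationship(p1, p2):
    """Classify p1 against p2: non-overlap, overlap, contain, or equal."""
    v1, e1 = parts(p1)
    v2, e2 = parts(p2)
    if not v1 & v2:
        return "non-overlap"          # no common Inner-pattern
    if v1 == v2 and e1 == e2:
        return "equal"
    if v1 <= v2 and e1 <= e2:
        return "contain"              # p1 is a subpattern of p2
    return "overlap"


p1 = [("inter", "a1", "b1")]
p2 = [("inter", "a1", "b1"), ("comp", "b1", "c1")]
print(relationship(p1, p2))
```

The most specific relationship is reported, since equal is a special case of contain and contain is a special case of overlap.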

Before defining the association operators, we give the definition of

"Association-set" the operand of the association operators.

Association-set:

An association-set, denoted by a Greek letter $\alpha$ (or $\beta, \gamma, \ldots$), is a set of
association patterns without duplicates. $\alpha^i$ designates the $i$th pattern in $\alpha$, where
$\alpha^i \neq \alpha^j$ $(\forall i \neq j)$. An empty set is also an association-set, denoted by $\phi$.

A special type of association-set is called homogeneous association-set, which

is important to the A-algebra, since some of the mathematical properties hold only

when operands are homogeneous association-sets.

Homogeneous Association-set:

An association-set is homogeneous, if

(1) all patterns are formed by the Inner-patterns (or object instances) of
the same set of object classes; and

(2) all patterns have the same number of Inner-patterns from each class in
the set; and

(3) corresponding primitive patterns belong to the same association and are
of the same type; and

(4) all patterns have the same topology.

Otherwise, it is a heterogeneous association-set.

Figure 4.3 depicts three example association-sets: $\alpha$ is homogeneous, whereas

$\beta$ is not since one of its patterns has only one Inner-pattern of class C instead of two like

the others. $\gamma$ is not homogeneous because $\gamma^3$ contains a Complement-pattern, which

differs from the other patterns of $\gamma$ (i.e., different topologies).



4.3 Association Operators


Ten association operators are formally defined in this section: three unary

operators [A-Project ($\Pi$), A-Select ($\sigma$), and A-Integrate (f)] and seven binary

operators [Associate ($*$), A-Complement ($|$), A-Union ($+$), A-Difference ($-$),

A-Divide ($\div$), NonAssociate ($!$), and A-Intersect ($\bullet$)]. The examples used to explain










these operators will make use of the domain A shown in Figure 4.4. To keep the

graph simple, the Complement-patterns are not shown in the figure. The simple

mathematical properties such as commutativity, associativity, idempotency, and

nilpotency satisfied by the operators are given after each definition.



4.3.1 Notations


Notations that will be used in the subsequent sections are listed below.

A, B,...,K Denote classes.

$CL_n$ Denotes a variable for a class.

[R($CL_1$,$CL_2$)] Denotes the association between classes $CL_1$ and $CL_2$.

$a_i$ Denotes the $i$th Inner-pattern of class A.

@ Denotes an Inner-pattern variable.

(a bj) Denotes an Inter-pattern between two classes A and B.

(aibj) Denotes a Complement-pattern between two classes A and B.

(ate,) Denotes a Derived-pattern from class A to class C.

$\alpha, \beta, \gamma, \ldots$ Denote association-sets.

$\alpha^i$ Denotes the $i$th pattern of association-set $\alpha$.

{W},{X},{Y},... Denote sets of classes. Hence, $\alpha_{\{X\}}$ represents association-set $\alpha$
which has Inner-pattern(s) from the classes in {X}.

It should be noted that an Inner-pattern is represented by an object instance

identifier (IID), which is a system-assigned object identifier (OID) prefixed by a

class identification so that the object instances of an object in multiple classes can

be unambiguously distinguished and the fact that these object instances are










instances of the same object can easily be recognized.



4.3.2 Operators


All relational algebraic operators operate on relations of homogeneous (or

union-compatible) structures with the exception of Cartesian-product and Join.

The Cartesian-product and Join provide the mechanism to concatenate two rela-

tions of different structures into a single relation, so that it can be further manipu-

lated by other operators. In the A-algebra, all the operators are defined to operate

on association patterns of homogeneous as well as heterogeneous structures.

Therefore, the relational algebra is a special case of the A-algebra in this respect.



(1) Associate (*):

The Associate operator is a binary operator which constructs an association-

set of complex patterns by concatenating the patterns represented by two operand

association-sets. Since a pattern may involve many classes and an object class

may have more than one association with another class, it is necessary to specify

through which association the concatenation of two patterns is intended. The

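Associate operation on association-sets $\alpha$ and $\beta$ over the association R between

classes A and B is defined as follows:

$\alpha *[R(A,B)]\, \beta = \{\, \gamma \mid \gamma^k = (\alpha^i, \beta^j, a_m b_n) : (a_m b_n) \in [R(A,B)] \wedge a_m \in \alpha^i \wedge b_n \in \beta^j \,\}$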

The result of an Associate operation is an association-set containing no duplicates.

Each of its patterns is the concatenation of two patterns (one from each

operand association-set). More specifically, if the Inner-pattern (or object $a_m$) of A

in $\alpha^i$ is associated with the Inner-pattern (or object $b_n$) of B in $\beta^j$ in the domain of

the algebra $\mathcal{A}$ shown in Figure 4.4, then $\alpha^i$ and $\beta^j$ are concatenated via the

primitive pattern $(a_m b_n)$.

We do not restrict A and B to be different classes in [R(A,B)], i.e.,

$\alpha *[R(A,A)]\, \beta$ is a legitimate operation, which concatenates two patterns (one from

each operand association-set) if they have a common Inner-pattern of class A.

An example of the Associate operation is shown in Figure 4.5a (for convenience,

a copy of the sample database is shown in each figure for illustrating an

operation). For clarity, we use graphical notation in the figures. In the example,

$\alpha^1$ is concatenated with $\beta^1$ and $\beta^2$, respectively, due to the existence of $(b_1 c_1)$ and

$(b_1 c_2)$ in $\mathcal{A}$ as shown in Figure 4.4. $\alpha^2$ is dropped simply because it does not have an

Inner-pattern of class B. $\alpha^3$ is dropped because $(b_2)$ is not associated with any

Inner-pattern of class C in $\mathcal{A}$. Another pattern of $\beta$ cannot be concatenated through $(c_4)$ with any

pattern in $\alpha$ because no pattern in $\alpha$ has an Inner-pattern of B that is associated

with $(c_4)$ in $\mathcal{A}$. For the same reason, a further pattern of $\beta$ is dropped.
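
The behavior of Associate can be sketched in Python. Everything about the encoding is an assumption made for illustration: a pattern is a frozenset of primitive patterns, an Inner-pattern is a 1-tuple, an Inter-pattern is the 3-tuple ("inter", x, y), the class of an object is taken from the first letter of its name, and the toy operands only loosely mirror Figure 4.5a. The sketch covers the case of two different classes only.

```python
def objects_of(pattern, class_name):
    """Inner-patterns of one class occurring anywhere in a pattern."""
    found = set()
    for prim in pattern:
        members = prim if len(prim) == 1 else prim[1:]
        # Assumed naming convention: object "b1" belongs to class "B".
        found.update(o for o in members if o[0].upper() == class_name)
    return found


def normalize(prims):
    """Remove Inner-patterns that already occur in a binary pattern."""
    covered = {o for p in prims if len(p) == 3 for o in p[1:]}
    return frozenset(p for p in prims if len(p) == 3 or p[0] not in covered)


def associate(alpha, beta, class_a, class_b, inter_edges):
    """Concatenate patterns of alpha and beta over Inter-patterns (a, b)."""
    result = set()
    for pa in alpha:
        for pb in beta:
            for a in objects_of(pa, class_a):
                for b in objects_of(pb, class_b):
                    if (a, b) in inter_edges:        # (a b) is in the domain
                        link = ("inter", a, b)
                        result.add(normalize(set(pa) | set(pb) | {link}))
    return result


alpha = [frozenset({("inter", "a1", "b1")}), frozenset({("a2",)}),
         frozenset({("b2",)})]
beta = [frozenset({("c1",)}), frozenset({("c2",)}), frozenset({("c4",)})]
edges = {("b1", "c1"), ("b1", "c2"), ("b3", "c4")}   # assumed B-C associations
gamma = associate(alpha, beta, "B", "C", edges)
print(len(gamma))
```

Returning a set of frozensets gives the duplicate-free result required of an association-set without any extra bookkeeping.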

For the Associate operator, [R(A,B)] can be omitted if the following conditions

hold: (1) both $\alpha$ and $\beta$ are A-algebra expressions, (2) the Associate operator

operates on the last class in a linear expression $\alpha$ and the first class in a linear

expression $\beta$, and (3) there is a unique association between these two classes. For

example, A *[R(A,B)] B can be written as A*B, if class A is associated with class

B through the attribute [R(A,B)] of A. It should be pointed out that A-algebra

allows an attribute to be defined by a computed value (or object). For instance,










B=(A). The implementations of the function and the procedure are invisible to

the algebra. However, they should not have side effect, i.e., the computed result

must be of the same type as B.

The Associate operator is commutative and conditionally associative as

defined below:

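$\alpha *[R(A,B)]\, \beta = \beta *[R(B,A)]\, \alpha$ (commutativity)
$(\alpha_{\{X\}} *[R(A,B)]\, \beta_{\{Y\}}) *[R(C,D)]\, \gamma_{\{Z\}}$ (associativity)
$= \alpha_{\{X\}} *[R(A,B)]\, (\beta_{\{Y\}} *[R(C,D)]\, \gamma_{\{Z\}})$ (if $C \notin \{X\} \wedge B \notin \{Z\}$)
$A *[R(A,A)]\, A = A$ (idempotency)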

The associativity holds true if $\alpha$ and $\gamma$ do not have Inner-patterns of classes C

and B, respectively. Otherwise, the associativity does not hold. For example, if

$\alpha=(a_1 b_1, b_1 c_2)$, $\beta=(b_1 c_1)$, $\gamma=(d_1)$, and $\mathcal{A}$ is as shown in Figure 4.4 (the domain of the

algebra), then

$(\alpha *[R(A,B)]\, \beta) *[R(C,D)]\, \gamma = (a_1 b_1, b_1 c_1, b_1 c_2, c_2 d_1)$
and


$\alpha *[R(A,B)]\, (\beta *[R(C,D)]\, \gamma) =$









(2) A-Complement ($|$):

The A-Complement operator is a binary operator which concatenates the

patterns of two operand association-sets over Complement-patterns. It is used to

identify the objects in two classes which are not associated with each other in A.

The A-Complement operator is defined as follows:

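$\alpha\, |[R(A,B)]\, \beta = \{\, \gamma \mid \gamma^k = (\alpha^i, \beta^j, a_m b_n) : (a_m b_n) \in [R(A,B)] \wedge a_m \in \alpha^i \wedge b_n \in \beta^j$
or $\gamma^k = \alpha^i : \exists(m)(a_m \in \alpha^i) \wedge \nexists(n)(b_n \in \beta)$
or $\gamma^k = \beta^j : \exists(n)(b_n \in \beta^j) \wedge \nexists(m)(a_m \in \alpha) \,\}$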

The result of an A-Complement operation is an association-set. Each of its

patterns is formed by concatenating two patterns (one from each operand

association-set) via a Complement-pattern $(a_m b_n)$, where $a_m$ and $b_n$ belong to $\alpha^i$

and $\beta^j$, respectively, and the Complement-pattern $(a_m b_n)$ is in $\mathcal{A}$. In the special

case when $\alpha$ (or $\beta$) is an empty association-set or does not have Inner-patterns of

class A (or B), all patterns of $\beta$ (or $\alpha$) that have Inner-patterns of B (or A) are

retained in the resulting association-set.

An example of the A-Complement operation is shown in Figure 4.5b. It

operates over the association between classes B and C. $\alpha^2$ does not appear in the

resultant association-set because it contains no Inner-patterns of B. $\alpha^1$ cannot be

A-Complemented with $\beta^1$ and $\beta^2$ because it is connected with $\beta^1$ and $\beta^2$ by

Inter-patterns $(b_1 c_1)$ and $(b_1 c_2)$ in $\mathcal{A}$, respectively.

Under the same conditions as given in the Associate operator, [R(A,B)] need

not be specified with the A-Complement operator unless there is an ambiguity.

The A-Complement operator is commutative and associative. For a reason similar to

that described for the Associate operator, the associativity holds true conditionally.

$\alpha\, |[R(A,B)]\, \beta = \beta\, |[R(B,A)]\, \alpha$ (commutativity)
$(\alpha_{\{X\}}\, |[R(A,B)]\, \beta_{\{Y\}})\, |[R(C,D)]\, \gamma_{\{Z\}}$ (associativity)
$= \alpha_{\{X\}}\, |[R(A,B)]\, (\beta_{\{Y\}}\, |[R(C,D)]\, \gamma_{\{Z\}})$ (if $C \notin \{X\} \wedge B \notin \{Z\}$)
$A\, |[R(A,A)]\, A = \phi$ (nilpotency)


(3) A-Select ($\sigma$):

The A-Select is a unary operator, which operates on an association-set $\alpha$ to

produce a subset of patterns that satisfy a specified predicate P. A pattern in the

operand association-set is retained iff the predicates are evaluated true for that

pattern.

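$\sigma(\alpha)[P] = \{\, \gamma \mid \gamma^k = \alpha^i : P(\alpha^i) = \text{true} \,\}$

where $\alpha$ is defined by an algebraic expression, and $P = T_1 \theta_1 T_2 \theta_2 \cdots \theta_{n-1} T_n$. Each

term, $T_i$ $(i=1,2,\ldots,n)$, is a comparison between two expressions and $\theta_i$ $(i=1,2,\ldots,n-1)$ is a

Boolean operator ($\wedge$ or $\vee$). $P(\alpha^i) = \text{true}$ represents that a pattern is evaluated true for

that predicate.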

The expressions on the left- and right-hand sides of a comparison operation

may contain constants, functions, and/or operations on objects, but cannot both

be constants. The comparison terms are type sensitive, i.e., the results of the two

expressions in a term should be data of the same type for primitive-classes or both

IIDs for nonprimitive-classes. $=, >, <, \geq, \leq,$ and $\neq$ are the legitimate comparisons

for numerical types; $=$ and $\neq$ for character, string, and IID types; and $=, \subset, \supset, \subseteq, \supseteq,$

and $\neq$ for set types. The comparison of two IIDs is performed by comparing their

OID portions, since IIDs are the concatenations of the class identifiers and OIDs.









A single valued object or a single IID can be treated either as its own data type in

numerical, string, or IID comparison, or as a set type containing one element in a

set comparison.

As an example of A-Select, we assume that there are two associated classes:

S for stack and Q for queue. To select associated stack and queue object pairs in

which the top and the bottom of the stack have some common object(s) with

those in the head and the tail of the queue, the operation can be written as

$\sigma(S*Q)[(top(S) \cup bottom(S)) \cap (head(Q) \cup tail(Q)) \neq \phi]$

To select the pairs in which the top equals the head and the bottom equals the tail, we have

$\sigma(S*Q)[top(S)=head(Q) \wedge bottom(S)=tail(Q)]$
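
The two selections can be mimicked in Python. In this simplified sketch an associated stack and queue pair stands in for a pattern of S*Q; the classes Stack and Queue, their fields, the sample objects, and the function names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stack:
    name: str
    top: str
    bottom: str


@dataclass(frozen=True)
class Queue:
    name: str
    head: str
    tail: str


def a_select(alpha, predicate):
    """Retain the patterns for which the predicate evaluates to true."""
    return [pattern for pattern in alpha if predicate(pattern)]


def share_end(pattern):
    """(top(S) U bottom(S)) intersected with (head(Q) U tail(Q)) is non-empty."""
    s, q = pattern
    return bool({s.top, s.bottom} & {q.head, q.tail})


def same_ends(pattern):
    """top(S) = head(Q) AND bottom(S) = tail(Q)."""
    s, q = pattern
    return s.top == q.head and s.bottom == q.tail


# Assumed result of S*Q: three associated stack and queue pairs.
pairs = [(Stack("s1", "x", "y"), Queue("q1", "x", "y")),
         (Stack("s2", "x", "z"), Queue("q2", "w", "x")),
         (Stack("s3", "p", "q"), Queue("q3", "r", "t"))]

print([s.name for s, _ in a_select(pairs, share_end)])
print([s.name for s, _ in a_select(pairs, same_ends)])
```

Passing the predicate as a function keeps the operator itself independent of the classes involved.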


(4) A-Project ($\Pi$):

Similar to the projection operation in the relational algebra, an A-Project

operation is defined to project subpattern(s) of a pattern. However, in the rela-

tional algebra, the relationship among the projected attributes is not important.

Whereas in A-algebra, the association among the projected subpatterns must be

maintained so that the associations among the objects in these subpatterns will be

retained. The A-Project operator is defined as follows:


$\Pi(\alpha)[E, T]$

where $\alpha$ is an association-set defined by an A-algebra expression;

$E=(e_1, e_2, \ldots, e_n)$ is a set of expressions which specify subpatterns to be projected;

and $T=(t_1, t_2, \ldots, t_m)$ is a set of ordered sets of classes. Each ordered set,

$t_k$, specifies a path connecting two projected subpatterns defined by the E expressions.

$e_i$ $(i=1,2,\ldots,n)$ is a subexpression of the expression which defines $\alpha$. $e_i$ and

$e_j$ $(\forall i \neq j)$ should not contain a common class. There may be many paths

connecting two subpatterns in the original pattern. The path to be retained can be

specified in $t_k$. If a specific path is chosen, a minimal number of classes along the

path which can uniquely identify the path should be specified. The result of an

A-Project operation over a pattern is its subpatterns defined by E and some paths

defined by T that connect these subpatterns. If a path in the original pattern

consists of all Inter-patterns, a D-inter-pattern is retained. Otherwise, a

D-complement-pattern is included. Multiple paths between two projected

subpatterns can be declared in T, if it is so desired.

Figure 4.5c shows an example of A-Project from an association-set $\alpha$ over A*B and

D. For $\alpha^1$, the subpatterns $(a_1 b_1)$ and $(d_1)$ satisfy A*B and D, respectively. Therefore,

they are kept in the result. According to the path specification stated in the

operation, a Derived-pattern $(b_1 d_1)$ is added to the result, thus $\gamma^1=(a_1 b_1, d_1, b_1 d_1)$. Its

normalized form is $\gamma^1=(a_1 b_1, b_1 d_1)$. $\gamma^2$ is produced for the same reason. Since $\alpha^3$

does not have a subpattern satisfying A*B, only its Inner-pattern of class D is retained.



(5) NonAssociate (!):

The NonAssociate operator is a binary operator used to identify the associa-

tion patterns in one operand association-set that are not associated (over a

specified association) with any pattern in the other association-set, and vice versa,









in the domain of the algebra A. The NonAssociate operator is defined as follows:

a [R(A,B)] f ={ 7 I = (ao, ', amb): (amb,)E[R(A,B)] A amEa' A bEf
A V ((amb,),(ambJEA)(am4 a A b 4 )
k i
or 7 = a: 3(m)(amea') A A(nXb6. )
V V(b,Ef)3(k, kAm)(akEa A (akb.)E[R(A,B)])
or = i: 3(n)(befi) A i(m)(amea)
V V(a,,a)3(k, k,4n)(bE A (ab )[R(A,B)]) }

The result of a NonAssociate operation is an association-set. Each of its

patterns is formed by concatenating two patterns $\alpha^i$ and $\beta^j$ via a

Complement-pattern $(a_m b_n)$ under the condition that $\alpha^i$ is not associated with any pattern of $\beta$ and vice

versa. Furthermore, in the special case where the patterns of $\alpha$ (or $\beta$) have

Inner-patterns of A (or B) and cannot be concatenated with any pattern of $\beta$ (or $\alpha$), these

patterns of $\alpha$ (or $\beta$) will be retained in the result if one of the following three

conditions holds: (1) $\beta$ (or $\alpha$) is an empty association-set, (2) all patterns of $\beta$ (or $\alpha$) do

not have Inner-patterns of B (or A), or (3) all patterns of $\beta$ (or $\alpha$) that have

Inner-patterns of B (or A) can be concatenated with patterns of $\alpha$ (or $\beta$).

An example of the NonAssociate operation is shown in Figure 4.5d. In the

example, a1 and f are dropped due to the existence of (b1c,) in Figure 4.4. a2 is

dropped because it does not contain an Inner-pattern of class B. 0' is dropped

because it does not contain an Inner-pattern of class C. 71 is in the resultant

association-set because (b2) is not associated with (c4) in A as shown in Figure 4.4

and (bs) does not appear in a. 7 exists because (b2) is not associated with (c,) in A.

Note that the NonAssociate operator produces a resultant association-set

which is a subset of that produced by the A-Complement operator, because $\alpha^i$, $\beta^j$,

and $(a_m b_n)$ may form a new pattern only when $a_m$ of $\alpha^i$ does not associate with any

object of B in $\beta$ and $b_n$ of $\beta^j$ does not associate with any object of A in $\alpha$. In fact,

the NonAssociate operator can be expressed in terms of A-Complement and other

operators as follows:

$A\, ![R(A,B)]\, B = (A - \Pi(A *[R(A,B)]\, B)[A])\ |[R(A,B)]\ (B - \Pi(A *[R(A,B)]\, B)[B])$

Thus, NonAssociate is not a primitive operator in a strict sense. However, it is

very useful for query formulation and is therefore included in the set of A-algebra

operators.

Under the same conditions as given in the Associate operator, [R(A,B)] need

not be specified unless there is an ambiguity. The NonAssociate operator is com-

mutative but not associative.

$\alpha\, ![R(A,B)]\, \beta = \beta\, ![R(B,A)]\, \alpha$ (commutativity)
$A\, ![R(A,A)]\, A = \phi$ (nilpotency)


(6) A-Intersect ($\bullet$):

The A-Intersect operation is convenient for constructing a pattern with a

branch or a lattice structure (a pattern that has a loop), since a pattern in such

structures can be viewed as the intersection of two patterns. Conceptually, the

A-Intersect operator is equivalent to the JOIN operator in the relational algebra.

It operates on two operand association-sets over a set of specified classes. Two

patterns, one from each association-set, are combined into one if they contain the

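same set of Inner-patterns for each specified class. The A-Intersect operation is defined as follows:

\[ \alpha_{\{X\}} \bullet\{W\}\; \beta_{\{Y\}} = \{\, \gamma \mid \gamma = (\alpha^i, \beta^j) : \forall (CL_n \in \{W\})\, \forall (@ \in CL_n \wedge @ \in \alpha^i)\,(@ \in \beta^j) \;\wedge\; \forall (CL_n \in \{W\})\, \forall (@ \in CL_n \wedge @ \in \beta^j)\,(@ \in \alpha^i) \,\} \]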

Figure 4.5e shows an example of the A-Intersect operation over classes B and C. The resultant association-set contains four patterns, which are the intersections of $\alpha^1$ and $\alpha^2$ with two patterns of $\beta$, since they all have Inner-patterns (b1) and (c2). The other patterns of $\alpha$ and $\beta$ fail to produce new patterns because they either have no Inner-pattern in both classes B and C or have no common Inner-pattern of class C.

The set of classes $\{W\}$ can be omitted when the A-Intersect operation is performed on all the common classes of its operands, i.e., $\{W\} = \{X\} \cap \{Y\}$ is implied.
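
The matching rule of A-Intersect can be sketched in a few lines. The sketch rests on assumptions that are not in the original text: a pattern is reduced to a frozenset of object identifiers (edges are ignored), the class of an object is taken from the first letter of its identifier, and a pattern with no Inner-pattern in a specified class is treated as a non-match. The sample data and function names are hypothetical.

```python
def cls(obj):
    # Assumption of this sketch: object "b1" belongs to class "B".
    return obj[0].upper()


def inner(pattern, class_name):
    """The Inner-patterns (objects) of one class that occur in a pattern."""
    return frozenset(o for o in pattern if cls(o) == class_name)


def matches(p, q, classes):
    """True if p and q hold the same non-empty set of objects in every class."""
    return all(
        bool(inner(p, c)) and inner(p, c) == inner(q, c) for c in classes
    )


def a_intersect(alpha, beta, classes):
    """Combine every pair of patterns that agree on the specified classes."""
    return {p | q for p in alpha for q in beta if matches(p, q, classes)}


alpha = {
    frozenset({"a1", "b1", "c2"}),
    frozenset({"a2", "b1", "c2"}),
    frozenset({"a3", "b2"}),
}
beta = {
    frozenset({"b1", "c2", "d2"}),
    frozenset({"b1", "c2", "d3"}),
    frozenset({"c4", "d4"}),
}

gamma = a_intersect(alpha, beta, ["B", "C"])
for pattern in sorted(sorted(p) for p in gamma):
    print(pattern)
```

With this data the first two patterns of each operand agree on (b1) and (c2), so four combined patterns are produced, while the patterns lacking an object of class B or class C contribute nothing.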

Since a lattice pattern can be transformed into a set of other simple patterns,

an A-Intersect operation for building a complex pattern can be replaced by an

Associate operation followed by an A-Select operation (see Section 4 for detail).

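The A-Intersect operator is commutative, conditionally associative, and idempotent.

\[ \alpha \bullet\{W\}\, \beta = \beta \bullet\{W\}\, \alpha \quad \text{(commutativity)} \]
\[ (\alpha_{\{X\}} \bullet\{W_1\}\, \beta_{\{Y\}}) \bullet\{W_2\}\, \gamma_{\{Z\}} = \alpha_{\{X\}} \bullet\{W_1\}\, (\beta_{\{Y\}} \bullet\{W_2\}\, \gamma_{\{Z\}}) \quad \text{(associativity)} \]
\[ \text{(if } (\{W_1\}-\{W_2\}) \cap \{Z\} = \emptyset \;\wedge\; (\{W_2\}-\{W_1\}) \cap \{X\} = \emptyset \text{)} \]
\[ \alpha \bullet \alpha = \alpha \quad \text{(if } \alpha \text{ is a homogeneous association-set) (idempotency)} \]

The associativity is not always true because there are cases in which a pattern of $\beta$ that fails to intersect with any pattern of $\gamma$ may succeed by first intersecting with a pattern of $\alpha$ in the operation ($\bullet\{W_1\}$) and then intersecting with a pattern of $\gamma$ in the operation ($\bullet\{W_2\}$).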









Now we define three set operators, which are different from the correspond-

ing set operators in relational algebra, since they operate on heterogeneous struc-

tures as well as homogeneous structures.



(7) A-Integrate ($\int$):

The A-Integrate is a unary operator. It reorganizes patterns in an

association-set according to the relationships among patterns with respect to the

classes specified. The A-Integrate operation is defined as follows:

f()= { yI l'y=(a):
V(k, CL,.{ WIA@ECLA@EaciajEa,)(@EakAakEa,) }

By this definition, a subset of the patterns of $\alpha$ is combined into a single pattern if every object instance of classes in $\{W\}$ that appears in a pattern in the subset is also contained in all other patterns in the subset. If a pattern of $\alpha$ cannot be combined with any other pattern, it is retained in the resultant association-set as it is.

If no class is specified, patterns, in which every pattern has at least one

object instance (of any class) common to another, will be integrated into one pat-

tern. The reorganized association-set will contain patterns which are apart from

each other (refer to Section 4.2).

Figure 4.5f shows two examples. The first example shows an A-Integrate operation over class A. Patterns that have a common Inner-pattern of class A are grouped into one (for instance, $\gamma^1$ is the integration of $\alpha^1$, $\alpha^2$, and $\alpha^3$, and another resultant pattern is the integration of two other patterns of $\alpha$). All other patterns in $\alpha$ are retained in the result as they are. The second example illustrates an A-Integrate operation on the same association-set as the first example but without specifying a class. The result becomes two patterns, which are apart and are exactly the same as they appear in the original database, whereas the same primitive patterns appear more than once in the result of the first example.
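
The class-specified form of A-Integrate can be sketched as a grouping step. This is only one reading of the definition, under assumptions that are not in the original text: a pattern is a frozenset of object identifiers, the class of an object is the first letter of its identifier, patterns holding exactly the same objects of the specified classes are merged by set union, and a pattern with no object of those classes is retained unchanged. Names and data are hypothetical.

```python
from collections import defaultdict


def cls(obj):
    # Assumption of this sketch: object "a1" belongs to class "A".
    return obj[0].upper()


def a_integrate(alpha, classes):
    """Merge patterns that share the same objects of the specified classes."""
    groups = defaultdict(set)
    result = set()
    for pattern in alpha:
        key = frozenset(o for o in pattern if cls(o) in classes)
        if key:
            groups[key].add(pattern)
        else:
            # No object of the specified classes: nothing to combine with.
            result.add(pattern)
    for members in groups.values():
        result.add(frozenset().union(*members))
    return result


alpha = {
    frozenset({"a1", "b1", "c1"}),
    frozenset({"a1", "b1", "c2"}),
    frozenset({"a1", "b2"}),
    frozenset({"b3", "c4"}),
    frozenset({"a4", "b3"}),
}

gamma = a_integrate(alpha, {"A"})
for pattern in sorted(sorted(p) for p in gamma):
    print(pattern)
```

Here the three patterns containing a1 collapse into one pattern, while the pattern with a4 and the pattern without any object of class A are kept as they are.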


(8) A-Union(+):

Similar to the UNION operation of the relational algebra, A-Union combines

two association-sets into one. However, these two association-sets can contain

heterogeneous association structures. It is important for A-algebra to be able to

operate on heterogeneous structures because some prior operations may produce

heterogeneous association-sets and may need to be further processed over the

objects of a common class against other patterns of associations. Unlike the relational algebra and other O-O query languages, union-compatibility is not a restriction in A-algebra. For this reason, A-algebra has more expressive power. Any query that can be expressed by a single expression in other languages can be expressed as a single A-algebra expression, but not vice versa. The A-Union operation is defined as follows:

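\[ \alpha + \beta = \{\, \gamma \mid \gamma \in \alpha \vee \gamma \in \beta \,\} \]

The A-Union operator is commutative, associative, and idempotent:

\[ \alpha + \beta = \beta + \alpha \quad \text{(commutativity)} \]
\[ (\alpha + \beta) + \gamma = \alpha + (\beta + \gamma) \quad \text{(associativity)} \]
\[ \alpha + \alpha = \alpha \quad \text{(idempotency)} \]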









(9) A-Difference (-):

The A-Difference implements the same concept as the DIFFERENCE opera-

tor in relational algebra but with two differences. First, its operands do not have

to be union compatible. Secondly, a pattern in the minuend is retained if it does

not contain any of the patterns in the subtrahend.

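\[ \alpha - \beta = \{\, \gamma \mid \gamma = \alpha^i : \forall(\beta^j)(\beta^j \not\subseteq \alpha^i) \,\} \]

The example depicted in Figure 4.5g shows that $\alpha^1$ and $\alpha^3$ are dropped since they both contain a pattern of $\beta$.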
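
A-Union and A-Difference, as defined above, can be sketched on operands that are not union-compatible. The sketch assumes a simplified model that is not in the original text: a pattern is a frozenset of object identifiers and "contains" is read as the subset relation on those identifiers. The sample association-sets are hypothetical.

```python
def a_union(alpha, beta):
    """alpha + beta: every pattern of either operand, whatever its structure."""
    return alpha | beta


def a_difference(alpha, beta):
    """alpha - beta: keep a pattern of alpha only if it contains no pattern of beta."""
    return {p for p in alpha if not any(q <= p for q in beta)}


# Patterns of different shapes (heterogeneous structures) in both operands.
alpha = {
    frozenset({"a1", "b1", "c1"}),
    frozenset({"a3", "b2"}),
    frozenset({"a1", "b1", "c2"}),
}
beta = {
    frozenset({"a1", "b1"}),
    frozenset({"b3"}),
}

print(len(a_union(alpha, beta)))
print(a_difference(alpha, beta))
```

Both patterns of alpha that contain the pattern (a1, b1) are dropped by the difference, and the union simply collects all five patterns even though their structures differ.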



(10) A-Divide ($\div$):

The A-Divide operator implements the concept that a group of patterns with

certain common features contains another set of patterns.

Q at~ = {( I = aI : V(k( a. ) }

Here, the retained patterns form a subset of the patterns of $\alpha$ that have common Inner-patterns for all classes of $\{W\}$ and that together contain all patterns of $\beta$. If $\{W\}$ is not specified, the A-Divide operation retains all the patterns of such a subset if each of them contains at least one pattern of $\beta$ and they together contain all patterns of $\beta$.

Figure 4.5h shows an example of $\alpha$ being divided by $\beta$ with respect to class B. The A-Divide operation retains $\alpha^1$, $\alpha^2$, and $\alpha^3$ since they all contain Inner-pattern (b1) of B and together contain all patterns of $\beta$.
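
The class-specified form of A-Divide can be sketched as grouping followed by a coverage test. The sketch assumes, beyond what the text states, that a pattern is a frozenset of object identifiers, that the class of an object is the first letter of its identifier, and that "together contain all patterns" means every pattern of the divisor is a subset of at least one pattern in the group. The data are hypothetical and only loosely modeled on the description of Figure 4.5h.

```python
from collections import defaultdict


def cls(obj):
    # Assumption of this sketch: object "b1" belongs to class "B".
    return obj[0].upper()


def inner(pattern, class_name):
    """The objects of one class that occur in a pattern."""
    return frozenset(o for o in pattern if cls(o) == class_name)


def a_divide(alpha, beta, classes):
    """Keep each group of alpha patterns that jointly contains all of beta."""
    groups = defaultdict(set)
    for p in alpha:
        key = tuple(inner(p, c) for c in classes)
        if all(key):  # the pattern has an object in every specified class
            groups[key].add(p)
    result = set()
    for members in groups.values():
        if all(any(q <= p for p in members) for q in beta):
            result |= members
    return result


alpha = {
    frozenset({"a1", "b1", "c1"}),
    frozenset({"b1", "c2", "d1"}),
    frozenset({"b1", "c4", "d4"}),
    frozenset({"b3", "c4"}),
    frozenset({"b2", "c3"}),
}
beta = {
    frozenset({"c1"}),
    frozenset({"c2", "d1"}),
    frozenset({"c4", "d4"}),
}

gamma = a_divide(alpha, beta, ["B"])
for pattern in sorted(sorted(p) for p in gamma):
    print(pattern)
```

Only the group sharing (b1) covers every pattern of the divisor, so its three patterns are retained and the groups for (b2) and (b3) are discarded.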









4.3.3 Precedence


The precedence relationships of the above operators are as follows. Unary

operators have higher precedence than binary operators. The precedence of the

seven binary association operators is given in the following order: *, |, ,, ,

and +. Parentheses can be used to alter the precedence relationships.



4.3.4 Summary of operators


(1) Associate (*): Two patterns are concatenated via an Inter-pattern.

(2) A-Complement (|): Two patterns are concatenated via a Complement-pattern.

(3) A-Select ($\sigma$): A pattern is retained if it satisfies the predicate.

(4) A-Project ($\Pi$): A subpattern is projected from the original pattern.

(5) NonAssociate (!): Two patterns are concatenated via a Complement-pattern
only if each of them cannot be concatenated with any pattern of the other
operand via an Inter-pattern.

(6) A-Intersect ($\bullet$): Two patterns are combined into a single pattern if their com-
mon classes have common objects.

(7) A-Integrate ($\int$): Patterns in an association-set are combined if objects of a
specified class in a pattern are common to these patterns.

(8) A-Union (+): Two association-sets are lumped into a single set.

(9) A-Difference (-): A pattern in the minuend is retained if it does not contain
any pattern in the subtrahend.

(10) A-Divide ($\div$): A subset of patterns in the dividend that have certain common
features and contain all the patterns in the divisor is retained.










4.4 Query Examples

We have formally defined nine association operators and given their simple

mathematical properties. Before exploring other properties, we give some examples to illustrate how these operators can be used to formulate queries for processing an O-O database. There can be many alternative expressions for the same

query. Choosing the best one for execution is the task of a query optimizer. The

mathematical properties of these operators can be used for that purpose.

In the following formulation of algebraic expressions, we assume that the user

is using the algebra directly instead of a high-level query language. In the latter

case, the task of generating algebraic expressions would belong to the translator.

To formulate an A-algebra expression for a query, first, we need to construct

an intensional pattern for it by navigating the schema graph of the database as

illustrated in Chapter 3. Then, each edge of the pattern is marked with an operator *, |, or !, based on the intended semantics. For simple patterns, the formulation is straightforward. For patterns with complex structures, we may have to decompose them into patterns with simpler structures. The expression for the original pattern is the A-Intersect of the expressions for the decomposed patterns.

First, we formulate expressions for Query 1 to Query 4 given in Chapter 3.

We have identified the intensional patterns for these queries (see Figure 3.3).


Query 1: For all sections, get the majors of students who are taking these
sections.

It is trivial to write an algebraic expression for Query 1, which is represented by a linear pattern. For this pattern, both edges are marked with *, and the algebraic expression can be formulated as follows:

\[ \int_{\{\text{Section}\}} \bigl( \Pi(\text{Section} * \text{Student} * \text{Department})[\text{Section}, \text{Department};\ \text{Section}{:}\text{Department}] \bigr) \]

where the A-Integrate operation groups the resultant patterns by Section.


Query 2: List students who major and minor in the same department.

For Query 2, the edges of the intensional pattern shown in Figure 3.3c are all marked with *. Since this loop structure can be viewed as the A-Intersect of two linear patterns involving both Student and Department, we have

\[ \Pi(\text{Student} * \text{Undergrad} * \text{Department} \bullet \text{Student} * \text{Department})[\text{Student}] \]

where the A-Project operation gets the student objects that satisfy the association pattern as required by the query.


Query 3: For those students taking section 300 and having majors and/or
minors, get their majors and/or minors.

The expression for the intensional pattern of Query 3 is as follows:

\[ \text{Section\#} * \text{Section} * (\text{Student} * \text{Department} + \text{Student} * \text{Undergrad} * \text{Department\_1}) \]

where the A-Union operator is used to realize the OR condition at the class Student. As long as a student has a major or a minor, the linear pattern from Student to Department and the linear pattern from Student to Undergrad and to Department should be retained. In the expression, Department_1 is an alias of Department, which is used to distinguish major and minor departments. Since the query asks for the majors and minors of students who are taking section 300, the A-Select and A-Project operations are used. Thus, we have










\[ \int_{\{\text{Student}\}} \bigl( \Pi(\sigma(\alpha)[\text{Section\#}=300])[\text{Student}, \text{Department}, \text{Department\_1};\ \text{Student}{:}\text{Department}, \text{Student}{:}\text{Department\_1}] \bigr) \]

where $\alpha$ is the intensional pattern given above. As shown in Figure 3.3g, the result of this expression contains the derived patterns that are specified in the clause of the projection operation (Student:Department and Student:Department_1) and is reorganized by an A-Integrate operation. Note that Query 3 cannot be phrased in a single relational algebra expression since (a) the union operation in relational algebra requires operands to be union-compatible, (b) using a join operation on Student can cause a loss of information because not every student has both major and minor, (c) the cartesian-product of the majors and minors will produce erroneous results, and (d) no other operation in the relational algebra can combine two relations into one.


Query 4: For each teacher, list the sections which he/she does not teach.


The algebraic expression for Query 4 can be easily formulated as follows, since it is represented by a linear pattern shown in Figure 3.3h. We note that the A-Complement operator |, rather than the NonAssociate operator !, should be used for this query, since a teacher may be teaching some courses.

\[ \text{Teacher} \mid \text{Section} \]

Several other query examples are given below. They use the schema graph

given in Figure 3.1. Their corresponding intensional patterns are depicted in Fig-

ure 4.6.










Query 5: List the names of students who teach in the same departments
as their major departments.

We can see from Figure 4.6 that the intensional pattern for this query can be constructed in two ways. One way is to decompose it into three linear patterns:

Name-Person-Student, Student-Department, and
Student-Grad-TA-Teacher-Department

The A-Intersect of these three patterns will produce a pattern that satisfies this query.

\[ \Pi(\text{Student} * \text{Person} * \text{Name} \bullet \text{Student} * \text{Department} \bullet \text{Student} * \text{Grad} * \text{TA} * \text{Teacher} * \text{Department})[\text{Name}] \]

where the first A-Intersect operation operates over Student and the second operates over Student and Department. The A-Project operation projects the names of these students.

Another way is to decompose the intensional pattern into two linear patterns:

Name-Person-Student-Department and
Student-Grad-TA-Teacher-Department

Therefore, we have an alternative expression

\[ \Pi(\text{Name} * \text{Person} * \text{Student} * \text{Department} \bullet \text{Student} * \text{Grad} * \text{TA} * \text{Teacher} * \text{Department})[\text{Name}] \]



Query 6: List the section# of those sections which have not been assigned
a room or have not been assigned a teacher.

Since the query requests sections that have not been assigned a room or a

teacher, these sections must not be connected with any room or any teacher (i.e.,










a section which does not associate with any room and teacher should also be

retained in the result). Therefore, there should be Complement-patterns between

Section and Teacher and between Section and Room, and a single arc between

these two branches as shown in Figure 4.6. We emphasize that the ! operation, instead of |, should be used to construct these two Complement-patterns. Then the algebraic expression for this query can be easily formulated as follows:

\[ \Pi(\text{Section\#} * (\text{Section} \;!\; \text{Room\#} + \text{Section} \;!\; \text{Teacher}))[\text{Section\#}] \]


Query 7: List the names of students who take courses 6010 and 6020.

We shall show three ways of formulating an expression for this query. First, the intensional pattern for Query 7 shown in Figure 4.6 can be constructed by the A-Intersect of two linear patterns, as we did for Query 5:

\[ \Pi(\sigma(\text{Name} * \text{Person} * \text{Student} * \text{Enrollment} * \text{Course} * \text{Course\#})[\text{Course\#}=6010] \bullet \sigma(\text{Student} * \text{Enrollment\_1} * \text{Course\_1} * \text{Course\#\_1})[\text{Course\#}=6020])[\text{Name}] \]

where Enrollment_1, Course_1, and Course#_1 are the aliases of the classes Enrollment, Course, and Course#, respectively. This ensures that the A-Intersect operation will be performed only over the Student class.

A second way is to view the original pattern as a linear pattern without restriction on Course# as follows:

Name-Person-Student-Enrollment-Course-Course#

Students who are taking both courses must participate in at least two such patterns, with Course#=6010 and Course#=6020, respectively. This implies an A-Divide operation. Thus, the query can be formulated as follows:

\[ \Pi(\text{Name} * \text{Person} * \text{Student} * \text{Enrollment} * \text{Course} * \text{Course\#} \;\div\{\text{Student}\}\; \sigma(\text{Course.Course\#})[\text{Course\#}=6010 \vee \text{Course\#}=6020])[\text{Name}] \]

where a dot in Course.Course# is used only for identifying the Course# class which is defined in the Course class. It does not represent a function or a method as in other languages. This expression can also be rewritten as follows:

\[ \Pi(\text{Name} * \text{Person} * \Pi(\text{Student} * \text{Enrollment} * \text{Course} * \text{Course\#} \;\div\{\text{Student}\}\; \sigma(\text{Course.Course\#})[\text{Course\#}=6010 \vee \text{Course\#}=6020])[\text{Student}])[\text{Name}] \]

which is more suitable for execution than the first since the inner A-Project gets

the student objects who are taking these two courses so that all other data associ-

ated with these students, such as Enrollment, Course, and Course#, do not have

to be carried along in further processing to get the names of these students.

Details of optimization issues will be addressed in the next chapter.
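
The equivalence of the intersect-based and divide-based formulations of Query 7 can be checked on toy data. The sketch below is only an analogy under stated assumptions: the association patterns are flattened into hypothetical (student name, course number) pairs, and ordinary Python set operations stand in for the A-algebra operators, so it shows the logic of the two formulations rather than the operators themselves.

```python
# Hypothetical enrollment data: (student name, course number) pairs.
enrollment = {
    ("Ann", 6010), ("Ann", 6020),
    ("Bob", 6010),
    ("Cy", 6020), ("Cy", 6030),
}
required = {6010, 6020}

# Formulation 1: select each course separately, then intersect over Student.
takes_6010 = {student for student, course in enrollment if course == 6010}
takes_6020 = {student for student, course in enrollment if course == 6020}
by_intersect = takes_6010 & takes_6020

# Formulation 2: group the patterns by Student and keep the students whose
# patterns together contain every required course (division-style).
courses_of = {}
for student, course in enrollment:
    courses_of.setdefault(student, set()).add(course)
by_divide = {s for s, courses in courses_of.items() if required <= courses}

print(by_intersect, by_divide)
```

Both formulations return the same students. In the second one, only the student identifiers survive the grouping step, which mirrors the remark above that projecting on Student early avoids carrying Enrollment, Course, and Course# data through the rest of the processing.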

We stress that the above association pattern expressions represent the inter-

nal algebraic operations that need to be performed if the dynamic inheritance

method is used. The high-level query statements corresponding to these algebraic

expressions issued by the user can be much simpler due to the inheritance of attri-

butes in the generalization hierarchy or lattice.

















Figure 4.1 Regular-edges and Complement-edges in an OG (classes shown: Section, Student, Course)











Figure 4.2 Examples of association patterns, each given in a graphical and an algebraic representation: (a) primitive association patterns (Inner-pattern, Inter-pattern, Complement-pattern, D-Inter-pattern, D-Complement-pattern); (b) complex association patterns

















Figure 4.3 Examples of association-sets




















Figure 4.4 A sample database association graph over classes A, B, C, and D
(The Complement-patterns are not shown)
























Figure 4.5 Example of operations on the sample database (the Complement-patterns are not shown):
(a) an Associate operation
(b) an A-Complement operation
(c) an A-Project operation, with the clause [(A*B, D);(B:D)]
(d) a NonAssociate operation over R(B,C)
(e) an A-Intersect operation over classes B and C
(f) A-Integrate operations, over class A and without a specified class
(g) an A-Difference operation
(h) an A-Divide operation













Figure 4.6 Intensional patterns of Query 5, 6, and 7














CHAPTER 5
MATHEMATICAL PROPERTIES OF OPERATORS
AND THEIR APPLICATIONS
IN QUERY OPTIMIZATION AND QUERY DECOMPOSITION


In Section 4.3, we have shown some mathematical properties of individual operators. In this chapter, we shall study their properties systematically. The properties of A-algebra are classified into six categories: (1) conventional algebraic properties such as commutativity, associativity, idempotency, nilpotency, and distributivity; (2) nesting of two unary operations; (3) a binary operation nested in a unary operation; (4) cascading of two different binary operations; (5) general identities; and (6) operation transformation. The properties presented in this dissertation are quite extensive, but may not be complete. These properties provide the mathematical foundation for query decomposition and query optimization. Their utility in these two applications is also illustrated in this chapter. The proofs of properties that are marked with a dagger ($\dagger$) can be found in the Appendix. Others can be proved similarly.



5.1 Conventional Algebraic Properties


To be systematic, first we list the properties given in Section 4.3 without

explanation, since they have been illustrated previously. Then, we give the pro-

perties of distributivity.









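A. Commutativity

\[ \alpha \;*[R(A,B)]\; \beta = \beta \;*[R(B,A)]\; \alpha \quad (5.1\,\dagger) \]
\[ \alpha \;|[R(A,B)]\; \beta = \beta \;|[R(B,A)]\; \alpha \quad (5.2\,\dagger) \]
\[ \alpha \;![R(A,B)]\; \beta = \beta \;![R(B,A)]\; \alpha \quad (5.3\,\dagger) \]
\[ \alpha \bullet\{W\}\, \beta = \beta \bullet\{W\}\, \alpha \quad (5.4\,\dagger) \]
\[ \alpha + \beta = \beta + \alpha \quad (5.5\,\dagger) \]

B. Associativity

\[ (\alpha_{\{X\}} \;*[R(A,B)]\; \beta_{\{Y\}}) \;*[R(C,D)]\; \gamma_{\{Z\}} = \alpha_{\{X\}} \;*[R(A,B)]\; (\beta_{\{Y\}} \;*[R(C,D)]\; \gamma_{\{Z\}}) \quad (C \notin \{X\} \wedge B \notin \{Z\}) \quad (5.6\,\dagger) \]
\[ (\alpha_{\{X\}} \;|[R(A,B)]\; \beta_{\{Y\}}) \;|[R(C,D)]\; \gamma_{\{Z\}} = \alpha_{\{X\}} \;|[R(A,B)]\; (\beta_{\{Y\}} \;|[R(C,D)]\; \gamma_{\{Z\}}) \quad (C \notin \{X\} \wedge B \notin \{Z\}) \quad (5.7\,\dagger) \]
\[ (\alpha_{\{X\}} \bullet\{W_1\}\, \beta_{\{Y\}}) \bullet\{W_2\}\, \gamma_{\{Z\}} = \alpha_{\{X\}} \bullet\{W_1\}\, (\beta_{\{Y\}} \bullet\{W_2\}\, \gamma_{\{Z\}}) \]
\[ ((\{W_1\}-\{W_2\}) \cap \{Z\} = \emptyset \;\wedge\; (\{W_2\}-\{W_1\}) \cap \{X\} = \emptyset) \quad (5.8\,\dagger) \]
\[ (\alpha + \beta) + \gamma = \alpha + (\beta + \gamma) \quad (5.9\,\dagger) \]

C. Idempotency and Nilpotency

\[ \alpha \bullet \alpha = \alpha \quad \text{(if } \alpha \text{ is a homogeneous association-set)} \quad (5.10) \]
\[ \alpha + \alpha = \alpha \quad (5.11) \]
\[ A \;*[R(A,A)]\; A = A \quad (5.12) \]
\[ A \;![R(A,A)]\; A = \emptyset \quad (5.13) \]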




Full Text
xml version 1.0 encoding UTF-8
REPORT xmlns http:www.fcla.edudlsmddaitss xmlns:xsi http:www.w3.org2001XMLSchema-instance xsi:schemaLocation http:www.fcla.edudlsmddaitssdaitssReport.xsd
INGEST IEID EZS2WFJTR_O925AA INGEST_TIME 2017-07-12T21:18:32Z PACKAGE AA00003326_00001
AGREEMENT_INFO ACCOUNT UF PROJECT UFDC
FILES



PAGE 1

$662&,$7,21 $/*(%5$ $ 0$7+(0$7,&$/ )281'$7,21 )25 2%-(&725,(17(' '$7$%$6(6 %\ 0,1*6(1 *82 $ ',66(57$7,21 35(6(17(' 72 7+( *5$'8$7( 6&+22/ 2) 7+( 81,9(56,7< 2) )/25,'$ ,1 3$57,$/ )8/),//0(17 2) 7+( 5(48,5(0(176 )25 7+( '(*5(( 2) '2&725 2) 3+,/2623+< 81,9(56,7< 2) )/25,'$

PAGE 2

&RS\ULJKW E\ 0LQJVHQ *XR

PAGE 3

'HGLFDWHG WR P\ GHDU ZLIH =KX 6XVLHf DQG ORYHO\ GDXJKWHU -LDODQ $QG WR RXU SDUHQWV -LQJFKHQJ *XR DQG 5XL\LQJ =KDQJ 6KX\DQ +XDQJ DQG &KXDQ[LDQJ &KHQ WKLV ZDV WKHLU GUHDP EHIRUH LW ZDV PLQH

PAGE 4

$&.12:/('*(0(176 ZRXOG OLNH WR H[SUHVV P\ VLQFHUH DSSUHFLDWLRQ WR 'U 6WDQOH\ 6X FKDLUPDQ RI P\ VXSHUYLVRU\ FRPPLWWHH IRU JLYLQJ PH WKH RSSRUWXQLW\ WR ZRUN RQ WKLV LQWHUHVWLQJ DQG LPSRUWDQW WRSLF LQ WKH DUHD RI REMHFWRULHQWHG GDWDEDVH V\VWHPV :LWKRXW KLV SDWLHQW JXLGDQFH DQG FRQWLQXRXV VXSSRUW WKLV ZRUN FRXOG QRW KDYH EHHQ FRPSOHWHG DP JUDWHIXO WR 'U +HUPDQ /DP FRFKDLUPDQ RI P\ VXSHUYLVRU\ FRPPLWWHH IRU KLV WKRXJKWSURYRNLQJ VXJJHVWLRQV RQ WKLV ZRUN WKDQN 'U 6KDP 1DYDWKH IRU KLV FRPn PHQWV DQG KLV SHUVRQDO OLEUDU\ WKDQN 'U 5DQG\ &KRZ IRU KLV HQFRXUDJHPHQW WKURXJKRXW P\ JUDGXDWH VWXG\ ZRXOG OLNH WR WKDQN 'U -RKQ 6WDXGKDPPHU IRU KLV WLPH DQG IRU EHLQJ RQ P\ VXSHUYLVRU\ FRPPLWWHH 0\ VSHFLDO WKDQNV JR WR 6KDURQ *UDQW WKH VHFUHWDU\ RI WKH 'DWDEDVH 6\VWHPV 5HVHDUFK DQG 'HYHORSPHQW &HQWHU ZKRVH KHOS WR PH LV DOZD\V IULHQGO\ DQG LQ WLPH 7KLV UHVHDUFK ZDV VXSSRUWHG E\ WKH 1DWLRQDO 6FLHQFH )RXQGDWLRQ '0& f DQG WKH 1DWLRQDO ,QVWLWXWH RI 6WDQGDUG DQG 7HFKQRORJ\ 1$1%'f 7KH GHYHORSPHQW HIIRUW LV VXSSRUWHG E\ WKH )ORULGD +LJK 7HFKQRORJ\ DQG ,QGXVWULDO &RXQFLO 831f ,9

PAGE 5

7$%/( 2) &217(176 3DJH $&.12:/('*0(176 LY $%675$&7 YLL &+$37(5 ,1752'8&7,21 $ 6859(< 2) 5(/$7(' :25. 5HODWLRQDO 0RGHO DQG 5HODWLRQDO $OJHEUD ([LVWLQJ 4XHU\ /DQJXDJHV (1&25( 'DWD 0RGHO DQG ,WV 8QGHUO\LQJ 4XHU\ $OJHEUD 29(59,(: 2) '$7$%$6(6 $1' $662&,$7,21%$6(' 48(5< )2508/$7,21 2YHUYLHZ RI 'DWDEDVHV 3DWWHUQEDVHG 4XHU\ )RUPXODWLRQ &RQFOXVLRQ $662&,$7,21 $/*(%5$ 'HILQLWLRQV 5HODWLRQVKLS %HWZHHQ 7ZR 3DWWHUQV $VVRFLDWLRQ 2SHUDWRUV 4XHU\ ([DPSOHV 0$7+(0$7,&$/ 3523(57,(6 2) 23(5$7256 $1' 7+(,5 $33/,&$7,216 ,1 48(5< 237,0,=$7,21 $1' 48(5< '(&20326,7,21 &RQYHQWLRQDO $OJHEUDLF 3URSHUWLHV 1HVWLQJ RI 7ZR 8QDU\ 2SHUDWRUV 1HVWLQJ RI %LQDU\ 2SHUDWRU LQ 8QDU\ 2SHUDWRU &DVFDGLQJ RI 7ZR %LQDU\ 2SHUDWRUV *HQHUDO ,GHQWLWLHV 7UDQVIRUPDWLRQ RI 2SHUDWRUV $SSOLFDWLRQV LQ 4XHU\ 2SWLPL]DWLRQ DQG 'HFRPSRVLWLRQ &203/(7(1(66 2) 7+( $$/*(%5$ &21&/86,21 Y

PAGE 6

5()(5(1&(6 $33(1',; %,2*5$3+,&$/ 6.(7&+ YL

PAGE 7

$EVWUDFW RI 'LVVHUWDWLRQ 3UHVHQWHG WR WKH *UDGXDWH 6FKRRO RI WKH 8QLYHUVLW\ RI )ORULGD LQ 3DUWLDO )XOILOOPHQW RI WKH 5HTXLUHPHQWV IRU WKH 'HJUHH RI 'RFWRU RI 3KLORVRSK\ $662&,$7,21 $/*(%5$ $ 0$7+(0$7,&$/ )281'$7,21 )25 2%-(&725,(17(' '$7$%$6(6 %\ 0LQJVHQ *XR 'HFHPEHU &KDLUPDQ 'U 6WDQOH\ <: 6X 0DMRU 'HSDUWPHQW (OHFWULFDO (QJLQHHULQJ ([LVWLQJ '%06V ODFN D VROLG PDWKHPDWLFDO IRXQGDWLRQ IRU WKH PDQLSXODWLRQ RI GDWDEDVHV RSWLPL]DWLRQ RI TXHULHV DQG WKH GHVLJQ DQG VHOHFWLRQ RI VWRUDJH VWUXFWXUHV IRU VXSSRUWLQJ GDWDEDVH PDQLSXODWLRQV $Q DVVRFLDWLRQ DOJHEUD $ DOJHEUDf LV SUHVFULEHG IRU VHUYLQJ DV D PDWKHPDWLFDO IRXQGDWLRQ IRU SURFHVVLQJ GDWDEDVHV ZKLFK LV DQDORJRXV WR WKH XVH RI UHODWLRQDO DOJHEUD IRU SURFHVVLQJ UHODWLRQDO GDWDEDVHV ,Q WKLV DOJHEUD REMHFWV DQG WKHLU DVVRFLDWLRQV LQ DQ GDWDEDVH DUH XQLn IRUPO\ UHSUHVHQWHG E\ DVVRFLDWLRQ SDWWHUQV ZKLFK DUH PDQLSXODWHG E\ D QXPEHU RI RSHUDWRUV WR SURGXFH RWKHU DVVRFLDWLRQ SDWWHUQV 'LIIHUHQW IURP WKH UHODWLRQDO DOJHn EUD LQ ZKLFK VHW RSHUDWLRQV RSHUDWH RQ UHODWLRQV ZLWK XQLRQFRPSDWLEOH VWUXFWXUHV WKH $DOJHEUD RSHUDWRUV FDQ RSHUDWH RQ DVVRFLDWLRQ SDWWHUQV RI ERWK KRPRJHQHRXV DQG KHWHURJHQHRXV VWUXFWXUHV 'LIIHUHQW IURP WKH WUDGLWLRQDO UHFRUGEDVHG UHODWLRQDO SURn FHVVLQJ WKH $DOJHEUD DOORZV YHU\ FRPSOH[ SDWWHUQV RI REMHFW DVVRFLDWLRQV WR EH GLUHFWO\ PDQLSXODWHG 3DWWHUQEDVHG TXHU\ IRUPXODWLRQ DQG WKH $DOJHEUD RSHUDWRUV DUH GHVFULEHG 6RPH PDWKHPDWLFDO SURSHUWLHV RI WKH DOJHEUDLF RSHUDWRUV DUH 9OO

PAGE 8

SUHVHQWHG WRJHWKHU ZLWK WKHLU DSSOLFDWLRQ LQ TXHU\ GHFRPSRVLWLRQ DQG RSWLPL]DWLRQ 7KH FRPSOHWHQHVV RI WKH $DOJHEUD LV DOVR GHILQHG DQG SURYHQ 7KH $DOJHEUD KDV EHHQ XVHG DV WKH EDVLV IRU WKH GHVLJQ DQG LPSOHPHQWDWLRQ RI DQ REMHFWRULHQWHG TXHU\ ODQJXDJH 24/ ZKLFK LV WKH TXHU\ ODQJXDJH XVHG LQ D SURWRW\SH .QRZOHGJH %DVH 0DQDJHPHQW 6\VWHP 26$0r.%06 9OOO

PAGE 9

&+$37(5 ,1752'8&7,21 ,Q WKH SDVW WZR GHFDGHV WHFKQLTXHV RI GDWD PRGHOLQJ KDYH JRQH WKURXJK WZR PDMRU FRQFHSWXDO FKDQJHV )LUVW LQ HDUO\ V ( ) &RGG REVHUYHG WKDW IXWXUH GDWDEDVH V\VWHPV VKRXOG DOORZ DSSOLFDWLRQ SURJUDPV DQG WHUPLQDO XVHUV WR UHPDLQ XQDIIHFWHG E\ FKDQJHV PDGH WR WKH LQWHUQDO GDWD UHSUHVHQWDWLRQ RU WKH VWRUDJH VWUXFWXUHf RI D GDWDEDVH +H LQWURGXFHG WKH UHODWLRQDO GDWD PRGHO >&2'@ DQG SURSRVHG WKH UHODWLRQDO DOJHEUD DQG UHODWLRQDO FDOFXOXV >&2'D@ DV WKH PDWKHPDWLFDO IRXQGDWLRQ IRU SURFHVVLQJ UHODWLRQDO GDWDEDVHV 7KH UHODWLRQDO PRGHO SURYLGHV WZR OHYHOV RI GDWD LQGHSHQGHQFH LQ D WKUHHOHYHO DUFKLWHFWXUH IRU D GDWDn EDVH PDQDJHPHQW V\VWHP DV VKRZQ LQ )LJXUH ILJXUHV RI HDFK FKDSWHU DUH SODFHG DW WKH HQG RI WKH FKDSWHUf $W WKH ORZHU OHYHO WKH SK\VLFDO GDWD LQGHSHQn GHQFH LV SURYLGHG LH WKH ORJLFDO UHSUHVHQWDWLRQ RI D UHODWLRQDO GDWDEDVH LV D VHW RI UHODWLRQV LH IODW WDEOHVf ZKLFK LV LQGHSHQGHQW RI WKH SK\VLFDO GDWD DQG VWRUDJHf VWUXFWXUHV LQ ZKLFK GDWD DUH VWRUHG $W WKH KLJKHU OHYHO WKH ORJLFDO GDWD LQGHSHQn GHQFH LV SURYLGHG LH WKH H[WHUQDO YLHZ UHPDLQV XQFKDQJHG ZKHQ WKH ORJLFDO YLHZ RI D GDWDEDVH LV PRGLILHG QRWH WKDW WKH H[WHUQDO YLHZ UHPDLQV XQFKDQJHG RQO\ IRU VRPH VFKHPD PRGLILFDWLRQVf %HVLGHV VLPSOH ORJLFDO UHSUHVHQWDWLRQ DQG GDWD LQGHSHQGHQFH WKH IDFW WKDW WKH UHODWLRQDO PRGHO KDV D VROLG PDWKHPDWLFDO IRXQGDn WLRQ LV YHU\ LPSRUWDQW DQG KDV FRQWULEXWHG WR WKH VXFFHVV RI WKH PRGHO DQG WKH H[LVWLQJ UHODWLRQDO GDWDEDVH PDQDJHPHQW V\VWHPV

PAGE 10

+RZHYHU WKH UHODWLRQDO PRGHO DQG UHODWLRQDO V\VWHPV KDYH VRPH OLPLWDWLRQV )RU H[DPSOH WKH PRGHO FDSWXUHV UDWKHU OLPLWHG VWUXFWXUDO SURSHUWLHV RI UHDOZRUOG HQWLWLHV RU REMHFWV 7KH FRQVWUXFW RI DJJUHJDWLRQ KLHUDUFK\ ZKLFK PRGHOV FRPSOH[ REMHFWV DQG WKH FRQVWUXFW RI JHQHUDOL]DWLRQ ZKLFK PRGHOV WKH VXSHUFODVVVXEFODVV UHODWLRQVKLS DUH QRW SURYLGHG ,Q WKH UHODWLRQDO PRGHO GDWD ZKLFK GHVFULEH D FRPn SOH[ REMHFW DUH VFDWWHUHG DPRQJ D QXPEHU RI QRUPDOL]HG UHODWLRQV DQG DFFHVVLQJ WKDW GDWD LQYROYHV WLPHFRQVXPLQJ WUDYHUVDO DQG DVVHPEO\ RI GDWD VWRUHG LQ PXOWLn SOH UHODWLRQV 7KH PRGHO DOVR GRHV QRW DOORZ EHKDYLRUDO SURSHUWLHV RI HQWLWLHVREMHFWV WR EH H[SOLFLWO\ GHILQHG 7KH VHFRQG FRQFHSWXDO FKDQJH RI GDWD PRGHOLQJ WHFKQLTXHV RFFXUUHG LQ WKH HDUO\ V 7KH REMHFWRULHQWHG SDUDGLJP ILUVW LQWURGXFHG LQ WKH SURJUDPPLQJ ODQJXDJH 6,08/$ >'$+@ DQG PDGH YHU\ SRSXODU WKURXJK WKH ODQJXDJH 60$//7$/. >*2/@ DOORZV ULFKHU VWUXFWXUDO FRQVWUXFWV DQG EHKDYLRUDO SURSHUn WLHV RI REMHFWV WR EH VSHFLILHG DW WKH ORJLFDO OHYHO LQGHSHQGHQW RI WKHLU SK\VLFDO LPSOHPHQWDWLRQV 6HYHUDO IHDWXUHV RI WKH SDUDGLJP VXFK DV DEVWUDFW GDWD W\SHV LQKHULWDQFH HQFDSVXODWLRQ LQIRUPDWLRQ KLGLQJ SRO\PRUSKLVP HWF KDYH EHHQ VKRZQ WR EH XVHIXO IRU GDWD PRGHOLQJ DQG V\VWHP GHYHORSPHQW 7KH REMHFW HQFDSn VXODWLRQ FRQFHSW DGGV D OHYHO RI GDWD LQGHSHQGHQFH EHWZHHQ WKH SK\VLFDO DQG WKH ORJLFDO LQGHSHQGHQFHV LQWURGXFHG LQ WKH UHODWLRQDO PRGHO DV GHSLFWHG LQ )LJXUH ,W UHTXLUHV WKDW WKH VWUXFWXUDO DQG EHKDYLRUDO SURSHUWLHV RI DQ REMHFW EH ORJLFDOO\f HQFDSVXODWHG LQ LWV FODVV LQ WKH FRQFHSWXDO YLHZ RI DQ GDWDEDVH 6LQFH WKHQ D QXPEHU RI 2EMHFW2ULHQWHG f DQG VHPDQWLF GDWD PRGHOV KDYH EHHQ SURSRVHG >+$0 %$7 .,1 =$1D =$1E '$' 0$, 0$1 68

PAGE 11

=' :2( %$1 ),6 +25 +8/ .,0 52: &$5 &2/ 68@ ZKLFK RIIHU PRUH SRZHUIXO FRQVWUXFWV IRU PRGHOLQJ WKH VWUXFWXUDO DQG EHKDYLRUDO SURSHUWLHV RI REMHFWV IRXQG LQ DGYDQFHG DSSOLFDWLRQV VXFK DV &$'&$0 &$6( DQG GHFLVLRQ VXSSRUW V\VWHPV $Q VHPDQWLF GDWD PRGHO FDQ EH VWUXFWXUDOO\ DQGRU EHKDYLRUDOO\ REMHFW RULHQWHG >',7@ $ VWUXFWXUDOO\ GDWD PRGHO LV RQH WKDW HQFRPSDVVHV DW OHDVW WKH IROORZLQJ FKDUDFWHULVWLFV f ,W VXSSRUWV WKH XQLTXH LGHQWLILFDWLRQ RI REMHFWV WKDW LV HDFK REMHFW KDV D XQLTXH REMHFW LGHQWLILHU VXUURJDWHf ZKLFK LV YDOLG IRU WKH OLIHWLPH RI WKH REMHFW f ,W FDWHJRUL]HV WKRVH REMHFWV ZKLFK FDQ EH GHVFULEHG E\ WKH VDPH VHW RI FKDUDFn WHULVWLFV DWWULEXWHVf LQWR DQ REMHFW FODVV f ,W DOORZV DJJUHJDWLRQ DVVRFLDWLRQf KLHUDUFKLHV WR EH GHILQHG f ,W DOORZV JHQHUDOL]DWLRQ DVVRFLDWLRQf KLHUDUFKLHV WR EH GHILQHG 7KH YLHZ RI DQ DSSOLFDWLRQ ZRUOG LV UHSUHVHQWHG LQ WKH IRUP RI D QHWn ZRUN RI FODVVHV DQG DVVRFLDWLRQV 2EMHFW FODVV FDQ EH HLWKHU D SULPLWLYHFODVV ZKRVH LQVWDQFHV DUH RI VLPSOH GDWD W\SHV HJ VWULQJ LQWHJHUf RU D QRQSULPLWLYH FODVV HJ 3DUW 6WXGHQW 7HDFKHUf $W WKH H[WHQVLRQDO OHYHO LQVWDQFHV RI GLIIHUHQW FODVVHV FDQ EH UHODWHG DVVRFLDWHGf ZLWK HDFK RWKHU IRUPLQJ SDWWHUQV RI REMHFW DVVRn FLDWLRQV $ EHKDYLRUDOO\ REMHFWRULHQWHG GDWD PRGHO RQ WKH RWKHU KDQG LV RQH LQ ZKLFK RSHUDWLRQV WKDW GHVFULEH WKH EHKDYLRU RI WKH REMHFWV RI D FODVV FDQ EH GHILQHG DQG UHJLVWHUHG ZLWK WKDW FODVV 3URJUDPV RU PHWKRGV WKDW LPSOHPHQW WKH RSHUDn WLRQV GHILQHG IRU DQ REMHFW DUH WUDQVSDUHQW WR WKH XVHU RI WKH REMHFWV

PAGE 12

)RU WKHVH PRGHOV WR EH WUXO\ XVHIXO WKH\ PXVW SURYLGH VRPH REMHFW PDQLSXODn WLRQ ODQJXDJHV ZKLFK FDQ WDNH DGYDQWDJH RI WKH H[SUHVVLYH SRZHU RI WKH PRGHOV DQG SURYLGH WKH XVHUV ZLWK VLPSOH DQG SRZHUIXO TXHU\LQJ IDFLOLWLHV 5HFHQWO\ VHYHUDO TXHU\ ODQJXDJHV VXFK DV '$3/$; >6+,@ *(0 >=$1 768@ $5,(/ >0$&@ )$' >%$1@ 326748(/ >52:@ (;&(66 >&$5@ DQG RWKHUV UHSRUWHG LQ >'$' 0$1 6(5 %$1 ),6 %$1 &2/ 6+$@ KDYH EHHQ SURSRVHG 7KHVH ODQJXDJHV ZHUH GHYHORSHG EDVHG RQ GLIIHUHQW SDUDn GLJPV )RU H[DPSOH '$3/$; DQG WKH TXHU\ ODQJXDJH RI >0$1@ DUH EDVHG RQ WKH IXQFWLRQDO SDUDGLJP 7KH TXHU\ ODQJXDJH RI >%$1@ LV EDVHG RQ WKH PHVVDJH SDVVLQJ SDUDGLJP 2WKHU TXHU\ ODQJXDJHV DUH EDVHG RQ WKH UHODWLRQDO SDUDGLJP DQ H[WHQVLRQ RI 48(/ >52: &$5@ DQ H[WHQVLRQ RI 64/ >'$'@ DQG DQ H[WHQVLRQ RI WKH UHODWLRQDO DOJHEUD >&2/@ 7KH TXHU\ ODQJXDJH RI >),6@ LV EDVHG RQ ERWK IXQFWLRQDO DQG UHODWLRQDO SDUDGLJPV DOORZLQJ IXQFWLRQV WR EH XVHG LQ REMHFWRULHQWHG 64/ 264/f FRQVWUXFWV 7KH DERYH ODQJXDJHV KDYH DQ IODYRU DQG KDYH WDNHQ VLJQLILFDQW VWHSV WRZDUGV WKH GHYHORSPHQW RI D SRZHUIXO TXHU\ ODQJXDJH 4XHU\ ODQJXDJHV VXFK DV '$3/$; >6+,@ *(0 >=$1@ $5,(/ >0$&@ DQG WKH REMHFW RULHQWHG TXHU\ ODQJXDJH GHVFULEHG LQ >%$1@ DUH EDVHG RQ WKH YLHZ RI D GDWDn EDVH GHILQHG LQ WHUPV RI REMHFWV REMHFW FODVVHV DQG WKHLU DVVRFLDWLRQV $ TXHU\ LQ WKHVH ODQJXDJHV LV IRUPXODWHG E\ VSHFLI\LQJ RQH FODVV XVXDOO\ D QRQSULPLWLYHFODVV ZKRVH LQVWDQFHV DUH UHDO ZRUOG REMHFWVf LQ WKH VFKHPD DV D FHQWUDO FODVV ZLWK VRPH SDWK H[SUHVVLRQV (DFK SDWK H[SUHVVLRQ VWDUWV IURP WKH FHQWUDO FODVV DQG HQGV DW DQRWKHU FODVV XVXDOO\ D SULPLWLYHFODVV ZKRVH LQVWDQFHV DUH RI EDVLF GDWD W\SHV
such as integer, string, set, etc.). A restriction condition can be specified on the class referenced at the end of a path expression. This class can also be specified in the list of attributes to be retrieved. The result of a query is a set of tuples, each of which corresponds to a single instance of the central class and contains values related to that instance which are collected from classes specified in the list.

A major drawback of these query languages is that they do not maintain the closure property [ALAb]. A query language is said to be closed if the result of a query can be further queried by other queries specified in the same language. In the above mentioned languages, the input to a query has an object-oriented representation (i.e., a network of objects, classes, and their associations), whereas its output is a relation which does not have the same structural and behavioral properties as the original objects. Consequently, the result of a query cannot be further processed by the same set of operators. The design of these languages is very much influenced by the relational model and relational languages, which are concerned mainly with retrieval and storage operations. In object-oriented processing, objects in different classes that satisfy some search conditions are subject to different user-defined operations. The idea of collecting data to form a resulting relation does not satisfy this processing model.

The query languages proposed in [DAD, MAN, BAN, ROW, CAR, COL] use nested relations as their logical views of object-oriented databases. Although these languages are closed, i.e., operators in these languages operate on nested relations to produce nested relations, the nested relation is not a proper logical representation for an object-oriented database, which is basically a network structure of
object associations. Mapping from a network representation to nested relations is an additional process. Furthermore, in order to use a nested relation to represent complex network structures, a considerable amount of data has to be introduced to relate these nested relations. It is our view that the query language and its underlying algebra should directly support the manipulation of network structures.

A query algebra [SHA] was proposed recently based on the object-oriented model ENCORE [ELM]. Although ENCORE models applications as networks of objects, object types, and their associations, the domain of the algebra is defined as sets of objects of the Tuple type, which is essentially the nested relation representation since it allows the nesting of tuples. Therefore, the mapping problem addressed above still remains. In this algebra, two identical queries or two identical operations in a single query do not give the same response, since each produces a new object in the database. To eliminate duplicated copies of the same newly created object, the algebra introduces operations like DupEliminate and Coalesce, which would not have been necessary if the algebra were to directly support the network-structured processing of object-oriented databases. We further observe that the union operation in this algebra may produce a collection of objects having the same data type but with different structures (e.g., the union of two collections of objects of the Tuple type with different arities). Nevertheless, the other operators introduced in the algebra are not defined to operate on collections of objects with heterogeneous structures.

A common limitation of many existing query languages is that they cannot easily express non-association relationships between objects, i.e., identify objects
in two classes that are not associated with each other while their classes are. For example, in an object-oriented database, let us assume that Suppliers $s_1$ and $s_2$ supply Parts $p_1$ and $p_2$, respectively. GEM, POSTQUEL, and several other query languages provide the dot construct (Suppliers.Parts), and ARIEL provides the "of" construct (Parts of Suppliers), to navigate from the class Suppliers to the class Parts to produce the object pairs ($s_1 p_1$ and $s_2 p_2$). However, they do not have a language construct for specifying the semantics that $s_1$ does not supply $p_2$ and $s_2$ does not supply $p_1$. Similarly, in functional languages only the function Parts(Suppliers) is provided to specify the associations $s_1 p_1$ and $s_2 p_2$, but not the non-association of suppliers and parts.

In view of the disadvantages of the existing query languages, we would like to stress the importance of using a graph as the logical representation of an object-oriented database at both intensional and extensional levels, as exemplified by [LEC], FAD [BAN], and OSAM* [SU]. The query language and its underlying algebra should provide constructs to directly process graphs with different degrees of complexity. They should also support the specification of non-associations and the processing of heterogeneous structures. Furthermore, the closure property should be maintained.

In this dissertation, we propose an association algebra (A-algebra) based on the graph representation of object-oriented databases and the association-based query formulation (refer to Chapter ). Analogous to the development of the relational algebra for relational databases, the development of the A-algebra provides the formal foundation for query processing and optimization in object-oriented databases and for
designing query languages. Unlike the record (tuple)-based relational algebra [COD and COD] and the query algebra [SHA], the A-algebra is association-based, i.e., the domain of the algebra is sets of association patterns (e.g., linear structures, trees, lattices, networks, etc.), and processing an object-oriented database is based on the matching and manipulation of homogeneous as well as heterogeneous patterns of object associations. Operators of the A-algebra can be used to navigate a network of interconnected object classes along the path of interest to construct a complex pattern as the search condition. They can also be used to decompose a complicated pattern into simple ones. Ten operators have been defined for the algebra: three unary operators [A-Select, A-Project, and A-Integrate] and seven binary operators [Associate (*), A-Complement (|), A-Union, A-Difference, A-Divide, NonAssociate, and A-Intersect], where the prefix A stands for Association. Although many of these operators correspond to the relational algebra operators, they differ from them in that they can operate on complicated heterogeneous structures. In this respect, the A-algebra is more general than the relational algebra.

The rest of this dissertation is organized as follows. A detailed survey on the relational model and the relational algebra, the existing query languages, and a recently proposed query algebra is provided in Chapter . The graphical representation of object-oriented databases and the association-based query formulation are described in Chapter with the help of examples. Chapter formally defines the concepts of Schema Graph (SG), Object Graph (OG), and association patterns. The formal definitions of the association operators and their simple mathematical
properties are also presented. The A-algebra expressions for some example queries are given to demonstrate the utility of the algebra. Chapter presents the mathematical properties of the association operators and their utility in query optimization and query decomposition. The proofs of the mathematical properties of the operators can be found in the Appendix. The completeness of the A-algebra is shown in Chapter , and the conclusion is given in Chapter .

[Figure: Data independencies in relational databases. Diagram labels: logical data independence; physical data independence.]

[Figure: Architecture of object-oriented databases. Diagram labels: logical data independence; encapsulation; physical data independence.]

CHAPTER
A SURVEY OF RELATED RESEARCH

This chapter surveys some of the existing work related to the development of the A-algebra. Section describes the relational model and the relational algebra, while Section surveys some existing query languages designed for object-oriented semantic data models. The query algebra that recently appeared in the literature is surveyed in Section .

Relational Model and Relational Algebra

When the hierarchical and network data models were used extensively in information systems in the late 1960s, Codd [COD] raised an interesting and important question: Can application programs and terminal activities remain invariant as the internal data representations (physical representations) change? He asserted that the future users of large data banks must be protected from having to know how the data were organized in the machine. Following this rationale, he conceived the notion of data independence, which suggests that the logical organization of data should be independent of its physical representation. Determined to demonstrate the validity of his data independence concept, he proposed a relational data model based on n-ary relations.

The scheme of a relation R of an entity set $\{E_1, E_2, \ldots, E_n\}$ is defined on a set of m attributes $\{A_1, A_2, \ldots, A_m\}$ which correspond to m domains $\{D_1, D_2, \ldots, D_m\}$ (not necessarily distinct). Each entity (the instance of the scheme) is represented by an m-ary tuple which has its first attribute value from $D_1$, its second attribute value from $D_2$, and so forth. A set of attributes of a relation is called a key if the entities of the relation can be uniquely identified by the values of these attributes. In particular, the information on the suppliers, such as their names, addresses, items they supply, and the prices of the items, can be represented by the relation SUPPLIERS of the following scheme:

SUPPLIERS(SNAME, SADDRESS, ITEM, PRICE)

where the attributes SNAME and ITEM form a composite key. Data represented in this form, which intuitively is a flat table, is the logical view of an application world. It has nothing to do with the physical representation of the data.

When designing a database using the relational model, one is often faced with a choice among alternative sets of relation schemes. Some choices are more favorable than others for various reasons. For example, the relation SUPPLIERS is not a desirable scheme because it has the following potential problems:

(1) Redundancy: the address of the supplier is repeated once for each item supplied.
(2) Potential inconsistency (update anomalies): as a consequence of the redundancy, the update of the address of a supplier in one tuple will leave it inconsistent with the address in another tuple.
(3) Insertion anomalies: the address of a supplier cannot be recorded if that supplier does not currently supply at least one item,
since SNAME and ITEM form a composite key of the relation SUPPLIERS.
(4) Deletion anomalies: the inverse of problem (3) is that, should all the items supplied by one supplier be deleted, we unintentionally lose the address of that supplier.

The causes of these problems and their solutions are related to the functional dependencies among the attributes of a relation [COD, ULL]. Suppose X and Y are two sets of attributes of a relation. Y functionally depends on X, or X functionally determines

The decomposition of a relation based on the functional dependencies among its attributes is a novel issue of normalization in the relational model. Four types of normal forms, denoted by 1NF, 2NF, 3NF, and Boyce-Codd NF, respectively, have been recognized in considering the functional dependency [COD, ARM, and BEE]. The Boyce-Codd NF is the strongest of these normal forms. Relations in these normal forms may have to be further decomposed into 4NF or 5NF to eliminate multivalued dependencies [FAG, DEL, and ZAN] and join dependencies [AH]. This decomposition is needed to eliminate further redundancy and anomalies.

The success and popularity of the relational model and the relational database management systems (DBMSs) are due to its simplicity in structural (tabular) representation and its sound theoretical basis: the relational algebra and the relational calculus [CODa]. The relational algebra defines five primitive operators, of which two are unary operators [Projection ($\pi$) and Selection ($\sigma$)] and three are binary operators [Cross-product ($\times$), Union ($\cup$), and Difference ($-$)]. Other operators such as Join, Natural-join, Set-intersection, and Set-division are also defined in the algebra. Although these latter operators are easy to use, they are not primitive, since they can be expressed in terms of the primitive operators. The relational algebra has the closure property, since every operator operates on one or more relations and produces a new relation. Operators of the relational algebra basically operate on the values of tuples in relations. Structurally speaking, they are defined to operate on tuples whose structures are union-compatible (homogeneous). The relational algebra is complete in the sense that it
has expressive power equivalent to that of the relational calculus [CODa and ULL]. Because of this, it serves as the theoretical basis for the relational model. The relational algebra has been used for the following three purposes, although it has not been implemented in any existing DBMS exactly as defined [ULL]:

(1) It creates a new class of query languages called algebraic languages. Based on the relational algebra, languages that directly adopt the relational operators can be developed, such as ISBL [TOD], which is a close approximation to the relational algebra. Although languages of this type are mostly procedural, it is relatively easy to demonstrate their completeness along with the mathematical properties of the relational algebra, which can be readily applied to query optimization and query decomposition.
(2) It serves not only as a benchmark for evaluating query languages in existing systems, but also as the criterion for designing new languages for relational DBMSs. A relational language will not have the necessary expressive power if it is not relationally complete [ULL].
(3) It provides a mathematical basis for transforming expressions in query decomposition and logical (or conceptual) query optimization. As an algebra, the mathematical properties of the relational algebra can be explored precisely and systematically. For query languages construed as algebraic languages, these mathematical properties have a straightforward application [HAL]. Query languages like SQUARE or SEQUEL, having certain algebraic features, may also use these properties, since the parse of a query yields a tree in which
some nodes represent relational algebra operators [AST]. Even if a query language such as QUEL is a relational calculus language, its calculus-like expressions are translated into relational algebra expressions in the QUEL optimizer [WON].

The total content proposed by Codd before 1979 on the relational model is referred to as Version 1 of the relational model (RM/V1), whose modeling capabilities were extended by Codd in [COD] to version RM/T (T for Tasmania). Based on these two versions, Codd [COD] introduces Version 2 of the relational model (RM/V2). The most important additional features in RM/V2 are as follows:

(1) A new treatment of items of data missing because they represent properties that happen to be inapplicable to certain object instances.
(2) New features supporting all kinds of integrity constraints, especially the user-defined integrity constraints.
(3) A more detailed account of view updatability.
(4) New features pertaining to the management of distributed databases.

It is important to recognize the fact that the hierarchical and network models, as well as the relational model, evolved during a time in which the primary applications of information systems were business-oriented. In attempts to apply these techniques to more complicated application areas such as CAD/CAM, CASE, and decision support, it is found that the relational model is no longer adequate for modeling these advanced applications. The inadequacies of the relational model are summarized as follows. First, the relational model has limited modeling
capabilities. When data are logically represented in the form of relations, the relationships among entities in these relations are represented by matching values of the attributes or keys in one relation with values of the attributes or foreign keys in other relations. The actual semantics among the data (such as generalization and aggregation, the abstract data type) cannot be modeled by the relational model. Second, the relational model only models the structural aspects of entities and thus ignores their behavioral aspects (e.g., system-defined and user-defined operations). Third, in these advanced applications, the concept of data independence should be further extended to the concept of object encapsulation, i.e., not only should the logical representation of an object be separated from its physical representation, but its structural and behavioral properties should be logically encapsulated in its class. The object encapsulation concept cannot be realized in the relational model, since the data describing an entity may be logically scattered among several relations due to normalization [COD, CODb, BEE, and ULL]. Fourth, entities with complex structures and complicated relationships among entities are not representable by flat tables (relations). Finally, it cannot represent and operate on entities with different (heterogeneous) structures.

Existing Query Languages

An extensive literature search on query languages for accessing object-oriented databases such as GEM [ZAN, TSU], ARIEL [MAC], DAPLEX [SHI], FAD [BAN], POSTQUEL [ROW], EXCESS [CAR], as well as other proposed languages [ST, DAD, MAN, SER, BAN, FIS, BAN, COL,
SHA] has been carried out. This section surveys a representative sample of these languages. Most existing query languages have capabilities beyond those provided by their theoretical basis. For example, the arithmetic operations and aggregation functions provided by the relational languages are not available in the relational algebra. Therefore, this survey is limited to those features which are relevant to the proposed algebra. To demonstrate the similarities and differences of these languages, the same database schema as shown in Figure is used for example queries written in GEM, ARIEL, and DAPLEX. The sample schema of Figure is for a government-owned laboratory system, where rectangles represent classes and edges (links) represent attributes.

QUEL [ST, WON, and Z] is a tuple-calculus oriented query language for the relational DBMS INGRES [ST]. In order to avoid the ambiguity which arises when two attributes of different relations having the same name are addressed in a single query, QUEL uses a dot mechanism to qualify an attribute of a relation (i.e., a dot is inserted between the name of the relation and the name of the attribute). For example, Equipment.Name refers to the attribute Name of the relation Equipment. Influenced by this mechanism, the existing object-oriented query languages use similar notations for navigating the database schema from one class to another, or from one relation to other relations in systems which use relational databases as their backends.

The language GEM [ZAN, TSU] is an extension of QUEL for the data model DSIS, which supports aggregation, generalization, and unique identification
of objects. In GEM, a class in an aggregation hierarchy that has a link emanating to another class has the name of the latter class as the data type of one of its attributes. For example, the class Lab has an attribute Facility of the type Equipment and has another attribute Locality of the type Location, and so forth. The dot notation is used in GEM for navigating along the reference attributes (links) in query formulation. The following GEM query retrieves the name of the manager, the serial number of the equipment, and the address for each laboratory whose headquarters is located in New York ...

... [ST], the dot notation is used in a manner similar to that found in GEM to implement the abstract data type (ADT) concept. In addition, QUEL is used as a data type to facilitate the navigation from one relation to another. A relation may have a field of type QUEL which may contain expressions or commands (queries). Whenever the field is addressed in a query, these expressions, in whole or in part, will be activated. In general, if X is the tuple variable of the relation $R_1$, Y is a field of type QUEL in relation $R_1$, and the query stored in Y retrieves field Z of another relation $R_2$,
then the expression X.Y.Z is a field in a collection of this view. In other words, the expression will return the values of the Z field of tuples (in $R_2$) that are related to X through Y. For example, let the relation Manager have a field called OfficeInfo of type QUEL, which contains a query that retrieves the telephone number of the relation Location. The expression Manager.OfficeInfo.Tel returns the telephone number for each manager in a tabular format. Clearly, the implementation of QUEL as a data type provides a way to relate data in two relations without modifying the database schema.

Instead of using the dot notation, ARIEL [MAC] takes advantage of the OF notation. The example query described for GEM can be restated as:

    Range of Lab is Lab
    Retrieve Name OF Manager OF Lab, Serial OF Equipment OF Lab, Address OF Location OF Lab
    Where City OF Headquarters OF Department OF Manager OF Lab = "New York"

DAPLEX [SHI] is a functional data language. The data retrieval component of DAPLEX is similar to the languages described above, although it is interpreted differently. In the functional paradigm, the class having a link (i.e., attribute) emanating to another class is considered as a function. The function has, by default, the name of the class to which the link points. For example,
Location(Lab) and Department(Headquarters) represent the facts that Lab has Location and Headquarters has Department as attributes, respectively. When the function Location(Lab) is applied to an object of the class Lab, it returns a value which is an object in the domain class over which the attribute is defined. If the navigation is from one class to another through a sequence of classes, a nested function is used. For instance, the expression Name(Manager(Lab)) specifies the name of the manager of a laboratory to which the manager is responsible. For a particular object of Lab, the manager of the laboratory is produced first; then the function Name() is applied to the returned manager and returns the name of the manager. The example query can be expressed in DAPLEX as follows:

    FOR EACH Lab SUCH THAT City(Headquarters(Department(Manager(Lab)))) = "New York" ...

... [BAN] introduce a query language based on message passing. In the message passing paradigm, the name of a link emanating from a class is interpreted as the name of a message which is stored within that class. One can assume there is actually a message created by the system and having, by default, the same name as its corresponding attribute. When such a message is sent to an instance of the class, it returns the value of the attribute. For example, the following is an expression for selecting a laboratory that has a manager who belongs to a subordinate department of its New York ...

... [BAN], similar message-based expressions can be used to retrieve attribute values of the resulting Lab instance. The result of a query which involves such conditions is the set of the instances of Lab along with its attribute
values and is represented in a tabular form.

As shown in the samples of these query languages, their query formulations, though interpreted differently, are very similar to each other. This is evident in the fact that the formulation of queries is accomplished by navigating the graphically represented database schema from class to class through their respective links. In each of these languages, however, a query operates on a database that is structurally represented using an object-oriented data model and returns a result whose structure is represented in a tabular form. Consequently, the result of a query cannot be further queried by other queries written in the same language. Therefore, these languages are not closed.

Another drawback of these languages is seen in their navigation mechanisms, which can only formulate queries against classes (or relations) that are interrelated in simpler patterns like the linear and forest structures shown in Figure a. However, in object-oriented databases the graphical patterns in which objects are interrelated with each other are basically networks, which are not restricted to plane graphs (a graph is a plane graph if it can be drawn on a plane without any intersection of two edges). They can be as complicated as surface graphs (a graph is a surface graph if it can be drawn on a surface without any intersection of two edges). Phrasing queries against classes that are interrelated in the more complicated patterns depicted in Figure b is beyond the capabilities of these languages.

A third drawback of these languages, which renders their navigation mechanisms insufficient, is that only one type of relationship (an object is related to another object) between objects of two classes can be expressed. In fact, when
two classes are directly linked at the schema level, objects in these two classes may have another type of relationship: an object is not related to another object. This type of relationship represents the complement aspect of the semantics specified for the two associated classes, such as not-a-part-of, not-a-function-of, or is-not-a, which is often needed in querying object-oriented databases. For example, "For each laboratory, list the equipment that is not available" is a reasonable query.

The proposed query languages [DAD, MAN, BAN, ROW, CAR, COL] use nested relations as their logical views of object-oriented databases. A nested relation is a generalized relation, i.e., a recursively defined relation: the attributes of a relation can be either atomic values or another relation in which the attributes can be a third relation, and so forth. Figure shows an example of a nested relation. Nested relations are particularly suitable for representing data in forest structures. The above languages are considered to be closed, since operators in these languages operate on nested relations and produce nested relations. However, they also have the drawbacks mentioned above, and it is our view that the nested relation is not a proper logical representation for an object-oriented database, which consists of networks of objects, object classes, and their associations. Using nested relations to represent data in network structures introduces one level of indirection. Mapping from a network representation to nested relations is an extra process. Furthermore, in order to use a nested relation to represent complex structures, a large amount of data has to be replicated in the representation. Figure shows an example of using a nested relation to represent a graph having loops. Note that
vertex F has to be replicated three times.

ENCORE Data Model and Its Underlying Query Algebra

In spite of the popularity of the object-oriented paradigm and its application in the field of database management, the existing object-oriented database management systems still lack a solid mathematical foundation for the manipulation of an object-oriented database and the optimization of queries. Recently, a query algebra [SHA] was proposed for the ENCORE data model [ELM]. This section surveys the query algebra as well as the ENCORE model. It also serves as a comparison to the association algebra proposed in this dissertation.

The ENCORE Model

The ENCORE data model [ELM] supports abstract data types, type inheritance, typed collections of typed objects, objects with identity, and object encapsulation. It models an application as networks of objects, object types, and their associations. The definition of an abstract data type in this model includes the Name of the type, a set of Properties defined for instances of the type, and a set of Operations which can be applied to instances of the type. Properties reflect the state of an object, while operations may perform arbitrary actions. Properties are typed objects that may be implemented as stored values, procedures, or functions. The implementation of a property is invisible to the user and is assumed to return an object of the correct type and to have no side-effects.

In addition to user-defined abstract data types and a collection of atomic types such as Int, String, Boolean, etc. (i.e., primitive classes), ENCORE provides two parameterized types and a global Object type, which is the supertype of all other types. The parameterized type Set[T] defines T as the type or supertype of objects in a collection having type Set, and T is called the member type of the set. The parameterized tuple type associates types ($T_i$) with attribute names ($A_i$), and defines properties Get_attribute_value and operations Set_attribute_value for each attribute. The $T_i$'s can be any database types, thus allowing nesting of tuple types. The value of a tuple is represented as $\langle A_1{:}\,o_1, A_2{:}\,o_2, \ldots, A_n{:}\,o_n \rangle$, where the $A_i$'s are attributes of the tuple and the $o_i$'s are objects of the corresponding types.

The global supertype Object defines a family of operations for equality called i-equality, where i indicates how deeply a comparison of two objects must search before finding equality. Two objects are identical when they are the same object, i.e., they have the same identity. Identical objects are 0-equal (or just $=$), and for $i > 0$, two objects are i-equal ($=_i$) if

(1) they are both collections of the same cardinality and there is a one-to-one correspondence between the collections such that corresponding members are $=_{i-1}$, OR
(2) they both have the same type (not a collection type) and the values of corresponding properties are $=_{i-1}$.

Type Object also defines a stronger notion of equality called id-equality. Two objects are id-equal at depth i if they are i-equal and the graphical representations of the objects are isomorphic.
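The recursive definition of i-equality above can be sketched in Python as follows. This is an illustrative reading, not ENCORE's implementation: the class `Obj`, the helper `identical`, and the use of Python lists as collections are assumptions made for the sketch. It is also assumed that atomic values are identical when they have the same type and value, and that identical objects count as i-equal at every depth.

```python
from itertools import permutations

ATOMS = (int, float, str, bool)  # assumed stand-ins for ENCORE's atomic types


class Obj:
    """A non-collection object: a type name plus named property values."""

    def __init__(self, type_name, **props):
        self.type_name = type_name
        self.props = props


def identical(a, b):
    """0-equality: same identity (atoms are compared by type and value)."""
    if isinstance(a, ATOMS) and isinstance(b, ATOMS):
        return type(a) is type(b) and a == b
    return a is b


def i_equal(a, b, i):
    """Return True if a and b are equal when compared to depth i."""
    if identical(a, b):
        return True
    if i == 0:
        return False
    # Case (1): collections of the same cardinality with a one-to-one
    # correspondence whose paired members are (i-1)-equal.
    # Brute force over permutations; adequate only for small collections.
    if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        if len(a) != len(b):
            return False
        return any(
            all(i_equal(x, y, i - 1) for x, y in zip(a, perm))
            for perm in permutations(b)
        )
    # Case (2): same non-collection type, corresponding properties (i-1)-equal.
    if isinstance(a, Obj) and isinstance(b, Obj):
        return (
            a.type_name == b.type_name
            and a.props.keys() == b.props.keys()
            and all(i_equal(a.props[k], b.props[k], i - 1) for k in a.props)
        )
    return False
```

For example, two distinct `Obj("Address", city="Boston")` objects are not 0-equal but are 1-equal, and two distinct Job objects that refer to them become equal only at depth 2.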

The Underlying Query Algebra of ENCORE

The query algebra [SHA] is proposed based on the object-oriented model ENCORE. The domain of the query algebra is defined as a typed collection of typed objects. A typed collection is of parameterized type Set[T], and the objects in the collection are of type T. If the objects of a collection are collected from different types, T is their most specific common type in the type lattice. For example, if object a is of type S, object p is of type P, and S is a supertype of P, the collection of objects a and p is of type Set[S]. The query algebra is closed, since the operators of the query algebra operate on collection(s) of objects with type Set[$T_i$] and produce a collection with type Set[$T_k$], where type $T_k$ is defined by the query. Similar to the languages surveyed in Section , the query algebra addresses a property of an object using "dot" notation (e.g., a.p.q, where a is an object of type $T_1$, p is a property of a and is of type $T_2$, and q is a property of p and is of type $T_3$).

Twelve operators are defined in this algebra. We give their brief definitions, followed by some example queries to illustrate the major concepts of this algebra.

(1) The Select operation creates a collection of objects which satisfy a selection predicate:
$\mathrm{Select}(S, p) = \{\, s \mid (s \in S) \wedge p(s) \,\}$
where p is the predicate.
(2) The Image operation is used to return a single object for each object in the queried collection and has the form
$\mathrm{Image}(S, f{:}\,T) = \{\, f(s) \mid s \in S \,\}$
where S is a collection of objects and f returns an object of type T.
(3) The Project operation extends Image by allowing the application of many functions to an object, thus supporting the creation and maintenance of selected relationships between objects. The relationships are stored as tuples with Tuple type:
$\mathrm{Project}(S, \langle (A_1, f_1), \ldots, (A_n, f_n) \rangle) = \{\, \langle A_1{:}\,f_1(s), \ldots, A_n{:}\,f_n(s) \rangle \mid s \in S \,\}$
where S is of type Set[T], the $A_i$'s are unique attribute names, and each $f_i$ takes a single input of type T and returns an object of type $T_i$. Project returns one tuple for each object in the collection being queried. Each newly created tuple is a new object with a unique object identifier.
(4) The Ojoin operator is an explicit join operator used to create relationships which are not defined between objects of two collections in the database. It is essentially a Cartesian product of collections of objects followed by a selection of result tuples. For collections S and R, the Ojoin is defined as follows:
$\mathrm{Ojoin}(S, R, A_1, A_2, p) = \{\, \langle A_1{:}\,s, A_2{:}\,r \rangle \mid s \in S \wedge r \in R \wedge p(s, r) \,\}$
where p is a predicate (as in Select) defined over objects from S and R. The Ojoin operation creates new tuples in the database to store the generated relationships. The tuples created will have unique object identifiers.
(5) Union, Difference, and Intersection are the usual set operations, with object comparisons and set membership based on object identity. The result of

these operations is considered to be a collection of objects of type T, where T is the most specific common supertype (in the type lattice) of the types of the objects in the operands.

(6) The Flatten operation is used to restructure sets of sets, and Nest and UnNest allow the representation of tuples as flat or nested relations.

(7) For the above operators, two identical operations cannot give identical responses, since each result collection is a newly identified object in the database, and the objects in a result collection may be either existing database objects or new tuple objects created during the operation. Operators DupEliminate and Coalesce are introduced to handle situations where equal objects are created by a query.

The example queries are issued against the Supplier-Parts-Job database shown in Figure. For the purpose of these examples, it is assumed that Type Object is the only supertype for each of the given types.

Example 1. Find all red parts. Which suppliers can supply all of the red parts?

    P_red := Select(Parts, λp p.color = "Red")
    S_Pred := Select(Suppliers, λs P_red subset_of s.Inventory)

The first selection finds the red parts, and the second selection finds all suppliers for which the inventory includes that set of parts. The subset_of operation is available since property Inventory and result P_red both have type Set[Part].

Example 2. What parts are needed by jobs in Boston?

    BosJobs := Select(Jobs, λj j.address.city = "Boston")
    BosJobParts := Project(BosJobs, λj <(J, j), (Pt, j.PartsNeeded)>)

The select operation finds the jobs in Boston, and the project operation gives information about which parts are needed for each job in Boston. The result of the projection is of type Set[Tuple]. Note that operation NewPart (of type Job) cannot be applied to members of BosJobParts, since they have type Tuple. However, it is appropriate for objects BosJobParts.J.

Example 3. Find all local suppliers for each job.

    LocalS := Ojoin(Jobs, Suppliers, J, S, λj λs j.address.city = s.address.city)

This Ojoin operation produces a set of tuples of type <(J, Job), (S, Supplier)>, which is similar to a normalized relation. To get a set of suppliers for each job, a Nest operation needs to be applied:

    Nest(LocalS, S)

From the above description, we can see that the query algebra supports many features of O-O databases and has taken significant steps towards a powerful query algebra to serve as the mathematical foundation for O-O databases. However, it still has the following limitations.

(1) Although ENCORE models an application as networks of types, objects, and their associations, the domain of its underlying query algebra is defined as collections of objects having type Set[T], which is essentially a nested relation representation, since the member type T of the set type can be a parameterized Tuple type, which may in turn contain attributes of Tuple types. Therefore, the query algebra cannot represent network-structured relationships among objects efficiently, and the mapping problem addressed before still remains.

(2) In this algebra, two identical expressions, or two identical operations in a single expression, do not give identical responses, since each result collection is a newly identified object in the database. To eliminate duplicated copies of the same newly created object, the algebra introduces the DupEliminate and Coalesce operations, which are not necessary if it directly supports the network view of O-O databases.

(3) In this algebra, a collection may contain objects with heterogeneous structures. (For example, two objects are both of Tuple type but with different arities, and the union of the two objects is also a collection of objects having Tuple type. However, other operators in this algebra are not defined to operate on such collections.)

(4) Since the query algebra is developed for a specific O-O model (i.e., ENCORE), it is difficult to apply to other O-O models.
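The operator definitions and the three example queries above can be approximated in a few lines of Python. This is only a rough sketch under stated assumptions, not the ENCORE implementation: the function names select, project, ojoin, and nest, the flattened city field (standing in for address.city), and all sample data are hypothetical, and Python dictionaries stand in for Tuple objects.

```python
from dataclasses import dataclass

# Hypothetical, simplified versions of the Supplier-Parts-Job types.
@dataclass(frozen=True)
class Part:
    num: str
    color: str

@dataclass(frozen=True)
class Supplier:
    ident: str
    city: str              # stands in for address.city
    inventory: frozenset   # set of Part

@dataclass(frozen=True)
class Job:
    num: str
    city: str              # stands in for address.city
    parts_needed: frozenset

def select(coll, pred):
    """Select(S, p): the members of S that satisfy predicate p."""
    return {x for x in coll if pred(x)}

def project(coll, fields):
    """Project(S, <(A_i, f_i)>): one new tuple (here a dict) per member of S."""
    return [{name: f(x) for name, f in fields} for x in coll]

def ojoin(s, r, a1, a2, pred):
    """Ojoin(S, R, A1, A2, p): Cartesian product followed by a selection."""
    return [{a1: x, a2: y} for x in s for y in r if pred(x, y)]

def nest(tuples, keep, nested):
    """Group the values of attribute `nested` by the value of attribute `keep`."""
    grouped = {}
    for t in tuples:
        grouped.setdefault(t[keep], set()).add(t[nested])
    return grouped

# Hypothetical sample data.
bolt, nut, gear = Part("p1", "Red"), Part("p2", "Red"), Part("p3", "Blue")
acme = Supplier("s1", "Boston", frozenset({bolt, nut, gear}))
zed = Supplier("s2", "Austin", frozenset({bolt}))
j1 = Job("j1", "Boston", frozenset({bolt, gear}))
j2 = Job("j2", "Austin", frozenset({nut}))
parts, suppliers, jobs = {bolt, nut, gear}, {acme, zed}, {j1, j2}

# Example 1: red parts, and suppliers whose inventory includes all of them.
p_red = select(parts, lambda p: p.color == "Red")
s_pred = select(suppliers, lambda s: p_red <= s.inventory)

# Example 2: parts needed by jobs in Boston.
bos_jobs = select(jobs, lambda j: j.city == "Boston")
bos_job_parts = project(bos_jobs, [("J", lambda j: j), ("Pt", lambda j: j.parts_needed)])

# Example 3: local suppliers for each job, then nested per job.
local_s = ojoin(jobs, suppliers, "J", "S", lambda j, s: j.city == s.city)
suppliers_per_job = nest(local_s, "J", "S")
```

Note that each call to project or ojoin builds fresh dictionary objects, which loosely mirrors limitation (2): two identical queries return equal but not identical results.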

Figure: A sample schema

Figure: Simple and complex query patterns (panel (a): simple query patterns)

Figure: An example of a nested relation (columns NAME, ADDRESS, and INVESTMENTS, the latter nested into COMPANY, SHARES, and PURCHASE PRICE and DATE; sample rows for John Smith and Jill Brody with investments in EXXON, FORD, and SEARS)

Figure: Using a nested relation to represent a complex structure (columns Pattern Number, A, B, C, E, F, H)

Type Supplier
    properties:  Ident: string
                 Address: Addr
                 Inventory: Set[Part]
    operations:  RecvOrder: Supplier, Set[Part] -> Supplier

Type Job
    properties:  Num: string
                 Address: Addr
                 PartsNeeded: Set[Part]
                 Preferred_Suppliers: Ordered_list[Supplier]
    operations:  NewPart: Job, Part -> Job

Type Part
    properties:  Num: string
                 Address: Addr
                 Color: string
                 Components: Set[Tuple[<(P, Part), (Qty, Int)>]]
                 Plan: drawing
                 BillofMaterial: list[Part]
    operations:  Order: Part -> Part
                 Same_Part: Part, Part -> Boolean

Type Addr
    properties:  Street: string
                 City: string
                 State: string

Figure: A Supplier-Parts-Job database

CHAPTER
OVERVIEW OF O-O DATABASES AND ASSOCIATION-BASED QUERY FORMULATION

This chapter informally introduces the graphical view of O-O databases and illustrates the association-based query formulation mechanism. The graphical view captures the most important characteristics of O-O databases, in which object classes and their objects are associated with each other. Based on this view, query formulation and processing can be made by specifying and manipulating association patterns in which objects are interrelated with each other, unlike the traditional attribute-based query formulation and processing, which match values in different relations. Since the graphical view is suitable for many O-O data models, the association algebra developed based on this view can be used as a general algebra for supporting these O-O databases. The graphical view of O-O databases is formalized in the next chapter.

Overview of O-O Databases

O-O semantic data models provide a conceptual basis for defining O-O databases. Although each model has some unique constructs that distinguish one model from the others, there are several common structural and behavioral properties based on which an algebra can be developed and used to support these models.

First, objects are physical entities, abstract concepts, events, processes, functions, or anything that an application cares to capture and represent.

Second, objects having the same structural and behavioral properties are grouped together to form an object class. Object classes can be categorized into two general categories: (1) the nonprimitive-class, which represents a set of objects of interest in an application world, each of which is assigned a system-wide unique object identifier (OID) and whose data are explicitly entered in a database by the user; and (2) the primitive-class, which represents a class of self-named objects serving as a domain for defining other object classes, such as a class of symbols or numerical values. The behavioral properties of an object class are defined in terms of system-defined or user-defined operations (e.g., retrieve, display, delete, insert, rotate a design object, hire an employee, etc.) which can meaningfully operate on its objects using their corresponding programs (or methods). The structural properties of an object class, and thus its objects, consist of two types of data: (1) descriptive data (or instance variables), which define the states of the objects, and (2) association data, which specify the relationships between its objects and the objects of some related classes.

Third, different O-O models recognize different types of associations. Two of the most commonly recognized associations are aggregation and generalization. Aggregation models the a-part-of, a-function-of, or a-composition-of relationship. For instance, a complex object can be modeled by an aggregation hierarchy (abstract data type) in which a complex object is defined in terms of its associations with objects in other defined classes. Generalization models the is-a or the

superclass-subclass relationship, in which an object in a subclass inherits both the structural and the behavioral properties of its superclass(es).

Thus, from the algebra point of view, an O-O database can be viewed as a collection of objects, grouped together in classes and interrelated through associations. It can be represented by graphs at both the intensional and the extensional levels. At the intensional (schema) level, a database is defined by a collection of interrelated object classes and is represented by a Schema Graph (SG). For example, the SG for a university database is illustrated in Figure, in which each rectangle denotes a nonprimitive-class, such as a class of person objects or a class of department objects, and each circle denotes a primitive-class, such as a class of names or ages. The associations among classes are represented by the edges in SG. For example, there is an association between the class Course and the class Department (an Aggregation association) and an association between the class Person and the class Student (a Generalization association). Since the semantic distinctions of these and other association types recognized by different semantic models can be either hard-coded in a DBMS or declaratively specified by some rules and used by a rule processor to govern the manipulation of the associated classes, the underlying algebra does not have to incorporate the semantics of these association types. All it has to be concerned with is whether or not an object class and its objects are associated with some other classes and their objects; i.e., the edges (or associations) are typeless in SG. For example, the semantics of inheritance can be incorporated in a query language translator, which translates a high-level language statement into its underlying algebraic representation. The algebra does not have to deal directly with the semantics of inheritance. This is particularly important if the algebra is to be used as a general algebra for supporting various O-O data models in which the semantics of an association type may have slightly different meanings.

At the extensional (instance) level, a database can be viewed as a collection of objects, grouped together in classes and interrelated through some typeless associations, and as such it can be represented by an Object Graph (OG). For example, the OG corresponding to a portion of the university schema graph is shown in Figure. In this example, a Teacher object t is associated with two Section objects, thereby representing the fact that he/she is teaching those two sections. The Student object s1 is associated with Undergrad object u1, which in turn is associated with Department object d1, thereby representing that s1 is an undergraduate student who minors in the department d1. Finally, one Section object is not associated with any object of the Student class, which represents the fact that it is not taken by any student. Object associations expressed by different graph patterns represent the semantic relationships among these objects in an application world.

Pattern-based Query Formulation

Based on this view of an O-O database, users can query the database by specifying patterns of object associations as search conditions. Once these objects are selected, they can be further processed by either system-defined operations (Retrieval, Display, Update, Insert, Delete, etc.) or user-defined

operations (RotatePart, PurchasePart, HireFaculty, etc.). For example, the following queries can be issued against the university database as illustrated in Figures (the algebraic expressions for these queries will be given in Section).

Query 1. For all sections, get the majors of students who are taking these sections.

To satisfy this query, we can specify a linear pattern containing the classes Section, Student, and Department, as shown in Figure a. In this pattern, a circle represents a class, and an edge represents that the objects of the two adjacent circles (classes) must be associated with each other. This pattern is called an intensional pattern, which represents that sections taken by students who major in some departments are to be identified. The answer to this query can be found in Figure by checking if the objects of these three classes satisfy such a pattern. There are five object patterns (called extensional patterns) which satisfy the intensional pattern, as shown in Figure b. One Section object and one Student object do not appear in these extensional patterns, since that section is not taken by any student and that student does not have a major yet.

These patterns can also be identified in two sequential steps. First, get all the patterns in which the Section objects are associated with the Student objects. Then, if a pattern generated in the first step (i.e., a Section-Student pair) is further associated with an object of Department, a new pattern consisting of three objects is constructed and retained in the result; otherwise, the pair is dropped.
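The two sequential steps just described can be sketched as follows. This is a minimal illustration, assuming the object graph is stored as sets of (object, object) edges; the edge sets takes and majors, the object names, and the helper extend_with_department are all hypothetical and do not reproduce the dissertation's figure.

```python
# Hypothetical regular edges of an object graph.
takes = {("sc1", "s1"), ("sc1", "s2"), ("sc2", "s2"), ("sc3", "s3")}  # Section-Student
majors = {("s1", "d1"), ("s2", "d2")}                                  # Student-Department

def extend_with_department(pairs, majors):
    """Step 2: keep a Section-Student pair only if the student is further
    associated with a Department; the result is a three-object pattern."""
    return {(sc, s, d) for (sc, s) in pairs for (s2, d) in majors if s == s2}

# Step 1: all Section-Student patterns (here simply the Section-Student edges).
section_student = set(takes)

# Step 2: extend each pair with a Department, dropping pairs that cannot be extended.
patterns = extend_with_department(section_student, majors)
```

In this sample data the pair ("sc3", "s3") is dropped because s3 has no major, which parallels the student without a major in the text.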

Once these objects (as well as their associations) have been identified, different system-defined or user-defined operations defined on their corresponding classes can be applied to these selected objects. For example, InformDepartment() can be an operation defined on the class Department. It sends each of the selected departments a letter concerning the majors of the students.

Suppose there is a rule in the university that a student cannot major and minor in the same department. To check whether there is such a case in the database, the following query can be issued.

Query 2. List students who major and minor in the same department.

The intensional pattern for this query is shown in Figure c. It can be formed by starting from the class Student and navigating the schema in two traversal paths (refer to Figure). One path is from Student to Department, which means that a student majors in a certain department, and the other path is from Student to Department through Undergrad, which means that a student is an undergraduate and minors in a certain department (we can see from the SG that only undergraduates may have minors). According to the query, a single student should associate with objects in both Undergrad and Department, and these two paths should merge at Department, thereby forming a loop. This implies two logical AND conditions, one at the Student class and the other at the Department class. We use double arcs to denote such conditions, as shown in Figure c. From Figure we can see that the student s1 has his major and minor in the department d1. This extensional pattern is depicted in Figure d.
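The loop pattern of Query 2, with its two AND conditions, can be sketched in the same edge-set style. The edge sets and object names below are hypothetical sample data (only s1, u1, and d1 echo the names used in the text), and major_minor_same is an illustrative helper, not an operator of the algebra.

```python
# Hypothetical regular edges of an object graph.
majors = {("s1", "d1"), ("s2", "d2")}        # Student-Department (major)
is_undergrad = {("s1", "u1"), ("s2", "u2")}  # Student-Undergrad
minors = {("u1", "d1"), ("u2", "d3")}        # Undergrad-Department (minor)

def major_minor_same(majors, is_undergrad, minors):
    """Return (student, undergrad, department) loops: the student must have both
    a major edge and an Undergrad edge (AND at Student), and the two paths must
    reach the same Department object (AND at Department)."""
    loops = set()
    for s, d_major in majors:
        for s2, u in is_undergrad:
            if s2 != s:
                continue
            for u2, d_minor in minors:
                if u2 == u and d_minor == d_major:
                    loops.add((s, u, d_major))
    return loops

violations = major_minor_same(majors, is_undergrad, minors)
```

Here s2 is not reported because its two paths end at different departments.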

Query 3. For those students taking a given section and having majors and/or minors, get their majors and/or minors.

There are several ways to form an intensional pattern for the query. We may start from Section# and traverse to Student through Section, and then navigate the schema in two paths, as we did for Query 2. According to the query, a student who has either a major or a minor should be included in the result (in this database, it is assumed that graduate students do not have minors). This means that either path of the navigation will construct a pattern that would satisfy the query. Thus, a logical OR condition exists at Student. We use a single arc to indicate the OR condition, as shown in Figure a. Like Query 2, these two branches merge at Department. However, this query does not require that they merge at the same Department object. This is specified by the second OR condition at Department in Figure a.

The extensional patterns that satisfy this query have heterogeneous structures (two types of linear patterns), as shown in Figure b. The first type includes patterns that represent the minors of the undergraduates, and the second type includes patterns that represent the majors of the students, who are either undergraduates or graduates. In both types of patterns, a student is associated with the given section number, which is assumed to be the Section# for a Section object sc. (Figure c will be described later in Section.)

We have given some example queries which specify how objects are associated with one another. In the graphical representation of an O-O database, when there is no edge between two objects even though there is one between their classes, it implies that the two objects are not associated with each other. This

represents the complement aspect of the semantics between two associated classes. It is necessary to allow a user to retrieve this type of object non-association from a database. The following query is such an example. It can also be specified by a pattern.

Query 4. For each teacher, list the sections which he/she does not teach.

We use a dashed line to represent the fact that two objects are not associated with each other. Therefore, the intensional pattern for this query can be drawn as in Figure d. There are twelve extensional patterns that match the intensional pattern. Figure e shows a portion of them. Non-association relationships among objects are not explicitly stored in a database. However, they can be derived during the processing of this type of query.

Using the above examples, we hope that we have convinced the reader that the pattern-based query formulation is suitable for query specification based on a graphical view of an O-O database.

Conclusion

The (typeless) graphical representation of O-O databases is applicable to most O-O data models, since it captures the essential characteristics of O-O data models, in which object classes as well as their objects are interrelated with each other in different association patterns. Querying such databases can be made by specifying patterns in which objects of interest are associated with each other. It should be clear that this formulation is quite different from the attribute-based query formulation in the existing relational query languages, which is based on

matching the attributes (or the key or composite key) of one relation with the attributes (foreign keys) in other relations. A query that requires the specification of a complex pattern of object associations can be specified in a rather straightforward manner in an association-based language, whereas in an attribute-based language, complex nestings of query blocks or multiple queries would be required [ALAa].

It is our view that an algebra developed for processing O-O data based on the graphical view of O-O databases and the pattern-based query formulation should satisfy the following requirements. First, it should allow direct manipulation of complex patterns of object associations. Second, the closure property should be maintained. Third, both association and non-association relationships among objects should be expressible as search conditions. Fourth, it should be complete in the sense that it can be used to describe all possible patterns in a database. Lastly, it must be able to represent and process patterns with both homogeneous and heterogeneous structures.

Figure: Schema graph of a university database

Figure: Object graph (object classes shown include Teacher and Undergrad)

Figure: Pattern specifications for Query 1 and Query 2 (panel (a): intensional pattern over Section, Student, and Dept; panel (b): extensional patterns of Section, Student, and Dept objects)

Figure: Pattern specifications for Query 3 and Query 4 (Query 3 panels over Section#, Section, Student, and Dept; Query 4 panels (d) and (e) over Teacher and Section)

CHAPTER
ASSOCIATION ALGEBRA

The association algebra (A-algebra) is defined based on a uniform representation of an O-O database in terms of objects, object classes, and typeless associations, as described in Chapter. The algebra contains a number of operators, which operate on graph structures of object associations to produce graph structures. The closure property of the algebra ensures that the result of a query can be further manipulated by other queries.

Definitions

First we formally define an O-O database at both schema and object levels.

Schema Graph (the intensional database): The schema graph of an O-O database is defined as SG(C, A), where C = {C_i} is a set of vertices representing object classes, and A = {A_ij(k)} is a set of edges, each of which, A_ij(k), represents an association between classes C_i and C_j, where k is a number for distinguishing the edges from one another when there is more than one edge between two vertices.

Object Graph (the extensional database): The object graph of an O-O database is defined as OG(O, E), where O = {O_ij} is a set of vertices representing object instances (O_ij being the j-th object in class C_i), and E is a set of edges representing the associations among object instances. When one object instance is connected with another in the object graph, a regular-edge (solid line) is drawn between the corresponding vertices O_ij and O_mn, which specifies that the j-th object instance in class C_i is related to the n-th object instance in class C_m through the k-th association of classes C_i and C_m. If two object instances O_ij and O_mn are not connected in the object graph, but their classes C_i and C_m in the corresponding SG are

directly connected, a complement-edge (dotted line) is drawn between O_ij and O_mn.

In this model, an object may participate in several classes (e.g., in a generalization hierarchy). Its representation in a class is called an object instance. Since, in most cases in this dissertation, object and object instance can be used interchangeably without any ambiguity, we shall use object unless a distinction is required between the two.

The reason for explicitly introducing complement-edges into the OG is to allow the A-algebra to manipulate both association and non-association between objects of two adjacent classes. In an actual database, it is not necessary to explicitly store the complement-edges. Figure illustrates the regular-edges and complement-edges among the objects of three object classes. For example, we see that section sc1 is taken by two of the students (regular-edges) and not taken by two others, among them s1 (complement-edges).

The relationship between an OG and its corresponding SG is formally described by the following proposition.

Proposition: An OG(O, E) is a morphism of its corresponding SG(C, A). The mapping function Fm maps each object instance to its class, Fm(O_ij) = C_i, and each edge of the OG between O_ij and O_mn to the corresponding edge A_im(k) of the SG.

The mapping between SG and OG is one-to-many, since a database is dynamically changing and may have different instantiations at different times for the same schema graph.
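Since complement-edges need not be stored, they can be derived on demand from the stored regular-edges of two classes that are adjacent in the SG. The following is a small sketch under that assumption; the function name complement_edges and the sample objects are hypothetical.

```python
from itertools import product

def complement_edges(class_a_objects, class_b_objects, regular_edges):
    """Derive the complement-edges between two adjacent classes: every pair of
    object instances (one per class) that is NOT connected by a regular-edge."""
    return {
        (a, b)
        for a, b in product(class_a_objects, class_b_objects)
        if (a, b) not in regular_edges
    }

# Hypothetical extension of the Section and Student classes.
sections = {"sc1", "sc2"}
students = {"s1", "s2", "s3", "s4"}
taken_by = {("sc1", "s2"), ("sc1", "s3"), ("sc2", "s1")}  # stored regular-edges

not_taken_by = complement_edges(sections, students, taken_by)
sc1_not_taken = {s for (sc, s) in not_taken_by if sc == "sc1"}
```

The same derivation answers a non-association query such as Query 4 (sections a teacher does not teach) when applied to the Teacher and Section classes.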

To define association pattern, we first extend the concept of connected graph in graph theory by treating complement-edges as edges; i.e., a connected graph is a graph in which there exists at least one path between any two vertices, and each path may contain regular-edges, complement-edges, or a combination of the two. We shall from now on use an uppercase letter to denote a class and the corresponding lowercase letter with a subscript to denote an object instance in that class. We shall assume that there is only one edge between any two vertices in SG unless otherwise specified, so as not to complicate the notation.

Association Pattern: A connected subgraph of an OG is an association pattern (or pattern for short).

By this definition, a single vertex (or object instance) in OG, which is a connected subgraph, is also a pattern. We call it an Inner-association-pattern (or Inner-pattern for short). It is algebraically represented by (a_i) for a vertex of class A in SG. Thus, object instances are treated as Inner-patterns in the A-algebra. A regular-edge, together with the two vertices (i.e., two Inner-patterns) it connects, is called an Inter-association-pattern (or Inter-pattern), which is represented by (a_i b_j). A complement-edge, together with the two Inner-patterns it connects, is called a Complement-association-pattern (or Complement-pattern) and is represented by \overline{(a_i b_j)}. This pattern states that a_i and b_j are not associated with each other in OG. If there is a path consisting of only regular-edges between vertices a_i and b_j, it can be represented by a Derived-inter-association-pattern (D-inter-pattern); otherwise, it can be represented by a Derived-complement-association-pattern (D-complement-pattern). When a path is represented by a derived pattern, it simply means that two vertices are indirectly associated or non-associated, but how they are interrelated (the actual path) is of no importance. A D-inter-pattern is treated as an Inter-pattern, and a D-complement-pattern is treated as a Complement-pattern, in the algebraic operations.

The above five types of patterns are the primitive patterns, the latter four being binary patterns. Their graphical and algebraic representations are summarized in Figure a. All other connected subgraphs are called complex patterns. For example, the complex pattern shown in Figure b1 contains three primitive patterns: two Inter-patterns, (a_1 b_1) and (b_1 d_1), and a Complement-pattern, \overline{(b_1 c_1)}. It can be uniquely defined by its algebraic representation as a set of primitive patterns, i.e., (a_1 b_1, \overline{b_1 c_1}, b_1 d_1). More examples of complex patterns are shown in Figure b.

From these examples, one can observe that a complex pattern can be decomposed into a set of binary patterns which cannot be further decomposed. This implies that, in the algebraic representation of a complex pattern, an Inner-pattern may not occur as an element, and a binary pattern may appear only once. A pattern in this algebraic format is called a normalized pattern; otherwise, it is called an unnormalized pattern. (b_j, b_j c_k), (b_j c_k, b_j c_k), and (a_i b_j, b_j c_k, a_i b_j) are examples of unnormalized patterns. During the process of constructing an association pattern, we always normalize it by eliminating the duplicates. The above three patterns have the normalized forms (b_j c_k), (b_j c_k), and (a_i b_j, b_j c_k), respectively.

The definitions of OG and association pattern imply that a pattern is a non-directional graph, i.e., (a_i b_j) = (b_j a_i), and that the sequence of primitive patterns in

the algebraic representation of a complex pattern is not important; hence (a_i b_j, b_j c_k) = (c_k b_j, a_i b_j).

Based on the above definition and notion of association pattern, we view an OG as an Association Graph (AG), and all the association patterns in AG form the domain of the A-algebra, denoted by 𝒜.

Relationship Between Two Association Patterns

The operators of the A-algebra are defined based on the possible relationships between two patterns in 𝒜, so that they can be used either to construct complex patterns using simpler patterns or to decompose a complex pattern into several patterns of simpler structures. There are four possible relationships between two patterns p1 and p2: non-overlap, overlap, contain, and equal.

(1) Non-overlap: Two patterns are said to be non-overlapping if they have no common Inner-pattern.

(2) Overlap: Two patterns are said to be overlapped, denoted by p1 ∩ p2, if they have at least one common Inner-pattern.

(3) Contain: Contain is a special case of (2), when all the primitive patterns of p1 are contained in p2. We say that p1 is a subpattern of p2 and denote this relationship by p1 ⊂ p2.

(4) Equal: This is a special case of (3), when p1 contains all the primitive patterns of p2 and vice versa. It is denoted by p1 = p2.

Before defining the association operators, we give the definition of Association-set, the operand of the association operators.

Association-set: An association-set, denoted by a Greek letter (α or β), is a set of association patterns without duplicates. α^i designates the i-th pattern in α, where

α^i ≠ α^j for all i ≠ j. An empty set is also an association-set, denoted by φ.

A special type of association-set is called a homogeneous association-set, which is important to the A-algebra, since some of the mathematical properties hold only when operands are homogeneous association-sets.

Homogeneous Association-set: An association-set is homogeneous if
(1) all patterns are formed by the Inner-patterns (or object instances) of the same set of object classes, and
(2) all patterns have the same number of Inner-patterns from each class in the set, and
(3) corresponding primitive patterns belong to the same association and are of the same type, and
(4) all patterns have the same topology.
Otherwise, it is a heterogeneous association-set.

Figure depicts three example association-sets: α is homogeneous, whereas β is not, since one pattern of β has only one Inner-pattern of class C instead of two like the others, and γ is not homogeneous because one of its patterns contains a Complement-pattern, which is different from the other two patterns of γ (i.e., different topologies).

Association Operators

Ten association operators are formally defined in this section: three unary operators [A-Project (Π), A-Select (σ), and A-Integrate] and seven binary operators [Associate (*), A-Complement (|), A-Union (+), A-Difference (−), A-Divide (÷), NonAssociate (!), and A-Intersect (•)]. The examples used to explain

these operators will make use of the domain 𝒜 shown in Figure. To keep the graph simple, the Complement-patterns are not shown in the figure. The simple mathematical properties, such as commutativity, associativity, idempotency, and nilpotency, satisfied by the operators are given after each definition.

Notations

Notations that will be used in the subsequent sections are listed below:

    A, B, ...               Denote classes.
    CL                      Denotes a variable for a class.
    [R(CL1, CL2)]           Denotes the association between classes CL1 and CL2.
    a_i                     Denotes the i-th Inner-pattern of class A.
    a                       Denotes an Inner-pattern variable.
    (a_i b_j)               Denotes an Inter-pattern between two classes A and B.
    \overline{(a_i b_j)}    Denotes a Complement-pattern between two classes A and B.
    (a_i c_k)               Denotes a Derived-pattern from class A to class C.
    α, β, ...               Denote association-sets.
    α^i                     Denotes the i-th pattern of association-set α.
    {X}, {Y}, ...           Denote sets of classes. Hence, α{X} represents an association-set α which has Inner-pattern(s) from the classes in {X}.

It should be noted that an Inner-pattern is represented by an object instance identifier (IID), which is a system-assigned object identifier (OID) prefixed by a class identification, so that the object instances of an object in multiple classes can be unambiguously distinguished, and the fact that these object instances are

instances of the same object can easily be recognized.

Operators

All relational algebraic operators operate on relations of homogeneous (or union-compatible) structures, with the exception of Cartesian-product and Join. The Cartesian-product and Join provide the mechanism to concatenate two relations of different structures into a single relation so that it can be further manipulated by other operators. In the A-algebra, all the operators are defined to operate on association patterns of homogeneous as well as heterogeneous structures. Therefore, the relational algebra is a special case of the A-algebra in this respect.

(1) Associate (*)

The Associate operator is a binary operator which constructs an association-set of complex patterns by concatenating the patterns represented by two operand association-sets. Since a pattern may involve many classes, and an object class may have more than one association with another class, it is necessary to specify through which association the concatenation of two patterns is intended. The Associate operation on association-sets α and β over the association R between classes A and B is defined as follows:

    α *[R(A,B)] β = { γ | γ = (α^i, β^j, a_m b_n) ∧ (a_m b_n) ∈ [R(A,B)] ∧ a_m ∈ α^i ∧ b_n ∈ β^j }

The result of an Associate operation is an association-set containing no duplicates. Each of its patterns is the concatenation of two patterns (one from each

operand association-set). More specifically, if the Inner-pattern (or object a_m) of A in α^i is associated with the Inner-pattern (or object b_n) of B in β^j in the domain of the algebra 𝒜 shown in Figure, then α^i and β^j are concatenated via the primitive pattern (a_m b_n). We do not restrict A and B to be different classes in *[R(A,B)]; i.e., α *[R(A,A)] β is a legitimate operation, which concatenates two patterns (one from each operand association-set) if they have a common Inner-pattern of class A.

An example of the Associate operation is shown in Figure a (for convenience, a copy of the sample database is shown in each figure for illustrating an operation). For clarity, we use graphical notation in the figures. In the example, the operation is performed over the association between classes B and C. One pattern of α is concatenated with two patterns of β, respectively, due to the existence of the corresponding Inter-patterns in 𝒜, as shown in Figure a. A second pattern of α is dropped simply because it does not have an Inner-pattern of class B. A third is dropped because its Inner-pattern of class B is not associated with any Inner-pattern of class C in 𝒜. One pattern of β cannot be concatenated through its Inner-pattern of class C with any pattern in α, because no pattern in α has an Inner-pattern of B that is associated with it in 𝒜. For the same reason, another pattern of β is dropped.

For the Associate operator, [R(A,B)] can be omitted if the following conditions hold: (1) both α and β are A-algebra expressions, (2) the Associate operator operates on the last class in a linear expression α and the first class in a linear expression β, and (3) there is a unique association between these two classes. For example, A *[R(A,B)] B can be written as A*B if class A is associated with class B through an attribute of A. It should be pointed out that the A-algebra allows an attribute to be defined by a computed value (or object). For instance,

PAGE 68

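$B = f(A)$. The implementations of the function and the procedure are invisible to the algebra. However, they should not have side effects, i.e., the computed result must be of the same type as B.

The Associate operator is commutative and conditionally associative, as defined below:

$$\alpha \; *[R(A,B)] \; \beta = \beta \; *[R(B,A)] \; \alpha \quad \text{(commutativity)}$$

$$(\alpha_{\{X\}} \; *[R(A,B)] \; \beta_{\{Y\}}) \; *[R(C,D)] \; \gamma_{\{Z\}} = \alpha_{\{X\}} \; *[R(A,B)] \; (\beta_{\{Y\}} \; *[R(C,D)] \; \gamma_{\{Z\}}) \quad \text{(associativity)}$$

$$A \; *[R(A,A)] \; A = A \quad \text{(idempotency)}$$

The associativity holds true if $\alpha$ and $\gamma$ do not have Inner-patterns of classes C and B, respectively. Otherwise, the associativity does not hold. For example, with $\mathcal{A}$ as shown in the figure of the sample database (the domain of the algebra), if $\alpha$ contains an Inner-pattern of class C while $\beta$ does not, then $(\alpha \; *[R(A,B)] \; \beta) \; *[R(C,D)] \; \gamma$ can be non-empty, whereas $\alpha \; *[R(A,B)] \; (\beta \; *[R(C,D)] \; \gamma) = \phi$.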
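As an illustration of the Associate definition above, the following minimal sketch models a pattern as a frozenset of object names and undirected edges, and an association-set as a Python set of patterns. The representation, the function name `associate`, and the sample objects are assumptions made for this sketch; they are not the dissertation's implementation.

```python
from itertools import product

def associate(alpha, beta, relation):
    """Sketch of Associate: concatenate patterns via Inter-patterns.

    alpha, beta: sets of patterns; a pattern is a frozenset whose members
    are object names (str) and edges (frozenset of two object names).
    relation: set of (a, b) pairs standing for the Inter-patterns of R(A, B).
    """
    result = set()  # a set, so the result contains no duplicates
    for p, q in product(alpha, beta):
        for a, b in relation:
            if a in p and b in q:
                # concatenate the two patterns via the primitive pattern (a b)
                result.add(p | q | {frozenset((a, b))})
    return result

# Hypothetical sample data: a1 and a2 are both associated with b1.
R_AB = {("a1", "b1"), ("a2", "b1")}
alpha = {frozenset({"a1"}), frozenset({"a2"}), frozenset({"a3"})}
beta = {frozenset({"b1"}), frozenset({"b2"})}

gamma = associate(alpha, beta, R_AB)

# Commutativity: swap the operands and read the association as R(B, A).
R_BA = {(b, a) for a, b in R_AB}
gamma_swapped = associate(beta, alpha, R_BA)
```

In this toy data, the patterns holding a3 and b2 are dropped because no Inter-pattern of the relation touches them, and swapping the operands yields the same association-set.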

PAGE 69

(2) A-Complement (|)

The A-Complement operator is a binary operator which concatenates the patterns of two operand association-sets over Complement-patterns. It is used to identify the objects in two classes which are not associated with each other in $\mathcal{A}$. The A-Complement operator is defined as follows:

$$\alpha \; |[R(A,B)] \; \beta = \{\, \gamma \mid \gamma = (\alpha^i, \beta^j, \overline{a_m b_n}),\ \overline{a_m b_n} \in [R(A,B)] \wedge a_m \in \alpha^i \wedge b_n \in \beta^j;$$
$$\text{or } \gamma = \alpha^i,\ (\exists m)(a_m \in \alpha^i) \wedge \neg(\exists n)(b_n \in \beta);$$
$$\text{or } \gamma = \beta^j,\ (\exists n)(b_n \in \beta^j) \wedge \neg(\exists m)(a_m \in \alpha) \,\}$$

The result of an A-Complement operation is an association-set. Each of its patterns is formed by concatenating two patterns (one from each operand association-set) via a Complement-pattern $\overline{a_m b_n}$, where $a_m$ and $b_n$ belong to $\alpha^i$ and $\beta^j$, respectively, and the Complement-pattern $\overline{a_m b_n}$ is in $\mathcal{A}$. In the special case when $\alpha$ (or $\beta$) is an empty association-set or does not have Inner-patterns of class A (or B), all patterns of $\beta$ (or $\alpha$) that have Inner-patterns of B (or A) are retained in the resulting association-set.

An example of the A-Complement operation is shown in part (b) of the figure of example operations. It operates over the association between classes B and C. One pattern of $\alpha$ does not appear in the resultant association-set because it contains no Inner-patterns of B. Another pattern of $\alpha$ cannot be A-Complemented with two of the patterns of $\beta$ because it is connected with them by Inter-patterns between B and C in $\mathcal{A}$.

Under the same conditions as given in the Associate operator, $[R(A,B)]$ need not be specified with the A-Complement operator unless there is an ambiguity. The A-Complement operator is commutative and associative. For a reason

PAGE 70

similar to that described for the Associate operator, the associativity holds true conditionally.

$$\alpha \; |[R(A,B)] \; \beta = \beta \; |[R(B,A)] \; \alpha \quad \text{(commutativity)}$$

$$(\alpha_{\{X\}} \; |[R(A,B)] \; \beta_{\{Y\}}) \; |[R(C,D)] \; \gamma_{\{Z\}} = \alpha_{\{X\}} \; |[R(A,B)] \; (\beta_{\{Y\}} \; |[R(C,D)] \; \gamma_{\{Z\}}) \quad (C \notin \{X\} \wedge B \notin \{Z\}) \quad \text{(associativity)}$$

$$A \; |[R(A,A)] \; A = \phi \quad \text{(nilpotency)}$$

(3) A-Select ($\sigma$)

The A-Select is a unary operator which operates on an association-set $\alpha$ to produce a subset of patterns that satisfy a specified predicate P. A pattern in the operand association-set is retained iff the predicate evaluates to true for that pattern:

$$\sigma(\alpha)[P] = \{\, \gamma \mid \gamma \in \alpha \wedge P(\gamma) = \text{true} \,\}$$

where $\alpha$ is defined by an algebraic expression and $P = T_1 \theta_1 T_2 \cdots \theta_{n-1} T_n$. Each term $T_i$ $(i = 1, \ldots, n)$ is a comparison between two expressions, and $\theta_i$ $(i = 1, \ldots, n-1)$ is a Boolean operator ($\wedge$ or $\vee$). $P(\gamma) = \text{true}$ represents that a pattern is evaluated true for that predicate. The expressions on the left- and right-hand sides of a comparison operation may contain constants, functions, and/or operations on objects, but cannot both be constants. The comparison terms are type sensitive, i.e., the results of the two expressions in a term should be data of the same type for primitive-classes or both IIDs for non-primitive-classes. $>$, $\geq$, $<$, $\leq$, $=$, and $\neq$ are the legitimate comparisons for numerical types; $=$ and $\neq$ for character string and IID types; and $=$, $\neq$, $\subset$, $\subseteq$, $\supset$, and $\supseteq$ for set types. The comparison of two IIDs is performed by comparing their OID portions, since IIDs are the concatenations of the class identifiers and OIDs.
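The A-Select definition can be sketched as a filter over an association-set. The snippet below is only an illustration for the stack and queue example that accompanies this definition: the attribute tables `top`, `bottom`, `head`, `tail`, their values, and the representation of a pattern as a (stack, queue) pair are all hypothetical.

```python
# Hypothetical attribute values for two stacks and two queues.
top = {"s1": 7, "s2": 4}
bottom = {"s1": 2, "s2": 9}
head = {"q1": 7, "q2": 4}
tail = {"q1": 2, "q2": 5}

def a_select(alpha, predicate):
    """Keep exactly the patterns of alpha for which the predicate is true."""
    return {pattern for pattern in alpha if predicate(pattern)}

def ends_match(pattern):
    # P = T1 AND T2: top(S) = head(Q) and bottom(S) = tail(Q)
    s, q = pattern
    return top[s] == head[q] and bottom[s] == tail[q]

def ends_overlap(pattern):
    # A set comparison: {top, bottom} and {head, tail} share some value.
    s, q = pattern
    return bool({top[s], bottom[s]} & {head[q], tail[q]})

# Each pattern pairs one stack with its associated queue (assumed data).
pairs = {("s1", "q1"), ("s2", "q2")}
matched = a_select(pairs, ends_match)
overlapping = a_select(pairs, ends_overlap)
```

With these values, only the pair (s1, q1) satisfies the conjunction, while both pairs satisfy the weaker set-overlap predicate.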

PAGE 71

A single-valued object or a single IID can be treated either as its own data type in a numerical, string, or IID comparison, or as a set type containing one element in a set comparison.

As an example of A-Select, we assume that there are two associated classes, S for stack and Q for queue. To select associated stack and queue object pairs in which the top and the bottom of the stack have some common object(s) with those in the head and the tail of the queue, it can be written as

$$\sigma(S * Q)[\,(\{top(S)\} \cup \{bottom(S)\}) \cap (\{head(Q)\} \cup \{tail(Q)\}) \neq \phi\,]$$

For the top equals the head and the bottom equals the tail, we have

$$\sigma(S * Q)[\,top(S) = head(Q) \wedge bottom(S) = tail(Q)\,]$$

(4) A-Project ($\Pi$)

Similar to the projection operation in the relational algebra, an A-Project operation is defined to project subpattern(s) of a pattern. However, in the relational algebra, the relationship among the projected attributes is not important, whereas in A-algebra the association among the projected subpatterns must be maintained so that the associations among the objects in these subpatterns will be retained. The A-Project operator is defined as follows:

$$\Pi(\alpha)[\mathcal{E}, \mathcal{T}]$$

where $\alpha$ is an association-set defined by an A-algebra expression, $\mathcal{E} = (e_1, e_2, \ldots, e_n)$ is a set of expressions which specify subpatterns to be projected, and $\mathcal{T} = (t_1, \ldots, t_m)$ is a set of ordered sets of classes. Each ordered set

PAGE 72

$t_k$ specifies a path connecting two projected subpatterns defined by the $\mathcal{E}$ expressions. $e_i$ $(i = 1, \ldots, n)$ is a subexpression of the expression which defines $\alpha$. $e_i$ and $e_j$ $(\forall i \neq j)$ should not contain a common class. There may be many paths connecting two subpatterns in the original pattern. The path to be retained can be specified in $t_k$. If a specific path is chosen, a minimal number of classes along the path which can uniquely identify the path should be specified. The result of an A-Project operation over a pattern is its subpatterns (defined by $\mathcal{E}$) and some paths (defined by $\mathcal{T}$) that connect these subpatterns. If a path in the original pattern consists of all Inter-patterns, a D-inter-pattern is retained. Otherwise, a D-complement-pattern is included. Multiple paths between two projected subpatterns can be declared in $\mathcal{T}$ if it is so desired.

Part (c) of the figure of example operations shows an A-Project from $\alpha$ over $A*B$ and $D$. For one pattern of $\alpha$, the subpatterns over A and B and over D satisfy $A*B$ and $D$, respectively. Therefore, they are kept in the result. According to the path specification stated in the operation, a Derived-pattern connecting the B object and the D object is added to the result, which is then written in its normalized form. A second result pattern is produced for the same reason. For a pattern of $\alpha$ that does not have a subpattern satisfying $A*B$, only its Inner-pattern of D is retained.

(5) NonAssociate (!)

The NonAssociate operator is a binary operator used to identify the association patterns in one operand association-set that are not associated (over a specified association) with any pattern in the other association-set, and vice versa,

PAGE 73

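in the domain of the algebra $\mathcal{A}$. The NonAssociate operator is defined as follows:

$$\alpha \; ![R(A,B)] \; \beta = \{\, \gamma \mid \gamma = (\alpha^i, \beta^j, \overline{a_m b_n}),\ \overline{a_m b_n} \in [R(A,B)] \wedge a_m \in \alpha^i \wedge b_n \in \beta^j \wedge (\forall b_k \in \beta)((a_m b_k) \notin [R(A,B)]) \wedge (\forall a_k \in \alpha)((a_k b_n) \notin [R(A,B)]);$$
$$\text{or } \gamma = \alpha^i,\ (\exists m)(a_m \in \alpha^i) \wedge (\neg(\exists n)(b_n \in \beta) \vee (\forall b_n \in \beta)(\exists k)(a_k \in \alpha \wedge (a_k b_n) \in [R(A,B)]));$$
$$\text{or } \gamma = \beta^j,\ (\exists n)(b_n \in \beta^j) \wedge (\neg(\exists m)(a_m \in \alpha) \vee (\forall a_m \in \alpha)(\exists k)(b_k \in \beta \wedge (a_m b_k) \in [R(A,B)])) \,\}$$

The result of a NonAssociate operation is an association-set. Each of its patterns is formed by concatenating two patterns $\alpha^i$ and $\beta^j$ via a Complement-pattern $\overline{a_m b_n}$ under the condition that $\alpha^i$ is not associated with any $\beta^j$, and vice versa. Furthermore, in the special case where the patterns of $\alpha$ (or $\beta$) have Inner-patterns of A (or B) and cannot be concatenated with any pattern of $\beta$ (or $\alpha$), these patterns of $\alpha$ (or $\beta$) will be retained in the result if one of the following three conditions holds: (1) $\beta$ (or $\alpha$) is an empty association-set, (2) all patterns of $\beta$ (or $\alpha$) do not have Inner-patterns of B (or A), or (3) all patterns of $\beta$ (or $\alpha$) that have Inner-patterns of B (or A) can be concatenated with patterns of $\alpha$ (or $\beta$).

An example of the NonAssociate operation is shown in part (d) of the figure of example operations. In the example, a pattern of $\alpha$ and a pattern of $\beta$ are dropped due to the existence of an Inter-pattern between them in the sample database; one pattern of $\alpha$ is dropped because it does not contain an Inner-pattern of class B; one pattern of $\beta$ is dropped because it does not contain an Inner-pattern of class C. The patterns in the resultant association-set exist because their Inner-patterns of class B are not associated with the corresponding Inner-patterns of class C in $\mathcal{A}$, as shown in the sample database.

Note that the NonAssociate operator produces a resultant association-set which is a subset of that produced by the A-Complement operator, because $\alpha^i$, $\beta^j$,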

PAGE 74

and $\overline{a_m b_n}$ may form a new pattern only when $a_m$ of $\alpha^i$ does not associate with any object of B in $\beta$ and $b_n$ of $\beta^j$ does not associate with any object of A in $\alpha$. In fact, the NonAssociate operator can be expressed in terms of A-Complement and other operators as follows:

$$A \; ![R(A,B)] \; B = (A - \Pi(A \; *[R(A,B)] \; B)[A]) \; |[R(A,B)] \; (B - \Pi(A \; *[R(A,B)] \; B)[B])$$

Thus, NonAssociate is not a primitive operator in a strict sense. However, it is very useful for query formulation and is therefore included in the set of A-algebra operators.

Under the same conditions as given in the Associate operator, $[R(A,B)]$ need not be specified unless there is an ambiguity. The NonAssociate operator is commutative but not associative.

$$\alpha \; ![R(A,B)] \; \beta = \beta \; ![R(B,A)] \; \alpha \quad \text{(commutativity)}$$

$$A \; ![R(A,A)] \; A = \phi \quad \text{(nilpotency)}$$

(6) A-Intersect ($\bullet$)

The A-Intersect operation is convenient for constructing a pattern with a branch or a lattice structure (a pattern that has a loop), since a pattern in such structures can be viewed as the intersection of two patterns. Conceptually, the A-Intersect operator is equivalent to the JOIN operator in the relational algebra. It operates on two operand association-sets over a set of specified classes. Two patterns, one from each association-set, are combined into one if they contain the same set of Inner-patterns for each specified class. The A-Intersect operation is defined as follows:

PAGE 75

m^r` r^:f 3>
PAGE 76

Now we define three set operators, which are different from the corresponding set operators in relational algebra since they operate on heterogeneous structures as well as homogeneous structures.

(7) A-Integrate

The A-Integrate is a unary operator. It reorganizes patterns in an association-set according to the relationships among patterns with respect to the classes specified. The A-Integrate operation over the classes $\{W\}$ is defined as follows:

$$\text{A-Integrate}_{\{W\}}(\alpha) = \{\, \gamma \mid \gamma \text{ is the combination of the patterns of } \alpha_t,\ \alpha_t \subseteq \alpha,\ (\forall CL_n \in \{W\})(\forall o \in CL_n)(o \in \alpha^i \wedge \alpha^i \in \alpha_t \Rightarrow (\forall \alpha^k \in \alpha_t)(o \in \alpha^k)) \,\}$$

By this definition, a subset of patterns ($\alpha_t$) of $\alpha$ is combined into a single pattern if every object instance of classes in $\{W\}$ that appears in a pattern in the subset is also contained in all other patterns in the subset. If a pattern of $\alpha$ cannot be combined with any other pattern, it is retained in the resultant association-set as it is. If no class is specified, patterns in which every pattern has at least one object instance (of any class) common to another will be integrated into one pattern. The reorganized association-set will contain patterns which are apart from each other.

Part (f) of the figure of example operations shows two examples. The first example shows an A-Integrate operation over class A. Patterns that have a common Inner-pattern of class A are grouped into one. All other patterns in $\alpha$ are retained in the result as they are. The second example illustrates an A-Integrate operation on the same association-set of

PAGE 77

the first example but without specifying a class. The result becomes two patterns which are apart and are exactly the same as they appear in the original database, whereas the same primitive patterns appear more than once in the result of the first example.

(8) A-Union (+)

Similar to the UNION operation of the relational algebra, A-Union combines two association-sets into one. However, these two association-sets can contain heterogeneous association structures. It is important for A-algebra to be able to operate on heterogeneous structures because some prior operations may produce heterogeneous association-sets, which may need to be further processed over the objects of a common class against other patterns of associations. Unlike the relational algebra and other query languages, union-compatibility is not a restriction in A-algebra. For this reason, A-algebra has more expressive power. Any query that can be expressed by a single expression in other languages can be expressed as a single A-algebra expression, but not vice versa. The A-Union operation is defined as follows:

$$\alpha + \beta = \{\, \gamma \mid \gamma \in \alpha \vee \gamma \in \beta \,\}$$

The A-Union operator is commutative, associative, and idempotent.

$$\alpha + \beta = \beta + \alpha \quad \text{(commutativity)}$$

$$(\alpha + \beta) + \gamma = \alpha + (\beta + \gamma) \quad \text{(associativity)}$$

$$\alpha + \alpha = \alpha \quad \text{(idempotency)}$$
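The point that A-Union accepts operands of different structures can be made concrete with a small sketch. It assumes, purely for illustration, that a pattern is a frozenset of object names and edges and that an association-set is a Python set; the helper names and sample objects are hypothetical.

```python
def edge(x, y):
    """An undirected edge between two objects (illustrative encoding)."""
    return frozenset((x, y))

def a_union(alpha, beta):
    """Sketch of A-Union: lump two association-sets into one, no duplicates."""
    return alpha | beta

# Operands with different structures: an A-B pattern, a lone C object,
# a C-D pattern, and a lone D object. No union-compatibility is required.
alpha = {frozenset({"a1", "b1", edge("a1", "b1")})}
beta = {frozenset({"c1"}), frozenset({"c2", "d1", edge("c2", "d1")})}
gamma = {frozenset({"a1", "b1", edge("a1", "b1")}), frozenset({"d2"})}

mixed = a_union(alpha, beta)      # three patterns of three different shapes
deduped = a_union(alpha, gamma)   # the shared A-B pattern appears once
```

Because the association-set is modelled as a set, commutativity, associativity, and idempotency follow directly from the corresponding properties of set union.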

PAGE 78

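(9) A-Difference ($-$)

The A-Difference implements the same concept as the DIFFERENCE operator in relational algebra, but with two differences. First, its operands do not have to be union-compatible. Secondly, a pattern in the minuend is retained if it does not contain any of the patterns in the subtrahend.

$$\alpha - \beta = \{\, \gamma \mid \gamma = \alpha^i,\ \alpha^i \in \alpha \wedge \neg(\exists \beta^j)(\beta^j \in \beta \wedge \beta^j \subseteq \alpha^i) \,\}$$

The example depicted in part (g) of the figure of example operations shows that two patterns of $\alpha$ are dropped since they both contain a pattern of $\beta$.

(10) A-Divide ($\div$)

The A-Divide operator implements the concept that a group of patterns with certain common features contains another set of patterns.

$$\alpha \div_{\{W\}} \beta = \{\, \gamma \mid \gamma = \alpha^i,\ \alpha^i \in \alpha_t \wedge (\forall \beta^j \in \beta)(\exists \alpha^k \in \alpha_t)(\beta^j \subseteq \alpha^k) \,\}$$

where $\alpha_t$ is a subset of the patterns of $\alpha$ which have common Inner-patterns for all classes of $\{W\}$ and which together contain all patterns of $\beta$. If $\{W\}$ is not specified, the A-Divide operation retains all the patterns of $\alpha$, provided that each of them contains at least one pattern of $\beta$ and that together they contain all patterns of $\beta$.

Part (h) of the figure of example operations shows an example of $\alpha$ being divided by $\beta$ with respect to class B. The A-Divide operation retains three patterns of $\alpha$ since they all contain the same Inner-pattern of B and together contain all patterns of $\beta$.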
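The containment-based reading of A-Difference differs from ordinary set difference, which a short sketch can show. The encoding of patterns as frozensets of objects and edges, the function name `a_difference`, and the sample data are assumptions made for this illustration only.

```python
def edge(x, y):
    """An undirected edge between two objects (illustrative encoding)."""
    return frozenset((x, y))

def a_difference(alpha, beta):
    """Sketch of A-Difference: keep a minuend pattern only if it contains
    none of the subtrahend patterns as a subpattern."""
    return {p for p in alpha if not any(q <= p for q in beta)}

alpha = {
    frozenset({"a1", "b1", edge("a1", "b1")}),
    frozenset({"a2", "b2", edge("a2", "b2")}),
    frozenset({"b1", "c1", edge("b1", "c1")}),
}
beta = {frozenset({"b1"})}  # a single Inner-pattern of class B

kept = a_difference(alpha, beta)

# Ordinary set difference removes nothing here, because no pattern of
# alpha is *equal* to the pattern in beta; the operands are heterogeneous.
plain = alpha - beta
```

Here both patterns that contain the object b1 are dropped, even though their structures differ from the subtrahend pattern.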

PAGE 79

Precedence

The precedence relationships of the above operators are as follows. Unary operators have higher precedence than binary operators. The precedence of the seven binary association operators is given in the following order: $*$, $|$, $!$, $\bullet$, $\div$, $-$, and $+$. Parentheses can be used to alter the precedence relationships.

Summary of operators

(1) Associate ($*$): Two patterns are concatenated via an Inter-pattern.
(2) A-Complement ($|$): Two patterns are concatenated via a Complement-pattern.
(3) A-Select ($\sigma$): A pattern is retained if it satisfies the predicate.
(4) A-Project ($\Pi$): A subpattern is projected from the original pattern.
(5) NonAssociate ($!$): Two patterns are concatenated via a Complement-pattern only if each of them cannot be concatenated with any pattern of the other operand via an Inter-pattern.
(6) A-Intersect ($\bullet$): Two patterns are combined into a single pattern if their common classes have common object(s).
(7) A-Integrate: Patterns in an association-set are combined if objects of a specified class in a pattern are common to these patterns.
(8) A-Union ($+$): Two association-sets are lumped into a single set.
(9) A-Difference ($-$): A pattern in the minuend is retained if it does not contain any pattern in the subtrahend.
(10) A-Divide ($\div$): A subset of patterns in the dividend that have certain common feature(s) and contain all the patterns in the divisor is retained.

PAGE 80

Query Examples

We have formally defined nine association operators and given their simple mathematical properties. Before exploring other properties, we give some examples to illustrate how these operators can be used to formulate queries for processing a database. There can be many alternative expressions for the same query. Choosing the best one for execution is the task of a query optimizer. The mathematical properties of these operators can be used for that purpose. In the following formulation of algebraic expressions, we assume that the user is using the algebra directly instead of a high-level query language. In the latter case, the task of generating algebraic expressions would belong to the translator.

To formulate an A-algebra expression for a query, first we need to construct an intensional pattern for it by navigating the schema graph of the database, as illustrated in an earlier chapter. Then, each edge of the pattern is marked with an operator ($*$, $|$, or $!$) based on the intended semantics. For simple patterns, the formulation is straightforward. For patterns with complex structures, we may have to decompose them into patterns with simpler structures. The expression for the original pattern is the A-Intersect of the expressions for the decomposed patterns.

First, we formulate expressions for the queries given in an earlier chapter. We have identified the intensional patterns for these queries (see the corresponding figure).

Query: For all sections, get the majors of students who are taking these sections.

It is trivial to write an algebraic expression for this query, which is represented by a linear pattern. For this pattern, the two edges are both marked with $*$, and the

PAGE 81

algebraic expression can be formulated as follows:

$$\text{A-Integrate}_{\{Section\}}(\Pi(Section * Student * Department)[Section, Department;\ (Section, Department)])$$

where the A-Integrate operation groups the resultant patterns by Sections.

Query: List students who major and minor in the same department.

For this query, the edges of the intensional pattern shown in part (c) of the corresponding figure are all marked with $*$. Since this loop structure can be viewed as the A-Intersect of two linear patterns involving both Student and Department, we have

$$\Pi(Student * Undergrad * Department \;\bullet\; Student * Department)[Student]$$

where the A-Project operation gets the student objects that satisfy the association pattern as required by the query.

Query: For those students taking a particular section and having majors and/or minors, get their majors and/or minors.

The expression for the intensional pattern of this query is as follows:

$$Section\# * Section * (Student * Department + Student * Undergrad * Department\_1)$$

where the A-Union operator is used to realize the OR condition at the class Student. As long as a student has a major or a minor, the linear pattern from Student to Department and the linear pattern from Student to Undergrad and to Department should be retained. In the expression, Department_1 is an alias of Department, which is used to distinguish major and minor departments. Since the query asks for the majors and minors of students who are taking the given section, the A-Select and A-Project operations are used. Thus, we have

PAGE 82

$$\text{A-Integrate}_{\{Student\}}(\Pi(\sigma(\alpha)[Section\# = \ldots])[Student, Department, Department\_1;\ (Student, Department), (Student, Department\_1)])$$

where $\alpha$ is the intensional pattern given above. As shown in part (g) of the corresponding figure, the result of this expression will contain the derived patterns, which are specified by the $[\mathcal{T}]$ clause of the projection operation, and is reorganized by an A-Integrate operation. Note that this query cannot be phrased in a single relational algebra expression since (a) the union operation in relational algebra requires operands to be union-compatible, (b) using a join operation on Student can cause a loss of information because not every student has both a major and a minor, (c) the Cartesian-product of the majors and minors will produce erroneous results, and (d) no other operation in the relational algebra can combine two relations into one.

Query: For each teacher, list the sections which he/she does not teach.

The algebraic expression for this query can be easily formulated as follows, since it is represented by a linear pattern shown in part (h) of the corresponding figure. We note that the A-Complement operator rather than the NonAssociate operator should be used for this query, since a teacher may be teaching some courses.

$$Teacher \;|\; Section$$

Several other query examples are given below. They use the schema graph given in an earlier figure. Their corresponding intensional patterns are depicted in the figure of intensional patterns.

PAGE 83

Query: List the names of students who teach in the same departments as their major departments.

We can see from the figure of intensional patterns that the intensional pattern for this query can be constructed in two ways. One way is to decompose it into three linear patterns: Name—Person—Student, Student—Department, and Student—Grad—TA—Teacher—Department. The A-Intersect of these three patterns will produce a pattern that satisfies this query:

$$\Pi(Student * Person * Name \;\bullet\; Student * Department \;\bullet\; Student * Grad * TA * Teacher * Department)[Name]$$

where the first A-Intersect operation operates over Student and the second operates over Student and Department. The A-Project operation projects the names of these students. Another way is to decompose the intensional pattern into two linear patterns: Name—Person—Student—Department and Student—Grad—TA—Teacher—Department. Therefore, we have an alternative expression:

$$\Pi(Name * Person * Student * Department \;\bullet\; Student * Grad * TA * Teacher * Department)[Name]$$

Query: List the section# of those sections which have not been assigned a room or have not been assigned a teacher.

Since the query requests sections that have not been assigned a room or a teacher, these sections must not be connected with any room or any teacher, i.e.,

PAGE 84

(a section which does not associate with any room and teacher should also be retained in the result). Therefore, there should be Complement-patterns between Section and Teacher and between Section and Room, and a single arc between these two branches, as shown in the figure of intensional patterns. We emphasize that the $!$ operation instead of $|$ should be used to construct these two Complement-patterns. Then the algebra expression for this query can be easily formulated as follows:

$$\Pi(Section\# * (Section \;!\; Room + Section \;!\; Teacher))[Section\#]$$

Query: List the names of students who take both of two specified courses.

We shall show three ways of formulating an expression for this query. First, the intensional pattern for this query, shown in the figure of intensional patterns, can be constructed by the A-Intersect of two linear patterns, as we did for the query on students who teach in their major departments:

$$\Pi(\sigma(Name * Person * Student * Enrollment * Course * Course\#)[Course\# = \ldots] \;\bullet\; \sigma(Student * Enrollment\_1 * Course\_1 * Course\#\_1)[Course\#\_1 = \ldots])[Name]$$

where Enrollment_1, Course_1, and Course#_1 are the aliases of the classes Enrollment, Course, and Course#, respectively. This ensures that the A-Intersect operation will be performed only over the Student class.

A second way is to view the original pattern as a linear pattern without restriction on Course#, as follows: Name—Person—Student—Enrollment—Course—Course#. Students who are taking both courses must participate in at least two such patterns, one with each of the two Course# values. This implies an A-Divide operation. Thus, the query can be formulated as follows:

PAGE 85

,O1DPH 3HUVRQ 6WXGHQW (QUROOPHQW &RXUVH &RXUVH A^6WXGHQW` rL&RXUVH RZUVHf>&RXUmH 9RXUmH @f>L9RPH@ ZKHUH D GRW LQ &RXUVH&RXUVH LV XVHG RQO\ IRU LGHQWLI\LQJ WKH &RXUVH FODVV ZKLFK LV GHILQHG LQ WKH &RXUVH FODVV ,W GRHV QRW UHSUHVHQW D IXQFWLRQ RU D PHWKRG DV LQ RWKHU ODQJXDJHV 7KLV H[SUHVVLRQ FDQ DOVR EH UHZULWWHQ DV IROORZ ,O1DPH  3HUVRQ  ,,>6WXGHQW  (QUROOPHQW  &RXUVH  &RXUVH U^VWXGHQW` A^&RXUVH&FPUHf>&nRXUH 9&RXUVF @f>6XLHQ@f>L9DPH@ ZKLFK LV PRUH VXLWDEOH IRU H[HFXWLRQ WKDQ WKH ILUVW VLQFH WKH LQQHU $3URMHFW JHWV WKH VWXGHQW REMHFWV ZKR DUH WDNLQJ WKHVH WZR FRXUVHV VR WKDW DOO RWKHU GDWD DVVRFLn DWHG ZLWK WKHVH VWXGHQWV VXFK DV (QUROOPHQW &RXUVH DQG &RXUVH GR QRW KDYH WR EH FDUULHG DORQJ LQ IXUWKHU SURFHVVLQJ WR JHW WKH QDPHV RI WKHVH VWXGHQW 'HWDLOV RI RSWLPL]DWLRQ LVVXHV ZLOO EH DGGUHVVHG LQ WKH QH[W FKDSWHU :H VWUHVV WKDW WKH DERYH DVVRFLDWLRQ SDWWHUQ H[SUHVVLRQV UHSUHVHQW WKH LQWHUn QDO DOJHEUDLF RSHUDWLRQV WKDW QHHG WR EH SHUIRUPHG LI WKH G\QDPLF LQKHULWDQFH PHWKRG LV XVHG 7KH KLJKOHYHO TXHU\ VWDWHPHQWV FRUUHVSRQGLQJ WR WKHVH DOJHEUDLF H[SUHVVLRQV LVVXHG E\ WKH XVHU FDQ EH PXFK VLPSOHU GXH WR WKH LQKHULWDQFH RI DWWULn EXWHV LQ WKH JHQHUDOL]DWLRQ KLHUDUFK\ RU ODWWLFH

PAGE 86

Figure: Regular-edges and Complement-edges in an OG (classes shown: Student, Section, Course).

PAGE 87

Figure: Examples of association patterns, each given in graphical and algebraic representation. (a) Primitive association patterns. (b) Complex association patterns.

PAGE 88

Figure: Examples of association-sets.

PAGE 89

Figure: A sample database (association graph). (The Complement-patterns are not shown.)

PAGE 90

Figure: Example of operations, with a copy of the sample database (the Complement-patterns are not shown). (a) An Associate operation, $\alpha \; *[R(B,C)] \; \beta$.

PAGE 91

Figure (continued): Sample database (the Complement-patterns are not shown). (b) An A-Complement operation, $\alpha \; |[R(B,C)] \; \beta$.

PAGE 92

Figure (continued): Sample database (the Complement-patterns are not shown). (c) An A-Project operation.

PAGE 93

Figure (continued): Sample database (the Complement-patterns are not shown). (d) A NonAssociate operation, $\alpha \; ![R(B,C)] \; \beta$.

PAGE 94

Figure (continued): Sample database (the Complement-patterns are not shown). (e) An A-Intersect operation.

PAGE 95

Figure (continued): Sample database (the Complement-patterns are not shown). (f) A-Integrate operations.

PAGE 96

Figure (continued): Sample database (the Complement-patterns are not shown). (g) An A-Difference operation.

PAGE 97

Figure (continued): Sample database (the Complement-patterns are not shown). (h) An A-Divide operation.

PAGE 98

Figure: Intensional patterns of three queries, drawn over the classes Name, Person, Student, Dept, Teacher, Section, Room, Enrollment, and Course.

PAGE 99

CHAPTER
MATHEMATICAL PROPERTIES OF OPERATORS AND THEIR APPLICATIONS IN QUERY OPTIMIZATION AND QUERY DECOMPOSITION

In the section defining the operators, we have shown some mathematical properties of individual operators. In this chapter, we shall study their properties systematically. The properties of A-algebra are classified into six categories: (1) conventional algebraic properties such as commutativity, associativity, idempotency, nilpotency, and distributivity, (2) nesting of two unary operations, (3) a binary operation nested in a unary operation, (4) cascading of two different binary operations, (5) general identities, and (6) operation transformation. The properties presented in this dissertation are quite exhaustive but may not be complete. These properties provide the mathematical foundation for query decomposition and query optimization. Their utilities in these two applications are also illustrated in this chapter. The proofs of the properties that are marked with †'s can be found in the Appendix. Others can be proved similarly.

Conventional Algebraic Properties

To be systematic, first we list the properties given in the section defining the operators without explanation, since they have been illustrated previously. Then we give the properties of distributivity.

PAGE 100

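A. Commutativity

$$\alpha \; *[R(A,B)] \; \beta = \beta \; *[R(B,A)] \; \alpha \quad (\dagger)$$
$$\alpha \; |[R(A,B)] \; \beta = \beta \; |[R(B,A)] \; \alpha \quad (\dagger)$$
$$\alpha \; ![R(A,B)] \; \beta = \beta \; ![R(B,A)] \; \alpha \quad (\dagger)$$
$$\alpha \bullet_{\{W\}} \beta = \beta \bullet_{\{W\}} \alpha \quad (\dagger)$$
$$\alpha + \beta = \beta + \alpha$$

B. Associativity

$$(\alpha_{\{X\}} \; *[R(A,B)] \; \beta_{\{Y\}}) \; *[R(C,D)] \; \gamma_{\{Z\}} = \alpha_{\{X\}} \; *[R(A,B)] \; (\beta_{\{Y\}} \; *[R(C,D)] \; \gamma_{\{Z\}}) \quad (C \notin \{X\} \wedge B \notin \{Z\}) \quad (\dagger)$$
$$(\alpha_{\{X\}} \; |[R(A,B)] \; \beta_{\{Y\}}) \; |[R(C,D)] \; \gamma_{\{Z\}} = \alpha_{\{X\}} \; |[R(A,B)] \; (\beta_{\{Y\}} \; |[R(C,D)] \; \gamma_{\{Z\}}) \quad (C \notin \{X\} \wedge B \notin \{Z\}) \quad (\dagger)$$
$$(\alpha_{\{X\}} \bullet_{\{W_1\}} \beta_{\{Y\}}) \bullet_{\{W_2\}} \gamma_{\{Z\}} = \alpha_{\{X\}} \bullet_{\{W_1\}} (\beta_{\{Y\}} \bullet_{\{W_2\}} \gamma_{\{Z\}}) \quad (\{W_1\} \cap \{Z\} = \phi \wedge \{W_2\} \cap \{X\} = \phi)$$
$$(\alpha + \beta) + \gamma = \alpha + (\beta + \gamma) \quad (\dagger)$$

C. Idempotency and Nilpotency

$$\alpha \bullet \alpha = \alpha \quad \text{(if } \alpha \text{ is a homogeneous association-set)}$$
$$\alpha + \alpha = \alpha$$
$$A \; *[R(A,A)] \; A = A$$
$$A \; |[R(A,A)] \; A = \phi$$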

PAGE 101

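$$\alpha \div \alpha = \alpha$$

D. Distributivity

(a) distributive property of $*$ with respect to $+$:

$$\alpha \; *[R(A,B)] \; (\beta + \gamma) = \alpha \; *[R(A,B)] \; \beta + \alpha \; *[R(A,B)] \; \gamma \quad (\dagger)$$

(b) distributive property of $|$ with respect to $+$:

$$\alpha \; |[R(A,B)] \; (\beta + \gamma) = \alpha \; |[R(A,B)] \; \beta + \alpha \; |[R(A,B)] \; \gamma \quad (\dagger)$$

(c) distributive property of $\bullet$ with respect to $+$:

$$\alpha \bullet_{\{X\}} (\beta + \gamma) = \alpha \bullet_{\{X\}} \beta + \alpha \bullet_{\{X\}} \gamma \quad (\dagger)$$

These three properties hold true for the same reasons. First, the A-Union operation simply lumps together patterns of two association-sets without modifying them. Second, when two patterns are operated on by $*$, $|$, or $\bullet$, the production of a new pattern is independent of other patterns in the operand association-sets, i.e., the decision whether a new pattern is produced or not is determined only by the structure of the two patterns being operated on.

(d) distributive property of $*$ with respect to $\bullet$:

$$\alpha_{\{X\}} \; *[R(CL_1,CL_2)] \; (\beta_{\{Y\}} \bullet_{\{W\}} \gamma_{\{Z\}}) = (\alpha_{\{X\}} \; *[R(CL_1,CL_2)] \; \beta_{\{Y\}}) \bullet_{\{W \cup X\}} (\alpha_{\{X\}} \; *[R(CL_1,CL_2)] \; \gamma_{\{Z\}}) \quad (\dagger)$$

(e) distributive property of $|$ with respect to $\bullet$:

$$\alpha_{\{X\}} \; |[R(CL_1,CL_2)] \; (\beta_{\{Y\}} \bullet_{\{W\}} \gamma_{\{Z\}}) = (\alpha_{\{X\}} \; |[R(CL_1,CL_2)] \; \beta_{\{Y\}}) \bullet_{\{W \cup X\}} (\alpha_{\{X\}} \; |[R(CL_1,CL_2)] \; \gamma_{\{Z\}})$$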

PAGE 102

Distributive properties (d) and (e) hold true under the following three conditions: (i) $CL_2 \in \{W\}$, (ii) $X \cap Y = X \cap Z = \phi$, and (iii) $\alpha$ is a homogeneous association-set. The first condition ensures that the $*$ and $|$ operations are performed on the intersection of $\beta$ and $\gamma$. Otherwise, it does not make sense to have an operation between $\alpha$ and $\gamma$. The second condition states that $\alpha$ patterns are non-overlapping with $\beta$ and $\gamma$ patterns. The third condition states that, on the right-hand side of the expression, only the patterns having the same $\alpha$ patterns as their subpatterns will succeed in the A-Intersect operation. Although these two distributive properties do not hold when one of the above three conditions is not true, they are equivalent to some other expressions under a less restrictive condition. These properties are classified in other categories.

It should be noted that two possible distributive properties are missing in the above list. First, $!$ is not distributive with respect to $+$. This property does not exist because of the way the NonAssociate operation is defined. By its definition, a pattern in one association-set will be included in the resultant pattern iff it does not connect to any pattern in the other association-set. This implies a logical AND concept. Therefore, expressions $\alpha \;!\; (\beta + \gamma)$ and $\alpha \;!\; \beta + \alpha \;!\; \gamma$ have totally different semantics. The former stands for patterns in $\alpha$ that are not associated with patterns in both $\beta$ and $\gamma$, whereas the latter specifies those patterns in $\alpha$ that are not associated with any pattern in either $\beta$ or $\gamma$. Second, $!$ is not distributive

PAGE 103

with respect to $\bullet$. This property does not hold because performing the A-Intersect operation first may drop some $\beta$ patterns which may be associated with some $\alpha$ patterns, and the dropped $\beta$ patterns may allow those $\alpha$ patterns to be non-associated with the result of the A-Intersect operation, whereas, when performing the NonAssociate operation first, those $\alpha$ patterns may not appear in the final result.

The reason that the NonAssociate operator is not distributive with respect to the A-Union and A-Intersect operations is mainly that it is not associative. We shall see from the rest of this chapter that it has fewer properties than other operators.

Nesting of Two Unary Operations

(a) Two A-Select operations (one nested in the other)

Similar to the relational algebra, the order of the nesting of two selections can be exchanged without affecting the final result, or they can be combined into a single selection operation. The selection condition of the combined A-Select operation is the conjunction of the predicates of the original two A-Select operations.

$$\sigma(\sigma(\alpha)[P_1])[P_2] = \sigma(\sigma(\alpha)[P_2])[P_1] = \sigma(\alpha)[P_1 \wedge P_2] \quad (\dagger)$$
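The exchange and merging of nested A-Select operations can be checked on a small example. The sketch below treats A-Select as a plain filter over a set of patterns; the pattern encoding, the predicates `p1` and `p2`, and the sample objects are assumptions for illustration only.

```python
def a_select(alpha, predicate):
    """Keep exactly the patterns of alpha for which the predicate is true."""
    return {p for p in alpha if predicate(p)}

# Patterns are modelled as frozensets of object names (assumed encoding).
alpha = {
    frozenset({"a1", "b1"}),
    frozenset({"a1"}),
    frozenset({"a2", "b1"}),
    frozenset({"b2"}),
}

def p1(pattern):
    # the pattern contains the object a1
    return "a1" in pattern

def p2(pattern):
    # the pattern contains some object of class B
    return any(obj.startswith("b") for obj in pattern)

nested_12 = a_select(a_select(alpha, p1), p2)
nested_21 = a_select(a_select(alpha, p2), p1)
combined = a_select(alpha, lambda p: p1(p) and p2(p))
```

All three formulations keep the same single pattern, which is what allows an optimizer to reorder selections or fuse them into one pass.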

PAGE 104

(b) Two A-Project operations (one nested in the other)

It should be obvious that the order of the nesting of two projection operations cannot be exchanged, except when they project the same thing, which is not meaningful. However, they are equivalent to a single projection if the outer A-Project operation projects subpatterns over patterns produced by the inner A-Project.

$$\Pi(\Pi(\alpha)[\mathcal{E}_2, \mathcal{T}_2])[\mathcal{E}_1, \mathcal{T}_1] = \Pi(\alpha)[\mathcal{E}_1, \mathcal{T}_1] \quad ((\forall e_{1i} \in \mathcal{E}_1)(\exists e_{2j} \in \mathcal{E}_2)(e_{1i} \subseteq e_{2j}))$$

where the $e_{1i}$'s are subpattern expressions of the first (outer) A-Project operation, the $e_{2j}$'s are subpattern expressions of the second (inner) A-Project operation, and $e_{1i} \subseteq e_{2j}$ means that $e_{1i}$ defines a subpattern of $e_{2j}$.

(c) Two A-Integrate operations (one nested in the other)

By the definition of the A-Integrate operation, if an A-Integrate operation is applied a second time on an association-set, it will have no effect on the result of the first operation. Therefore, we have

$$\text{A-Integrate}_{\{W\}}(\text{A-Integrate}_{\{W\}}(\alpha)) = \text{A-Integrate}_{\{W\}}(\alpha)$$
$$\text{A-Integrate}(\text{A-Integrate}(\alpha)) = \text{A-Integrate}(\alpha)$$

Since an A-Integrate operation with a set of specified classes only performs part of the function of an A-Integrate operation without a set of specified classes, the following equations also hold true:

$$\text{A-Integrate}(\text{A-Integrate}_{\{W\}}(\alpha)) = \text{A-Integrate}(\alpha)$$

PAGE 105

$$\text{A-Integrate}_{\{W\}}(\text{A-Integrate}(\alpha)) = \text{A-Integrate}(\alpha)$$

(d) A-Select nested in A-Project, or vice versa

A selection operation performed on the result of a projection operation is equivalent to the projection performed on the result of the selection, since the selection condition applicable to the projected subpatterns must be applicable to the patterns before the projection. However, this is not true for the other direction.

$$\sigma(\Pi(\alpha)[\mathcal{E}, \mathcal{T}])[P] \Rightarrow \Pi(\sigma(\alpha)[P])[\mathcal{E}, \mathcal{T}]$$

For the other direction to be true, the classes involved in the predicate of the selection condition should also appear in the $[\mathcal{E}]$ clause of the projection operation (denoted as $P \subseteq \mathcal{E}$), which defines the subpattern(s) to be projected out. Otherwise, the result of the selection is always an empty set because the predicate is not applicable to the projected patterns. Therefore, the above property holds true for both directions when the condition holds; thus we have

$$\Pi(\sigma(\alpha)[P])[\mathcal{E}, \mathcal{T}] = \sigma(\Pi(\alpha)[\mathcal{E}, \mathcal{T}])[P] \quad (P \subseteq \mathcal{E}) \quad (\dagger)$$

A Binary Operation Nested in a Unary Operation

Binary operation nested in an A-Select

(a) Associate, A-Complement, or A-Intersect nested in A-Select

Generally speaking, transforming an expression of a binary operation (Associate, A-Complement, or A-Intersect) nested in a selection into another expression is impossible, since the predicate of the selection operation can be very complicated. For this reason, we study only the simple case in which the predicate has the form


P1 ∧ P2 or P1 ∨ P2, where P1 and P2 are only applicable to α and β, respectively. The following properties are similar to those in relational algebra and do not need an explanation.

For P1 ∧ P2, we have

σ(α *[R(A,B)] β)[P1 ∧ P2] = σ(α)[P1] *[R(A,B)] σ(β)[P2]
σ(α |[R(A,B)] β)[P1 ∧ P2] = σ(α)[P1] |[R(A,B)] σ(β)[P2]
σ(α • β)[P1 ∧ P2] = σ(α)[P1] • σ(β)[P2]

For P1 ∨ P2, we have

σ(α *[R(A,B)] β)[P1 ∨ P2] = σ(α)[P1] *[R(A,B)] β + α *[R(A,B)] σ(β)[P2]
σ(α |[R(A,B)] β)[P1 ∨ P2] = σ(α)[P1] |[R(A,B)] β + α |[R(A,B)] σ(β)[P2]
σ(α • β)[P1 ∨ P2] = σ(α)[P1] • β + α • σ(β)[P2]

We note that the above properties are not true for a NonAssociate operation nested in an A-Select. The reason is similar to what we have explained in the section on the distributive property.

b) A-Difference nested in A-Select

Since both the A-Difference and A-Select operations perform a restriction on an association-set and produce a subset of patterns without changing their original structures, an A-Select operation performed on the minuend or on the result of the A-Difference operation produces the same result:

σ(α − β)[P] = σ(α)[P] − β


c) A-Union nested in A-Select

It should be obvious that the following equation is always true:

σ(α + β)[P] = σ(α)[P] + σ(β)[P]

In the special case that P has the form P1 ∨ P2, and P1 and P2 can be applied to α and β, respectively, we have

σ(α + β)[P1 ∨ P2] = σ(α)[P1] + σ(β)[P2]

Binary operation nested in A-Project or A-Integrate

Since A-Project and A-Integrate operations produce patterns which may contain subpatterns of both operands of the nested binary operation, properties similar to those presented above do not hold in general, except for the nesting of an A-Union operation:

[equations illegible in source]

Cascading of Two Binary Operations

Cascading of two identical binary operators

Most cases have been covered by the associativity properties. Although associativity does not hold for the operators − and ÷, there exist some equivalent expressions. The cascading of two A-Difference operations follows the set


difference in set theory:

[equation illegible in source]

The cascading of two A-Divide operations is equivalent to the dividend divided by the A-Union of the two divisors, because an A-Divide operation retains patterns of the dividend without modifying their structures (note that the divide operation in relational algebra retains a substructure of the dividend). Therefore, the order of the two A-Divide operations is not important:

α ÷{W} β ÷{W} γ = α ÷{W} γ ÷{W} β = α ÷{W} (β + γ)

Cascading of two different binary operations

Many cases have been covered by the distributive properties. Although the distributivity properties of […] and […] with respect to […] do not hold, there still exist some equivalent expressions. These properties are listed below according to their first operators.

a) * with other binary operators

The cascading of * and | operators is associative:

(α{X} *[R(A,B)] β{Y}) |[R(C,D)] γ{Z} = α{X} *[R(A,B)] (β{Y} |[R(C,D)] γ{Z})   (C ∉ {X} ∧ B ∉ {Z})

The condition ensures that the operation on [R(A,B)] does not operate on γ patterns and the operation on [R(C,D)] does not operate on α patterns.
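The set-theoretic rule that the cascaded A-Difference is said to follow can be checked with ordinary sets. The sketch below is only an analogy: Python sets of hypothetical pattern labels stand in for association-sets, and the built-in set difference and union stand in for A-Difference and A-Union, so it illustrates the set-theory rule rather than the A-algebra operators themselves.

```python
# Analogy only: plain Python sets stand in for association-sets, and the
# labels "p1".."p5" are hypothetical pattern identifiers.

def cascade_difference(alpha, beta, gamma):
    """Apply two set differences in sequence: (alpha - beta) - gamma."""
    return (alpha - beta) - gamma


alpha = {"p1", "p2", "p3", "p4"}
beta = {"p2"}
gamma = {"p3", "p5"}

in_order = cascade_difference(alpha, beta, gamma)   # remove beta, then gamma
swapped = cascade_difference(alpha, gamma, beta)    # remove gamma, then beta
merged = alpha - (beta | gamma)                     # remove the union at once

print(in_order == swapped == merged)
```

In this analogy the order of the two subtractions does not matter, and both orders agree with a single subtraction of the union.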


For the cascading of * and − operators (in that order), it should be obvious that when the subtrahend is only applicable to one of the operands of the * operation, the second operation can be performed first and just against that operand:

[equations illegible in source]

The first two conditions ensure that γ patterns do not intersect with α and β patterns. Otherwise, the A-Intersect operation would be performed over the common classes of β and γ if the * operation is performed first. The third condition ensures that α (β) must contain object instances of A (B). In other words, the algebraic expression that defines α (β) must contain A (B). Otherwise, performing the A-Intersect operation first may produce a false result when γ contains object instances of A. Note that the right-hand side of the equation is in a distributive form of * with respect to •. However, the distributive property cannot be applied, since it requires that A belong to α and β, and that γ be a homogeneous association-set (refer to Section […]).


b) | with other binary operators

Similar to the above two properties, we have

[equations illegible in source]

c) • with other binary operators

Similar to equations […] and […], we have

[equations illegible in source]

d) ! with other operators

As we have mentioned earlier, the ! operator has fewer properties because it is not associative. Although ! is not distributive with respect to +, the following decomposition holds true:

[equations illegible in source]

where α, β, and γ are homogeneous association-sets.


The significance of equations […] and […] is that they can be used to transform the original expressions, in which the operators operate on heterogeneous association-sets (e.g., α ! (β + γ)) and for which distributivity cannot be applied, into expressions in the format of A-Unions of homogeneous association-sets.

e) ÷ with other operators

An association-set (α) divided by the A-Union of two other association-sets (β and γ) is equivalent to two consecutive A-Divide operations of α divided by β and γ in turn, as indicated in equation […]. The order of the two A-Divide operations is not important:

α ÷{W} (β + γ) = α ÷{W} β ÷{W} γ = α ÷{W} γ ÷{W} β

The A-Divide operator also has fewer properties because it is not associative.

f) − with other binary operators

The properties of the − operator cascaded with other operators are covered by […] and […].

g) + with other binary operators

The equation below follows the set-union and set-difference operations in set theory:

[equation illegible in source]


The properties of cascading + with the *, •, and […] operators can be found in […] and […], since the latter operators are commutative.

General Identities

There are many other properties which are unique to the A-algebra but cannot be classified into the above categories. Listed below are some identity properties. These identities are useful for expression reduction.

[identities illegible in source]

Transformation of Operators

An important fact we have observed is that the same pattern can be constructed by different algebraic expressions using different operators. For example, the pattern A-B-C can be constructed either by A*B*C or by B*A • B*C; hence

B*A • B*C = A*B*C

Formally, their equivalence can be derived using the properties presented in the previous sections:

B*A • B*C = (B • B*C) *[R(B,A)] A
          = (B*C) *[R(B,A)] A


          = A * (B*C)
          = A*B*C

For the other direction, we have

[derivation illegible in source]

Using this property, a pattern of tree structure can be described without using the A-Intersect operator, which is relatively more expensive to implement. For example,

A * (B*C • B*D) = A *[R(A,B)] (C * B * D)
                = A * (B * C *[R(B,D)] D)
                = A * B * C *[R(B,D)] D

Another useful transformation is possible because a pattern of lattice structure, expressed by an intersection of two linear patterns, can be viewed as a selection on a linear pattern, which avoids the expensive A-Intersect operation. For example,

A*B*C*D • B*E*D = σ(A*B*C*D*E*B_1)[B = B_1]

The left-hand side constructs a lattice pattern by intersecting two linear patterns over classes B and D. By breaking the lattice pattern at B, it becomes a single linear pattern, as seen on the right-hand side of the above expression. Here, B_1 is an alias of B. By specifying that B = B_1 in the association-set defined by A*B*C*D*E*B_1, we obtain the same result as the expression on the left-hand side.


Based on these two transformation properties, a complicated network structure can be viewed as a forest structure by properly breaking all the loops in the network, and its algebraic expression can be specified using the σ, *, and […] operators.

Applications in Query Optimization and Query Decomposition

We have systematically presented the mathematical properties of the operators of the A-algebra. In this section, their utility in query optimization and query decomposition is illustrated.

Applications in query optimization

Generally, query processing consists of three phases: translation, optimization, and execution. A query issued by the user is in the form of a high-level language. First, it is translated into an internal representation, an access plan, which may not be efficient for execution. Then the optimizer generates a new access plan which is equivalent to the original access plan (i.e., they produce the same result) and is optimal for execution. Finally, the new access plan is scheduled for execution by the transaction manager to produce the result of the query.

Since it is difficult to determine the equivalence of two statements in a high-level language, alternative access plans cannot be generated by the query translator. In relational databases, the access plan generated by the query translator is in the form of a query tree in which algebra operators are used, so that the mathematical properties can be used to generate equivalent access plans even


if the high-level language is based on the relational tuple calculus or domain calculus (refer to Chapter […]).

Query optimization is, without loss of generality, an NP-hard problem. Therefore, an access plan generated by the optimizer is optimal only in a very restrictive sense. Furthermore, to be practical, the overhead of the optimizer should never exceed the advantage of query optimization. In general, a query optimizer generates an optimal access plan in two steps: (1) generate (a limited number of) equivalent access plans, and (2) evaluate these access plans based on (a few) system parameters and criteria.

The mathematical properties of the A-algebra presented above are the foundation for the first step of query optimization in O-O databases. In the second step, the system/application chooses one or more of the following as the goal of its query optimization: minimal response time, minimal execution time, minimal communication time, minimal storage space, maximal resource utilization, etc. The parameters used in estimating the performance of an access plan include communication cost (per block), CPU cost (per unit), I/O cost (per I/O), buffer size, selectivities of operations (e.g., Selection and Join in relational databases), data structure, algorithms of the operations (e.g., nested-join, hash-join), etc.

Since the criteria of optimization are system/application dependent and the optimization strategies vary from system to system, a detailed study is out of the scope of this dissertation. We shall give an example to demonstrate the importance of the A-algebra in query optimization.


Query […]: List GPAs of students who major and minor in the same departments.

The intensional pattern for this query is shown in Figure […]a. Suppose that the algebraic expression produced by the query translator is as follows, which corresponds to an access plan represented by the query tree shown in Figure […]b:

Π(GPA * (Student * Department • Student * Undergrad * Department))[GPA]

To make the evaluation easy, we assume that every student has a major, a minor, and a GPA (i.e., the selectivities of all * operations are […]), and that […] out of […] students major and minor in the same departments (i.e., the selectivity of the • operation is […]). If the time to perform an A-Select on a pattern is […] unit, to perform an Associate operation is […] units, and to perform an A-Intersect operation is […] units, the total execution time can be calculated as follows (not including time for the A-Project operation):

T1 = [terms illegible in source]

where the first term is the time for identifying students' majors, the second term is for identifying students' minors, the third term is for the A-Intersect operation, and the last term is for identifying the GPAs. In Figure […]b, the costs of operations are depicted next to the operator nodes. Here, the time for the A-Intersect operation is small because each student has only one major and one minor, and indices may be used to speed up the operation.

Using property […], the same intensional pattern can be viewed as a linear pattern, shown in Figure […]a, and thus the optimizer generates a new algebraic expression, which corresponds to the access plan shown in Figure […]b:
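The evaluation step described above, summing one cost term per operation and comparing candidate plans, can be sketched as follows. This is a minimal sketch under stated assumptions: the names `CostTerm`, `plan_cost`, and `cheapest` are hypothetical, and every pattern count and unit cost below is an invented placeholder, since the figures used in the original example are not legible in this copy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostTerm:
    """One term of a plan's cost: patterns touched times cost per pattern."""
    label: str
    patterns: int    # hypothetical number of patterns processed
    unit_cost: int   # hypothetical time units per pattern

    def cost(self) -> int:
        return self.patterns * self.unit_cost


def plan_cost(terms):
    """Total execution time of a plan: the sum of its cost terms."""
    return sum(term.cost() for term in terms)


def cheapest(plans):
    """Return the name of the plan with the smallest total cost."""
    return min(plans, key=lambda name: plan_cost(plans[name]))


# Placeholder numbers only; they are NOT the values used in the dissertation.
plans = {
    "intersect_plan": [
        CostTerm("identify majors", 1000, 2),
        CostTerm("identify minors", 1000, 2),
        CostTerm("A-Intersect", 1000, 5),
        CostTerm("identify GPAs", 100, 2),
    ],
    "linear_plan": [
        CostTerm("four Associate operations", 4000, 2),
        CostTerm("A-Select", 1000, 1),
    ],
}

for name, terms in plans.items():
    print(name, plan_cost(terms))
print("cheapest:", cheapest(plans))
```

Adding a communication term to each plan, as in the distributed case, only requires appending another `CostTerm`; which plan wins then depends entirely on the parameters chosen.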


Π(σ(GPA * Student * Department * Undergrad * Student_1)[Student = Student_1])[GPA]

The total execution time for this access plan is

T2 = [terms illegible in source]

where the first term is the time for four Associate operations and the second term is the time for the selection operation. It is less expensive than the original access plan and thus a better plan.

However, suppose that the database is a distributed one, in which data of students' GPAs are in site […] and other data are in site […] (the class Student has to be replicated in both sites). The communication cost is assumed to be […] units per block, with a block size of […] patterns. The total execution times for these two access plans can be calculated as follows:

T1 = [terms illegible in source]
T2 = [terms illegible in source]

In T1, the fourth term is the communication cost for sending qualified students to site […]. In T2, the third term is the communication cost (the communication costs are the same for sending GPAs of all students to site […] and for sending students' majors and minors to site […]). In this case, the first access plan is better than the second. Figures […]a and […]b depict the costs of operations (next to the operations) and the costs of communications (on the edges) for these two access plans.

The optimizer of the distributed system may generate another access plan by applying property […] to the algebraic expression of the second access plan, and we have


Π(GPA * σ(Student * Department * Undergrad * Student_1)[Student = Student_1])[GPA]

which corresponds to the access plan shown in Figure […]c. The total execution time for this access plan is

T3 = [terms illegible in source]

where the first term is the time for the three Associate operations nested in the A-Select, the second term accounts for the selection operation, the third term accounts for the communication cost, and the last term is the time for getting GPAs. Therefore, the third access plan is the optimal one for execution.

Applications in query decomposition

The O-O modeling techniques incorporate many high-level features, such as association types, inheritance, behavioral properties of objects, knowledge and rules, etc., in the DBMS. These features were taken care of by database administrators and application programs in conventional database systems. To ensure good performance, O-O DBMSs need the support of parallel and distributed processing techniques.

In a distributed and parallel processing environment, a query is decomposed into subqueries according to the processing capabilities of processors and/or data distribution. The algebraic representation of a query can be manipulated mathematically for this purpose. For example, suppose a query is represented by the intensional pattern shown in Figure […]a. The algebra expression for this query can be


written as follows:

expr = A * (B*E*F + B*C*D*H • C*G)

By applying the distributivity properties, the above expression can be written as below:

expr = A * (B*E*F + B*C*D*H • B*C*G)
     = A*B*E*F + A * (B*C*D*H • B*C*G)
     = A*B*E*F + A*B*C*D*H • A*B*C*G

The decomposed expression is the A-Union of two subexpressions representing the two subpatterns shown in Figure […]b. These subexpressions are independent of each other and can be processed in parallel in a parallel system. The second subexpression can be further optimized, as shown in the following expression, in which *[R(C,G)] indicates that the Associate operation is performed through the association between C and G:

expr = A*B*E*F + (A*B*C*D*H) *[R(C,G)] G

In addition, since each subexpression represents a homogeneous association-set, its processing will be more efficient than processing over heterogeneous association-sets.

Next, we present two theorems of the A-algebra which ensure that the decomposed subexpressions produce homogeneous association-sets.

Theorem […]: Operators (except A-Union and A-Integrate) of the A-algebra produce homogeneous association-sets if their operands are homogeneous association-sets.


Proof: This is true by the definitions of the operators (the A-Intersect operation should be used without specifying the classes on which it is performed, i.e., it is performed on the common classes of its operands). Note that for the A-Difference and A-Divide operations, this is also true if only the first operand (the minuend or the dividend) is a homogeneous association-set.

Theorem […]: If an A-algebra expression does not contain an A-Integrate operation or an A-Divide operation whose dividend is a heterogeneous association-set, it can be decomposed into the A-Union of some subexpressions, each of which produces a homogeneous association-set.

Proof: According to Theorem […], besides the A-Integrate operation, the A-Union is the only operator that can produce a heterogeneous association-set when its operands are homogeneous association-sets. Therefore, it suffices to prove that whenever such a heterogeneous association-set appears in an expression, the expression can be decomposed into the A-Union of subexpressions which produce homogeneous association-sets.

Let α, β, and γ all be homogeneous association-sets. By properties […] and […], we have

[equations illegible in source]

By properties […], we have

[equations illegible in source]


By properties […], we have

[equations illegible in source]

In the above decompositions, each term of the A-Union operations represents a homogeneous association-set. □
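The idea behind the decomposition, pushing every A-Union to the top of an expression so that each remaining term is union-free, can be sketched on a small expression tree. This is a minimal sketch under a strong simplifying assumption: every non-union binary operator in the tree is treated as distributive over "+", which the text establishes only for some operators (others need the special decompositions given above). The class and function names (`Leaf`, `Op`, `union_terms`, `render`) are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Leaf:
    """A class name or a homogeneous association-set, e.g. 'A'."""
    name: str


@dataclass(frozen=True)
class Op:
    """A binary operation; symbol '+' denotes A-Union."""
    symbol: str
    left: object
    right: object


def union_terms(expr):
    """Return union-free terms whose A-Union is equivalent to expr.

    Assumption: every operator other than '+' distributes over '+'.
    """
    if isinstance(expr, Leaf):
        return [expr]
    lefts = union_terms(expr.left)
    rights = union_terms(expr.right)
    if expr.symbol == "+":
        return lefts + rights
    # Distribute the operator over every pair of union-free operands.
    return [Op(expr.symbol, l, r) for l in lefts for r in rights]


def render(expr):
    """Pretty-print an expression with explicit parentheses."""
    if isinstance(expr, Leaf):
        return expr.name
    return f"({render(expr.left)} {expr.symbol} {render(expr.right)})"


# A * (X + Y) decomposes into the two independent terms A * X and A * Y.
example = Op("*", Leaf("A"), Op("+", Leaf("X"), Leaf("Y")))
terms = [render(t) for t in union_terms(example)]
print(" + ".join(terms))
```

Each returned term could then be handed to a separate processor, in the spirit of the parallel evaluation described in the query decomposition example.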


[Figure: (a) intensional pattern of Query […], over classes GPA, Student, and Department; (b) access plan of Query […]]

Figure […]. Access plan of Query […]


[Figure: (a) alternative intensional pattern of Query […], over classes GPA, Student, Department, Undergrad, and Student_1; (b) access plan of Query […]]

Figure […]. Access plan of Query […]


[Figure: (b) cost of access plan […]; (c) cost of access plan […]]

Figure […]. Costs in a distributed system


[Figure: panels (a) and (b)]

Figure […]. Example of query decomposition


CHAPTER […]
COMPLETENESS OF THE A-ALGEBRA

We have shown in the preceding sections that a query issued against an O-O database can be specified by an association (or graphic) pattern in which object instances of interest are related (associated or non-associated), and that the A-algebra provides a useful mathematical method for specifying and manipulating such patterns to produce the result for the query. However, for the algebra to be truly useful, the completeness of the algebra needs to be addressed.

Due to the closure property of the A-algebra, the result of a query is represented intensionally by a subdatabase schema graph SG_t and extensionally by a subdatabase object graph OG_t, where SG_t is a subgraph of the SG of the original database and OG_t is a subset of association patterns in the original object graph OG. A subdatabase can be further operated upon by the A-algebra operators to produce other subdatabases. We can, therefore, define the completeness of the algebra in the following way.

Completeness Theorem: The A-algebra is complete if it can define all possible subdatabases of an O-O database.

Before proving the theorem, we first give the formal definitions of the SG and OG of the subdatabases of an O-O database.


Subdatabase Schema Graph

A subdatabase schema graph (SG_t) is a set of m connected subgraphs {SG_i(C_i, A_i)} (i = 1, ..., m) from the original database schema graph SG(C, A), where C is a set of vertices representing classes {C_i}, and A is a set of edges representing associations between classes, each of which is denoted by A_ij for an association between classes C_i and C_j. If C_i ∈ SG_j, then C_i ∉ SG_k (∀ k ≠ j). The condition ensures that a class does not appear in two different connected graphs in a subdatabase. If it does, the two connected graphs should have been a single connected graph.

Subdatabase Object (Association) Graph

A subdatabase object graph (OG_t(O_t, E_t)) contains a subset of association patterns of the original database object graph (OG(O, E)), where O is a set of vertices representing object instances and E is a set of edges representing associations between object instances. An Inner-pattern or object instance (O_ij) belongs to OG_t only if C_i ∈ SG_t and O_ij ∈ C_i. An Inter-pattern or a Complement-pattern (O_ij, O_mn) belongs to OG_t only if C_i, C_m ∈ SG_t and A_im ∈ SG_t, where O_ij ∈ C_i, O_mn ∈ C_m, and (O_ij, O_mn) ∈ A_im.

The above conditions state that a primitive association pattern should not be included in OG_t if the corresponding classes and/or associations of the original database are not in SG_t.

Instead of proving the completeness theorem as stated above, we make the following observations and restate the theorem as shown below.

First, although the SG of an O-O database may consist of more than one connected graph, it suffices to prove the case that the SG is a single connected graph, since if two classes do not have a path between them in the SG, they will not be associated with each other in any of the subdatabases. Therefore, each connected graph of SG can be treated as an independent database, and a subdatabase defined on more than one connected graph of SG can be represented by the A-Union of the subdatabases defined on different connected graphs of SG.

Second, it suffices to prove the case that a subdatabase consists of only one connected subgraph of SG, although in general the SG_t of a subdatabase may contain more than one subgraph of SG. This is because the general case can be represented by the A-Union of the expressions for individual subgraphs.

Third, since an O-O database is a collection of association patterns, it should be obvious that if there exists an A-algebra expression for every association pattern of an O-O database, then the subdatabases can be represented by the A-Union of a subset of these association patterns.

Therefore, the completeness theorem can be restated as follows.

Completeness Theorem: The A-algebra is complete if there exists an expression for every association pattern in the OG of an O-O database.

We prove the above theorem by induction on the number of object instances in an association pattern.

Proof.

Basis: We first show that there is an expression for the case that an association pattern contains a single object instance. Since the name of a class, say C_1, represents all the object instances of the class, an association pattern containing a single object instance of that class can be represented by an A-Select operation over the object instances of C_1 to select a particular object instance of interest, as shown below:


σ(C_1)[B]

where B is the condition an object instance of C_1 must satisfy.

Hypothesis: Assume that there exists an expression for every association pattern that contains n−1 object instances. These n−1 object instances must form a connected graph, i.e., there must be at least one path between any two object instances in the graph. Otherwise, they would have formed multiple association patterns.

Induction: Suppose there exists an expression for an association pattern P_{n−1} which contains n−1 object instances. When adding the nth object instance to this pattern, a new pattern P_n containing n object instances can be formed in the following two ways, as depicted in Figure […]: (a) the nth object instance belongs to class C_k, and the object instances of C_k do not participate in P_{n−1}; and (b) the nth object instance belongs to a class, say C_j, which has some object instance(s) participating in P_{n−1}.

To avoid using complicated notation, we will show the formulations for two specific patterns, depicted in Figure […]a and […]b, which correspond to the cases of Figure […]a and […]b, respectively. Patterns in general forms can be formulated using the same mechanism described below. We shall discuss cases a and b in turn.

Case a: When adding an object instance of a new class to an existing pattern, various new patterns can be formed, depending on the associations between the new object instance and the other existing object instances. The new object instance can only have one association with an existing object instance if their classes are directly connected in SG by a single association type


(we will consider later the case that there is more than one association type between two classes). There are only three possible choices for the new object instance to relate to an existing object instance: (1) the association is of no interest, i.e., the association is not included in the pattern; (2) they are associated with each other; (3) they are not associated with each other. Graphically, we use a solid line (an Inter-pattern) to represent choice 2 and a dashed line (a Complement-pattern) to represent choice 3. No line is drawn between the two object instances for choice 1. Note that at least one of the associations of the new object instance with the existing object instances must have a choice of 2 or 3. Otherwise, the new object instance and the existing pattern are two separate patterns, which are covered by the basis and the hypothesis.

To formulate an expression for the new pattern shown in Figure […]a, we first transform the pattern into an equivalent pattern by treating the object instances of the existing pattern as if they were from different classes, using aliasing names of their original classes, as shown in Figure […]a. The pattern in Figure […]a is equivalent to the pattern in Figure […]a, provided that the object instances of the aliasing classes of the same class are not the same object instances.

Next, the equivalent pattern is decomposed into a set of patterns, each of which is a subpattern (i.e., subgraph) of the pattern in Figure […]a and consists of the existing pattern, the new object instance, and its relationship with one object instance in the existing pattern. If we can derive expressions for these subpatterns individually, the A-Intersect of these expressions will be the expression for the pattern in Figure […]a, which is equivalent to the pattern in Figure […]a. In this example, the pattern in Figure […]a is decomposed into six subpatterns, as


shown in Figure […]a, which can be easily expressed as follows:

[six expressions illegible in source]

respectively. Here, E stands for the algebraic expression of the association pattern specified by its subscript. In each expression, an operation (* or |) is chosen corresponding to the type of connection between object instances, and the expression for the existing pattern is parenthesized to ensure the correct execution sequence. The expression for the pattern of Figure […]a can then be formulated by a sequence of A-Intersect operations on the expressions of these individual patterns:

[equation illegible in source]

Case b: Figure […]b depicts the case that the new object instance belongs to an existing class, and it may have associations with object instances of other classes that have associations with that class. The formulation for the new pattern shown in Figure […]b can be obtained similarly, as depicted in Figure […]b and […]b. Note that the new object instance belongs to an aliasing class after the pattern transformation process (see Figure […]b). As shown in Figure […]b, the equivalent pattern depicted in Figure […]b is decomposed into four patterns, which can be expressed by


[four expressions illegible in source]

respectively. Therefore, for the new pattern, we have the expression

[equation illegible in source]

However, the above expression does not exclude the case that two object instances in aliasing classes of the same class refer to the same object instance. Hence, it is necessary to perform an A-Select operation to eliminate such a case, and we have

[equation illegible in source]

So far, we have shown that there exists at least one expression for a pattern consisting of any number of object instances. We note that there may exist more than one expression for a pattern. We illustrate this by showing an alternative way of transforming a pattern into an equivalent one, so that different expressions can be derived.

Figure […]a shows another pattern, which is equivalent to the pattern in Figure […]a if, in Figure […]a, the object instances of the aliasing classes C_[…] through C_[…] that participate in the pattern refer to the same object. Therefore, we have an alternative expression:


[equation illegible in source]

which is a sequence of * and/or | operations on the expression of the existing pattern over the aliasing classes and their associated classes. The selection condition ensures that the object instances in all aliasing classes of the same class refer to the same object.

Similarly, the pattern in Figure […]b is equivalent to the pattern in Figure […]b if the object instances in the aliasing classes C_[…] through C_[…] that participate in the pattern are the same object, and this object is different from the one in the remaining aliasing class. Hence, an alternative expression can be derived as follows:

[equation illegible in source]

We have shown that there exists an expression for every association pattern when there is a single association between two classes. Now we prove that this is also true when there is more than one association between two classes. There are also two cases, as described in the proof above. We only prove case a, in which the new object instance belongs to C_k and the object instances of C_k do not participate in P_{n−1}. Case b can be proven using the same methodology.

Figure […]a shows an SG in which there are two associations between C_{k−1} and C_k. The two associations are denoted as [R_1(C_{k−1}, C_k)] and [R_2(C_{k−1}, C_k)], respectively. Figure […]b shows a pattern in which the new object instance of C_k has two associations with each object instance of C_{k−1}. The associations between object instances of C_{k−1} and C_k are labeled by numbers corresponding to the associations of their classes. To derive the algebraic expression for this pattern, first


we decompose it into two patterns, as shown in Figure […]c. The decomposition is done by making two copies of the pattern. In one copy, the associations carrying one label are dropped, and in the other, the associations carrying the other label are dropped. From the earlier discussion, we can derive expressions for these two patterns, and the expression for the original pattern can be represented by the A-Intersect of the two:

[equation illegible in source]

To ensure that the A-Intersect operation will produce the pattern as required, the same object instance in the two copies should use the same aliasing class name when the expressions for the two copies are formulated.

Generally, if the new object instance of C_k has multiple associations with object instances of several classes, the association pattern is decomposed into m patterns, where m is the maximum number of associations C_k has with another class.

Since it has been shown that we can formulate algebraic expressions for all possible patterns in which object instances are associated or non-associated, and the A-Union of these expressions forms a single expression for the subdatabase of interest, we have shown by induction that the A-algebra is complete. □


[Figure: (a) the nth object is in C_k; (b) the nth object is in C_j]

Figure […]. Two ways of forming new patterns


[Figure: (a) the […]th object is in C_[…]; (b) the […]th object is in C_[…]]

Figure […]. Two specific examples of new patterns


[Figure: panels (a) and (b)]

Figure […]. Equivalent patterns


[Figure: panels (a) and (b)]

Figure […]. Decomposed patterns


[Figure: panels (a) and (b)]

Figure […]. Other equivalent patterns


[Figure: (a) Two classes have multiple associations; (b) Two objects have multiple associations in a pattern]

Figure […]. New object instance having multiple associations with those of C_{k−1}


CHAPTER […]
CONCLUSION

Object-Oriented DBMSs and their underlying O-O models exhibit several desirable features that are suitable for modeling and processing complex objects found in more advanced database applications. However, they still do not have a solid mathematical foundation. Such a foundation is important for the efficient manipulation of O-O databases and for the design of high-level query languages to ease the user's task in accessing and manipulating O-O databases.

In this dissertation, we have presented an algebra for O-O database processing based on the uniform representation of object instances and their associations in an O-O database: association patterns. Nine algebra operators have been introduced for manipulating patterns of both heterogeneous and homogeneous structures. The closure property of the algebra allows the result of an algebraic expression to be further processed by the algebra. Several mathematical properties of the A-algebra operators have been studied and formally proven. Their utility in query decomposition and optimization has been demonstrated. The A-algebra is complete in the sense that all possible subdatabases that are derivable from an O-O database can be expressed in A-algebra expressions.

PAGE 142

7KH $DOJHEUD KDV EHHQ XVHG LQ WKH GHVLJQ DQG LPSOHPHQWDWLRQ RI D KLJK OHYHO REMHFWRULHQWHG TXHU\ ODQJXDJH 24/ IRU SURFHVVLQJ GDWDEDVHV >$/$E :8@ $ JUDSKLF LQWHUIDFH IRU WKH ODQJXDJH DQG D SURWRW\SH NQRZOHGJH EDVH PDQDJHPHQW V\VWHP EDVHG RQ WKH VHPDQWLF DVVRFLDWLRQ PRGHO 26$0r >68 DQG 68@ DUH SUHVHQWHG LQ >'6 7< 68 /$0 3$1 &+8 6,1@

REFERENCES

[AH..] Aho, A.V., Beeri, C., and Ullman, J.D., The Theory of Joins in Relational Databases, ACM Transactions on Database Systems.
[ALA89a] Alashqur, A.M., A Query Model and Query and Knowledge Definition Languages for Object-oriented Databases, doctoral dissertation, University of Florida.
[ALA89b] Alashqur, A.M., Su, S.Y.W., and Lam, H., OQL: A Query Language for Manipulating Object-oriented Databases, Proceedings of the Intl. Conference on VLDB, Amsterdam, The Netherlands.
[ALA90] Alashqur, A.M., Su, S.Y.W., and Lam, H., A Rule-based Language for Deductive Object-Oriented Databases, Proceedings of the International Conference on Data Engineering, Los Angeles, CA, Feb.
[ARM..] Armstrong, W.W., Dependency Structures of Data Base Relationships, FDT, ACM, New
[AST..] [BAN..] [BAN..] [BAT..]
[BAT..] Batory and Kim, W., Modeling Concepts for VLSI CAD Objects, ACM Transactions on Database Systems.
[BEE..] Beeri, C., Fagin, R., and Howard, J.H., A Complete Axiomatization for Functional and Multivalued Dependencies, ACM SIGMOD International Symposium on Management of Data, Los Angeles, CA.
[CAR..] Carey, M., DeWitt, D., and Vandenberg, S.L., A Data Model and Query Language for EXODUS, ACM-SIGMOD Conference.
[CHU..] Chuang, H.S., Operational Role Processing in a Prototype OSAM*.KBMS, Master's thesis, University of Florida.
[COD70] Codd, E.F., A Relational Model of Data for Large Shared Data Banks, CACM.
[COD72a] Codd, E.F., Relational Completeness of Database Sublanguages, in Data Base Systems, R. Rustin (ed.), Prentice-Hall, Inc., Englewood Cliffs, NJ.
[COD72b] Codd, E.F., Further Normalization of the Data Base Relational Model, in Data Base Systems, R. Rustin (ed.), Prentice-Hall, Englewood Cliffs, NJ.
[COD79] Codd, E.F., Extending the Database Relational Model to Capture More Meaning, ACM Trans. on Database Systems.
[COD90] Codd, E.F., The Relational Model for Database Management, Addison-Wesley.
[COL..] Colby, L.S., A Recursive Algebra and Query Optimization for Nested Relations, ACM-SIGMOD Conference, Portland, OR.
[DAH..] Dahl, O., Myhrhaug, B., and Nygaard, SIMULA Common Base Language, NCC Publ. S, Norwegian Computing Center, Oslo, Norway.
[DEL..] Delobel, C., Normalization and Hierarchical Dependencies in the Relational Data Model, ACM Transactions on Database Systems.
[DS..] D'Souza, T., Graphic Semantic Data Definition Language and a Graphic Browser for the Object-oriented Semantic Association Model, Master's thesis, University of Florida.
[ELM..] Elmore, P., Shaw, G.M., and Zdonik, S.B., The ENCORE Object-Oriented Data Model, tech. rep., Brown University, November.
[FAG..] Fagin, R., Multivalued Dependencies and a New Normal Form for Relational Database, ACM Transactions on Database Systems.
[FIS..] Fishman, D.H., Beech, Cate, H.P., Chow, E.C., Connors, T., Davis, J.W., Derrett, N., Hoch, C.G., Kent, W., Lyngbaek, P., Mahbod, B., Neimat, M.A., Ryan, T.A., and Shan, M.C., Iris: An Object-Oriented Database Management System, ACM Transactions on Office Information Systems.
[GOL..] Goldberg, A., Introducing the Smalltalk-80 System, Byte, Aug.
[HAL..] Hall, P.A.V., Optimization of a Single Relational Expression in a Relational Database, IBM Research and Development.
[HAM..] Hammer, M., and McLeod, Database Description with SDM: A Semantic Database Model, ACM TODS.
[HOR..] Hornick, M.F., and Zdonik, S.B., A Shared, Segmented Memory System for an Object-oriented Database System, ACM Transactions on Office Information Systems.
[HUL..] Hull, R., and King, R., Semantic Database Modeling: Survey, Applications and Research Issues, ACM Computing Surveys.
[KIM..] Kim, W., Banerjee, Chou, H.T., Garza, J.F., and Woelk, Composite Object Support in an Object-oriented Database System, Proceedings of OOPSLA, FL, Oct.
[KIN..] King, R., Sembase: A Semantic DBMS, Proceedings of the First International Workshop on Expert Database Systems, Atlanta, GA, Oct.
[KLE..] Kleene, S.C., Mathematical Logic, John Wiley & Sons, Inc.
[LAM..] Lam, H., Xia, Qiu, and Wu, P., Prototype Implementation of an Object-oriented Knowledge Base Management System, to appear in the Proceedings of PROCIEM, Orlando, FL, Nov.
[LEC..] Lecluse, C., Richard, P., and Velez, F., O2, an Object-Oriented Data Model, ACM-SIGMOD Conference, Chicago, IL, June.
[MAC..] MacGregor, R., ARIEL: A Semantic Front-End to Relational DBMSs, Proceedings of VLDB, Atlanta, GA, April.
[MAI..] Maier and Stein, Development of an Object-oriented DBMS, Proc. of OOPSLA Conference, Portland, OR, Sept.-Oct.
[MAN..] Manola, F., and Dayal, U., PDM: An Object-Oriented Model, Int'l Workshop on Object-Oriented Database Systems.
[PAN..] Pant, S., An Intelligent Schema Design Tool for OSAM*, Master's thesis, University of Florida.
[ROW..] Rowe, L.A., and Stonebraker, M.R., The POSTGRES Data Model, Proceedings of the VLDB Conference, Brighton.
[SER..] Servio Logic Development Corporation, Programming in OPAL, a manual published by Servio Logic Development Corporation, Beaverton, OR.
[SHA..] Shaw, M., and Zdonik, S.B., A Query Algebra for Object-Oriented Databases, IEEE Trans. on Data Engineering, Feb.
[SHI..] Shipman, The Functional Data Model and the Data Language DAPLEX, ACM TODS.
[SIN..] Singh, M., Transaction Oriented Rule Processing in an Object-Oriented Knowledge Base Management System, Master's thesis, University of Florida.
[ST..] Stonebraker, M., Wong, E., Kreps, P., and Held, The Design and Implementation of INGRES, ACM Transactions on Database Systems.
[ST..] Stonebraker, M., Anderson, E., Hanson, E., and Rubenstein, B., Quel as a Data Type, Proceedings of the ACM SIGMOD Conference on Management of Data, Boston, MA, June.
[SU..] Su, S.Y.W., Modeling Integrated Manufacturing Data With SAM*, IEEE Computer, January.
[SU..] Su, S.Y.W., Lam, H., and Navathe, S.N., An Object-oriented Computing Environment for Productivity Improvement in Automated Design and Manufacturing, Project Summary, PROCIEM, Orlando, FL, Nov.
[SU..] Su, S.Y.W., Krishnamurthy, V., and Lam, H., An Object-oriented Semantic Association Model (OSAM*), AI Industrial Engineering and Manufacturing: Theoretical Issues and Applications, S. Kumara, A.L. Soyster, and R.L. Kashyap (eds.), The Institute of Industrial Engineering, Industrial Engineering and Management Press, Norcross, GA.
[TOD..] Todd, S.J.P., The Peterlee Relational Test Vehicle: A System Overview, IBM Systems.
[TSU..] Tsur, S., and Zaniolo, C., An Implementation of GEM: Supporting a Semantic Data Model on a Relational Back End, Proceedings of the ACM SIGMOD Intl. Conference on the Management of Data, Boston, MA, June.
[TY..] Ty, Frederick, The Design and Implementation of a Graphics Interface for an Object-oriented Language, Master's thesis, University of Florida.
[ULL..] Ullman, J.D., Principles of Database Systems, Computer Science Press.
[WOE..] Woelk, Kim, W., and Luther, W., An Object-Oriented Approach to Multimedia Databases, ACM SIGMOD Conference Proceedings, Washington, DC, May.
[WON..] Wong, E., and
[WU..] [ZAN..]
[ZAN..] Zaniolo, C., The Database Language GEM, Proceedings of the ACM SIGMOD Intl. Conference on the Management of Data, San Jose, CA.
[ZAN..a] Zaniolo, C., The Representation and Deductive Retrieval of Complex Objects, Proceedings of VLDB, Stockholm, Sweden.
[ZAN..b] Zaniolo, C., Ait-Kaci, H., Beech, Cammarata, S., Kerschberg, L., and Maier, Object-Oriented Database Systems and Knowledge Systems, in Expert Database Systems, Larry Kerschberg (ed.), Benjamin/Cummings Publishing, Menlo Park, CA.
[ZD..] Zdonik, S.B., Skarra, A.H., and Reiss, S.P., An Object Server for an Object-oriented Database System, International Workshop on Object-oriented Database Systems, Pacific Grove, CA, Sept.
[Z..] Zook, W

APPENDIX

The formal proofs of the mathematical properties of the A-algebra operators are given below.

A. Commutativity

$\alpha \, *[R(A,B)] \, \beta = \beta \, *[R(B,A)] \, \alpha$

Proof: If a pattern in $\alpha$ can be concatenated with a pattern in $\beta$ over an Inter-pattern $a_i b_j$, then the pattern in $\beta$ can be concatenated with that pattern in $\alpha$ over the Inter-pattern $b_j a_i$. Since patterns are non-directional, i.e., $a_i b_j = b_j a_i$, the left-hand side and the right-hand side of the equation would produce the same result. On the other hand, if an $\alpha$ pattern cannot be concatenated with a $\beta$ pattern by the operation on the left-hand side, then the same $\beta$ pattern cannot be concatenated with that $\alpha$ pattern by the operation on the right-hand side. □

$\alpha \, |[R(A,B)] \, \beta = \beta \, |[R(B,A)] \, \alpha$

Proof: A Complement-pattern is non-directional. If a Complement-pattern $a_i b_j$ connects an $\alpha$ pattern with a $\beta$ pattern, these two patterns, together with the Complement-pattern $a_i b_j$, will all be retained in the results of the expressions on both sides of the equation. For the same reason, a new pattern which cannot be produced by the operation on the left-hand side of the equation cannot be produced by the operation on the right-hand side. □
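The commutativity argument for the Associate operator can be checked on a toy model. The sketch below is only an illustration under simplifying assumptions, not the dissertation's formal definition: an object instance is a hypothetical `(class, id)` tuple, a pattern is a frozenset of Inner-patterns and Inter-patterns, and the helper names `inner`, `inter`, `objects`, and `associate` are invented for this example. Because an Inter-pattern is stored as an unordered pair, swapping the operands together with the class order yields the same set of patterns.

```python
from itertools import product

def inner(obj):
    """Inner-pattern: a single object instance, e.g. ("A", 1)."""
    return frozenset([obj])

def inter(x, y):
    """Inter-pattern: a non-directional association between two objects."""
    return frozenset([x, y])

def objects(pattern):
    """All object instances that occur anywhere in a pattern."""
    return frozenset().union(*pattern)

def associate(alpha, beta, r, cls_a, cls_b):
    """Simplified Associate: concatenate every pattern of alpha with every
    pattern of beta that it is linked to by an Inter-pattern of r, where the
    linking objects belong to cls_a and cls_b respectively."""
    result = set()
    for pa, pb in product(alpha, beta):
        for a, b in product(objects(pa), objects(pb)):
            if a[0] == cls_a and b[0] == cls_b and inter(a, b) in r:
                result.add(pa | pb | {inter(a, b)})
    return frozenset(result)

# Hypothetical object instances of classes A and B.
a1, a2, b1, b2 = ("A", 1), ("A", 2), ("B", 1), ("B", 2)
alpha = {frozenset([inner(a1)]), frozenset([inner(a2)])}
beta = {frozenset([inner(b1)]), frozenset([inner(b2)])}
r_ab = {inter(a1, b1), inter(a2, b1)}  # association R(A,B)

lhs = associate(alpha, beta, r_ab, "A", "B")
rhs = associate(beta, alpha, r_ab, "B", "A")
print(lhs == rhs)
```

The design choice that matters here is the unordered representation of an Inter-pattern; with ordered pairs the two sides would need an explicit reversal step before they could be compared.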

f D?^5$%f@3 3n>5%$f@[ f 3URRI $FFRUGLQJ WR WKH FRQQHFWLRQV EHWZHHQ SDWWHUQV RI D DQG 3 WKURXJK VRPH ,QWHUSDWWHUQV D DQG 3 FDQ EH GHFRPSRVHG LQWR WKH $8QLRQ RI WZR VXEVHWV RI SDWn WHUQV UHVSHFWLYHO\ LQ LQ Df RU D D DQG 3 3 3 ZKHUH D UHSUHVHQWV D VXEVHW RI D SDWWHUQV WKDW FDQ EH FRQFDWHQDWHG ZLWK WKH SDWWHUQV DQG D UHSUHVHQWV D VXEVHW RI D SDWWHUQV WKDW FDQQRW EH FRQFDWHQDWHG ZLWK 3 SDWWHUQV 7KH GHFRPSRVLWLRQ RI IW FDQ EH LQWHUSUHWHG VLPLODUO\ Q Q Q Q $VVXPH WKDW D IW DQG S D DUH XVHG WR GHQRWH WKH QHZ SDWWHUQV SURGXFHG E\ WKH 1RQ$VVRFLDWH RSHUDWLRQV RQ ERWK OHIW DQG ULJKWKDQG VLGHV RI WKH HTXDWLRQ (DFK RI WKH QHZ SDWWHUQV FRQVLVWV RI RQH D SDWWHUQ RQH S SDWWHUQ DQG D &RPSOHPHQWSDWWHUQ ZKLFK FRQQHFWV WKH WZR %\ WKH GHILQLWLRQ RI WKH 1RQ$VVRFLDWH RSHUDWLRQ ZH KDYH OHIWKDQG VLGH RU D f?^5$%f@3 3f D LI 3 Q S ,, Q Q D 3 RWKHUZLVH ULJKWKDQG VLGH 3 f?>5%$f?DW Df Q Q D LI 3 W! ,, ,, 3 LI D I! 3 D RWKHUZLVH 6LQFH D &RPSOHPHQWSDWWHUQ LV QRQGLUHFWLRQDO LH D 3 3 D WKH FRPPXWDWLYLW\ KROGV IRU DOO FDVHV Â’

f &I^;f3 3r^;`D f 3URRI ,I WKH ,QQHUSDWWHUQV REMHFW LQVWDQFHVf RI WKH FODVVHV VSHFLILHG LQ ^;` FRQn WDLQHG LQ DQ D SDWWHUQ DUH FRPPRQ WR D 3 SDWWHUQ WKH QHZ SDWWHUQ ZKLFK LV WKH LQWHUVHFWLRQ RI WKH WZR SDWWHUQV ZLOO EH SURGXFHG E\ ERWK VLGHV RI WKH HTXDWLRQ 2Q WKH RWKHU KDQG LI DQ D SDWWHUQ ZKLFK GRHV QRW LQWHUVHFW ZLWK D 3 SDWWHUQ E\ WKH RSHUDWLRQ RQ WKH OHIWKDQG VLGH RI WKH HTXDWLRQ WKH VDPH 3 SDWWHUQ ZLOO QRW LQWHUVHFW ZLWK WKDW D SDWWHUQ E\ WKH RSHUDWLRQ RQ WKH ULJKWKDQG VLGH Â’ f D3 3D f 3URRI 6LQFH WKH $8QLRQ RSHUDWLRQ VLPSO\ OXPSV SDWWHUQV QDPHG E\ WZR RSHUDQGV LQWR D VLQJOH DVVRFLDWLRQVHW DQG WKH SDWWHUQV LQ DQ DVVRFLDWLRQVHW DUH QRW RUGHUHG ERWK VLGHV RI WKH HTXDWLRQ ZLOO SURGXFH WKH VDPH UHVXOW Â’ % $VVRFLDWLYLW\ f DZr^5&/9&/f?3^]` r^5 &/9 &/f`S^ Q -5 &/ &/A^=`f f &/e^;? $ &/e^=` 3URRI 7KH DVVRFLDWLYLW\ KROGV RQO\ XQGHU WKH VWDWHG FRQGLWLRQ 7KH FRQGLWLRQ VWDWHV WKDW D GRHV QRW FRQWDLQ ,QQHUSDWWHUQV RI FODVV &/ DQG M GRHV QRW FRQWDLQ ,QQHU SDWWHUQV RU REMHFW LQVWDQFHVf RI FODVV &/ VR WKDW D ZLOO KDYH QR HIIHFW RQ WKH RSHUDn WLRQ r>5&/&/A? RQ WKH OHIWKDQG VLGH DQG ZLOO KDYH QR HIIHFW RQ WKH RSHUDWLRQ r>5^&/Y&/f? RQ WKH ULJKWKDQG VLGH *LYHQ WKDW WKH DERYH FRQGLWLRQ KROGV D3 DQG FDQ EH GHFRPSRVHG DV IROORZV W } P Df D D D D

ZKHUH D UHSUHVHQWV D VXEVHW RI D SDWWHUQV ZKLFK FDQ EH FRQFDWHQDWHG ZLWK D VXEVHW RI IW SDWWHUQV DQG WKHUHDIWHU EH FRQFDWHQDWHG WKURXJK IW SDWWHUQVf ZLWK D VXEVHW RI SDWWHUQV D UHSUHVHQWV D VXEVHW RI D SDWWHUQV ZKLFK FDQ EH FRQn FDWHQDWHG ZLWK D VXEVHW RI IW SDWWHUQV ZKLFK KRZHYHU FDQQRW EH FRQFDWHQDWHG ZLWK DQ\ SDWWHUQ DQG D UHSUHVHQWV D VXEVHW RI SDWWHUQV ZKLFK HLWKHU GRHV QRW KDYH WKH ,QQHUSDWWHUQV RI &/W RU FDQQRW EH FRQFDWHQDWHG ZLWK DQ\ IW SDWWHUQ U 1RWH WKDW DQ D SDWWHUQ PD\ EHORQJ WR D DQG D f P Ef IW IW IW IW IW Q ZKHUH 3 FDQ EH FRQFDWHQDWHG ZLWK D DQG FDQ EH FRQFDWHQDWHG ZLWK D EXW QRW ZLWK If FDQ EH FRQFDWHQDWHG ZLWK EXW QRW ZLWK RU DQG FDQQRW EH L FRQFDWHQDWHG ZLWK HLWKHU D RU IW 1RWH WKDW SDWWHUQV RI IW IW IW DQG IW DUH PXWXDOO\ H[FOXVLYH Ff ZKHUH DQG KDYH WKH VLPLODU LQWHUSUHWDWLRQV DV RU RU DQG D UHVSHF WLYHO\ LOO ,I DIW D IW IWnf IW DQG DIW DUH XVHG WR UHSUHVHQW WKH UHVXOWV RI WKH $VVRFLDWH RSHUDWLRQV DFFRUGLQJ WR WKH GHILQLWLRQ RI $VVRFLDWH ZH KDYH ,, ,,, LOO OHIWKDQG VLGH D D D fr^5>&/Y&/A?^IW IW IW IW ff 5&/&/f@Q f ^DIW D IWfr>5^&/Y&/f`^ f , DIWO ULJKWKDQG VLGH RU D D f r^5&/Y&/f?IW IW IW IW f ,,, r>M5&/&/f@ ff L Q KL LW LQ Q D D D fr?5&/Y&/f@>S IW f , DIW Â’

f D^;`?>5&/Y&/f@3^&/&/f@^=` f D^[f?>5 &/Y&/f?3>9f?>5^ &/f &/f@^]`f &/t ^;` $ &/t ^=` 3URRI )RU WKH VLPLODU UHDVRQ JLYHQ LQ WKH GLVFXVVLRQ RI DVVRFLDWLYLW\ RI r RSHUDWRU D 3 DQG FDQ EH GHFRPSRVHG DV IROORZV W Q WQ Df RU D D D ZKHUH RU FDQ EH FRQQHFWHG WR 3 SDWWHUQV E\ &RPSOHPHQWSDWWHUQV DQG WKHQ EH FRQQHFWHG WR SDWWHUQV D FDQ EH FRQQHFWHG WR 3 SDWWHUQ E\ &RPSOHPHQW P SDWWHUQV EXW FDQQRW EH IXUWKHU FRQQHFWHG WR SDWWHUQV DQG D HLWKHU KDV QR ,QQHUSDWWHUQV RI &/; RU FDQQRW EH FRQQHFWHG WR DQ\ 3 SDWWHUQ E\ W &RPSOHPHQWSDWWHUQV $OVR SDWWHUQV RI D DQG D PD\ QRW EH PXWXDOO\ H[FOXVLYH f W Q UUW PL Ef S S S S S L Q ZKHUH 3 FDQ EH FRQQHFWHG WR D DQG SDWWHUQV E\ &RPSOHPHQWSDWWHUQV 3 FDQ P EH FRQQHFWHG WR D SDWWHUQV E\ &RPSOHPHQWSDWWHUQV EXW QRW WR SDWWHUQV FDQ EH FRQQHFWHG WR SDWWHUQV E\ &RPSOHPHQWSDWWHUQV EXW QRW WR D SDWWHUQV QQ DQG S FDQQRW EH FRQQHFWHG WR WKH SDWWHUQV RI HLWKHU D RU $OVR SDWWHUQV RI O Q LOO LOO 3 3 3 DQG 3 DUH PXWXDOO\ H[FOXVLYH ,, ,,, Ff ,, OLW ,, ,,, ZKHUH DQG KDYH WKH VLPLODU LQWHUSUHWDWLRQV DV D D DQG D UHVSHF WLYHO\ 7KHQ E\ WKH GHILQLWLRQ RI WKH $&RPSOHPHQW RSHUDWLRQ ZH KDYH L LQ LQ OHIWKDQG VLGH m D D f_>5&/9&/f`3 3 3 3 ff ,, ,,, _>m&/&/f@ f

OLQQ LQ LQ >D3 D 3 f?>5&/O&/f`nf f D3 L LW QL L Q LQ QQ ULJKWKDQG VLGH RU D D f ?>5&/9&/f@3 3 3 3 f ,, ,,, _>M5&/M&/f@ ff L Q QU L L LQ Q D D D f?^5^&/Y&/f@3Q 3 Of , D 3L ZKHUH D3 D 3 3nf 3 RSHUDWLRQV Â’ DQG D3 UHSUHVHQW WKH UHVXOWV RI WKH $&RPSOHPHQW f DZm^:n`"P0:`: r^[`:`S>
WKH SDWWHUQV LQ D VHW DUH QRW RUGHUHG WKH RUGHU RI SHUIRUPLQJ $8QLRQ RSHUDWLRQV RQ D QXPEHU RI DVVRFLDWLRQVHWV ZLOO KDYH QR HIIHFW RQ WKH ILQDO UHVXOW Â’ 'LVWULEXWLYLW\ f RWr>5^$%f@^37f [r^5^$%f@3 DIr>-$IOf@ f 3URRI )LUVW D DQG FDQ EH GHFRPSRVHG DV IROORZV L Q LQ Df D D D D W Q P ZKHUH D FDQ EH FRQFDWHQDWHG ZLWK D FDQ EH FRQFDWHQDWHG ZLWK DQG D FDQQRW EH FRQFDWHQDWHG ZLWK HLWKHU 3 RU 1RWH WKDW DQ DW SDWWHUQ PD\ EHORQJ ,, WR DW DQG D Ef 3 3 P ,, ZKHUH IW FDQ EH FRQFDWHQDWHG ZLWK D EXW If FDQQRW Ff Q ZKHUH FDQ EH FRQFDWHQDWHG ZLWK D EXW FDQQRW %\ WKH GHILQLWLRQ RI WKH $VVRFLDWH RSHUDWLRQ ZH KDYH ,, ,,, ,, W ,, OHIWKDQG VLGH D D D f r>5$%f?3 IW f ,, ,, DI D ,, ,,, ,, ,, ,,, OO ULJKWKDQG VLGH D D D fr>5$%f`3 IWf DW D D fr>5$%f@ f ,, ,, DIf DW Â’ f FW?^5$%f?3nOf r?>5$%f` D_>-$IOf@ f 3URRI D IW DQG FDQ EH GHFRPSRVHG DV IROORZV P Df D D D D P D ZKHUH D FRQWDLQV SDWWHUQV WKDW DUH FRQQHFWHG WR IW E\ &RPSOHPHQWSDWWHUQV D ,,, FRQWDLQV SDWWHUQV WKDW DUH FRQQHFWHG WR E\ &RPSOHPHQWSDWWHUQV DQG D FDQ

QRW EH FRQQHFWHG WR HLWKHU 3 RU E\ &RPSOHPHQWSDWWHUQV 1RWH WKDW DQ D SDW WHUQ PD\ EHORQJ WR D DQG D Ef 3 3 3 Q ZKHUH c FDQ EH FRQQHFWHG WR D E\ &RPSOHPHQWSDWWHUQV EXW I FDQQRW Q ZKHUH FDQ EH FRQQHFWHG WR D E\ &RPSOHPHQWSDWWHUQV EXW FDQQRW %\ WKH GHILQLWLRQ RI WKH $&RPSOHPHQW RSHUDWLRQ ZH KDYH OHIWKDQG VLGH D D RU f ?>5$%f@3 3 f ,, ,, D D ,, +, W ,, ,, +, ,, ULJKWKDQG VLGH D D D f??5$%f@3 3f D D D f_>L($%f@ f ,, +, DIO DW Â’ f Dr^;`3f RW}^;` D^;` f 3URRI D IW DQG FDQ EH GHFRPSRVHG DV IROORZV + +, Df D D Fr DU P D KL ZKHUH D LQWHUVHFWV ZLWK D LQWHUVHFWV ZLWK DQG D GRHV QRW LQWHUVHFW ZLWK HLWKHU If RU 1RWH WKDW DQ D SDWWHUQ PD\ EHORQJ WR D DQG D Ef 3 3 3 D ,, ZKHUH 3 LQWHUVHFWV ZLWK D EXW S GRHV QRW ,, ZKHUH LQWHUVHFWV ZLWK D EXW GRHV QRW %\ WKH GHILQLWLRQ RI WKH $,QWHUVHFW RSHUDWLRQ ZH KDYH + ,+ ,, + OHIWKDQG VLGH RU D D f^;`" 3 f ,, +, RWS D + ,,, ,+ ,, +, ,, ULJKWKDQG VLGH m DW D f}^;`3 3f DW DW DW f^;` f ,, +, D3 D Â’

f DZ r^5&/9&/f@"P^ :`^]`f m r^5 &/9 &/f?3>\@^ :M;`DUZ r?5 &/Y&/f?>]` f f rZ?>5&/Y&/f`3^\f^:`nL^]ff D_>L&/&/3^\`^:8;`DZ_>L&/f&/f@^]` f 7KH DERYH WZR GLVWULEXWLYH SURSHUWLHV KROG ZKHQ WKH IROORZLQJ FRQGLWLRQV DUH WUXH Lf &/e^:` LLf ;U?< ;QZ M! DQG LLLf RU LV D KRPRJHQHRXV DVVRFLDWLRQf§VHW 7KH ILUVW FRQGLWLRQ HQVXUHV WKDW WKH RSHUDWLRQV $VVRFLDWH $&RPSOHPHQW DQG 1RQ$VVRFLDWH ZLOO RSHUDWH RQ WKH FRPPRQ FODVV RI DQG DV VKRZQ LQ Df RI WKH IROn ORZLQJ ILJXUH 2WKHUZLVH WKH GLVWULEXWLRQV RI WKHVH RSHUDWLRQV WR DQG GR QRW PDNH VHQVH DV VKRZQ LQ Ef DQG Ff 7KH VHFRQG FRQGLWLRQ HQVXUHV WKDW D SDWWHUQV PXVW QRW LQWHUVHFW ZLWK DQ\ SDWWHUQ RI HLWKHU A RU VR WKDW WKH f^;X :` RSHUDWLRQV RQ WKH ULJKWKDQG VLGHV RI WKH HTXDWLRQV ZLOO H[DPLQH WKH LQWHUVHFWLRQV RQ WKH SRUWLRQV RI D DQG VHSDUDWHO\ 7KH WKLUG FRQGLWLRQ HQVXUHV WKDW RQ WKH ULJKWKDQG VLGHV RI WKH HTXDWLRQV RQO\ WKRVH SDWWHUQV WKDW KDYH WKH VDPH D SDWWHUQ ZLOO LQWHUVHFW DQG EH UHWDLQHG LQ WKH UHVXOW Df Ef Ff :H VKDOO RQO\ JLYH WKH SURRI RI FDQ EH SURYHG XVLQJ WKH VDPH WHFKn QLTXH

:KHQ WKH FRQGLWLRQV DUH WUXH D 3 DQG FDQ EH GHFRPSRVHG DV IROORZV Q LQ QQ Df D D D r RU ZKHUH D FDQ EH FRQFDWHQDWHG ZLWK 3 DQG D FDQ EH FRQFDWHQDWHG ZLWK IW EXW LQ QQ QRW ZLWK D FDQ EH FRQFDWHQDWHG ZLWK EXW QRW ZLWK DQG D FDQQRW EH P QQ FRQFDWHQDWHG ZLWK HLWKHU c RU 1RWH WKDW D D D DQG D DUH PXWXDOO\ H[FOXVLYH f L Q QL QQ Ef S S S S S L Q ZKHUH 3 FDQ EH FRQFDWHQDWHG ZLWK D DQG GRHV LQWHUVHFW ZLWK IW FDQ EH FRQ +, FDWHQDWHG ZLWK D EXW GRHV QRW LQWHUVHFW ZLWK I FDQQRW EH FRQFDWHQDWHG ZLWK QQ D EXW GRHV LQWHUVHFW ZLWK DQG IW FDQ QHLWKHU EH FRQFDWHQDWHG ZLWK D QRU L Q P QQ LQWHUVHFW ZLWK 1RWH WKDW IW c DQG c DUH DOVR PXWXDOO\ H[FOXVLYH Q LQ QQ Ef ZKHUH DQG KDYH WKH VLPLODU LQWHUSUHWDWLRQV DV 3 IW IW DQG 3 UHVSHFWLYHO\ %\ WKH GHILQLWLRQ RI WKH RSHUDWLRQV RI $VVRFLDWH DQG $,QWHUVHFW ZH KDYH L LQ QQ Q P QQ OHIWKDQG VLGH D D D D f r^5&/9&/f?3 3 3 3 f L Q LQ QQ f^:` ff L Q P QQ LQ LQ D D D D fr^5&/9&/f?3 3 f ,,, ,,, 6LQFH &/H^:` DQG 3 FDQQRW EH SURGXFHG E\ WKH ^;` RSHUDWRU DFFRUGLQJ WR LQ QU WKH GHFRPSRVLWLRQV RI 3 DQG 2WKHUZLVH RU 3f PXVW FRQWDLQ WKH VDPH ,QQHU SDWWHUQ RI &/ DV FRQWDLQHG LQ 3 RU f DQG PXVW EH DEOH WR FRQFDWHQDWH ZLWK D $SSO\LQJ WKH GLVWULEXWLYH SURSHUW\ ZH REWDLQ

m5&/Y&/f?3L G r?5&/Y&/f?3 Q WLQ WQ P D r>5&/Y&/f?3 D r>5^&/9&/f?3 ,W , ,,, WWW ,,, D 5^&/Y&/f@SnL D r^5&/Y&/f@S XQ W L WLQ WWW WWW m 5^&/Y&/f`3 D r^5^&/Y&/f@3 %DVHG RQ WKH GHFRPSRVLWLRQV RI D 3 DQG RQO\ WKH ILUVW LWHP ZLOO SURGXFH QHZ SDWn WHUQV DQG LV UHWDLQHG +HQFH G r>5&/Y&/f`3G U DSn\ 2Q WKH ULJKWKDQG VLGH RI WKH HTXDWLRQ ZH KDYH W LW WWW WLQ W WW WWW WWQ ULJKWKDQG VLGH D D RU D fr>5&/O&/f@3 3 3 3 ff W LW WQ WWQ W Q WQ WWQ ^;X:`D D D D fr^5^&/9&/f`^ ff W L LQ WW U LQ WQ WQ Q D3 D3 D 3 D "fr^;X:`Fr D D D f $SSO\LQJ WKH GLVWULEXWLYH SURSHUW\ ZH KDYH LW W W L LW LW WWW L U W WQ Q ULJKWKDQG VLGH D3}^;?M:`DWnL D3}^;?>M:`DWnL DAW^-XL9`D D3}^;?M:fD LQ Q WW W WW WQ W WW WQ WW D3f^;?M:`Dnf D3f^;?M:fDAc D3}^;?M:`D D 3 f^;?M:`RW Q W WWW LQ Q WWW W WW WQ Q D 3}^;?M:`DL D "r^;X:n`D D 3}^;?M:fD D 3}^;?M:`D D 3 f^;?M:`DWnf D 3 f^;?M:fDWnf D 3 r^;X :`D D 3 f^;X:`DU 2I WKH VL[WHHQ LWHPV RQO\ WKH ILUVW RQH LV UHWDLQHG 7KH UHVW RI LWHPV DUH GURSSHG EHFDXVH WKH\ GR QRW LQWHUVHFW HLWKHU RYHU FODVVHV LQ ^;` RU RYHU FODVVHV LQ ^:f 7KHUHn IRUH ,W WW ULJKWKDQG VLGH mAf^;X:`Dn W D3 Â’

( 2WKHU 3URSHUWLHV f DZPP m0P f Q P QQ L Q 3URRI D FDQ EH GHFRPSRVHG LQWR D D D D ZKHUH D VDWLVILHV 3[ DQG 3 D P QQ RQO\ VDWLVILHV 3Y D RQO\ VDWLVILHV 3 DQG D GRHV QRW VDWLVI\ HLWKHU 3[ RU 3 DLm m9M m I Z m;} m f>A@ m DWmf>3$!3@ D D f ,O RLDPZ R Df>I@f>A 3=ef f 3URRI )LUVW D LV GHFRPSRVHG LQWR D r r ZKHUH D VDWLVILHV WKH VHOHFWLRQ FRQGLWLRQ Q LQ EXW D GRHV QRW 7KHQ OHW DQG UHSUHVHQW WKH UHVXOWV RI WKH SURMHFWLRQ RSHUDWLRQ FRUUHVSRQGLQJ WR D DQG D UHVSHFWLYHO\ 6LQFH 3&6 VDWLVILHV 3 EXW GRHV QRW DQG ZH KDYH ,, rD3f>I7, 9f> 3 D> ,ODf>eO@f>3L R^ f ’ f R>D r>IO$%f@ IOIW$\ A0}@ L$%f@ Df>3@ f ZKHUH 3[ DQG 3 DUH DSSOLFDEOH WR D DQG UHVSHFWLYHO\ Q QL QQ L Q 3URRI )LUVW D LV GHFRPSRVHG LQWR D D D D ZKHUH D DQG D VDWLVI\ 3[ EXW QL QQ L LQ P Q D DQG D GR QRW DQG D DQG D FDQ EH FRQFDWHQDWHG ZLWK VRPH SDWWHUQV EXW D QQ L Q LQ QQ DQG RU GR QRW If FDQ EH GHFRPSRVHG LQWR c 3 ZLWK D VLPLODU LQWHUSUH WDWLRQ 7KHUHIRUH ZH KDYH ,, OLW LQ ,,, ,,, R>D r>5$%f` f>3[D3` M>D D c D D f>3[D3? , D rLrf>!L@ Yf>3D Df r?5^$%f? f , D ’

f RD r>$%f@ 3f>3^63 rrf>!@ r>5$0 3 D f ZKHUH 3O DQG 3 DUH DSSOLFDEOH WR D DQG 3 UHVSHFWLYHO\ 3URRI D DQG 3 DUH GHFRPSRVHG DV LQ WKH DERYH SURRI 7KXV ZH KDYH ,W O ,,, ,,, O ,,, LQ R^D r^5^$%f` 3f>3O?3 R>D3 r3 D 3 DW 3 f>3OY3@ ,, ,,, ,,, DW3 DW3 DW 3 mm0O 3 r m:LQ L LW LQ PL L Q P PL L LW DW RW fr>-$%f@ 3 3 3 "f mfm DW fr>5$%f@3 3f ,W ,,, ,,, DW3 RW3 D 3 ’ f R^DW 3f3c R^Df>3L 3 f L LL LQ QQ L Q Q 3URRI :H GHFRPSRVH D LQWR D D D DW ZKHUH D DQG D VDWLVI\ 3 EXW D DQG PL L P P Q QQ RW GR QRW DQG D DQG D FRQWDLQ 3 SDWWHUQV EXW RW DQG D GR QRW 7KHQ ZH KDYH Q WP Q R^DW Sf>WI RDW D f>A RU DDf>3? 3 DW DWf3 DW ’ f !} 3f>3L RDf3> R3f>3> f OO OO 3URRI 6XSSRVH D DQG 3 DUH GHFRPSRVHG LQWR VXEVHWV D DQG D DQG 3 DQG 3 UHVSHF ,, OO OO WLYHO\ ZKHUH RW DQG 3 VDWLVI\ 3 EXW D DQG 3 GR QRW %\ WKH GHILQLWLRQ RI $6HOHFW RSHUDWLRQ ZH KDYH DD 3f>3? RW 3 RDf>A R3f^3? f RRW 3f>393 UDP R3f>3 f ZKHUH 3O DQG 3 DUH DSSOLFDEOH WR D DQG 3 UHVSHFWLYHO\ OO O OO 3URRI 6XSSRVH D DQG 3 DUH GHFRPSRVHG LQWR VXEVHWV D DQG RU DQG 3 DQG 3 UHVSHF ,, OO WLYHO\ ZKHUH D VDWLVILHV 3[ EXW DW GRHV QRW DQG 3 VDWLVILHV 3 EXW 3 GRHV QRW %\ WKH GHILQLWLRQ RI $6HOHFW RSHUDWLRQ ZH KDYH RD 3f>3\3? D 3 ADIO\ r03 ’

f ,D 3f>e7? ,-DWf>fn7? Q^Sf>e7c f WL L Q 3URRI 6XSSRVH WKDW RU DQG 3 DUH GHFRPSRVHG LQWR VXEVHWV D DQG D DQG 3 DQG 3 UHVSHFWLYHO\ ZKHUH D DQG 3 FRQWDLQ VXESDWWHUQV GHILQHG E\ >eEXW D DQG S GR QRW 7KH UHVXOWV RI WKH WZR $3URMHFW RSHUDWLRQV RQ D DQG 3 DUH UHSUHVHQWHG E\ D DQG IW UHVSHFWLYHO\ %\ WKH GHILQLWLRQ RI $3URMHFW RSHUDWLRQ ZH KDYH QD "f>e @ D 3 Df>e-ef>e7’ f RU Sf D f f f ,, 3URRI D DQG c DUH GHFRPSRVHG LQWR VXEVHWV D DQG D DQG If DQG UHVSHFWLYHO\ L ZKHUH D DQG 3 FRQWDLQ SDWWHUQV EXW RW DQG S GR QRW 7KXV ZH KDYH D Sf D S D f f ’ f D U: 3 f m S f 3URRI %\ WKH GHILQLWLRQ RI WKH $'LYLGH RSHUDWLRQ RQ WKH OHIWKDQG VLGH RI WKH HTXDn WLRQ DQ D SDWWHUQ ZLOO EH UHWDLQHG LQ WKH UHVXOW LI Df LW KDV ,QQHUSDWWHUQV RI FODVVHV LQ ^:` DQG FRQWDLQV DOO SDWWHUQV RI 3 DQG RU Ef WKH ,QQHUSDWWHUQV RI FODVVHV LQ ^:` WKDW DQ D SDWWHUQ KDV DUH FRPPRQ WR VRPH RWKHU D SDWWHUQV DQG WKHVH SDWWHUQV WRJHWKHU GHQRWHG E\ D FRQWDLQ DOO SDWWHUQV RI 3 DQG $Q D SDWWHUQ RU SDWWHUQV LQ Df ZKLFK LV UHWDLQHG RQ WKH OHIWKDQG VLGH RI WKH HTXDWLRQ ZLOO EH UHWDLQHG DIWHU WKH ILUVW $'LYLGH RSHUDWLRQ RQ WKH ULJKWKDQG VLGH VLQFH LW PXVW FRQWDLQ DOO WKH 3 SDWWHUQV ,W ZLOO DOVR EH UHWDLQHG LQ WKH ILQDO UHVXOW DIWHU WKH VHFRQG $'LYLGH RSHUDWLRQ VLQFH LW PXVW FRQWDLQ DOO WKH SDWWHUQV ’

f RUZ r^5$%f` S>5^&'f? ^]`f f ^;` DQG % ^=` L Q P QQ LQQ L Q 3URRI 3 LV GHFRPSRVHG LQWR 3 3 3 3 3 ZKHUH 3 DQG S FDQ EH FRQ P QQ Q QQ FDWHQDWHG ZLWK D SDWWHUQV EXW DQG c FDQQRW DQG FDQ EH FRQFDWHQDWHG ZLWK QW QQU SDWWHUQV E\ &RPSOHPHQWSDWWHUQV EXW 3 DQG 3 FDQQRW DQG 3 FDQ EH QHLWKHU FRQn FDWHQDWHG ZLWK D SDWWHUQV QRU FRQFDWHQDWHG ZLWK SDWWHUQV E\ &RPSOHPHQW SDWn WHUQV D LV GHFRPSRVHG LQWR D D ZKHUH D FDQ EH FRQFDWHQDWHG ZLWK 3 SDWWHUQV EXW D FDQQRW LV GHFRPSRVHG LQWR ZLWK D VLPLODU LQWHUSUHWDWLRQ 7KXV ZH KDYH D r>5$%f` Sf ?>5%&M? GS DSf?>5&'f`n U Q L DS mm m$%f@ 3^\f ,>5&'f? ^]`f [rO5$%f`3O IL9f Q D3 Â’ f DZ m$%f@ 3^
f m^[` r>A$%f@ 3^\`f f ^]` f§ ASWf f ^]`f r>A$%f@ 3^%$%f@ 3f W W D3 ’ 1RWH WKDW WKH OHIWKDQG VLGH RI LV LQ D GLVWULEXWLYH IRUP RI r ZLWK UHVSHFW WR f EXW WKH GLVWULEXWLYH SURSHUW\ FDQQRW EH DSSOLHG EHFDXVH LW UHTXLUHV WKDW $ EH LQ ERWK D DQG 3 DQG EH D KRPRJHQHRXV DVVRFLDWLRQVHW f D >%$%f@ 3 f f D?>5$%f@3f§,O>D r>L$%f@f>DU@ D>%$%f@%Dr>%$%f@>DU@ ZKHUH D 3 DQG DUH KRPRJHQHRXV DVVRFLDWLRQVHWV WL QU QQ U 3URRI D FDQ EH GHFRPSRVHG DV D D D D D ZKHUH D FDQ EH FRQFDWHQDWHG Q ZLWK 3 E\ ,QWHUSDWWHUQV EXW QRW ZLWK D FDQ EH FRQFDWHQDWHG ZLWK E\ ,QWHU QU SDWWHUQV EXW QRW ZLWK S D FDQ EH FRQFDWHQDWHG ZLWK ERWK D DQG 3 E\ ,QWHUSDWWHUQV QQ P UXP QQ DQG D FDQQRW EH FRQFDWHQDWHG ZLWK c DQG RU RU RU DQG D DUH PXWXDOO\

W WW W Q H[FOXVLYH 3 LV GHFRPSRVHG LQWR " ZKHUH 3 FDQ EH FRQFDWHQDWHG ZLWK D EXW 3 FDQQRW FDQ EH GHFRPSRVHG DV S %\ WKH GHILQLWLRQ RI WKH 1RQ$VVRFLDWH RSHUDWLRQ ZH KDYH W WW P PW WW W WW OHIWKDQG VLGH WW r D WW f >L$f@ ^II 3 f PW WW WW RU R ,, 0W WW WW D 3 LI M! WW WW PW 3 LI D f WW PW WW LI D 3 M! WW OLOW WW 3 LI D I! WWUW WW WW D LI S PW WW PW WW D 3 D RWKHUZLVH W W P U W P 6LQFH D r>5$%f? RWIf RW ZH KDYH ,,Dr>5$%f?Sf>D? RW RW 6LPLODUO\ -Dr>5$M%f@f>RM D D 7KHUHIRUH RQ WKH ULJKWKDQG VLGH ZH KDYH PW 2W D WWW D f WW 3 PW LI m M! PW WW D 3 RWKHUZLVH PW 2W ,, D P D f WW ,, m+ PW WW D RWKHUZLVH

+HQFH W UWI LW P ULJKWKDQG VLGH D?>5$%f?3 f§ D D f D>L$%f@Uf f§ D D f WWI D?>5$%f?3 D D f f§ LW WWW D?>5$%f@nf D D f f D^3 nLf D 3 f 3URRI %\ WKH GHILQLWLRQ RI $'LIIHUHQFH RSHUDWLRQ WKH OHIWKDQG VLGH RI WKH HTXDWLRQ UHWDLQV D SDWWHUQV WKDW GR QRW FRQWDLQ DQ\ SDWWHUQ RI S RU 2Q WKH ULJKWKDQG VLGH WKH ILUVW $'LIIHUHQFH RSHUDWLRQ UHWDLQV D SDWWHUQV WKDW GR QRW FRQWDLQ DQ\ 3 SDWWHUQ DQG WKHQ WKH VHFRQG RSHUDWLRQ UHWDLQV D SDWWHUQV WKDW GR QRW FRQWDLQ DQ\ SDWWHUQ RI 3 WWWW LW IW D ,, PL Q IW D c ,, Ha IW IW 3 WWWW LI D I! IW UXW LW LI RU c I! IW 3 WWWW ,W LI D W! WXW WW IW D LI A WWW IW WWWO IW D 3 D RWKHUZLVH RU ’

BIOGRAPHICAL SKETCH

The author has been a research assistant in the Database Systems Research and Development Center at the University of Florida since [year illegible], where he has been working towards the Ph.D. degree in electrical engineering. His research interests include semantic data modeling, query models for object-oriented databases, knowledge and rule representation and processing, query optimization, concurrency control, and parallel processing for databases.

In [year illegible], he received his B.S. degree in mathematics from Fudan University, Shanghai, China, where he was a faculty member of the Computer Center from [year illegible] to [year illegible]. Between [year illegible] and [year illegible], he joined, as a visiting scholar, the Database Systems Research and Development Center at the University of Florida, where he received his M.S. degree in electrical engineering in [year illegible].

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
Stanley Y.W. Su, Chairman
Professor of Electrical Engineering

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
Herman Lam, Cochairman
Associate Professor of Electrical Engineering

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
Shamkant B. Navathe
Professor of Computer and Information Sciences

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
Randy Y. C. Chow
Professor of Computer and Information Sciences

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
John Staudhammer
Professor of Electrical Engineering

This dissertation was submitted to the Graduate Faculty of the College of Engineering and to the Graduate School and was accepted as partial fulfillment of the requirements for the degree of Doctor of Philosophy.
December 1990
Winfred M. Phillips
Dean, College of Engineering
Madelyn M. Lockhart
Dean, Graduate School

UNIVERSITY OF FLORIDA


ACKNOWLEDGEMENTS
I would like to express my sincere appreciation to Dr. Stanley Su, chairman of
my supervisory committee, for giving me the opportunity to work on this interesting
and important topic in the area of object-oriented database systems. Without his
patient guidance and continuous support, this work could not have been completed.
I am grateful to Dr. Herman Lam, cochairman of my supervisory committee, for his
thought-provoking suggestions on this work. I thank Dr. Sham Navathe for his
comments and his personal library. I thank Dr. Randy Chow for his encouragement
throughout my graduate study. I would like to thank Dr. John Staudhammer for his
time and for being on my supervisory committee.
My special thanks go to Sharon Grant, the secretary of the Database Systems
Research and Development Center, whose help has always been friendly and timely.
This research was supported by the National Science Foundation (DMC-8814989)
and the National Institute of Standards and Technology (60NANB4D0017).
The development effort is supported by the Florida High Technology and Industrial
Council (UPN88092237).

TABLE OF CONTENTS
ACKNOWLEDGMENTS
ABSTRACT
CHAPTER
1 INTRODUCTION
2 A SURVEY OF RELATED WORK
2.1 Relational Model and Relational Algebra
2.2 Existing O-O Query Languages
2.3 ENCORE O-O Data Model and Its Underlying Query Algebra
3 OVERVIEW OF O-O DATABASES AND ASSOCIATION-BASED QUERY FORMULATION
3.1 Overview of O-O Databases
3.2 Pattern-based Query Formulation
3.3 Conclusion
4 ASSOCIATION ALGEBRA
4.1 Definitions
4.2 Relationship Between Two Patterns
4.3 Association Operators
4.4 Query Examples
5 MATHEMATICAL PROPERTIES OF OPERATORS AND THEIR APPLICATIONS IN QUERY OPTIMIZATION AND QUERY DECOMPOSITION
5.1 Conventional Algebraic Properties
5.2 Nesting of Two Unary Operators
5.3 Nesting of Binary Operator in Unary Operator
5.4 Cascading of Two Binary Operators
5.5 General Identities
5.6 Transformation of Operators
5.7 Applications in Query Optimization and Decomposition
6 COMPLETENESS OF THE A-ALGEBRA
7 CONCLUSION
REFERENCES
APPENDIX
BIOGRAPHICAL SKETCH

Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy
ASSOCIATION ALGEBRA:
A MATHEMATICAL FOUNDATION
FOR OBJECT-ORIENTED DATABASES
By
Mingsen Guo
December 1990
Chairman: Dr. Stanley Y.W. Su
Major Department: Electrical Engineering
Existing O-O DBMSs lack a solid mathematical foundation for the manipulation
of O-O databases, optimization of queries, and the design and selection of storage
structures for supporting O-O database manipulations. An association algebra
(A-algebra) is proposed to serve as a mathematical foundation for processing O-O
databases, analogous to the use of relational algebra for processing relational
databases. In this algebra, objects and their associations in an O-O database are
uniformly represented by association patterns which are manipulated by a number of
operators to produce other association patterns. Unlike the relational algebra,
in which set operations operate on relations with union-compatible structures,
the A-algebra operators can operate on association patterns of both homogeneous and
heterogeneous structures. Unlike traditional record-based relational processing,
the A-algebra allows very complex patterns of object associations to be
directly manipulated. Pattern-based query formulation and the A-algebra operators
are described. Some mathematical properties of the algebraic operators are
presented together with their application in query decomposition and optimization.
The completeness of the A-algebra is also defined and proven. The A-algebra has
been used as the basis for the design and implementation of an object-oriented query
language, OQL, which is the query language used in a prototype Knowledge Base
Management System OSAM*.KBMS.

CHAPTER 1
INTRODUCTION
In the past two decades, techniques of data modeling have gone through two
major conceptual changes. First, in the early 1970s, E. F. Codd observed that future
database systems should allow application programs and terminal users to remain
unaffected by changes made to the internal data representation (or the storage
structure) of a database. He introduced the relational data model [COD70] and
proposed the relational algebra and relational calculus [COD72a] as the
mathematical foundation for processing relational databases. The relational model
provides two levels of data independence in a three-level architecture for a
database management system as shown in Figure 1.1 (figures of each chapter are
placed at the end of the chapter). At the lower level, physical data independence
is provided, i.e., the logical representation of a relational database is a set of
relations (i.e., flat tables), which is independent of the physical (data and storage)
structures in which data are stored. At the higher level, logical data independence
is provided, i.e., the external view remains unchanged when the logical view
of a database is modified (note that the external view remains unchanged only for
some schema modifications). Besides simple logical representation and data
independence, the fact that the relational model has a solid mathematical
foundation is very important and has contributed to the success of the model and the
existing relational database management systems.
However, the relational model and relational systems have some limitations.
For example, the model captures rather limited structural properties of real-world
entities or objects. The construct of aggregation hierarchy which models complex
objects and the construct of generalization which models the superclass-subclass
relationship are not provided. In the relational model, data which describe a
complex object are scattered among a number of normalized relations and accessing
those data involves time-consuming traversal and assembly of data stored in
multiple relations. The model also does not allow behavioral properties of
entities/objects to be explicitly defined.
The second conceptual change of data modeling techniques occurred in the
early 1980s. The object-oriented paradigm, first introduced in the programming
language SIMULA [DAH67] and made very popular through the language
SMALLTALK [GOL81], allows richer structural constructs and behavioral
properties of objects to be specified at the logical level independent of their physical
implementations. Several features of the paradigm such as abstract data types,
inheritance, encapsulation, information hiding, and polymorphism have been
shown to be useful for data modeling and system development. The object
encapsulation concept adds a level of data independence between the physical and the
logical independences introduced in the relational model, as depicted in Figure 1.2.
It requires that the structural and behavioral properties of an object be (logically)
encapsulated in its class in the conceptual view of an O-O database. Since then, a
number of Object-Oriented (O-O) and semantic data models have been proposed
[HAM81, BAT84, KIN84, ZAN85a, ZAN85b, DAD86, MAI86, MAN86, SU86,
ZDO86, WOE86, BAN87, FIS87, HOR87, HUL87, KIM87, ROW87, CAR88,
COL89, SU89], which offer more powerful constructs for modeling the structural
and behavioral properties of objects found in advanced applications such as
CAD/CAM, CASE, and decision support systems.
An O-O semantic data model can be structurally and/or behaviorally
object-oriented [DIT86]. A structurally O-O data model is one that encompasses at least
the following characteristics:
(1) It supports the unique identification of objects, that is, each object has a
unique object identifier (surrogate) which is valid for the lifetime of the
object.
(2) It categorizes those objects which can be described by the same set of
characteristics (attributes) into an object class.
(3) It allows aggregation (association) hierarchies to be defined.
(4) It allows generalization (association) hierarchies to be defined.
The O-O view of an application world is represented in the form of a
network of classes and associations. An object class can be either a primitive class whose
instances are of simple data types (e.g., string, integer) or a nonprimitive class
(e.g., Part, Student, Teacher). At the extensional level, instances of different
classes can be related (associated) with each other forming patterns of object
associations. A behaviorally object-oriented data model, on the other hand, is one in
which operations that describe the behavior of the objects of a class can be defined
and registered with that class. Programs or methods that implement the
operations defined for an object are transparent to the user of the objects.
For these models to be truly useful, they must provide object manipulation
languages which can take advantage of the expressive power of the models
and provide the users with simple and powerful querying facilities. Recently,
several query languages such as DAPLEX [SHI81], GEM [ZAN83, TSU84], ARIEL
[MAC85], FAD [BAN87], POSTQUEL [ROW87], EXCESS [CAR88], and others
reported in [DAD86, MAN86, SER86, BAN87, FIS87, BAN88, COL89, SHA90]
have been proposed. These languages were developed based on different
paradigms. For example, DAPLEX and the query language of [MAN86] are based on
the functional paradigm. The query language of [BAN88] is based on the message
passing paradigm. Other query languages are based on the relational paradigm:
an extension of QUEL [ROW87, CAR88]; an extension of SQL [DAD86]; and an
extension of the relational algebra [COL89]. The query language of [FIS87] is
based on both functional and relational paradigms, allowing functions to be used
in object-oriented SQL (OSQL) constructs.
The above languages have an O-O flavor and have taken significant steps
towards the development of a powerful O-O query language. Query languages
such as DAPLEX [SHI81], GEM [ZAN83], ARIEL [MAC85], and the
object-oriented query language described in [BAN88] are based on the view of a
database defined in terms of objects, object classes, and their associations. A query in
these languages is formulated by specifying one class (usually a nonprimitive class,
whose instances are real-world objects) in the schema as a central class with some
path expressions. Each path expression starts from the central class and ends at
another class (usually a primitive class, whose instances are of basic data types
such as integer, string, set, etc.). A restriction condition can be specified on the
class referenced at the end of a path expression. This class can also be specified in
the list of attributes to be retrieved. The result of a query is a set of tuples, each
of which corresponds to a single instance of the central class and contains values
related to that instance which are collected from classes specified in the list.
A major drawback of these query languages is that they do not maintain the
closure property [ALA89b]. A query language is said to be closed if the result of a
query can be further queried by other queries specified in the same language. In
the above-mentioned languages, the input to a query has an O-O representation
(i.e., a network of objects, classes, and their associations) whereas its output is a
relation which does not have the same structural and behavioral properties as the
original objects. Consequently, the result of a query cannot be further processed
by the same set of operators. The design of these languages is very much
influenced by the relational model and relational languages which are concerned
mainly with retrieval and storage operations. In O-O processing, objects in
different classes that satisfy some search conditions are subject to different user-
defined operations. The idea of collecting data to form a resulting relation does
not satisfy this processing model.
The query languages proposed in [DAD86, MAN86, BAN87, ROW87, CAR88,
COL89] use nested relations as their logical views of O-O databases. Although
these languages are closed, i.e., operators in these languages operate on nested
relations to produce nested relations, the nested relation is not a proper logical
representation for an O-O database, which is basically a network structure of
object associations. Mapping from a network representation to nested relations is
an additional process. Furthermore, in order to use nested relations to represent
complex network structures, a considerable amount of data has to be introduced
to relate these nested relations. It is our view that the query language and its
underlying algebra should directly support the manipulation of network structures.
A query algebra [SHA90] was proposed recently based on the O-O model
ENCORE [ELM89]. Although ENCORE models applications as networks of
objects, object types, and their associations, the domain of the algebra is defined
as sets of objects of the Tuple type, which is essentially the nested relation
representation since it allows the nesting of tuples. Therefore, the mapping
problem addressed above still remains. In this algebra, two identical queries or two
identical operations in a single query do not give the same response, since each
produces a new object in the database. To eliminate duplicated copies of the
same newly created object, the algebra introduces operations like DupEliminate
and Coalesce, which would not have been necessary if the algebra were to directly
support the network-structured processing of O-O databases. We further observe
that the union operation in this algebra may produce a collection of objects having
the same data type but with different structures (e.g., the union of two collections
of objects of the Tuple type with different arities). Nevertheless, the other
operators introduced in the algebra are not defined to operate on collections of objects
with heterogeneous structures.
A common limitation of many existing query languages is that they cannot
easily express "non-association" relationships between objects, i.e., identify objects
in two classes that are not associated with each other while their classes are. For
example, in an O-O database, let us assume that Suppliers s1 and s2 supply Parts
p1 and p2, respectively. GEM, POSTQUEL, and several other query languages
provide the "dot" construct (Suppliers.Parts) and ARIEL provides the "of"
construct (Parts of Suppliers) to navigate from the class Suppliers to the class Parts
to produce the object pairs (s1,p1) and (s2,p2). However, they do not have a language
construct for specifying the semantics that s1 does not supply p2 and s2 does not
supply p1. Similarly, in functional languages, only the function Parts(Suppliers) is
provided to specify the associations of (s1,p1) and (s2,p2) but not the non-association
of suppliers and parts.
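The distinction between association and non-association in this example can be
sketched in Python. This is only an illustration: the names suppliers, parts,
and supplies are hypothetical, and reading non-association as "all candidate
pairs minus the linked pairs" is an assumption made here, not the formal
definition of the NonAssociate operator.

```python
from itertools import product

# Hypothetical extension of the example: two suppliers, two parts,
# and the "supplies" links between them.
suppliers = {"s1", "s2"}
parts = {"p1", "p2"}
supplies = {("s1", "p1"), ("s2", "p2")}


def associated(links):
    """Pairs reachable by navigation (e.g. Suppliers.Parts)."""
    return set(links)


def non_associated(left, right, links):
    """Pairs whose classes are associated but which are not linked
    themselves: every candidate pair minus the linked pairs
    (an assumed reading, for illustration only)."""
    return set(product(left, right)) - set(links)


assoc = associated(supplies)
non_assoc = non_associated(suppliers, parts, supplies)
```

Navigation constructs such as the dot or "of" notation yield only the first set;
the second set has to be derived from both classes and the links together.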
In view of the disadvantages of the existing O-O query languages, we would
like to stress the importance of using a graph as the logical representation of an
O-O database at both intensional and extensional levels as exemplified by O2
[LEC88], FAD [BAN87], and OSAM* [SU89]. The query language and its
underlying algebra should provide constructs to directly process graphs with different
degrees of complexity. They should also support the specification of
non-associations and the processing of heterogeneous structures. Furthermore, the
closure property should be maintained.
In this dissertation, we propose an association algebra (A-algebra) based on
the graph representation of O-O databases and the association-based query
formulation (refer to Chapter 3). Analogous to the development of the relational
algebra for relational databases, the development of the A-algebra provides the formal
foundation for query processing and optimization in O-O databases and for
designing O-O query languages. Unlike the record(tuple)-based relational algebra
[COD70 and COD72] and the query algebra [SHA90], the A-algebra is
association-based, i.e., the domain of the algebra is sets of association patterns
(e.g., linear structures, trees, lattices, networks, etc.) and processing an O-O
database is based on the matching and manipulation of homogeneous as well as
heterogeneous patterns of object associations. Operators of the A-algebra can be used
to navigate a network of interconnected object classes along the path of interest to
construct a complex pattern as the search condition. They can also be used to
decompose a complicated pattern into simple ones. Ten operators have been
defined for the algebra: three unary operators [A-Select (σ), A-Project (Π), and
A-Integrate (I)], and seven binary operators [Associate (*), A-Complement (|),
A-Union (+), A-Difference (-), A-Divide (÷), NonAssociate (!), and A-Intersect (•)],
where the prefix A stands for "Association". Although many of these operators
correspond to the relational algebra operators, they are different from them in
that they can operate on complicated heterogeneous structures. In this respect,
the A-algebra is more general than the relational algebra.
The rest of this dissertation is organized as follows. A detailed survey of the
relational model and the relational algebra, the existing O-O query languages, and
a recently proposed query algebra is provided in Chapter 2. The graphical
representation of O-O databases and the association-based query formulation are
described in Chapter 3 with the help of examples. Chapter 4 formally defines the
concepts of Schema Graph (SG), Object Graph (OG), and association patterns.
The formal definitions of the association operators and their simple mathematical
properties are also presented. The A-algebra expressions for some example queries
are given to demonstrate the utility of the algebra. Chapter 5 presents the
mathematical properties of the association operators and their utility in query
optimization and query decomposition. The proofs of the mathematical properties
of the operators can be found in the Appendix. The completeness of the
A-algebra is shown in Chapter 6 and the conclusion is given in Chapter 7.

[Figure 1.1 Data independencies in relational databases. Labels: logical data
independence; physical data independence.]

[Figure 1.2 Architecture of O-O databases. Labels: logical data independence;
encapsulation; physical data independence.]

CHAPTER 2
A SURVEY OF RELATED RESEARCH
This chapter surveys some of the existing work related to the development of
the A-algebra. Section 2.1 describes the relational model and the relational
algebra, while Section 2.2 surveys some existing query languages designed for O-O
semantic data models. The query algebra that recently appeared in the literature is
surveyed in Section 2.3.
2.1 Relational Model and Relational Algebra
When the hierarchical and network data models were used extensively in
information systems in the late 1960s, Codd [COD70] raised an interesting and
important question: Can application programs and terminal activities remain
invariant as the internal data representations (physical representations) change?
He asserted that the future users of large data banks must be protected from
having to know how the data were organized in the machine. Following this
rationale, he conceived the notion of data independence which suggests that the
logical organization of data should be independent of its physical representation.
Determined to demonstrate the validity of his data independence concept, he
proposed a relational data model based on n-ary relations.
The scheme of a relation, R, of an entity set $\{E_1, E_2, \ldots, E_n\}$ is defined on a
set of m attributes $\{A_1, A_2, \ldots, A_m\}$ which correspond to m domains
$\{D_1, D_2, \ldots, D_m\}$ (not necessarily distinct). Each entity (the instance of the scheme)
is represented by an m-ary tuple which has its first attribute value from $D_1$, its
second attribute value from $D_2$, and so forth. A set of attributes of a relation is called a
key if the entities of the relation can be uniquely identified by the values of these
attributes.
In particular, the information of the suppliers such as their names, addresses,
items they supply, and the prices of the items can be represented by the relation
SUPPLIERS of the following scheme
SUPPLIERS(SNAME, SADDRESS, ITEM, PRICE)
where the attributes SNAME and ITEM form a composite key. Data represented
in this form, which intuitively is a flat table, is the logical view of an application
world. It has nothing to do with the physical representation of the data.
When designing a database using the relational model, one is often faced with
a choice among alternative sets of relation schemes. Some choices are more
favorable than others for various reasons. For example, the relation SUPPLIERS is not
a desirable scheme because it has the following potential problems: (1)
Redundancy -- the address of the supplier is repeated once for each item supplied. (2)
Potential inconsistency (update anomalies) -- as a consequence of the redundancy,
the update of the address of a supplier in one tuple will leave it inconsistent with
the address in another tuple. (3) Insertion anomalies -- the address of a supplier
cannot be recorded if that supplier does not currently supply at least one item
since SNAME and ITEM form a composite key of the relation SUPPLIERS. (4)
Deletion anomalies -- the inverse of problem (3) is that should all the items
supplied by one supplier be deleted, we unintentionally lose the address of that
supplier.
The causes of these problems and their solutions are related to the
functional dependencies among the attributes of a relation [COD70, ULL82]. Suppose
X and Y are two sets of attributes of a relation. Y functionally depends on X (or
X functionally determines Y), denoted by $X \rightarrow Y$, if any two tuples of the relation
having the same values in the attributes of X agree on the values of the attributes in Y.
The above four problems emerge if $X \rightarrow Y$ and $X' \rightarrow Z$ hold simultaneously, where
$X'$ stands for a proper subset of X and Z a set of attributes of the relation.
The solution to these problems is to decompose a relation based on the
functional dependencies among attributes. For example, the functional dependencies
among attributes of the relation SUPPLIERS are $(\text{SNAME}, \text{ITEM}) \rightarrow \text{PRICE}$ and
$\text{SNAME} \rightarrow \text{SADDRESS}$, which give rise to the redundancy, update, insertion, and
deletion anomalies. It should be clear to the reader that these problems will be
eliminated if the relation SUPPLIERS is decomposed into two relations
SA(SNAME, SADDRESS) and
SIP(SNAME, ITEM, PRICE).
There is, however, a disadvantage to the above decomposition: to find the address
of a supplier who supplies item "piston", a join operation has to be applied since
SADDRESS and ITEM are logically distributed in two relations.
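The decomposition and the join it makes necessary can be sketched in Python.
This is a minimal illustration under stated assumptions: the sample tuples
(suppliers "Acme" and "Bolt", their addresses and prices) are invented, and the
helper names holds, project, and natural_join are hypothetical.

```python
def holds(rows, lhs, rhs):
    """Return True if the functional dependency lhs -> rhs holds in rows."""
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        val = tuple(row[a] for a in rhs)
        # The first value seen for a key must match all later ones.
        if seen.setdefault(key, val) != val:
            return False
    return True


def project(rows, attrs):
    """Projection onto attrs with duplicate elimination."""
    unique = {tuple(row[a] for a in attrs) for row in rows}
    return [dict(zip(attrs, t)) for t in sorted(unique)]


def natural_join(left, right):
    """Natural join on the attributes the two relations share."""
    out = []
    for l in left:
        for r in right:
            common = set(l) & set(r)
            if all(l[a] == r[a] for a in common):
                out.append({**l, **r})
    return out


# Invented sample tuples for SUPPLIERS(SNAME, SADDRESS, ITEM, PRICE).
SUPPLIERS = [
    {"SNAME": "Acme", "SADDRESS": "12 Main St", "ITEM": "piston", "PRICE": 30},
    {"SNAME": "Acme", "SADDRESS": "12 Main St", "ITEM": "valve", "PRICE": 12},
    {"SNAME": "Bolt", "SADDRESS": "7 Oak Ave", "ITEM": "piston", "PRICE": 28},
]

fd_price = holds(SUPPLIERS, ["SNAME", "ITEM"], ["PRICE"])
fd_addr = holds(SUPPLIERS, ["SNAME"], ["SADDRESS"])

# Decompose along the two dependencies.
SA = project(SUPPLIERS, ["SNAME", "SADDRESS"])
SIP = project(SUPPLIERS, ["SNAME", "ITEM", "PRICE"])

# Finding addresses of "piston" suppliers now needs a join of SA and SIP.
piston_addresses = sorted(
    row["SADDRESS"] for row in natural_join(SA, SIP) if row["ITEM"] == "piston"
)
```

In SA each address is stored once per supplier, which removes the redundancy,
while the join of SA and SIP reproduces the original SUPPLIERS tuples.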

The decomposition of a relation based on the functional dependencies among
its attributes is the issue of normalization in the relational model. Four types
of normal forms, denoted by 1NF, 2NF, 3NF, and Boyce-Codd NF, respectively,
have been recognized in considering functional dependencies [COD70, ARM74,
and BEE77]. Boyce-Codd NF is the strongest of these normal forms.
Relations in these normal forms may have to be further decomposed into 4NF or 5NF
to eliminate multivalued dependencies [FAG77, DEL78, and ZAN76] and join
dependencies [AHO79]. This decomposition is needed to eliminate further
redundancy and anomalies.
The success and popularity of the relational model and relational
database management systems (DBMSs) are due to the model's simplicity in structural
(tabular) representation and its sound theoretical basis -- the relational algebra and the
relational calculus [COD72a]. The relational algebra defines five primitive operators,
of which two are unary operators [Projection (π) and Selection (σ)] and three are
binary operators [Cross-product (×), Union (+), and Difference (-)]. Other
operators such as Join, Natural-join, Set-intersection, and Set-division are also defined
in the algebra. Although these latter operators are easy to use, they are not
primitive since they can be expressed in terms of the primitive operators.
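The five primitive operators, and one derived operator expressed through them,
can be sketched in Python. This is a minimal illustration, not an implementation
from any system discussed here: the Relation type, the function names, and the
sample relations R and S are assumptions, and Set-intersection is derived using
the identity R ∩ S = R - (R - S).

```python
from typing import FrozenSet, NamedTuple, Tuple


class Relation(NamedTuple):
    attrs: Tuple[str, ...]   # attribute names, in column order
    rows: FrozenSet[tuple]   # a set of tuples, so duplicates vanish


def relation(attrs, rows):
    return Relation(tuple(attrs), frozenset(tuple(r) for r in rows))


def select(r, pred):
    """Selection: keep the tuples satisfying pred (given as a dict)."""
    keep = (t for t in r.rows if pred(dict(zip(r.attrs, t))))
    return Relation(r.attrs, frozenset(keep))


def project(r, attrs):
    """Projection: keep only the named columns."""
    idx = [r.attrs.index(a) for a in attrs]
    return Relation(tuple(attrs), frozenset(tuple(t[i] for i in idx) for t in r.rows))


def cross(r, s):
    """Cross-product: concatenate every pair of tuples."""
    return Relation(r.attrs + s.attrs, frozenset(a + b for a in r.rows for b in s.rows))


def union(r, s):
    assert r.attrs == s.attrs, "operands must be union-compatible"
    return Relation(r.attrs, r.rows | s.rows)


def difference(r, s):
    assert r.attrs == s.attrs, "operands must be union-compatible"
    return Relation(r.attrs, r.rows - s.rows)


def intersection(r, s):
    """Derived, not primitive: R ∩ S = R - (R - S)."""
    return difference(r, difference(r, s))


# Invented sample relations over the same scheme.
R = relation(("SNAME", "ITEM"), [("Acme", "piston"), ("Acme", "valve")])
S = relation(("SNAME", "ITEM"), [("Acme", "valve"), ("Bolt", "piston")])

both = intersection(R, S)
names = project(select(union(R, S), lambda t: t["ITEM"] == "piston"), ["SNAME"])
```

Every function takes relations and returns a relation, so results can be fed
into further operators, as in the nested expression that computes names.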
The relational algebra has the closure property, since every operator
operates on one or more relations and produces a new relation. Operators of the
relational algebra basically operate on the values of tuples in relations.
Structurally speaking, they are defined to operate on tuples whose structures are
union-compatible (homogeneous). The relational algebra is complete in the sense that it
has expressive power equivalent to that of the relational calculus [COD72a and
ULL82]. Because of this, it serves as the theoretical basis for the relational model.
The relational algebra has been used for the following three purposes, although it
has not been implemented in any existing DBMS exactly as defined
[ULL82]:
(1) It creates a new class of query languages called algebraic languages. Based on
the relational algebra, languages that directly adopt the relational operators
can be developed, such as ISBL [TOD76] which is a close approximation to the
relational algebra. Although languages of this type are mostly procedural, it is
relatively easy to demonstrate their completeness along with the mathematical
properties of the relational algebra which can be readily applied to query
optimization and query decomposition.
(2) It not only serves as a benchmark for evaluating query languages in existing
systems, but also as the criterion for designing new languages for relational
DBMSs. A relational language will not have the necessary expressive power if
it is not relationally complete [ULL82].
(3) It provides a mathematical basis for transforming expressions in query
decomposition and (logical or conceptual) query optimization. Because it is an algebra,
the mathematical properties of the relational algebra can be explored precisely
and systematically. For query languages construed as algebraic languages,
these mathematical properties have a straightforward application [HAL76].
Query languages like SQUARE or SEQUEL having certain algebraic features
may also use these properties, since the parse of a query yields a tree in which
some nodes represent relational algebra operators [AST76]. Even if a query
language such as QUEL is a relational calculus language, its calculus-like
expressions are translated into relational algebra expressions in the QUEL
optimizer [WON76].
The total content proposed by Codd before 1979 on the relational model is
referred to as Version 1 of the relational model (RM/V1), whose modeling capabilities
were extended by Codd in 1979 [COD79] to version RM/T (T for Tasmania).
Based on these two versions, Codd [COD90] introduced Version 2 of the relational
model (RM/V2). The most important additional features in RM/V2 are as
follows:
(1) A new treatment of items of data missing because they represent properties
that happen to be inapplicable to certain object instances.
(2) New features supporting all kinds of integrity constraints, especially the user-
defined integrity constraints.
(3) A more detailed account of view updatability.
(4) New features pertaining to the management of distributed databases.
It is important to recognize the fact that hierarchical and network models as
well as the relational model evolved during a time in which the primary
applications of information systems were business-oriented. In attempts to apply these
techniques to more complicated application areas such as CAD/CAM, CASE,
and decision support, it has been found that the relational model is no longer adequate
for modeling these advanced applications. The inadequacies of the relational
model are summarized as follows. First, the relational model has limited modeling
capabilities. When data are logically represented in the form of relations, the
relationships among entities in these relations are represented by matching values of
the attributes or keys in one relation with values of the attributes or foreign keys
in other relations. The actual semantics among the data such as generalization
and aggregation (the abstract data type) cannot be modeled by the relational
model. Second, the relational model only models the structural aspects of entities,
and thus ignores their behavioral aspects (e.g., system-defined and user-defined
operations). Third, in these advanced applications, the concept of data
independence should be further extended to the concept of object encapsulation, i.e., not
only should the logical representation of an object be separated from its physical
representation, but its structural and behavioral properties should be logically
encapsulated in its class. The object encapsulation concept cannot be realized in
the relational model, since the data describing an entity may be logically scattered
among several relations due to normalization [COD70, COD72b, BEE77, and
ULL82]. Fourth, entities with complex structures and complicated relationships
among entities are not representable by flat tables (relations). Finally, the relational
model cannot represent and operate on entities with different (heterogeneous) structures.
2.2 Existing O-O Query Languages
An extensive literature search on query languages for accessing O-O
databases such as GEM [ZAN83, TSU84], ARIEL [MAC85], DAPLEX [SHI81], FAD
[BAN87], POSTQUEL [ROW87], EXCESS [CAR88], as well as other proposed
languages [STO84, DAD86, MAN86, SER86, BAN87, FIS87, BAN88, COL89,
SHA90] has been carried out. This section surveys a representative sample of
these languages. Most existing query languages have capabilities beyond those
provided by their theoretical bases. For example, the arithmetic operations and
aggregation functions provided by the relational languages are not available in the
relational algebra. Therefore, this survey is limited to those features which are
relevant to the proposed algebra.
To demonstrate the similarities and differences of these languages, the same
database schema as shown in Figure 2.1 is used for example queries written in
GEM, ARIEL, and DAPLEX. The sample schema of Figure 2.1 is for a
government-owned laboratory system where rectangles represent classes and edges (links)
represent attributes.
QUEL [STO76, WON76, and ZOO77] is a tuple-calculus oriented query
language for the relational DBMS INGRES [STO76]. In order to avoid the ambiguity
which arises when two attributes of different relations having the same name are
addressed in a single query, QUEL uses a "dot" mechanism to qualify an attribute
of a relation (i.e., a dot is inserted between the name of the relation and the name
of the attribute). For example, Equipment.Name refers to the attribute Name of
the relation Equipment. Influenced by this mechanism, the existing O-O query
languages use similar notations for navigating the database schema from one class
to another or from one relation to other relations in systems which use relational
databases as their back-ends.
The language GEM [ZAN83, TSU84] is an extension of QUEL for the data
model DSIS which supports aggregation, generalization, and unique identification
of objects. In GEM, a class in an aggregation hierarchy that has a link emanating
to another class has the name of the latter class as the data type of one of its
attributes. For example, the class Lab has an attribute, Facility, of the type
Equipment, and has another attribute, Locality, of the type Location, and so forth. The
dot notation is used in GEM for navigating along the reference attributes (links) in
query formulation. The following GEM query retrieves the name of the manager,
the serial number of the equipment, and the address for each laboratory whose
headquarters is located in New York.
    Range of Lab is Lab
    Retrieve Lab.Manager.Name
             Lab.Equipment.Serial#
             Lab.Location.Address
    Where Lab.Manager.Department.Headquarters.City = "New York"
This query returns a set of tuples in a tabular form. Each tuple contains
values for the manager’s name, the equipment serial number, and the address of
the laboratory of interest.
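The evaluation of such a path-expression query can be sketched in Python. This
is an illustration only, not GEM's implementation: the nested dictionaries stand
in for objects and their reference attributes, the sample values are invented,
each attribute is assumed to be single-valued, and follow and retrieve are
hypothetical helper names.

```python
def follow(obj, path):
    """Follow a dotted path such as 'Manager.Name' from one object."""
    for attr in path.split("."):
        obj = obj[attr]
    return obj


def retrieve(instances, targets, where_path, where_value):
    """One flat tuple per qualifying instance of the central class."""
    return [
        tuple(follow(obj, path) for path in targets)
        for obj in instances
        if follow(obj, where_path) == where_value
    ]


# Invented instances of the central class Lab.
labs = [
    {
        "Manager": {"Name": "Lee",
                    "Department": {"Headquarters": {"City": "New York"}}},
        "Equipment": {"Serial#": "E-100"},
        "Location": {"Address": "1 Lab Rd"},
    },
    {
        "Manager": {"Name": "Kim",
                    "Department": {"Headquarters": {"City": "Los Angeles"}}},
        "Equipment": {"Serial#": "E-200"},
        "Location": {"Address": "9 Hill St"},
    },
]

result = retrieve(
    labs,
    ["Manager.Name", "Equipment.Serial#", "Location.Address"],
    "Manager.Department.Headquarters.City",
    "New York",
)
```

The input is a network of linked objects, whereas result is a list of flat value
tuples that no longer carries the objects or their links.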
In the approach described in Stonebraker et al. [STO84], the dot notation is
used in a manner similar to that found in GEM to implement the abstract data
type (ADT) concept. In addition, QUEL is used as a data type to facilitate the
navigation from one relation to another. A relation may have a field of type
QUEL which may contain expressions or commands (queries). Whenever the field
is addressed in a query, these expressions, in whole or in part, will be activated.
In general, if X is the tuple variable of the relation R1, Y is a field of type QUEL
in relation R1, and the query stored in Y retrieves field Z of another relation, R2,
then the expression X.Y.Z is a field in a collection of this view. In other words,
the expression will return the values of the Z field of tuples (in R2) that are
related to X through Y. For example, let the relation Manager have a field called
OfficeInfo of type QUEL which contains a query that retrieves the telephone
number from the relation Location. The expression Manager.OfficeInfo.Tel# returns
the telephone number for each manager in a tabular format. Clearly, the
implementation of QUEL as a data type provides a way to relate data in two relations
without modifying the database schema.
Instead of using the dot notation, ARIEL [MAC85] takes advantage of the
"OF" notation. The example query described for GEM can be restated as
    Range of Lab is Lab
    Retrieve Name OF Manager OF Lab
             Serial# OF Equipment OF Lab
             Address OF Location OF Lab
    Where City OF Headquarters OF Department OF Manager
          OF Lab = "New York"
using the "OF" notation which is linguistically more natural than using the dot
notation. However, the result of this query is also represented by a flat table
(relation).
DAPLEX [SHI81] is a functional data language. The data retrieval
component of DAPLEX is similar to the languages described above, although it is
interpreted differently. In the functional paradigm, a link (i.e., attribute)
emanating from one class to another class is considered as a function. The function
has, by default, the name of the class to which the link points. For example,
Location(Lab) and Department(Headquarters) represent the facts that Lab has
Location and Headquarters has Department as attributes, respectively. When the
function Location(Lab) is applied to an object of the class Lab, it returns a value
which is an object in the domain class over which the attribute is defined. If the
navigation is from one class to another through a sequence of classes, a nested
function is used. For instance, the expression Name(Manager(Lab)) specifies the
name of the manager who is responsible for a laboratory. For a
particular object of Lab, the manager of the laboratory is produced first; then, the
function Name() is applied to the returned manager and returns the name of the
manager. The example query can be expressed in DAPLEX as follows.
    FOR EACH Lab
    SUCH THAT City(Headquarters(Department(Manager(Lab))))
              = "New York"
    PRINT Name(Manager(Lab)),
          Serial#(Equipment(Lab)),
          Address(Location(Lab))
Even though DAPLEX is based on the functional paradigm, it returns data in the
form of a relation just like in GEM and in ARIEL.
Banerjee et al. [BAN88] introduce a query language based on message
passing. In the message passing paradigm, the name of a link emanating from a class
is interpreted as the name of a message which is stored within that class. One can
assume there is actually a message created by the system and having, by default,
the same name as its corresponding attribute. When such a message is sent to an
instance of the class, it returns the value of the attribute. For example, the
following is an expression for selecting a laboratory that has a manager who belongs
to a subordinate department of its New York headquarters.
    (Lab SELECT :S (:S Manager Department
                    Headquarters City = "New York"))
SELECT in this expression is a message sent to the class Lab. The first
argument of SELECT is :S, an iteration variable. The SELECT message iterates
over the instances of the class Lab with :S bound to one instance at a time. The
block of code within the parentheses is the second argument of SELECT, and is
executed for each value bound to :S. In this particular block, the message
Manager is sent to the instance bound to :S in order to return the related Manager
instance. Similarly, Department, Headquarters, and City are messages. To elaborate,
Department is sent to the returned Manager instance, Headquarters is sent to the
returned Department instance, and City is sent to the returned Headquarters
instance. The sign "=" is also a message which has the argument "New
York". When this message is sent to the resulting City value, it returns
a logical object TRUE or FALSE. An instance of Lab is qualified for the above
expression, if and only if the returned logical object is TRUE. The logical AND
or OR message can be sent to this object with an argument that specifies some
other condition on the instance of Lab. In principle, though not described in
Banerjee et al. [BAN88], similar message-based expressions can be used to retrieve
attribute values of the resulting Lab instance. The result of a query which
involves such conditions is the set of the instances of Lab along with their attribute
values and is represented in a tabular form.
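The evaluation of the SELECT expression can be sketched in Python as follows. The sketch is an assumption-laden illustration rather than the language of [BAN88]: the helper names send, select, and block are hypothetical, objects are plain dictionaries, and the sample instances are invented.

```python
# Hypothetical instances: each object is a dict of attribute values.
hq_ny = {"City": "New York"}
hq_bo = {"City": "Boston"}
dep_a = {"Headquarters": hq_ny}
dep_b = {"Headquarters": hq_bo}
mgr_1 = {"Department": dep_a}
mgr_2 = {"Department": dep_b}
labs = [{"id": "lab1", "Manager": mgr_1}, {"id": "lab2", "Manager": mgr_2}]


def send(receiver, message, *args):
    """Deliver a message: '=' compares, any other name reads the attribute."""
    if message == "=":
        return receiver == args[0]
    return receiver[message]


def select(instances, block):
    """SELECT: bind each instance to the iteration variable in turn and
    keep it when the block returns True."""
    return [s for s in instances if block(s)]


def block(s):
    # (:S Manager Department Headquarters City = "New York")
    value = s
    for message in ("Manager", "Department", "Headquarters", "City"):
        value = send(value, message)
    return send(value, "=", "New York")


selected = [lab["id"] for lab in select(labs, block)]
```

Each message is sent to whatever the previous message returned, which mirrors the left-to-right chain in the expression.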
As shown in the samples of these query languages, their query formulations,
though interpreted differently, are very similar to each other. This is evident in
the fact that queries are formulated by navigating the graphically
represented database schema from class to class through their respective
links. In each of these languages, however, a query operates on a database that is
structurally represented using an O-O data model and returns a result whose
structure is represented in a tabular form. Consequently, the result of a query
cannot be further queried by other queries written in the same language.
Therefore, these languages are not closed.
Another drawback of these languages is seen in their navigation mechanisms,
which can only formulate queries against classes (or relations) that are
interrelated in simpler patterns like the linear and forest structures shown in Figure 2.2a.
However, in O-O databases, the graphical patterns in which objects are
interrelated with each other are basically networks which are not restricted to plane
graphs (a graph is a plane graph if it can be drawn on a plane without any
intersection of two edges). They can be as complicated as surface graphs (a graph is a
surface graph if it can be drawn on a surface without any intersection of two
edges). Phrasing queries against classes that are interrelated in the more complicated
patterns depicted in Figure 2.2b is beyond the capabilities of these languages.
A third drawback of these languages which renders their navigation mechanisms
insufficient is that only one type of relationship (an object is related to
another object) between objects of two classes can be expressed. In fact, when
two classes are directly linked at the schema level, objects in these two classes
may have another type of relationship: an object is not related to another object.
This type of relationship represents the complement aspect of the semantics
specified for the two associated classes, such as not-a-part-of,
not-a-function-of, or is-not-a, which is often needed in querying the databases.
For example, "For each laboratory, list the equipment that is not available" is a
reasonable query.
The proposed query languages [DAD86, MAN86, BAN87, ROW87, CAR88,
COL89] use nested relations as their logical views of databases. A nested relation
is a generalized relation, i.e., a recursively defined relation: the attributes of a
relation can be either atomic values or another relation in which the attributes can be
a third relation, and so forth. Figure 2.3 shows an example of a nested relation.
Nested relations are particularly suitable for representing data in forest structures.
The above languages are considered to be closed, since operators in these
languages operate on nested relations and produce nested relations. However,
they also have the drawbacks mentioned above, and it is our view that a nested
relation is not a proper logical representation for an O-O database, which consists
of networks of objects, object classes, and their associations. Using nested relations to
represent data in network structures introduces one level of indirection. Mapping
from a network representation to nested relations is an extra process.
Furthermore, in order to use a nested relation to represent complex structures, a large
amount of data has to be replicated in the representation. Figure 2.4 shows an
example of using a nested relation to represent a graph having loops. Note that
vertex F has to be replicated three times.
2.3 ENCORE O-O Data Model and Its Underlying Query Algebra
In spite of the popularity of the O-O paradigm and its application in the field
of database management, the existing O-O database management systems still
lack a solid mathematical foundation for the manipulation of an O-O database
and the optimization of queries. Recently, a query algebra [SHA90] was proposed
for the ENCORE O-O data model [ELM89]. This section surveys the query
algebra as well as the ENCORE model. It also serves as a comparison to the
association algebra proposed in this dissertation.
2.3.1 The ENCORE Model
The ENCORE O-O data model [ELM89] supports abstract data types, type
inheritance, typed collections of typed objects, objects with identity, and object
encapsulation. It models an application as networks of objects, object types, and their
associations. The definition of an abstract data type in this model includes the
Name of the type, a set of Properties defined for instances of the type, and a set of
Operations which can be applied to instances of the type. Properties reflect the
state of an object while operations may perform arbitrary actions. Properties are
typed objects that may be implemented as stored values, procedures, or functions.
The implementation of a property is invisible to the user and is assumed to return
an object of the correct type and to have no side-effects.
In addition to user-defined abstract data types and a collection of atomic
types such as Int, String, Boolean, etc. (i.e., primitive-classes), ENCORE provides
two parameterized types and a global Object type which is the supertype of all
other types. The parameterized type Set[T] defines T as the type, or supertype, of
objects in a collection having type Set, and T is called the member type of the set.
The parameterized tuple type associates types ($T_i$) with attribute names ($A_i$) and
defines properties Get_attribute_value and operations Set_attribute_value for each
attribute. The $T_i$'s can be any database types, thus allowing nesting of tuple types.
The value of a tuple is represented as $\langle A_1: o_1, A_2: o_2, \ldots, A_n: o_n \rangle$, where the
$A$'s are attributes of the tuple and the $o$'s are objects of the corresponding types.
The global supertype Object defines a family of operations for equality called
i-equality, where $i$ indicates how "deeply" a comparison of two objects must search
before finding equality. Two objects are identical when they are the same object,
i.e., they have the same identity. Identical objects are 0-equal ($=_0$ or just $=$) and,
for $i > 0$, two objects are i-equal ($=_i$) if
(1) they are both collections of the same cardinality and there is a one-to-one
correspondence between the collections such that corresponding members are
$=_{i-1}$, or
(2) they both have the same type (not a collection type) and the values of
corresponding properties are $=_{i-1}$.
Type Object also defines a stronger notion of equality called id-equality.
Two objects are id-equal at depth $i$ if they are i-equal and graphical
representations of the objects are isomorphic.
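The recursive definition of i-equality can be made concrete with a small Python sketch. It rests on several assumptions that are not part of ENCORE: the class Obj and the function i_equal are hypothetical names, collections are modeled as Python lists, atomic values are treated as self-identifying at every depth, and the one-to-one correspondence is found by brute force over permutations, which is only practical for tiny collections.

```python
from itertools import permutations

ATOMS = (int, float, str, bool)


class Obj:
    """A non-collection object with a type name and named properties."""

    def __init__(self, type_name, **props):
        self.type_name = type_name
        self.props = props


def i_equal(a, b, i):
    # Assumption: atomic values are self-identifying at every depth.
    if isinstance(a, ATOMS) or isinstance(b, ATOMS):
        return type(a) is type(b) and a == b
    if i == 0:
        return a is b  # identity, i.e. 0-equality
    if isinstance(a, list) and isinstance(b, list):
        # Case (1): same cardinality and some one-to-one correspondence
        # whose paired members are (i-1)-equal.
        if len(a) != len(b):
            return False
        return any(all(i_equal(x, y, i - 1) for x, y in zip(a, perm))
                   for perm in permutations(b))
    if isinstance(a, Obj) and isinstance(b, Obj):
        # Case (2): same type and corresponding properties (i-1)-equal.
        return (a.type_name == b.type_name
                and a.props.keys() == b.props.keys()
                and all(i_equal(a.props[k], b.props[k], i - 1)
                        for k in a.props))
    return False


addr1 = Obj("Addr", city="Boston")
addr2 = Obj("Addr", city="Boston")
part1 = Obj("Part", address=addr1)
part2 = Obj("Part", address=addr2)
```

Here addr1 and addr2 are distinct objects with equal property values, so they are 1-equal but not 0-equal; part1 and part2 refer to those distinct addresses, so they become equal only at depth 2.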

2.3.2 The Underlying Query Algebra of ENCORE
The query algebra [SHA90] is proposed based on the O-O model ENCORE.
The domain of the query algebra is defined as a typed collection of typed objects.
A typed collection is of parameterized type Set[T] and the objects in the collection
are of type T. If objects of a collection are collected from different types, T is
their most specific common type in the type lattice. For example, if object $s$ is of
type $S$, object $p$ is of type $P$, and $S$ is a supertype of $P$, the collection of objects $s$
and $p$ is of type Set[S]. The query algebra is closed since the operators of the
query algebra operate on collection(s) of objects with type Set[$T_i$] and produce a
collection with type Set[$T_k$], where type $T_k$ is defined by the query.
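The choice of the member type T for a mixed collection can be sketched as a search over the type lattice. The lattice below is hypothetical (only S, P, and Object correspond to the example in the text; Q is added for contrast), and the sketch assumes that the most specific common supertype is unique, which a lattice with multiple inheritance does not guarantee for this simple ranking.

```python
# Hypothetical type lattice: each type maps to its direct supertypes.
SUPERTYPES = {"Object": [], "S": ["Object"], "P": ["S"], "Q": ["Object"]}


def ancestors(t):
    """All supertypes of t, including t itself."""
    seen = {t}
    stack = [t]
    while stack:
        for parent in SUPERTYPES[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def most_specific_common_type(types):
    """Pick the common supertype that itself has the most supertypes."""
    common = set.intersection(*(ancestors(t) for t in types))
    return max(common, key=lambda t: len(ancestors(t)))


member_type = most_specific_common_type(["S", "P"])
```

For the example in the text, a collection holding an S object and a P object gets member type S, so the collection has type Set[S].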
Similar to the languages surveyed in Section 2.2, the query algebra addresses
a property of an object using 'dot' notation (e.g., $a.p.q$, where $a$ is an object of type
$T_1$, $p$ is a property of $a$ and is of type $T_2$, and $q$ is a property of $p$ and is of type
$T_3$).
Twelve operators are defined in this algebra. We give their brief definitions
followed by some example queries to illustrate the major concepts of this algebra.
(1) The Select operation creates a collection of objects which satisfy a selection
predicate.
$Select(S, p) = \{\, s \mid (s \text{ in } S) \wedge p(s) \,\}$
where $p$ is the predicate.
(2) The Image operation is used to return a single object for each object in the
queried collection and has the form:
$Image(S, f : T) = \{\, f(s) \mid s \text{ in } S \,\}$
where $S$ is a collection of objects and $f$ returns an object of type $T$.
(3) The Project operation extends Image by allowing the application of many
functions to an object, thus supporting the creation and maintenance of
selected relationships between objects. The relationships are stored as tuples
with Tuple type.
$Project(S, \langle (A_1, f_1), \ldots, (A_n, f_n) \rangle) =
\{\, \langle A_1: f_1(s), \ldots, A_n: f_n(s) \rangle \mid s \text{ in } S \,\}$
where $S$ is of type Set[$T$], the $A_i$'s are unique attribute names, and each $f_i$
takes a single input of type $T$ and returns an object of type $T_i$. Project
returns one tuple for each object in the collection being queried. Each newly
created tuple is a new object with a unique object identifier.
(4) The Ojoin operator is an explicit join operator used to create relationships
which are not defined between objects of two collections in the database. It is
essentially a Cartesian product of collections of objects, followed by a
selection of result tuples. For collections $S$ and $R$, the Ojoin is defined as follows:
$Ojoin(S, R, A_1, A_2, p) =
\{\, \langle A_1: s, A_2: r \rangle \mid s \text{ in } S \wedge r \text{ in } R \wedge p(s, r) \,\}$
where $p$ is a predicate (as in Select) defined over objects from $S$ and $R$. The
Ojoin operation creates new tuples in the database to store the generated
relationships. The tuples created will have unique object identifiers.
(5) Union, Difference, and Intersection are the usual set operations with object
comparisons and set membership based on object identity ($=_0$). The result of
these operations is considered to be a collection of objects of type T, where T
is the most specific common supertype (in the type lattice) of the types of the
objects in the operands.
(6) The Flatten operation is used to restructure sets of sets, and Nest and UnNest
allow the representation of tuples as flat or nested relations.
(7) For the above operators, two identical operations cannot give identical
responses, since each result collection is a newly identified object in the
database and the objects in a result collection may be either existing database
objects or new tuple objects created during the operation. Operators
DupEliminate and Coalesce are introduced to handle situations where equal objects
are created by a query.
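Item (7) can be illustrated with a short Python sketch. It is not the ENCORE implementation: tuples are modeled as Python dictionaries, object identity is modeled by Python's `is`, and the hypothetical dup_eliminate below uses Python's value equality (`==`) as a stand-in for the depth-based i-equality of the algebra.

```python
def project(S, fields):
    """Project: build one new tuple object per member of S."""
    return [{name: f(s) for name, f in fields} for s in S]


def dup_eliminate(S):
    """Keep one representative of each group of value-equal objects."""
    result = []
    for obj in S:
        if not any(obj == kept for kept in result):
            result.append(obj)
    return result


jobs = ["j1", "j2"]                      # invented object names
fields = [("J", lambda j: j)]

first = project(jobs, fields)
second = project(jobs, fields)           # the identical operation, repeated

same_values = first == second                              # equal contents
same_objects = any(a is b for a in first for b in second)  # shared identity
merged = dup_eliminate(first + second)
```

The two results hold equal values but share no tuple object, so a union based on identity would keep four tuples; eliminating duplicates by value brings the count back to two.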
The example queries are issued against the Supplier-Parts-Job database
shown in Figure 2.5. For the purpose of these examples, it is assumed that Type
Object is the only supertype for each of the given types.
Example 1: Find all red parts. Which suppliers can supply all of the red parts?
P_red := Select(Parts, λp p.color = "Red")
S_Pred := Select(Suppliers, λs P_red subset_of s.Inventory)
The first selection finds the red parts and the second selection finds all
suppliers for which the inventory includes that set of parts. The subset_of operation
is available since property Inventory and result P_red both have type Set[Part].
Example 2: What parts are needed by jobs in Boston?
BosJobs := Select(Jobs, λj j.address.city = "Boston")
BosJobParts := Project(BosJobs, λj <(J, j), (Pt, j.PartsNeeded)>)
The select operation finds the jobs in Boston and the project operation gives
information about which parts are needed for each job in Boston. The result of
the projection is of type Set[Tuple]. Note that operation NewPart (of type Job)
cannot be applied to members of BosJobParts, since they have type Tuple.
However, it is appropriate for objects BosJobParts.J.
Example 3: Find all local suppliers for each job.
LocalS := Ojoin(Jobs, Suppliers, J, S, λj λs
j.address.city = s.address.city)
This Ojoin operation produces a set of tuples of type <(J, Job), (S, Supplier)>,
which is similar to a normalized relation. To get a set of suppliers for each job, a
Nest operation needs to be applied: Nest(LocalS, S).
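The three examples can be traced with a small Python sketch. Everything here is an assumption made for illustration: the functions select, project, and ojoin are simplified stand-ins for the algebra's operators, objects are dictionaries, parts are identified by their Num strings, the nested address.city property is flattened to a single "city" key, and the sample data are invented.

```python
def select(S, p):
    return [s for s in S if p(s)]


def project(S, fields):
    return [{name: f(s) for name, f in fields} for s in S]


def ojoin(S, R, a1, a2, p):
    return [{a1: s, a2: r} for s in S for r in R if p(s, r)]


# Invented sample data loosely following Figure 2.5.
parts = [{"Num": "p1", "color": "Red"},
         {"Num": "p2", "color": "Blue"},
         {"Num": "p3", "color": "Red"}]
suppliers = [{"Ident": "s1", "city": "Boston", "Inventory": {"p1", "p3"}},
             {"Ident": "s2", "city": "Miami", "Inventory": {"p1", "p2"}}]
jobs = [{"Num": "j1", "city": "Boston", "PartsNeeded": {"p2", "p3"}},
        {"Num": "j2", "city": "Miami", "PartsNeeded": {"p1"}}]

# Example 1: suppliers whose inventory includes every red part.
p_red = {p["Num"] for p in select(parts, lambda p: p["color"] == "Red")}
s_pred = select(suppliers, lambda s: p_red <= s["Inventory"])

# Example 2: parts needed by jobs in Boston.
bos_jobs = select(jobs, lambda j: j["city"] == "Boston")
bos_job_parts = project(bos_jobs, [("J", lambda j: j),
                                   ("Pt", lambda j: j["PartsNeeded"])])

# Example 3: local suppliers for each job.
local_s = ojoin(jobs, suppliers, "J", "S",
                lambda j, s: j["city"] == s["city"])
local_pairs = [(t["J"]["Num"], t["S"]["Ident"]) for t in local_s]
```

Note that bos_job_parts holds new tuple objects rather than Job objects, which is why an operation defined on Job applies only to the J component of each tuple.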
From the above description, we can see that the query algebra supports
many features of O-O databases and has taken significant steps towards a powerful
O-O query algebra to serve as the mathematical foundation for O-O databases.
However, it still has the following limitations.
(1) Although ENCORE models an application as networks of types, objects,
and their associations, the domain of its underlying query algebra is defined as
collections of objects having type Set[T], which is essentially a nested relation
representation, since the member type T of the set type can be a parameterized
Tuple type which may in turn contain attributes of Tuple types.
Therefore, the query algebra cannot represent network-structured relationships
among objects efficiently and the mapping problem addressed before still
remains.
(2) In this algebra, two identical expressions or two identical operations in a single
expression do not give identical responses, since each result collection is a
newly identified object in the database. To eliminate duplicated copies of the
same newly created object, the algebra introduces DupEliminate and
Coalesce operations, which would not be necessary if it directly supported the
network view of O-O databases.
(3) In this algebra, a collection may contain objects with heterogeneous
structures. For example, two objects may both be of Tuple type but with different
arities, and the union of the two objects is also a collection of objects having
Tuple type. However, other operators in this algebra are not defined to
operate on such collection(s).
(4) Since the query algebra is developed for a specific model (i.e., ENCORE), it is
difficult to apply to other O-O models.

Figure 2.1 A sample schema

(a) simple query patterns
Figure 2.2 Simple and complex query patterns

[Nested relation with attributes NAME, ADDRESS, and INVESTMENTS (COMPANY, SHARES, PURCHASE PRICE, DATE); sample tuples for John Smith and Jill Brody]
Figure 2.3 An example of a nested relation

Pattern Number   A    B    C    D    E    F    F    F    G    H
1                a1   b2   c4   d3   e2   f5   f5   f5   g1   h6
Figure 2.4 Using a nested relation to represent a complex structure

Type Supplier
  properties:
    Ident: string
    Address: Addr
    Inventory: Set[Part]
  operations:
    RecvOrder: Supplier, Set[Part] --> Supplier
Type Job
  properties:
    Num: string
    Address: Addr
    PartsNeeded: Set[Part]
    Preferred_Suppliers: Ordered_list[Supplier]
  operations:
    NewPart: Job, Part --> Job
Type Part
  properties:
    Num: string
    Address: Addr
    Color: string
    Components: Set[Tuple[<(P,Part),(Qty,Int)>]]
    Plan: drawing
    BillofMaterial: list[Part]
  operations:
    Order: Part --> Part
    Same_Part: Part, Part --> Boolean
Type Addr
  properties:
    Street: string
    City: string
    State: string
Figure 2.5 A Supplier-Parts-Job database

CHAPTER 3
OVERVIEW OF O-O DATABASES
AND ASSOCIATION-BASED QUERY FORMULATION
This chapter informally introduces the graphical view of O-O databases and
illustrates the association-based query formulation mechanism. The graphical
view captures the most important characteristics of O-O databases in which
object classes and their objects are associated with each other. Based on this
view, query formulation and processing can be made by specifying and
manipulating association patterns in which objects are inter-related with each other, unlike
the traditional attribute-based query formulation and processing which match
values in different relations. Since the graphical view is suitable for many O-O
data models, the association algebra developed based on this view can be used as a
general algebra for supporting these O-O databases. The graphical view of O-O
databases is formalized in the next chapter.
3.1 Overview of O-O Databases
O-O semantic data models provide a conceptual basis for defining O-O databases.
Although each model has some unique constructs that distinguish one
model from the others, there are several common structural and behavioral
properties based on which an algebra can be developed and used to support these
models:
First, objects are physical entities, abstract concepts, events, processes,
functions or anything that an application cares to capture and represent.
Second, objects having the same structural and behavioral properties are
grouped together to form an object class. Object classes can be categorized into
two general categories: (1) the nonprimitive-class which represents a set of objects
of interest in an application world, each of which is assigned a system-wide unique
object identifier (OID) and its data are explicitly entered in a database by the
user; and (2) the primitive-class which represents a class of self-named objects
serving as a domain for defining other object classes, such as a class of symbols or
numerical values. The behavioral properties of an object class are defined in
terms of system-defined or user-defined operations (e.g., retrieve, display, delete,
insert, rotate a design object, hire an employee, etc.), which can meaningfully
operate on its objects using their corresponding programs (or methods). The
structural properties of an object class and, thus, its objects consist of two types of
data: (1) descriptive data (or instance variables) which define the states of the
objects; and (2) association data which specify the relationships between its
objects and the objects of some related classes.
Third, different O-O models recognize different types of associations. Two of
the most commonly recognized associations are aggregation and generalization.
Aggregation models the a-part-of, a-function-of, or a-composition-of
relationship. For instance, a complex object can be modeled by an aggregation hierarchy
(abstract data type) in which a complex object is defined in terms of its
associations with objects in other defined classes. Generalization models the is-a or the
superclass-subclass relationship in which an object in a subclass inherits both the
structural and the behavioral properties of its superclass(es).
Thus, from the algebra point of view, an O-O database can be viewed as a
collection of objects, grouped together in classes and interrelated through
associations. It can be represented by graphs at both the intensional and the extensional
levels. At the intensional (schema) level, a database is defined by a collection of
inter-related object classes and is represented by a Schema Graph (SG). For
example, the SG for a university database is illustrated in Figure 3.1, in which
each rectangle denotes a nonprimitive-class such as a class of person objects or a
class of department objects, and each circle denotes a primitive-class such as a
class of names or ages. The associations among classes are represented by the
edges in SG. For example, there is an association between the class Course and
the class Department (an Aggregation association), and an association between the
class Person and the class Student (a Generalization association). Since the
semantic distinctions of these and other association types recognized by different
semantic models can be either hard-coded in a DBMS or declaratively specified by
some rules and used by a rule processor to govern the manipulation of the
associated classes, the underlying algebra does not have to incorporate the semantics of
these association types. All it has to be concerned with is whether or not an
object class and its objects are associated with some other classes and their
objects, i.e., the edges (or associations) are type-less in SG. For example, the
semantics of inheritance can be incorporated in a query language translator which
translates a high-level language statement into its underlying algebraic
representation. The algebra does not have to deal directly with the semantics of inheritance.
This is particularly important if the algebra is to be used as a general algebra for
supporting various O-O data models in which the semantics of an association type
may have slightly different meanings.
At the extensional (instance) level, a database can be viewed as a collection
of objects, grouped together in classes and inter-related through some type-less
associations; and as such it can be represented by an Object Graph (OG). For
example, the OG corresponding to a portion of the university schema graph is
shown in Figure 3.2. In this example, the Teacher object t4 is associated with two
Section objects, thereby representing the fact that he/she is teaching two sections,
sc3 and sc4. The Student object s1 is associated with Undergrad object u1 which,
in turn, is associated with Department object d1, thereby representing that s1 is
an undergraduate student who minors in the department d1. Finally, the Section
object sc2 is not associated with any object of the Student class, which represents
the fact that it is not taken by any student. Object associations expressed by
different graph patterns represent the semantic relationships among these objects
in an application world.
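A type-less object graph of this kind needs very little machinery. The Python sketch below is one possible encoding, not the representation used in this dissertation: the names classes, edges, associated, and neighbours are hypothetical, and only the handful of objects and edges mentioned in the paragraph above are included, not the full Figure 3.2.

```python
# Objects grouped by class (a fragment only; taken from the prose above).
classes = {
    "Teacher": {"t4"},
    "Section": {"sc2", "sc3", "sc4"},
    "Student": {"s1"},
    "Undergrad": {"u1"},
    "Department": {"d1"},
}

# Type-less, undirected associations between object instances.
edges = {frozenset(e) for e in [("t4", "sc3"), ("t4", "sc4"),
                                ("s1", "u1"), ("u1", "d1")]}


def associated(a, b):
    """True when a regular edge connects the two objects."""
    return frozenset((a, b)) in edges


def neighbours(obj, cls):
    """Objects of class cls that are associated with obj."""
    return sorted(o for o in classes[cls] if associated(obj, o))


sections_of_t4 = neighbours("t4", "Section")
students_of_sc2 = neighbours("sc2", "Student")
```

The edges carry no label such as "teaches" or "minors in"; the meaning comes from which classes the two endpoints belong to, as described above.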
3.2 Pattern-based Query Formulation
Based on this view of an O-O database, users can query the database by
specifying patterns of object associations as search conditions. Once these
objects are selected, they can be further processed by either system-defined
operations (Retrieval, Display, Update, Insert, Delete, etc.) or user-defined
operations (RotatePart, PurchasePart, HireFaculty, etc.). For example, the
following queries can be issued against the university database as illustrated in
Figures 3.1 and 3.2 (the algebraic expressions for these queries will be given in Section
4.4).
Query 1: For all sections, get the majors of students who are taking these
sections.
To satisfy this query, we can specify a linear pattern containing the classes
Section, Student, and Department as shown in Figure 3.3a. In this pattern, a
circle represents a class and an edge represents that the objects of the two adjacent
circles (classes) must be associated with each other. This pattern is called an
intensional pattern which represents that sections taken by students who major in
some departments are to be identified. The answer to this query can be found in
Figure 3.2 by checking if the objects of these three classes satisfy such a pattern.
There are five object patterns (called extensional patterns) which satisfy the
intensional pattern as shown in Figure 3.3b. The Section object sc2 and the Student
object s3 do not appear in these extensional patterns, since sc2 is not taken by any
student and s3 does not have a major yet. These patterns can also be identified in
two sequential steps. First, get all the patterns in which the Section objects are
associated with the Student objects. Then, if a pattern generated in the first step
(i.e., a Section-Student pair) is further associated with an object of Department, a
new pattern consisting of three objects is constructed and retained in the result;
otherwise, the pair is dropped.
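The two sequential steps can be sketched in Python. The edge lists are assumptions chosen to agree with the five extensional patterns of Figure 3.3b; in particular, the edge between sc1 and s3 is invented (the text only states that s3 has no major), and the helper name associate is hypothetical rather than an operator of the algebra.

```python
# Assumed Section-Student and Student-Department edges.
takes = [("sc1", "s1"), ("sc1", "s3"), ("sc3", "s2"),
         ("sc3", "s4"), ("sc3", "s5"), ("sc4", "s7")]
majors = [("s1", "d1"), ("s2", "d3"), ("s4", "d3"),
          ("s5", "d4"), ("s7", "d6")]


def associate(patterns, edges):
    """Extend each pattern by every edge that starts at its last object.
    A pattern with no matching edge is dropped."""
    return [p + (b,) for p in patterns for (a, b) in edges if p[-1] == a]


# Step 1: all Section-Student pairs.
step1 = [(section, student) for section, student in takes]

# Step 2: keep a pair only if its student is associated with a department.
step2 = associate(step1, majors)
```

The pair (sc1, s3) is produced in the first step and dropped in the second, and sc2 never appears because no edge starts from it.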

Once these objects (as well as their associations) have been identified,
different system-defined or user-defined operations defined on their corresponding
classes can be applied to these selected objects. For example, Inform(Department)
can be an operation defined on the class Department. It sends each of the selected
departments a letter concerning the majors of the students.
Suppose there is a rule in the university that a student cannot major and
minor in the same department. To check whether there is such a case in the
database, the following query can be issued.
Query 2: List students who major and minor in the same department.
The intensional pattern for this query is shown in Figure 3.3c. It can be
formed by starting from the class Student and navigating the schema in two
traversal paths (refer to Figure 3.1). One path is from Student to Department,
which means that a student majors in a certain department; and the other path is
from Student to Department through Undergrad, which means that a student is
an undergraduate and minors in a certain department (we can see from the SG
that only undergraduates may have minors). According to the query, a single
student should associate with objects in both Undergrad and Department and these
two paths should merge at Department, thereby forming a loop. This implies two
logical AND conditions, one at the Student class and the other at the Department
class. We use double arcs to denote such conditions as shown in Figure 3.3c.
From Figure 3.2, we can see that the student s1 has his major and minor in the
department d1. This extensional pattern is depicted in Figure 3.3d.
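The loop pattern amounts to requiring that both traversal paths start at the same Student object and end at the same Department object. A minimal Python sketch follows; apart from the pattern s1, u1, d1 taken from the text, the edge sets are invented for illustration.

```python
# Assumed edges; only the s1-u1-d1 loop comes from the text.
major = {("s1", "d1"), ("s2", "d3"), ("s5", "d4")}      # Student-Department
is_undergrad = {("s1", "u1"), ("s2", "u2")}             # Student-Undergrad
minor = {("u1", "d1"), ("u2", "d4")}                    # Undergrad-Department

# AND at Student: the same student starts both paths.
# AND at Department: both paths end at the same department.
loops = sorted(
    (s, u, d)
    for (s, d) in major
    for (s2, u) in is_undergrad if s2 == s
    for (u2, d2) in minor if u2 == u and d2 == d
)
```

Student s2 has both a major and a minor but in different departments, so the two paths do not close into a loop and s2 is not reported.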

Query 3: For those students taking section 300 and having majors and/or
minors, get their majors and/or minors.
There are several ways to form an intensional pattern for the query. We
may start from Section# and traverse to Student through Section and, then,
navigate the schema in two paths as we did for Query 2. According to the query, a
student who either has a major or a minor should be included in the result (in this
database, it is assumed that graduate students do not have minors). This means
that either path of the navigation will construct a pattern that would satisfy the
query. Thus, a logical OR condition exists at Student. We use a single arc to
indicate the OR condition as shown in Figure 3.4a. Like Query 2, these two
branches merge at Department. However, this query does not require that they
merge at the same Department object. This is specified by the second OR
condition at Department in Figure 3.4a.
The extensional patterns that satisfy this query have heterogeneous
structures: two types of linear patterns as shown in Figure 3.4b. The first type includes
patterns that represent the minors of the undergraduates; and the second type
includes patterns that represent the majors of the students who are either
undergraduates or graduates. In both types of patterns, a student is associated with
section 300 which is assumed to be the Section# for sc3. Figure 3.4c will be
described later in Section 4.4.
We have given some example queries which specify how objects are
associated with one another. In the graphical representation of an O-O database, when
there is no edge between two objects even though there is one between their
classes, it implies that two objects are not associated with each other. This
represents the complement aspect of the semantics between two associated classes.
It is necessary to allow a user to retrieve this type of object non-association from a
database. The following query is such an example. It can also be specified by a
pattern.
Query 4: For each teacher, list the sections which he/she does not teach.
We use a dashed line to represent the fact that two objects are not associated
with each other. Therefore, the intensional pattern for this query can be drawn as
in Figure 3.4d. There are twelve extensional patterns that match the intensional
pattern. Figure 3.4e shows a portion of them. Non-association relationships
among objects are not explicitly stored in a database. However, they can be
derived during the processing of this type of query.
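Deriving non-associations at query time can be sketched as the set of Teacher-Section pairs that have no stored edge. The data below are invented: the text only states that t4 teaches sc3 and sc4 and that Figure 3.4e includes t1 with sc2, t1 with sc3, and t4 with sc2. The number of teachers and sections and the remaining edges are assumptions, chosen so that the count comes out at twelve.

```python
from itertools import product

teachers = ["t1", "t2", "t3", "t4"]          # assumed class members
sections = ["sc1", "sc2", "sc3", "sc4"]      # assumed class members

# Stored (regular) associations; only the two t4 edges come from the text.
teaches = {("t1", "sc1"), ("t2", "sc2"), ("t4", "sc3"), ("t4", "sc4")}

# Complement: every Teacher-Section pair without a stored edge.
not_teaches = [(t, s) for t, s in product(teachers, sections)
               if (t, s) not in teaches]
```

Nothing is stored for the non-associations; they are computed from the stored edges and the membership of the two classes.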
Using the above examples, we hope that we have convinced the reader that
the pattern-based query formulation is suitable for query specification based on a
graphical view of an O-O database.
3.3 Conclusion
The (type-less) graphical representation of O-O databases is applicable to
most O-O data models, since it captures the essential characteristics of O-O data
models in which object classes as well as their objects are inter-related with each
other in different association patterns. Querying such databases can be made by
specifying patterns in which objects of interest are associated with each other. It
should be clear that this formulation is quite different from the attribute-based
query formulation in the existing relational query languages which is based on
matching the attributes (or the key or composite key) of one relation with the
attributes (foreign keys) in other relations. A query that requires the specification
of a complex pattern of object associations can be specified in a rather
straightforward manner in an association-based language, whereas in an attribute-based
language, complex nestings of query blocks or multiple queries would be required
[ALA89a].
It is our view that an algebra developed for processing data based on the
graphical view of O-O databases and the pattern-based query formulation should
satisfy the following requirements. First, it should allow direct manipulation of
complex patterns of object associations. Second, the closure property should be
maintained. Third, both association and non-association relationships among
objects should be expressible as search conditions. Fourth, it should be complete
in the sense that it can be used to describe all possible patterns in a database.
Lastly, it must be able to represent and process patterns with both homogeneous
and heterogeneous structures.

Figure 3.1 Schema graph of a university database

Figure 3.2 Object graph

Query 1
(a) Intensional pattern: Section, Student, Dept
(b) Extensional patterns (Section, Student, Dept):
    sc1  s1  d1
    sc3  s2  d3
    sc3  s4  d3
    sc3  s5  d4
    sc4  s7  d6
Query 2
Figure 3.3 Pattern specifications for Query 1 and Query 2

Query 3
(b) Section#, Section, Student, Dept
Query 4
(d) Intensional pattern: Teacher, Section
(e) Extensional patterns (Teacher, Section): t1 sc2; t1 sc3; t4 sc2
Figure 3.4 Pattern specifications for Query 3 and Query 4

CHAPTER 4
ASSOCIATION ALGEBRA
The association algebra (A-algebra) is defined based on a uniform
representation of an O-O database in terms of objects, object classes, and type-less
associations, as described in Chapter 3. The algebra contains a number of operators
which operate on graph structures of object associations to produce graph
structures. The closure property of the algebra ensures that the result of a query can
be further manipulated by other queries.
4.1 Definitions
First, we formally define an O-O database at both schema and object levels.
Schema Graph (the intensional database):
The schema graph of an O-O database is defined as $SG(C, A)$, where $C = \{C_i\}$
is a set of vertices representing object classes; $A$ is a set of edges, each of
which, $A_{ij}(k)$, represents an association between classes $C_i$ and $C_j$, where $k$ is a
number for distinguishing the edges from one another when there is more
than one edge between two vertices.
Object Graph (the extensional database):
The object graph of an 0-0 database is defined as OG(OtE), where 0={O^}
is a set of vertices representing object instances (j'th object in class C{); and
E={0iX=OmJ is a set of edges representing the associations among object
instances. When one object instance is connected with another in the object
graph, a regular-edge (solid line) is drawn between the corresponding ver¬
tices as Oi^—Omn which specifies that j'th object instance in class <7,. is
related to nth object instance in class Cm through the fcth association of
classes C¡ and Cm. If two object instances Ot j and Om n are not connected
in the object graph but their classes <7,- and Cm in the corresponding SG are
51

52
directly connected, a complement-edge (dotted line) is drawn between them
and is denoted by ^
J ij m, n
In this O-O model, an object may participate in several classes (e.g., in a
generalization hierarchy). Its representation in a class is called an object instance.
Since in most cases in this dissertation, "object" and "object instance" can be used
interchangeably without any ambiguity, we shall use "object" unless a distinction
is required between the two.
The reason for explicitly introducing complement-edges into the OG is to
allow the A-algebra to manipulate both association and non-association between
objects of two adjacent classes. In an actual O-O database, it is not necessary to
explicitly store the complement-edges. Figure 4.1 illustrates the regular-edges and
complement-edges among the objects of three object classes. For example, we see
that section sc1 is taken by students s2 and s3 (regular-edges) and not taken by
students s1 and s4 (complement-edges).
The relationship between an OG and its corresponding SG is formally
described by the following proposition.
Proposition 1: An $OG(O,E)$ is a morphism of its corresponding $SG(C,A)$.
The mapping function $F_m$ is defined as
$F_{m1}: C_i \Rightarrow \{O_{i,j}\}$ and
$F_{m2}: A_{im(k)} \Rightarrow \{O_{i,j} \overset{k}{-} O_{m,n}\}$.
The mapping between SG and OG is one-to-many, since a database is
dynamically changing and may have different instantiations at different times for
the same schema graph.

To define "association pattern", we first extend the concept of connected
graph in graph theory by treating complement-edges as edges, i.e., a connected
graph is a graph in which there exists at least one path between any two vertices
and each path may contain regular-edges, complement-edges, or a combination of
the two. We shall from now on use an upper-case letter to denote a class and the
corresponding lower-case letter with a subscript to denote an object instance in
that class. We shall assume that there is only one edge between any two vertices
in SG unless otherwise specified so as not to complicate the notation.
Association Pattern:
A connected subgraph of an OG is an association pattern (or pattern for
short).
By this definition, a single vertex (or object instance) in OG, which is a connected
subgraph, is also a pattern. We call it an Inner-association-pattern (or
Inner-pattern for short). It is algebraically represented by $(a_i)$ for a vertex of class
A in SG. Thus, object instances are treated as Inner-patterns in the A-algebra. A
regular-edge together with the two vertices (i.e., two Inner-patterns) it connects is
called an Inter-association-pattern (or Inter-pattern), which is represented by $(a_ib_j)$.
A complement-edge together with the two Inner-patterns it connects is called a
Complement-association-pattern (or Complement-pattern) and is represented by
$(\overline{a_ib_j})$. This pattern states that $a_i$ and $b_j$ are not associated with each other in OG.
If a path between vertices $a_i$ and $b_j$ consists of only regular-edges, it can be
represented by a Derived-inter-association-pattern (D-inter-pattern), denoted by
$(a_i \frown b_j)$; otherwise, it can be represented by a Derived-complement-association-pattern
(D-complement-pattern), denoted by $(\overline{a_i \frown b_j})$. When a path is represented
by a derived pattern, it simply means that two vertices are indirectly associated or
non-associated but how they are interrelated (the actual path) is of no importance.
A D-inter-pattern is treated as an Inter-pattern and a D-complement-pattern is
treated as a Complement-pattern in the algebraic operations.
The above five types of patterns are the primitive patterns, the latter four
being binary patterns. Their graphical and algebraic representations are summarized
in Figure 4.2a. All other connected subgraphs are called complex patterns.
For example, the complex pattern shown in Figure 4.2b1 contains three primitive
patterns: two Inter-patterns $(a_1b_1)$ and $(b_1d_1)$, and a Complement-pattern $(\overline{b_1c_1})$. It
can be uniquely defined by its algebraic representation as a set of primitive patterns,
i.e., $(a_1b_1, \overline{b_1c_1}, b_1d_1)$. More examples of complex patterns are shown in Figure
4.2b. From these examples, one can observe that a complex pattern can be
decomposed into a set of binary patterns which cannot be further decomposed.
This implies that, in the algebraic representation of a complex pattern, an Inner-pattern
may not occur as an element and a binary pattern may appear only once.
A pattern in this algebraic format is called a normalized pattern; otherwise it is
called an unnormalized pattern. For example, $(b_2, b_2c_2)$ is an unnormalized pattern,
because the Inner-pattern $(b_2)$ is listed alongside a binary pattern that already
contains it. During the process of constructing an association pattern, we always
normalize it by eliminating the duplicates; the normalized form of $(b_2, b_2c_2)$ is $(b_2c_2)$.
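The normalization rule can be sketched in a few lines of Python. This is only an illustration under assumed conventions, not part of the A-algebra itself: the encoding of a primitive pattern as a `(kind, objects)` pair and the helper names `inner`, `inter`, `comp`, and `normalize` are all hypothetical.

```python
# Hypothetical encoding (an assumption of this sketch): an object instance is
# a string such as "b2"; a primitive pattern is a (kind, objects) pair with
# kind in {"inner", "inter", "comp"} and objects a frozenset of instance names.

def inner(x):
    return ("inner", frozenset([x]))

def inter(x, y):
    # A frozenset makes the pattern non-directional: (a b) equals (b a).
    return ("inter", frozenset([x, y]))

def comp(x, y):
    return ("comp", frozenset([x, y]))

def normalize(primitives):
    """Drop duplicate primitives and any Inner-pattern that is already
    covered by a binary pattern of the same pattern."""
    prims = set(primitives)  # a set removes duplicate primitive patterns
    covered = set()
    for kind, objs in prims:
        if kind != "inner":
            covered |= objs
    return frozenset(
        p for p in prims
        if not (p[0] == "inner" and p[1] <= covered)
    )

# The unnormalized pattern (b2, b2c2) normalizes to (b2c2).
example = normalize([inner("b2"), inter("b2", "c2")])
print(example)
```

A single Inner-pattern that is not covered by any binary pattern is kept, which matches the statement that a lone vertex is itself a pattern.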
The definitions of OG and association pattern imply that a pattern is a non-directional
graph, i.e., $(a_ib_j) = (b_ja_i)$, and that the sequence of primitive patterns in
the algebraic representation of a complex pattern is not important, hence
$(a_ib_j, b_jc_k) = (c_kb_j, a_ib_j)$.
Based on the above definition and notion of association pattern, we view an
OG as an Association Graph (AG) and all the association patterns in AG form the
domain of the A-algebra, denoted by $\mathcal{A}$.
4.2 Relationship Between Two Association Patterns
The operators of the A-algebra are defined based on the possible relationships
between two patterns in $\mathcal{A}$, so that they can be used either to construct complex
patterns using simpler patterns or to decompose a complex pattern into several
patterns of simpler structures. There are four possible relationships between two
patterns $p^1$ and $p^2$: non-overlap, overlap, contain, and equal.
(1) Non-overlap: Two patterns are said to be non-overlapping, denoted by $p^1 \bowtie p^2$,
if they have no common Inner-pattern.
(2) Overlap: Two patterns are said to be overlapped, denoted by $p^1 \cap p^2$, if they
have at least one common Inner-pattern.
(3) Contain: Contain is a special case of (2) when all the primitive patterns of
$p^1$ are contained in $p^2$. We say that $p^1$ is a subpattern of $p^2$ and denote this
relationship by $p^1 \subseteq p^2$.
(4) Equal: This is a special case of (3) when $p^1$ contains all the primitive patterns
of $p^2$, and vice versa. It is denoted by $p^1 = p^2$.
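The four relationships can be checked mechanically. The following is a minimal sketch, assuming a pattern is encoded as a frozenset of `(kind, objects)` pairs, where `kind` is `"inner"`, `"inter"`, or `"comp"` and `objects` is a frozenset of instance names; this encoding and the function names are hypothetical, chosen only for illustration.

```python
def inner_patterns(pattern):
    """Collect the object instances (Inner-patterns) that occur in a pattern."""
    return frozenset(obj for _kind, objs in pattern for obj in objs)

def contains(small, big):
    """True if every primitive pattern of `small` occurs in `big`.
    An Inner-pattern counts as occurring if its object appears anywhere in `big`."""
    big_objs = inner_patterns(big)
    return all(
        prim in big or (prim[0] == "inner" and prim[1] <= big_objs)
        for prim in small
    )

def relationship(p1, p2):
    """Classify the relationship of p1 to p2, testing the most specific case first."""
    if contains(p1, p2) and contains(p2, p1):
        return "equal"
    if contains(p1, p2):
        return "contain"      # p1 is a subpattern of p2
    if inner_patterns(p1) & inner_patterns(p2):
        return "overlap"
    return "non-overlap"

# Illustrative patterns (made up for this sketch).
ab = frozenset([("inter", frozenset(["a1", "b1"]))])
abc = ab | {("comp", frozenset(["b1", "c1"]))}
cd = frozenset([("inter", frozenset(["c1", "d1"]))])
b = frozenset([("inner", frozenset(["b1"]))])

print(relationship(ab, abc), relationship(abc, cd), relationship(ab, cd))
```

Because contain and equal are special cases of overlap, the sketch reports only the most specific relationship that holds.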
Before defining the association operators, we give the definition of
"Association-set", the operand of the association operators.
Association-set:
An association-set, denoted by a Greek letter $\alpha$ (or $\beta$, $\gamma$, ...), is a set of association
patterns without duplicates. $\alpha^i$ designates the $i$th pattern in $\alpha$, where
$\alpha^i \neq \alpha^j$ ($\forall i \neq j$). An empty set is also an association-set, denoted by $\phi$.
A special type of association-set is called homogeneous association-set, which
is important to the A-algebra, since some of the mathematical properties hold only
when operands are homogeneous association-sets.
Homogeneous Association-set:
An association-set is homogeneous, if
(1) all patterns are formed by the Inner-patterns (or object instances) of
the same set of object classes; and
(2) all patterns have the same number of Inner-patterns from each class in
the set; and
(3) corresponding primitive patterns belong to the same association and are
of the same type; and
(4) all patterns have the same topology.
Otherwise, it is a heterogeneous association-set.
Figure 4.3 depicts three example association-sets: $\alpha$ is homogeneous, whereas
$\beta$ is not, since one of its patterns has only one Inner-pattern of class C instead of two like
the other patterns of $\beta$. $\gamma$ is not homogeneous because one of its patterns contains a
Complement-pattern, which makes it different from the other patterns of $\gamma$ (i.e., different topologies).
4.3 Association Operators
Ten association operators are formally defined in this section: three unary
operators [A-Project ($\Pi$), A-Select ($\sigma$), and A-Integrate ($\mathcal{I}$)] and seven binary
operators [Associate (*), A-Complement (|), A-Union (+), A-Difference (-),
A-Divide ($\div$), NonAssociate (!), and A-Intersect (•)]. The examples used to explain
these operators will make use of the domain $\mathcal{A}$ shown in Figure 4.4. To keep the
graph simple, the Complement-patterns are not shown in the figure. The simple
mathematical properties such as commutativity, associativity, idempotency, and
nilpotency satisfied by the operators are given after each definition.
4.3.1 Notations
Notations that will be used in the subsequent sections are listed below.
A, B, ...                   Denote classes.
CL, CL_n                    Denotes a variable for a class.
[R(CL_1, CL_2)]             Denotes the association between classes CL_1 and CL_2.
$(a_i)$                     Denotes the $i$th Inner-pattern of class A.
@                           Denotes an Inner-pattern variable.
$(a_ib_j)$                  Denotes an Inter-pattern between two classes A and B.
$(\overline{a_ib_j})$       Denotes a Complement-pattern between two classes A and B.
$(a_i \frown c_k)$          Denotes a Derived-pattern from class A to class C.
$\alpha$, $\beta$, ...      Denote association-sets.
$\alpha^i$                  Denotes the $i$th pattern of association-set $\alpha$.
{X}, {Y}, ...               Denote sets of classes. Hence, $\alpha_{\{X\}}$ represents association-set $\alpha$
                            which has Inner-pattern(s) from the classes in {X}.
It should be noted that an Inner-pattern is represented by an object instance
identifier (IID), which is a system-assigned object identifier (OID) prefixed by a
class identification so that the object instances of an object in multiple classes can
be unambiguously distinguished and the fact that these object instances are

58
instances of the same object can easily be recognized.
4.3.2 Operators
All relational algebraic operators operate on relations of homogeneous (or
union-compatible) structures with the exception of Cartesian-product and Join.
The Cartesian-product and Join provide the mechanism to concatenate two relations
of different structures into a single relation, so that it can be further manipulated
by other operators. In the A-algebra, all the operators are defined to operate
on association patterns of homogeneous as well as heterogeneous structures.
Therefore, the relational algebra is a special case of the A-algebra in this respect.
(1) Associate (*):
The Associate operator is a binary operator which constructs an association-set
of complex patterns by concatenating the patterns represented by two operand
association-sets. Since a pattern may involve many classes and an object class
may have more than one association with another class, it is necessary to specify
through which association the concatenation of two patterns is intended. The
Associate operation on association-sets $\alpha$ and $\beta$ over the association R between
classes A and B is defined as follows:
$\alpha *[R(A,B)]\, \beta = \{ \gamma \mid \gamma = (\alpha^i, \beta^j, a_mb_n) : a_mb_n \in [R(A,B)] \wedge a_m \in \alpha^i \wedge b_n \in \beta^j \}$
The result of an Associate operation is an association-set containing no duplicates.
Each of its patterns is the concatenation of two patterns (one from each
operand association-set). More specifically, if the Inner-pattern (or object $a_m$) of A
in $\alpha^i$ is associated with the Inner-pattern (or object $b_n$) of B in $\beta^j$ in the domain of
the algebra $\mathcal{A}$ shown in Figure 4.4, then $\alpha^i$ and $\beta^j$ are concatenated via the primitive
pattern $(a_mb_n)$.
We do not restrict A and B to be different classes in *[R(A,B)], i.e.,
$\alpha *[R(A,A)]\, \beta$ is a legitimate operation, which concatenates two patterns (one from
each operand association-set) if they have a common Inner-pattern of class A.
An example of the Associate operation is shown in Figure 4.5a (for convenience,
a copy of the sample database is shown in each figure illustrating an
operation). For clarity, we use graphical notation in the figures. In the example,
$\alpha^1$ is concatenated with two patterns of $\beta$, due to the existence of $(b_1c_1)$ and
$(b_1c_2)$ in $\mathcal{A}$ as shown in Figure 4.4. $\alpha^2$ is dropped simply because it does not have an
Inner-pattern of class B. $\alpha^3$ is dropped because $(b_2)$ is not associated with any
Inner-pattern of class C in $\mathcal{A}$. One pattern of $\beta$ cannot be concatenated through $(c_4)$ with any
pattern in $\alpha$ because no pattern in $\alpha$ has an Inner-pattern of B that is associated
with $(c_4)$ in $\mathcal{A}$. For the same reason the remaining pattern of $\beta$ is dropped.
For the Associate operator, [R(A,B)] can be omitted if the following conditions
hold: (1) both $\alpha$ and $\beta$ are A-algebra expressions, (2) the Associate operator
operates on the last class in a linear expression $\alpha$ and the first class in a linear
expression $\beta$, and (3) there is a unique association between these two classes. For
example, A *[R(A,B)] B can be written as A*B, if class A is associated with class
B through the attribute [R(A,B)] of A. It should be pointed out that the A-algebra
allows an attribute to be defined by a computed value (or object), for instance,
B=f(A). The implementations of the function and the procedure are invisible to
the algebra. However, they should not have side effects, i.e., the computed result
must be of the same type as B.
The Associate operator is commutative and conditionally associative as
defined below:
$\alpha *[R(A,B)]\, \beta = \beta *[R(B,A)]\, \alpha$ (commutativity)
$(\alpha_{\{X\}} *[R(A,B)]\, \beta_{\{Y\}}) *[R(C,D)]\, \gamma_{\{Z\}}$ (associativity)
$= \alpha_{\{X\}} *[R(A,B)]\, (\beta_{\{Y\}} *[R(C,D)]\, \gamma_{\{Z\}})$ (if $C \notin \{X\} \wedge B \notin \{Z\}$)
$A *[R(A,A)]\, A = A$ (idempotency)
The associativity holds true if $\alpha$ and $\gamma$ do not have Inner-patterns of classes C
and B, respectively. Otherwise, the associativity does not hold. For example, if
$\alpha=(a_1b_1, b_1c_2)$, $\beta=(b_1c_1)$, $\gamma=(d_1)$, and $\mathcal{A}$ is as shown in Figure 4.4 (the domain of the
algebra), then
$(\alpha *[R(A,B)]\, \beta) *[R(C,D)]\, \gamma = (a_1b_1, b_1c_1, b_1c_2, c_2d_1)$
and
$\alpha *[R(A,B)]\, (\beta *[R(C,D)]\, \gamma) = \phi$
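The failure of associativity in this example can be reproduced with a small sketch of the Associate operator. Everything below is an illustrative assumption rather than the dissertation's implementation: patterns are frozensets of `(kind, objects)` pairs, the class of an object is taken from the first letter of its name, and the domain contains only the Inter-patterns that the example mentions (Figure 4.4 itself is not reproduced here). The sketch does not cover the special case *[R(A,A)].

```python
def inter(x, y):
    return ("inter", frozenset([x, y]))

def inner(x):
    return ("inner", frozenset([x]))

def normalize(prims):
    """Remove Inner-patterns already covered by a binary pattern."""
    covered = {o for kind, objs in prims if kind != "inner" for o in objs}
    return frozenset(p for p in prims
                     if not (p[0] == "inner" and p[1] <= covered))

def objects_of(pattern, cls):
    """Objects of class `cls` in a pattern (class = first letter, by assumption)."""
    return {o for _kind, objs in pattern for o in objs if o[0] == cls}

def associate(alpha, beta, cls_a, cls_b, domain):
    """Concatenate a pattern of alpha with a pattern of beta whenever an
    Inter-pattern (a_m b_n) between them exists in the domain."""
    result = set()
    for p in alpha:
        for q in beta:
            for am in objects_of(p, cls_a):
                for bn in objects_of(q, cls_b):
                    if inter(am, bn) in domain:
                        result.add(normalize(p | q | {inter(am, bn)}))
    return frozenset(result)

# Assumed fragment of the domain: only the edges named in the example.
domain = {inter("a1", "b1"), inter("b1", "c1"),
          inter("b1", "c2"), inter("c2", "d1")}
alpha = frozenset([frozenset([inter("a1", "b1"), inter("b1", "c2")])])
beta = frozenset([frozenset([inter("b1", "c1")])])
gamma = frozenset([frozenset([inner("d1")])])

left = associate(associate(alpha, beta, "a", "b", domain), gamma, "c", "d", domain)
right = associate(alpha, associate(beta, gamma, "c", "d", domain), "a", "b", domain)
print(len(left), len(right))
```

Grouping to the left lets the class-C object contributed by $\alpha$ reach $(d_1)$, whereas grouping to the right first evaluates $\beta$ against $\gamma$, which yields the empty association-set.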

(2) A-Complement (|):
The A-Complement operator is a binary operator which concatenates the
patterns of two operand association-sets over Complement-patterns. It is used to
identify the objects in two classes which are not associated with each other in $\mathcal{A}$.
The A-Complement operator is defined as follows:
$\alpha \mid[R(A,B)]\, \beta = \{ \gamma \mid \gamma = (\alpha^i, \beta^j, \overline{a_mb_n}) : \overline{a_mb_n} \in [R(A,B)] \wedge a_m \in \alpha^i \wedge b_n \in \beta^j$
    or $\gamma = \alpha^i : \exists(m)(a_m \in \alpha^i) \wedge \nexists(n)(b_n \in \beta)$
    or $\gamma = \beta^j : \exists(n)(b_n \in \beta^j) \wedge \nexists(m)(a_m \in \alpha) \}$
The result of an A-Complement operation is an association-set. Each of its
patterns is formed by concatenating two patterns (one from each operand
association-set) via a Complement-pattern $(\overline{a_mb_n})$, where $a_m$ and $b_n$ belong to $\alpha^i$
and $\beta^j$, respectively, and the Complement-pattern $(\overline{a_mb_n})$ is in $\mathcal{A}$. In the special
case when $\alpha$ (or $\beta$) is an empty association-set or does not have Inner-patterns of
class A (or B), all patterns of $\beta$ (or $\alpha$) that have Inner-patterns of B (or A) are
retained in the resulting association-set.
An example of the A-Complement operation is shown in Figure 4.5b. It
operates over the association between classes B and C. $\alpha^2$ does not appear in the
resultant association-set because it contains no Inner-patterns of B. $\alpha^1$ cannot be
A-Complemented with two of the patterns of $\beta$ because it is connected with them by
Inter-patterns $(b_1c_1)$ and $(b_1c_2)$ in $\mathcal{A}$, respectively.
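The A-Complement definition, including its two special cases, can be sketched as follows. The sketch rests on assumptions that are not in the text: patterns are frozensets of `(kind, objects)` pairs, an object's class is the first letter of its name, the two classes are adjacent in the schema graph, and a Complement-pattern is taken to exist exactly when no regular edge joins the two objects (complement-edges are therefore computed rather than stored).

```python
def inner(x):
    return ("inner", frozenset([x]))

def inter(x, y):
    return ("inter", frozenset([x, y]))

def comp(x, y):
    return ("comp", frozenset([x, y]))

def normalize(prims):
    """Remove Inner-patterns already covered by a binary pattern."""
    covered = {o for kind, objs in prims if kind != "inner" for o in objs}
    return frozenset(p for p in prims
                     if not (p[0] == "inner" and p[1] <= covered))

def objects_of(pattern, cls):
    return {o for _kind, objs in pattern for o in objs if o[0] == cls}

def a_complement(alpha, beta, cls_a, cls_b, regular_edges):
    """Concatenate patterns over Complement-patterns; a missing regular edge
    is treated as a complement-edge (assumes the classes are adjacent)."""
    result = set()
    for p in alpha:
        for q in beta:
            for am in objects_of(p, cls_a):
                for bn in objects_of(q, cls_b):
                    if inter(am, bn) not in regular_edges:
                        result.add(normalize(p | q | {comp(am, bn)}))
    # Special cases: one operand is empty or has no objects of its class.
    if not any(objects_of(q, cls_b) for q in beta):
        result |= {p for p in alpha if objects_of(p, cls_a)}
    if not any(objects_of(p, cls_a) for p in alpha):
        result |= {q for q in beta if objects_of(q, cls_b)}
    return frozenset(result)

# Made-up data: b1 is associated with c1, b2 is not.
regular_edges = {inter("b1", "c1")}
alpha = frozenset([frozenset([inner("b1")]), frozenset([inner("b2")])])
beta = frozenset([frozenset([inner("c1")])])

print(a_complement(alpha, beta, "b", "c", regular_edges))
```

With an empty second operand, the first special case applies and every pattern of the first operand that has an object of class B is retained.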
Under the same conditions as given for the Associate operator, [R(A,B)] need
not be specified with the A-Complement operator unless there is an ambiguity.
The A-Complement operator is commutative and associative. For a reason similar
to that described for the Associate operator, the associativity holds true conditionally.
$\alpha \mid[R(A,B)]\, \beta = \beta \mid[R(B,A)]\, \alpha$ (commutativity)
$(\alpha_{\{X\}} \mid[R(A,B)]\, \beta_{\{Y\}}) \mid[R(C,D)]\, \gamma_{\{Z\}}$ (associativity)
$= \alpha_{\{X\}} \mid[R(A,B)]\, (\beta_{\{Y\}} \mid[R(C,D)]\, \gamma_{\{Z\}})$ (if $C \notin \{X\} \wedge B \notin \{Z\}$)
$A \mid[R(A,A)]\, A = \phi$ (nilpotency)
(3) A-Select ($\sigma$):
The A-Select is a unary operator, which operates on an association-set $\alpha$ to
produce a subset of patterns that satisfy a specified predicate P. A pattern in the
operand association-set is retained iff the predicate evaluates to true for that
pattern.
$\sigma(\alpha)[P] = \{ \gamma \mid \gamma = \alpha^i : P(\alpha^i) = true \}$
where $\alpha$ is defined by an algebraic expression, and $P = T_1 \theta_1 T_2 \theta_2 \cdots \theta_{n-1} T_n$. Each
term $T_i$ ($i=1,2,\ldots,n$) is a comparison between two expressions and $\theta_i$ ($i=1,2,\ldots,n-1$) is a
Boolean operator ($\wedge$ or $\vee$). $P(\alpha^i)=true$ represents that the pattern $\alpha^i$ evaluates to true for
that predicate.
The expressions on the left- and right-hand sides of a comparison operation
may contain constants, functions, and/or operations on objects, but cannot both
be constants. The comparison terms are type sensitive, i.e., the results of the two
expressions in a term should be data of the same type for primitive-classes or both
IIDs for nonprimitive-classes. $=$, $>$, $<$, $\geq$, $\leq$, and $\neq$ are the legitimate comparisons
for numerical types; $=$ and $\neq$ for character, string, and IID types; and $=$, $\subset$, $\supset$, $\subseteq$, $\supseteq$,
and $\neq$ for set types. The comparison of two IIDs is performed by comparing their
OID portions, since IIDs are the concatenations of the class identifiers and OIDs.
A single valued object or a single IID can be treated either as its own data type in
numerical, string, or IID comparison, or as a set type containing one element in a
set comparison.
As an example of A-Select, we assume that there are two associated classes:
S for stack and Q for queue. The query that selects associated stack and queue object pairs in
which the top and the bottom of the stack have some common object(s) with
those in the head and the tail of the queue can be written as
$\sigma(S*Q)[(top(S) \cup bottom(S)) \cap (head(Q) \cup tail(Q)) \neq \phi]$
For the case where the top equals the head and the bottom equals the tail, we have
$\sigma(S*Q)[top(S)=head(Q) \wedge bottom(S)=tail(Q)]$
(4) A-Project ($\Pi$):
Similar to the projection operation in the relational algebra, an A-Project
operation is defined to project subpattern(s) of a pattern. However, in the relational
algebra, the relationship among the projected attributes is not important,
whereas in the A-algebra, the association among the projected subpatterns must be
maintained so that the associations among the objects in these subpatterns will be
retained. The A-Project operator is defined as follows:
$\Pi(\alpha)[\mathcal{E}; T]$
where $\alpha$ is an association-set defined by an A-algebra expression;
$\mathcal{E}=(e_1, e_2, \ldots, e_n)$ is a set of expressions which specify subpatterns to be projected;
and $T=(t_1, t_2, \ldots, t_m)$ is a set of ordered sets of classes. Each ordered set,
$t_k$, specifies a path connecting two projected subpatterns defined by the expressions.
$e_i$ ($i=1,2,\ldots,n$) is a subexpression of the expression which defines $\alpha$. $e_i$ and
$e_j$ ($i \neq j$) should not contain a common class. There may be many paths connecting
two subpatterns in the original pattern. The path to be retained can be
specified in $t_k$. If a specific path is chosen, a minimal number of classes along the
path which can uniquely identify the path should be specified. The result of an
A-Project operation over a pattern is its subpatterns defined by $\mathcal{E}$ and some paths
defined by $T$ that connect these subpatterns. If a path in the original pattern consists
of all Inter-patterns, a D-inter-pattern is retained. Otherwise, a D-complement-pattern
is included. Multiple paths between two projected subpatterns
can be declared in $T$, if it is so desired.
Figure 4.5c shows an example of A-Project from an association-set $\alpha$ over A*B and
D. For $\alpha^1$, the subpatterns $(a_1b_1)$ and $(d_1)$ satisfy A*B and D, respectively. Therefore,
they are kept in the result. According to the path specification stated in the
operation, a Derived-pattern $(b_1 \frown d_1)$ is added to the result, thus $\gamma^1=(a_1b_1, d_1, b_1 \frown d_1)$. Its
normalized form is $\gamma^1=(a_1b_1, b_1 \frown d_1)$. $\gamma^2$ is produced for the same reason. Since the
remaining pattern of $\alpha$ does not have a subpattern satisfying A*B, only its Inner-pattern
of class D is retained.
(5) NonAssociate (!):
The NonAssociate operator is a binary operator used to identify the association
patterns in one operand association-set that are not associated (over a
specified association) with any pattern in the other association-set, and vice versa,
in the domain of the algebra $\mathcal{A}$. The NonAssociate operator is defined as follows:
$\alpha \,![R(A,B)]\, \beta = \{ \gamma \mid \gamma = (\alpha^i, \beta^j, \overline{a_mb_n}) : \overline{a_mb_n} \in [R(A,B)] \wedge a_m \in \alpha^i \wedge b_n \in \beta^j$
        $\wedge\ \forall((a_mb_{n'}), (a_{m'}b_n) \in \mathcal{A})(a_{m'} \notin \alpha \wedge b_{n'} \notin \beta)$
    or $\gamma = \alpha^i : \exists(m)(a_m \in \alpha^i) \wedge \nexists(n)(b_n \in \beta)$
        $\vee\ \forall(b_n \in \beta)\exists(k, k \neq m)(a_k \in \alpha \wedge (a_kb_n) \in [R(A,B)])$
    or $\gamma = \beta^j : \exists(n)(b_n \in \beta^j) \wedge \nexists(m)(a_m \in \alpha)$
        $\vee\ \forall(a_m \in \alpha)\exists(k, k \neq n)(b_k \in \beta \wedge (a_mb_k) \in [R(A,B)]) \}$
The result of a NonAssociate operation is an association-set. Each of its patterns
is formed by concatenating two patterns $\alpha^i$ and $\beta^j$ via a Complement-pattern
$(\overline{a_mb_n})$ under the condition that $\alpha^i$ is not associated with any pattern of $\beta$ and vice
versa. Furthermore, in the special case where the patterns of $\alpha$ (or $\beta$) have Inner-patterns
of A (or B) and cannot be concatenated with any pattern of $\beta$ (or $\alpha$), these
patterns of $\alpha$ (or $\beta$) will be retained in the result if one of the following three conditions
holds: (1) $\beta$ (or $\alpha$) is an empty association-set, (2) all patterns of $\beta$ (or $\alpha$) do
not have Inner-patterns of B (or A), or (3) all patterns of $\beta$ (or $\alpha$) that have Inner-patterns
of B (or A) can be concatenated with patterns of $\alpha$ (or $\beta$).
An example of the NonAssociate operation is shown in Figure 4.5d. In the
example, $\alpha^1$ and one pattern of $\beta$ are dropped due to the existence of $(b_1c_2)$ in Figure 4.4. $\alpha^2$ is
dropped because it does not contain an Inner-pattern of class B. Another pattern of $\beta$ is dropped
because it does not contain an Inner-pattern of class C. One pattern is in the resultant
association-set because $(b_2)$ is not associated with $(c_4)$ in $\mathcal{A}$ as shown in Figure 4.4
and $(b_3)$ does not appear in $\alpha$. Another resultant pattern exists because $(b_2)$ is not
associated with $(c_3)$ in $\mathcal{A}$.
Note that the NonAssociate operator produces a resultant association-set
which is a subset of that produced by the A-Complement operator, because $\alpha^i$, $\beta^j$,
and $\overline{a_mb_n}$ may form a new pattern only when $a_m$ of $\alpha^i$ does not associate with any
object of B in $\beta$ and $b_n$ of $\beta^j$ does not associate with any object of A in $\alpha$. In fact,
the NonAssociate operator can be expressed in terms of A-Complement and other
operators as follows:
$A \,![R(A,B)]\, B = (A - \Pi(A *[R(A,B)]\, B)[A]) \mid[R(A,B)]\, (B - \Pi(A *[R(A,B)]\, B)[B])$
Thus, NonAssociate is not a primitive operator in a strict sense. However, it is
very useful for query formulation and is therefore included in the set of A-algebra
operators.
Under the same conditions as given for the Associate operator, [R(A,B)] need
not be specified unless there is an ambiguity. The NonAssociate operator is commutative
but not associative.
$\alpha \,![R(A,B)]\, \beta = \beta \,![R(B,A)]\, \alpha$ (commutativity)
$A \,![R(A,A)]\, A = \phi$ (nilpotency)
(6) A-Intersect (•):
The A-Intersect operation is convenient for constructing a pattern with a
branch or a lattice structure (a pattern that has a loop), since a pattern in such
structures can be viewed as the intersection of two patterns. Conceptually, the
A-Intersect operator is equivalent to the JOIN operator in the relational algebra.
It operates on two operand association-sets over a set of specified classes. Two
patterns, one from each association-set, are combined into one if they contain the
same set of Inner-patterns for each specified class. The A-Intersect operation is
defined as follows:
$\alpha_{\{X\}} \bullet\{W\}\, \beta_{\{Y\}} = \{ \gamma \mid \gamma^k = (\alpha^i, \beta^j) :$
    $\forall(CL_n \in \{W\})\ \forall(@ \in CL_n \wedge @ \in \alpha^i)(@ \in \beta^j)$
    $\wedge\ \forall(CL_n \in \{W\})\ \forall(@ \in CL_n \wedge @ \in \beta^j)(@ \in \alpha^i) \}$
Figure 4.5e shows an example of the A-Intersect operation over classes B and
C. The resultant association-set contains four patterns, which are the intersections
of each of $\alpha^1$ and $\alpha^2$ with each of two patterns of $\beta$, since they all have Inner-patterns
$(b_1)$ and $(c_2)$. The other patterns ($\alpha^3$, $\alpha^4$, and the remaining patterns of $\beta$) fail to
produce new patterns because they either have no Inner-pattern in both classes B and C
or have no common Inner-pattern of class C.
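A minimal sketch of A-Intersect follows, showing how a loop pattern arises as the intersection of two linear patterns. The encoding is hypothetical: patterns are frozensets of `(kind, objects)` pairs and the class of an object is the first letter of its name. The requirement that both patterns actually have Inner-patterns of each specified class is an assumption read from the example above, not from the formal definition.

```python
def inter(x, y):
    return ("inter", frozenset([x, y]))

def objects_of(pattern, cls):
    """Objects of class `cls` in a pattern (class = first letter, by assumption)."""
    return {o for _kind, objs in pattern for o in objs if o[0] == cls}

def a_intersect(alpha, beta, classes):
    """Combine a pattern of alpha with a pattern of beta when, for every
    specified class, both contain the same non-empty set of objects."""
    result = set()
    for p in alpha:
        for q in beta:
            if all(objects_of(p, c) and objects_of(p, c) == objects_of(q, c)
                   for c in classes):
                result.add(p | q)
    return frozenset(result)

# Made-up data in the spirit of a student/department loop:
# student s1 reaches department d1 through undergrad u1, and also directly.
via_undergrad = frozenset([inter("s1", "u1"), inter("u1", "d1")])
direct_same = frozenset([inter("s1", "d1")])
direct_other = frozenset([inter("s1", "d2")])

alpha = frozenset([via_undergrad])
beta = frozenset([direct_same, direct_other])
loops = a_intersect(alpha, beta, ["s", "d"])
print(loops)
```

Only the pair that agrees on both the student and the department is combined, which yields the loop pattern; the pattern ending in a different department is discarded.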
The set of classes {W} can be omitted when the A-Intersect operation is performed
on all the common classes of its operands, i.e., $\{W\}=\{X\} \cap \{Y\}$ is implied.
Since a lattice pattern can be transformed into a set of other simple patterns,
an A-Intersect operation for building a complex pattern can be replaced by an
Associate operation followed by an A-Select operation (see Section 4 for detail).
The A-Intersect operator is commutative, conditionally associative and idempotent.
$\alpha \bullet\{W\}\, \beta = \beta \bullet\{W\}\, \alpha$ (commutativity)
$(\alpha_{\{X\}} \bullet\{W_1\}\, \beta_{\{Y\}}) \bullet\{W_2\}\, \gamma_{\{Z\}} = \alpha_{\{X\}} \bullet\{W_1\}\, (\beta_{\{Y\}} \bullet\{W_2\}\, \gamma_{\{Z\}})$ (associativity)
$\alpha \bullet \alpha = \alpha$ (if $\alpha$ is a homogeneous association-set) (idempotency)
The associativity is not always true because there are cases in which a pattern
of $\beta$ which fails to intersect with any pattern of $\gamma$ may succeed by first intersecting
with a pattern of $\alpha$ in the operation ($\bullet\{W_1\}$) and then intersecting with a
pattern of $\gamma$ in the operation ($\bullet\{W_2\}$).

Now we define three set operators, which are different from the corresponding
set operators in relational algebra, since they operate on heterogeneous structures
as well as homogeneous structures.
(7) A-Integrate ($\mathcal{I}$):
The A-Integrate is a unary operator. It reorganizes patterns in an
association-set according to the relationships among patterns with respect to the
classes specified. The A-Integrate operation is defined as follows:
$\mathcal{I}_{\{W\}}(\alpha) = \{ \gamma \mid \gamma = (\alpha_s) :$
    $\forall(k,\ CL_n \in \{W\} \wedge @ \in CL_n \wedge @ \in \alpha^i \wedge \alpha^i \in \alpha_s)(@ \in \alpha^k \wedge \alpha^k \in \alpha_s) \}$
By this definition, a subset of patterns ($\alpha_s$) of $\alpha$ is combined into a single pattern if
every object instance of classes in {W} that appears in a pattern in the subset is
also contained in all other patterns in the subset. If a pattern of $\alpha$ cannot be combined
with any other pattern, it is retained in the resultant association-set as it is.
If no class is specified, patterns in which every pattern has at least one
object instance (of any class) common to another will be integrated into one pattern.
The reorganized association-set will contain patterns which are apart from
each other (refer to Section 4.2).
Figure 4.5f shows two examples. The first example shows an A-Integrate
operation over class A. Patterns that have a common Inner-pattern of class A are
grouped into one ($\gamma^1$ is the integration of $\alpha^1$, $\alpha^2$, and $\alpha^3$; and $\gamma^2$ is the integration of
$\alpha^4$ and $\alpha^5$). All other patterns in $\alpha$ are retained in the result as they are. The
second example illustrates an A-Integrate operation on the same association-set as in
the first example but without specifying a class. The result becomes two patterns,
which are apart and are exactly the same as they appear in the original database,
whereas the same primitive patterns appear more than once in the result of the
first example.
(8) A-Union (+):
Similar to the UNION operation of the relational algebra, A-Union combines
two association-sets into one. However, these two association-sets can contain
heterogeneous association structures. It is important for the A-algebra to be able to
operate on heterogeneous structures because some prior operations may produce
heterogeneous association-sets, which may need to be further processed over the
objects of a common class against other patterns of associations. Unlike the relational
algebra and other O-O query languages, union-compatibility is not a restriction
in the A-algebra. For this reason, the A-algebra has more expressive power. Any
query that can be expressed by a single expression in other languages can be
expressed as a single A-algebra expression but not vice versa. The A-Union operation
is defined as follows:
$\alpha + \beta = \{ \gamma \mid \gamma \in \alpha \vee \gamma \in \beta \}$
The A-Union operator is commutative, associative, and idempotent:
$\alpha + \beta = \beta + \alpha$ (commutativity)
$(\alpha + \beta) + \gamma = \alpha + (\beta + \gamma)$ (associativity)
$\alpha + \alpha = \alpha$ (idempotency)

(9) A-Difference (-):
The A-Difference implements the same concept as the DIFFERENCE operator
in relational algebra but with two differences. First, its operands do not have
to be union-compatible. Second, a pattern in the minuend is retained if it does
not contain any of the patterns in the subtrahend.
$\alpha - \beta = \{ \gamma \mid \gamma = \alpha^i : \forall(j)(\beta^j \not\subseteq \alpha^i) \}$
The example depicted in Figure 4.5g shows that $\alpha^1$ and $\alpha^2$ are dropped since
they both contain a pattern of $\beta$.
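The contrast between A-Difference and an ordinary set difference can be seen in a short sketch. The encoding is an assumption made for illustration only: a pattern is a frozenset of `(kind, objects)` pairs, and containment is tested primitive by primitive, with an Inner-pattern counting as contained when its object occurs anywhere in the larger pattern.

```python
def inter(x, y):
    return ("inter", frozenset([x, y]))

def inner_patterns(pattern):
    return frozenset(obj for _kind, objs in pattern for obj in objs)

def contains(small, big):
    """True if every primitive pattern of `small` occurs in `big`."""
    big_objs = inner_patterns(big)
    return all(
        prim in big or (prim[0] == "inner" and prim[1] <= big_objs)
        for prim in small
    )

def a_union(alpha, beta):
    """A-Union: lump two (possibly heterogeneous) association-sets together."""
    return alpha | beta

def a_difference(alpha, beta):
    """A-Difference: keep a pattern of alpha only if it contains no pattern of beta."""
    return frozenset(p for p in alpha
                     if not any(contains(q, p) for q in beta))

# Made-up heterogeneous operands: alpha holds a two-edge and a one-edge pattern.
long_pattern = frozenset([inter("a1", "b1"), inter("b1", "c1")])
short_pattern = frozenset([inter("a2", "b2")])
alpha = frozenset([long_pattern, short_pattern])
beta = frozenset([frozenset([inter("b1", "c1")])])

print(a_difference(alpha, beta))
print(alpha - beta)  # ordinary set difference, shown for comparison
```

The ordinary set difference removes nothing here, because no pattern of the second operand is equal to a pattern of the first, while A-Difference drops the pattern that contains $(b_1c_1)$ as a subpattern.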
(10) A-Divide ($\div$):
The A-Divide operator implements the concept that a group of patterns with
certain common features contains another set of patterns.
$\alpha \div\{W\}\, \beta = \{ \gamma \mid \gamma = \alpha^i : \alpha^i \in \alpha_s \wedge \forall(j)(\beta^j \subseteq \alpha_s) \}$
where $\alpha_s$ is a subset of the patterns of $\alpha$ which have common Inner-patterns for
all classes of {W} and which together contain all patterns of $\beta$. If {W} is not
specified, the A-Divide operation retains all the patterns of $\alpha$ if each of them
contains at least one pattern of $\beta$ and they together contain all patterns of $\beta$.
Figure 4.5h shows an example of $\alpha$ being divided by $\beta$ with respect to class
B. The A-Divide operation retains $\alpha^1$, $\alpha^2$, and $\alpha^3$ since they all contain Inner-pattern
$(b_1)$ of B and together contain all patterns of $\beta$.

4.3.3 Precedence
The precedence relationships of the above operators are as follows. Unary
operators have higher precedence than binary operators. The precedence of the
seven binary association operators is given in the following order: *, |, !, •, $\div$, -,
and +. Parentheses can be used to alter the precedence relationships.
4.3.4 Summary of operators
(1) Associate (*): Two patterns are concatenated via an Inter-pattern.
(2) A-Complement (|): Two patterns are concatenated via a Complement-pattern.
(3) A-Select ($\sigma$): Patterns that satisfy a specified predicate are selected.
(4) A-Project ($\Pi$): A subpattern is projected from the original pattern.
(5) NonAssociate (!): Two patterns are concatenated via a Complement-pattern
only if each of them cannot be concatenated with any pattern of the other
operand via an Inter-pattern.
(6) A-Intersect (•): Two patterns are combined into a single pattern if their common
classes have common object(s).
(7) A-Integrate ($\mathcal{I}$): Patterns in an association-set are combined if objects of a
specified class in a pattern are common to these patterns.
(8) A-Union (+): Two association-sets are lumped into a single set.
(9) A-Difference (-): A pattern in the minuend is retained if it does not contain
any pattern in the subtrahend.
(10) A-Divide ($\div$): A subset of patterns in the dividend that have certain common
feature(s) and contain all the patterns in the divisor is retained.

4.4 Query Examples
We have formally defined ten association operators and given their simple
mathematical properties. Before exploring other properties, we give some examples
to illustrate how these operators can be used to formulate queries for processing
an O-O database. There can be many alternative expressions for the same
query. Choosing the best one for execution is the task of a query optimizer. The
mathematical properties of these operators can be used for that purpose.
In the following formulation of algebraic expressions, we assume that the user
is using the algebra directly instead of a high-level query language. In the latter
case, the task of generating algebraic expressions would belong to the translator.
To formulate an A-algebra expression for a query, first, we need to construct
an intensional pattern for it by navigating the schema graph of the database as
illustrated in Chapter 3. Then, each edge of the pattern is marked with an operator *,
|, or ! depending on the intended semantics. For simple patterns, the formulation is
straightforward. For patterns with complex structures, we may have to decompose them
into patterns with simpler structures. The expression for the original pattern is
the A-Intersect of the expressions for the decomposed patterns.
First, we formulate expressions for Query 1 to Query 4 given in Chapter 3.
We have identified the intensional patterns for these queries (see Figure 3.3).
Query 1: For all sections, get the majors of students who are taking these
sections.
It is trivial to write an algebraic expression for Query 1, which is represented
by a linear pattern. For this pattern, both edges are marked with * and the
algebraic expression can be formulated as follows:
$\mathcal{I}_{\{Section\}}(\Pi(Section * Student * Department)[Section, Department; Section.Department])$
where the A-Integrate operation groups the resultant patterns by Sections.
Query 2: List students who major and minor in the same department.
For Query 2, the edges of the intensional pattern shown in Figure 3.3c are all
marked with *. Since this loop structure can be viewed as the A-Intersect of two
linear patterns involving both Student and Department, we have
$\Pi(\mathrm{Student} * \mathrm{Undergrad} * \mathrm{Department} \bullet \mathrm{Student} * \mathrm{Department})[\mathrm{Student}]$
where the A-Project operation gets the student objects that satisfy the association
pattern as required by the query.
Query 3: For those students taking section 300 and having majors and/or
minors, get their majors and/or minors.
The expression for the intensional pattern of Query 3 is as follows:
$\mathrm{Section\#} * \mathrm{Section} * (\mathrm{Student} * \mathrm{Department} + \mathrm{Student} * \mathrm{Undergrad} * \mathrm{Department\_1})$
where the A-Union operator is used to realize the OR condition at the class
Student. As long as a student has a major or a minor, the linear pattern from
Student to Department and the linear pattern from Student to Undergrad and to
Department should be retained. In the expression, Department_1 is an alias of
Department, which is used to distinguish major and minor departments. Since the
query asks for the majors and minors of students who are taking section 300, the
A-Select and A-Project operations are used. Thus, we have

$\int_{\{\mathrm{Student}\}}\big(\Pi(\sigma(\alpha)[\mathrm{Section\#} = 300])[\mathrm{Student}, \mathrm{Department}, \mathrm{Department\_1}; \mathrm{Student.Department}, \mathrm{Student.Department\_1}]\big)$
where $\alpha$ is the intensional pattern given above. The result of this
expression contains the derived patterns shown in Figure 3.3g, which are
specified by the [E;T] clause of the projection operation and are reorganized
by an A-Integrate operation. Note that Query 3 cannot be phrased in a single
relational algebra expression since (a) the union operation in relational algebra
requires operands to be union-compatible, (b) using a join operation on Student
can cause a loss of information because not every student has both a major and a
minor, (c) the cartesian product of the majors and minors will produce erroneous
results, and (d) no other operation in the relational algebra can combine two
relations into one.
Query 4: For each teacher, list the sections which he/she does not teach.
The algebraic expression for Query 4 can be easily formulated as follows,
since it is represented by a linear pattern shown in Figure 3.3h. We note that the
A-Complement operator |, rather than the NonAssociate operator !, should be
used for this query, since a teacher may be teaching some courses.
$\mathrm{Teacher} \mid \mathrm{Section}$
Several other query examples are given below. They use the schema graph
given in Figure 3.1. Their corresponding intensional patterns are depicted in
Figure 4.6.
Query 5: List the names of students who teach in the same departments
as their major departments.
We can see from Figure 4.6 that the intensional pattern for this query can be
constructed in two ways. One way is to decompose it into three linear patterns:
Name-Person-Student, Student-Department, and
Student-Grad-TA-Teacher-Department
The A-Intersect of these three patterns produces a pattern that satisfies this
query.
$\Pi(\mathrm{Student} * \mathrm{Person} * \mathrm{Name} \bullet \mathrm{Student} * \mathrm{Department} \bullet \mathrm{Student} * \mathrm{Grad} * \mathrm{TA} * \mathrm{Teacher} * \mathrm{Department})[\mathrm{Name}]$
where the first A-Intersect operation operates over Student and the second
operates over Student and Department. The A-Project operation projects the
names of these students.
Another way is to decompose the intensional pattern into two linear patterns:
Name-Person-Student-Department and
Student-Grad-TA-Teacher-Department
Therefore, we have an alternative expression
$\Pi(\mathrm{Name} * \mathrm{Person} * \mathrm{Student} * \mathrm{Department} \bullet \mathrm{Student} * \mathrm{Grad} * \mathrm{TA} * \mathrm{Teacher} * \mathrm{Department})[\mathrm{Name}]$
Query 6: List the section# of those sections which have not been assigned
a room or have not been assigned a teacher.
Since the query requests sections that have not been assigned a room or a
teacher, these sections must not be connected with any room or any teacher (i.e.,
a section which does not associate with any room and teacher should also be
retained in the result). Therefore, there should be Complement-patterns between
Section and Teacher and between Section and Room, and a single arc between
these two branches as shown in Figure 4.6. We emphasize that the ! operation,
instead of |, should be used to construct these two Complement-patterns. Then
the algebraic expression for this query can be easily formulated as follows:
$\Pi(\mathrm{Section\#} * (\mathrm{Section} \,!\, \mathrm{Room\#} + \mathrm{Section} \,!\, \mathrm{Teacher}))[\mathrm{Section\#}]$
Query 7: List the names of students who take courses 6010 and 6020.
We shall show three ways of formulating an expression for this query. First,
the intensional pattern for Query 7 shown in Figure 4.6 can be constructed by the
A-Intersect of two linear patterns as we did for Query 5:
$\Pi(\sigma(\mathrm{Name} * \mathrm{Person} * \mathrm{Student} * \mathrm{Enrollment} * \mathrm{Course} * \mathrm{Course\#})[\mathrm{Course\#} = 6010]$
$\quad \bullet\ \sigma(\mathrm{Student} * \mathrm{Enrollment\_1} * \mathrm{Course\_1} * \mathrm{Course\#\_1})[\mathrm{Course\#} = 6020])[\mathrm{Name}]$
where Enrollment_1, Course_1, and Course#_1 are the aliases of the classes
Enrollment, Course, and Course#, respectively. This ensures that the A-Intersect
operation will be performed only over the Student class.
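The idea behind this first formulation can be mimicked outside the A-algebra. The following is only a minimal Python sketch under simplifying assumptions: the enrollment data, the function names `students_taking` and `takes_both`, and the flattening of each linear pattern to a (student, course number) pair are all hypothetical and are not part of the A-algebra definitions.

```python
# Hypothetical data: each pair stands for one linear pattern flattened to
# (student name, course number).
enrollments = {
    ("Ann", 6010), ("Ann", 6020),
    ("Bob", 6010),
    ("Cy", 6020), ("Cy", 6030),
}

def students_taking(course_no, pairs):
    """Rough analogue of the A-Select on Course#: students linked to course_no."""
    return {student for student, course in pairs if course == course_no}

def takes_both(first, second, pairs):
    """Rough analogue of the A-Intersect over Student of the two selections."""
    return students_taking(first, pairs) & students_taking(second, pairs)

result = takes_both(6010, 6020, enrollments)
print(sorted(result))
```

Only the student who appears in both selected sets survives, which is the role the aliases play in restricting the A-Intersect to the Student class.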
A second way is to view the original pattern as a linear pattern without
restriction on Course# as follows:
Name-Person-Student-Enrollment-Course-Course#
Students who are taking both courses must participate in at least two such
patterns, with Course#=6010 and Course#=6020, respectively. This implies an
A-Divide operation. Thus, the query can be formulated as follows:
$\Pi(\mathrm{Name} * \mathrm{Person} * \mathrm{Student} * \mathrm{Enrollment} * \mathrm{Course} * \mathrm{Course\#}$
$\quad \div_{\{\mathrm{Student}\}}\ \sigma(\mathrm{Course.Course\#})[\mathrm{Course\#} = 6010 \vee \mathrm{Course\#} = 6020])[\mathrm{Name}]$
where a dot in Course.Course# is used only for identifying the Course# class
which is defined in the Course class. It does not represent a function or a method
as in other languages. This expression can also be rewritten as follows:
$\Pi(\mathrm{Name} * \mathrm{Person} * \Pi(\mathrm{Student} * \mathrm{Enrollment} * \mathrm{Course} * \mathrm{Course\#}$
$\quad \div_{\{\mathrm{Student}\}}\ \sigma(\mathrm{Course.Course\#})[\mathrm{Course\#} = 6010 \vee \mathrm{Course\#} = 6020])[\mathrm{Student}])[\mathrm{Name}]$
which is more suitable for execution than the first since the inner A-Project gets
the student objects who are taking these two courses, so that all other data
associated with these students, such as Enrollment, Course, and Course#, do not
have to be carried along in further processing to get the names of these students.
Details of optimization issues will be addressed in the next chapter.
We stress that the above association pattern expressions represent the
internal algebraic operations that need to be performed if the dynamic inheritance
method is used. The high-level query statements corresponding to these algebraic
expressions issued by the user can be much simpler due to the inheritance of
attributes in the generalization hierarchy or lattice.

Figure 4.1 Regular-edges and Complement-edges in an OG

Figure 4.2 Examples of association patterns: (a) primitive association patterns;
(b) complex association patterns

Figure 4.3 Examples of association-sets

Figure 4.4 A sample database association graph
(The Complement-patterns are not shown)

Figure 4.5 Example of operations on the sample database
(The Complement-patterns are not shown): (a) an Associate operation;
(b) an A-Complement operation; (c) an A-Project operation; (d) a NonAssociate
operation; (e) an A-Intersect operation; (f) A-Integrate operations;
(g) an A-Difference operation; (h) an A-Divide operation

Figure 4.6 Intensional patterns of Query 5, 6, and 7

CHAPTER 5
MATHEMATICAL PROPERTIES OF OPERATORS
AND THEIR APPLICATIONS
IN QUERY OPTIMIZATION AND QUERY DECOMPOSITION
In Section 4.3, we have shown some mathematical properties of individual
operators. In this chapter, we shall study their properties systematically. The
properties of the A-algebra are classified into six categories: (1) conventional
algebraic properties such as commutativity, associativity, idempotency,
nilpotency, and distributivity; (2) nesting of two unary operations; (3) a binary
operation nested in a unary operation; (4) cascading of two different binary
operations; (5) general identities; and (6) operation transformation. The
properties presented in this dissertation are quite extensive, but may not be
complete. These properties provide the mathematical foundation for query
decomposition and query optimization. Their utility in these two applications is
also illustrated in this chapter. The proofs of properties that are marked with
† can be found in the Appendix. Others can be proved similarly.
5.1 Conventional Algebraic Properties
To be systematic, we first list the properties given in Section 4.3 without
explanation, since they have been illustrated previously. Then, we give the
properties of distributivity.
A. Commutativity
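$\alpha *[R(A,B)]\, \beta = \beta *[R(B,A)]\, \alpha$   (5.1 †)
$\alpha \mid[R(A,B)]\, \beta = \beta \mid[R(B,A)]\, \alpha$   (5.2 †)
$\alpha \,![R(A,B)]\, \beta = \beta \,![R(B,A)]\, \alpha$   (5.3 †)
$\alpha \bullet_{\{W\}} \beta = \beta \bullet_{\{W\}} \alpha$   (5.4 †)
$\alpha + \beta = \beta + \alpha$   (5.5 †)
B. Associativity
$(\alpha_{\{X\}} *[R(A,B)]\, \beta_{\{Y\}}) *[R(C,D)]\, \gamma_{\{Z\}} = \alpha_{\{X\}} *[R(A,B)]\, (\beta_{\{Y\}} *[R(C,D)]\, \gamma_{\{Z\}}) \quad (C \notin \{X\} \wedge B \notin \{Z\})$   (5.6 †)
$(\alpha_{\{X\}} \mid[R(A,B)]\, \beta_{\{Y\}}) \mid[R(C,D)]\, \gamma_{\{Z\}} = \alpha_{\{X\}} \mid[R(A,B)]\, (\beta_{\{Y\}} \mid[R(C,D)]\, \gamma_{\{Z\}}) \quad (C \notin \{X\} \wedge B \notin \{Z\})$   (5.7 †)
$(\alpha_{\{X\}} \bullet_{\{W_1\}} \beta_{\{Y\}}) \bullet_{\{W_2\}} \gamma_{\{Z\}} = \alpha_{\{X\}} \bullet_{\{W_1\}} (\beta_{\{Y\}} \bullet_{\{W_2\}} \gamma_{\{Z\}})$   (5.8 †)
(the side condition of 5.8 is illegible in the source)
$(\alpha + \beta) + \gamma = \alpha + (\beta + \gamma)$   (5.9 †)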
C. Idempotency and Nilpotency
$\alpha \bullet \alpha = \alpha$ (if $\alpha$ is a homogeneous association-set)   (5.10)
$\alpha + \alpha = \alpha$   (5.11)
$A *[R(A,A)]\, A = A$   (5.12)
$A \,![R(A,A)]\, A = \emptyset$   (5.13)

$\alpha \div \alpha = \alpha$   (5.14)
D. Distributivity
a) distributive property of * with respect to +:
$\alpha *[R(A,B)]\, (\beta + \gamma) = \alpha *[R(A,B)]\, \beta + \alpha *[R(A,B)]\, \gamma$   (5.15 †)
b) distributive property of | with respect to +:
$\alpha \mid[R(A,B)]\, (\beta + \gamma) = \alpha \mid[R(A,B)]\, \beta + \alpha \mid[R(A,B)]\, \gamma$   (5.16 †)
c) distributive property of • with respect to +:
$\alpha \bullet_{\{X\}} (\beta + \gamma) = \alpha \bullet_{\{X\}} \beta + \alpha \bullet_{\{X\}} \gamma$   (5.17 †)
These three properties hold true for the same reasons. First, the A-Union
operation simply lumps together patterns of two association-sets without
modifying them. Second, when two patterns are operated on by *, |, or •, the
production of a new pattern is independent of other patterns in the operand
association-sets, i.e., the decision whether a new pattern is produced or not is
determined only based on the structure of the two patterns being operated on.
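The pairwise argument above can be exercised on a toy model. The sketch below is a deliberate simplification and not the formal definition of the Associate operator: a pattern is reduced to a frozenset of object names, an association-set to a Python set of such patterns, and `associate` joins two patterns whenever some object of the first is linked by the relation `r` to some object of the second. All names and data are hypothetical.

```python
from itertools import product

def associate(alpha, beta, r):
    """Simplified Associate: the decision to produce a pattern depends only on
    the pair (p, q) being examined, never on other patterns in alpha or beta."""
    out = set()
    for p, q in product(alpha, beta):
        if any((a, b) in r for a in p for b in q):
            out.add(p | q)
    return out

def a_union(x, y):
    """A-Union modelled as plain set union: patterns are lumped together."""
    return x | y

# Hypothetical association R(A,B) and association-sets.
r_ab = {("a1", "b1"), ("a2", "b2"), ("a2", "b3")}
alpha = {frozenset({"a1"}), frozenset({"a2"})}
beta = {frozenset({"b1"}), frozenset({"b2"})}
gamma = {frozenset({"b3"}), frozenset({"b4"})}

lhs = associate(alpha, a_union(beta, gamma), r_ab)
rhs = a_union(associate(alpha, beta, r_ab), associate(alpha, gamma, r_ab))
print(lhs == rhs)
```

Because each output pattern is decided from one pair of operand patterns, splitting the right operand into two sets and taking the union of the results cannot change the outcome, which is the shape of property 5.15 in this toy setting.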
d) distributive property of * with respect to •:
$\alpha_{\{X\}} *[R(CL_1,CL_2)]\, (\beta_{\{Y\}} \bullet_{\{W\}} \gamma_{\{Z\}})$
$\quad = \alpha_{\{X\}} *[R(CL_1,CL_2)]\, \beta_{\{Y\}} \bullet_{\{W \cup X\}} \alpha_{\{X\}} *[R(CL_1,CL_2)]\, \gamma_{\{Z\}}$   (5.18 †)
e) distributive property of | with respect to •:
$\alpha_{\{X\}} \mid[R(CL_1,CL_2)]\, (\beta_{\{Y\}} \bullet_{\{W\}} \gamma_{\{Z\}})$
$\quad = \alpha_{\{X\}} \mid[R(CL_1,CL_2)]\, \beta_{\{Y\}} \bullet_{\{W \cup X\}} \alpha_{\{X\}} \mid[R(CL_1,CL_2)]\, \gamma_{\{Z\}}$   (5.19)
Distributive properties d and e hold true under the following three conditions:
i) $CL_2 \in W$;
ii) $X \cap Y = X \cap Z = \emptyset$; and
iii) $\alpha$ is a homogeneous association-set.
The first condition ensures that the *, |, and ! operations are performed on
the intersection of $\beta$ and $\gamma$. Otherwise, it does not make sense to
have an operation between $\alpha$ and $\gamma$. The second condition states
that $\alpha$ patterns are non-overlapping with $\beta$ and $\gamma$ patterns.
The third condition states that, on the right-hand side of the expression, only
the patterns having the same $\alpha$ patterns as their sub-patterns will
succeed in the A-Intersect operation. Although these two distributive properties
do not hold when one of the above three conditions is not true, they are
equivalent to some other expressions under a less restrictive condition. These
properties are classified in other categories.
It should be noted that two possible distributive properties are missing in the
above list. First, ! is not distributive with respect to +. This property does not
exist because of the way the NonAssociate operation is defined. By its definition,
a pattern in one association-set will be included in the resultant pattern iff it
does not connect to any pattern in the other association-set. This implies a
logical AND concept. Therefore, expressions $\alpha \,!\, (\beta + \gamma)$ and
$\alpha \,!\, \beta + \alpha \,!\, \gamma$ have totally different semantics. The
former stands for patterns in $\alpha$ that are not associated with patterns in
both $\beta$ and $\gamma$; whereas the latter specifies those patterns in
$\alpha$ that are not associated with any pattern in either $\beta$ or $\gamma$.
Second, ! is not distributive with respect to •. This property does not hold
because performing the A-Intersect operation first may drop some $\beta$
patterns which may be associated with some $\alpha$ patterns, and dropping those
$\beta$ patterns may allow those $\alpha$ patterns to be non-associated with the
result of the A-Intersect operation; whereas, when performing the NonAssociate
operation first, those $\alpha$ patterns may not appear in the final result.
The reason that the NonAssociate operator is not distributive with respect to
the A-Union and A-Intersect operations is mainly that it is not associative. We
shall see from the rest of this chapter that it has fewer properties than other
operators.
5.2 Nesting of Two Unary Operations
a) Two A-Select operations (one nested in the other):
Similar to the relational algebra, the order of the nesting of two selections
can be exchanged without affecting the final result. Or, they can be combined
into a single selection operation. The selection condition of the combined
A-Select operation is the conjunction of the predicates of the original two
A-Select operations.
$\sigma_1(\sigma_2(\alpha)[P_2])[P_1] = \sigma(\alpha)[P_1 \wedge P_2]$   (5.20 †)
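The behaviour described for nested selections (property 5.20) can be checked on a small model. This is only an illustrative sketch: patterns are flattened to hypothetical (student, section number, department) tuples, and `a_select` is an assumed stand-in that filters an association-set with a Python predicate.

```python
def a_select(patterns, predicate):
    """Stand-in for A-Select: keep the patterns that satisfy the predicate."""
    return {p for p in patterns if predicate(p)}

# Hypothetical patterns: (student, section number, department).
alpha = {
    ("s1", 300, "CIS"),
    ("s2", 300, "EE"),
    ("s3", 200, "CIS"),
    ("s4", 300, "CIS"),
}

def p1(p):
    return p[1] == 300       # predicate on the section number

def p2(p):
    return p[2] == "CIS"     # predicate on the department

outer_p1 = a_select(a_select(alpha, p2), p1)          # P2 first, then P1
outer_p2 = a_select(a_select(alpha, p1), p2)          # P1 first, then P2
combined = a_select(alpha, lambda p: p1(p) and p2(p)) # single conjunctive select
print(outer_p1 == outer_p2 == combined)
```

Both nesting orders and the single conjunctive selection keep exactly the same patterns, since a pattern survives only if it satisfies both predicates.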

b) Two A-Project operations (one nested in the other):
The order of the nesting of two projection operations cannot be exchanged unless
they project the same thing, which is not meaningful. However, they are
equivalent to a single projection if the outer A-Project operation projects
subpatterns over patterns produced by the inner A-Project.
$\Pi_1(\Pi_2(\alpha)[E_2;T_2])[E_1;T_1] = \Pi_1(\alpha)[E_1;T_1]$   (5.21)
$(\forall e_{1i}\, \exists e_{2j}\, (e_{1i} \in E_1 \wedge e_{2j} \in E_2 \wedge e_{1i} \subseteq e_{2j}))$
where the $e_{1i}$'s are subpattern expressions of the first A-Project operation
and the $e_{2j}$'s are subpattern expressions of the second A-Project operation;
and $e_{1i} \subseteq e_{2j}$ means that $e_{1i}$ defines a subpattern of
$e_{2j}$.
c) Two A-Integrate operations (one nested in the other):
By the definition of the A-Integrate operation, if an A-Integrate operation is
applied a second time on an association-set, it has no effect on the result of
the first operation. Therefore, we have
$\int_{\{W\}}(\int_{\{W\}}(\alpha)) = \int_{\{W\}}(\alpha)$   (5.22)
$\int(\int(\alpha)) = \int(\alpha)$   (5.23)
Since an A-Integrate operation with a set of specified classes only performs part
of the function of an A-Integrate operation without a set of specified classes,
the following equations also hold true.
$\int(\int_{\{W\}}(\alpha)) = \int(\alpha)$   (5.24)
$\int_{\{W\}}(\int(\alpha)) = \int(\alpha)$   (5.25)
d) A-Select nested in A-Project, or vice versa:
A selection operation performed on the result of a projection operation is
equivalent to the projection performed on the result of the selection, since the
selection condition applicable to the projected subpatterns must be applicable to
the patterns before the projection. However, it is not true for the other
direction.
$\sigma(\Pi(\alpha)[E;T])[P] \Rightarrow \Pi(\sigma(\alpha)[P])[E;T]$   (5.26)
For the other direction to be true, the classes involved in the predicate of the
selection condition should also appear in the [E;T] clause of the projection
operation (denoted as PCS), which defines the subpattern(s) to be projected out.
Otherwise, the result of the selection is always an empty set because the
predicate is not applicable to the projected patterns. Therefore, the above
property holds true for both directions when the condition holds, thus we have
$\Pi(\sigma(\alpha)[P])[E;T] = \sigma(\Pi(\alpha)[E;T])[P]$ (the classes involved in $P$ appear in PCS)   (5.27 †)
5.3 A Binary Operation Nested in a Unary Operation
5.3.1 Binary operation nested in an A-Select
a) Associate, A-Complement, or A-Intersect nested in A-Select
Generally speaking, transforming an expression of a binary operation (Associate,
A-Complement, or A-Intersect) nested in a selection into another expression is
impossible, since the predicate of the selection operation can be very
complicated. For this reason, we study only the simple case in which the
predicate has the form $P_1 \wedge P_2$ or $P_1 \vee P_2$, and $P_1$ and $P_2$
are only applicable to $\alpha$ and $\beta$, respectively. The following
properties are similar to those in relational algebra. They do not need an
explanation.
For $P_1 \wedge P_2$, we have
$\sigma(\alpha *[R(A,B)]\, \beta)[P_1 \wedge P_2] = \sigma(\alpha)[P_1] *[R(A,B)]\, \sigma(\beta)[P_2]$   (5.28 †)
$\sigma(\alpha \mid[R(A,B)]\, \beta)[P_1 \wedge P_2] = \sigma(\alpha)[P_1] \mid[R(A,B)]\, \sigma(\beta)[P_2]$   (5.29)
$\sigma(\alpha \bullet \beta)[P_1 \wedge P_2] = \sigma(\alpha)[P_1] \bullet \sigma(\beta)[P_2]$   (5.30)
For $P_1 \vee P_2$, we have
$\sigma(\alpha *[R(A,B)]\, \beta)[P_1 \vee P_2] = \sigma(\alpha)[P_1] *[R(A,B)]\, \beta + \alpha *[R(A,B)]\, \sigma(\beta)[P_2]$   (5.31 †)
$\sigma(\alpha \mid[R(A,B)]\, \beta)[P_1 \vee P_2] = \sigma(\alpha)[P_1] \mid[R(A,B)]\, \beta + \alpha \mid[R(A,B)]\, \sigma(\beta)[P_2]$   (5.32)
$\sigma(\alpha \bullet \beta)[P_1 \vee P_2] = \sigma(\alpha)[P_1] \bullet \beta + \alpha \bullet \sigma(\beta)[P_2]$   (5.33)
We note that the above properties are not true for a NonAssociate operation
nested in an A-Select. The reason is similar to what we have explained in the
section on distributive property.
b) A-Difference nested in A-Select
Since both A-Difference and A-Select operations perform a restriction on an
association-set and produce a subset of patterns without changing their original
structures, an A-Select operation performed on the minuend or on the result of the
A-Difference operation will produce the same result.
$\sigma(\alpha - \beta)[P] = \sigma(\alpha)[P] - \beta$   (5.34 †)
c) A-Union nested in A-Select
It should be obvious that the following equation is always true:
$\sigma(\alpha + \beta)[P] = \sigma(\alpha)[P] + \sigma(\beta)[P]$   (5.35 †)
In the special case that $P$ has the form $P_1 \vee P_2$ and $P_1$ and $P_2$ can
be applied to $\alpha$ and $\beta$, respectively, we have
$\sigma(\alpha + \beta)[P_1 \vee P_2] = \sigma(\alpha)[P_1] + \sigma(\beta)[P_2]$   (5.36 †)
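Properties 5.34 to 5.36 can be illustrated with ordinary Python sets, treating A-Union as set union, A-Difference as set difference, and A-Select as a filter. This is a hedged sketch only: the tagged tuples, the predicates, and the helper `a_select` are hypothetical, and "applicable to" is modelled by a predicate that is simply false on patterns of the other kind.

```python
def a_select(patterns, predicate):
    """Stand-in for A-Select: keep the patterns that satisfy the predicate."""
    return {p for p in patterns if predicate(p)}

# Hypothetical patterns tagged with their kind.
alpha = {("student", "s1", 3.9), ("student", "s2", 2.5), ("student", "s3", 3.7)}
beta = {("teacher", "t1", 60), ("teacher", "t2", 40)}
removed = {("student", "s3", 3.7), ("student", "s9", 3.0)}

def p1(p):
    return p[0] == "student" and p[2] > 3.5   # only meaningful on alpha patterns

def p2(p):
    return p[0] == "teacher" and p[2] > 50    # only meaningful on beta patterns

def p_any(p):
    return p1(p) or p2(p)

# 5.34: selecting after the difference equals selecting the minuend first.
diff_then_select = a_select(alpha - removed, p1)
select_then_diff = a_select(alpha, p1) - removed

# 5.35: a selection distributes over the union.
union_then_select = a_select(alpha | beta, p_any)
select_then_union = a_select(alpha, p_any) | a_select(beta, p_any)

# 5.36: with P = P1 or P2, each operand only needs its own predicate.
split_predicates = a_select(alpha, p1) | a_select(beta, p2)
print(diff_then_select == select_then_diff,
      union_then_select == select_then_union == split_predicates)
```

The three checks rely only on the facts stated in the text: both operations return subsets of unmodified patterns, and the union does not alter the patterns it collects.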
5.3.2 Binary operation nested in A-Project or A-Integrate
Since A-Project and A-Integrate operations produce patterns which may contain
subpatterns of both operands of the nested binary operation, properties similar
to those presented above do not hold in general except for the nesting of an
A-Union operation.
$\Pi(\alpha + \beta)[E;T] = \Pi(\alpha)[E;T] + \Pi(\beta)[E;T]$   (5.37 †)
$\int(\alpha + \beta) = \int(\int(\alpha) + \int(\beta))$   (5.38)
$\int_{\{W\}}(\alpha + \beta) = \int_{\{W\}}(\int_{\{W\}}(\alpha) + \int_{\{W\}}(\beta))$   (5.39)
5.4 Cascading of Two Binary Operations
5.4.1 Cascading of two identical binary operators
Most cases have been covered by the associativity properties. Although
associativity does not hold for the operators $-$ and $\div$, there exist some
equivalent expressions. The cascading of two A-Difference operations follows the
set-difference in set theory.
$\alpha - \beta - \gamma = \alpha - \gamma - \beta = \alpha - (\beta + \gamma)$   (5.40 †)
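Since property 5.40 is stated to follow set difference, it can be checked directly with Python sets standing in for association-sets. The pattern names below are hypothetical placeholders; the sketch assumes that A-Difference and A-Union behave as set difference and set union on whole patterns.

```python
# Hypothetical association-sets; each string stands for one whole pattern.
alpha = {"p1", "p2", "p3", "p4", "p5"}
beta = {"p2", "p6"}
gamma = {"p4", "p2", "p7"}

beta_first = (alpha - beta) - gamma     # alpha - beta - gamma
gamma_first = (alpha - gamma) - beta    # alpha - gamma - beta
union_once = alpha - (beta | gamma)     # alpha - (beta + gamma)
print(sorted(union_once))
```

The order of the two subtractions is irrelevant because a pattern of the minuend is kept exactly when it occurs in neither subtrahend.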
The cascading of two A-Divide operations is equivalent to the dividend
divided by the A-Union of the two divisors because an A-Divide operation retains
patterns of the dividend without modifying their structures (note that the divide
operation in relational algebra retains a substructure of the dividend). Therefore,
the order of the two A-Divide operations is not important.
$\alpha \div_{\{W\}} \beta \div_{\{W\}} \gamma = \alpha \div_{\{W\}} \gamma \div_{\{W\}} \beta = \alpha \div_{\{W\}} (\beta + \gamma)$   (5.41 †)
5.4.2 Cascading of two different binary operations
Many cases have been covered by the distributive properties. Although the
distributivity properties of ! and $\div$ with respect to + do not hold, there
still exist some equivalent expressions. These properties are listed below
according to their first operators.
a) * with other binary operators
The cascading of * and | operators is associative.
$(\alpha_{\{X\}} *[R(A,B)]\, \beta_{\{Y\}}) \mid[R(C,D)]\, \gamma_{\{Z\}} = \alpha_{\{X\}} *[R(A,B)]\, (\beta_{\{Y\}} \mid[R(C,D)]\, \gamma_{\{Z\}}) \quad (C \notin \{X\} \wedge B \notin \{Z\})$   (5.42 †)
The condition ensures that the operation *[R(A,B)] does not operate on $\gamma$
patterns and |[R(C,D)] does not operate on $\alpha$ patterns.

For the cascading of * and - operators (in that order), when the subtrahend is
only applicable to one of the operands of the * operation, the - operation can be
performed first and just against that operand.
$(\alpha_{\{X\}} *[R(A,B)]\, \beta_{\{Y\}}) - \gamma_{\{Z\}} = (\alpha_{\{X\}} - \gamma_{\{Z\}}) *[R(A,B)]\, \beta_{\{Y\}} \quad (\{Y\} \cap \{Z\} = \emptyset)$   (5.43 †)
$\quad = \alpha_{\{X\}} *[R(A,B)]\, (\beta_{\{Y\}} - \gamma_{\{Z\}}) \quad (\{X\} \cap \{Z\} = \emptyset)$
For a similar reason, the following property holds true.
$(\alpha_{\{X\}} *[R(A,B)]\, \beta_{\{Y\}}) \bullet \gamma_{\{Z\}} = (\alpha_{\{X\}} \bullet \gamma_{\{Z\}}) *[R(A,B)]\, \beta_{\{Y\}}$   (5.44 †)
$\quad (\{Y\} \cap \{Z\} = \emptyset \wedge \{Y\} \cap \{W\} = \emptyset \wedge A \in \{X\})$
$\quad = \alpha_{\{X\}} *[R(A,B)]\, (\beta_{\{Y\}} \bullet \gamma_{\{Z\}})$
$\quad (\{X\} \cap \{Z\} = \emptyset \wedge \{X\} \cap \{W\} = \emptyset \wedge B \in \{Y\})$
The first two conditions ensure that $\gamma$ patterns do not intersect with
$\alpha$ and $\beta$ patterns. Otherwise, the A-Intersect operation will perform
over the common classes of $\beta$ and $\gamma$ if the * operation is performed
first. The third condition ensures that $\alpha$ ($\beta$) must contain object
instances of A (B). In other words, the algebraic expression that defines
$\alpha$ ($\beta$) must contain A (B). Otherwise, performing the A-Intersect
operation first may produce a false result when $\gamma$ contains object
instances of A. Note that the right-hand side of the equation is in a
distributive form of * with respect to •. However, the distributive property
cannot be applied, since it requires that A belong to $\alpha$ and $\beta$, and
that $\gamma$ be a homogeneous association-set (refer to Section 5.1).

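b) | with other binary operators
Similar to the above two properties, we have
$(\alpha_{\{X\}} \mid[R(A,B)]\, \beta_{\{Y\}}) - \gamma_{\{Z\}} = (\alpha_{\{X\}} - \gamma_{\{Z\}}) \mid[R(A,B)]\, \beta_{\{Y\}} \quad (\{Y\} \cap \{Z\} = \emptyset)$   (5.45)
$\quad = \alpha_{\{X\}} \mid[R(A,B)]\, (\beta_{\{Y\}} - \gamma_{\{Z\}}) \quad (\{X\} \cap \{Z\} = \emptyset)$
$(\alpha_{\{X\}} \mid[R(A,B)]\, \beta_{\{Y\}}) \bullet \gamma_{\{Z\}} = (\alpha_{\{X\}} \bullet \gamma_{\{Z\}}) \mid[R(A,B)]\, \beta_{\{Y\}}$   (5.46)
$\quad (\{Y\} \cap \{Z\} = \emptyset \wedge \{Y\} \cap \{W\} = \emptyset \wedge A \in \{X\})$
$\quad = \alpha_{\{X\}} \mid[R(A,B)]\, (\beta_{\{Y\}} \bullet \gamma_{\{Z\}})$
$\quad (\{X\} \cap \{Z\} = \emptyset \wedge \{X\} \cap \{W\} = \emptyset \wedge B \in \{Y\})$
c) • with other binary operators
Similar to equations 5.43 and 5.45, we have
$(\alpha_{\{X\}} \bullet \beta_{\{Y\}}) - \gamma_{\{Z\}} = (\alpha_{\{X\}} - \gamma_{\{Z\}}) \bullet \beta_{\{Y\}} \quad (\{Y\} \cap \{Z\} = \emptyset)$   (5.47)
$\quad = \alpha_{\{X\}} \bullet (\beta_{\{Y\}} - \gamma_{\{Z\}}) \quad (\{X\} \cap \{Z\} = \emptyset)$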
d) ! with other operators
As we have mentioned earlier, the ! operator has fewer properties because it is
not associative. Although ! is not distributive with respect to +, the following
decomposition holds true:
$\alpha \,![R(A,B)]\, (\beta + \gamma)$   (5.48 †)
$\quad = \alpha \,![R(A,B)]\, \beta - \Pi(\alpha *[R(A,B)]\, \gamma)[\alpha] + \alpha \,![R(A,B)]\, \gamma - \Pi(\alpha *[R(A,B)]\, \beta)[\alpha]$
or
$\alpha \,![R(A,B)]\, (\beta + \gamma)$   (5.49)
$\quad = (\alpha - \Pi(\alpha *[R(A,B)]\, \beta)[\alpha] - \Pi(\alpha *[R(A,B)]\, \gamma)[\alpha])$
$\quad\quad ![R(A,B)]\, ((\beta - \Pi(\alpha *[R(A,B)]\, \beta)[\beta]) + (\gamma - \Pi(\alpha *[R(A,B)]\, \gamma)[\gamma]))$
where $\alpha$, $\beta$, and $\gamma$ are homogeneous association-sets.
The significance of equations 5.48 and 5.49 is that they can be used to
transform the original expressions, in which the ! operators operate on
heterogeneous association-sets (e.g., $\alpha + \beta$) for which the
distributivity cannot be applied, into expressions in the format of A-Unions of
homogeneous association-sets.
e) $\div$ with other operators
An association-set ($\alpha$) divided by the A-Union of two other
association-sets ($\beta$ and $\gamma$) is equivalent to two consecutive A-Divide
operations of $\alpha$ divided by $\beta$ and $\gamma$ in turn, as indicated in
equation 5.41. The order of the two A-Divide operations is not important.
$\alpha \div_{\{W\}} (\beta + \gamma) = \alpha \div_{\{W\}} \beta \div_{\{W\}} \gamma$   (5.50)
$\quad = \alpha \div_{\{W\}} \gamma \div_{\{W\}} \beta$
The A-Divide operator also has fewer properties because it is not associative.
f) - with other binary operators
The properties of operator - cascaded with other operators are covered by
5.43, 5.45, and 5.47.
g) + with other binary operators
The equation below follows the set-union and set-difference operations in set
theory.
$(\alpha + \beta) - \gamma = (\alpha - \gamma) + (\beta - \gamma)$   (5.51 †)
The properties of cascading + with the operators *, |, •, and ! can be found in
5.15, 5.16, 5.17, 5.48, and 5.49, since the latter operators are commutative.
5.5 General Identities
There are many other properties which are unique to the A-algebra but cannot
be classified into the above categories. Listed below are some identity
properties. These identities are useful for expression reduction.
A • A * B = A * B (5.52)
A • A ! B = A ! B (5.53)
A + Π(A | B)[A] = A (5.54)
A*B*C*A*B = A*B*C (5.55)
5.6 Transformation of Operators
An important fact we have observed is that the same pattern can be constructed
by different algebraic expressions using different operators. For example,
pattern A-B-C can be constructed either by A * B * C or by B * A • B * C, hence
B * A • B * C = A * B * C (5.56)
Formally, their equivalence can be derived using the properties presented in
the previous sections:
B * A • B * C
= (B • B * C) *[R(B,A)] A   (by 5.44)
= (B * C) *[R(B,A)] A   (by 5.52)
= A * (B * C)   (by 5.1)
= A * B * C   (by 5.6)
For the other direction, we have
A * B * C
= A * (B * B) * C   (by 5.10)
= A * (B • B) * C   (by 5.10, 5.12)
= (A * B • B) * C   (by 5.44)
= A * B • B * C   (by 5.44)
Using this property, a pattern of tree structure can be described without using
the A-Intersect operator, which is relatively more expensive to implement. For
example,
A * (B * C, B * D)
= A *[R(A,B)] (C * B * D)   (by 5.56)
= A * (B * C *[R(B,D)] D)   (by 5.1, 5.6)
= A * B * C *[R(B,D)] D   (by 5.6)
Another useful transformation is possible because a pattern of lattice structure
expressed by an intersection of two linear patterns can be viewed as a selection
on linear patterns, which avoids the expensive A-Intersect operation. For example,
A*B*C*D • B*E*D = σ(A*B*C*D*E*B_1)[B = B_1]   (5.57)
The left-hand side constructs a lattice pattern by intersecting two linear
patterns over classes B and D. By breaking the lattice pattern at B, it becomes a
single linear pattern as seen on the right-hand side of the above expression.
Here, B_1 is an alias of B. By specifying that B = B_1 in the association-set
defined by A*B*C*D*E*B_1, we obtain the same result as the expression defined on
the left-hand side.
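The idea behind property 5.57 can be imitated with plain tuples. The sketch below is an analogue under strong simplifying assumptions, not an implementation of the A-algebra: each association is a hypothetical set of object pairs, a linear pattern is a tuple of object names, `chain` builds linear patterns by following successive associations, and the alias B_1 is modelled as the last position of the longer chain.

```python
def chain(*relations):
    """Build linear patterns: tuples whose neighbours are linked by successive relations."""
    paths = set(relations[0])
    for rel in relations[1:]:
        paths = {p + (y,) for p in paths for (x, y) in rel if p[-1] == x}
    return paths

def inverse(rel):
    return {(y, x) for (x, y) in rel}

# Hypothetical associations between classes A, B, C, D and E.
ab = {("a1", "b1"), ("a2", "b2")}
bc = {("b1", "c1"), ("b2", "c2")}
cd = {("c1", "d1"), ("c2", "d2")}
be = {("b1", "e1"), ("b2", "e2"), ("b3", "e3")}
ed = {("e1", "d1"), ("e2", "d9"), ("e3", "d2")}

# Left-hand side: intersect A*B*C*D with B*E*D over the classes B and D.
abcd = chain(ab, bc, cd)
bed = chain(be, ed)
lattice = {
    (a, b, c, d, e)
    for (a, b, c, d) in abcd
    for (b2, e, d2) in bed
    if b == b2 and d == d2
}

# Right-hand side: one linear pattern A*B*C*D*E*B_1, then select B = B_1.
linear = chain(ab, bc, cd, inverse(ed), inverse(be))
via_selection = {p[:5] for p in linear if p[1] == p[5]}
print(lattice == via_selection)
```

In this toy data the chain through a2 reaches d2 and returns to a different B object (b3), so the selection B = B_1 discards it, just as the intersection over B and D does.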

Based on these two transformation properties, a complicated network structure
can be viewed as a forest structure by properly breaking all the loops in the
network, and its algebraic expression can be specified using the σ, *, |, and !
operators.
5.7 Applications in Query Optimization and Query Decomposition
We have systematically presented the mathematical properties of the operators
of the A-algebra. In this section, their utility in query optimization and query
decomposition will be illustrated.
5.7.1 Applications in query optimization
Generally, query processing consists of three phases: translation, optimization,
and execution. A query issued by the user is in the form of a high-level language.
First, it is translated into an internal representation, an access plan, which may
not be efficient for execution. Then, the optimizer generates a new access plan
which is equivalent to the original access plan (i.e., they produce the same
result) and is "optimal" for execution. Finally, the new access plan is scheduled
for execution by the transaction manager to produce the result of the query. Since
it is difficult to determine the equivalence of two statements in a high-level
language, alternative access plans cannot be generated by the query translator. In
relational databases, the access plan generated by the query translator is in the
form of a query tree built from relational algebra operators, so that the
mathematical properties of the algebra can be used to generate equivalent access
plans, even if the high-level language is based on the relational tuple calculus
or domain calculus (refer to Chapter 2).
Query optimization is, in general, an NP-hard problem. Therefore, an access plan
generated by the optimizer is optimal only in a very restrictive sense.
Furthermore, to be practical, the overhead of the optimizer should never exceed
the advantage of query optimization. In general, a query optimizer generates an
optimal access plan in two steps: (1) generate a limited number of equivalent
access plans, and (2) evaluate these access plans based on a few system
parameters and criteria.
The mathematical properties of the A-algebra presented above are the foundation
for the first step of query optimization in O-O databases. In the second step,
the system/application chooses one or more of the following as the goal of its
query optimization: minimal response time, minimal execution time, minimal
communication time, minimal storage space, maximal resource utilization, etc. The
parameters used in estimating the performance of an access plan include
communication cost (per block), CPU cost (per unit), I/O cost (per I/O), buffer
size, selectivities of operations (e.g., Selection and Join in relational
databases), data structure, algorithms of the operations (e.g., nested-join,
hash-join), etc.
Since the criteria of optimization are system/application dependent and the
optimization strategies vary from system to system, a detailed study is out of the
scope of this dissertation. We shall give an example to demonstrate the
importance of the A-algebra in query optimization.

Query 8: List GPAs of students who major and minor in the same
departments.
The intensional pattern for this query is shown in Figure 5.1a. Suppose that
the algebraic expression produced by the query translator is as follows, which
corresponds to an access plan represented by the query tree shown in Figure 5.1b.
$\Pi(\mathrm{GPA} * (\mathrm{Student} * \mathrm{Department} \bullet \mathrm{Student} * \mathrm{Undergrad} * \mathrm{Department}))[\mathrm{GPA}]$
To make the evaluation easy, we assume that every student has a major, a minor,
and a GPA (i.e., the selectivities of all * operations are 1.0) and that 100 out
of $10^4$ students major and minor in the same departments (i.e., the selectivity
of the • operation is $1/10^2$). If the time to perform an A-Select on a pattern
is 1 unit, to perform an Associate operation is 2 units, and to perform an
A-Intersect operation is 5 units, the total execution time, not including the time
for the A-Project operation, can be calculated as follows:
$T_1 = (2 \times 10^4) + (4 \times 10^4) + (5 \times 10^4) + 200 = 11.02 \times 10^4$
where the first term is the time for identifying students' majors, the second term
is for identifying students' minors, the third term is for the A-Intersect
operation, and the last term is for identifying the GPAs. In Figure 5.1b, the
costs of operations are depicted next to the operator nodes. Here, the time for
the A-Intersect operation is small because each student has only one major and one
minor and indices may be used to speed up the operation.
Using property 5.57, the same intensional pattern can be viewed as a linear
pattern shown in Figure 5.2a, and thus, the optimizer generates a new algebraic
expression, which corresponds to the access plan shown in Figure 5.2b.

$\Pi(\sigma(\mathrm{GPA} * \mathrm{Student} * \mathrm{Department} * \mathrm{Undergrad} * \mathrm{Student\_1})[\mathrm{Student} = \mathrm{Student\_1}])[\mathrm{GPA}]$
The total execution time for this access plan is
$T_2 = (8 \times 10^4) + 10^4 = 9 \times 10^4$
where the first term is the time for four Associate operations and the second term
is the time for the selection operation. It is less expensive than the original
access plan and is thus a better plan.
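The two totals can be reproduced with a few lines of Python. This is a sketch of the arithmetic only; the constant and function names are hypothetical, while the unit costs and cardinalities are the ones assumed in the text (1, 2, and 5 units per pattern, 10^4 students, 100 qualifying students).

```python
N_STUDENTS = 10**4      # students in the database
N_QUALIFIED = 100       # students who major and minor in the same department
COST_SELECT = 1         # A-Select, per pattern
COST_ASSOCIATE = 2      # Associate, per pattern
COST_INTERSECT = 5      # A-Intersect, per pattern

def plan1_cost():
    """Intersect the major and minor patterns, then fetch GPAs."""
    majors = COST_ASSOCIATE * N_STUDENTS        # Student * Department
    minors = 2 * COST_ASSOCIATE * N_STUDENTS    # Student * Undergrad * Department
    intersect = COST_INTERSECT * N_STUDENTS
    gpas = COST_ASSOCIATE * N_QUALIFIED         # GPA * qualified students
    return majors + minors + intersect + gpas

def plan2_cost():
    """One linear pattern with four Associate operations, then a selection."""
    associates = 4 * COST_ASSOCIATE * N_STUDENTS
    select = COST_SELECT * N_STUDENTS
    return associates + select

print(plan1_cost(), plan2_cost())
```

With these assumptions the first plan costs 110200 units (11.02 × 10^4) and the second 90000 units (9 × 10^4), matching the totals above.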
However, assume that the database is distributed, with the data on students’ GPAs
at site 1 and the other data at site 2 (the class Student has to be replicated at
both sites). The communication cost is assumed to be 1000 units per block, with a
block size of 100 patterns. The total execution times for these two access plans
can be calculated as follows:
$T_1 = (2 \times 10^4) + (4 \times 10^4) + (5 \times 10^4) + 1000 + 200 = 11.12 \times 10^4$
$T_2 = (8 \times 10^4) + 10^4 + 10^5 = 19 \times 10^4$
In $T_1$, the fourth term is the communication cost for sending qualified students to
site 1. In $T_2$, the third term is the communication cost (the communication costs
are the same for sending GPAs of all students to site 2 and for sending students’
majors and minors to site 1). In this case, the first access plan is better than the
second. Figures 5.3a and 5.3b depict the costs of operations (next to the operations)
and the costs of communications (on the edges) for these two access plans.
The optimizer of the distributed system may generate another access plan by
applying property 5.28 to the algebraic expression of the second access plan, and
we have
$\Pi(GPA * \sigma(Student * Department * Undergrad * Student\_1)[Student = Student\_1])[GPA]$
which corresponds to the access plan shown in Figure 5.3c. The total execution
time for this access plan is
$T_3 = (6 \times 10^4) + 10^4 + 10^3 + 200 = 7.12 \times 10^4$
where the first term is the time for the three Associate operations nested in the
A-Select, the second term accounts for the selection operation, the third term
accounts for the communication cost, and the last term is the time for getting
GPAs. Therefore, the third access plan is the optimal one for execution.
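The cost figures above can be reproduced with a small calculation. The following is only a sketch under stated assumptions: the constant and function names are hypothetical, the second term of plan 1 is read as two Associate operations, and the GPA lookup is charged as one Associate per qualified student. None of these readings is spelled out in the text beyond the totals.

```python
import math

N_STUDENTS = 10**4        # number of student patterns implied by the example
SELECTIVITY_DENOM = 100   # selectivity 1/10^2: one student in 100 qualifies
COST_SELECT = 1           # A-Select, units per pattern
COST_ASSOCIATE = 2        # Associate, units per pattern
COST_INTERSECT = 5        # A-Intersect, units per pattern
COST_PER_BLOCK = 1000     # communication cost per block
BLOCK_SIZE = 100          # patterns per block


def comm_cost(n_patterns):
    """Cost of shipping n_patterns between sites, charged per block."""
    return math.ceil(n_patterns / BLOCK_SIZE) * COST_PER_BLOCK


def plan_costs(distributed):
    """Return (T1, T2, T3) for the three access plans of Query 8."""
    qualified = N_STUDENTS // SELECTIVITY_DENOM
    # Plan 1: majors (assumed 1 Associate), minors (assumed 2 Associates),
    # A-Intersect, then GPAs of the qualified students.
    t1 = (COST_ASSOCIATE * N_STUDENTS + 2 * COST_ASSOCIATE * N_STUDENTS
          + COST_INTERSECT * N_STUDENTS + COST_ASSOCIATE * qualified)
    # Plan 2: four Associates over the linear pattern, then one A-Select.
    t2 = 4 * COST_ASSOCIATE * N_STUDENTS + COST_SELECT * N_STUDENTS
    # Plan 3: three Associates, A-Select, then GPAs of the qualified students.
    t3 = (3 * COST_ASSOCIATE * N_STUDENTS + COST_SELECT * N_STUDENTS
          + COST_ASSOCIATE * qualified)
    if distributed:
        t1 += comm_cost(qualified)    # qualified students go to site 1
        t2 += comm_cost(N_STUDENTS)   # all GPAs (or all majors/minors) move
        t3 += comm_cost(qualified)    # qualified students go to site 1
    return t1, t2, t3


print(plan_costs(False)[:2])  # centralized T1, T2
print(plan_costs(True))       # distributed T1, T2, T3
```

Under these assumptions the centralized run gives 110200 and 90000 for the first two plans, and the distributed run gives 111200, 190000, and 71200, which matches the totals quoted in the text.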
5.7.2 Applications in query decomposition
The O-O modeling techniques incorporate many high-level features such as
association types, inheritance, behavioral properties of objects, knowledge and
rules, etc. in the DBMS. These features were taken care of by database administrators
and application programs in conventional database systems. To ensure
good performance, O-O DBMSs need the support of parallel and distributed processing
techniques.
In a distributed and parallel processing environment, a query is decomposed
into subqueries according to the processing capabilities of processors and/or the data
distribution. The algebraic representation of a query can be manipulated mathematically
for this purpose. For example, suppose a query is represented by the intensional
pattern shown in Figure 5.4a. The algebraic expression for this query can be
written as follows:
expr = A * (B*E*F + B*(C*D*H • C*G)).
By applying the distributivity properties, the above expression can be rewritten as
follows:
expr = A * (B*E*F + B*C*D*H • B*C*G)
     = A*B*E*F + A * (B*C*D*H • B*C*G)
     = A*B*E*F + A*B*C*D*H • A*B*C*G.
The decomposed expression is the A-Union of two sub-expressions representing
the two sub-patterns shown in Figure 5.4b. These sub-expressions are independent of
each other and can be processed in parallel in a parallel system. The second
sub-expression can be further optimized as shown in the following expression, in which
*[R(C,G)] indicates that the Associate operation is performed through the association
between C and G.
expr = A*B*E*F + (A*B*C*D*H) *[R(C,G)] G.
In addition, since each sub-expression represents a homogeneous association-set, its
processing will be more efficient than processing over heterogeneous
association-sets.
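The decomposition by distributivity can be mimicked mechanically. The sketch below is illustrative only: the tuple encoding of expressions and the helper names `chain`, `expand`, and `render` are assumptions, and the rewriter ignores any side conditions that the distributivity properties may carry. It treats Associate as concatenating linear chains, and it distributes Associate over A-Intersect and A-Union, and A-Intersect over A-Union.

```python
def chain(*names):
    """Build a left-nested Associate chain such as B*E*F."""
    expr = names[0]
    for name in names[1:]:
        expr = ("*", expr, name)
    return expr


def expand(expr):
    """Return the A-Union terms of expr.

    Each term is a list of chains joined by A-Intersect, and each chain
    is a list of class names joined by Associate.
    """
    if isinstance(expr, str):
        return [[[expr]]]
    op, left, right = expr
    left_terms, right_terms = expand(left), expand(right)
    if op == "+":
        return left_terms + right_terms
    terms = []
    for lt in left_terms:
        for rt in right_terms:
            if op == "•":
                terms.append(lt + rt)
            elif op == "*":
                if len(lt) > 1 and len(rt) > 1:
                    # Not needed for the example, so left unhandled here.
                    raise ValueError("Associate of two A-Intersects is not handled")
                terms.append([lc + rc for lc in lt for rc in rt])
            else:
                raise ValueError(f"unknown operator {op!r}")
    return terms


def render(terms):
    """Print union terms with the precedence * over • over +."""
    return " + ".join(" • ".join("*".join(c) for c in term) for term in terms)


query = ("*", "A", ("+", chain("B", "E", "F"),
                    ("*", "B", ("•", chain("C", "D", "H"), chain("C", "G")))))
print(render(expand(query)))  # A*B*E*F + A*B*C*D*H • A*B*C*G
```

Each element of the list returned by `expand` is one independent sub-expression, so a scheduler could hand the elements to different processors.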
Next, we present two theorems of the A-algebra, which ensure that the
decomposed sub-expressions produce homogeneous association-sets.
Theorem 5.1:
Operators of the A-algebra (except A-Union and A-Integrate) produce
homogeneous association-sets if their operands are homogeneous
association-sets.
Proof: This is true by the definitions of the operators (the A-Intersect operation should
be used without specifying the classes on which it is performed,
i.e., it is performed on the common classes of its operands). Note that, for the
A-Difference and A-Divide operations, this is also true if only the first operand
(the minuend or the dividend) is a homogeneous association-set.
Theorem 5.2:
If an A-algebra expression contains neither an A-Integrate operation
nor an A-Divide operation whose dividend is a heterogeneous
association-set, it can be decomposed into the A-Union of sub-expressions,
each of which produces a homogeneous association-set.
Proof: According to Theorem 5.1, besides the A-Integrate operation, the A-Union
is the only operator that can produce a heterogeneous association-set when its
operands are homogeneous association-sets. Therefore, it suffices to prove that
whenever such a heterogeneous association-set appears in an expression, the expression
can be decomposed into the A-Union of sub-expressions which produce homogeneous
association-sets.
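Let $\alpha$, $\beta$, $\gamma$, and $\lambda$ all be homogeneous association-sets. By properties 5.15,
5.16, 5.17, 5.35, and 5.37 we have
$(\alpha + \beta) * (\gamma + \lambda) = \alpha * \gamma + \alpha * \lambda + \beta * \gamma + \beta * \lambda$
$(\alpha + \beta) \,|\, (\gamma + \lambda) = \alpha | \gamma + \alpha | \lambda + \beta | \gamma + \beta | \lambda$
$(\alpha + \beta) \bullet (\gamma + \lambda) = \alpha \bullet \gamma + \alpha \bullet \lambda + \beta \bullet \gamma + \beta \bullet \lambda$
$\sigma(\alpha + \beta)[B] = \sigma(\alpha)[B] + \sigma(\beta)[B]$
$\Pi(\alpha + \beta)[\mathcal{E};\mathcal{T}] = \Pi(\alpha)[\mathcal{E};\mathcal{T}] + \Pi(\beta)[\mathcal{E};\mathcal{T}]$
By property 5.56, we have
$(\alpha + \beta) - \gamma = (\alpha - \gamma) + (\beta - \gamma)$
By property 5.42, we have
$(\alpha + \beta) \,!\, (\gamma + \lambda)$
$= (\alpha ! \gamma - \Pi(\alpha * \lambda)[\alpha] - \Pi(\beta * \gamma)[\gamma])$
$+ (\alpha ! \lambda - \Pi(\alpha * \gamma)[\alpha] - \Pi(\beta * \lambda)[\lambda])$
$+ (\beta ! \gamma - \Pi(\beta * \lambda)[\beta] - \Pi(\alpha * \gamma)[\gamma])$
$+ (\beta ! \lambda - \Pi(\beta * \gamma)[\beta] - \Pi(\alpha * \lambda)[\lambda])$
In the above decompositions, each term of the A-Union operations represents a
homogeneous association-set. □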

Figure 5.1 Access plan 1 of Query 8: (a) intensional pattern of Query 8; (b) access plan 1 of Query 8

Figure 5.2 Access plan 2 of Query 8: (a) alternative intensional pattern of Query 8 (GPA, Student, Department, Undergrad, Student_1); (b) access plan 2 of Query 8

Figure 5.3 Costs in a distributed system: (b) cost of access plan 2; (c) cost of access plan 3

Figure 5.4 Example of query decomposition: (a), (b)

CHAPTER 6
COMPLETENESS OF THE A-ALGEBRA
We have shown in the preceding sections that a query issued against an O-O
database can be specified by an association (or graphic) pattern, in which object
instances of interest are related (associated or nonassociated), and that the
A-algebra provides a useful mathematical method for specifying and manipulating
such patterns to produce the result for the query. However, for the algebra to be
truly useful, the completeness of the algebra needs to be addressed.
Due to the closure property of the A-algebra, the result of a query is
represented intensionally by a subdatabase schema graph $SG_t$ and extensionally by
a subdatabase object graph $OG_t$, where $SG_t$ is a subgraph of the SG of the original
database and $OG_t$ is a subset of association patterns in the original object
graph OG. A subdatabase can be further operated upon by the A-algebra operators
to produce other subdatabases. We can therefore define the completeness of
the algebra in the following way.
Completeness Theorem:
The A-algebra is complete if it can define all possible subdatabases of an O-O
database.
Before proving the theorem, we first give the formal definitions of the $SG_t$
and $OG_t$ of the subdatabases of an O-O database.
Subdatabase Schema Graph:
A subdatabase schema graph ($SG_t$) is a set of m connected subgraphs,
$\{SG^j(C,A)\}$, from the original database schema graph SG(C,A),
where C is a set of vertices representing classes $\{C_i\}$ and A is a set of edges
representing associations between classes, each of which is denoted by $A_{ij}$
for an association between classes $C_i$ and $C_j$. If $C_i \in SG^j$, then $C_i \notin SG^k$ ($\forall k \neq j$).
The condition ensures that a class does not appear in two different connected
graphs in a subdatabase. If it did, the two connected graphs should have been a
single connected graph.
Subdatabase Object (Association) Graph:
A subdatabase object graph ($OG_t(O,E)$) contains a subset of the association patterns
of the original database object graph ($OG(O,E)$), where O is a set of
vertices representing object instances and E is a set of edges representing
associations between object instances. An Inner-pattern (or object instance
$O_{ij}$) belongs to $OG_t$ only if $C_i \in SG_t$ and $O_{ij} \in C_i$. An Inter-pattern or a
Complement-pattern ($O_{ij}$===$O_{mn}$) belongs to $OG_t$ only if $C_i, C_m \in SG_t$ and
$A_{im} \in SG_t$, where $O_{ij} \in C_i$, $O_{mn} \in C_m$, and $O_{ij}$===$O_{mn} \in A_{im}$.
The above conditions state that a primitive association pattern should not be
included in $OG_t$ if the corresponding classes and/or associations of the original
database are not in $SG_t$.
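The membership conditions for the subdatabase object graph can be read as a filter over the primitive patterns of the original object graph. The sketch below is only an illustration: the function name, the dictionary and tuple encodings, and the sample objects are all hypothetical, and it assumes at most one association per pair of classes. Because the conditions are stated as "only if", the function returns the largest object graph that a given schema subgraph permits, not a particular query result.

```python
def restrict_object_graph(class_of, edges, sg_classes, sg_assocs):
    """Keep only the primitive patterns allowed by a subdatabase schema graph.

    class_of   : dict mapping object id -> class name (the Inner-patterns)
    edges      : set of (obj_a, obj_b, kind), kind is "inter" or "complement"
    sg_classes : set of class names kept in the subdatabase schema graph
    sg_assocs  : set of frozenset({class_a, class_b}) associations kept in it
    """
    # An Inner-pattern survives only if its class is in the schema subgraph.
    kept_objects = {o for o, c in class_of.items() if c in sg_classes}
    kept_edges = set()
    for a, b, kind in edges:
        # An Inter- or Complement-pattern needs both classes and the association.
        classes_ok = a in kept_objects and b in kept_objects
        assoc_ok = frozenset((class_of[a], class_of[b])) in sg_assocs
        if classes_ok and assoc_ok:
            kept_edges.add((a, b, kind))
    return kept_objects, kept_edges


class_of = {"s1": "Student", "d1": "Department", "g1": "GPA"}
edges = {("s1", "d1", "inter"), ("s1", "g1", "inter")}
objects, kept = restrict_object_graph(
    class_of, edges,
    sg_classes={"Student", "Department"},
    sg_assocs={frozenset(("Student", "Department"))},
)
print(sorted(objects), sorted(kept))
```

In this example the GPA object and its edge are dropped because the class GPA is not part of the chosen schema subgraph.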
Instead of proving the completeness theorem as stated above, we make the
following observations and restate the theorem as shown below.
First, although the SG of an O-O database may consist of more than one
connected graph, it suffices to prove the case in which the SG is a single connected
graph, since if two classes do not have a path between them in the SG, they will
not be associated with each other in any of the subdatabases. Therefore, each
connected graph of SG can be treated as an independent database, and a subdatabase
defined on more than one connected graph of SG can be represented by the
A-Union of the subdatabases defined on the different connected graphs of SG.
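Splitting a schema graph into its connected graphs is a standard connected-components computation. The following sketch, with a hypothetical function name and hypothetical class names, shows one way to obtain the independent pieces on which the proof can then proceed separately.

```python
from collections import defaultdict


def connected_components(classes, associations):
    """Group classes into the connected graphs of a schema graph.

    classes      : iterable of class names (vertices)
    associations : iterable of (class_a, class_b) pairs (undirected edges)
    """
    adjacency = defaultdict(set)
    for a, b in associations:
        adjacency[a].add(b)
        adjacency[b].add(a)
    seen, components = set(), []
    for start in classes:
        if start in seen:
            continue
        # Depth-first traversal collecting everything reachable from start.
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


parts = connected_components(["A", "B", "C", "D"], [("A", "B"), ("C", "D")])
print(parts)
```

Here the classes A and B never share a path with C and D, so the two groups can be handled as independent databases whose results are combined by A-Union.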
Second, it suffices to prove the case in which a subdatabase consists of only one
connected subgraph of SG, although in general the $SG_t$ of a subdatabase may contain
more than one subgraph of SG. This is because the general case can be
represented by the A-Union of the expressions for the individual subgraphs.
Third, since an O-O database is a collection of association patterns, it should
be obvious that if there exists an A-algebra expression for every association pattern
of an O-O database, then the subdatabases can be represented by the
A-Union of a subset of these association patterns. Therefore, the completeness
theorem can be restated as follows:
Completeness Theorem:
The A-algebra is complete if there exists an expression for every association
pattern in the OG of an O-O database.
We prove the above theorem by induction on the number of object instances
in an association pattern.
Proof:
Base: We first show that there is an expression for the case in which an association
pattern contains a single object instance. Since the name of a class, say $C_1$,
represents all the object instances of the class, an association pattern containing a
single object instance of that class can be represented by an A-Select operation
over the object instances of $C_1$ to select a particular object instance of interest, as
shown below:

$\sigma(C_1)[B]$
where B is the condition an object instance of $C_1$ must satisfy.
Hypothesis: Assume that there exists an expression for every association pattern
that contains n-1 object instances. These n-1 object instances must form a connected
graph, i.e., there must be at least one path between any two
object instances in the graph. Otherwise, they would have formed multiple association
patterns.
Induction: Suppose there exists an expression for an association pattern $P^{n-1}$ which
contains n-1 object instances. When the nth object instance is added to this pattern,
a new pattern $P^n$ containing n object instances can be formed in the following
two ways, as depicted in Figure 6.1: (a) the nth object instance belongs to class
$C_k$ and the object instances of $C_k$ do not participate in $P^{n-1}$; and (b) the nth object
instance belongs to a class, say $C_j$, which has some object instance(s) participating
in $P^{n-1}$. To avoid complicated notation, we show the formulations
for two specific patterns depicted in Figures 6.2a and 6.2b, which correspond to
the cases of Figures 6.1a and 6.1b, respectively. Patterns in general form can be
formulated using the same mechanism described below. We discuss cases a
and b in turn.
Case a: When an object instance of $C_7$ is added to a pattern $P^{11}$ containing 11
object instances, various new patterns $P^{12}$ can be formed, depending on the associations
between the new object instance and the existing object instances.
The new object instance can have only one association with an existing object
instance if their classes are directly connected in SG by a single association type
(we will consider later the case in which there is more than one association type
between two classes). There are only three possible choices for how the new object
instance relates to an existing object instance: (1) the association is of no
interest, i.e., the association is not included in the pattern; (2) they are associated
with each other; (3) they are not associated with each other. Graphically, we use a
solid line (an Inter-pattern) to represent choice 2 and a dashed line (a
Complement-pattern) to represent choice 3. No line is drawn between the two
object instances for choice 1. Note that at least one of the associations of the new
object instance with the existing object instances must have a choice of 2 or 3.
Otherwise, the new object instance and $P^{11}$ are two separate patterns, which are
covered by the base case and the hypothesis.
To formulate an expression for the new pattern shown in Figure 6.2a, we
first transform the pattern by treating the object instances of $P^{11}$ as if
they were from different classes, using alias names of their original classes,
as shown in Figure 6.3a. The pattern $P^{12}_a$ in Figure 6.2a is equivalent to the
transformed pattern in Figure 6.3a provided that the object instances of the aliasing
classes of the same class are not the same object instances. Next, the equivalent pattern is
decomposed into a set of patterns, each of which is a subpattern (i.e., subgraph) of
the pattern in Figure 6.3a and consists of $P^{11}$, the new object instance, and its
relationship with one object instance in $P^{11}$. If we can derive expressions for these
subpatterns individually, the A-Intersect of these expressions will be the expression
for the pattern in Figure 6.3a, which is equivalent to the pattern in Figure 6.2a.
In this example, the pattern in Figure 6.3a is decomposed into six subpatterns, as
shown in Figure 6.4a, which can be easily expressed as follows:
Epu = (EpU) \{R(C^l,C7)} C7,
ol
E = (E ) *[i2(C,_2,C7)] C7;
02
Ep i2 = {E n) *(-R( C3-I > c7)} C7;
Epn = (E 11) *[R(C¿¿,C7)\ Cv
o4
Ep„ = (E ) \{R(Cb,C7)\ C7;
06
^p12 = (^l) *[«(^^7)]
oO
respectively. Here, E stands for the algebraic expression of the association pattern
specified by its subscript. In each expression, an operation * or | is chosen
corresponding to the type of connection between object instances, and $E_{P^{11}}$ is
parenthesized to ensure the correct execution sequence.
The expression for the pattern of Figure 6.2a can then be formulated by a
sequence of A-Intersect operations on the expressions of these individual patterns:
$E_{P^{12}_a} = E_{P^{12}_{a1}} \bullet E_{P^{12}_{a2}} \bullet E_{P^{12}_{a3}} \bullet E_{P^{12}_{a4}} \bullet E_{P^{12}_{a5}} \bullet E_{P^{12}_{a6}}$
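The construction in case a is mechanical: one sub-expression per connection of the new object instance, followed by an A-Intersect over all of them. The sketch below only assembles the expression text. The function names and the string `E_P11` standing for the expression of the existing pattern are hypothetical, and the two sample links are chosen purely for illustration.

```python
def sub_expression(base, op, related_class, new_class):
    """Expression for the base pattern plus one link to the new object.

    op is "*" for an Inter-pattern (associated) or "|" for a
    Complement-pattern (not associated).
    """
    if op not in ("*", "|"):
        raise ValueError(f"unsupported operator {op!r}")
    return f"({base}) {op}[R({related_class},{new_class})] {new_class}"


def case_a_expression(base, new_class, links):
    """A-Intersect of the sub-expressions, one per (related_class, op) link."""
    parts = [sub_expression(base, op, cls, new_class) for cls, op in links]
    return " • ".join(parts)


# Two illustrative links: not associated with C1_1, associated with C1_2.
expr = case_a_expression("E_P11", "C7", [("C1_1", "|"), ("C1_2", "*")])
print(expr)  # (E_P11) |[R(C1_1,C7)] C7 • (E_P11) *[R(C1_2,C7)] C7
```

Connections of no interest (choice 1) simply contribute no link, so they do not appear in the list passed to `case_a_expression`.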
Case b: Figure 6.2b depicts the case in which the new object instance belongs to an
existing class $C_6$ and may have associations with object instances of other classes
that have associations with $C_6$. The formulation for the new pattern $P^{12}_b$ shown in
Figure 6.2b can be obtained similarly, as depicted in Figures 6.3b and 6.4b. Note
that the new object instance belongs to the aliasing class $C_{6\_2}$ after the pattern
transformation process (see Figure 6.3b). As shown in Figure 6.4b, the equivalent
pattern depicted in Figure 6.3b is decomposed into four patterns which can be
expressed by
E 2 = (E ) 4R(C^2,CeJ2)\ C6_2;
61
E w = (-^pii) *[-R(C!,—l,0,^-2)] C6_2;
62
£.2 = (S,i) |[/2(C4_2,C6_2)] C6_2;
63
^pi2 = (*>) IW^Cf«-2)] <^2,
61
respectively.
Therefore, for the pattern $P^{12}_b$ we have the expression
$E_{P^{12}_b} = E_{P^{12}_{b1}} \bullet E_{P^{12}_{b2}} \bullet E_{P^{12}_{b3}} \bullet E_{P^{12}_{b4}}$.
However, the above expression does not exclude the case in which the two object instances
in the aliasing classes of $C_6$ refer to the same object instance. Hence, it is necessary to
perform an A-Select operation to eliminate such a case, and we have
$E_{P^{12}_b} = \sigma(E_{P^{12}_{b1}} \bullet E_{P^{12}_{b2}} \bullet E_{P^{12}_{b3}} \bullet E_{P^{12}_{b4}})[C_{6\_1} \neq C_{6\_2}]$.
So far we have shown that there exists at least one expression for a pattern
consisting of any number of object instances. We note that there may exist more
than one expression for a pattern. We illustrate this by showing an alternative
way of transforming a pattern into an equivalent one, so that different expressions
can be derived.
Figure 6.5a shows another pattern which is equivalent to the pattern in Figure 6.2a
if, in Figure 6.5a, the object instances of the aliasing classes $C_{7\_1}$ through
$C_{7\_6}$ that participate in $P^{12}_a$ refer to the same object. Therefore, we have an
alternative expression for $P^{12}_a$:
$E_{P^{12}_a} = \sigma(\ldots((E_{P^{11}}) \,|[R(C_{1\_1},C_{7\_1})]\, C_{7\_1}) \,*[R(C_{1\_2},C_{7\_2})]\, C_{7\_2} \;\cdots\; |[R(C_5,C_{7\_6})]\, C_{7\_6})[C_{7\_1} = C_{7\_2} = \ldots = C_{7\_6}]$
which is a sequence of * and/or | operations on $E_{P^{11}}$ over classes $C_{7\_i}$ ($i = 1, 2, \ldots, 6$)
and their associated classes. The selection condition $[C_{7\_1} = C_{7\_2} = \ldots = C_{7\_6}]$ ensures
that the object instances in all aliasing classes of $C_7$ refer to the same object.
Similarly, the pattern in Figure 6.5b is equivalent to the pattern in Figure
6.2b if the object instances in $C_{6\_2}$ through $C_{6\_5}$ that participate in $P^{12}_b$ are the
same object and this object is different from the one in $C_{6\_1}$. Hence an alternative
expression can be derived as follows:
E = o{...((E ) 4R(C^2,CeJ¿)} C6JZ) *[jR(C4_1,C6_3)] C6_3)
b
• • • |[J2(C74_2,Ci_5)] C'6_5))[C6_2=C6_3=C6_4=C6_5^C8_1].
We have shown that there exists an expression for every association pattern
when there is a single association between two classes. Now we prove that this is also
true when there is more than one association between two classes. There are
again two cases, as described in the proof above. We prove only case a, in which the
new object instance belongs to $C_k$ and the object instances of $C_k$ do not participate
in $P^{n-1}$. Case b can be proven using the same methodology.
Figure 6.6a shows an SG in which there are two associations between Ct_,
and Ck. The two associations are denoted as [R^C^C^] and [R,2(Cj_vCk)\, respec¬
tively. Figure 6.6b shows a pattern in which the new object instance of Ck has
two associations with each object instance of Ct_r The associations between
object instances of Cj_{ and Ck are labeled by numbers corresponding to the asso¬
ciations of their classes. To derive the algebraic expression for this pattern, first,

we decompose it into two patterns, $P^n_a$ and $P^n_b$, as shown in Figure 6.6c. The
decomposition is done by making two copies of the pattern. In one copy the associations
labeled 2 are dropped and in the other the associations labeled 1 are
dropped. From the earlier discussion, we can derive expressions for these two patterns,
and the expression for the original pattern can be represented by the
A-Intersect of the two:
$E_{P^n} = E_{P^n_a} \bullet E_{P^n_b}$.
To ensure that the A-Intersect operation will produce the pattern as required, the
same object instance in the two copies should use the same aliasing class name
when expressions $E_{P^n_a}$ and $E_{P^n_b}$ are formulated.
Generally, if the new object instance of $C_k$ has multiple associations with
object instances of several classes, the association pattern is decomposed into m
patterns, where m is the maximum number of associations $C_k$ has with another
class.
Since it has been shown that we can formulate algebraic expressions for all
possible patterns in which object instances are associated or nonassociated, and the
A-Union of these expressions forms a single expression for the subdatabase of
interest, we have shown by induction that the A-algebra is complete. □

Figure 6.1 Two ways of forming new patterns: (a) the nth object is in $C_k$; (b) the nth object is in $C_j$

Figure 6.2 Two specific examples of new patterns: (a) the 12th object is in $C_7$; (b) the 12th object is in $C_6$

Figure 6.3 Equivalent patterns: (a), (b)

Figure 6.4 Decomposed patterns: (a), (b)

Figure 6.5 Other equivalent patterns: (a), (b)

Figure 6.6 New object instance having multiple associations with those of C-_x: (a) Two classes have multiple associations; (b) Two objects have multiple associations in a pattern

CHAPTER 7
CONCLUSION
Object-Oriented DBMSs and their underlying models exhibit several desirable
features that are suitable for modeling and processing complex objects found in
more advanced database applications. However, they still do not have a solid
mathematical foundation. Such a foundation is important for the efficient manipulation
of O-O databases and for the design of high-level query languages to ease
the user’s task in accessing and manipulating O-O databases.
In this dissertation, we have presented an algebra for O-O database processing
based on a uniform representation of object instances and their associations
in an O-O database: association patterns. Nine algebra operators have been
introduced for manipulating patterns of both heterogeneous and homogeneous
structures. The closure property of the algebra allows the result of an algebraic
expression to be further processed by the algebra.
Several mathematical properties of the A-algebra operators have been studied
and formally proven. Their utility in query decomposition and optimization has
been demonstrated. The A-algebra is complete in the sense that all possible subdatabases
that are derivable from an O-O database can be expressed in A-algebra
expressions.
The A-algebra has been used in the design and implementation of a high-level
object-oriented query language, OQL, for processing O-O databases
[ALA89b, WU89]. A graphic interface for the language and a prototype
knowledge base management system based on the O-O semantic association model
OSAM* [SU86 and SU89] are presented in [DSO88, TY88, SU88, LAM89, PAN89,
CHU90, SIN90].

REFERENCES
[AHO79] Aho, A.V., Beeri, C., and Ullman, J.D., "The Theory of Joins in Relational
Databases," ACM Transactions on Database Systems, 4:3, 1979,
pp. 297-314.
[ALA89a] Alashqur, A.M., "A Query Model and Query and Knowledge Definition
Languages for Object-oriented Databases," doctoral dissertation,
University of Florida, 1989.
[ALA89b] Alashqur, A.M., Su, S.Y.W., and Lam, H., "OQL: A Query Language
for Manipulating Object-oriented Databases," Proceedings of the 15th
Intl. Conference on VLDB, Amsterdam, The Netherlands, 1989, pp.
433-442.
[ALA90] Alashqur, A.M., Su, S.Y.W., and Lam, H., "A Rule-based Language
for Deductive Object-Oriented Databases," Proceedings of the 6th
International Conference on Data Engineering, Los Angeles, CA, Feb.
5-9, 1990.
[ARM74] Armstrong, W.W., "Dependency Structures of Data Base Relationships,"
FDT: ACM, New York, 1974.
[AST76] Astrahan, M.M. and Chamberlin, D.D., "System R: a relational
approach to data management," ACM Transactions on Database Systems,
1:2, 1976, pp. 97-137.
[BAN87] Bancilhon, F., Briggs, T., Khoshafian, S., and Valduriez, P., "FAD, a
Powerful and Simple Database Language," Proceedings of the 13th
VLDB Conference, Brighton, 1987, pp. 97-105.
[BAN88] Banerjee, J., Kim, W., and Kim, K.C., "Queries in Object-oriented
Databases," Proceedings of the 4th Intl. Conference on Data Engineering,
Los Angeles, CA, 1988, pp. 31-38.
[BAT84] Batory, D.S. and Buchmann, A.P., "Molecular Objects,
Abstract Data Types and Data Models: A Framework," Proceedings
Intl. Conference on VLDB, 1984, pp. 172-184.

[BAT85] Batory, D., and Kim, W., "Modeling Concepts for VLSI CAD
Objects," ACM Transactions on Database Systems, 10:3, 1985, pp.
322-346.
[BEE77] Beeri, C., Fagin, R., and Howard, J.H., "A Complete Axiomatization
for Functional and Multivalued Dependencies," ACM SIGMOD International
Symposium on Management of Data, Los Angeles, CA, 1977,
pp. 47-61.
[CAR88] Carey, M.J., DeWitt, D.J., and Vandenberg, S.L., "A Data Model and
Query Language for EXODUS," ACM-SIGMOD Conference, 1988, pp.
413-423.
[CHU90] Chuang, H. S., "Operational Role Processing in a Prototype OSAM*
KBMS," Master’s thesis, University of Florida, 1990.
[COD70] Codd, E., "A Relational Model of Data for Large Shared Data Banks,"
CACM, 13:6, 1970, pp. 377-387.
[COD72a] Codd, E., "Relational Completeness of Database Sublanguages," in
Data Base Systems (R. Rustin, ed.), Prentice-Hall Inc., Englewood
Cliffs, NJ, 1972, pp. 65-98.
[COD72b] Codd, E.F., "Further Normalization of the Data Base Relational
Model," in Data Base Systems (R. Rustin, ed.), Prentice-Hall, Englewood
Cliffs, NJ, 1972, pp. 33-64.
[COD79] Codd, E.F., "Extending the Database Relational Model to Capture
More Meaning," ACM Trans. on Database Systems, 4:4, 1979, pp. 262-
294.
[COD90] Codd, E.F., The Relational Model for Database Management,
Addison-Wesley, 1990.
[COL89] Colby, L. S., "A Recursive Algebra and Query Optimization for Nested
Relations," ACM-SIGMOD Conference, Portland, OR, 1989, pp. 273-
283.
[DAH67] Dahl, O. J., Myhrhaug, B., and Nygaard, K., "SIMULA 67: Common
Base Language," NCC Publ. S22, Norwegian Computing Center, Oslo,
Norway, 1967.
[DEL78] Delobel, C., "Normalization and Hierarchical Dependencies in the
Relational Data Model," ACM Transactions on Database Systems, 3:3,
1978, pp. 201-222.

[DSO88] D’Souza, G. T., "Graphic Semantic Data Definition Language and a
Graphic Browser for the Object-oriented Semantic Association
Model," Master’s thesis, University of Florida, 1988.
[ELM89] Elmore, P., Shaw, G.M., and Zdonik, S.B., "The ENCORE Object-
Oriented Data Model," tech. rep., Brown University, November, 1989.
[FAG77] Fagin, R., "Multivalued Dependencies and a New Normal Form for
Relational Databases," ACM Transactions on Database Systems, 2:3,
1977, pp. 262-278.
[FIS87] Fishman, D.H., Beech, D., Cate, H.P., Chow, E.C., Connors, T.,
Davis, J.W., Derrett, N., Hoch, C.G., Kent, W., Lyngbaek, P., Mahbod,
B., Neimat, M.A., Ryan, T.A., and Shan, M.C., "Iris: An Object-
Oriented Database Management System," ACM Transactions on
Office Information Systems, 5:1, 1987, pp. 49-69.
[GOL81] Goldberg, A., "Introducing the Smalltalk-80 System," Byte, Aug. 1981,
pp. 14-26.
[HAL76] Hall, P.A.V., "Optimization of a Single Relational Expression in a
Relational Database," IBM J. Research and Development, 20:3, 1976,
pp. 244-257.
[HAM81] Hammer, M. and McLeod, D., "Database Description with SDM: A
Semantic Database Model," ACM TODS, 6:3, 1981, pp. 351-368.
[HOR87] Hornick, M.F. and Zdonik, S. B., "A Shared, Segmented Memory System
for an Object-oriented Database System," ACM Transactions
on Office Information Systems, 5:1, 1987, pp. 70-95.
[HUL87] Hull, R. and King, R., "Semantic Database Modeling: Survey, Applications,
and Research Issues," ACM Computing Surveys, 19:3, 1987, pp.
201-260.
[KIM87] Kim, W., Banerjee, J., Chou, H.T., Garza, J.F., and Woelk, D., "Composite
Object Support in an Object-oriented Database System,"
Proceedings of OOPSLA, FL, Oct. 4-8, 1987, pp. 118-125.
[KIN84] King, R., "Sembase: A Semantic DBMS," Proceedings of the First
International Workshop on Expert Database Systems, Atlanta, GA,
Oct. 1984, pp. 151-171.
[KLE67] Kleene, S.C., Mathematical Logic, John Wiley & Sons Inc., 1967.

[LAM89] Lam, H., Xia, D., Qiu, J., and Wu, P., "Prototype Implementation of
an Object-oriented Knowledge Base Management System," to appear
in the Proceedings of PROCIEM ’89, Orlando, FL, Nov. 13-15, 1989.
[LEC88] Lecluse, C., Richard, P., and Velez, F., "O2, an Object-Oriented Data
Model," ACM-SIGMOD Conference, Chicago, IL, June 1-3, 1988, pp.
425-433.
[MAC85] MacGregor, R., "ARIEL--A Semantic Front-End to Relational
DBMSs," Proceedings of VLDB 85, Atlanta, GA, April 1985, pp. 305-
315.
[MAI86] Maier, D. and Stein, J., "Development of an Object-oriented DBMS,"
Proc. of OOPSLA ’86 Conference, Portland, OR, Sept. 29 - Oct. 2,
1986, pp. 472-482.
[MAN86] Manola, F. and Dayal, U., "PDM: An Object-Oriented Model," Int’l
Workshop on Object-Oriented Database Systems, 1986, pp. 18-25.
[PAN89] Pant, S., "An Intelligent Schema Design Tool for OSAM*," Master’s
thesis, University of Florida, 1990.
[ROW87] Rowe, L. A. and Stonebraker, M. R., "The POSTGRES Data Model,"
Proceedings of the 13th VLDB Conference, Brighton, 1987, pp. 83-96.
[SER86] Servio Logic Development Corporation, Programming in OPAL, a
Manual, Servio Logic Development Corporation, Beaverton,
OR, 1986.
[SHA90] Shaw, G. M., and Zdonik, S. B., "A Query Algebra for Object-Oriented
Databases," IEEE Trans. on Data Engineering, 12:3, 1990, pp. 154-162,
Feb. 1990.
[SHI81] Shipman, D., "The Functional Data Model and the Data Language
DAPLEX," ACM TODS, 6:1, 1981, pp. 140-173.
[SIN90] Singh, M., "Transaction Oriented Rule Processing in an Object-
Oriented Knowledge Base Management System," Master’s thesis,
University of Florida, 1990.
[STO76] Stonebraker, M., Wong, E., Kreps, P., and Held, G., "The Design and
Implementation of INGRES," ACM Transactions on Database Systems,
1:3, 1976, pp. 189-222.
[STO84] Stonebraker, M., Anderson, E., Hanson, E., and Rubenstein, B., "Quel
as a Data Type," Proceedings of the 1984 ACM SIGMOD Conference
on Management of Data, Boston, MA, June, 1984, pp. 208-214.
[SU86] Su, S.Y.W., "Modeling Integrated Manufacturing Data With SAM*,"
IEEE Computer, January, 1986, pp. 34-49.
[SU88] Su, S.Y.W., Lam, H., and Navathe, S.N., "An Object-oriented Computing
Environment for Productivity Improvement in Automated
Design and Manufacturing: Project Summary," PROCIEM
’88, Orlando, FL, Nov. 14-15, 1988.
[SU89] Su, S.Y.W., Krishnamurthy, V., and Lam, H., "An Object-oriented
Semantic Association Model (OSAM*)," A.I. Industrial Engineering
and Manufacturing: Theoretical Issues and Applications (S. Kumara,
A.L. Soyster, and R.L. Kashyap, eds.), The Institute of Industrial
Engineering, Industrial Engineering and Management Press, Norcross,
GA, 1989.
[TOD76] Todd, S.J.P., "The Peterlee Relational Test Vehicle -- A System Overview,"
IBM Systems J., 15:4, 1976, pp. 285-308.
[TSU84] Tsur, S. and Zaniolo, C., "An Implementation of GEM -- Supporting
a Semantic Data Model on a Relational Back End," Proceedings of the
ACM SIGMOD Intl. Conference on the Management of Data, Boston,
MA, June 18-21, 1984, pp. 286-295.
[TY88] Ty, Frederick, "The Design and Implementation of a Graphics Interface
for an Object-oriented Language," Master’s thesis, University of
Florida, 1988.
[ULL82] Ullman, J.D., Principles of Database Systems, Computer Science Press,
1982.
[WOE86] Woelk, D., Kim, W., and Luther, W., "An Object-Oriented Approach
to Multimedia Databases," ACM SIGMOD Conference Proceedings,
Washington, D.C., May 1986, pp. 311-325.
[WON76] Wong, E. and Youssefi, K., "Decomposition -- A Strategy for Query
Processing," ACM Transactions on Database Systems, 1:3, 1976, pp.
223-241.
[WU89] Wu, Ping, "Implementation Concepts for OSAM* Data Model and
OQL language," Master’s thesis, University of Florida, 1989.
[ZAN76] Zaniolo, C., "Analysis and Design of Relational Schemata for Database
Systems," Doctoral Dissertation, UCLA, July, 1976.

[ZAN83] Zaniolo, C., "The Database language GEM," Proceedings of the ACM
SIGMOD Intl. Conference on the Management of Data, San Jose, CA,
1983.
[ZAN85a] Zaniolo, C., "The Representation and Deductive Retrieval of Complex
Objects," Proceedings of VLDB, Stockholm, Sweden, 1985, pp. 458-469.
[ZAN85b] Zaniolo, C., Ait-Kaci, H., Beech, D., Cammarata, S., Kerschberg, L.,
and Maier, D., "Object-Oriented Database Systems and Knowledge
Systems," in Expert Database Systems (Larry Kerschberg, ed.),
Benjamin/Cummings Publishing, Menlo Park, CA, 1985, pp. 49-63.
[ZDO86] Zdonik, S. B., Skarra, A. H., and Reiss, S. P., "An Object Server for an
Object-oriented Database System," International Workshop on
Object-oriented Database Systems, Pacific Grove, CA, Sept. 1986.
[ZOO77] Zook, W., Youssefi, K., Whyte, N., Rubinstein, P., Kreps, P., Held,
G., Ford, J., Berman, B., and Allman, E., INGRES Reference Manual,
Dept. of EECS, Univ. of California, Berkeley, 1977.

APPENDIX
The formal proofs of the mathematical properties of the A-algebra operators are
given below:
A. Commutativity:
(1) $\alpha \,*[R(A,B)]\, \beta = \beta \,*[R(B,A)]\, \alpha$ (5.1)
Proof: If a pattern in $\alpha$ can be concatenated with a pattern in $\beta$ over an Inter-pattern
$a_i b_j$, then the pattern in $\beta$ can be concatenated with that pattern in $\alpha$ over
the Inter-pattern $b_j a_i$. Since patterns are non-directional, i.e., $a_i b_j = b_j a_i$, the
left-hand side and the right-hand side of the equation produce the same result. On
the other hand, if an $\alpha$ pattern cannot be concatenated with a $\beta$ pattern by the
operation on the left-hand side, then the same $\beta$ pattern cannot be concatenated with
that $\alpha$ pattern by the operation on the right-hand side. □
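The argument for property 5.1 can be checked on a toy model. The sketch below is not the formal definition of the Associate operator. It assumes a simplified encoding in which a pattern is a frozenset holding object identifiers (strings) and Inter-patterns (two-element frozensets), and in which Associate joins every pair of patterns linked by an Inter-pattern between the named classes. All names and sample objects are hypothetical.

```python
def objects_of(pattern):
    """Object identifiers in a pattern (edges are stored as frozensets)."""
    return [item for item in pattern if isinstance(item, str)]


def associate(alpha, beta, class_a, class_b, inter_patterns, class_of):
    """Simplified Associate: concatenate patterns linked over R(A, B)."""
    result = set()
    for p in alpha:
        for q in beta:
            for a in objects_of(p):
                for b in objects_of(q):
                    edge = frozenset((a, b))  # non-directional Inter-pattern
                    if (class_of[a] == class_a and class_of[b] == class_b
                            and edge in inter_patterns):
                        result.add(p | q | {edge})
    return result


class_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
inter = {frozenset(("a1", "b1")), frozenset(("a2", "b1"))}
alpha = {frozenset({"a1"}), frozenset({"a2"})}
beta = {frozenset({"b1"}), frozenset({"b2"})}

lhs = associate(alpha, beta, "A", "B", inter, class_of)
rhs = associate(beta, alpha, "B", "A", inter, class_of)
print(lhs == rhs, len(lhs))
```

Because the edge is stored as an unordered pair, swapping the operands produces the same concatenated patterns, which is the non-directionality the proof relies on.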
(2) $\alpha \,|[R(A,B)]\, \beta = \beta \,|[R(B,A)]\, \alpha$ (5.2)
Proof: A Complement-pattern is non-directional. Hence, if a Complement-pattern
$a_i b_j$ connects an $\alpha$ pattern with a $\beta$ pattern, these two patterns together
with the Complement-pattern $a_i b_j$ will all be retained in the results of the
expressions on both sides of the equation. For the same reason, a new pattern
which cannot be produced by the operation on the left-hand side of the equation
cannot be produced by the operation on the right-hand side. □
(3) $\alpha \,![R(A,B)]\, \beta = \beta \,![R(B,A)]\, \alpha$ (5.3)
Proof: According to the connections between patterns of $\alpha$ and $\beta$ through some
Inter-patterns, $\alpha$ and $\beta$ can each be decomposed into the A-Union of two subsets
of patterns.
a) $\alpha = \alpha' + \alpha''$ and $\beta = \beta' + \beta''$
where $\alpha'$ represents a subset of $\alpha$ patterns that can be concatenated with the
$\beta$ patterns and $\alpha''$ represents a subset of $\alpha$ patterns that cannot be
concatenated with $\beta$ patterns. The decomposition of $\beta$ can be interpreted
similarly.
Assume that $\alpha''\beta''$ and $\beta''\alpha''$ are used to denote the new patterns produced
by the NonAssociate operations on the left- and right-hand sides of the equation,
respectively. Each of the new patterns consists of one $\alpha$ pattern, one $\beta$ pattern,
and a Complement-pattern which connects the two. By the definition of the
NonAssociate operation, we have
\[
\text{left-hand side} = (\alpha' + \alpha'') \,![R(A,B)]\, (\beta' + \beta'') =
\begin{cases}
\alpha'' & \text{if } \beta'' = \phi \\
\beta'' & \text{if } \alpha'' = \phi \\
\alpha''\beta'' & \text{otherwise}
\end{cases}
\]
\[
\text{right-hand side} = (\beta' + \beta'') \,![R(B,A)]\, (\alpha' + \alpha'') =
\begin{cases}
\alpha'' & \text{if } \beta'' = \phi \\
\beta'' & \text{if } \alpha'' = \phi \\
\beta''\alpha'' & \text{otherwise}
\end{cases}
\]
Since a Complement-pattern is non-directional, i.e., $\alpha''\beta'' = \beta''\alpha''$, the
commutativity holds for all cases. □

(4) $\alpha \bullet\{X\} \beta = \beta \bullet\{X\} \alpha$ (5.4)
Proof: If the Inner-patterns (object instances) of the classes specified in {X}
contained in an $\alpha$ pattern are common to a $\beta$ pattern, the new pattern which is
the intersection of the two patterns will be produced by both sides of the
equation. On the other hand, if an $\alpha$ pattern does not intersect with a $\beta$
pattern by the operation on the left-hand side of the equation, the same $\beta$
pattern will not intersect with that $\alpha$ pattern by the operation on the
right-hand side. □
(5) $\alpha + \beta = \beta + \alpha$ (5.5)
Proof: Since the A-Union operation simply lumps patterns named by two operands
into a single association-set and the patterns in an association-set are not ordered,
both sides of the equation will produce the same result. □
B. Associativity
(1) $(\alpha_{\{X\}} *[R(CL_1,CL_2)] \beta_{\{Y\}}) *[R(CL_3,CL_4)] \gamma_{\{Z\}}$
$= \alpha_{\{X\}} *[R(CL_1,CL_2)] (\beta_{\{Y\}} *[R(CL_3,CL_4)] \gamma_{\{Z\}})$ (5.6)
$CL_3 \notin \{X\} \wedge CL_2 \notin \{Z\}$.
Proof: The associativity holds only under the stated condition. The condition states
that $\alpha$ does not contain Inner-patterns of class $CL_3$ and $\gamma$ does not contain
Inner-patterns (or object instances) of class $CL_2$, so that $\alpha$ will have no effect on
the operation $*[R(CL_3,CL_4)]$ on the left-hand side and $\gamma$ will have no effect on the
operation $*[R(CL_1,CL_2)]$ on the right-hand side. Given that the above condition
holds, $\alpha$, $\beta$ and $\gamma$ can be decomposed as follows:
a) $\alpha = \alpha' + \alpha'' + \alpha'''$
where $\alpha'$ represents a subset of $\alpha$ patterns which can be concatenated with a
subset of $\beta$ patterns and thereafter be concatenated (through $\beta$ patterns) with a
subset of $\gamma$ patterns, $\alpha''$ represents a subset of $\alpha$ patterns which can be
concatenated with a subset of $\beta$ patterns which, however, cannot be concatenated
with any $\gamma$ pattern, and $\alpha'''$ represents a subset of patterns which either does
not have the Inner-patterns of $CL_1$ or cannot be concatenated with any $\beta$ pattern.
Note that an $\alpha$ pattern may belong to both $\alpha'$ and $\alpha''$.
b) $\beta = \beta' + \beta'' + \beta''' + \beta''''$
where $\beta'$ can be concatenated with $\alpha$ and $\gamma$, $\beta''$ can be concatenated with
$\alpha$ but not with $\gamma$, $\beta'''$ can be concatenated with $\gamma$ but not with $\alpha$, and
$\beta''''$ cannot be concatenated with either $\alpha$ or $\gamma$. Note that patterns of $\beta'$,
$\beta''$, $\beta'''$, and $\beta''''$ are mutually exclusive.
c) $\gamma = \gamma' + \gamma'' + \gamma'''$
where $\gamma'$, $\gamma''$, and $\gamma'''$ have interpretations similar to those of $\alpha'$, $\alpha''$,
and $\alpha'''$, respectively.
If $\alpha'\beta'$, $\alpha''\beta''$, $\beta'\gamma'$, $\beta'''\gamma''$, and $\alpha'\beta'\gamma'$ are used to represent the
results of the Associate operations, then according to the definition of Associate
we have
\[
\begin{aligned}
\text{left-hand side} &= ((\alpha' + \alpha'' + \alpha''') *[R(CL_1,CL_2)] (\beta' + \beta'' + \beta''' + \beta'''')) \\
&\qquad *[R(CL_3,CL_4)] (\gamma' + \gamma'' + \gamma''') \\
&= (\alpha'\beta' + \alpha''\beta'') *[R(CL_3,CL_4)] (\gamma' + \gamma'' + \gamma''') \\
&= \alpha'\beta'\gamma'
\end{aligned}
\]
\[
\begin{aligned}
\text{right-hand side} &= (\alpha' + \alpha'' + \alpha''') *[R(CL_1,CL_2)] ((\beta' + \beta'' + \beta''' + \beta'''') \\
&\qquad *[R(CL_3,CL_4)] (\gamma' + \gamma'' + \gamma''')) \\
&= (\alpha' + \alpha'' + \alpha''') *[R(CL_1,CL_2)] (\beta'\gamma' + \beta'''\gamma'') \\
&= \alpha'\beta'\gamma'
\end{aligned}
\]
□

(2) $(\alpha_{\{X\}} \,|[R(CL_1,CL_2)]\, \beta_{\{Y\}}) \,|[R(CL_3,CL_4)]\, \gamma_{\{Z\}}$ (5.7)
$= \alpha_{\{X\}} \,|[R(CL_1,CL_2)]\, (\beta_{\{Y\}} \,|[R(CL_3,CL_4)]\, \gamma_{\{Z\}})$
$CL_3 \notin \{X\} \wedge CL_2 \notin \{Z\}$.
Proof: For reasons similar to those given in the discussion of the associativity of
the * operator, $\alpha$, $\beta$, and $\gamma$ can be decomposed as follows:
a) $\alpha = \alpha' + \alpha'' + \alpha'''$
where $\alpha'$ can be connected to $\beta$ patterns by Complement-patterns and then be
connected to $\gamma$ patterns, $\alpha''$ can be connected to $\beta$ patterns by
Complement-patterns but cannot be further connected to $\gamma$ patterns, and $\alpha'''$
either has no Inner-patterns of $CL_1$ or cannot be connected to any $\beta$ pattern by
Complement-patterns. Also, patterns of $\alpha'$ and $\alpha''$ may not be mutually
exclusive.
b) $\beta = \beta' + \beta'' + \beta''' + \beta''''$
where $\beta'$ can be connected to $\alpha$ and $\gamma$ patterns by Complement-patterns, $\beta''$
can be connected to $\alpha$ patterns by Complement-patterns but not to $\gamma$ patterns,
$\beta'''$ can be connected to $\gamma$ patterns by Complement-patterns but not to $\alpha$
patterns, and $\beta''''$ cannot be connected to the patterns of either $\alpha$ or $\gamma$.
Also, patterns of $\beta'$, $\beta''$, $\beta'''$, and $\beta''''$ are mutually exclusive.
c) $\gamma = \gamma' + \gamma'' + \gamma'''$
where $\gamma'$, $\gamma''$, and $\gamma'''$ have interpretations similar to those of $\alpha'$, $\alpha''$,
and $\alpha'''$, respectively.
Then, by the definition of the A-Complement operation, we have
\[
\begin{aligned}
\text{left-hand side} &= ((\alpha' + \alpha'' + \alpha''') \,|[R(CL_1,CL_2)]\, (\beta' + \beta'' + \beta''' + \beta'''')) \\
&\qquad |[R(CL_3,CL_4)]\, (\gamma' + \gamma'' + \gamma''') \\
&= (\alpha'\beta' + \alpha''\beta'') \,|[R(CL_3,CL_4)]\, (\gamma' + \gamma'' + \gamma''') \\
&= \alpha'\beta'\gamma'
\end{aligned}
\]
\[
\begin{aligned}
\text{right-hand side} &= (\alpha' + \alpha'' + \alpha''') \,|[R(CL_1,CL_2)]\, ((\beta' + \beta'' + \beta''' + \beta'''') \\
&\qquad |[R(CL_3,CL_4)]\, (\gamma' + \gamma'' + \gamma''')) \\
&= (\alpha' + \alpha'' + \alpha''') \,|[R(CL_1,CL_2)]\, (\beta'\gamma' + \beta'''\gamma'') \\
&= \alpha'\beta'\gamma'
\end{aligned}
\]
where $\alpha'\beta'$, $\alpha''\beta''$, $\beta'\gamma'$, $\beta'''\gamma''$, and $\alpha'\beta'\gamma'$ represent the results of
the A-Complement operations. □
(3) $(\alpha_{\{X\}} \bullet\{W_1\} \beta_{\{Y\}}) \bullet\{W_2\} \gamma_{\{Z\}} = \alpha_{\{X\}} \bullet\{W_1\} (\beta_{\{Y\}} \bullet\{W_2\} \gamma_{\{Z\}})$ (5.8)
where $\{W_1 - W_2\} \cap \{Z\} = \phi \wedge \{W_2 - W_1\} \cap \{X\} = \phi$.
Proof: The condition ensures that the operation $\bullet\{W_1\}$ operates only on patterns
of $\alpha$ and $\beta$ and $\bullet\{W_2\}$ operates only on patterns of $\beta$ and $\gamma$. The following
figure shows four possible cases in which three patterns intersect with one
another. It should be clear that the associativity does not hold for case (d),
because it violates the condition, i.e., the second A-Intersect operation operates
on $\alpha$ and $\gamma$. When the condition is true, the proof is similar to the proofs for the
above two associative properties, i.e., by decomposing $\alpha$, $\beta$, and $\gamma$ accordingly.
[Figure: four cases (a), (b), (c), (d) in which three patterns intersect with one another]
(4) $(\alpha + \beta) + \gamma = \alpha + (\beta + \gamma)$ (5.9)
Proof: Since the A-Union operation simply lumps two association-sets into one and
the patterns in a set are not ordered, the order of performing A-Union operations on
a number of association-sets will have no effect on the final result. □
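
The two A-Union properties (5.5) and (5.9) can be checked on a toy model. The sketch below is only an illustration under stated assumptions: a pattern is modeled as a frozenset of object-instance labels and an association-set as a frozenset of patterns. This set-based encoding, the function name a_union, and the sample data are hypothetical and are not the formal definitions used in this dissertation.

```python
# Toy model (an assumption, not the formal definition in the text):
# a pattern is a frozenset of object-instance labels, and an
# association-set is a frozenset of such patterns.

def a_union(left, right):
    """A-Union: lump the patterns of both operands into one unordered set."""
    return left | right


alpha = frozenset({frozenset({"a1", "b1"}), frozenset({"a2"})})
beta = frozenset({frozenset({"b2", "c1"})})
gamma = frozenset({frozenset({"a2"}), frozenset({"c2"})})

# (5.5): alpha + beta = beta + alpha
commutative = a_union(alpha, beta) == a_union(beta, alpha)

# (5.9): (alpha + beta) + gamma = alpha + (beta + gamma)
associative = (a_union(a_union(alpha, beta), gamma)
               == a_union(alpha, a_union(beta, gamma)))

print(commutative, associative)
```

Because the model stores patterns in an unordered set, both properties reduce to the corresponding properties of set union, which mirrors the argument given in the proofs.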
D. Distributivity
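(1) $\alpha *[R(A,B)] (\beta + \gamma) = \alpha *[R(A,B)] \beta + \alpha *[R(A,B)] \gamma$ (5.15)
Proof: First, $\alpha$, $\beta$ and $\gamma$ can be decomposed as follows.
a) $\alpha = \alpha' + \alpha'' + \alpha'''$
where $\alpha'$ can be concatenated with $\beta$, $\alpha''$ can be concatenated with $\gamma$, and
$\alpha'''$ cannot be concatenated with either $\beta$ or $\gamma$. Note that an $\alpha$ pattern may
belong to both $\alpha'$ and $\alpha''$.
b) $\beta = \beta' + \beta''$
where $\beta'$ can be concatenated with $\alpha$ but $\beta''$ cannot.
c) $\gamma = \gamma' + \gamma''$
where $\gamma'$ can be concatenated with $\alpha$ but $\gamma''$ cannot.
By the definition of the Associate operation, we have
\[
\begin{aligned}
\text{left-hand side} &= (\alpha' + \alpha'' + \alpha''') *[R(A,B)] (\beta' + \beta'' + \gamma' + \gamma'') \\
&= \alpha'\beta' + \alpha''\gamma'
\end{aligned}
\]
\[
\begin{aligned}
\text{right-hand side} &= (\alpha' + \alpha'' + \alpha''') *[R(A,B)] (\beta' + \beta'') + (\alpha' + \alpha'' + \alpha''') *[R(A,B)] (\gamma' + \gamma'') \\
&= \alpha'\beta' + \alpha''\gamma'
\end{aligned}
\]
□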
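(2) $\alpha \,|[R(A,B)]\, (\beta + \gamma) = \alpha \,|[R(A,B)]\, \beta + \alpha \,|[R(A,B)]\, \gamma$ (5.16)
Proof: $\alpha$, $\beta$ and $\gamma$ can be decomposed as follows.
a) $\alpha = \alpha' + \alpha'' + \alpha'''$
where $\alpha'$ contains patterns that are connected to $\beta$ by Complement-patterns,
$\alpha''$ contains patterns that are connected to $\gamma$ by Complement-patterns, and
$\alpha'''$ cannot be connected to either $\beta$ or $\gamma$ by Complement-patterns. Note that
an $\alpha$ pattern may belong to both $\alpha'$ and $\alpha''$.
b) $\beta = \beta' + \beta''$
where $\beta'$ can be connected to $\alpha$ by Complement-patterns but $\beta''$ cannot.
c) $\gamma = \gamma' + \gamma''$
where $\gamma'$ can be connected to $\alpha$ by Complement-patterns but $\gamma''$ cannot.
By the definition of the A-Complement operation, we have
\[
\begin{aligned}
\text{left-hand side} &= (\alpha' + \alpha'' + \alpha''') \,|[R(A,B)]\, (\beta' + \beta'' + \gamma' + \gamma'') \\
&= \alpha'\beta' + \alpha''\gamma'
\end{aligned}
\]
\[
\begin{aligned}
\text{right-hand side} &= (\alpha' + \alpha'' + \alpha''') \,|[R(A,B)]\, (\beta' + \beta'') + (\alpha' + \alpha'' + \alpha''') \,|[R(A,B)]\, (\gamma' + \gamma'') \\
&= \alpha'\beta' + \alpha''\gamma'
\end{aligned}
\]
□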
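(3) $\alpha \bullet\{X\} (\beta + \gamma) = \alpha \bullet\{X\} \beta + \alpha \bullet\{X\} \gamma$ (5.17)
Proof: $\alpha$, $\beta$ and $\gamma$ can be decomposed as follows.
a) $\alpha = \alpha' + \alpha'' + \alpha'''$
where $\alpha'$ intersects with $\beta$, $\alpha''$ intersects with $\gamma$, and $\alpha'''$ does not
intersect with either $\beta$ or $\gamma$. Note that an $\alpha$ pattern may belong to both $\alpha'$
and $\alpha''$.
b) $\beta = \beta' + \beta''$
where $\beta'$ intersects with $\alpha$ but $\beta''$ does not.
c) $\gamma = \gamma' + \gamma''$
where $\gamma'$ intersects with $\alpha$ but $\gamma''$ does not.
By the definition of the A-Intersect operation, we have
\[
\begin{aligned}
\text{left-hand side} &= (\alpha' + \alpha'' + \alpha''') \bullet\{X\} (\beta' + \beta'' + \gamma' + \gamma'') \\
&= \alpha'\beta' + \alpha''\gamma'
\end{aligned}
\]
\[
\begin{aligned}
\text{right-hand side} &= (\alpha' + \alpha'' + \alpha''') \bullet\{X\} (\beta' + \beta'') + (\alpha' + \alpha'' + \alpha''') \bullet\{X\} (\gamma' + \gamma'') \\
&= \alpha'\beta' + \alpha''\gamma'
\end{aligned}
\]
□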

(4) $\alpha_{\{X\}} *[R(CL_1,CL_2)] (\beta_{\{Y\}} \bullet\{W\} \gamma_{\{Z\}})$
$= (\alpha_{\{X\}} *[R(CL_1,CL_2)] \beta_{\{Y\}}) \bullet\{W \cup X\} (\alpha_{\{X\}} *[R(CL_1,CL_2)] \gamma_{\{Z\}})$ (5.18)
(5) $\alpha_{\{X\}} \,|[R(CL_1,CL_2)]\, (\beta_{\{Y\}} \bullet\{W\} \gamma_{\{Z\}})$
$= (\alpha_{\{X\}} \,|[R(CL_1,CL_2)]\, \beta_{\{Y\}}) \bullet\{W \cup X\} (\alpha_{\{X\}} \,|[R(CL_1,CL_2)]\, \gamma_{\{Z\}})$ (5.19)
The above two distributive properties hold when the following conditions are true:
i) $CL_2 \in \{W\}$;
ii) $X \cap Y = X \cap W = \phi$; and
iii) $\alpha$ is a homogeneous association-set.
The first condition ensures that the operations Associate, A-Complement, and
NonAssociate will operate on the common class of $\beta$ and $\gamma$ as shown in (a) of the
following figure. Otherwise, the distributions of these operations to $\beta$ and $\gamma$ do
not make sense, as shown in (b) and (c). The second condition ensures that $\alpha$
patterns must not intersect with any pattern of either $\beta$ or $\gamma$, so that the
$\bullet\{X \cup W\}$ operations on the right-hand sides of the equations will examine the
intersections on the portions of $\alpha$ and $\gamma$ separately. The third condition ensures
that, on the right-hand sides of the equations, only those patterns that have the
same $\alpha$ pattern will intersect and be retained in the result.
[Figure: three cases (a), (b), (c) illustrating the first condition]
We shall only give the proof of 5.18. 5.19 can be proved using the same
technique.

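When the conditions are true, $\alpha$, $\beta$ and $\gamma$ can be decomposed as follows.
a) $\alpha = \alpha' + \alpha'' + \alpha''' + \alpha''''$
where $\alpha'$ can be concatenated with $\beta$ and $\gamma$, $\alpha''$ can be concatenated with
$\beta$ but not with $\gamma$, $\alpha'''$ can be concatenated with $\gamma$ but not with $\beta$, and
$\alpha''''$ cannot be concatenated with either $\beta$ or $\gamma$. Note that $\alpha'$, $\alpha''$,
$\alpha'''$, and $\alpha''''$ are mutually exclusive.
b) $\beta = \beta' + \beta'' + \beta''' + \beta''''$
where $\beta'$ can be concatenated with $\alpha$ and does intersect with $\gamma$, $\beta''$ can be
concatenated with $\alpha$ but does not intersect with $\gamma$, $\beta'''$ cannot be
concatenated with $\alpha$ but does intersect with $\gamma$, and $\beta''''$ can neither be
concatenated with $\alpha$ nor intersect with $\gamma$. Note that $\beta'$, $\beta''$, $\beta'''$, and
$\beta''''$ are also mutually exclusive.
c) $\gamma = \gamma' + \gamma'' + \gamma''' + \gamma''''$
where $\gamma'$, $\gamma''$, $\gamma'''$, and $\gamma''''$ have interpretations similar to those of
$\beta'$, $\beta''$, $\beta'''$, and $\beta''''$, respectively.
By the definition of the operations of Associate and A-Intersect we have
\[
\begin{aligned}
\text{left-hand side} &= (\alpha' + \alpha'' + \alpha''' + \alpha'''') *[R(CL_1,CL_2)] ((\beta' + \beta'' + \beta''' + \beta'''') \\
&\qquad \bullet\{W\} (\gamma' + \gamma'' + \gamma''' + \gamma'''')) \\
&= (\alpha' + \alpha'' + \alpha''' + \alpha'''') *[R(CL_1,CL_2)] (\beta'\gamma' + \beta'''\gamma''')
\end{aligned}
\]
Since $CL_2 \in \{W\}$, $\beta'\gamma'''$ and $\beta'''\gamma'$ cannot be produced by the $\bullet\{W\}$ operator
according to the decompositions of $\beta$ and $\gamma$. Otherwise, $\gamma'''$ (or $\beta'''$) must
contain the same Inner-pattern of $CL_2$ as contained in $\beta'$ (or $\gamma'$) and must be able
to concatenate with $\alpha$. Applying the distributive property 5.15, we obtain
\[
\begin{aligned}
&= \alpha' *[R(CL_1,CL_2)] \beta'\gamma' + \alpha' *[R(CL_1,CL_2)] \beta'''\gamma''' \\
&\quad + \alpha'' *[R(CL_1,CL_2)] \beta'\gamma' + \alpha'' *[R(CL_1,CL_2)] \beta'''\gamma''' \\
&\quad + \alpha''' *[R(CL_1,CL_2)] \beta'\gamma' + \alpha''' *[R(CL_1,CL_2)] \beta'''\gamma''' \\
&\quad + \alpha'''' *[R(CL_1,CL_2)] \beta'\gamma' + \alpha'''' *[R(CL_1,CL_2)] \beta'''\gamma'''
\end{aligned}
\]
Based on the decompositions of $\alpha$, $\beta$, and $\gamma$, only the first item will produce new
patterns and is retained. Hence,
\[
\begin{aligned}
&= \alpha' *[R(CL_1,CL_2)] \beta'\gamma' \\
&= \alpha'\beta'\gamma'
\end{aligned}
\]
On the right-hand side of the equation we have
\[
\begin{aligned}
\text{right-hand side} &= ((\alpha' + \alpha'' + \alpha''' + \alpha'''') *[R(CL_1,CL_2)] (\beta' + \beta'' + \beta''' + \beta'''')) \\
&\qquad \bullet\{X \cup W\} ((\alpha' + \alpha'' + \alpha''' + \alpha'''') *[R(CL_1,CL_2)] (\gamma' + \gamma'' + \gamma''' + \gamma'''')) \\
&= (\alpha'\beta' + \alpha'\beta'' + \alpha''\beta' + \alpha''\beta'') \bullet\{X \cup W\} (\alpha'\gamma' + \alpha'\gamma'' + \alpha'''\gamma' + \alpha'''\gamma'')
\end{aligned}
\]
Applying the distributive property 5.17, we have
\[
\begin{aligned}
\text{right-hand side} ={}& \alpha'\beta' \bullet\{X \cup W\} \alpha'\gamma' + \alpha'\beta' \bullet\{X \cup W\} \alpha'\gamma'' + \alpha'\beta' \bullet\{X \cup W\} \alpha'''\gamma' + \alpha'\beta' \bullet\{X \cup W\} \alpha'''\gamma'' \\
&+ \alpha'\beta'' \bullet\{X \cup W\} \alpha'\gamma' + \alpha'\beta'' \bullet\{X \cup W\} \alpha'\gamma'' + \alpha'\beta'' \bullet\{X \cup W\} \alpha'''\gamma' + \alpha'\beta'' \bullet\{X \cup W\} \alpha'''\gamma'' \\
&+ \alpha''\beta' \bullet\{X \cup W\} \alpha'\gamma' + \alpha''\beta' \bullet\{X \cup W\} \alpha'\gamma'' + \alpha''\beta' \bullet\{X \cup W\} \alpha'''\gamma' + \alpha''\beta' \bullet\{X \cup W\} \alpha'''\gamma'' \\
&+ \alpha''\beta'' \bullet\{X \cup W\} \alpha'\gamma' + \alpha''\beta'' \bullet\{X \cup W\} \alpha'\gamma'' + \alpha''\beta'' \bullet\{X \cup W\} \alpha'''\gamma' + \alpha''\beta'' \bullet\{X \cup W\} \alpha'''\gamma''
\end{aligned}
\]
Of the sixteen items, only the first one is retained. The rest of the items are dropped
because they do not intersect either over classes in {X} or over classes in {W}.
Therefore,
\[
\begin{aligned}
\text{right-hand side} &= \alpha'\beta' \bullet\{X \cup W\} \alpha'\gamma' \\
&= \alpha'\beta'\gamma'
\end{aligned}
\]
□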

E. Other Properties
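(1) $\sigma(\sigma(\alpha)[P_1])[P_2] = \sigma(\sigma(\alpha)[P_2])[P_1] = \sigma(\alpha)[P_1 \wedge P_2]$ (5.20)
Proof: $\alpha$ can be decomposed into $\alpha' + \alpha'' + \alpha''' + \alpha''''$, where $\alpha'$ satisfies
$P_1$ and $P_2$, $\alpha''$ only satisfies $P_1$, $\alpha'''$ only satisfies $P_2$, and $\alpha''''$ does not
satisfy either $P_1$ or $P_2$.
\[
\begin{aligned}
\sigma(\sigma(\alpha)[P_1])[P_2] &= \sigma(\alpha' + \alpha'')[P_2] = \alpha' \\
\sigma(\sigma(\alpha)[P_2])[P_1] &= \sigma(\alpha' + \alpha''')[P_1] = \alpha' \\
\sigma(\alpha)[P_1 \wedge P_2] &= \alpha'
\end{aligned}
\]
□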
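
Property (5.20) can be illustrated with a small executable sketch. It assumes a toy encoding in which a pattern is a frozenset of labels and A-Select is a plain filter over an association-set. The function a_select, the predicates p1 and p2, and the sample patterns are hypothetical and chosen only so that each of the four subsets in the proof contains exactly one pattern.

```python
# Toy model (an assumption): a pattern is a frozenset of labels and
# A-Select keeps the patterns that satisfy a predicate.

def a_select(assoc_set, predicate):
    """Return the patterns of assoc_set for which predicate holds."""
    return frozenset(p for p in assoc_set if predicate(p))


def p1(pattern):
    """Hypothetical predicate P1: the pattern has an instance of class 'a'."""
    return any(label.startswith("a") for label in pattern)


def p2(pattern):
    """Hypothetical predicate P2: the pattern has at least two instances."""
    return len(pattern) >= 2


alpha = frozenset({
    frozenset({"a1", "b1"}),  # satisfies P1 and P2
    frozenset({"a2"}),        # satisfies only P1
    frozenset({"b2", "c1"}),  # satisfies only P2
    frozenset({"c2"}),        # satisfies neither
})

p1_then_p2 = a_select(a_select(alpha, p1), p2)
p2_then_p1 = a_select(a_select(alpha, p2), p1)
conjunction = a_select(alpha, lambda p: p1(p) and p2(p))

print(p1_then_p2 == p2_then_p1 == conjunction)
```

In this model all three expressions keep only the pattern that satisfies both predicates, matching the case analysis in the proof.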
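(2) $\Pi(\sigma(\alpha)[P])[\mathcal{E};\mathcal{T}] = \sigma(\Pi(\alpha)[\mathcal{E};\mathcal{T}])[P]$ $(P \subseteq \mathcal{E})$ (5.27)
Proof: First, $\alpha$ is decomposed into $\alpha' + \alpha''$, where $\alpha'$ satisfies the selection
condition but $\alpha''$ does not. Then, let $\beta'$ and $\beta''$ represent the results of the
projection operation corresponding to $\alpha'$ and $\alpha''$, respectively. Since
$P \subseteq \mathcal{E}$, $\beta'$ satisfies $P$ but $\beta''$ does not, and we have
\[
\begin{aligned}
\Pi(\sigma(\alpha)[P])[\mathcal{E};\mathcal{T}] &= \Pi(\alpha')[\mathcal{E};\mathcal{T}] = \beta' \\
\sigma(\Pi(\alpha)[\mathcal{E};\mathcal{T}])[P] &= \sigma(\beta' + \beta'')[P] = \beta'
\end{aligned}
\]
□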
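(3) $\sigma(\alpha *[R(A,B)] \beta)[P_1 \wedge P_2] = \sigma(\alpha)[P_1] *[R(A,B)] \sigma(\beta)[P_2]$
where $P_1$ and $P_2$ are applicable to $\alpha$ and $\beta$, respectively.
Proof: First, $\alpha$ is decomposed into $\alpha' + \alpha'' + \alpha''' + \alpha''''$, where $\alpha'$ and
$\alpha''$ satisfy $P_1$ but $\alpha'''$ and $\alpha''''$ do not; and $\alpha'$ and $\alpha'''$ can be
concatenated with some $\beta$ patterns but $\alpha''$ and $\alpha''''$ cannot. $\beta$ can be
decomposed into $\beta' + \beta'' + \beta''' + \beta''''$ with a similar interpretation.
Therefore, we have
\[
\begin{aligned}
\sigma(\alpha *[R(A,B)] \beta)[P_1 \wedge P_2] &= \sigma(\alpha'\beta' + \alpha'\beta''' + \alpha'''\beta' + \alpha'''\beta''')[P_1 \wedge P_2] \\
&= \alpha'\beta' \\
\sigma(\alpha)[P_1] *[R(A,B)] \sigma(\beta)[P_2] &= (\alpha' + \alpha'') *[R(A,B)] (\beta' + \beta'') \\
&= \alpha'\beta'
\end{aligned}
\]
□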

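(4) $\sigma(\alpha *[R(A,B)] \beta)[P_1 \vee P_2] = \sigma(\alpha)[P_1] *[R(A,B)] \beta + \alpha *[R(A,B)] \sigma(\beta)[P_2]$ (5.31)
where $P_1$ and $P_2$ are applicable to $\alpha$ and $\beta$, respectively.
Proof: $\alpha$ and $\beta$ are decomposed as in the above proof. Thus, we have
\[
\begin{aligned}
\sigma(\alpha *[R(A,B)] \beta)[P_1 \vee P_2] &= \sigma(\alpha'\beta' + \alpha'\beta''' + \alpha'''\beta' + \alpha'''\beta''')[P_1 \vee P_2] \\
&= \alpha'\beta' + \alpha'\beta''' + \alpha'''\beta'
\end{aligned}
\]
\[
\begin{aligned}
&\sigma(\alpha)[P_1] *[R(A,B)] \beta + \alpha *[R(A,B)] \sigma(\beta)[P_2] \\
&\quad = (\alpha' + \alpha'') *[R(A,B)] (\beta' + \beta'' + \beta''' + \beta'''') + (\alpha' + \alpha'' + \alpha''' + \alpha'''') *[R(A,B)] (\beta' + \beta'') \\
&\quad = \alpha'\beta' + \alpha'\beta''' + \alpha'''\beta'
\end{aligned}
\]
□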
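(5) $\sigma(\alpha - \beta)[P] = \sigma(\alpha)[P] - \beta$ (5.34)
Proof: We decompose $\alpha$ into $\alpha' + \alpha'' + \alpha''' + \alpha''''$, where $\alpha'$ and $\alpha''$
satisfy $P$ but $\alpha'''$ and $\alpha''''$ do not; and $\alpha'$ and $\alpha'''$ contain $\beta$ patterns
but $\alpha''$ and $\alpha''''$ do not. Then, we have
\[
\begin{aligned}
\sigma(\alpha - \beta)[P] &= \sigma(\alpha'' + \alpha'''')[P] = \alpha'' \\
\sigma(\alpha)[P] - \beta &= (\alpha' + \alpha'') - \beta = \alpha''
\end{aligned}
\]
□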
(6) $\sigma(\alpha + \beta)[P] = \sigma(\alpha)[P] + \sigma(\beta)[P]$ (5.35)
Proof: Suppose $\alpha$ and $\beta$ are decomposed into subsets $\alpha'$ and $\alpha''$ and $\beta'$ and
$\beta''$, respectively, where $\alpha'$ and $\beta'$ satisfy $P$ but $\alpha''$ and $\beta''$ do not. By
the definition of the A-Select operation, we have
\[
\sigma(\alpha + \beta)[P] = \alpha' + \beta' = \sigma(\alpha)[P] + \sigma(\beta)[P]
\]
□
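
As a quick check of (5.35), the following sketch uses the same kind of toy encoding: patterns are frozensets of labels, A-Union is set union, and A-Select is a filter. These encodings, the predicate p, and the sample data are illustrative assumptions rather than the formal definitions of the operators.

```python
# Toy model (an assumption): patterns are frozensets of labels,
# A-Union is set union, A-Select is a filter by predicate.

def a_union(left, right):
    return left | right


def a_select(assoc_set, predicate):
    return frozenset(p for p in assoc_set if predicate(p))


def p(pattern):
    """Hypothetical predicate P: the pattern has at least two instances."""
    return len(pattern) >= 2


alpha = frozenset({frozenset({"a1", "b1"}), frozenset({"a2"})})
beta = frozenset({frozenset({"b2", "c1"}), frozenset({"c2"})})

select_of_union = a_select(a_union(alpha, beta), p)
union_of_selects = a_union(a_select(alpha, p), a_select(beta, p))

print(select_of_union == union_of_selects)
```

Here the subsets called $\alpha'$ and $\beta'$ in the proof are the two-instance patterns, and both sides keep exactly those.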
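(7) $\sigma(\alpha + \beta)[P_1 \vee P_2] = \sigma(\alpha)[P_1] + \sigma(\beta)[P_2]$ (5.36)
where $P_1$ and $P_2$ are applicable to $\alpha$ and $\beta$, respectively.
Proof: Suppose $\alpha$ and $\beta$ are decomposed into subsets $\alpha'$ and $\alpha''$ and $\beta'$ and
$\beta''$, respectively, where $\alpha'$ satisfies $P_1$ but $\alpha''$ does not and $\beta'$ satisfies
$P_2$ but $\beta''$ does not. By the definition of the A-Select operation, we have
\[
\sigma(\alpha + \beta)[P_1 \vee P_2] = \alpha' + \beta' = \sigma(\alpha)[P_1] + \sigma(\beta)[P_2]
\]
□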

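(8) $\Pi(\alpha + \beta)[\mathcal{E};\mathcal{T}] = \Pi(\alpha)[\mathcal{E};\mathcal{T}] + \Pi(\beta)[\mathcal{E};\mathcal{T}]$ (5.37)
Proof: Suppose that $\alpha$ and $\beta$ are decomposed into subsets $\alpha'$ and $\alpha''$ and $\beta'$
and $\beta''$, respectively, where $\alpha'$ and $\beta'$ contain subpatterns defined by
$[\mathcal{E};\mathcal{T}]$ but $\alpha''$ and $\beta''$ do not. The results of the two A-Project
operations on $\alpha'$ and $\beta'$ are $\Pi(\alpha')[\mathcal{E};\mathcal{T}]$ and
$\Pi(\beta')[\mathcal{E};\mathcal{T}]$, respectively. By the definition of the A-Project operation,
we have
\[
\Pi(\alpha + \beta)[\mathcal{E};\mathcal{T}] = \Pi(\alpha')[\mathcal{E};\mathcal{T}] + \Pi(\beta')[\mathcal{E};\mathcal{T}] = \Pi(\alpha)[\mathcal{E};\mathcal{T}] + \Pi(\beta)[\mathcal{E};\mathcal{T}]
\]
□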
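(9) $(\alpha + \beta) - \gamma = (\alpha - \gamma) + (\beta - \gamma)$ (5.40)
Proof: $\alpha$ and $\beta$ are decomposed into subsets $\alpha'$ and $\alpha''$ and $\beta'$ and $\beta''$,
respectively, where $\alpha'$ and $\beta'$ contain $\gamma$ patterns but $\alpha''$ and $\beta''$ do not.
Thus, we have
\[
(\alpha + \beta) - \gamma = \alpha'' + \beta'' = (\alpha - \gamma) + (\beta - \gamma)
\]
□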
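
Property (5.40) can be sketched in code under a simplifying assumption: a pattern is a frozenset of labels, and a pattern "contains" another when it is a superset of it. With that reading, A-Difference keeps the patterns of the left operand that contain no pattern of the right operand. The functions a_union and a_difference and the sample data are hypothetical illustrations, not the formal operator definitions.

```python
# Toy model (an assumption): a pattern is a frozenset of labels, and
# pattern p "contains" pattern q when q is a subset of p.

def a_union(left, right):
    return left | right


def a_difference(left, right):
    """Keep the patterns of left that contain no pattern of right."""
    return frozenset(p for p in left if not any(q <= p for q in right))


alpha = frozenset({frozenset({"a1", "b1", "c1"}), frozenset({"a2", "b2"})})
beta = frozenset({frozenset({"b3", "c1"}), frozenset({"b4"})})
gamma = frozenset({frozenset({"c1"})})

lhs = a_difference(a_union(alpha, beta), gamma)
rhs = a_union(a_difference(alpha, gamma), a_difference(beta, gamma))

print(lhs == rhs)
```

In this example the patterns containing c1 play the role of $\alpha'$ and $\beta'$ in the proof, and both sides retain only the remaining patterns $\alpha''$ and $\beta''$.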
(10) $\alpha \div\{W\} (\beta + \gamma) = \alpha \div\{W\} \beta \div\{W\} \gamma$ (5.41)
Proof: By the definition of the A-Divide operation, on the left-hand side of the
equation, an $\alpha$ pattern will be retained in the result if (a) it has Inner-patterns
of classes in {W} and contains all patterns of $\beta$ and $\gamma$, or (b) the Inner-patterns
of classes in {W} that an $\alpha$ pattern has are common to some other $\alpha$ patterns and
these patterns together, denoted by $\alpha'$, contain all patterns of $\beta$ and $\gamma$.
An $\alpha$ pattern (or patterns in $\alpha'$) which is retained on the left-hand side of the
equation will be retained after the first A-Divide operation on the right-hand side
since it must contain all the $\beta$ patterns. It will also be retained in the final result
after the second A-Divide operation since it must contain all the $\gamma$ patterns. □

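(11) $(\alpha_{\{X\}} *[R(A,B)] \beta_{\{Y\}}) \,|[R(C,D)]\, \gamma_{\{Z\}} = \alpha_{\{X\}} *[R(A,B)] (\beta_{\{Y\}} \,|[R(C,D)]\, \gamma_{\{Z\}})$ (5.42)
$C \notin \{X\}$ and $B \notin \{Z\}$
Proof: $\beta$ is decomposed into $\beta' + \beta'' + \beta''' + \beta'''' + \beta'''''$, where $\beta'$ and
$\beta''$ can be concatenated with $\alpha$ patterns but $\beta'''$ and $\beta''''$ cannot; $\beta''$ and
$\beta''''$ can be concatenated with $\gamma$ patterns by Complement-patterns but $\beta'$ and
$\beta'''$ cannot; and $\beta'''''$ can be neither concatenated with $\alpha$ patterns nor
concatenated with $\gamma$ patterns by Complement-patterns. $\alpha$ is decomposed into
$\alpha' + \alpha''$, where $\alpha'$ can be concatenated with $\beta$ patterns but $\alpha''$ cannot.
$\gamma$ is decomposed into $\gamma' + \gamma''$ with a similar interpretation. Thus, we have
\[
\begin{aligned}
(\alpha_{\{X\}} *[R(A,B)] \beta_{\{Y\}}) \,|[R(C,D)]\, \gamma_{\{Z\}} &= (\alpha'\beta' + \alpha'\beta'') \,|[R(C,D)]\, \gamma \\
&= \alpha'\beta''\gamma' \\
\alpha_{\{X\}} *[R(A,B)] (\beta_{\{Y\}} \,|[R(C,D)]\, \gamma_{\{Z\}}) &= \alpha *[R(A,B)] (\beta''\gamma' + \beta''''\gamma') \\
&= \alpha'\beta''\gamma'
\end{aligned}
\]
□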
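(12) $(\alpha_{\{X\}} *[R(A,B)] \beta_{\{Y\}}) - \gamma_{\{Z\}} = (\alpha_{\{X\}} - \gamma_{\{Z\}}) *[R(A,B)] \beta_{\{Y\}}$ $(\{Y\} \cap \{Z\} = \phi)$ (5.43)
$= \alpha_{\{X\}} *[R(A,B)] (\beta_{\{Y\}} - \gamma_{\{Z\}})$ $(\{X\} \cap \{Z\} = \phi)$
Proof: We shall prove the first case. The second case can be proved similarly. $\alpha$ is
decomposed into $\alpha' + \alpha'' + \alpha''' + \alpha''''$, where $\alpha'$ and $\alpha''$ can be concatenated
with $\beta$ patterns but $\alpha'''$ and $\alpha''''$ cannot; and $\alpha'$ and $\alpha'''$ contain $\gamma$
patterns but $\alpha''$ and $\alpha''''$ do not. $\beta$ is decomposed into $\beta' + \beta''$, where $\beta'$
can be concatenated with $\alpha$ patterns but $\beta''$ cannot. Since
$\{Y\} \cap \{Z\} = \phi$, none of the $\beta$ patterns contains a $\gamma$ pattern and we have
\[
\begin{aligned}
(\alpha_{\{X\}} *[R(A,B)] \beta_{\{Y\}}) - \gamma_{\{Z\}} &= (\alpha'\beta' + \alpha''\beta') - \gamma \\
&= \alpha''\beta' \\
(\alpha_{\{X\}} - \gamma_{\{Z\}}) *[R(A,B)] \beta_{\{Y\}} &= (\alpha'' + \alpha'''') *[R(A,B)] (\beta' + \beta'') \\
&= \alpha''\beta'
\end{aligned}
\]
□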

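(13) $(\alpha_{\{X\}} *[R(A,B)] \beta_{\{Y\}}) \bullet\{W\} \gamma_{\{Z\}} = (\alpha_{\{X\}} \bullet\{W\} \gamma_{\{Z\}}) *[R(A,B)] \beta_{\{Y\}}$ (5.44)
$(\{Y\} \cap \{Z\} = \phi \wedge A \in W)$
$= \alpha_{\{X\}} *[R(A,B)] (\beta_{\{Y\}} \bullet\{W\} \gamma_{\{Z\}})$
$(\{X\} \cap \{Z\} = \phi \wedge B \in W)$
Proof: We only give the proof of the first case. The decompositions of $\alpha$, $\beta$, and
$\gamma$ are as follows: $\alpha = \alpha' + \alpha'' + \alpha''' + \alpha''''$, where $\alpha'$ and $\alpha''$ can be
concatenated with $\beta$ patterns and $\alpha'''$ and $\alpha''''$ cannot, and $\alpha'$ and $\alpha'''$
intersect with $\gamma$ patterns and $\alpha''$ and $\alpha''''$ do not; $\beta = \beta' + \beta''$, where
$\beta'$ can be concatenated with $\alpha$ patterns and $\beta''$ cannot; $\gamma = \gamma' + \gamma''$, where
$\gamma'$ intersects $\alpha$ patterns and $\gamma''$ does not. When $\{Y\} \cap \{Z\} = \phi$, patterns of
$\beta$ and $\gamma$ do not intersect with each other and we have
\[
\begin{aligned}
(\alpha_{\{X\}} *[R(A,B)] \beta_{\{Y\}}) \bullet\{W\} \gamma_{\{Z\}} &= (\alpha'\beta' + \alpha''\beta') \bullet\{W\} (\gamma' + \gamma'') \\
&= \alpha'\beta'\gamma' \\
(\alpha_{\{X\}} \bullet\{W\} \gamma_{\{Z\}}) *[R(A,B)] \beta_{\{Y\}} &= (\alpha'\gamma' + \alpha'''\gamma') *[R(A,B)] (\beta' + \beta'') \\
&= \alpha'\beta'\gamma'
\end{aligned}
\]
□
Note that the left-hand side of 5.44 is in a distributive form of * with respect to •, but
the distributive property cannot be applied because it requires that A be in both $\alpha$
and $\beta$ and that $\gamma$ be a homogeneous association-set.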
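(14) $\alpha \,![R(A,B)]\, (\beta + \gamma)$ (5.48)
$= \alpha \,![R(A,B)]\, \beta - \Pi(\alpha *[R(A,B)] \gamma)[\alpha] + \alpha \,![R(A,B)]\, \gamma - \Pi(\alpha *[R(A,B)] \beta)[\alpha]$
where $\alpha$, $\beta$, and $\gamma$ are homogeneous association-sets.
Proof: $\alpha$ can be decomposed as $\alpha = \alpha' + \alpha'' + \alpha''' + \alpha''''$, where $\alpha'$ can be
concatenated with $\beta$ by Inter-patterns but not with $\gamma$; $\alpha''$ can be concatenated
with $\gamma$ by Inter-patterns but not with $\beta$; $\alpha'''$ can be concatenated with both
$\beta$ and $\gamma$ by Inter-patterns; and $\alpha''''$ cannot be concatenated with $\beta$ and
$\gamma$. $\alpha'$, $\alpha''$, $\alpha'''$, and $\alpha''''$ are mutually exclusive. $\beta$ is decomposed into
$\beta' + \beta''$, where $\beta'$ can be concatenated with $\alpha$ but $\beta''$ cannot. $\gamma$ can be
decomposed in the same way as $\beta$.
By the definition of the NonAssociate operation we have
\[
\text{left-hand side} = (\alpha' + \alpha'' + \alpha''' + \alpha'''') \,![R(A,B)]\, (\beta' + \beta'' + \gamma' + \gamma'') =
\begin{cases}
\alpha''''\gamma'' & \text{if } \beta'' = \phi \\
\alpha''''\beta'' & \text{if } \gamma'' = \phi \\
\beta'' + \gamma'' & \text{if } \alpha'''' = \phi \\
\gamma'' & \text{if } \alpha'''' = \beta'' = \phi \\
\beta'' & \text{if } \alpha'''' = \gamma'' = \phi \\
\alpha'''' & \text{if } \beta'' = \gamma'' = \phi \\
\alpha''''\beta'' + \alpha''''\gamma'' & \text{otherwise}
\end{cases}
\]
Since $\alpha *[R(A,B)] \beta = \alpha'\beta' + \alpha'''\beta'$, we have
$\Pi(\alpha *[R(A,B)] \beta)[\alpha] = \alpha' + \alpha'''$. Similarly,
$\Pi(\alpha *[R(A,B)] \gamma)[\alpha] = \alpha'' + \alpha'''$. Therefore, on the right-hand side
\[
\alpha \,![R(A,B)]\, \beta - \Pi(\alpha *[R(A,B)] \gamma)[\alpha] =
\begin{cases}
\alpha'''' & \text{if } \beta'' = \phi \\
\beta'' & \text{if } \alpha'''' = \phi \\
\alpha''''\beta'' & \text{otherwise}
\end{cases}
\]
\[
\alpha \,![R(A,B)]\, \gamma - \Pi(\alpha *[R(A,B)] \beta)[\alpha] =
\begin{cases}
\alpha'''' & \text{if } \gamma'' = \phi \\
\gamma'' & \text{if } \alpha'''' = \phi \\
\alpha''''\gamma'' & \text{otherwise}
\end{cases}
\]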
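Hence,
\[
\begin{aligned}
\text{right-hand side} &= \alpha \,![R(A,B)]\, \beta - \Pi(\alpha *[R(A,B)] \gamma)[\alpha] + \alpha \,![R(A,B)]\, \gamma - \Pi(\alpha *[R(A,B)] \beta)[\alpha] \\
&=
\begin{cases}
\alpha''''\gamma'' & \text{if } \beta'' = \phi \\
\alpha''''\beta'' & \text{if } \gamma'' = \phi \\
\beta'' + \gamma'' & \text{if } \alpha'''' = \phi \\
\gamma'' & \text{if } \alpha'''' = \beta'' = \phi \\
\beta'' & \text{if } \alpha'''' = \gamma'' = \phi \\
\alpha'''' & \text{if } \beta'' = \gamma'' = \phi \\
\alpha''''\beta'' + \alpha''''\gamma'' & \text{otherwise}
\end{cases}
\end{aligned}
\]
□
(15) $\alpha - (\beta + \gamma) = \alpha - \beta - \gamma$ (5.51)
Proof: By the definition of the A-Difference operation, the left-hand side of the
equation retains $\alpha$ patterns that do not contain any pattern of $\beta$ or $\gamma$. On the
right-hand side, the first A-Difference operation retains $\alpha$ patterns that do not
contain any $\beta$ pattern and then the second operation retains $\alpha$ patterns that do
not contain any pattern of $\beta$ or $\gamma$. □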
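
The argument for (5.51) can be illustrated with a short sketch. It assumes, as a simplification, that a pattern is a frozenset of labels and that a pattern contains another when it is a superset of it, so that A-Difference keeps the patterns of the left operand containing no pattern of the right operand. The function names and sample data are hypothetical.

```python
# Toy model (an assumption): a pattern is a frozenset of labels, and
# pattern p "contains" pattern q when q is a subset of p.

def a_union(left, right):
    return left | right


def a_difference(left, right):
    """Keep the patterns of left that contain no pattern of right."""
    return frozenset(p for p in left if not any(q <= p for q in right))


alpha = frozenset({
    frozenset({"a1", "b1", "c1"}),  # contains a beta pattern
    frozenset({"a2", "b2"}),        # contains neither
    frozenset({"a3"}),              # contains a gamma pattern
})
beta = frozenset({frozenset({"b1"})})
gamma = frozenset({frozenset({"a3"})})

lhs = a_difference(alpha, a_union(beta, gamma))
rhs = a_difference(a_difference(alpha, beta), gamma)

print(lhs == rhs)
```

Both sides keep only the pattern that contains neither a $\beta$ pattern nor a $\gamma$ pattern, which is the reading of the proof used in this model.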

BIOGRAPHICAL SKETCH
The author has been a research assistant in the Database Systems Research and
Development Center at the University of Florida since 1985, where he has been
working towards the Ph.D. degree in electrical engineering. His research interests
include semantic data modeling, query models for object-oriented databases,
knowledge and rule representation and processing, query optimization, concurrency
control, and parallel processing for O-O databases. In 1970, he received his B.S.
degree in mathematics from Fudan University, Shanghai, China, where he was a
faculty member of the Computer Center from 1970 to 1983. Between 1983 and
1985, he was a visiting scholar at the Database Systems Research and Development
Center at the University of Florida, where he received his M.S. degree in
electrical engineering in 1987.

I certify that I have read this study and that in my opinion it conforms to
acceptable standards of scholarly presentation and is fully adequate, in scope and
quality, as a dissertation for the degree of Doctor of Philosophy.
Stanley Y.W. Su, Chairman
Professor of Electrical Engineering
I certify that I have read this study and that in my opinion it conforms to
acceptable standards of scholarly presentation and is fully adequate, in scope and
quality, as a dissertation for the degree of Doctor of Philosophy.
Herman X. Lam, Cochairman
Associate Professor of Electrical
Engineering
I certify that I have read this study and that in my opinion it conforms to
acceptable standards of scholarly presentation and is fully adequate, in scope and
quality, as a dissertation for the degree of Doctor of Philosophy.
Shamkant B. Navathe
Professor of Computer and Information
Sciences
I certify that I have read this study and that in my opinion it conforms to
acceptable standards of scholarly presentation and is fully adequate, in scope and
quality, as a dissertation for the degree of Doctor of Philosophy.
Randy Y. C. Chow
Professor of Computer and Information
Sciences

I certify that I have read this study and that in my opinion it conforms to
acceptable standards of scholarly presentation and is fully adequate, in scope and
quality, as a dissertation for the degree of Doctor of Philosophy.
John Staudhammer
Professor of Electrical Engineering
This dissertation was submitted to the Graduate Faculty of the College of
Engineering and to the Graduate School and was accepted as partial fulfillment of
the requirements for the degree of Doctor of Philosophy.
December, 1990
Winfred M. Phillips
Dean, College of Engineering
Madelyn M. Lockhart
Dean, Graduate School
