Citation
Voltaire : a database programming environment with a single execution model for evaluating queries, satisfying constraints and computing functions

Material Information

Title:
Voltaire : a database programming environment with a single execution model for evaluating queries, satisfying constraints and computing functions
Creator:
Gala, Sunit, 1964-
Publication Date:
Language:
English
Physical Description:
vii, 106 leaves : ill. ; 29 cm.

Subjects

Subjects / Keywords:
Algebra ( jstor )
Boolean data ( jstor )
Data models ( jstor )
Databases ( jstor )
Identifiers ( jstor )
Information attributes ( jstor )
Integers ( jstor )
Programming languages ( jstor )
Query languages ( jstor )
Semantics ( jstor )
Dissertations, Academic -- Electrical Engineering -- UF
Electrical Engineering thesis Ph. D
Genre:
bibliography ( marcgt )
theses ( marcgt )
non-fiction ( marcgt )

Notes

Thesis:
Thesis (Ph. D.)--University of Florida, 1991.
Bibliography:
Includes bibliographical references (leaves 102-105).
Additional Physical Form:
Also available online.
General Note:
Typescript.
General Note:
Vita.
Statement of Responsibility:
by Sunit Gala.

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
Copyright [name of dissertation author]. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Resource Identifier:
026891698 ( ALEPH )
25541272 ( OCLC )

VOLTAIRE: A DATABASE PROGRAMMING ENVIRONMENT WITH A
SINGLE EXECUTION MODEL FOR EVALUATING QUERIES, SATISFYING
CONSTRAINTS AND COMPUTING FUNCTIONS














By

SUNIT GALA


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA


1991



















To Geeta and Kilu
for having shown me the joys of wondering,
for having gifted me with a childhood that was never theirs,
for having given me courage to seek Truth and Beauty.
It is to them that I dedicate the work of my life.














ACKNOWLEDGMENTS


It is difficult to compress in a few lines one's gratitude to a number of people who

have had any bearing on this dissertation. I want to thank Dr. Shamkant Navathe,

with whom I have worked for five years, for opening many opportunities to me and

being one of the most flexible and understanding advisors that one can have. I want

to thank Manuel Bermudez for teaching me denotational semantics and all that I

know about programming languages. I spent two great years working with Howard

Beck on the CANDIDE project, during which period I learned much. The rudiments

of Voltaire lay in a "mercurial" late night discussion with Stephan Grill over beer.

It has always been a joy to discuss the meaning of life, the universe, ..., and 42

with Dr. Principe; he has shown me that it is possible to learn as many things in

one's life as one cares to. I would also like to thank Drs. Chakravarthy, Lam and Su

for many illuminating discussions on Voltaire and other topics. It would be difficult

to recount the various interactions with past and present students who have shared

the Database Center as a superlative work place. But, I should like to thank Rahim

Yaseen, who taught me more about Unix and other systems oriented concepts.

Without Sharon Grant, the Database Center could not have been the great place

that it is. She manages it with dedication, care and a smile. Of course, she also

provides M&Ms. Working late nights was made much more bearable by "short" coffee

breaks with Niranjan Mayya and Ravi Malladi, which occasionally got extended to

Cedar Key!


















TABLE OF CONTENTS




ACKNOWLEDGMENTS

ABSTRACT

CHAPTERS

1 DATABASE PROGRAMMING LANGUAGES

1.1 Introduction
1.2 Scope of this Dissertation
1.3 Some Design Criteria for DBPLs
1.3.1 Semantic Data Model versus Persistent Abstract Data Types
1.3.2 Type Checking
1.3.3 Ability to Manipulate Heterogeneous Sets
1.3.4 Ability to Share Data
1.3.5 Data versus Functions
1.3.6 Database Integrity
1.3.7 Role of the Query Language
1.3.8 Implementation Strategies
1.3.9 Choice of Computing Paradigm
1.4 Previous Research

2 AN OVERVIEW OF VOLTAIRE

2.1 Design Rationale of Voltaire
2.2 A Quick Glance of Voltaire
2.3 An Introductory Example

3 DATA DEFINITION

3.1 Classes and Instances
3.2 An Extensional Semantics for Classes
3.3 Update Operators
3.4 On the Computability of Subclass
3.4.1 Object Graphs and Equality
3.4.2 Classes, Types and Schemas
3.4.3 Glossary

4 QUERY SPECIFICATION

4.1 The Basic Structure of a Query
4.2 Examples
4.3 Aggregate Operators
4.4 Evaluation Strategies
4.4.1 Semantics of the Dot Operator
4.4.2 Naive Approach
4.4.3 Algebraic Approach

5 CONSTRAINT SPECIFICATION

5.1 Basic Structure of Constraints
5.2 Examples
5.2.1 Constraints on the class Student
5.2.2 Constraints on the class Grad
5.3 Null Values and Exceptions

6 FUNCTION SPECIFICATION

6.1 Basic Structure of a Function
6.2 A Database Example
6.3 Temporary Instance Creation
6.4 A Model of Inheritance for Classes and Functions
6.5 Equality, Assignment and Modify
6.6 Scope of Identifiers
6.7 Function Composition

7 THE VOLTAIRE ENVIRONMENT AND ITS SEMANTICS

7.1 Interacting with the Voltaire Environment
7.2 A Denotational Semantics for Voltaire
7.3 Implementation Strategy

8 CONCLUSIONS AND FUTURE RESEARCH

APPENDICES

A UNIVERSITY SCHEMA

B CONCRETE SYNTAX

C ABSTRACT SYNTAX

D DENOTATIONAL SEMANTICS

REFERENCES

BIOGRAPHICAL SKETCH

















Abstract of Dissertation
Presented to the Graduate School of the University of Florida
in Partial Fulfillment of the Requirements for the
Degree of Doctor of Philosophy


VOLTAIRE: A DATABASE PROGRAMMING ENVIRONMENT WITH A
SINGLE EXECUTION MODEL FOR EVALUATING QUERIES, SATISFYING
CONSTRAINTS AND COMPUTING FUNCTIONS

By
Sunit Gala

December 1991


Chairman: Shamkant B. Navathe
Major Department: Electrical Engineering

In this thesis we present Voltaire, which is a set-oriented, imperative database

programming language. The set expressions in the language are conducive to data

intensive programming while maintaining a certain amount of efficiency by espousing

the imperative paradigm. The language and its semantics are defined in a modular

but additive fashion, which facilitates some measure of bootstrapping. We further

argue that such an implementation model is desirable, since it provides a single exe-

cution model for evaluating queries, satisfying constraints and computing functions.

The system provides automatic integrity enforcement in a lazy evaluation mode.

Functions are effectively computed as the result of integrity enforcement. This is

because we consider constraints as a sequence of commands to be evaluated or sat-

isfied in the specified order. There are no arbitrary restrictions on the persistence











of values; even functions can have a persistent extent. Further, the query language

incorporates functions by providing access to the persistent extent of a function or by

allowing an actual function call. Also, the compiler can exploit conventional algebraic

techniques for query optimization.

The data definition (or type) facility is similar to what might be found in most

semantic data models and is conducive to sharing heterogeneous records. We have

defined a type algebra that incorporates structure, extent and behavior by providing

an extensional semantics for the behavior. We also attempt to define a denotational

semantics for the Voltaire language and environment.

We believe that Voltaire is a suitable language for data intensive programming,

and is a reasonable compromise between a database system and a programming

language.
















CHAPTER 1
DATABASE PROGRAMMING LANGUAGES

1.1 Introduction

In today's typical organization, a large proportion of software applications are in

fact database applications and are developed at considerable cost. The development

of these applications is usually performed using two distinct, incompatible languages:

one for data manipulation and one for programming the application. For example,

COBOL is often used as the "host" programming language, in which SQL data

manipulation statements are embedded.

This is the case in most business applications which constitute the largest con-

sumers of database technology. A typical database management system consists of

a data definition language (DDL) and a data manipulation language (DML) [25].

The DDL defines the database structure and hence constitutes the structural com-

ponent, whereas the DML consists of a query sublanguage (i.e., retrieval operators)

and update operators. For example, in a relational database, sets of relations and

various integrity constraints form the structural component or DDL, while the query

language (QL) is based on the relational calculus or algebra. Further, the relational

QL is set-oriented and declarative in nature. Thus, embedding declarative DML

statements in an imperative host language inevitably leads to a paradigm mismatch

between the languages.
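The mismatch described above can be sketched in a few lines. The example below is only an illustration: it substitutes Python and its standard sqlite3 module for the COBOL and SQL pairing mentioned in the text, and the employee table, its columns and the high_earners function are hypothetical names invented for the sketch.

```python
import sqlite3


def high_earners(conn, threshold):
    """Return the names of employees earning more than `threshold`."""
    # Declarative, set-oriented part: the SQL string states *what* to fetch.
    cursor = conn.execute(
        "SELECT name, salary FROM employee WHERE salary > ? ORDER BY name",
        (threshold,),
    )
    # Imperative, record-oriented part: the host language walks the result
    # one row at a time and unpacks each tuple into host-language variables.
    names = []
    for name, _salary in cursor:
        names.append(name)
    return names


# Hypothetical sample data held in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?)",
    [("Ada", 120), ("Bob", 80), ("Cy", 150)],
)
print(high_earners(conn, 100))
```

The query text is opaque to the host language (it is just a string), and its result must be converted row by row into host values; both are instances of the conceptual and physical incompatibilities discussed here.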

The application developer often spends inordinate amounts of time and energy

overcoming these incompatibilities. The incompatibilities are not just conceptual, but

physical as well. For example, sharing of symbol space and work space between the












embedded and host languages creates challenges for implementation. Thus, Database
Programming Languages (DBPLs) have been proposed to alleviate this problem, by

integrating programming language constructs and database constructs into a single

language (see, for example, [1, 3, 4, 6, 8, 9, 23, 26, 28, 34, 37, 41, 43, 44, 48, 51, 52]).

There are some important issues concerning the design of database programming
languages [5, 7, 12, 16]. Perhaps the most difficult issue stems from the fact that

data modeling (and knowledge representation) enterprises are ontologic in nature, in

contrast to traditional programming. This means that the role of a data model is to

faithfully capture the semantics of some real world entity without worrying about the

actual data structures with which to implement the given entity. On the other hand,

the role of a rich type system in a traditional programming language is to allow the

user to choose a data structure which will lead to the most efficient implementation

of the application in question. Designing a DBPL necessarily entails the merging

of certain incompatible features of a database system and programming language.

Thus, the type system of a programming language must be elevated to match the

ontologic properties of a data model to enhance the computational expressibility of

the resulting DBPL. Unfortunately, a uniform treatment of types, behavior, extent

and classes is a non-trivial problem. An important reason for this seems to be that a

type definition usually does not account for the extent of a type [16, 5, 15] whereas

a database class definition does provide a semantic description of its extent (i.e.,

the closed world assumption). Further, it is important that the type system provide
structures (such as classes) for representing sets of similar, but possibly heterogeneous

structures (such as records or instances).

We would also like to emphasize that many proposed DBPLs do not provide a

truly integrated computing paradigm. For example, they do not provide a homo-

geneous treatment of object (type or class) manipulation and function (procedure











or method) specification. This lack of homogeneity stems from the fact that there

are three sublanguages that form a single DBPL. These sublanguages are for data

definition to specify object types, data manipulation to compute a restricted class

of queries, and function specification for making arbitrary computations. It is im-

portant to note that in many existing DBPLs (an exception being the embedding

of relational systems within logic languages), the three sublanguages are orthogonal,

i.e., there tends to be no interleaving among programming language constructs, data

manipulation constructs, and data definition constructs. Instead, the three sublan-

guages are merely "appended" to each other, which results in a DBPL lacking a truly

integrated paradigm. However, appending languages in this manner is still a vast im-

provement over embedding queries in a host language (such as SQL in COBOL).

We shall briefly enumerate some issues that lead to conflicts when designing a

database programming language:

1. Set-oriented manipulation primitives versus record-oriented programming prim-

itives.

2. Declarative query language versus imperative programming language.

3. Ability to define a theory of types which accounts for extent as well as behavior

involves certain compromises:

(a) a type theory must be able to clearly define when one class is a subclass

of another, and when a database object belongs to a given class;

(b) static versus dynamic type checking;

(c) polymorphism versus efficiency;

(d) ability to deal with heterogeneous records or objects;











4. Uniform persistence for all objects independent of their type versus efficient

retrieval from secondary storage.

5. Ability to define the notion of a transaction.

6. Ability to provide referential transparency between objects in main memory

and those in secondary storage.

1.2 Scope of this Dissertation

In this dissertation we present Voltaire, a set-oriented, imperative database pro-

gramming language. The set expressions in the language are conducive to data

intensive programming while maintaining a certain amount of efficiency by subscrib-

ing to the imperative paradigm. The language and its semantics are defined in a

modular but additive fashion, which facilitates a bootstrapped implementation. We

further argue that such an implementation model is desirable. The data definition

(or type) facility is similar to what might be found in most semantic data models and

is conducive to sharing heterogeneous records. The query language provides uniform

access to sets of instances as well as functions. Also, the compiler can exploit conven-

tional algebraic techniques for query optimization. The system provides automatic

integrity enforcement (up to a certain degree). Functions are effectively computed

as the result of integrity enforcement. This is because we consider constraints as a

sequence of commands to be evaluated or satisfied in the specified order. Further,

there are no arbitrary restrictions on the persistence of values; even functions can

have a persistent extent.

We view Voltaire as an experiment to provide a language facility to manipulate

sets of associative data. Our set expressions are superficially similar to those in

SETL [49], thus reducing certain paradigm mismatch problems with record-oriented











languages. The design of our language in general and our inheritance and data

declaration scheme, in particular, strongly reflect the database notion that a class

denotes a set of instances that belong to it. We provide the following functionality

in Voltaire:

1. a data definition facility similar to what might be found in most semantic data

models [30],

2. a query language which provides uniform access to sets of instances as well as

functions [7],

3. automatic constraint management (up to a certain degree), for reasonably ex-

pressive constraints [40], and

4. ability to specify and compute arbitrary functions.

The first three features are based on the core functionality that a typical DBMS

must provide. Arbitrary functions are then computed under the control of the DBMS.

All of the above functionality is provided by a single execution model, which reflects

a bootstrapped implementation (see Figure 1.1c). Further, there are no arbitrary

restrictions on the persistence of values. We shall not be dealing with other important

issues such as concurrency, transaction management, recovery or active database

management (essential for efficient integrity enforcement). The main contributions

of this dissertation can be summarized as follows:

1. define a semantics for types, incorporating extent and behavior, that emphasizes

the notion that a class (or type) denotes a set of objects,

2. allow a set of heterogeneous records (objects) to belong to a single class to

facilitate sharing of data,












3. alleviate the paradigm mismatch between record-oriented and set-oriented prim-

itives for manipulating associative data within the language by means of type

coercion,

4. provide a modicum of efficiency by subscribing to the imperative paradigm

within a set-oriented language, and

5. provide a single model of execution for evaluating queries, enforcing constraints

and computing functions, by designing a language that facilitates some measure

of bootstrapping.

The rest of this dissertation is organized as follows. In the remainder of chapter 1,

we list some general design criteria for database programming languages and discuss

previous research. Then in chapter 2, we give a brief overview of the design rationale

of Voltaire and some of its features. In chapter 3, we describe the data definition

facility in Voltaire along with update operators and give a formal semantics of the

type model used in the language. In chapter 4, we describe the features of the query

sublanguage with the help of examples and also outline possible execution strategies.

In chapter 5, the constraint specification sublanguage is described. In chapter 6,

we first introduce the basic structure of functions in Voltaire and give a number of

examples. Then we explain how the notion of temporary instance creation provides

an operational means for giving an equivalent semantics to classes and functions in

the run-time environment. This is followed by a theoretical explanation of why classes

and functions can have an equivalent semantics and some implications thereof. In

chapter 7, we first describe how a user can interact with the Voltaire environment,

followed by a denotational semantics of the language. Finally, we summarize our

conclusions and the main contributions of this dissertation, as well as define future

research goals in chapter 8.











1.3 Some Design Criteria for DBPLs

Here we discuss the implications of merging the database and programming lan-

guage cultures, which have traditionally been divergent. We feel that these issues

discussed elsewhere [5, 7, 12, 16] have been predominantly viewed from a program-

ming language standpoint. We must first note that the primary function of a database

management system (DBMS) is to provide a persistent store of bulk data structures

for efficiently processing transactions on sets of such data.

More traditional application domains are data intensive, that is, the application

tends to have a large volume of instances or records, and relatively fewer types or

classes. Therefore, it is conceivable that existing data models could be extended to provide

advanced functionality such as the ability to compute arbitrary functions or active

data management [39, 46, 55]. The ability to define and handle various kinds of

transactions is crucial in these applications. In contrast, newer application areas

such as CAD/CAM or CASE are computation intensive; that is, they tend to have a

large number of types or classes, each class having few instances, but requiring some

database functionality. It may be more expeditious to extend a given programming

language such that it provides DBMS-like functionality [1, 44, 48, 52]. Hence, it

seems that before designing a DBPL, the expected application domain should be
known, since it is rather difficult (however desirable it may be) to design a system

which can solve all problems. Most DBPLs seem to have taken the second option

with certain exceptions. Some of these are relational systems embedded within logic

and procedural languages [28, 34, 36, 48] and other systems such as [33, 52]. There is

a third class of DBPLs which are designed from scratch and address specific issues.

These languages tend to be more experimental in nature.











We now attempt to analyze the effects of both the above options on various

features that a DBPL may have.

1.3.1 Semantic Data Model versus Persistent Abstract Data Types

A semantic data model rigidly defines the structure of objects (or instances) which

reside in a persistent store, and classes which describe these objects. Type construc-

tors can only be used to define the domain of values which various attributes of a given

object can assume. This means that new classes cannot be defined (or constructed) by

applying type constructors to existing types; such manipulation is allowed only in the

query language. In contrast, there are no such restrictions on type constructors with

an abstract data type. However, with the abstract data type approach, the database

administrator must determine the most suitable data types and structures for the ap-

plication at hand, and also write a set of create, update, delete and retrieve routines

for each such structure. This is usually not considered a satisfactory situation in the

database culture, primarily because it violates the principle of data independence.

A partial remedy may be to distinguish between persistent and non-persistent data

types, so that generic operators for manipulating the persistent objects can be effi-

ciently implemented. But then this violates the principle of uniform persistence, i.e.,

persistence should be orthogonal to type [5]. Therefore, choosing a rigid data model

implies efficient access to the persistent store but a lack of a rich typing mechanism,

whereas the second option implies inefficient access to the secondary store but a rich

typing mechanism and extensibility.

We would like to emphasize that persistent programming languages are not data-

base programming languages. This is because when a programming language is ex-

tended to provide persistence, its type theory is usually not appropriately extended.











That is, such type systems are often unable to answer the following questions in a

clear fashion:

1. when is one class (type) a subclass (subtype) of another?

2. when is an object (instance or record) a member of the domain of a given class

(or type)?

Another problem with these type systems is that they often do not provide trans-

parency between persistent and transient objects, that is, a separate set of operators

is defined for persistent set of objects. Hence, we believe that persistent versions of

languages such as C++, Smalltalk or Ada cannot be classified as DBPLs, but should

be considered as intermediate (albeit important) steps towards one.

1.3.2 Type Checking

The general consensus here seems to be that the language should be strongly
typed, though some obviously convenient overloading may be allowed [5]. There also

seems to be a consensus that type checking should be static as far as possible. This

would minimize run-time errors thus saving on the transaction processing overhead

(catching a run-time error late in the transaction may result in a number of undo

operations). Static type checking can be difficult to achieve in highly polymorphic

languages, though some progress has been reported [43, 54].

1.3.3 Ability to Manipulate Heterogeneous Sets

Type definitions in languages such as C++ do not account for the extent of the
type. This contrasts with the database notion of a class, which denotes the set

of all instances that belong to that class. There has been much recent work on
defining type schemes which attempt to define the extent of a type [5, 15, 16, 18,
19, 54]. An important feature is the ability to manipulate sets of heterogeneous











data. For example, the language Machiavelli [43] defines a type discipline in which
it is possible to write polymorphic functions, which may operate on sets of different

kinds. However, a particular execution of the function may only operate on a set

whose elements belong to a single kind.
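To make the database reading of a class concrete, the following sketch treats a class as the set of records that carry its required attributes, so that records of different shapes can belong to the same class. It is an assumption-laden illustration in Python, not the type scheme of Machiavelli or of Voltaire; the names PERSON, STUDENT, extent and is_subclass, and the purely structural membership rule, are hypothetical.

```python
# A class is described here only by the attributes a record must have.
PERSON = frozenset({"name"})
STUDENT = frozenset({"name", "gpa"})

# Heterogeneous records: no two of them have the same set of attributes.
database = [
    {"name": "Ann", "gpa": 3.9},
    {"name": "Raj", "gpa": 3.4, "advisor": "Lee"},
    {"name": "Kim", "salary": 50000},
]


def extent(required_attrs, records):
    """All records that carry every attribute in `required_attrs`."""
    return [rec for rec in records if required_attrs.issubset(rec.keys())]


def is_subclass(sub, sup, records):
    """Extensional test: every member of `sub` is also a member of `sup`."""
    sup_extent = extent(sup, records)
    return all(rec in sup_extent for rec in extent(sub, records))


print([rec["name"] for rec in extent(STUDENT, database)])
print(is_subclass(STUDENT, PERSON, database))
```

Here the extent of STUDENT contains two records of different shapes, and the subclass test is answered by comparing extents rather than declarations.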

1.3.4 Ability to Share Data

The ability to share data (heterogeneous or otherwise) should be an important

property of a database programming environment. Sharing can occur in three ways:

1. A single schema can describe multiple databases. For example, a chain of stores

can have a single schema to describe the inventory at all of its locations.

2. A single database can have multiple schemas describing it (unlike views). For

example, a plant manager and plant engineer can have two different schemas

emphasizing different aspects of the same CAM database.

3. Multiple users may wish to share a given database (possibly viewed through

different schemas).

1.3.5 Data versus Functions

Since independent applications access the same shared data under the control of

a DBMS, the focus of a DBMS is on the data. On the other hand, the focus in

a programming language is on the application itself, and the data types are sim-

ply a mechanism for efficient implementation of the application. This traditional

separation of data from function leads to a very fundamental conflict when design-

ing a DBPL, having implications on constraint management, ad hoc querying and

transaction processing. For example, let us examine the implications on an appli-

cation independent (i.e., ad hoc) query mechanism. Since functions (or methods or

procedures) can be used to generate derived attributes, it becomes necessary to be











able to query them [7]. Consider the class person with attributes birthdate and age

and a function called compute-age which computes the age of a person given his/her

birth date and the current date. The query reference person.age should automatically

trigger the compute-age function. Alternatively, the language should allow the

query reference person.compute-age. Ideally, the DBPL should allow functions to be

accessed in a fashion similar to that of other objects.
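The person example can be sketched as follows. This is a Python illustration rather than Voltaire syntax; the Person class, the spelling compute_age (Python identifiers cannot contain hyphens) and the helper older_than are assumptions made for the sketch.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Person:
    name: str
    birthdate: date

    def compute_age(self, today: date) -> int:
        """Whole years elapsed between `birthdate` and `today`."""
        had_birthday = (today.month, today.day) >= (
            self.birthdate.month,
            self.birthdate.day,
        )
        return today.year - self.birthdate.year - (0 if had_birthday else 1)

    @property
    def age(self) -> int:
        # Reading person.age triggers compute_age, so the derived attribute
        # is accessed exactly like a stored one.
        return self.compute_age(date.today())


def older_than(people, years, today):
    """A query that calls the function directly instead of using `.age`."""
    return [p.name for p in people if p.compute_age(today) > years]


people = [Person("Ann", date(1964, 5, 20)), Person("Raj", date(1980, 1, 2))]
print(older_than(people, 20, date(1991, 12, 1)))
```

Both styles mentioned in the text appear here: the attribute reference that silently invokes the function, and the explicit function call inside a query.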

1.3.6 Database Integrity

Before designing a DBPL, one should establish how important database integrity is

for the given application area, and decide how much of the burden of maintaining

this integrity can be placed on the application programmer.

Typically, in traditional database systems, integrity is enforced by application pro-

grams. However, enforcing integrity constraints is considered an important database

function, which should be handled by the DBMS itself. Some recent solutions to this

problem have been discussed in the area of active databases [39, 40]. When dealing

with complex objects, the DBMS must at least be capable of maintaining referential

integrity. It is relatively difficult to define a theory of types that also takes into ac-

count the extent of the type in persistent store, since the user has complete freedom to

define any arbitrary type. This makes it even more difficult to identify and enforce

integrity constraints. The fundamental conflict here is that a database associates

constraints with objects (i.e., automatic triggering of constraints when an object is

created, updated or deleted), whereas in a programming language, constraints are

embedded in the procedure and therefore cannot be triggered automatically. Much

recent work on constraint management is reported in the active database literature

[21, 39, 46, 55]. Enforcing constraints within the DBMS would also lead to more efficient transaction management,











since a user-defined procedure for maintaining integrity can have arbitrary side ef-

fects, thus making it impossible to automatically determine which constraints will be

violated. However, it is not yet clear how the notion of an active database can be

merged with a programming language to design a DBPL.
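The contrast between constraints associated with objects and constraints embedded in procedures can be illustrated with a small sketch. It is written in Python under stated assumptions: the Student class, the ConstraintViolation exception and the single gpa constraint are hypothetical, and the mechanism shown (intercepting attribute updates) is only one possible way to trigger checks automatically, not the approach of any system cited here.

```python
class ConstraintViolation(Exception):
    """Raised when an update would leave an object in an invalid state."""


_MISSING = object()  # marks an attribute that did not exist before an update


class Student:
    # Constraints are attached to the class, not to any one procedure.
    constraints = [
        ("gpa must lie between 0.0 and 4.0", lambda s: 0.0 <= s.gpa <= 4.0),
    ]

    def __init__(self, name, gpa):
        # Build the object first, then check it once (creation trigger).
        object.__setattr__(self, "name", name)
        object.__setattr__(self, "gpa", gpa)
        self._check()

    def __setattr__(self, attr, value):
        # Every update is checked automatically (update trigger).
        previous = getattr(self, attr, _MISSING)
        object.__setattr__(self, attr, value)
        try:
            self._check()
        except ConstraintViolation:
            # Undo the offending update before reporting the violation.
            if previous is _MISSING:
                object.__delattr__(self, attr)
            else:
                object.__setattr__(self, attr, previous)
            raise

    def _check(self):
        for description, holds in self.constraints:
            if not holds(self):
                raise ConstraintViolation(description)


s = Student("Ann", 3.5)
s.gpa = 3.9          # accepted
try:
    s.gpa = 5.0      # rejected and rolled back
except ConstraintViolation as err:
    print("rejected:", err)
print(s.gpa)
```

Because the check lives with the object, no application procedure can update gpa without it; a constraint coded inside one procedure would be bypassed by any other procedure that touches the same data.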

1.3.7 Role of the Query Language

A database user usually needs to retrieve or otherwise operate on sets of similar

valued objects defined by various classes. The query language is the mechanism

that allows the user to specify a restricted class of computations to operate on such

sets. It usually allows only restricted computations so as to maximize efficiency.

The considerations for optimizing a query processor are significantly different from

those in programming languages, which typically operate on one object at a time in

virtual memory. Query optimizers rely heavily on clustering information on the disk,

indexing, caching, and the algebraic properties of the primitive operators provided

by the query language. Ideally, one would want to augment the computing power of

a query language by making it a "proper" subset of the programming language. (By

proper subset we mean that by removing all querying primitives from the DBPL,

it would be rendered Turing incomplete.) In this scenario, it would be possible to

make arbitrary computations efficiently as well as to evaluate ad hoc queries. But it

should be pointed out here that if the DBPL were to have a very rich type system

where the persistent bulk data are of various different types, then query optimization

becomes too complex to be effective. This is because each bulk data type would have

its own associated optimization technique. Additionally, if the bulk data types are

vastly different from each other, then it can be very difficult to meaningfully overload

the query language primitives. For instance, it might be difficult to define a single

"join" operator for relations in first normal form and user-defined complex objects











in a non-relational format. After all, the notion of uniform persistence should quite

naturally be extended to the notion that the query language should be uniform (i.e.,

have a small set of operations that apply uniformly across) for all data types. This

might be possible only in a language whose type system is highly polymorphic, and

even if so, would be achieved only at the expense of sacrificing efficiency. Some work

towards this end is reported in [22, 35, 43, 50, 53, 56].
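As an example of the algebraic properties on which query optimizers rely, a selection whose predicate mentions only one operand of a join can be applied before the join without changing the result. The sketch below checks this equivalence on a toy data set; it is written in Python, and the relations students and enrolled, the predicate honors and the helper functions are hypothetical.

```python
def join(left, right, key):
    """Nested-loop equijoin of two lists of dict records on `key`."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def select(records, predicate):
    """Keep only the records that satisfy `predicate`."""
    return [rec for rec in records if predicate(rec)]


students = [
    {"sid": 1, "name": "Ann", "gpa": 3.9},
    {"sid": 2, "name": "Raj", "gpa": 3.1},
]
enrolled = [
    {"sid": 1, "course": "DB"},
    {"sid": 2, "course": "DB"},
    {"sid": 1, "course": "PL"},
]


def honors(rec):
    # The predicate refers only to an attribute of `students`.
    return rec["gpa"] > 3.5


# Selection applied after the join: every pair is built, then filtered.
late = select(join(students, enrolled, "sid"), honors)
# Selection pushed below the join: fewer records take part in the join.
early = join(select(students, honors), enrolled, "sid")

print(late == early)
```

A rewrite of this kind is straightforward when there is a single bulk data type with known operators; the text's point is that each additional bulk type would need its own set of such equivalences.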

1.3.8 Implementation Strategies

Traditional database functionality such as concurrency, locking and transaction

management facilitates data sharing. Such functionality is based on the notion that

a class denotes the set of instances that belong to it. Thus, it seems important that

a database programming language emphasize data rather than function.1 Figure 1.1

shows some possible implementation strategies-Figure 1.1 a simply depicts a classical

situation where DML statements are embedded in some host language. It is perhaps

fair to say that Figure 1.lb depicts a typical implementation of the newer generation

of database systems. Such implementations are in agreement with some recent work

on extensible systems [10, 20]. From the application programmer's point of view,

Figures 1.1b and 1.1c are functionally equivalent. However, we believe that Figure

1.1c is a cleaner and more desirable implementation model because:

1. it is possible for syntactic structures to be shared without harmfully overloading

their semantics,

2. it would be easier to bootstrap such a system,

3. it would lead towards a smaller, integrated language, and

4. it would reduce communication overhead between the various modules.

¹ This is in contradistinction to functional data models such as DAPLEX [51] or PDM [38].

















[Figure 1.1. Implementation Strategies. Panels: (a) Classical Scenario (diagram labels: Operating System, DBMS); (b) New Generation DBMS; (c) Bootstrapping in Database Programming Language.]











1.3.9 Choice of Computing Paradigm

Ideally, the choice of a given computing paradigm should make no difference.

Unfortunately, this is not the case in practice. It is very tempting to design a logic or

functional language since they have sound theoretical bases. This would make query

optimization much easier, but the semantics of transaction processing can become

messy because all update functions may have to be implemented as meta-predicates.

This is because it is often difficult to provide a formal description of operations that

produce side effects such as updates. Besides, users seem to have a tendency to shy

away from such languages. The implications of object-orientation on DBPL design

have been well discussed in Bloom and Zdonik [12] and Bancilhon [7] and will not

be discussed here. Procedural languages such as COBOL or C or Pascal have the

main advantage of being rather popular among application programmers. However,

they are considered to be "low-level" and therefore not expressive enough. Also, most

procedural languages have virtually no set processing primitives (with the exception

of COBOL).

However, from a database perspective, we feel that the destructive assignment op-

erator causes the most problems. In a truly integrated DBPL environment [5] with

uniform persistence, it is difficult to prevent the user from (even accidentally) assign-

ing a new value to a field. In effect, such an assignment is an update to the database

which could spawn potentially many subtransactions for checking constraints before

the assignment operation could be committed and the next command executed. (This

is in addition to the usual problems such as garbage collection and dangling refer-

ences caused by destructive assignment.) The destructive assignment operator is the











bête noire of automatic side-effect detection and constraint management. Unfortunately,

the destructive assignment operator is necessary to achieve efficiency and better

performance.
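
The cost described above can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the class `PersistentObject`, the exception `ConstraintViolation` and the `budget`/`payroll` fields are hypothetical names, not part of Voltaire. Every field assignment is treated as an update that must pass all attached constraints before it is kept.

```python
class ConstraintViolation(Exception):
    """Raised when an assignment would leave the object inconsistent."""


class PersistentObject:
    """Hypothetical persistent record whose field assignments are checked."""

    def __init__(self, constraints, **fields):
        # Bypass __setattr__ while the object is being constructed.
        object.__setattr__(self, "_constraints", list(constraints))
        object.__setattr__(self, "_fields", dict(fields))

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails.
        fields = object.__getattribute__(self, "_fields")
        if name in fields:
            return fields[name]
        raise AttributeError(name)

    def __setattr__(self, name, value):
        # Apply the update tentatively, check every constraint,
        # and roll back if any constraint fails.
        old = dict(self._fields)
        self._fields[name] = value
        for check in self._constraints:
            if not check(self):
                self._fields.clear()
                self._fields.update(old)
                raise ConstraintViolation(f"assignment to {name!r} rejected")


dept = PersistentObject(
    constraints=[lambda d: d.budget > d.payroll],
    budget=100,
    payroll=60,
)
dept.payroll = 80          # accepted: budget > payroll still holds
try:
    dept.payroll = 120     # rejected: would violate budget > payroll
except ConstraintViolation:
    pass
```

Even in this toy version, a single assignment runs every constraint, which hints at why unrestricted destructive assignment is expensive in an integrated environment.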

Regardless of which design strategy or language paradigm is chosen, one obvious

pitfall to avoid is the PL/I syndrome.² Many DBPLs that are the result of three

orthogonal sublanguages being appended to each other (see section 1.1) are also

victims (though to a much lesser degree) of the PL/I syndrome. For instance, it is

better to provide different kinds of users with various library functions, rather than

incorporating language constructs for everything. Since one of the design goals of a

DBPL is to cater to a larger variety of users, the environment should provide default

primitives for each functionality which can be easily superseded by the user.

1.4 Previous Research

Most DBPLs described in the literature fall into three main design options:

1. Embed a given data model in some programming language, e.g., Pascal/R [48],

Modula/R [34], ADAPLEX [52], O2 [35], Gemstone [23].

2. Provide persistence to a programming language (some languages also provide

set manipulation primitives), e.g., PS-Algol [6], ODE [1], ONTOS [44].

3. Design a new system from scratch, e.g., TAXIS [41], Galileo [3], Machiavelli [43].

Voltaire falls in this category. TAXIS offers elaborate exception handling and

meta-data definition capabilities, while the other two have polymorphic type

systems based on ML [29]. Galileo is an expression-oriented language, thus

eliminating the need for an explicit query language. Machiavelli is a functional
² The PL/I syndrome is a design pitfall in which an arbitrarily large number of constructs are
provided. This in turn leads to a large and unwieldy language which is difficult to implement or
learn.











language which explicitly addresses the type versus class issue and the ability

to manipulate sets of heterogeneous elements.

The first class of languages is engineered to provide a relatively clean interface

between the record-oriented programming language primitives and set manipulation

primitives for the underlying data model. Another important class of such languages

are relational systems embedded within logic languages [27]. However, the main

problem with these languages is that a certain amount of paradigm mismatch remains.

For example, in Pascal/R, Pascal is an imperative language whereas the relational

model and its query language are declarative.

In the second class of languages, we have PS-Algol, which provides a persistent

store for all types in Algol. On the other hand, ODE and ONTOS are extensions of

C++, in which the only persistent structures are C++ classes. The problem with

these languages is that they have not addressed the type versus class issues. When

extending these languages with persistence, their type systems are not appropriately

extended. That is, the type systems of these extended languages are unable to answer

one or both of the following questions:

1. when is one class (type) a subclass (subtype) of another?

2. when is an object (instance or record) a member of the domain of a given class

(or type)?

In the third class of languages, to which Voltaire belongs, TAXIS is one of the
earliest efforts. It is a record-oriented language with a very elaborate exception han-

dling mechanism. It provides arbitrary levels of meta-classes, and transactions and

exceptions can be organized into a taxonomy. The language relied heavily on asso-

ciative access by means of a dot operator. However, it did not have set manipulation











primitives, and constraints could be satisfied only by means of defining appropriate

transactions and handling exceptions. Also, TAXIS classes are derived mainly from

semantic networks rather than a typical type system [19]. In Voltaire, we provide a

similar dot operator for associative access, as well as set manipulation primitives and

automatic constraint management. Further, the type system is well-defined.

Galileo is an expression-oriented language with an ML-style type discipline. In

such languages, expressions are evaluated directly; there is no need to write a function

(or query) and then compile it before executing it. Therefore, it eliminates the need

for a separate query language. A main design goal was to view Galileo as a conceptual

design tool. Unlike Voltaire, it offers no automatic constraint management. Although

Voltaire is not expression-oriented, we do not need a separate query language (largely

due to its bootstrapped design).

Machiavelli is a functional language with an ML-style type discipline. An im-

portant aspect of its polymorphism is an underlying algebra of sets based on the

homomorphic extension operator [17]. It also defines a coherent type theory which

can deal with sets of heterogeneous records. Unlike Voltaire, a notion of persistence

is still to be defined, and it does not support automatic constraint management.

Like Machiavelli, we have an underlying algebra of sets based on the homomorphic

extension operator. An important difference is that a unique identifier (and option-

ally, the name of the class) is automatically a part of any instance created in the

system.

By contrast, O2 defines a theory of types based on Cardelli [18]. The semantics

of behavior (i.e., methods) is captured by defining a signature (which is a set of

functions attached to a class or type). The O2 data model is embedded within C

and Basic. The semantics of our type system is based on that of O2 with two main

differences:










1. we support multiple inheritance, and

2. we model behavior by giving it an entirely extensional interpretation, rather

than as a signature.

Thus, the design of Voltaire was heavily influenced by Machiavelli, TAXIS and

O2. Further, none of these languages provide a means to share data as described in

section 1.3.4.
















CHAPTER 2
AN OVERVIEW OF VOLTAIRE

While there are a number of issues governing the design of a database program-

ming language, we have chosen to address only a few of them. The Voltaire environ-

ment is intended to be used as a vehicle in which a user can efficiently define his or her
application with ease. The applications are expected to be data intensive, as opposed

to computation intensive. An environment that is easy to use can result when the

user need only focus on the specification of the application, rather than worry about

dealing with paradigm mismatch problems between the host programming language

and the DDL/DML (as discussed in the previous chapter). Thus, our primary goal
is to provide the user with a truly integrated paradigm for data intensive comput-

ing. We achieve this by providing a single model of execution for evaluating queries,

enforcing constraints and computing functions, by designing a language that facili-

tates a bootstrapped implementation. Further, we define an extensional semantics

for behavior in our type theory, thereby giving an equivalent semantics to classes

and functions. Thus, a function is computed as the result of constraint satisfaction.

We first present the design rationale of Voltaire, followed by a brief overview of its

various programming constructs.

2.1 Design Rationale of Voltaire

The basic structure of a query expression is as shown below:

<query> ::= { <context> "|" <boolean exp> }

<boolean exp> ::= <boolean exp> and <boolean exp> | ... | <exp> <rel op> <exp>











::= I I
::= I .

A query consists of associative set expressions (see chapter 4). The user specifies a
path (or subgraph) of interest on the LHS of the vertical bar, and boolean predicates

for selection conditions on the RHS of the vertical bar. This path of interest denotes

the context of the set expression within which certain boolean conditions must hold
true. The context further defines the scope of identifiers. A simple context can be specified

by using a dot expression such as Student.Course.Dept. As an example, consider the
query {Student.name | Student.Course.c# > 6000 and Student.advisor in Faculty}.

The syntactic category <exp> denotes expressions which are simple extensions to terms
and factors found in most languages such as Pascal. A query can contain embedded
subqueries since a query is a kind of expression, and consists of expressions.
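
As a rough illustration of how such a set expression might be read, the example query can be mimicked with a Python set comprehension. This is a sketch only: the data is made up, and it assumes that a condition on a set-valued path such as Student.Course.c# holds when at least one course on the path satisfies it.

```python
# Toy data; every name and value here is made up for illustration.
faculty = {"f1", "f2"}
students = [
    {"name": "Joe", "advisor": "f1", "courses": [{"c#": 6100}, {"c#": 5000}]},
    {"name": "Ann", "advisor": "x9", "courses": [{"c#": 6500}]},
    {"name": "Bob", "advisor": "f2", "courses": [{"c#": 4000}]},
]

# {Student.name | Student.Course.c# > 6000 and Student.advisor in Faculty}
result = {
    s["name"]                                      # LHS: what is returned
    for s in students                              # context rooted at Student
    if any(c["c#"] > 6000 for c in s["courses"])   # RHS: selection conditions
    and s["advisor"] in faculty
}
```

Only Joe satisfies both conditions here: Ann's advisor is not in the faculty set, and Bob takes no course numbered above 6000.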

Boolean expressions have the usual and, or, not operators, quantifiers and relational
expressions of the form <E1> <rel op> <E2>. Thus, a constraint is of the

form:

<constraint> ::= if <boolean exp> then <consequent>

The issue is to define the syntactic category <consequent>:

1. without introducing further syntactic categories, and

2. without overloading the semantics of existing structures in an unnatural fashion.

This can be resolved by overloading the equality operator such that two conditions
arise. If both the RHS and LHS are bound, then satisfiability is checked. If the LHS
of the equality operator is unbound, then an assignment (or, more appropriately, a
binding) takes place. Thus, <consequent> ::= <boolean exp>. If these boolean conditions

are chosen to be simple propositions, then satisfiability is NP-complete (due to the











satisfiability problem), and the order in which constraints appear is insignificant. But

such a choice would be inadequate for the following reasons:

1. lack of expressive power,

2. computational overhead due to insignificance in the order of constraints,

3. it raises the issue of how to blend such a semantics into a programming language

that is not based on theorem proving techniques (such as resolution).

By taking a rather operational view in which the order of constraints is significant,

we can avoid the above problems. Also, we can blend constraints into a set-oriented

yet imperative programming language. A program can then be viewed as a sequence

of constraints and other commands:

<program> ::= <statement>+

<statement> ::= <constraint> | <command>
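
The operational reading described above, with an overloaded equality and order-sensitive constraints, can be sketched in Python. The helper names `equate` and `run`, the `UNBOUND` marker and the sample environment are assumptions made for this illustration; they are not Voltaire constructs.

```python
UNBOUND = object()  # marker for an attribute that has no value yet


def equate(env, name, value):
    """Overloaded equality: bind if `name` is unbound, otherwise test."""
    if env.get(name, UNBOUND) is UNBOUND:
        env[name] = value       # LHS unbound: a binding takes place
        return True
    return env[name] == value   # both sides bound: satisfiability is checked


def run(env, constraints):
    """Evaluate constraints in order; stop at the first one that fails."""
    for antecedent, consequent in constraints:
        if antecedent(env) and not consequent(env):
            return False
    return True


env = {"salary": 50000, "incr": 10}
program = [
    # Binds raise_, which was unbound.
    (lambda e: True,
     lambda e: equate(e, "raise_", e["salary"] * e["incr"] // 100)),
    # Relies on raise_ having been bound by the previous constraint.
    (lambda e: e["raise_"] > 0,
     lambda e: equate(e, "new_salary", e["salary"] + e["raise_"])),
    # new_salary is now bound, so this is a check rather than a binding.
    (lambda e: True,
     lambda e: equate(e, "new_salary", 55000)),
]
ok = run(env, program)
```

Reordering the list would change the outcome (the second constraint reads a value the first one binds), which is exactly the sense in which the order of constraints is significant.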

The category <command> may consist of operators with side effects such as updates

or input-output or other convenient constructs such as an iterator. Given the

above interpretation, there is no a priori reason why a command cannot be a kind

of consequent as well, i.e., <consequent> ::= <boolean exp> | <command>. Constraints

are no longer viewed as mere pre- and post-conditions on the state of a computa-

tion, but rather as conditions that must hold true at arbitrarily specified points in a

computation. This scheme is fairly general; consider the following:

<constraint> ::= if <antecedent> then <consequent>

The antecedent of a constraint can also be events such as updates or retrieves, or

exceptions. These issues are important in active database management [13, 21, 39,

40]. Thus, <antecedent> ::= <boolean exp> | <event> | <exception>.











The main limitation of this operational interpretation is that constraints cannot

be automatically propagated, other than what has been explicitly programmed by a

user. For example, the user would have to write a rule such that if any employee is

deleted, then delete all dependents of such an employee. If such rules are omitted in

the definition of a given class, then the database may result in an inconsistent state.

However, by adopting a lazy evaluation strategy, consistent data can be guaranteed

as the result of evaluating an expression¹ (recall that a query is only one kind of

expression). The above discussion is based on the implicit assumption that expres-

sions can be evaluated against a persistent store, i.e., a database. We believe that

the above formulation leads towards a bootstrapped implementation.
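
The limitation noted above, that propagation happens only where the user has programmed it, can be illustrated with a minimal Python sketch of an event-triggered rule. The registry `rules`, the helper `on` and the sample records are hypothetical and stand in for whatever mechanism a real system would provide.

```python
employees = {"joe": {"name": "Joe"}, "sally": {"name": "Sally"}}
dependents = {"d1": {"of": "joe"}, "d2": {"of": "joe"}, "d3": {"of": "sally"}}

rules = []  # (event name, action) pairs, fired in registration order


def on(event, action):
    """Register a user-written rule for an event."""
    rules.append((event, action))


def delete_employee(emp_id):
    """Delete an employee, then fire every rule attached to the event."""
    del employees[emp_id]
    for event, action in rules:
        if event == "delete_employee":
            action(emp_id)


def drop_dependents(emp_id):
    """User-written rule: remove all dependents of a deleted employee."""
    for dep_id in [d for d, rec in dependents.items() if rec["of"] == emp_id]:
        del dependents[dep_id]


on("delete_employee", drop_dependents)
delete_employee("joe")
```

If the call to `on` were omitted, the deletion would still succeed and the two dependents of joe would be left dangling, which is the inconsistent state the text warns about.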

Other issues that we chose to address in the design of Voltaire with respect to the

issues outlined in section 1.3 are:

1. We define an object-based data model (or type system) that accounts for both

extent and behavior, and facilitates manipulation of heterogeneous records and

sharing of data. Further, operators defined in the language are transparent to

the persistence or non-persistence of objects. The set-oriented expressions can

be statically checked for type errors.

2. We alleviate the paradigm mismatch problem between record- and set-oriented

paradigms by designing a language based on set expressions, by employing

implicit type coercion, and some obvious operator overloading.

3. We provide a limited form of automatic constraint management. The query

language can uniformly access objects and functions.

To make our discussion more concrete, we shall briefly present an introductory

example of data definition, constraints and functions written in Voltaire in section 2.3.
¹ This is precisely the view taken by Jagadish [31].











We shall adopt the following convention in all subsequent chapters. All identifiers for

class names will begin with a capital letter, attribute names with a small letter and

reserved words in bold face. In normal text, all identifiers will be italicized, except

for reserved words.

2.2 A Quick Glance of Voltaire

Voltaire supports a number of features and abstraction mechanisms for modeling

the data as well as the application. We first list the abstractions for database modeling:


1. Classes: A class is a set of instances or objects being modeled, such that these

objects share certain common characteristics. The name of a class denotes the

objects currently existing in the database. There exists only one copy of the

object in the database, though other objects may refer to it. A class definition

consists of a sequence of <attribute, domain> pairs. An object can be a

member of a class if it has at least those attributes defined in the class; thus

an object can have additional attributes and belong to the class in question

without the necessity for creating either a new subclass or an exception.

2. Aggregation: Objects belonging to classes are aggregates of heterogeneous com-

ponents, having objects of other classes as components. Associations between

various objects are represented as aggregations. An object is a sequence of

<attribute, value> pairs.

3. Generalization: Voltaire supports a taxonomy of classes. Subclasses are derived

from a class by adding more information to the class. Instances of a subclass

also belong to its parent classes. Since we support multiple inheritance, an

instance can have many parent classes or belong to a subclass which can have











many parent classes. Further, the type of the elements of a subclass is a subtype

of the type of the elements of the parent class.

4. Sharing: The type system of Voltaire makes it possible for a given set of in-

stances to be viewed or shared by more than one schema; or for a given schema

to be able to define more than one set of instances (see section 1.3.4).


The Voltaire language also has the following characteristics:


1. Voltaire is a set-oriented but object-based language subscribing to the impera-

tive paradigm of programming.

2. Expressions in Voltaire are a simple extension of terms and factors, the kind of

expressions found in Pascal-like languages. An important extension is the set

expression which returns a set of objects (values or instances) belonging to a

given type. A simple set expression includes the dot operator which facilitates

associative access.

3. The main control structure is the sequencing of commands or constraints. The

language also provides conditionals, iterators, and recursive function calls.

4. Every denotable value of the language possesses a type:


(a) A type is a set of values sharing a set of common properties, together with

a sequence of constraints which define the behavior of elements of a type.

(b) The predefined types are boolean, integer, real, string, with the usual op-

erators, the type Nil, which is a singleton set with the element null, and

the type Any, of which all types are a subtype. Equality is defined for the

type Nil, which is a subtype of all types defined in the schema.











(c) The type constructors set and tuple are available to define new types from

predefined or previously defined types.

(d) A value of type $\tau_1$ can be used as an argument to a function defined for

values of type $\tau_2$, if $\tau_1$ is a subtype of $\tau_2$. Since the subtype relation is a

partial order, reverse substitution is not allowed.


5. It is a first order language. However, the extent of a function is a denotable value

(which can also be persistent). Therefore, an element belonging to the extent

of a function² can be embedded in data structures, passed as a parameter, or

returned as a value. It should be noted that this approach is quite different

from the one taken in higher order functional languages where the function

itself is a denotable value.

6. Functions and classes in Voltaire have an equivalent semantics.

7. A given function is specified by the relationships between the input and output

arguments of that function. These parameters form the attributes of the func-

tion (or class), and the relationships among them are expressed as a sequence of

constraints. These relationships or constraints are rules for evaluating the func-

tion. Thus, the evaluation of a function can be seen as the result of sequential

constraint satisfaction.

8. The Voltaire environment prompts the user for inputs and reports the result of

computations in an interactive fashion. At this level of evaluation, the user can

load a given schema (definitions of classes and functions) and a given database
² It is useful to think of an element of the extent of a function as a member of the graph of that
function. The Voltaire system, however, treats it as an instance whose attributes (which correspond
to the formal parameters of the function) are bound to denotable values, thus capturing pre- and
post-computation information.











(a set of instances). Alternately, a new schema can be defined and a new

database created. Further, one can evaluate set expressions (which, effectively,

are queries) or execute functions.
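
Items 5 and 7 above can be pictured with a short Python sketch. It is an assumption-laden illustration, not Voltaire's mechanism: the function `double` and the list `extent` are hypothetical, and each evaluation simply records an instance whose attributes are the input and output parameters.

```python
extent = []  # hypothetical store for the extent of the function


def double(x):
    """Evaluate the function and record the instance that results."""
    instance = {"x": x, "result": 2 * x}   # formal parameters become attributes
    extent.append(instance)
    return instance


first = double(3)
second = double(5)

# The function itself is not passed around; elements of its extent are
# ordinary values that can be stored, compared or queried.
largest = max(extent, key=lambda inst: inst["result"])
```

Here `largest` is the recorded instance for the input 5, obtained by querying the extent like any other set of instances.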


2.3 An Introductory Example

We give below a simple example to illustrate the notion of sharing as defined

in section 1.3.4. As mentioned there, a given schema can describe more than one

consistent set of instances, and likewise, a given set of instances can be defined by

more than one schema. Therefore, we define two simple schemas and two sets of

instances.

Let Schema1 be defined as follows:

class Employee defined
  attributes
    name: string
    ss#: integer
    dept: Department
    manager: Employee
    salary: integer
  constraints

class Dept defined
  attributes
    name: string
    location: string
    manager: Employee
    budget: integer
  constraints
    budget > sum {Employee.salary |
                  Employee.dept.Dept.name = self.name };

class Incr-Salary function
  attributes
    incr: integer
  constraints
    for each x in Employee do
      {modify x | salary = prev.salary + (prev.salary × incr) ÷ 100};
    enddo


Thus, Schema1 consists of the two classes Employee and Dept and the function

Incr-Salary. A constraint is defined on the class Dept such that the budget of each











Dept should be greater than the sum of the salaries of all employees working in it.

The argument of the sum operator is effectively a query, in which self denotes the

currently active instance of the class Dept. The function Incr-Salary increases the

salary of each employee in the database by a given percentage. The dot expression

prev.salary denotes the older value of salary. The command in the body of the for

loop could have been alternately written as:

salary := salary + (salary × incr) ÷ 100;
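
The arithmetic of Incr-Salary and the Dept budget constraint can be checked with a small Python sketch. The salaries and budgets are taken from the DB1 instances listed later in this section; the dictionary layout, the function names and the use of integer division are assumptions made for the illustration.

```python
employees = {
    "joe":   {"dept": "finance",    "salary": 60000},
    "sally": {"dept": "finance",    "salary": 65000},
    "harry": {"dept": "production", "salary": 55000},
    "jim":   {"dept": "production", "salary": 50000},
}
budgets = {"finance": 5550000, "production": 6000000}


def budget_ok(dept):
    """Dept constraint: budget > sum of the salaries of its employees."""
    payroll = sum(e["salary"] for e in employees.values() if e["dept"] == dept)
    return budgets[dept] > payroll


def incr_salary(incr):
    """Raise every salary by `incr` percent, computed from the previous value."""
    for emp in employees.values():
        prev = emp["salary"]
        emp["salary"] = prev + (prev * incr) // 100


incr_salary(10)
```

After a 10 percent increase both departments still satisfy the budget constraint, since each payroll remains far below the corresponding budget.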


Similarly, let Schema2 be defined as follows:

class Employee defined
  attributes
    name: string
    manager: Employee
    salary: integer
  constraints
    self.salary < manager.salary

class Dept defined
  attributes
    name: string
    manager: Employee
  constraints

class Emps-in-Dept function
  attributes
    dept-name: string
    dept-mgr: string
    emps-in-dept: set Employee
  constraints
    dept-mgr = {Dept.manager | Dept.name = dept-name };
    emps-in-dept = {Employee | Employee.manager = dept-mgr };


We again define Employee and Dept classes and a function Emps-in-Dept which

determines all the employees working in a department given its name. The function

could have been redefined without the identifier dept-mgr as follows:

emps-in-dept = {Employee | Employee.manager =

                {Dept.manager | Dept.name = dept-name } };
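
A Python rendering of the nested form may help in reading it. This is a sketch under assumptions: the manager links mirror the DB1 instances listed later in this section, and equality between an attribute and a set-valued subquery is read here as membership in that set.

```python
employees = {
    "joe": {"manager": "sally"},
    "sally": {"manager": "sally"},
    "harry": {"manager": "harry"},
    "jim": {"manager": "john"},
}
depts = {
    "finance": {"name": "Finance", "manager": "sally"},
    "production": {"name": "Production", "manager": "harry"},
}


def emps_in_dept(dept_name):
    # Inner set expression: {Dept.manager | Dept.name = dept-name}
    dept_mgr = {d["manager"] for d in depts.values() if d["name"] == dept_name}
    # Outer set expression: {Employee | Employee.manager = dept-mgr}
    return {e for e, rec in employees.items() if rec["manager"] in dept_mgr}


finance_staff = emps_in_dept("Finance")
production_staff = emps_in_dept("Production")
```

Under this reading the result contains the employees managed by the department's manager, so jim (whose manager is john) does not appear in the Production result.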












Let the set of instances DB1 be as follows:


instance joe class Employee
  ss# = 123123123
  name = "Joe"
  dept = finance
  manager = sally
  salary = 60000

instance harry class Employee
  name = "Harry"
  ss# = 111222333
  dept = production
  manager = harry
  salary = 55000
  spouse = sally

instance production class Dept
  name = "Production"
  location = "austin"
  manager = harry
  budget = 6000000
  employees = {jim, harry}

instance jim class Employee
  ss# = 121212121
  name = "Jim"
  dept = production
  manager = john
  salary = 50000
  car = "toyota"

instance sally class Employee
  name = "Sally"
  ss# = 789789789
  dept = finance
  manager = sally
  salary = 65000

instance finance class Dept
  name = "Finance"
  location = "athens"
  manager = sally
  budget = 5550000


Note that the structures of the instances belonging to the classes Employee and

Dept are different. For example, nothing is mentioned about spouses and cars in the

class definition. Further, sally has a value for the attribute manager which points

to itself. Such cyclic structures are legal in Voltaire. It means that Sally is her own

manager. Similarly, let the set of instances DB2 be as follows:

instance smith class Employee
  name = "Smith"
  manager = jack
  salary = 45000
  education = "M.S."

instance jill class Employee
  name = "Jill"
  manager = alice
  salary = 54000
  spouse = jack

instance jack class Employee
  name = "Jack"
  manager = jack
  salary = 55000

instance alice class Employee
  name = "Alice"
  manager = alice
  salary = 65000
  dept = wonderland










instance wonderland class Dept
name = "Wonderland"
manager = alice
budget = null

We have defined a semantics for the type scheme that facilitates sharing of data

(see section 3.4). Thus, Schema2 can adequately define DB1 and DB2, since the type

system will deduce that the corresponding structures are compatible. Similarly, DB1

can be defined by Schema1 and Schema2.
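
The compatibility claim can be approximated with a simplified structural check in Python. This is a sketch only: it looks at attribute names and ignores domains and constraints, which the actual type system (section 3.4) would also consider. The dictionaries below transcribe the attribute lists of the two schemas and a few of the instances above.

```python
SCHEMA1 = {
    "Employee": {"name", "ss#", "dept", "manager", "salary"},
    "Dept": {"name", "location", "manager", "budget"},
}
SCHEMA2 = {
    "Employee": {"name", "manager", "salary"},
    "Dept": {"name", "manager"},
}


def conforms(instance, cls, schema):
    """An instance fits a class if it carries at least the class's attributes."""
    return schema[cls].issubset(instance.keys())


joe = {"ss#": 123123123, "name": "Joe", "dept": "finance",
       "manager": "sally", "salary": 60000}
smith = {"name": "Smith", "manager": "jack", "salary": 45000,
         "education": "M.S."}
wonderland = {"name": "Wonderland", "manager": "alice", "budget": None}

checks = {
    "joe/Schema1": conforms(joe, "Employee", SCHEMA1),
    "joe/Schema2": conforms(joe, "Employee", SCHEMA2),
    "smith/Schema2": conforms(smith, "Employee", SCHEMA2),
    "smith/Schema1": conforms(smith, "Employee", SCHEMA1),
    "wonderland/Schema2": conforms(wonderland, "Dept", SCHEMA2),
}
```

Under this simplified check, joe (from DB1) fits Employee in both schemas, while smith (from DB2) fits only the Schema2 definition because it has no ss# or dept attribute.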
















CHAPTER 3
DATA DEFINITION

3.1 Classes and Instances

The data definition facility in Voltaire allows us to define classes and an inheri-

tance hierarchy, as well as a database of instances. Depicted in Figure 3.1 is a schema

graph that can be easily modeled in Voltaire. This schema is defined in appendix

A. The purpose of this schema graph is to emphasize the associative nature of data

in many applications. For example, the classes Grad and Person denoting the set

of all graduate students and persons respectively in the universe of discourse can be

defined as follows:

class Grad defined
  superclasses Student
  subclasses RA, TA
  attributes
    ss#: integer
    name: string
    gpa: real
    major: Dept
    advisor: Faculty
    sections: set Section

class Person defined
  superclasses any
  subclasses Student, Teacher
  attributes
    ss#: integer
    name: string

The attributes ss# and name are inherited from the class Person; gpa, major

and sections are inherited from Student, and therefore, need not have been repeated

since Person was explicitly mentioned as a superclass in the definition of Student.

Instances are characterized by a unique identifier, the set of classes to which the

instance may belong, and the set of attribute value pairs. An instance may belong











to one or more classes provided it satisfies all constraints attached to a given class

and all of its superclasses. Some examples of instances are:

instance joe class Student
  ss# = 123123123
  name = "Joe"
  gpa = 3.5
  major = EE
  sections = s123, s234, s345

instance jim class Person
  ss# = 121212121
  name = "Jim"

instance john class Person
  ss# = 111222333
  name = "John"
  age = 35

instance jack class Person
  ss# = 789789789
  name = "Jack"
  salary = 12000

The first identifier "joe" after the keyword instance denotes a unique identifier

for the instance in question. It belongs to the class Student. The value for major

refers to an instance of class Dept, and that for sections is a set of unique identifiers

belonging to the class Section. Further, notice that nothing was mentioned about

age and salary in the definition of Person. However, since we have chosen to give an

extensional semantics to class definitions similar to that in previous works [18, 35, 45],

an instance may have an arity greater than that of the classes to which it may belong.

This decision was made for the following reasons:


1. To allow a single schema to describe multiple databases.

2. To allow a single database to be described by multiple schemas.1

3. To prevent an unnecessary proliferation of classes such as Person-with-age or

Person-with-salary, besides Person.

4. To provide a means to deal with incomplete information and exceptions.
¹ If a single database is described by more than one schema, then the class to which an instance
belongs cannot be stored along with the instance. In such a case, the class of an instance must be
inferred (or read from a pre-compiled table) when opening a database.





















[Figure 3.1. University Schema. Schema graph with two kinds of links (legend: Aggregation, Generalization). Recoverable class labels: Person, Teacher, Student, Advising, Transcript, Section, Course, Department. Recoverable attribute labels: ss#, name, section#, room#, textbook, startdate, books, speciality, classification, GPA, major, minor, college, credit-hours, c#, degree.]










Now, consider the following program segment:
s := { jim, john, jack };
for each x in s
print x.name;

The reason why { jim, john, jack } is a valid structure is based on a simple
extension of an idea described in Buneman and Ohori [17]. The idea is that one can
define an ordering of database objects based on their information content, since a
database object is a partial description of some real world entity. Thus, the instance
(jim, ( ss#: 121212121, name: "Jim")) contains less information than (john, ( ss#:
111222333, name: "John", age: 35)) and (jack, ( ss#: 789789789, name: "Jack",
salary: 12000 )). If we were to assign types $\delta_1$, $\delta_2$ and $\delta_3$, respectively, to these
records, then one can define an ordering $\delta_2 \leq \delta_1$ and $\delta_3 \leq \delta_1$ based on the subtype relationship. Further, $\delta_1 = \sqcup\{\delta_1, \delta_2, \delta_3\}$, which can
adequately define the type of {jim, john, jack}, where $\sqcup$ stands for the least upper
bound (lub). Thus, a set can contain elements that can be assigned types, such that
a lub can be computed for these types. Discussion on the computability of a lub for
more complex terms is found in Buneman and Ohori [17].
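
For flat records such as these, the least upper bound can be sketched in Python as the set of fields shared by all the types. This is an illustrative simplification, assuming record types are plain field-to-type mappings with no nesting; the names `lub` and `is_subtype` are hypothetical.

```python
from functools import reduce


def lub(types):
    """Least upper bound of flat record types: the fields they all share."""
    return reduce(lambda a, b: dict(a.items() & b.items()), types)


def is_subtype(t, u):
    """t is a subtype of u if t carries every field of u with the same type."""
    return u.items() <= t.items()


d1 = {"ss#": "integer", "name": "string"}                       # type of jim
d2 = {"ss#": "integer", "name": "string", "age": "integer"}     # type of john
d3 = {"ss#": "integer", "name": "string", "salary": "integer"}  # type of jack

set_element_type = lub([d1, d2, d3])
```

The computed bound coincides with the type of jim, so the set {jim, john, jack} can be given an element type with the fields ss# and name, and the loop that prints x.name is well typed.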

Before describing the update operators and query language, we shall briefly in-
troduce the notion of associative access. The dot operator is a common means for
achieving this [50, 57], which is similar to field selection in Machiavelli [43]. For ex-
ample, Grad.advisor.Faculty.name is an associative pattern which denotes the name
of a faculty member who advises some graduate student. This dot expression could
also have been written as Grad.Faculty.name since there is a unique path from Grad
to Faculty via advisor. Also, the dot expression joe.ss# denotes the value 123123123
of type integer, and a set expression of the form { Student.name | ss# = 123123123
} denotes the singleton set, the element of which has the value "Joe" of type string.











The dot operator forms the basis of an associative pattern (or dot expression),

and is directional. For example, let a and b be two classes, where a has an attribute

s whose domain is b, and b has an attribute t whose domain is a. Thus, a.b has a

different denotation from b.a since they result in values whose domains are different

(assuming that there is a unique path from a to b and vice versa). Given such

unique paths, s and t can be thought of as inverse attributes. The system does not

automatically maintain inverse attributes. Therefore, even though a dot expression

may be meaningful in one direction, it may not be defined in the reverse direction.

It is possible for the user to specify the names of two classes as operands to the dot

operator provided there exists an unambiguous path between the classes (or nodes in

the schema graph). These dot expressions or associative patterns form an important

component of the query sublanguage, as we shall see in the next chapter.
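
The directional nature of the dot operator can be illustrated with a small Python sketch. The object store `db`, the helper names `extent` and `dot`, and the sample faculty names are all assumptions made for the example; attribute values that are object identifiers are followed as references.

```python
db = {
    "g1": {"class": "Grad", "name": "Joe", "advisor": "f1"},
    "g2": {"class": "Grad", "name": "Ann", "advisor": "f2"},
    "f1": {"class": "Faculty", "name": "Smith"},
    "f2": {"class": "Faculty", "name": "Jones"},
    "f3": {"class": "Faculty", "name": "Lee"},
}


def extent(cls):
    """Identifiers of all objects currently in the given class."""
    return {oid for oid, rec in db.items() if rec["class"] == cls}


def dot(oids, attr):
    """Follow `attr` from each object that defines it; collect the values."""
    return {db[o][attr] for o in oids if attr in db[o]}


advisors = dot(extent("Grad"), "advisor")        # Grad.advisor
names = dot(advisors, "name")                    # Grad.advisor.Faculty.name
reverse = dot(extent("Faculty"), "advisor")      # no such attribute on Faculty
```

Navigating from Grad reaches only the faculty members who advise someone, while the reverse navigation yields nothing because no inverse attribute is stored on Faculty.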

3.2 An Extensional Semantics for Classes

We shall now attempt to give an extensional semantics similar to that given in

KANDOR [45]. In a Voltaire database, let $\mathcal{C}$ be the set of classes defined in it, let $\mathcal{A}$

be the set of attributes defined in it, let $\mathcal{B}$ be the set of constraints (to model behavior),

and let $\mathcal{I}$ be the set of instances defined in it. A partial model for a Voltaire database

is then a set $\mathcal{D}$, the set of all instances, strings and numbers, plus a function $\mathcal{E}$ such

that:

$\mathcal{E} : \mathcal{C} \rightarrow 2^{\mathcal{D}}$

This accounts for the fact that a given instance may belong to more than one class,

due to multiple inheritance.

$\mathcal{E} : \mathcal{A} \rightarrow (\mathcal{D} \rightarrow 2^{\mathcal{D}^+})$
where $\mathcal{D}^+$ is the disjoint union of $\mathcal{D}$, numbers and strings. Thus, an attribute is

treated as a function or two place predicate.











$\mathcal{E} : \mathcal{I} \rightarrow \mathcal{D}$
$\mathcal{E} : \mathcal{B} \rightarrow \mathcal{D}$

C : numerals --- integers
: realnumerals -- real
E : strings -- strings
The last three conditions account for base types supported by the system.
This function effectively computes the extent of a given class. It may be thought
of as being similar to a typical valuation function as found in denotational semantics.
In order to compute the extent of a class, we must first compute the extent due to
each syntactic category allowed in the definition of the class. Therefore, the various
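forms of $\mathcal{E}$ are defined above, and further, must satisfy the following conditions:

1. $\mathcal{E}[a : c] = \{x \in \mathcal{D} \mid \text{if } y = \mathcal{E}[a](x) \text{ then } y \in \mathcal{E}[c]\}$

2. $\mathcal{E}[a : \text{set } c] = \{x \in \mathcal{D} \mid \text{if } y \in \mathcal{E}[a](x) \text{ then } y \in \mathcal{E}[c]\}$

3. $\mathcal{E}[a : \text{tuple } a_i : c_i] = \bigcap_{i=1}^{n} \mathcal{E}[a.a_i : c_i]$

4. $\mathcal{E}[c : \text{constraint } b_1; \ldots; b_l] = \bigcap_{i=1}^{l} \mathcal{E}[c : \text{constraint } b_i]$

5. $\mathcal{E}[c : \text{constraint } b_i] = x$ if $x$ satisfies the constraint $b_i$, else $\emptyset$

6. $\mathcal{E}[c] = \bigcap_{i=1}^{q} \mathcal{E}[c_i] \cap \bigcap_{j=1}^{m} \mathcal{E}[a_j]$
where the class $c$ has superclasses $c_1, \ldots, c_q$, and has attributes (with domain
restrictions) $a_1, \ldots, a_m$.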

This type of model is called a partial model because it does not take into account
the definitions of instances. The reason for this is that the definitions of instances are
not important for determining the subclass relationship, because it does not depend
on a particular model but on the entire set of models. Thus, $c_1$ is a subclass of $c_2$, i.e.,
$c_1 \preceq c_2$, iff $\mathcal{E}[c_1] \subseteq \mathcal{E}[c_2]$. It should be clear that a traditional characterization for this
simple type discipline would ensure that the subclass relationship as defined above
is decidable (provided that constraints are ignored). In fact, the formulation would
be very similar to that of O2, and is given in section 3.4. The above formulation is
trivial since it does not yet account for functions, which we shall see in chapter 6.
The main reason for choosing the above semantics was to emphasize the extension
of a given class. Our model makes no arbitrary assumptions. For example, the arity of
an instance can be greater than that of the class(es) to which it may belong. Also,
multiple inheritance is possible without any problems. Instances are characterized
by a unique identifier, the set of classes to which the instance may belong, and the
set of attribute-value pairs. An instance may belong to one or more classes provided
it satisfies all constraints attached to a given class. The unique identifier is assigned
to an instance by the system (which also ensures its uniqueness across the system) at the
time when the instance is created.

3.3 Update Operators

We also provide a set of update operators to create and modify existing instances.
The new operator allows us to create a new persistent instance with an immutable,

unique identifier as follows:
{ new.Student | ss# = 456456456 and name = "Smith" and
                major = { Dept | name = "EE" } and
                sections = { Section | sec-number = 8814 or
                                       sec-number = 7835 or
                                       sec-number = 8845 } }

This returns a unique identifier for a new instance of class Student which will
now be stored in the database. The right hand side of the vertical bar "|" defines
the values for each attribute of the instance. Assuming that there exists an instance
defining the "EE" department, the value for major is given by the set expression











{Dept | name = "EE"}, which denotes the identifier EE. The value of gpa is not
specified because there may be a constraint or rule which tells the system how to
compute its value, i.e., gpa may be a derived attribute. Thus, before the instance
is actually placed in the persistent store, the value for gpa would be computed and
checked for consistency, but would not be made persistent along with the other values
specified in the command.
The modify operator is like destructive assignment, in the sense that it will
destroy a persistent value (other than the unique identifier) and replace it with a new
value specified by the user. The modified instance is then checked for consistency
before it is committed to the persistent store. This check is limited to those
classes to which the instance may belong. For example, { modify.joe | major =
{ Dept | name = "CS" } } changes the value of the major attribute of the object
referenced by joe. Similarly, { modify.Person | age = prev.age + 1 } will increase
the age of every instance of class Person by 1. The delete operator actually destroys
the (set of) instances specified by the user, e.g., { delete.Student | gpa < 1.0 }.

These operators are also defined for non-persistent data values.²

3.4 On the Computability of Subclass

3.4.1 Object Graphs and Equality

Suppose we are given:

1. A finite set of domains $D_1, \ldots, D_n$, $n > 1$.

Let $D$ denote the union of all domains $D_i$.

2. A countably infinite set $A$ of attribute names.

3. A countably infinite set $ID$ of identifiers.

² The reason why new, modify and delete are defined for non-persistent values as well is that
persistence is a property of the instance and not of the class or type.











We now define the notion of value.

Definition 3.4.1.1 Values:

1. The special symbol null is a value, called a basic value.

2. Every element $v$ of $D$ is a value, called a basic value.

3. Every finite subset of $ID$ is a value, called a set value. Set values are denoted
in the usual way using brackets.

4. The finite partial function $\tau : A \to ID$, denoted by $(a_1 : i_1, \ldots, a_p : i_p)$, is
defined on $a_1, \ldots, a_p$ such that $\tau(a_k) = i_k$ for all $k$ from 1 to $p$. Every such $\tau$ is
called a tuple value.

We denote by V the set of all values. We now define the notion of an object.

Definition 3.4.1.2 Objects:

1. The set of all objects is $O = ID \times V$.

2. An object is a pair $o = (i, v)$, where $i$ is an element of $ID$ (an identifier) and $v$
is a value.

In $o = (i, v)$, if $v$ is a basic value, then $o$ is a basic object. Similarly, we can
define set-structured and tuple-structured objects. Further, we define the functions
$\iota : O \to ID$ and $\nu : O \to V$ such that $\iota(o)$ denotes the identifier $i$ and $\nu(o)$ denotes
the value of object $o$, respectively. We also define the function $\rho : O \to 2^{ID}$, which
associates with an object the set of all identifiers appearing in its value, i.e., those
referenced by the object. We can now define an Object Graph.

Definition 3.4.1.3 Object Graph: Let $\Theta$ be a set of objects. Then, graph($\Theta$) is defined
as follows:

1. If $o$ is a basic object of $\Theta$, then the graph contains a corresponding vertex with
no outgoing edge. The vertex is labeled with the value of $o$, i.e., $\nu(o)$.

2. If $o$ is the tuple-structured object $(i, (a_1 : i_1, \ldots, a_p : i_p))$, then the subgraph
in graph($\Theta$) corresponding to $o$ contains a node (say, $n^*$) labeled with $i$, and
$p$ outgoing edges from $n^*$ labeled with $a_1, \ldots, a_p$ leading respectively to nodes
corresponding to objects $o_1, \ldots, o_p$, where each $o_k$ is identified by $i_k$ (provided
such objects exist).

3. If $o$ is a set-structured object $(i, \{i_1, \ldots, i_p\})$, then the graph of $o$ consists of
a node (say, $n^*$) labeled by $i$, and $p$ unlabeled outgoing edges from $n^*$ leading
respectively to nodes corresponding to objects $o_1, \ldots, o_p$, where each $o_k$ is
identified by $i_k$ (provided such objects exist).

As an example, consider $\Theta = \{o_1, o_2, o_3, o_4, o_5, o_6, o_7, o_8\}$, where

$o_1 = (i_1, (name : i_3, dept : i_4, advisor : i_2))$

$o_2 = (i_2, (name : i_6, dept : i_5, address : i_7, advises : i_1))$

$o_3 = (i_3,$ "Jim"$)$, $o_6 = (i_6,$ "Joe"$)$, $o_4 = (i_4,$ "CS"$)$, $o_8 = (i_8,$ "EE"$)$

$o_5 = (i_5, \{i_4, i_8\})$

$o_7 = (i_7, (city : null, zip : null))$

The objects $o_1$, $o_2$ and $o_7$ are tuple-structured, $o_3$, $o_4$, $o_6$ and $o_8$ are basic, and $o_5$ is
set-structured. $\Theta$ is a consistent set of objects if it satisfies the definition given below.

Definition 3.4.1.4 Consistency of $\Theta$: A set $\Theta$ of objects is consistent iff

1. $\Theta$ is finite; and

2. the function $\iota$ is injective on $\Theta$, i.e., no two objects have the same
identifier; and

3. $\forall o \in \Theta$, $\rho(o) \subseteq \iota(\Theta)$, i.e., every referenced identifier corresponds to an object
of $\Theta$.
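The three consistency conditions can be checked mechanically. The following is only a minimal Python sketch under an assumed encoding that is not part of the formal model: identifiers are strings, null is `None`, a set value is a `frozenset` of identifiers, and a tuple value is a `dict` from attribute names to identifiers. The names `referenced_ids` and `is_consistent` are hypothetical, and the sample data follows the example above with the dept of the second object read as i5.

```python
def referenced_ids(value):
    """rho: the identifiers appearing in a value (assumed encoding)."""
    if isinstance(value, frozenset):          # set value
        return set(value)
    if isinstance(value, dict):               # tuple value; None stands for null
        return {i for i in value.values() if i is not None}
    return set()                              # basic value or null


def is_consistent(objects):
    """Check the consistency conditions on a finite list of (id, value) pairs."""
    ids = [i for i, _ in objects]
    if len(ids) != len(set(ids)):             # iota must be injective
        return False
    known = set(ids)
    # every referenced identifier must name an object in the set
    return all(referenced_ids(v) <= known for _, v in objects)


theta = [
    ("i1", {"name": "i3", "dept": "i4", "advisor": "i2"}),
    ("i2", {"name": "i6", "dept": "i5", "address": "i7", "advises": "i1"}),
    ("i3", "Jim"), ("i6", "Joe"), ("i4", "CS"), ("i8", "EE"),
    ("i5", frozenset({"i4", "i8"})),
    ("i7", {"city": None, "zip": None}),
]
```

Finiteness holds trivially for a Python list, so only injectivity and the absence of dangling references are tested; dropping the object identified by i5 from `theta` would make the reference from the second object dangle.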

Definition 3.4.1.5 Equality:

1. 0-equality: two objects $o$ and $o'$ are 0-equal (or identical) iff $o = o'$.

2. 1-equality: two objects $o$ and $o'$ are 1-equal iff $\nu(o) = \nu(o')$.

3. $\omega$-equality: two objects $o$ and $o'$ are $\omega$-equal iff span-tree($o$) = span-tree($o'$), where
span-tree($o$) is the tree obtained from $o$ by recursively replacing an identifier $i$
(in a value) by the value of the object identified by $i$.
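To make the three notions of equality concrete, here is a small Python sketch. It reuses an assumed encoding (identifiers as strings, set values as `frozenset`, tuple values as `dict`, null as `None`) and a hypothetical `store` mapping identifiers to values. It also assumes the object graph is acyclic, since the span tree of a cyclic graph is infinite and the recursion below would not terminate.

```python
def span_tree(value, store):
    """Recursively replace each identifier by the value it identifies."""
    if isinstance(value, frozenset):
        return frozenset(span_tree(store[i], store) for i in value)
    if isinstance(value, dict):
        # attribute names are unique, so sorting never compares subtrees
        return tuple(sorted(
            (a, None if i is None else span_tree(store[i], store))
            for a, i in value.items()))
    return value


def zero_equal(o1, o2):
    return o1 == o2                  # same identifier and same value


def one_equal(o1, o2):
    return o1[1] == o2[1]            # same value, identifiers may differ


def omega_equal(o1, o2, store):
    return span_tree(o1[1], store) == span_tree(o2[1], store)


store = {"n1": "Joe", "n2": "Joe", "a": {"name": "n1"}, "b": {"name": "n2"}}
a = ("a", store["a"])
b = ("b", store["b"])
```

In this example `a` and `b` are neither 0-equal nor 1-equal, because their values reference different identifiers, but they have the same span tree once the identifiers are expanded.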

3.4.2 Classes, Types and Schemas

Definition 3.4.2.1 Basic Class Names:
Bnames is the set of names for basic classes containing:

1. The special symbols Any and Nil.

2. A symbol $d_i$ for each domain $D_i$. We denote $D_i = dom(d_i)$.

3. A symbol 'x for every value x of D.

Cnames is the set of names for constructed classes, which is countably infinite and
is disjoint from Bnames. This is because Bnames denotes the set of names for
basic domains such as boolean, string or integer. Tnames is the union of Bnames
and Cnames, and it is the set of all names for classes.











In order to define classes, we assume there is a finite set B whose elements are
constraints which describe the behavior of classes. For now, we shall consider elements

of B as uninterpreted symbols.

Definition 3.4.2.2 Classes: A basic class is a pair (n,b), where n is an element of
Bnames and b is a subset of B.
A constructed class is one of the following:

1. A triple $(s, t, b)$ where $s$ is an element of Cnames, $t$ is an element of Tnames,
and $b$ is a subset of $B$. Such a class is denoted by $(s = t, b)$.

2. A triple $(s, \tau, b)$ where $s \in$ Cnames, and $\tau$ is a finite partial function $\tau : A \to$
Tnames. Such a class is denoted by $(s = (a_1 : s_1, \ldots, a_n : s_n), b)$, where $\tau(a_k) =
s_k$, and is called a tuple-structured class.

3. A triple $(s, s', b)$ where $s \in$ Cnames, $s' \in$ Tnames. Such a class is denoted by
$(s = \{s'\}, b)$ and is called a set-structured class.

A class is either basic or constructed, and the set of all classes is denoted by T.

Definition 3.4.2.3 Class Structures

1. Basic Class Structure: Let $t = (n, b)$ be a basic class. Then $n$ is called the
basic class structure associated with $t$.

2. Constructed Class Structure: Let $t = (s = x, b)$ be a constructed class. Then
$s = x$ is called the constructed class structure associated with $t$.

Given a class $t$, its structure is denoted by $\sigma(t)$ and its behavior by $\beta(t)$. We first
give some notation before defining the notion of consistency for class structures.

1. If $t$ is a class, then $\eta(t)$ denotes the name of the class.

2. If $\sigma(t)$ is a class structure associated with the class $t$, then we denote $\eta(\sigma(t)) =
\eta(t)$.

3. If $\sigma(t)$ is a class structure associated with the class $t$, then we denote the set of
all class names appearing in the structure of $t$ (namely, $\sigma(t)$) by refer($\sigma(t)$).

Definition 3.4.2.4 Schemas: A set $\Delta$ of constructed class structures is a schema if
and only if:

1. $\Delta$ is a finite set; and

2. $\eta$ is injective on $\Delta$ (i.e., there exists only one class structure for a given class
name); and

3. $\forall \sigma(t) \in \Delta$, refer($\sigma(t)$) $\cap$ Cnames $\subseteq \eta(\Delta)$, i.e., there are no dangling identifiers.

The semantics of the class structure system defined above is given by a function
which associates subsets of a consistent set of objects to class structure names.

Definition 3.4.2.5 Interpretations: Let $\Delta$ be a schema and $\Theta$ be a consistent subset
of the universe of objects $O$. An interpretation $I$ of $\Delta$ in $\Theta$ is a function from Tnames
to $2^{\iota(\Theta)}$, such that the following properties are satisfied. Here $\Theta(i)$ denotes the
value of the object of $\Theta$ identified by $i$.

A. Basic Class names

(a) $I(Nil) \subseteq \{i \in \iota(\Theta) \mid (i, null) \in \Theta\}$.
The interpretation of Nil is a subset of the identifiers in $\Theta$ such that they
denote objects whose value is null.

(b) $I(d_i) \subseteq \{i \in \iota(\Theta) \mid \Theta(i) \in D_i\} \cup I(Nil)$.
The interpretation of a basic domain or type is the subset of identifiers of
objects in $\Theta$ such that they denote basic objects in $\Theta$.

(c) $I('x) \subseteq \{i \in \iota(\Theta) \mid \Theta(i) = x\} \cup I(Nil)$.

(d) $I(Any) = \{i \mid i \in \iota(\Theta)\}$.
Since all objects belong to Any, its interpretation is the set of all identifiers
defined in $\Theta$.

B. Constructed Class Names

(a) If $s = (a_1 : s_1, \ldots, a_n : s_n) \in \Delta$, then $I(s) \subseteq \{i \in \iota(\Theta) \mid \Theta(i)$ is a tuple-structured
value defined at least on $a_1, \ldots, a_n$ and $\forall k\ \Theta(i)(a_k) \in I(s_k)\} \cup I(Nil)$.

(b) If $s = \{s'\} \in \Delta$, then $I(s) \subseteq \{i \in \iota(\Theta) \mid \Theta(i) \subseteq I(s')\} \cup I(Nil)$.

(c) If $(s = t) \in \Delta$, then $I(s) \subseteq I(t)$.

C. Undefined Class names

(a) If $s$ is neither a basic class name nor a name in the schema $\Delta$, then $I(s) \subseteq
I(Nil)$.

Definition 3.4.2.6 Model of a Schema

1. Partial order on Interpretations: An interpretation $I \le I'$ if and only if for all
$s \in$ Tnames, $I(s) \subseteq I'(s)$.

2. Model: Let $\Delta$ be a schema and $\Theta$ be a consistent set of objects. The model $M$
of $\Delta$ in $\Theta$ is the greatest interpretation of $\Delta$ in $\Theta$.

Theorem 3.4.1 The definition of a Model is sound.

Proof of Theorem 3.4.1 Given a schema $\Delta$ and a consistent set of objects $\Theta$, there
are a finite number of interpretations of $\Delta$ defined on $\Theta$. Therefore, in order to
prove that the greatest interpretation exists, we have to prove that the union of two
interpretations is an interpretation.
Let $I_1$ and $I_2$ be two interpretations and $I(s) = I_1(s) \cup I_2(s)$, for every class
name $s$. Clearly, $I$ satisfies the properties of part A of the definition above. Let
$s = (a_1 : s_1, \ldots, a_n : s_n)$, and $i$ be an element of $I(s)$. Then, $i$ is either an element
of $I_1(s)$ or of $I_2(s)$. If $i$ is an element of $I_1(s)$, then $\Theta(i)(a_k) \in I_1(s_k) \subseteq I(s_k)$ for all $k$,
and $I$ satisfies property B(a) above. Similarly, it can be shown that $I$ satisfies
properties B(b) and B(c) above. Thus, there exists a greatest interpretation $M$ such that

$M(s) = \bigcup_{I \in INT(\Delta)} I(s)$
for every class name $s$, where $INT(\Delta)$ denotes the set of all interpretations of $\Delta$ in
$\Theta$. $\Box$

Definition 3.4.2.7 Partial Order $\preceq$: Let $s$ and $s'$ be two class structures of a schema
$\Delta$. Then $s$ is a substructure of $s'$ (denoted by $s \preceq s'$) if and only if $M(s) \subseteq M(s')$
for all consistent sets $\Theta$.

Theorem 3.4.2 If $s$ and $s'$ are two class structures of a schema $\Delta$, then $s \preceq s'$ if
and only if one of the following conditions holds true:

1. $s$ and $s'$ are tuple structures $s = t$ and $s' = t'$, such that $t$ is more defined than
$t'$ and for every attribute $a$ on which $t'$ is defined, $t(a) \preceq t'(a)$ holds.

2. $s$ and $s'$ are set structures $s = \{t\}$ and $s' = \{t'\}$, such that $t \preceq t'$ holds.

3. $s$ = 'x, and $s'$ is a basic class structure, and $x \in dom(s')$.

Proof of Theorem 3.4.2 The validity of this characterization can be established by
induction. Completeness can be established on a case-by-case basis for tuple, set and
basic class structures. $\Box$
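The three cases of Theorem 3.4.2 translate directly into a recursive check. The sketch below is illustrative only: the schema encoding (`("tuple", {...})` and `("set", name)`), the `("val", x)` encoding of a singleton class 'x, the `BASIC` table and the function name `is_subclass` are all assumptions made for this example. Reflexivity on equal names is added as a base case, and the recursion is assumed to terminate on the schema at hand (no memoization is done for mutually recursive structures).

```python
BASIC = {"integer": int, "string": str}   # hypothetical basic domains


def is_subclass(s, t, schema):
    """Syntactic subclass test following the three cases of the theorem."""
    if s == t:
        return True                                   # assumed reflexive base case
    if isinstance(s, tuple) and s[0] == "val":        # case 3: s = 'x
        return t in BASIC and isinstance(s[1], BASIC[t])
    ds, dt = schema.get(s), schema.get(t)
    if ds is None or dt is None:
        return False
    if ds[0] == "tuple" and dt[0] == "tuple":         # case 1: tuple structures
        return all(a in ds[1] and is_subclass(ds[1][a], dt[1][a], schema)
                   for a in dt[1])
    if ds[0] == "set" and dt[0] == "set":             # case 2: set structures
        return is_subclass(ds[1], dt[1], schema)
    return False


schema = {
    "Person":   ("tuple", {"name": "string"}),
    "Student":  ("tuple", {"name": "string", "credits": "integer"}),
    "People":   ("set", "Person"),
    "Students": ("set", "Student"),
}
```

Here Student is more defined than Person, so `is_subclass("Student", "Person", schema)` holds while the converse does not, and the set structures inherit the ordering of their element classes.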











This theorem provides a syntactical means for computing the subclass relationship,

since we are ignoring the behavior of classes in this characterization.

Definition 3.4.2.8 Databases: A database is a tuple $(\Delta, \Theta, \preceq, I)$ where

1. $\Delta$ is a consistent schema.

2. $\Theta$ is a consistent set of objects.

3. $\preceq$ is a partial order among elements of $\Delta$.

4. $I$ is an interpretation of $\Delta$ in $\Theta$.

Further, the following properties must hold:

1. If $t \preceq t'$ and $t \preceq t''$, then $\sqcup\{t', t''\}$ is computable, provided $t' \ne$ Any and $t'' \ne$
Any. Further, $t'$ and $t''$ are now said to be comparable, and $\sqcup\{t', t''\}$ is the least
upper bound of $t'$ and $t''$.

2. $\Theta = \bigcup_{t \in \Delta} I(t)$


3.4.3 Glossary

Here we provide a brief glossary of some of the functions used in this section.

$\iota$ denotes the identifier of an object $o$

$\nu$ denotes the value of an object $o$

$\rho$ associates with an object the set of all identifiers appearing in its value

$\tau$ is a partial function for tuple values

$\eta$ denotes the name of a class

$\sigma$ denotes the structure of a class
















CHAPTER 4
QUERY SPECIFICATION

As mentioned earlier, Voltaire is an imperative programming language based on
the notion of objects. Since query languages have traditionally been declarative

and set-oriented, embedding them within a procedural, record-oriented framework

inevitably leads to design conflicts. However, we avoid much of this conflict since

Voltaire is a set-oriented language. This means that expressions, which form the core
of Voltaire, denote a set of objects by default. For example, even the simple dot

expression Student.advisor.Faculty.dept denotes a set of instances or objects whose

type is the type of the attribute dept, such that each object participates in the
association described in the dot expression. These same set expressions are used in

specifying constraints in a class or function definition, with one important restriction.

An expression of the form s := { c1[... a1i ...].c2[... a2j ...] ... } is not allowed even
though it is well-typed: type(s) = {(..., type(a1i), ..., type(a2j), ...)}. The value
of s would be a set of tuples, and each element in a tuple can contain nested sets
and tuples. If such expressions were allowed, the run-time overhead would be very
high.

Multiple inheritance does not create a problem when evaluating a query or set
expression. This is because an instance can occur only within a unique context in the

expression. The context is decided by the anchor class of the dot expression, which
is simply the first class appearing in a dot expression. For example, in {TA.advisor |
TA in RA}, the context is defined by the LHS of the "|", and therefore, the anchor
class is TA. This query denotes the set of objects belonging to type(advisor) that
are advisors of those instances of the class TA which are also members of the class
RA. Even though the classes TA and RA are not subclasses of each other, they may have
common elements. Since the boolean condition TA in RA means self in RA (where
self maintains currency in the set of objects belonging to the anchor class), the query
can be evaluated without conflict.

4.1 The Basic Structure of a Query

The basic structure of the query sublanguage is as shown below:
::= { '|' } | { } | |

::= ( ) | not < Bool1 > or < Bool2 > |
< Bool1 > and < Bool2 > | < E1 > < E2 > |
< E1 > = < E2 > | forall : |
exists | dbexists

::= | |

The query sublanguage consists of associative set expressions. The user specifies a
path (or subgraph) of interest on the LHS of the vertical bar, and simple boolean

predicates for selection conditions on the RHS of the vertical bar. This path of interest

denotes the context of the set expression within which certain boolean conditions
must hold true. The context is also important since it defines the scope of identifiers
(this will be further elaborated in section 6.6). A simple context can be specified by
using a dot expression. An important restriction is that the first identifier in a dot

expression on the LHS which defines the context must be a class name. This class is
then called the anchor class. The syntactic category < E > denotes expressions which
are simple extensions to those found in most languages such as Pascal. To project
attributes of a class referenced in the dot expression, they are enclosed within square
brackets. We show a few examples (some taken from Alashqur et al. [2]) below with

respect to the schema graph depicted in figure 2.











4.2 Examples

Q 1. Project the names of all graduate students who teach other graduate students in

some sections. Also, project the names of those graduate students they teach.

{ TA[name].teaches.Section.Grad[name] }

Note that the class TA inherits two attributes whose domain is the class Section,
namely, teaches from the class Teacher and sections from the class Grad (via Stu-
dent). Since we are interested in TAs in their role as Teachers (and not as graduate
students who also enroll in course sections), we appropriately include teaches in the
dot expression.

Q 2. Project the names of all departments that offer 6000 level courses that have a
current offering (i.e., sections). Also, project the titles of these courses and the
textbook used in each section.


{ Dept[name].Course[title].Section[textbook] | Course.c# >= 6000 and

Course.c# < 7000 }


A department offers many courses, i.e., the class Dept has an attribute course_offering
whose domain is the class Course. Similarly, each Course may have one or more
Sections. This query is evaluated by first accessing all instances of the class Dept.
For each instance of Dept, we retrieve the object references to all courses offered by
that Dept. These instances of class Course are then filtered through the boolean con-
ditions to check if the corresponding course numbers lie between 6000 and 7000. All
instances of Course which do not satisfy this condition are dropped from further con-
sideration. For each instance of Course so far selected, we access the corresponding

Sections for that course.
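The evaluation order just described (anchor class first, then navigation, with filtering applied as soon as the Course instances are reached) can be sketched in Python. This is not Voltaire's implementation; instances are modeled as plain dictionaries, the attribute name `sections` on Course and all sample data are hypothetical, and "6000 level" is read as 6000 <= c# < 7000.

```python
def q2(depts):
    """Project (dept name, course title, textbook) for 6000-level courses."""
    result = []
    for dept in depts:                                   # anchor class Dept
        for course in dept["course_offering"]:           # follow Dept -> Course
            if not (6000 <= course["c#"] < 7000):        # filter before going deeper
                continue
            for section in course["sections"]:           # follow Course -> Section
                result.append((dept["name"], course["title"], section["textbook"]))
    return result


depts = [{
    "name": "EE",
    "course_offering": [
        {"c#": 6100, "title": "Databases",
         "sections": [{"textbook": "Book A"}, {"textbook": "Book B"}]},
        {"c#": 5100, "title": "Circuits", "sections": [{"textbook": "Book C"}]},
        {"c#": 6500, "title": "VLSI", "sections": []},
    ],
}]
```

A course without a current offering, such as the hypothetical VLSI course above, contributes nothing to the result because the innermost loop has no sections to visit.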











Q 3. Project the names of all graduate students who are RAs but not TAs.

{ RA.name | not (RA in TA) }

The boolean condition could have also been specified as not (self in TA). This is
because any dot expression on the RHS of the vertical bar beginning with the anchor
class means the same as self. Self is a special operator used to define currency in a
set processing stream.

Q 4. Project the names of all undergraduate students whose minor is in the
department which is the major department of the undergraduate student with

ss# = 123456789.


{ Undergrad.name | Undergrad.minor.Dept =

{ Undergrad.major.Dept | Undergrad.ss# = 123456789 } }


The boolean condition in this query has an embedded set expression. The scope
of a dot expression (i.e., context) is local to the set expression in which it occurs.
Therefore, in the inner set expression, we are interested in the major department of

that instance of class Undergrad whose ss# has the value 123456789. Similarly, in the
outer set expression, we are interested in that Undergrad whose minor Dept has the

same value as that specified by the embedded set expression. In order to transcend
the scope of a dot expression from an inner to outer set expression (or vice versa),
we must use special operators such as prev, as will be seen in chapter 6.

Q 5. Project the names of all TAs who grade courses in which they themselves are

registered (i.e., enrolled).


{ TA.name | self.teaches.Section in self.enrolled.Section }











We are interested in those instances of TA that teach some section of a course in
which that same instance of TA is enrolled. Since a TA may be taking more than
one course, but can teach only one course, we use the set inclusion operator. Again,
self could have been replaced by TA.

Q 6. What would be the values for salary for all research assistants whose advisor is
Smith, if they were to receive a 20% increment?

{ 1.2 * (RA.salary) | RA.advisor.Faculty.name = "Smith" }

This query would first evaluate the set expression and then multiply each pro-
jected value of salary by the scalar 1.2. If the context were to have more than one
subexpression containing the dot operator, then the first dot expression from the left
would be chosen as the context, and the remaining ones would be interpreted as if
they were on the RHS of the vertical bar.

4.3 Aggregate Operators

Several aggregate operators such as count, sum, min, max are provided. These
are not really special operators, but are mainly provided for convenience. These can

be easily defined by using a homomorphic set extension operator [17].

let hom = λ(f, op, z, S). S = {} → z |
              tail S = {} → op(f(head S), z) |
              op(f(head S), hom(f, op, z, tail S))

There is an alternative form of this function that applies to non-empty sets, and
does not require the argument z.

let hom* = λ(f, op, S). tail S = {} → f(head S) |
              op(f(head S), hom*(f, op, tail S))

Thus, we can now define the following:
let sum = λS.hom(λx.x, +, 0, S)
let count = λS.hom(λx.1, +, 0, S)
let min = λS.hom*(λx.x, λ(x, y).x < y → x | y, S)

The above formulation gives us a way to define and compute these aggregation
operators for sets of arbitrary structure, and we are guaranteed a correct
result that is free of side-effects.
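For readers who prefer executable notation, the homomorphic extension and the derived aggregates can be sketched in Python as follows. The names `hom`, `hom_star`, `set_sum`, `set_count` and `set_min` are chosen for this illustration, head and tail are simulated by listing the set in an arbitrary order, and `hom_star` is given an explicit one-element base case so that the recursion terminates on non-empty sets.

```python
from operator import add


def hom(f, op, z, s):
    """Homomorphic extension: fold op over f applied to each element, unit z."""
    items = list(s)
    if not items:
        return z
    head, tail = items[0], items[1:]
    if not tail:
        return op(f(head), z)
    return op(f(head), hom(f, op, z, tail))


def hom_star(f, op, s):
    """Variant for non-empty sets that needs no unit element."""
    items = list(s)
    head, tail = items[0], items[1:]      # raises IndexError on an empty set
    if not tail:
        return f(head)
    return op(f(head), hom_star(f, op, tail))


def set_sum(s):
    return hom(lambda x: x, add, 0, s)


def set_count(s):
    return hom(lambda x: 1, add, 0, s)


def set_min(s):
    return hom_star(lambda x: x, lambda x, y: x if x < y else y, s)
```

Because the operators passed in are associative and commutative, the arbitrary order in which the set is listed does not affect the result.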

4.4 Evaluation Strategies

4.4.1 Semantics of the Dot Operator

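Set theoretic definition. Let $C_1, C_2$ be class names, $\mathcal{E}[C_1], \mathcal{E}[C_2]$ be the extents
of $C_1, C_2$ and $c_{1i}, c_{2j} \in \mathcal{E}[C_1], \mathcal{E}[C_2]$ respectively. Let $C_1$ have an attribute labeled
$a_{1k}$ whose domain is $C_2$. Effectively, given a schema graph with two nodes $C_1, C_2$,
there must exist a unique path from $C_1$ to $C_2$ for $C_1.C_2$ to be meaningful. Let $S$
denote the aggregation association from $C_1$ to $C_2$ via the attribute $a_{1k}$ such that
$S \subseteq \mathcal{E}[C_1] \times \mathcal{E}[C_2]$, where $a_{1k}$ is an attribute of $C_1$. Thus,

$C_1.C_2 = \{c_{2j} \mid c_{1i} \in \mathcal{E}[C_1] \wedge c_{2j} \in \mathcal{E}[C_2] \wedge (c_{1i}, c_{2j}) \in S\}$

If the domain of $a_{1k}$ is set $C_2$, then $S \subseteq \mathcal{E}[C_1] \times 2^{\mathcal{E}[C_2]}$ and let $C_{2j} \subseteq \mathcal{E}[C_2]$. Then,

$C_1.C_2 = \{c_2 \mid c_{1i} \in \mathcal{E}[C_1] \wedge c_2 \in C_{2j} \wedge (c_{1i}, C_{2j}) \in S\}$

In general, let $C_1, \ldots, C_n$ denote class names and $\mathcal{E}[C_1], \ldots, \mathcal{E}[C_n]$ denote their
respective extents. Let $c_{ik}$ be the $k$th element of $\mathcal{E}[C_i]$. Let $S_i$ be a meaningful
aggregation association (in the sense mentioned above) between $C_i$ and $C_{i+1}$, such
that $S_i \subseteq \mathcal{E}[C_i] \times \mathcal{E}[C_{i+1}]$ (or, if the domain of that unique attribute $a_{ik}$ of $C_i$ is set
$C_{i+1}$, then $S_i \subseteq \mathcal{E}[C_i] \times 2^{\mathcal{E}[C_{i+1}]}$). Also, $C_0.C_1 = \mathcal{E}[C_1]$. Then,

$C_1.\cdots.C_n = \{c_{nj} \mid c_{nj} \in \mathcal{E}[C_n] \wedge c_{(n-1)k} \in C_0.C_1.\cdots.C_{n-2}.C_{n-1} \wedge (c_{(n-1)k}, c_{nj}) \in S_{n-1}\}$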
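The recursive definition amounts to repeatedly taking the image of the current set under the next aggregation association. A minimal Python sketch follows, assuming each association is given as a set of (source, target) pairs already contained in the product of the two extents, with set-valued attributes flattened into one pair per member. The names `dot` and `dot_path` and the sample data are hypothetical.

```python
def dot(extent, association):
    """One step C_i.C_{i+1}: targets reachable from members of extent."""
    return {target for source, target in association if source in extent}


def dot_path(first_extent, associations):
    """Evaluate C_1.C_2. ... .C_n by composing single steps left to right."""
    current = set(first_extent)          # base case: the extent of C_1
    for association in associations:
        current = dot(current, association)
    return current


# Hypothetical data for a path Student.advisor.Faculty.dept
students = {"s1", "s2"}
advisor = {("s1", "f1"), ("s2", "f1"), ("s3", "f2")}
dept = {("f1", "CS"), ("f2", "EE")}
```

With this data, `dot_path(students, [advisor, dept])` yields only the CS department, since the instance s3 is not in the starting extent and its advisor is therefore never reached.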











Model theoretic definition. We now give a formal definition of the dot operator
with respect to the algebra defined in section 3.4. Let $C_i \in T$, where $T$ is the set
of all types in the schema. Then $\eta(C_i) \in \eta(T)$ where $\eta$ is the name function. Let
$c_i \in I(C_i)$ where $I(C_i)$ is the interpretation of $C_i$. Then $\eta(C_i).\eta(C_{i+1})$ is valid if
and only if $\sigma(\eta(C_i)) = (a_{i1} : s_{i1}, \ldots, a_{in} : s_{in}) \wedge \exists a_{ik} : \tau(a_{ik}) = s_{ik} \wedge I(s_{ik}) \subseteq I(C_{i+1})$.
Clearly, then $\eta(C_i).\eta(C_{i+1}) \subseteq I(C_{i+1})$. Recall that $\sigma(C_i)$ denotes the structure of $C_i$,
and $\tau$ is the partial function defined on tuple structures. For brevity, we drop $\eta$, so
that $C_i.C_{i+1}$ means the same as $\eta(C_i).\eta(C_{i+1})$, and also $C_0.C_1 = I(C_1)$. Now,
$C_1.\cdots.C_{n-1}.C_n = \{c_n \mid c_n \in I(C_n) \wedge \exists a_n : \tau(a_n)(c_n) \in C_1.\cdots.C_{n-2}.C_{n-1}\}$

4.4.2 Naive Approach

As we have seen, queries are formulated in an associative fashion via the dot oper-
ator. The LHS of a set expression defines the context in which the boolean conditions
on the RHS are to be evaluated. These boolean conditions are also formulated with
the dot operator. Therefore, it seems reasonable to investigate the semantics of the

dot operator, and a means to evaluate it. We first give a simple example, and an
obvious operational meaning for a set expression. Let A, B, C, D, E, F, G, H be class

names. The dot operator is said to be meaningful for A.B if and only if there exists
an attribute in A whose domain is B or a subtype of B (as was formalized above).
Now consider the query:

{A.B.C.D.E | C.G = E.H} = {A.B.C.D.E | A.B.C.G = A.B.C.D.E.H}

This query can be evaluated as follows:
result := null;
for each a ∈ A
  for each b ∈ B
    for each c ∈ C
      for each d ∈ D
        for each e ∈ E
          for each h ∈ H
            if (((a.b).c).g) = ((((a.b).c).d).e).h then
              result := union(result, e);

Note that (a.b) is similar to the usual record selection operator except for the implicit

assumption that there exists an attribute in class A whose type is B. The parentheses

define the order of evaluation. For example, if the current object in A is $o_A$, and $A_k$

is the attribute label in question, then a.b = $\tau(o_A)(A_k)$, where $\tau$ is the usual record

selection function.
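One possible reading of this naive strategy, in which each inner loop follows the references stored in the current object rather than scanning a whole extent, is sketched below in Python. This is an interpretation for illustration, not the dissertation's algorithm: instances are dictionaries, the lower-case keys `b`, `c`, `d`, `e` hold lists of referenced instances, and `g` and `h` are assumed to be basic-valued attributes.

```python
def naive_eval(extent_a):
    """Evaluate {A.B.C.D.E | C.G = E.H} by navigating from each a in A."""
    result = []
    for a in extent_a:
        for b in a["b"]:                      # a.b: record selection on a
            for c in b["c"]:
                for d in c["d"]:
                    for e in d["e"]:
                        # keep e when the selection condition holds on this path
                        if c["g"] == e["h"] and e not in result:
                            result.append(e)
    return result


# Hypothetical instances forming a single path a -> b -> c -> d -> {e1, e2}
e1 = {"name": "e1", "h": 1}
e2 = {"name": "e2", "h": 2}
d1 = {"e": [e1, e2]}
c1 = {"g": 1, "d": [d1]}
b1 = {"c": [c1]}
a1 = {"b": [b1]}
```

Only `e1` is selected here, because it is the only E instance on the path whose `h` value equals the `g` value of the C instance on the same path.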

However, as mentioned earlier, the only way to override the scope of an identifier

within a set expression (and, therefore, a context) is to use prev. For example,

consider the following:

{A.B.C.D.E | C.G = prev.E.H} = {A.B.C.D.E | A.B.C.G = E.H}

This query can be evaluated as follows:
result := null;
for each a ∈ A
  for each b ∈ B
    for each c ∈ C
      for each d ∈ D
        for each e ∈ E
          for each h ∈ H
            if (((a.b).c).g) = e.h then
              result := union(result, e);

4.4.3 Algebraic Approach

As we have seen, the query language essentially consists of dot expressions, which

form the context on the LHS and selection conditions on the RHS of the vertical bar.

However, it is possible to evaluate these queries using extended algebraic operators

[50, 53, 56]. Thus, the compiler can exploit existing query optimization techniques.

For example, the first example can be transformed by the compiler to the following

form:¹
¹ The actual definitions in Shaw and Zdonik [50] are slightly different, but we are using a simpler
notation for the sake of clarity.










$T_1 = \bowtie_{\theta_1}(Section, Grad)$ where
$\theta_1$ = Grad in Section.enrollment
$T_2 = \pi_{Section.oid,\, Grad.name}(T_1)$
$T_3 = \bowtie_{\theta_2}(TA, T_2)$ where
$\theta_2$ = TA.teaches in $T_2$.Section.oid
$T_4 = \pi_{TA.name,\, Grad.name}(T_3)$
Similarly, { (RA.salary) | RA.advisor.Faculty.name = "Smith" } can be transformed
to:
$\pi_{salary}(\sigma_{\theta}(RA))$, where
$\theta$ = RA.advisor = $\pi_{oid}(\sigma_{name = \text{"Smith"}}(Faculty))$
An algebraic formulation can also be used to define a dataflow implementation
of the query processor. Since Voltaire expressions are set-oriented, a parallel imple-
mentation is possible:

{ | < Bool1 > and < Bool2 > } =
{ | < Bool1 > } ∩ { | < Bool2 > }
{ | < Bool1 > or < Bool2 > } =
{ | < Bool1 > } ∪ { | < Bool2 > }

In general, it is possible to show that the dot operator and boolean conditions can be
reduced to a small set of algebraic operators as described in the literature [50, 53, 56].
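The decomposition of conjunctions and disjunctions into set intersection and union can be checked on a small example. The Python sketch below is illustrative: `select` is a hypothetical helper that filters a context by a predicate, and the RA records (name, salary, advisor) are invented data. Because the two selections on the right-hand side are independent of each other, they could be evaluated in parallel.

```python
def select(context, predicate):
    """Evaluate { context | predicate } over a set of hashable instances."""
    return {x for x in context if predicate(x)}


# Hypothetical RA instances as (name, salary, advisor) tuples
ras = {("ann", 900, "Smith"), ("bob", 1200, "Smith"), ("cy", 1500, "Jones")}


def p(r):
    return r[2] == "Smith"      # advisor is Smith


def q(r):
    return r[1] > 1000          # salary above 1000


both = select(ras, lambda r: p(r) and q(r))
either = select(ras, lambda r: p(r) or q(r))
```

Here `both` coincides with `select(ras, p) & select(ras, q)` and `either` with `select(ras, p) | select(ras, q)`, which is exactly the pair of identities stated above.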
















CHAPTER 5
CONSTRAINT SPECIFICATION

Automatic integrity enforcement is a non-trivial problem [13, 21, 39, 40, 46, 55].

For example, when the consequent of a rule results in a database update operation,

detecting possible infinite regression due to update propagation simply adds to the

complexity. Another problem is that of maintaining cross-references. For example,

suppose a rule states that every graduate must have an advisor. If a certain faculty

member who advises three graduate students leaves the university, and is therefore

deleted from the database, then these three instances of graduate students will be

in an inconsistent state. Automatic update propagation may be dangerous since we

certainly would not like to delete the three graduate students merely because their

advisor left. A better way to deal with such situations is to introduce an elaborate

exception handling mechanism. Thus, we can state an exception to the above rule

such that the graduate students in question must find another advisor within three

months from the time the faculty member was deleted. Exception handling and active

database management are outside the scope of this dissertation.

There are two important characteristics about constraint management in Voltaire:

1. unlike most other constraint languages, the order in which constraints appear

is significant (reasons for this will be clear only after chapter 6), and

2. since the execution model is lazy (as derived attributes are computed on de-

mand only), and the effects of modify are only local, the user can never access

inconsistent data in the persistent store.¹
¹ This is precisely the view taken in Jagadish [31].











This is because an instance can belong to a given class if and only if it satisfies all

the constraints specified in the definition of that class.² Lazy evaluation implies that

constraints in Voltaire are automatically triggered whenever a new instance is created

or an existing instance is modified.

5.1 Basic Structure of Constraints

The basic structure of constraints is as shown below:

::= < B1 > ; < B2 > | | < Comm1 >

< Comm1 > ::= if then endif |
if then < B1 > else < B2 > endif

::= ... < E1 > = < E2 > ...

It is important to note that the antecedent of a constraint is structurally and seman-

tically identical to the selection (i.e., boolean) conditions, which form the RHS of the

vertical bar in a set expression. The consequent of a constraint can also be a boolean

condition, in which case satisfiability is computed. However, when the consequent

contains the equality operator, two possibilities arise. If both the RHS and LHS are

bound, then satisfiability is checked. If the LHS of the equality operator is unbound,

then a binding takes place. That is, the equality operator is overloaded. Further,

when a constraint does not have an antecedent (as in rules 1 and 2 in Student below),

it behaves like an equational constraint which must be satisfied (in one direction

only). We now look at a few examples.

5.2 Examples

5.2.1 Constraints on the class Student

1. Student. totaLwork = Student.total -credit + Student.job-hours
2This means that if a class can be found such that its constraints are satisfied by the instance in
question, then the class of this instance can be automatically inferred.











2. Student.leisure-time = 80 - Student.total-work;

3. Student.leisure-time > 20;

4. if Student.visa-status = "F-1" then Student.job-hours ≤ 20;


Rule 2 specifies how to compute the leisure time of a student, whereas rule 3

places a bound on the possible values that a student's leisure time can have. When a

new instance of class Student is created, the total work may not be known. Therefore,

before the value of leisure time can be computed, rule 1 must be triggered. When

the value for the total number of credit hours for which a student is registered,

or the value for job hours, is modified, rules 1, 2, 3 and 4 are triggered. Rule 4 states that

students whose visa status is F-1 are not allowed to work for more than 20

hours.

Since all of the above constraints are attached to the single class Student there is

no need to repeat the class name. For example, rule 1 could be rewritten as total-work

= total-credit + job-hours, with an implicit self operator prepended to each attribute.

The self operator keeps track of the specific instance in question throughout

a computation.

Consider a program segment where a new instance of class Grad is created:

jim = { new.Grad | ss# = 123456789 and name = "jim brown"

and ... and total-credit = 12 and job-hours = 20 };

Before this instance can be placed in the persistent store, domain and other con-

straints must be checked. Since a new instance is being created, attributes occurring

on the RHS of the vertical bar are bound to their corresponding values. Rules 1, 2, 3

and 4 are now triggered. The first two rules result in the computation of total-work











and leisure-time. Rule 3 checks the condition leisure-time > 20 hours, which is satis-

fied in our example. Suppose that nothing is mentioned about visa-status when the

instance is being created. If the domain constraints of that attribute allow a null

value, then rule 4 is ignored; otherwise an error condition is reported.

Suppose a modify command is issued where Jim's leisure time is updated to a

new value. This would trigger rules 2 and 3. Rule 2 is an equational constraint

on the relationship between leisure-time, total-credit and job-hours. Thus, if a new

value for leisure-time does not satisfy rule 2, then an error condition is reported, even
though rule 3 may be satisfied. Integrity enforcement in this situation is not possible due

to the inherent nondeterminism (many combinations of total-credit and job-hours yield the same leisure time). However, any update to total-credit or job-hours is

propagated in the obvious way.
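The ordered evaluation of the four Student rules can be traced with a small Python sketch. It is only an illustration under stated assumptions: rule 2 is read as 80 minus the total work, rule 4 is read as "at most 20 job hours" (following the prose), attribute names use underscores, and the function name check_student and the dictionary representation are hypothetical.

```python
def check_student(s):
    """Apply the four Student rules in order; return the numbers of violated rules."""
    violations = []
    # Rule 1 (equational): bind total_work if unbound, otherwise check it.
    expected = s["total_credit"] + s["job_hours"]
    if s.get("total_work") is None:
        s["total_work"] = expected
    elif s["total_work"] != expected:
        violations.append(1)
    # Rule 2 (equational, assumed to read 80 - total_work).
    expected = 80 - s["total_work"]
    if s.get("leisure_time") is None:
        s["leisure_time"] = expected
    elif s["leisure_time"] != expected:
        violations.append(2)
    # Rule 3: bound on leisure time.
    if not s["leisure_time"] > 20:
        violations.append(3)
    # Rule 4: ignored when visa_status is null (assumed bound: at most 20 hours).
    if s.get("visa_status") == "F-1" and not s["job_hours"] <= 20:
        violations.append(4)
    return violations

jim = {"name": "jim brown", "total_credit": 12, "job_hours": 20}
print(check_student(jim), jim["total_work"], jim["leisure_time"])  # [] 32 48

jim["leisure_time"] = 30    # direct update that contradicts rule 2
print(check_student(jim))   # [2], although rule 3 still holds
```

The second call mirrors the modify case discussed above: the new leisure time satisfies rule 3 but not the equational rule 2, so an error is reported.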

5.2.2 Constraints on the class Grad

1. if exists Grad.thesis-option then

exists Grad.advisor and Grad.advisor in Grad.committee;

2. for all Grad.section.course.c# : c# ≥ 5000;

3. if Grad.status = "full-time" then Grad.total-credit ≥ 12;

4. if course-work = "done" and thesis-status = "defended" and

count { committee.Faculty | Faculty.Dept includes self.Dept } > 2

then degree-req = "fulfilled";

In the consequent of Rule 1, we need an existential quantifier because if Grad.advi-

sor evaluates to a null set, then it would be trivially contained in Grad.committee,

which is not the intended semantics. Rule 2 states that all the course numbers

taken by any graduate student must be of level 5000 or greater. Rule 3 states that











all graduate students attending school full time must register for at least 12 credit

hours.

5.3 Null Values and Exceptions

Information is often unavailable when a new record or instance is being

created. This means that there may be a number of attributes of the instance in

question with null values. These instances are nevertheless useful since they contain

at least partial information about some real-world entity. Dealing with null values

involves certain compromises, however, because null values may violate

the structural and/or behavioral constraints of the class (or

type) to which the instance belongs. Thus, loading a database with null values may

jeopardize the safety of the type system, and the user may thereby encounter run-time

errors. These errors could otherwise have been detected when the database was

being loaded. We have chosen a compromise in which:

1. The value null can be coerced to belong to any type.3 Thus, the structural

constraints of a type need not be violated.

2. It is very likely that the behavioral constraints can be violated due to the

presence of null values (i.e., the absence of information). But since we have

adopted a lazy evaluation mode, derived attributes are not computed until

actually requested. Thus, the user will not receive inconsistent instances as a

part of the result of a query.

Another way that the user can deal with null values is by defining constraints

with the help of the exists operator. Suppose that most graduate students must

have advisors, though not all of them may have one (probably because the student
3Actually, the type of null is Nil and Nil is always a subtype of any other type that is defined
in the type scheme.











has not yet found a suitable advisor). Further, if the student does have an advisor,

then the advisor must belong to the same department as the student. This constraint

can be modeled as follows:


if exists Grad.advisor then

Grad.dept = Grad.advisor.dept;

By defining this rule, an instance of the class Grad can have a null value in its advisor

attribute, and at the same time not violate a behavioral constraint. Further, this can

also be used as a means to deal with simple exceptions, thus avoiding a proliferation

of subclasses such as Grad-with-advisor and Grad-without-advisor whose superclass

is Grad. As another example, suppose that every graduate student must register for

at least 12 credits, except Joe, who is allowed to register for any number of credits.

This can be modeled as follows:


if Grad.name ≠ "Joe" then

Grad.credit-hours ≥ 12;
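The two guarded constraints above can be sketched in Python as follows. This is an illustration only: null is modeled as None, the exception for Joe is read as a "not equal" test, "at least 12 credits" is read as a greater-or-equal comparison, and the function names are hypothetical.

```python
def exists(value):
    """The exists operator: true when the attribute is not null."""
    return value is not None

def advisor_rule(grad):
    # if exists Grad.advisor then Grad.dept = Grad.advisor.dept
    if not exists(grad.get("advisor")):
        return True   # antecedent fails, so the constraint is not violated
    return grad["dept"] == grad["advisor"]["dept"]

def credit_rule(grad):
    # if Grad.name is not "Joe" then Grad.credit-hours is at least 12
    if grad["name"] == "Joe":
        return True   # Joe is the exception
    return grad["credit_hours"] >= 12
```

A graduate student with a null advisor passes advisor_rule, so no separate subclass for students without advisors is needed.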


Constraint specification is very similar to what is found in most other systems,

except that the order in which the constraints appear is significant. We have shown

that it is possible to bootstrap the constraint specification sublanguage on top of

the query sublanguage. We also show how to exploit null values to deal with in-

complete information and exceptions. Constraints in Voltaire get triggered whenever

an instance is created or modified. Further, functions are computed as the result of

integrity enforcement, as we shall see in the next chapter.
















CHAPTER 6
FUNCTION SPECIFICATION

Traditionally, in the database world, a function or application is implemented in

a host language with embedded DML statements. This application is then executed

independently of the DBMS under the control of the operating system. Thus, the

DBMS only knows of a transaction defined by a block of DML statements, and has

no way of knowing whether an application as a whole will succeed or not. This may

cause run-time aborts, which are expensive to handle. In contrast, the application

is executed under the control of a central transaction manager within a DBPL, and

the application is implemented as a function or method (in object-oriented database

systems). However, the problem of defining a transaction is still an area of on-

going research. In a DBPL, a function is expected to be compiled into a transaction

sublanguage which gets executed each time a function is to be evaluated at a higher

level. However, such issues are outside the scope of this dissertation. Here, we shall

merely concern ourselves with the evaluation and semantics of function specification

in Voltaire.

Functions in Voltaire rely heavily on the dot operator for associative access, and

set expressions for computing denotable values. A function is specified as a sequence

of constraints or commands, in a manner similar to the imperative paradigm. That

is, each command is executed sequentially. Further, the user can write programs

without worrying about using different operators for persistent and non-persistent

objects. For example, the new operator creates a location for an instance of a

given class, and returns a denotable value of domain Ref. Consider the expression:











s := {new.c | ... a_i = v_i ...}. If s is not persistent within the context of evaluation,
it is bound to a denotable value belonging to the domain Ref. On the other hand,
if s is a persistent value within the context of evaluation, then it gets bound to a
denotable value belonging to the domain Ref in the run-time environment, and the binding is also
reflected in the persistent store. In either case, the symbol s provides a consistent
handle to the value referenced by it in the run-time environment. Similarly, if the

modify operator is applied to a non-persistent object, its effect is made available only
to the run-time environment, whereas, if it is applied to a persistent object, its effect
is reflected in the persistent store (i.e., database) as well as the run-time environment.
We now examine the basic structure of a Voltaire function with the help of a simple
factorial example, followed by a database example.

6.1 Basic Structure of a Function

Function specification can be thought of as a set of rules or constraints defin-
ing the relationship between its input and output parameters. Thus, by extending

the constraint sublanguage to include a few additional constructs, we can write an
arbitrary function in Voltaire.

< Comm2 > ::= < Comm1 > | |


::= :=

::=
::= for each in do enddo
::= while do enddo

::= | | |


Additionally, functions can have an extent which is persistent, or a function
call (via dot expressions) may result in the non-persistent creation of instance(s)











of that function for the duration of a computation. These temporary instances form

the backbone of the execution model of a Voltaire program in which

1. functions and classes are treated uniformly, and

2. function evaluation is the result of integrity enforcement.

We first elaborate with the help of a simple example.

class Fact function
attributes
n: integer
f: integer
constraints
if n = 0 then f = 1;
if n > 0 then f = n × { Fact.f | Fact.n = prev.n - 1 };

The function Fact has two parameters, namely n and f; looked upon as a class,

it has two (corresponding) attributes. The left hand side of the "|" operator defines

the context within which the right hand side is evaluated. Thus, n refers to the

attribute value of a new copy of Fact, and is bound to prev.n - 1 (where prev.n

is bound to that value of n immediately outside the set expression). For example,

we can obtain the factorial of 6 by issuing the following command: eval {Fact.f |

Fact.n = 6} in the Voltaire environment. When the function is initially invoked,

n is bound to the value 6, while f is unbound. The expression prev.n then refers

to the value of n that is immediately outside the set expression, namely 6. Thus

prev.n - 1 denotes the value 5. Also, the equality operator is overloaded such that

when the LHS is initially unbound, it gets bound to the RHS value; when the LHS

is initially bound, satisfiability is computed. The attribute f remains unbound until

the recursion begins to unwind. Additionally, there is an implicit coercion on the set

expression to an object of type integer due to the semantics of the × operator. Since











one operand is an integer and the other is a set of integers (due to the set expression),

coercion is necessary for the proper evaluation of the × operator.

It must be noted that the set expression can also be construed as a query. For

example, the subexpression {Fact.f | Fact.n = prev.n - 1} also means "retrieve all

objects of class Fact such that Fact.n is the same as n-1 for some other instance of

class Fact". Thus, if there were a database consisting of instances of class Fact, i.e.,

value pairs of n and f, then a query asking for the factorial of 6 could result in a

simple look-up. Alternately, the same sub-expression can be interpreted as "compute

the result of function Fact given the value of n" (i.e., function call). This is because

classes and functions are treated uniformly in Voltaire.
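The two readings of the same set expression, query and function call, can be sketched in Python. The sketch is an assumption-laden illustration rather than Voltaire's evaluator: the persistent extent of Fact is modeled as a dictionary from n to f, and the name fact and its store parameter are hypothetical.

```python
def fact(n, store):
    """Evaluate {Fact.f | Fact.n = n} against a (possibly empty) extent of Fact."""
    if n in store:
        return store[n]                  # query reading: a simple look-up
    if n == 0:
        return 1                         # constraint: if n = 0 then f = 1
    if n > 0:
        # Function-call reading: a temporary instance is evaluated for n - 1.
        # Temporary instances are not written back to the store.
        return n * fact(n - 1, store)
    raise ValueError("no instance of Fact can satisfy the constraints for n < 0")

print(fact(6, {}))          # computed entirely by recursion: 720
print(fact(6, {5: 120}))    # the stored pair (5, 120) short-circuits the recursion: 720
```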

Aggregate operators such as sum are provided as a convenience, but it is easy to

write such a function in Voltaire as shown below:

class Sum function
attributes
operand: list integer
result: integer
constraints
result = head.operand + {Sum.result | Sum.operand = tail.prev.operand}

While the above program is similar to the factorial function, it would have been

more efficient to have written it as follows:

for each x in operand do
{ modify.Sum | result = prev.result + x }
enddo


6.2 A Database Example

In order to compare the expressive power of various DBPLs, a task list has been

described in [5]. Here, we show how some of these tasks can be performed in Voltaire.

The first task is to be able to describe a fragment of a manufacturing company's parts












inventory. Among other things, the database represents the way certain parts are

manufactured out of other parts: the subparts that are involved in the manufacture

of other parts, the cost of manufacturing a part from its subparts, the mass increment

that occurs when the subparts are assembled. The manufactured parts themselves

may be subparts in a further manufacturing process, thus representing an aggregation

hierarchy. In addition, the part name, its supplier and purchase cost are also

maintained in the database. A partial Voltaire schema for this database is shown

below.


class Part
superclasses Any
subclasses
Basepart Compositepart
attributes
name: string
used-in: Compositepart


class Basepart
superclasses Part
subclasses nil
attributes
cost: integer
mass: integer
supplied-by: Supplier


class Compositepart
superclasses Part
subclasses nil

attributes
assemblycost: integer
massincrement: integer
uses: set Use

class Use
superclasses Any
subclasses nil
attributes
component: Part
assembly: Compositepart
quantity: integer


The second task is to write a program to print the names, cost and mass of all

base parts that cost more than 100 dollars. This can be achieved by writing a simple

query, namely, { Basepart[name, cost, mass] | cost > 100 }.

The next task is to compute and print the total cost of a part as shown below. This

task defeats most query languages because it requires the computation of transitive

closure over the parts hierarchy in the database. To compute the cost of a pump,











we simply invoke the function as follows: { ComputeCost.resultcost | partname =
"pump" }.

class ComputeCost function
attributes
partname: string
resultcost: integer
transients
p: Part
el-cost: integer
subcosts: list integer
constraints

p = { Part | name = partname };
if p in Basepart then
resultcost = p.cost
else
for each y in p.uses.component
do
el-cost := p.uses.quantity × { ComputeCost.resultcost |
partname = y.partname };
{ modify.subcosts | head.subcosts = el-cost and
tail.subcosts = prev.subcosts }
enddo;
resultcost = p.assemblycost + { sum | subcosts }
endif


The keyword transients denotes temporary attributes, which have the same semantics

as regular attributes, except that they are not persistent. Transient attributes

do not reflect the final state of a computation, but merely facilitate a more efficient

evaluation of a function. Therefore, they can be seen to behave like local variables.

The first statement assigns the object identifier of that instance of Part referenced

via its name, to the transient variable p. In the second statement, there is an iterator

which has two commands. The first one makes a recursive call to the function to

descend the aggregation hierarchy, and temporarily stores the cost of an element in

el-cost. The second command needs more elaboration. As the recursion unfolds, the











element costs are collected in the list subcosts. The effect of the modify operator is

similar to subcosts := append(el-cost, subcosts). However, since subcosts is a temporary

attribute, it merely refers to some object (here, a list of integers) in virtual

memory. Therefore, the effects of modify will be limited to virtual memory.

On the other hand, if the RHS of the vertical bar referred to some persistent objects,

then modify would appropriately make changes in the persistent store. Also, if we

had used the function Sum defined in the previous section, then the last command

in the ComputeCost function would have been written as:

resultcost = p.assemblycost + { Sum.result | operand = subcosts }
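The recursive descent of the aggregation hierarchy performed by ComputeCost can be followed in a short Python sketch. The sample parts, their costs and quantities, and the dictionary layout are invented for illustration; only the shape of the computation (base cost, or assembly cost plus quantity times component cost) is taken from the function above.

```python
# Hypothetical inventory: base parts carry a cost; composite parts carry an
# assembly cost and a list of (component name, quantity) uses.
parts = {
    "bolt":  {"kind": "base", "cost": 2},
    "case":  {"kind": "base", "cost": 30},
    "rotor": {"kind": "composite", "assemblycost": 5, "uses": [("bolt", 4)]},
    "pump":  {"kind": "composite", "assemblycost": 10,
              "uses": [("rotor", 1), ("case", 1), ("bolt", 8)]},
}

def compute_cost(partname):
    p = parts[partname]                  # p = { Part | name = partname }
    if p["kind"] == "base":
        return p["cost"]                 # resultcost = p.cost
    subcosts = []                        # transient list, never persistent
    for component, quantity in p["uses"]:
        el_cost = quantity * compute_cost(component)   # recursive call
        subcosts.insert(0, el_cost)      # new head is el_cost, tail is the old list
    return p["assemblycost"] + sum(subcosts)

print(compute_cost("pump"))   # 10 + (1*13 + 1*30 + 8*2) = 69
```

Here subcosts plays the role of the transient attribute: it lives only for the duration of one call and is summed at the end.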

6.3 Temporary Instance Creation

Let us recapitulate some features of Voltaire. We began with the premise that

certain database and programming capabilities must be incorporated within a uni-

form framework. We chose integrity enforcement as that unifying framework. The

main reason why functions can also be computed is that the execution model treats

the constraints as a sequence of statements to be evaluated in the order in which

they appear. In fact, these expressions have a semantics in which new bindings are

passed on to the next expression to be evaluated.1 It is a direct consequence of this

execution model that classes and functions can truly be equivalent.2 This equiva-

lence was important because we insisted that the query language be able to reference

classes and make function calls with the same syntax and semantics. The inability of

a query language to uniformly access classes and functions causes various paradigm

mismatch problems [7, 12]. Typically, query languages allow function calls via ad
1We now see why the order in which constraints appeared in the classes Grad and Student was
important.
2Manuel Bermudez suggested collectively calling them clunctions.











hoc trigger mechanisms, something we wish to avoid since it would create problems
in defining and executing a transaction.
Now, consider the set expression:
{ Student.total-hours I ss# = 987654321 and name = "john" and ... };
When such a program segment is encountered, the evaluation function will first search
for an instance existing in the database. If the search fails, it will then attempt to
create a temporary instance which must satisfy all the constraints in the definition
of class Student. Effectively, this failure results in a function call. The semantics of such
an expression can be construed to denote the value for total-hours of a hypothetical
student that satisfies the bindings on the RHS of the "|" operator. This might be
useful in a context where (in the ensuing program sequence) this temporary instance
is to be made persistent if, say, total-hours evaluates to greater than 40:
x = { Student | ss# = 987654321 and name = "john" and ... };
if x.total-work > 40 then { new.Student | x };

The first statement results in a binding. The identifier x is bound to a reference

(unique identifier) to an instance of class Student. As mentioned earlier, if "john"
does not exist in the database, then the set expression results in a function call, and
x is bound to a reference to a temporary instance. This temporary instance must
satisfy all constraints of class Student, and all derived attributes are also computed. If
for this instance, the condition total-work > 40 holds true, then this instance is made

persistent by using the new operator. In this way a temporary instance can be made
persistent. Thus, temporary instance creation forms the backbone of our execution
model which allows us to give an equivalent semantics to classes and functions.
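The search-then-create behavior described in this section can be sketched in Python. The sketch rests on several assumptions: the persistent store is a plain list, only two derived attributes are modeled, rule 2 of Student is read as 80 minus total work, and the names satisfy_student and evaluate_student are hypothetical.

```python
store = []   # hypothetical persistent store of Student instances

def satisfy_student(bindings):
    """Create a temporary instance and derive its attributes from the constraints."""
    inst = dict(bindings)
    inst["total_work"] = inst["total_credit"] + inst["job_hours"]
    inst["leisure_time"] = 80 - inst["total_work"]   # assumed reading of rule 2
    if not inst["leisure_time"] > 20:
        raise ValueError("constraint violated: leisure_time > 20")
    return inst

def evaluate_student(bindings):
    """Search the store first; if the search fails, fall back to a function call."""
    for inst in store:
        if all(inst.get(k) == v for k, v in bindings.items()):
            return inst, False                    # existing persistent instance
    return satisfy_student(bindings), True        # temporary instance

x, temporary = evaluate_student(
    {"ss": 987654321, "name": "john", "total_credit": 15, "job_hours": 30})
if temporary and x["total_work"] > 40:
    store.append(x)   # corresponds to making the temporary instance persistent
```

A later evaluation with the same bindings finds the stored instance and no function call takes place.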

6.4 A Model of Inheritance for Classes and Functions

A problem with equivalence of classes and functions is that we now have to under-
stand what the notion of subclass (or subfunction) means. The subclass relationship











can be defined as follows. Let f, g be two classes and let E[f], E[g] denote their respective

extensions. Then f is said to be a subclass of g iff E[f] ⊆ E[g]. Such extensional

semantics have been defined for term subsumption languages [45]. However, the sub-

class (or subsumption) relationship is computable by performing a structural analysis

of the class taxonomy.3 Such analysis is based on a set of inference rules for comput-

ing subsumption. For example, CANDIDE [11] is a carefully constrained language

in which the subclass relationship (called subsumption) is decidable [11] [45] and its

complexity is at least co-NP-hard [42]. But this is clearly an undecidable proposition

in Voltaire because we allow arbitrary constraints to be specified in the class (and

function) definition.

Our proposed solution is based on the realization that we are primarily interested
in only those values that exist in the persistent store (i.e., database), as opposed to

the possibly infinite set of instances that may belong to a given class.4 Addition-

ally, we are also interested in instances temporarily created within the context of

some program. Note that a class can be viewed to have base attributes and derived

attributes, while in a function, the input parameters are like base attributes and out-

put parameters are like derived attributes. Thus, the proposition that an instance

is indeed a member of a function (or class) is decidable iff the function terminates

for a given input (though termination is still undecidable). Further, if such class

membership is computable for each instance (of a given class) in the persistent store,

then the subclass relationship is also computable.5
3Computing the subsumption relationship is not decidable for all term subsumption languages,
most notably KL-ONE [14].
4This ontologic nature of databases is in stark contrast to the role of persistent types played in
programming languages.
5Since any instance of f must also satisfy the constraints of its superclass g due to inheritance,
mutual inconsistency will be detected at least for those instances existing in the store. Additionally,
this model will work in cases where the domain of an attribute is a function.
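The extensional reading of the subclass relationship over a finite store can be sketched in Python. The sample instances and the two constraint sets are invented for illustration, and the sketch assumes that every constraint terminates on every stored instance.

```python
def extension(store, constraints):
    """Identifiers of the stored instances that satisfy every constraint."""
    return {oid for oid, inst in store.items() if all(c(inst) for c in constraints)}

def is_subclass(store, f_constraints, g_constraints):
    """f is a subclass of g iff E[f] is contained in E[g], checked over the store only."""
    return extension(store, f_constraints) <= extension(store, g_constraints)

# Hypothetical store and constraint sets.
store = {
    1: {"name": "ann", "total_credit": 12},
    2: {"name": "bob", "total_credit": 9},
    3: {"name": "cy",  "total_credit": 15},
}
student = [lambda r: r["total_credit"] > 0]
grad = student + [lambda r: r["total_credit"] >= 12]   # inherits Student's constraints

print(is_subclass(store, grad, student))   # True
print(is_subclass(store, student, grad))   # False: instance 2 is not in E[grad]
```

The check says nothing about instances outside the store, which is exactly the restriction the proposed solution accepts.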











Let f be a function and E[f] be its extension. Based on our above discussion,
the extension is a finite set in the store. However, the notion of temporary instance

creation provides us with a means to make arbitrary computations. Thus, there

are no restrictions on what values may be persistent (as is often the case in many

DBPLs), i.e., a function can also have instances in the persistent store just like any

other class. The keyword function serves only one purpose, namely, that the class

(or function) in question is precluded from participating in the class taxonomy. This

is because we do not know what a taxonomy of functions might mean. The above

model for inheritance is different from those described in [5, 15, 16, 19] because we

provide an extensional account of inheritance rather than intensional.

Since the subclass relationship can be computed based on the above approach,

the main argument against it would be a combinatorial explosion. However, coupled

with our execution model, it conceptually provides a methodology to deal with the

problem of procedural attachments in frame-based languages. As mentioned earlier,

this approach should be contrasted with term subsumption languages. However, we

can still use the same classification algorithm to build a taxonomy of functions. The

ability to define a taxonomy of functions might be of use in functional abstractions

used in simulation applications.

6.5 Equality, Assignment and Modify

It is very important to be able to define equality between expressions in a pro-

gramming language. We have already seen equality in chapter 3 for objects, and we

have seen in chapters 5 and 6 how equality is overloaded. This issue comes to the fore

in section 6.3, where we discuss how the notion of temporary instance creation allows

us to give an operational equivalence to the semantics of a class and function. Equal-

ity is different from the assignment and modify operators, in the sense that it is not











destructive. The assignment and modify operators have very similar semantics;

in fact, the assignment operator is syntactic sugar for modify. For example, let i

be an instance and a_j its attributes. Then {modify.i | a_1 = v_1 and ... and a_n = v_n}

is equivalent to a sequence of assignments: i.a_1 := v_1; ...; i.a_n := v_n; From an implementation

viewpoint, the modify operation would be less expensive to compile than

the sequence of assignments because the context (that is, the LHS) is evaluated only

once in the former case, while it would be evaluated n times in the latter. Consider

another example: s := x ≡ {modify.s | self = x} and o.a := v ≡ {modify.o | a = v}.

The LHS of an assignment must denote an attribute name, and the expression on

the RHS must be of the same type as the type of the attribute on the LHS. If s in
s := x refers to a non-persistent value (such as a transient attribute), then only the

run-time environment is updated. On the other hand, if s refers to a persistent value,

then the database (that is, persistent store) as well as the run-time environment are

updated.
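The equivalence between one modify and a sequence of assignments, and the difference in how often the context is evaluated, can be illustrated with a Python sketch. The Runtime class, its counter and the helper names are hypothetical, and the counter measures run-time resolution of the context rather than compilation cost.

```python
class Runtime:
    """Hypothetical run-time environment that counts context evaluations."""
    def __init__(self, objects):
        self.objects = objects
        self.context_evaluations = 0

    def resolve(self, name):
        self.context_evaluations += 1
        return self.objects[name]

def modify(rt, name, **bindings):
    target = rt.resolve(name)            # the context (LHS) is evaluated once
    for attr, value in bindings.items():
        target[attr] = value

def assign(rt, name, attr, value):
    rt.resolve(name)[attr] = value       # the context is evaluated per assignment

a = Runtime({"i": {}})
modify(a, "i", a1=1, a2=2, a3=3)

b = Runtime({"i": {}})
for attr, value in [("a1", 1), ("a2", 2), ("a3", 3)]:
    assign(b, "i", attr, value)
```

Both runs end in the same state, but the first resolves the instance once and the second three times.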

6.6 Scope of Identifiers

We have already examined the scope of identifiers in a set expression in chapter 4.

We saw that the context of a set expression determines the scope of identifiers. The

only way to override the scope imposed by the context is to use the prev operator.

To understand the scope of identifiers when they occur in a function definition, we

first need to understand how the user interacts with Voltaire. While details of such

interactions are deferred to section 7.1, we briefly introduce the eval command here.

Given that a user has loaded some database and a corresponding schema into the

Voltaire environment, s/he can issue various commands. The eval command takes a

set expression as an argument and evaluates it against the currently active database.

Recall that functions are triggered via set expressions. For example, to compute











the factorial of 6, the user would say eval {fact.f | n = 6}, or the cost of a pump

can be computed by issuing the command eval {ComputeCost.resultcost | partname =
"pump" }. This is known as the outermost layer of evaluation.

When a function is triggered by a set expression from the outermost layer of eva-

luation, it is passed an initial environment which consists of the identifiers bound

to their respective values on the RHS of the set expression. Other attributes (or parameters)

of the function are bound as the computation progresses. The database

schema is treated as a global declaration. It is useful to think of the database as an

environment which maps classes to instances. Thus, the context of any set expression

is now decided with respect to this global environment (i.e., the database) and the

local environment.6 When computing the value of an identifier, the values in the local

environment take precedence. Once we have moved from the Voltaire environment

to an inner level of computation, the run-time environment looks much different due

to the notion of temporary instance creation and the prev operator.

The run-time environment is Renv = Self × Cenv × Penv, where Self denotes

the currently active record, Cenv denotes the currently active environment and Penv

denotes the calling (or previous) environment. Further, Self = Cenv = Penv =

Env = Id → Denotable-Value. Self essentially maintains a copy of the currently

active record against which the self operator is evaluated. This is required when a

query is being evaluated within a function call. For example, consider {Person | age <

50}. If the class Person has n instances, and the i-th instance is being evaluated, then

Self is used to denote that instance. Any modification to the current environment

is reflected in Self, though the reverse case is not true. Similarly, the prev operator

is evaluated with respect to Penv. Cenv behaves in the usual manner. It must
6Apropos, it should be clear that the context is decided with respect to the global environment
or database for all the examples of chapter 4.











be noted that each time a set expression is encountered in the function body, it is

evaluated with a new run-time environment. We do not allow dot expressions of

the form prev.prev.identifier, since that would require the run-time environment

to maintain information about all the previous environments, one for each level of

nesting.
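The three-part run-time environment and the one-level reach of the prev operator can be sketched in Python. The field names, the enter and lookup helpers, and the use of dictionaries for environments are assumptions; the propagation of changes from the current environment into Self is not modeled.

```python
from typing import Any, Dict, NamedTuple

Env = Dict[str, Any]          # Env = Id -> Denotable-Value

class Renv(NamedTuple):
    self_rec: Env             # Self: copy of the currently active record
    cenv: Env                 # Cenv: currently active environment
    penv: Env                 # Penv: calling (previous) environment

def enter(renv: Renv, record: Env) -> Renv:
    """Build the new run-time environment used for a nested set expression."""
    return Renv(self_rec=dict(record), cenv=dict(record), penv=renv.cenv)

def lookup(renv: Renv, name: str, prev: bool = False) -> Any:
    """Plain identifiers resolve in Cenv; prev.identifier resolves in Penv."""
    env = renv.penv if prev else renv.cenv
    return env[name]

outer = Renv(self_rec={}, cenv={"n": 6}, penv={})       # eval {Fact.f | Fact.n = 6}
inner = enter(outer, {})                                # nested {Fact.f | Fact.n = prev.n - 1}
inner.cenv["n"] = lookup(inner, "n", prev=True) - 1     # n is bound to 5
```

Because each environment keeps only its immediate caller, a form such as prev.prev.n has nothing to resolve against, which matches the restriction stated above.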

6.7 Function Composition

As mentioned earlier, Voltaire is a first order language. However, the extent of a

function is a denotable value (which can also be persistent). Therefore, an element

belonging to the extent of a function can be embedded in data structures, passed as

a parameter, or returned as a value. Therefore, function names are valid identifiers

in a dot expression. Thus, the dot operator also denotes function composition. For

example, let f1 and f2 denote two functions and i1, o1 and i2, o2 denote their respective

attributes (input and output parameters). Then,

{f1.f2.o2 | f1.i1 = v1 ∧ f2.i2 = o1}

is a valid expression, and is equivalent to f2 ∘ f1.7 (Strictly speaking, the two expressions

are equivalent after an implicit coercion in the sense discussed below.) It

should be expected that the subexpression f2.f1 is valid if and only if f1 and f2 are

isomorphisms. This means that even though f1.f2 may have a denotable value, it

does not imply that f2.f1 will also have a denotable value, unless the two functions
are isomorphisms. The reason why this is to be expected is that the extent of a

function is exactly its graph. Further, the above set expression could also have been

equivalently written as

{f2.o2 | i2 = {f1.o1 | i1 = v1}}
7Note that {f1.f2 | f1.i1 = v1 ∧ f2.i2 = o1} is not equivalent to f2 ∘ f1, since the set expression
returns a reference to an instance of f2, rather than the value o2.











Thus, even though Voltaire has a first order syntax, an element belonging to the

extent of a function can be embedded in data structures, passed as a parameter, or

returned as a value.
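Since the extent of a function is its graph, composition through the dot operator amounts to joining two graphs. The following Python sketch illustrates this under assumptions: extents are modeled as sets of (input, output) pairs, the sample pairs are invented, and compose is a hypothetical helper.

```python
f1 = {(1, 10), (2, 20)}          # hypothetical extent of f1: pairs (i1, o1)
f2 = {(10, "a"), (20, "b")}      # hypothetical extent of f2: pairs (i2, o2)

def compose(first, second, v1):
    """Join the two graphs on o1 = i2 and project o2, for the input i1 = v1."""
    return {o2
            for (i1, o1) in first if i1 == v1
            for (i2, o2) in second if i2 == o1}

print(compose(f1, f2, 1))    # {'a'}: f1 maps 1 to 10, and f2 maps 10 to 'a'
print(compose(f2, f1, 10))   # set(): the reverse composition has no value here
```

The empty result of the reverse composition illustrates why f2.f1 need not have a denotable value even when f1.f2 does.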

It might be useful to list the various forms of the dot operator, all of which are

mutually consistent.

1. c.a denotes the set of values of the attribute a of class c, such that a is selected

from each instance i ∈ c. This can equivalently denote function evaluation as

discussed in section 6.3.

2. f.o denotes the value of parameter o of a function f, which is the result of

evaluating f. Again, this can equivalently denote set evaluation if f has a

persistent extent, as discussed in section 6.1.

3. i.a denotes the usual field selection for records if i is an instance (of a class or

function), having the attribute a. There is one important difference, namely,

in our case, i.a will return a singleton set whose element is the value of a for i.


If s is an identifier of type t, then s := i.a is legal, because there is an implicit

coercion. If i.a evaluates to a singleton set with the element v, namely, {v}, it

is coerced to v since {v} 1("t). However, s := c.a can be valid if and only if c.a

evaluates to a singleton set. Since this can be known only at run-time, it would limit

the usefulness of any static type checking. Therefore, we impose the restriction that

the above expression is valid if and only if s has type {t}. The rule for f.o is similar to

that of i.a.
















CHAPTER 7
THE VOLTAIRE ENVIRONMENT AND ITS SEMANTICS

7.1 Interacting with the Voltaire Environment

The user must first enter the Voltaire environment before a database can be loaded
and computations made against it. At this level of evaluation, the environment
is interactive: it prompts the user for input and reports the result of computations.
The user can begin making computations after loading a schema and a database with
the load-db command. If the schema and/or database do not exist, the system warns
the user that the schema and the database have been initialized to null, so that any
computation other than new-c and new-i will fail. The new-c command creates either
a new class or a new function. The new class is inserted in the schema at the
appropriate place, and corresponding modifications are made in the database. For
example, if a new class has superclass csup, then some instances of csup may migrate
to the new class. Effectively, this implies a coercion on the type of all instances that
migrate from csup to the new class. The new-i command creates new instances. The
user should not specify the unique object identifier, since the system automatically
assigns one to the new object being created. However, the user must specify the parent
class(es) of the new instance along with all the attribute-value pairs. The system
then checks whether the new instance satisfies all the structural and behavioral
constraints of each parent class. To ensure type safety, the type of each instance is
verified at the time of creation, as well as when a given database is loaded with
respect to a given schema.
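
The instance-creation behavior described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function new_instance, the OID counter, and the representation of a class as a list of predicate functions are hypothetical and do not come from the Voltaire implementation.

```python
import itertools

_next_oid = itertools.count(1)   # hypothetical system-side OID generator

def new_instance(database, parent_classes, attributes, schema):
    """Create an instance only if it satisfies every check of every parent class.
    The caller never supplies the object identifier."""
    for cls in parent_classes:
        for check in schema[cls]:
            if not check(attributes):
                raise ValueError(f"instance violates a constraint of {cls}")
    oid = next(_next_oid)
    database[oid] = {"classes": list(parent_classes), **attributes}
    return oid

# Hypothetical structural checks for class Person (ss#: integer, name: string).
schema = {
    "Person": [
        lambda a: isinstance(a.get("ss#"), int),
        lambda a: isinstance(a.get("name"), str),
    ],
}

db = {}
ann = new_instance(db, ["Person"], {"ss#": 111222333, "name": "Ann"}, schema)
```

A request whose attribute values do not match the declared types is rejected before anything is stored, which mirrors the verification at creation time.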











Once a populated database exists within the environment, various other computations
can be made. The eval command is used to evaluate either a function or
a query expression. The LHS of a set expression (which defines the context within
which the rest of the expression is to be evaluated) can only refer to names defined
in the schema. A single eval command suffices because classes and functions have
an equivalent semantics. For example, consider {fact.f | n = 6} and
{Student.name | ss# = 111222333}. The result of a query is tabular. For example,
the result of the query { Dept[name].Course[title].Section[textbook] | Course.c# >=
6000 and Course.c# < 7000 } is a table which can be described as a set of objects such
that each object has the type (name : string, { (title: string, textbook : {string})}),
given that a department offers many courses and that each course has many sections
(each of which may follow a different textbook). The result of the factorial function
would be the value 720.
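
The reason one eval command can serve both cases can be sketched in Python. The names eval_set_expr, fact and students are hypothetical, and modeling a function as a callable that builds a temporary instance is an assumption made for illustration, loosely following the notion of temporary instance creation.

```python
import math

def eval_set_expr(source, attr, where):
    """Evaluate {source.attr | where}.
    `source` is either a class extent (list of instances) or a function,
    modeled as a callable that builds a temporary instance from its inputs."""
    if callable(source):
        candidates = [source(**where)]          # temporary instance creation
    else:
        candidates = [inst for inst in source
                      if all(inst.get(k) == v for k, v in where.items())]
    return {inst[attr] for inst in candidates}

def fact(n):
    # The output parameter f is derived from the input parameter n.
    return {"n": n, "f": math.factorial(n)}

students = [
    {"name": "Ann", "ss#": 111222333},
    {"name": "Bob", "ss#": 444555666},
]

factorial_result = eval_set_expr(fact, "f", {"n": 6})
name_result = eval_set_expr(students, "name", {"ss#": 111222333})
```

Both calls go through the same code path; only the way the candidate instances are obtained differs.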

Since we have adopted a lazy evaluation mode for enforcing integrity constraints,
instances belonging to certain classes may be modified in ways that leave the
database in an inconsistent state. To find out which instances of a given class
cause the inconsistency, one can use the check command with the name of that class.
If the name of the class is specified as Any, then every class in the schema is
checked to discover inconsistent instances. The result is displayed as an object
graph (that is, a linear span-tree), with a question mark indicating the source of
trouble. For example, an instance i0 may have an attribute a0 which refers to an
instance ik of another class, possibly through many levels of indirection. If ik is
either nonexistent or inconsistent, then a question mark would appear:

i0 --a0--> ... --a(k-1)--> ik ?











It is trivial to generate such a graph by computing the span-tree of i0 as discussed
in section 3.4.1. An alternative form of this command takes a class name followed
by a colon and a set expression. This form checks whether the instances returned by
the set expression are members of the given class (note that membership implies
consistency). For example, check Department: {Student.advisor.Faculty.dept |
Faculty.salary > 50000} will check only those instances returned by the set
expression, rather than all instances of class Department, for consistency. Also,
the resulting object graph will begin with an instance of class Department. This
command is also useful in finding nonmembers of a class. For example,
check RA: {TA} will result in the set of instances of TA that are not in RA. This
information can then be used to coerce the type RA on instances
of TA (this is legal since we support multiple inheritance).
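
The second form of check amounts to filtering the instances returned by a set expression through a membership test. A minimal Python sketch follows; the function check, the predicate is_department and the sample instances are hypothetical and serve only to illustrate the idea.

```python
def check(is_member, candidates):
    """Return the candidates that are NOT consistent members of the class
    whose membership test is `is_member`."""
    return [inst for inst in candidates if not is_member(inst)]

def is_department(inst):
    # Hypothetical membership test: a Dept needs a non-null name and college.
    return inst.get("name") is not None and inst.get("college") is not None

# Stand-in for the instances returned by evaluating a set expression.
returned_by_query = [
    {"oid": 1, "name": "EE", "college": "Engineering"},
    {"oid": 2, "name": "CS", "college": None},
]

suspects = check(is_department, returned_by_query)
```

Only the instances produced by the set expression are examined, so the cost of the check is bounded by the size of the query result rather than the size of the class extent.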

The delete command takes a set expression and deletes all instances returned as the

result of evaluating it. This delete operation should be used with

caution since it will blindly delete all objects returned by the set expression without

regard for the consistency of the database. However, it is useful in order to delete

inconsistent objects determined by the check command. The semantics of this delete

operator is identical to that when it appears in a function for the case of persistent

objects.

Transcripts of a session (or a portion of the session) with Voltaire can be saved in

a file by using the save command. The user can eventually quit a session, which has

the effect of closing the database and returning to the operating system. Since each

command is considered as an atomic transaction, the effects of a successful execution

are permanently reflected in the database. For example, if a function for increasing
Faculty salaries by 10% is executed by the eval command, then all instances of the
class Faculty are updated upon successful execution of the function, and the updates
will be visible the next time the database is loaded.
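
One simple way to obtain this all-or-nothing behavior for a single command is to apply it to a working copy and commit only on success. The Python sketch below shows that technique; it is an assumption made for illustration, not a description of how Voltaire implements transactions, and the names run_atomically and raise_salaries are hypothetical.

```python
import copy

def run_atomically(database, command):
    """Apply `command` to a working copy; commit only if it succeeds."""
    working = copy.deepcopy(database)
    try:
        command(working)
    except Exception:
        return database, False     # a failed command leaves the database untouched
    return working, True

def raise_salaries(db):
    for inst in db["Faculty"]:
        inst["salary"] = inst["salary"] * 110 // 100   # 10% raise, integer arithmetic

def failing_command(db):
    db["Faculty"][0]["salary"] = 0
    raise ValueError("constraint violated")

db = {"Faculty": [{"salary": 50000}, {"salary": 60000}]}
committed, ok = run_atomically(db, raise_salaries)
unchanged, failed_ok = run_atomically(db, failing_command)
```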











7.2 A Denotational Semantics for Voltaire

In decreasing level of abstraction, there are three complementary methodologies

for defining the semantics of a programming language, namely, axiomatic, denota-

tional and operational semantics [47]. The last method uses an interpreter to define

a language. The meaning of a program is the evaluation history that the interpreter

produces when it interprets the program. In the denotational semantics approach, a

program is directly mapped to its meaning, called its denotation. A valuation func-

tion maps a program directly to its denotation, which is a mathematical value such

as a number or function. With an axiomatic semantics, properties about language

constructs are defined, expressed with axioms and inference rules from symbolic logic.

A denotational description of a programming language consists of an abstract
syntax, a set of semantic domains along with their operators, and a valuation function.

A semantic domain along with its set of operators is called a semantic algebra. Before

the valuation function is defined, we must define appropriate semantic algebras for

primitive domains such as numbers and boolean, compound domains such as sets,

lists and records, and other complex domains such as run-time environments and

memory stores. The valuation function takes an abstract syntax tree of the program

and maps it onto its meaning with the help of these semantic algebras.

There are many styles of denotational semantics. Two important styles are direct
and continuation semantics. Direct semantics definitions tend to use lower-order
expressions, and emphasize the compositional structure of a language. For
example, the equation

$$\mathbf{E}[\![E_1 + E_2]\!] = \lambda e.\ \mathbf{E}[\![E_1]\!]e \ \mathit{plus}\ \mathbf{E}[\![E_2]\!]e$$

gives a simple definition of side-effect free addition, that is, there is no notion
of sequencing in this definition.

Sequencing is an entirely operational notion. However, sequencing is an important

control structure in all imperative languages. The semantic argument that models











control is called a continuation. As an analogy, the activation record stack of a

programming language translator contains the sequencing information that "drives"
the evaluation of a program. Thus, the above example can be rewritten in the

continuation style as follows:

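$$\mathbf{E}[\![E_1 + E_2]\!] = \lambda e.\lambda k.\ \mathbf{E}[\![E_1]\!]e\,(\lambda n_1.\ \mathbf{E}[\![E_2]\!]e\,(\lambda n_2.\ k(n_1 \ \mathit{plus}\ n_2)))$$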
where e is the run-time environment argument and k is the continuation or control

argument. An important advantage of using a continuation is that abstractions in

the semantic equations are nonstrict. This is because the continuation effectively

captures the notion of "rest of the program" (in an expression-oriented language, the

program is an expression); thus the remainder of the program (denoted by k) is never

reached when an infinite loop is encountered. Although it is often possible to show
the equivalence (or, more precisely, congruence) between a direct and a continuation
style semantics for a given language, doing so is difficult.
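
The difference between the two styles can be made concrete with a pair of tiny evaluators for addition, written here in Python as a sketch. The names eval_direct and eval_cps and the tuple encoding of the abstract syntax are assumptions for illustration; Python's + stands in for the semantic operation plus.

```python
# Abstract syntax: an int literal, a variable name (str), or ("add", e1, e2).

def eval_direct(expr, env):
    """Direct style: the meaning of a sum is built from the meanings of its parts."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return env[expr]
    _, e1, e2 = expr
    return eval_direct(e1, env) + eval_direct(e2, env)

def eval_cps(expr, env, k):
    """Continuation style: k receives the value and represents the rest of the program."""
    if isinstance(expr, int):
        return k(expr)
    if isinstance(expr, str):
        return k(env[expr])
    _, e1, e2 = expr
    return eval_cps(e1, env,
                    lambda n1: eval_cps(e2, env,
                                        lambda n2: k(n1 + n2)))

expr = ("add", "x", ("add", 2, 3))
env = {"x": 1}
direct_value = eval_direct(expr, env)
cps_value = eval_cps(expr, env, lambda n: n)   # identity as the final continuation
```

In the continuation version the order of evaluation is explicit: E1 is evaluated first, and E2 is evaluated only inside the continuation that receives the value of E1.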

As discussed in chapter 6, the definition of a transaction is still an area of on-going

research for object-based database languages. We believe that one effective way to

study various possible definitions of a transaction is by defining a continuation style

semantics for the language. The central idea is that a valuation function then maps

a database program directly onto a transaction. One of the original targets of this

research was to define a transaction with the help of a continuation semantics. While

a concise continuation semantics to define transactions has managed to elude us, we

have been partially successful in defining a direct semantics for Voltaire. The concrete

syntax is defined in Appendix B, the abstract syntax is defined in Appendix C, and

the denotational semantics is defined in Appendix D. We follow the notation found

in [47].











7.3 Implementation Strategy

Our implementation strategy is shown in Figure 7.1. A Voltaire schema (consist-

ing of class and function definitions) is first translated by a parser into an abstract

syntax tree (AST). This AST is then analyzed by a semantic processor for consis-

tency, and possible optimization. If any syntax errors are detected, then they are

reported to the user at this level. If there are no errors, then another abstract syntax

tree (AST*) is generated. The run-time environment takes a request from the user

and executes it with respect to AST*. Effectively, the run-time environment recur-

sively walks the abstract syntax tree (AST*) to execute the user request. The main

advantage of this implementation strategy is that multiple optimization strategies

may be pursued independently, but in a coherent fashion. For example, the seman-

tic processor can exploit different optimization strategies to convert AST to AST*,

such as algebraic rewrites. Also, the run-time environment can exploit another set

of optimizations in which access from the persistent store is more efficient. A single

user request is treated as an atomic transaction.

If the user modifies the current schema in the middle of a session with the
environment, then any such change must be reflected in the schema's source. Since the
run-time environment will only reference (and therefore modify) AST*, there must be
another mechanism to translate the changes made to AST* back into Voltaire code.
Thus, when the user quits the environment, AST* is translated back into Voltaire
code by the deparser.
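
The division of labor among the semantic processor, the run-time environment and the deparser can be sketched in Python on a toy expression language. Everything here is a hypothetical stand-in: the function names optimize, run and deparse, the tuple encoding of the tree, and the choice of constant folding as the example of an algebraic rewrite.

```python
def optimize(ast):
    """Semantic-processor stand-in: fold constant additions (AST -> AST*)."""
    if isinstance(ast, tuple) and ast[0] == "add":
        left, right = optimize(ast[1]), optimize(ast[2])
        if isinstance(left, int) and isinstance(right, int):
            return left + right
        return ("add", left, right)
    return ast

def run(ast_star, env):
    """Run-time stand-in: recursively walk AST* to execute a request."""
    if isinstance(ast_star, int):
        return ast_star
    if isinstance(ast_star, str):
        return env[ast_star]
    _, left, right = ast_star
    return run(left, env) + run(right, env)

def deparse(ast_star):
    """Deparser stand-in: translate AST* back into source text."""
    if isinstance(ast_star, tuple):
        return f"({deparse(ast_star[1])} + {deparse(ast_star[2])})"
    return str(ast_star)

ast = ("add", "job-hours", ("add", 10, 5))
ast_star = optimize(ast)
result = run(ast_star, {"job-hours": 20})
source_text = deparse(ast_star)
```

The rewrite performed by optimize and the tree walk performed by run are independent of each other, which is the property that allows different optimization strategies to be pursued separately.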




























[Block diagram; legible labels: Deparser, Abstract Syntax Tree*, Semantic Processor]


Figure 7.1. Implementation of Voltaire
















CHAPTER 8
CONCLUSIONS AND FUTURE RESEARCH

In this dissertation we have described the syntax and semantics of the Voltaire

database programming language. Unlike most other languages, Voltaire has a single

execution model for evaluating queries, satisfying constraints and computing func-

tions. Such a design also facilitates a bootstrapped implementation. We believe that

it is a suitable language for data intensive programming. A prototype implementa-

tion is currently being completed. The main contributions of this dissertation are as

follows:


1. We have described a set-oriented, imperative database programming language

called Voltaire.

2. We have described a data definition facility which facilitates sharing of data

and manipulation of heterogeneous sets, and in which persistence is a property

of the instances rather than classes (or types).

3. The system provides transparency between persistent and transient objects by

defining a single set of operators for both kinds of objects.

4. We have designed the language in an additive or bootstrapping fashion.

5. We have discussed how the notion of temporary instance creation allows us to

give an equivalent semantics to classes and functions, which seemed necessary to

have a single model of execution for querying, enforcing integrity and computing

functions.











6. We have given a formal definition to the object model of Voltaire, which ac-

counts for behavior as well as the extent of a type. Thus, it provides a uniform

semantics for the persistent store (i.e., the database) and the run-time envi-

ronment by making it possible to statically type check expressions.

7. We have also given a partial denotational semantics, defining the main features

of Voltaire.

While the fact that the sequential order of constraints is significant may be con-

sidered as a limitation, we placed that restriction to avoid traditional computational

overhead associated with constraints. Also, we can now compute a function which

consists of evaluating or satisfying a sequence of constraints. Since functions and

classes are equivalent, they can be thought of as views (and likewise, the output pa-

rameters of the function as derived attributes). The values of derived attributes are

not stored, but are computed only upon demand. This adds to run-time overhead,

but guarantees that the user will always obtain correct results.
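
Treating an ordered sequence of constraints as a computation of derived attributes can be sketched in Python using the Student constraints of Appendix A. The function evaluate_student is hypothetical, and the sketch assumes that leisure-time is defined as 80 minus total-work.

```python
def evaluate_student(credit_hours, job_hours):
    """Evaluate the Student constraints in their stated order.
    Each step may use only values computed by earlier steps."""
    derived = {}
    derived["total-credit"] = sum(credit_hours)
    derived["total-work"] = derived["total-credit"] + job_hours
    derived["leisure-time"] = 80 - derived["total-work"]    # assumed definition
    derived["consistent"] = derived["leisure-time"] > 20    # a pure check, not an assignment
    return derived

light_load = evaluate_student([3, 3, 3], 20)
heavy_load = evaluate_student([4, 4, 4], 50)
```

Nothing is stored between calls, so the derived values are recomputed on every request; this is the run-time overhead mentioned above, paid in exchange for results that always agree with the current stored attributes.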

While our type system has certain useful properties, the type expressions are not

as powerful as in, say, Machiavelli. For example, we have not considered variant

records; polymorphism is ad hoc in terms of operator overloading, implicit coercion

and inheritance. It is an open question whether we can define a static type discipline

that is truly polymorphic, but also supports sharing of heterogeneous data. Advanced

issues such as exception handling or versioning may be addressed to enhance the

language. There are at least three directions for future research that appear promising:

1. Since the set expressions in Voltaire are very similar to those in SETL, it would

be interesting to investigate the possibility of extending SETL to make it a

polymorphic, strongly typed database programming language with static type

checking.










2. Extend the denotational description of Voltaire to a continuation style of seman-

tics, which could then be used to study the notion of transactions for DBPLs.

3. Extend the type system of Voltaire to define a type inferencing mechanism that

would eliminate the need to predefine transient attributes.

















APPENDIX A
UNIVERSITY SCHEMA

class Person defined
superclasses Any
subclasses Student, Teacher
attributes
ss#: integer
name: string

class Student defined
superclasses Person
subclasses Grad, Undergrad
attributes
gpa: real
major: Dept
sections: set Section
transcripts: set Transcript
total-work: integer
total-credit: integer
job-hours: integer
leisure-time: integer
visa-status: integer
constraints
total-credit = sum {sections.course.credit-hours };
total-work = total-credit + job-hours;
leisure-time = 80 - total-work;
leisure-time > 20;
if visa-status = "F-1" then job-hours < 20;

class Grad defined
superclasses Student
subclasses RA, TA
attributes
advisor: Faculty
committee: set Faculty
status: string












course-work: string
degree-req: string
thesis-option: integer
constraints
if exists thesis-option then advisor and advisor in committee;
for all { section.course.c# | c# > 5000 };
if status = "full-time" then total-credit > 12;
if course-work = "done" and thesis-status = "defended" and
count { committee.Faculty | Faculty.Dept includes Dept } > 2
then degree-req = "fulfilled";

class Undergrad defined
superclasses Student
attributes
minor: Dept

class Teacher defined
superclasses Person
subclasses Faculty, TA
attributes
degree: string

class Faculty defined
superclasses Teacher
attributes
books: string
specialty: string
advises: set Grad

class TA defined
superclasses Teacher, Grad
attributes
supervisor: Faculty

class RA defined
superclasses Grad
attributes
project: string

class Section defined
superclasses Any
attributes












section#: string
room#: string
textbook: string
taught-by: Teacher
course: Course
enrollment: set Student

class Course defined
superclasses Any
attributes
c#: string
title: string
credit-hours: integer
prereqs: set Course
sections: set Section
enrollment: set Student
dept: Dept

class Dept defined
superclasses Any
attributes
name: string
college: string
students: set Student
courses-offered: set Course

class Transcript defined
superclasses Any
attributes
grade: integer
course: Course
student: Student

class Advising defined
superclasses Any
attributes
startdate: string
faculty: Faculty
student: Student

















APPENDIX B
CONCRETE SYNTAX
I. A BNF for the Data Definition Sublanguage


::=
::=
::=

::=








::=



+
+

class (defined | function)
[superclasses <superclasss>+]
[subclasses +]
[instances +]
[attributes +]
[transients +]
[constraints: ]

instance [ +]
[attributes +]


::= : I =


::=


nil [ any I string I integer I real I I
set + I list + I tuple +


::= =


::=

::=
::=
::=

superclasss> ::=
::=
::=
::=
::=


null I | I I ""
I
{ + }
( + )
[+ ]
















II. Some Data Manipulation Operators
::= I I

::= = { new. | + }
= { new. }
::= { modify. | }
::= { delete. }

III. Query Sublanguage

::= { I } { }


::= ( ) I not < Bool > or < Bool02 >
< Booll > and < Bool2 > < E, > < E2 >
< E1 > = < E2 > I forall :
exists | dbexists

::= | I I

::= I . I < Ii > [ < 12 >+] |
< Ii > [ < 12 >+ ].

::= I

::= I I I "" |


::= { + } I { + I I { + }
{ + }I { + }

::= count | sum | avg | min | max

::= ≠ | ≤ | ≥ | < | > | in | includes
::= + | -
::= x | / | mod | div

::= prev | next | self | head | tail |











IV. Additive Constraint Sublanguage


::=


;< B2 > I < Commi >


< Comm, > ::= if then endif I
if then < B1 > else < B2 > endif

V. Additive Programming Sublanguage


< Comm2 > ::= < Comm, > I I I I

::= :=


::=
::=
::=

::=



for each in do enddo
while do enddo

I I I


VI. Environment


::= new-c I newi eval
::= load-db +

::=new..c I new-i I eval
script I check |
check : I quit
savein I delete |
















APPENDIX C
ABSTRACT SYNTAX


Voltaire ::= load-db Sc Db S

S ::= S1; S2 | new-c Cl | new-i Ins | eval SE |
      script Fn | check Cn |
      check Cn SE | quit |
      savein Fn | delete SE

Cl ::= class Cn (defined | function)
       [superclasses Sup]
       [subclasses Sub]
       [instances Rf]
       [attributes AD]
       [transients AD]
       [constraints: B]

AD ::= An: D | An=V

D ::= nil | any | string | integer | real | Cn |
      set D | list D | tuple AD

Ins ::= instance Rf [ Cn+ ] [attributes AV]

AV ::= An=V

V ::= null | Rf | Int | R | St | SV | TV

SV ::= {V+}
TV ::= [AV+]

Sc ::= Cl+
Db ::= Ins+

Sup ::= Cn
Sub ::= Cn
Cn ::= Ide
An ::= Ide

B ::= B1;B2 | Bo | C

C ::= if Bo then B endif | if Bo then B1 else B2 endif |
      A | L | DML | IO

Bo ::= ( Bo ) | not Bo | Bo1 or Bo2 | Bo1 and Bo2 |
       E1 Rel E2 | E1 = E2 |
       forall E : Bo | exists E | dbexists E

E ::= -T | T | T Add E | DE

T ::= F | F Mul T
F ::= Rf | Int | R | "St" | SV | SE

Agg ::= count | sum | avg | min | max
Rel ::= ≠ | ≤ | ≥ | < | > | in | includes
Add ::= + | -
Mul ::= x | / | mod | div

I ::= prev | self | head | tail | Ide

DML ::= New | Mod | Del
New ::= DE = { new.Cn | AV+ } | DE = { new.Cn Ide }
Mod ::= { modify.DE | Bo }
Del ::= { delete.DE | Bo }

DE ::= I | I.DE | I1[I2+] | I1[I2+].DE

SE ::= {E | Bo} | {E} | E | Agg SE

A ::= DE := SE

L ::= It | W
It ::= for each Ide in SE do B enddo
W ::= while Bo do B enddo

IO ::= Open | Close | Print | Read




Full Text

PAGE 1

92/7$,5( $ '$7$%$6( 352*5$00,1* (19,5210(17 :,7+ $ 6,1*/( (;(&87,21 02'(/ )25 (9$/8$7,1* 48(5,(6 6$7,6)<,1* &21675$,176 $1' &20387,1* )81&7,216 %\ 681,7 *$/$ $ ',66(57$7,21 35(6(17(' 72 7+( *5$'8$7( 6&+22/ 2) 7+( 81,9(56,7< 2) )/25,'$ ,1 3$57,$/ )8/),//0(17 2) 7+( 5(48,5(0(176 )25 7+( '(*5(( 2) '2&725 2) 3+,/2623+< 81,9(56,7< 2) )/25,'$

PAGE 2

7R *HHWD DQG .LOX IRU KDYLQJ VKRZQ PH WKH MR\V RI ZRQGHULQJ IRU KDYLQJ JLIWHG PH ZLWK D FKLOGKRRG WKDW ZDV QHYHU WKHLUV IRU KDYLQJ JLYHQ PH FRXUDJH WR VHHN 7UXWK DQG %HDXW\ ,W LV WR WKHP WKDW GHGLFDWH WKH ZRUN RI P\ OLIH

PAGE 3

$&.12:/('*0(176 ,W LV GLIILFXOW WR FRPSUHVV LQ D IHZ OLQHV RQHfV JUDWLWXGH WR D QXPEHU RI SHRSOH ZKR KDYH KDG DQ\ EHDULQJ RQ WKLV GLVVHUWDWLRQ ZDQW WR WKDQN 'U 6KDPNDQW 1DYDWKH ZLWK ZKRP KDYH ZRUNHG IRU ILYH \HDUV IRU RSHQLQJ PDQ\ RSSRUWXQLWLHV WR PH DQG EHLQJ RQH RI WKH PRVW IOH[LEOH DQG XQGHUVWDQGLQJ DGYLVRUV WKDW RQH FDQ KDYH ZDQW WR WKDQN 0DQXHO %HUPXGH] IRU WHDFKLQJ PH GHQRWDWLRQDO VHPDQWLFV DQG DOO WKDW NQRZ DERXW SURJUDPPLQJ ODQJXDJHV VSHQW WZR JUHDW \HDUV ZRUNLQJ ZLWK +RZDUG %HFN RQ WKH &$1','( SURMHFW GXULQJ ZKLFK SHULRG OHDUQHG PXFK 7KH UXGLPHQWV RI 9ROWDLUH OD\ LQ D fPHUFXULDOf ODWH QLJKW GLVFXVVLRQ ZLWK 6WHSKDQ *ULOO RYHU EHHU ,W KDV DOZD\V EHHQ D MR\ WR GLVFXVV WKH PHDQLQJ RI OLIH WKH XQLYHUVH DQG ZLWK 'U 3ULQFLS KH KDV VKRZQ PH WKDW LW LV SRVVLEOH WR OHDUQ DV PDQ\ WKLQJV LQ RQHfV OLIH DV RQH FDUHV WR ZRXOG DOVR OLNH WR WKDQN 'UV &KDNUDYDUWK\ /DP DQG 6X IRU PDQ\ LOOXPLQDWLQJ GLVFXVVLRQV RQ 9ROWDLUH DQG RWKHU WRSLFV ,W ZRXOG EH GLIILFXOW WR UHFRXQW WKH YDULRXV LQWHUDFWLRQV ZLWK SDVW DQG SUHVHQW VWXGHQWV ZKR KDYH VKDUHG WKH 'DWDEDVH &HQWHU DV D VXSHUODWLYH ZRUN SODFH %XW VKRXOG OLNH WR WKDQN 5DKLP
PAGE 4

7$%/( 2) &217(176 $&.12:/('*(0(176 LLL $%675$&7 YL &+$37(56 '$7$%$6( 352*5$00,1* /$1*8$*(6 ,QWURGXFWLRQ 6FRSH RI WKLV 'LVVHUWDWLRQ 6RPH 'HVLJQ &ULWHULD IRU '%3/V 6HPDQWLF 'DWD 0RGHO YHUVXV 3HUVLVWHQW $EVWUDFW 'DWD 7\SHV 7\SH &KHFNLQJ $ELOLW\ WR 0DQLSXODWH +HWHURJHQHRXV 6HWV $ELOLW\ WR 6KDUH 'DWD 'DWD YHUVXV )XQFWLRQV 'DWDEDVH ,QWHJULW\ 5ROH RI WKH 4XHU\ /DQJXDJH ,PSOHPHQWDWLRQ 6WUDWHJLHV &KRLFH RI &RPSXWLQJ 3DUDGLJP 3UHYLRXV 5HVHDUFK $1 29(59,(: 2) 92/7$,5( 'HVLJQ 5DWLRQDOH RI 9ROWDLUH $ 4XLFN *ODQFH RI 9ROWDLUH $Q ,QWURGXFWRU\ ([DPSOH '$7$ '(),1,7,21 &ODVVHV DQG ,QVWDQFHV $Q ([WHQVLRQDO 6HPDQWLFV IRU &ODVVHV 8SGDWH 2SHUDWRUV 2Q WKH &RPSXWDELOLW\ RI 6XEFODVV 2EMHFW *UDSKV DQG (TXDOLW\ &ODVVHV 7\SHV DQG 6FKHPDV *ORVVDU\ ,9

PAGE 5

48(5< 63(&,),&$7,21 7KH %DVLF 6WUXFWXUH RI D 4XHU\ ([DPSOHV $JJUHJDWH 2SHUDWRUV (YDOXDWLRQ 6WUDWHJLHV 6HPDQWLFV RI WKH 'RW 2SHUDWRU 1DLYH $SSURDFK $OJHEUDLF $SSURDFK &21675$,17 63(&,),&$7,21 %DVLF 6WUXFWXUH RI &RQVWUDLQWV ([DPSOHV &RQVWUDLQWV RQ WKH FODVV 6WXGHQW &RQVWUDLQWV RQ WKH FODVV *UDG 1XOO 9DOXHV DQG ([FHSWLRQV )81&7,21 63(&,),&$7,21 %DVLF 6WUXFWXUH RI D )XQFWLRQ $ 'DWDEDVH ([DPSOH 7HPSRUDU\ ,QVWDQFH &UHDWLRQ $ 0RGHO RI ,QKHULWDQFH IRU &ODVVHV DQG )XQFWLRQV (TXDOLW\ $VVLJQPHQW DQG 0RGLI\ 6FRSH RI ,GHQWLILHUV )XQFWLRQ &RPSRVLWLRQ 7+( 92/7$,5( (19,5210(17 $1' ,76 6(0$17,&6 ,QWHUDFWLQJ ZLWK WKH 9ROWDLUH (QYLURQPHQW $ 'HQRWDWLRQDO 6HPDQWLFV IRU 9ROWDLUH ,PSOHPHQWDWLRQ 6WUDWHJ\ &21&/86,216 $1' )8785( 5(6($5&+ $33(1',&(6 $ 81,9(56,7< 6&+(0$ % &21&5(7( 6<17$; & $%675$&7 6<17$; '(127$7,21$/ 6(0$17,&6 5()(5(1&(6 %,2*5$3+,&$/ 6.(7&+ Y

PAGE 6

$EVWUDFW RI 'LVVHUWDWLRQ 3UHVHQWHG WR WKH *UDGXDWH 6FKRRO RI WKH 8QLYHUVLW\ RI )ORULGD LQ 3DUWLDO )XOILOOPHQW RI WKH 5HTXLUHPHQWV IRU WKH 'HJUHH RI 'RFWRU RI 3KLORVRSK\ 92/7$,5( $ '$7$%$6( 352*5$00,1* (19,5210(17 :,7+ $ 6,1*/( (;(&87,21 02'(/ )25 (9$/8$7,1* 48(5,(6 6$7,6)<,1* &21675$,176 $1' &20387,1* )81&7,216 %\ 6XQLW *DOD 'HFHPEHU &KDLUPDQ 6KDPNDQW % 1DYDWKH 0DMRU 'HSDUWPHQW (OHFWULFDO (QJLQHHULQJ ,Q WKLV WKHVLV ZH SUHVHQW 9ROWDLUH ZKLFK LV D VHWRULHQWHG LPSHUDWLYH GDWDEDVH SURJUDPPLQJ ODQJXDJH 7KH VHW H[SUHVVLRQV LQ WKH ODQJXDJH DUH FRQGXFLYH WR GDWD LQWHQVLYH SURJUDPPLQJ ZKLOH PDLQWDLQLQJ D FHUWDLQ DPRXQW RI HIILFLHQF\ E\ HVSRXVLQJ WKH LPSHUDWLYH SDUDGLJP 7KH ODQJXDJH DQG LWV VHPDQWLFV DUH GHILQHG LQ D PRGXODU EXW DGGLWLYH IDVKLRQ ZKLFK IDFLOLWDWHV VRPH PHDVXUH RI ERRWVWUDSSLQJ :H IXUWKHU DUJXH WKDW VXFK DQ LPSOHPHQWDWLRQ PRGHO LV GHVLUDEOH VLQFH LW SURYLGHV D VLQJOH H[Hn FXWLRQ PRGHO IRU HYDOXDWLQJ TXHULHV VDWLVI\LQJ FRQVWUDLQWV DQG FRPSXWLQJ IXQFWLRQV 7KH V\VWHP SURYLGHV DXWRPDWLF LQWHJULW\ HQIRUFHPHQW LQ D OD]\ HYDOXDWLRQ PRGH )XQFWLRQV DUH HIIHFWLYHO\ FRPSXWHG DV WKH UHVXOW RI LQWHJULW\ HQIRUFHPHQW 7KLV LV EHFDXVH ZH FRQVLGHU FRQVWUDLQWV DV D VHTXHQFH RI FRPPDQGV WR EH HYDOXDWHG RU VDWn LVILHG LQ WKH VSHFLILHG RUGHU 7KHUH DUH QR DUELWUDU\ UHVWULFWLRQV RQ WKH SHUVLVWHQFH YL

PAGE 7

RI YDOXHVf§HYHQ IXQFWLRQV FDQ KDYH D SHUVLVWHQW H[WHQW )XUWKHU WKH TXHU\ ODQJXDJH LQFRUSRUDWHV IXQFWLRQV E\ SURYLGLQJ DFFHVV WR WKH SHUVLVWHQW H[WHQW RI D IXQFWLRQ RU E\ DOORZLQJ DQ DFWXDO IXQFWLRQ FDOO $OVR WKH FRPSLOHU FDQ H[SORLW FRQYHQWLRQDO DOJHEUDLF WHFKQLTXHV IRU TXHU\ RSWLPL]DWLRQ 7KH GDWD GHILQLWLRQ RU W\SHf IDFLOLW\ LV VLPLODU WR ZKDW PLJKW EH IRXQG LQ PRVW VHPDQWLF GDWD PRGHOV DQG LV FRQGXFLYH WR VKDULQJ KHWHURJHQHRXV UHFRUGV :H KDYH GHILQHG D W\SH DOJHEUD WKDW LQFRUSRUDWHV VWUXFWXUH H[WHQW DQG EHKDYLRU E\ SURYLGLQJ DQ H[WHQVLRQDO VHPDQWLFV IRU WKH EHKDYLRU :H DOVR DWWHPSW WR GHILQH D GHQRWDWLRQDO VHPDQWLFV IRU WKH 9ROWDLUH ODQJXDJH DQG HQYLURQPHQW :H EHOLHYH WKDW 9ROWDLUH LV D VXLWDEOH ODQJXDJH IRU GDWD LQWHQVLYH SURJUDPPLQJ DQG LV D UHDVRQDEOH FRPSURPLVH EHWZHHQ D GDWDEDVH V\VWHP DQG D SURJUDPPLQJ ODQJXDJH 9OO

PAGE 8

&+$37(5 '$7$%$6( 352*5$00,1* /$1*8$*(6 ,QWURGXFWLRQ ,Q WRGD\fV W\SLFDO RUJDQL]DWLRQ D ODUJH SURSRUWLRQ RI VRIWZDUH DSSOLFDWLRQV DUH LQ IDFW GDWDEDVH DSSOLFDWLRQV DQG DUH GHYHORSHG DW FRQVLGHUDEOH FRVW 7KH GHYHORSPHQW RI WKHVH DSSOLFDWLRQV LV XVXDOO\ SHUIRUPHG XVLQJ WZR GLVWLQFW LQFRPSDWLEOH ODQJXDJHV RQH IRU GDWD PDQLSXODWLRQ DQG RQH IRU SURJUDPPLQJ WKH DSSOLFDWLRQ )RU H[DPSOH &2%2/ LV RIWHQ XVHG DV WKH fKRVWf SURJUDPPLQJ ODQJXDJH LQ ZKLFK 64/ GDWD PDQLSXODWLRQ VWDWHPHQWV DUH HPEHGGHG 7KLV LV WKH FDVH LQ PRVW EXVLQHVV DSSOLFDWLRQV ZKLFK FRQVWLWXWH WKH ODUJHVW FRQn VXPHUV RI GDWDEDVH WHFKQRORJ\ $ W\SLFDO GDWDEDVH PDQDJHPHQW V\VWHP FRQVLVWV RI D GDWD GHILQLWLRQ ODQJXDJH ''/f DQG D GDWD PDQLSXODWLRQ ODQJXDJH '0/f >@ 7KH ''/ GHILQHV WKH GDWDEDVH VWUXFWXUH DQG KHQFH FRQVWLWXWHV WKH VWUXFWXUDO FRPn SRQHQW ZKHUHDV WKH '0/ FRQVLVWV RI D TXHU\ VXEODQJXDJH LH UHWULHYDO RSHUDWRUVf DQG XSGDWH RSHUDWRUV )RU H[DPSOH LQ D UHODWLRQDO GDWDEDVH VHWV RI UHODWLRQV DQG YDULRXV LQWHJULW\ FRQVWUDLQWV IRUP WKH VWUXFWXUDO FRPSRQHQW RU ''/ ZKLOH WKH TXHU\ ODQJXDJH 4/f LV EDVHG RQ WKH UHODWLRQDO FDOFXOXV RU DOJHEUD )XUWKHU WKH UHODWLRQDO 4/ LV VHWRULHQWHG DQG GHFODUDWLYH LQ QDWXUH 7KXV HPEHGGLQJ GHFODUDWLYH '0/ VWDWHPHQWV LQ DQ LPSHUDWLYH KRVW ODQJXDJH LQHYLWDEO\ OHDGV WR D SDUDGLJP PLVPDWFK EHWZHHQ WKH ODQJXDJHV 7KH DSSOLFDWLRQ GHYHORSHU RIWHQ VSHQGV LQRUGLQDWH DPRXQWV RI WLPH DQG HQHUJ\ RYHUFRPLQJ WKHVH LQFRPSDWLELOLWLHV 7KH LQFRPSDWLELOLWLHV DUH QRW MXVW FRQFHSWXDO EXW SK\VLFDO DV ZHOO )RU H[DPSOH VKDULQJ RI V\PERO VSDFH DQG ZRUN VSDFH EHWZHHQ WKH

PAGE 9

HPEHGGHG DQG KRVW ODQJXDJHV FUHDWHV FKDOOHQJHV IRU LPSOHPHQWDWLRQ 7KXV 'DWDEDVH 3URJUDPPLQJ /DQJXDJHV '%3/Vf KDYH EHHQ SURSRVHG WR DOOHYLDWH WKLV SUREOHP E\ LQWHJUDWLQJ SURJUDPPLQJ ODQJXDJH FRQVWUXFWV DQG GDWDEDVH FRQVWUXFWV LQWR D VLQJOH ODQJXDJH VHH IRU H[DPSOH > @f 7KHUH DUH VRPH LPSRUWDQW LVVXHV FRQFHUQLQJ WKH GHVLJQ RI GDWDEDVH SURJUDPPLQJ ODQJXDJHV > @ 3HUKDSV WKH PRVW GLIILFXOW LVVXH VWHPV IURP WKH IDFW WKDW GDWD PRGHOLQJ DQG NQRZOHGJH UHSUHVHQWDWLRQf HQWHUSULVHV DUH RQWRORJLF LQ QDWXUH LQ FRQWUDVW WR WUDGLWLRQDO SURJUDPPLQJ 7KLV PHDQV WKDW WKH UROH RI D GDWD PRGHO LV WR IDLWKIXOO\ FDSWXUH WKH VHPDQWLFV RI VRPH UHDO ZRUOG HQWLW\ ZLWKRXW ZRUU\LQJ DERXW WKH DFWXDO GDWD VWUXFWXUHV ZLWK ZKLFK WR LPSOHPHQW WKH JLYHQ HQWLW\ 2Q WKH RWKHU KDQG WKH UROH RI D ULFK W\SH V\VWHP LQ D WUDGLWLRQDO SURJUDPPLQJ ODQJXDJH LV WR DOORZ WKH XVHU WR FKRRVH D GDWD VWUXFWXUH ZKLFK ZLOO OHDG WR WKH PRVW HIILFLHQW LPSOHPHQWDWLRQ RI WKH DSSOLFDWLRQ LQ TXHVWLRQ 'HVLJQLQJ D '%3/ QHFHVVDULO\ HQWDLOV WKH PHUJLQJ RI FHUWDLQ LQFRPSDWLEOH IHDWXUHV RI D GDWDEDVH V\VWHP DQG SURJUDPPLQJ ODQJXDJH 7KXV WKH W\SH V\VWHP RI D SURJUDPPLQJ ODQJXDJH PXVW EH HOHYDWHG WR PDWFK WKH RQWRORJLF SURSHUWLHV RI D GDWD PRGHO WR HQKDQFH WKH FRPSXWDWLRQDO H[SUHVVLELOLW\ RI WKH UHVXOWLQJ '%3/ 8QIRUWXQDWHO\ D XQLIRUP WUHDWPHQW RI W\SHV EHKDYLRU H[WHQW DQG FODVVHV LV D QRQWULYLDO SUREOHP $Q LPSRUWDQW UHDVRQ IRU WKLV VHHPV WR EH WKDW D W\SH GHILQLWLRQ XVXDOO\ GRHV QRW DFFRXQW IRU WKH H[WHQW RI D W\SH > @ ZKHUHDV D GDWDEDVH FODVV GHILQLWLRQ GRHV SURYLGH D VHPDQWLF GHVFULSWLRQ RI LWV H[WHQW LH WKH FORVHG ZRUOG DVVXPSWLRQf )XUWKHU LW LV LPSRUWDQW WKDW WKH W\SH V\VWHP SURYLGH VWUXFWXUHV VXFK DV FODVVHVf IRU UHSUHVHQWLQJ VHWV RI VLPLODU EXW SRVVLEO\ KHWHURJHQHRXV VWUXFWXUHV VXFK DV UHFRUGV RU LQVWDQFHVf :H ZRXOG DOVR OLNH WR HPSKDVL]H WKDW PDQ\ SURSRVHG '%3/V GR QRW SURYLGH D WUXO\ LQWHJUDWHG FRPSXWLQJ SDUDGLJP )RU H[DPSOH WKH\ GR QRW SURYLGH D KRPRn JHQHRXV WUHDWPHQW RI REMHFW W\SH RU FODVVf PDQLSXODWLRQ DQG IXQFWLRQ 
SURFHGXUH

PAGE 10

RU PHWKRGf VSHFLILFDWLRQ 7KLV ODFN RI KRPRJHQHLW\ VWHPV IURP WKH IDFW WKDW WKHUH DUH WKUHH VXEODQJXDJHV WKDW IRUP D VLQJOH '%3/ 7KHVH VXEODQJXDJHV DUH IRU GDWD GHILQLWLRQ WR VSHFLI\ REMHFW W\SHV GDWD PDQLSXODWLRQ WR FRPSXWH D UHVWULFWHG FODVV RI TXHULHV DQG IXQFWLRQ VSHFLILFDWLRQ IRU PDNLQJ DUELWUDU\ FRPSXWDWLRQV ,W LV LPn SRUWDQW WR QRWH WKDW LQ PDQ\ H[LVWLQJ '%3/V DQ H[FHSWLRQ EHLQJ WKH HPEHGGLQJ RI UHODWLRQDO V\VWHPV ZLWKLQ ORJLF ODQJXDJHVf WKH WKUHH VXEODQJXDJHV DUH RUWKRJRQDO LH WKHUH WHQGV WR EH QR LQWHUOHDYLQJ DPRQJ SURJUDPPLQJ ODQJXDJH FRQVWUXFWV GDWD PDQLSXODWLRQ FRQVWUXFWV DQG GDWD GHILQLWLRQ FRQVWUXFWV ,QVWHDG WKH WKUHH VXEODQn JXDJHV DUH PHUHO\ fDSSHQGHGf WR HDFK RWKHU ZKLFK UHVXOWV LQ D '%3/ ODFNLQJ D WUXO\ LQWHJUDWHG SDUDGLJP +RZHYHU DSSHQGLQJ ODQJXDJHV LQ WKLV PDQQHU LV VWLOO D YDVW LPn SURYHPHQW RYHU HPEHGGLQJ TXHULHV LQ D KRVW ODQJXDJH VXFK DV 64/ LQ &2%2/f :H VKDOO EULHIO\ HQXPHUDWH VRPH LVVXHV WKDW OHDG WR FRQIOLFWV ZKHQ GHVLJQLQJ D GDWDEDVH SURJUDPPLQJ ODQJXDJH 6HWRULHQWHG PDQLSXODWLRQ SULPLWLYHV YHUVXV UHFRUGRULHQWHG SURJUDPPLQJ SULPn LWLYHV 'HFODUDWLYH TXHU\ ODQJXDJH YHUVXV LPSHUDWLYH SURJUDPPLQJ ODQJXDJH $ELOLW\ WR GHILQH D WKHRU\ RI W\SHV ZKLFK DFFRXQWV IRU H[WHQW DV ZHOO DV EHKDYLRU LQYROYHV FHUWDLQ FRPSURPLVHV Df D W\SH WKHRU\ PXVW EH DEOH WR FOHDUO\ GHILQH ZKHQ RQH FODVV LV D VXEFODVV RI DQRWKHU DQG ZKHQ D GDWDEDVH REMHFW EHORQJV WR D JLYHQ FODVV Ef VWDWLF YHUVXV G\QDPLF W\SH FKHFNLQJ Ff SRO\PRUSKLVP YHUVXV HIILFLHQF\ Gf DELOLW\ WR GHDO ZLWK KHWHURJHQHRXV UHFRUGV RU REMHFWV

PAGE 11

8QLIRUP SHUVLVWHQFH IRU DOO REMHFWV LQGHSHQGHQW RI WKHLU W\SH YHUVXV HIILFLHQW UHWULHYDO IURP VHFRQGDU\ VWRUDJH $ELOLW\ WR GHILQH WKH QRWLRQ RI D WUDQVDFWLRQ $ELOLW\ WR SURYLGH UHIHUHQWLDO WUDQVSDUHQF\ EHWZHHQ REMHFWV LQ PDLQ PHPRU\ DQG WKRVH LQ VHFRQGDU\ VWRUDJH 6FRSH RI WKLV 'LVVHUWDWLRQ ,Q WKLV GLVVHUWDWLRQ ZH SUHVHQW 9ROWDLUH D VHWRULHQWHG LPSHUDWLYH GDWDEDVH SURn JUDPPLQJ ODQJXDJH 7KH VHW H[SUHVVLRQV LQ WKH ODQJXDJH DUH FRQGXFLYH WR GDWD LQWHQVLYH SURJUDPPLQJ ZKLOH PDLQWDLQLQJ D FHUWDLQ DPRXQW RI HIILFLHQF\ E\ VXEVFULEn LQJ WR WKH LPSHUDWLYH SDUDGLJP 7KH ODQJXDJH DQG LWV VHPDQWLFV DUH GHILQHG LQ D PRGXODU EXW DGGLWLYH IDVKLRQ ZKLFK IDFLOLWDWHV D ERRWVWUDSSHG LPSOHPHQWDWLRQ :H IXUWKHU DUJXH WKDW VXFK DQ LPSOHPHQWDWLRQ PRGHO LV GHVLUDEOH 7KH GDWD GHILQLWLRQ RU W\SHf IDFLOLW\ LV VLPLODU WR ZKDW PLJKW EH IRXQG LQ PRVW VHPDQWLF GDWD PRGHOV DQG LV FRQGXFLYH WR VKDULQJ KHWHURJHQHRXV UHFRUGV 7KH TXHU\ ODQJXDJH SURYLGHV XQLIRUP DFFHVV WR VHWV RI LQVWDQFHV DV ZHOO DV IXQFWLRQV $OVR WKH FRPSLOHU FDQ H[SORLW FRQYHQn WLRQDO DOJHEUDLF WHFKQLTXHV IRU TXHU\ RSWLPL]DWLRQ 7KH V\VWHP SURYLGHV DXWRPDWLF LQWHJULW\ HQIRUFHPHQW XS WR D FHUWDLQ GHJUHHf )XQFWLRQV DUH HIIHFWLYHO\ FRPSXWHG DV WKH UHVXOW RI LQWHJULW\ HQIRUFHPHQW 7KLV LV EHFDXVH ZH FRQVLGHU FRQVWUDLQWV DV D VHTXHQFH RI FRPPDQGV WR EH HYDOXDWHG RU VDWLVILHG LQ WKH VSHFLILHG RUGHU )XUWKHU WKHUH DUH QR DUELWUDU\ UHVWULFWLRQV RQ WKH SHUVLVWHQFH RI YDOXHVf§HYHQ IXQFWLRQV FDQ KDYH D SHUVLVWHQW H[WHQW :H YLHZ 9ROWDLUH DV DQ H[SHULPHQW WR SURYLGH D ODQJXDJH IDFLOLW\ WR PDQLSXODWH VHWV RI DVVRFLDWLYH GDWD 2XU VHW H[SUHVVLRQV DUH VXSHUILFLDOO\ VLPLODU WR WKRVH LQ 6(7/ >@ WKXV UHGXFLQJ FHUWDLQ SDUDGLJP PLVPDWFK SUREOHPV ZLWK UHFRUGRULHQWHG

PAGE 12

languages. The design of our language in general, and our inheritance and data declaration scheme in particular, strongly reflect the database notion that a class denotes a set of instances that belong to it.

We provide the following functionality in Voltaire:
- a data definition facility similar to what might be found in most semantic data models [],
- a query language which provides uniform access to sets of instances as well as functions [],
- automatic constraint management (up to a certain degree) for reasonably expressive constraints [], and
- the ability to specify and compute arbitrary functions.

The first three features are based on the core functionality that a typical DBMS must provide. Arbitrary functions are then computed under the control of the DBMS. All of the above functionality is provided by a single execution model which reflects a bootstrapped implementation (see part c of the Implementation Strategies figure). Further, there are no arbitrary restrictions on the persistence of values.

We shall not be dealing with other important issues such as concurrency, transaction management, recovery, or active database management (essential for efficient integrity enforcement).

The main contributions of this dissertation can be summarized as follows:
- define a semantics for types, incorporating extent and behavior, that emphasizes the notion that a class (or type) denotes a set of objects;
- allow a set of heterogeneous records (objects) to belong to a single class, to facilitate sharing of data;
- alleviate the paradigm mismatch between record-oriented and set-oriented primitives for manipulating associative data within the language by means of type coercion;
- provide a modicum of efficiency by subscribing to the imperative paradigm within a set-oriented language; and
- provide a single model of execution for evaluating queries, enforcing constraints, and computing functions by designing a language that facilitates some measure of bootstrapping.

The rest of this dissertation is organized as follows. In the remainder of this chapter, we list some general design criteria for database programming languages and discuss previous research. The next chapter gives a brief overview of the design rationale of Voltaire and some of its features. The chapters after that take up, in order: the data definition facility in Voltaire along with update operators, together with a formal semantics of the type model used in the language; the features of the query sublanguage, with the help of examples and an outline of possible execution strategies; and the constraint specification sublanguage. In the chapter on functions, we first introduce the basic structure of functions in Voltaire and give a number of examples. Then we explain how the notion of temporary instance creation provides an operational means for giving an equivalent semantics to classes and functions in the runtime environment. This is followed by a theoretical explanation of why classes and functions can have an equivalent semantics, and some implications thereof. A further chapter first describes how a user can interact with the Voltaire environment, followed by a denotational semantics of the language. Finally, the last chapter summarizes our conclusions and the main contributions of this dissertation, and defines future research goals.

Some Design Criteria for DBPLs

Here we discuss the implications of merging the database and programming language cultures, which have traditionally been divergent. We feel that these issues, discussed elsewhere [], have been predominantly viewed from a programming language standpoint.

We must first note that the primary function of a database management system (DBMS) is to provide a persistent store of bulk data structures for efficiently processing transactions on sets of such data. More traditional application domains are data intensive; that is, the application tends to have a large volume of instances or records and relatively fewer types or classes. Therefore, it is conceivable that existing data models are extended to provide advanced functionality such as the ability to compute arbitrary functions or active data management []. The ability to define and handle various kinds of transactions is crucial in these applications.

In contrast, newer application areas such as CAD/CAM or CASE are computation intensive; that is, they tend to have a large number of types or classes, each class having few instances but requiring some database functionality. It may be more expeditious to extend a given programming language such that it provides DBMS-like functionality []. Hence, it seems that before designing a DBPL, the expected application domain should be known, since it is rather difficult (however desirable it may be) to design a system which can solve all problems.

Most DBPLs seem to have taken the second option, with certain exceptions. Some of these are relational systems embedded within logic and procedural languages [], and other systems such as []. There is a third class of DBPLs which are designed from scratch and address specific issues. These languages tend to be more experimental in nature.

We now attempt to analyze the effects of both the above options on various features that a DBPL may have.

Semantic Data Model versus Persistent Abstract Data Types

A semantic data model rigidly defines the structure of objects (or instances) which reside in a persistent store, and classes which describe these objects. Type constructors can only be used to define the domain of values which various attributes of a given object can assume. This means that new classes cannot be defined (or constructed) by applying type constructors to existing types; such manipulation is allowed only in the query language. In contrast, there are no such restrictions on type constructors with an abstract data type. However, with the abstract data type approach, the database administrator must determine the most suitable data types and structures for the application at hand, and also write a set of create, update, delete, and retrieve routines for each such structure. This is usually not considered a satisfactory situation in the database culture, primarily because it violates the principle of data independence. A partial remedy may be to distinguish between persistent and non-persistent data types, so that generic operators for manipulating the persistent objects can be efficiently implemented. But then this violates the principle of uniform persistence, i.e., persistence should be orthogonal to type []. Therefore, choosing a rigid data model implies efficient access to the persistent store but a lack of a rich typing mechanism, whereas the second option implies inefficient access to the secondary store but a rich typing mechanism and extensibility.

We would like to emphasize that persistent programming languages are not database programming languages. This is because when a programming language is extended to provide persistence, its type theory is usually not appropriately extended.
That is, such type systems are often unable to answer the following questions in a clear fashion:
- when is one class (type) a subclass (subtype) of another?
- when is an object (instance or record) a member of the domain of a given class (or type)?

Another problem with these type systems is that they often do not provide transparency between persistent and transient objects; that is, a separate set of operators is defined for persistent sets of objects. Hence, we believe that persistent versions of languages such as C, Smalltalk, or Ada cannot be classified as DBPLs, but should be considered as intermediate (albeit important) steps towards one.

Type Checking

The general consensus here seems to be that the language should be strongly typed, though some obviously convenient overloading may be allowed []. There also seems to be a consensus that type checking should be static as far as possible. This would minimize run-time errors, thus saving on the transaction processing overhead (catching a run-time error late in the transaction may result in a number of undo operations). Static type checking can be difficult to achieve in highly polymorphic languages, though some progress has been reported [].

Ability to Manipulate Heterogeneous Sets

Type definitions in languages such as C do not account for the extent of the type. This contrasts with the database notion of a class, which denotes the set of all instances that belong to that class. There has been much recent work on defining type schemes which attempt to define the extent of a type []. An important feature is the ability to manipulate sets of heterogeneous
data. For example, the language Machiavelli [] defines a type discipline in which it is possible to write polymorphic functions which may operate on sets of different kinds. However, a particular execution of the function may only operate on a set whose elements belong to a single kind.

Ability to Share Data

The ability to share data (heterogeneous or otherwise) should be an important property of a database programming environment. Sharing can occur in three ways:
- A single schema can describe multiple databases. For example, a chain of stores can have a single schema to describe the inventory at all of its locations.
- A single database can have multiple schemas describing it (unlike views). For example, a plant manager and plant engineer can have two different schemas emphasizing different aspects of the same CAM database.
- Multiple users may wish to share a given database (possibly viewed through different schemas).

Data versus Functions

Since independent applications access the same shared data under the control of a DBMS, the focus of a DBMS is on the data. On the other hand, the focus in a programming language is on the application itself, and the data types are simply a mechanism for efficient implementation of the application. This traditional separation of data from function leads to a very fundamental conflict when designing a DBPL, having implications on constraint management, ad hoc querying, and transaction processing. For example, let us examine the implications on an application independent (i.e., ad hoc) query mechanism. Since functions (or methods or procedures) can be used to generate derived attributes, it becomes necessary to be
able to query them []. Consider the class person with attributes birthdate and age, and a function called compute_age which computes the age of a person given his/her birth date and the current date. The query reference person.age should automatically trigger the compute_age function. Alternately, the language should allow the query reference person.compute_age. Ideally, the DBPL should allow functions to be accessed in a fashion similar to that of other objects.

Database Integrity

The importance of database integrity should be established for the given application area, and also it should be decided as to how much of the burden for maintaining this integrity can be placed on the application programmer, before designing a DBPL. Typically, in traditional database systems, integrity is enforced by application programs. However, enforcing integrity constraints is considered an important database function which should be handled by the DBMS itself. Some recent solutions to this problem have been discussed in the area of active databases []. When dealing with complex objects, the DBMS must at least be capable of maintaining referential integrity.

It is relatively difficult to define a theory of types that also takes into account the extent of the type in persistent store, since the user has complete freedom to define any arbitrary type. This makes it even more difficult to identify and enforce integrity constraints. The fundamental conflict here is that a database associates constraints with objects (i.e., automatic triggering of constraints when an object is created, updated, or deleted), whereas in a programming language, constraints are embedded in the procedure and therefore cannot be triggered automatically. Much recent work on constraint management is reported in the active database literature []. This would also lead to a more efficient transaction management
since a user-defined procedure for maintaining integrity can have arbitrary side effects, thus making it impossible to automatically determine which constraints will be violated. However, it is not yet clear how the notion of an active database can be merged with a programming language to design a DBPL.

Role of the Query Language

A database user usually needs to retrieve or otherwise operate on sets of similar valued objects defined by various classes. The query language is the mechanism that allows the user to specify a restricted class of computations to operate on such sets. It usually allows only restricted computations so as to maximize efficiency. The considerations for optimizing a query processor are significantly different from those in programming languages, which typically operate on one object at a time in virtual memory. Query optimizers rely heavily on clustering information on the disk, indexing, caching, and the algebraic properties of the primitive operators provided by the query language.

Ideally, one would want to augment the computing power of a query language by making it a "proper" subset of the programming language. (By proper subset, we mean that by removing all querying primitives from the DBPL, it would be rendered Turing incomplete.) In this scenario, it would be possible to make arbitrary computations efficiently as well as to evaluate ad hoc queries. But it should be pointed out here that if the DBPL were to have a very rich type system, where the persistent bulk data are of various different types, then query optimization becomes too complex to be effective. This is because each bulk data type would have its own associated optimization technique. Additionally, if the bulk data types are vastly different from each other, then it can be very difficult to meaningfully overload the query language primitives. For instance, it might be difficult to define a single "join" operator for relations in first normal form and user-defined complex objects
in a non-relational format. After all, the notion of uniform persistence should quite naturally be extended to the notion that the query language should be uniform for all data types (i.e., have a small set of operations that apply uniformly across them). This might be possible only in a language whose type system is highly polymorphic, and even if so, would be achieved only at the expense of sacrificing efficiency. Some work towards this end is reported in [].

Implementation Strategies

Traditional database functionality such as concurrency, locking, and transaction management facilitates data sharing. Such functionality is based on the notion that a class denotes the set of instances that belong to it. Thus, it seems important that a database programming language emphasize data rather than function. The Implementation Strategies figure shows some possible implementation strategies. Part a of the figure simply depicts a classical situation where DML statements are embedded in some host language. It is perhaps fair to say that part b depicts a typical implementation of the newer generation of database systems. Such implementations are in agreement with some recent work on extensible systems []. From the application programmer's point of view, parts b and c are functionally equivalent. However, we believe that part c is a cleaner and more desirable implementation model because
- it is possible for syntactic structures to be shared without harmfully overloading their semantics,
- it would be easier to bootstrap such a system,
- it would lead towards a smaller, integrated language, and
- it would reduce communication overhead between the various modules.

This is in contradistinction to functional data models such as DAPLEX [] or PDM [].

[Figure: Implementation Strategies. Surviving panel titles: (a) Classical Scenario; (c) Bootstrapping in Database Programming Language. Surviving diagram labels: Operating System, DBMS.]

Choice of Computing Paradigm

Ideally, the choice of a given computing paradigm should make no difference. Unfortunately, this is not the case in practice. It is very tempting to design a logic or functional language, since they have sound theoretical bases. This would make query optimization much easier, but the semantics of transaction processing can become messy because all update functions may have to be implemented as meta-predicates. This is because it is often difficult to provide a formal description of operations that produce side effects, such as updates. Besides, users seem to have a tendency to shy away from such languages. The implications of object-orientation on DBPL design have been well discussed in Bloom and Zdonik [] and Bancilhon [], and will not be discussed here.

Procedural languages such as COBOL, C, or Pascal have the main advantage of being rather popular among application programmers. However, they are considered to be "low-level" and therefore not expressive enough. Also, most procedural languages have virtually no set processing primitives (with the exception of COBOL).

However, from a database perspective, we feel that the destructive assignment operator causes the most problems. In a truly integrated DBPL environment [] with uniform persistence, it is difficult to prevent the user from (even accidentally) assigning a new value to a field. In effect, such an assignment is an update to the database, which could spawn potentially many sub-transactions for checking constraints before the assignment operation could be committed and the next command executed. (This is in addition to the usual problems, such as garbage collection and dangling references, caused by destructive assignment.) The destructive assignment operator is the
bête noire of automatic side-effect detection and constraint management. Unfortunately, the destructive assignment operator is necessary to achieve efficiency and better performance.

Regardless of which design strategy or language paradigm is chosen, one obvious pitfall to avoid is the PL/1 syndrome. Many DBPLs that are the result of three orthogonal sublanguages being appended to each other (see the section on implementation strategies) are also victims (though to a much lesser degree) of the PL/1 syndrome. For instance, it is better to provide different kinds of users with various library functions rather than incorporating language constructs for everything. Since one of the design goals of a DBPL is to cater to a larger variety of users, the environment should provide default primitives for each functionality, which can be easily superseded by the user.

Footnote: The PL/1 syndrome is a design pitfall in which an arbitrarily large number of constructs are provided. This in turn leads to a large and unwieldy language which is difficult to implement or learn.

Previous Research

Most DBPLs described in the literature fall into three main design options:
- Embed a given data model in some programming language, e.g., Pascal/R [], Modula/R [], ADAPLEX [], O2 [], Gemstone [].
- Provide persistence to a programming language (some languages also provide set manipulation primitives), e.g., PS-Algol [], ODE [], ONTOS [].
- Design a new system from scratch, e.g., TAXIS [], Galileo [], Machiavelli []. Voltaire falls in this category. TAXIS offers elaborate exception handling and meta-data definition capabilities, while the other two have polymorphic type systems based on ML []. Galileo is an expression-oriented language, thus eliminating the need for an explicit query language. Machiavelli is a functional
language which explicitly addresses the type versus class issue and the ability to manipulate sets of heterogeneous elements.

The first class of languages is engineered to provide a relatively clean interface between the record-oriented programming language primitives and set manipulation primitives for the underlying data model. Another important class of such languages are relational systems embedded within logic languages []. However, the main problem with these languages is that a certain amount of paradigm mismatch remains. For example, in Pascal/R, Pascal is an imperative language, whereas the relational model and its query language are declarative.

In the second class of languages, we have PS-Algol, which provides a persistent store for all types in Algol. On the other hand, ODE and ONTOS are extensions of C++ in which the only persistent structures are C++ classes. The problem with these languages is that they have not addressed the type versus class issues. When extending these languages with persistence, their type systems are not appropriately extended. That is, the type systems of these extended languages are unable to answer one or both of the following questions:
- when is one class (type) a subclass (subtype) of another?
- when is an object (instance or record) a member of the domain of a given class (or type)?

In the third class of languages, to which Voltaire belongs, TAXIS is one of the earliest efforts. It is a record-oriented language with a very elaborate exception handling mechanism. It provides arbitrary levels of meta-classes, and transactions and exceptions can be organized into a taxonomy. The language relied heavily on associative access by means of a dot operator. However, it did not have set manipulation
primitives, and constraints could be satisfied only by means of defining appropriate transactions and handling exceptions. Also, TAXIS classes are derived mainly from semantic networks rather than a typical type system []. In Voltaire, we provide a similar dot operator for associative access, as well as set manipulation primitives and automatic constraint management. Further, the type system is well-defined.

Galileo is an expression-oriented language with an ML-style type discipline. In such languages, expressions are evaluated directly; there is no need to write a function (or query) and then compile it before executing it. Therefore, it eliminates the need for a separate query language. A main design goal was to view Galileo as a conceptual design tool. Unlike Voltaire, it offers no automatic constraint management. Although Voltaire is not expression-oriented, we do not need a separate query language (largely due to its bootstrapped design).

Machiavelli is a functional language with an ML-style type discipline. An important aspect of its polymorphism is an underlying algebra of sets based on the homomorphic extension operator []. It also defines a coherent type theory which can deal with sets of heterogeneous records. Unlike in Voltaire, a notion of persistence is still to be defined for Machiavelli, and it does not support automatic constraint management. Like Machiavelli, we have an underlying algebra of sets based on the homomorphic extension operator. An important difference is that a unique identifier (and optionally, the name of the class) is automatically a part of any instance created in the system.

By contrast, O2 defines a theory of types based on Cardelli []. The semantics of behavior (i.e., methods) is captured by defining a signature, which is a set of functions attached to a class (or type). The data model is embedded within C and Basic. The semantics of our type system is based on that of O2, with two main differences:
- we support multiple inheritance, and
- we model behavior by giving it an entirely extensional interpretation, rather than as a signature.

Thus, the design of Voltaire was heavily influenced by Machiavelli, TAXIS, and O2. Further, none of these languages provide a means to share data as described in the section on the ability to share data.

CHAPTER
AN OVERVIEW OF VOLTAIRE

While there are a number of issues governing the design of a database programming language, we have chosen to address only a few of them. The Voltaire environment is intended to be used as a vehicle in which a user can efficiently define his or her application with ease. The applications are expected to be data intensive as opposed to computation intensive. An environment that is easy to use can result when the user need only focus on the specification of the application, rather than worry about dealing with paradigm mismatch problems between the host programming language and the DDL/DML (as discussed in the previous chapter). Thus, our primary goal is to provide the user with a truly integrated paradigm for data intensive computing. We achieve this by providing a single model of execution for evaluating queries, enforcing constraints, and computing functions, by designing a language that facilitates a bootstrapped implementation. Further, we define an extensional semantics for behavior in our type theory, thereby giving an equivalent semantics to classes and functions. Thus, a function is computed as the result of constraint satisfaction.

We first present the design rationale of Voltaire, followed by a brief overview of its various programming constructs.

Design Rationale of Voltaire

The basic structure of a query expression is as shown below.

    <Query>    ::= { <Dot_Expr> "|" <Bool> }
    <Bool>     ::= E1 and E2 | ... | E1 rel_op E2
    <E>        ::= <Reg_Expr> | <Query> | <Dot_Expr>
    <Dot_Expr> ::= <Identifier> | <Identifier>.<Dot_Expr>

A query consists of associative set expressions (see the chapter on the query sublanguage). The user specifies a path (or subgraph) of interest on the LHS of the vertical bar, and boolean predicates for selection conditions on the RHS of the vertical bar. This path of interest denotes the context of the set expression within which certain boolean conditions must hold true. It further defines the scope of identifiers. A simple context can be specified by using a dot expression such as Student.Course.Dept. As an example, consider the query

    { Student.name | Student.Course.c [...] and Student.advisor in Faculty }

The syntactic category <E> denotes expressions which are simple extensions to terms and factors found in most languages such as Pascal. A query can contain embedded subqueries, since a query is a kind of expression and <Bool> consists of expressions.

Boolean expressions have the usual and, or, not operators, quantifiers, and relational expressions of the form E1 <relop> E2. Thus, a constraint is of the form

    <Constraint> ::= if <Bool> then <Consequent>

The issue is to define the syntactic category <Consequent> without introducing further syntactic categories, and without overloading the semantics of existing structures in an unnatural fashion. This can be resolved by overloading the equality operator such that two conditions arise:
- If both the RHS and LHS are bound, then satisfiability is checked.
- If the LHS of the equality operator is unbound, then an assignment (or more appropriately, a binding) takes place.

Thus, <Consequent> ::= <Bool>. If these boolean conditions are chosen to be simple propositions, then satisfiability is NP-complete (due to the
satisfiability problem) and the order in which constraints appear is insignificant. But such a choice would be inadequate for the following reasons:
- lack of expressive power,
- computational overhead due to insignificance in the order of constraints,
- it raises the issue of how to blend such a semantics into a programming language that is not based on theorem proving techniques (such as resolution).

By taking a rather operational view, in which the order of constraints is significant, we can avoid the above problems. Also, we can blend constraints into a set-oriented, yet imperative, programming language. A program can then be viewed as a sequence of constraints and other commands:

    <Program>  ::= <Sequence>+
    <Sequence> ::= <Constraint> | <Command>

The category <Command> may consist of operators with side effects, such as updates or input/output, or other convenient constructs such as an iterator. Given the above interpretation, there is no a priori reason why a command cannot be a kind of consequent as well, i.e., <Consequent> ::= <Bool> | <Command>. Constraints are no longer viewed as mere pre- and post-conditions on the state of a computation, but rather as conditions that must hold true at arbitrarily specified points in a computation. This scheme is fairly general; consider the following:

    <Constraint> ::= if <Antecedent> then <Consequent>

The antecedent of a constraint can also be events (such as updates or retrieves) or exceptions. These issues are important in active database management []. Thus,

    <Antecedent> ::= <Bool> | <Event> | <Exception>
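
The operational reading of constraints described above can be illustrated with a small sketch. It is written in Python rather than Voltaire, and every name in it (UNBOUND, equate, run, demo, and the attribute names) is an assumption made for illustration, not part of the language. It shows only the two cases of the overloaded equality operator and the fact that constraints are evaluated in the order given.

```python
UNBOUND = object()  # marker for an attribute that has no value yet


def equate(env, name, value):
    """Overloaded equality on `name = value`.

    If `name` is unbound, bind it (the assignment-like reading).
    If it is already bound, check that the two sides agree.
    """
    if env.get(name, UNBOUND) is UNBOUND:
        env[name] = value
        return True
    return env[name] == value


def run(env, constraints):
    """Evaluate `if antecedent then consequent` pairs in the given order.

    Returns False at the first constraint whose antecedent holds but whose
    consequent cannot be satisfied; later constraints are not examined.
    """
    for antecedent, consequent in constraints:
        if antecedent(env) and not consequent(env):
            return False
    return True


def demo():
    env = {"salary": 100, "incr": 10, "raise_amt": UNBOUND}
    constraints = [
        # raise_amt is unbound here, so this equality binds it.
        (lambda e: True,
         lambda e: equate(e, "raise_amt", e["salary"] * e["incr"] // 100)),
        # total does not exist yet, so this equality binds it as well.
        (lambda e: e["raise_amt"] > 0,
         lambda e: equate(e, "total", e["salary"] + e["raise_amt"])),
        # total is bound by now, so this equality is only checked.
        (lambda e: True, lambda e: equate(e, "total", 110)),
    ]
    return run(env, constraints), env
```

Because the order is significant, moving the third constraint to the front would turn it from a check into a binding. This order dependence is what the operational view accepts in exchange for avoiding a general satisfiability search.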

The main limitation of this operational interpretation is that constraints cannot be automatically propagated, other than what has been explicitly programmed by a user. For example, the user would have to write a rule such that if any employee is deleted, then delete all dependents of such an employee. If such rules are omitted in the definition of a given class, then the database may end up in an inconsistent state. However, by adopting a lazy evaluation strategy, consistent data can be guaranteed as the result of evaluating an expression (recall that a query is only one kind of expression). The above discussion is based on the implicit assumption that expressions can be evaluated against a persistent store, i.e., a database. We believe that the above formulation leads towards a bootstrapped implementation.

Other issues that we chose to address in the design of Voltaire, with respect to the design criteria for DBPLs outlined in the previous chapter, are:
- We define an object-based data model (or type system) that accounts for both extent and behavior, and facilitates manipulation of heterogeneous records and sharing of data. Further, operators defined in the language are transparent to the persistence (or non-persistence) of objects.
- The set-oriented expressions can be statically checked for type errors.
- We alleviate the paradigm mismatch problem between record- and set-oriented paradigms by designing a language based on set expressions, by employing implicit type coercion, and by some obvious operator overloading.
- We provide a limited form of automatic constraint management.
- The query language can uniformly access objects and functions.

To make our discussion more concrete, we shall briefly present an introductory example of data definition, constraints, and functions written in Voltaire later in this chapter.

Footnote: This is precisely the view taken by Jagadish [].
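
As an illustration of the explicitly programmed rule mentioned above (deleting an employee's dependents), the following Python sketch treats a delete event as the antecedent of a rule. It is a toy model under stated assumptions: the Store class, its methods, the Dependent class name, and the "of" attribute are all hypothetical and do not reflect Voltaire's actual syntax or implementation.

```python
from collections import defaultdict


class Store:
    """Toy object store whose rules fire on delete events (hypothetical API)."""

    def __init__(self):
        self.objects = {}                   # oid -> attribute dict
        self.on_delete = defaultdict(list)  # class name -> rules

    def insert(self, oid, cls, **attrs):
        self.objects[oid] = {"class": cls, **attrs}

    def rule_on_delete(self, cls, action):
        """Register `if <delete of cls> then action`."""
        self.on_delete[cls].append(action)

    def delete(self, oid):
        obj = self.objects.pop(oid, None)
        if obj is None:
            return
        # Fire every rule attached to the class of the deleted object.
        for action in self.on_delete[obj["class"]]:
            action(self, oid, obj)


def delete_dependents(store, emp_oid, _emp):
    """Consequent of the rule: remove all dependents of the deleted employee."""
    doomed = [oid for oid, o in store.objects.items()
              if o["class"] == "Dependent" and o.get("of") == emp_oid]
    for oid in doomed:
        store.delete(oid)


def build(with_rule):
    store = Store()
    store.insert("e1", "Employee", name="Ann")
    store.insert("e2", "Employee", name="Bob")
    store.insert("d1", "Dependent", name="Cy", of="e1")
    store.insert("d2", "Dependent", name="Dee", of="e2")
    if with_rule:
        store.rule_on_delete("Employee", delete_dependents)
    return store
```

Calling build(False) and then deleting "e1" leaves the dependent "d1" pointing at an employee that no longer exists, which is the inconsistent state the text warns about when such a rule is omitted.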

We shall adopt the following convention in all subsequent chapters. All identifiers for class names will begin with a capital letter, attribute names with a small letter, and reserved words will be in bold face. In normal text, all identifiers will be italicized, except for reserved words.

A Quick Glance of Voltaire

Voltaire supports a number of features and abstraction mechanisms for modeling the data as well as the application. We first list the abstractions for database modeling.

- Classes: A class is a set of instances or objects being modeled, such that these objects share certain common characteristics. The name of a class denotes the objects currently existing in the database. There exists only one copy of the object in the database, though other objects may refer to it. A class definition consists of a sequence of <attribute_name, domain> pairs. An object can be a member of a class if it has at least those attributes defined in the class; thus, an object can have additional attributes and belong to the class in question, without the necessity for creating either a new subclass or an exception.
- Aggregation: Objects belonging to classes are aggregates of heterogeneous components, having objects of other classes as components. Associations between various objects are represented as aggregations. An object is a sequence of <attribute_name, value> pairs.
- Generalization: Voltaire supports a taxonomy of classes. Subclasses are derived from a class by adding more information to the class. Instances of a subclass also belong to its parent classes. Since we support multiple inheritance, an instance can have many parent classes, or belong to a subclass which can have
many parent classes. Further, the type of the elements of a subclass is a subtype of the type of the elements of the parent class.
- Sharing: The type system of Voltaire makes it possible for a given set of instances to be viewed or shared by more than one schema, or for a given schema to be able to define more than one set of instances (see the section on the ability to share data).

The Voltaire language also has the following characteristics:
- Voltaire is a set-oriented but object-based language, subscribing to the imperative paradigm of programming.
- Expressions in Voltaire are a simple extension of terms and factors, the kind of expressions found in Pascal-like languages. An important extension is the set expression, which returns a set of objects (values or instances) belonging to a given type. A simple set expression includes the dot operator, which facilitates associative access.
- The main control structure is the sequencing of commands or constraints. The language also provides conditionals, iterators, and recursive function calls.
- Every denotable value of the language possesses a type.
  (a) A type is a set of values sharing a set of common properties, together with a sequence of constraints which define the behavior of elements of a type.
  (b) The predefined types are boolean, integer, real, and string (with the usual operators), the type Nil, which is a singleton set with the element null, and the type Any, of which all types are a subtype. Equality is defined for the type Nil, which is a subtype of all types defined in the schema.
  (c) The type constructors set and tuple are available to define new types from predefined or previously defined types.
  (d) A value of type τ1 can be used as an argument to a function defined for values of type τ2 if τ1 is a subtype of τ2. Since the subtype relation is a partial order, reverse substitution is not allowed.
- It is a first order language. However, the extent of a function is a denotable value (which can also be persistent). Therefore, an element belonging to the extent of a function can be embedded in data structures, passed as a parameter, or returned as a value. It should be noted that this approach is quite different from the one taken in higher order functional languages, where the function itself is a denotable value.
  Footnote: It is useful to think of an element of the extent of a function as a member of the graph of that function. The Voltaire system, however, treats it as an instance whose attributes (which correspond to the formal parameters of the function) are bound to denotable values, thus capturing pre- and post-computation information.
- Functions and classes in Voltaire have an equivalent semantics. A given function is specified by the relationships between the input and output arguments of that function. These parameters form the attributes of the function (or class), and the relationships among them are expressed as a sequence of constraints. These relationships or constraints are rules for evaluating the function. Thus, the evaluation of a function can be seen as the result of sequential constraint satisfaction.
- The Voltaire environment prompts the user for inputs and reports the result of computations in an interactive fashion. At this level of evaluation, the user can load a given schema (definitions of classes and functions) and a given database
(a set of instances). Alternately, a new schema can be defined and a new database created. Further, one can evaluate set expressions (which effectively are queries) or execute functions.

An Introductory Example

We give below a simple example to illustrate the notion of sharing as defined in section [...]. As mentioned there, a given schema can describe more than one consistent set of instances and, likewise, a given set of instances can be defined by more than one schema. Therefore, we define two simple schemas and two sets of instances. Let Schema1 be defined as follows:

    class Employee defined
        attributes
            name       string
            ss         integer
            dept       Department
            manager    Employee
            salary     integer
        Constraints

    class Dept defined
        attributes
            name       string
            location   string
            manager    Employee
            budget     integer
        Constraints
            budget > sum { Employee.salary | Employee.dept.Dept.name = self.name }

    class Incr_Salary function
        attributes
            incr       integer
        constraints
            for each x in Employee do
                { modify x | salary = prev.salary + (prev.salary * incr) ; }
            enddo

Thus, Schema1 consists of the two classes Employee and Dept, and the function Incr_Salary. A constraint is defined on the class Dept such that the budget of each

PAGE 35

Dept should be greater than the sum of the salaries of all employees working in it. The argument of the sum operator is effectively a query in which self denotes the currently active instance of the class Dept. The function Incr_Salary increases the salary of each employee in the database by a given percentage. The dot expression prev.salary denotes the older value of salary. The command in the body of the for loop could have been alternately written as

    salary = salary + (salary * incr)

Similarly, let Schema2 be defined as follows:

    class Employee defined
        attributes
            name       string
            manager    Employee
            salary     integer
        Constraints

    class Dept defined
        attributes
            name       string
            manager    Employee
        Constraints
            self.salary [...] manager.salary

    class Emps_in_Dept function
        attributes
            dept_name      string
            dept_mgr       string
            emps_in_dept   set Employee
        constraints
            dept_mgr = { Dept.manager | Dept.name = dept_name }
            emps_in_dept = { Employee | Employee.manager = dept_mgr }

We again define Employee and Dept classes, and a function Emps_in_Dept which determines all the employees working in a department, given its name. The function could have been redefined without the identifier dept_mgr as follows:

    emps_in_dept = { Employee | Employee.manager = { Dept.manager | Dept.name = dept_name } }
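The nested form of emps_in_dept can be read as one set comprehension inside another. The Python sketch below is only an illustration under stated assumptions: the dictionaries `depts` and `employees` are hypothetical stand-ins for the extents of Dept and Employee, and the comparison of Employee.manager with the inner set is modeled here as set membership, which the text does not spell out.

```python
# Hypothetical in-memory stand-ins for the extents of Dept and Employee.
# Keys play the role of instance identifiers; this is not Voltaire syntax.
depts = {
    "production": {"name": "Production", "manager": "harry"},
    "finance": {"name": "Finance", "manager": "sally"},
}
employees = {
    "joe": {"manager": "sally"},
    "harry": {"manager": "harry"},
    "jim": {"manager": "john"},
    "sally": {"manager": "sally"},
}


def emps_in_dept(dept_name):
    """Employees whose manager is the manager of the named department."""
    # Inner set expression: { Dept.manager | Dept.name = dept_name }
    dept_mgr = {d["manager"] for d in depts.values() if d["name"] == dept_name}
    # Outer set expression: { Employee | Employee.manager = dept_mgr }
    # Assumption: "=" against a set is treated as membership in that set.
    return {ident for ident, e in employees.items() if e["manager"] in dept_mgr}


print(sorted(emps_in_dept("Finance")))
```

The inner comprehension is evaluated once per call, which mirrors the version of the function that keeps the intermediate identifier dept_mgr.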

PAGE 36

Let the set of instances DB1 be as follows:

    instance joe           class Employee
        ss [...]   name "Joe"   dept finance   manager sally   salary [...]
    instance harry         class Employee
        name "Harry"   ss [...]   dept production   manager harry   salary [...]   spouse sally
    instance production    class Dept
        name "Production"   location "austin"   manager harry   budget [...]   employees {jim, harry}
    instance jim           class Employee
        ss [...]   name "Jim"   dept production   manager john   salary [...]   car "toyota"
    instance sally         class Employee
        name "Sally"   ss [...]   dept finance   manager sally   salary [...]
    instance finance       class Dept
        name "Finance"   location "athens"   manager sally   budget [...]

Note that the structures of the instances belonging to the classes Employee and Dept are different. For example, nothing is mentioned about spouses and cars in the class definition. Further, sally has a value for the attribute manager which points to itself. Such cyclic structures are legal in Voltaire. It means that Sally is her own manager. Similarly, let the set of instances DB2 be as follows:

    instance smith         class Employee
        name "Smith"   manager jack   salary [...]   education "MS"
    instance jack          class Employee
        name "Jack"   manager jack   salary [...]
    instance jill          class Employee
        name "Jill"   manager alice   salary [...]   spouse jack
    instance alice         class Employee
        name "Alice"   manager alice   salary [...]   dept wonderland

PAGE 37

    instance wonderland    class Dept
        name "Wonderland"   manager alice   budget null

We have defined a semantics for the type scheme that facilitates sharing of data (see section [...]). Thus, Schema2 can adequately define DB1 and DB2, since the type system will deduce that the corresponding structures are compatible. Similarly, DB1 can be defined by Schema1 and Schema2.

PAGE 38

CHAPTER
DATA DEFINITION

Classes and Instances

The data definition facility in Voltaire allows us to define classes and an inheritance hierarchy, as well as a database of instances. Depicted in Figure [...] is a schema graph that can be easily modeled in Voltaire. This schema is defined in appendix A. The purpose of this schema graph is to emphasize the associative nature of data in many applications. For example, the classes Grad and Person, denoting the set of all graduate students and persons respectively in the universe of discourse, can be defined as follows:

    class Grad defined
        superclasses   Student
        subclasses     RA, TA
        attributes
            ss         integer
            name       string
            gpa        real
            major      Dept
            advisor    Faculty
            sections   set Section

    class Person defined
        superclasses   any
        subclasses     Student, Teacher
        attributes
            ss         integer
            name       string

The attributes ss and name are inherited from the class Person; gpa, major and sections are inherited from Student. They therefore need not have been repeated, since Person was explicitly mentioned as a superclass in the definition of Student.

Instances are characterized by a unique identifier, the set of classes to which the instance may belong and the set of attribute-value pairs. An instance may belong

PAGE 39

to one or more classes, provided it satisfies all constraints attached to a given class and all of its superclasses. Some examples of instances are:

    instance joe     class Student
        ss [...]   name "Joe"   gpa [...]   major EE   sections {s1, ...}
    instance jim     class Person
        ss [...]   name "Jim"
    instance john    class Person
        ss [...]   name "John"   age [...]
    instance jack    class Person
        ss [...]   name "Jack"   salary [...]

The first identifier ("joe") after the keyword instance denotes a unique identifier for the instance in question. It belongs to the class Student. The value for major refers to an instance of class Dept, and that for sections is a set of unique identifiers belonging to the class Section. Further, notice that nothing was mentioned about age and salary in the definition of Person. However, since we have chosen to give an extensional semantics to class definitions, similar to that in previous works [...], an instance may have an arity greater than that of the classes to which it may belong. This decision was made for the following reasons:

- To allow a single schema to describe multiple databases.
- To allow a single database to be described by multiple schemas.
- To prevent an unnecessary proliferation of classes such as Person-with-age or Person-with-salary besides Person.
- To provide a means to deal with incomplete information and exceptions.

Footnote: If a single database is described by more than one schema, then the class to which an instance belongs cannot be stored along with the instance. In such a case, the class of an instance must be inferred (or read from a precompiled table) when opening a database.

PAGE 40

I A QDPH FROOHJH ERRNV VSHFLDOLW\ )LJXUH 8QLYHUVLW\ 6FKHPD

PAGE 41

Now consider the following program segment:

    s = { jim, john, jack }
    for each x in s
        print x.name

The reason why { jim, john, jack } is a valid structure is based on a simple extension of an idea described in Buneman and Ohori [...]. The idea is that one can define an ordering of database objects based on their information content, since a database object is a partial description of some real world entity. Thus, the instance jim (ss, name "Jim") contains less information than john (ss, name "John", age) and jack (ss, name "Jack", salary). If we were to assign types $\tau_1$, $\tau_2$ and $\tau_3$ respectively to these records, then one can define an ordering $\tau_2 \sqsubseteq \tau_1$ and $\tau_3 \sqsubseteq \tau_1$, where the ordering is based on the subtype relationship. Further, $\tau_1 = \sqcup\{\tau_1, \tau_2, \tau_3\}$, which can adequately define the type of {jim, john, jack}, where $\sqcup$ stands for the least upper bound (lub). Thus, a set can contain elements that can be assigned types such that a lub can be computed for these types. Discussion on the computability of a lub for more complex terms is found in Buneman and Ohori [...].

Before describing the update operators and query language, we shall briefly introduce the notion of associative access. The dot operator is a common means for achieving this [...], and is similar to field selection in Machiavelli [...]. For example, Grad.advisor.Faculty.name is an associative pattern which denotes the name of a faculty member who advises some graduate student. This dot expression could also have been written as Grad.Faculty.name, since there is a unique path from Grad to Faculty via advisor. Also, the dot expression joe.ss denotes the value [...] of type integer, and a set expression of the form { Student.name | ss = [...] } denotes the singleton set, the element of which has the value "Joe" of type string.
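The least upper bound used to type a heterogeneous set can be sketched for the simple case of flat records. The Python below is a minimal illustration, not the construction of Buneman and Ohori: record types are modeled as hypothetical dictionaries from attribute names to type names, field types are compared by plain equality, and the types chosen for age and salary are assumptions made only for the example.

```python
def is_subtype(sub, sup):
    """sub is a subtype of sup if it defines every field of sup with the same type."""
    return all(sub.get(attr) == typ for attr, typ in sup.items())


def lub(*record_types):
    """Least upper bound under record subtyping: the fields shared by all records."""
    common = dict(record_types[0])
    for rt in record_types[1:]:
        common = {attr: typ for attr, typ in common.items() if rt.get(attr) == typ}
    return common


# Hypothetical types for jim, john and jack (age and salary types are assumed).
t1 = {"ss": "integer", "name": "string"}
t2 = {"ss": "integer", "name": "string", "age": "integer"}
t3 = {"ss": "integer", "name": "string", "salary": "integer"}

print(lub(t1, t2, t3) == t1)
```

Under these assumptions the set {jim, john, jack} is typed by the record with only ss and name, so x.name is defined for every element while x.age is not.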

PAGE 42

The dot operator forms the basis of an associative pattern (or dot expression) and is directional. For example, let a and b be two classes, where a has an attribute s whose domain is b, and b has an attribute t whose domain is a. Thus, a.b has a different denotation from b.a, since they result in values whose domains are different (assuming that there is a unique path from a to b and vice versa). Given such unique paths, s and t can be thought of as inverse attributes. The system does not automatically maintain inverse attributes. Therefore, even though a dot expression may be meaningful in one direction, it may not be defined in the reverse direction. It is possible for the user to specify the names of two classes as operands to the dot operator, provided there exists an unambiguous path between the classes (or nodes in the schema graph). These dot expressions or associative patterns form an important component of the query sublanguage, as we shall see in the next chapter.

An Extensional Semantics for Classes

We shall now attempt to give an extensional semantics similar to that given in KANDOR [...]. In a Voltaire database, let $C$ be the set of classes defined in it, let $A$ be the set of attributes defined in it, let $B$ be the set of constraints (to model behavior) and let $\mathcal{I}$ be the set of instances defined in it. A partial model for a Voltaire database is then a set $\mathcal{D}$, the set of all instances, strings and numbers, plus a function $\mathcal{E}$ such that:

- $\mathcal{E} : C \rightarrow 2^{\mathcal{D}}$. This accounts for the fact that a given instance may belong to more than one class due to multiple inheritance.
- $\mathcal{E} : A \rightarrow (\mathcal{D} \rightarrow 2^{\mathcal{D}'})$, where $\mathcal{D}'$ is the disjoint union of $\mathcal{D}$, numbers and strings. Thus, an attribute is treated as a function or two-place predicate.

PAGE 43

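- $\mathcal{E} : \mathcal{I} \rightarrow \mathcal{D}$
- $\mathcal{E} : B \rightarrow 2^{\mathcal{D}}$
- $\mathcal{E}$ : numerals $\rightarrow$ integers
- $\mathcal{E}$ : real-numerals $\rightarrow$ real
- $\mathcal{E}$ : strings $\rightarrow$ strings

The last three conditions account for base types supported by the system. This function $\mathcal{E}$ effectively computes the extent of a given class. It may be thought of as being similar to a typical valuation function as found in denotational semantics. In order to compute the extent of a class, we must first compute the extent due to each syntactic category allowed in the definition of the class. Therefore, the various forms of $\mathcal{E}$ are defined above and, further, $\mathcal{E}$ must satisfy the following conditions:

- $\mathcal{E}[a : c] = \{\, x \in \mathcal{D} \mid \text{if } y = \mathcal{E}[a](x) \text{ then } y \in \mathcal{E}[c] \,\}$
- $\mathcal{E}[a : \text{set } c] = \{\, x \in \mathcal{D} \mid \text{if } y \in \mathcal{E}[a](x) \text{ then } y \in \mathcal{E}[c] \,\}$
- $\mathcal{E}[a : \text{tuple } a_1 : c_1, \ldots, a_n : c_n] = \bigcap_{i} \mathcal{E}[a.a_i : c_i]$
- $\mathcal{E}[c : \text{constraint } b_1; \ldots; b_m] = \bigcap_{i=1}^{m} \mathcal{E}[c : \text{constraint } b_i]$
- $\mathcal{E}[c : \text{constraint } b] = \{\, x \in \mathcal{D} \mid x \text{ satisfies the constraint } b \,\}$
- $\mathcal{E}[c] = \bigcap_{i=1}^{n} \mathcal{E}[c_i] \;\cap\; \bigcap_{j=1}^{m} \mathcal{E}[a_j]$, where the class $c$ has superclasses $c_1, \ldots, c_n$ and has attributes (with domain restrictions) $a_1, \ldots, a_m$.

This type of model is called a partial model because it does not take into account the definitions of instances. The reason for this is that the definitions of instances are not important for determining the subclass relationship, because it does not depend on a particular model, but on the entire set of models. Thus, $c_1$ is a subclass of $c_2$, i.e.,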

PAGE 44

$c_1 \leq c_2$ iff $\mathcal{E}[c_1] \subseteq \mathcal{E}[c_2]$.

It should be clear that a traditional characterization for this simple type discipline would ensure that the subclass relationship as defined above is decidable (provided that constraints are ignored). In fact, the formulation would be very similar to that of O2 and is given in section [...]. The above formulation is trivial since it does not yet account for functions, which we shall see in chapter [...].

The main reason for choosing the above semantics was to emphasize the extension of a given class. Our model makes no arbitrary assumptions. For example, the arity of an instance can be greater than that of the class(es) to which it may belong. Also, multiple inheritance is possible without any problems. Instances are characterized by a unique identifier, the set of classes to which the instance may belong and the set of attribute-value pairs. An instance may belong to one or more classes, provided it satisfies all constraints attached to a given class. The unique identifier is assigned to an instance by the system (which also ensures its uniqueness across the system) at the time when the instance is created.

Update Operators

We also provide a set of update operators to create and modify existing instances. The new operator allows us to create a new persistent instance with an immutable unique identifier, as follows:

    { new Student | ss = [...] and name = "Smith" and
                    major = { Dept | name = "EE" } and
                    sections = { Section | sec_number = [...] or sec_number = [...] or sec_number = [...] } }

This returns a unique identifier for a new instance of class Student, which will now be stored in the database. The right hand side of the vertical bar ("|") defines the values for each attribute of the instance. Assuming that there exists an instance defining the "EE" department, the value for major is given by the set expression

PAGE 45

{ Dept | name = "EE" }, which denotes the identifier EE. The value of gpa is not specified because there may be a constraint or rule which tells the system how to compute its value, i.e., gpa may be a derived attribute. Thus, before the instance is actually placed in the persistent store, the value for gpa would be computed and checked for consistency, but would not be made persistent along with the other values specified in the command.

The modify operator is like destructive assignment in the sense that it will destroy a persistent value (other than the unique identifier) and replace it with a new value specified by the user. The modified instance is then checked for consistency before it is committed to the persistent store. This check is limited only to those classes to which the instance may belong. For example,

    { modify joe | major = { Dept | name = "CS" } }

changes the value of the major attribute of the object referenced by joe. Similarly,

    { modify Person | age = prev.age + [...] }

will increase the age of every instance of class Person by [...]. The delete operator actually destroys the (set of) instances specified by the user, e.g.,

    { delete Student | gpa [...] }

These operators are also defined for nonpersistent data values.

On the Computability of Subclass

Object Graphs and Equality

Suppose we are given:

- A finite set of domains $D_1, \ldots, D_n$, $n \geq 1$. Let $\mathcal{D}$ denote the union of all domains.
- A countably infinite set $A$ of attribute names.
- A countably infinite set $ID$ of identifiers.

Footnote: The reason why new, modify and delete are defined for nonpersistent values as well is that persistence is a property of the instance and not of the class or type.
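The check-then-commit behavior described for the modify operator can be sketched as follows. This Python is an illustration under stated assumptions only: `store` is a hypothetical dictionary standing in for the persistent store, each update is a function of the previous instance (playing the role of prev), and constraints are hypothetical predicates over an instance; none of these names come from Voltaire.

```python
def modify(store, ident, updates, constraints):
    """Apply updates to a copy of the instance; commit only if all constraints hold."""
    prev = store[ident]
    candidate = dict(prev)
    for attr, compute in updates.items():
        # Each new value is computed from the previous state, like prev.attr.
        candidate[attr] = compute(prev)
    if all(check(candidate) for check in constraints):
        store[ident] = candidate  # commit the modified instance
        return True
    return False  # reject: the stored instance is left untouched


store = {"john": {"name": "John", "age": 30}}
non_negative_age = lambda inst: inst["age"] >= 0

ok = modify(store, "john", {"age": lambda p: p["age"] + 1}, [non_negative_age])
rejected = modify(store, "john", {"age": lambda p: -5}, [non_negative_age])
print(ok, rejected, store["john"]["age"])
```

Working on a copy is one simple way to make sure a failed consistency check leaves the stored value unchanged; the thesis does not say how Voltaire achieves this.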

PAGE 46

We now define the notion of value.

Definition (Values)

1. The special symbol null is a value, called a basic value.
2. Every element $v$ of $\mathcal{D}$ is a value, called a basic value.
3. Every finite subset of $ID$ is a value, called a set value. Set values are denoted in the usual way using brackets.
4. The finite partial function $\tau : A \rightarrow ID$, denoted by $(a_1 : i_1, \ldots, a_p : i_p)$, is defined on $a_1, \ldots, a_p$ such that $\tau(a_k) = i_k$ for all $k$ from 1 to $p$. Every $\tau$ is called a tuple value.

We denote by $V$ the set of all values. We now define the notion of an object.

Definition (Objects)

1. The set of all objects is $O = ID \times V$. An object is a pair $o = (i, v)$, where $i$ is an element of $ID$ (an identifier) and $v$ is a value.
2. In $o = (i, v)$, if $v$ is a basic value, then $o$ is a basic object. Similarly, we can define set-structured and tuple-structured objects.

Further, we define the functions $\iota : O \rightarrow ID$ and $\nu : O \rightarrow V$ such that $\iota(o)$ denotes the identifier $i$ and $\nu(o)$ denotes the value of object $o$, respectively. We also define the function $\rho : O \rightarrow 2^{ID}$, which associates with an object the set of all identifiers appearing in its value, i.e., those referenced by the object. We can now define an Object Graph.

Definition (Object Graph)

Let $\Theta$ be a set of objects. Then graph($\Theta$) is defined as follows:

PAGE 47

1. If $o$ is a basic object of $\Theta$, then the graph contains a corresponding vertex with no outgoing edge. The vertex is labeled with the value of $o$, i.e., $\nu(o)$.
2. If $o$ is the tuple-structured object $(i, (a_1 : i_1, \ldots, a_p : i_p))$, then the subgraph in graph($\Theta$) corresponding to $o$ contains a node (say $n$) labeled with $i$, and $p$ outgoing edges from $n$ labeled with $a_1, \ldots, a_p$, leading respectively to nodes corresponding to objects $o_1, \ldots, o_p$, where each $o_k$ is identified by $i_k$ (provided such objects exist).
3. If $o$ is a set-structured object $(i, \{i_1, \ldots, i_p\})$, then the graph of $o$ consists of a node (say $n^{*}$) labeled by $i$, and $p$ unlabeled outgoing edges from $n^{*}$, leading respectively to nodes corresponding to objects $o_1, \ldots, o_p$, where each $o_k$ is identified by $i_k$ (provided such objects exist).

As an example, consider a set $\Theta$ of objects of the following forms (the subscripts of most identifiers did not survive):

    (i1, (name: i[...], dept: i[...], advisor: i[...]))
    (i[...], (name: i[...], dept: i[...], address: i[...], advises: i[...]))
    (i[...], "Jim")
    (i[...], "Joe")
    (i[...], "CS")
    (i[...], "EE")
    (i[...], {i[...], i[...]})
    (i[...], (city: null, zip: null))

The objects whose values are tuples are tuple-structured, those whose values are strings are basic, and the object whose value is a set of identifiers is set-structured. $\Theta$ is a consistent set of objects if it satisfies the definition given below.

Definition (Consistency of $\Theta$)

A set $\Theta$ of objects is consistent iff:

1. $\Theta$ is finite, and

PAGE 48

2. the function $\iota$ is injective on $\Theta$ (i.e., there exists no pair of objects with the same identifier), and
3. $\forall o \in \Theta,\ \rho(o) \subseteq \iota(\Theta)$ (i.e., every referenced identifier corresponds to an object).

Definition (Equality)

1. 0-equality: two objects $o$ and $o'$ are 0-equal (or identical) iff $o = o'$.
2. 1-equality: two objects $o$ and $o'$ are 1-equal iff $\nu(o) = \nu(o')$.
3. $\omega$-equality: two objects $o$ and $o'$ are $\omega$-equal iff span_tree($o$) = span_tree($o'$), where span_tree($o$) is the tree obtained from $o$ by recursively replacing an identifier $i$ (in a value) by the value of the object identified by $i$.

Classes, Types and Schemas

Definition (Basic Class Names)

- Bnames is the set of names for basic classes, containing:
  1. The special symbols Any and Nil.
  2. A symbol $d_i$ for each domain $D_i$. We denote $D_i = \mathrm{dom}(d_i)$.
  3. A symbol $'x$ for every value $x$ of $\mathcal{D}$.
- Cnames is the set of names for constructed classes, which is countably infinite and is disjoint with Bnames. This is because Bnames denotes the set of the names for basic domains such as boolean, string or integer.
- Tnames is the union of Bnames and Cnames, and it is the set of all names for classes.
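The three consistency conditions lend themselves to a direct check. The Python sketch below is an illustration under stated assumptions: objects are modeled as hypothetical (identifier, value) pairs, tuple values as dictionaries from attribute names to identifiers, set values as frozensets of identifiers, and anything else as a basic value. Finiteness holds trivially because the input is a Python list.

```python
def referenced(value):
    """rho: the identifiers appearing in a value (none for basic values)."""
    if isinstance(value, dict):          # tuple value: attribute -> identifier
        return set(value.values())
    if isinstance(value, (set, frozenset)):  # set value: identifiers
        return set(value)
    return set()                          # basic value (string, number, None)


def is_consistent(objects):
    """Check injectivity of identifiers and absence of dangling references."""
    idents = [ident for ident, _ in objects]
    if len(idents) != len(set(idents)):   # iota must be injective
        return False
    known = set(idents)
    return all(referenced(value) <= known for _, value in objects)


good = [("i1", {"name": "i2", "friends": "i3"}), ("i2", "Jim"), ("i3", frozenset({"i1"}))]
dangling = [("i1", {"name": "i9"})]
duplicate = [("i1", "Jim"), ("i1", "Joe")]

print(is_consistent(good), is_consistent(dangling), is_consistent(duplicate))
```

Note that the cyclic reference between i1 and i3 is accepted, which matches the earlier remark that cyclic structures are legal.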

PAGE 49

In order to define classes, we assume there is a finite set $B$ whose elements are constraints which describe the behavior of classes. For now, we shall consider elements of $B$ as uninterpreted symbols.

Definition (Classes)

- A basic class is a pair $(n, b)$, where $n$ is an element of Bnames and $b$ is a subset of $B$.
- A constructed class is one of the following:
  1. A triple $(s, t, b)$, where $s$ is an element of Cnames, $t$ is an element of Tnames and $b$ is a subset of $B$. Such a class is denoted by $(s = t, b)$.
  2. A triple $(s, \tau, b)$, where $s \in$ Cnames and $\tau$ is a finite partial function $\tau : A \rightarrow$ Tnames. Such a class is denoted by $(s = (a_1 : s_1, \ldots, a_n : s_n), b)$, where $\tau(a_k) = s_k$, and is called a tuple-structured class.
  3. A triple $(s, s', b)$, where $s \in$ Cnames and $s' \in$ Tnames. Such a class is denoted by $(s = \{s'\}, b)$ and is called a set-structured class.
- A class is either basic or constructed, and the set of all classes is denoted by $T$.

Definition (Class Structures)

- Basic Class Structure: Let $t = (n, m)$ be a basic class. Then $n$ is called the basic class structure associated with $t$.
- Constructed Class Structure: Let $t = (s = x, b)$ be a constructed class. Then $s = x$ is called the constructed type structure associated with $t$.

Given a class $t$, its structure is denoted by $\sigma(t)$ and its behavior by [...]. We first give some notation before defining the notion of consistency for class structures:

- If $t$ is a class, then $\eta(t)$ denotes the name of the class.

PAGE 50

- If $\sigma(t)$ is a class structure associated with the class $t$, then we denote $\eta(\sigma(t)) = \eta(t)$.
- If $\sigma(t)$ is a class structure associated with the class $t$, then we denote the set of all class names appearing in the structure of $t$ (namely $\sigma(t)$) by refer($\sigma(t)$).

Definition (Schemas)

A set $\Delta$ of constructed class structures is a schema if and only if:

1. $\Delta$ is a finite set, and
2. $\eta$ is injective on $\Delta$ (i.e., there exists only one class structure for a given class name), and
3. $\forall \sigma(t) \in \Delta,\ \mathrm{refer}(\sigma(t)) \cap \mathrm{Cnames} \subseteq \eta(\Delta)$, i.e., there are no dangling identifiers.

The semantics of the class structure system defined above is given by a function which associates subsets of a consistent set of objects to class structure names.

Definition (Interpretations)

Let $\Delta$ be a schema and $\Theta$ be a consistent subset of the universe of objects $O$. An interpretation $I$ of $\Delta$ in $\Theta$ is a function from Tnames to $2^{\iota(\Theta)}$ such that the following properties are satisfied:

A. Basic Class Names

(a) $I(\mathrm{Nil}) \subseteq \{\, i \in \iota(\Theta) \mid (i, \mathrm{null}) \in \Theta \,\}$. The interpretation of Nil is a subset of the identifiers in $\Theta$ such that they denote objects whose value is null.
(b) $I(d_i) \subseteq \{\, i \in \iota(\Theta) \mid (i, v) \in \Theta \text{ and } v \in D_i \,\} \cup I(\mathrm{Nil})$. The interpretation of a basic domain or type is the subset of identifiers of objects in $\Theta$ such that they denote basic objects in $\Theta$.

PAGE 51

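(c) $I('x) \subseteq \{\, i \in \iota(\Theta) \mid (i, x) \in \Theta \,\} \cup I(\mathrm{Nil})$.
(d) $I(\mathrm{Any}) = \{\, i \mid i \in \iota(\Theta) \,\}$. Since all objects belong to Any, its interpretation is the set of all identifiers defined in $\Theta$.

B. Constructed Class Names

(a) If $s = (a_1 : s_1, \ldots, a_n : s_n) \in \Delta$, then $I(s) \subseteq \{\, i \in \iota(\Theta) \mid (i, v) \in \Theta,\ v \text{ is a tuple-structured value defined at least on } a_1, \ldots, a_n, \text{ and } \forall k,\ v(a_k) \in I(s_k) \,\} \cup I(\mathrm{Nil})$.
(b) If $s = \{s'\} \in \Delta$, then $I(s) \subseteq \{\, i \in \iota(\Theta) \mid (i, v) \in \Theta \text{ and } v \subseteq I(s') \,\} \cup I(\mathrm{Nil})$.
(c) If $(s = t) \in \Delta$, then $I(s) \subseteq I(t)$.

C. Undefined Class Names

(a) If $s$ is neither a class name nor the name of the schema $\Delta$, then $I(s) \subseteq I(\mathrm{Nil})$.

Definition (Model of a Schema)

- Partial order on interpretations: An interpretation $I \subseteq I'$ if and only if, for all $s \in$ Tnames, $I(s) \subseteq I'(s)$.
- Model: Let $\Delta$ be a schema and $\Theta$ be a consistent set of objects. The model $M$ of $\Delta$ in $\Theta$ is the greatest interpretation of $\Delta$ in $\Theta$.

Theorem. The definition of a Model is sound.

Proof of Theorem. Given a schema $\Delta$ and a consistent set $\Theta$ of objects, there are a finite number of interpretations of $\Delta$ defined on $\Theta$. Therefore, in order to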

PAGE 52

prove that the greatest interpretation exists, we have to prove that the union of two interpretations is an interpretation. Let $I_1$ and $I_2$ be two interpretations and $I(s) = I_1(s) \cup I_2(s)$ for every class name $s$. Clearly, $I$ satisfies the properties under A and C of the definition above. Let $s = (a_1 : s_1, \ldots, a_n : s_n)$ and let $i$ be an element of $I(s)$. Then $i$ is either an element of $I_1(s)$ or of $I_2(s)$. If $i$ is an element of $I_1(s)$, then the value $v$ of the object identified by $i$ satisfies $v(a_k) \in I_1(s_k) \subseteq I(s_k)$ for all $k$, and $I$ satisfies property B(a) above. Similarly, it can be shown that $I$ satisfies properties B(b) and B(c) above. Thus, there exists a greatest interpretation $M$ such that $M(s) = \bigcup_{I \in \mathrm{INT}(\Delta)} I(s)$ for every class name $s$, where $\mathrm{INT}(\Delta)$ denotes the set of all interpretations of $\Delta$ in $\Theta$. ∎

Definition (Partial Order $\preceq$)

Let $s$ and $s'$ be two class structures of a schema $\Delta$. Then $s$ is a substructure of $s'$ (denoted by $s \preceq_{\Delta} s'$) if and only if $M(s) \subseteq M(s')$ for all consistent sets $\Theta$.

Theorem. If $s$ and $s'$ are two class structures of a schema $\Delta$, then $s \preceq s'$ if and only if one of the following conditions holds true:

1. $s$ and $s'$ are tuple structures $s = t$ and $s' = t'$ such that $t$ is more defined than $t'$ and, for every attribute $a$ on which $t'$ is defined, $t(a) \preceq t'(a)$ holds.
2. $s$ and $s'$ are set structures such that $s = \{t\}$ and $s' = \{t'\}$, and $t \preceq t'$ holds.
3. $s = {'x}$, $s'$ is a basic class structure, and $x \in \mathrm{dom}(s')$.

Proof of Theorem. The validity of this characterization can be established by induction. Completeness can be established on a case-by-case basis for tuple, set and basic class structures. ∎
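The three cases of the characterization translate into a short recursive check. The Python sketch below is only an illustration under stated assumptions: tuple structures are hypothetical dictionaries from attribute names to structures, set structures are pairs ("set", t), a singleton structure 'x is a pair ("val", x), basic class names are strings, and `DOMAINS` is a hypothetical stand-in for dom. Treating identical basic names as related (reflexivity) is an added assumption, and class names are assumed to be already expanded into their structures.

```python
# Hypothetical dom(): maps a basic class name to a Python type for membership tests.
DOMAINS = {"integer": int, "string": str}


def is_substructure(s, t):
    """Syntactic check of s <= t following the three cases of the theorem."""
    # Case 1: tuple structures; s must define every attribute of t, pointwise below.
    if isinstance(s, dict) and isinstance(t, dict):
        return all(a in s and is_substructure(s[a], t[a]) for a in t)
    # Case 2: set structures {s1} and {t1} with s1 <= t1.
    if isinstance(s, tuple) and isinstance(t, tuple) and s[0] == "set" and t[0] == "set":
        return is_substructure(s[1], t[1])
    # Case 3: singleton 'x against a basic class whose domain contains x.
    if isinstance(s, tuple) and s[0] == "val" and isinstance(t, str) and t in DOMAINS:
        return isinstance(s[1], DOMAINS[t])
    # Assumption (not stated in the theorem): identical structures are related.
    return s == t


person = {"ss": "integer", "name": "string"}
grad = {"ss": "integer", "name": "string", "gpa": "real"}

print(is_substructure(grad, person), is_substructure(person, grad))
```

Because the check never consults constraints, it reflects the remark that the characterization ignores the behavior of classes.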

PAGE 53

This theorem provides a syntactical means for computing the subclass relationship, since we are ignoring the behavior of classes in this characterization.

Definition (Databases)

A database is a tuple $(\Delta, \Theta, \preceq, I)$, where:

- $\Delta$ is a consistent schema.
- $\Theta$ is a consistent set of objects.
- $\preceq$ is a partial order among elements of $\Delta$.
- $I$ is an interpretation of $\Delta$ in $\Theta$.

Further, the following properties must hold:

1. If $t \preceq t'$ and $t \preceq t''$, then $\sqcup\{t', t''\}$ is computable, provided $t' \neq \mathrm{Any}$ and $t'' \neq \mathrm{Any}$. Further, $t'$ and $t''$ are now said to be comparable, and $\sqcup\{t', t''\}$ is the least upper bound of $t'$ and $t''$.

Glossary

Here we provide a brief glossary of some of the functions used in this section.

- $\iota$ denotes the identifier of an object $o$.
- $\nu$ denotes the value of an object $o$.
- $\rho$ associates with an object the set of all identifiers appearing in its value.
- $\tau$ is a partial function for tuple values.
- $\eta$ denotes the name of a class.
- $\sigma$ denotes the structure of a class.

PAGE 54

CHAPTER
QUERY SPECIFICATION

As mentioned earlier, Voltaire is an imperative programming language based on the notion of objects. Since query languages have traditionally been declarative and set-oriented, embedding them within a procedural, record-oriented framework inevitably leads to design conflicts. However, we avoid much of this conflict since Voltaire is a set-oriented language. This means that expressions, which form the core of Voltaire, denote a set of objects by default. For example, even the simple dot expression Student.advisor.Faculty.dept denotes a set of instances or objects whose type is the type of the attribute dept, such that each object participates in the association described in the dot expression.

These same set expressions are used in specifying constraints in a class or function definition, with one important restriction. An expression of the form

    $s = \{\, c_1[a_{11}, \ldots, a_{1j}] \,.\, c_2[a_{21}, \ldots, a_{2j}] \,\}$

is not allowed, even though it is well-typed: $\mathrm{type}(s) = \{ (\mathrm{type}(a_{11}), \ldots, \mathrm{type}(a_{2j})) \}$. The value of $s$ would be a set of tuples, and each element in a tuple can contain nested sets and tuples. If such expressions were allowed, the run-time overhead would be very expensive.

Multiple inheritance does not create a problem when evaluating a query or set expression. This is because an instance can occur only within a unique context in the expression. The context is decided by the anchor class of the dot expression, which is simply the first class appearing in a dot expression. For example, in { TA.advisor | TA in RA }, the context is defined by the LHS of the "|" and therefore the anchor class is TA. This query denotes the set of objects belonging to type(advisor) such

PAGE 55

that all instances of the class TA that have advisors are also members of the class RA. Even though the classes TA and RA are not subclasses of each other, they have common elements. Since the boolean condition TA in RA means self in RA (where self maintains currency in the set of objects belonging to the anchor class), the query can be evaluated without conflict.

The Basic Structure of a Query

The basic structure of the query sublanguage is as shown below:

    <set_expr> ::= { <E> "|" <Bool> }
                 | { <E> }
    <E>        ::= <agg_op> <set_expr>
    <Bool>     ::= ( <Bool> )
                 | not <Bool>
                 | <Bool1> or <Bool2>
                 | <Bool1> and <Bool2>
                 | <E1> <relop> <E2>
                 | <E1> [...] <E2>
                 | forall <E> <Bool>
                 | exists <E>
                 | dbexists <E>
    <E>        ::= <dot_expr>
                 | - <term>
                 | <term>
                 | <term> <addop> <E>

The query sublanguage consists of associative set expressions. The user specifies a path (or subgraph) of interest on the LHS of the vertical bar, and simple boolean predicates for selection conditions on the RHS of the vertical bar. This path of interest denotes the context of the set expression within which certain boolean conditions must hold true. The context is also important since it defines the scope of identifiers (this will be further elaborated in section [...]). A simple context can be specified by using a dot expression. An important restriction is that the first identifier in a dot expression on the LHS, which defines the context, must be a class name. This class is then called the anchor class. The syntactic category <E> denotes expressions which are simple extensions to those found in most languages such as Pascal. To project attributes of a class referenced in the dot expression, they are enclosed within square brackets. We show a few examples (some taken from Alashqur et al. [...]) below with respect to the schema graph depicted in figure [...].

PAGE 56

Examples

Q1: Project the names of all graduate students who teach other graduate students in some sections. Also, project the names of those graduate students they teach.

    { TA[name].teaches.Section.Grad[name] }

Note that the class TA inherits two attributes whose domain is the class Section, namely teaches from the class Teacher and sections from the class Grad (via Student). Since we are interested in TAs in their role as Teachers (and not as graduate students who also enroll in course sections), we appropriately include teaches in the dot expression.

Q2: Project the names of all departments that offer [...] level courses that have a current offering (i.e., sections). Also, project the titles of these courses and the textbook used in each section.

    { Dept[name].Course[title].Section[textbook] | Course.c [...] and Course.c [...] }

A department offers many courses, i.e., the class Dept has an attribute courseoffering whose domain is the class Course. Similarly, each Course may have one or more Sections. This query is evaluated by first accessing all instances of the class Dept. For each instance of Dept, we retrieve the object references to all courses offered by that Dept. These instances of class Course are then filtered through the boolean conditions to check if the corresponding course numbers lie between [...] and [...]. All instances of Course which do not satisfy this condition are dropped from further consideration. For each instance of Course so far selected, we access the corresponding Sections for that course.
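The evaluation order described for Q2 (departments, then their courses, then a filter on the course number, then sections) can be sketched as nested loops. The Python below is an illustration only: the data layout, the attribute names `courses`, `number` and `sections`, and the bounds `lo` and `hi` are all hypothetical, and the choice of an inclusive lower and exclusive upper bound is an assumption, since the actual bounds are not recoverable from the text.

```python
def q2(depts, lo, hi):
    """Project (dept name, course title, textbook) for courses numbered in [lo, hi)."""
    result = []
    for dept in depts:                      # access every instance of Dept
        for course in dept["courses"]:      # follow references to its courses
            if not (lo <= course["number"] < hi):
                continue                    # drop courses failing the condition
            for section in course["sections"]:  # only now access the sections
                result.append((dept["name"], course["title"], section["textbook"]))
    return result


# Hypothetical sample data.
depts = [
    {"name": "EE", "courses": [
        {"number": 6100, "title": "Databases", "sections": [{"textbook": "Ullman"}]},
        {"number": 3100, "title": "Circuits", "sections": [{"textbook": "Nilsson"}]},
        {"number": 6200, "title": "Compilers", "sections": []},
    ]},
]

print(q2(depts, 6000, 7000))
```

A course with no sections contributes nothing to the result, which corresponds to the requirement that a course have a current offering.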

PAGE 57

Q3: Project the names of all graduate students who are RAs but not TAs.

    { RA.name | not (RA in TA) }

The boolean condition could also have been specified as not (self in TA). This is because any dot expression on the RHS of the vertical bar beginning with the anchor class means the same as self. Self is a special operator used to define currency in a set processing stream.

Q4: Project the names of all undergraduate students whose minor is in that department which is the major department of the undergraduate student with ss [...].

    { Undergrad.name | Undergrad.minor.Dept = { Undergrad.major.Dept | Undergrad.ss = [...] } }

The boolean condition in this query has an embedded set expression. The scope of a dot expression (i.e., context) is local to the set expression in which it occurs. Therefore, in the inner set expression, we are interested in the major department of that instance of class Undergrad whose ss has the value [...]. Similarly, in the outer set expression, we are interested in that Undergrad whose minor Dept has the same value as that specified by the embedded set expression. In order to transcend the scope of a dot expression from an inner to an outer set expression (or vice versa), we must use special operators such as prev, as will be seen in chapter [...].

Q5: Project the names of all TAs who grade courses in which they themselves are registered (i.e., enrolled).

    { TA.name | self.teaches.Section in self.enrolled.Section }

PAGE 58

We are interested in those instances of TA that teach some section of a course in which that same instance of TA is enrolled. Since a TA may be taking more than one course but can teach only one course, we use the set inclusion operator. Again, self could have been replaced by TA.

Q6: What would be the values for salary for all research assistants whose advisor is Smith, if they were to receive a [...]% increment?

    { ([...] * RA.salary) | RA.advisor.Faculty.name = "Smith" }

This query would first evaluate the set expression and then multiply each projected value of salary by the scalar [...]. If the context were to have more than one subexpression containing the dot operator, then the first dot expression from the left would be chosen as the context, and the remaining ones would be interpreted as if they were on the RHS of the vertical bar.

Aggregate Operators

Several aggregate operators such as count, sum, min and max are provided. These are not really special operators, but are mainly provided for convenience. These can be easily defined by using a homomorphic set extension operator [...]:

    let hom = λ(f, op, z, S). S = {} → z
                            | tail(S) = {} → op(f(head S), z)
                            | op(f(head S), hom(f, op, z, tail S))

There is an alternative form of this function that applies to nonempty sets and does not require the z argument:

    let hom* = λ(f, op, S). op(f(head S), hom*(f, op, tail S))

Thus, we can now define the following:

    let sum = λS. hom(λx.x, +, 0, S)

PAGE 59

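    let count = λS. hom(λx.1, +, 0, S)
    let min = λS. hom*(λx.x, λ(x, y). x < y → x | y, S)

The above formulation gives us a way to define and compute these aggregation operators for sets of arbitrary structures, and we are guaranteed of getting a correct result that is free of side-effects.

Evaluation Strategies

Semantics of the Dot Operator

Set theoretic definition. Let $C_1, C_2$ be class names, let $\mathcal{E}[C_1], \mathcal{E}[C_2]$ be the extents of $C_1, C_2$, and let $c_{1i} \in \mathcal{E}[C_1]$ and $c_{2j} \in \mathcal{E}[C_2]$ respectively. Let $C_1$ have an attribute labeled $a_{1k}$ whose domain is $C_2$. Effectively, given a schema graph with two nodes $C_1, C_2$, there must exist a unique path from $C_1$ to $C_2$ for $C_1.C_2$ to be meaningful. Let $S$ denote the aggregation association from $C_1$ to $C_2$ via the attribute $a_{1k}$, such that $S \subseteq \mathcal{E}[C_1] \times \mathcal{E}[C_2]$, where $a_{1k}$ is an attribute of $C_1$. Thus,

    $C_1.C_2 = \{\, c_{2j} \mid c_{1i} \in \mathcal{E}[C_1] \wedge c_{2j} \in \mathcal{E}[C_2] \wedge (c_{1i}, c_{2j}) \in S \,\}$

If the domain of $a_{1k}$ is set $C_2$, then $S \subseteq \mathcal{E}[C_1] \times 2^{\mathcal{E}[C_2]}$, and let $C_{2j} \subseteq \mathcal{E}[C_2]$. Then

    $C_1.C_2 = \{\, c_{2j} \mid c_{1i} \in \mathcal{E}[C_1] \wedge c_{2j} \in C_{2j} \wedge (c_{1i}, C_{2j}) \in S \,\}$

In general, let $C_1, \ldots, C_n$ denote class names, and let $\mathcal{E}[C_1], \ldots, \mathcal{E}[C_n]$ denote their respective extents. Let $c_{ik}$ be the $k$th element of $\mathcal{E}[C_i]$. Let $S_i$ be a meaningful aggregation association (in the sense mentioned above) between $C_i$ and $C_{i+1}$ such that $S_i \subseteq \mathcal{E}[C_i] \times \mathcal{E}[C_{i+1}]$ or, if the domain of that unique attribute of $C_i$ is set $C_{i+1}$, then $S_i \subseteq \mathcal{E}[C_i] \times 2^{\mathcal{E}[C_{i+1}]}$. Also, $C_0.C_1 = \mathcal{E}[C_1]$. Then,

    $C_1.\cdots.C_{n-1}.C_n = \{\, c_{nk} \mid c_{nk} \in \mathcal{E}[C_n] \wedge c_{(n-1)l} \in C_1.\cdots.C_{n-1} \wedge (c_{(n-1)l}, c_{nk}) \in S_{n-1} \,\}$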
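The general set theoretic definition suggests a left-to-right evaluation in which the set reached so far is pushed through one association at a time. The Python sketch below illustrates this under stated assumptions: extents are hypothetical sets of identifiers, each association is a set of (source, target) pairs, and a set-valued attribute is assumed to have been flattened into one pair per member.

```python
def dot_path(extents, associations):
    """Evaluate C1.C2. ... .Cn given the extents and the associations S1..S(n-1)."""
    current = set(extents[0])  # C0.C1 is just the extent of C1
    for extent, assoc in zip(extents[1:], associations):
        # Keep targets in the next extent that are reached from the current set.
        current = {dst for (src, dst) in assoc if src in current and dst in extent}
    return current


# Hypothetical extents and associations for Grad.advisor.Faculty.dept.
grads = {"g1", "g2", "g3"}
faculty = {"f1", "f2"}
depts = {"ee", "cs"}
advisor = {("g1", "f1"), ("g2", "f1")}   # g3 has no advisor
dept_of = {("f1", "ee"), ("f2", "cs")}

print(dot_path([grads, faculty, depts], [advisor, dept_of]))
```

Only the department of f1 is reached, since f2 advises no graduate student; this matches the reading that each resulting object must participate in the association described by the whole dot expression.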

Model theoretic definition

We now give a formal definition of the dot operator with respect to the algebra defined in section […]. Let $C_i \in T$, where $T$ is the set of all types in the schema. Then $\eta(C_i) \in \eta(T)$, where $\eta$ is the name function. Let $c_{ij} \in \mathcal{I}(C_i)$, where $\mathcal{I}(C_i)$ is the interpretation of $C_i$. Then $\eta(C_i).\eta(C_{i+1})$ is valid if and only if

$$\sigma(\eta(C_i)) = \langle a_{i1} : s_{i1}, \ldots, a_{in} : s_{in} \rangle \;\wedge\; \exists\, a_{ik} : \pi(\sigma(\eta(C_i)))(a_{ik}) = s_{ik} \;\wedge\; \mathcal{I}(s_{ik}) \subseteq \mathcal{I}(C_{i+1})$$

Clearly then, $\eta(C_i).\eta(C_{i+1}) \subseteq \mathcal{I}(C_{i+1})$. Recall that $\sigma(C)$ denotes the structure of $C$ and $\pi$ is the partial function defined on tuple structures. For brevity, we drop $\eta$, so that $C_i.C_{i+1}$ means the same as $\eta(C_i).\eta(C_{i+1})$, and also $C_0.C_1 = \mathcal{I}(C_1)$. Now,

$$C_1.\cdots.C_{n-1}.C_n = \{\, c_{nj} \mid c_{nj} \in \mathcal{I}(C_n) \wedge \exists\, c_{(n-1)i} \in C_1.\cdots.C_{n-1} : \pi(c_{(n-1)i})(a_{(n-1)k}) = c_{nj} \,\}$$

Naive Approach

As we have seen, queries are formulated in an associative fashion via the dot operator. The LHS of a set expression defines the context in which the boolean conditions on the RHS are to be evaluated. These boolean conditions are also formulated with the dot operator. Therefore, it seems reasonable to investigate the semantics of the dot operator and a means to evaluate it. We first give a simple example and an obvious operational meaning for a set expression.

Let A, B, C, D, E, F, G, H be class names. The dot operator is said to be meaningful for A.B if and only if there exists an attribute in A whose domain is B or a subtype of B (as was formalized above). Now consider the query

    { A.B.C.D.E | C.G = E.H } ≡ { A.B.C.D.E | A.B.C.G = A.B.C.D.E.H }

This query can be evaluated as follows:

    result = null
    for each a ∈ A
      for each b ∈ B
        for each c ∈ C
          for each d ∈ D
            for each e ∈ E
              for each h ∈ H

                if ((a.b).c).g = ((((a.b).c).d).e).h then result = union(result, e)

Note that (a.b) is similar to the usual record selection operator, except for the implicit assumption that there exists an attribute in class A whose type is B. The parentheses define the order of evaluation. For example, if the current object in A is $a_i$ and $A_k$ is the attribute label in question, then $a.b = \pi(a_i)(A_k)$, where $\pi$ is the usual record selection function.

However, as mentioned earlier, the only way to override the scope of an identifier within a set expression (and therefore a context) is to use prev. For example, consider the following:

    { A.B.C.D.E | C.G = prev.E.H } ≡ { A.B.C.D.E | A.B.C.G = E.H }

This query can be evaluated as follows:

    result = null
    for each a ∈ A
      for each b ∈ B
        for each c ∈ C
          for each d ∈ D
            for each e ∈ E
              for each h ∈ H
                if ((a.b).c).g = e.h then result = union(result, e)

Algebraic Approach

As we have seen, the query language essentially consists of dot expressions, which form the context on the LHS, and selection conditions on the RHS of the vertical bar. However, it is possible to evaluate these queries using extended algebraic operators […]. (Footnote: The actual definitions in Shaw and Zdonik […] are slightly different, but we are using a simpler notation for the sake of clarity.) Thus the compiler can exploit existing query optimization techniques. For example, the first example can be transformed by the compiler to the following form:

    T1 ← ⋈(Section, Grad) where Grad in Section.enrollment
    T2 ← π_{Section.oid, Grad.name}(T1)
    T3 ← ⋈(TA, T2) where TA.teaches in T2.Section.oid
    π_{TA.name, Grad.name}(T3)

Similarly, { (RA.salary) | RA.advisor.Faculty.name = 'Smith' } can be transformed to

    π_{salary}(σ(RA)) where RA.advisor in π_{id}(σ_{name = 'Smith'}(Faculty))

An algebraic formulation can also be used to define a dataflow implementation of the query processor. Since Voltaire expressions are set-oriented, a parallel implementation is possible:

    { <Dot_expr> | <Bool1> and <Bool2> } ≡ { <Dot_expr> | <Bool1> } ∩ { <Dot_expr> | <Bool2> }
    { <Dot_expr> | <Bool1> or <Bool2> } ≡ { <Dot_expr> | <Bool1> } ∪ { <Dot_expr> | <Bool2> }

In general, it is possible to show that the dot operator and boolean conditions can be reduced to a small set of algebraic operators, as described in the literature […].
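The two equivalences for `and` and `or` can be checked on a small example. The Python sketch below is an illustration under assumptions, not the Voltaire query processor: an extent is modelled as a dictionary from object identifiers to records, `select` is a hypothetical helper, and the data is invented. The set expressions are compared at the level of selected instances (object identifiers); projecting an attribute such as salary would happen after the sets are combined.

```python
from typing import Callable, Dict, Set

Extent = Dict[str, dict]  # object identifier -> attribute record


def select(extent: Extent, cond: Callable[[dict], bool]) -> Set[str]:
    """{ C | cond }: identifiers of the instances of C that satisfy cond."""
    return {oid for oid, rec in extent.items() if cond(rec)}


def by_smith(rec: dict) -> bool:
    return rec["advisor"] == "Smith"


def in_cs(rec: dict) -> bool:
    return rec["dept"] == "CS"


# Hypothetical extent of the class RA.
ra = {
    "ra1": {"salary": 900, "advisor": "Smith", "dept": "CS"},
    "ra2": {"salary": 950, "advisor": "Smith", "dept": "EE"},
    "ra3": {"salary": 900, "advisor": "Jones", "dept": "CS"},
}

both = select(ra, lambda r: by_smith(r) and in_cs(r))
either = select(ra, lambda r: by_smith(r) or in_cs(r))

# Each operand of the intersection/union can be evaluated independently.
assert both == select(ra, by_smith) & select(ra, in_cs)
assert either == select(ra, by_smith) | select(ra, in_cs)
print(sorted(both), sorted(either))
```

Because the two operand sets do not depend on each other, they could be computed by separate workers, which is the sense in which a set-oriented formulation admits a parallel implementation.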

CHAPTER […]
CONSTRAINT SPECIFICATION

Automatic integrity enforcement is a nontrivial problem […]. For example, when the consequent of a rule results in a database update operation, detecting possible infinite regression due to update propagation simply adds to the complexity. Another problem is that of maintaining cross-references. For example, suppose a rule states that every graduate must have an advisor. If a certain faculty member who advises three graduate students leaves the university (and is therefore deleted from the database), then these three instances of graduate students will be in an inconsistent state. Automatic update propagation may be dangerous, since we certainly would not like to delete the three graduate students merely because their advisor left. A better way to deal with such situations is to introduce an elaborate exception handling mechanism. Thus, we can state an exception to the above rule such that the graduate students in question must find another advisor within three months from the time the faculty member was deleted. (Footnote: This is precisely the view taken in Jagadish […].) Exception handling and active database management are outside the scope of this dissertation.

There are two important characteristics about constraint management in Voltaire:

    1. unlike most other constraint languages, the order in which constraints appear is significant (reasons for this will be clear only after chapter […]), and
    2. since the execution model is lazy (as derived attributes are computed on demand only) and the effects of modify are only local, the user can never access inconsistent data in the persistent store.

This is because an instance can belong to a given class if and only if it satisfies all the constraints specified in the definition of that class. (Footnote: This means that if a class can be found such that its constraints are satisfied by the instance in question, then the class of this instance can be automatically inferred.) Lazy evaluation implies that constraints in Voltaire are automatically triggered whenever a new instance is created or an existing instance is modified.

Basic Structure of Constraints

The basic structure of constraints is as shown below:

    <B>    ::= <B1> ; <B2> | <Bool> | <Comm>
    <Comm> ::= <Comm1>
             | if <Bool> then <B> endif
             | if <Bool> then <B1> else <B2> endif
    <Bool> ::= <E1> = <E2>

It is important to note that the antecedent of a constraint is structurally and semantically identical to the selection (i.e., boolean) conditions which form the RHS of the vertical bar in a set expression. The consequent of a constraint can also be a boolean condition, in which case satisfiability is computed. However, when the consequent contains the equality operator, two possibilities arise. If both the RHS and LHS are bound, then satisfiability is checked. If the LHS of the equality operator is unbound, then a binding takes place. That is, the equality operator is overloaded. Further, when a constraint does not have an antecedent (as in rules 1 and 2 in Student below), it behaves like an equational constraint (which must be satisfied in one direction only). We now look at a few examples.

Examples

Constraints on the class Student:

    1. Student.totalwork = Student.totalcredit + Student.jobhours

    2. Student.leisuretime = […] - Student.totalwork
    3. Student.leisuretime […]
    4. if Student.visa_status = 'F1' then Student.jobhours <= […]

Rule 2 specifies how to compute the leisure time of a student, whereas rule 3 places a bound on the possible values that a student's leisure time can have. When a new instance of class Student is created, the total work may not be known. Therefore, before the value of leisure time can be computed, rule 1 must be triggered. When the value for the total number of credit hours for which a student may be registered, or job hours, is modified, rules […] and […] are triggered. Rule 4 states that for all students whose visa status is F1, they will not be allowed to work for more than […] hours.

Since all of the above constraints are attached to the single class Student, there is no need to repeat the class name. For example, rule 1 could be rewritten as

    totalwork = totalcredit + jobhours

with an implicit self operator prepended to each attribute. The self operator keeps track of the specific instance in question at all times during the state of a computation. Consider a program segment where a new instance of class Grad is created:

    jim ← { new.Grad | ss = […] and name = 'jim brown' and ... and totalcredit = […] and jobhours = […] }

Before this instance can be placed in the persistent store, domain and other constraints must be checked. Since a new instance is being created, attributes occurring on the RHS of the vertical bar are bound to their corresponding values. Rules […] and […] are now triggered. The first two rules result in the computation of totalwork

and leisuretime. Rule 3 checks the condition leisuretime […] hours, which is satisfied in our example. Suppose that nothing is mentioned about visa_status when the instance is being created. If the domain constraints of that attribute allow a null value, then rule 4 is ignored; else an error condition is reported.

Suppose a modify command is issued where Jim's leisure time is updated to a new value. This would trigger rules 2 and 3. Rule 2 is an equational constraint on the relationship between leisuretime, totalcredit and jobhours. Thus, if a new value for leisuretime does not satisfy rule 2, then an error condition is reported even though rule 3 may be satisfied. Integrity enforcement in this situation is not possible due to the inherent nondeterminism. However, any update to totalcredit or jobhours is propagated in the obvious way.

Constraints on the class Grad:

    1. if exists Grad.thesis_option then exists Grad.advisor and Grad.advisor in Grad.committee
    2. for all Grad.section.course.c | c >= […]
    3. if Grad.status = 'fulltime' then Grad.totalcredit >= […]
    4. if coursework = 'done' and thesis_status = 'defended' and count { committee.Faculty | Faculty.Dept includes self.Dept } […] then degree_req = 'fulfilled'

In the consequent of rule 1, we need an existential quantifier because if Grad.advisor evaluates to a null set, then it would be trivially contained in Grad.committee, which is not the intended semantics. Rule 2 states that all the course numbers taken by any graduate student must be of level […] or greater. Rule 3 states that

all graduate students attending school full time must register for at least […] credit hours.

Null Values and Exceptions

Information is not always available when a new record or instance is being created. This means that there may be a number of attributes of the instance in question with null values. These instances are nevertheless useful, since they contain at least partial information about some real world entity. Dealing with the issue of null values involves certain compromises, since it conflicts with the following fact. Null values may violate the structural and/or behavioral constraints of the class (or type) to which the instance belongs. Thus, loading a database with null values may jeopardize the safeness in a type system, and the user may thereby encounter run-time errors. These errors could otherwise have been detected when the database was being loaded. We have chosen a compromise in which:

    - The value null can be coerced to belong to any type. (Footnote: Actually, the type of null is Nil, and Nil is always a subtype of any other type that is defined in the type scheme.) Thus the structural constraints of a type need not be violated.
    - It is very likely that the behavioral constraints can be violated due to the presence of null values (i.e., the absence of information). But since we have adopted a lazy evaluation mode, derived attributes are not computed until actually requested. Thus the user will not receive inconsistent instances as a part of the result of a query.

Another way that the user can deal with null values is by defining constraints with the help of the exists operator. Suppose that most graduate students must have advisors, though not all of them may have one (probably because the student

has not yet found a suitable advisor). Further, if the student does have an advisor, then the advisor must belong to the same department as the student. This constraint can be modeled as follows:

    if exists Grad.advisor then Grad.dept = Grad.advisor.dept

By defining this rule, an instance of the class Grad can have a null value in its advisor attribute, and at the same time not violate a behavioral constraint. Further, this can also be used as a means to deal with simple exceptions, thus avoiding a proliferation of subclasses such as Grad_with_advisor and Grad_without_advisor, whose superclass is Grad. As another example, suppose that every graduate student must register for at least […] credits, except Joe, who is allowed to register for any number of credits. This can be modeled as follows:

    if Grad.name ≠ 'Joe' then Grad.credit_hours >= […]

Constraint specification is very similar to what is found in most other systems, except that the order in which the constraints appear is significant. We have shown that it is possible to bootstrap the constraint specification sublanguage on top of the query sublanguage. We also show how to exploit null values to deal with incomplete information and exceptions. Constraints in Voltaire get triggered whenever an instance is created or modified. Further, functions are computed as the result of integrity enforcement, as we shall see in the next chapter.
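The behavior described in this chapter (rules evaluated in the order written, with an overloaded equality that binds an unbound attribute and checks a bound one) can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the Voltaire implementation: instances are plain dictionaries, `equate`, `require` and `enforce_student` are hypothetical helpers, and the three numeric constants are invented because the values used in the Student rules are not legible in the source.

```python
class ConstraintViolation(Exception):
    """Raised when an instance contradicts one of the rules of its class."""


# Assumed values only; the constants of the Student rules are not preserved.
HOURS_PER_WEEK = 168
MIN_LEISURE = 60
MAX_F1_JOBHOURS = 20


def equate(inst: dict, attr: str, value) -> None:
    """Overloaded equality: bind attr if it is unbound, otherwise check it."""
    if inst.get(attr) is None:
        inst[attr] = value
    elif inst[attr] != value:
        raise ConstraintViolation(f"{attr} = {inst[attr]}, expected {value}")


def require(cond: bool, rule: str) -> None:
    """A purely boolean consequent: only satisfiability is checked."""
    if not cond:
        raise ConstraintViolation(rule)


def enforce_student(s: dict) -> dict:
    """Evaluate the Student rules in the order in which they are written."""
    equate(s, "totalwork", s["totalcredit"] + s["jobhours"])     # rule 1
    equate(s, "leisuretime", HOURS_PER_WEEK - s["totalwork"])    # rule 2
    require(s["leisuretime"] >= MIN_LEISURE, "rule 3")           # rule 3
    if s.get("visa_status") == "F1":                             # rule 4
        require(s["jobhours"] <= MAX_F1_JOBHOURS, "rule 4")
    return s


jim = enforce_student({"name": "jim brown", "totalcredit": 12, "jobhours": 20})
print(jim["totalwork"], jim["leisuretime"])

# A modify that sets leisuretime directly is checked against rule 2.
try:
    enforce_student({**jim, "leisuretime": 100})
except ConstraintViolation as err:
    print("rejected:", err)
```

Swapping the first two calls in `enforce_student` would fail on a new instance, because rule 2 needs the binding produced by rule 1; this is one way to see why the order of constraints is significant. In this sketch a null `visa_status` simply makes the antecedent of rule 4 false, which corresponds to the case where the domain of that attribute allows null.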

CHAPTER […]
FUNCTION SPECIFICATION

Traditionally, in the database world, a function or application is implemented in a host language with embedded DML statements. This application is then executed independently of the DBMS, under the control of the operating system. Thus, the DBMS only knows of a transaction defined by a block of DML statements, and has no way of knowing whether an application as a whole will succeed or not. This may cause run-time aborts, which are expensive to handle. In contrast, the application is executed under the control of a central transaction manager within a DBPL, and the application is implemented as a function (or method, in object-oriented database systems). However, the problem of defining a transaction is still an area of ongoing research. In a DBPL, a function is expected to be compiled into a transaction sublanguage, which gets executed each time a function is to be evaluated at a higher level. However, such issues are outside the scope of this dissertation. Here we shall merely concern ourselves with the evaluation and semantics of function specification in Voltaire.

Functions in Voltaire rely heavily on the dot operator for associative access, and set expressions for computing denotable values. A function is specified as a sequence of constraints or commands, in a manner similar to the imperative paradigm. That is, each command is executed sequentially. Further, the user can write programs without worrying about using different operators for persistent and non-persistent objects. For example, the new operator creates a location for an instance of a given class, and returns a denotable value of domain Ref. Consider the expression

    s = { new.c | a = v }

If s is not persistent within the context of evaluation, it is bound to a denotable value belonging to the domain Ref. On the other hand, if s is a persistent value within the context of evaluation, then it gets bound to a denotable value belonging to the domain Ref in the run-time environment, and also gets reflected in the persistent store. In either case, the symbol s provides a consistent handle to the value referenced by it in the run-time environment. Similarly, if the modify operator is applied to a non-persistent object, its effect is made available only to the run-time environment, whereas if it is applied to a persistent object, its effect is reflected in the persistent store (i.e., database) as well as the run-time environment.

We now examine the basic structure of a Voltaire function with the help of a simple factorial example, followed by a database example.

Basic Structure of a Function

Function specification can be thought of as a set of rules or constraints defining the relationship between its input and output parameters. Thus, by extending the constraint sublanguage to include a few additional constructs, we can write an arbitrary function in Voltaire.

    <Comm>       ::= <Comm1> | <Assignment> | <Loop> | <dml_ops> | <io>
    <Assignment> ::= <dot_expr> = <set_expr>
    <Loop>       ::= <Iterator> | <While>
    <Iterator>   ::= for each <I> in <set_expr> do <B> enddo
    <While>      ::= while <Bool> do <B> enddo
    <io>         ::= <open> | <close> | <print> | <read>

Additionally, functions can have an extent which is persistent, or a function call (via dot expressions) may result in the non-persistent creation of instance(s)

of that function for the duration of a computation. These temporary instances form the backbone of the execution model of a Voltaire program, in which functions and classes are treated uniformly, and function evaluation is the result of integrity enforcement. We first elaborate with the help of a simple example.

    class Fact function
      attributes
        n : integer
        f : integer
      constraints
        if n = 0 then f = 1
        if n > 0 then f = n × { Fact.f | Fact.n = prev.n - 1 }

The function Fact has two parameters, namely n and f; looked upon as a class, it has two (corresponding) attributes. The left hand side of the '|' operator defines the context within which the right hand side is evaluated. Thus, n refers to the attribute value of a new copy of Fact and is bound to prev.n - 1 (where prev.n is bound to that value of n immediately outside the set expression). For example, we can obtain the factorial of […] by issuing the following command in the Voltaire environment:

    eval { Fact.f | Fact.n = […] }

When the function is initially invoked, n is bound to the value […], while f is unbound. The expression prev.n then refers to the value of n that is immediately outside the set expression, namely […]. Thus prev.n - 1 denotes the value […]. Also, the equality operator is overloaded such that when the LHS is initially unbound, it gets bound to the RHS value; when the LHS is initially bound, satisfiability is computed. The attribute f remains unbound until the recursion begins to unwind. Additionally, there is an implicit coercion on the set expression to an object of type integer due to the semantics of the × operator. Since

one operand is an integer and the other is a set of integers (due to the set expression), coercion is necessary for the proper evaluation of the × operator.

It must be noted that the set expression can also be construed as a query. For example, the subexpression { Fact.f | Fact.n = prev.n } also means "retrieve all objects of class Fact such that Fact.n is the same as n for some other instance of class Fact." Thus, if there were a database consisting of instances of class Fact (i.e., value pairs of n and f), then a query asking for the factorial of […] could result in a simple lookup. Alternately, the same subexpression can be interpreted as "compute the result of function Fact given the value of n" (i.e., function call). This is because classes and functions are treated uniformly in Voltaire.

Aggregate operators such as sum are provided as a convenience, but it is easy to write such a function in Voltaire, as shown below:

    class Sum function
      attributes
        operand : list integer
        result : integer
      constraints
        result = head.operand + { Sum.result | Sum.operand = tail.prev.operand }

While the above program is similar to the factorial function, it would have been more efficient to have written it as follows:

    for each x in operand do
      { modify.Sum | result = prev.result + x }
    enddo

A Database Example

In order to compare the expressive power of various DBPLs, a task list has been described in […]. Here we show how some of these tasks can be performed in Voltaire. The first task is to be able to describe a fragment of a manufacturing company's parts

inventory. Among other things, the database represents the way certain parts are manufactured out of other parts: the subparts that are involved in the manufacture of other parts, the cost of manufacturing a part from its subparts, and the mass increment that occurs when the subparts are assembled. The manufactured parts themselves may be subparts in a further manufacturing process, thus representing an aggregation hierarchy. In addition, the part name, its supplier and its purchase cost are also maintained in the database. A partial Voltaire schema for this database is shown below:

    class Part
      superclasses Any
      subclasses Basepart, Compositepart
      attributes
        name : string
        used_in : Compositepart

    class Compositepart
      superclasses Part
      subclasses nil
      attributes
        assemblycost : integer
        massincrement : integer
        uses : set Use

    class Basepart
      superclasses Part
      subclasses nil
      attributes
        cost : integer
        mass : integer
        supplied_by : Supplier

    class Use
      superclasses Any
      subclasses nil
      attributes
        component : Part
        assembly : Compositepart
        quantity : integer

The second task is to write a program to print the names, cost and mass of all base parts that cost more than […] dollars. This can be achieved by writing a simple query, namely

    { Basepart [name, cost, mass] | cost > […] }

The next task is to compute and print the total cost of a part, as shown below. This task defeats most query languages because it requires the computation of transitive closure over the parts hierarchy in the database. To compute the cost of a pump,

we simply invoke the function as follows:

    { ComputeCost.resultcost | partname = 'pump' }

    class ComputeCost function
      attributes
        partname : string
        resultcost : integer
      transients
        p : Part
        el_cost : integer
        subcosts : list integer
      constraints
        p = { Part | name = partname }
        if p in Basepart then
          resultcost = p.cost
        else
          for each y in p.uses.component do
            el_cost = p.uses.quantity × { ComputeCost.resultcost | partname = y.partname }
            { modify.subcosts | head.subcosts = el_cost and tail.subcosts = prev.subcosts }
          enddo
          resultcost = p.assemblycost + { sum.subcosts }
        endif

The keyword transients denotes temporary attributes, which have the same semantics as regular attributes except that they are not persistent. Transient attributes do not reflect the final state of a computation, but merely facilitate a more efficient evaluation of a function. Therefore, they can be seen to behave like local variables. The first statement assigns the object identifier of that instance of Part, referenced via its name, to the transient variable p. In the second statement there is an iterator which has two commands. The first one makes a recursive call to the function to descend the aggregation hierarchy, and temporarily stores the cost of an element in el_cost. The second command needs more elaboration. As the recursion unfolds, the

element costs are collected in the list subcosts. The effect of the modify operator is similar to subcosts = append(el_cost, subcosts). However, since subcosts is a temporary attribute, it merely refers to some object (here a list of integers) in virtual memory. Therefore, the effects of modify will be limited only to virtual memory. On the other hand, if the RHS of the vertical bar referred to some persistent objects, then modify would appropriately make changes in the persistent store. Also, if we had used the function Sum defined in the previous section, then the last command in the ComputeCost function would have been written as

    resultcost = p.assemblycost + { Sum.result | operand = subcosts }

Temporary Instance Creation

Let us recapitulate some features of Voltaire. We began with the premise that certain database and programming capabilities must be incorporated within a uniform framework. We chose integrity enforcement as that unifying framework. The main reason why functions can also be computed is that the execution model treats the constraints as a sequence of statements to be evaluated in the order in which they appear. (Footnote: We now see why the order in which constraints appeared in the classes Grad and Student was important.) In fact, these expressions have a semantics in which new bindings are passed on to the next expression to be evaluated. It is a direct consequence of this execution model that classes and functions can truly be equivalent. (Footnote: Manuel Bermudez suggested collectively calling them clunctions.) This equivalence was important because we insisted that the query language be able to reference classes and make function calls with the same syntax and semantics. The inability of a query language to uniformly access classes and functions causes various paradigm mismatch problems […]. Typically, query languages allow function calls via ad

hoc trigger mechanisms, something we wish to avoid since it would create problems in defining and executing a transaction. Now consider the set expression

    { Student.total_hours | ss = […] and name = 'john' and ... }

When such a program segment is encountered, the evaluation function will first search for an instance existing in the database. If the search fails, it will then attempt to create a temporary instance, which must satisfy all the constraints in the definition of class Student. Effectively, this failure is a function call. The semantics of such an expression can be construed to denote the value for total_hours of a hypothetical student that satisfies the bindings on the RHS of the '|' operator. This might be useful in a context where (in the ensuing program sequence) this temporary instance is to be made persistent if, say, total_hours evaluates to greater than […]:

    x = { Student | ss = […] and name = 'john' and ... }
    if x.totalwork > […] then { new.Student | x }

The first statement results in a binding. The identifier x is bound to a reference (unique identifier) to an instance of class Student. As mentioned earlier, if 'john' does not exist in the database, then the set expression results in a function call, and x is bound to a reference to a temporary instance. This temporary instance must satisfy all constraints of class Student, and all derived attributes are also computed. If, for this instance, the condition on totalwork holds true, then this instance is made persistent by using the new operator. In this way, a temporary instance can be made persistent. Thus, temporary instance creation forms the backbone of our execution model, which allows us to give an equivalent semantics to classes and functions.

A Model of Inheritance for Classes and Functions

A problem with equivalence of classes and functions is that we now have to understand what the notion of subclass (or subfunction) means. The subclass relationship

can be defined as follows. Let f, g be two classes and $E[f]$, $E[g]$ denote their respective extensions. Then f is said to be a subclass of g iff $E[f] \subseteq E[g]$. Such extensional semantics have been defined for term subsumption languages […]. However, the subclass (or subsumption) relationship is computable by performing a structural analysis of the class taxonomy. Such analysis is based on a set of inference rules for computing subsumption. For example, CANDIDE […] is a carefully constrained language in which the subclass relationship (called subsumption) is decidable […] […], and its complexity is at least co-NP-hard […]. (Footnote: Computing the subsumption relationship is not decidable for all term subsumption languages, most notably KL-ONE […].) But this is clearly an undecidable proposition in Voltaire, because we allow arbitrary constraints to be specified in the class (and function) definition.

Our proposed solution is based on the realization that we are primarily interested in only those values that exist in the persistent store (i.e., database), as opposed to the possibly infinite set of instances that may belong to a given class. (Footnote: This ontologic nature of databases is in stark contrast to the role of persistent types played in programming languages.) Additionally, we are also interested in instances temporarily created within the context of some program. Note that a class can be viewed to have base attributes and derived attributes, while in a function the input parameters are like base attributes and output parameters are like derived attributes. Thus, the proposition that an instance is indeed a member of a function (or class) is decidable iff the function terminates for a given input (though termination is still undecidable). Further, if such class membership is computable (for each instance of a given class) in the persistent store, then the subclass relationship is also computable. (Footnote: Since any instance of f must also satisfy the constraints of its superclass g due to inheritance, mutual inconsistency will be detected at least for those instances existing in the store. Additionally, this model will work in cases where the domain of an attribute is a function.)
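The extensional reading of the subclass relationship can be sketched as a check over the instances that actually exist in the store. The Python fragment below is an illustration under assumptions: constraints are modelled as boolean functions that are assumed to terminate, instances as dictionaries, and the names `is_member`, `is_subclass` and the sample rules and data are hypothetical.

```python
from typing import Callable, Dict, Iterable, List

Instance = Dict[str, object]
Constraint = Callable[[Instance], bool]


def is_member(inst: Instance, constraints: List[Constraint]) -> bool:
    """An instance belongs to a class iff it satisfies all of its constraints."""
    return all(check(inst) for check in constraints)


def is_subclass(extent_f: Iterable[Instance],
                constraints_g: List[Constraint]) -> bool:
    """f is a subclass of g, relative to the store, iff every stored
    instance of f is also a member of g."""
    return all(is_member(inst, constraints_g) for inst in extent_f)


# Hypothetical constraints of a superclass and stored instances of a subclass.
student_rules: List[Constraint] = [
    lambda s: s["totalcredit"] >= 0,
    lambda s: s["jobhours"] <= 40,
]
grad_store = [
    {"name": "ann", "totalcredit": 9, "jobhours": 20},
    {"name": "bob", "totalcredit": 12, "jobhours": 10},
]

print(is_subclass(grad_store, student_rules))
```

The answer is relative to the current contents of the store: adding an instance with, say, `jobhours` of 60 to `grad_store` would make the same check fail, which is how mutual inconsistency between a class and its superclass is detected for stored instances.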

Let f be a function and $E[f]$ be its extension. Based on our above discussion, the extension is a finite set in the store. However, the notion of temporary instance creation provides us with a means to make arbitrary computations. Thus, there are no restrictions on what values may be persistent (as is often the case in many DBPLs); i.e., a function can also have instances in the persistent store, just like any other class. The keyword function serves only one purpose, namely, that the class (or function) in question is precluded from participating in the class taxonomy. This is because we do not know what a taxonomy of functions might mean.

The above model for inheritance is different from those described in […], because we provide an extensional account of inheritance rather than an intensional one. Since the subclass relationship can be computed based on the above approach, the main argument against it would be a combinatorial explosion. However, coupled with our execution model, it conceptually provides a methodology to deal with the problem of procedural attachments in frame-based languages. As mentioned earlier, this approach should be contrasted with term subsumption languages. However, we can still use the same classification algorithm to build a taxonomy of functions. The ability to define a taxonomy of functions might be of use in functional abstractions used in simulation applications.

Equality, Assignment and Modify

It is very important to be able to define equality between expressions in a programming language. We have already seen equality in chapter […] for objects, and we have seen in chapters […] and […] how equality is overloaded. This issue is made poignant in section […], where we discuss how the notion of temporary instance creation allows us to give an operational equivalence to the semantics of a class and function. Equality is different from the assignment and modify operators in the sense that it is not

destructive. The assignment and modify operators have very similar semantics; actually, the assignment operator is syntactic sugar for modify. For example, let i be an instance and $a_j$ its attributes. Then

    { modify.i | a_1 = v_1 and ... and a_n = v_n }

is equivalent to a sequence of assignments i.a_1 = v_1; ...; i.a_n = v_n. From an implementation viewpoint, the modify operation would be less expensive to compile than the sequence of assignments, because the context (that is, the LHS) is evaluated only once in the former case, while it would be evaluated n times in the latter. Consider another example:

    s = x ≡ { modify.s | self = x }
    o.a = v ≡ { modify.o | a = v }

The LHS of an assignment must denote an attribute name, and the expression on the RHS must be of the same type as the type of the attribute on the LHS. If s in s = x refers to a non-persistent value (such as a transient attribute), then only the run-time environment is updated. On the other hand, if s refers to a persistent value, then the database (that is, persistent store) as well as the run-time environment are updated.

Scope of Identifiers

We have already examined the scope of identifiers in a set expression in chapter […]. We saw that the context of a set expression determines the scope of identifiers. The only way to override the scope imposed by the context is to use the prev operator. To understand the scope of identifiers when they occur in a function definition, we first need to understand how the user interacts with Voltaire. While details of such interactions are deferred to section […], we briefly introduce the eval command here. Given that a user has loaded some database and a corresponding schema into the Voltaire environment, she can issue various commands. The eval command takes a set expression as an argument and evaluates it against the currently active database. Recall that functions are triggered via set expressions. For example, to compute

the factorial of […], the user would say

    eval { fact.f | n = […] }

or the cost of a pump can be computed by issuing the command

    eval { ComputeCost.resultcost | partname = 'pump' }

This is known as the outermost layer of evaluation.

When a function is triggered by a set expression from the outermost layer of evaluation, it is passed an initial environment, which consists of the identifiers bound to their respective values on the RHS of the set expression. Other attributes (or parameters) of the function are bound as the computation progresses. The database schema is treated as a global declaration. It is useful to think of the database as an environment which maps classes to instances. Thus the context of any set expression is now decided with respect to this global environment (i.e., the database) and the local environment. (Footnote: Apropos, it should be clear that the context is decided with respect to the global environment, or database, for all the examples of chapter […].) When computing the value of an identifier, the values in the local environment take precedence.

Once we have moved from the Voltaire environment to an inner level of computation, the run-time environment looks much different, due to the notion of temporary instance creation and the prev operator. The run-time environment is

    Renv = Self × Cenv × Penv

where Self denotes the currently active record, Cenv denotes the currently active environment, and Penv denotes the calling (or previous) environment. Further,

    Self = Cenv = Penv = Env = Id → Denotable_Value

Self essentially maintains a copy of the currently active record, against which the self operator is evaluated. This is required when a query is being evaluated within a function call. For example, consider { Person | age […] }. If the class Person has n instances and the ith instance is being evaluated, then Self is used to denote that instance. Any modification to the current environment is reflected in Self, though the reverse case is not true. Similarly, the prev operator is evaluated with respect to Penv. Cenv behaves in the usual manner. It must

PAGE 81

be noted that each time a set expression is encountered in the function body, it is evaluated with a new runtime environment. We do not allow dot expressions of the form prev.prev.identifier, since that would require the runtime environment to maintain information about all the previous environments, one for each level of nesting.

Function Composition

As mentioned earlier, Voltaire is a first order language. However, the extent of a function is a denotable value (which can also be persistent). Therefore, an element belonging to the extent of a function can be embedded in data structures, passed as a parameter, or returned as a value. Therefore, function names are valid identifiers in a dot expression. Thus, the dot operator also denotes function composition. For example, let f1 and f2 denote two functions, and i1, o1 and i2, o2 denote their respective attributes (input and output parameters). Then {f2.o2 | f1.i1 = v1 ∧ f2.i2 = f1.o1} is a valid expression and is equivalent to f2(f1(v1)). (Strictly speaking, the two expressions are equivalent after an implicit coercion, in the sense discussed below.) It should be expected that the subexpression … is valid if and only if f1 and f2 are isomorphisms. This means that even though … may have a denotable value, it does not imply that … will also have a denotable value, unless the two functions are isomorphisms. The reason why this is to be expected is that the extent of a function is exactly its graph. Further, the above set expression could also have been equivalently written as {f2.o2 | f2.i2 = {f1.o1 | f1.i1 = v1}}. Note that {f2 | f1.i1 = v1 ∧ f2.i2 = f1.o1} is not equivalent to f2(f1(v1)), since the set expression returns a reference to an instance of f2 rather than the value.
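The idea that a function's extent is its graph, and that composition can therefore be phrased as a query over two extents, can be illustrated outside Voltaire. The following Python sketch is only an illustration under stated assumptions: the helpers `extent`, `select` and `compose`, the record keys `"i"` and `"o"`, and the two sample functions are all hypothetical and are not part of the dissertation.

```python
def extent(fn, inputs):
    """Graph of fn restricted to `inputs`: one record per input value."""
    return [{"i": x, "o": fn(x)} for x in inputs]


def select(ext, attr, pred=lambda rec: True):
    """Evaluate {ext.attr | pred}: project attr over the records satisfying pred."""
    return {rec[attr] for rec in ext if pred(rec)}


def compose(ext1, ext2, v):
    """Evaluate {f2.o | f1.i = v and f2.i = f1.o} over the two extents."""
    middle = select(ext1, "o", lambda rec: rec["i"] == v)
    return select(ext2, "o", lambda rec: rec["i"] in middle)


# Hypothetical functions standing in for f1 and f2.
f1 = extent(lambda x: x + 1, range(5))
f2 = extent(lambda x: x * 10, range(10))

result = compose(f1, f2, 3)  # a singleton set, like the set expression in the text
```

The result is a set rather than a bare value, which mirrors the remark that the set expression and the applicative form agree only after an implicit coercion.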

PAGE 82

Thus, even though Voltaire has a first order syntax, an element belonging to the extent of a function can be embedded in data structures, passed as a parameter, or returned as a value.

It might be useful to list the various forms of the dot operator, all of which are mutually consistent:

- c.a denotes the set of values of the attribute a of class c, such that a is selected from each instance i ∈ c. This can equivalently denote function evaluation, as discussed in section ….
- f.o denotes the value of parameter o of a function f, which is the result of evaluating f. Again, this can equivalently denote set evaluation if f has a persistent extent, as discussed in section ….
- i.a denotes the usual field selection for records, if i is an instance (of a class or function) having the attribute a. There is one important difference, namely, in our case i.a will return a singleton set whose element is the value of a for i.

If s is an identifier of type τ, then s = i.a is legal, because there is an implicit coercion. If i.a evaluates to a singleton set with the element v, namely {v}, it is coerced to v (since {v} …). However, s = c.a can be valid if and only if c.a evaluates to a singleton set. Since this can be known only at runtime, it would limit the usefulness of any static type checking. Therefore, we impose the restriction that the above expression is valid if and only if s has type {τ}. The rule for f.o is similar to that of i.a.
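The three forms of the dot operator and the singleton coercion can be sketched in Python. This is a minimal sketch, assuming instances are plain dictionaries; the names `dot_class`, `dot_instance`, `coerce` and `CoercionError` are hypothetical. Voltaire imposes the restriction statically, whereas this sketch can only check it at run time.

```python
class CoercionError(TypeError):
    """Raised when a non-singleton set is used where a single value is needed."""


def dot_class(instances, attr):
    """c.a: the set of values of attr taken over every instance of class c."""
    return {inst[attr] for inst in instances}


def dot_instance(inst, attr):
    """i.a: field selection on one instance, returned as a singleton set."""
    return {inst[attr]}


def coerce(value_set):
    """Implicit coercion {v} -> v; any other cardinality is rejected."""
    if len(value_set) != 1:
        raise CoercionError("expected a singleton set, got %d elements" % len(value_set))
    (value,) = value_set
    return value


# Hypothetical instances of a class Person.
persons = [{"name": "Ann", "age": 30}, {"name": "Bob", "age": 41}]

ages = dot_class(persons, "age")                # set-valued: c.a
ann_age = coerce(dot_instance(persons[0], "age"))  # s = i.a is always coercible
```

Coercing `dot_class(persons, "age")` would fail here because the set has two elements, which is the situation the static restriction on s = c.a is meant to rule out.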

PAGE 83

CHAPTER …
THE VOLTAIRE ENVIRONMENT AND ITS SEMANTICS

Interacting with the Voltaire Environment

The user must first enter the Voltaire environment before a database is loaded and computations are made against it. At this level of evaluation, the environment is interactive: it prompts the user for input and reports the result of computations.

The user can begin making computations after loading a schema and a database by using the load_db command. If the schema and/or database do not exist, then the system returns a message warning the user that the schema and the database have been initialized to null, so that any computations other than new_c and new_i will fail.

The new_c command is used to create either a new class or a new function. This class is inserted in the schema at the appropriate place, and corresponding modifications are made in the database. For example, if a new class has superclass c_sup, then it is possible that some instances of c_sup may migrate to the new class. Effectively, this implies a coercion on the type of all instances that migrate from c_sup to the new class.

The new_i command is used to create new instances. The user should not specify the unique object identifier, since the system automatically assigns one to the new object being created. However, the user needs to specify the parent class(es) of the new instance along with all the attribute-value pairs. The system will then check if the new instance satisfies all the structural and behavioral constraints of each parent class. In order to ensure type safety, the type of each instance is verified at the time of creation, as well as when loading a given database with respect to a given schema.

PAGE 84

Once a populated database exists within the environment, various other computations can be made. The eval command is used to evaluate either a function or a query expression. The LHS of a set expression (which defines the context within which the rest of the expression is to be evaluated) can only refer to names defined in the schema. A single eval command suffices because classes and functions have an equivalent semantics. For example, consider {fact.f | n = …} and {Student.name | ss = …}. The result of a query is tabular. For example, the result of the query

{Dept[name].Course[title].Section[textbook] | Course.c … and Course.c …}

is a table which can be described as a set of objects such that each object has the type (name: string, {(title: string, textbook: {string})}) (given that a Department offers many courses, and that each course has many sections, each of which may follow different textbooks). The result of the factorial function would be the value ….

Since we have adopted a lazy evaluation mode for enforcing integrity constraints, it is possible that instances belonging to certain classes are modified and the database is then left in an inconsistent state. To find out which instances of a given class cause the database to be in an inconsistent state, one can use the check <classname> command. If the name of the class is specified as Any, then each and every class in the schema is checked to discover inconsistent instances. The result is displayed as an object graph (that is, a linear spantree) with a question mark indicating the source of trouble. For example, an instance i may have an attribute a0 which refers to an instance i* of another class (possibly through many levels of indirection). Now, if it is the case that i* is either nonexistent or inconsistent, then a question mark would appear.

PAGE 85

It is trivial to generate such a graph by computing the spantree of i, as discussed in section ….

An alternative form of this command is check <classname> <set_expr>. This command checks if the instances returned by the set expression are members of a given class (note that membership implies consistency). For example,

check Department {Student.advisor.Faculty.dept | Faculty.salary …}

will check only those instances returned by the set expression, rather than all instances of class Department, for consistency. Also, the resulting object graph will begin with an instance of class Department. This command is also useful in finding out nonmembers of a class. For example, check RA {TA} will result in a set of instances of RA that are not in TA. This information can then be used to coerce the type RA on instances of TA (this is legal since we support multiple inheritance).

The delete <set_expr> command is used to delete all instances returned as the result of evaluating the set expression. This delete operation should be used with caution, since it will blindly delete all objects returned by the set expression without regard for the consistency of the database. However, it is useful for deleting inconsistent objects identified by the check command. The semantics of this delete operator is identical to that when it appears in a function for the case of persistent objects.

Transcripts of a session (or a portion of the session) with Voltaire can be saved in a file by using the save command. The user can eventually quit a session, which has the effect of closing the database and returning to the operating system.

Since each command is considered as an atomic transaction, the effects of a successful execution are permanently reflected in the database. For example, if a function for increasing Faculty salaries by …% is executed by the eval command, then all instances of the class Faculty are updated upon successful execution of the function, and the updates will be reflected the next time the database is loaded.
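The membership test behind check <classname> <set_expr> can be sketched as a filter over the instances returned by the set expression. The Python below is only an illustration: the function `check`, the dictionary representation of instances, and the single membership rule for RA are assumptions made for the example, not Voltaire's actual constraint machinery.

```python
def check(constraints, candidates):
    """Split candidates into members and non-members of a class.

    A candidate counts as a member only if every constraint holds for it.
    """
    members, non_members = [], []
    for inst in candidates:
        satisfied = all(constraint(inst) for constraint in constraints)
        (members if satisfied else non_members).append(inst)
    return members, non_members


# Hypothetical membership rule for class RA: a non-empty project attribute.
ra_constraints = [lambda inst: bool(inst.get("project"))]

# Hypothetical result of evaluating the set expression {TA}.
ta_instances = [
    {"oid": 1, "name": "Ann", "project": "Voltaire"},
    {"oid": 2, "name": "Bob"},
]

members, non_members = check(ra_constraints, ta_instances)
```

Only the instances handed in are examined, which corresponds to checking the result of the set expression rather than the whole class extent.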

PAGE 86

A Denotational Semantics for Voltaire

In decreasing level of abstraction, there are three complementary methodologies for defining the semantics of a programming language, namely axiomatic, denotational, and operational semantics […]. The last method uses an interpreter to define a language. The meaning of a program is the evaluation history that the interpreter produces when it interprets the program. In the denotational semantics approach, a program is directly mapped to its meaning, called its denotation. A valuation function maps a program directly to its denotation, which is a mathematical value such as a number or function. With an axiomatic semantics, properties about language constructs are defined, expressed with axioms and inference rules from symbolic logic.

A denotational description of a programming language consists of an abstract syntax, a set of semantic domains along with their operators, and a valuation function. A semantic domain along with its set of operators is called a semantic algebra. Before the valuation function is defined, we must define appropriate semantic algebras for primitive domains such as numbers and boolean, compound domains such as sets, lists and records, and other complex domains such as runtime environments and memory stores. The valuation function takes an abstract syntax tree of the program and maps it onto its meaning with the help of these semantic algebras.

There are many styles of denotational semantics. Two important styles are direct and continuation semantics. Direct semantics definitions tend to use lower order expressions and emphasize the compositional structure of a language. For example, the equation

$$\mathbf{E}[\![E_1 + E_2]\!] = \lambda e.\; \mathbf{E}[\![E_1]\!]e \;\mathit{plus}\; \mathbf{E}[\![E_2]\!]e$$

gives a simple definition of side-effect free addition; that is, there is no notion of sequencing in this definition. Sequencing is an entirely operational notion. However, sequencing is an important control structure in all imperative languages. The semantic argument that models

PAGE 87

control is called a continuation. As an analogy, the activation record stack of a programming language translator contains the sequencing information that "drives" the evaluation of a program. Thus, the above example can be rewritten in the continuation style as follows:

$$\mathbf{E}[\![E_1 + E_2]\!] = \lambda e.\lambda k.\; \mathbf{E}[\![E_1]\!]\,e\,(\lambda n_1.\; \mathbf{E}[\![E_2]\!]\,e\,(\lambda n_2.\; k(n_1 \;\mathit{plus}\; n_2)))$$

where e is the runtime environment argument and k is the continuation or control argument. An important advantage of using a continuation is that abstractions in the semantic equations are nonstrict. This is because the continuation effectively captures the notion of "rest of the program" (in an expression-oriented language, the program is an expression); thus, the remainder of the program (denoted by k) is never reached when an infinite loop is encountered. Though it is often possible to show the equivalence (or more precisely, congruence) between a direct and continuation style semantics for a given language, it is difficult.

As discussed in chapter …, the definition of a transaction is still an area of ongoing research for object-based database languages. We believe that one effective way to study various possible definitions of a transaction is by defining a continuation style semantics for the language. The central idea is that a valuation function then maps a database program directly onto a transaction. One of the original targets of this research was to define a transaction with the help of a continuation semantics. While a concise continuation semantics to define transactions has managed to elude us, we have been partially successful in defining a direct semantics for Voltaire. The concrete syntax is defined in Appendix B, the abstract syntax is defined in Appendix C, and the denotational semantics is defined in Appendix D. We follow the notation found in […].
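The difference between the direct and continuation styles for addition can be made concrete with two small evaluators. The Python sketch below assumes a tuple-based expression representation (`("num", n)`, `("var", x)`, `("add", e1, e2)`) chosen purely for illustration; the function names `eval_direct` and `eval_cps` are hypothetical.

```python
def eval_direct(expr, env):
    """Direct style: the meaning of E1 + E2 is built from the two sub-meanings."""
    kind = expr[0]
    if kind == "num":
        return expr[1]
    if kind == "var":
        return env[expr[1]]
    if kind == "add":
        return eval_direct(expr[1], env) + eval_direct(expr[2], env)
    raise ValueError("unknown expression kind: %r" % (kind,))


def eval_cps(expr, env, k):
    """Continuation style: k receives the value and stands for the rest of the program."""
    kind = expr[0]
    if kind == "num":
        return k(expr[1])
    if kind == "var":
        return k(env[expr[1]])
    if kind == "add":
        # Evaluate E1, then E2, then pass the sum to the continuation k.
        return eval_cps(
            expr[1], env,
            lambda n1: eval_cps(expr[2], env, lambda n2: k(n1 + n2)),
        )
    raise ValueError("unknown expression kind: %r" % (kind,))


expr = ("add", ("num", 2), ("add", ("var", "x"), ("num", 4)))
env = {"x": 3}

direct_value = eval_direct(expr, env)
cps_value = eval_cps(expr, env, lambda n: n)  # identity as the initial continuation
```

In the continuation version the order in which the operands are evaluated is explicit in the nesting of the lambdas, which is the sequencing information the direct version leaves unspecified.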

PAGE 88

Implementation Strategy

Our implementation strategy is shown in Figure …. A Voltaire schema (consisting of class and function definitions) is first translated by a parser into an abstract syntax tree (AST). This AST is then analyzed by a semantic processor for consistency and possible optimization. If any syntax errors are detected, then they are reported to the user at this level. If there are no errors, then another abstract syntax tree (AST*) is generated. The runtime environment takes a request from the user and executes it with respect to AST*. Effectively, the runtime environment recursively walks the abstract syntax tree (AST*) to execute the user request.

The main advantage of this implementation strategy is that multiple optimization strategies may be pursued independently but in a coherent fashion. For example, the semantic processor can exploit different optimization strategies to convert AST to AST*, such as algebraic rewrites. Also, the runtime environment can exploit another set of optimizations in which access from the persistent store is more efficient. A single user request is treated as an atomic transaction.

If the user modifies the current schema in the middle of a session with the environment, then any such change must be reflected. Since the runtime environment will only reference (and therefore modify) AST*, there must be another mechanism to translate the changes made to AST* back into Voltaire code. Thus, when the user quits the environment, AST* is translated back into Voltaire code by the deparser.
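The pipeline described above (an AST rewritten into AST*, a runtime that walks AST*, and a deparser that turns AST* back into source text) can be sketched on a toy expression language. Everything in this Python sketch is an assumption made for illustration: the tuple representation, the single rewrite rule x + 0 = x, and the names `rewrite`, `run` and `deparse`. It does not reproduce Voltaire's actual data structures.

```python
def rewrite(ast):
    """Semantic-processor step: apply one algebraic rewrite, x + 0 -> x."""
    if ast[0] == "add":
        left, right = rewrite(ast[1]), rewrite(ast[2])
        if right == ("num", 0):
            return left
        if left == ("num", 0):
            return right
        return ("add", left, right)
    return ast


def run(ast, env):
    """Runtime step: recursively walk the (rewritten) tree to compute a value."""
    if ast[0] == "num":
        return ast[1]
    if ast[0] == "var":
        return env[ast[1]]
    if ast[0] == "add":
        return run(ast[1], env) + run(ast[2], env)
    raise ValueError("unknown node: %r" % (ast[0],))


def deparse(ast):
    """Deparser step: translate a tree back into source text."""
    if ast[0] == "num":
        return str(ast[1])
    if ast[0] == "var":
        return ast[1]
    if ast[0] == "add":
        return "%s + %s" % (deparse(ast[1]), deparse(ast[2]))
    raise ValueError("unknown node: %r" % (ast[0],))


ast = ("add", ("var", "x"), ("add", ("num", 0), ("num", 2)))
ast_star = rewrite(ast)
value = run(ast_star, {"x": 5})
source = deparse(ast_star)
```

Keeping the rewrite step separate from the tree walk is what lets the two kinds of optimization mentioned in the text be developed independently.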

PAGE 89

[Figure …: Implementation of Voltaire. Labels recoverable from the figure: Semantic Processor, Run-Time Environment, User.]

PAGE 90

CHAPTER …
CONCLUSIONS AND FUTURE RESEARCH

In this dissertation, we have described the syntax and semantics of the Voltaire database programming language. Unlike most other languages, Voltaire has a single execution model for evaluating queries, satisfying constraints and computing functions. Such a design also facilitates a bootstrapped implementation. We believe that it is a suitable language for data intensive programming. A prototype implementation is currently being completed. The main contributions of this dissertation are as follows:

- We have described a set-oriented, imperative database programming language called Voltaire.
- We have described a data definition facility which facilitates sharing of data and manipulation of heterogeneous sets, and in which persistence is a property of the instances (rather than classes or types). The system provides transparency between persistent and transient objects by defining a single set of operators for both kinds of objects.
- We have designed the language in an additive or bootstrapping fashion. We have discussed how the notion of temporary instance creation allows us to give an equivalent semantics to classes and functions, which seemed necessary to have a single model of execution for querying, enforcing integrity and computing functions.

PAGE 91

- We have given a formal definition to the object model of Voltaire which accounts for behavior as well as the extent of a type. Thus, it provides a uniform semantics for the persistent store (i.e., the database) and the runtime environment by making it possible to statically type check expressions. We have also given a partial denotational semantics defining the main features of Voltaire.

While the fact that the sequential order of constraints is significant may be considered as a limitation, we placed that restriction to avoid traditional computational overhead associated with constraints. Also, we can now compute a function which consists of evaluating or satisfying a sequence of constraints. Since functions and classes are equivalent, they can be thought of as views (and likewise, the output parameters of the function as derived attributes). The values of derived attributes are not stored but are computed only upon demand. This adds to runtime overhead but guarantees that the user will always obtain correct results.

While our type system has certain useful properties, the type expressions are not as powerful as in, say, Machiavelli. For example, we have not considered variant records; polymorphism is ad hoc in terms of operator overloading, implicit coercion and inheritance. It is an open question whether we can define a static type discipline that is truly polymorphic but also supports sharing of heterogeneous data. Advanced issues such as exception handling or versioning may be addressed to enhance the language.

There are at least two directions for future research that appear promising:

- Since the set expressions in Voltaire are very similar to those in SETL, it would be interesting to investigate the possibility of extending SETL to make it a polymorphic, strongly typed database programming language with static type checking.

PAGE 92

- Extend the denotational description of Voltaire to a continuation style of semantics, which could then be used to study the notion of transactions for DBPLs.
- Extend the type system of Voltaire to define a type inferencing mechanism that would eliminate the need to predefine transient attributes.

PAGE 93

APPENDIX A
UNIVERSITY SCHEMA

class Person defined
  superclasses Any
  subclasses Student, Teacher
  attributes
    ss : integer
    name : string

class Student defined
  superclasses Person
  subclasses Grad, Undergrad
  attributes
    gpa : real
    major : Dept
    sections : set Section
    transcripts : set Transcript
    totalwork : integer
    total_credit : integer
    jobhours : integer
    leisuretime : integer
    visa_status : integer
  constraints
    total_credit = sum {sections.course.credit_hours}
    totalwork = … total_credit … jobhours
    leisuretime = … - totalwork
    leisuretime …
    if visa_status = "F1" then jobhours …

class Grad defined
  superclasses Student
  subclasses RA, TA
  attributes
    advisor : Faculty
    committee : set Faculty
    status : string

PAGE 94

    course_work : string
    degree_req : string
    thesis_option : integer
  constraints
    if exists thesis_option then advisor … and advisor in committee
    for all {section.course.c … | c …}
    if status = "fulltime" then total_credit …
    if course_work = "done" and thesis_status = "defended" and count {committee.Faculty | Faculty.Dept includes Dept …} … then degree_req = "fulfilled"

class Undergrad defined
  superclasses Student
  attributes
    minor : Dept

class Teacher defined
  superclasses Person
  subclasses Faculty, TA
  attributes
    degree : string

class Faculty defined
  superclasses Teacher
  attributes
    books : string
    specialty : string
    advises : set Grad

class TA defined
  superclasses Teacher, Grad
  attributes
    supervisor : Faculty

class RA defined
  superclasses Grad
  attributes
    project : string

class Section defined
  superclasses Any
  attributes

PAGE 95

    section : string
    room : string
    textbook : string
    taughtby : Teacher
    course : Course
    enrollment : set Student

class Course defined
  superclasses Any
  attributes
    c : string
    title : string
    credit_hours : integer
    prereqs : set Course
    sections : set Section
    enrollment : set Student
    dept : Dept

class Dept defined
  superclasses Any
  attributes
    name : string
    college : string
    students : set Student
    coursesoffered : set Course

class Transcript defined
  superclasses Any
  attributes
    grade : integer
    course : Course
    student : Student

class Advising defined
  superclasses Any
  attributes
    startdate : string
    faculty : Faculty
    student : Student

PAGE 96

APPENDIX B
CONCRETE SYNTAX

I. A BNF for the Data Definition Sublanguage

<db> ::= <schema> <database>
<schema> ::= <class>
<database> ::= <instance>
<class> ::= class <classname> defined (function)
            [superclasses <superclass>]
            [subclasses <subclass>]
            [instances <ref>+]
            [attributes <attr_domain>]
            [transients <attr_domain>]
            [constraints <B>]
<instance> ::= instance <ref> [ <parent_class>+ ] [attributes <attr_value>]
<attr_domain> ::= <attr_name> : <domain> | <attr_name> : <value>
<domain> ::= nil | any | string | integer | real | <class_name> | set <domain> | list <domain> | tuple <attr_domain>
<attr_value> ::= <attr_name> : <value>
<value> ::= null | <ref> | <integer> | <real> | "<string>" | <set_value> | <list_value> | <tuple_value>
<set_value> ::= { <value>+ }
<list_value> ::= ( <value>+ )
<tuple_value> ::= [ <attr_value> ]
<superclass> ::= <class_name>
<subclass> ::= <class_name>
<parent_class> ::= <class_name>
<class_name> ::= <Identifier>
<attr_name> ::= <Identifier>

PAGE 97

II. Some Data Manipulation Operators

<dmlops> ::= <new> | <modify> | <delete>
<new> ::= <dot_expr> = { new.<classname> | <attr_value> }
        | <dot_expr> = { new.<classname> | <Identifier> }
<modify> ::= { modify.<dot_expr> | <Bool> }
<delete> ::= { delete.<dot_expr> | <Bool> }

III. Query Sublanguage

<set_expr> ::= { <E> | <Bool> } | { <E> } | <E> | <agg_op> <set_expr>
<Bool> ::= ( <Bool> ) | not <Bool> | <Bool1> or <Bool2> | <Bool1> and <Bool2>
         | <E1> <relop> <E2> | <E1> … <E2>
         | forall <E> <Bool> | exists <E> | dbexists <E>
<E> ::= <dot_expr> | - <term> | <term> | <term> <addop> <E>
<dot_expr> ::= <I> | <I>.<dot_expr> | <I1> [ <I2> ] | <I1> [ <I2> ].<dot_expr>
<term> ::= <factor> | <factor> <multiplyop> <term>
<factor> ::= <ref> | <integer> | <real> | "<string>" | <set_expr> | <set_constants>
<set_constants> ::= { <ref>+ } | { <integer>+ } | { <real>+ } | { <string>+ } | { <Identifier>+ }
<agg_op> ::= count | sum | avg | min | max
<relop> ::= … | in | includes
<addop> ::= + | -
<multiplyop> ::= × | / | mod | div
<I> ::= prev | next | self | head | tail | <Identifiers>

PAGE 98

IV. Additive Constraint Sublanguage

<B> ::= <B1> <B2> | <Bool> | <Comm1>
<Comm1> ::= if <Bool> then <B> endif
          | if <Bool> then <B1> else <B2> endif

V. Additive Programming Sublanguage

<Comm2> ::= <Comm1> | <Assignment> | <Loop> | <dmlops> | <io>
<Assignment> ::= <dot_expr> = <set_expr>
<Loop> ::= <Iterator> | <While>
<Iterator> ::= for each <I> in <set_expr> do <B> enddo
<While> ::= while <Bool> do <B> enddo
<io> ::= <open> | <close> | <print> | <read>

VI. Environment

<Session> ::= load_db <db> <Sess_Op>
<Sess_Op> ::= new_c <class>
            | new_i <instance>
            | eval <set_expr>
            | script <file_name>
            | check <classname>
            | check <classname> <set_expr>
            | quit
            | save_in <file_name>
            | delete <set_expr>

PAGE 99

APPENDIX C
ABSTRACT SYNTAX

Voltaire ::= load_db Sc Db S
S ::= S1 ; S2 | new_c Cl | new_i Ins | eval SE | script Fn | check Cn | check Cn SE | quit | save_in Fn | delete SE
Cl ::= class Cn defined (function) [superclasses Sup] [subclasses Sub] [instances Rf] [attributes AD] [transients AD] [constraints B]
AD ::= An D | An V
D ::= nil | any | string | integer | real | Cn | set D | list D | tuple AD
Ins ::= instance Rf [ Cn ] [attributes AV]
AV ::= An V
V ::= null | Rf | Int | R | St | SV | TV
SV ::= { V }
TV ::= [ AV ]
Sc ::= Cl
Db ::= Ins
Sup ::= Cn
Sub ::= Cn
Cn ::= Ide

PAGE 100

An ::= Ide
B ::= B1 ; B2 | Bo | C
C ::= if Bo then B endif | if Bo then B1 else B2 endif | A | L | DML
Bo ::= ( Bo ) | not Bo | Bo1 or Bo2 | Bo1 and Bo2 | E1 Rel E2 | E1 … E2 | forall E Bo | exists E | dbexists E
E ::= - T | T | T Add E | DE
T ::= F | F Mul T
F ::= Rf | Int | R | "St" | SV | SE
Agg ::= count | sum | avg | min | max
Rel ::= … | in | includes
Add ::= + | -
Mul ::= × | / | mod | div
I ::= prev | self | head | tail | Ide
DML ::= New | Mod | Del
New ::= DE = { new.Cn | AV } | DE = { new.Cn | Ide }
Mod ::= { modify.DE | Bo }
Del ::= { delete.DE | Bo }
DE ::= I | I.DE | I1 [ I2 ] | I1 [ I2 ].DE
SE ::= { E | Bo } | { E } | E | Agg SE
A ::= DE = SE
L ::= It | W
It ::= for each Ide in SE do B enddo
W ::= while Bo do B enddo
… ::= Open | Close | Print | Read

PAGE 101

APPENDIX D
DENOTATIONAL SEMANTICS

Semantic Algebras

Primitive domains: Integer, Real, String, Boolean, Identifier

Denotable Values
  Dv = Integers + Reals + String + Boolean + Set + List + Tuple + Ref + Nil + Any + Location + Errvalue
  where Errvalue = Unit
        List = Set = Dv*
        Tuple = Id → Dv = Environment

Expressible Values
  Ev = Integers + Reals + String + Boolean + Set + List + Tuple + Ref + Nil + Any + Errvalue

Storable Values
  Sv = Integers + Reals + String + Boolean + Set + List + Tuple + Ref + Nil + Any

Storage Locations
  Domain l ∈ Locn
  Operations

PAGE 102

  • first_locn : Locn
  • next_locn : Locn → Locn
  • equal_locn : Locn → Locn → Tr
  • lessthan_locn : Locn → Locn → Tr

Stack-Based Store
  Domain Store = (Locn → Sv) × Locn
  Operations
  • access : Locn → Store → (Sv + Errvalue)
    access = λl.λ(map, top). l lessthan_locn top → inSv(map l) [] inErrvalue()
  • update : Locn → Sv → Store → PostStore
    update = λl.λv.λ(map, top). l lessthan_locn top → inOK([l ↦ v]map, top) [] inErr(map, top)
  • mark_locn : Store → Locn
    mark_locn = λ(map, top). top
  • allocate_locn : Store → Locn × PostStore
    allocate_locn = λ(map, top). (top, inOK(map, next_locn(top)))
  • deallocate_locns : Locn → Store → PostStore
    deallocate_locns = λl.λ(map, top). (l lessthan_locn top) or (l equal_locn top) → inOK(map, l) [] inErr(map, top)
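The stack-based store algebra above can be modelled directly. The Python sketch below makes several assumptions for the sake of a runnable example: locations are integers, a store is a pair (dictionary, top), a PostStore is a tagged pair ("OK", store) or ("Err", store), and `ERRVALUE` is an arbitrary sentinel. None of these representation choices come from the dissertation.

```python
ERRVALUE = "Errvalue"   # sentinel standing in for the Errvalue domain
FIRST_LOCN = 0


def next_locn(loc):
    return loc + 1


EMPTY_STORE = ({}, FIRST_LOCN)   # (map, top): nothing allocated yet


def access(loc, store):
    """Read loc; only locations strictly below top are live."""
    mapping, top = store
    return mapping.get(loc, ERRVALUE) if loc < top else ERRVALUE


def update(loc, value, store):
    """Write value at loc, signalling Err for a location at or above top."""
    mapping, top = store
    if loc < top:
        return ("OK", ({**mapping, loc: value}, top))
    return ("Err", store)


def mark_locn(store):
    """Return the current top of the stack."""
    return store[1]


def allocate_locn(store):
    """Hand out the location at top and advance top by one."""
    mapping, top = store
    return top, ("OK", (mapping, next_locn(top)))


def deallocate_locns(loc, store):
    """Pop the stack back to loc; loc must not lie above top."""
    mapping, top = store
    if loc <= top:
        return ("OK", (mapping, loc))
    return ("Err", store)


loc, (tag1, s1) = allocate_locn(EMPTY_STORE)
tag2, s2 = update(loc, 42, s1)
tag3, s3 = deallocate_locns(FIRST_LOCN, s2)
```

After deallocation the old value is still in the dictionary but is no longer reachable through `access`, because the location is no longer below top.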

PAGE 103

Environment
  Domain e ∈ Env = Environment = Id → Ev
  Operations
  • emptyenv : Env
    emptyenv = λi. inErrvalue()
  • accessenv : Id → Env → Dv
    accessenv = λi.λe. e(i)
  • updateenv : Id → Dv → Env → Env
    updateenv = λi.λd.λe. [i ↦ d]e

PostStore (runtime store labeled with status of computation)
  Domain p ∈ PostStore = OK + Err
  where OK = Err = Store
  Operations
  • return : Store → PostStore
    return = λs. inOK(s)
  • signalerr : Store → PostStore
    signalerr = λs. inErr(s)
  • check : (Store → (Env × PostStore)) → (PostStore → (Env × PostStore))
    check f = λp. cases p of isOK(s) → f(s) [] isErr(s) → p end
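The environment and PostStore operations can be sketched with closures. This Python sketch assumes environments are functions from identifiers to values and PostStores are tagged pairs; `ok` is used in place of the operation named return, since that word is reserved in Python. For brevity, `check` here takes a function from Store to PostStore, which is a simplification of the signature given above.

```python
ERRVALUE = "Errvalue"   # sentinel standing in for the Errvalue domain


def emptyenv(ident):
    """The empty environment maps every identifier to Errvalue."""
    return ERRVALUE


def accessenv(ident, env):
    return env(ident)


def updateenv(ident, value, env):
    """Return a new environment that binds ident and defers to env otherwise."""
    return lambda i: value if i == ident else env(i)


def ok(store):
    """Plays the role of `return`: label the store as OK."""
    return ("OK", store)


def signalerr(store):
    return ("Err", store)


def check(f):
    """Run f only on an OK store; an Err store is passed through untouched."""
    def step(post_store):
        tag, store = post_store
        return f(store) if tag == "OK" else post_store
    return step


env = updateenv("x", 1, emptyenv)
env2 = updateenv("x", 2, env)          # the newer binding shadows the older one
bump = check(lambda store: ok(store + 1))
```

Chaining steps through `check` means that once one step signals an error, every later step is skipped without each step having to test for failure itself.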

PAGE 104

Voltaire Database
  Clunction = Class + Function
  Class = ClassName → ClassStructure
  Function = FunctionName → ClassStructure
  ClassStructure = AD* × TR* × Constraints
  Domain T1 ∈ CRefTable = Name → Ref*
  ClassHier = ClassName × ClassName → Boolean
  Instance = Ref → (Aname → Sv)*
  Domain T2 ∈ ICTable = Ref → Name*
  Domain σ ∈ Schema = Class × CRefTable × ClassHier
  Domain δ ∈ Database = Instance × ICTable
  Domain Δ ∈ DB = Schema × Database
  Domain AD = AttributeDomain = Aname × ClassName
  Domain TR = Transients = Aname × ClassName
  Constraints = B

Valuation Functions

Voltaire : S → DB → DB
Voltaire[[load_db Sc Db S]] = λ(…). let Δ' = (load_σ[[Sc]] init_σ, load_δ[[Db]] init_δ) in OP[[S]](Δ')

  • init_σ : Sc

PAGE 105

    init_σ = λ(). let empty_classes = nil and init_hier = λ(…).F and init_T1 = nil in (empty_classes, init_hier, init_T1)
  • init_δ : Db
    init_δ = λ(). let empty_instances = nil and init_T2 = nil in (empty_classes, init_T2)
  • load_σ : Sc → σ → σ
    load_σ[[Sc]] = λσ. Sc = nil → σ [] load_σ[[tail Sc]](Create_Class[[head Sc]]σ)
  • load_δ : …
    load_δ[[Db]] = λΔ. Db = nil → Δ [] load_δ[[tail Db]](Create_Instance[[head Db]]Δ)
  • Create_Class : Cl → σ → σ
    Create_Class[[(N ClassType Sup Sub R A T Constraints)]] = λ(c, T1, h).
      cases ClassType of
        "Class" → ([N ↦ (A, T, Constraints)]c, T1, h)
        [] "Function" → ([N ↦ (A, T, Constraints)]c, T1, h)
      end
  • Create_Instance : Ins → DB → DB
    Create_Instance[[(Ref P AV)]] = λ(σ, (i, T2)). let i' = [Ref ↦ conv_AV(AV)]i in update_T(Ref, P)(σ, (i', T2))
  • update_T : Ref × Name* → …
    update_T = λ(r, p).λΔ. p = nil → Δ [] update_T(r, tail p)(rec_upd(r, head p)Δ)
  • rec_upd : Ref × Name → DB → DB
    rec_upd = λ(r, p).λ(σ, δ). member_c(r, p) → … [] (update_T1(r, p, σ), update_T2(r, p, δ))

PAGE 106

  • update_T1 : Ref × Name × Schema → Schema
    update_T1 = λ(r, p, (c, T1, h)). (c, [p ↦ (T1 p) cons r]T1, h)
  • update_T2 : Ref × Name × Database → Database
    update_T2 = λ(r, p, (i, T2)). (i, [r ↦ (T2 r) cons p]T2)
  • member_c : Ref × Name → Boolean
    This function returns true if the instance denoted by Ref is a member of the class denoted by Name.
  • conv_AV : AV → (An → Sv)*

OP : S → DB → DB
OP[[S1 ; S2]] = λΔ. OP[[S2]](OP[[S1]]Δ)
OP[[new_c Cl]] = λ(σ, δ). (Create_Class[[Cl]]σ, δ)
OP[[new_i P AV]] = λΔ. let R = inventRef in Create_Instance[[R P AV]]Δ
OP[[eval SE]] = OP[[eval {E | Bo}]]

For the time being, we only consider the following forms of function evaluation: {f.o | E1 and … and En} or {f | E1 and … and En}.

OP[[eval SE]] = λΔ. let a = get_anchor(SE) in
  cases a of
    isFunction(f) → eval_fn(get_constr(f), init_env(SE, f, Δ))
    [] isClass(c) → eval_query(SE, c)
    [] isErrvalue(e) → inErrvalue(e)
  end

  • get_constr : Name → B
    get_constr = λk. isFunction(k) → function(k)↓… [] isClass(k) → class(k)↓… [] inErrvalue(k)

PAGE 107

  • init_env : SE → Name → DB → (Env × PostStore)
    init_env[[{E | Bo}]] = λn.λΔ. let (e, p) = get_storage(emptyenv, get_attr(n), get_trans(n)) in create_temp_inst[[Bo]](e, p)
  • get_attr : Name → AD*
    get_attr = λn. isFunction(n) → function(n)↓1 [] isClass(n) → class(n)↓1 [] inErrvalue(n)
  • get_trans : Name → TR*
    get_trans = λn. isFunction(n) → function(n)↓… [] isClass(n) → class(n)↓… [] inErrvalue(n)
  • create_temp_inst : Bo → DB → (Env × PostStore) → (Env × PostStore)
    Basically, Env will always have an identifier called self, which will be mapped onto an instance created by create_temp_inst, which effectively returns a bound Env including self. Thus, a copy of the current environment is always contained in self. Since self is of domain Instance, it has a Ref. When in function evaluation mode, Ref = selfref, and when in query mode, self = current oid under consideration.

Assume that all attribute names are unique. Further, let A = TR + AD.

  • get_storage : A → Env → Store → (Env × PostStore)
    get_storage[[A1 ; A2]] = λe.λs. let (e', p) = get_storage[[A1]] e s in (check (get_storage[[A2]] e'))(p)
    get_storage[[Aname : Domain]] = λe.λs. let (d, p) = D[[Domain]] s in ((updateenv [[Aname]] d e), p)

PAGE 108

    get_storage[[Aname : Value]] = λe.λs. let (d, p) = V[[Value]] s in ((updateenv [[Aname]] d e), p)

D : Domain → Store → (Dv × PostStore)
D[[integer]] = λs. let (l, p) = allocate_locn s in (inInteger_locn(l), p)
D[[real]] = λs. let (l, p) = allocate_locn s in (inReal_locn(l), p)
D[[string]] = λs. let (l, p) = allocate_locn s in (inString_locn(l), p)
D[[Cn]] = λs. isClunction([[Cn]]) → let (l, p) = allocate_locn s in (inRef_locn(l), p) [] (inErrvalue(), p)
D[[tuple A]] = λs. let (e, p) = get_storage[[A]] emptyenv s in (inTuple_locn(e), p)
D[[set]] = λs. let (l, p) = allocate_locn s in (inSet_locn(l), p)

V : Value → Store → (Dv × PostStore)
V[[Value]] = λq. cases [[Value]] of
    isInteger(i) → (inInteger(i), q)
    [] isReal(r) → (inReal(r), q)
    [] isString(g) → (inString(g), q)
    [] isRef(p) → (inRef(p), q)
    [] isNil(n) → (Null, q)
    [] isList(l) → (inList(l), q)
    [] isSet(s) → (remove_dups(s), q)
    [] isTuple(t) → let (e, p) = get_storage[[t]] emptyenv q in (inTuple(e), p)
    [] (inErrvalue(), q)

PAGE 109

REFERENCES

[1] Agrawal, R. and Gehani, N., "ODE (Object Database and Environment): The Language and Data Model," Proc. ACM SIGMOD, Portland, OR, June ….
[2] Alashqur, A., Su, S. and Lam, H., "OQL: A Query Language for Manipulating Object-oriented Databases," …th Int. Conf. on Very Large Databases, Amsterdam, the Netherlands, August ….
[3] Albano, A., Cardelli, L. and Orsini, R., "Galileo: A Strongly Typed, Interactive Conceptual Language," ACM TODS, Vol. …, No. …, June ….
[4] Andrews, T. and Harris, C., "Combining Language and Database Advances in an Object-Oriented Development Environment," Proc. OOPSLA, Orlando, Florida, October ….
[5] Atkinson, M.P. and Buneman, O.P., "Types and Persistence in Database Programming Languages," ACM Computing Surveys, Vol. …, No. …, June ….
[6] Atkinson, M.P., Chisholm, K.P. and Cockshott, W.P., "PS-Algol: an Algol with a Persistent Heap," ACM SIGPLAN Notices, Vol. …, No. …, July ….
[7] Bancilhon, F., "Query Languages for Object-Oriented Database Systems: Analysis and a Proposal," Proc. of the …th Int'l Symposium on Databases of Brazil, Campinas, Brazil, April ….
[8] Bancilhon, F., Briggs, T., Khoshafian, S. and Valduriez, P., "FAD, A Powerful and Simple Database Language," Proc. …th VLDB, Brighton, England, September ….
[9] Bancilhon, F. and Buneman, P. (eds.), Advances in Database Programming Languages, ACM Press, New York, ….
[10] Batory et al., "GENESIS: An Extensible Database Management System," IEEE Trans. on Soft. Engg., Vol. …, No. …, ….
[11] Beck, H.W., Gala, S.K. and Navathe, S.B., "Classification as a Query Processing Technique in the CANDIDE Semantic Data Model," Proc. of the IEEE International Conf. on Data Engg., Los Angeles, February ….
[12] Bloom, T. and Zdonik, S.B., "Issues in the Design of Object-Oriented Database Programming Languages," Proc. OOPSLA, Orlando, Florida, October ….
[13] Borgida, A., "Language Features for Flexible Handling of Exceptions in Information Systems," ACM TODS, Vol. …, No. …, December ….

PAGE 110

[14] Brachman, R. and Schmolze, J.G., "An overview of the KL-ONE Knowledge Representation System," Cognitive Science, ….
[15] Bruce and Wegner, P., "An Algebraic Model of Subtype and Inheritance," Tech. Rep., Dept. of Computer and Information Science, University of Pennsylvania, Philadelphia, ….
[16] Buneman, P. and Atkinson, M., "Inheritance and Persistence in Database Programming Languages," Proc. ACM SIGMOD, Washington, DC, June ….
[17] Buneman, P. and Ohori, A., "Polymorphism and Type Inference in Database Programming," unpublished manuscript, Dept. of Computer and Information Science, University of Pennsylvania, Philadelphia, ….
[18] Cardelli, L., "A Semantics for Multiple Inheritance," In Semantics of Data Types, Kahn, D.B. MacQueen and Plotkin (eds.), Lecture Notes in Computer Science, Vol. …, Springer-Verlag, New York, ….
[19] Cardelli, L. and Wegner, P., "On Understanding Types, Data Abstraction and Polymorphism," ACM Computing Surveys, Vol. …, No. …, December ….
[20] Carey, M., DeWitt, Richardson and Shekita, E., "The Architecture of the EXODUS Extensible DBMS," Proc. Int. Workshop on Object-Oriented Database Systems, Germany, ….
[21] Chakravarthy, U.S. and Nesson, S., "Making an Object-Oriented DBMS Active: Design, Implementation and Evaluation of a Prototype," Proc. of International Conf. on Extended Database Technology, Venice, Italy, March ….
[22] Cluet, S., Delobel, C., Lecluse, C. and Richard, P., "Reloop, an Algebra-based Query Language for an Object-Oriented Database System," Proc. IEEE International Conf. on Data Engineering, Kyoto, Japan, February ….
[23] Copeland and Maier, "Making Smalltalk a Database System," Proc. ACM SIGMOD, New York, ….
[24] Courcelle, B., "Fundamental Properties of Infinite Trees," Theoretical Comp. Sc., Vol. …, ….
[25] Elmasri, R. and Navathe, S., Fundamentals of Database Systems, Benjamin/Cummings, Redwood City, California, ….
[26] Fishman, D.H., Beech, Cate, H.P., "Iris: An Object-Oriented Database System," ACM TOOIS, Vol. …, No. …, ….
[27] Gallaire, H., Minker and Nicolas, "Logic and Databases: a Deductive Approach," ACM Computing Surveys, Vol. …, No. …, June ….
[28] Hall, P.A.V., "Adding Database Management to Ada," ACM SIGPLAN Notices, Vol. …, No. …, April ….
[29] Harper, R., Milner, R. and Tofte, M., "The Definition of Standard ML, version …," LFCS Report ECS-LFCS-…, Dept. of Computer Sc., University of Edinburgh, Scotland, August ….



BIOGRAPHICAL SKETCH

Sunit Kalyanji Gala was born on January [day illegible], 1964, in Bombay, India. He received his undergraduate degree in Instrumentation from Birla Institute of Technology and Science, India, in June [year illegible]. He received his Master of Science degree in December [year illegible] and his Doctor of Philosophy degree in December 1991 from the University of Florida, Gainesville. His current research interests are database programming languages, database applications of programming language theory, type algebras, and query algebras.

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Shamkant B. Navathe, Chair
Professor of Computer and Information Sciences

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Manuel Bermudez
Associate Professor of Computer and Information Sciences

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Herman Lam
Associate Professor of Electrical Engineering

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Richard Newman-Wolfe
Assistant Professor of Computer and Information Sciences

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

José C. Principe
Associate Professor of Electrical Engineering

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Stanley Y. W. Su
Professor of Electrical Engineering

This dissertation was submitted to the Graduate Faculty of the College of Engineering and to the Graduate School and was accepted as partial fulfillment of the requirements for the degree of Doctor of Philosophy.

December 1991

Winfred M. Phillips
Dean, College of Engineering

Madelyn M. Lockhart
Dean, Graduate School

