Citation
Data partitioning, query processing and optimization techniques for parallel object-oriented databases

Material Information

Title:
Data partitioning, query processing and optimization techniques for parallel object-oriented databases
Creator:
Huang, Ying
Publication Date:
1996
Language:
English
Physical Description:
ix, 93 leaves : ill. ; 29 cm.

Subjects

Subjects / Keywords:
Algorithms ( jstor )
Data models ( jstor )
Data processing ( jstor )
Database design ( jstor )
Databases ( jstor )
Input output ( jstor )
Query processing ( jstor )
Scheduling ( jstor )
Sharing ( jstor )
Dissertations, Academic -- Electrical and Computer Engineering -- UF
Electrical and Computer Engineering thesis, Ph. D
Object-oriented databases ( lcsh )
Parallel processing (Electronic computers) ( lcsh )
Genre:
bibliography ( marcgt )
non-fiction ( marcgt )

Notes

Thesis:
Thesis (Ph. D.)--University of Florida, 1996.
Bibliography:
Includes bibliographical references (leaves 88-92).
General Note:
Typescript.
General Note:
Vita.
Statement of Responsibility:
by Ying Huang.

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
Copyright Ying Huang. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Resource Identifier:
023170451 ( ALEPH )
35001719 ( OCLC )

DATA PARTITIONING, QUERY PROCESSING
AND OPTIMIZATION TECHNIQUES
FOR PARALLEL OBJECT-ORIENTED DATABASES









By

YING HUANG
















A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

1996
































To My Parents
for Their Love and Encouragement in All My Endeavors.













ACKNOWLEDGEMENTS


First and foremost, I would like to thank my advisor, Professor Stanley Su, for giving me the opportunity to work in the field of database management, and for providing never-ending guidance and encouragement throughout the course of my research. I wish to thank the other members of my graduate committee, Professors Keith L. Doty, Eric Hanson, Herman Lam and John Staudhammer, for reviewing this dissertation. I also wish to thank the researchers of Fujitsu Laboratories Ltd. and the faculty and students of the Database Research and Development Center at the University of Florida for their invaluable suggestions for this work. I also want to recognize Sharon Grant, our over-worked but much appreciated secretary, for the help she has given me. Last but not least, I am thankful to my wife for her patience, which was essential to the completion of this work.













TABLE OF CONTENTS


ACKNOWLEDGEMENTS

LIST OF FIGURES

ABSTRACT

CHAPTERS

1 INTRODUCTION

2 SURVEY OF RELATED WORK

3 OBJECT-ORIENTED DATABASE AND QUERY SPECIFICATION
  3.1 Object-oriented View of a Database
  3.2 Query Graph and Query Processing

4 A GENERAL FRAMEWORK OF PARALLEL OODBMSS
  4.1 A Hybrid Data Partitioning Approach
  4.2 A Distributed Graph-based Query Processing Strategy
    4.2.1 Query Graph Modification Approach
    4.2.2 Multiple Wavefront Algorithms
    4.2.3 Result Collection
    4.2.4 Method Processing and Attribute Inheritance

5 DISTRIBUTED RESULT COLLECTION
  5.1 Two Architectures for Supporting Result Collection
  5.2 Pattern-passing Identification Strategy
  5.3 Distributed Result Collection
    5.3.1 Join Approach
    5.3.2 Concatenation Approach

6 QUERY OPTIMIZATION STRATEGIES
  6.1 Parallelism Not Equal to Efficiency
  6.2 Intraquery Scheduling Strategy
  6.3 Partial Graph Processing Strategy
  6.4 Interquery Scheduling and Common Pattern Sharing Strategies
  6.5 Distributed Sharing of Selection Operations

7 PERFORMANCE EVALUATIONS
  7.1 Benchmark and Application Domains
  7.2 Evaluations of Optimization Strategies
    7.2.1 Intraquery Scheduling
    7.2.2 Partial Graph Processing
    7.2.3 Interquery Scheduling and Common Pattern Sharing
    7.2.4 Distributed Local Selection Sharing
    7.2.5 Scaleup and Speedup of Optimization Strategies
  7.3 Evaluations of Architectures
    7.3.1 Single Class Selection
    7.3.2 Two-Class Join
    7.3.3 Three-Class Join
    7.3.4 Benchmark Queries
  7.4 Evaluations of Two Data Placement Strategies

8 DISCUSSION AND CONCLUSION
  8.1 Discussion
  8.2 Conclusion

REFERENCES

BIOGRAPHICAL SKETCH
















LIST OF FIGURES


3.1 Schema Graph of a University Database
3.2 Object Graph (OG)
3.3 The Query Graph
3.4 The Resulting Subdatabase

4.1 Vertical Partitioning of Class Student
4.2 Hybrid Partitioning of Class Student
4.3 Query Graph Modification Approach
4.4 Data Structures for the Multiple Wavefront Algorithms
4.5 Execution of the Identification Algorithm
4.6 A Proposed Mapping Strategy

5.1 Two Architectures
5.2 An Example of the PPI Strategy
5.3 An Example of the Result Collection Node Assignment
5.4 An Example of the Result Pattern
5.5 An Example of a Cache

6.1 A Modified Object Graph
6.2 A New Query Execution Plan
6.3 An Example of the Numbering Scheme
6.4 An Example Set of the Structures of Sharing
6.5 Three Basic Structures of Sharing

7.1 Two Software Architectures
7.2 Schema Representation of Various Benchmark Queries
7.3 Intraquery Scheduling Strategy
7.4 Partial Graph Processing Strategy
7.5 Structure of Sharing Used in Performance Evaluation
7.6 Interquery Scheduling and Common Pattern Sharing Strategies
7.7 Distributed Local Selection Sharing Strategy
7.8 Scaleup and Speedup of the System
7.9 Single Class Selection (20000 objects/class, 100 bytes/object)
7.10 Two-class Join (4000 objects/class, 100 bytes/object)
7.11 Two-class Join (20000 objects/class, 100 bytes/object)
7.12 Two-class Join (4000 objects/class, 5000 bytes/object)
7.13 Three-class Join (4000 objects/class, 100 bytes/object)
7.14 Speedup and Scaleup of Benchmark Queries
7.15 Response Time of Benchmark Queries

8.1 Optimization Overhead for the Identification Approach















Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

DATA PARTITIONING, QUERY PROCESSING
AND OPTIMIZATION TECHNIQUES
FOR PARALLEL OBJECT-ORIENTED DATABASES

By

Ying Huang

May 1996


Chairman: Dr. Stanley Y. W. Su
Major Department: Electrical and Computer Engineering


Much work has been accomplished in the past on the subject of parallel query processing and optimization in parallel relational database systems. However, little work on the same subject has been done in parallel object-oriented database systems. Since the object-oriented view of a database and its processing are quite different from those of a relational system, the techniques of parallel query processing and optimization suitable for object-oriented systems can be expected to differ from those for relational systems. In this dissertation, we present two parallel architectures, a general framework for parallel object-oriented database systems, and several implemented query processing and optimization strategies, together with performance evaluation results. In this work, multi-wavefront algorithms are used in query processing to allow a higher degree of parallelism than traditional tree-based query processing. Four optimization strategies, which are designed specifically for the multi-wavefront algorithms and for the optimization of single as well as multiple queries, are introduced and evaluated. A distributed result collection scheme designed to support retrieval queries is also introduced. Furthermore, two parallel architectures, namely, the master-slave and peer-to-peer architectures, are compared. A comparison is also made between two data placement strategies, namely, class-per-node vertical partitioning and hybrid partitioning. The query processing algorithms, the four optimization strategies, and the distributed result collection scheme have been implemented on an nCUBE2 parallel computer, and the results of a performance evaluation are presented in this dissertation. The main emphases and intended contributions of this dissertation are 1) data partitioning, parallel architecture, query processing, query optimization, and result collection strategies suitable for parallel OODBMSs; 2) the implementation of these strategies; and 3) the performance evaluation results.














CHAPTER 1
INTRODUCTION


Research on parallel database systems began in the early 1970s when the relational model and relational database management systems started to become popular. Since then, a considerable amount of work has been carried out in parallel processing of relational databases. Many parallel query processing techniques and algorithms, particularly for the processing of the time-consuming Join operation [Vald84, Grae90, Kits90, LuH91, Chen92], have been introduced, analyzed, and prototyped. In recent years, OODBMSs have become quite popular. Some frequent questions raised among researchers and practitioners in the database area are: "What are the major differences between relational database processing and object-oriented database processing?", "Can parallel processing techniques and algorithms introduced for relational systems be directly applied to object-oriented systems?", and "What new or modified parallel techniques and algorithms can be introduced to make future parallel OODBMSs more efficient?".

From the parallel processing perspective, OODBMSs differ from RDBMSs in the following two main aspects:

1. OODBMSs deal with complex objects instead of normalized relational tuples. Data associated with a complex object, say, the design of an airplane, can be composed of thousands of object instances of a large number of classes. Each instance may contain data of complex types such as set, list, array, bag, image, and voice. This fact has two implications. First, a query in an OODBMS may involve a large number of classes. Traversals of multiple object classes for object instances that satisfy or do not satisfy some data conditions are frequent operations. Support for efficient, bi-directional traversals of object instances, as well as their retrieval, is needed to achieve efficient query processing. Furthermore, new parallel query optimization strategies will be required to reduce the I/O, communication, and processing times during these traversals. Second, since an instance of a complex object may contain much data, the traditional tuple-oriented data access from secondary storage and the tuple-oriented query processing used in relational DBMSs may no longer be suitable. Only the parts of the data associated with instances of complex objects that are relevant to a query should be accessed, instead of the entire instances, in order to avoid excessive I/O time. Thus, different data structures are required for efficient parallel processing of complex objects.

2. Relational systems deal only with the retrieval, update, insertion and deletion of data from databases. Further processing of the retrieved or manipulated data is done in application programs, and thus is out of the control of relational DBMSs. In this type of system architecture, it makes sense to generate temporary relations in different steps of query processing, since these generated results are to be either retrieved or further manipulated by storage operations. In OODBMSs, in addition to the traditional database operations, user-defined operations and their implementations (methods) are managed and performed by the systems. Activations of methods are done by passing proper messages to object instances. It is therefore important to store methods close to their applicable instances so that they are readily available when object instances which satisfy some search conditions have been identified. Except for the final retrieval of data in a retrieval query, the assembly of descriptive data (or attribute values) in object instances (which is equivalent to the generation of temporary relations) should not be carried out, since it involves the access of large quantities of data from secondary storage (high I/O time), the assembly of object instances (high processing time), and the passing of assembled data from processor to processor (high communication time). Some of these assembled data are not applicable to the user-defined operations specified in many nonretrieval-oriented queries.

These differences and the increasing popularity of OODBMSs have motivated our research in data partitioning, query processing and query optimization strategies for use in parallel OODBMSs. In our research, we propose a hybrid data partitioning strategy (horizontal and vertical partitioning) to achieve higher scalability while maintaining a uniform representation of an OO database across the processing nodes. In order to achieve partitioned data parallelism, a global, logical query graph is decomposed into many physical query graphs. This approach is different from the ones used in other parallel systems [DeWi92, Grae94], which introduced parallel operators to bridge the gap between the logical representation of a query and the physical allocation of data elements.
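As a toy illustration of this decomposition idea (a sketch of our own, with hypothetical names; the actual transformation is presented in Chapter 4), each logical class vertex expands into the set of processing nodes that hold that class's partitions, and each logical edge expands into the edges between those node sets:

```python
def decompose_query_graph(logical_edges, placement):
    """Expand a logical query graph into a physical one, given a mapping
    from each class name to the processing nodes holding its partitions."""
    physical_edges = set()
    for c1, c2 in logical_edges:
        for n1 in placement[c1]:
            for n2 in placement[c2]:
                # A physical vertex is a (class, node) pair.
                physical_edges.add(((c1, n1), (c2, n2)))
    return physical_edges

# Student is partitioned over nodes 0 and 1; Section resides on node 2.
placement = {"Student": [0, 1], "Section": [2]}
edges = decompose_query_graph([("Student", "Section")], placement)
```

Each processing node can then work on the physical edges incident to it, which is what makes the partitioned data parallelism possible.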


The parallel and asynchronous multiple wavefront algorithms proposed in [Chen95] are used in this research as the fundamental query processing strategies. We have explored the graph-based asynchronous query model and developed four optimization strategies based on the generic wavefront algorithms to support both single and multiple query processing. Based on the multiple wavefront algorithms, a distributed result collection scheme has been introduced to support retrieval queries. These strategies have been implemented, and the results of a performance evaluation of these strategies are presented in this dissertation.














CHAPTER 2
SURVEY OF RELATED WORK


Since the focus of this dissertation is on parallel architectures, data partitioning, query processing and optimization strategies for parallel OODBMSs, we shall survey the related works in these areas.

Data partitioning is an important issue in parallel RDBMSs. By partitioning (or declustering) a relation across several disks, the database system can exploit the I/O bandwidth of the disks by reading and writing data in parallel. Some parallel or distributed database systems have concentrated on horizontal partitioning. There are three basic horizontal partitioning schemes, namely, round-robin, hash, and range partitioning. These schemes and their merits have been described in two works [SuSY88, DeWi92]. The horizontal partitioning approach is essential for parallel RDBMSs to achieve good scalability and speedup. The vertical data partitioning technique has been proposed by other researchers [Nava84, Cope85], and the same strategy has been used in several parallel database projects [LamH87, Vald87]. This vertical partitioning technique (or decomposed storage model) has two major advantages for storing the instances of complex objects. First, it provides a uniform representation for complex objects. Second, it can avoid an excessive amount of I/O required to access large instances during query processing.
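The three horizontal partitioning schemes named above can be sketched as follows (a minimal Python illustration of our own; the function names are not taken from any of the cited systems):

```python
def round_robin(tuples, n_nodes):
    """Assign the i-th tuple to node i mod n_nodes."""
    parts = [[] for _ in range(n_nodes)]
    for i, t in enumerate(tuples):
        parts[i % n_nodes].append(t)
    return parts

def hash_partition(tuples, n_nodes, key):
    """Assign each tuple to the node determined by hashing its key."""
    parts = [[] for _ in range(n_nodes)]
    for t in tuples:
        parts[hash(key(t)) % n_nodes].append(t)
    return parts

def range_partition(tuples, boundaries, key):
    """Assign each tuple to the first range whose upper bound exceeds its
    key; boundaries is sorted, and one extra node catches the remainder."""
    parts = [[] for _ in range(len(boundaries) + 1)]
    for t in tuples:
        for i, b in enumerate(boundaries):
            if key(t) < b:
                parts[i].append(t)
                break
        else:
            parts[-1].append(t)
    return parts
```

Round-robin balances load perfectly for full scans, hash partitioning supports exact-match lookups, and range partitioning supports range queries at the risk of skew; these trade-offs are the merits discussed in the works cited above.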








Data partitioning increases the complexity of query processing. In traditional database systems, a query execution plan consists of sequential operators (e.g., "scan" and "join"). Thus, in some research efforts, parallel operators such as "split", "merge" and "exchange" are introduced to bridge the gap between the physical data representation and the logical query execution plan [DeWi92, Grae94]. In our research, queries are optimized at the query graph level. We directly transform a query graph into another query graph based on the physical partitioning of the instances of the object classes that are involved in the query graph.

The use of IID pairs for the bi-directional traversals of object instances, to be presented in this work, is similar to the "join index" concept introduced for processing relational joins [Vald87]. However, join indices can be established for any relations that are directly or indirectly associated through their common attributes, according to the access patterns of an application. The IID pairs of our system are established for all base object classes that are directly associated through object references.
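As a rough sketch of the idea (our own illustration with made-up names; the actual storage structures are described in Chapter 4), the IID pairs of an association can be indexed in both directions, so that a traversal, or the propagation of a disqualification, can proceed from either side:

```python
from collections import defaultdict

class AssociationIndex:
    """Stores the IID pairs between two directly associated base classes,
    indexed in both directions to support bi-directional traversal."""
    def __init__(self, pairs):
        self.forward = defaultdict(set)   # IID in class A -> IIDs in class B
        self.backward = defaultdict(set)  # IID in class B -> IIDs in class A
        for a, b in pairs:
            self.forward[a].add(b)
            self.backward[b].add(a)

    def neighbors(self, iid, direction="forward"):
        index = self.forward if direction == "forward" else self.backward
        return index[iid]

# Student s1 is associated with sections e1 and e2; s2 with e2 only.
idx = AssociationIndex([("s1", "e1"), ("s1", "e2"), ("s2", "e2")])
```

Because both directions are materialized, disqualifying, say, section e2 immediately yields the students that must be re-examined, without scanning the whole class.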

The traversals of object instances through their associations are analogous to join and semi-join operations in relational database systems. The join operation is one of the fundamental relational query operations and is a time-consuming one. A recent survey on the join operation can be found in Mishra and Eich [Mish92]. The parallel execution of join operations is an accepted solution for achieving query processing efficiency [Vald84, Grae90, Kits90]. However, most of the existing works have addressed the problem of performing a join involving only two relations. Recently, some researchers have studied parallel execution strategies for multi-way join queries using different query tree structures, such as right-deep, left-deep and bushy tree structures [Schn90, LuH91, Hara94]. Others have extended query optimization techniques to handle large and more complex queries [Swam88, Ioan90]. The main idea introduced in these works is to find the optimal join schedule or order. Several semi-join strategies have also been introduced for query processing in distributed database systems. Similar to the join operation, most research efforts have focused on the problem of finding the optimal schedule or order of semi-joins, to reduce either the number of semi-join operations or the data transmission cost [Bern81, YooH89, Chen91]. In these works on joins and semi-joins, a query is first translated into a tree structure of relational operators, and the execution of the query follows the structure from the leaves to the root. One of the drawbacks of the tree-based query processing approach is that the degree of parallelism is still limited by the leaves-to-root order, even if the pipelining approach is used for processing the operations. Furthermore, these works consider the efficient processing of a single query. While it is fairly well understood how to achieve the optimal schedule for a single query, little is known about the optimal processing of complex, multiple queries in a multi-user environment. The research results for single-query optimization are not always applicable to multi-query optimization. For example, most single-query processing techniques use the response time of each query as the main performance measurement. The horizontal partitioning of a large file or relation is used to exploit intra-query parallelism, so as to reduce the response time of a single query. This approach fails to achieve a balance between intra-query parallelism and inter-query parallelism. In our work, we try to achieve a balance between these two types of parallelism so that the overall response time of a set of queries can be reduced.

Several interesting works have dealt with parallel and non-parallel processing of OO databases. Pointer-based join techniques for both centralized and parallel OO databases have been studied in two works [Shek90, Lieu93]. In these works, the evaluations only consider the joining of two object classes. Some object-oriented database systems automatically convert the OIDs stored in objects to memory pointers to other objects when they load the objects from secondary storage into memory. This conversion is known as pointer swizzling [KimW88, Whit92]. Pointer swizzling makes possible the efficient navigation of linked objects residing in memory. However, it depends heavily on the virtual memory mechanism of the operating system. Database systems using pointer swizzling techniques may face difficulties when they are ported from one platform to another. Also, pointer swizzling may not be applicable to a shared-nothing parallel computer in which there is no global memory space. Class traversals have been proposed to find the join order of the classes in a query graph in order to find the associated objects [Jenq90, KimW89a]. Another work uses an assembly operator to translate a set of complex objects from their disk representations to memory representations which can be quickly traversed [Kell91]. However, these works are patterned after relational query processing techniques, translating a query graph into a tree structure. This tree-based approach implies a pair-by-pair, bottom-up evaluation of the query tree, which limits the inter-operator parallelism and can lead to the generation of large intermediate results. To overcome these drawbacks, we use a graph-based query processing technique. It allows either all the processors, or the many processors that manage the object classes referenced by a query, to work on the query at the same time, thus achieving a higher degree of parallelism. Recognizing that keeping many processors busy does not necessarily bring about overall efficiency in multiple query processing, we also introduce several optimization strategies to avoid nonproductive computations by some processors so that they can be used to process other queries.

In OODBMSs, the encapsulation of methods with the data they operate on makes query optimization more difficult in the following ways. First, estimating the cost of executing methods is considerably more difficult. Second, encapsulation raises issues related to the accessibility of storage information by the query optimizer. Some systems overcome this difficulty by treating the query optimizer as a special application which can violate encapsulation and access information directly [Clue92]. Others propose a mechanism whereby objects "reveal" their costs as part of their interface [Grae88]. In our research, a heuristic approach is used which takes into consideration some limited storage access information. Thus, we assume that the query optimizer can access storage information as a special application.

The need for parallel processing of data and their complex relationships has been recognized [BicL86, BicL89, DeWi90, KimK90]. The work by DeWitt et al. analyzes three distributed workstation-server architectures (namely, object, page and file servers) for the efficient processing of queries based on an OO data model. This work varies the degree of data clustering and the buffer size in its analysis of the performance of these three architectures. It does not investigate parallel architectures and algorithms for processing and optimizing OO queries. The AGM system [BicL86, BicL89] represents and processes a database as a network of interrelated entities and relationships modeled by the ER model. An asynchronous approach is used to process queries. Our work uses a similar data representation and processing approach. However, the granularity of computation in AGM is at the data element level. In our opinion, this is not very suitable for processing very large OO databases, since a large number of tokens carrying a substantial amount of data would have to be generated, transmitted and processed. Also, the result of a query in AGM is not represented structurally in the same model as the original database; thus, it cannot be further operated on by the same query model (i.e., the closure property is not maintained). The work presented in Kim's paper [KimK90] analyzes three types of parallelism in processing OO queries (namely, node parallelism, path parallelism and class-hierarchy parallelism). They are also exploited in our work. However, Kim's work took the analytical approach and only considers queries which access the object instances of a single target class. In our work, parallel algorithms are implemented to process multiple queries which access the object instances of multiple target classes, their interrelationships, and their attribute values.














CHAPTER 3
OBJECT-ORIENTED DATABASE AND QUERY SPECIFICATION


In this chapter, we describe an OO view of a database and a graph-based approach to query specification and processing.

3.1 Object-oriented View of a Database

An object-oriented database (OODB) can be viewed as a collection of objects, grouped together in classes and interrelated through various types of associations [SuSY89, KimW89b, Well92, Ishi93, Bham93]. It can be represented by graphs at both the intensional and the extensional levels. At the intensional (schema) level, a database is defined by a collection of inter-related object classes in the form of a Schema Graph (SG). Figure 3.1 shows the SG of a university database. Rectangular vertices represent entity classes and circular vertices represent domain classes. Objects in entity classes are entities of interest in an application domain. Each object has a system-assigned unique object identifier (OID). Objects in a domain class serve as values (e.g., the integer 10, the character string "algorithm") for defining other entity or complex domain class objects. The associations among classes are represented by the edges in the SG. For example, the association between Course and Department is represented by an attribute-domain link (a fine line), and the association between Person and Student is represented by a superclass-subclass link (a bold arrow). At the extensional (object instance) level, a database can be viewed as a network of object instances in different classes, inter-related through their associations. This can be represented as an Object Graph (OG). Figure 3.2 shows an OG of a portion of the university database. Every object instance in this graph is assigned by the system an instance identifier (IID), which is the concatenation of an OID and a class ID. Symbols such as r1, g2, s1, etc., instead of numbers, are used as IIDs for ease of reference. Links in the figure show the bi-directional references between object instances. We note here that object instances are the data representations of objects in their classes. In this example database, we assume the distributed or dynamic model of inheritance, in which the data associated with an object are distributed in the object classes of a class lattice, instead of the centralized or static model, in which all its data are stored in a bottom node of the class lattice. The former model achieves the inheritance of attributes and methods at run-time, whereas the latter achieves it at compilation-time. The query processing and optimization strategies presented in this dissertation are applicable to both inheritance models.
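A minimal in-memory representation of the extensional level described above can be sketched as follows (illustrative Python of our own; the class and method names are not part of the system, and the order of concatenation inside an IID is chosen only to reproduce the symbols used in the figures):

```python
class ObjectGraph:
    """Object instances keyed by IID, with bi-directional links."""
    def __init__(self):
        self.instances = {}  # IID -> dictionary of attribute values
        self.links = {}      # IID -> set of associated IIDs

    @staticmethod
    def make_iid(class_id, oid):
        # An IID concatenates a class ID and an OID (e.g., 's' + 1 -> "s1").
        return f"{class_id}{oid}"

    def add_instance(self, iid, attrs=None):
        self.instances[iid] = attrs or {}
        self.links.setdefault(iid, set())

    def associate(self, iid1, iid2):
        # References are bi-directional: each instance points to the other.
        self.links[iid1].add(iid2)
        self.links[iid2].add(iid1)

og = ObjectGraph()
og.add_instance("s1", {"gpa": 3.8})
og.add_instance("g2")
og.associate("s1", "g2")
```

Keeping the links in both directions is what later enables the backward propagation of IIDs during query processing.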

3.2 Query Graph and Query Processing

Based on the above graphical model of OODBs, an object-oriented query language called OQL has been introduced [Alas89], in which a query can be specified by a query graph. A query graph is a subgraph of the schema graph and consists of a linear, tree or network structure of object classes having association operators, non-association operators and AND-OR branches. For example, the query "For all the graduate research assistants, find their GPAs, the numbers of hours of their appointments, their department names, and the section numbers of the courses they are taking" can be written in the object query language (OQL) as:


context  RA*Grad*Student AND (*Section, *Department)
retrieve Student.gpa, RA.hours, Department.name, Section.num




The context part of the query specifies the query graph shown in Figure 3.3. In the query graph, a vertex represents a class, and an edge with an association operator "*" specifies that only those instances of two adjacent classes that are associated with each other in the extensional database are of interest to the query. If a non-association operator "!" is used, only those instances that are not associated with each other will be identified. The AND branch states that an instance of the class Student must be associated with some instances in both classes Section and Department. An OR branch would specify the OR condition of object associations. Range variables can be specified for the classes referenced in the context statement.

The processing result of this query graph is shown in Figure 3.4, which is a subgraph of Figure 3.2. After the object instances in the multiple classes that satisfy the context specification have been identified, the system- or user-defined operations specified in the query can be performed on these instances. In this example, a retrieval operation is performed to obtain the hours, the GPAs, the department names, and























Figure 3.1. Schema Graph of a University Database





















Figure 3.2. Object Graph (OG)

the section numbers. In a more complex query, if attribute comparisons are involved, they are specified in a WHERE subclause of the context statement with quantifiers and complex predicates. If there are multiple links (or attributes) associated with two classes, the link or attribute name is given after the "*" or "!" operator to identify the specific link.
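The AND-branch condition on a Student instance can be expressed as a simple predicate over the object graph (an illustrative sketch of our own; `links` maps each IID to the set of IIDs it references, as in Figure 3.2, and the IIDs used here are hypothetical):

```python
def satisfies_and_branch(student_iid, links, section_iids, dept_iids):
    """AND branch: a Student instance qualifies only if it is associated
    with at least one Section instance AND at least one Department
    instance in the extensional database."""
    neighbors = links.get(student_iid, set())
    return bool(neighbors & section_iids) and bool(neighbors & dept_iids)

# s1 is linked to a section and a department; s2 only to a department.
links = {"s1": {"e1", "d1"}, "s2": {"d1"}}
```

Replacing `and` with `or` in the last line gives the OR-branch semantics, and negating a membership test gives the non-association operator "!".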

Since a query graph can be structurally very complex and graph searches have to be carried out in a potentially very large extensional database, the processing of such an OO query can be very time-consuming. For example, if the relational query processing approach of generating temporary relations is adopted for OO query processing, complex data structures will have to be established and maintained in each step of an association/non-association operation to construct the aggregated instances (i.e., similar to relational Joins), and these data of complex data types will have to be passed from one processor to another in a multi-processor computing environment to perform object traversals. Furthermore, the aggregated instances do not belong to any predefined classes. They cannot be further processed by predefined methods due to type checking problems. They may contain data which are not relevant to the operations of any user-defined operations. A better way to identify the object instances that satisfy the context specification is to retrieve the proper part of the object graph (i.e., the extensional database) from the secondary storage into main memory and traverse the in-memory structure to mark the proper instances for subsequent processing, instead of forming temporary data structures. The original structural properties of these object instances are maintained in the object graph. However, this method requires bi-directional traversals of object instances, since the disqualification of an object instance can cause the disqualification of many other associated instances, thus causing backward propagation of IIDs. Bi-directional traversals need to be supported by an efficient query processing strategy and graph-based traversal algorithms.

In this work, a two-phase query processing strategy [LamH89, Thak90] is adopted

to access and manipulate an OODB. In the first phase, multi-wavefront algorithms

(see the next section) are applied to identify the object instances that satisfy the

context specification. Local selection conditions, if they are specified in the WHERE

subclause, are applied by the involved processors to their classes in this phase. In the

second phase, system- and/or user-defined operations are executed on these object








instances. Since the retrieval of descriptive data to form the final retrieval result

is postponed until the second phase when all the instances that satisfy the context

specification have been identified, this processing strategy reduces the I/O time and

avoids the generation of large temporary instances during object instance traversals.

Another advantage of this approach is that the original structural properties of these

instances are preserved and can be used in the system- and user-defined operations

in the second phase. Thus, the closure property is preserved.



[Figure omitted: the query graph linking RA, Grad, Student, Section, and Department, with an AND branch condition at Student.]

Figure 3.3. The Query Graph







































[Figure omitted: the subdatabase resulting from the query, covering the qualified instances of RA, Grad, and Student.]

Figure 3.4. The Resulting Subdatabase
















CHAPTER 4
A GENERAL FRAMEWORK OF PARALLEL OODBMSS


4.1 A Hybrid Data Partitioning Approach


Data partitioning and placement is an important issue in parallel database sys-

tems since it affects system performance. In OODBMSs, we believe that there is a

need for a hybrid data partitioning strategy (a combination of vertical partitioning

and horizontal partitioning). In vertical partitioning, instances of a class are verti-

cally partitioned as illustrated by Figure 4.1. There are two types of partitions: (a)

data values stored in IID-data value pairs, and (b) instance cross-references stored in

IID-IID pairs.

Student IID - GPA: (s1, 3.5), (s2, 3.6), (s3, 3.5), (s4, 3.3), (s5, 3.6)

Student IID - Grad IID: (s1, g1), (s2, -), (s3, g2), (s4, g3), (s5, -)

Student IID - Section IID: (s1, se1 se2), (s2, se1 se2), (s3, -), (s4, se2), (s5, se2)

Student IID - Department IID: (s1, d1), (s2, d1), (s3, d2), (s4, d2), (s5, d2)

Figure 4.1. Vertical Partitioning of Class Student



Vertical partitioning of data improves I/O parallelism and avoids the retrieval

of data not needed by the query. By storing the attributes columns in different

files, different queries which access different attribute values of the same set of object








instances can be carried out concurrently. In other words, the intra-object parallelism

can be exploited. Also, when the data of an object is to be retrieved, only those

needed attribute values need to be accessed from the secondary storage instead of

all the values that form an object instance. This saving can be very significant for

complex objects because their attribute values can be data of complex data types

such as video and audio. Vertical partitioning also provides a simple and uniform

representation for complex objects in that cross-reference data (associations between

instances of different classes) can be represented in the same structure as the attribute

values of objects.
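As a minimal sketch, the two partition types can be modeled as plain IID-keyed columns (the values follow Figure 4.1; the dictionary layout and the `project` helper are illustrative assumptions, not the thesis's storage format):

```python
# Vertical partitioning of class Student into binary columns (a sketch).
# Type (a): IID -> attribute value; type (b): IID -> IIDs of an adjacent class.

# IID-data-value pairs for the GPA attribute of Student.
student_gpa = {"s1": 3.5, "s2": 3.6, "s3": 3.5, "s4": 3.3, "s5": 3.6}

# IID-IID pairs: cross-references from Student instances to Section instances.
student_section = {"s1": ["se1", "se2"], "s2": ["se1", "se2"],
                   "s3": [], "s4": ["se2"], "s5": ["se2"]}

def project(iids, column):
    """Retrieve only the requested binary column for the given instances."""
    return {iid: column[iid] for iid in iids}

# A query touching only GPAs reads the GPA column and nothing else.
print(project(["s1", "s4"], student_gpa))   # {'s1': 3.5, 's4': 3.3}
```

Because each column is an independent structure, queries on different attributes of the same instances can proceed concurrently.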

Intuitively, the vertical partitioning approach works well if the following two

conditions exist. First, the number of object classes is greater than that of processing

nodes of a parallel computer and the sizes of object classes are about the same.

Second, queries issued against the database access the object classes with about

the same probability. However, these conditions are not always true in real world

applications. It is possible that in a database schema, there are some very large classes

and they are accessed by queries much more frequently than the other classes. Under

this circumstance, the vertical partitioning strategy will not scale up well. Thus, the

horizontal fragmentation after the vertical partitioning (i.e., hybrid partitioning) shall

be used. Figure 4.2 shows a hybrid partitioning of the class Student. In this example,

all the hybrid segments starting from object instance s1 are mapped to processor node

1, all the hybrid segments starting from object instance sn are mapped to processor

node 2, etc. Horizontal partitioning can exploit inter-object parallelism. That









is, a query can be processed against the horizontal segments of vertically partitioned


data concurrently.
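A sketch of this hybrid scheme, assuming a simple hash-based placement rule (the rule and the two-node configuration are illustrative, not the thesis's actual mapping):

```python
# Hybrid partitioning sketch: instances are hashed to nodes (horizontal),
# and each node stores its instances' slice of every binary column (vertical).

NUM_NODES = 2

def node_of(iid):
    # Horizontal placement rule (assumed): hash the logical IID onto a node.
    return sum(ord(c) for c in iid) % NUM_NODES

def partition(column):
    """Split one vertical binary column into per-node horizontal segments."""
    segments = [dict() for _ in range(NUM_NODES)]
    for iid, value in column.items():
        segments[node_of(iid)][iid] = value
    return segments

student_gpa = {"s1": 3.5, "s2": 3.6, "s3": 3.5, "s4": 3.3, "s5": 3.6}
segments = partition(student_gpa)
# Each node can now scan its own segment of the GPA column in parallel.
assert sum(len(seg) for seg in segments) == len(student_gpa)
```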

[Figure omitted: the vertical binary columns of Figure 4.1 further divided into horizontal segments, with one group of segments mapped to node 1 and the remaining segments mapped to node 2.]

Figure 4.2. Hybrid Partitioning of Class Student



The data structure used to store the data partitions in each node is based on


the concept of "join indices." It is designed to facilitate the bi-directional traversal

of object instances. The data associated with all instances of a horizontal segment

are partitioned into vertical binary columns. There are two types of binary columns:


IID-IID pairs for storing inter-object references between two adjacent classes and IID-

attribute-value pairs for storing the descriptive data of objects. The binary columns


of the first type are pre-sorted based on the IIDs through which object references are

to be accessed. For large object classes, the binary columns of the second type are

supported by the traditional indexing schemes for fast accesses of data values given

some IIDs and fast accesses of IIDs given some data values. In Figure 4.4, some

data partitions (for simplicity's sake, no IID-attribute-value pairs are shown in this

example) and methods defined in the five object classes: RA, Grad, Student, Section








and Department are stored in processors P1, P2, P3, P4 and P5, respectively. Each

processor which holds a partition of a class maintains the IID-to-IID references to

the partitions of all its adjacent classes. For example, gl -> rl; g2 -> r2; g3 -> r3 are

stored in processor P1 to record the inter-instance references between RA and Grad

partitions, and rl -> gl; r2 -> g2; r3 -> g3 and sl -> gl; s2 -> ; s3 -> g2; s4 ->

g3; s5 -> are stored in processor P2 to record the inter-instance references between

Grad instances and RA and Student instances, respectively. This structure allows

bi-directional traversals of object instances and can be viewed as pre-computed joins

in a relational database [Vald87, BicL86].

In addition to the IID-IID pairs, an integer array CON is established for each

adjacent partition as shown in Figure 4.4. Each element of the array corresponds

to one instance of the partition stored in the processor, and the integer value is

the number of connections between that instance and the instances of an adjacent

partition. For example, the elements of array Section.CON in the Student partition

have values 2, 2, 0, 1, and 1, which specify the numbers of connections sl, s2, s3, s4,

and s5 have with the instances of the Section partition, respectively. These integer

arrays are used in the multi-wavefront algorithms to be described in the next section.
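The CON arrays can be derived directly from the stored IID-IID pairs; a minimal sketch using the Student-to-Section references of Figure 4.4 (the dictionary encoding is an assumption):

```python
# Building a CON array from a partition's IID-IID pairs: CON[iid] counts the
# connections an instance has with the instances of one adjacent partition.

# Student -> Section references held by the Student partition (Figure 4.4).
student_section = {"s1": ["se1", "se2"], "s2": ["se1", "se2"],
                   "s3": [], "s4": ["se2"], "s5": ["se2"]}

def build_con(refs):
    """CON[iid] = number of connections to the adjacent partition."""
    return {iid: len(targets) for iid, targets in refs.items()}

section_con = build_con(student_section)
print(section_con)   # {'s1': 2, 's2': 2, 's3': 0, 's4': 1, 's5': 1}
```

The values 2, 2, 0, 1, 1 match the Section.CON array described in the text.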

4.2 A Distributed Graph-based Query Processing Strategy

4.2.1 Query Graph Modification Approach

Similar to SQL, an 00 query language is a nonprocedural language. Thus, the

physical data allocation information is transparent to the users and is not stated in








the query language. In traditional RDBMSs, a query is transformed into a tree struc-

tured query execution plan before it is executed in a bottom-up manner. In parallel

RDBMSs, some parallel operators, such as SPLIT and MERGE are introduced to

the tree structure of the query execution plan (QEP) to bridge the gap between the

logical representation of a query and the physical mapping of the data [DeWi92].

The same approach is also used in some recent research in OODBMSs [Grae94]. In

our work, we propose a different approach which modifies a logical query graph into

another query graph based on the physical mapping of the data. For example, if

classes Grad and Student are horizontally partitioned into Gradl, Grad2, Studentl,

and Student2 partitions, respectively, and each of the other classes has a single parti-

tion, then the query will have to be processed against the four combinations of data

partitions to obtain the final result (i.e., Gradl, Studentl, and the partitions of all

other classes form a combination, etc.). The query graph shown in Figure 3.3 can be

transformed into a query graph as shown in Figure 4.3. From this figure, one can see

that the horizontal parallelism can be captured by the "OR" branches. Although the

algorithms for implementing "OR" and "AND" branches achieve similar results to the

"Split", "Merge" or "Exchange" operators, the main differences between the query

modification approach and the parallel operator approach are: 1) "OR" and "AND"

branches are defined in our original data query model; they are not operators intro-

duced specifically for a parallel platform; 2) processing of the partitioned data in the

former approach can take advantage of the graph-based, multi-wavefront algorithms

as well as graph-based optimization strategies.
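The graph rewriting step can be sketched as follows (the adjacency-dictionary encoding of the query graph, storing one direction of each edge, is an assumption for illustration; the class and partition names follow the example above):

```python
# Query-graph modification sketch: a horizontally partitioned class is
# replaced by its partitions, which form an implicit "OR" group.

# Logical query graph: class -> neighbors (one direction of each edge).
logical = {"RA": ["Grad"], "Grad": ["Student"],
           "Student": ["Section", "Department"]}

# Physical mapping: class -> list of its horizontal partitions.
placement = {"Grad": ["Grad1", "Grad2"], "Student": ["Student1", "Student2"]}

def modify(graph, placement):
    """Rewrite each edge so partitioned classes appear as partition groups."""
    expand = lambda c: placement.get(c, [c])
    modified = {}
    for cls, nbrs in graph.items():
        for part in expand(cls):
            # Each partition gets edges to every partition of every neighbor.
            modified[part] = [p for n in nbrs for p in expand(n)]
    return modified

m = modify(logical, placement)
print(m["Grad1"])   # ['Student1', 'Student2']
```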









[Figure omitted: the modified query graph, in which Grad is expanded into partitions Grad1 and Grad2 and Student into Student1 and Student2, joined by "OR" branches; RA, Section, and Department keep single partitions.]

Figure 4.3. Query Graph Modification Approach

4.2.2 Multiple Wavefront Algorithms

The processing of each query graph against a combination of data partitions

is based on two multiple wavefront algorithms introduced in our previous work: the

identification approach and the elimination approach. In both algorithms, data rel-

evant to the processing of a query graph are retrieved from the secondary storage

devices in parallel and manipulated in main memories by the processors which hold

the partitions of object classes referenced in the query. In the identification approach,

partitions referenced by a query are classified into two types. Partitions with more

than one "AND" conditioned edge in the query graph are called non-terminal parti-

tions; otherwise, they are called terminal partitions. Query processing starts at all the

processors that manage the terminal partitions. These processors do their local selec-

tions of instances (if selection conditions are given in the query), look up the proper

binary columns of IID-IID pairs, and send out the IIDs of the associated instances

of their only neighboring class which satisfy the selection and instance reference con-

ditions. Each propagation of IIDs forms a wavefront moving toward all other nodes








of the query graph. Multiple wavefronts go across one another in an asynchronous

fashion and the operations of all the processors depend on the operators ("*" or "!")

and branch conditions (AND or OR) given in the query. The behaviors and termina-

tion conditions of processors that contain terminal and non-terminal partitions are

as follows. Each node will send out an end-marker to one of its neighboring nodes

immediately after it sends a wavefront to that node. A node will terminate if the

number of end-markers it receives is equal to the number of edges it has. The pro-

cessor that contains a non-terminal partition would receive streams of IIDs from all

its "OR" conditioned neighbors and all its "AND" conditioned neighbors but one.

It will process those streams of IIDs and select the instances that satisfy the local

selection condition and the "AND" and "OR" branch condition. Then, it will send

the IIDs of the associated instances of the only remaining neighboring partition to

its corresponding processor. The processor of a non-terminal partition would per-

form its local selection and process the incoming streams of IIDs. When it receives

the last incoming streams of IIDs, it will process them, and pass the IIDs of those

associated instances of all other neighbors to these neighbors except the sender of

the last incoming IID stream. Figure 4.5 illustrates the execution of the query given

in Figure 3.3. In this algorithm, the processing starts from all terminal nodes and

each processor reports to its neighbor(s) the instances that satisfy the search. A

terminal node terminates its processing after it receives an end-marker from its only

neighbor. We note that all the "OR" conditioned edges can be treated as one edge.








Therefore, Figure 4.3 is not a cyclic graph. The processing of a cyclic graph can be

found in previous work [Chen95].
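The core of the identification approach can be sketched on a small AND-conditioned chain A - B - C (the data and names are illustrative, not taken from the example figures):

```python
# Identification sketch: the terminal partitions A and C start wavefronts of
# IIDs toward the non-terminal partition B, which combines the AND wavefronts.

# IID-IID binary columns held by each partition.
a_to_b = {"a1": ["b1"], "a2": ["b2"]}
c_to_b = {"c1": ["b1"], "c2": []}     # c2 has no associated B instance

def wavefront(refs, qualified):
    """Propagate the IIDs associated with the qualified local instances."""
    out = set()
    for iid in qualified:
        out.update(refs[iid])
    return out

# Terminal partitions select locally (no predicates here) and send wavefronts.
wave_from_a = wavefront(a_to_b, {"a1", "a2"})   # {'b1', 'b2'}
wave_from_c = wavefront(c_to_b, {"c1", "c2"})   # {'b1'}

# B intersects the AND wavefronts; only b1 satisfies both associations.
qualified_b = wave_from_a & wave_from_c
print(sorted(qualified_b))                      # ['b1']
```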
P1 (RA partition): g1->r1; g2->r2; g3->r3. GRA.CON: r1 1, r2 1, r3 1.

P2 (Grad partition): r1->g1; r2->g2; r3->g3; s1->g1; s2->; s3->g2; s4->g3; s5->. RA.CON: g1 1, g2 1, g3 1. STU.CON: g1 1, g2 1, g3 1.

P3 (Student partition): g1->s1; g2->s3; g3->s4; s1->se1,se2; s2->se1,se2; s3->; s4->se2; s5->se2; s1->d1; s2->d1; s3->d2; s4->d2; s5->d2. GRA.CON: s1 1, s2 0, s3 1, s4 1, s5 0. SEC.CON: s1 2, s2 2, s3 0, s4 1, s5 1. DEP.CON: s1 1, s2 1, s3 1, s4 1, s5 1.

P4 (Section partition): se1->s1,s2; se2->s1,s2,s4,s5. STU.CON: se1 2, se2 4.

P5 (Department partition): d1->s1,s2; d2->s3,s4,s5. STU.CON: d1 2, d2 3.

Figure 4.4. Data Structures for the Multiple Wavefront Algorithms


In contrast with the identification algorithm, the elimination algorithm elimi-

nates object instances that do not satisfy the context specification. When an object

instance in the in-memory object graph is eliminated in the query processing, all the

associated instances of the neighboring partitions will have to be eliminated. This

may in turn cause their associated instances to be eliminated. Thus, the elimina-

tion process will be repeated until all the unqualified instances have been eliminated.

In this algorithm, all processors become active after receiving the query graph, and

they can start processing local instances (e.g., do local selections and check instance

connectivities) without waiting for the waves of IIDs from the neighboring processors

(classes). For this reason, the elimination algorithm achieves a higher degree of par-

allelism than the identification algorithm (in the case of the AND branch condition).

In this algorithm, each processor reports to its neighbor(s) the instances that have














[Figure omitted: a four-step trace, (a) through (d), of the wavefront propagation for the example query over processors P1 (RA), P2 (Grad), P3 (Student), P4 (Section), and P5 (Department); the final qualified instances are RA {r1, r3}, Grad {g1, g3}, Student {s1, s4}, Section {se1, se2}, and Department {d1, d2}.]

Figure 4.5. Execution of the Identification Algorithm








been eliminated. The counts in the proper integer arrays are decremented. When

an entry of an array becomes zero, the corresponding instance is eliminated and the

IIDs of those eliminated instances are sent to the neighbors so that they can in turn

eliminate their instances.
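The count-decrement cascade at the heart of the elimination approach can be sketched as follows (a simplified single-condition version; the instance graph is illustrative):

```python
# Elimination sketch: when an instance fails, the CON counts of its neighbors
# are decremented; a count reaching zero triggers a further elimination.

from collections import deque

# Undirected associations between instances of adjacent partitions.
links = {"g1": ["s1"], "g2": ["s3"], "s1": ["g1"], "s2": [], "s3": ["g2"]}
con = {iid: len(nbrs) for iid, nbrs in links.items()}

def eliminate(seeds):
    """Cascade eliminations from instances that fail a local selection."""
    dead, queue = set(seeds), deque(seeds)
    while queue:
        iid = queue.popleft()
        for nbr in links[iid]:
            if nbr in dead:
                continue
            con[nbr] -= 1          # one fewer surviving connection
            if con[nbr] == 0:      # no qualified association left
                dead.add(nbr)
                queue.append(nbr)
    return dead

dead = eliminate({"s3"})           # s3 fails its local selection
print(sorted(dead))                # ['g2', 's3']
```

Here the elimination of s3 leaves g2 with no surviving connection, so g2 is eliminated in turn, mirroring the repeated elimination process described above.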

Although there is a significant difference between these two algorithms, some

processing techniques are applicable to both of them. For example, in both algo-

rithms, the propagation and processing of IIDs can be carried out in a pipelining

fashion, thus increasing the degree of parallelism. Also, in order for both algorithms

to know when to terminate, the end-marker is introduced.

4.2.3 Result Collection

The above multiple wavefront algorithms mark the IIDs that satisfy the query

pattern and send the IIDs, attributes and all the IID-IID cross-reference information

to a result collection (RC) node. Upon receiving this information, the RC node tra-

verses the cross-reference information to reconstruct the query results. This approach

creates a potential bottleneck in the RC node (even though more than one processing

node can be allocated as the RC nodes for different queries). An alternative is to

use a distributed result-collection approach which will be discussed in Section 5.

4.2.4 Method Processing and Attribute Inheritance

In OODBMSs, the behavioral properties of objects are defined by method spec-

ifications in object classes. Due to the inheritance property, all the methods defined

in an object class can be applied to the instances of all its subclasses. Likewise, the

object instances of these subclasses can also inherit the attributes defined in their








superclasses. A challenge for the design of a parallel OODBMS is how to map method

implementations and object instances to processors in such a way that the following

two goals can be reached. The first goal is to place data and their applicable method

implementations as close as possible so that data and/or code do not have to be

moved. The second goal is to make the attribute inheritance as efficient as possible.

In our approach, the following rules are used to achieve the above goals.


All the methods applicable to an object class are replicated in the processing

nodes to which the instances of the class are mapped following the hybrid data

partitioning strategy.


The mapping generally starts from the root class of generalization hierarchies

using the top down approach until all the classes are allocated.


The instances of multiple classes in an inheritance hierarchy or lattice which

hold the data of the same object are mapped to the same processing node.


The object classes which have aggregation associations with (attribute links to) the

classes in the generalization hierarchies can be mapped to the processors after

all the classes in the generalization hierarchies are allocated, to achieve load

balance; or they can be mapped randomly. After collecting system running

information, further adjustments can be made to achieve load balancing.


Figure 4.6 shows an example of applying the above rules. The object instances

of class Person are mapped to P1, P2, P3 and P4. Thus, the methods defined by

class Person are replicated in these four processing nodes. Similarly, the methods








defined by class Teacher are replicated to P2 and P3. The numbers 1 to 8 represent

OIDs. By this mapping, the methods can be applied to the data of the object classes

concurrently, thus improving the performance. At the same time, there is no separation

of the data and their applicable methods. No transferring of the data and methods

is required during query execution. Moreover, it reduces the unnecessary replication

of methods. For example, the methods defined by class Teacher are not replicated

in P1 and P4 because there are no object instances of class Teacher mapped to these

two nodes. This mapping strategy also makes the attribute inheritance very efficient

because all the instances of multiple classes in an inheritance hierarchy or lattice

which hold the data of the same object are mapped to the same node. Moreover,

some index structures can be established for the instances of the objects during the

mapping process to speed up future accesses.
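The first rule above amounts to computing, for each class, the set of nodes that hold its instances; a minimal sketch (the class-to-node mapping follows Figure 4.6; the dictionary encoding is an assumption):

```python
# Method-replication sketch: the methods of a class are replicated exactly on
# the nodes that hold instances of that class.

instance_nodes = {          # class -> nodes holding its object instances
    "Person":  {"P1", "P2", "P3", "P4"},
    "Teacher": {"P2", "P3"},
}

def replication_sites(cls):
    """Nodes on which the methods defined by cls must be replicated."""
    return instance_nodes[cls]

assert replication_sites("Teacher") == {"P2", "P3"}   # not P1 or P4
```

This reproduces the example in the text: the methods of Teacher are replicated only on P2 and P3, avoiding unnecessary replication on P1 and P4.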

[Figure omitted: object instances with OIDs 1 to 8 of class Person distributed across nodes P1 to P4; the instances that also belong to class Teacher reside on P2 and P3.]

Figure 4.6. A Proposed Mapping Strategy














CHAPTER 5
DISTRIBUTED RESULT COLLECTION


5.1 Two Architectures for Supporting Result Collection

Recently, there has been a significant trend in both the industry and research communities

to unify the relational and object-oriented database technologies. One of the chal-

lenges this shift has brought to query processing is that both navigational and retrieval

query types should be supported.

Two architectures have been studied in our research: master-slave and peer-to-

peer. In the master-slave architecture, clients (users) submit queries to a master node.

Upon receiving queries, the master node analyzes them, applies query optimization

strategies to them, modifies them based on the placement of data, and passes the modified

queries to various slave nodes. After the slave nodes finish the query processing, they

send the partial results back to the master node. In turn, the master node

assembles the partial results for each query and reports them to the clients. The

master-slave architecture is shown in Figure 5.1(a).

In the peer-to-peer architecture, a client (user) can submit a query to any node

which analyzes the query, applies optimization strategies to it, modifies it, and passes

the modified query to all the nodes that contain the relevant data. Upon receiving

a subquery, these nodes process the subqueries and the results are collected by one








of these nodes. In this architecture, the node which receives queries from clients is

called a coordinator node (C-node) and the node which processes the queries is called

a peer node (P-node). A node in a parallel machine can be a C-node and P-node

at the same time and there can be more than one C-node in a system. The peer-

to-peer architecture is shown in Figure 5.1(b). Our implementation assumes that

the parallel computer supports the client-server architecture. The server (e.g., the

parallel computer) itself works in a master-slave or peer-to-peer mode.

In the master-slave architecture, when a retrieval query is executed, the IID-IID

pairs and the attributes of the objects are sent to the master node. The master node

has to traverse the IIDs and construct the final results. This process could be time

consuming and the master node can become a potential bottleneck.

Based on the peer-to-peer architecture, we introduce a distributed result collec-

tion approach to ease the potential bottleneck caused by designating one node to collect the

results. This approach is based on a pattern-passing identification (PPI) strategy

which is a modification of the multiple wavefront identification algorithm described

in Section 4.2.2.

5.2 Pattern-passing Identification Strategy

The main idea of this strategy is to construct and propagate the association

information of objects in all the P-nodes which are involved in a query. In a query

graph, we assume a node C has i "AND" conditioned edges and j "OR" conditioned

edges. We call the processor which contains the instances of node C the Pc

processor. If i equals 1 and j equals 0, or if i equals 0 and j is










[Figure omitted: (a) the master-slave architecture, in which clients (users) submit queries to a master node connected to the slave nodes; (b) the peer-to-peer architecture, in which clients (users) connect to the peer nodes over an interconnection network.]

Figure 5.1. Two Architectures

greater than 1, we call node C a terminal node; if i is greater than 1, we call node


C a non-terminal node. We show the PPI strategy by describing the behaviors of

terminal and non-terminal nodes below.


Every node shall send an end-marker to its neighboring node immediately after

it sends out a wavefront to that node;


If C is a terminal node in a query graph;


Pc will start the IID propagation process;


when Pc receives a wavefront of association patterns each of which is

formed by a concatenation of associated IIDs from its neighbors, it will

concatenate the local IIDs that are associated with the incoming patterns.








If the number of end-markers Pc receives is equal to the number of the

edges it has, it terminates.


If C is a non-terminal node in a query graph;


upon the arrival of each incoming wavefront of association patterns, it

will concatenate the incoming patterns with local IIDs according to the

IID-IID pairs available in the local memory.

if Pc has not received (i-1) incoming wavefronts of association patterns

from its "AND" conditioned neighboring processors and (j) incoming wave-

fronts of association patterns from its "OR" conditioned neighboring pro-

cessors, it will wait for more wavefronts to come;

when Pc receives the (i-1)th wavefront, it will perform the same sequence

of operations as described in the first step and propagate the (i-1)th inter-

mediate results to the only remaining "AND" conditioned neighbor from

which it has not received a wavefront;

when Pc receives the i-th wavefront from its "AND" conditioned neighbor,

it will perform the same sequence of operations as above, propagate the

final result to all the neighbor processors except the sender of the i-th

wavefront, and terminate.


Figure 5.2 shows an example of the above procedure. The query graph is the

query pattern shown in Figure 4.3 and the object graph is as shown in Figure 3.2.

In this approach, the association pattern among the objects is constructed by all








the processors involved in a query instead of by one or more dedicated processors.

In other words, the pattern is constructed in a distributed fashion. Notice that, at

the end of the procedure, all nodes contain the same patterns of object associations

(possibly in a different order). We shall consider how to collect the results of a

retrieval query by using these patterns below.
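The pattern-concatenation step performed at each P-node can be sketched as follows (the tuple-encoded patterns and the sample IIDs are illustrative assumptions):

```python
# PPI sketch: an incoming wavefront carries association patterns (tuples of
# concatenated IIDs); the receiving node extends each pattern with its own
# associated local IIDs, using its locally stored IID-IID pairs.

# Local IID-IID pairs: incoming-pattern tail IID -> associated local IIDs.
refs = {"g1": ["s1"], "g2": ["s3"], "g3": ["s4"]}

def extend(patterns, refs):
    """Concatenate local IIDs onto each incoming association pattern."""
    out = []
    for pattern in patterns:
        for local in refs.get(pattern[-1], []):
            out.append(pattern + (local,))
    return out

incoming = [("r1", "g1"), ("r3", "g3")]   # patterns built upstream
extended = extend(incoming, refs)
print(extended)   # [('r1', 'g1', 's1'), ('r3', 'g3', 's4')]
```

Patterns whose tail IID has no local association are dropped, so only associations satisfying the query pattern survive the traversal.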

5.3 Distributed Result Collection

In a retrieval query, such as the one shown in Section 3.2, the descriptive data

(i.e., the primitive attribute values) of one or more object classes are retrieved from

the secondary storage, concatenated and presented to the query issuer in an appro-

priate order. By taking the advantage of PPI strategy, this process can be carried

out in a distributed and parallel fashion. We designate one node or a set of nodes for

each query as the result collection node(s) depending on applications. There could

be more than one criterion for selecting a result collection node. One crite-

rion is that this node contains the class(es) from which the size of the descriptive

data to be retrieved is greater than that at the other nodes. In this way, we can avoid the

transmission of the larger set of data, thus, reducing the communication cost. The

load balancing is another criterion that should be considered when multiple queries are

choosing the result collection nodes. Figure 5.3 shows a possible assignment of the

result collection nodes for the modified query shown in Figure 4.3. The highlighted

nodes are the designated result collection nodes. Each result collection node is re-

sponsible for collecting the results in a specified range (e.g., IID values or hashed IID



















[Figure too garbled to reconstruct: it traces the growth of the association patterns at each processor as the PPI wavefronts propagate for the modified query of Figure 4.3 over the object graph of Figure 3.2.]

Figure 5.2. An Example of the PPI Strategy







values). In the following sections, we shall discuss how the other nodes retrieve their

descriptive data and transfer them to the data collection nodes.

[Figure omitted: the modified query graph of Figure 4.3, with nodes P1 (RA), P2 (Grad1), P7 (Grad2), P3 (Student1), P6 (Student2), P4 (Section), and P5 (Department); the designated result collection nodes are highlighted.]

Figure 5.3. An Example of the Result Collection Node Assignment


5.3.1 Join Approach

At the end of PPI, all nodes involved in the query have the same pattern of

object associations (i.e., the same set of associated IIDs). An example of the re-

sulting patterns which involve only two classes is shown in Figure 5.4. The first

column always contains the local IIDs and the second column contains the IIDs of

the neighboring node(s). In our implementation, the patterns on each site are ordered

according to the order of the local IIDs. Pa is highlighted to indicate that it is

a result collection node.

Obviously, one way to combine the attribute values of class B with that of class

A is for Pa and Pb to retrieve their attribute values from their disks, transfer Pb's

attribute values which satisfy a specified condition (e.g., an IID value range or a hash








value) to Pa, and perform a join operation there. This approach is very similar to the

semi-join approach used in other distributed systems. In this approach, the attribute

value of each IID is retrieved and transferred only once. However, the join operation

at Pa could be very costly. If the attributes of class B cannot be held in the memory,

then they have to be stored in the secondary storage of Pa, thus requiring a lot of

I/O operations and introducing a bottleneck at Pa. This problem will be even more

serious if a join operation involves more than one neighboring class.

Pa (class A): {(1, 2), (1, 10), (1, 40), (3, 50), (8, 49), (8, 4), (6, 14), (9, 17), (15, 14), (15, 16), (25, 4), (30, 14)}

Pb (class B): {(2, 1), (4, 8), (4, 25), (10, 1), (14, 6), (14, 15), (14, 30), (16, 15), (17, 9), (40, 1), (49, 8), (50, 3)}

Figure 5.4. An Example of the Result Pattern



5.3.2 Concatenation Approach


In this approach, we re-order the patterns at Pb in the same order as that of

Pa (i.e., based on the IIDs in the second column). In this way, when the attribute

values of class B that satisfy a specified condition are retrieved and transferred to Pa,

they will be in the same order as the attribute values of class A. Therefore, the final

results can be obtained by simply concatenating the two streams of attribute values.








This scheme works well except that some of the disk pages which contain certain

IIDs might have to be read more than once. For example, in Figure 5.4, the page

containing IID 4 might have to be read from disk twice; the page containing IID 14 might

have to be read three times. The multiple reading of the same page is due to the

unordered local IIDs in the first column of Pb. Some researchers addressed this issue

by assuming a physical OID [Shek90, Lieu93] (i.e., each OID contains information

about the node and disk page of the referenced object). This approach sorts OIDs by

their page IDs before the disk access, thus, the multiple accesses of the same object

are avoided. In our research, we assume logical IIDs are used. Therefore, a different

approach (object cache) is taken to reduce the penalty of multiple retrievals of the

same page. The following steps are used in our approach:


Allocate a certain size of the memory as a cache area. The size of this area

depends on the size of the available memory.


Count the number of appearances of each local IID in the patterns. Store this

number together with each IID and denote it as the Total_Count. Because the

patterns are sorted by the local IIDs, the complexity of counting is O(n); n is

the number of patterns.


Sort the patterns so that they are in the same order as the patterns in the result

collection node. The complexity of the sorting is nlog(n).


A data structure shown in Figure 5.5 is maintained in the cache area of the

memory. This structure is used to log the attribute values of a set of most








frequently appearing IIDs. We denote this structure as A_Cache (A stands for

"attribute"). For the simplicity of presentation, we use a flat table structure.

In an implementation, other data structures such as a hash table can be used.


Those attribute values of the local IIDs needed for a retrieval query are retrieved

either from the secondary storage or from the A_Cache in the following manner. For

each IIDi,


if Total-Counti is equal to one, the attribute values of the IIDi will be

retrieved from the disk;

if TotalCounti is greater than one, the ACache will be checked. If IIDi

is already in it, the attribute values stored with IIDi are accessed from the

ACache instead of from the disk and paired with IIDi being processed,

and the Current-Counti is decremented by one;

if IID, can not be found in the A.Cache, the attribute values of IIDi

are retrieved from the secondary storage. If ACache has an empty entry,

the attribute values of IIDi is logged in A-Cache and Current-Counti =

Total-Count, 1. If A.Cache is full, the table entry with the smallest

CurrentCount will be replaced by IIDi and its attribute values if the

smallest Currentcount is less than Current-counti.
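The cache-management steps above can be sketched in Python as follows. This is a minimal sketch: the function fetch_from_disk is a hypothetical stand-in for the actual secondary-storage access, and the flat-table A_Cache is modeled with a Python dictionary.

```python
# A sketch of the A_Cache lookup logic described above. The names
# fetch_from_disk, ACache, and get_attributes are illustrative only.
def fetch_from_disk(iid):
    return "attrs(%s)" % iid  # placeholder for a real page read

class ACache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.table = {}  # iid -> [current_count, attribute_values]

    def get_attributes(self, iid, total_count):
        # Total_Count == 1: the IID appears only once, so caching it
        # cannot save a later disk access; read it and move on.
        if total_count == 1:
            return fetch_from_disk(iid)
        entry = self.table.get(iid)
        if entry is not None:
            # Cache hit: reuse the logged attribute values.
            entry[0] -= 1          # decrement Current_Count
            return entry[1]
        attrs = fetch_from_disk(iid)
        current = total_count - 1  # one appearance is consumed now
        if len(self.table) < self.capacity:
            self.table[iid] = [current, attrs]
        else:
            # Replace the entry with the smallest Current_Count,
            # but only if the newcomer's count is larger.
            victim = min(self.table, key=lambda k: self.table[k][0])
            if self.table[victim][0] < current:
                del self.table[victim]
                self.table[iid] = [current, attrs]
        return attrs
```

The eviction rule mirrors the text: entries expected to be reused many more times (large Current_Count) displace those with little remaining reuse.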


The above approach avoids repeated disk accesses of the attribute values associated
with the set of IIDs with larger counts maintained in the A_Cache. However, the
object-cache approach increases the complexity of maintaining the consistency of
OODBMSs.

(Figure: a flat table with columns IID, Current_Count, and Attributes)

Figure 5.5. An Example of a Cache














CHAPTER 6
QUERY OPTIMIZATION STRATEGIES


The multiple wavefront algorithms have been designed to achieve a high degree
of parallelism. However, a high degree of parallelism does not necessarily guarantee
maximal efficiency since, as we shall explain, processors can be kept busy doing
nonproductive work. Furthermore, in a multiple-query situation, a large number of
queries must be processed simultaneously. It is more desirable to allocate some
nodes to process other queries than to let them be committed to one particular
query and do nonproductive computations. Let us look more closely at this problem
and its possible solutions from both the single-query and multiple-query points of view.

6.1 Parallelism Not Equal to Efficiency

We use an example to illustrate the problem. Our discussion is still based on

the schema graph of Figure 3.1. We assume that there are 300 RAs at a university

with a student body size of 10,000 and a graduate student body size of 4,000. Both

rl and r2 have 20-hour appointments, and the rest of RAs' appointments are either

10 hours or 15 hours. sl's GPA is 3.5 and there are 4,000 students with a GPA of

3.5. s2's GPA is 3.6 and there are 800 students with a GPA of 3.6. Finally, there

are 10 departments at the university, and together they offer 300 sessions of courses.

The object graph is shown in Figure 6.1. For simplicity, we map each object








class to a processor node. However, if horizontal partitioning is applied to any object class,

Figure 6.1 could be considered as the object graph for a specific combination of data

partitions. Therefore, in the rest of this section, the terms "class" and "partition"

are interchangeable.

Now the query is "Find the students who are RAs with a 20-hour appointment

and 3.5 GPA, and also find those sections that the students are taking. Retrieve their

names, their GRE scores, and their department names."

It can be written in OQL as below:


context RA*Grad*Student AND (*Section, *Department)

where RA.hrs = 20 AND Student.gpa = 3.5

retrieve Student.name, Grad.gre, Department.name




If the identification approach is used, as shown in Figure 6.2, P3, which processes
the Student class, would most likely receive the IIDs propagated from the Section and

Department classes before it receives the stream of IIDs propagated from the RA

class. If most of the object instances in the Section and Department classes are

connected with the object instances in the Student class, after applying the local

selection condition and processing the two incoming wavefronts, P3 will send 4,000

IIDs to P2. Also, P2 will have to take a considerable amount of time to process these

IIDs, most of which do not contribute to the end result. But based on our assumed

object graph, we can see that the local selections of the RA, Department and Section

classes produce very few IIDs. If the processor, which stores and processes the Student








class, simply waits for all the wavefronts (including the one propagated from P1 and

P2) to come from its neighboring classes, it will only send out a very limited number

of IIDs to its neighboring classes. The communication bandwidth between P2 and

P3 as well as the CPU time of P2 can thus be saved and be used for processing other

concurrent queries. Also, based on the data specified in the retrieval statement of the

query, the attribute values associated with RA and Section classes are not needed.

Therefore, the wavefront propagations from P3 (the Student class) towards P4 (the

Section class) and from P2 (the Grad class) towards P1 (the RA class) are not needed.

The algorithm can terminate at step 4 of Figure 6.2.







(Figure: object graph with nodes RA, Grad, Student, Section, and Department)

Figure 6.1. A Modified Object Graph


From the above example, we can see that a number of optimization strategies

can be introduced for the graph-based object-oriented query processing in a parallel

environment. By starting a query at some selected nodes and/or by controlling the

directions and the extent to which a wavefront of IIDs propagates, we can not only








reduce the response time of an individual query but also the overall processing time

of concurrent queries since nonproductive computation can be avoided. In the next

subsection, four strategies for query optimization in multiple wavefront algorithms

are introduced. They aim to avoid excessive or unnecessary IID transfers between

nodes while maintaining an appropriate degree of parallelism in order to free some

processors from computations that do not contribute to the end results of queries.

(Figure: four panels, steps 1 through 4, showing the wavefront propagation among
P1 (RA), P2 (Grad), P3 (Student), P4 (Section), and P5 (Department))

Figure 6.2. A New Query Execution Plan








These optimization strategies will work if the system can pre-determine or pre-

estimate the number of instances of each class which may satisfy a query. Fortunately,

several researchers have done interesting work in relational DBMSs to estimate the

size of the outcome of selection and join operations [Ling92, SunW93]. Size
estimation techniques that introduce little run-time overhead are very suitable
for our application. The CON array and connection information (i.e., IID-IID
pairs) shown in Figure 4.4 are also very useful for estimating the number of
distinct IIDs to be sent to neighboring nodes after a local selection. A well-designed
query optimizer should be able to collect this information periodically and use it to
establish efficient execution plans.

Before we present the optimization strategies, we define a couple of parameters

which are used to characterize a query graph.

IID-size: IID-size is the estimated number of the distinct IIDs to be sent

to a neighboring node after its local processing (i.e., local selection and instance

connectivities with the neighbor). In case a node is connected with more than one

node involved in a query graph, the IID-size would be the average value of the IID-

sizes of all the node pairs. For example, based on the object graph of Figure 6.1,

when the example query pattern is applied, the IID-size of RA.hrs=20 is 2. Varying

the value of this parameter represents the effect of changing the number of instances

of a node, the selectivity factor associated with instance selection based on attribute

values, and/or the connectivity of its instances with the instances of the neighboring

nodes.








Dia: Dia or diameter stands for the longest distance between any two processing

nodes to which the object classes referenced in a query are mapped. It is the number

of nodes along the longest path between the two terminal nodes. For example, in

our example query shown in Figure 3.3, the Dia is 4. The Dia value represents the
distance of a wavefront propagation and thus determines its communication cost.
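As an illustration, the Dia of a query graph given as an adjacency list can be computed by a breadth-first search from every node (a sketch; it assumes the connected, tree-structured query graphs used in our examples, and the function names are illustrative):

```python
# A minimal sketch of computing Dia: the number of nodes along the
# longest shortest path between any two nodes of the query graph.
from collections import deque

def dia(graph):
    def bfs_depths(start):
        depth = {start: 1}            # count nodes, not edges
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in depth:
                    depth[nbr] = depth[node] + 1
                    queue.append(nbr)
        return depth
    return max(max(bfs_depths(n).values()) for n in graph)
```

For the example query graph (RA-Grad-Student, with Section and Department attached to Student), the longest path RA-Grad-Student-Section contains four nodes, so Dia is 4, matching the value stated above.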

We now proceed to present the optimization strategies.

6.2 Intraquery Scheduling Strategy

In a single query with N object classes, if the IID-size value of a particular

class is relatively large when compared with the IID-sizes of other classes, the node

that manages it should act as a "passive" node. A passive node will not become

active until it receives all the wavefronts from its AND branches. Only one node
in a query graph can act as a "passive" node, because a node with AND
branches would only propagate a wavefront of IIDs after having received all but
one wavefront of IIDs; if more than one node acted as a passive node, a deadlock
would occur. "Relatively large" can be determined by using a threshold, a value
between the largest IID-size and the smallest IID-size that needs to be determined
by a performance evaluation study. If no class has an unusually large IID-size, the
generic identification approach described before is applied.

This strategy avoids the propagation of a large number of IIDs while maintaining
some degree of parallelism. Figure 6.2 follows this rule. The numbers of IIDs to be
sent out by the RA, Grad, Section and Department nodes are smaller than the number
to be sent out by the Student node (4,000). Thus, the Student node should be a passive







node. Otherwise, a lot of processing time will be wasted and other concurrent queries

will not be able to benefit from those nodes which heavily engage in processing the

incoming IIDs and yet produce results that do not contribute to the final result of

the query.

In the identification approach, a non-terminal node will wait until it receives

i-1 wavefronts from i neighbors that are involved in an AND construct. We call this

kind of node a semi-passive node as opposed to the passive node we just described.

We assume that a node C has i AND-conditioned edges in the query graph,
and we call the processor that contains the instances of node C the Pc processor.
We now describe the behaviors of the different kinds of nodes/processors as follows:


If Pc is an active processor, Pc will start the IID propagation process. If it

receives the stream of IIDs from its only neighbor, it will mark those instances

as qualified instances and then terminate.


If Pc is a semi-passive processor, and


if Pc processor has received less than (i-1) incoming streams of IIDs from

its neighboring processors, it will wait for more streams of IIDs to come;

if Pc processor has received streams of IIDs from all its neighbors but one,

it will process the (i-1) streams and select those C instances that satisfy the

selection condition and the query pattern. Then, it will send those IIDs

that are associated with the instances of the only remaining neighboring

node to the corresponding processor.








if Pc processor has received the ith (i.e., the last) incoming stream of IIDs,

it will form the final result of the query for node C and then pass the

IIDs that are associated with the instances of all other neighbors to these

neighbors except the sender of the ith stream.


If Pc is a passive processor, and


if Pc processor has received less than i incoming streams of IIDs from its

neighboring processors, it will wait for more streams of IIDs to come;

if Pc receives i incoming streams of IIDs from all its neighboring processors,

it will process those IIDs and form the final result of the query for node

C (i.e., the set of C instances that satisfy the query graph) and then pass

those IIDs in the resulting set that are associated with the instances of all

the neighbors to these neighbors.
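The three node behaviors above can be summarized as a per-node event handler. The sketch below is illustrative only: process_streams and send_iids are hypothetical stand-ins for the local pattern matching and the IID transfer to a neighboring processor, and the initial propagation step of an active node is omitted.

```python
# A sketch of the per-node wavefront behavior: active, semi-passive,
# and passive nodes react differently to incoming IID streams.
class Node:
    def __init__(self, role, neighbors):
        assert role in ("active", "semi-passive", "passive")
        self.role = role
        self.neighbors = list(neighbors)   # the i AND-conditioned edges
        self.received = {}                 # sender -> IID stream
        self.done = False

    def on_stream(self, sender, iids, process_streams, send_iids):
        self.received[sender] = iids
        i = len(self.neighbors)
        if self.role == "active":
            # An active node has a single neighbor; the incoming
            # stream marks its qualified instances and it terminates.
            process_streams(self.received)
            self.done = True
        elif self.role == "semi-passive":
            if len(self.received) == i - 1:
                # All but one stream received: forward the selected
                # IIDs to the one remaining neighbor.
                result = process_streams(self.received)
                rest = [n for n in self.neighbors
                        if n not in self.received]
                send_iids(rest[0], result)
            elif len(self.received) == i:
                # Final (ith) stream: form the final result and
                # propagate to all neighbors except its sender.
                result = process_streams(self.received)
                for n in self.neighbors:
                    if n != sender:
                        send_iids(n, result)
                self.done = True
        else:  # passive: wait for all i streams before sending any
            if len(self.received) == i:
                result = process_streams(self.received)
                for n in self.neighbors:
                    send_iids(n, result)
                self.done = True
```

The contrast between the semi-passive and passive branches captures the optimization: a passive node sends nothing until every wavefront has arrived.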


The above strategy is a greedy approach. Another possible approach is the

randomized optimization [Ioan90]. In the randomized optimization approach, the

query optimizer randomly picks a node as the passive node and computes the cost

of each query evaluation plan (QEP). This process is repeated for another node and

the query optimizer will choose a cheaper QEP as the current QEP. After evaluating

all possible QEPs (or stopping at some pre-defined termination conditions), the query

optimizer can have an optimal (or near optimal) QEP. The drawback of this approach

is that it involves very complicated cost functions which are quite difficult to define








and validate in a parallel environment. However, it may provide a better query

processing plan some of the time.

A hybrid approach can be used to reduce the search space and still achieve a
satisfactory QEP. This approach picks the node that has the largest IID-size as a
passive node and evaluates the cost of the QEP. Then, the node that has the second
largest IID-size will be picked and the cost of the QEP is computed. The two costs
will be compared and the cheaper QEP is kept. This process repeats itself until
a termination condition is met: for example, the program terminates when there
have been X consecutive nonproductive attempts or all the nodes have been
picked. A nonproductive attempt means that the cost of a new QEP is greater
than the cost of the best QEP identified so far. The constant X is a system
parameter. This approach is based on the same heuristic rule that a node with a
large IID-size will most likely send out a large number of IIDs to its neighboring
nodes, some of which may not contribute to the final result.
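The hybrid search can be sketched as follows, where qep_cost is a hypothetical cost function supplied by the query optimizer and X is the system parameter described above:

```python
# A sketch of the hybrid passive-node search: try nodes in decreasing
# IID-size order and stop after X consecutive nonproductive attempts
# (i.e., attempts that do not improve on the best QEP so far).
def pick_passive_node(iid_sizes, qep_cost, x):
    order = sorted(iid_sizes, key=iid_sizes.get, reverse=True)
    best_node, best_cost = None, float("inf")
    nonproductive = 0
    for node in order:
        cost = qep_cost(node)
        if cost < best_cost:
            best_node, best_cost = node, cost
            nonproductive = 0
        else:
            nonproductive += 1
            if nonproductive >= x:   # termination condition
                break
    return best_node, best_cost
```

Ordering the candidates by IID-size embeds the heuristic that a large IID-size node is the most promising passive node, so the search tends to terminate early.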

6.3 Partial Graph Processing Strategy

In an OODBMS, a query can be expressed as a query graph in which each node

represents a class involved in the query. In many cases, only the descriptive data from

a limited number of nodes are of interest and are to be retrieved or processed. The

rest of the nodes in the graph are used only for determining the connectivities among

object instances. For example, in a query "Find a graduate research assistant's name

whose RA assignment is 20 hours", there are three nodes involved, namely, RA, Grad

and Student. But the query issuer is only interested in retrieving the names from the








Student node. For this kind of query, the following procedure can be used to achieve

a more efficient processing:


Mark those nodes whose descriptive data are of interest as having status 1.


Mark the nodes which are between the status 1 nodes as having status 2.


Mark the rest of the nodes according to their distance from the nearest status 1 or
2 node. The immediate neighbor of a status 1 or 2 node will be marked as having
status 3.


The generic identification or elimination algorithm is applied; however, IID

wavefronts in some directions will be suppressed by following the rules given

below.


The wavefronts between status 1 or 2 nodes are propagated.


The wavefronts from a higher numbered node to a lower numbered node are

propagated.


The wavefronts from a lower numbered node to a higher numbered node are

suppressed.
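The numbering scheme and the suppression rules can be sketched as follows for a tree-structured query graph given as an adjacency list. This is a simplified illustration with assumed function names; the status-2 test below (checking whether a node separates two status-1 nodes) is one straightforward way to find the nodes "between" the status-1 nodes in a tree.

```python
# A sketch of the node-numbering and wavefront-suppression rules.
from collections import deque

def mark_statuses(graph, retrieval_nodes):
    status = {n: 1 for n in retrieval_nodes}
    # Status 2: nodes lying on a path between two status-1 nodes.
    def reaches(src, targets, banned):
        seen, queue = {src}, deque([src])
        while queue:
            v = queue.popleft()
            for w in graph[v]:
                if w != banned and w not in seen:
                    seen.add(w)
                    queue.append(w)
        return len(targets & seen)
    for n in graph:
        if n in status:
            continue
        # n is "between" status-1 nodes if removing n leaves
        # status-1 nodes in at least two separate components.
        parts = [reaches(nb, set(retrieval_nodes), n) for nb in graph[n]]
        if sum(1 for p in parts if p > 0) >= 2:
            status[n] = 2
    # Remaining nodes: numbered by distance from a status-1/2 node
    # (immediate neighbors get 3, their neighbors 4, and so on).
    queue = deque(status)
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in status:
                status[w] = max(status[v], 2) + 1
                queue.append(w)
    return status

def propagate(status, src, dst):
    # Wavefronts between status-1/2 nodes, and from higher to lower
    # numbers, are propagated; lower-to-higher ones are suppressed.
    if status[src] <= 2 and status[dst] <= 2:
        return True
    return status[src] > status[dst]
```

The propagate predicate encodes the three rules above: wavefronts flow within the area of interest and towards it, never away from it.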


Figure 6.3 is used to illustrate the numbering scheme. In the query graph, nodes

B and F contain the data to be retrieved. They are marked as having status 1. Node

E is in between nodes B and F so it is marked as having status 2. The rest of the

nodes are marked as having status 3 or 4 according to their distances from a
status 1 or 2 node.








The purpose of marking nodes with ordered numbers is to distinguish which
part of the query graph is of interest and which direction is towards or away from
the area of interest. If a wavefront is propagated away from the area of interest, the
propagation can be suppressed. However, the end-markers should still be passed on

so that every processor involved in the processing of the query graph can count the

number of the end-markers it receives to tell when the algorithm should be terminated

in that processor.

This second optimization strategy is applicable to both the identification and
elimination algorithms. The time saving of this strategy is twofold. First of all, the

suppression of some IID propagations reduces the communication cost. Secondly, the

processors that hold the classes that do not contain the data to be retrieved contribute

only to the identification of object instance connectivities (i.e., the propagation of

wavefronts from higher numbers to lower numbers), thus, leaving themselves more

time for processing some other queries.

The only overhead introduced by this approach is to mark the nodes with ordered

numbers. If the query graph is a tree, the cost of the numbering is very small (a

variation of the depth-first search algorithm is used in our implementation). If the

query graph is cyclic, the stated numbering scheme cannot be directly applied. Many

approaches have been proposed to process cyclic queries [Kamb82, Kamb85, TayY89].

These methods first convert a cyclic query into a tree and then apply a tree-structured

query processing procedure to find the result. An approach proposed in [Chen95]

identifies the cyclic components in a query graph first, and then uses a combination of








the identification and elimination algorithms to process the cyclic components. When

this procedure finishes, the query graph can be converted to an acyclic graph so that

the two query optimization strategies as well as the numbering scheme presented in

this section can be applied.


(Figure: a query graph with nodes A through F in which B and F are marked
status 1, E status 2, and the remaining nodes status 3 or 4)

Figure 6.3. An Example of the Numbering Scheme


6.4 Interquery Scheduling and Common Pattern Sharing Strategies

Our goal for interquery scheduling is to exploit the sharing of a common pattern.

Concurrent queries can share their processing results in three ways. First, the final

processing result of one query can be used by other queries. Second, the intermediate

results of one query can be used by other queries. Third, some of the costly opera-

tions can be shared by queries (e.g., selection operation and accessing data from the

secondary storage). We will discuss them in turn.







Sharing conditions: There are different criteria for sharing. In our approach,

the following two requirements have to be met by a query before it can share the

result of another query:


1. The nodes (object classes) and edges (their associations)
of a query graph form a superset of or the same set as
those of another query graph.

2. The local selection conditions of the nodes (object classes)

in the query graph are equally or more restrictive than the

ones specified in another query graph.

For example, we have a set of queries as follows:

Q1: A[a1 > 10] B[b1 = 10] C
Q2: A[a1 > 10] B[b1 = 10] C E
Q3: A[a1 = 20] B[b1 = 10] C
Q4: A[a1 > 10] D F




According to the two conditions given above, Q2 and Q3 can share the final
result of Q1. Q4 can not share the final results of the other queries due to its
violation of condition 1. However, the local selection operation of class A in Q4
can be shared with the same operation on class A in the other queries. (We shall
discuss this approach in greater detail in Section 6.5.)
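The two sharing conditions can be checked mechanically. In the sketch below, a query is represented by its node set, edge set, and per-class selection conditions; the restrictiveness test handles only the simple '=' and '>' selections used in the Q1 through Q4 examples, and all names are illustrative.

```python
# A sketch of the two sharing conditions. A query is a dict with
# "nodes", "edges", and "sel" (class -> (op, value) selection).
def implies(cond, weaker):
    """True if `cond` is equally or more restrictive than `weaker`."""
    if weaker is None or cond == weaker:
        return True
    if cond is None:
        return False
    op, val = cond
    wop, wval = weaker
    if wop == ">":
        return val > wval          # e.g., a1 = 20 implies a1 > 10
    return False                   # '=' is only implied by itself

def can_share(query, other):
    """Can `query` use the final result of `other`?"""
    if not (set(other["nodes"]) <= set(query["nodes"]) and
            set(other["edges"]) <= set(query["edges"])):
        return False               # condition 1 violated
    return all(implies(query["sel"].get(c), other["sel"].get(c))
               for c in other["nodes"])   # condition 2
```

Applied to the example queries, this check lets Q2 and Q3 share Q1's result while rejecting Q4, which fails condition 1.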

Having specified the conditions of sharing, we now examine the structures of

sharing that can exist among queries.







Structures of Sharing: There is a need to introduce some structures of sharing

to provide a clear and more expressive way of describing the execution order of

queries in a parallel or distributed environment. For example, the four queries we

just discussed can have the structure shown in Figure 6.4.

(Figure: Q1, Q4 and Q3(A[a1 = 20]) at the first level; Q2 and Q3 at the second
level under Q1)

Figure 6.4. An Example Set of the Structures of Sharing


This set of structures carries several meanings. Firstly, it says that Q1 will be
executed before Q2 and Q3, which can share the result of Q1. Secondly,
it indicates that Q4 can not share with Q1, Q2 or Q3; however, the local selection
operations of Q1, Q4 and Q3(A[a1 = 20]) are sharable. We shall say that Q1, Q2
and Q3 are in the same structure of sharing, and Q4 by itself is in another structure.
Given a set of queries, more than one structure can be constructed. In our example,
Q1, Q4 and Q3(A[a1=20]) are at the first level (a higher level) while Q2 and Q3 are
at the second level (a lower level). The reason Q3(A[a1 = 20]) is also placed at the
first level is that its local selection condition is different from Q1's, and the instances
of class A obtained by processing Q1 can not be used for processing Q3(A[a1=20]).
However, by placing Q3's selection condition over A at the top level, that selection
condition can be shared with those of Q1 and Q4 because these three queries will be
processed concurrently.








We note here that in a structure of sharing, the higher level query is
less restrictive than the lower level queries. Thus, the query results of a
higher level query are a superset of the results of a lower level query. The
structure of sharing can be used as the structure for query scheduling. Thus, the
interquery scheduling is based on the sharing of common subpatterns.

The above example illustrates that a structure of sharing is very convenient in

describing the execution order of queries and the activation of processors that manage

different classes. Now, we proceed to introduce three basic structures of sharing.

Basic Structures of Sharing: A complex structure consists of one or more
of the three basic structures shown in Figure 6.5. Structure A is the simplest
one. Those nodes in Q2 that have corresponding nodes in Q1 will wait until those
nodes in Q1 finish processing (i.e., after receiving all the end-markers). The nodes
in Q2 will use the query results of those nodes in Q1 as their local selection results.

Moreover, in the elimination algorithm, the CON array of a node in Q1 can also be

used by the node in Q2 so that Q2 will be able to resume the processing at the place
where Q1 left off, rather than start the processing all over again. We note here that

those nodes in Q2 that do not appear in Q1 can start the query processing without

waiting for Q1 to finish. They can also participate in the sharing of distributed local

selections to be discussed later. Case AA is an example of this kind of structure

in which the result of A*B can be used for processing A*B*C and the processor of

node C can start a wavefront algorithm without waiting for the processing of A*B

to complete.












(Figure: three basic structures of sharing, (A), (B) and (C), with examples: (AA)
A*B shared by A*B*C; (BB) B*C*D and C*D*E shared by A*B*C*D*E*F; (CC)
B*C*D shared by A*B*C*D and B*C*D*E)

Figure 6.5. Three Basic Structures of Sharing

Case B describes the situation in which a complex query can share the results

of several smaller queries. Similarly, the nodes in Q3 can use the query results of the

corresponding nodes as their local selection results. If one node appears in both Q1

and Q2, its results from Q1 and Q2 will be intersected and used by the corresponding

node in Q3 as its local selection results. However, when it comes to the CON array

sharing, Q3 can only use CON arrays either from Q1 or from Q2 to avoid a possible

inconsistency. Case BB is an example of structure B. The results produced for C and

D in B*C*D and C*D*E processing will be intersected and the result will be used

as the local selection results of C and D, respectively. In the elimination approach,

the resulting CON array of either B*C*D or C*D*E processing will be used by the

corresponding nodes in the processing of A*B*C*D*E*F.

The third basic structure is shown in C. This case allows a query pattern to be

shared by several other queries. The query result sharing and the CON array sharing








between each pair of the high level and low level queries is the same as in Case A.

Case CC is an example of structure C.

We shall now discuss in a greater detail about what can be shared in a structure

of sharing.

Query results sharing: In a structure of sharing, the result of a higher level

query can be used by lower level queries as their local selection results if they have

the same local selection conditions so that the local selection operations of the lower

level queries are not necessary. However, if the lower level queries do not have the

corresponding object nodes in the higher level query, the local selection operations

of these nodes will have to be done separately. For those nodes whose selection

conditions are more restrictive than those of the corresponding nodes in the upper

level, these selection operations would have been moved to the upper level. Their

results are readily available for these lower level selection operations. After the local

selections, a multiple wavefront algorithm can start.

Intermediate result sharing: If the elimination algorithm is used, the CON

arrays of a higher level query which records some intermediate results can be shared

by the lower level queries. The CON array sharing enables the lower level queries to

start their query processing from where the higher level query ends rather than from

the very beginning. This strategy will reduce the communication and processing

costs. Similar to the case of query result sharing, if the lower level queries can not

find the corresponding nodes from which to copy the CON arrays, they will copy

them from the database (i.e., the original CON array). Also, if there is more than








one query at the higher level, the structure of sharing should specify a query from

which all the lower level queries should copy the CON arrays. After that, the CON

arrays of the source query should be freed, thus making the memory space available

to other tasks.

Common operation sharing: In a structure of sharing, some operations are
common to all queries, and some of them are very time-consuming, such as the
retrieval of the final result. These common operations can be shared. The two-phase

query processing technique discussed in Section 3 postpones the retrieval of the final

result to the second phase. The common operation sharing strategy postpones the

retrieval of the final result even further, i.e., to the end of executing a structure of

queries. This approach takes advantage of the structural property of such a structure,

i.e., the query result of a higher level query is a superset of the result of a lower level

query. In processing a structure of queries, only those data (attribute values) needed

for a high level query are retrieved from the secondary storage. The lower level

queries access their data from the data already loaded in main memories instead of

from secondary storage. This approach can greatly reduce the I/O cost. However,

it may delay the response time of some individual queries. To solve this problem, a

pipelining approach for the construction of final retrieval results can be used. The

final result of a query is constructed after object instances have been traversed by slave

nodes and those instances that satisfy some local selection conditions and the context

specification have been marked. The collection of the final retrieval result can be done

in the following pipelined fashion. As soon as a processor completes its processing of








a query, its data in the form of IID-Attribute-Value pairs, which constitute the final

retrieval result, are sent to a processor responsible for constructing the final result.

Data would arrive at the construction processor at different times, depending on
when the involved processors complete the processing of a structure of sharing, thus

forming a pipeline of data. The construction processor would assemble the received

data based on the IID information provided in the data streams to construct the final

retrieval result.
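The pipelined construction of the final result can be sketched as follows, with each completing processor's IID-Attribute-Value pairs folded into a result table keyed by IID (the class and method names are illustrative):

```python
# A sketch of the pipelined result construction: streams of
# (IID, attribute, value) pairs arrive from slave processors at
# different times and are assembled into final result rows by IID.
class ResultConstructor:
    def __init__(self):
        self.rows = {}   # iid -> {attribute: value}

    def receive(self, stream):
        # Called once per completing processor, in arrival order;
        # pairs for the same IID from different processors merge
        # into one row of the final retrieval result.
        for iid, attr, value in stream:
            self.rows.setdefault(iid, {})[attr] = value

    def final_result(self):
        return self.rows
```

Because each stream is folded in as it arrives, the construction processor never has to wait for all processors before beginning the assembly.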

6.5 Distributed Sharing of Selection Operations

In order to achieve the sharing of the results of selection operations, there must
be some processor(s) responsible for the identification of sharable selection
conditions. We use a distributed approach to achieve this identification. In this
approach, all the processors that contain object classes referenced by a set of concurrent

queries are given the structures of sharing as well as the queries. They independently

examine the top level queries in these structures (note: only the top level queries

do selections) to determine if their selection conditions make references to the same

object classes in their possession. If such selection conditions and object classes are

identified, the selection conditions are compared to determine if they are sharable.

Thus, the decision of shareability is made in a parallel and distributed fashion
instead of in a centralized fashion. The latter approach may cause a bottleneck in a

multi-query processing system. The tasks needed for sharing the results of selection

operations in each processor would depend on the storage (or access path) structure

used. For example, if there is no index established for an attribute, all its attribute






values in the instances of an object class will be accessed from the secondary storage

once and be used to process the selection conditions of all the queries that make

reference to the attribute. However, if an index is available, index accesses and their

results can be shared. Either way, the amount of I/O will be reduced.














CHAPTER 7
PERFORMANCE EVALUATIONS


We have implemented the two generic multi-wavefront algorithms, the result
collection strategies based on both the master-slave and peer-to-peer architectures,
and the four optimization strategies presented in the previous sections. Our
implementation platform is a 64-node nCUBE 2 parallel computer [nCU92]. The
software architectures for both approaches are shown in Figure 7.1.

7.1 Benchmark and Application Domains

In our evaluation, the benchmark queries introduced in Thakore's work [Thak94]

are used. We did not use the benchmarks proposed in other works [Ande90, Catt92,

Care93] because they contain much simpler query types. The query types used are

shown in Figure 7.2. The characteristics of each query type are as follows:

Type I queries involve the manipulation of complex objects. Figure 7.2(a) shows
the structure of the subschema processed by the queries. Object class C1 models a
set of complex objects. Complex objects are composed of objects of other classes
and are modeled as an aggregation hierarchy. In the figure, objects of class C1 are
composed of objects of classes C2 and C3.

Type II queries involve the manipulation of complex objects and the inheritance
of attributes. Figure 7.2(b) shows the structure of the subschema processed by queries







of this type. As can be observed from the figure, in addition to the manipulation

of the aggregation hierarchy, the inheritance of attributes through the generalization

association (labeled G) is also involved in query processing. The dynamic model of

inheritance is assumed, meaning the attributes and values associated with objects of

a superclass are defined and stored in the superclass rather than in its subclasses.

Type III queries involve the interaction (or relationship) of complex objects
with inheritance of attributes. In Figure 7.2(c), classes C1 and C4 model two sets
of complex objects. Objects of class C1 inherit attributes from class C8. Class C7
models objects that capture the interaction between the complex objects of classes
C1 and C4.

In addition to the above benchmark queries, query types with one class selection,

two-class and three-class association (join) are also evaluated.

In our performance evaluations, we conduct evaluations of the proposed query

processing and optimization strategies. We compare the speedup and scaleup of the

two architectures for result collection. We also evaluate data placement strategies

over two different application domains.

7.2 Evaluations of Optimization Strategies

Using the implemented system, we evaluate the performance of four optimiza-

tion strategies. We compare the performance of query processing with optimization

against the performance without optimization. The response time of a query is
defined as the time it takes for the slave nodes to receive the query from the master

node, process it and construct the final results. The total CPU time of a query is








the summation of the processing times of the slave nodes which are involved in the

processing of the query (excluding the idle time). We are able to obtain this data

by using the execution profiling tool provided by the nCUBE. The test database

is constructed as shown in Figure 7.2(c). We map one object class to one processor

(class-per-node), and each processor has its own I/O channel.

7.2.1 Intraquery Scheduling

For the intraquery scheduling strategy, a test query as shown in Figure 7.2(b)

is used. We assume that each of the object classes has an IID-size of 5,000 initially.

The IID-size of class C3 varies from 5,000 to 500 so that we can observe the impact of

varying the difference in IID-sizes between class C3 and classes C2 and C8. According

to the intraquery scheduling strategy, either class C2 or class C3 can be passive.

Here we arbitrarily pick class C3 as a passive class.

Figure 7.3(a) shows the response time for the single query in two situations. One

is with the intraquery scheduling strategy and the other without. We observe from

this figure that when the difference in the IID-size between Class C3 and Classes C2

and C8 is large (greater than 3,000), the intraquery optimization strategy improves

the response time. We also observe from Figure 7.3(a) that, when the difference in

IID-size between the object classes is not very significant, starting the wavefronts

from the active nodes with smaller IID-sizes decreases the degree of parallelism for

the single query so that the response time is longer.

Figure 7.3(b) shows the total CPU time (excluding the idle time) of all the

slave nodes that are involved in this query. This figure shows that, by applying the





























(a) A Master-Slave Software Architecture (b) A Peer-to-Peer Software Architecture



Figure 7.1. Two Software Architectures



























(a) Modeling of Complex Objects
(b) Modeling of Complex Objects with the Inheritance of Attribute Values










(c) Modeling of Interacting Complex Objects with the Inheritance of Attribute Values



Figure 7.2. Schema Representation of Various Benchmark Queries








intraquery scheduling strategy, the total CPU time decreases, thus leaving more CPU

time for other queries.

If we only consider the optimization of a single query, in Figure 7.3(a), we can

easily see that the threshold value of the difference in IID-sizes is about 3,000 (or when

the IID-size of the C3 class is 2,000). However, if our interest is on the optimization

of multiple queries, the response time to one query is not the only criterion that

needs to be considered. We need to consider the cost involved in achieving that

kind of response time. In other words, we need to consider how much processing

power is consumed by a single query and how much processing power is left for

other queries. The goal in parallel multiple query processing is to achieve a

balance between the response time for each individual query (parallelism)

and the cost to achieve that (efficiency) so that the overall response time

of a set of concurrent queries is reduced. With this goal in mind, by combining

Figures 7.3(a) and (b), we can see that the threshold value of the difference in IID-size

can be somewhere between 0 and 3,000.
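To make this decision rule concrete, the following sketch applies a single tunable threshold on the IID-size difference. This is only an illustration, not the system's implementation (which runs on the nCUBE): the function, its dictionary interface and the default threshold are our assumptions; the class names and IID-sizes echo this experiment.

```python
def choose_passive_class(iid_sizes, threshold=3000):
    """Pick a passive class for intraquery scheduling.

    A class is made passive (no wavefront is started from it) only when
    its IID-size falls below the largest IID-size among the candidate
    classes by more than the threshold; otherwise all classes remain
    active to preserve the degree of parallelism.
    """
    largest = max(iid_sizes.values())
    candidates = {c: s for c, s in iid_sizes.items()
                  if largest - s > threshold}
    if not candidates:
        return None  # difference too small: keep every class active
    # Pick the class with the smallest IID-size as the passive class.
    return min(candidates, key=candidates.get)

# IID-sizes as in the experiment: C2 and C8 fixed at 5,000, C3 reduced.
print(choose_passive_class({"C2": 5000, "C8": 5000, "C3": 1500}))  # C3
print(choose_passive_class({"C2": 5000, "C8": 5000, "C3": 4000}))  # None
```

With the threshold set near the observed crossover point, the rule reproduces the behavior seen in Figure 7.3(a): class C3 is made passive only when the gap is large enough to pay for the lost parallelism.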

7.2.2 Partial Graph Processing

The partial graph processing strategy is tested for the identification approach

using the query graph shown in Figure 7.2(c). We randomly pick the object classes

from which data are to be retrieved. The response time and the total CPU time of

the query are shown in Figures 7.4(a) and (b).









(a) Response Time (b) Total CPU Time



Figure 7.3. Intraquery Scheduling Strategy

Two observations can be made. Firstly, the greater the ratio of the Dia (as

defined in Section 6.1) of a partial graph to the Dia of the whole query graph, the

smaller the response time and the greater the saving in total processing time.

Secondly, the smaller the number of classes from which the final retrieval results

are to be accessed, the greater the saving in processing time. Also, if a

large class does not contain the descriptive data of interest, the response time will

be significantly reduced. The location of the class in the query graph will affect the

response time as well.

One can also notice, by comparing Figures 7.4(a) and (b), that, while the re-

sponse time for a single query may not be significantly reduced by applying the

partial graph strategy, the total CPU time is. This is because the response time is

determined by the time (both idle and processing time) of the last slave node which

finishes the processing of a query. In order to evaluate a query processing strategy in











(a) Response Time (b) Total CPU Time



Figure 7.4. Partial Graph Processing Strategy


the context of multi-query processing, both the response time of the query and the

total CPU time should be considered.


7.2.3 Interquery Scheduling and Common Pattern Sharing


Our test database is still the database shown in Figure 7.2(c). We construct a

structure of sharing which consists of three basic structures. We gradually increase

the number of queries in the structure while we measure the total response time at

each step until it consists of five queries as shown in Figure 7.5.

The performance evaluation is done for both identification and elimination ap-

proaches. We only present the results of the identification approach here because

the results of the elimination approach are similar. Figures 7.6(a) and (b) show the

performance results when both the result sharing and the sharing of result retrieval

operation (e.g., result collection phase) are applied. As expected, the performance is








further improved. We also note here that the more queries are added to the sharing

structure, the more performance gain is achieved.


7.2.4 Distributed Local Selection Sharing

The same performance evaluation method used in the preceding subsection to

evaluate the performance of the interquery scheduling and the common pattern shar-

ing is used here. The test database and the structure of sharing are the same.

However, we do not use the result sharing and the result retrieval sharing but use

only the distributed local selection sharing strategy available. Figure 7.7 shows the

performance evaluation result. We observe the same kind of performance improvement

as in the last subsection. The increasing number of queries shown in the

figure represents the increase in the sharing of local selection operations.






















Figure 7.5. Structure of Sharing Used in Performance Evaluation
















(a) Response Time When the Result Sharing is Applied
(b) Response Time When the Result Sharing and the Result Retrieval Sharing are Applied



Figure 7.6. Interquery Scheduling and Common Pattern Sharing Strategies












Figure 7.7. Distributed Local Selection Sharing Strategy








7.2.5 Scaleup and Speedup of Optimization Strategies

We also evaluate the scaleup and speedup of the multiple wavefront algorithms

and the optimization strategies. In our evaluation, the benchmark queries shown in

Figure 7.2 are used.

Many application domains can be formed by varying the percentages of the

three benchmark query types. In this dissertation, only the evaluation result of one

application domain is presented. This application domain has the same percentage

for the three benchmark query types.
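The speedup and scaleup reported below follow the usual definitions: speedup fixes the problem size while the system grows, and scaleup grows the problem size in proportion to the system size. As a minimal sketch (the timing values below are illustrative, not measurements from our system):

```python
def speedup(t_one_proc, t_n_procs):
    """Speedup: same problem, larger system; ideal value equals the
    number of processors."""
    return t_one_proc / t_n_procs

def scaleup(t_small_on_one, t_large_on_n):
    """Scaleup: problem size grows with the system size; a value near
    1.0 means the larger system keeps pace with the larger problem."""
    return t_small_on_one / t_large_on_n

# Illustrative timings (seconds); linear speedup would give 4.0 on 4 nodes.
print(speedup(40.0, 12.5))   # 3.2
print(scaleup(10.0, 11.0))   # about 0.91
```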

Figure 7.8(a) shows the scaleup of the multiple wavefront algorithms and the

optimization strategies. Several conclusions can be reached from the figure. Firstly,

when the number of the processors is much smaller than the number of object classes

in the schema, a reasonably good scalability can be achieved. This is because each

processor stores instances of multiple classes and multiple queries access data from

different classes, thus achieving some degree of load balancing. When the number of

the processors is further increased, the scalability deteriorates because the additional

processors do not lighten the processing load of these processors which hold the

instances of object classes due to the class-per-node mapping strategy used in this

particular experiment. Secondly, the optimization strategies improve the scalability.

Figure 7.8(b) shows the speedup of the multiple wavefront algorithms and the

optimization strategies. From the figure, we observe a quite good speedup when the

number of processors is much smaller than the number of the object classes in a

schema. When the number of the processors is increased, the speedup deteriorates.








The reason is the same as what was explained for the scalability. We can also observe

that the algorithms without optimization strategies have better speedup than the

ones with optimization strategies. The reason is that, when many object classes

are mapped to a processor, it is more likely that the optimization strategies can be

applied more effectively and the execution time of the queries can be reduced more

significantly. Thus, when the number of processors is increased, the execution time

cannot be reduced as much as in the case where the optimization strategies are not

applied. For the scaleup evaluation, the optimization strategies can be applied more

effectively because the problem size grows with the system size, thus, achieving a

better scalability.



(a) Scaleup of the System (b) Speedup of the System


Figure 7.8. Scaleup and Speedup of the System



7.3 Evaluations of Architectures

In our research, we evaluate the speedup and scaleup of the two architectures.

In the master-slave architecture, node 0 is designated as a master node and the rest








of the nodes as slave nodes. In the peer-to-peer architecture, node 0 is designated as

a C-node and the rest of the nodes as P-nodes. The results are collected by different

P-nodes. We also use the hybrid data placement strategy for the evaluations in this

subsection.


7.3.1 Single Class Selection


The speedup and scaleup of a single class selection are shown in Figure 7.9. It

can be observed that both architectures have good speedup and scaleup. In our exper-

iment, we also change the size of classes and the attribute size of objects. We found

that the speedup and scaleup properties do not change significantly in either case.

However, if the size of a class is very small (e.g., 50 objects/class) or the attribute

size of an object is small (e.g., 5 bytes/object), the speedup and scaleup are not good.

This is because, when each data segment is smaller than or equal to the page size (4096 bytes

in our implementation), further partitioning will not improve the performance.
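The page-size cutoff just described can be expressed as a simple guard on partition granularity. The following is a sketch under our assumptions (the function name is ours; the 4,096-byte page size is from this implementation):

```python
PAGE_SIZE = 4096  # bytes per page in this implementation

def useful_partitions(class_bytes, max_partitions):
    """Return how many partitions are worth creating: splitting a
    segment below one page adds coordination overhead without
    reducing I/O, so the page count bounds the useful fan-out."""
    by_page = max(1, class_bytes // PAGE_SIZE)
    return min(max_partitions, by_page)

# 50 objects * 100 bytes = 5,000 bytes: barely more than one page,
# so additional processors cannot help.
print(useful_partitions(50 * 100, 8))       # 1
# 20,000 objects * 100 bytes: plenty of pages to spread over 8 nodes.
print(useful_partitions(20000 * 100, 8))    # 8
```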





(a) Speedup (b) Scaleup


Figure 7.9. Single Class Selection (20000 objects/class, 100 bytes/object)








7.3.2 Two-Class Join

The speedup and scaleup of a two-class join are measured on three different

kinds of databases. As shown in Figure 7.10, the peer-to-peer architecture achieves

better speedup and scaleup than the master-slave architecture. We observe this

characteristic for all types of queries we tested. We also observe from Figure 7.11

that, in a peer-to-peer architecture, when the number of objects in a class is increased

from 4,000 objects per class to 20,000 objects per class, the speedup and scaleup

are degraded. This is because the increase in the number of objects increases

the message transmission overhead. This portion of the cost cannot be completely

parallelized. Conversely, when the attribute size of an object is increased from

100 bytes/object to 5000 bytes/object, the speedup and scaleup improve. This is

because the increase in the attribute size increases the I/O cost, which is the

portion of the cost that can be parallelized. Similar characteristics can be observed

in a master-slave architecture. The variation of an object class size or an attribute

size has similar impact on a three-class join and the benchmark queries.

7.3.3 Three-Class Join

The speedup and scaleup of the three-class join are shown in Figure 7.13. Compared

with the two-class join, the speedup and scaleup are not as good. This is

because the three-class join increases the message overhead, which cannot be fully

parallelized.

















(a) Speedup (b) Scaleup



Figure 7.10. Two-class Join (4000 objects/class, 100 bytes/object)












(a) Speedup (b) Scaleup



Figure 7.11. Two-class Join (20000 objects/class, 100 bytes/object)

















(a) Speedup

(b) Scaleup


Figure 7.12. Two-class Join (4000 objects/class, 5000 bytes/object)


(a) Speedup

(b) Scaleup


Figure 7.13. Three-class Join (4000 objects/class, 100 bytes/object)








7.3.4 Benchmark Queries


Finally, we evaluate the scaleup and speedup of the benchmark queries shown

in Figures 7.2. The comparison is made between the master-slave and peer-to-peer

architectures. We observe that the peer-to-peer architecture achieves much better

speedup and scaleup than the master-slave architecture. This is because the

queries are much more complicated and the result collection and the traversal cost

at the master node become the bottleneck.




(a) Speedup (b) Scaleup



Figure 7.14. Speedup and Scaleup of Benchmark Queries


7.4 Evaluations of Two Data Placement Strategies


In this section, we present the evaluations of the placement strategies over two

different application domains. The benchmark query types shown in Figure 7.2 are

used. In the first application domain, we assume that there are many object classes

each of which contains a small number of object instances. The queries in this appli-

cation domain involve all the object classes with an equal or nearly equal probability.








To simulate this application domain, our benchmark query set contains one Type

I query, one Type II query and eight Type III queries. There are 100 objects in

each class. In the second application domain, there are some large classes and

the queries involve the large classes with a higher probability. To simulate this

application domain, we let C1, C2 and C3 have 20,000 objects/class while the

remaining classes have 4,000 objects/class. Our benchmark query set contains one Type III

query, one Type II query and eight Type I queries.

Two data placement strategies are under evaluation, namely, the class-per-node

vertical partitioning and the hybrid partitioning strategies. In a class-per-node ver-

tical partitioning strategy, the instances of a class can be stored in a single processor

and each processor can have the instances of multiple classes. At each processor,

the instances of an object class are partitioned vertically. In a hybrid partitioning

strategy, each object class is horizontally partitioned into segments and each segment can

be further partitioned vertically. As explained in Section 4.2.1 on query modification,

a query is modified into another query based on the distribution of different combi-

nations of horizontally partitioned data. The modified query is processed against a

combination of data partitions using various optimization techniques to obtain the

final result of the original query.

We measure the response time for both data placement strategies in two different

application domains. In the class-per-node vertical partitioning strategy, at most

eight processors can be utilized. Therefore, we limit the number of processors to eight for

both strategies. Figure 7.15(a) shows the performance result in the first application








domain. In this domain, the class-per-node vertical partitioning strategy performs

better. This is because the benchmark query set involves object classes with

nearly equal probability, thus, the processing power of processors can be fully utilized.

In contrast, the hybrid partitioning strategy does not utilize the processing power

of the processors any better than the class-per-node partitioning, yet it requires some

synchronization at "OR" branches, which introduces some overhead.

Figure 7.15(b) shows the performance result in the second application domain.

In this domain, the hybrid partitioning strategy performs better. This is because

the benchmark query set involves object classes with different probabilities and the

sizes of classes are different. When the class-per-node vertical partitioning strategy

is used, there are some nodes which may finish query processing well ahead of the other

nodes and become idle. But, when hybrid partitioning strategy is used, every class is

partitioned into eight segments and mapped to eight nodes. In this way, the modified

queries can fully utilize all the processing power. Even though the modified queries

introduce some overhead because of the synchronization, the overall response time

is improved.




























(a) The First Application Domain (b) The Second Application Domain


Figure 7.15. Response Time of Benchmark Queries














CHAPTER 8
DISCUSSION AND CONCLUSION


8.1 Discussion

Whenever new optimization strategies are introduced, the overhead of using

these strategies needs to be considered since it is part of the query processing cost.

Also, in multiple query optimization, the memory size can become a limitation to the

applicability of these strategies. We address these issues below.

Overhead of the Intraquery Scheduling Strategy: The overhead of this

strategy is the identification of passive, semi-passive and active nodes. The IID-size

is the only parameter that needs to be considered for this purpose. We have defined

the IID-size for each object class as the number of IIDs that need to be sent to its

neighbors after its local processing. A processor calculates the IID-size based only on

the local information (e.g., CON array, instance connectivities to adjacent neighbors

and the distribution functions of the attributes) instead of the entire query graph.

This calculation is an in-memory operation, and the traversal of the entire query

graph is not necessary in the calculation.
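The local, traversal-free nature of this calculation can be illustrated with a back-of-the-envelope estimator. We stress that this is an assumption-laden sketch, not the system's formula: we stand in for the CON array and the attribute distribution functions with two precomputed fractions.

```python
def estimate_iid_size(class_size, selectivity, connected_fraction):
    """Estimate the number of IIDs a class sends to a neighbor after
    local processing: the local selection keeps `selectivity` of the
    instances, and `connected_fraction` of those have connections to
    the neighbor.  Both fractions are assumed to come from precomputed
    attribute distribution functions and the local CON array, so no
    traversal of the query graph is needed."""
    return round(class_size * selectivity * connected_fraction)

# 5,000 instances, a predicate keeping 40%, 90% connected to the neighbor.
print(estimate_iid_size(5000, 0.40, 0.90))  # 1800
```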

Overhead of the Partial Graph Processing Strategy: The numbering of

nodes in a query graph is the overhead for this strategy. We have pointed out in the

previous section that the numbering process is a search process on a query graph.








Usually, the number of classes involved in a query graph is not very large. Thus, the

overhead would be small.

Overhead of the Interquery Scheduling and Common Pattern Sharing:

Building the structures of sharing is the overhead of this multiple query optimization

strategy. It takes two steps to build the structures for a batch of queries. In the first

step, each query is compared with every other query in the batch to find out if they

can form a Case A (as discussed in Section 6.4) simple structure. The complexity of

this operation is O(n2) where n is the number of queries in the batch. In the second

step, those simple structures are grouped to form one or more complex structures.

The total complexity is still O(n2). Because of the memory limitation (to be discussed

below), there will be a limited number of queries that can be batched so that the

overhead for building the structures of sharing would not be too significant.
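The two-step, O(n²) construction can be sketched as below. The real Case A test of Section 6.4 operates on query graphs; here we stand in for it with a caller-supplied predicate, and in the usage example with a simple class-set overlap check, so this is only an illustration.

```python
from itertools import combinations

def build_sharing_structures(queries, can_share):
    """Build structures of sharing for a batch of queries.
    queries: list of query ids; can_share(a, b): True if the pair
    forms a simple (Case A) structure.  Step 1 tests every pair,
    O(n^2); step 2 merges overlapping simple structures into
    complex structures."""
    pairs = [(a, b) for a, b in combinations(queries, 2) if can_share(a, b)]
    structures = []
    for a, b in pairs:
        merged = {a, b}
        rest = []
        for s in structures:
            if s & merged:
                merged |= s       # absorb any structure sharing a query
            else:
                rest.append(s)
        structures = rest + [merged]
    return structures

# Stand-in sharing test: queries share if their class sets overlap.
q = {"Q1": {"C1", "C2"}, "Q2": {"C2", "C3"}, "Q3": {"C7", "C8"}}
shared = build_sharing_structures(list(q), lambda a, b: bool(q[a] & q[b]))
print(sorted(shared[0]))  # ['Q1', 'Q2']
```

Both steps stay quadratic in the batch size, which matches the complexity argument above: with the memory-limited batch sizes discussed below, the construction cost remains small.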

Overhead of the Distributed Local Selection Sharing: In this strategy,

all the decisions on sharing are made in the slave nodes so that there is no bottleneck

problem for the master node. The overhead is in identifying the selection predicates

and deciding the order of executing the selection operations.

We have run different sets of queries to determine the overheads of query opti-

mization and the performance gains in response time. We did this for both identifi-

cation and elimination approaches. The test database is the university database we

presented in Figure 3.1. A set of queries are used, each of which has 4 to 7 object

classes. Figure 8.1 shows the results for the identification approach. We can see

that the overhead is very small when compared with the gain in the total response








time. For example, when there are 14 concurrent queries, the overhead is 0.073

seconds while the performance gain is 121.49 − 67.28 = 54.21 seconds. We note here

that the overhead does not include the IID-size estimation because according to the

research results reported in Sun's work [SunW93], attribute distribution functions

can be computed in advance and the use of these functions at run-time to determine

IID-sizes generates negligible overhead.

Identification Approach

# of Queries                          Overhead (sec)    Total Response Time (sec)

 3    with optimization                   0.023                  13.64
      no optimization                     0                      17.28

 5    with optimization                   0.044                  44.96
      no optimization                     0                      55.29

 8    with optimization                   0.029                  41.23
      no optimization                     0                      56.06

      with optimization                   0.056                  35.48
      no optimization                     0                      50.76

14    with optimization                   0.073                  67.28
      no optimization                     0                     121.49

Figure 8.1. Optimization Overhead for the Identification Approach



Memory Limitation: The main memory size of the processors is another

factor which needs to be considered in multiple query processing and optimization.

In the traditional relational database processing, queries generate final or temporary

results in the form of relations. If one query can make use of the final or temporary

result produced by another query (e.g., the result of a selection operation), the final or

temporary relation generated by the latter should ideally be kept in the main memory







so that it can be readily used by the former without extra I/Os. However, this is not

always possible since the generated relation may be too large to be stored in the main

memory. In the storage structure and query processing strategy used in our system,

only the vertical binary columns (IID-IID and IID-attribute-value pairs) that are

relevant to query processing are fetched from the secondary storage and the result of

a selection operation or the processing of a wavefront is a set of IIDs which does not

occupy much main memory space. Therefore, they can be kept in the main memory

for use by other queries. Also, during the result collection phase (i.e., the second

phase of query processing), descriptive data relevant to a set of retrieval queries can

be fetched by the slave processors from their corresponding secondary storage once

and in parallel, and be distributed into different memory buffers which are set up

for these queries. When the buffers are full, they can be transferred to the result

collection node, which would assemble the data to produce the final instances for the

users. A proper buffering scheme can keep the pipelines of data flowing smoothly

from the slave nodes to the master node.
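The buffering scheme described above can be sketched as a flush-when-full buffer per query at each slave node. The class name, the capacity and the callback standing in for the network transfer are all our assumptions; this is a minimal illustration of the idea, not the system's buffer manager.

```python
class ResultBuffer:
    """Per-query buffer at a slave node: descriptive data accumulates
    until the buffer is full, then the whole batch is shipped to the
    result-collection node, keeping the pipeline of data flowing."""

    def __init__(self, capacity, send):
        self.capacity = capacity
        self.send = send          # callback standing in for the network
        self.items = []

    def add(self, item):
        self.items.append(item)
        if len(self.items) >= self.capacity:
            self.flush()

    def flush(self):
        if self.items:
            self.send(list(self.items))
            self.items.clear()

sent = []
buf = ResultBuffer(capacity=3, send=sent.append)
for iid in range(7):
    buf.add(iid)
buf.flush()                        # ship the final partial batch
print(sent)  # [[0, 1, 2], [3, 4, 5], [6]]
```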

8.2 Conclusion

In this dissertation, we have provided the rationale for using the proposed paral-

lel architectures, data placement strategies, query processing and optimization strate-

gies and result collection strategies for the storage and processing of object-oriented

databases. These strategies are different from those used in many existing parallel

relational database systems in the following ways. Firstly, hybrid partitioning of ob-

ject instances is used instead of the popular horizontal partitioning scheme to achieve








a high scalability and avoid the access of large instances of complex objects from the

secondary storage. Secondly, a query modification scheme is used instead of the

"split", "merge" or "exchange" operators to achieve a more uniform interprocessor

communication during query processing. Thirdly, the two-phase query processing

strategy and the marking of object instances, instead of the traditional single-phase

strategy and the generation of temporary relations, avoid the propagation of large

quantities of data among processors. Fourthly, the graph-based query specification

and processing strategy instead of the traditional tree-based strategy can offer a

higher degree of parallelism since query processing can start at multiple processors

and in multiple directions instead of the fixed leaves-to-root order. Fifthly, the multi-

wavefront algorithms instead of the algebra- and tree-based processing algorithms are

used to allow a more direct implementation and processing of query graphs. Sixthly,

the query optimization strategies introduced in this dissertation control the multiple

initializations of wavefronts and the directions of their propagations and allow queries

to share their processing results in a variety of ways. They are particularly suitable for

multi-wavefront algorithms and the graph-based query processing strategy. Lastly,

the distributed result collection scheme achieves good speedup and scaleup for the

retrieval-type queries.

In this work, we have implemented the above strategies and evaluated their

performance. We have shown that not only are they implementable but also that their

use improves the performance of multi-query processing with negligible overhead.

We do not claim that the proposed strategies are better than the more traditional






strategies used in many existing parallel database systems. A comparison between

these two sets of strategies would involve the actual implementation of both sets

and run them on a parallel system to get their precise performance measurements.

However, this task would be too laborious to undertake due to the fact that there are

many existing strategies and variations. Any selection of a subset of these strategies

for comparison purposes would be subject to criticism on the fairness of the result,

particularly since different techniques and algorithms are bound to be used to implement

them. However, we do suggest that, due to the different characteristics of

object-oriented databases, researchers in parallel database systems should investigate

different query processing and optimization strategies. The ones described in this

dissertation are some examples.














REFERENCES


[Alas89] A. M. Alashqur, S. Y. W. Su, and H. Lam. OQL: A query language
for manipulating object-oriented databases. In Proc. 15th Int'l Conf.
on Very Large Data Bases, Amsterdam, Netherlands, pp. 433-442, Aug.
1989.

[Ande90] T. Anderson, A. J. Berre, M. Mallison, I. H. H. Porter, and B. Schneider.
The hypermodel benchmark. In Proceedings of the EDBT conference,
Venice, Italy, Mar. 1990.

[Bern81] P. A. Bernstein, N. Goodman, E. Wong, C. L. Reeve, and J. B. Rothnie,
Jr. Query processing in a system for distributed databases (SDD-1).
ACM Trans. Database Syst., 6(4):602-625, Dec. 1981.

[Bham93] K. Bhambani and M. H. Kay. ODBII: The next-generation object
database. Technical overview, Fujitsu Laboratories, 1993.

[BicL86] L. Bic and R. L. Hartmann. Simulated performance of a data-driven
database machine. J. Parallel Distributed Comput., 3(1):1-22, 1986.

[BicL89] L. Bic and R. L. Hartmann. AGM: A dataflow database machine. ACM
Trans. Database Syst., 14(1):114-146, Mar. 1989.

[Care93] M. J. Carey, D. DeWitt, and J. Naughton. The 007 benchmark. In Proc.
ACM SIGMOD Int'l Conf. on Management of Data, Washington, DC,
pp. 12-21, May 1993.

[Catt92] R. Cattell and J. Skeen. Object operation benchmark. ACM Transac-
tions on Database Systems, 17(1):1-31, Mar. 1992.

[Chen91] M.-S. Chen and P. S. Yu. Determining beneficial semijoins for a join
sequence in distributed query processing. In Proc. 7th Int'l Conf. on
Data Eng., pp. 50-58, Apr. 1991.

[Chen92] M.-S. Chen, P. S. Yu, and K.-L. Wu. Scheduling and processor allocation
for parallel execution of multi-join queries. In Proc. 8th Int'l Conf. on
Data Eng., Tempe, Arizona, pp. 58-67, Feb. 1992.







[Chen95] Y. Chen and S. Y. W. Su. Identification and elimination-based parallel
query processing techniques for object-oriented databases. Journal of
Parallel and Distributed Computing, 28:130-148, 1995.

[Clue92] S. Cluet and C. Delobel. A general framework for the optimization of
object-oriented queries. In Proc. ACM SIGMOD Conf., San Diego, CA,
June 1992.

[Cope85] G. Copeland and S. N. Khoshafian. A decomposition storage model. In
Proc. ACM SIGMOD Conf., Austin, Texas, pp. 268-279, 1985.

[DeWi90] D. J. DeWitt, P. Futtersack, D. Maier, and F. Velez. A study of three
alternative workstation-server architectures for object oriented database
systems. In Proc. 16th Int'l Conf. on Very Large Data Bases, Brisbane,
Australia, pp. 107-121, Aug. 1990.

[DeWi92] D. J. DeWitt and J. Gray. Parallel database systems: The future of high
performance database systems. CACM, 35(6):85-98, June 1992.

[Grae88] G. Graefe and D. Maier. Query optimization in object-oriented database
systems: The revelation project. Technical Report, CS/E 88-025, Oregon
Graduate Center, 1988.

[Grae90] G. Graefe. Encapsulation of parallelism in the Volcano query processing
system. In Proc. ACM SIGMOD Int'l Conf. on Management of Data,
Atlantic City, NJ, pp. 102-111, June 1990.

[Grae94] G. Graefe, R. L. Cole, D. L. Davison, W. J. McKenna, and R. H. Wol-
niewicz. Extensible query optimization and parallel execution in vol-
cano. In Query Processing for Advanced Database Systems, J. C. Frey-
tag, D. Maier, and G. Vossen, editors, pp. 305-330. Morgan Kaufmann
Publishers, San Mateo, California, 1994.

[Hara94] L. Harada, N. Akaboshi, and M. Nakano. An effective parallel processing
of multi-way joins by considering resources consumption. In Proc. of
ICCI Conf., 1994.

[Ioan90] Y. E. Ioannidis and Y. C. Kang. Randomized algorithms for optimizing
large join queries. In Proc. ACM SIGMOD Int'l Conf. on Management
of Data, Atlantic City, NJ, pp. 312-321, May 1990.

[Ishi93] H. Ishikawa et al. The model, language, and implementation of an
object-oriented multimedia knowledge base management system. ACM
Transactions on Database Systems, 18(1), Mar. 1993.

[Jenq90] B. P. Jenq, D. Woelk, W. Kim, and W.-L. Lee. Query processing in
distributed ORION. In Advances in Database Technology EDBT '90,
Venice, Italy, F. Bancilhon, C. Thanos, and D. Tsichritzis, editors, pp.
169-187. Springer-Verlag LNCS 416, 1990.








[Kamb82] Y. Kambayashi, M. Yoshikawa, and S. Yajima. Query processing for dis-
tributed database using generalized semijoins. In Proc. ACM SIGMOD
Int'l Conf. on Management of Data, pp. 151-160, June 1982.

[Kamb85] Y. Kambayashi. Processing cyclic queries. In Query Processing in
Database Systems, W. Kim, D. S. Reiner, and D. S. Batory, editors,
pp. 62-78. Springer-Verlag, 1985.

[Kell91] T. Keller, G. Graefe, and D. Maier. Efficient assembly of complex ob-
jects. In Proc. ACM SIGMOD Int'l Conf. on Management of Data,
Denver, Colorado, May 1991.

[KimK90] K.-C. Kim. Parallelism in object-oriented query processing. In Proc. 6th
Int'l Conf. on Data Eng., Los Angeles, CA, pp. 209-217, Feb. 1990.

[KimW88] W. Kim, N. Ballou, H. T. Chou, J. F. Garza, and D. Woelk. Integrating
an object-oriented programming system with a database system. In Pro-
ceedings of International Conference on Object-Oriented Programming
Systems, Languages, and Applications. San Diego, CA, pp. 142-152,
Sept. 1988.

[KimW89a] W. Kim. A model of queries for object-oriented databases. In Proc.
15th Int'l Conf. on Very Large Data Bases, Amsterdam, Netherlands,
pp. 423-432, Aug. 1989.

[KimW89b] W. Kim, K. Kim, and A. Dale. Indexing techniques for object-oriented
databases. In Object-Oriented Concepts, Databases and Applications,
W. Kim and F. Lochovsky, editors. ACM and Addison-Wesley, 1989.

[Kits90] M. Kitsuregawa and Y. Ogawa. Bucket spreading parallel hash: A new,
robust, parallel hash join method for data skew in the super database
computer (SDC). In Proc. 16th Int'l Conf. on Very Large Data Bases,
Brisbane, Australia, pp. 210-221, Aug. 1990.

[LamH87] H. Lam, S. Y. W. Su, F. L. C. Seeger, C. Lee, and W. R. Eisenstadt. A
special function unit for database operations within a data-control flow
system. In Proc. of the Int'l Conf. on Parallel Processing, pp. 330-339,
Aug. 1987.

[LamH89] H. Lam, C. Lee, and S. Y. W. Su. An object flow computer for database
applications. In Proc. of the Int'l Workshop on Database Machines, pp.
1-17, June 1989.

[Lieu93] D. F. Lieuwen, D. DeWitt, and M. Mehta. Parallel pointer-based join
techniques for object-oriented databases. In Second International Con-
ference on Parallel and Distributed Information Systems, pp. 172-181,
Jan. 1993.








[Ling92] Y. Ling and W. Sun. A supplement to sampling-based methods for
query size estimation in a database system. ACM SIGMOD Record,
21(4):12-15, Dec. 1992.

[LuH91] H. Lu, M.-C. Shan, and K.-L. Tan. Optimization of multi-way join
queries for parallel execution. In Proc. 17th Int'l Conf. on Very Large
Data Bases, Barcelona, Spain, pp. 549-560, Sept. 1991.

[Mish92] P. Mishra and M. H. Eich. Join processing in relational databases. ACM
Computing Surveys, 24(1), Mar. 1992.

[Nava84] S. Navathe, S. Ceri, G. Wiederhold, and J. Dou. Vertical partitioning
algorithms for database design. ACM Trans. Database Syst., 9(4), Dec.
1984.

[nCU92] nCUBE, Foster City, CA. nCUBE 2 Programmer's Guide, 1992. Release
3.0.

[Schn90] D. A. Schneider and D. J. DeWitt. Tradeoffs in processing complex join
queries via hashing in multiprocessor database machines. In Proc. 16th
Int'l Conf. on Very Large Data Bases, Brisbane, Australia, pp. 469-480,
Aug. 1990.

[Shek90] E. J. Shekita and M. J. Carey. A performance evaluation of pointer-
based joins. In Proc. ACM SIGMOD Int'l Conf. on Management of
Data, pp. 300-311, May 1990.

[SunW93] W. Sun, Y. Ling, N. Rishe, and Y. Deng. An instant and accurate
size estimation method for joins and selection in a retrieval-intensive
environment. In Proc. ACM SIGMOD Int'l Conf. on Management of
Data, Washington, DC, pp. 79-98, June 1993.

[SuSY88] S. Y. W. Su. Database Computers: Principles, Architectures, and Tech-
niques. McGraw-Hill, 1988.

[SuSY89] S. Y. W. Su, V. Krishnamurthy, and H. Lam. An object-oriented seman-
tic association model (OSAM*). In Artificial Intelligence: Manufactur-
ing Theory and Practice, S. Kumara, A. L. Soyster, and R. L. Kashyap,
editors, pp. 463-494. Institute of Industrial Engineers, Industrial Engi-
neering and Management Press, 1989.

[Swam88] A. Swami and A. Gupta. Optimization of large join queries. In Proc.
ACM SIGMOD Int'l Conf. on Management of Data, Chicago, Illinois,
pp. 8-17, June 1988.

[TayY89] Y. C. Tay. Attribute agreement. In Proc. of the 8th ACM SIGACT-
SIGMOD-SIGART Symposium on Principles of Database Systems, pp.
110-119, Mar. 1989.





DATA PARTITIONING, QUERY PROCESSING
AND OPTIMIZATION TECHNIQUES
FOR PARALLEL OBJECT-ORIENTED DATABASES

By

YING HUANG

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

1996


To My Parents for Their Love and Encouragement in All My Endeavors.


ACKNOWLEDGEMENTS

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
LIST OF FIGURES
ABSTRACT

CHAPTERS

1 INTRODUCTION

2 SURVEY OF RELATED WORK

3 OBJECT-ORIENTED DATABASE AND QUERY SPECIFICATION
3.1 Object-oriented View of a Database
3.2 Query Graph and Query Processing

4 A GENERAL FRAMEWORK OF PARALLEL OODBMSs
4.1 A Hybrid Data Partitioning Approach
4.2 A Distributed Graph-based Query Processing Strategy
4.2.1 Query Graph Modification Approach
4.2.2 Multiple Wavefront Algorithms
4.2.3 Result Collection
4.2.4 Method Processing and Attribute Inheritance

5 DISTRIBUTED RESULT COLLECTION
5.1 Two Architectures for Supporting Result Collection
5.2 Pattern-passing Identification Strategy
5.3 Distributed Result Collection
5.3.1 Join Approach
5.3.2 Concatenation Approach

6 QUERY OPTIMIZATION STRATEGIES
6.1 Parallelism Not Equal to Efficiency
6.2 Intraquery Scheduling Strategy
6.3 Partial Graph Processing Strategy
6.4 Interquery Scheduling and Common Pattern Sharing Strategies
6.5 Distributed Sharing of Selection Operations

7 PERFORMANCE EVALUATIONS
7.1 Benchmark and Application Domains
7.2 Evaluations of Optimization Strategies
7.2.1 Intraquery Scheduling
7.2.2 Partial Graph Processing
7.2.3 Interquery Scheduling and Common Pattern Sharing
7.2.4 Distributed Local Selection Sharing
7.2.5 Scaleup and Speedup of Optimization Strategies
7.3 Evaluations of Architectures
7.3.1 Single Class Selection
7.3.2 Two-Class Join
7.3.3 Three-Class Join
7.3.4 Benchmark Queries
7.4 Evaluations of Two Data Placement Strategies

8 DISCUSSION AND CONCLUSION
8.1 Discussion
8.2 Conclusion

REFERENCES

BIOGRAPHICAL SKETCH

LIST OF FIGURES

3.1 Schema Graph of a University Database
3.2 Object Graph (OG)
3.3 The Query Graph
3.4 The Resulting Subdatabase
4.1 Vertical Partitioning of Class Student
4.2 Hybrid Partitioning of Class Student
4.3 Query Graph Modification Approach
4.4 Data Structures for the Multiple Wavefront Algorithms
4.5 Execution of the Identification Algorithm
4.6 A Proposed Mapping Strategy
5.1 Two Architectures
5.2 An Example of the PPI Strategy
5.3 An Example of the Result Collection Node Assignment
5.4 An Example of the Result Pattern
5.5 An Example of a Cache
6.1 A Modified Object Graph
6.2 A New Query Execution Plan
6.3 An Example of the Numbering Scheme
6.4 An Example Set of the Structures of Sharing
6.5 Three Basic Structures of Sharing
7.1 Two Software Architectures
7.2 Schema Representation of Various Benchmark Queries
7.3 Intraquery Scheduling Strategy
7.4 Partial Graph Processing Strategy
7.5 Structure of Sharing Used in Performance Evaluation
7.6 Interquery Scheduling and Common Pattern Sharing Strategies
7.7 Distributed Local Selection Sharing Strategy
7.8 Scaleup and Speedup of the System
7.9 Single Class Selection (20000 objects/class, 100 bytes/object)
7.10 Two-class Join (4000 objects/class, 100 bytes/object)
7.11 Two-class Join (20000 objects/class, 100 bytes/object)
7.12 Two-class Join (4000 objects/class, 5000 bytes/object)
7.13 Three-class Join (4000 objects/class, 100 bytes/object)
7.14 Speedup and Scaleup of Benchmark Queries
7.15 Response Time of Benchmark Queries
8.1 Optimization Overhead for the Identification Approach

Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

DATA PARTITIONING, QUERY PROCESSING
AND OPTIMIZATION TECHNIQUES
FOR PARALLEL OBJECT-ORIENTED DATABASES

By

Ying Huang

May 1996

Chairman: Dr. Stanley Y. W. Su
Major Department: Electrical and Computer Engineering

Much work has been accomplished in the past on the subject of parallel query processing and optimization in parallel relational database systems. However, little work on the same subject has been done in parallel object-oriented database systems. Since the object-oriented view of a database and its processing are quite different from those of a relational system, it can be expected that techniques of parallel query processing and optimization for the latter can be different from the former. In this dissertation, we present two parallel architectures, a general framework for parallel object-oriented database systems, and several implemented query processing and optimization strategies together with some performance evaluation results. In this work, multi-wavefront algorithms are used in query processing to allow a higher degree

of parallelism than the traditional tree-based query processing. Four optimization strategies, which are designed specifically for the multi-wavefront algorithms and for the optimization of single as well as multiple queries, are introduced and evaluated. A distributed result collection scheme, which is designed to support retrieval queries, is also introduced. Furthermore, two parallel architectures, namely, master-slave and peer-to-peer architectures, are compared. A comparison is also made of two data placement strategies, namely, class-per-node vertical partitioning and hybrid partitioning. The query processing algorithms, the four optimization strategies, and the distributed result collection scheme have been implemented on an nCUBE2 parallel computer, and the results of a performance evaluation are presented in this dissertation. The main emphases and the intended contributions of this dissertation are 1) data partitioning, parallel architecture, query processing, query optimization, and result collection strategies suitable for parallel OODBMSs; 2) the implementation of these strategies; and 3) the performance evaluation results.

CHAPTER 1
INTRODUCTION

Research on parallel database systems began in the early 1970s when the relational model and relational database management systems started to become popular. Since then, a considerable amount of work has been carried out in parallel processing of relational databases. Many parallel query processing techniques and algorithms, particularly for the processing of the time-consuming join operation [Vald84, Grae90, Kits90, LuH91, Chen92], have been introduced, analyzed, and prototyped. In recent years, OODBMSs have become quite popular. Some frequent questions raised among researchers and practitioners in the database area are: "What are the major differences between relational database processing and object-oriented database processing?", "Can parallel processing techniques and algorithms introduced for relational systems be directly applied to object-oriented systems?", and "What new or modified parallel techniques and algorithms can be introduced to make future parallel OODBMSs more efficient?". From the parallel processing perspective, OODBMSs differ from RDBMSs in the following two main aspects:

1. OODBMSs deal with complex objects instead of normalized relational tuples. Data associated with a complex object, say, the design of an airplane, can be composed of thousands of object instances of a large number of classes. Each instance

may contain data of complex types like set, list, array, bag, image, voice, etc. This fact has two implications. First, a query in an OODBMS may involve a large number of classes. Traversals of multiple object classes for object instances that satisfy or do not satisfy some data conditions are frequent operations. Support for efficient, bi-directional traversals of object instances, as well as their retrieval, is needed to achieve efficient query processing. Furthermore, new parallel query optimization strategies will be required to reduce the I/O, communication, and processing times during these traversals. Second, since an instance of a complex object may contain much data, the traditional tuple-oriented data access from secondary storage and tuple-oriented query processing used in relational DBMSs may no longer be suitable. Only those parts of the data associated with instances of complex objects that are relevant to a query should be accessed, instead of the entire instances, in order to avoid an excessive I/O time. Thus, different data structures are required for efficient parallel processing of complex objects.

2. Relational systems deal with only the retrieval, update, insertion, and deletion of data from databases. Further processing of the retrieved or manipulated data is done in application programs, and thus is out of the control of relational DBMSs. In this type of system architecture, it makes sense to generate temporary relations in different steps of query processing, since these generated results are to be either retrieved or further manipulated by storage operations. In OODBMSs, in addition to the traditional database operations, user-defined operations and their implementations (methods) are managed and performed by the systems. Activations of methods

are done by passing proper messages to object instances. It is therefore important to store methods close to their applicable instances so that they are readily available when object instances which satisfy some search conditions have been identified. Except for the final retrieval of data in a retrieval query, the assembling of descriptive data (or attribute values) in object instances (which is equivalent to the generation of temporary relations) should not be carried out, since it involves the access of large quantities of data from secondary storage (high I/O time), the assembly of object instances (high processing time), and the passing of assembled data from processor to processor (high communication time). Some of these assembled data are not applicable to the user-defined operations specified in many nonretrieval-oriented queries.

These differences and the increasing popularity of OODBMSs have motivated our research in data partitioning, query processing, and query optimization strategies for use in parallel OODBMSs. In our research, we propose a hybrid data partitioning strategy (horizontal and vertical partitioning) to achieve higher scalability while maintaining a uniform representation of an OO database across the processing nodes. In order to achieve partitioned data parallelism, a global, logical query graph is decomposed into many physical query graphs. This approach is different from the ones used in other parallel systems [DeWi92, Grae94], which introduced parallel operators to bridge the gap between the logical representation of a query and the physical allocation of data elements.
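The decomposition just described — a global, logical query graph expanded into physical query graphs over class partitions — can be sketched as follows. This is our own minimal illustration; the function name, data structures, and placement map are hypothetical and not taken from the system itself.

```python
# Minimal sketch (our illustration, not the dissertation's algorithm):
# a logical query-graph edge between two classes induces physical edges
# between every pair of nodes that hold partitions of those classes.

from itertools import product

def decompose(logical_edges, placement):
    """logical_edges: list of (class_a, class_b) edges.
    placement: class name -> list of node ids holding its partitions.
    Returns the physical edges between (class, node) partitions."""
    physical = []
    for a, b in logical_edges:
        for na, nb in product(placement[a], placement[b]):
            physical.append(((a, na), (b, nb)))
    return physical
```

For instance, a single Grad–Student edge, with Grad partitioned over nodes 0 and 1 and Student over nodes 1 and 2, yields four physical edges, one per partition pair.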

The parallel and asynchronous multiple wavefront algorithms proposed in [Chen95] are used in this research as the fundamental query processing strategies. We have explored the graph-based asynchronous query model and developed four optimization strategies based on the generic wavefront algorithms to support both single and multiple query processing. Based on the multiple wavefront algorithms, a distributed result collection scheme has been introduced to support retrieval queries. These strategies have been implemented, and the results of a performance evaluation of these strategies are presented in this dissertation.

CHAPTER 2
SURVEY OF RELATED WORK

Since the focus of this dissertation is on parallel architectures, data partitioning, query processing, and optimization strategies for parallel OODBMSs, we shall survey the related work in these areas.

Data partitioning is an important issue in parallel RDBMSs. By partitioning (or declustering) a relation across several disks, the database system can exploit the I/O bandwidth of the disks by reading and writing data in parallel. Some of the parallel or distributed databases have concentrated on horizontal partitioning. There are three basic horizontal partitioning schemes, namely, round-robin, hash, and range partitioning. These schemes and their merits have been described in two works [SuSY88, DeWi92]. The horizontal partitioning approach is essential for parallel RDBMSs to achieve good scalability and speedup. The vertical data partitioning technique has been proposed by other researchers [Nava84, Cope85], and the same strategy has been used in several parallel database projects [LamH87, Vald87]. This vertical partitioning technique (or decomposed storage model) has two major advantages for storing the instances of complex objects. First, it provides a uniform representation for complex objects. Second, it can avoid an excessive amount of I/O required to access large instances during query processing.
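The three basic horizontal schemes named above can each be sketched in a few lines. The sketch below is our own illustration (the function names are hypothetical); each scheme distributes a list of records over a set of nodes.

```python
# Illustrative sketches (ours, not from the surveyed systems) of the
# three basic horizontal partitioning schemes: round-robin, hash, range.

def round_robin(records, n):
    """Assign record i to node i mod n."""
    nodes = [[] for _ in range(n)]
    for i, rec in enumerate(records):
        nodes[i % n].append(rec)
    return nodes

def hash_partition(records, n, key):
    """Assign each record to node hash(key) mod n."""
    nodes = [[] for _ in range(n)]
    for rec in records:
        nodes[hash(key(rec)) % n].append(rec)
    return nodes

def range_partition(records, boundaries, key):
    """Assign each record to the node whose key range contains it;
    `boundaries` are the upper bounds of the first n-1 ranges."""
    nodes = [[] for _ in range(len(boundaries) + 1)]
    for rec in records:
        node = sum(1 for b in boundaries if key(rec) > b)
        nodes[node].append(rec)
    return nodes
```

Round-robin balances load regardless of data values, hash supports exact-match lookups on the partitioning key, and range supports range queries — the trade-offs discussed in the works cited above.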

Data partitioning increases the complexity of query processing. In traditional database systems, the query execution plan consists of sequential operators (i.e., "scan", "join", etc.). Thus, in some research efforts, parallel operators such as "split", "merge" and "exchange" are introduced to bridge the gap between the physical data representation and the logical query execution plan [DeWi92, Grae94]. In our research, queries are optimized at the query graph level. We directly transform a query graph into another query graph based on the physical partitioning of the instances of those object classes that are involved in the query graph.

The use of IID pairs for the bi-directional traversals of object instances to be presented in this work is similar to the "join index" concept introduced for processing relational joins [Vald87]. However, join indices can be established for any relations that are directly or indirectly associated through their common attributes, according to the access patterns of an application. The IID pairs of our system are established for all base object classes that are directly associated through object references.

The traversals of object instances through their associations are analogous to join and semi-join operations in relational database systems. The join operation is one of the fundamental relational query operations and is a time-consuming one. A recent survey on the join operation can be found in Mishra and Eich [Mish92]. The parallel execution of join operations is an accepted solution for achieving query processing efficiency [Vald84, Grae90, Kits90]. However, most of the existing works have addressed the problem of performing a join involving only two relations. Recently, some researchers have studied parallel execution strategies for multi-way join queries

using different query tree structures, such as right-deep, left-deep and bushy tree structures [Schn90, LuH91, Hara94]. Others have extended query optimization techniques to handle larger and more complex queries [Swam88, Ioan90]. The main idea introduced in these works is to find the optimal join schedule or order. Several semi-join strategies have also been introduced for query processing in distributed database systems. Similar to the join operation, most research efforts have focused on the problem of finding the optimal schedule or order of semi-joins, to reduce either the number of semi-join operations or the data transmission cost [Bern81, YooH89, Chen91]. In these works on joins and semi-joins, a query is first translated into a tree structure of relational operators, and the execution of the query follows the structure from the leaves to the root. One of the drawbacks of the tree-based query processing approach is that the degree of parallelism is still limited by the leaves-to-root order, even if the pipelining approach is used for processing the operations. Furthermore, these works consider the efficient processing of a single query. While it is fairly well understood how to achieve the optimal schedule for a single query, little is known about the optimal processing of complex, multiple queries in a multi-user environment. The research results for single query optimization are not always applicable to multi-query optimization. For example, most single query processing techniques use the response time of each query as the main performance measurement. The horizontal partitioning of a large file or relation is used to exploit intra-query parallelism, so as to reduce the response time of a single query. This approach fails to achieve a balance between intra-query parallelism and inter-query parallelism. In our work, we

try to achieve a balance between these two types of parallelism so that the overall response time of a set of queries can be reduced.

Several interesting works have dealt with parallel and non-parallel processing of OO databases. Pointer-based join techniques for both centralized and parallel OO databases have been studied in two works [Shek90, Lieu93]. In these works, the evaluations only consider the joining of two object classes. Some object-oriented database systems automatically convert OIDs stored in objects to memory pointers to other objects when they load objects from secondary storage into memory. This conversion is known as pointer swizzling [KimW88, Whit92]. Pointer swizzling makes possible efficient navigation of linked objects residing in memory. However, it depends heavily on the virtual memory mechanism of the operating system. Database systems using pointer-swizzling techniques may face difficulties when they are ported from one platform to another. Also, pointer swizzling may not be applicable to a shared-nothing parallel computer in which there is no global memory space.

Class traversals have been proposed to find the join order of the classes in a query graph in order to find associated objects [Jenq90, KimW89a]. Another work uses an assembly operator to translate a set of complex objects from their disk representations to memory representations which can be quickly traversed [Kell91]. However, these works are patterned after relational query processing techniques by translating a query graph into a tree structure. This tree-based approach implies a pair-by-pair, bottom-up evaluation of the query tree, which limits the inter-operator parallelism and can lead to the generation of large intermediate results. To overcome these

drawbacks, we use a graph-based query processing technique. It allows either all the processors, or the many processors that manage those object classes referenced by a query, to work on the query at the same time, thus achieving a higher degree of parallelism. Recognizing that keeping many processors busy does not necessarily bring about overall efficiency in multiple query processing, we also introduce several optimization strategies to avoid nonproductive computations by some processors so that they can be used to process other queries.

In OODBMSs, the encapsulation of methods with the data they operate on makes query optimization more difficult in the following ways. First, estimating the cost of executing methods is considerably more difficult. Second, encapsulation raises issues related to the accessibility of storage information by the query optimizer. Some systems overcome this difficulty by treating the query optimizer as a special application which can violate encapsulation and access information directly [Clue92]. Others propose a mechanism whereby objects "reveal" their costs as part of their interface [Grae88]. In our research, a heuristic approach is used which takes into consideration some limited storage access information. Thus, we assume that the query optimizer can access storage information as a special application.

The need for parallel processing of data and their complex relationships has been recognized [Bicl86, Bicl89, DeWi90, KimK90]. The work by DeWitt et al. analyzes three distributed workstation-server architectures (namely, object, page and file servers) for efficient processing of queries based on an OO data model. This work varies the degree of data clustering and the buffer size in their analysis of the

performance of these three architectures. It does not investigate parallel architectures and algorithms for processing and optimizing OO queries. The AGM system [Bicl86, Bicl89] represents and processes a database as a network of interrelated entities and relationships modeled by the ER model. An asynchronous approach is used to process queries. Our work uses a similar data representation and processing approach. However, the granularity of computation in AGM is at the data element level. In our opinion, this is not very suitable for processing very large OO databases, since a large number of tokens carrying a substantial amount of data would have to be generated, transmitted and processed. Also, the result of a query in AGM is not represented structurally in the same model as the original database; thus it cannot be further operated on by the same query model (i.e., the closure property is not maintained). The work presented in Kim's paper [KimK90] analyzes three types of parallelism in processing OO queries (namely, node parallelism, path parallelism and class-hierarchy parallelism). They are also exploited in our work. However, Kim's work took the analytical approach and only considers queries which access the object instances of a single target class. In our work, parallel algorithms are implemented to process multiple queries which access object instances of multiple target classes, their interrelationships, and their attribute values.
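As a concrete picture of the pointer-swizzling mechanism surveyed earlier in this chapter, the following sketch (our own; the `Obj` class and `swizzle` function are hypothetical and not drawn from the cited systems) replaces stored OID references with direct in-memory references once the referenced objects are loaded.

```python
# A minimal sketch of pointer swizzling (our illustration): when objects
# are in memory, OID references among them are replaced with direct
# object references, so later navigation needs no OID-table lookup.

class Obj:
    def __init__(self, oid, refs):
        self.oid = oid
        self.refs = refs  # OIDs on disk; direct references after swizzling

def swizzle(loaded):
    """Replace OID references with direct object references for all
    objects currently in memory (`loaded` maps OID -> Obj)."""
    for obj in loaded.values():
        obj.refs = [loaded[r] if r in loaded else r for r in obj.refs]
    return loaded
```

After swizzling, navigating from one loaded object to another is a direct reference access; OIDs of objects not yet in memory are left unswizzled, which is one reason the technique interacts with the operating system's virtual memory mechanism in the cited systems.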

CHAPTER 3
OBJECT-ORIENTED DATABASE AND QUERY SPECIFICATION

In this chapter, we describe an OO view of a database and a graph-based query specification and processing approach.

3.1 Object-oriented View of a Database

An object-oriented database (OODB) can be viewed as a collection of objects, grouped together in classes and interrelated through various types of associations [SuSY89, KimW89b, Well92, Ishi93, Bham93]. It can be represented by graphs at both the intensional and the extensional levels. At the intensional (schema) level, a database is defined by a collection of inter-related object classes in the form of a Schema Graph (SG). Figure 3.1 shows the SG of a university database. Rectangular vertices represent entity classes and circular vertices represent domain classes. Objects in entity classes are entities of interest in an application domain. Each object has a system-assigned unique object identifier (OID). Objects in a domain class serve as values (e.g., integer 10, character string "algorithm") for defining other entity or complex domain class objects. The associations among classes are represented by the edges in the SG. For example, the association between Course and Department is represented by an attribute-domain link (a fine line), and the association between Person and Student is represented by a superclass-subclass link (a bold arrow). At the extensional (object

instance) level, a database can be viewed as a network of object instances in different classes, inter-related through their associations. This can be represented as an Object Graph (OG). Figure 3.2 shows an OG of a portion of the university database. Every object instance in this graph is assigned by the system an instance identifier (IID), which is the concatenation of an OID and a class ID. Symbols such as r1, g2, s1, etc., instead of numbers, are used as IIDs for ease of reference. Links in the figure show the bi-directional references between object instances. We note here that object instances are the data representations of objects in their classes. In this example database, we assume that the distributed or dynamic model of inheritance is used, in which the data associated with an object are distributed in the object classes of a class lattice, instead of the centralized or static model, in which all its data are stored in a bottom node of the class lattice. The former model achieves the inheritance of attributes and methods at run-time, whereas the latter achieves it at compilation-time. The query processing and optimization strategies presented in this dissertation are applicable to both inheritance models.

3.2 Query Graph and Query Processing

Based on the above graphical model of OODBs, an object-oriented query language called OQL has been introduced [Alas89], in which a query can be specified by a query graph. A query graph is a subgraph of the schema graph and consists of a linear, tree or network structure of object classes having association operators, non-association operators and AND-OR branches. For example, the query "For all the

graduate research assistants, find their GPAs, the numbers of hours of their appointments, their department names, and the section numbers of the courses they are taking" can be written using the object query language (OQL) as:

context RA *Grad*Student AND (*Section, Department)
retrieve Student.gpa, RA.hours, Department.name, Section.num

The context part of the query specifies the query graph shown in Figure 3.3. In the query graph, a vertex represents a class, and an edge with an association operator "*" specifies that only those instances of two adjacent classes that are associated with each other in the extensional database are of interest to the query. If a non-association operator "!" is used, only those instances that are not associated with each other will be identified. The AND branch states that an instance of the class Student must be associated with some instances in both classes Section and Department. An OR branch would specify the OR condition of object associations. Range variables can be specified for the classes referenced in the context statement. The processing result of this query graph is shown in Figure 3.4, which is a subgraph of Figure 3.2. After having identified the object instances in the multiple classes which satisfy the context specification, system- or user-defined operations specified in the query can then be performed on these instances. In this example, a retrieval operation is performed to obtain the hours, the GPAs, the department names, and

Figure 3.1. Schema Graph of a University Database

Figure 3.2. Object Graph (OG)

the section numbers. In a more complex query, if attribute comparisons are involved, they are specified in a WHERE subclause of the context statement with quantifiers and complex predicates. If there are multiple links (or attributes) associated with two classes, the link or attribute name is given after the "*" or "!" operator to identify the specific link.

Since a query graph can be structurally very complex, and graph searches have to be carried out in a potentially very large extensional database, the processing of such an OO query can be very time-consuming. For example, if the relational query processing approach of generating temporary relations is adopted for OO query processing, complex data structures will have to be established and maintained in each step of the association/non-association operation to construct the aggregated


instances (i.e., similar to relational joins), and these data of complex data types will have to be passed from one processor to another in a multi-processor computing environment to perform object traversals. Furthermore, the aggregated instances do not belong to any predefined classes. They cannot be further processed by predefined methods due to type-checking problems. They may contain data that are not relevant to any user-defined operations. A better way to identify the object instances that satisfy the context specification is to retrieve the proper part of the object graph (i.e., the extensional database) from secondary storage to main memory and traverse the in-memory structure to mark the proper instances for subsequent processing, instead of forming temporary data structures. The original structural properties of these object instances are maintained in the object graph. However, this method requires bi-directional traversals of object instances, since the disqualification of an object instance can cause the disqualification of many other associated instances, thus causing backward propagation of IIDs. Bi-directional traversals need to be supported by an efficient query processing strategy and graph-based traversal algorithms. In this work, a two-phase query processing strategy [LamH89, Thak90] is adopted to access and manipulate an OODB. In the first phase, multi-wavefront algorithms (see the next section) are applied to identify the object instances that satisfy the context specification. Local selection conditions, if they are specified in the WHERE subclause, are applied by the involved processors to their classes in this phase. In the second phase, system- and/or user-defined operations are executed on these object


instances. Since the retrieval of descriptive data to form the final retrieval result is postponed until the second phase, when all the instances that satisfy the context specification have been identified, this processing strategy reduces the I/O time and avoids the generation of large temporary instances during object instance traversals. Another advantage of this approach is that the original structural properties of these instances are preserved and can be used in the system- and user-defined operations in the second phase. Thus, the closure property is preserved.

[Figure 3.3 (graphic not captured): the query graph, in which RA, Grad, and Student are connected by "*" edges and Section and Department are joined to Student through an AND branch.]

Figure 3.3. The Query Graph


[Figure 3.4 (graphic not captured): the resulting subdatabase, a subgraph of the object graph of Figure 3.2 containing the qualifying instances of Section, RA, Grad, Student, and Department.]

Figure 3.4. The Resulting Subdatabase
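The two-phase strategy described above can be summarized in a small sketch: phase one only marks the qualifying IIDs, and phase two fetches descriptive data for the marked instances alone. This is an illustrative rendering, not the dissertation's implementation; the function names and toy data are ours.

```python
# Illustrative two-phase retrieval: phase 1 marks the IIDs that satisfy
# the context specification; phase 2 retrieves descriptive data only
# for the marked instances, postponing all attribute I/O.

def phase_one_mark(candidates, satisfies_context):
    """Traverse the candidate instances and mark the qualifying IIDs."""
    return {iid for iid in candidates if satisfies_context(iid)}

def phase_two_retrieve(marked, fetch_attributes):
    """Fetch descriptive data only for the instances marked in phase 1."""
    return {iid: fetch_attributes(iid) for iid in marked}

# Toy extensional data: only Students associated with a Section qualify.
sections_of = {"s1": ["se1", "se2"], "s2": ["se1", "se2"],
               "s3": [], "s4": ["se2"], "s5": ["se2"]}
gpa = {"s1": 3.5, "s2": 3.6, "s3": 3.5, "s4": 3.3, "s5": 3.6}

marked = phase_one_mark(sections_of, lambda iid: bool(sections_of[iid]))
result = phase_two_retrieve(marked, gpa.__getitem__)
```

Because attribute values are fetched only in the second phase, disqualified instances (s3 here) never cause any descriptive-data I/O.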


CHAPTER 4
A GENERAL FRAMEWORK OF PARALLEL OODBMSS

4.1 A Hybrid Data Partitioning Approach

Data partitioning and placement is an important issue in parallel database systems, since it affects system performance. In OODBMSs, we believe that there is a need for a hybrid data partitioning strategy (a combination of vertical partitioning and horizontal partitioning). In vertical partitioning, instances of a class are vertically partitioned as illustrated by Figure 4.1. There are two types of partitions: (a) data values stored in IID-data-value pairs, and (b) instance cross-references stored in IID-IID pairs.

[Figure 4.1 (reconstructed from garbled OCR): vertical partitioning of class Student into binary columns.]

    Student-GPA     Student-Grad     Student-Section     Student-Dept
    s1  3.5         s1  g1           s1  se1, se2        s1  d1
    s2  3.6         s2  -            s2  se1, se2        s2  d1
    s3  3.5         s3  g2           s3  -               s3  d2
    s4  3.3         s4  g3           s4  se2             s4  d2
    s5  3.6         s5  -            s5  se2             s5  d2

Figure 4.1. Vertical Partitioning of Class Student

Vertical partitioning of data improves I/O parallelism and avoids the retrieval of data not needed by a query. By storing the attribute columns in different files, different queries which access different attribute values of the same set of object


instances can be carried out concurrently. In other words, intra-object parallelism can be exploited. Also, when the data of an object are to be retrieved, only the needed attribute values have to be accessed from secondary storage, instead of all the values that form an object instance. This saving can be very significant for complex objects, because their attribute values can be data of complex data types such as video and audio. Vertical partitioning also provides a simple and uniform representation for complex objects, in that cross-reference data (associations between instances of different classes) can be represented in the same structure as the attribute values of objects. Intuitively, the vertical partitioning approach works well if the following two conditions hold. First, the number of object classes is greater than the number of processing nodes of the parallel computer, and the sizes of the object classes are about the same. Second, queries issued against the database access the object classes with about the same probability. However, these conditions are not always true in real-world applications. It is possible that a database schema contains some very large classes which are accessed by queries much more frequently than the other classes. Under this circumstance, the vertical partitioning strategy will not scale up well. Thus, horizontal fragmentation after the vertical partitioning (i.e., hybrid partitioning) shall be used. Figure 4.2 shows a hybrid partitioning of the class Student. In this example, the hybrid segments starting from object instance s1 are mapped to processor node 1, the hybrid segments starting from object instance sn are mapped to processor node 2, etc. The horizontal partitioning can exploit inter-object parallelism. That


is, a query can be processed against the horizontal segments of vertically partitioned data concurrently.

[Figure 4.2 (graphic not captured): hybrid partitioning of class Student, in which the vertical binary columns (Student-GPA, Student-Grad, Student-Section, Student-Dept) are split horizontally, with the segments for s1 through s5 on Node 1 and the segments for sn on Node 2.]

Figure 4.2. Hybrid Partitioning of Class Student

The data structure used to store the data partitions in each node is based on the concept of "join indices". It is designed to facilitate the bi-directional traversal of object instances. The data associated with all instances of a horizontal segment are partitioned into vertical binary columns. There are two types of binary columns: IID-IID pairs for storing inter-object references between two adjacent classes, and IID-attribute-value pairs for storing the descriptive data of objects. The binary columns of the first type are pre-sorted based on the IIDs through which object references are to be accessed. For large object classes, the binary columns of the second type are supported by traditional indexing schemes for fast accesses of data values given some IIDs and fast accesses of IIDs given some data values. In Figure 4.4, some data partitions (for simplicity's sake, no IID-attribute-value pairs are shown in this example) and methods defined in the five object classes RA, Grad, Student, Section


and Department are stored in processors P1, P2, P3, P4, and P5, respectively. Each processor which holds a partition of a class maintains the IID-to-IID references to the partitions of all its adjacent classes. For example, g1 -> r1; g2 -> r2; g3 -> r3 are stored in processor P1 to record the inter-instance references between the RA and Grad partitions, and r1 -> g1; r2 -> g2; r3 -> g3 and s1 -> g1; s2 -> ; s3 -> g2; s4 -> g3; s5 -> are stored in processor P2 to record the inter-instance references between Grad instances and RA and Student instances, respectively. This structure allows bi-directional traversals of object instances and can be viewed as pre-computed joins in a relational database [Vald87, Bicl86]. In addition to the IID-IID pairs, an integer array CON is established for each adjacent partition, as shown in Figure 4.4. Each element of the array corresponds to one instance of the partition stored in the processor, and the integer value is the number of connections between that instance and the instances of an adjacent partition. For example, the elements of array Section.CON in the Student partition have values 2, 2, 0, 1, and 1, which specify the numbers of connections s1, s2, s3, s4, and s5 have with the instances of the Section partition, respectively. These integer arrays are used in the multi-wavefront algorithms to be described in the next section.

4.2 A Distributed Graph-based Query Processing Strategy

4.2.1 Query Graph Modification Approach

Similar to SQL, an OO query language is a nonprocedural language. Thus, the physical data allocation information is transparent to the users and is not stated in


the query language. In traditional RDBMSs, a query is transformed into a tree-structured query execution plan before it is executed in a bottom-up manner. In parallel RDBMSs, some parallel operators, such as SPLIT and MERGE, are introduced into the tree structure of the query execution plan (QEP) to bridge the gap between the logical representation of a query and the physical mapping of the data [DeWi92]. The same approach is also used in some recent research in OODBMSs [Grae94]. In our work, we propose a different approach which modifies a logical query graph into another query graph based on the physical mapping of the data. For example, if classes Grad and Student are horizontally partitioned into Grad1, Grad2, Student1, and Student2 partitions, respectively, and each of the other classes has a single partition, then the query will have to be processed against the four combinations of data partitions to obtain the final result (i.e., Grad1, Student1, and the partitions of all the other classes form a combination, etc.). The query graph shown in Figure 3.3 can be transformed into the query graph shown in Figure 4.3. From this figure, one can see that the horizontal parallelism can be captured by the "OR" branches. Although the algorithms for implementing "OR" and "AND" branches achieve similar things as "Split", "Merge", or "Exchange" operators, the main differences between the query modification approach and the parallel operator approach are: 1) "OR" and "AND" branches are defined in our original data query model; they are not operators introduced specifically for a parallel platform; 2) processing of the partitioned data in the former approach can take advantage of the graph-based, multi-wavefront algorithms as well as graph-based optimization strategies.
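To make the expansion concrete, the combinations of data partitions that the modified query graph must cover can be enumerated mechanically. A hedged sketch follows; the helper name and the dictionary representation are ours, not the dissertation's.

```python
# Sketch: enumerating the partition combinations that a logical query
# graph expands into; class and partition names follow the example in
# the text (Grad1/Grad2, Student1/Student2, single partitions elsewhere).
from itertools import product

def partition_combinations(partitioning):
    """partitioning maps each class in the query graph to its list of
    horizontal partitions; each combination picks one partition per
    class and must be covered by the modified query graph."""
    classes = sorted(partitioning)
    return [dict(zip(classes, combo))
            for combo in product(*(partitioning[c] for c in classes))]

combos = partition_combinations({
    "RA": ["RA"],
    "Grad": ["Grad1", "Grad2"],
    "Student": ["Student1", "Student2"],
    "Section": ["Section"],
    "Department": ["Department"],
})
```

With two partitions each for Grad and Student, the enumeration yields the four combinations mentioned in the text; the "OR" branches of Figure 4.3 let one modified graph cover all of them at once instead of running four separate queries.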


[Figure 4.3 (graphic not captured): the modified query graph, in which Grad1 (P2) and Grad2 (P7), and Student1 (P3) and Student2 (P6), are connected by OR branches, with RA (P1), Section (P4), and Department (P5) attached as in Figure 3.3.]

Figure 4.3. Query Graph Modification Approach

4.2.2 Multiple Wavefront Algorithms

The processing of each query graph against a combination of data partitions is based on two multiple wavefront algorithms introduced in our previous work: the identification approach and the elimination approach. In both algorithms, data relevant to the processing of a query graph are retrieved from the secondary storage devices in parallel and manipulated in main memory by the processors which hold the partitions of the object classes referenced in the query. In the identification approach, partitions referenced by a query are classified into two types. Partitions with more than one "AND"-conditioned edge in the query graph are called non-terminal partitions; otherwise, they are called terminal partitions. Query processing starts at all the processors that manage the terminal partitions. These processors perform their local selections of instances (if selection conditions are given in the query), look up the proper binary columns of IID-IID pairs, and send out the IIDs of the associated instances of their only neighboring class which satisfy the selection and instance-reference conditions. Each propagation of IIDs forms a wavefront moving toward all other nodes


of the query graph. Multiple wavefronts cross one another in an asynchronous fashion, and the operations of all the processors depend on the operators ("*" or "!") and branch conditions (AND or OR) given in the query. The behaviors and termination conditions of processors that contain terminal and non-terminal partitions are as follows. Each node sends out an end_marker to one of its neighboring nodes immediately after it sends a wavefront to that node. A node terminates when the number of end_markers it has received is equal to the number of edges it has. The processor that contains a non-terminal partition receives streams of IIDs from all of its "OR"-conditioned neighbors and all but one of its "AND"-conditioned neighbors. It processes those streams of IIDs and selects the instances that satisfy the local selection condition and the "AND" and "OR" branch conditions. Then, it sends the IIDs of the associated instances of the only remaining neighboring partition to the corresponding processor. The processor of a non-terminal partition performs its local selection and processes the incoming streams of IIDs. When it receives the last incoming stream of IIDs, it processes it and passes the IIDs of the associated instances of all other neighbors to those neighbors, except the sender of the last incoming IID stream. Figure 4.5 illustrates the execution of the query given in Figure 3.3. In this algorithm, the processing starts from all terminal nodes, and each processor reports to its neighbor(s) the instances that satisfy the search. A terminal node terminates its processing after it receives an end_marker from its only neighbor. We note that all the "OR"-conditioned edges can be treated as one edge.
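The asynchronous exchange of wavefronts is hard to reproduce in a few lines, but the set of instances it identifies can be computed by a sequential fixed-point sketch over the same binary columns. The data below follow Figure 4.4; the centralized loop and all names are our simplification, since the real algorithm propagates IID wavefronts between processors rather than iterating in one place.

```python
# Simplified, sequential rendering of the identification idea: keep an
# instance only if it is supported by at least one surviving instance
# on every edge of the query graph, iterating to a fixed point.

refs = {  # (from_class, to_class) -> binary columns of IID-IID pairs
    ("RA", "Grad"): {"r1": ["g1"], "r2": ["g2"], "r3": ["g3"]},
    ("Grad", "RA"): {"g1": ["r1"], "g2": ["r2"], "g3": ["r3"]},
    ("Grad", "Student"): {"g1": ["s1"], "g2": ["s3"], "g3": ["s4"]},
    ("Student", "Grad"): {"s1": ["g1"], "s3": ["g2"], "s4": ["g3"]},
    ("Student", "Section"): {"s1": ["se1", "se2"], "s2": ["se1", "se2"],
                             "s4": ["se2"], "s5": ["se2"]},
    ("Section", "Student"): {"se1": ["s1", "s2"],
                             "se2": ["s1", "s2", "s4", "s5"]},
    ("Student", "Department"): {"s1": ["d1"], "s2": ["d1"], "s3": ["d2"],
                                "s4": ["d2"], "s5": ["d2"]},
    ("Department", "Student"): {"d1": ["s1", "s2"],
                                "d2": ["s3", "s4", "s5"]},
}
members = {"RA": {"r1", "r2", "r3"}, "Grad": {"g1", "g2", "g3"},
           "Student": {"s1", "s2", "s3", "s4", "s5"},
           "Section": {"se1", "se2"}, "Department": {"d1", "d2"}}
edges = [("RA", "Grad"), ("Grad", "Student"),
         ("Student", "Section"), ("Student", "Department")]

def identify(members, refs, edges):
    alive = {c: set(m) for c, m in members.items()}
    changed = True
    while changed:          # iterate until no instance is disqualified
        changed = False
        for a, b in edges:
            for x, y in ((a, b), (b, a)):
                keep = {i for i in alive[x]
                        if set(refs.get((x, y), {}).get(i, [])) & alive[y]}
                if keep != alive[x]:
                    alive[x], changed = keep, True
    return alive

result = identify(members, refs, edges)
```

On this data the sketch reproduces the outcome shown in Figure 4.5: s2, s3, s5, g2, and r2 are disqualified, leaving RA = {r1, r3}, Grad = {g1, g3}, and Student = {s1, s4}.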


Therefore, Figure 4.3 is not a cyclic graph. The processing of a cyclic graph can be found in previous work [Chen95].

[Figure 4.4 (layout not fully captured): the data structures for the multiple wavefront algorithms. Each of the processors P1 (RA), P2 (Grad), P3 (Student), P4 (Section), and P5 (Department) stores the binary columns of IID-IID pairs for its adjacent partitions (e.g., g1->r1, g2->r2, g3->r3 at P1; r1->g1, r2->g2, r3->g3 and s1->g1, s3->g2, s4->g3 at P2) together with the CON arrays, e.g., SEC.CON = (s1: 2, s2: 2, s3: 0, s4: 1, s5: 1) at P3, STU.CON = (se1: 2, se2: 4) at P4, and STU.CON = (d1: 2, d2: 3) at P5.]

Figure 4.4. Data Structures for the Multiple Wavefront Algorithms

In contrast with the identification algorithm, the elimination algorithm eliminates object instances that do not satisfy the context specification. When an object instance in the in-memory object graph is eliminated during query processing, all the associated instances in the neighboring partitions have to be eliminated. This may in turn cause their associated instances to be eliminated. Thus, the elimination process is repeated until all the unqualified instances have been eliminated. In this algorithm, all processors become active after receiving the query graph, and they can start processing local instances (e.g., performing local selections and checking instance connectivities) without waiting for the waves of IIDs from the neighboring processors (classes). For this reason, the elimination algorithm achieves a higher degree of parallelism than the identification algorithm (in the case of the AND branch condition). In this algorithm, each processor reports to its neighbor(s) the instances that have


[Figure 4.5 (graphic not captured): four steps of the identification algorithm. Step 1: the terminal nodes P1 (RA: {r1, r2, r3}), P4 (Section: {se1, se2}), and P5 (Department: {d1, d2}) send out wavefronts of IIDs. Step 2: P2 (Grad) receives r1, r2, r3, and P3 (Student) receives se1, se2 and d1, d2, leaving Student = {s1, s2, s4, s5}. Steps 3 and 4: the surviving instances propagate back, leaving RA = {r1, r3}, Grad = {g1, g3}, Student = {s1, s4}, Section = {se1, se2}, and Department = {d1, d2}.]

Figure 4.5. Execution of the Identification Algorithm


been eliminated. The counts in the proper integer arrays are decremented. When an entry of an array becomes zero, the corresponding instance is eliminated, and the IIDs of the eliminated instances are sent to the neighbors so that they can in turn eliminate their own instances. Although there is a significant difference between these two algorithms, some processing techniques are applicable to both of them. For example, in both algorithms, the propagation and processing of IIDs can be carried out in a pipelined fashion, thus increasing the degree of parallelism. Also, in order for both algorithms to know when to terminate, the end_marker is introduced.

4.2.3 Result Collection

The above multiple wavefront algorithms mark the IIDs that satisfy the query pattern and send the IIDs, attributes, and all the IID-IID cross-reference information to a result collection (RC) node. Upon receiving this information, the RC node traverses the cross-reference information to reconstruct the query results. This approach creates a potential bottleneck in the RC node (even though more than one processing node can be allocated as RC nodes for different queries). An alternative is to use a distributed result-collection approach, which will be discussed in Section 5.

4.2.4 Method Processing and Attribute Inheritance

In OODBMSs, the behavioral properties of objects are defined by method specifications in object classes. Due to the inheritance property, all the methods defined in an object class can be applied to the instances of all its subclasses. Likewise, the object instances of these subclasses can also inherit the attributes defined in their


superclasses. A challenge for the design of a parallel OODBMS is how to map method implementations and object instances to processors in such a way that the following two goals can be reached. The first goal is to place data and their applicable method implementations as close together as possible, so that data and/or code do not have to be moved. The second goal is to make attribute inheritance as efficient as possible. In our approach, the following rules are used to achieve the above goals:

- All the methods applicable to an object class are replicated in the processing nodes to which the instances of the class are mapped following the hybrid data partitioning strategy.

- The mapping generally starts from the root class of the generalization hierarchies, using a top-down approach, until all the classes are allocated. The instances of multiple classes in an inheritance hierarchy or lattice which hold the data of the same object are mapped to the same processing node.

- The object classes which have aggregation associations with (attribute links to) the classes in the generalization hierarchies can be mapped to processors after all the classes in the generalization hierarchies are allocated, to achieve load balance. Or, they can be mapped randomly. After collecting system running information, further adjustment can be done to achieve load balancing.

Figure 4.6 shows an example of applying the above rules. The object instances of class Person are mapped to P1, P2, P3, and P4. Thus, the methods defined by class Person are replicated in these four processing nodes. Similarly, the methods


defined by class Teacher are replicated to P2 and P3. The numbers 1 to 8 represent OIDs. With this mapping, the methods can be applied to the data of the object classes concurrently, thus improving performance. At the same time, there is no separation of the data from their applicable methods; no transferring of data or methods is required during query execution. Moreover, it reduces unnecessary replication of methods. For example, the methods defined by class Teacher are not replicated in P1 and P4, because no object instances of class Teacher are mapped to these two nodes. This mapping strategy also makes attribute inheritance very efficient, because all the instances of multiple classes in an inheritance hierarchy or lattice which hold the data of the same object are mapped to the same node. Moreover, some index structures can be established for the instances of the objects during the mapping process to speed up future accesses.

[Figure 4.6 (graphic not captured): a proposed mapping strategy in which instances of Person (OIDs 1 to 8) are distributed across P1 through P4, with the Teacher, Student, and TA instances that hold data of the same objects colocated on P2 and P3.]

Figure 4.6. A Proposed Mapping Strategy
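The replication rule above can be sketched compactly: a node receives the methods of a class only if it stores instances of that class or of one of its subclasses. The hierarchy and placement below loosely mirror Figure 4.6, but the helper names and the exact class assignments are our own illustrative choices.

```python
# Sketch of the method-replication rule: methods defined by a class are
# replicated only on nodes holding instances of that class or of one of
# its subclasses (an instance of TA also carries Teacher and Person data).

superclass = {"Teacher": "Person", "Student": "Person", "TA": "Teacher"}

def ancestors(cls):
    """The class itself plus all of its superclasses, root last."""
    chain = [cls]
    while chain[-1] in superclass:
        chain.append(superclass[chain[-1]])
    return chain

def replicate_methods(placement):
    """placement maps node -> most-specific classes of the instances
    stored there; a node receives the method sets of every class that
    is an ancestor of one of those classes."""
    return {node: {c for cls in classes for c in ancestors(cls)}
            for node, classes in placement.items()}

plan = replicate_methods({
    "P1": {"Person"},
    "P2": {"Teacher"},
    "P3": {"TA"},
    "P4": {"Person"},
})
```

As in the example of the text, Teacher's methods land on P2 and P3 but not on P1 or P4, and P3 additionally receives TA's methods because its instances are the most specific.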


CHAPTER 5
DISTRIBUTED RESULT COLLECTION

5.1 Two Architectures for Supporting Result Collection

Recently, there has been a significant trend in both industry and the research community to unify the relational and object-oriented database technologies. One of the challenges this shift brings to query processing is that both navigational and retrieval query types should be supported. Two architectures have been studied in our research: master-slave and peer-to-peer. In the master-slave architecture, clients (users) submit queries to a master node. Upon receiving queries, the master node analyzes them, applies query optimization strategies to them, modifies them based on the placement of data, and passes the modified queries to the various slave nodes. After the slave nodes finish the query processing, they send the partial results back to the master node. In turn, the master node assembles the partial results for each query and reports them to the clients. The master-slave architecture is shown in Figure 5.1(a). In the peer-to-peer architecture, a client (user) can submit a query to any node, which analyzes the query, applies optimization strategies to it, modifies it, and passes the modified query to all the nodes that contain the relevant data. Upon receiving a subquery, these nodes process the subqueries, and the results are collected by one


of these nodes. In this architecture, the node which receives queries from clients is called a coordinator node (C-node), and a node which processes the queries is called a peer node (P-node). A node in a parallel machine can be a C-node and a P-node at the same time, and there can be more than one C-node in a system. The peer-to-peer architecture is shown in Figure 5.1(b). Our implementation assumes that the parallel computer supports the client-server architecture. The server (e.g., the parallel computer) itself works in a master-slave or peer-to-peer mode. In the master-slave architecture, when a retrieval query is executed, the IID-IID pairs and the attributes of the objects are sent to the master node. The master node has to traverse the IIDs and construct the final results. This process can be time-consuming, and the master node can become a potential bottleneck. Based on the peer-to-peer architecture, we introduce a distributed result collection approach to ease the potential bottleneck caused by designating one node to collect the results. This approach is based on a pattern-passing identification (PPI) strategy, which is a modification of the multiple wavefront identification algorithm described in Section 4.2.2.

5.2 Pattern-passing Identification Strategy

The main idea of this strategy is to construct and propagate the association information of objects in all the P-nodes involved in a query. In a query graph, assume a node C has i "AND"-conditioned edges and j "OR"-conditioned edges. We call the processor which contains the instances of node C the Pc processor. If i equals 1 and j equals 0, or if i equals 0 and j is


[Figure 5.1 (graphic not captured): (a) a master-slave architecture, in which users (clients) submit queries to a master node connected to slave nodes through a hypercube interconnection network inside the parallel server; (b) a peer-to-peer architecture, in which a C-node and several P-nodes are connected through the same network.]

Figure 5.1. Two Architectures

greater than 1, we call node C a terminal node; if i is greater than 1, we call node C a non-terminal node. We show the PPI strategy by describing the behaviors of terminal and non-terminal nodes below.

- Every node sends an end_marker to a neighboring node immediately after it sends out a wavefront to that node.

- If C is a terminal node in a query graph: Pc starts the IID propagation process; when Pc receives from its neighbors a wavefront of association patterns, each formed by a concatenation of associated IIDs, it concatenates the local IIDs that are associated with the incoming patterns.


  If the number of end_markers Pc receives is equal to the number of edges it has, it terminates.

- If C is a non-terminal node in a query graph:

  - Upon the arrival of each incoming wavefront of association patterns, Pc concatenates the incoming patterns with local IIDs according to the IID-IID pairs available in the local memory.

  - If Pc has not yet received (i-1) incoming wavefronts of association patterns from its "AND"-conditioned neighboring processors and j incoming wavefronts of association patterns from its "OR"-conditioned neighboring processors, it waits for more wavefronts to come.

  - When Pc receives the (i-1)th wavefront, it performs the same sequence of operations as described in the first step and propagates the (i-1)th intermediate results to the only remaining "AND"-conditioned neighbor from which it has not received a wavefront.

  - When Pc receives the i-th wavefront from its "AND"-conditioned neighbor, it performs the same sequence of operations as above, propagates the final result to all the neighboring processors except the sender of the i-th wavefront, and terminates.

Figure 5.2 shows an example of the above procedure. The query graph is the query pattern shown in Figure 4.3, and the object graph is as shown in Figure 3.2. In this approach, the association pattern among the objects is constructed by all


the processors involved in a query, instead of by one or more dedicated processors. In other words, the pattern is constructed in a distributed fashion. Notice that, at the end of the procedure, all nodes contain the same patterns of object associations (possibly in a different order). We consider below how to collect the results of a retrieval query by using these patterns.

5.3 Distributed Result Collection

In a retrieval query, such as the one shown in Section 3.2, the descriptive data (i.e., the primitive attribute values) of one or more object classes are retrieved from secondary storage, concatenated, and presented to the query issuer in an appropriate order. By taking advantage of the PPI strategy, this process can be carried out in a distributed and parallel fashion. We designate one node or a set of nodes for each query as the result collection node(s), depending on the application. There can be more than one criterion for selecting a result collection node. One criterion is that this node contains the class(es) from which the size of the descriptive data to be retrieved is greater than at the other nodes. In this way, we can avoid the transmission of the larger set of data, thus reducing the communication cost. Load balancing is another criterion that should be considered when multiple queries are choosing their result collection nodes. Figure 5.3 shows a possible assignment of the result collection nodes for the modified query shown in Figure 4.3. The highlighted nodes are the designated result collection nodes. Each result collection node is responsible for collecting the results in a specified range (e.g., IID values or hashed IID


[Figure 5.2 (graphic not captured): an example of the PPI strategy in four steps. The terminal nodes start the propagation; intermediate nodes concatenate the incoming patterns with their local IIDs, until every node holds the full association patterns, such as (r1,(g1,(s1,se1,d1))), (r1,(g1,(s1,se2,d1))), and (r3,(g3,(s4,se2,d2))).]

Figure 5.2. An Example of the PPI Strategy


values). In the following sections, we discuss how the other nodes retrieve their descriptive data and transfer them to the result collection nodes.

[Figure 5.3 (graphic not captured): an example assignment of result collection nodes for the modified query graph of Figure 4.3, with the designated result collection nodes highlighted.]

Figure 5.3. An Example of the Result Collection Node Assignment

5.3.1 Join Approach

At the end of PPI, all nodes involved in the query have the same pattern of object associations (i.e., the same set of associated IIDs). An example of the resulting patterns, involving only two classes, is shown in Figure 5.4. The first column always contains the local IIDs, and the second column contains the IIDs of the neighboring node(s). In our implementation, the patterns at each site are ordered according to the order of the local IIDs. Pa is highlighted to indicate that it is a result collection node. Obviously, one way to combine the attribute values of class B with those of class A is for Pa and Pb to retrieve their attribute values from their disks, transfer Pb's attribute values which satisfy a specified condition (e.g., an IID value range or a hash


value) to Pa, and perform a join operation there. This approach is very similar to the semi-join approach used in other distributed systems. In this approach, the attribute value of each IID is retrieved and transferred only once. However, the join operation at Pa can be very costly. If the attributes of class B cannot be held in memory, they have to be stored in the secondary storage of Pa, thus requiring a large number of I/O operations and introducing a bottleneck at Pa. This problem becomes even more serious if a join operation involves more than one neighboring class.

[Figure 5.4 (graphic not captured): the resulting association patterns at nodes Pa and Pb for the two classes A and B.]

This scheme works well except that some of the disk pages which contain certain IIDs might have to be read more than once. For example, in Figure 5.4, the page containing IID 4 might have to be read from disk twice, and the page containing IID 14 might have to be read three times. The multiple reading of the same page is due to the unordered local IIDs in the first column at Pb. Some researchers have addressed this issue by assuming a physical OID [Shek90, Lieu93] (i.e., each OID contains information about the node and disk page of the referenced object). That approach sorts OIDs by their page IDs before the disk access; thus, multiple accesses of the same object are avoided. In our research, we assume that logical IIDs are used. Therefore, a different approach (an object cache) is taken to reduce the penalty of multiple retrievals of the same page. The following steps are used in our approach:

- Allocate a certain amount of memory as a cache area. The size of this area depends on the size of the available memory.

- Count the number of appearances of each local IID in the patterns. Store this number together with each IID and denote it as the Total_Count. Because the patterns are sorted by the local IIDs, the complexity of counting is O(n), where n is the number of patterns.

- Sort the patterns so that they are in the same order as the patterns at the result collection node. The complexity of the sorting is O(n log n).

- A data structure shown in Figure 5.5 is maintained in the cache area of the memory. This structure is used to log the attribute values of a set of most


frequently appearing IIDs. We denote this structure as A_Cache (A stands for "attribute"). For simplicity of presentation, we use a flat table structure; in an implementation, other data structures such as a hash table can be used. The attribute values of the local IIDs needed for a retrieval query are retrieved either from the secondary storage or from A_Cache in the following manner. For each IID_i, if Total_Count_i is equal to one, the attribute values of IID_i will be retrieved from the disk; if Total_Count_i is greater than one, the A_Cache will be checked. If IID_i is already in it, the attribute values stored with IID_i are accessed from the A_Cache instead of from the disk and paired with the IID_i being processed, and Current_Count_i is decremented by one. If IID_i cannot be found in the A_Cache, the attribute values of IID_i are retrieved from the secondary storage. If A_Cache has an empty entry, the attribute values of IID_i are logged in A_Cache and Current_Count_i = Total_Count_i - 1. If A_Cache is full, the table entry with the smallest Current_Count will be replaced by IID_i and its attribute values if that smallest Current_Count is less than Current_Count_i. The above approach avoids repeated disk accesses of the attribute values associated with the set of IIDs with larger counts maintained in the A_Cache. However,


the object cache approach increases the complexity of maintaining the consistency of OODBMSs.

Figure 5.5. An Example of a Cache (a table with columns IID, Current_Count and Attributes)
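To make the bookkeeping concrete, the cache policy described above can be sketched as follows. This is a minimal sketch, not the dissertation's implementation: the names (`ACache`, `fetch_from_disk`), the dict-based flat table, and the freeing of an entry once its count reaches zero are assumptions added for illustration, and real code would read whole pages rather than single values.

```python
# Sketch of the A_Cache policy: cache attribute values only for IIDs that
# appear more than once, evicting the entry with the smallest Current_Count.

def fetch_from_disk(iid):
    """Stand-in for a secondary-storage read (hypothetical helper)."""
    return f"attrs-of-{iid}"

class ACache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.table = {}  # iid -> [Current_Count, attribute values]

    def lookup(self, iid, total_count):
        if total_count == 1:            # appears once: read disk, never cache
            return fetch_from_disk(iid)
        entry = self.table.get(iid)
        if entry is not None:           # cache hit: use it, decrement count
            entry[0] -= 1
            attrs = entry[1]
            if entry[0] == 0:           # no further uses expected; free slot
                del self.table[iid]     # (an assumed extension of the policy)
            return attrs
        attrs = fetch_from_disk(iid)    # cache miss: read from disk
        if len(self.table) < self.capacity:
            self.table[iid] = [total_count - 1, attrs]
        else:                           # full: evict smallest Current_Count
            victim = min(self.table, key=lambda k: self.table[k][0])
            if self.table[victim][0] < total_count - 1:
                del self.table[victim]
                self.table[iid] = [total_count - 1, attrs]
        return attrs
```

A second lookup of the same frequently appearing IID is then served from memory rather than from disk, which is exactly the repeated-page-read penalty the scheme is meant to avoid.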


CHAPTER 6
QUERY OPTIMIZATION STRATEGIES

The multiple wavefront algorithms have been designed to achieve a high degree of parallelism. However, a high degree of parallelism does not necessarily guarantee maximal efficiency since, as we shall explain, processors can be kept busy doing nonproductive work. Furthermore, in the multiple-query situation, a large number of queries must be processed simultaneously. It is more desirable to allocate some nodes to process other queries than to let them be committed to one particular query and do nonproductive computations. Let us look closer into this problem and its possible solutions from both the single-query and multiple-query points of view.

6.1 Parallelism Not Equal to Efficiency

We use an example to illustrate the problem. Our discussion is still based on the schema graph of Figure 3.1. We assume that there are 300 RAs at a university with a student body of 10,000 and a graduate student body of 4,000. Both r1 and r2 have 20-hour appointments, and the rest of the RAs' appointments are either 10 hours or 15 hours. s1's GPA is 3.5 and there are 4,000 students with a GPA of 3.5. s2's GPA is 3.6 and there are 800 students with a GPA of 3.6. Finally, there are 10 departments at the university, and together they offer 300 sections of courses. The object graph is shown in Figure 6.1. For simplicity, we map each object


class to a processor node. However, if horizontal partitioning is applied to any object class, Figure 6.1 can be considered the object graph for a specific combination of data partitions. Therefore, in the rest of this section, the terms "class" and "partition" are interchangeable. Now the query is "Find the students who are RAs with a 20-hour appointment and a 3.5 GPA, and also find those sections that the students are taking. Retrieve their names, their GRE scores, and their department names." It can be written in OQL as below:

context RA*Grad*Student AND (*Section, *Department)
where RA.hrs = 20 ∧ Student.gpa = 3.5
retrieve Student.name, Grad.gre, Department.name

If the identification approach is used, as shown in Figure 6.2, P3, which processes the Student class, would most likely receive the IIDs propagated from the Section and Department classes before it receives the stream of IIDs propagated from the RA class. If most of the object instances in the Section and Department classes are connected with the object instances in the Student class, then after applying the local selection condition and processing the two incoming wavefronts, P3 will send 4,000 IIDs to P2. Also, P2 will have to take a considerable amount of time to process these IIDs, most of which do not contribute to the end result. But based on our assumed object graph, we can see that the local selections of the RA, Department and Section classes produce very few IIDs. If the processor, which stores and processes the Student


class, simply waits for all the wavefronts (including the ones propagated from P1 and P2) to come from its neighboring classes, it will only send out a very limited number of IIDs to its neighboring classes. The communication bandwidth between P2 and P3 as well as the CPU time of P2 can thus be saved and used for processing other concurrent queries. Also, based on the data specified in the retrieval statement of the query, the attribute values associated with the RA and Section classes are not needed. Therefore, the wavefront propagations from P3 (the Student class) towards P4 (the Section class) and from P2 (the Grad class) towards P1 (the RA class) are not needed. The algorithm can terminate at step 4 of Figure 6.2.

Figure 6.1. A Modified Object Graph

From the above example, we can see that a number of optimization strategies can be introduced for graph-based object-oriented query processing in a parallel environment. By starting a query at selected nodes and/or by controlling the directions and the extent to which a wavefront of IIDs propagates, we can not only


reduce the response time of an individual query but also the overall processing time of concurrent queries, since nonproductive computation can be avoided. In the next subsections, four strategies for query optimization in multiple wavefront algorithms are introduced. They aim to avoid excessive or unnecessary IID transfers between nodes while maintaining an appropriate degree of parallelism, in order to free some processors from computations that do not contribute to the end results of queries.

Figure 6.2. A New Query Execution Plan (steps 1-4 of the wavefront propagation among P1 (RA), P2 (Grad), P3 (Student), P4 (Section) and P5 (Department))
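The heart of the execution plan in this example is that the Student node defers its own wavefront until all expected wavefronts have arrived. A minimal sketch of that waiting rule follows; the class name is an assumption, and local selection plus pattern matching are reduced to a simple set intersection.

```python
# Sketch of the "wait for all wavefronts" behavior illustrated above: the
# node intersects every incoming IID stream with its local selection result
# and only propagates after all expected wavefronts have arrived.

class PassiveNode:
    def __init__(self, local_iids, expected_wavefronts):
        self.candidates = set(local_iids)    # instances passing local selection
        self.expected = expected_wavefronts  # number of neighboring wavefronts
        self.seen = 0

    def on_wavefront(self, iids):
        """Absorb one incoming wavefront; return IIDs to send, or None."""
        self.candidates &= set(iids)
        self.seen += 1
        if self.seen < self.expected:
            return None                      # keep waiting: stay passive
        return self.candidates               # now propagate a small IID set
```

Waiting costs the node some parallelism, but the set it finally ships out is the intersection of everything it has heard, which is why the outgoing stream is so much smaller.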


These optimization strategies will work if the system can pre-determine or pre-estimate the number of instances of each class which may satisfy a query. Fortunately, several researchers have done interesting work in relational DBMSs to estimate the size of the outcome of selection and join operations [Ling92, SunW93]. The introduced size estimation techniques, which do not generate much run-time overhead, are very suitable for our application. The CON array and connection information (i.e., IID-IID pairs) shown in Figure 4.4 are also very useful for estimating the distinct number of IIDs to be sent to neighboring nodes after a local selection. A well-designed query optimizer shall be able to collect this information periodically and use it to establish efficient execution plans. Before we present the optimization strategies, we define a couple of parameters which are used to characterize a query graph.

IID-size: IID-size is the estimated number of distinct IIDs to be sent to a neighboring node after a node's local processing (i.e., local selection and instance connectivities with the neighbor). In case a node is connected with more than one node involved in a query graph, the IID-size would be the average of the IID-sizes of all the node pairs. For example, based on the object graph of Figure 6.1, when the example query pattern is applied, the IID-size of RA.hrs=20 is 2. Varying the value of this parameter represents the effect of changing the number of instances of a node, the selectivity factor associated with instance selection based on attribute value(s), and/or the connectivity of its instances with the instances of the neighboring nodes.
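As an illustration of how IID-size might be computed from the IID-IID connection information, the sketch below counts the distinct neighbor IIDs reachable from the instances that survive a local selection and averages over the neighbors involved in the query. The data layout and the function name are assumptions, not the dissertation's actual catalog format.

```python
# Sketch of estimating IID-size from IID-IID connection pairs: count the
# distinct neighbor IIDs reachable from locally selected instances, then
# average over all neighbors involved in the query.

def iid_size(selected, connections):
    """selected:    set of local IIDs passing the selection condition
    connections: {neighbor_node: set of (local_iid, neighbor_iid) pairs}
    returns:     average number of distinct neighbor IIDs per neighbor"""
    per_neighbor = [
        len({n for (l, n) in pairs if l in selected})
        for pairs in connections.values()]
    return sum(per_neighbor) / len(per_neighbor)
```

Replaying the running example, with only r1 and r2 satisfying RA.hrs = 20 and each connected to one Grad instance, the function yields the IID-size of 2 quoted in the text.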


Dia: Dia, or diameter, stands for the longest distance between any two processing nodes to which the object classes referenced in a query are mapped. It is the number of nodes along the longest path between the two terminal nodes. For example, in our example query shown in Figure 3.3, the Dia is 4. The Dia value represents the distance of a wavefront propagation and thus determines its communication cost. We now proceed to present the optimization strategies.

6.2 Intraquery Scheduling Strategy

In a single query with N object classes, if the IID-size value of a particular class is relatively large when compared with the IID-sizes of other classes, the node that manages it should act as a "passive" node. A passive node will not become active until it receives all the wavefronts from its AND branches. Only one node in a query graph can act as a "passive" node, because a node in a query graph with AND branches would only propagate a wavefront of IIDs after having received all but one wavefront of IIDs; if more than one node acted as "passive", a deadlock would occur. "Relatively large" can be determined by using a threshold, a value between the largest and the smallest IID-size, that needs to be determined by a performance evaluation study. If no class has an unusually large IID-size, the generic identification approach described before is applied. This strategy avoids the propagation of a large number of IIDs while maintaining some degree of parallelism. Figure 6.2 follows this rule. The numbers of IIDs to be sent out by the RA, Grad, Section or Department nodes are smaller than the one to be sent out by the Student node (4,000). Thus, the Student node should be a passive


node. Otherwise, a lot of processing time will be wasted, and other concurrent queries will not be able to benefit from those nodes which are heavily engaged in processing the incoming IIDs and yet produce results that do not contribute to the final result of the query. In the identification approach, a non-terminal node will wait until it receives i-1 wavefronts from the i neighbors that are involved in an AND construct. We call this kind of node a semi-passive node, as opposed to the passive node we just described. We assume a node C has i AND-conditioned edges in the query graph, and we call the processor which contains the instances of node C the Pc processor. We now describe the behaviors of the different kinds of nodes/processors as follows:

If Pc is an active processor, Pc will start the IID propagation process. If it receives the stream of IIDs from its only neighbor, it will mark those instances as qualified instances and then terminate.

If Pc is a semi-passive processor:
- If Pc has received fewer than (i-1) incoming streams of IIDs from its neighboring processors, it will wait for more streams of IIDs to come.
- If Pc has received streams of IIDs from all its neighbors but one, it will process the (i-1) streams and select those C instances that satisfy the selection condition and the query pattern. Then, it will send those IIDs that are associated with the instances of the only remaining neighboring node to the corresponding processor.


- If Pc has received the ith (i.e., the last) incoming stream of IIDs, it will form the final result of the query for node C and then pass the IIDs that are associated with the instances of all other neighbors to those neighbors, except the sender of the ith stream.

If Pc is a passive processor:
- If Pc has received fewer than i incoming streams of IIDs from its neighboring processors, it will wait for more streams of IIDs to come.
- If Pc receives i incoming streams of IIDs from all its neighboring processors, it will process those IIDs, form the final result of the query for node C (i.e., the set of C instances that satisfy the query graph), and then pass those IIDs in the resulting set that are associated with the instances of all the neighbors to those neighbors.

The above strategy is a greedy approach. Another possible approach is randomized optimization [Ioan90]. In the randomized optimization approach, the query optimizer randomly picks the Nth node as a passive node and computes the cost of the resulting query evaluation plan (QEP). This process is repeated for another node, and the query optimizer keeps the cheaper QEP as the current QEP. After evaluating all possible QEPs (or stopping at some pre-defined termination condition), the query optimizer will have an optimal (or near-optimal) QEP. The drawback of this approach is that it involves very complicated cost functions which are quite difficult to define


and validate in a parallel environment. However, it may provide a better query processing plan some of the time. A hybrid approach can be used to reduce the search space and still achieve a satisfactory QEP. This approach picks the node that has the largest IID-size as a passive node and evaluates the cost of the QEP. Then, the node that has the second largest IID-size is picked and the cost of that QEP is computed. The two costs are compared and the cheaper QEP is kept. This process repeats itself until the termination condition is met: either there have been X consecutive nonproductive attempts or all the nodes have been picked. A nonproductive attempt means that the cost of a new QEP is greater than the cost of the best QEP identified so far. The constant X is a system parameter. This approach is based on the same heuristic rule that a node with a large IID-size will most likely send out a large number of IIDs to its neighboring nodes, some of which may not contribute to the final result.

6.3 Partial Graph Processing Strategy

In an OODBMS, a query can be expressed as a query graph in which each node represents a class involved in the query. In many cases, only the descriptive data from a limited number of nodes are of interest and are to be retrieved or processed. The rest of the nodes in the graph are used only for determining the connectivities among object instances. For example, in the query "Find a graduate research assistant's name whose RA assignment is 20 hours", there are three nodes involved, namely RA, Grad and Student, but the query issuer is only interested in retrieving the names from the


Student node. For this kind of query, the following procedure can be used to achieve more efficient processing:

1. Mark those nodes whose descriptive data are of interest as having status 1.
2. Mark the nodes which are between the status 1 nodes as having status 2.
3. Mark the rest of the nodes according to their distance from a status 1 or 2 node. The immediate neighbor of a status 1 or 2 node will be marked as having status 3, and so on.
4. The generic identification or elimination algorithm is applied; however, IID wavefronts in some directions will be suppressed by following the rules given below:
- The wavefronts between status 1 or 2 nodes are propagated.
- The wavefronts from a higher-numbered node to a lower-numbered node are propagated.
- The wavefronts from a lower-numbered node to a higher-numbered node are suppressed.

Figure 6.3 is used to illustrate the numbering scheme. In the query graph, nodes B and F contain the data to be retrieved; they are marked as having status 1. Node E is in between nodes B and F, so it is marked as having status 2. The rest of the nodes are marked as having status 3 or 4 according to their distances from a status 1 or 2 node.
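The marking procedure and the suppression rules above can be sketched as follows. The adjacency-dict encoding and function names are assumptions for illustration; the marking is a breadth-first search outward from the status 1 and 2 nodes.

```python
# Sketch of the partial-graph marking scheme and the wavefront suppression
# rule: status 1 = data of interest, status 2 = between status-1 nodes,
# status 3+ = increasing distance from any status 1 or 2 node.
from collections import deque

def mark_statuses(adj, interest, between):
    """Assign a status number to every node of the query graph."""
    status = {n: 1 for n in interest}
    status.update({n: 2 for n in between})
    frontier = deque(status)
    while frontier:                        # BFS outward from status 1/2 nodes
        n = frontier.popleft()
        for m in adj[n]:
            if m not in status:
                status[m] = max(status[n], 2) + 1
                frontier.append(m)
    return status

def propagate(status, src, dst):
    """A wavefront is sent only between status 1/2 nodes, or 'inward'
    from a higher-numbered node towards a lower-numbered one."""
    return status[src] <= 2 and status[dst] <= 2 or status[src] > status[dst]
```

On the Figure 6.3 shapes (B and F of interest, E between them), a neighbor such as J two hops away ends up with status 4, and its outward wavefront towards a status 3 node is allowed while the reverse direction is suppressed.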


The purpose for marking

the identification and elimination algorithms to process the cyclic components. When this procedure finishes, the query graph can be converted to an acyclic graph so that the two query optimization strategies as well as the numbering scheme presented in this section can be applied.

Figure 6.3. An Example of the Numbering Scheme (nodes B and F have status 1; E has status 2; A, C, D, G, H and I have status 3; J has status 4)

6.4 Interquery Scheduling and Common Pattern Sharing Strategies

Our goal for interquery scheduling is to exploit the sharing of common patterns. Concurrent queries can share their processing results in three ways. First, the final processing result of one query can be used by other queries. Second, the intermediate results of one query can be used by other queries. Third, some of the costly operations can be shared by queries (e.g., selection operations and accessing data from secondary storage). We will discuss them in turn.


Sharing conditions: There are different criteria for sharing. In our approach, the following two requirements have to be met by a query before it can share the result of another query:

1. The nodes (object classes) and edges (their associations) of its graph form a superset of, or the same set as, those of the other query.
2. The local selection conditions of the nodes (object classes) in its query graph are equally or more restrictive than the ones specified in the other query graph.

For example, suppose we have the following set of queries:

Q1: A[a1 > 10] * B[b1 = 10] * C
Q2: A[a1 > 10] * B[b1 = 10] * C * E
Q3: A[a1 = 20] * B[b1 = 10] * C
Q4: A[a1 > 10] * D * F

According to the two conditions given above, Q2 and Q3 can share the final result of Q1. Q4 cannot share the final results of the other queries due to its violation of condition 1. However, the local selection operation on class A in Q4 can be shared with the same operation on class A in the other queries (we shall discuss this approach in greater detail in Section 6.5). Having specified the conditions of sharing, we now examine the structures of sharing that can exist among queries.
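The two sharing conditions can be sketched as a predicate over a minimal query model. The encoding (class set, edge set, per-class `(op, value)` selections) and all names are assumptions; only the simple comparison forms used in the examples are handled, and anything else is conservatively rejected.

```python
# Sketch of the two sharing conditions: (1) graph superset, (2) selection
# conditions equally or more restrictive than the shared query's.

def more_restrictive(cond, base):
    """True if selection `cond` picks a subset of what `base` picks."""
    if base is None:                 # no condition on the base class
        return True
    if cond is None:
        return False
    (op1, v1), (op2, v2) = cond, base
    if (op1, op2) == ("=", "="):
        return v1 == v2
    if (op1, op2) == ("=", ">"):     # e.g. a1 = 20 is within a1 > 10
        return v1 > v2
    if (op1, op2) == (">", ">"):
        return v1 >= v2
    return False                     # conservatively refuse to share

def can_share(q, base):
    """Can query q use the final result of query `base`?"""
    superset = (base["classes"] <= q["classes"]
                and base["edges"] <= q["edges"])
    restrictive = all(
        more_restrictive(q["sel"].get(c), base["sel"].get(c))
        for c in base["classes"])
    return superset and restrictive
```

Applied to the four example queries, Q2 (a supergraph with identical selections) and Q3 (the same graph with a more restrictive selection on A) both pass, while Q4 fails the graph-superset test.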


Structures of Sharing: There is a need to introduce some structures of sharing to provide a clear and more expressive way of describing the execution order of queries in a parallel or distributed environment. For example, the four queries we just discussed can have the structure shown in Figure 6.4.

Figure 6.4. An Example Set of the Structures of Sharing (Q1, Q4 and Q3(A[a1 = 20]) at the first level; Q2 and Q3 at the second level, below Q1)

This set of structures carries several meanings. Firstly, it says that Q1 will be executed before Q2 and Q3, and that Q2 and Q3 can share the result of Q1. Secondly, it indicates that Q4 cannot share with Q1, Q2 or Q3; however, the local selection operations of Q1, Q4 and Q3(A[a1 = 20]) are sharable. We shall say that Q1, Q2 and Q3 are in the same structure of sharing and Q4 by itself is in another structure. Given a set of queries, more than one structure can be constructed. In our example, Q1, Q4 and Q3(A[a1 = 20]) are at the first level (a higher level) while Q2 and Q3 are at the second level (a lower level). The reason Q3(A[a1 = 20]) is also placed at the first level is that its local selection condition is different from Q1's, and the instances of class A obtained by processing Q1 cannot be used for processing Q3(A[a1 = 20]). However, by placing Q3's selection condition over A at the top level, that selection condition can be shared with those of Q1 and Q4 because these three queries will be processed concurrently.


We note here that in a structure of sharing, the higher level query is less restrictive than the lower level queries. Thus, the query results of a higher level query are a superset of the results of a lower level query. The structure of sharing can be used as the structure for query scheduling; thus, the interquery scheduling is based on the sharing of common subpatterns. The above example illustrates that a structure of sharing is very convenient in describing the execution order of queries and the activation of processors that manage different classes. Now, we proceed to introduce three basic structures of sharing.

Basic Structures of Sharing: A complex structure consists of one or more of the three basic structures shown in Figure 6.5. Structure A is the simplest one. Those nodes in Q2 that have corresponding nodes in Q1 will wait until those nodes in Q1 finish processing (i.e., after receiving all the end_markers). The nodes in Q2 will use the query results of those nodes in Q1 as their local selection results. Moreover, in the elimination algorithm, the CON array of a node in Q1 can also be used by the corresponding node in Q2, so that Q2 will be able to resume the processing at the place where Q1 left off rather than start the processing all over again. We note here that those nodes in Q2 that do not appear in Q1 can start the query processing without waiting for Q1 to finish. They can also participate in the sharing of distributed local selections to be discussed later. Case AA is an example of this kind of structure, in which the result of A*B can be used for processing A*B*C and the processor of node C can start a wavefront algorithm without waiting for the processing of A*B to complete.
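The execution order that a structure of sharing encodes can be sketched as a level assignment: a query sits one level below the deepest query whose result it shares, and queries at the same level run concurrently. The child-to-parents dictionary encoding and function names are assumptions for illustration.

```python
# Sketch of deriving a level-by-level execution schedule from a structure
# of sharing: higher-level queries run before the lower-level queries that
# share their results.

def schedule_levels(parents):
    """parents: {query: list of queries whose results it shares}
    returns: {level number: set of queries to run at that level}"""
    memo = {}
    def level(q):
        if q not in memo:
            memo[q] = 1 + max((level(p) for p in parents.get(q, [])),
                              default=0)
        return memo[q]
    queries = set(parents) | {p for ps in parents.values() for p in ps}
    out = {}
    for q in queries:
        out.setdefault(level(q), set()).add(q)
    return out
```

For the four example queries, Q1 and Q4 land at level 1 (processed concurrently, with their selection operations sharable) and Q2 and Q3 at level 2, matching Figure 6.4.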


Figure 6.5. Three Basic Structures of Sharing (structures A, B and C, with example cases AA, BB and CC)

Case B describes the situation in which a complex query can share the results of several smaller queries. Similarly, the nodes in Q3 can use the query results of the corresponding nodes as their local selection results. If one node appears in both Q1 and Q2, its results from Q1 and Q2 will be intersected and used by the corresponding node in Q3 as its local selection results. However, when it comes to CON array sharing, Q3 can only use CON arrays either from Q1 or from Q2, to avoid a possible inconsistency. Case BB is an example of structure B: the results produced for C and D in the B*C*D and C*D*E processing will be intersected, and the result will be used as the local selection results of C and D, respectively. In the elimination approach, the resulting CON array of either the B*C*D or the C*D*E processing will be used by the corresponding nodes in the processing of A*B*C*D*E*F. The third basic structure is shown in C. This case allows a query pattern to be shared by several other queries. The query result sharing and the CON array sharing


between each pair of the higher level and lower level queries is the same as in Case A. Case CC is an example of structure C. We shall now discuss in greater detail what can be shared in a structure of sharing.

Query result sharing: In a structure of sharing, the result of a higher level query can be used by lower level queries as their local selection results if they have the same local selection conditions, so that the local selection operations of the lower level queries are not necessary. However, if the lower level queries do not have the corresponding object nodes in the higher level query, the local selection operations of these nodes will have to be done separately. For those nodes whose selection conditions are more restrictive than those of the corresponding nodes in the upper level, these selection operations would have been moved to the upper level; their results are readily available for these lower level selection operations. After the local selections, a multiple wavefront algorithm can start.

Intermediate result sharing: If the elimination algorithm is used, the CON arrays of a higher level query, which record some intermediate results, can be shared by the lower level queries. The CON array sharing enables the lower level queries to start their query processing from where the higher level query ends rather than from the very beginning. This strategy will reduce the communication and processing costs. Similar to the case of query result sharing, if the lower level queries cannot find the corresponding nodes from which to copy the CON arrays, they will copy them from the database (i.e., the original CON arrays). Also, if there is more than


one query at the higher level, the structure of sharing should specify the query from which all the lower level queries should copy the CON arrays. After that, the CON arrays of the source query should be freed, thus making the memory space available to other tasks.

Common operation sharing: In a structure of sharing, some operations are common to all queries, and some of them are very time-consuming, such as the retrieval of the final result. These common operations can be shared. The two-phase query processing technique discussed in Section 3 postpones the retrieval of the final result to the second phase. The common operation sharing strategy postpones the retrieval of the final result even further, i.e., to the end of executing a structure of queries. This approach takes advantage of the structural property of such a structure, i.e., the query result of a higher level query is a superset of the result of a lower level query. In processing a structure of queries, only those data (attribute values) needed for a high level query are retrieved from the secondary storage. The lower level queries access their data from the data already loaded in main memories instead of from secondary storage. This approach can greatly reduce the I/O cost. However, it may delay the response times of some individual queries. To solve this problem, a pipelining approach for the construction of final retrieval results can be used. The final result of a query is constructed after object instances have been traversed by slave nodes and those instances that satisfy the local selection conditions and the context specification have been marked. The collection of the final retrieval result can be done in the following pipelined fashion. As soon as a processor completes its processing of


a query, its data in the form of IID-attribute-value pairs, which constitute the final retrieval result, are sent to a processor responsible for constructing the final result. Data would arrive at the construction processor at different times, depending on when the involved processors complete the processing of a structure of sharing, thus forming a pipeline of data. The construction processor would assemble the received data, based on the IID information provided in the data streams, to construct the final retrieval result.

6.5 Distributed Sharing of Selection Operations

In order to achieve the sharing of the results of selection operations, there must be some processor(s) responsible for the identification of sharable selection conditions. We use a distributed approach to achieve this identification. In this approach, all the processors that contain object classes referenced by a set of concurrent queries are given the structures of sharing as well as the queries. They independently examine the top level queries in these structures (note: only the top level queries do selections) to determine if their selection conditions make references to the same object classes in their possession. If such selection conditions and object classes are identified, the selection conditions are compared to determine if they are sharable. Thus, the decision of shareability is made in a parallel and distributed fashion instead of in a centralized fashion. The latter approach may cause a bottleneck in a multi-query processing system. The tasks needed for sharing the results of selection operations in each processor depend on the storage (or access path) structure used. For example, if there is no index established for an attribute, all its attribute


values in the instances of an object class will be accessed from the secondary storage once and used to process the selection conditions of all the queries that make reference to the attribute. However, if an index is available, index accesses and their results can be shared. Either way, the amount of I/O will be reduced.
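The no-index case, where a single scan of the stored attribute values serves the selection conditions of several top-level queries, can be sketched as follows. The function name and data layout are assumptions; in the actual system the scan would read disk pages of instances rather than an in-memory list.

```python
# Sketch of sharing one attribute scan among the selection conditions of
# several top-level queries, as a processor might do for a class it stores.

def shared_selection_scan(instances, conditions):
    """instances:  iterable of (iid, attribute_value) pairs
    conditions: {query_id: predicate over the attribute value}
    returns:    {query_id: set of qualifying IIDs}"""
    qualified = {q: set() for q in conditions}
    for iid, value in instances:          # a single pass over storage
        for q, pred in conditions.items():
            if pred(value):
                qualified[q].add(iid)
    return qualified
```

With Q1's `a1 > 10` and Q3's `a1 = 20` registered against the same class A, one pass over the stored values answers both selections, instead of one I/O pass per query.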


CHAPTER 7
PERFORMANCE EVALUATIONS

We have implemented the two generic multiwavefront algorithms, the result collection strategies based on both master-slave and peer-to-peer architectures, and the four optimization strategies presented in the previous sections. Our implementation platform is a 64-node nCUBE 2 parallel computer [nCU92]. The software architectures for the two system organizations are shown in Figure 7.1.

7.1 Benchmark and Application Domains

In our evaluation, the benchmark queries introduced in Thakore's work [Thak94] are used. We did not use the benchmarks proposed in other works [Ande90, Catt92, Care93] because they contain much simpler query types. The query types used are shown in Figure 7.2. The characteristics of each query type are as follows:

Type I: Queries involve the manipulation of complex objects. Figure 7.2(a) shows the structure of the subschema processed by the queries. Object class C1 models a set of complex objects. Complex objects are composed of objects of other classes and are modeled as an aggregation hierarchy. In the figure, objects of class C1 are composed of objects of classes C2 and C3.

Type II: Queries involve the manipulation of complex objects and the inheritance of attributes. Figure 7.2(b) shows the structure of the subschema processed by queries


of this type. As can be observed from the figure, in addition to the manipulation of the aggregation hierarchy, the inheritance of attributes through the generalization association (labeled G) is also involved in query processing. The dynamic model of inheritance is assumed, meaning the attributes and values associated with objects of a superclass are defined and stored in the superclass rather than in its subclasses.

Type III: Queries involve the interaction (or relationship) of complex objects with inheritance of attributes. In Figure 7.2(c), classes C1 and C4 model two sets of complex objects. Objects of class C1 inherit attributes from class C8. Class C7 models objects that capture the interaction between the complex objects of classes C1 and C4.

In addition to the above benchmark queries, query types with one-class selection and two-class and three-class association (join) are also evaluated. In our performance evaluations, we conduct evaluations of the proposed query processing and optimization strategies. We compare the speedup and scaleup of the two architectures for result collection. We also evaluate data placement strategies over two different application domains.

7.2 Evaluations of Optimization Strategies

Using the implemented system, we evaluate the performance of the four optimization strategies. We compare the performance of query processing with optimization against the performance without optimization. The response time of a query is defined as the time it takes for the slave nodes to receive the query from the master node, process it and construct the final results. The total CPU time of a query is


the summation of the processing times of the slave nodes which are involved in the processing of the query (excluding the idle time). We are able to obtain this data by using the execution profiling tool provided by the nCUBE. The test database is constructed as shown in Figure 7.2(c). We map one object class to one processor (class-per-node), and each processor has its own I/O channel.

7.2.1 Intraquery Scheduling

For the intraquery scheduling strategy, a test query as shown in Figure 7.2(b) is used. We assume that each of the object classes has an IID-size of 5,000 initially. The IID-size of class C3 varies from 5,000 to 500 so that we can observe the impact of varying the difference in IID-sizes between class C3 and classes C2 and C8. According to the intraquery scheduling strategy, either of the C2 and C3 classes can be passive. Here we arbitrarily pick class C3 as the passive class. Figure 7.3(a) shows the response time for the single query in two situations: one with the intraquery scheduling strategy and the other without. We observe from this figure that when the difference in IID-size between class C3 and classes C2 and C8 is large (greater than 3,000), the intraquery optimization strategy improves the response time. We also observe from Figure 7.3(a) that, when the difference in IID-size between the object classes is not very significant, starting the wavefronts from the active nodes with smaller IID-sizes decreases the degree of parallelism for the single query, so that the response time is longer. Figure 7.3(b) shows the total CPU time (excluding the idle time) of all the slave nodes that are involved in this query. This figure shows that, by applying the


Figure 7.1. Two Software Architectures: (a) A Master-Slave Software Architecture; (b) A Peer-to-Peer Software Architecture.


Figure 7.2. Schema Representation of Various Benchmark Queries: (a) Modeling of Complex Objects; (b) Modeling of Complex Objects with the Inheritance of Attribute Values; (c) Modeling of Interacting Complex Objects with the Inheritance of Attribute Values.


intraquery scheduling strategy, the total CPU time decreases, thus leaving more CPU time for other queries. If we consider only the optimization of a single query, we can easily see in Figure 7.3(a) that the threshold value of the difference in IID-sizes is about 3,000 (or, equivalently, when the IID-size of class C3 is 2,000). However, if our interest is in the optimization of multiple queries, the response time of one query is not the only criterion that needs to be considered. We also need to consider the cost involved in achieving that response time; in other words, how much processing power is consumed by a single query and how much is left for other queries. The goal in parallel multiple query processing is to achieve a balance between the response time for each individual query (parallelism) and the cost to achieve it (efficiency), so that the overall response time of a set of concurrent queries is reduced. With this goal in mind, by combining Figures 7.3(a) and (b), we can see that the threshold value of the difference in IID-size can be somewhere between 0 and 3,000.

7.2.2 Partial Graph Processing

The partial graph processing strategy is tested for the identification approach using the query graph shown in Figure 7.2(c). We randomly pick the object classes from which data are to be retrieved. The response time and the total CPU time of the query are shown in Figures 7.4(a) and (b).
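Both evaluation metrics reported in this chapter can be computed from per-slave timings. The sketch below assumes simple (busy, idle) timing pairs per slave node; the names and data layout are illustrative, not the actual output format of the nCUBE profiling tool.

```python
def query_metrics(slave_times):
    """Compute the two evaluation metrics from per-slave timings.

    slave_times: list of (busy_time, idle_time) pairs, one per slave node
    involved in the query. The response time is governed by the last slave
    to finish (busy + idle), while the total CPU time sums only the busy
    portions, excluding idle time.
    """
    response_time = max(busy + idle for busy, idle in slave_times)
    total_cpu_time = sum(busy for busy, _ in slave_times)
    return response_time, total_cpu_time
```

For example, three slaves with timings (5.0, 1.0), (3.0, 0.0) and (4.0, 2.0) yield a response time of 6.0 but a total CPU time of 12.0, which is why an optimization can improve one metric while leaving the other nearly unchanged.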


Figure 7.3. Intraquery Scheduling Strategy: (a) Response Time; (b) Total CPU Time.

Two observations can be made. Firstly, the greater the ratio of the Dia (as defined in Section 6.1) of a partial graph to the Dia of the whole query graph, the smaller the response time and the greater the saving in total processing time. Secondly, the fewer the classes from which the final retrieval results are to be accessed, the greater the saving in processing time. Also, if a large class does not contain the descriptive data of interest, the response time will be significantly reduced. The location of the class in the query graph affects the response time as well.

One can also notice, by comparing Figures 7.4(a) and (b), that, while the response time for a single query may not be significantly reduced by applying the partial graph strategy, the total CPU time is. This is because the response time is determined by the time (both idle and processing time) of the last slave node which finishes the processing of a query. In order to evaluate a query processing strategy in


the context of multi-query processing, both the response time of the query and the total CPU time should be considered.

Figure 7.4. Partial Graph Processing Strategy: (a) Response Time; (b) Total CPU Time.

7.2.3 Interquery Scheduling and Common Pattern Sharing

Our test database is still the database shown in Figure 7.2(c). We construct a structure of sharing which consists of three basic structures. We gradually increase the number of queries in the structure, measuring the total response time at each step, until it consists of five queries as shown in Figure 7.5. The performance evaluation is done for both the identification and elimination approaches. We present only the results of the identification approach here because the results of the elimination approach are similar. Figures 7.6(a) and (b) show the performance results when both the result sharing and the sharing of the result retrieval operation (e.g., the result collection phase) are applied. As expected, the performance is


further improved. We also note that the more queries are added to the sharing structure, the more performance gain is achieved.

7.2.4 Distributed Local Selection Sharing

The same performance evaluation method used in the preceding subsection to evaluate the interquery scheduling and common pattern sharing strategies is used here. The test database and the structure of sharing are the same. However, we do not use the result sharing and the result retrieval sharing; only the distributed local selection sharing strategy is applied. Figure 7.7 shows the performance evaluation result. We observe the same kind of performance improvement as in the last subsection. The increasing number of queries shown in the figure represents the increase in the sharing of local selection operations.

Figure 7.5. Structure of Sharing Used in Performance Evaluation.
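As a rough illustration of how a batch of queries might be grouped into a structure of sharing, the sketch below merges queries whose class patterns overlap in at least two classes. This simplifies the Case A test of Section 6.4; the set-of-class-names query representation is our own, and the pairwise pass corresponds to the O(n^2) structure-building step discussed in Chapter 8.

```python
from itertools import combinations

def build_sharing_groups(queries, min_shared=2):
    """Group a batch of queries that share a common class pattern.

    queries: dict mapping a query id to the set of classes it traverses.
    Each pair of queries is compared once (the O(n^2) pass); pairs sharing
    at least `min_shared` classes are merged into one sharing structure.
    """
    groups = {qid: {qid} for qid in queries}          # each query starts alone
    for q1, q2 in combinations(queries, 2):
        if len(queries[q1] & queries[q2]) >= min_shared:
            merged = groups[q1] | groups[q2]          # merge the two groups
            for qid in merged:
                groups[qid] = merged
    return {frozenset(g) for g in groups.values()}    # distinct structures
```

With query patterns resembling Figure 7.5 (Q1, Q3 and Q5 all traversing C7 and C1), the three queries collapse into one sharing structure while an unrelated query stays in its own group.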


Figure 7.6. Interquery Scheduling and Common Pattern Sharing Strategies: (a) Response Time When the Result Sharing is Applied; (b) Response Time When the Result Sharing and the Result Retrieval Sharing are Applied.

Figure 7.7. Distributed Local Selection Sharing Strategy.
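At a slave node, the sharing of local selections amounts to executing each distinct selection predicate once and handing the resulting IID set to every query that requested it. A hedged sketch follows; the predicate representation and the evaluator are hypothetical stand-ins for the system's actual selection machinery.

```python
def share_local_selections(queries, evaluate):
    """Execute each distinct selection predicate once and share the result.

    queries: dict mapping query id -> selection predicate, where a predicate
    is any hashable key, e.g. ("C1", "attr", "=", 5). evaluate: function that
    runs one predicate against local storage and returns a set of IIDs.
    Queries with identical predicates reuse the cached IID set instead of
    re-scanning the data.
    """
    cache = {}
    results = {}
    for qid, predicate in queries.items():
        if predicate not in cache:           # first query pays for the scan
            cache[predicate] = evaluate(predicate)
        results[qid] = cache[predicate]      # later queries share the result
    return results, len(cache)               # cache size = scans actually run
```

Because the decision is made entirely from the slave node's own query list, no coordination with the master node is required, which is the point made in Section 8.1.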


7.2.5 Scaleup and Speedup of Optimization Strategies

We also evaluate the scaleup and speedup of the multiple wavefront algorithms and the optimization strategies. In our evaluation, the benchmark queries shown in Figure 7.2 are used. Many application domains can be formed by varying the percentages of the three benchmark query types. In this dissertation, only the evaluation result of one application domain is presented. This application domain has the same percentage for each of the three benchmark query types.

Figure 7.8(a) shows the scaleup of the multiple wavefront algorithms and the optimization strategies. Several conclusions can be reached from the figure. Firstly, when the number of processors is much smaller than the number of object classes in the schema, reasonably good scalability can be achieved. This is because each processor stores instances of multiple classes and multiple queries access data from different classes, thus achieving some degree of load balancing. When the number of processors is further increased, the scalability deteriorates because the additional processors do not lighten the processing load of the processors which hold the instances of object classes, due to the class-per-node mapping strategy used in this particular experiment. Secondly, the optimization strategies improve the scalability.

Figure 7.8(b) shows the speedup of the multiple wavefront algorithms and the optimization strategies. From the figure, we observe quite good speedup when the number of processors is much smaller than the number of object classes in a schema. When the number of processors is increased, the speedup deteriorates.
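Before continuing, note that the two metrics in Figure 7.8 follow the standard definitions, which can be stated compactly; the timings in the usage note below are illustrative.

```python
def speedup(t_one_proc, t_n_procs):
    """Speedup: the same problem run on N processors vs. one processor."""
    return t_one_proc / t_n_procs

def scaleup(t_base_on_one, t_nx_larger_on_n):
    """Scaleup: the problem size grown N-fold together with the processor
    count; a value near 1.0 means the larger system keeps pace."""
    return t_base_on_one / t_nx_larger_on_n
```

For example, speedup(60.0, 15.0) returns 4.0, and scaleup(10.0, 12.5) returns 0.8, meaning the proportionally larger run takes 25% longer than the base case.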


The reason is the same as that explained for the scalability. We can also observe that the algorithms without optimization strategies have better speedup than the ones with optimization strategies. The reason is that, when many object classes are mapped to a processor, it is more likely that the optimization strategies can be applied effectively and the execution time of the queries reduced significantly. Thus, when the number of processors is increased, the execution time cannot be reduced as much as in the case when the optimization strategies are not applied. For the scaleup evaluation, the optimization strategies can be applied more effectively because the problem size grows with the system size, thus achieving better scalability.

Figure 7.8. Scaleup and Speedup of the System: (a) Scaleup of the System; (b) Speedup of the System.

7.3 Evaluations of Architectures

In our research, we evaluate the speedup and scaleup of the two architectures. In the master-slave architecture, node 0 is designated as the master node and the rest


of the nodes as slave nodes. In the peer-to-peer architecture, node 0 is designated as the C-node and the rest of the nodes as P-nodes. The results are collected by different P-nodes. We also use the hybrid data placement strategy for the evaluations in this subsection.

7.3.1 Single Class Selection

The speedup and scaleup of a single class selection are shown in Figure 7.9. It can be observed that both architectures have good speedup and scaleup. In our experiment, we also change the size of classes and the attribute size of objects. We found that the speedup and scaleup properties do not change significantly in either case. However, if the size of a class is very small (e.g., 50 objects/class) or the attribute size of an object is small (e.g., 5 bytes/object), the speedup and scaleup are not good. This is because, when each data segment is less than or equal to the page size (4096 bytes in our implementation), further partitioning will not improve the performance.

Figure 7.9. Single Class Selection (20000 objects/class, 100 bytes/object): (a) Speedup; (b) Scaleup.
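The page-size effect can be seen with a back-of-the-envelope model. The 4096-byte page size matches our implementation, but the even-split cost model below is a simplification introduced here for illustration.

```python
PAGE_SIZE = 4096  # bytes, as in our implementation

def io_pages_per_node(num_objects, bytes_per_object, num_nodes):
    """Pages a node must read when a class is evenly split across nodes.

    Once the per-node segment shrinks to a single page, adding more nodes
    no longer reduces any node's I/O, so speedup flattens out.
    """
    seg_bytes = -(-num_objects * bytes_per_object // num_nodes)  # ceiling
    return -(-seg_bytes // PAGE_SIZE)                            # ceiling
```

A class of 50 objects at 100 bytes/object (5,000 bytes) already fits in two pages: with 2 nodes each node reads one page, and adding six more nodes still leaves each node reading one page, so nothing is gained.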


7.3.2 Two-Class Join

The speedup and scaleup of a two-class join are measured on three different kinds of databases. As shown in Figure 7.10, the peer-to-peer architecture achieves better speedup and scaleup than the master-slave architecture. We observe this characteristic for all types of queries we tested. We also observe from Figure 7.11 that, in the peer-to-peer architecture, when the number of objects in a class is increased from 4,000 objects per class to 20,000 objects per class, the speedup and scaleup are degraded. This is because the increase in the number of objects increases the message transmission overhead, a portion of the cost that cannot be completely parallelized. On the contrary, when the attribute size of an object is increased from 100 bytes/object to 5,000 bytes/object, the speedup and scaleup improve. This is because the increase in attribute size increases the I/O cost, which is the portion of the cost that can be parallelized. Similar characteristics can be observed in the master-slave architecture. The variation of an object class size or an attribute size has a similar impact on a three-class join and the benchmark queries.

7.3.3 Three-Class Join

The speedup and scaleup of the three-class join are shown in Figure 7.13. Compared with the two-class join, the speedup and scaleup are not as good. This is because the three-class join increases the message overhead, which cannot be fully parallelized.
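The explanation above is essentially Amdahl's law: the message-transmission cost acts as a serial fraction that caps speedup, while the I/O cost parallelizes. A rough model, with illustrative cost units:

```python
def modeled_speedup(parallel_cost, serial_cost, n_procs):
    """Speedup when a query's cost splits into a parallelizable part
    (e.g. I/O) and a non-parallelizable part (e.g. message overhead)."""
    t_one = parallel_cost + serial_cost
    t_n = parallel_cost / n_procs + serial_cost
    return t_one / t_n
```

Raising the attribute size inflates the parallelizable I/O cost and improves speedup, e.g. modeled_speedup(90, 10, 9) gives 5.0; shifting the same total cost toward message overhead, as a larger object count or an extra join does, drags the speedup down.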


76 'u' 4S l 2.2 . Master-Slave ... ~---40 Master-Slave -=; Peer-to-Peer Peer-to-Peer 2 ,, 1 ell 3S ,, ,, , ,, 30 1.8 ,, . ,, .. .. ,, Ill 25 I>-, ,, "' Ill 1.6 ,, 2, ell 20 tloO -.. -, .. ,, Q 1.4 .. _,, ,,,' lS .. r: i , 10 ..+-----------~-------------,, Q, ____ .. ___ .. 1.2 ...--,, ,a .... . r... -, 5 r' Q, , l = ., v i 1 . tr.) 10 20 30 40 so 60 10 20 30 40 50 60 Cl) Number of Processors Number of Pr~ors (a) Speedup (b) Scaleup Figure 7.10. Two-class Join (4 000 objects/class, 100 bytes/object) a 45 ell 1.8 ell .. 20000 Objects/Class ---20000 Objects/Class +40 Ill ,, =a 1.7 4000 Objects/Chm 4000 Objects/Class ,,"" ell 35 ,1.6 1 30 ,,,' ,, ,!! 1.5 ,, ,, Ill 25 ,, ,, 1.4 ,, ell 0 20 t'l ,, -,, --.. -, 1.3 -Q ....... ,, --15 -u , ... ....-r: ,,. .Q 1.2 ,.. ,,' -u -10 -Q, .if' .... as ,= r... 1.1 ,, l ,, s Q, = i 1 Cl) 10 20 30 40 so 60 10 20 30 40 so 60 Cl) Number of Processors Number of Processors (a) Speedup (b) Scaleup F igur e 7 .11. Two -c la ss Join (2 0000 objects/class, 100 byt es/ob j ect)


Figure 7.12. Two-Class Join (4000 objects/class, 5000 bytes/object): (a) Speedup; (b) Scaleup.

Figure 7.13. Three-Class Join (4000 objects/class, 100 bytes/object): (a) Speedup; (b) Scaleup.


7.3.4 Benchmark Queries

Finally, we evaluate the scaleup and speedup of the benchmark queries shown in Figure 7.2. The comparison is made between the master-slave and peer-to-peer architectures. We observe that the peer-to-peer architecture achieves much better speedup and scaleup than the master-slave architecture. This is because the queries are much more complicated, and the result collection and traversal costs at the master node become the bottleneck.

Figure 7.14. Speedup and Scaleup of Benchmark Queries: (a) Speedup; (b) Scaleup.

7.4 Evaluations of Two Data Placement Strategies

In this section, we present the evaluations of the placement strategies over two different application domains. The benchmark query types shown in Figure 7.2 are used. In the first application domain, we assume that there are many object classes, each of which contains a small number of object instances. The queries in this application domain involve all the object classes with an equal or nearly equal probability.


To simulate this application domain, our benchmark query set contains one Type I query, one Type II query and eight Type III queries. There are 100 objects in each class. In the second application domain, there are some large classes, and the queries involve the large classes with a higher probability. To simulate this application domain, we let C1, C2 and C3 have 20,000 objects/class while the remaining classes have 4,000 objects/class. Our benchmark query set contains one Type III query, one Type II query and eight Type I queries.

Two data placement strategies are under evaluation, namely the class-per-node vertical partitioning and the hybrid partitioning strategies. In the class-per-node vertical partitioning strategy, the instances of a class are stored in a single processor, and each processor can hold the instances of multiple classes. At each processor, the instances of an object class are partitioned vertically. In the hybrid partitioning strategy, each object class is horizontally partitioned into segments, and each segment can be further partitioned vertically. As explained in Section 4.2.1 on query modification, a query is modified into another query based on the distribution of different combinations of horizontally partitioned data. The modified query is processed against a combination of data partitions using various optimization techniques to obtain the final result of the original query.

We measure the response time for both data placement strategies in the two application domains. In the class-per-node vertical partitioning strategy, at most eight processors can be utilized. Therefore, we limit the number of processors to eight for both strategies. Figure 7.15(a) shows the performance result in the first application


domain. In this domain, the class-per-node vertical partitioning strategy performs better. This is because the benchmark query set involves the object classes with nearly equal probability; thus, the processing power of the processors can be fully utilized. The hybrid partitioning strategy does not utilize the processing power of the processors any better than the class-per-node partitioning, yet it requires some synchronization at "OR" branches, which introduces some overhead.

Figure 7.15(b) shows the performance result in the second application domain. In this domain, the hybrid partitioning strategy performs better. This is because the benchmark query set involves the object classes with different probabilities and the sizes of the classes differ. When the class-per-node vertical partitioning strategy is used, some nodes may finish query processing well ahead of the other nodes and become idle. When the hybrid partitioning strategy is used, every class is partitioned into eight segments mapped to eight nodes. In this way, the modified queries can fully utilize all the processing power. Even though the modified queries introduce some overhead because of the synchronization, the overall response time is improved.
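The hybrid scheme just described can be sketched as follows: each class is cut horizontally into one segment per node (round-robin here; the actual assignment may differ), and each segment is then decomposed vertically into (IID, attribute-value) binary columns. The names and data layout are illustrative.

```python
def hybrid_partition(objects, attributes, num_nodes):
    """Hybrid-partition a class: horizontal segments, then vertical columns.

    objects: dict mapping IID -> dict of attribute values. Returns, per
    node, a dict attribute -> list of (IID, value) pairs, i.e. the vertical
    binary columns of that node's horizontal segment.
    """
    iids = sorted(objects)
    nodes = [{attr: [] for attr in attributes} for _ in range(num_nodes)]
    for i, iid in enumerate(iids):
        seg = nodes[i % num_nodes]                # round-robin horizontal split
        for attr in attributes:                   # vertical decomposition
            seg[attr].append((iid, objects[iid][attr]))
    return nodes
```

With three objects and two nodes, round-robin places IIDs 1 and 3 on node 0 and IID 2 on node 1; a query touching only one attribute then reads just that attribute's column in each segment.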


Figure 7.15. Response Time of Benchmark Queries: (a) The First Application Domain; (b) The Second Application Domain.


CHAPTER 8
DISCUSSION AND CONCLUSION

8.1 Discussion

Whenever new optimization strategies are introduced, the overhead of using these strategies needs to be considered, since it is a part of the query processing cost. Also, in multiple query optimization, the memory size can become a limitation to the applicability of these strategies. We address these issues below.

Overhead of the Intraquery Scheduling Strategy: The overhead of this strategy is the identification of passive, semi-passive and active nodes. The IID-size is the only parameter that needs to be considered for this purpose. We have defined the IID-size for each object class as the number of IIDs that need to be sent to its neighbors after its local processing. A processor calculates the IID-size based only on local information (e.g., the CON array, instance connectivities to adjacent neighbors and the distribution functions of the attributes) instead of the entire query graph. This calculation is an in-memory operation, and the traversal of the entire query graph is not necessary.

Overhead of the Partial Graph Processing Strategy: The numbering of nodes in a query graph is the overhead for this strategy. We have pointed out in the previous section that the numbering process is a search process on a query graph.
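That numbering can be sketched as a simple breadth-first search over the query graph; the adjacency-list representation is assumed here, and the system's actual numbering rule may differ in detail.

```python
from collections import deque

def number_nodes(graph, start):
    """Assign numbers to query-graph nodes by a breadth-first search.

    graph: dict mapping a class name to the list of its neighbors in the
    query graph. Returns a dict class -> number. The cost is linear in the
    number of classes and edges, which is why the overhead stays small for
    typical query graphs.
    """
    numbers = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in numbers:
                numbers[neighbor] = len(numbers)
                queue.append(neighbor)
    return numbers
```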


Usually, the number of classes involved in a query graph is not very large. Thus, the overhead is small.

Overhead of the Interquery Scheduling and Common Pattern Sharing: Building the structures of sharing is the overhead of this multiple query optimization strategy. It takes two steps to build the structures for a batch of queries. In the first step, each query is compared with every other query in the batch to find out if they can form a Case A (as discussed in Section 6.4) simple structure. The complexity of this operation is O(n^2), where n is the number of queries in the batch. In the second step, those simple structures are grouped to form one or more complex structures. The total complexity is still O(n^2). Because of the memory limitation (to be discussed below), only a limited number of queries can be batched, so the overhead for building the structures of sharing is not too significant.

Overhead of the Distributed Local Selection Sharing: In this strategy, all the decisions on sharing are made in the slave nodes, so there is no bottleneck problem for the master node. The overhead is in identifying the selection predicates and deciding the order of executing the selection operations.

We have run different sets of queries to determine the overheads of query optimization and the performance gains in response time. We did this for both the identification and elimination approaches. The test database is the university database presented in Figure 3.1. A set of queries is used, each of which has 4 to 7 object classes. Figure 8.1 shows the results for the identification approach. We can see that the overhead is very small when compared with the gain in the total response


time. For example, when there are 14 concurrent queries, the overhead is 0.073 seconds while the performance gain is 121.49 - 67.28 = 54.21 seconds. We note here that the overhead does not include the IID-size estimation because, according to the research results reported in Sun's work [SunW93], attribute distribution functions can be computed in advance and the use of these functions at run time to determine IID-sizes generates negligible overhead.

  # of Queries   Strategy              Overhead (sec)   Total Response Time (sec)
  3              with optimization     0.023            13.64
                 no optimization       0                17.28
  5              with optimization     0.044            44.96
                 no optimization       0                55.29
  8              with optimization     0.029            41.23
                 no optimization       0                56.06
  8              with optimization     0.056            35.48
                 no optimization       0                50.76
  14             with optimization     0.073            67.28
                 no optimization       0                121.49

Figure 8.1. Optimization Overhead for the Identification Approach

Memory Limitation: The main memory size of the processors is another factor which needs to be considered in multiple query processing and optimization. In traditional relational database processing, queries generate final or temporary results in the form of relations. If one query can make use of the final or temporary result produced by another query (e.g., the result of a selection operation), the final or temporary relation generated by the latter should ideally be kept in the main memory


so that it can be readily used by the former without extra I/Os. However, this is not always possible, since the generated relation may be too large to be stored in the main memory. In the storage structure and query processing strategy used in our system, only the vertical binary columns (IID-IID and IID-attribute-value pairs) that are relevant to query processing are fetched from secondary storage, and the result of a selection operation or of the processing of a wavefront is a set of IIDs, which does not occupy much main memory space. Therefore, they can be kept in main memory for use by other queries. Also, during the result collection phase (i.e., the second phase of query processing), descriptive data relevant to a set of retrieval queries can be fetched by the slave processors from their corresponding secondary storages once and in parallel, and be distributed into different memory buffers which are set up for these queries. When the buffers are full, they can be transferred to the result collection node, which assembles the data to produce the final instances for the users. A proper buffering scheme can keep the pipelines of data flowing smoothly from the slave nodes to the master node.

8.2 Conclusion

In this dissertation, we have provided the rationale for using the proposed parallel architectures, data placement strategies, query processing and optimization strategies, and result collection strategies for the storage and processing of object-oriented databases. These strategies are different from those used in many existing parallel relational database systems in the following ways. Firstly, hybrid partitioning of object instances is used instead of the popular horizontal partitioning scheme to achieve


a high scalability and to avoid the access of large instances of complex objects from secondary storage. Secondly, a query modification scheme is used instead of the "split", "merge" or "exchange" operators to achieve more uniform interprocessor communication during query processing. Thirdly, the two-phase query processing strategy and the marking of object instances, instead of the traditional single-phase strategy and the generation of temporary relations, avoid the propagation of large quantities of data among processors. Fourthly, the graph-based query specification and processing strategy, instead of the traditional tree-based strategy, can offer a higher degree of parallelism, since query processing can start at multiple processors and proceed in multiple directions instead of the fixed leaves-to-root order. Fifthly, the multi-wavefront algorithms, instead of algebra- and tree-based processing algorithms, allow a more direct implementation and processing of query graphs. Sixthly, the query optimization strategies introduced in this dissertation control the multiple initializations of wavefronts and the directions of their propagation, and allow queries to share their processing results in a variety of ways; they are particularly suitable for multi-wavefront algorithms and the graph-based query processing strategy. Lastly, the distributed result collection scheme achieves good speedup and scaleup for retrieval-type queries.

In this work, we have implemented the above strategies and evaluated their performance. We have shown not only that they are implementable but also that their use improves the performance of multi-query processing with negligible overhead. We do not claim that the proposed strategies are better than the more traditional


strategies used in many existing parallel database systems. A comparison between these two sets of strategies would involve the actual implementation of both sets and running them on a parallel system to obtain precise performance measurements. However, this task would be too laborious to undertake, due to the fact that there are many existing strategies and variations. Any selection of a subset of these strategies for comparison purposes would be subject to criticism on the fairness of the result, particularly since different techniques and algorithms are bound to be used to implement them. However, we do suggest that, due to the different characteristics of object-oriented databases, researchers in parallel database systems should investigate different query processing and optimization strategies. The ones described in this dissertation are some examples.


REFERENCES

[Alas89] A. M. Alashqur, S. Y. W. Su, and H. Lam. OQL: A query language for manipulating object-oriented databases. In Proc. 15th Int'l Conf. on Very Large Data Bases, Amsterdam, Netherlands, pp. 433-442, Aug. 1989.

[Ande90] T. Anderson, A. J. Berre, M. Mallison, I. H. H. Porter, and B. Schneider. The hypermodel benchmark. In Proceedings of the EDBT Conference, Venice, Italy, Mar. 1990.

[Bern81] P. A. Bernstein, N. Goodman, E. Wong, C. L. Reeve, and J. B. Rothnie, Jr. Query processing in a system for distributed databases (SDD-1). ACM Trans. Database Syst., 6(4):602-625, Dec. 1981.

[Bham93] K. Bhambani and M. R. Kay. ODBII: The next-generation object database. Technical overview, Fujitsu Laboratories, 1993.

[BicL86] L. Bic and R. L. Hartmann. Simulated performance of a data-driven database machine. J. Parallel Distributed Comput., 3(1):1-22, 1986.

[BicL89] L. Bic and R. L. Hartmann. AGM: A dataflow database machine. ACM Trans. Database Syst., 14(1):114-146, Mar. 1989.

[Care93] M. J. Carey, D. DeWitt, and J. Naughton. The OO7 benchmark. In Proc. ACM SIGMOD Int'l Conf. on Management of Data, Washington, DC, pp. 12-21, May 1993.

[Catt92] R. Cattell and J. Skeen. Object operation benchmark. ACM Transactions on Database Systems, 17(1):1-31, Mar. 1992.

[Chen91] M.-S. Chen and P. S. Yu. Determining beneficial semijoins for a join sequence in distributed query processing. In Proc. 7th Int'l Conf. on Data Eng., pp. 50-58, Apr. 1991.

[Chen92] M.-S. Chen, P. S. Yu, and K.-L. Wu. Scheduling and processor allocation for parallel execution of multi-join queries. In Proc. 8th Int'l Conf. on Data Eng., Tempe, Arizona, pp. 58-67, Feb. 1992.


[Chen95] Y. Chen and S. Y. W. Su. Identification and elimination-based parallel query processing techniques for object-oriented databases. Journal of Parallel and Distributed Computing, 28:130-148, 1995.

[Clue92] S. Cluet and C. Delobel. A general framework for the optimization of object-oriented queries. In Proc. ACM SIGMOD Conf., San Diego, CA, June 1992.

[Cope85] G. Copeland and S. N. Khoshafian. A decomposition storage model. In Proc. ACM SIGMOD Conf., Austin, Texas, pp. 268-279, 1985.

[DeWi90] D. J. DeWitt, P. Futtersack, D. Maier, and F. Velez. A study of three alternative workstation-server architectures for object-oriented database systems. In Proc. 16th Int'l Conf. on Very Large Data Bases, Brisbane, Australia, pp. 107-121, Aug. 1990.

[DeWi92] D. J. DeWitt and J. Gray. Parallel database systems: The future of high performance database systems. CACM, 35(6):85-98, June 1992.

[Grae88] G. Graefe and D. Maier. Query optimization in object-oriented database systems: The Revelation project. Technical Report CS/E 88-025, Oregon Graduate Center, 1988.

[Grae90] G. Graefe. Encapsulation of parallelism in the Volcano query processing system. In Proc. ACM SIGMOD Int'l Conf. on Management of Data, Atlantic City, NJ, pp. 102-111, June 1990.

[Grae94] G. Graefe, R. L. Cole, D. L. Davison, W. J. McKenna, and R. H. Wolniewicz. Extensible query optimization and parallel execution in Volcano. In Query Processing for Advanced Database Systems, J. C. Freytag, D. Maier, and G. Vossen, editors, pp. 305-330. Morgan Kaufmann Publishers, San Mateo, California, 1994.

[Hara94] L. Harada, N. Akaboshi, and M. Nakano. An effective parallel processing of multi-way joins by considering resource consumption. In Proc. of ICCI Conf., 1994.

[Ioan90] Y. E. Ioannidis and Y. C. Kang. Randomized algorithms for optimizing large join queries. In Proc. ACM SIGMOD Int'l Conf. on Management of Data, Atlantic City, NJ, pp. 312-321, May 1990.

[Ishi93] H. Ishikawa et al.
The model, language, and implementation of an object-oriented multimedia knowledge base management system. ACM Transaction on Database Systems, 18(1), Mar. 1993. [Jenq90] B. P. Jenq, D. Woelk, W. Kim, and W.-L. Lee. Query processing in distributed ORION. In Advances in Database Technology EDBT '90, Venice, Italy, F. Bancilhon, C. Thanos, and D. Tsichritzis, editors, pp. 169-187. Springer-Verlag LNCS 416, 1990.


[Kamb82] Y. Kambayashi, M. Yoshikawa, and S. Yajima. Query processing for distributed databases using generalized semijoins. In Proc. ACM SIGMOD Int'l Conf. on Management of Data, pp. 151-160, June 1982.

[Kamb85] Y. Kambayashi. Processing cyclic queries. In Query Processing in Database Systems, W. Kim, D. S. Reiner, and D. S. Batory, editors, pp. 62-78. Springer-Verlag, 1985.

[Kell91] T. Keller, G. Graefe, and D. Maier. Efficient assembly of complex objects. In Proc. ACM SIGMOD Int'l Conf. on Management of Data, Denver, Colorado, May 1991.

[KimK90] K.-C. Kim. Parallelism in object-oriented query processing. In Proc. 6th Int'l Conf. on Data Eng., Los Angeles, CA, pp. 209-217, Feb. 1990.

[KimW88] W. Kim, N. Ballou, H. T. Chou, J. F. Garza, and D. Woelk. Integrating an object-oriented programming system with a database system. In Proceedings of the International Conference on Object-Oriented Programming Systems, Languages and Applications, San Diego, CA, pp. 142-152, Sept. 1988.

[KimW89a] W. Kim. A model of queries for object-oriented databases. In Proc. 15th Int'l Conf. on Very Large Data Bases, Amsterdam, Netherlands, pp. 423-432, Aug. 1989.

[KimW89b] W. Kim, K. Kim, and A. Dale. Indexing techniques for object-oriented databases. In Object-Oriented Concepts, Databases and Applications, W. Kim and F. Lochovsky, editors. ACM and Addison-Wesley, 1989.

[Kits90] M. Kitsuregawa and Y. Ogawa. Bucket spreading parallel hash: A new, robust, parallel hash join method for data skew in the super database computer (SDC). In Proc. 16th Int'l Conf. on Very Large Data Bases, Brisbane, Australia, pp. 210-221, Aug. 1990.

[LamH87] H. Lam, S. Y. W. Su, F. L. C. Seeger, C. Lee, and W. R. Eisenstadt. A special function unit for database operations within a data-control flow system. In Proc. of the Int'l Conf. on Parallel Processing, pp. 330-339, Aug. 1987.

[LamH89] H. Lam, C. Lee, and S. Y. W. Su. An object flow computer for database applications. In Proc. of the Int'l Workshop on Database Machines, pp. 1-17, June 1989.

[Lieu93] D. F. Lieuwen, D. DeWitt, and M. Mehta. Parallel pointer-based join techniques for object-oriented databases. In Second International Conference on Parallel and Distributed Information Systems, pp. 172-181, Jan. 1993.


[Ling92] Y. Ling and W. Sun. A supplement to sampling-based methods for query size estimation in a database system. ACM SIGMOD Record, 21(4):12-15, Dec. 1992.

[LuH91] H. Lu, M.-C. Shan, and K.-L. Tan. Optimization of multi-way join queries for parallel execution. In Proc. 17th Int'l Conf. on Very Large Data Bases, Barcelona, Spain, pp. 549-560, Sept. 1991.

[Mish92] P. Mishra and M. H. Eich. Join processing in relational databases. ACM Comput. Surv., 24(1), Mar. 1992.

[Nava84] S. Navathe, S. Ceri, G. Wiederhold, and J. Dou. Vertical partitioning algorithms for database design. ACM Trans. Database Syst., 9(4), Dec. 1984.

[nCU92] nCUBE, Foster City, CA. nCUBE 2 Programmer's Guide, Release 3.0, 1992.

[Schn90] D. A. Schneider and D. J. DeWitt. Tradeoffs in processing complex join queries via hashing in multiprocessor database machines. In Proc. 16th Int'l Conf. on Very Large Data Bases, Brisbane, Australia, pp. 469-480, Aug. 1990.

[Shek90] E. J. Shekita and M. J. Carey. A performance evaluation of pointer-based joins. In Proc. ACM SIGMOD Int'l Conf. on Management of Data, pp. 300-311, May 1990.

[SunW93] W. Sun, Y. Ling, N. Rishe, and Y. Deng. An instant and accurate size estimation method for joins and selection in a retrieval-intensive environment. In Proc. ACM SIGMOD Int'l Conf. on Management of Data, Washington, DC, pp. 79-98, June 1993.

[SuSY88] S. Y. W. Su. Database Computers: Principles, Architectures, and Techniques. McGraw-Hill, 1988.

[SuSY89] S. Y. W. Su, V. Krishnamurthy, and H. Lam. An object-oriented semantic association model (OSAM*). In Artificial Intelligence: Manufacturing Theory and Practice, S. Kumara, A. L. Soyster, and R. L. Kashyap, editors, pp. 463-494. Institute of Industrial Engineers, Industrial Engineering and Management Press, 1989.

[Swam88] A. Swami and A. Gupta. Optimization of large join queries. In Proc. ACM SIGMOD Int'l Conf. on Management of Data, Chicago, Illinois, pp. 8-17, June 1988.

[TayY89] Y. C. Tay. Attribute agreement. In Proc. of the 8th ACM SIGACT-SIGMOD-SIGART Symposium on Principles of Database Systems, pp. 110-119, Mar. 1989.


[Thak90] A. K. Thakore, S. Y. W. Su, H. Lam, and D. G. Shea. Asynchronous parallel processing of object bases using multiple wavefronts. In Proc. of the Int'l Conf. on Parallel Processing, pp. 127-135, Aug. 1990.

[Thak94] A. K. Thakore and S. Y. W. Su. Performance analysis of parallel object-oriented query processing algorithms. Distributed and Parallel Databases: An International Journal, 2(1):59-100, Jan. 1994.

[Vald84] P. Valduriez and G. Gardarin. Join and semijoin algorithms for a multiprocessor database machine. ACM Trans. Database Syst., 9(1):133-161, Mar. 1984.

[Vald87] P. Valduriez. Join indices. ACM Trans. Database Syst., 12(2):218-246, June 1987.

[Well92] D. Wells, J. Blakeley, and C. Thompson. Architecture of an open object-oriented database management system. IEEE Computer, 25(10), 1992.

[Whit92] S. White and D. DeWitt. A performance study of alternative object faulting and pointer swizzling strategies. In Proc. 18th Int'l Conf. on Very Large Data Bases, Vancouver, Canada, pp. 419-431, 1992.

[YooH89] H. Yoo and S. Lafortune. An intelligent search method for query optimization by semijoins. IEEE Trans. Knowledge Data Eng., 1(2):226-237, June 1989.


BIOGRAPHICAL SKETCH

In 1984, after receiving the Bachelor of Engineering from Beijing University of P&T at the age of 19, Ying Huang joined the Fuzhou Telecom. Bureau. During a period of more than five years, he was involved in a number of projects in the telecommunications area. One of the R&D projects received the prestigious National Science and Technology Innovation Award in 1986. In 1987, he was selected to go to Japan to participate in a training program sponsored by the United Nations. Together with other engineers from around the world, he studied the emerging international telecommunications protocols and the leading technologies. From 1992 to the present, Ying has been a Ph.D. student in the Electrical and Computer Engineering Department at UF. He also works as a research assistant in the Database Research and Development Center under the guidance of Prof. Stanley Y. W. Su. Ying's primary research interests include parallel and distributed database systems, object-oriented databases, query processing, and optimization. Ying is a student member of IEEE and ACM.


I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Stanley Y. W. Su, Chairman
Professor of Electrical and Computer Engineering

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Keith L. Doty
Professor of Electrical and Computer Engineering

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Eric Hanson
Assistant Professor of Computer and Information Science and Engineering

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

Herman Lam
Associate Professor of Electrical and Computer Engineering

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.

John Staudhammer
Professor of Electrical and Computer Engineering


This dissertation was submitted to the Graduate Faculty of the College of Engineering and to the Graduate School and was accepted as partial fulfillment of the requirements for the degree of Doctor of Philosophy.

May 1996

Winfred M. Phillips
Dean, College of Engineering

Karen A. Holbrook
Dean, Graduate School



