Permanent Link: http://ufdc.ufl.edu/UF00082241/00001
 Material Information
Title: Data distribution and algorithms for asynchronous parallel processing of object-oriented knowledge bases
Physical Description: ix, 302 leaves : ill. ; 29 cm.
Language: English
Creator: Thakore, Arun Kumar, 1962-
Publication Date: 1990
 Subjects
Subject: Object-oriented databases   ( lcsh )
Parallel processing (Electronic computers)   ( lcsh )
Algorithms   ( lcsh )
Electrical Engineering thesis Ph. D
Dissertations, Academic -- Electrical Engineering -- UF
Genre: bibliography   ( marcgt )
non-fiction   ( marcgt )
 Notes
Thesis: Thesis (Ph. D.)--University of Florida, 1990.
Bibliography: Includes bibliographical references (leaves 293-301).
Statement of Responsibility: by Arun Kumar Thakore.
General Note: Typescript.
General Note: Vita.
 Record Information
Bibliographic ID: UF00082241
Volume ID: VID00001
Source Institution: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
Resource Identifier: aleph - 001677511
oclc - 24887658
notis - AHY9416














DATA DISTRIBUTION AND ALGORITHMS FOR ASYNCHRONOUS PARALLEL
PROCESSING OF OBJECT-ORIENTED KNOWLEDGE BASES















By

ARUN KUMAR THAKORE


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA


1990













































Copyright 1990

by

Arun Kumar Thakore
































To my wife and parents













ACKNOWLEDGEMENTS


I take this opportunity to express my deepest gratitude to Dr. Stanley Y. W. Su. He has been and continues to be a constant source of inspiration to me. He has motivated me and guided me with the utmost patience. I had difficult times during the course of my Ph.D. work; he supported me, understood my frustrations, and has always been a trusted mentor. I also thank Dr. Shamkant Navathe for his encouragement and support. He has always given me timely suggestions and made me feel at home.

I thank Dr. Herman Lam for his helpful suggestions. His acumen for detail has helped me improve my work. I also thank Dr. Fred Taylor and Dr. Randy Chow for serving on my committee. My appreciation and admiration go to Sharon Grant, whose serene face and charming smile brighten each working day. She seems tireless and is always ready to help with smiling patience, even at the end of a hectic day. I thank my friends at the Database Research and Development Center for their enthusiasm and cooperation.

This work was supported by a grant from the National Science Foundation and the Florida High Technology Council. The IBM Research Center at Yorktown Heights provided the use of its facilities for the simulation of the ideas developed in this research. The support of the National Science Foundation, the Florida High Technology Council, and IBM is gratefully acknowledged.

I am thankful to my wife Rina, who has provided me with incredible understanding and encouragement in innumerable ways. Her patience and moral support were essential to the completion of this work. Last but not least, I am eternally grateful to my parents for their love and encouragement in all my endeavors.













TABLE OF CONTENTS


                                                        Page

ACKNOWLEDGEMENTS ........................................ iv

ABSTRACT .............................................. viii

CHAPTERS

1 INTRODUCTION ........................................... 1

2 SURVEY OF RELATED WORK ................................. 9

   Database Machines ..................................... 9
   Knowledge Base Machines .............................. 16
      Logic Based Machines .............................. 16
      Production System Machines ........................ 22
      Semantic Network Machines ......................... 25

3 REPRESENTATION AND QUERYING OF
  OBJECT-ORIENTED DATABASES ............................. 29

   Object-Oriented View of Databases .................... 30
   A Closed Model of Query Processing for
     Object-Oriented Databases .......................... 34
      Association Operator .............................. 35
      NonAssociation Operator ........................... 36
   Query Examples ....................................... 36
      Noncyclic Association Pattern ..................... 37
      Cyclic Association Pattern ........................ 41
      Deductive Queries ................................. 42

4 PARALLEL ARCHITECTURAL MODEL AND
  DATA ORGANIZATION ..................................... 47

   Parallel Architectural Model ......................... 47
   Partitioning and Mapping of Data ..................... 50
      Data Clustering ................................... 52
      Load Balancing .................................... 59
      Mapping of Cluster Groups Onto Processors ......... 72

5 PARALLEL ALGORITHMS FOR NON-DEDUCTIVE
  QUERY PROCESSING ..................................... 106

   Processing Phases ................................... 106
   Parallel Algorithms ................................. 110
      Identification of Subdatabases ................... 112
      Generation of the Result ......................... 136

6 PARALLEL ALGORITHMS FOR PROCESSING OF
  DEDUCTIVE RULES ...................................... 175

   Processing Phases ................................... 175
   Parallel Algorithms ................................. 178
      Derivation of the Target Subdatabase ............. 180
      Processing of Linearly Recursive Rules ........... 187

7 SIMULATION ENVIRONMENT AND RESULTS ................... 212

   Simulation Environment .............................. 213
      Hardware ......................................... 213
      Software Components .............................. 214
      Benchmark Queries ................................ 217
      Database Characteristics ......................... 219
   Simulation Results and Analysis ..................... 220
      Suitability of the Heuristic Mapping
        Techniques ..................................... 220
      Effect of Data and Query Parameters on
        Performance .................................... 226
      Effect of System Parameters on Performance ....... 239
      Effect of Derivation Parameters on
        Performance .................................... 248

8 CONCLUSION ........................................... 281

APPENDICES

A EQUATIONS CHARACTERIZING DATA PARAMETERS ............. 285

B EQUATIONS CHARACTERIZING SIMULATED TIMINGS ........... 289

REFERENCES ............................................. 293

BIOGRAPHICAL SKETCH .................................... 302















Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

DATA DISTRIBUTION AND ALGORITHMS FOR ASYNCHRONOUS PARALLEL
PROCESSING OF OBJECT-ORIENTED KNOWLEDGE BASES

By

ARUN KUMAR THAKORE

DECEMBER 1990

Chairman: Dr. Stanley Y. W. Su
Major Department: Electrical Engineering

Sophisticated management of, and reasoning about, large quantities of complex data are essential in advanced application areas. Several object-oriented (OO) databases/knowledge bases have been developed to capture complex domain knowledge effectively. However, due to the enormity and intricacy of the data, and the generality of the functions implemented by the OO databases/knowledge bases, the existing implementations operate inefficiently. In this dissertation, we study several issues related to the efficient parallel implementation of OO knowledge bases.

The physical organization of the data across the processing nodes of a parallel system plays an important role in determining the execution time. We present several techniques for efficiently partitioning large quantities of OO data across the processing nodes of the parallel system. The techniques take advantage of the structure and the semantic properties of the OO data in localizing manipulation and reducing the overall communication costs during query processing.

Further, we present parallel algorithms for the processing of non-deductive and deductive queries against a large OO knowledge base. The algorithms are developed for various query complexities. During processing, the algorithms avoid the execution of time-consuming join operations by retrieving the explicitly stored relationships among the various object instances, based on patterns of object associations. Generation of large quantities of temporary data is avoided by marking object instances using their identifiers and by employing a two-phase query processing strategy. A query is processed by multiple concurrent wavefronts, thereby improving parallelism and avoiding the complexities introduced by a sequential implementation.

The suitability of the data partitioning techniques, and the correctness and performance of the parallel algorithms, have been tested and analyzed by running parallel programs on IBM's distributed message-passing system Victor. Benchmark queries of different semantic complexities are generated, and their performance is analyzed for various data and system parameters. The performance of several application domains characterized by specific mixes of the benchmark queries is also analyzed.













CHAPTER 1
INTRODUCTION


Many advanced database application areas, such as CAD/CAM, CASE, and decision support, have an increasing need to manipulate large quantities of data with complex structures. Relational systems are not expressive enough to capture the complex structural relationships and the behavioral properties of objects found in more advanced applications. Several object-oriented (OO) semantic data models have been developed [HAM81, BAT85, HUL87, SU89] based on the features of the popular object-oriented programming paradigm. These models provide a variety of constructs to model complex domain knowledge effectively, and several OO database systems have been implemented [FIS87, KIN84, WOE86]. The generality and expressiveness introduced by the OO models make it easier for the user to model large quantities of application data in a complex domain. However, the need for querying and reasoning about a large number of complex data objects, and the relationships among them, causes the existing OO systems to operate rather inefficiently.

A number of database machines have been proposed to improve the query processing efficiency of large databases. Researchers in the Artificial Intelligence (AI) area have also designed hardware architectures and processing techniques for efficiently supporting the various reasoning mechanisms encountered in the execution of expert systems. However, as illustrated in this chapter and elaborated further in the next chapter, neither provides adequate means to satisfy the efficient manipulation and reasoning needs of large, complex OO knowledge bases.

Database machines have used a variety of techniques to overcome the I/O and processor-memory bottlenecks of the von Neumann architecture. They have primarily supported the efficient execution of time-consuming primitive relational operations, such as join and set processing, on large relational databases. The requirements and characteristics of OO systems differ from those of relational systems; hence, the architectures and algorithms for the efficient execution of OO systems can be expected to differ from those proposed for relational systems.

For example, the domain knowledge in an OO database can be represented by objects and their associations. The languages used for querying the database should be pattern-based [ALA89a, ALA89b], as opposed to the attribute-based relational languages. Pattern-based languages allow the user to express a query as a complex pattern specifying the object classes, their associations with other object classes, the relationships desired, and the operation(s) to be performed on the selected objects. Using relational machines, the processing of OO queries would involve the execution of time-consuming join operations, because in a relational model the interrelationships among data objects are scattered across several relations and have to be recomputed during query processing. The frequency of the join operations grows as the complexity of the data objects and the interrelationships among them increase. Applications with densely interrelated complex objects clearly require the modeling power of an OO semantic model. Further, relational database machines provide efficient support only for queries involving the retrieval and storage of large quantities of data. Processing of deductive queries, which involves manipulation of large sets of deductive rules and factual data, is not supported by these machines.
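The contrast drawn above can be made concrete with a small sketch. The snippet below is illustrative only (the class, attribute, and association names are invented, not taken from the dissertation): each object instance carries explicitly stored association arcs, so a pattern over classes and associations is answered by following the stored arcs rather than by recomputing relationships with a join.

```python
# Hypothetical sketch: pattern-based retrieval over explicitly stored
# associations, in place of a join over separate relations.

class Instance:
    def __init__(self, oid, **attrs):
        self.oid = oid
        self.attrs = attrs
        self.assoc = {}          # association name -> set of related Instances

    def link(self, name, other):
        self.assoc.setdefault(name, set()).add(other)

# A tiny database: a department and the courses it offers.
cs = Instance("dept1", name="CIS")
db_course = Instance("course1", title="Databases", credits=3)
os_course = Instance("course2", title="Operating Systems", credits=3)
cs.link("offers", db_course)
cs.link("offers", os_course)

# Pattern: Department --offers--> Course with credits = 3.
# The stored "offers" arcs are traversed directly; no join is executed.
result = [c.attrs["title"] for c in cs.assoc["offers"]
          if c.attrs["credits"] == 3]
print(sorted(result))            # -> ['Databases', 'Operating Systems']
```

In a relational encoding, the same pattern would require joining a department relation, a course relation, and an "offers" relation on key attributes for every query.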

Recently, Bic and Hartman have proposed an Active Graph Model (AGM) for database processing [BIC89], aimed at improving efficiency and parallelism during query processing. The AGM explicitly captures the relationships among the data elements and processes a query by injecting tokens from various data elements and propagating them asynchronously along the relationship arcs. The explicitly captured relationships eliminate the need to recompute them during processing by executing time-consuming join operations, thereby improving query processing efficiency. The asynchronous nature of the processing improves parallelism by eliminating the need for centralized control at every execution step. However, the granularity of computation and the query model of the AGM are not suitable for the efficient processing of large OO systems.

The granularity of computation in the AGM is at the data element level. In OO systems, the number of data objects, the connectivity among them, and the number of bytes describing the properties of each object can all be very large. At low granularities of computation, this can lead to the generation and processing of an excessive number of tokens, each carrying a substantial amount of information, which in turn can significantly increase overhead costs. Also, under the AGM, a query is issued against a database represented as a network of interrelated data elements, but the result of the query is a normalized relation whose tuples are collected from the selected nodes of the target set. Since the result of the query is not structurally represented in the same form as the original database, it cannot be stored and further operated on uniformly by the same query model to produce other results that satisfy other qualification conditions. Thus, the closure property is not maintained.

AI machines have provided efficient reasoning for expert systems by implementing in hardware the data structures, and the operations on those data structures, used during the reasoning process. Expert systems are used in narrow domains and are associated with relatively small sets of facts and rules. The AI architectures and the processing techniques they employ assume that the fact and rule bases are resident in main memory. The hardware and software techniques used by AI machines cannot be efficiently applied to reasoning on large quantities of complex data and rules stored across several secondary storage devices.

In this dissertation, we present and experimentally analyze several techniques for efficiently partitioning and processing large OO knowledge bases on parallel architectures. The data partitioning heuristics and the non-deductive and deductive query processing algorithms developed in this work are general and can be executed on a variety of parallel machines. The main features of the proposed techniques are as follows:

1) Similar to the AGM, the interrelationships among the data objects are explicitly stored and used during query processing. This eliminates the need to execute time-consuming join and unification operations in order to relate data objects during the processing of queries referencing the relationships captured by the OO model.

2) The techniques take advantage of the structure and the semantic properties of the OO data in localizing manipulation and reducing the overall communication costs during processing.

3) Unlike the AGM, the query processing techniques are based on an OO query model which maintains the closure property. Thus, the result of a query is structured and represented in an OO framework similar to the base data.

4) Similar to the AGM, an asynchronous approach is adopted in the processing of queries. However, the granularity of processing is at the object class level rather than at the data element level. This enables the exploitation of parallelism without the overhead penalties associated with the processing of a large number of tokens. Further, data blocks within an object class are pipelined, and temporal parallelism is exploited in the processing.

5) A two-phase processing strategy is used to eliminate the unnecessary generation and movement of large quantities of descriptive data. During the first phase of processing, all the objects in the database satisfying the query are marked after manipulating the associative data. Subsequently, the selected descriptive data of only the marked objects are retrieved and presented to the user.

6) A user query is processed by multiple concurrent wavefronts, each executed asynchronously by a pipeline of relevant processors. The desired objects of the various classes, and the specified relationships among the selected objects, are stored in a distributed fashion as a result of the processing. This is in contrast to the traditional rigid tree-structured control in the processing of relational queries; it improves the overall parallelism of the processing and eliminates the complexities involved in a sequential implementation.

7) The derivation rules of the knowledge base system are integrated into the OO data based on the class(es) of objects and/or the new relationships among the classes of objects that they derive. The integrated structure helps focus on the desired set of data and rules from the large knowledge base during processing.

8) During the derivation process, the various classes of objects and/or the various relationships among the specified classes of objects are derived in parallel by the assigned processing nodes. Further, different rules deriving the objects of a class, or the relationships between two classes, are executed in parallel. This strategy increases the overall parallelism of the processing, and a distributed control mechanism is implemented.
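The two-phase strategy of feature 5 can be sketched in a few lines. The snippet below is a minimal illustration under assumed data (the association and attribute names are invented): phase one touches only associative data and marks qualifying objects by identifier; phase two fetches descriptive data for the marked objects alone.

```python
# Hedged sketch of the two-phase strategy: mark by identifier first,
# retrieve descriptive data second. Names and data are illustrative.

# Associative data: object id -> ids of associated objects.
advises = {"prof1": {"stud1", "stud2"}, "prof2": {"stud3"}}

# Descriptive data, kept separate; it is not generated or moved in phase 1.
students = {"stud1": {"name": "A", "gpa": 3.9},
            "stud2": {"name": "B", "gpa": 3.1},
            "stud3": {"name": "C", "gpa": 3.8}}

# Phase 1: mark the identifiers of students advised by prof1 by
# manipulating only the associative data.
marked = set(advises["prof1"])

# Phase 2: retrieve the selected descriptive attributes of only the
# marked objects, for presentation to the user.
result = {oid: students[oid]["name"] for oid in marked}
print(result == {"stud1": "A", "stud2": "B"})   # -> True
```

Because only identifiers flow between the phases, no large temporary relations of descriptive data are created or shipped between processing nodes.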

This dissertation is organized as follows. In Chapter 2, we survey the related work on architectures and techniques for improving the performance of databases and knowledge bases. In Chapter 3, we present the OO view of knowledge bases and discuss the features of an OO query language based on an OO query model which maintains the closure property. In Chapter 4, we describe a parallel architectural model for the implementation of large OO knowledge bases and present several heuristic techniques for the efficient mapping of the OO data across the nodes of the parallel architecture. In Chapter 5, we present asynchronous parallel algorithms for processing non-deductive queries against OO knowledge bases; the algorithms are developed for various query complexities. In Chapter 6, we present algorithms and a distributed control mechanism for the parallel processing of deductive queries against OO knowledge bases. The effectiveness of the various heuristic data mapping techniques, as well as the correctness and performance of the parallel algorithms, is studied by implementing the algorithms and the control mechanism on a parallel message-passing system; the results of the simulation are presented in Chapter 7. Finally, our conclusions and possible future research directions are presented in Chapter 8.













CHAPTER 2
SURVEY OF RELATED WORK


In this chapter, we survey the related work in the areas of database machines and knowledge base machines. Database machines have been developed to improve the processing efficiency of large databases; similarly, knowledge base machines provide efficient means of reasoning on data.

2.1. Database Machines


Since the advent of VLSI technology and the reduction in hardware costs, there has been a trend toward the use of multicomputer systems for database applications. Multicomputer systems obtain considerable performance improvement over the von Neumann architecture by decomposing the computational task into a number of parallel subtasks and executing them simultaneously on different processors. These architectures employ several techniques to improve the utilization of hardware resources and to reduce the query execution time of large databases. In this section, we survey some of the recent multicomputer database systems that are relevant to the research presented in this dissertation. The goal of the survey is to illustrate the well-established techniques that can be adopted in our research and also to depict the limitations of the existing work.

The database initially resides on Secondary Storage Devices (SSDs). At the onset of processing, large quantities of data are moved into main memory, and during processing the temporary results are staged in and out of secondary storage. The time for I/O is a major source of inefficiency in database processing. Database machines [see references in HSI83, OZK86, SU88] employ multiple processors, each with its own main memory and SSDs. The data files are partitioned into subfiles and stored in a distributed fashion across the SSDs. During processing, different subfiles are loaded into main memory simultaneously from the various secondary storage devices; this parallel retrieval of data relieves the I/O bottleneck. Architectures such as GRACE [KIT84] also employ filter processors integrated into the disk modules. The filter processor performs selection and projection on the fly, reducing the amount of unnecessary data staged into main memory for further processing.

The organization and distribution of data across the SSDs also play an important role in reducing I/O costs. SM3, DIRECT, and the Cube-Connected Multiprocessor [BARU86, DEW79, FRI87] horizontally partition the data files into equal segments and distribute the segments across the SSDs. Such a distribution balances the retrieval load in addition to improving retrieval parallelism. However, since the characteristics of the data are not known, all the data partitions have to be retrieved and processed. GAMMA, DBC/1012, and GRACE hash the tuples of the relations into partitions based on the hash values of selected attributes [DEW86, TER84, KIT84]; these partitions are distributed equally among the available SSDs. During query processing, data retrieval can be reduced by retrieving only the relevant partitions and ignoring partitions whose hash values do not satisfy those desired by the query. GRACE additionally sorts the tuples within each partition; sorting eliminates the need to compare all the data values during processing. Hashing, however, requires additional processing overhead, and efficient processing of queries involving non-hashed attribute values cannot be guaranteed.
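The hash-declustering scheme surveyed above can be sketched as follows. This is an illustration in the spirit of the GAMMA/GRACE-style partitioning, not any machine's actual code; the partition count and tuple layout are assumptions.

```python
# Illustrative sketch of hash-partitioning tuples across SSDs.

N_DISKS = 4

def partition(tuples, key_index, n=N_DISKS):
    """Assign each tuple to a disk by the hash of a selected attribute."""
    disks = [[] for _ in range(n)]
    for t in tuples:
        disks[hash(t[key_index]) % n].append(t)
    return disks

# Tuples of a hypothetical employee relation: (emp_id, dept).
emp = [("e1", "sales"), ("e2", "dev"), ("e3", "sales"), ("e4", "hr")]
disks = partition(emp, key_index=0)

# Every tuple lands on exactly one disk, so retrieval is balanced
# across the SSDs while no tuple is duplicated.
assert sum(len(d) for d in disks) == len(emp)

# An exact-match query on the hashed attribute needs to read only the
# one partition that the hash value selects.
target = disks[hash("e3") % N_DISKS]
assert ("e3", "sales") in target
```

The last assertion shows the benefit noted in the text: a query on the hashed attribute touches a single partition, whereas a query on a non-hashed attribute would still have to scan every disk.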

DBC and MDBS process the data based on an attribute-based model, and data records are clustered based on the semantic similarity of their contents [BAN89]. The records of a cluster are evenly distributed across the SSDs of the various computers. The clusters of records relevant to a search query can be quickly located and retrieved from disk. However, queries containing search conditions that do not match the predefined descriptors on which the clusters are based cannot be processed as efficiently as those that do. Data files are vertically partitioned by the DSM and OFC architectures [COP85, LEE89]. In the DSM, a relation is fully decomposed into binary relations, each containing the surrogates and the values of an individual attribute of the original relation. The OFC, in contrast, vertically partitions the relations into associative and descriptive data. Vertical partitioning reduces the amount of data retrieved, since only the partition(s) containing the values of the attribute(s) referenced by the query need be read. However, update costs are higher for vertically partitioned data.
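The DSM-style full decomposition described above can be sketched briefly. The snippet is a hedged illustration (the relation and its attributes are invented): each attribute of a relation becomes a binary relation of (surrogate, value) pairs, so a query touching one attribute reads one partition.

```python
# Sketch of DSM-style vertical decomposition into binary relations.

def decompose(relation, attr_names):
    """Split a relation into one binary (surrogate, value) relation
    per attribute. Each row is (surrogate, v1, v2, ...)."""
    return {a: [(row[0], row[i + 1]) for row in relation]
            for i, a in enumerate(attr_names)}

# A hypothetical employee relation: (surrogate, name, salary).
emp = [(1, "Ann", 50), (2, "Bob", 60)]
binary = decompose(emp, ["name", "salary"])

# A query over salary alone reads only the salary partition ...
assert binary["salary"] == [(1, 50), (2, 60)]
# ... while the name partition is untouched.
assert binary["name"] == [(1, "Ann"), (2, "Bob")]
```

The update-cost drawback noted in the text is also visible here: inserting one logical tuple now requires writing to every binary relation.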

Processing of the join operation is very time consuming.

It involves relating data between two distributed relations.

In addition to retrieval of data, sizable data may have to be

exchanged among the processors. Database machines employ a

variety of techniques to improve the performance of the join

operation. SM3 and the Cube-Connected Multiprocessor use

nested-loop join algorithm wherein the smaller relation is

transmitted among the processors and joined with all the

horizontal segments of the larger relation [BARU86, FRI87].

SM3 reduces data transfer time by using a memory switching

scheme, whereas Cube-Connected Multiprocessor takes advantage

of the increased connectivity among the processors. DBC/1012,

GAMMA, and GRACE use hash-based join algorithm to reduce the

amount of data transferred among the processors [TER84, DEW86,

KIT84]. Valduriez [VAL87] has proposed prejoining the

relations based on the primary keys and storing the join

indices as prejoined relations. This considerably improves

the performance of the join operation. A similar technique has

been used by the OFC [LEE89].
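The hash-based approach mentioned above can be sketched as follows. The bucket count and relation layout are illustrative assumptions; machines such as GAMMA distribute the buckets across processors and join them concurrently rather than iterating over them on one machine.

```python
from collections import defaultdict

def hash_partition(tuples, key, n):
    """Route each tuple to the bucket (processor) chosen by hashing its join key."""
    buckets = [[] for _ in range(n)]
    for t in tuples:
        buckets[hash(t[key]) % n].append(t)
    return buckets

def hash_join(r, s, key, n=4):
    """Hash-partition both relations, then join each bucket pair locally.

    Matching tuples always hash to the same bucket, so no data need be
    exchanged between buckets during the local join phase.
    """
    result = []
    for rb, sb in zip(hash_partition(r, key, n), hash_partition(s, key, n)):
        index = defaultdict(list)
        for t in rb:
            index[t[key]].append(t)
        for t in sb:
            for m in index[t[key]]:
                result.append({**m, **t})
    return result

teachers = [{"tid": 1, "name": "smith"}, {"tid": 2, "name": "jones"}]
sections = [{"tid": 1, "sec": "se4"}, {"tid": 3, "sec": "se9"}]
joined = hash_join(teachers, sections, "tid")
```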








Processing of complex queries involves the execution of a

long sequence of join operations. Although

various database machines use several techniques to improve

the performance of the individual join operations, similar

performance improvements cannot be expected for overall query

execution. The join algorithms take advantage of the even

distribution of data. However, even distribution of data at

the end of the operation cannot be guaranteed. Processing of

subsequent join operations in the query may be inefficient due

to the unevenly distributed data. In addition to poor

performance, low hardware utilization can be expected due to

uneven computational loads on the processors.

In the Cube-Connected Multiprocessor, redistribution of

the result data is suggested and redistribution algorithms

have been designed [FRI87]. Redistribution of the result data

after every operation may improve the performance of

subsequent operations. However, it can itself be very time

consuming. In the DIRECT multiprocessor [DEW89], Query Processors

(QPs) are assigned to process individual operations of the

query tree. A QP starts execution when data are available at

each of its input node(s) and the result is transferred to the

QP processing the subsequent operation. Moreover, DIRECT is

a MIMD machine and multiple queries are processed at the same

time. The data flow approach and the MIMD nature of the

processing enable DIRECT to improve its resource utilization








and query execution time. A data flow approach is also used

by GAMMA, GRACE, and OFC [DEW86, KIT84, LEE89].

Recently, the need for processing data based on a data

model that explicitly captures the semantic relationships

among the data has been established [BIC86, BIC89, LEE89]. It

is observed that the relational model scatters the

relationships across several relations and during processing

the desired relationships have to be computed by performing

time consuming join operations. The processing of the OFC

[LEE89] is based on an Object-Oriented Semantic Model. OFC
captures the relationships and the descriptive data about the

objects of various object classes in the form of unnormalized

or generalized relations. A number of primitive database

operations on the generalized relations have been identified

in the OFC. Similar to the relational approach, a query is

compiled into a tree of primitive operations. Efficiency is

obtained by replacing the join operations of the relational

model with efficient special join operations which take

advantage of the explicitly captured relationships. Further,

in order to reduce the amount of unnecessary data transferred

among the processors, a two-phase processing strategy is

employed. During the first phase, a skeletal nonnormalized

relation of object identifiers is formed. Subsequently, the

desired descriptive data of only the identifiers in the

resulting relation are retrieved. However, OFC takes a

relational approach in the processing of semantic data and








does not eliminate the processing of time consuming join

operations.

Similarly, AGM represents the database as a network of

interrelated entities and relationships [BIC86, BIC89]. A

query is represented as a directed tree of interrelated data

sets. The desired restrictions are also specified in the

query. The query is processed by injecting tokens from

various data elements and propagating them asynchronously

along the arcs of the network. The tokens carry the status of

the selection conditions as well as desired descriptive

values. Unlike the OFC, the network representation allows the

AGM to eliminate the processing of join operations. In

addition, the asynchronous nature of the processing improves

the processing parallelism. However, the granularity of the

computation is at the data element level and a large number of

tokens carrying a substantial amount of data have to be

generated, transmitted, and processed. This can significantly

increase the overhead costs.

In the research presented in this dissertation, similar

to the AGM, we represent the Object-Oriented (OO) data as a

network of interrelated objects and adopt an asynchronous

model of computation. However, the granularity of computation

is higher and we cluster the objects and relationships of

various classes and manipulate them similarly. A higher

granularity helps reduce the overhead costs. We employ

a two-phase processing strategy similar to the OFC in order to









reduce the amount of unnecessary data transferred among the

processors. Unlike the OFC and the AGM, the query processing

in our research is based on an OO query model that maintains

the closure property. The result of the query is represented

in the same network form as the input to the query. This

enables the output of a query to be further processed using

the operators of the same query model. Unlike database

machines, our research deals with efficient query processing

strategies which include the processing of large quantities of

deductive rules integrated with a large factual database in an

OO framework.

2.2. Knowledge Base Machines


In this section, we will survey the past and the current

efforts in designing architectures for knowledge based

systems. The architectures can be classified based on the

underlying knowledge representation scheme they support. We

will survey the architectures under the following categories:

(1) Logic based machines, (2) Production System machines, and

(3) Semantic Network machines.

2.2.1. Logic Based Machines

These architectures are designed to process knowledge

represented in logical statements efficiently. Using this

representation scheme, the domain knowledge about objects and

their inter-relationships is represented as declarative

clauses. There are two kinds of clauses: facts and rules.








The facts capture specific knowledge that is known to be true,

whereas the rules capture general knowledge and can be used

in conjunction with facts in deducing information while

answering users' queries. Prolog is a programming language

that is based on logic. Various sequential and parallel

variants of this language have been used as the basis for

architectures in this category.

Various architectures have been designed to support the

inferencing mechanism of the logic programming system directly

in hardware. Architectures [MOR89, TIC88, TAK84, TAM84] are

uniprocessors that have been developed to support the

depth-first search strategy and the backtracking mechanism in

hardware. Different sources of parallelism in the execution

of logic programs have been studied and used in the design of

parallel architectures. The various sources of parallelism

are as follows:

(i) OR-parallelism--the parallelism in the unification and

the simultaneous execution of the various clauses that are

unifiable with the given goal clause,

(ii) AND-parallelism--the parallelism in the execution of the

sub-goals of the selected clause,

(iii) Search-parallelism--the parallelism in the simultaneous

search of the sets of clauses that unify with a given goal,

(iv) Unification-parallelism--the parallelism corresponding

to the parallel activities within the unification process.
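The unification operation underlying all four sources of parallelism can be sketched with a toy routine. The term encoding below (tuples for compound terms, "?"-prefixed strings for variables) is an illustrative assumption and does not reproduce the representation of any machine surveyed here; note that the independent argument pairs in the loop are exactly where Unification-parallelism arises.

```python
# Minimal first-order unification sketch.
# Terms: tuples ('functor', arg, ...); variables: strings starting with '?';
# constants: any other atom.

def unify(x, y, subst=None):
    """Return a substitution unifying x and y, or None if none exists."""
    if subst is None:
        subst = {}
    x, y = subst.get(x, x), subst.get(y, y)   # dereference bound variables
    if x == y:
        return subst
    if isinstance(x, str) and x.startswith("?"):
        return {**subst, x: y}
    if isinstance(y, str) and y.startswith("?"):
        return {**subst, y: x}
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for xi, yi in zip(x, y):   # independent argument pairs: unification-parallelism
            subst = unify(xi, yi, subst)
            if subst is None:
                return None
        return subst
    return None
```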

Architectures [BEN89, DES88, HER88, ITO87, MOT84, SIN89] use








multiprocessor organizations and a breadth-first search

strategy in exploiting the various sources of parallelism

mentioned above. Some of these architectures have used

heuristics in order to guide the search of the inference

procedure.

One approach in the design of knowledge based systems has

been to combine a relational database system with a logic

programming system [KIY87]. Facts are stored as relations

and managed by the database system, and an inference processor

is designed to store rules and perform the reasoning. The

PRISM project at the University of Maryland involves research

on a multiprocessor knowledge base machine

consisting of problem solving machines and database machines

[KOH88]. The search and problem solving tasks are handled by

the problem solving machines, whereas the database machine

performs the unification and database retrieval. The system

exploits AND-parallelism, OR-parallelism, and Search-

parallelism. Inclusion of a constraint solving machine in the

overall architecture is also being considered. The constraint

solving machine is a specialized hardware driven by the

problem solving machine and assists in the use of constraints

to prune the search space. Similarly, one of the projects at

ICOT in Japan involves combining a relational database machine

DELTA and an inference processor PSI over a local area

network in order to develop a knowledge base machine [MUR84,

WAD87]. The DELTA database machine is developed as








dedicated hardware and various primitive database operations

are implemented in hardware. PSI directly implements the

inference mechanism in hardware. It converts the query based

on the set of rules into a relational algebraic query which

is then manipulated by DELTA. This approach enables

efficient reasoning over large databases and adds deductive

capabilities to an existing database system. Nevertheless,

this approach is not suitable when the number of rules becomes

large and the rules have to be stored in secondary storage.

Moreover, since the two systems are loosely coupled,

inefficiencies crop up due to the interface between them.

Since a low-level logical interface exists between the two

systems, a large number of commands and responses have to be

transferred over the medium connecting the two systems,

thereby reducing the overall performance.

Recently, an integrated approach is being taken in the

development of architectures for knowledge based systems

consisting of a large rule base and a large fact base [QAD87,

WON89]. Using the integrated approach, both facts and rules

are stored and managed uniformly. In the Opale machine

[SAB87], a top-down evaluation strategy is chosen. In order

to reduce the number of disk accesses, a set-oriented

approach is taken in the processing. Using this approach, a

clause is verified by pipelining sets of solutions from one

process to another, each process verifying the binding in a

literal. The chosen strategy allows the exploitation of








OR-parallelism, Search-parallelism, and the pipelining of

AND-processes. A unique feature of this architecture is that

it executes unification of sets of goals with clause headers

read from the disk "on-the-fly".

The Relational Knowledge Base machine [MON88, MORI86,

SAK87, YOK86b, YOK86a] integrates the facts and the rules by

developing a relational knowledge model and by providing a

hardware architecture to support the processing based on that

model. The relational knowledge base model is an enhancement

of the relational data model and contains terms consisting of

constants, variables, and functions as basic data elements.

The relational algebra operations are also enhanced to

include the unification operation. A top-down evaluation

strategy is chosen in this architecture. The main features

of this architecture are (1) use of multiple disk systems to

store and retrieve the term relations in a distributed

fashion, (2) use of specialized hardware called the

unification engines for performing the unification

operations, (3) use of multi-ported page memories to reduce

the I/O bottleneck, and (4) the use of a clustering technique

to filter the irrelevant data. This architecture exploits

the OR-parallelism, Search-parallelism, and the

Unification-parallelism. The approach taken by this

architecture can become inefficient when the set of terms

representing facts is large because of the top-down evaluation

strategy.








An integrated knowledge base machine architecture for

supporting large sets of rules and facts has been proposed by

Shin and Berra [SHI87]. Surrogate files are constructed by

hashing transformation of terms representing the facts and

the heads of the rule clauses. The surrogate files are

distributed across multiple disks. The clause bodies are

stored in a separate database. A top-down evaluation

strategy is chosen and the query is evaluated by performing

unification operations on the surrogate files and later

binding the selected body clauses. A specialized associative

processor for performing the unification on the surrogate

files is proposed. An overall tightly coupled shared memory

system is proposed for the execution. OR-parallelism and

Unification-parallelism are exploited by the system and the

execution follows a breadth-first search strategy.
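The surrogate-file idea can be sketched as follows. The hashing transformation, clause syntax, and store layout are illustrative assumptions, not Shin and Berra's actual encoding; in particular, their scheme supports unification over the hashed codes, whereas this toy only probes for exact matches.

```python
import hashlib

def surrogate(term):
    """Fixed-width hash code standing in for a term in the surrogate file."""
    return hashlib.md5(term.encode()).hexdigest()[:8]

# Clause heads live in a compact surrogate file; the clause bodies are
# kept in a separate store, as in the architecture described above.
clauses = {
    "parent(tom,bob)": "fact",
    "ancestor(X,Y)": "parent(X,Y)",
}
surrogate_file = {surrogate(head): head for head in clauses}

def lookup(goal):
    """Probe the compact surrogate file first; fetch the body only on a hit."""
    head = surrogate_file.get(surrogate(goal))
    return clauses[head] if head else None
```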

Although logic provides a declarative representation of

knowledge and a powerful database search facility, and has

been used in developing many knowledge based systems, it is

not without its drawbacks. Logic enforces a rigid control

structure and procedural knowledge cannot be efficiently

represented and manipulated. Moreover, relational

database systems or normalized tables have invariably been used as the

structure for representing facts. The data pertaining to

complex objects and the associations among them are not

modelled explicitly and have to be computed by performing time

consuming unification joins.








2.2.2. Production System Machines

Production System is another form of representing and

manipulating knowledge, and is used extensively in the

construction of knowledge-based expert systems. A production

system consists of a set of condition-action rules called the

production memory, and a set of facts called the working

memory. OPS5 is the most often used production system

language. OPS5 employs a forward chaining reasoning strategy

and performs a three-phase cyclic operation. The three phases

are match, conflict resolution, and act. Measurements on

various production systems have shown that the match phase

takes about 90% of the computation time. Forgy developed a

fast sequential matching algorithm called rete [FOR82], in

order to speed up the matching phase. The rete algorithm has

been modified and various multiprocessor architectures based

on the modified rete algorithm have been proposed and

analyzed for executing production systems.
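The three-phase cycle described above can be sketched with a toy forward-chaining interpreter. The rules and the conflict-resolution policy (first match wins) are illustrative assumptions; a real OPS5 engine uses the rete network precisely to avoid rematching the whole rule set on every cycle.

```python
def run(working_memory, rules, max_cycles=10):
    """Toy match / conflict-resolution / act loop over set-valued facts."""
    wm = set(working_memory)
    for _ in range(max_cycles):
        # match: find every rule whose condition holds and whose action is not yet done
        conflict_set = [r for r in rules if r["if"] <= wm and not (r["then"] <= wm)]
        if not conflict_set:
            break
        # conflict resolution: pick one instantiation (here simply the first)
        chosen = conflict_set[0]
        # act: apply its action, changing working memory
        wm |= chosen["then"]
    return wm

rules = [
    {"if": {"gator"}, "then": {"reptile"}},
    {"if": {"reptile"}, "then": {"animal"}},
]
```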

One approach in designing architectures for supporting

production systems has been the use of massively parallel

structures. The DADO [STLO86, STL87] and NON-VON [SHA85] are

massively parallel architectures consisting of thousands of

Processing Elements (PEs) interconnected to form a complete

binary tree. The NON-VON was initially developed for

efficient processing of relational database operations and has

been improved to support knowledge processing as well. The

DADO architecture has been modelled after NON-VON and shares








some architectural features. In NON-VON the granularity of

the PE is small and it executes instructions broadcast by a

control processor synchronously with other PEs. The PEs of

the DADO machine are capable of executing in either SIMD or

MIMD mode. In the MIMD mode, each PE executes instructions in

its own local RAM, independent of other PEs. Speedup mainly

results from storing the fact base and the rule base in a

distributed fashion, and by associatively matching and

updating in parallel. The main disadvantage of these

architectures is the poor utilization of their hardware since

only a small percentage of productions get affected in each

cycle. Various algorithms have been proposed which attempt to

improve the utilization of these architectures.

A coarse grain approach has been taken by the MANJI

[MIY87], the PSM [GUP86, FOR86], and other architectures

[ACH89, BUT88]. The production rules are precompiled into a

modified version of the Rete network [FOR82]. The state of

the fact base is saved in various nodes of the network. The

network represents a data flow graph and the nodes are

evaluated based on the arrival of data tokens. The MANJI is

a special shared memory architecture consisting of tens of

powerful Processing Units (PUs) connected by a simple bus.

The various nodes of the network are statically mapped to

different PUs to obtain the maximal parallelism possible and are

evaluated dynamically in the order of token arrival. The

shared memory has been designed and structured so as to








eliminate reading contention on the bus and to reduce

reading/writing conflicts while accessing the shared memory.

The PSM is a simple shared memory architecture. The nodes of

the rete network are dynamically assigned to different

processors by a scheduler depending on the availability of

the processors. In the approaches taken by these

architectures, due to the precompilation, the dynamic

addition/deletion of rules is difficult to implement.

Recently, multiprocessor architectures based on the

concept of pipelining have been designed and analyzed for

executing production systems. Researchers at the University of

Waterloo have developed a parallel model of processing which

exploits the inherent parallelism in the rete algorithm in

the match phase, in addition to providing a degree of control

over the parallelism available in the conflict resolution and

act phases [OSH87]. A multiprocessor architecture called

MAPPS consisting of homogeneous processing elements connected

in a heterogeneous topology has been designed after studying

the communication requirements imposed by the model. Thus,

the architecture executes the parallel model on a three-stage

processor pipeline and is tuned towards the efficient

processing of a large number of changes in the working memory

in each production system cycle. Another pipeline

architecture is being developed at the University of

Kaiserslautern in West Germany [SCH87]. The overall

architecture is a pipeline of special purpose processing









elements with distributed memory and control. The processing

has been based on the rete algorithm. A special instruction

set has been developed and the processing elements are

designed to execute the instruction set efficiently. This

increases the overall performance of the architecture.

Production systems are very much like logic based systems

with an additional capability of dealing with uncertainty and

explanation facility. The problem of rigid control structure

and low-level data representation is also present in

production systems. In addition, the factual data are

redundantly stored across various nodes of the rete network

and the approach taken by the architectures might not be

efficient and effective when the knowledge base grows.

2.2.3. Semantic Network Machines

Semantic Network is another popular form of representing

knowledge. The declarative knowledge about objects and their

interrelationships is represented in the form of a directed

graph. The nodes of the graph model concepts, data items or

objects, whereas the interrelationships among the objects are

modelled as links interconnecting the nodes. Knowledge

processing involves matching a query graph against the data

network. The matching may be simple or may involve using

general rules of inference. Various parallel architectures

have been designed and simulated in order to speed up the

processing of semantic networks [BIC85, FAH83, SAV67].









A highly parallel SIMD machine, called the Connection

Machine [HIL85], was designed and further implemented by the

Thinking Machines Corporation, for processing semantic

networks. The overall architecture consists of many (64K)

processor/memory nodes interconnected as a hypercube. The

concepts (nodes) of the semantic network are mapped onto the

processor nodes of the architecture, and the interconnection

between the processors represents the relationship between the

corresponding concepts. All processors execute instructions

from a single stream generated by a microcontroller under the

direction of a conventional machine. Another parallel

architecture called the Semantic Network Array Processor

(SNAP) is being studied at the University of Southern

California [MOL85]. A square array of identical processing

cells which are interconnected in the form of a mesh and also

connected to a central controller constitute the SNAP

architecture. The concept of mapping the data semantic

network into an architectural interconnection is the same as

in the Connection Machine. Complex searches and inferences

are performed against the network by initiating the operations

from many nodes simultaneously and by performing associative

searches.

A semantic network machine called the IXM has been

designed and simulated in Japan [FUR87]. IXM consists of an

associative network with a large number of processing elements

connected to it. Marker propagation, set operation, and










association have been identified as basic operations in a

semantic network. The processing elements include associative

memories and are designed to execute the basic operations in

parallel. The associative network consists of a number of

network processors connected in a pyramid shape and the

network processors contain associative memories for supporting

parallel marker propagation. The data network is partitioned

into subnetworks and stored across the processing elements.

The user queries are issued in a semantic network language

called the IXL, and the IXL commands can be interpreted by all

the processing elements. The main disadvantage of these

architectures is that their efficiency is greatly reduced when

the semantic network cannot be directly mapped to the

available main memory of the processing elements.

An asynchronous data flow model of computation has been

proposed by Bic [BIC85] for processing semantic networks. The

model is based on the idea of representing the semantic

network as a dataflow graph in which each node is an active

element capable of accepting, processing, and emitting data

tokens travelling asynchronously along the network arcs.

Complex pattern matching is accomplished by representing the

query in the form of a message token and injecting it into the

selected nodes of the graph. The token is propagated and

matched across various nodes and links. Using this approach,

no centralized control is required, and in addition to the

parallel execution of a given request, multiple requests can









be executed simultaneously. This increases the overall

performance of the system.
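Bic's token-propagation idea can be sketched as follows. The graph, token contents, and worklist scheduling are illustrative assumptions standing in for truly asynchronous, decentralized execution: each node tests the query condition locally and forwards the token along its outgoing arcs.

```python
from collections import deque

def propagate(graph, start_nodes, predicate):
    """Inject a token at each start node; each visited node matches and forwards it."""
    matched, seen = [], set()
    worklist = deque(start_nodes)          # stands in for tokens in flight
    while worklist:
        node = worklist.popleft()
        if node in seen:                   # a node processes a given token once
            continue
        seen.add(node)
        if predicate(node):                # local test of the query condition
            matched.append(node)
        worklist.extend(graph.get(node, []))   # emit the token along outgoing arcs
    return matched

# Invented fragment of a semantic network as an adjacency list.
graph = {"person": ["student", "teacher"], "student": ["grad"], "teacher": []}
```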

Semantic networks have been used for representing

knowledge in domains with a relatively small number of

objects. When the size of the semantic network grows, the

data will have to be stored in the secondary storage. During

processing, the data will have to be constantly staged in and

out of the secondary storage and the efficiency of the

architectures mentioned above will drop drastically. In our

opinion, this problem can be tackled by increasing the

granularity of the representation from the concept level to a

class level. Many objects behaving similarly can be grouped

under the same class and the techniques developed can be

applied at the class level.

The research presented in this dissertation deals with

the efficient processing of deductive queries against OO

knowledge bases with large sets of rules and complex data.

The rules are structured and integrated into the OO data.

During processing, the structure facilitates focusing on

the desired set of data and rules from the large knowledge

base. In addition to the exploitation of OR parallelism, the

various objects and/or relationships among the objects are

derived in parallel and a distributed controlling mechanism is

implemented.













CHAPTER 3
REPRESENTATION AND QUERYING OF OBJECT-ORIENTED DATABASES


The limitations of record-oriented data models in

capturing the complex structural relationships and the

behavioral properties of objects in advanced application

domains such as CAD/CAM have long been observed. Several

Object-Oriented (OO) semantic models have been developed to

alleviate the limitations of record-oriented data models

[HAM81, BAT85, HUL87, SU89]. The OO semantic models provide

a rich variety of modeling constructs, which simplifies the

task of modeling complex data. The main features of an OO

data model are as follows:

(i) They support the unique identification of objects by

system-assigned object identifiers,

(ii) They allow the encapsulation of data and operations on

the data,

(iii) They support abstract data typing and allow complex

objects to be defined in the form of aggregation hierarchies,

(iv) They allow the definition of generalization hierarchies

(or lattices) and the inheritance of structural and behavioral

properties among object classes in hierarchies.

In Section 3.1, we first present the OO view of databases

and illustrate the concept of a subdatabase, which is a










structure for representing and processing OO data. Further in

Section 3.2, we illustrate the advantages of an OO query model

which is closed under the representation of subdatabases and

present the operations and the philosophy of processing based

on an OO Query Language (OQL) which maintains the closure

property [ALA89a, ALA89b]. Subsequently in Section 3.3, we

illustrate with examples the various complexities of queries

and their representation in OQL. We will also present with

examples the features of a rule-based language (with OQL

constructs) for processing deductive queries against OO

databases [ALA90]. The query processing algorithms presented

in this dissertation are based on OQL.

3.1. Object-Oriented View of Databases


The OO view of an application world is represented in the

form of a network of object classes and associations (links)

between these classes. We shall illustrate the concepts of OO

data representation using an example University database

modelled by the OO Semantic Association Model (OSAM*) [SU89].

Although the OSAM* data model is used here, the data mapping and

the query processing techniques presented in this dissertation

are applicable to other OO data models.

The University schema is shown in Figure 3.1. Using an

OO data model, objects within an application domain are

uniquely identified by system-assigned Object Identifiers

(OIDs) and objects with similar structures and behaviors are








grouped together into classes. The rectangular boxes in

Figure 3.1 depict various classes of objects in the university

domain. The interrelationships among these classes are

represented by various types of associations which

characterize the retrieval and storage operation behaviors on

their objects. Two of the widely recognized association types

are shown in Figure 3.1., namely, Generalization (G), and

Aggregation (A). An Aggregation association between two

classes represents an attribute which can be visualized as a

function that maps an object of one class to that of another.

For example, as shown in Figure 3.1, the objects of a class

Section are described by their sections, textbooks, rooms,

Students, Teachers (of the Section), and the Courses (to which

they belong). The circles represent Domain classes from which

the objects draw values of their descriptive attributes. The

superclass-subclass relationship is specified by a

Generalization association between two classes. For example,

in the figure, Student and Teacher are subclasses of the class

Person and inherit all the properties of the Person class.

Similarly Grad and Undergrad are subclasses of Student and TA

and RA are subclasses of Grad forming a Generalization

hierarchy. It should be noted that the objects of a subclass

are a subset of the objects belonging to the superclass.

Hence, an object plays different roles in the various classes

of the Generalization hierarchy. In order to distinguish the

different roles of the same object we assign a unique Instance








Identifier (IID) to the individual instances of the object in

the various classes. Objects in classes with no

Generalization relationships associated with them play a single

role and each object has one instance identifier. In the

figure, similar associations are grouped together and labelled

by A (for aggregation) and G (for generalization). The

various types of associations are treated consistently during

the search process. They have different retrieval and storage

operational behaviors. Objects of a class can be associated

with objects of more than one class and a graphical view of an

OO database schema is represented by a network of interrelated

object classes. A detailed description of the OSAM* model can

be found in [SU89].

In the processing presented in this dissertation, the

structure for the representation and processing of an OO

database is a subdatabase. A subdatabase is a part of the

original database and is represented at the intensional and

the extensional level as an intensional association pattern

and a set of extensional association patterns respectively.

Figure 3.2 shows an example subdatabase of the original

database shown in Figure 3.1. The intensional association

pattern of a subdatabase is represented by a network of Object

classes and their associations as shown in Figure 3.2(a) which

consists of classes Teacher, Section and Course and their

associations. An extensional association pattern is a network

of object instances and their associations that belong to








the classes and association types of the intensional

association pattern. The set of extensional patterns of a

subdatabase can be represented in the form of an extensional

diagram. Figure 3.2(b) shows a possible extensional diagram

of the example subdatabase. The t's, se's, and c's represent

the unique Instance Identifiers (IIDs) of the objects of

classes Teacher, Section and Course respectively. The

interconnection of t3 and se4 in the figure is an example of

an extensional pattern, which records the fact that object

instance t3 of class Teacher is associated with object

instance se4 of class Section.
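The intensional and extensional patterns just described can be sketched as simple data structures. The IIDs and the adjacency encoding below are illustrative assumptions, not the dissertation's storage format; the t3 -- se4 link from Figure 3.2(b) appears among the extensional patterns.

```python
# Intensional association pattern: classes and their associations.
intensional_pattern = {
    "Teacher": ["Section"],     # Teacher objects associate with Section objects
    "Section": ["Course"],      # Section objects associate with Course objects
}

# Extensional association patterns: links between instance identifiers.
extensional_patterns = [
    ("t1", "se1"), ("t3", "se4"),   # Teacher-Section links
    ("se1", "c1"), ("se4", "c2"),   # Section-Course links
]

def sections_of(teacher_iid):
    """All Section instances associated with a given Teacher instance."""
    return [b for a, b in extensional_patterns
            if a == teacher_iid and b.startswith("se")]
```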

Each extensional pattern of a subdatabase can be

classified as having one of the several extensional pattern

types. An extensional pattern type is a common template that

is shared by several extensional association patterns in a

subdatabase. An extensional pattern type consists of a

connected subset of the object classes in the

intensional pattern of the subdatabase. For example, the

patterns of the subdatabase of Figure 3.2(b) belong to one of

the extensional pattern types shown in Figure

3.2(c). The extensional pattern type connecting classes

Teacher, Section, and Course has as instances all the

extensional patterns that connect the object instances of the

classes Teacher, Section, and Course. The extensional

patterns that connect only the object instances of classes

Teacher and Section belong to the extensional pattern type









connecting classes Teacher and Section. Similarly, the

instances of the extensional pattern type connecting classes

Section and Course can be explained.

3.2. A Closed Model of Query Processing
for Object-Oriented Databases


A "closed query model" can be defined as a model of query processing in which the structure of the output of a query is represented using the same data model with which the input of the query is structured. A closed model of query processing has several advantages. Since the result of a query is modeled by the same data model, it can be operated on uniformly by another query, using the operators of the same query language, to produce a new result. Also, the result of a query can be saved as a view definition and manipulated uniformly like the original database. An Object-Oriented Query Language (OQL) [ALA89a, ALA89b], designed at the University of Florida, maintains the closure property for processing OO databases. The query operates on one or more subdatabases and produces a new subdatabase.

The philosophy of the processing based on the OQL is to first identify the desired subdatabase and subsequently perform a set of specified operation(s) on the object instances of the identified subdatabase. The search engine of the database management system establishes the desired subdatabase and then performs the operation(s). Thus, a query block in OQL consists of two clauses, namely, a Context clause









and an Operation clause. The Context clause has two optional

subclauses: a Where subclause and a Select subclause. The

structure is as shown below.

Context association pattern expression
Where conditions
Select object classes and/or attributes

Operation(s) object class(es)

The Context clause specifies the desired subdatabase by specifying the following in its association pattern expression: (i) the intensional pattern, (ii) the set of extensional pattern types, and (iii) the intraclass conditions, over the descriptive attributes of the object instances of various classes, that qualify the object instances. The interclass conditions are specified in the Where subclause, and the desired descriptive attributes of the object instances of the various object classes are specified in the Select subclause.

A set of operations for the various classes of the subdatabase is specified in the Operations clause of the query. An operation can be either a system-defined data manipulation operation (e.g., Display, Update, Print) or a user-defined operation (e.g., Rotate, Order-part, Hire-employee).

The operators that can be used in the association pattern

expression of the Context clause are the association operator

and the nonassociation operator.
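For illustration only, the two-clause query block described above can be represented as a small data structure. The field names below are ours, not OQL's, and the example query is adapted from the queries later in this section.

```python
# A minimal sketch, assuming a flat textual representation of each clause;
# this is not the dissertation's internal query representation.
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class QueryBlock:
    context: str                       # association pattern expression
    where: Optional[str] = None        # optional interclass conditions
    select: list = field(default_factory=list)      # attributes to project
    operations: list = field(default_factory=list)  # e.g. ["Retrieve"]

q = QueryBlock(
    context="Department [college = 'Engineering'] * Course * Section",
    select=["course#", "name", "section#"],
    operations=["Retrieve"],
)
print(q.operations)
```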

3.2.1. Association Operator

When the association operator (*) is applied to two directly associated classes A and B in a database (i.e., the









expression A * B), it returns a subdatabase whose intensional

pattern consists of the two classes A and B and their

association. The resulting subdatabase also contains the set

of extensional patterns drawn from the operand database such

that each extensional pattern contains objects of both A and

B. B objects that are not associated with any A objects and

A objects that are not associated with any B objects in the

operand database are not retained in the resulting

subdatabase. The definition of the association operator can

be easily generalized to the case when the association pattern

expression contains more than two classes.
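The semantics of the two-class case can be sketched as follows. This is our own illustration, not the dissertation's engine; the mapping from A-instance IIDs to sets of associated B-instance IIDs is an assumed storage layout.

```python
# A sketch of the association operator (*) over two directly associated
# classes A and B.  Instances of either class with no partner drop out.

def associate(a_iids, b_iids, assoc):
    """Return the (A-instance, B-instance) extensional patterns in which
    objects of BOTH classes participate."""
    return [(a, b) for a in a_iids for b in assoc.get(a, set()) if b in b_iids]

teachers = {"t1", "t2", "t3"}
sections = {"se1", "se4", "se9"}
assoc = {"t1": {"se1"}, "t3": {"se4"}}   # t2 teaches nothing; se9 untaught
print(sorted(associate(teachers, sections, assoc)))
# [('t1', 'se1'), ('t3', 'se4')]
```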

3.2.2. Nonassociation Operator

An exclamation sign (!) is used to denote this operator.

When this operator is applied to two directly associated

classes A and B in a schema (i.e., the expression "A ! B"), it

returns a subdatabase which contains only the instances of A

that are not associated with any instances of B and the

instances of B that are not associated with any instances of

A.
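The complementary semantics of the nonassociation operator can be sketched in the same assumed data layout as the association sketch above; again, this is illustrative code of ours, not OQL internals.

```python
# A sketch of the nonassociation operator (!): it keeps exactly the
# instances of each class that are associated with NO instance of the
# other class.

def nonassociate(a_iids, b_iids, assoc):
    """assoc maps A-instance IIDs to sets of associated B-instance IIDs."""
    associated_b = set().union(*assoc.values()) if assoc else set()
    lone_a = {a for a in a_iids if not assoc.get(a)}
    lone_b = b_iids - associated_b
    return lone_a, lone_b

students = {"s1", "s2"}
departments = {"d1", "d2"}
majors = {"s1": {"d1"}}          # s2 has not chosen a major; d2 has no majors
print(nonassociate(students, departments, majors))  # ({'s2'}, {'d2'})
```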

3.3. Query Examples


The association operator has a higher precedence than the

nonassociation operator. However, the precedence can be

overridden by using parentheses. Various complexities of

association patterns can be specified using the association

and the nonassociation operators among the classes of the







association pattern. We illustrate the various complexities

of the association patterns with example queries. The queries

are described for the schema shown in Figure 3.1. The English

language description and the OQL representation are specified

for each example query.

3.3.1. Noncyclic Association Pattern

In this section we consider queries whose association

patterns do not form cycles. We classify noncyclic

association patterns into two types, namely, linear association

patterns and branching association patterns.

3.3.1.1. Linear association pattern

This is the simplest form of the structure of the

association pattern. The various classes specified in the

association pattern are related in a linear string. The

following query 1, query 2, and query 3 are examples of queries with a linear intensional pattern.

Query 1: For all the Courses with course#'s greater than C600, and being offered by the Departments in the college of Engineering, and having currently offered Sections with section#'s either less than S250 or greater than S550, retrieve the course#'s of the Courses, the name of the Department offering the Courses, and the section#'s of the Sections.



Context Department [college = 'Engineering'] *
Course [course# >6000] *
Section[section# < S250] OR [section# > S550]
Retrieve Course [course#], name, section#









The association pattern of the desired subdatabase in this

query is a linear string of object classes Department, Course,

and Section. The classes of the association pattern are

associated with the association operator (*). The query

specifies a retrieval operation on the resulting subdatabase.

The descriptive data values of the course#, name, and section#

attributes are to be aggregated under the selected objects of

class Course.
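The evaluation of such a linear pattern can be sketched by chaining the two-class association step class by class. The code below is our illustration under the same assumed IID-mapping layout used earlier, not the dissertation's algorithm.

```python
# A sketch: evaluate a linear association pattern such as
# Department * Course * Section by repeated association steps.

def chain(first_iids, assoc_maps):
    """assoc_maps holds one IID mapping per '*' in the linear pattern.
    Returns the extensional patterns (tuples of IIDs) spanning every class."""
    patterns = [(a,) for a in first_iids]
    for assoc in assoc_maps:
        patterns = [p + (nxt,) for p in patterns
                    for nxt in assoc.get(p[-1], set())]
    return patterns

dept_to_course = {"d1": {"c1", "c2"}}
course_to_section = {"c1": {"se1"}}       # c2 has no section, so it drops out
print(chain({"d1"}, [dept_to_course, course_to_section]))
# [('d1', 'c1', 'se1')]
```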

Query 2: For all the currently offered Sections with enrolled Students who have not decided on a majoring Department, obtain the names of the Departments, and also the section#'s of the Sections and the classification of the enrolled Students.

Context Section * Student ! Department

Retrieve Department [name];
         Section [section#], classification

The association pattern of the desired subdatabase is a linear

pattern of object classes Section, Student, and Department.

The classes are related using both the association and the

nonassociation operator. It should be noted that the

association operator has precedence over the nonassociation

operator. Two separate relations are to be retrieved. The

first relation contains the names of all the Department

objects in the resulting subdatabase. The second relation

contains the section#'s of the selected Section objects and

the classification of the selected Students aggregated under

the objects of class Section.









Query 3: For all the Students with no majoring Department, and enrolled in currently offered Sections, obtain the section#'s of the Sections and the classification of the Students. Also obtain the names of the Departments with no majoring Students.

Context Section * (Student ! Department)

Retrieve Department [name], Section [section#],
         classification

The association pattern of the resulting subdatabase consists

of classes Section, Student, and Department as in query 2

above. Also, the structure and the operators among the object

classes are the same as in query 2. However, the precedence

of the association operation over the nonassociation operation

has been overridden by the use of parentheses. The structure

of the result desired is also similar to that of query 2.

3.3.1.2. Branching association pattern

An association pattern expression may contain branches expressed by an AND or an OR operator. The following query 4 and query 5 are examples of queries with a branching intensional pattern.

Query 4: For all the currently offered Sections taught by a

Teacher with a Ph.D. degree, and enrolled by Students who are

Graduate Students, and of Courses being offered by the 'CIS'

Department, retrieve the section#'s of all the Sections, the degree of all the related Teachers, and the course#'s of all the related Courses.










Context Teacher [degree = 'Ph.D.'] * Section AND
(Course * Department [name = 'CIS'],
Student * Grad)

Retrieve Section [section#], degree, course#

The association pattern of the desired subdatabase in this

query is a branching pattern of object classes Teacher,

Section, Course, Department, Student, and Grad. The object

class Section, at which the branching occurs is called the

fork class. An AND operator is specified between the branches

of the fork class. An AND operator means that in the

result, an instance from the fork class must be associated

with instances from all the classes related with the forking

branches. The section#'s of all the selected Sections, the degree of the related Teachers, and the course#'s of the

related Courses are to be retrieved from the resulting

subdatabase.

Query 5: For all the currently offered Sections taught by a Teacher with a Ph.D. degree, and either enrolled by Students who are Graduate Students, or of Courses being offered by the 'CIS' Department, obtain the section#'s of all the Sections, the degree of the related Teachers, and the course#'s of the related Courses.

Context Teacher [degree = 'Ph.D.'] * Section OR
(Course * Department [name = 'CIS'],
Student * Grad)

Retrieve Section [section#], degree, course#

The association pattern of the desired subdatabase in this

query is also a branching pattern of classes Teacher, Section,








Course, Department, Student, and Grad. Also, similar to query

4, the object class Section is the fork class. However, an OR

operator is specified between the branches of the fork class.

An OR operator means that, in the result, an instance from the

fork class must be associated with an instance from at least

one of the two related branching classes. The structure of

the desired result is the same as in query 4.
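The AND/OR semantics at a fork class can be sketched as a simple filter. This is our own illustration; the per-branch "hit sets" of fork-class instances are an assumed intermediate representation, not the dissertation's.

```python
# A sketch of AND vs OR branching at a fork class.  Each branch contributes
# the set of fork-class instances that have an associated partner along it.

def fork_filter(instances, branch_hits, mode):
    """AND keeps instances present in every branch's hit set; OR keeps
    those present in at least one."""
    if mode == "AND":
        keep = set.intersection(*branch_hits)
    else:
        keep = set.union(*branch_hits)
    return instances & keep

sections = {"se1", "se2", "se3"}
course_branch = {"se1", "se2"}     # sections whose Course branch matched
student_branch = {"se2", "se3"}    # sections whose Student branch matched
print(fork_filter(sections, [course_branch, student_branch], "AND"))  # {'se2'}
print(sorted(fork_filter(sections, [course_branch, student_branch], "OR")))
```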

3.3.2. Cyclic Association Pattern

The association pattern can also contain cycles. The

following is an example of a query with a cyclic association

pattern.

Query 6: For all the Sections being taught by a Teacher with a 'Ph.D.' degree, and belonging to Courses being offered by the Department which the Students (who are currently enrolled in those Sections) major in, retrieve the textbook of the Sections and the course#'s of the related Courses.

Context Teacher [degree = 'Ph.D.'] * Section AND
(Course * Department, Student) AND *
Grad
Retrieve Section [textbook], course#

The association pattern of the desired subdatabase in this

query consists of object classes Teacher, Section, Course,

Student, and Department. The object classes are associated

with the association operator. Moreover, branching occurs at

object classes Section and Department which are the fork

classes. An AND operator is specified between the branches of

both the forking classes, and a cyclic association pattern is

formed. The textbook of the selected Sections and the course#








of the related Courses are to be aggregated under the Section

objects in the result.

3.3.3. Deductive Queries

New subdatabases can be derived from other existing or

derived subdatabases. A derived subdatabase is called the

target subdatabase and the subdatabases used to derive it are

called the source subdatabases. The process of derivation is

captured by the derivation rules. A derivation rule has an

IF-THEN structure as follows:

IF Context association pattern expression
Where conditions

THEN subdatabase-id (association pattern expression)

The Context clause and its optional Where subclause are the same as described in Section 3.2 above. The subdatabase-id in the THEN clause is a unique name to be given to the

derived subdatabase. The intensional pattern of the derived

subdatabase consists of a subset of the classes referenced in

the association pattern expression of the IF clause. Other

unreferenced classes will not be retained in the derived

subdatabase. The extensional patterns of the new subdatabase

are derived from the extensional patterns that satisfy the

conditions of the IF clause and its Where subclause. The

following is an example of a deductive rule.

Rule 1: New relationships establishing the fact that good

quality Teachers are teaching good Students taking high level

Courses can be established, if the Teachers who have a Ph.D.

degree are teaching the Sections, in which the Students with









GPA higher than 3.5 are enrolled, and these sections belong to

Courses having course# greater than 6000.


IF Context Teacher [degree = 'Ph.D.'] * Section
AND (Student [GPA > 3.5],
Course [course# > 6000])

THEN good (Teacher * Student * Course)


This rule, when executed against the database of Figure 3.1, returns a subdatabase whose set of extensional patterns is of the type (Teacher, Student, Course). It should be noted that

the relationships in the new subdatabase are derived and are

not present in the original database. Also, the objects of

class Section are not retained in the new subdatabase because

the object class Section is not referenced in the association

pattern expression of the THEN clause.

Once the deductive rule(s) that derive a new subdatabase

are defined, the classes of the derived subdatabase can be

referenced in association pattern expressions in any OQL query

in the normal way. For example, the following query

references the classes in the subdatabase defined in the THEN

clause of the rule above.

Query 9: For all the good Students majoring in the college

of Engineering and enrolled in high level courses, retrieve

the title of the Courses, the GPA of the enrolled Students,

and the name of their majoring Department.

Context Department [college = 'Engineering'] * good:Student *
        good:Course

Retrieve Course [title], GPA, name








The association pattern specified in this query references an

association (between Student and Course) which is not

explicitly stored in the original database. However, the

association can be derived from the original database by

executing the derivation rule specified above. The execution

of this query would trigger the execution of the rule. Once

the rule derives the desired association, the query can be

executed to establish the database specified in the Context

clause of the query.

The execution of the rule may itself trigger other rules

for deriving source database(s) of the rule and an inference

chain will be established. Also, more than one rule can

derive the extensional patterns of the same subdatabase. When

more than one rule for the same subdatabase is specified, all

the rules are executed and a union of the extensional patterns

derived by the individual rules is considered for further

processing of the derived subdatabase.
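The IF-THEN derivation, and the union taken when several rules derive the same subdatabase, can be sketched as follows. This is our illustration only; for brevity it flattens attribute values into the same dictionary as the instance IIDs, which is not the dissertation's storage layout.

```python
# A sketch of deductive derivation with rule union over extensional patterns.

def derive(source_patterns, condition, keep_classes):
    """Apply one IF-THEN rule: keep patterns satisfying the IF condition,
    projected onto the classes referenced in the THEN clause."""
    return {tuple(p[c] for c in keep_classes)
            for p in source_patterns if condition(p)}

patterns = [
    {"Teacher": "t1", "Student": "s1", "Course": "c1", "GPA": 3.8, "course#": 6100},
    {"Teacher": "t2", "Student": "s2", "Course": "c2", "GPA": 2.9, "course#": 6200},
]
rule1 = derive(patterns, lambda p: p["GPA"] > 3.5 and p["course#"] > 6000,
               ("Teacher", "Student", "Course"))
rule2 = derive(patterns, lambda p: p["GPA"] > 2.5,
               ("Teacher", "Student", "Course"))
good = rule1 | rule2   # multiple rules for one subdatabase: union of patterns
print(sorted(good))
```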











Figure 3.1 A University Schema














(a) The Intensional Pattern of a Subdatabase:
    Teacher -- Section -- Course

(b) An Extensional Diagram of the Subdatabase

(c) Extensional Pattern Types of the Subdatabase:
    Teacher -- Section -- Course
    Teacher -- Section
    Section -- Course

Figure 3.2 The Specification of a Subdatabase














CHAPTER 4
PARALLEL ARCHITECTURAL MODEL AND DATA ORGANIZATION



In this chapter, we discuss the desired features of a

parallel architecture for the efficient implementation of

large Object-Oriented databases. Further, we illustrate

techniques for partitioning the large sets of complex data and

organizing them across the nodes of the parallel architecture.

The main objective behind the data partitioning techniques is

to reduce the overall query execution time. In Section 4.1 we

present the parallel architectural model and in Section 4.2 we

discuss the data organization. The data partitioning and

mapping techniques presented in this chapter have been

experimentally analyzed and the results are presented in

Chapter 7.

4.1. Parallel Architectural Model



Querying on large and complex Object-Oriented databases

involves retrieving and manipulating data about various object

classes. The number of object instances in each class, the

amount of data about each object instance, and the

associativity among the individual object instances can be

enormous in large databases. The data has to be stored across

several secondary storage devices. Moreover, data about the









selected classes of object instances have to be interrelated

based on the explicitly captured associations. During

processing, large quantities of data have to be retrieved from

several secondary storage devices and transferred among the

processing nodes of the system.

Shared memory architectures are not suited for this type

of processing since, at high data rates, memory contention

drastically reduces performance. Message passing systems are

a promising alternative provided (a) the processing nodes have

sufficient processing power and storage capability to store

and process the large sets of data, and (b) the bandwidth of

the interconnection network is suitable to handle the

communication among the processing nodes.

Figure 4.1 shows the model of a parallel system

considered in our study. It consists of a set of processing

nodes, each containing a processing unit, main memory

elements, and several secondary storage devices. The

processing nodes are interconnected by a regularly and

homogeneously connected interconnection network. Since data

retrieval is one of the dominant factors in database

processing, parallel I/O at each node improves the retrieval

parallelism. In a regularly connected system, each processing

node is directly connected to the same number of other

processing nodes. A homogeneous system has topologically

identical processing nodes and the connection structure at

each node repeats in a regular fashion. The topological









similarity and the regularity among the interconnection

components at each processing node reduce the development costs, particularly for a significant number of units. Also,

the configuration can be easily expanded when the processing

demand increases.

The maximum delay between any two processing nodes in the

system varies with the exact topology of the system. The

overall bandwidth of the network varies with the degree of

connectivity of the nodes in the system. The data

partitioning and mapping algorithms presented in the next

section and the query processing algorithms presented in

Chapter 5 and Chapter 6 are not dependent on the topology and

can be executed with varying performances on different system

topologies.

The database is partitioned and stored across the various

secondary storage devices of the processing nodes in the

system. As can be seen in Figure 4.1, the user is interfaced

by one or more host processors, which are connected to the

processing system. The user issues queries at one of the host

processors. The query is compiled into a set of messages and

transferred to the relevant processing nodes in the system.

The processing nodes retrieve and manipulate the pertinent

data from their secondary storage devices. In addition, the

processing nodes pass data among each other during the course

of query processing. Finally, the result is transferred to

the host processor for presentation to the user.









4.2. Partitioning and Mapping of Data


The physical organization of the data across the

processing nodes of the system plays an important role in

determining the overall execution time of a query. A data

organization scheme can improve the query execution time in a

variety of ways. Firstly, in a multicomputer system, the data

can be accessed by a processing unit faster from its local

secondary storage devices than from remote devices. Hence,

reduction in data retrieval time can be accomplished by

placing similarly accessed data together across the secondary

storage devices of either a single processing node or across

a set of closely connected processing nodes. Secondly, by

organizing the different data segments in such a manner as

to balance the processing load among the cooperating

processing nodes, a reduction in the query execution time can

be accomplished. Moreover, resource utilization can also be

improved by load balancing. Thirdly, when multiple processors

are used cooperatively to answer a query, data communication

among the processors can potentially account for a significant

portion of the query execution time. By intelligently mapping

the data segments across the processing nodes, the average

number of hops taken by the data while travelling from the

sending processor to the receiving processor can be reduced.

This in turn reduces the overall communication costs during

query processing.









In this section, we present a methodology for organizing

the complex data of large Object-Oriented databases across the

processing nodes of the parallel model architecture presented

in Section 4.1. The methodology is presented with an

illustrative example. As a first step, based on the knowledge

of the database schema, data clusters are formed and the

pattern of communication among the data clusters is

determined. A data cluster consists of the descriptive and

the associative data about either all or a subset of the

object instances of an individual object class. In the

initial clustering phase, a data cluster consists of data

about all the object instances of an object class. The

computation cost associated with each data cluster, and the

cost of data communication from each data cluster to other

associated data clusters is estimated based on the data

characteristics.

Subsequently, depending on the total number of processing

nodes in the system and the total amount of data in various

data clusters, the data are organized to form groups of data

clusters. The number of groups is equal to the number of

processing nodes in the system, and the groups are formed such

that the computation load associated with each group is nearly

the same. During this load balancing phase, a data cluster

with a large amount of data is partitioned to create new data clusters, each with a relatively small amount of data. In

addition to load balancing, the grouping is performed so as to








allow the parallel processing of the queries with minimal

increase in communication overheads. Finally, the groups of

data clusters are mapped or assigned to the processing nodes

of the system. The groups are mapped such that the

communication costs among the processing nodes is reduced

during query processing.
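The grouping step outlined above can be sketched with a standard greedy heuristic: repeatedly assign the heaviest remaining cluster to the currently lightest group. This particular heuristic is our assumption for illustration; the dissertation's exact grouping procedure may differ.

```python
# A sketch of load-balanced grouping: form as many groups as processing
# nodes, keeping the computation load per group nearly equal.
import heapq

def balance(cluster_sizes, n_nodes):
    groups = [(0, i, []) for i in range(n_nodes)]   # (load, id, members)
    heapq.heapify(groups)
    # Heaviest cluster first, always onto the lightest group.
    for name, size in sorted(cluster_sizes.items(), key=lambda kv: -kv[1]):
        load, i, members = heapq.heappop(groups)
        members.append(name)
        heapq.heappush(groups, (load + size, i, members))
    return [(load, members) for load, _, members in sorted(groups)]

sizes = {"C1": 40, "C2": 35, "C3": 20, "C4": 15, "C5": 10}  # megabytes
for load, members in balance(sizes, 2):
    print(load, members)
```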

4.2.1. Data Clustering

During query processing, the desired data about all the

object instances of a referenced object class are retrieved

and processed similarly. Thus, all the data pertaining to an

object class can be clustered and stored together in order to

improve localization. We therefore define a data cluster as

containing all the descriptive and the associative data about

either a subset or all the object instances of a single object

class. Figure 4.2(a) and Figure 4.3(a) show example data

clusters, in the form of nonnormalized relations, pertaining

to the object classes Section and Teacher, respectively, of the schema shown in Figure 3.1. The network data of the database

can be partitioned as multiple nonnormalized relations for the

individual classes of the database. It should be noted that

nonassociation of an object with object(s) from other classes

is not stored as null values. The relationship itself is not

stored. During query processing the relationships that are

present are used in computing the desired subdatabase. As can

be seen from Figure 3.1, the object instances of class Section

are described by their section#, textbook, and room#, and are








associated with the object instances of object classes

Teacher, Student, and Course. The Section IID in the first

column of the relation in Figure 4.2(a) represents the

instance identifiers of the object instances of the Section

class. The second, third, and fourth columns of Figure 4.2(a)

represent the values of the section#, textbook, and the room#

attributes respectively of the object instances of the object

class Section. The relationships among the object instances

of class Section and the object instances of classes Teacher,

Student, and Course are captured and explicitly represented in

the fifth, sixth, and seventh columns respectively of Figure

4.2(a). The population of values in the data cluster of Figure 4.3(a) is explained similarly. Moreover, any

specified operations on the object instances of an object

class are stored along with the declarative data of the object

class.

Further, in order to improve the retrieval parallelism,

we vertically partition the nonnormalized relations of the

individual object classes into binary relations. Figure

4.2(b) and Figure 4.3(b) represent the vertical partitions of

the nonnormalized relation of Figure 4.2(a) and Figure 4.3(a)

respectively. During query processing, values of a certain

specified subset of the attributes of the object instances of

a class are desired. Also, the associations among the object

instances of a class with the object instances of a subset of

the related classes are manipulated during the course of query








processing. By vertically partitioning the data and storing

them separately, specific partitions can be retrieved and the

retrieval of unnecessary data can be avoided. Also, different

vertical partitions can be retrieved in parallel thereby

improving the retrieval parallelism. The scheme of vertically

partitioning the data is similar to the one proposed for

relational systems [VAL87].
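The vertical partitioning just described can be sketched directly: each binary relation pairs the instance identifier with one attribute or one association column. The following is our own illustration with invented field values, not the relations of Figure 4.2.

```python
# A sketch of vertically partitioning a nonnormalized relation for one
# class into binary relations; the IID is replicated in each partition.

section_relation = [
    {"IID": "se1", "section#": "S101", "textbook": "Knuth", "Teacher": {"t1"}},
    {"IID": "se4", "section#": "S550", "textbook": "Aho",   "Teacher": {"t3"}},
]

def vertical_partition(relation, columns):
    """One binary relation per column, so specific partitions can be
    retrieved independently (and in parallel) at query time."""
    return {col: [(row["IID"], row[col]) for row in relation]
            for col in columns}

parts = vertical_partition(section_relation, ["section#", "textbook", "Teacher"])
print(parts["section#"])   # [('se1', 'S101'), ('se4', 'S550')]
```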

The data clustering and vertical partitioning scheme

proposed above improves query execution time by localizing

retrievals and reducing the amount of unnecessary data

retrieved. However, the total amount of data stored is

increased. The relationship data between the object instances

of two associated classes is replicated in the nonnormalized

relations of both the related classes. For example, as can be

observed from Figure 4.2(a) and Figure 4.3(a), the

relationships between the object instances of classes Teacher

and Section are replicated in the data clusters of both the

classes. Also, as can be observed from Figure 4.2(b) and

Figure 4.3(b), the instance identifiers of the object

instances of a class are replicated in all the vertical

partitions of the class.

The clustering scheme creates a data cluster for each

object class. In a steady state, under the assumption that

queries involving various object classes and relationships

among object classes have equal probability of occurrence, the

data retrieval and processing time associated with a data








cluster is proportional to the amount of data in that cluster.

Also, as will be evident from the description of the query

processing algorithms in Chapter 5 and Chapter 6, data from a

cluster is related with the data in other related data

cluster(s). The amount of data communicated from a sending

cluster to the receiving cluster is proportional to the number

of object instances in the sending cluster and the average

number of object instances of the receiving cluster which are

associated with each object instance of the sending cluster.

We represent the computation costs of a data cluster in terms

of the total number of bytes of data in it. The cost of

communication from a sending data cluster to a receiving data

cluster is represented in terms of the number of object

instances transferred from the sending data cluster to the

receiving data cluster. The steady state computation and

communication costs are represented as a computation-

communication graph. As stated above, the costs of

computation and communication are computed in this

dissertation for an identical frequency of queries referencing

the various parts of the database. Nevertheless, the same

methodology can be used for computing costs when the frequency

of queries referencing different segments of the database

varies and is known. A computation-communication graph is a

directed and weighted graph. Each vertex of the graph

represents either a single data cluster or a group of data

clusters. A directed edge from an originating vertex to the









directed vertex depicts the direction of data communication

from the data cluster (or group of data clusters) represented

by the originating vertex to the data cluster (or group of

data clusters) represented by the directed vertex. The weight

of a vertex represents the total computation cost associated

with it and the weight of the directed edge represents the

communication cost of sending data from the originating vertex

to the directed vertex.

We illustrate the process of determining the computation-

communication graph with an example database. Figure 4.4

shows the schema of the example database. The values of

various parameters characterizing the database are shown in

Table 4.1 and Table 4.2. The same example database will be

used to illustrate the subsequent phases of the data

organization methodology. For simplicity, the size of the

values of the descriptive attributes in the example database

is assumed to be 10 bytes. Also, the size of the total amount

of stored data for each object instance of all the object

classes is assumed to be 200 bytes. The derived computation-

communication graph for the example database is shown in

Figure 4.5. The vertices of the graph represent the

computation associated with various data clusters in the

database. It should be noted that at the end of the initial

clustering phase the data about individual object classes is

clustered together and the number of vertices of the graph

equal the number of object classes in the database. The data









clusters are represented as Ci's. The number within each

vertex represents the total size of the data within each

cluster in megabytes. The directed edges of the graph

represent the direction of communication of data among the

related clusters. The number alongside each edge represents

the total number of object instances, in thousands,

communicated along the edge.

The following formulae are used in computing the

computation and communication costs:

Let the number of object instances in the object class c be N-

Objcts(c).

Let the number of descriptive attributes of object class c be

N-Desc-Attrs(c).

Let the set of classes associated with the object class c be

Assoc-Classes-Set(c).

Let the size of the value of the descriptive attribute a of an

object class c in bytes be Size-Desc-Attr(c,a).

Let the average number of object instances of object class c

associated with each object instance of class j be Avg-

Conn(c,j).

Let the size of the instance identifier in bytes be Size-Id.

The size of the total amount of data stored for each object

instance of an object class c is









Data-Per-Objct(c) = Σ [i = 1 TO N-Desc-Attrs(c)] (Size-Id +

Size-Desc-Attr(c,i)) +

Σ [∀ j ∈ Assoc-Classes-Set(c)] (Size-Id +

Size-Id * Avg-Conn(c,j)).

As was mentioned earlier, the data about an object instance

are vertically partitioned, and each partition contains the

instance identifier and the attribute value or an instance

identifier and the instance identifiers of the related class.

Thus, the size of the total amount of data in a cluster

containing the data about all the object instances of an

object class c is

Data-Per-Class(c) = Data-Per-Objct(c) * N-Objcts(c)

Data-Per-Class(c) is also the computation cost associated with

the data cluster pertaining to object class c.

The total number of object instances transferred from the data

cluster pertaining to object c to the associated data cluster

pertaining to object class j is

N-IID-Xfer(c,j) = N-Objcts(c) * Avg-Conn(c,j)

N-IID-Xfer(c,j) is also the communication cost associated with

the directed edge originating from the cluster pertaining to

object class c, and pointing to the cluster pertaining to

object class j. Since a data cluster contains data about a

single object class, we will interchangeably use the above

definitions for an object class and a data cluster pertaining

to the object class.
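These cost formulae can be sketched in Python. The identifier size, attribute sizes, and connectivity figures below are illustrative assumptions, not values from the example database:

```python
# Sketch of the storage and communication cost formulae above.
# All class statistics used here are illustrative assumptions.

SIZE_ID = 8  # Size-Id: size of an instance identifier in bytes (assumed)

def data_per_objct(desc_attr_sizes, avg_conn):
    """Data-Per-Objct(c): bytes stored per object instance of a class.

    desc_attr_sizes: list of Size-Desc-Attr(c, a) values in bytes.
    avg_conn: dict mapping each associated class j to Avg-Conn(c, j).
    """
    attrs = sum(SIZE_ID + s for s in desc_attr_sizes)
    assocs = sum(SIZE_ID + SIZE_ID * conn for conn in avg_conn.values())
    return attrs + assocs

def data_per_class(n_objcts, desc_attr_sizes, avg_conn):
    """Data-Per-Class(c): total bytes in the class's cluster,
    which is also its computation cost."""
    return data_per_objct(desc_attr_sizes, avg_conn) * n_objcts

def n_iid_xfer(n_objcts_c, avg_conn_c_j):
    """N-IID-Xfer(c, j): instances sent from cluster c to cluster j."""
    return n_objcts_c * avg_conn_c_j

# Example: a class with 1,000 instances, two 4-byte descriptive
# attributes, and on average 3 associated instances of one related class.
print(data_per_objct([4, 4], {"OC2": 3}))        # (8+4)+(8+4)+(8+8*3) = 56
print(data_per_class(1000, [4, 4], {"OC2": 3}))  # 56000
print(n_iid_xfer(1000, 3))                       # 3000
```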










4.2.2. Load Balancing

By storing all the data within a data cluster across the

SSD(s) of a single processing node, data localization can be

improved. Also, by storing the various data clusters across

different processing nodes the data in individual data

clusters can be accessed in parallel during query processing.

However, the number of object instances and the size of the data about each object instance vary with the object class, and hence so does the amount of data in each cluster. Because of the varying amount of data in different clusters, the cooperating processors will take varying amounts of data retrieval and processing time. Also, the number of

processing nodes in the system can be different from the

number of object classes in the database. Query execution

time can be reduced by balancing the data retrieval and

processing across the processing nodes of the system.

One possible method of load balancing is to horizontally

partition each cluster equally among the available processing

nodes of the system. However, this balancing scheme restricts

the amount of processing parallelism and increases the

communication costs during query processing. Using the above

partitioning scheme, different horizontal data segments about

an object class will be processed in parallel by all the

processing nodes of the system. However, the data has to be

sequentially related from one object class to another and the

desired subdatabase has to be established in repeated cycles








of forward and backward propagation depending on the

complexity of the query. As will be evident from the

description of the parallel query processing algorithms in

Chapter 5 and Chapter 6, a query can be processed in parallel

by processing data simultaneously from various object classes

referenced in the query. The parallel processing algorithms

eliminate the complexities involved in sequentially relating

data from one object class to another. Also, using the above

partitioning scheme, the data about all the object classes is

distributed among the processing nodes. At every processing

step, data from each processing node has to be replicated and

transferred to all other processing nodes. This in turn

increases the overall communication costs and consequently

the query processing time.

We balance the data clusters among the processing nodes

of the system by horizontally partitioning the data of those

clusters having large amounts of data, and by grouping

together clusters having relatively small amounts of data.

The optimal amount of data per processing node for balanced

data retrieval is estimated and the reorganization is

performed in two steps. During the first step, data clusters

with data more than the optimal value are partitioned into new

clusters with data less than or equal to the optimal value.

During the second step, original and new clusters having data

less than the optimal value are grouped to create several

groups of data clusters. The groups are created such that the









combined data in each group is as close as possible to the optimal value. At the end of the load balancing phase, the number of groups equals the number of processing nodes in the system. The data

is reorganized so as to allow the parallel processing of the

query with minimal increase in communication overheads.

The following formulae are used in determining the

optimal amount of data in each group of data cluster(s).

Let the number of object classes in the database be N-Classes.

Thus, the total amount of data stored in all the clusters of

the database is

Total-Data = Σ [c = 1 TO N-Classes] (Data-Per-Class(c))

Let the number of processors in the system be N-Prcs.

The desired size of the data per processor after load

balancing is

Data-Per-Proc = Round (Total-Data / N-Prcs).

The Data-Per-Proc is the optimal amount of data desired

in each group of cluster(s). The computed values of the total data size and the desired data per processor for the example database of Figure 4.4 are shown in Table 4.2.
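As a small sketch (with invented per-class data sizes in megabytes, not those of Table 4.2), the desired data per processor might be computed as:

```python
def data_per_proc(data_per_class, n_prcs):
    """Data-Per-Proc: round the total cluster data evenly over processors.

    data_per_class: dict class -> Data-Per-Class(c) in Mbytes
    (illustrative values only).
    """
    return round(sum(data_per_class.values()) / n_prcs)

print(data_per_proc({"OC1": 6.0, "OC2": 2.5, "OC3": 3.5}, 4))  # 12/4 = 3
```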

4.2.2.1. Phase I partitioning of clusters

During this phase, the clusters having data more than the

desired data per processor are horizontally partitioned. The

following presents the formulae and the algorithmic steps of

the partitioning phase. The partitioning of the data clusters

of the example database is also illustrated.








The number of object instances of class c, such that the data

about that number of object instances equals the desired data

per processor is

Optimum-N-Objcts(c) = Truncate (Data-Per-Proc /

Data-Per-Objct(c)).
Let the number of data clusters at any instant of the

partitioning phase be N-Clusters.

It should be noted that at the beginning of the partitioning phase the number of clusters equals the number of classes. Let each individual cluster be denoted by a unique integer from 1 to N-Clusters.

For i = 1 To N-Classes
If (N-Objcts(i) > Optimum-N-Objcts(i)), then
partition the data cluster pertaining to class i into N-Part(i) clusters. The partitioning is performed such that (N-Part(i) - 1) clusters each contain all the data about Optimum-N-Objcts(i) object instances of the class i and the last cluster contains all the data about (N-Objcts(i) - (N-Part(i) - 1) * Optimum-N-Objcts(i)) instances.
N-Part(i) = Ceiling (N-Objcts(i) / Optimum-N-Objcts(i))
The total number of data clusters is increased appropriately.
N-Clusters = N-Clusters + (N-Part(i) - 1)
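A minimal Python sketch of this partitioning step; the class names and instance counts below are invented for illustration:

```python
import math

def phase1_partition(n_objcts, optimum_n_objcts):
    """Split each over-full class into clusters of at most the optimal
    instance count; returns a dict {(class, partition): instances}.

    n_objcts: dict class -> N-Objcts(c).
    optimum_n_objcts: dict class -> Optimum-N-Objcts(c).
    Both inputs are illustrative assumptions.
    """
    clusters = {}
    for c, n in n_objcts.items():
        opt = optimum_n_objcts[c]
        if n > opt:
            n_part = math.ceil(n / opt)  # N-Part(c)
            for j in range(1, n_part):   # first (N-Part - 1) full clusters
                clusters[(c, j)] = opt
            clusters[(c, n_part)] = n - (n_part - 1) * opt  # remainder
        else:
            clusters[(c, 1)] = n  # cluster left intact
    return clusters

# Class OC1 exceeds its optimum and is split into two clusters.
print(phase1_partition({"OC1": 9000, "OC2": 4000},
                       {"OC1": 5000, "OC2": 5000}))
# {('OC1', 1): 5000, ('OC1', 2): 4000, ('OC2', 1): 4000}
```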










Table 4.3 shows the cluster names and total data in each

cluster at the end of the partitioning phase. A cluster name

of C(i,j) in the table refers to the cluster belonging to the

jth partition of class C(i). The optimal data per cluster is

2.2 Mbytes. It should be observed that the cluster C(1) had more than the optimal amount of data and was split into two clusters, namely C(1,1) and C(1,2).

4.2.2.2. Phase II grouping of clusters

During this phase, all the clusters, among the clusters

at the end of the partitioning phase, whose data are less than

the desired data per processor are organized to create groups

of clusters each with the desired amount of data. Also, the

clusters which have the desired data per processor are

organized as groups of one cluster each. It should be noted

that at the end of the grouping phase, the number of groups

equals the number of processing nodes. At each step of the

grouping process, an estimation is made about the

communication costs that would be incurred during processing

when two potential clusters are grouped. Among the possible

grouping choices, the clusters that incur minimal

communication costs are grouped. The following illustrates

the grouping process. The grouping of the data clusters of

Table 4.3 is also shown as an illustrative example.

Let the set of cluster groups that contain the desired size of

the data at any instant be Optimum-Group-Set.








After the grouping of clusters, a cluster group may contain

clusters from more than one object class.

Let the number of object instances of an object class c in a

cluster group G(i) be N-Objcts(c,G(i)).

Let the set of object classes to which the data in a cluster

group G(i) belongs be Class-Set(G(i)).

Let the function returning the class of a cluster C be

Class(C).

Step 1: Identify all those data clusters, resulting from the

partitioning phase, that contain the data about the optimal

number of object instances of the relevant class. Assign the

identified clusters to the Optimal Group Set. The following

pseudo code illustrates the step.

i = 0
Optimum-Group-Set = { }
For C = 1 To N-Clusters
If (N-Objcts(C) = Optimum-N-Objcts(Class(C))), then
i = i + 1
G(i) = {C}
Optimum-Group-Set = Optimum-Group-Set + G(i)

The Optimum-Group-Set identified from the clusters of Table 4.3 is
Optimum-Group-Set = { G(1) = {C1,1}, G(2) = {C1,2}, G(3) = {C4}, G(4) = {C5}, G(5) = {C7}, G(6) = {C11} }.
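One possible Python reading of this step; the cluster names, counts, and the cluster-to-class function below are invented for illustration:

```python
def initial_optimum_groups(clusters, optimum_n, cls):
    """Step 1: every cluster that already holds the optimal instance
    count for its class becomes a singleton group.

    clusters: dict cluster name -> instance count.
    optimum_n: dict class -> Optimum-N-Objcts(c).
    cls: function returning the class of a cluster (Class(C)).
    """
    return [{c} for c, n in clusters.items() if n == optimum_n[cls(c)]]

clusters = {"C1,1": 5000, "C1,2": 4000, "C4": 5000}
opt = {"OC1": 5000, "OC4": 5000}
cls = lambda c: "OC" + c[1]  # illustrative: "C4" -> "OC4"
print(initial_optimum_groups(clusters, opt, cls))
# [{'C1,1'}, {'C4'}]
```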








Step 2: For all the clusters not in the Optimum Group Set,

assess the communication cost associated with the cluster and

identify the cluster with the minimal communication cost.

During processing, data from a cluster of an object class

are related to the data from a cluster of another object class

that is associated in the schema. As can be observed from the

description of the algorithms in Chapter 5 and Chapter 6, the

intensity of communication for relating object instances from

a cluster of an originating class to a cluster of the related

class is proportional to the number of the object instances in

the originating cluster belonging to the originating class and

the average number of object instances of the related class

associated with each instance of the originating class.

The amount of data transmitted from a data cluster C to

another associated data cluster J is proportional to

Data-Trans(C,J) = N-Objcts(C) * Avg-Conn(Class(C), Class(J))

It should be noted that Data-Trans(C,J) is not equal to Data-Trans(J,C). Also, in order to increase the flow of data, the processors storing the clusters are bidirectionally connected.

Thus, the communication cost due to the transmission of data between two clusters C and J is

Comm-Cost(C,J) = Maximum(Data-Trans(C,J), Data-Trans(J,C))

The total communication cost associated with the data cluster

C due to the transmission of data among all the clusters

related to C is

Comm-Cost(C) = Σ [∀ I ∈ Assoc-Classes-Set(C)] Comm-Cost(C,I)
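Assuming bidirectional links of equal bandwidth, the two cost formulae can be sketched as follows; the transfer volumes used in the example are invented:

```python
def comm_cost_pair(data_trans, c, j):
    """Comm-Cost(C, J): the larger of the two directed transfer volumes
    between clusters C and J (links are bidirectional)."""
    return max(data_trans.get((c, j), 0), data_trans.get((j, c), 0))

def comm_cost(data_trans, c, assoc):
    """Comm-Cost(C): total over all clusters associated with C."""
    return sum(comm_cost_pair(data_trans, c, i) for i in assoc)

# Illustrative transfer volumes (in object instances):
dt = {("C10", "C6"): 2, ("C6", "C10"): 5, ("C10", "C8"): 3}
print(comm_cost_pair(dt, "C10", "C6"))     # max(2, 5) = 5
print(comm_cost(dt, "C10", ["C6", "C8"]))  # 5 + 3 = 8
```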









66

The above mentioned formulae are used in computing the

communication costs of all the relevant clusters.

Subsequently, the cluster with the minimum communication cost

is identified. Table 4.4 shows the communication cost

associated with the clusters of Table 4.3 which could not be

assigned to the Optimal Group Set. As can be observed,

cluster C10 has the lowest communication cost.

Step 3: Estimate the cost of grouping the cluster with the

minimum communication cost (obtained from step 2), say C-Min,

with each of the clusters not assigned to the optimum cluster

set, and determine the cluster with minimum cost of grouping.

Two cases arise depending on the combined data size after

grouping the two clusters. When the combined data size is

less than the desired data per processor after balancing, the

clusters can be grouped in their entirety. However, when the

combined data size is more than the desired data per processor

after balancing, the cluster being grouped is partitioned into

two new clusters, and C-Min is grouped with one of the new

clusters. The grouping cluster is partitioned such that the

combined data of the C-Min and one of the new clusters equals

the desired data per processor. The following pseudo code

illustrates the step.

Let the total size of the data in a cluster group G(i) be

Data(G(i)).

The size of the data in the cluster group G(i) is








Data(G(i)) = Σ [∀ c ∈ Class-Set(G(i))] (N-Objcts(c,G(i)) * Data-Per-Objct(c))
The amount of data transmitted from the cluster group

containing the two clusters I and J to the cluster K is

proportional to

Data-Trans({I,J},K) = Data-Trans(I,K) + Data-Trans(J,K).
Similarly, the amount of data transmitted from the cluster K

to the cluster group having the clusters I and J is

proportional to

Data-Trans(K,{I,J}) = Data-Trans(K,I) +

Data-Trans(K,J).
Let the cost of grouping two clusters I and J be Group-

Cost(I,J).

For every cluster (except the one with the minimum

communication cost), say J, which is not assigned to the

Optimum Group Set, the following steps are executed and the

cluster with the minimum grouping cost is obtained.

Case 1: If (Data({C-Min, J}) ≤ Data-Per-Proc), then
Grouping-Cost(C-Min, J) = Σ [(C = 1 TO N-Clusters) & (C ≠ C-Min) & (C ≠ J)]
Maximum (Data-Trans({C-Min, J}, C), Data-Trans(C, {C-Min, J}))
Case 2: If (Data({C-Min, J}) > Data-Per-Proc), then
The data cluster J is partitioned into two clusters, say J1 and J2, such that
N-Objcts(J1) = Truncate ((Data-Per-Proc - Data(C-Min)) / Data-Per-Objct(J))
and N-Objcts(J2) = (N-Objcts(J) - N-Objcts(J1)).

The increased communication cost due to the partitioning of

the cluster J into clusters Jl and J2 is

Split-Cost(J) = (Comm-Cost(J1) + Comm-Cost(J2) - Comm-Cost(J))
Hence, the overall grouping cost in this step is the sum of the cost of grouping the cluster C-Min with J1, and the cost of partitioning the cluster J into clusters J1 and J2.

Grouping-Cost(C-Min, J1) = Σ [(K = 1 TO N-Clusters) & (K ≠ C-Min) & (K ≠ J1)]
Maximum (Data-Trans({C-Min, J1}, K), Data-Trans(K, {C-Min, J1})) + Split-Cost(J)
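The Case 1 grouping cost can be sketched in Python; the cluster names and transfer volumes below are invented, and the group is passed in as a list of member clusters:

```python
def data_trans_group(data_trans, group, k):
    """Data-Trans({...}, K): sum of the members' volumes sent to K."""
    return sum(data_trans.get((i, k), 0) for i in group)

def grouping_cost(data_trans, group, others):
    """Case 1 Grouping-Cost: for each remaining cluster K, take the
    larger directed flow between the candidate group and K, then sum."""
    total = 0
    for k in others:
        outgoing = data_trans_group(data_trans, group, k)
        incoming = sum(data_trans.get((k, i), 0) for i in group)
        total += max(outgoing, incoming)
    return total

# Grouping C-Min = "C10" with J = "C2"; "C3" is the only other cluster.
dt = {("C10", "C3"): 4, ("C3", "C2"): 6}
print(grouping_cost(dt, ["C10", "C2"], ["C3"]))  # max(4, 6) = 6
```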

Step 4: Group the two clusters C-Min and the cluster with the

minimum cost of grouping (obtained from step 3), say C-Merge,

and if the combined data in the new group equals the desired

data per processor then add the new group to the Optimum Group

Set. Repeat the process of determining the cluster, among the

clusters not in the Optimum Group Set, with minimum

communication cost and grouping it with other clusters.

However, if the combined data in the new group is less than

the optimal data then repeat the process of adding other

clusters to the new group. Terminate the process of grouping

when the number of groups in the optimal group set equals the










number of processors. The following pseudo code illustrates

the step.

Let the new merged cluster group be G(new).

If C-Merge is one of the original clusters and was not created due to the partitioning of an existing cluster in Step 3 above, then N-Clusters = N-Clusters - 1
If (Data({C-Min, C-Merge}) = Data-Per-Proc), then
Optimum-Group-Set = Optimum-Group-Set + G(new)
If (N-Clusters ≠ N-Prcs), then
If (Data({C-Min, C-Merge}) = Data-Per-Proc), then
Go to Step 2
If (Data({C-Min, C-Merge}) < Data-Per-Proc), then
Go to Step 3

Let us consider the grouping of the clusters, which are

not in the Optimal-Group-Set, of Table 4.3. Table 4.4 shows

the relevant information about the clusters with less than the

optimal data per processor. The name of the cluster, the

total data in each cluster, the names of the clusters

associated with each cluster, and the communication cost

associated with each cluster are shown in Table 4.4. The communication costs are represented as the number of object instances communicated among the clusters and are computed

based on the formulae shown in step 2 above. As can be seen

from the table, cluster C10 has minimal communication cost

associated with it and hence is grouped first. Table 4.5

shows the cost of grouping C10 with other clusters having less









than optimal data. Table 4.5 also shows the cost of splitting

clusters wherever appropriate. For example, clusters C2 and

C3 cannot be grouped in their entirety and have to be

partitioned. The grouping cost includes the cost of

partitioning. The costs are estimated based on the formulae

shown in Step 3 above. As can be seen from Table 4.5, among

the possible grouping choices, the grouping of a partition of

the cluster C2 with the cluster C10 adds the least

communication cost. Cluster C2 is partitioned into two

clusters C2,1 and C2,2. Cluster C2,1 contains data about

5,000 object instances of the object class OC2, and the

cluster C2,2 contains data about 4,000 object instances of the

object class OC2. The cluster C10 is grouped with the cluster C2,1. The combined data of the two clusters equals the

optimal data desired per cluster and the new cluster is

assigned to the Optimal-Group-Set. The new Optimal-Group-Set

is

Optimal-Cluster-Set = { G(1) = {C1,l), G(2) = (Cl,2), G(3) =

(C4), G(4) = (C5), G(5) = {C7}, G(6) = (Cll), G(7) = (C10,

C2,1) ).

Since the number of clusters in the optimal group set

does not equal the number of processors, the grouping process

is continued. Table 4.6 shows the names of the clusters which

are not in the Optimal-Group-Set. The total size of each

cluster, the set of associated clusters, and the communication

cost associated with each cluster are also shown in the table.









The communication costs are computed based on the formulae

shown in step 2 above. As can be seen from Table 4.6, the

cluster C2,2 has minimum communication cost associated with it

and is grouped next. Table 4.7 shows the cost of grouping the

cluster C2,2 with other clusters with less than optimal data.

Since no cluster involves partitioning during the grouping

process, the splitting cost is zero for all the clusters. The

grouping costs are estimated based on the formulae shown in

step 3 above. As can be seen from the Table 4.7, the grouping

of the cluster C2,2 with the cluster C6 adds the least

communication cost and hence the clusters are grouped. The

combined data of the cluster group containing the two clusters

does not equal the optimal data per group and hence the new

group is not assigned to the optimal group set and is grouped

further.

Table 4.8 shows the cost of grouping the cluster group {C2,2, C6} with the remaining clusters not in the

Optimal-Group-Set. The split cost for clusters which require

partitioning of the cluster before grouping is also shown in

the table. The costs are computed based on the formulae shown

in step 3 above. As can be seen from Table 4.8, the grouping

of cluster C8 with the cluster group {C2,2, C6} adds the least

communication cost and hence the clusters are grouped. Also,

the combined data of the new group of clusters equals the

optimal data desired per processor. Similarly, the remaining two clusters, C3 and C9, are combined to create a new group of








clusters with optimal data in it. The final set is Optimal-Group-Set = { G(1) = {C1,1}, G(2) = {C1,2}, G(3) = {C4}, G(4) = {C5}, G(5) = {C7}, G(6) = {C11}, G(7) = {C10, C2,1}, G(8) = {C2,2, C6, C8}, G(9) = {C3, C9} }. Figure

4.6 shows the final computation-communication graph. It

should be noted that each node of the graph contains the same

amount of data and has the same retrieval and processing time

associated with it. The clusters forming each cluster group

of the computation-communication graph, and the total megabytes of data in each of the clusters, are shown inside the circles representing the cluster groups. A unique vertex number is also assigned to each cluster group. The numbers alongside the edges of the graph represent the total number of object instances, in thousands, communicated among the communicating

cluster groups.

4.2.3. Mapping of Cluster Groups Onto Processors

The load balancing phase creates data cluster groups with

nearly equal amounts of data. Also, the number of data cluster groups equals the number of processing nodes in the

system. By mapping one cluster group per processor, the data

can be evenly distributed in the system. During processing,

data from one cluster group is related to data from other cluster group(s) that contain the data pertaining to the associated class(es) of the class set of the original cluster

group. The pattern and the intensity of data communication

among the cluster groups is irregular in nature. The








computation-communication structure resembles a weighted

irregular directed graph. The nodes in the graph will

represent the time for the retrieval and manipulation of the

data of the individual cluster groups. Since the data among

the cluster groups was balanced in the previous phase, all the

nodes of the graph will have the same time associated with

them. A directed arc from an originating node to the directed

node in the graph will represent the communication of data

from the originating cluster group to the directed cluster

group. The weight associated with the arc will represent the

amount of data transmitted from the originating cluster group

to the directed cluster group.

An optimal data placement of irregularly communicating

data cluster groups across the processing nodes of a parallel

system requires the processing nodes to be fully connected.

However, due to cost and other technical considerations

processing nodes cannot be fully connected and are usually

connected in a regular fashion. Mapping of cluster groups

with irregular communication patterns among them onto a set

of regularly connected processing nodes with the objective of

optimally minimizing the overall communication costs is

similar to the optimal mapping of the irregular computation-

communication graph onto a regular graph of processing nodes.

The latter mapping has been shown in the literature to be NP-complete [LO85, GAR79].








It is necessary to develop appropriate application-specific heuristic methods to obtain suboptimal mappings.

Researchers in the past have taken different approaches in

obtaining mapping of problem graphs on parallel architectures

for various applications [BOK81, SAD87, BOK88]. Recently,

Baru [BAR90] has investigated the mapping of ER schemas onto

hypercube multiprocessors. The algorithms developed by him

map semantically related nodes of the schema graph onto

adjacent subcubes of the hypercube architecture. The results

obtained by him are of theoretical interest. Nevertheless,

they cannot be practically used within our framework. This is

because, in order to maintain the adjacency, a very large

number of processing nodes (compared to the number of data

cluster groups) will be required, and proper utilization of

hardware resources cannot be guaranteed.

We have developed a heuristic algorithm that maps an

irregular computation-communication graph onto a set of

regularly connected processing nodes, where the number of

nodes in the graph equals the number of processing nodes. The

heuristic algorithm maps the cluster groups in such a fashion

as to reduce the average communication time among any two

communicating cluster groups. An estimation is made about the

communication cost of the individual cluster groups and a

mapping priority is established among the various cluster

groups based on the estimated communication cost.

Subsequently, the cluster groups are spirally mapped to the








processing nodes of the network. The mapping is guided by the

obtained priority. We have analyzed the performance of the

two basic search strategies, namely, the depth-first and the

breadth-first, for ordering the mapping of the cluster groups

of the computation-communication graph. The following

illustrates the mapping technique. The mapping of the

computation-communication graph of Figure 4.6 across a torus

connected set of processing nodes is also shown as an example.

Let the computation-communication graph be G(C) = (V(C), E(C),

W(C)).
V(C) is a set of vertices representing the time for retrieving

and processing the data pertaining to the various data cluster

groups obtained after the load balancing phase above.

E(C) ⊆ V(C) × V(C) is a set of directed edges (V(Ci), V(Cj)), where (V(Ci), V(Cj) ∈ V(C)), originating from V(Ci) and ending at V(Cj). The edge (V(Ci), V(Cj)) represents the communication of data from the data cluster group represented by V(Ci) to the data cluster group represented by V(Cj). Also, if (V(Ci), V(Cj)) ∈ E(C), then (V(Cj), V(Ci)) ∈ E(C).

W(C) is a set of weights associated with each of the directed

edges of the set of edges E(C). A weight W(i,j) associated

with a directed edge (V(Ci), V(Cj)) represents the intensity

of data communicated from the cluster group represented by

V(Ci) to the cluster group represented by V(Cj). It should be

noted that W(j,i) could be different from W(i,j).

Let the processor graph be G(P) = (V(P), E(P)).








V(P) is the set of vertices representing the processing nodes

in the parallel processing system.

E(P) ⊆ V(P) × V(P) is a set of directed edges (V(Pk), V(Pl)), where ((V(Pk), V(Pl)) ∈ E(P)), originating at V(Pk) and ending at V(Pl). The edge (V(Pk), V(Pl)) represents the communication link between the processors V(Pk) and V(Pl). It should be noted that in a homogeneous system all the communication links have similar data bandwidth. Also, if (V(Pk), V(Pl)) ∈ E(P), then (V(Pl), V(Pk)) ∈ E(P).
The mapping M: V(C) → V(P) is one-to-one and is such that the

average communication delay among any two processing nodes

V(Pk) and V(Pl) mapping the cluster groups V(Ci) and V(Cj)

(i.e. M(V(Ci)) = V(Pk), and M(V(Cj)) = V(Pl)) is minimized.

The average communication delay among the processing nodes is

Avg-Comm-Delay = (Sum-Max-Comm-Delay)/(Sum-Max-Weights)

Sum-Max-Comm-Delay is the sum of the maximum

communication delay among all pairs of processors

corresponding to the pairs of communicating cluster groups,

and Sum-Max-Weights is the sum of the maximal weights among

all pairs of communicating cluster groups.

Sum-Max-Comm-Delay = Σ [∀ (V(Ci), V(Cj)) AND (V(Cj), V(Ci)) ∈ E(C)]
(Maximum(W(i,j), W(j,i)) * D(P)(M(V(Ci)), M(V(Cj))))
D(P)(V(Pk), V(Pl)) is the shortest distance from V(Pk) to

V(Pl) in number of hops.









Sum-Max-Weights = Σ [∀ (V(Ci), V(Cj)) AND (V(Cj), V(Ci)) ∈ E(C)] (Maximum(W(i,j), W(j,i)))
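The delay objective can be sketched in Python; the edge weights, mapping, and hop-distance function below are illustrative assumptions:

```python
def avg_comm_delay(weights, mapping, dist):
    """Avg-Comm-Delay over all pairs of communicating cluster groups.

    weights: dict (i, j) -> W(i, j) for directed edges (assumed input).
    mapping: dict vertex -> processor, i.e. M(V(Ci)) = V(Pk).
    dist: function giving the hop distance D(P) between two processors.
    """
    pairs = {frozenset(e) for e in weights}  # each pair counted once
    num = den = 0
    for pair in pairs:
        i, j = tuple(pair)
        w = max(weights.get((i, j), 0), weights.get((j, i), 0))
        num += w * dist(mapping[i], mapping[j])  # Sum-Max-Comm-Delay term
        den += w                                 # Sum-Max-Weights term
    return num / den

# Two cluster groups mapped to adjacent nodes of a 4-processor ring:
ring_dist = lambda a, b: min(abs(a - b), 4 - abs(a - b))
w = {("V1", "V2"): 6, ("V2", "V1"): 4}
print(avg_comm_delay(w, {"V1": 0, "V2": 1}, ring_dist))  # 6*1/6 = 1.0
```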

The maximal communication cost along either direction

among all pairs of communicating cluster groups is considered.

This is because the links connecting the corresponding

processing nodes have the same bandwidth, and the data with

the maximal size flowing among the two cluster groups in

either direction imposes the greater demand on the

communication links. The following is an algorithmic

description of the heuristic technique. The first step of the

algorithm estimates the communication cost associated with the

vertices of the computation-communication graph and assigns

mapping priorities to the vertices. The second step of the

algorithm maps the individual vertices of the computation-

communication graph onto the vertices of the processor graph.

Step 1: Estimate the communication cost associated with each

vertex of the graph G(C) and assign mapping priorities to the

vertices.

Step 1.1: Estimate the weights associated with each edge of

the computation-communication graph.

The weight W(i,j) associated with the flow of data from

the cluster group represented by the vertex V(Ci) to the

cluster group represented by the vertex V(Cj) is proportional

to Data-Trans(V(Ci) ,V(Cj)). The formula for the estimation of

Data-Trans is shown in the second step of the grouping phase

of Section 4.2.2.











Step 1.2: Estimate the communication cost associated with

each vertex of the computation-communication graph.

As stated above, the communication cost associated with

two communicating vertices V(Ci) and V(Cj) is

Comm-Cost(V(Ci),V(Cj)) = Maximum (Data-Trans(V(Ci),V(Cj)),

Data-Trans(V(Cj),V(Ci)))
Let the communication cost associated with the vertex V(Ci)

due to the communication of data with other connected vertices

be Comm-Cost(V(Ci)).

Comm-Cost(V(Ci)) = E [V (V(Ci), V(Ck)) e E(C)]

Comm-Cost(V(Ci),V(Ck)).
Step 1.3: Sort the vertices of the graph G(C) in the

descending order of their communication cost.

Different sorting algorithms can be used with varying

complexities to perform the sorting. The following pseudo

code illustrates the sorting procedure using one of the

simplest sorting algorithms, namely, the bubble sort.

Let the assignment priority of the vertices of the

computation-communication graph be stored in the array named

Priority. Initially, the priority among the vertices of the

computation-communication graph is arbitrarily assigned.

For i = 1 To Number of vertices in V(C)

Priority[i] = V(Ci)

For i = 1 To (Number of vertices in V(C) - 1)

For j = (i + 1) To Number of vertices in V(C)

If (Comm-Cost(Priority[i]) < Comm-Cost(Priority[j])), then








Swap (Priority[i], Priority[j])
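In Python the priority assignment reduces to a descending sort, a simple stand-in for the bubble sort above; the vertex costs used in the example are invented:

```python
def assign_priorities(comm_cost):
    """Order vertices by descending communication cost.

    comm_cost: dict vertex -> Comm-Cost(V(Ci)) (illustrative values).
    Returns the Priority array as a list, highest-cost vertex first.
    """
    return sorted(comm_cost, key=comm_cost.get, reverse=True)

print(assign_priorities({"V1": 10, "V2": 25, "V3": 17}))
# ['V2', 'V3', 'V1']
```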

The weights associated with the individual edges of the

example computation-communication graph are shown in Figure

4.6. The communication cost, in thousands of object instances

communicated, of the individual vertices of the example

computation-communication graph are tabulated in Table 4.9.

The vertices are sorted and presented in the order of their

assignment priorities.

Step 2: Map the vertices of the vertex set V(C) onto the

vertices of the vertex set V(P) using the priority established

in step 1.

Let the set of vertices of V(C), mapped to the vertices of

V(P), at any instant of the mapping process be Assigned-

Set(C).

Let the set of vertices of V(P) that have been assigned the

vertices of V(C) be Assigned-Set(P).

Let the vertex of V(C) currently being mapped to a vertex of

V(P) at any instant be Current-Vertex(C).

Let the vertex of V(P) currently being assigned the vertex of

V(C) be Current-Vertex(P).

Initially, Assigned-Set(C) = Nil.

Assigned-Set(P) = Nil.

Step 2.1: Assign the vertex in V(C) with the highest priority

onto any vertex of V(P).

Since the vertices of V(P) are regularly connected any

vertex can be chosen for the initial assignment. The vertex,








among the vertices of V(C), with the highest priority has the

maximum communication associated with it and is mapped first.

The possibility of mapping the connected vertices, of the

computation-communication graph, across the processor vertices

that are closely connected is higher at the initial stages of

the mapping process. The vertices of the computation-

communication graph with higher communication requirements are

mapped before the vertices with lower communication

requirements. This enables the reduction in the average

communication delay among communicating data clusters.

Current-Vertex(C) = Priority[1].

Current-Vertex(P) = V(P1).

M(Current-Vertex(C)) = Current-Vertex(P).

Assigned-Set(C) = Assigned-Set(C) + Current-Vertex(C).

Assigned-Set(P) = Assigned-Set(P) + Current-Vertex(P).

Step 2.2: Determine the next vertex of the graph G(C) to be

mapped.

Rooting at the vertex with the highest communication

cost, the subsequent vertices of the graph G(C), to be

mapped, are determined by searching the other connected

vertices in the graph. We have analyzed the performance of

the two basic search techniques, namely, the depth-first, and

the breadth-first. The search techniques are described below.

4.2.3.1. Depth-first search of the graph G(C)

Using this search technique, the vertices of the

computation-communication graph G(C) are navigated in the









depth-first fashion rooting at the vertex with the highest

communication cost. Starting from the Current-Vertex(C), the

last vertex of G(C) that was mapped, the next vertex is

determined by first navigating the immediate connected

vertices of the Current-Vertex(C). Among the immediate

connected vertices, the one with the maximum communication

associated with it is chosen. If all the immediate connected

vertices of the Current-vertex(C) are already mapped, then the

current vertex is backtracked to the immediate ancestor of the

Current-vertex(C) in the depth-first spanning tree of G(C) and

the ancestor's immediate connected vertices are navigated and

analyzed. The backtracking to the ancestors is recursively

performed until an unmapped vertex is found.

Let the set of vertices connected to the Current-Vertex(C) and

not yet assigned be Connected-Set(C).

Let the vertex, among the vertices in the Connected-Set(C),

with the highest assignment priority be High-Pri-Vertex(C).

Let PARENT be a function that returns the immediate ancestor

vertex of any vertex in the depth-first spanning tree of the

graph G(C).

The following pseudo code illustrates the process.

Initially, Connected-Set(C) = Nil.

Found = FALSE

REPEAT

(∀ (Current-Vertex(C), V(Cj)) ∈ E(C))

If (V(Cj) ∉ Assigned-Set(C)), then










Connected-Set(C) = Connected-Set(C) + V(Cj)

If (Connected-Set(C) = Nil), then

Current-Vertex(C) = PARENT(Current-Vertex(C))

Else

Found = TRUE

UNTIL Found

i = 1

Found = FALSE

REPEAT

i= i+ 1

If (Priority[i] e Connected-Set(C)), then

Found = True

UNTIL Found.

High-Pri-Vertex(C) = Priority[i]

Current-Vertex(C) = High-Pri-Vertex(C).

Figure 4.7(a) shows the depth-first mapping tree of the computation-communication graph shown in Figure 4.6. Vertex V(C5) has the highest mapping priority and is mapped first. Among the immediately connected vertices of V(C5), the vertex V(C4) has the highest mapping priority and is mapped next. Vertex V(C7) is mapped after the mapping of V(C4) since it has the highest mapping priority among the unmapped immediately connected vertices of V(C4). Subsequently, vertices V(C3), V(C8), and V(C9) are mapped. Since the immediately connected vertex of V(C9) is already mapped, the search is backtracked and the connected vertices of V(C8) are searched. The connected vertices of V(C8) are also mapped, and the search is backtracked to the unmapped connected vertices of V(C3). Among the unmapped connected vertices of V(C3), the vertex V(C1) has the highest mapping priority and is mapped next. Since all the connected vertices of V(C1) are mapped, the search is again backtracked to the connected vertices of V(C3), and the unmapped vertex with the highest mapping priority, namely, vertex V(C2), is mapped next. Subsequently, vertex V(C6) is mapped.
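The depth-first selection described above can be sketched in executable form. The following Python sketch is not part of the dissertation: the graph is represented as an adjacency dictionary, `comm_cost` stands in for the assignment priorities, the vertex names are hypothetical, and a connected graph is assumed.

```python
def dfs_mapping_order(adj, comm_cost, root):
    """Order in which vertices of a connected computation-communication
    graph are selected by the depth-first strategy: from the current
    vertex, pick the unmapped neighbour with the highest communication
    cost; if every neighbour is mapped, backtrack to the parent in the
    depth-first spanning tree and retry."""
    order = [root]
    assigned = {root}
    parent = {root: None}
    current = root
    while len(assigned) < len(adj):
        unmapped = [v for v in adj[current] if v not in assigned]
        if unmapped:
            # highest assignment priority = highest communication cost
            nxt = max(unmapped, key=lambda v: comm_cost[v])
            parent[nxt] = current
            assigned.add(nxt)
            order.append(nxt)
            current = nxt
        else:
            # all neighbours mapped: backtrack to the immediate ancestor
            current = parent[current]
    return order
```

For a toy four-vertex graph rooted at its highest-cost vertex, the routine first follows the costlier neighbour, then backtracks through the root to reach the remaining branch.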

4.2.3.2. Breadth-first search of the graph G(C)

Using this mapping strategy, the vertices of G(C) are navigated and mapped in a breadth-first fashion, rooted at the vertex that has the maximal communication cost. After the mapping of the root vertex, the vertices at the first level of the breadth-first tree are determined by ordering all the immediately connected vertices of the root vertex. The vertices are ordered and mapped in descending order of their assignment priorities. Once all the vertices at the first level are mapped, the unmapped immediately connected vertices of each of the individual vertices of the first level are sorted and mapped in descending order of their assignment priorities. The process is repeated until all the vertices of the graph G(C) are mapped. The following pseudo code illustrates the step.

Let the current level of the breadth-first tree be Current-Level. Let the array storing the vertices at the current level, sorted in the order of their mapping, be Current-Level-Array. Let the total number of vertices in the current level be N-Current-Level. Let the number of unmapped vertices at the current level be N-Unmapped. Let the array storing the vertices at the level higher than the current level, sorted in the order of their mapping, be Next-Level-Array. Let the total number of vertices at the higher level be N-Next-Level.

Initially, the state of the variables and the arrays will be as follows:
N-Current-Level = 0,
N-Next-Level = 0,
N-Unmapped = 0, and
Current-Level-Array and Next-Level-Array will have no vertices.

During the first execution of this step, the vertices at the first level of the breadth-first mapping tree are obtained by searching all the unmapped connected vertices of the root vertex determined in step 2.1. The vertices are ordered based on the communication cost associated with them. The vertex at the first level with the highest communication cost is returned as the next vertex to be mapped. The following pseudo code illustrates the process.








If (N-Unmapped = 0) AND (Current-Level = 0), then
    Current-Level = Current-Level + 1
    ∀ (Current-Vertex(C), V(i)) ∈ E(C)
        If (V(i) ∉ Assigned-Set(C)), then
            N-Unmapped = N-Unmapped + 1
            Current-Level-Array[N-Unmapped] = V(i)
    For j = 1 To (N-Unmapped - 1)
        For k = j + 1 To N-Unmapped
            If (Comm-Cost(Current-Level-Array[j]) >
                Comm-Cost(Current-Level-Array[k])), then
                Swap(Current-Level-Array[j], Current-Level-Array[k])
    N-Current-Level = N-Unmapped
    Current-Vertex(C) = Current-Level-Array[N-Unmapped]
    N-Unmapped = N-Unmapped - 1

In subsequent executions of this step, if the Current-Level-Array contains unmapped vertices (i.e., N-Unmapped ≠ 0), the unmapped vertex in the current level with the highest communication cost is returned as the next vertex to be mapped. The following pseudo code illustrates the process.

If (N-Unmapped ≠ 0), then
    Current-Vertex(C) = Current-Level-Array[N-Unmapped]
    N-Unmapped = N-Unmapped - 1

However, if all the vertices at the current level are mapped, the unmapped vertices at the next level are searched. The unmapped vertices which are immediately connected to each of the vertices at the current level are found. The connected vertices of each vertex at the current level are ordered based on their communication costs. The connected vertices of the vertex at the current level that has the maximal communication cost are mapped first. Among the connected vertices at the next level, the vertex with the maximal communication cost is returned as the next vertex to be mapped. The following pseudo code illustrates the step.

Let Current-Vertex-Limit be an index of the Next-Level-Array.

If (N-Unmapped = 0) AND (Current-Level ≠ 0), then
    N-Next-Level = 0
    Current-Vertex-Limit = 1
    For i = 1 To N-Current-Level
        ∀ (Current-Level-Array[i], V(j)) ∈ E(C)
            If (V(j) ∉ Assigned-Set(C)), then
                N-Next-Level = N-Next-Level + 1
                Next-Level-Array[N-Next-Level] = V(j)
        For k = Current-Vertex-Limit To (N-Next-Level - 1)
            For l = k + 1 To N-Next-Level
                If (Comm-Cost(Next-Level-Array[k]) >
                    Comm-Cost(Next-Level-Array[l])), then
                    Swap(Next-Level-Array[k], Next-Level-Array[l])
        Current-Vertex-Limit = N-Next-Level + 1
    For i = 1 To N-Next-Level
        Current-Level-Array[i] = Next-Level-Array[i]
    N-Unmapped = N-Next-Level
    N-Current-Level = N-Next-Level
    Current-Vertex(C) = Current-Level-Array[N-Unmapped]
    N-Unmapped = N-Unmapped - 1

Figure 4.7(b) shows the breadth-first mapping tree of the computation-communication graph of Figure 4.6. Vertex V(C5) has the highest mapping priority and is mapped first. All the connected vertices of V(C5) are mapped in the next level. The connected vertices are mapped in the order of their assignment priorities. For example, vertex V(C4) has a higher assignment priority than vertices V(C1), V(C2), V(C7), and V(C8) and is mapped after the vertex V(C5). Vertex V(C8) has the second highest assignment priority among the connected vertices of V(C5) and is mapped next. Similarly, the subsequent mapping of vertices V(C7), V(C1), and V(C2) can be explained. Next, the unmapped immediately connected vertices of V(C4) are mapped. The connected vertices are mapped in the order of their assignment priorities. Thus, vertex V(C3) is mapped before the mapping of vertex V(C6). Subsequently, vertex V(C9), the unmapped connected vertex of the vertex V(C8), is mapped. Since no more vertices remain unmapped, the mapping process is completed.
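As a companion to the pseudo code, the breadth-first selection can be sketched as follows. This Python sketch is illustrative only and not from the dissertation: it assumes a connected graph given as an adjacency dictionary, uses hypothetical vertex names, and folds the per-parent array manipulations above into an ordinary sort while preserving the mapping order of the worked example.

```python
def bfs_mapping_order(adj, comm_cost, root):
    """Order in which vertices are selected by the breadth-first
    strategy: levels are expanded one at a time, parents are expanded
    in the order they were mapped, and each parent's unmapped
    neighbours are taken in descending order of communication cost."""
    order = [root]
    assigned = {root}
    level = [root]
    while level:
        next_level = []
        for parent in level:
            children = [v for v in adj[parent] if v not in assigned]
            # each parent's segment is ordered by assignment priority
            children.sort(key=lambda v: comm_cost[v], reverse=True)
            for child in children:
                assigned.add(child)
            next_level.extend(children)
        order.extend(next_level)
        level = next_level
    return order
```

On a toy two-level graph this yields the root, then its neighbours by descending cost, then each of their unmapped neighbours in turn.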

Step 2.3: Determine the next vertex of the processor graph G(P) on which the Current-Vertex(C) has to be mapped, and map the Current-Vertex(C). If all the vertices of the graph G(C) are not mapped, then repeat the mapping process.








The Current-Vertex(C) is mapped onto the vertex Current-Vertex(P) of G(P) such that the distance between the Current-Vertex(P) and the vertex of G(P) mapping the immediate ancestor of Current-Vertex(C) is as small as possible. Thus, at the first level, all the immediately connected vertices of the vertex mapping the parent of the Current-Vertex(C) are analyzed. Current-Vertex(C) is mapped onto any one of the nonmapped vertices among the immediately connected vertices. If all the vertices at a distance of one hop (i.e., all the immediately connected vertices) are found mapped, the vertices at a distance of two hops are analyzed, and Current-Vertex(C) is mapped onto any nonmapped vertex among them. The immediately connected vertices of all the vertices at a distance of one hop are at a distance of two hops. All the immediately connected vertices of the vertex mapping the parent of the Current-Vertex(C) are ordered based on the communication cost associated with the vertices of G(C) mapped on them. The neighbors of a connected vertex mapping a vertex of G(C) with lower communication cost are analyzed before analyzing the connected vertex mapping a vertex with higher communication cost. Mapping on the neighbors of vertices with the least communication cost enables the vertices of G(C) with higher communication costs to be mapped across vertices of G(P) that are as close together as possible. Vertices at a longer distance from the vertex corresponding to the parent of the Current-Vertex(C) are analyzed only when all the vertices at a shorter distance are already mapped. The following pseudo code illustrates the step.

The vertex of the graph G(P) corresponding to the immediate ancestor of the Current-Vertex(C) is

Parent-Current-Vertex(P) = INVERSE-M(PARENT(Current-Vertex(C)))

Let the set of vertices connected to the Current-Vertex(P) be Connected-Set(P). Starting at the first level, the immediately connected vertices of Parent-Current-Vertex(P) are analyzed. Subsequently, at higher levels, the vertices at greater distances are analyzed. Let the array storing the set of vertices of G(P) whose neighbors are being analyzed at any instant, in a sorted order based on the communication costs associated with the vertices of G(C) mapped on them, be Current-Level-Array. Let the number of vertices in the Current-Level-Array be N-Current-Level. Let the number of vertices immediately connected to the vertex in the ith element of the Current-Level-Array be N-Conn-Vertices(i). Let the array storing all the neighboring vertices of the vertices in the Current-Level-Array be Next-Level-Array. Let the number of vertices in the Next-Level-Array be N-Next-Level.

Initially, Current-Level-Array has only one vertex, namely the Parent-Current-Vertex(P). Thus, N-Current-Level = 1 and Current-Level-Array[1] = Parent-Current-Vertex(P).

N-Next-Level = 0
Found = FALSE
REPEAT
    i = 1
    Current-Vertex-Limit = 1
    REPEAT
        ∀ (Current-Level-Array[i], V(Pj)) ∈ E(P)
            N-Next-Level = N-Next-Level + 1
            Next-Level-Array[N-Next-Level] = V(Pj)
        k = N-Next-Level + 1
        REPEAT
            k = k - 1
            If (Next-Level-Array[k] ∉ Assigned-Set(P)), then
                Found = TRUE
        UNTIL (Found = TRUE) OR (k = 1)
        If (Found = TRUE), then
            Current-Vertex(P) = Next-Level-Array[k]
        If (Found = FALSE), then
            For k = Current-Vertex-Limit To (N-Next-Level - 1)
                For l = k + 1 To N-Next-Level
                    If (Comm-Cost(INVERSE-M(Next-Level-Array[k])) >
                        Comm-Cost(INVERSE-M(Next-Level-Array[l]))), then
                        Temporary-Storage = Next-Level-Array[k]
                        Next-Level-Array[k] = Next-Level-Array[l]
                        Next-Level-Array[l] = Temporary-Storage
            Current-Vertex-Limit = N-Next-Level + 1
        i = i + 1
    UNTIL (Found = TRUE) OR (i > N-Current-Level)
    If (Found = FALSE), then
        For k = 1 To N-Next-Level
            Current-Level-Array[k] = Next-Level-Array[k]
        N-Current-Level = N-Next-Level
        N-Next-Level = 0
UNTIL (Found = TRUE)
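The placement performed by this step can be sketched as a breadth-first scan of the processor graph. The following Python sketch is illustrative rather than the dissertation's method: it finds an unassigned processor at minimal hop distance from the processor holding the parent vertex, but omits the cost-ordered neighbor analysis of the full pseudo code; the graph representation and vertex labels are assumptions.

```python
from collections import deque

def nearest_free_processor(proc_adj, assigned, start):
    """Breadth-first scan of the processor graph: return an unassigned
    processor vertex at the smallest hop distance from `start` (the
    processor holding the parent of the vertex being placed), or None
    if every processor is already assigned."""
    seen = {start}
    frontier = deque([start])
    while frontier:
        p = frontier.popleft()
        for q in sorted(proc_adj[p]):  # sorted only for deterministic ties
            if q in seen:
                continue
            if q not in assigned:
                return q               # first free vertex found is nearest
            seen.add(q)
            frontier.append(q)
    return None
```

On a four-processor ring, for example, the scan first offers a free direct neighbor of the parent's processor and falls back to two-hop vertices only when all one-hop vertices are occupied.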

Figure 4.8(a) shows the mapping of the depth-first tree of Figure 4.7(a) onto the vertices of the processor graph. The vertices of the processor graph are connected in the form of a torus. Vertex V(C5) is first mapped onto the processor vertex V(P1). Since the vertices of the processor graph are connected in a regular and homogeneous fashion, any vertex can be chosen for the initial assignment. Vertex V(C4) is next mapped onto the processor vertex V(P7) since vertex V(P7) is directly connected to the vertex V(P1). Similarly, vertices V(C7), V(C3), V(C8), V(C9), and V(C1) are mapped onto the processor vertices V(P4), V(P5), V(P2), V(P8), and V(P6), respectively. Since all the directly connected vertices of V(P5), the vertex corresponding to the vertex V(C3), are already mapped, the vertex V(C2) is mapped onto a vertex that is at a distance of two hops from V(P5). Vertex V(C2) is