Title: Parallel token processing in an asynchronous trigger system
Permanent Link: http://ufdc.ufl.edu/UF00100665/00001
 Material Information
Title: Parallel token processing in an asynchronous trigger system
Physical Description: Book
Language: English
Creator: Park, Jongbum
Publisher: State University System of Florida
Place of Publication: Florida
Publication Date: 1999
Copyright Date: 1999
 Subjects
Subject: Rule-based programming   ( lcsh )
Database management   ( lcsh )
Computer and Information Science and Engineering thesis, Ph. D   ( lcsh )
Dissertations, Academic -- Computer and Information Science and Engineering -- UF   ( lcsh )
Genre: government publication (state, provincial, territorial, dependent)   ( marcgt )
bibliography   ( marcgt )
theses   ( marcgt )
non-fiction   ( marcgt )
 Notes
Summary: KEYWORDS: trigger, asynchronous, parallel, token, processing, "discrimination network"
Thesis: Thesis (Ph. D.)--University of Florida, 1999.
Bibliography: Includes bibliographical references (p. 160-165).
System Details: System requirements: World Wide Web browser and PDF reader.
System Details: Mode of access: World Wide Web.
Statement of Responsibility: by Jongbum Park.
General Note: Title from first page of PDF file.
General Note: Document formatted into pages; contains xiii, 166 p.; also contains graphics.
General Note: Vita.
 Record Information
Bibliographic ID: UF00100665
Volume ID: VID00001
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
Resource Identifier: oclc - 45265560
alephbibnum - 002456821
notis - AMG2152













PARALLEL TOKEN PROCESSING IN AN ASYNCHRONOUS TRIGGER SYSTEM


By

JONGBUM PARK












A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA


1999

































Copyright 1999

By

Jongbum Park

































To my wife, Sukhee Sung















ACKNOWLEDGMENTS


I am deeply grateful to my advisor, Dr. Eric Hanson, who helped me with careful guidance and personal understanding during the last three and a half years. Many thanks also to my Ph.D. committee members, Dr. Stanley Su, Dr. Sharma Chakravarthy, Dr. Richard Newman, and Dr. Howard Beck, who sacrificed their time and offered precious advice that helped me throughout my research.

I do not know how to express my thanks to my wife, Sukhee Sung, who had to spend many difficult and boring days in Gainesville. Until the day I die, I will not forget her ceaseless love and support. I also thank my daughter, Suwon, and my son, Charles, for not complaining too much about my neglect. I offer my family back in Korea my heartfelt thanks for their encouragement. In this writing, I cannot exclude Dr. John Penrod, who voluntarily helped me with my English and whom I respect.

I sincerely thank my sponsor, the Republic of Korea Air Force, which provided substantial financial and logistical support during my study in the United States. Finally, I would like to thank all the friends and colleagues who stood by my side and helped me whenever I needed it. I wish all of them good luck and divine protection.















TABLE OF CONTENTS



ACKNOWLEDGMENTS

LIST OF TABLES

LIST OF FIGURES

ABSTRACT

CHAPTERS

1 INTRODUCTION

2 BACKGROUND
  2.1 Discrimination Networks
  2.2 Token Propagation in a Discrimination Network
  2.3 TriggerMan -- Asynchronous Trigger Processing System

3 GATOR NETWORK DYNAMIC RESTRUCTURING
  3.1 Gator Network Restructuring and Replacement Theory
  3.2 Test Schedule for Restructuring
  3.3 Restructuring Test of Gator Network
  3.4 Gator Network Optimality Testing Examples
  3.5 Summary

4 OPTIMAL PROCESSING OF A TOKEN SET
  4.1 Finding Crossover Point
  4.2 Estimating the Query Costs
  4.3 Determining Query Approach Type
  4.4 Token Set Propagation Algorithms
  4.5 Summary and Further Studies

5 PARALLEL EXECUTION OF SQL STATEMENTS
  5.1 Parallel Features of Three Database Products
  5.2 Parameters Needed to Calculate the DOP of an SQL Statement
  5.3 Parallel Execution Strategy for an SQL Statement
  5.4 Summary

6 SCALABILITY
  6.1 Condition-level Concurrency
  6.2 Rule Action Concurrency
  6.3 Data-level Concurrency
  6.4 Summary and Further Studies

7 TRIGGER PROCESSING CONSISTENCY LEVEL AND PARALLEL TOKEN PROCESSING
  7.1 Notational Conventions and Definitions of Terms
    7.1.1 Notational Conventions
    7.1.2 Definitions of Terms
  7.2 Trigger Processing Consistency Levels
    7.2.1 Transaction Consistency Degrees and Trigger Consistency Levels
    7.2.2 Criteria of Consistency Level Definition
    7.2.3 The Definition of Trigger Processing Consistency Levels
  7.3 Trigger Processing Consistency Level 3
    7.3.1 Support of Virtual α Nodes with Shadow Tables
    7.3.2 Preventing Duplicate Compound Tokens
  7.4 Trigger Processing Consistency Level 2
    7.4.1 CTS Detection
    7.4.2 Parallel Token Processing Architecture
    7.4.3 Summary of Level 2 Consistency
  7.5 Trigger Processing Consistency Level 1
    7.5.1 Stabilization of α Nodes
    7.5.2 Stabilization of β Nodes
    7.5.3 Necessity of Shadow Table for Virtual α Nodes
    7.5.4 Summary of Level 1 Consistency
  7.6 Trigger Processing Consistency Level 0
    7.6.1 Definitions
    7.6.2 Necessity of Dummy Timestamp ∞
    7.6.3 Processing of - Tokens
    7.6.4 Strategy III
    7.6.5 Proof of β Node Stabilization in Strategy III
    7.6.6 Summary of Level 0 Consistency
  7.7 Implementation Alternatives
    7.7.1 Pure Virtual TREAT Network
    7.7.2 TREAT Network
    7.7.3 A-TREAT Network
    7.7.4 Gator Network
  7.8 Summary

8 CONCLUSION

REFERENCES

BIOGRAPHICAL SKETCH















LIST OF TABLES


3.1: Gator network restructuring and equipment replacement

4.1: Determining the query type for a β node

5.1: Parallel features of the three database products

7.1: The consistency degrees and the consistency levels

7.2: The processing of the first atomic token in a family

7.3: The processing of the second or subsequent atomic token in a family

7.4: The processing of the first compound token in a family

7.5: The processing of the second or subsequent compound token in a family

7.6: Processing of four tokens that arrive at β1

7.7: Tokens that arrive at β1















LIST OF FIGURES


2.1: A dummy rule

2.2: Various types of discrimination networks

2.3: A trigger and a Gator network for it

2.4: The architecture of the TriggerMan system

3.1: Token insertion frequencies

3.2: Gator network restructuring schedule

3.3: Analyzing restructuring feasibility

3.4: Testing a Gator network for restructuring

4.1: Cost of the tuple query and the set query

4.2: The decision of a query approach type

4.3: Initial query approach type for a β node

4.4: Start the propagation of a token set

4.5: Insert and propagate a token set

4.6: Preparing a set query approach

4.7: Preparing a tuple query approach

5.1: Example of a parallel scanning benefit

6.1: Normalized Selection Predicate Index structure

6.2: Partitioned triggerID set

7.1: Untimely joining error

7.2: Convergence of a memory node

7.3: Virtual α node and an untimely joining error

7.4: A shadow table supporting two virtual α nodes

7.5: Creation of duplicate compound tokens

7.6: Manipulation of DLB

7.7: A Gator network for trigger T1

7.8: A Gator network for trigger T2

7.9: Concurrent token set detection

7.10: A Gator network for a trigger without event clause

7.11: Inclusion test for a token into a CTS (without event clause)

7.12: Tokens and CTSs

7.13: A Gator network for a trigger with insert event clause

7.14: Inclusion test for a token into a CTS (with event clause)

7.15: The three-token-queue architecture

7.16: Procedure CTS detecting process

7.17: The effect of the direct insertion and the immediate deletion

7.18: Procedure token handling process

7.19: The necessity of dummy timestamp 0

7.20: Assigning timestamps to a compound token

7.21: Delayed starting of token processing cycles

7.22: Duplicate compound token

7.23: The necessity of dummy timestamp ∞

7.24: Wrong creation of a compound token

7.25: Failing to create a necessary compound token

7.26: The SLB structure for β nodes

7.27: Procedure minus token processing

7.28: A compound tuple that is younger than a token

7.29: Procedure plus token processing

7.30: Stabilization criteria of a β node

7.31: Four components of two incomparable tokens

7.32: Four cases to which a timestamp pair can belong

7.33: The creation and modification cycle of β node β1

7.34: The association of extended families with β tuple components

7.35: The creation of the youngest + compound token















Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy



PARALLEL TOKEN PROCESSING IN AN ASYNCHRONOUS TRIGGER SYSTEM

By

Jongbum Park

May 1999

Chairman: Eric N. Hanson
Major Department: Computer and Information Science and Engineering

In an asynchronous trigger system, trigger condition checking and trigger action execution are done after the update transactions to the data sources are completed. This research is motivated by the development of an asynchronous trigger processing and view maintenance system called TriggerMan. In TriggerMan, a data structure called a discrimination network is built to efficiently check the condition of a trigger. A Gator network is a kind of discrimination network.

In the first part of the research, we try to improve the performance of TriggerMan using techniques that do not compromise the semantic correctness of trigger processing. The techniques include Gator network dynamic restructuring, efficient processing of a large token set, parallel resource utilization, parallel processing of data, and parallel processing of trigger conditions and actions. However, parallel token processing causes problems in the semantic correctness of trigger processing.

In the second part of the research, to incorporate parallel token processing into an asynchronous trigger system in a productive way, we introduce four consistency levels of trigger processing. The purpose of the consistency levels is to achieve maximum performance for a given degree of semantic consistency: the lower the level, the higher the performance. The four consistency levels can be summarized as follows: Level 3 provides serial token processing semantics for a trigger, Level 2 executes trigger actions using all and only correct data, Level 1 allows a limited amount of timing error in the data that causes a trigger action to execute, and Level 0 guarantees the convergence of the memory nodes of the discrimination network for a trigger.

We developed techniques that efficiently implement each consistency level. They include the Stability-Lookaside Buffer (SLB) for memory node stabilization, the shadow table for virtual α nodes, the Concurrent Token Set (CTS) detection and token processing architecture, and the Duplicate-Lookaside Buffer (DLB) for the prevention of duplicate compound tokens. These techniques will improve the overall performance of the system and allow users to choose the desired semantic consistency levels for their triggers.















CHAPTER 1
INTRODUCTION

Active database systems [20],[22],[24],[25],[59],[70],[73] are able to respond automatically to situations of interest that arise in the data. The behavior of an active database is described using triggers [74]. Many database products provide synchronous triggers, where trigger condition testing and action execution are always done as part of the update transaction [43],[45],[71]. The problem with synchronous triggers is that they can slow update response time.

TriggerMan [6],[9],[13],[16],[36],[56],[57] is an asynchronous trigger processing software system that checks trigger conditions and runs trigger actions outside of the transactions that update the data sources. In other words, we assume that the trigger processing system is separate from the data sources, which can be either database tables or generic data sources. TriggerMan allows triggers to have a condition based on multiple data sources (database tables). In an asynchronous trigger system, a discrimination network is used to check the condition of a trigger. TriggerMan is designed to use Gator networks and take advantage of their good performance properties for testing join trigger conditions.

A discrimination network consists of condition testing nodes, memory nodes (α nodes and β nodes), and a P-node. A P-node is the root node of a discrimination network. The insertion of a tuple into a P-node means that a combination of tuples satisfying the trigger condition has been found, and the trigger action needs to be executed. Changes to data sources are delivered to an asynchronous trigger system in the form of a token. The term token was coined in the research on AI production systems [1],[2],[3],[12],[28],[29],[30],[50]. Memory nodes of a discrimination network are updated using tokens. Another way of updating memory nodes is to refresh them periodically, as proposed by Adiba [5].

A view [8],[62],[67] is a derived table defined in terms of base (stored) tables. Thus, a view defines a function from a set of base tables to a derived table. This function is typically recomputed, in whole or in part, every time the view is referenced, using a procedure called query modification [63]. A view can also be materialized to provide fast access to data by storing the tuples of the view in the database [32]. The maintenance of a materialized view can be done in a synchronous way. However, synchronous view maintenance imposes a significant overhead on update transactions and cannot be tolerated by many applications.

Materialized view maintenance [7],[51],[69] has been studied widely, especially in new applications such as data warehousing [14],[18],[32],[42],[66],[76]. View maintenance in a warehousing environment is inherently asynchronous [18],[52],[76]. There are many similarities between materialized views and the contents of the memory nodes and P-nodes of a discrimination network (see Section 2.1). We believe that discrimination networks could be used in maintaining materialized views. Therefore, the TriggerMan system can maintain materialized views asynchronously with ease and efficiency.









This research consists of two parts. In the first part, we introduce techniques to improve the performance of TriggerMan that do not compromise the semantic correctness of trigger processing. The techniques include Gator network dynamic restructuring, efficient processing of a large token set, parallel resource utilization, and parallel processing of data and of trigger conditions and actions.

To find an optimized Gator network structure, we collect or estimate various statistics about the data sources and the cost of the conditions of the trigger that is being defined. However, those statistics can be inaccurate or change over time. Therefore, it is necessary to accumulate the statistics and restructure the Gator networks accordingly during run time, while keeping the overhead of dynamic restructuring at a minimum. Gator network dynamic restructuring is covered in Chapter 3.

Assume a set of tokens arrives at the same node n simultaneously. If the token set is large, it is more efficient to process it using a so-called set query approach: the tokens are stored in a temporary table, and the temporary table is joined with the sibling nodes to find the compound tokens and propagate them to the parent node of n. However, if the token set is small, joining with the sibling nodes token by token is more efficient; this method is called a tuple query approach. The determination of the crossover point between large sets and small sets is covered in Chapter 4 (a minimal decision sketch follows).
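To make the decision concrete, here is a minimal Python sketch that picks an approach given an estimated crossover point; the function and parameter names are illustrative assumptions, not TriggerMan's API:

    def choose_approach(num_tokens: int, crossover: int) -> str:
        """Pick the cheaper way to process a token set arriving at a node."""
        # Below the crossover point, one join query per token (the tuple query
        # approach) is cheaper; at or above it, staging the tokens in a
        # temporary table and joining once (the set query approach) wins.
        return "tuple query" if num_tokens < crossover else "set query"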

Parallel resources include processors, processes, main memory, and disks. To utilize a given hardware and software (database system) environment, we need to estimate the amount of parallel resources required for each Gator network node. The result will be used in requesting parallel resources before processing the SQL queries for the Gator network node. Parallel resource utilization is covered in Chapter 5.

Processing tasks in parallel will increase the scalability of the system. The tasks include trigger condition checking, trigger action execution, and data (memory nodes and base tables) processing. A discussion of the parallel processing of these task types appears in Chapter 6. However, parallel token processing causes problems in the semantic correctness of trigger processing.

The second part of the research focuses on incorporating parallel token processing into an asynchronous trigger system in a productive way. To accomplish this, we define four consistency levels in trigger processing. The purpose of the consistency levels is to achieve maximum performance while allowing acceptable and anticipated problems: the lower the level, the higher the performance. The four consistency levels can be summarized as follows: Level 3 provides serial token processing semantics for a trigger, Level 2 executes trigger actions using all and only correct data, Level 1 allows a limited amount of timing error in the data that causes a trigger action to execute, and Level 0 guarantees the convergence of the memory nodes of the discrimination network for a trigger. Each level is covered by a specific section of Chapter 7, as follows, and a compact encoding of the levels is sketched below.
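As a compact summary, the four levels can be encoded as an ordered enumeration (a minimal Python sketch; the class and member names are illustrative only):

    from enum import IntEnum

    class ConsistencyLevel(IntEnum):
        # Lower level => weaker guarantees => more parallelism and performance.
        CONVERGENCE_ONLY = 0  # memory nodes must converge; timing error unbounded
        BOUNDED_TIMING = 1    # limited timing error in action-triggering data
        CORRECT_DATA = 2      # actions use all and only correct data
        SERIAL = 3            # serial token processing semantics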

Level 3 is the highest consistency level of trigger processing. Level 3 consistency for a trigger, T, can be achieved by serially processing the tokens that arrive at the discrimination network for T. Hence, Level 3 provides the lowest performance among the four consistency levels. When a virtual α node is in the discrimination network for a trigger, a special table called a shadow table is needed for the virtual α node in order to provide Level 3 consistency for the trigger. The whole technique for Level 3 consistency and a technique that resolves the so-called duplicate compound token problem are presented in Section 7.3.

Level 2 consistency allows the out-of-order execution of multiple instantiations of the action of a single trigger. Some tokens that arrive at the same discrimination network can be processed in parallel while still providing Level 2 consistency. Such a group of tokens is called a Concurrent Token Set (CTS). The technique for CTS detection and a proposed architecture for efficient token processing are introduced in Section 7.4.

Since Level 1 allows a limited amount of timing error, all the tokens that arrive at a trigger system within a limited time interval can be processed in parallel. However, simple concurrent token processing can corrupt the memory nodes of a discrimination network and break one of the conditions for Level 1 consistency. Therefore, a mechanism that guarantees the convergence of memory nodes is needed. We developed the Stability-Lookaside Buffer (SLB) for the stabilization of memory nodes. Our proposed strategies that use the SLB to stabilize α nodes and β nodes are presented in Section 7.5.

Although in Level 0 consistency the timing error is allowed to extend indefinitely, the memory nodes of a discrimination network are still required to converge. The stabilization of a stored α node, and of a β node that does not have virtual α node descendants, can be achieved using the strategies introduced in Section 7.5. However, a strategy for the stabilization of a β node that has virtual α node descendants needs to be developed. Our proposed strategy for such a β node and a proof of the stabilization of the node are presented in Section 7.6.






To help system developers, a discussion of the implementation alternatives for an asynchronous trigger system is given in Section 7.7. Finally, a summary of this research appears in Chapter 8.















CHAPTER 2
BACKGROUND

Active database systems are able to respond automatically to situations of interest that arise in the databases. The behavior of an active database is described using triggers. For efficient trigger condition checking, a trigger processing system can use discrimination networks. TriggerMan is an asynchronous trigger processing system that checks trigger conditions and runs trigger actions outside of a DBMS or other data sources.

This chapter is organized as follows: Section 2.1 compares three kinds of discrimination networks, Section 2.2 explains how tokens are propagated in a discrimination network, and Section 2.3 briefly describes the TriggerMan system.


2.1 Discrimination Networks

Discrimination networks have tree structures. The root of a discrimination network is usually drawn at the bottom. Rete, TREAT, and Gator networks are the major discrimination networks, and their most prominent difference is in the shape of the tree structure. A discrimination network has (selection and/or join) condition checking nodes, memory nodes, and a P-node.

There are two kinds of memory nodes: the α memory nodes (simply, α nodes) and the β memory nodes (simply, β nodes). An α node holds the result of applying a selection condition to a database table. A β node holds the result of the join of other memory nodes.

An α node can be either a stored α node or a virtual α node. An α node that contains the qualifying data in it is called a stored α node. An α node that contains a predicate describing the contents of the node, rather than the qualifying data itself, is called a virtual α node. Every β node is a stored β node. A stored memory node and a virtual memory node are analogous to a materialized view and an ordinary view, respectively. In Rete terminology, the root node of a discrimination network is called the P-node.

The Rete algorithm was proposed by Forgy [23] to do the pattern matching for rules in OPS5 [11]. The Rete algorithm uses a tree-structured sorting network for the productions; we call the network a Rete network. A Rete network has both α memory nodes and β memory nodes. Every α node in a Rete network is a stored α node. Each β node in a Rete network has exactly two inputs (children). A possible Rete network for the rule a dummy rule (Figure 2.1) is shown in Figure 2.2 (a). Note that the network has a binary tree structure.


define rule a dummy rule
if R1.a1 = R2.a2 and R2.b2 = R3.b3 and R3.c3 = R4.c4
then action

Figure 2.1: A dummy rule























[Figure: three discrimination networks over R1, R2, R3, and R4 for the rule a dummy rule -- (a) a Rete network, (b) a TREAT network, and (c) a Gator network.]

Figure 2.2: Various types of discrimination networks









The TREAT match algorithm was developed for AI production systems by Miranker [60],[61],[72]. A TREAT network has only α nodes as its memory nodes. As a result, TREAT removes the overhead of maintaining β nodes. In a TREAT network, the order in which α nodes are joined can be recomputed dynamically. Every α node of a TREAT network is a stored α node. The TREAT network for the rule a dummy rule (Figure 2.1) is shown in Figure 2.2 (b). To reduce the storage requirement, Hanson developed the A-TREAT algorithm [33]. In an A-TREAT network, an α node can be either a stored α node or a virtual α node. Note that the height of the tree is only one.

The Gator (Generalized TREAT/Rete) network was developed for active database rule systems and production system interpreters by Hanson [34],[37]. A Gator network has a general tree structure and shows good performance, but with added system complexity [34]. Rete networks and TREAT networks can be seen as subsets of Gator networks. A Gator network has both α nodes and β nodes as memory nodes. An α node in a Gator network can be either a stored α node or a virtual α node. A possible Gator network for the rule a dummy rule (Figure 2.1) is shown in Figure 2.2 (c). Note that the network has a general tree structure.

Performance studies in [72] indicate that, in a database environment, TREAT usually outperforms Rete, but Rete is better than TREAT in a few cases where the frequency distribution of updates to the different relations in the rule condition is skewed. Gator networks, if properly tuned, have the potential to perform well in all cases [9],[56]. A minimal sketch of these node structures appears below.
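To make the taxonomy concrete, here is a minimal Python sketch of the memory node kinds; the class and field names are illustrative assumptions, not the actual TriggerMan structures:

    from dataclasses import dataclass, field

    @dataclass
    class StoredAlphaNode:
        table: str        # underlying data source
        selection: str    # selection predicate applied to the table
        tuples: list = field(default_factory=list)  # materialized qualifying data

    @dataclass
    class VirtualAlphaNode:
        table: str
        selection: str    # only the predicate is kept; no data is stored

    @dataclass
    class BetaNode:
        children: list    # two or more child memory nodes (general tree)
        join_condition: str
        tuples: list = field(default_factory=list)  # materialized join result

In this encoding, a Rete network restricts every BetaNode to exactly two children and uses only StoredAlphaNode leaves; a TREAT network has no BetaNode at all; and a Gator network allows both α node kinds and arbitrary β node fan-in.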









2.2 Token Propagation in a Discrimination Network

In an asynchronous trigger processing system, as an implementation method, materialized memory nodes are stored in a commercial database (the host DBMS). When one or more tokens arrive at a node, the query modification technique [36] is used to propagate them. To make the query modification technique work, a tuple query template (TQT) and a set query template (SQT) need to be created for each memory node. They are created based on the discrimination network structure and the trigger condition. A TQT and an SQT are stored in each memory node.

When a + token arrives at a memory node, n, the TQT stored in n is modified using the token. When many + tokens arrive at n simultaneously, the tokens are stored in a temporary table and the SQT stored in n is modified using the temporary table. The modified TQT or SQT is submitted to the host DBMS for execution. If the query has a result, the tuples in the result are propagated to the parent node of n. If the query has no result, the processing of the token(s) stops. A detailed explanation appears in [36].

An example is given in Figure 2.3. In the figure, three table schemas and a trigger, enroll_DBMS, are defined. A Gator network for enroll_DBMS is also shown in the figure. In the Gator network, the α node α1 logically contains the result of the following query:

select * from class where class.cname = "DBMS"

Similarly, the β node β1 logically contains the result of the following query:

select * from class, enroll where class.cname = "DBMS" and class.cno = enroll.cno












class (cno, cname)
enroll (cno, sid)
student (sid, sname)

create trigger enroll_DBMS
from class, enroll, student
when student.sid = enroll.sid
and enroll.cno = class.cno
and class.cname = "DBMS"
then do ...

[Gator network: α1 (reln = class, cname = "DBMS"), α2 (reln = enroll), and α3 (reln = student); the join α1.cno = α2.cno feeds β1, and the join β1.sid = α3.sid feeds the P-node.]

Figure 2.3: A trigger and a Gator network for it



Suppose a token, t, is inserted into α2. The system then modifies the TQT stored in α2 using t, and creates the following query:

select α1.*, t.* from α1 where α1.cno = t.cno

Here, t.* is expanded into a list of one or more constants (all the fields of t), and t.cno represents the value of the cno field of t. The above query is then passed to the query processor for execution. Each tuple in the result is transformed into a + compound token and then propagated to β1, the parent node of α2.


If many tokens are inserted into β1 at the same time, they are stored in a temporary table, temp. The system modifies the SQT stored in β1 using temp, and creates the following query:

select temp.*, α3.* from temp, α3 where temp.sid = α3.sid

The above query is then passed to the query processor for execution. Each tuple in the result is transformed into a + compound token and then propagated to the P-node, the parent node of β1.


The propagation of a token can be done in two forms -- as an atomic token or as a compound token. Subsections 7.5.1, 7.5.2, and 7.6.3 explain token propagation in detail. A sketch of the propagation loop described in this section follows.
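To summarize the propagation scheme of this section, here is a hedged Python sketch; the node fields (tqt_with, sqt, parent) and the db helpers are illustrative assumptions, not TriggerMan's actual interfaces:

    def propagate(node, tokens, db):
        """Propagate one or more + tokens that arrive at memory node `node`."""
        if len(tokens) == 1:
            # Tuple query approach: instantiate the node's TQT with the token.
            result = db.execute(node.tqt_with(tokens[0]))
        else:
            # Set query approach: stage the tokens in a temporary table and
            # run the node's SQT against it.
            db.load_temp_table("temp", tokens)
            result = db.execute(node.sqt)
        # Each result tuple becomes a + compound token sent to the parent node;
        # an empty result ends the propagation.
        if result and node.parent is not None:
            propagate(node.parent, result, db)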


2.3 TriggerMan -- Asynchronous Trigger Processing System

TriggerMan contains the main components of an active database system, but is separated from any specific database. However, TriggerMan is implemented on a specific database to store its catalog and state information. TriggerMan receives update descriptors from the data sources (databases) asynchronously, allowing the update transactions on the databases to execute at an uninterrupted speed. A TriggerMan client can define a trigger on multiple data sources and can register for the events provided by the TriggerMan server. The TriggerMan system consists of the following components:

* The server, which lives inside a commercial DBMS (in the current implementation, Informix Dynamic Server with Universal Data Option [46], which we call Informix/UDO), is a passive module consisting of a set of user-defined routines (Informix/UDO terminology).

* Data source applications are programs that transmit a sequence of update descriptors (tokens) to the server, describing updates that have occurred in the data sources.

* Client applications create triggers, drop triggers, register for events, receive event notifications when triggers fire, etc.

* The driver is a program that periodically invokes a special function, TmanTest(), in the server, allowing trigger condition checking and action execution to be performed; more than one instance of the driver can run at the same time to fulfill the performance requirement (a polling sketch appears at the end of this section).

* The console is a special client application program that lets a user directly interact with the system to create or drop triggers, start or shut down the system, etc.


Figure 2.4: The architecture of the TriggerMan system


The general architecture of the TriggerMan system is illustrated in Figure 2.4. Having more than one driver program concurrently invoking the TmanTest() function allows higher throughput for TriggerMan by letting concurrent trigger processing activities take place inside the host DBMS server. The TriggerMan system catalog contains data source and trigger definitions, the structures and statistics of discrimination network nodes, etc. The catalog and the persistent update queues are stored in database tables of the host DBMS.

Two libraries that come with TriggerMan allow the writing of client applications and data source applications. These libraries define the client application programming interface (API) and the data source API. The console program and other application programs use client API functions to connect to TriggerMan, issue commands, register for events, and so forth. Data source applications can be written using the data source API [13]. Examples of data source programs are generic data sources that send streams of update descriptors to TriggerMan, and DBMS gateway programs that gather updates from the DBMSs and send them to TriggerMan.

As Figure 2.4 shows, data source applications can place update descriptors either in a persistent queue or in a volatile queue in shared memory. A persistent update queue is an ordinary host DBMS table created and used by TriggerMan. A volatile queue in shared memory is used to hold update descriptors that do not need to be recovered in case of a system failure. It is the duty of the application designer to determine whether update descriptors sent to TriggerMan need to be recoverable.
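As an illustration of the driver's role described above, the following Python sketch periodically invokes the server's TmanTest() routine through the host DBMS; the connection helper and the exact call syntax are assumptions made for illustration, not TriggerMan's documented interface:

    import time

    def run_driver(conn, interval_secs=1.0):
        """Periodically give the in-DBMS server a chance to process triggers."""
        while True:
            # Each call lets the server perform trigger condition checking and
            # action execution inside the host DBMS (Informix/UDO-style
            # user-defined routine invocation, assumed syntax).
            conn.execute("EXECUTE FUNCTION TmanTest()")
            time.sleep(interval_secs)

Several such driver instances can run concurrently to raise throughput, as noted above.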















CHAPTER 3
GATOR NETWORK DYNAMIC RESTRUCTURING

In the TriggerMan system, when a trigger is defined, an optimized Gator network is produced to test the condition of the trigger (the same thing happens with a materialized view). An optimized Gator network for a trigger condition is produced by our optimizer [9],[56]. The optimization of a Gator network depends on many factors, such as the statistics on the data sources and the costs of the selection and/or join conditions in the condition of the trigger that is being defined.

Some of these factors are estimated, since they are unavailable at the time of optimization. They are:

* Database relation (data source) update frequencies
* Selectivity factors of the selection conditions associated with α nodes
* Join selectivity factors (JSFs) of the join conditions between two nodes

Due to the statistical assumptions made in the estimation of these values, errors are inevitable [17]. Furthermore, since intermediate estimates can be operands of further estimates, inaccuracies in current estimates propagate into and aggravate later ones [47].

Other factors, like the relative token insertion frequencies (the sum of all the relative token insertion frequencies of the α nodes of a Gator network is 100) and the sizes of the memory nodes, change over time [56]. Let us consider the simple Gator network with three α nodes in Figure 3.1.



[Figure: the same three-α-node Gator network shown twice -- (a) with the estimated relative token insertion frequencies 5, 5, and 90, and (b) with the changed frequencies 5, 85, and 10.]

Figure 3.1: Token insertion frequencies



In the figure, (a) is the initial Gator network. The tops of the incoming arrows represent the relative token insertion frequencies (5, 5, and 90) that were estimated when the network was built. However, we can assume that the values of these frequencies will change to 5, 85, and 10, respectively, over time (see (b) of Figure 3.1). Let us temporarily ignore the other factors influencing the Gator network structure; this makes it easier to understand the effect of the changes in the relative token insertion frequencies. It is certain that the initial network structure (shown in (a) of Figure 3.1) is no longer optimal for propagating the incoming tokens with the new frequencies. If that is the case, we need to restructure the Gator network during run time.

Geier [26] made a similar argument for computer network reengineering (restructuring). The factors that he considered in modifying a computer network include new applications, technology shifts, and organizational resizing. Typically, the cycle of computer network reengineering is much longer than that of Gator network restructuring. In doing network reengineering, he did not consider the cost of analyzing the modification feasibility. This is one of the differences between computer network reengineering and Gator network restructuring.

Bodorik, Riordon, and Pyra made an effort to overcome the inaccuracies in the initial estimations used for determining query processing strategies in distributed databases [10]. In their study, minimizing the delay of the distributed query execution was crucial, because the correction of the query processing strategy was performed while the query was being processed. Hence, they adopted a computationally simple decision method and prepared alternative heuristic strategies in a background mode. However, optimal Gator network restructuring is an expensive operation involving many factors, including new β node construction. Therefore, we cannot prepare alternative plans in advance.

Gator network restructuring is surprisingly similar to equipment replacement in operations research [68]. We believe that we can get an idea of what the Gator network restructuring system should look like by comparing the two systems. Section 3.1 compares the two systems. Section 3.2 discusses the test schedule for Gator network restructuring. Section 3.3 introduces a formula that can be used in testing a Gator network for restructuring. Section 3.4 gives two examples of Gator network restructuring. Finally, Section 3.5 summarizes this chapter.


3.1 Gator Network Restructuring and Replacement Theory

First, let us introduce replacement theory in operations research [68]. Replacement theory originated from the observation that equipment deteriorates with age; that is, the longer the equipment is retained, the higher the cost of operating it. Thus, as an alternative, it may be profitable to acquire new equipment that is more economical to operate. The fundamental problem one faces is to strike an appropriate balance between the cost of the increased upkeep of the old equipment and the acquisition cost and reduced upkeep of new equipment.

Table 3.1 shows a comparison between Gator network restructuring and equipment replacement in operations research. Intermittently operating equipment is characterized by the fact that its operation depends on the user's request; thus, the equipment ages only when it provides service. The service that is provided by a Gator network is the token propagation through the network. The tokens arriving at the Gator network change the statistics, decreasing the optimality of the Gator network. Therefore, a Gator network is a kind of intermittently operating equipment, because it ages (becomes sub-optimal) when it propagates tokens, in other words, when it operates.

Some parameters change with time after the latest optimization. Hence, the current Gator network structure becomes sub-optimal against the current statistics. In general, we can say that the performance of a Gator network deteriorates over time. Therefore, we need to restructure Gator networks during runtime. Since the rate of deterioration is hard to predict, we need to test the performance of the Gator network and restructure it depending on the test result.









Table 3.1: Gator network restructuring and equipment replacement

FACTORS               Gator network restructuring        Equipment replacement
--------------------  ---------------------------------  ---------------------------------
Basic assumption      Gator network performance          equipment deteriorates with age
                      deteriorates with age (a)
Alternative plan      restructure the Gator network (b)  acquire new equipment
Input parameters      restructuring cost;                acquisition cost;
                      increased performance of the       reduced upkeep of the new
                      new Gator network                  equipment
Decision variable     restructuring decision             replacement age
Distinctive features  simple comparison to get the       cost estimation curve to get the
                      restructuring decision (c);        optimum replacement age;
                      costs of optimality testing and    replacement decision cost is
                      statistics gathering do not        not accounted for
                      affect the restructuring
                      decision (d)

(a) As factors change over time, a Gator network becomes sub-optimal and deteriorates with age.
(b) Restructuring involves re-optimizing a Gator network and priming newly created memory nodes.
(c) The future trends of the costs of the triggers are not projected.
(d) The costs are considered a necessary evil, or tax.



The restructuring cost includes not only the re-optimization cost of the Gator network, but also the cost of priming the new memory nodes of the new Gator network. Priming is the process in which the nodes of the discrimination network are loaded with the tuples that match the selection/join conditions associated with these nodes. The restructuring cost is also known as the preparation time in Section 3.2. Because thousands of triggers will be defined in TriggerMan, projecting the future trends of the costs of thousands of Gator networks would be expensive. Therefore, we compare the benefit of restructuring with its cost in order to decide on the restructuring.

We accumulate node statistics and selection/join selectivity factors of the Gator networks while TriggerMan is in operation. Using the accumulated statistics, the performance of a Gator network is tested on a schedule that is specific to that Gator network, and the network is restructured depending on the result of the test. We do not include the costs of statistics gathering and optimality testing in the restructuring cost of the new Gator network; we consider these costs a necessary evil, or tax. However, we try to reduce the optimality testing cost by lengthening the test interval when appropriate.


3.2 Test Schedule for Restructuring

After the optimizer builds a Gator network, we need to prime the Gator network (load its stored α nodes and β nodes with data) in order to make it operational. Optimizing a Gator network and priming the optimized Gator network are time-consuming operations. Therefore, it is inefficient to restructure more often than necessary. Let the preparation time be the optimization time plus the priming time of a Gator network. Because each Gator network has a different rate of change in its statistics, the test schedule for restructuring needs to be different for each trigger.

We considered using the difference between the estimated statistics and the observed statistics of a Gator network as an indicator of the sub-optimality of the Gator network. This was done by Kabra [55], who detected the sub-optimality of a query execution plan during query execution. Obviously, sub-optimality detection based on the change in statistics is cheaper than detection based on comparing Gator network costs obtained using the optimizer. However, we will use the latter, based on the following observations:

* It is hard to see that a simple change in statistics definitely indicates the sub-optimality of a Gator network, since Gator network optimization is more complex than query optimization [9].

* The cost of optimization is justified by the following facts: the restructuring of a Gator network is done less frequently than the re-optimization of a query, and Gator network restructuring involves a priming cost (which is greater than the optimization cost) when restructuring is done.

* The Gator network optimizer considers the change in statistics when deciding the sub-optimality of a Gator network and gives an exact indication.

In replacement theory, the equipment service age is defined as the total number of registered operational units of the equipment since acquisition. For example, in the case of a vehicle, the operational units would be miles, and the age would be the number of miles registered by the vehicle's odometer. We believe that a Gator network should provide service, without being tested for restructuring, for a time period proportional to its preparation time; otherwise, too much work would be wasted in testing the optimality of the Gator networks. That is, the restructuring schedule of a Gator network should be based on its operation time, because it is a kind of intermittently operating equipment. Therefore, we have included the preparation time among the factors that determine the test schedule for restructuring.









We will also use a system-wide constant in calculating the schedule, to allow the TriggerMan system administrator to control the initial test schedule of all Gator networks in the system. Because of the different rates of change of statistics on Gator networks, we will assign and maintain a factor, f, for each Gator network so that it can control the test schedule of that Gator network individually.

In summary, the components that control the test schedule of a Gator network are:

* t : preparation time of the Gator network, in seconds
* C : system-wide constant controlling the schedules of all Gator networks (C >= 1.0)
* f : factor controlling the restructuring schedule of the Gator network, f = 2^n (n: integer, n >= 0)




[Figure: a timeline showing a preparation period of length π followed by π×C×f seconds of operation, after which the network is tested for restructuring.]

Figure 3.2: Gator network restructuring schedule



Once every π×C×f seconds of operation, we will test the optimality of a Gator network against the recent statistics gathered during the operation (see Figure 3.2). Then we will restructure it conditionally, based on the result (details are explained in Section 3.3). The π in the π×C×f formula is the preparation time of the most recent preparation. Constant C can be set to 1 initially and can be adjusted during later performance tuning.









The initial value of factor f of each Gator network will be 1. The value changes

depending on the result of the restructuring test for the Gator network (details are

explained in Section 3.3).


3.3 Restructuring Test of Gator Network


If the Gator network has operated for π×C×f seconds since the most recent restructuring, then we check the necessity of restructuring using the statistics accumulated since that restructuring. The guidelines for Gator network restructuring are:

- The restructuring decision should be based on the cost and the benefit of the restructuring (refer to the distinctive features factor in Table 3.1); if the benefit of restructuring is greater than its cost, then restructure the Gator network (refer to Figure 3.3).

- The cost of testing should be as small as possible to reduce overhead (refer to the distinctive features factors in Table 3.1).

- Depending on the restructuring decision made, the next schedule needs to be changed. Otherwise, there will be too much optimality testing overhead for a Gator network that follows the estimated statistics, or the benefit of opportune restructuring cannot be reaped (refer to the factor f in Section 3.2).

Figure 3.3 illustrates how to analyze the feasibility of restructuring the Gator network. The decision to restructure is based on a comparison of the costs and the benefits: if the benefits exceed all costs involved, the decision is to restructure.


















[Figure: a decision diagram comparing the benefits of restructuring against its costs, leading to either "Decision: Restructure the Gator network" or "Decision: Do not restructure the Gator network".]

Figure 3.3: Analyzing restructuring feasibility



Keeping the above guidelines in mind, see Figure 3.4, where the performance axis specifies the ratio of the observed performance to the optimal performance (performance_observed / performance_optimal). In Figure 3.4, the dotted lines passing through the time axis and the actions in parentheses are future events. After Σ seconds of operation (Σ denotes the operation time since the most recent preparation), a Gator network is tested for restructuring by comparing the benefit of restructuring with its cost. The benefits and the costs of the restructuring can be obtained based on two assumptions.

The first assumption is that the difference between the estimated statistics and the observed statistics of the Gator network will persist, in the future, for at least the same amount of time as the time during which the statistics were gathered. Therefore, the average performance (p′ in Figure 3.4, 0 < p′ < 1.0) of the Gator network will hold or decrease in the future for the same length of time during which the statistics were accumulated (Σ seconds in Figure 3.4). A similar forecasting method is used successfully in the LRU page replacement policy for virtual memory in computer operating systems and in the focus forecasting method [15].










[Figure: performance (p) versus time; a preparation period of length π is followed by an operation period of Σ seconds, after which the test occurs; the projected future repeats the pattern with preparation time π′. Shaded region A (area p′ × π) is the cost of restructuring; shaded region B (area (1 − p′) × Σ) is its benefit.]

Figure 3.4: Testing a Gator network for restructuring



The second assumption is that the time of the next preparation will be the same as the time of the previous preparation. The same assumption holds in the simplest type of replacement situation in replacement theory. Therefore, π′ will be equal to π in Figure 3.4.

The benefit of restructuring will be the guaranteed increment in throughput of the new Gator network. To obtain the throughput increment of a Gator network, we multiply the performance increment by the time period during which the increased performance will be maintained. By the first assumption given above, this time period will be the same as the past operation time (Σ seconds in Figure 3.4). Now we have the benefit (the area of the shaded region B in Figure 3.4):

    (1 − p′) × Σ    (3.1)

Because normal operation cannot be performed during the preparation of the new Gator network, the cost of restructuring can be obtained by multiplying the average performance (the performance of the old Gator network) by the preparation time (π seconds in Figure 3.4). Now we have the cost (the area of the shaded region A in Figure 3.4):

    p′ × π    (3.2)

The average performance (p′) during a period can be calculated using the statistics accumulated during that period. Since performance is directly related to cost (the concept used by the cost-based optimizer), p′ can be obtained from the cost of the current Gator network (G_C) against the accumulated statistics.

We actually need two costs to get p′: the cost of G_C (C_C) and the cost of the optimal Gator network (G_OPT). The calculation of C_C is straightforward; pass G_C to the cost calculation module of the optimizer and let the module calculate C_C using the accumulated statistics. To obtain the cost of G_OPT (C_OPT), we need to invoke the Gator network optimizer and find G_OPT. When we invoke the optimizer, we will use the current Gator network as the initial starting state (our optimizer uses iterative improvement [9] as its main algorithm, and iterative improvement needs many starting states). This is based on the heuristic that the optimal Gator network will be similar to the current Gator network. Using the current Gator network as the initial starting state guarantees that the optimizer will find a Gator network whose cost is no higher than that of the current one. We will use the cost of the network found by the optimizer as C_OPT. It is expected that the portion of the priming time in the preparation time is far greater than that of the Gator network optimization time. Therefore, the execution cost of the optimizer in making the restructuring decision is justified.


Since the performance of the Gator network is inversely proportional to its cost, we have

    1 : p′ = 1/C_OPT : 1/C_C

Hence,

    p′ = C_OPT / C_C    (3.3)

Substituting (3.3) into the benefit formula (3.1) gives

    (1 − C_OPT / C_C) × Σ    (3.4)

Substituting (3.3) into the cost formula (3.2) gives

    (C_OPT / C_C) × π    (3.5)

Using formulas (3.4) and (3.5), we can formulate the restructuring condition:

    (1 − C_OPT / C_C) × Σ > (C_OPT / C_C) × π    (3.6)


Based on the analysis in Figure 3.3, if (3.6) is true, we will restructure the Gator network. Following the last guideline given at the beginning of Section 3.3, if the restructuring is to be done, we will decrease f by half; otherwise (no restructuring), we will double the value of f. By doing this, we can reduce the cost of the optimality tests of the Gator network.


3.4 Gator Network Optimality Testing Examples


In this section we present two examples of Gator network optimality testing for dynamic restructuring. As common conditions for the two examples, let us assume that π (the preparation time) is 1200 seconds, C is 1, C_C is 140, and C_OPT is 90.

In the first example, we have the additional conditions of f equal to 1.0 and Σ equal to 1200 seconds. Because Σ (1200 seconds) is equal to π×C×f (1200×1×1.0 = 1200 seconds) (see Figure 3.2), it is time to test the performance of the Gator network and determine whether to restructure it.

The value of formula (3.6) is

    (1 − C_OPT/C_C) × Σ > (C_OPT/C_C) × π
    = (1 − 90/140) × 1200 > (90/140) × 1200
    = 5 > 9
    = false

Hence, restructuring is not done, and f is doubled, becoming 2.0.

The second example is a continuation of the first. Now we have f equal to 2.0 and Σ equal to 2400 seconds as additional conditions. Because Σ (2400 seconds) is equal to π×C×f (1200×1×2.0 = 2400 seconds) (see Figure 3.2), it is time to test the necessity of a restructuring. The value of formula (3.6) is

    (1 − C_OPT/C_C) × Σ > (C_OPT/C_C) × π
    = (1 − 90/140) × 2400 > (90/140) × 1200
    = 10 > 9
    = true

Therefore, the Gator network will be restructured, and f will be reduced by half, becoming 1.0.
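For concreteness, the restructuring test and the adjustment of f can be written down directly. The following is a minimal sketch in Python; the cost figures C_OPT and C_C are assumed to come from the optimizer as described in Section 3.3.

    def should_restructure(c_opt, c_c, sigma, pi):
        """Restructuring condition (3.6): benefit (3.4) > cost (3.5)."""
        benefit = (1.0 - c_opt / c_c) * sigma   # formula (3.4)
        cost = (c_opt / c_c) * pi               # formula (3.5)
        return benefit > cost

    def adjust_f(f, restructured):
        # Section 3.3: halve f when restructuring is done, double it otherwise.
        return f / 2.0 if restructured else f * 2.0

    # The two examples above:
    should_restructure(90, 140, 1200, 1200)   # False: 428.6 > 771.4 fails
    should_restructure(90, 140, 2400, 1200)   # True:  857.1 > 771.4 holds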


3.5 Summary


To decide whether to restructure a Gator network or not, the performance

(optimality) of the Gator network needs to be evaluated. We reuse the cost function of

the Gator network optimizer in evaluating the performance of a given Gator network.

Since the optimality test has its own cost, it should not be done more often than

necessary. To reduce the cost of the optimality test, a formula that contains factors (the

preparation time and the adjusting factor) specific to each Gator network is developed.

The value of the formula determines when to test the optimality of a Gator network.

The decision to restructure is based on the cost and the benefit of restructuring. The cost of restructuring a Gator network is obtained by multiplying the average performance of the current Gator network by its preparation time, since normal operation cannot be performed during the preparation of the new Gator network. The benefit of restructuring is the increased throughput obtained using the new Gator network; it is obtained by multiplying the performance increment by the time of operation since the previous restructuring of the Gator network. An expression whose value determines whether to restructure a Gator network was developed.















CHAPTER 4
OPTIMAL PROCESSING OF A TOKEN SET


When a token arrives at a memory node (an α or β node), the TriggerMan system will modify the query that is stored in the node (see Section 2.2 for more details). The modified query will be submitted to the host DBMS to find tuples that match the token. The matching tuples will be used to form compound tokens that are sent to the parent node of the memory node. We can imagine a situation where multiple + tokens arrive simultaneously at a β node. If we process the tokens coming from a data source concurrently and accumulate the tokens from that data source arriving at an α node, then an α node can also have multiple tokens arriving simultaneously.

There are two ways to propagate the tokens that arrive simultaneously at a

memory node. One way is to form a query for each token by modifying the tuple query

template (TQT, see Section 2.2) and executing the queries. The join degree of each query

will be d when the number of siblings of the memory node is d. The number of queries to

execute will be m given that the number of tokens is m. We will call this the tuple query

approach. The other way is to transform the token set into a temporary table, form a

query by modifying the set query template (SQT, see Section 2.2), and execute the query.

The join degree of the query will be d+1 given that the number of siblings of the memory

node is d. We will call this the set query approach.









When a database system receives a query, it goes through a syntax-checking phase and a query optimization phase [63]. The query optimization time increases rapidly as the number of variables in the query increases, because the number of possible plans increases combinatorially [19],[53]. It is evident that the smaller the set of + tokens is, the more efficient the tuple query approach is; similarly, the larger the set of + tokens is, the more efficient the set query approach is. Finding the crossover point between the large sets and the small sets is an optimization problem and will be studied in this chapter. We will estimate the costs of the queries using the optimizer provided by the host DBMS and apply interpolation to the result to determine the crossover point of each Gator network node (more precisely, of every <n, parent(n)> pair for each memory node n).

This chapter is organized as follows: Section 4.1 discusses how to find the crossover point, Section 4.2 discusses the method of estimating the costs of the queries, Section 4.3 introduces the process of deciding the query approach type, Section 4.4 presents the algorithms that propagate token sets, and Section 4.5 presents the summary and further studies.


4.1 Finding Crossover Point


A crossover point is a number such that if the number of tokens in a set that arrive

at a node exceeds the number, then it is more efficient to take the set query approach to

propagate the token set. Otherwise, taking the tuple query approach is more efficient.

Because the complexity of a query determines its optimization time, and the optimization time determines the crossover point, each <n, parent(n)> pair of each node n needs to have a unique crossover point. To make the explanation simple, we assume that each memory node has a single parent node.

If a + token arrives at a node, then the tuple query approach will probably be

beneficial. Let us assume that a set of two tokens arrived at a node n. Because of the

increased optimization time, the cost of joining the set of two tokens with the siblings of

n would probably be greater than twice the cost of a tuple query. Let the token set size be

k. As k increases, the cost of the set query could be lower than k times the cost of a tuple

query. This is due to the linear increase of the cost of the tuple query approach.

The determination of the crossover point of a node n is possible based on the following assumption: the cost of joining a temporary relation (TR) with the siblings of n is linearly proportional to the number of tuples in TR. To determine the crossover point of n, three kinds of costs are needed (see Figure 4.1):

- C1: the cost of preparing and executing the query joining a token arriving at n with the siblings of n (the nodes that have the same parent as n)

- C2: the cost of preparing and executing the query joining the temporary table of size 2 with the siblings of n

- Ck: the cost of preparing and executing the query joining the temporary table of size k with the siblings of n

We cannot say that Ci increases perfectly linearly as i increases. If we choose k close to the real crossover point, then the error in estimating the crossover point will be minimized (details are omitted). However, we do not know the crossover point in advance. Therefore, at first, some number that is substantially larger than 2 can be used as k; then, after we have some idea about the real crossover point, we can choose a universal k value.





[Figure: cost versus the size of a token set, S; line L1 (tuple query) rises linearly from C1 through 2·C1, and line L2 (set query) passes through C2 at size 2 and Ck at size k; the two lines cross at CO.]

Figure 4.1: Cost of the tuple query and the set query



Let S be the set of + tokens that arrives at a memory node n. In Figure 4.1, lines L1 and L2 denote the costs of the tuple query approach and the set query approach, respectively, in propagating S to the parent node of n. The equations for the two lines are:

    L1: y = C1 × x    (4.1)

    L2: y = ((Ck − C2)/(k − 2)) × x + C2 − 2 × (Ck − C2)/(k − 2)    (4.2)

By (4.1) and (4.2), we have the crossover point (CO):

    CO = (k × C2 − 2 × Ck) / ((k − 2) × C1 + C2 − Ck)    (4.3)









Therefore, if the number of tokens arriving at a node of a Gator network is greater

than CO, then the set query approach will be used. Otherwise, the tuple query approach

will be used.
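As a sketch, the crossover computation of formula (4.3) is a one-liner once C1, C2, and Ck are known; the Python below is illustrative.

    def crossover_point(c1, c2, ck, k):
        """CO from formula (4.3): intersection of line L1 (y = C1*x) with
        line L2 through the points (2, C2) and (k, Ck)."""
        return (k * c2 - 2.0 * ck) / ((k - 2) * c1 + c2 - ck)

    def choose_approach(num_tokens, co):
        # Set query for sets above the crossover point, tuple query otherwise.
        return "set" if num_tokens > co else "tuple"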


4.2 Estimating the Query Costs


To obtain the crossover point of a <n, parent(n)> pair for a node n, we need to find three costs (C1, C2, and Ck, defined in Section 4.1) for the pair. Each of the three costs consists of a preparation cost and an execution cost. Generally, database products provide a query optimizer that does cost-based query optimization and shows the selected execution plan, and the cost of the plan, to the user. By using the output of the query optimizer, we can remove the necessity of developing a module to estimate the query cost. However, the costs provided by the optimizer are the execution costs of the queries. Therefore, we need to find the preparation (verification, compilation, optimization, and cursor opening) costs of the queries through observation during query preparation.

Let us assume that we are calculating the crossover point for a node n. The procedure for determining C1 of n will be as follows:

(1) Make a dummy tuple t of the memory node for the node n.

(2) Form an SQL statement by substituting t into TQT(n, parent(n)).

(3) Run the query optimizer with the plan explain flag on.

(4) Obtain the cost of the plan by parsing the output of the optimizer.

In step (1) above, the values of the joining attributes of the tuple need to be determined with caution when the joining nodes maintain the distribution of column values. The costs of executing steps (2) and (3) also need to be measured. In fact, the cost of step (3) needs to be separated into the query preparation cost and the cursor open cost if dynamic SQL is to be used in the tuple query approach. Parsing the plan explanation output by the optimizer is not only time-consuming but also not a streamlined implementation style. Therefore, we recommend that future database product implementers provide an API function for obtaining the cost of the chosen query execution plan. C1 will be the sum of the costs of steps (2) and (3) and the plan cost found in step (4). However, the calculation method will vary depending on the situation.

The procedure for determining C2 of n will be as follows:

(1) Make a dummy relation TR with two dummy tuples of the memory node for the node n.

(2) Form an SQL statement by substituting TR into SQT(n, parent(n)).

(3) Run the query optimizer with the plan explain flag on.

(4) Obtain the cost of the plan by parsing the output of the optimizer.

C2 will be the sum of the costs of steps (2) and (3) and the plan cost found in step (4). The procedure for determining Ck of n is the same as the procedure for determining C2, except that the number of tuples in TR is k instead of 2.
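The steps above can be sketched against a generic Python DB-API connection. Both the EXPLAIN form and the "Estimated Cost" pattern in the plan text vary by product and are assumptions here; a real implementation must use the host DBMS's own explain facility (and, as recommended above, an API call would be preferable to parsing).

    import re
    import time

    def plan_cost(conn, sql):
        """Steps (2)-(4) of Section 4.2, sketched: explain the query, time
        the preparation, and parse an estimated cost out of the plan text.
        The 'EXPLAIN' syntax and the cost pattern are hypothetical."""
        start = time.perf_counter()
        cur = conn.cursor()
        cur.execute("EXPLAIN " + sql)                    # steps (2) and (3)
        text = "\n".join(str(row) for row in cur.fetchall())
        prepare_seconds = time.perf_counter() - start    # measured cost of (2)-(3)
        match = re.search(r"Estimated Cost:\s*([0-9.]+)", text)   # step (4)
        cost = float(match.group(1)) if match else 0.0
        # The two quantities must be converted to a common unit before summing.
        return prepare_seconds, cost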


4.3 Determining Query Approach Type


In Section 4.2 the implicit assumptions are that the tokens are in main memory for the tuple query approach and that the tokens are in a database table for the set query approach. However, in reality, two other scenarios are possible: a token set larger than the crossover point may be stored in main memory, not in a database table, and a token set smaller than the crossover point may be stored in a database table, not in main memory. This can happen because we estimate the appropriate query approach type for a node, p, and store the tokens that will be propagated into p accordingly (see Figure 4.3 for more details). This is why we need to consider the cost of changing the location of the token set before we can determine the query approach type.

We are going to associate three functions with each memory node. The first function calculates the cost difference between the tuple query approach and the set query approach. When the token set is larger than the crossover point, this function returns the cost advantage of the set query approach over the tuple query approach; when the token set is smaller than the crossover point, it returns the cost advantage of the tuple query approach over the set query approach. We will call this function cost_difference. cost_difference is found by subtracting the cost of the set query (line L2 of Figure 4.1) from the cost of the tuple query (line L1 of Figure 4.1) and taking the absolute value. By (4.1), (4.2), and (4.3), cost_difference is the value of formula (4.4), where x is the token set size:

    | (C1 − (Ck − C2)/(k − 2)) × (x − (k × C2 − 2 × Ck)/((k − 2) × C1 + C2 − Ck)) |    (4.4)


The second function returns the cost of converting a set of tokens in main memory into a database table. We will call this function mm_to_table_cost. It can be calculated using a method similar to the one used in calculating the crossover point. That is:

- Estimate the cost of creating a temporary table and inserting one tuple into it. Let this cost be S1.

- Estimate the cost of creating a temporary table and inserting k tuples into it. Let this cost be Sk.

- Formulate a linear function based on S1 and Sk found above.

The linear function receives the number of tokens as a parameter and returns the cost. Furthermore, a common function can be shared among all of the nodes with tuples of the same length.

The third and last function returns the cost of loading tokens in a database table into main memory. We will call this function table_to_mm_cost. It can be calculated using a method similar to the one used in calculating mm_to_table_cost. That is:

- Estimate the cost of fetching one tuple from a temporary database table. Let this cost be F1.

- Estimate the cost of fetching k tuples from a temporary database table. Let this cost be Fk.

- Formulate a linear function based on F1 and Fk found above.

The linear function receives the number of tuples as a parameter and returns the cost. A common function can also be shared among all of the nodes with tuples of the same length.
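Both mm_to_table_cost and table_to_mm_cost are simply lines through two measured points, so one helper suffices; a minimal sketch:

    def linear_cost(x1, y1, x2, y2):
        """Line through two measured (size, cost) points; used to build
        mm_to_table_cost from (1, S1), (k, Sk) and table_to_mm_cost from
        (1, F1), (k, Fk)."""
        slope = (y2 - y1) / (x2 - x1)
        return lambda n: y1 + slope * (n - x1)

    # e.g., with measured costs S1 and Sk for a given tuple length:
    # mm_to_table_cost = linear_cost(1, S1, k, Sk)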

We assume that all the tokens arriving at an α node are in main memory and not in a database table. Hence, when a token set, S, arrives at an α memory node, n, the query approach type for n will be determined by the algorithm (Figure 4.2) written in SPARKS [41].























if (size(S) > n.CO) and
   (n.cost_difference(size(S)) > n.mm_to_table_cost(size(S)))
then create a temporary table out of the tokens in the main-memory token set;
     use the set query approach
else use the tuple query approach
endif

Figure 4.2: The decision of a query approach type

[Figure: a set S = {t1, t2, ..., tm} of m tokens arrives at node n; about m×e result tokens are propagated to the parent β node p.]

Figure 4.3: Initial query approach type for a β node



Let a β node, p, be the parent node of n. To determine the location in which to store the result of the query or queries invoked to propagate tokens to p from n, we need to estimate the appropriate query approach type for p. See Figure 4.3, where a set, S, of m tokens arrives at n. Let e be the estimated number of tokens that will be inserted into p per token arriving at n. If the total number of token insertions into p due to S, m×e, is greater than










CO of p, then storing the result of the join(s) of the tokens in S in a temporary table will be more efficient. This is because we save the time needed to access the result tuples through a cursor and insert them into a temporary table in order to prepare the set query approach. Otherwise (when m×e is less than or equal to CO of p), we will accumulate the result in main memory to prepare the tuple query approach.




Table 4.1: Determining the query type for a β node

case  condition                   initial (estimated)  size of        final query type
                                  query type           actual result
1     in_token × est_token ≤ CO   tuple                ≤ CO           tuple
2     in_token × est_token ≤ CO   tuple                > CO           if (cost_difference > mm_to_table_cost) then set, else tuple
3     in_token × est_token > CO   set                  ≤ CO           if (cost_difference > table_to_mm_cost) then tuple, else set
4     in_token × est_token > CO   set                  > CO           set


Table 4.1 shows how the final query type for a β node is chosen, depending on the actual number of tokens propagated into the node and the values of three cost functions of the node:

(1) the cost difference between the tuple query approach and the set query approach (cost_difference);

(2) the cost of converting a set of tokens in main memory into a database table (mm_to_table_cost);

(3) the cost of loading tokens in a database table into main memory (table_to_mm_cost).

In Table 4.1, in_token represents the number of tokens inserted into a child node c of a β node n, and est_token represents the estimated number of tokens that will be propagated into the β node per incoming token into c. Depending on the initial (estimated) query type, the tokens that are going to be inserted into the β node will be stored in main memory or in a temporary table. The column size of actual result gives the comparison between the actual number of tokens being propagated into the β node and the CO of the β node. The final column, final query type, shows the query type determined after considering the benefit and the extra cost.
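The decision in Table 4.1 can be restated as a small function; a sketch in Python, where the three cost functions are assumed to take the actual result size as their argument:

    def final_query_type(initial_type, actual_size, co,
                         cost_difference, mm_to_table_cost, table_to_mm_cost):
        """Table 4.1: initial_type is 'tuple' (cases 1-2) or 'set' (cases 3-4)."""
        if initial_type == "tuple":
            if actual_size <= co:                                   # case 1
                return "tuple"
            if cost_difference(actual_size) > mm_to_table_cost(actual_size):
                return "set"                                        # case 2
            return "tuple"
        if actual_size > co:                                        # case 4
            return "set"
        if cost_difference(actual_size) > table_to_mm_cost(actual_size):
            return "tuple"                                          # case 3
        return "set"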


4.4 Token Set Propagation Algorithms


In this section, we present the algorithms that can be used to propagate token sets that arrive at a memory node, n, to the parent node of n. The procedure start_propagation (Figure 4.4) is normally invoked against a β node. Among the tokens in a CTS (see Section 7.4 for more details about the CTS), the + tokens that arrive at the same α node, n, can be stored in a table and propagated to the parent node of n using the set query approach; in that case, start_propagation is invoked against an α node. Therefore, the first parameter, n, of start_propagation is either an α node or a β node.












procedure start_propagation (n, S)
// n: an α or β memory node //
// S: a + token set arriving at n, an in-memory structure //
// Action: conditionally converts S into a stored table and calls propagate //
(1) if n is a P-node then
(2)     execute the rule action associated with n using the tokens in S
(3) else
(4)     if (size(S) > n.CO) and              // size(S): the number of tuples in S //
(5)        (n.cost_difference(size(S)) > n.mm_to_table_cost(size(S)))
(6)     then convert S into a database table
(7)     endif
(8)     propagate (n, S)
(9) endif
end start_propagation

Figure 4.4: Start the propagation of a token set



The procedure start_propagation accepts two parameters: a memory node, n, and a + token set, S, that arrives at n. If the size of S is larger than the crossover point of n and the benefit (cost_difference) is estimated to exceed the cost (mm_to_table_cost), then S is transformed into a database table. The procedure then calls the procedure propagate, which is recursive and calls two procedures: prepare_set and prepare_tuple.











procedure propagate (n, S)
// n: a memory node; let p be the parent node of n //
// S: a + token set stored either in main memory or in a database table //
// S2: a + token set; stores the tokens that will be propagated to p //
// Action: inserts S into n and propagates S to p //
(1) insert S into n                          // use INSERT ... VALUES or INSERT ... SELECT //
(2) if (size(S) × n.est_token) > p.CO then
(3)     prepare_set (n, S, S2)               // cases 3 and 4 (of Table 4.1) //
(4) else prepare_tuple (n, S, S2)            // cases 1 and 2 //
(5) endif
(6) if p is a P-node then
(7)     execute the action of the rule for n using the tokens in S2
(8) else
(9)     if (S2 in m.m.) and (size(S2) > p.CO) and                  // case 2 //
(10)        (p.cost_difference(size(S2)) > p.mm_to_table_cost(size(S2))) then
(11)        convert S2 into a database table
(12)    endif
(13)    if (S2 is a stored table) and (size(S2) ≤ p.CO) and        // case 3 //
(14)        (p.cost_difference(size(S2)) > p.table_to_mm_cost(size(S2))) then
(15)        convert S2 into a m.m. structure
(16)    endif
(17)    propagate (p, S2)
(18) endif
end propagate

Figure 4.5: Insert and propagate a token set














procedure prepare_set (n, S, S2)
// n: a memory node; let p be the parent node of n //
// S: a + token set that arrives at n //
// S2: a + token set, a database table, result parameter //
// Action: finds tokens that will be propagated to p and stores them into S2 //
(1) initialize a database table S2           // prepare set query //
(2) if S is in m.m. then
(3)     for each token t in S do
            // Let Q be an SQL statement that is obtained by substituting t //
            // into TQT (n, p). //
(4)         run "Insert into S2 Q"           // execute tuple query, prepare set query //
(5)     repeat
(6) else                                     // S is a database table. //
            // Let Q be an SQL statement that is obtained by substituting S //
            // into SQT (n, p). //
(7)     run "Insert into S2 Q"               // execute set query, prepare set query //
(8) endif
end prepare_set

Figure 4.6: Preparing a set query approach









The procedure propagate (Figure 4.5) accepts two parameters: a memory node, n, and a + token set, S, that arrives at n. It inserts the tokens in S into n. If the estimated number of tokens that will be propagated to the parent node of n, p, is greater than the crossover point of p, then it prepares a set query approach; otherwise, it prepares a tuple query approach. Later, when the token set that will be propagated to p has been found, the procedure re-evaluates the efficiency of the original query approach type and determines whether to keep it. If it decides not to keep the original query type, it changes the location of S2. Finally, it calls itself recursively, passing the appropriate parameters.

The procedure prepare_set (Figure 4.6) accepts three parameters: a memory node, n, a + token set, S, that arrives at n, and a result parameter, S2. The parameter S2 is a database table. The procedure finds the tokens that will be propagated from n to p, stores them into S2, and returns S2 to the calling procedure.

The procedure prepare_tuple (Figure 4.7) accepts three parameters: a memory node, n, a + token set, S, that arrives at n, and a result parameter, S2. The parameter S2 is an in-memory structure. The procedure finds the tokens that will be propagated from n to p, stores them into S2, and returns S2 to the calling procedure.
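To make the overall control flow easier to follow, the recursion of Figures 4.4-4.7 can be condensed as below. This is a sketch rather than the SPARKS original: node objects with parent, CO, est_token, and the cost functions, plus the helpers prepare_set, prepare_tuple, to_table, and to_memory, are all assumed names.

    def propagate(n, S):
        p = n.parent
        n.insert(S)                                  # INSERT ... VALUES / SELECT
        if len(S) * n.est_token > p.CO:              # cases 3 and 4 of Table 4.1
            S2 = prepare_set(n, S)                   # result lands in a table
        else:                                        # cases 1 and 2
            S2 = prepare_tuple(n, S)                 # result lands in main memory
        if p.is_p_node:
            p.execute_rule_action(S2)
            return
        # Re-evaluate the estimated choice now that the real size of S2 is known.
        if S2.in_memory and len(S2) > p.CO and \
                p.cost_difference(len(S2)) > p.mm_to_table_cost(len(S2)):
            S2 = to_table(S2)                        # case 2
        elif not S2.in_memory and len(S2) <= p.CO and \
                p.cost_difference(len(S2)) > p.table_to_mm_cost(len(S2)):
            S2 = to_memory(S2)                       # case 3
        propagate(p, S2)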


4.5 Summary and Further Studies


To propagate multiple tokens that arrive at a β node simultaneously, either the tuple query approach or the set query approach can be used. The set query approach is beneficial for a large token set, while the tuple query approach is good for a small token set.













procedure prepare_tuple (n, S, S2)
// n: a memory node; let p be the parent node of n //
// S: a + token set that arrives at n //
// S2: a + token set, an in-memory structure, result parameter //
// Action: finds tokens that will be propagated to p and stores them into S2 //
(1) initialize S2 in main memory             // prepare tuple query //
(2) if S is in m.m. then
(3)     for each token t in S do
            // Let Q be an SQL statement that is obtained by substituting t //
            // into TQT (n, p). //
(4)         result <- run Q                  // execute tuple query //
(5)         append result to S2              // prepare tuple query //
(6)     repeat
(7) else                                     // S is a database table. //
            // Let Q be an SQL statement that is obtained by substituting S //
            // into SQT (n, p). //
(8)     result <- run Q                      // execute set query //
(9)     put result into S2                   // prepare tuple query //
(10) endif
end prepare_tuple

Figure 4.7: Preparing a tuple query approach









In this chapter, we proposed a method that determines the crossover point between the large sets and the small sets. Our method uses interpolation on costs that can be obtained by parsing the query plans produced by the query optimizer. In relation to this, we recommend that future database product implementers provide an API function for obtaining the cost of the chosen query execution plan. We also proposed a strategy that dynamically decides between the tuple query approach and the set query approach when propagating tokens that arrive at a β node. The proposed strategy uses three functions: cost_difference, mm_to_table_cost, and table_to_mm_cost.

To help implement the strategy, we presented a set of algorithms that can be used to propagate token sets that arrive at a memory node. The algorithms consist of four procedures: start_propagation, propagate, prepare_set, and prepare_tuple. The strategy can be applied starting from either an α node or a β node.

Adelberg et al. [4] used the Forced Delay recomputation algorithm in maintaining derived data to reduce the cost. They exploited update locality to improve transaction response time. Similarly, if multiple tokens that apply to the same tuple were merged into a single token, we believe that the system performance would be further increased.















CHAPTER 5
PARALLEL EXECUTION OF SQL STATEMENTS


Today, many systems operate in an environment where multiple processors work in parallel to provide services. Among the parallel architectures, symmetric multiprocessing computers (SMPs) are widely used [54] and are closely related to traditional single-CPU processors. Accordingly, many database products provide features to exploit the power of SMPs in executing SQL statements. The resources that allow parallel query execution and make higher throughput possible are called parallel resources. The parallel resources include CPUs, memory, and disk I/O.

While TriggerMan is in execution, it generates many SQL statements

automatically (see Chapter 4). Therefore, the tuning of parallel execution of the

statements is very important for the performance of the system. The parallel execution of

SQL statements includes the parallel scanning of tables and the parallel processing of

multiple smaller processing units. The smaller processing units are parts of the original

query and can be computed independently. They can be parts of the operation of join,

sort, aggregation, etc. The performance of an individual query can increase when the

tables accessed from it are partitioned across multiple disks.


For example, let us assume that α2 in Figure 5.1 is partitioned on column y (y ≠ x) and no index is defined on column x. When a token t arrives at α1, we need to do a linear search of all partitions of α2 to find the tuples that match t. In this case, we can increase the scanning speed by reading the partitions of α2 in parallel.




[Figure: a token t arrives at node α1, which joins to node α2 on α1.x = α2.x. Remarks: α2 is partitioned on y, with no index on x.]

Figure 5.1: Example of a parallel scanning benefit



The parallel execution of SQL statements naturally exploits data-level concurrency and will lead to increased scalability of TriggerMan (the scalability issue is discussed further in Chapter 6). The number of processors employed for a computation is known as the degree of parallelism (DOP). To fully utilize the parallel resources, the appropriate parallel execution strategy (the DOP, and whether parallel scanning will be employed) for each SQL statement needs to be selected. Inappropriate strategy selection can lead to poor system performance [46].

In this chapter, we will introduce a method that finds the best parallel execution

strategy for an SQL statement. To develop a generally applicable method we studied the

parallel features of three database products. They are explained in Section 5.1. The

parameters that are used in calculating the DOP of an SQL statement are listed in Section

5.2. The parallel execution strategy for an SQL statement is introduced in Section 5.3.

Finally, a summary is given in Section 5.4.









5.1 Parallel Features of Three Database Products


Instead of developing a parallel feature utilization method for a specific database

product, we are trying to develop a method that is applicable to any database product. To

do that, we had to find the common parallel features of various database products first.

This section compares the parallel features of three database products. Table 5.1 shows a comparison of the parallel features of three widely used object-relational or relational database products: Informix Dynamic Server with Universal Data Option (UDO) [46] (we will call this Informix/UDO), Oracle8 Enterprise Edition (EE) [64], and DB2 Universal Database V.5 [44].

In Informix/UDO, PDQpriority specifies the percentage of parallel resources for a

query, an application, or an instance. In Oracle8, users can determine how aggressively

the optimizer will attempt to parallelize a given execution plan of a query. IBM DB2

allows users to choose the DOP for an SQL statement. All three products support parallel

operation at the intra-query level. The parameter that is used to enable the parallel

features of a specific database product determines the number of processors that will be

employed for an SQL statement. Conversely, if we know the appropriate DOP for an

SQL statement, we can get the value of the parameter used by any database product. For

example, a DOP can be translated into what is known as PDQpriority of Informix/UDO.
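For instance, one plausible translation treats PDQpriority as the percentage of the machine's parallel resources that the desired DOP represents. The mapping below is our assumption for illustration, not documented Informix behavior.

    import math

    def dop_to_pdqpriority(dop, num_cpus):
        # Reserve roughly dop/num_cpus of the parallel resources (assumed mapping).
        return min(100, math.ceil(100.0 * dop / num_cpus))

    # e.g., dop_to_pdqpriority(4, 16) == 25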


5.2 Parameters Needed to Calculate the DOP of an SQL Statement


The parameters relevant to the calculation of the DOP of an SQL statement when

TriggerMan is implemented in Informix/UDO are:











Table 5.1: Parallel features of the three database products

Partition unit:
  Informix/UDO: table, index
  Oracle8/EE: table, index
  DB2 Universal DB v.5: table, index, and more

Operations done in parallel:
  Informix/UDO: scan, join, aggregation, sort, group, etc.
  Oracle8/EE: scan, join, summarizing, sort, group, etc.
  DB2 Universal DB v.5: scan, join, aggregation, set operation, etc.

Parallel feature enabling methods:
  Informix/UDO: PDQpriority: degree of parallelism (DOP) and parallel scan for a query, application, or instance
  Oracle8/EE: parallel hint: DOP for a table in a query (1st priority); table's defined DOP (2nd); the DOP of a query is the max of the DOPs of its tables
  DB2 Universal DB v.5: Degree option of pre-compile or bind: DOP for a static query; special degree register: DOP for a dynamic query

Distinctive features:
  Informix/UDO: scan thread reservation; actual DOP = desired DOP
  Oracle8/EE: actual DOP ≤ desired DOP
  DB2 Universal DB v.5: actual DOP ≤ desired DOP









- pg_byte: number of bytes in a page

- t_byte: number of bytes in a tuple

- n_tuple: number of tuples in a node, across all partitions

- n_part: number of partitions in a node

- n_page: number of pages in a node

- r_row: expected number of tuples in the result

- r_page: expected number of pages in the result

- IO_per_scan_proc: number of pages of a Gator network node that need to be read by a scan process

- IO_spd: effective number of tuples read by all the scan processes during one page I/O time, IO_spd = r_row / IO_per_scan_proc

- CPU_spd: number of tuples that can be processed (tested against a selection condition and/or a join condition) during one page I/O time; CPU_spd is ∞ when no condition exists

- spd_ratio: ratio of the tuple reading speed to the tuple processing speed, spd_ratio = IO_spd / CPU_spd

- fanout: fanout of a node in a B+-tree









5.3 Parallel Execution Strategy for an SQL Statement


The parallel execution strategy for an SQL statement consists of two parts: whether parallel scan will be employed, and the DOP. The employment of parallel scan for an SQL statement depends on the necessity of parallel scan for the Gator network nodes that are accessed by the statement. If at least one node accessed by an SQL statement is to be scanned in parallel, then the statement will be executed with parallel scan; otherwise, it will be scanned serially. This is because an SQL statement can access multiple nodes; nevertheless, the SQL statement is the finest level at which we can tune the parallel features of a DBMS product (see Table 5.1). In Informix/UDO, the actual number of scan processes reserved for the SQL statement will be determined by the number of partitions in the nodes that are accessed by the statement.

When a Gator network node is partitioned, parallel scan will be employed if it is certain that the target tuples are spread across multiple partitions. The conditions under which the target tuples are known to be spread across the partitions include:

- The data partitioning is done in a round-robin fashion.

- The node is the base table of an α node to be primed; if a selection condition is defined on the node, it should be on a different column (or set of columns) from the partitioning column.

- The node is the temporary table created to process a set of tokens arriving at a node using the set query approach (explained in Chapter 4).

The DOP for an SQL statement depends on both the tuple I/O speed (IO_spd) of the nodes accessed by the statement and the costs of the functions (CPU_spd) embedded in the selection or join conditions of the statement. In short, we will employ multiple CPUs for an SQL statement when expensive functions (CPU_spd slower than IO_spd) are embedded in the conditions of the statement. All factors that determine the DOP are properties of a Gator network node; in fact, the properties of a node interact and determine the DOP of the node. The properties of a node are:

- IO_spd -- determined by node properties such as tuple size, the existence of an index, the number of partitions, etc.

- Cost of functions in the selection condition (a part of CPU_spd) -- this is a property of a node because each selection condition belongs to a node.

- Cost of functions in the join condition (a part of CPU_spd) -- this can be thought of as a property of a node, because a join of three or more tables is performed by joining two tables at a time in most database systems [63].

As mentioned earlier, an SQL statement can access multiple nodes, and the finest level at which the parallel features of a DBMS product can be tuned is the SQL statement (see Table 5.1). Therefore, we need to determine the DOP of the SQL statement from the DOPs of the nodes it accesses. We will use the largest of the node DOPs as the DOP of the SQL statement; by doing this we can maximize the processing speed of the statement. This method is analogous to the one used in Oracle8 [64] (refer to Table 5.1), where the DOP of a query is the largest of the DOPs of the tables that the query accesses.

Now, the problem of determining the DOP of an SQL statement is reduced to determining the DOP of a node. We will determine the DOP of a Gator network node based mainly on the spd_ratio of the node. The IO_spd of a node partially determines the spd_ratio of the node (see Section 5.2), and IO_per_scan_proc partially determines IO_spd (see Section 5.2). IO_per_scan_proc depends on the type of index on the access column of the node (clustered index, non-clustered index, or no index). IO_per_scan_proc also depends on whether the node is partitioned and whether parallel scan is employed (partitioned with parallel scan, partitioned with serial scan, and non-partitioned, where serial scan is the only choice). Therefore, we need to consider the combinations of the values of these factors to calculate the IO_per_scan_proc of a node. The following paragraphs give eight formulas for IO_per_scan_proc, covering all combinations.


Formula 1 -- clustered index is defined, node is partitioned, and parallel scan is employed.
In this case, we assume that the needed tuples are evenly distributed across the partitions. Hence, the number of data pages that need to be read by a scan process is:

    ⌈(r_row × t_byte) / (pg_byte × n_part)⌉    (5.1)

The number of index pages that need to be read by a scan process is:

    ⌈log_fanout(n_tuple / n_part)⌉    (5.2)

From (5.1) and (5.2), IO_per_scan_proc is:

    ⌈(r_row × t_byte) / (pg_byte × n_part)⌉ + ⌈log_fanout(n_tuple / n_part)⌉

Formula 2 -- clustered index is defined, node is partitioned, and serial scan is used.
In this case, the number of index pages that need to be read by a scan process is the same as (5.2), and the number of data pages that need to be read is:

    ⌈(r_row × t_byte) / pg_byte⌉    (5.3)

From (5.2) and (5.3), IO_per_scan_proc is:

    ⌈(r_row × t_byte) / pg_byte⌉ + ⌈log_fanout(n_tuple / n_part)⌉

Formula 3 -- clustered index is defined, node is non-partitioned.
In this case, the number of data pages that need to be read by a scan process is the same as (5.1), and the number of index pages that need to be read is:

    ⌈log_fanout(n_tuple)⌉    (5.4)

From (5.1) and (5.4), IO_per_scan_proc is:

    ⌈(r_row × t_byte) / (pg_byte × n_part)⌉ + ⌈log_fanout(n_tuple)⌉

When a non-clustered index is defined on the access column, not all of the tuples in the pages read will be tested against the existing conditions. To obtain the number of pages that need to be read, we use the Yao approximation [75]. Given n tuples grouped into m pages, if k tuples are randomly selected from the n tuples, the expected number of pages touched is:

    Yao(n, m, k) = m × [1 − ∏_{i=1}^{k} (n×d − i + 1) / (n − i + 1)], where d = 1 − 1/m








Formula 4 -- non-clustered index is defined, node is partitioned, and parallel scan is employed.
In this case, IO_per_scan_proc is:

    Yao(n_tuple / n_part, n_page / n_part, r_row / n_part) × (1 + 1)

In the above formula, (1 + 1) accounts for the leaf-level index page and the data page (the I/O for the index pages above the leaf level is ignored).

Formula 5 -- non-clustered index is defined, node is partitioned, and serial scan is used.
In this case, IO_per_scan_proc is:

    Yao(n_tuple / n_part, n_page / n_part, r_row) × (1 + 1)

Formula 6 -- non-clustered index is defined, node is non-partitioned.
In this case, IO_per_scan_proc is:

    Yao(n_tuple, n_page, r_row) × (1 + 1)

When no index is defined, the node needs to be scanned sequentially.

Formula 7 -- no index is defined, node is partitioned.
Since no index is defined, IO_per_scan_proc remains the same whether parallel scan is employed or not. It is:

    n_page / n_part

Formula 8 -- no index is defined, node is non-partitioned.
In this case, IO_per_scan_proc is:

    n_page









Now we have the IO_per_scan_proc of a Gator network node for each combination of factor values. Hence, the IO_spd of a Gator network node can be obtained (see Section 5.2). We assume that the cost of each function in the selection and join conditions of a Gator network node can be obtained by referring to the system catalog. We further assume that the function costs are expressed in units of one page I/O time, or can be transformed into that unit. Using those function costs, the CPU_spd of a node can be obtained. Since we have the IO_spd and the CPU_spd of a node, we can get the spd_ratio of the node.

To employ n CPUs effectively, at least n units (pages or tuples) need to be processed, so that each CPU can handle at least one unit. Hence, when a clustered index is defined on the access column of a Gator network node, the DOP of the node is:

    min(r_page, spd_ratio)

When a non-clustered index is defined on the access column, the DOP is:

    min(r_row, spd_ratio)

Otherwise (when no index is defined), the DOP is:

    min(n_page / n_part, spd_ratio)

Now we have obtained the DOP of a Gator network node. The DOP of an SQL statement is the largest of the DOPs of the nodes accessed by the statement.
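Putting the pieces together, the Yao approximation and the DOP selection can be sketched as below; the node fields follow the Section 5.2 parameter list, and the index and partitioning attributes are assumed names.

    import math

    def yao(n, m, k):
        """Yao approximation [75]: expected pages touched when k of n tuples,
        stored in m pages, are selected at random."""
        d = 1.0 - 1.0 / m
        prod = 1.0
        for i in range(1, int(k) + 1):
            prod *= (n * d - i + 1) / (n - i + 1)
        return m * (1.0 - prod)

    def node_dop(node):
        # Section 5.3: at least one unit (page or tuple) per CPU employed.
        if node.index == "clustered":
            return min(node.r_page, node.spd_ratio)
        if node.index == "nonclustered":
            return min(node.r_row, node.spd_ratio)
        return min(node.n_page / node.n_part, node.spd_ratio)

    def statement_dop(nodes):
        # The DOP of the statement is the largest node DOP, as in Oracle8.
        return max(node_dop(n) for n in nodes)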









5.4 Summary


In this chapter, we introduced a method of tuning the parallel features of the host DBMS for the execution of the SQL statements initiated by the TriggerMan system. To develop a parallel feature tuning method generally applicable to any database product, the parallel features of three database products were studied. The parallel execution strategy (the appropriate DOP for parallel processing and when to use a parallel scan) for an SQL statement was developed based on the parallel execution strategies for the nodes accessed by the statement. The properties of a node considered in determining its parallel execution strategy are:

- partitioning of the node and the scheme of partitioning;

- presence of an index on the access column and the clustering of the index;

- costs of the functions in the selection and/or join conditions associated with the node.

The conditions that make parallel scan beneficial were introduced. We also provided the formulas that calculate the appropriate DOP for a Gator network node. The execution strategy for an SQL statement was determined using the following policies:

- An SQL statement is processed in parallel if at least one node accessed by the statement needs to be scanned in parallel.

- The largest of the DOPs of the nodes accessed by an SQL statement is used as the DOP of the statement.















CHAPTER 6
SCALABILITY


Scalability can be defined as how well a solution to a problem works as the size of the problem increases. The TriggerMan system will have good scalability if it performs well, on a given hardware environment, as the number of triggers, incoming tokens, etc., increases. There are many reasons why the TriggerMan system has to be scalable [38]. Among them, some major reasons are:

- The convenience of writing applications -- trigger systems can be used for active information processing for large sets of triggers, instead of writing custom applications to carry out triggering logic.

- The efficiency and convenience of active information distribution -- information critical to the survival of an enterprise can be delivered to the client as soon as it is available, without excessive waiting by the client.

- The enormous growth rate of Web-based applications [48] -- users can create a large number of triggers via the interfaces provided by the applications.

- The development of parallel computers, especially SMP machines -- to fully utilize the available parallel machine, the TriggerMan system needs to be scalable.

A study on parallel implementations of rule-based systems shows that only a limited speed-up can be obtained [31],[49]. The reasons for the small speed-up in [31] are: (1) the small number of rules relevant to each change to data memory; (2) the large variation in the processing requirements of the relevant rules; and (3) the small number of changes made to data memory between synchronization steps. However, in TriggerMan these reasons do not necessarily hold: (1) the number of triggers relevant per change to a data source grows with the total number of triggers; (2) the variation in the processing requirements of the relevant triggers can be limited intentionally; and (3) in principle, there is no synchronization step in TriggerMan. This gives hope that TriggerMan can achieve higher parallel speed-ups than those observed in parallel production systems.

Other properties of TriggerMan that increase the possibility of high speed-up through parallel processing are:

- Multiple triggers can have exactly the same conditions but different actions.

- The size of a data source (database) is virtually unlimited.

- The change rates of data sources can be extremely high; changes are made by external applications.

- Users want to define as many triggers as possible.

Therefore, parallel processing is expected to increase the scalability of TriggerMan. To do parallel processing, we need to identify the concurrencies that can be exploited. The four kinds of concurrency that TriggerMan can exploit are:

exploited. Four kinds of concurrency that TriggerMan can exploit are:

- Token-level concurrency -- multiple tokens can be processed concurrently (Chapter 7).

- Condition-level concurrency -- multiple conditions can be tested against a single token concurrently (Section 6.1).

- Rule action concurrency -- the actions of multiple rules, or multiple instantiations of the action of a single rule, can be executed concurrently (Section 6.2).

- Data-level concurrency -- a set of tuples in an α or β node of a Gator network [36] can be processed in parallel (Section 6.3).

TriggerMan will maintain a task queue [38] in shared memory [46] to store incoming tokens and internally generated tasks. Basically, the task queue is the same concept as the process queue in an operating system. The task queue performs a central role in exploiting the available concurrencies.

Parallel token processing (token-level concurrency) causes problems in the semantic correctness of trigger processing. As a result, we had to find a way to incorporate parallel token processing into an asynchronous trigger system in a productive way. This is the motive behind the introduction of consistency levels for trigger processing. These topics are covered in Chapter 7. The following sections explain how the other three concurrencies can be exploited.


6.1 Condition-level Concurrency


The two kinds of conditions in a trigger condition are the selection condition and the join condition. The join conditions are checked as the tokens are propagated through the Gator networks; therefore, the join conditions are checked concurrently when the tokens are processed concurrently (Chapter 7). The selection conditions are checked concurrently using the normalized selection predicate index (SPI) structure (Figure 6.1) [40],[57]. The normalized SPI is made by applying common sub-expression elimination. See [38] and [40] for more details on the SPI. If a triggerID set is large, parallel processing of the triggerID set can be achieved by partitioning the set into multiple equal-size subsets (partitions) and processing the partitions in parallel [38]. If the constant set for an expression signature is large, then it too can be partitioned and processed in parallel.



[Figure: the normalized SPI is a tree: the predicate index root points to per-data-source predicate indexes; each data source predicate index holds an expression signature list; each expression signature has a constant set (a set of unique constants); each constant points to a triggerID set (the set of IDs of the different triggers having the same set of constants).]

Figure 6.1: Normalized Selection Predicate Index structure



When a new token arrives, it is inserted into the task queue and then passed to the root of the predicate index (which locates the token's data source predicate index). All of the constant sets of the data source need to be searched in order to find the triggerID sets that match the current token. After the triggerID sets are found, each trigger in each set needs to be processed for additional condition testing and action execution.








6.2 Rule Action Concurrency


If a token satisfies the conditions of multiple triggers, then executing the actions in parallel can increase the speed of TriggerMan. Since the normalized predicate index is used in TriggerMan, the triggers that are simultaneously satisfied must be in one triggerID set. By enqueueing the action of each trigger into the task queue, we can execute the actions in parallel. However, according to Amdahl's Law [65], the maximum speedup is limited by the speed of the serial operations (operations that must be performed by one process). The serial operations include enqueueing the tasks into the task queue, dequeueing them from the queue, and substituting the placeholders in the trigger definition with the values from the (compound) tokens.
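To make the limit concrete: Amdahl's Law states that with a serial fraction s of the work and N processors, the speedup is bounded by 1 / (s + (1 − s)/N). For example (illustrative numbers only), if the enqueueing, dequeueing, and substitution steps amount to s = 0.1 of the work, then even with N = 8 processors the speedup is at most 1 / (0.1 + 0.9/8) ≈ 4.7.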

Therefore, we need to incorporate triggerID set partitioning into TriggerMan. This partitioning is also related to condition-level concurrency (Section 6.1). The idea of triggerID set partitioning is depicted in Figure 6.2. When a token matches the constant associated with a partitioned triggerID set, the set is enqueued into the task queue partition by partition. After a triggerID partition is dequeued from the task queue, the triggers in the partition are processed sequentially using procedural control.






[Figure: a constant in the SPI points to a triggerID set that is split into N equal-size partitions; each partition is enqueued into the task queue as a separate task.]

Figure 6.2: Partitioned triggerID set
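A sketch of the partition-and-enqueue step of Figure 6.2 follows; the partition length threshold and the process_trigger helper are assumed names.

    from queue import Queue

    tasks = Queue()   # shared task queue (kept in shared memory in TriggerMan)

    def enqueue_triggerid_partitions(token, trigger_ids, task_queue, part_len):
        # Split the matched triggerID set into equal-size partitions and
        # enqueue each partition as an independent task (Figure 6.2).
        for i in range(0, len(trigger_ids), part_len):
            task_queue.put((token, trigger_ids[i:i + part_len]))

    def worker(task_queue):
        # Each worker dequeues one partition at a time; the triggers inside a
        # partition are processed sequentially (procedural control).
        while True:
            token, partition = task_queue.get()
            for trigger_id in partition:
                process_trigger(trigger_id, token)   # assumed helper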









Determining when and how to partition a triggerID set is an optimization problem and is left for further study. The factors related to this optimization problem are:

- the overhead of enqueueing a task into the task queue -- minimize it;

- the degree of parallelism (the number of partitions) -- maximize it;

- the execution path length of each partition -- keep it under a threshold.


6.3 Data-level Concurrency


Data-level concurrency can be exploited if we divide large tables into multiple partitions and utilize the parallel query features provided by the host DBMS and the underlying computer [21],[46]. In Chapter 5, we proposed a method of tuning the parallel features of the host DBMS for the efficient execution of SQL statements. When and how to partition a large Gator network node is left as a further study.


6.4 Summary and Further Studies


In this chapter, the properties of a trigger system that show the possibility of high speed-up through parallel processing were examined. We identified four kinds of concurrency that exist in a trigger system and can be exploited to increase the speed-up of the system: condition-level concurrency, rule action concurrency, data-level concurrency, and token-level concurrency.

In TriggerMan, a normalized SPI (selection predicate index) structure is used to find the α nodes at which a token arrives. A token and α node pair is enqueued in the shared task queue as a task. By letting multiple processors process the tasks in the







queue in parallel, the condition-level concurrency and the rule action concurrency can be

exploited. The utilization of token-level concurrency is discussed in Chapter 7.

We also presented the idea of partitioning a large triggerID set in the normalized

SPI structure. Through triggerID set partitioning, we can reduce the unparallelizable

portion of the work, which increases the speed-up of the system. Finding when and how

to partition the triggerID set is an interesting optimization problem and is left for further

study.















CHAPTER 7
TRIGGER PROCESSING CONSISTENCY LEVEL AND
PARALLEL TOKEN PROCESSING


In the context of discrimination network maintenance, a computation unit is an

atomic unit of processing concerning token(s) against one or more memory nodes of a

discrimination network. The processing of a token, tk, against a memory node, n,

includes applying tk to n, joining tk with the siblings of n if necessary, and propagating tk

or the result of joining to the parent nodes of n.

In this paper, we use two kinds of computation units: the reduced computation unit and the expanded computation unit. The reduced computation unit is the processing of a token against a single memory node. The expanded computation unit consists of the processing of a token against an α memory node, αi; the processing of the tokens propagated from αi against the parent node of αi; and so on, until no more tokens are propagated or the tokens reach the P-node. To exploit the power of parallel computers, parallel processing of computation units is essential. For the sake of simplicity, processing a token, t, against a memory node, n, means processing the computation unit that includes the processing of t against n.
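A minimal sketch of the expanded computation unit follows, assuming each node exposes an apply() that returns the tokens to propagate; the join logic and sibling access are elided, and all names are illustrative rather than TriggerMan internals.

    class Node:
        """Stand-in for a memory node; the P-node fires the trigger action."""
        def __init__(self, parent: "Node | None" = None, is_p_node: bool = False):
            self.parent = parent
            self.is_p_node = is_p_node

        def apply(self, token):
            # In a real network the token would join with sibling memories
            # here; this sketch propagates it unchanged.
            return [token]

    def process_expanded_unit(alpha: Node, token, fire_action) -> None:
        """Process one expanded computation unit: start at an α node and
        follow propagations upward until nothing propagates or the token
        reaches the P-node."""
        frontier = [(alpha, token)]
        while frontier:
            node, tk = frontier.pop()
            if node.is_p_node:
                fire_action(tk)          # token reached the P-node
                continue
            if node.parent is None:
                continue                 # nothing above: propagation stops
            for out in node.apply(tk):
                frontier.append((node.parent, out))

    # Example: an alpha -> beta -> P-node chain.
    p = Node(is_p_node=True)
    beta = Node(parent=p)
    alpha = Node(parent=beta)
    process_expanded_unit(alpha, "tk1", fire_action=print)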

Basically, we assume the tokens that arrive at different discrimination networks

can be processed in parallel. This will increase the performance of the system. To

further increase the system performance, we believe the parallel processing of the tokens

that arrive at the same discrimination network is necessary. For the sake of simplicity,









parallel token processing means parallel processing of the tokens that arrive at the same

discrimination network. However, uncontrolled parallel token processing creates

problems in the semantic correctness of trigger processing. The problems are as follows:

Out-of-order execution of the multiple instantiations of the action of a single trigger (out-of-order rule action execution).

Trigger action execution using a compound tuple that is created due to an

untimely joining error (Subsection 7.1.2). The compound tuple is called a

phantom compound tuple.

Failure to execute a trigger action since the system cannot detect a transient

compound tuple due to an untimely joining error or a transient tuple (lost

transient tuple).

Permanent corruption of a memory node of a discrimination network

(memory node corruption).

We conjecture that the above list of problems is complete because of the

following reasons. When the P-node of a discrimination network receives all and only

correct tokens, the only problem that can occur is the out-of-order rule action execution. When child nodes are correct and tokens join in a timely manner, all and only correct tokens arrive at an internal node. The P-node can receive incorrect tokens due to

either the untimely joining error or the memory node corruption. A memory node can be

corrupted either temporarily or permanently. The lost transient tuple problem can

corrupt them temporarily. Memory nodes are permanently corrupted when the tokens

that arrive at the same tuple are processed out-of-order or when an untimely joining error









occurs. The above discussion illustrates that the list of problems appears to be

comprehensive. A complete proof is left as further study.

Serial token processing against a discrimination network that does not have

virtual α nodes removes all the above problems and provides exact semantic

consistency in trigger processing. By serial token processing, we mean the serial

processing of tokens that are delivered from the ideal data sources. An ideal data source

delivers tokens in the commit order of the transactions to which the tokens belong.

Among the tokens that belong to the same transaction, an ideal data source delivers

tokens in the order of real execution.

However, to exploit the power of parallel computers and increase the performance

of the system, we need to relax the strict semantic requirements in trigger processing. In

other words, if some of the above problems were allowed to happen, then the system

performance could be increased through parallel token processing. Trigger users would determine the appropriate semantic requirements for their own triggers.

In this chapter, we will introduce the consistency levels of trigger processing and

the techniques to achieve them. The techniques primarily include the support for parallel

token processing. Section 7.1 gives the notational conventions and the definitions of

terms that are used throughout this chapter. Section 7.2 defines four consistency levels

(Level 0 through Level 3) of trigger processing. Sections 7.3, 7.4, 7.5, and 7.6 introduce

the techniques for achieving Levels 3, 2, 1, and 0, respectively. Section 7.7 discusses

the implementation alternatives of an asynchronous trigger system. Finally, Section 7.8

summarizes this chapter.









7.1 Notational Conventions and Definitions of Terms


This section contains the definitions of terms and the notational conventions that

apply to the rest of this chapter.


7.1.1 Notational Conventions


The notations used throughout Chapter 7 are as follows:

atk, atk1, atk2 : atomic tokens

tk, tk1, tk2 : atomic tokens or compound tokens

+, -, δ : specify insert, delete, and update, respectively

tp, tp1, tp2 : memory node tuples

tm1, tm2 : timestamps

(tp1, tp2), tp(tp1, tp2) : a compound tuple comprised of tp1 and tp2

atk1(+, tp1), atk2(δ, tp2) : atomic tokens with contents

atk3(+, tp3: tm3) : an atomic token with contents and a timestamp

tk4(+, tp4, tp5) : a compound token with contents

tk4(+, tp4: tm4, tp5: tm5) : a compound token with contents and timestamps

α1, α2, α3 : α memory nodes

β1, β2, β3 : β memory nodes

ts(tk1) : the timestamp of token tk1









7.1.2 Definitions of Terms


A family is a set of objects that have the same key and are related with the same

node of a discrimination network. The objects can be a token, a tuple, or a line in a

Stability Lookaside Buffer (SLB, see sections 7.5 and 7.6).

Two tokens or SLB lines that belong to the same family are comparable if they have different timestamp vectors and each timestamp of one is greater than or equal to the corresponding timestamp of the other. When two compound tokens are not comparable, they are incomparable.

Let tk1 and tk2 each be an atomic token, a compound token, or an SLB line, belonging to the same family. If each timestamp of tk1 is greater than or equal to the corresponding timestamp of tk2, then tk1 is younger than tk2. When tk1 is younger than tk2, we say tk2 is older than tk1. An atomic token or a compound token that is younger than any other token in the same family is called the youngest token.
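The younger/older relation can be made concrete in a few lines of Python; representing a timestamp vector as a tuple of integers is our assumption, not a detail from the dissertation.

    def is_younger(ts_a: tuple, ts_b: tuple) -> bool:
        """tk_a is younger than tk_b if the timestamp vectors differ and every
        component of tk_a is >= the corresponding component of tk_b."""
        return ts_a != ts_b and all(a >= b for a, b in zip(ts_a, ts_b))

    def are_comparable(ts_a: tuple, ts_b: tuple) -> bool:
        """Tokens in the same family are comparable when one is younger."""
        return is_younger(ts_a, ts_b) or is_younger(ts_b, ts_a)

    # (3, 5) is younger than (2, 5); (3, 4) and (2, 5) are incomparable.
    assert is_younger((3, 5), (2, 5))
    assert not are_comparable((3, 4), (2, 5))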

If a compound token, tk, is propagated from an atomic token atk, then atk is the

initiating token of tk.

When the tokens arrive from the data sources, the system accumulates them and

creates a batch. A cycle is the time during which the tokens in one batch are processed.

A cycle begins after the previous cycle has finished. After a cycle that executes a batch begins, no more tokens can be inserted into that batch.

When an atomic token is processed against a discrimination network node, if the

creation of a compound token is inconsistent with the real-world situation, then we call it

an untimely joining error. The two cases of untimely joining error are:









The creation of a compound tuple that never existed, because the components

of the compound tuple did not exist during the same time period.

The failure to create a compound tuple that existed for a short period of time.

The timing error is the maximum timestamp difference between the two tokens

that created an untimely joining error.

We say a memory node stabilizes if the parallel application of a set of tokens

arriving at the node produces the same final content as the content that would be

produced by the serial application of the same set of tokens. The term converge is also used to refer to the same situation. Similarly, stabilization and stability are used to mean the same concept as convergence.

An atomic SLB contains information related to the processing of the atomic tokens that arrive at an α node. A compound SLB contains information related to the processing of the compound tokens that arrive at a β node.



7.2 Trigger Processing Consistency Levels


Before the trigger processing consistency levels are defined, we will compare the

consistency levels with the degrees of consistency in the transaction processing and

recovery (Subsection 7.2.1). Then the criteria to define the consistency levels will be

introduced (Subsection 7.2.2) and the definitions of trigger processing consistency levels

will be given (Subsection 7.2.3).









7.2.1 Transaction Consistency Degrees and Trigger Consistency Levels


In a shared environment, to protect the database from inconsistencies, locking

protocols are used. Responsibilities for obtaining and releasing locks can be either

assumed by the user or delegated to the system. Motivated by the fact that some database

systems use automatic lock protocols and provide a limited degree of consistency, four

degrees of consistency were defined [27]. The lower the degree is, the more of the responsibility for locking is given to the user. Since the user knows the semantics of the data, fewer locks can be used when the user controls the locking. However, this can make programming difficult.

The purpose of the trigger consistency levels is to allow slight relaxation of

semantic consistency to increase the performance of the system significantly. The trigger

users can reduce costs by choosing appropriate consistency levels for their triggers.

The higher the level is, the lower the performance and the fewer the inconsistency problems. Due to the characteristics of the asynchronous trigger system, the tokens that

arrive at the system are from the committed transactions. Therefore, tokens need to be

processed in the transaction commit order to leave the system in a consistent state. The

degrees of consistency for a transaction are compared with the trigger processing

consistency levels in Table 7.1.

In summary, the degrees of consistency for a transaction are analogous to the

levels of consistency of trigger processing. However, the trigger processing consistency

levels require more complex techniques to consider the token processing order, virtual α nodes, etc. Simple read/write lock protocols are not enough for the trigger processing

consistency levels.









Table 7.1: The consistency degrees and the consistency levels

              Transaction consistency degrees      Trigger consistency levels

Similarity    The user needs to protect            The user needs to protect
              himself against the sources of       himself against the sources of
              inconsistencies that could           inconsistencies that could
              happen in the lower consistency      happen in the lower consistency
              degrees.                             levels.
              The purpose is to trade the          The purpose is to trade the
              increased difficulty of              consistency of the trigger
              programming for gains in the         action execution for the
              performance of the system.           performance of the system.

Difference    Transactions can be committed        Tokens need to be processed in
              in any order. The order is           the commit order of the
              determined dynamically.              transactions to which the
                                                   tokens belong.


7.2.2 Criteria of Consistency Level Definition


Among the problems of uncontrolled parallel token processing, the problems of the phantom compound tuple and the lost transient tuple can be explained using the terms untimely joining error and timing error. The latter two terms are defined in Subsection 7.1.2.

An example of an untimely joining error is given in Figure 7.1, where tk3 deletes tp3 from α3 and tk4 inserts tp4 into α4. The token tk3 precedes tk4. Assume that tp3 and tp4

join with each other. Following serial processing, tp3 would be deleted by tk3 before tp4

could join it. When two tokens are processed in parallel, if tk4 is processed before tk3,

then tp4 will incorrectly join with tp3. The compound tuple that consists of tp3 and tp4

will be inserted into β3 and could be used in executing the trigger action. This is an untimely joining error with a timing error of ts(tk4) - ts(tk3).









[Figure omitted: tk3(-, tp3) arrives at α3 and tk4(+, tp4) arrives at α4; ts(tk4) > ts(tk3); tp3 joins with tp4, so a phantom compound token (+, (tp3, tp4)) is created.]

Figure 7.1: Untimely joining error



An example of memory node corruption is given in Figure 7.2, where tk1 and tk2 arrive consecutively at α1. When tk1 and tk2 are processed in parallel, if tk2 is applied to tp1 before tk1, then tk1 needs to be discarded later. Otherwise (if tk1 is applied to tp1 later), tp1' will not exist after the two tokens are processed, which is wrong. (Hint: do not apply an older token after a younger token in the same family.) By discarding an older token, an α node will converge on the content that would be generated by serial token processing. The notion of convergence of a memory node plays an important role in defining the trigger processing consistency levels later in this section.



[Figure omitted: tk2(+, tp1') is applied to tp1 first; the older tk1(-, tp1) is discarded later; ts(tk2) > ts(tk1); key(tp1) = key(tp1').]

Figure 7.2: Convergence of a memory node
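The discard rule behind Figure 7.2 can be sketched as follows. The per-key table of the last applied timestamp is our simplification of the SLB described in sections 7.5 and 7.6, and a scalar timestamp stands in for the timestamp vector.

    last_applied: dict = {}   # tuple key -> timestamp of last applied token

    def try_apply(key, ts, apply_fn) -> bool:
        """Apply a token unless an equal-or-younger token of the same family
        has already been applied; discarding older tokens lets the α node
        converge on the serially produced content."""
        prev = last_applied.get(key)
        if prev is not None and prev >= ts:
            return False                  # older (or duplicate) token: discard
        last_applied[key] = ts
        apply_fn()
        return True

    # Figure 7.2: tk2 (ts=2) is applied first, so tk1 (ts=1) is discarded.
    assert try_apply("tp1", 2, lambda: None)
    assert not try_apply("tp1", 1, lambda: None)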









Among the problems of parallel token processing, memory node corruption can

never be allowed. This is because the state information of the system could be totally

corrupted, eventually, if this problem were allowed. The three remaining problems, whose consequences are less severe, will be allowed at some consistency levels to improve the performance of the system.

From the problems of uncontrolled parallel token processing, we derived the

following criteria for defining the consistency levels:

The execution order of the multiple instantiations of the action of a single

trigger.

The content or the stabilization of memory nodes.

The existence of an untimely joining error in a compound tuple.

The amount of timing error of an existing untimely joining error.


7.2.3 The Definition of Trigger Processing Consistency Levels


There are four trigger processing consistency levels: Level 0 through Level 3.

The higher the level is, the fewer semantic problems exist and the lower the system performance is. For example, Level 3, the highest level, has the lowest performance since it does not allow parallel processing of the tokens that

arrive at the same discrimination network. Meanwhile, Level 0 consistency only requires

the contents of memory nodes to converge and provides the highest performance.

Using the criteria given in Subsection 7.2.2, the trigger processing consistency

levels may be defined as:

Level 3. The action of a trigger T will be executed with Level 3 consistency if:








(a) The contents of the memory nodes of the discrimination network for T are

always correct.

(b) No untimely joining error exists. That is, all compound tuples generated in

the system consist of chronologically joined tuples.

(c) Multiple instantiations of the action of T are executed in the same order as

would be done if tokens that arrive at the discrimination network for T were

processed serially.

Level 2. The action of a trigger T will be executed with Level 2 consistency if:

(a) The contents of the memory nodes of the discrimination network for T are

always correct.

(b) No untimely joining error exists. That is, all compound tuples generated in

the system consist of chronologically joined tuples.

Level 1. The action of a trigger T will be executed with Level 1 consistency if:

(a) The contents of the memory nodes of the discrimination network for T

converge.

(b) The timing error in the joining tuples is limited to some fixed value.

Level 0. The action of a trigger T will be executed with Level 0 consistency if:

(a) The contents of the memory nodes of the discrimination network for T

converge.


The techniques that provide Levels 3, 2, 1, and 0 consistency while maximizing performance are explained in Sections 7.3, 7.4, 7.5, and 7.6, respectively.









7.3 Trigger Processing Consistency Level 3


Level 3 is the highest consistency level of trigger processing. We can provide

Level 3 consistency for a trigger T by serially processing the tokens that arrive at the

discrimination network for T. As a result, the expanded computation unit is implicitly

used for Level 3 consistency. Level 3 consistency has the lowest performance among the

four consistency levels.

By the definition of Level 3, the tokens that arrive at different discrimination

networks can be processed in parallel. Assume triggers T1 and T2 have discrimination networks N1 and N2, respectively. Further assume a token tk1 arrives at N1 and a token tk2 arrives at N2, consecutively. If the two tokens are propagated all the way up to their P-nodes, then the action of T2 could be executed before the action of T1. This is because tk1 and tk2 can be processed in parallel under the definition of Level 3 consistency.

It is necessary to inform the users who subscribe to multiple triggers with Level 3

consistency about the possibility of mixed execution of the actions of multiple triggers.

We say that a user subscribes to a trigger if he/she registers for the event that is raised by

the action execution of the trigger.

When one or more virtual α nodes exist in the discrimination network for a

trigger, a special technique needs to be employed to provide Level 3 consistency. This

technique is based on the use of a shadow table and is explained in Subsection 7.3.1.

However, the shadow table technique causes the duplicate compound token problem. To

remove the duplicate compound token problem, we developed a technique that is

explained in Subsection 7.3.2.









7.3.1 Support of Virtual α Nodes with Shadow Tables


The virtual α node was introduced in the A-TREAT algorithm [35] to reduce storage requirements. To join a token with a virtual α node, the base table of the virtual α node needs to be accessed. Since the base table always contains the most up-to-date tuples, a token that joins with the base table cannot see the base table content as of the point in time when the token was created. Therefore, an untimely joining error is unavoidable when a virtual α node exists in a discrimination network.


[Figure omitted: a stored α node α1 and a virtual α node α2 over a base table; tk1(+, tp1) and tk2(-, tp2) arrive; ts(tk2) > ts(tk1); the virtual α node's content resides only in the base table.]

Figure 7.3: Virtual α node and an untimely joining error


An example is given in Figure 7.3, which shows a discrimination network with one stored α node, α1, and one virtual α node, α2. Assume tp1 and tp2, belonging to α1 and α2, respectively, join together. At this point tk1 and tk2 arrive consecutively at the Gator network and tk1 is processed first. When the base table of α2 is accessed while tk1 is










being processed, it can occur that tp2 of the base table has already been deleted. Hence,

tp1 cannot join with tp2, which is an untimely joining error.

To remove the untimely joining error, we propose maintaining a copy of the base table in the state space of the system. The base table copy is called a shadow table. A shadow table is exactly the same as the base table except that it contains slightly older data. All virtual α nodes that are defined on the same base table can share the same shadow table. The sharing of a shadow table has two implications:

The maintenance of a shadow table is cheap, since only a single application of a token to the shadow table is needed irrespective of the number of virtual α nodes that share the shadow table.

Since a shadow table could be shared among multiple discrimination networks, the modification time of the shadow table is important to all discrimination networks that share it. Occasionally, the processing of the tokens that arrive at the discrimination networks that share a common shadow table needs to be serialized. This increases the system complexity and decreases the number of tokens that can be processed in parallel. An example is given in Figure 7.4.

In Figure 7.4, tk1, tk2, and tk4 arrive at two Gator networks, consecutively. Assume tk2 satisfies the selection conditions of both virtual α nodes, tp2 joins with tp1, and tp2 joins with tp4. If tk2 is processed before tk1, then a compound tuple of tp1 and tp2 will not be created. This is an untimely joining error. If tk4 is processed before tk2, then a compound tuple of tp2 and tp4 will be created. This is another untimely joining error. To prevent both untimely joining errors, tk1, tk2, and tk4 need to be processed consecutively. However, if the shadow table were not shared, then tk1 and tk4 could be processed in parallel. The advantages and disadvantages of a shadow table can be summarized as follows:


[Figure omitted: two Gator networks whose virtual α nodes share one shadow table; tokens tk1, tk2, and tk4 arrive with tm1 < tm2 < tm4.]

Figure 7.4: A shadow table supporting two virtual α nodes


Advantages

1. Accessing a shadow table is faster than accessing a base table.

2. The SLB structure and algorithm to stabilize a β node are simpler with a shadow table than without one. In other words, the β node

stabilization method of Strategy II (Subsection 7.5.2.2) is simpler than that of

Strategy III (Subsection 7.6.4).

3. A shadow table increases the trigger processing consistency level from 0 to

1.











Disadvantages

1. It takes time to create a shadow table.

2. Storage space is needed to store a shadow table, which is usually large.

3. A shadow table needs to be maintained, which requires CPU resources.
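A minimal sketch of shared shadow table maintenance follows, under our own assumptions about token shape; the point is that each token is applied once to the shadow copy regardless of how many virtual α nodes share it.

    class ShadowTable:
        """Slightly stale copy of a base table, shared by virtual α nodes."""
        def __init__(self) -> None:
            self.rows: dict = {}                    # key -> tuple contents

        def apply_token(self, op: str, key, row=None) -> None:
            """One application per token, shared by all virtual α nodes."""
            if op == "+":
                self.rows[key] = row                # insert
            elif op == "-":
                self.rows.pop(key, None)            # delete
            else:
                self.rows[key] = row                # update

        def join(self, predicate):
            """A virtual α node joins tokens against the shadow, not the
            base, so it sees data as of the time the tokens were created."""
            return [r for r in self.rows.values() if predicate(r)]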


7.3.2 Preventing Duplicate Compound Tokens


A reflexive join is an operation that joins a table with itself, on different columns.

Assume a trigger, T1, contains a reflexive join in its condition. If the reflexive join is implemented using virtual α nodes in the discrimination network for T1, then the parent node of the virtual α nodes could receive duplicate compound tokens. The duplicate tokens would execute a trigger action more than once using exactly the same data. This is

called the duplicate compound token problem.






[Figure omitted: create trigger T1 from R as r1, R as r2 when r1.x = r2.y and σ1(r1) and σ2(r2); the atomic token atk(+, tp1) arrives at two virtual α nodes, α1 and α2, which share the shadow table of R; tp1 joins with itself.]

Figure 7.5: Creation of duplicate compound tokens








An example is given in Figure 7.5, where we assume tp1 joins with itself and satisfies the selection conditions σ1 and σ2. The base tables of virtual α nodes are updated before the tokens are delivered to the system. Similarly, we apply the tokens from the base table R to the shadow table before they are propagated to α1 and α2. Then, when atk arrives at α1, it joins with the copy of itself in the shadow table and creates a compound token that will be propagated to β1. Later, when atk arrives at α2, it joins with the copy of itself in the shadow table and creates a duplicate compound token that will also be propagated to β1.

The creation of duplicate compound tokens has been shown. To detect and discard duplicate compound tokens, we propose the use of a Duplicate-Lookaside Buffer (DLB), which originated from the idea of an SLB (see sections 7.5 and 7.6 for the details of SLB usage). Each β node or P-node, n, such that n has two or more virtual α node children that have the same shadow table or base table, will be equipped with a DLB. The DLB contains a piece of information called a line for each tuple that joins with a copy of itself in the current cycle; each line contains a (key, timestamp) pair. The manipulation of the DLB appears in Figure 7.6.

Let us consider an example of duplicate compound token detection, continued from Figure 7.5. After atk is processed against α1, a compound token tk1(+, tp1, tp1) arrives at β1. Since tk1 passes the check of Step (i) of Figure 7.6, it creates a line (key(tp1), ts(tp1)) in the DLB of β1 by Step (ii). Later, when atk is processed against α2, another compound token tk2(+, tp1, tp1) arrives at β1. The token tk2 also passes the check of Step (i), and in Step (ii) finds a DLB line l such that ts(l) = ts(tp1). Hence, tk2 will be discarded. Now, the problem of duplicate compound tokens is resolved.


* A DLB is cleared after each cycle.
* The processing of a compound token tk that arrives at n is preceded by the
  following two steps:
  (i) Check whether tk is a potential duplicate compound token (i.e., whether
      the initiating token itk of tk comes from one of the virtual α nodes
      with the same shadow table or base table and appears in tk more than
      once). If so, go to step (ii); otherwise, skip step (ii).
  (ii) if the DLB of n contains a line l with the same key as itk then
           if ts(l) >= ts(itk) then discard tk
           else change the timestamp field of l to ts(itk)
           endif
       else initialize a line in the DLB using itk
       endif

Figure 7.6: Manipulation of the DLB
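The following runnable sketch restates the DLB check of Figure 7.6, reducing a compound token to the key and timestamp of its initiating token; the appears_twice flag stands in for the Step (i) test, and the scalar timestamp is our simplification.

    dlb: dict = {}   # key of initiating token -> its timestamp (one DLB per node)

    def clear_dlb() -> None:
        """A DLB is cleared after each cycle."""
        dlb.clear()

    def admit(key, ts, appears_twice: bool) -> bool:
        """Return True if the compound token should be processed, or False
        if it is a duplicate that must be discarded."""
        if not appears_twice:          # Step (i): not a duplicate candidate
            return True
        if key in dlb:                 # Step (ii)
            if dlb[key] >= ts:
                return False           # duplicate compound token: discard
            dlb[key] = ts              # younger initiating token: refresh line
            return True
        dlb[key] = ts                  # first sighting in this cycle
        return True

    # Continuing Figure 7.5: the second copy of (tp1, tp1) is discarded.
    assert admit("tp1", 42, appears_twice=True)
    assert not admit("tp1", 42, appears_twice=True)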




7.4 Trigger Processing Consistency Level 2


Compared to Level 3, Level 2 removes the requirement of serial execution of

multiple trigger action instantiations. To elaborate, when n tuples arrive at the P-node of

the discrimination network for a trigger T, the n tuples create n instantiations of the action of T. Level 2 consistency allows those n action instantiations to be processed in any

order. Unless otherwise stated, each trigger mentioned in this section is assumed to be defined with Level 2 consistency.


To improve performance, the tokens that arrive at the α nodes of the discrimination network for a trigger can be processed in parallel as long as they do not violate the conditions of Level 2 consistency. As an example, a Gator network for trigger T1 is shown in Figure 7.7. In the figure, n + tokens arrive at α1 while no token arrives at α2. In this case the n tokens can be processed in parallel. Note that the processing of the n tokens does not corrupt memory nodes α1 and α2 and does not cause any untimely joining error.





[Figure omitted: n + tokens arrive at α1 while no token arrives at α2; α1 and α2 feed the P-node.]

Figure 7.7: A Gator network for trigger T1



In another case, multiple tokens that arrive at the α nodes of a discrimination network for a trigger cannot be processed in parallel. An example is given in Figure 7.8, where tokens tk1 and tk2 arrive consecutively at the Gator network for a trigger T2.










[Figure omitted: tk1(+, tp1) arrives at α1 and tk2(-, tp2) arrives at α2; ts(tk1) < ts(tk2); serial processing would insert tk3(+, tp1, tp2) into the P-node.]

Figure 7.8: A Gator network for trigger T2



Assume that tp1 joins with tp2. If tk1 and tk2 are processed serially, then a compound token, tk3(+, tp1, tp2), will be inserted into the P-node. However, if tp2 is processed before tp1 when they are processed in parallel, then tk3 will not be created. This is an untimely joining error. Therefore, to remove the untimely joining error and guarantee Level 2 consistency, tk1 and tk2 must be processed serially.

As we discovered, among the tokens that arrive at the same discrimination

network, some tokens can be processed in parallel while others cannot. Among the

tokens that arrive at the same discrimination network, a set of consecutive tokens that can

be processed in parallel is called a Concurrent Token Set (CTS). The tokens that are

propagated from the tokens of a CTS can also be processed in parallel. In other words,

the CTS property is inherited (CTS inheritance).

If the reduced computation unit (defined at the beginning of Chapter 7) is used, then the CTSs need to be recalculated at each node along the path from the α

node to the P-node. To utilize the CTS inheritance and to avoid the overhead of CTS









recalculation, the expanded computation unit is used for Level 2 consistency. However,

this might decrease the parallel speedup because of the increased job granularity.

Our proposed technique of detecting the CTS is presented in Subsection 7.4.1.

Subsection 7.4.2 explains our proposed architecture for efficient parallel processing of

tokens. Finally, Subsection 7.4.3 summarizes the techniques for processing triggers with

Level 2 consistency.


7.4.1 CTS Detection


Since the expanded computation unit is used for Level 2 consistency, only the CTSs among the set of tokens arriving at the α nodes of a discrimination network need to be detected. CTS detection requires finding the set of consecutive tokens that can be processed in parallel and making that set as large as possible. We need to detect the CTSs for each discrimination network in the system.

An example is shown in Figure 7.9, where n tokens, tk1, tk2, ..., tkn, arrive at a discrimination network consecutively in a cycle. When j-i+1 tokens, tki through tkj, are in the current CTS, as illustrated in Figure 7.9, we test whether tkj+1 can be included in the set. For tkj+1 to be included in the current CTS, it must be processable in parallel with each token in the current CTS.

Let N1 be the discrimination network for a trigger, T1. Then, the possibility of parallel processing of two tokens that arrive at N1 depends on the following factors (a code sketch follows this list):

The event types of the tokens.

The α nodes at which the tokens arrive.
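A greedy CTS detection loop can be sketched as below. The conflict test encodes only a same-family, same-α-node case and is a placeholder assumption; the dissertation's full rules depend on the event types and arrival nodes listed above.

    from dataclasses import dataclass

    @dataclass
    class Token:
        event: str        # "+", "-", or update
        alpha_node: str   # the α node the token arrives at
        key: str          # family key of the affected tuple

    def can_run_in_parallel(a: Token, b: Token) -> bool:
        """Placeholder conflict test: two tokens touching the same tuple
        family at the same α node must be serialized."""
        return not (a.alpha_node == b.alpha_node and a.key == b.key)

    def detect_cts(tokens: list) -> list:
        """Grow each CTS greedily with consecutive tokens, starting a new
        set at the first conflict, so each CTS is as large as possible."""
        sets, current = [], []
        for tk in tokens:
            if all(can_run_in_parallel(tk, m) for m in current):
                current.append(tk)
            else:
                sets.append(current)
                current = [tk]
        if current:
            sets.append(current)
        return sets

    # Two consecutive tokens on the same tuple at the same α node split the CTS.
    batch = [Token("+", "a1", "k1"), Token("+", "a2", "k2"), Token("-", "a1", "k1")]
    assert [len(s) for s in detect_cts(batch)] == [2, 1]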



