Citation
Enhanced techniques for timer trigger processing

Material Information

Title:
Enhanced techniques for timer trigger processing
Creator:
Noronha, Lloyd X. 1973- ( Dissertant )
Place of Publication:
Gainesville, Fla.
Publisher:
Department of Computer and Information Science and Engineering, University of Florida
Publication Date:
Copyright Date:
1999
Language:
English
Physical Description:
vii, 73 leaves : ill. ; 29 cm.

Thesis/Dissertation Information

Degree:
Master's ( Master of Science (M))
Degree Grantor:
University of Florida
Degree Disciplines:
Computer Engineering (CEN)
Committee Chair:
Hanson, Eric N
Committee Members:
Hammer, Joachim
Su, Stanley Y

Subjects

Subjects / Keywords:
Algorithms ( jstor )
Art dealing ( jstor )
Databases ( jstor )
Email ( jstor )
Expiration ( jstor )
Indexing ( jstor )
Predicates ( jstor )
Sales personnel ( jstor )
Timing devices ( jstor )
Vacations ( jstor )
Computer and Information Science and Engineering thesis, M.S
Database management ( lcsh )
Dissertations, Academic -- UF -- Computer and Information Science and Engineering ( lcsh )
City of Gainesville ( local )
Genre:
bibliography ( marcgt )
theses ( marcgt )

Notes

Abstract:
Timer-Driven Triggers allow users to monitor interesting tuples or interesting changes to tuples of a view. Trigger condition testing takes place periodically, i.e., once every few hours, days or weeks. When the timer of a trigger expires, a new copy of the view is retrieved and the trigger predicates applied to that view. Trigger predicates could be transition predicates referring to the attribute values of tuples of a view from the previous timer expiration. Timer Triggers differ from conventional triggers, in that the trigger testing is not done for updated tuples only. Timer triggers could be fired on tuples of a view simply because they satisfy the predicate trigger predicate. Since the trigger conditions are tested every so often, high update rate tables can have a lot of timer triggers defined on them. The goal of this thesis is to design a timer trigger system that works hand-in-hand with a trigger processor and complements its features. The semantics of the timer trigger differ from that of regular triggers and are explained by a brute-force approach. Several enhanced algorithms can be written that do better than the brute-force approach, but do not change the semantics. A timer trigger system also provides the user an option of replacing expensive join and aggregate triggers with timer triggers.
General Note:
Printout.
General Note:
Vita.
Thesis:
Thesis (M.S.)--University of Florida, 1999.
Bibliography:
Includes bibliographical references (leaves 70-72).

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
Copyright Lloyd X Noronha. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Resource Identifier:
2531467 ( ALEPH )
43757201 ( OCLC )

Downloads

This item has the following downloads:

PDF ( .pdf )

1999383_Page_03.txt

1999383_Page_06.txt

1999383_Page_10.txt

1999383_Page_32.txt

1999383_Page_71.txt

1999383_Page_23.txt

1999383_Page_20.txt

1999383_Page_63.txt

1999383_Page_19.txt

1999383_Page_69.txt

1999383_Page_64.txt

1999383_Page_44.txt

1999383_Page_66.txt

1999383_Page_14.txt

1999383_Page_33.txt

1999383_Page_79.txt

1999383_Page_25.txt

1999383_Page_35.txt

1999383_pdf.txt

1999383_Page_77.txt

1999383_Page_02.txt

1999383_Page_30.txt

1999383_Page_60.txt

1999383_Page_59.txt

1999383_Page_81.txt

1999383_Page_01.txt

1999383_Page_11.txt

1999383_Page_26.txt

1999383_Page_50.txt

1999383_Page_34.txt

1999383_Page_28.txt

1999383_Page_39.txt

1999383_Page_43.txt

1999383_Page_73.txt

1999383_Page_24.txt

1999383_Page_04.txt

1999383_Page_13.txt

1999383_Page_51.txt

1999383_Page_08.txt

1999383_Page_58.txt

1999383_Page_54.txt

1999383_Page_40.txt

1999383_Page_53.txt

1999383_Page_65.txt

1999383_Page_61.txt

1999383_Page_62.txt

1999383_Page_55.txt

1999383_Page_80.txt

1999383_Page_72.txt

EWQFNRL4L_6S1TAX_xml.txt

1999383_Page_16.txt

1999383_Page_41.txt

1999383_Page_22.txt

1999383_Page_52.txt

1999383_Page_29.txt

1999383_Page_67.txt

1999383_Page_17.txt

1999383_Page_48.txt

1999383_Page_70.txt

1999383_Page_75.txt

1999383_Page_07.txt

1999383_Page_21.txt

1999383_Page_74.txt

1999383_Page_37.txt

1999383_Page_76.txt

1999383_Page_27.txt

1999383_Page_38.txt

1999383_Page_46.txt

1999383_Page_57.txt

1999383_Page_56.txt

1999383_Page_47.txt

1999383_Page_78.txt

1999383_Page_31.txt

1999383_Page_15.txt

1999383_Page_42.txt

1999383_Page_05.txt

1999383_Page_45.txt

1999383_Page_36.txt

1999383_Page_68.txt

1999383_Page_18.txt

1999383_Page_09.txt

1999383_Page_12.txt


Full Text












Enhanced Techniques for Timer Trigger Processingi

Lloyd Noronha and Eric Hanson
301 CSE
CISE Department
University of Florida
Gainesville FL 32611-6120
(352) 392-2691
hanson@cise.ufl.edu
http://www.cise.ufl.edu/-hanson

TR-99-015
9 Sep 1999































1 This report contains the MS thesis of the same title written by Lloyd Noronha under the
supervision of Eric Hanson, September 1999.















ACKNOWLEDGMENTS


I would like to express my sincere gratitude to my advisor, Dr. Eric

Hanson, for giving me an opportunity to work on this challenging project and also

for providing continuous guidance, advice, and support throughout the course of

my work.

I thank Dr. Joachim Hammer and Dr. Stanley Su for serving on my

supervisory committee and for their careful perusal of this thesis. I would like to

thank Ms. Sharon Grant for maintaining a great research environment at the

Database Systems Research and Development Center.

I am thankful to Chris Carnes, Lan Huang, Mohan Konyala, Sasi Kumar,

J. B. Park, and Albert Vernon for their invaluable help and the fruitful discussions

that we had during the course of this work.

On a more personal note, I would like to thank my family whose love,

support, and constant encouragement was of great importance through this work.















TABLE OF CONTENTS
Page



A C K N O W L E D G M E N T S ............................ ................. ................................................ii

T A B L E O F C O N T E N T S ......... ................. ......................................... ..........................iii

LIST OF FIGURES ........................ ... ..................... v

CHAPTERS

1 IN T R O D U C T IO N ......... .......... .. .. ...... .. ......................................................

2 TRIGGERMAN, A FAST ASYNCHRONOUS TRIGGER PROCESSOR.................. 5
2.1 Introduction ......................................................... ......... 5
2.2 TriggerM an A rchitecture............................................................... .. ............ 6
2.3 The Predicate Indexing Strategy of TriggerMan............................................. 8
2 .4 C onclu sion................................................. ................. 10

3 TIMER TRIGGER LANGUAGE ................................................... .................. 12
3 .1 Introdu action .............................................. .......... . ........... 12
3.2 T im er T rigger L anguage......................................................................... ... 14
3.3 Trigger Sem antics....................................... .. ................. ...... ... ....... 15
3.4 Tim er trigger exam ples .............. ............................................ .............. 16
3.5 Conclusion ................................................... 20

4 ENHANCED ALGORITHMS FOR SINGLE TABLE AND JOIN TRIGGERS........ 21
4.1 Introduction ................. ..... ...... ......... .......... ........................... 21
4.2 The Brute Force Implementation for Single Table and Join Triggers ............ 22
4.3 Optimization Goals and Strategy..................... ......... ...... .............. 23
4.4 Enhanced Algorithms for Single Table Triggers............... .......... .. 24
4.4.1 Single Table on U pdate ...................................................... .............. 25
4.4.2 Single Table on Insert................................ ................................. 28
4.4.3 Single Table on D elete .................................. ......... .............. 29
4.4.4 Single Table on Update with Transition Predicate............................. 30
4.5 Enhanced algorithms for Join Triggers ................................. .............. 32
4.5.1 Join Trigger on Update................ ..... ................ .. ............. 32
4.5.2 Join Trigger on Update with Transition Predicate ............................... 34
4.6 Combining the View Predicate and the Trigger Predicate ........................... 36









4.7 Examples ............................................... .............. ......... 37
4.8 C conclusions ............................................................................... 41

5 ENHANCED ALGORITHMS FOR AGGREGATE TRIGGERS............................ 42
5.1 Introduction ......................................... 42
5.2 The Brute Force Implementation for Aggregate Triggers............................ 42
5.3 Single Table Aggregate Triggers..... ................. ...................... .............. 44
5.3.1 Single Table, Aggregate, Non-Transition, on Update...............................45
5.3.2 Single Table, Aggregate, Transition, on Update .............................. 47
5.4 M ulti Table Aggregate Triggers ...................................... ....... .............. 48
5.4.1 Multi Table, Aggregate, Non-Transition, on Update ........................... 48
5.4.2 Multi Table, Aggregate, Transition, on Update ................................... 52
5.5 E xam ples .................................................................. ......... 53
5.6 Conclusion ........................................ .......... 55

6 OPTIMIZATION AND IMPLEMENTATION ISSUES .......................................... 56
6 .1 Introdu action .............................................. .......... . ........... 56
6.2 Architecture ......................................................................... ......... ................... 56
6.3 T thoughts on O ptim ization ..................................................... ... ................. 59
6.4 Optimization Strategies .................. ................................. 60
6.5 Conclusion........................................ .............. 62

7 SUMMARY AND FUTURE WORK.......... ....... .............................................64
7.1 Summ ary ............................ ....... .............. 64
7.2 Thoughts on Future W ork ........................................ ......................... 65

REFERENCES ........... .............. .. ......................... .... 70

BIOGRAPHICAL SKETCH...... ........ ................... ...... ... .............. 73















LIST OF FIGURES



Figure Page

2 .1: T riggerM an A rchitecture....................................................... ................................. 7

2.2: Predicate Index A architecture .......................................................... .............. 8

6.1: A architecture of a Tim er Trigger System .................................................................... 57

6.2: Performance of the brute force method vs. the enhanced method

for v trying u p date rates. ......................................................................................... 60















Abstract of Thesis Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Master of Science

ENHANCED TECHNIQUES FOR TIMER TRIGGER PROCESSING

By

Lloyd X. Noronha

August, 1999

Chairman: Eric Hanson
Major Department: Computer and Information Science and Engineering

Timer-Driven Triggers allow users to monitor interesting tuples or interesting

changes to tuples of a view. Trigger condition testing takes place periodically, i.e., once

every few hours, days or weeks. When the timer of a trigger expires, a new copy of the

view is retrieved and the trigger predicates applied to that view. Trigger predicates could

be transition predicates referring to the attribute values of tuples of a view from the

previous timer expiration.

Timer Triggers differ from conventional triggers, in that the trigger testing is not

done for updated tuples only. Timer triggers could be fired on tuples of a view simply

because they satisfy the predicate trigger predicate. Since the trigger conditions are tested

every so often, high update rate tables can have a lot of timer triggers defined on them.

The goal of this thesis is to design a timer trigger system that works hand-in-hand

with a trigger processor and complements its features. The semantics of the timer trigger

differ from that of regular triggers and are explained by a brute-force approach. Several









enhanced algorithms can be written that do better than the brute-force approach, but do

not change the semantics. A timer trigger system also provides the user an option of

replacing expensive join and aggregate triggers with timer triggers.















CHAPTER 1
INTRODUCTION

Timer Triggers test and fire triggers periodically i.e., every few minutes, hours,

days, weeks etc. Views are created on the database and triggers are fired periodically on

interesting tuples or interesting changes to tuples of the view. This idea differs from the

conventional philosophy of synchronous [Sil98] and asynchronous [Han97] triggers

which fire triggers when a tuple is updated.

Current database systems process triggers synchronously. Synchronous trigger

processing lengthens the response time of update transactions, because the transaction

cannot commit until all the trigger conditions defined on the updated tables are tested and

actions fired. The response time of update transaction is thus directly affected by the

increase in the number of triggers, which limits the number of triggers that can be defined

on a table.

In the case of assertions synchronous triggers are indispensable. But in most other

cases, delaying the trigger processing to until after the triggering transaction commits (i.e.

asynchronous trigger processing) should work just as fine. This is especially true in the

case of alerters. In an asynchronous trigger system, the only work expended during a

transaction is to capture the updates for later processing [Han98]. This time spent in

capturing updates is a constant and totally unrelated to the number of triggers defined on

the tables involved in the transaction. Thus, it is possible to increase the number of









triggers defined on a table to a very large number without affecting the response time of

update transactions.

After the transaction completes, each of the updated tuples captured (also known

as tokens) are individually tested against the trigger condition of each trigger and actions

fired accordingly. This process of testing all the trigger conditions can be done

sequentially. However, a huge speedup can be achieved by means of an efficient

discrimination network for evaluating predicates. A Selection Predicate Index, is one such

discrimination network [Han98], which causes a factor of improvement in the processing

time per updated token.

This strategy works really well for all types of triggers, especially with single table

triggers because the entire condition evaluation on an update token takes place in main

memory i.e., without accessing the database. In the case of join triggers and aggregate

triggers, there is a relative increase in time spent on processing a token, due to a possible

table-access involved during condition evaluation of the trigger. In the case of join

triggers, when an update token of a table arrives, a part of the condition evaluation

involves performing a join. In the case of aggregate triggers, we have to identify the group

to which the update token belongs and test the aggregate condition for that group. This is

covered in more detail in Chapters 4 and 5.

In both these cases, the Selection Predicate Index is still used to evaluate a part of

the predicate that does not involve a table access. If this part evaluates to true, we then

proceed to run an SQL query to read one or more tables, to evaluate the rest of the

predicate. Thus, although an efficient discrimination network still evaluates a part of the

predicate quickly, it is the I/O costs that dominate the processing time of join and









aggregate triggers. Timer triggers can be thought of as an alternative option to some

expensive join and aggregate triggers because they access the database less often.

Timer triggers enable the user to define triggers on a view of the database. The

view could be a join or an aggregation of one or more tables or simply a snapshot of a

single table. Trigger conditions are tested periodically i.e., once every few minutes, hours,

days, weeks etc. On a timer expiration, a new copy of the view is created and the

predicates of the trigger are applied to the view. These predicates could be transition

predicates, which compare some attribute values of the current view with the

corresponding values of the copy of the view created on the previous timer expiration.

It is easy to conceive of many applications and examples where users would like

to be intimated of interesting conditions once every so often. Consider an example of a

manager of a computer company, who is interested in knowing every week, which of his

hardware salesmen have a two fold increase in sales from the previous week. A timer

trigger can be defined with the timer duration of a week. The view here, keeps the sales

information of only the hardware salesmen. When the timer expires, a fresh copy of this

view is created from the sales table and the sales information for each salesman is

compared with his sales from the view of the previous week. The action is fired for all

tuples of the view that satisfy the predicate.

Asynchronous triggers, like synchronous triggers, test the trigger condition for

every updated tuple, albeit only after the transaction commits. Thus, only an updated

tuple can cause a trigger to fire. However, in timer triggers, all tuples of a view that

satisfy the predicate can fire the trigger. These include tuples which may have not been

touched since the last timer expiration.









By definition, timer triggers are defined to achieve an objective different from

asynchronous triggers and thus have different semantics. However, as indicated above,

for expensive join and aggregate triggers, it is possible to replace an asynchronous trigger

with a timer trigger to achieve similar (but not always the same) results.

The goal of this thesis is to propose a good design of a timer trigger system to

complement an existing trigger processor. Chapter 2 explains the architecture and

predicate indexing mechanism of an asynchronous trigger processor, TriggerMan. The

syntax and semantics of the timer trigger language are presented in Chapter 3 along with

some examples. Implementation of timer triggers with efficient algorithms for single

table and join triggers are discussed in Chapter 4. Chapter 5 has a similar discussion for

aggregate triggers. Chapter 6 focuses on the architecture of a timer trigger system and

some optimization details. Finally, Chapter 7 summarizes the discussion on timer triggers

and provides some thoughts for future work.















CHAPTER 2
TRIGGERMAN, A FAST ASYNCHRONOUS TRIGGER PROCESSOR


2.1 Introduction


The design of a timer trigger system in this thesis focuses on working with a

conventional (synchronous or asynchronous) trigger processor to complement and

enhance its functionality. In later chapters, we present some enhanced algorithms for the

implementation of timer triggers. These algorithms need to define some regular single-

table triggers to capture interesting updates and thus it is very important for the

underlying trigger processor to have a fast predicate evaluation mechanism for single-

table triggers. In this chapter we explain the architecture of such a trigger processor called

TriggerMan [Han98].

TriggerMan is a scalable and extensible trigger processor. The triggers in

TriggerMan can have user defined data types and the system scales well even for millions

of triggers for certain applications. The scalability of the system can be attributed mainly

to its asynchronous trigger processing mechanism and an efficient predicate indexing

strategy. As the number of triggers in a system increases, more time is spent on testing

each updated tuple against all the trigger conditions. Making this testing process

asynchronous alleviates the problem of delays in update transactions, which is an inherent

problem of synchronous trigger processing. TriggerMan additionally implements a









predicate indexing strategy [Han99] that is responsible for a huge speedup in evaluating

the predicates of the triggers and thus reduces the processing time per updated tuple.

Extensibility will soon be an indispensable requirement in trigger systems. With

the advent of Object Relational Database Management Systems (ORDBMS), extending

data types and adding new routines in a database system have gained popularity amongst

users. Users are allowed to create new User-Defined Data Types (UDT) and User-

Defined Routines (UDR) in the database system and use them with regular SQL.

TriggerMan is designed to recognize UDT's and UDR's of the underlying database system

and allow their use in triggers.

The other possible choices of trigger processors for our timer trigger system are in

the active database systems of Ariel [Han96a], POSTGRES [Sto91], HiPAC [Cha89],

Starbust [Wid91], and Ode [Geh91]. But they do not provide the scalability feature of

TriggerMan and very limited, if any, extensibility support. Among them, Ariel and

POSTGRES do implement a selection predicate indexing architecture, which however, is

not extensible.

2.2 TriggerMan Architecture


TriggerMan is designed to work as a stand-alone module independent of the

underlying system. However, for simplicity of explanation, the architecture that is

discussed here is as a module that resides on an Object-Relational DBMS. The

architecture of TriggerMan is shown in Figure 2.1.

The main components of the architecture are explained as follows. The

TriggerMan Module resides and runs in the address space of the DBMS server. The task









queue is a to-do list of tasks for the trigger processor. Once every so often, multiple

TriggerMan drivers run in parallel to read and process jobs from the queue. The Data

Source applications are responsible for capturing updates on local and remote data

sources and putting them in an update queue. These updates are later gathered up from

the update queue by the TriggerMan drivers and added as tasks to be processed in the task

queue.


Figure 2.1: TriggerMan Architecture



Lastly, the TriggerMan console is an application program that interfaces to the

user and allows him to initialize the system, create triggers, drop triggers, shut down the

system etc.










2.3 The Predicate Indexing Strategy of TrigerMan


The major work done by a TriggerMan driver program in Figure 2.1 is in

evaluating the predicates for an updated tuple against each of the trigger definitions.

TriggerMan employs a novel selection predicate indexing strategy [Han98] that makes the

predicate evaluation faster by a huge factor.


Selection Predicate Index


Predicate Index Root



Data Source Node







Expression Signature List


Sets of the rest of the predicate and trigger id's
for a particular constant


Figure 2.2: Predicate Index Architecture


Constant Sets









If a system has a large number of triggers then there is a fair chance that some

clusters of triggers have a part of their predicate looking similar. This is the basic premise

of TriggerMan's predicate indexing strategy. For example a majority of triggers defined

by a stock investor will be alerters where the predicates check for stock prices going

above or below a certain interesting value.

The predicate indexing strategy of TriggerMan, simplified for explanatory

purposes, is shown in Figure 2.2. All the data sources defined in TriggerMan have a node

at the first level below the root. When a trigger is defined, its predicate is converted to

Conjunctive Normal Form (CNF), and the most selective conjunct qualifies to be the

expression signature for that trigger. If the same signature already exists, then the

constants of this conjunct are plugged into the constant list under that signature and nodes

containing the rest of the predicate and the id of the trigger are placed under this constant.

If the same signature does not already exist then a new signature node is created and the

same steps followed.

For example consider the following two triggers having the same expression

signature:

create Trigger buy_informix

from stock

where (name="Informix") and (price<5)

do email("david", "Stock price of Informix within 5");

create Trigger sell_oracle

from stock

where (name="Oracle Corp.") and (price>40)









do email("david", "Sell Oracle Corp. now. The price is :price");

The trigger processor isolates the predicate "stock.name=constant" as the

expression signature for both the triggers because it finds it the most selective. It adds this

signature into the list of expression signatures under the data source "stock" (if it does not

already exist). The constants "Informix" and "Oracle" are inserted into the list of

constants (if they do not already exist). Finally, a node for the id's of the triggers and the

remaining part of the predicate are inserted under these constants.

Now, when the information about an updated tuple of the stock relation is pulled

out from the task queue by the TriggerMan driver, it is made to pass through this

discrimination network. The attribute values within this tuple are evaluated against the

constants within each expression signature. For every constant for where the expression

evaluates to true, we proceed with testing the rest of the predicate and then firing the

trigger action if necessary.

The predicate for single table triggers can be evaluated completely within this

main memory discrimination network. However for join and aggregate triggers,

evaluation of the rest of predicate involves reading relations to perform joins and

aggregations.

2.4 Conclusion


The optimization algorithms for timer triggers need to define single table triggers

for each of the data sources with the timer trigger definition. The predicate indexing

strategy of TriggerMan works best for single table triggers and hence is ideal for the timer






11


trigger algorithms. Along with a fast predicate evaluation technique, TriggerMan also

scales well for a large number of triggers and works with extensible data types.















CHAPTER 3
TIMER TRIGGER LANGUAGE


3.1 Introduction


The preceding chapters emphasized the importance of an asynchronous trigger

processor like TriggerMan, which provides a more powerful language than current trigger

systems while at the same time providing the features of scalability and extensibility.

There are trigger situations however, where the events of interest for a user do not lie in

individual updates to the database but in interesting happenings to the database over a

period of time. Take the case of a stock market investor who wishes to be alerted at the

end of the week when he spends over a $10,000 in purchasing stock of a single company.

Here, the investor requires the trigger system to keep track of all his purchases for the

week. In essence, the user wants the trigger system to keep track of a certain view of the

database and notify him periodically of interesting changes to that view. This is the

fundamental motivation behind timer triggers.

Apart from fulfilling it's basic role of periodic notification of interesting changes

to a view, timer triggers also provide users an option of off-loading, where possible,

expensive triggers from asynchronous processing to timer based processing. An

asynchronous trigger processor tests for trigger conditions on each updated tuple. While

this strategy works out well for single table triggers (due to no table access), join and









aggregate triggers could suffer sometimes. Both join and aggregate triggers potentially

involve a table access during predicate evaluation on an update token.

A part of the predicate not dependent on the table-access, can still be evaluated

using the fast predicate indexing strategy as that of TriggerMan. However, evaluating the

remaining portion of the predicate in join triggers requires tuples from one or more tables

to be read. In aggregate triggers all tuples belonging to the same group as the updated

tuple have to be read and an aggregation performed on them. This leads to a higher

processing cost per update token, whose effect could be augmented by the increase in the

number of join and aggregate triggers. In systems where triggers are defined on high

update-rate tables, this problem could get compounded causing the system to get

overloaded.

In situations such as these, the timer trigger system in addition to fulfilling it's

semantic role, could lend a helping hand to the asynchronous trigger system by offloading

some expensive join and aggregate triggers to timer triggers. Instead of performing table

accesses often, the advantage lies in doing them once every so often. Timer triggers thus,

albeit semantically different, can be used to meet objectives similar to asynchronous

triggers for a better system performance. Due to differing semantics however, the

decision to switch to a timer trigger must not be one done automatically by the trigger

processing system but should be an explicit decision of the user. The sections to follow

explain the syntax and semantics of a timer trigger system.









3.2 Timer Trigger Language


A Timer Trigger System is designed to work as a component of an asynchronous

trigger processor like TriggerMan, making the schema definitions of both the DBMS and

trigger processor available to it. Timer triggers can be created on data sources or views.

The asynchronous trigger processor is already familiar with the schema and the primary

key of all data sources specified in the command. The syntax for creating timer triggers is

as follows:

create timer trigger
from
{on event}
{when }
do
check every
{initialize {immediately I on first timer expiration}}


The from clause specifies the name of a data source or view on which the trigger

is defined. This field can also contain a SQL select statement. The on event clause could

either be insert, delete or update indicating that the trigger should fired only when inserts,

deletes or updates take place in the view. The default is update.

The when clause has the trigger condition on the view. When the timer expires,

this boolean expression is applied on the view to identify tuples satisfying the trigger

condition. Although any kind of predicates are allowed to be in the when clause, its

primary purpose is to specify transition predicates. The do clause specifies the trigger

action. The check every clause specifies the timer period or the day of the week and time

when the trigger should fire. The initialize clause is optional and specifies whether the

first instance of the view is created immediately or the first time the timer expires.









3.3 Trigger Semantics


The semantics of the timer trigger command is now explained for the three

different events viz. insert, delete and update. Consider a timer trigger defined on a data

source or a view with a when clause W. The predicates could be transition predicates

referring to old values of attributes obtained from the copy of the view created during the

previous timer expiration. The view itself, could be a single relation or a result of

applying a select statement with predicates on single tables, joins or aggregate queries.



Periodic processing: In the explanation to follow V is the current copy of the view

created when the timer expires by executing the query in the from clause of the create

timer trigger statement and oldV is the copy of the view from the previous timer

expiration. The following steps are executed when the timer expires:

Event ON UPDATE:

1. Retrieve the contents of the view into V.

2. If the when clause W has a transition condition then the action is run on every

tuple resulting from the operation aw (V JOIN oldV). If there is no transition

condition, then the action is run on every tuple resulting from the operation aw

(V).

3. Destroy oldV and rename V to oldV.

Event ON INSERT:

1. Retrieve the contents of the view into V.

2. The action is run on every tuple resulting from the operation aw (V old V).









3. Destroy oldV and rename V to oldV.

Event ON DELETE:

1. Retrieve the contents of the view into V.

2. The action is run on every tuple resulting from the operation aw (oldV V).

3. Destroy oldV and rename V to oldV.


3.4 Timer trigger examples


Example 1: Send an alert to employees who have taken more vacation than their allowed

quota but haven't filled out a special requisition form.

The schema definition of the relation is:
employee(empid, empname, email_addr, vacationtaken, vacationleft,
vacationreq filled .........) with emp_id as the primary key.


The trigger is defined as follows:
create timer trigger excessvacation
from (select email_addr, vacationreqfilled
from employee
where vacationtaken > vacationleft)
when vacationreqfilled = False
check every week beginning Monday at 8:00 AM
do email(:email_addr, "Please fill out the Excess Vacations form");


Here the view is simply the details of the employees who have exceeded their

allowed vacation quota. When the timer expires, it identifies the employees who haven't

filled out the necessary form from among those in the view. The notation :email_addr in

the action substitutes the actual email addresses by macro expansion while running the

action for qualifying tuples. Since no event is specified, on update is assumed. Moving









the predicate of the when clause to the definition of the view does not change the

meaning of the trigger in this example.

The above is a good example illustrating the use of timer triggers and it's differing

semantics from conventional triggers. The intent is to email a forgetful employee every

Monday morning until he does the needful. Trying to achieve the same with a

synchronous or an asynchronous trigger would send an email to the employee every time

his details are touched or it is possible that he does not get emailed at all if his details are

not touched.



Example 2: Alert all junior managers when they have two or fewer days of vacation left

in the year. The schema of the relation employee is the same as the one above.

create timer trigger lessvacationleft
from (select email_addr
from employee
where (designation='Jr. Managers) and
(vacationleft vacationtaken) <= 2)
on insert
check every week
do email(:email_addr, "You have only two or less days of vacation left in the
year");


The view has all the junior managers having two or less days of vacation. The on

insert clause ensures that the employee is emailed only the first time his vacation dips

below two days.



Example 3: Notify the finance department of the customers who haven't honored their

invoices for more than ninety days past the due date.









The schema definition of the relations are:
invoice (num, cno, due_date, amount, ........) with primary key num and
customer (cno, name, addr, ...... ) with cno as the primary key.


The trigger is defined as follows:
create timer trigger unhonoredinvoice
from (select invoice.num as invnum, customer.name as custname
from customer, invoices
where (invoice.cno = customer.cno) and (CURRENT DATE -
invoice.due_date) > 90 and (invoice. status='UNPAID')
check every week
do email("finance@experts.com", "Customer :custname has not honored the
invoice :invnum for more than 90 days");

This is a good example of a join trigger where the relation of interest is only

invoice. The customer relation is present only to extract the information on defaulting

customers. Another important characteristic of this trigger is that tuples who qualify are

mostly tuples lying untouched for more than 90 days.



Example 4: The managers of a famous record label wish to let their new pop and rock

artists know when their album sales have crossed a million.

The schema definition of the relations are:
album (albumid, artistname, sales, artist_type, ......) with album_id as the
primary key and
artist (artistname, category, ......) with artistname as the primary key.


The trigger is defined as follows:
create timer trigger newartistalbum_sales
from (select artist.email as artistemail, album.sales as sales
from album, artist
where (album.artistname = artist.artistname) and
((artist.category="rock") or (artist.category="pop")) and
(album.artist_type="new")
when old.sales < 1000000 and sales>=1000000









check every day
do email(:artistemail, "Congratulations! Your album sales have crossed a
million");


The transition predicate in the when clause ensures that only one email is sent to

the artist on the day when his album sales have crossed a million.



Example 5: The managers of a departmental store chain wish to keep the salaries of all

the sales personnel to within 50% of the budgeted salary.

The schema definition of the relations are:
emp(empid, name, dno, sal, desig, ..........) with empid as the primary key and
dept(deptid, name, budget, manageremail, ....) with dept id as the primary key.


The trigger is defined as follows:
create timer trigger excess_salary
from (select dept.manager email as mgremail, sum (emp.sal) as sum_sal,
dept.budget as budget
from emp, dept
where (emp.dno = dept.dno) and
(emp.desig="sales personnel") and
(dept.name="toy")
group by dept.name, dept.manageremail, dept.budget
when sum_sal > .50 budget
check every week
do email(:mgremail, "You are exceeding the allowed budget for salespersons.");

This is an example of an aggregate trigger involving two tables. A trigger can fire

due to changing information on either of the tables and both tables also have selection

predicates. The view here has aggregate information about the total salaries of sales

personnel in the toy dept.









3.5 Conclusion


The syntax and semantics of a timer trigger language are explained. The examples

illustrate how users can use timer triggers to be informed periodically of interesting tuples

in a view. In example 3, we saw an interesting property of timer triggers where some

tuples satisfy the trigger condition although they did not do so in the previous timer

expiration and have not been touched since. They qualified for the action simply because

they aged enough to satisfy a date-time predicate. The examples brought about the

similarity and the differences in timer triggers against the more conventional synchronous

and asynchronous triggers.















CHAPTER 4
ENHANCED ALGORITHMS FOR SINGLE TABLE AND JOIN TRIGGERS


4.1 Introduction


In the previous chapter we presented the syntax of the "create timer trigger"

command and explained the semantic definition of the different types of single table

triggers, join triggers and aggregate triggers with some examples. A brute force method

for implementing any timer trigger involves running a query against the relations in the

trigger definition to generate the new copy of the view and then evaluating the predicates

of the trigger for all tuples in that view. While the brute force method always works and

is the simplest implementation, it is often an inefficient way to go. Consider for example,

the following read-mostly table employee_details which stores the name and information

of all employees within a company,

employee details(emp id, emp name, emp addr,.....)

Changes to this table normally take place only when a new employee is recruited

or an existing employee leaves. If one or more timer triggers are defined on this low

update rate table, then it is totally unnecessary to re-run queries on the entire relation

employee_details to create the new view, every time the timer goes off. In the case when

there are no updates at all, between two timer expirations, the new view is simply the old

view. If only a few updates take place in this interval, then the new view can be obtained









from the old view by simply keeping track of the updates since the last timer expiration

and applying those updates to the old view during processing.

In this chapter we try to look at some enhanced strategies to generate the old view

from the new view without the brute force method. Several enhanced methods seem

appealing, but with improper optimization methods several cases of anomaly arise when

insert, updates and deletes take place on the same tuple many times in one interval. Our

enhanced technique endeavors to rid ourselves of these anomalies, and generate an

efficient technique to handle almost all types of timer triggers.

In the next section we illustrate the brute force approach to implement timer

triggers. With the disadvantages of this approach in mind, we highlight our points of

focus while developing the various enhanced algorithms, followed by the actual

algorithms for timer triggers, on single relations and joins. We end the chapter with some

examples and thoughts.

4.2 The Brute Force Implementation for Single Table and Join Triggers


Consider a general trigger T defined on one or more relations R. The predicate of

the trigger can be expressed in CNF as PAPtran, where P represents all the non-transition

predicates and Ptran represents the transition predicates. Although the view and the trigger

have their separate predicates Pv and Pw respectively, we combine them together

discussion and represent them as PAPtran. More details on combining predicates and

applying them are covered later in this chapter in section 4.6. The current view created is

CurrView=op(R) and PrevView is the copy of the view from the previous timer

expiration. Note that CurrView and PrevView are views we use for implementation, they









are different from the view defined in the trigger i.e. V=opv(R). The operations to be

performed to implement this trigger are:

1. Create a new copy of the view, CurrView=op(R)

2. Run the action on all the tuples resulting from the following operation,

GPtran (CurrView JOIN PrevView).

The above two steps translate into two SQL queries. For non-transition triggers,

the second step is not needed and the action is run on all the tuples of CurrView, resulting

in only one SQL query.

4.3 Optimization Goals and Strategy


The primary goal of developing enhanced algorithms for timer triggers is to

generate the new view from the old view with a method more efficient than the brute

force method, in terms of CPU and I/O processing time. This, as might be expected,

implies that we keep track of some information about the tuples updated on a table since

the last timer expiration. When the timer goes off we can modify the old view to obtain

the new view. As mentioned previously, improper optimization methods can cause

anomalies, whereby trigger behavior is different than when using the brute-force method.

We overcome these anomalies by keeping track of only the primary key of the tuples that

were updated. When the timer goes off, we perform a join of these keys with the original

relation, to get the correct picture of which among the touched tuples were effectively

inserted, deleted or updated. We then go on to apply these operations to the old view to

obtain the new view.









The timer trigger system is designed to complement an asynchronous trigger

system. As mentioned before, for a table on which a timer trigger is defined, we keep

track of all updates to the table within each processing interval. We can go a step further

and keep track of only those tuples which are of interest to the trigger i.e. those tuples,

which satisfy the predicates of the trigger. This can be done by using the existing

asynchronous trigger system of TriggerMan. Predicates for updated tuples can be

evaluated asynchronously using the fast and efficient Selection Predicate Index in main

memory [Han98].

Another focus while developing our algorithms is to query the relations on which

triggers are defined, only once. It is reasonable to assume these relations are large or will

keep growing. It is therefore important for our algorithms to not touch the relations more

times than necessary. So, in almost all cases, we query the original relation only once for

its attribute values.

4.4 Enhanced Algorithms for Single Table Triggers


In this section we present our enhanced algorithms to implement the various timer

triggers on single relations. These include triggers with or without transition predicates

and with any of the various events viz. on insert, on update and on delete.

The general schema of the view CurrView, on a single table R is (key, fields,

tranfields), where key is the primary key of the table R and fields are one or more

attributes of R whose values are required in the action. The third field tranfields is present

only in a transition trigger and includes those attributes of R, whose values are required to

evaluate the transition predicate.









4.4.1 Single Table on Update


Consider a timer trigger defined on a single relation R (key, field, field,

field3,...fieldn), with one or more predicates P, none of which have a transition

condition. The brute force method to implement this trigger is to evaluate the query

currView= Hn ,.,. (7p(R)), every time the timer goes off. The specified action is then

run for all the resulting tuples of the view currView.

In our enhanced algorithms we maintain a relation S(key), which is a set of unique

primary keys of tuples of R, that were touched (i.e. inserted, deleted or updated), since the

last timer expiration. Additionally, the keys of only those tuples are inserted into the

relation S, which either satisfy the predicate clause P or used to satisfy it before the

update. Regular triggers are defined on the relation R to collect this set S. This is

explained in greater detail with examples in later sections. The following operations can

now be performed to generate a new copy of the view:

1. TempCandidates = Hi ,.,1. ((p(S JOIN R))

2. Refresh CurrView :

2a. CurrView = CurrView (S JOIN CurrView)

2b. CurrView = CurrView u TempCandidates

3. Candidates for the action are all the tuples of the view CurrView.

Explanation: CurrView is the view from the last timer expiration. When the timer

goes off, we need to refresh the view CurrView before running the action on all its tuples.

The first step is to identify tuples to be inserted into the view or have their existing

values updated. These tuples could only be from among the touched tuples of relation R









in the interval since the last timer expiration, i.e., from the set of keys in relation S. These

tuples are collected into a temporary relation TempCandidates. A tuple updated in

relation R may cause an insert into the view currView, if that tuple did not previously

satisfy the predicate P, but does so after the update. In performing a join on the original

relation R, we get the current and thus the correct values of the attributes for evaluating

the predicate. TempCandidates should not include any tuples which need to be deleted

from the view. A tuple will be deleted from the view, either because it is physically

deleted from the relation R or because it no longer satisfies the predicate due to the

update. Performing the join between the relation S to the original relation R takes care of

eliminating the former and evaluating the predicate P takes care of eliminating the latter.

The second step refreshes the view currView. The first of the two queries deletes

all the tuples in CurrView that were touched, i.e. all keys in S. The second query inserts

any of the touched tuples that still belong to the view. Let's separately consider how the

inserts, updates and deletes to the view are correctly handled. A tuple to be inserted into

the view will be present in the temporary relation TempCandidates, and thus gets inserted

into the view CurrView by the second query. A tuple that is updated needs its attribute

values refreshed in the view CurrView. This tuple is present in TempCandidates. It is

updated by first deleting it from the view CurrView with the first query and then

reinserting it into the view from TempCandidates with the second query. A tuple to be

deleted from the view will not be present in the temporary relation, TempCandidates. The

first query will delete this tuple and the second query will not reinsert it into the view.

Finally the third step simply calls a User Defined Routine to run the action on

each of the tuples in the view CurrView.









The above steps can be translated into SQL queries as follows:

1. Select S.key as key, R.fields as fields

from S, R

where S.key = R.key and op

into temp TempCandidates;

2.

2a. Delete from CurrView

where exists (select S.key from S where S.key=CurrView.key);

2b. Insert into CurrView

select key, fields from NewCandidates;

3. Select trigAction (key, fields)

from CurrView;

where trigAction is a UDR for the specified trigger action

Salient features of the algorithm are the following. The above algorithm is well

enhanced and should work especially well on tables with a low update rate. The set of

keys of touched tuples in relation S is kept to as small a size as possible. This is done by

evaluating the predicate P, on each tuple, before inserting its key into S. If the predicate P

is highly selective, only a small fraction of the updates to R, have their keys stored in S.

There is however, a small cost overhead of evaluating the predicates P on touched tuples

before inserting it into the relation S. This cost is significantly reduced by making use of

the existing fast and efficient discrimination network of TriggerMan.

The number of queries in the above algorithm is restricted to three, of which only

one accesses the original relation R. The other two queries operate on much smaller









relations and should not cause a performance overhead. Processing is made more efficient

when the relation R has an index on the primary key, because then, in the worst case, only

as many pages of relation R are accessed as the number of unique keys in S. Correlated

sub-queries, as the one above in step 2, are well optimized by several existing DBMS

products. This query is flattened to perform a join ignoring any duplicates.

4.4.2 Single Table on Insert


In the case of single table triggers with the on insert event, we are interested in

rows inserted into the view CurrView, since the last timer expiration. When the timer

goes off, we create the current copy of the view, identify all new entries from the previous

copy and run the action on them. As before the relation S(key) is the set of keys of

touched tuples of relation R. The operations to process the trigger are as follows:

1. TempCandidates = Hn 1. ,.,i. ((p(S JOIN R))

2. Candidates for the action are all the tuples of (TempCandidates CurrView)

3. Refresh CurrView :

2a. CurrView = CurrView (S JOIN CurrView)

2b. CurrView = CurrView u TempCandidates

The first step creates a temporary relation TempCandidates of all the inserts and

updates as before. The next step runs the trigger action on all the tuples in

TempCandidates that are not in the previous copy of the view CurrView. Finally

CurrView is refreshed as before. The SQL statement to run the action on new inserts in

step 2 can be written as follows:









select trigAction (key, fields)

from TempCandidates

where not exists (select CurrView.key from CurrView

where key = TempCandidates.key)

where trigAction is a UDR for the specified trigger action. It accepts the key and fields of

the view and executes the action.

4.4.3 Single Table on Delete


In the case of single table triggers with the on delete event, we are interested in

rows deleted from the view CurrView, since the last timer expiration. When the timer

goes off, we create the current copy of the view, identify rows deleted from the previous

copy and run the action on them. The operations to process the trigger are as follows:

1. TempCandidates= Hn 1. ,,.d. ((p(S JOIN R))

2. Candidates for the action are all the tuples of

((S JOIN CurrView) TempCandidates)

3. Refresh CurrView :

2a. CurrView = CurrView (S JOIN CurrView)

2b. CurrView = CurrView u TempCandidates

The first and the third steps are no different from before. The second step

identifies tuples that were present in the previous copy of the view, but were deleted

since. Of course, these could only be from among the set S. (S JOIN CurrView) returns

all the tuples in the previous copy of the view, that were touched since. From this set, we









ignore the inserts and updates (i.e. tuples in TempCandidates) to get only the deletes. The

SQL query looks as follows:

select trigAction (key, fields)

from S, CurrView

where S.key = CurrView.key and

not exists (select TempCandidates.key from TempCandidates

where S.key = TempCandidates.key).

where trigAction is a UDR for the specified trigger action. It accepts the key and fields of

the view and executes the action. Again, correlated sub-queries are flattened and well

optimized by major commercial DBMS products.

4.4.4 Single Table on Update with Transition Predicate


Timer triggers with transition predicates compare old and new values of attributes

of corresponding tuples using the old and new keywords as qualifiers. As discussed later

in section 4.6, a predicate for a transition trigger can be written in Conjunctive Normal

Form (CNF) as PAPtran. P has all the non-transition predicates as before and is used to

define the view currView= HR. n ,.,j ,,i,a,-. (op(R)). Ptran is the transition predicate

operating on the old and new values of certain fields written as tranFields in the view. For

simplicity, we assume that the predicate Ptran is such that only a change in the value of

tranFields may cause it to evaluate to true. A more general discussion on evaluating

transition predicates which evaluate to true even when the fields tranFields are untouched

(e.g. old.value=new.value) is present in Chapter 7 under future work.









In order to process a transition trigger, we create the current copy of the view, and

then apply the transition predicates on the old and new copies of the view. The operations

to process the trigger are as follows:

1. TempCandidates = key, fields, tranFields ((Yp(S JOIN R))

2. Candidates for the action are all the tuples in

UPtran (CurrView JOIN TempCandidates).

3. Refresh CurrView :

2a. CurrView = CurrView (S JOIN CurrView)

2b. CurrView = CurrView u TempCandidates

The first and third steps are same as before. In the second step we identify the

tuples which satisfy the transition condition. Only those tuples that were updated, since

the last timer expiration, are candidates to satisfy the transition condition. The join

identifies only the updated tuples and applies the transition predicate Ptran on them. An

SQL query for this step would look as follows:

select trigAction (CurrView.key, CurrView.fields, CurrView.tranFields,

TempCandidates.tranFields)

from TempCandidates, CurrView

where TempCandidates.key = CurrView.key

and Ptran (TempCandidates.tranFields, CurrView.tranFields)

where Ptran is the transition predicate applied on field tranFields of the view.









4.5 Enhanced algorithms for Join Triggers


A join trigger is defined on multiple tables. The schema for a view on N relations

R1 to RN is CurrView(keyl, ..., keyN, fields, tranfields) where keyI is the primary key of

relation RI, "fields" are one or more attributes from any of the N relations whose values

are required in the action and the last field "tranfields" have one or more attributes

required to evaluate the transition predicate, if any. The selection predicate, normalized to

CNF is P1AP2A... APNAPJoIN, where PI is the selection predicate on RI and PJOIN

represents the join predicates. The view definition is:

CurrView= FI keyl, ..., keyN, fields, tranFields

((TP1,P2,... ,PN (cPJOIN(R1 X R2 X ....X RI X ......RN)))

The brute force method accesses all the tuples of all the relations to create the new

copy of the view.

4.5.1 Join Trigger on Update


We maintain individual sets for each relation to register the updated tuples since

the last timer expiration. Each of these relations SI(keyI) has the set of all the keys of

relation RI that were touched since the last processing period. When the timer goes off

the following operations are performed:

1. Create TempCandidates

ForI= 1 toN

do


If SI is empty then skip to the next iteration;









TempCandidatel= fn keyl, ..., keyN, fields, tranFields

((P1,P2,... ,PN ((PJOIN(R1 X... (SI JOIN RI) ... X RN)))

TempCandidate = TempCandidate u TempCandidatel.;

done

2. Refresh CurrView :

2a. For I= 1 to N

do

If SI is empty then skip to the next iteration;

CurrView = CurrView (SI JOIN CurrView)

done

2b. CurrView = CurrView u TempCandidates

3. Candidates for the action are all the tuples of the view CurrView.

The relation TempCandidatel represents the set of inserted and updated tuples of

the view due to the updates on RI only. The relation TempCandidates represents all the

inserts and updates on the view since the last timer expiration and is created by

performing the union of all the relations TempCandidatel. Step la and lb can be

implemented by a single query. Step 1 can thus be implemented using N+l queries, N

queries for each of the TempCandidatel's and one query to remove any duplicates from

TempCandidates.

In step 2, the view is refreshed by first deleting all the tuples touched due to

updates on each of the relations and then inserting the tuples from TempCandidates. Step









2 can also be written using N+1 queries. Finally, the last step runs the action on all tuples

in the view.

Salient features of the algorithm are the following. When the timer goes off, we

have to execute many queries to refresh the view and run the trigger action. In the worst

case when updates to all the relations are registered 2N+2 queries will be run, of which

each of the N queries of step 1 access all the relations. Clearly, our implementation is not

efficient in this case. The brute force method runs only one query to access all the

relations. However, if only one relation registered updates then our algorithm need run

only one query for the first step, two for the second and one for the third i.e. only four in

all, three of which are on smaller relations. Our implementation thus, works really well in

cases when only a small number of the tables are updated.

4.5.2 Join Trigger on Update with Transition Predicate


In this case a transition condition is a part of the predicate. The predicate can be

written in CNF as P1AP2A... APNAPJoINAPtran. The view definition does not change and

is as follows:

CurrView= Hl keyl, ..., keyN, fields, tranFields

(GP1lP2,... APN (TPJOIN (R1 X R2 X ....X RI X ......RN)))



When the timer goes off, the trigger is implemented as follows:

1. Create TempCandidates

ForI= 1 toN









do

If SI is empty then skip to the next iteration;

TempCandidatel = nfl key ... keN, fields, tranFields

(G(P1,P2,... PN (GPJOIN(R1 X... (SI JOIN RI) ... X RN)))

TempCandidate = TempCandidate u TempCandidatel.;

done

2. Candidates for the action are all the tuples in

(Ptran (CurrView JOIN TempCandidates).

3. Refresh CurrView :

3a. For I= 1 toN

do

If SI is empty then skip to the next iteration;

CurrView = CurrView (SI JOIN CurrView)

done

3b. CurrView = CurrView u TempCandidates

All the steps look similar to the previous discussion on the non-transition case. In

step 2 the action is run on only tuples of the view that satisfy the transition predicate by

comparing the corresponding values of these tuples attributes in the old and new view.

Again as stated for the transition case of single table triggers, we assume for

simplicity that the predicate Ptan is such that only a change in the value of tranFields may

cause it to evaluate to true. A more general discussion on evaluating transition predicates









which evaluate to true even when the fields tranFields are untouched (e.g.

old.value=new.value) is present in Chapter 7 under future work.

4.6 Combining the View Predicate and the Trigger Predicate


Consider a general timer trigger defined on one or more relations R1, ... Rn with

a view predicate Pv and a trigger condition Pw, both in Conjunctive Normal Form (CNF).

The view definition is V=nfields((Gpv(R1 X R2 X ... X Rn)). On timer expiration, the view

is retrieved and the trigger predicate Pw is applied to it. The view CurrView that we create

and keep around for enhanced trigger processing need not be the same as the view V but

possibly a more efficient representation.

In the case when the trigger has no transition predicates (i.e. no reference to old

attribute values), we generate CurrView by applying the predicates P=PvAPw on the

relations i.e. CurrView=n fields (oP (R1 X R2 X ... X Rn)). Since Pv and Pw are both in

CNF, P is also on CNF.

In the case when the trigger has transition predicates, the trigger predicate Pw has a

non-transition part PnoTran and a transition part Ptran i.e. Pw=PnoTranAPtran. We generate

CurrView by applying the predicates P=PvAPnoTran on the relations i.e., CurrView=n fields

(OPvAPnoTran (R1 X R2 X ... X Rn)). The procedure for breaking up Pw into PnoTran and Ptran

is explained as follows. If the old value of an attribute is referenced in the predicate, then

all the conjuncts containing that attribute make up Ptran. The conjuncts of Pw that are not

in Ptran make up PnoTran.









From the above two cases, it is clear that we can represent the predicate for a

trigger as P when there are no transition predicates and PAPtran when there are transition

predicates.


4.7 Examples


Example of a Single Table Trigger: Consider this example of a non-transition single

table trigger from Chapter 3.

create timer trigger excessvacation
from (select email_addr, vacationreqfilled
from employee
where vacationtaken > vacationleft)
when vacationreqfilled = False
check every week beginning Monday at 8:00 AM
do email(:email_addr, "Please fill up the Excess Vacations form");


The predicate P here is

P=(vacationtaken>vacationleft) and (vacationreq_filled=False)

The various schema definitions are as follows:

CurrView(empid, email_addr);

TempCandidates(empid, email_addr);

S(emp_id);

The asynchronous trigger to collect the updates in S is

create trigger S_employee_excessvacation
from employee
when ((vacationtaken>vacationleft) and
(vacationreqfilled=False))
or
((old.vacationtaken>vacationleft) and
(old.vacationreq_filled=False))









do execSQL'insert into S values :emp_id';



From the discussion in section 4.4.1 for the case of single table triggers with an

ON UPDATE event, the steps to process the trigger on timer expiration are written as

follows:

1. TempCandidates = emp id, e-mail addr

((C(vacation taken>vacation left) and (vacation reqfilled=False) (S JOIN employee))

2. Refresh CurrView :

2a. CurrView = CurrView (S JOIN CurrView)

2b. CurrView = CurrView u TempCandidates

3. Candidates for the action are all the tuples of the view CurrView.



Example of a Join Trigger: Consider this example of a transition join trigger from

Chapter 3.

create timer trigger newartistalbum_sales
from (select artist.email as artistemail, album.sales as sales
from album, artist
where (album.artistname = artist.artistname) and
((artist.category="rock") or (artist.category="pop")) and
(album.artist_type="new")
when old.sales < 1000000 and sales>=1000000
check every day
do email(:artistemail, "Congratulations! Your album sales have crossed a
million");


The predicate here is PartistAPalbumAPJoINAPtran where

Partist= (artist.category="rock") or (artist.category="pop")

Palbum = album.artist type="new"









PJoIN = album.artist name = artist.artist name

Ptran = old.sales < 1000000 and sales>1000000

The various schema definitions are as follows:

CurrView(artistid, albumid, sales);

TempCandidates(artistid, album_id, sales);

Sartist(artist id); Salbum(albumid);



The asynchronous triggers to collect the updates in Sartist and Salbum are:

create trigger S_artistnewalbum_sales
from artist
when (artist.category="rock") or (artist.category="pop")
or (old.artist. category="rock") or (old.artist.category="pop")
do execSQL insert into St,, values :artist id';


create trigger S_albumnewalbum_sales
from album
when album.artisttype="new"
or old.album.artisttype="new"
do execSQL insert into Salbum values :album id';



The view is initialized as follows:

CurrView= H artist id, album id, sales (GPartist,Palbum((GPJOIN(artist X album)))



From the discussion in section 4.5.2 for the case of Join Triggers with an ON

UPDATE event and a Transition predicate, the steps to process the trigger on timer

expiration are written as follows:









1. Create TempCandidates:

la. If Sartist is not empty then

TempCandidateartist = n artist id, album id, sales

((yPartist,Palbum((PJOmI(Sartist JOIN artist) X album)))

TempCandidate = TempCandidate u TempCandidateartist.;

end if

lb. If Salbum is not empty then

TempCandidatealbum = artist id, album id, sales

((yPartist,Palbum((PJOmI(Salbum JOIN album) X artist)))

TempCandidate = TempCandidate u TempCandidatealbum.;

end if

2. Candidates for the action are all the tuples in

(Ptran (CurrView JOIN TempCandidates).

3. Refresh CurrView :

3a. If Salbum is not empty then

CurrView = CurrView (Salbum JOIN CurrView);

If Sartist is not empty then

CurrView = CurrView (Sartist JOIN CurrView);

3b. CurrView = CurrView u TempCandidates






41


4.8 Conclusions


The brute force implementation of timer triggers is expensive because of the CPU and I/O

cost involved in reading entire relations and applying selection and join predicates on

them. In single table triggers where the update rate on relations are not very high, our

strategy of keeping track of the tuples touched and performing queries only on this

touched set works very well. Performance improves even more if the selection predicate

is very selective or if there is an index on key of the relation. In multi-table triggers, our

strategy works well when only a few of the relations are updated and the update rate is not

very high.















CHAPTER 5
ENHANCED ALGORITHMS FOR AGGREGATE TRIGGERS


5.1 Introduction


Aggregate triggers are triggers whose condition of interest involve an aggregation

on all tuples or groups of them. If a tuple of a relation is touched then the group it belongs

to needs its aggregate information re-calculated and the trigger condition checked. Thus,

aggregate triggers implemented as synchronous or asynchronous triggers have a high

processing cost per updated tuple. The timer trigger mechanism to test for interesting

aggregate conditions periodically is a better option to implement aggregate triggers. The

view here is an aggregation of tuples from one or more relations grouped by some

grouping attributes. Again, the brute force method is the simplest implementation and

always works. There are enhanced algorithms discussed in the next few sections which go

along similar lines of the algorithms in the previous chapter that make aggregate triggers

very efficient under some cases.


5.2 The Brute Force Implementation for Aggregate Triggers


Consider a general aggregate trigger defined on relations R. The selection

predicate of the trigger is P, the grouping attributes are gAttr and the aggregate predicate

is PaggrAPtranAggr in CNF, where Paggr represents all the non-transition aggregate predicates

and PaggrTran represents the transition aggregate predicates. Although the view and the









trigger have their separate aggregate predicates Pvaggr and Pwaggr respectively, we combine

them together into one Boolean and represent them as PaggrAPtranAggr, where PtranAggr

contains the transition predicate part of the aggregate predicate and Paggr contains the non-

transtion part. This discussion of combining predicates is identical to the one we had in

section 4.6, the only difference being that the selection predicates are replaced by

aggregate predicates. The schema for the view CurrView is (gAttr, aggrValues,

tranAggrValues) where aggrValues are one or more fields in the view required for the

action and tranAggrValues are one or more fields required to evaluate the aggregate

predicate. The view is defined as follows.

CurrView= H gAttr, aggrValues, tranAggrValues ((UPaggr(gAttr G aggrFunc fields((cP (R))))

where aggrFunc is an aggregate function applied on some attribute "fields" of the relation.

The aggregate values generated by aggrFunc are to evaluate the aggregate predicate Paggr

and also to evaluate the fields aggrValues and tranAggrValues of the view. PrevView is

the copy of the view from the previous timer expiration. Note that CurrView and

PrevView are views we use for implementation, they are different from the view defined

in the trigger. The operations to be performed to implement this trigger are:

1. Create a new copy of the view CurrView,

CurrView= H gAttr, aggrValues, tranAggrValues ((UPaggr(gAttr G aggrFunc fields((cP (R))))

2. Run the action on all the tuples resulting from the following operation,

(Ptran (CurrView JOIN PrevView).









The above two steps translate into two SQL queries. For non-transition triggers,

the second step is not necessary and the action is run on all the tuples of CurrView,

resulting in only one SQL query.




5.3 Single Table Aggregate Triggers


In this section we present some enhanced algorithms to implement the various

timer triggers with aggregate conditions on single relations. These include triggers with

any of the various events viz. on insert, on update and on delete and with or without

aggregate transition predicates on the groups within the view. Again, the simplest method

that works is the brute force method. When the timer goes off the new view is created

from all the tuples of the relations and an aggregation performed on all the groups.

However, if only a small number of groups are touched or many tuples of only one group

are touched then recomputing the aggregation for all the tuples in all the groups may be

extremely inefficient in terms of I/O and CPU.

The general schema of an aggregate view currView on a single table R is (gAttr,

aggrValues, aggrTranfields) where gAttr comprises the group attributes and aggrValues

are the values of aggregate functions applied on the group, required for the action. The

aggrTranfields values are required only for transition triggers to evaluate the transition

predicate. These values are also obtained by applying some aggregate functions on the

group.









5.3.1 Single Table, Aggregate, Non-Transition, on Update


Consider a timer trigger defined on a relation R with selection predicate P, one or

more grouping attributes gAttr and the aggregate predicate Paggr. The view can be defined

as follows:

CurrView= H gAttr, aggrValues ((UPaggr (gAttr G aggrFunc fields ((P (R))))

where aggrfunc are one or more aggregate functions applied on some fields

"fields" of relation R. The values of these aggrFunc may either be needed to evaluate the

aggregate predicate Paggr, or to simply compute the aggregate values aggrValues required

for the action.

As before, we maintain a relation S(gAttr), which is the set of groups touched due

to updates on relation R since the last timer expiration. The algorithm to process this

trigger is as follows:

4. TempCandidates = H gAttr, aggrValues ((UPaggr (gAttr G aggrFunc field ((P (S JOIN R))))

5. Refresh CurrView :

2a. CurrView = CurrView (S JOIN CurrView)

2b. CurrView = CurrView u TempCandidates

6. Candidates for the action are all the tuples of the view CurrView.

TempCandidates is a temporary relation of current aggregate information of only

the groups to be inserted or updated in the view CurrView. 7p (S JOIN R) returns all the

tuples belonging to the touched groups, satisfying the selection predicates. We then group

these tuples on the grouping attributes gAttr and apply the aggregate predicates Paggr, to

obtain new aggregate values of all the touched groups. The second step refreshes the view









CurrView by deleting all the groups touched and inserting all tuples in TempCandidates.

Finally, the trigger action is run on all tuples in the view CurrView.

As consistently seen in all examples until now, the main difference in the various

algorithms for single table triggers lies in creating the relation TempCandidates consisting

of inserts and updates to the view. After this relation is created, the steps to refresh the

view and run the action on the tuples are similar.

The salient features of the algorithm are the following. This algorithm works best

if tuples from only a few groups are touched because then the data is read and aggregation

performed only on tuples belonging to this small set of groups. If there is an index on the

group attributes gAttr, then I/O is performed only on those pages having the tuples of the

touched groups. Applying the selection predicate P on all touched tuples before inserting

it into the set S, reduces the number of groups especially if the predicate is highly

selective. Only one of the queries accesses a part of the original relation R. The remaining

three operate on smaller relations and involve simple join and predicate application.

The SQL query to implement the trigger looks as follows:

1. Select S.gAttr as gAttr, aggrFunc(R.field) as aggrValues

from S, R

where S.gAttr=R.gAttr and P

group by gAttr

having Paggr

into temp TempCandidates

2. Refresh CurrView :

2a. Delete from CurrView









where exists (select gAttr from S where S.gAttr=CurrView.gAttr);

2b. Insert into CurrView

select gAttr, aggrValues from TempCandidates;

3. select trigAction (gAttr, aggrValues)

from CurrView.

5.3.2 Single Table, Aggregate, Transition, on Update


In this case one or more of the aggregate predicates involve a transition condition

on a aggregate value (e.g. old.sum(sal) < sum(sal) ). A general representation of the

aggregate predicate can be written in CNF as PaggrAPtranAggr, where Paggr is the regular

aggregate predicate from before and PtranAggr has one or more transition predicates. The

view is defined as follows:

CurrView= H gAttr, aggrValues, tranAggrValues ((UPaggr (gAttr G aggrFunc field ((P (R))))

where now the aggrfunc additionally generates one or more fields tranAggrValues

to evaluate the transition predicate.

1. TempCandidates =

I gAttr, aggrValues, tranAggrValues ((Paggr (gAttr G aggrFunc field ((P (S JOIN R))))

2. Candidates for the action are all the tuples in

(PtranAggr (CurrView JOIN TempCandidates)

3. Refresh CurrView :

2a. CurrView = CurrView (S JOIN CurrView)

2b. CurrView = CurrView u TempCandidates.









Only step 2 above looks different. In this step we identify the tuples which satisfy

the transition condition. Only those groups touched since the last timer expiration are

candidates to satisfy the transition condition. These groups are in TempCandidates. The

join identifies only the updated tuples and applies the transition predicate PtranAggr on

them.


5.4 Multi Table Aggregate Triggers


In the case of multi-table aggregate triggers, a trigger is defined on multiple tables.

The schema for the view CurrView on N relations R1 to RN is (gAttrl, ..., gAttrN,

aggrValues, aggrTranfields). The grouping attributes in this case are gAttrl, gAttr2, ...,

gAttrN, where a gAttrl could be zero or more attributes of the relation RI. This number is

zero when a relation does not have any of its attributes in the group-by clause. The

selection predicate, normalized to CNF is P1AP2A... APNAPJoIN, where PI is the selection

predicate on RI and PJON represents the join predicates. The view definition is:

CurrView= gAtrgAttrl ... gAttrN, aggrValues

((Paggr ( gAttrl, ... gAttrN G aggrFunc field

((TP1,P2,... ,PN (cPJOIN(R1 X R2 X ....X RI X ......RN)))))


5.4.1 Multi Table, Aggregate, Non-Transition, on Update


When the timer goes off, we re-aggregate on the groups touched due to updates on

each table. The operations are as follows:

4. Create TempCandidates

ForI= 1 toN









do

If SI is empty then skip to the next iteration;

create TempCandidatel;

TempCandidate = TempCandidate u TempCandidatel.;

done

5. Refresh CurrView :

2a. For I = 1 to N

do

If SI is empty then skip to the next iteration;

CurrView = CurrView (SI JOIN CurrView)

done

2b. CurrView = CurrView u TempCandidates

6. Candidates for the action are all the tuples of the view CurrView.

The relation SI (gAttrI) has the set of gAttrI values touched due to updates on

tuples of relation RI, after applying the predicate PI on each tuple. The join predicate

might cause only a subset of the groups having the values in gAttrI to be affected,

however, for simplicity of the algorithm we recompute for all groups having the gAttrI

values.

TempCandidatesI represents the set of groups touched due to the updates on

relation RI only. The process to create it is discussed below. TempCandidates is the union

of all these temporary relations and thus is a cumulative set of touched groups due to

updates on all relations on the view. Step la and lb can be implemented by a single









query. Step 1 can thus be implemented using N+1 queries, N queries for each of the RI's

and one query to remove any duplicates.

In step 2, the view is refreshed. First all the touched groups due to updates on each

relation are deleted from the view, then we reinsert the appropriate groups with their new

values from TempCandidates. Step 2 also can be written using N+1 queries. Finally, the

last step runs the action on all tuples in the view.

The process of creating the relation TempCandidatesI for any relation RI is done

in two different ways depending on whether any of its attributes contribute to the

grouping or not. In the first case we consider a relation RI whose attributes are present in

the grouping as the fields gAttrI. TempCandiatesI for this case can be created as follows:

TempCandidates = H gAttrl, ..., gAttrN, aggrValues

((Paggr ( gAttrl, ..., gAttrN G aggrFunc field

((Gp1,P2,... ,PN (cPJOIN(R1 X... (SI JOIN RI) ... X RN)))))

where SI (gAttrl) has the values of all the gAttrl touched since the last timer

expiration. The aggregation is performed for all the groups whose gAttrl values are in S.

In the second case we consider a relation RI which does not have any attributes in

the grouping clause. This makes the task of finding the groups touched due to updates on

relation RI a little difficult. In this case, we collect the set S(pKey), the set of primary

keys of tuples updated in RI. Then we use the selection and join predicates to find another

relation RJ whose gAttrJ values are touched due to updates on RI. These values are

collected in a relation SJ'(gAttrJ) and appended to SJ(gAttrJ). So SJ would now consist of

all the gAttrJ values touched due to updates on RJ as well as RI.









SJ'(gAttrJ) = n gAttrj (P+I (,PJOIN-IJ((SI JOIN RI) X RI+1 X .... X RJ)))

SJ(gAttrJ)= SJ(gAttrJ)ASJ'(gAttrJ)

where (PJOIN-IJ includes only those join predicates involving relations from RI to RJ

The relation TempCandidatesJ is created the usual way from SJ as shown above,

however it reflects the refreshed values of groups touched by updates on RI and RJ. RJ is

any relation which contributes to the grouping attributes and is connected to RI in the rule

condition graph. A rule condition graph is like a query graph where the various relations

(with their selection predicates) are the vertices and the join predicates are the edges

between the vertices. The relations RI+1 to RJ-1 are the relations along the path from RI

to RJ in the query graph.

The salient features of the algorithm are the following. When the timer goes off,

we have to execute many queries to refresh the view and run the trigger action. In the

worst case when updates to all the relations are registered 2N+2 queries will be run, of

which almost N queries could access all the relations. Clearly, our implementation is not

efficient in this case. The brute force method runs only one query to access all the

relations, but involves a lot of aggregate computation on possibly a large number of

groups. However, if only one relation registered updates then our algorithm need run only

one query for the first step, two for the second and one for the third i.e. only four in all,

three of which are on smaller relations. Our implementation thus, works really well in

cases when only a small number of the tables are updated.









5.4.2 Multi Table, Aggregate, Transition, on Update


The only difference in the case of a multi-table aggregate trigger with a transition

predicate lies in a transition condition being a part of the aggregate predicate. The

aggregate predicate can be written in CNF as PaggrAPtranAggr. The view definition does not

change and is as follows:

CurrView= gAtrgAttr ... gAttrN, aggrValues, tranAggrValues

((Paggr ( gAttrl, ..., gAttrN G aggrFunc field

((GPIP2,... ,PN (cPJOIN(R1 X... (SI JOIN RI) ... X RN)))))

When the timer goes off, the trigger is implemented as follows:

1. Create TempCandidates

ForI= 1 toN

do

If SI is empty then skip to the next iteration;

create TempCandidatel;

TempCandidate = TempCandidate u TempCandidatel.;

done

2. Candidates for the action are all the tuples in

(PtranAggr (CurrView JOIN TempCandidates).

3. Refresh CurrView :

3a. For I= 1 toN

do


If SI is empty then skip to the next iteration;










CurrView = CurrView (SI JOIN CurrView)

done

3b. CurrView = CurrView u TempCandidates.

All the steps look similar to the previous discussion on the non-transition case. In

step 2 the action is run on only the groups satisfying the transition predicate by comparing

the corresponding values of these groups in the old and new view.


5.5 Examples


Example of a Aggregate Trigger: Consider this example of an aggregate trigger with no

transition predicate from Chapter 3.

create timer trigger excess_salary
from (select dept.manageremail as mgremail, sum (emp.sal) as sum sal, dept.budget as budget
from emp, dept
where (emp.dno = dept.dno) and
(emp.desig="sales personnel") and
(dept.name="toy")
group by dept.name, dept.manageremail, dept.budget
when sumsal > .50 budget
check every week
do email(:mgremail, "You are exceeding the allowed budget for salespersons.");


The selection predicate is PempAPdeptAPJoIN where

Pempis (emp.desig="sales personnel")

Pdept is (dept.name="toy")

PJIN is (emp.dno = dept.dno)

The aggregate predicate is Paggr where

Paggr is (sum_sal > .50 budget)

The various schema definitions are as follows:

CurrView(dept name, mgr email, budget, sum_sal);










TempCandidates(dept name, mgremail, budget, sum_sal);

Semp(emp_id); Sdept(name, manager email, budget);

CurrView and TempCandidates have the same schema consisting of the grouping

attributes and a field containing the result of summing the groups on sal. Since only the

relation Dept contributes to the grouping attributes we collect the set of touched groups in

Sdept. Emp does not contribute to the grouping at all so we keep track of the primary keys

of interesting tuples updated.

The asynchronous triggers to collect the updates in Smp and Sdept are:

create trigger Semp_ excess_salary
from emp
when emp.desig="sales personnel"
or old.emp.desig="sales personnel"
do execSQL 'insert into Semp values :emp_id';


create trigger Sdept excess_salary
from dept
when dept.name= "toy"
or old.dept.name="toy"
do execSQL 'insert into Sdept values (:name, :manageremail, :budget)'


From the discussion in section 5.4.1 for the case of a multi-table aggregate trigger

with an ON UPDATE event and no transition predicate, the steps to process the trigger on

timer expiration are written as follows:

1. Create TempCandidates

la. If Smp is not empty then

Sdept- Hname, manager email, budget

((PempPdept ((Semp JOIN emp) JOIN dept)









endif

lb. If Sdept is not empty then

TempCandidatedept= Iname, manager e-mail, budget, sum sal

(dept.name, dept.manager email, dept.budget G sum sal

(GPempPdept (emp JOIN (Sdept JOIN dept))))

TempCandidate = TempCandidatedept;

endif

2. Refresh CurrView :

If Sdept is not empty then

CurrView = CurrView (Sdept JOIN CurrView);

CurrView = CurrView u TempCandidates

3. Candidates for the action are all the tuples of the view CurrView.

5.6 Conclusion


Implementing aggregate triggers by the brute force method may not be the most

efficient method in some cases. Reading entire relations and performing aggregations

involves a lot of expense in terms of both CPU and /O. In single table triggers where the

update rate on relations are not very high, our strategy of keeping track of the groups

touched and aggregating on only this small set of groups works well. Performance

improves even more if the selection predicate is very selective or if there is an index on

the relation. In multi-table triggers, our strategy works well if only a few of those

relations are updated and the update rate is not very high.















CHAPTER 6
OPTIMIZATION AND IMPLEMENTATION ISSUES


6.1 Introduction


In the preceding chapters, we looked at the syntax and semantics of timer triggers

with its brute-force and enhanced implementation algorithms. In this chapter, we discuss

the architecture and some optimization issues of a Timer Trigger System. Although the

enhanced algorithms contain more queries than the brute force approach, they can be

orders of magnitude faster under certain conditions. The same algorithms however, may

perform poorly compared to the brute-force approach under different conditions. This

chapter looks at the factors which determine the performance of our enhanced algorithms

and provides some thoughts in making a choice between the better of the two approaches.

Trying to make a comprehensive assessment using extensive statistics, like the query

optimizer does [Chm76, Se179], is extremely complex to implement.

6.2 Architecture


The architecture of a timer trigger system is shown in figure 6.1. As discussed

before, a Timer Trigger System is designed to complement the powers of an

asynchronous trigger processor. A console program interfaces to the user and accepts the

"create timer trigger" command string. This console could be same as the one used by the

asynchronous trigger processor to accept its commands. This command string is then sent









to the parser where it gets parsed for syntactic correctness. The semantic analyzer

validates the semantics of the various entities present in the command string viz. the data

sources, views, fields etc. It also checks to see if the view on which the trigger is defined

has a primary key on it, and if the action present in the do clause is valid.


Figure 6.1: Architecture of a Timer Trigger System


The code generator then generates the SQL scripts for both the brute-force and

the enhanced strategy. The default strategy is initially set to brute-force. The

implementation of the enhanced strategy requires us to create asynchronous triggers to









keep track of the updates to each of the data sources. As many asynchronous triggers are

needed as there are data sources in the timer trigger definition. The scripts for the

asynchronous trigger processor are created by the code-generator. All the scripts created

are stored in a catalog with the trigger definition. A request is then sent to the timer

manager for a periodic wake-up signal.

If the "initialize immediately" clause is set in the timer trigger string, the view is

initialized the first time on trigger definition by brute force. This view, as discussed

before, needs to be kept until the next timer expiration to evaluate transition predicates.

The timer manager accepts entries for timer requests when a trigger is defined. It

registers these entries into the its list of wake-up times. On expiration of this timer, it

sends a wake-up signal to the execution unit identifying the trigger to be processed.

The execution unit, on receiving a wake-up signal, extracts the definition and the

scripts of the specified trigger from the catalogs. Depending on which of the two

strategies were set active, the execution unit runs the set of SQL scripts from either the

brute-force or the enhanced algorithm. These scripts generate the new view, apply the

predicates of the trigger referring to the previous copy of the view if necessary, and

finally run the action on the qualifying tuples.

After firing the actions, the trigger strategy is reassessed in the reassess block.

The values of a set of parameters, on which the execution cost of the trigger is dependent,

is examined and a decision in made to choose between the two strategies. If the time

spent on re-assessing is a fair fraction of the processing time of the trigger, then this step

can be executed every few times after the processing instead of every time. The









mechanism for reassessing the trigger processing strategy is covered in more detail later

in this chapter.

6.3 Thoughts on Optimization


The run-time performance of the enhanced strategy depends on the following

factors:

update rate of the data sources

selectivity of the selection predicate of the view

indexes present on the data sources on their primary key

In Chapters 4 and 5, we mentioned that for each relation involved in the trigger, a

relation S(pKey) or S(gAttr) is maintained to store the keys of updated tuples. This list of

updated tuples is filtered to a potentially small size by applying the selection predicate

while gathering them. If the number of keys in S is kept to a small size then the time

spent for creating the new view is reduced. The size of the set S is small if either the

update rate is low or the predicate is very selective or both. If the data source has an index

on its primary key pKey or the grouping attribute gAttr, then the time spent on processing

the trigger is significantly reduced, since only a few pages of the data source are touched.

Thus, one or more of the above mentioned factors contribute to the better performance of

the enhanced algorithms.

So, if a plot of the processing time versus the update rate of the table is to be

drawn, then it should look like the one shown in figure 6.2. From this plot it is clear that

the enhanced strategy works well for low update rates. As the update rate increases there

is a crossover point, above which the enhanced strategy is worse than the brute force one.






60

It is our goal to work in the shaded region of the graph i.e., use the enhanced strategy until

the update rate reaches the crossover point, after which we want to switch over to the

brute-force approach.


Crossover Point


Update Rate


-4- Brute Force ----Enhanced I


Figure 6.2: Performance of the brute force method vs. the enhanced method for varying
update rates.


6.4 Optimization Strategies

The first optimization strategy is the simplest and the easiest to implement. Actual

run times are used to arrive at a decision on which among the two algorithms is the better

one. By actual time we mean the wall-clock time, as it is very difficult to obtain the actual

CPU and I/O time spent during query executions due to the inherent multi-threading in

DBMSs [Inf99]. Brute force is initially used to create the view for the first time and the


\"", ""--J_'









time spent in milliseconds is stored with the trigger information. In every subsequent

processing, the enhanced strategy is run and the cost recorded and compared with the

brute-force cost. When the former cost exceeds the latter in two or more consecutive runs,

we switch to the brute-force approach. In the enhanced region of the graph, this algorithm

works fine because we know with a fair degree of certainty that the enhanced algorithm

has lesser processing cost. But once we cross over to the brute-force region, it is difficult

to make a decision when to return back to the enhanced approach, due to the lack of

sufficient information collected.

In the second optimization strategy, we use the estimated costs from the query

optimizer in arriving at a relation between the update rate of the data sources to the

execution cost of the enhanced algorithm. However, this information is not reliable as the

size of the relation S will never be accurately determined by the optimizer due to the

dynamically varying size of S. But, if the optimizer provides a function to calculate the

cost of a query by simply giving to it the size of S then we can approximately determine

the size of S that causes the crossover in costs between the two algorithms. As the update-

rate rises the number of touched keys collected in S will increase too. When the number

of keys in S crosses this threshold number, we switch to the brute force approach and stop

gathering the updates. At a later time, when the update rate drops, we can resume

collecting keys in S to see if this number has dropped below the threshold number.

Stopping and restarting the process of collecting the updated keys is as simple as

disabling and enabling the asynchronous triggers defined on the data sources.

The update-rate of data sources can be kept track of by defining simple main-

memory counters (variables) for each data source. These counters are incremented on









every update to the corresponding data source and once every hour the update rate can be

calculated (by dividing the value of this counter by sixty minutes). Since this counter is

going to be shared among the different transactions updating the data source, a locking

overhead will be introduced per increment. However, since we are interested in only an

approximate number for the update-rate, we can avoid this expensive process of locking

the counter.

While the second strategy seems appealing, it is based on the assumption that the

query optimizer provide an interface function that accepts the size of S and returns the

cost of the enhanced query. However query optimizers compute the estimated costs of

queries based purely on statistics collected by the database and do not accept parameters.

It will be helpful if DBMS vendors could open up their query optimizers provide

interfaces that accept some parameters while optimizing a query. One way to get around

this problem is to enter some dummy rows into S and call the gather statistics command

on S, before optimizing the query.

The first strategy does seem like a more practical and a simple one. However, we

need to find some way to return back to the enhanced approach from the brute force one,

when the update rate drops. This can be done by recording the update-rate at the

crossover point. Later, when the update rate drops below this recorded value we can

switch back to the enhanced strategy.

6.5 Conclusion


The architecture of a timer trigger system is simple. It functions hand in hand with

an Asynchronous Trigger Processor. The enhanced algorithm works well for low update






63


rate tables, high trigger predicate selectivity and the presence of primary key indexes on

data sources. A plot of the enhanced vs. the brute-force strategy reveals that our enhanced

algorithms perform well for low update rates. As the update rates increases there is a

crossover point when this strategy is worse than the brute force one. It is important to

identify the crossover point until which the enhanced algorithm should be used. A more

detailed design and implementation of optimized techniques for a timer trigger system is

a subject of future work.















CHAPTER 7
SUMMARY AND FUTURE WORK


7.1 Summary


Timer triggers allow users to define views and monitor interesting tuples or

interesting changes to tuples in these views. The process of retrieving views, testing

trigger conditions and firing actions takes place periodically. All this gives timer triggers

different semantics from those of the more conventional synchronous (and asynchronous)

triggers, in that the tuples do not need to be updated to satisfy trigger conditions.

The examples covered in Chapter 3 sought to reassert the point that some trigger

needs of users require periodic testing and notification. These translate best to timer

triggers and sometimes cannot be replaced by regular triggers. In example 1, where it was

required to notify some employees weekly of a form to be filled, only timer triggers could

be written here to correctly send one notification per week to forgetful employees.

Some join triggers and many aggregate triggers could be expensive to implement

using asynchronous triggers due to frequent accesses on the original relations. These

triggers can be evaluated more efficiently by the periodic testing mechanism on a group

of updates.

The simplest implementation for timer triggers is the brute-force approach, which

retrieves the view using its definition and then applies the predicates of the trigger. The

enhanced algorithms of simply refreshing the view by recording its updates lead to a huge









performance benefit during periodic processing. This is especially true if one or more of

the following hold viz. the update-rate on the relations is low, the trigger predicate is very

selective or indexes are defined on the primary key of relations.

Formal cost analysis and models in trying to choose the better of the two

algorithms has not been presented and is the scope for future work. However, our

approach of making a choice between the better of the two algorithm and identifying the

point of crossover by using actual run times is a simple and a feasible idea.

The most useful contribution of this thesis is the idea of keeping a set of keys of

updated tuples and restricting the processing to this set. This allows us to be oblivious of

the anomalies arising due to the different sequence of inserts, updates and deletes on the

same tuple of the view. This is achieved by actually retrieving the tuples associated with

these keys from the original relation and thus getting the correct information. Maintaining

this set however, requires us to create asynchronous triggers as explained in Chapter 4.

An asynchronous trigger processor like TriggerMan with its fast predicate indexing

mechanism makes it possible to collect this set of keys with a very low (and almost

insignificant) overhead in many cases.

The architecture of a timer trigger system presented in Chapter 6 is simple and

reuses a lot of the functionality available from the asynchronous trigger processor like its

console, data source definition etc.

7.2 Thoughts on Future Work


In the enhanced algorithms for join triggers we assume that updates to all the

tables in the trigger (or view) definition are important for refreshing the views. Thus, we









keep track of the updates on all the tables. In many cases however, updates to fewer than

all relations are relevant and it may not be necessary to keep track of the updates to all the

relations. This is especially true in example 4 of Chapter 4 which sends a complimentary

email to new pop and rock artists when their album sales cross a million. While updates

take place on both the relations, album and artist, the updates to only album are relevant

as we are only interested when the sales cross the million mark. Thus, there should be an

option in the timer trigger language that specifies the relations on which the updates are

relevant so that our optimizer has a more efficient execution.

While making a choice between the better of the two approaches discussed, our

main basis of comparison was the total run times spent on creating the views and firing

the actions. However, the enhanced algorithms also incur a small cost during the update

to each tuple. This is because, the asynchronous trigger processor evaluates some part of

the predicate on the updated tuples and puts the key in the update set S if necessary.

Although the cost of evaluating some part of the predicate can be very small with an

efficient predicate indexing mechanism, like that of TriggerMan, it could still be

significant for a high update-rate table or an expensive predicate. Other issues such as

locking on the update set S could also escalate cost. Thus, we need to arrive at a more

mathematical model to compute costs of the enhanced algorithms, taking into account the

costs for both collecting the updates and processing them.

When the enhanced algorithms are processing the set of keys of updated tuples

(i.e., keys in S), updates on the original relation still proceed concurrently. Some of these

updates may qualify to have their keys included in the set S. This represents a boundary

case where keys are getting inserted into S while S is being processed. Some analysis









needs to be done to see if any unexpected behavior is possible leading to some tuples

getting processed at the wrong timer expiration.

In our enhanced algorithm for triggers with a transition predicate (i.e., referring to

old and new values from the view), we assumed that a tuple of the view must change to

satisfy the transition predicate. We run the action on all the tuples that satisfy the

following condition:

UPtran (CurrView JOIN TempCandidates)

where CurrView is the old copy of the view from the previous timer expiration

(and is yet to be refreshed) and TempCandidates is the view of the tuples inserted or

updated in the interval since the last timer expiration. By performing the join (CurrView

JOIN TempCandidates), we isolate only those tuples which have been updated since the

last timer expiration and apply the transition predicate Ptran (CurrView.tranFields,

TempCandidates.tranFields) on the old and new copy of the relevant fields. However,

depending on the transition predicate (e.g. old.value = new.value), there may be cases of

tuples that satisfy the predicate without being updated. The brute-force approach will

account for these tuples, as it compares all tuples of the old view to corresponding tuples

of the new view. However, our enhanced approach does not identify tuples that have not

been updated but still satisfy the transition predicate because we apply the predicate only

on the updated tuples. In order to make our enhanced strategy work we need to change the

predicate to the following:

(Ptran' (CurrView OUTER-JOIN TempCandidates)

where Ptran' =

(TempCandidate.key is not Null AND









Ptran(CurrView.tranFields, TempCandidates.tranFields))

OR

(TempCandidate.key is Null AND

Ptran (CurrView.tranFields, CurrView.tranFields))

The predicate Ptran' is explained as follows. We pull in all tuples of the view

CurrView by performing an outer join with TempCandidates. The predicate Ptran' has two

disjuncts, the first one applies the predicate Ptran on old-new pairs of updated tuples in the

view and the second one applies the predicate on only the old values of the untouched

tuples.

The second strategy is a general one and works for both the cases. This can be the

default strategy. However, if we can determine from the transition predicate that an

attribute value has to change to satisfy the predicate then we can switch to the more

efficient approach of applying the predicate Ptran on only updated tuples of the view.

Examples of such predicates are old.value < new.value, new.value> 10*old.value etc. A

theorem prover can be introduced into the code generation block of figure 6.1 to examine

the transition predicates and make this determination. This theorem prover is a subject of

future work.

It has been mentioned at many places, how some expensive join and aggregate

triggers can be implemented more efficiently using timer triggers. More work needs to be

done on this issue. We have to identify factors within the trigger definition and run-time

characteristics that help us conclude the above. Having identified such triggers, we have

to work on a method to inform the owner of that trigger or the system administrator of

possible improvements in system performance using a timer trigger, along with some









numbers possibly. We also need to research the idea of directly switching to timer triggers

where possible, without changing the semantics of the trigger.

It is possible for many triggers to be defined on similar or even the same views.

Moreover, the trigger predicates themselves may be similar. In cases such as these, we

can group the processing of two or more similar triggers to prevent multiple predicate

evaluation and table accesses. This assumes more significance in aggregate triggers, due

to the higher processing costs required for the aggregation. The issue of grouping the

processing of two or more triggers needs to be addressed.

In example 3 of Chapter 3, a trigger was written to send an email weekly to the

finance department on the customers who have defaulted on their payment for ninety days

or more. So, a tuple qualifies for the action simply because its due date has aged for more

than 90 days. Triggers such as these, that do date arithmetic using the current date, can

only be implemented using brute-force. More analysis needs to be done, identifying

predicates like the one using current date, time and timestamps where the trigger will

work correctly using only the brute-force approach.














REFERENCES


[Alf98] Al-Fayoumi, N., Temporal trigger processing in the TriggerMan active
DBMS. Ph.D. dissertation, University of Florida, Gainesville, 1998.

[Car98] Carnes, C., A flexible data source architecture for an asynchronous trigger
processor. Unpublished paper, University of Florida, Gainesville, 1998.

[Cat99] Categoric http://www.categoric.com Palo Alto, CA, 1999.

[Cha89] Chakravarthy, S., HiPAC: A research project in active, time-constrained
database management. Final Technical Report. Technical Report XAIT-89-
02. Xerox Advanced Information Technology, Cambridge, MA, August
1989.

[For82] Forgy, C. L., Rete: A fast algorithm for the many pattern/many object
pattern match problem. Artificial Intelligence 19(1):17-37, September
1982.

[Geh91] Gehani, N., & Jagadish, H. V., Ode as an active database: Constraints and
triggers. In Proceedings of the seventeenth International Conference on
Very Large Data Bases 1991, pages 327-336. Morgan Kaufmann, San
Francisco, September 1991.


[Han90] Hanson, E. N., Chaabouni, M., Kim, C., & Wang, Y., A predicate
matching algorithm for database rule systems. In Proceedings SIGMOD
Conference 1990, pages 271-280. ACM Press, New York, June 1990.

[Han94] Hanson, E. N., & Chaabouni, M., The IBS tree: A data structure for
finding all intervals that overlap a point. Technical Report WSU-CS-90-
11. Wright State University, Dayton, OH, April 1990.

[Han96a] Hanson, E. N., The design and implementation of the Ariel active database
rule system. IEEE Transactions on Knowledge and Data Engineering
8(1):157-172, February 1996.

[Han96b] Hanson, E. N., & Johnson, T., Selection predicate indexing in active
databases using Interval Skip Lists. Information Systems, pp. 269-298,
1996.









[Han97] Hanson, E. N., Bodagala, S., & Chadaga, U., Optimized trigger condition
testing in Ariel using Gator Networks, University of Florida CISE
Department. Technical Report 97-021, Gainesville, November 1997.
http://www.cise.ufl.edu

[Han98] Hanson, E. N., Konyala, M., Vernon, A., Noronha, L., & Park, J., Scalable
trigger processing in TriggerMan. University of Florida CISE Department.
Technical Report 98-008, Gainesville, July 1998. http://www.cise.ufl.edu

[Han99] Hanson, E. N., Carnes, C., Huang, L., Konyala, M., Noronha, L.,
Parasarathy, S., Park, J., & Vernon, A., Scalable Trigger Processing. In
Proceedings International Conference on Data Engineering 1999, pages
266-275. IEEE Computer Society, Los Alamitos, CA, March 1999.

[Inf98] Informix Inc. http://www.informix.com Palo Alto, CA, 1998.

[Kon98] Konyala, M., Predicate indexing in TriggerMan. Master's thesis,
University of Florida, Gainesville, 1998.

[Kan98] Kandil, M., Predicate placement in active database discrimination
networks. Ph.D. dissertation, University of Florida, Gainesville, 1998.

[Mir87] Miranker, D. P., TREAT: A better match algorithm for AI production
systems. In Proceedings AAAI-87 Six\ih National Conference on Artificial
Intelligence, pages 42-47. Morgan Kaufmann, San Francisco, August
1987.

[Net99] NetMind http://www.netmind.com Campbell, CA, 1999.

[Oas98] Ozsu, M. T., & Valduriez, P., Principles of distributed database systems.
Prentice-Hall, Englewood Cliffs, NJ, 1998.

[Par98] Park, J., Implementation and performance tuning of discrimination
networks for asynchronous trigger processing view maintenance.
Unpublished paper, University of Florida, Gainesville, 1998.

[Pat98] Paton, N. W., ed. Active rules in database systems. Springer Verlag, New
York, 1998.

[Pug90] Pugh, W., Skip lists: A probabilistic alternative to balanced trees.
Communications of the ACM 33(6):668-676, June 1990.

[Rao98] Rao, J., & Ross, K. A., Reusing invariants: A new strategy for correlated
queries. In Proceedings SIGMOD Conference 1998, pages 37-48. ACM
Press, New York, June 1998.









[Sel79] Selinger, P. G., Astrahan, M. M., Chamberlin, D. D., Lorie, R. A., & Price,
T. G., Access path selection in a relational database management system.
In Proceedings SIGMOD Conference 1979, pages 23-34. ACM Press,
New York, May-June 1979.

[Ses96] Seshadri, P., Pirahesh, H., & Cliff Leung, T. Y., Complex query
decorrelation. In Proceedings International Conference on Data
Engineering 1996, pages 450-458. IEEE Computer Society, Los Alamitos,
CA, March 1996.

[Si198] Silberschatz, A., Korth, H. F., & Sudarshan, S., Database management
systems, 3rd ed., McGraw-Hill, New York, 1998.

[Sis95a] Sistla, A. P., & Wolfson, O., Temporal conditions and integrity constraints
in active database systems. In Proceedings SIGMOD Conference 1995,
24(2):269-280, San Jose, CA, June 1995.

[Sis95b] Sistla, A. P. & Wolfson, O., Temporal triggers in active databases. IEEE
Transactions on Knowledge and Data Engineering 7(3):471-486, June
1995.

[Sto91] Stonebraker, M., & Kernnitz, G., The POSTGRES next-generation
database management system. Communications of the ACM 34(10):78-92,
October 1991.

[Syb99] Sybase Replication Server http://www.sybase.com Emeryville, CA, 1999.

[Wan92] Wang, Y. W., & Hanson, E. N., A performance comparison of the Rete
and TREAT algorithms for testing database rule conditions. In
Proceedings IEEE Data Engineering Conference 1992, pages 88-97. IEEE
Computer Society, Los Alamitos, CA, February 1992.

[Wid91] Widom, J., Cochrane, R., & Lindsay, B. G., Implementing set-oriented
production rules as an extension to Starbust. In Proceedings of the
seventeenth International Conference on Very Large Data Bases 1991,
pages 275-285. Morgan Kaufmann, San Francisco, September 1991.

[Wid95] Widom, J. & Ceri, S., eds., Active database systems: Triggers and rules
for advanced data processing. Morgan Kaufmann, San Francisco,
September 1995.














BIOGRAPHICAL SKETCH


Lloyd Noronha was born on December 7, 1973 in Bombay, India. He received a

bachelor's degree in electronic engineering securing first class with honors from Bombay

University, Bombay, India, in May 1995.

He joined University of Florida in January 1998 to pursue a master's degree in

computer and information science and engineering.

He has worked as a research assistant with Dr. Hanson at the Database Systems

Research and Development Center.

His research interests are in active database systems.




Full Text

PAGE 1

Enhanced Techniques for Timer Trigger Processing1Lloyd Noronha and Eric Hanson 301 CSE CISE Department University of Florida Gainesville FL 32611-6120 (352) 392-2691 hanson@cise.ufl.edu http://www.cise.ufl.edu/~hanson TR-99-015 9 Sep 1999 1 This report contains the MS thesis of the same title written by Lloyd Noronha under the supervision of Eric Hanson, September 1999.

PAGE 2

ii ACKNOWLEDGMENTS I would like to express my sincere gratitude to my advisor, Dr. Eric Hanson, for giving me an opportunity to work on this challenging project and also for providing continuous guidance, advice, and support throughout the course of my work. I thank Dr. Joachim Hammer and Dr. Stanley Su for serving on my supervisory committee and for their careful perusal of this thesis. I would like to thank Ms. Sharon Grant for maintaining a great research environment at the Database Systems Research and Development Center. I am thankful to Chris Carnes, Lan Huang, Mohan Konyala, Sasi Kumar, J. B. Park, and Albert Vernon for their invaluable help and the fruitful discussions that we had during the course of this work. On a more personal note, I would like to thank my family whose love, support, and constant encouragement was of great importance through this work.

PAGE 3

iii TABLE OF CONTENTS Page ACKNOWLEDGMENTS...................................................................................................ii TABLE OF CONTENTS...................................................................................................iii LIST OF FIGURES.............................................................................................................v CHAPTERS 1 INTRODUCTION..........................................................................................................1 2 TRIGGERMAN, A FAST ASYNCHRONOUS TRIGGER PROCESSOR..................5 2.1 Introduction.......................................................................................................5 2.2 TriggerMan Architecture...................................................................................6 2.3 The Predicate Indexing Strategy of TriggerMan...............................................8 2.4 Conclusion.......................................................................................................10 3 TIMER TRIGGER LANGUAGE................................................................................12 3.1 Introduction.....................................................................................................12 3.2 Timer Trigger Language..................................................................................14 3.3 Trigger Semantics............................................................................................15 3.4 Timer trigger examples...................................................................................16 3.5 Conclusion.......................................................................................................20 4 ENHANCED ALGORITHMS FOR SINGLE TABLE AND JOIN TRIGGERS........21 4.1 Introduction.....................................................................................................21 4.2 The Brute Force Implementation for Single Table and Join Triggers.............22 4.3 Optimization Goals and Strategy.....................................................................23 4.4 Enhanced Algorithms for Single Table Triggers.............................................24 4.4.1 Single Table on Update..........................................................................25 4.4.2 Single Table on Insert.............................................................................28 4.4.3 Single Table on Delete...........................................................................29 4.4.4 Single Table on Update with Transition Predicate.................................30 4.5 Enhanced algorithms for Join Triggers...........................................................32 4.5.1 Join Trigger on Update...........................................................................32 4.5.2 Join Trigger on Update with Transition Predicate.................................34 4.6 Combining the View Predicate and the Trigger Predicate..............................36

PAGE 4

iv 4.7 Examples.........................................................................................................37 4.8 Conclusions.....................................................................................................41 5 ENHANCED ALGORITHMS FOR AGGREGATE TRIGGERS..............................42 5.1 Introduction.....................................................................................................42 5.2 The Brute Force Implementation for Aggregate Triggers...............................42 5.3 Single Table Aggregate Triggers.....................................................................44 5.3.1 Single Table, Aggregate, Non-Transition, on Update............................45 5.3.2 Single Table, Aggregate, Transition, on Update....................................47 5.4 Multi Table Aggregate Triggers......................................................................48 5.4.1 Multi Table, Aggregate, Non-Transition, on Update.............................48 5.4.2 Multi Table, Aggregate, Transition, on Update.....................................52 5.5 Examples.........................................................................................................53 5.6 Conclusion.......................................................................................................55 6 OPTIMIZATION AND IMPLEMENTATION ISSUES.............................................56 6.1 Introduction.....................................................................................................56 6.2 Architecture.....................................................................................................56 6.3 Thoughts on Optimization...............................................................................59 6.4 Optimization Strategies...................................................................................60 6.5 Conclusion.......................................................................................................62 7 SUMMARY AND FUTURE WORK..........................................................................64 7.1 Summary.........................................................................................................64 7.2 Thoughts on Future Work...............................................................................65 REFERENCES..................................................................................................................70 BIOGRAPHICAL SKETCH.............................................................................................73

PAGE 5

v LIST OF FIGURES Figure Page 2.1: TriggerMan Architecture..............................................................................................7 2.2: Predicate Index Architecture........................................................................................8 6.1: Architecture of a Timer Trigger System....................................................................57 6.2: Performance of the brute force method vs. the enhanced method for varying update rates..........................................................................................60

PAGE 6

vi Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Science ENHANCED TECHNIQUES FOR TIMER TRIGGER PROCESSING By Lloyd X. Noronha August, 1999 Chairman: Eric Hanson Major Department: Computer and Information Science and Engineering Timer-Driven Triggers allow users to monitor interesting tuples or interesting changes to tuples of a view. Trigger condition testing takes place periodically, i.e., once every few hours, days or weeks. When the timer of a trigger expires, a new copy of the view is retrieved and the trigger predicates applied to that view. Trigger predicates could be transition predicates referring to the attribute values of tuples of a view from the previous timer expiration. Timer Triggers differ from conventional triggers, in that the trigger testing is not done for updated tuples only. Timer triggers could be fired on tuples of a view simply because they satisfy the predicate trigger predicate. Since the trigger conditions are tested every so often, high update rate tables can have a lot of timer triggers defined on them. The goal of this thesis is to design a timer trigger system that works hand-in-hand with a trigger processor and complements its features. The semantics of the timer trigger differ from that of regular triggers and are explained by a brute-force approach. Several

PAGE 7

vii enhanced algorithms can be written that do better than the brute-force approach, but do not change the semantics. A timer trigger system also provides the user an option of replacing expensive join and aggregate triggers with timer triggers.

PAGE 8

1 CHAPTER 1 INTRODUCTION Timer Triggers test and fire triggers periodically i.e., every few minutes, hours, days, weeks etc. Views are created on the database and triggers are fired periodically on interesting tuples or interesting changes to tuples of the view. This idea differs from the conventional philosophy of synchronous [Sil98] and asynchronous [Han97] triggers which fire triggers when a tuple is updated. Current database systems process triggers synchronously. Synchronous trigger processing lengthens the response time of update transactions, because the transaction cannot commit until all the trigger conditions defined on the updated tables are tested and actions fired. The response time of update transaction is thus directly affected by the increase in the number of triggers, which limits the number of triggers that can be defined on a table. In the case of assertions synchronous triggers are indispensable. But in most other cases, delaying the trigger processing to until after the triggering transaction commits (i.e. asynchronous trigger processing) should work just as fine. This is especially true in the case of alerters. In an asynchronous trigger system, the only work expended during a transaction is to capture the updates for later processing [Han98]. This time spent in capturing updates is a constant and totally unrelated to the number of triggers defined on the tables involved in the transaction. Thus, it is possible to increase the number of

PAGE 9

2 triggers defined on a table to a very large number without affecting the response time of update transactions. After the transaction completes, each of the updated tuples captured (also known as tokens) are individually tested against the trigger condition of each trigger and actions fired accordingly. This process of testing all the trigger conditions can be done sequentially. However, a huge speedup can be achieved by means of an efficient discrimination network for evaluating predicates. A Selection Predicate Index, is one such discrimination network [Han98], which causes a factor of improvement in the processing time per updated token. This strategy works really well for all types of triggers, especially with single table triggers because the entire condition evaluation on an update token takes place in main memory i.e., without accessing the database. In the case of join triggers and aggregate triggers, there is a relative increase in time spent on processing a token, due to a possible table-access involved during condition evaluation of the trigger. In the case of join triggers, when an update token of a table arrives, a part of the condition evaluation involves performing a join. In the case of aggregate triggers, we have to identify the group to which the update token belongs and test the aggregate condition for that group. This is covered in more detail in Chapters 4 and 5. In both these cases, the Selection Predicate Index is still used to evaluate a part of the predicate that does not involve a table access. If this part evaluates to true, we then proceed to run an SQL query to read one or more tables, to evaluate the rest of the predicate. Thus, although an efficient discrimination network still evaluates a part of the predicate quickly, it is the I/O costs that dominate the processing time of join and

PAGE 10

3 aggregate triggers. Timer triggers can be thought of as an alternative option to some expensive join and aggregate triggers because they access the database less often. Timer triggers enable the user to define triggers on a view of the database. The view could be a join or an aggregation of one or more tables or simply a snapshot of a single table. Trigger conditions are tested periodically i.e., once every few minutes, hours, days, weeks etc. On a timer expiration, a new copy of the view is created and the predicates of the trigger are applied to the view. These predicates could be transition predicates, which compare some attribute values of the current view with the corresponding values of the copy of the view created on the previous timer expiration. It is easy to conceive of many applications and examples where users would like to be intimated of interesting conditions once every so often. Consider an example of a manager of a computer company, who is interested in knowing every week, which of his hardware salesmen have a two fold increase in sales from the previous week. A timer trigger can be defined with the timer duration of a week. The view here, keeps the sales information of only the hardware salesmen. When the timer expires, a fresh copy of this view is created from the sales table and the sales information for each salesman is compared with his sales from the view of the previous week. The action is fired for all tuples of the view that satisfy the predicate. Asynchronous triggers, like synchronous triggers, test the trigger condition for every updated tuple, albeit only after the transaction commits. Thus, only an updated tuple can cause a trigger to fire. However, in timer triggers, all tuples of a view that satisfy the predicate can fire the trigger. These include tuples which may have not been touched since the last timer expiration.

PAGE 11

4 By definition, timer triggers are defined to achieve an objective different from asynchronous triggers and thus have different semantics. However, as indicated above, for expensive join and aggregate triggers, it is possible to replace an asynchronous trigger with a timer trigger to achieve similar (but not always the same) results. The goal of this thesis is to propose a good design of a timer trigger system to complement an existing trigger processor. Chapter 2 explains the architecture and predicate indexing mechanism of an asynchronous trigger processor, TriggerMan. The syntax and semantics of the timer trigger language are presented in Chapter 3 along with some examples. Implementation of timer triggers with efficient algorithms for single table and join triggers are discussed in Chapter 4. Chapter 5 has a similar discussion for aggregate triggers. Chapter 6 focuses on the architecture of a timer trigger system and some optimization details. Finally, Chapter 7 summarizes the discussion on timer triggers and provides some thoughts for future work.

PAGE 12

5 CHAPTER 2 TRIGGERMAN, A FAST ASYNCHRONOUS TRIGGER PROCESSOR 2.1 Introduction The design of a timer trigger system in this thesis focuses on working with a conventional (synchronous or asynchronous) trigger processor to complement and enhance its functionality. In later chapters, we present some enhanced algorithms for the implementation of timer triggers. These algorithms need to define some regular singletable triggers to capture interesting updates and thus it is very important for the underlying trigger processor to have a fast predicate evaluation mechanism for singletable triggers. In this chapter we explain the architecture of such a trigger processor called TriggerMan [Han98]. TriggerMan is a scalable and extensible trigger processor. The triggers in TriggerMan can have user defined data types and the system scales well even for millions of triggers for certain applications. The scalability of the system can be attributed mainly to its asynchronous trigger processing mechanism and an efficient predicate indexing strategy. As the number of triggers in a system increases, more time is spent on testing each updated tuple against all the trigger conditions. Making this testing process asynchronous alleviates the problem of delays in update transactions, which is an inherent problem of synchronous trigger processing. TriggerMan additionally implements a

PAGE 13

6 predicate indexing strategy [Han99] that is responsible for a huge speedup in evaluating the predicates of the triggers and thus reduces the processing time per updated tuple. Extensibility will soon be an indispensable requirement in trigger systems. With the advent of Object Relational Database Management Systems (ORDBMS), extending data types and adding new routines in a database system have gained popularity amongst users. Users are allowed to create new User-Defined Data Types (UDT) and UserDefined Routines (UDR) in the database system and use them with regular SQL. TriggerMan is designed to recognize UDT's and UDR's of the underlying database system and allow their use in triggers. The other possible choices of trigger processors for our timer trigger system are in the active database systems of Ariel [Han96a], POSTGRES [Sto91], HiPAC [Cha89], Starbust [Wid91], and Ode [Geh91]. But they do not provide the scalability feature of TriggerMan and very limited, if any, extensibility support. Among them, Ariel and POSTGRES do implement a selection predicate indexing architecture, which however, is not extensible. 2.2 TriggerMan Architecture TriggerMan is designed to work as a stand-alone module independent of the underlying system. However, for simplicity of explanation, the architecture that is discussed here is as a module that resides on an Object-Relational DBMS. The architecture of TriggerMan is shown in Figure 2.1. The main components of the architecture are explained as follows. The TriggerMan Module resides and runs in the address space of the DBMS server. The task

PAGE 14

7 queue is a to-do list of tasks for the trigger processor. Once every so often, multiple TriggerMan drivers run in parallel to read and process jobs from the queue. The Data Source applications are responsible for capturing updates on local and remote data sources and putting them in an update queue. These updates are later gathered up from the update queue by the TriggerMan drivers and added as tasks to be processed in the task queue. Figure 2.1: TriggerMan Architecture Lastly, the TriggerMan console is an application program that interfaces to the user and allows him to initialize the system, create triggers, drop triggers, shut down the system etc. Host ORDBMSUpdate Queue Task Queue TriggerMan Module Data Source App TriggerMan Driver TriggerMan Console

PAGE 15

8 2.3 The Predicate Indexing Strategy of TriggerMan The major work done by a TriggerMan driver program in Figure 2.1 is in evaluating the predicates for an updated tuple against each of the trigger definitions. TriggerMan employs a novel selection predicate indexing strategy [Han98] that makes the predicate evaluation faster by a huge factor. Figure 2.2: Predicate Index Architecture Selection Predicate Index Predicate Index Root Data Source Node Expression Signature List Constant Sets Sets of the rest of the predicate and trigger id's for a particular constant

PAGE 16

9 If a system has a large number of triggers then there is a fair chance that some clusters of triggers have a part of their predicate looking similar. This is the basic premise of TriggerMan's predicate indexing strategy. For example a majority of triggers defined by a stock investor will be alerters where the predicates check for stock prices going above or below a certain interesting value. The predicate indexing strategy of TriggerMan, simplified for explanatory purposes, is shown in Figure 2.2. All the data sources defined in TriggerMan have a node at the first level below the root. When a trigger is defined, its predicate is converted to Conjunctive Normal Form (CNF), and the most selective conjunct qualifies to be the expression signature for that trigger. If the same signature already exists, then the constants of this conjunct are plugged into the constant list under that signature and nodes containing the rest of the predicate and the id of the trigger are placed under this constant. If the same signature does not already exist then a new signature node is created and the same steps followed. For example consider the following two triggers having the same expression signature: create Trigger buy_informix from stock where (name="Informix") and (price<5) do email("david", "Stock price of Informix within 5"); create Trigger sell_oracle from stock where (name="Oracle Corp.") and (price>40)

PAGE 17

10 do email("david", "Sell Oracle Corp. now. The price is :price"); The trigger processor isolates the predicate "stock.name=constant" as the expression signature for both the triggers because it finds it the most selective. It adds this signature into the list of expression signatures under the data source "stock" (if it does not already exist). The constants "Informix" and "Oracle" are inserted into the list of constants (if they do not already exist). Finally, a node for the id's of the triggers and the remaining part of the predicate are inserted under these constants. Now, when the information about an updated tuple of the stock relation is pulled out from the task queue by the TriggerMan driver, it is made to pass through this discrimination network. The attribute values within this tuple are evaluated against the constants within each expression signature. For every constant for where the expression evaluates to true, we proceed with testing the rest of the predicate and then firing the trigger action if necessary. The predicate for single table triggers can be evaluated completely within this main memory discrimination network. However for join and aggregate triggers, evaluation of the rest of predicate involves reading relations to perform joins and aggregations. 2.4 Conclusion The optimization algorithms for timer triggers need to define single table triggers for each of the data sources with the timer trigger definition. The predicate indexing strategy of TriggerMan works best for single table triggers and hence is ideal for the timer

PAGE 18

11 trigger algorithms. Along with a fast predicate evaluation technique, TriggerMan also scales well for a large number of triggers and works with extensible data types.

PAGE 19

12 CHAPTER 3 TIMER TRIGGER LANGUAGE 3.1 Introduction The preceding chapters emphasized the importance of an asynchronous trigger processor like TriggerMan, which provides a more powerful language than current trigger systems while at the same time providing the features of scalability and extensibility. There are trigger situations however, where the events of interest for a user do not lie in individual updates to the database but in interesting happenings to the database over a period of time. Take the case of a stock market investor who wishes to be alerted at the end of the week when he spends over a $10,000 in purchasing stock of a single company. Here, the investor requires the trigger system to keep track of all his purchases for the week. In essence, the user wants the trigger system to keep track of a certain view of the database and notify him periodically of interesting changes to that view. This is the fundamental motivation behind timer triggers. Apart from fulfilling it's basic role of periodic notification of interesting changes to a view, timer triggers also provide users an option of off-loading, where possible, expensive triggers from asynchronous processing to timer based processing. An asynchronous trigger processor tests for trigger conditions on each updated tuple. While this strategy works out well for single table triggers (due to no table access), join and

PAGE 20

13 aggregate triggers could suffer sometimes. Both join and aggregate triggers potentially involve a table access during predicate evaluation on an update token. A part of the predicate not dependent on the table-access, can still be evaluated using the fast predicate indexing strategy as that of TriggerMan. However, evaluating the remaining portion of the predicate in join triggers requires tuples from one or more tables to be read. In aggregate triggers all tuples belonging to the same group as the updated tuple have to be read and an aggregation performed on them. This leads to a higher processing cost per update token, whose effect could be augmented by the increase in the number of join and aggregate triggers. In systems where triggers are defined on high update-rate tables, this problem could get compounded causing the system to get overloaded. In situations such as these, the timer trigger system in addition to fulfilling it's semantic role, could lend a helping hand to the asynchronous trigger system by offloading some expensive join and aggregate triggers to timer triggers. Instead of performing table accesses often, the advantage lies in doing them once every so often. Timer triggers thus, albeit semantically different, can be used to meet objectives similar to asynchronous triggers for a better system performance. Due to differing semantics however, the decision to switch to a timer trigger must not be one done automatically by the trigger processing system but should be an explicit decision of the user. The sections to follow explain the syntax and semantics of a timer trigger system.

PAGE 21

14 3.2 Timer Trigger Language A Timer Trigger System is designed to work as a component of an asynchronous trigger processor like TriggerMan, making the schema definitions of both the DBMS and trigger processor available to it. Timer triggers can be created on data sources or views. The asynchronous trigger processor is already familiar with the schema and the primary key of all data sources specified in the command. The syntax for creating timer triggers is as follows: create timer trigger from {on event} {when } do check every {initialize {immediately | on first timer expiration}} The from clause specifies the name of a data source or view on which the trigger is defined. This field can also contain a SQL select statement. The on event clause could either be insert, delete or update indicating that the trigger should fired only when inserts, deletes or updates take place in the view. The default is update. The when clause has the trigger condition on the view. When the timer expires, this boolean expression is applied on the view to identify tuples satisfying the trigger condition. Although any kind of predicates are allowed to be in the when clause, its primary purpose is to specify transition predicates. The do clause specifies the trigger action. The check every clause specifies the timer period or the day of the week and time when the trigger should fire. The initialize clause is optional and specifies whether the first instance of the view is created immediately or the first time the timer expires.

PAGE 22

15 3.3 Trigger Semantics The semantics of the timer trigger command is now explained for the three different events viz. insert, delete and update. Consider a timer trigger defined on a data source or a view with a when clause W . The predicates could be transition predicates referring to old values of attributes obtained from the copy of the view created during the previous timer expiration. The view itself, could be a single relation or a result of applying a select statement with predicates on single tables, joins or aggregate queries. Periodic processing: In the explanation to follow V is the current copy of the view created when the timer expires by executing the query in the from clause of the create timer trigger statement and old_V is the copy of the view from the previous timer expiration. The following steps are executed when the timer expires: Event ON UPDATE: 1. Retrieve the contents of the view into V. 2. If the when clause W has a transition condition then the action is run on every tuple resulting from the operation sw (V JOIN old_V). If there is no transition condition, then the action is run on every tuple resulting from the operation sw(V). 3. Destroy old_V and rename V to old_V. Event ON INSERT: 1. Retrieve the contents of the view into V. 2. The action is run on every tuple resulting from the operation sw (V old_V).

PAGE 23

16 3. Destroy old_V and rename V to old_V. Event ON DELETE: 1. Retrieve the contents of the view into V. 2. The action is run on every tuple resulting from the operation sw (old_V V). 3. Destroy old_V and rename V to old_V. 3.4 Timer trigger examples Example 1 : Send an alert to employees who have taken more vacation than their allowed quota but haven't filled out a special requisition form. The schema definition of the relation is: employee(emp_id, emp_name, email_addr, vacation_taken, vacation_left, vacation _req_filled .) with emp_id as the primary key. The trigger is defined as follows: create timer trigger excess_vacation from (select email_addr, vacation_req_filled from employee where vacation_taken > vacation_left ) when vacation_req_filled = False check every week beginning Monday at 8:00 AM do email(:email_addr, "Please fill out the Excess Vacations form"); Here the view is simply the details of the employees who have exceeded their allowed vacation quota. When the timer expires, it identifies the employees who haven't filled out the necessary form from among those in the view. The notation :email_addr in the action substitutes the actual email addresses by macro expansion while running the action for qualifying tuples. Since no event is specified, on update is assumed. Moving

PAGE 24

17 the predicate of the when clause to the definition of the view does not change the meaning of the trigger in this example. The above is a good example illustrating the use of timer triggers and it's differing semantics from conventional triggers. The intent is to email a forgetful employee every Monday morning until he does the needful. Trying to achieve the same with a synchronous or an asynchronous trigger would send an email to the employee every time his details are touched or it is possible that he does not get emailed at all if his details are not touched. Example 2 : Alert all junior managers when they have two or fewer days of vacation left in the year. The schema of the relation employee is the same as the one above. create timer trigger less_vacation_left from (select email_addr from employee where (designation='Jr. Managers') and (vacation_left vacation_taken) <= 2 ) on insert check every week do email(:email_addr, "You have only two or less days of vacation left in the year"); The view has all the junior managers having two or less days of vacation. The on insert clause ensures that the employee is emailed only the first time his vacation dips below two days. Example 3 : Notify the finance department of the customers who haven't honored their invoices for more than ninety days past the due date.

PAGE 25

18 The schema definition of the relations are: invoice (num, cno, due_date, amount, ..) with primary key num and customer (cno, name, addr, ) with cno as the primary key. The trigger is defined as follows: create timer trigger unhonored_invoice from (select invoice.num as inv_num, customer.name as cust_name from customer, invoices where (invoice.cno = customer.cno) and (CURRENT_DATE invoice.due_date) > 90 and (invoice.status='UNPAID') check every week do email("finance@experts.com", "Customer :cust_name has not honored the invoice :inv_num for more than 90 days"); This is a good example of a join trigger where the relation of interest is only invoice. The customer relation is present only to extract the information on defaulting customers. Another important characteristic of this trigger is that tuples who qualify are mostly tuples lying untouched for more than 90 days. Example 4 : The managers of a famous record label wish to let their new pop and rock artists know when their album sales have crossed a million. The schema definition of the relations are: album (album_id, artist_name, sales, artist_type, ) with album_id as the primary key and artist (artist_name, category, ) with artist_name as the primary key. The trigger is defined as follows: create timer trigger new_artist_album_sales from (select artist.email as artist_email, album.sales as sales from album, artist where (album.artist_name = artist.artist_name) and ((artist.category="rock") or (artist.category="pop")) and (album.artist_type="new") when old.sales < 1000000 and sales>=1000000

PAGE 26

19 check every day do email(:artist_email, "Congratulations! Your album sales have crossed a million"); The transition predicate in the when clause ensures that only one email is sent to the artist on the day when his album sales have crossed a million. Example 5 : The managers of a departmental store chain wish to keep the salaries of all the sales personnel to within 50% of the budgeted salary. The schema definition of the relations are: emp(emp_id, name, dno, sal, desig, .) with emp_id as the primary key and dept(dept_id, name, budget, manager_email, .) with dept_id as the primary key. The trigger is defined as follows: create timer trigger excess_salary from (select dept.manager_email as mgr_email, sum (emp.sal) as sum_sal, dept.budget as budget from emp, dept where (emp.dno = dept.dno) and (emp.desig="sales personnel") and (dept.name="toy") group by dept.name, dept.manager_email, dept.budget when sum_sal > .50 * budget check every week do email(:mgr_email, "You are exceeding the allowed budget for salespersons."); This is an example of an aggregate trigger involving two tables. A trigger can fire due to changing information on either of the tables and both tables also have selection predicates. The view here has aggregate information about the total salaries of sales personnel in the toy dept.

PAGE 27

20 3.5 Conclusion The syntax and semantics of a timer trigger language are explained. The examples illustrate how users can use timer triggers to be informed periodically of interesting tuples in a view. In example 3, we saw an interesting property of timer triggers where some tuples satisfy the trigger condition although they did not do so in the previous timer expiration and have not been touched since. They qualified for the action simply because they aged enough to satisfy a date-time predicate. The examples brought about the similarity and the differences in timer triggers against the more conventional synchronous and asynchronous triggers.

PAGE 28

21 CHAPTER 4 ENHANCED ALGORITHMS FOR SINGLE TABLE AND JOIN TRIGGERS 4.1 Introduction In the previous chapter we presented the syntax of the "create timer trigger" command and explained the semantic definition of the different types of single table triggers, join triggers and aggregate triggers with some examples. A brute force method for implementing any timer trigger involves running a query against the relations in the trigger definition to generate the new copy of the view and then evaluating the predicates of the trigger for all tuples in that view. While the brute force method always works and is the simplest implementation, it is often an inefficient way to go. Consider for example, the following read-mostly table employee_details which stores the name and information of all employees within a company, employee_details(emp_id, emp_name, emp_addr,..) Changes to this table normally take place only when a new employee is recruited or an existing employee leaves. If one or more timer triggers are defined on this low update rate table, then it is totally unnecessary to re-run queries on the entire relation employee_details to create the new view, every time the timer goes off. In the case when there are no updates at all, between two timer expirations, the new view is simply the old view. If only a few updates take place in this interval, then the new view can be obtained

PAGE 29

22 from the old view by simply keeping track of the updates since the last timer expiration and applying those updates to the old view during processing. In this chapter we try to look at some enhanced strategies to generate the old view from the new view without the brute force method. Several enhanced methods seem appealing, but with improper optimization methods several cases of anomaly arise when insert, updates and deletes take place on the same tuple many times in one interval. Our enhanced technique endeavors to rid ourselves of these anomalies, and generate an efficient technique to handle almost all types of timer triggers. In the next section we illustrate the brute force approach to implement timer triggers. With the disadvantages of this approach in mind, we highlight our points of focus while developing the various enhanced algorithms, followed by the actual algorithms for timer triggers, on single relations and joins. We end the chapter with some examples and thoughts. 4.2 The Brute Force Implementation for Single Table and Join Triggers Consider a general trigger T defined on one or more relations R. The predicate of the trigger can be expressed in CNF as P Ptran, where P represents all the non-transition predicates and Ptran represents the transition predicates. Although the view and the trigger have their separate predicates Pv and Pw respectively, we combine them together discussion and represent them as P Ptran. More details on combining predicates and applying them are covered later in this chapter in section 4.6. The current view created is CurrView= sP(R) and PrevView is the copy of the view from the previous timer expiration. Note that CurrView and PrevView are views we use for implementation, they

PAGE 30

23 are different from the view defined in the trigger i.e. V= sPv(R). The operations to be performed to implement this trigger are: 1. Create a new copy of the view, CurrView= sP(R) 2. Run the action on all the tuples resulting from the following operation, sPtran (CurrView JOIN PrevView). The above two steps translate into two SQL queries. For non-transition triggers, the second step is not needed and the action is run on all the tuples of CurrView, resulting in only one SQL query. 4.3 Optimization Goals and Strategy The primary goal of developing enhanced algorithms for timer triggers is to generate the new view from the old view with a method more efficient than the brute force method, in terms of CPU and I/O processing time. This, as might be expected, implies that we keep track of some information about the tuples updated on a table since the last timer expiration. When the timer goes off we can modify the old view to obtain the new view. As mentioned previously, improper optimization methods can cause anomalies, whereby trigger behavior is different than when using the brute-force method. We overcome these anomalies by keeping track of only the primary key of the tuples that were updated. When the timer goes off, we perform a join of these keys with the original relation, to get the correct picture of which among the touched tuples were effectively inserted, deleted or updated. We then go on to apply these operations to the old view to obtain the new view.

PAGE 31

24 The timer trigger system is designed to complement an asynchronous trigger system. As mentioned before, for a table on which a timer trigger is defined, we keep track of all updates to the table within each processing interval. We can go a step further and keep track of only those tuples which are of interest to the trigger i.e. those tuples, which satisfy the predicates of the trigger. This can be done by using the existing asynchronous trigger system of TriggerMan. Predicates for updated tuples can be evaluated asynchronously using the fast and efficient Selection Predicate Index in main memory [Han98]. Another focus while developing our algorithms is to query the relations on which triggers are defined, only once. It is reasonable to assume these relations are large or will keep growing. It is therefore important for our algorithms to not touch the relations more times than necessary. So, in almost all cases, we query the original relation only once for its attribute values. 4.4 Enhanced Algorithms for Single Table Triggers In this section we present our enhanced algorithms to implement the various timer triggers on single relations. These include triggers with or without transition predicates and with any of the various events viz. on insert, on update and on delete. The general schema of the view CurrView, on a single table R is (key, fields, tranfields), where key is the primary key of the table R and fields are one or more attributes of R whose values are required in the action. The third field tranfields is present only in a transition trigger and includes those attributes of R, whose values are required to evaluate the transition predicate.

PAGE 32

25 4.4.1 Single Table on Update Consider a timer trigger defined on a single relation R (key, field1, field2, field3 ,fieldn), with one or more predicates P, none of which have a transition condition. The brute force method to implement this trigger is to evaluate the query currView= P key,fields ( sP(R)), every time the timer goes off. The specified action is then run for all the resulting tuples of the view currView. In our enhanced algorithms we maintain a relation S(key), which is a set of unique primary keys of tuples of R, that were touched (i.e. inserted, deleted or updated), since the last timer expiration. Additionally, the keys of only those tuples are inserted into the relation S, which either satisfy the predicate clause P or used to satisfy it before the update. Regular triggers are defined on the relation R to collect this set S. This is explained in greater detail with examples in later sections. The following operations can now be performed to generate a new copy of the view: 1. TempCandidates = P key,fields ( sP(S JOIN R)) 2. Refresh CurrView : 2a. CurrView = CurrView (S JOIN CurrView) 2b. CurrView = CurrView TempCandidates 3. Candidates for the action are all the tuples of the view CurrView. Explanation: CurrView is the view from the last timer expiration. When the timer goes off, we need to refresh the view CurrView before running the action on all its tuples. The first step is to identify tuples to be inserted into the view or have their existing values updated. These tuples could only be from among the touched tuples of relation R

PAGE 33

26 in the interval since the last timer expiration, i.e., from the set of keys in relation S. These tuples are collected into a temporary relation TempCandidates. A tuple updated in relation R may cause an insert into the view currView, if that tuple did not previously satisfy the predicate P, but does so after the update. In performing a join on the original relation R, we get the current and thus the correct values of the attributes for evaluating the predicate. TempCandidates should not include any tuples which need to be deleted from the view. A tuple will be deleted from the view, either because it is physically deleted from the relation R or because it no longer satisfies the predicate due to the update. Performing the join between the relation S to the original relation R takes care of eliminating the former and evaluating the predicate P takes care of eliminating the latter. The second step refreshes the view currView. The first of the two queries deletes all the tuples in CurrView that were touched, i.e. all keys in S. The second query inserts any of the touched tuples that still belong to the view. Let's separately consider how the inserts, updates and deletes to the view are correctly handled. A tuple to be inserted into the view will be present in the temporary relation TempCandidates, and thus gets inserted into the view CurrView by the second query. A tuple that is updated needs its attribute values refreshed in the view CurrView. This tuple is present in TempCandidates. It is updated by first deleting it from the view CurrView with the first query and then reinserting it into the view from TempCandidates with the second query. A tuple to be deleted from the view will not be present in the temporary relation, TempCandidates. The first query will delete this tuple and the second query will not reinsert it into the view. Finally the third step simply calls a User Defined Routine to run the action on each of the tuples in the view CurrView.

PAGE 34

27 The above steps can be translated into SQL queries as follows: 1. Select S.key as key, R.fields as fields from S, R where S.key = R.key and sPinto temp TempCandidates; 2. 2a. Delete from CurrView where exists (select S.key from S where S.key=CurrView.key); 2b. Insert into CurrView select key, fields from NewCandidates; 3. Select trigAction (key, fields) from CurrView; where trigAction is a UDR for the specified trigger action Salient features of the algorithm are the following. The above algorithm is well enhanced and should work especially well on tables with a low update rate. The set of keys of touched tuples in relation S is kept to as small a size as possible. This is done by evaluating the predicate P, on each tuple, before inserting its key into S. If the predicate P is highly selective, only a small fraction of the updates to R, have their keys stored in S. There is however, a small cost overhead of evaluating the predicates P on touched tuples before inserting it into the relation S. This cost is significantly reduced by making use of the existing fast and efficient discrimination network of TriggerMan. The number of queries in the above algorithm is restricted to three, of which only one accesses the original relation R. The other two queries operate on much smaller

PAGE 35

28 relations and should not cause a performance overhead. Processing is made more efficient when the relation R has an index on the primary key, because then, in the worst case, only as many pages of relation R are accessed as the number of unique keys in S. Correlated sub-queries, as the one above in step 2, are well optimized by several existing DBMS products. This query is flattened to perform a join ignoring any duplicates. 4.4.2 Single Table on Insert In the case of single table triggers with the on insert event, we are interested in rows inserted into the view CurrView, since the last timer expiration. When the timer goes off, we create the current copy of the view, identify all new entries from the previous copy and run the action on them. As before the relation S(key) is the set of keys of touched tuples of relation R. The operations to process the trigger are as follows: 1. TempCandidates = P key,fields ( sP(S JOIN R)) 2. Candidates for the action are all the tuples of (TempCandidates CurrView) 3. Refresh CurrView : 2a. CurrView = CurrView (S JOIN CurrView) 2b. CurrView = CurrView TempCandidates The first step creates a temporary relation TempCandidates of all the inserts and updates as before. The next step runs the trigger action on all the tuples in TempCandidates that are not in the previous copy of the view CurrView. Finally CurrView is refreshed as before. The SQL statement to run the action on new inserts in step 2 can be written as follows:

PAGE 36

29 select trigAction (key, fields) from TempCandidates where not exists (select CurrView.key from CurrView where key = TempCandidates.key) where trigAction is a UDR for the specified trigger action. It accepts the key and fields of the view and executes the action. 4.4.3 Single Table on Delete In the case of single table triggers with the on delete event, we are interested in rows deleted from the view CurrView, since the last timer expiration. When the timer goes off, we create the current copy of the view, identify rows deleted from the previous copy and run the action on them. The operations to process the trigger are as follows: 1. TempCandidates = P key,fields ( sP(S JOIN R)) 2. Candidates for the action are all the tuples of ((S JOIN CurrView) TempCandidates) 3. Refresh CurrView : 2a. CurrView = CurrView (S JOIN CurrView) 2b. CurrView = CurrView TempCandidates The first and the third steps are no different from before. The second step identifies tuples that were present in the previous copy of the view, but were deleted since. Of course, these could only be from among the set S. (S JOIN CurrView) returns all the tuples in the previous copy of the view, that were touched since. From this set, we

PAGE 37

30 ignore the inserts and updates (i.e. tuples in TempCandidates) to get only the deletes. The SQL query looks as follows: select trigAction (key, fields) from S, CurrView where S.key = CurrView.key and not exists (select TempCandidates.key from TempCandidates where S.key = TempCandidates.key). where trigAction is a UDR for the specified trigger action. It accepts the key and fields of the view and executes the action. Again, correlated sub-queries are flattened and well optimized by major commercial DBMS products. 4.4.4 Single Table on Update with Transition Predicate Timer triggers with transition predicates compare old and new values of attributes of corresponding tuples using the old and new keywords as qualifiers. As discussed later in section 4.6, a predicate for a transition trigger can be written in Conjunctive Normal Form (CNF) as P Ptran. P has all the non-transition predicates as before and is used to define the view currView= Pkey,fields,tranFields ( sP(R)). Ptran is the transition predicate operating on the old and new values of certain fields written as tranFields in the view. For simplicity, we assume that the predicate Ptran is such that only a change in the value of tranFields may cause it to evaluate to true. A more general discussion on evaluating transition predicates which evaluate to true even when the fields tranFields are untouched (e.g. old.value=new.value) is present in Chapter 7 under future work.

PAGE 38

31 In order to process a transition trigger, we create the current copy of the view, and then apply the transition predicates on the old and new copies of the view. The operations to process the trigger are as follows: 1. TempCandidates = P key, fields, tranFields ( sP(S JOIN R)) 2. Candidates for the action are all the tuples in sPtran (CurrView JOIN TempCandidates). 3. Refresh CurrView : 2a. CurrView = CurrView (S JOIN CurrView) 2b. CurrView = CurrView TempCandidates The first and third steps are same as before. In the second step we identify the tuples which satisfy the transition condition. Only those tuples that were updated, since the last timer expiration, are candidates to satisfy the transition condition. The join identifies only the updated tuples and applies the transition predicate Ptran on them. An SQL query for this step would look as follows: select trigAction (CurrView.key, CurrView.fields, CurrView.tranFields, TempCandidates.tranFields) from TempCandidates, CurrView where TempCandidates.key = CurrView.key and Ptran (TempCandidates.tranFields, CurrView.tranFields) where Ptran is the transition predicate applied on field tranFields of the view.

PAGE 39

32 4.5 Enhanced algorithms for Join Triggers A join trigger is defined on multiple tables. The schema for a view on N relations R1 to RN is CurrView(key1, , keyN, fields, tranfields) where keyI is the primary key of relation RI, "fields" are one or more attributes from any of the N relations whose values are required in the action and the last field "tranfields" have one or more attributes required to evaluate the transition predicate, if any. The selection predicate, normalized to CNF is P1 P2 PN PJOIN, where PI is the selection predicate on RI and PJOINrepresents the join predicates. The view definition is: CurrView= P key1 , , keyN, fields, tranFields( sP1P2 PN ( sPJOIN(R1 X R2 X .X RI X RN))) The brute force method accesses all the tuples of all the relations to create the new copy of the view. 4.5.1 Join Trigger on Update We maintain individual sets for each relation to register the updated tuples since the last timer expiration. Each of these relations SI(keyI) has the set of all the keys of relation RI that were touched since the last processing period. When the timer goes off the following operations are performed: 1. Create TempCandidates For I = 1 to N do If SI is empty then skip to the next iteration;

PAGE 40

33 TempCandidateI = P key 1, , keyN, fields, tranFields( sP1P2 PN ( sPJOIN(R1 X (SI JOIN RI) X RN))) TempCandidate = TempCandidate TempCandidateI.; done 2. Refresh CurrView : 2a. For I = 1 to N do If SI is empty then skip to the next iteration; CurrView = CurrView (SI JOIN CurrView) done 2b. CurrView = CurrView TempCandidates 3. Candidates for the action are all the tuples of the view CurrView. The relation TempCandidateI represents the set of inserted and updated tuples of the view due to the updates on RI only. The relation TempCandidates represents all the inserts and updates on the view since the last timer expiration and is created by performing the union of all the relations TempCandidateI. Step 1a and 1b can be implemented by a single query. Step 1 can thus be implemented using N+1 queries, N queries for each of the TempCandidateI's and one query to remove any duplicates from TempCandidates. In step 2, the view is refreshed by first deleting all the tuples touched due to updates on each of the relations and then inserting the tuples from TempCandidates. Step

PAGE 41

34 2 can also be written using N+1 queries. Finally, the last step runs the action on all tuples in the view. Salient features of the algorithm are the following. When the timer goes off, we have to execute many queries to refresh the view and run the trigger action. In the worst case when updates to all the relations are registered 2N+2 queries will be run, of which each of the N queries of step 1 access all the relations. Clearly, our implementation is not efficient in this case. The brute force method runs only one query to access all the relations. However, if only one relation registered updates then our algorithm need run only one query for the first step, two for the second and one for the third i.e. only four in all, three of which are on smaller relations. Our implementation thus, works really well in cases when only a small number of the tables are updated. 4.5.2 Join Trigger on Update with Transition Predicate In this case a transition condition is a part of the predicate. The predicate can be written in CNF as P1 P2 PN PJOIN Ptran. The view definition does not change and is as follows: CurrView= P key1, , keyN, fields, tranFields( sP1P2 PN ( sPJOIN (R1 X R2 X .X RI X RN))) When the timer goes off, the trigger is implemented as follows: 1. Create TempCandidates For I = 1 to N

PAGE 42

35 do If SI is empty then skip to the next iteration; TempCandidateI = P key1, , keyN, fields, tranFields( sP1P2 PN ( sPJOIN(R1 X (SI JOIN RI) X RN))) TempCandidate = TempCandidate TempCandidateI.; done 2. Candidates for the action are all the tuples in sPtran (CurrView JOIN TempCandidates). 3. Refresh CurrView : 3a. For I = 1 to N do If SI is empty then skip to the next iteration; CurrView = CurrView (SI JOIN CurrView) done 3b. CurrView = CurrView TempCandidates All the steps look similar to the previous discussion on the non-transition case. In step 2 the action is run on only tuples of the view that satisfy the transition predicate by comparing the corresponding values of these tuples attributes in the old and new view. Again as stated for the transition case of single table triggers, we assume for simplicity that the predicate Ptran is such that only a change in the value of tranFields may cause it to evaluate to true. A more general discussion on evaluating transition predicates

PAGE 43

36 which evaluate to true even when the fields tranFields are untouched (e.g. old.value=new.value) is present in Chapter 7 under future work. 4.6 Combining the View Predicate and the Trigger Predicate Consider a general timer trigger defined on one or more relations R1, Rn with a view predicate Pv and a trigger condition Pw, both in Conjunctive Normal Form (CNF). The view definition is V= Pfields( sPv(R1 X R2 X X Rn)). On timer expiration, the view is retrieved and the trigger predicate Pw is applied to it. The view CurrView that we create and keep around for enhanced trigger processing need not be the same as the view V but possibly a more efficient representation. In the case when the trigger has no transition predicates (i.e. no reference to old attribute values), we generate CurrView by applying the predicates P=Pv Pw on the relations i.e. CurrView= P fields ( sP (R1 X R2 X X Rn)). Since Pv and Pw are both in CNF, P is also on CNF. In the case when the trigger has transition predicates, the trigger predicate Pw has a non-transition part PnoTran and a transition part Ptran i.e. Pw=PnoTran Ptran. We generate CurrView by applying the predicates P=Pv PnoTran on the relations i.e., CurrView= P fields( sPvPnoTran (R1 X R2 X X Rn)). The procedure for breaking up Pw into PnoTran and Ptranis explained as follows. If the old value of an attribute is referenced in the predicate, then all the conjuncts containing that attribute make up Ptran. The conjuncts of Pw that are not in Ptran make up PnoTran.

PAGE 44

37 From the above two cases, it is clear that we can represent the predicate for a trigger as P when there are no transition predicates and P Ptran when there are transition predicates. 4.7 Examples Example of a Single Table Trigger : Consider this example of a non-transition single table trigger from Chapter 3. create timer trigger excess_vacation from (select email_addr, vacation_req_filled from employee where vacation_taken > vacation_left ) when vacation_req_filled = False check every week beginning Monday at 8:00 AM do email(:email_addr, "Please fill up the Excess Vacations form"); The predicate P here is P=(vacation_taken>vacation_left) and (vacation_req_filled=False) The various schema definitions are as follows: CurrView(emp_id, email_addr); TempCandidates(emp_id, email_addr); S(emp_id); The asynchronous trigger to collect the updates in S is create trigger S_employee_excess_vacation from employee when ((vacation_taken>vacation_left) and (vacation_req_filled=False)) or ((old.vacation_taken>vacation_left) and (old.vacation_req_filled=False))

PAGE 45

38 do execSQL'insert into S values :emp_id'; From the discussion in section 4.4.1 for the case of single table triggers with an ON UPDATE event, the steps to process the trigger on timer expiration are written as follows: 1. TempCandidates = P emp_id, email_addr( s(vacation_taken>vacation_left) and (vacation_req_filled=False) (S JOIN employee)) 2. Refresh CurrView : 2a. CurrView = CurrView (S JOIN CurrView) 2b. CurrView = CurrView TempCandidates 3. Candidates for the action are all the tuples of the view CurrView. Example of a Join Trigger : Consider this example of a transition join trigger from Chapter 3. create timer trigger new_artist_album_sales from (select artist.email as artist_email, album.sales as sales from album, artist where (album.artist_name = artist.artist_name) and ((artist.category="rock") or (artist.category="pop")) and (album.artist_type="new") when old.sales < 1000000 and sales>=1000000 check every day do email(:artist_email, "Congratulations! Your album sales have crossed a million"); The predicate here is Partist Palbum PJOIN Ptran where Partist = (artist.category="rock") or (artist.category="pop") Palbum = album.artist_type="new"

PAGE 46

39 PJOIN = album.artist_name = artist.artist_name Ptran = old.sales < 1000000 and sales>1000000 The various schema definitions are as follows: CurrView(artist_id, album_id, sales); TempCandidates(artist_id, album_id, sales); Sartist(artist_id); Salbum(album_id); The asynchronous triggers to collect the updates in Sartist and Salbum are: create trigger S_artist_new_album_sales from artist when (artist.category="rock") or (artist.category="pop") or (old.artist.category="rock") or (old.artist.category="pop") do execSQL 'insert into Sartist values :artist_id'; create trigger S_album_new_album_sales from album when album.artist_type="new" or old.album.artist_type="new" do execSQL 'insert into Salbum values :album_id'; The view is initialized as follows: CurrView= P artist_id, album_id, sales ( sPartistPalbum( sPJOIN(artist X album))) From the discussion in section 4.5.2 for the case of Join Triggers with an ON UPDATE event and a Transition predicate, the steps to process the trigger on timer expiration are written as follows:

PAGE 47

40 1. Create TempCandidates: 1a. If Sartist is not empty then TempCandidateartist = P artist_id, album_id, sales( sPartistPalbum( sPJOIN(Sartist JOIN artist) X album))) TempCandidate = TempCandidate TempCandidateartist.; end if 1b.If Salbum is not empty then TempCandidatealbum = P artist_id, album_id, sales( sPartistPalbum( sPJOIN(Salbum JOIN album) X artist))) TempCandidate = TempCandidate TempCandidatealbum.; end if 2. Candidates for the action are all the tuples in sPtran (CurrView JOIN TempCandidates). 3. Refresh CurrView : 3a. If Salbum is not empty then CurrView = CurrView (Salbum JOIN CurrView); If Sartist is not empty then CurrView = CurrView (Sartist JOIN CurrView); 3b. CurrView = CurrView TempCandidates

PAGE 48

41 4.8 Conclusions The brute force implementation of timer triggers is expensive because of the CPU and I/O cost involved in reading entire relations and applying selection and join predicates on them. In single table triggers where the update rate on relations are not very high, our strategy of keeping track of the tuples touched and performing queries only on this touched set works very well. Performance improves even more if the selection predicate is very selective or if there is an index on key of the relation. In multi-table triggers, our strategy works well when only a few of the relations are updated and the update rate is not very high.

PAGE 49

42

PAGE 50

42 CHAPTER 5 ENHANCED ALGORITHMS FOR AGGREGATE TRIGGERS 5.1 Introduction Aggregate triggers are triggers whose condition of interest involve an aggregation on all tuples or groups of them. If a tuple of a relation is touched then the group it belongs to needs its aggregate information re-calculated and the trigger condition checked. Thus, aggregate triggers implemented as synchronous or asynchronous triggers have a high processing cost per updated tuple. The timer trigger mechanism to test for interesting aggregate conditions periodically is a better option to implement aggregate triggers. The view here is an aggregation of tuples from one or more relations grouped by some grouping attributes. Again, the brute force method is the simplest implementation and always works. There are enhanced algorithms discussed in the next few sections which go along similar lines of the algorithms in the previous chapter that make aggregate triggers very efficient under some cases. 5.2 The Brute Force Implementation for Aggregate Triggers Consider a general aggregate trigger defined on relations R. The selection predicate of the trigger is P, the grouping attributes are gAttr and the aggregate predicate is Paggr PtranAggr in CNF, where Paggr represents all the non-transition aggregate predicates and PaggrTran represents the transition aggregate predicates. Although the view and the

PAGE 51

43 trigger have their separate aggregate predicates PVaggr and PWaggr respectively, we combine them together into one Boolean and represent them as Paggr PtranAggr, where PtranAggrcontains the transition predicate part of the aggregate predicate and Paggr contains the nontranstion part. This discussion of combining predicates is identical to the one we had in section 4.6, the only difference being that the selection predicates are replaced by aggregate predicates. The schema for the view CurrView is (gAttr, aggrValues, tranAggrValues) where aggrValues are one or more fields in the view required for the action and tranAggrValues are one or more fields required to evaluate the aggregate predicate. The view is defined as follows. CurrView= P gAttr, aggrValues, tranAggrValues ( sPaggr(gAttr * aggrFunc fields( sP (R)))) where aggrFunc is an aggregate function applied on some attribute "fields" of the relation. The aggregate values generated by aggrFunc are to evaluate the aggregate predicate Paggrand also to evaluate the fields aggrValues and tranAggrValues of the view. PrevView is the copy of the view from the previous timer expiration. Note that CurrView and PrevView are views we use for implementation, they are different from the view defined in the trigger. The operations to be performed to implement this trigger are: 1. Create a new copy of the view CurrView, CurrView= P gAttr, aggrValues, tranAggrValues ( sPaggr(gAttr * aggrFunc fields( sP (R)))) 2. Run the action on all the tuples resulting from the following operation, sPtran (CurrView JOIN PrevView).

PAGE 52

44 The above two steps translate into two SQL queries. For non-transition triggers, the second step is not necessary and the action is run on all the tuples of CurrView, resulting in only one SQL query. 5.3 Single Table Aggregate Triggers In this section we present some enhanced algorithms to implement the various timer triggers with aggregate conditions on single relations. These include triggers with any of the various events viz. on insert, on update and on delete and with or without aggregate transition predicates on the groups within the view. Again, the simplest method that works is the brute force method. When the timer goes off the new view is created from all the tuples of the relations and an aggregation performed on all the groups. However, if only a small number of groups are touched or many tuples of only one group are touched then recomputing the aggregation for all the tuples in all the groups may be extremely inefficient in terms of I/O and CPU. The general schema of an aggregate view currView on a single table R is (gAttr, aggrValues, aggrTranfields) where gAttr comprises the group attributes and aggrValues are the values of aggregate functions applied on the group, required for the action. The aggrTranfields values are required only for transition triggers to evaluate the transition predicate. These values are also obtained by applying some aggregate functions on the group.

PAGE 53

45 5.3.1 Single Table, Aggregate, Non-Transition, on Update Consider a timer trigger defined on a relation R with selection predicate P, one or more grouping attributes gAttr and the aggregate predicate Paggr. The view can be defined as follows: CurrView= P gAttr, aggrValues ( sPaggr (gAttr * aggrFunc fields ( sP (R)))) where aggrfunc are one or more aggregate functions applied on some fields "fields" of relation R. The values of these aggrFunc may either be needed to evaluate the aggregate predicate Paggr, or to simply compute the aggregate values aggrValues required for the action. As before, we maintain a relation S(gAttr), which is the set of groups touched due to updates on relation R since the last timer expiration. The algorithm to process this trigger is as follows: 4. TempCandidates = P gAttr, aggrValues ( sPaggr (gAttr * aggrFunc field ( sP (S JOIN R)))) 5. Refresh CurrView : 2a. CurrView = CurrView (S JOIN CurrView) 2b. CurrView = CurrView TempCandidates 6. Candidates for the action are all the tuples of the view CurrView. TempCandidates is a temporary relation of current aggregate information of only the groups to be inserted or updated in the view CurrView. sP (S JOIN R) returns all the tuples belonging to the touched groups, satisfying the selection predicates. We then group these tuples on the grouping attributes gAttr and apply the aggregate predicates Paggr, to obtain new aggregate values of all the touched groups. The second step refreshes the view

PAGE 54

46 CurrView by deleting all the groups touched and inserting all tuples in TempCandidates. Finally, the trigger action is run on all tuples in the view CurrView. As consistently seen in all examples until now, the main difference in the various algorithms for single table triggers lies in creating the relation TempCandidates consisting of inserts and updates to the view. After this relation is created, the steps to refresh the view and run the action on the tuples are similar. The salient features of the algorithm are the following. This algorithm works best if tuples from only a few groups are touched because then the data is read and aggregation performed only on tuples belonging to this small set of groups. If there is an index on the group attributes gAttr, then I/O is performed only on those pages having the tuples of the touched groups. Applying the selection predicate P on all touched tuples before inserting it into the set S, reduces the number of groups especially if the predicate is highly selective. Only one of the queries accesses a part of the original relation R. The remaining three operate on smaller relations and involve simple join and predicate application. The SQL query to implement the trigger looks as follows: 1. Select S.gAttr as gAttr, aggrFunc(R.field) as aggrValues from S, R where S.gAttr=R.gAttr and P group by gAttr having Paggrinto temp TempCandidates 2. Refresh CurrView : 2a. Delete from CurrView

PAGE 55

47 where exists (select gAttr from S where S.gAttr=CurrView.gAttr); 2b. Insert into CurrView select gAttr, aggrValues from TempCandidates; 3. select trigAction (gAttr, aggrValues) from CurrView. 5.3.2 Single Table, Aggregate, Transition, on Update In this case one or more of the aggregate predicates involve a transition condition on a aggregate value (e.g. old.sum(sal) < sum(sal) ). A general representation of the aggregate predicate can be written in CNF as Paggr PtranAggr, where Paggr is the regular aggregate predicate from before and PtranAggr has one or more transition predicates. The view is defined as follows: CurrView= P gAttr, aggrValues, tranAggrValues ( sPaggr (gAttr * aggrFunc field ( sP (R)))) where now the aggrfunc additionally generates one or more fields tranAggrValues to evaluate the transition predicate. 1. TempCandidates = P gAttr, aggrValues, tranAggrValues ( sPaggr (gAttr * aggrFunc field ( sP (S JOIN R)))) 2. Candidates for the action are all the tuples in sPtranAggr (CurrView JOIN TempCandidates) 3. Refresh CurrView : 2a. CurrView = CurrView (S JOIN CurrView) 2b. CurrView = CurrView TempCandidates.

PAGE 56

48 Only step 2 above looks different. In this step we identify the tuples which satisfy the transition condition. Only those groups touched since the last timer expiration are candidates to satisfy the transition condition. These groups are in TempCandidates. The join identifies only the updated tuples and applies the transition predicate PtranAggr on them. 5.4 Multi Table Aggregate Triggers In the case of multi-table aggregate triggers, a trigger is defined on multiple tables. The schema for the view CurrView on N relations R1 to RN is (gAttr1 , , gAttrN, aggrValues, aggrTranfields). The grouping attributes in this case are gAttr1, gAttr2, , gAttrN, where a gAttrI could be zero or more attributes of the relation RI. This number is zero when a relation does not have any of its attributes in the group-by clause. The selection predicate, normalized to CNF is P1 P2 PN PJOIN, where PI is the selection predicate on RI and PJOIN represents the join predicates. The view definition is: CurrView= P gAttr1, , gAttrN, aggrValues( sPaggr ( gAttr1, , gAttrN * aggrFunc field( sP1P2 PN ( sPJOIN(R1 X R2 X .X RI X RN))))) 5.4.1 Multi Table, Aggregate, Non-Transition, on Update When the timer goes off, we re-aggregate on the groups touched due to updates on each table. The operations are as follows: 4. Create TempCandidates For I = 1 to N

PAGE 57

49 do If SI is empty then skip to the next iteration; create TempCandidateI; TempCandidate = TempCandidate TempCandidateI.; done 5. Refresh CurrView : 2a. For I = 1 to N do If SI is empty then skip to the next iteration; CurrView = CurrView (SI JOIN CurrView) done 2b. CurrView = CurrView TempCandidates 6. Candidates for the action are all the tuples of the view CurrView. The relation SI (gAttrI) has the set of gAttrI values touched due to updates on tuples of relation RI, after applying the predicate PI on each tuple. The join predicate might cause only a subset of the groups having the values in gAttrI to be affected, however, for simplicity of the algorithm we recompute for all groups having the gAttrI values. TempCandidatesI represents the set of groups touched due to the updates on relation RI only. The process to create it is discussed below. TempCandidates is the union of all these temporary relations and thus is a cumulative set of touched groups due to updates on all relations on the view. Step 1a and 1b can be implemented by a single

PAGE 58

50 query. Step 1 can thus be implemented using N+1 queries, N queries for each of the RI's and one query to remove any duplicates. In step 2, the view is refreshed. First all the touched groups due to updates on each relation are deleted from the view, then we reinsert the appropriate groups with their new values from TempCandidates. Step 2 also can be written using N+1 queries. Finally, the last step runs the action on all tuples in the view. The process of creating the relation TempCandidatesI for any relation RI is done in two different ways depending on whether any of its attributes contribute to the grouping or not. In the first case we consider a relation RI whose attributes are present in the grouping as the fields gAttrI. TempCandiatesI for this case can be created as follows: TempCandidates = P gAttr1, , gAttrN, aggrValues( sPaggr ( gAttr1, , gAttrN * aggrFunc field ( sP1P2 PN ( sPJOIN(R1 X (SI JOIN RI) X RN))))) where SI (gAttrI) has the values of all the gAttrI touched since the last timer expiration. The aggregation is performed for all the groups whose gAttrI values are in S. In the second case we consider a relation RI which does not have any attributes in the grouping clause. This makes the task of finding the groups touched due to updates on relation RI a little difficult. In this case, we collect the set S(pKey), the set of primary keys of tuples updated in RI. Then we use the selection and join predicates to find another relation RJ whose gAttrJ values are touched due to updates on RI. These values are collected in a relation SJ'(gAttrJ) and appended to SJ(gAttrJ). So SJ would now consist of all the gAttrJ values touched due to updates on RJ as well as RI.

PAGE 59

51 SJ'(gAttrJ) = P gAttrj ( sPI+1 PJ ( sPJOIN-IJ((SI JOIN RI) X RI+1 X . X RJ))) SJ(gAttrJ) = SJ(gAttrJ) SJ'(gAttrJ) where sPJOIN-IJ includes only those join predicates involving relations from RI to RJ The relation TempCandidatesJ is created the usual way from SJ as shown above, however it reflects the refreshed values of groups touched by updates on RI and RJ. RJ is any relation which contributes to the grouping attributes and is connected to RI in the rule condition graph. A rule condition graph is like a query graph where the various relations (with their selection predicates) are the vertices and the join predicates are the edges between the vertices. The relations RI+1 to RJ-1 are the relations along the path from RI to RJ in the query graph. The salient features of the algorithm are the following. When the timer goes off, we have to execute many queries to refresh the view and run the trigger action. In the worst case when updates to all the relations are registered 2N+2 queries will be run, of which almost N queries could access all the relations. Clearly, our implementation is not efficient in this case. The brute force method runs only one query to access all the relations, but involves a lot of aggregate computation on possibly a large number of groups. However, if only one relation registered updates then our algorithm need run only one query for the first step, two for the second and one for the third i.e. only four in all, three of which are on smaller relations. Our implementation thus, works really well in cases when only a small number of the tables are updated.

PAGE 60

52 5.4.2 Multi Table, Aggregate, Transition, on Update The only difference in the case of a multi-table aggregate trigger with a transition predicate lies in a transition condition being a part of the aggregate predicate. The aggregate predicate can be written in CNF as Paggr PtranAggr. The view definition does not change and is as follows: CurrView= P gA ttr1, , gAttrN, aggrValues, tranAggrValues( sPaggr ( gAttr1, , gAttrN * aggrFunc field( sP1P2 PN ( sPJOIN(R1 X (SI JOIN RI) X RN))))) When the timer goes off, the trigger is implemented as follows: 1. Create TempCandidates For I = 1 to N do If SI is empty then skip to the next iteration; create TempCandidateI; TempCandidate = TempCandidate TempCandidateI.; done 2. Candidates for the action are all the tuples in sPtranAggr (CurrView JOIN TempCandidates). 3. Refresh CurrView : 3a. For I = 1 to N do If SI is empty then skip to the next iteration;

PAGE 61

53 CurrView = CurrView (SI JOIN CurrView) done 3b. CurrView = CurrView TempCandidates. All the steps look similar to the previous discussion on the non-transition case. In step 2 the action is run on only the groups satisfying the transition predicate by comparing the corresponding values of these groups in the old and new view. 5.5 Examples Example of a Aggregate Trigger: Consider this example of an aggregate trigger with no transition predicate from Chapter 3.create timer trigger excess_salary from (select dept.manager_email as mgr_email, sum (emp.sal) as sum_sal, dept.budget as budget from emp, dept where (emp.dno = dept.dno) and (emp.desig="sales personnel") and (dept.name="toy") group by dept.name, dept.manager_email, dept.budget when sum_sal > .50 * budget check every week do email(:mgr_email, "You are exceeding the allowed budget for salespersons.");The selection predicate is Pemp Pdept PJOIN where Pemp is (emp.desig="sales personnel") Pdept is (dept.name="toy") PJOIN is (emp.dno = dept.dno) The aggregate predicate is Paggr where Paggr is (sum_sal > .50 * budget) The various schema definitions are as follows: CurrView(dept_name, mgr_email, budget, sum_sal);

PAGE 62

54 TempCandidates(dept_name, mgr_email, budget, sum_sal); Semp(emp_id); Sdept(name, manager_email, budget); CurrView and TempCandidates have the same schema consisting of the grouping attributes and a field containing the result of summing the groups on sal. Since only the relation Dept contributes to the grouping attributes we collect the set of touched groups in Sdept. Emp does not contribute to the grouping at all so we keep track of the primary keys of interesting tuples updated. The asynchronous triggers to collect the updates in Semp and Sdept are:create trigger S_emp_ excess_salary from emp when emp.desig="sales personnel" or old.emp.desig="sales personnel" do execSQL 'insert into Semp values :emp_id'; create trigger S_dept_ excess_salary from dept when dept.name="toy" or old.dept.name="toy" do execSQL 'insert into Sdept values (:name, :manager_email, :budget)';From the discussion in section 5.4.1 for the case of a multi-table aggregate trigger with an ON UPDATE event and no transition predicate, the steps to process the trigger on timer expiration are written as follows: 1. Create TempCandidates 1a.If Semp is not empty then Sdept= Pname, manager_email, budget( sPempPdept ((Semp JOIN emp) JOIN dept)

PAGE 63

55 endif 1b.If Sdept is not empty then TempCandidatedept= Pname, manager_email, budget, sum_sal(dept.name, dept.manager_email, dept.budget * sum sal( sPempPdept (emp JOIN (Sdept JOIN dept)))) TempCandidate = TempCandidatedept; endif 2. Refresh CurrView : If Sdept is not empty then CurrView = CurrView (Sdept JOIN CurrView); CurrView = CurrView TempCandidates 3. Candidates for the action are all the tuples of the view CurrView. 5.6 Conclusion Implementing aggregate triggers by the brute force method may not be the most efficient method in some cases. Reading entire relations and performing aggregations involves a lot of expense in terms of both CPU and I/O. In single table triggers where the update rate on relations are not very high, our strategy of keeping track of the groups touched and aggregating on only this small set of groups works well. Performance improves even more if the selection predicate is very selective or if there is an index on the relation. In multi-table triggers, our strategy works well if only a few of those relations are updated and the update rate is not very high.

PAGE 64

56 CHAPTER 6 OPTIMIZATION AND IMPLEMENTATION ISSUES 6.1 Introduction In the preceding chapters, we looked at the syntax and semantics of timer triggers with its brute-force and enhanced implementation algorithms. In this chapter, we discuss the architecture and some optimization issues of a Timer Trigger System. Although the enhanced algorithms contain more queries than the brute force approach, they can be orders of magnitude faster under certain conditions. The same algorithms however, may perform poorly compared to the brute-force approach under different conditions. This chapter looks at the factors which determine the performance of our enhanced algorithms and provides some thoughts in making a choice between the better of the two approaches. Trying to make a comprehensive assessment using extensive statistics, like the query optimizer does [Chm76, Sel79], is extremely complex to implement. 6.2 Architecture The architecture of a timer trigger system is shown in figure 6.1. As discussed before, a Timer Trigger System is designed to complement the powers of an asynchronous trigger processor. A console program interfaces to the user and accepts the "create timer trigger" command string. This console could be same as the one used by the asynchronous trigger processor to accept its commands. This command string is then sent

PAGE 65

57 to the parser where it gets parsed for syntactic correctness. The semantic analyzer validates the semantics of the various entities present in the command string viz. the data sources, views, fields etc. It also checks to see if the view on which the trigger is defined has a primary key on it, and if the action present in the do clause is valid. Figure 6.1: Architecture of a Timer Trigger System The code generator then generates the SQL scripts for both the brute-force and the enhanced strategy. The default strategy is initially set to brute-force. The implementation of the enhanced strategy requires us to create asynchronous triggers to Parser and Semantic Analyzer Code Generator Re-assess Strategy Timer Manager Asynchronous Trigger Processor Execution Unit Timer Trigger Console Timer Trigger System Execute Scripts

PAGE 66

58 keep track of the updates to each of the data sources. As many asynchronous triggers are needed as there are data sources in the timer trigger definition. The scripts for the asynchronous trigger processor are created by the code-generator. All the scripts created are stored in a catalog with the trigger definition. A request is then sent to the timer manager for a periodic wake-up signal. If the "initialize immediately" clause is set in the timer trigger string, the view is initialized the first time on trigger definition by brute force. This view, as discussed before, needs to be kept until the next timer expiration to evaluate transition predicates. The timer manager accepts entries for timer requests when a trigger is defined. It registers these entries into the its list of wake-up times. On expiration of this timer, it sends a wake-up signal to the execution unit identifying the trigger to be processed. The execution unit , on receiving a wake-up signal, extracts the definition and the scripts of the specified trigger from the catalogs. Depending on which of the two strategies were set active, the execution unit runs the set of SQL scripts from either the brute-force or the enhanced algorithm. These scripts generate the new view, apply the predicates of the trigger referring to the previous copy of the view if necessary, and finally run the action on the qualifying tuples. After firing the actions, the trigger strategy is reassessed in the reassess block . The values of a set of parameters, on which the execution cost of the trigger is dependent, is examined and a decision in made to choose between the two strategies. If the time spent on re-assessing is a fair fraction of the processing time of the trigger, then this step can be executed every few times after the processing instead of every time. The

PAGE 67

59 mechanism for reassessing the trigger processing strategy is covered in more detail later in this chapter. 6.3 Thoughts on Optimization The run-time performance of the enhanced strategy depends on the following factors : update rate of the data sources selectivity of the selection predicate of the view indexes present on the data sources on their primary key In Chapters 4 and 5, we mentioned that for each relation involved in the trigger, a relation S(pKey) or S(gAttr) is maintained to store the keys of updated tuples. This list of updated tuples is filtered to a potentially small size by applying the selection predicate while gathering them. If the number of keys in S is kept to a small size then the time spent for creating the new view is reduced. The size of the set S is small if either the update rate is low or the predicate is very selective or both. If the data source has an index on its primary key pKey or the grouping attribute gAttr, then the time spent on processing the trigger is significantly reduced, since only a few pages of the data source are touched. Thus, one or more of the above mentioned factors contribute to the better performance of the enhanced algorithms. So, if a plot of the processing time versus the update rate of the table is to be drawn, then it should look like the one shown in figure 6.2. From this plot it is clear that the enhanced strategy works well for low update rates. As the update rate increases there is a crossover point, above which the enhanced strategy is worse than the brute force one.

PAGE 68

60 It is our goal to work in the shaded region of the graph i.e., use the enhanced strategy until the update rate reaches the crossover point, after which we want to switch over to the brute-force approach. Update RateTrigger Processing Time Brute Force EnhancedFigure 6.2: Performance of the brute force method vs. the enhanced method for varying update rates. 6.4 Optimization Strategies The first optimization strategy is the simplest and the easiest to implement. Actual run times are used to arrive at a decision on which among the two algorithms is the better one. By actual time we mean the wall-clock time, as it is very difficult to obtain the actual CPU and I/O time spent during query executions due to the inherent multi-threading in DBMSs [Inf99]. Brute force is initially used to create the view for the first time and the CrossoverPoint

PAGE 69

61 time spent in milliseconds is stored with the trigger information. In every subsequent processing, the enhanced strategy is run and the cost recorded and compared with the brute-force cost. When the former cost exceeds the latter in two or more consecutive runs, we switch to the brute-force approach. In the enhanced region of the graph, this algorithm works fine because we know with a fair degree of certainty that the enhanced algorithm has lesser processing cost. But once we cross over to the brute-force region, it is difficult to make a decision when to return back to the enhanced approach, due to the lack of sufficient information collected. In the second optimization strategy, we use the estimated costs from the query optimizer in arriving at a relation between the update rate of the data sources to the execution cost of the enhanced algorithm. However, this information is not reliable as the size of the relation S will never be accurately determined by the optimizer due to the dynamically varying size of S. But, if the optimizer provides a function to calculate the cost of a query by simply giving to it the size of S then we can approximately determine the size of S that causes the crossover in costs between the two algorithms. As the updaterate rises the number of touched keys collected in S will increase too. When the number of keys in S crosses this threshold number, we switch to the brute force approach and stop gathering the updates. At a later time, when the update rate drops, we can resume collecting keys in S to see if this number has dropped below the threshold number. Stopping and restarting the process of collecting the updated keys is as simple as disabling and enabling the asynchronous triggers defined on the data sources. The update-rate of data sources can be kept track of by defining simple mainmemory counters (variables) for each data source. These counters are incremented on

PAGE 70

62 every update to the corresponding data source and once every hour the update rate can be calculated (by dividing the value of this counter by sixty minutes). Since this counter is going to be shared among the different transactions updating the data source, a locking overhead will be introduced per increment. However, since we are interested in only an approximate number for the update-rate, we can avoid this expensive process of locking the counter. While the second strategy seems appealing, it is based on the assumption that the query optimizer provide an interface function that accepts the size of S and returns the cost of the enhanced query. However query optimizers compute the estimated costs of queries based purely on statistics collected by the database and do not accept parameters. It will be helpful if DBMS vendors could open up their query optimizers provide interfaces that accept some parameters while optimizing a query. One way to get around this problem is to enter some dummy rows into S and call the gather statistics command on S, before optimizing the query. The first strategy does seem like a more practical and a simple one. However, we need to find some way to return back to the enhanced approach from the brute force one, when the update rate drops. This can be done by recording the update-rate at the crossover point. Later, when the update rate drops below this recorded value we can switch back to the enhanced strategy. 6.5 Conclusion The architecture of a timer trigger system is simple. It functions hand in hand with an Asynchronous Trigger Processor. The enhanced algorithm works well for low update

PAGE 71

63 rate tables, high trigger predicate selectivity and the presence of primary key indexes on data sources. A plot of the enhanced vs. the brute-force strategy reveals that our enhanced algorithms perform well for low update rates. As the update rates increases there is a crossover point when this strategy is worse than the brute force one. It is important to identify the crossover point until which the enhanced algorithm should be used. A more detailed design and implementation of optimized techniques for a timer trigger system is a subject of future work.

PAGE 72

64 CHAPTER 7 SUMMARY AND FUTURE WORK 7.1 Summary Timer triggers allow users to define views and monitor interesting tuples or interesting changes to tuples in these views. The process of retrieving views, testing trigger conditions and firing actions takes place periodically. All this gives timer triggers different semantics from those of the more conventional synchronous (and asynchronous) triggers, in that the tuples do not need to be updated to satisfy trigger conditions. The examples covered in Chapter 3 sought to reassert the point that some trigger needs of users require periodic testing and notification. These translate best to timer triggers and sometimes cannot be replaced by regular triggers. In example 1, where it was required to notify some employees weekly of a form to be filled, only timer triggers could be written here to correctly send one notification per week to forgetful employees. Some join triggers and many aggregate triggers could be expensive to implement using asynchronous triggers due to frequent accesses on the original relations. These triggers can be evaluated more efficiently by the periodic testing mechanism on a group of updates. The simplest implementation for timer triggers is the brute-force approach, which retrieves the view using its definition and then applies the predicates of the trigger. The enhanced algorithms of simply refreshing the view by recording its updates lead to a huge

PAGE 73

65 performance benefit during periodic processing. This is especially true if one or more of the following hold viz. the update-rate on the relations is low, the trigger predicate is very selective or indexes are defined on the primary key of relations. Formal cost analysis and models in trying to choose the better of the two algorithms has not been presented and is the scope for future work. However, our approach of making a choice between the better of the two algorithm and identifying the point of crossover by using actual run times is a simple and a feasible idea. The most useful contribution of this thesis is the idea of keeping a set of keys of updated tuples and restricting the processing to this set. This allows us to be oblivious of the anomalies arising due to the different sequence of inserts, updates and deletes on the same tuple of the view. This is achieved by actually retrieving the tuples associated with these keys from the original relation and thus getting the correct information. Maintaining this set however, requires us to create asynchronous triggers as explained in Chapter 4. An asynchronous trigger processor like TriggerMan with its fast predicate indexing mechanism makes it possible to collect this set of keys with a very low (and almost insignificant) overhead in many cases. The architecture of a timer trigger system presented in Chapter 6 is simple and reuses a lot of the functionality available from the asynchronous trigger processor like its console, data source definition etc. 7.2 Thoughts on Future Work In the enhanced algorithms for join triggers we assume that updates to all the tables in the trigger (or view) definition are important for refreshing the views. Thus, we

PAGE 74

66 keep track of the updates on all the tables. In many cases however, updates to fewer than all relations are relevant and it may not be necessary to keep track of the updates to all the relations. This is especially true in example 4 of Chapter 4 which sends a complimentary email to new pop and rock artists when their album sales cross a million. While updates take place on both the relations, album and artist, the updates to only album are relevant as we are only interested when the sales cross the million mark. Thus, there should be an option in the timer trigger language that specifies the relations on which the updates are relevant so that our optimizer has a more efficient execution. While making a choice between the better of the two approaches discussed, our main basis of comparison was the total run times spent on creating the views and firing the actions. However, the enhanced algorithms also incur a small cost during the update to each tuple. This is because, the asynchronous trigger processor evaluates some part of the predicate on the updated tuples and puts the key in the update set S if necessary. Although the cost of evaluating some part of the predicate can be very small with an efficient predicate indexing mechanism, like that of TriggerMan, it could still be significant for a high update-rate table or an expensive predicate. Other issues such as locking on the update set S could also escalate cost. Thus, we need to arrive at a more mathematical model to compute costs of the enhanced algorithms, taking into account the costs for both collecting the updates and processing them. When the enhanced algorithms are processing the set of keys of updated tuples (i.e., keys in S), updates on the original relation still proceed concurrently. Some of these updates may qualify to have their keys included in the set S. This represents a boundary case where keys are getting inserted into S while S is being processed. Some analysis

PAGE 75

67 needs to be done to see if any unexpected behavior is possible leading to some tuples getting processed at the wrong timer expiration. In our enhanced algorithm for triggers with a transition predicate (i.e., referring to old and new values from the view), we assumed that a tuple of the view must change to satisfy the transition predicate. We run the action on all the tuples that satisfy the following condition: sPtran (CurrView JOIN TempCandidates) where CurrView is the old copy of the view from the previous timer expiration (and is yet to be refreshed) and TempCandidates is the view of the tuples inserted or updated in the interval since the last timer expiration. By performing the join (CurrView JOIN TempCandidates), we isolate only those tuples which have been updated since the last timer expiration and apply the transition predicate Ptran (CurrView.tranFields, TempCandidates.tranFields) on the old and new copy of the relevant fields. However, depending on the transition predicate (e.g. old.value = new.value), there may be cases of tuples that satisfy the predicate without being updated. The brute-force approach will account for these tuples, as it compares all tuples of the old view to corresponding tuples of the new view. However, our enhanced approach does not identify tuples that have not been updated but still satisfy the transition predicate because we apply the predicate only on the updated tuples. In order to make our enhanced strategy work we need to change the predicate to the following: sPtran' (CurrView OUTER-JOIN TempCandidates) where Ptran' = (TempCandidate.key is not Null AND

PAGE 76

68 Ptran(CurrView.tranFields, TempCandidates.tranFields)) OR (TempCandidate.key is Null AND Ptran (CurrView.tranFields, CurrView.tranFields)) The predicate Ptran' is explained as follows. We pull in all tuples of the view CurrView by performing an outer join with TempCandidates. The predicate Ptran' has two disjuncts, the first one applies the predicate Ptran on old-new pairs of updated tuples in the view and the second one applies the predicate on only the old values of the untouched tuples. The second strategy is a general one and works for both the cases. This can be the default strategy. However, if we can determine from the transition predicate that an attribute value has to change to satisfy the predicate then we can switch to the more efficient approach of applying the predicate Ptran on only updated tuples of the view. Examples of such predicates are old.value < new.value, new.value> 10*old.value etc. A theorem prover can be introduced into the code generation block of figure 6.1 to examine the transition predicates and make this determination. This theorem prover is a subject of future work. It has been mentioned at many places, how some expensive join and aggregate triggers can be implemented more efficiently using timer triggers. More work needs to be done on this issue. We have to identify factors within the trigger definition and run-time characteristics that help us conclude the above. Having identified such triggers, we have to work on a method to inform the owner of that trigger or the system administrator of possible improvements in system performance using a timer trigger, along with some

PAGE 77

69 numbers possibly. We also need to research the idea of directly switching to timer triggers where possible, without changing the semantics of the trigger. It is possible for many triggers to be defined on similar or even the same views. Moreover, the trigger predicates themselves may be similar. In cases such as these, we can group the processing of two or more similar triggers to prevent multiple predicate evaluation and table accesses. This assumes more significance in aggregate triggers, due to the higher processing costs required for the aggregation. The issue of grouping the processing of two or more triggers needs to be addressed. In example 3 of Chapter 3, a trigger was written to send an email weekly to the finance department on the customers who have defaulted on their payment for ninety days or more. So, a tuple qualifies for the action simply because its due date has aged for more than 90 days. Triggers such as these, that do date arithmetic using the current date, can only be implemented using brute-force. More analysis needs to be done, identifying predicates like the one using current date, time and timestamps where the trigger will work correctly using only the brute-force approach.

PAGE 78

70 REFERENCES [Alf98]Al-Fayoumi, N., Temporal trigger processing in the TriggerMan active DBMS. Ph.D. dissertation, University of Florida, Gainesville, 1998. [Car98]Carnes, C., A flexible data source architecture for an asynchronous trigger processor. Unpublished paper, University of Florida, Gainesville, 1998. [Cat99]Categoric http://www.categoric.com Palo Alto, CA, 1999. [Cha89]Chakravarthy, S., HiPAC: A research project in active, time-constrained database management. Final Technical Report. Technical Report XAIT-8902. Xerox Advanced Information Technology, Cambridge, MA, August 1989. [For82]Forgy, C. L., Rete: A fast algorithm for the many pattern/many object pattern match problem. Artificial Intelligence 19(1):17-37, September 1982. [Geh91]Gehani, N., & Jagadish, H. V., Ode as an active database: Constraints and triggers. In Proceedings of the seventeenth International Conference on Very Large Data Bases 1991 , pages 327-336. Morgan Kaufmann, San Francisco, September 1991. [Han90]Hanson, E. N., Chaabouni, M., Kim, C., & Wang, Y., A predicate matching algorithm for database rule systems. In Proceedings SIGMOD Conference 1990, pages 271-280. ACM Press, New York, June 1990. [Han94]Hanson, E. N., & Chaabouni, M., The IBS tree: A data structure for finding all intervals that overlap a point. Technical Report WSU-CS-9011. Wright State University, Dayton, OH, April 1990. [Han96a]Hanson, E. N., The design and implementation of the Ariel active database rule system. IEEE Transactions on Knowledge and Data Engineering 8(1):157-172, February 1996. [Han96b]Hanson, E. N., & Johnson, T., Selection predicate indexing in active databases using Interval Skip Lists. Information System s, pp. 269-298, 1996.

PAGE 79

71 [Han97]Hanson, E. N., Bodagala, S., & Chadaga, U., Optimized trigger condition testing in Ariel using Gator Networks, University of Florida CISE Department. Technical Report 97-021, Gainesville, November 1997. http://www.cise.ufl.edu [Han98]Hanson, E. N., Konyala, M., Vernon, A., Noronha, L., & Park, J., Scalable trigger processing in TriggerMan. University of Florida CISE Department. Technical Report 98-008, Gainesville, July 1998. http://www.cise.ufl.edu [Han99]Hanson, E. N., Carnes, C., Huang, L., Konyala, M., Noronha, L., Parasarathy, S., Park, J., & Vernon, A., Scalable Trigger Processing. In Proceedings International Conference on Data Engineering 1999, pages 266-275. IEEE Computer Society, Los Alamitos, CA, March 1999. [Inf98]Informix Inc. http://www.informix.com Palo Alto, CA, 1998. [Kon98]Konyala, M., Predicate indexing in TriggerMan. Master's thesis, University of Florida, Gainesville, 1998. [Kan98]Kandil, M., Predicate placement in active database discrimination networks. Ph.D. dissertation, University of Florida, Gainesville, 1998. [Mir87]Miranker, D. P., TREAT: A better match algorithm for AI production systems. In Proceedings AAAI-87 Sixth National Conference on Artificial Intelligence , pages 42-47. Morgan Kaufmann, San Francisco, August 1987. [Net99]NetMind http://www.netmind.com Campbell, CA, 1999. [Oas98]Ozsu, M. T., & Valduriez, P., Principles of distributed database systems . Prentice-Hall, Englewood Cliffs, NJ, 1998. [Par98]Park, J., Implementation and performance tuning of discrimination networks for asynchronous trigger processing view maintenance. Unpublished paper, University of Florida, Gainesville, 1998. [Pat98]Paton, N. W., ed. Active rules in database systems. Springer Verlag, New York, 1998. [Pug90]Pugh, W., Skip lists: A probabilistic alternative to balanced trees. Communications of the ACM 33(6):668-676, June 1990. [Rao98]Rao, J., & Ross, K. A., Reusing invariants: A new strategy for correlated queries. In Proceedings SIGMOD Conference 1998, pages 37-48. ACM Press, New York, June 1998.

PAGE 80

72 [Sel79]Selinger, P. G., Astrahan, M. M., Chamberlin, D. D., Lorie, R. A., & Price, T. G., Access path selection in a relational database management system. In Proceedings SIGMOD Conference 1979, pages 23-34. ACM Press, New York, May-June 1979. [Ses96]Seshadri, P., Pirahesh, H., & Cliff Leung, T. Y., Complex query decorrelation. In Proceedings International Conference on Data Engineering 1996, pages 450-458. IEEE Computer Society, Los Alamitos, CA, March 1996. [Sil98]Silberschatz, A., Korth, H. F., & Sudarshan, S., Database management systems , 3rd ed., McGraw-Hill, New York, 1998. [Sis95a]Sistla, A. P., & Wolfson, O., Temporal conditions and integrity constraints in active database systems. In Proceedings SIGMOD Conference 1995 , 24(2):269-280, San Jose, CA, June 1995. [Sis95b]Sistla, A. P. & Wolfson, O., Temporal triggers in active databases. IEEE Transactions on Knowledge and Data Engineeri ng 7(3):471-486, June 1995. [Sto91]Stonebraker, M., & Kernnitz, G., The POSTGRES next-generation database management system. Communications of the AC M 34(10):78-92, October 1991. [Syb99]Sybase Replication Server http://www.sybase.com Emeryville, CA, 1999. [Wan92]Wang, Y. W., & Hanson, E. N., A performance comparison of the Rete and TREAT algorithms for testing database rule conditions. In Proceedings IEEE Data Engineering Conference 1992 , pages 88-97. IEEE Computer Society, Los Alamitos, CA, February 1992. [Wid91]Widom, J., Cochrane, R., & Lindsay, B. G., Implementing set-oriented production rules as an extension to Starbust. In Proceedings of the seventeenth International Conference on Very Large Data Bases 1991 , pages 275-285. Morgan Kaufmann, San Francisco, September 1991. [Wid95]Widom, J. & Ceri, S., eds., Active database systems: Triggers and rules for advanced data processing . Morgan Kaufmann, San Francisco, September 1995.

PAGE 81

73 BIOGRAPHICAL SKETCH Lloyd Noronha was born on December 7, 1973 in Bombay, India. He received a bachelor's degree in electronic engineering securing first class with honors from Bombay University, Bombay, India, in May 1995. He joined University of Florida in January 1998 to pursue a master’s degree in computer and information science and engineering. He has worked as a research assistant with Dr. Hanson at the Database Systems Research and Development Center. His research interests are in active database systems.


xml version 1.0 encoding UTF-8
REPORT xmlns http:www.fcla.edudlsmddaitss xmlns:xsi http:www.w3.org2001XMLSchema-instance xsi:schemaLocation http:www.fcla.edudlsmddaitssdaitssReport.xsd
INGEST IEID EWQFNRL4L_6S1TAX INGEST_TIME 2017-07-20T20:59:02Z PACKAGE UF00101372_00001
AGREEMENT_INFO ACCOUNT UF PROJECT UFDC
FILES