Citation
Real-time transaction scheduling in conventional and active databases

Material Information

Title:
Real-time transaction scheduling in conventional and active databases
Creator:
Hong, Dong-kweon, 1960-
Publication Date:
1995
Language:
English
Physical Description:
viii, 101 leaves : ill. ; 29 cm.

Subjects

Subjects / Keywords:
Conflict resolution ( jstor )
Databases ( jstor )
Deadlines ( jstor )
Input output ( jstor )
Lateness ( jstor )
Load forces ( jstor )
Rollbacks ( jstor )
Scheduling ( jstor )
Simulations ( jstor )
Transaction costs ( jstor )
Computer and Information Sciences and Engineering thesis, Ph. D
Dissertations, Academic -- Computer and Information Science and Engineering -- UF ( lcsh )
Real-time data processing ( lcsh )
Genre:
bibliography ( marcgt )
non-fiction ( marcgt )

Notes

Thesis:
Thesis (Ph. D.)--University of Florida, 1995.
Bibliography:
Includes bibliographical references (leaves 97-100).
General Note:
Typescript.
General Note:
Vita.
Statement of Responsibility:
by Dong-kweon Hong.

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
The University of Florida George A. Smathers Libraries respect the intellectual property rights of others and do not claim any copyright interest in this item. This item may be protected by copyright but is made available here under a claim of fair use (17 U.S.C. §107) for non-profit research and educational purposes. Users of this work have responsibility for determining copyright status prior to reusing, publishing or reproducing this item for purposes other than what is allowed by fair use or other copyright exemptions. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder. The Smathers Libraries would like to learn more about this item and invite individuals or organizations to contact the RDS coordinator (ufdissertations@uflib.ufl.edu) with any additional information they can provide.
Resource Identifier:
022365702 ( ALEPH )
34409223 ( OCLC )

Full Text











REAL-TIME TRANSACTION SCHEDULING IN CONVENTIONAL AND ACTIVE DATABASES















By

DONG-KWEON HONG


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA


1995















ACKNOWLEDGEMENTS


First, I would like to thank my advisors, Dr. Sharma Chakravarthy and Dr. Theodore Johnson, for showing me the path of research and for providing me with constant encouragement throughout my work. I would also like to thank the other members of my supervisory committee, Dr. Stanley Su, Dr. Eric Hanson, and Dr. Suleyman Tufekci, for willingly agreeing to serve on my committee.

Next I would like to thank Dr. Paul Fishwick for providing me with the SIMPACK simulation package. Without this great package, my work would have been delayed.

Finally, I would like to thank all the students at the Data Base System Research and Development Center for their help and friendship.














TABLE OF CONTENTS


ACKNOWLEDGEMENTS

LIST OF TABLES

LIST OF FIGURES

ABSTRACT

CHAPTERS

1 INTRODUCTION
  1.1 Problem Statement
    1.1.1 Priority Assignment
    1.1.2 Concurrency Control
  1.2 Survey of Related Work
    1.2.1 Priority Assignment for RTDBS
    1.2.2 Priority Assignment for ARTDBS
    1.2.3 Concurrency Control
  1.3 Summary of Our Research
    1.3.1 Contribution of our Work
  1.4 Structure of Dissertation

2 SOFT REAL-TIME: INCORPORATING LOAD FACTOR INTO CCA
  2.1 Motivation for Our Approach
  2.2 CCA-ALF for Soft Deadline
    2.2.1 Priority Assignment
    2.2.2 Scheduling Algorithm
  2.3 EDF-CR-ALF for Soft Deadline
  2.4 Performance Evaluation
    2.4.1 Main Memory DB
    2.4.2 Disk Resident DB
  2.5 Conclusions

3 FIRM REAL-TIME: DEFERRED-RESTART APPROACH
  3.1 Introduction
  3.2 Related Work
  3.3 Motivation for our Approaches
    3.3.1 Comparison of Conflict Resolution Policies
  3.4 Adaptive Concurrency Control (ACC)
    3.4.1 Procedures of ACC
    3.4.2 Correctness of ACC
  3.5 Alternative Version Concurrency Control (AVCC)
    3.5.1 Algorithms
    3.5.2 Correctness of AVCC
  3.6 Performance Evaluation
    3.6.1 Performance of ACC
    3.6.2 Performance of AVCC
  3.7 Conclusions

4 ACTIVE REAL-TIME: TRANSACTION SCHEDULING
  4.1 Priority Assignment
    4.1.1 Multiple Priorities
    4.1.2 Performance Evaluation
    4.1.3 Analysis of Results
  4.2 Concurrency Control
    4.2.1 Extension of AVCC for Active Transaction Model
    4.2.2 Performance Evaluation

5 CONCLUSIONS

REFERENCES

BIOGRAPHICAL SKETCH














LIST OF TABLES


1.1 Compatibility Matrix for MV2PL

2.1 Parameters and their meanings for CCA-ALF

2.2 Base parameters for main memory database

2.3 Base parameters for disk resident database

3.1 Simulation Parameters for ACC and AVCC

4.1 Parameters for ARTDBS simulations














LIST OF FIGURES


2.1 Knowledge type and corresponding approaches

2.2 Open Network Model for the simulation (single CPU)

2.3 Miss percent (CCA-ALF)

2.4 Restart Rate (CCA-ALF)

2.5 Mean Lateness (CCA-ALF)

2.6 Multiclass: Miss Percent (CCA-ALF)

2.7 Multiclass: Restart Rate (CCA-ALF)

2.8 Multiclass: Mean Lateness (CCA-ALF)

2.9 Miss percent for each class and Proportion of class 2 to class 0

2.10 DISK: Miss Percent (CCA-ALF)

2.11 DISK: Restart Rate (CCA-ALF)

2.12 DISK: No. of active tr. (CCA-ALF)

2.13 DISK: Mean Lateness

3.1 Case 1: Both transactions finished successfully within their deadlines

3.2 Case 2: HPT completed successfully, LPT missed

3.3 Case 3: HPT missed, LPT completed successfully

3.4 Case 4: Both LPT and HPT missed their deadlines

3.5 Transaction blocks among HIT and MISS group

3.6 Primitive cases of transaction stop

3.7 Abort of stopped transaction which also stopped others

3.8 More complex cases of transaction stop

3.9 Deferred restart and immediate restart versions (AVCC)

3.10 Structure of AVCC execution

3.11 Case 2: Conflicts with the same transaction more than once

3.12 Case 3: Conflicts with different transactions

3.13 Commit of AVCC

3.14 Chain stop in AVCC

3.15 Conflict with different transactions in AVCC

3.16 Open Network Model with Multiple CPU and disks

3.17 EDF-HP, AED-HP, and EDF-ACC

3.18 AED-HP, EDF-ACC, and AED-ACC

3.19 Comparisons of EDF-HP, ACC, AVCC

4.1 Deadlock due to multiple priorities

4.2 Priority Reversal: Blocking of active transactions

4.3 Restarts of active transactions

4.4 Life of complex active transaction T

4.5 Restartable unit of active real-time transaction

4.6 Result: Without data contention

4.7 Result: With data contention

4.8 Two execution scenarios

4.9 2PL-HP and extended AVCC for active databases















Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy


REAL-TIME TRANSACTION SCHEDULING IN CONVENTIONAL AND ACTIVE DATABASES

By

DONG-KWEON HONG

DECEMBER, 1995


Chairman: Dr. Sharma Chakravarthy
Cochairman: Dr. Theodore Johnson
Major Department: Computer and Information Sciences and Engineering


Database applications that require time-constrained (real-time) responses to transactions are becoming quite common. Designing a scheduling policy for a real-time database system entails combining techniques from conventional database systems and real-time systems and fine-tuning them to obtain a policy that meets the requirements of scheduling transactions in real-time database systems. The scheduler of a real-time database system is responsible for assigning priorities and resolving access conflicts among transactions based on those priorities. In this dissertation, we propose a cost-conscious dynamic priority assignment policy for soft real-time transactions and new schedulers for firm real-time transactions. Finally, we extend our research to active real-time transaction systems, which are used for modeling complex applications.














CHAPTER 1
INTRODUCTION

Database applications that require time-constrained (real-time) responses to transactions are becoming quite common. In real-time databases, the information about an object must be kept sufficiently up-to-date. Once entered into a database, data may become out-of-date if it is not updated within a certain period of time. Such time-varying data (real-time data) naturally impose a time constraint on queries over the data. For example, network traffic management (NTM) data in a network database system [34, 36] enables real-time monitoring of the network, making it self-adapting and fault-tolerant. The sampled real-world data (NTM data) should not fall behind the actual state of the real world (network traffic) by more than a specified time. Typically, applications in real-time systems (as opposed to Real-Time Database Systems (RTDBS)) do not share disk-resident data. Even when they share data, the consistency of the shared data is managed not by the system but by the application program. Under the assumptions used in real-time systems, it is possible to predict some of the characteristics of tasks needed by the scheduling algorithms. As a result, scheduling algorithms [47, 46] used in current real-time systems assume a priori knowledge of tasks, such as arrival time, deadline, resource requirements, and worst-case (CPU) execution time. For database applications, on the other hand, the following sources of unpredictability exist, which make it difficult to predict some of the resource requirements for transactions that need to meet time constraints [35]:

1. Resource conflicts (e.g., wait for disk I/O)

2. Data dependence (e.g., execution path based on the database state)

3. Dynamic paging and I/O (e.g., page faults, caching, and buffer allocation)










4. Data interference (e.g., aborts, rollbacks, and restarts)

5. Algorithmic variations for disk-resident data access (e.g., clustered scan vs. use of an index)

Note that most of these sources of unpredictability are related to database characteristics. Nevertheless, researchers continue their efforts to improve the performance of RTDBS in several ways.

RTDBS are closely connected with active database systems (DBS). In an active DBS, data and control knowledge are stored together; this control knowledge specifies the action to be taken when specified conditions hold and specified events occur. Event-driven control is easily specified using the ECA (Event-Condition-Action) rules of active databases. This paradigm is well suited to implementing RTDBS, as both control real-world processes. Recently, applications that combine real-time and active databases have been proposed [11, 14], and priority assignment policies have been studied for active real-time database systems (ARTDBS) [33, 34].

We assume in this dissertation that transactions are the basic unit of work for database systems. Transactions with deadlines have been categorized into hard deadline, soft deadline, and firm deadline transactions. Transactions with hard deadlines have to meet their deadlines; otherwise, the system does not meet its specification. Typically, transactions in this category have catastrophic consequences if their deadlines are not met; sometimes contingency measures may be included as an alternative. Soft real-time transactions have time constraints, but there may still be some residual benefit to completing a transaction past its deadline. Conventional transactions with response time requirements can be considered soft real-time transactions. In contrast to the above two, firm transactions are those which need not be considered any further if their deadlines are not met, as there is no value in completing the transaction after its deadline. Typically, applications that have a definite window (e.g., stock market applications [2, 45]) within which transactions need to be executed come under this category.

We view RTDBS and ARTDBS as either memory resident or disk-resident transaction processing systems whose workload is composed of transactions with individual timing constraints. A timing constraint is expressed in the form of a deadline, and we consider only soft and firm deadline transactions in this dissertation.

1.1 Problem Statement

The main focus of research in the real-time systems area has been the problem of scheduling tasks to meet the time constraints associated with each task, while the focus in the traditional database area has been on concurrency control of transactions to guarantee database consistency and recovery in the presence of various kinds of failures (i.e., the ACID properties). Designing a scheduling policy for an RTDBS entails synergistically combining techniques from both areas and fine-tuning them to obtain a policy that meets the requirements of scheduling transactions in an RTDBS. This dual requirement makes real-time transaction scheduling more difficult than task scheduling in real-time systems or transaction scheduling in database systems. Thus the scheduler of an RTDBS is responsible for assigning priorities and resolving access conflicts among transactions based on priorities.

RTDBS assume that each transaction is a unit of work. In an active DBS, however, transactions may trigger rules whose actions can be treated as arbitrary computations (i.e., transactions). An active transaction has a set of triggered transactions that are either executed as part of the active transaction or separately, depending on the coupling mode between the parent and the triggered transactions [11]. The coupling modes proposed in the literature are immediate, deferred, and independent, and the transactions triggered in these modes are referred to as immediate, deferred, and independent transactions, respectively. These triggered transactions make priority assignment policies and concurrency control algorithms for ARTDBS more complex.

1.1.1 Priority Assignment

The performance requirements of a conventional Database Management System (DBMS) are usually expressed in terms of average response times rather than meeting the timing requirements of individual transactions. Thus, improving the response time of one transaction at the expense of another is not considered an improvement in a conventional DBMS. In an RTDBS, the objective is to reduce the number of transactions that miss their deadlines or the total lateness. Consider two scenarios. If the transactions share the system resources on an equal basis, the transactions that have tight deadlines miss their deadlines while the transactions that have loose deadlines are likely to meet theirs. Alternatively, if the transactions that have urgent deadlines execute at the expense of the transactions that have loose deadlines (earliest deadline first), they might complete before their deadlines; afterwards, the other transactions execute and can still complete in time due to their loose deadlines. From these two scenarios, we can see that the service precedence, which is decided by the priority assignment policy, affects the performance of time-constrained database systems [42].

1.1.2 Concurrency Control

The usual correctness criterion for database transactions is serializability. A serial schedule has no concurrency, but it is of interest since it preserves database consistency. Hence, we are interested in the larger class of schedules that preserve consistency and are equivalent to some serial schedule; such schedules are said to be serializable. Widely used mechanisms for serializing transactions are locking, validation, and timestamping; each takes a different approach to achieving serializability. Whenever a data conflict occurs, concurrency control protocols use blocking, transaction restarts, or combinations of them. In an RTDBS or ARTDBS, the decision to block or restart should take transaction priorities into account.

1.2 Survey of Related Work

1.2.1 Priority Assignment for RTDBS

Priority is assigned based on several types of information. Earliest Deadline First (EDF) and Least Slack First (LSF) are the most common priority assignment policies for real-time systems, and these policies are usually combined with 2PL or OCC for RTDBS.

EDF. In this policy the transaction with the earliest deadline has the highest priority.

LSF. For each transaction T, we define the slack time S = d - (t + E - U), where d is the deadline, t is the current time, E is the expected execution time, and U is the amount of service time consumed by T so far. Slack time is an estimate of how long we can delay the execution of T and still meet its deadline. In LSF the transaction with the least slack time has the highest priority.
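To make the two policies concrete, the following is a minimal sketch (not from the dissertation; the struct fields and function names are invented for illustration) of how the EDF and LSF priority keys could be computed in C, the language used later for the simulators.

/* Minimal sketch (illustrative only): EDF and LSF priority keys.
 * A smaller key means a higher priority in both cases. */
#include <stdio.h>

typedef struct {
    double deadline;       /* d: absolute deadline              */
    double exec_estimate;  /* E: expected execution time        */
    double service_used;   /* U: service time consumed so far   */
} Txn;

/* EDF: earlier deadline wins. */
double edf_key(const Txn *t) { return t->deadline; }

/* LSF: slack S = d - (t + E - U); least slack wins. */
double lsf_key(const Txn *t, double now) {
    return t->deadline - (now + t->exec_estimate - t->service_used);
}

int main(void) {
    Txn t = { 100.0, 40.0, 10.0 };
    printf("EDF key = %.1f, slack at time 50 = %.1f\n",
           edf_key(&t), lsf_key(&t, 50.0));
    return 0;
}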


1.2.2 Priority Assignment for ARTDBS

As real-time systems evolve, tasks become bigger and more complicated. In some situations, a single end-to-end deadline fails to capture the sense of urgency of each individual subtask [23]. Thus, the problem arises in ARTDBS of how to assign a priority to a triggered transaction given the priority of the triggering transaction. Three priority assignment policies, PD, DIV, and SL, have been suggested for immediate and deferred subtransactions [34].


PD. This policy assigns to every subtransaction the same priority as the parent's. The priority of the parent (triggering) transaction, which is based on its deadline, does not change with the triggering of subtransactions.









DIV. Whenever the parent triggers a subtransaction, this policy divides the parent's current slack among all the immediate and deferred subtransactions triggered up to that point. The parent's priority is also adjusted dynamically to reflect the work that has been triggered. This policy uses the estimated execution times of subtransactions that have already been triggered.

SL. This policy adjusts the slack of the parent at each potential triggering point, and the transaction with the least slack has the highest priority. The initial slack is assigned based on predictions about the total execution time of a transaction and its subtransactions, as indicated by the probability of event triggering. The slack is then adjusted at each object or transaction event based on whether the parent transaction triggers a subtransaction or not. This policy assumes future knowledge of subtransaction triggering and their estimated execution times.
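As a rough illustration of the slack-splitting idea behind DIV, the sketch below divides a parent's remaining slack evenly among the subtransactions triggered so far and derives a deadline for each. This is only one possible reading of the policy, not the algorithm from [34]; the names and the child-deadline formula are assumptions made for the example.

/* Illustrative sketch only: one reading of DIV's slack splitting.
 * Names and the child-deadline formula are assumptions, not from [34]. */
#include <stdio.h>

typedef struct { double exec_estimate; double deadline; } SubTxn;

/* Give each of the n subtransactions triggered so far an equal slack share. */
static double slack_share(double parent_slack, int n) {
    return n > 0 ? parent_slack / n : parent_slack;
}

int main(void) {
    double now = 100.0, parent_slack = 90.0;
    SubTxn kids[3] = { {10.0, 0}, {20.0, 0}, {15.0, 0} };
    double share = slack_share(parent_slack, 3);
    for (int i = 0; i < 3; i++) {
        kids[i].deadline = now + kids[i].exec_estimate + share;
        printf("subtransaction %d: deadline %.1f\n", i, kids[i].deadline);
    }
    return 0;
}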


1.2.3 Concurrency Control

RTDBS scheduling algorithms combine various properties of time-critical schedulers with properties of concurrency control algorithms [1, 4, 8, 11, 20, 32, 37, 40, 42, 22]. There is a large body of work on scheduling and concurrency control algorithms that can be summarized as follows:

" Single version 2PL-HP, 2PL-WP [3]

" Optimistic Concurrency Control (OCC) [19, 28]

* Multiversion Concurrency Control [25]

" Mixed Integrated Concurrency Control [29]

" 2PL-CR, Priority Ceiling Protocol (PCP) [1, 37]


* Semantic Concurrency Control [26, 27]









Single version 2PL-HP and 2PL-WP. The Two-Phase Locking (2PL) algorithm [7] executes transactions in two phases. Each transaction has a growing phase, during which it obtains locks and accesses data items, and a shrinking phase, during which it releases locks. No transaction may request a lock after it releases a lock. It is well known that any concurrency control algorithm that obeys the 2PL rule produces serializable schedules. Priority scheduling without knowledge of the data access pattern is presented as a representative of algorithms with incomplete knowledge of resource requirements. These algorithms combine priority scheduling with 2PL. When we use 2PL with priority scheduling, a transaction conflict can arise from incompatible locking modes, and a priority inversion can occur when a higher priority transaction (HPT) requests and blocks on a lock for an object O that is locked by a lower priority transaction (LPT). Conflicts among transactions are resolved using one of the following methods:


High Priority. The High Priority (HP) conflict resolution method is the same as the priority-based wound-wait method (in the priority-based wound-wait protocol, transaction Ti can wait for a conflicting transaction Tj if Ti has the lower priority; otherwise, Tj is aborted (wounded)). The idea of this method is to resolve a conflict in favor of the transaction with the higher priority. The favored transaction gets the resources, both the data lock and the processor. The loser of the conflict relinquishes control of any resources it holds [3].

Wait. Under this policy, priority-inverting conflicts are resolved exactly as non-priority-inverting conflicts. The requesting transaction always blocks and waits for the data object to become free. This is the standard method for most DBMSs that do not execute real-time transactions [3].

Wait Promote. Wait Promote (WP) handles conflicts as Wait does, except when a priority inversion occurs. An HPT Tr will block and wait, but we promote the priority of the lock holder Th so that it is as high as the priority of Tr; in other words, Th inherits the priority of Tr. Since locks are retained until commit time, Th keeps its inherited priority until it commits or is restarted [3].

Based on 2PL, several combinations of the conflict resolution methods and priority assignment policies described previously have been proposed. They include EDF-HP (Earliest Deadline First with High Priority), LSF-HP (Least Slack First with HP), EDF-WP (EDF with Wait Promote), AED-HP (Adaptive Earliest Deadline with HP) [20], Virtual Clock, and Pairwise Value Function [42].
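The core of the HP rule described above fits in a few lines. The following fragment is an illustrative sketch only (the types and names are invented, and priority is modeled as a single number with larger meaning more urgent), not code from the dissertation.

/* Minimal sketch (illustrative): the High Priority (HP) rule for a lock
 * conflict under 2PL. */
typedef struct { int id; double priority; } Txn;

typedef enum { ABORT_HOLDER, BLOCK_REQUESTER } HpDecision;

/* The requester wants a lock held by the holder. */
HpDecision resolve_2pl_hp(const Txn *requester, const Txn *holder) {
    if (requester->priority > holder->priority)
        return ABORT_HOLDER;     /* wound the lower priority lock holder */
    return BLOCK_REQUESTER;      /* requester waits for the holder       */
}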

Optimistic Concurrency Control. Concurrency control algorithms based on locking or timestamps are pessimistic in nature. They assume that conflicts between transactions are quite frequent and do not permit a transaction to access a data item if there is a conflicting transaction that accesses that data item. Thus the execution of any operation of a transaction follows the sequence: validation, read, computation, write. Optimistic algorithms, on the other hand, delay the validation phase until just before the write phase, so an operation submitted to an optimistic scheduler is never delayed. Each transaction initially makes its updates on local copies of data items. The validation phase consists of checking whether these updates would maintain the consistency of the database. If the answer is affirmative, the changes are written into the actual database; otherwise, the transaction is aborted and has to restart.

There have been several approaches that use OCC as a real-time concurrency control method [22, 18, 28]. An OCC scheme with a priority assignment scheme based on deadline and transaction length is presented in [22], and an OCC with several conflict resolution methods has also been proposed in [18]. With the OCC approach, a policy is needed to resolve access conflicts during the validation phase. Some of the policies proposed are commit (always let the transaction being validated commit), priority abort (abort the validating transaction only if its priority is less than that of each conflicting transaction), priority wait (wait for higher priority transactions to complete), and opt-sacrifice (restart the validating transaction if at least one of the transactions in the conflict set has a higher priority).

Although the OCC scheme is shown to perform better than 2PL-HP for firm real-time transactions in some studies [18, 28], it appears to provide better performance only when data conflicts are relatively rare, as shown in another study [22].

Multiversion Concurrency Control. In a multiversion concurrency control algorithm, each write on a data item O produces a new copy or version of O. The DBMS keeps a list of versions of O, which is the history of values that the DBMS has assigned to O. For each read request, the DBMS not only decides when to allow the read request, but also which version of O to read [7]. The benefit of multiple versions is to reduce transaction rejections and thus increase the degree of concurrency.

Maintaining multiple versions may not add much to the cost of concurrency control, because the versions may be needed anyway by the recovery algorithm. Obviously, however, maintaining multiple versions takes a lot of storage space; to control this storage requirement, versions must periodically be purged or archived. As a variant of single version 2PL, real-time multiversion 2PL [25] has been introduced to increase concurrency by adjusting the serialization order dynamically.


Multiversion 2PL (MV2PL). Multiversion 2PL [7] uses three types of locks: read locks, write locks, and certify locks. Their lock compatibility matrix is shown in Table 1.1.


Table 1.1. Compatibility Matrix for MV2PL

            Read   Write   Certify
  Read        Y      Y        N
  Write       Y      Y        Y
  Certify     N      Y        N









* When the MV2PL scheduler receives a write request, it attempts to set the write lock. Since transactions can have their own versions, there is no data conflict between write locks.

* When the scheduler receives transaction Ti's read request for object O, it attempts to set a read lock. Since read locks conflict only with certify locks, it can set the read lock as long as no transaction already owns a certify lock on object O. If Ti already owns a write lock and has therefore written its own version Oi, then the scheduler translates the read request on O into a read request on Oi. Otherwise, it waits until it can set a read lock, sets the lock, and translates the read request on O into a read request on Oj, where Oj is the most recently committed version of O. Since only committed versions may be read, MV2PL avoids cascading aborts and ensures that the MV histories it produces are recoverable.

* When the scheduler receives transaction Ti's commit request, it attempts to convert Ti's write locks into certify locks. Since certify locks conflict with read locks, the scheduler can only do this lock conversion on those data items that have no read locks owned by other transactions. On data items where read locks exist, the lock conversion is delayed until all read locks are released. Thus the effect of a certify lock is to delay Ti's commit until there are no active readers of the data items it is about to overwrite.
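The compatibility matrix of Table 1.1 can be encoded directly as a lookup table. The fragment below is an illustrative sketch (not from the dissertation); it only answers whether a requested lock mode is compatible with a mode already held.

/* Minimal sketch (illustrative): Table 1.1 as a lookup table. */
#include <stdbool.h>
#include <stdio.h>

typedef enum { READ = 0, WRITE = 1, CERTIFY = 2 } LockMode;

/* compat[held][requested], exactly as in Table 1.1 */
static const bool compat[3][3] = {
    /*              READ   WRITE  CERTIFY */
    /* READ    */ { true,  true,  false },
    /* WRITE   */ { true,  true,  true  },
    /* CERTIFY */ { false, true,  false },
};

bool can_grant(LockMode held, LockMode requested) {
    return compat[held][requested];
}

int main(void) {
    printf("read vs certify: %d\n", can_grant(READ, CERTIFY));  /* 0: conflict   */
    printf("write vs write:  %d\n", can_grant(WRITE, WRITE));   /* 1: compatible */
    return 0;
}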


As in single-version 2PL, MV2PL also has the priority inversion problem caused by the locking mechanism. Real-time MV2PL [25] also uses priority-based aborts and blocking to resolve conflicts among transactions.

Mixed Integrated Concurrency Control. Most DBMS schedulers synchronize conflicting operations using one of 2PL, TO (Timestamp Ordering), or OCC. Other DBMS schedulers use a combination of these techniques to ensure that transactions are processed in a serializable manner. DBMS schedulers that combine different mechanisms for read-write and write-write synchronization are called mixed integrated schedulers [7].

Lin and Son [29] have proposed a new concurrency control algorithm based on the mixed integrated concurrency control method that adjusts the serialization order dynamically. The proposed algorithm, which is based on a deferred update policy, uses 2PL for read-write conflicts and the Thomas' Write Rule (TWR) for write-write conflicts.

Thomas' Write Rule. Let Tj be the transaction with the maximum timestamp that wrote into object O before the scheduler receives the write request of Ti on O. If the timestamp of Ti is greater than that of Tj, process the write request of Ti on O as usual; otherwise, process the write request of Ti by simply acknowledging it.
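The write-write handling under TWR amounts to a timestamp comparison. The following fragment is a hedged sketch with invented names (not the algorithm's implementation from [29]): a stale write is acknowledged and dropped rather than causing an abort.

/* Minimal sketch (illustrative): Thomas' Write Rule for write-write
 * synchronization. */
typedef struct { long ts; int value; } VersionedItem;

/* Apply Ti's write (timestamp wts, value v) to item o under TWR. */
void twr_write(VersionedItem *o, long wts, int v) {
    if (wts > o->ts) {   /* newer than the last writer: apply it        */
        o->ts = wts;
        o->value = v;
    }
    /* else: older write; acknowledge and ignore, no abort is needed    */
}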

Their approach resembles OCC in its use of a deferred update policy and resembles 2PL in that conflict information is gathered when transactions access the data items.


There are several approaches that use a priori knowledge for handling real-time transaction scheduling. The static priority assignment policy PCP, which uses priority inheritance with exclusive locks and read/write locks, has been proposed [37, 40].

Priority Ceiling. The priority ceiling of a data object is the priority of the highest priority task that may lock this object.

Priority-ceiling protocol. A transaction J requesting a lock on a data item O is granted the lock only if p(J) > c(P), where P is the data item with the highest priority ceiling among all data items currently locked by transactions other than J, p(J) is the priority of transaction J, and c(P) is the priority ceiling of object P. If J cannot lock O, J is blocked and the transaction holding the lock on P inherits the priority p(J) until P is unlocked.
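For concreteness, the admission test p(J) > c(P) could be sketched as below. This is an illustrative fragment with invented types, not the protocol implementation from [37, 40], and it omits the inheritance and unlocking steps.

/* Minimal sketch (illustrative): the PCP admission test only. */
#include <stdbool.h>

typedef struct { int owner_txn; int ceiling; bool locked; } DataItem;

/* Returns true if the transaction (id txn_id, priority prio) may acquire
 * a new lock, given the items currently locked in the system. */
bool pcp_may_lock(int txn_id, int prio, const DataItem *items, int n) {
    int max_ceiling = -1;      /* highest ceiling held by other transactions */
    for (int i = 0; i < n; i++)
        if (items[i].locked && items[i].owner_txn != txn_id &&
            items[i].ceiling > max_ceiling)
            max_ceiling = items[i].ceiling;
    return prio > max_ceiling; /* the p(J) > c(P) test */
}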









These protocols are nonabortive methods based on transaction pre-analysis, using priority inheritance to prevent priority inversion and indefinite blocking. It is important to note that the concept of a priority ceiling assumes that we know a lot about the transactions that will access the database. This is a reasonable assumption for dedicated real-time applications such as tracking [37]. Although the priority ceiling protocol introduces unnecessary blocking, the worst-case blocking for any task is reduced to at most the duration of one critical section of one lower priority transaction, and no deadlock will ever occur. The critical problem with PCP is that it is not appropriate for disk-resident databases, because an LPT is unnecessarily blocked during the I/O wait time of a conflicting HPT.

Priority scheduling with some a priori knowledge and dynamic priority assignment has been introduced as another approach [3, 8, 21]. Conditional Restart (CR) [3] uses the estimated execution time of transactions to decide between blocking and aborting. The Cost Conscious Approach (CCA) [21] uses the data access pattern to estimate the dynamic costs incurred by the interference among transactions. A conflict-avoiding nonpreemptive method and hybrid algorithms [8], which use conflict-avoiding schemes in the non-overload case and the CR conflict resolution method in the overload case, have also been proposed.


Semantic Concurrency Control. Database consistency is preserved by enforcing serializability. Serializability is often too strict a correctness criterion for real-time applications, where the precision of an answer to a query may still be acceptable even if serializability is not strictly observed in transaction scheduling. A weaker correctness criterion for concurrency control of real-time transactions based on the notion of similarity has been proposed [26], and integrating the similarity concept into database concurrency control methods has been studied [27]. The concept of similarity is based on the observation that data values of a data object that are slightly different are often interchangeable as read data for transactions. This approach assumes that the application semantics allow us to derive a similarity bound for each data object such that two write events on the data object must be similar if their timestamps differ by no more than the similarity bound; i.e., all write events on the same object that occur in any interval shorter than the similarity bound can be swapped in the schedule without violating the consistency requirement [27]. Thus conflicting transactions need not block one another as long as their event conflicts can be resolved by using the similarity bound.


1.3 Summary of Our Research

The goal of our research is to develop techniques for RTDBS and ARTDBS that assign priorities to transactions, schedule transactions, and resolve conflicts in appropriate ways. The tasks that we have accomplished are listed below.

Research for soft RTDBS. Tasks in a real-time system often communicate through shared data, yet have a correctness constraint that each task must appear to execute atomically. In this case, the tasks need to be managed as transactions and scheduled by an RTDBS. We have already developed a cost conscious dynamic priority assignment policy, CCA, which effectively exploits the time accrued by interference among transactions, and have built a simulator and compared the performance [21].


1. Developed Cost Conscious Approach (CCA).

2. Extended CCA so that we can exploit the load factor of the system.

3. Developed the simulator and compared the performance.


Research for firm RTDBS. A firm deadline has different semantics from a soft deadline: by removing tardy transactions from a firm RTDBS, there will never be any tardy transaction in the system. Removing tardy transactions from the system gives some advantages to OCC, which uses a late-stage validation method. There have been some comparisons between 2PL-HP and OCC as concurrency control algorithms for firm RTDBS; both approaches have advantages and disadvantages. We have developed new approaches that combine the advantages of 2PL-HP and OCC:


1. Developed new concurrency control methods that use the immediate restart and deferred restart policies together. Our approaches combine the advantages of 2PL-HP and OCC.

2. Developed the simulator and compared the performance.


Research for ARTDBS. ARTDBS have a more complex transaction model in which transactions may trigger other transactions. We have developed a new priority assignment policy and compared the performance:


1. Developed new priority assignment policy.

2. Developed the simulator and compared the performance.


1.3.1 Contribution of our Work

We can summarize the contribution of our works as follows:


New scheduler for soft deadlines. There have been several approaches [38, 3] that use a priori knowledge for RTDBS. We developed CCA [21], which uses transaction data sets to estimate the approximate cost of transaction rollbacks and restarts. Based on CCA, our new approach tries to include as much information as possible, including system load information.

New scheduler for firm deadlines. Performance comparisons of 2PL versus OCC for conventional DBMS and 2PL-HP versus OCC for RTDBS have been carried out. According to these studies [19], the relative performance of 2PL-HP and OCC changes with the transaction mix and system load. Based on the previous results and our observations on both approaches, we developed new concurrency control methods.









New scheduler for ARTDBS. There are many applications, such as cooperative distributed navigation systems and intelligent network services, where real-time active database technology is extremely useful [33, 34]. As many commercial systems support active capability, a lot of non-traditional applications are being implemented using this capability. We studied priority assignment policies for the more complex active transaction model, compared the performance of PD and DIV, and extended our AVCC for firm RTDBS to fit ARTDBS.


1.4 Structure of Dissertation

The rest of the dissertation is structured as follows. Chapter 2 presents a cost conscious dynamic priority assignment policy that incorporates a system load factor for soft RTDBS and shows the performance of our approach using simulation studies. Chapter 3 presents new ideas that use immediate restart and deferred restart together for firm RTDBS and shows performance comparisons. Chapter 4 presents a priority assignment policy for ARTDBS and shows a performance evaluation on disk-resident databases. Chapter 5 concludes the dissertation with its contributions and future work.














CHAPTER 2
SOFT REAL-TIME: INCORPORATING LOAD FACTOR INTO CCA

A repetitive workload is a common property of real-time and transaction processing systems. In a real-time transaction processing system, users do not run arbitrary programs but rather request the system to execute specific functions out of a predefined set. Each function is an instance of a transaction type; that is, the RTDBS invokes a transaction program that implements the requested function. The random aspect is the sequence and frequency with which programs are invoked [17]. The use of canned transactions and queries whose read and write sets can be predicted beforehand is a step in the right direction, and the data items accessed by a transaction are likely to be known a priori once its functionality is known [35]. Based on the above observation, priority scheduling with some a priori knowledge has been introduced as another approach [3, 8, 27, 37, 40, 21]. Conditional Restart (CR) [3] uses the estimated execution time of transactions to decide between blocking and aborting, and CCA [21] uses the data access pattern to estimate the dynamic costs incurred by interference among transactions. A conflict-avoiding nonpreemptive method and hybrid algorithms [8], which use conflict-avoiding schemes in the non-overload case and the CR conflict resolution method in the overload case based on the data access pattern, have been proposed. The Priority Ceiling Protocol (PCP) [37, 40] uses the data access pattern and static transaction priorities, and the Similarity Stack Protocol (SSP) [27] uses more detailed information by assuming that the application semantics allow a similarity bound to be derived for each data object; thus conflicting transactions need not block one another as long as their event conflicts can be resolved using the similarity bound.

In this part, we view an RTDBS as either a memory- or disk-resident transaction processing system whose workload is composed of a set of canned transactions with individual timing constraints. A timing constraint is expressed in the form of a deadline, and we consider only soft deadline transactions. With canned transactions, we will look at how we can derive a system load factor and how we can use that information for soft real-time transaction scheduling.

2.1 Motivation for Our Approach

The primary motivation for our approach is to answer the question "What kind of information is relevant and how to meaningfully incorporate it into the design of a real-time scheduling algorithm?". Various types of information are useful in different ways. Intuitively, we can do better if we have additional knowledge but the improvement is predicated upon the appropriate use of that knowledge. Figure 2.1 illustrates the classification of various scheduling algorithms proposed in the literature with respect to the type of knowledge used.

Type 0. Does not assume any a priori knowledge. The only available timing information is the deadline (e.g., EDF-HP [3]).

Type 1. Deadline and data access pattern are available (e.g., CCA [21]).

Type 2. Deadline and estimated execution time are assumed to be available (e.g., EDF-CR [3]).

Type 3. Data access pattern and static transaction priorities are assumed to be available (e.g., PCP [39]).

EDF-HP is the simplest and most straightforward approach for an RTDBS. The EDF priority assignment policy minimizes the number of late transactions when the system is lightly loaded; the performance, however, degrades rapidly as the system becomes overloaded. There have been several approaches [3, 21] to overcome this shortcoming of EDF-HP by using additional information. The basic idea of these approaches is to save valuable system resources by not blindly aborting partially executed conflicting transactions.



Figure 2.1. Knowledge type and corresponding approaches

EDF-CR [3] uses type 2 information while CCA [21] uses type 1 information to improve on EDF-HP, and our experiments [13] have shown that CCA is better than EDF-CR for soft real-time systems when the resource time is used as the estimated execution time for EDF-CR. From the experiments [13] we found that the response time, which is the difference between completion time and arrival time, varies considerably in soft RTDBS. Predicting the response time of a transaction is very hard without combining type 1 information with a system load factor, because the response time of a transaction varies with changes in the system load, especially in a soft RTDBS.

In an RTDBS, irrespective of whether it is memory- or disk-resident, the (wall clock) response time has two distinct components: Tstatic, the time needed to execute a transaction in an isolated environment, and Tdynamic, the time spent waiting (both I/O and concurrency related) as well as the abort/restart overhead. Tstatic depends on the semantics of the transaction (e.g., data values accessed and branch points) and is relatively straightforward to estimate. Tdynamic, on the other hand, depends on the current state of the system and on future events, i.e., on the transactions that are currently in the system and the transactions that will arrive in the future. In the database context, Tdynamic is extremely difficult to compute or even estimate, as it depends not only on the resources consumed so far but also on the resources required for completion, which may be affected by future events. Furthermore, Tdynamic is sensitive to the transaction mix and can vary considerably when the transaction mix changes. Nevertheless, including an approximation of Tdynamic as part of the strategy for meeting timing requirements is likely to perform better than not including the dynamic information at all.

Based on the above observations, we propose adaptive cost conscious approaches, CCA-ALF (Cost Conscious Approach with Average Load Factor) and EDF-CR-ALF (EDF-CR with Average Load Factor), which combine type 1 information with a load monitoring mechanism. CCA-ALF and EDF-CR-ALF use type 1 information to calculate the resource time of a transaction and then estimate the system load by using resource times and the corresponding response times. With type 1 information and the current system load, EDF-CR-ALF derives the remaining response time of a transaction for its conflict resolution method, and CCA-ALF changes its priority by incorporating the system load.

2.2 CCA-ALF for Soft Deadline

CCA-ALF uses strict 2PL, exclusive locks only, and the High Priority (HP) conflict resolution method. With type 1 information, which is available through pre-analysis or pre-execution, the conflict and safety relationships for CCA-ALF can be inferred in a straightforward manner.


hasaccessed(TN). The set of data items that transaction TN has accessed since its beginning.

mightaccess(TN). The set of data items that transaction TN might access before its completion.


With mightaccess and hasaccessed, we can calculate the conflict and safety relations as follows:

* Transactions TN and TM conflict iff mightaccess(TN) ∩ mightaccess(TM) ≠ ∅.

* Transaction TN is unsafe with respect to TM iff hasaccessed(TN) ∩ mightaccess(TM) ≠ ∅.
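With canned transactions, the two relations above reduce to set-intersection tests. The following sketch is illustrative only (the bitmap representation and all names are assumptions); it shows one way to evaluate the relations when data sets are represented as bitmaps over object identifiers.

/* Minimal sketch (illustrative): conflict and safety tests from type 1
 * information, with data sets represented as bitmaps over object ids. */
#include <stdbool.h>
#include <stdint.h>

#define WORDS 4                      /* supports a 256-object database */
typedef struct { uint64_t bits[WORDS]; } DataSet;

static bool intersects(const DataSet *a, const DataSet *b) {
    for (int i = 0; i < WORDS; i++)
        if (a->bits[i] & b->bits[i]) return true;
    return false;
}

typedef struct { DataSet has_accessed, might_access; } Txn;

/* TN and TM conflict iff their mightaccess sets intersect. */
bool conflicts(const Txn *tn, const Txn *tm) {
    return intersects(&tn->might_access, &tm->might_access);
}

/* TN is unsafe w.r.t. TM iff hasaccessed(TN) intersects mightaccess(TM). */
bool unsafe_wrt(const Txn *tn, const Txn *tm) {
    return intersects(&tn->has_accessed, &tm->might_access);
}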









2.2.1 Priority Assignment

CCA-ALF uses a dynamic priority assignment policy with a continuous evaluation method, which evaluates the priority several times during the execution of a transaction in order to include some of the dynamic features as the transaction progresses. If the transaction Ti which is selected to be run next conflicts with transactions that are unsafe with respect to Ti, we might lose

Timelost(Ti) = Σ_{Tj ∈ M} (rollbackj + execj),   M = {Tj | Tj is unsafe with respect to Ti}

where execj is the effective service time of Tj and rollbackj is the time required to roll back Tj. If the value of Timelost(Ti) is large, executing Ti wastes system resources. We characterize this time lost as the penalty of conflict. The penalty of conflict is the value Timelost(Ti), which is the sum of the effective service times and rollback times of the transactions that must be aborted and rolled back in order to execute Ti to its commit point without interruption.

The notion of the penalty of conflict described above is introduced into our CCA-ALF dynamic priority computation formula as follows. If Pr(Ti) is the priority of transaction Ti and d(Ti) is the deadline of transaction Ti, then

Pr(Ti) = -(d(Ti) + w × Timelost(Ti))


Our priority formula uses the absolute deadline as one of its components. As time progresses, although the values become larger and larger, the effect of the Timelost component is not diminished, because the relative priority order depends on the differences between the values of the Pr function, not on their absolute values. The contribution of Timelost to the CCA-ALF priority formula can be controlled by the value of w. Although values of w over some ranges showed good performance [21], we can improve the performance by fine-tuning the value of w, since no priority assignment policy shows good performance consistently across different load situations. Since the value of Timelost consists of the effective service times of conflicting transactions, it does not include the system load. One way to make this approach adaptive to the system load is to adjust the value of w using the load of the system.

As we mentioned before, transaction response time varies substantially with the changes of system load in soft RTDBS. By computing the ratio (Load Factor) of response time to corresponding resource time of each completed transaction, we might be able to predict the system load. Resource time can be derived from type 1 information by assuming that the processing time for each accessed data item does not change enormously.


Resource Time = (Number of data accesses × cpu-time) + (Number of disk reads × disk-time)

Response Time = completion time − arrival time

Load Factor (LF) = Response Time / Resource Time

The LF of a single transaction cannot represent the system load properly. We use the Average Load Factor (ALF) of previously completed transactions to represent the current system load. The LF of transactions that finished long ago should not contribute to the current system load either. Thus we maintain a list (called the lf-list) to keep track of the LF values of the N most recently finished transactions, and it is updated whenever a transaction finishes.


Average Load Factor (ALF) = ( Σ_{Ti ∈ lf-list} LF(Ti) ) / N

With the ALF value the priority formula of this approach is as follows:


Pr(Ti) = -(d(Ti) + (penalty-weight x ALF x TimeLost(Ti)))
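The resulting priority computation is a one-liner; the sketch below simply transcribes the formula above into C (the field names are invented, and a larger Pr value is taken to mean higher priority).

/* Minimal sketch (illustrative): the CCA-ALF priority formula. */
typedef struct {
    double deadline;   /* absolute deadline d(Ti)            */
    double time_lost;  /* penalty of conflict, Timelost(Ti)  */
} CcaTxn;

double cca_alf_priority(const CcaTxn *t, double penalty_weight, double alf) {
    return -(t->deadline + penalty_weight * alf * t->time_lost);
}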


In lightly loaded situations the value of ALF is close to 1, and our priority assignment policy approximately resembles CCA, which showed good performance when the system is under light or medium load. As the system load increases, the effect of the deadline in the priority formula decreases due to the increase of ALF. Thus in heavily loaded situations the results obtained using our priority formula are comparable to Random Priority (RP) [18], which showed good performance under heavy load, since the product of ALF and Timelost overrides the effect of the deadline in the formula. Our priority formula therefore helps to balance the urgency of transactions against the waste of system resources in different load situations.

2.2.2 Scheduling Algorithm

The procedure tr-arrival-sched is invoked whenever a new transaction arrives, and the procedure tr-finish-sched is invoked whenever a running transaction finishes. Both procedures use ALF and the penalty of conflict (an approximation of the dynamic cost) of transactions, and tr-finish-sched inserts the LF into the lf-list and updates ALF. Thus ALF is updated whenever a transaction finishes its execution. The sleep queue holds transactions that are blocked, and the partially executed transaction list (P-list) links all transactions that have executed partially. The ALF introduced in the priority formula is used to weigh the contribution of the penalty of conflict to the computed priority value. The ALF value ranges from 1 upwards.

Procedure tr-arrival-sched(Ti)
BEGIN
    Put Ti in the ready queue;
    FOREACH transaction in the ready queue
        assign a new priority;
    Sort the ready queue, choose the highest priority transaction, and run it;
END









Procedure tr-finish-sched(Ti)
BEGIN
    Insert LF(Ti) into the lf-list;
    Update ALF;
    Remove Ti from the system;
    FOREACH transaction in the ready queue
        assign a new priority;
    Sort the ready queue, choose the highest priority transaction, and run it;
END
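The lf-list that tr-finish-sched maintains can be kept as a fixed-size circular buffer with a running sum, so that updating ALF costs O(1) per completed transaction. The sketch below is illustrative only; it assumes the window of 20 transactions mentioned later in the simulation setup, and the names are invented.

/* Minimal sketch (illustrative): circular lf-list and ALF update. */
#define LF_WINDOW 20

typedef struct {
    double lf[LF_WINDOW];
    int    next;    /* slot to overwrite next            */
    int    count;   /* number of valid entries (<= N)    */
    double sum;     /* running sum of the stored LFs     */
} LfList;

void lf_record(LfList *l, double response_time, double resource_time) {
    double lf = response_time / resource_time;
    if (l->count == LF_WINDOW)
        l->sum -= l->lf[l->next];   /* drop the oldest LF */
    else
        l->count++;
    l->lf[l->next] = lf;
    l->sum += lf;
    l->next = (l->next + 1) % LF_WINDOW;
}

double current_alf(const LfList *l) {
    return l->count ? l->sum / l->count : 1.0;   /* ALF defaults to 1 */
}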



Disk I/O introduces new problems in real-time transaction scheduling. There are several choices when I/O wait occurs. We have considered the following 3 choices:

1. Pick the highest priority transaction among the ready transactions.

2. Pick the highest priority transaction among transactions that are ready and do not conflict with any partially executed higher priority transaction.

3. Pick the highest priority transaction among transactions that are ready and do not conflict with any partially executed transaction.

Of the above, we found that the second choice comes out best for soft real-time transactions [13] and applied it to CCA-ALF and EDF-CR-ALF, where type 1 information is available. Consider the following scenario: transaction T1 is blocked and is waiting for an I/O completion. The next highest priority transaction, T2, gets the CPU and starts executing so as not to waste the CPU. If T2 is unsafe with respect to T1, then T2 performs a noncontributing execution, because it must be rolled back when T1 unblocks. This situation is worse than the situation in which no transaction is selected to execute during T1's I/O wait time, because of the cost incurred in rolling T2 back. If the third highest priority transaction, T3, accesses a data set disjoint from those of T1 and T2, then T3 is the better choice. In our approach we select T3 rather than T2 during T1's I/O wait using the type 1 information. Even though the third choice prevents noncontributing executions, it might limit the concurrency of the system too much.

A noncontributing execution is defined as the execution of a lower priority transaction during the I/O wait of a higher priority transaction, where the lower priority transaction has to be rolled back when the higher priority transaction finishes its I/O [21].

2.3 EDF-CR-ALF for Soft Deadline

EDF-CR [3] uses the estimated execution time of a transaction when deciding whether to abort a conflicting lower priority transaction or block a higher priority transaction. The problem here is that the remaining execution time of a transaction does not consider Tdynamic at all. With soft deadlines, the response time varies considerably with changes in the system load. Thus it seems naive for EDF-CR to use a statically estimated execution time that does not consider changes to the system load at all. Our simulations [13] showed that in heavily loaded situations EDF-CR is worse than EDF-HP when the resource time of a transaction is used as the estimated execution time. For this reason, in EDF-CR-ALF we use a dynamically estimated response time instead of a statically estimated execution time when transactions block or abort.

As we explained before, we can estimate the remaining response time dynamically by using the additional information available about transactions. With type 1 information and the system load, we can estimate the remaining response time of a transaction dynamically by using the statically calculated resource time and the dynamically traced ALF. The slack time (Sr) of a lock-requesting higher priority transaction Tr and the remaining response time (RRT) of a lock-holding transaction Th can be calculated dynamically using the following formulas, and the priority is assigned based on the EDF policy.


Remaining Response Time (RRT) = Remaining Resource Time x ALF


Sr(Tr) = deadline(Tr) − (current time + RRT(Tr))

We name this approach EDF-CR-ALF (EDF-CR with ALF), and its conflict resolution procedure is as follows:

Procedure EDF-CR-ALF-sched
BEGIN
    IF Pr(Th) < Pr(Tr) THEN
        IF RRT(Th) < Sr(Tr) THEN
            Block Tr;
            Th inherits Pr(Tr);
            Run Th;
        ELSE
            Abort Th;
            Run Tr;
    ELSE
        Block Tr;
        Run Th;
END
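The decision logic of EDF-CR-ALF-sched can be restated compactly as follows. This is a hedged sketch of the procedure above, not the simulator's code; the types and names are invented, and RRT is computed as the remaining resource time times ALF, as defined earlier.

/* Minimal sketch (illustrative): EDF-CR-ALF conflict resolution for a
 * lock-requesting transaction Tr and a lock holder Th. */
typedef struct {
    double priority;            /* EDF priority (larger = more urgent)      */
    double deadline;
    double remaining_resource;  /* statically derived remaining resource time */
} EcTxn;

typedef enum { BLOCK_TR_AND_INHERIT, ABORT_TH, BLOCK_TR } EcAction;

static double rrt(const EcTxn *t, double alf) {
    return t->remaining_resource * alf;   /* RRT = remaining resource time x ALF */
}

EcAction resolve_edf_cr_alf(const EcTxn *th, const EcTxn *tr,
                            double now, double alf) {
    if (th->priority < tr->priority) {            /* priority inversion        */
        double slack_r = tr->deadline - (now + rrt(tr, alf));
        if (rrt(th, alf) < slack_r)
            return BLOCK_TR_AND_INHERIT;          /* Th can still finish in time */
        return ABORT_TH;
    }
    return BLOCK_TR;                              /* ordinary 2PL blocking     */
}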









We expect EDF-CR-ALF to perform better than EDF-HP in lightly loaded situations, but it is likely to be almost the same as EDF-HP when the system is heavily loaded, because under heavy load most transactions in the system do not have enough slack time to wait for the completion of a conflicting lower priority transaction. The advantage of EDF-CR-ALF over EDF-CR is that EDF-CR-ALF is never worse than EDF-HP in any situation, because it does not overestimate the slack time of a lock-requesting higher priority transaction.

2.4 Performance Evaluation

In order to evaluate the performance of the CCA-ALF algorithm described in this part, two simulations of a real-time transaction scheduler were implemented (using the C language and the SIMPACK simulation package [16]) for main-memory [15] and disk-resident databases, as shown in Figure 2.2.

Figure 2.2. Open Network Model for the simulation (single CPU)


The parameters used in the simulations are shown in Table 2.1. In these simulations, transactions enter the system according to a Poisson process with arrival rate λ (i.e., exponentially distributed inter-arrival times with mean value 1/λ).










Table 2.1. Parameters and their meanings for CCA-ALF

  Parameter        Meaning
  db-size          Number of objects in the database
  max-size         Size of the largest transaction
  min-size         Size of the smallest transaction
  i/o-time         I/O time for accessing an object (read/write)
  cpu-time         CPU computation per object accessed
  disk-prob        Probability that an object is accessed from disk
  update-prob      Probability that an accessed object is updated
  min-slack        Minimum slack
  max-slack        Maximum slack
  restart-time     Time needed to roll back and restart
  penalty-weight   Weight of the penalty of conflict



Transactions are ready to execute when they enter the system (i.e., release time equals arrival time). The number of objects updated by a transaction is chosen uniformly between min-size and max-size, and the actual database items are chosen uniformly from the db-size objects in the database.

After accessing an object, a transaction spends cpu-time doing some work with or on that object and then accesses the next object. The assignment of a deadline is controlled by the resource time of a transaction and two parameters, min-slack and max-slack, which set, respectively, a lower and an upper bound on the percentage of slack time relative to the resource time. A deadline is calculated by adding the resource time and the slack time; the slack time is calculated by multiplying the slack percent by the resource time, where the slack percent is chosen uniformly from the range min-slack to max-slack.


Deadline = arrival time + resource time × (1 + slack percent / 100)
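As an illustration of this workload model, the following C sketch generates Poisson arrivals and assigns deadlines as described above. The random-number helpers and the parameter values mirror Table 2.2, but they are placeholders, not the SIMPACK routines used in the actual simulator.

    #include <stdio.h>
    #include <stdlib.h>
    #include <math.h>

    #define MIN_SIZE   8        /* objects                          */
    #define MAX_SIZE   24
    #define CPU_TIME   10.0     /* ms per object (main memory case) */
    #define MIN_SLACK  50.0     /* % of resource time               */
    #define MAX_SLACK  550.0

    static double uniform(double lo, double hi)
    {
        return lo + (hi - lo) * ((double)rand() / RAND_MAX);
    }

    /* Exponential inter-arrival time with rate lambda (Poisson arrivals). */
    static double next_interarrival(double lambda)
    {
        double u = 1.0 - (double)rand() / ((double)RAND_MAX + 1.0);  /* in (0,1] */
        return -log(u) / lambda;
    }

    /* Deadline = arrival time + resource time x (1 + slack percent / 100). */
    static double assign_deadline(double arrival_time, int *n_objects)
    {
        *n_objects = MIN_SIZE + rand() % (MAX_SIZE - MIN_SIZE + 1);
        double resource_time = CPU_TIME * (*n_objects);
        double slack_percent = uniform(MIN_SLACK, MAX_SLACK);
        return arrival_time + resource_time * (1.0 + slack_percent / 100.0);
    }

    int main(void)
    {
        srand(42);
        double clock_ms = 0.0;
        double lambda_per_ms = 4.0 / 1000.0;   /* 4 transactions per second */
        for (int i = 0; i < 5; i++) {
            int n;
            clock_ms += next_interarrival(lambda_per_ms);
            double dl = assign_deadline(clock_ms, &n);
            printf("arrival %.1f ms, %d objects, deadline %.1f ms\n", clock_ms, n, dl);
        }
        return 0;
    }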

Disk accesses for the disk resident database are controlled by disk-prob when a transaction reads an object. The use of disk-prob to some extent models data maintained in the buffer. At commit time, objects that have been updated are flushed. The parameter update-prob controls the number of data objects that must be written at commit time. We use restart-time to model the rollback of a transaction and its restart. The restarted transaction will









access the same data objects, and we maintain the 20 most recently finished transactions in a circular list to keep track of the current ALF.
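A minimal sketch of such a circular list is shown below. It assumes, since this section does not spell it out, that ALF is taken to be the average ratio of a finished transaction's observed response time to its statically estimated resource time over the last 20 completions; the type and function names are invented for the sketch.

    /* Circular list tracing ALF over the last ALF_WINDOW finished transactions. */
    #define ALF_WINDOW 20

    typedef struct {
        double ratio[ALF_WINDOW]; /* response time / resource time per finished txn */
        int    next;              /* slot to overwrite next                          */
        int    count;             /* number of valid entries (<= ALF_WINDOW)         */
    } AlfTracker;

    static void alf_init(AlfTracker *a) { a->next = 0; a->count = 0; }

    /* Record one finished transaction. */
    static void alf_record(AlfTracker *a, double response_time, double resource_time)
    {
        a->ratio[a->next] = response_time / resource_time;
        a->next = (a->next + 1) % ALF_WINDOW;
        if (a->count < ALF_WINDOW) a->count++;
    }

    /* Current ALF; 1.0 (no load inflation) until the first completion arrives. */
    static double alf_current(const AlfTracker *a)
    {
        if (a->count == 0) return 1.0;
        double sum = 0.0;
        for (int i = 0; i < a->count; i++) sum += a->ratio[i];
        return sum / a->count;
    }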

In our performance evaluation, we measure three performance metrics (defined below) commonly used in the literature for RTDBS: i) miss percent, ii) restart rate, and iii) mean lateness.
Miss Percent = (Total number of transactions that missed the deadline / Total number of transactions that entered the system) × 100

Restart Rate = Total number of restarts / Total number of transactions that entered the system

Mean Lateness = Σ over Ti in tardy transactions of (completion-time(Ti) - deadline(Ti)) / Total number of transactions that entered the system

Another way of measuring the performance is to compute the lowest arrival rate that causes a 20% miss rate. We consider a system to be heavily loaded when it misses more than 20% of transactions [19]. Thus we define this arrival rate as the boundary arrival rate.
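The three metrics can be accumulated with a few counters, as in the hedged C sketch below; note that mean lateness sums lateness over tardy transactions only but divides by all transactions that entered the system, exactly as defined above. The structure and names are illustrative, not the simulator's code.

    #include <stdio.h>

    typedef struct {
        long   entered;        /* transactions that entered the system           */
        long   missed;         /* transactions that missed their deadline        */
        long   restarts;       /* total number of restarts                       */
        double total_lateness; /* sum over tardy txns of (completion - deadline) */
    } Metrics;

    static void record_restart(Metrics *m) { m->restarts++; }

    static void record_completion(Metrics *m, double completion, double deadline)
    {
        m->entered++;
        if (completion > deadline) {
            m->missed++;
            m->total_lateness += completion - deadline;
        }
    }

    static void report(const Metrics *m)
    {
        printf("miss percent  = %.2f %%\n", 100.0 * m->missed / m->entered);
        printf("restart rate  = %.3f\n", (double)m->restarts / m->entered);
        printf("mean lateness = %.1f ms\n", m->total_lateness / m->entered);
    }

    int main(void)
    {
        Metrics m = {0};
        record_completion(&m, 180.0, 200.0);  /* met its deadline  */
        record_restart(&m);
        record_completion(&m, 260.0, 220.0);  /* missed by 40 ms   */
        report(&m);
        return 0;
    }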

2.4.1 Main Memory DB

In this simulation we have a single processor and a memory resident database. We do not consider any durability property here in order to isolate the effects of transaction scheduling and concurrency control methods. Thus the resource time of a transaction only depends on cpu-time and the number of objects the transaction accesses. The values of the parameters used in this simulation are shown in Table 2.2. The value of db-size has been chosen to increase data conflict among transactions, and 20,000 transactions were executed for each experiment so that 95% confidence intervals were obtained whose half-widths are less than 2.5%.

Effect of arrival rate

In this experiment, we varied the arrival rate from 1 tr/sec to 7 trs/sec with the base parameters shown in Table 2.2 and measured the miss percent, the number of restarts











Table 2.2. Base parameters for main memory database

  Parameter        Value
  db-size          250
  max-size         24
  min-size         8
  cpu-time         10 ms
  min-slack        50 (%)
  max-slack        550 (%)
  restart-time     5 ms
  penalty-weight   1


per transaction, and mean lateness for EDF-HP, EDF-CR-ALF, CCA, and CCA-ALF. With the base parameters, the maximum capacity of the system (assuming no blocking and aborts) is

10 ms/object × 16 objects/transaction = 160 ms/transaction (i.e., at most about 6.25 transactions/second)

If we consider the effects of blocking and aborting (dynamic factors), the capacity of the system will be much less than this maximum. Figure 2.3 shows the effect of arrival rate on the percentage of transactions that miss their deadline. The boundary arrival rates of EDF-HP, CCA, EDF-CR-ALF, and CCA-ALF are approximately 4.4, 4.6, 4.5, and 4.6 trs/sec, respectively. Figure 2.4 shows the effect of arrival rate on the restart rate of transactions, and Figure 2.5 shows mean lateness on a logarithmic scale.

CCA-ALF shows better performance than CCA, EDF-HP, and EDF-CR-ALF, especially when the arrival rate is between 3 and 5.5 trs/sec. Within this arrival range, CCA-ALF and EDF-CR-ALF show far fewer transaction restarts than EDF-HP. In general, fewer transaction restarts do not guarantee better performance, but CCA-ALF reduces expensive restarts to achieve better performance. This phenomenon can be seen clearly in the multiclass experiment presented later. Observe that for the base parameters shown in Table 2.2, the number of restarts climbs steeply up to an arrival rate of 4 and then declines sharply (Figure 2.4). The reason for the sharp decline is












that beyond a specific arrival rate, it is less likely that an arriving transaction will have an earlier deadline than the currently running transaction. After the peak point in Figure 2.4, it is usually the case that the currently running transaction arrived a long time ago but could not get system services due to the heavy load on the system (most of the dynamic factors in heavily loaded situations are arrival blockings rather than preemption blockings and aborts [43]). Thus, fewer transactions are preempted and there are fewer opportunities for restarts [1].

Figure 2.3. Miss percent (CCA-ALF)


Figure 2.4. Restart Rate (CCA-ALF)









Figure 2.5. Mean Lateness (CCA-ALF)

Effect of multiclass (Transaction mix)

In this experiment, the arriving transactions are divided into three classes (class 0, 1, and 2) and assigned different values of cpu-time: 1 ms for class 0, 10 ms for class 1, and 100 ms for class 2. We assigned 1 ms as the restart-time for all classes because the resource times of class 0 transactions are between 8 ms and 24 ms. The other parameters are the same as in the previous experiment. Thus data contention remains the same, but the amount of resource time for each class is different. With these assignments a lower class (the lowest is class 0) transaction has a shorter resource time and, as a result, a shorter slack time. The maximum capacity of the system (disregarding blockings and aborts) is:

(1 + 10 + 100)/3 ms/object × 16 objects/transaction = 592 ms/transaction (about 1.7 transactions/second)

Different assignments of cpu-time for each transaction class create a lot of variance in transaction resource times (the resource time of a transaction varies from 8 ms to 2400 ms). Therefore, there will be more chances for transaction preemption. Figure 2.6 shows the results of this experiment. The boundary arrival rates of EDF-HP, EDF-CR-ALF, CCA, and CCA-ALF are 0.95, 1.0, 1.1, and 1.15 trs/sec, respectively, in Figure 2.6. Thus CCA-ALF schedules more transactions without missing more than 20% of them.












With the variation of cpu-time there is a higher possibility that an arriving transaction will have an earlier deadline than the currently executing transaction. Thus the restart rate per transaction in this experiment is increased for both approaches, as can be observed from Figures 2.4 and 2.7. CCA-ALF shows better performance especially when the arrival rate is between 0.6 and 1.4 trs/sec. Within this arrival range CCA-ALF shows far fewer transaction restarts compared to EDF-HP. CCA-ALF reduces very expensive restarts to achieve better performance in the multiclass situation. This experiment also indicates the adaptive nature of the CCA-ALF approach, in which the dynamic cost changes as the transaction mix changes and reduces the effect of the deadline accordingly.

Figure 2.6. Multiclass: Miss Percent (CCA-ALF)

Figure 2.7. Multiclass: Restart Rate (CCA-ALF)









Figure 2.8. Multiclass: Mean Lateness (CCA-ALF)

Another metric of comparison for this experiment is to observe the miss percent for each class. In this experiment data contention is the same for all classes, but their active resource requirements are different, because the transactions belonging to classes 1 and 2 require more cpu-time to process their data objects. The relative difference in miss percent between the classes is reduced after an arrival rate of 1 trs/sec for both approaches (Figure 2.9). The reason is that after this point preemption of transactions is reduced and the execution behavior is more serialized.

We plot the miss percent for each class from an arrival rate of 0.6 trs/sec to 1.4 trs/sec for EDF-HP and CCA-ALF in Figure 2.9 (the miss percent is too small to plot when the arrival rate is less than 0.6 trs/sec, and the behavior of EDF-CR-ALF is almost the same as that of EDF-HP). Their relative difference is reduced when the arrival rate is greater than 1.4 trs/sec. From Figure 2.9, we can see that EDF-HP and EDF-CR-ALF blindly favor shorter transactions. Thus EDF-HP and EDF-CR-ALF cause very expensive restarts by aborting transactions that have consumed a lot of resources. CCA-ALF (the behavior of CCA is almost the same as that of CCA-ALF) also favors shorter transactions, but CCA-ALF avoids expensive restarts by not aborting transactions that have consumed a lot of resources. In Figure 2.9 the miss percent of class 0 transactions is higher than that of class 1 transactions in our experiment. The reason is that class 0 transactions are very vulnerable due to their relatively small absolute slack time.






















                           EDF-HP                              CCA-ALF
  Arrival Rate (trs/sec)   0.6   0.8   1.0    1.2    1.4       0.6   0.8   1.0   1.2   1.4
  Miss Percent (Class 2)   2.63  5.91  11.71  19.80  24.94     1.04  2.64  4.81  7.92  16.68
  Miss Percent (Class 0)   0.63  1.86  5.3    11.52  19.1      0.8   2.48  5.29  9.22  15.1
  Class 2 / Class 0        4.17  3.18  2.2    1.72   1.3       1.3   1.06  0.9   0.85  1.18

Figure 2.9. Miss percent for each class and Proportion of class 2 to class 0









We expected that there would be less discrimination against long running transactions in CCA-ALF than in EDF-HP because CCA-ALF implicitly considers the effective service time of a transaction, as we can see in Figure 2.9. Discrimination against long running transactions in RTDBS is discussed in [32]. In their experiment each class requires a different range of object numbers; thus each class has a different level of data contention and resource time. In our experiment, however, each class only has a different level of resource contention. That is the reason why their experiment shows more discrimination against long running transactions. Also, the formula used for priority computation currently does not distinguish between transaction classes. This can easily be included in the formula that computes the penalty of conflict.

CCA-ALF shows much better performance especially when the variance of execution time is high among transactions, by not aborting transactions that have already consumed a lot of resource time.

2.4.2 Disk Resident DB

In order to measure the performance of our algorithm on a disk resident database, we extended the simulation program to perform experiments for this case. In this simulation we assumed that we have a single processor, a single disk, and FCFS I/O scheduling. If a transaction is aborted while waiting in the disk queue, the transaction is deleted from the disk queue immediately. However, if a transaction is aborted during its I/O access, it is not deleted until it releases the disk. We used deferred update rather than immediate update for fast rollback [6]. Thus we assume that transaction rollback and restart do not require any disk access.

The values of the parameters used for this experiment are shown in Table 2.3. The values of cpu-time and i/o-time are chosen to balance the utilization of the CPU and the disk [5, 4, 44]. With these parameter assignments the system is slightly I/O bound. Resource time in this experiment depends on cpu-time, the number of objects, the number of









disk accesses, and i/o-time. Since the deadline is assigned based on pre-commit time, we inspect the timing requirement and release locks when a transaction pre-commits. As we have two system resources in this experiment and disk-prob is 0.5, we assigned 0.5 as the value of penalty-weight. The rationale is to distribute the penalty of conflict over the system resources.


Table 2.3. Base parameters for disk resident database

  Parameter        Value
  db-size          250
  max-size         24
  min-size         8
  i/o-time         25 ms
  cpu-time         15 ms
  disk-prob        0.5
  update-prob      0.5
  min-slack        100 (%)
  max-slack        650 (%)
  restart-time     5 ms
  penalty-weight   0.5



With the base parameters in Table 2.3 the maximum capacity of the system is:

16 objects/transaction × 15 ms/object = 240 ms/transaction

This calculation is very optimistic because it includes neither the abort cost nor the blocking cost of transactions.

Effect of arrival rate

In this experiment, we varied the arrival rate from 0.6 tr/sec to 2 trs/sec with the base parameters shown in Table 2.3 and compared the EDF-HP, EDF-CR-ALF, and CCA-ALF schemes. Increasing the arrival rate increases time contention as well as data contention, and thus increases the transaction miss percent for all three approaches. CCA-ALF and EDF-CR-ALF, which use type 1 information to reduce noncontributing execution, show a much larger improvement over EDF-HP for the disk resident database (as expected) as compared to the









main memory case. The boundary arrival rates of EDF-HP, EDF-CR-ALF, and CCA-ALF are 1.2, 1.42, and 1.43 trs/sec respectively.

The reason for the earlier rapid increase of the restart rate in EDF-HP than in EDF-CR-ALF and CCA-ALF is that when the arrival rate is high, the number of available transactions increases, which in turn causes high data contention. High data contention causes many transactions to block, which eventually increases the number of active transactions, i.e., transactions that have begun execution but have not finished yet, in the system. Thus the priority-based restarts of active transactions that are blocked waiting for locks or resources increase very rapidly. The increase in the restart ratio means that a larger fraction of disk time is spent doing work that will be redone later [5]. Wasted resource time due to priority-based restarts causes high resource utilization and easily saturates the bottleneck resource, which induces longer I/O wait times. With the longer I/O wait time more transactions are scheduled, and that increases the I/O wait time further. Thus the possibility of restarting an active transaction increases further. After the peak point, restart rates slowly increase as shown in Figure 2.11. This is because the number of restarts due to a higher priority transaction's I/O wake up increases, while the restarts caused by higher priority transactions' arrivals are gradually reduced. The number of restarts will flatten out eventually as the arrival rate increases.

Even though the number of available transactions increases as the arrival rate increases, the number of useful transactions for CCA-ALF and EDF-CR-ALF increases very slowly. Thus the number of active transactions is relatively small, as shown in Figure 2.12, until an arrival rate of 1.6 trs/sec. As a result, the number of priority-based restarts for CCA-ALF and EDF-CR-ALF increases slowly, as can be seen in Figure 2.11. After an arrival rate of 1.6 trs/sec the number of active transactions for EDF-CR-ALF and CCA-ALF increases, because both approaches have chosen transactions that seemingly do not conflict with partially executed higher priority transactions.











In the heavy load situation, the conflict resolution policy of CCA-ALF resembles EDF-Wait, which uses a nonabortive method. Thus the restart rate of CCA-ALF is less than that of EDF-CR-ALF, which uses the HP conflict resolution method independent of system load, after an arrival rate of 1.6 trs/sec. That is the reason why CCA-ALF has a lower restart rate than EDF-CR-ALF after an arrival rate of 1.8 trs/sec, even though CCA-ALF has a larger number of active transactions.


Figure 2.10. DISK: Miss Percent (CCA-ALF)


Figure 2.11. DISK: Restart Rate (CCA-ALF)

Figure 2.12. DISK: No. of active tr. (CCA-ALF)

Figure 2.13. DISK: Mean Lateness


2.5 Conclusions


Synthesizing static and dynamic information available to transactions seems to be a viable approach for obtaining scheduling policies to meet the requirements of real-time




transactions. CCA-ALF and EDF-CR-ALF, described in this part, use dynamic priority assignment with a continuous evaluation method to adapt to load changes effectively and to reduce the excessive restart problem encountered by EDF-HP in high data contention situations. CCA-ALF uses its new priority formula for both resource and data conflict resolution, adapting to the current system load by using type 1 information and the ALF of the system, while EDF-CR-ALF uses EDF for resource conflicts and CR only for its data conflicts. According to our simulation results, CCA-ALF's way of using the available information is better than that of EDF-CR-ALF.

Our simulations indicate that

1. CCA-ALF performs better than EDF-CR-ALF for soft deadlines over a wide range of arrival rates.

2. CCA-ALF is more fair than EDF-CR-ALF.

3. CCA-ALF shows particularly good performance when the transactions have a wide range of processing requirements.

4. Reducing noncontributing execution by using type 1 information dominates the performance of disk resident databases.














CHAPTER 3
FIRM REAL-TIME: DEFERRED-RESTART APPROACH




3.1 Introduction

The main focus of research in the RTDBS area has been the problem of scheduling transactions to meet the time constraints associated with each transaction. The scheduler of an RTDBS is responsible for assigning priorities [21, 22, 30, 42] and resolving access conflicts among transactions based on priorities (concurrency control) [3, 8, 19, 25, 26, 27, 28, 29, 37]. Among them, several approaches [2, 19, 22, 28] have been studied specifically to improve the performance of firm RTDBS. We can classify these approaches as follows:

Overload management. By removing jobs with infeasible deadlines from the system as early as possible, the Feasible Deadline (FD) approach [2] can significantly improve the performance of firm RTDBS. The basic idea is not to spend time on transactions that are likely to miss their deadlines. However, the predictive FD approach requires an estimation of the transaction execution time, which may be difficult or even impossible to obtain due to database characteristics.

Priority assignment. Earliest Deadline First (EDF) and Least Slack First (LSF) [30] are well known ways to assign priorities to soft and firm deadline transactions, and Adaptive Earliest Deadline (AED) [19] has been proposed for firm transactions. In AED, transactions are assigned to HIT and MISS buckets whose sizes are controlled by using the past system load. Priorities of transactions in the HIT group are assigned based on EDF and those in the MISS group are assigned based on Random Priority (RP). The priorities of transactions in the MISS group are always lower than those of the HIT group. AED tries to make the miss percent of the HIT and MISS groups close to 0% and 100%, respectively. The basic idea is to spend more time on transactions in the HIT group and less time on transactions in the MISS group.

Concurrency Control. Several Optimistic Concurrency Control (OCC) variants have been proposed for concurrency control in firm RTDBS [19, 20, 22, 28]. By delaying validation until commit time, transactions that are eventually going to miss their deadlines never restart other transactions. OCC variants show better performance than locking based approaches in simulations [20, 28] and show comparable performance on a testbed [22].

3.2 Related Work

Two Phase Locking with High Priority (2PL-HP) [2] uses blocking and immediate restarts to maintain the consistency of databases. When a lower priority transaction (LPT) tries to access data that has already been accessed by a higher priority transaction (HPT) in a conflicting mode, we block the LPT until the HPT releases the corresponding data. When an HPT tries to access data that has already been accessed by an LPT in a conflicting mode, we restart the LPT (immediate restart) and let the HPT access the data.

OCC uses only a validation-phase restart at commit time to keep databases consistent. With the OCC approach, a policy is needed to resolve access conflicts during the validation phase. Some of the policies proposed are commit (always let the transaction being validated commit), priority abort (abort the validating transaction only if its priority is less than that of each conflicting transaction), priority wait (wait for higher priority transactions to complete), and opt-sacrifice (restart the validating transaction if at least one of the transactions in the conflict set has a higher priority); their performance has been studied in [19, 20, 22, 28].
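To make the four options concrete, the C sketch below maps each policy to the action taken on the validating transaction, given the priorities of the transactions in its conflict set (a larger value meaning a higher priority). The types and the helper function are invented for illustration; they are not the algorithms of [19, 20, 22, 28], and the restarts that a successful commit would broadcast to the conflict set are omitted.

    #include <stdbool.h>
    #include <stddef.h>

    /* Validation-time options listed above (sketch only). */
    typedef enum { POLICY_COMMIT, POLICY_PRIORITY_ABORT,
                   POLICY_PRIORITY_WAIT, POLICY_OPT_SACRIFICE } ValidationPolicy;

    typedef enum { ACTION_COMMIT, ACTION_ABORT_SELF, ACTION_WAIT } Action;

    /* Decide the fate of the validating transaction; hypothetical helper. */
    static Action validate(ValidationPolicy policy, int my_priority,
                           const int *conflict_priority, size_t n_conflicts)
    {
        bool any_higher = false, all_higher = true;
        for (size_t i = 0; i < n_conflicts; i++) {
            if (conflict_priority[i] > my_priority) any_higher = true;
            else                                    all_higher = false;
        }
        if (n_conflicts == 0) return ACTION_COMMIT;

        switch (policy) {
        case POLICY_COMMIT:         return ACTION_COMMIT;      /* always commit */
        case POLICY_PRIORITY_ABORT: return all_higher ? ACTION_ABORT_SELF
                                                      : ACTION_COMMIT;
        case POLICY_PRIORITY_WAIT:  return any_higher ? ACTION_WAIT
                                                      : ACTION_COMMIT;
        case POLICY_OPT_SACRIFICE:  return any_higher ? ACTION_ABORT_SELF
                                                      : ACTION_COMMIT;
        }
        return ACTION_COMMIT;
    }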

Performance comparisons between locking and OCC have been made for conventional database systems [6, 5, 10], and they show the superiority of locking over OCC. The superiority of locking for conventional database systems comes from its early stage blocking validation policy, which does not waste valuable system resources. However, locking with only a blocking validation policy, 2PL-Wait, shows worse performance for RTDBS than 2PL-HP, which adds the priority based High Priority (HP) conflict resolution method to the locking algorithm.

Locking can be done with in-place update or deferred update, while OCC can only be done with deferred update. In-place update shows some advantages when most transactions commit successfully, because its commit protocol is simple and effective, while deferred update has advantages when many transactions are aborted, because its rollback mechanism is very simple. Agrawal and DeWitt [6] showed that if the buffer space available to the transaction is large enough to hold all the pages updated by the transaction until the transaction is validated, the cost of making local copies global is not significant. In fact, most comparisons [5, 10, 19, 22, 28] have been made without considering the effects of in-place or deferred update. In this dissertation, we discuss concurrency control algorithms without considering the effects of in-place and deferred update.

3.3 Motivation for our Approaches

Several OCC variants have been developed as concurrency control mechanisms for firm RTDBS [19, 20, 22, 28]. According to these papers, 2PL-HP loses some of its advantages over OCC because of the wasted restart and wasted wait problems, even though OCC has wasted executions resulting from the delay in validation.

wasted restart. A wasted restart happens if an HPT aborts an LPT and then the HPT is discarded as it misses its deadline. In other words, a transaction which is later discarded can cause restarts.

wasted wait. A wasted wait happens if an LPT waits for the commit of an HPT and later the HPT is discarded as it misses its deadline. In other words, a transaction which is later discarded can cause a conflicting LPT to wait.

wasted execution. A wasted execution happens when an LPT in its commit time validation phase is restarted due to a conflicting HPT which has not finished yet.

If a lock requesting transaction has a higher priority than the conflicting transactions, 2PL-HP aborts the conflicting LPT immediately (immediate restart). Immediate restart is useful when an HPT has a high possibility of committing successfully (i.e., transactions in soft RTDBS), because it restarts LPTs as early as possible. In firm RTDBS, however, immediate restart might cause wasted restarts, which affects performance adversely. It seems that deferred restart is always preferable for firm real-time main memory database systems that have only one CPU. Thus we assume that we have multiple CPUs or disk resident databases.

Our observation is that an HPT can proceed without aborting conflicting LPTs, if needed, when we use a deferred update policy which updates local copies of data items and makes them global at commit time. We only need to stop the LPT until the completion (commit or abort) of the conflicting HPT. If the HPT is discarded because it misses its deadline, we can execute the stopped LPT by resuming it. This is termed in this dissertation the stop/resume deferred restart policy. Thus we can avoid the wasted restart problem by using the deferred restart policy selectively.

In order to differentiate the cause of transaction blocking we define transaction stopping as follows:

Transaction stopping. Transaction stopping is the blocking of an LPT, and happens when an HPT tries to access a data item which is accessed by the LPT. This is in contrast to the term blocking, which describes the situation when an LPT tries to access a data item accessed by an HPT and waits for the completion of that HPT.









3.3.1 Comparison of Conflict Resolution Policies

In order to see the advantages and disadvantages of the immediate and deferred restart policies we illustrate 4 cases that can arise when an HPT and an LPT conflict with each other, as shown in Figures 3.1, 3.2, 3.3, and 3.4. For each case we evaluate 3 different restart methods, namely OCC style deferred restart (DR-OCC), immediate restart (IR), and stop/resume deferred restart (DR-SR).

DR-OCC. This policy is exactly the one used by OCC. The validation and restart happen at commit time.

IR. This policy is exactly the one used by 2PL-HP. The validation and restart are done when a data conflict occurs.

DR-SR. This policy uses early stage validation but commit time restart. When an HPT conflicts with a lock holding LPT we stop the LPT until the completion time of the HPT. If the HPT completes successfully (commits), then we restart the LPT. Otherwise (i.e., if the HPT aborts), we resume the LPT.


The following summarizes the alternative outcomes and their relationship to the conflict resolution policies described above:

Case 1: Both the LPT and the HPT complete successfully, as shown in Figure 3.1. It is clear that immediate restart is the best for them in order to have the earliest finish time for the LPT.

Case 2: In this case the HPT completes successfully and the LPT misses its deadline. DR-SR looks better in terms of wasting the least amount of system resources (as shown in Figure 3.2), but the LPT's effective service time under the IR approach is the longest among them. This indicates that IR has a much better chance of changing Case 2 into Case 1.

Case 3: This case explains the wasted restart of 2PL-HP most clearly. In Figure 3.3, IR is the worst among them due to the wasted restart. Both deferred restart approaches, DR-OCC and DR-SR, are preferable in this situation. Even though DR-OCC looks the best among them, DR-SR is equally good because during the LPT's stopped period we can execute other transactions.

Case 4: This case (shown in Figure 3.4) happens when both the HPT and the LPT miss their deadlines. If we consider the waste of system resources, DR-SR is the best as it wastes the fewest resources. Saving valuable resources reduces transaction arrival blocking under heavily loaded situations by giving other transactions more chances to execute. This case is likely to happen often in heavily loaded situations.


Figure 3.1. Case 1: Both transactions finished successfully within their deadlines


By analyzing the 4 alternative outcomes between an HPT and an LPT, we notice that applying IR and DR-SR selectively could be better than using a single conflict resolution policy. Based on this strong motivation we propose a new approach, termed Adaptive Concurrency Control (ACC), that integrates DR-SR and IR for firm RTDBS.











Figure 3.2. Case 2: HPT completed successfully, LPT missed

Figure 3.3. Case 3: HPT missed, LPT completed successfully

Figure 3.4. Case 4: Both LPT and HPT missed their deadlines









3.4 Adaptive Concurrency Control (ACC)

The basic idea behind this approach is that we apply the restart policy selectively for different situations. If a conflicting HPT has a high possibility of successful commit, we use the IR policy. Otherwise, we use the DR-SR policy. In the following program fragments we assume that the requesting transaction Tr has a higher priority than the lock holding transaction Th.

    When Tr requests the data held by Th:

      IF (Tr has a high possibility of successful commit)
        Restart Th;   /* Immediate restart */
        Execute Tr;
      ELSE
        Stop Th until Tr finishes its execution;

    When Tr finishes its execution:

      IF (Tr is discarded)
        Resume the stopped Th;
      ELSE
        IF (Tr commits)
          Restart Th;   /* Deferred restart */

In general the destiny of a transaction is not decided beforehand. It depends on the system load, the tightness of slack, the transaction mix, and so on. Although we do not know the destiny of a transaction in advance, we can control it by using a proper grouping mechanism. The HIT/MISS grouping algorithm in AED [20] is a good candidate for our concurrency control algorithm: the possibility of successful commit is very high in the HIT group and very low in the MISS group. By using the properties of the HIT/MISS groups in AED, we can apply the appropriate restart policy. If a lock requesting HPT comes from the HIT group we use IR; otherwise we use DR-SR.
Figure 3.5. Transaction blocks among HIT and MISS group


Another advantage of incorporating the HIT/MISS groups approach is that we can reduce wasted wait as well. We have 3 cases of transaction conflicts that cause transaction blocking (as depicted in Figure 3.5) and we can see how wasted wait can be reduced by using AED policy.


case 1: Both transactions are in the MISS group and will miss their deadlines. Thus wasted wait does not cause a problem here.

cases 2 and 3: The higher priority transaction that causes the transaction wait is in the HIT group, in which the miss percent is very low, and the waiting lower priority transaction is from the MISS group. Thus wasted wait will be negligible here.
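Putting the grouping rule and the two restart policies together, ACC's choice at conflict time reduces to the following small C sketch (the enum names are illustrative; the HIT/MISS grouping itself follows AED [20] and is not shown here):

    typedef enum { GROUP_HIT, GROUP_MISS } Group;
    typedef enum { ACTION_IMMEDIATE_RESTART, ACTION_STOP_RESUME } ConflictAction;

    /* ACC rule at conflict time: the lock requesting HPT's group decides whether
       the lock holding LPT is restarted immediately (IR) or stopped (DR-SR). */
    static ConflictAction acc_resolve(Group requester_group)
    {
        return (requester_group == GROUP_HIT) ? ACTION_IMMEDIATE_RESTART
                                              : ACTION_STOP_RESUME;
    }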


3.4.1 Procedures of ACC

The algorithm for HIT/MISS grouping is well presented in Haritsa's paper [20]. In this section we focus on combining IR and DR-SR. Basically we follow 2PL-HP by maintaining a shared global lock table, and we add additional transaction lists to incorporate DR-SR. Each lock table entry contains an object identifier (OID), a lock mode, and a list of transaction identifiers (TIDs). We use dynamic priority assignment with a static evaluation policy and maintain state, version-number, deferred-rcnt, deferred-rlist, stopped-cnt, and stopped-list fields for each transaction; their meanings and purposes are as follows:


  Ti.state            State of transaction Ti (READY, STOPPED, BLOCKED).
  Ti.version-number   Version number of transaction Ti. The initial value is zero. Whenever Ti is restarted its value increases by 1. The version number helps us to check whether a transaction has been restarted while it is stopped by other transactions, because the TID of a transaction never changes when it is restarted.
  Ti.deferred-rcnt    The number of lower priority transactions stopped by the transaction Ti.
  Ti.deferred-rlist   The list of OIDs and the stopped transactions' TIDs, version-numbers, and lock modes of objects, recorded when the transaction Ti stopped transactions.
  Ti.stopped-cnt      The number of higher priority transactions that stopped Ti.
  Ti.stopped-list     The list of OIDs and identifiers of the transactions that stopped Ti.
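For concreteness, the per-transaction bookkeeping and a lock table entry could be laid out as in the following C sketch. The list-node layouts and the fixed bound on lock owners are assumptions made for the sketch, not the simulator's actual structures.

    typedef enum { READY, STOPPED, BLOCKED } TxnState;
    typedef enum { LOCK_NONE, LOCK_X } LockMode;   /* exclusive-lock-only system */

    /* One entry of deferred-rlist: lock-table state saved when Ti stopped Tj. */
    typedef struct DeferredEntry {
        int       oid;             /* object whose lock was taken over            */
        int       stopped_tid;     /* TID of the stopped transaction              */
        int       stopped_version; /* its version-number at the time of the stop  */
        LockMode  prev_mode;       /* lock mode it held before being stopped      */
        struct DeferredEntry *next;
    } DeferredEntry;

    /* One entry of stopped-list: who stopped me, and on which object. */
    typedef struct StoppedEntry {
        int oid;
        int stopper_tid;
        struct StoppedEntry *next;
    } StoppedEntry;

    typedef struct {
        int            tid;
        TxnState       state;
        int            version_number; /* incremented on every restart           */
        int            deferred_rcnt;  /* lower priority txns stopped by me      */
        DeferredEntry *deferred_rlist;
        int            stopped_cnt;    /* higher priority txns that stopped me   */
        StoppedEntry  *stopped_list;
    } AccTxn;

    /* A shared global lock-table entry: OID, lock mode, and the lock owners. */
    typedef struct {
        int      oid;
        LockMode mode;
        int      owner_tids[8];  /* illustrative fixed bound on owners           */
        int      n_owners;
    } LockEntry;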


Using the global lock table and the data structures described above, we have developed and implemented the key procedures of our approach for simulation: Lock-request, Discard, Commit, and Restart. Lock-request is called when a transaction is trying to access a data item, and Discard is called when we remove a tardy transaction from the system. Commit is called when a transaction finishes successfully, and Restart is invoked when a transaction is restarted by a conflicting HPT. In the following procedures Tr, Th, and Ta represent the lock requesting transaction, the lock holding transaction, and the transaction being aborted, respectively.

Lock request

We maintain a shared global lock table and each entry has an OID, a lock mode, and a list of lock owners. For simplicity we assume an exclusive-lock-only system in this dissertation. When transaction Tr requests an X lock on a data item and no one has a lock on that data item, Tr gets the lock. If the requesting mode is incompatible and the requesting transaction Tr has a higher priority than the lock holding transaction Th, we stop the lock holding transaction Th, save the lock table entry into Tr, and update the lock table entry for the object with the new lock mode and new lock owner.


    Procedure Lock-request(Tr, Obj, lockmode)
    BEGIN
      IF (Obj is locked with a conflicting lock mode)
      THEN
        IF (Pr(Tr) is greater than Pr(Th))
        THEN
          IF (Tr came from the HIT group)
          THEN
            Restart(Th);
            Tr gets the lock;
          ELSE
            Stop Th;
            Add STOPPED flag to Th;
            Put oid, previous lock mode, id of Th and its version-number into Tr's deferred-rlist;
            Increment Tr's deferred-rcnt by 1;
            Add oid and id of Tr to the stopped-list of Th;
            Increment Th's stopped-cnt by 1;
            Tr gets the lock;
        ELSE
          Block Tr;   /* lock wait */
      ELSE
        Tr gets the lock;
    END









Commit

Deferred updates and deferred restarts are done here by using the deferred-rlist of the committing transaction. When we do a deferred restart we check the version number of a transaction to make sure that the transactions in the deferred-rlist have not been restarted by another transaction, because a transaction keeps the same transaction identifier (TID) after it is restarted. If the stopped transaction has been restarted by another transaction, then its version number must have been increased as well.

    Procedure Commit(Ta)
    BEGIN
      Make local copies global;
      IF (Ta.deferred-rcnt is greater than zero)
        FOREACH (Ti in Ta.deferred-rlist)
          IF (the current version-number and the version-number in Ta.deferred-rlist are the same)
          THEN
            Restart(Ti);   /* deferred restart here */
      Release-lock(Ta);
    END

Restart and Discard

When we discard or restart a transaction that stopped other transactions, we have to update the stop relationships and restore the lock owner and lock mode of the objects accessed by the stopped transactions in the global lock table properly.

In Figure 3.6 we illustrate what can happen when we restart or discard a transaction that stopped other transactions. Directed edges represent the stop relationship among transactions, a rectangular node represents a stopped transaction, and a circular node represents a transaction that is not stopped. The OID and









Figure 3.6. Primitive cases of transaction stop


the corresponding lock owner are described in parentheses. Restarting or removing a transaction which has never stopped any transaction is quite straightforward. When, however, we restart or discard a transaction T2 which has stopped other transactions (T1 in Figure 3.6), we have to restore the lock table entry by using the information saved in T2. In Figure 3.6, the object O1 was owned by T1 before T2 stopped T1 and got the lock on O1. When T2 is restarted or discarded, the previous state needs to be restored. From now on we assume that this restoration is done by the procedure restore-lock-entry.

Figure 3.7. Abort of stopped transaction which also stopped others









When we restart or discard a stopped transaction which has itself stopped other transactions, we have to check the relationships more carefully; these cases are illustrated in Figure 3.7. If their relationship is made on different data, then when T2 is aborted we can disconnect the relationship in the manner shown in Figure 3.7 (a). If their relationship is made on the same data, then we have to update the deferred-rlist of T3 and the stopped-list of T1 and transfer the saved lock table entry of the object O1 from T2 to T3, as shown in Figure 3.7 (b). The lock table entry transferred to T3 will be used when we restore the lock table entry after T3 is removed on account of missing its deadline. We assume that updating the stopped-list and deferred-rlist and transferring the saved lock table entry are done by the procedure update-stop-relationship.


Figure 3.8. More complex cases of transaction stop


A transaction can be stopped by more than one HPT and even a transaction can be stopped more than once by the same transaction on different data. Figure 3.8 shows these cases.


    Procedure Restart(Ta)
    BEGIN
      Rollback(Ta);
      Increment Ta.version-number by 1;
      Ta.state = READY;
      Ta.deferred-rcnt = Ta.stopped-cnt = 0;
      Ta.deferred-rlist = Ta.stopped-list = NIL;
      Restart Ta;
    END


    Procedure Discard(Ta)
    BEGIN
      Rollback(Ta);
      Remove Ta from the system;
    END

    Procedure Rollback(Ta)
    BEGIN
      IF (Ta.state is STOPPED)
        IF (Ta.deferred-rcnt is greater than zero)
          FOREACH (Tc which is stopped by Ta, where Ta is stopped by Tb for the same data that stopped Tc)
            update-stop-relationship(Tc, Tb);
      FOREACH (Ti in Ta.deferred-rlist)
        IF (Ti.state is STOPPED and the version-number is the same)
          Decrement Ti.stopped-cnt by 1;
          IF (Ti.stopped-cnt is zero)
            Remove the STOPPED flag from Ti.state;
          restore-lock-entry(Ta);   /* for the Ti which is stopped */
      Adjust the lock counter for Ta;
      Release-lock(Ta);
    END

3.4.2 Correctness of ACC

Theorem 3.4.1 Adaptive Concurrency Control (ACC) is serializable.

Proof: To prove a history H serializable, we have to show that SG(H) is acyclic. Let T1^n and T2^m (here, the subscript is the transaction identifier and the superscript represents the version number) be two committed versions of transactions in a history H produced by Adaptive Concurrency Control. If there is an edge T1^n -> T2^m in SG(H), there exist conflicting operations q and p such that q1^n[x] < p2^m[x].

1. If Pr(T1^n) > Pr(T2^m), then q1^n[x] < ... < C1^n < p2^m[x] < ... < C2^m.

   case 1: If T1^n has never been stopped during its execution, T1^n releases its locks at commit time. Thus T2^m cannot access the data x until the commit of transaction T1^n.

   case 2: If T1^n has been stopped by Th^l during its execution, Th^l must have a higher priority than T1^n. During that period T2^m cannot access the data x because Th^l has the lock on x. After Th^l releases its lock on x by finishing its execution, T1^n gets the lock again by lock transfer from Th^l. Thus T2^m cannot access the data x until the commit of transaction T1^n.

2. If Pr(T1^n) < Pr(T2^m), then q1^n[x] < ... < C1^n < p2^m[x] < ... < C2^m.

   Let us assume that p and q are the first conflicting operations between T1^n and T2^m. If p2^m[x] appeared before C1^n, T1^n could not commit, because T2^m has a higher priority than T1^n and T2^m is a committed transaction.

Suppose there is a cycle T1^n -> T2^m -> ... -> Tk^nk -> T1^n in SG(H).

Case 1: When Pr(T1^n) < Pr(Tk^nk),

   T1^n -> ... -> Tk^nk implies C1^n < Ck^nk, and Tk^nk -> T1^n implies Ck^nk < C1^n.

   Contradiction. Thus this cannot happen.

Case 2: When Pr(T1^n) > Pr(Tk^nk),

   T1^n -> ... -> Tk^nk implies C1^n < Ck^nk, and Tk^nk -> T1^n implies Ck^nk < C1^n.

   Contradiction. Thus this cannot happen either.

Therefore no cycle can exist in SG(H), and thus our approach only produces serializable histories.


3.5 Alternative Version Concurrency Control (AVCC)

One of several ways to use the DR-SR and IR policies together is to have both versions at the same time by starting an alternative version. If a lock requesting HPT Tr conflicts with a lock holding LPT Th, we stop Th and initiate an additional, immediately restarted version Ti as an alternative of Th, as in Figure 3.9. Thus we have a stopped version and a restarted version of the LPT in the system at the same time. Even though the stopped version Th takes some space, it does not consume any processor time until it resumes its execution. Meanwhile Ti can proceed up to the data point that caused the stop and continue from there after Tr commits, instead of starting from the beginning when Tr commits. If Tr misses its deadline, Ti is removed from the system and Th resumes its work from the stopped position. This approach can be viewed as a method to implement partial rollback without using a save point mechanism: Ti is a partially rolled back version of Th when Tr commits.

By maintaining the stopped version and initiating the restarted version, AVCC can reduce both wasted restart and wasted execution. In AVCC the execution path of Th and Ti is exactly the same while Tr is still in the system, because Tr has locks on the shared data and the value of the input data does not change within the deadline of Th. The DR version Th and the IR version Ti have a parent/child relationship, so Ti can inherit the deadline and priority of Th and can freely access the data accessed by Th.












Figure 3.9. Deferred restart and immediate restart versions (AVCC)

During its execution, an IR version of a transaction might itself be stopped, changed to a DR-SR version, and used to initiate a new IR version. Thus in AVCC each transaction might have multiple DR versions and a single IR version, which is the leaf of the family tree. Only the IR version of a transaction is allowed to run, while the other DR-SR versions are waiting for resumption. Figure 3.10 illustrates a general view of transaction execution with 3 transactions, Ti, Tj, and Tk. For each transaction a rectangle represents a version, and the dark area in each version shows how far it has executed from the beginning of the transaction. For simplicity we assume that the left one is the parent of the right one and that a parent has always progressed more than its child.

3.5.1 Algorithms

Let us look at several cases that could happen with AVCC. In Figure 3.11 an HPT Tr conflicts with the stopped version Th only, more than once. This case does not create any problem. If Tr commits we take the restarted version Ti and remove the stopped version Th from the system. When Tr misses its deadline we remove Tr and Ti from the system and resume Th from the stopped position.

The case in Figure 3.12 can happen when transaction Tr stopped Th and initiated Ti, and after that another HPT conflicted only with Th. If both HPTs abort, we resume















Figure 3.10. Structure of AVCC execution













Figure 3.11. Case 2: Conflicts with the same transaction more than once









Th from the stopped position. Otherwise (i.e., if at least one of them commits) we use Ti and remove Th from the system.

Figure 3.12. Case 3: Conflicts with different transactions


Our algorithm maintains a global shared lock table and each version of a transaction follows two-phase locking. Each lock table entry contains an object identifier (OID), a lock mode, the number of lock waiters, the number of lock granters, a list of lock waiters, and a list of lock granters.

In addition, we maintain state, stop-cnt, stop-list, av-stopped-cnt, av-stopped-list, parent, and child fields for each transaction; their meanings and purposes are as follows:


  Ti.state            State of transaction Ti (READY, REPLACED, BLOCKED).
  Ti.stop-list        The list of object identifiers and stopped transactions' TIDs recorded when the transaction Ti stopped a transaction. This list is used to implement deferred restarts when a transaction commits.
  Ti.av-stopped-list  The list of object identifiers and TIDs of the transactions that stopped Ti.
  Ti.stop-cnt         The number of lower priority transactions stopped by the transaction Ti.
  Ti.av-stopped-cnt   The number of higher priority transactions that stopped Ti.
  Ti.parent           A pointer to Ti's parent. This field is used when we check the transaction relationship.
  Ti.child            A pointer to Ti's child.
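A minimal C rendering of the AVCC per-version record is sketched below; the singly linked stop lists and the pointer-based parent/child version chain are assumptions of the sketch, not the simulator's actual layout.

    typedef enum { AV_READY, AV_REPLACED, AV_BLOCKED } AvState;

    /* (oid, tid) pairs recorded when versions stop, or are stopped by, others. */
    typedef struct AvStopEntry {
        int oid;
        int tid;
        struct AvStopEntry *next;
    } AvStopEntry;

    typedef struct AvTxn {
        int           tid;
        AvState       state;
        int           stop_cnt;        /* LPT versions this version stopped    */
        AvStopEntry  *stop_list;       /* drives deferred restarts at commit   */
        int           av_stopped_cnt;  /* HPTs that stopped this version       */
        AvStopEntry  *av_stopped_list;
        struct AvTxn *parent;          /* DR-SR ancestor in the version chain  */
        struct AvTxn *child;           /* IR descendant; only the leaf runs    */
    } AvTxn;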


Lock acquire

When a transaction requests a lock on a data object, the lock compatibility and the parent/child relationship should be checked. If a lock requesting higher priority transaction Tr conflicts with a lower priority transaction Th, we stop Th, let Tr get the lock, and initiate Ti, which is a restarted version of Th. If the lock holding Th is an ancestor of the lock requesting Tr, Tr gets the lock.

    Procedure Lock-acquire(Tr, oid, lockmode)
    BEGIN
      IF (oid is locked with a conflicting granted lock mode)
      THEN
        IF (Th is Tr's ancestor)
          Lock-granted(Tr, oid, lockmode);
        ELSE
          IF (Pr(Tr) is greater than Pr(Th))
          THEN
            IF (Th.state is REPLACED)
              Add oid and the TID of Th to Tr's stop-list;
              Add oid and the TID of Tr to Th's av-stopped-list;
            ELSE
              Stop Th and generate Ti, which is a restarted version of Th;
              Add REPLACED flag to Th;
              Add oid and the TID of Th to Tr's stop-list;
              Add oid and the TID of Tr to Th's av-stopped-list;
          ELSE
            Block(Tr);
      ELSE
        Lock-granted(Tr, oid, lockmode);
    END









Commit

We make local copies global and remove stopped versions here. When the transaction T1 commits, as in Figure 3.13, T1 uses its stop-list to remove the transactions that were stopped by T1 and the ancestors of those stopped transactions. Thus when T1 commits, Th, which is in the stop-list of T1, and Ta, which is the parent of Th, are removed.







Figure 3.13. Commit of AVCC

Figure 3.14 shows a special case, a chain stop. Transaction T2 stopped T3 for object O1 and T1 stopped T2 again for the same object O1. In this case the commit of T1 initiates the removal of T2 and T3.


Figure 3.14. Chain stop in AVCC


    Procedure Commit(T1)
    BEGIN
      Make local copies global;
      IF (T1.stop-cnt is greater than zero)
        FOREACH Ti in T1.stop-list
        BEGIN
          Check chain stop;
          Remove all stopped versions that are in T1.stop-list and their ancestors;
        END
      Remove all locks held by T1;
    END




Discard

This function is called

1. when a transaction is removed for missing its deadline (for example, transaction B in Figure 3.15),

2. when a transaction commits and tries to remove all transactions that are in its stop-list, and

3. when a restarted version of a transaction is removed due to the removal of the transaction that caused the restarted version (for example, Ai in case 1 of Figure 3.15).

Figure 3.15 shows the changes in the transactions' relationships in the system when a transaction is discarded by missing its deadline.

When the transaction Ta is discarded, the lock owner of an object O1 which is locked by Ta is changed to Ti if O1 was used to stop transaction Ti. At the same time, transactions blocked waiting for the lock release of O1 are woken up, retry the lock on O1, and compare their priority with Ti. By doing this, a transaction Tb which has an intermediate priority (Pr(Ta) > Pr(Tb) > Pr(Ti)) will not be blocked by Ti and will not form circular waits.









Figure 3.15. Conflict with different transactions in AVCC

    Procedure Discard(Ta)
    BEGIN
      IF (Ta.stop-cnt is greater than zero)
        FOREACH Ti in Ta.stop-list
          Change the lock owner of the OID which caused the stop of Ti from Ta to Ti,
            and wake up the transactions blocked for that OID;
          Delete Ta from the av-stopped-list of Ti;
          IF (Ti.av-stopped-cnt is zero)
            Tc is the child of Ti;
            IF (Tc.state is REPLACED)
              Remove Tc and adjust the parent/child fields of Tc's parent and child;
            ELSE
              Remove Tc and adjust the child field of Tc's parent;
            Resume Ti;
      Remove all locks held by Ta;
      Remove all data structures for Ta from the system;
    END



3.5.2 Correctness of AVCC

Theorem 3.5.1 AVCC is serializable.


Proof: To prove a history H serializable, we have to show that SG(H) is acyclic. Let T1^n and T2^m (here, the subscript is the transaction identifier and the superscript represents the version number) be two committed versions of transactions in a history H produced by AVCC. If there is an edge T1^n -> T2^m in SG(H), there exist conflicting operations q and p such that q1^n[x] < p2^m[x].

1. If Pr(T1^n) > Pr(T2^m), then q1^n[x] < ... < C1^n < p2^m[x] < ... < C2^m.

   case 1: If T1^n has never been stopped during its execution, T1^n releases its locks at commit time. Thus T2^m cannot access the data x until the commit of transaction T1^n.

   case 2: If T1^n has been stopped by Th^l due to data x during its execution, Th^l must have a higher priority than T1^n. During that period T2^m cannot access the data x because Th^l has the lock on x. After Th^l releases its lock on x by finishing its execution (being removed), T1^n gets the lock again by lock transfer from Th^l. Thus T2^m cannot access the data x until the commit of transaction T1^n.

   case 3: If T1^n has been stopped by Th^l due to data y, which is different from data x, during its execution, Th^l must have a higher priority than T1^n. During that period T2^m cannot access the data x because T1^n, or T1^(n+i) which is a descendant of T1^n, has the lock on x. After T1^(n+i) releases its lock on x by finishing its execution (being removed), T1^n gets the lock again by lock transfer from its descendant T1^(n+i). Thus T2^m cannot access the data x until the commit of transaction T1^n.

2. If Pr(T1^n) < Pr(T2^m), then q1^n[x] < ... < C1^n < p2^m[x] < ... < C2^m.

   Let us assume that p and q are the first conflicting operations between T1^n and T2^m. If p2^m[x] appeared before C1^n, T1^n could not commit, because T2^m has a higher priority than T1^n and T2^m is a committed transaction.

Suppose there is a cycle T1^n -> T2^m -> ... -> Tk^nk -> T1^n in SG(H).

Case 1: When Pr(T1^n) < Pr(Tk^nk),

   T1^n -> ... -> Tk^nk implies C1^n < Ck^nk, and Tk^nk -> T1^n implies Ck^nk < C1^n.

   Contradiction. Thus this cannot happen.

Case 2: When Pr(T1^n) > Pr(Tk^nk),

   T1^n -> ... -> Tk^nk implies C1^n < Ck^nk, and Tk^nk -> T1^n implies Ck^nk < C1^n.

   Contradiction. Thus this cannot happen either.

Therefore no cycle can exist in SG(H), and thus our approach only produces serializable histories.

3.6 Performance Evaluation

In order to compare the performance of ACC and AVCC, simulations of a real-time transaction scheduler were implemented (using the C language and the SIMPACK simulation package [16]). In our simulations we assume multiple CPUs served from a common queue with a priority Preemptive-Resume service discipline, and a multiple-disk environment in which each disk has its own queue with a priority Head-Of-Line (non-preemptive) service discipline.































Figure 3.16. Open Network Model with Multiple CPU and disks










Table 3.1. Simulation Parameters for ACC and AVCC

  Parameter     Value
  db-size       1000
  max-size      24
  min-size      8
  i/o-time      20 ms
  cpu-time      10 ms
  disk-prob     0.5
  min-slack     100 (%)
  max-slack     650 (%)
  no-of-cpu     8
  no-of-disk    16









The parameters used in the simulations are shown in Table 3.1. Transactions enter the system according to a Poisson process with arrival rate λ (i.e., exponentially distributed inter-arrival times with mean value 1/λ), and they are ready to execute when they enter the system (i.e., release time equals arrival time). The number of objects updated by a transaction is chosen uniformly from the range min-size to max-size, and the actual database items are chosen uniformly from the db-size objects in the database.

After accessing an object a transaction spends cpu-time in order to do some work with or on that object and then it accesses the next object. The assignment of a deadline is controlled by the resource time of a transaction and two parameters min-slack and max-slack which set, respectively, a lower and upper bound of percentage of slack time relative to the resource time. A deadline is calculated by adding resource time and slack time. Slack time is calculated by multiplying slack percent and resource time. Slack percent is chosen uniformly from the range of min-slack to max-slack.


Deadline = arrival time + resource time × (1 + slack percent / 100)

Disk accesses are controlled by disk-prob when a transaction reads an object. The use of disk-prob to some extent models data maintained in the buffer. At commit time, objects that have been updated are flushed. The restarted transaction will access the same data objects, and the numbers of CPUs and disks are controlled by no-of-cpu and no-of-disk, respectively.

In our performance evaluation, we measure the transaction miss percent commonly used in the literature for firm RTDBS:

Miss Percent = (Total number of transactions that missed the deadline / Total number of transactions that entered the system) × 100

We ran 10,000 transactions for each simulation, and 95% confidence intervals were obtained whose half-widths are less than 2.5%. During our simulations the first 1,000 transactions were not counted in the simulation results, in order to avoid the warm up problem (initial transient problem) and to get the proper HIT/MISS bucket size for ACC.

3.6.1 Performance of ACC

In this experiment, we compared EDF-HP, AED-HP, and EDF-ACC to evaluate the merit of the ACC conflict resolution policy proposed in this dissertation.

EDF-HP. Priorities of transactions are assigned based on EDF and its conflict resolution policy is HP. HIT/MISS grouping is not used in this approach.

AED-HP. This approach is proposed in [20]. Priorities of transactions are assigned based on AED, which uses EDF and Random Priority (RP) for the HIT group and the MISS group, respectively. The conflict resolution policy is HP for both groups.

EDF-ACC. Priorities of transactions are assigned based on EDF for both the HIT and MISS groups, but the priorities of transactions in the HIT group are higher than those of the MISS group. The conflict resolution policy is HP for the HIT group and DR-SR for the MISS group.

In our comparison we used the same HIT/MISS grouping algorithm [20] for AED-HP and EDF-ACC, and Figure 3.17 shows the simulation result. As we expected, AED-HP and EDF-ACC perform similarly to EDF-HP when the system is lightly loaded, because most transactions are assigned to the HIT group, in which the EDF priority assignment policy and HP conflict resolution are used by all 3 approaches.

In the heavily loaded situation both AED-HP and EDF-ACC perform better than EDF-HP. The performance improvement is achieved by reducing wasted restart in both approaches. AED-HP tries to avoid wasted restart with RP priority assignment by trying not to assign higher priorities to the transactions that have earlier deadlines, assuming those transactions do not have much chance of finishing within their deadlines under heavily loaded situations. Due to the randomness of RP priority assignment, AED-HP may still assign higher priorities to transactions that are going to miss. EDF-ACC, on the other hand, uses EDF for both the HIT group and the MISS group, and the DR-SR conflict resolution policy for the MISS group. If the HIT/MISS group assignment is 100% correct, EDF-ACC can remove all possibilities of wasted restart.

In Figure 3.17, EDF-ACC shows better performance than AED-HP. One reason for this is that ACC reduces wasted restarts in a systematic way: ACC never misses a chance to remove a wasted restart if the HIT/MISS group assignment is 100% correct. AED-HP, in contrast, uses a random function to assign priorities to transactions in the MISS group, so it may miss some of its chances to reduce wasted restart even if the group assignment is correct.

Figure 3.17. EDF-HP, AED-HP, and EDF-ACC

In the previous experiment we have seen the merit of ACC as a good conflict resolution policy in EDF-ACC, and of RP as a priority assignment in AED-HP. In the following experiment, we applied ACC to AED (AED-ACC) to evaluate the performance of a different combination.









AED-ACC. Priorities of transactions are assigned based on AED, which uses EDF for the HIT group and RP for the MISS group. The conflict resolution policy is HP for the HIT group and DR-SR for the MISS group.

Using RP priority assignment and the DR-SR conflict resolution policy together for the MISS group did not show much improvement, as Figure 3.18 shows. The reason is that the RP priority assignment policy by itself already reduces wasted restarts by ignoring transaction deadlines, which in turn completely sacrifices the importance of deadlines in heavily loaded situations. This leaves few chances for ACC to reduce wasted restarts in AED-ACC, which seems to be why AED-ACC shows poorer performance than EDF-ACC in Figure 3.18.


Figure 3.18. AED-HP, EDF-ACC, and AED-ACC


3.6.2 Performance of AVCC

In this experiment we varied the transaction arrival rate from 10 to 110 trs/second, and transaction priorities are assigned using the EDF policy for both EDF-HP and AVCC. As expected, AVCC shows better performance than EDF-HP over a wide range of system loads in Figure 3.19. AVCC is much better than EDF-HP except in heavily loaded situations, in which there are few chances to reduce wasted restarts because most transactions miss their deadlines, as in Figure 3.4.

Figure 3.19. Comparisons of EDF-HP, ACC, AVCC (base parameters)


AVCC is better than ACC under normal load but worse under heavy load. This phenomenon is easily explained by Figures 3.1, 3.2, 3.3, and 3.4. Under normal load most conflicts fall into Cases 1, 2, and 3, in which maintaining both the IR and DR-SR versions greatly reduces wasted restarts and wasted execution. Under heavy load, however, initiating the IR version increases the competition for active resources by wasting system resources, as in Case 4 of Figure 3.4.

From these simulation results we can conclude that:

1. The DR-SR conflict resolution policy shows its superiority over IR in overload. Initiating the IR version makes the situation worse in overload.

2. In most situations, maintaining the IR and DR-SR versions of a transaction together shows its superiority over IR alone.









3.7 Conclusions

EDF-HP and AED-HP use blocking and immediate restarts to resolve data conflicts, while OCC variants use deferred restarts. Both approaches have advantages and disadvantages for firm RTDBS. In our study, we have tried to synthesize the advantages of both approaches by applying immediate restart and stop-resume deferred restart (DR-SR) policies together. By combining immediate and deferred restart policies, our EDF-ACC misses fewer transactions than AED-HP and EDF-HP, which use the immediate restart policy only. Our simulations indicate that:


1. AED-HP performs better than EDF-HP in overload. Our simulation results conform to the previous simulation results [20].

2. EDF-ACC is better than AED-HP in overload. Both approaches use the same HIT/MISS grouping mechanism, but their priority assignment and conflict resolution on the MISS group are different. From the simulation results we can conclude that EDF-ACC reduces wasted restarts more effectively than AED-HP.

3. AED-ACC is slightly better than AED-HP.

4. EDF-ACC is better than AED-ACC. We can conclude that ACC is better combined with the EDF priority assignment policy.

5. AVCC misses fewer transactions than EDF-HP for all ranges of system load.

6. AVCC misses fewer transactions than ACC except in heavily loaded situations.

Our concurrency control algorithm ACC, introduced in this dissertation, uses a common shared lock table and a few lists per transaction to keep the database consistent. The overhead associated with implementing the DR-SR mechanism does not affect transactions in the HIT group at all, and maintaining a few lists for each transaction in the MISS group is not significant. A disadvantage of ACC is that its performance is sensitive to the accuracy of the HIT/MISS prediction, whereas AVCC does not use any grouping mechanism to combine the IR and DR-SR restart policies.

During our performance comparisons we did not consider implementation overhead, because we believe the overhead of the ACC algorithm is not significant enough to affect the simulation results. In terms of space, AVCC requires more data space than EDF-HP and ACC because it maintains both the IR and DR-SR versions of a transaction. The space problem may be minor if we consider that memory prices are dropping sharply and that the primary goal of real-time systems, timely response, usually requires some redundant resources.














CHAPTER 4
ACTIVE REAL-TIME: TRANSACTION SCHEDULING

There are many applications, such as cooperative distributed navigation systems and intelligent network services, where ARTDBS technology is extremely useful [33, 34], and there have been several proposals to combine real-time databases and active databases [11, 33, 34].

4.1 Priority Assignment

The subtask deadline assignment (SDA) problem has been studied in distributed real-time systems, where a given task has to be executed and completed by a specified deadline. The task executes several subtasks, each at possibly different system components, and when each subtask is submitted to its component a local deadline must be assigned to it [24]. Likewise, in an ARTDBS a transaction triggers subtransactions dynamically [9], and a triggered transaction, which is part of the triggering transaction, needs to finish along with the triggering transaction while obeying the specified coupling mode. Consider an active transaction T = [T1, T2, T3] under the EDF priority assignment policy. The ultimate deadline of transaction T fails to represent the tightness of the individual subtransactions. For example, if subtransaction T1 is scheduled with the deadline of T, the scheduler will treat the time that should be reserved for the other subtransactions as slack for T1. Thus subtransaction T1 will run at a lower priority because of its excessive slack. As a remedy, an earlier intermediate deadline (virtual deadline) is assigned so as to reserve enough time for the subtransactions that follow [34]. Based on this earlier-intermediate-deadline idea, the PD, DIV, and SL priority assignment policies for main-memory resident databases and their relative performance for a triggering transaction and a triggered transaction have been studied [34]. In the simulations [34] on memory-resident databases, incoming transactions are classified into triggering and non-triggering classes. In those simulations PD shows the best performance in terms of total miss percent, but DIV and SL show a reduced miss percent for the triggering class at the cost of an increased miss percent for the non-triggering class. Thus DIV and SL are better than PD only if the triggering class is more valuable (more critical) than the non-triggering class. The criticalness of a transaction indicates the level of importance attached to that transaction relative to other transactions. Depending on the functionality of a transaction, meeting the deadline of one transaction may be considered more critical than meeting that of another. If transactions have the same criticalness, PD is the best among them for main-memory databases.

In this dissertation we look at the effects of subtransaction priority assignment on disk resident databases with active transactions. We believe that triggering itself is unpredictable unless the corresponding events are periodic. We assume that we know the data access patterns of all transactions rather than statically estimated execution times, that we do not know when events will be triggered, and that all transactions have the same criticalness. Under these assumptions we examine how the subtransaction deadline assignment policy affects disk resident databases with active transactions.

4.1.1 Multiple Priorities

The priority assignment policy DIV [34] assigns earlier deadlines (higher priorities) to subtransactions by properly considering their urgency. We believe, however, that DIV causes unwanted phenomena such as deadlock, priority reversal, and a reverse direction of High Priority (HP) by assigning different priorities to a triggering transaction and its triggered transactions.

In Figures 4.1, 4.2, and 4.3 we illustrate these three problems. In these figures T1 and T4 are triggering transactions, T2 and T3 are subtransactions of T1, and T5 and T6 are subtransactions of T4. The relative priority order of these transactions is Pr(T2) > Pr(T3) > Pr(T5) > Pr(T6) > Pr(T1) > Pr(T4).

Deadlock. In Figure 4.1, T1 is waiting for the completion of T5 while T5 is waiting for the completion of T2. Subtransaction T2, which is ready to commit, is waiting for the completion of its parent transaction T1. Thus there exists a circular wait among T1, T5, and T2 that causes a deadlock.


Figure 4.1. Deadlock due to multiple priorities

Priority reversal. In Figure 4.2, T1 requests data already accessed by T5 and is blocked waiting for the completion of T5. Since T5 is a subtransaction of T4, T1 effectively waits for the completion of T4, which has a lower priority than T1.






Figure 4.2. Priority Reversal: Blocking of active transactions


Reverse direction of high priority. In Figure 4.3, T5 requests data already accessed by T1 and causes the restart of T1. This implies that the lower priority transaction T4 causes the restart of the higher priority transaction T1, because T5 is a part of T4.









Figure 4.3. Restarts of active transactions


It seems difficult to solve those problems with a single priority value per transaction if we assign different priorities to triggering and triggered transactions. Thus we suggest a double (two-level) priority scheme that assigns two priorities to each transaction: one (r_priority) for active resource contention and the other (d_priority) for data conflict resolution.

In our scheme we assign the same value to r_priority and d_priority for each triggering transaction. For each subtransaction, we assign a value to r_priority by considering the urgency of the subtransaction and assign the parent transaction's d_priority to the subtransaction's d_priority. Thus, within each nested transaction [31] only a single priority value is used to resolve data conflicts. With this double priority scheme we can easily avoid deadlock, priority reversal, and the reverse direction of High Priority, while still giving subtransactions timely service of active resources.
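To make the double priority scheme concrete, the following C sketch (illustrative only; the type and function names are ours, not the simulator's) shows a transaction record carrying both priorities and the two places where each is consulted:

#include <stdbool.h>
#include <stddef.h>

typedef struct transaction {
    int  tid;
    long r_priority;              /* used only for active resource contention */
    long d_priority;              /* used only for data conflict resolution   */
    struct transaction *trigger;  /* triggering (parent) transaction, or NULL */
} transaction;

/* Top-level transaction: both priorities are derived from its deadline. */
static void init_top_level(transaction *t, long deadline_priority)
{
    t->r_priority = deadline_priority;
    t->d_priority = deadline_priority;
    t->trigger    = NULL;
}

/* Subtransaction: r_priority reflects its own urgency (its virtual deadline),
   while d_priority is inherited from the parent so the whole nested
   transaction resolves data conflicts with a single priority value. */
static void init_subtransaction(transaction *sub, transaction *parent,
                                long virtual_deadline_priority)
{
    sub->r_priority = virtual_deadline_priority;
    sub->d_priority = parent->d_priority;
    sub->trigger    = parent;
}

/* The CPU/disk scheduler compares r_priority ... */
static bool wins_resource(const transaction *a, const transaction *b)
{
    return a->r_priority > b->r_priority;
}

/* ... while the lock manager compares d_priority. */
static bool wins_data_conflict(const transaction *a, const transaction *b)
{
    return a->d_priority > b->d_priority;
}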

As we mentioned earlier, transaction response time varies substantially with changes in system load. By computing the ratio (Load Factor) of response time to the corresponding resource time for each completed transaction, we can predict the current system load. Resource time can be derived from the transaction programs, assuming that the processing time for each accessed data item does not change significantly.










Resource Time = (No. of data accesses x cpu-time) + (No. of disk accesses x disk-time)

Response Time = completion time - arrival time

Load Factor (LF) = Response Time / Resource Time

Average Load Factor (ALF) = ( sum of LF(Ti) over the N most recently completed transactions Ti ) / N


After obtaining the ALF, we estimate the response time of a transaction in the system from its estimated resource time and the ALF.
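The following C sketch illustrates these calculations (the names and the window of completed transactions used for ALF are our assumptions for illustration; cpu-time and disk-time follow Table 4.1):

#include <stddef.h>

/* Per-transaction bookkeeping needed to compute a load factor. */
typedef struct {
    int    data_accesses;      /* number of data items accessed        */
    int    disk_accesses;      /* number of accesses that went to disk */
    double arrival_time;
    double completion_time;
} txn_stats;

static const double CPU_TIME  = 0.010;   /* 10 ms per data access */
static const double DISK_TIME = 0.020;   /* 20 ms per disk access */

static double resource_time(const txn_stats *t)
{
    return t->data_accesses * CPU_TIME + t->disk_accesses * DISK_TIME;
}

static double response_time(const txn_stats *t)
{
    return t->completion_time - t->arrival_time;
}

static double load_factor(const txn_stats *t)
{
    return response_time(t) / resource_time(t);
}

/* Average load factor over the n most recently completed transactions. */
static double average_load_factor(const txn_stats *done, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += load_factor(&done[i]);
    return (n > 0) ? sum / (double)n : 1.0;  /* unloaded system: LF = 1 */
}

/* Estimated response time: the time a transaction must spend in the
   system to obtain RRT units of remaining resource (service) time. */
static double estimated_response_time(double alf, double remaining_resource_time)
{
    return alf * remaining_resource_time;
}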

We will explain how we calculate priorities for subtransactions by using Figure 4.4.

[Timeline of Figure 4.4: T arrives at t1; dt1, dt2, it1, dt3, and it2 are triggered at t2 through t6; the deferred subtransactions dt1, dt2, and dt3 start at t7.]

Figure 4.4. Life of complex active transaction T


r_prt_t(T)    r_priority of T at time t
d_prt_t(T)    d_priority of T at time t
ERT_t(T)      Estimated response time of T at time t
RRT_t(T)      Remaining resource time of T at time t
ALF_t         Average Load Factor at time t
Slack_t(T)    Slack time of T at time t
Nd_t(T)       Number of deferred subtransactions triggered by T until time t
Ni_t(T)       Number of immediate subtransactions triggered by T until time t


Priority assignment for immediate subtransaction


When immediate subtransactions are triggered at times t4 and t6, we assign their priorities. First, we derive an estimate of the subtransaction's response time from the Average Load Factor (ALF) and its remaining resource time (RRT):

ERT_t4(it1) = ALF_t4 x RRT_t4(it1)

This means a transaction has to spend ERT amount of time in the system to receive RRT amount of service time.









We use this modified DIV for our r_priority assignment because the DIV approach is simple and reasonable under our assumptions. By equally dividing the parent's effective slack among all the immediate and deferred subtransactions triggered up to that point, we assign the priorities of subtransactions as follows:

r_prt_t4(it1) = t4 + ERT_t4(it1) + [Slack_t4(T) - (ERT_t4(it1) + ERT_t4(dt1) + ERT_t4(dt2))] / [Nd_t4(T) + Ni_t4(T)]

d_prt_t4(it1) = d_prt_t4(T)



The r_priority of a subtransaction is used only for resource contention; for data conflict resolution a subtransaction inherits its top-level transaction's d_priority.

Priority assignment for deferred subtransaction

By equally dividing the parent's effective slack among all the deferred subtransactions triggered up to that point, and assuming parallel execution of the deferred subtransactions, we assign the priorities of deferred subtransactions as follows using the DIV policy [34]:


r_prt_t7(dt1) = t7 + ERT_t7(dt1) + [Slack_t7(T) - (ERT_t7(dt1) + ERT_t7(dt2) + ERT_t7(dt3))] / Nd_t7(T)

d_prt_t7(dt1) = d_prt_t7(T)
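The following C sketch summarizes how r_priority and d_priority would be assigned to a newly triggered subtransaction under the formulas above. It is an illustration only: priorities are represented as virtual deadlines (a smaller value is more urgent), the divisor shown is the one used for immediate subtransactions (Nd + Ni), and for the deferred subtransactions started at the end of the parent only the number of deferred subtransactions would be used as the divisor.

#include <stddef.h>

/* Assumed state of the triggering (parent) transaction at triggering time. */
typedef struct {
    double d_priority;     /* deadline-based priority, fixed at arrival     */
    double slack;          /* Slack_t(T): remaining slack of the parent     */
    int    n_deferred;     /* Nd_t(T): deferred subtransactions so far      */
    int    n_immediate;    /* Ni_t(T): immediate subtransactions so far     */
} parent_state;

typedef struct {
    double r_priority;     /* virtual deadline used for resource contention */
    double d_priority;     /* inherited, used for data conflict resolution  */
} sub_priorities;

/* Modified DIV assignment for a subtransaction triggered at time `now`.
 * `ert_self` is the subtransaction's estimated response time (ALF x RRT);
 * `ert_pending_sum` is the sum of the ERTs of all subtransactions triggered
 * by the parent so far, including this one. */
static sub_priorities assign_sub_priorities(const parent_state *p, double now,
                                            double ert_self,
                                            double ert_pending_sum)
{
    sub_priorities s;
    int n_subs = p->n_deferred + p->n_immediate;
    double share = (p->slack - ert_pending_sum) /
                   (double)(n_subs > 0 ? n_subs : 1);

    s.r_priority = now + ert_self + share;   /* earlier value = more urgent */
    s.d_priority = p->d_priority;            /* one value per nested txn    */
    return s;
}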




Priority assignment for a top-level transaction

Assigning a proper priority to a triggering transaction is more important because its d_priority will be inherited by its subtransactions to resolve data conflicts. Although increasing the d_priority of a triggering transaction as it triggers subtransactions might help performance, changing the d_priority of a triggering transaction based on subtransaction triggering might cause priority inversions, which in turn could lead to circular aborts. Thus we use a fixed d_priority and r_priority for a triggering transaction. At time t1, transaction T gets its initial d_priority and r_priority based on its deadline, and they do not change during its execution.









Resource scheduling

We have multiple CPUs and a single common queue with a priority-based preemptive scheduling policy. When a transaction arrives, the procedure Arrival_sched is invoked; when a transaction finishes (commits) or releases a CPU (subtransaction commit, disk I/O), Release_sched is invoked. Both procedures use r_priority when they compare transaction priorities.

Procedure Arrival_sched(Ta)
BEGIN
    Put Ta in the ready queue;
    IF (there is an available CPU)
        Assign Ta to one of the available CPUs;
        Adjust the ready queue and CPU pool;
    ELSE
        Pick the CPU Ci that runs the transaction Tb with the lowest r_priority;
        IF (r_priority(Ta) is greater than r_priority(Tb))
            Preempt Tb from Ci;
            Execute Ta on Ci;
END



Procedure Release_sched(Ci, Ta)
BEGIN
    Release transaction Ta from the CPU Ci;
    IF (there is an available transaction in the ready queue)
        Pick the transaction Tb that has the highest r_priority in the ready queue;
        Execute Tb on Ci;
    ELSE
        Return CPU Ci to the CPU pool;
END




4.1.2 Performance Evaluation

In order to compare the performance of the different priority assignment policies, simulations of an active real-time transaction scheduler were implemented (in C, using the SIMPACK simulation package [16]). In our simulations we assume an environment with multiple CPUs and multiple disks: there is a common queue for the CPUs whose service discipline is priority Preemptive-Resume, and each disk has its own queue whose service discipline is priority Head-Of-Line (non-preemptive). Our simulation model follows the three rules of the nested transaction model [17], listed below, and the parameters used in the simulations are shown in Table 4.1.


Commit rule. The commit of a subtransaction makes its results accessible only to the parent transaction. The subtransaction will finally commit only if it has committed itself locally and all its ancestors up to the root have finally committed.

Rollback rule. If a transaction at any level of nesting is rolled back, all its subtransactions are also rolled back, independent of their local commit status.

Visibility rule. All changes done by a subtransaction become visible to the parent transaction upon the subtransaction's commit. All objects held by a parent transaction can be made accessible to its subtransactions. Changes made by a subtransaction are not visible to its siblings, in case they execute concurrently.


In our simulations, transactions enter the system according to a Poisson process with arrival rate λ (i.e., exponentially distributed inter-arrival times with mean 1/λ), and they are ready to execute when they enter the system (i.e., release time equals arrival time).











Table 4.1. Parameters for ARTDBS simulations

Parameter             Value
db-size               1000
i/o-time              20 ms
cpu-time              10 ms
disk-prob             0.5
min-slack             100 (%)
max-slack             650 (%)
no-of-cpu             8
no-of-disk            16
min-size              4
max-size              6
prob-of-triggering    20-70
prob-of-immediate     50 (%)
prob-of-deferred      50 (%)


The number of objects updated by a transaction is chosen uniformly from the range min-size to max-size, and the actual database items are chosen uniformly from the db-size items in the database.

General behaviors of our active transaction model in our simulations are as follows:

Triggering. A triggering transaction triggers a subtransaction after reading a data item (main-memory or disk resident) according to the triggering probability prob-of-triggering, and the coupling mode of the subtransaction (immediate (IMM) or deferred (DEF)) is decided by prob-of-immediate and prob-of-deferred.

1. When a transaction triggers an IMM subtransaction, we suspend the triggering transaction and execute the immediate subtransaction.

2. When a transaction triggers a DEF subtransaction, we just increase the count of deferred subtransactions (deferred-scnt) and keep executing the triggering transaction. DEF subtransactions are actually started at the end of the triggering transaction by checking the value of deferred-scnt. These DEF subtransactions can be executed in parallel, as there are multiple CPUs.









Commit. 1. When an IMM subtransaction finishes its execution, it returns control to its triggering transaction and waits for phase 2 commit.

2. When a DEF subtransaction finishes its execution, it decreases the deferred-scnt of its parent. If the count reaches zero, its parent initiates phase 2 commit.

3. When a triggering transaction commits (phase 2 commit), it makes all its subtransactions perform phase 2 commit and releases the locks held by itself.

Abort. We maintain a list of triggering transactions in deadline order to check the tardiness of transactions easily. When a triggering transaction is aborted because it missed its deadline, all its subtransactions are also aborted, according to the Rollback rule.

Lock conflict (a simplified sketch of this check is given after the list). 1. When a subtransaction tries to access an object held by its parent, it can access the object freely.

2. When a parent tries to access an object held by its committed (phase 1 commit) child, it can access the object freely.

3. When a DEF subtransaction tries to access an object held by its sibling, it can access the object freely if the sibling has already finished its phase 1 commit. Otherwise it is blocked until the phase 1 commit of its sibling.
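The sketch below is a simplified C predicate for the three rules above (our illustration; it considers only the direct parent/child and sibling relationships and a single lock holder per object):

#include <stdbool.h>
#include <stddef.h>

typedef struct txn {
    struct txn *parent;      /* triggering transaction, NULL for top level */
    bool        phase1_done; /* has finished its phase 1 (local) commit    */
} txn;

/* True if `req` may access an object held by `holder` without blocking. */
static bool access_is_free(const txn *req, const txn *holder)
{
    if (holder == NULL)
        return true;          /* object is not locked                              */
    if (req->parent == holder)
        return true;          /* rule 1: child accesses its parent's object        */
    if (holder->parent == req && holder->phase1_done)
        return true;          /* rule 2: parent accesses a committed child's object */
    if (req->parent != NULL && req->parent == holder->parent && holder->phase1_done)
        return true;          /* rule 3: sibling that finished its phase 1 commit  */
    return false;             /* otherwise the requester blocks                    */
}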

In the flat transaction model we simply assume that a transaction is restarted from its beginning. In our active transaction model, rollbacks are done as follows:


* Case 1: When a triggering transaction is restarted, the triggering transaction and all its subtransactions are rolled back to the beginning of the transaction, and the triggering transaction is restarted.

* Case 2: When an immediate subtransaction is restarted:

- If it is ready to commit (i.e., precommitted), all its siblings and its parent are rolled back to the beginning and the triggering transaction is restarted.

- If it is not ready to commit, it is restarted from its own beginning. This does not affect the corresponding triggering transaction and its siblings.

* Case 3: When a deferred subtransaction is restarted, it is restarted from its own beginning without affecting its parent and its siblings.






Figure 4.5. Restartable unit of active real-time transaction

After accessing an object, a transaction spends cpu-time doing some work with or on that object and then accesses the next object. The assignment of a deadline is controlled by the resource time of the transaction and the two parameters min-slack and max-slack, which set, respectively, a lower and an upper bound on the percentage of slack time relative to the resource time. A deadline is calculated by adding the resource time and the slack time. The slack time is calculated by multiplying the slack percent by the resource time, where the slack percent is chosen uniformly from the range min-slack to max-slack. Disk accesses are controlled by disk-prob, and the numbers of CPUs and disks are controlled by no-of-cpu and no-of-disk, respectively. In our performance evaluation we measure the transaction miss percent commonly used in the literature for firm RTDBS. We ran 10,000 active transactions for each simulation, and the first 1,000 transactions were not counted in the simulation results in order to avoid the warm-up (initial transient) problem.
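A minimal C sketch of this deadline assignment, with min-slack and max-slack taken from Table 4.1 (the uniform random helper and the use of the arrival time as the reference point are our assumptions):

#include <stdlib.h>

static const double MIN_SLACK = 1.00;    /* min-slack: 100% of resource time */
static const double MAX_SLACK = 6.50;    /* max-slack: 650% of resource time */

/* Uniform random double in [lo, hi]; a stand-in for the simulator's RNG. */
static double uniform(double lo, double hi)
{
    return lo + (hi - lo) * ((double)rand() / (double)RAND_MAX);
}

/* deadline = arrival time + resource time + slack time,
   where slack time = slack percent x resource time. */
static double assign_deadline(double arrival_time, double resource_time)
{
    double slack_percent = uniform(MIN_SLACK, MAX_SLACK);
    double slack_time    = slack_percent * resource_time;
    return arrival_time + resource_time + slack_time;
}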









4.1.3 Analysis of Results

We compared the performance of PD and the DIV variant proposed here. In our experiments, all incoming transactions have the same criticalness independent of their triggering probability. In the first experiment we turned off concurrency control, so that all data could be accessed freely, in order to observe the effect of intermediate priorities on active resources only. From Figure 4.6 we can conclude that assigning earlier deadlines to subtransactions does not affect the performance of the ARTDBS.


Figure 4.6. Result: Without data contention

In the second stage, we added data contention. As expected, the transaction miss percent of both approaches climbs earlier than in the previous experiment without data contention, due to transaction blocking and restarts, but the relative performance difference between PD and the DIV variant is again negligible.

From Figure 4.6 and Figure 4.7, we can conclude that assigning earlier deadlines to subtransactions does not affect the performance of the ARTDBS if all transactions have the same criticalness. Observe the execution scenarios of two transactions in Figure 4.8, based on main-memory databases.










Figure 4.7. Result: With data contention

Execution Ex-A models PD and Ex-B models DIV priority assignment for the two transactions. As we can see, T has a better chance of finishing its execution within its deadline under execution Ex-A, which models the PD priority assignment policy.


[Figure 4.8 panels: execution Ex-A and execution Ex-B, each showing an interleaving of transactions T1 and T2.]
Figure 4.8. Two execution scenarios

From our simulation results and previous research [34] we can conclude the following:

* Transaction triggering prolongs the execution time of an active transaction beyond what was expected, and the resulting shortened slack makes it difficult for the transaction to finish in time. If we consider a transaction that triggers more subtransactions to be more important (i.e., more critical), then increasing its priority as its slack decreases is a proper approach [34].

* Transaction criticalness is usually decided statically, based on the functionality of the subtransactions a transaction might trigger rather than on the number of subtransactions it triggers. Thus transaction criticalness does not change dynamically with subtransaction triggering. Under this assumption, increasing the priority of a transaction that has triggered many subtransactions merely increases the transaction miss percent by favoring longer transactions. Assigning an earlier deadline to a subtransaction does not improve the performance of the ARTDBS either, when a transaction's criticalness does not change dynamically.

4.2 Concurrency Control

2PL-HP seems a good approach for soft RTDBS, but it suffers from wasted restarts and wasted waits for firm RTDBS. We therefore developed new concurrency control algorithms, ACC and AVCC, and showed that their performance is better than 2PL-HP for firm RTDBS with a flat transaction model. ACC uses a HIT/MISS group assignment mechanism to anticipate the destiny of a transaction, and the group assignment algorithm controls the number of transactions in both groups by assigning incoming transactions to the proper group. However, the dynamic triggering of subtransactions in the active transaction model would require a major modification of the group assignment algorithm. Meanwhile, our concurrency control algorithm AVCC, which is designed for firm RTDBS by exploiting the semantics of firm deadlines, fits easily into the firm real-time active model, and its performance is not affected by the change of transaction execution model. By extending AVCC to the complex transaction model of active databases, we can develop a firm real-time active database concurrency control algorithm.









4.2.1 Extension of AVCC for Active Transaction Model


Our extension of AVCC maintains a global shared lock table, and each version of a transaction follows two-phase locking. Each lock table entry contains an object identifier (OID), a lock mode, the number of lock waiters, the number of lock granters, a list of lock waiters, and a list of lock granters. In addition, we maintain state, stop-cnt, stop-list, av-stopped-cnt, av-stopped-list, parent, child, trigger, and sub-list fields for each transaction; their meanings and purposes are as follows:


Ti.state            State of transaction Ti (READY, REPLACED, BLOCKED)
Ti.stop-list        List of object identifiers and TIDs of the transactions that Ti stopped; used to implement deferred restarts when a transaction commits
Ti.av-stopped-list  List of object identifiers and TIDs of the transactions that stopped Ti
Ti.stop-cnt         Number of lower priority transactions stopped by Ti
Ti.av-stopped-cnt   Number of higher priority transactions that stopped Ti
Ti.parent           Pointer to Ti's parent; used when checking the transaction stop relationship
Ti.child            Pointer to Ti's child created by a transaction stop
Ti.trigger          Pointer to the transaction that triggered Ti
Ti.sub-list         List of subtransactions of Ti
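For illustration, the per-transaction fields above could be grouped into a control block such as the following C sketch (list capacities and names are ours and purely illustrative):

#include <stddef.h>

#define MAX_STOPS 32
#define MAX_SUBS  16

typedef enum { READY, REPLACED, BLOCKED } txn_state;

typedef struct {
    int oid;     /* object involved in the stop    */
    int tid;     /* transaction stopped / stopping */
} stop_entry;

typedef struct txn_cb {
    int        tid;
    txn_state  state;

    /* transactions this one stopped (for deferred restarts at commit) */
    stop_entry stop_list[MAX_STOPS];
    int        stop_cnt;

    /* higher-priority transactions that stopped this one */
    stop_entry av_stopped_list[MAX_STOPS];
    int        av_stopped_cnt;

    struct txn_cb *parent;    /* parent in the stop relationship     */
    struct txn_cb *child;     /* child created by a transaction stop */
    struct txn_cb *trigger;   /* transaction that triggered this one */

    struct txn_cb *sub_list[MAX_SUBS];   /* triggered subtransactions */
    int            sub_cnt;
} txn_cb;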


Key procedures of our extended AVCC are Nested_Lock_acquire, Phase_one_commit, Phase_two_commit, and Nested_Discard.

Lock acquire

When a transaction requests a lock on a data object, both lock compatibility and the parent/child relationship are checked. If an HPT Tr conflicts with an LPT Th that has no parent/child or sibling relationship with Tr, we stop Th, grant the lock to Tr, and initiate Ti, a restarted version of Th. The parent, child, and trigger fields are used to check the parent/child and sibling relationships.









Procedure Nested_Lock_acquire(Tr, oid, lockmode)
BEGIN
    IF (oid is locked by a transaction Th with a conflicting granted lock mode)
    THEN
        IF (Th is Tr's ancestor, or Th is Tr's child, or
            Th is Tr's sibling that has finished its phase 1 commit)
            Lock_granted(Tr, oid, lockmode);
        ELSE
            IF (Pr(Tr) is greater than Pr(Th))
            THEN
                IF (Th.state is REPLACED)
                    Add oid and the TID of Th to Tr's stop-list;
                    Add oid and the TID of Tr to Th's av-stopped-list;
                ELSE
                    Stop Th and generate Ti, the restart version of Th;
                    Mark Th as REPLACED;
                    Add oid and the TID of Th to Tr's stop-list;
                    Add oid and the TID of Tr to Th's av-stopped-list;
            ELSE
                Block(Tr);
    ELSE
        Lock_granted(Tr, oid, lockmode);
END




Commit

During phase 1 commit, an IMM subtransaction returns control to its parent so that the parent can resume execution, while a DEF subtransaction decreases the deferred subtransaction count (deferred-scnt) of its parent and makes its parent commit when there are no unfinished deferred subtransactions.


Procedure Phase_one_commit(Ti)
BEGIN
    Ta is the triggering transaction of Ti (Ti.trigger);
    SWITCH (class of Ti)
    BEGIN
        CASE IMM:
            Return control to Ta and wait;
        CASE DEF:
            Decrease deferred-scnt of Ta by 1;
            IF (deferred-scnt of Ta is zero)
                Phase_two_commit(Ta);
    END
END




During phase 2 commit (Phase_two_commit), we make the local copies global by recursively invoking the commit of the subtransactions in the sub-list.

Procedure Phase_two_commit(Ti)
BEGIN
    IF (Ti has subtransactions)
    THEN
        FOREACH subtransaction Ta in Ti.sub-list
            Phase_two_commit(Ta);
    Commit(Ti);    /* leaf level */
END




Discard

In order to discard tardy transactions from the system easily, we maintain a list of top-level transactions in deadline order and check the tardiness of transactions. When a top-level transaction is discarded for missing its deadline, all its subtransactions are also discarded, according to the Rollback rule of the nested transaction model. The function Discard is defined in the previous section.

Procedure Nested_Discard(Ti)
BEGIN
    IF (Ti has subtransactions)
    THEN
        FOREACH subtransaction Ta in Ti.sub-list
            Nested_Discard(Ta);
    Discard(Ti);    /* leaf level */
END




4.2.2 Performance Evaluation

We ran a simulation with the simulation parameters in Table 4.1 to see the effects of extended AVCC for the nested transaction model. In this simulation, we used the PD priority assignment policy for both 2PL-HP and AVCC. As we can see in Figure 4.9, extended AVCC shows better performance than 2PL-HP for the nested transaction model, and the result is similar to Figure 3.19 for the flat transaction model.




Full Text

PAGE 1

REAL-TIME TRANSACTION SCHEDULING IN CONVENTIONAL AND ACTIVE DATABASES By DONG-KWEON HONG A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY UNIVERSITY OF FLORIDA 1995

PAGE 2

ACKNOWLEDGEMENTS First, I would like to thank my advisors, Dr. Sharma Chakravarthy and Dr. Theodore Johnson, for showing me the path of research, and for providing me with constant encouragement throughout my work. I would also like to thank the other member of my supervisory committee. Dr. Stanly Su, Dr. Eric Hanson, and Dr. Suleyman Tufekci for willingly agreeing to serve on my committee. Next I would like to thank Dr. Paul Fishwick for providing me the SIMPACK simulation package. Without this great package, my work would have been delayed. Finally, I would like to thank all the students at the Data Base System Research and Development Center for their help and friendship. ii

PAGE 3

TABLE OF CONTENTS page ACKNOWLEDGEMENTS u LIST OF TABLES v LIST OF FIGURES vi ABSTRACT viii CHAPTERS 1 1 INTRODUCTION 1 1.1 Problem Statement 3 1.1.1 Priority Assignment 4 1.1.2 Concurrency Control 4 1.2 Survey of Related Work 5 1.2.1 Priority Assignment for RTDBS 5 1.2.2 Priority Assignment for ARTDBS 5 1.2.3 Concurrency Control 6 1.3 Summary of Our Research 13 1.3.1 Contribution of our Work 14 1.4 Structure of Dissertation 15 2 SOFT REAL-TIME: INCORPORATING LOAD FACTOR INTO CCA 16 2.1 Motivation for Our Approach 17 2.2 CCA-ALF for Soft DeadUne 19 2.2.1 Priority Assignment 20 2.2.2 Scheduling Algorithm 22 2.3 EDF-CR-ALF for Soft Deadline 24 2.4 Performance Evaluation 26 2.4.1 Main Memory DB 28 2.4.2 Disk Resident DB 35 2.5 Conclusions 38 3 FIRM REAL-TIME: DEFERRED-RESTART APPROACH 41 3.1 Introduction 41 3.2 Related Work 42 3.3 Motivation for our Approaches 43 3.3.1 Comparison of Conflict Resolution Policies 45 3.4 Adaptive Concurrency Control (ACC) 48 3.4.1 Procedures of ACC 49 3.4.2 Correctness of ACC 56 iii

PAGE 4

3.5 Alternative Version Concurrency Control (AVCC) 57 3.5.1 Algorithms 58 3.5.2 Correctness of AVCC 65 3.6 Performance Evaluation 66 3.6.1 Performance of ACC 69 3.6.2 Performance of AVCC 71 3.7 Conclusions 73 4 ACTIVE REAL-TIME: TRANSACTION SCHEDULING 75 4.1 Priority Assignment 75 4.1.1 Multiple Priorities 76 4.1.2 Performance Evaluation 82 4.1.3 Analysis of Results 86 4.2 Concurrency Control 88 4.2.1 Extension of AVCC for Active Transaction Model 89 4.2.2 Performance Evaluation 92 5 CONCLUSIONS 94 REFERENCES 97 BIOGRAPHICAL SKETCH 101 iv

PAGE 5

LIST OF TABLES Table page 1.1 Compatibility Matrix for MV2PL 9 2.1 Parameters and their meanings for CCA-ALF 27 2.2 Base parameters for main memory database 29 2.3 Base parameters for disk resident database 36 3.1 Simulation Parameters for ACC and AVCC 67 4.1 Parameters for ARTDBS simulations 83 V

PAGE 6

LIST OF FIGURES Figure page 2.1 Knowledge type and corresponding approaches 18 2.2 Open Network Model for the simulation (single CPU) 26 2.3 Miss percent (CCA-ALF) 30 2.4 Restart Rate (CCA-ALF) 30 2.5 Mean Lateness (CCA-ALF) 31 2.6 Multiclass: Miss Percent (CCA-ALF) 32 2.7 Multiclass: Restart Rate (CCA-ALF) 32 2.8 Multiclass: Mean Lateness (CCA-ALF) 33 2.9 Miss percent for each class and Proportion of class 2 to class 0 34 2.10 DISK: Miss Percent (CCA-ALF) ' 38 2.11 DISK: Restart Rate (CCA-ALF) 38 2.12 DISK: No. of active tr. (CCA-ALF) 39 2.13 DISK: Mean Lateness 39 3.1 Case 1: Both transactions finished successfully within their deadlines .... 46 3.2 Case 2:HPT completed successfully, LPT missed 47 3.3 Case 3:HPT missed, LPT completed successfully 47 3.4 Case 4:Both LPT and HPT missed their deadUnes 47 3.5 Transaction blocks among HIT and MISS group 49 3.6 Primitive cases of transaction stop 53 3.7 Abort of stopped transaction which also stopped others 53 vi

PAGE 7

3.8 More complex cases of transaction stop 54 3.9 Deferred restart and immediate restart versions (AVCC) 58 3.10 Structure of AVCC execution 59 3.11 Case 2: Conflicts with the same transaction more than once 59 3.12 Case3: Conflicts with diff'erent transactions 60 3.13 Commit of AVCC 62 3.14 Chain stop in AVCC 62 3.15 Conflict with different transactions in AVCC 64 3.16 Open Network Model with Multiple CPU and disks 67 3.17 EDF-HP, AED-HP, and EDF-ACC 70 3.18 AED-HP, EDF-ACC, and AED-ACC 71 3.19 Comparisons of EDF-HP, ACC, AVCC 72 4.1 Deadlock due to multiple priorities 77 4.2 Priority Reversal: Blocking of active transactions 77 4.3 Restarts of active transactions 78 4.4 Life of complex active transaction T 79 4.5 Restartable unit of active real-time transaction 85 4.6 Result: Without data contention 86 4.7 Result: With data contention 87 4.8 Two execution scenario 87 4.9 2PL-HP and extened AVCC for active databases 93 vii

PAGE 8

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy REAL-TIME TRANSACTION SCHEDULING IN CONVENTIONAL AND ACTIVE DATABASES By DONG-KWEON HONG DECEMBER, 1995 Chairman; Dr. Sharma Chakravarthy Cochairman: Dr. Theodore Johnson Major Department: Computer and Information Sciences and Engineering Database applications that require time constrained (real-time) response to transactions are becoming quite common. Design of a scheduling policy for a real-time database systems entails combining techniques from conventional database systems and real-time systems and fine-tuning them to obtain a policy that meets the requirements of scheduling transactions in real-time database systems. A scheduler of real-time database systems is responsible for assigning priorities and resolving access conflicts among transactions based on priorities. In this dissertation, we propose a cost conscious dynamic priority assignment policy for soft real-time transactions and new schedulers for firm real-time transactions. Finally, we extend our research to active real-time transaction systems who for modeling complex applications. viii

PAGE 9

CHAPTER 1 INTRODUCTION Database applications that require time constrained (real-time) response to transactions are becoming quite common. The information about an object must be kept sufficiently upto-date in real-time databases. Once entered into a database, data may become out-of-date if it is not updated within a certain period of time. These time varying data (real-time data) naturally impose a time constraint to a query of the data. For example, network traffic management (NTM) data in network database system [34, 36] gives the ability for real-time monitoring of the network making it selfadapting and fault-tolerant. The sampled data of real world (NTM data) should not faU behind the actual data of real world (network traffic) by more than a specified time. Typically, applications in real-time systems (as opposed to Real-Time Database Systems (RTDBS)) do not share disk-resident data. Even when they share data, the consistency of shared data is not managed by the system but by the application program. For the assumptions used in real-time systems, it is possible to predict some of the characteristics of tasks needed for the scheduling algorithms. As a result, scheduling algorithms [47, 46] used in current real-time systems assume a priori knowledge of tasks, such as arrival time, deadline, resource requirement, and worst case (CPU) execution time. For database applications, on the other hand, the following sources of unpredictability exist which makes it difficult to predict some of the resource requirements for transactions that need to meet time constraints [35]: 1. Resource confficts (e.g., wait for disk I/O) 2. Data dependence (e.g., execution path based on the database state) 3. Dynamic paging and I/O (e.g., page faults, caching, and buffer allocation)

PAGE 10

2 4. Data interference (e.g., aborts, rollbacks, and restarts) 5. Algorithmic variations for disk-resident data access (e.g., clustered scan vs. use of index) Note that most of the sources of unpredictability are related to the database characteristics. Nevertheless, the best efforts of researchers to improve the performance metrics of RTDBS are continued by several ways. RTDBS have close connection with active Database Systems (DBS). In active DBS data and control knowledge are stored together. This control knowledge specifies the action to be done when specified conditions hold and specified events occur. Event driven control is easily specified by ECA (Event-ConditionAction) rule of active databases. This paradigm is highly suitable to implement RTDBS as these control real-world processes. Recently, application that combines real-time and active databases have been proposed [11, 14] and priority assignment policy has been studied for active real-time database systems (ARTDBS) [33, 34]. We assume that transactions are the basic unit of work for database systems in this dissertation. Transactions with deadlines have been categorized into hard deadline, soft deadline, and firm deadline transactions. Transactions with hard deadlines have to meet their deadlines; otherwise, the system does not meet the specification. Typically, transactions that are in this category have catastrophic consequences if their deadlines are not met. Sometimes contingency measures may be included as an alternative. Soft real-time transactions have time constraints, but there may still some residual benefit for completing transactions past their deadline. Conventional transactions with response time requirements can be considered soft real-time transactions. In contrast to the above two, firm transactions are those which need not be considered any more if their deadlines are not

PAGE 11

3 met, as there is no value to completing the transaction after its deadline. Typically applications that have a definite window (e.g., stock market applications [2, 45]) within which transactions need to be executed come under this category. We view RTDBS and ARTDBS as either memory resident or disk-resident transaction processing systems whose workload is composed of transactions with individual timing constraints. A timing constraint is expressed in the form of a deadline, and we consider only soft and firm deadline transactions in this dissertation. 1.1 Problem Statement The main focus of research in the real-time systems area has been the problem of scheduling tasks to meet the time constraints associated with each task, while the focus in traditional database area has been concurrency control of transactions to guarantee database consistency and recovery in the presence of various kinds of failures (i.e., ACID properties). Design of a scheduling policy for a RTDBS entails synergisticaUy combining techniques from both areas and fine-tuning them to obtain a policy that meets the requirements of scheduling transactions in RTDBS. This dual requirement makes real-time transaction scheduling more difficult than task scheduling in real-time systems or transaction scheduling in database systems. Thus the scheduler of a RTDBS is responsible for assigning priorities and resolving access conflicts among transactions based on priorities. RTDBS assume that each transaction is a unit of work. But in active DBS, transactions may trigger other rules which can be treated as arbitrary computations (i.e., transactions). An active transaction has a set of triggered transactions that are either executed as part of the active transactions or separately depending on the type of coupling mode between the parent and the triggered transactions [11]. Coupling modes proposed in the literature are immediate, deferred, and independent, and the transactions triggered in these modes are referred to as immediate, deferred, and independent transactions, respectively. These

PAGE 12

4 triggered transactions make priority assignment policy and concurrency control algorithm for ARTDBS more complex. 1.1.1 Priority Assignment The performance requirements of conventional Database Management System (DBMS) are usually expressed in terms of average response times rather than meeting the timing requirements of individual transactions. Thus improving the response time of one transaction at the expense of another is not considered as improvement in conventional DBMS. In an RTDBS, the objective is to reduce the number of transactions that have missed their deadlines or total lateness. Consider two scenarios: If the transactions share the system resources on an equal basis, the transactions that have tight deadlines miss their deadlines while the transactions that have loose deadlines are likely to meet their deadlines. Alternatively, if the transactions that have urgent deadlines execute at the expense of the transaction that have loose deadlines (earlier deadline first), they might complete before its deadline. After that the other transactions execute and still complete in time due to their loose deadlines. From these two scenarios, we can see that the service precedence which is decided by the priority assignment policy affects the performance of time constrained database systems [42]. 1.1.2 Concurrencv Control The usual correctness criteria of database transactions is serializability. A serial schedule has no concurrency, but it is of interest since it preserves database consistency. Hence, we are interested in the large class of schedules, which may exhibit consistency and which are equivalent to some serial schedule. Such schedule is said to be serializable. Widely used mechanism for serializing transactions are locking, validation and timestamping. Each mechanism takes a different approach to achieve serializability. Whenever a data conflict occurs, concurrency control protocols use blocking, or transaction restarts or combinations

PAGE 13

5 of them. In RTDBS and ARTDBS the decision of blocking or transaction restarts should include transaction priorities. 1.2 Survey of Related Work 1.2.1 Priority Assignment for RTDBS Priority is assigned based on several types of information. Earliest Deadline First (EDF) and Least Slack First (LSF) are the most common priority assignment policies for real-time systems and these policies are usually combined with 2PL or OCC for RTDBS. EDF. In this policy the transaction with the earliest deadline has the highest priority. LSF. For each transaction T, we define a slack time S=d-(t-t-E-U), where d is deadline, t is the current time, E is expected execution time and U is the amount of service time consumed by T so far. Slack time is an estimate of how long we can delay the execution of T and still meet its deadline. In LSF the transaction with the least slack time has the highest priority. 1.2.2 Priority Assignment for ARTDBS As real-time systems evolve, tasks become bigger, more complicated. In some situations, a single value of an end-to-end deadline fails to capture the sense of urgency of each individual subtasks [23]. Thus the problem arises as to how to assign a priority to a triggered transaction given the priority of the triggering transaction in ARTDBS and three priority assignment policies, PD, DIV, and SL, are suggested for immediate subtransactions and deferred subtransaction [34]. PD. This policy assigns the same priority to all subtransaction which is the same as the parent's priority. The priority of the parent transaction (triggering transaction) which is based on its deadline does not change with the triggering of subtransactions.

PAGE 14

6 DIV. This policy divides the parent's current slack among all the immediate and deferred subtransactions triggered until that point whenever the parent triggers a subtransaction. The parent's priority is also adjusted dynamically to reflect the work that has been triggered dynamically. This policy uses the estimated execution times of subtransactions that have already been triggered. SL. This policy adjusts the slack of parent at each potential triggering point and the transaction with the least slack has the highest priority. The initial value of slack is assigned based on the predictions about the total execution time for a transaction and its subtransactions as indicated by probability of event triggering. The slack is then adjusted at each object or transaction event based on whether the parent transaction triggers a subtransaction or not. This policy assumes the future knowledge of subtransactions triggering and their estimated execution times. 1.2.3 Concurrency Control RTDBS scheduling algorithms combine various properties of time-critical schedulers with properties of concurrency control algorithms [1, 4, 8, 11, 20, 32, 37, 40, 42, 22]. There is a large body of work on scheduling and concurrency control algorithms that can be summarized as foUows: • Single version 2PL-HP, 2PL-WP [3] • Optimistic Concurrency Control (OCC) [19, 28] • Multiversion Concurrency Control [25] • Mixed Integrated Concurrency Control [29] • 2PL-CR, Priority CeiUng Protocol (PCP) [1, 37] • Semantic Concurrency Control [26, 27]

PAGE 15

7 Single version tPL-HP and 2PL-WP. 2-Phase Locking (2PL) [7] algorithm executes transactions in two phases. Each transaction has a growing phase, where it obtains locks and accesses data items, and a shrinking phase, during which it releases locks. No transaction should request a lock after it releases a lock. It has been known that any concurrency control algorithm that obeys the 2PL rule is serializable. Priority scheduling without knowing the data access pattern is presented as a representative of algorithms with incomplete knowledge of resource requirements. These algorithms combine priority scheduling with 2PL. When we use 2PL with priority scheduling a transaction conflict can arise from incompatibility of locking modes and a priority inversion can occur when a higher priority transaction (HPT) requests and blocks on a lock for object 0 which is locked by a lower priority transaction (LPT). Conflicts among transactions are resolved using one of the following methods: High Priority. High Priority (HP) conflict resolution method is the same as prioritybased wound wait conflict resolution method (In the priority-based wound-wait protocol, transaction T, can wait for a conflicting transaction Tj if T, has a lower priority. Otherwise, Tj is aborted (wounded)). The idea of this method is to resolve a conflict in favor of the transaction with the higher priority. The favored transaction gets the resources, both data lock and the processor. The loser of the conflict relinquishes the control of any resources that are being used by itself [3]. Wait. Under this policy, priority inverting conflicts are resolved exactly as nonpriority inverting conflicts. The requesting transaction always blocks and waits for the data object to become free. This is standard method for most DBMS which do not execute real-time transactions [3]. Wait Promote. Wait Promote (WP) handles conflicts as Wait does except when a priority inversion occurs. An HPT Tr will block and wait but now we promote the priority of the lock holder Th so it is as high as the priority of the Tr. In other

PAGE 16

8 words, Th inherits the priority of the TrSince locks are retained until commit time the Th wUl keep its inherited priority until it commits or is restarted [3]. Based on 2PL, several combinations of conflict resolution methods and priority assignment policies described previously have been proposed. They are EDF-HP (Earliest Deadline First with High Priority), LSF-HP (Least Slack First with HP), EDF-WP (EDF with Wait Promote), AED-HP (Adaptive Earliest Deadline with HP) [20], Virtual Clock and Pairwise Value Function [42]. Optimistic Concurrency Control. Some concurrency control algorithms based on locking or time stamp are pessimistic in nature. They assume that the conflicts between transactions are quite frequent and do not permit a transaction to access a data item if there is a conflicting transaction that accesses that data item. Thus the execution of any operation of a transaction follows the sequence: validation, read, computation, write. Optimistic algorithms, on the other hand, delay the validation phase until just before the write phase. Thus an operation submitted to an optimistic scheduler is never delayed. Each transaction initially makes its updates on local copies of data items. The validation phase consists of checking if these updates would maintain the consistency of the database. If the answer is afiirmative, the changes are written into the actual database. Otherwise, the transaction is aborted and has to restart. There have been several approaches that used OCC for real-time concurrency control method [22, 18, 28]. An OCC scheme with a deadline and transaction length based priority assignment scheme is presented in [22] and an OCC with several conflict resolution methods has also been proposed in [18]. With OCC approach, a policy is needed to resolve the access conflicts during the validation phase. Some of the policies proposed are commit (always let the transaction being validated commit), priority abort (abort the validating transaction only if its priority is less than that of each conflicting transaction), priority wait (wait for higher priority transactions to

PAGE 17

complete), and opt-sacrifice (restart the validating transaction if at lezist one of the transactions in the conflict set has a higher priority). Although OCC scheme is shown to display better performance than 2PL-HP for firm real-time transactions in some studies [18, 28], it appears to provide better performance only when the data conflicts are relatively small as shown in another study [22]. Multiversion Concurrency Control. In a multiversion concurrency control algorithm, each write on a data item 0 produces a new copy or versions of 0. DBMS keeps a list of version 0, which is the history of values that DBMS has assigned to 0. For each read request, DBMS not only decides when to allow read request, but it also decides which one of the version of 0 to read [7]. The benefit of multiple version is to reduce the transaction rejection and thus to increase the degree of concurrency. Maintaining multiple versions may not add much to the cost of concurrency control because the versions may be needed anyway by the recovery algorithm. Obviously, however, maintaining multiple versions take a lot of storage space. To control this storage requirement, versions must periodically be purged or archived. As a variant of single version 2PL real-time multiversion 2PL [25] has been introduced to increase concurrency by adjusting serialization order dynamically. Multiversion 2PL (MV2PL). Multiversion 2PL [7] uses three types of locks: read locks, write locks, and certify locks. Their lock compatibility matrix is shown in Table 1.2.3. Table 1.1. Compatibility Matrix for MV2PL Read Write Certify Read Write Certify Y Y N Y Y Y N Y N

PAGE 18

10 • When MV2PL scheduler receives a write request, it attempts to set the write lock. Since transactions can have their own versions, there is no data conflict between write locks. • When the scheduler receives transaction r,'s read request for object 0, it attempts to set read lock. Since read locks only conflict with certify locks, it can set read lock as long as no transaction already owns a certify lock on object 0. If T, already owns write lock and has therefore write 0,, then the scheduler translate the read request of 0 into read request of 0, . Otherwise, it waits until it can set a read lock, and sets the lock, translate read request of O into read request of Oj where Oj is the most recently committed version of 0. Since only committed version may be read, MV2PL avoids cascading aborts and ensures that the MV histories it produces are recoverable. • When the scheduler receives transaction T^'s commit request, it attempts to convert Ti's write locks into certify locks. Since certify locks conflict with read locks, the scheduler can only do this lock conversion on those data items that have no read locks owned by other transactions. On those data items where read locks exist, the lock conversion is delayed until all read locks are released. Thus the effect of certify lock is to delay T.-'s commit untU there is no active readers of data items it is about to overwrite. As in the single version 2PL, MV2PL also has priority inversion problem caused by locking mechanism. Real-time MV2PL [25] also use priority based aborts and blockings to resolve conflicts among transactions. Mixed Integrated Concurrency Control. Most DBMS schedulers synchronize conflicting operations by one of 2PL, TO (Timestamp Ordering), or OCC. There are other DBMS schedulers that use combination of these techniques to ensure that transactions are processed in a serializable manner. DBMS schedulers combining different mechanism

PAGE 19

11 for read-write and write-write synchronization are called mixed integrated schedulers [7]. Lin and Son [29] have proposed a new concurrency control algorithm which is based on mixed integrated concurrency control method that adjusts the serialization order dynamically. The proposed algorithm which is based on deferred update policy uses 2PL for read-write conflicts and the Thomas' Write Rule (TWR) for write-write conflicts. Thomas' Write Rule. Let Tj be the transaction with maximum timestamp that wrote into object 0 before the scheduler receives write request of T, on 0. H timestamp of T, is greater than that of Tj process the write request of T, on 0 as usual. Otherwise process the write request of T,by simply acknowledging it. Their approach resembles OCC by using deferred update policy and 2PL by getting conflicting information when they access the data items. There are several approaches that use a priori knowledge for handling real-time transaction scheduling. Static priority assignment policy PCP using priority inheritance with exclusive lock and read/ write lock have been proposed [37, 40]. Priority Ceiling. The priority ceiling of a data object is the priority of the highest priority task that may lock this object. Priority-ceiling protocol. A transaction J requesting to lock a data item 0 is granted the lock only if p(J) > c(P), where P is the data item with the highest priority ceiling among all data items currently locked by transactions other than J, p( J) is the priority of transaction J and c(P) is the priority ceiling of object P. If J cannot lock 0, J is blocked and the transaction holding the lock on P inherits the priority p(J) until P is unlocked.

PAGE 20

12 These protocols are transaction pre-analysis based nonabortive methods using priority inheritance to prevent priority inversion and indefinite blocking. It is important to note that the concept of priority ceiling assumes that we know a lot about transactions that will access the database. This is a reasonable assumption for dedicated real-time application such as tracking [37]. Although the priority ceiling protocol introduces unnecessary blocking, the worst case blocking for any task is reduced to the duration of at most one low priority transaction to finish in one critical section, and no deadlock wiU ever occur. The critical problem of PCP is that it is not appropriate for disk resident database because an LPT is unnecessarily blocked during 10 wait time of a conflicting HPT. Priority scheduling with some em a priori knowledge and dynamic priority assignment is introduced as another approach [3, 8, 21]. Conditional Restart (CR) [3] uses estimated execution time of transactions to make a decision on blocking or aborts. The Cost Conscious Approach (CCA) [21] uses data access pattern to estimate the dynamic costs incurred by the interference among transactions. Conflict avoiding nonpreemptive method and Hybrid algorithms [8] which use conflict avoiding schemes in the non-overload case and CR conflict resolution method in the overload case have been proposed. Semantic Concurrency Control. Database consistency is preserved by enforcing serializability. Serializability is often too strict a correctness criterion for real-time applications, when the precision of an answer for a query may stiU be acceptable even if serializability is not strictly observed in transaction scheduling. A weaker correctness criterion for concurrency control in real-time transactions by investigating the notion of similarity [26] is proposed and integrating the similarity concept into database concurrency control method [27] has been studied. The concept of similarity is based on the observation that data values of a data object that are slightly different are often interchangeable as read data for transactions. Their approach which is based on similarity assumes that the application semantics allows us to derive a similarity bound

for each data object such that two write events on the data object must be similar if their timestamps differ by an amount no greater than the similarity bound, i.e., all instances of write events on the same object that occur in any interval shorter than the similarity bound can be swapped in the schedule without violating the consistency requirement [27]. Thus conflicting transactions do not need to block one another as long as their event conflicts can be resolved by using the similarity bound.

1.3 Summary of Our Research

The goal of our research is to develop techniques for RTDBS and ARTDBS which assign priorities to transactions, schedule transactions, and resolve conflicts in proper ways. The tasks that we have accomplished are listed below.

Research for soft RTDBS. Tasks in a real-time system often communicate through shared data, yet have a correctness constraint that each task must appear to execute atomically. In this case, the tasks need to be managed as transactions, and need to be scheduled by an RTDBS. We have already developed a cost conscious dynamic priority assignment policy, CCA, which effectively exploits the time accrued by interference among transactions, and developed the simulator and compared the performance [21].
1. Developed the Cost Conscious Approach (CCA).
2. Extended CCA so that we can exploit the load factor of the system.
3. Developed the simulator and compared the performance.

Research for firm RTDBS. A firm deadline has different semantics from a soft deadline. By removing tardy transactions from the firm RTDBS there will not be any tardy transactions in the system. Removing tardy transactions from the system gives some advantages to OCC, which uses a late stage validation method. There have been some comparisons between 2PL-HP and OCC as concurrency control algorithms for firm

RTDBS. Both approaches have some advantages and disadvantages. We have developed new approaches which can combine the advantages of 2PL-HP and OCC:
1. Developed new concurrency control methods which use immediate restart and deferred restart policies together. Our approaches combine the advantages of 2PL-HP and OCC.
2. Developed the simulator and compared the performance.

Research for ARTDBS. An ARTDBS has a more complex transaction model in which transactions may trigger other transactions. We have developed a new priority assignment and compared the performance:
1. Developed a new priority assignment policy.
2. Developed the simulator and compared the performance.

1.3.1 Contribution of our Work

We can summarize the contributions of our work as follows:

New scheduler for soft deadlines. There have been several approaches [38, 3] that use a priori knowledge for RTDBS. We developed CCA [21], which uses the transaction data set to estimate the approximate cost of transaction rollbacks and restarts. Based on CCA, our new approach tries to include as much information as possible. Our new approach even includes system load information.

New scheduler for firm deadlines. Performance comparisons of 2PL versus OCC for conventional DBMS and 2PL-HP versus OCC for RTDBS have been done. According to the studies [19], the relative performance of 2PL-HP and OCC changes with the transaction mix and system load. Based on the previous performance studies and our observations on both approaches we developed new concurrency control methods.

New scheduler for ARTDBS. There are many applications, such as cooperative distributed navigation systems and intelligent network services, where real-time active database technology is extremely useful [33, 34]. As many commercial systems support active capability, a lot of non-traditional applications are being implemented using this capability. We studied priority assignment policies for the more complex active transaction model, compared the performance of PD and DIV, and extended our AVCC for firm RTDBS to fit into ARTDBS.

1.4 Structure of Dissertation

The rest of the dissertation is structured as follows. Chapter 2 provides a cost conscious dynamic priority assignment policy which incorporates a system load factor for soft RTDBS and shows the performance of our approach by using simulation studies. Chapter 3 presents new ideas that use immediate restart and deferred restart together for firm RTDBS and shows the performance comparisons. In Chapter 4, we present a priority assignment policy for ARTDBS and show the performance evaluation on disk resident databases. Chapter 5 concludes the dissertation with the contributions of this dissertation and the future work.

CHAPTER 2
SOFT REAL-TIME: INCORPORATING LOAD FACTOR INTO CCA

Repetitive workload is a common property in real-time and transaction processing systems. Thus, in a real-time transaction processing system, users do not run arbitrary programs, but rather request the system to execute specific functions out of a predefined set. Each function is an instance of a transaction type. That is, the RTDBS invokes a transaction program that implements the requested function. The random aspect is the sequence and the frequency with which programs are invoked [17]. Use of canned transactions and queries whose read and write sets can be predicted beforehand is a step in the right direction, and the data items accessed by a transaction are likely to be known a priori once its functionality is known [35]. Based on the above observation, priority scheduling with some a priori knowledge has been introduced as another approach [3, 8, 27, 37, 40, 21]. Conditional Restart (CR) [3] uses the estimated execution time of transactions to make a decision on blocking/aborts, and CCA [21] uses the data access pattern to estimate the dynamic costs incurred by the interference among transactions. A conflict avoiding nonpreemptive method and Hybrid algorithms [8], which use conflict avoiding schemes in the non-overload case and the CR conflict resolution method in the overload case by using the data access pattern, have been proposed. The Priority Ceiling Protocol (PCP) [37, 40] uses the data access pattern and static priorities of transactions, and the Similarity Stack Protocol (SSP) [27] uses more detailed information by assuming that the application semantics allows us to derive a similarity bound for each data object. Thus conflicting transactions do not need to block one another as long as their event conflicts can be resolved by using the similarity bound. In this part, we view an RTDBS as either a memory- or disk-resident transaction processing system whose workload is composed of a set of canned transactions with individual timing

constraints. A timing constraint is expressed in the form of a deadline, and we consider only soft deadline transactions. With the canned transactions, we will look at how we can derive a system load factor and how we can use that information for soft real-time transaction scheduling.

2.1 Motivation for Our Approach

The primary motivation for our approach is to answer the question "What kind of information is relevant and how can it be meaningfully incorporated into the design of a real-time scheduling algorithm?". Various types of information are useful in different ways. Intuitively, we can do better if we have additional knowledge, but the improvement is predicated upon the appropriate use of that knowledge. Figure 2.1 illustrates the classification of various scheduling algorithms proposed in the literature with respect to the type of knowledge used.
Type 0. Does not assume any a priori knowledge. The only available timing information is the deadline (e.g., EDF-HP [3]).
Type 1. Deadline and data access pattern are available (e.g., CCA [21]).
Type 2. Deadline and estimated execution time are assumed to be available (e.g., EDF-CR [3]).
Type 3. Data access pattern and static transaction priorities are assumed to be available (e.g., PCP [39]).
EDF-HP is the simplest and most straightforward approach for an RTDBS. The EDF priority assignment policy minimizes the number of late transactions when systems are lightly loaded. The performance, however, rapidly degrades as the system becomes overloaded. There have been several approaches [3, 21] to overcome the shortcoming of EDF-HP by using additional information. The basic idea of these approaches is to save valuable system resources by not blindly aborting partially executed conflicting transactions. EDF-CR [3]

Figure 2.1. Knowledge type and corresponding approaches (Type 0: EDF-HP, AEDF-HP; Type 1: CCA; Type 2: EDF-CR; Type 1 & Type 2: CCA-ALF, EDF-CR-ALF)

uses type 2 information while CCA [21] uses type 1 information to improve EDF-HP further, and our experiments [13] have shown that CCA is better than EDF-CR for soft real-time systems when the resource time is used as the estimated execution time for EDF-CR. From the experiments [13] we found that the response time, which is the difference between completion time and arrival time, varies considerably in a soft RTDBS. The response time of a transaction is very hard to predict without combining type 1 information and the system load factor, because the response time of a transaction varies with the changes of system load, especially in a soft RTDBS. In an RTDBS, irrespective of whether it is memory- or disk-resident, the (wall clock) response time has two distinct components: Tstatic, the time needed to execute a transaction in an isolated environment, and Tdynamic, the time spent in waiting (both I/O and concurrency related) as well as abort/restart overhead. Tstatic is dependent on the semantics of the transaction (e.g., data values accessed and branch points) and is relatively straightforward to estimate. Tdynamic, on the other hand, is dependent on the current state of the system and on future events, i.e., on the transactions that are currently in the system and the transactions that will arrive in the future. In the database context, Tdynamic is extremely difficult to compute or even estimate as it is not only dependent on the resources consumed so far but also on the resources required for its completion, which may be affected by future events. Furthermore, Tdynamic is sensitive to the transaction mix and can vary considerably

when the transaction mix changes. Nevertheless, the inclusion of an approximate Tdynamic as part of the strategy for meeting timing requirements is likely to perform better than approaches where the dynamic information is not included at all. Based on the above observations, we propose an adaptive cost conscious approach, CCA-ALF (Cost Conscious Approach with Average Load Factor), and EDF-CR-ALF (EDF-CR with Average Load Factor), which combine type 1 information with a load monitoring mechanism. CCA-ALF and EDF-CR-ALF use type 1 information to calculate the resource time of a transaction and then anticipate the system load by using resource time and response time. With type 1 information and the current system load, EDF-CR-ALF derives the remaining response time of a transaction for its conflict resolution method, and CCA-ALF changes its priority by incorporating the system load.

2.2 CCA-ALF for Soft Deadline

CCA-ALF uses strict 2PL, exclusive locks only, and the High Priority (HP) conflict resolution method. With type 1 information, which is available by using pre-analysis or pre-execution, the conflict and safety relationships for CCA-ALF can be inferred in a straightforward manner.

hasaccessed(TN). Set of data items that a transaction TN has accessed from the beginning of the transaction.
mightaccess(TN). Set of data items that a transaction TN might access till its completion.

With mightaccess and hasaccessed, we can calculate the conflict and safety relations as follows:
• Transactions TN and TM conflict iff mightaccess(TN) ∩ mightaccess(TM) ≠ ∅.
• Transaction TN is unsafe with respect to TM iff hasaccessed(TN) ∩ mightaccess(TM) ≠ ∅.

2.2.1 Priority Assignment

CCA-ALF uses a dynamic priority assignment policy with a continuous evaluation method which evaluates the priority several times during the execution of a transaction to include some of the dynamic features as the transaction progresses. If the transaction Ti which is selected to be run next conflicts with transactions that are unsafe with respect to Ti, we might lose

Timelost(Ti) = Sum over Tj in M of (rollbackj + execj), where M = {Tj | Tj is unsafe with Ti},

where execj is the effective service time of Tj and rollbackj is the time required to roll back Tj. If the value of Timelost(Ti) is large, executing Ti wastes system resources. We characterize the time lost as the penalty of conflict. The penalty of conflict is the value Timelost(Ti), which is the sum of the effective service times and rollback times of the transactions that must be aborted and rolled back to execute Ti to its commit point without interruption. The notion of the penalty of conflict, described above, is introduced into our CCA-ALF dynamic priority computation formula as follows. If Pr(Ti) is the priority of transaction Ti and d(Ti) is the deadline of transaction Ti, then

Pr(Ti) = -(d(Ti) + w * Timelost(Ti))

Our priority formula uses the absolute deadline as one of its components. As time progresses, although deadline values become larger and larger, the effect of the Timelost component will not be diminished, because the relative priority order depends on the differences of the Pr values, not on their absolute values. The portion of Timelost in the CCA-ALF priority formula can be controlled by the value of w. Although values of w over some ranges showed good performance [21], we can improve the performance by fine-tuning the value of

w, since no priority assignment policy shows good performance in different load situations in a consistent manner. Since the value of Timelost consists of the effective service times of conflicting transactions, it does not include the system load. One way to make this approach adaptive to the system load is to adjust the value of w using the load of the system. As we mentioned before, transaction response time varies substantially with the changes of system load in a soft RTDBS. By computing the ratio (Load Factor) of the response time to the corresponding resource time of each completed transaction, we might be able to predict the system load. Resource time can be derived from type 1 information by assuming that the processing time for each accessed data item does not change enormously.

Resource Time = Number of data accesses x cpu_time + Number of disk reads x disk_time
Response Time = completion time - arrival time
Load Factor (LF) = Response Time / Resource Time

The LF of one transaction cannot represent the system load properly. We use the Average Load Factor (ALF) of previously completed transactions to represent the current system load. The LF of transactions that finished a long time ago cannot contribute to the current system load either. Thus we maintain a list (called lf_list) to keep track of the most recently finished N transactions' LF, and it is updated whenever a transaction finishes.

Average Load Factor (ALF) = (Sum over Ti in lf_list of LF(Ti)) / N

With the ALF value the priority formula of this approach is as follows:

Pr(Ti) = -(d(Ti) + (penalty_weight x ALF x TimeLost(Ti)))

In lightly loaded situations the value of ALF is close to 1 and our priority assignment policy approximately resembles CCA, which showed good performance when the system is in a medium or light load situation. If the system load increases, the effect of the deadline in the

priority formula decreases due to the increase of ALF. Thus in heavily loaded situations the results obtained using our priority formula are comparable to Random Priority (RP) [18], which showed good performance in the heavily loaded situation, since the multiplication of ALF and Timelost overrides the effect of the deadline in the formula. Thus our priority formula helps to balance the urgency of transactions and the waste of system resources in different load situations.

2.2.2 Scheduling Algorithm

The procedure tr-arrival-sched is invoked whenever a new transaction arrives, and the procedure tr-finish-sched is invoked whenever a running transaction finishes. The procedures tr-arrival-sched and tr-finish-sched use ALF and the penalty of conflict (an approximation of the dynamic cost) of transactions, and the procedure tr-finish-sched inserts LF into lf_list and updates ALF. Thus ALF is updated whenever a transaction finishes its execution. The sleep queue holds transactions that are blocked, and the partially executed transaction list (P_list) links all transactions that have executed partially. The ALF introduced in the priority formula is used to weigh the contribution of the penalty of conflict on the priority value computed. The ALF value ranges from 1 to some positive value.

Procedure tr-arrival-sched(Ti)
BEGIN
  Put Ti in the ready queue;
  FOREACH transaction in the ready queue BEGIN
    assign new priority;
  END
  Sort and choose the highest priority transaction and run it;
END
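As an illustration, the following C fragment sketches how the ALF-weighted priority of CCA-ALF could be computed when the ready queue is re-evaluated. The data structures (the reduced transaction record, the lf_list ring buffer and its size of 20 entries) are simplifying assumptions for this sketch, not the simulator's actual code.

/* Sketch of the CCA-ALF priority computation:
 * Pr(Ti) = -(d(Ti) + penalty_weight * ALF * TimeLost(Ti)). */
#define LF_LIST_SIZE 20              /* most recently finished transactions */

typedef struct {
    double deadline;                 /* absolute deadline d(Ti) */
    double timelost;                 /* penalty of conflict TimeLost(Ti) */
    double priority;                 /* larger value means higher priority */
} transaction_t;

static double lf_list[LF_LIST_SIZE]; /* circular list of load factors */
static int    lf_count = 0, lf_next = 0;

void record_load_factor(double response_time, double resource_time)
{
    lf_list[lf_next] = response_time / resource_time;
    lf_next = (lf_next + 1) % LF_LIST_SIZE;
    if (lf_count < LF_LIST_SIZE) lf_count++;
}

double average_load_factor(void)
{
    double sum = 0.0;
    for (int i = 0; i < lf_count; i++) sum += lf_list[i];
    return lf_count > 0 ? sum / lf_count : 1.0;  /* close to 1 under light load */
}

void assign_priority(transaction_t *t, double penalty_weight)
{
    double alf = average_load_factor();
    t->priority = -(t->deadline + penalty_weight * alf * t->timelost);
}

In terms of the scheduling procedures above, record_load_factor would be called from tr-finish-sched, and assign_priority would be applied to every transaction in the ready queue before sorting.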

Procedure tr-finish-sched(Ti)
BEGIN
  insert LF(Ti) into the lf_list;
  update ALF;
  Remove Ti from the system;
  FOREACH transaction in the ready queue BEGIN
    assign new priority;
  END
  Sort and choose the highest priority transaction and run it;
END

Disk I/O introduces new problems in real-time transaction scheduling. There are several choices when an I/O wait occurs. We have considered the following 3 choices:
1. Pick the highest priority transaction among ready transactions.
2. Pick the highest priority transaction among transactions that are ready and do not conflict with any partially executed higher priority transaction.
3. Pick the highest priority transaction among transactions that are ready and do not conflict with any partially executed transaction.
Of the above, we found that the second one comes out as the best for soft real-time transactions [13] and applied it to CCA-ALF and EDF-CR-ALF, where type 1 information is available. Consider the following scenario: Transaction T1 is blocked and is waiting for an I/O completion. The next highest priority transaction, T2, gets the CPU and starts executing so as not to waste the CPU. If T2 is unsafe with respect to T1, then T2 performs a noncontributing execution because it must be rolled back when T1 unblocks. This situation is worse

than the situation in which no transaction is selected to execute during T1's I/O wait time, because of the cost incurred in rolling T2 back. If the third highest priority transaction, T3, accesses a data set disjoint with that of T1 and T2, then T3 is the better choice. In our approach we select T3 rather than T2 during T1's I/O wait using the type 1 information. Even though the third choice prevents noncontributing execution, it might also limit the concurrency of the system too much. A noncontributing execution is defined as a lower priority transaction's execution during the I/O wait of a higher priority transaction that has to be rolled back when the higher priority transaction finishes its I/O [21].

2.3 EDF-CR-ALF for Soft Deadline

EDF-CR [3] uses the estimated execution time of a transaction when it decides whether to abort a conflicting lower priority transaction or block a higher priority transaction. The problem here is that the remaining execution time of a transaction doesn't consider Tdynamic at all. When we deal with soft deadlines, the response time varies considerably with the changes in the system load. Thus, it seems naive for EDF-CR to use a statically estimated execution time which does not consider the changes to the system load at all. Our simulations [13] showed that in heavily loaded situations EDF-CR is worse than EDF-HP when the resource time of a transaction is used as the estimated execution time. For this reason we use a dynamically estimated response time instead of a statically estimated execution time in EDF-CR-ALF when transactions block or abort. As we explained before, we can estimate the remaining response time dynamically by using the additional information available about transactions. With type 1 information and the system load, we can estimate the remaining response time of a transaction dynamically by using the statically calculated resource time and the dynamically traced ALF. The slack time (Sr) of a lock requesting

higher priority transaction Tr, and the remaining response time (RRT) of a lock holding transaction Th, can be calculated dynamically by using the following formulas, and the priority is assigned based on the EDF policy.

Remaining Response Time (RRT) = Remaining Resource Time x ALF
Sr(Tr) = deadline(Tr) - (current time + RRT(Tr))

We name this approach EDF-CR-ALF (EDF-CR with ALF) and its conflict resolution procedure is as follows:

Procedure EDF-CR-ALF-sched
BEGIN
  IF Pr(Th) < Pr(Tr) THEN
    IF RRT(Th) < Sr(Tr) THEN
      Block Tr;
      Inherit Pr(Tr) to Th;
      Run Th;
    ELSE
      Abort Th;
      Run Tr;
  ELSE
    Block Tr;
    Run Th;
END
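To make the decision rule concrete, here is a minimal C sketch of the EDF-CR-ALF conflict test, assuming a hypothetical transaction record carrying the priority, deadline, and remaining resource time; it illustrates the rule in the procedure above and is not the simulator's implementation.

/* Sketch of the EDF-CR-ALF conflict resolution decision (illustrative only).
 * tr = lock requesting transaction, th = lock holding transaction. */
typedef struct {
    double priority;                /* larger value means higher priority */
    double deadline;                /* absolute deadline */
    double remaining_resource_time; /* statically derived from type 1 info */
} xact_t;

typedef enum { BLOCK_REQUESTER, ABORT_HOLDER } decision_t;

decision_t edf_cr_alf_resolve(const xact_t *tr, const xact_t *th,
                              double now, double alf)
{
    if (tr->priority > th->priority) {
        /* remaining response time of the holder and slack of the requester */
        double rrt_holder    = th->remaining_resource_time * alf;
        double rrt_requester = tr->remaining_resource_time * alf;
        double slack_requester = tr->deadline - (now + rrt_requester);

        if (rrt_holder < slack_requester)
            return BLOCK_REQUESTER;   /* let the holder finish, inheriting tr's priority */
        else
            return ABORT_HOLDER;      /* requester cannot afford to wait */
    }
    return BLOCK_REQUESTER;           /* requester has lower priority: wait */
}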

We expect EDF-CR-ALF to perform better than EDF-HP in lightly loaded situations, but it is likely to be almost the same as EDF-HP when the system is heavily loaded because, under heavy load, most of the transactions in the system do not have enough slack time to wait for the completion of a conflicting lower priority transaction. The advantage of EDF-CR-ALF over EDF-CR is that EDF-CR-ALF is never worse than EDF-HP in any situation, because it does not overestimate the slack time of a lock requesting higher priority transaction.

2.4 Performance Evaluation

In order to evaluate the performance of the CCA-ALF algorithm described in this part, two simulations of a real-time transaction scheduler were implemented (using the C language and the SIMPACK simulation package [16]) for main memory [15] and disk-resident databases, as shown in Figure 2.2.

Figure 2.2. Open Network Model for the simulation (single CPU)

The parameters used in the simulations are shown in Table 2.1. In these simulations, transactions enter the system according to a Poisson process with arrival rate λ (i.e., exponentially distributed inter-arrival times with mean value 1/λ), and they are ready to

execute when they enter the system (i.e., release time equals arrival time).

Table 2.1. Parameters and their meanings for CCA-ALF

Parameter        Meaning
db_size          Number of objects in database
max_size         Size of largest transaction
min_size         Size of smallest transaction
i/o_time         I/O time for accessing an object (read/write)
cpu_time         CPU computation per object accessed
disk_prob        Probability that an object is accessed from disk
update_prob      Probability that an object accessed is updated
min_slack        Minimum slack
max_slack        Maximum slack
restart_time     Time needed to rollback and restart
penalty_weight   Weight of penalty of conflict

The number of objects updated by a transaction is chosen uniformly from the range of min_size and max_size, and the actual database items are chosen uniformly from the range of db_size. After accessing an object a transaction spends cpu_time in order to do some work with or on that object and then it accesses the next object. The assignment of a deadline is controlled by the resource time of a transaction and two parameters, min_slack and max_slack, which set, respectively, a lower and upper bound on the percentage of slack time relative to the resource time. A deadline is calculated by adding the resource time and the slack time. The slack time is calculated by multiplying the slack percent and the resource time. The slack percent is chosen uniformly from the range of min_slack to max_slack.

Deadline = arrival time + resource time x (1 + slack percent / 100)

Disk accesses for the disk resident database are controlled by disk_prob when a transaction reads an object. The use of disk_prob to some extent models data maintained in the buffer. At commit time, objects that have been updated are flushed. The parameter update_prob controls the number of data items that should be written at commit time. We use restart_time for modeling the rollback of a transaction and its restart. The restarted transaction will

access the same data objects, and we maintained the most recently finished 20 transactions in a circular list to keep track of the current ALF. In our performance evaluation, we measure three performance metrics (defined below) commonly used in the literature for RTDBS: i) miss percent, ii) restart rate, and iii) mean lateness.

Miss Percent = (Total number of transactions that missed the deadline / Total number of transactions that entered the system) x 100
Restart Rate = Total number of restarts / Total number of transactions that entered the system
Mean Lateness = Sum over tardy transactions Ti of (completion time(Ti) - deadline(Ti)) / Number of tardy transactions
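For concreteness, the following C fragment sketches how a simulated transaction's resource time and deadline could be generated from the parameters of Table 2.1. The uniform() helper and the way disk reads are averaged in are assumptions made for illustration; this is not the actual SIMPACK-based simulator.

/* Sketch of workload generation for the simulation model (illustrative only). */
#include <stdlib.h>

static double uniform(double a, double b)
{
    return a + (b - a) * ((double)rand() / RAND_MAX);
}

/* Resource time and deadline of one transaction, following the model:
 * Deadline = arrival time + resource time * (1 + slack percent / 100). */
double make_deadline(double arrival_time, int num_objects,
                     double cpu_time, double io_time, double disk_prob,
                     double min_slack, double max_slack,
                     double *resource_time_out)
{
    /* expected number of disk reads is disk_prob * num_objects */
    double resource_time = num_objects * cpu_time
                         + disk_prob * num_objects * io_time;
    double slack_percent = uniform(min_slack, max_slack);

    *resource_time_out = resource_time;
    return arrival_time + resource_time * (1.0 + slack_percent / 100.0);
}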

In this experiment we measured miss percent, restart rate per transaction, and mean lateness for EDF-HP, EDF-CR-ALF, CCA, and CCA-ALF.

Table 2.2. Base parameters for main memory database

Parameter        Value
db_size          250
max_size         24
min_size         8
cpu_time         10 ms
min_slack        50 (%)
max_slack        550 (%)
restart_time     5 ms
penalty_weight   1

With the base parameters, the maximum capacity of the system (assuming no blocking and aborts) is

10 ms/object x 16 objects/transaction = 160 ms/transaction, i.e., 6.25 transactions/second.

If we consider the effects of blocking and aborting (dynamic factors), the capacity of the system will be much less than the maximum capacity of the system. Figure 2.3 shows the effect of arrival rate on the percentage of transactions that miss their deadline. The boundary arrival rates of EDF-HP, CCA, EDF-CR-ALF and CCA-ALF are approximately 4.4, 4.6, 4.5, and 4.6 trs/sec, respectively. Figure 2.4 shows the effect of arrival rate on the restart rate of transactions and Figure 2.5 shows mean lateness using a logarithmic scale. CCA-ALF shows better performance as compared to CCA, EDF-HP and EDF-CR-ALF, especially when the arrival rate is between 3 and 5.5 trs/sec. Within this arrival range CCA-ALF and EDF-CR-ALF show a much smaller number of transaction restarts than EDF-HP. Generally, a smaller number of transaction restarts does not guarantee better performance, but CCA-ALF reduces expensive restarts to achieve better performance. This phenomenon can be seen clearly in the multiclass experiment presented later. Observe that for the base parameters shown in Table 2.2, the number of restarts climbs steeply up to the arrival rate of 4 and then declines sharply (Figure 2.4). The reason for the sharp decline is

that beyond a specific arrival rate, it is less likely that an arriving transaction will have an earlier deadline than the currently running transaction. After the peak point in Figure 2.4, it is usually the case that the currently running transaction arrived a long time ago, but could not get system services due to the heavy load on the system (most of the dynamic factors in heavily loaded situations are arrival blockings rather than preemption blockings and aborts [43]). Thus, fewer transactions are preempted and there are fewer opportunities for restarts [1].

Figure 2.3. Miss percent (CCA-ALF)
Figure 2.4. Restart Rate (CCA-ALF)

Figure 2.5. Mean Lateness (CCA-ALF)

Effect of multiclass (Transaction mix)

In this experiment, the arriving transactions are divided into three classes (class 0, 1, and 2) and assigned different values of cpu_time: 1 for class 0, 10 for class 1, and 100 for class 2. We assigned 1 ms as the restart_time for all classes because the resource times of class 0 transactions are between 8 ms and 24 ms. The other parameters are the same as those of the previous experiment. Thus data contention remains the same but the amount of resource time for each class is different. With these assignments a lower class (the lowest is class 0) transaction has a shorter resource time. As a result it has a shorter slack time. The maximum capacity of the system (disregarding blockings and aborts) is:

16 objects/transaction x ((1 + 10 + 100) / 3) ms/object = 592 ms/transaction, i.e., 1.7 transactions/second.

The different assignments of cpu_time for each transaction class create a lot of variance in the transaction resource time (the resource time of a transaction varies from 8 ms to 2400 ms). Therefore, there will be more chances for transaction preemption. Figure 2.6 shows the results of this experiment. The boundary arrival rates of EDF-HP, EDF-CR-ALF, CCA, and CCA-ALF are 0.95, 1.0, 1.1 and 1.15 trs/sec respectively in Figure 2.6. Thus CCA-ALF schedules more transactions without missing more than 20 % of transactions.

With the variation of cpu_time there is a higher possibility that an arriving transaction will have an earlier deadline than the currently executing transaction. Thus the restart rate per transaction in this experiment is increased for both approaches, as can be observed from Figures 2.4 and 2.7. CCA-ALF shows better performance especially when the arrival rate is between 0.6 and 1.4 trs/sec. Within this arrival range CCA-ALF shows a much smaller number of transaction restarts compared to EDF-HP. CCA-ALF reduces very expensive restarts to achieve better performance in the multiclass situation. This experiment also indicates the adaptive nature of the CCA-ALF approach, in which the dynamic cost changes as the transaction mix changes and reduces the effect of the deadline accordingly.

Figure 2.6. Multiclass: Miss Percent (CCA-ALF)
Figure 2.7. Multiclass: Restart Rate (CCA-ALF)

Figure 2.8. Multiclass: Mean Lateness (CCA-ALF)

Another metric of comparison for this experiment is to observe the miss percent for each class. In this experiment data contention is the same for all classes but their active resource requirements are different, because the transactions belonging to classes 1 and 2 require more cpu_time to process their data objects. The relative difference of the miss percent of each class is reduced after an arrival rate of 1 for both approaches (Figure 2.9). The reason is that after this point preemption of transactions is reduced and execution behavior is more serialized. We plot the miss percent for each class from an arrival rate of 0.6 trs/sec to 1.4 trs/sec for EDF-HP and CCA-ALF in Figure 2.9 (the miss percent is too small to plot when the arrival rate is less than 0.6 trs/sec, and the behavior of EDF-CR-ALF is almost the same as that of EDF-HP). Their relative difference is reduced when the arrival rate is bigger than 1.4 trs/sec. From Figure 2.9, we can see that EDF-HP and EDF-CR-ALF blindly favor shorter transactions. Thus EDF-HP and EDF-CR-ALF cause very expensive restarts by aborting transactions that have consumed a lot of resources. CCA-ALF (the behavior of CCA is almost the same as that of CCA-ALF) also favors shorter transactions, but CCA-ALF avoids expensive restarts by not aborting transactions that have consumed a lot of resources. In Figure 2.9 the miss percent of class 0 transactions is higher than that of class 1 transactions in our experiment. The reason is that class 0 transactions are very vulnerable due to their relatively small absolute slack time.

                           EDF-HP                              CCA-ALF
Arrival Rate (trs/sec)     0.6   0.8   1.0    1.2    1.4       0.6   0.8   1.0   1.2   1.4
Miss Percent (Class 2)     2.63  5.91  11.71  19.80  24.94     1.04  2.64  4.81  7.92  16.68
Miss Percent (Class 0)     0.63  1.86  5.3    11.52  19.1      0.8   2.48  5.29  9.22  15.1
Class 2 / Class 0          4.17  3.18  2.2    1.72   1.3       1.3   1.06  0.9   0.85  1.18

Figure 2.9. Miss percent for each class and proportion of class 2 to class 0

We expected that there would be less discrimination against long running transactions in CCA-ALF than in EDF-HP because CCA-ALF implicitly considers the effective service time of a transaction, as we can see in Figure 2.9. Discrimination against long running transactions in RTDBS has been discussed in [32]. In their experiment each class requires a different range of object numbers. Thus each class has a different level of data contention and resource time. In our experiment, however, each class only has a different level of resource contention. That is the reason why their experiment shows more discrimination against long running transactions. Also, the formula used for priority computation currently does not distinguish between transaction classes. This can be easily included in the formula that computes the penalty of conflict. CCA-ALF shows much better performance, especially when the variance of execution time is high among transactions, by not aborting transactions that have already consumed a lot of resource time.

2.4.2 Disk Resident DB

In order to measure the performance of our algorithm on a disk resident database, we extended the simulation program to perform experiments for this case. In this simulation we assumed that we have a single processor, a single disk, and FCFS I/O scheduling. If a transaction is aborted during its wait on the disk queue, the transaction is deleted from the disk queue immediately. However, if a transaction is aborted during its I/O access it is not deleted until it releases the disk. We used deferred update rather than immediate update for fast rollback [6]. Thus we assume that transaction rollback and restart do not require any disk access. The values of the parameters used for this experiment are shown in Table 2.3. The values of cpu_time and i/o_time are chosen to balance the utilization of CPU and disk [5, 4, 44]. With this parameter assignment the system is slightly I/O bound. Resource time in this experiment depends on the cpu_time, the number of objects, the number of

disk accesses, and i/o_time. Since the deadline is assigned based on pre-commit time, we inspect the timing requirement and release locks when a transaction pre-commits. As we have 2 system resources in this experiment and the disk_prob is 0.5, we assigned 0.5 as the value of penalty_weight. The rationale is to distribute the penalty of conflict over the system resources.

Table 2.3. Base parameters for disk resident database

Parameter        Value
db_size          250
max_size         24
min_size         8
i/o_time         25 ms
cpu_time         15 ms
disk_prob        0.5
update_prob      0.5
min_slack        100 (%)
max_slack        650 (%)
restart_time     5 ms
penalty_weight   0.5

With the base parameters in Table 2.3 the maximum capacity of the system is:

16 objects/transaction x 15 ms/object = 240 ms/transaction, i.e., 4.2 trs/second.

This calculation is very optimistic because it includes neither the abort cost nor the blocking cost of transactions.

Effect of arrival rate

In this experiment, we varied the arrival rate from 0.6 trs/sec to 2 trs/sec with the base parameters shown in Table 2.3 and compared the EDF-HP, EDF-CR-ALF and CCA-ALF schemes. Increasing the arrival rate increases time contention as well as data contention and thus increases the transaction miss percent for all three approaches. CCA-ALF and EDF-CR-ALF, which use type 1 information to reduce noncontributing execution, show a much larger improvement over EDF-HP for the disk resident database (as expected) as compared to the

main memory case. The boundary arrival rates of EDF-HP, EDF-CR-ALF, and CCA-ALF are 1.2, 1.42, and 1.43 trs/sec respectively. The reason for the earlier rapid increase of the restart rate in EDF-HP than in EDF-CR-ALF and CCA-ALF is that when the arrival rate is high, the number of available transactions increases, which in turn causes high data contention. High data contention makes many transactions block, which eventually increases the number of active transactions, i.e., the transactions that have begun execution but have not finished yet, in the system. Thus the priority-based restarts of the active transactions that are blocked waiting for locks or resources increase very rapidly. The increase in the restart ratio means that a larger fraction of disk time is spent doing work that will be redone later [5]. Wasted resource time due to priority-based restarts causes high resource utilization and easily saturates the bottleneck resource, which induces longer I/O wait times. With the longer I/O wait time more transactions are scheduled and that increases the I/O wait time further. Thus the possibility of restarting an active transaction increases further. After the peak point, restart rates slowly increase as shown in Figure 2.11. This is because the number of restarts due to a higher priority transaction's I/O wake up increases, but the restarts caused by higher priority transactions' arrivals are gradually reduced. The number of restarts will flatten out eventually as the arrival rate increases. Even though the number of available transactions increases as the arrival rate increases, the number of useful transactions for CCA-ALF and EDF-CR-ALF increases very slowly. Thus the number of active transactions is relatively small, as shown in Figure 2.12, until an arrival rate of 1.6 trs/sec. As a result, the number of priority-based restarts for CCA-ALF and EDF-CR-ALF increases slowly, as can be seen in Figure 2.11. After an arrival rate of 1.6 trs/sec the number of active transactions for EDF-CR-ALF and CCA-ALF increases because both approaches have chosen transactions seemingly not conflicting with partially executed higher priority transactions.

In the heavy load situation, the conflict resolution policy of CCA-ALF resembles EDF-Wait, which uses a nonabortive method. Thus the restart rate of CCA-ALF is less than that of EDF-CR-ALF, which uses the HP conflict resolution method independently of the system load, after an arrival rate of 1.6 trs/sec. That is the reason why CCA-ALF has a lower restart rate than EDF-CR-ALF after the arrival rate of 1.8 trs/sec even though CCA-ALF has a larger number of active transactions.

Figure 2.10. DISK: Miss Percent (CCA-ALF)
Figure 2.11. DISK: Restart Rate (CCA-ALF)
Figure 2.12. DISK: No. of active tr. (CCA-ALF)
Figure 2.13. DISK: Mean Lateness

2.5 Conclusions

Synthesizing the static and dynamic information available about transactions seems to be a viable approach for obtaining scheduling policies to meet the requirements of real-time

transactions. CCA-ALF and EDF-CR-ALF described in this part use dynamic priority assignment with a continuous evaluation method to adapt to load changes effectively and to reduce the excessive restart problem encountered by EDF-HP in high data contention situations. CCA-ALF uses its new priority formula for both resource and data conflict resolution to adapt to the current system load by using type 1 information and the ALF of the system, while EDF-CR-ALF uses EDF for resource conflicts and CR only for its data conflicts. According to our simulation results, we can see that the CCA-ALF way of using the available information is better than that of EDF-CR-ALF. Our simulations indicate that:
1. CCA-ALF performs better than EDF-CR-ALF for soft deadlines over wide ranges of arrival rate.
2. CCA-ALF is more fair than EDF-CR-ALF.
3. CCA-ALF shows particularly good performance when the transactions have a wide range of processing requirements.
4. Reducing noncontributing execution by using type 1 information dominates the performance of disk resident databases.

CHAPTER 3
FIRM REAL-TIME: DEFERRED-RESTART APPROACH

3.1 Introduction

The main focus of research in the RTDBS area has been the problem of scheduling transactions to meet the time constraints associated with each transaction. The scheduler of an RTDBS is responsible for assigning priorities [21, 22, 30, 42] and resolving access conflicts among transactions based on priorities (concurrency control) [3, 8, 19, 25, 26, 27, 28, 29, 37]. Among them, several approaches [2, 19, 22, 28] have been specifically studied to improve the performance of firm RTDBS. We can classify these approaches as follows:

Overload management. By removing jobs with infeasible deadlines from the system as early as possible, the Feasible Deadline (FD) approach [2] can significantly improve the performance of a firm RTDBS. The basic idea is not to spend time on transactions that are likely to miss their deadlines. However, the predictive FD approach requires an estimation of the transaction execution time, which may be difficult or even impossible to obtain due to database characteristics.

Priority assignment. Earliest Deadline First (EDF) and Least Slack First (LSF) [30] are well known ways to assign priorities to soft and firm deadline transactions, and Adaptive Earliest Deadline (AED) [19] has been proposed for firm transactions. In AED transactions are assigned to HIT and MISS buckets and the bucket sizes are controlled by using the past system load. Priorities of transactions in the HIT group are assigned based on EDF and those in the MISS group are assigned based on Random Priority (RP). The priorities of transactions in the MISS group are always lower than

those of the HIT group. AED tries to make the miss percent of the HIT and MISS groups close to 0 % and 100 % respectively. The basic idea here is to spend more time on transactions in the HIT group and less time on transactions in the MISS group.

Concurrency Control. Several Optimistic Concurrency Control (OCC) variants have been proposed for concurrency control in firm RTDBS [19, 20, 22, 28]. By delaying the validation until commit time, transactions that are going to miss their deadlines never restart other transactions. OCC variants show better performance than locking based approaches in simulations [20, 28] and show comparable performance on a testbed [22].

3.2 Related Work

Two Phase Locking with High Priority (2PL-HP) [2] uses blocking and immediate restarts to maintain the consistency of databases. When a lower priority transaction (LPT) tries to access data that is already accessed by a higher priority transaction (HPT) in a conflicting mode, we block the LPT until the HPT releases the corresponding data. When an HPT tries to access data that is already accessed by an LPT in a conflicting mode, we restart the LPT (immediate restart) and let the HPT access the data. OCC uses only a validation phase restart at commit time to keep databases consistent. With the OCC approach, a policy is needed to resolve the access conflicts during the validation phase. Some of the policies proposed are commit (always let the transaction being validated commit), priority abort (abort the validating transaction only if its priority is less than that of each conflicting transaction), priority wait (wait for higher priority transactions to complete), and opt-sacrifice (restart the validating transaction if at least one of the transactions in the conflict set has a higher priority), and their performance has been studied [19, 20, 22, 28]. Performance comparisons between locking and OCC have been made for conventional database systems [6, 5, 10], and they show the superiority of locking over OCC. The

superiority of locking over OCC for conventional database systems comes from its early stage blocking validation policy, which does not waste valuable system resources in conventional database systems. However, locking with only a blocking validation policy, 2PL-Wait, shows worse performance than 2PL-HP, which adopted the priority based High Priority (HP) conflict resolution method in a locking algorithm for RTDBS. Locking can be done with in-place update or deferred update, while OCC can only be done with deferred update. In-place update shows some advantages when most transactions are successfully committed, because its commit protocol is simple and effective, while deferred update has advantages when many transactions are aborted, because its rollback mechanism is very simple. Agrawal and DeWitt [6] showed that if the buffer space available to the transaction is large enough to hold all the pages updated by the transaction until the transaction is validated, the cost of making local copies global is not significant. In fact, most comparisons [5, 10, 19, 22, 28] have been done without considering the effects of in-place or deferred update. In this chapter, we discuss concurrency control algorithms without considering the effects of in-place and deferred update.

3.3 Motivation for our Approaches

Several possibilities of OCC as a concurrency control mechanism for firm RTDBS [19, 20, 22, 28] have been developed. According to these papers, 2PL-HP loses some of its advantages over OCC because of the wasted restart and wasted wait problems, even though OCC has wasted executions resulting from the delay in validation.

Wasted restart. A wasted restart happens if an HPT aborts an LPT and then the HPT is discarded as it misses its deadline. In other words, a transaction which is later discarded can cause restarts.

Wasted wait. A wasted wait happens if an LPT waits for the commit of an HPT and later the HPT is discarded as it misses its deadline. In other words, a transaction which is later discarded can cause the wait of a conflicting LPT.

Wasted execution. A wasted execution happens when an LPT in the commit time validation phase is restarted due to a conflicting HPT which hasn't finished yet.

If a lock requesting transaction has a higher priority than the conflicting transactions, 2PL-HP aborts the conflicting LPT immediately (immediate restart). Immediate restart is useful when an HPT has a high possibility of successful commit (i.e., transactions in a soft RTDBS), by restarting LPTs as early as possible. In a firm RTDBS, however, immediate restart might cause wasted restarts, which affects the performance adversely. It seems that deferred restart is always preferable for firm real-time main memory database systems that have only one CPU. Thus we assume that we have multiple CPUs or disk resident databases. Our observation is that an HPT can proceed without aborting conflicting LPTs if needed when we use a deferred update policy which updates local copies of data items and makes them global at commit time. We only need to stop the LPT until the completion (commit or abort) of the conflicting HPT. If an HPT is discarded by missing its deadline, we can execute the stopped LPT by resuming it. This is termed in this dissertation the stop/resume deferred restart policy. Thus we can avoid the wasted restart problem by using the deferred restart policy selectively. In order to differentiate the cause of transaction blocking we define transaction stopping as follows:

Transaction stopping. Transaction stopping is the blocking of an LPT and happens when an HPT tries to access a data item which is accessed by the LPT. This is in contrast to the term blocking, which describes the situation when an LPT tries to access a data item accessed by an HPT and waits for the completion of that HPT.

3.3.1 Comparison of Conflict Resolution Policies

In order to see the advantages and disadvantages of the immediate and deferred restart policies, we illustrate 4 cases that can arise when an HPT and an LPT conflict with each other, as shown in Figures 3.1, 3.2, 3.3, and 3.4. For each case we evaluate 3 different restart methods, namely, OCC style deferred restart (DR-OCC), immediate restart (IR), and stop/resume deferred restart (DR-SR).

DR-OCC. This policy is exactly the one used by OCC. The validation and restart happen at commit time.
IR. This policy is exactly the one used by 2PL-HP. The validation and restart are done when a data conflict occurs.
DR-SR. This policy uses early stage validation but commit time restart. When an HPT conflicts with a lock holding LPT we stop the LPT until the completion time of the HPT. If the HPT completes successfully (commits) then we restart the LPT. Otherwise (i.e., if the HPT aborts) we resume the LPT.

The following summarizes the alternative outcomes and their relationship to the conflict resolution policies described above:
Case 1: Both LPT and HPT complete successfully, as shown in Figure 3.1. It is clear that immediate restart is the best for them in order to have the earliest finish time for the LPT.
Case 2: In this case the HPT completes successfully and the LPT misses its deadline. DR-SR looks better in terms of wasting the least amount of system resources (as shown in Figure 3.2), but the LPT's effective service time under the IR approach is the longest among them. This indicates that IR has a much better chance of changing Case 2 to Case 1.

Case 3: This case explains the wasted restart of 2PL-HP most clearly. In Figure 3.3 IR is the worst among them due to the wasted restart. Both deferred restart approaches, DR-OCC and DR-SR, are preferable in this situation. Even though DR-OCC looks the best among them, DR-SR is equally good because during the LPT's stopped period we can execute other transactions.
Case 4: This case (shown in Figure 3.4) happens when both HPT and LPT miss their deadlines. If we consider the waste of system resources, DR-SR is the best as it wastes the least resources. Saving valuable resources reduces transaction arrival blocking under heavily loaded situations by giving many chances of execution to other transactions. This is likely to happen often in heavily loaded situations.

Figure 3.1. Case 1: Both transactions finished successfully within their deadlines

By analyzing the 4 alternative outcomes between HPT and LPT transactions, we notice that alternating IR and DR-SR selectively could be better than using a single conflict resolution policy. Based on this strong motivation we propose a new approach termed Adaptive Concurrency Control (ACC) that integrates DR-SR and IR together for firm RTDBS.

Figure 3.2. Case 2: HPT completed successfully, LPT missed
Figure 3.3. Case 3: HPT missed, LPT completed successfully
Figure 3.4. Case 4: Both LPT and HPT missed their deadlines

3.4 Adaptive Concurrency Control (ACC)

The basic idea behind this approach is that we apply the restart policy selectively in different situations. If a conflicting HPT has a high possibility of successful commit, we use the IR policy. Otherwise, we use the DR-SR policy. In the following program fragments we assume that the requesting transaction Tr has a higher priority than the lock holding transaction Th.

When Tr requests the data held by Th:
  IF (Tr has a high possibility of successful commit)
    Restart Th;  /* Immediate restart */
    Execute Tr;
  ELSE
    Stop Th until Tr finishes its execution;

When Tr finishes its execution:
  IF (Tr is discarded)
    Resume the stopped Th;
  ELSE IF (Tr commits)
    Restart Th;  /* deferred restart */

In general the destiny of a transaction is not decided beforehand. It depends on the system load, the tightness of the slack, the transaction mix, and so on. Although we don't know the destiny of a transaction in advance, we can control it by using a proper grouping mechanism. The HIT/MISS grouping algorithm in AED [20] is a good candidate for our concurrency control algorithm. The possibility of successful commit is very high in the HIT group and very low in the MISS group. By using the properties of the HIT/MISS groups in AED, we can apply the

appropriate restart policy. If a lock requesting HPT comes from the HIT group we use IR, otherwise we use DR-SR.

Figure 3.5. Transaction blocks among HIT and MISS groups

Another advantage of incorporating the HIT/MISS groups approach is that we can reduce wasted wait as well. We have 3 cases of transaction conflicts that cause transaction blocking (as depicted in Figure 3.5) and we can see how wasted wait can be reduced by using the AED policy.
Case 1: Both transactions are in the MISS group and will miss their deadlines. Thus wasted wait does not cause a problem here.
Cases 2 and 3: The higher priority transaction that causes the transaction wait is in the HIT group, in which the miss percent is very low, and the waiting lower priority transaction is from the MISS group. Thus wasted wait will be negligible here.

3.4.1 Procedures of ACC

The algorithm for HIT/MISS grouping is well presented in Haritsa's paper [20]. In this section we focus on combining IR and DR-SR. Basically we follow 2PL-HP by maintaining a shared global lock table, and add additional transaction lists to combine DR-SR. Each lock table entry contains an object identifier (OID), a lock mode, and a list of transaction identifiers (TIDs). We use dynamic priority assignment with a static evaluation policy and

maintain the state, version_number, deferred_rcnt, deferred_rlist, stopped_cnt and stopped_list fields for each transaction; their meanings and purposes are as follows:

Ti.state            State of transaction Ti (READY, STOPPED, BLOCKED).
Ti.version_number   Version number of transaction Ti. The initial value is zero. Whenever Ti is restarted its value increases by 1. The version number helps us to check whether a transaction has been restarted while it is stopped by other transactions, because the TID of a transaction never changes when it is restarted.
Ti.deferred_rcnt    The number of lower priority transactions stopped by the transaction Ti.
Ti.deferred_rlist   The list of OIDs and stopped transactions' TIDs, version_numbers and lock modes of objects at the time the transaction Ti stopped those transactions.
Ti.stopped_cnt      The number of higher priority transactions that stopped Ti.
Ti.stopped_list     The list of object ids and identifiers of the transactions that stopped Ti.

Using the global lock table and the data structures described above, we have developed and implemented the key procedures of our approach for simulation. The key procedures of our approach are: Lock_request, Discard, Commit, and Restart. Lock_request is called when a transaction is trying to access a data item, and Discard is called when we remove a tardy transaction from the system. Commit is called when a transaction finishes successfully, and Restart is invoked when a transaction is restarted by a conflicting HPT. In the following procedures Tr, Th, and Ta represent the lock requesting transaction, the lock holding transaction, and the transaction being aborted, respectively.

Lock request. We maintain a shared global lock table and each entry has an OID, a lock mode and a list of lock owners. For simplicity we assume an exclusive-lock-only system in this chapter. When transaction Tr requests an X lock on a data item and no one has a lock on that data item, Tr gets the lock. If the requesting mode is incompatible and the requesting transaction Tr has a higher priority than the lock holding transaction Th, we stop the lock

holding transaction Th, save the lock table entry into Tr, and update the lock table entry for the object with the new lock mode and new lock owner.

Procedure Lock_request(Tr, Obj, lockmode)
BEGIN
  IF (Obj is locked with a conflicting lock mode) THEN
    IF (Pr(Tr) is greater than Pr(Th)) THEN
      IF (Tr came from HIT group) THEN
        Restart(Th);
        Tr gets the lock;
      ELSE
        Stop Th;
        Add STOPPED flag to Th;
        Put oid, previous lock mode, id of Th and its version_number into Tr's deferred_rlist;
        Increment Tr's deferred_rcnt by 1;
        Add oid, id of Tr to stopped_list of Th;
        Increment Th's stopped_cnt by 1;
        Tr gets the lock;
    ELSE
      Block Tr;  /* lock wait */
  ELSE
    Tr gets the lock;
END
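As an illustration of the bookkeeping this procedure relies on, the following C declarations sketch one possible layout for the per-transaction fields and the records kept in deferred_rlist and stopped_list. The exact types (fixed-size arrays, integer ids, the MAX_STOPS bound) are assumptions made for brevity and are not taken from the simulator.

/* Illustrative data structures for ACC bookkeeping (not the simulator's code). */
#define MAX_STOPS 16                  /* assumed bound for this sketch */

typedef enum { READY, STOPPED, BLOCKED } xact_state_t;
typedef enum { LOCK_X } lock_mode_t;  /* exclusive-lock-only system */

typedef struct {                      /* one entry of deferred_rlist */
    int         oid;                  /* object whose lock was taken over */
    lock_mode_t prev_mode;            /* lock mode held by the stopped LPT */
    int         stopped_tid;          /* TID of the stopped transaction */
    int         stopped_version;      /* its version_number at stop time */
} deferred_entry_t;

typedef struct {                      /* one entry of stopped_list */
    int oid;                          /* object on which this xact was stopped */
    int stopper_tid;                  /* TID of the HPT that stopped it */
} stopped_entry_t;

typedef struct {
    int              tid;
    xact_state_t     state;
    int              version_number;  /* incremented on every restart */
    int              deferred_rcnt;   /* number of LPTs this xact stopped */
    deferred_entry_t deferred_rlist[MAX_STOPS];
    int              stopped_cnt;     /* number of HPTs that stopped this xact */
    stopped_entry_t  stopped_list[MAX_STOPS];
} acc_transaction_t;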

Commit. Deferred updates and deferred restarts are done here by using the deferred_rlist of the committing transaction. When we do the deferred restart we check the version number of a transaction to make sure that the transactions in the deferred_rlist haven't been restarted by other transactions, because a transaction has the same transaction identifier (TID) after it is restarted. If a stopped transaction has been restarted by another transaction, then the version number of the transaction must have increased as well.

Procedure Commit(Ta)
BEGIN
  Make local copies global;
  IF (Ta.deferred_rcnt is greater than zero)
    FOREACH (Ti in Ta.deferred_rlist)
      IF (the current version_number and the version_number in Ta.deferred_rlist are the same) THEN
        Restart(Ti);  /* deferred restart here */
  Release_lock(Ta);
END

Restart and Discard. When we discard or restart a transaction that stopped other transactions, we have to update the stop relationship and have to properly restore, in the global lock table, the lock owner and lock mode of the objects that are accessed by the stopped transaction. In Figure 3.6 we illustrate what can happen when we restart or discard a transaction that stopped other transactions. Directed edges represent the stop relationship among transactions and a rectangular node represents a stopped transaction. A transaction represented by a circular node is a transaction that is not stopped. The OID and


Restart and Discard. When we discard or restart a transaction that has stopped other transactions, we have to update the stop relationship and restore, in the global lock table, the lock owner and lock mode of the objects accessed by the stopped transactions. Figure 3.6 illustrates what can happen when we restart or discard a transaction that stopped other transactions. Directed edges represent the stop relationship among transactions, a rectangular node represents a stopped transaction, and a circular node represents a transaction that is not stopped. The OID and the corresponding lock owner are shown in parentheses.

Figure 3.6. Primitive cases of transaction stop

Restarting or removing a transaction that has never stopped any transaction is quite straightforward. When, however, we restart or discard a transaction T2 that stopped other transactions (T1 in Figure 3.6), we have to restore the lock table entry using the information saved in T2. In Figure 3.6, the object O1 was owned by T1 before T2 stopped T1 and got the lock on O1. When T2 is restarted or discarded, the previous state needs to be restored. From now on we assume that this restoration is done by the procedure restore_lock_entry.

Figure 3.7. Abort of a stopped transaction which also stopped others


When we restart or discard a stopped transaction which has itself stopped other transactions, we have to check the relationship more carefully; these cases are illustrated in Figure 3.7. If the relationships involve different data items, then on aborting T2 we can simply disconnect them in the manner shown in Figure 3.7 (a). If the relationships involve the same data item, then we have to update the deferred_rlist of T3 and the stopped_list of T1 and transfer the saved lock table entry of the object O1 from T2 to T3, as shown in Figure 3.7 (b). This lock table entry transferred to T3 will be used to restore the lock table entry if T3 is later removed on account of missing its deadline. We assume that updating stopped_list and deferred_rlist and transferring the saved lock table entry is done by the procedure update_stop_relationship.

Figure 3.8. More complex cases of transaction stop

A transaction can be stopped by more than one HPT, and a transaction can even be stopped more than once by the same transaction on different data items. Figure 3.8 shows these cases.


Procedure Restart(Ta)
BEGIN
    Rollback(Ta);
    Increment Ta.version_number by 1;
    Ta.state = READY;
    Ta.deferred_rcnt = Ta.stopped_cnt = 0;
    Ta.deferred_rlist = Ta.stopped_list = NIL;
    Restart(Ta);    /* re-execute Ta from the beginning */
END

Procedure Discard(Ta)
BEGIN
    Rollback(Ta);
    Remove Ta from the system;
END

Procedure Rollback(Ta)
BEGIN
    IF (Ta.state is STOPPED)
        IF (Ta.deferred_rcnt is greater than zero)
            FOREACH (Tc which is stopped by Ta, where Ta is stopped by Tb on the same data item on which Ta stopped Tc)
                update_stop_relationship(Tc, Tb);
    FOREACH (Ti in Ta.deferred_rlist)
        IF (Ti.state is STOPPED and its version_number is the same)
            Decrement Ti.stopped_cnt by 1;
            IF (Ti.stopped_cnt is zero)
                Remove the STOPPED flag from Ti.state;
            restore_lock_entry(Ta);    /* for the stopped Ti */
    Adjust lock counter for Ta;
    Release_lock(Ta);
END


3.4.2 Correctness of ACC

Theorem 3.4.1 Adaptive Concurrency Control (ACC) is serializable.

Proof: To prove a history H serializable, we have to show that SG(H) is acyclic. Let T1^a and T2^b (here, the subscript is the transaction identifier and the superscript represents the version number) be two committed versions of transactions in a history H produced by Adaptive Concurrency Control. If there is an edge T1^a -> T2^b in SG(H), there exist conflicting operations q and p such that q1^a[x] < p2^b[x].

1. If Pr(T1^a) > Pr(T2^b), then q1^a[x] < ... < c1^a < p2^b[x] < c2^b.

Case 1: If T1^a has never been stopped during its execution, T1^a releases its locks at commit time. Thus T2^b cannot access the data item x until the commit of transaction T1^a.

Case 2: If T1^a has been stopped by some transaction, say Th, during its execution, Th must have a higher priority than T1^a. During that period T2^b cannot access the data item x because Th holds the lock on x. After Th releases its lock on x by finishing its execution, T1^a gets the lock again by lock transfer from Th. Thus T2^b cannot access the data item x until the commit of transaction T1^a.

2. If Pr(T1^a) < Pr(T2^b), then q1^a[x] < ... < c1^a < p2^b[x] < c2^b. Let p and q be the first conflicting operations between T1^a and T2^b. If p2^b[x] appeared before c1^a, T1^a could not commit, because T2^b has a higher priority than T1^a and is a committed transaction.

Suppose there is a cycle T1^a -> T2^b -> T1^a in SG(H).

Case 1: When Pr(T1^a) < Pr(T2^b), T1^a -> T2^b implies c1^a < c2^b and T2^b -> T1^a implies c2^b < c1^a.


Contradiction. Thus this case cannot happen.

Case 2: When Pr(T1^a) > Pr(T2^b), T1^a -> T2^b implies c1^a < c2^b and T2^b -> T1^a implies c2^b < c1^a. Contradiction. Thus this case cannot happen either.

Therefore no cycle can exist in SG(H), and thus our approach produces only serializable histories.

3.5 Alternative Version Concurrency Control (AVCC)

One of several ways to use the DR-SR and IR policies together is to keep both versions at once by starting an alternative version. If a lock requesting HPT Tr conflicts with a lock holding LPT Th, we stop Th and initiate an additional, immediately restarted version Ti as an alternative to Th, as shown in Figure 3.9. Thus we have a stopped version and a restarted version of the LPT in the system at the same time. Even though the stopped version takes some space, it does not consume any processor time until it resumes its execution. Meanwhile Ti can proceed up to the data item that caused the stop and, after Tr commits, continue from there instead of starting from the beginning. If Tr misses its deadline, Ti is removed from the system and Th resumes its work from the stopped position. This approach can be viewed as a method to implement partial rollback without using a savepoint mechanism: Ti is a partially rolled back version of Th when Tr commits. By maintaining the stopped version and initiating the restarted version, AVCC can reduce both wasted restarts and wasted execution. In AVCC the execution path of Th and Ti is exactly the same while Tr is still in the system, because the locks on the shared data are still held and the values of the input data do not change within the deadline of Th. The DR version Th and the IR version Ti have a parent/child relationship, so that Ti can inherit the deadline and priority of Th and can access the data accessed by Th freely.


Figure 3.9. Deferred restart and immediate restart versions (AVCC)

During its execution, the IR version of a transaction might itself be stopped, changed to a DR-SR version, and a new IR version initiated. Thus in AVCC each transaction might have multiple DR versions and a single IR version, which is the leaf of the family tree. Only the IR version of a transaction is allowed to run, while the other DR-SR versions wait for resumption. Figure 3.10 illustrates a general view of transaction execution with three transactions, Ti, Tj, and Tk. For each transaction, a rectangle represents a version, and the dark area in each version shows how far it has executed from the beginning of the transaction. For simplicity we assume that the left version is the parent of the right one and that a parent has always progressed further than its child.

3.5.1 Algorithms

Let us look at several cases that could happen with AVCC. In Figure 3.11 the HPT Tr conflicts only with the stopped version Th, more than once. This case causes no problem. If Tr commits, we take the restarted version Ti and remove the stopped version from the system. When Tr misses its deadline, we remove Tr and Ti from the system and resume Th from the stopped position. The case in Figure 3.12 can happen when transaction Tr stopped Th and initiated Ti, and after that another HPT conflicted only with Th. If both of the conflicting higher priority transactions abort, we resume Th from the stopped position. Otherwise (i.e., if at least one of them commits) we use Ti and remove Th from the system.


Figure 3.10. Structure of AVCC execution (transactions Ti, Tj, Tk, each with a local area, above the global data and the shared lock table)

Figure 3.11. Case 2: Conflicts with the same transaction more than once


Figure 3.12. Case 3: Conflicts with different transactions

Our algorithm maintains a global shared lock table, and each version of a transaction follows two-phase locking. Each lock table entry contains an object identifier (OID), a lock mode, the number of lock waiters, the number of lock granters, a list of lock waiters, and a list of lock granters. In addition, we maintain transaction state, stop_cnt, stop_list, av_stopped_cnt, av_stopped_list, parent, and child fields for each transaction. Their meanings and purposes are as follows:

Ti.state - State of transaction Ti (READY, REPLACED, BLOCKED).
Ti.stop_list - The list of object identifiers and stopped transactions' TIDs recorded when transaction Ti stopped a transaction. This list is used to implement deferred restarts when a transaction commits.
Ti.av_stopped_list - The list of object identifiers and TIDs of the transactions that stopped Ti.
Ti.stop_cnt - The number of lower priority transactions stopped by transaction Ti.
Ti.av_stopped_cnt - The number of higher priority transactions that stopped Ti.
Ti.parent - A pointer to Ti's parent. This field is used when we check the transaction relationship.
Ti.child - A pointer to Ti's child.
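The following C sketch mirrors the per-version bookkeeping just described. As before, the declarations are illustrative assumptions rather than the simulator's actual types, and only the fields listed above are shown.

    #include <stddef.h>

    typedef enum { AV_READY, AV_REPLACED, AV_BLOCKED } AvState;

    typedef struct AvStopRecord {
        int  oid;                       /* object on which the stop happened */
        int  tid;                       /* TID of the other transaction      */
        struct AvStopRecord *next;
    } AvStopRecord;

    /* One node per version of a transaction: DR-SR (stopped) versions form a
       chain of parents, and the single IR version is the leaf (child == NULL). */
    typedef struct AvVersion {
        int               tid;
        AvState           state;
        int               stop_cnt;        /* # LPT versions this one stopped  */
        AvStopRecord     *stop_list;
        int               av_stopped_cnt;  /* # HPTs that stopped this version */
        AvStopRecord     *av_stopped_list;
        struct AvVersion *parent;          /* earlier, more advanced version   */
        struct AvVersion *child;           /* restarted version, if any        */
    } AvVersion;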


Lock acquire. When a transaction requests a lock on a data object, the lock compatibility and the parent/child relationship must be checked. If a lock requesting higher priority transaction Tr conflicts with a lower priority transaction Th, we stop Th, make Tr get the lock, and initiate Ti, which is the restarted version of Th. If the lock holding Th is an ancestor of the lock requesting Tr, Tr gets the lock.

Procedure Lock_acquire(Tr, oid, lockmode)
BEGIN
    IF (oid is locked with a conflicting granted lock mode) THEN
        IF (Th is Tr's ancestor)
            Lock_granted(Tr, oid, lockmode);
        ELSE IF (Pr(Tr) is greater than Pr(Th)) THEN
            IF (Th.state is REPLACED)
                Add oid and TID of Th to Tr's stop_list;
                Add oid and TID of Tr to Th's av_stopped_list;
            ELSE
                Stop Th and generate Ti, the restarted version of Th;
                Add REPLACED flag to Th;
                Add oid and TID of Th to Tr's stop_list;
                Add oid and TID of Tr to Th's av_stopped_list;
        ELSE
            Block(Tr);
    ELSE
        Lock_granted(Tr, oid, lockmode);
END


Commit. We make local copies global and remove stopped versions here. When a transaction T1 commits, as in Figure 3.13, T1 uses its stop_list to remove the transactions that were stopped by T1 and the ancestors of those stopped transactions. Thus when T1 commits, Th, which is in the stop_list of T1, and Ta, which is the parent of Th, are removed.

Figure 3.13. Commit of AVCC

Figure 3.14 shows a special case of a chain stop. Transaction T2 stopped T3 for object O1, and T1 then stopped T2 for the same object O1. In this case the commit of T1 initiates the removal of both T2 and T3.

Figure 3.14. Chain stop in AVCC


Procedure Commit(T1)
BEGIN
    Make local copies global;
    IF (T1.stop_cnt is greater than zero)
        FOREACH Ti in T1.stop_list
        BEGIN
            Check for a chain stop;
            Remove all stopped versions that are in T1.stop_list and their ancestors;
        END
    Remove all locks held by T1;
END

Discard. This function is called:
1. when a transaction is removed for missing its deadline (for example, transaction B in Figure 3.15);
2. when a transaction commits and tries to remove all transactions that are in its stop_list;
3. when a restarted version of a transaction is removed due to the removal of the transaction that caused the restarted version (for example, Ai in case 1 of Figure 3.15).

Figure 3.15 shows the changes in the transactions' relationships in the system when a transaction is discarded for missing its deadline. When a transaction Ta is discarded, the lock owner of an object Oi locked by Ta is changed to Ti if the lock was used to stop transaction Ti. At the same time, transactions blocked waiting for the release of the lock on Oi are woken up; they retry the lock on Oi and compare their priority with Ti's. By doing this, a transaction Tb that has an intermediate priority (Pr(Ta) > Pr(Tb) > Pr(Ti)) will not be blocked by Ti and will not form a circular wait.


Figure 3.15. Conflict with different transactions in AVCC

Procedure Discard(Ta)
BEGIN
    IF (Ta.stop_cnt is greater than zero)
        FOREACH Ti in Ta.stop_list
            Change the lock owner of the OID that caused the stop of Ti from Ta to Ti and wake up transactions blocked on that OID;
            Delete Ta from the av_stopped_list of Ti;
            IF (Ti.av_stopped_cnt is zero)
                Tc is the child of Ti;
                IF (Tc.state is REPLACED)
                    Remove Tc and adjust the parent/child fields of Tc's parent and child;
                ELSE
                    Remove Tc and adjust the child field of Tc's parent;
                Resume Ti;
    Remove all locks held by Ta;
    Remove all data structures for Ta from the system;
END


3.5.2 Correctness of AVCC

Theorem 3.5.1 AVCC is serializable.

Proof: To prove a history H serializable, we have to show that SG(H) is acyclic. Let T1^a and T2^b (here, the subscript is the transaction identifier and the superscript represents the version number) be two committed versions of transactions in a history H produced by AVCC. If there is an edge T1^a -> T2^b in SG(H), there exist conflicting operations q and p such that q1^a[x] < p2^b[x].

1. If Pr(T1^a) > Pr(T2^b), then q1^a[x] < ... < c1^a < p2^b[x] < c2^b. If T1^a was never stopped during its execution, it releases its locks at commit time, so T2^b cannot access the data item x until the commit of T1^a. If T1^a was stopped (replaced) during its execution, the stopping transaction must have had a higher priority than T1^a, and T2^b cannot access x during that period. When the stopping transaction leaves the system after finishing its execution (or is removed), T1^a gets the lock again by lock transfer from its descendant T1^(a+1). Thus T2^b cannot access the data item x until the commit of transaction T1^a.

2. If Pr(T1^a) < Pr(T2^b), then q1^a[x] < ... < c1^a < p2^b[x] < c2^b. Let p and q be the first conflicting operations between T1^a and T2^b. If p2^b[x] appeared before c1^a, T1^a could not commit as version a, because T2^b has a higher priority than T1^a and is a committed transaction.

Suppose there is a cycle T1^a -> T2^b -> T1^a in SG(H). When Pr(T1^a) < Pr(T2^b), T1^a -> T2^b implies c1^a < c2^b and T2^b -> T1^a implies c2^b < c1^a: a contradiction. When Pr(T1^a) > Pr(T2^b), the same two edges again imply c1^a < c2^b and c2^b < c1^a: a contradiction. Thus neither case can happen. Therefore no cycle can exist in SG(H), and our approach produces only serializable histories.

3.6 Performance Evaluation

In order to compare the performance of ACC and AVCC, simulations of a real-time transaction scheduler were implemented (using the C language and the SIMPACK simulation package [16]). In our simulations we assume a multiple-CPU, multiple-disk environment: the CPUs share a common queue whose service discipline is priority Preemptive-Resume, and each disk has its own queue whose service discipline is priority Head-Of-Line (non-preemptive).


Figure 3.16. Open network model with multiple CPUs and disks (source, CPUs, disks, sink, and concurrency control components)

Table 3.1. Simulation parameters for ACC and AVCC

    Parameter     Value
    db_size       1000
    max_size      24
    min_size      8
    i/o_time      20 ms
    cpu_time      10 ms
    disk_prob     0.5
    min_slack     100 (%)
    max_slack     650 (%)
    no_of_cpu     8
    no_of_disk    16


The parameters used in the simulations are shown in Table 3.1. Transactions enter the system according to a Poisson process with arrival rate λ (i.e., exponentially distributed inter-arrival times with mean value 1/λ), and they are ready to execute when they enter the system (i.e., release time equals arrival time). The number of objects updated by a transaction is chosen uniformly from the range min_size to max_size, and the actual database items are chosen uniformly from the range of db_size. After accessing an object, a transaction spends cpu_time doing some work with or on that object and then accesses the next object. The assignment of a deadline is controlled by the resource time of a transaction and two parameters, min_slack and max_slack, which set, respectively, a lower and an upper bound on the percentage of slack time relative to the resource time. A deadline is calculated by adding the resource time and the slack time; the slack time is the slack percent multiplied by the resource time, and the slack percent is chosen uniformly from the range min_slack to max_slack:

    Deadline = arrival time + resource time x (1 + slack percent / 100)

Disk accesses are controlled by disk_prob when a transaction reads an object; the use of disk_prob to some extent models data maintained in the buffer. At commit time, objects that have been updated are flushed. A restarted transaction accesses the same data objects, and the numbers of CPUs and disks are controlled by no_of_cpu and no_of_disk, respectively. In our performance evaluation, we measure the transaction miss percent commonly used in the literature for firm RTDBS:

    Miss Percent = (Total number of transactions that missed their deadline / Total number of transactions that entered the system) x 100

We ran 10,000 transactions for each simulation, and 95% confidence intervals were obtained whose half-widths are less than 2.5%. During our simulations the first 1,000 transactions were not counted in the simulation results, in order to avoid the warm-up problem (initial transient problem) and to obtain the proper HIT/MISS bucket size for ACC.
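The deadline formula and the miss-percent metric described above translate directly into code. Below is a small, self-contained C sketch of that workload bookkeeping; the uniform slack-percent draw and the parameter values mirror Table 3.1, while the function names themselves are only illustrative.

    #include <stdio.h>
    #include <stdlib.h>

    /* Parameter values taken from Table 3.1. */
    #define MIN_SLACK 100.0   /* percent */
    #define MAX_SLACK 650.0   /* percent */

    /* Slack percent chosen uniformly from [min_slack, max_slack]. */
    static double uniform_slack_percent(void)
    {
        double u = (double)rand() / RAND_MAX;
        return MIN_SLACK + u * (MAX_SLACK - MIN_SLACK);
    }

    /* Deadline = arrival time + resource time * (1 + slack percent / 100). */
    static double assign_deadline(double arrival_time, double resource_time)
    {
        return arrival_time + resource_time * (1.0 + uniform_slack_percent() / 100.0);
    }

    /* Miss percent = missed / entered * 100, over transactions counted after warm-up. */
    static double miss_percent(long missed, long entered)
    {
        return entered > 0 ? 100.0 * (double)missed / (double)entered : 0.0;
    }

    int main(void)
    {
        double d = assign_deadline(/*arrival=*/0.0, /*resource time (ms)=*/240.0);
        printf("example deadline: %.1f ms, miss percent: %.2f%%\n",
               d, miss_percent(850, 9000));
        return 0;
    }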


3.6.1 Performance of ACC

In this experiment, we compared EDF-HP, AED-HP, and EDF-ACC to evaluate the merit of the ACC conflict resolution policy proposed in this dissertation.

EDF-HP. Priorities of transactions are assigned based on EDF, and the conflict resolution policy is HP. HIT/MISS grouping is not used in this approach.

AED-HP. This approach was proposed in [20]. Priorities of transactions are assigned based on AED, which uses EDF and Random Priority (RP) for the HIT group and the MISS group, respectively. The conflict resolution policy is HP for both groups.

EDF-ACC. Priorities of transactions are assigned based on EDF for both the HIT and MISS groups, but priorities of transactions in the HIT group are higher than those of the MISS group. The conflict resolution policy is HP for the HIT group and DR-SR for the MISS group.

In our comparison we used the same HIT/MISS grouping algorithm [20] for AED-HP and EDF-ACC; Figure 3.17 shows the simulation result. As expected, AED-HP and EDF-ACC perform similarly to EDF-HP when the system is lightly loaded, because most transactions are assigned to the HIT group, in which the EDF priority assignment policy and HP conflict resolution are used by all three approaches. In the heavily loaded situation both AED-HP and EDF-ACC perform better than EDF-HP. The performance improvement is achieved by reducing wasted restarts in both approaches. AED-HP tries to avoid wasted restarts with RP priority assignment, by trying not to assign higher priorities to transactions with earlier deadlines, on the assumption that those transactions have little chance of finishing within their deadlines under heavy load. Due to the randomness of RP priority assignment, however, AED-HP may still assign higher priorities to transactions that are going to miss. EDF-ACC, in contrast, uses EDF for both the HIT group and the MISS group and the DR-SR conflict resolution policy for the MISS group.


If the HIT/MISS group assignment is 100% correct, EDF-ACC can remove all possibility of wasted restarts. In Figure 3.17, EDF-ACC shows better performance than AED-HP. One reason for this is that ACC reduces wasted restarts in a systematic way: ACC never misses a chance to remove a wasted restart if the HIT/MISS group assignment is 100% correct, while AED-HP uses a random function to assign priorities to transactions in the MISS group and so may miss some of its chances to reduce wasted restarts even if the group assignment is correct.

Figure 3.17. EDF-HP, AED-HP, and EDF-ACC (miss percent versus arrival rate, trs/sec)

In the previous experiment we saw the merit of ACC as a conflict resolution policy in EDF-ACC and of RP as a priority assignment in AED-HP. In the following experiment, we applied ACC to AED (AED-ACC) to evaluate the performance of a different combination.


AED-ACC. Priorities of transactions are assigned based on AED, which uses EDF for the HIT group and RP for the MISS group. The conflict resolution policy is HP for the HIT group and DR-SR for the MISS group.

Using the RP priority assignment and the DR-SR conflict resolution policy together for the MISS group did not show much improvement, as shown in Figure 3.18. The reason is that the RP priority assignment policy itself already reduces wasted restarts by ignoring a transaction's deadline, which in turn sacrifices the importance of transaction deadlines completely in heavily loaded situations. There are not many chances left for ACC to reduce wasted restarts in AED-ACC. That seems to be the reason why AED-ACC shows poorer performance than EDF-ACC in Figure 3.18.

Figure 3.18. AED-HP, EDF-ACC, and AED-ACC (miss percent versus arrival rate, trs/sec)

3.6.2 Performance of AVCC

In our experiments we varied the transaction arrival rate from 10 to 110 trs/second, and priorities of transactions were assigned using the EDF policy for both EDF-HP and AVCC. As we expected, AVCC shows better performance than EDF-HP for a wide range of system loads, as shown in Figure 3.19.


AVCC is much better than EDF-HP except in the heavily loaded situation, in which there are not many chances to reduce wasted restarts because most of the transactions under this load miss their deadlines, as in Figure 3.4.

Figure 3.19. Comparison of EDF-HP, ACC, and AVCC (base parameters; miss percent versus arrival rate, trs/sec)

AVCC is better than ACC under normal load but worse under heavy load. This phenomenon can easily be explained by Figures 3.1, 3.2, 3.3, and 3.4. Under normal load most of the cases are Cases 1, 2, and 3, in which maintaining both the IR and DR-SR versions together greatly reduces wasted restarts and wasted execution, while initiating an IR version under heavy load increases the competition for active resources by wasting system resources, as in Case 4 of Figure 3.4. From these simulation results we can conclude that:

1. The DR-SR conflict resolution policy shows its superiority over IR in overload. Initiating an IR version makes the situation worse in overload.

2. In most situations, maintaining the IR and DR-SR versions of a transaction together shows its superiority over IR.


3.7 Conclusions

EDF-HP and AED-HP use blocking and immediate restarts to resolve data conflicts, while OCC variants use deferred restarts. Both approaches have advantages and disadvantages for firm RTDBS. In our study, we have tried to synthesize the advantages of both approaches by applying the immediate and the stop-resume deferred restart (DR-SR) policies together. By combining immediate and deferred restart policies, our EDF-ACC misses fewer transactions than AED-HP and EDF-HP, which use the immediate restart policy only. Our simulations indicate that:

1. AED-HP performs better than EDF-HP in overload. Our simulation results conform to the previous simulation results [20].

2. EDF-ACC is better than AED-HP in overload. Both approaches use the same HIT/MISS grouping mechanism, but their priority assignment and conflict resolution for the MISS group are different. From the simulation results we can conclude that EDF-ACC reduces wasted restarts more effectively than AED-HP.

3. AED-ACC is slightly better than AED-HP.

4. EDF-ACC is better than AED-ACC. We can conclude that ACC is better combined with the EDF priority assignment policy.

5. AVCC misses fewer transactions than EDF-HP over the whole range of system loads.

6. AVCC misses fewer transactions than ACC except in the heavily loaded situation.

Our concurrency control algorithm ACC, introduced in this dissertation, uses a common shared lock table and a few lists per transaction to keep the database consistent. The overhead associated with implementing the DR-SR mechanism does not affect transactions in the HIT group at all, and maintaining a few lists for each transaction in the MISS group is not significant. A disadvantage of ACC is that its performance is sensitive to the preciseness of HIT/MISS prediction.


AVCC, in contrast, does not use any grouping mechanism to combine the IR and DR-SR restart policies. During our performance comparisons we did not consider implementation overhead, because we believe that the overhead related to the ACC algorithm is not significant enough to affect the results of our simulations. In terms of space, AVCC requires more data space than EDF-HP and ACC because it maintains both the IR and DR-SR versions of a transaction. The space problem might be trivial if we consider memory prices, which keep dropping sharply, and the primary goal of real-time systems, timely response, which usually requires some redundant resources.


CHAPTER 4 ACTIVE REAL-TIME: TRANSACTION SCHEDULING There are many applications such as cooperative distributed navigation systems and intelligent network services where ARTDBS technology is extremely useful [33, 34] and there have been several proposals to combine real-time databases and active databases [11, 33, 34]. 4.1 Priority Assignment -. Subtask deadline assignment (SDA) problem has been studied in distributed real-time systems where a given task is to be executed and completed by a specified deadline. The task executes several subtasks, each at possibly different system components. When each subtask is submitted to its component, a local deadline must be assigned to it [24]. Likewise in ARTDBS a transaction triggers subtransactions dynamically [9] and a triggered transaction which is a part of the triggering transaction need to be finished along with the triggering transaction and to obey the coupling mode specified. Consider an active transaction T = [T1,T2,T3] with EDF priority assignment policy. The ultimate deadline of transaction T fails to represent the tightness of each individual subtransactions. For example, if subtransaction Tl is scheduled with the deadline of T, the scheduler will consider the time that should be reserved for other subtransactions as slack to Tl. Thus subtransaction Tl will be running at a lower priority because of its excessive slack. As a remedy, earlier intermediate deadline (virtual deadline) has been assigned so as to reserve enough amount of time for the subtransactions to follow [34]. Based on earlier intermediate deadline assignment idea PD, DIV, and SL priority assignment for main-memory resident databases and their relative performances have been studied for a triggering transaction and a triggered 75


76 transaction [34]. In the simulations [34] on memoryresident databases, incoming transactions are classified into triggering and non-triggering classes. In their simulations PD shows the best performance in terms of total miss percent but DIV and SL shows reduced miss percent of triggering class with increased miss percent of non-triggering class. Thus DIV and SL are better than PD only if triggering class is more valuable (more critical) than non-triggering class. The criticalness of a transaction is indicative of the level of importance that is attached to that transaction relative to the other transaction. Depending on the functionality of a transaction, meeting the deadline of one transaction may be considered more critical than another. If transactions have the same criticalness PD is the best among them for the main-memory databases. In this dissertation we wiU take a look at effects of subtransaction priority assignment on disk resident databases with active transactions. We believe that triggering itself is really unpredictable unless the corresponding events are periodical. We know data access patterns of all transactions rather than statically estimated execution times of transactions and we don't know when events will be triggered and all transaction have the same criticalness. With those assumptions we will see how subtransaction deadline assignment policy affects disk resident databases with active transactions. 4.1.1 Multiple Priorities Priority assignment policy DIV [34] assigned earlier deadlines (higher priority) to subtransactions by considering the urgency of subtransactions properly. We believe that, however, their priority assignment policy, DIV, caused unwanted phenomena such as deadlock, priority reversal, reverse direction of High Priority (HP) by assigning different priorities to a triggering and its triggered transactions. In the Figure 4.1, 4.2, and 4.3 we illustrated above 3 problems. In these figures Tl and T4 are triggering transactions and T2, T3 are subtransactions of Tl and T5, T6 are


subtransactions of T4. The relative priority order of these transactions is Pr(T2) > Pr(T3) > Pr(T5) > Pr(T6) > Pr(T1) > Pr(T4).

Deadlock. In Figure 4.1, T1 is waiting for the completion of T5 while T5 is waiting for the completion of T2. Subtransaction T2, which is ready to commit, is waiting for the completion of its parent transaction T1. Thus there exists a circular wait among T1, T5, and T2, which causes a deadlock.

Figure 4.1. Deadlock due to multiple priorities

Priority reversal. In Figure 4.2, T1 requests data already accessed by T5 and is blocked waiting for the completion of T5. As T5 is a subtransaction of T4, T1 effectively waits for the completion of T4, which has a lower priority than T1.

Figure 4.2. Priority reversal: blocking of active transactions

Reverse direction of high priority. In Figure 4.3, T5 requests data already accessed by T1, and T5 causes the restart of T1. This implies that the lower priority transaction T4 causes the restart of the higher priority transaction T1, because T5 is a part of T4.


Figure 4.3. Restarts of active transactions

It seems difficult to solve these problems with a single priority value per transaction if we assign different priorities to triggering and triggered transactions. Thus we suggest a double, or two-level, priority scheme that assigns two priorities to each transaction: one (r_priority) for active resource contention and the other (d_priority) for data conflict resolution. In our scheme we assign the same value to r_priority and d_priority for each triggering transaction; for each subtransaction we assign its r_priority by considering the urgency of the subtransaction and assign its d_priority the value of the parent transaction's d_priority. Thus for each nested transaction [31] only a single priority value is used to resolve data conflicts. With our double priority scheme we can easily avoid deadlock, priority reversal, and the reverse direction of High Priority, while still giving subtransactions timely service at the active resources.

As we mentioned earlier, transaction response time varies substantially with changes in system load. By computing the ratio (Load Factor) of the response time to the corresponding resource time of each completed transaction, we may be able to predict the current system load. Resource time can be derived from transaction programs by assuming that the processing time for each accessed data item does not change greatly.
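A minimal C sketch of the double priority scheme described above is given below. It only illustrates the idea that resource contention and data conflicts are decided by different fields; the struct and function names, and the convention that a larger value means a higher priority, are assumptions made for the example.

    #include <stdbool.h>

    /* Two-level priority: r_priority drives CPU/disk (active resource)
       scheduling, d_priority drives data-conflict resolution. */
    typedef struct ActiveTr {
        double r_priority;   /* per subtransaction; tightened for urgency    */
        double d_priority;   /* inherited unchanged from the top-level tr.   */
    } ActiveTr;

    /* A triggering (top-level) transaction starts with both values equal. */
    void init_top_level(ActiveTr *t, double prio)
    {
        t->r_priority = prio;
        t->d_priority = prio;
    }

    /* A subtransaction gets its own r_priority but inherits d_priority. */
    void init_subtransaction(ActiveTr *sub, const ActiveTr *parent,
                             double urgent_r_priority)
    {
        sub->r_priority = urgent_r_priority;
        sub->d_priority = parent->d_priority;
    }

    /* Used by the CPU/disk scheduler. */
    bool preempts(const ActiveTr *a, const ActiveTr *b)
    {
        return a->r_priority > b->r_priority;
    }

    /* Used by the lock manager, so a whole nested transaction resolves
       data conflicts with one priority value. */
    bool wins_data_conflict(const ActiveTr *a, const ActiveTr *b)
    {
        return a->d_priority > b->d_priority;
    }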


The relevant quantities are defined as follows:

    Resource Time = (number of data accesses x cpu_time) + (number of disk accesses x disk_time)
    Response Time = completion time - arrival time
    Load Factor (LF) = Response Time / Resource Time
    Average Load Factor (ALF) = ( Σ LF(Ti) ) / N, taken over the N most recently completed transactions Ti

After obtaining the ALF, we estimate the response time of a transaction in the system from its estimated resource time and the ALF. We will explain how we calculate priorities for subtransactions using Figure 4.4, in which a top-level transaction T arrives at time t1, triggers deferred subtransactions dt1, dt2, dt3 and immediate subtransactions it1, it2 at times t2 through t6, and releases the deferred subtransactions at time t7, at the end of its own execution.

Figure 4.4. Life of a complex active transaction T

    r_pr_t(T)    r_priority of T at time t
    d_pr_t(T)    d_priority of T at time t
    ERT_t(T)     Estimated Response Time of T at time t
    RRT_t(T)     Remaining Resource Time of T at time t
    ALF_t        Average Load Factor at time t
    Slack_t(T)   Slack time of T at time t
    Nd_t(T)      Number of deferred subtransactions triggered by T until time t
    Ni_t(T)      Number of immediate subtransactions triggered by T until time t

Priority assignment for an immediate subtransaction. When immediate subtransactions are triggered at times t4 and t6, we assign their priorities. First of all, we can estimate the response time of a transaction from the Average Load Factor (ALF) and its RRT as follows; this means a transaction has to spend ERT amount of time in the system to receive RRT amount of service time:

    ERT_t4(it1) = ALF_t4 x RRT_t4(it1)
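The load-factor bookkeeping above is easy to state in code. The following self-contained C sketch computes LF for completed transactions, maintains the ALF over a small window, and derives ERT = ALF x RRT; the window size and the helper names are assumptions for the example, not values from the dissertation.

    #include <stdio.h>

    #define ALF_WINDOW 8            /* assumed window of recent transactions */

    static double lf_window[ALF_WINDOW];
    static int    lf_count = 0;     /* how many samples recorded so far      */
    static int    lf_next  = 0;     /* circular index                        */

    /* Load Factor of one completed transaction. */
    static double load_factor(double response_time, double resource_time)
    {
        return response_time / resource_time;
    }

    /* Record a completed transaction and return the Average Load Factor. */
    static double record_and_average(double response_time, double resource_time)
    {
        lf_window[lf_next] = load_factor(response_time, resource_time);
        lf_next = (lf_next + 1) % ALF_WINDOW;
        if (lf_count < ALF_WINDOW) lf_count++;

        double sum = 0.0;
        for (int i = 0; i < lf_count; i++) sum += lf_window[i];
        return sum / lf_count;
    }

    /* Estimated Response Time: ERT = ALF x remaining resource time. */
    static double ert(double alf, double remaining_resource_time)
    {
        return alf * remaining_resource_time;
    }

    int main(void)
    {
        double alf = record_and_average(/*response=*/900.0, /*resource=*/300.0);
        alf        = record_and_average(1200.0, 400.0);
        printf("ALF = %.2f, ERT for RRT=150 ms: %.1f ms\n", alf, ert(alf, 150.0));
        return 0;
    }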


We will use this modified DIV for our r_priority assignment because we think that the DIV approach is quite simple and reasonable under our assumptions. By equally dividing the parent's effective slack among all the immediate and deferred subtransactions triggered up to that point, we can assign the priorities of immediate subtransactions as follows:

    r_pr_t4(it1) = t4 + ERT_t4(it1) + [Slack_t4(T) - (ERT_t4(it1) + ERT_t4(T))] / (Nd_t4(T) + Ni_t4(T))
    d_pr_t4(it1) = d_pr_t4(T)

The r_priority of a subtransaction is used only for resource contention; for data conflict resolution a subtransaction inherits its top-level transaction's d_priority.

Priority assignment for a deferred subtransaction. By equally dividing the parent's effective slack among all the deferred subtransactions triggered up to that point, and assuming parallel execution of the deferred subtransactions, we can assign the priorities of deferred subtransactions as follows, using the DIV policy [34]:

    r_pr_t7(dt1) = t7 + ERT_t7(dt1) + [Slack_t7(T) - (ERT_t7(dt1) + ERT_t7(dt2) + ERT_t7(dt3))] / Nd_t7(T)
    d_pr_t7(dt1) = d_pr_t7(T)

Priority assignment for a top-level transaction. Assigning proper priorities to a triggering transaction is more important because its d_priority is inherited by its subtransactions to resolve data conflicts. Even though increasing the d_priority of a triggering transaction as subtransactions are triggered might help performance, changing the d_priority of a triggering transaction based on subtransaction triggering could cause priority inversion, which in turn can lead to circular aborts. Thus we use a fixed d_priority and r_priority for a triggering transaction. At time t1 transaction T gets its initial d_priority and r_priority based on its deadline, and they do not change during its execution.


Resource scheduling. We have multiple CPUs and a single common queue with a priority-based preemptive scheduling policy. When a transaction arrives, the procedure Arrival_sched is invoked; when a transaction finishes (commits) or releases a CPU (subtransaction commit, disk I/O), Release_sched is invoked. Both procedures use r_priority when they compare transaction priorities.

Procedure Arrival_sched(Ta)
BEGIN
    Put Ta in the ready queue;
    IF (there is an available CPU)
        Assign Ta to one of the available CPUs;
        Adjust the ready queue and the CPU pool;
    ELSE
        Pick the CPU Ci that runs the transaction Tb with the lowest r_priority;
        IF (r_priority(Ta) is greater than r_priority(Tb))
            Preempt Tb from Ci;
            Execute Ta on Ci;
END

Procedure Release_sched(Ci, Ta)
BEGIN
    Release transaction Ta from the CPU Ci;
    IF (there is an available transaction in the ready queue)
        Pick the transaction Tb that has the highest r_priority in the ready queue;
        Execute Tb on Ci;
    ELSE
        Return CPU Ci to the CPU pool;
END
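As a rough executable companion to the two procedures above, here is a hedged C sketch of the preemption test on arrival for a fixed CPU pool; the array-based representation and the function names are assumptions for the example only.

    #include <stdio.h>

    #define NUM_CPU 8               /* no_of_cpu from the simulation parameters */

    /* r_priority of the transaction running on each CPU; -1 means the CPU is
       idle.  Higher value = higher r_priority in this sketch. */
    static double running_rprio[NUM_CPU] = { -1, -1, -1, -1, -1, -1, -1, -1 };

    /* Return the CPU index the arriving transaction should take (an idle CPU,
       or the CPU running the lowest-r_priority transaction if the arrival can
       preempt it), or -1 if the arrival must wait in the ready queue. */
    int arrival_sched(double arriving_rprio)
    {
        int lowest = 0;
        for (int i = 0; i < NUM_CPU; i++) {
            if (running_rprio[i] < 0)
                return i;                           /* idle CPU available      */
            if (running_rprio[i] < running_rprio[lowest])
                lowest = i;                         /* remember weakest runner */
        }
        return (arriving_rprio > running_rprio[lowest]) ? lowest : -1;
    }

    int main(void)
    {
        for (int i = 0; i < NUM_CPU; i++) running_rprio[i] = 10.0 + i; /* all busy */
        printf("arrival with r_priority 25 -> CPU %d (preempt)\n", arrival_sched(25.0));
        printf("arrival with r_priority  5 -> CPU %d (wait)\n",     arrival_sched(5.0));
        return 0;
    }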


4.1.2 Performance Evaluation

In order to compare the performance of the different priority assignment policies, simulations of an active real-time transaction scheduler were implemented (using the C language and the SIMPACK simulation package [16]). In our simulations we assume a multiple-CPU, multiple-disk environment: the CPUs share a common queue whose service discipline is priority Preemptive-Resume, and each disk has its own queue whose service discipline is priority Head-Of-Line (non-preemptive). Our simulation model follows the three rules of the nested transaction model [17], and the parameters used in the simulations are shown in Table 4.1.

Commit rule. The commit of a subtransaction makes its results accessible only to the parent transaction. The subtransaction will finally commit only if it has committed itself locally and all its ancestors up to the root have finally committed.

Rollback rule. If a transaction at any level of nesting is rolled back, all its subtransactions are also rolled back, independent of their local commit status.

Visibility rule. All changes made by a subtransaction become visible to the parent transaction upon the subtransaction's commit. All objects held by a parent transaction can be made accessible to its subtransactions. Changes made by a subtransaction are not visible to its siblings, in case they execute concurrently.

In our simulations, transactions enter the system according to a Poisson process with arrival rate λ (i.e., exponentially distributed inter-arrival times with mean value 1/λ), and they are ready to execute when they enter the system (i.e., release time equals arrival time).


Table 4.1. Parameters for the ARTDBS simulations

    Parameter            Value
    db_size              1000
    i/o_time             20 ms
    cpu_time             10 ms
    disk_prob            0.5
    min_slack            100 (%)
    max_slack            650 (%)
    no_of_cpu            8
    no_of_disk           16
    min_size             4
    max_size             6
    prob_of_triggering   20-70
    prob_of_immediate    50 (%)
    prob_of_deferred     50 (%)

The number of objects updated by a transaction is chosen uniformly from the range min_size to max_size, and the actual database items are chosen uniformly from the range of db_size. The general behavior of our active transaction model in the simulations is as follows:

Triggering. A triggering transaction triggers a subtransaction after reading a data item (main memory or disk resident) according to the triggering probability prob_of_triggering; the coupling mode of a subtransaction (immediate (IMM) or deferred (DEF)) is decided by prob_of_immediate and prob_of_deferred.

1. When a transaction triggers an IMM subtransaction, we stop the triggering transaction and execute the immediate subtransaction.

2. When a transaction triggers a DEF subtransaction, we just increase the count of deferred subtransactions (i.e., deferred_scnt) and keep executing the triggering transaction. DEF subtransactions are actually started at the end of the triggering transaction by checking the value of deferred_scnt. Those DEF subtransactions can be executed in parallel, as there are multiple CPUs.
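To illustrate the triggering behavior just described, here is a small, self-contained C sketch of the per-read triggering decision; the probability values follow Table 4.1, and the enum and function names are assumptions for the example.

    #include <stdio.h>
    #include <stdlib.h>

    typedef enum { NO_TRIGGER, TRIGGER_IMM, TRIGGER_DEF } TriggerDecision;

    /* Uniform draw in [0, 100). */
    static double pct(void) { return 100.0 * rand() / ((double)RAND_MAX + 1.0); }

    /* After each read, trigger a subtransaction with probability
       prob_of_triggering; a triggered subtransaction is immediate with
       probability prob_of_immediate, otherwise deferred. */
    TriggerDecision decide_trigger(double prob_of_triggering, double prob_of_immediate)
    {
        if (pct() >= prob_of_triggering)
            return NO_TRIGGER;
        return (pct() < prob_of_immediate) ? TRIGGER_IMM : TRIGGER_DEF;
    }

    int main(void)
    {
        int imm = 0, def = 0, none = 0;
        for (int i = 0; i < 10000; i++) {
            switch (decide_trigger(/*prob_of_triggering=*/40.0, /*prob_of_immediate=*/50.0)) {
            case TRIGGER_IMM: imm++;  break;   /* run now, parent stops         */
            case TRIGGER_DEF: def++;  break;   /* count it, run at parent's end */
            default:          none++; break;
            }
        }
        printf("imm=%d def=%d none=%d\n", imm, def, none);
        return 0;
    }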


84 Commit. 1. When an IMM subtransaction finishes its execution, it returns control to its triggering transaction and waits phase 2 commit. 2. When a DEF subtransaction finishes its execution, it decreases deferredscnt of its parent. If the count goes to zero, its parent initiates phase 2 commit. 3. When a triggering transaction commits (phase 2 commit) it makes all subtransactions do the phase 2 commit and releases locks held by itself. Abort. We maintain the deadline list of triggering transactions to check the tardiness of a transaction easily. When a triggering transaction is aborted by missing its deadline all its subtransactions are also aborted according to Rollback rule. Lock conflict. 1. When a subtransaction tries to access the object held by its parent, it can access the object freely. 2. When a parent tries to access the object held by its committed (phase 1 commit) child, it can access the object freely. 3. When a DEF subtransaction tries to access the object held by its sibling, it can access the object freely when its sibling has already finished its phase 1 commit. Otherwise it is blocked until the phase 1 commit of its sibling. In flat transaction model we simply assume that transactions are restarted from the beginning of it. While in our active transaction model rollbacks are done as follows: • Case 1: When a triggering transaction is restarted, triggering transaction and all its subtransactions are rolled back to the beginning of the transaction and the triggering transaction is restarted. • Case 2: When an immediate subtransaction is restarted If it is ready to commit (i.e., precommit) all its siblings and parent are rolled back to the beginning and triggering transaction is restarted.


If it is not ready to commit, it is restarted from the beginning. This does not affect the corresponding triggering transaction or its siblings.

• Case 3: When a deferred subtransaction is restarted, it is restarted from the beginning without affecting its parent or its siblings.

Figure 4.5. Restartable unit of an active real-time transaction

After accessing an object, a transaction spends cpu_time doing some work with or on that object and then accesses the next object. The assignment of a deadline is controlled by the resource time of a transaction and two parameters, min_slack and max_slack, which set, respectively, a lower and an upper bound on the percentage of slack time relative to the resource time. A deadline is calculated by adding the resource time and the slack time; the slack time is the slack percent multiplied by the resource time, and the slack percent is chosen uniformly from the range min_slack to max_slack. Disk accesses are controlled by disk_prob, and the numbers of CPUs and disks are controlled by no_of_cpu and no_of_disk, respectively. In our performance evaluation, we measure the transaction miss percent that is commonly used in the literature for firm RTDBS. We ran 10,000 active transactions for each simulation, and the first 1,000 transactions were not counted in the simulation results in order to avoid the warm-up problem (initial transient problem).


4.1.3 Analysis of Results

We compared the performance of PD and of the DIV variant proposed here. In our experiments all incoming transactions have the same criticalness, independent of their triggering probability. In the first experiment we turned off concurrency control, so that all data could be accessed freely, in order to observe the effect of intermediate priorities on the active resources only. From Figure 4.6 we can conclude that assigning earlier deadlines to subtransactions does not affect the performance of the ARTDBS.

Figure 4.6. Result: without data contention (miss percent versus arrival rate, trs/sec)

In the second stage, we added data contention. As we expected, the transaction miss percent of both approaches climbed earlier than in the previous experiment without data contention, due to transaction blocking and restarts, but the relative performance difference between PD and the DIV variant is again negligible. From Figure 4.6 and Figure 4.7, we can conclude that assigning earlier deadlines to subtransactions does not affect the performance of the ARTDBS if all transactions have the same criticalness. Observe the execution scenarios of the two transactions in Figure 4.8, based on main-memory databases.


Figure 4.7. Result: with data contention (miss percent versus arrival rate, trs/sec)

Execution Ex-A models PD and execution Ex-B models DIV priority assignment, by interleaving parts of the two transactions. As we can see, T1 has a better chance of finishing its execution within its deadline with execution Ex-A, which models the PD priority assignment policy.

Figure 4.8. Two execution scenarios (transactions T1 and T2 under Ex-A and Ex-B)

From our simulation results and previous research [34] we can conclude as follows:

• Transaction triggering prolongs the execution time of an active transaction beyond what was expected; the shortened slack time therefore makes it difficult for the transaction to finish in time. If we consider a transaction that triggers more subtransactions to be more important (i.e., more critical), increasing its priority as its slack decreases is a proper approach [34].


• Transaction criticalness is usually decided statically, based on the functionality of the subtransactions a transaction might trigger rather than the number of subtransactions it triggers. Thus transaction criticalness does not change dynamically with subtransaction triggering. Under this assumption, increasing the priority of a transaction that has triggered many subtransactions merely increases the transaction miss percent by favoring longer transactions. Assigning an earlier deadline to a subtransaction does not improve the performance of the ARTDBS either, when a transaction's criticalness does not change dynamically.

4.2 Concurrency Control

2PL-HP seems a good approach for soft RTDBS, but it suffers from wasted restarts and wasted waits for firm RTDBS. Thus we have developed new concurrency control algorithms, ACC and AVCC, and have shown that the performance of ACC and AVCC is better than that of 2PL-HP for a firm RTDBS with a flat transaction model. ACC uses a HIT/MISS group assignment mechanism to anticipate the destiny of a transaction, and the HIT/MISS group assignment algorithm controls the number of transactions in both groups by assigning incoming transactions to the proper group. The dynamic triggering of subtransactions in the active transaction model, however, would require a major modification of the group assignment algorithm. Meanwhile our new concurrency control algorithm AVCC, which is designed for firm RTDBS by using the semantics of firm deadlines, fits easily into the firm real-time active model, and the performance of AVCC is not affected by the change of transaction execution model. By extending our concurrency control algorithm AVCC to the complex transaction model of active databases, we can develop a firm real-time active database concurrency control algorithm.


4.2.1 Extension of AVCC for the Active Transaction Model

Our extension of AVCC maintains a global shared lock table, and each version of a transaction follows two-phase locking. Each lock table entry contains an object identifier (OID), a lock mode, the number of lock waiters, the number of lock granters, a list of lock waiters, and a list of lock granters. In addition, we maintain transaction state, stop_cnt, stop_list, av_stopped_cnt, av_stopped_list, parent, child, trigger, and sub_list fields for each transaction. Their meanings and purposes are as follows:

Ti.state - State of transaction Ti (READY, REPLACED, BLOCKED).
Ti.stop_list - The list of object identifiers and stopped transactions' TIDs recorded when transaction Ti stopped a transaction. This list is used to implement deferred restarts when a transaction commits.
Ti.av_stopped_list - The list of object identifiers and TIDs of the transactions that stopped Ti.
Ti.stop_cnt - The number of lower priority transactions stopped by transaction Ti.
Ti.av_stopped_cnt - The number of higher priority transactions that stopped Ti.
Ti.parent - A pointer to Ti's parent. This field is used when we check the transaction stop relationship.
Ti.child - A pointer to Ti's child created by a transaction stop.
Ti.trigger - A pointer to the transaction that triggered Ti.
Ti.sub_list - The list of subtransactions of Ti.

The key procedures of our extended AVCC are Nested_Lock_acquire, Phase_one_commit, Phase_two_commit, and Nested_Discard.

Lock acquire. When a transaction requests a lock on a data object, the lock compatibility and the parent/child relationship must be checked. If an HPT Tr conflicts with an LPT Th that has no parent/child or sibling relationship with it, we stop Th, make Tr get the lock, and initiate Ti, the restarted version of Th. When the algorithm checks the parent/child or sibling relationship, it uses the parent, child, and trigger fields.


Procedure Nested_Lock_acquire(Tr, oid, lockmode)
BEGIN
    IF (oid is locked with a conflicting granted lock mode) THEN
        IF (Th is Tr's ancestor, or Th is Tr's child, or Th is a sibling of Tr that has finished its phase 1 commit)
            Lock_granted(Tr, oid, lockmode);
        ELSE IF (Pr(Tr) is greater than Pr(Th)) THEN
            IF (Th.state is REPLACED)
                Add oid and TID of Th to Tr's stop_list;
                Add oid and TID of Tr to Th's av_stopped_list;
            ELSE
                Stop Th and generate Ti, the restarted version of Th;
                Add REPLACED flag to Th;
                Add oid and TID of Th to Tr's stop_list;
                Add oid and TID of Tr to Th's av_stopped_list;
        ELSE
            Block(Tr);
    ELSE
        Lock_granted(Tr, oid, lockmode);
END
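The free-access test in the first branch depends on walking the parent, child, and trigger pointers. The following C sketch shows one hedged way to implement that relationship check; the struct layout is an illustrative assumption and is not taken from the dissertation, and "child" is generalized here to any descendant in the triggering hierarchy.

    #include <stdbool.h>
    #include <stddef.h>

    typedef struct NestedTr {
        struct NestedTr *trigger;        /* transaction that triggered this one */
        bool             phase1_done;    /* has it finished phase 1 commit?     */
    } NestedTr;

    /* True if 'anc' is an ancestor of 't' in the triggering hierarchy. */
    static bool is_ancestor(const NestedTr *anc, const NestedTr *t)
    {
        for (const NestedTr *p = t->trigger; p != NULL; p = p->trigger)
            if (p == anc)
                return true;
        return false;
    }

    /* True if 'a' and 'b' are siblings, i.e. triggered by the same parent. */
    static bool is_sibling(const NestedTr *a, const NestedTr *b)
    {
        return a->trigger != NULL && a->trigger == b->trigger;
    }

    /* Lock holder 'h' does not conflict with requester 'r' if h is r's
       ancestor, h is r's descendant (child), or h is a sibling of r that
       has already finished its phase 1 commit. */
    bool free_access(const NestedTr *r, const NestedTr *h)
    {
        return is_ancestor(h, r)
            || is_ancestor(r, h)
            || (is_sibling(r, h) && h->phase1_done);
    }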


Commit. During the phase 1 commit, an IMM subtransaction returns control to its parent so that the parent can resume execution, while a DEF subtransaction decreases the deferred subtransaction count (deferred_scnt) of its parent and makes its parent commit when there is no unfinished deferred subtransaction left.

Procedure Phase_one_commit(Ti)
BEGIN
    Ta is the triggering transaction of Ti (Ti.trigger);
    SWITCH (class of Ti)
    BEGIN
        CASE IMM:
            Return control to Ta and wait;
        CASE DEF:
            Decrease deferred_scnt of Ta by 1;
            IF (deferred_scnt of Ta is zero)
                Phase_two_commit(Ta);
    END
END

During the phase 2 commit of Phase_two_commit we make local copies global by invoking the commit of the subtransactions in the transaction's sub_list.

Procedure Phase_two_commit(Ti)
BEGIN
    IF (Ti has subtransactions) THEN
        FOREACH subtransaction Ta in Ti.sub_list
            Phase_two_commit(Ta);
    Commit(Ti);    /* leaf level */
END
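Since Phase_two_commit is essentially a post-order walk over the sub_list tree, it can be written very compactly. The C sketch below assumes an illustrative node type with a fixed-size sub_list array; commit_local() is a hypothetical stand-in for the leaf-level commit that publishes local copies.

    #include <stdio.h>

    #define MAX_SUBS 8

    typedef struct NestedNode {
        const char        *name;
        int                nsubs;
        struct NestedNode *sub_list[MAX_SUBS];
    } NestedNode;

    /* Stand-in for the leaf-level commit that makes local copies global. */
    static void commit_local(NestedNode *t) { printf("commit %s\n", t->name); }

    /* Phase 2 commit: recursively commit every subtransaction in sub_list,
       then commit this transaction itself. */
    static void phase_two_commit(NestedNode *t)
    {
        for (int i = 0; i < t->nsubs; i++)
            phase_two_commit(t->sub_list[i]);
        commit_local(t);
    }

    int main(void)
    {
        NestedNode sub1 = { "dt1", 0, { 0 } };
        NestedNode sub2 = { "it1", 0, { 0 } };
        NestedNode top  = { "T",   2, { &sub1, &sub2 } };
        phase_two_commit(&top);      /* prints dt1, it1, then T */
        return 0;
    }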


Discard. In order to discard tardy transactions from the system easily, we maintain a list of top-level transactions in deadline order and check the tardiness of transactions. When a top-level transaction is discarded for missing its deadline, all its subtransactions are discarded as well, according to the Rollback rule of the nested transaction model. The function Discard is defined in the previous section.

Procedure Nested_Discard(Ti)
BEGIN
    IF (Ti has subtransactions) THEN
    BEGIN
        FOREACH subtransaction Ta in Ti.sub_list
            Nested_Discard(Ta);
    END
    Discard(Ti);    /* leaf level */
END

4.2.2 Performance Evaluation

We ran a simulation with the simulation parameters in Table 4.1 to see the effects of the extended AVCC for the nested transaction model. In this simulation, we used the PD priority assignment policy for both 2PL-HP and AVCC. As we can see in Figure 4.9, the extended AVCC shows better performance than 2PL-HP for the nested transaction model, and the result is similar to Figure 3.19 for the flat transaction model.


Figure 4.9. 2PL-HP and extended AVCC for active databases (miss percent versus arrival rate, trs/sec)


CHAPTER 5
CONCLUSIONS

We have developed a cost-conscious dynamic priority assignment policy, CCA, extended it to incorporate the load factor of the system for soft RTDBS (CCA-ALF), and showed the good performance of CCA and CCA-ALF through simulations [21, 12]. To the best of our knowledge, previous abortive methods of real-time transaction scheduling have not considered the dynamic cost, i.e., the cost of rolling back and restarting transactions. This is perhaps not a key consideration in real-time task scheduling, which considers only timing correctness, but in real-time transaction scheduling the cost incurred at run time to keep the database consistent should be considered a key factor.

As a second step, we developed new concurrency control methods, ACC and AVCC, for firm RTDBS and showed the performance of ACC and AVCC through simulations. EDF-HP and AED-HP use blocking and immediate restarts to resolve data conflicts, while OCC variants use deferred restarts. Both approaches have advantages and disadvantages for firm RTDBS. In our study, we have tried to combine the advantages of both approaches by applying the immediate restart (IR) and stop-resume deferred restart (DR-SR) policies together. ACC and AVCC miss fewer transactions than AED and EDF-HP, as shown by our simulations.

Finally, we extended our work on RTDBS to the more complex ARTDBS. In an ARTDBS a transaction performs additional work by executing rules as subtransactions dynamically. A triggered transaction, which is a part of the triggering transaction, needs to finish along with the triggering transaction and has to obey the specified coupling mode. We have studied priority assignment policies for a triggering transaction and a triggered transaction in an ARTDBS and compared their performance.


According to our simulation study, assigning an earlier deadline (higher priority) to subtransactions does not improve the performance of the ARTDBS. For the concurrency control of a firm-deadline ARTDBS we extended AVCC to the nested transaction model. In this dissertation we have not compared the performance of AVCC to OCC variants directly, because the relative performance of 2PL-HP and OCC changes with data contention for firm deadline transactions [22], and 2PL-HP is better than OCC for non-real-time or soft deadline transactions [5, 18]. In real applications such as network service databases [41, 36] or stock market applications [2, 45], the incoming transactions are usually a mixed load of non-real-time, soft, and firm transactions. 2PL-HP and AVCC can easily deal with those situations by assigning priorities so that all priorities of soft deadline transactions are lower than those of firm deadline transactions, and priorities of non-real-time transactions are lower than those of soft deadline transactions. By maintaining an IR version only for non-real-time or soft deadline transactions, and IR and DR-SR versions together for firm deadline transactions, AVCC can handle mixed-load situations without any difficulty. OCC variants, however, cannot easily handle those situations with a simple adjustment: they may need a more complex validation mechanism to handle non-real-time, soft, and firm deadline transactions together, since OCC cannot handle non-real-time and soft deadline transactions efficiently because of its commit-time validation mechanism. Currently we are looking at these practical situations.

Other interesting problems we are going to look at in the near future are conflict resolution on shared data structures and the commit protocol. If a preempted transaction holds semaphores on a B-tree or a B-tree node and an HPT wants to use the B-tree in a conflicting mode, a priority inversion happens; this problem should be solved properly. Another issue is the commit protocol. In an RTDBS a transaction that is ready to commit might still have


some slack time until its deadline. The slack time can be used to increase the performance of the RTDBS.


REFERENCES

[1] Robert Abbott and Hector Garcia-Molina. Scheduling real-time transactions. SIGMOD RECORD, 17(1):71-81, 1988.

[2] Robert Abbott and Hector Garcia-Molina. Scheduling real-time transactions: a performance evaluation. In Proceedings of the 14th VLDB, pages 1-12. ACM, 1988.

[3] Robert Abbott and Hector Garcia-Molina. Scheduling real-time transactions with disk resident data. In Proceedings of the 15th VLDB, pages 385-396. ACM, 1989.

[4] Robert Abbott and Hector Garcia-Molina. Scheduling real-time transactions: Performance evaluation. ACM Transactions on Database Systems, 17(3):513-560, 1992.

[5] Rakesh Agrawal, Michael J. Carey, and Miron Livny. Concurrency control performance modeling: Alternatives and implications. ACM Transactions on Database Systems, 12(4):609-654, 1987.

[6] Rakesh Agrawal and D. DeWitt. Integrated concurrency control and recovery mechanism: Design and performance evaluation. ACM Transactions on Database Systems, 10(4):529-564, 1985.

[7] P. A. Bernstein, V. Hadzilacos, and N. Goodman. Concurrency Control and Recovery in Database Systems. Addison-Wesley, Reading, MA, 1987.

[8] A. Buchmann, D. R. McCarthy, and M. Hsu. Time-critical database scheduling: A framework for integrating real-time scheduling and concurrency control. In Proceedings of the Fifth Conference on Data Engineering, pages 470-480, Feb 1989.

[9] M. J. Carey, R. Jauhari, and M. Livny. On transaction boundaries in active databases: A performance perspective. IEEE Transactions on Knowledge and Data Engineering, 7(1):78-84, 1991.

[10] Michael J. Carey and Michael R. Stonebraker. The performance of concurrency control algorithms for database management systems. In Proceedings of the 10th VLDB, pages 107-118. ACM, 1984.

[11] S. Chakravarthy, B. Blaustein, A. Buchmann, M. Carey, U. Dayal, D. Goldhirsch, M. Hsu, R. Jauhari, R. Ladin, M. Livny, D. McCarthy, R. McKee, and A. Rosenthal. HiPAC: A research project in active, time-constrained database management. Technical Report XAIT-89-02, XEROX, July 1989.

[12] S. Chakravarthy, D. Hong, and T. Johnson. Incorporating load factor into the scheduling of soft real-time transactions. Technical Report UF-CIS-TR-94-024, University of Florida, Dept. of CIS, 1994.

[13] S. Chakravarthy, D. Hong, and T. Johnson. Real-time transaction scheduling: A framework for synthesizing static and dynamic factors. Technical Report UF-CIS-TR-94-008, University of Florida, Dept. of CIS, 1994.


[14] U. Dayal, B. Blaustein, A. Buchmann, U. Chakravarthy, M. Hsu, R. Ledin, D. McCarthy, A. Rosenthal, and S. Sarin. The HiPAC project: Combining active database and timing constraints. SIGMOD RECORD, 17(1):51-70, 1988.

[15] David J. DeWitt, Randy H. Katz, Frank Olken, Leonard D. Shapiro, Michael R. Stonebraker, and David Wood. Implementation techniques for main memory database systems. In Proceedings of the 1984 ACM SIGMOD Int'l Conference on Management of Data, pages 1-8. ACM, 1984.

[16] Paul A. Fishwick. SIMPACK: C-based Simulation Tool Package, Version 2. University of Florida, 1992.

[17] Jim Gray and Andreas Reuter. Transaction Processing: Concepts and Techniques. Morgan Kaufmann, San Mateo, CA, 1993.

[18] Jayant R. Haritsa, Michael J. Carey, and Miron Livny. Dynamic real-time optimistic concurrency control. In Proceedings of the Real-Time Systems Symposium, pages 94-103. IEEE, 1990.

[19] Jayant R. Haritsa, Michael J. Carey, and Miron Livny. On being optimistic about real-time constraints. In Symposium on Principles of Database Systems, pages 331-343. ACM, 1990.

[20] Jayant R. Haritsa, Miron Livny, and Michael J. Carey. Earliest deadline scheduling for real-time database systems. In Proceedings of the Real-Time Systems Symposium, pages 232-242. IEEE, 1991.

[21] D. Hong, T. Johnson, and S. Chakravarthy. Real-time transaction scheduling: A cost-conscious approach. In Proceedings of the 1993 ACM SIGMOD Int'l Conference on Management of Data, pages 197-206. ACM, 1993.

[22] Jiandong Huang, John A. Stankovic, Krithi Ramamritham, and Don Towsley. Experimental evaluation of real-time optimistic concurrency control schemes. In Proceedings of the 17th VLDB, pages 35-46. ACM, 1991.

[23] B. Kao and H. Garcia-Molina. Subtask deadline assignment for complex distributed soft real-time tasks. Technical Report STAN-CS-93-1491, Stanford University, 1993.

[24] Ben Kao and Hector Garcia-Molina. Deadline assignment in a distributed soft real-time system. In Proceedings of the 13th International Conference on Distributed Computing Systems, pages 428-437. IEEE, 1993.

[25] Woosaeng Kim and Jaideep Srivastava. Enhancing real-time DBMS performance with multiversion data and priority based disk scheduling. In Proceedings of the Real-Time Systems Symposium, pages 222-231. IEEE, 1991.

[26] Tei-Wei Kuo and Aloysius K. Mok. Application semantics and concurrency control of real-time data-intensive applications. In Proceedings of the Real-Time Systems Symposium, pages 35-45. IEEE, 1992.

[27] Tei-Wei Kuo and Aloysius K. Mok. SSP: a semantics-based protocol for real-time data access. In Proceedings of the Real-Time Systems Symposium, pages 76-86. IEEE, 1993.

[28] Juhnyoung Lee and Sang H. Son. Using dynamic adjustment of serialization order for real-time database systems. In Proceedings of the Real-Time Systems Symposium, pages 66-75. IEEE, 1993.

[29] Yi Lin and Sang H. Son. Concurrency control in real-time databases by dynamic adjustment of serialization order. In Proceedings of the Real-Time Systems Symposium, pages 104-112. IEEE, 1990.

[30] C.L. Liu and J.W. Layland. Scheduling algorithms for multiprogramming in a hard real-time environment. Journal of the ACM, 20:46-61, 1973.

[31] J.E. Moss. Nested Transactions: An Approach to Reliable Distributed Computing. The MIT Press, 1985.

[32] Hweehwa Pang, Miron Livny, and Michael J. Carey. Transaction scheduling in multiclass real-time database systems. In Proceedings of the Real-Time Systems Symposium, pages 23-34. IEEE, 1992.

[33] B. Purimetla, R.M. Sivasankaran, J.A. Stankovic, and K. Ramamritham. A study of distributed real-time active database applications. Technical Report UM-CS-93-010, University of Massachusetts at Amherst, 1993.

[34] B. Purimetla, R.M. Sivasankaran, J.A. Stankovic, K. Ramamritham, and D. Towsley. A priority assignment in real-time active databases. Technical Report UM-CS-94-029, University of Massachusetts at Amherst, 1994.

[35] Krithi Ramamritham. Real-time databases. International Journal of Distributed and Parallel Databases, 1:1-30, 1993.

[36] N. Redding. Network Services Databases. In IEEE Global Telecommunications Conference, volume 3, pages 1336-1340. IEEE, 1986.

[37] Lui Sha. Concurrency control for distributed real-time databases. SIGMOD RECORD, 17(1):82-98, 1988.

[38] Lui Sha. Modular concurrency control and failure recovery. IEEE Transactions on Computers, 37(2):146-159, 1988.

[39] Lui Sha, Ragunathan Rajkumar, and J.P. Lehoczky. Priority inheritance protocols: An approach to real-time synchronization. IEEE Transactions on Computers, 39:1175-1185, 1990.

[40] Lui Sha, Ragunathan Rajkumar, Sang Hyuk Son, and Chun-Hyun Chang. A real-time locking protocol. IEEE Transactions on Computers, 40(7):793-800, 1991.

[41] R.M. Sivasankaran, B. Purimetla, J.A. Stankovic, and K. Ramamritham. Network services database: a Distributed Active Real-Time Database (DARTDB) application. In Proceedings of the IEEE Workshop on Real-Time Applications, pages 184-187. IEEE, 1993.

[42] John A. Stankovic and Wei Zhao. On real-time transactions. SIGMOD RECORD, 17(1):4-18, 1988.

[43] Y.C. Tay. A behavioral analysis of scheduling by earliest deadline. Technical Report No. 532, Department of Mathematics, National University of Singapore, 1992.

[44] Y.C. Tay, R. Suri, and N. Goodman. Locking performance in centralized databases. ACM Transactions on Database Systems, 10(4):415-462, 1985.

[45] John Voelcker. How computers helped stampede the stock market. IEEE Spectrum, 24:30-33, 1987.

[46] Wei Zhao, Krithi Ramamritham, and John A. Stankovic. Preemptive scheduling under time and resource constraints. IEEE Transactions on Computers, 36(8):949-960, 1987.

[47] Wei Zhao, Krithi Ramamritham, and John A. Stankovic. Scheduling tasks with resource requirements in hard real-time systems. IEEE Transactions on Software Engineering, 13(5):225-236, 1987.

BIOGRAPHICAL SKETCH

Dong-kweon Hong was born on June 11, 1960, in Taegu, South Korea. He received his Bachelor of Engineering degree in computer sciences from Kyung-Pook National University, South Korea. After finishing his undergraduate degree in 1985, he worked as a Research Engineer at the Electronics and Telecommunications Research Institute (ETRI) in Taejeon, South Korea. In the fall of 1990, he began his graduate studies with a major in computer and information sciences at the University of Florida. He received his Master of Science degree in August 1992 and will receive his Doctor of Philosophy degree in computer and information sciences and engineering from the University of Florida, Gainesville, in December 1995.