





Group Title: Department of Computer and Information Science and Engineering Technical Reports
Title: Incorporating load factor into the scheduling of soft real-time transactions
Permanent Link: http://ufdc.ufl.edu/UF00095293/00001
 Material Information
Title: Incorporating load factor into the scheduling of soft real-time transactions
Series Title: Department of Computer and Information Science and Engineering Technical Report ; TR-94-024
Physical Description: Book
Language: English
Creator: Chakravarthy, Sharma
Hong, D.
Johnson, T.
Publisher: Department of Computer and Information Sciences, University of Florida
Place of Publication: Gainesville, Fla.
Publication Date: April, 1994
Copyright Date: 1994
 Record Information
Bibliographic ID: UF00095293
Volume ID: VID00001
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.















University of Florida
Computer and Information Sciences


Department of Computer and Information Sciences
Computer Science Engineering Building
University of Florida
Gainesville, Florida 32611


Incorporating Load Factor into the
Scheduling of Soft Real-Time
Transactions

S. Chakravarthy
D. Hong
T. Johnson
email: sharma@cis.ufl.edu

Tech. Report UF-CIS-TR-94-024
April 1994
(Submitted for publication)











Contents

1 Introduction
2 Previous Work
3 Motivation for our approach
4 CCA-ALF for soft deadline
    4.1 Priority Assignment
    4.2 Scheduling algorithm
5 EDF-CR-ALF for soft deadline
6 Performance Evaluation
    6.1 Main memory DB
        6.1.1 Effect of Arrival rate
        6.1.2 Effect of multiclass (Transaction mix)
    6.2 Disk Resident DB
        6.2.1 Effect of Arrival Rate
7 Conclusions













Incorporating Load Factor into the Scheduling of Soft

Real-Time Transactions


S. Chakravarthy D. Hong T. Johnson
Database Systems Research and Development Center
Computer and Information Sciences Department
University of Florida, Gainesville, FL 32611
Email: {sharma, dh2, ted}@cis.ufl.edu


Abstract

In real-time databases (as opposed to real-time systems), transactions must satisfy the ACID
properties in addition to satisfying the timing constraints specified for each transaction (or
task). EDF-HP is the simplest and most straightforward approach that uses only the deadline
information of each transaction.
In this paper, we propose a real-time transaction scheduling algorithm, CCA-ALF (Cost
Conscious Approach with Average Load Factor), which uses both static (e.g., deadline) and
dynamic (e.g., system load) information. We compare the performance of EDF-HP, CCA-ALF,
and EDF-CR-ALF, which is a variant of EDF-CR. As our approach adjusts the priority
assignment policy to fluctuations in the system load, CCA-ALF adapts well to changes
in the system load without causing an excessive number of transaction restarts. Our simulations
indicate: i) CCA-ALF is better than EDF-HP and EDF-CR-ALF in terms of miss percent and
mean lateness, ii) CCA-ALF is more fair than EDF-HP and EDF-CR-ALF, and iii) CCA-ALF
adapts well to the changes of the system load.


1 Introduction

The main focus of research in the real-time systems area has been the problem of scheduling tasks
to meet the timing constraints associated with each task, while the focus in the database area has
been concurrency control to guarantee database consistency and recovery in the presence of various
kinds of failures. Design of a scheduling policy for a real-time database system (RTDBS) entails
synergistically combining techniques from both areas and fine-tuning them to obtain a policy that
meets the requirements of scheduling transactions in RTDBS. This dual requirement makes real-
time transaction scheduling more difficult than task scheduling in real-time systems. Typically,
applications in real-time systems do not share disk-resident data. Even when they share data, the
consistency of shared data is not managed by the system but by the application program. For the
assumptions used in real-time systems, it is possible to predict some of the characteristics of tasks
needed for the design of scheduling algorithms. As a result, scheduling algorithms [ZRS87b, ZRS87a]
used in current real-time systems assume a priori knowledge of tasks, such as arrival time, deadline,
resource requirement, and worst case (CPU) execution time. For database applications, on the other
hand, many sources of unpredictability exist [Ram93] which makes it difficult to predict some of
the resource requirements for transactions that need to meet time constraints.









Transactions with deadlines have been categorized into hard deadline, soft deadline, and firm
deadline transactions. Transactions with hard deadlines have to meet their deadlines; otherwise,
the system does not meet the specification. Typically, transactions that are in this category have
catastrophic consequences if their deadlines are not met. Sometimes contingency measures may be
included as an alternative. Soft real-time transactions have time constraints, but there may still be
some residual benefit for completing the transaction after its deadline. Conventional transactions
with response time requirements can be considered as soft real-time transactions. In contrast to the
above two, firm transactions are those which need not be considered any more if their deadlines are
not met, as there is no value to completing the transaction after its deadline. Typically applications
that have a definite window (e.g., banking and stock market applications) within which transactions
need to be executed come under this category. In this paper, we view a real-time database system
as either a memory- or disk-resident transaction processing system whose workload is composed of
transactions with individual timing constraints. A timing constraint is expressed in the form of a
deadline, and we consider only soft deadline transactions in this paper.
Repetitive workload is a common property in real-time and transaction processing systems.
Thus in a real-time transaction processing system, users do not run arbitrary programs, but rather
request the system to execute specific functions out of a predefined set. Each function is an
instance of a transaction type. That is, RTDBS invokes a transaction program that implements
the requested function. The random aspect is the sequence and the frequency with which programs
are invoked [GR93]. Use of canned transactions and queries whose read and write sets can be
predicted beforehand is a step in the right direction and the data items accessed by a transaction
are likely to be known a priori once its functionality is known [Ram93].
Although it is possible to make conservative (or worst case) estimates (e.g., read and write sets
gleaned from a transaction), it is, in general, not possible to predict a priori the interference among
transactions. Although serial execution avoids interference, in the presence of deadlines, completion
of transactions without violating timing constraints is completely determined by the arrival order.
Knowledge of transaction semantics, such as write-only transactions, update transactions and read-
only transactions can also affect performance if they are taken into account in the development of
a scheduling policy. Hence, a synthesis of pre-analyzed information and the use of dynamic costs
obtained from the actual execution seems to be a viable approach for obtaining scheduling policies
to meet the requirements of transaction executions in real-time databases.


2 Previous Work

In addition to the approaches taken for real-time systems, there is a large body of work on RTDBS
that can be summarized as follows:

* Priority scheduling and concurrency control

1. 2PL-HP, 2PL-WP, 2PL-CR [AGM89]
2. Optimistic concurrency control (OCC) [HCL90b]
3. Multiversion [KS91]
4. Mixed Integrated concurrency control [LS90]
5. Semantics and correctness I;.l'i']









* Buffer management [CJL89, RMM90]

* IO scheduling [AGM89, KS91]

In this paper, we focus on priority assignment of concurrent transactions. Concurrency control
based real-time database (time-critical database) scheduling algorithms combine various properties
of time-critical schedulers with properties of concurrency control algorithms [AGM89, BMH89,
CBB+89, HLC91, si...c- SZ'8, HSRT91]. Priority scheduling without knowing the data access
pattern is presented as a representative of algorithms with incomplete knowledge of resource
requirements. The scheduling policies presented in [AGM89, HSRT91, HLC91, Z7'] combine either
2-phase locking or optimistic concurrency control (OCC) with time-critical schedulers. EDF-HP
(Earliest Deadline First with High Priority), LSF-HP (Least Slack First with HP), EDF-WP (EDF
with Wait Promote), Virtual Clock, and Pairwise Value Function [Z7''] are combined with 2-phase
locking. As a variant of single version 2-phase locking, real-time multiversion concurrency control [KS91]
has been introduced to increase concurrency and adjust the serialization order dynamically.
An OCC scheme with a deadline and transaction length based priority assignment scheme is
presented in [HSRT91]. An OCC with adaptive EDF has also been proposed [HLC91]. With OCC
approach, a policy is needed to resolve the access conflicts during the validation phase. Some
of the policies proposed are commit (always let the transaction being validated commit), priority
abort (abort the validating transaction only if its priority is less than that of each conflicting
transaction), priority wait (wait for higher priority transactions to complete), and opt-sacrifice
(restart the validating transaction if at least one of the transactions in the conflict set has a higher
priority). OCC schemes display better performance for firm real-time transactions [HCL90a]. Lin
and Son [LS90] have proposed a new concurrency control algorithm which is based on mixed
integrated concurrency control method [BHG87] that adjusts the serialization order dynamically.
Priority scheduling with some a priori knowledge is introduced as another approach in [AGM89,
BMH89, il...- SRSC91, HJC93]. Conditional Restart (CR) [AGM89] uses the estimated execution
time of transactions to make a decision on blocking and aborts, and CCA [HJC93] uses the data access
pattern to estimate the dynamic costs incurred by the interference among transactions. Conflict
avoiding nonpreemptive method and Hybrid algorithms which use conflict avoiding schemes in the
non-overload case and CR conflict resolution method in the overload case have been proposed in
[BMH89]. Static priority assignment based Priority Ceiling Protocol (PCP) using priority inheri-
tance with exclusive lock and read/write PCP have been proposed in [li.,'"- SRSC91].


3 Motivation for our approach

The primary motivation for our approach [CHJ94] is to answer the question "What kind of infor-
mation is relevant and how to meaningfully incorporate it into the design of a real-time scheduling
algorithm?". Various types of information are useful in different ways. Intuitively we can do better
if we have additional knowledge but the improvement is predicated upon the use of the knowledge
appropriately. Figure 1 illustrates the classification of various scheduling algorithms proposed in
the literature with respect to the type of knowledge used.

Type 0 Does not assume any a priori knowledge. The only available timing information is the deadline
(EDF-HP [AGM89]).

Type 1 Deadline and data access pattern are available (CCA [HJC93]).

Type 2 Deadline and estimated execution time are assumed to be available (EDF-CR [AGM89]).

Type 3 Data access pattern and static transaction priorities are assumed to be available (Priority
Ceiling [SRL90]).


[Figure 1 (diagram): Type 0: EDF-HP, AEDF-HP; Type 1: CCA; Type 2: EDF-CR; Type 1 & Type 2: CCA-ALF, EDF-CR-ALF]

Figure 1: Knowledge type and corresponding approaches

EDF-HP is the simplest and most straightforward approach for an RTDBS. EDF priority as-
signment policy minimizes the number of late transactions when systems are lightly loaded. The
performance, however, steeply degrades in overloaded systems. There have been several approaches
to overcome the shortcoming of EDF and they can be grouped into:

1. Use overload detection and management [HLC91].

2. Delay the buildup of overload [AGM89, HJC93].

Overload detection mechanisms for real-time tasks are quite straightforward as one can assume
the availability of arrival time, execution time, resource requirement and deadline [DLT85]. For
database applications, however, arrival and execution times of transactions are usually not available
or not correct due to database characteristics. AED (Adaptive Earliest Deadline) [HLC91] priority
assignment for an RTDBS uses a feedback mechanism that detects overload conditions and modifies
transaction priority assignment policy accordingly. AED, however, is only applicable to firm
real-time systems. AED uses past history that has been gathered dynamically, rather than a priori
knowledge, to detect overload.
Other approaches [AGM89, HJC93] use additional information to improve EDF-HP further.
Even though these approaches do not have a specific overload management mechanism, their
methods improve the performance of soft RTDBS by delaying the buildup of overload. The basic idea is
to save valuable system resources by not blindly aborting partly executed conflicting transactions.
EDF-CR uses type 2 information while CCA uses type 1 information to improve EDF-HP
further, and our experiments in [CHJ94] have shown that CCA is better than EDF-CR for soft
real-time systems when resource time is used as the estimated execution time for EDF-CR. Type
2 information is important for RTDBS, but the estimate should be adequately combined with system
load. We think that type 2 information is very hard to obtain without using type 1 information
and load factor, because the execution time of a transaction varies with changes in system load,
especially in soft RTDBS. In firm real-time systems the estimation error of a transaction is bounded
by its deadline, as transactions that miss the deadline are removed from the system. However,
the estimation error is not bounded in soft RTDBS, where we do not remove missed transactions.
The estimated execution time of a transaction can be roughly calculated from its estimated resource time
and the current system load while it is in the system. System load might be traced
with the past history, with the queue length of the system resources, or with both.
Based on the above observations we suggest an adaptive cost conscious approach, CCA-ALF
(Cost Conscious Approach with Average Load Factor), which combines type 1 information, type 2
information, and a load detection mechanism. CCA-ALF uses type 1 information to calculate the
resource time of a transaction and then anticipates system load by using the resource time and the
actual execution time. CCA-ALF changes its priority assignment policy and conflict resolution policy
in accordance with changes in the system load.
In a real-time database, irrespective of whether it is memory- or disk-resident, the (wall clock)
response time has two distinct components: Tstatic, the time needed to execute a transaction in an
isolated environment and Tdynamic, the time spent in waiting (both I/O and concurrency related)
as well as abort/restart overhead. Tstatic is dependent on the semantics of the transaction (e.g.,
data values accessed and branch points) and is relatively straightforward to estimate. Tdynamic, on
the other hand, is dependent on the current state of the system and on future events, i.e., on the
transactions that are currently in the system and the transactions that will arrive in the future. In
the database context, Tdynamic is extremely difficult to compute or even estimate as it is not only
dependent on the resources consumed so far but also on the resources required for its completion
which may be affected by future events. Furthermore, Tdynamic is sensitive to the transaction
mix and can vary considerably when the transaction mix changes. Nevertheless, the inclusion of
an approximate dynamic cost as part of the strategy for meeting timing requirements is likely to
perform better than those where the dynamic information is not included at all. Thus CCA-ALF
uses system load information to estimate the dynamic time of a transaction more accurately.
With type 1 and type 2 information together, we propose CCA-ALF and EDF-CR-ALF that
are extensions of CCA and EDF-CR respectively. In Figure 1 we can see the knowledge type and
the evolution of scheduling algorithms.


4 CCA-ALF for soft deadline

We assume that CCA-ALF uses strict 2-phase locking, exclusive lock only, and High Priority conflict
resolution method. With the type 1 information which is available by using pre-analysis or pre-
execution the conflict and safety relationship for CCA-ALF can be inferred in a straightforward
manner.

hasaccessed(TN) Set of data items that a transaction N has accessed from the beginning of the
transaction.

mightaccess(TN) Set of data items that a transaction N might access till its completion.

With mightaccess and hasaccessed, we can calculate the conflict and safety relations as follows:

Transactions $T_N$ and $T_M$ conflict iff $mightaccess(T_N) \cap mightaccess(T_M) \neq \emptyset$.

Transaction $T_N$ is unsafe with respect to $T_M$ iff $hasaccessed(T_N) \cap mightaccess(T_M) \neq \emptyset$.
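The two relations reduce to set intersections, which the following minimal Python sketch illustrates. The `Txn` class, its field names, and the sample transactions are illustrative assumptions, not identifiers from the paper.

```python
from dataclasses import dataclass, field


@dataclass
class Txn:
    # Hypothetical record type; the paper only names the two sets.
    name: str
    has_accessed: set = field(default_factory=set)  # items accessed so far
    might_access: set = field(default_factory=set)  # items possibly accessed before completion


def conflicts(tn: Txn, tm: Txn) -> bool:
    """TN and TM conflict iff their mightaccess sets intersect."""
    return bool(tn.might_access & tm.might_access)


def is_unsafe(tn: Txn, tm: Txn) -> bool:
    """TN is unsafe with respect to TM iff TN has already accessed
    an item that TM might access."""
    return bool(tn.has_accessed & tm.might_access)


t1 = Txn("T1", has_accessed={"x"}, might_access={"x", "y"})
t2 = Txn("T2", might_access={"y", "z"})
t3 = Txn("T3", might_access={"x"})
```

In this example T1 and T2 conflict (both might access `y`), but T1 is not unsafe with respect to T2 because the only item T1 has touched so far is `x`; T1 is unsafe with respect to T3.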


4.1 Priority Assignment

CCA-ALF uses a dynamic priority assignment policy with a continuous evaluation method which
evaluates the priority several times during the execution of a transaction to capture all the dynamic
features as the transaction progresses. If the transaction $T_a$, which is selected to be run next, conflicts
with m transactions that are unsafe with $T_a$, we might lose

$$Timelost(T_a) = \sum_{t \in M} (rollback_t + exec_t)$$

$$M = \{t \mid T_t \text{ is unsafe with } T_a\}$$

where $exec_t$ is the effective service time of $T_t$ and $rollback_t$ is the time required to roll back $T_t$. If
the value of $Timelost(T_a)$ is large, executing $T_a$ wastes system resources. We characterize the time
lost as the penalty of conflict.

Penalty of conflict is the value $Timelost(T_a)$, which is the sum of the effective service time and
rollback time of the transactions that must be aborted and rolled back to execute $T_a$ to its
commit point without interruption.

The notion of the penalty of conflict, described above, is introduced into our CCA-ALF dynamic
priority computation formula as follows. If $Pr(T_i)$ is the priority of transaction $T_i$ and $d(T_i)$ is the
deadline of transaction $T_i$, then

$$Pr(T_i) = -(d(T_i) + w \times Timelost(T_i))$$

The portion of Timelost in CCA-ALF priority formula is controlled by the value of w (penalty
weight). Although the value of w over some ranges showed good performance in [HJC93], we can
improve the performance by fine-tuning the value of w since no priority assignment policy shows
good performance in different load situations in a consistent manner. Since the value of Timelost
consists of effective service time of conflicting transactions, it does not include system load in it.
One way to make this approach adaptive to the system load is to adjust the value of w using the
load of the system.
Transaction execution time varies severely with changes in system load in soft RTDBS. By
computing, for each completed transaction, the ratio of its actual execution time (transaction
completion time minus transaction arrival time) to its resource time (the Load Factor of the
transaction), we might be able to predict the system load. Resource time can be derived from type 1
information by assuming that the processing time for each accessed data item does not change enormously.

$$\text{Resource Time} = \text{Number of data accesses} \times cpu\_time + \text{Number of disk reads} \times disk\_time$$

$$\text{Actual Execution Time} = \text{completion time} - \text{arrival time}$$

$$\text{Load Factor (LF)} = \frac{\text{Actual Execution Time}}{\text{Resource Time}}$$










The LF of one transaction cannot represent the system load properly. We use the ALF (Average Load
Factor) of previously completed transactions to represent the current system load. The LF of
transactions that finished a long time ago cannot contribute to the current system load either. In order to
trace the recent ALF, we maintain the LF of the N most recently finished transactions in the
system and take their average as the ALF.

$$\text{Average Load Factor (ALF)} = \frac{\sum_{i=1}^{N} LF(T_i)}{N}$$


With the ALF value the priority formula of this approach is as follows:


$$Pr(T_i) = -(d(T_i) + (penalty\_weight \times ALF \times Timelost(T_i)))$$

In lightly loaded situations the value of ALF is close to 1 and our priority assignment policy
closely resembles EDF, which shows good performance when the system is lightly loaded.
As the system load increases, the effect of the deadline in the priority formula decreases due to the
increase of ALF. Thus in heavily loaded situations the results obtained using our priority formula are
comparable to Random Priority (RP) [HCL90a], since the value of Timelost overrides the effect of the
deadline in the formula. Another effect of ALF in the formula is that the conflict resolution policy changes
from an abortive method to a nonabortive method in heavy load situations, because the priorities of
transactions that conflict with partially executed transactions decrease due to the increase of
ALF.
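The load factor bookkeeping and the priority formula can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function and class names are hypothetical, the window of 20 matches the circular list size used in the simulations, and returning an ALF of 1 before any transaction has completed is an assumption of this sketch rather than something the text specifies.

```python
from collections import deque


def resource_time(n_accesses, n_disk_reads, cpu_time, disk_time):
    """Static resource time derived from type 1 information."""
    return n_accesses * cpu_time + n_disk_reads * disk_time


def load_factor(arrival, completion, res_time):
    """LF = actual execution time / resource time."""
    return (completion - arrival) / res_time


class AlfTracker:
    """Keeps the LF of the N most recently finished transactions."""

    def __init__(self, window=20):
        self._lfs = deque(maxlen=window)  # oldest LF drops out automatically

    def record(self, lf):
        self._lfs.append(lf)

    def alf(self):
        # Assumption: with no history yet, treat the system as unloaded.
        return sum(self._lfs) / len(self._lfs) if self._lfs else 1.0


def priority(deadline, time_lost, alf, penalty_weight=1.0):
    """Pr = -(deadline + penalty_weight * ALF * Timelost)."""
    return -(deadline + penalty_weight * alf * time_lost)


tracker = AlfTracker(window=3)
for lf in (1.0, 2.0, 3.0):
    tracker.record(lf)
light = priority(deadline=100.0, time_lost=10.0, alf=1.0)
loaded = priority(deadline=100.0, time_lost=10.0, alf=tracker.alf())
```

With the same deadline and penalty of conflict, the priority drops from -110 at ALF = 1 to -120 at ALF = 2, which is the mechanism by which conflicting transactions are pushed back as load grows.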


4.2 Scheduling algorithm

The procedure "tr-arrival-sched" is invoked whenever a new transaction arrives and the procedure
"tr-finish-sched" is invoked whenever the running transaction finishes. The procedure
"tr-arrival-sched" uses ALF and the penalty of conflict (an approximation of dynamic cost) of transactions, and
the procedure "tr-finish-sched" inserts LF and updates ALF. The sleep queue holds transactions
that are blocked and the partially executed transaction list (P_list) links all transactions that are
executed partially. We maintain a circular list of size N to keep track of the LF of the N most recently
finished transactions. The ALF introduced in the priority formula is used to weigh the contribution of
the penalty of conflict to the computed priority value. The ALF value ranges from 1 to
some larger positive value.

Function Pr
begin
    calculate penalty of conflict;
    return( -(deadline + penalty_weight * ALF * penalty of conflict));
end


In the following procedures, TA is a new transaction and TH is the highest priority transaction.

Procedure tr-arrival-sched
begin
    if Pr(TH) < Pr(TA)
    then
        make TA the new TH;
        schedule TH;
    else
        add TA to the ready queue;
        schedule TH;
end


Procedure tr-finish-sched
begin
    insert LF into the circular list;
    update ALF;
    foreach transaction in the ready queue
    begin
        assign new priority;
        choose the highest priority transaction
        and make it TH;
    end
end
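As a rough illustration of how the two procedures fit together, here is a small Python sketch. It is a simplification under several assumptions that the pseudocode does not state: the `Txn` and `Scheduler` names are hypothetical, a preempted TH is put back on the ready queue, rollback time is a single constant, lock handling and the sleep queue are omitted, and ALF is taken as 1 until the first transaction finishes.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(eq=False)
class Txn:
    name: str
    deadline: float
    might_access: set
    has_accessed: set = field(default_factory=set)
    exec_time: float = 0.0  # effective service time received so far


class Scheduler:
    def __init__(self, penalty_weight=1.0, rollback=5.0, window=20):
        self.w = penalty_weight
        self.rollback = rollback
        self.lfs = deque(maxlen=window)  # circular list of recent LF values
        self.ready = []                  # ready queue
        self.running = None              # TH

    def alf(self):
        return sum(self.lfs) / len(self.lfs) if self.lfs else 1.0

    def time_lost(self, ta):
        # Penalty of conflict: work lost by transactions unsafe with ta.
        others = self.ready + ([self.running] if self.running else [])
        return sum(t.exec_time + self.rollback for t in others
                   if t is not ta and t.has_accessed & ta.might_access)

    def pr(self, t):
        return -(t.deadline + self.w * self.alf() * self.time_lost(t))

    def tr_arrival_sched(self, ta):
        if self.running is None or self.pr(self.running) < self.pr(ta):
            if self.running is not None:
                self.ready.append(self.running)  # assumption: old TH stays ready
            self.running = ta
        else:
            self.ready.append(ta)
        return self.running

    def tr_finish_sched(self, lf):
        self.lfs.append(lf)  # insert LF; ALF is recomputed from the list
        self.running = None
        if self.ready:
            self.running = max(self.ready, key=self.pr)  # re-evaluate priorities
            self.ready.remove(self.running)
        return self.running


s = Scheduler()
a = Txn("A", 100.0, {"x"}, has_accessed={"x"}, exec_time=30.0)
b = Txn("B", 90.0, {"x"})   # earlier deadline, but conflicts with A's work
c = Txn("C", 95.0, {"y"})   # earlier deadline, no conflict
r1 = s.tr_arrival_sched(a)
r2 = s.tr_arrival_sched(b)
r3 = s.tr_arrival_sched(c)
r4 = s.tr_finish_sched(2.0)
```

In this run B does not preempt A even though its deadline is earlier, because aborting A would waste 35 time units; C, which touches disjoint data, does preempt A. After C finishes with LF = 2, A is chosen over B because the doubled ALF doubles B's penalty.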


Disk I/O introduces new problems in real-time transaction scheduling. There are several choices
when I/O wait occurs. We have considered the following 3 choices:

1. Blindly pick the highest priority transaction among ready transactions.

2. Pick the highest priority transaction among transactions that are ready and do not conflict
with a partially executed higher priority transaction.

3. Pick the highest priority transaction among transactions that are ready and do not conflict
with any partially executed transaction.

Of the above, we found that the second one comes out as the best for soft real-time
transactions [CHJ94] and applied it to CCA-ALF and EDF-CR-ALF, where type 1 information is available.
Consider the following scenario: Transaction T1 is blocked and is waiting for an I/O completion.
The next highest priority transaction, T2, gets the CPU and starts executing so as not to waste
the CPU. If T2 conflicts with T1, then T2 performs a noncontributing execution because it must be
rolled back when T1 unblocks. This situation is worse than the situation in which no transaction
is selected to execute during T1's I/O wait time, because of the cost incurred in rolling T2 back. If
the third highest priority transaction, T3, accesses a data set disjoint with that of T1 and T2, then
T3 is the better choice. In our approach we select T3 rather than T2 during T1's I/O wait using the
type 1 information. Even though the third choice also prevents noncontributing execution, it might
limit the concurrency of the system.
A noncontributing execution is defined as a lower priority transaction's execution during
the I/O wait of higher priority transaction that has to be rolled back when the higher priority
transaction finishes its I/O [HJC93].
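The second choice (skip ready transactions that conflict with a partially executed higher priority transaction) can be sketched as a filter followed by a maximum. The function name, the use of plain strings as transaction identifiers, and the priority numbers below are illustrative assumptions; conflicts are tested by intersecting mightaccess sets.

```python
def pick_during_io_wait(ready, partially_executed, pr, conflicts):
    """Return the highest priority ready transaction that does not conflict
    with any partially executed transaction of higher priority, or None."""
    candidates = [
        t for t in ready
        if not any(p != t and pr(p) > pr(t) and conflicts(t, p)
                   for p in partially_executed)
    ]
    return max(candidates, key=pr, default=None)


# Scenario from the text: T1 waits for I/O, T2 conflicts with T1, T3 is disjoint.
might_access = {"T1": {"x", "y"}, "T2": {"y"}, "T3": {"z"}}
prio = {"T1": 3, "T2": 2, "T3": 1}


def conflicts(a, b):
    return bool(might_access[a] & might_access[b])


chosen = pick_during_io_wait(["T2", "T3"], ["T1"], prio.get, conflicts)
idle = pick_during_io_wait(["T2"], ["T1"], prio.get, conflicts)
```

Here `chosen` is T3, matching the scenario, and `idle` is `None`: when only T2 is ready, the CPU is left idle rather than spent on a noncontributing execution.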
We expect that CCA-ALF works like EDF-HP when the system is lightly loaded and like RP-Wait
(Random Priority with Wait) when the system is heavily loaded. When the system is heavily loaded,
the product of ALF and Timelost dominates the deadline effect of many transactions. Thus
their priorities are randomized and the conflict resolution policy changes to Wait. In heavily
loaded situations RP has shown better performance than EDF [HLC91].









5 EDF-CR-ALF for soft deadline

EDF-CR uses the estimated execution time of a transaction when it decides whether to abort a
conflicting lower priority transaction or block a higher priority transaction. When we deal with soft
deadlines, the actual execution time varies considerably with changes in the system
load. Thus, it seems naive for EDF-CR to use a statically estimated execution time which does not
consider changes in the system load at all. Our simulations in [CHJ94] showed that EDF-CR
is worse than EDF-HP when the resource time of a transaction is used as the estimated execution
time.
We can estimate the actual execution time dynamically by using additional information available
about transactions. With type 1 and type 2 information, we can estimate the remaining execution
time of a transaction dynamically by using the statically calculated resource time and the dynamically
traced ALF. The slack time ($S_r$) of a lock requesting higher priority transaction $T_r$ and the remaining
execution time ($R_h$) of a lock holding transaction $T_h$ can be dynamically calculated by using the
following formulas.

$$\text{Remaining Execution Time (RET)} = \text{Remaining Resource Time} \times ALF$$

$$S_r(T_r) = d(T_r) - (\text{current time} + RET(T_r))$$

$$R_h(T_h) = RET(T_h)$$
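A small Python sketch shows how these quantities could drive the block-or-abort decision. The decision rule used here (block the requester when its slack covers the holder's remaining execution time, otherwise abort the holder) is the usual conditional restart rule and is an assumption of this sketch, as are the function names and the sample numbers.

```python
def remaining_execution_time(remaining_resource_time, alf):
    """RET = remaining resource time scaled by the current ALF."""
    return remaining_resource_time * alf


def slack(deadline, now, remaining_resource_time, alf):
    """S_r = d - (current time + RET) for the lock requester."""
    return deadline - (now + remaining_execution_time(remaining_resource_time, alf))


def resolve_conflict(req_deadline, req_remaining, holder_remaining, now, alf):
    """Return 'block' if the requester can afford to wait for the holder,
    otherwise 'abort' (abort the lower priority lock holder)."""
    s_r = slack(req_deadline, now, req_remaining, alf)
    r_h = remaining_execution_time(holder_remaining, alf)
    return "block" if s_r >= r_h else "abort"


light = resolve_conflict(req_deadline=500, req_remaining=100,
                         holder_remaining=50, now=100, alf=1.0)
heavy = resolve_conflict(req_deadline=500, req_remaining=100,
                         holder_remaining=50, now=100, alf=3.0)
```

With ALF = 1 the requester has 300 time units of slack against 50 needed by the holder, so it blocks. With ALF = 3 the slack shrinks to 100 while the holder needs 150, so the holder is aborted, which is the High Priority behaviour.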

We call this approach EDF-CR-ALF (EDF-CR with ALF). EDF-CR-ALF uses a dynamically
estimated execution time obtained by combining type 1 information and ALF. We expect EDF-CR-ALF to
perform better than EDF-HP in lightly loaded situations, but it is likely to be almost the same
as EDF-HP when the system is heavily loaded. Under heavy load, most of the transactions in the
system do not have enough slack time to wait for the completion of the conflicting lower priority
transaction. The advantage of EDF-CR-ALF over EDF-CR is that EDF-CR-ALF is never worse
than EDF-HP in any situation.


6 Performance Evaluation

In order to evaluate the performance of the CCA-ALF algorithm described in this paper, two
simulations of a real-time transaction scheduler were implemented (using C language and SIMPACK
simulation package [Fis92]) for main memory- and disk-resident databases as shown in Figure 2.
The parameters used in the simulations are shown in Table 1. In these simulations, transactions
enter the system according to a Poisson process with arrival rate $\lambda$ (i.e., exponentially distributed
inter-arrival times with mean value $1/\lambda$), and they are ready to execute when they enter the
system (i.e., release time equals arrival time). The number of objects updated by a transaction is
chosen uniformly from the range of min_size and max_size and the actual database items are chosen
uniformly from the range of db_size.
After accessing an object a transaction spends cpu_time in order to do some work with or on
that object and then it accesses the next object. The assignment of a deadline is controlled by the
resource time of a transaction and two parameters min_slack and max_slack which set, respectively,
a lower and upper bound of percentage of slack time relative to the resource time. A deadline
is calculated by adding resource time and slack time. Slack time is calculated by multiplying













Figure 2: Open Network Model for the simulation


Parameter        Meaning
db_size          Number of objects in database
max_size         Size of largest transaction
min_size         Size of smallest transaction
i/o_time         I/O time for accessing an object (read/write)
cpu_time         CPU computation per object accessed
disk_prob        Probability that an object is accessed from disk
update_prob      Probability that an object accessed is updated
min_slack        Minimum slack
max_slack        Maximum slack
restart_time     Time needed to rollback and restart
penalty_weight   Weight of penalty of conflict

Table 1: Parameters and their meanings









slack percent and resource time. Slack percent is chosen uniformly from the range of min_slack to
max_slack.

Deadline = arrival time + resource time x (1 + slack percent)
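The deadline assignment can be written as a short Python sketch. It assumes the main-memory case, where resource time is the number of objects times cpu_time, and treats min_slack and max_slack as percentages; the function name and the `rng` parameter are hypothetical.

```python
import random


def assign_deadline(arrival, n_objects, cpu_time, min_slack, max_slack, rng=random):
    """Deadline = arrival time + resource time * (1 + slack percent)."""
    resource_time = n_objects * cpu_time  # assumption: CPU-only resource time
    slack_percent = rng.uniform(min_slack, max_slack) / 100.0
    return arrival + resource_time * (1 + slack_percent)


# A 16-object transaction with 10 ms per object and a fixed 50% slack.
fixed = assign_deadline(arrival=0, n_objects=16, cpu_time=10,
                        min_slack=50, max_slack=50)

# Slack drawn uniformly between 50% and 550%.
drawn = assign_deadline(arrival=0, n_objects=16, cpu_time=10,
                        min_slack=50, max_slack=550, rng=random.Random(0))
```

With slack between 50% and 550%, a transaction whose resource time is 160 ms receives a deadline between 240 ms and 1040 ms after arrival.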

Disk accesses for the disk resident database are controlled by disk_prob when a transaction reads
an object. The use of disk_prob to some extent models data maintained in the buffer. At commit
time, objects that have been updated are flushed. The parameter update_prob controls the number
of data items that should be written at commit time. We use restart_time for modeling the rollback
of a transaction and its restart. The restarted transaction will access the same data objects. We
keep the LF of the 20 most recently finished transactions in the circular list to keep track of the current ALF.
In our performance evaluation, we measure three performance metrics (defined below) commonly
used in the literature for RTDBS: i) miss percent, ii) restart rate, and iii) mean lateness.

$$\text{Miss Percent} = \frac{\text{Total number of transactions that missed the deadline}}{\text{Total number of transactions that entered the system}} \times 100$$

$$\text{Restart Rate} = \frac{\text{Total number of restarts}}{\text{Total number of transactions that entered the system}}$$

$$\text{Mean Lateness} = \frac{\sum_{\text{tardy transactions}} (completion\_time(T_i) - deadline(T_i))}{\text{Total number of transactions that entered the system}}$$
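The three metrics can be computed from a per-transaction log, as in the following Python sketch. The function name and the input format (a list of completion time and deadline pairs, plus a restart count) are assumptions made for illustration. Note that mean lateness sums over tardy transactions only but divides by all transactions that entered the system.

```python
def metrics(completions, n_restarts):
    """completions: list of (completion_time, deadline), one pair per
    transaction that entered the system."""
    n = len(completions)
    lateness = [c - d for c, d in completions if c > d]  # tardy transactions only
    return {
        "miss_percent": 100.0 * len(lateness) / n,
        "restart_rate": n_restarts / n,
        "mean_lateness": sum(lateness) / n,
    }


result = metrics([(90, 100), (120, 100), (130, 110), (50, 60)], n_restarts=2)
```

For this log, two of the four transactions are tardy by 20 time units each, giving a miss percent of 50, a restart rate of 0.5, and a mean lateness of 10.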

Another way of measuring the performance is to compute the lowest arrival rate that causes
a 20% miss rate. We consider a system to be heavily loaded when the system misses more than 20%
of transactions [HCL90b]. Thus we define this arrival rate as the boundary arrival rate and will show
that our algorithm has a higher boundary arrival rate.


6.1 Main memory DB

In this simulation we have a single processor and a memory resident database. We do not consider
any durability property here in order to isolate the effects of transaction scheduling and concurrency
control methods. Thus the resource time of a transaction only depends on cpu_time and the number
of objects a transaction accesses. The values of the parameters used in this simulation are shown in
Table 2. The value of db_size has been chosen to increase data conflict among transactions and
20,000 transactions were executed for each experiment.


6.1.1 Effect of Arrival rate

In this experiment, we varied the arrival rate from 1 tr/sec to 7 trs/sec with the base parameters shown
in Table 2 and measured the miss percent, the number of restarts per transaction, and mean lateness
for EDF-HP, EDF-CR-ALF, and CCA-ALF.

Parameter        Value
db_size          250
max_size         24
min_size         8
cpu_time         10 ms
min_slack        50 (%)
max_slack        550 (%)
restart_time     5 ms
penalty_weight   1

Table 2: Base parameters for main memory database

With the base parameters the maximum capacity of
the system (assuming no blocking and aborts) is:

$$\frac{10\ \text{ms}}{\text{object}} \times \frac{16\ \text{objects}}{\text{transaction}} = \frac{160\ \text{ms}}{\text{transaction}} \;\Rightarrow\; 6.25\ \text{transactions/second}$$
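
The capacity figure above can be reproduced with a few lines of code. This is only a sketch: the function name is ours, and taking the mean transaction size as the midpoint of min_size and max_size (which gives the 16 objects used above) is an assumption about how sizes are distributed.

```python
def max_capacity_tps(cpu_ms_per_object, mean_objects_per_txn):
    """Upper bound on throughput (transactions/second), ignoring blocking and aborts."""
    resource_ms = cpu_ms_per_object * mean_objects_per_txn  # mean resource time per transaction
    return 1000.0 / resource_ms


# Assumption: sizes are spread evenly between min_size = 8 and max_size = 24.
mean_objects = (8 + 24) / 2

print(max_capacity_tps(10, mean_objects))  # 6.25
```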

If we consider the effects of blocking and aborting (dynamic factors), the capacity of the system
will be much less than this maximum. Figure 3 (a) shows the effect of
arrival rate on the percentage of transactions that miss their deadline. The boundary arrival rates
of EDF-HP, EDF-CR-ALF and CCA-ALF are approximately 4.4, 4.5, and 4.6 trs/sec respectively.
Figure 3 (b) shows the effect of arrival rate on the restart rate of transactions and Figure 3 (c)
shows mean lateness on a logarithmic scale.
CCA-ALF performs better than EDF-HP and EDF-CR-ALF, especially
when the arrival rate is between 3 and 5.5 trs/sec. Within this range CCA-ALF and
EDF-CR-ALF incur far fewer transaction restarts than EDF-HP. In general, fewer
transaction restarts do not guarantee better performance, but CCA-ALF achieves better
performance by reducing expensive restarts. This can be seen clearly in the multiclass
experiment presented later. Observe that for the base parameters shown in Table 2, the number
of restarts climbs steeply up to an arrival rate of 4 trs/sec and then declines sharply (Figure 3 (b)).
The reason for the sharp decline is that beyond a specific arrival rate, it is less likely that an arriving
transaction will have an earlier deadline than the currently running transaction. After the peak
in Figure 3 (b), it is usually the case that the currently running transaction arrived a long
time ago but could not get system services due to the heavy load on the system (most of the
dynamic factors in a heavily loaded situation are arrival [illegible] rather than preemption [illegible]
and aborts [Tay92]). Thus, fewer transactions are preempted and there are fewer opportunities for
restarts [AGM88].


6.1.2 Effect of multiclass (Transaction mix)

In this experiment, the arriving transactions are divided into three classes (class 0, 1, and 2) and
assigned different values of cpu_time: 1 ms for class 0, 10 ms for class 1, and 100 ms for class 2. We assigned
1 ms as restart_time for all classes because the resource times of class 0 transactions are between
8 ms and 24 ms. The other parameters are the same as those of the previous experiment. Thus
data contention remains the same but the amount of resource time for each class is different.

[Figure 3: Comparison of EDF-CR-ALF and CCA-ALF (base parameters) versus arrival rate (trs/sec): (a) miss percent, (b) restart rate, (c) mean lateness.]

With these assignments a lower class (the lowest is class 0) transaction has a shorter resource time. As
a result it has a shorter slack time. The maximum capacity of the system (disregarding blocking
and aborts) is:

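$$\frac{1 + 10 + 100}{3}\ \frac{\text{ms}}{\text{object}} \times \frac{16\ \text{objects}}{\text{transaction}} = \frac{592\ \text{ms}}{\text{transaction}} \;\Rightarrow\; 1.7\ \text{transactions/second}$$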

Different assignments of cpu_time for each transaction class create a lot of variance in the
transaction resource time (the resource time of a transaction varies from 8 ms to 2400 ms). Therefore,
there will be more chances for transaction preemption. Figure 4 shows the results of this experiment.
The boundary arrival rates of EDF-HP, EDF-CR-ALF, and CCA-ALF are 0.95, 1.0, and 1.15 trs/sec
respectively in Figure 4 (a). Thus CCA-ALF sustains a higher arrival rate before more
than 20% of transactions miss their deadlines.
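
The spread of resource times in the multiclass setting can be checked with the short sketch below. It assumes the three classes are equally likely (as the division by 3 in the capacity formula suggests) and that transaction sizes range over min_size = 8 to max_size = 24 objects; the variable names are ours.

```python
# cpu_time per object (ms) for each transaction class.
CPU_MS = {0: 1, 1: 10, 2: 100}
MIN_SIZE, MAX_SIZE = 8, 24  # objects per transaction

# Assumption: classes are equally likely and sizes average (8 + 24) / 2 = 16.
mean_cpu_ms = sum(CPU_MS.values()) / len(CPU_MS)
mean_resource_ms = mean_cpu_ms * (MIN_SIZE + MAX_SIZE) / 2
capacity_tps = 1000.0 / mean_resource_ms

# Extremes: the smallest class 0 transaction and the largest class 2 transaction.
shortest_ms = MIN_SIZE * min(CPU_MS.values())
longest_ms = MAX_SIZE * max(CPU_MS.values())

print(mean_resource_ms, round(capacity_tps, 1), shortest_ms, longest_ms)
```

The longest transaction needs 300 times the resource time of the shortest, which is why aborting a nearly finished class 2 transaction is so much more expensive than aborting a class 0 transaction.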
With the variation of cpu_time there is a higher possibility that an arriving transaction will have
an earlier deadline than the currently executing transaction. Thus the restart rate per transaction
in this experiment is higher for both approaches, as can be observed from Figures 3 (b) and
4 (b). CCA-ALF shows better performance, especially when the arrival rate is between 0.6 and
1.4 trs/sec. Within this range CCA-ALF incurs far fewer transaction restarts
than EDF-HP. CCA-ALF reduces very expensive restarts to achieve better performance in
the multiclass situation. This experiment also indicates the adaptive nature of the CCA-ALF
approach, in which the dynamic cost changes as the transaction mix changes and reduces the effect
of the deadline accordingly.
Another metric of comparison for this experiment is the miss percent for each class. In
this experiment data contention is the same for all classes but their active resource requirements
are different because the transactions belonging to classes 1 and 2 require more cpu_time to process
their data objects. The relative difference in miss percent between classes shrinks beyond an arrival rate
of 1 tr/sec for both approaches (Figure 5). The reason is that after this point preemption of transactions is
reduced and execution behavior is more serialized.
We plot the miss percent for each class from an arrival rate of 0.6 trs/sec to 1.4 trs/sec for EDF-HP
and CCA-ALF in Figure 5 (the miss percent is too small to plot when the arrival rate is less than 0.6
trs/sec, and the behavior of EDF-CR-ALF is almost the same as that of EDF-HP). Their relative
difference is reduced when the arrival rate is higher than 1.4 trs/sec. From Figure 5, we can see
that EDF-HP and EDF-CR-ALF blindly favor shorter transactions. Thus EDF-HP
and EDF-CR-ALF cause very expensive restarts by aborting transactions that have consumed a lot of
resources. CCA-ALF also favors shorter transactions, but it avoids expensive restarts by
not aborting transactions that have consumed a lot of resources. In Figure 5 the miss percent of class 0
transactions is higher than that of class 1 transactions in our experiment. The reason is that class
0 transactions are very vulnerable due to their relatively small absolute slack time.
We expected less discrimination against long running transactions in CCA-ALF
than in EDF-HP because CCA-ALF implicitly considers the effective service time of a transaction,
and Figure 5 confirms this. Discrimination against long running transactions in RTDBS is discussed
in [PLJ92]. In their experiment each class requires a different range of object numbers. Thus each
class has a different level of data contention and resource time. In our experiment, however, the classes
differ only in their level of resource contention. That is the reason why their experiment shows more
discrimination against long running transactions. Also, the formula used for priority computation
currently does not distinguish between transaction classes. This can be easily included in the
formula that computes penalty of conflict.

[Figure 4: Multiclass experiment. EDF-HP, CCA-ALF, and EDF-CR-ALF versus arrival rate (trs/sec): (a) miss percent, (b) restart rate, (c) mean lateness.]

[Figure 5 plots: miss percent of class 0, class 1, and class 2 versus arrival rate for EDF-HP and CCA-ALF.]

                          EDF-HP                               CCA-ALF
Arrival Rate (trs/sec)    0.6    0.8    1.0    1.2    1.4      0.6    0.8    1.0    1.2    1.4
Miss Percent (Class 2)    2.63   5.91   11.71  19.80  24.94    1.04   2.64   4.81   7.92   16.68
Miss Percent (Class 0)    0.63   1.86   5.3    11.52  19.1     0.8    2.48   5.29   9.22   15.1
Class 2 / Class 0         4.17   3.18   2.2    1.72   1.3      1.3    1.06   0.9    0.85   1.18

Figure 5: Miss percent for each class and proportion of class 2 to class 0

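
The Class 2 / Class 0 row of Figure 5 is simply the ratio of the two miss percent rows, and it serves as a rough indicator of discrimination against long transactions (a value near 1 means both classes miss deadlines at similar rates). The sketch below recomputes it from the miss percent values listed in Figure 5; the function and variable names are ours, and recomputed values may differ slightly from the printed row.

```python
ARRIVAL_RATES = [0.6, 0.8, 1.0, 1.2, 1.4]  # trs/sec

# Miss percent values as listed in Figure 5.
MISS_PERCENT = {
    "EDF-HP": {
        "class2": [2.63, 5.91, 11.71, 19.80, 24.94],
        "class0": [0.63, 1.86, 5.3, 11.52, 19.1],
    },
    "CCA-ALF": {
        "class2": [1.04, 2.64, 4.81, 7.92, 16.68],
        "class0": [0.8, 2.48, 5.29, 9.22, 15.1],
    },
}


def class_ratio(policy):
    """Ratio of class 2 to class 0 miss percent at each sampled arrival rate."""
    rows = MISS_PERCENT[policy]
    return [c2 / c0 for c2, c0 in zip(rows["class2"], rows["class0"])]


for policy in MISS_PERCENT:
    ratios = ", ".join(f"{r:.2f}" for r in class_ratio(policy))
    print(f"{policy}: {ratios}")
```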
CCA-ALF shows much better performance especially when the variance of execution time is
high among transactions, by not aborting transactions that have already consumed a lot of resource
time.


6.2 Disk Resident DB

In order to measure the performance of our algorithm on a disk resident database, we extended the
simulation program to perform experiments for this case. In this simulation we assumed
a single processor, a single disk, and FCFS I/O scheduling. If a transaction is aborted while
it waits in the disk queue, it is deleted from the disk queue immediately. However, if
a transaction is aborted during its I/O access, it is not deleted until it releases the disk. We used
deferred update rather than immediate update for fast rollback [AD85]. Thus we assume that
transaction rollback and restart do not require any disk access.
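
The abort handling for the disk queue described above can be sketched as follows. This is a minimal illustration rather than our simulator code: the class `FcfsDisk`, its method names, and the use of string transaction identifiers are all assumptions, and each transaction is assumed to have at most one outstanding disk request.

```python
from collections import deque


class FcfsDisk:
    """Single disk with FCFS scheduling and the abort rules described above."""

    def __init__(self):
        self.queue = deque()             # waiting requests, in arrival order
        self.in_service = None           # transaction currently performing I/O
        self.aborted_in_service = False  # set if the in-service transaction was aborted

    def request(self, txn_id):
        # Start the I/O at once if the disk is idle, otherwise wait in FCFS order.
        if self.in_service is None:
            self.in_service = txn_id
        else:
            self.queue.append(txn_id)

    def abort(self, txn_id):
        if txn_id == self.in_service:
            # An ongoing I/O is not interrupted; the transaction keeps the disk
            # until the access finishes.
            self.aborted_in_service = True
        elif txn_id in self.queue:
            # A waiting request is removed from the disk queue immediately.
            self.queue.remove(txn_id)

    def complete(self):
        """Finish the current I/O; return (txn_id, still_valid)."""
        if self.in_service is None:
            raise RuntimeError("disk is idle")
        finished, valid = self.in_service, not self.aborted_in_service
        self.in_service = self.queue.popleft() if self.queue else None
        self.aborted_in_service = False
        return finished, valid


disk = FcfsDisk()
for txn in ("T1", "T2", "T3"):
    disk.request(txn)
disk.abort("T2")        # waiting: removed from the queue at once
disk.abort("T1")        # in service: keeps the disk until its I/O completes
print(disk.complete())  # ('T1', False)
print(disk.in_service)  # T3
```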
The values of the parameters used for this experiment are shown in Table 3. The values of cpu_time
and i/o_time are chosen to balance the utilization of CPU and disk [ACL87, AGM92, TSG85]. With
these parameter assignments the system is slightly I/O bound. Resource time in this experiment
depends on the cpu_time, the number of objects, the number of disk accesses, and i/o_time. Since the
deadline is assigned based on pre-commit time, we inspect the timing requirement and release locks when
a transaction pre-commits. As we have 2 system resources in this experiment and the disk_prob is
0.5, we assigned 0.5 as the value of penalty_weight. The rationale is to distribute the penalty of
conflict over the system resources.


Parameter        Value
db_size          250
max_size         24
min_size         8
i/o_time         25 ms
cpu_time         15 ms
disk_prob        0.5
update_prob      0.5
min_slack        100 (%)
max_slack        650 (%)
restart_time     5 ms
penalty_weight   0.5

Table 3: Base parameters for disk resident database

With the base parameters in Table 3 the maximum capacity of the system is:

$$\frac{16\ \text{objects}}{\text{transaction}} \times \frac{15\ \text{ms}}{\text{object}} = \frac{240\ \text{ms}}{\text{transaction}} \;\Rightarrow\; 4.2\ \text{trs/second}$$

This calculation is very optimistic because it includes neither the abort cost nor the blocking
cost of transactions.









6.2.1 Effect of Arrival Rate


In this experiment, we varied the arrival rate from 0.6 tr/sec to 2 trs/sec with the base parameters
shown in Table 3 and compared the EDF-HP, EDF-CR-ALF and CCA-ALF schemes. Increasing the
arrival rate increases time contention as well as data contention and thus increases the transaction miss
percent for all three approaches. CCA-ALF and EDF-CR-ALF, which use type 1 information to
reduce noncontributing execution, show a much larger improvement over EDF-HP for the disk
resident database (as expected) than in the main memory case (Figure 6). The boundary
arrival rates of EDF-HP, EDF-CR-ALF, and CCA-ALF are 1.2, 1.42, and 1.43 trs/sec respectively.
The restart rate of EDF-HP increases rapidly earlier than those of EDF-CR-ALF and CCA-ALF
because, when the arrival rate is high, the number of available transactions increases, which in
turn causes high data contention. High data contention causes many transactions to block, which
eventually increases the number of active transactions, as well as the number of transactions in the
system that have begun execution but have not yet finished. Thus the number of priority-based restarts of the active
transactions that are blocked waiting for locks or resources increases very rapidly. The increase in
the restart ratio means that a larger fraction of disk time is spent doing work that will be redone
later [ACL87]. Resource time wasted on priority-based restarts causes high resource utilization
and easily saturates the bottleneck resource, which induces longer I/O wait times. With longer
I/O wait times more transactions are scheduled, which increases the I/O wait time further. Thus
the possibility of restarting an active transaction increases further. After the peak point, restart
rates increase slowly, as shown in Figure 6 (b). This is because the number of restarts caused by
I/O wake-ups of higher priority transactions increases, while restarts caused by arrivals of higher
priority transactions are gradually reduced. The number of restarts will eventually flatten out as the arrival rate
increases.
Even though the number of available transactions increases as the arrival rate increases, the number of useful
transactions for CCA-ALF and EDF-CR-ALF increases very slowly. Thus the number of active
transactions is relatively small, as shown in Figure 7 (a), until an arrival rate of 1.6 trs/sec. As a result,
the number of priority-based restarts for CCA-ALF and EDF-CR-ALF increases slowly, as can be
seen in Figure 6 (b). Beyond an arrival rate of 1.6 trs/sec the number of active transactions for
EDF-CR-ALF and CCA-ALF increases because both approaches choose transactions that seemingly do not
conflict with partially executed higher priority transactions.
In the heavy load situation, the conflict resolution policy of CCA-ALF resembles EDF-Wait,
which uses a nonabortive method. Thus, beyond an arrival rate of 1.6 trs/sec, the restart rate of
CCA-ALF is lower than that of EDF-CR-ALF, which uses the HP conflict resolution method independent of
system load. That is the reason why CCA-ALF has a lower restart rate than EDF-CR-ALF beyond an arrival
rate of 1.8 trs/sec even though CCA-ALF has more active transactions.


[Figure 6: Comparison of EDF-CR-ALF and CCA-ALF. EDF-HP, CCA-ALF, and EDF-CR-ALF (base parameters) versus arrival rate (trs/sec): (a) miss percent, (b) restart rate.]

[Figure 7: Comparison of EDF-CR-ALF and CCA-ALF. EDF-HP, CCA-ALF, and EDF-CR-ALF (base parameters) versus arrival rate (trs/sec): (a) number of active transactions, (b) mean lateness.]


7 Conclusions

Synthesizing static and dynamic information available to transactions seems to be a viable approach
for obtaining scheduling policies to meet the requirements of real-time transactions. A single fixed
priority assignment policy is not meaningful in all situations, as different load situations require
different methods. The approach described in this paper uses dynamic priority assignment with a
continuous evaluation method to adapt to load changes effectively and to reduce the excessive
restart problem encountered by EDF-HP in high data contention situations. CCA-ALF changes its
priority assignment policy to adapt to the current system load by using type 1 information, type
2 information, and the ALF of the system. It adapts well to fluctuations of the system load without causing
an excessive number of transaction restarts.
The distinctive features of our approach are:

1. Our dynamic priority policy synthesizes deadline and penalty of conflict. The amount
of effective service time of a transaction is implicitly taken into account, as it is a part of the
penalty of conflict computed for conflicting transactions.

2. Our priority assignment policy adapts easily to changes in system load by using the ALF of
previously finished transactions. It works like EDF in a lightly loaded environment and
like RP in a heavily loaded situation.

3. Our conflict resolution policy works like EDF-HP in a lightly loaded situation and, by
decreasing the priorities of conflicting transactions, like EDF-Wait or EDF-WP in a heavily
loaded environment.

Our simulations indicate that:

1. CCA-ALF performs better than EDF-HP and EDF-CR-ALF for soft deadlines over a wide range
of arrival rates.

2. CCA-ALF is fairer than EDF-HP and EDF-CR-ALF.

3. CCA-ALF shows particularly good performance when the transactions have a wide range of
processing requirements.

4. Reducing noncontributing execution by using type 1 information dominates the performance
of disk resident databases.

Currently we are looking at the applicability of CCA-ALF for firm deadline transactions.


References

[ACL87] Rakesh Agrawal, Michael J. Carey, and Miron Livny. Concurrency control performance
modeling: Alternatives and implications. ACM Transactions on Database Systems,
12(4):609-654, 1987.

[AD85] Rakesh Agrawal and D. DeWitt. Integrated concurrency control and recovery mechanism:
Design and performance evaluation. ACM Transactions on Database Systems,
10(4):529-564, 1985.

[AGM88] Robert Abbott and Hector Garcia-Molina. Scheduling real-time transactions. SIGMOD
RECORD, 17(1):71-81, 1988.

[AGM89] Robert Abbott and Hector Garcia-Molina. Scheduling real-time transactions with disk
resident data. In Proceedings of the 15th VLDB, pages 385-396. ACM, 1989.

[AGM92] Robert Abbott and Hector Garcia-Molina. Scheduling real-time transaction: Performance
evaluation. ACM Transactions on Database Systems, 17(3):513-560, 1992.

[BHG87] P.A. Bernstein, V. Hadzilacos, and N. Goodman. Concurrency Control and Recovery in
Database Systems. Addison-Wesley, 1987.

[BMH89] A. Buchmann, D.R. McCarthy, and M. Hsu. Time-critical database scheduling: A
framework for integrating real-time scheduling and concurrency control. In Proceedings
of the Fifth Conference on Data Engineering, pages 470-480, Feb 1989.

[CBB+89] Sharma Chakravarthy, Barbara Blaustein, Alejandro Buchmann, Michael Carey,
Umeshwar Dayal, David Goldhirsch, Meichun Hsu, Rivka Ladin, Rajiv Jauhari, Miron Livny,
Dennis McCarthy, Richard McKee, and Arnon Rosenthal. HiPAC: A research project
in active, time-constrained database management. Final technical report XAIT-89-02,
Xerox, July 1989.

[CHJ94] S. Chakravarthy, D. Hong, and T. Johnson. Real-time transaction scheduling: A framework
for synthesizing static and dynamic factors. Technical Report UF-CIS-TR-[illegible]
(electronic; available at anonymous ftp site cis.ufl.edu), University of Florida, Dept. of
CIS, 1994.

[CJL89] M.J. Carey, R. Jauhari, and M. Livny. Priority in DBMS resource scheduling. In
Proceedings of the 15th VLDB, pages 0-0, 1989.

[DLT85] E. Douglas Jensen, C. Douglass Locke, and Hideyuki Tokuda. A time-driven scheduler
for real-time operating systems. In Proceedings of the IEEE Real-Time Systems
Symposium, pages 112-122. IEEE, 1985.

[Fis92] Paul A. Fishwick. SIMPACK: C-based Simulation Tool Package Version 2. University
of Florida, 1992.

[GR93] Jim Gray and Andreas Reuter. Transaction Processing: Concepts and Techniques.
Morgan Kaufmann, 1993.

[HCL90a] Jayant R. Haritsa, Michael J. Carey, and Miron Livny. Dynamic real-time optimistic
concurrency control. In Proceedings of Real-Time System Symposium, pages 94-103.
IEEE, 1990.

[HCL90b] Jayant R. Haritsa, Michael J. Carey, and Miron Livny. On being optimistic about
real-time constraints. ACM PODS, 1990.

[HJC93] D. Hong, T. Johnson, and S. Chakravarthy. Real-time transaction scheduling: A
cost-conscious approach. In Proceedings of the 1993 ACM SIGMOD International Conference on
Management of Data, pages 197-206. ACM, 1993.

[HLC91] Jayant R. Haritsa, Miron Livny, and Michael J. Carey. Earliest deadline scheduling
for real-time database systems. In Proceedings of Real-Time System Symposium, pages
232-242. IEEE, 1991.

[HSRT91] Jiandong Huang, John A. Stankovic, Krithi Ramamritham, and Don Towsley.
Experimental evaluation of real-time optimistic concurrency control schemes. In Proceedings
of the 17th VLDB, pages 35-46. ACM, 1991.

[KM92] Tei-Wei Kuo and Aloysius K. Mok. Application semantics and concurrency control of
real-time data-intensive applications. In Proceedings of Real-Time Systems Symposium,
pages 35-45. IEEE, 1992.

[KS91] Woosaeng Kim and Jaideep Srivastava. Enhancing real-time DBMS performance with
multiversion data and priority based disk scheduling. In Proceedings of Real-Time
Systems Symposium, pages 222-231. IEEE, 1991.

[LS90] Yi Lin and Sang H. Son. Concurrency control in real-time databases by dynamic
adjustment of serialization order. In Proceedings of Real-Time Systems Symposium, pages
104-112. IEEE, 1990.

[PLJ92] HweeHwa Pang, Miron Livny, and Michael J. Carey. Transaction scheduling in multiclass
real-time database systems. In Proceedings of Real-Time System Symposium, pages
23-34. IEEE, 1992.

[Ram93] Krithi Ramamritham. Real-time databases. International Journal of Distributed and
Parallel Databases, pages 1-30, 1993.

[RMM90] R. Jauhari, M.J. Carey, and M. Livny. Priority-hint: An algorithm for priority-based
buffer management. In Proceedings of the 16th VLDB, pages 708-721, 1990.

[Sha88] Lui Sha. Concurrency control for distributed real-time databases. SIGMOD RECORD,
17(1):82-98, 1988.

[SRL90] Lui Sha, Ragunathan Rajkumar, and J.P. Lehoczky. Priority inheritance protocols: An
approach to real-time synchronization. IEEE Transactions on Computers, 39:1175-1185,
1990.

[SRSC91] Lui Sha, Ragunathan Rajkumar, Sang Hyuk Son, and Chun-Hyun Chang. A real-time
locking protocol. IEEE Transactions on Computers, 40(7):793-800, 1991.

[SZ88] John A. Stankovic and Wei Zhao. On real-time transactions. SIGMOD RECORD,
17(1):4-18, 1988.

[Tay92] Y.C. Tay. A behavioral analysis of scheduling by earliest deadline. Technical Report
No. 532, Department of Mathematics, National University of Singapore, 1992.

[TSG85] Y.C. Tay, R. Suri, and N. Goodman. Locking performance in centralized databases.
ACM Transactions on Database Systems, 10(4):415-462, 1985.

[ZRS87a] Wei Zhao, Krithi Ramamritham, and John A. Stankovic. Preemptive scheduling under
time and resource constraints. IEEE Transactions on Computers, 36(8):949-960, 1987.

[ZRS87b] Wei Zhao, Krithi Ramamritham, and John A. Stankovic. Scheduling tasks with requirement
in hard real-time systems. IEEE Transactions on Software Engineering, 13(5):225-236,
1987.



