Title: Strongly partitioned system architecture for integration of real-time applications
Permanent Link: http://ufdc.ufl.edu/UF00100847/00001
 Material Information
Title: Strongly partitioned system architecture for integration of real-time applications
Physical Description: Book
Language: English
Creator: Kim, Daeyoung, 1968-
Publisher: University of Florida
Place of Publication: Gainesville, Fla.
Publication Date: 2001
Copyright Date: 2001
 Subjects
Subject: Real-time data processing   ( lcsh )
Computer software -- Development   ( lcsh )
Computer and Information Science and Engineering thesis, Ph. D   ( lcsh )
Dissertations, Academic -- Computer and Information Science and Engineering -- UF   ( lcsh )
Genre: government publication (state, provincial, territorial, dependent)   ( marcgt )
bibliography   ( marcgt )
theses   ( marcgt )
non-fiction   ( marcgt )
 Notes
Summary: ABSTRACT: In recent years, the design and development of technologies for real-time systems have been investigated extensively in response to the demand for fast time-to-market and flexible application integration. This dissertation proposes a design approach for integrated real-time systems in which multiple real-time applications with different criticalities can be feasibly operated while sharing computing and communication resources. The benefits of integrated real-time systems are cost reduction in design, maintenance, and upgrades, and easy adoption of Commercial-Off-The-Shelf (COTS) hardware and software components. The integrated real-time applications must meet their own timing requirements and be protected from other malfunctioning applications. To guarantee the timing constraints and dependability of each application, integrated real-time systems must be equipped with strong partitioning schemes. Therefore, we name them strongly partitioned integrated real-time systems (SPIRIT). To prove the theoretical correctness of the model and provide a scheduling analysis method, we developed a fundamental two-level scheduling theory, an algorithm for integrated scheduling of partitions and communication channels, and aperiodic task scheduling methods. We also augmented the scheduling model with practical constraints demanded by the ARINC 653 Integrated Modular Avionics standards. The scheduling algorithms are evaluated by mathematical analysis and simulation studies, and implemented as an integrated scheduling tool suite.
Summary: ABSTRACT (cont.): To provide a software platform for integrated real-time systems, we developed a real-time kernel, SPIRIT-muKernel, which implements strong partitioning schemes based on the two-level scheduling theory. The kernel also provides a generic mechanism to host heterogeneous COTS real-time operating systems on top of the kernel. The performance of critical parts of the kernel is measured in a prototype environment built on a PowerPC embedded controller. Finally, we propose a real-time Ethernet named Strong Partitioning Real-Time Ethernet (SPRETHER), designed as the communication network of integrated real-time systems. To overcome the lack of deterministic characteristics of Ethernet, we use a software-oriented synchronization approach based on a table-driven proportional access method. We performed scheduling analysis for SPRETHER and measured its performance by experiments on a prototype network.
Summary: KEYWORDS: real-time, strong-partitioning, scheduling, kernel, IMA
Thesis: Thesis (Ph. D.)--University of Florida, 2001.
Bibliography: Includes bibliographical references (p. 141-147).
System Details: System requirements: World Wide Web browser and PDF reader.
System Details: Mode of access: World Wide Web.
Statement of Responsibility: by Daeyoung Kim.
General Note: Title from first page of PDF file.
General Note: Document formatted into pages; contains xiii, 148 p.; also contains graphics.
General Note: Vita.
 Record Information
Bibliographic ID: UF00100847
Volume ID: VID00001
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
Resource Identifier: oclc - 49241403
alephbibnum - 002766261
notis - ANP4300


Full Text











STRONGLY PARTITIONED SYSTEM ARCHITECTURE FOR INTEGRATION OF
REAL-TIME APPLICATIONS

















By

DAEYOUNG KIM


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA


2001




























Copyright 2001

by

Daeyoung Kim



























To my parents, my wife Yoonmee, and my son Minjae
















ACKNOWLEDGMENTS

I would like to thank many individuals for the care and support they have given me during my doctoral studies. First of all, I would like to express my deep gratitude to Professor Yann-Hang Lee. As my advisor, he has provided me with invaluable guidance and insightful comments and suggestions. I would also like to thank Professors Paul W. Chun, Douglas D. Dankel, Randy Y. Chow, and Jih-Kwon Peir for serving on my dissertation committee.

I would also like to thank the former and present members of the Real-Time Systems Research Group--Yoonmee Doh, Okehee Goh, Yan Huang, Vlatko Milosevski, James J. Xiao, Youngjoon Byun, Jaeyong Lim, and Kunsuk Kim--for their friendship and discussions. I am also grateful to my friend Sejun Song at Cisco Systems, and Professor Youngho Kim at Pusan National University, for their encouragement and discussions. My thanks also go to Dr. Mohamed Younis at the University of Maryland, Baltimore County; Dr. Zeff Zhou and James McElroy at Honeywell International; and Dr. Mohamed Aboutabl at Lucent Technology. My deep gratitude also goes to my former directors at ETRI: Professor Munkee Choi, at Information and Communications University; and Hyupjong Kim, at Holim Technology.

Thanks also go to the CISE Department of the University of Florida, ETRI, Allied Signal Inc., Honeywell International, NASA, and the CSE Department of Arizona State University for providing financial and administrative support during my Ph.D. program.

I give special thanks to my parents, who have always encouraged me in my studies and believed in me. I am also grateful to my mother-in-law and my sisters, Mikyung and Mijung. My final acknowledgements go to my wife Yoonmee and my son Minjae for the love they have shown me during my studies.

















TABLE OF CONTENTS

page


ACKNOWLEDGMENTS ..... iv

LIST OF TABLES ..... viii

LIST OF FIGURES ..... ix

ABSTRACT ..... xii
CHAPTERS
1 INTRODUCTION ..... 1
1.1 Integrated Real-Time Systems ..... 1
1.2 Research Background and Related Works ..... 4
1.2.1 Fundamental Scheduling Theories ..... 4
1.2.2 Model and Theories for Integration of Real-Time Systems ..... 6
1.2.3 Real-Time Kernels ..... 7
1.2.4 Real-Time Communication Networks ..... 9
1.3 Main Contributions ..... 10
1.4 Organization of the Dissertation ..... 11
2 INTEGRATED SCHEDULING OF PARTITIONS AND CHANNELS ..... 13
2.1 Introduction ..... 13
2.2 Integrated Real-Time Systems Model ..... 14
2.3 Fundamental Scheduling Theory for Integration ..... 19
2.3.1 Schedulability Requirement ..... 19
2.3.2 Characteristics of the Two-Level Scheduling Theory ..... 22
2.3.3 Distance-Constrained Low-Level Cyclic Scheduling ..... 25
2.4 Integrated Scheduling Theory for Partitions and Channels ..... 30
2.4.1 Scheduling Approach ..... 30
2.4.2 Algorithm Evaluation ..... 42
2.5 Solving Practical Constraints ..... 47
2.5.1 Incremental Changing Scheduling ..... 48
2.5.2 Replicated Partition Scheduling ..... 52
2.5.3 Fault Tolerant Cyclic Slot Allocation for Clock Synchronization ..... 55
2.5.4 Miscellaneous Practical Constraints ..... 57
2.6 Scheduling Tool ..... 60
2.7 Conclusion ..... 62
3 SOFT AND HARD APERIODIC TASK SCHEDULING ..... 64
3.1 Introduction ..... 64
3.2 Aperiodic Task Scheduling Model ..... 65


3.3 Soft Aperiodic Task Scheduling ..... 67
3.3.1 Left Sliding (LS) ..... 67
3.3.2 Right Putting (RP) ..... 69
3.3.3 Compacting ..... 71
3.3.4 DC2 Scheduler ..... 72
3.4 Hard Aperiodic Task Scheduling ..... 73
3.4.1 Low-Bound Slack Time ..... 73
3.4.2 Acceptance Test Considering Future RP Operations ..... 75
3.4.3 Dynamic Slack Time Management ..... 76
3.4.4 Multi-Periods Aperiodic Server Scheduler ..... 78
3.5 Simulation Studies of the DC2 Scheduler ..... 80
3.5.1 Response Time ..... 80
3.5.2 Acceptance Rate ..... 81
3.6 Conclusion ..... 85
4 REAL-TIME KERNEL FOR INTEGRATED REAL-TIME SYSTEMS ..... 86
4.1 Introduction ..... 86
4.2 SPIRIT-µKernel Model and Design Concepts ..... 88
4.2.1 Software Architecture Model for Integrated Real-Time Systems ..... 88
4.2.2 Design Concepts of the SPIRIT-µKernel ..... 89
4.3 SPIRIT-µKernel Architecture ..... 90
4.3.1 Memory Management ..... 91
4.3.2 Partition Management ..... 93
4.3.3 Timers/Clock Service ..... 95
4.3.4 Exception Handling ..... 95
4.3.5 Kernel Primitives Interface ..... 96
4.3.6 Inter-Partition Communication ..... 96
4.3.7 Device Driver Model ..... 97
4.4 Generic RTOS Port Interface (RPI) ..... 98
4.4.1 Event Delivery Object (EDO) ..... 99
4.4.2 Event Server ..... 102
4.4.3 Kernel Context Switch Request Primitive ..... 103
4.4.4 Interrupt Enable/Disable Emulation ..... 104
4.4.5 Miscellaneous Interface ..... 104
4.5 Performance Evaluation ..... 105
4.5.1 Kernel Tick Overhead ..... 105
4.5.2 Partition Switch Overhead ..... 107
4.5.3 Kernel-User Switch Overhead ..... 108
4.5.4 TLB Miss Handling Overheads ..... 109
4.6 Conclusion ..... 109
5 REAL-TIME COMMUNICATION NETWORKS FOR INTEGRATED REAL-TIME SYSTEMS ..... 110
5.1 Introduction ..... 110
5.2 SPRETHER Architecture and Model ..... 111
5.3 SPRETHER Protocols ..... 114
5.3.1 Table-Driven Proportional Access Protocol ..... 114
5.3.2 Single Message Stream Channel (SMC) ..... 117
5.3.3 Multiple Message Streams Channel (MMC) ..... 118
5.3.4 Non-Real-Time Channel (NRTC) ..... 123
5.3.5 Packet and Frame Format and Management ..... 124
5.3.6 SPRETHER API ..... 125
5.4 Scheduling Real-Time Messages ..... 126
5.4.1 Scheduling Requirement of the SMC and MMC Channels ..... 126
5.4.2 Distance-Constrained Cyclic Scheduling ..... 129
5.5 Prototype and Performance Evaluation ..... 131
5.5.1 TDPA Protocol Overhead ..... 133
5.5.2 Response Time of Non-Real-Time ICMP Message ..... 134
5.5.3 Throughput of Non-Real-Time Connection ..... 134
5.6 Conclusion ..... 137
6 CONCLUSIONS AND FUTURE WORK ..... 138
6.1 Contributions ..... 138
6.2 Future Research Directions ..... 139
LIST OF REFERENCES ..... 141

BIOGRAPHICAL SKETCH ..... 148

















LIST OF TABLES




Table Page


2-1 Index Notations used in the SPIRIT Model ..... 17
2-2 Notations used in the SPIRIT Model ..... 18
2-3 Task Parameters for the Example Partitions ..... 22
2-4 Task Parameters for the Example of Integrated Scheduling ..... 37
2-5 Cyclic Time Schedule for Processor 1 ..... 39
2-6 Cyclic Time Schedule for Processor 2 ..... 39
2-7 Slot Allocation of Cyclic Scheduling for Bus of Major Frame Size 202 ..... 40
2-8 Notations for Incremental Changing Algorithm ..... 48
3-1 Frame-to-Frame Slack Time Table ..... 74
4-1 Partition Configuration Information ..... 94
4-2 Measured Kernel Overheads ..... 106
4-3 Partition Switch Overheads ..... 107
4-4 TLB Miss Handling Overheads ..... 109
5-1 Parameters Used in the SMC Protocol ..... 117
5-2 Parameters Used in the MMC Protocol ..... 120
5-3 The SPRETHER Packet Field Description ..... 125
5-4 Synchronization (NIS Packet Processing) Overhead ..... 133

















LIST OF FIGURES


Figure Page

2-1 Model for Strongly Partitioned Integrated Real-Time Systems (SPIRIT) ..... 15
2-2 Task Model and Deadline ..... 16
2-3 Task and Partition Execution Sequence ..... 17
2-4 Inactivity Periods of the Example Partitions ..... 23
2-5 Maximum Partition Cycles for Different Processor Capacity Assignments ..... 24
2-6 Maximum Partition Cycles for Partition 1 under Different Task Deadlines ..... 26
2-7 Maximum Partition Cycles of Partition 1 under Different Processor Utilizations ..... 26
2-8 Example Cyclic Schedule at the Lower Level ..... 28
2-9 Distance-Constrained Scheduling Model ..... 29
2-10 Processor Cyclic Schedule Example ..... 30
2-11 Combined Partition and Channel Scheduling Approach ..... 31
2-12 Deadline Decomposition Algorithm ..... 32
2-13 Scheduling Example of Two Messages in Combined Channel ..... 35
2-14 Heuristic Channel Combining Algorithm ..... 36
2-15 Processor Capacity vs. Partition Cycle ..... 38
2-16 Bus Capacity vs. Channel Server Cycle ..... 41
2-17 Schedulability Test for Configuration (4,3,5) and (2,2,4) ..... 43
2-18 Measures for Bus Utilization and Capacities ..... 46
2-19 Ratio of Message Deadline to Task Deadline ..... 47
2-20 Cyclic Schedule Example for Incremental Changing ..... 49
2-21 Example of Attaching New Partition ..... 50
2-22 Replicated Partitions in IMA Systems ..... 53
2-23 Example of Fault Tolerant Cyclic Slot Allocation ..... 56
2-24 Cycle Transformation Algorithm for Fixed Bus Major Frame Size Scheduling ..... 58
2-25 Logical Channel Server Architecture ..... 59
2-26 Usage of Scheduling Tool in Avionics System Design ..... 60
2-27 The Structure of the Scheduling Tool ..... 61
2-28 System and Scheduling Parameters in the Tool Interface ..... 62
3-1 Example of Feasible Cyclic Schedule ..... 67
3-2 Modified Schedule after LS ..... 68
3-3 Modified Schedule after RP ..... 70
3-4 DC2 Scheduling Algorithm ..... 73
3-5 Example of the sf, sl, and sr Parameters ..... 74
3-6 Acceptance Test Algorithm ..... 76
3-7 Six Possible Slack Time Adjustments after an RP Operation ..... 78
3-8 Two-Level Hierarchy of Aperiodic Task Schedulers ..... 79
3-9 Average Response Time ..... 83
3-10 Average Acceptance Rate ..... 84
4-1 Strongly Partitioned Integrated Real-Time System Model ..... 88
4-2 Architecture of the SPIRIT-µKernel ..... 91
4-3 EDO, Event Server, and Kernel-Context Switch Request Primitive Interaction ..... 99
4-4 Structure of the Event Delivery Object ..... 100
4-5 SPIRIT-µKernel's Generic Interrupt/Exception Handling Procedure ..... 101
4-6 SPIRIT-µKernel's Generic Event Server Procedure ..... 103
4-7 Kernel Tick Overheads ..... 107
5-1 SPRETHER Real-Time Communication Model ..... 112
5-2 Message Stream Model in the SPRETHER ..... 113
5-3 Strongly Partitioned Real-Time Ethernet Architecture ..... 114
5-4 Channels and Gaps Configuration According to a Cyclic Scheduling Table ..... 115
5-5 Example of the TDPA Real-Time Ethernet Protocol ..... 116
5-6 Analysis of SMC Channel Activity ..... 117
5-7 Utilization of SMC Channel ..... 118
5-8 Analysis of MMC Channel Activity ..... 119
5-9 Utilization of MMC Channel ..... 120
5-10 Utilization of MMC Channel when Packet Size is 64, 128, 256, and 512 Bytes ..... 122
5-11 Protection Mechanism of the Real-Time Channel ..... 124
5-12 SPRETHER Packet Format ..... 124
5-13 Architecture of MMC Channel Server ..... 127
5-14 Example of a Communication Frame Cyclic Schedule ..... 131
5-15 SPRETHER Prototype Platform ..... 132
5-16 Software Architecture of SPRETHER ..... 132
5-17 Response Time of Non-Real-Time Packet (64-Byte ICMP Packet) ..... 135
5-18 Throughput of Non-Real-Time Packet (FTP) ..... 136
















Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

STRONGLY PARTITIONED SYSTEM ARCHITECTURE FOR INTEGRATION OF
REAL-TIME APPLICATIONS

By

Daeyoung Kim

August 2001


Chairman: Dr. Yann-Hang Lee
Major Department: Computer and Information Science and Engineering

In recent years, the design and development of technologies for real-time systems have been investigated extensively in response to the demand for fast time-to-market and flexible application integration. This dissertation proposes a design approach for integrated real-time systems in which multiple real-time applications with different criticalities can be feasibly operated while sharing computing and communication resources.

The benefits of integrated real-time systems are cost reduction in design, maintenance, and upgrades, and easy adoption of Commercial-Off-The-Shelf (COTS) hardware and software components. The integrated real-time applications must meet their own timing requirements and be protected from other malfunctioning applications. To guarantee the timing constraints and dependability of each application, integrated real-time systems must be equipped with strong partitioning schemes. Therefore, we name them strongly partitioned integrated real-time systems (SPIRIT).

To prove the theoretical correctness of the model and provide a scheduling analysis method, we developed a fundamental two-level scheduling theory, an algorithm for integrated scheduling of partitions and communication channels, and aperiodic task scheduling methods. We also augmented the scheduling model with practical constraints demanded by the ARINC 653 Integrated Modular Avionics standards. The scheduling algorithms are evaluated by mathematical analysis and simulation studies, and implemented as an integrated scheduling tool suite.

To provide a software platform for integrated real-time systems, we developed a real-time kernel, SPIRIT-µKernel, which implements strong partitioning schemes based on the two-level scheduling theory. The kernel also provides a generic mechanism to host heterogeneous COTS real-time operating systems on top of the kernel. The performance of critical parts of the kernel is measured in a prototype environment built on a PowerPC embedded controller.

Finally, we propose a real-time Ethernet named Strong Partitioning Real-Time Ethernet (SPRETHER), which is designed as the communication network of integrated real-time systems. To overcome the lack of deterministic characteristics of Ethernet, we use a software-oriented synchronization approach based on a table-driven proportional access method. We performed scheduling analysis for SPRETHER and measured its performance by experiments on a prototype network.















CHAPTER 1
INTRODUCTION

1.1 Integrated Real-Time Systems

Advances in computer and communication technology have introduced new architectures for real-time systems that emphasize dependability, cost reduction, and the integration of real-time applications with different criticalities. Departing from the traditional federated and distributed implementations of real-time systems, the new approach, referred to as Strongly Partitioned Integrated Real-time Systems (SPIRIT), uses multiple COTS processor modules and software components to build comprehensive real-time systems. It allows real-time applications to be merged into an integrated system.

The benefits of integrated real-time systems are cost reduction in design, maintenance, and upgrades, and easy adoption of Commercial-Off-The-Shelf (COTS) hardware and software components. Applications running in an integrated real-time system must meet their own timing requirements and be protected from other malfunctioning applications, while sharing computing and communication resources. To guarantee the timing constraints and dependability of each application, integrated real-time systems must be equipped with strong partitioning schemes.

Strong partitioning schemes include temporal and spatial partitioning to protect applications from potential interference. Temporal partitioning ensures that the execution time and communication bandwidth reserved for an application cannot be changed either by overruns or by hazardous events of other applications. Spatial partitioning guarantees that the physical resources of an application, such as its memory and I/O space, are protected from illegal accesses attempted by other applications. Providing strong partitioning schemes for the integration of real-time applications is the main goal of this dissertation.
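As a concrete illustration of spatial partitioning, the check below sketches how a kernel might refuse a memory access that falls outside the region assigned to a partition. The region table, partition names, and addresses are all hypothetical, chosen only to show the mechanism, not the dissertation's actual implementation.

```python
# Sketch of a spatial-partitioning check: each partition owns a fixed
# memory region [base, limit), and any access outside it is refused.
# Regions, partition names, and addresses are illustrative only.

REGIONS = {
    "P1": (0x1000, 0x2000),  # hypothetical region for partition P1
    "P2": (0x2000, 0x3000),  # hypothetical region for partition P2
}

def access_allowed(partition: str, addr: int) -> bool:
    """Return True iff addr lies inside the partition's own region."""
    base, limit = REGIONS[partition]
    return base <= addr < limit

# P1 may touch its own region, but an attempted write into P2's region
# (a "wild write") is rejected before it can corrupt another partition.
checks = [
    access_allowed("P1", 0x1800),  # inside P1's region
    access_allowed("P1", 0x2400),  # inside P2's region: illegal
]
```

In a real kernel this check is enforced in hardware (e.g., by the MMU), but the policy it implements is the same per-partition address-range ownership.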










The basic scheduling entities of integrated real-time systems are partitions and channels. A partition represents a real-time application that is protected from potential interference by means of temporal and spatial partitioning schemes. Multiple cooperating tasks are scheduled by a partition server within a partition. To facilitate communication among applications, each partition can be assigned one or more communication channels. An application can transmit messages using its channel, and can access the channel buffers exclusively. In this sense, a channel is a spatial and temporal partition of a communication resource and is dedicated to a message-sending application. It serves multiple real-time messages that belong to the same partition and are scheduled by a channel server. Groups of partitions and channels are scheduled by a cyclic CPU scheduler and a cyclic bus (network) scheduler, respectively.

A good example of an integrated real-time system is the Integrated Modular Avionics

(IMA), which is being embraced by the aerospace industry these days [1,2,3]. An application

running within a partition can be composed of multiple cooperating tasks. For instance,

Honeywell's Enhanced Ground Proximity Warning System (EGPWS) consists of tasks for map

loading, terrain threat detection, alert prioritization, display processing, etc. With spatial and

temporal partitioning, the EGPWS application can be developed separately and then integrated

with other applications running in different partitions of an IMA-based system. Its execution

cannot be affected by any malfunctions of other applications (presumably developed by other

manufacturers) via wild writes or task overruns. As long as sufficient resources are allocated to

the partition and the channels, the EGPWS application can ensure proper execution and meet its

real-time constraints.

One apparent advantage of IMA-based systems with spatial and temporal partitioning is

that each application is running in its own environment. Thus, as long as the partition

environment is not changed, an application's behavior remains constant even if other applications

are modified. This leads to a crucial advantage for avionics systems; i.e., when one application is

revised, other applications don't need to be re-certified by the FAA. Thus, the integration of










applications in a complex system can be upgraded and maintained easily. It is conceivable that

such an architecture with spatial and temporal partitioning can be used for integrating general

real-time applications.

To guarantee a strong partitioning concept in integrated real-time systems, we

investigated a two-level hierarchical scheduling theory that adopts distance-constrained cyclic

scheduling at lower levels and fixed-priority driven scheduling at higher levels. According to the

two-level scheduling theory, each real-time application (partition) is scheduled by a cyclic

scheduler; and tasks within an application are scheduled by a fixed-priority scheduler. Based on

the two-level scheduling theory, we devised a heuristic deadline decomposition and channel-

combining algorithm to schedule both partitions and channels concurrently. Soft and hard

aperiodic tasks are also supported efficiently by reclaiming unused processor capacities with a

distance-constrained cyclic scheduling approach. We also augmented the scheduling model with

practical constraints, which are demanded by ARINC 653 Integrated Modular Avionics

standards. The scheduling algorithms are evaluated by mathematical analysis and simulation

studies, and are implemented as an integrated scheduling tool suite.

To establish a software platform for integrated real-time systems, we developed a real-

time kernel, SPIRIT-μKernel. The goals of the SPIRIT-μKernel are to provide dependable

integration of real-time applications, flexibility in migrating operating system personalities from

kernel to user applications, including transparent support of heterogeneous COTS RTOS on top

of the kernel, and high performance. To support integration of real-time applications that have

different criticality, we have implemented a strong partitioning concept using a protected memory

(resource) manager and a partition (application) scheduler. We also developed a generic RTOS

Port Interface (RPI) for easy porting of heterogeneous COTS real-time operating systems on top

of the kernel in user mode. A variety of operating system personalities, such as task scheduling

policy, exception-handling policy and inter-task communication can be implemented within the

partition according to individual requirements of partition RTOS. To demonstrate this concept,










we ported two different application-level RTOSs, Wind River's VxWorks 5.3 and Cygnus's eCos

1.2, on top of the SPIRIT-μKernel. Performance results show that the kernel is practical and

appealing because of its low overhead.

Finally, we propose a Strong Partitioning Real-Time Ethernet (SPRETHER) to enable

real-time communication networks for integrated real-time systems using Ethernet technology.

Currently, Ethernet is the dominant network technology because of its cost-effectiveness, high

bandwidth, and availability. With a Carrier Sense Multiple Access / Collision Detect

(CSMA/CD) MAC protocol, Ethernet is inherently incapable of guaranteeing deterministic

accesses due to possible packet collision and random back-off periods. However, in order to take

advantage of COTS Ethernet in real-time systems, it is preferred not to change the standard

network hardware components and protocol, so that most advanced and inexpensive Ethernet

products can be used without any modifications. The issue is then how to design an easily

accessible, reliable, deterministic, and affordable real-time Ethernet for safety-critical real-time

systems using COTS Ethernet hardware and software methods. In the dissertation, we propose a

Table-Driven Proportional Access (TDPA) protocol in the standard Ethernet MAC layer to

achieve guaranteed real-time communication. In addition to real-time traffic, non-real-time

Ethernet traffic is also supported with embedded CSMA/CD operations in the TDPA protocol.

We performed scheduling analysis work for SPRETHER and measured performance by

experiments in a prototype network.

1.2 Research Background and Related Works

1.2.1 Fundamental Scheduling Theories

1.2.1.1 RMA, EDF, and time-driven scheduling

Rate Monotonic Analysis (RMA) is a collection of quantitative methods and algorithms,

with which we can specify, analyze, and predict the timing behavior of real-time software

systems. RMA grew out of the theory of fixed-priority scheduling. A theoretical treatment of the










problem of scheduling periodic tasks was first discussed by Serlin in 1972 [4] and then more

comprehensively discussed by Liu and Layland in 1973 [5]. They assumed an idealized situation

in which all tasks are periodic, do not synchronize with one another, do not suspend themselves

during execution, can be instantly pre-empted by higher-priority tasks, and have deadlines at the

end of their periods. The term "rate monotonic" originated as a name for the optimal task priority

assignment in which higher priorities are accorded to tasks that execute at higher arrival rates.

Rate monotonic scheduling is fixed-priority task scheduling that uses a rate monotonic

prioritization.

Compared to Rate Monotonic scheduling, which is a static priority scheduling method,

Earliest Deadline First (EDF) is a dynamic priority scheduling method. In EDF, the task with the

earliest deadline is always executed first.
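The two priority disciplines can be illustrated with the classical utilization tests; the following is an illustrative sketch (the task set and field names are hypothetical, and not part of the dissertation's tool suite):

```python
def rm_priorities(tasks):
    """Rate-monotonic priority assignment: shorter period -> higher priority."""
    return sorted(tasks, key=lambda t: t["T"])

def rm_utilization_bound_ok(tasks):
    """Liu-Layland sufficient (not necessary) test: U <= n * (2^(1/n) - 1)."""
    n = len(tasks)
    u = sum(t["C"] / t["T"] for t in tasks)
    return u <= n * (2 ** (1.0 / n) - 1)

def edf_utilization_ok(tasks):
    """Exact EDF test when deadlines equal periods: U <= 1."""
    return sum(t["C"] / t["T"] for t in tasks) <= 1.0

# Hypothetical task set: period T and worst-case execution time C
tasks = [{"T": 50, "C": 10}, {"T": 30, "C": 10}, {"T": 100, "C": 20}]
print([t["T"] for t in rm_priorities(tasks)])   # -> [30, 50, 100]
```

Here the total utilization is about 0.733, below the three-task bound of roughly 0.780, so both tests pass for this particular task set.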

When scheduling is time-driven, or clock-driven, the decisions of what jobs execute at

what times are made at specific time instants. These instants are chosen before the system begins

execution. Typically, in a system that uses time-driven scheduling, all the parameters of hard real-

time jobs are fixed and known. A schedule of the jobs is computed off-line and is stored for use at

run-time. The scheduler activates job execution according to this schedule at each scheduling

decision instance.
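A minimal sketch of such a table-driven dispatcher follows; the schedule table, major frame length, and job names are hypothetical:

```python
# One major frame of a pre-computed schedule: (start time, job) pairs,
# sorted by start time; the table repeats every MAJOR_FRAME time units.
MAJOR_FRAME = 100
SCHEDULE = [(0, "A"), (25, "B"), (40, "A"), (60, "C"), (85, "idle")]

def job_at(t):
    """Return the job dispatched at absolute time t by the stored schedule."""
    phase = t % MAJOR_FRAME          # position within the repeating frame
    current = SCHEDULE[0][1]
    for start, job in SCHEDULE:
        if start <= phase:
            current = job            # last entry whose start has passed
        else:
            break
    return current

print(job_at(30), job_at(140))   # -> B A
```

Because all decisions are fixed off-line, the run-time dispatcher reduces to this table lookup, which is the source of time-driven scheduling's predictability.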

In our integrated real-time systems model, we used a two-level scheduling approach that

adopts time-driven scheduling (cyclic scheduling) for the low-level scheduler; and rate monotonic

scheduling (fixed-priority driven scheduling) for the high-level scheduler. In low-level cyclic

scheduling, we provide distance constraints to guarantee independence between low- and high-

level schedulers. The characteristics of distance constraints ensure that at any given invocation

interval of the partition and channel, they will be allocated a pre-scheduled amount of processor

or communication capacity.










1.2.1.2 Aperiodic task scheduling

There have been substantial research works in the scheduling of aperiodic tasks in fixed-

priority systems. Examples include the sporadic server algorithm [6], the deferrable server

algorithm [7], and the slack stealer algorithm [8]. To serve aperiodic tasks in our integrated real-

time systems model, we may use one of the above servers as a dedicated aperiodic server for

each partition. However, this is not a good approach, as an aperiodic server cannot claim the

spare capacities left unused by the servers in other partitions.

A number of algorithms that solve aperiodic task scheduling in dynamic-priority systems

using an EDF scheduler can be found in the literature [9]. The Total Bandwidth Server assigns a

feasible priority to each arriving aperiodic task, while guaranteeing the deadlines of periodic

tasks. These solutions cannot be applied to cyclic scheduling, which requires distance constraints.

We can find useful concepts in Shen's resource reclaiming [10], Fohler's slot-shifting algorithm

[11], and J. Liu's aperiodic task scheduling in cyclic schedules [12]. However, their algorithms are

also not applicable to our integrated real-time systems model because distance constraints are not

considered in their scheduling models.
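For reference, the Total Bandwidth Server rule from the EDF literature [9] assigns each aperiodic request k the deadline d_k = max(r_k, d_{k-1}) + C_k / U_s, where U_s is the server utilization; a minimal sketch (the arrival values are illustrative):

```python
def tbs_deadlines(arrivals, u_s):
    """Total Bandwidth Server deadline assignment:
    d_k = max(r_k, d_{k-1}) + C_k / U_s."""
    d_prev = 0.0
    deadlines = []
    for r_k, c_k in arrivals:        # (release time, execution demand)
        d_k = max(r_k, d_prev) + c_k / u_s
        deadlines.append(d_k)
        d_prev = d_k
    return deadlines

# Server utilization 0.25; three requests given as (release, demand)
print(tbs_deadlines([(0, 1), (2, 2), (20, 1)], 0.25))   # -> [4.0, 12.0, 24.0]
```

The chaining through d_{k-1} is what caps the server's bandwidth at U_s, but the assigned deadlines carry no distance-constraint guarantee, which is why the rule does not transfer to our cyclic setting.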

1.2.2 Model and Theories for Integration of Real-Time Systems

A different two-level hierarchical scheduling scheme has been proposed by Deng and Liu

in [13]. The scheme allows real-time applications to share resources in an open environment. The

scheduling structure has an earliest-deadline-first (EDF) scheduling at the operating system level.

The second level scheduling within each application can be either time-driven or priority-driven.

For acceptance tests and admission of a new application, the scheme analyzes the application

schedulability. Then, the server size is determined and the server deadline of the job at the head

of the ready queue is set at run-time. Since the scheme does not rely on fixed allocation of

processor time or fine-grain time slicing, it can support various types of application requirements,

such as release-time jitters, nonpredictable scheduling instances, and stringent timing










requirements. However, their model does not support the Integrated Modular Avionics standards,

nor follow the strong partitioning concepts. To remedy the problem, our first step is to establish

scheduling requirements for the cyclic schedules, such that task schedulability, under given fixed-

priority schedules within each partition, can be ensured. The approach we adopt is similar to the

one in [13] of comparing task execution in the SPIRIT environment with that of a dedicated

processor. The cyclic schedule then tries to allocate partition execution intervals by "stealing"

task inactivity periods. This stealing approach resembles the slack stealer for scheduling soft-

aperiodic tasks in fixed-priority systems [8].

The scheduling approach for avionics applications under the APEX interface of IMA

architecture was discussed by Audsley and Wellings [14]. A recurrent solution to analyze task

response time in an application domain is derived, and the evaluation results show that there is

potential for a large amount of release jitter. However, the paper does not address the issues of

constructing cyclic schedules at the operating system level.

1.2.3 Real-Time Kernels

The microkernel idea coincided with efforts in the research community to build post-Unix

operating systems. New hardware (e.g., multiprocessors, massively parallel systems), application

requirements (e.g., security, multimedia, and real-time distributed computing) and programming

methodologies (e.g., object orientation, multithreading, persistence) required novel operating-

system concepts. The corresponding objects and mechanisms (threads, address spaces, remote

procedure calls (RPCs), message-based IPC, and group communication) were more basic and

more general abstractions than the typical Unix primitives. In addition, providing an API

compatible with Unix or another conventional operating system was an essential requirement;

hence implementing Unix on top of the new systems was a natural consequence. Therefore, the

microkernel idea became widely accepted by operating-system designers for two completely

different reasons: (1) general flexibility and power and (2) the fact that microkernels offered a

technique for preserving Unix compatibility while permitting development of novel operating










systems. Many academic projects took this path, including Amoeba [15], Choices [16], Ra [17],

and V [18]; some even moved to commercial use, particularly Chorus [19], L3 [20], and Mach

[21], which became the flagship of industrial microkernels. We refer to them as first-generation

microkernel architecture.

Compared to first-generation microkernel architecture, the second-generation

microkernels, such as MIT's Exokernel [22] and GMD's L4 [23], achieve better flexibility and

performance by following two fundamental design concepts. First, the microkernels are built

from scratch to avoid any negative inheritance from monolithic kernels. Second, the kernels are

implemented in a hardware-dependent manner to maximize support from the hardware. Our real-time

kernel, which implements strong partitioning concepts, follows the concepts of second-generation

microkernel architecture.

The number of contemporary real-time operating systems (RTOSs) has grown, both in

academia and in the commercial marketplace. Systems such as MARUTI [24] and MARS [25]

represent a time-triggered approach in the scheduling and dispatching of safety-critical real-time

applications, including avionics. However, these systems either provide no partitioning scheme,

as in the case of MARUTI; or rely completely on proprietary hardware, as in the case of MARS.

Commercial RTOSs, such as OSE, VRTX, and Neutrino, provide varying levels of

memory isolation for applications. However, applications written for any of them are RTOS

specific. In addition, it is not possible for applications written for one RTOS to co-exist on the

same CPU with a different RTOS, without a considerable effort of re-coding and re-testing. The

approach that we present in this dissertation research overcomes these drawbacks by ensuring

both spatial and temporal partitioning, while allowing the integration of applications developed

using a contemporary RTOS.

The Airplane Information Management System (AIMS) on board the Boeing 777

commercial airplane is among the few examples of IMA based systems [26]. Although the AIMS

and other currently used modular avionics setups offer strong partitioning, they lack the ability of










integrating legacy and third party applications. In our approach, a partition is an operating system

encompassing its application. All previous architectures have the application task(s) as the unit of

partitioning.

The SPIRIT-μKernel cyclic scheduler can be viewed as a virtual machine monitor that

exports the underlying hardware architecture to the application partitions, similar to the IBM

VM/370 [27]. The concept of virtual machines has been used for a variety of reasons, such as

cross-platform development [28], fault tolerance [29], and hardware-independent software

development [30]. Although some of these virtual machines restrict memory access to maintain

partitioning, we are not aware of any that enforce temporal partitioning or support real-time

operating systems.

Decomposing the operating system services into a μ-kernel augmented with multiple

user-level modules has been the subject of extensive research, such as SPIN [31,32], VINO [33],

Exokernel [22], and L4 [23,24]. The main objective of these μ-kernel-based systems is to

efficiently handle domain-specific applications by offering flexibility, modularity, and

extendibility. However, none of these systems is suitable for hard real-time applications.

RT-Mach and CVOE are among the few μ-kernel-based RTOSs that support spatial and

temporal partitioning. Temporal guarantees in RT-Mach are provided through an operating

system abstraction called processor reserve [35]. Processor reservation is accepted through an

admission control mechanism employing a defined policy. The CVOE is an extendable RTOS

that facilitates integration through the use of a callback mechanism to invoke application tasks

[36]. Yet tasks from different applications will be mixed, and thus legacy applications cannot be

smoothly integrated without substantial modification and revalidation.

1.2.4 Real-Time Communication Networks

We look into several commercial products available that claim to provide a real-time

performance guarantee over LANs. HP's 100VG-AnyLAN [37] uses advanced Demand Priority

Access to provide users with guaranteed bandwidth and low latency, and is now the IEEE 802.12










standard for 100-Mbps networking. National Semiconductor's Isochronous Ethernet [38] includes

a 10-Mbps P channel for normal Ethernet traffic, 96 64-Kbps B channels for real-time traffic, one

64-Kbps D channel for signaling, and one 96-Kbps M channel for maintenance. The 96 B

channels can provide bandwidth guarantees to network applications because they are completely

isolated from the CSMA/CD traffic. Isochronous Ethernet forms the IEEE 802.9 standard.

3COM's Priority Access Control Enabled (PACE) [39] technology enhances multimedia (data,

voice and video) applications by improving network bandwidth utilization, reducing latency,

controlling jitter, and supporting multiple traffic priority levels. It uses star-wired switching

configurations and enhancements to Ethernet that ensure efficient bandwidth utilization and

bounded latency and jitter. Because the real-time priority mechanism is provided by the switch,

there is no need to change the network hardware on the desktop machines.

On the other hand, RETHER provides real-time guarantees for multimedia applications

with a software approach. It uses a timed token similar to FDDI and P1394 to control the packet

transmission in the network. The tokens contain all connection information, and are passed as

handshaking signals for each data packet transmission. However, their scheme cannot fulfill the

requirements of safety-critical real-time systems such as integrated modular avionics (IMA)

standards. For example, RETHER does not support scalable multicast real-time sessions, which

are necessary for the upgrade and evolution of the IMA system, or a multi-frame cyclic schedule,

which provides a highly scalable scheduling capability. Also, all message transmissions must

follow the noncontention-based round-robin polling method, even if messages have no real-time

communication requirements.

1.3 Main Contributions

The objective of the dissertation is to design integrated real-time systems supporting

strong partitioning schemes. The main contributions of the dissertation are as follows:

A model for strongly partitioned integrated real-time systems. We propose a system

model for integrated real-time systems, which can be used for safety-critical real-time systems









such as integrated modular avionics systems. The model concerns (1) integration of real-time

applications with different criticalities, (2) integrated scheduling of processor and message

communication, and (3) guaranteeing a strong partitioning concept.

Comprehensive scheduling approaches to integrated real-time systems. We devise

comprehensive scheduling algorithms for integrated real-time systems. They include (1) a

fundamental two-level scheduling theory, (2) integrated scheduling of partitions and channels, (3)

soft and hard periodic task scheduling, (4) scheduling with practical constraints, and (5) a full

scheduling tool suite that provides automated scheduling analysis.

A real-time kernel for integrated real-time systems. We establish a software platform

for integrated real-time systems by means of SPIRIT-μKernel. The kernel implements (1) strong

partitioning concepts with a cyclic scheduler and a strong resource protection mechanism, and (2)

a generic real-time operating system port interface. (3) The kernel makes it possible to run real-

time applications developed in different real-time operating systems on the same processor while

guaranteeing strong partitioning.

A real-time Ethernet for integrated real-time systems. We design and prototype a

real-time Ethernet for integrated real-time systems by means of SPRETHER. The SPRETHER

includes (1) a table-driven proportional access protocol for implementing distance-constrained

cyclic scheduling, (2) support for real-time messages, (3) a scheduling algorithm to meet real-

time requirements, and (4) support for non-real-time messages by integrating original CSMA/CD

MAC operations into the TDPA protocol.

1.4 Organization of the Dissertation

The rest of the dissertation is organized as follows. Chapter 2 considers the modeling and

scheduling analysis of strongly partitioned real-time systems. A basic two-level scheduling theory

is presented to solve scheduling problems of integrated real-time systems. Based on the two-level

scheduling theory, we developed an integrated scheduling approach to find feasible schedules of

partitions and channels on a multiprocessor network platform. Then the schedulability and










behaviors of an integrated scheduling approach were studied through simulations. We also

discuss the incorporation of several practical constraints that would be required in different

application domains. To help a system integrator construct a schedule for integrated real-time

systems, we provide a scheduling tool suite, which automates all scheduling analysis presented in

the dissertation.

Chapter 3 presents a solution for supporting on-line scheduling of soft and hard aperiodic

tasks in integrated real-time systems. The Distance Constraint guaranteed Dynamic Cyclic (DC2)

scheduler is proposed, which uses three basic operations, named Left Sliding (LS), Right Putting

(RP), and Compacting, to dynamically schedule aperiodic tasks within a distance-constrained

cyclic schedule. We also developed an admission test algorithm and a scheduling method for hard

aperiodic tasks. Through the simulation studies, we observed that the DC2 scheduler could achieve a

significant performance enhancement in terms of the average response time of soft aperiodic

tasks and the acceptance rate of hard aperiodic tasks.

Chapter 4 describes the design of a real-time kernel, SPIRIT-μKernel, which implements

a strong partitioning scheme based on our two-level scheduling theory. The detailed

implementation issues are presented, including a generic real-time operating system port

interface, which is used to host heterogeneous COTS real-time operating systems on top of the

kernel. The performance of the kernel on an experimental platform is evaluated.

Chapter 5 deals with a real-time Ethernet called SPRETHER (Strong Partitioning Real-

Time Ethernet), which is designed for a communication network of integrated real-time systems.

The SPRETHER uses a software-oriented synchronization approach based on the table-driven

proportional access protocol to overcome the lack of deterministic characteristics of Ethernet. The

scheduling algorithm and performance analysis are presented. Chapter 6 summarizes the main

contributions of the dissertation, and discusses future research directions.















CHAPTER 2
INTEGRATED SCHEDULING OF PARTITIONS AND CHANNELS

2.1 Introduction

One of the essential issues of real-time systems is to provide scheduling algorithms that

guarantee timing constraints to real-time applications. In this chapter, we present research results

related to the partition (processor) and channel (communication) scheduling in integrated real-

time systems. First, we discuss the scheduling algorithm for a two-level hierarchical scheduling

model that is a fundamental model of integrated real-time systems. Because the priority-driven

scheduling model is the most popular and demandable in real-time systems, we adopt the priority-

driven model for applications in our two-level hierarchical scheduling model. In a priority-driven

model, all tasks are modeled as hard periodic tasks. To guarantee strict temporal partitioning

constraints among integrated applications, we adopt a distance-constrained cyclic scheduling

model for the lower-level scheduling method.

To schedule processor execution, we need to determine which partition is active and to

select a task from the active partition for execution. According to temporal partitioning, time slots

are allocated to partitions. Within each partition, fixed priorities are assigned to tasks based on

rate-monotonic or deadline-monotonic algorithms. A low-priority task can be preempted by high-

priority tasks of the same partition. In other words, the scheduling approach is hierarchical, in that

partitions are scheduled using a distance-constrained cyclic schedule; and tasks are dispatched

according to a fixed-priority schedule. Similar hierarchical scheduling is also applied to the

communication media, where channels are scheduled in a distance-constrained cyclic fashion and

have enough bandwidth to guarantee message communication. Within each channel, messages

are then ordered according to their priorities for transmission.










Given task-execution characteristics, we are to determine the distance-constrained cyclic

schedules for partitions and channels under which computation results can be delivered before or

on the task deadlines. The problem differs from typical cyclic scheduling since, at the partition

and channel levels, we do not evaluate the invocations of each individual task or message. Only

the aggregated task execution and message transmission models are considered. In addition, the

scheduling for partitions and channels must be done collectively, using a heuristic deadline

decomposition and channel-combining algorithm, such that tasks can complete their computation

and then send out the results without missing any deadlines. Soft and hard aperiodic tasks are also

supported efficiently by reclaiming unused processor capacities with a distance-constrained

practical constraints that are required by Integrated Modular Avionics standards. The scheduling

algorithms are evaluated by simulation studies and implemented as an integrated scheduling tool.

The chapter is organized as follows. Section 2.2 presents an integrated real-time systems

model. Section 2.3 discusses the fundamental two-level scheduling theory and provides the

characteristics of the scheduling approach. Section 2.4 describes the integrated scheduling

approach for partitions and channels. An evaluation of the algorithms by simulation studies is

also included. The algorithms to solve practical constraints and a scheduling tool suite are

described in Sections 2.5 and 2.6, respectively. The conclusion then follows in Section 2.7.

2.2 Integrated Real-Time Systems Model

The SPIRIT (Strongly Partitioned Integrated Real-Time Systems) model, as shown in

Figure 2-1, includes multiple processors connected by a time division multiplexing

communication bus, such as ARINC 659 [3]. Each processor has several execution partitions to

which applications can be allocated. An application consists of multiple concurrent tasks that can

communicate with each other within the application partition. Task execution is subject to

deadlines. Each task must complete its computation and send out the result messages on time in

order to meet its timing constraints. Messages are the only form of communication among










applications, regardless of whether their execution partitions are in the same processor or not. For

inter-partition communication, the bandwidth of the shared communication media is distributed

among all applications by assigning channels to a subset of tasks running in a partition. We

assume that there are hardware mechanisms to enforce the partition environment and channel

usage by each application, and to prevent any unauthorized accesses. Thus, task computation and

message transmission are protected in their application domain. The mechanisms could include a

hardware timer, memory protection controller, slot/channel mapping, and separate channel

buffers.

[Figure 2-1: nodes Node 1 through Node n each run a Cyclic CPU Scheduler over their Partition Servers and tasks; the channels of all nodes are scheduled by the Cyclic Bus Scheduler.]

Figure 2-1 Model for Strongly Partitioned Integrated Real-Time Systems (SPIRIT)

In our task model, we assume that each task arrives periodically and needs to send an

output message after its computation. Thus, as illustrated in Figure 2-2, tasks are specified by

several parameters, including invocation period (T_i), worst-case execution time (C_i), deadline (D_i),

and message size (M_i). Note that, to model sporadic tasks, we can assign the

parameter T_i as the minimum inter-arrival interval between two consecutive invocations.










[Figure 2-2 depicts one task period T_i: the computation C_i must complete by the computation deadline CD_i, and the output message M_i must be delivered by the message deadline MD_i; together CD_i and MD_i compose the task deadline D_i.]


Figure 2-2 Task Model and Deadline

In order to schedule tasks and messages at processors and communication channels, the

task deadline, D_i, is decomposed into a message deadline (MD_i) and a computation deadline (CD_i).

The assignment of message deadlines influences the bandwidth allocation for the message. For

example, when the message size, M_i, is 1 Kbyte and the message deadline is 10 ms, the

bandwidth requirement is 0.1 Mbytes per second. In the case of a 1 ms message deadline, the

bandwidth requirement becomes 1 Mbyte per second. However, a tradeoff must be made: a long

message deadline implies that less bandwidth needs to be allocated, but the task

computation must then be completed earlier.
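The bandwidth figures above follow directly from dividing the message size by the message deadline; a one-line sketch of the calculation (function name is illustrative):

```python
def bandwidth_required(message_bytes, deadline_ms):
    """Bandwidth (bytes per second) that must be reserved so that a message
    of message_bytes bytes is delivered within deadline_ms milliseconds."""
    return message_bytes * 1000 / deadline_ms

print(bandwidth_required(1000, 10))   # -> 100000.0  (0.1 Mbytes per second)
print(bandwidth_required(1000, 1))    # -> 1000000.0 (1 Mbyte per second)
```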

For each processor in the SPIRIT architecture, the scheduling is done in a two-level

hierarchy. The first level is within each partition server where the application tasks are running,

and a higher-priority task can preempt any lower-priority tasks of the same partition. The second

level is a distance-constrained cyclic partition schedule that allocates execution time to partition

servers of the processor. In other words, each partition server, S_k, is scheduled periodically with a

fixed period. We denote this period as the partition cycle, η_k. For each partition cycle, the server

can execute the tasks in the partition for an interval α_k·η_k, where α_k is less than or equal to 1 and is

called the partition capacity. For the remaining interval of (1-α_k)·η_k, the server, S_k, is blocked. Figure

2-3 shows an example execution sequence of a partition that consists of three tasks. During each

partition cycle, η_k, the tasks τ_1, τ_2, and τ_3 are scheduled to be executed for a period of α_k·η_k. If

there is no active task in the partition, the processor is idle and cannot run any active tasks from

other partitions.




Figure 2-3 Task and Partition Execution Sequence

Similarly, a two-level hierarchical scheduling method is applied to message and channel scheduling. A channel server provides fixed-priority preemptive scheduling for messages. Then, a distance-constrained cyclic schedule assigns a sequence of communication slots to each channel server according to its channel cycle, γ_k, and channel capacity, β_k. A channel may send out messages using β_k·γ_k slots during every period of γ_k slots. Note that we use the unit of "slot" for both message length and transmission time, with the assumption that communication bandwidth and slot length are given. For instance, a 64-bit slot in the 30 MHz, 2-bit wide ARINC 659 bus is equivalent to 1.0667 μs, and a message of 1000 bytes is transmitted in 125 slots. For convenience, we define the conversion factor ST as the slot-to-time ratio, and TS as the time-to-slot ratio, based on slot length and bus bandwidth.
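The slot-time arithmetic of the ARINC 659 example can be checked in a few lines. This is a sketch assuming the stated parameters (64-bit slots, a 2-bit-wide bus at 30 MHz); the names are ours:

```python
SLOT_BITS = 64        # bits per slot, from the ARINC 659 example
BUS_WIDTH_BITS = 2    # the bus transfers 2 bits per clock cycle
BUS_FREQ_HZ = 30e6    # 30 MHz bus clock

# ST: slot-to-time conversion factor, here in seconds per slot
ST = (SLOT_BITS / BUS_WIDTH_BITS) / BUS_FREQ_HZ

def slots_for_message(size_bytes):
    """Number of slots needed to carry a message (ceiling division)."""
    return -(-size_bytes * 8 // SLOT_BITS)

print(round(ST * 1e6, 4))       # about 1.0667 microseconds per slot
print(slots_for_message(1000))  # 125 slots for a 1000-byte message
```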


Table 2-1 Index Notations used in the SPIRIT Model

Notation                                         Description
n(N)                                             Number of nodes in the system
n(P_i), 1 ≤ i ≤ n(N)                             Number of partitions in node i
n(τ_{i,j}), 1 ≤ i ≤ n(N), 1 ≤ j ≤ n(P_i)         Number of tasks in partition j of node i
n(Q_{i,j}), 1 ≤ i ≤ n(N), 1 ≤ j ≤ n(P_i)         Number of channel servers of partition j of node i









For all the notations shown in Table 2-2, we use indices such that 1 ≤ i ≤ n(N), 1 ≤ j ≤ n(P_i), 1 ≤ k ≤ n(τ_{i,j}), and 1 ≤ l ≤ n(Q_{i,j}). For most of the notations, we also provide a simplified form, such as A_k for A_{i,j}.


Table 2-2 Notations used in the SPIRIT Model

Notation                 Description
N_i                      Node i
P_{i,j}                  Partition j of node i
A_{i,j} or A_k           Application j of node i
τ_{i,j,k} or τ_k         Task k of partition j of node i
S_{i,j} or S_k           Periodic partition server for P_{i,j}
Q_{i,j,l} or Q_k         Channel server l of partition j of node i
a_{i,j} or a_k           Partition capacity of partition server S_{i,j}
a^h_{i,j} or a^h_k       Partition capacity of partition server S_{i,j} after specialization of partition cycle
η_{i,j} or η_k           Partition cycle of partition server S_{i,j}
h_{i,j} or h_k           Partition cycle of partition server S_{i,j} after specialization
ρ_{i,j} or ρ_k           Processor utilization of partition server S_{i,j}
β_{i,j,l} or β_k         Channel capacity of channel server Q_{i,j,l}
β^h_{i,j,l} or β^h_k     Channel capacity of channel server Q_{i,j,l} after specialization of channel cycle
γ_{i,j,l} or γ_k         Channel cycle of channel server Q_{i,j,l}
γ^h_{i,j,l} or γ^h_k     Channel cycle of channel server Q_{i,j,l} after specialization
T_{i,j,k} or T_k         Period of task τ_{i,j,k}
C_{i,j,k} or C_k         Worst-case execution time of task τ_{i,j,k}
D_{i,j,k} or D_k         Deadline of task τ_{i,j,k} (includes message transmission)
CD_{i,j,k} or CD_k       Decomposed computation deadline of task τ_{i,j,k}
MD_{i,j,k} or MD_k       Decomposed message deadline of task τ_{i,j,k}
M_{i,j,k} or M_k         Message transmission size (number of slots) of task τ_{i,j,k}
ST                       Slot-to-time conversion factor
TS                       Time-to-slot conversion factor









2.3 Fundamental Scheduling Theory for Integration

In this section, we discuss the fundamental scheduling theory for the SPIRIT two-level

hierarchical scheduling model. We are only concerned with schedulability analysis for partitions

here. The next section shows integrated scheduling of both partitions and channels.

2.3.1 Schedulability Requirement

We consider the scheduling requirements for a partition server, S_k, that executes the tasks of an application partition A_k according to a fixed-priority preemptive scheduling algorithm and shares the processing capacity with other partition servers at the operating system level. Let application A_k consist of tasks τ_1, τ_2, ..., τ_n. Each task τ_i is invoked periodically with a period T_i and takes a worst-case execution time (WCET) C_i. Thus, the total processor utilization demanded by the application is ρ_k = Σ_{i=1}^{n} C_i/T_i. Also, upon each invocation, the task τ_i must be completed before its deadline D_i, where C_i ≤ D_i ≤ T_i.

As modeled in SPIRIT, the tasks of each partition run under fixed-priority preemptive scheduling. Suppose that there are n tasks in A_k listed in priority order τ_1, τ_2, ..., τ_n, where τ_1 has the highest priority and τ_n has the lowest. To evaluate the schedulability of the partition server S_k, let's consider the case that A_k is executed at a dedicated processor of speed a_k, normalized with respect to the processing speed of S_k. Based on the necessary and sufficient condition of schedulability [40,41], task τ_i is schedulable if there exists a

t ∈ H_i = { l·T_j | j = 1, 2, ..., i; l = 1, 2, ..., ⌊D_i/T_j⌋ } ∪ { D_i }, such that

W_i(a_k, t) = (1/a_k) Σ_{j=1}^{i} C_j ⌈t/T_j⌉ ≤ t.


The expression W_i(a_k, t) gives the worst-case cumulative execution demand made on the processor by the tasks with a priority higher than or equal to that of τ_i during the interval [0, t]. We now









define B_i(a_k) = max_{t ∈ H_i} { t − W_i(a_k, t) } and B_0(a_k) = min_{i=1,2,...,n} B_i(a_k). Note that, when τ_i is schedulable, B_i(a_k) represents the total period in the interval [0, D_i] during which the processor is not running any task with a priority higher than or equal to that of τ_i. It is equivalent to the level-i inactivity period in the interval [0, D_i] [8].

By comparing the task executions at server Sk and at a dedicated processor of speed ak,

we can obtain the following theorem:

Theorem 1. The application A_k is schedulable at server S_k, which has a partition cycle η_k and a partition capacity a_k, if

a) A_k is schedulable at a dedicated processor of speed a_k, and

b) η_k ≤ B_0(a_k) / (1 − a_k).

Proof: The task execution at server S_k can be modeled by the tasks τ_1, τ_2, ..., τ_n of A_k and an extra task τ_0 that is invoked every period η_k and has an execution time C_0 = (1−a_k)·η_k. The extra task τ_0 is assigned the highest priority and can preempt other tasks. We need to show that, given the two conditions, any task τ_i of A_k can meet its deadline even if there are preemptions caused by the invocations of task τ_0. According to the schedulability analysis in [40,41], task τ_i is schedulable at server S_k if there is a t ∈ H_i ∪ G_i such that

Σ_{j=1}^{i} C_j ⌈t/T_j⌉ + C_0 ⌈t/η_k⌉ ≤ t

where G_i = { l·η_k | l = 1, 2, ..., ⌊D_i/η_k⌋ }.

If τ_i is schedulable on a processor of speed a_k, there exists a t_i* ∈ H_i such that B_i(a_k) = t_i* − W_i(a_k, t_i*) ≥ B_0(a_k) ≥ 0 for all i = 1, 2, ..., n. Note that W_i(a_k, t) is a non-decreasing function of t. Assume that t_i* = m·η_k + δ, where δ < η_k. If δ ≥ B_0(a_k),

Σ_{j=1}^{i} C_j ⌈t_i*/T_j⌉ + C_0 ⌈t_i*/η_k⌉ = a_k·W_i(a_k, t_i*) + (m+1)·C_0
    = a_k (t_i* − B_i(a_k)) + (m+1)(1−a_k)·η_k
    ≤ a_k·t_i* − a_k·B_0(a_k) + (1−a_k)(t_i* − δ + η_k)
    ≤ t_i* + (1−a_k)(B_0(a_k) − δ)
    ≤ t_i*

where the third step uses (m+1)·η_k = t_i* − δ + η_k and B_i(a_k) ≥ B_0(a_k), and the fourth step uses (1−a_k)·η_k ≤ B_0(a_k) from condition b). The above inequality implies that τ_i is schedulable at server S_k in this case.

On the other hand, if δ < B_0(a_k), then, at t = m·η_k ≤ t_i*, we have

Σ_{j=1}^{i} C_j ⌈t/T_j⌉ + C_0 ⌈t/η_k⌉ ≤ a_k (t_i* − δ) + m·(1−a_k)·η_k
    = a_k (t_i* − δ) + (1−a_k)(t_i* − δ)
    = t

where the first step uses a_k·W_i(a_k, t) ≤ a_k·W_i(a_k, t_i*) = a_k(t_i* − B_i(a_k)) ≤ a_k(t_i* − δ). Since t ∈ G_i, task τ_i is schedulable at server S_k, and hence the application A_k is schedulable at server S_k.

When we compare the execution sequences at server S_k and at the dedicated processor, we can observe that, at the end of each partition cycle, S_k has devoted the same amount of processing capacity to running the application tasks as the dedicated processor. However, if the tasks are running at the dedicated processor, they are not blocked and can be completed earlier within each partition cycle. Thus, we need an additional constraint to bound the delay of task completion at server S_k. This bound is set by the second condition of the theorem and is equal to the minimum inactivity period before each task's deadline. ∎

An immediate extension of Theorem 1 is to include the possible blocking delay due to synchronization and operating system overheads. Assume that the tasks in the partition adopt a priority ceiling protocol to access shared objects. The blocking time is then bounded by the longest critical section in the partition in which the shared objects are accessed. Similarly, additional delays caused by the operating system can be considered. For instance, the partition may be invoked later than the scheduled moment because the preceding partition has just entered an O/S critical section. In this case, we can use the longest critical section of the operating system to bound this scheduling delay. These delay bounds can easily be included in the computation of W_i(a_k, t).

2.3.2 Characteristics of the Two-Level Scheduling Theory

Theorem 1 provides a solution to determine how frequently a partition server must be scheduled at the O/S level and how much processor capacity it should use during its partition cycle. It is easy to see that B_0(a_k) and η_k are increasing functions of a_k. This implies that if more processor capacity is assigned to a partition during its partition cycle, the tasks can still meet their deadlines even if the partition cycle increases. To illustrate the result of Theorem 1, we consider an example in Table 2-3 where four application partitions are allocated to a processor. Each partition consists of several periodic tasks, and the corresponding parameters (C_i, T_i) are listed in the table. Tasks are set to have deadlines equal to their periods and are scheduled within each partition according to a rate-monotonic algorithm [5]. The processor utilizations demanded by the four partitions, ρ_k, are 0.25, 0.15, 0.27, and 0.03, respectively.



Table 2-3 Task Parameters for the Example Partitions

Partition 1 Partition 2 Partition 3 Partition 4
(utilization=0.25) (utilization=0.15) (utilization=0.27) (utilization=0.03)
tasks (4, 100) (2, 50) (7,80) (1,80)
(C, T,) (9, 120) (1, 70) (9,100) (2,120)
(7, 150) (8, 110) (16,170)
(15,250) (4, 150)
(10, 320)

Following Theorem 1, the minimum level-i inactivity period is calculated for each partition and for a given capacity assignment a_k, i.e.,

B_0(a_k) = min_{i} max_{t ∈ H_i} ( t − (1/a_k) Σ_{j=1}^{i} C_j ⌈t/T_j⌉ ).

The resulting inactivity periods are plotted in Figure 2-4 for the four partitions. It is easy to see that,










when a_k is slightly larger than the processor utilization, the tasks with a low priority (and a long period) just meet their deadlines and thus yield a small inactivity period. On the other hand, when a_k is much larger than the processor utilization of the partition, the inactivity period is bounded by the smallest task period in each partition. This is due to the fact that the tasks with a short period cannot accumulate more inactivity period before their deadlines. The curves in the figure also show that an increase of a_k after the knees wouldn't make the inactivity periods significantly longer.
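Theorem 1 translates directly into a small schedulability check. The sketch below (Python; the function names are ours) computes the level-i inactivity periods B_i(a_k) over the scheduling points H_i and returns the maximum partition cycle B_0(a_k)/(1−a_k). Applied to Partition 2 of Table 2-3 with a_k = 0.28, it reproduces the bound of 59 time units quoted later for Figure 2-5.

```python
import math

def max_partition_cycle(tasks, a):
    """Largest partition cycle eta_k allowed by Theorem 1 for a partition
    whose tasks are (C_i, T_i, D_i) tuples in priority order (highest first)
    and whose capacity is a (dedicated-processor speed a_k)."""
    def inactivity(i):
        C_i, T_i, D_i = tasks[i]
        # Scheduling points H_i: multiples of periods T_1..T_i up to D_i, plus D_i
        pts = {D_i}
        for j in range(i + 1):
            T_j = tasks[j][1]
            pts.update(l * T_j for l in range(1, D_i // T_j + 1))
        # B_i(a) = max over H_i of t - W_i(a, t)
        return max(t - sum(tasks[j][0] * math.ceil(t / tasks[j][1])
                           for j in range(i + 1)) / a
                   for t in pts)

    B0 = min(inactivity(i) for i in range(len(tasks)))
    if B0 < 0:
        return None  # condition a) fails: not schedulable at speed a
    return B0 / (1 - a)  # condition b): eta_k <= B_0(a) / (1 - a)

# Partition 2 of Table 2-3 (deadlines equal to periods, rate-monotonic order)
p2 = [(2, 50, 50), (1, 70, 70), (8, 110, 110), (4, 150, 150), (10, 320, 320)]
print(math.floor(max_partition_cycle(p2, 0.28)))  # 59
```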











Figure 2-4 Inactivity Periods of the Example Partitions


In Figure 2-5, the maximum partition cycles are depicted with respect to the assigned capacity a_k. If points below the curve are chosen to set up the cyclic scheduling parameters for each partition at the O/S level, then the tasks in the partition are guaranteed to meet their deadlines. For instance, the curve for partition 2 indicates that, if the partition receives 28% of the processor capacity, then its tasks are schedulable as long as its partition cycle is less than or equal to 59 time units. Note that the maximum partition cycles increase as we assign more capacity to










each partition. This increase is governed by the accumulation of the inactivity period when a_k is small. For larger a_k, the growth then follows a factor of 1/(1−a_k).


Figure 2-5 suggests possible selections of (a_k, η_k) for the four partitions, subject to a total assignment of processor capacity not greater than 1. From the figure, feasible assignments for (a_k, η_k) are (0.32, 36), (0.28, 59), (0.34, 28), and (0.06, 57), respectively. In the following subsection, we shall discuss approaches to using the feasible pairs of (a_k, η_k) to construct cyclic schedules.



Figure 2-5 Maximum Partition Cycles for Different Processor Capacity Assignments


To reveal the properties of the parameters (a_k, η_k) under different task characteristics, we present two evaluations on the task set of Partition 1 in Table 2-3. We first alter the task deadlines such that D_i equals 1.0·T_i, 0.8·T_i, 0.6·T_i, and 0.4·T_i for all tasks. The tasks are then scheduled according to the deadline-monotonic algorithm within the partition. The maximum partition cycles are plotted in Figure 2-6. As the deadlines become tighter, the curves shift toward the bottom-right corner. This change suggests that either the partition must be invoked more frequently or

should be assigned more processor capacity. For instance, if we fix the partition cycle at 56 time units and reduce the task deadlines from 1.0·T_i to 0.4·T_i, then the processor capacity must be increased from 0.34 to 0.56 to guarantee schedulability.

The second experiment, shown in Figure 2-7, concerns partition schedulability under different task execution times. Assume that we upgrade the system with a higher-speed processor. Then, the application partitions are still schedulable even if we allocate a smaller amount of processor capacity or extend the partition cycles. In Figure 2-7, we show the maximum partition cycles for Partition 1 of Table 2-3. By changing task execution times proportionally, the processor utilizations are set to 0.25, 0.20, 0.15, and 0.10. It is interesting to observe the improvement of schedulability when a_k is slightly larger than the required utilization. For instance, assume that the partition cycle is set to 56. Then, for the four different utilization requirements, the processor capacity needed for schedulability can be reduced from 0.34 to 0.28, 0.212, and 0.145, respectively. On the other hand, once a more than sufficient capacity is assigned to the partition, the maximum partition cycle is relatively independent of the processor utilization and is mainly affected by task deadlines and periods.

2.3.3 Distance-Constrained Low-Level Cyclic Scheduling

With the result of the scheduling analysis, we can assign a fixed priority to each task, and a pair of partition capacity and partition cycle to each partition. In this subsection, we discuss how to construct a distance-constrained cyclic schedule from the result of the scheduling analysis.

Given the schedulability requirement of (a_k, η_k) for each partition server S_k, a cyclic schedule must be constructed at the lower-level scheduler. Notice that the pair of parameters (a_k, η_k) indicates that the partition must receive an a_k amount of processor capacity at least every η_k time units. The execution time allocated to the partition need not be continuous, nor restricted to any specific position within a scheduling cycle.















500
D=1.0 *T
450 D=0.8 T
-- D=0.6*T
400 D=0.4 T

350

300

250

200 -
-
150

100

50 .

0
0.0 .1 .2 .3 .4 .5 .6 .7


.8 .9 1.0


Figure 2-6 Maximum Partition Cycles for Partition 1 under Different Task Deadlines


350

300

250
0"
. 200
0
150
a
100

50

0


0.0 .1 .2 .3 .4 .5
a.


.6 .7 .8 .9 1.0


Figure 2-7 Maximum Partition Cycles of Partition 1 under Different Processor Utilizations


/
/
/ /
/ /

/ /
7//

/ /
/
/
//


util.=0.25
util.=0.20
util.=0.15
util.=0.10












//










This property makes the construction of the cyclic schedule extremely flexible. In the

following, we will use the example in Table 2-3 to illustrate two simple approaches.


1. Unique partition cycle approach: In this approach, the O/S schedules every partition with a cyclic period equal to the minimum of the η_k, and each partition is allocated an amount of processor capacity proportional to a_k. For instance, in the example of Table 2-3, a set of feasible assignments of (a_k, η_k) can be transformed from {(0.32, 36), (0.28, 59), (0.34, 28), (0.06, 57)} to {(0.32, 28), (0.28, 28), (0.34, 28), (0.06, 28)}. The resulting cyclic schedule is shown in Figure 2-8(a), where the O/S invokes each partition every 28 time units and allocates 8.96, 7.84, 9.52, and 1.68 time units to partitions 1, 2, 3, and 4, respectively.


2. Harmonic partition cycle approach: When partition cycles are substantially different, we can adjust them to form a set of harmonic cycles in which η_i is a multiple of η_j whenever η_j < η_i, for all i and j. Then, the O/S cyclic schedule runs repeatedly every major cycle, which is equal to the maximum of the adjusted η_k. Each major cycle is further divided into several minor cycles with a length equal to the minimum of the adjusted η_k. In Figure 2-8(b), the cyclic schedule for the example is illustrated, where the set of (a_k, η_k) is adjusted to {(0.32, 28), (0.28, 56), (0.34, 28), (0.06, 56)}. The major and minor cycles are 56 and 28 time units, respectively. Note that partitions 2 and 4 can be invoked once every major cycle. However, after the processing intervals for partitions 1 and 3 are assigned in every minor cycle, there does not exist a continuous processing interval of length 0.28·56 time units in a major cycle. To schedule the application tasks in partition 2, we can assign an interval of length 0.22·28 in the first minor cycle and another interval of length 0.34·28 in the second minor cycle to the partition. This assignment meets the requirement of allocating 28% of the processor capacity to the partition every 56 time units.
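The split of partition 2's allocation across the two minor cycles can be checked arithmetically; a minimal sketch using the numbers above (variable names are ours):

```python
minor, major = 28, 56  # minor and major cycles of Figure 2-8(b)

# Partition 2 receives 0.22*minor in the first minor cycle and 0.34*minor
# in the second; together they must equal 0.28 of every major cycle.
alloc_p2 = 0.22 * minor + 0.34 * minor
assert abs(alloc_p2 - 0.28 * major) < 1e-9  # both are about 15.68 time units

# First minor cycle: partitions 1 (0.32) and 3 (0.34) plus partition 2's
# first share (0.22) must fit within one minor cycle.
assert (0.32 + 0.34 + 0.22) * minor <= minor
print(alloc_p2)
```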










Comparing Figures 2-8(a) and (b), there are fewer context switches in the harmonic partition cycle approach. The reduction could be significant if there are many partitions with different partition cycles. In such a case, an optimal approach to constructing a set of harmonic cycles and assigning processing intervals should be sought in order to minimize the number of context switches. The other modification we may consider is that, when a partition cycle is reduced to fit into the cyclic schedule, the originally allocated capacity becomes more than sufficient. We can either redistribute the extra capacity equally to all partitions or keep the original allocation in the partition for future extensions.

In addition to the flexibility of cyclic schedules, the choice of a_k is adaptable as long as the sum of all a_k is less than or equal to 1. The parameter a_k must be selected to ensure partition schedulability at a dedicated processor of speed a_k. For instance, if the task set is scheduled according to a rate-monotonic algorithm, we can define a minimum capacity equal to ρ_k / (n(2^{1/n} − 1)), which guarantees partition schedulability. Additional capacity can be assigned to the partition so that the partition cycle can be prolonged. In the case that new tasks are added to the partition and the modified task set is still schedulable with the original capacity assignment, we may need to change the partition cycle and construct a new cyclic schedule at the O/S level. However, as long as the new cyclic schedule satisfies the requirement of (a_k, η_k) for each partition, no change to other partitions is necessary to ensure their schedulability.
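The rate-monotonic minimum capacity mentioned above is a one-line computation; a sketch (the function name is ours) applied to Partition 2 of Table 2-3:

```python
def min_rm_capacity(utilization, n_tasks):
    """Smallest partition capacity a_k for which the rate-monotonic
    utilization bound n(2^(1/n) - 1) guarantees schedulability of a
    partition with total utilization rho_k."""
    return utilization / (n_tasks * (2 ** (1 / n_tasks) - 1))

# Partition 2 of Table 2-3: utilization 0.15 spread over 5 tasks
print(round(min_rm_capacity(0.15, 5), 3))  # about 0.202
```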

A1 A2 A3 A4 | A1 A2 A3 A4 | A1 A2 A3 A4 | A1 A2 A3 A4

(a) Unique Partition Cycle

A1 A3 A4 A2 | A1 A3 A2 | A1 A3 A4 A2 | A1 A3 A2

(b) Harmonic Partition Cycle

Figure 2-8 Example Cyclic Schedule at the Lower Level










In distance-constrained scheduling, every partition must be allocated its exact capacity within every partition-cycle interval, as shown in Figure 2-9. Consequently, the tasks of a partition, which may arrive at arbitrary times, are always allocated sufficient capacity with which they can be scheduled.

Relaxed periodic model: the distance between consecutive invocations of a partition may be greater than the partition cycle.

Distance-constrained periodic model: the distance between consecutive invocations is equal to the partition cycle.



Figure 2-9 Distance-Constrained Scheduling Model


Let a feasible set of partition capacities and cycles be (a_1, η_1), (a_2, η_2), ..., (a_n, η_n), and let the set be sorted in non-decreasing order of η_k. The set cannot be used directly in a cyclic schedule that guarantees the distance constraint of assigning a_k processor capacity to a partition in every η_k period. To satisfy the distance constraint between any two consecutive invocations, we can adopt the pinwheel scheduling approach [42,43] and transform {η_k} into a harmonic set through a specialization operation. Note that, in [42], a fixed amount of processing time is allocated to each task and is not reduced even if we invoke the task more frequently. This can lead to a lower utilization after the specialization operations. For our partition-scheduling problem, we allocate a certain percentage of processor capacity to each partition. When the set of partition cycles {η_k} is transformed into a harmonic set {h_k}, this percentage doesn't change. Thus, we can schedule any feasible set of (a_k, η_k) as long as the total sum of a_k is less than or equal to 1.

A simple solution for a harmonic set {h_k} is to assign h_k = η_1 for all k. However, since this chooses the minimal invocation period for every partition, a substantial number of context switches between partitions could occur. A practical approach to avoiding excessive context switches is to use Han's Sx specialization algorithm with a base of 2 [42]. Given a base partition cycle η, the algorithm finds an h_i for each η_i that satisfies:

h_i = η·2^j ≤ η_i < η·2^{j+1} = 2·h_i

To find the base η that is optimal in the sense of processor utilization, we can test all candidates η in the range (η_1/2, η_1] and compute the total capacity Σ_k a^h_k. To obtain the total capacity, the set of η_k is transformed into the set of h_k based on the corresponding η, and then the least capacity requirement, a^h_k, for partition cycle h_k is obtained from Theorem 1. The optimal η is selected so as to minimize the total capacity. In Figure 2-10, we show a fixed cyclic processor scheduling example that guarantees distance constraints for the set of partition capacities and cycles A(0.1, 12), B(0.2, 14), C(0.1, 21), D(0.2, 25), E(0.1, 48), and F(0.3, 50). We use the base of 10 to convert the partition cycles to 10, 10, 20, 20, 40, and 40, respectively.




Figure 2-10 Processor Cyclic Schedule Example
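The specialization step itself can be sketched in a few lines. This sketch assumes every cycle is at least as large as the chosen base, and it omits the capacity recomputation via Theorem 1 that drives the choice of the optimal base:

```python
import math

def specialize(cycles, base):
    """Han's Sx specialization with base 2 (per [42]): replace each cycle
    eta_i with h_i = base * 2**j, the largest such value not exceeding
    eta_i, so that the resulting set {h_i} is harmonic."""
    return [base * 2 ** int(math.log2(c / base)) for c in cycles]

# The six partition cycles of the Figure 2-10 example, with base 10
print(specialize([12, 14, 21, 25, 48, 50], 10))  # [10, 10, 20, 20, 40, 40]
```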


2.4 Integrated Scheduling Theory for Partitions and Channels

2.4.1 Scheduling Approach

The objective of an integrated scheduling approach is to find feasible cyclic schedules for partition and channel servers, which process tasks and transmit messages according to their fixed priorities within the servers. With proper capacity allocation and sufficiently frequent invocation of each server, the combined delays of task execution and message transmission are bounded by the task deadlines. In Figure 2-11, we show the overall approach, which first applies a heuristic deadline decomposition to divide the problem into two parts: partition scheduling and channel scheduling. If either one cannot be done successfully, the approach iterates with a modified deadline assignment. We also assume that the initial task set imposes a processor utilization and a bus utilization of less than 100%, and that each task's deadline is larger than its execution time plus its message transmission time, i.e., D_i ≥ C_i + ST·M_i for task i.


Figure 2-11 Combined Partition and Channel Scheduling Approach


2.4.1.1 Deadline decomposition

It is necessary to decompose the original task deadline, D_i, into a computation deadline, CD_i, and a message deadline, MD_i, for every task before we can schedule the servers for partition execution and message transmission. A deadline decomposition algorithm is used to assign these deadlines in a heuristic way. If we assign tight message deadlines, messages may not be schedulable. Similarly, if tasks have tight computation deadlines, processor scheduling can fail. The following equations are used to calculate the message deadline and computation deadline for each task:

Message Deadline: MD_i = D_i · ( ST·M_i / (C_i + ST·M_i) ) · f

Computation Deadline: CD_i = D_i − MD_i

where f is an adjusting factor for each task. The main idea of deadline decomposition is to allocate the deadlines, CD_i and MD_i, in proportion to the time requirements for task execution and message transmission. In addition, the adjusting factor f is used to calibrate the computation and message deadlines based on the results of previous scheduling attempts and the utilization of the processor and communication bus. Since the message and computation deadlines must be lower-bounded by the transmission time (ST·M_i) and the computation time (C_i), respectively, and upper-bounded by D_i, we can obtain the lower and upper bounds of the adjusting factor f as

(C_i + ST·M_i) / D_i ≤ f ≤ (D_i − C_i)(C_i + ST·M_i) / (D_i · ST·M_i)

Since an adjusting factor of 1.0 is a fair distribution and is always included in the range of f, we set the initial value of f to 1. The heuristic deadline decomposition, as shown in Figure 2-12, resembles a binary search in attempting to find the right proportion of computation and message deadlines. If we reach a situation in which a new value cannot be assigned for all tasks, we declare the input set of tasks unschedulable.



initialization for all tasks:

    MinF = (C_i + ST·M_i) / D_i;
    MaxF = (D_i − C_i)(C_i + ST·M_i) / (D_i · ST·M_i);
    f = 1.0;

iterative change of f when either partition or channel scheduling fails:

    if (partition scheduling fails) {
        MaxF = f;  f = (MinF + f) / 2.0;
    }
    else if (channel scheduling fails) {
        MinF = f;  f = (MaxF + f) / 2.0;
    }
}


Figure 2-12 Deadline Decomposition Algorithm
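The decomposition formula and one binary-search step can be sketched as follows (function names are ours; the example task is the first task of Partition (1,1) in Table 2-4, with C = 4, M = 7000, D = T = 90, and ST = 0.0005):

```python
def decompose(C, M, D, ST, f=1.0):
    """Split task deadline D into computation and message deadlines in
    proportion to execution time C and transmission time ST*M."""
    MD = D * (ST * M) / (C + ST * M) * f
    return D - MD, MD  # (CD_i, MD_i)

def adjust_f(f, min_f, max_f, partition_failed, channel_failed):
    """One binary-search step of Figure 2-12: shrink f (shorter message
    deadline, looser computation deadline) when partition scheduling
    fails, grow it when channel scheduling fails."""
    if partition_failed:
        return (min_f + f) / 2.0, min_f, f        # new f, MinF, MaxF
    if channel_failed:
        return (max_f + f) / 2.0, f, max_f
    return f, min_f, max_f

CD, MD = decompose(C=4, M=7000, D=90, ST=0.0005)
print(CD, MD)  # CD is about 48.0 and MD about 42.0 (4 vs. 3.5 time units, scaled to 90)
```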










2.4.1.2 Partition and channel scheduling

In SPIRIT, partitions and channels are cyclically scheduled. The partition cyclic schedule is based on the partition cycle, η_k, and the partition capacity, a_k. Similarly, a channel cyclic schedule with parameters β_k and γ_k implies that the channel can utilize β_k·γ_k slots during a period of γ_k slots. While tasks and messages are scheduled according to their priorities within the periodic servers, the cyclic schedule determines the response times of task execution and message transmission.

We can use the same scheduling method for channel scheduling as for partition scheduling. A channel server, Q_k, transmits its messages according to a fixed-priority preemptive scheduling method. It provides a bandwidth of β_k·γ_k slots to the messages in the channel during every channel cycle, γ_k, where β_k ≤ 1. For the remaining slots of (1−β_k)·γ_k, the channel server is blocked. Since each channel server follows the same two-level hierarchical scheduling as the partition servers, Theorem 1 can be directly applied to obtain the pair of parameters (β_k, γ_k). However, there are several differences. First, only an integer number of slots can be assigned to a channel server. Thus, we can either use ⌈β_k·γ_k⌉ slots or restrict β_k·γ_k to be an integer. The second difference is that message arrivals are not always periodic due to possible release jitters. Release jitters can be included in the schedulability test if they are bounded by some maximum values [44]. The release jitter can also be eliminated if the communication controller incorporates a timed message service that becomes active immediately after the computation deadline expires. The last difference is the assignment of messages to channels. According to the principle of partitioning, tasks from different partitions cannot share the same channel for message transmission. For the tasks in a partition, we can group a subset of tasks and let them share a channel server. The grouping can be done based on the semantics of the messages or other engineering constraints. Also, the multiplexing of messages in a shared channel may lead to a saving in bandwidth reservation.










2.4.1.3 Channel combining

For a channel server that transmits a periodic message with a deadline MD_i and a message size M_i, we must allocate a minimum bandwidth of M_i/MD_i. Since the total bus bandwidth is limited, we may not always be able to assign one channel server to each message. However, we may be able to combine some messages and let them share a common channel server. This can lead to a bandwidth reduction, since the reserved bandwidth can be better utilized by messages with different deadlines. For example, given two messages 1 and 2 with parameters (M1, MD1, T1) and (M2, MD2, T2), respectively, the minimum bandwidth requirements, in terms of slots per time unit, for separate channels for messages 1 and 2, and for the combined channel, can be computed as follows:


CB1 = M1/MD1,   CB2 = M2/MD2

CB12 = max{ M1/MD1, ( M1·⌈MD2/T1⌉ + M2 ) / MD2 }

We assume that message 1 has a higher priority than message 2 in the computation of CB12, i.e., the required bandwidth if messages 1 and 2 share a common channel server. The cost of message preemption, which can be at most one slot per preemption, is ignored, since we assume that slots are the basic transmission units of the communication bus. Notice that CB12 is not always less than CB1+CB2. However, if message 1 has a much shorter deadline compared to its period, and message 2 has a deadline longer than message 1's period, then the bandwidth reduction CB1+CB2−CB12 becomes substantial. While we reserve a proper amount of bandwidth for an urgent message, the channel is only partially utilized if the message arrives infrequently. This provides a good opportunity to accommodate additional messages in the same channel, and results in a reduction of the required bandwidth.










To help understand channel combining, we give a simple example. Let's assume that there are two messages, Msg1 and Msg2, with parameters (2, 5, 10) and (1, 10, 20), respectively. When we assume that deadline D_i equals period T_i, the message deadline MD_i must be less than T_i due to the deadline decomposition, which makes the equation CD_i + MD_i = D_i (= T_i) hold. According to the above equations, we obtain CB1 = 2/5, CB2 = 1/10, CB1+CB2 = 1/2, and CB12 = 2/5. So the bandwidth reduction is 1/10 after combining the two messages into a single channel. We illustrate the correctness of the channel combining in Figure 2-13, where both messages are released at the same time.



Figure 2-13 Scheduling Example of Two Messages in Combined Channel
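The two-message bandwidth comparison above can be reproduced directly from the CB equations (function names are ours):

```python
import math

def cb_single(M, MD):
    """Minimum bandwidth (slots per time unit) for a dedicated channel."""
    return M / MD

def cb_combined(m1, m2):
    """Minimum bandwidth when message 1 (higher priority) and message 2
    share one channel server; each argument is an (M, MD, T) triple."""
    M1, MD1, T1 = m1
    M2, MD2, T2 = m2
    return max(M1 / MD1, (M1 * math.ceil(MD2 / T1) + M2) / MD2)

msg1, msg2 = (2, 5, 10), (1, 10, 20)
separate = cb_single(2, 5) + cb_single(1, 10)
combined = cb_combined(msg1, msg2)
print(separate, combined)  # 0.5 versus 0.4: a reduction of 1/10
```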


The above equations also imply that the maximum bandwidth reduction is obtained by combining a message with a long deadline and a message with a short deadline, where the period of the latter should be greater than the message deadline of the former. With this observation, we devise a heuristic channel-combining algorithm, shown in Figure 2-14. The minimum bandwidth requirement of a channel consisting of messages 1, 2, ..., k−1, and k is computed as:

CB12...k = max_{j=1,...,k} { ( Σ_{i=1}^{j-1} M_i·⌈MD_j/T_i⌉ + M_j ) / MD_j }











where we assume that message j has a higher priority than message j+1. Note that the real bandwidth allocation must be determined according to the choice of channel cycle, as described in Theorem 1. However, in order to calculate the channel cycle and capacity, the messages in each channel must be known. The channel-combining algorithm outlined in Figure 2-14 is developed to allocate messages to channels for each partition and to reduce the minimum bandwidth requirement to a specific threshold. If the combined channels cannot be scheduled, we can further decrease the target threshold until no additional combining can be done.


initialization (channel combining is allowed only for tasks in the same partition):

    assign one channel server Q_k to the message of each task

iterate the following steps until the sum of all CB_k is less than the target threshold:

    determine all pairs of combinable channel servers Q_k and Q_j, where the max. message
    deadline in Q_k is larger than the min. task period in Q_j

    for every pair of combinable channel servers Q_k and Q_j {
        calculate the bandwidth reduction CB_k + CB_j − CB_kj
    }

    combine Q_j with the server Q_k that results in the maximum bandwidth reduction



Figure 2-14 Heuristic Channel Combining Algorithm
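The bandwidth computation and the greedy combining loop of Figure 2-14 can be sketched in Python as follows. This is an illustrative rendering, not the dissertation's implementation: the function and variable names are invented, and the combinability pre-filter on deadlines versus periods is folded into the "largest reduction" selection.

```python
import math

def channel_bandwidth(msgs):
    """Minimum bandwidth CB of one channel. msgs: (M, MD, T) per message
    (length, message deadline, period); priorities are deadline-monotonic.
    CB = max_j ( sum_{i<j} ceil(MD_j / T_i) * M_i + M_j ) / MD_j."""
    msgs = sorted(msgs, key=lambda m: m[1])          # shortest deadline first
    cb = 0.0
    for j, (Mj, MDj, _) in enumerate(msgs):
        demand = sum(math.ceil(MDj / Ti) * Mi for Mi, _, Ti in msgs[:j]) + Mj
        cb = max(cb, demand / MDj)
    return cb

def combine_channels(channels, threshold):
    """Greedy channel combining: repeatedly merge the pair of channels that
    yields the largest bandwidth reduction, until the total minimum bandwidth
    drops to the target threshold or no merge helps."""
    while sum(channel_bandwidth(c) for c in channels) > threshold and len(channels) > 1:
        best, gain = None, 0.0
        for i in range(len(channels)):
            for j in range(i + 1, len(channels)):
                r = (channel_bandwidth(channels[i]) + channel_bandwidth(channels[j])
                     - channel_bandwidth(channels[i] + channels[j]))
                if r > gain:
                    best, gain = (i, j), r
        if best is None:                              # no merge reduces bandwidth
            break
        i, j = best
        channels[i] = channels[i] + channels.pop(j)
    return channels
```

For example, a message (M=1, MD=5, T=10) alone needs capacity 0.2 and (M=2, MD=20, T=40) needs 0.1, but the combined channel still needs only 0.2, saving 0.1 of bus capacity.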

The basic method of cyclic scheduling for channel servers is the same as that of partition server scheduling. The only difference is that channel bandwidth allocation must be done in an integer number of slots. Let the feasible bus bandwidth capacity allocation set be (β1, η1), (β2, η2), ..., (βn, ηn). Using the specialization operation, the set {ηk} is transformed into a harmonic set {mk}. Then, based on Theorem 1 and the reduced cycle mk, we can adjust the channel capacity βk to a value βk^h such that an integer number of slots, ⌈βk^h · mk⌉, is allocated to the channel server Qk.
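Concretely, the integer-slot adjustment can be sketched as below (a minimal illustration; the name slot_allocation is invented):

```python
import math

def slot_allocation(beta, m_slots):
    """Round a fractional channel capacity beta up to an integer number of
    slots within a harmonic cycle of m_slots slots; returns the slot count
    and the effective (slightly inflated) capacity."""
    slots = math.ceil(beta * m_slots)      # whole slots reserved per cycle
    beta_h = slots / m_slots               # effective capacity actually granted
    return slots, beta_h
```

For instance, a capacity of 0.063 over a 202-slot cycle rounds up to 13 slots, i.e., an effective capacity of 13/202.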









2.4.1.4 Example of integrated scheduling

To illustrate how the scheduling algorithm works, we consider the example given in Table 2-4, in which a total of five partitions are allocated to two processors. For simplicity, we assume that task deadlines are equal to their periods. In this example, we consider the APEX/OASYS mission-critical avionics real-time system as the platform. In APEX/OASYS, the total bandwidth of the bus is 2,000,000 slots per second, and one time unit is capable of 2,000 slots. Therefore, the conversion factor ST is 0.0005 and TS is 2,000.
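As a small worked check of the conversion factors (illustrative code, not part of the platform):

```python
BUS_SLOTS_PER_SECOND = 2_000_000     # total bus bandwidth
SLOTS_PER_TIME_UNIT = 2_000          # TS: one time unit carries 2,000 slots
ST = 1 / SLOTS_PER_TIME_UNIT         # 0.0005 time units per slot

def slots_to_time(slots):
    """Convert a message length in slots to bus time units."""
    return slots * ST

def time_to_slots(t):
    """Convert a duration in time units to bus slots."""
    return t * SLOTS_PER_TIME_UNIT
```

So a 7,000-slot message from Table 2-4 occupies 3.5 time units of bus time.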

The processor utilizations of each processor, and the bus utilizations, are also given in Table 2-4. Since both the processor and the bus share the deadline of each task, the schedulable utilization will be lower than in a single-resource scheduling problem. The processor utilization ρ and bus utilization σ are given in the parentheses below the processor and partition identifications, in the form (ρ, σ).


Table 2-4 Task Parameters for the Example of Integrated Scheduling


Processor 1     Partition (1,1)   Partition (1,2)   Partition (1,3)
(0.65, 0.30)    (0.26, 0.14)      (0.13, 0.09)      (0.26, 0.07)
                (4,7000,90)       (2,2000,50)       (7,4000,80)
Tasks           (9,9000,120)      (1,2000,70)       (9,4000,100)
(Ck, Mk, Tk)    (7,9000,150)      (8,14000,110)     (10,5000,120)
                (15,15000,250)
                (10,2000,320)

Processor 2     Partition (2,1)   Partition (2,2)
(0.55, 0.17)    (0.25, 0.04)      (0.30, 0.13)
                (5,4000,80)       (6,8000,60)
Tasks           (9,3000,100)      (10,7000,90)
(Ck, Mk, Tk)    (11,0,120)        (14,6000,150)














[Figure: two plots of maximum partition cycle ηk (0 to 300) versus processor capacity αk (0.0 to 1.0); one panel for Processor 1, with curves for Partitions (1,1), (1,2), and (1,3), and one panel for Processor 2, with curves for Partitions (2,1) and (2,2).]

Figure 2-15 Processor Capacity vs. Partition Cycle










In Figure 2-15, the maximum partition cycles are depicted with respect to the processor capacity αk. If points below the curve are chosen to set up the cyclic scheduling parameters of each partition in the processor cyclic scheduler, the tasks in the partition are guaranteed to meet their deadlines. From the result of the scheduling algorithm, we obtain the sets of (αk, ηk): (0.356, 17), (0.262, 6), (0.381, 11) for Processor 1 and (0.375, 17), (0.624, 31) for Processor 2. By the specialization technique of distance-constrained scheduling, we transform these into the harmonic sets (0.356, 16), (0.262, 4), (0.381, 8) and (0.375, 15), (0.624, 30) for each of the five partitions.
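One way to realize such a harmonic transformation is sketched below. It rounds each period down onto a chain base·2^i; this is only an approximation of the specialization step of distance-constrained scheduling (the dissertation's exact rule and candidate bases may differ, so it does not necessarily reproduce the sets quoted above).

```python
import math

def harmonic_specialize(periods):
    """Round each period down onto a harmonic chain base * 2^i, trying
    candidate bases derived from the periods and keeping the base that
    shrinks the periods least (i.e., inflates the capacities least)."""
    lo = min(periods)
    candidates = {p / 2 ** k for p in periods for k in range(8) if p / 2 ** k <= lo}

    def transform(base):
        return [base * 2 ** int(math.log2(p / base)) for p in periods]

    # keep the base whose transformed periods stay closest to the originals
    return max((transform(b) for b in candidates),
               key=lambda t: sum(n / o for n, o in zip(t, periods)))
```

Every returned period is no larger than the original, so a schedule feasible at the original cycle remains feasible at the harmonic one.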

In Table 2-5 and Table 2-6, we show the result of cyclic scheduling for the chosen

harmonic sets of parameters.



Table 2-5 Cyclic Time Schedule for Processor 1

start      0.000   1.048   4.000   5.048   5.144   8.000
end        1.048   4.000   5.048   5.144   8.000   9.048
partition  2       3       2       3       1       2

start      9.048   12.000  13.048  13.144  15.984
end        12.000  13.048  13.144  15.984  16.000
partition  3       2       3       1       IDLE


Table 2-6 Cyclic Time Schedule for Processor 2

start      0.000   5.625   15.000  20.625  29.970
end        5.625   15.000  20.625  29.970  30.000
partition  1       2       1       2       IDLE

In terms of message scheduling, we have sixteen messages that share the bus for message transmission. The scheduling algorithm combines messages to reduce the total bandwidth requirement to fit the shared bus. In this example, the result of the message (channel) combining algorithm is: Q1,1,1 = {τ1,1,2, τ1,1,5}, Q1,1,2 = {τ1,1,4}, Q1,1,3 = {τ1,1,1, τ1,1,3}, Q1,2,1 = {τ1,2,1, τ1,2,2, τ1,2,3}, Q1,3,1 = {τ1,3,1, τ1,3,2, τ1,3,3}, Q2,1,1 = {τ2,1,1, τ2,1,2}, Q2,2,1 = {τ2,2,1, τ2,2,2, τ2,2,3}. In Figure 2-16, the maximum channel cycles are depicted with respect to the bus capacity βk. Since the channel server cycle grows exponentially with the capacity βk, we plot its logarithm instead. If points below the curve are chosen to set up the cyclic scheduling parameters of each channel server in the bus cyclic scheduler, the messages in the channel server are guaranteed to meet their deadlines. From the result of the scheduling algorithm, we obtain the sets of (βk, ηk): (0.079, 136), (0.063, 297), (0.101, 127), (0.157, 114), (0.144, 122) for Processor 1 and (0.068, 172), (0.179, 101) for Processor 2. Applying the harmonic transformation algorithm, we obtain the harmonic sets of (βk, mk): (0.079, 101), (0.063, 202), (0.101, 101), (0.157, 101), (0.144, 101) and (0.068, 101), (0.179, 101) for the seven channel servers, respectively. In Table 2-7, we show the result of cyclic scheduling in terms of slot numbers in a major frame of 202 slots.


Table 2-7 Slot Allocation of Cyclic Scheduling for Bus of Major Frame Size 202

start-end  channel    start-end  channel    start-end  channel
000-018    Q2,2,1     069-075    Q2,1,1     136-150    Q1,3,1
019-034    Q1,2,1     076-088    Q1,1,2     151-161    Q1,1,3
035-049    Q1,3,1     089-100    IDLE       162-169    Q1,1,1
050-060    Q1,1,3     101-119    Q2,2,1     170-176    Q2,1,1
061-068    Q1,1,1     120-135    Q1,2,1     177-201    IDLE















[Figure: two log-scale plots (1e+1 to 1e+10) of channel server cycle, log10 ηk, versus channel bandwidth capacity βk (0.0 to 1.0); one panel for the Processor 1 channel servers (1,1,1), (1,1,2), (1,1,3), (1,2,1), (1,3,1), and one panel for the Processor 2 channel servers (2,1,1), (2,2,1).]

Figure 2-16 Bus Capacity vs. Channel Server Cycle


2.4.2 Algorithm Evaluation

In this section, we present the evaluation results of the integrated scheduling algorithms.

Firstly, we show the percentage of schedulable task sets in terms of processor and bus utilization

under the two-level scheduling, deadline decomposition and channel combining algorithms.

Then, we show that the penalty of the harmonic transformation is negligibly small. Finally, the

characteristic behavior of deadline decomposition is illustrated. The evaluations are done with

random task and message sets that are generated with specific processor and bus utilization.

2.4.2.1 Schedulability test

A schedulability test of the algorithm is obtained using the simulations of a system model

that consists of four processors, three partitions per processor and five tasks per partition, i.e., a

configuration of (4, 3, 5). The simulations use random task sets that result in variable processor

utilization of 15%, 30%, 45%, 60% and 75%. The task periods are uniformly distributed between

the minimum and maximum periods. The total processor utilization is randomly distributed to all

tasks in each processor and is used to compute the task execution times. To create message sets,

we vary the total bus utilization from 10% to 90%. Message lengths are computed with a random

distribution of the total bus utilization and task periods.
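The task-set generation described above can be sketched as follows (the parameter names and the uniform splitting rule are illustrative assumptions, not the dissertation's exact generator):

```python
import random

def random_task_sets(n_proc=4, n_part=3, n_task=5, proc_util=0.30,
                     period_range=(50, 500), seed=1):
    """Generate one random (N,P,T) configuration: task periods are drawn
    uniformly from period_range, and the target processor utilization is
    split randomly over all tasks of each processor to derive execution
    times."""
    rng = random.Random(seed)
    system = []
    for _ in range(n_proc):
        weights = [rng.random() for _ in range(n_part * n_task)]
        total = sum(weights)
        partitions, k = [], 0
        for _ in range(n_part):
            part = []
            for _ in range(n_task):
                T = rng.uniform(*period_range)        # uniform task period
                u = weights[k] / total * proc_util    # this task's utilization share
                part.append((u * T, T))               # (C, T): execution time, period
                k += 1
            partitions.append(part)
        system.append(partitions)
    return system
```

By construction, the task utilizations of each processor sum exactly to the target value.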

Using the scheduling procedure of Figure 2-11 we first assign task and message deadlines

for each task. Then the partition capacity and cycle for each partition are computed and the cyclic

schedule for each processor is constructed. To schedule message transmission, messages are

combined into channels in order to reduce bandwidth requirement. After channel cycle and

capacity are determined, a cyclic schedule is formed. For the priority schedules within partitions

and channels, we adopt the deadline monotonic approach to order the task and message priorities.

With all randomly created task sets, we report the percentage of schedulable task sets among all

sets in Figure 2-17. The figure shows the algorithms are capable of finding proper deadline

assignments and, then, determining feasible partition and channel cyclic schedules.













[Figure: percentage of schedulable task sets versus bus utilization (0.0 to 1.1) for configurations (N,P,T) = (4,3,5) and (2,2,4), with one curve per processor utilization level: 0.15, 0.30, 0.45, 0.60, and 0.75.]

Figure 2-17 Schedulability Test for Configuration (4,3,5) and (2,2,4)


For instance, consider the case of 60% processor and bus utilization. Even if the

deadlines are less than the task periods, almost 100% of task sets are schedulable. Figure 2-17

also reports the test results of the configuration (2, 2, 4). The curves have similar trends to that of

the configuration (4, 3, 5).

2.4.2.2 Effects of deadline decomposition and channel combining algorithms

It is worthwhile looking into how the bus is utilized in the channel schedules resulting

from the heuristic algorithms of deadline decomposition and channel combining. Consider the

following measures:

1. Measure is the bus utilization which equals the sum of (ST*M) T, for

all tasks. No real-time constraint of message delivery is considered in this measure.

2. Measure is the total bus capacity needed to transmit messages on time

with no channel combining (i.e., each task has a dedicated channel). This capacity will be

equal to the summation of (ST*A I i1 /) for all tasks and can be computed after message

deadlines are assigned.

3. Measure is the minimum bus capacity needed to schedule channels.

This measure is equal to the summation of minimum 3k for all channels. Note that,

according to Theorem 1, the minimum 3k for a channel is defined as the minimum

capacity that results in a zero inactive period for at least one message in the channel. It

can be determined after message deadlines are assigned and messages are combined into

the channel.

4. Measure is the total bus capacity selected according to Theorem 1. This

measure can be formulated as the summation of 3k for all channels.

5. Measure is the final bus capacity allocated to all channels based on a

harmonic set of channel cycles and the integer number of slots for each channel. The

capacity is equal to the summation of F3kh mk 'ui, for all channels.










We can expect the order Measure2 > Measure5 > Measure4 > Measure3 > Measure1 among the measures. Measure2 should be much higher than the other measures, as we allocate bandwidth for each message independently to ensure on-schedule message delivery. With message multiplexing within each channel, on-schedule message delivery can be achieved with a smaller amount of bandwidth. However, a bandwidth allocation following Measure3 is not practical, since the channel cycles must be infinitely small. According to Theorem 1, Measure4 contains additional capacity that is added to each channel to allow temporary blocking of message transmission during each channel cycle. Furthermore, in Measure5, extra capacity is allocated as we enforce an integer number of slots for each channel and construct a cyclic schedule with harmonic periods.

The simulation results of the above measures are shown in Figure 2-18. The results confirm our expectation of the order relationship. However, when we change the bus utilization from 0.1 to 0.8, the curves are not monotonically increasing (except the curve of Measure1). This is a consequence of the deadline decomposition (DD) algorithm. When channels do not have enough bandwidth to meet short message deadlines, the algorithm adjusts the factor f and assigns longer deadlines for message transmission. As shown in Figure 2-12, the DD algorithm uses an approach similar to binary search and makes a big initial increase to fk. This results in long deadlines and reduced capacity allocations in Measure2-5. In fact, when the bus utilization is less than 30%, the average number of iterations performed in the DD algorithm is slightly larger than 1, i.e., only the initial f is used to allocate deadlines. When the bus utilization is raised from 40% to 70%, the average number of iterations jumps to 1.6, 1.98, 2.0, and 2.04, respectively. It further increases to 11.09 when the bus utilization is set to 80%.

Figure 2-18 also illustrates the magnitude of the measures and the differences between

them. The gap between Measure3 and Measure2 is very visible. This difference is the product of

the channel combining algorithm. In order to meet a tight message deadline, we have to reserve a

large amount of bandwidth. With channel combining, messages of different deadlines share the











allocated slots. As long as the message with a shorter deadline can preempt the on-going

transmission, the slots in each channel can be fully utilized by multiplexing and prioritizing

message transmissions.

[Figure: the five measures plotted against bus utilization (0.0 to 1.0) for a (4,3,5) system at 30% processor utilization; the curves appear in the order Measure2, Measure5, Measure4, Measure3, Measure1 from top to bottom.]

Figure 2-18 Measures for Bus Utilization and Capacities


There is a moderate gap between Measure3 and Measure4. As indicated in Theorem 1, we search for a channel capacity and a channel cycle located at the knee of the curve of ηk versus βk, after its initial sharp rise. This implies that a small increase of βk is added in Measure4 in order to obtain a reasonable channel cycle. Finally, the difference between Measure4 and Measure5 is not significant at all. It is caused by the process of converting ηk to a harmonic cycle mk, and by allocating an integer number of slots ⌈βk^h · mk⌉ to each channel.


The other way of looking into the behavior of the deadline decomposition algorithm is to investigate the resultant decomposition of the task deadline Di. In Figure 2-19, we show the average ratio of message deadline to task deadline under different processor and bus utilizations. If the adjustment factor f is constant, the ratio

MDi / Di = ( ST*Mi / (Ci + ST*Mi) ) * f

should follow a concave curve as we increase the bus utilization (by increasing the message length Mi). For instance,









when the processor utilization is 15%, there are two segments of concave curves from bus

utilization 10% to 70% and from 70% to 90%. The segmentation indicates a jump in the

adjustment factors resulting from the deadline decomposition algorithm. In Figure 2-19, the concavity and the segmentation can also be seen in the other curves, which represent the message deadline ratios at different processor utilizations. When the processor utilization is high, f may be modified gradually, since partition scheduling may fail if we introduce a sharp increase to f. Thus, the concavity and the segmentation are not as obvious as in the deadline-ratio curve of an underutilized processor.

[Figure: average ratio of message deadline to task deadline versus bus utilization (0.0 to 1.0), with one curve per processor utilization level: 15%, 30%, 45%, 60%, and 75%.]

Figure 2-19 Ratio of Message Deadline to Task Deadline


2.5 Solving Practical Constraints

Since real-time systems are usually fault-tolerant, long-lived, and hardware/software-restricted systems, there are practical issues to be considered in addition to the fundamental scheduling problem. In this section, we present six additional practical issues and propose enhanced scheduling algorithms to solve them. To achieve efficient and economical lifetime maintenance of IMA systems, we provide incremental changing algorithms. To support replication-based fault tolerance, we supplement replication scheduling algorithms. To meet clock synchronization requirements, we add fault-tolerant cyclic slot allocation algorithms. To address practical constraints due to hardware and software limitations, we have also devised algorithms for a fixed bus major frame size, a fixed message size, and time-tick-based scheduling.

2.5.1 Incremental Changing Scheduling

Since an aircraft can be in service for several decades, it is very likely that the system will be upgraded during its lifetime. Under current practice and FAA requirements, validation and certification processes for the whole system must be repeated after each upgrade. The incremental changing algorithms seek a minimal modification of the previous schedule when a subsystem is changed. With these algorithms, only the resources of modified or new partitions need to be re-allocated. We can reuse the original schedule (processor capacity, channel allocation, channel bandwidth, major and minor periods, etc.) for unaffected partitions and channel servers. Thus, we can avoid significant system re-design efforts and reduce the costs of re-certification. We define the parameters for describing the incremental changing algorithms in Table 2-8.



Table 2-8 Notations for Incremental Changing Algorithm

Notation         Description
avPC_i           available processor capacity of node i (= 1 - Σ αk)
avCC             available communication bus capacity (= 1 - Σ βk)
HP-P_i, HP-C     major (hyper) period of each node i and of the communication bus
mp-P_i, mp-C     minor period of each node i and of the communication bus
avp_{i,j}        accumulated available free capacity for harmonic period size j of node i
avc_k            accumulated available free capacity for harmonic period size k of the communication bus










We introduce the concept of available capacity for each harmonic period, for both processor and bus cyclic scheduling. These available capacities can be assigned to newly attached partition and channel servers. To achieve flexibility in the cyclic scheduling and allocation of new servers, we have to manage the available capacities carefully. In the example of Figure 2-20, four partition servers are allocated processor capacities of (1/7, 7), (1/7, 14), (1/7, 14), and (1/7, 28), respectively. We evenly distribute the available capacity, 3/7, over the harmonic periods 7, 14, and 28. In this example, we obtain the parameter values 3/7 for avPC_i, 1/7 for avp_{i,7}, 4/14 for avp_{i,14}, and 12/28 for avp_{i,28}. AS_k stands for the available server for period k.
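The accumulated available capacities of this example can be reproduced with a short sketch (the even-split rule follows the example above; the function name is invented):

```python
def accumulated_available(total_free, harmonic_periods):
    """Spread the free processor capacity evenly over the harmonic periods
    and accumulate it: capacity free at a shorter period is also usable at
    any longer period, so each entry includes all shorter-period shares."""
    share = total_free / len(harmonic_periods)
    avail, acc = {}, 0.0
    for p in sorted(harmonic_periods):
        acc += share
        avail[p] = acc
    return avail
```

With 3/7 free capacity and periods {7, 14, 28}, this yields 1/7, 4/14, and 12/28, matching the figures quoted above.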

[Figure: a cyclic schedule with major frame 28 and minor frame 7; each minor frame interleaves the partition servers P1, P2, P3, and P4 with the available servers AS7, AS14, and AS28 reserved for the harmonic periods 7, 14, and 28.]

Figure 2-20 Cyclic Schedule Example for Incremental Changing


2.5.1.1 Attach algorithm

The attach algorithm finds a feasible schedule for a new partition without affecting the original schedule of the existing partitions. Since the attach algorithm has a flow similar to the basic two-level scheduling algorithm, we discuss only the necessary modifications in this subsection.

In terms of utilization, the sums of the processor and bus utilizations of all tasks in the new partition must be less than the current available capacities of node i, avPC_i and avCC, respectively. A modification is needed when selecting a pair of partition capacity and cycle, (αk, ηk), from the schedule-graph of the new partition, which is obtained from Theorem 1. From the schedule-graph, we choose possible candidates (αk, ηk) that satisfy the following criterion.

[Criterion 1.] Sets of (αk, ηk) such that ηk divides HP-P_i, ηk ≥ mp-P_i, and αk ≤ avp_{i,ηk}











For example, assume that we have found two candidates, (1/7, 7) and (3/14, 14), for a new partition. The longer the partition cycle, the higher the partition capacity required. Since the periods 7 and 14 are included in the original harmonic set {7, 14, 28} and there is sufficient available processor capacity for the two periods, we can insert the new partition into the original schedule. Figure 2-21 shows the result of inserting Pnew(1/7, 7) and Pnew(3/14, 14) in (a) and (b), respectively. The selection between them is up to the system designer; in this example, Pnew(1/7, 7) is better in terms of processor utilization.

[Figure: the schedule of Figure 2-20 with the new partition inserted; in (a), Pnew(1/7, 7) occupies the AS7 slots of every minor frame, and in (b), Pnew(3/14, 14) occupies the AS14 slots of every other minor frame.]

Figure 2-21 Example of Attaching New Partition


After processor (partition) scheduling, we assign a channel server to each message (task), and set the requested channel bandwidth for the new partition to CThres. CThres, an important parameter of our channel combining algorithm, is the maximum allowable bus utilization for the partition. Obviously, CThres must be less than the available bus capacity, avCC. We combine channel servers until the sum of their utilizations does not exceed CThres. If we succeed in reaching CThres, we proceed to the channel schedulability check and channel slot allocation.

The following criterion is used to choose a candidate set of (βk, γk) from the schedule-graph of channel cycle γk versus channel capacity βk, after channel combining.

[Criterion 2.] Sets of (βk, γk) such that γk divides HP-C, γk ≥ mp-C, and ⌈βk·γk⌉ ≤ ⌊avc_{γk}·γk⌋










As a result, we obtain a candidate set of (βk, γk) for each channel server. The next step is to select a feasible (βk, γk) from the candidate set of each channel server. We use DFS (depth-first search) to find a feasible solution. Assume that we have n channel servers and each channel server has multiple feasible candidate pairs of (βk, γk).

Firstly, we build a graph from root level 0 to leaf level n. The root node at level 0 is a virtual initiating node that keeps the full available bus capacities for each harmonic period. From level 1 to n-1, each level i has nodes that represent channel server i's candidate set of (βi, γi). At level n, channel server n's feasible candidate set forms the leaf nodes. We construct the graph by connecting every non-leaf node to all the nodes at the next level. Secondly, the search proceeds from the root until it reaches a leaf node. As the search enters a node at the next level, it determines whether this node can be included in the current feasible set by comparing the required capacity of this node against the parent node's available capacity. If the check fails, the search returns to the upper level and continues the DFS. If it succeeds, it allocates the current (βi, γi) of the node to the corresponding channel server i. The remaining available capacities of the current node for all harmonic periods are copied from the parent node; only the capacity for the period assigned to the current channel server i is adjusted, by subtracting the currently assigned capacity from the parent's. Finally, if the search reaches any leaf node, the current path from root to leaf is one of the feasible sets. We can then select an optimal feasible set among the feasible sets.
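The DFS described above can be sketched as follows. This is a simplified illustration: it tracks one availability figure per harmonic period and charges only the chosen period, whereas the full algorithm adjusts the accumulated capacities across periods; the names are invented.

```python
def dfs_assign(servers, avail):
    """Depth-first search for one feasible (beta, gamma) per channel server.
    servers: per-server candidate lists [(beta, gamma), ...];
    avail: remaining capacity per harmonic period (dict)."""
    def search(level, avail, chosen):
        if level == len(servers):
            return chosen                            # reached a leaf: feasible set
        for beta, gamma in servers[level]:
            if avail.get(gamma, 0.0) + 1e-12 >= beta:   # capacity check at this period
                nxt = dict(avail)                    # copy the parent's availability
                nxt[gamma] = nxt.get(gamma, 0.0) - beta  # charge only the chosen period
                found = search(level + 1, nxt, chosen + [(beta, gamma)])
                if found is not None:
                    return found
        return None                                  # all candidates failed: backtrack
    return search(0, avail, [])
```

For instance, if server 1 may take (0.3, 10) or (0.2, 20) and server 2 needs (0.5, 10), with 0.6 free at period 10 and 0.3 at period 20, the search backtracks off the first branch and settles on (0.2, 20) and (0.5, 10).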

2.5.1.2 Detach & modify algorithms

The detach algorithm is used to remove a capacity allocation from the cyclic schedule. The detach algorithm is relatively simple, for two reasons. Firstly, there is no dependency among partition servers, so we simply de-allocate the processor capacity originally allocated to the partition being detached. Secondly, a channel server only serves the messages of the same partition, so we can de-allocate the bus capacity allocated to the channel servers belonging to the partition.

To maintain correct system information after a detach, we adjust the system parameters as follows, assuming the detached partition had a processor capacity allocation of (αk, ηk):

avPC_i = avPC_i + αk
avp_{i,η} = avp_{i,η} + αk, for every η such that η ≥ ηk and η divides HP-P_i

The basic modify algorithm is performed by sequential application of the detach and attach algorithms. First it removes the old partition with the detach algorithm; then it attaches the changed partition as a new partition. Since the two-level scheduling algorithm can give extra capacity to servers if necessary, we do not need to change the original schedule when the existing capacity is sufficient.

In the case of deleting tasks from an existing partition, we do not need to change the allocation: by deleting tasks, the existing partition is simply over-provisioned with processor capacity. It is also possible that, as we delete tasks, some channel servers become empty. We can either leave them as they are or remove them. If we decide to remove empty channel servers, we should de-allocate their bus capacity and adjust avCC and the corresponding avc_k. Now consider changing the parameters of the original tasks. Since we do not delete any partition or channel servers in this case, we can keep the original schedule if all schedulability checks pass with the changed parameters.
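The capacity-return step of the detach algorithm can be sketched as below (an illustration with invented names mirroring Table 2-8):

```python
def detach(avPC, avp, alpha_k, eta_k, hp):
    """Return a detached partition's processor capacity to the bookkeeping:
    add alpha_k back to avPC and to every accumulated avp entry whose
    harmonic period is >= eta_k and divides the major period hp."""
    avPC += alpha_k
    for eta in avp:
        if eta >= eta_k and hp % eta == 0:
            avp[eta] += alpha_k
    return avPC, avp
```

Detaching a (1/7, 14) server from the example of Figure 2-20 restores capacity at periods 14 and 28 but leaves avp_{i,7} untouched, since a period-7 slot cannot be recovered from a period-14 allocation.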

2.5.2 Replicated Partition Scheduling

To achieve fault tolerance, safety-critical real-time systems usually adopt a redundancy mechanism that replicates an application on multiple processors. The replicated partition instances in the same group must be guaranteed to meet the same real-time constraints while executing on different processors. In Figure 2-22, we show an example of two replicated partition groups in an IMA system composed of three multiprocessors.

For simplicity of explanation, we assume that each group possesses only one channel server, Ch. 1 and Ch. 2 respectively. Replicated partition 1 must be feasibly scheduled in both node 1 and node 2. In the processor execution model, the replicated partition instances in node 1 and node 2 may be scheduled differently, because they run in different node configurations.

[Figure: three nodes, each running a mix of replicated and non-replicated partitions; replicated partition 1 appears on nodes 1 and 2 sharing channel Ch. 1, and the bus timeline below shows the replicated and non-replicated channels within the minor and major periods.]

Figure 2-22 Replicated Partitions in IMA Systems


However, in the case of channel scheduling, both instances must be scheduled in the same manner, because they share one channel server, Ch. 1. If we only consider replicated partition 1, the scheduling objective is to find feasible processor schedules (αk, ηk) and (α'k, η'k) for node 1 and node 2, and a channel schedule (βk, γk) for Ch. 1. As long as these three capacity assignments for replicated partition 1 are kept in the global system scheduling, both replicated partition instances will meet the same timing constraints.

The overall scheduling takes two steps: replicated partition scheduling and global scheduling.

Firstly, we schedule each replicated partition group with its expected processor and bus utilizations. We apply the basic two-level scheduling algorithm to the replicated partitions as if the total processor and bus capacities were equal to the expected processor and bus utilizations. The following heuristic equations are used to find the expected processor and bus utilization of each replicated partition group. With these utilization bounds, the scheduler makes its best effort to find a feasible schedule for the replicated partitions. There is still a chance of missing the expected utilization of some replicated partitions; we allow this minor miss, because we can achieve the total system utilization bounds later, in non-replicated partition scheduling.

ERPU_k = min_i { TPU_i · PU_k / Σ_j PU_{i,j} }

ERBU_k = TBU · BU_k / (Total Pure Bus Util.)

The expected processor utilization of replicated partition k, ERPU_k, is obtained by taking the minimum utilization requirement among all nodes hosting replicated partition k. TPU_i is the total target processor utilization of node i, with a default of 100%. PU_{i,j} is the processor utilization of partition j of node i. PU_k is the processor utilization of the corresponding replicated partition group. The expected bus utilization of replicated partition k, ERBU_k, is obtained by taking the replicated partition's proportional share of the total pure bus utilization. We count the bus utilization of a replicated partition group only once when calculating the total pure bus utilization, because the replicated partition instances in the same group share one channel.
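The two heuristic equations can be sketched in code as follows (a minimal illustration; the argument names are invented, and pu[i] is assumed to list the utilizations of all partitions of node i, including the replicated instance):

```python
def expected_utilizations(tpu, pu, pu_rep, tbu, bu_rep, total_pure_bus):
    """Expected processor and bus utilization for one replicated partition
    group.
    tpu: {node: target processor utilization TPU_i}
    pu:  {node: [PU_{i,j} for every partition j of node i]}
    pu_rep / bu_rep: processor / bus utilization of the replicated group
    tbu: total target bus utilization; total_pure_bus: total pure bus util."""
    erpu = min(tpu[i] * pu_rep / sum(pu[i]) for i in tpu)  # min over hosting nodes
    erbu = tbu * bu_rep / total_pure_bus                   # proportional bus share
    return erpu, erbu
```

Taking the minimum over the hosting nodes ensures the expected bound is achievable on every node that runs an instance of the group.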

With the two calculated parameters, ERPU_k and ERBU_k, per replicated partition group, we can find feasible schedules, i.e., schedule-graphs of (αk, ηk) per replicated partition server and (βk, γk) per channel server, using the basic two-level scheduling algorithm. During replicated partition scheduling, every task deadline is decomposed into a computation deadline and a message deadline, and messages are combined appropriately to achieve the expected bus utilization.

Secondly, we schedule the remaining non-replicated partitions without changing the results of the replicated partition scheduling, such as the deadline decomposition, priority assignment, and channel bandwidth allocation. That is, we use the same schedule-graphs of (αk, ηk) and (βk, γk) that were obtained in replicated partition scheduling. The global scheduling algorithm finally selects feasible (αk, ηk) and (βk, γk) for all replicated partition and channel servers from the schedule-graphs while scheduling the non-replicated partitions.

2.5.3 Fault Tolerant Cyclic Slot Allocation for Clock Synchronization

The ARINC 659 fault-tolerant TDM bus is designed for safety-critical IMA systems that require very high reliability. It is equipped with four redundant serial data buses and four clock lines for fault tolerance. Clock synchronization is achieved by bit- and frame-level synchronization messages in a distributed way. In this subsection, we propose a fault-tolerant cyclic slot allocation algorithm that can be used in the distributed clock synchronization environment adopted in ARINC 659.

We assume that in our target distributed clock synchronization environment, when one of

the bus controllers gets bus mastership and transmits either real or idle data, it also distributes a

reference clock to all other bus controllers simultaneously. Other bus controllers try to

synchronize their local clock to the current reference clock. This approach might cause global

system clock failure if the current bus master fails and no reference clock is provided for longer

than a certain threshold time. Our new algorithm ensures that a bus master cannot be allocated

slot time longer than this threshold. Also, the failed node is prevented from re-gaining bus

mastership within another pre-determined threshold time. With this approach, bus

synchronization can be recovered from single node failure without protection switching.

We use two parameters to specify the reliability requirement of a distributed clock
synchronization mechanism, and introduce the Idle Clock Synchronization Channel Server (ICSCS).
The two parameters are MaxH (Maximum Clock Holding Time) and MinR (Minimum Clock Re-sync
Time), which are determined by the reliability requirements of the implemented hardware. The two
parameters imply that the number of consecutive slots allocated to a single channel must be less
than or equal to MaxH, and that a channel that has yielded bus mastership must wait at least MinR
slots before it can be allocated again.
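Reading the two thresholds as "at most MaxH consecutive slots per channel" and "at least MinR slots of wait after yielding mastership", a slot schedule can be checked mechanically. The sketch below is illustrative only (Python, with a hypothetical raw slot-list representation; the algorithm in this section works on FREE/FREEZE intervals instead):

```python
def satisfies_maxh_minr(slots, max_h, min_r):
    """Check a cyclic slot allocation against MaxH and MinR.

    `slots` lists the node holding bus mastership in each slot; the
    schedule repeats cyclically.  Hypothetical encoding: the List
    structure in the text stores typed intervals, not raw slots.
    """
    n = len(slots)
    if len(set(slots)) == 1:          # one node owns every slot
        return n <= max_h
    for i in range(n):
        node = slots[i]
        if slots[i - 1] == node:
            continue                  # not the start of a run
        # MaxH: length of the consecutive run starting at slot i
        run = 1
        while run < n and slots[(i + run) % n] == node:
            run += 1
        if run > max_h:
            return False
        # MinR: slots this node must wait before its next run
        gap = 0
        j = (i + run) % n
        while gap < n - run and slots[(j + gap) % n] != node:
            gap += 1
        if gap < min_r:
            return False
    return True
```

For the setting of Figure 2-23 (MaxH = MinR = 1), any valid schedule must interleave the channels slot by slot.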















We present the basic idea with a simple example in Figure 2-23. In the example, there are


three nodes, which have a channel server of bandwidth CH1 (2,10), CH2 (3,10), and CH3 (3,20)


respectively. The two parameters, MaxH and MinR, are both 1.


[Figure 2-23 shows the step-by-step slot allocation in panels (a) through (l): FREE and FREEZE
intervals over slots 0-9, the successive allocations of CH1, CH2, and CH3, and the final assignment
of the remaining FREE and FREEZE intervals to the ICSCS of each node.]

Figure 2-23 Example of Fault Tolerant Cyclic Slot Allocation


The basic idea of the algorithm is that it allocates slots to the channel servers in non-


decreasing order of channel cycle. During allocation, it maintains a temporary bus schedule List


that is an ordered set of allocated intervals. The List is initialized with only two interval entries.


The first entry is a type FREE interval of size 9 (the minimum channel cycle, 10, minus MinR) and
the second is a type FREEZE interval of size 1, i.e., MinR, as in Figure 2-23-(a). For the channel
servers of the minimum channel cycle, i.e., CH1 and CH2, we put their allocation intervals into the
FREE interval while meeting the MaxH and MinR requirements, as in Figure 2-23-(b)(c)(d)(e). If it
is not possible to allocate any channel server of the corresponding channel cycle in a specific slot
interval due to the MaxH and MinR requirements, we temporarily designate the interval as type
FREEZE, as in Figure 2-23-(f). After finishing the current channel cycle, the scheduling list is copied to fit









with the length of the next channel cycle as in Figure 2-23-(g). When we copy the List, we reset the
intervals of type FREEZE that were generated in the previous allocation to FREE. Thus, the newly
freed intervals can be allocated for the next channel cycle as in Figure 2-23-(h). Figures 2-23-(k)
and (l) show that non-allocated FREE and FREEZE slot intervals are properly assigned to the
ICSCS of each node. The allocation can be done easily when we maintain two temporary vector
variables, CAS and waitslots, for each dynamically created time interval. CAS stands for
consecutively allocated slots, and waitslots stands for the wait time, per node, after yielding bus
mastership for that interval.

2.5.4 Miscellaneous Practical Constraints

2.5.4.1 Fixed bus major frame size scheduling

In the original scheduling algorithm, we tried to find the optimal major frame size for the
TDM bus with given channel bandwidth requirements. In a practical situation, however, it is also
necessary to schedule channels with a fixed major frame size, because the major frame size can be
tightly coupled with the bus hardware design. In fixed bus major frame size scheduling, the
distance-constrained cyclic scheduling is limited in its choice of the base minor frame size, denoted
as μ. The algorithm requires that the fixed major frame size be a multiple of the base minor frame
size. The following algorithm finds a feasible pair of channel capacity and cycle for each channel
server that satisfies the major frame size constraint. We formalize the problem as follows:

(input) MF (Major Frame Size), a set of (β_1, μ_1), (β_2, μ_2), ..., (β_n, μ_n)

(output) a set of (β_1*, m_1), (β_2*, m_2), ..., (β_n*, m_n), with m_i | m_j for i < j and
m_i | MF, i = 1, 2, ..., n, where β_i* = β_i·μ_i/m_i and β_1* + β_2* + ... + β_n* < 1.0

We show the self-explanatory algorithm in Figure 2-24.











for (μ = μ0; μ >= μ0/2; μ--) {          // check all possible candidates of μ
    if (μ divides MF) {                  // base minor frame size must divide MF
        // find MPS (Multi-Periods Set)
        MPS = { μ·2^i | μ·2^i <= MF AND μ·2^i divides MF, i = 0,1,2,... };
        set m_i = max { m ∈ MPS | m <= μ_i }, i = 1,2,...,n;
        if ( Σ_{i=1..n} β_i·μ_i/m_i < 1.0 ) found = TRUE;
    }
}


Figure 2-24 Cycle Transformation Algorithm for Fixed Bus Major Frame Size Scheduling
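The same transformation can be paraphrased in Python as below. This is a sketch, not the dissertation's code: the names are mine, and it assumes, as in the distance-constrained transformation, that shrinking a cycle from μ to m keeps the per-cycle allocation, so the channel's utilization grows to β·μ/m.

```python
def transform_cycles(MF, channels, mu_hat):
    """Sketch of the Figure 2-24 cycle transformation.

    `channels` holds (beta, mu) capacity/cycle pairs and `mu_hat` is
    the largest candidate base minor frame size (both hypothetical
    argument shapes)."""
    for mu in range(mu_hat, mu_hat // 2 - 1, -1):   # candidates for the base size
        if mu <= 0 or MF % mu != 0:
            continue                                # base size must divide MF
        # MPS: harmonic multiples of mu that divide the major frame
        mps, p = [], mu
        while p <= MF:
            if MF % p == 0:
                mps.append(p)
            p *= 2
        try:
            # shrink each cycle to the largest MPS member not above it
            m = [max(x for x in mps if x <= mu_i) for _, mu_i in channels]
        except ValueError:
            continue                                # some cycle is below mu
        util = sum(beta * mu_i / mi for (beta, mu_i), mi in zip(channels, m))
        if util < 1.0:
            return mu, m                            # feasible transformation
    return None
```

For example, with MF = 40 and channels (0.2, 12) and (0.3, 25), the base size 10 is accepted and the cycles become 10 and 20.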

2.5.4.2 Fixed message size scheduling

Since each channel server schedules its assigned messages in a fixed-priority driven

method, the logical architecture of a channel server can be depicted as in Figure 2-25. A channel

server schedules the current set of arrived messages and puts the highest priority message into the

top of the buffer. One of the important design criteria is to select a fixed buffer size, MSIZE, in

Figure 2-25. In the case of ARINC 659, the minimum resolution of MSIZE supported by the
underlying hardware is a word (32 bits). However, it may be impractical to choose a single word for
MSIZE, given that the bus is much faster than a channel server. So, depending on the bus speed and
the performance of the channel server, we must choose a practically feasible MSIZE. In general, the
smaller the preemption unit, the better the schedulability we can achieve. It is also improper to
permit too large an MSIZE, because it can cause higher-priority messages to wait for the completion
of an intolerably long lower-priority message. Finding the optimal MSIZE is an important job for the
system engineer.

In the original message scheduling algorithm, we assume that the unit of a TDM time slot
(32-bit data in ARINC 659) is the preemption unit for message scheduling, i.e., MSIZE equals 1.
The fixed message size scheduling algorithm permits arbitrary values of MSIZE.











[Figure 2-25 depicts the logical architecture of Channel Server 1: a fixed-priority driven channel
server scheduler places the highest-priority arrived message into a buffer of size MSIZE, and the
bus timeline interleaves Ch. 1 with the other channels or the idle channel within each minor period
of the major period.]

Figure 2-25 Logical Channel Server Architecture


The difference from the original schedulability condition is that a higher-priority message may have
to wait, in the worst case, for MSIZE units of time while a previously dispatched lower-priority
message completes. The modified scheduling equation, W_i(s), is as follows:

    W_i(s) = M_i + MSIZE + Σ_{j=1..i-1} ⌈s / T_j⌉ · M_j
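The demand computation can be sketched as below (Python; the names are illustrative, and the interference term is assumed to sum ⌈s/T_j⌉·M_j over the higher-priority messages, as in standard fixed-priority response-time analysis):

```python
import math

def wi(s, m_i, higher, msize):
    """Worst-case demand of message i in a window of length s: its own
    length, interference from each higher-priority message (M_j, T_j),
    and one MSIZE blocking term for a lower-priority message that was
    already dispatched."""
    return m_i + msize + sum(m_j * math.ceil(s / t_j) for m_j, t_j in higher)
```

With M_i = 2, one higher-priority message (M_j = 1, T_j = 10), and MSIZE = 4, a window of s = 10 carries a demand of 7 slots; doubling the window to s = 20 adds one more interference hit.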



2.5.4.3 Time tick based scheduling


The basic two-level scheduling algorithm assumes a high-precision, real-clock driven
method for processor scheduling. For example, the basic two-level scheduler is allowed to allocate
any real value of execution time, such as 2.3571 ms within a 10 ms partition cycle. But most
practical systems, like Honeywell's GPACS IMA system, use a fixed-resolution time tick method to
schedule partitions. Similar to the message scheduling algorithm that uses a fixed slot as a
scheduling unit, the processor scheduling algorithm must be modified to allocate an integral number
of time ticks to each partition. The disadvantage of time tick based scheduling is that its processor
utilization is poorer than that of the high-precision real-clock driven method, because it has to
allocate time in ticks of relatively low resolution.
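The utilization loss comes from rounding each allocation up to a whole number of ticks, which a one-line helper makes concrete (illustrative Python, not from the dissertation):

```python
import math

def tick_allocation(exec_time, tick):
    """Round a partition's required execution time up to a whole
    number of scheduler ticks, as time tick based scheduling must."""
    return math.ceil(exec_time / tick) * tick
```

For instance, 2.3571 ms under a 1 ms tick costs 3 ticks, raising the partition's share of a 10 ms cycle from about 23.6% to 30%; a finer 0.5 ms tick costs 2.5 ms and loses less.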

2.6 Scheduling Tool









[Figure 2-26: the IMA system development environment supplies system and application
information to the Strong Partitioning Scheduling Tool (Windows NT platform), which produces the
processor cyclic table, bus cyclic table, task priority assignment, message priority assignment,
channel grouping, and the overall scheduling result.]


Figure 2-26 Usage of Scheduling Tool in Avionics System Design


We have developed a GUI-based integrated scheduling tool which implements all

previously described scheduling algorithms for IMA systems on the Windows NT platform.

Figure 2-26 shows the role of the tool in developing IMA systems. To guarantee the timing
requirements of all integrated real-time applications in the IMA system, system integrators need

information such as the processor cyclic scheduling table in each processor, task priority

assignment in each partition, message to channel server mapping, bus cyclic scheduling table, and

message priority assignment in each channel. The tool prepares this scheduling information at

the system integration stage. The tool provides convenient user interface functions, such as

managing projects, editing files, performing scheduling algorithms, and browsing scheduling

results.

Figure 2-27 shows the simplified structure of the tool, which is composed of three main
components: the Windows-based configuration process, the command-line based integrated
scheduler, and the graphical viewer. The Windows-based configuration process module is a top-
level user interaction environment. It provides a graphical user interface with which the user can











configure the target system and enter application information and scheduling options. To track the
evolution of the system, it also provides project and history management.


[Figure 2-27: the Windows-based configuration process reads the main input file (.dat), attach file
(.att), and modify file (.mod), maintains the history and project files (.hst and .prj), and invokes the
scheduler in basic, attach, modify, or detach mode; the results can be examined with the graphical
viewer.]

Figure 2-27 The Structure of the Scheduling Tool


Figure 2-28 shows the snapshot of the system configuration dialog box with the top-level


screen as a background. The command-line based integrated scheduler is responsible for


executing our scheduling algorithms with the given configuration and inputs either at the


command line or in the integrated environment. A detailed result of scheduling is stored in a *.prn


file. Additionally, the user can view the scheduled timing diagram of processor and bus allocation


with the graphical viewer.


[Diagram: the command-line based integrated scheduler layers six plug-in modules on top of the
Basic Two-Level Scheduler: Frame Sync Scheduling, Incremental Changing Scheduling, Time Tick
Based Scheduling, Fixed Bus Major Frame Scheduling, Flexible Message Size Scheduling, and
Fault Tolerant Clock Sync Scheduling.]











The scheduling algorithms for solving the six additional practical constraints are
implemented as plug-in modules of the basic two-level scheduling algorithm, so any combination of
the six constraints can easily be applied to an application.




Figure 2-28 System and Scheduling Parameters in the Tool Interface


2.7 Conclusion


In this chapter, we presented scheduling algorithms to produce cyclic partition and


channel schedules for the two-level hierarchical scheduling model of strongly partitioned


integrated real-time systems. The system model supports spatial and temporal partitioning in all


shared resources. Thus, applications can be easily integrated and maintained.


The main idea of our approach is to allocate a proper amount of capacity and to follow a


distance constraint on partition and channel invocations. Thus, the tasks (messages) within a


partition (channel) can have an inactive period longer than the blocking time of the partition


(channel). We also use a heuristic deadline decomposition technique to find feasible deadlines for


both tasks and messages. To reduce the bus bandwidth requirement for message transmission, we








develop a heuristic channel-combining algorithm, which leads to highly utilized channels by

multiplexing messages of different deadlines and periods. The simulation analysis shows

promising results in terms of schedulability and system characteristics.

We defined and solved six additional practical constraints in scheduling Integrated

Modular Avionics systems. We also developed a GUI-based integrated scheduling tool suite,

which automates all scheduling analysis presented in the dissertation.















CHAPTER 3
SOFT AND HARD APERIODIC TASK SCHEDULING

3.1 Introduction

The integrated real-time systems model supports real-time applications of various types,
including non- and less-critical applications. These kinds of applications usually adopt soft and
hard aperiodic tasks. In this chapter, we focus on the design of dynamic scheduling algorithms for
soft and hard aperiodic tasks in integrated real-time systems. We assume that there is an aperiodic
task server in each partition to execute the aperiodic tasks that belong to that partition. To ensure
the schedulability of the periodic tasks, the server must not consume any processing capacity
allocated to the periodic tasks in the partition. Nevertheless, there exists available processing
capacity, such as unused or unallocated capacity at the partition level, that can be used by the
aperiodic task server. There also exists pre-allocated processing capacity for aperiodic tasks,
which can be shared by partitions. Our objective then is to utilize the available capacity to
maximize the performance of aperiodic tasks, while meeting the distance and capacity constraints
of each partition.

To maintain the distance constraints while scheduling aperiodic tasks, we propose the
Distance Constraint guaranteed Dynamic Cyclic (DC2) scheduler, which uses three basic
operations: left-sliding, right-putting, and compacting. The goal of these operations is to obtain the
maximum amount of slack time upon the arrival of an aperiodic task. The first operation, left-
sliding, saves slack time by scheduling other partitions earlier when there is no pending task in the
current partition. The second operation, right-putting, steals as much slack time as possible by
swapping entries of the existing cyclic schedule for partitions, while guaranteeing the distance
constraints. Since the right-putting operation may divide the cyclic schedule into small segments,
the third operation, compacting, is designed to merge together the segments allocated to the same
partition.

We also propose a hard aperiodic task scheduling approach. When a hard aperiodic task
arrives, the scheduler must invoke an acceptance test algorithm. Basically, the algorithm checks the
available slack time between the current time t and the task deadline t+D_i. An acceptance can only
be made if the task's timing requirement can be met and none of the hard aperiodic tasks admitted
previously will miss their deadlines. With the help of the dynamic behavior of the DC2 scheduler,
we can take advantage of future right-putting operations in addition to the slack times of the current
static schedule.

The chapter is organized as follows. Section 3.2 presents an aperiodic task scheduling
model for integrated real-time systems. Section 3.3 presents the algorithms for soft aperiodic
scheduling. Section 3.4 describes the algorithm for hard aperiodic scheduling. The evaluation of
the algorithms by simulation studies is presented in Section 3.5. The conclusion then follows in
Section 3.6.

3.2 Aperiodic Task Scheduling Model

Let us assume that there is an infinite number of aperiodic tasks, {J_i, i = 0, 1, 2, ...}. Each
aperiodic task J_i is associated with a worst-case execution time C_i. A hard aperiodic task
additionally has a deadline D_i. To maximize the performance for aperiodic tasks, we are
concerned with any available execution capacity that we can extract from the low-level cyclic
schedule of a SPIRIT system. As long as the distance and capacity constraints are satisfied for
each partition, this available execution capacity can then be assigned to any arriving aperiodic
tasks.

We introduce Multi-Periods Aperiodic Servers (MPAS), which are virtual execution servers
for aperiodic tasks. When one is activated, the aperiodic task server at one of the partitions can use
its capacity to execute pending aperiodic tasks. An MPAS can be constructed using the following
scheme. Let a finally scheduled set of partition servers be










P = {P_1(α_1, h_1), P_2(α_2, h_2), ..., P_n(α_n, h_n)}, the set being sorted in non-decreasing order of
h_k. By removing the duplicate periods in the set of h_i, 1 ≤ i ≤ n, and sorting the remaining set in
increasing order, we can obtain the non-duplicative set MPS = {H_1, H_2, ..., H_m}, where m ≤ n and
H_i | H_j for i < j. For each H ∈ MPS we create an MPAS and distribute the idle processor
capacity, (1 − Σ_{j=1..n} α_j), evenly, i.e., we assign processor capacity

    cp_H = (1 − Σ_{j=1..n} α_j) / m    to each MPAS_H.

Then, the total set of servers in the low-level cyclic schedule, including partition servers P and
multi-period aperiodic servers MPAS, can be formulated as follows:

    CS = P ∪ MPAS = { CS_1 = {MPAS_{H_1}(cp_H, H_1), P_1(α_1, h_1), ..., P_{E_1}(α_{E_1}, h_{E_1})},
                      CS_2 = {MPAS_{H_2}(cp_H, H_2), P_{E_1+1}(α_{E_1+1}, h_{E_1+1}), ..., P_{E_2}(α_{E_2}, h_{E_2})},
                      ...,
                      CS_m = {MPAS_{H_m}(cp_H, H_m), P_{E_{m-1}+1}(α_{E_{m-1}+1}, h_{E_{m-1}+1}), ..., P_{E_m}(α_n, h_n)} }

where all server partition cycles within the subset CS_k are the same and E_k is the last index of the
partition servers in the set CS_k, with E_1 < E_2 < ... < E_m = n.

For instance, if we have input partition servers of P = {P_1(1/7, 7), P_2(1/7, 14), P_3(1/7, 14),
P_4(1/7, 28)}, we can obtain a set of MPAS as {MPAS_7(1/7, 7), MPAS_14(1/7, 14), MPAS_28(1/7, 28)}.
Finally, we can obtain a cyclic schedule set CS of {{MPAS_7(1/7, 7), P_1(1/7, 7)}, {MPAS_14(1/7,
14), P_2(1/7, 14), P_3(1/7, 14)}, {MPAS_28(1/7, 28), P_4(1/7, 28)}}. Using the distance-constrained
cyclic scheduling algorithm, we can obtain the feasible cyclic schedule shown in Figure 3-1.
Since the algorithm allocates the MPAS with the minimum period first, for example MPAS_7(1/7, 7), it is
guaranteed that no partition server crosses a minor frame boundary. This characteristic allows
flexible relocation of cyclic schedule entries. Note that, in the figure, we use AS as
an abbreviation of MPAS. Given the cyclic schedule of CS, the straightforward approach of










scheduling aperiodic tasks is to give CPU control to aperiodic tasks whenever the current timeline
belongs to an MPAS. However, for a better response time for soft aperiodic tasks and a higher
acceptance rate for hard aperiodic tasks, we should look for ways to adjust the table entries
dynamically.
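The MPAS construction above can be sketched directly (Python with exact fractions; an illustrative rendering of the even division of idle capacity among the distinct cycles, with hypothetical argument shapes):

```python
from fractions import Fraction

def build_mpas(partitions):
    """Construct Multi-Periods Aperiodic Servers from a finally
    scheduled partition set.  `partitions` is a list of
    (capacity, cycle) pairs; the idle capacity 1 - sum(alpha) is
    divided evenly among the distinct cycles."""
    cycles = sorted({h for _, h in partitions})   # MPS: duplicates removed
    idle = 1 - sum(a for a, _ in partitions)
    cp = Fraction(idle) / len(cycles)             # even share per MPAS
    return [(cp, H) for H in cycles]

servers = build_mpas([(Fraction(1, 7), 7), (Fraction(1, 7), 14),
                      (Fraction(1, 7), 14), (Fraction(1, 7), 28)])
```

Run on the example partition set, this reproduces MPAS_7(1/7, 7), MPAS_14(1/7, 14), and MPAS_28(1/7, 28): the idle capacity 3/7 is split three ways.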

[Figure 3-1: a major frame of 28 slots (two 14-slot frames, four 7-slot minor frames) with the slot
sequence AS_7, P_1, AS_14, P_2, P_3, ... repeating, so that every server receives its capacity within
each of its cycles.]


Figure 3-1 Example of Feasible Cyclic Schedule


3.3 Soft Aperiodic Task Scheduling

To dynamically schedule aperiodic tasks in a distance-constrained cyclic schedule, we
propose three basic operations to manipulate the table entries: LS (Left Sliding), RP (Right
Putting), and Compacting. The operations are applied to feasible cyclic schedules at run time when
an aperiodic task arrives or a server finishes earlier than scheduled. In this section, we specify the
details of each operation, and show that the deadlines of periodic tasks in each partition will not be
missed when the operations are applied. Using these basic operations, we can then compose the
distance-constrained dynamic cyclic (DC2) scheduling algorithm.

3.3.1 Left Sliding (LS)

The LS(x) operation slides the current phase of the cyclic schedule left by x units of time

in 0(1) time. There are two situations at which LS(x) will be applied. One is when the current

partition server or MPAS has finished x units of time earlier than the allocated capacity. The other

case is when the scheduler invokes a partition server or a MPAS with an allocated capacity of x,

but finds no pending periodic task or periodic task, respectively. Figure 3-2 shows an example of

a LS(1) operation.

















[Figure 3-2: after LS(1), every entry of the cyclic schedule is shifted one slot to the left, and the
freed slot appears at the end of the major frame.]


Figure 3-2 Modified Schedule after LS


To reduce the average response time of aperiodic tasks, we can use LS to eliminate idle
processing during assigned slots. The advantage over a simple polling aperiodic server algorithm
comes from its dynamic behavior. In a polling aperiodic server algorithm [12], when an aperiodic
task arrives, the task has to wait until the next MPAS interval. Using LS, both partition servers and
MPAS can begin their execution earlier than the instances defined statically in the cyclic table, so a
better average response time can be expected. In addition, the following theorem shows that the
distance and capacity constraints for periodic tasks in all partitions remain valid under the LS
operations.
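The reason LS costs O(1) is that nothing in the table needs rewriting; only the phase between wall-clock time and table time moves. A minimal sketch (illustrative Python representation, not the dissertation's data structure):

```python
class CyclicSchedule:
    """Minimal sketch of the O(1) Left Sliding operation: the cyclic
    table itself is never rewritten; only the phase offset between
    wall-clock time and table time moves."""

    def __init__(self, entries):
        self.entries = entries                   # (server, capacity) pairs
        self.major = sum(c for _, c in entries)  # major frame length
        self.phase = 0                           # total leftward slide

    def left_slide(self, x):
        self.phase += x                          # O(1): everything starts x earlier

    def table_time(self, t):
        # table position in force at wall-clock time t
        return (t + self.phase) % self.major
```

After LS(1), the entry that was statically scheduled at table position 4 is dispatched at wall-clock time 3, matching the shift in Figure 3-2.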

Theorem 2. The modified cyclic schedule after the application of LS(x) is feasible, i.e., it
guarantees the hard deadlines of periodic tasks in all partition servers.

Proof: We assume that an LS(x) operation takes place at instant t. Thus, we need to show
that every periodic task that arrived before t, or will arrive after t, can meet its deadline. First, we
consider the tasks that arrive after instant t. If the LS operation did not occur, there would be a
processor capacity of α_i·h_i allocated to P_i in every period [t+δ, t+δ+h_i], where δ ≥ 0. After the
execution of LS(x), the processing capacity allocated during [t+δ, t+δ+h_i] under the modified
schedule is equivalent to that allocated during [t+δ+x, t+δ+x+h_i] in the original schedule. As a
consequence, there will be exactly the required processor capacity for all partition servers in any
period after instant t, so new periodic tasks will be able to meet their deadlines. Second, we need to
show that the periodic tasks released before instant t meet their deadlines. The tasks that do not
belong to the suspended partition will be given their preserved processor capacity earlier than the
instants originally scheduled, so they can meet their deadlines. The tasks that belong to the current
partition have already finished before instant t, because this partition becomes idle at t in spite of
its unused remaining capacity. Therefore, we can conclude that the modified cyclic schedule after
LS(x) is feasible. ∎

3.3.2 Right Putting (RP)

The Right Putting (RP) operation exchanges the remaining allocation of the current
partition server with a future MPAS of the same period. In order to meet the deadlines of periodic
tasks and to satisfy the distance constraint in the cyclic schedule after RP, an RP operation is done
based on two parameters: PIST_i (Partition Idle Starting Time) and PL_i (Putting Limit) for partition
server P_i. PIST_i is the instant at which the partition becomes idle due to either no pending task or
an early completion of tasks. PL_i is defined as the sum of PIST_i and the invocation period h_i of P_i.
Initially, PIST_i is set to zero. According to the execution of tasks, PIST_i is reset to the instants at
which P_i is invoked and finds no waiting periodic tasks. Obviously, PIST_i is always less than or
equal to the current instant.

An RP operation is initiated when either an aperiodic task arrives or there is still a
pending aperiodic task after the completion of a partition server. To reschedule the remaining
capacity of the current partition server at a later interval, the operation brings in an equivalent
processing capacity from an MPAS of the same period that lies between the current time instant
and PL_i. If no such MPAS exists, the operation does not take place at all. Note that this capacity
exchange leads to a revision of table entries in the cyclic schedule. This implies that the exchange
becomes persistent for all future invocations of the partition server and the MPAS. Since the RP
operation has to track all time interval entries in the cyclic schedule, its time complexity is O(N),
where N is the total number of time table entries.
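The eligibility check at the heart of RP, finding an MPAS of the same period between now and PL, can be sketched as follows (illustrative Python; the (start, length, server) tuples and the 'AS<period>' naming are hypothetical encodings):

```python
def rp_candidate(entries, now, pist, period):
    """Find an MPAS entry of the same period lying between the current
    time and PL = PIST + period, as Right Putting requires.  `entries`
    holds (start, length, server) tuples."""
    pl = pist + period                      # Putting Limit of this partition
    for start, length, server in entries:
        if now <= start and start + length <= pl and server == f"AS{period}":
            return (start, length, server)  # exchangeable capacity found
    return None                             # no RP is possible
```

In the Figure 3-3 scenario (P_1 with h_1 = 7 and PIST_1 = 0 at time 0), the AS_7 entry inside [0, PL_1] is the exchange partner; an AS_7 entry beyond PL_1 would be rejected.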

[Figure 3-3: the cyclic schedule before RP (top) and after RP (bottom). At time 0 the entry of P_1 is
exchanged with the AS_7 entry scheduled before PL_1; the PIST and PL markers of each partition
are shown under the timeline of 28 slots.]


Figure 3-3 Modified Schedule after RP


In Figure 3-3, we show an example of an RP operation, which assumes that an aperiodic
task arrives at time 0. It exchanges the current partition server P_1 with the MPAS_7 that is scheduled
within [0, PL_1]. This RP operation results in the immediate execution of the aperiodic task
without any violation of distance constraints. However, the RP operation may split a scheduled
interval into segments. This will increase the number of allocation entries in the cyclic schedule
table, and the number of context switches between partitions. To reduce the problem of
segmentation, a compaction algorithm can be applied.

Theorem 3. The modified cyclic schedule after the application of RP operations is
feasible, i.e., it guarantees that the hard deadlines of periodic tasks are met in all partition servers.

Proof: We assume that an RP event occurs at instant t and the invocation of the current
partition server P_i is postponed. First, we consider a periodic task that arrives after instant t. If
the task does not belong to P_i, it can always meet its deadline, because the schedules for all
partition servers except P_i are unchanged after the RP. For P_i, there was exactly α_i·h_i capacity in
every period of h_i before the RP operation. Assume that the RP exchanges y units of time for P_i
from [t, t+y] to [t+δ, t+δ+y], where δ is a positive real number. Since t+δ+y ≤ PL_i = PIST_i + h_i
≤ t + h_i, there exists a capacity α_i·h_i allocated to P_i in [t, t+h_i]. We can make the same exchange
in all iterative periods after t, given that the corresponding partition server and MPAS are in the
same relative positions in the cyclic schedule. Therefore, for every period [t+δ', t+δ'+h_i], where
δ' ≥ 0, a capacity of α_i·h_i exists for P_i. Second, we consider a periodic task that arrived before
instant t. If the task belongs to P_j, where j ≠ i, it will meet its deadline because the schedule for
P_j is unchanged after the RP operation. For a task that belongs to P_i, we have to show that there is
capacity α_i·h_i in any period [PIST_i+ε, PIST_i+ε+h_i], where 0 ≤ ε ≤ t−PIST_i. We need not
consider any periodic task that arrived before PIST_i. Note that there was exactly α_i·h_i processor
capacity in the period [PIST_i+ε, PIST_i+ε+h_i] before the operation, where 0 ≤ ε ≤ t−PIST_i.
After an exchange of capacity y, which is at most α_i·h_i, within the interval [t, PL_i], there still
exists the same amount of capacity α_i·h_i in the concerned interval [PIST_i+ε, PIST_i+ε+h_i], since
PL_i ≤ PIST_i+ε+h_i and [t, PL_i] ⊂ [PIST_i+ε, PIST_i+ε+h_i] for any 0 ≤ ε ≤ t−PIST_i.
Therefore, the partition server P_i will be allocated the requested processor capacity in every
invocation period. ∎

3.3.3 Compacting

To reduce the problem of segmentation due to RP operations, we propose a Compacting
operation. The operation delays an MPAS entry and exchanges it with the entry of a future
partition server. This contrasts with the RP operation, which first postpones a partition server and
then brings in and starts an MPAS immediately. A heuristic selection is performed in a Compacting
operation so that neighboring entries that belong to the same server can be merged in the
scheduling table. Since a Compacting operation needs multiple RP operations, its complexity is
O(N^2). The operation can be applied periodically. For instance, this can be done every few major
frames, at the completion of a certain number of RP operations, or in the event that the number of
context switches exceeds a certain threshold.










Theorem 4. The modified cyclic schedule after the application of the Compacting
operation is feasible, i.e., it guarantees that the hard deadlines of periodic tasks are met in all
partition servers.

Proof: The proof is trivial. Since the Compacting operation uses a similar approach to the
RP operation for exchanging scheduling entries, the proof is the same as for the RP operation. It
also preserves the distance constraints guaranteed in the original cyclic schedule after the adjacent
scheduling entries that belong to the same server in the same minor frame are merged. ∎



3.3.4 DC2 Scheduler

Based on the three operations LS, RP, and Compacting, we propose a DC2 scheduler that
accommodates aperiodic tasks within the distance-constrained cyclic scheduling of partition
servers. When there is no pending aperiodic task, it behaves like a cyclic scheduler, except for the
intervention of LS operations. Whenever a server becomes idle without consuming all its allocated
capacity, the LS operation is invoked. To use the saved slack times, the scheduler can apply an RP
operation when an aperiodic task arrives. The scheduler invokes the Compacting operation at the
beginning of a major frame when the number of context switches exceeds a certain threshold.
Following Theorems 2, 3, and 4, we can conclude that the DC2 scheduler guarantees the deadlines
of hard periodic tasks in all partition servers while serving aperiodic tasks. In Figure 3-4, we
present the abstract algorithm of the DC2 scheduler.

The resulting scheduler may lead to the immediate execution of an arriving aperiodic task,
which does not need to wait for the beginning of the next aperiodic server interval. Practically, the
DC2 scheduler should be carefully designed so that the scheduling overhead is minimized. For
example, we can limit the number of burst arrivals of aperiodic tasks within a short time interval.
The scheduling policy of MPAS will be discussed after the discussion of hard aperiodic task
scheduling.











loop {
    receive a Timer interrupt;
    S = getNextServer();        // S will be either a partition server or an MPAS
    while (S doesn't have any pending task) {
        LS(|S|);
        S = getNextServer();
    }
    set Timer to current time + |S|;   // |S| is the allocated capacity
    dispatch S;
    sleep;
}
signals:
    case (S finishes earlier by x units of time) do LS(x);
    case (aperiodic task arrives) do RP and dispatch MPAS if available;



Figure 3-4 DC2 Scheduling Algorithm

3.4 Hard Aperiodic Task Scheduling

When a hard aperiodic task arrives, the scheduler must invoke an acceptance test
algorithm. Basically, the algorithm checks the available slack time between the current time t and
the task deadline t+D_i. An acceptance can only be made if the available slack time is bigger than
the task's WCET. Then the task's timing requirement can be met, and none of the hard aperiodic
tasks admitted previously will miss their deadlines. With the help of the dynamic behavior of the
DC2 scheduler, we can take advantage of future RP operations in addition to the slack times of the
current static schedule.

3.4.1 Low-Bound Slack Time

With the current snapshot of the cyclic schedule as of the current time t, the scheduler can
obtain a low-bound slack time between t and t+D_i that exists regardless of any future LS and RP
operations. Note that further slack time can be added due to the results of LS and RP operations
before t+D_i.











To aid the computation of the low-bound slack time, we use three parameters: sf_i, sl_k, and
sr_k. sf_i equals the sum of the processing capacities allocated to all MPAS's in minor frame i, and
represents the available slack time in minor frame i. sl_k and sr_k store the available slack time
before and after the schedule entry k in a minor frame. In Figure 3-5, we show an example that
illustrates the values of these parameters.

[Figure 3-5: the cyclic schedule of Figure 3-1 annotated with sf_0 = 3, sf_1 = 5, sf_2 = 3, and
sf_3 = 1 for the four minor frames, and an (sl, sr) value pair under each scheduled interval.]



Figure 3-5 Example of the sf, sl, and sr Parameters


The value pair (sl, sr) is shown for each scheduled interval. In the example, the scheduled interval for partition P1 in minor frame 0 has a left slack time of 1 and a right slack time of 2. We maintain a frame-to-frame slack time table of the slack time accumulated from one frame to another in the cyclic schedule. For instance, the table corresponding to the schedule of Figure 3-5 is shown in Table 3-1. The accumulated slack time from frame 0 to frame 2 is computed as sf_0 + sf_1 + sf_2 = 11.



Table 3-1 Frame-to-Frame Slack Time Table


          f0    f1    f2    f3
    f0     3     8    11    12
    f1    12     5     8     9
    f2     7    12     3     4
    f3     4     9    12     1

(Rows give the starting frame and columns the ending frame; accumulation wraps around the major frame.)



Using this table and the per-entry parameters, we can efficiently calculate the low-bound slack time in the interval [t, t+Di]. For example, assume that a hard aperiodic task with a relative deadline of 18 arrives at time 5. By summing up 0 (the sr of the entry running at time 5), 8 (the accumulated slack time from frame f1 to frame f2), and 1 (the sl of the entry containing the deadline at time 23), we obtain a low-bound slack time of 9 for the arriving task.
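The computation above can be sketched in Python. This is an illustrative reconstruction, not the dissertation's code: the frame layout, the function names, and the way the caller supplies the sr value at the arrival time and the sl value at the deadline are all assumptions made for the example of Figure 3-5 and Table 3-1.

```python
MINOR = 7          # minor frame length, from Figure 3-5
NFRAMES = 4        # minor frames per major frame

sf = [3, 5, 3, 1]  # slack (MPAS capacity) per minor frame, Figure 3-5

def frame_table(sf):
    """Accumulated slack from frame i to frame j (inclusive), wrapping
    around the major frame, as in Table 3-1."""
    n = len(sf)
    return [[sum(sf[(i + k) % n] for k in range(((j - i) % n) + 1))
             for j in range(n)] for i in range(n)]

def low_bound_slack(sr_at_t, sl_at_d, t, d, table):
    """sr after the entry running at arrival time t, plus the slack of the
    whole minor frames strictly between t and deadline d, plus sl before
    the entry containing d."""
    ft, fd = t // MINOR, d // MINOR
    between = table[ft + 1][fd - 1] if fd - ft >= 2 else 0
    return sr_at_t + between + sl_at_d

table = frame_table(sf)
# Example from the text: relative deadline 18, arrival at time 5,
# sr at time 5 is 0 and sl at time 23 is 1  ->  0 + table[f1][f2] + 1 = 9.
print(low_bound_slack(0, 1, 5, 23, table))  # -> 9
```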

To guarantee the schedulability of an arriving hard aperiodic task, we consider only the low-bound slack time and future RP operations, even though we know the slack time can be increased by future LS operations. We cannot take advantage of future LS operations because they depend on the future usage of the partition servers. However, at the arrival instant, we can exactly calculate the extra slack time obtainable from future RP operations, which exchange the capacity of a partition server Pj in [t, t+Di] with that of MPASs in [t+Di, PLj], for every partition server Pj where t+Di < PLj. The expected future RP operations must occur while there is at least one aperiodic task pending. The contribution of future RP operations to the slack time calculation can be significant, as the later simulation studies will show.

3.4.2 Acceptance Test Considering Future RP Operations

In an acceptance test, the algorithm must guarantee that the existing hard aperiodic tasks, which have already been admitted, still meet their deadlines. We assume that the MPAS schedules hard aperiodic tasks with an EDF scheduling policy. In the decision process, we maintain two parameters, the remaining execution time (rc) and the individual slack time (s), for each existing hard aperiodic task. Using these two parameters and the requirements of the arriving task, the acceptance test algorithm, shown in Figure 3-6, first checks whether the total slack time as of the current time, totalSlack, is sufficient to accommodate both the existing hard aperiodic tasks and the newly arrived one. If the total slack time is not sufficient, it rejects the new task. It then checks the schedulability of the pre-admitted hard aperiodic tasks. A hard aperiodic task whose deadline is earlier than that of the new task is unaffected by the admission, due to the EDF scheduling policy. However, we must consider the tasks whose deadlines are later than the deadline of the arriving task, and check the remaining slack time s_k of each of them.


obtain totalSlack[t, t+Di] = low-bound slack time from t to t+Di plus expected
    slack times from future RP operations;
if ((s_i = totalSlack[t, t+Di] - C_i - sum of rc_k over tasks k with D_k <= D_i) < 0)
    return reject;
for all hard aperiodic tasks k whose deadline is later than t+Di
    if ((s_k = s_k - C_i) < 0) return reject;
return admit;



Figure 3-6 Acceptance Test Algorithm
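The test of Figure 3-6 can be rendered as a short Python sketch. This is a hypothetical rendering, not the dissertation's implementation: the dictionary-based task records and the function name are assumptions, and totalSlack is assumed to be computed elsewhere (low-bound slack plus expected future RP slack).

```python
def acceptance_test(tasks, new_C, new_D, total_slack):
    """tasks: admitted hard aperiodic tasks, each a dict with absolute
    'deadline', remaining execution time 'rc', and individual slack 's'.
    new_C / new_D: WCET and absolute deadline of the arriving task.
    Returns True (admit) or False (reject)."""
    # Under EDF, all admitted work with an earlier-or-equal deadline runs
    # before the new task, so it consumes slack ahead of the new deadline.
    earlier_work = sum(k['rc'] for k in tasks if k['deadline'] <= new_D)
    s_new = total_slack - new_C - earlier_work
    if s_new < 0:
        return False                      # reject: not enough slack
    # Tasks with later deadlines each lose new_C units of slack under EDF.
    for k in tasks:
        if k['deadline'] > new_D and k['s'] - new_C < 0:
            return False                  # reject: would jeopardize task k
    # Admit: commit the bookkeeping updates.
    for k in tasks:
        if k['deadline'] > new_D:
            k['s'] -= new_C
    tasks.append({'deadline': new_D, 'rc': new_C, 's': s_new})
    return True
```

For instance, with an empty task set and a total slack of 9, a task with WCET 5 is admitted, while one with WCET 10 would be rejected.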

To show that our LS and RP operations are also harmless to the admitted hard aperiodic tasks, we now prove that the slack time calculated in the interval [t, t+Di] cannot be reduced by future LS and RP operations.

Lemma 1. Future LS and RP operations will not jeopardize the hard aperiodic tasks that were admitted before the current time t.

Proof: First, consider a future LS operation. As long as there is an existing hard aperiodic task, the LS operation cannot be applied to any MPAS; if the LS operation is applied to a partition server, the slack time only increases. Second, consider a future RP operation; there are only two cases. If the RP operation exchanges the capacity of a partition server with the capacity of an MPAS that lies in [t, t+Di], it does not change the slack time in [t, t+Di]. If the RP operation exchanges the capacity of a partition server with the capacity of an MPAS that lies beyond t+Di, it increases the slack time in [t, t+Di]. Therefore, future LS and RP operations will not reduce the slack time. ∎

3.4.3 Dynamic Slack Time Management

Since we apply three dynamic operations, LS, RP, and Compacting, to the distance-constrained cyclic schedule at run time, the slack times available at a given instant may change. To keep the slack-time parameters, i.e., sf, sl, sr, and the frame-to-frame slack time table, correct in an efficient way, an update must be done whenever the dynamic operations are applied. At initialization time, the scheduler calculates these parameters and generates the frame-to-frame slack time table from the initial cyclic schedule. After completion of an LS operation, the scheduler does not need to change anything, because an LS operation does not affect the sequence or allocated capacities of the cyclic schedule. After completion of an RP operation, the scheduler must adjust the parameters and the table for the affected slack times. In Figure 3-7, we depict six different cases of an RP operation and their corresponding slack time adjustments.

Figure 3-7 (a1, b1, and c1) shows the slack time adjustments when RP operations are performed within the same minor frame. The three cases (a1), (b1), and (c1) show three different situations, i.e., when the capacity of a partition server is equal to, greater than, or less than the capacity of the corresponding MPAS, respectively. In (a2), (b2), and (c2), the only difference is that the RP operation brings in an MPAS interval from a later minor frame. Since no partition or MPAS server crosses a minor frame boundary, there is no other case. For each case in the figure, we show the original cyclic schedule in the top diagram and the result after completion of the RP operation in the bottom. In the bottom diagram of each case, the shaded entries represent the scheduled intervals whose parameter, either sl or sr, must be modified. For example, in case (a1), the swapped MPAS and the partition server Pj receive the new parameter pairs (sl, sr-x) and (sl'+x, sr'), respectively. All other entries between them have their sl increased and their sr decreased by x time units. In addition to the sl and sr modifications, the sf values must be adjusted in cases (a2), (b2), and (c2), because the exchange of capacity between two minor frames changes the slack time in each of them. As an embedded function of an RP operation, this slack time adjustment does not increase the O(N) time complexity of the RP operation.
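The in-frame bookkeeping described above can be sketched as follows. This is an illustrative Python fragment for case (a1) only, with an assumed list-of-dicts entry layout and an assumed function name; it mirrors the textual rule that the swapped MPAS becomes (sl, sr-x), the partition server becomes (sl'+x, sr'), and every entry between them shifts by x.

```python
def adjust_after_rp(entries, i, j, x):
    """Case (a1): swap the partition server at index i with the MPAS at
    index j (i < j) in the same minor frame, where the swap shifts the
    intervals between them by x time units, then fix the sl/sr pairs.
    entries: list of dicts with 'sl' and 'sr' per scheduled interval."""
    entries[i], entries[j] = entries[j], entries[i]
    entries[i]['sr'] -= x          # swapped MPAS: (sl, sr) -> (sl, sr-x)
    entries[j]['sl'] += x          # partition server: (sl', sr') -> (sl'+x, sr')
    for k in range(i + 1, j):      # entries in between: (sl+x, sr-x)
        entries[k]['sl'] += x
        entries[k]['sr'] -= x
```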















[Diagrams: the six RP cases, (a1), (b1), and (c1) within one minor frame and (a2), (b2), and (c2) across minor frames, each showing the (sl, sr) pairs of the affected entries before and after the swap and, for the cross-frame cases, the sf adjustments.]

Figure 3-7 Six Possible Slack Time Adjustments after an RP Operation



3.4.4 Multi-Periods Aperiodic Server Scheduler


In scheduling partitions that consist of only periodic tasks, the lower-level cyclic scheduler need not keep track of the execution activities of the local tasks within a partition. This is possible because of the off-line scheduling analysis of partitions with static requirements. A sufficient static processor allocation, characterized by a pair of partition capacity and cycle, guarantees the real-time constraints of all periodic tasks of the partitions. But this does not apply to our DC2 scheduling approach, because the aperiodic task scheduler utilizes the unused capacity of other partitions to maximize performance. Hence a minimal amount of aperiodic task information is also kept dynamically in the lower-level cyclic scheduler, specifically in the MPAS scheduler, as shown in Figure 3-8.


[Diagram: in the high layer, each of the partitions 1 through n runs its soft and hard aperiodic tasks over its own aperiodic server, which keeps a FIFO queue of soft aperiodic tasks and an EDF queue of hard aperiodic tasks. In the low layer, the MPAS scheduler keeps a partition-level FIFO queue (soft) and a partition-level EDF queue (hard).]


Figure 3-8 Two-Level Hierarchy of Aperiodic Task Schedulers


In the higher layer, an aperiodic server scheduler takes over processor control when the current time instant belongs to the MPAS in the cyclic schedule and the MPAS gives its capacity to the corresponding partition. The aperiodic server then schedules either soft or hard aperiodic tasks based on the information given by the MPAS.

In the lower layer, the MPAS scheduler maintains a FIFO queue for partitions holding ready soft aperiodic tasks, and an EDF queue for partitions holding ready hard aperiodic tasks. In both queues, we store not task-level information but the corresponding partition-level information. The EDF queue has a higher priority than the FIFO queue. The MPAS scheduler dispatches a partition's aperiodic server according to this policy, and each dispatched aperiodic server then schedules its own aperiodic tasks.
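The two-queue dispatching policy can be sketched in a few lines of Python. This is a minimal illustration, not the kernel's code: the class name, the partition identifiers, and the queue representations are assumptions; only the policy (EDF queue of hard work beats FIFO queue of soft work) comes from the text.

```python
import heapq
from collections import deque

class MPASScheduler:
    """Low-layer MPAS scheduler of Figure 3-8, partition-level only."""

    def __init__(self):
        self.edf = []            # heap of (earliest deadline, partition id)
        self.fifo = deque()      # partition ids in arrival order

    def add_hard(self, partition, deadline):
        heapq.heappush(self.edf, (deadline, partition))

    def add_soft(self, partition):
        self.fifo.append(partition)

    def dispatch(self):
        """Return the partition whose aperiodic server runs next."""
        if self.edf:                        # hard aperiodic work always wins
            return heapq.heappop(self.edf)[1]
        if self.fifo:
            return self.fifo.popleft()
        return None                         # nothing ready; MPAS is idle
```

Each dispatched partition then runs its own aperiodic server over its local task queues.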

3.5 Simulation Studies of the DC2 Scheduler

Simulation studies have been conducted to evaluate the DC2 scheduler against the simple polling server and the LS-only scheduler. In the polling server method, aperiodic tasks are served only in the fixed time intervals allocated to the MPAS. To isolate the scheduling characteristics of the DC2 scheduler, we also introduce and compare against the LS-only scheduler, which uses the LS operation only. In the simulation work, we measured the average response time of soft aperiodic tasks and the average acceptance rate of hard aperiodic tasks. The aperiodic workload was modeled with Poisson arrivals and exponentially distributed execution times. We chose the deadline of a hard aperiodic task based on the inter-arrival time of the task, e.g., 0.5 or 1.0 times the inter-arrival time. For each combination of inter-arrival time and execution time, we randomly generated 25 different simulation sets. For each random set, we generated 5 partition servers for periodic tasks and simulated it for 2,000,000 units of time. For every random set, we randomly chose a minor frame size from 56 to 200, and a major frame size of 16 times the chosen minor frame size. The sum of the workloads of all partition servers in the cyclic schedule is 69%.
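A workload of this shape is easy to reproduce. The sketch below, with assumed function and parameter names, generates one stream of aperiodic tasks with exponential inter-arrival times (mean a, i.e., Poisson arrivals) and exponentially distributed execution times (mean mu) over the simulated horizon used in the studies.

```python
import random

def make_workload(a, mu, horizon, seed=0):
    """Return a list of (arrival time, execution time) pairs:
    Poisson arrivals with mean inter-arrival time a, exponentially
    distributed execution times with mean mu, up to time horizon."""
    rng = random.Random(seed)
    t, tasks = 0.0, []
    while True:
        t += rng.expovariate(1.0 / a)      # exponential gap, mean a
        if t >= horizon:
            return tasks
        tasks.append((t, rng.expovariate(1.0 / mu)))

# e.g., one setting of the Figure 3-9-(a) scenario: a = 56, mu = 3,
# simulated for 2,000,000 time units.
load = make_workload(a=56, mu=3, horizon=2_000_000)
```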

3.5.1 Response Time

We present the measured average response time of soft aperiodic tasks in Figure 3-9, using two different simulation scenarios. First, we fixed the mean Poisson inter-arrival time of aperiodic tasks at 56 and varied the mean of the exponentially distributed execution times between 1 (1.78% workload) and 16 (29% workload); the result is shown in Figure 3-9-(a). Second, we fixed the mean execution time at 3 and varied the mean Poisson inter-arrival time between 12 (25% workload) and 40 (7.5% workload); the result is shown in Figure 3-9-(c). The average response times of (a) and (c), normalized to those of the polling server method, are shown in Figure 3-9-(b) and (d).

As shown in (a) and (c), the DC2 scheduler significantly improves the average response time of soft aperiodic tasks compared to the simple polling server and the LS-only scheduler. Since the DC2 scheduler uses both LS and RP operations, it is worth separating the contributions of the two. As we can observe in (b) and (d), the RP operation contributes more under a lighter aperiodic workload (lighter processor utilization), while LS contributes more under a heavier aperiodic workload (heavier processor utilization). This is because, as processor utilization increases, there is less opportunity for an RP operation to succeed. Even then, the LS operation can still flexibly and efficiently manage the MPAS capacity: whenever a server is idle, the DC2 scheduler can successfully invoke the LS operation.

3.5.2 Acceptance Rate

To evaluate the acceptance rate of hard aperiodic tasks, we compared the average acceptance rates of the polling server and the DC2 scheduler. A simulation was performed with a sequence of hard aperiodic tasks whose Poisson inter-arrival times have means of 50 and 200. The worst-case execution time of each hard aperiodic task follows an exponential distribution. For each mean inter-arrival time, 50 and 200, we varied the workload of hard aperiodic tasks from 10% to 60%. We also varied the deadline of each hard aperiodic task as 0.5 and 1.0 times the inter-arrival time of the task. We show the simulation results in Figure 3-10.













[Plots: (a) average response time of soft aperiodic tasks vs. mean execution time, with the mean inter-arrival time fixed at 56; (b) the data of (a) normalized to the polling server; (c) average response time vs. mean inter-arrival time, with the mean execution time fixed at 3; (d) the data of (c) normalized to the polling server. Each panel compares the Polling Server, the LS-Only Scheduler, and the DC2 Scheduler.]

Figure 3-9 Average Response Time














[Plots: average acceptance rate of hard aperiodic tasks vs. mean execution time, for mean inter-arrival times of 50 (top) and 200 (bottom). Each panel compares the DC2 Scheduler and the Polling Server at deadline factors of 1.0 and 0.5.]

Figure 3-10 Average Acceptance Rate






As shown in the two simulation results, the DC2 scheduler outperforms the polling server in acceptance rate. When the mean Poisson inter-arrival time equals 50, the acceptance rates of the DC2 scheduler are 29.5% and 24.6% higher than those of the polling server at deadline factors of 0.5 and 1.0, respectively. When the mean inter-arrival time is 200, which is the maximum possible minor frame size in the simulation, the acceptance rates of the DC2 scheduler are 21.2% and 17% higher. The smaller the inter-arrival time and the tighter the deadline, the greater the DC2 scheduler's acceptance rate advantage. This is because the benefit from future RP operations is limited by the period of each partition server. Therefore, we can expect a much larger acceptance rate advantage when hard aperiodic tasks arrive in bursts with tight deadlines.

3.6 Conclusion

The proposed DC2 scheduling algorithm guarantees the real-time constraints of the periodic tasks of partitions while dynamically scheduling both soft and hard aperiodic tasks. By applying three essential operations, Left Sliding, Right Putting, and Compacting, the DC2 scheduler maximizes the slack time that can be consumed by an aperiodic task, while guaranteeing that all hard periodic tasks of each partition meet their deadlines. We also developed hard aperiodic task scheduling and admission methods. With the help of the dynamic behavior of the DC2 scheduler, the admission tests of hard aperiodic tasks can take advantage of future RP operations in addition to the slack times of the current static schedule. Through the simulation studies, we showed that the DC2 scheduler outperforms the simple polling server method and the LS-only scheduler, which are similar in concept to other resource reclaiming algorithms, in both average response time and acceptance rate.















CHAPTER 4
REAL-TIME KERNEL FOR INTEGRATED REAL-TIME SYSTEMS

4.1 Introduction

To achieve reliability, reusability, and cost reduction, a significant trend in building large

complex real-time systems is to integrate separate application modules. Such systems can be

composed of real-time applications of different criticalities and even Commercial-Off-The-Shelf

(COTS) software.

An essential requirement of integrated real-time systems is to guarantee strong partitioning among applications. The spatial and temporal partitioning features ensure that each application has exclusive access to its physical and temporal resources. The SPIRIT-μKernel has been designed and implemented based on a two-level hierarchical scheduling methodology such that the real-time constraints of each application can be guaranteed. It provides a minimal set of kernel functions such as address management, interrupt/exception dispatching, inter-application communication, and application scheduling.

In designing the SPIRIT-μKernel, we followed the design concepts of the second-generation microkernel architecture, because it provides a good reference model for achieving both flexibility and efficiency. Compared to the first-generation microkernel architecture, second-generation microkernels such as MIT's Exokernel and GMD's L4 achieve better flexibility and performance by following two fundamental design concepts. First, to achieve better flexibility, the microkernel is built from scratch to avoid any negative inheritance from monolithic kernels. Second, to achieve better performance, it is implemented as a hardware-dependent kernel, maximizing the support from the hardware.










The goals of the SPIRIT-μKernel are to provide dependable integration of real-time applications; flexibility in migrating operating system personalities from the kernel to user applications, including transparent support of heterogeneous COTS RTOSs on top of the kernel; high performance; and real-time feasibility. To support the integration of real-time applications that have different criticalities, we have implemented the strong partitioning concept using a protected memory (resource) manager and a partition (application) scheduler. We also developed a generic RTOS Port Interface (RPI) for easy porting of heterogeneous COTS real-time operating systems on top of the kernel in user mode. A variety of operating system personalities, such as the task scheduling policy, the exception-handling policy, and inter-task communication, can be implemented within a partition according to the individual requirements of the partition RTOS. To demonstrate this concept, we have ported two different application-level RTOSs, WindRiver's VxWorks 5.3 and Cygnus's eCos 1.2, on top of the SPIRIT-μKernel.

The scheduling policy of the SPIRIT-μKernel is based on the two-level hierarchical scheduling algorithm, which guarantees the individual timing constraints of all partitions in an integrated system. At the lower, microkernel level, a distance-constrained cyclic partition scheduler arbitrates partitions according to an off-line scheduled timetable. At the higher, COTS RTOS level, each local task scheduler of a partition schedules its own tasks based on fixed-priority-driven scheduling.

The SPIRIT-μKernel is applicable in two main cases. First, it can be used to integrate existing real-time applications that run on different COTS RTOSs. Second, it can be used to integrate real-time applications that require different operating system personalities.

The rest of this chapter is structured as follows. We discuss the system model and design concepts of the SPIRIT-μKernel in section 4.2. We describe the kernel architecture in section 4.3.



