VOLTAGE-CLOCK SCALING AND SCHEDULING FOR ENERGY-CONSTRAINED REAL-TIME SYSTEMS

By

YOONMEE DOH

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2003

Copyright 2003 by Yoonmee Doh

To my son Minjae, my husband Daeyoung, and my families, with love

ACKNOWLEDGMENTS

I would like to thank many individuals for the care and support they have given me during my doctoral study. First of all, I would like to express my gratitude to Professor Jih-Kwon Peir. As my advisor, he has provided me with guidance, insightful comments, and suggestions. I wish to extend my thanks to my main advisor, Professor Yann-Hang Lee at the CSE Department of Arizona State University, not only for his invaluable academic guidance but also for his strong support, encouragement, and his caring ways through the years he has supervised my research. I would also like to thank Professor C. Mani Krishna at University of Massachusetts for his priceless advice. My appreciation also goes to the professors on my supervisory committee, Randy Y. Chow, Douglas D. Dankel, Richard Newman, and Paul W. Chun, for their valuable and constructive advice.

I would also like to thank former and current members of the Real-Time Systems Research Group at the University of Florida and Arizona State University for their invaluable help, friendship, and discussion. Thanks also go to the CSE Department of Arizona State University for giving me an opportunity to visit and to work as a visiting research scholar and for providing administrative support during my stay. I am also deeply grateful to my middle school teacher, Jongyee Choo at Sungwha High School. Without his continued encouragement and moral support, I could never be what I am.
My thanks should also go to the former director of ETRI, Professor Munkee Choi, at Information and Communications University and the current members of KCISE at the University of Florida for their help with all the details needed at UF to complete a degree during my stay at ASU.

Finally, I would like to give special thanks to the people who are most dear to me, my dear mother, Seeok Han, my brothers Jinwoong, Youngjin, Younghyo, and my sister, Sunmee, to whom I owe everything. Their blessing, guidance, and love made me what I am today. I am also grateful to my parents, my sisters, and my brothers-in-law since they have always encouraged me in my studies and believed in me. Many thanks go to my various friends near and far, for hanging in there with me. Especially, I thank Wendy Tiedemann and Okehee Goh for their endless support. My final acknowledgements go to my wonderful husband, Daeyoung, and my lovely son, Minjae, for their continued love and support given to me until now.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER

1 INTRODUCTION
   1.1 Low-power and Power-aware Real-Time Systems
   1.2 Research Background and Related Works
      1.2.1 Dynamic Power Management
      1.2.2 Low Power vs. Power-aware Computing
      1.2.3 Dynamic Voltage-Clock Scaling
      1.2.4 Fundamental Scheduling Theories
      1.2.5 Low Power Real-Time Computing
      1.2.6 Power-aware Real-Time Computing
   1.3 Main Contributions
   1.4 Organization of the Dissertation

2 TWO-MODE VCS-EDF SCHEDULING FOR HARD REAL-TIME SYSTEMS
   2.1 Introduction
   2.2 System Model
   2.3 Voltage-Clock Scaling under EDF Scheduling
   2.4 Simulation Evaluation of VCS-EDF Algorithm
   2.5 Optimal Voltage Setting in VCS-EDF Scheduling
   2.6 Conclusion

3 CONSTRAINED ENERGY BUDGET ALLOCATION
   3.1 Introduction
   3.2 Design Concepts
   3.3 System Model for Energy Sharing
      3.3.1 Schedule for Periodic Tasks
      3.3.2 Schedule for Aperiodic Tasks
      3.3.3 Voltage-Clock Scaling
      3.3.4 Bounded Energy Consumption
   3.4 Profiling Energy and Utilization Usages in a Real-time Embedded System
      3.4.1 Periodic Tasks
      3.4.2 Aperiodic Tasks
      3.4.3 Energy Budget Allocation
   3.5 Constrained Energy Allocation using VCS-EDF Scheduling
      3.5.1 Energy Allocation Factors
      3.5.2 Algorithm for Energy Budget Allocation
   3.6 Simulation Evaluation
   3.7 Conclusion

4 DUAL-POLICY DYNAMIC SCHEDULING
   4.1 Introduction
   4.2 Design Concepts
      4.2.1 Simple Two Voltage Settings
      4.2.2 Total Bandwidth Server for Aperiodic Tasks
      4.2.3 Energy Budget Allocations using Two-mode VCS-EDF Scheme
      4.2.4 Elementary Schedules
   4.3 Dynamic Dual-Policy Scheduling Model
      4.3.1 Schedule for Only Periodic Tasks
      4.3.2 Schedule for Both Periodic and Aperiodic Tasks
      4.3.3 Schedules for Transition between Elementary Schedules
   4.4 Dual-Policy Dynamic Scheduling Using VCS-EDF
      4.4.1 Terms and Conditions
      4.4.2 Switching Scheduling Policies
   4.5 Algorithm for Dual-Policy Dynamic Scheduling
   4.6 Performance Evaluation
      4.6.1 Energy Consumption
      4.6.2 Average Response Time
   4.7 Conclusion

5 SCALING OF ENERGY CONSUMPTION AND RESPONSIVENESS
   5.1 Introduction
   5.2 Feasible Ranges of Utilization and Energy Consumption for Periodic Tasks
   5.3 Scaling Energy Saving and Responsiveness in Dual-Policy Dynamic Scheduling
      5.3.1 Scaling Factor
      5.3.2 S1 Schedule with Scaling Factor
      5.3.3 Scaling under a Constrained Energy Budget
   5.4 Performance Evaluation
   5.5 Conclusion

6 CONCLUSIONS AND FUTURE WORK
   6.1 Contributions
   6.2 Future Research Directions

LIST OF REFERENCES
BIOGRAPHICAL SKETCH

LIST OF TABLES

Table
1-1 Microprocessors that allow the core to operate at different voltages and frequencies
4-1 Switching Scheduling Policies with respect to the Interval Changes
4-2 An example of increased L-mode execution times for periodic tasks when UA = 0.3
4-3 An example of increased L-mode execution times and energy saving for periodic tasks when UA = 0.3
5-1 Average response times with respect to energy saving factor S (S1 schedule) when Ec = Em, Up = 1.2, UA = 0.3

LIST OF FIGURES

Figure
2-1 Online reclamation algorithm for voltage-clock scaling
2-2 The percentage of execution time in H-mode
2-3 The percentage of the reduction for the time in H-mode
2-4 The relative energy consumption of fixed, static, and dynamic approaches
2-5 An example of finding the optimal aL
2-6 The optimal L-mode voltages VL (VH = 3.3 V, 1.3 Watts, and 50 MHz)
3-1 An example of the deadline assigning algorithm in Total Bandwidth Server
3-2 The relationship between power consumption and utilization for a set of periodic tasks
3-3 The algorithm for constrained energy budget allocation
3-4 The responsiveness to the energy allocation of periodic tasks
3-5 The ratios of available utilization to the minimum utilization for periodic tasks
3-6 Energy allocation percentage to the maximum energy demand
3-7 Minimum average response time with respect to the bounded energy budget
4-1 An example of scheduling mixed real-time tasks and two explicit intervals in the event pattern
4-2 Switching schedules for a busy cycle starting with a periodic task and non-zero WD
4-3 Switching schedules for a busy cycle starting with a periodic task and all WD(t)O0
4-4 Switching schedules for a busy cycle starting with an aperiodic task and WD(t) > 0
4-5 Switching from S1 to S2 schedules
4-6 Switching from S2 to S1 schedules
4-7 The algorithm for Dual-Policy Dynamic Scheduling
4-8 The percentages of executed times to the total execution times over various energy budgets in cases of Up = 0.8 and 1.0 for a fixed UA = 0.3
4-9 The percentages of executed times to the total execution times over various energy budgets in cases of Up = 1.2 and 1.4 for a fixed UA = 0.3
4-10 The percentages of time executed in high-speed mode to the total execution times over various energy budgets in cases of Up = 1.2
4-11 The percentages of time executed in low-speed mode to the total execution times over various energy budgets in cases of Up = 1.2
4-12 Energy consumption ratios of dual- and single-policy dynamic scheduling to static analysis for the variation of the real execution time demands of periodic tasks when Up = 1.2
4-13 Average response times in regard to the actual workload demands of periodic tasks when Up = 1.2
4-14 The responsiveness with respect to the bounded energy budgets when Up = 1.2
4-15 The responsiveness with respect to the bounded energy budgets when Up = 0.8 and 1.0
4-16 The responsiveness with respect to the bounded energy budgets when Up = 1.2 and 1.4
4-17 The percentages of execution time in changed mode execution from assigned mode to the total execution times when Up = 1.2
5-1 The relationship between power consumption and utilization for a set of periodic tasks
5-2 Sharing of energy consumption and utilization in Dual-Policy Dynamic Scheduling
5-3 The range of energy scaling factor
5-4 Feasible scaling ranges of energy consumption and responsiveness under a constrained energy budget
5-5 The percentages of executed times to the total execution times over various energy budgets in cases of Up = 1.2 for a fixed UA = 0.3
5-6 The increases in high- and low-speed mode execution times and energy saving by them when Up = 1.2 for a fixed UA = 0.3
5-7 Energy saving with respect to Single-Policy Dynamic Scheduling with the variations of S (S1 schedule) and constrained energy budgets when Up = 1.2 and UA = 0.3
5-8 Average response times with respect to energy scaling factor S (S1 schedule) and constrained energy budgets when Up = 1.2 and UA = 0.3
5-9 Comparisons of responsiveness between SPDS and DPDS with respect to constrained energy budget, Ec = Emin + ωEdiff, and energy scaling factor
5-10 The percentages of switched execution times from assigned mode in the worst case for the intervals when the transitory schedule S12 is selected to total execution time
5-11 The percentages of execution time in changed mode execution from assigned mode in the worst case with respect to constrained energy budget, Ec = Emin + ωEdiff, and energy scaling factor
5-12 Comparisons of energy consumption ratios to static analysis between SPDS and DPDS with respect to constrained energy budget Ec = Emin + ωEdiff over the variation of workload demand in periodic tasks
5-13 Comparisons of average response times between SPDS and DPDS with respect to constrained energy budget Ec = Emin + ωEdiff over the variation of workload demand in periodic tasks

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

VOLTAGE-CLOCK SCALING AND SCHEDULING FOR ENERGY-CONSTRAINED REAL-TIME SYSTEMS

By

Yoonmee Doh

May 2003

Chairman: Jih-Kwon Peir
Major Department: Computer and Information Science and Engineering

Exponentially increasing demand for portable devices, and for more sophisticated and intelligent functionality in embedded applications, makes processors far more power-hungry. Because the energy budget is limited, low-power computing, which runs at a relatively low level of power consumption, and power-aware computing, which uses knowledge of the power source, play an important role in extending system lifetime.
In this dissertation, we study how to reduce power consumption using a low-power voltage-clock scaling scheme in real-time systems subject to strict time and energy constraints. The power-delay tradeoff has a significant impact on the schedulability as well as on the performance of such systems.

First, we focus on dynamic reclaiming of early-released resources in earliest deadline first (EDF) scheduling using voltage-clock scaling. In addition to a static voltage assignment, we propose a new dynamic-mode assignment, which sets the voltage mode flexibly at run time and enables much larger energy savings. Using simulation results and exploiting the interplay among power supply voltage, frequency, and circuit delay in CMOS technology, we find the optimal two-level voltage settings that minimize energy consumption.

Next, we study battery-driven real-time systems that jointly schedule hard periodic tasks and aperiodic (sporadic) tasks, and whose power sources are bounded in a feasible range determined by the task set. Reducing power consumption is therefore not the only objective of task scheduling in this case. To make the most of the available energy budget, an effective energy-sharing scheme is proposed using two-mode voltage-clock scaling EDF (VCS-EDF) scheduling.

Based on the proposed energy allocation model for energy-constrained real-time systems, a dual-policy dynamic scheduling method is proposed, aimed not only at faster responsiveness but also at reduced power consumption. The key feature of the approach is the intermixing of two schedules, one consuming the minimum energy and the other the worst-case energy, which leads to a much longer lifetime under a bounded energy budget.

Lastly, we build a scaling mechanism that tunes energy saving and responsiveness to the performance target of the system.
Fully utilizing the way constrained energy and processor capacity are shared in the total bandwidth server, we adjust total power consumption and the average response time of aperiodic tasks by simply controlling the utilization of periodic tasks.

CHAPTER 1
INTRODUCTION

1.1 Low-power and Power-aware Real-Time Systems

In recent years, the rapidly increasing popularity of portable battery-operated computing and communication devices has motivated research efforts toward low power and energy consumption. Battery life is the primary constraint for energy-constrained systems such as personal mobile phones, laptop computers, digital cameras, and personal digital assistants (PDAs) [1]. Limiting power is also important for general-purpose and high-performance systems. According to a study of electricity use in the U.S., office and network equipment uses about 74 TWh per year, which is only about 2% of total U.S. electricity use, and power management currently saves 23 TWh per year [2]. The exponential growth of Internet usage also calls for power-efficient server farms, which are warehouse-sized buildings filled with the servers of Internet service providers (ISPs). For example, an ISP with 8,000 servers needs 2 megawatts of power [3]. The power demanded by information technology (IT) will grow exponentially and surpass all other power uses combined in the near future. Both energy-constrained electronics and servers therefore clearly need processors that consume less energy. Given the limited battery capacity of portable systems and the need for power efficiency in high-end systems, the main purpose of power-aware computing is to maintain adequate computing performance without wasteful drain.
There has therefore been tremendous interest in power management for portable and mobile embedded systems, with evident advances in circuit design, processor architecture, and operating systems, and a move toward the system level and higher levels of the design architecture. Research over the past several years shows two major shifts in the field of power-aware computing. First, power-efficient designs, i.e., low-power designs, are no longer only for portable computing systems. Power dissipation has rapidly become a first-order design constraint in almost every type of computing system, including handheld devices, desktop computers, and even high-performance computing servers. Second, computing that takes into account the state of power capacity and power estimates for individual components or the whole system has been introduced as power-aware computing.

For power-efficient designs, significant reductions in power consumption are possible with techniques ranging from low-power CMOS circuit design to power management software tools. Initial power management efforts focused on putting the system into a low-power, low-performance sleep mode when it was idle. With the advent of the Advanced Configuration and Power Interface (ACPI) standard, such power management has been successfully employed in real-life systems. However, the efficacy of such approaches depends on efficient ways to decide when and which device should be shut down and woken up.

An energy characteristic of CMOS technology called voltage-clock scaling (VCS) allows designers to obtain dramatic power reductions. Power is proportional to the square of the operating voltage and proportional to the clock frequency. Modern processor cores can operate in different voltage ranges to achieve different levels of energy efficiency, exhibiting a tradeoff between operating voltage and frequency. In other words, lowering the voltage means lowering the frequency and vice versa.
Scaling down both the operating voltage and the clock frequency lets a low-power design save energy, and this approach is being studied actively. For instance, an ARM7D processor can run at 33 MHz and 5 V as well as at 20 MHz and 3.3 V. The energy performance measures at these two operating modes are 185 MIPS/Watt and 579 MIPS/Watt, and the MIPS measures are 30.6 and 19.1, respectively. Another example is Motorola's PowerPC 860 processor, which can be operated in a high-performance mode at 50 MHz with a supply voltage of 3.3 V, or in a low-power mode at 25 MHz with an internal voltage of 2.4 V. The power consumption in the high-performance mode is 1.3 Watts, compared with 241 mW in the low-power mode.

Advances in low-power circuit design and power management have introduced a new research field of low-power computing, which emphasizes running at a low energy consumption level whenever possible while preserving application performance. In addition to low-power designs, power-aware computing that reflects the state of battery capacity and the characteristics of batteries is widely studied. In general, however, the timeliness of computing receives little consideration in power-aware computing.

Real-time systems are often deployed in specific and harsh environments, such as stand-alone or isolated locations, so power consumption becomes a critical design issue. More importantly, they have strict requirements of temporal correctness as well as functional correctness. Missing a deadline in a real-time system is considered a failure and may have catastrophic consequences. For instance, a missed deadline in the control loop of an auto-landing system may cause a crash. Even for non-time-critical tasks with soft or firm deadlines, a delay in task completion greatly reduces the value of the computation results. Thus, we can generalize system failures to include non-responsiveness or tardy outputs of the system.
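As a rough check on the ARM7D and PowerPC 860 figures quoted earlier in this section, the sketch below derives power and energy per instruction from the MIPS and MIPS/Watt values, and compares the quoted power ratios with a first-order CMOS model in which power scales with the square of the voltage times the frequency. The function names and the simplifications of the model (no leakage, no fixed overhead) are assumptions made for illustration and are not part of the original analysis.

```python
def power_watts(mips: float, mips_per_watt: float) -> float:
    """Power drawn when delivering `mips` at an efficiency of `mips_per_watt`."""
    return mips / mips_per_watt


def energy_per_million_instr_mj(mips_per_watt: float) -> float:
    """Energy in millijoules needed to execute one million instructions."""
    return 1000.0 / mips_per_watt


def first_order_power_ratio(v_low: float, v_high: float,
                            f_low: float, f_high: float) -> float:
    """Low/high power ratio under the assumed model P ~ V^2 * f."""
    return (v_low / v_high) ** 2 * (f_low / f_high)


# ARM7D operating points quoted in the text.
arm_high = power_watts(30.6, 185.0)   # 33 MHz, 5 V
arm_low = power_watts(19.1, 579.0)    # 20 MHz, 3.3 V
arm_quoted_ratio = arm_low / arm_high
arm_model_ratio = first_order_power_ratio(3.3, 5.0, 20.0, 33.0)

# PowerPC 860 operating points quoted in the text (1.3 W and 241 mW).
ppc_quoted_ratio = 0.241 / 1.3
ppc_model_ratio = first_order_power_ratio(2.4, 3.3, 25.0, 50.0)

if __name__ == "__main__":
    print(f"ARM7D power: {arm_high:.4f} W (high), {arm_low:.4f} W (low)")
    print(f"ARM7D energy per 1e6 instructions: "
          f"{energy_per_million_instr_mj(185.0):.3f} mJ (high), "
          f"{energy_per_million_instr_mj(579.0):.3f} mJ (low)")
    print(f"ARM7D low/high power ratio: quoted {arm_quoted_ratio:.3f}, "
          f"model {arm_model_ratio:.3f}")
    print(f"PowerPC 860 low/high power ratio: quoted {ppc_quoted_ratio:.3f}, "
          f"model {ppc_model_ratio:.3f}")
```

Under these assumptions the low-voltage ARM7D mode needs roughly a third of the energy per instruction of the high-voltage mode (1/579 versus 1/185 joules per million instructions), which is the sense in which slowing down saves energy and not only power. For both processors the quoted power ratio comes out somewhat lower than the first-order model predicts, so the model is best read as a trend and not as a fit.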
Therefore, to achieve effective power savings in real-time systems without violating the time constraints of real-time applications, we propose a new concept of power-aware real-time computing and its system model. It supports an integration of hard and soft real-time tasks as well as multi-task soft real-time applications, such as the multimedia applications currently attracting much attention.

In this dissertation, we first investigate the issues related to low-power real-time scheduling and significantly improve energy efficiency in real-time systems by combining power reduction characteristics with real-time scheduling. For hard real-time scheduling on a single processor, we propose VCS-EDF scheduling, which focuses on dynamic reclaiming of early-released resources in earliest deadline first scheduling using voltage-clock scaling. In addition to a static voltage assignment, we propose a new dynamic-mode assignment, which sets the voltage mode flexibly at run time and enables much larger energy savings. Using simulation results for VCS-EDF scheduling and exploiting the interplay among power supply voltage, frequency, and circuit delay in CMOS technology, we find the optimal two-level voltage settings that minimize energy consumption.

Secondly, we study battery-driven real-time systems that jointly schedule hard periodic tasks and aperiodic (sporadic) tasks, and whose available energy is bounded in a range determined by the task set. Reducing power consumption is therefore not the only objective of task scheduling in this case, since response time may grow without bound when only total energy reduction is considered. To make the best of an available energy budget constrained over a specific time period, an effective energy-sharing scheme should serve the combined objectives. We model and solve these problems for a real-time system based on the tasks' requirements, their energy and utilization demands, and the power-utilization property of VCS-EDF scheduling.
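To make the static two-mode assignment discussed above more concrete, the following sketch checks EDF schedulability when some periodic tasks are pinned to a slower L-mode. It assumes the classic EDF utilization bound for independent periodic tasks with deadlines equal to periods, and that execution time grows in inverse proportion to clock frequency. The `Task` type, its field names, and the example task sets are hypothetical and do not come from the dissertation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Task:
    wcet_high: float   # worst-case execution time at the high-speed clock
    period: float      # period, also used as the relative deadline
    low_mode: bool     # True if the task is statically assigned to L-mode


def total_utilization(tasks: List[Task], slowdown: float) -> float:
    """Sum of C_i / T_i, stretching C_i by `slowdown` for L-mode tasks."""
    return sum(
        (t.wcet_high * slowdown if t.low_mode else t.wcet_high) / t.period
        for t in tasks
    )


def edf_schedulable(tasks: List[Task], slowdown: float,
                    eps: float = 1e-9) -> bool:
    """EDF utilization test (assumed model): feasible if utilization <= 1."""
    return total_utilization(tasks, slowdown) <= 1.0 + eps


# A slowdown of 2.0 corresponds to a 50 MHz / 25 MHz clock pair.
SLOWDOWN = 50.0 / 25.0

all_low = [Task(1.0, 10.0, True), Task(2.0, 20.0, True), Task(3.0, 15.0, True)]
# Adding a fourth task in L-mode overloads the processor ...
heavier = all_low + [Task(1.5, 10.0, True)]
# ... but the same task fits if it runs in H-mode instead.
mixed = all_low + [Task(1.5, 10.0, False)]

if __name__ == "__main__":
    for name, task_set in [("all_low", all_low), ("heavier", heavier), ("mixed", mixed)]:
        u = total_utilization(task_set, SLOWDOWN)
        print(f"{name}: U = {u:.2f}, schedulable = {edf_schedulable(task_set, SLOWDOWN)}")
```

The example shows why a mode assignment is a tradeoff: running every task slowly saves the most energy per task, but only the assignments whose stretched utilization stays within the processor capacity are admissible.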
We first profile the energy and utilization usage of periodic and aperiodic tasks, focusing on EDF scheduling with static VCS. Then, for sharing the bounded energy budget in a real-time system, an energy budget allocation method is proposed. We also propose a static scheduling algorithm that determines the optimal two-level voltage settings of all tasks under bounded energy consumption, while guaranteeing that no deadline of a periodic task is missed and that the average response time of aperiodic tasks is minimized. The algorithm selects, among the schedulable voltage settings within a given energy consumption, those with the minimum average response time. To schedule aperiodic tasks, we adopt the Total Bandwidth Server proposed by Spuri and Buttazzo, a bandwidth-preserving algorithm that guarantees the isolation of bandwidth between periodic and aperiodic tasks.

Thirdly, we extend our energy budget allocation model to a dual-policy dynamic scheduling that significantly improves energy efficiency and extends the lifetime of battery-driven real-time systems. The proposed dynamic scheduling switches between two scheduling policies using an explicit pattern of event occurrence at run time, exploiting the interplay between utilization and energy consumption in VCS-EDF and taking a new view of the total bandwidth server algorithm in terms of sharing constrained energy. One scheduling policy is a set of worst-case voltage settings derived from the proper allocation of the energy budget. The other lets a given periodic task set consume the minimum energy under the allocated energy consumption. During intervals containing only periodic tasks, switching execution from the high-speed mode of the worst-case schedule to the low-speed mode enables Dual-Policy Dynamic Scheduling to consume less energy than a scheduling method that sticks to the running modes of the worst case.
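The Total Bandwidth Server mentioned above gives each aperiodic request a deadline chosen so that the server never demands more than its reserved share of the processor. A minimal sketch of the standard deadline-assignment rule, d_k = max(r_k, d_{k-1}) + C_k / U_s with d_0 = 0, follows. The function name, argument layout, and example requests are illustrative assumptions, and the energy-sharing view of the server developed in the dissertation is not reproduced here.

```python
from typing import List, Tuple


def tbs_deadlines(requests: List[Tuple[float, float]],
                  server_utilization: float) -> List[float]:
    """Assign Total Bandwidth Server deadlines to aperiodic requests.

    `requests` holds (arrival_time, execution_time) pairs in arrival order.
    `server_utilization` is the processor share U_s reserved for the server.
    """
    if not 0.0 < server_utilization <= 1.0:
        raise ValueError("server utilization must be in (0, 1]")
    deadlines = []
    previous_deadline = 0.0
    for arrival, execution_time in requests:
        # Start from the later of the arrival and the previous deadline,
        # then add the time the request needs at the server's share.
        deadline = max(arrival, previous_deadline) + execution_time / server_utilization
        deadlines.append(deadline)
        previous_deadline = deadline
    return deadlines


example = tbs_deadlines([(3.0, 1.0), (9.0, 2.0), (14.0, 1.0)], 0.25)

if __name__ == "__main__":
    print(example)
```

With a server share of 0.25, the three requests receive deadlines 7, 17, and 21. The second and third requests exercise the two branches of the max: one arrives after the previous deadline, the other before it. Once assigned, the requests are scheduled by EDF together with the periodic tasks.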
Finally, within the proposed Dual-Policy Dynamic Scheduling, we build a scaling mechanism that tunes energy consumption and responsiveness to the performance target of a system. Fully utilizing the way energy and utilization are shared in the total bandwidth server under a limit on energy consumption, we characterize the performance of Dual-Policy Dynamic Scheduling and show how total power consumption and average response time can be scaled up or down by controlling the utilization of periodic tasks within their feasible range.

1.2 Research Background and Related Works

1.2.1 Dynamic Power Management

One approach to low power is to use power-down features to minimize the power consumption of unused hardware. For instance, turning off (or dimming) the screen while a laptop computer is idle and shutting down hard disks while they are not accessed conserves energy without affecting functionality. Packing message transmissions delays the state change of the radio from listening mode to transmit mode, which extends battery life for mobile devices in a wireless communication environment. However, reactivating hardware can take some time, which affects performance measures such as response time. Simple power-down-when-idle techniques can significantly reduce a system's power consumption [1].

Dynamic power management (DPM) reduces power consumption by selectively putting idle system components into low-power states. Each component in a system thus has several power states, and the state transitions are controlled by commands issued by a power manager (PM), which decides when and how to force power state transitions based on its observations of and/or assumptions about the workload. The power manager makes state transition decisions according to the power management policy [4].
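One simple policy of the kind a power manager might apply is a break-even test: sleep only if the expected idle period is long enough to repay the cost of the state transitions. The sketch below assumes a two-state device (idle and sleep) with a lumped transition time and energy for going to sleep and waking up. All names and numbers are hypothetical and are not taken from the cited work.

```python
def break_even_time(p_idle: float, p_sleep: float,
                    e_transition: float, t_transition: float) -> float:
    """Shortest idle period (seconds) for which sleeping saves energy.

    Staying idle for T costs p_idle * T. Sleeping costs
    e_transition + p_sleep * (T - t_transition), valid for T >= t_transition.
    Equating the two gives the break-even period.
    """
    if p_idle <= p_sleep:
        raise ValueError("sleep state must draw less power than idle state")
    t = (e_transition - p_sleep * t_transition) / (p_idle - p_sleep)
    # The device cannot sleep for less than the transition time itself.
    return max(t, t_transition)


def should_sleep(predicted_idle: float, p_idle: float, p_sleep: float,
                 e_transition: float, t_transition: float) -> bool:
    """Policy decision: sleep only if the predicted idle period pays off."""
    return predicted_idle > break_even_time(p_idle, p_sleep,
                                            e_transition, t_transition)


# Hypothetical device: 1.0 W idle, 0.1 W asleep, 2.0 J and 0.5 s per
# sleep/wake round trip.
DEVICE = dict(p_idle=1.0, p_sleep=0.1, e_transition=2.0, t_transition=0.5)

if __name__ == "__main__":
    print(f"break-even time: {break_even_time(**DEVICE):.3f} s")
    for idle in (1.0, 3.0):
        print(f"idle {idle:.1f} s -> sleep: {should_sleep(idle, **DEVICE)}")
```

The quality of such a policy rests on how well the idle period is predicted, which is why policies that depend on workload assumptions leave room for refinement.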
Because shutting down an active device and waking up a sleeping device take time and extra energy, changing power states incurs delay and/or energy overhead, and the policies trade performance for power consumption. Microsoft proposed the OnNow Initiative, which controls power/performance through the software layer [5]. Intel, Microsoft, and Toshiba proposed the open industry standard called the Advanced Configuration and Power Interface (ACPI), with which system-level power management has become established in real-life systems [6]. Linux has also supported ACPI since kernel 2.4 [7]. However, because the policies depend on workload predictions and/or assumptions, such approaches need further refinement before systems can be optimally power managed.

1.2.2 Low Power vs. Power-aware Computing

Limiting power consumption is a critical issue in computing, particularly for portable and mobile devices. The field of computer architecture has recently seen rapidly expanding interest in power reduction at the architectural and software levels. Low-power computing has traditionally been the primary focus of designers of portable and battery-powered computing systems, and in the past it was largely considered a low-level circuit design issue. There are many facets of modern computers where the system designer can save energy, because most computing systems do not use all of their resources all of the time. Slack time can be exploited either to slow down computation or to save energy by shutting off components while they are not in use. Much previous research on low-power design techniques tries to minimize average power consumption by transitioning to a low-power state when the system is idle [8, 9, 10, 11], by reducing the active current drawn by a circuit while keeping the supply voltage fixed, or by scaling the supply voltage statically or dynamically.
From the viewpoint of power-aware computing, however, the goal is not only to save energy without sacrificing performance in the intended applications, and thus obtain longer battery life, but also to make the best use of the available power in power-constrained devices. For example, power-aware computing/communication (PAC/C) is pursued in a DARPA project [12], which pioneers promising areas such as power-aware algorithm designs and libraries, power-aware compilation, and power-aware operating systems, as well as a novel integrated software/hardware technology suite incorporating innovative individual power reduction technologies. In power-aware computing, the remaining capacity of the power source and the management of its consumption are very important, so that a system can use the available power effectively. Battery-driven systems such as handheld devices have more stringent power consumption limits. In such systems, battery management is also strongly emphasized for the sake of mission duration and quality of service, because of the increasing demand for miniaturization and complicated functionality. Therefore, much work has been carried out to understand the behavior and properties of the power source in power-aware computing [13, 14, 15, 16, 17]. Sometimes, to prolong the lifetime of the power source, the workload must be reduced to fit the power consumption into the available level, which leads to unnoticeably degraded performance in applications. Available energy should be well allocated among the applications in the system. For proper allocation, the hardware and software components of a system need to be redefined as power-aware ones, so that they enable the applications to remain acceptable to the user at a given power level. Power-awareness of components requires that energy consumption be quantified on a per-component basis. Much work has been done to measure and abstract these quantities at every system level, from circuits to applications.
One well-known framework for energy consumption estimation is the Advanced Configuration and Power Interface, which provides system-wide power states. For low-power software development, simulators such as Wattch [18] and SimplePower [19] estimate the power consumption of microprocessor-based systems at the instruction or architecture level. As a tool for profiling application energy usage, PowerScope quantifies what fraction of the total energy is consumed by specific processes in the system [20]. For system-on-chip (SOC) components such as memory hierarchies and system buses, power estimation and optimization techniques have been proposed [21]. X. Fan et al. investigate the memory's influence on selecting the appropriate voltage/operating frequency and extract the factors that affect the clock scaling decision to meet the system's energy/performance goals [22]. Furthermore, a circuit-level energy-monitoring tool has been proposed for embedded systems, which collects power consumption data at cycle-by-cycle resolution [23].

1.2.3 Dynamic Voltage-Clock Scaling

The basic concept of power reduction in variable-voltage processors is a technique called voltage-clock scaling, or dynamic voltage scaling, in CMOS circuit technology. CMOS circuits have both dynamic and static power consumption. Static power consumption is caused by bias and leakage currents and is insignificant in most designs that consume more than 1 mW. The dominant power consumption for CMOS microprocessors is the dynamic power. Every transition of a digital circuit consumes power, because every charge or discharge of the circuit's capacitance drains power. The dynamic power consumption is equal to

$$P_{dynamic} = C_L \, N_{SW} \, V_{DD}^2 \, f \tag{1-1}$$

where $C_L$ is the output capacitance, $N_{SW}$ the number of switches per clock, $V_{DD}$ the supply voltage, and $f$ the clock frequency.
It is clear that power consumption in CMOS digital circuits is proportional to the square of the supply voltage ($V_{DD}$) and to the clock frequency, so lowering $V_{DD}$ is the most effective means of reducing power consumption. Lowering $V_{DD}$, however, causes the problem of increased circuit delay. The circuit delay $t_d$ is given by the following equation [24]:

$$t_d = k \, \frac{V_{DD}}{(V_{DD} - V_T)^2} \tag{1-2}$$

where $k$ is a constant depending on the gate size and the output capacitance, and $V_T$ is the threshold voltage. As the clock frequency is inversely proportional to the circuit delay, it can be expressed using $t_d$ and the logic depth of a critical path, $L_d$ [25]:

$$f = \frac{1}{L_d \, t_d} \tag{1-3}$$

From these equations, we can see that there is a fundamental tradeoff between circuit delay and supply voltage. A processor can operate at a lower supply voltage, but only if the clock frequency is reduced to tolerate the increased propagation delay. Kuroda et al. use voltage scaling in the design of a processor core that automatically adjusts its internal supply voltages to the minimum required by its operating frequency [26]. More recently, a number of variable-voltage processors have appeared that allow the supply voltage to be changed. Based on the VCS technique, most of today's processor cores have been designed to operate at different voltage ranges to achieve different levels of energy efficiency, as shown in Table 1-1.
Table 1-1. Microprocessors that allow the core to operate at different voltages and frequencies

Processor          Voltage (V)   Speed (MHz)   Power consumption (W)   Features
StrongARM SA-2     1.30          600           0.45                    12-fold energy reduction
                   0.75          150           0.04
Pentium III        1.60          650           22                      SpeedStep Technology
                   1.35          500           9                       2 modes
Crusoe (TM5400)    1.65          700           2                       16 levels
                   1.10          200           1                       in steps of 33 MHz
XScale             0.75          150           0.04                    185 MIPS
                   1.6           800           0.9                     1000 MIPS
ARM7D              5.0           33            0.165                   185 MIPS/W
                   3.3           20            0.033                   579 MIPS/W
PowerPC 860        3.3           50            1.3                     2 modes
                   2.4           25            0.241

The StrongARM SA-2 processor, designed by Intel, is estimated to dissipate 500 mW at 600 MHz but only 40 mW when running at 150 MHz, a 12-fold energy reduction for a 4-fold performance reduction [27], and the Pentium III processor with SpeedStep technology dissipates 22 W at 650 MHz but 9 W at 500 MHz [28]. Likewise, AMD has added clock and voltage scaling to the AMD Mobile K6 Plus processor family, and Transmeta has announced the TM5400, or Crusoe, processor [29], which supports voltage scaling with 16 levels ranging from 1.65 V at 700 MHz to 1.1 V at 200 MHz in steps of 33 MHz. For wireless and networking services, Intel has also introduced the XScale microarchitecture, which features on-the-fly scaling of voltage and frequency in the range from 40 mW at 150 MHz up to 0.9 W at 800 MHz [30]. Pouwelse et al. show several levels of clock frequency versus supply voltage for the Transmeta Crusoe processor and carried out an actual implementation in a wearable computer for low-power computing [31]. Digital CMOS circuits are very amenable to DVS because their performance and energy consumption scale together over a wide range of supply voltages. Although the maximum supply voltage drops with improved process technology, which reduces the range, DVS will continue to be a viable technique for future process technologies.
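The tradeoff expressed by the dynamic power, circuit delay, and clock frequency relations of this section can be explored numerically. The following is a minimal sketch, not part of the original work: the function names are illustrative, and the constants in the demo (k, the threshold voltage, the logic depth, and the capacitance) are hypothetical values chosen only to make the example concrete.

```python
def dynamic_power(c_load, n_sw, v_dd, freq):
    """Dynamic power P = C_L * N_SW * V_DD^2 * f."""
    return c_load * n_sw * v_dd ** 2 * freq


def circuit_delay(k, v_dd, v_t):
    """Circuit delay t_d = k * V_DD / (V_DD - V_T)^2."""
    if v_dd <= v_t:
        raise ValueError("supply voltage must exceed the threshold voltage")
    return k * v_dd / (v_dd - v_t) ** 2


def max_frequency(k, v_dd, v_t, logic_depth):
    """Clock frequency f = 1 / (L_d * t_d) that the circuit sustains at V_DD."""
    return 1.0 / (logic_depth * circuit_delay(k, v_dd, v_t))


def energy_per_cycle(c_load, n_sw, v_dd):
    """Energy per clock cycle, P / f = C_L * N_SW * V_DD^2."""
    return c_load * n_sw * v_dd ** 2


if __name__ == "__main__":
    # Hypothetical constants (assumptions, not taken from any data sheet).
    K, V_T, L_D = 1e-10, 0.4, 20
    C_L, N_SW = 1e-9, 1.0
    for v_dd in (1.6, 0.8):
        f = max_frequency(K, v_dd, V_T, L_D)
        p = dynamic_power(C_L, N_SW, v_dd, f)
        print(f"V_DD={v_dd:.1f} V  f={f / 1e6:.1f} MHz  P={p * 1e3:.2f} mW")
```

Whatever constants are chosen, halving the supply voltage cuts the energy per cycle by a factor of four in this model, while the loss in sustainable clock rate depends on how close the lower voltage is to the threshold voltage.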
1.2.4 Fundamental Scheduling Theories

1.2.4.1 RMA and EDF Scheduling

Rate monotonic analysis (RMA) is a collection of quantitative methods and algorithms with which we can specify, understand, analyze, and predict the timing behavior of real-time system designs. RMA grew out of the theory of fixed-priority scheduling. A theoretical treatment of the problem of scheduling periodic tasks was first given by Serlin in 1972 [32] and then more comprehensively by Liu and Layland in 1973 [33]. They studied an idealized situation in which all tasks are periodic, do not synchronize with one another, do not suspend themselves during execution, can be instantly preempted by higher-priority tasks, and have deadlines at the end of their periods. The term "rate monotonic" originated as a name for the optimal task priority assignment in which higher priorities are accorded to tasks that execute at higher rates (that is, as a monotonic function of rate). Rate monotonic scheduling refers to fixed-priority task scheduling that uses a rate monotonic prioritization. Although fixed-priority schedulers are widely used because a system's feasibility can be analyzed, RMA cannot guarantee the feasibility of a system that uses more than 69% of the available CPU time in the worst case [33], or 88% in the average case [34]. The worst problem with fixed-priority scheduling is that it is oblivious to deadlines, that is, blind to hard real-time requirements. Whereas rate monotonic scheduling is a static priority-driven scheduling method, earliest deadline first (EDF) is a dynamic priority-driven scheduling method. In EDF, the task with the earliest deadline is always executed first, and the policy is fairly easy to implement. The execution queue is sorted by the time of the next deadline. When a task becomes active, it must be inserted in the execution queue, or, if its deadline is before the deadline of the currently executing task, it must preempt the currently executing task.
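The 69% worst-case figure quoted above corresponds to the Liu and Layland utilization bound n(2^(1/n) - 1), which tends to ln 2 (about 0.693) as the number of tasks n grows; for EDF with deadlines equal to periods, the corresponding test is simply that total utilization does not exceed 1. The sketch below contrasts the two tests. The function names and the example task set are illustrative assumptions, not taken from the dissertation.

```python
def rm_utilization_bound(n):
    """Liu-Layland bound n * (2^(1/n) - 1) for n rate-monotonic tasks."""
    if n < 1:
        raise ValueError("need at least one task")
    return n * (2 ** (1.0 / n) - 1)


def total_utilization(tasks):
    """tasks is a list of (execution_time, period) pairs."""
    return sum(c / t for c, t in tasks)


def rm_sufficient_test(tasks):
    """Sufficient (not necessary) schedulability test for rate monotonic."""
    return total_utilization(tasks) <= rm_utilization_bound(len(tasks))


def edf_test(tasks):
    """EDF test for independent periodic tasks with deadline == period."""
    return total_utilization(tasks) <= 1.0


if __name__ == "__main__":
    # Hypothetical task set with total utilization of about 0.95.
    tasks = [(1, 4), (2, 5), (3, 10)]
    print("utilization:", total_utilization(tasks))
    print("RM bound for 3 tasks:", rm_utilization_bound(3))
    print("passes RM sufficient test:", rm_sufficient_test(tasks))
    print("passes EDF test:", edf_test(tasks))
```

A task set that fails the rate monotonic test here is not necessarily unschedulable under fixed priorities, because the bound is only a sufficient condition; the EDF test, under the stated assumptions, is exact.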
EDF has only one serious drawback: the theory is optimal only for loads that can be scheduled. It does not address overloads, and EDF degrades ungracefully when it is overloaded. Another important point is that dynamic priority scheduling tends to be complex and sometimes relies on heuristics, in contrast to the analysis-based feasibility guarantees of fixed-priority scheduling. However, a processor can be fully utilized by dynamic priority scheduling.

1.2.4.2 Aperiodic Task Scheduling

In contrast to the guarantee of meeting deadlines in hard task scheduling, there is no form of guarantee for aperiodic tasks. They are usually served in the background with respect to hard tasks, so as not to jeopardize the schedulability of the hard tasks, or served by an aperiodic server to improve responsiveness. Substantial research has been done on the scheduling of aperiodic tasks in both fixed and dynamic priority systems. The sporadic server algorithm [35], the deferrable server algorithm [36], and the slack stealer algorithm [37] are designed for the fixed-priority environment. A number of algorithms that solve aperiodic task scheduling in dynamic priority systems using an EDF scheduler can be found in the literature: deferrable server [38], slack stealing [39], and sporadic server [39, 40]. To improve the response time of aperiodic requests, an efficient approach called the total bandwidth server (TBS) has been proposed by Spuri and Buttazzo [40, 41]. Yielding much faster responsiveness when periodic and aperiodic tasks coexist, the algorithm has also been extended to environments with resource sharing [42, 43]. The TBS allocates a feasible priority by assigning suitable deadlines to arriving aperiodic tasks, while guaranteeing the deadlines of the periodic tasks, and schedules the aperiodic tasks as if they were hard periodic tasks.
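The deadline assignment performed by the TBS can be sketched as follows. The rule used here is the one commonly stated for the total bandwidth server: the k-th aperiodic request, arriving at time r_k with execution time C_k, receives the deadline d_k = max(r_k, d_(k-1)) + C_k / U_s, where U_s is the server bandwidth and d_0 = 0. The class and parameter names are hypothetical, and this is a sketch of that rule rather than the modified server developed later in the dissertation.

```python
class TotalBandwidthServer:
    """Assigns EDF deadlines to aperiodic requests within a bandwidth U_s."""

    def __init__(self, server_utilization):
        if not 0.0 < server_utilization <= 1.0:
            raise ValueError("server utilization must be in (0, 1]")
        self.server_utilization = server_utilization
        self.last_deadline = 0.0  # d_0 = 0

    def assign_deadline(self, release_time, exec_time):
        """Return d_k = max(r_k, d_(k-1)) + C_k / U_s and remember it."""
        start = max(release_time, self.last_deadline)
        deadline = start + exec_time / self.server_utilization
        self.last_deadline = deadline
        return deadline


if __name__ == "__main__":
    # Hypothetical requests as (release time, execution time) pairs.
    tbs = TotalBandwidthServer(server_utilization=0.25)
    for r, c in [(3.0, 1.0), (5.0, 2.0), (20.0, 1.0)]:
        print(f"request at t={r}, C={c} -> deadline {tbs.assign_deadline(r, c)}")
```

Once a deadline is assigned, the request is simply inserted into the EDF queue alongside the periodic jobs, which is what lets the periodic tasks keep their guarantees as long as the periodic utilization plus U_s does not exceed 1.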
By handling aperiodic tasks like periodic tasks within the reserved bandwidth, the TBS can produce better responsiveness than other aperiodic mechanisms such as the sporadic server and slack stealing, which are based on the idea of using the idle time of the periodic schedule or "stealing" all the possible processing time from the periodic tasks without causing their deadlines to be missed.

1.2.5 Low Power Real-Time Computing

Low energy consumption has already emerged as a critical feature in high-end systems as well as in power-constrained real-time systems. Dynamic voltage scaling and frequency scaling in variable-voltage processors are outstanding techniques, because the power consumed by a component implemented in CMOS technology is quadratically proportional to voltage and linearly proportional to frequency. In other words, greater energy reductions can be achieved with a slower clock and a lower voltage, but operations take longer. The longer execution time degrades application performance or threatens strict timing constraints in real-time systems. Therefore, a number of techniques combine low-power techniques with scheduling algorithms. The benefits of reducing the supply voltage and clock speed of a processor to save energy have been studied primarily through simulation. Weiser et al. showed the potential of voltage scaling for general-purpose processors using traces gathered from UNIX-based applications [44]. Their scheduling algorithms are interval-based voltage schedulers, which set the clock speed depending on processor utilization in previous intervals. Employing the same traces, Govil et al. extend the work of Weiser et al. and study several dynamic speed-setting policies that slow the processor when it is mostly idle and speed it up when it is mostly active [45]. Also using interval-based clock scheduling, Pering et al.
achieve a considerable reduction in energy consumption compared to running at full speed [46], and later they present a heuristic based on EDF scheduling [47]. Unlike previous studies of voltage scaling that rely on simulations, several studies provide raw measurements of the power-delay tradeoff in voltage scaling. Grunwald et al. investigated the power consumption of the Itsy Pocket Computer [48] for previously proposed interval schedulers and found that frequent voltage and clock changes incur unnecessary overhead. Using a wireless multimedia terminal called InfoPad, Truman et al. measure the power consumption of the portable system and present a DVS scheme to reduce it [49]. Given the importance of predicting the appropriate performance for the next interval, a workload prediction strategy using adaptive filtering has been proposed [50]. However, because the gain depends heavily on the interval length, because the optimal setting varies per application, and because the individual requirements of running tasks, e.g., deadlines and computation estimates, are not considered, interval-based voltage schedulers behave poorly for more complex workloads such as burst applications. In DVS approaches for real-time systems, scheduling is generally done at the task level using task execution times based on worst-case execution times (WCET), exploiting slack times that arise from static analysis and from run-time workload variation. The concept of real-time scheduling was first applied to dynamic speed setting by Pering and Brodersen [51]. Nonpreemptive EDF scheduling based on the ILP method has also been presented [52]. The authors controlled two different operating speeds/voltages through OS system calls to minimize the energy consumed by a periodic task set.
A minimum-energy scheduler for dynamic speed setting based on the EDF scheduling policy has been proposed [53]; its authors compare several online algorithms to an offline algorithm that assigns the optimal processor speed setting to a critical interval, that is, the interval requiring maximum processing. Similarly, Hong et al. consider a low-energy heuristic for nonpreemptive scheduling with a dynamically variable voltage range of [0.8, 3.3] V [54]. They also present a heuristic for the preemptive model with a limited number of speed changes [55]. Quan and Hu study the optimal voltage setting for fixed-priority scheduling [56]. All of these approaches require that task release times be known a priori. Hong et al. describe an online scheduling algorithm for hard real-time tasks that assumes the release times of jobs are not known a priori [57]. For the periodic task model and rate-monotonic scheduling, two online voltage-scaling methods were proposed [58], which change voltage levels at the execution stage from the initially assigned levels as such changes become necessary. Using integer linear programming (ILP), Ishihara and Yasuura present a very interesting result regarding the impact on energy of the number of available distinct voltage levels [59]. They point out that at most two voltage levels are usually enough to minimize energy consumption. In the presence of task synchronization, R. Jejurikar and Gupta use slowdown factors based on DVS to save energy under EDF scheduling [60]. There are also several hybrid approaches that exploit both DPM and DVS schemes. Transmeta has combined both schemes in a processor called Crusoe [61], targeting mobile computing, which is able to adjust its clock frequency dynamically in 33 MHz steps. By running applications at several frequencies even in the active mode of DPM, larger power savings can be obtained in addition to those provided by DPM.
Based on studies of energy-efficient design for portable devices [62], another hybrid method has been studied for MPEG applications on top of a Crusoe-based system [63]. Kim and Ha also observe tradeoffs between DPM and DVS schemes and propose a hybrid method that exploits both workload-variation slack time and the shutting off of unused components on a timeslot-by-timeslot basis [64]. In this dissertation, we apply the VCS scheme to EDF scheduling for hard real-time tasks, reclaiming resources based on the property that the actual execution times of tasks in real-world embedded systems vary by as much as 87% relative to the measured WCET [65].

1.2.6 Power-aware Real-Time Computing

Maximizing the time over which a battery can deliver energy has become one of the most important design considerations for a power-limited system, since it directly determines the length of the system's life. Instead of focusing on reducing power consumption alone, researchers have begun to study battery behavior and the effect of the battery discharge pattern on battery capacity as well [66]. In contrast to the results of previous interval-based dynamic voltage scaling, Martin and Siewiorek show, in experimental research using the Itsy handheld computer [67], that reducing power consumption is not always the best strategy when battery effectiveness is taken into account, because most batteries are non-ideal [68, 69]. They present a revised version of Weiser's PAST algorithm that accounts for the non-ideal properties of batteries and the nonlinear relationship between system power and clock frequency. Based on the nonlinearity of battery discharge, a system-level power estimation methodology has been proposed [70], which gives formulas for the energy consumed by the processor, the memory and L2 cache, the interconnects and pins, and the DC-DC converter, together with a battery capacity model. Pedram and Wu study tradeoffs between energy dissipation and delay in battery-powered digital CMOS design [71, 71]. Benini et al.
propose several DPM policies that take the battery's state of charge into account [73]. In particular, they introduce a class of closed-loop policies whose decision rules for controlling the system's operating state are based on observation of both the system workload and the output voltage of the battery. Power-aware real-time scheduling must provide two key features: it must be aware of domain-specific knowledge such as the power source, battery status, and other operating conditions, and it must guarantee the requirements of real-time tasks, assuring the best performance of real-time applications and efficient use of the constrained power. Much research has been done to introduce power-awareness into real-time computing, from the properties of the power source down to the instruction level with the assistance of the compiler. For distributed real-time embedded systems with voltage-scalable capability, Luo and Jha suggest two schemes that optimize battery lifetime by modeling the battery [74]. One is a battery-aware scheduling scheme based on heuristics that minimize the peak power consumption and reduce the variance of the discharge current profile in order to improve battery efficiency. The other is a variable-voltage scheduling scheme based on efficient slack-time reallocation. Motivated by the task requirements and power source characteristics of a Mars rover with two power sources, a solar panel and a non-rechargeable battery, Liu et al. propose a power-aware real-time scheduling method [75]. All of these prior studies try to reduce total energy consumption by exploiting the properties of the scaling scheme or of the power source. In addition, real-time embedded systems whose energy budget is lower than the maximum that the tasks could consume need, besides a reduction in total energy consumption, an efficient scheme for sharing the limited energy budget among competing real-time and non-real-time tasks.
For systems consisting of soft periodic tasks, the objective of minimizing power consumption using a popular energy reduction technique, voltage-clock scaling, will result in slow execution. On the other hand, in many cases the battery capacity can be replenished or the mission lifetime is finite. Minimizing power consumption in a way that does not use all of the available energy may not lead to optimal system performance. A better power control strategy in such cases is to minimize the response times of soft real-time tasks, provided that the deadlines of hard real-time tasks are met and the average power consumption is bounded. In spite of widespread recognition of the importance of energy, there is no convenient abstraction of power consumption and sharing in real-time scheduling. One of the goals of this dissertation is to abstract the energy usage and utilization required by real-time tasks and to allocate the constrained energy budget as a first-class criterion in real-time scheduling. Building on quantitative analyses of a system's power-awareness, such as power estimation and monitoring at the system-component or instruction level, we abstract energy and utilization usage on a per-task basis by exploiting the property of voltage-clock scaling. We also build a constrained energy budget allocation model that enables tasks to share the available energy with regard to the requirements of real-time tasks, and we present a static analysis based on voltage-clock scaling. Furthermore, we propose an energy-efficient real-time dynamic scheduling method, which switches between two scheduling policies to save energy, and construct an algorithm that tunes energy consumption and responsiveness according to the requirements of the real-time application.

1.3 Main Contributions

The objectives of the dissertation are to design a low-power scheduling method that reduces total power consumption for hard real-time applications and to build power-aware real-time systems that support energy efficiency for real-time applications under a bounded energy budget.
The main contributions of the dissertation are as follows:

A low-power scheduling for hard real-time systems. We devise comprehensive scheduling algorithms for integrated real-time systems. They include (1) a fundamental two-mode scheduling theory, (2) reduction of power consumption using power-aware voltage-clock scaling, (3) dynamic reclaiming of early released resources, and (4) optimal voltage setting in VCS-EDF scheduling.

Power-aware scheduling for mixed real-time tasks under a constrained energy budget. In order to develop a power-aware static analysis for real-time embedded systems, the relationship between energy and workload demands is investigated. We develop an energy budget allocation model for battery-driven real-time systems, including not only widely used small appliances but also special-purpose sensor network nodes. The model addresses (1) integrated scheduling of both hard and soft real-time tasks in a system, (2) profiling of energy and processor capacity usage on a per-task basis, (3) efficient sharing of a bounded energy budget in battery-driven real-time systems, and (4) better performance for non-real-time applications.

Dynamic scheduling for mixed real-time tasks considering power efficiency. We demonstrate that more sophisticated and effective power-aware dynamic scheduling can improve the overall power consumption of real-time systems. The work includes (1) a fundamental dual-policy dynamic scheduling theory, (2) static analysis using the energy budget allocation model, (3) a modified Total Bandwidth Server for scheduling aperiodic tasks, and (4) reduction of total energy consumption.

Optimizing energy efficiency and responsiveness. Given the energy-utilization tradeoffs in power-aware real-time scheduling, and based on an understanding of their energy and performance implications, we show that adjusting the periodic tasks' workload within the schedulable range can lead to effective use of the limited energy budget.
The tuning of the periodic tasks' workload implies (1) making the best use of a constrained energy budget, (2) appropriate sharing of energy and processor computation capacity in run-time scheduling, (3) proper adjustment of energy consumption and responsiveness to the system's performance, and (4) reduction of total energy consumption while pursuing the system's target performance.

1.4 Organization of the Dissertation

The rest of the dissertation is organized as follows. Chapter 2 presents a solution for reducing total energy consumption in hard real-time systems. It proposes the two-mode Voltage-Clock Scaling EDF scheduler, which uses a simple arrangement in which the processor of a real-time system can be dynamically configured in one of two modes, named low-voltage (L) mode and high-voltage (H) mode. In L-mode, the processor is supplied with a low voltage ($V_L$) and runs at a slow clock rate. In H-mode, the processor is supplied with a high voltage ($V_H$) and runs at a fast clock rate. Chapter 3 investigates the VCS problem of sharing limited energy in battery-driven real-time embedded systems. To provide real-time guarantees, the delay penalty of VCS needs to be carefully considered in real-time scheduling. In addition to real-time requirements, the systems may contain non-real-time tasks whose response time should be minimized. Thus, a combination of optimization objectives must be addressed when we establish a scheduling policy under a power consumption constraint. In this dissertation, we devise VCS-EDF scheduling to properly allocate a limited energy budget to mixed hard and soft real-time tasks. Based on the schedulability of VCS-EDF, we first analyze the characteristics of the energy and utilization demand of periodic and aperiodic tasks, and then profile energy and utilization usage from the viewpoint of real-time scheduling.
The profiling enables developers to build an energy-efficient or power-aware real-time schedule, and it is directly related to how precisely a bounded energy cost can be assigned to the tasks of a specific real-time application. Then, we build a model for sharing a constrained energy budget among real-time tasks and propose a static real-time scheduling method that assigns an operating voltage/frequency to each task while satisfying the requirements of hard and soft real-time applications. To obtain better performance for aperiodic tasks, the voltage settings that result in the fastest responsiveness are selected among the schedulable ones under a given energy budget. Chapter 4 proposes an efficient energy-saving scheme for run-time scheduling called dual-policy dynamic real-time scheduling. It is intended for real-time embedded systems that jointly schedule both hard and soft tasks and whose energy usage is bounded. The constrained energy allocation scheme proposed in Chapter 3 determines a set of voltage settings that makes the best use of the assigned energy and gives better performance for aperiodic tasks. In Chapter 4, based on the voltage settings determined by the constrained energy allocation, we propose a dynamic scheduling method that ensures an extended lifetime of the system under the given energy budget. The proposed approach intermixes two elementary schedules and rests on the following observations. When the total bandwidth server algorithm is applied to mixed tasks under a constrained energy budget, the share of the processor's capacity is directly related to the sharing of energy among them. We also observe a changing pattern in the joint scheduling of periodic and aperiodic tasks, which consists of two kinds of intervals distinguished by the presence of any aperiodic event. As for energy consumption in the two-mode voltage-clock scaling EDF scheduler based on the total bandwidth server algorithm, the voltage settings for periodic tasks alone consume less energy than the ones for mixed tasks.
In the intervals that contain only periodic tasks, applying the worst-case voltage settings determined for the mixed tasks consumes more energy than applying the voltage settings assigned for periodic tasks only. Therefore, the proposed algorithm switches running modes (scheduling policies) along with the change of pattern in event occurrences. For the intervals that contain mixed events of periodic and aperiodic tasks, the worst-case schedule is used. For the other intervals, which contain only events of periodic tasks, the voltage settings for periodic tasks only are used to save energy. Chapter 5 considers a scaling mechanism for total energy saving and for the responsiveness of aperiodic tasks in the dual-policy dynamic real-time scheduling, and proposes a method to adjust energy consumption and response times to the target performance. The scheduling policies introduced in Chapter 4 as sets of voltage settings/running speeds for periodic tasks lie at the extremes of the feasible ranges of energy consumption and utilization. Because of the difference in energy consumption between the two extreme schedules, the dual-policy dynamic scheduling can save energy. In between the extreme schedules for periodic tasks, there exist other combinations of voltage settings/running speeds, which have different energy consumption and utilization. If a set of voltage settings other than the extremes is used for the intervals that consist of only periodic tasks, the energy consumption and average response time that the scheduler achieves change accordingly. Thus, we introduce a new scaling factor that controls energy consumption and responsiveness, so that the dual-policy dynamic scheduling can tune the system's performance to the requirements of the real-time system under a specific energy budget. Chapter 6 summarizes the main contributions of the dissertation and discusses future research directions.
CHAPTER 2
TWO-MODE VCS-EDF SCHEDULING FOR HARD REAL-TIME SYSTEMS

2.1 Introduction

In recent years, the rapidly increasing popularity of portable battery-operated computing and communication devices has motivated research efforts toward low power and energy consumption. Battery life is the primary constraint for energy-constrained systems such as personal mobile phones, laptop computers, and PDAs. Significant reductions in power consumption are possible with techniques ranging from low-power CMOS circuit design to power management software tools. Initial power management efforts focused on putting the system in a low-power/low-performance sleep mode when it was idle. With the advent of the Advanced Configuration and Power Interface (ACPI) standard, such power management has been successfully employed in real-life systems. However, the efficacy of such approaches depends on efficient ways to decide when and which device should be shut down and woken up [4, 5, 6]. Processor cores can operate at different voltage ranges to achieve different levels of energy efficiency. For instance, an ARM7D processor can run at 33 MHz and 5 V as well as at 20 MHz and 3.3 V. The energy-performance measures at these two operating modes are 185 MIPS/Watt and 579 MIPS/Watt, and the MIPS measures are 30.6 and 19.1, respectively [76]. Another example is Motorola's PowerPC 860 processor, which can be operated in a high-performance mode at 50 MHz with a supply voltage of 3.3 V, or in a low-power mode at 25 MHz with an internal voltage of 2.4 V [77]. The power consumption in the high-performance mode is 1.3 W, compared to 241 mW in the low-power mode. The basic concept of power reduction in variable-voltage processors is a technique called voltage-clock scaling in CMOS circuit technology. The power consumed by transitions in a digital circuit is equal to

$$P_{dynamic} = C_L \, N_{SW} \, V_{DD}^2 \, f \tag{2-1}$$
Because the power depends quadratically on the supply voltage ($V_{DD}$), a small reduction in voltage can produce a significant reduction in power consumption. However, lowering $V_{DD}$ increases the circuit delay, which is given by

$$t_d = \frac{k V_{DD}}{(V_{DD} - V_T)^2} \qquad (2\text{-}2)$$

where $V_T$ is the threshold voltage and $k$ is a constant. The longer execution time degrades application performance and makes it harder to meet the strict timing constraints of real-time systems. From equation (2-1), lowering the supply voltage reduces power consumption. However, it also slows down the logic. Based on these equations, voltage-scaling schemes have been introduced in many studies: by dynamically adjusting the clock speed, they allow a system to operate at a lower voltage level and save energy without missing task deadlines.

Clearly, if low energy consumption is a desirable feature in real-time systems, voltage-clock scaling must cooperate with the task scheduling algorithms, since the power-delay trade-off in low-power design affects the ability to meet the strict timing constraints of real-time systems. For instance, executing a high-priority task at a low voltage and slow clock rate may cause a low-priority task to miss its deadline because of the additional delay from the execution of the high-priority task.

In this dissertation, we extend resource reclaiming [78] to EDF scheduling algorithms. Energy saving is made possible by running tasks in low-voltage mode during reclaimed slack periods, which are released by tasks that do not consume their entire worst-case execution times. The algorithm proposed in this chapter makes slack time reclamation possible even if the task arrival instances or phases are not known a priori. We also propose a new dynamic voltage assignment that can result in much greater energy savings. We then discuss and demonstrate how the voltage settings should be chosen so that the dynamic algorithm and slack period reclamation are most effective.

The chapter is organized as follows.
In Section 2.2, we outline the preliminary system model. In Section 2.3, we describe the detailed algorithms for voltage assignment and slack period reclamation. To illustrate the effectiveness of the proposed algorithms, we evaluate their energy-saving performance in Section 2.4. In Section 2.5, we show how to obtain the optimal voltage settings. Finally, a short conclusion is provided in Section 2.6.

2.2 System Model

In real-time systems, tasks may arrive periodically or sporadically and have individual deadlines. For Earliest Deadline First (EDF) scheduling, a task is modeled as a cyclic processor activity characterized by two parameters, $T_i$ and $C_i$, where $T_i$ is the minimum inter-arrival time between two consecutive instances and $C_i$ is the worst-case execution time of task $\tau_i$. The EDF scheduling algorithm always executes the earliest-deadline task awaiting service. To apply voltage-clock scaling under EDF scheduling, we make the following assumptions:

A1. Voltage switching consumes negligible overhead. This is analogous to the assumption, made in classical real-time scheduling theory, that preemption costs are negligible. Voltage switching typically takes a few microseconds. In fact, a bound on the total overhead can be calculated from the number of task arrivals and departures, since voltage switches are only done at task-dispatch moments.

A2. Tasks are independent: no task depends on the output of any other task.

A3. The worst-case execution demand of each task $\tau_i$, $C_i$, is known. The actual execution demand is not known a priori and may vary from one arrival instance to another.

A4. The overhead of the scheduling algorithm is negligible compared to the execution time of the application.

A5. The system operates at two different voltage levels. Ideally, a variable-voltage processor that has continuous voltage and clock settings in the operational range is available.

The first four assumptions are analogous to assumptions made in real-time scheduling theory.
As for A5, we assume a simple arrangement in which the processor of a real-time system can be dynamically configured in one of two modes: low-voltage (L) mode and high-voltage (H) mode. In L-mode, the processor is supplied with a low voltage ($V_L$) and runs at a slow clock rate. Thus, task execution may be prolonged, but the processor consumes less energy. On the other hand, the processor can be set in H-mode, i.e., supplied with a high voltage ($V_H$) and run at a fast clock rate, in order to complete tasks sooner at the expense of more energy consumption. The processing speeds in L-mode and H-mode are denoted $\alpha_L$ and $\alpha_H$, respectively, in terms of some unit of computational work. Depending on the voltage setting for task $\tau_i$, the worst-case execution time (WCET) is $C_i/\alpha_L$ or $C_i/\alpha_H$.

Assume that the system has $n$ tasks, $\tau_1, \tau_2, \ldots, \tau_n$, numbered in decreasing priority order. That is, $D_1 \le D_2 \le \cdots \le D_n$, where $D_i$ is the deadline of task $\tau_i$. Thus, the total utilization demand of all tasks is $\rho = \sum_{i=1}^{n} C_i / T_i$. Also, upon each invocation, task $\tau_i$ must be completed before its deadline period $D_i$. This is the real-time requirement.

To minimize energy consumption in real-time systems, the voltage-clock scaling problem arises when $\alpha_L < \rho \le \alpha_H$. The task set is schedulable if the processor runs entirely in H-mode, and misses at least one deadline if it runs entirely in L-mode. Keeping the processor entirely in H-mode to meet the deadlines of all tasks results in extra energy consumption. Therefore, an algorithm is needed to determine the optimal voltage settings and to minimize H-mode execution, while guaranteeing that no deadline is missed.

2.3 Voltage-Clock Scaling under EDF Scheduling

The VCS-EDF scheduling is composed of a mode assignment phase and a resource reclaiming phase. Assuming that tasks run to their WCETs under EDF scheduling, mode assignment picks the voltage setting for each task, i.e.,
H- or L-mode, that will minimize the total energy consumed and ensure schedulability under EDF. The reclaiming phase dynamically switches voltage settings, taking into account the resources released by tasks that complete their work ahead of their WCET. The difference between the WCET of a task and the execution time it actually consumes is called a slack period.

The optimization problem can be stated as follows: pick $H$ and $L$ such that $H \cup L = \{\tau_1, \tau_2, \ldots, \tau_n\}$, $H \cap L = \emptyset$, and $\sum_{\tau_i \in H} C_i / T_i$ is minimized, subject to the well-known sufficient condition (see footnote 1) for the schedulability of periodic tasks under EDF, i.e.,

$$\frac{1}{\alpha_H} \sum_{\tau_i \in H} \frac{C_i}{\min(T_i, D_i)} + \frac{1}{\alpha_L} \sum_{\tau_i \in L} \frac{C_i}{\min(T_i, D_i)} \le 1. \qquad (2\text{-}3)$$

This optimization problem can be treated equivalently as the decision problem of the subset sum, which is NP-complete. Consequently, efficient search techniques, e.g., branch-and-bound algorithms, should be employed to find a solution if $n$ is large.

For dynamic mode assignment, the subsets $H$ and $L$ are determined for each busy cycle, i.e., each interval during which the scheduler is continuously busy without any idle intervals. We assume that $H = \{\tau_1, \tau_2, \ldots, \tau_n\}$ and $L = \emptyset$ at the beginning of a busy cycle. At the first arrival instance of task $\tau_j$, the subsets can be modified to $H - \{\tau_j\}$ and $L \cup \{\tau_j\}$, respectively, if

$$\frac{1}{\alpha_H} \sum_{\tau_i \in H - \{\tau_j\}} \frac{C_i}{\min(T_i, D_i)} + \frac{1}{\alpha_L} \sum_{\tau_i \in L \cup \{\tau_j\}} \frac{C_i}{\min(T_i, D_i)} \le 1. \qquad (2\text{-}4)$$

In other words, tasks are assigned to the low-voltage mode subject to the schedulability constraint, and the assignment is done in the order of the tasks' first arrival instances during each busy cycle. While the dynamic assignment does not aim to minimize the execution period in H-mode, it takes advantage of short busy cycles in which no H-mode execution needs to be invoked.

Given the voltage assignments defined by $H$ and $L$, tasks can be dispatched and run in the assigned mode. The sufficient condition guarantees that, if every task takes up to its WCET, there is enough system capacity for each task to be completed before its deadline.
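The schedulability test (2-3) and the two assignment rules can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: tasks are represented as hypothetical `(C, T, D)` tuples, the static assignment is found by exhaustive search (exponential in the number of tasks, standing in for the branch-and-bound search mentioned in the text), and the dynamic rule applies condition (2-4) in a given order of first arrivals. All function names and the example task set are assumptions.

```python
from itertools import combinations


def edf_load(tasks, speed):
    """Sum of C_i / min(T_i, D_i) over the given tasks, divided by the mode speed."""
    return sum(c / min(t, d) for (c, t, d) in tasks) / speed


def schedulable(high, low, alpha_h, alpha_l):
    """Sufficient EDF condition (2-3), with a small tolerance for rounding."""
    return edf_load(high, alpha_h) + edf_load(low, alpha_l) <= 1.0 + 1e-12


def static_assignment(tasks, alpha_h, alpha_l):
    """Exhaustive search for the feasible H set that minimises sum of C_i/T_i over H.

    Returns (cost, set of task indices in H), or None if no split is feasible.
    """
    n = len(tasks)
    best = None
    for k in range(n + 1):
        for idx in combinations(range(n), k):
            high = [tasks[i] for i in idx]
            low = [tasks[i] for i in range(n) if i not in idx]
            if schedulable(high, low, alpha_h, alpha_l):
                cost = sum(c / t for (c, t, _) in high)
                if best is None or cost < best[0]:
                    best = (cost, set(idx))
    return best


def dynamic_assignment(tasks, arrival_order, alpha_h, alpha_l):
    """Greedy rule (2-4): all tasks start in H; at its first arrival a task
    moves to L if the condition still holds after the move."""
    high, low = set(range(len(tasks))), set()
    for j in arrival_order:
        trial_high = [tasks[i] for i in sorted(high - {j})]
        trial_low = [tasks[i] for i in sorted(low | {j})]
        if schedulable(trial_high, trial_low, alpha_h, alpha_l):
            high.discard(j)
            low.add(j)
    return high, low


# Hypothetical task set (C, T, D) with rho = 0.3 + 0.3 + 0.4 + 0.5 = 1.5.
tasks = [(30, 100, 100), (60, 200, 200), (120, 300, 300), (250, 500, 500)]
static_cost, static_high = static_assignment(tasks, alpha_h=2.0, alpha_l=1.0)
dyn_high, dyn_low = dynamic_assignment(tasks, [0, 1, 2, 3], alpha_h=2.0, alpha_l=1.0)
print(static_high, dyn_high, dyn_low)
```

For this task set the exhaustive search keeps tasks 0, 1, and 2 in H, whereas the greedy rule moves only the first-arriving task 0 to L. This matches the remark above that the dynamic assignment is not aimed at minimizing H-mode execution over a long busy cycle.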
When a task consumes less execution time than its WCET, there is a slack period that indicates the remaining budget no longer needed before the task's deadline. Since the slack period no longer exists after the task's deadline, we call this deadline the expiration time of the slack period. Thus, a waiting task that has a deadline greater than the expiration time of a slack period can use up this slack period and run in low-voltage mode. This reclamation algorithm is given in Figure 2-1, where the following definitions are used:

STQ: the slack-time queue, which tracks the slack periods released by tasks that do not use up their whole WCETs. Each element in the STQ indicates the amount of slack period of a completed task. A slack period has an expiration time, which is the deadline of that task, and can be consumed by any other task that has a deadline later than the expiration time.

TKQ: the normal EDF task queue for all tasks in the system.

$ct_i$: the computing time that task $\tau_i$ has consumed during its scheduled period of the EDF schedule. It does not include the running time during any available slack periods of other tasks.

$st_i$: the slack time of task $\tau_i$. The initial slack time is equal to $C_i/\alpha - ct_i$ at the completion instance of $\tau_i$, where $\alpha$ is the speed factor of the assigned voltage mode.

enQueue(element, Q): insert an element into queue Q.

deQueue(Q): delete the first element from queue Q.

head(Q): the element at the head of queue Q.

SLdecrement: the flag indicating that the slack period at the head of STQ must be decremented as time progresses. The flag is on when a task is using up the slack period or when the system is idle.

Footnote 1: The condition is also necessary if $D_i \ge T_i$ for all $i$.

To track the slack periods of all tasks and to combine online EDF scheduling with voltage-clock scaling, we employ two queues, TKQ and STQ, in which tasks and slack periods are ordered according to their deadlines and expiration times, respectively.
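The slack-time queue described above can be sketched as a priority queue keyed by expiration time. This is only an illustration of the bookkeeping, under stated assumptions: the class and method names are hypothetical, the eligibility test follows the dispatch rule of Figure 2-1 (the task's deadline must be later than the slack's expiration time), and the sketch deliberately omits decrementing the head slack as time passes and removing exhausted slack.

```python
import heapq


class SlackQueue:
    """STQ sketch: slack periods ordered by expiration time, earliest first."""

    def __init__(self):
        self._heap = []  # entries are (expiration_time, slack_amount)

    def add(self, demand, speed, consumed, deadline):
        """Record the slack of a completed task: st = C/alpha - ct.

        The slack is enqueued only if it is positive; its expiration time
        is the deadline of the task that released it.
        """
        slack = demand / speed - consumed
        if slack > 0:
            heapq.heappush(self._heap, (deadline, slack))
        return max(slack, 0.0)

    def reclaimable(self, task_deadline):
        """Amount of head slack a task with this deadline may run in (L-mode).

        Returns 0.0 if the queue is empty or the task is not eligible.
        """
        if self._heap and task_deadline > self._heap[0][0]:
            return self._heap[0][1]
        return 0.0


# Hypothetical numbers: an H-mode task with demand 4 at speed 2 (budget 2.0)
# completes after consuming 1.5 time units; its deadline is t = 10.
stq = SlackQueue()
released = stq.add(demand=4.0, speed=2.0, consumed=1.5, deadline=10.0)
print(released, stq.reclaimable(12.0), stq.reclaimable(8.0))
```

A task with deadline 12 may run in the 0.5 units of released slack, while a task with deadline 8 may not, because its deadline precedes the slack's expiration time.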
A task's slack period is computed when it completes its actual computation. The slack period is inserted into STQ if it is greater than zero. At the dispatch instance of task $\tau_i$, the task is executed at low voltage if there is a slack period that can be reclaimed (i.e., the expiration time of the slack period is less than the task's deadline). Otherwise, it is executed in its assigned voltage mode.

Algorithm Slack-time Reclamation:

    at the arrival instance av_i of task τ_i {
        if (TKQ != empty) adjust_ct_st;
        ct_i ← 0;
        set task deadline to av_i + D_i;
        enQueue(τ_i, TKQ);                  // enqueue the arriving task
        dispatch(TKQ, STQ);
    }

    at the completion instance of a task {
        adjust_ct_st;
        τ_i = deQueue(TKQ);
        if (τ_i ∈ H) st_i ← C_i/α_H − ct_i; // set slack period
        else st_i ← C_i/α_L − ct_i;
        set st_i's expiration time to the deadline of task τ_i;
        if (st_i > 0) enQueue(st_i, STQ);
        dispatch(TKQ, STQ);
    }

    at the end of a slack period {
        st_k ← deQueue(STQ);
        dispatch(TKQ, STQ);
    }

    Procedure adjust_ct_st                  // adjust accumulated execution time
                                            // and available slack periods
        τ_i = head(TKQ);
        exec_time = the execution time since the last dispatch instance;
        if (SLdecrement = off) ct_i ← ct_i + exec_time;
        else { st_k = head(STQ); st_k = st_k − exec_time; }

    Procedure dispatch(TKQ, STQ)            // dispatch a task and a slack period,
                                            // assign execution mode
        if (TKQ != empty) {
            τ_i = head(TKQ);
            SLdecrement = off;              // assume no slack time
            if (STQ != empty) {
                st_k = head(STQ);
                if (deadline_i > expiration_k) {
                    SLdecrement = on;       // reclaim
                    set voltage mode to L;  // L mode
                    set the end of slack period to (current instance + st_k);
                }
            }
            if (SLdecrement = off) {        // no reclamation,
                                            // run the task with the assigned voltage
                if (τ_i ∈ H) set voltage mode to H;
                else set voltage mode to L;
            }
        }
        else {                              // the system is idle,
            st_k = head(STQ);               // slack period must be decremented
            SLdecrement = on;
            set the end of slack period to (current instance + st_k);
        }

Figure 2-1.
Online reclamation algorithm for voltage-clock scaling

When a slack period is reclaimed by a task or when the system is idle, the slack period at the head of STQ is decremented as time progresses and is removed from STQ once it is exhausted. On the other hand, a running task accumulates its computation time $ct_i$ only while it is not consuming a slack period. While it reclaims slack, its execution uses up other tasks' budgets, so when the task is done, an additional slack period can accrue from both its reclamation and a possible early completion.

2.4 Simulation Evaluation of VCS-EDF Algorithm

As described in Section 2.3, the static and dynamic voltage assignments guarantee that all tasks meet their deadlines. Depending on task characteristics, such as periods, WCETs, and arrival phases, the amount of time that the processor runs in H-mode may vary considerably. In this section, we evaluate the average energy consumption of the two assignments based on random task parameters in a system of 10 real-time tasks. For each test case, we measure the percentages of time that the processor is in H-mode, is in L-mode, and is idle. In addition to the static and dynamic voltage assignments with slack period reclamation, we also consider a fixed voltage assignment, which follows the static assignment but performs no slack period reclamation. Note that the fixed assignment minimizes the energy consumption when all tasks invoke their worst-case execution demands.

In our simulation, we first generate 10 random task periods in the range of 100 to 1000, and set the task deadlines equal to their periods. We assume that $V_H$ and $V_L$ are fixed and that the execution speeds are $\alpha_L = 1.0$ and $\alpha_H = 2.0$. Here $\alpha_L = 1.0$ represents the normalized processing speed, so the execution speed in H-mode, $\alpha_H = 2.0$, is double the speed of L-mode. The worst-case execution demands of the tasks are chosen in the range of $\alpha_L$ to $\alpha_H$.
The real task execution demand is then selected randomly such that the mean is in the range of 0.6 to 1.0 of the worst-case execution demand.

(a) Percentages of the time in H-mode for fixed and static voltage assignments
(b) Percentages of the time in H-mode for fixed and dynamic voltage assignments
Figure 2-2. The percentage of execution time in H-mode (horizontal axis: utilization in L-mode, 1.1 to 2.0)

In Figure 2-2(a) and (b), we show the evaluation results for execution demands ranging from 1 to 2. The dashed curves are the percentage measures for the fixed voltage assignment, whereas the solid curves are for the static and dynamic voltage assignments. With the fixed mode assignment, the measured percentage in H-mode is approximately proportional to the average of the actual workload.

The curves in Figure 2-2(a) and (b) show some interesting trends. The difference in the measured percentage of H-mode execution between the fixed-mode and static-mode assignments increases with the workload. This indicates the effectiveness of slack period reclamation in a heavily loaded system. The variance of task execution time is also an important factor, as an increased number of slack periods can be useful for subsequent task executions. Furthermore, the dynamic voltage assignment is most effective when the workload is moderate (neither very low nor very high). This is due to two facts. First, the probability of finding short busy cycles diminishes as the workload increases. Second, when the workload is low, most tasks are already started in L-mode under the fixed assignment, so the voltage assignment made by the dynamic approach at the beginning of each busy cycle is not substantially different from the one made by the fixed approach.
We study the improvement of the proposed VCS-EDF scheduling in Figure 2-3, which depicts the percentage reductions of the H-mode execution times over the fixed assignment approach. For instance, with $\alpha_L = 1.0$, $\alpha_H = 2.0$, $\rho = 1.5$, and an average execution time of 0.7 of $C_i$, the percentages of the time in H-mode under the fixed, static, and dynamic mode assignments are 35.04%, 26.08%, and 20.28%, respectively. These percentages indicate that, compared to the fixed approach, the reductions of the time in H-mode reach $(0.35 - 0.26)/0.35 = 0.26$ (26%) and $(0.35 - 0.20)/0.35 = 0.43$ (43%) for the static and dynamic approaches, respectively. The curves in the figure show that dynamic voltage scaling can almost completely eliminate H-mode execution when the workload is low and the real task execution time is much smaller than the WCET. This significant result is due to (1) no pre-assignment of H-mode execution, and (2) effective slack time reclamation. As the workload increases, the percentage reductions shrink because H-mode becomes necessary to meet the deadline requirements, even though the reduction of H-mode execution remains substantial, as shown in Figure 2-2.

Figure 2-3 also reveals the difference between the static and dynamic approaches. As long as the workload is not high, there will be many short busy cycles and few long ones. The dynamic approach of making a greedy decision at the beginning of each busy cycle is justified given that there is no knowledge of subsequent task arrivals.

Figure 2-3. The percentage of the reduction for the time in H-mode (horizontal axis: utilization in L-mode, 1.0 to 2.0; dotted lines: dynamic mode; solid lines: static mode; one curve per mean of 0.6, 0.7, 0.8, and 0.9)

Using the percentages of the time spent in H-mode, in L-mode, and in the idle state, we can measure the savings in power consumption.
Assume that $V_T = 0$ and that there is no power consumption in the idle state (in practice, idle power consumption is extremely low, which justifies this assumption). The ratio of the average power consumption under the static approach to that under the fixed approach is

$$\frac{P_{CMOS}(\text{static})}{P_{CMOS}(\text{fixed})} = \frac{PT_H(\text{static})\,V_H^2 + PT_L(\text{static})\,V_L^2}{PT_H(\text{fixed})\,V_H^2 + PT_L(\text{fixed})\,V_L^2} = \frac{PT_H(\text{static}) + PT_L(\text{static})\,(\alpha_L/\alpha_H)^2}{PT_H(\text{fixed}) + PT_L(\text{fixed})\,(\alpha_L/\alpha_H)^2} \qquad (2\text{-}5)$$

where $PT_x(y)$ is the percentage of execution time in $x$-mode under approach $y$. Similarly, we can compute the ratio $P_{CMOS}(\text{dynamic})/P_{CMOS}(\text{fixed})$. The ratios are plotted in Figure 2-4 for various workloads. In the extreme case, more than 25% of the energy can be saved by the dynamic approach over the fixed assignment. Note that the fixed assignment can outperform any global voltage setting, as it uses the optimal voltage assignment for each task, but it does not perform any online adjustment.

Figure 2-4. The relative energy consumption of fixed, static, and dynamic approaches (horizontal axis: utilization in L-mode, 1.0 to 2.0; vertical axis: 0.70 to 1.00; dashed lines: Static/Fixed; solid lines: Dynamic/Fixed; one curve per mean of 0.6, 0.7, 0.8, and 0.9)

2.5 Optimal Voltage Setting in VCS-EDF Scheduling

In the simulation experiments of Section 2.4, we assumed fixed $\alpha_L$ and $\alpha_H$ (i.e., fixed $V_L$ and $V_H$). Here, we address the problem of determining the optimal voltage settings for the characteristics of a given task set. If the task execution demands are constant and every instance of task $\tau_i$ requires a fixed demand of $C_i$, then $V_H$ should be set to $V_H^*$, which leads to an execution speed $\alpha_H^*$ with $\rho/\alpha_H^* = 1$. The processor then has a full utilization of 1. There is no need for a separate $V_L$ setting, and any $V_H$ less than $V_H^*$ can result in task deadline violations. Also, any $V_H$ higher than $V_H^*$ causes additional power consumption, as the power consumption is a convex and increasing function of the supply voltage.
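The choice of $V_H^*$ can be illustrated numerically. The sketch below is an assumption-laden example rather than the author's procedure: it takes the processing speed to be proportional to $1/t_d$ from equation (2-2), read as $t_d = k V_{DD}/(V_{DD} - V_T)^2$, normalizes the speed to a reference voltage, and uses bisection to find the voltage whose speed satisfies $\rho/\alpha_H^* = 1$. The reference point (speed 2.0 at 3.3 V), the utilization $\rho = 1.5$, and the function names are all hypothetical.

```python
def raw_speed(v, v_t):
    """Speed up to a constant factor: 1/t_d = (V - V_T)^2 / (k * V)."""
    return (v - v_t) ** 2 / v


def relative_speed(v, v_ref, v_t):
    """Speed at voltage v as a fraction of the speed at v_ref."""
    return raw_speed(v, v_t) / raw_speed(v_ref, v_t)


def voltage_for_speed(target, v_ref, v_t, tol=1e-9):
    """Bisection for the voltage in (v_t, v_ref] whose relative speed is target.

    Valid for 0 < target <= 1; relative_speed is increasing in v for v > v_t.
    """
    lo, hi = v_t, v_ref
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if relative_speed(mid, v_ref, v_t) < target:
            lo = mid
        else:
            hi = mid
    return hi


# Hypothetical setting: speed 2.0 is reached at 3.3 V and rho = 1.5,
# so alpha_H* = 1.5 is a fraction 0.75 of the reference speed.
rho, alpha_ref, v_ref = 1.5, 2.0, 3.3
v_star_ideal = voltage_for_speed(rho / alpha_ref, v_ref, v_t=0.0)
v_star_real = voltage_for_speed(rho / alpha_ref, v_ref, v_t=0.7)
print(f"V_H* with V_T = 0:   {v_star_ideal:.3f} V")
print(f"V_H* with V_T = 0.7: {v_star_real:.3f} V")
```

With $V_T = 0$ the speed is linear in the voltage, so the result is simply $0.75 \times 3.3 = 2.475$ V; a non-zero threshold voltage pushes the required voltage higher for the same speed.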
When the real task execution demands are random and bounded by the worst-case demand, we still need to set $V_H = V_H^*$ to guarantee schedulability in the worst-case scenario. On the other hand, any setting of $V_L$ is feasible under VCS-EDF scheduling, and different settings lead to different execution times in H-mode and L-mode as well as different average power consumption. Figure 2-5 gives an example of varying $V_L$ ($\alpha_L$) against execution time and power consumption. We first adjust $V_H$ such that the processor is fully utilized under the worst-case scenario. With the actual execution times, we can utilize the slack periods and run the tasks in L-mode. The execution sequences for different values of $\alpha_L$ are shown in Figure 2-5(a). With the assumptions that $V_T = 0$ and that no power is consumed in the idle state, we can compute the relative energy consumption for a special case where the real workload is 66.7% of the worst-case execution demand. With a slow execution speed and low power consumption in L-mode, the processor must stay in H-mode for a substantial amount of time. Thus, there is only a marginal energy saving. On the other hand, if we set a high $\alpha_L$ and spend less time in H-mode, the total energy saving is again diminished, as a high $V_L$ must then be applied. As illustrated in Figure 2-5(b), there exists an optimal setting of $V_L$, which balances the invocation of H-mode execution against the power consumption in L-mode.

WCET in H-mode (time axis 0 to 12): $\tau_1$ 4, $\tau_2$ 3, $\tau_3$ 5
Actual work: $\tau_1$ 2 (H), $\tau_2$ 2 (H), $\tau_3$ 4 (H)

    α_L = 0:    τ1 2 (H), idle (L), τ2 2 (H), idle (L), τ3 4 (H), idle
    α_L = 0.5:  τ1 2 (H), τ2 2 (L), τ2 1.5 (H), τ3 1.5 (L), τ3 3.625 (H)
    α_L = 1.0:  τ1 2 (H), τ2 2 (L), τ2 1 (H), τ3 2 (L), τ3 3 (H)
    α_L = 1.5:  τ1 2 (H), τ2 2 (L), τ2 0.5 (H), τ3 2.5 (L), τ3 2.125 (H)
    α_L = 2.0:  τ1 2 (H), τ2 2 (L), τ3 3 (L), τ3 1 (H)

(a) Task executions according to the speeds of two modes
(b) Relative energy consumption for different $\alpha_L$ (horizontal axis: WCET and $\alpha_L$ = 0, 0.5, 1.0, 1.5, 2.0; vertical axis: 0 to 100)
Figure 2-5.
An example of finding the optimal $\alpha_L$

In Figure 2-6, we present simulation results for the optimal voltage settings using the DC characteristics of Motorola's MPC860 processor. At $V_H = 3.3$ V, the processor runs at an operating frequency of 50 MHz and has a power consumption of 1.3 Watts. In addition, we assume that the processing speed is proportional to the operating frequency and that $V_T = 0.7$ V, which is the typical threshold voltage in most CMOS circuits with 0.25-0.35 µm technology. In our experiments, the worst-case execution demands of 10 tasks are chosen such that full utilization is achieved in H-mode. Then, the supply voltage in L-mode, $V_L$, is varied from 0.7 to 3.3 V, and the real task execution demands are generated as random numbers such that the mean demand is a given proportion of the worst-case demand. Using the simulation results of the dynamic VCS-EDF scheduling, i.e., the percentages of time in H-mode and L-mode execution, we can compute the average energy consumption for various $V_L$ from equations (2-1), (2-2), and (2-3). The curves in Figure 2-6 indicate the optimal $V_L$ and the possible power savings when a good candidate for $V_L$ is chosen.

Figure 2-6. The optimal L-mode voltages $V_L$ ($V_H$ = 3.3 V, 1.3 Watts, and 50 MHz) (horizontal axis: L-mode voltage, 0.5 to 3.5; one curve per mean of 0.6, 0.7, 0.8, and 0.9; marked optima: $V_{opt}$ = 2.252, 2.214, 2.176, and 2.087)

2.6 Conclusion

In this chapter, we investigated the voltage-clock scaling problem in real-time embedded systems. Under EDF scheduling, we proposed two scaling methods to reduce power consumption. Both methods utilize L-mode execution during task slack periods that come from the discrepancy between the worst-case and the real computation times.
The static approach assigns a fixed voltage setting to each task in the offline phase such that the average execution time in H-mode is minimized and the schedulability condition is satisfied. At the online (run-time) stage, the tasks are dispatched according to the EDF algorithm and run at the assigned voltages except during the slack periods. The other method, a dynamic approach, performs voltage assignment at the beginning of each busy cycle. Subject to the schedulability constraint, it assigns tasks to L-mode execution when the first instance of each task arrives in the busy cycle. Our simulation results demonstrate that the two proposed approaches are quite effective compared to a fixed optimal task-based voltage assignment that does not perform any online adjustment. The dynamic approach can outperform the static one if no H-mode execution is invoked during short busy cycles. Finally, we discussed the selection of optimal voltage settings for given workloads.

CHAPTER 3
CONSTRAINED ENERGY BUDGET ALLOCATION

3.1 Introduction

Recently, the growth of mobile computing and communication devices such as laptop computers, cellular phones, and personal digital assistants (PDAs) has been explosive, and the demands for embedded applications on those devices have become more sophisticated and intelligent along with advances in microprocessor technology. With the accelerated miniaturization, portability, and complex functionality of handheld devices (e.g., multimedia in portable phones), batteries that are smaller but have a longer lifetime are indispensable. Processors are becoming increasingly power-hungry. Still, there is a big gap between the pace of current battery technology and the energy demands of microprocessor systems [79]. For this reason, the field of power-aware computing has gained increasing attention over the past decade.
Simple techniques, such as turning off (or dimming) the screen while a system is idle and shutting down hard disks while they are not being accessed, are now commonly adopted in most portable device designs. However, in many cases, reactivation of hardware can take some time and affect response time. Also, deciding when and which device should be shut down and woken up is often far from trivial.

Another effective approach to power reduction is a technique called Voltage-Clock Scaling (VCS) or Dynamic Voltage Scaling in CMOS circuit technology. Because the power depends quadratically on the supply voltage ($V_{DD}$), a small reduction in voltage can produce a significant reduction in power consumption [24, 25]. However, lowering $V_{DD}$ increases the circuit delay and therefore the execution time, which degrades application response time and can cause real-time deadlines to be missed.

If low energy consumption is a desirable feature in real-time embedded systems, voltage-clock scaling must cooperate with the task scheduling algorithms, since the power-delay trade-off in low-power design affects the ability to meet the strict timing constraints of real-time systems. Executing a high-priority task at a low voltage and slow clock rate may cause a low-priority task to miss its deadline because of the additional delay from the execution of the high-priority task.

While VCS has been a well-populated research area, power-aware system design has generally focused on minimizing total power consumption. For systems consisting of soft aperiodic tasks, the objective of minimizing power consumption results in slow execution. On the other hand, in many cases, the battery capacity can be replenished or there is a finite mission lifetime. Minimizing power consumption in a way that does not utilize all the available energy may not lead to optimal system performance.
A better power control strategy in such cases is to minimize the response times of soft real-time tasks, provided that the deadlines of hard real-time tasks are met and the average power consumption is bounded.

In this dissertation, we target battery-driven real-time systems that jointly schedule hard periodic tasks and soft aperiodic tasks, and whose battery capacity is bounded within the feasible range given by a set of tasks. The scheduling should guarantee that the deadlines of hard real-time periodic tasks are met and should achieve an average response time for aperiodic tasks that is as low as possible. Under the constraint of a bounded energy budget, finding an optimal schedule for a task set should aim to satisfy both optimal power consumption and strict timing constraints simultaneously.

We first investigate the characteristics of the energy demands of periodic and aperiodic tasks, focusing on EDF scheduling that exploits VCS. Based on the energy requirements of mixed real-time tasks, we also propose a static scheduling scheme for energy budget allocation, which determines the optimal two-level voltage settings of all tasks under bounded energy consumption, while guaranteeing that no deadline of any periodic task is missed and that the average response time of aperiodic tasks is minimized. The algorithm selects the voltage settings that have the minimum average response time among the schedulable ones within a given energy consumption. To schedule aperiodic tasks, we adopt the Total Bandwidth Server, which was proposed by Spuri and Buttazzo and which handles aperiodic tasks like periodic tasks within the reserved bandwidth, such that it outperforms other mechanisms in responsiveness.

The rest of this chapter is structured as follows. The main concepts for designing a constrained energy budget allocation model are explained in Section 3.2. In Section 3.3, we outline the preliminary model with several assumptions.
Then, we discuss the profiling of energy consumption and processor utilization under a bounded energy budget in Section 3.4. Considering the characteristics described in Section 3.4, energy allocation methods and a voltage assignment algorithm are described in Section 3.5. To illustrate the effectiveness of the proposed algorithm, we evaluate its performance in Section 3.6. A short conclusion then follows in Section 3.7.

3.2 Design Concepts

As a key implementation vehicle for real-time systems with mixed tasks under bounded energy consumption, the energy budget allocation model has the following design concepts at its core:

Scheduling hard and soft real-time tasks with power-awareness. The integration of hard and soft real-time tasks, and multi-task soft real-time applications such as multimedia applications, are currently attracting much attention.

Sharing a bounded energy budget. On many occasions, the battery capacity can be replenished or there is a targeted mission lifetime for a system that is confined by its batteries. In such cases, reducing power consumption cannot be the only objective of task scheduling. If batteries will be charged at a predictable instant, the tasks should be scheduled to make the best use of the available energy and achieve the best performance they can. And if the application of the system is composed of mixed real-time tasks, the bounded energy budget should be shared in a way that satisfies the requirements of both sets of tasks.

Switching voltage levels by Voltage-Clock Scaling. Based on the VCS technique, most of today's processor cores have been designed to operate at different voltage ranges to achieve different levels of energy efficiency. In this dissertation, we assume simple voltage settings in which the processor in a real-time system can be dynamically configured in one of two modes: low-power-low-frequency mode and high-power-high-frequency mode.

Profiling energy and bandwidth usages of tasks.
To allocate a limited energy budget to tasks effectively, such that the budget is fully utilized for better performance, the energy usage should be outlined at the level of each task in a system. In this dissertation, we have devised the VCS-EDF scheduling to profile energy demands. Because the processor operates at different energy-performance levels, the energy usage is profiled according to the execution requirements of the real-time tasks. In addition to the energy usage, the processor utilization must also be profiled to guarantee the schedulability of tasks. Due to the power-delay trade-off in voltage-clock scaling, the processor requires a prolonged execution time in low-power-low-frequency mode compared to the one in high-power-high-frequency mode. The variation of a given execution time with the choice of operating modes (frequencies) therefore affects processor utilization and the schedulability of EDF scheduling.

Guaranteeing no missed deadline for hard periodic tasks. In real-time systems that have mixed tasks, aperiodic tasks are usually served in the background with respect to hard periodic tasks in order not to jeopardize the schedulability of hard tasks, and are served by an aperiodic server to improve responsiveness. In this dissertation, the schedulability of hard periodic tasks is guaranteed by the isolation of bandwidth between periodic and aperiodic tasks.

Ensuring fast responsiveness for aperiodic tasks under a given energy budget. If systems consist of only soft aperiodic tasks, slow execution of these tasks to minimize power consumption may result in unrestrained response times. On the contrary, if the objective of minimizing the response time of aperiodic tasks is given for scheduling mixed tasks under a limited energy budget, an appropriate amount of energy should be allocated to aperiodic tasks to ensure that they respond as fast as they can within the energy budget available to them.
Due to the inverse relationship between energy consumption and utilization, the responsiveness can also be affected by the share of the total energy budget given to periodic tasks. An increased energy allocation to periodic tasks leaves aperiodic tasks a smaller energy budget, and this causes a delayed response time in return for the increased utilization available to aperiodic tasks.

Expanding the life span of a battery-driven real-time embedded system. Since reduced power consumption promises a longer system lifetime, an effective scheduling scheme that consumes much less power is also preferred for scheduling mixed tasks under a power consumption constraint.

3.3 System Model for Energy Sharing

In the targeted real-time systems, tasks may arrive periodically and have individual deadlines that must be met. Or they can be aperiodic and accrue computation values that are inversely proportional to their response times. Under a given bound on energy consumption, we build a system model and make several assumptions as follows.

3.3.1 Schedule for Periodic Tasks

For Earliest Deadline First (EDF) scheduling, a periodic task is modeled as a cyclic processor activity characterized by two parameters, $T_i$ and $C_i$, where $T_i$ is the minimum inter-arrival time between two consecutive instances and $C_i$ is the worst-case execution time (WCET) of task $\tau_i$. The EDF scheduling algorithm always serves the task that has the earliest deadline among the waiting tasks. The following assumptions are analogous to assumptions made in real-time scheduling theory.

Tasks are independent: no task depends on the output of any other task.

The worst-case execution demand of each task $\tau_i$, i.e., $C_i$, is known. The actual execution demand is not known a priori and may vary from one arrival instance to another.

The overhead of the scheduling algorithm is negligible when compared to the execution time of the application.
3.3.2 Schedule for Aperiodic Tasks

An infinite number of soft aperiodic tasks $\{J_i,\ i = 0, 1, 2, \ldots\}$ are modeled as sporadic processor activities represented by two parameters, $\lambda$ and $\mu$, where $\lambda$ is the average inter-arrival time between two consecutive aperiodic instances and $\mu$ is the average worst-case execution time of task $J_i$. Aperiodic tasks are scheduled by the total bandwidth server (TBS) algorithm, which makes a fictitious but feasible deadline assignment based on the available processor utilization guaranteed by the isolation of bandwidth between periodic and aperiodic tasks. In the TBS algorithm, when the $k$-th aperiodic request arrives at time $t = r_k$, it is assigned the task deadline

\[ d_k = \max(r_k, d_{k-1}) + \frac{C_k}{U_A} \tag{3-1} \]

where $C_k$ is the execution time of the request and $U_A$ is the processor utilization allocated to aperiodic tasks. By definition, $d_0 = 0$. The request is then inserted into the ready queue of the system and scheduled by the EDF algorithm, like any other periodic instance or aperiodic request already present in the system.

Figure 3-1 shows an example of deadline assignment by the Total Bandwidth Server. Given a utilization $U_A = 0.25$ reserved for aperiodic tasks, the deadlines of the instances $J_1$, $J_2$, and $J_3$, which request execution times of 1, 2, and 1, are assigned as $d_1 = 4$, $d_2 = 8$, $d_3 = 7$, respectively, by equation (3-1), and the instances are scheduled with the periodic tasks $\tau_1$ and $\tau_2$. Note that the deadlines are assigned such that, in each interval of time, the processor utilization of the aperiodic tasks is at most $U_A$. Hence, a set of periodic tasks with utilization factor $U_P = \sum_{i=1}^{n} C_i / T_i$ and a TBS with bandwidth $U_A$ is schedulable by EDF if and only if $U_A + U_P \le 1$. Compared to other scheduling algorithms for aperiodic tasks, the TBS algorithm is very simple to implement and shows very good performance in average response time [40].
[Figure 3-1. An example of the deadline assignment algorithm in the Total Bandwidth Server. Aperiodic tasks: $U_A = 0.25$; periodic tasks $\tau_1$ (6, 3) and $\tau_2$ (8, 2), $U_P = 0.75$.]

3.3.3 Voltage-Clock Scaling

3.3.3.1 Voltage Switching

We assume that voltage switching incurs negligible overhead. This is analogous to the assumption made in classical real-time scheduling theory that preemption costs are negligible. Voltage switching typically takes a few microseconds. In fact, a bound on the total overhead can be calculated from the number of task arrivals and departures, since voltage switches are only done at task-dispatch moments.

3.3.3.2 Two Voltage Levels

The system operates at two different voltage levels. Ideally, a variable-voltage processor with continuous voltage and clock settings over the operational range would be available, as explained in Table 1-1. We assume a simple arrangement in which the processor of a real-time system can be dynamically configured in one of two modes: low-voltage (L) mode and high-voltage (H) mode. In L-mode, the processor is supplied with a low voltage ($V_L$) and runs at a slow clock rate. Thus, task execution may be prolonged, but the processor consumes less energy. In H-mode, the processor is supplied with a high voltage ($V_H$) and runs at a fast clock rate, in order to complete tasks sooner at the expense of more energy consumption. The operating speeds in L-mode and H-mode are denoted as $\alpha_L$ and $\alpha_H$, respectively, in terms of some unit of computational work. Depending upon the voltage setting for task $\tau_i$, the worst-case execution time is $C_i/\alpha_L$ or $C_i/\alpha_H$.

3.3.4 Bounded Energy Consumption

In battery-powered embedded systems, controlling power consumption to extend the lifetime of the battery is often as important as enhancing system performance. Given that the battery can be replenished or the mission lifetime is limited, we may assume that the available capacity can safely be consumed during a predefined interval of operation.
Thus, an average power consumption rate, or energy budget, can be set to the ratio of the available capacity to the target operation interval. It is also possible to communicate with the battery so that the system and its scheduler know the current status of the battery capacity. One mechanism for doing this is the smart battery system (SBS), which is being actively standardized and introduced into battery-driven systems [79]. In this dissertation, we assume that battery-driven real-time systems have such a mechanism for obtaining information about the bounded energy budget. We also assume an embedded system in which the processor is the major consumer of energy.

3.4 Profiling Energy and Utilization Usages in a Real-time Embedded System

For all real-time tasks, the available energy consumption is confined to a given energy budget $E_C$, which has to be shared among periodic and aperiodic tasks. Let $E_P$ and $E_A$ be the energy budgets allocated to periodic tasks and aperiodic tasks, respectively. The voltage-clock scaling problem is to find voltage settings for both periodic and aperiodic tasks such that

all periodic tasks are completed before their deadlines and consume energy less than $E_P$;

all aperiodic tasks attain the minimal response times while consuming energy less than $E_A$;

$E_P + E_A \le E_C$.

3.4.1 Periodic Tasks

Assume that, for periodic task $\tau_i$, $m_i$ is the voltage setting chosen between the two possible modes, i.e., L-mode and H-mode, and $\alpha(m_i)$ is the speed of the task in mode $m_i$. Given $m_i$ for every periodic task $\tau_i$, the energy demand of the periodic tasks is

\[ E_P(m_i) = \sum_{i=1}^{n} p(m_i)\,\frac{1}{\alpha(m_i)}\,\frac{C_i}{T_i} \tag{3-2} \]

where $p(m_i)$ is the power consumption in mode $m_i$ and $C_i$ is the average execution time of task $\tau_i$. In addition, the worst-case utilization is given by

\[ U_P(m_i) = \sum_{i=1}^{n} \frac{1}{\alpha(m_i)}\,\frac{C_i}{T_i} \tag{3-3} \]

If $m_i$ is H-mode for all periodic tasks, the processor runs at a fast clock rate all the time, thereby minimizing the utilization.
The maximum energy demand for the tasks is represented as

\[ \max E_P = \frac{p_H}{\alpha_H} \sum_{i=1}^{n} \frac{C_i}{T_i} \tag{3-4} \]

and its utilization becomes

\[ \min U_P = \frac{1}{\alpha_H} \sum_{i=1}^{n} \frac{C_i}{T_i} \tag{3-5} \]

On the contrary, if $m_i$ is L-mode for every periodic task $\tau_i$, the processor runs at a slow clock rate all the time, so that the utilization is maximized but the energy consumed is the minimum possible. For the sake of schedulability, the tasks should be scheduled in such a way that the utilization does not exceed unity. Therefore, we define $\min E_P$ as the energy demand when there exists a set $\{m_i\}$ such that the worst-case utilization satisfies

\[ U_P(m_i) = \sum_{i=1}^{n} \frac{1}{\alpha(m_i)}\,\frac{C_i}{T_i} \le 1 \quad\text{and}\quad E_P(m_i) = \sum_{i=1}^{n} p(m_i)\,\frac{1}{\alpha(m_i)}\,\frac{C_i}{T_i} \ \text{is minimized.} \tag{3-6} \]

In Figure 3-2, we describe the relationship between energy consumption and utilization for a set of periodic tasks. The maxima and minima are denoted as $\max E_P$ and $\min E_P$ for the energy and $\max U_P$ and $\min U_P$ for the utilization, respectively. Again, $\min E_P$ follows equation (3-6). Regarding the feasibility of the energy constraint and the worst-case utilization, $E_C$ must be greater than $\min E_P$ and $\min U_P$ should be no greater than 1. By definition, if $\min U_P$ is greater than unity with all H-mode executions, it is impossible to find voltage settings that ensure all tasks meet their deadlines. If $\max U_P$ is less than 1, the tasks are schedulable with all L-mode assignments and the energy consumption can never be less than $\min E_P$. In that case, $\max U_P$ becomes $U_P(L) = \frac{1}{\alpha_L} \sum_{i=1}^{n} \frac{C_i}{T_i}$ and $\min E_P$ becomes $E_P(L) = \frac{p_L}{\alpha_L} \sum_{i=1}^{n} \frac{C_i}{T_i}$.

If the energy budget $E_P$ is given in the range from $\min E_P$ to $\max E_P$, $U_P^{available}$ is the available utilization corresponding to the allocated energy budget $E_P$. By searching for a set of voltage settings that meets the given energy budget and schedulability, the energy demand and utilization of the periodic tasks are determined as $E_P(m_i) \le E_P$ and $U_P(m_i) \ge U_P^{available}$, respectively.
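The profile described by equations (3-2) through (3-5) can be illustrated with a minimal Python sketch. The task set is hypothetical, the power and speed values are the two-mode figures quoted for the simulation in Section 3.6, and the names `PeriodicTask`, `energy_demand`, and `utilization` are introduced here only for illustration. As a simplifying assumption, the sketch uses $C_i$ for both the energy demand and the utilization.

```python
from dataclasses import dataclass

# (power in watts, relative speed) per mode; example inputs borrowed from
# the simulation settings quoted in Section 3.6.
MODES = {"H": (1.3, 2.0), "L": (0.241, 1.0)}


@dataclass(frozen=True)
class PeriodicTask:
    period: float  # T_i
    wcet: float    # C_i, expressed at unit speed


def energy_demand(tasks, settings):
    """Sum of p(m_i)/alpha(m_i) * C_i/T_i, following equation (3-2).

    Assumption: C_i is used as the execution time, so this is a
    worst-case energy demand rather than an average one.
    """
    return sum(
        MODES[m][0] / MODES[m][1] * t.wcet / t.period
        for t, m in zip(tasks, settings)
    )


def utilization(tasks, settings):
    """Sum of C_i / (alpha(m_i) * T_i), following equation (3-3)."""
    return sum(
        t.wcet / (MODES[m][1] * t.period) for t, m in zip(tasks, settings)
    )


# Hypothetical task set with U_P(L) = 0.8.
TASKS = [PeriodicTask(period=10, wcet=4), PeriodicTask(period=20, wcet=8)]

for settings in (["H", "H"], ["H", "L"], ["L", "L"]):
    print(settings,
          round(energy_demand(TASKS, settings), 4),
          round(utilization(TASKS, settings), 4))
```

For this task set, the all-H setting gives the largest energy demand with the smallest utilization, the all-L setting gives the opposite extreme, and the mixed setting falls in between, which is the tradeoff that Figure 3-2 depicts.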
[Figure 3-2. The relationship between power consumption and utilization for a set of periodic tasks. Energy axis: 0, $\min E_P$, $E_P$, $\max E_P$. Utilization axis: 0, $\min U_P$, $U_P^{available}$, 1, $\max U_P$, with the schedulable region marked.]

3.4.2 Aperiodic Tasks

Denote by $m_A$ the voltage setting chosen between the two possible modes for aperiodic tasks, which have an average inter-arrival time of $\lambda$ and an average worst-case execution time of $\mu$. If all of them are assigned to mode $m_A$ and the power consumption in mode $m_A$ is $p(m_A)$, their energy consumption and utilization are

\[ E_A(m_A) = p(m_A)\,\frac{1}{\alpha(m_A)}\,\frac{\mu}{\lambda} \tag{3-7} \]

\[ U_A(m_A) = \frac{1}{\alpha(m_A)}\,\frac{\mu}{\lambda} \tag{3-8} \]

respectively. Also, if all of them are assigned to L-mode or H-mode, they demand the minimum energy $\min E_A$ or the maximum energy $\max E_A$ given by

\[ \min E_A = p_L\,\frac{\mu}{\alpha_L \lambda} \quad\text{and}\quad \max E_A = p_H\,\frac{\mu}{\alpha_H \lambda} \tag{3-9} \]

with the corresponding utilizations

\[ \max U_A = \frac{\mu}{\alpha_L \lambda} \quad\text{and}\quad \min U_A = \frac{\mu}{\alpha_H \lambda} \tag{3-10} \]

3.4.3 Energy Budget Allocation

While the constraint $E_P + E_A \le E_C$ must be satisfied, we can examine how processor utilization, task scheduling, and task response time are affected by the allocation. From the viewpoint of utilization, the more utilization is available for aperiodic tasks, the shorter the deadlines that are assigned to them by the deadline assignment of equation (3-1). This gives them higher priorities in EDF scheduling, so that they get a faster response. To give more utilization to aperiodic tasks, the utilization of periodic tasks must be shrunk. This can be done by assigning more periodic tasks to H-mode, but it requires more energy. Since the total energy budget is bounded, the energy budget left to aperiodic tasks is reduced. As a result, the aperiodic tasks must be run in low-voltage mode and their response times are extended. Likewise, from the viewpoint of the energy budget, the portion of aperiodic execution assigned to H-mode should be maximized within the assigned energy budget $E_A$ to get faster responsiveness. But, as before, if the energy demand of aperiodic tasks is increased, the energy available for periodic tasks is decreased.
Consequently, the available utilization for aperiodic tasks is decreased due to the increased execution time of periodic tasks, which may degrade responsiveness. To obtain both schedulability and fast responsiveness under a bounded energy budget, an effective scheduling and energy allocation scheme is therefore needed for jointly scheduling hard periodic and soft aperiodic tasks. The scheduling should address the tradeoff between utilization and energy consumption shown in Figure 3-2.

3.5 Constrained Energy Allocation using VCS-EDF Scheduling

In this section, we describe an energy allocation scheme that allocates a bounded energy budget to periodic and aperiodic tasks based on VCS-EDF scheduling while meeting the requirements of the real-time tasks, i.e., meeting the deadlines of periodic tasks and obtaining a faster average response time for aperiodic tasks. Given an energy budget $E_C$, and considering the feasible range of energy demand determined by the tasks, it finds the voltage settings for periodic tasks and, for aperiodic tasks, the portions of the worst-case execution time to be run in H-mode and L-mode under the bounded energy budget.

3.5.1 Energy Allocation Factors

Suppose that $E_P$ and $E_A$ can be allocated in the ranges $[\min E_P, \max E_P]$ and $[\min E_A, \max E_A]$, respectively, and let $E_{max} = \max E_P + \max E_A$ and $E_{min} = \min E_P + \min E_A$. If the bounded energy consumption budget is given as $E_C$, then $E_C$ must fall into the range $E_{min} \le E_C \le E_{max}$, where $\min E_P$, $\min E_A$, $\max E_P$, and $\max E_A$ are as defined in equations (3-4), (3-6), and (3-9) for a given set of tasks. Then, voltage settings must be determined such that the energy consumption satisfies the constraint $E_P + E_A \le E_C$, while guaranteeing the schedulability of periodic tasks and minimizing the average response time of aperiodic tasks. For ease of explanation, we define $E_{diff} = E_{max} - E_{min}$. Let $\beta$ and $\gamma$ be the energy allocation factors of periodic and aperiodic tasks, with $0 \le \beta \le 1$ and $0 \le \gamma \le 1$, respectively.
Then, the energy budgets $E_P$ and $E_A$ allocated to them are represented as

\[ E_P = \min E_P + \beta\,(\max E_P - \min E_P) \tag{3-11} \]

\[ E_A = \min E_A + \gamma\,(\max E_A - \min E_A) \tag{3-12} \]

respectively. Let $\Delta_A = \max E_A - \min E_A$, $\Delta_P = \max E_P - \min E_P$, and $\Delta_C = E_C - (\min E_A + \min E_P)$. Then the inequality $E_P + E_A \le E_C$ becomes $\gamma\Delta_A + \beta\Delta_P \le \Delta_C$. Hence, $\beta$ and $\gamma$ are determined by

\[ 0 \le \beta = \frac{\Delta_C - \gamma\Delta_A}{\Delta_P} \le 1, \qquad \frac{\Delta_C - \Delta_P}{\Delta_A} \le \gamma \le \frac{\Delta_C}{\Delta_A} \tag{3-13} \]

respectively. The choice of $\gamma$ determines $\beta$, and vice versa, and also determines $E_A$ and $E_P$ by equations (3-11) and (3-12). If $\gamma = 0$ or $\gamma = 1$, the energy $\min E_A$ or $\max E_A$ is assigned to aperiodic tasks, i.e., all aperiodic tasks are assigned to L-mode or H-mode, respectively. If $\gamma$ is 0.6, the energy assigned to aperiodic tasks becomes $(\min E_A + 0.6\Delta_A)$. Unlike the voltage settings for periodic tasks, which are decided task by task, the running mode for aperiodic tasks is determined by the fraction of execution in H-mode and L-mode. If the fraction assigned to H-mode is $x_H$, then the fraction assigned to L-mode is $(1 - x_H)$. The energy consumption needs to be bounded by the budget, and so

\[ x_H\,\frac{p_H}{\alpha_H}\,\frac{\mu}{\lambda} + (1 - x_H)\,\frac{p_L}{\alpha_L}\,\frac{\mu}{\lambda} \le E_A \tag{3-14} \]

Similarly, the execution time of an aperiodic task is determined according to the voltage modes, and the deadline assigned in equation (3-1) is adjusted accordingly.

As for responsiveness, the greater the fraction of the processor utilization that is given to aperiodic tasks, the better the responsiveness expected under the TBS algorithm, because shorter deadlines are assigned to them. Under energy budgets $E_C$ and $E_P$, the utilization available to aperiodic tasks is increased if the voltage settings allocate more H-mode execution to periodic tasks within the energy budget, so that $U_P(m_i)$ is minimized and $U_A^{available}$ is increased. We therefore have a constrained optimization problem: determine the optimal voltage settings, maximizing H-mode execution, within the budget $E_P$ while guaranteeing that no deadline of any periodic task is missed.
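The arithmetic of equations (3-11) through (3-14) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function names `allocate` and `h_mode_fraction` are hypothetical, the budget is assumed to be spent in full (so that $\gamma$ fixes $\beta$, clamped at 1), and the numeric inputs are illustrative values computed from the two-mode power and speed figures quoted in Section 3.6 for $U_P(L) = 0.8$ and $U_A(L) = 0.3$.

```python
def allocate(e_c, min_ep, max_ep, min_ea, max_ea, gamma):
    """Split budget e_c into (beta, E_P, E_A) for a chosen gamma.

    Follows equations (3-11) to (3-13), assuming the whole budget is
    handed out, so that beta = (delta_c - gamma * delta_a) / delta_p.
    """
    delta_p = max_ep - min_ep
    delta_a = max_ea - min_ea
    delta_c = e_c - (min_ep + min_ea)
    if delta_c < 0:
        raise ValueError("budget is below E_min; workload cannot run")
    # Clamp at 1: periodic tasks cannot use more than max E_P.
    beta = min((delta_c - gamma * delta_a) / delta_p, 1.0)
    if beta < 0:
        raise ValueError("gamma is too large for this budget")
    return beta, min_ep + beta * delta_p, min_ea + gamma * delta_a


def h_mode_fraction(e_a, min_ea, max_ea):
    """Largest x_H allowed by equation (3-14).

    With min E_A and max E_A as in (3-9), the constraint reads
    x_H * max_ea + (1 - x_H) * min_ea <= e_a.
    """
    return (e_a - min_ea) / (max_ea - min_ea)


# Illustrative bounds: E_min = 0.2651, E_diff = 0.4499.
MIN_EP, MAX_EP = 0.1928, 0.52
MIN_EA, MAX_EA = 0.0723, 0.195
E_C = (MIN_EP + MIN_EA) + 0.6 * ((MAX_EP + MAX_EA) - (MIN_EP + MIN_EA))

for gamma in (0.0, 0.5, 1.0):
    beta, e_p, e_a = allocate(E_C, MIN_EP, MAX_EP, MIN_EA, MAX_EA, gamma)
    x_h = h_mode_fraction(e_a, MIN_EA, MAX_EA)
    print(gamma, round(beta, 3), round(e_p, 4), round(e_a, 4), round(x_h, 3))
```

One consequence worth noting: substituting (3-9) and (3-12) into (3-14) gives $x_H \le \gamma$, so under these definitions the largest admissible H-mode fraction for aperiodic tasks coincides with the allocation factor $\gamma$.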
The optimization problem of finding voltage settings for periodic tasks can be stated as follows: pick the task subsets $H$ and $L$ for the H-mode and L-mode voltage settings such that

\[ H \cup L = \{\tau_1, \tau_2, \ldots, \tau_n\}, \qquad H \cap L = \emptyset, \]

\[ U_P(m_i) = \sum_{i=1}^{n} \frac{1}{\alpha(m_i)}\,\frac{C_i}{T_i} \ \text{is minimized} \]

subject to the well-known sufficient condition¹ for the schedulability of periodic tasks under EDF, i.e.,

\[ \frac{1}{\alpha_H} \sum_{i \in H} \frac{C_i}{\min(T_i, D_i)} + \frac{1}{\alpha_L} \sum_{i \in L} \frac{C_i}{\min(T_i, D_i)} \le 1 \tag{3-15} \]

and the energy consumption constraint

\[ \frac{p_H}{\alpha_H} \sum_{i \in H} \frac{C_i}{\min(T_i, D_i)} + \frac{p_L}{\alpha_L} \sum_{i \in L} \frac{C_i}{\min(T_i, D_i)} \le E_P. \]

This optimization problem can be treated as equivalent to the decision problem of subset sum, which is NP-complete. Consequently, efficient search heuristics, e.g., branch-and-bound algorithms, should be employed to find a solution if $n$ is large.

¹ The condition is also necessary if $D_i \ge T_i$ for all $i$.

3.5.2 Algorithm for Energy Budget Allocation

We describe here the algorithm for the bounded energy allocation explained in the previous section. The algorithm outputs the energy allocation factors $\beta$ and $\gamma$, the voltage settings for periodic tasks, $\{m_i\}$, and the percentage of H-mode assignment for aperiodic tasks, $x_H$.

The Algorithm of VCS-EDF Scheduling Under Bounded Energy Consumption $E_C$

1. Compute $\min E_P$, $\max E_P$, $\min E_A$, $\max E_A$, $E_{max}$, and $E_{min}$.

2. If $E_C$ is less than $E_{min} = (\min E_P + \min E_A)$, there is not enough energy to execute the workload.

3. If $E_C$ is in the range $E_{min} \le E_C \le E_{max}$, compute the ranges of $\gamma$ and $\beta$, and of $E_A$ and $E_P$, accordingly.

4. For each $\gamma$ in the range $0 \le \gamma \le 1$, execute the following steps:

(4a) Compute $\beta$, $E_A$, and $E_P$.

(4b) Find $\{m_i\}$ that satisfies $\sum_{i=1}^{n} p(m_i)\,\frac{C_i}{T_i}\,\frac{1}{\alpha(m_i)} \le E_P$ and for which $U_P(m_i) = \sum_{i=1}^{n} \frac{C_i}{T_i}\,\frac{1}{\alpha(m_i)} \le 1$ is minimized, where $m_i$ is the voltage setting, either H-mode or L-mode, for periodic task $\tau_i$, using simple search or branch-and-bound algorithms.

(4c) Compute $U_A^{available} = 1 - U_P(m_i)$.

(4d) Given $E_A$, find $x_H$, the fraction of execution in H-mode for aperiodic tasks, and $(1 - x_H)$, the fraction in L-mode.

(4e) Applying the TBS algorithm for the deadline assignment, with $U_P(m_i)$ and $U_A^{available}$
computed in steps (4b) and (4c), respectively, run VCS-EDF scheduling with the voltage settings $\{m_i\}$ for periodic tasks and the fractions $x_H$ and $(1 - x_H)$ for aperiodic tasks.

5. Find the $\gamma$ that has the minimum average response time among the scheduling results of step 4.

6. The value of $\gamma$ determined in step 5, which gives the best performance for aperiodic tasks, is selected for energy allocation; $x_H$ for running the aperiodic tasks in H-mode and $\{m_i\}$ for the voltage settings of the periodic tasks are determined accordingly.

Figure 3-3 The algorithm for constrained energy budget allocation

3.6 Simulation Evaluation

We analyze here the properties of sharing a bounded energy budget between periodic and aperiodic tasks based on the VCS approach, and evaluate the VCS-EDF scheme for scheduling mixed real-time tasks. For the power consumption and speed settings, our simulation uses Motorola's PowerPC 860 processor, which can be operated in a high-performance mode at 50 MHz with a supply voltage of 3.3 V, or in a low-power mode at 25 MHz with an internal voltage of 2.4 V, so that $V_H$ and $V_L$ are fixed to $V_H = 3.3$ and $V_L = 2.4$. The power consumption in the high-performance mode is 1.3 W ($p_H$), as compared to 241 mW ($p_L$) in the low-power mode. The clock rate at high voltage is 100% greater than at low voltage: $\alpha_H = 2.0$ and $\alpha_L = 1.0$.

The simulation study addresses the improvement of task execution time with extra available energy. In other words, the system is assumed to possess enough energy to complete the tasks and meet the deadline requirements. In addition, there is extra energy that can be allocated to improve the response time of aperiodic tasks. The immediate objective of the simulation study is to see how the response time can be reduced through a proper voltage setting. On the one hand, this extra energy can be allocated to periodic tasks so that the processor utilization reserved for periodic tasks is reduced. This leads to shorter deadlines being assigned under the total-bandwidth scheduling scheme.
On the other hand, the extra energy can be consumed by aperiodic tasks, which has a first-order effect on the reduction of response time.

In our simulation, we first generate 10 random task periods in the range of 100 to 1000 and set the task deadlines equal to their respective periods. The worst-case execution demands of the tasks are randomly chosen such that, for each simulation case, no deadlines need be missed and the resultant utilization is set to $U_P(L) = 0.8$, 1.0, or 1.2, respectively. For aperiodic tasks, we adopt an exponentially distributed execution time with an average $\mu$ equal to 45. We then let the inter-arrival time be exponentially distributed with a mean between 450 (10% workload, i.e., $U_A(L) = 0.1$) and 112.5 (40%, i.e., $U_A(L) = 0.4$). The energy budget $E_C$ is set at each of several energy levels in the range from $(E_{min} + 0.6E_{diff})$ to $(E_{min} + E_{diff})$.

To get fast responsiveness, how much of the energy budget should be given to periodic and aperiodic tasks, respectively? Over various values of $\gamma$ and constrained energy budgets, we obtain the average response times of aperiodic tasks from the simulation and plot them in Figure 3-4. Regardless of the energy budget, Figure 3-4 reveals a trend of reduction in the average response time of aperiodic tasks as $\gamma$ increases. The average response time does not always decrease monotonically with an increase in $\gamma$. In some regions, it increases abruptly or is flat over increasing $\gamma$. This occurs especially when $E_C = (E_{min} + 0.6E_{diff})$ or $E_C = (E_{min} + 0.7E_{diff})$. Note that when we increase $\gamma$, aperiodic tasks are executed more in high-voltage, high-speed mode. This reduces the CPU utilization required by aperiodic tasks under the voltage setting.
On the other hand, as $\beta$ is reduced, the energy allocated to the periodic tasks decreases, which leads to an increase in $U_P(m_i)$ and a decrease in $U_A^{available}$. The two reductions, one in the demand to complete aperiodic tasks and the other in the available utilization for aperiodic tasks, can have a profound impact on the response times. Let the CPU utilization required by aperiodic tasks be denoted as $U_A^{real}$; we show the ratio of $U_A^{available}$ to $U_A^{real}$ in Figure 3-5. For instance, with $E_C = (E_{min} + 0.6E_{diff})$ and $\gamma = 1.0$ in Figure 3-5 (a), there still exists extra energy to be assigned to periodic tasks ($\beta > 0$), and an optimal voltage setting is obtained that leads to $U_P(m_i) = 0.55$ and $U_A^{available} = 0.45$. On the other hand, $U_A^{real}$ is reduced to 0.15 as we increase $\gamma$ to 1.0. A ratio of 3 is then obtained and plotted in the figure. It is interesting to observe that, whenever the ratio is flat in Figure 3-5, the average response times have uneven decreases in Figure 3-4. In fact, as long as the ratio of $U_A^{available}$ to $U_A^{real}$ continues to increase, the processor possesses greater capacity to complete aperiodic tasks and the response time drops. In contrast, there would be a monotonic decrease in response time if the ratio were flat as we increase $\gamma$.

[Figure 3-4. The responsiveness to the energy allocation of aperiodic tasks. Average response time with respect to $\gamma$, with one curve for each budget $E_C = E_{min} + 0.6$, 0.7, 0.8, 0.9, and 1.0 $E_{diff}$. Panels: (a) $U_P = 0.8$, $U_A = 0.3$; (b) $U_P = 1.0$, $U_A = 0.3$; (c) $U_P = 1.2$, $U_A = 0.3$; (d) $U_P = 1.4$, $U_A = 0.3$.]

The other interesting observation in Figure 3-5 is that utilization ratios are not available for all values of $\gamma$. This indicates that the possible choices of $\gamma$ only exist in the range where the plots are shown.
This is also evidenced in equation (3-14) and originates from the definition of $\gamma$, in which the minimum value of $\gamma$ represents the percentage of energy available for aperiodic tasks after periodic tasks take as much of the energy budget as they can. From these results, to get fast responsiveness for aperiodic tasks, the greater portion of the energy budget should be allocated to aperiodic tasks, and the voltage settings of periodic tasks then need to be determined within the energy budget remaining for them. Note that the way we formulate the minimal energy budget is based on the schedulability of periodic tasks and on ensuring no CPU starvation for aperiodic tasks. If the energy budget is below this minimum, aperiodic tasks will incur much longer response times.

[Figure 3-5. The ratios of available utilization to the minimum utilization for aperiodic tasks, plotted with respect to $\gamma$, with one curve for each budget $E_C = E_{min} + 0.6$, 0.7, 0.8, 0.9, and 1.0 $E_{diff}$. Panels: (a) $U_P = 0.8$, $U_A = 0.3$; (b) $U_P = 1.0$, $U_A = 0.3$; (c) $U_P = 1.2$, $U_A = 0.3$; (d) $U_P = 1.4$, $U_A = 0.3$.]

To reveal the causes of the flat regions in Figure 3-5, we now investigate how the energy budget is allocated to periodic and aperiodic tasks, respectively. In Figure 3-6, we show the energy sharing as percentages of the allocated energy, $E_P$ for periodic and $E_A$ for aperiodic tasks, relative to the maximum energy demand $E_{max}$, that is, the maximal energy consumption of a given task set. The plots in Figure 3-6 (a)-(c) cover the case in which $E_C$ is bounded to $(E_{min} + 0.6E_{diff})$. In Figure 3-6 (d), however, we plot the energy percentages under $E_C = (E_{min} + 0.7E_{diff})$, unlike the ones for the other periodic workloads. The reason is that the energy budget $(E_{min} + 0.6E_{diff})$ is too low to select voltage settings that make the given set of tasks schedulable under the periodic workload of $U_P = 1.4$.
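The voltage-setting search of step (4b), and the way a budget can be too low for any schedulable setting, can be illustrated with a small exhaustive search in Python. This is a sketch only: the function name `search_settings` and the three-task set with $U_P(L) = 1.4$ are hypothetical, the power and speed values are the two-mode figures quoted in this section, and plain enumeration of all $2^n$ settings stands in for the branch-and-bound search that the text recommends for large $n$.

```python
from itertools import product

# Two-mode parameters quoted in Section 3.6.
POWER = {"H": 1.3, "L": 0.241}   # watts
SPEED = {"H": 2.0, "L": 1.0}     # relative clock rate


def search_settings(tasks, e_p):
    """Exhaustively pick the voltage setting with minimum utilization.

    tasks: list of (T_i, C_i) pairs, with C_i given at unit speed.
    Returns (settings, utilization, energy), or None when no setting is
    both schedulable (utilization <= 1) and within the budget e_p.
    """
    eps = 1e-12  # tolerance for floating-point comparisons
    best = None
    for settings in product("HL", repeat=len(tasks)):
        util = sum(c / (SPEED[m] * t) for (t, c), m in zip(tasks, settings))
        energy = sum(
            POWER[m] / SPEED[m] * c / t for (t, c), m in zip(tasks, settings)
        )
        if util <= 1.0 + eps and energy <= e_p + eps:
            if best is None or util < best[1]:
                best = (settings, util, energy)
    return best


# Hypothetical task set with U_P(L) = 0.5 + 0.5 + 0.4 = 1.4.
TASKS = [(10, 5), (20, 10), (40, 16)]

for budget in (0.70, 0.80, 0.90, 0.95):
    print(budget, search_settings(TASKS, budget))
```

With this task set, the lowest budget in the loop admits no schedulable setting, and the two middle budgets select the same setting, because the energy demands of the discrete settings form a small set of separated levels.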
When a set of periodic tasks makes the most of the given energy budget $E_P$, i.e., $E_P(m_i) \le E_P$, the value of $E_P(m_i)$ is determined by the set of voltage settings $m_i$ chosen by the VCS-EDF algorithm, subject to the requirement of maintaining schedulability. Thus, there is a small discrepancy in energy consumption between $E_P$ and $E_P(m_i)$.

[Figure 3-6. Energy allocation percentage to the maximum energy demand. Percentages of allocated energy relative to $E_{max}$, plotted with respect to $\gamma$; the legend includes $E_P$ and $E_P(m_i)$. Panels: (a) $U_P = 0.8$, $U_A = 0.3$; (b) $U_P = 1.0$, $U_A = 0.3$; (c) $U_P = 1.2$, $U_A = 0.3$; (d) $U_P = 1.4$, $U_A = 0.3$.]

Over several regions of $\gamma$, $E_P(m_i)$ stays at the same level even though $E_P$ is decreasing, while remaining less than $E_P$. In other words, the same voltage settings are selected for different values of $E_P$. If we sort all the possible combinations of voltage settings in descending or ascending order of energy demand, a discontinuity in energy demand exists between any two sets of voltage settings adjacent in the sorted list. We call this discontinuity an energy gap. Even if there is a small change in the energy budget $E_P$, the voltage settings cannot change unless the budget jumps across an energy gap between adjacent energy levels. However, as the number of periodic tasks grows, the flatness in Figure 3-6 is reduced, because the energy gaps between adjacent energy levels of the discrete voltage settings become finer.

It should be noted that the big drops in the response times of aperiodic tasks occur when the voltage settings of periodic tasks result in an energy allocation $E_P(m_i)$ that is very close to the available budget $E_P$. For instance, at $\beta = 0.1$ and 0.5 in Figure 3-6 (c), the settings lead to a small reduction of $U_A^{available}$ which, combined with the decrease of $U_A^{real}$, brings about a considerable decrease in the task response time in Figure 3-4 (c).
[Figure 3-7. Minimum average response time with respect to the bounded energy budget. Horizontal axis: the fraction of $E_{diff}$ in the energy budget, from 1 down to 0.6; one curve for each of $U_A = 0.1$, 0.2, 0.3, and 0.4. Panels: (a) $U_P = 0.8$; (b) $U_P = 1.0$; (c) $U_P = 1.2$; (d) $U_P = 1.4$.]

We now consider how much improvement we can obtain from an increased energy budget. In Figure 3-7, we show the evaluation results for the minimum average response time as the constrained energy ranges from $0.6E_{diff}$ to $E_{diff}$. The responsiveness of aperiodic tasks for $U_A = 0.1$ and 0.2 is not much affected by the periodic tasks' workload $U_P$ or the constrained energy budget $E_C$. Since every aperiodic task is assigned to H-mode (i.e., $\gamma = 1.0$, to ensure minimal response time) and is allocated the maximal energy budget, the available energy budget for periodic tasks decreases as $U_A$ increases. As a consequence, the increased aperiodic workload increases the average response time for $U_A = 0.3$ and 0.4, as $U_A^{available}$ is limited and the deadlines assigned to the aperiodic tasks are extended.

3.7 Conclusion

In this chapter, we have presented an algorithm to carry out voltage-clock scaling for workloads consisting of periodic hard and soft real-time tasks. The aim is to keep within a predefined energy budget. The objective of the scheduling scheme is to minimize the response time of aperiodic tasks while all deadlines of periodic tasks are met and the total energy consumption is bounded by the energy budget. As we apply total bandwidth scheduling to aperiodic tasks, we notice two conflicting factors in energy budget allocation. When extra budget is assigned to aperiodic tasks, they can be executed in high-voltage, high-speed mode. This leads to a reduced response time.
On the other hand, extra energy budget allocated to periodic tasks lowers the CPU utilization reserved for periodic tasks. This, in turn, leaves more CPU utilization available for aperiodic tasks and produces shorter deadlines as defined in the total bandwidth scheduling scheme. Our simulation study assumes that the energy budget is enough to meet the deadlines of the hard real-time periodic tasks and to complete the aperiodic tasks. In addition, there is extra energy that can be allocated to either periodic or aperiodic tasks. Our results demonstrate that VCS-EDF scheduling gets the fastest responsiveness when the extra energy budget is allocated to aperiodic tasks at their maximum energy demand, so that all of them can run in H-mode. Given the responsiveness requirement and any energy budget, the proposed scheduling method can decide the voltage settings for periodic tasks so that real-time tasks share the bounded energy budget effectively. Therefore, the work provides the designer of battery-driven embedded real-time systems with a general view that allows real-time tasks to be scheduled considering their general characteristics of energy demand and processor utilization, given a constraint of bounded energy availability.

CHAPTER 4 DUAL-POLICY DYNAMIC SCHEDULING

4.1 Introduction

Due to accelerated miniaturization, portability, and the complex functionality of handheld devices (e.g., multimedia in portable phones), mobile computing and communication devices such as laptop computers and personal digital assistants (PDAs) are demanding more power. But battery technology cannot keep pace with the energy demands of microprocessor systems, so power management and power efficiency have become critical design issues for battery-powered embedded systems [79]. An effective approach to power reduction is a technique called Voltage-Clock Scaling or Dynamic Voltage Scaling in CMOS circuit technology.
Because power consumption depends quadratically on the supply voltage, a small reduction in voltage produces a significant reduction in power consumption. However, lowering the supply voltage also reduces the clock frequency, so minimizing power consumption without regard to the power-delay tradeoff of low-power design will degrade response times or cause real-time deadlines to be missed.

Although there have been many studies of low-power and power-aware computing, and the importance of energy consumption in systems is widely recognized, the energy efficiency of real-time systems under a bounded energy budget, for instance battery-powered real-time systems, is relatively unexplored. One of the keys to utilizing a constrained energy budget more effectively is the ability to allocate the energy consumption to specific components. For appropriate energy allocation, there have been many studies measuring the energy impact at various levels of a system, such as the instruction, function, or system component level [81, 82]. Besides, in real-time embedded systems, the relationship between energy and workload demands must be well understood on top of the energy profiling, since energy savings are conventionally obtained by sacrificing system performance to a tolerable level.

Aperiodic tasks, which have no form of guarantee, are usually served in the background with respect to hard tasks, in order not to jeopardize the schedulability of hard tasks, or served by an aperiodic server to improve responsiveness. The objective of minimizing power consumption results in slow execution of these tasks, and the response time could be unrestrained. On the other hand, a reduction of energy consumption that does not utilize all the available energy may not lead to optimal system performance.
A better power control strategy in such cases is to minimize the response times of soft real-time tasks, provided that the deadlines of hard real-time tasks are met and the average power consumption is bounded. To fulfill all these requirements, an energy allocation model was proposed that provides energy profiling and an energy sharing scheme at the task level [83] for real-time embedded systems with mixed hard and soft tasks. The model also reflects the tradeoff between energy consumption and utilization in VCS-EDF scheduling.

In this chapter, based on the constrained energy allocation model, we propose a Dual-Policy Dynamic Scheduling method that not only gives better performance for aperiodic tasks but also extends the lifetime of a system by lowering energy consumption under a confined energy budget. The approach intermixes two elementary schedules, which originate from the pattern of event occurrences in a joint scheduling of periodic and aperiodic tasks. In traditional scheduling schemes, both kinds of tasks are scheduled under the worst-case scenario, satisfying the energy constraint and real-time schedulability. Depending on whether aperiodic task events are present, two kinds of intervals can be observed in the run-time scheduling of mixed tasks. One is the P (Periodic) interval, which has only periodic task events; the other is the PAP (Periodic and APeriodic) interval, which has both hard and soft task events. By the relationship between energy consumption and utilization in VCS-EDF scheduling, the voltage settings for periodic tasks demand a greater energy budget when both kinds of tasks are scheduled than when only the periodic task set is scheduled, if schedulability is to be preserved.
Therefore, in P intervals, the proposed scheduling algorithm applies an elementary scheduling policy that consumes minimum energy for the periodic task set, rather than the worst-case scheduling policy determined for scheduling both kinds of tasks. The energy saving comes from executing more tasks in low-speed mode instead of high-speed mode during P intervals. However, executing periodic tasks at a running speed different from the worst-case setting in P and PAP intervals increases the remaining workload at the switching point, so the workload backlogged to the next scheduling policy may cause at least one periodic task to miss its deadline. Because slack time arises from the difference between the WCET and the real execution demand, the backlog can be alleviated by this slack. For cases where the slack is not enough to nullify the backlog, two transitory scheduling policies are introduced between P and PAP intervals. Under a transitory scheduling policy, the next target elementary schedule is activated once the backlogged workload becomes equal to or less than zero.

To follow the switching of scheduling policies across intervals, we modify the total bandwidth server (TBS) algorithm applied to aperiodic tasks. If no workload is backlogged to a PAP interval, the modified TBS assigns a deadline to each aperiodic task exactly as the original total bandwidth server does. Otherwise, an infinite deadline is assigned and maintained until the backlogged workload becomes no greater than zero.

The rest of this chapter is structured as follows. In Section 4.2, we discuss the design concepts of a dynamic scheduling scheme for energy saving and fast responsiveness under a bounded energy budget. We then propose the model of the dual-policy dynamic scheduling in Section 4.3. In Section 4.4, the proposed dynamic scheduling is explained in detail, including the elementary schedules that constitute it.
The proposed dynamic scheduling algorithm is explained in Section 4.5. To illustrate the effectiveness of the proposed scheduling algorithm, we evaluate its performance through simulations in Section 4.6. A short conclusion is given in Section 4.7.

4.2 Design Concepts

In this section, we propose a dynamic real-time scheduling model for mixed periodic and aperiodic tasks. The goals are not only to share a bounded energy budget but also to reduce power consumption under a given energy budget. In the targeted real-time systems, tasks may arrive periodically or sporadically; the periodic tasks are constrained to meet their individual deadlines, and the performance of aperiodic tasks is measured by average response time. Under limited energy consumption, we build a dynamic dual-policy scheduling model based on the profiles of energy and utilization demands for both periodic and aperiodic tasks and on the energy budget allocation model proposed in Chapter 3.

4.2.1 Simple Two Voltage Settings

The system operates at two different voltage levels. Ideally, a variable-voltage processor with continuous voltage and clock settings across its operational range would be available. We assume a simpler arrangement in which the processor of a real-time system can be dynamically configured in one of two modes: low-voltage (L) mode and high-voltage (H) mode. In L-mode, the processor is supplied with a low voltage ($V_L$) and runs at a slow clock rate, so task execution may be prolonged but the processor consumes less energy. In H-mode, the processor is supplied with a high voltage ($V_H$) and runs at a fast clock rate, completing tasks sooner at the expense of more energy consumption. The operating speeds in L-mode and H-mode are denoted $\alpha_L$ and $\alpha_H$, respectively, in terms of some unit of computational work. Depending on the voltage setting for task $\tau_i$, the worst-case execution time is $C_i/\alpha_L$ or $C_i/\alpha_H$.
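The two-mode execution-time rule of Section 4.2.1 can be sketched in a few lines. This is a minimal illustration, not the author's implementation: the speed values in `ALPHA` and the function name `wcet_in_mode` are hypothetical, chosen only to make the arithmetic visible.

```python
# Hypothetical operating speeds (units of work per unit time); not from the source.
ALPHA = {"L": 0.5, "H": 1.0}

def wcet_in_mode(c_i, mode, alpha=ALPHA):
    """Worst-case execution time of a task with worst-case work c_i
    when it runs in the given mode: C_i / alpha_L or C_i / alpha_H."""
    if mode not in alpha:
        raise ValueError("mode must be 'L' or 'H'")
    return c_i / alpha[mode]

if __name__ == "__main__":
    # The same amount of work takes longer in L-mode than in H-mode.
    print(wcet_in_mode(2.0, "L"), wcet_in_mode(2.0, "H"))
```

With these assumed speeds, a task with two units of work needs twice as long in L-mode as in H-mode, which is the source of both the energy saving and the extra utilization discussed in the following sections.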
4.2.2 Total Bandwidth Server for Aperiodic Tasks

In traditional scheduling for real-time systems that have both periodically and sporadically arriving tasks, aperiodic tasks are usually served in the background with respect to hard tasks so as not to violate the schedulability of the hard tasks, or are served by an aperiodic server to improve responsiveness. Among the substantial body of research on scheduling aperiodic tasks in both fixed- and dynamic-priority systems, the TBS can produce better responsiveness than other aperiodic service mechanisms such as the sporadic server and slack stealing. An infinite number of soft aperiodic tasks $\{J_i, i = 0, 1, 2, \ldots\}$ are modeled as aperiodic computation activities represented by two parameters, $\lambda$ and $\mu$, where $\lambda$ is the average inter-arrival time between two consecutive aperiodic instances and $\mu$ is the average worst-case execution time of all aperiodic tasks. Aperiodic tasks are scheduled by the total bandwidth server algorithm, which makes a fictitious but feasible deadline assignment based on the available processor utilization guaranteed by the isolation of bandwidth between periodic and aperiodic tasks. In the TBS algorithm, the $k$th aperiodic request, arriving at time $t = r_k$, is assigned the deadline

\[ d_k = \max(r_k, d_{k-1}) + \frac{C_k}{U_A} \qquad (4\text{-}1) \]

where $C_k$ is the execution time of the request and $U_A$ is the processor utilization allocated to aperiodic tasks. By definition, $d_0 = 0$. The request is then inserted into the ready queue of the system and scheduled by EDF, like any periodic instance or aperiodic request already present in the system. Note that the deadlines are assigned such that, in each interval of time, the processor utilization of the aperiodic tasks is at most $U_A$. Hence, a set of periodic tasks with utilization factor $U_P = \sum_i C_i/T_i$ and a TBS with bandwidth $U_A$ is schedulable by EDF if and only if $U_A + U_P \le 1$.
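The deadline assignment of equation (4-1) and the accompanying schedulability test can be sketched as follows. This is a minimal sketch under stated assumptions: the function names `tbs_deadlines` and `edf_schedulable` are hypothetical, and each request's execution time $C_k$ is taken as already expressed in time units for the mode it will run in.

```python
def tbs_deadlines(requests, u_a):
    """Assign total-bandwidth-server deadlines per equation (4-1).

    requests: list of (r_k, C_k) pairs in arrival order.
    u_a:      bandwidth reserved for aperiodic tasks, 0 < u_a <= 1.
    Returns the list of deadlines d_k, with d_0 = 0 by definition.
    """
    if not 0.0 < u_a <= 1.0:
        raise ValueError("u_a must lie in (0, 1]")
    deadlines = []
    d_prev = 0.0
    for r_k, c_k in requests:
        d_k = max(r_k, d_prev) + c_k / u_a
        deadlines.append(d_k)
        d_prev = d_k
    return deadlines

def edf_schedulable(u_p, u_a):
    """EDF test for periodic tasks plus a TBS: U_A + U_P <= 1."""
    return u_p + u_a <= 1.0

if __name__ == "__main__":
    # Three requests given as (arrival time, execution time), bandwidth 0.5.
    print(tbs_deadlines([(0.0, 1.0), (1.0, 1.0), (10.0, 2.0)], 0.5))
```

A larger `u_a` shrinks the term $C_k/U_A$ and therefore produces earlier deadlines, which is the mechanism by which extra aperiodic bandwidth improves responsiveness under EDF.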
The definition and formal analysis of this algorithm are given in [40].

4.2.3 Energy Budget Allocations using Two-mode VCS-EDF Scheme

While the constraint $E_P + E_A \le E_C$ must be satisfied, the way the budget is divided profoundly affects processor utilization, task scheduling, and task response time. From the viewpoint of utilization, the more utilization is available for aperiodic tasks, the shorter the deadlines assigned to them by equation (4-1). This gives them higher priorities in EDF scheduling, so they obtain faster response times. To give more utilization to aperiodic tasks, the utilization of periodic tasks must be reduced, which can be done by assigning more periodic tasks to H-mode but requires more energy. Since the total energy budget given to a system is bounded, the energy budget left for aperiodic tasks is then reduced. As a result, the aperiodic tasks must run in the low-voltage mode and their response times are extended.

Likewise, from the viewpoint of energy budget, the portion of aperiodic tasks assigned to H-mode should be maximized within the assigned energy budget $E_A$ to obtain faster responsiveness. But under a bounded system energy budget, if the energy demand of aperiodic tasks increases, the energy left for periodic tasks decreases. Consequently, the utilization available for aperiodic tasks decreases because of the increased execution time of periodic tasks, which may degrade responsiveness.

Thus, to obtain both schedulability and fast responsiveness under a bounded energy budget, an effective scheduling and energy allocation scheme is needed for jointly scheduling hard periodic and soft aperiodic tasks. The scheduling should address the tradeoff between utilization and energy consumption.
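The tradeoff described in Section 4.2.3 can be made concrete with a small calculation. The sketch below uses the per-task utilization $C_i/(\alpha(m_i) T_i)$ and energy rate $p(m_i) C_i/(\alpha(m_i) T_i)$ that the chapter states in Section 4.3.1; the speed and power values, the task set, and the function names are hypothetical and serve only as an illustration.

```python
# Hypothetical speeds and power levels for the two modes; not from the source.
ALPHA = {"L": 0.5, "H": 1.0}
POWER = {"L": 1.0, "H": 4.0}

def periodic_utilization(tasks, modes, alpha=ALPHA):
    """U_P = sum of C_i / (alpha(m_i) * T_i) over tasks given as (C_i, T_i)."""
    return sum(c / (alpha[m] * t) for (c, t), m in zip(tasks, modes))

def periodic_energy_rate(tasks, modes, alpha=ALPHA, power=POWER):
    """E_P = sum of p(m_i) * C_i / (alpha(m_i) * T_i)."""
    return sum(power[m] * c / (alpha[m] * t) for (c, t), m in zip(tasks, modes))

if __name__ == "__main__":
    tasks = [(1.0, 4.0), (1.0, 5.0)]  # (C_i, T_i), illustrative values
    for modes in (("L", "L"), ("H", "L"), ("H", "H")):
        u_p = periodic_utilization(tasks, modes)
        e_p = periodic_energy_rate(tasks, modes)
        # U_A = 1 - U_P is the bandwidth left for the aperiodic server.
        print(modes, round(u_p, 3), round(1.0 - u_p, 3), round(e_p, 3))
```

Under these assumed numbers, each periodic task moved to H-mode frees bandwidth for the aperiodic server while raising the periodic energy demand, so with a fixed $E_C$ less energy remains for the aperiodic tasks themselves.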
4.2.4 Elementary Schedules

In the sequence of events of a real-time schedule, idle and busy cycles alternate for irregular durations, so the scheduling sequence is a continuum of idle and busy cycles. In an idle cycle, the scheduler has no job to do: no task arrives and no arrived task completes. In a busy cycle, the scheduler works continuously with no idle moments. Beginning with the arrival of either a periodic or an aperiodic task and followed by the arrivals of other tasks, a busy cycle is composed mainly of arrivals, executions, and completions of tasks. An execution is a computation determined by the scheduling policy and the execution demand. A completion means that the computation has finished and the executed task leaves the scheduler. When no more tasks are waiting to be executed, the busy cycle ends and another idle cycle starts.

[Figure 4-1 panels: (a) events of periodic tasks; (b) events of aperiodic tasks; (c) scheduling mixed periodic and aperiodic tasks. P: interval having only arrivals of periodic tasks. PAP: interval having arrivals of both periodic and aperiodic tasks. BC: busy cycle.]

Figure 4-1. An example of scheduling mixed real-time tasks and two explicit intervals in the event pattern

An example of event sequences is illustrated in Figure 4-1. The arrival sequence of mixed periodic and aperiodic tasks in Figure 4-1 (c) is separated into the arrivals of only periodic tasks and the arrivals of only aperiodic tasks, as shown in Figure 4-1 (a) and (b), respectively. In Figure 4-1 (a), where only the periodic tasks are scheduled, busy and idle cycles alternate, i.e., $BC_{P1}$, idle, $BC_{P2}$, idle, ..., depending on the periods, deadlines, and execution times (WCETs) of the task set. The voltage-clock scaling problem arises when the task set is schedulable if the processor runs entirely in H-mode but misses at least one deadline if it runs entirely in L-mode. Keeping the processor entirely in H-mode to meet all task deadlines results in extra energy consumption. Therefore, the algorithm for hard real-time tasks, VCS-EDF scheduling, is called on to determine the optimal voltage settings and minimize H-mode execution while guaranteeing that no deadline is missed, so that it contributes to energy efficiency as shown in [58].

If the system consists of only soft aperiodic tasks, as shown in Figure 4-1 (b), the objective of minimizing power consumption results in slow execution for these tasks, and the response time can be unrestrained. This is not a desirable strategy for scheduling aperiodic tasks. When aperiodic tasks are scheduled together with the same set of periodic tasks by the TBS algorithm, the processor utilization occupied by periodic tasks must be reduced to make sufficient room for the aperiodic tasks without violating schedulability. This means that more periodic tasks must be assigned to H-mode to accommodate the aperiodic tasks, at the cost of higher energy consumption. Thus, the energy consumption for scheduling periodic tasks together with aperiodic tasks is higher than for scheduling the set of periodic tasks alone.

In a joint scheduling of periodic and aperiodic tasks, the arrivals and completions of the two kinds of tasks are intermixed and show an explicit pattern, as in Figure 4-1 (c). After an idle cycle, any arrival of a periodic or aperiodic task starts a busy cycle, BC1 or BC2, and the completion of all arrived tasks ends it. A hard periodic task opens the first busy cycle (BC1) and is followed by several arrivals of periodic and aperiodic tasks. The second (BC2) begins with an aperiodic task. Let P (Periodic) denote an interval that has no aperiodic task events, only periodic ones, and PAP (Periodic and APeriodic) an interval that has mixed arrivals of both kinds of tasks. Then BC2 starts with a PAP interval, whereas BC1 starts with a P interval. Likewise, the arrival of an aperiodic task closes a P interval and opens a PAP interval, and the completion of all aperiodic tasks awaiting execution ends the PAP interval, after which the busy cycle enters a P interval again.

Our dual-policy scheduling for mixed tasks is motivated by the fact that the energy consumption required by periodic tasks alone is less than that required by both periodic and aperiodic tasks under the same amount of energy allocation. If the voltage settings that consume minimum energy for periodic tasks are selected to schedule periodic tasks during P intervals, less energy is consumed than when the worst-case voltage settings for mixed tasks are applied throughout, regardless of the pattern of event occurrences. Consequently, whenever a P interval begins, switching the running modes from the set for mixed tasks to the set for only periodic tasks can reduce energy consumption. In an extreme case, a busy cycle may consist of only periodic tasks without any aperiodic task, i.e., only P intervals. If the periodic tasks in P intervals are scheduled with the voltage settings that use the minimum energy budget, the energy consumption is markedly lower than with the running modes that periodic tasks use in the mixed case.

Thus, to exploit the energy benefit of switching from the worst-case schedule to the minimum-energy schedule during P intervals, we build elementary schedules named S1 and S2 for the two kinds of intervals, respectively, and develop a dynamic scheduling model based on them, in which every first and last occurrence of aperiodic events in a sequence of scheduling events is the turning point between the elementary schedules.
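The P/PAP pattern described in Section 4.2.4 can be sketched as a small labeling routine over one busy cycle. This is an illustrative sketch only: the event encoding (`"arrive"` and `"complete"` for aperiodic tasks) and the function name `label_intervals` are assumptions, and periodic events are left implicit because they do not change the label.

```python
def _append(intervals, start, end, label):
    """Append [start, end) with its label, merging with an adjacent equal label."""
    if intervals and intervals[-1][2] == label and intervals[-1][1] == start:
        intervals[-1] = (intervals[-1][0], end, label)
    else:
        intervals.append((start, end, label))

def label_intervals(cycle_start, cycle_end, aperiodic_events):
    """Split one busy cycle into P and PAP intervals.

    aperiodic_events: list of (time, kind), kind in {"arrive", "complete"}.
    The cycle is in a PAP interval while at least one aperiodic task has
    arrived and not yet completed; otherwise it is in a P interval.
    """
    intervals = []
    pending = 0
    t_prev = cycle_start
    for t, kind in sorted(aperiodic_events):
        if t > t_prev:
            _append(intervals, t_prev, t, "PAP" if pending > 0 else "P")
        pending += 1 if kind == "arrive" else -1
        t_prev = t
    if cycle_end > t_prev:
        _append(intervals, t_prev, cycle_end, "PAP" if pending > 0 else "P")
    return intervals

if __name__ == "__main__":
    events = [(2, "arrive"), (3, "arrive"), (5, "complete"), (6, "complete")]
    print(label_intervals(0, 10, events))
```

In the usage above the busy cycle opens with a P interval, as BC1 does in Figure 4-1 (c); a cycle whose first event is an aperiodic arrival would instead open with a PAP interval, as BC2 does.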
4.3 Dynamic Dual-Policy Scheduling Model

In this section, we consider VCS-EDF scheduling to improve energy efficiency in real-time embedded systems having mixed periodic and aperiodic tasks. An energy-efficient on-line scheduling scheme under a bounded energy budget can extend the service time of an energy-confined system. Since power consumption and execution time trade off against each other under voltage-clock scaling, utilization, schedulability, and response time are tightly coupled and must be considered together in real-time systems. In addition, the bounded energy budget must be properly shared among tasks to satisfy the time constraints of periodic tasks and to give aperiodic tasks the fast response time that a real-time system pursues as a performance criterion.

Based on the relationship between energy consumption and task utilization in the VCS-EDF scheme, we use a task-based profiling of energy and utilization demands for both periodic and aperiodic tasks and adopt the energy budget allocation model [83]. Based on the event pattern that occurs when mixed periodic and aperiodic tasks are scheduled, and on the relationship between energy and processor utilization demands, we develop an on-line scheduling scheme named dual-policy dynamic scheduling, which reduces energy consumption by switching scheduling policies according to the events of aperiodic tasks.

Triggered by new aperiodic arrivals after P intervals and by the last execution of an aperiodic task in PAP intervals, the event pattern of a busy cycle alternates between two kinds of interval, one consisting of only periodic tasks and the other of mixed tasks. Corresponding to this pattern, we build elementary schedules named S1 and S2, which are applied to P and PAP intervals, respectively. We assume that switching between elementary schedules incurs negligible overhead, like context switching or voltage switching between tasks.
This is analogous to the assumption made in classical real-time scheduling theory that preemption costs are negligible.

4.3.1 Schedule for Only Periodic Tasks

Energy is consumed only by periodic tasks. The voltage settings are determined to minimize the energy consumed by periodic tasks while keeping them schedulable within the allocated energy budget. Given a set of periodic tasks, a set of voltage settings S1: $\{m_i\}$ satisfying these conditions uniquely exists. Since these voltage settings demand the minimum energy consumption under the allocated energy budget, we define them as the best-case schedule for periodic tasks with respect to energy saving. For schedulability under EDF, the tasks must be scheduled such that the utilization does not exceed unity. Therefore, we define min $E_P$ as the energy demand and max $U_P$ as the worst-case utilization when there exists a set of voltage settings S1: $\{m_i\}$ such that the worst-case utilization satisfies

\[ U_P(m_i) = \sum_i \frac{1}{\alpha(m_i)} \frac{C_i}{T_i} \le 1 \quad \text{and} \quad E_P(m_i) = \sum_i p(m_i) \frac{C_i}{\alpha(m_i)\, T_i} \ \text{is minimized.} \qquad (4\text{-}2) \]

For the energy constraint to be feasible, $E_C$ must be greater than min $E_P$.

4.3.2 Schedule for Both Periodic and Aperiodic Tasks

Periodic tasks share a constrained energy budget with aperiodic tasks, taking into account fast responsiveness for the aperiodic tasks. The voltage settings are chosen to achieve minimal response times for all aperiodic tasks while guaranteeing that the deadlines of periodic tasks are met. Each aperiodic task $J_k$ gets a finite deadline $d_k$ according to the deadline assignment algorithm of the total bandwidth server. The voltage settings S2: $\{m_i\}$ are determined according to the constrained energy allocation model proposed in Chapter 3. Since these voltage settings demand the worst-case energy consumption under the allocated energy budget, we define them as the worst-case schedule for periodic tasks.
Voltage settings must be determined such that the energy consumption satisfies the constraint $E_P + E_A \le E_C$, while guaranteeing the schedulability of periodic tasks and minimizing the average response time of aperiodic tasks. Using the algorithm developed in Chapter 3, the energy allocation factors, / and y, the voltage settings for periodic tasks, S2: $\{m_i\}$, and the percentage of H-mode assignment for aperiodic tasks, $X_H$, are determined under the bounded energy consumption $E_C$. The optimization problem of finding voltage settings for periodic tasks can be stated as follows:

Pick the task subsets H and L for the voltage settings of H-mode and L-mode such that

\[ H \cup L = \{\tau_1, \tau_2, \ldots, \tau_n\}, \qquad H \cap L = \emptyset, \]

and

\[ U_P(m_i) = \sum_i \frac{1}{\alpha(m_i)} \frac{C_i}{T_i} \]

is minimized subject to the well-known sufficient condition¹ for the schedulability of periodic tasks under EDF, i.e.,

\[ \frac{1}{\alpha_H} \sum_{i \in H} \frac{C_i}{\min(T_i, D_i)} + \frac{1}{\alpha_L} \sum_{i \in L} \frac{C_i}{\min(T_i, D_i)} \le 1 \qquad (4\text{-}3) \]

and the energy consumption constraint

\[ \frac{p_H}{\alpha_H} \sum_{i \in H} \frac{C_i}{\min(T_i, D_i)} + \frac{p_L}{\alpha_L} \sum_{i \in L} \frac{C_i}{\min(T_i, D_i)} \le E_P. \]

This optimization problem can be treated as equivalent to the decision problem of subset sum, which is NP-complete. Consequently, efficient search heuristics, e.g., branch-and-bound algorithms, should be employed to find a solution if $n$ is large. For scheduling aperiodic tasks, the total bandwidth algorithm is used based on 7y1, which is chosen to give the best responsiveness within the bounded energy budget. According to the schedulability condition $U_A + U_P \le 1$ of the total bandwidth server, the utilization of aperiodic tasks is determined by $U_A = 1 - U_P$, where $U_P$ is determined by the voltage settings in equation (4-3).

4.3.3 Schedules for Transition between Elementary Schedules

Since the two elementary schedules, S1: $\{m_i\}$ and S2: $\{m_i\}$, consist of different voltage settings, a transition between them, from S1 to S2 or from S2 to S1, cannot be accomplished simply by applying different voltage settings to the tasks.
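To see why the two elementary schedules end up with different voltage settings, the selection problems of Sections 4.3.1 and 4.3.2 can be sketched with plain exhaustive enumeration. Note that this swaps in brute force for the branch-and-bound heuristics the text recommends, so it is suitable only for small $n$. The speeds, power levels, task set, and function names are hypothetical, and the density $C_i/\min(T_i, D_i)$ is used for both schedules, which coincides with $C_i/T_i$ when $D_i = T_i$.

```python
from itertools import product

# Hypothetical speeds and power levels for the two modes; not from the source.
ALPHA = {"L": 0.5, "H": 1.0}
POWER = {"L": 1.0, "H": 4.0}

def evaluate(tasks, modes, alpha=ALPHA, power=POWER):
    """Return (utilization, energy rate) for tasks given as (C_i, T_i, D_i)."""
    u = e = 0.0
    for (c, t, d), m in zip(tasks, modes):
        density = c / (alpha[m] * min(t, d))
        u += density
        e += power[m] * density
    return u, e

def elementary_schedules(tasks, e_p_budget):
    """Enumerate all H/L assignments and pick
    S1: minimum energy subject to utilization <= 1, and
    S2: minimum utilization subject to utilization <= 1 and energy <= e_p_budget.
    Each result is (modes, utilization, energy) or None if infeasible."""
    s1 = s2 = None
    for modes in product("LH", repeat=len(tasks)):
        u, e = evaluate(tasks, modes)
        if u > 1.0:
            continue  # violates the EDF schedulability condition
        if s1 is None or e < s1[2]:
            s1 = (modes, u, e)
        if e <= e_p_budget and (s2 is None or u < s2[1]):
            s2 = (modes, u, e)
    return s1, s2

if __name__ == "__main__":
    tasks = [(1.0, 4.0, 4.0), (1.0, 5.0, 5.0), (1.0, 8.0, 8.0)]
    s1, s2 = elementary_schedules(tasks, e_p_budget=2.1)
    print("S1:", s1)
    print("S2:", s2)
```

For this assumed task set, S1 keeps two tasks in L-mode while S2 moves one more task to H-mode to free bandwidth for the aperiodic server, so a task can be in L-mode under S1 and H-mode under S2, which is exactly the situation that creates a backlog at a switching point.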
The workload remaining at a switching decision point may cause the target schedule to violate schedulability some time later. To prevent this violation, two transitory schedules are proposed in this section: S12 for switching from schedule S1 to S2, and S21 for switching from S2 to S1.

¹ The condition is also necessary if $D_i \ge T_i$ for all $i$.

4.3.3.1 Energy Saving

In P intervals, selecting the running modes for periodic tasks from S1 instead of S2 reduces energy consumption, because tasks that are assigned to H-mode in schedule S2 are executed in L-mode. Compared with the energy consumed by the worst-case schedule S2, the saving comes from the difference between the two energy levels at which a task runs in the two schedules. The amount of saved energy is proportional to the amount of execution time spent in the switched running mode. However, the remaining workload, increased by running periodic tasks at the low speed assigned in S1: $\{m_i\}$ instead of the high speed assigned in S2: $\{m_i\}$, is expected to affect not only the schedulability of scheduling under S2 but also the average response time once an aperiodic task arrives.

4.3.3.2 Backlogged Workloads at Switching Points

Because of its higher operating frequency, high-speed execution finishes a given execution demand, which consists of a series of instructions, earlier than low-speed execution. Thus, in terms of the workload accomplished after a certain passage of time, low-speed execution of a task leaves more workload to execute than high-speed execution does. The difference in remaining workload is proportional to the difference between the operating frequencies of the two execution modes. For a set of periodic tasks in the proposed dual-policy scheduling, the voltage settings of both elementary schedules, S1: $\{m_i\}$ and S2: $\{m_i\}$, are determined as two sets of mixed H-mode and L-mode assignments.
The workload assigned to H-mode in schedule S1 is less than in schedule S2, since periodic tasks in S2 are more likely to be assigned to H-mode because they share processing capacity with aperiodic tasks under the total bandwidth server algorithm, $U_P(S2) \le 1 - U_A(S2)$. In other words, when a periodic task that is assigned to H-mode in S2 is executed in L-mode during a P interval (under schedule S1), the low-speed execution may not finish the execution demand, and the unfinished portion accumulates as workload backlogged to the next policy. During P intervals, the more periodic tasks are executed in L-mode under S1 instead of H-mode under S2, the more workload remains unexecuted. Conversely, when a periodic task assigned to L-mode in S2 is executed in H-mode during a P interval (under schedule S1), the faster execution finishes the execution demand earlier at the cost of more energy consumption. By the definition of schedules S1 and S2, the workload difference accumulated over P intervals varies with the total execution time of those tasks, among the ones encountered in the intervals, whose voltage settings differ between the two schedules. We define WD(t) as the difference in workload at time $t$ between the current scheduling policy and the target worst-case scheduling policy to which the scheduler is going to switch.

4.3.3.3 Schedulability Analysis at Switching Points of Scheduling Policies

When an aperiodic task arrives after a P interval and the dynamic scheduler changes its policy to schedule S2, there can be more workload than schedule S2 can afford while scheduling mixed periodic and aperiodic tasks and preserving schedulability. If the workload at the switching point is greater than that of the target worst-case schedule to which the scheduler switches, at least one of the periodic tasks will miss its deadline.
The workload left unfinished by running periodic tasks under S1 instead of S2 during P intervals should be completed before changing to schedule S2, so as not to reach an unschedulable state. As with switching from S1 to S2 at the end of a P interval, when switching from S2 to S1 at the end of a PAP interval there may also be a difference in remaining workload because of the different voltage settings of the two schedules. Therefore, for real-time schedulability, the workload handed over from the previous schedule at a policy switching point must not be higher than the workload affordable by the target worst-case schedule. If it is higher, an effective scheme is needed to keep the system schedulable. If the remaining workload of the current schedule is less than what the worst case of the schedule to be taken can afford, switching schedules requires no further action to preserve schedulability.

Before proposing transitory schedules between the two elementary schedules, we explain how a nonzero workload difference between two scheduling policies endangers the schedulability of VCS-EDF scheduling. Based on the concept of processor demand for hard real-time tasks [80], we extend the schedulability analysis to a periodic task set whose utilization $U_P$ lies in the range $\alpha_L \le U_P < \alpha_H$ in L-mode.

Feasibility Analysis of EDF Scheduling

Spuri et al. define two concepts for analyzing the feasibility of real-time task sets: the processor demand, a focused measure of how much computation is requested, with respect to timing constraints, in a given interval of time; and the loading factor, the maximum fraction of processor time possibly demanded by the task set in any interval of time. Given a set of real-time jobs and an interval of time $[t_1, t_2)$, the processor demand and the loading factor of the job set on $[t_1, t_2)$ are respectively defined as

\[ h_{[t_1,t_2)} = \sum_{t_1 \le r_k,\; d_k \le t_2} C_k \quad \text{and} \quad u_{[t_1,t_2)} = \frac{h_{[t_1,t_2)}}{t_2 - t_1} \quad [80]. \]

Theorem 4.1 (Spuri) Each set of real-time tasks is feasibly scheduled by EDF if and only if $u \le 1$ [80].

By showing that the loading factor $u$ of the task set is equal to $U$, Spuri proves the feasibility result for a task set under EDF scheduling given by Liu and Layland [33], as follows.

Corollary 4.1 (Liu and Layland) Any set of $n$ synchronous periodic tasks with processor utilization $U = \sum_{i=1}^{n} C_i/T_i$ is feasibly scheduled by EDF if and only if $U \le 1$.

Proof. The thesis follows from Theorem 4.1. For any interval $[t_1, t_2)$:

\[ h_{[t_1,t_2)} = \sum_{t_1 \le r_k,\; d_k \le t_2} C_k \le \sum_{i=1}^{n} \left\lfloor \frac{t_2 - t_1}{T_i} \right\rfloor C_i \le (t_2 - t_1) \sum_{i=1}^{n} \frac{C_i}{T_i}, \]

that is, $u_{[t_1,t_2)} \le U$. Now let $t_1 = 0$ and $t_2 = \mathrm{lcm}(T_1, \ldots, T_n)$:

\[ u_{[t_1,t_2)} = \frac{h_{[t_1,t_2)}}{t_2} = \frac{1}{t_2} \sum_{i=1}^{n} \frac{t_2}{T_i} C_i = U. \]

It follows that $u = U$.

Feasibility Analysis of Two-mode VCS-EDF Scheduling

In the same way, in the VCS-EDF scheduling given by equations (4-2) and (4-3), the processor demand and the loading factor of a task set assigned to either H-mode or L-mode on the interval $[t_1, t_2)$ are respectively defined as

\[ h_{[t_1,t_2)} = \frac{1}{\alpha_H} \sum_{\substack{t_1 \le r_k,\; d_k \le t_2 \\ k \in H}} C_k + \frac{1}{\alpha_L} \sum_{\substack{t_1 \le r_k,\; d_k \le t_2 \\ k \in L}} C_k \quad \text{and} \quad u_{[t_1,t_2)} = \frac{h_{[t_1,t_2)}}{t_2 - t_1}. \qquad (4\text{-}4) \]

Corresponding to Corollary 4.1, the feasibility of VCS-EDF scheduling is given by the following corollary.

Corollary 4.2 Any set of $n$ synchronous periodic tasks with processor utilization

\[ U = \frac{1}{\alpha_H} \sum_{i \in H} \frac{C_i}{T_i} + \frac{1}{\alpha_L} \sum_{i \in L} \frac{C_i}{T_i} \]

is feasibly scheduled by EDF if and only if $U \le 1$.

Proof. The thesis follows from Theorem 4.1. For any interval $[t_1, t_2)$:

\[ h_{[t_1,t_2)} \le (t_2 - t_1) \left( \frac{1}{\alpha_H} \sum_{i \in H} \frac{C_i}{T_i} + \frac{1}{\alpha_L} \sum_{i \in L} \frac{C_i}{T_i} \right), \]

that is, $u_{[t_1,t_2)} \le U$. Now let $t_1 = 0$ and $t_2 = \mathrm{lcm}(T_1, \ldots, T_n)$:

\[ \frac{h_{[t_1,t_2)}}{t_2} = \frac{1}{\alpha_H} \sum_{i \in H} \frac{C_i}{T_i} + \frac{1}{\alpha_L} \sum_{i \in L} \frac{C_i}{T_i} = U. \]

It follows that $u = U$.

Feasibility Analysis of Dual-Policy Dynamic Scheduling

By its definition, the processor demand also corresponds to the workload in any interval of time. Let $WD_T(t)$ be the workload difference, i.e., the workload backlogged to policy $T$, at time $t$, defined as

\[ WD_T(t) = h^{C}_{[0,t)} - h^{T}_{[0,t)}, \]

where $C$ stands for the current scheduling policy and $T$ for the targeted one.
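The workload difference just defined can be computed directly from a job list. The sketch below is one possible reading, stated as an assumption: the demand under the current policy is evaluated with each job's real execution demand, and the demand under the target policy with its worst-case demand, as the chapter later describes for WD1(t) and WD2(t). The job tuple layout, the speed values, and the function names are hypothetical.

```python
# Hypothetical operating speeds for the two modes; not from the source.
ALPHA = {"L": 0.5, "H": 1.0}

def processor_demand(jobs, modes, t1, t2, use_actual, alpha=ALPHA):
    """Two-mode processor demand on [t1, t2), in the spirit of equation (4-4).

    jobs:  list of (task_id, release, deadline, wcet_work, actual_work).
    modes: dict mapping task_id to "L" or "H".
    Only jobs released at or after t1 with deadline at or before t2 count.
    """
    total = 0.0
    for task_id, release, deadline, wcet_work, actual_work in jobs:
        if t1 <= release and deadline <= t2:
            work = actual_work if use_actual else wcet_work
            total += work / alpha[modes[task_id]]
    return total

def workload_difference(jobs, current_modes, target_modes, t):
    """WD_T(t) = h^C_[0,t) - h^T_[0,t): real demand under the current
    policy minus worst-case demand under the target policy."""
    h_current = processor_demand(jobs, current_modes, 0.0, t, use_actual=True)
    h_target = processor_demand(jobs, target_modes, 0.0, t, use_actual=False)
    return h_current - h_target

if __name__ == "__main__":
    s1 = {"j": "L"}  # task j runs in L-mode under S1
    s2 = {"j": "H"}  # and in H-mode under S2
    jobs = [("j", 0.0, 10.0, 2.0, 1.5)]
    print(workload_difference(jobs, s1, s2, 10.0))
```

For a single job of a task that runs in L-mode under the current policy and H-mode under the target, the result reduces to $e_j/\alpha_L - C_j/\alpha_H$, so a positive value signals a backlog and a non-positive value means the job's slack has absorbed the slower execution.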
Theorem 4.2 Any set of $n$ synchronous periodic tasks with processor utilizations

\[ U^{S1} = \frac{1}{\alpha_H} \sum_{i \in H_{S1}} \frac{C_i}{T_i} + \frac{1}{\alpha_L} \sum_{i \in L_{S1}} \frac{C_i}{T_i} \quad \text{and} \quad U^{S2} = \frac{1}{\alpha_H} \sum_{i \in H_{S2}} \frac{C_i}{T_i} + \frac{1}{\alpha_L} \sum_{i \in L_{S2}} \frac{C_i}{T_i}, \]

where $U^{S2} < U^{S1}$ and $U^{S2} + U_A \le 1$, whose periodic tasks are scheduled by S1 up to time $t$ with real execution demand $e_j$ for task $j$, is not feasibly scheduled by S2 after time $t$ if $WD_2(t) > 0$.

Proof. Suppose that in a P interval $[t_1, t_2)$, which has only periodic events, there is at least one arrival of a periodic task $\tau_j$ that is assigned to H-mode in schedule S2 and to L-mode in schedule S1. The processor demand of schedule S1 at $t_2$ then differs from that of schedule S2 only in the contribution of $\tau_j$, which is executed in L-mode with real demand $e_j$ instead of in H-mode with worst-case demand $C_j$:

\[ WD_2(t_2) = h^{S1}_{[t_1,t_2)} - h^{S2}_{[t_1,t_2)} = \frac{t_2 - t_1}{T_j} \left( \frac{e_j}{\alpha_L} - \frac{C_j}{\alpha_H} \right) = \frac{t_2 - t_1}{T_j} \cdot \frac{\alpha_H e_j - \alpha_L C_j}{\alpha_H \alpha_L}. \]

Now let $t_1 = 0$ and $t_2 = \mathrm{lcm}(T_1, \ldots, T_n)$. If $WD_2(t_2) > 0$, then $\alpha_H e_j - \alpha_L C_j > 0$ and

\[ \frac{h^{S1}_{[t_1,t_2)}}{t_2} = \frac{h^{S2}_{[t_1,t_2)}}{t_2} + \frac{\alpha_H e_j - \alpha_L C_j}{\alpha_H \alpha_L T_j}, \qquad U^{S1} = U^{S2} + \frac{\alpha_H e_j - \alpha_L C_j}{\alpha_H \alpha_L T_j} > U^{S2}. \qquad (4\text{-}5) \]

Otherwise, $\alpha_H e_j - \alpha_L C_j \le 0$ and

\[ U^{S1} = U^{S2} + \frac{\alpha_H e_j - \alpha_L C_j}{\alpha_H \alpha_L T_j} \le U^{S2}. \]

At time $t = t_2$, the workload handed over from schedule S1 to schedule S2 makes the utilization under S2 higher than $U^{S2}$, so scheduling by S2 is not feasible from that time on. If task $j$, which is assigned to H-mode in S2, is executed in L-mode under schedule S1 during P intervals before any aperiodic task arrives, there is a nonzero difference of $(\alpha_H e_j - \alpha_L C_j)/(\alpha_H \alpha_L T_j)$ between the utilizations of the two schedules. From the schedulability point of view, when schedule S2 is taken upon the arrival of an aperiodic task, this difference makes S2 unschedulable, so that at least one of the tasks can miss its deadline. ∎

The question, then, for the schedulability of the target schedule, is how the workload handed over at time $t$ can be effectively reduced before switching to the target schedule S1 or S2. There is one major factor that reduces WD(t) to zero or below.
Considering that a periodic task usually consumes less execution time than its WCET, the WD(t) between a P interval and a PAP interval will not be very high. Thus, if the slack time arising from the difference between the WCET and the actual execution demand nullifies WD(t) or makes it negative, the switch can be performed directly, without further processing to maintain schedulability. Otherwise, an efficient scheme is required to reduce the workload difference. The scheme should not only guarantee schedulability in the transitory intervals between schedules S1 and S2 but also lead to schedulable dynamic scheduling under the target policy. We therefore introduce transitory schedules between the elementary schedules. In addition to the use of slack time to lower WD(t), if tasks assigned to low-speed mode in the target schedule are executed in high-speed mode, the accelerated executions also reduce the handed-over workload. Running all tasks in high-speed mode during the transitory intervals is schedulable, at the cost of more energy consumption, since the workload of the task set lies in the range $\alpha_L \le U_P \le \alpha_H$ and the utilization in this case becomes min $U_P$, which is

\[ \min U_P = \frac{1}{\alpha_H} \sum_{i=1}^{n} \frac{C_i}{T_i}, \]

satisfying schedulability, since

\[ U_P^{S1\,\text{or}\,S2} = \frac{1}{\alpha_H} \sum_{i \in H} \frac{C_i}{T_i} + \frac{1}{\alpha_L} \sum_{i \in L} \frac{C_i}{T_i}, \quad \text{where } \min U_P \le U_P^{S1\,\text{or}\,S2}. \]

4.3.3.4 Modified TBS in Dual-Policy Dynamic Scheduling

For switching policies from schedule S1 to S2 upon the arrival of an aperiodic task, we modify the TBS algorithm to guarantee schedulability. If $WD_2(t) > 0$ according to equations (4-4) and (4-5), a TBS with bandwidth $U_A$ becomes unschedulable by EDF. Thus, when $WD_2(t) > 0$, the Modified TBS (MTBS) overcomes the unschedulable state by assigning infinite deadlines to aperiodic tasks, at the cost of increased response time, which makes the utilization $U_A$ effectively zero.
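The MTBS rule described above can be sketched as a single function. This is a minimal sketch, not the author's implementation: the function name `mtbs_deadline` is hypothetical, and the choice that the caller passes the last finite deadline as `d_prev` is an assumption, since the text does not say how $d_{k-1}$ is tracked while deadlines are infinite.

```python
import math

def mtbs_deadline(r_k, c_k, d_prev, u_a, wd2):
    """Deadline for the k-th aperiodic request under the modified TBS.

    While the workload backlogged to schedule S2 is positive (wd2 > 0),
    the request is held back with an infinite deadline; otherwise it gets
    the usual total-bandwidth deadline max(r_k, d_prev) + C_k / U_A.
    d_prev is assumed to be the last finite deadline assigned so far.
    """
    if wd2 > 0:
        return math.inf
    return max(r_k, d_prev) + c_k / u_a

if __name__ == "__main__":
    print(mtbs_deadline(1.0, 1.0, 2.0, 0.5, wd2=0.3))  # held back
    print(mtbs_deadline(1.0, 1.0, 2.0, 0.5, wd2=0.0))  # regular TBS deadline
```

Under EDF, an infinite deadline places the request behind every periodic job, so the periodic tasks can work off the backlog first; the price is the waiting time added to the aperiodic response.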
Depending on the state of $WD_2(t)$ in Dual-Policy Dynamic Scheduling, for the kth aperiodic request arriving at time $t = r_k$, MTBS assigns the task deadline

$$d_k = \max(r_k, d_{k-1}) + \frac{C_k}{U_A} \quad \text{when } WD_2(t) \le 0 \quad (4\text{-}6)$$

$$d_k = \infty \quad \text{when } WD_2(t) > 0. \quad (4\text{-}7)$$

Assigning an infinite deadline to an aperiodic task means that the scheduler regards the arrived task as not ready to be executed. At the moment when schedulability under the target elementary schedule is guaranteed, the aperiodic tasks that have arrived and are waiting in the scheduler are activated. Because of the waiting time between arrival and activation, the responsiveness of aperiodic tasks is directly affected by the postponed execution of arrived aperiodic tasks.

4.3.3.5 Transitory Schedule S12

With any arrival of an aperiodic task after a P interval, the scheduling policy needs to be switched to the S2 schedule. Given the workload difference $WD_2(t)$ between the current scheduling and S2 scheduling, if $WD_2(t)$ is greater than zero, the target scheduling policy cannot replace S1 scheduling immediately. Instead, the transitory schedule S12 is selected, in which all periodic tasks are executed in H-mode and the deadlines of aperiodic tasks are set to infinity until $WD_2(t)$ becomes less than or equal to zero. However, the extra work accumulated by having run some periodic tasks in L-mode instead of H-mode degrades the average response time of arriving aperiodic tasks. At the point where the work difference becomes zero, all tasks, including aperiodic tasks, are finally scheduled by the target schedule S2.

4.3.3.6 Transitory Schedule S21

After an aperiodic task finishes, schedule S1 can be chosen for the events of periodic tasks if the schedule can afford the workload. Because staying in schedule S2 after finishing an aperiodic task consumes more energy than scheduling tasks using S1, it is more efficient from the energy consumption point of view to switch the scheduling policy from S2 to S1.
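The MTBS deadline assignment rule above can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: the function names are hypothetical, and the activation step assumes that the activation instant takes the place of the release time for requests that were held with infinite deadlines, which the text does not state explicitly.

```python
import math


def mtbs_deadline(r_k, d_prev, c_k, u_a, wd2):
    """Deadline of the k-th aperiodic request under MTBS.

    r_k: arrival time, d_prev: deadline of the previous request,
    c_k: execution demand, u_a: aperiodic bandwidth U_A,
    wd2: current workload difference WD2(t).
    """
    if u_a <= 0:
        raise ValueError("u_a must be positive")
    if wd2 > 0:
        # Target schedule S2 is not yet schedulable: hold the request.
        return math.inf
    return max(r_k, d_prev) + c_k / u_a


def activate_pending(pending, t_now, d_prev, u_a):
    """Give finite deadlines to held requests once WD2(t) <= 0.

    pending: list of (name, c_k) in arrival order.
    Assumption: t_now (the activation instant) replaces the release time.
    """
    deadlines = []
    for name, c_k in pending:
        d_prev = mtbs_deadline(t_now, d_prev, c_k, u_a, wd2=0.0)
        deadlines.append((name, d_prev))
    return deadlines


# Hypothetical requests with bandwidth U_A = 0.25.
print(mtbs_deadline(r_k=10.0, d_prev=8.0, c_k=2.0, u_a=0.25, wd2=0.5))
print(activate_pending([("J1", 1.0), ("J2", 2.0)], t_now=20.0, d_prev=18.0, u_a=0.25))
```

Returning an infinite deadline keeps a held request at the back of an EDF queue, which matches the description of the request being treated as not ready.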
Like transitory schedule S12, all tasks are executed in H-mode until $WD_1(t)$ becomes less than or equal to zero.

4.4 Dual-Policy Dynamic Scheduling Using VCS-EDF

In the Dual-Policy Dynamic Scheduling model, the switching instants are basically at the boundaries where the intervals P and PAP meet. They are the moments of the first arrival of an aperiodic task after an idle cycle or after an interval P, and of the last completion of an aperiodic task in the scheduler. Since the voltage settings differ between the two scheduling policies, the workload difference at each switching point needs to be checked to assure schedulability, which makes the workload difference $WD(t)$ the main criterion for judging schedulability at switching point t. $WD_1(t)$ is the workload difference between real executions and worst-case executions when the S1 schedule is taken as the target schedule from the instant t. Likewise, $WD_2(t)$ is the workload difference when the S2 schedule is taken at t.

4.4.1 Terms and Conditions

Let $t_B$, $t_A$, $t_T$, $t_F$, and $t_P$ be the instants at which a switch of scheduling policy is decided, based on the events of aperiodic tasks and the state of the workload difference. A busy cycle can be closed at any of these instants except $t_B$. Also, $T_{idle}$, $T_{BA}$, $T_{AT}$, $T_{TF}$, $T_{FP}$, and $T_{PA}$ are the intervals into which these instants divide a busy cycle.

$t_B$: the instant at which a busy cycle begins with the arrival of either a periodic or an aperiodic task, closing the idle cycle.
$t_A$: the instant at which the aperiodic task $J_k$ arrives, so that the event pattern includes both periodic and aperiodic tasks.
$t_T$: the instant at which $WD_2(t_T)$ becomes less than or equal to zero, so that the scheduling policy can be switched to the S2 schedule.
$t_F$: the instant at which all aperiodic tasks in the scheduling queue are finished, so that no more aperiodic tasks are waiting in the scheduler.
$t_P$: the instant at which $WD_1(t_P)$ becomes less than or equal to zero, so that the scheduling policy can be switched to the S1 schedule.
$T_{idle}$: the interval with no task arrival, from any of $t_A$, $t_T$, $t_P$, or $t_F$ to $t_B$.
$T_{BA}$: the interval from time $t_B$ to $t_A$.
$T_{AT}$: the interval from time $t_A$ to $t_T$.
$T_{TF}$: the interval from time $t_T$ to $t_F$.
$T_{FP}$: the interval from time $t_F$ to $t_P$.
$T_{PA}$: the interval from time $t_P$ to $t_A$.

If the bounded energy consumption budget is given as $E_C$, then $E_C$ must fall into the range $E_{min} \le E_C \le E_{max}$, where $\min E_P$, $\min E_A$, $\max E_P$, and $\max E_A$ are defined as follows.

$E_{max}$: the energy consumption when all periodic and aperiodic tasks are run in H-mode, so that the processor runs at the fast clock rate all the time. $E_{max} = \max E_P + \max E_A$, where

$$\max E_P = P_H\,\frac{1}{\alpha_H}\sum_{i}\frac{C_i}{T_i} \quad\text{and}\quad \max E_A = P_H\,\frac{1}{\alpha_H}\,U_A.$$

$E_{min}$: $E_{min} = \min E_P + \min E_A$, where $\min E_P$ is determined by (4-3) when the sum of the utilizations of periodic and aperiodic tasks takes any value in the range from $\alpha_L$ to $\alpha_H$. As for $\min E_A$, it is the minimum energy consumption when all aperiodic tasks are run in L-mode, i.e.

$$\min E_A = P_L\,\frac{1}{\alpha_L}\,U_A.$$

$E_{diff}$: the difference between the maximum and minimum energy consumption by mixed tasks, $E_{diff} = E_{max} - E_{min}$.

4.4.2 Switching Scheduling Policies

In Figure 4-2, we illustrate an example of switching scheduling policies in a busy cycle that starts with the arrival of a periodic task, followed by aperiodic and periodic tasks. It also shows when each of the schedules in dual-policy dynamic scheduling is activated in the busy cycle according to the switching criteria. Basically, switching decisions between scheduling policies are made at the moment of the first arrival of an aperiodic task in the interval PAP within the busy cycle ($t_A$) and at the moment when all of the arrived aperiodic tasks finish ($t_F$), based on the status of the workload differences between real executions and worst-case demands, $WD_1(t)$ and $WD_2(t)$. When the work differences do not satisfy the switching condition, the transitory schedules S12 and S21 are chosen. Finally, the switches to the target elementary schedules are made at $t_T$ and $t_P$. All of the intervals except $T_{idle}$ and $T_{BA}$ recur in turn with the upcoming events of mixed tasks in a busy cycle, and another round of intervals starts after $T_{idle}$.
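The switching rule described in this section can be sketched as a small selection function. This is a hedged illustration under stated assumptions: the names Schedule and select_schedule, the boolean flag for waiting aperiodic tasks, and the trace values are all hypothetical, and the workload differences are assumed to be supplied by the scheduler rather than computed here.

```python
from enum import Enum


class Schedule(Enum):
    S1 = "S1"    # minimum-energy schedule, periodic tasks only
    S2 = "S2"    # worst-case schedule, periodic and aperiodic tasks
    S12 = "S12"  # transitory (all H-mode) on the way from S1 to S2
    S21 = "S21"  # transitory (all H-mode) on the way from S2 to S1


def select_schedule(aperiodic_pending, wd1, wd2):
    """Pick the schedule to run from the current instant.

    aperiodic_pending: True while aperiodic tasks are waiting (PAP interval).
    wd1, wd2: workload differences WD1(t) and WD2(t) for targets S1 and S2.
    """
    if aperiodic_pending:
        # Target is S2; it is taken only once WD2(t) <= 0.
        return Schedule.S2 if wd2 <= 0 else Schedule.S12
    # Target is S1; it is taken only once WD1(t) <= 0.
    return Schedule.S1 if wd1 <= 0 else Schedule.S21


# Hypothetical trace of one busy cycle: (instant, aperiodic_pending, WD1, WD2).
trace = [
    ("tB", False, 0.0, 0.0),
    ("tA", True, 0.0, 1.5),
    ("tT", True, 0.0, 0.0),
    ("tF", False, 0.8, 0.0),
    ("tP", False, 0.0, 0.0),
]
chosen = [(name, select_schedule(p, w1, w2).value) for name, p, w1, w2 in trace]
print(chosen)
```

When both workload differences are non-positive at $t_A$ and $t_F$, the same function returns S2 and S1 directly, so the transitory schedules are skipped.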
[Figure 4-2: timeline of a busy cycle with alternating P and PAP intervals and the schedule selected in each interval (Idle, S1, S12, S2, S21, S1). Marked instants: $t_B$, beginning of a busy cycle with a periodic task; $t_A$, arrival of aperiodic task $J_k$ with $WD_2(t_A) > 0$; $t_T$, switch to S2 since $WD_2(t_T) \le 0$; $t_F$, finish of all aperiodic tasks with $WD_1(t_F) > 0$; $t_P$, switch to S1 since $WD_1(t_P) \le 0$.]

The selection of scheduling policy:
S1: schedule consuming minimum energy for periodic tasks
S2: schedule consuming the worst-case energy for both periodic and aperiodic tasks
S12: transitory schedule for switching from S1 to S2
S21: transitory schedule for switching from S2 to S1

Figure 4-2 Switching schedules for a busy cycle starting with a periodic task and nonzero WD

The detailed functions for each instant and interval in Dual-Policy Dynamic Scheduling when a periodic task opens a busy cycle are as follows:

At time $t_B$: The arrival of a periodic task starts a P interval and initiates scheduling by the S1 schedule, since there is no aperiodic task to execute. If an aperiodic task arrives at $t_B$ instead, the PAP interval starts directly without going through $T_{BA}$, so that $WD_2(t_B) = WD_2(t_A)$; this case is shown in Figure 4-4.

For interval $T_{BA}$: As there is no aperiodic task, periodic tasks are executed in the running modes that consume minimum energy without violating their schedulability, S1: $\{m_i\}$, instead of in the modes of the worst-case schedule, S2: $\{m_i\}$.

At time $t_A$: An aperiodic task $J_k$ arrives. It is time to return from the S1 schedule to the worst-case schedule S2, so the target schedule is now S2. The state of $WD_2(t)$ should be checked at this instant. The arrived aperiodic task gets either a finite deadline based on the MTBS algorithm when $WD_2(t_A) \le 0$ or an infinite deadline when $WD_2(t_A) > 0$. If $WD_2(t_A) \le 0$, this instant coincides exactly with the instant $t_T$.
If $WD_2(t_A) > 0$, which means that the workload left for the next target schedule S2 is more than it can afford to execute, the transitory schedule S12 is chosen until the schedulability condition is satisfied and the tasks become schedulable by the S2 schedule, and the MTBS assigns an infinite deadline to the aperiodic task. Otherwise, the workload left behind is less than or equal to what is schedulable by S2, and the instant $t_A$ merges directly with $t_T$ without going through the transitory schedule S12.

For interval $T_{AT}$: Since $WD_2(t)$ is greater than zero at time $t_A$, the scheduler cannot jump directly to the S2 schedule, so the transitory schedule S12 is chosen. Running all tasks in H-mode reduces the workload remaining at $t_A$ from the S1 schedule in the interval $T_{BA}$, i.e. the nonzero workload difference between S2 and S1, and finally brings it to the level that the S2 schedule can afford. $WD_2(t)$ is still the work difference between the worst case based on S2 scheduling and the all-H-mode real execution. Catching up with the workload level of S2 by executing all tasks in H-mode consumes more energy than executing the tasks in the modes of the worst-case schedule S2. As long as $WD_2(t)$ is greater than zero, the deadlines of aperiodic tasks that arrive in this interval are set to infinity. When $WD_2(t)$ becomes less than or equal to zero, that moment is time $t_T$, and the scheduler can select the S2 schedule as the scheduling policy from $t_T$ on.

At time $t_T$: Because $WD_2(t_T)$ is less than or equal to zero at this instant, the S2 schedule is chosen as the scheduling policy. Thus, all aperiodic tasks that arrived before time $t_T$ but received infinite deadlines during the interval $T_{AT}$ are activated at $t_T$ by receiving finite deadlines. The deadlines are determined from their execution demands and the deadline assignment equation (4-6) of the MTBS algorithm. From the instant $t_T$, all aperiodic tasks that arrive at the scheduler receive finite deadlines.
For interval $T_{TF}$: The MTBS assigns finite deadlines according to the execution demands of incoming aperiodic tasks until no more aperiodic tasks are available in the scheduler. During the interval $T_{TF}$, if any task under the voltage settings of the worst-case schedule S2 makes the real execution workload greater than the worst-case one, i.e. a positive $WD_2(t)$, schedulability cannot be guaranteed. Therefore, $WD_2(t)$ must be checked to see whether the different running modes have made it greater than zero. If $WD_2(t)$ becomes greater than zero, all tasks are run at high speed to keep the task set schedulable.

At time $t_F$: All of the aperiodic tasks in the scheduler have been completed. The S1 schedule, which consumes less power, becomes the target scheduling policy from this instant. However, a possible nonzero workload difference carried over from having used S2 as the scheduling policy may cause at least one of the tasks to miss its deadline. Therefore, the tasks' schedulability is checked again at this instant, before the running modes are set to those of the target schedule. Since the current target scheduling policy is the S1 schedule, $WD_1(t_F)$ is checked to determine whether the scheduler can select the S1 schedule directly or should go through the S21 schedule. If $WD_1(t_F) \le 0$, this instant coincides exactly with the instant $t_P$ and the S1 schedule is selected, guaranteeing schedulability. Otherwise, the S21 schedule, which executes all tasks in H-mode, is chosen.

For interval $T_{FP}$: Since $WD_1(t_F)$ is greater than zero at time $t_F$, every task is executed in H-mode to reduce the workload in excess of what the S1 schedule can afford; this consumes more energy, because tasks run in H-mode instead of in the L-mode assigned by the S1 schedule.

At time $t_P$: Owing to the speed-up execution in H-mode, $WD_1(t)$ becomes less than or equal to zero at time $t_P$. So, the S1 schedule is selected at time $t_P$.
For interval $T_{PA}$: The S1 schedule is applied to periodic tasks until the next first aperiodic task arrives in the busy cycle; tasks spend more execution time in L-mode than in H-mode, which brings energy saving to the system.

Figure 4-3 shows an example in which a busy cycle also starts with a periodic task, as in Figure 4-2. Now, however, the schedulability conditions of the elementary schedules are satisfied at times $t_A$ and $t_F$, where the next scheduling policy is decided: neither work difference is greater than zero, i.e. $WD_1(t_F) \le 0$ and $WD_2(t_A) \le 0$. Thus the scheduler can take the elementary schedules S1 and S2 directly, without going through the transitory schedules S12 and S21. The time $t_A$ therefore coincides with time $t_T$ because $WD_2(t_A) \le 0$, and the time $t_F$ coincides with $t_P$ because $WD_1(t_F) \le 0$. The energy saving is expected to be maximized if busy cycles are composed of events that meet the schedulability conditions $WD_2(t_A) \le 0$ and $WD_1(t_F) \le 0$, so that every $t_A$ and $t_F$ coincides with $t_T$ and $t_P$: no transitory schedule is taken, and hence there is no speed-up execution running all tasks in H-mode under S12 or S21.

[Figure 4-3: timeline of a busy cycle with alternating P and PAP intervals and direct switches between S1 and S2 (Idle, S1, S2, S1, S2). Marked instants: $t_B$, beginning of a busy cycle with a periodic task; $t_A = t_T$, arrival of aperiodic task $J_k$, switch to S2 directly since $WD_2(t_A) \le 0$; $t_F = t_P$, finish of all aperiodic tasks, switch to S1 directly since $WD_1(t_F) \le 0$.]

Figure 4-3 Switching schedules for a busy cycle starting with a periodic task and all $WD(t) \le 0$

In Figure 4-4, the first arrival in a busy cycle is that of an aperiodic task $J_k$, which makes the workload difference $WD_2(t_B)$ at time $t_B$ equal to zero, since no task is waiting in the dynamic scheduler except for the aperiodic task that has just arrived at $t_B$. The zero workload difference lets the elementary schedule S2 be selected to schedule the mixed tasks.
Because the real execution demands of arriving periodic tasks are less than the WCETs of the S2 schedule, $WD_2(t)$ remains equal to or less than zero until the S1 schedule is activated upon the completion of all waiting aperiodic tasks, which makes the time $t_T$ coincide with $t_B$ and $t_A$. As soon as the last aperiodic task in the dynamic scheduler is completed at time $t_F$, $WD_1(t_F)$ is checked to determine whether the voltage settings of the tasks can be switched to the energy-saving schedule S1. From time $t_F$, the selection of the scheduling policy follows the same procedure as shown in Figure 4-2.

To show the workload difference at the moment of switching scheduling policies and how the scheduling policies are changed, we give examples of the switching moments in Figure 4-5 and Figure 4-6, which contain the transitory schedules from S1 to S2 and from S2 to S1, respectively. In both figures, $W_2(t)$, $W_1(t)$, and $W_o(t)$ represent the workload demands of the worst-case schedule S2 for mixed tasks, of the worst-case schedule S1 for periodic tasks only (for energy saving), and of the real execution of both periodic and aperiodic tasks, respectively.

[Figure 4-4: timeline of a busy cycle with alternating PAP and P intervals and the schedule selected in each interval (Idle, S2, S21, S1, S12, S2). Marked instants: $t_B = t_A = t_T$, beginning of a busy cycle with an aperiodic task $J_k$, S2 taken since $WD_2(t_B) \le 0$; $t_F$, finish of all aperiodic tasks with $WD_1(t_F) > 0$; $t_P$, switch to S1 since $WD_1(t_P) \le 0$; $t_A$, arrival of an aperiodic task with $WD_2(t_A) > 0$.]

Figure 4-4 Switching schedules for a busy cycle starting with an aperiodic task and $WD(t) > 0$

In Figure 4-5, before the arrival of the aperiodic task, the voltage settings of the S1 schedule are chosen for the periodic tasks, and $W_o(t)$ decreases over time t with the slope of low-speed (L-mode) execution. $W_2(t)$ likewise shows the decrease in workload under high-speed (H-mode) execution.
The discrepancy in the execution speeds of tasks between the S1 and S2 schedules appears as a nonzero workload difference $WD_2(t)$ at time $t_A$. From the viewpoint of schedulability under the target scheduling policy S2 for both periodic and aperiodic tasks, this can make at least one of the periodic tasks miss its deadline after $t_A$. With the arrival of the aperiodic task $J_k$ at time $t_A$, both $W_2(t)$ and $W_o(t)$ jump up by the amount of its execution demand, but $WD_2(t)$ does not change. Before the S2 schedule is taken as the scheduling policy after time $t_A$, $WD_2(t)$ must be reduced to zero or less.

[Figure 4-5: workload $W(t)$ versus time t across the S1, S12, and S2 regions, showing the curves $W_2$ and $W_o$ with L-mode and H-mode slopes, the workload difference $WD_2(t_A)$, workload lines A and B, the intervals $T_{BA}$ (or $T_{PA}$), $T_{AT}$, and $T_{TF}$, and the switching point to S2 at $t_T$. Marked events: arrival of aperiodic task $J_k$ at $t_A$; assignment of the deadline for aperiodic task $J_k$ at $t_T$.]

Figure 4-5 Switching from S1 to S2 schedules

So, all periodic tasks are executed at high speed and aperiodic tasks get infinite deadlines until $WD_2(t)$ becomes zero or less. Under the S12 schedule, executing a task at the same speed as in the S2 schedule does nothing to decrease $WD_2(t)$. $WD_2(t)$ can be reduced only when a task is executed in H-mode although it is assigned to L-mode in the worst-case S2 schedule. In addition, even when a task runs at the same high speed as in the S2 schedule, $WD_2(t)$ can be lowered at every completion of a periodic task by the slack time arising from the difference between the WCET and the real execution demand of the periodic task. In Figure 4-5, executing in H-mode under the S12 schedule the tasks that are assigned to L-mode in the S2 schedule makes $WD_2(t_T)$ smaller than $WD_2(t_A)$. Both workload lines A and B represent cases in which the last periodic task in the interval $T_{AT}$ yields a nonzero slack time; line B is the case with a larger slack time than line A. From time $t_T$, with zero or less than zero $WD_2(t_T)$