
Optimization Problems in Telecommunications and the Internet



OPTIMIZATION PROBLEMS IN TELECOMMUNICATIONS AND THE INTERNET

By
CARLOS A.S. OLIVEIRA

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA
2004


To my wife Janaina.


ACKNOWLEDGMENTS

The following people deserve my sincere acknowledgments:

My advisor, Dr. Panos Pardalos;
Dr. Mauricio Resende, from AT&T Research Labs, who was responsible for introducing me to this University;
My colleagues in the graduate school of the Industrial and Systems Engineering Department;
My family and especially my parents;
My wife.


TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT
1 INTRODUCTION
2 A SURVEY OF COMBINATORIAL OPTIMIZATION PROBLEMS IN MULTICAST ROUTING
  2.1 Introduction
    2.1.1 Multicast Routing
    2.1.2 Basic Definitions
    2.1.3 Applications of Multicast Routing
    2.1.4 Chapter Organization
  2.2 Basic Problems in Multicast Routing
    2.2.1 Graph Theory Terminology
    2.2.2 Optimization Goals
    2.2.3 Basic Multicast Routing Algorithms
    2.2.4 General Techniques for Creation of Multicast Routes
    2.2.5 Shortest Path Problems with Delay Constraints
    2.2.6 Delay Constrained Minimum Spanning Tree Problem
    2.2.7 Center-Based Trees and the Topological Center Problem
  2.3 Steiner Tree Problems and Multicast Routing
    2.3.1 The Steiner Tree Problem on Graphs
    2.3.2 Steiner Tree Problems with Delay Constraints
    2.3.3 The On-line Version of Multicast Routing
    2.3.4 Distributed Algorithms
    2.3.5 Integer Programming Formulation
    2.3.6 Minimizing Bandwidth Utilization
    2.3.7 The Degree-constrained Steiner Problem
    2.3.8 Other Restrictions: Non Symmetric Links and Degree Variation
    2.3.9 Comparison of Algorithms


  2.4 Other Problems in Multicast Routing
    2.4.1 The Multicast Packing Problem
    2.4.2 The Multicast Network Dimensioning Problem
    2.4.3 The Point-to-Point Connection Problem
  2.5 Concluding Remarks
3 STREAMING CACHE PLACEMENT PROBLEMS
  3.1 Introduction
    3.1.1 Multicast Networks
    3.1.2 Related Work
  3.2 Versions of Streaming Cache Placement Problems
    3.2.1 The Tree Cache Placement Problem
    3.2.2 The Flow Cache Placement Problem
  3.3 Complexity of the Cache Placement Problems
    3.3.1 Complexity of the TSCPP
    3.3.2 Complexity of the FSCPP
  3.4 Concluding Remarks
4 COMPLEXITY OF APPROXIMATION FOR STREAMING CACHE PLACEMENT PROBLEMS
  4.1 Introduction
  4.2 Non-approximability
  4.3 Improved Hardness Result for FSCPP
  4.4 Concluding Remarks
5 ALGORITHMS FOR STREAMING CACHE PLACEMENT PROBLEMS
  5.1 Introduction
  5.2 Approximation Algorithms for SCPP
    5.2.1 A Simple Algorithm for TSCPP
    5.2.2 A Flow-based Algorithm for FSCPP
  5.3 Construction Algorithms for the SCPP
    5.3.1 Connecting Destinations
    5.3.2 Adding Caches to a Solution
  5.4 Empirical Evaluation
  5.5 Concluding Remarks
6 HEURISTIC ALGORITHMS FOR ROUTING ON MULTICAST NETWORKS
  6.1 Introduction
    6.1.1 The Multicast Routing Problem
    6.1.2 Contributions


  6.2 An Algorithm for the MRP
  6.3 Metaheuristic Description
    6.3.1 Improving the Construction Phase
    6.3.2 Improvement Phase
    6.3.3 Reverse Path Relinking and Post-processing
    6.3.4 Efficient implementation of Path Relinking
  6.4 Computational Experiments
  6.5 Concluding Remarks
7 A NEW HEURISTIC FOR THE MINIMUM CONNECTED DOMINATING SET PROBLEM ON AD HOC WIRELESS NETWORKS
  7.1 Introduction
  7.2 Algorithm for the MCDS Problem
  7.3 A Distributed Implementation
  7.4 Numerical Experiments
  7.5 Concluding Remarks
8 CONCLUSION
REFERENCES
BIOGRAPHICAL SKETCH


LIST OF TABLES

Table 2-1: Comparison among algorithms for the problem of multicast routing with delay constraints. $k$ is the number of destinations. ** This algorithm is partially distributed.

Table 2-2: Comparison among algorithms for the problem of multicast routing with delay constraints. $k$ is the number of destinations, $T_{SP}$ is the time to find a shortest path in the graph. ** In this case amortized time is the important issue, but was not analyzed in the original paper.

Table 5-1: Computational results for different variations of Algorithm 7 and Algorithm 8.

Table 5-2: Comparison of computational time for Algorithm 7 and Algorithm 8. All values are in milliseconds.

Table 6-1: Summary of results for the proposed metaheuristic for the MRP. Column 9 ( ) reports only the time spent in the construction phase.

Table 7-1: Results of computational experiments for instances with 100 vertices, randomly distributed in square planar areas of size 100 × 100, 120 × 120, 140 × 140, and 160 × 160. The average solutions are taken over 30 iterations.

Table 7-2: Results of computational experiments for instances with 150 vertices, randomly distributed in square planar areas of size 120 × 120, 140 × 140, 160 × 160, and 180 × 180. The average solutions are taken over 30 iterations.


LIST OF FIGURES

Figure 2-1: Conceptual organization of a multicast group.

Figure 3-1: Simple example for the cache placement problem.

Figure 3-2: Simple example for the Tree Cache Placement Problem.

Figure 3-3: Simple example for the Flow Cache Placement Problem.

Figure 3-4: Small graph $G$ created in the reduction given by Theorem 2. In this example, the SAT formula is $(x_1 \lor x_2 \lor x_3) \land (x_2 \lor x_3 \lor x_4) \land (x_1 \lor x_3 \lor x_4)$.

Figure 3-5: Part of the transformation used by the FSCPP.

Figure 4-1: Example for transformation of Theorem 11.

Figure 5-1: Sample execution for Algorithm 7. In this graph, all capacities are equal to 1. Destination $d_2$ is being added to the partial solution, and node 1 must be added to $R$.

Figure 5-2: Sample execution for Algorithm 8 on a graph with unitary capacities. Nodes 1 and 2 are infeasible, and therefore are candidates to be included in $R$.

Figure 5-3: Comparison of computational time for different versions of Algorithm 7 and Algorithm 8. Labels `C3' to `C10' refer to columns 3 to 10 of Table 5-2.

Figure 6-1: Comparison between the average solution costs found by the KMB heuristic and our algorithm.

Figure 7-1: Approximating the virtual backbone with a connected dominating set in a unit-disk graph.

Figure 7-2: Actions for a vertex $v$ in the distributed algorithm.


Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

OPTIMIZATION PROBLEMS IN TELECOMMUNICATIONS AND THE INTERNET

By
Carlos A.S. Oliveira
August 2004

Chair: Panos M. Pardalos
Major Department: Industrial and Systems Engineering

Optimization problems occur in diverse areas of telecommunications. Some problems have become classical examples of application for techniques in operations research, such as the theory of network flows. Other opportunities for applications in telecommunications arise frequently, given the dynamic nature of the field. Every new technique presents different challenges that can be answered using appropriate optimization techniques.

In this dissertation, problems occurring in telecommunications are discussed, with emphasis on applications in the Internet. First, a study of problems occurring in multicast routing is presented. Here, the objective is to allow the deployment of multicast services with minimum cost. A description of the problem is provided, and variations that occur frequently in some of these applications are discussed. Complexity results are presented for multicast problems, showing that it is NP-hard to approximate these problems effectively. Despite this, we also describe algorithms that give some guarantee of approximation.


A second problem in multicast networks studied in this dissertation is the multicast routing problem. Its objective is to find a minimum cost route linking the source to the destinations, subject to additional quality of service constraints. A heuristic based on a Steiner tree algorithm is proposed and used to construct solutions for the routing problem. This construction heuristic is also used as the basis for a restarting method, based on the greedy randomized adaptive search procedure (GRASP).

The last part of the dissertation is concerned with problems in wireless networks. Such networks have numerous applications due to their highly dynamic nature. Algorithms to compute near optimal solutions for the minimum backbone problem are proposed, which perform much better in practice than other methods. A distributed version of the algorithm is also provided.


CHAPTER 1
INTRODUCTION

Computer networks are a relatively new communication medium that has quickly become essential for most organizations. In this dissertation, we present some optimization problems occurring in computer and telecommunications networks. Performing optimization on such networks is important for several reasons, including cost and speed of communication.

We concentrate on two types of networks that have recently received much attention. The first type is multicast systems, which are used to reliably share information with a (possibly large) group of clients. The second type of network considered in this dissertation is wireless ad hoc systems, an important type of network with several applications.

We are mostly concerned with computational issues arising in the optimization of problems occurring on telecommunications networks. Thus, although we present mathematical programming aspects for each of these problems, the main objective will be to derive efficient algorithms, with or without guarantee of approximation.

The topics discussed in the dissertation are divided as follows. In Chapter 2, a survey of research in the area of multicast systems is presented. The review is used as a starting point for the topics related to multicast networks that will be discussed later in the dissertation.

Chapter 3 introduces the problem that will be studied in the next chapters, the streaming cache placement problem (SCPP). Variants of this basic problem are introduced, and all variants are proved to be NP-hard.


Chapter 4 is dedicated to the study of approximability properties of the different versions of the SCPP. It is shown that in general the SCPP cannot have a polynomial time approximation scheme (PTAS). This demonstrates that the SCPP is a very hard problem not only to solve exactly but also to approximate. We also show that for the directed flow version it is not possible to approximate the problem within a factor of less than $\log \log |D|$, where $D$ is the set of destinations.

In Chapter 5, algorithms for different versions of the SCPP are proposed. Both approximation algorithms and heuristics are discussed. Initially, some algorithms with performance guarantee are proposed. However, due to complexity results, these algorithms in general do not give good results for problems found in practice. Heuristic algorithms are then studied, and two main strategies for construction heuristics are discussed. Results of computational experiments with these methods are presented and compared.

Another problem in multicast networks is discussed in Chapter 6. The routing problem in multicast networks asks for an optimal route, i.e., a minimum cost tree connecting the source node to the destinations. The routing problem for multicast networks is known to be NP-hard. We propose new heuristics, and use these heuristics to implement a greedy randomized adaptive search procedure (GRASP).

In the last part of the dissertation, wireless network systems are discussed. In particular, ad hoc systems (also known as MANETs) are studied. Chapter 7 is dedicated to the problem of determining a minimum backbone for such ad hoc networks. A new algorithm for this problem is given, and the advantages of this algorithm are addressed. A distributed version of the algorithm is also proposed.


Finally, in Chapter 8 general conclusions are given about the work presented in the dissertation. Future work in the area is presented, and some concluding remarks about this area of research are given.


CHAPTER 2
A SURVEY OF COMBINATORIAL OPTIMIZATION PROBLEMS IN MULTICAST ROUTING

In multicast routing, the main objective is to send data from one or more sources to multiple destinations, while at the same time minimizing the usage of resources. Examples of resources which can be minimized include bandwidth, time and connection costs. In this chapter we survey applications of combinatorial optimization to multicast routing. We discuss the most important problems considered in this area, as well as their models. Algorithms for each of the main problems are also presented.

2.1 Introduction

A basic application of computer networks consists of sending specific data to a selected, usually large, number of clients. Common examples of such applications are multimedia distribution systems (Pasquale et al. 1998), video-conferencing (Eriksson 1994), software delivery (Han and Shahmehri 2000), groupware (Chockler et al. 1996), and game communities (Park and Park 1997). Multicast is a technique used to facilitate this type of information exchange, by routing data from one or more sources to a potentially large number of destinations (Deering and Cheriton 1990). This is done in such a way that overall utilization of resources in the underlying network is minimized in some sense.

To handle multicast routing, many multicast technologies have been proposed in the last decade. Examples are the MBONE (Eriksson 1994), MOSPF (Moy 1994a), PIM (Deering et al. 1996), core-based trees (Ballardie


et al. 1993) and shared tree technologies (Chiang et al. 1998; Wei and Estrin 1994). Each proposed technology requires the solution of (usually hard) combinatorial problems. With the proliferation of services that require multicast delivery, the associated routing methods became an important source of problems for the combinatorial optimization community. Many objectives can be devised when designing protocols, routing strategies, and overall networks that can be optimized using techniques from combinatorial optimization.

In this chapter we discuss some of the combinatorial optimization problems arising in the area of multicast routing. These are interesting in their own right, but sometimes are closely related to other well known problems. Thus, the cross-fertilization of ideas from combinatorial optimization and multicast networks can be beneficial to the development of improved algorithms and general techniques. Our objective is to review some of the more interesting problems and give examples and references of the existing algorithms. We also discuss some problems recently appearing in the area of multicast networks and how they are modeled and solved in the literature.

2.1.1 Multicast Routing

The idea of sending information to a large number of users is common in systems that employ broadcasting. Radio and TV are two standard examples of broadcasting systems which are widely used. On the other hand, networks were initially designed to be used as a communication means among a relatively small number of participants.

The TCP/IP protocol stack, which is the main technology underlying the Internet, uses routing protocols for delivery of packets to single destinations. Most of these protocols are based on the calculation of shortest paths. A good example of a widely used routing protocol is the OSPF (Moy 1994b;


Thomas II 1998) (Open Shortest Path First), which is used to compute routing tables for routers inside a subnetwork. In OSPF, each router in the network is responsible for maintaining a table of paths for reachable destinations. This table can be created using Dijkstra's algorithm (Dijkstra 1959) to calculate shortest paths from the current node to all other destinations in the current subnetwork. This process can be done deterministically in polynomial time, using at most $O(n^3)$ iterations, where $n$ is the number of nodes involved.

[Figure 2-1: Conceptual organization of a multicast group. Labels: Network, Multicast sources, Multicast destinations.]

However, with the Internet and the increased use of large networks, the necessity appeared for services targeting larger audiences. This phenomenon became more important due to the development of new technologies such as virtual conference (Sabri and Prasada 1985), video on demand, groupware (Ellis et al. 1991), etc. This series of developments gave momentum for the creation of multicast routing protocols.

In multicast routing, data can be sent from one or more source nodes to a set of destination nodes (see Figure 2-1). It is required that all destinations be satisfied by a stream of data. Dalal and Metcalfe (1978) were the first to give non-trivial algorithms for routing of packets in a multicast network. From then on, many proposals have been made to create technology supporting multicast routing, such as


by Deering (1988), Eriksson (1994), and Wall (1980). Some examples of multicast protocols are PIM - Protocol Independent Multicast (Deering et al. 1996), DVMRP - Distance-Vector Multicast Routing Protocol (Deering and Cheriton 1990; Waitzman et al. 1988), MOSPF - Multicast OSPF (Moy 1994a), and CBT - Core Based Trees (Ballardie et al. 1993). See Levine and Garcia-Luna-Aceves (1998) for a detailed comparison of diverse technologies.

2.1.2 Basic Definitions

A multicast group is a set of nodes in a network that need to share the same piece of information. A multicast group can have one or more source nodes, and more than one destination. Note that even when there is more than one source, the same information is shared among all nodes in the group.

A multicast group can be static or dynamic. Static groups cannot be changed after their creation. Starting with Wall (1980), the problem of routing information in static groups is frequently modeled as a type of Steiner tree problem. On the other hand, dynamic groups can have members added or removed at any time (Waxman 1988). Clearly, the task of maintaining routes for dynamic groups is complicated by the fact that it is not known in advance which nodes can be added or removed.

Multicast groups can also be classified according to the relative number of users, as described by Deering and Cheriton (1990). In sparse groups the number of participants is small compared to the number of nodes in the network. In the other situation, in which most of the nodes in the network are engaged in multicast communication, the groups involved are called pervasive groups (Waitzman et al. 1988).

For more information about multicast networks in general, one can consult the surveys by A.J. Frank (1985), and Paul and Raghavan (2002). A good


introduction to multicasting in IP networks is given in the Internet Draft by Semeria and Maufer (1996) (available online). Other interesting related literature includes Du and Pardalos (1993a); Pardalos and Du (1998); Wan et al. (1998); Pardalos et al. (2000, 1993); Pardalos and Khoury (1996, 1995).

2.1.3 Applications of Multicast Routing

Applications of multicast routing have a wide spectrum, from business to government and entertainment. One of the first applications of multicast routing was in audio broadcasting. In fact, the first real use of the Internet MBONE (Multimedia Backbone, created in 1992) was to broadcast audio from IETF (Internet Engineering Task Force) meetings over the Internet (Eriksson 1994).

Another important application of multicast routing is video conferencing (Yum et al. 1995), since this is a resource-intensive kind of application, where a group of users is targeted. Its requirements, such as real-time image exchange and interaction between geographically separated users, are also found in other types of multimedia applications. Being closely related to the area of remote collaboration, video conferencing has received great attention during the last decade. Among others, Pasquale et al. (1998) give a detailed discussion about utilization of multicast routing to deliver multimedia content over large networks, such as the Internet. Jia et al. (1997) and Kompella et al. (1996) also proposed algorithms for multicast routing applied to real-time video distribution and video-conferencing problems.

Many other interesting uses of multicast routing have appeared during the last decade, with examples such as video on demand, software distribution, Internet radio and TV stations, etc.


2.1.4 Chapter Organization

The remainder of this chapter is organized as follows. In Section 2.2 we give a common ground for the description of optimization problems in multicast routing. We start by giving the terminology used throughout the chapter, mainly from graph theory. Then, we discuss some of the common problems appearing in this area. In Section 2.3 we discuss delay constrained Steiner tree problems. These are the most studied problems in multicast routing, from the optimization point of view, being used in diverse algorithms. Thus, we discuss many of the versions of this problem considered in the literature. In Section 2.4 we review some other optimization problems related to multicast routing. They are the multicast packing problem, the multicast network dimensioning problem, and the point-to-point connection problem. Finally, in Section 2.5 we give some concluding remarks about the subject.

2.2 Basic Problems in Multicast Routing

In this section we discuss the basic problems occurring in multicast networks. We start with an introduction to the terminology used. Next, we discuss some basic problems which are addressed in the multicast routing literature.

2.2.1 Graph Theory Terminology

Graphs in this chapter are considered to be undirected and without loops. In our applications, the nodes in a graph represent hosts, and edges represent network links. We use $N(v)$ to denote the set of neighbors of a node $v \in V$. Also, we denote by $\delta(v)$ the number of such neighbors.

With each edge $(i, j) \in E$ we can associate functions representing characteristics of the network links. The most widely used functions are capacity $c(i, j)$, cost $w(i, j)$ and delay $d(i, j)$, for $i, j \in V$. For each edge $(i, j) \in E$ the


associated capacity $c(i, j)$ represents the maximum amount of data that can be sent between nodes $i$ and $j$. In multicasting applications this is generally given by an integer multiple of some unit of transmission capacity, so we can say that $c(i, j) \in \mathbb{Z}_+$ for all $(i, j) \in E$. The function $w(i, j)$ is used to model any costs incurred by the use of the network link between nodes $i$ and $j$. This includes leasing costs, maintenance costs, etc.

Some applications, such as multimedia delivery, are sensitive to transmission delays and require that the total time between delivery and arrival of a data packet be restricted to some particular maximum value (Ferrari and Verma 1990). The delay function $d(i, j)$ is used to model this kind of constraint. The delay $d(i, j)$ represents the time needed to transmit information between nodes $i$ and $j$. As a typical example, video-on-demand applications may have specific requirements concerning the transmission time. Each packet $i$ can be marked with the maximum delay $d_i$ that can be tolerated for its transmission. In this case the routers must consider only paths where the total delay is at most $d_i$.

A path in a graph $G$ is a sequence of nodes $v_{i_1}, \dots, v_{i_j}$ where $(v_{i_k}, v_{i_{k+1}}) \in E$ for all $k \in \{1, \dots, j-1\}$. In a routing problem we want to find paths from a source $s$ to a set $D$ of destinations, satisfying some requirements. The cost $w(P)$ of a path $P$ is defined as the sum of the costs of all edges $(v_{i_k}, v_{i_{k+1}})$ in $P$. A path $P$ between nodes $u$ and $v$ is called a minimum path if there is no path $P'$ between $u$ and $v$ in $G$ such that $w(P') < w(P)$. The path delay $d(P)$ is defined as the delay incurred when routing data between nodes $v_1$ and $v_k$ through path $P = (v_1, \dots, v_k)$. In other words, $d(P) = \sum_{i=1}^{k-1} d(v_i, v_{i+1})$.
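The definitions of path cost and path delay can be illustrated with a small sketch. The node names, the numeric values, and the helper names `undirected`, `path_sum`, and `is_delay_feasible` are assumptions made only for this example; they do not come from the dissertation. The sketch evaluates $w(P)$ and $d(P)$ as sums over consecutive nodes of a path and checks a path against a delay bound.

```python
def undirected(weights):
    """Return a lookup in which (i, j) and (j, i) denote the same link."""
    sym = dict(weights)
    sym.update({(j, i): value for (i, j), value in weights.items()})
    return sym


def path_sum(path, weights):
    """Sum an edge function (cost w or delay d) over consecutive nodes of a path."""
    return sum(weights[(a, b)] for a, b in zip(path, path[1:]))


def is_delay_feasible(path, delay, bound):
    """True if the path delay d(P) does not exceed the tolerated bound."""
    return path_sum(path, delay) <= bound


# Hypothetical three-node network: costs w(i, j) and delays d(i, j).
cost = undirected({("s", "a"): 2, ("a", "t"): 2, ("s", "t"): 5})
delay = undirected({("s", "a"): 4, ("a", "t"): 3, ("s", "t"): 1})

P = ["s", "a", "t"]
path_cost = path_sum(P, cost)    # w(P) = 2 + 2 = 4
path_delay = path_sum(P, delay)  # d(P) = 4 + 3 = 7
```

In this example the cheaper path through node a violates a delay bound of 6, while the direct and more expensive link satisfies it. This is the kind of trade-off between cost and delay that delay constrained routing has to resolve.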
In this chapter, we use the words edge and link interchangeably to refer to the same object. The word link is used when it is more appropriate in


the application context. For more information on graph-theoretical aspects of multicast networks, see Berry (1990).

2.2.2 Optimization Goals

Different objectives can be considered when optimizing a multicast routing problem, such as, for example, path delay, total cost of the tree, and maximum congestion. We discuss some of these objectives.

Quality of service is an important consideration with network service, and it is mostly related to the time needed for data delivery. Depending on the quality of service requirements of an application, one of the possible goals is to minimize path delay. The best example of an application that needs this quality of service is video-conferencing. The path delay is an additive delay function, corresponding to the sum of delays incurred from source to destination, for all destinations. It is interesting to note that this problem is solvable in polynomial time, since the paths from source to destination are considered separately. Shortest path algorithms such as, for example, Dijkstra's algorithm (Dijkstra 1959), can be used to achieve this objective.

A second objective is to minimize the total cost of the routing tree. This is again an additive metric, where we look for the minimum sum of costs for edges used in the routing tree. In this case, however, the optimization objective is considerably harder to achieve, since it can be shown to be equivalent to the minimum Steiner tree problem, a classical NP-hard problem (Garey and Johnson 1979).

Another example of an optimization goal is to minimize the maximum network congestion. The congestion on a link is defined as the difference between capacity and usage. The higher the congestion, the more difficult it is to handle failures in some other links of the network. Also, higher congestion makes


it harder to include new elements in an existing multicast group, and therefore is an undesirable situation in dynamic multicast. Thus, in a well designed network it is desirable to keep congestion at a minimum.

2.2.3 Basic Multicast Routing Algorithms

The most basic way of sending information to a multicast group is using flooding. With this technique, a node sends packets through all its adjacent links. If a node $v$ receives a packet $p$ from node $u$ for which it is not the destination, then $v$ first checks if $p$ was received before. If this is true, the packet does not need to be sent again. Otherwise, $v$ re-sends the packet to all other adjacent nodes (excluding $u$). The formal statement of this strategy is shown in Algorithm 1. It is clear that after at most $n$ such steps (where $n$ is the number of nodes in the network), the packet must have reached all nodes, including the destinations. Thus, the algorithm is correct. The number of messages sent by each node is at most $n$. The number of messages received by $v$ is at most $n \, \delta_G(v)$.

    Receive packet p from node u
    if destination(p) = v then
        Packet-Received
    else
        if packet was not previously processed then
            Send packet p to all nodes in N(v) \ {u}
        end
    end

Algorithm 1: Flooding algorithm for node $v$ in a multicast network

This method of packet routing is simple, but very inefficient. The first reason is that it uses more bandwidth than required, since many nodes which are not in the path to the destination will end up receiving the packet. Second, each node in the network must keep a list of all packets which it sent,


in order to avoid loops. This makes the use of flooding prohibitive for all but very small networks. Another problem, which is more difficult to solve, is how to guarantee that a packet will be delivered, since the network can be disconnected due to some link failure, for example.

The reverse path-forwarding algorithm is a method, proposed by Dalal and Metcalfe (1978), used to reduce the network usage associated with the flooding technique. The idea is that, for each node $v$ and source node $s$ in the network, $v$ will determine in a distributed way what is the edge $e = (u, v)$, for some $u \in V$, which is in the shortest path from $s$ to $v$. This edge is called the parent link. The parent link can be determined in different ways, and a very simple method is: select $e = (u, v)$ to be the parent link for source $s$ if this was the first edge from which a packet from $s$ was received. With this information, a node can selectively drop incoming packets, based on their source. If a packet $p$ is received from a link which is not considered to be in the shortest path between the source node and the current node, then $p$ is discarded. Otherwise, the node broadcasts $p$ to all other adjacent links, just as in the flooding algorithm. The parent link can also be updated depending on the information received from other nodes. Other algorithms can be used to enhance this basic scheme as discussed, e.g., by Semeria and Maufer (1996).

2.2.4 General Techniques for Creation of Multicast Routes

During the last decades a number of basic techniques were proposed for the construction of multicast routes. Diot et al. (1997) identified some of the main techniques used in the literature. They describe these techniques as being divided into source based routing, center based tree algorithms, and Steiner tree based algorithms.
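As a point of reference for the techniques listed above, the two basic schemes of Section 2.2.3 can be sketched as a small simulation. This is a minimal sketch under stated assumptions, not the procedure of Dalal and Metcalfe (1978): it broadcasts a single packet to every node (the destination test of Algorithm 1 is omitted), all links are assumed to have unit delay so that a first-in, first-out queue models the order of arrival, and the function names and the example graph are hypothetical.

```python
from collections import deque


def flood(adj, source):
    """Broadcast one packet by flooding; return (nodes reached, messages sent)."""
    seen = {source}                      # per-packet history kept by the nodes
    queue = deque((source, v) for v in adj[source])
    messages = len(queue)
    while queue:
        u, v = queue.popleft()           # packet travels on link (u, v)
        if v in seen:
            continue                     # already processed: do not re-send
        seen.add(v)
        for w in adj[v]:
            if w != u:                   # forward to N(v) \ {u}
                queue.append((v, w))
                messages += 1
    return seen, messages


def learn_parents(adj, source):
    """First-arrival rule: parent[v] is the neighbor from which v first hears from source."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent


def rpf_broadcast(adj, source, parent):
    """Forward one packet using only the parent table, with no packet history."""
    accepted = [source]
    queue = deque((source, v) for v in adj[source])
    messages = len(queue)
    while queue:
        u, v = queue.popleft()
        if parent[v] != u:
            continue                     # not received on the parent link: drop
        accepted.append(v)
        for w in adj[v]:
            if w != u:
                queue.append((v, w))
                messages += 1
    return accepted, messages


# Hypothetical four-node network given as adjacency lists.
adj = {"s": ["a", "b"], "a": ["s", "b", "c"], "b": ["s", "a", "c"], "c": ["a", "b"]}
parent = learn_parents(adj, "s")
reached, flood_messages = flood(adj, "s")
accepted, rpf_messages = rpf_broadcast(adj, "s", parent)
```

In this sketch, flooding needs the set `seen`, which stands for the packet history that every node must keep, whereas reverse path forwarding decides whether to drop a packet from the per-source parent table alone. Each node accepts the packet exactly once, because only its parent link is trusted.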


In source based routing, a routing tree rooted at the source node is created for each multicast group. This technique is used, for example, in the DVMRP and PIM protocols. Some implementations of source based routing make use of the reverse path-forwarding algorithm, discussed in the previous subsection (Dalal and Metcalfe 1978). Sriram et al. (1998) observed that this technique does a poor job in routing small multicast groups, since it tries to optimize the routing tree without considering other potential users not in the current group.

Among the source based routing algorithms, the Steiner tree based methods focus on minimization of tree cost. This is probably the most used approach, since it can leverage the large number of existing algorithms for the Steiner tree problem. There are many examples of this technique (such as in Bharath-Kumar and Jaffe (1983); Wall (1982); Waxman (1988); Wi and Choi (1995)), which will be discussed in Section 2.3.

In contrast to source based routing, center based tree algorithms create routing trees with a specified root node. This root node is computed to have some special properties, such as, for example, being closest to all other nodes. This method is well suited to the construction of shared trees, since the root node can have properties interesting to all multicast groups. For example, if the root node is the topological center of a set of nodes, then this is the node which is closest to all members of the involved multicast groups. In the case of the topological center, the problem of finding the root node becomes NP-hard, but there are other versions of the problem which are easier to solve. An important example of use of this idea occurs in the CBT (core-based tree) algorithm (Ballardie et al. 1993).

A recent method proposed for distributing data in multicast groups is called ring based routing (Baldi et al. 1997; Ofek and Yener 1997). The idea


is to have a ring linking nodes in a group, to minimize costs and improve reliability. Note for example that trees can be broken by just one link failure; on the other hand, rings are 2-connected structures, which offer a more reliable interconnection.

2.2.5 Shortest Path Problems with Delay Constraints

Given a graph $G(V, E)$, a source node $s$ and a destination node $t$, with $s, t \in V$, the shortest path problem consists of finding a path from $s$ to $t$ with minimum cost. The solution of shortest path problems is required in most implementations of routing algorithms. This problem can be solved in polynomial time using standard algorithms (Dijkstra 1959; Bellman 1958; Ford 1956). However, other versions of the shortest path problem are harder, and are not known to be solvable exactly in polynomial time. An example of this occurs when we add delay constraints to the basic problem. The delay constraints require that the sum of the delays from source to each destination be less than some threshold. In this case, the shortest path problem becomes NP-hard (Garey and Johnson 1979) and therefore heuristic algorithms must be used in order to find efficient implementations (e.g., Salama et al. (1997b)). For example, Sun and Langendoerfer (1995) and Deering and Cheriton (1990) have proposed good heuristics for this problem.

Some algorithms for shortest path construction are less useful than others, due to properties of their distributed implementations. According to Cheng et al. (1989), a disadvantage of the distributed Bellman-Ford algorithm for shortest path computation is that it is difficult to recover from link failures, from the bouncing effect (Sloman and Andriopoulos 1985) caused by loops, and from termination problems caused by disconnected segments. Thus, a


chief requirement for shortest path algorithms used in multicast routing is to have a scalable distributed implementation. The problems associated with distributed requirements for shortest path algorithms are discussed by Cheng et al. (1989), who proposed a distributed algorithm to overcome such limitations.

2.2.6 Delay Constrained Minimum Spanning Tree Problem

In the minimum spanning tree (MST) problem, given a graph $G(V, E)$, we need to find a minimum cost tree connecting all nodes in $V$. This problem can be solved in polynomial time by Kruskal's algorithm (Kruskal 1956) or Prim's algorithm (Prim 1957). However, similarly to the shortest path problem, the MST problem becomes NP-hard when delay constraints are applied to the resulting paths in the routing tree. This fact can be easily shown, since the minimum spanning tree problem is a generalization of the minimum cost path problem.

Salama et al. (1997a) discuss the delay constrained minimum spanning tree problem. They propose a simple heuristic, which resembles Prim's algorithm, to give an approximate solution to the problem. The proposed method can be described as follows. In its first phase, the algorithm tries to incorporate links, ordered according to increasing cost, but without creating cycles. At each step, the algorithm must also ensure that the current (partial) solution satisfies the delay constraints. If this is not true, then a relaxation step is carried out, which consists of the following procedure. If a node can be linked by an alternative path, while reducing the delay, then the new path is selected. If, after this relaxation step, there is still no path with a suitable delay for some node, then the algorithm fails and returns just a partial answer.
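The general shape of such a heuristic can be sketched in a few lines. The Python code below is a simplified, Prim-like illustration and should not be read as the method of Salama et al.: it assumes an undirected graph given as a dictionary of edges with (cost, delay) pairs, a single delay bound `max_delay` measured from the source, and it omits the relaxation step entirely, so a link that would violate the bound is simply skipped. The function name and the example data are hypothetical.

```python
import heapq


def delay_bounded_prim(edges, source, max_delay):
    """Grow a tree from source by cheapest link, keeping every source-to-node
    delay within max_delay. Returns (tree edges, delay per node, unreached nodes)."""
    adj = {}
    for (u, v), (cost, delay) in edges.items():
        adj.setdefault(u, []).append((v, cost, delay))
        adj.setdefault(v, []).append((u, cost, delay))

    delay_to = {source: 0}  # accumulated delay from the source in the tree
    tree = []
    heap = [(cost, source, v, delay) for v, cost, delay in adj[source]]
    heapq.heapify(heap)
    while heap:
        cost, u, v, delay = heapq.heappop(heap)
        if v in delay_to:
            continue  # would create a cycle
        if delay_to[u] + delay > max_delay:
            continue  # would violate the delay bound (no relaxation step here)
        delay_to[v] = delay_to[u] + delay
        tree.append((u, v))
        for w, c, d in adj[v]:
            if w not in delay_to:
                heapq.heappush(heap, (c, v, w, d))
    unreached = set(adj) - set(delay_to)  # non-empty means a partial answer
    return tree, delay_to, unreached


# Hypothetical instance: edge -> (cost, delay).
example_edges = {
    ("s", "a"): (1, 5),
    ("s", "b"): (4, 1),
    ("a", "b"): (1, 1),
    ("b", "c"): (1, 3),
    ("a", "c"): (5, 1),
}
tree, delay_to, unreached = delay_bounded_prim(example_edges, "s", max_delay=6)
print(tree)  # [('s', 'a'), ('a', 'b'), ('a', 'c')]
```

With the tighter bound `max_delay=5`, the same sketch is forced to use the more expensive link from `s` to `b`, which shows how the delay constraint pulls the solution away from the unconstrained minimum spanning tree.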


Other examples of algorithms for computing delay constrained spanning trees include the work of Chow (1991). In his paper, an algorithm for the problem of combining different routes into one single routing tree is proposed. For more information about delay constrained routing, see Salama et al. (1997c), where a comparison of diverse algorithms for this problem is performed.

2.2.7 Center-Based Trees and the Topological Center Problem

In the context of generation of multicast routing trees, some routing technologies, such as PIM and CBT, use the technique known as center-based trees (Salama et al. 1996), which was initially developed in Wall (1982). This method can be classified as a center-based routing technique, as described in Section 2.2.4. In this approach the first step is to find the node $v$ which is the topological center of the set of senders and receivers. The topological center of a graph $G(V, E)$ is defined as the node $v \in V$ which is closest to any other node in the network, i.e., the node $v$ which minimizes $\max_{u \in V} d(v, u)$. Then, a routing tree rooted at $v$ is constructed and used throughout the multicast session. The basic reasoning behind the algorithm is that the topological center is a better starting point for the routing tree, since it is expected to change less than other parts of the tree. This scheme departs from the idea of rooting the tree at the sender, and therefore can be extended to be used by more than one multicast group at the same time.

Finding the topological center is, however, an NP-hard problem (Ballardie et al. 1993). Thus, other related approaches try to find root nodes that are not exactly the topological center, but which can be thought of as a good approximation. Along these lines we have algorithms using core points (Ballardie et al. 1993) and also rendez-vous points (Deering et al. 1994).


It is interesting to note that, for simplicity, most of the papers which try to create routing trees using center-based techniques simply disregard the NP-complete problem and try to find other approximations. It is not completely understood how good these approximations can be for practical instances. However, Calvert et al. (1995) gave an informative comparison of the different methods of choosing the center for a routing tree, based on several experiments.

2.3 Steiner Tree Problems and Multicast Routing

In this section we discuss different versions of the Steiner tree problem, and how they can be useful to solve problems arising in multicast routing. Some of the algorithms for the Steiner tree problem are also presented.

2.3.1 The Steiner Tree Problem on Graphs

Steiner tree problems are very useful in representing solutions to multicast routing problems. They are employed mostly when there is just one active multicast group and the minimum cost tree is wanted. In the Steiner tree problem, given a graph $G(V, E)$ and a set $R \subseteq V$ of required nodes, we want to find a minimum cost tree connecting all nodes in $R$. The nodes in $V \setminus R$ can be used if needed, and are called "Steiner" points. This is a classical NP-hard problem (Garey and Johnson 1979), and has a vast literature of its own (Bauer and Varma 1997; Du et al. 2001; Du and Pardalos 1993b; Hwang and Richards 1992; Hwang et al. 1992; Kou et al. 1981; Takahashi and Matsuyama 1980; Winter 1987; Winter and Smith 1992). Thus, in this subsection we give only some of the most used results. For additional information about the Steiner problem, one can consult the surveys of Winter (1987), Hwang and Richards (1992), and Hwang et al. (1992).

One of the most well known heuristics for the Steiner tree problem was proposed by Kou et al. (1981), and is frequently referred to as the KMB heuristic.


There is practical interest in this heuristic, since it has a performance guarantee of at most twice the size of the optimum Steiner tree. The steps of the KMB heuristic are shown in Algorithm 2.

    Construct a complete graph $K(R, E)$ where the set of nodes is $R$
    Let the distance $d(i, j)$, $i, j \in R$, be the length of the shortest path from $i$ to $j$ in $G$
    Find a minimum spanning tree $T$ of $K$
    Replace each edge $(i, j)$ in $T$ by the complete path from $i$ to $j$ in $G$
    Let the resulting graph be $T'$
    Compute a minimum spanning tree $\hat{T}$ of $T'$
    repeat
        $r \leftarrow$ false
        if there is a leaf $w \in \hat{T}$ which is not in $R$ then
            Remove $w$ from $\hat{T}$
            $r \leftarrow$ true
        end
    until not $r$

Algorithm 2: Minimum spanning tree heuristic for Steiner tree.

Theorem 1 (Kou et al. 1981). Algorithm 2 has a performance guarantee of $2 - 2/p$, where $p = |R|$.

Wall (1980) made a comprehensive study of how the KMB heuristic performs in problems occurring in real networks. For example, Doar and Leslie (1993) report that this heuristic can give much better results than the claimed guarantee, usually achieving solutions within 5% of the optimal for a large number of realistic instances.

Another basic heuristic for the Steiner tree problem was proposed by Takahashi and Matsuyama (1980). This heuristic works in a way similar to Dijkstra's and Prim's algorithms. The operation of the heuristic consists of growing the initial solution tree using shortest paths. Thus, it is classified as part of the broad class of path-distance heuristics. Initially, the tree is composed of the source node only. Then, at each step, the heuristic searches for a still


unconnected destination $d$ that is closest to the current tree $T$, and adds to $T$ the shortest path leading to $d$. The algorithm stops when all required nodes have been added to the solution tree.

The Steiner tree technique for multicast routing consists of using the Steiner problem as a model for the construction of a multicast tree. In general, it is considered that there is just one source node for the multicast group. The set of required nodes is defined as the union of source and destinations. This technique is one of the most studied for multicast tree construction, with many algorithms available (Bauer and Varma 1995; Chow 1991; Chen et al. 1993; Kompella et al. 1992, 1993b,a; Hong et al. 1998; Kompella et al. 1996; Ramanathan 1996). In the remainder of this section and in the next sections we discuss the versions of this problem which are most useful, as well as algorithms proposed for them.

In one of the first uses of the Steiner tree problem for creating multicast trees, Bharath-Kumar and Jaffe (1983) studied algorithms to optimize the cost and delay of a routing tree at the same time. Also, Waxman (1988) discusses heuristics for cost minimization using Steiner trees, taking into consideration the dynamics of inclusion and exclusion of members in a multicast group.

It is also important to note some of the limitations of the Steiner problem as a model for multicast routing. It has been pointed out by Sriram et al. (1998) that Steiner tree techniques work best in situations where a virtual connection must be established. However, in the most general case of packet networks, like the Internet, it does not make much sense to minimize the cost of a routing tree, since each packet can take a very different route. In this case, it is more important to have distributed algorithms with low overhead. Despite this, Steiner trees are still useful as a starting point for more sophisticated algorithms.
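As one such starting point, the path-distance idea of Takahashi and Matsuyama described earlier in this subsection can be sketched compactly. The Python code below is a minimal illustration under stated assumptions, not a reference implementation: the graph is undirected with non-negative edge costs, it is stored as an adjacency dictionary, and the nearest unconnected destination is found by running Dijkstra's algorithm from all tree nodes at once. The function names and the example graph are hypothetical.

```python
import heapq


def shortest_path_to_tree(adj, tree_nodes, targets):
    """Dijkstra started from every tree node at distance 0; returns the path
    (tree node first) to the nearest target, or None if none is reachable."""
    dist = {v: 0 for v in tree_nodes}
    prev = {}
    heap = [(0, v) for v in tree_nodes]
    heapq.heapify(heap)
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u in targets:
            path = [u]
            while path[-1] in prev:  # walk back until a tree node is reached
                path.append(prev[path[-1]])
            return path[::-1]
        for v, w in adj[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (d + w, v))
    return None


def path_distance_steiner(adj, source, destinations):
    """Grow a tree from source by repeatedly attaching the closest destination."""
    tree_nodes = {source}
    tree_edges = set()
    remaining = set(destinations) - tree_nodes
    while remaining:
        path = shortest_path_to_tree(adj, tree_nodes, remaining)
        if path is None:
            raise ValueError("some destination is unreachable")
        tree_edges.update(zip(path, path[1:]))
        tree_nodes.update(path)
        remaining -= tree_nodes
    return tree_edges


# Hypothetical example: x is a Steiner point shared by both destinations.
example_graph = {
    "s": [("x", 2), ("d1", 4), ("d2", 4)],
    "x": [("s", 2), ("d1", 1), ("d2", 1)],
    "d1": [("s", 4), ("x", 1)],
    "d2": [("s", 4), ("x", 1)],
}
steiner_edges = path_distance_steiner(example_graph, "s", {"d1", "d2"})
print(sorted(steiner_edges))  # [('s', 'x'), ('x', 'd1'), ('x', 'd2')]
```

In this small instance the tree of cost 4 through the non-required node `x` is preferred over the two direct links of cost 4 each, which is the effect of allowing Steiner points.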


2.3.2 Steiner Tree Problems with Delay Constraints

The simplest way of applying the Steiner tree problem in multicast networks requires that the costs of edges in the tree represent the communication costs incurred by the resulting multicast routes. In this case we can just apply a number of existing algorithms for the Steiner tree problem, such as the ones discussed in the previous section. However, most applications have additional requirements in terms of the maximum delay for delivery of the information. That is the reason why the most well studied version of the Steiner tree problem applied to multicast routing is the delay constrained version (Im et al. 1997; Kompella et al. 1992, 1993b,a; Jia 1998; Sriram et al. 1998). We give in this section some examples of methods used to give approximate solutions to this problem.

One of the strategies used to solve the delay constrained Steiner tree problem is to adapt existing heuristics, by adding delay constraints. The heuristic proposed by Kompella et al. (1993b), for example, uses methods that are similar to the KMB algorithm (Kou et al. 1981). The resulting heuristic is composed of three stages. The first stage consists of finding a closure graph of constrained shortest paths between all members of a multicast group. The closure graph of $G$ is a complete graph which has the set of nodes $V(G)$ and, for each pair of nodes $u, v \in V$, an edge representing the cost of the shortest path between $u$ and $v$. In the second stage, Kompella's algorithm finds a constrained spanning tree of the closure graph. To do this, the heuristic uses a greedy algorithm based on edge costs, to find a spanning tree with low cost. In the last stage, edges of the spanning tree found in the previous step are mapped back to the original paths in the graph. At the same time, loops are removed using


the shortest path algorithm on the expanded constrained spanning tree. The time complexity of the whole procedure is $O(\Delta n^3)$, where $\Delta$ is the maximum delay allowed by the application. It should be noted, however, that even though it is very similar to the KMB heuristic, this algorithm does not have any proved approximation guarantee. This happens because the delay constraints make the problem much harder to approximate.

Sriram et al. (1998) proposed an algorithm for constructing delay-constrained multicast trees which is optimized for sparse, static groups. Their algorithm is divided into two phases. The first phase is distributed, and works by creating delay constrained paths from source to each destination. The paths are created using a unicast routing algorithm, so it can use information already available on the network. The second phase uses the computed paths to define a routing tree. Each path is added sequentially and cycles are removed as they appear. Basically, on iteration $i$, when a new path $P_i$ is added to an existing tree $T_{i-1}$, each intersection $P_i \cap T_{i-1}$ of the path with the old tree is tested. This is necessary to determine if just the part of $P_i$ which does not intersect can be used, while maintaining the same delay constraint. If this is possible, then the tree becomes $T_i$ after adding the non-intersecting part of the path. Otherwise, the algorithm must remove some parts of the old tree in order to avoid a cycle.

Another heuristic for the delay constrained Steiner tree problem is presented by Feng and Yum (1999). This heuristic uses the idea of constructing a minimum cost tree, as well as a minimum delay tree, and then combining the resulting solutions. Recall that a shortest delay tree can be computed in polynomial time using some algorithm for shortest paths, with the delay being used as the cost function. Thus, the hard part of the algorithm consists of finding the minimum cost tree and then deciding how to combine it with the


minimum delay tree. The algorithm used to compute the minimum cost, delay constrained tree is a modification of Dijkstra's algorithm, which maintains each path within a specified delay constraint. To combine different trees, the algorithm employs a loop removal subroutine, which verifies if the resulting paths still satisfy the delay constraints. The resulting complexity of this algorithm is similar to the complexity of Dijkstra's algorithm, and therefore is an improvement in terms of computation time.

Another possible method for designing good multicast routing trees is to start from algorithms for computing constrained minimum paths. This was the technique chosen by Kumar et al. (1999), who proposed two heuristics for the constrained minimum cost routing problem. In the first heuristic, which is called "dynamic center based heuristic", the idea is to find a center node to which all destinations will be linked, using constrained minimum paths. The center node $c$ is calculated initially by finding the pair of nodes with highest minimum delay path, and taking $c$ as the node in the middle of this path. Other destinations are linked using minimum delay paths with low cost. The second heuristic, called "best effort residual delay heuristic", follows a similar idea, but this time each node added to the current routing tree $T$ has a residual delay bound. New destinations are then linked to the tree through paths which have low cost and delay smaller than the residual delay of the connecting node $v \in T$.

Delay constraints are not the only constraints that have been used with the multicast routing problem. Jiang (1992) discusses another version of the multicast Steiner tree problem, this time with link capacity constraints. His work is related to videoconferencing, where many users need to be source nodes during the establishment of the conference. One of the ideas used is that, as each user can become a source, a distinct multicast tree must be created for each user. He


proposes some heuristics to solve this problem, with computational results for the heuristics.

As a last example, Zhu et al. (1995) proposed a heuristic for routing with delay constraints with complexity $O(k |V|^3 \log |V|)$. The algorithm has two phases. In the first phase, a set of delay-bounded paths is constructed from source to each destination, to form a delay-bounded tree. Then, in the second phase the algorithm tries to optimize this tree, by reducing the total cost at each iteration. The algorithm is also shown to be useful for optimizing objective functions other than total cost. For example, it can be used to minimize the maximum congestion in the network, after changes in the second phase to account for the new objective function. In the paper there are comparisons between the proposed heuristic and the heuristic for the Steiner tree problem proposed by Kou et al. (1981). The results show that the heuristic achieves solutions very close to those given by the algorithm for Steiner tree.

Sparsity and Delay Minimization

Chung et al. (1997) proposed heuristics for delay constrained minimum multicast routing, considering the structure of sparse problems. The heuristic depends on the use of other algorithms to find approximate solutions to the Steiner problem. The Steiner tree heuristic is used to return two solutions: in the second run, the cost function $c$ is replaced by the delay function $d$. Thus, there are two solutions which optimize different objective functions. The main idea of the proposed algorithm is to optimize the cost of the routing tree and the maximum delay at the same time. To do this, the algorithm uses a method proposed by Blokh and Gutin (1996), which is based on Lagrangian relaxation. A criticism that can be made of the work of Chung et al. (1997) is that the goal of optimizing the Steiner tree with delay cost is not what is


required in most applications. For example, a solution can be optimal for this goal, while some path from $s$ to a destination $d$ still has delay greater than a constant $\Delta$. This happens because the global optimum does not imply that each source-destination path is restricted to the maximum delay.

2.3.3 The On-line Version of Multicast Routing

The multicast routing problem can be generalized in the following way. Suppose that a multicast group can be increased or reduced by means of on-line requests posted by nodes of the network. This is a harder problem, since solutions that are optimal for a fixed group can quickly become inaccurate, and even very far from the optimum, after a number of additions and removals.

Researchers in the area of multicast routing have devised some ways to deal with the problem of reconfiguring a multicast tree when inclusions and departures of members of a group occur (Aguilar et al. 1986; Waxman 1988). A common approach consists of modifying the simple existing algorithms in order to avoid the re-computation of the entire tree for each change. However, as noted in Pasquale et al. (1998), a problem with such methods is that the global optimality of the resulting trees is lost at each change, and a very bad solution can emerge after many such local modifications.

One of the difficulties of the source tree based techniques in this respect is that, for each change in the multicast group, a new tree must be computed to restore service at the required level. The algorithms necessary to create this tree are, however, expensive, and this makes the technique unsuitable for dynamic groups. Kheong et al. (2001) proposed an algorithm to speed up the creation of multicast routing trees in the case of dynamic changes. The idea is to maintain caches of pre-computed multicast trees from previous groups.


The cache can be used to quickly find new paths, connecting some of the members of the group. An algorithm for retrieving data from the path cache was proposed, which finds similarities between the previous and the current multicast groups. Then the algorithm constructs a connecting path using parts of the paths stored in the cache.

The difficulty of adapting source based techniques to the dynamic case has motivated the appearance of specialized algorithms for the on-line version of the problem. For example, Waxman (1988) defines two types of on-line multicast heuristics. The first type allows a rearrangement of the routing tree after some number of changes, while the second type does not allow such reconfigurations. The theoretical model for this problem is given by the so-called on-line Steiner problem. In this version of the problem, one needs to construct a solution to a Steiner problem subject to the addition and deletion of nodes (Imase and Waxman 1991; Westbrook and Yan 1993; Sriram et al. 1999). This is clearly an NP-hard problem, since it is a generalization of the Steiner problem.

Waxman (1988) studied how a routing tree must be changed when new nodes are added or removed. To better describe this situation, he proposed a random graph model, where the probability that an edge exists between two nodes depends on the Euclidean distance between them. This probability decreases exponentially with the increase of distance between nodes. The random inclusion of links can be used to represent the random addition of new users to a multicast group. Waxman also described a greedy heuristic to approximately solve instances generated according to this model.

Hong et al. (1998) proposed a dynamic algorithm which is capable of handling additions and removals of elements to an existing multicast group. The algorithm is again based on the Steiner tree problem, with added delay


constraints. However, to decrease the computational complexity of the problem, the authors employed the Lagrangian relaxation technique. According to their results, the algorithm finds solutions very close to the optimum when the network is sparse.

Feng and Yum (1999) devised a heuristic algorithm with the main goal of allowing easy insertion of new nodes in a multicast group. The algorithm is similar to Prim's algorithm for spanning trees in that, at each step, it takes a non-connected destination with minimum cost and tries to add this destination to the current solution. The algorithm also uses a priority queue $Q$ where the already connected elements are stored. The key in this priority queue is the total delay between the elements and the source node. The algorithm uses a parameter $k$ to determine how to compute the path from a destination to the current tree. Given a value of $k$, the algorithm computes $k$ minimum delay paths, from the current destination $d$ to each of the smallest $k$ elements in the priority queue. Then, the best of the paths is chosen to be part of the routing tree. An interesting feature of the resulting algorithm is that changing the value of the parameter $k$ changes the amount of effort needed to connect a destination. Clearly, when increasing the value of $k$, better results will be obtained. This algorithm facilitates the inclusion of new elements, because the same procedure can be used to grow the existing tree, in order to accommodate a new node.

Sriram et al. (1999) proposed new algorithms for on-line, delay constrained minimum cost multicast routing that try to maintain a fixed quality of service by specifying minimum delays. The algorithms are able to adapt the routing tree to changes in membership due to inclusions and exclusions of users. One of the problems they try to solve is how to determine the moment at which the tree must be recomputed, and for how long the algorithm should


just make modifications to the original tree. To answer this question, the authors introduced the concept of quality factor, which measures the usefulness of part of the routing tree to the rest of the users. When the quality factor of part of a tree decreases to a specific threshold, that part of the tree must be modified. The authors discuss a technique to rearrange the tree such that the minimum delays continue to be respected.

The first algorithm proposed by Sriram et al. (1999) starts by creating a set of delay constrained minimum cost paths. For each destination, a path is created with bandwidth greater than the required bandwidth $B$ and with delay less than the maximum delay. The next phase uses the resulting paths to create a complete routing tree. The algorithm adds sequentially the edges in each path, and at each step removes the loops created by the addition of the path. Loops are removed in such a way that the delay constraints are not violated.

The second algorithm proposed in Sriram et al. (1999) is a distributed protocol, where initially each destination receives a message in order to add new paths to the source tree. The nodes are kept in a priority list, ordered by increasing delay requirements. According to the order in the list, the destinations receive messages, which ask them to compute parameters over the available paths, and then construct the new paths that will form the final routing tree.

A technique that has been employed by some researchers consists of using information available from unicast protocols to simplify the creation of multicast routes. For example, Baoxian et al. (2000) proposed a heuristic for routing with delay constraints which is based on the information given by


OSPF. Reusing this information, the resulting algorithm can run with improved performance, in this case with complexity $O(|D| |V|)$, where $D$ is the set of destinations. The resulting algorithm has two steps. In the first step, it checks, for each destination $d_i$, if there is some path from the source $s$ to destination $d_i$ satisfying the delay constraint. In the second step, the algorithm uses another heuristic to construct a unicast path from $s$ to $d_i$. This heuristic basically constructs a path using information about predecessor nodes from the unicast protocol as well as the delay information.

2.3.4 Distributed Algorithms

The multicast routing problem is in fact a distributed problem, since each node involved has some available processing power. Thus, it is natural to look for distributed algorithms which can use this computational power in order to reduce their time complexity.

A number of papers have focused on distributed strategies for the delay constrained minimum spanning tree (Jia 1998; Chen et al. 1993). A good example is the algorithm presented in Chen et al. (1993). The authors propose a heuristic that is similar to the general technique used in the KMB heuristic for Steiner tree and in the algorithm of Kompella et al. (1993b), for example. However, the main difference is that a distributed algorithm is used to compute the minimum spanning tree, which must be computed twice during the execution of the heuristic. The method used to find the MST is based on the distributed algorithm proposed by Gallager et al. (1983).

Kompella et al. (1993a) proposed some distributed algorithms targeting applications of audio and video delivery over a network, where the restriction on maximum delay plays an important role. The authors try to improve over


previous algorithms by using a distributed method. The main objective of the distributed procedure is to reduce the overall computational complexity. It must be noted, however, that, using decentralized algorithms, some of the global information about the network becomes harder to find (for example, global connectivity). Thus, a simplified version of the algorithm must be proposed which does not use global information. Nonetheless, according to the authors, the resulting algorithms stay within 15%-30% of the optimal solution, for most test instances.

The first algorithm in Kompella et al. (1993a) is just a version of the Bellman-Ford algorithm (Bellman 1957) which finds a minimum delay tree from the source to each destination. During the construction of each path, the algorithm verifies the cost of the available edges, and chooses the one with lowest cost which satisfies the delay constraints. The algorithm has the objective of achieving feasibility and therefore the results are not necessarily locally optimal. Local optimality can be achieved, however, using another optimization phase, such as a local search algorithm.

In the second algorithm, the strategy employed is similar to Prim's algorithm for minimum spanning tree construction. It consists of growing a routing tree, starting from the source node, until all destinations are reached. The resulting algorithm is specialized according to different techniques for selecting the next edge to be added to the tree. In the first edge selection strategy proposed, edges with smallest cost are selected, such that the delay restrictions are satisfied. The second edge selection rule tries to balance the cost of the edge with the delay imposed by its use. This is done by a "bias" factor, which gives higher priority to edges with smaller delay among edges


with the same cost. The factor used for edge $(i, j)$ is

$$b(i, j) = \frac{w(i, j)}{\Delta - (D(s, i) + d(i, j))},$$

where $D(s, i)$ is the minimum total delay between the source $s$ and node $i$, $\Delta$ is the maximum allowed delay and, as usual, $w(i, j)$ and $d(i, j)$ are the cost and delay between nodes $i$ and $j$.

The authors also discuss the problem of termination, which is an important question for distributed algorithms. In this case, the problem exists because some configurations can report an infeasible problem, while feasibility can be restored by making some changes to the current solution.

Shaikh and Shin (1997) presented a distributed algorithm where the focus is to reduce the complexity of distributed versions of heuristics for the delay constrained Steiner problem. In their paper, the authors try to adapt the model of Prim's and Dijkstra's algorithms to the harder task of creating a multicast routing tree. In this way they aim to reduce the complexity associated with the heuristics for Steiner tree, while producing good solutions for the problem.

The methods employed by Dijkstra's shortest path and Prim's minimum spanning tree algorithms are interesting because they require only local information about the network, and therefore they are known to perform well in distributed environments. The operation of these algorithms consists of adding at each step a new edge to the existing tree, until some termination condition is satisfied. In the algorithm proposed by Shaikh and Shin, the main addition to the structure of Dijkstra's algorithm is a method for distinguishing between destination and non-destination nodes. This is done by the use of an indicator function $I_D$ which returns 1 if and only if the argument is not a destination node. The general strategy is presented in Algorithm 3. Note


that in this algorithm the accumulated cost of a path is set to zero every time a destination node is reached. This guides the algorithm to find paths that pass through destination nodes with higher probability. This strategy is called destination-driven multicast. The resulting algorithm is simple to implement and, according to the authors, performs well in practice.

    input: $G(V, E)$, $s$
    for $v \in V$ do $d[v] \leftarrow \infty$
    $d[s] \leftarrow 0$
    $S \leftarrow \emptyset$
    $Q \leftarrow V$    /* Q is a queue */
    while $Q \neq \emptyset$ do
        $v \leftarrow$ get_min($Q$)
        $S \leftarrow S \cup \{v\}$
        for $u \in N(v)$ do
            if $u \notin S$ and $d[u] > d[v] \, I_D[v] + w(u, v)$ then
                $d[u] \leftarrow d[v] \, I_D[v] + w(u, v)$
            end
        end
    end

Algorithm 3: Modification of Dijkstra's algorithm for multicast routing, proposed by Shaikh and Shin (1997).

Mokbel et al. (1999) discuss a distributed heuristic algorithm for delay-constrained multicast routing, which is divided into a number of phases. The initial phase of the algorithm consists of discovering information about nodes in the network, particularly about delays incurred by packets. In this phase, a packet is sent from a source to all other nodes in the neighborhood. The packet is duplicated at each node, using the flooding technique. At each node visited, information about the total delay and cost experienced by the packet is collected, added to the packet, and retransmitted. Each destination will receive packets with information about the path traversed during the delay time previously defined. After receiving the packets, each destination can


select the resulting path with lowest cost to be the chosen path. As the last step of the initial phase, all destinations send this information to the source node.

In the second phase, the source node will receive the selected paths for each destination and construct a routing tree based on this information. This is a centralized phase, where existing heuristics can be applied to the construction of the tree. To improve the performance of the algorithm, and to avoid an overload of packets in the network during the flooding phase, each node is required to maintain at most $K$ packets at any time. Using this parameter, the time complexity of the whole algorithm is $O(K^2 |V|^2)$.

Sparse Groups

An important case of multicast routing occurs when the number of sources and destinations is small compared to the whole network. This is the typical case for big instances, where just a few nodes will participate in a group at each moment. For this case, Sriram et al. (1998) proposed a distributed algorithm which tries to exploit the sparsity of the problem. The algorithm initially uses information available through a unicast routing protocol to find pre-computed paths in the current network. However, problems can appear when these paths induce loops in the corresponding graph. In the algorithm, such intersections are treated and removed dynamically.

The algorithm starts by creating a list of destinations, ordered according to their delay constraints. Nodes with stricter delay constraints have the opportunity of searching first for paths. Each destination $d_i$ will independently try to find a path from $d_i$ to the source $s$. If during this process a node $v$ previously added to the routing tree is found, then the process stops and a new phase starts. In this new phase, paths are generated from $v$ to the destination $i$. The destination $i$ chooses one


of the paths according to a selection function $SF$ (similar to the function used by Kompella et al. (1993a)), which is defined for each path $P$ and given by

$$SF(P) = \frac{C(P)}{\Delta - D_{T_{i-1}}(s, v) - D(P)},$$

where $C(P)$ and $D(P)$ are the cost and delay of path $P$, $\Delta$ is the maximum delay in this group, and $D_{T_{i-1}}(s, v)$ is the current delay between the source $s$ and node $v$.

A problem that exists when the multicast group is allowed to have dynamic membership is that a considerable amount of time is spent in the process of connection configuration. Jia (1998) proposes a distributed algorithm which addresses this question. A new distributed algorithm is employed, which integrates the routing calculation with the connection configuration phase. Using this strategy, the number of messages necessary to set up the whole multicast group is decreased.

2.3.5 Integer Programming Formulation

Integer programming has been very useful in solving combinatorial optimization problems, via the use of relaxation and implicit enumeration methods. An example of this approach to multicast routing is given by Noronha and Tobagi (1994), who studied the routing problem using an integer programming formulation. They discuss a general version of the problem in which there are costs and delays for each link, and a set $\{1, \ldots, T\}$ of multicast groups, where each group $i$ has its own source $s_i$, a set of $n_i$ destinations $d_{i1}, \ldots, d_{in_i}$, a maximum delay $L_i$, and a bandwidth request $r_i$. There is also a matrix $B^i \in R^{n \times n_i}$, for each group $i \in \{1, \ldots, T\}$, of source-destination requirements. The value of $B^i_{jk}$ is $1$ if $j = s_i$, $-1$ if $j = d_{ik}$, and $0$ otherwise. The node-edge incidence matrix is represented by $A \in Z^{n \times m}$.


The network considered has $n$ nodes and $m$ edges. The vectors $W \in R^m$, $D \in R^m$ and $C \in R^m$ give respectively the costs, delays and capacities for each link in the network. The variables in the formulation are $X^1, \ldots, X^T$ (where each $X^i$ is an $m \times n_i$ matrix), $Y^1, \ldots, Y^T$ (where each $Y^i$ is a vector of $m$ elements), and $M \in R^T$. The variable $X^i_{jk} = 1$ if and only if link $j$ is used by group $i$ to reach destination $d_{ik}$. Similarly, variable $Y^i_j = 1$ if and only if link $j$ is used by multicast group $i$. Also, variable $M_i$ represents the delay incurred by multicast group $i$ in the current solution.

In the following formulation, the objectives of minimizing total cost and maximum delay are considered together. The constant values $\beta_c$ and $\beta_d$ represent the relative weight given to the minimization of the cost and to the minimization of the delays, respectively. Using the variables shown, the integer programming formulation is given by:

$$\min \sum_{i=1}^{T} r_i \beta_c C Y^i + \beta_d M_i \tag{2.1}$$

subject to

$$A X^i = B^i \quad \text{for } i = 1, \ldots, T \tag{2.2}$$
$$X^i_{jk} \le Y^i_j \le 1 \quad \text{for } i = 1, \ldots, T,\ j = 1, \ldots, m,\ k = 1, \ldots, n_i \tag{2.3}$$
$$M_i \ge \sum_{j=1}^{m} D_j X^i_{jk} \quad \text{for } i = 1, \ldots, T,\ k = 1, \ldots, n_i \tag{2.4}$$
$$M_i \le L_i \quad \text{for } i = 1, \ldots, T \tag{2.5}$$
$$\sum_{i=1}^{T} r_i Y^i \le C \tag{2.6}$$
$$X^i_{jk},\ Y^i_j \in \{0, 1\} \quad \text{for } 1 \le i \le T,\ 1 \le j \le m,\ 1 \le k \le n_i. \tag{2.7}$$

The constraints in the above integer program have the following meaning. Constraint (2.2) is the flow conservation constraint for each of the multicast groups. Constraint (2.3) determines that an edge must be selected when it is


used by any multicast tree. Constraints (2.4) and (2.5) determine the value of the delay, since it must be at least the sum of all delays on the path to each destination in the current multicast group and at most the maximum acceptable delay $L_i$. Finally, constraint (2.6) says that each edge $j$ can carry a value which is at most the capacity $C_j$. This is a very general formulation, and because of the integrality constraints (2.7) it cannot be expected to be solved exactly in polynomial time.

This formulation is used in Noronha and Tobagi (1994) to derive an exact algorithm for the general problem. Initially, a decomposition technique was used to decompose the constraint matrix into smaller parts, where each part could be solved more easily. This can be done using standard mathematical programming techniques, as shown, e.g., in Bazaraa et al. (1990). Then, a branch-and-bound algorithm is proposed for the resulting problem. In this branch-and-bound, the lower bounding procedure uses the decomposition found initially to improve the efficiency of the lower bound computation.

2.3.6 Minimizing Bandwidth Utilization

A problem that usually happens when constructing multicast trees is the tradeoff between bandwidth used and total cost of the tree. Traditional algorithms for tree minimization try to reduce the total cost of the tree. However, this in general does not guarantee minimum bandwidth utilization.

On the other hand, there are algorithms for minimization of the bandwidth that do not maintain the minimum cost. For example, a greedy algorithm, as described in Fujinoki and Christensen (1999), works by connecting destinations sequentially to the source. Each destination is linked to the nearest node already connected to the source. In this way bandwidth is saved by reusing existing paths.
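
The greedy strategy just described can be sketched as follows. This is only an illustration under stated assumptions, not the procedure of Fujinoki and Christensen (1999): the graph is taken to be undirected, "nearest" is measured in number of hops (found by breadth-first search), and the function name `greedy_multicast_tree` and the adjacency-dictionary input are hypothetical choices made for this example.

```python
from collections import deque


def greedy_multicast_tree(adj, source, destinations):
    """Connect each destination, in the given order, to the nearest
    node that is already connected to the source.

    adj: dict mapping each node to an iterable of its neighbors
         (assumed undirected; distance is measured in hops).
    Returns the set of tree edges, each stored as a frozenset {u, v}.
    """
    in_tree = {source}
    tree_edges = set()
    for dest in destinations:
        if dest in in_tree:
            continue
        # Breadth-first search from the destination; the first node
        # dequeued that is already in the tree is a nearest one.
        parent = {dest: None}
        queue = deque([dest])
        hit = None
        while queue:
            node = queue.popleft()
            if node in in_tree:
                hit = node
                break
            for nxt in adj[node]:
                if nxt not in parent:
                    parent[nxt] = node
                    queue.append(nxt)
        if hit is None:
            raise ValueError(f"destination {dest!r} is unreachable")
        # Walk back from the attachment point to the destination,
        # adding the path to the tree so later destinations can reuse it.
        node = hit
        while parent[node] is not None:
            prev = parent[node]
            tree_edges.add(frozenset((node, prev)))
            in_tree.add(prev)
            node = prev
    return tree_edges
```

Because every path that is added becomes part of the tree, later destinations attach to it whenever it is closer than the source, which is the path reuse that saves bandwidth.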


Fujinoki and Christensen (1999) proposed a new algorithm for maintaining dynamic multicast trees which tries to solve the tradeoff problem discussed above. The algorithm, called "shortest best path tree" (SBPT), uses shortest paths to connect sources to destinations. The authors represent the distance between nodes as the minimum number of edges in the path between them. The first phase in the algorithm consists of computing the shortest path from $s$ to all destinations $d_i$. In the second phase, the algorithm performs a sequence of steps for each destination $d_i \in D$. Initially, it computes the shortest paths from $d_i$ to all other nodes in $G$. Then, the algorithm takes the node $u$ which has minimum distance from $d_i$ and at the same time occurs in one of the shortest paths from $s$ to $d_i$. By making this choice, the method tries to favor the nodes already in the routing tree, giving the smallest possible increase in the total cost.

2.3.7 The Degree-constrained Steiner Problem

If the number of links from any node in the network is required to be bounded by a fixed value, then we have the degree-constrained version of the multicast routing problem. For some applications of multicasting it is difficult to make a large number of copies of the same data. This is particularly true for high speed switches, where the speed requirements may prohibit in practice an unbounded number of copies of the received information. For example, in ATM networks, the number of out connections can have a fixed limit (Zhong et al. 1993). Thus, it is interesting to consider Steiner tree problems where the degree of each node is constrained.

Bauer (1996) proposed algorithms for this version of the problem, and tried to construct degree-constrained multicast trees as a solution. Bauer and Varma (1995) reviewed the traditional heuristics for Steiner tree, and new


heuristics were given, which consider the restriction in the number of adjacent nodes. They show that the heuristics for degree-constrained Steiner tree give solutions very close to the optimum for sample instances of the general Steiner problem. They also show experimentally that, despite the restriction on the node degrees, almost all instances have feasible solutions which have been found by the heuristics.

2.3.8 Other Restrictions: Non Symmetric Links and Degree Variation

An interesting feature of real networks, which is not mentioned in most of the research papers, is that links are, in general, non symmetric. The capacity in one direction can be different from the capacity in the other direction, for example, due to congestion problems in some links.

Ramanathan (1996) considered this kind of restriction. In his work, the minimum cost routing tree is modeled as a minimum Steiner tree with constraints, where the network has non symmetric links. The author proposes an approximation algorithm, with fixed worst case guarantee. The resulting algorithm has also the nice characteristic of being parameterizable, and therefore it allows the trading of execution time for accuracy.

Another restriction, which is normally disregarded, was considered in the approach taken by Rouskas and Baldine (1996), who proposed the minimization of the so called delay variation. The delay variation is defined as the difference between the minimum and maximum delay defined by a specific routing tree. In some applications it is interesting that this variation stay within a specific range. For example, it can be desirable that all nodes receive the same information at about the same time.
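
To make the definition of delay variation concrete, the short sketch below computes it for a given routing tree. The representation is an assumption made only for this example: the tree is stored as a parent map rooted at the source, link delays are stored in a dictionary keyed by (parent, child) pairs, and the function name `delay_variation` is hypothetical.

```python
def delay_variation(parent, delay, source, destinations):
    """Difference between the largest and smallest source-to-destination
    delay in a routing tree.

    parent: dict mapping each non-source tree node to its parent.
    delay:  dict mapping (parent, child) pairs to the link delay.
    """
    def path_delay(node):
        # Sum link delays while walking from the node up to the source.
        total = 0
        while node != source:
            total += delay[(parent[node], node)]
            node = parent[node]
        return total

    delays = [path_delay(d) for d in destinations]
    return max(delays) - min(delays)
```

A routing tree would then be acceptable for a given tolerance when the returned value does not exceed that tolerance.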


Table 2-1: Comparison among algorithms for the problem of multicast routing with delay constraints. $k$ is the number of destinations. ** This algorithm is partially distributed.

    Algorithm                        Guarantee   Complexity          Types of instances
    KMB (Kou et al. 1981)            2           $O(kn^2)$           general
    Takahashi and Matsuyama (1980)   2           $O(kn^2)$           general
    Kompella et al. (1993b)          -           $O(n^3)$            general
    Sriram et al. (1998)             -           N/A **              sparse, static groups
    Feng and Yum (1999)              -           $O(n^2)$            general
    Kumar et al. (1999)              -           $O(n^3)$            center based
    Jiang (1992)                     -           $O(n^3)$            capacity constrained, video conferencing
    Chung et al. (1997)              -           $O(n^3)$            sparse instances
    Zhu et al. (1995)                -           $O(kn^3 \log n)$    sparse instances

2.3.9 Comparison of Algorithms

A Comparison of Non-distributed Approaches

Table 2-1 gives a summary of features of the algorithms for the Steiner tree problem with delay constraints discussed in this section. Most of them have similar computational complexity, of the order of $O(n^3)$, where $n$ is the number of nodes in the network. The lowest complexity is obtained by Feng and Yum (1999), who combine the work of finding good solutions in terms of cost and delay. The heuristic by Chung et al. (1997) is reported to run faster than other heuristics; it must be noted, however, that it is optimized for sparse instances.

Regarding approximation, only the first two algorithms in the table have a known approximation guarantee (constant and equal to 2). However, it is not difficult to achieve similar performance guarantees on heuristics with the same complexity as KMB, for example. This can be done by running the heuristic with known performance guarantee, followed by the normal heuristic, and then reporting the best solution.
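
The combination described above amounts to a simple wrapper, sketched here under the assumption that every heuristic returns a feasible solution for the instance and that a cost function is available; the names `best_of`, `heuristics`, and `cost` are hypothetical and chosen only for illustration.

```python
def best_of(heuristics, instance, cost):
    """Run every heuristic on the instance and keep the cheapest result.

    If one of the heuristics has a worst-case guarantee (for example, a
    factor of 2), the returned solution inherits it, because its cost is
    never larger than the cost of that heuristic's solution.
    """
    solutions = [heuristic(instance) for heuristic in heuristics]
    return min(solutions, key=cost)
```

The running time is the sum of the running times of the heuristics involved, so the overall complexity is that of the slower one.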


Table 2-2: Comparison among algorithms for the problem of multicast routing with delay constraints. $k$ is the number of destinations, $T_{SP}$ is the time to find a shortest path in the graph. ** In this case amortized time is the important issue, but was not analyzed in the original paper.

    Algorithm                 Complexity             Types of instances
    Kompella et al. (1993a)   $O(n^3)$               general on-line instances
    Baoxian et al. (2000)     $O(mn)$                based on unicast information
    Sriram et al. (1999)      $O(n^3)$               instances with QoS
    Hong et al. (1998)        $O(mk(k + T_{SP}))$    dynamic, delay sensitive
    Kheong et al. (2001)      N/A **                 general instances, cache must be maintained

A Comparison of On-Line Approaches

Table 2-2 presents a comparison of algorithms proposed for the on-line version of the Steiner tree problem with delay constraints, as discussed in Section 2.3.3. These algorithms in general do not provide a guarantee of approximation, due to the dynamic nature of the problem.

From the algorithms shown in Table 2-2, the one with lowest complexity is given by Baoxian et al. (2000). However, this complexity is kept low due to the dependence of the algorithm on information given by other protocols operating unicast routing. Hong et al. (1998), in contrast, consider the construction of a complete solution, and have the on-line issues as an additional feature.

Kheong et al. (2001) also consider a technique where information is reused, but in this case from previous iterations of the algorithm. It is difficult to evaluate the complexity of the whole algorithm, since it depends on the amortized complexity over a large number of iterations. This kind of analysis is not carried out in the paper.


A Comparison of Distributed Approaches

Distributed approaches for the Steiner tree problem with delay constraints are more difficult to evaluate, in the sense that other features become important. For example, in distributed algorithms the message complexity, i.e., the number of message exchanges, is an important indicator of performance. These factors are sometimes not derived explicitly in some of the papers.

For the algorithm proposed by Chen et al. (1993), the message complexity is shown to be $O(m + n(n + \log n))$, and the time complexity is $O(n^2)$. Also, the worst-case ratio of the solution obtained to the cost of any given minimum cost Steiner tree $T$ is $2(1 - 1/l)$, where $l$ is the number of leaves in $T$.

On the other hand, Shaikh and Shin (1997) do not give much information about the complexity of their distributed algorithm. It can be noted, however, that the complexity is similar to that of distributed algorithms for the computation of a minimum spanning tree. Finally, Mokbel et al. (1999) derive only the total time complexity of their algorithm, which is $O(K^2 n^2)$, where $K$ is a constant value introduced to decrease the number of message exchanges required.

2.4 Other Problems in Multicast Routing

In this section we present some other problems occurring in multicast routing which have interesting characteristics in terms of combinatorial optimization. The first of these problems is the multicast packing problem, where the objective is to optimize the design of the entire network in order to provide capacity for a specific number of multicast groups. Then, we discuss the point-to-point connection problem, which is a generalization of the Steiner tree problem.


2.4.1 The Multicast Packing Problem

A more general view of the multicast routing problem can be found if we consider the required constraints when more than one multicast group exists. In this case, there are a number of applications that try to use the network for the purpose of establishing connections and sending information, organized in different groups. Thus, the network capacity must be shared according to the requirements of each group. These capacity constraints are modeled in what is called the multicast packing problem in networks. This problem has attracted some attention in the past few years (Wang et al. 2002; Priwan et al. 1995; Chen et al. 1998).

The congestion $\lambda_e$ on edge $e$ is given by the sum of all load imposed by the groups using $e$. The maximum congestion $\lambda$ is then defined as the maximum of all congestion $\lambda_e$ over edges $e \in E$. If we assume that there are $K$ multicast groups, and each group $k$ generates an amount $t_k$ of traffic, an integer programming formulation for the multicast packing problem is given by

$$\min \lambda \tag{2.8}$$

subject to

$$\sum_{k=1}^{K} t_k x^k_e \le \lambda \quad \text{for all } e \in E \tag{2.9}$$
$$x^k \in \{0, 1\}^{|E|} \quad \text{for } k = 1, \ldots, K, \tag{2.10}$$

where variable $x^k_e$ is equal to one if and only if the edge $e$ is being used by multicast group $k$.

A number of approaches have been proposed for solving this problem. For example, Wang et al. (2002) discuss how to set up multiple groups using routing trees, and formalized this as a packing problem. Two heuristics were


then proposed. The first one is based on known heuristics for constructing Steiner trees. The second is based on the cut-set problem. The constraints considered for the Steiner tree problem are, first, the minimum cost under bounded tree depth; and second, the cost minimization under bounded degree for intermediate nodes.

Priwan et al. (1995) and Chen et al. (1998) proposed formulations for the multicast packing problem using integer programming. The latter authors considered two ways of modeling the routing of multicast information. In the first method, the information is sent according to a minimum cost tree among nodes in the group, which gives rise to the Steiner tree problem. In the second, more interesting version, the information is sent through a ring which visits all elements in the group, and therefore this results in a problem similar to the traveling salesman problem. Using these formulations they describe heuristics that can be applied to get approximate solutions. Comparisons were done between the two proposed formulations with respect to the quality of the solutions found to the multicast packing problem.

In the integer formulation of the multicast packing problem, there is a variable $x_e$ for each edge $e \in E$, which is equal to one if and only if this edge is selected. Each edge also has an associated cost $w_e$. Then, the integer formulation for the tree version of the problem is given by

$$\min \sum_{e \in E} w_e x_e \tag{2.11}$$

subject to

$$\sum_{e \in \delta(S)} x_e \ge 1 \quad \text{for all } S \subset V \text{ such that } m_1 \in S \text{ and } M \not\subset S \tag{2.12}$$
$$x \in \{0, 1\}^{|E|}, \tag{2.13}$$


where $M$ is the set of nodes participating in a multicast group and $\delta(S)$ represents the edges leaving the set $S \subset V$. The integer program for the ring-based version is given by

$$\min \sum_{e \in E} w_e x_e \tag{2.14}$$

subject to

$$\sum_{e \in \delta(v)} x_e = 2 \quad \text{for all } v \in M \tag{2.15}$$
$$\sum_{e \in \delta(v)} x_e \le 2 \quad \text{for all } v \in V \setminus M \tag{2.16}$$
$$\sum_{e \in \delta(S)} x_e \ge 2 \quad \text{for all } S \subset V \text{ s.t. } u \in S \text{ and } M \not\subset S \tag{2.17}$$
$$x \in \{0, 1\}^{|E|}. \tag{2.18}$$

Here, $u$ is any element of $M$. The integer solution of this problem defines a ring passing through all nodes participating in group $M$. In Chen et al. (1998) these two problems are solved using branch-and-cut techniques, after the identification of some valid inequalities.

2.4.2 The Multicast Network Dimensioning Problem

Another interesting problem occurs when we consider the design of a new network, intended to support a specific multicast demand. This is called the multicast network dimensioning problem, and it has been treated in some recent papers (Prytz 2002; Forsgren and Prytz 2002; Prytz and Forsgren 2002). According to Forsgren and Prytz (2002), the problem consists of determining the topology (which edges will be selected) and the corresponding capacity of the edges, such that a multicast service can be deployed in the resulting network. Much of the work for this problem has used mathematical programming techniques to define and give exact and approximate solutions to the problem.


The technique used in Forsgren and Prytz (2002) is Lagrangian relaxation applied to an integer programming model. We assume that there are $T$ multicast groups. The model uses variables $x^k_e \in \{0, 1\}$, for $k \in \{1, \ldots, T\}$, $e \in E$, which represent whether edge $e$ is used by group $k$. There are also variables $z^l_e \in \{0, 1\}$, for $l \in \{1, \ldots, L\}$, $e \in E$, where $L$ is the highest possible capacity level, which determine whether the capacity level of edge $e$ is equal to $l$. Now, let $d^k$, for $k \in \{1, \ldots, T\}$, be the bandwidth demanded by group $k$; $c^l_e$, for $l \in \{1, \ldots, L\}$ and $e \in E$, be the capacity available for edge $e$ at the level $l$; and $w^l_e$, for $l \in \{1, \ldots, L\}$ and $e \in E$, be the cost of using edge $e$ at the capacity level $l$. Also, $b \in Z^n$ is the demand vector, and $A \in R^{n \times m}$ is the node-edge incidence matrix. We can now state the multicast network dimensioning problem using the following integer program:

$$\min \sum_{e \in E} \sum_{l=1}^{L} w^l_e z^l_e \tag{2.19}$$

subject to

$$\sum_{k=1}^{T} d^k x^k_e \le \sum_{l=1}^{L} c^l_e z^l_e \quad \text{for all } e \in E \tag{2.20}$$
$$\sum_{l=1}^{L} z^l_e \le 1 \quad \text{for all } e \in E \tag{2.21}$$
$$A x = b \tag{2.22}$$
$$x, z \text{ binary}. \tag{2.23}$$

In this integer program, constraint (2.20) ensures that the bandwidth used on each edge is at most the available capacity. Constraint (2.21) selects just one capacity level for each edge. Finally, constraint (2.22) enforces flow conservation in the resulting solution.

The problem proposed above has been solved using a branch-and-cut algorithm, employing some basic types of cuts. The authors also use Lagrangian relaxation to reduce the size of the linear program that needs to be solved. Some primal heuristics have been designed to exploit certain similarities with


the Steiner tree problem. These primal heuristics were used to improve the upper bounds found during the branching phase. The resulting algorithm has been able to solve instances with more than 100 nodes.

2.4.3 The Point-to-Point Connection Problem

An interesting generalization of the Steiner problem is known as the point-to-point connection problem (PPCP). In the PPCP we are given two disjoint sets $S$ and $D$ of sources and destinations, respectively. We require that $|S| = |D|$, and denote this common size by $p$. However, that is not an important restriction, since for every network we can extend the set of sources using dummy nodes, if needed. As usual, there is a cost function $w : E \to N$. The objective is to find a minimum cost forest $F \subseteq E$ where each destination is connected to at least one source and, similarly, each source is connected to at least one destination.

This problem was first proposed by Li et al. (1992), who proved that all four versions of the PPCP (directed, undirected, with fixed or non-fixed destinations) are NP-hard when $p$ is given as input. Natu (1995) proposed a dynamic programming algorithm for $p = 2$ with time complexity $O(mn + n^2 \log n)$. Goemans and Williamson (1995) presented an approximation algorithm for a class of forest constrained problems, including the PPCP and the Steiner problem, that runs in $O(n^2 \log n)$ and gives its results within a factor of $2 - 1/p$ of the optimal solution.

The PPCP is useful to model situations where there are multiple sources. Some metaheuristic algorithms have been applied to the problem by Correa et al. (2003) and Gomes et al. (1998). The main idea of these methods is to design simple heuristics and combine them in a framework called asynchronous teams (Talukdar and de Souza 1990), where each heuristic is considered an autonomous agent, capable of improving the existing solutions.
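
The connectivity requirement in the definition of the PPCP given above can be checked with a short union-find routine. The sketch below assumes the undirected version with non-fixed destinations, verifies only the connection condition (it does not check that the edge set is acyclic or of minimum cost), and uses the hypothetical name `is_ppcp_feasible`.

```python
def is_ppcp_feasible(edges, sources, destinations):
    """Return True when every source shares a connected component with
    at least one destination and every destination shares a component
    with at least one source (undirected edges assumed)."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)

    source_components = {find(s) for s in sources}
    destination_components = {find(d) for d in destinations}
    # Both conditions hold exactly when the components touched by the
    # sources are the same as those touched by the destinations.
    return source_components == destination_components
```

A heuristic agent of the kind mentioned above could call such a check before accepting a modified solution.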


Some of the heuristics proposed in Correa et al. (2003) exploit basic features of optimal solutions. For example, one of the heuristics uses the triangle inequality property: given three nodes $a$, $b$ and $c$ in one of the paths in the solution, the combined cost of the paths between $a$ and $b$ and between $b$ and $c$ must be at most the cost of the minimum path between $a$ and $c$. Given a solution, we can check, for each three nodes $a$, $b$ and $c$ in a path, whether this condition is satisfied. If it is not, then we can always improve the solution by making the corresponding substitution, using the minimum path.

2.5 Concluding Remarks

In this chapter we have shown a number of applications and problems associated with multicast routing. We have also shown that most of them are related to other important problems in the area of combinatorial optimization. The topics addressed show that this is an evolving area, still in its development stages. Moreover, most of the interesting problems can be addressed with techniques developed by the combinatorial optimization and operations research communities. We believe that in the next years an increasing number of applications and models will continue to evolve from this field and make it an important source of problems and results.


CHAPTER 3
STREAMING CACHE PLACEMENT PROBLEMS

We study a problem in the area of multicast networks, called the streaming cache placement problem (SCPP). In the SCPP one wants to determine the minimum number of multicast routers needed to deliver content to a specified number of destinations, subject to predetermined link capacities. We initially discuss the different versions of the SCPP found in multicast network applications. Then, a transformation from the Satisfiability problem is used in order to prove NP-hardness of all of these versions of the SCPP. Complexity results are derived for the cases of directed and undirected graphs, as well as with different assumptions about the type of flow in the network.

3.1 Introduction

Multicast protocols are used to send information from one or more sources to a large number of destinations using a single send operation. Networks supporting multicast protocols have become increasingly important for many organizations due to the large number of applications of multicasting, which include data distribution, video-conferencing (Eriksson 1994), groupware (Chockler et al. 1996), and automatic software updates (Han and Shahmehri 2000). Due to the lack of multicast support in existing networks, there is a growing need for updating unicast oriented networks. Thus, there is a clear economic impact in providing support for new multicast enabled applications.

In this chapter we study a problem motivated by the economic planning of multicast network implementations. The streaming cache placement problem (SCPP) has the objective of minimizing costs associated with the


implementation of multicast routers. This problem has only recently received attention (Mao et al. 2003; Oliveira et al. 2003a) and presents many interesting questions still unanswered from the algorithmic and complexity theoretic point of view.

3.1.1 Multicast Networks

In multicast networks, nodes interested in a particular piece of data are called a multicast group. The main objective of such groups is to send data to destinations in the most efficient way, avoiding duplication of transmissions and therefore saving bandwidth. With this aim, special purpose multicast protocols have been devised in the literature. Examples are the PIM (Deering et al. 1996) and core-based (Ballardie et al. 1993) distribution protocols. The basic operation in these routing protocols is to send data to a subset of nodes, duplicating the information only when necessary. Network nodes that understand a multicast protocol are called cache nodes, because they can send multiple copies of the received data. Other nodes simply act as intermediates in the multicast transmission. The main problem to be solved is deciding the route to be used by packets in such a network.

One of the simplest strategies for generating multicast routes is to maintain a routing tree linking all sources and destinations. A similar strategy, which may reduce the number of needed cache nodes, consists in determining a feasible flow from sources to destinations such that all destinations can be satisfied.

A main economic problem, however, is that not all nodes understand these multicast routing protocols. Moreover, upgrading all existing nodes can be expensive (or even impossible, when the whole network is not owned by the same company, as happens in the Internet).

[Figure 3-1: Simple example for the cache placement problem. Nodes s, r, a, b; capacities $c_{s,r} = 1$, $c_{r,a} = 1$, $c_{r,b} = 1$.]

Suppose an extreme situation, where no nodes have multicast capabilities. In this case, the only possible solution consists in sending a separate copy of the required data to each destination in the group. However, in this case instances can quickly become infeasible, as shown in Figure 3-1. Here, all edges have capacity equal to one, and nodes a and b are destinations. In this example, a feasible solution is found when r becomes a cache node. Thus, it is interesting to determine the minimum number of cache nodes required to handle a specified amount of multicast traffic, subject to link capacity constraints. This is called the streaming cache placement problem (SCPP).

Formal Description of the SCPP

Suppose that a graph $G = (V, E)$ is given, with a capacity function $c : E \to \mathbb{Z}_+$, a distinguished source node $s \in V$ and a set of destination nodes $D \subseteq V$. It is required that data be sent from node $s$ to each destination. Thus, we must determine a set $R$ of cache nodes, used to retransmit data when necessary, and the amount of information carried by each edge, which is represented by nonnegative variables $w_e$ such that $w_e \le c_e$ for $e \in E$. The objective of the SCPP is to find a set $R$ of minimum size corresponding to a flow $\{w_e \mid e \in E\}$ such that for each node $v \in D \cup R$ there is a unit flow from some node in $\{s\} \cup R$ to $v$, and the capacity constraints $w_e \le c_e$ for $e \in E$ are satisfied.

A characterization of the set of cache nodes can be given in terms of the surplus of data at each node $v \in V$. Suppose that the number of data units sent by node $v$, also called surplus, is given by variable $b_v$. Note that

the node $s$ must send at least one unit of information, so $b_s$ must be greater than zero. Each destination is required to receive a unit of data, so it has a negative surplus (requirement) of $-1$. Now, suppose that, due to capacity constraints, we need to establish $v$ as a cache node. Then, the surplus at this node cannot be negative, since it is also sending forward the received data. If $v$ is also a destination, then the minimum surplus is zero (in this case it is receiving one unit and sending one unit); otherwise, $b_v \ge 1$. Thus, the set of cache nodes $R \subseteq V \setminus \{s\}$ is the one such that $b_v \ge 0$ and $v \in D$, or $b_v > 0$ and $v \in V \setminus (D \cup \{s\})$, for all $v \in R$.

3.1.2 Related Work

Problems in multicast routing have been investigated by a large number of researchers in the last decade. The most studied problems relate to the design of routing tables with optimal cost. In these problems, given a set of sources and a set of destinations, the objective is to send data from sources to destinations with minimum cost. In the case in which there are no additional constraints, this reduces to the Steiner tree problem on graphs (Du et al. 2001). In other words, it is required to find a tree linking all destinations to the source, with minimum cost. Using this technique, the source and destinations are the required nodes, the remaining ones being the Steiner nodes. Many heuristic algorithms have been proposed for this kind of problem (Chow 1991; Feng and Yum 1999; Kompella et al. 1993b,a; Kumar et al. 1999; Salama et al. 1997b; Sriram et al. 1998).

The problems above, however, consider that all nodes support a multicast protocol. This is not a realistic assumption on existing networks, since most routers do not support multicasting by default. Thus, some sort of upgrade

must be applied, in terms of software or even hardware, in order to deploy multicast applications. Despite this important application, only recently have researchers started to look at this kind of problem.

In Mao et al. (2003) the Streaming Cache Placement problem is defined, in the context of Virtual Private Networks (VPNs). In that paper, the SCPP was proven to be NP-hard, using a reduction from the Exact Cover by 3-Sets problem, and a heuristic was proposed to solve some sample instances. However, the paper does not give details about possible versions of the problem, and proceeds directly to deriving local search heuristics.

Another related problem is the Cache Placement Problem (Li et al. 1999). Here, the objective is to place replicas of some static document on different points of a network, in order to increase accessibility and also decrease the average access time by any client. The important difference between this problem and the SCPP is that Li et al. (1999) do not consider multicast transmissions. Also, there are no restrictions on the capacity of links, and data is considered to be placed at the locations before the real operation of the network.

3.2 Versions of Streaming Cache Placement Problems

In this section, we discuss two versions of the SCPP. In the tree streaming cache placement problem (TSCPP), the objective is to find a routing tree which minimizes the number of cache nodes needed to send data from a source to a set of destinations. We also discuss a modification of this problem where we try to find any feasible flow from source to destinations, minimizing the number of cache nodes. This problem is called the flow streaming cache placement problem (FSCPP).

[Figure 3-2: Simple example for the Tree Cache Placement Problem. Nodes s, r, a, b; capacities $c_{s,r} = 1$, $c_{r,a} = 1$, $c_{r,b} = 1$.]

3.2.1 The Tree Cache Placement Problem

Consider a weighted, capacitated network $G(V, E)$ with a source node $s$ and a routing tree $T$ rooted at $s$ and spanning all nodes in $V$. Let $D$ be a subset of the nodes in $V$ that have a demand for a data stream to be sent from node $s$. The stream follows the path defined by $T$ from $s$ to the demand nodes and takes $B$ units of bandwidth on every edge that it traverses. For each demand node, a separate copy of the stream is sent. Edge capacities cannot be violated.

Note that, depending on the network structure, an instance of this problem can easily become infeasible. To handle this, we allow stream splitters, or caches, to be located at specific nodes in the network. A single copy of the stream is sent from $s$ to a cache node $r$, and from there multiple copies are sent down the tree. The optimization problem consists in finding a routing tree and locating a minimum number of cache nodes.

Figure 3-2 shows a small example for this problem. In this example, if nodes $a$ and $b$ each require a stream (with $B = 1$) from $s$ and node $r$ is not a cache node, then we send two units from $s$ to $r$ and one unit from $r$ to $a$ and from $r$ to $b$. We get an infeasibility on edge $(s, r)$, since two units flow on it, and it has capacity $c_{s,r} = 1 < 2$. However, if node $r$ becomes a cache node, we can send one unit from $s$ to $r$ and then one unit from $r$ to $a$ and one unit from $r$ to $b$. The resulting flow is now feasible.
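The bookkeeping in this example can be sketched in code. The following is a minimal Python sketch under stated assumptions, not a definitive implementation: the routing tree is assumed to be given as a child map, capacities as a dictionary keyed by (parent, child) pairs, $B = 1$, and every capacity is at least one. The function name `place_caches` and the data layout are hypothetical choices made for illustration. Demands are pushed from the leaves toward the root, and a node becomes a cache whenever the edge above it cannot carry all the copies its subtree needs.

```python
def place_caches(children, cap, dests, root):
    """Bottom-up cache placement on a fixed routing tree (illustrative sketch).

    children: dict mapping a node to the list of its children in the tree
    cap:      dict mapping (parent, child) to an integer edge capacity (>= 1)
    dests:    set of destination nodes, each needing one unit of the stream
    root:     the source node s
    Returns the set of nodes that are turned into caches.
    """
    caches = set()

    def demand(v):
        # Units that must enter v: one for v itself if it is a destination,
        # plus whatever each child subtree requires.
        need = 1 if v in dests else 0
        for w in children.get(v, []):
            sub = demand(w)
            if cap[(v, w)] < sub:
                # Edge (v, w) cannot carry all copies, so w replicates:
                # it then needs a single incoming unit.
                caches.add(w)
                sub = 1
            need += sub
        return need

    demand(root)
    return caches


# The instance of Figure 3-2: s -> r -> {a, b}, all capacities equal to one.
fig_children = {"s": ["r"], "r": ["a", "b"]}
fig_cap = {("s", "r"): 1, ("r", "a"): 1, ("r", "b"): 1}
print(place_caches(fig_children, fig_cap, {"a", "b"}, "s"))  # {'r'}
```

On this instance the sketch reports that r must be a cache, in agreement with the discussion above; raising $c_{s,r}$ to 2 would make the cache unnecessary.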

To simplify the formulation of the problem, we can consider, without loss of generality, that the bandwidth used by each message is equal to one. The tree cache placement problem (TSCPP) is defined as follows. Given a graph $G(V, E)$ with capacities $c_{uv}$ on the edges, a source node $s \in V$ and a subset $D \subseteq V$ representing the destination nodes, we want to find a spanning tree $T$ (which determines the paths followed by a data stream from $s$ to $v \in D$) such that the subset $R \subseteq V \setminus \{s\}$, which represents the cache nodes, has minimum size. For each node $v \in D \cup R$ there must be a data stream from some node $w \in R \cup \{s\}$ to $v$, such that the sum of all streams in each edge $(i, j) \in T$ does not exceed the edge capacity $c_{ij}$.

To state the problem more formally, consider an integer programming formulation for the TSCPP. Define the variables

$y_e = 1$ if edge $e$ is in the spanning tree $T$, and $y_e = 0$ otherwise,
$x_i = 1$ if node $i \ne s$ is a cache node, and $x_i = 0$ otherwise,
$b_i \in \{-1, \ldots, |V| - 1\}$, the flow surplus for node $i \in V$,
$w_e \in \{0, \ldots, |V|\}$, the amount of flow in edge $e \in E$.

Given the node-arc incidence matrix $A$, the problem can be stated as

$\min \sum_{i=1}^{|V|} x_i$    (3.1)
subject to
$A w = b$    (3.2)
$\sum_{i \in V} b_i = 0$    (3.3)

$b_s \ge 1$ for source $s$    (3.4)
$x_i - 1 \le b_i \le x_i (|V| - 1) - 1$ for $i \in D$    (3.5)
$x_i \le b_i \le x_i (|V| - 1)$ for $i \in V \setminus (D \cup \{s\})$    (3.6)
$\sum_{e \in E} y_e = |V| - 1$    (3.7)
$\sum_{e \in G(H)} y_e \le |H| - 1$ for all $H \subset V$    (3.8)
$0 \le w_e \le c_e y_e$ for $e \in E$    (3.9)
$x \in \{0, 1\}^{|V|}$, $y \in \{0, 1\}^{|E|}$    (3.10)
$b \in \mathbb{Z}$, $w \in \mathbb{Z}_+$,    (3.11)

where $G(H)$ is the subgraph induced by the nodes in $H$. Constraint (3.2) imposes flow conservation. Constraints (3.3) through (3.6) require that there must be a number of data streams equal to the number of nodes in $R \cup D$. Constraints (3.7) and (3.8) are the spanning tree constraints. Finally, constraint (3.9) determines the bounds for the flow variables, implying that the flow specified by $w$ can be carried only on edges in the spanning tree.

3.2.2 The Flow Cache Placement Problem

An interesting extension of the TSCPP arises if we relax the constraints in the previous integer programming formulation that require the solution to be a tree of the graph $G$. Then we have the more general case of a flow sent from the source node $s$ to the set of destination nodes $D$.

To see why this extension is interesting, consider the example graph shown in Figure 3-3. In this example all edges have capacities equal to one. If we find a solution to the TSCPP on this graph, then a stream can be sent through only one of the two edges $(s, a)$ or $(s, b)$. Suppose that we use edges $(s, a)$ and $(a, c)$. This implies that $c$ must be a cache node, in order to satisfy demand nodes $d_1$ and

$d_2$. However, in practice the number of caches in this optimal solution for the TSCPP can be further reduced. Routing protocols, like OSPF, achieve load balancing by sending data through parallel links. In the case of Figure 3-3, the protocol could just send another stream of data over edges $(s, b)$ and $(b, c)$. If this happens, we do not need a cache node, and the solution will have fewer caches.

[Figure 3-3: Simple example for the Flow Cache Placement Problem. Nodes s, a, b, c, $d_1$, $d_2$.]

We define the Flow Cache Placement Problem (FSCPP) to be the problem of finding a feasible flow from source $s$ to the set of destinations $D$ such that the number of required caches $R \subseteq V \setminus \{s\}$ is minimized. The integer linear programming model for this problem is similar to (3.1)-(3.11), without the integer variable $y$ and relaxing constraints (3.7)-(3.8).

3.3 Complexity of the Cache Placement Problems

We prove that both versions of the SCPP discussed above are NP-hard, using a transformation from Satisfiability. This transformation allows us to give a proof of non-approximability by showing that it is a gap-preserving transformation.

3.3.1 Complexity of the TSCPP

In this section we prove that the TSCPP is NP-hard, by using a reduction from Satisfiability (SAT) (Garey and Johnson 1979).

SAT: Given a set of clauses $C_1, \ldots, C_m$, where each clause is the disjunction of $|C_i|$ literals (each literal is a variable $x_j \in \{x_1, \ldots, x_n\}$ or its negation $\bar{x}_j$), is there a truth assignment for variables $x_1, \ldots, x_n$ such that all clauses are satisfied?

Definition 1 The TSCPP-D problem is the following. Given an instance of the TSCPP and an integer $k$, is there a solution such that the number of cache nodes needed is at most $k$?

Theorem 2 The TSCPP-D problem is NP-complete.

Proof: This problem is clearly in NP, since for each instance $I$ it is enough to give the spanning tree and the nodes in $R$ to determine, in polynomial time, if this is a "yes" instance.

We reduce SAT to TSCPP-D. Given an instance $I$ of SAT, composed of $m$ clauses $C_1, \ldots, C_m$ and $n$ variables $x_1, \ldots, x_n$, we build a graph $G(V, E)$, with $c_e = 1$ for all $e \in E$, and choose $k = n$. The set $V$ is defined as

$V = \{s\} \cup \{x_1, \ldots, x_n\} \cup \{\bar{x}_1, \ldots, \bar{x}_n\} \cup \{T'_1, \ldots, T'_n\} \cup \{T''_1, \ldots, T''_n\} \cup \{T'''_1, \ldots, T'''_n\} \cup \{C_1, \ldots, C_m\}$,

and the set $E$ is defined as

$E = \bigcup_{i=1}^{n} \{(s, x_i), (s, \bar{x}_i)\} \cup \bigcup_{i=1}^{n} \{(x_i, T'_i), (\bar{x}_i, T'_i)\} \cup \bigcup_{i=1}^{n} \{(x_i, T''_i)\} \cup \bigcup_{i=1}^{n} \{(\bar{x}_i, T'''_i)\} \cup \bigcup_{i=1}^{m} \Big\{ \bigcup_{x_j \in C_i} (x_j, C_i) \cup \bigcup_{\bar{x}_j \in C_i} (\bar{x}_j, C_i) \Big\}$.    (3.12)

Figure 3-4 shows the construction of $G$ for a small SAT instance. Define $D = \{C_1, \ldots, C_m\} \cup \{T'_1, \ldots, T'_n\} \cup \{T''_1, \ldots, T''_n\} \cup \{T'''_1, \ldots, T'''_n\}$. Clearly, destination nodes $T'_i$, $T''_i$ and $T'''_i$ are there just to saturate the arcs leaving $s$ and force one of $x_i$, $\bar{x}_i$ to be chosen as a cache node. Also, each node $C_i$ forces

the existence of at least one cache among the nodes corresponding to literals appearing in clause $C_i$.

Suppose that the answer to the resulting TSCPP-D problem is true. Then, we assign variable $x_i$ to true if node $x_i$ is in $R$; otherwise we set $x_i$ to false. This assignment is well-defined, since exactly one of the nodes $x_i$, $\bar{x}_i$ must be selected. Clearly, this truth assignment satisfies all clauses $C_i$, because the demand of each node $C_i$ is satisfied by at least one node corresponding to literals appearing in clause $C_i$.

Conversely, if there is a truth assignment which satisfies the SAT formula, we can use it to define the nodes which will be caches and, by construction of $G$, all demands will be satisfied. Finally, the resulting construction is polynomial in size, thus SAT reduces in polynomial time to TSCPP-D. □

[Figure 3-4: Small graph $G$ created in the reduction given by Theorem 2. The example SAT formula has four variables and three clauses: $C_1$ over $x_1, x_2, x_3$; $C_2$ over $x_2, x_3, x_4$; and $C_3$ over $x_1, x_3, x_4$.]
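The construction used in the proof can be sketched as a short program. This is a minimal Python sketch under stated assumptions, not the construction's definitive form: literals are encoded as nonzero integers (`j` for $x_j$, `-j` for its negation), node names are plain strings, and the sketch assumes that in (3.12) node $T''_i$ is attached to $x_i$ and node $T'''_i$ to $\bar{x}_i$. The function name `sat_to_tscpp` is hypothetical.

```python
def sat_to_tscpp(n, clauses):
    """Build the unit-capacity TSCPP-D instance for a SAT formula (sketch).

    n:       number of variables x_1, ..., x_n
    clauses: list of clauses, each a list of nonzero ints (j or -j)
    Returns (nodes, cap, dests, k), where cap maps each arc to capacity 1
    and k = n is the bound on the number of cache nodes.
    """
    source = "s"
    cap = {}
    dests = set()

    def lit_node(lit):
        # "x3" for the positive literal, "~x3" for the negated one.
        return f"x{lit}" if lit > 0 else f"~x{-lit}"

    for i in range(1, n + 1):
        t1, t2, t3 = f"T'{i}", f"T''{i}", f"T'''{i}"
        for arc in [(source, lit_node(i)), (source, lit_node(-i)),
                    (lit_node(i), t1), (lit_node(-i), t1),
                    (lit_node(i), t2), (lit_node(-i), t3)]:
            cap[arc] = 1
        dests.update([t1, t2, t3])

    for idx, clause in enumerate(clauses, start=1):
        c = f"C{idx}"
        dests.add(c)
        for lit in clause:
            # One arc from each literal node to the clauses containing it.
            cap[(lit_node(lit), c)] = 1

    nodes = {source} | dests | {u for arc in cap for u in arc}
    return nodes, cap, dests, n
```

For a formula with $n$ variables and $m$ clauses the sketch produces $1 + 5n + m$ nodes and $6n + \sum_i |C_i|$ arcs, which makes the polynomial size of the construction explicit.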

As a simple consequence of this theorem, we have the following corollary.

Corollary 3 The TSCPP is NP-hard.

It is interesting to observe that the problem remains NP-hard even for unitary-capacity networks, since the proof remains the same for edges with unitary capacity.

Some simple examples serve to illustrate the problem. For instance, if $G$ is the complete graph $K_n$, then the optimal solution is simply a star graph with $s$ at the center, and $R = \emptyset$. On the other hand, if the graph is a tree with $n$ nodes, then the number of cache nodes is implied by the edges of the tree, thus the optimum is completely determined.

Algorithm 4: Find the optimal R for a fixed tree.

    Input: a tree T
    Output: a set R of cache nodes
    forall v ∈ V do
        if v ∈ D then demand(v) ← 1
        else demand(v) ← 0
    end
    call findR(s)
    return R

    procedure findR(v)
    begin
        forall w such that (v, w) ∈ T do findR(w)
    (1) if v = s then return R
        else
            p ← parent(v)
            if c_{p,v} < demand(v) then
                R ← R ∪ {v}
                demand(p) ← demand(p) + 1
            end
            else demand(p) ← demand(p) + demand(v)
    end

Algorithm 4 determines an optimal set $R$ from a given tree $T$. The algorithm works recursively. Initially it finds the demand for all leaves of $T$. Then

it goes up the tree, determining if the current node must be a cache node. The correctness of this method is proved below.

Theorem 4 Given an instance of the TSCPP which is a tree $T$, an optimal solution for $T$ is given by Algorithm 4.

Proof: The proof is by induction on the height $h$ of the tree analyzed when Algorithm 4 arrives at line (1). If $h = 0$ then the number of cache nodes is clearly equal to zero. Assume that the theorem is true for trees with height $h \ge 1$. If the capacity of the arc $(p, v)$ is at least the demand at $v$, then there is no need of a new cache node, and therefore the solution remains optimal. If, on the other hand, $(p, v)$ does not have enough capacity to satisfy all demand at $v$, then we do not have a choice other than making $v$ a cache node. Combining this with the assumption that the solution for all children of $v$ is optimal, we conclude that the new solution for a tree of height $h + 1$ is also optimal. □

3.3.2 Complexity of the FSCPP

We can use the transformation from SAT to TSCPP to show that FSCPP is also NP-hard. In the case of directed edges, this is simple, since given a graph $G$ provided by the reduction, we can give an orientation of $G$ from source to destinations. This is stated in the next theorem.

Theorem 5 The FSCPP is NP-hard if the instance graph is directed.

Proof: The proof is similar to the proof of Theorem 2. We need just to make sure that the polynomial transformation given for the TSCPP-D also works for a decision version of the FSCPP. Given an instance of SAT, let $G$ be the corresponding graph found by the reduction. We orient the edges of $G$ from $s$ to the destinations $D$, i.e., we use the implicit orientation given in (3.12). It can be checked that in the resulting instance the number of cache nodes cannot be

reduced by sending additional flow on edges other than the ones which form the tree in the solution of TSCPP. Thus, the resulting $R$ is the same, and FSCPP is NP-hard in this case. □

[Figure 3-5: Part of the transformation used by the FSCPP. Nodes $s$, $x_i$, $\bar{x}_i$, $T'_i$, $T''_i$, $T^1_i, T^2_i, \ldots, T^6_i$.]

Next we prove a slightly modified theorem for the undirected version. To do this we need the following variant of SAT:

3Sat(5): Given an instance of Satisfiability with at most three literals per clause and such that each variable appears in at most five clauses, is there a truth assignment that makes all clauses true?

The 3Sat(5) problem is well known to be NP-complete (Garey and Johnson 1979).

Theorem 6 The FSCPP is NP-hard if the instance graph is undirected.

Proof: When the instance of FSCPP is undirected, the only thing that can go wrong is that some of the destinations $T'_i$, $T''_i$ or $T'''_i$ are satisfied by flow coming from nodes $C_j$ connected to their respective $x_i$, $\bar{x}_i$ nodes. What we need to do to prevent this is to bound the number of occurrences of each variable and add enough absorbing destinations to the subgraph corresponding to that variable. We do this by reduction from 3Sat(5). The reduction is essentially the same as the reduction from SAT to TSCPP, but now for each variable $x_i$ we have nodes $x_i$, $\bar{x}_i$, $T'_i$, $T''_i$, and $T^k_i$, for $1 \le k \le 6$ (see Figure 3-5). Also,

for each variable $x_i$ we have edges $(s, x_i)$, $(s, \bar{x}_i)$, $(x_i, T'_i)$, $(\bar{x}_i, T''_i)$, $(x_i, T^k_i)$, $(\bar{x}_i, T^k_i)$, for $1 \le k \le 6$.

We claim that in this case, for each pair of nodes $x_i$, $\bar{x}_i$, one of them must be a cache node (which says that the corresponding variable in 3Sat(5) is true or false). This is true because, of the eight destinations not corresponding to clauses ($T'_i$, $T''_i$ and $T^k_i$, $1 \le k \le 6$) attached to $x_i$, $\bar{x}_i$, two can be directly satisfied from $s$ without caches. However, the remaining six cannot be satisfied from nodes $C_j$ linked to the current variable nodes, because there are at most five such nodes. Thus, we must have one cache node at $x_i$ or $\bar{x}_i$ for each variable $x_i$. It is clear that these are the only cache nodes needed to make all destinations satisfied. This gives us the correct truth assignment for the original 3Sat(5) instance. Conversely, any non-satisfiable formula will transform to an FSCPP instance which needs more than $n$ cache nodes to satisfy all destinations. Thus, the decision version of FSCPP is NP-complete, and this implies the theorem. □

Note that there is a case of TSCPP-D that is solvable in polynomial time, and this happens when $k = 0$, i.e., when determining if any cache node is needed. The solution is given by the following algorithm. Run the maximum flow algorithm from node $s$ to all nodes in $D$. This can be accomplished, for example, by creating a dummy destination node $d$ and linking all nodes $v \in D$ to $d$ by arcs with capacity equal to 1. If the maximum flow from $s$ reaches each node in $D$ then the answer is true, since no cache node is needed to satisfy the destinations. Otherwise, the answer must be false, because then at least one cache node is needed to satisfy all nodes in $D$.
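The max-flow test just described can be sketched as follows. This is a minimal Python sketch under stated assumptions: the instance is assumed to be given as a dictionary of directed arcs with integer capacities, the maximum flow is computed with a plain Edmonds-Karp augmenting-path search (any maximum flow algorithm would do), and the names `max_flow` and `needs_no_cache` are hypothetical.

```python
from collections import deque


def max_flow(cap, source, sink):
    """Edmonds-Karp maximum flow on a dict {(u, v): capacity} of directed arcs."""
    residual, adj = {}, {}
    for (u, v), c in cap.items():
        residual[(u, v)] = residual.get((u, v), 0) + c
        residual.setdefault((v, u), 0)          # reverse arc for cancellations
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck along the path, then push that much flow.
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[(parent[v], v)])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
            v = u
        total += bottleneck


def needs_no_cache(cap, source, dests):
    """True if every destination can receive one unit without any cache."""
    dummy = ("dummy", "d")                      # the dummy destination node d
    extended = dict(cap)
    for v in dests:
        extended[(v, dummy)] = 1                # unit arc from each v in D to d
    return max_flow(extended, source, dummy) == len(dests)
```

With the instance of Figure 3-1 (arcs s→r, r→a, r→b, all of capacity one, and destinations a and b), the maximum flow is 1 while two units are required, so the test reports that at least one cache is needed.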

3.4 Concluding Remarks

In this chapter we presented and analyzed two combinatorial optimization problems, the tree cache placement problem (TSCPP) and its flow-based, generalized version, the flow cache placement problem (FSCPP). We proved that both problems, on directed and undirected graphs, are NP-hard. For this purpose, we used a transformation from the Satisfiability problem.

Many questions remain open for these problems. For example, it would be interesting to find algorithms with a better approximation guarantee, or improved non-approximability results. Some of these issues will be considered in the next chapters.

CHAPTER 4
COMPLEXITY OF APPROXIMATION FOR STREAMING CACHE PLACEMENT PROBLEMS

As shown in the previous chapter, the SCPP in its two forms is NP-hard. We improve the hardness results for the SCPP by showing that it is very difficult to give approximate solutions for such problems. General non-approximability is proved using the reduction from Satisfiability given in the previous chapter. Then, we improve the approximation results for the FSCPP using a reduction from Set Cover. In particular, given $k$ destinations, we show that the FSCPP cannot have an $O(\log \log k - \delta)$-approximation algorithm, for a very small $\delta$, unless NP can be solved in sub-exponential time.

4.1 Introduction

We continue in this chapter the study of the streaming cache placement problem (SCPP). In the SCPP one wants to determine the minimum number of multicast routers needed to deliver content to a specified number of destinations, subject to predetermined link capacities.

The SCPP is known to be NP-hard, as shown in Chapter 3. We give approximation results for the SCPP in its different versions, using properties of the Satisfiability problem. We use the transformation described in the previous chapter to achieve this non-approximability result. We show that there is a fixed $\epsilon > 1$ such that no SCPP problem can be approximated in polynomial time with guarantee better than $\epsilon$. This is equivalent to saying that the SCPP is in the MAX SNP-hard class (Papadimitriou and Yannakakis 1991).

We are also able to improve the approximation results for the FSCPP using a reduction from Set Cover. In this case, we are interested in general flows and directed arcs. In particular, given $k$ destinations, we show that the FSCPP cannot have an $O(\log \log k - \delta)$-approximation algorithm, for a very small $\delta$, unless NP can be solved in sub-exponential time.

This chapter is organized as follows. In Section 4.2 we discuss the non-approximation result for FSCPP based on the Satisfiability problem. Then, in Section 4.3 we discuss the improved result for the FSCPP based on Set Cover. Section 4.4 gives some concluding remarks.

4.2 Non-approximability

The transformation used in Theorem 2 provides a method for proving a non-approximability result for the TSCPP and FSCPP. We employ standard techniques, based on gap-preserving transformations. To do this we use an optimization version of 3Sat(5).

Max-3Sat(5): Given an instance of 3Sat(5), find the maximum number of clauses that can be satisfied by any truth assignment.

Definition 2 For any $0 < \epsilon < 1$, an approximation algorithm with guarantee $\epsilon$ (or equivalently, an $\epsilon$-approximation algorithm) for a maximization problem $\Pi$ is an algorithm $A$ such that, for any instance $I \in \Pi$, the resulting cost $A(I)$ of $A$ applied to instance $I$ satisfies

$\epsilon \, OPT(I) \le A(I)$,

where we denote by $OPT(I)$ the cost of the optimum solution. For minimization problems, $A(I)$ must satisfy

$A(I) \le \epsilon \, OPT(I)$,

for any fixed $\epsilon > 1$.

The following theorem from Arora and Lund (1996) is very useful to prove hardness of approximation results.

Theorem 7 (Arora and Lund (1996)) There is a polynomial time reduction from SAT to Max-3Sat(5) which transforms a formula $\phi$ into a formula $\phi'$ such that, for some fixed $\epsilon$ ($\epsilon$ is in fact determined in the proof of the theorem),
- if $\phi$ is satisfiable, then $OPT(\phi') = m$, and
- if $\phi$ is not satisfiable, then $OPT(\phi') < (1 - \epsilon) m$,
where $m$ is the number of clauses in $\phi'$.

In the following theorem we use this fact to show a non-approximability result for TSCPP.

Theorem 8 The transformation used in the proof of Theorem 2 is a gap-preserving transformation from Max-3Sat(5) to TSCPP. In other words, given an instance $\phi$ of Max-3Sat(5) with $m$ clauses and $n$ variables, we can find an instance $I$ of TSCPP such that
- if $OPT(\phi) = m$ then $OPT(I) = n$; and
- if $OPT(\phi) \le (1 - \epsilon) m$ then $OPT(I) \ge (1 + \epsilon_1) n$,
where $\epsilon$ is given in Theorem 7 and $\epsilon_1 = \epsilon / 15$.

Proof: Suppose that $\phi$ is an instance of Max-3Sat(5). Then, we can use the transformation given in the proof of Theorem 2 to construct a corresponding instance $I$ of TSCPP. If $\phi$ has a solution with $OPT(\phi) = m$, where $m$ is the number of clauses, then by Theorem 2 we can find a solution for $I$ such that $OPT(I) = n$.

Now, if $OPT(\phi) \le (1 - \epsilon) m$ then there are at least $\epsilon m$ clauses unsatisfied. In the corresponding instance $I$ we have at least $n$ cache nodes, due to the constraints from nodes $T'_i$, $T''_i$ and $T'''_i$, $1 \le i \le n$. These cache nodes satisfy at most $(1 - \epsilon) m$ destinations corresponding to clauses. Let $U$ be the set of unsatisfied destinations. The nodes in $U$ can be satisfied by setting one extra cache (in a total of two, for nodes $x_j$ and $\bar{x}_j$) for at least one variable $x_j$ appearing in the clause corresponding to $c_i$, for all $c_i \in U$.

Thus, the number of extra cache nodes needed to satisfy $U$ is at least $|U| / 5$, since a variable can appear in at most 5 clauses. We have

$OPT(I) \ge n + |U| / 5 \ge n + \epsilon m / 5 \ge (1 + \epsilon / 15) n$.

The last inequality follows from the trivial bound $m \ge n / 3$. The theorem follows by setting $\epsilon_1 = \epsilon / 15$. □

Definition 3 A PTAS (Polynomial Time Approximation Scheme) for a minimization problem $\Pi$ is an algorithm that, for each $\epsilon > 0$ and instance $I \in \Pi$, returns a solution $A(I)$ such that $A(I) \le (1 + \epsilon) OPT(I)$, and $A$ has running time polynomial in the size of $I$, depending on $\epsilon$ (see, e.g., Papadimitriou and Steiglitz (1982), page 425).

Corollary 9 Unless P = NP, the TSCPP cannot be approximated by $(1 + \epsilon_2)$ for any $\epsilon_2 < \epsilon_1$, where $\epsilon_1$ is given in Theorem 8, and therefore there is no polynomial time approximation scheme (PTAS) for the TSCPP.

Proof: Given an instance $\phi$ of SAT, we can use the transformation given in Theorem 7, coupled with the transformation given in the proof of Theorem 2, to give a new polynomial transformation $\tau$ from SAT to TSCPP. Now, let $I$ be the instance created by $\tau$ on input $\phi$. Suppose there is an $\epsilon_2$-approximation algorithm $A$ for TSCPP with $0 < \epsilon_2 < \epsilon_1$. Then, when $A$ runs on an instance $I$ constructed by $\tau$ from a satisfiable formula $\phi$, the result must have cost $A(I) \le (1 + \epsilon_2) n < (1 + \epsilon_1) n$. Otherwise, if $\phi$ is not satisfiable, then the result given by this algorithm must be at least $(1 + \epsilon_1) n$, because of the gap introduced by $\tau$. Thus, if there is an $\epsilon_2$-approximation algorithm, then we can decide in polynomial time if a formula is satisfiable or not. Assuming P ≠ NP, there is no such algorithm. The fact that there is no PTAS for the TSCPP is a consequence of this result and the definition of PTAS. □

The above theorem and corollary can be easily extended to the FSCPP. Since the same transformation can be used for both problems, the non-approximability result holds for the FSCPP as well. We state this as a corollary.

Corollary 10 Unless P = NP, the FSCPP has no PTAS.

Proof: The transformation from SAT to FSCPP is identical, so Theorem 8 is also valid for the FSCPP. This implies that the FSCPP has no PTAS, unless P = NP. □

4.3 Improved Hardness Result for FSCPP

In this section, we are interested in the case of general flows and directed arcs. This version of the problem is called the flow streaming cache placement problem (FSCPP). In particular, given $k$ destinations, we show that the FSCPP cannot have an $O(\log \log k - \delta)$-approximation algorithm, for a very small $\delta$, unless NP can be solved in sub-exponential time.

We have shown above that there is an $\epsilon > 0$ such that the FSCPP cannot be approximated within $1 + \epsilon$, thus demonstrating that FSCPP is MAX SNP-hard (Papadimitriou and Yannakakis 1991) and discarding the possibility of a PTAS. We show a stronger result: there is no approximation algorithm that can give a performance guarantee better than $\log \log k$, where $k$ is the number of destinations. The proof is based on a reduction from the Set Cover problem.

Set Cover: Given a ground set $T = \{t_1, \ldots, t_n\}$ with subsets $S_1, \ldots, S_m \subseteq T$, find the minimum cardinality set $C \subseteq \{1, \ldots, m\}$ such that $\bigcup_{i \in C} S_i = T$.

It is known (Feige 1998) that Set Cover does not have approximation algorithms for any guarantee better than $O(\log n)$. Thus, if we find a transformation from Set Cover to FSCPP that preserves approximation, we can

prove a similar result for FSCPP. We show how this transformation, which will be represented by $\tau : \mathrm{SC} \to \mathrm{FSCPP}$, can be done.

For each instance $I_{SC}$ of Set Cover, we must find a corresponding instance $I_{FSCPP}$ of the FSCPP. The instance $I_{SC}$ is composed of sets $T$ and $S_1, \ldots, S_m$, as shown above. The transformation consists of defining a capacitated graph $G$ with a source and a set $D$ of destinations. Let $G$ be the graph composed of the following nodes:

$V = \{s\} \cup \{w_1, \ldots, w_m\} \cup \{v_1, \ldots, v_n\} \cup \{s_1, \ldots, s_m\}$.

Also, let the edges $E$ of the graph $G$ be

$E = \{(w_j, v_i) \mid t_i \in S_j\} \cup \bigcup_{i=1}^{m} \{(s, w_i)\} \cup \bigcup_{i=1}^{m} \{(w_i, s_i)\}$.

In the instance of FSCPP the set of destination nodes $D$ is given by

$D = \{v_1, \ldots, v_n\} \cup \{s_1, \ldots, s_m\}$,

and $s$ is the source node. Thus, there is a one-to-one correspondence between nodes $w_i$ and sets $S_i$, for $1 \le i \le m$. There is also a one-to-one correspondence between nodes $v_i$ and ground elements $t_i \in T$, for $1 \le i \le n$. There is a directed edge between the source and each node $w_i$, and between nodes $w_i$ and the nodes representing elements appearing in the set $S_i$. Each node $w_i$ is also linked to $s_i$. Finally, each edge $e$ has capacity $c_e = 1$. See an example of such a reduction in Figure 4-1. The ground set in this example is $T = \{t_1, \ldots, t_6\}$ and the subsets are $S_1 = \{t_1, t_2, t_4, t_5\}$, $S_2 = \{t_1, t_2, t_4, t_6\}$ and $S_3 = \{t_2, t_4, t_6\}$.

Theorem 11 The transformation described above is a polynomial time reduction from Set Cover to FSCPP.

Proof: Let $I_{SC}$ be the instance of Set Cover and $I_{FSCPP}$ the corresponding instance of the FSCPP. It is clear that the transformation is polynomial, since

the number of edges and nodes is given by a constant multiple of the number of elements and sets in the instance of Set Cover.

We must prove that $I_{SC}$ and $I_{FSCPP}$ have equivalent optimal solutions. Let $S'$ be an optimal solution for $I_{SC}$. First we note that the destination nodes $s_i$, $1 \le i \le m$, can be reached only from nodes $w_i$, and therefore each $s_i$ must be satisfied with flow coming from $w_i$. Thus, each node $s_i$ saturates the corresponding $w_i$, which means that to satisfy any other node from $w_i$ we must make it a cache node. Then, we can clearly make $R = \{w_i \mid i \in S'\}$ and serve all remaining destinations in $v_1, \ldots, v_n$, by definition of $S'$. Each node in $R$ will be a cache node, and therefore $R$ is a solution for $I_{FSCPP}$. This solution must be optimal, because otherwise we could use a smaller solution $R'$ to construct a corresponding set $S'' \subseteq \{1, \ldots, m\}$ with $|S''| < |S'|$ covering all elements of $T$, therefore contradicting the fact that $S'$ is an optimum solution for the SC instance. Thus, the two instances $I_{SC}$ and $I_{FSCPP}$ have equivalent optimal solutions. □

[Figure 4-1: Example for the transformation of Theorem 11. Source $s$; nodes $w_1, w_2, w_3$; destination set $D$ formed by $v_1, \ldots, v_6$ and $s_1, s_2, s_3$.]

Corollary 12 Given an instance $I$ of SC, and the transformation $\tau$ described above, we have $OPT(I) = OPT(\tau(I))$.
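The transformation $\tau$ is simple enough to sketch directly. The following minimal Python sketch is an illustration under stated assumptions, not a definitive implementation: ground elements are assumed to be numbered $1, \ldots, n$, subsets are given as sets of those numbers, nodes are named by plain strings, and the function names `set_cover_to_fscpp` and `caches_from_cover` are hypothetical.

```python
def set_cover_to_fscpp(n, subsets):
    """Build the unit-capacity FSCPP instance for a Set Cover instance (sketch).

    n:       size of the ground set {t_1, ..., t_n}
    subsets: list of sets of element indices; subsets[j-1] plays the role of S_j
    Returns (nodes, cap, dests) with source node "s".
    """
    source = "s"
    cap = {}
    for j, subset in enumerate(subsets, start=1):
        cap[(source, f"w{j}")] = 1          # arc (s, w_j)
        cap[(f"w{j}", f"s{j}")] = 1         # arc (w_j, s_j), saturating w_j
        for i in subset:
            cap[(f"w{j}", f"v{i}")] = 1     # arc (w_j, v_i) whenever t_i is in S_j
    m = len(subsets)
    dests = {f"v{i}" for i in range(1, n + 1)} | {f"s{j}" for j in range(1, m + 1)}
    nodes = {source} | dests | {f"w{j}" for j in range(1, m + 1)}
    return nodes, cap, dests


def caches_from_cover(cover):
    """Map a cover C (a set of subset indices) to the cache set R = {w_j : j in C}."""
    return {f"w{j}" for j in cover}


# The sets listed for the example of Figure 4-1.
example_subsets = [{1, 2, 4, 5}, {1, 2, 4, 6}, {2, 4, 6}]
nodes, cap, dests = set_cover_to_fscpp(6, example_subsets)
print(len(nodes), len(cap), len(dests))  # 13 17 9
```

The instance has $1 + 2m + n$ nodes and $2m + \sum_j |S_j|$ arcs, so its size is linear in the size of the Set Cover instance.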


The following theorem, proved by Feige (Feige 1998), will be useful for our main result.

Theorem 13 (Feige (Feige 1998)) If there is some $\epsilon > 0$ such that a polynomial time algorithm can approximate set cover within $(1 - \epsilon) \log n$, then $NP \subset TIME(n^{O(\log \log n)})$.

This theorem implies that finding approximate solutions with guarantee better than $(1 - \epsilon) \log n$ for Set Cover is equivalent to solving any problem in $NP$ in sub-exponential time. It is strongly believed that this is not the case. We use this theorem and the reduction above to give a related bound for the approximation of FSCPP. To do this, we need a gap preserving transformation from SC to FSCPP, as stated in the following lemma.

Lemma 14 If $I$ is an instance of Set Cover, then the transformation $\theta$ from SC to FSCPP described above is gap preserving; that is, it has the following property:
(a) If $OPT(I) = k$ then $OPT(\theta(I)) = k$; and
(b) If $OPT(I) \ge k \log n$ then $OPT(\theta(I)) \ge k \log \log |D| - \epsilon$,
where $k$ is a fixed value, depending on the instance, and

$$\epsilon = -k \log\left(1 - \frac{\log(1 + n 2^{-n})}{\log |D|}\right) \to 0 \quad \text{for large } n.$$

Proof: Part (a) is a simple consequence of Corollary 12. Now, for part (b), note that the maximum number of sets in an instance of SC with $n$ elements is $2^n$. Consequently, in the instance of FSCPP created by transformation $\theta$, $|D| = m + n \le 2^n + n$. Thus, we have

$$\log |D| \le \log(2^n + n) = n + \delta',$$
where $\delta' = \log(1 + n 2^{-n})$. This implies that

$$\log n \ge \log(\log |D| - \delta') = \log \log |D| + \delta'',$$

where $\delta'' = \log(1 - \delta' / \log |D|)$. Therefore,

$$OPT(\theta(I)) \ge k \log n \ge k \log \log |D| - \epsilon,$$

where $\epsilon = -k \delta''$ (note that $\epsilon$ is a positive quantity). Finally, note that the quantity $-k \log\left(1 - \log(1 + n 2^{-n}) / \log |D|\right)$ goes very fast to zero in comparison to $n$; thus the value $\log \log |D|$ is asymptotically optimal. $\Box$

The reduction shown in Theorem 11 is gap preserving, since it maintains an approximation gap introduced by the instances of Set Cover. Note, however, that the name "gap preserving" is misleading in this case, since the new transformation has a smaller gap than the original. Finally, we get the following result.

Theorem 15 If there is some $\epsilon > 0$ such that a polynomial time algorithm $A$ can approximate FSCPP within $(1 - \epsilon) \log \log k$, where $k = |D|$, then $NP \subset TIME(n^{O(\log \log n)})$.

Proof: Suppose that an instance $I$ of SC is given. The transformation $\theta$ described above can be used to find an instance $\theta(I)$ of the FSCPP. Then, $A$ can be used to solve the problem for instance $\theta(I)$. According to Lemma 14, transformation $\theta$ reduces any gap of $\log n$ to $\log \log k$. Thus, with such an algorithm one can differentiate between instances $I$ with a gap of $\log n$. But this is not possible in polynomial time, according to (Feige 1998, Theorem 10), unless $NP \subset TIME(n^{O(\log \log n)})$. $\Box$
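As a sanity check on the proof of Lemma 14, the correction terms can be evaluated numerically. The sketch below is an illustration only: it assumes base-2 logarithms and the worst case $|D| = 2^n + n$, and the function name `gap_terms` is hypothetical.

```python
import math

LN2 = math.log(2)


def gap_terms(n, k=1.0):
    """Return (delta1, delta2, eps) from the proof of Lemma 14.

    Assumes base-2 logarithms and the worst case |D| = 2**n + n.
    log1p is used so that the tiny terms are not lost to rounding.
    """
    delta1 = math.log1p(n * 2.0 ** (-n)) / LN2        # log(1 + n 2^-n)
    log_d = n + delta1                                # log |D| = n + delta'
    delta2 = math.log1p(-delta1 / log_d) / LN2        # log(1 - delta'/log|D|)
    eps = -k * delta2                                 # eps = -k * delta''
    return delta1, delta2, eps


for n in (5, 10, 20, 40):
    d1, d2, eps = gap_terms(n)
    # Identity used in the proof: log n = log log |D| + delta''
    assert abs(math.log2(n) - (math.log2(n + d1) + d2)) < 1e-9
    print(n, eps)
```

Under these assumptions the printed value of `eps` stays positive and shrinks quickly as `n` grows, which matches the claim that the correction term is negligible.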


4.4 Concluding Remarks

The SCPP is a difficult combinatorial optimization problem occurring in multicast networks. We have shown that the SCPP in general cannot have approximation algorithms with guarantee better than $\epsilon$, for some $\epsilon > 1$. Thus, unlike other optimization problems (such as the connected dominating set in Chapter 7), the SCPP cannot have a polynomial time approximation scheme (PTAS). We have also proved that the FSCPP cannot be approximated within less than $\log \log k$, where $k$ is the number of destinations, unless $NP$ can be solved in sub-exponential time. This shows that it is very difficult to find near optimal results for general instances of the FSCPP.


CHAPTER 5
ALGORITHMS FOR STREAMING CACHE PLACEMENT PROBLEMS

The results of the preceding chapter show that the SCPP is very difficult to solve, even if only approximate solutions are required. We describe some approximation algorithms that can be used to give solutions to the problem, and decrease the gap between known solutions and non-approximability results. We also consider practical heuristics to find good near-optimal solutions to the problem. We propose two general types of heuristics, based on complementary techniques, which can be used to give good starting solutions for the SCPP.

5.1 Introduction

In this chapter, we propose algorithms for the solution of SCPP problems. Initially we discuss algorithms with performance guarantee, also known as approximation algorithms. We give a general algorithm for SCPP problems, and also a better algorithm based on flow techniques. Approximation algorithms are very interesting as a way of understanding the complexity of the problem, but in this case especially, due to the negative results shown in Chapter 4, they are not very practical. Thus, considering the complexity issues, we propose polynomial time construction algorithms for the SCPP based on two general techniques: adding destinations to a partial solution, and reducing the number of infeasible nodes in an initial solution. We report the results of computational experiments based on these two algorithms and their variations.


This chapter is organized as follows. In Section 5.2 we present algorithms with performance guarantee for the SCPP. In Section 5.3 we turn to algorithms without performance guarantee, and discuss a number of possible construction strategies. Then, in Section 5.4 we proceed to an empirical evaluation of the solutions returned by the proposed construction heuristics. Final remarks and future research directions are discussed in Section 5.5.

5.2 Approximation Algorithms for SCPP

In this section, we present algorithms for the TSCPP and FSCPP and analyze their approximation guarantee. To simplify our results, we use the notation $A(I) = |R \cup \{s\}|$, where $R$ is the set of cache nodes found by algorithm $A$ applied to instance $I$. Also, $OPT(I) = |R^* \cup \{s\}|$, where $R^*$ is an optimal set of cache nodes for instance $I$. Note that $A(I) \ge 1$ and $OPT(I) \ge 1$, which makes Definition 2 valid for our problems.

5.2.1 A Simple Algorithm for TSCPP

It is easy to construct a simple approximation algorithm for any instance of the TSCPP. We denote by $\delta_G(v)$ the degree of node $v$ in the graph $G$.

    Input: Graph $G$, destinations $D$, source $s$
    Output: a set $R$ of cache nodes
    Step 1: Construct a spanning tree $T$ of $G$
    Step 2: Remove recursively all leaves of $T$ which are not in $D \cup \{s\}$
    Step 3: Let $S_1$ be the set of internal nodes $v$ with $\delta_T(v) > 2$
    Step 4: Let $S_2$ be the set of internal nodes $v$ with $\delta_T(v) = 2$ and $v \in D$
    Step 5: Return $R = S_1 \cup S_2$

    Algorithm 5: Spanning Tree Algorithm

Note that steps 3 and 4 of Algorithm 5 represent a worst case for Algorithm 4. The correctness of the algorithm is shown in the next lemma.

Lemma 16 Algorithm 5 returns a feasible solution to the TSCPP.


Proof: The operation in step 2 maintains feasibility, since leaves cannot be used to reach destinations. The result $R$ includes all internal nodes $v$ with $\delta_T(v) > 2$, and all internal nodes $v$ with $\delta_T(v) = 2$ and $v \in D$. It suffices to prove that if $\delta_T(v) = 2$ and $v \notin D$ then $v$ is not needed in $R$. Suppose that $v$ is an internal node with $\delta_T(v) = 2$ and $v \notin D$. If the number of destinations down the tree from $v$ is equal to 1, then $v$ does not need to be a cache. Now, assume that the number of destinations down the tree from $v$ is two or more. Then, there are two cases. In the first case, there is a node $w$ between $v$ and the destinations, with $\delta_T(w) > 2$. In this case, $w$ is in $R$ and we need just to send one unit of flow from $v$ to $w$; thus $v$ does not need to be in $R$. In the second case, there must be some destination $w$ with $\delta_T(w) = 2$ between $v$ and the other destinations. Again, in this case $w$ will be included in $R$ from $S_2$. Thus $v$ does not need to be in $R$. This shows that $R$ is a feasible solution to the TSCPP. $\Box$

Lemma 17 Algorithm 5 gives an approximation guarantee of $|D|$.

Proof: Let us partition the set of destinations among $D_1$ and $D_2$, where $D_1 = D \setminus S_2$ and $D_2 = S_2$. Denote by $D'$ the set of destinations which are leaves in $T$. Initially note that for any tree the number of nodes $v$ with degree $\delta(v) > 2$ is at most $|L| - 2$, where $L$ is the set of nodes with $\delta(v) = 1$ (the leaves). But $L$ in this case is $D' \cup \{s\} \subseteq D_1 \cup \{s\}$. This implies that $|S_1| \le |D_1 \cup \{s\}| - 2$. Thus,

$$|R| = |S_1 \cup S_2| \le |D_1 \cup \{s\}| - 2 + |D_2| = |D| - 1,$$

and

$$A(I) = |R| + 1 \le |D| \le |D| \cdot OPT(I),$$

since $OPT(I) \ge 1$. $\Box$
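The five steps of Algorithm 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used for the experiments: the graph is given as an undirected adjacency dictionary, the spanning tree is built by breadth-first search (any spanning tree would do), and the source is removed from the result because it is never counted as a cache. The function name `spanning_tree_caches` is hypothetical.

```python
from collections import deque


def spanning_tree_caches(adj, dests, source):
    """Sketch of Algorithm 5 on an undirected graph given as {node: neighbors}."""
    dests = set(dests)

    # Step 1: build a spanning tree T (here: a BFS tree rooted at the source).
    tree = {source: set()}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in tree:
                tree[w] = {u}
                tree[u].add(w)
                queue.append(w)

    # Step 2: recursively remove leaves that are neither destinations nor the source.
    keep = dests | {source}
    stack = [v for v in tree if len(tree[v]) <= 1 and v not in keep]
    while stack:
        v = stack.pop()
        if v not in tree:            # already removed
            continue
        for u in tree.pop(v):
            tree[u].discard(v)
            if len(tree[u]) <= 1 and u not in keep:
                stack.append(u)

    # Steps 3 and 4: branching nodes, and degree-2 nodes that are destinations.
    s1 = {v for v in tree if len(tree[v]) > 2}
    s2 = {v for v in tree if len(tree[v]) == 2 and v in dests}

    # Step 5 (assumption: the source itself is not reported as a cache).
    return (s1 | s2) - {source}
```

For a star in which a node `a` feeds two destinations, the sketch reports `a` as the only cache; a pendant non-destination node is pruned in step 2 and never influences the result.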


Let $\Delta = \Delta(G)$ be the maximum degree of $G$. In the case in which all capacities $c_e$, for $e \in E(G)$, are equal to one, we can give a better analysis of the previous algorithm, with an improved performance guarantee.

Theorem 18 When $c_e = 1$ for all $e \in E$, then Algorithm 5 is a $k$-approximation algorithm, where $k = \min\{\Delta(G), |D|\}$.

Proof: The key idea to note is that if $c_e = 1$ for all $e \in E$ then $\lceil |D| / \Delta \rceil \le OPT(I)$, for any instance $I$ of the TSCPP. This happens because each cache node (as well as the source) can serve at most $\Delta$ destinations. Let $A(I)$ be the value returned by Algorithm 5 on instance $I$. We know from the previous analysis of Lemma 17 that $A(I) \le |D|$. Thus

$$A(I) \le |D| \le \Delta \lceil |D| / \Delta \rceil \le \Delta \cdot OPT(I).$$

The theorem follows, since we know that this is also a $|D|$-approximation algorithm. $\Box$

5.2.2 A Flow-based Algorithm for FSCPP

In this section, we present an approximation algorithm for the FSCPP. The algorithm is based on the idea of sending flow from the source to destination nodes. We show that this algorithm performs at least as well as the previous algorithm for the TSCPP. In addition, we show that for a special class of graphs this algorithm gives essentially the optimum solution. Therefore, for this class of graphs the FSCPP is solvable in polynomial time.

We now give standard definitions about network flows. For more details about the subject, see Ahuja et al. (1993). Let $f(x, y) \in \mathbb{R}_+$ be the amount of flow sent on edge $(x, y)$, for $(x, y) \in E$. A flow is called a feasible flow if it satisfies flow conservation constraints (3.2). Let $F(f, s, t) = \sum_{v \in V} f(s, v)$ be the total flow sent from node $s$. We assume that $s$ can send at most $\sum_{(s,v) \in E} c_{sv}$ units of flow, and $t$ can receive at most $\sum_{(u,t) \in E} c_{ut}$ units of flow. A feasible flow $f$ is the maximum flow from $s$ to $t$ if there is no feasible flow $f'$ such that $F(f', s, t) > F(f, s, t)$. A node $v$ is a
reached node from $s$ by flow $f$ if $\sum_{w \in V} f(w, v) > 0$. It is well known that when $f$ is the maximum flow, then $F(f, s, t) = C(s, t)$, where $C(s, t)$ represents the minimum capacity of any set of edges separating $s$ from $t$ in $G$ (the minimum cut). We also use the notation $C(U, \bar{U})$ to denote the total capacity of edges linking nodes in $U$ to nodes in $\bar{U}$, where $U \subset V$ and $\bar{U} = V \setminus U$.

Denote any feasible flow starting from node $v$ by $f_v$. The algorithm works by finding the maximum flow from $s$ to all nodes in $D$. If the total value of this maximum flow is $F(f_s, s, D) \ge |D|$ then the problem is solved, since all destinations can be reached without cache nodes. Otherwise, we put all nodes reachable from $s$ in a set $Q$. Then we repeat the following steps until $D \setminus Q$ is empty: for all nodes $v \in Q$ compute the maximum flow $f_v$ from $v$ to $D$. Find the node $v$ such that $f_v$ is maximum. Then add $v$ to $R$ and add to $Q$ all nodes reachable from $v$. Also, reduce the capacity of the edges in $E$ by the amount of flow used by $f_v$. These steps are described more formally in Algorithm 6.

    $Q \leftarrow \{s\}$
    while $D \setminus Q \ne \emptyset$ do
        forall $v \in Q$ do
            find the maximum flow $f_v$ from $v$ to $D \setminus Q$
        end
        Let $v$ be the node such that $F(f_v, v, D \setminus Q)$ is maximum
        $R \leftarrow R \cup \{v\}$
        Add to $Q$ the nodes reached by $f_v$
        for each edge $(u, v) \in E$ do
            Reduce capacity $c_{u,v}$ by $f_v(u, v)$
        end
    end

    Algorithm 6: Flow algorithm

In the following theorem, we show that the running time of this algorithm is polynomial and depends on the time needed to find the maximum flow.


Denote by $C_M(G)$ the maximum value of the minimum cut between any pair of nodes $v, w \in V(G)$, i.e.

$$C_M(G) = \max_{v,w \in V} C(v, w).$$

Similarly, we define

$$C_m(G) = \min_{v,w \in V} C(v, w).$$

Theorem 19 Algorithm 6 has running time equal to $O(n |D| T_{mf} / C_m(G))$, where $T_{mf}$ is the time needed to run the maximum flow algorithm.

Proof: The most costly operations in this algorithm are calls to the maximum flow algorithm. Therefore we count the number of calls. Note that, at each of the $N$ iterations of the while loop, a new element is added to the set of cache nodes. Thus, $A(I)$ is equal to the number of such iterations. Let $v_i$ be the node added to the set of cache nodes at iteration $i$ and $Q_i$ be the content of set $Q$ at iteration $i$ of Algorithm 6. At each step, the number of elements of $D$ found by the algorithm is, according to the maximum-flow/minimum-cut theorem, equal to the minimum cut from $v_i$ to the remaining nodes in $D \setminus Q_i$ (recall that all demands are unitary). Then,

$$|D| = \sum_{i=1}^{N} C(v_i, D \setminus Q_i) \ge \sum_{i=1}^{N} \min_{w \in D} C(v_i, w) \ge A(I) \min_{v,w \in V} C(v, w).$$

Thus we have

$$N = A(I) \le \frac{|D|}{C_m(G)}. \qquad (5.1)$$

At each iteration of the while loop, the number of calls to the maximum flow algorithm is at most $n$. The total number $n_c$ of such calls is given by

$$n_c \le n |D| / C_m(G).$$

Thus the running time of Algorithm 6 is $O(n |D| T_{mf} / C_m(G))$. $\Box$
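The greedy loop of Algorithm 6 can be illustrated with a small sketch. Everything here is an assumption made for the example rather than the original implementation: capacities are stored in a dictionary keyed by directed edge, the maximum flow is computed with a plain Edmonds-Karp routine, each destination has unit demand (modeled by a unit edge into a hypothetical super-sink `SINK`), and node labels are assumed to be mutually comparable so that ties are broken deterministically.

```python
from collections import deque, defaultdict

SINK = object()  # hypothetical super-sink; not a node of G


def max_flow(cap, src, targets):
    """Edmonds-Karp from src to the unit-demand targets; returns (value, flow)."""
    res = defaultdict(int)              # residual capacities
    adj = defaultdict(set)
    for (u, v), c in cap.items():
        res[u, v] += c
        adj[u].add(v)
        adj[v].add(u)
    for d in targets:                   # one unit of demand per destination
        res[d, SINK] += 1
        adj[d].add(SINK)
        adj[SINK].add(d)
    value = 0
    while True:
        parent = {src: None}
        queue = deque([src])
        while queue and SINK not in parent:   # BFS for an augmenting path
            u = queue.popleft()
            for w in adj[u]:
                if w not in parent and res[u, w] > 0:
                    parent[w] = u
                    queue.append(w)
        if SINK not in parent:
            break
        path, w = [], SINK
        while parent[w] is not None:
            path.append((parent[w], w))
            w = parent[w]
        push = min(res[e] for e in path)
        for u, w in path:
            res[u, w] -= push
            res[w, u] += push
        value += push
    flow = {e: c - res[e] for e, c in cap.items() if c > res[e]}
    return value, flow


def flow_cache_placement(cap, source, dests):
    """Sketch of Algorithm 6; returns the chosen cache nodes (source excluded)."""
    cap = dict(cap)
    reached = {source}                  # the set Q
    chosen = []                         # the set R, in order of selection
    remaining = set(dests) - reached
    while remaining:
        value, flow, v = max(
            (max_flow(cap, u, remaining) + (u,) for u in sorted(reached)),
            key=lambda t: t[0],
        )
        if value == 0:
            raise ValueError("some destination cannot be reached")
        chosen.append(v)
        reached.update(w for (_, w) in flow)   # nodes reached by f_v
        for e, f in flow.items():              # reduce capacities by f_v
            cap[e] -= f
        remaining -= reached
    return [v for v in chosen if v != source]
```

In this sketch the first node selected is always the source, so the number of loop iterations corresponds to $A(I) = |R \cup \{s\}|$, the quantity bounded in (5.1).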


Based on the performance analysis just shown, the following theorem gives an approximation guarantee for Algorithm 6.

Theorem 20 Algorithm 6 is a $k$-approximation algorithm, where $k = C_M(G) / C_m(G)$.

Proof: If we denote by $R^*$ the set of cache nodes in the optimal solution, we have

$$|D| \le \sum_{v_i \in R^* \cup \{s\}} C(v_i, D) \le \sum_{v_i \in R^* \cup \{s\}} \max_{v,w \in V} C(v, w) = OPT(I) \, C_M(G). \qquad (5.2)$$

Combining inequalities (5.1) and (5.2) results in

$$A(I) \le \frac{C_M(G)}{C_m(G)} \, OPT(I). \quad \Box$$

Note that the quantity $C_M(G) / C_m(G)$ can become large. However, for some types of graphs, the preceding algorithm gives us a better understanding of the problem. For example, if the graph has maximum degree $\Delta(G)$ and fixed capacity $l$, then it is easy to see that

$$\frac{C_M(G)}{C_m(G)} \le \frac{\Delta(G) \, l}{l} = \Delta(G).$$

If the edge capacity is not fixed, then $C_M(G) / C_m(G)$ becomes at most $\Delta(G) \, c_M / c_m$, where $c_M$ represents the maximum capacity and $c_m$ the smallest capacity of edges in $G$.

5.3 Construction Algorithms for the SCPP

In this section, we provide construction algorithms for the SCPP that give good results for a class of problem instances. The algorithms are based on dual methods for constructing solutions. In the first heuristic, the method used consists of sequentially adding destinations to the current flow, until all destinations are satisfied. The second method uses the idea of turning an
initial infeasible solution into a feasible one, by adding cache nodes to the existing infeasible flow.

The general method used can be summarized by saying that it consists of the selection, at each step, of subsets of the resulting solution, while a complete solution is not found. To select elements of the solution, an ordering function is employed. This is done in a way such that parts of the solution which seem promising in terms of objective function are added first. In the remainder of this section we describe the specific techniques proposed to create solutions for the SCPP.

5.3.1 Connecting Destinations

The first method we propose to construct solutions for the SCPP is based on adding destinations sequentially. The algorithm uses the fact that each feasible solution for the SCPP can be clearly described as the union of paths from the source $s$ to the set $D$ of destinations. More formally, let $D = \{d_1, \ldots, d_k\}$ be the set of destinations. Then, a feasible flow for the SCPP is the union of a set of paths $\mathcal{P}_s = P_1 \cup \ldots \cup P_k$ such that $P_1 = (s, x_1, \ldots, x_{j_1 - 1}, x_{j_1})$, $P_2 = (s, x_1, \ldots, x_{j_2 - 1}, x_{j_2})$, $\ldots$, $P_k = (s, x_1, \ldots, x_{j_k - 1}, x_{j_k})$, and where $x_{j_i} = d_i$ for $i \in \{1, \ldots, k\}$.

In the proposed algorithm, we try to construct a solution that is the union of paths. It is assumed initially that no destination has been connected to the source, and $A$ is the set of non-connected destinations, i.e., $A = D$. Also, the set of cache nodes $R$ is initially empty. During the algorithm execution, $S$ represents the current subgraph of $G$ connected to the source $s$. At each step of the algorithm, a path is created linking $S$ to one of the destinations $d \in A$. First, the algorithm tries to find a path $P$ from $d$ to one of the nodes in $R \cup \{s\}$. If this is not possible (this is represented by saying that $P = nil$),
then the algorithm tries to find a path $P$ from $d$ to some node in the connected subgraph $S$. Let $w$ be the connection point between $P$ and $S$. Then, add $w$ to the set $R$ of caches, since this is necessary to make the solution feasible. Finally, the residual flow available in the graph is updated, to account for the capacity used by $P$. The algorithm finishes when all destinations are included, and $R$ is returned as the solution for the problem. The formal steps of the proposed procedure are described in Algorithm 7.

[Figure 5-1: Sample execution for Algorithm 7. In this graph, all capacities are equal to 1. Destination $d_2$ is being added to the partial solution, and node 1 must be added to $R$.]

A number of important decisions, however, are unspecified in the description of the algorithm given so far. For example, there are many possible methods that can be used to select the next destination added to the current partial solution. Also, a path to a destination $v$ can be found using diverse algorithms, which can result in different selections for a required cache node. These possible variations in the algorithms are represented by two functions, get_path($v$, $S$) and get_dest($A$). Thus, by changing the definition of these functions we can actually achieve different implementations.

The first feature that can be changed, by defining a function get_dest : $2^V \to V$, is the order in which destinations are added to the final solution. Among the possible variations, we can list the following, which seem more useful:
- give precedence to destinations closer to the source;
- give precedence to destinations further from the source;
- give precedence to destinations in lexicographic order.

    Input: graph $G$, set $D$ of destinations
    $A \leftarrow D$
    $S \leftarrow \emptyset$   /* current flow */
    $R \leftarrow \emptyset$   /* set of cache nodes */
    while $A \ne \emptyset$ do
        $v \leftarrow$ get_dest($A$)   /* choose a destination */
        $A \leftarrow A \setminus \{v\}$
        $P \leftarrow$ get_path($v$, $R$, $G$)
        if $P = nil$ then
            $P \leftarrow$ get_path($v$, $S$, $G$)
            Let $w$ be the node connecting $P$ to $S$
            $R \leftarrow R \cup \{w\}$
        end
        Remove from $G$ the capacity used in $P$
        $S \leftarrow S \cup P$
    end
    return $R$

    Algorithm 7: First construction algorithm for the SCPP

The second basic decision that can be made is, once a destination $v$ is selected, what type of path will be used to join $v$ to the rest of the graph. This decision is incorporated by the function get_path : $V \to 2^V$. A second parameter involved in the definition of get_path is the specific node $w \in S$ which will be connected to $v$. Note that such a node always exists, since at each step of the construction there is a path from destinations to at least one node already reached by flow. Using a greedy strategy, the best way to link a new destination $d$ is through a path from $d$ to some node $v \in R \cup \{s\}$. However, it may not be possible to find such $v$, and this requires the addition of a new node to $R$. In both situations it is not clear what is the optimal node to be linked to the current destination. Thus, another important decision in
Algorithm 7 concerns how to choose, at each step, the node to be linked to the current destination.

Shortest Path Policy. Perhaps the simplest and most logical solution to the above questions is to link destination nodes using shortest paths. This policy is useful, since it can be applied to answer the two questions raised above: the path is created as a shortest path, and the node $v \in R \cup \{s\}$ selected is the closest to $d$. Thus, function get_path($v$, $R$, $G$) in Algorithm 7 becomes:

- select the node $v \in R \cup \{s\}$ such that dist($v$, $d$) (the shortest path distance between $v$ and $d$) is minimum;
- find the shortest path $d \leadsto v$ from $d$ to $v$;
- add $d \leadsto v$ to the current solution.

If there is no path from $d$ to $R \cup \{s\}$, then let $v$ be the node closest to $d$ and reached by flow from $s$, and add $v$ to $R$.

Other Policies. We have tested two other methods of connecting sources to destinations. In the first method, destinations are connected through a path found using the depth first search algorithm. In this implementation, paths are followed until a node already in $R$, or connected to some node in $R$, is found. In the latter case, the connection node must be added to $R$. The second method employs random paths starting from the destination nodes. This method is useful only to understand how good the previous algorithms are compared to a random solution.

Once the policy to be used in the constructor is defined, it is not difficult to prove the correctness of the algorithm and to find its complexity.

Theorem 21 Given an instance $I$ of the SCPP, Algorithm 7 returns a correct solution.

Proof: At each step, a new destination is linked to the source node. Thus, at the end of the algorithm all destinations are connected. The paths determined by such connections are valid, since they use some of the available capacity
(according to the information stored in the residual graph $G$). Nodes are added to $R$ only when it is required to make the connection possible. Thus, the resulting set $R$ is correct, and corresponds to a valid flow from $s$ to the set $D$ of destinations. $\Box$

Theorem 22 Using the closest node policy for destination selection and the shortest path policy for path creation, the time complexity of Algorithm 7 is $O(|D| n^2)$.

Proof: The external loop is executed $|D|$ times. The steps of highest complexity inside the while loop are exactly the ones corresponding to the get_path procedure. As we are proposing to use the shortest path algorithm for the implementation of get_path, the complexity is $O(n^2)$ (but can be improved with more clever implementations of the shortest path algorithm). Other operations have smaller complexity; thus the total complexity for Algorithm 7 is $O(|D| n^2)$. $\Box$

Other implementations of Algorithm 7 would result in a very similar analysis of complexity.

5.3.2 Adding Caches to a Solution

We propose a second general technique for creating feasible solutions for the SCPP. The algorithm consists of adding caches sequentially to an initial infeasible solution, until it becomes feasible. The steps are presented in Algorithm 8. At the beginning of the algorithm, the set of cache nodes $R$ is empty and a possibly infeasible subgraph, linking the source to all destinations, gives the initial flow. Such an initial infeasible solution can be created easily with any spanning tree algorithm.

In the description of our procedure, we define the set $I$ of infeasible nodes to be the nodes $v \in V \setminus \{s\}$ such that

$$\sum_{(w,v) \in E(G)} c(w, v) - \sum_{(v,w) \in E(G)} c(v, w) \ne b_v,$$
where $b_v$ is the demand of $v$, which can be 0 or 1. In the while loop of Algorithm 8 the current solution is initially checked for feasibility. This verification determines if there is any node $v \in V$ such that the amount of flow leaving the node is greater than the arriving flow, or in other words, $I \ne \emptyset$. The formal description of this verification procedure is given in Algorithm 9.

    Input: Graph $G$
    $R \leftarrow \emptyset$
    $S \leftarrow$ spanning_tree($G$)
    Remove from $G$ the capacity on edges used by $S$
    $I \leftarrow$ infeasible nodes in $S$
    while $I \ne \emptyset$ (there are infeasible nodes) do
        $v \leftarrow$ select_unfeasible_node($I$)
        Try to find different paths to satisfy $v$
        if a set $P$ of paths is found then
            Remove from $G$ the capacity used by $P$
            $S \leftarrow S \cup P$
            $I \leftarrow I \setminus \{v\}$
        else
            $R \leftarrow R \cup \{v\}$
        end
    end
    return $R$

    Algorithm 8: Second construction algorithm for the SCPP

If the solution is found to be infeasible, then it is necessary to improve its feasibility by increasing the number of properly balanced nodes. The correction of infeasible nodes $v \in I$ is done in the body of the while loop in Algorithm 8. The procedure consists of selecting a node $v$ from the set of infeasible nodes $I$ in the flow graph, and trying to make it feasible by sending more data from one of the nodes $w \in R \cup \{s\}$. If this can be done in such a way that $v$ becomes feasible again, then the algorithm just needs to update the current subgraph $S$ and the set of infeasible nodes $I$.
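The balance condition that defines the set $I$ can be checked directly. The following sketch is illustrative only and rests on assumptions: the current solution $S$ is given as a dictionary mapping each directed edge $(w, v)$ to the capacity it uses, the demand $b_v$ is 1 for destinations and 0 otherwise, and the function name `infeasible_nodes` is hypothetical.

```python
def infeasible_nodes(cap, dests, source):
    """Return the nodes v != source whose inflow minus outflow differs from b_v.

    cap    -- {(w, v): capacity used on edge (w, v) of the current solution S}
    dests  -- destination nodes (demand b_v = 1; every other node has b_v = 0)
    """
    dests = set(dests)
    # Destinations start at balance 0 so that unreached ones are reported too.
    balance = {d: 0 for d in dests}
    for (w, v), c in cap.items():
        balance[v] = balance.get(v, 0) + c   # flow entering v
        balance[w] = balance.get(w, 0) - c   # flow leaving w
    return {
        v
        for v, b in balance.items()
        if v != source and b != (1 if v in dests else 0)
    }


# A node that receives one unit but forwards two is infeasible.
solution = {("s", 1): 1, (1, "d1"): 1, (1, "d2"): 1}
print(infeasible_nodes(solution, {"d1", "d2"}, "s"))
```

In the small example, node 1 has balance $1 - 2 = -1 \ne 0$, so it is a candidate either for receiving additional paths or for becoming a cache.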


    Input: current solution $S$, destinations set $D$
    for $v \in D$ do $b_v \leftarrow 1$
    $I \leftarrow \emptyset$
    for $v \in V$ do
        $\sigma \leftarrow \sum_{(w,v) \in E(S)} c(w, v) - \sum_{(v,w) \in E(S)} c(v, w)$
        if $\sigma \ne b_v$ then
            $I \leftarrow I \cup \{v\}$
        end
    end
    return $I$   /* returns the infeasible nodes */

    Algorithm 9: Feasibility test for candidate solution.

[Figure 5-2: Sample execution for Algorithm 8 on a graph with unitary capacities. Nodes 1 and 2 are infeasible, and therefore are candidates to be included in $R$.]

However, if $v$ cannot receive enough additional data, then it must be added to the list of cache nodes. This clearly will make the node feasible again, since there will be no restrictions on the amount of flow departing from $v$. After adding $v$ to $R$, the graph is modified as necessary. Examples of possible modifications are changing the flow required by $v$ to one unit and deleting additional paths leading to $v$, since only one is necessary to satisfy the flow requirements. We assume that these changes are done randomly if necessary.

This construction technique can be seen as a dual of the algorithm presented in the previous subsection. In Algorithm 7 the assumption is that a
partial solution can be incomplete, but always feasible with relation to the flow sent from $s$ to the reached destinations. On the other hand, in Algorithm 8 a solution is always complete, in the sense that all destinations are reached. However, it does not represent a feasible flow until the end of the algorithm.

Procedure select_unfeasible_node has the objective of finding the most suitable node to be processed in the current iteration. This is the main decision in the implementation of Algorithm 8 and can be done using a greedy function, which will find the best candidate according to some criterion. We propose some possible candidate functions, and determine empirically (in the next section) how these functions perform in practice.

- Function largest_infeasibility: select the node that has greatest infeasibility, i.e., for which the difference between entering and leaving flow minus demand is maximum (breaking ties arbitrarily). This strategy tries to add to the set $R$ a node which can benefit most from being a cache.
- Function closest_from_source: select the infeasible node which is closest to the source. The advantage in this case is that a node $v$ selected by this rule can help to reduce the infeasibility of other nodes down the path from $s$ to the destinations.
- Function uniform_random: select uniformly a node $v \in I$ to be added to $R$. This rule is useful for breaking the biases existing in the previous methods. It also has the advantage of being very simple to compute, and therefore very fast.

Theorem 23 Given an instance of the SCPP, Algorithm 8 returns a correct solution.

Proof: In the algorithm, the set of infeasible nodes $I$ will decrease monotonically. This happens because at each step one infeasible node is selected
and turned into a feasible node. Also, feasible nodes cannot become infeasible, since each operation requires that there is enough capacity in the network (this is guaranteed by the use of the residual graph $G$). Thus, the algorithm terminates. The data flow from the source to the destinations must be valid at the end, by definition of the set $I$ (which must be empty at the end). Similarly, the set $R$ must be valid, since it is used only to make nodes feasible in the case that no additional paths can be found to satisfy their requirements. Thus, the solution returned by Algorithm 8 is correct for the SCPP. $\Box$

Theorem 24 The time complexity of Algorithm 8 is $O(nmK)$, where $K$ is the sum of capacities in the SCPP instance.

Proof: A spanning tree can be found in $O(m \log n)$. It is followed by the while loop, which performs at most $n$ iterations. The procedure select_unfeasible_node can be implemented in $O(n)$ by the use of some preprocessing, in each of the proposed implementations. Finding paths to infeasible nodes is clearly the most difficult operation in the loop. This can be performed in $O(m + n)$ for each path, using a procedure such as depth first search. However, it may be necessary to run this step a number of times proportional to the sum of capacities in the graph ($K$), which results in $O(mK)$. Other operations have low complexity; thus the maximum complexity for Algorithm 8 is $O(nmK)$. $\Box$

5.4 Empirical Evaluation

In this section we present computational experiments carried out with the construction algorithms proposed above. All algorithms were implemented using the C programming language (the code is available by request). The resulting program was executed in a PC,
with 312MB of memory and an 800MHz processor. The GNU gcc compiler (under the Linux operating system) was used, with the optimization flag -O2 enabled.

Table 5-1: Computational results for different variations of Algorithm 7 and Algorithm 8.

Instance       Constructor 1                                 Constructor 2
   n    m      DFS Shortest   Random   Random   Random       LI       CS       UR
  50  500      9.9      2.9     18.8     18.3     18.4      3.0      2.8      3.2
  50  800     12.9      4.9     29.3     29.6     28.4      5.0      4.8      5.2
  60  500     35.1      8.8     56.8     56.4     55.5      9.6      9.2      9.9
  60  800     44.7     11.7     77.6     76.9     76.8     12.5     11.7     12.8
  70  500     78.9     17.0    111.3    111.2    110.4     18.9     18.1     19.6
  70  800    102.2     20.3    139.3    140.1    139.7     22.5     21.4     23.3
  80  500    147.6     27.8    180.7    182.2    181.3     32.0     31.1     33.4
  80  800    186.1     32.4    218.3    218.8    218.9     36.9     36.2     38.9
  90  500    239.8     42.1    268.4    267.8    268.5     49.5     49.9     52.0
  90  800    285.4     47.5    312.6    313.4    313.3     56.5     57.4     59.8
 100  500    344.1     60.3    368.0    369.0    369.1     71.3     74.2     75.8
 100  800    398.6     67.7    420.1    421.6    422.3     80.3     84.3     85.4
 110  500    466.4     84.2    482.6    484.8    485.8    100.3    106.1    106.2
 110  800    525.5     92.5    541.7    545.1    543.7    111.9    119.5    118.4
 120  500    599.7    115.1    611.3    614.9    613.3    137.6    146.9    144.9
 120  800    668.7    125.5    676.9    680.6    679.2    151.9    164.0    159.4
 130  500    748.8    157.4    751.4    755.6    754.9    180.3    194.6    189.0
 130  800    824.7    169.5    823.3    827.7    827.8    199.3    215.7    208.5
 140  500    910.6    213.2    906.2    909.8    908.9    233.4    252.4    243.9
 140  800    995.9    228.2    984.8    990.2    989.3    254.6    276.3    265.5
 150  500   1092.0    281.8   1074.4   1080.3   1080.6    292.0    319.6    303.6
 150  800   1185.4    299.6   1162.6   1170.0   1169.1    315.8    347.0    353.6

Table 5-1 presents a summary of the results of experiments with the proposed algorithms. The first two columns give the size of the instances used. The remaining columns give results returned by Algorithm 7 and Algorithm 8 under different policies. Each entry reported in this table represents the averaged results over 30 instances of the stated size. Each instance was created using a random generator for connected graphs, first described in (Gomes et al. 1998). This generator creates graphs with random distances and capacities, but guarantees that the resulting instance is connected. Destinations were also defined randomly, with the number of destinations being equal to 40% of the
number of nodes. All instances assume that node 1 is the source node.

In columns 3 to 7 of Table 5-1 we show the comparison of results returned by different variations of Algorithm 7. The third and fourth columns give the solutions returned by the depth first search and shortest path policies, respectively. For both policies, no significant change in behavior was observed when using different orderings of the destinations. On the other hand, in columns 5 to 7 we report the values returned by the random path policy using different ordering methods (closest, farthest from source, and lexicographic order, respectively). It seems that, given the weakness of the random path policy, the ordering method becomes a significant parameter in the determination of the solution quality.

Columns 8 to 10 of Table 5-1 present results for the execution of Algorithm 8. They correspond, respectively, to the largest infeasibility, closest from source, and uniform random policies. It is clear from the results that Algorithm 8 is very effective in comparison with Algorithm 7. Although there is a tendency of getting better results with the `closest from source' policy, it seems that the output is relatively independent of the order of selection for infeasible nodes. Thus, it is probably better to use a simpler implementation, such as the `uniform random' policy.

Values for the running time of the proposed algorithms are compared in Table 5-2. In this table, all values are given in milliseconds. As expected from the computational complexity results, Algorithm 8 spends more time than Algorithm 7. We note that, although these values can be improved by careful implementation, both algorithms have demonstrated good performance in practice.

PAGE 102

Table 5-2: Comparison of computational time for Algorithm 7 and Algorithm 8. All values are in milliseconds.

 Instance   Constructor 1                                 Constructor 2
   n    m      DFS Shortest   Random   Random   Random       LI       CS       UR
  50  500       10       14        7        7        2       12       17       11
  50  800       23       37       20       18        5       20       29       18
  60  500       32       59       30       26       12       48       69       42
  60  800       47       96       43       41       16       67       98       63
  70  500       62      125       56       54       20      106      163      104
  70  800       81      164       71       72       24      135      211      139
  80  500       97      201       82       86       32      210      335      197
  80  800      124      249      100      107       36      267      429      236
  90  500      144      293      117      124       42      383      619      354
  90  800      175      354      142      148       48      472      769      412
 100  500      199      416      163      167       56      622     1025      617
 100  800      231      493      190      194       66      761     1259      772
 110  500      255      562      214      217       77      985     1627      924
 110  800      289      648      246      247       87     1169     1946      938
 120  500      318      741      276      274      100     1501     2486     1329
 120  800      356      844      312      308      112     1760     2925     1612
 130  500      385      940      342      337      128     2136     3555     1984
 130  800      424     1061      380      375      144     2491     4153     2093
 140  500      458     1154      413      409      163     2981     4980     2905
 140  800      508     1300      459      455      183     3431     5729     3321
 150  500      555     1417      505      500      204     4105     6817     3729
 150  800      611     1591      561      556      226     4675     7758     4015

Figure 5-3: Comparison of computational time for different versions of Algorithm 7 and Algorithm 8. Labels `C3' to `C10' refer to columns 3 to 10 of Table 5-2. [Plot not reproduced: horizontal axis, instance; vertical axis, time (ms) on a logarithmic scale; one series per label C3 to C10.]

PAGE 103

5.5 Concluding Remarks

The SCPP is a difficult combinatorial optimization problem occurring in multicast networks. In this chapter, we described algorithms for solving the SCPP and some of its versions. Initially, we presented approximation algorithms for the SCPP. Although theoretically interesting, these algorithms cannot give a good approximation guarantee in practice, due to the inherent problem complexity. We also discussed how good solutions can be created by applying construction techniques. Two general types of techniques have been proposed, using dual points of view of the construction process. We described policies for the implementation of these construction algorithms, and how different variants of the algorithms can be derived. Finally, we performed computational experiments to determine the quality of the solutions generated by the different techniques described.

We believe that the techniques presented here can be further refined and improved. It would also be interesting to see these algorithms integrated with other heuristics. A future step is to use the proposed algorithms in the framework of general purpose metaheuristics, since this is the best way of achieving high quality solutions for most combinatorial problems. Clearly, other interesting open problems concern the proof of similar results for variants of the SCPP.

PAGE 104

CHAPTER 6
HEURISTIC ALGORITHMS FOR ROUTING ON MULTICAST NETWORKS

An important problem on multicast networks asks for the determination of an optimum route to be followed by packets in a multicast group. This is known as the multicast routing problem (MRP). A large number of heuristic algorithms have been proposed in recent years to solve the MRP, which is of great interest for network engineers. In this chapter, a heuristic for the multicast routing problem is proposed, which improves over the well known algorithm of Kou et al. (1981). The resulting construction algorithm is used in the implementation of a metaheuristic approach for the MRP. In this approach, a restarting procedure similar to the greedy randomized adaptive search procedure (GRASP) applies the heuristic together with a local search method in order to find near optimal solutions to the MRP. Simulation results show that the proposed technique is superior in terms of solution quality when compared to traditional heuristics. We also propose novel improvements to GRASP that help reduce the computational time and improve the overall quality of solutions.

6.1 Introduction

Multicast services are used in modern applications to allow direct communication between a source node and a set of destinations. In recent years, the number of applications of multicast has increased steadily, following the rapid advances of the Internet and intranet networks in the corporate world. A number of algorithmic issues, however, remain a major problem for the wide deployment of multicast applications. For example, routing is an

PAGE 105

issue that has not been completely solved on multicast systems. While for traditional unicast systems the routing problem can be solved in polynomial time using, e.g., Dijkstra's algorithm, the multicast routing problem is better modeled by the Steiner tree problem, which is known to be NP-hard (Garey and Johnson 1979).

Given the complexity of solving the routing problem exactly, a large number of heuristic algorithms have been proposed to find good, non-optimal solutions (Ballardie et al. 1993; Hong et al. 1998; Kompella et al. 1996; Salama et al. 1997b; Sriram et al. 1999; Zhu et al. 1995). Such methods, however, lack any guarantee of local optimality. This turns out to be an important shortcoming, especially for instances with a large number of group members, since the efficient use of resources is critical in this case.

6.1.1 The Multicast Routing Problem

A multicast network has the main objective of allowing communication from a source to a set of destinations with one single send operation. This is made possible by retransmitting data whenever two or more destinations can be reached from one single node. A set of nodes interested in the same piece of data is called a multicast group.

The main task faced in the operation of a multicast network consists of finding routes for delivering data to all members of a multicast group. The most common way of linking the source to the destinations is through a tree spanning the involved nodes (source and destinations). The objectives and additional constraints of the problem may vary depending on the specific application and the respective performance requirements. For example, in real time video applications the quality of service (QoS) constraints require that the delay of transmission be less than some fixed threshold $\Delta$. Thus, a basic

PAGE 106

objective would be to minimize the total cost of the tree, subject to the constraint that all source-destination pairs have delay less than $\Delta$. This problem is known as the delay constrained multicast routing problem (DCMRP).

To give a formal description of the DCMRP, let $G = (V, E)$ be a graph where $V$ is the set of nodes and $E$ the set of links between nodes. The source node is represented by $s$ and the destinations are $D = \{d_1, \ldots, d_k\}$ such that $D \subseteq V$. There is a cost function $c : E \to \mathbb{Z}^+$ representing the cost of the links and a delay function $\delta : E \to \mathbb{Z}^+$ giving the time elapsed when traversing an edge $e \in E$. The problem asks for a set of edges $E' \subseteq E$ such that $s$ is connected to every node $d \in D$ on $G' = (V, E')$, the maximum delay is bounded, i.e.,

$$\max_{d \in D} \sum_{e \in \mathcal{P}_d} \delta(e) \le \Delta,$$

where $\mathcal{P}_d$ is the path in $E'$ from $s$ to $d$, and such that the total cost $\sum_{e \in E'} c(e)$ of $E'$ is minimum.

The DCMRP is easily seen to be a generalization of the Steiner problem on graphs. In the Steiner problem, one is given a graph $G = (V, E)$ together with a cost function $c : E \to \mathbb{Z}^+$ and a set $R \subseteq V$ of required nodes. The nodes in $V \setminus R$ are called Steiner nodes. The objective is to find a tree $T$ linking the nodes in $R$, passing through Steiner nodes if necessary, such that the cost $\sum_{e \in T} c(e)$ is minimum. The Steiner problem on graphs is known to be NP-hard (Garey and Johnson 1979). Thus, we have a similar result for the DCMRP. Our interest is therefore in finding efficient algorithms that return good, near optimal solutions for a large number of practical instances.

6.1.2 Contributions

In this chapter we are interested in solution methods for the delay constrained multicast routing problem. In particular, we propose a new method

PAGE 107

for computing routing trees for multicast networks using a variation of the KMB heuristic (Kou et al. 1981). We also propose to use the resulting heuristic as a constructor in a restarting procedure. The resulting metaheuristic can be viewed as a modification of GRASP (greedy randomized adaptive search procedure). The strategy is employed to avoid suboptimal solutions produced by existing heuristics.

At the same time, we propose innovative techniques applied to the GRASP metaheuristic. We introduce a new method for computing the candidate list, and show that this method is more effective in terms of computational time. The new method, combined with improved search heuristics, can be used to yield fast implementations for the problem considered.

This chapter is organized as follows. In Section 6.2 a heuristic for the MRP is proposed. Then, in Section 6.3 a metaheuristic based on the described algorithm is proposed. Computational results are discussed in Section 6.4. Concluding remarks are given in Section 6.5.

6.2 An Algorithm for the MRP

We start by describing one of the most important algorithms for the Steiner tree problem. This algorithm was proposed by Kou et al. (1981), and is known in the literature as the KMB heuristic. The main advantage of this algorithm is that it is simple to describe and implement, and yet it gives a performance guarantee of two. In fact, empirical studies show that in most cases the results given by the heuristic are better than this theoretical upper bound.

The KMB heuristic is presented in Algorithm 2. For convenience, we reproduce it in Algorithm 10. The main result about Algorithm 10 is summarized below.
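To make the steps of the KMB heuristic concrete, the following is a minimal Python sketch, not the implementation used in this work. It assumes a connected undirected graph stored as a dictionary of dictionaries (`adj[u][v]` is the cost of edge `(u, v)`), and the helper names `dijkstra`, `mst` and `kmb` are hypothetical names chosen for illustration.

```python
import heapq
from itertools import combinations


def dijkstra(adj, src):
    """Shortest-path distances and predecessors from src (assumed helper)."""
    dist = {src: 0}
    pred = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                pred[v] = u
                heapq.heappush(heap, (d + w, v))
    return dist, pred


def mst(nodes, weight):
    """Prim's algorithm; weight maps frozenset({u, v}) to the edge cost.

    Assumes the graph described by weight is connected.
    """
    nodes = list(nodes)
    in_tree = {nodes[0]}
    tree = set()
    while len(in_tree) < len(nodes):
        # Cheapest edge with exactly one endpoint already in the tree.
        best = min((e for e in weight if len(e & in_tree) == 1),
                   key=lambda e: weight[e])
        tree.add(best)
        in_tree |= best
    return tree


def kmb(adj, required):
    """Return a Steiner tree (set of frozenset edges) spanning the required nodes."""
    required = list(required)
    if len(required) < 2:
        return set()
    # Metric closure K over the required nodes.
    sp = {r: dijkstra(adj, r) for r in required}
    closure = {frozenset((a, b)): sp[a][0][b]
               for a, b in combinations(required, 2)}
    # Minimum spanning tree of K.
    closure_tree = mst(required, closure)
    # Replace each closure edge by its shortest path in G, giving T'.
    expanded = {}
    for e in closure_tree:
        a, b = tuple(e)
        pred = sp[a][1]
        v = b
        while v != a:
            u = pred[v]
            expanded[frozenset((u, v))] = adj[u][v]
            v = u
    # Minimum spanning tree of T'.
    tree = mst(set().union(*expanded), expanded)
    # Repeatedly remove leaves that are not required nodes.
    changed = True
    while changed:
        changed = False
        degree = {}
        for e in tree:
            for v in e:
                degree[v] = degree.get(v, 0) + 1
        for e in list(tree):
            if any(degree[v] == 1 and v not in required for v in e):
                tree.remove(e)
                changed = True
    return tree
```

For the MRP, `required` would be the source together with the destinations. The quadratic-time Prim routine is chosen for brevity, not speed.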

PAGE 108

    Input: Graph $G = (V, E)$, source $s$ and set $D$ of destinations
    Construct a complete graph $K(R, E)$ where the set of nodes is $R$
    Let the distance $d(i, j)$, $i, j \in R$, be the shortest path from $i$ to $j$ in $G$
    Find a minimum spanning tree $T$ of $K$
    Replace each edge $(i, j)$ in $T$ by the complete path from $i$ to $j$ in $G$
    Let the resulting graph be $T'$
    Compute a minimum spanning tree $\hat{T}$ of $T'$
    repeat
        $r \leftarrow$ false
        if there is a leaf $w \in \hat{T}$ which is not in $R$ then
            Remove $w$ from $\hat{T}$
            $r \leftarrow$ true
        end
    until not $r$
Algorithm 10: KMB heuristic for the Steiner tree problem.

Theorem 25: Algorithm 10 is a $(2 - 1/p)$-approximation algorithm, where $p$ is the number of required nodes.

Modifying the solution for MRP

The strategy used for solving the MRP consists of modifying the KMB algorithm in order to account for the additional requirements of the MRP. The formal description of the method is given in Algorithm 11.

At the beginning of the algorithm, the initial solution is created by the KMB heuristic. Therefore, the initial result is known to be feasible for the Steiner tree problem, but it may violate some of the delay or reliability constraints of the problem. The main part of the algorithm tries to satisfy these requirements.

The first part of the algorithm checks the solution and determines if it is feasible for the QoS requirements. If there is any path $P$ in the solution such that the total delay is greater than $\Delta$, additional steps of the algorithm must be used to "fix" the problem. This is done by running the shortest

PAGE 109

    Input: Graph $G = (V, E)$, source $s$ and set $D$ of destinations
    Run the KMB algorithm (Algorithm 10)
    Let $T$ be the result of the previous step
    if there is $P \in T$ such that delay $d(P) > \Delta$ then
        /* Perform some modifications to allow for routing constraints */
        Substitute $P$ by the shortest path between the first and last nodes of $P$
        Remove redundant edges not required in the tree
    end
    /* Do something similar when the capacity used is more than the allowed percentage */
    while some edge in $T$ (on path $P_i$) uses more than the allowed percentage of capacity do
        Create a restricted graph $G_r$ where $V(G_r) = V(G)$ and every edge $e \in E(G_r)$ uses less than the allowed percentage of its capacity $c_u(e)$, where $c_u(e)$ is the capacity of $e \in E(G)$
        Find a shortest path $P'$ in $G_r$
        Substitute $P_i$ by the shortest path $P'$
        Remove redundant edges not required in the tree
    end
Algorithm 11: Proposed heuristic for the MRP

PAGE 110

path algorithm (using, e.g., Dijkstra's technique) and substituting it in the solution. Note that the shortest path is computed using delays as costs, in order to minimize the total delay. We assume that, after running this algorithm, the resulting path $P'$ has delay less than $\Delta$. If this is not the case, then the instance is clearly not feasible, and the algorithm can terminate. The next step is to substitute the infeasible path $P$ by the shortest path found as described. Redundant edges must now be removed whenever needed. This can be done in time $O(m)$, where $m$ is the number of edges in the graph.

The second part of the heuristic deals with the case when the reliability constraint is violated. If this happens, a similar technique is used to find a new path avoiding the violated edges. However, one needs to be more careful because of the increased possibility of finding infeasible solutions. We propose the creation of a reduced graph where edges are selected only when they have enough available capacity. This graph is denoted by $G_r = (V, E')$, with $E'$ composed of all $e \in E$ such that the capacity used by $e$ is less than the allowed percentage. The algorithm now tries to find a new shortest path $P'$ linking the extreme nodes of one of the violated paths $P_i$ that link $s$ to a destination. Note that the shortest path $P'$ is found over the reduced graph $G_r$. Thus, the new path is guaranteed to be feasible for the MRP due to the way it was constructed. This step is repeated while there is an infeasible edge in the solution. Again, note that we assume that the instance of MRP is feasible, and therefore there is at least one solution satisfying these requirements.

With this construction algorithm in hand, the next logical step is to use it in the more general framework of a metaheuristic. In the next section, a metaheuristic algorithm is proposed, based on the restarting procedure known as GRASP.
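The delay-repair step described at the start of this page can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation evaluated in this chapter: delays are stored as a dictionary of dictionaries, each destination's current route is given as a list of nodes starting at the source, and the names `min_delay_path`, `path_delay` and `repair_delays` are hypothetical. The removal of redundant edges and the capacity repair are deliberately left out.

```python
import heapq


def min_delay_path(delay, src, dst):
    """Dijkstra using delays as costs; returns (total delay, node list)."""
    dist = {src: 0}
    pred = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break  # the delay to dst is final once dst is popped
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in delay[u].items():
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                pred[v] = u
                heapq.heappush(heap, (d + w, v))
    if dst not in dist:
        return float("inf"), None
    path = [dst]
    while path[-1] != src:
        path.append(pred[path[-1]])
    return dist[dst], path[::-1]


def path_delay(delay, path):
    """Total delay along a path given as a list of nodes."""
    return sum(delay[u][v] for u, v in zip(path, path[1:]))


def repair_delays(delay, paths, bound):
    """Replace every route whose delay exceeds bound by a minimum-delay route.

    paths maps each destination to its current source-to-destination node list.
    Raises ValueError when even the minimum-delay route violates the bound,
    which corresponds to the infeasible case mentioned in the text.
    """
    repaired = {}
    for dest, path in paths.items():
        if path_delay(delay, path) > bound:
            best, candidate = min_delay_path(delay, path[0], dest)
            if best > bound:
                raise ValueError(f"no route to {dest!r} meets the delay bound")
            path = candidate
        repaired[dest] = path
    return repaired
```

A full implementation would rebuild the tree from the repaired routes and drop the edges that are no longer needed.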

PAGE 111

    Read instance
    Initialize data structures
    while termination criterion not satisfied do
        $s \leftarrow$ create solution
        Improve solution $s$
        if $s$ is the best solution so far then
            Save $s$
        end
    end
    Return best solution
Algorithm 12: GRASP

6.3 Metaheuristic Description

Restarting search heuristics have been very successful in a number of problems. One of the most well studied restarting algorithms is the greedy randomized adaptive search procedure (GRASP). GRASP is a metaheuristic proposed by Feo and Resende (1995) aimed at finding near optimal solutions for combinatorial optimization problems. It is composed of a number of iterations, where a new solution is picked from the feasible set, using a construction algorithm, and subsequently improved, using some local search method. GRASP has been very successful in a number of applications such as QAP (Oliveira et al. 2003b), frequency assignment (Gomes et al. 2001), and satisfiability. The steps of GRASP are summarized in Algorithm 12.

The GRASP algorithm is known to be a multi-start method, where at each iteration a new solution is constructed and subsequently improved. In our implementation, the construction algorithm proposed in the previous section is used for the construction phase. A local search algorithm is employed in the improvement phase.

The construction phase of GRASP is presented in more detail in Algorithm 13. It traditionally consists of creating the solution step-by-step. At

PAGE 112

    while solution $s$ is not complete do
        Order the $k$ existing candidate elements
        Select a random $\alpha$ such that $0 < \alpha \le k$
        Let RCL be the set of $\alpha$ best candidates
        Randomly select one of the elements in RCL
    end
Algorithm 13: GRASP construction phase

each step, a set of candidate elements is selected, and called the restricted candidate list (RCL). An element of the RCL is selected, and added to the solution. Notice that in this case, the elements are the possible paths in a solution. Although this is the most common method of implementation in GRASP, we see in the next section that this scheme can be improved by the careful use of randomization techniques.

6.3.1 Improving the Construction Phase

In the traditional implementation of the GRASP construction algorithm, each iteration must construct a list of candidates (RCL), and select one of its elements randomly. The size of the RCL is given by a parameter $\alpha$, which frequently is randomly determined.

We propose a new method for the GRASP construction phase that is on average equivalent to the existing method, but which is much more efficient in practice. The method uses the following observation.

Observation 26: Let $x_1, \ldots, x_n$ be an unordered sequence, and $y_1, \ldots, y_n$ the corresponding ordered sequence. Then, finding a random element amongst the $\alpha n$ best elements $y_1, \ldots, y_{\alpha n}$, for $0 < \alpha \le 1$, is on average equivalent to selecting the best of $1/\alpha$ random elements of $x_1, \ldots, x_n$.

Proof: Given the sequence $x_1, \ldots, x_n$ and selecting random elements of the sequence, the probability of selecting one of the elements in the RCL is

PAGE 113

    while solution $s$ is not complete do
        Select a random $\alpha$ such that $1/n \le \alpha \le 1$
        $k \leftarrow 1/\alpha$
        $c \leftarrow \infty$  /* if this is a minimization problem */
        for $j = 1, \ldots, k$ do
            Let $C_j$ be the list of candidates at this iteration
            Randomly select one element $x$ in $C_j$
            $c_j \leftarrow f(x)$
            if $c_j < c$ then
                $c \leftarrow c_j$
                $y \leftarrow x$
            end
        end
        Insert element $y$ in the solution $s$
    end
Algorithm 14: Improved construction for GRASP

$\alpha n / n = \alpha$. Consequently, if $W$ is an indicator random variable defined as 1 whenever the element picked is in the RCL, then $E(W) = \alpha$. Therefore, after doing $k$ independent trials, the average number of elements in the RCL is $k\alpha$, by the additivity of expectation. If we have $k = 1/\alpha$, then on average there is just one element of the RCL within the picked elements. Now, to know which of these elements is the one in the RCL, we just need to take the smallest one. □

The observation above gives a very efficient way of implementing the RCL test, which gives, on average, the same results. Start with the full set $C$ of candidate elements. Then, at each step generate a value of $\alpha$ and pick at random $k = 1/\alpha$ elements of $C$. From the picked elements, store only the one which is the best fit for the greedy function. This method is depicted in Algorithm 14.

A clear advantage in terms of computational complexity is achieved by the proposed construction method for GRASP. The best advantage is that,

PAGE 114

while in the original technique the candidate elements must be sorted, this is not necessary in the proposed algorithm. Moreover, the complexity of the traditional construction is dependent on the number of candidate elements. In our method, the complexity is constant for a fixed value of $\alpha$. For example, if $\alpha$ is $1/2$, then we need just two iterations to find an element in the RCL, with high probability.

Theorem 27: The complexity of selecting elements from the RCL in the modified construction algorithm is $n \log n$.

Proof: A probabilistic analysis of the complexity of the algorithm will be used. The important step that must be accounted for is the selection of a random element from $C$. Initially, let us assume that at each iteration of the for loop, the size of the list is $n$. Knowing that at each step the value of $\alpha$ is chosen from a uniform distribution, we find that the average number $N$ of elements selected in the for loop is

$$E(N) = E\Big(\sum_{j=1}^{n} k_j\Big) = \sum_{j=1}^{n} E(k_j) = \sum_{j=1}^{n} \int_{1/n}^{1} E(k_j \mid \alpha = t)\, dt = \sum_{j=1}^{n} \int_{1/n}^{1} \frac{1}{t}\, dt = \sum_{j=1}^{n} (\log 1 - \log 1/n) = \sum_{j=1}^{n} \log n = n \log n.$$

Now, the decrease in the size of the list will not change the complexity of the result, since

$$\sum_{j=1}^{n} \log(n - j) = n \log n + \log \pi + \log \frac{\csc(n\pi)}{(n-1)!}.$$

PAGE 115

The second term is a constant, and the third goes to zero very fast as $n$ tends to infinity, because of the value $(n-1)!$ in the denominator. □

6.3.2 Improvement Phase

GRASP has the advantage of being easy to develop, since it is composed of relatively independent procedures (the constructor and local search phases). It is well suited for applications with existing heuristic algorithms, which can be combined with GRASP to find a better solution. However, one of the weaknesses of GRASP is its inability to integrate good solutions found previously into the current search iteration. Since each iteration will create a completely different solution, there is no information added to the system when a good solution is found.

A method that has been used lately to overcome this problem is called path relinking (PR) (Aiex et al. 2003; Resende and Ribeiro 2003). In PR, a subset of the best solutions found is kept in a separate memory, called the elite set. At each iteration, one of the solutions $s$ will be selected, and a process of comparing the current solution with $s$ will start. Each component of the solution will be changed to the corresponding value in $s$, and after this a local search will be initiated to check for local optimality.

The main idea of PR is that, using an element $s$ of the elite set and a different starting solution, we can find alternative paths in the solution space leading to improvements in the objective function. At the end of the process, the current solution will have objective function at most equal to the objective function of $s$. Due to randomization, however, there is a probability that the algorithm can find a better solution.
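The walk performed by path relinking can be illustrated with a small Python sketch. It is a simplification made for illustration only: solutions are assumed to be equal-length lists of component values rather than routing trees, the objective `f` is passed in as a function, and no local search is run on intermediate solutions. The function name `path_relinking` is hypothetical.

```python
def path_relinking(current, guide, f):
    """Walk from current toward guide one component at a time.

    Returns the best solution seen along the walk and its objective value
    (minimization). Assumes both solutions are sequences of equal length.
    """
    walker = list(current)
    best, best_value = list(walker), f(walker)
    for i, target in enumerate(guide):
        if walker[i] != target:
            walker[i] = target  # adopt the corresponding component of the guide
            value = f(walker)
            if value < best_value:
                best, best_value = list(walker), value
    return best, best_value
```

For example, with `current = [1, 1, 1, 1]`, `guide = [0, 0, 1, 0]` and `f` measuring the distance of the component sum from 2, the walk passes through `[0, 0, 1, 1]`, which is better than both endpoints. Because the walk ends at the guide, the value returned is never worse than that of the guide.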

PAGE 116

We modify the general structure of the GRASP metaheuristic to accommodate Path Relinking. The resulting algorithm is presented in Algorithm 15. The main modifications made to GRASP are:

• We have to create and maintain a set of elite solutions. This is shown in lines 8 and 14 of Algorithm 15 (the insertion into $\mathcal{E}$ and the call to UpdateEliteSet).
• When the elite set is complete, each GRASP iteration should run the Path Relinking routine. This is shown in line 6 of Algorithm 15 (the call to PathRelinking).

Algorithm 15: GRASP with Path Relinking for minimization
    $c^* \leftarrow \infty$
    while stopping criterion not satisfied do
        $p \leftarrow$ GreedyRandomized()
        $p \leftarrow$ LocalSearch($p$)
        if $|\mathcal{E}| =$ ELITE_SIZE then
            PathRelinking
        else
            Insert current solution into $\mathcal{E}$
        end
        if $c(p) < c^*$ then
            $p^* \leftarrow p$
            $c^* \leftarrow c(p)$
        end
        UpdateEliteSet
    end
    return $p^*$

The elite set $\mathcal{E}$ is maintained in memory as a vector of solutions. We now describe how elements are inserted into and removed from $\mathcal{E}$. Define the difference diff($s_1$, $s_2$) between solutions $s_1$ and $s_2$ to be the number of different edges in these two solutions. In the initial phase of GRASP with Path Relinking, the elite set is empty. While we have fewer than ELITE_SIZE elements, the current solution is inserted into $\mathcal{E}$ if it has a difference of at least 3 to all other elements in $\mathcal{E}$. The second phase starts when $\mathcal{E}$ has ELITE_SIZE elements.
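The initial filling of the elite set can be sketched in a few lines of Python. This is a sketch under assumptions: a solution is represented as a set of edges, diff is taken to be the size of the symmetric difference of the two edge sets (one reading of "number of different edges"), and the value 10 for `ELITE_SIZE` is a placeholder, since the text does not state the value used.

```python
ELITE_SIZE = 10  # placeholder value; the actual size is not given in the text
MIN_DIFF = 3     # minimum difference required by the insertion rule


def diff(s1, s2):
    """Number of edges present in exactly one of the two solutions (assumed definition)."""
    return len(set(s1) ^ set(s2))


def insert_initial(elite, solution, elite_size=ELITE_SIZE, min_diff=MIN_DIFF):
    """First-phase rule: insert while the set is not full and the solution is diverse enough.

    elite is a list of frozensets of edges; returns True if the solution was inserted.
    """
    if len(elite) >= elite_size:
        return False
    if all(diff(solution, other) >= min_diff for other in elite):
        elite.append(frozenset(solution))
        return True
    return False
```

With an empty elite set the first solution is always accepted. A later solution that differs from a stored one in only two edges is rejected.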

PAGE 117

In the second phase, new solutions generated by GRASP are inserted into $\mathcal{E}$ according to the update criterion, which is presented in Algorithm 16. If the current solution $s$ is better than all solutions in $\mathcal{E}$, then $s$ is directly inserted. Otherwise, we require that $s$ has a difference of at least 3 to all elements in $\mathcal{E}$. If this is true, and $s$ is at least better than the worst solution in $\mathcal{E}$, then $s$ is inserted. To keep the size of the elite set constant, we remove the solution $x$ which has the smallest difference diff($x$, $s$), over all $x \in \mathcal{E}$.

Algorithm 16: UpdateEliteSet procedure
    Input: current solution $s$
    /* Set $w$ to the worst solution */
    $w \leftarrow x \in \mathcal{E}$ such that $f(x) \ge f(y)$, $\forall y \in \mathcal{E}$
    /* Set $b$ to the best solution */
    $b \leftarrow x \in \mathcal{E}$ such that $f(x) \le f(y)$, $\forall y \in \mathcal{E}$
    if $f(s) < f(b)$ or ($f(s) \le f(w)$ and $\min_{x \in \mathcal{E}}$ diff($s$, $x$) $\ge 3$) then
        find $s_p$ : $s_p = \arg\min_{x \in \mathcal{E}}$ diff($s$, $x$)
        $\mathcal{E} \leftarrow \mathcal{E} \setminus \{s_p\}$
        $\mathcal{E} \leftarrow \mathcal{E} \cup \{s\}$
    end

Algorithm 17: PathRelinking procedure
    Input: current solution $s$
    $s' \leftarrow$ random($\mathcal{E}$)
    for $i \leftarrow 1$ to $m = |E|$ do
        if $e_i \in s'$ and $e_i \notin s$ then
            $\delta \leftarrow$ evalij($s$, $i$, $j$)
            if $\delta > 0$ then
                LocalSearch($s$)
                SaveSolution($s$)
            end
        end
    end

The Path Relinking procedure, shown in Algorithm 17, starts with the selection of a random solution $s' \in \mathcal{E}$. This solution is called the guiding

PAGE 118

solution because it is used throughout the procedure to choose the next change to be performed. In each iteration, if an edge $e \in E(G)$ is in $s'$ but not in $s$, the algorithm will include $e$ in $s$ and check if this brings an improvement. The change in the objective function caused by this substitution is found using the function evalij. If there is an improvement, then we apply local search to the resulting solution in order to maintain local optimality. The algorithm does not run local search for non-improving changes, in order to reduce the total computational effort. If the solution found at the current step is better than the best known solution, then it is saved.

The final result of Path Relinking depends on the objective function value of the solutions found during the algorithm. Let $s_i$, for $0 \le i \le n$, be the solution found at the $i$-th iteration of the Path Relinking procedure (we note that $s_0 = s$ and $s_n = s'$). If, given the initial solution $s$ and the guiding solution $s'$, we can find a solution $s^*$ such that $f(s^*) < \min\{f(s), f(s')\}$ and $f(s^*) \le f(s_i)$ for $1 \le i < n$, where $f(s)$ is the objective value of $s$, then $s^*$ is clearly the best solution and is returned by the algorithm.

However, if we cannot find an improvement, then we prefer to return some solution which is different from $s$ and $s'$ but still good in some sense. We define a local optimum relative to Path Relinking to be a solution $s_i$ such that $f(s_i) < f(s_{i-1})$ and $f(s_i) < f(s_{i+1})$, for $1 < i < n$. If there is no such solution, then we define $s_0$ and $s_n$ as the only local optima. We then define the resulting solution as

$$s^* = \min \{ s_i : s_i \text{ is a local optimum relative to Path Relinking} \}.$$

In this definition, if there is an improvement, then $s^*$ is the best improvement found. Otherwise, if there is some local optimum, we choose the best local optimum.
Note that we return the minimum of $s$, $s'$ only if no other local optimum was found during the execution of Path Relinking. This is done in

PAGE 119

order to improve the diversity of GRASP, avoiding the unnecessary repetition of solutions already in the elite set.

The complexity of the Path Relinking procedure is determined by the operations performed to find the best improvement. The evalij procedure (line 7 of Algorithm 17) has complexity $O(n)$, because it simply verifies the impact of the change in the solution. Thus, the step with the biggest complexity is the local search procedure (line 13). A faster local search implementation was employed, with $O(n^2)$ complexity. This implies a complexity of $O(n^2)$ inside the for loop (lines 4 to 17). Therefore, we have the following result.

Theorem 28: The worst case performance of Path Relinking as described in Algorithm 17 is $O(n^3)$.

6.3.3 Reverse Path Relinking and Post-processing

The Path Relinking procedure described above can be further generalized, by considering that it can be run not only from the current solution $s$ to a solution $s'$ in the elite set, but also in the reverse direction. The results obtained with this change are not necessarily equal to the previous results. We call this modification of the path relinking procedure reverse path relinking. Therefore, in our implementation, for each GRASP iteration the reverse path relinking is also applied, to improve the results of the intensification phase.

As a last step of GRASP, we use Path Relinking as a post-optimization procedure at the end of the program. We call this final path relinking. The post-optimization step ensures that the solution returned by GRASP is optimal with respect to the path relinking operation. A description of this step is given in Algorithm 18. At each iteration of the repeat loop in line 1, we run Path Relinking for each pair of solutions $s_i, s_j \in \mathcal{E}$. The algorithm stops

PAGE 120

when, after doing this for all possible pairs in $\mathcal{E}$, the objective function of the best solution cannot be improved.

Algorithm 18: Final Path Relinking
    repeat
        for $i = 1$ to $n$ do
            for $j \leftarrow 1$ to $n$ do
                if $i \ne j$ then
                    $e_1 \leftarrow$ element $i$ of $\mathcal{E}$
                    $e_2 \leftarrow$ element $j$ of $\mathcal{E}$
                    Path-Relinking($e_1$, $e_2$)
                end
            end
        end
    until the best solution cannot be improved

6.3.4 Efficient implementation of Path Relinking

One of the computational burdens associated with the Path Relinking method is the requirement of performing some kind of local search for new solutions found during its execution. This is done in order to maintain local optimality and further explore the new path in the search space. However, the use of local search can make each iteration of GRASP slower and have a negative effect on the overall computational time.

To avoid this situation, we improved the original local search used in GRASP by using a non-exhaustive improvement phase. In our implementation, only one of the edges not included in $s$ (selected randomly) is verified and included. This reduces the complexity of local search by a factor of $n$, leading to an $O(n^2)$ implementation. This scheme is used in each iteration inside Path Relinking.

To enhance the quality of local search outside Path Relinking, we use the following additional method. The modified local search is executed as discussed above. Then, we randomly change two pairs of elements in the solution

PAGE 121

and return to local search as before. The result of this operation allows the algorithm to explore a different, but closely related neighborhood, which can bring further improvements to the current solution. We continue with the local method until a terminating criterion is satisfied (we used the number of iterations without improvement as the terminating condition). These changes represented a good improvement over the original local search, both in terms of time and of the quality of solutions found, as discussed in the next section.

6.4 Computational Experiments

The heuristics proposed in this chapter have been implemented using the ANSI-C language. The implementation was devised to be highly portable across platforms and easy to reuse. The compiler employed was the GNU GCC optimized compiler. The level of optimization chosen was -O2. The processor used was a Pentium 4 with 2.8GHz, and 512MB of available memory.

Table 6-1 presents a summary of the results found by the algorithm proposed in this chapter. The first three columns describe the instances used for testing. We employed random instances of the MRP ranging from 100 to 200 nodes and 500 to 2000 edges. For each size of instance, we made 50 runs with different instances of the same size. The objective of this methodology is to find results that represent the average behavior of instances.

The results presented in Table 6-1 show that the algorithm implemented as proposed above gives a large improvement over the simple KMB heuristic. We see that for most problems, the solution returned by our heuristic is about 30% better than the KMB results. This gives a clear indication that our results are much closer to the global optimum. In column 9, we decided to report just the time spent in the construction phase, since this is the only comparable part

PAGE 122

Table 6-1: Summary of results for the proposed metaheuristic for the MRP. Column 9 reports only the time spent in the construction phase.

      Instance      |         KMB         |    Metaheuristic    |
   n      m    |D|  | best  average  time | best  average  time | improv.
  100    500    10  |   77    93.2    1.5 |   59    63.3    1.8 |   32%
  100    600    10  |   56    69.4    1.5 |   42    43.6    1.9 |   37%
  100    700    10  |   66    74.2    1.2 |   43    46.8    1.3 |   37%
  120    800    12  |   80    87.4    2.4 |   50    54.1    2.9 |   38%
  120    900    12  |   73    91.3    2.1 |   60    64.3    2.5 |   30%
  120   1000    12  |   56    63.6    2.2 |   41    43.1    3.5 |   32%
  140   1000    14  |   82    98.2    3.1 |   61    65.4    3.6 |   33%
  140   1100    14  |   65    81.1    3.8 |   56    57.5    3.7 |   29%
  140   1200    14  |   65    74.9    3.3 |   47    50.6    3.9 |   32%
  160   1300    16  |  108   121.6    4.8 |   78    80.7    4.9 |   34%
  160   1400    16  |   97   112.3    4.8 |   70    77.7    5.6 |   31%
  160   1500    16  |   89    97.1    4.9 |   55    64.2    5.8 |   34%
  180   1600    18  |  101   115.3    6.4 |   73    80.3    6.9 |   30%
  180   1700    18  |  113   122      6.5 |   75    79.4    7.2 |   35%
  180   1800    18  |   99   110      6.5 |   69    73.4    8.4 |   33%
  200   1800    20  |  119   133.6    8.7 |   87    91.4    9.3 |   32%
  200   1900    20  |  114   126      8.7 |   79    84.6    9.7 |   33%
  200   2000    20  |  112   122.5    8.6 |   77    83.0   10.2 |   32%
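The "improv." column of Table 6-1 is consistent with the relative reduction of the average cost with respect to KMB. The short Python sketch below recomputes it for three rows copied from the table. The formula is an assumption inferred from the reported numbers, not a definition given in the text, and the function name `improvement_percent` is purely illustrative.

```python
# Rows copied from Table 6-1: (n, m, KMB average, metaheuristic average).
rows = [
    (100, 500, 93.2, 63.3),
    (100, 600, 69.4, 43.6),
    (120, 900, 91.3, 64.3),
]


def improvement_percent(kmb_avg, meta_avg):
    """Relative reduction of the average cost with respect to KMB, in percent.

    Assumed formula: 100 * (KMB average - metaheuristic average) / KMB average.
    """
    return 100.0 * (kmb_avg - meta_avg) / kmb_avg


for n, m, kmb_avg, meta_avg in rows:
    # Rounded to the nearest integer, as in the last column of the table.
    print(n, m, round(improvement_percent(kmb_avg, meta_avg)))
```

For these three rows the rounded values are 32, 37 and 30, matching the table entries.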

PAGE 123

Figure 6-1: Comparison between the average solution costs found by the KMB heuristic and our algorithm. (Plot of objective cost against instance; series: KMB, Metaheur.)

of the two procedures, in terms of computational time. The metaheuristic was, in fact, run for a maximum of ten minutes, for each iteration. The same results appear also in Figure 6-1.

6.5 Concluding Remarks

In this chapter, we presented a new heuristic algorithm to solve the MRP. Its main technique is to use a modified construction method, based on the KMB heuristic (Kou et al. 1981), to find feasible solutions for the MRP. We combine this construction algorithm with a restarting procedure based on GRASP. The resulting algorithm was found to yield near optimal solutions, with very good improvements compared to the original KMB heuristic.

A topic for further research would be to compare this solution to other existing methods.

PAGE 124

Another interesting topic would be to study distributed versions of this procedure. This would not be very difficult, since it is well known that restarting procedures such as GRASP are simple to parallelize. However, this is an important task, since distributed algorithms are essential for the practical implementation of network routing algorithms.

PAGE 125

CHAPTER 7
A NEW HEURISTIC FOR THE MINIMUM CONNECTED DOMINATING SET PROBLEM ON AD HOC WIRELESS NETWORKS

Given a graph $G = (V, E)$, a dominating set $D$ is a subset of $V$ such that any vertex not in $D$ is adjacent to at least one vertex in $D$. Efficient algorithms for computing the minimum connected dominating set (MCDS) are essential for solving many practical problems, such as finding a minimum size backbone in ad hoc networks. Wireless ad hoc networks appear in a wide variety of applications, including mobile commerce, search and discovery, and military battlefield. In this chapter we propose a new efficient heuristic algorithm for the minimum connected dominating set problem. The algorithm starts with a feasible solution containing all vertices of the graph. Then it reduces the size of the CDS by excluding some vertices using a greedy criterion. We also discuss a distributed version of this algorithm. The results of numerical testing show that, despite its simplicity, the proposed algorithm is competitive with other existing approaches.

7.1 Introduction

In many applications of wireless networks, such as mobile commerce, search and rescue, and military battlefield, one deals with communication systems having no fixed infrastructure, referred to as ad hoc wireless networks. An essential problem concerning ad hoc wireless networks is to design routing protocols allowing for communication between hosts. The dynamic nature of ad hoc networks makes this problem especially challenging. However, in some cases the problem of computing an acceptable virtual backbone can be reduced

PAGE 126

to the well known minimum connected dominating set problem in unit-disk graphs (Butenko et al. 2002).

Given a simple undirected graph $G = (V, E)$ with the set of vertices $V$ and the set of edges $E$, a dominating set (DS) is a set $D \subseteq V$ such that each vertex in $V \setminus D$ is adjacent to at least one vertex in $D$. If the graph is connected, a connected dominating set (CDS) is a DS which also induces a connected subgraph of $G$. We note that computing the minimum CDS (MCDS) is equivalent to finding a spanning tree with the maximum number of leaves in $G$.

In a unit-disk graph, two vertices are connected whenever the Euclidean distance between them is at most one unit. Ad hoc networks can be modeled using unit-disk graphs as follows. The hosts in a wireless network are represented by vertices in the corresponding unit-disk graph, where the unit distance corresponds to the transmission range of a wireless device (see Figure 7-1).

It is known that both CDS and MCDS problems are $NP$-hard (Garey and Johnson 1979). This remains the case even when they are restricted to planar, unit disk graphs (Baker 1994).

Following the increased interest in wireless ad hoc networks, many approaches have been proposed for the MCDS problem in recent years (Alzoubi et al. 2002; Butenko et al. 2002; Das and Bharghavan 1997; Stojmenovic et al. 2001). Most of the heuristics are based on the idea of creating a dominating set incrementally using some greedy technique. Some approaches try to construct a MCDS by finding a maximal independent set, which is then expanded to a CDS by adding "connecting" vertices (Butenko et al. 2002; Stojmenovic et al. 2001).

An independent set (IS) in $G$ is a set $I \subseteq V$ such that for each pair of vertices $u, v \in I$, $(u, v) \notin E$. An independent set $I$ is a maximal IS if any vertex not in $I$ has a neighbor in $I$. Obviously, any maximal independent set is also a dominating set.
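The definitions above translate directly into membership tests. The following Python sketch is only an illustration of those definitions, assuming the graph is stored as a dictionary that maps each vertex to the set of its neighbors; the function names are hypothetical and are not taken from the dissertation.

```python
from collections import deque


def is_dominating(adj, D):
    """True if every vertex outside D has at least one neighbor in D."""
    D = set(D)
    return all(v in D or bool(adj[v] & D) for v in adj)


def is_connected_subgraph(adj, S):
    """True if the subgraph induced by S is non-empty and connected (BFS)."""
    S = set(S)
    if not S:
        return False
    start = next(iter(S))
    seen = {start}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        for w in adj[u] & S:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return seen == S


def is_cds(adj, D):
    """A CDS is a dominating set that induces a connected subgraph."""
    return is_dominating(adj, D) and is_connected_subgraph(adj, D)


def is_maximal_independent(adj, I):
    """Independent (no edge inside I) and dominating (hence maximal)."""
    I = set(I)
    independent = all(not (adj[v] & I) for v in I)
    return independent and is_dominating(adj, I)


# Small example: the path 0 - 1 - 2 - 3 - 4.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(is_dominating(path, {1, 3}))              # dominating, but not connected
print(is_cds(path, {1, 2, 3}))                  # adding vertex 2 connects it
print(is_maximal_independent(path, {0, 2, 4}))  # a maximal IS is also a DS
```

On the path example, the set {1, 3} is dominating but not connected, while {1, 2, 3} is a CDS. This is the kind of "connecting" vertex that the maximal independent set approaches must add.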

PAGE 127

Figure 7-1: Approximating the virtual backbone with a connected dominating set in a unit-disk graph.

There are several polynomial-time approximation algorithms for the MCDS problem. For instance, Guha and Khuller (1998) propose an algorithm with approximation factor of $H(\Delta) + 2$, where $\Delta$ is the maximum degree of the graph and $H(n) = 1 + 1/2 + \cdots + 1/n$ is the harmonic function. Other approximation algorithms are given in Butenko et al. (2002); Marathe et al. (1995). A polynomial time approximation scheme (PTAS) for MCDS in unit-disk graphs is also possible, as shown by Hunt III et al. (1998) and more recently by Cheng et al. (2003).

A common feature of the currently available techniques for solving the MCDS problem is that the algorithms create the CDS from scratch, adding at each iteration some vertices according to a greedy criterion. For the general dominating set problem, the only exception known to the authors is briefly explained in Sanchis (2002), where a solution is created by sequentially removing vertices. A shared disadvantage of such algorithms is that they may require additional setup time, which is needed to construct a CDS from scratch. Another

PAGE 128

weakness of the existing approaches is that frequently they use complicated strategies in order to achieve a good performance guarantee.

In this chapter, we propose a new heuristic algorithm for computing approximate solutions to the minimum connected dominating set problem. In particular, we discuss in detail the application of this algorithm to the MCDS problem in unit-disk graphs. The algorithm starts with a feasible solution, and recursively removes vertices from this solution, until a minimal CDS is found (here, by a minimal CDS we mean a connected dominating set in which removing any vertex would result in a disconnected induced subgraph). Using this technique, the proposed algorithm maintains a feasible solution at any stage of its execution; therefore, there are no setup time requirements. The approach also has the advantage of being simple to implement, with experimental results comparable to the best existing algorithms.

This chapter uses standard graph-theoretical notations. Given a graph $G = (V, E)$, a subgraph of $G$ induced by the set of vertices $S$ is represented by $G[S]$. The set of adjacent vertices (also called neighbors) of $v \in V$ is denoted by $N(v)$. Also, we use $\delta(v)$ to denote the number of vertices adjacent to $v$, i.e. $\delta(v) = |N(v)|$.

The chapter is organized as follows. In Section 7.2 we present the algorithm and prove some results about its time complexity. In Section 7.3 we discuss a distributed implementation of this algorithm. Results of computational experiments with the proposed approach are presented in Section 7.4. Finally, in Section 7.5 we give some concluding remarks.

7.2 Algorithm for the MCDS Problem

In this section, we describe our algorithm for the minimum connected dominating set problem. As we already mentioned, most existing heuristics for

PAGE 129

the MCDS problem work by selecting vertices to be a part of the dominating set and adding them to the final solution. We proceed using the inverse method: the algorithm starts with all vertices in the initial CDS. Then, at each step we select a vertex using a greedy method and either remove it from the current set or include it in the final solution. Algorithm 19 is a formal description of the proposed procedure.

Algorithm 19: Compute a CDS
/* D is the current CDS; F is the set of fixed vertices */
$D \leftarrow V$; $F \leftarrow \emptyset$;
while $D \setminus F \neq \emptyset$ do
  $u \leftarrow \arg\min \{ \delta(v) \mid v \in D \setminus F \}$
  if $G[D \setminus \{u\}]$ is not connected then
    $F \leftarrow F \cup \{u\}$
  else
    $D \leftarrow D \setminus \{u\}$
    forall $s \in D \cap N(u)$ do $\delta(s) \leftarrow \delta(s) - 1$
    if $N(u) \cap F = \emptyset$ then
      $w \leftarrow \arg\max \{ \delta(v) \mid v \in N(u) \}$
      $F \leftarrow F \cup \{w\}$
    end
  end
end
Return $D$

At the initialization stage, we take the set $V$ of all vertices as the starting CDS (recall that we deal only with connected graphs). In the algorithm, we consider two types of vertices. A fixed vertex is a vertex that cannot be removed from the CDS, since its removal would result in an infeasible solution. Fixing a vertex means that this vertex will be a part of the final dominating set constructed by the algorithm. A non-fixed vertex can be removed only if its removal does not disconnect the subgraph induced by the current solution. At

PAGE 130

each step of the algorithm, at least one vertex is either fixed, or removed from the current feasible solution.

In Algorithm 19, $D$ is the current CDS; $F$ is the set of fixed vertices. In the beginning, $D = V$ and $F = \emptyset$. At each iteration of the while loop of Algorithm 19 we select a non-fixed vertex $u$ which has the minimum degree in $G[D]$. If removing $u$ makes the graph disconnected, then we clearly need $u$ in the final solution, and thus $u$ must be fixed. Otherwise, we remove $u$ from the current CDS $D$ and select some neighbor $v \in N(u)$ to be fixed, in the case that no neighbor of $u$ has been fixed before. We select a vertex with the highest connectivity to be fixed, since we want to minimize the number of fixed vertices. These steps are repeated while there is a non-fixed vertex in $D$. In the following theorem we show that the algorithm outputs a CDS correctly.

Theorem 29 Algorithm 19 returns a connected dominating set, and has the time complexity of $O(nm)$.

Proof: We show by induction on the number of iterations that the returned set $D$ is a connected dominating set. This is certainly true at the beginning, since the graph is connected, and therefore $D = V$ is a CDS. At each step we remove the vertex with minimum degree, only if the removal does not disconnect $D$. The algorithm also makes sure that for each removed vertex $u$ there is a neighbor $v \in N(u)$ which is fixed. Thus, for each vertex not in $D$ there will be at least one adjacent vertex included in the final set $D$. This implies that $D$ is a CDS.

To determine the time complexity of Algorithm 19, note that the while loop is executed at most $n - 1$ times, since we either remove or fix at least one vertex at each iteration. At each step, the most expensive operation is to determine if removing a vertex disconnects the graph. To do this we need $O(m + n)$ time, which corresponds to the time needed to run the depth first

PAGE 131

search (or breadth first search) algorithm. Thus, the total time complexity of Algorithm 19 is given by $O(nm)$. □

The proposed algorithm can be considered a convenient alternative to existing methods. Some of the advantages can be seen not only in computational complexity but in other terms as well. First of all, there is the simplicity of the method. Most algorithms for MCDS start by creating a (not necessarily connected) dominating set, and subsequently they must go through an extra step to ensure that the resulting set is connected. In the case of Algorithm 19, no extra step is needed, since connectedness is guaranteed at each iteration. Another favorable consideration is that the algorithm always maintains a feasible solution at any stage of its execution, thus providing a feasible virtual backbone at any time during the computation.

7.3 A Distributed Implementation

In this section, we discuss a distributed heuristic algorithm for the MCDS problem, based on Algorithm 19. For ad hoc wireless network applications, algorithms implemented in a non-centralized, distributed environment have great importance, since this is the way that the algorithm must run in practice. Thus, we propose a distributed algorithm that uses a strategy similar to Algorithm 19. We describe below how the steps of the algorithm are defined in terms of a sequence of messages.

In the description of the distributed algorithm, we say that a link $(v, u)$ is an active link for vertex $v$ if $u$ was not previously removed from the CDS. The messages in the algorithm are sent through active links only, since all other links lead to vertices which cannot be a part of the CDS. We assume, as usual, that there is a starting vertex, found by means of some leader election algorithm (Malpani et al. 2000). It is known (Alzoubi et al. 2002) that this

PAGE 132

can be done in $O(n \log n)$ time. We also assume that the leader vertex $v_l$ is a vertex with the smallest number of neighbors. This feature is not difficult to add to the original leader election algorithm, so we will assume that this is the case.

The execution starts from the leader, which runs the Self-removal procedure. First, we verify if removing this vertex would disconnect the subgraph induced by the resulting set of vertices. If this is the case, we run the Fix-vertex procedure, since then the current vertex must be present in the final solution. Otherwise, the Remove-vertex procedure is executed.

The Fix-vertex procedure will execute the steps required to fix the current vertex in the CDS. Initially, it sends the message NEWDOM to announce that it is becoming a dominator. Then, the current vertex looks for other vertices to be considered for removal. This is done based on the degree of each neighbor; therefore the neighbor vertex with the smallest degree will be chosen first, and will receive a TRY-DISCONNECT message.

The Remove-vertex procedure is executed only when it is known that the current vertex $v$ can be removed. The first step is to send the message DISCONNECTED to all neighbors, and then select the vertex which will be the dominator for $v$. If there is some dominator in the neighborhood, it is used. Otherwise, a new dominator is chosen to be the vertex with the highest connectivity in $N(v)$. Finally, the message SET-DOMINATOR is sent to the chosen vertex.

Computational Complexity

Note that in the presented algorithm, the step with highest complexity consists in verifying connectedness for the resulting network. To do this, we run a distributed algorithm which verifies if the graph is still connected when the current vertex is removed. An example of such an algorithm is the distributed breadth first search (BFS), which is known to

PAGE 133

run in $O(D \log^3 n)$, where $D$ is the diameter (length of the maximum shortest path) of the network, and sends at most $O(m + n \log^3 n)$ messages (Awerbuch and Peleg 1990). Thus, each step of our distributed algorithm has the same time complexity.

To speed up the process, we can change the requirements of the algorithm by asking the resulting graph to be connected in $k$ steps for some constant $k$, instead of being completely connected. To do this, the algorithm for connectedness can be modified by sending a message with TTL (time to live) equal to a constant $k$. This means that after $k$ retransmissions, if the packet does not reach the destination, then it is simply discarded. The added restriction implies that we require connectedness in at most $k$ hops for each vertex. We think that this is not a very restrictive constraint, since it is also desirable that paths between vertices are not very long.

With this additional requirement, the diameter of the graph can be thought of as a constant, and therefore the resulting time complexity for each step becomes $O(\log^3 n)$. The time complexity of the whole algorithm is $O(n \log^3 n) = \tilde{O}(n)$. We use the notation $\tilde{O}(f(n))$ to represent $O(f(n) \log^k n)$, for some constant $k$.

The number of messages sent while processing a vertex is also bounded from above by the number of messages used in the BFS algorithm. Thus, after running this on at most $n$ vertices, we have an upper bound of $O(nm + n^2 \log^3 n)$ (which is $\tilde{O}(n^2)$ when the graph is sparse) for the total message complexity. These results are summarized in the following theorem.
Theorem 30 The distributed algorithm with $k$-connectedness requirement runs in time $O(n \log^3 n) = \tilde{O}(n)$ and has message complexity equal to $O(nm + n^2 \log^3 n)$. For sparse graphs, the message complexity is $\tilde{O}(n^2)$.

The details of the resulting algorithm are shown in Figure 7-2. We prove its correctness in the following theorem.

PAGE 134

General actions:
  on TRY-DISCONNECT do Self-removal
  on SET-DOMINATOR do Fix-vertex
  on NEWDOM do { dominator $\leftarrow$ source, Fix-vertex }

Self-removal:
  If $\delta(v) = 1$, then
    send message DISCONNECTED to neighbor
    send message SET-DOMINATOR to neighbor
  Else
    run distributed BFS algorithm from this vertex.
    If some vertex is not reached then
      Fix-vertex
    Else
      Remove-vertex
    End-If
  End-If

Fix-vertex:
  If $v$ is non-fixed, then
    set $v$ to fixed
    send message NEWDOM to neighbors
    ask the degree of each non-fixed, non-removed neighbor
    send message TRY-DISCONNECT to neighbors, according to increasing degree order
  End-If

Remove-vertex:
  send to active neighbors the message DISCONNECTED
  If there is no dominator, then
    ask the degree of active neighbors
    set $u$ to neighbor with highest degree
  Else
    set $u$ to dominating neighbor
  End-If
  send message SET-DOMINATOR to vertex $u$

Figure 7-2: Actions for a vertex $v$ in the distributed algorithm.
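The connectedness test with a time-to-live of $k$, used when deciding between Fix-vertex and Remove-vertex, can be simulated centrally as a hop-limited flood. The Python sketch below is an illustration under simplifying assumptions and is not the distributed protocol itself. It floods from a single arbitrarily chosen remaining vertex (the smallest label, so labels are assumed comparable), the graph is a dictionary of neighbor sets, and the names `reachable_within` and `can_remove` are hypothetical.

```python
from collections import deque


def reachable_within(adj, active, source, ttl):
    """Vertices of `active` reached from `source` by a flood limited to `ttl` hops."""
    hops = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        if hops[u] == ttl:
            continue  # time to live exhausted: the message is discarded here
        for w in adj[u]:
            if w in active and w not in hops:
                hops[w] = hops[u] + 1
                queue.append(w)
    return set(hops)


def can_remove(adj, active, v, ttl):
    """True if every other active vertex is still reached within `ttl` hops
    once v is dropped. Simplification: the flood starts from one vertex only."""
    remaining = set(active) - {v}
    if not remaining:
        return False
    source = min(remaining)
    return reachable_within(adj, remaining, source, ttl) == remaining


# Path 0 - 1 - 2 - 3 - 4 with all vertices active.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
active = set(path)
print(can_remove(path, active, 0, ttl=4))  # an end vertex can be dropped
print(can_remove(path, active, 2, ttl=4))  # the middle vertex disconnects the path
print(can_remove(path, active, 0, ttl=2))  # too few hops to reach vertex 4
```

The third call shows the effect of the added restriction: with a small $k$ a vertex may be kept even though its removal leaves the graph connected, because the remaining vertices are more than $k$ hops apart.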

PAGE 135

Theorem 31 The distributed algorithm presented in Figure 7-2 finds a correct CDS.

Proof: One of the basic differences between the structure of connected dominating sets created by the distributed algorithm and the centralized algorithm is that we now require connectedness in $k$ steps, i.e., the diameter of the subgraph induced by the resulting CDS is at most $k$. Of course, this implies that the final solution is connected.

To show that the result is a dominating set, we argue similarly to what was proved for Algorithm 19. At each iteration, a vertex will be either removed from the solution, or set to be in the final CDS. In the case that a vertex is removed, it must be dominated by a neighbor, otherwise it will send the message SET-DOMINATOR to one of its neighbors. Thus, each vertex not in the solution is directly connected to some other vertex which is in the solution. This shows that the resulting solution is a DS, and, therefore, a CDS.

Now, we show that the algorithm terminates. First, every vertex in the network is reached, because the network is supposed to be connected, and messages are sent from the initial vertex to all other neighbors. After the initial decision (to become fixed or to be removed from the CDS), a vertex just propagates messages from other vertices, and does not ask for further information. Since the number of vertices is finite, this implies that the flow of messages will finish after a finite number of steps. Thus, the algorithm terminates, and returns a correct connected dominating set. □

7.4 Numerical Experiments

Computational experiments were run to determine the quality of the solutions obtained by the heuristic proposed for the MCDS problem. We implemented both the centralized and distributed versions of the algorithm using

PAGE 136

the C programming language. The computer used was a PC with an Intel processor and enough memory to avoid disk swap. The C compiler used was gcc from the GNU project, without any optimization. The machine was running the Linux operating system.

In the computational experiments, the testing instances were created randomly. Each graph has 100 or 150 vertices distributed randomly over an area, varying from $100 \times 100$ to $180 \times 180$ square units. The edges of a unit-disk graph are determined by the size of the radius, whose value ranged from 20 to 60 units. The resulting instances were solved by an implementation of Algorithm 19, as well as by a distributed implementation, described in the previous section. For the distributed algorithm, we used the additional requirement of $k$-connectedness with $k = 20$. The algorithms used for comparison are the ones proposed in (Alzoubi et al. 2002) and (Butenko et al. 2002). They are referred to in the results (Tables 7-1 and 7-2) as AWF and BCDP, respectively.

The results show that the non-distributed version of Algorithm 19 consistently gives results which are not worse than any of the other algorithms. The distributed version of the algorithm gives comparable results, although not as good as the non-distributed version. This can be explained by the fact that the distributed implementation lacks the benefit of global information, used by Algorithm 19 for, e.g., always finding the vertex with smallest degree. However, despite the restrictions on the distributed algorithm, it performs very well. It must also be noted that the resulting implementation is very simple compared to the other approaches, and therefore can be executed faster.

7.5 Concluding Remarks

In this chapter, we proposed a new approach to the minimum connected dominating set problem. The proposed heuristic algorithm is applied to ad

PAGE 137

hoc wireless networks, which are modeled as unit disk graphs. The algorithm is especially valuable in situations where setup time is costly, since it maintains a feasible solution at any time during the computation and thus can be executed without interrupting the network operation. A distributed version of the algorithm is also presented, which tries to adapt the basic algorithmic idea to a distributed setting. The experimental results show that both algorithms are able to find good quality solutions, with values comparable to those of some of the best algorithms. The above mentioned advantages and the simplicity of the proposed algorithm make it an attractive alternative when solving the MCDS problem in dynamic environments.
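The centralized procedure of Section 7.2 (Algorithm 19) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the C implementation used in the experiments. The graph is a dictionary of neighbor sets with comparable vertex labels, ties are broken by smallest label, an empty vertex set is treated as not connected, and the vertex to be fixed is chosen among the neighbors of $u$ that are still in $D$. The names `is_connected` and `compute_cds` are hypothetical.

```python
def is_connected(adj, S):
    """Depth-first search: is the subgraph induced by S non-empty and connected?"""
    if not S:
        return False
    start = next(iter(S))
    seen = {start}
    stack = [start]
    while stack:
        x = stack.pop()
        for y in adj[x]:
            if y in S and y not in seen:
                seen.add(y)
                stack.append(y)
    return len(seen) == len(S)


def compute_cds(adj):
    """Start from D = V and repeatedly remove or fix a minimum-degree vertex."""
    D = set(adj)                          # current CDS
    F = set()                             # fixed vertices
    deg = {v: len(adj[v]) for v in adj}   # degrees inside G[D]
    while D - F:
        # Non-fixed vertex of minimum degree (smallest label on ties).
        u = min(sorted(D - F), key=deg.get)
        if not is_connected(adj, D - {u}):
            F.add(u)                      # u is needed for connectivity
        else:
            D.remove(u)
            for s in adj[u] & D:
                deg[s] -= 1
            if not (adj[u] & F):
                # No fixed neighbor yet: fix the best-connected one still in D.
                w = max(sorted(adj[u] & D), key=deg.get)
                F.add(w)
    return D


path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(sorted(compute_cds(path)))
```

Each iteration performs one graph traversal, which is where the $O(m + n)$ cost per step in the complexity analysis comes from. On the path with five vertices the sketch returns the three interior vertices.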

PAGE 138

Table 7-1: Results of computational experiments for instances with 100 vertices, randomly distributed in square planar areas of size $100 \times 100$, $120 \times 120$, $140 \times 140$, and $160 \times 160$. The average solutions are taken over 30 iterations.

  Size       Radius   Average    AWF     BCDP    Distr.   Non-distr.
                      degree
  100 x 100    20      10.22    28.21   20.11   20.68     19.18
               25      15.52    20.00   13.80   14.30     12.67
               30      21.21    15.07   10.07   10.17      9.00
               35      27.50    11.67    7.73    8.17      6.30
               40      34.28     9.27    6.47    6.53      4.93
               45      40.70     7.60    5.80    6.13      4.17
               50      47.72     6.53    4.40    4.77      3.70
  120 x 120    20       7.45    38.62   27.71   28.38     27.52
               25      11.21    26.56   18.26   19.00     17.78
               30      15.56    20.67   13.40   14.63     12.40
               35      20.84    15.87   10.03   11.27      9.13
               40      25.09    12.67    8.33    9.00      7.00
               45      30.25    10.47    7.23    7.67      5.77
               50      36.20     8.87    6.00    6.57      4.70
  140 x 140    30      11.64    25.76   18.10   18.90     17.03
               35      15.51    20.00   13.33   14.37     12.73
               40      19.40    16.40   10.87   11.57      9.67
               45      23.48    13.33    9.03    9.40      7.77
               50      28.18    11.33    7.63    8.33      6.23
               55      33.05     9.80    6.57    7.17      5.30
               60      38.01     8.80    5.70    6.20      4.53
  160 x 160    30       9.14    31.88   22.20   23.20     21.88
               35      12.21    24.50   17.07   17.36     16.18
               40      15.63    19.73   13.47   14.07     12.43
               45      19.05    16.33   11.07   11.40      9.93
               50      22.45    13.80    9.47    9.67      7.93
               55      26.72    12.20    7.80    8.33      6.87
               60      30.38    10.33    7.10    7.47      5.80
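The instances behind these tables are random unit-disk graphs: points placed uniformly in a square, with an edge whenever two points are within the given radius. A generator of that kind can be sketched in Python as follows. The exact generator used in the experiments is not described in the text, so this is an assumption-laden illustration. It uses a uniform distribution, does not enforce connectivity, and the names `random_unit_disk_graph` and `average_degree` are hypothetical.

```python
import math
import random


def random_unit_disk_graph(n, side, radius, seed=None):
    """Place n points uniformly in a side x side square and connect every
    pair whose Euclidean distance is at most `radius`."""
    rng = random.Random(seed)
    pts = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(pts[i], pts[j]) <= radius:
                adj[i].add(j)
                adj[j].add(i)
    return pts, adj


def average_degree(adj):
    """Mean number of neighbors, as in the 'Average degree' column."""
    return sum(len(nb) for nb in adj.values()) / len(adj)


# One instance with the parameters of the first row of Table 7-1.
pts, adj = random_unit_disk_graph(100, 100, 20, seed=1)
print(average_degree(adj))
```

Since the algorithms in this chapter assume a connected graph, a practical generator would also have to discard or repair disconnected instances. That step is omitted here.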

PAGE 139

Table 7-2: Results of computational experiments for instances with 150 vertices, randomly distributed in square planar areas of size $120 \times 120$, $140 \times 140$, $160 \times 160$, and $180 \times 180$. The average solutions are taken over 30 iterations.

  Size       Radius   Average    AWF     BCDP    Distr.   Non-distr.
                      degree
  120 x 120    50      54.51     9.47    6.30    6.70      4.63
               55      63.02     7.73    5.70    6.40      4.20
               60      71.12     6.67    4.83    5.53      4.00
               65      81.08     6.00    3.73    4.30      3.53
               70      89.22     5.00    3.23    3.67      3.10
               75      98.04     4.79    3.04    2.82      2.54
               80     104.64     4.64    2.82    2.09      2.00
  140 x 140    50      42.91    11.60    7.60    8.33      6.13
               55      49.75    10.07    6.87    7.40      5.20
               60      57.74     8.33    5.87    6.80      4.47
               65      64.70     7.60    5.27    6.20      4.17
               70      72.04     6.93    4.83    5.50      4.00
               75      78.82     5.87    3.87    4.53      3.60
               80      86.55     5.47    3.33    3.93      3.33
  160 x 160    50      33.94    14.00    9.63   10.07      8.43
               55      40.31    12.27    8.47    8.97      6.73
               60      45.89    10.73    7.30    8.13      5.73
               65      53.36     9.60    6.27    6.77      4.83
               70      58.75     8.67    6.10    6.60      4.43
               75      64.39     7.80    5.40    6.13      4.37
               80      72.05     6.60    4.63    5.80      4.00
  180 x 180    50      27.92    17.60   11.57   12.37     10.37
               55      33.05    15.13   10.13   10.43      8.47
               60      38.09    12.40    8.50    9.23      7.27
               65      43.95    11.53    7.70    8.53      6.10
               70      48.75    10.13    7.17    7.43      5.33
               75      55.08     9.33    6.30    6.87      4.53
               80      60.73     8.33    5.70    6.47      4.33

PAGE 140

CHAPTER 8
CONCLUSION

This dissertation discussed optimization problems occurring in telecommunication networks, particularly in the areas of multicast and wireless ad hoc network systems. The problems presented here have in common the presence of discrete variables that must be optimized to reach optimum solutions while minimizing the use of network resources. A computational view of these problems was presented, with results about complexity and the design of efficient algorithms.

In Chapter 1, an outline of the problems discussed in the dissertation was presented. Chapter 2 gave a thorough review of the research performed in the area of multicast networks. The chapter presented the motivation and the current basis for the work developed in the area of multicast systems.

The problem of streaming cache placement in multicast networks (SCPP) was introduced in Chapter 3. An introduction to the problem and motivations, together with relevant computational complexity issues, was presented. It was shown that the SCPP in its different forms is $NP$-hard, with novel and insightful transformations from the Satisfiability problem.

The limits of approximation for SCPP problems were discussed in Chapter 4. We proved that it is not possible in general to give good approximate solutions for cache placement problems, due to their inherent complexity. Reductions from difficult problems such as the Set Cover problem were used to prove these results.

PAGE 141

Algorithms for solving the SCPP have been presented in Chapter 5. These were the first approximation algorithms proposed in the literature for this problem. We also proposed fast construction algorithms, which can be useful to find good solutions for the problem in practice.

In Chapter 6, a heuristic algorithm was proposed for the multicast routing problem (MRP). We described a construction algorithm that improves over existing techniques. This algorithm has been used in a GRASP restarting procedure to derive practical upper bounds for this problem.

The problem of computing a minimum sized backbone on wireless ad hoc networks was discussed in Chapter 7 of this dissertation. A new algorithm was proposed, which avoids the need for a setup time during the calculation of a network backbone. The results of this algorithm also show that it is very competitive in terms of solution quality and computational time. A distributed version of the algorithm has been proposed, and computational results reported.

Many questions remain as important future research topics, and each chapter of the dissertation gave a list of interesting issues for work in the area. In general, for each of the discussed problems it is of high interest to develop upper and lower bounds on approximation. Similarly, the development of new algorithms with better average case or worst case complexity is clearly an open question for future investigation.

PAGE 142

Index

N(v), 7
(V), 7
NP-hard, 9
active link, 110
ad hoc wireless networks, 104
approximation algorithms, 65
asynchronous teams, 40
best effort residual delay heuristic, 19
branch-and-cut, 38, 39
broadcasting, 4
cache node, 46
cache nodes, 43
capacity, 7
CBT, 5, 12
center based tree algorithms, 12
center-based trees, 14
closure graph, 18
congestion, 9
core points, 14
core-based trees, 3
cost w(P) of a path, 8
delay, 7
delay constrained minimum spanning tree problem, 13
delay constrained multicast routing problem, 87
delay constrained Steiner problem, 26
delay variation, 33
depth first search, 75
destination-driven multicast, 27
destinations, 8
Dijkstra's algorithm, 5, 9
dominating set, 105
DVMRP, 5, 11
dynamic center based heuristic, 19
dynamic groups, 6
elite set, 96
feasible flow, 43, 68, 73
final path relinking, 99
fixed vertex, 107
flooding, 10
flow streaming cache placement problem, 46

PAGE 143

flow techniques, 65
game communities, 3
gap preserving, 62
graph, 7
greedy randomized search procedure, 88
group-ware, 3
guiding solution, 97
IETF, 6
independent set, 106
leader election algorithm, 110
maximal IS, 106
maximum flow, 68
MBONE, 3, 6
minimum cut, 68
minimum path, 8
minimum spanning tree, 13, 25, 26
MOSPF, 3, 5
Multicast, 3
multicast group, 5, 43, 87
multicast network dimensioning problem, 38
multicast packing problem, 35, 36
multicast routing protocols, 5
multicast systems, 1
multimedia, 3
neighbors of a node, 7
network links, 7
non-fixed vertex, 107
on-line Steiner problem, 22
OSPF, 4
parent link, 11
path, 8
path delay, 8, 9
path relinking, 95
path-distance heuristics, 16
pervasive groups, 6
PIM, 3, 5, 11
point-to-point connection, 39
point-to-point connection problem, 35
preserving transformation, 62
PTAS, 59
quality factor, 23
quality of service, 9
random paths, 75
reached node, 68
reduced graph, 91
rendez-vous points, 14
required nodes, 45

PAGE 144

reverse path relinking, 99
reverse path-forwarding algorithm, 10
ring based routing, 12
root node, 12
routing tree, 43
send operation, 42
shared tree, 3
shortest best path tree, 31
shortest delay tree, 19
shortest path, 90
software delivery, 3
source, 8
source based routing, 11
sparse groups, 6
Static groups, 5
Steiner tree, 9
Steiner tree based methods, 11
streaming cache placement problem, 44
surplus, 44
TCP/IP, 4
topological center, 14
tree streaming cache placement problem, 46
triangle inequality, 40
unicast routing, 18
video-conference, 9
video-conferencing, 3
video-on-demand, 8
wireless ad hoc systems, 1




BIOGRAPHICAL SKETCH

Carlos Oliveira was born in Fortaleza, Brazil. Starting his studies in a technical school in the area of civil construction, he changed to the exciting area of computer science and decided to make it his career field. Carlos' interests in mathematics led to the choice of operations research as his specific field of study.

Carlos Oliveira holds a bachelor's degree in computer science from the State University of Ceará, Brazil (1998), where he graduated with high honors. During those years he spent more time studying computer networks than optimization. Even during his period as an undergraduate student, he held assistantships for working on research projects. Among them was the project "minhoco" to build a network prototype for teaching purposes. He also started working with some optimization problems, such as the point-to-point connection problem and the frequency assignment problem.

Carlos Oliveira completed a master's degree in computer science in the Artificial Intelligence Laboratory of the Federal University of Ceará (2000), now in the area of applied optimization.

Carlos spent the year of 2000 as a lecturer in the Computer Science Department of the Federal University of Ceará. He decided to pursue a career as a researcher, and applied to the Department of Industrial and Systems Engineering in the University of Florida.

Since the Spring of 2001, Carlos has been a full time student in the Ph.D. program of the ISE department, University of Florida. His advisor is Dr. Panos M. Pardalos. Carlos has been a teaching assistant for courses such as applied probabilities, web-based decision systems, and simulation. He is also a research assistant in the Center of Applied Optimization. His research interests include discrete optimization, mathematical programming, construction and analysis of algorithms, as well as application areas such as telecommunications, assignment theory and computational biology.


Permanent Link: http://ufdc.ufl.edu/UFE0005681/00001

Material Information

Title: Optimization Problems in Telecommunications and the Internet
Physical Description: Mixed Material
Copyright Date: 2008

Record Information

Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
System ID: UFE0005681:00001













OPTIMIZATION PROBLEMS IN TELECOMMUNICATIONS AND THE
INTERNET















By

CARLOS A.S. OLIVEIRA


A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA


2004

















To my wife Janaina.














ACKNOWLEDGMENTS

The following people deserve my sincere acknowledgments:

* My advisor, Dr. Panos Pardalos;

* Dr. Mauricio Resende, from AT&T Research Labs, who was responsible
for introducing me to this University;

* My colleagues in the graduate school of the Industrial and Systems
Engineering Department;

* My family, and especially my parents;

* My wife.















TABLE OF CONTENTS

ACKNOWLEDGMENTS

LIST OF TABLES

LIST OF FIGURES

ABSTRACT

1 INTRODUCTION

2 A SURVEY OF COMBINATORIAL OPTIMIZATION PROBLEMS IN MULTICAST ROUTING

2.1 Introduction
2.1.1 Multicast Routing
2.1.2 Basic Definitions
2.1.3 Applications of Multicast Routing
2.1.4 Chapter Organization
2.2 Basic Problems in Multicast Routing
2.2.1 Graph Theory Terminology
2.2.2 Optimization Goals
2.2.3 Basic Multicast Routing Algorithms
2.2.4 General Techniques for Creation of Multicast Routes
2.2.5 Shortest Path Problems with Delay Constraints
2.2.6 Delay Constrained Minimum Spanning Tree Problem
2.2.7 Center-Based Trees and the Topological Center Problem
2.3 Steiner Tree Problems and Multicast Routing
2.3.1 The Steiner Tree Problem on Graphs
2.3.2 Steiner Tree Problems with Delay Constraints
2.3.3 The On-line Version of Multicast Routing
2.3.4 Distributed Algorithms
2.3.5 Integer Programming Formulation
2.3.6 Minimizing Bandwidth Utilization
2.3.7 The Degree-constrained Steiner Problem
2.3.8 Other Restrictions: Non Symmetric Links and Degree Variation
2.3.9 Comparison of Algorithms
2.4 Other Problems in Multicast Routing
2.4.1 The Multicast Packing Problem
2.4.2 The Multicast Network Dimensioning Problem
2.4.3 The Point-to-Point Connection Problem
2.5 Concluding Remarks

3 STREAMING CACHE PLACEMENT PROBLEMS

3.1 Introduction
3.1.1 Multicast Networks
3.1.2 Related Work
3.2 Versions of Streaming Cache Placement Problems
3.2.1 The Tree Cache Placement Problem
3.2.2 The Flow Cache Placement Problem
3.3 Complexity of the Cache Placement Problems
3.3.1 Complexity of the TSCPP
3.3.2 Complexity of the FSCPP
3.4 Concluding Remarks

4 COMPLEXITY OF APPROXIMATION FOR STREAMING CACHE PLACEMENT PROBLEMS

4.1 Introduction
4.2 Non-approximability
4.3 Improved Hardness Result for FSCPP
4.4 Concluding Remarks

5 ALGORITHMS FOR STREAMING CACHE PLACEMENT PROBLEMS

5.1 Introduction
5.2 Approximation Algorithms for SCPP
5.2.1 A Simple Algorithm for TSCPP
5.2.2 A Flow-based Algorithm for FSCPP
5.3 Construction Algorithms for the SCPP
5.3.1 Connecting Destinations
5.3.2 Adding Caches to a Solution
5.4 Empirical Evaluation
5.5 Concluding Remarks

6 HEURISTIC ALGORITHMS FOR ROUTING ON MULTICAST NETWORKS

6.1 Introduction
6.1.1 The Multicast Routing Problem
6.1.2 Contributions
6.2 An Algorithm for the MRP
6.3 Metaheuristic Description
6.3.1 Improving the Construction Phase
6.3.2 Improvement Phase
6.3.3 Reverse Path Relinking and Post-processing
6.3.4 Efficient Implementation of Path Relinking
6.4 Computational Experiments
6.5 Concluding Remarks

7 A NEW HEURISTIC FOR THE MINIMUM CONNECTED DOMINATING SET PROBLEM ON AD HOC WIRELESS NETWORKS

7.1 Introduction
7.2 Algorithm for the MCDS Problem
7.3 A Distributed Implementation
7.4 Numerical Experiments
7.5 Concluding Remarks

8 CONCLUSION

REFERENCES

BIOGRAPHICAL SKETCH














LIST OF TABLES
Table page

2-1 Comparison among algorithms for the problem of multicast rout-
ing with delay constraints. k is the number of destinations.
** This algorithm is partially distributed. . . .. 39

2-2 Comparison among algorithms for the problem of multicast rout-
ing with delay constraints. k is the number of destinations,
TSP is the time to find a shortest path in the graph. ** In
this case amortized time is the important issue, but was not
analyzed in the original paper. ............. .. 40

5-1 Computational results for different variations of Algorithm 7 and
Algorithm 8 ..... .. .. ... .. ... ...... 90

5-2 Comparison of computational time for Algorithm 7 and Algo-
rithm 8. All values are in milliseconds. . . .. 92

6-1 Summary of results for the proposed metaheuristic for the MRP.
Column 9 (*) reports only the time spent in the construction
phase. ...... .... ..... ........ ..... 112

7-1 Results of computational experiments for instances with 100
vertices, randomly distributed in square planar areas of size
100 x 100, 120 x 120, 140 x 140, and 160 x 160. The average
solutions are taken over 30 iterations. .. . ..... 128

7-2 Results of computational experiments for instances with 150
vertices, randomly distributed in square planar areas of size
120 x 120, 140 x 140, 160 x 160, and 180 x 180. The average
solutions are taken over 30 iterations. .. . ..... 129














LIST OF FIGURES
Figure page

2-1 Conceptual organization of a multicast group. . . 6

3-1 Simple example for the cache placement problem. ...... .. 50

3-2 Simple example for the Tree Cache Placement Problem. ... 53

3-3 Simple example for the Flow Cache Placement Problem. 56

3-4 Small graph G created in the reduction given by Theorem 2. In
this example, the SAT formula is (ax V X2 V T3) A (r2 V X3 V
T4) A (rT V X3 V 4) . . ....... .... 58

3-5 Part of the transformation used by the FSCPP. . ... 61

4-1 Example for transformation of Theorem 11. . . 70

5-1 Sample execution for Algorithm 7. In this graph, all capacities
are equal to 1. Destination d2 is being added to the partial
solution, and node 1 must be added to R. . . 82

5-2 Sample execution for Algorithm 8, on a graph with unitary ca-
pacities. Nodes 1 and 2 are infeasible, and therefore are can-
didates to be included in R. .... . . ... 87

5-3 Comparison of computational time for different versions of Al-
gorithm 7 and Algorithm 8. Labels 'C3' to 'C10' refer to the
columns from 3 to 10 in Table 5-2. . . 92

6-1 Comparison between the average solution costs found by the
KMB heuristic and our algorithm. . . ..... 113

7-1 Approximating the virtual backbone with a connected dominat-
ing set in a unit-disk graph ................. .. 117

7-2 Actions for a vertex v in the distributed algorithm. . 124














Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

OPTIMIZATION PROBLEMS IN TELECOMMUNICATIONS AND THE
INTERNET

By

Carlos A.S. Oliveira

August 2004

Chair: Panos M. Pardalos
Major Department: Industrial and Systems Engineering

Optimization problems occur in diverse areas of telecommunications.

Some problems have become classical examples of application for techniques

in operations research, such as the theory of network flows. Other opportuni-

ties for applications in telecommunications arise frequently, given the dynamic

nature of the field. Every new technique presents different challenges that can

be answered using appropriate optimization techniques.

In this dissertation, problems occurring in telecommunications are discussed, with emphasis on applications in the Internet. First, a study of problems occurring in multicast routing is presented. Here, the objective is to allow the deployment of multicast services with minimum cost. A description of the problem is provided, and variations that occur frequently in some of these applications are discussed.

Complexity results are presented for multicast problems, showing that it

is NP-hard to approximate these problems effectively. Despite this, we also

describe algorithms that give some guarantee of approximation.









A second problem in multicast networks studied in this dissertation is the multicast routing problem. Its objective is to find a minimum cost route linking source to destinations, with additional quality of service constraints. A heuristic based on a Steiner tree algorithm is proposed, and used to construct solutions for the routing problem. This construction heuristic is also used as the basis to develop a restarting method, based on the greedy randomized adaptive search procedure (GRASP).

The last part of the dissertation is concerned with problems in wireless

networks. Such networks have numerous applications due to their highly dynamic

nature. Algorithms to compute near optimal solutions for the minimum back-

bone problem are proposed, which perform in practice much better than other

methods. A distributed version of the algorithm is also provided.















CHAPTER 1
INTRODUCTION

Computer networks are a relatively new communication medium that

has quickly become essential for most organizations. In this dissertation, we

present some optimization problems occurring in computer and telecommuni-

cations networks. Performing optimization on such networks is important for

several reasons, including cost and speed of communication. We concentrate

on two types of networks that have recently received much attention. The first

type is multicast systems, which are used to reliably share information with a (possibly large) group of clients. The second type of network considered in this dissertation is wireless ad hoc systems, an important type of network

with several applications.

We are mostly concerned about computational issues arising in the op-

timization of problems occurring on telecommunications networks. Thus, al-

though we present mathematical programming aspects for each of these prob-

lems, the main objective will be to derive efficient algorithms, with or without

guarantee of approximation.

The topics discussed in the dissertation are divided as follows. In Chap-

ter 2, a survey of research on the area of multicast systems is presented. The

review is used as a starting point for the topics that will be discussed later in

the dissertation related to multicast networks.

Chapter 3 introduces the problem that will be studied in the next chapters,

the streaming cache placement problem (SCPP). Variants of this basic problem

are introduced, and all variants are proved to be NP-hard.









Chapter 4 is dedicated to the study of approximability properties of the

different versions of the SCPP. It is shown that in general the SCPP cannot

have a polynomial time approximation scheme (PTAS). This demonstrates

that the SCPP is a very hard problem not only to solve exactly, but also to

approximate. We also show that for the directed flow version it is not possible

to approximate the problem within a factor smaller than $\log \log |D|$, where D is the set of

destinations.

In Chapter 5, algorithms for different versions of the SCPP are proposed.

Both approximation algorithms and heuristics are discussed. Initially,

some algorithms with performance guarantee are proposed. However, due to

complexity results, these algorithms in general do not give good results for

problems found in practice. Heuristic algorithms are then studied, and two

main strategies for construction heuristics are discussed. Results of computational experiments with these methods are presented and compared.

Another problem in multicast networks is discussed in Chapter 6. The

routing problem in multicast networks asks for an optimal route, i.e., a minimum cost tree connecting the source node to destinations. The routing problem for multicast networks is known to be NP-hard. We propose new heuristics, and use these heuristics to implement a greedy randomized adaptive search procedure (GRASP).

In the last part of the dissertation, wireless network systems are discussed.

In particular, ad hoc systems (also known as MANETs) are studied. Chapter 7

is dedicated to the problem of determining a minimum backbone for such ad

hoc networks. A new algorithm for this problem is given, and the advantages

of this algorithm are addressed. A distributed version of the algorithm is also

proposed.







Finally, in Chapter 8 general conclusions are given about the work pre-

sented in the dissertation. Future work in the area is presented, and some

concluding remarks about this area of research are given.














CHAPTER 2
A SURVEY OF COMBINATORIAL OPTIMIZATION PROBLEMS IN
MULTICAST ROUTING

In multicast routing, the main objective is to send data from one or more sources to multiple destinations, while at the same time minimizing the

usage of resources. Examples of resources which can be minimized include

bandwidth, time and connection costs. In this chapter we survey applications

of combinatorial optimization to multicast routing. We discuss the most im-

portant problems considered in this area, as well as their models. Algorithms

for each of the main problems are also presented.

2.1 Introduction

A basic application of computer networks consists of sending information to a selected, usually large, number of clients interested in some specific data. Common

examples of such applications are multimedia distribution systems (Pasquale

et al., 1998), video-conferencing (Eriksson, 1994), software delivery (Han and

Shahmehri, 2000), group-ware (Chockler et al., 1996), and game communi-

ties (Park and Park, 1997). Multicast is a technique used to facilitate this

type of information exchange, by routing data from one or more sources to a

potentially large number of destinations (Deering and Cheriton, 1990). This

is done in such a way that overall utilization of resources in the underlying

network is minimized in some sense.

To handle multicast routing, many multicast technologies have been proposed in the last decade. Examples are the MBONE (Eriksson, 1994),

MOSPF (Moy, 1994a), PIM (Deering et al., 1996), core-based trees (Ballardie









et al., 1993) and shared tree technologies (Chiang et al., 1998; Wei and Es-

trin, 1994). Each proposed technology requires the solution of (usually hard)

combinatorial problems. With the proliferation of services that require multi-

cast delivery, the associated routing methods became an important source of

problems for the combinatorial optimization community. Many objectives can

be devised when designing protocols, routing strategies, and overall networks

that can be optimized using techniques from combinatorial optimization.

In this chapter we discuss some of the combinatorial optimization problems arising in the area of multicast routing. These are very interesting in their own right, but sometimes are closely related to other well known problems. Thus,

the cross-fertilization of ideas from combinatorial optimization and multicast

networks can be beneficial to the development of improved algorithms and

general techniques. Our objective is to review some of the more interesting

problems and give examples and references of the existing algorithms. We also

discuss some problems recently appearing in the area of multicast networks and

how they are modeled and solved in the literature.

2.1.1 Multicast Routing

The idea of sending information to a large number of users is common in

systems that employ broadcasting. Radio and TV are two standard examples

of broadcasting systems which are widely used. On the other hand, networks

were initially designed to be used as a communication means among a relatively

small number of participants.

The TCP/IP protocol stack, which is the main technology underlying the Internet, uses routing protocols for delivery of packets to single destinations. Most of these protocols are based on the calculation of shortest paths. A good example of a widely used routing protocol is OSPF (Open Shortest Path First) (Moy, 1994b; Thomas II, 1998), which is used to compute routing tables for routers inside a subnetwork. In OSPF, each router in the network is responsible for maintaining a table of paths for reachable destinations. This table can be created using Dijkstra's algorithm (Dijkstra, 1959) to calculate shortest paths from the current node to all other destinations in the current subnetwork. This process can be done deterministically in polynomial time, using at most $O(n^3)$ iterations, where n is the number of nodes involved.

[Figure 2-1: Conceptual organization of a multicast group (multicast sources, network, multicast destinations).]
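The routing-table computation described for OSPF can be illustrated with a minimal sketch. It assumes a small hypothetical subnetwork stored as a dictionary of link costs; the function names `dijkstra` and `next_hops` and the sample data are illustrative assumptions, not part of OSPF or of the cited references. The sketch uses a binary heap, which is one common way of implementing Dijkstra's algorithm.

```python
import heapq

def dijkstra(graph, source):
    """Return (dist, parent) for shortest paths from source.

    graph: dict mapping node -> dict of neighbor -> nonnegative link cost.
    """
    dist = {source: 0}
    parent = {source: None}
    heap = [(0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry, a shorter label was already settled
        done.add(u)
        for v, w in graph[u].items():
            nd = d + w
            if v not in dist or nd < dist[v]:
                dist[v] = nd
                parent[v] = u
                heapq.heappush(heap, (nd, v))
    return dist, parent

def next_hops(parent, source):
    """Derive a routing table: destination -> first hop on its shortest path."""
    table = {}
    for dest in parent:
        if dest == source:
            continue
        hop = dest
        while parent[hop] != source:
            hop = parent[hop]
        table[dest] = hop
    return table

# Hypothetical four-router subnetwork with symmetric link costs.
network = {
    "a": {"b": 1, "c": 4},
    "b": {"a": 1, "c": 2, "d": 5},
    "c": {"a": 4, "b": 2, "d": 1},
    "d": {"b": 5, "c": 1},
}
dist, parent = dijkstra(network, "a")
table = next_hops(parent, "a")
```

In this example every destination is reached from router `a` through neighbor `b`, so the routing table at `a` maps each destination to `b`.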

However, with the Internet and the increased use of large networks, the ne-

cessity appeared for services targeting larger audiences. This phenomenon be-

came more important due to the development of new technologies such as vir-

tual conference (Sabri and Prasada, 1985), video on demand, group-ware (Ellis

et al., 1991), etc. This series of developments gave momentum for the creation

of multicast routing protocols. In multicast routing, data can be sent from

one or more source nodes to a set of destination nodes (see Figure 2-1). It is

required that all destinations be satisfied by a stream of data.

Dalal and Metcalfe (1978) were the first to give non-trivial algorithms

for routing of packets in a multicast network. From then on, many proposals

have been made to create technology supporting multicast routing, such as











by Deering (1988), Eriksson (1994), and Wall (1980). Some examples of multicast protocols are PIM, Protocol Independent Multicast (Deering et al., 1996); DVMRP, Distance-Vector Multicast Routing Protocol (Deering and Cheriton, 1990; Waitzman et al., 1988); MOSPF, Multicast OSPF (Moy, 1994a); and CBT, Core Based Trees (Ballardie et al., 1993). See Levine and

Garcia-Luna-Aceves (1998) for a detailed comparison of diverse technologies.

2.1.2 Basic Definitions

A multicast group is a set of nodes in a network that need to share the

same piece of information. A multicast group can have one or more source

nodes, and more than one destination. Note that even when there is more

than one source, the same information is shared among all nodes in the group.

A multicast group can be static or dynamic. Static groups cannot be changed after their creation. Starting with Wall (1980), the problem of routing

information in static groups is frequently modeled as a type of Steiner tree

problem. On the other hand, dynamic groups can have members added or

removed at any time (Waxman, 1988). Clearly the task of maintaining routes

for dynamic groups is complicated by the fact that it is not known in advance

which nodes can be added or removed.

Multicast groups can be also classified according to the relative number

of users, as described by Deering and Cheriton (1990). In sparse groups,

the number of participants is small compared to the number of nodes in the

network. In the other situation, in which most of the nodes in the network are

engaged in multicast communication, the groups involved are called pervasive groups (Waitzman et al., 1988).

For more information about multicast networks in general, one can consult

the surveys by A.J. Frank (1985), and Paul and Raghavan (2002). A good









introduction to multicasting in IP networks is given in the Internet Draft

by Semeria and Maufer (1996) (available online). Other interesting related

literature includes Du and Pardalos (1993a); Pardalos and Du (1998); Wan

et al. (1998); Pardalos et al. (2000, 1993); Pardalos and Khoury (1996, 1995).

2.1.3 Applications of Multicast Routing

Applications of multicast routing have a wide spectrum, from business

to government and entertainment. One of the first applications of multicast

routing was in audio broadcasting. In fact, the first real use of the Internet

MBONE (Multicast Backbone, created in 1992) was to broadcast audio from

IETF (Internet Engineering Task Force) meetings over the Internet (Eriksson,

1994).

Another important application of multicast routing is video conferencing (Yum et al., 1995), since this is a resource-intensive kind of application,

where a group of users is targeted. It has requirements, such as real-time

image exchanging and allowing interaction between geographically separated

users, also found in other types of multimedia applications. Being closely re-

lated to the area of remote collaboration, video conferencing has received great

attention during the last decade. Among others, Pasquale et al. (1998) give a

detailed discussion about utilization of multicast routing to deliver multimedia

content over large networks, such as the Internet. Jia et al. (1997) and Kom-

pella et al. (1996) also proposed algorithms for multicast routing applied to

real-time video distribution and video-conferencing problems.

Many other interesting uses of multicast routing have appeared during the last decade, such as video on demand, software distribution, and Internet radio and TV stations.









2.1.4 Chapter Organization

The remainder of this chapter is organized as follows. In Section 2.2 we

give a common ground for the description of optimization problems in multi-

cast routing. We start by giving the terminology used throughout the chapter,

mainly from graph theory. Then, we discuss some of the common problems

appearing in this area. In Section 2.3 we discuss delay constrained Steiner

tree problems. These are the most studied problems in multicast routing,

from the optimization point of view, being used in diverse algorithms. Thus,

we discuss many of the versions of this problem considered in the literature.

In Section 2.4 we review some other optimization problems related to multi-

cast routing. They are the multicast packing problem, the multicast network

dimensioning problem, and the point-to-point connection problem. Finally, in

Section 2.5 we give some concluding remarks about the subject.

2.2 Basic Problems in Multicast Routing

In this section we discuss the basic problems occurring in multicast networks. We start with an introduction to the terminology used. We then discuss some basic problems which are addressed in the multicast routing

literature.

2.2.1 Graph Theory Terminology

Graphs in this chapter are considered to be undirected and without loops.

In our applications, the nodes in a graph represent hosts, and edges represent

network links. We use N(v) to denote the set of neighbors of a node $v \in V$. Also, we denote by $\delta(v)$ the number of such neighbors.

With each edge $(i, j) \in E$ we can associate functions representing characteristics of the network links. The most widely used functions are capacity c(i, j), cost w(i, j) and delay d(i, j), for $i, j \in V$. For each edge $(i, j) \in E$, the associated capacity c(i, j) represents the maximum amount of data that can be sent between nodes i and j. In multicasting applications this is generally given by an integer multiple of some unit of transmission capacity, so we can say that $c(i, j) \in \mathbb{Z}_+$, for all $(i, j) \in E$.

The function w(i, j) is used to model any costs incurred by the use of the

network link between nodes i and j. This includes leasing costs, maintenance

costs, etc.

Some applications, such as multimedia delivery, are sensitive to transmis-

sion delays and require that the total time between delivery and arrival of a

data package be restricted to some particular maximum value (Ferrari and

Verma, 1990). The delay function d(i, j) is used to model this kind of con-

straint. The delay d(i, j) represents the time needed to transmit information

between nodes i and j. As a typical example, video-on-demand applications

may have specific requirements concerning the transmission time. Each packet

i can be marked with the maximum delay $d_i$ that can be tolerated for its transmission. In this case the routers must consider only paths where the total delay is at most $d_i$.

A path in a graph G is a sequence of nodes $v_{i_1}, \ldots, v_{i_j}$ where $(v_{i_k}, v_{i_{k+1}})$ is in E, for all $k \in \{1, \ldots, j-1\}$. In a routing problem we want to find paths from a source s to a set D of destinations, satisfying some requirements. The cost w(P) of a path P is defined as the sum of the costs of all edges $(v_{i_k}, v_{i_{k+1}})$ in P. A path P between nodes u and v is called a minimum path if there is no path P' between u and v in G such that w(P') < w(P). The path delay d(P) is defined as the delay incurred when routing data between nodes $v_1$ and $v_j$ through path $P = (v_1, \ldots, v_j)$. In other words, $d(P) = \sum_{k=1}^{j-1} d(v_k, v_{k+1})$.
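The additive definitions of path cost and path delay can be made concrete with a short sketch. The edge data, the helper names `edge_value` and `path_total`, and the delay bound of 10 are hypothetical values chosen only for illustration.

```python
def edge_value(values, u, v):
    """Look up an undirected edge attribute stored under (u, v) or (v, u)."""
    return values[(u, v)] if (u, v) in values else values[(v, u)]

def path_total(path, values):
    """Sum an additive edge attribute (cost w or delay d) along a node sequence."""
    return sum(edge_value(values, u, v) for u, v in zip(path, path[1:]))

# Hypothetical link data for a three-hop path s - a - b - t.
w = {("s", "a"): 2, ("a", "b"): 3, ("b", "t"): 1}   # link costs
d = {("s", "a"): 5, ("a", "b"): 1, ("b", "t"): 4}   # link delays
P = ["s", "a", "b", "t"]

cost = path_total(P, w)     # w(P): sum of edge costs along P
delay = path_total(P, d)    # d(P): sum of edge delays along P
feasible = delay <= 10      # assumed delay bound, for illustration only
```

Because both quantities are plain sums over consecutive node pairs, the same helper serves for cost and delay; only the attribute table changes.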

In this chapter, we use the words edge and link interchangeably to refer to the same object. The word link is used when it is more appropriate in the application context. For more information on graph theoretical aspects of multicast networks, see Berry (1990).

2.2.2 Optimization Goals

Different objectives can be considered when optimizing a multicast routing

problem, such as, for example, path delay, total cost of the tree, and maximum

congestion. We discuss some of these objectives.

Quality of service is an important consideration with network service, and it is mostly related to the time needed for data delivery. Depending on the quality of service requirements of an application, one of the possible goals is to minimize path delay. The best example of an application that needs this quality of service is video conferencing. The path delay is an additive delay function, corresponding to the sum of delays incurred from source to destination, for all destinations. It is interesting to note that this problem is solvable in polynomial time, since the paths from source to destination are considered separately. Shortest path algorithms such as Dijkstra's algorithm (Dijkstra, 1959) can be used to achieve this objective.

A second objective is to minimize the total cost of the routing tree. This

is again an additive metric, where we look for the minimum sum of costs for

edges used in the routing tree. In this case, however, the optimization objective

is considerably harder to achieve, since it can be shown to be equivalent to the minimum Steiner tree problem, a classical NP-hard problem (Garey and Johnson, 1979).

Another example of an optimization goal is to minimize the maximum network congestion. The congestion on a link is measured by comparing its usage with its capacity: the smaller the difference between capacity and usage, the higher the congestion. The higher the congestion, the more difficult it is to handle failures in some other links of the network. Also, higher congestion makes it harder to include new elements in an existing multicast group, and therefore is an undesirable situation in dynamic multicast. Thus, in a well designed network it is desirable to keep congestion at a minimum.

2.2.3 Basic Multicast Routing Algorithms

The most basic way of sending information to a multicast group is using flooding. With this technique, a node sends packets through all its adjacent links. If a node v receives a packet p from node u for which it is not the destination, then v first checks if p was received before. If this is true, the packet does not need to be sent again. Otherwise, v re-sends the packet to all other adjacent nodes (excluding u). The formal statement of this strategy is shown in Algorithm 1. It is clear that, in a connected network, after at most n such steps (where n is the number of nodes in the network), the packet must have reached all nodes, including the destinations. Thus, the algorithm is correct. The number of messages sent by each node is at most n. The number of messages received by v is at most $n \delta(v)$.


Receive packet p from node u
if destination(p) = v then
    Packet-Received
else
    if packet p was not previously processed then
        Send packet p to all nodes in N(v) \ {u}
    end
end

Algorithm 1: Flooding algorithm for node v in a multicast network
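The behavior of Algorithm 1 can be explored with a small simulation. This is a sketch under stated assumptions: a single packet, a centralized queue standing in for message delivery, and a hypothetical adjacency-list representation; the function name `flood` and the sample networks are illustrative only.

```python
from collections import deque

def flood(neighbors, source, destination):
    """Simulate flooding of one packet; return (delivered, messages sent)."""
    seen = {source}      # nodes that have already processed the packet
    queue = deque()      # pending deliveries, as (sender, receiver) pairs
    messages = 0
    delivered = False
    # The source sends the packet over all of its adjacent links.
    for v in neighbors[source]:
        queue.append((source, v))
        messages += 1
    while queue:
        u, v = queue.popleft()    # v receives the packet from u
        if v == destination:
            delivered = True      # Packet-Received
            continue
        if v in seen:
            continue              # previously processed: do not send again
        seen.add(v)
        for x in neighbors[v]:
            if x != u:            # forward to N(v) \ {u}
                queue.append((v, x))
                messages += 1
    return delivered, messages

# Hypothetical square network a-b-d-c-a, and a network where d is isolated.
square = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}
split = {"a": ["b"], "b": ["a"], "d": []}
result_square = flood(square, "a", "d")
result_split = flood(split, "a", "d")
```

On the square network the destination receives the packet twice, once along each side, which illustrates the redundant traffic that flooding generates. On the disconnected network the packet is never delivered.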



This method of packet routing is simple, but very inefficient. The first

reason is that it uses more bandwidth than required, since many nodes which

are not in the path to the destination will end up receiving the packet.

Second, each node in the network must keep a list of all packets which it sent,









in order to avoid loops. This makes the use of flooding prohibitive for all

but very small networks. Another problem, which is more difficult to solve,

is how to guarantee that a packet will be delivered, since the network can be

disconnected due to some link failure, for example.

The reverse path-forwarding algorithm is a method, proposed by Dalal and Metcalfe (1978), used to reduce the network usage associated with the flooding technique. The idea is that, for each node v and source node s in the network, v will determine in a distributed way which edge e = (u, v), for some $u \in V$, is in the shortest path from s to v. This edge is called the parent link. The parent link can be determined in different ways, and a very simple method is: select e = (u, v) to be the parent link for source s if this was the first edge from which a packet from s was received. With this information, a node can selectively drop incoming packets, based on their source. If a packet p is received from a link which is not considered to be in the shortest path between the source node and the current node, then p is discarded. Otherwise, the node broadcasts p to all other adjacent links, just as in the flooding algorithm. The parent link can also be updated depending on the information received from other nodes. Other algorithms can be used to enhance this basic scheme as discussed, e.g., by Semeria and Maufer (1996).
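The accept-or-drop rule of reverse path forwarding can be sketched as follows. As a simplifying assumption, the parent links are computed centrally by breadth-first search (shortest paths in hop count), whereas the text describes nodes determining them in a distributed way; the names `bfs_parents` and `reverse_path_forward` and the sample network are hypothetical.

```python
from collections import deque

def bfs_parents(neighbors, source):
    """Parent of each node on a shortest (hop-count) path from the source."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in neighbors[u]:
            if v not in parent:
                parent[v] = u
                queue.append(v)
    return parent

def reverse_path_forward(neighbors, source):
    """Broadcast one packet; return (reached nodes, accepted, dropped)."""
    parent = bfs_parents(neighbors, source)
    reached = {source}
    accepted = dropped = 0
    queue = deque((source, v) for v in neighbors[source])
    while queue:
        u, v = queue.popleft()     # v receives the packet over link (u, v)
        if parent.get(v) != u:
            dropped += 1           # not the parent link: discard the packet
            continue
        accepted += 1
        reached.add(v)
        for x in neighbors[v]:
            if x != u:             # forward on all other adjacent links
                queue.append((v, x))
    return reached, accepted, dropped

# Hypothetical square network a-b-d-c-a with source a.
square = {"a": ["b", "c"], "b": ["a", "d"], "c": ["a", "d"], "d": ["b", "c"]}
reached, accepted, dropped = reverse_path_forward(square, "a")
```

In this sketch a node decides using only the incoming link and the source, so it does not need to remember which packets it has already seen, in contrast with plain flooding.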

2.2.4 General Techniques for Creation of Multicast Routes

During the last decades a number of basic techniques were proposed for

the construction of multicast routes. Diot et al. (1997) identified some of the

main techniques used in the literature. They describe these techniques as being

divided into source based routing, center based tree algorithms, and Steiner

tree based algorithms.









In source based routing, a routing tree rooted at the source node is created for each multicast group. This technique is used, for example, in the DVMRP and PIM protocols. Some implementations of source based routing make use of the reverse path-forwarding algorithm, discussed in the previous subsection (Dalal and Metcalfe, 1978). Sriram et al. (1998) observed that this technique does a poor job in routing small multicast groups, since it tries to optimize the routing tree without considering other potential users not in the current group.

Among the source based routing algorithms, the Steiner tree based meth-

ods focus on minimization of tree cost. This is probably the most used ap-

proach, since it can leverage the large number of existing algorithms for the

Steiner tree problem. There are many examples of this technique (such as in Bharath-Kumar and Jaffe (1983); Wall (1982); Waxman (1988); Wi and Choi (1995)), which will be discussed in Section 2.3.

In contrast to source based routing, center based tree algorithms create

routing trees with a specified root node. This root node is computed to have

some special properties, such as, for example, being closest to all other nodes.

This method is well suited to the construction of shared trees, since the root

node can have properties interesting to all multicast groups. For example, if

the root node is the topological center of a set of nodes, then this is the node

which is closest to all members of the involved multicast groups. In the case

of the topological center, the problem of finding the root node becomes NP-hard, but there are other versions of the problem which are easier to solve.

An important example of use of this idea occurs in the CBT (core-based tree)

algorithm (Ballardie et al., 1993).

A recent method proposed for distributing data in multicast groups is

called ring based routing (Baldi et al., 1997; Ofek and Yener, 1997). The idea










is to have a ring linking nodes in a group, to minimize costs and improve

reliability. Note for example that trees can be broken by just one link failure;

on the other hand, rings are 2-connected structures, which offer a more reliable

interconnection.

2.2.5 Shortest Path Problems with Delay Constraints

Given a graph G(V, E), a source node s and a destination node t, with

s, t E V, the shortest path problem consists of finding a path from s to t

with minimum cost. The solution of shortest path problems is required in

most implementations of routing algorithms. This problem can be solved in

polynomial time using standard algorithms (Dijkstra, 1959; Bellman, 1958;

Ford, 1956).

However, other versions of the shortest path problem are harder, and are not known to be solvable exactly in polynomial time. An example of this occurs when we add delay constraints to the basic problem. The delay constraints require that the sum of the delays from source to each destination be less than some threshold. In this case, the shortest path problem becomes NP-hard (Garey and Johnson, 1979) and therefore heuristic algorithms must be used in order to find efficient implementations (e.g., Salama et al. (1997b)). For example, Sun and Langendoerfer (1995) and Deering and Cheriton (1990) have proposed good heuristics for this problem.
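To make the delay constrained problem concrete, the following sketch solves small instances exactly with a label-based search. It is not one of the heuristics cited above: it is an illustrative exact method whose running time can grow quickly with instance size, which is consistent with the hardness of the problem. The function name `constrained_shortest_path`, the graph format, and the sample data are assumptions made for this example.

```python
import heapq

def constrained_shortest_path(graph, s, t, max_delay):
    """Minimum-cost s-t path whose total delay is at most max_delay.

    graph: dict node -> list of (neighbor, cost, delay), all nonnegative.
    Returns (cost, delay, path), or None when no feasible path exists.
    """
    heap = [(0, 0, s, (s,))]
    best_delay = {}    # smallest delay among labels already expanded per node
    while heap:
        cost, delay, u, path = heapq.heappop(heap)
        if u == t:
            return cost, delay, list(path)
        # Labels leave the heap in nondecreasing cost order, so a label that
        # does not improve the delay at u is dominated by an earlier one.
        if u in best_delay and best_delay[u] <= delay:
            continue
        best_delay[u] = delay
        for v, c, d in graph[u]:
            if delay + d <= max_delay:   # prune labels violating the bound
                heapq.heappush(heap, (cost + c, delay + d, v, path + (v,)))
    return None

# Hypothetical network: the cheap route s-a-t is slow, the direct link is fast.
net = {
    "s": [("a", 1, 5), ("t", 5, 2)],
    "a": [("s", 1, 5), ("t", 1, 5)],
    "t": [("s", 5, 2), ("a", 1, 5)],
}
loose = constrained_shortest_path(net, "s", "t", 10)
tight = constrained_shortest_path(net, "s", "t", 8)
none = constrained_shortest_path(net, "s", "t", 1)
```

With a delay bound of 10 the cheaper two-hop route is feasible; tightening the bound to 8 forces the more expensive direct link, and a bound of 1 leaves no feasible path.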

Some algorithms for shortest path construction are less useful than others,

due to properties of their distributed implementations. According to Cheng

et al. (1989), a disadvantage of the distributed Bellman-Ford algorithm for

shortest path computation is that it is difficult to recover from link failures,

from the bouncing effect (Sloman and Andriopoulos, 1985) caused by loops,

and from termination problems caused by disconnected segments. Thus, a









chief requirement for shortest path algorithms used in multicast routing is to

have a scalable distributed implementation. The problems associated with

distributed requirements for shortest path algorithms are discussed by Cheng

et al. (1989), who proposed a distributed algorithm to overcome such limita-

tions.

2.2.6 Delay Constrained Minimum Spanning Tree Problem

In the minimum spanning tree (MST) problem, given a graph G(V, E), we need to find a minimum cost tree connecting all nodes in V. This problem can be solved in polynomial time by Kruskal's algorithm (Kruskal, 1956) or Prim's algorithm (Prim, 1957). However, similarly to the shortest path problem, the MST problem becomes NP-hard when delay constraints are applied to the

resulting paths in the routing tree. This fact can be easily shown, since the

minimum spanning tree problem is a generalization of the minimum cost path

problem.
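For reference, the unconstrained MST problem mentioned above can be solved with a few lines of code. The sketch below follows the idea of Kruskal's algorithm with a simple union-find structure; the function name `kruskal`, the edge format, and the sample graph are assumptions for illustration, and delay constraints are deliberately not handled.

```python
def kruskal(nodes, edges):
    """Return (total cost, tree edges) of a minimum spanning tree.

    edges: iterable of (cost, u, v) for an undirected connected graph.
    """
    root = {v: v for v in nodes}

    def find(v):
        # Follow parent pointers to the component representative,
        # shortening the chain along the way (path halving).
        while root[v] != v:
            root[v] = root[root[v]]
            v = root[v]
        return v

    total, tree = 0, []
    for cost, u, v in sorted(edges):   # consider links by increasing cost
        ru, rv = find(u), find(v)
        if ru != rv:                   # joins two components: no cycle
            root[ru] = rv
            total += cost
            tree.append((u, v))
    return total, tree

# Hypothetical four-node network.
nodes = ["a", "b", "c", "d"]
edges = [(1, "a", "b"), (2, "b", "c"), (4, "a", "c"), (1, "c", "d"), (5, "b", "d")]
total, tree = kruskal(nodes, edges)
```

Once a delay bound is imposed on the tree paths, this greedy choice is no longer guaranteed to produce a feasible solution, which is where the heuristics discussed in this section come in.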

Salama et al. (1997a) discuss the delay constrained minimum spanning tree problem. They propose a simple heuristic, which resembles Prim's algorithm, to give an approximate solution to the problem. The proposed method can be described as follows. In its first phase, the algorithm tries to incorporate links, ordered according to increasing cost, but without creating cycles. At each step, the algorithm must also ensure that the current (partial) solution satisfies the delay constraints. If this is not true, then a relaxation step is carried out, which consists of the following procedure. If a node can be linked by an alternative path, while reducing the delay, then the new path is selected. If, after this relaxation step, there is still no path with a suitable delay for some node, then the algorithm fails and returns just a partial answer.









Other examples of algorithms for computing delay constrained spanning

trees include the work of Chow (1991). In his paper, an algorithm for the prob-

lem of combining different routes into one single routing tree is proposed. For

more information about delay constrained routing, see Salama et al. (1997c),

where a comparison of diverse algorithms for this problem is performed.

2.2.7 Center-Based Trees and the Topological Center Problem

In the context of generation of multicast routing trees, some routing tech-

nologies, such as PIM and CBT, use the technique known as center-based

trees (Salama et al., 1996), which was initially developed in Wall (1982). This

method can be classified as a center-based routing technique, as described in

Section 2.2.4. In this approach the first step is to find the node v which is the

topological center of the set of senders and receivers. The topological center

of a graph G(V, E) is defined as the node $v \in V$ which is closest to any other

node in the network, i.e., the node v which minimizes $\max_{u \in V} d(v, u)$. Then,

a routing tree rooted at v is constructed and used throughout the multicast

session.

The basic reasoning behind the algorithm is that the topological center is

a better starting point for the routing tree, since it is expected to change less

than other parts of the tree. This scheme departs from the idea of rooting the

tree at the sender, and therefore can be extended to be used by more than one

multicast group at the same time.

Finding the topological center is, however, an NP-hard problem (Ballardie et al.,

1993). Thus, other related approaches try to find root nodes that are not

exactly the topological center, but which can be thought of as a good approx-

imation. Along these lines we have algorithms using core points (Ballardie

et al., 1993) and also rendez-vous points (Deering et al., 1994).










It is interesting to note that, for simplicity, most of the papers which try

to create routing trees using center-based techniques simply disregard the

NP-complete problem and try to find other approximations. It is not completely

understood how good these approximations can be for practical instances.

However, Calvert et al. (1995) gave an informative comparison of the different

methods of choosing the center for a routing tree, based on several experiments.

2.3 Steiner Tree Problems and Multicast Routing

In this section we discuss different versions of the Steiner tree problem,

and how they can be useful to solve problems arising in multicast routing.

Some of the algorithms for the Steiner tree problem are also presented.

2.3.1 The Steiner Tree Problem on Graphs

Steiner tree problems are very useful in representing solutions to multicast

routing problems. They are employed mostly when there is just one active

multicast group and the minimum cost tree is wanted. In the Steiner tree

problem, given a graph G(V, E), and a set $R \subseteq V$ of required nodes, we want

to find a minimum cost tree connecting all nodes in R. The nodes in $V \setminus R$

can be used if needed, and are called "Steiner" points. This is a classical

NP-hard problem (Garey and Johnson, 1979), and has a vast literature on

its own (Bauer and Varma, 1997; Du et al., 2001; Du and Pardalos, 1993b;

Hwang and Richards, 1992; Hwang et al., 1992; Kou et al., 1981; Takahashi

and Matsuyama, 1980; Winter, 1987; Winter and Smith, 1992). Thus, in

this subsection we give only some of the most used results. For additional

information about the Steiner problem, one can consult the surveys Winter

(1987); Hwang and Richards (1992); Hwang et al. (1992).

One of the best known heuristics for the Steiner tree problem was

proposed by Kou et al. (1981), and is frequently referred to as the KMB heuristic.










There is practical interest in this heuristic, since it has a performance guarantee

of at most twice the cost of the optimum Steiner tree. The steps of the KMB

heuristic are shown in Algorithm 2.


Construct a complete graph K(R, E) where the set of nodes is R.
Let the distance d(i, j), $i, j \in R$, be the length of the shortest path from i to j in G.
Find a minimum spanning tree T of K.
Replace each edge (i, j) in T by the complete path from i to j in G.
Let the resulting graph be T'.
Compute a minimum spanning tree T of T'.
repeat
    r <- false
    if there is a leaf $w \in T$ which is not in R then
        Remove w from T
        r <- true
    end
until not r

Algorithm 2: Minimum spanning tree heuristic for Steiner tree.
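
The steps of Algorithm 2 can be sketched in Python as follows. This is an illustrative reading of the pseudocode, not the original authors' implementation: the graph representation (a dictionary of dictionaries with integer node labels), the helper names dijkstra and kruskal, and the example instance are assumptions, and the graph is assumed to be connected with non-negative edge costs.

```python
import heapq
from itertools import combinations

def dijkstra(adj, src):
    """Shortest path distances and predecessors from src."""
    dist, prev, done = {src: 0}, {}, set()
    heap = [(0, src)]
    while heap:
        d, v = heapq.heappop(heap)
        if v in done:
            continue
        done.add(v)
        for u, w in adj[v].items():
            if u not in dist or d + w < dist[u]:
                dist[u], prev[u] = d + w, v
                heapq.heappush(heap, (d + w, u))
    return dist, prev

def kruskal(nodes, edges):
    """Minimum spanning tree of edges given as (weight, u, v) tuples."""
    root = {v: v for v in nodes}
    def find(v):
        while root[v] != v:
            root[v] = root[root[v]]
            v = root[v]
        return v
    tree = []
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            root[ru] = rv
            tree.append((w, u, v))
    return tree

def kmb(adj, required):
    required = list(required)
    # Complete graph K on the required nodes, weighted by shortest path length.
    sp = {r: dijkstra(adj, r) for r in required}
    k_edges = [(sp[a][0][b], a, b) for a, b in combinations(required, 2)]
    # Minimum spanning tree of K.
    t_k = kruskal(required, k_edges)
    # Replace each edge of that tree by the corresponding path in G.
    sub = set()
    for _, a, b in t_k:
        prev, v = sp[a][1], b
        while v != a:
            u = prev[v]
            sub.add((adj[u][v], min(u, v), max(u, v)))
            v = u
    # Minimum spanning tree of the expanded subgraph.
    nodes = {x for _, u, v in sub for x in (u, v)} | set(required)
    tree = kruskal(nodes, sub)
    # Repeatedly remove leaves that are not required nodes.
    req = set(required)
    while True:
        degree = {}
        for _, u, v in tree:
            degree[u] = degree.get(u, 0) + 1
            degree[v] = degree.get(v, 0) + 1
        leaves = {v for v, d in degree.items() if d == 1 and v not in req}
        if not leaves:
            return tree
        tree = [e for e in tree if e[1] not in leaves and e[2] not in leaves]

# Required nodes 0, 1, 2 are joined pairwise by links of cost 3 and are each
# connected to node 3 by a link of cost 1.
adj = {0: {1: 3, 2: 3, 3: 1}, 1: {0: 3, 2: 3, 3: 1},
       2: {0: 3, 1: 3, 3: 1}, 3: {0: 1, 1: 1, 2: 1}}
steiner_tree = kmb(adj, [0, 1, 2])
print(sum(w for w, _, _ in steiner_tree), steiner_tree)
```

On this instance all shortest paths between required nodes pass through node 3, so the heuristic returns the star centered at node 3. In general the result depends on how ties between equal-length shortest paths are broken.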



Theorem 1 (Kou et al., 1981) Algorithm 2 has a performance

guarantee of $2 - 2/p$, where $p = |R|$.

Wall (1980) made a comprehensive study of how the KMB heuristic per-

forms in problems occurring in real networks. For example, Doar and Leslie

(1993) report that this heuristic can give much better results than the claimed

guarantee, usually coming within 5% of the optimum for a large number of realistic

instances.

Another basic heuristic for the Steiner tree problem was proposed by Takahashi and

Matsuyama (1980). This heuristic works in a way similar to Dijkstra's

and Prim's algorithms. The operation of the heuristic consists of increasing

the initial solution tree using shortest paths. Thus, it is classified as part of

the broad class of path-distance heuristics. Initially, the tree is composed of

the source node only. Then, at each step, the heuristic searches for a still










unconnected destination d that is closest to the current tree T, and adds to T

the shortest path leading to d. The algorithm stops when all required nodes

have been added to the solution tree.
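
A minimal sketch of this path-distance heuristic is given below. It assumes a connected graph with non-negative edge costs stored as a dictionary of dictionaries; the function name takahashi_matsuyama and the example instance are illustrative choices rather than part of the original paper. The closest unconnected destination is found with a shortest path search started simultaneously from every node already in the tree.

```python
import heapq

def takahashi_matsuyama(adj, source, destinations):
    """Grow a tree from source by repeatedly attaching the closest destination."""
    tree_nodes = {source}
    tree_edges = []
    remaining = set(destinations) - tree_nodes
    while remaining:
        # Shortest path search with all current tree nodes at distance zero.
        dist = {v: 0 for v in tree_nodes}
        prev, done, target = {}, set(), None
        heap = [(0, v) for v in tree_nodes]
        heapq.heapify(heap)
        while heap:
            d, v = heapq.heappop(heap)
            if v in done:
                continue
            done.add(v)
            if v in remaining:
                target = v  # closest destination not yet in the tree
                break
            for u, w in adj[v].items():
                if u not in dist or d + w < dist[u]:
                    dist[u], prev[u] = d + w, v
                    heapq.heappush(heap, (d + w, u))
        if target is None:
            raise ValueError("some destination is unreachable")
        # Add the shortest path from the tree to the chosen destination.
        v = target
        while v not in tree_nodes:
            u = prev[v]
            tree_edges.append((u, v, adj[u][v]))
            tree_nodes.add(v)
            v = u
        remaining -= tree_nodes
    return tree_edges

adj = {0: {1: 3, 2: 3, 3: 1}, 1: {0: 3, 2: 3, 3: 1},
       2: {0: 3, 1: 3, 3: 1}, 3: {0: 1, 1: 1, 2: 1}}
tree = takahashi_matsuyama(adj, 0, [1, 2])
print(sum(w for _, _, w in tree), tree)
```

Because every search restarts from the whole current tree, nodes added while connecting one destination (such as node 3 in the example) become free attachment points for later destinations.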

The Steiner tree technique for multicast routing consists of using the

Steiner problem as a model for the construction of a multicast tree. In general,

it is considered that there is just one source node for the multicast group.

The set of required nodes is defined as the union of source and destinations.

This technique is one of the most studied for multicast tree construction,

with many algorithms available (Bauer and Varma, 1995; Chow, 1991; Chen

et al., 1993; Kompella et al., 1992, 1993b,a; Hong et al., 1998; Kompella et al.,

1996; Ramanathan, 1996). In the remainder of this and the next sections we

discuss the versions of this problem which are most useful, as well as algorithms

proposed for them.

In one of the first uses of the Steiner tree problem for creating multicast

trees, Bharath-Kumar and Jaffe (1983) studied algorithms to optimize the cost

and delay of a routing tree at the same time. Also, Waxman (1988) discusses

heuristics for cost minimization using Steiner trees, taking into consideration the

dynamics of inclusion and exclusion of members in a multicast group.

It is also important to note some of the limitations of the Steiner problem

as a model for multicast routing. It has been pointed out by Sriram et al.

(1998) that Steiner tree techniques work best in situations where a virtual

connection must be established. However, in the most general case of packet

networks, like the Internet, it does not make much sense to minimize the cost of

a routing tree, since each packet can take a very different route. In this case, it

is more important to have distributed algorithms with low overhead. Despite

this, Steiner trees are still useful as a starting point for more sophisticated

algorithms.









2.3.2 Steiner Tree Problems with Delay Constraints

The simplest way of applying the Steiner tree problem in multicast net-

works requires that the costs of edges in the tree represent the communication

costs incurred by the resulting multicast routes. In this case we can just apply

a number of existing algorithms, such as the ones discussed in the previous sec-

tion, for the Steiner tree problem. However, most applications have additional

requirements in terms of the maximum delay for delivery of the information.

That is the reason why the best studied version of the Steiner tree

problem applied to multicast routing is the delay constrained version (Im et al.,

1997; Kompella et al., 1992, 1993b,a; Jia, 1998; Sriram et al., 1998). We give

in this section some examples of methods used to give approximate solutions

to this problem.

One of the strategies used to solve the delay constrained Steiner tree

problem is to adapt existing heuristics, by adding delay constraints. The

heuristic proposed by Kompella et al. (1993b), for example, uses methods that

are similar to the KMB algorithm (Kou et al., 1981). The resulting heuristic

is composed of three stages. The first stage consists of finding a closure graph

of constrained shortest paths between all members of a multicast group. The

closure graph of G is a complete graph which has the set of nodes V(G) and,

for each pair of nodes $u, v \in V$, an edge representing the cost of the shortest

path between u and v.

In the second stage, Kompella's algorithm finds a constrained spanning

tree of the closure graph. To do this, the heuristic uses a greedy algorithm

based on edge costs, to find a spanning tree with low cost. In the last stage,

edges of the spanning tree found in the previous step are mapped back to

the original paths in the graph. At the same time, loops are removed using









the shortest path algorithm on the expanded constrained spanning tree. The

time complexity of the whole procedure is $O(\Delta n^3)$, where n is the number of nodes and $\Delta$ is the maximum

delay allowed by the application. It should be noted, however, that despite being

very similar to the KMB heuristic, this algorithm does not have any proved

approximation guarantee. This happens because the delay constraints make

the problem much harder to approximate.

Sriram et al. (1998) proposed an algorithm for constructing delay-

constrained multicast trees which is optimized for sparse, static groups. Their

algorithm is divided into two phases. The first phase is distributed, and works

by creating delay constrained paths from source to each destination. The paths

are created using a unicast routing algorithm, so it can use information already

available on the network. The second phase uses the computed paths to define

a routing tree. Each path is added sequentially and cycles are removed as they

appear. Basically, on iteration i, when a new path $P_i$ is added to an existing

tree $T_{i-1}$, each intersection $P_i \cap T_{i-1}$ of the path with the old tree is tested.

This is necessary to determine if just the part of $P_i$ which does not intersect

can be used, while maintaining the same delay constraint. If this is possible,

then the tree becomes $T_i$, after adding the non-intersecting part of the path.

Otherwise, the algorithm must remove some parts of the old tree in order to

avoid a cycle.

Another heuristic for the delay constrained Steiner tree problem is pre-

sented by Feng and Yum (1999). This heuristic uses the idea of constructing a

minimum cost tree, as well as a minimum delay tree, and then combining the

resulting solutions. Recall that a shortest delay tree can be computed using

some algorithm for shortest paths, in polynomial time, with the delay being

used as the cost function. Thus, the hard part of the algorithm consists of

finding the minimum cost tree and then deciding how to combine it with the









minimum delay tree. The algorithm used to compute the minimum cost, delay

constrained tree is a modification of Dijkstra's algorithm, which maintains

each path within a specified delay constraint. To combine different trees, the

algorithm employs a loop removal subroutine, which verifies if the resulting

paths still satisfy the delay constraints. The resulting complexity of this

algorithm is similar to the complexity of Dijkstra's algorithm, and therefore

is an improvement in terms of computation time.

Another possible method for designing good multicast routing trees is to

start from algorithms for computing constrained minimum paths. This was

the technique chosen by Kumar et al. (1999), who proposed two heuristics for

the constrained minimum cost routing problem. In the first heuristic, which

is called "dynamic center based heuristic", the idea is to find a center node to

which all destinations will be linked, using constrained minimum paths. The

center node c is calculated initially by finding the pair of nodes with highest

minimum delay path, and taking c as the node in the middle of this path.

Other destinations are linked using minimum delay paths with low cost. The

second heuristic, called "best effort residual delay heuristic", follows a similar

idea, but this time each node added to the current routing tree T has a residual

delay bound. New destinations are then linked to the tree through paths which

have low cost and delay smaller than the residual delay of the connecting node

$v \in T$.

Not only delay constraints have been used with the multicast routing

problem. Jiang (1992) discusses another version of the multicast Steiner tree

problem, this time with link capacity constraints. His work is related to

videoconferencing, where many users need to be source nodes during the

establishment of the conference. One of the ideas used is that, as each user can become

a source, then a distinct multicast tree must be created for each user. He









proposes some heuristics to solve this problem, with computational results for

the heuristics.

As a last example, Zhu et al. (1995) proposed a heuristic for routing

with delay constraints with complexity $O(k|V|^3 \log |V|)$. The algorithm has

two phases. In the first phase, a set of delay-bounded paths is constructed

from source to each destination, to form a delay-bounded tree. Then, in the

second phase the algorithm tries to optimize this tree, by reducing the total

cost at each iteration. The algorithm is also shown useful to optimize other

objective functions than total cost. For example, it can be used to minimize

the maximum congestion in the network, after changes in the second phase to

account for the new objective function. In the paper there are comparisons

between the proposed heuristic and the heuristic for Steiner tree problem pro-

posed by Kou et al. (1981). The results show that the heuristic achieves

solutions very close to those given by the algorithm for Steiner tree.

Sparsity and Delay Minimization

Chung et al. (1997) proposed heuristics for delay constrained minimum cost

multicast routing that take into account the structure of sparse problems. The

heuristic depends on the use of other algorithms to find approximate solutions

to Steiner problem. The Steiner tree heuristic is used to return two solutions:

in the second run, the cost function c is replaced by the delay function d. Thus,

there are two solutions which optimize different objective functions. The main

idea of the proposed algorithm is trying to optimize the cost of the routing tree,

as well as the maximum delay, at the same time. To do this, the algorithm uses

a method proposed by Blokh and Gutin (1996), which is based on Lagrangian

relaxation. A criticism that can be made of the work of Chung et al. (1997)

is that the goal of optimizing the Steiner tree with delay cost is not what is









required in most applications. For example, a solution can be optimal for this

goal, however some path from s to a destination d can still have delay greater

than a constant $\Delta$. This happens because the global optimum does not imply

that each source-destination path is restricted to the maximum delay.

2.3.3 The On-line Version of Multicast Routing

The multicast routing problem can be generalized in the following way.

Suppose that a multicast group can be increased or reduced by means of on-

line requests posted by nodes of the network. This is a harder problem, since

optimal solutions, when considering just a fixed group, can quickly become

inaccurate, and even very far from the optimum, after a number of additions

and removals.

Researchers in the area of multicast routing have devised some ways

to deal with the problem of reconfiguring a multicast tree when inclusions and

departures of members of a group occur (Aguilar et al., 1986; Waxman, 1988).

A common approach consists of modifying the simple existing algorithms in

order to avoid the re-computation of the entire tree for each change. However,

as noted in Pasquale et al. (1998), a problem with such methods is that the

global optimality of the resulting trees is lost at each change, and a very bad

solution can emerge after many such local modifications.

One of the difficulties of the source tree based techniques in this respect is

that, for each change in the multicast group, a new tree must be computed to

restore service at the required level. The algorithms necessary to create this

tree are, however, expensive, and this makes the technique not suitable for

dynamic groups. Kheong et al. (2001) proposed an algorithm to speed up the

creation of multicast routing trees in the case of dynamic changes. The idea

is to maintain caches of pre-computed multicast trees from previous groups.









The cache can be used to quickly find new paths, connecting some of the

members of the group. An algorithm for retrieving data from the path cache

was proposed, which finds similarities between the previous and the current

multicast groups. Then the algorithm constructs a connecting path using parts

of the paths stored in the cache.

The difficulty of adapting source based techniques to the dynamic case

has motivated the appearance of specialized algorithms for the on-line version

of the problem. For example, Waxman (1988) defines two types of on-line

multicast heuristics. The first type allows a rearrangement of the routing

tree after some number of changes, while the second type does not allow such

reconfigurations. The theoretical model for this problem is given by the so-

called on-line Steiner problem. In this version of the problem, one needs to

construct a solution to a Steiner problem subject to the addition and deletion

of nodes (Imase and Waxman, 1991; Westbrook and Yan, 1993; Sriram et al.,

1999). This is clearly an NP-hard problem, since it is a generalization of the

Steiner problem.

Waxman (1988) studied how a routing tree must be changed when new

nodes are added or removed. To better describe this situation, he proposed

a random graph model, where the probability that an edge exists between

two nodes depends on the Euclidean distance between them. This probability

decreases exponentially with the increase of distance between nodes. The

random inclusion of links can be used to represent the random addition of

new users to a multicast group. Waxman also described a greedy heuristic to

approximately solve instances generated according to this model.
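
A random graph of this kind can be generated with a few lines of Python. The sketch below uses the exponential form commonly associated with Waxman's model, in which an edge between nodes at distance d is created with probability beta * exp(-d / (alpha * L)), where L is the largest distance between any two nodes. The exact formula, the parameter names alpha and beta, their default values, and the placement of nodes uniformly in the unit square are assumptions made here, since the text above only states that the probability decays exponentially with distance.

```python
import math
import random

def edge_probability(d, L, alpha, beta):
    """Probability of an edge between two nodes at Euclidean distance d."""
    return beta * math.exp(-d / (alpha * L))

def waxman_graph(n, alpha=0.4, beta=0.4, seed=None):
    """Place n nodes uniformly in the unit square and draw random edges."""
    rng = random.Random(seed)
    points = [(rng.random(), rng.random()) for _ in range(n)]
    # L is the largest distance between any pair of generated points.
    L = max(math.dist(p, q) for p in points for q in points)
    edges = []
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(points[i], points[j])
            if rng.random() < edge_probability(d, L, alpha, beta):
                edges.append((i, j, d))
    return points, edges

points, edges = waxman_graph(30, seed=1)
print(len(points), len(edges))
```

With these parameter roles, increasing beta raises the overall edge density, while increasing alpha makes long edges relatively more likely. A generated graph is not guaranteed to be connected, so a practical experiment would typically retry or add edges until it is.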

Hong et al. (1998) proposed a dynamic algorithm which is capable of

handling additions and removals of elements to an existing multicast group.

The algorithm is again based on the Steiner tree problem, with added delay









constraints. However, to decrease the computational complexity of the prob-

lem, the authors employed the Lagrangian relaxation technique. According to

their results, the algorithm finds solutions very close to the optimum when the

network is sparse.

Feng and Yum (1999) devised a heuristic algorithm with the main goal of

allowing easy insertion of new nodes in a multicast group. The algorithm is

similar to Prim's algorithm for spanning trees in that, at each step, it takes a

non-connected destination with minimum cost and tries to add this destination

to the current solution. The algorithm also uses a priority queue Q where the

already connected elements are stored. The key in this priority queue is the

total delay between the elements and the source node. The algorithm uses

a parameter k to determine how to compute the path from a destination to

the current tree. Given a value of k, the algorithm computes k minimum

delay paths, from the current destination d to each of the smallest k elements

in the priority queue. Then, the best of the paths is chosen to be part of

the routing tree. An interesting feature of the resulting algorithm is that,

changing the value of the parameter k will change the amount of effort needed

to connect a destination. Clearly, when increasing the value of k, better results

will be obtained. This algorithm facilitates the inclusion of new elements,

because the same procedure can be used to grow the existing tree, in order to

accommodate a new node.

Sriram et al. (1999) proposed new algorithms for the on-line, delay con-

strained minimum cost multicast routing that try to maintain a fixed quality

of service by specifying minimum delays. The algorithm is able to adapt the

routing tree to changes in membership due to inclusions and exclusions of

users. One of the problems they try to solve is how to determine the moment

in which the tree must be recomputed, and for how long should the algorithm









just do modifications to the original tree. To answer this question, the authors

introduced the concept of quality factor, which measures the usefulness of part

of the routing tree to the rest of the users. When the quality factor of part of

a tree decreases to a specific threshold, that part of the tree must be modified.

The authors discuss a technique to rearrange the tree such that the minimum

delays continue to be respected.

The first algorithm proposed by Sriram et al. (1999) starts by creating

a set of delay constrained minimum cost paths. For each destination, a path

is created with bandwidth greater than the required bandwidth B, and with

delay less than the maximum delay $\Delta$. The next phase uses the resulting paths

to create a complete routing tree. The algorithm adds sequentially the edges

in each path, and at each step removes the loops created by the addition of

the path. Loops are removed in a way such that the delay constraints are not

violated.

The second algorithm proposed in Sriram et al. (1999) is a distributed

protocol, where initially each destination receives a message in order to add

new paths to the source tree. The nodes are kept in a priority list, ordered

by increasing delay requirements. According to the order in the list, the des-

tinations receive messages, which ask them to compute parameters over the

available paths, and then construct the new paths that will form the final

routing tree.

A technique that has been employed by some researchers consists of using

information available from unicast protocols to simplify the creation of

multicast routes. For example, Baoxian et al. (2000) proposed a heuristic for

routing with delay constraints which is based on the information given by









OSPF. Reusing this information, the resulting algorithm can run with improved

performance, in this case with complexity $O(|D||V|)$, where D is the

set of destinations.

The resulting algorithm has two steps. In the first step, it checks, for

each destination $d_i$, if there is some path from the source s to destination $d_i$

satisfying the delay constraint. In the second step, the algorithm uses another heuristic

to construct a unicast path from s to $d_i$. This heuristic basically constructs a

path using information about predecessor nodes from the unicast protocol as

well as the delay information.

2.3.4 Distributed Algorithms

The multicast routing problem is in fact a distributed problem, since each

node involved has some available processing power. Thus, it is natural to look

for distributed algorithms which can use this computational power in order to

reduce their time complexity. A number of papers have focused on distributed

strategies for delay constrained minimum spanning tree (Jia, 1998; Chen et al.,

1993).

A good example is the algorithm presented in Chen et al. (1993). The

authors propose a heuristic that is similar to the general technique used in the

KMB heuristic for Steiner tree, and the algorithm in Kompella et al. (1993b),

for example. However, the main difference is that a distributed algorithm is

used to compute the minimum spanning tree, which must be computed twice

during the execution of the heuristic. The method used to find the MST is

based on the distributed algorithm proposed by Gallager et al. (1983).

Kompella et al. (1993a) proposed some distributed algorithms targeting

applications of audio and video delivery over a network, where the restriction

on maximum delay plays an important role. The authors try to improve over









previous algorithms by using a distributed method. The main objective of

the distributed procedure is to reduce the overall computational complexity.

It must be noted, however, that, using decentralized algorithms, some of the

global information about the network becomes harder to find (for example,

global connectivity). Thus, a simplified version of the algorithm must be

proposed which does not use global information. Nonetheless, according to

the authors, the resulting algorithms stay within 15%-30% of the optimal

solution, for most test instances.

The first algorithm in Kompella et al. (1993a) is just a version of the Bellman-Ford

algorithm (Bellman, 1957) which finds a minimum delay tree from the

source to each destination. During the construction of each path, the algorithm

verifies the cost of the available edges, and chooses the one with lowest

cost which satisfies the delay constraints. The algorithm has the objective of

achieving feasibility, and therefore the results are not necessarily locally opti-

mal. This can be achieved, however, using another optimization phase, such

as a local search algorithm.

In the second algorithm, the strategy employed is similar to Prim's

algorithm for minimum spanning tree construction. It consists of growing a

routing tree, starting from the source node, until all destinations are reached.

The resulting algorithm is specialized according to different techniques for

selecting the next edge to be added to the tree. In the first edge selection

strategy proposed, edges with smallest cost are selected, such that the delay

restrictions are satisfied. The second edge selection rule tries to balance the

cost of the edge with the delay imposed by its use. This is done by a bias

factor, which gives higher priority to edges with smaller delay, among edges









with the same cost. The factor used for edge (i, j) is

$$b(i, j) = \frac{w(i, j)}{\Delta - (D(s, i) + d(i, j))},$$

where D(s, i) is the minimum total delay between the source s and node i, $\Delta$

is the maximum allowed delay, and, as usual, w(i, j) and d(i, j) are the cost

and delay between nodes i and j.
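
As a small worked example of this edge selection rule, the sketch below reads the factor as the edge cost divided by the delay budget that would remain after using the edge. The function name bias_factor, the numeric values, the convention of returning infinity when no budget remains, and the assumption that the edge with the smallest factor is preferred are all illustrative and not taken from the original paper.

```python
def bias_factor(cost, delay, delay_to_i, max_delay):
    """Edge cost divided by the delay budget left after using the edge.

    delay_to_i is the delay already accumulated from the source to node i.
    """
    residual = max_delay - (delay_to_i + delay)
    if residual <= 0:
        # Assumption: an edge that exhausts the budget is never preferred.
        return float("inf")
    return cost / residual

# Two candidate edges leaving node i, both with cost 4, when the delay from
# the source to i is 2 and the maximum allowed delay is 10.
fast = bias_factor(cost=4, delay=1, delay_to_i=2, max_delay=10)
slow = bias_factor(cost=4, delay=6, delay_to_i=2, max_delay=10)
print(fast, slow, fast < slow)
```

Both edges have the same cost, but the faster one leaves a larger residual delay and therefore obtains the smaller factor, which matches the stated preference for edges with smaller delay among edges of equal cost.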

The authors also discuss the problem of termination, which is an im-

portant question for distributed algorithms. In this case, the problem exists

because some configurations can report an infeasible problem, while feasibility

can be restored by making some changes to the current solution.

Shaikh and Shin (1997) presented a distributed algorithm where the focus

is to reduce the complexity of distributed versions of heuristics for the

delay constrained Steiner problem. In their paper, the authors try to adapt

the model of Prim's and Dijkstra's algorithms to the harder task of creating

a multicast routing tree. In this way, they aim to reduce the complexity

associated with the heuristics for Steiner tree, while producing good solutions

for the problem. The methods employed by Dijkstra's shortest path and Prim's

minimum spanning tree algorithm are interesting because they require only

local information about the network, and therefore they are known to perform

well in distributed environments. The operation of these algorithms consists

of adding at each step a new edge to the existing tree, until some termination

condition is satisfied.

In the algorithm proposed by Shaikh and Shin, the main addition made

to the structure of Dijkstra's algorithm is a method for distinguishing between

destinations and non-destination nodes. This is done by the use of an

indicator function $I_D$ which returns 1 if and only if the argument is not a

destination node. The general strategy is presented in Algorithm 3. Note










that in this algorithm the accumulated cost of a path is set to zero every time

a destination node is reached. This guides the algorithm to find paths that

pass through destination nodes with higher probability. This strategy is called

destination-driven multicast. The resulting algorithm is simple to implement

and, according to the authors, performs well in practice.


input: G(V, E), s
for $v \in V$ do d[v] <- $\infty$
d[s] <- 0
S <- $\emptyset$
Q <- V /* Q is a queue */
while $Q \neq \emptyset$ do
    v <- get_min(Q)
    S <- $S \cup \{v\}$
    for $u \in N(v)$ do
        if $u \notin S$ and d[u] > d[v] $I_D$[v] + w(u, v) then
            d[u] <- d[v] $I_D$[v] + w(u, v)
        end
    end
end

Algorithm 3: Modification of Dijkstra's algorithm for multicast rout-
ing, proposed by Shaikh and Shin (1997).
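
Algorithm 3 can be sketched in Python as shown below. This is a centralized illustration only, not the distributed protocol of the original paper. The priority queue is implemented with a binary heap and lazy deletion, and a parent map is recorded so that the resulting tree can be read off; the parent map, the function name, the dictionary-based graph format, and the example instance are additions assumed here for illustration.

```python
import heapq

def destination_driven_dijkstra(adj, source, destinations):
    """Dijkstra-like search whose accumulated cost resets at destinations."""
    dests = set(destinations)
    d = {v: float("inf") for v in adj}
    parent = {v: None for v in adj}
    d[source] = 0
    done = set()
    heap = [(0, source)]
    while heap:
        _, v = heapq.heappop(heap)
        if v in done:
            continue  # stale heap entry
        done.add(v)
        # The indicator is 0 at destinations and 1 elsewhere, so the cost
        # carried forward from a destination node is zero.
        base = 0 if v in dests else d[v]
        for u, w in adj[v].items():
            if u not in done and d[u] > base + w:
                d[u] = base + w
                parent[u] = v
                heapq.heappush(heap, (d[u], u))
    return d, parent

# Source 0 with destinations 1 and 2. The direct link 0-2 costs 3, while
# the links 0-1 and 1-2 cost 2 each.
adj = {0: {1: 2, 2: 3}, 1: {0: 2, 2: 2}, 2: {0: 3, 1: 2}}
d, parent = destination_driven_dijkstra(adj, 0, [1, 2])
tree_cost = sum(adj[parent[v]][v] for v in adj if parent[v] is not None)
print(parent, tree_cost)
```

In this instance an ordinary shortest path tree would attach node 2 directly to the source, for a total tree cost of 5. Because the accumulated cost is reset at destination 1, node 2 is attached through node 1 instead and the tree cost drops to 4, at the price of a longer path from the source to node 2.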


Mokbel et al. (1999) discuss a distributed heuristic algorithm for delay-constrained

multicast routing, which is divided into a number of phases. The

initial phase of the algorithm consists of discovering information about nodes

in the network, particularly about delays incurred by packets. In this phase,

a packet is sent from a source to all other nodes in the neighborhood. The

packet is duplicated at each node, using the flooding technique. At each node

visited, information about the total delay and cost experienced by the packet

is collected, added to the packet, and retransmitted. Each destination will

receive packets with information about the path traversed during the delay

time previously defined. After receiving the packets, each destination can









select the resulting path with lowest cost to be the chosen path. As the last

step of the initial phase, all destinations send this information to the source

node.

In the second phase, the source node will receive the selected paths for each

destination and construct a routing tree based on this information. This is a

centralized phase, where existing heuristics can be applied to the construction

of the tree. To improve the performance of the algorithm, and to avoid an

overload of packets in the network during the flooding phase, each node is

required to maintain at most K packets at any time. Using this parameter,

the time complexity of the whole algorithm is $O(K^2 |V|^2)$.

Sparse Groups

An important case of multicast routing occurs when the number of sources

and destinations is small compared to the whole network. This is the typical

case for big instances, where just a few nodes will participate in a group, at each

moment. For this case, Sriram et al. (1998) proposed a distributed algorithm

which tries to exploit the sparsity of the problem. The algorithm initially uses

information available through a unicast routing protocol to find pre-computed

paths in the current network. However, problems can appear when these paths

induce loops in the corresponding graph. In the algorithm, such intersections

are treated and removed dynamically. The algorithm starts by creating a list

of destinations, ordered according to their delay constraints. Nodes with

stricter delay constraints have the opportunity of searching first for paths.

Each destination $d_i$ will independently try to find a path from $d_i$ to the source

s. If during this process a node v previously added to the routing tree is

found, then the process stops and a new phase starts. In this new phase,

paths are generated from v to the destination $d_i$. The destination $d_i$ chooses one









of the paths according to a selection function SF (similar to the function used

by Kompella et al. (1993a)), which is defined for each path P and given by

$$SF(P) = \frac{C(P)}{\Delta - D_T(s, v) - D(P)},$$

where C(P) and D(P) are the cost and delay of path P, $\Delta$ is the maximum

delay in this group, and $D_T(s, v)$ is the current delay between the source s

and node v.

A problem that exists when the multicast group is allowed to have dy-

namic membership, is that a considerable amount of time is spent in the pro-

cess of connection configuration. Jia (1998) proposes a distributed algorithm

which addresses this question. A new distributed algorithm is employed, which

integrates the routing calculation with the connection configuration phase.

Using this strategy, the number of messages necessary to set up the whole

multicast group is decreased.

2.3.5 Integer Programming Formulation

Integer programming has been very useful in solving combinatorial opti-

mization problems, via the use of relaxation and implicit enumeration meth-

ods. An example of this approach to multicast routing is given by Noronha and

Tobagi (1994), who studied the routing problem using an integer programming

formulation. They discuss a general version of the problem in which there are

costs and delays for each link, and a set $\{1, \ldots, T\}$ of multicast groups, where

each group i has its own source $s_i$, a set of $n_i$ destinations $d_{i1}, \ldots, d_{in_i}$, a

maximum delay $L_i$, and a bandwidth request $r_i$. There is also a matrix $B^i \in \mathbb{R}^{n \times n_i}$,

for each group $i \in \{1, \ldots, T\}$, of source-destination requirements. The value

of $B^i_{jk}$ is 1 if $j = s_i$, $-1$ if $j = d_{ik}$, and 0 otherwise. The node-edge incidence

matrix is represented by $A \in \mathbb{Z}^{n \times m}$.









The network considered has n nodes and m edges. The vectors $W \in \mathbb{R}^m$,

$D \in \mathbb{R}^m$ and $C \in \mathbb{R}^m$ give respectively the costs, delays and capacities for

each link in the network. The variables in the formulation are $X^1, \ldots, X^T$

(where each $X^i$ is an $m \times n_i$ matrix), $Y^1, \ldots, Y^T$ (where each $Y^i$ is a vector of m

elements), and $M \in \mathbb{R}^T$. The variable $X^i_{jk} = 1$ if and only if link j is used by

group i to reach destination $d_{ik}$. Similarly, variable $Y^i_j = 1$ if and only if link

j is used by multicast group i. Also, variable $M_i$ represents the delay incurred

by multicast group i in the current solution.

In the following formulation, the objectives of minimizing total cost and

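maximum delay are considered. The constant values $\beta_c$ and $\beta_d$ represent

the relative weight given to the minimization of the cost and to the

minimization of the delays, respectively. Using the variables shown, the integer

programming formulation is given by:

\[ \min \sum_{i=1}^{T} r_i \beta_c W Y^i + \beta_d M_i \tag{2.1} \]

subject to

\begin{align}
A X^i &= B^i && \text{for } i = 1, \ldots, T \tag{2.2} \\
X^i_{jk} &\le Y^i_j \le 1 && \text{for } i = 1, \ldots, T,\ j = 1, \ldots, m,\ k = 1, \ldots, n_i \tag{2.3} \\
M_i &\ge \sum_{j=1}^{m} D_j X^i_{jk} && \text{for } i = 1, \ldots, T,\ k = 1, \ldots, n_i \tag{2.4} \\
M_i &\le L_i && \text{for } i = 1, \ldots, T \tag{2.5} \\
\sum_{i=1}^{T} r_i Y^i &\le C \tag{2.6} \\
X^i_{jk},\ Y^i_j &\in \{0, 1\} && \text{for } i = 1, \ldots, T,\ j = 1, \ldots, m,\ k = 1, \ldots, n_i \tag{2.7}
\end{align}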

The constraints in the above integer program have the following meaning.

Constraint (2.2) is the flow conservation constraint for each of the multicast

groups. Constraints (2.3) determine that an edge must be selected when it is

used by any multicast tree. Constraints (2.4) and (2.5) determine the value

of the delay, since it must be at least the sum of the delays on the path to each

destination of the multicast group and at most the maximum acceptable delay $L_i$.

Finally, constraint (2.6) says that each edge j can carry a value which is at most

the capacity $C_j$. This is a very general formulation, and it cannot be expected to

be solved exactly in polynomial time because of the integrality constraints (2.7).

This formulation is used in Noronha and Tobagi (1994) to derive an exact

algorithm for the general problem. Initially, the decomposition technique was

used to decompose the constraint matrix in smaller parts, where each part

could be solved more easily. This can be done using standard mathematical

programming techniques, as shown, e.g., in Bazaraa et al. (1990). Then, a branch-

and-bound algorithm is proposed for the resulting problem. In this branch-and-

bound, the lower bounding procedure uses the decomposition found initially

to improve the efficiency of the lower bound computation.

2.3.6 Minimizing Bandwidth Utilization

A problem that usually happens when constructing multicast trees is the

tradeoff between bandwidth used and total cost of the tree. Traditional algo-

rithms for tree minimization try to reduce the total cost of the tree. However,

this in general does not guarantee minimum bandwidth utilization. On the

other hand, there are algorithms for minimization of the bandwidth that do

not maintain the minimum cost. For example, a greedy algorithm, as described

in Fujinoki and Christensen (1999), works by connecting destinations sequen-

tially to the source. Each destination is linked to the nearest node already

connected to the source. In this way, bandwidth is saved by reusing existing

paths.
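
A minimal Python sketch of this kind of greedy construction is given below. It is only an illustration under stated assumptions: the graph is undirected and stored as a dictionary of neighbor weights, "nearest" is measured by shortest-path distance, and the function name `greedy_multicast_tree` is hypothetical rather than taken from Fujinoki and Christensen (1999).

```python
import heapq


def greedy_multicast_tree(graph, source, destinations):
    """Connect each destination, in order, to the nearest node already in the tree.

    graph: dict mapping node -> dict of neighbor -> non-negative weight (symmetric).
    Returns the set of tree edges, each stored as a frozenset of its two endpoints.
    Node labels must be mutually comparable (they are used as heap tie-breakers).
    """
    tree_nodes = {source}
    tree_edges = set()
    for d in destinations:
        if d in tree_nodes:
            continue
        # Dijkstra from d, stopping at the first settled node that is in the tree.
        dist, parent, settled = {d: 0}, {}, set()
        heap = [(0, d)]
        hit = None
        while heap:
            du, u = heapq.heappop(heap)
            if u in settled:
                continue
            settled.add(u)
            if u in tree_nodes:
                hit = u
                break
            for v, w in graph[u].items():
                nd = du + w
                if nd < dist.get(v, float("inf")):
                    dist[v], parent[v] = nd, u
                    heapq.heappush(heap, (nd, v))
        if hit is None:
            raise ValueError(f"destination {d!r} cannot reach the tree")
        # Walk back from the attachment point to d, adding the path to the tree.
        u = hit
        while u != d:
            p = parent[u]
            tree_edges.add(frozenset((u, p)))
            tree_nodes.add(p)
            u = p
    return tree_edges


example_graph = {
    "s": {"a": 1, "b": 3},
    "a": {"s": 1, "b": 1, "c": 1},
    "b": {"s": 3, "a": 1},
    "c": {"a": 1},
}
example_tree = greedy_multicast_tree(example_graph, "s", ["b", "c"])
```

In the example, destination c attaches to node a, which is already in the tree, so the path from s to a is reused instead of being duplicated.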









Fujinoki and Christensen (1999) proposed a new algorithm for maintaining

dynamic multicast trees which tries to solve the tradeoff problem discussed

above. The algorithm, called "shortest best path tree" (SBPT), uses shortest

paths to connect sources to destinations. The authors represent distance

between nodes as the minimum number of edges in the path between them.

The first phase in the algorithm consists of computing the shortest path

from s to all destinations $d_i$. In the second phase, the algorithm performs

a sequence of steps for each destination $d_i \in D$. Initially, it computes the

shortest paths from $d_i$ to all other nodes in G. Then, the algorithm takes

the node u which has minimum distance from $d_i$ and at the same time occurs

in one of the shortest paths from s to $d_i$. By making this choice, the method

tries to favor the nodes already in the routing tree giving the smallest possible

increase in the total cost.

2.3.7 The Degree-constrained Steiner Problem

If the number of links from any node in the network is required to be

a fixed value, then we have the degree-constrained version of the multicast

routing problem. For some applications of multicasting it is difficult to make a

large number of copies of the same data. This is particularly true for high

speed switches, where the speed requirements may prohibit in practice an

unbounded number of copies of the received information. For example, in

ATM networks, the number of out connections can have a fixed limit (Zhong

et al., 1993). Thus, it is interesting to consider Steiner tree problems where

the degree of each node is constrained.

Bauer (1996) proposed algorithms for this version of the problem, and

tried to construct degree-constrained multicast trees as a solution. Bauer and

Varma (1995) reviewed the traditional heuristics for Steiner tree, and new









heuristics were given, which consider the restriction in the number of adjacent

nodes. They show that the heuristics for degree-constrained Steiner tree give

solutions very close to the optimum for sample instances of the general Steiner

problem. They also show experimentally that, despite the restriction on the

node degrees, almost all instances have feasible solutions which have been

found by the heuristics.

2.3.8 Other Restrictions: Non Symmetric Links and Degree Variation

An interesting feature of real networks, which is not mentioned in most of

the research papers, is that links are, in general, non symmetric. The capacity

in one direction can be different from the capacity in the other direction,

for example, due to congestion problems in some links. Ramanathan (1996)

considered this kind of restriction. In his work, the minimum cost routing tree

is modeled as a minimum Steiner tree with constraints, where the network

has non symmetric links. The author proposes an approximation algorithm,

with fixed worst case guarantee. The resulting algorithm has also the nice

characteristic of being parameterizable, and therefore it allows the trading of

execution time for accuracy.

Another restriction, which is normally disregarded, was considered in the

approach taken by Rouskas and Baldine (1996), who proposed the minimization

of the so-called delay variation. The delay variation is defined as the

difference between the minimum and maximum delay defined by a specific

routing tree. In some applications it is interesting that this variation stay

within a specific range. For example, it can be desirable that all nodes receive

the same information at about the same time.
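
As a small illustration of this definition, the Python sketch below computes the delay variation of a given routing tree. The tree representation (a dictionary from each node to its children and link delays) and the function names are assumptions made for the example, not notation from Rouskas and Baldine (1996).

```python
def destination_delays(tree, source, destinations):
    """Accumulate link delays from the source down a rooted tree.

    tree: dict mapping node -> list of (child, link_delay) pairs.
    Returns a dict with the end-to-end delay of every destination.
    """
    delay = {source: 0}
    stack = [source]
    while stack:
        u = stack.pop()
        for child, link_delay in tree.get(u, []):
            delay[child] = delay[u] + link_delay
            stack.append(child)
    return {d: delay[d] for d in destinations}


def delay_variation(tree, source, destinations):
    """Difference between the largest and smallest destination delay."""
    delays = destination_delays(tree, source, destinations)
    return max(delays.values()) - min(delays.values())


example_tree = {"s": [("a", 2), ("b", 5)], "a": [("c", 4)]}
variation = delay_variation(example_tree, "s", ["b", "c"])
```

Here destination b is reached with delay 5 and destination c with delay 6, so the variation is 1.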








Table 2-1: Comparison among algorithms for the problem of multicast routing
with delay constraints. k is the number of destinations. ** This algorithm
is partially distributed.

Algorithm                         Guarantee   Complexity          Types of instances
KMB (Kou et al., 1981)            2           $O(kn^2)$           general
Takahashi and Matsuyama (1980)    2           $O(kn^2)$           general
Kompella et al. (1993b)           -           $O(n^3 \Delta)$     general
Sriram et al. (1998)              -           N/A **              sparse, static groups
Feng and Yum (1999)               -           $O(n^2)$            general
Kumar et al. (1999)               -           $O(n^3)$            center based
Jiang (1992)                      -           $O(n^3)$            capacity constrained, videoconferencing
Chung et al. (1997)               -           $O(n^3)$            sparse instances
Zhu et al. (1995)                 -           $O(kn^3 \log n)$    sparse instances



2.3.9 Comparison of Algorithms

A Comparison of Non-distributed Approaches

Table 2-1 gives a summary of features of the algorithms for the Steiner

tree problem with delay constraints discussed in this section. Most of them

have similar computational complexity of the order of $O(n^3)$, where n is the

number of nodes in the network. The best result is obtained by Feng and Yum

(1999), who combine the work of finding good solutions in terms of cost and

delay. The heuristic by Chung et al. (1997) is reported to run faster than other

heuristics, however it must be noted that it is optimized for sparse instances.

Regarding approximation, only the first two algorithms in the table have

known approximation guarantee (constant and equal to 2). However, it is

not difficult to achieve similar performance guarantees on heuristics with the

same complexity as KMB, for example. This can be performed by running the

heuristic with known performance guarantee, followed by the normal heuristic,

and then reporting the best solution.
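
The argument can be written out in one line. As a sketch, assume that $H_1$ is a heuristic with performance guarantee 2, that $H_2$ is any other heuristic, that both return feasible solutions of the same instance with costs $c(H_1)$ and $c(H_2)$, and that $\mathrm{OPT}$ denotes the optimal cost (these symbols are introduced here only for illustration). Then

```latex
\[
  \min\{\, c(H_1),\; c(H_2) \,\} \;\le\; c(H_1) \;\le\; 2\,\mathrm{OPT},
\]
```

so the combined procedure inherits the guarantee of $H_1$, while its running time is the sum of the two running times and therefore stays of the same order when both heuristics have the same complexity.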










Table 2-2: Comparison among algorithms for the problem of multicast routing
with delay constraints. k is the number of destinations, $T_{SP}$ is the time
to find a shortest path in the graph. ** In this case amortized time is the
important issue, but was not analyzed in the original paper.

Algorithm                  Complexity            Types of instances
Kompella et al. (1993a)    $O(n^3)$              general on-line instances
Baoxian et al. (2000)      $O(mn)$               based on unicast information
Sriram et al. (1999)       $O(n^3)$              instances with QoS
Hong et al. (1998)         $O(mk(k + T_{SP}))$   dynamic, delay sensitive
Kheong et al. (2001)       N/A **                general instances, cache must be maintained



A Comparison of On-Line Approaches

Table 2-2 presents a comparison of algorithms proposed for the on-line

version of the Steiner tree problem with delay constraints, as discussed in

Section 2.3.3. These algorithms in general do not provide a guarantee of

approximation, due to the dynamic nature of the problem.

From the algorithms shown in Table 2-2, the one with lowest complexity is

given by Baoxian et al. (2000). However, this complexity is kept low due to the

dependence of the algorithm on information given by other protocols operating

unicast routing. Hong et al. (1998), however, consider the construction of a

complete solution, and have the on-line issues as an additional feature. Kheong

et al. (2001) also consider a technique where information is reused, but in

this case from previous iterations of the algorithm. It is difficult to evaluate

the complexity of the whole algorithm, since it depends on the amortized

complexity on a large number of iterations. This kind of analysis is not carried

out in the paper.









A Comparison of Distributed Approaches

Distributed approaches for the Steiner tree problem with delay constraints

are more difficult to evaluate in the sense that other features become impor-

tant. For example, in distributed algorithms the message complexity, i.e.,

the number of message exchanges, is an important indicator of performance.

These factors are sometimes not derived explicitly in some of the papers.

For the algorithm proposed by Chen et al. (1993), the message complexity

is shown to be $O(m + n(n + \log n))$, and the time complexity is $O(n^2)$. Also,

the worst-case ratio of the solution obtained to the cost of any given minimum

cost Steiner tree T is $2(1 - 1/l)$, where l is the number of leaves in T.

On the other hand, Shaikh and Shin (1997) do not give much information

about the complexity of their distributed algorithm. It can be noted however

that the complexity is similar to that of distributed algorithms for the com-

putation of a minimum spanning tree. Finally, Mokbel et al. (1999) derive

only the total time complexity of their algorithm, which is $O(K^2 n^2)$, where K

is a constant value introduced to decrease the number of message exchanges

required.

2.4 Other Problems in Multicast Routing

In this section we present some other problems occurring in multicast

routing and which have interesting characteristics, in terms of combinatorial

optimization. The first of these problems is the multicast packing problem,

where the objective is to optimize the design of the entire network in order to

provide capacity for a specific number of multicast groups. Then, we discuss

the point-to-point connection problem, which is a generalization of the Steiner

tree problem.









2.4.1 The Multicast Packing Problem

A more general view of the multicast routing problem can be found if we

consider the required constraints when more than one multicast group exists.

In this case, there is a number of applications that try to use the network

for the purpose of establishing connection and sending information, organized

in different groups. Thus, the network capacity must be shared according

to the requirements of each group. These capacity constraints are modeled

in what is called the multicast packing problem in networks. This problem

has attracted some attention in the past few years (Wang et al., 2002; Priwan

et al., 1995; Chen et al., 1998).

The congestion $\lambda_e$ on edge e is given by the sum of all load imposed

by the groups using e. The maximum congestion $\lambda$ is then defined as the

maximum of all congestion $\lambda_e$, over edges $e \in E$. If we assume that there are

K multicast groups, and each group k generates an amount $t_k$ of traffic, an

integer programming formulation for the multicast packing problem is given

by

\begin{align}
\min\ & \lambda \tag{2.8} \\
\text{subject to}\ & \sum_{k=1}^{K} t_k x^k_e \le \lambda && \text{for all } e \in E \tag{2.9} \\
& x^k \in \{0, 1\}^{|E|} && \text{for } k = 1, \ldots, K, \tag{2.10}
\end{align}

where variable $x^k_e$ is equal to one if and only if the edge e is being used by

multicast group k.
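
To make the objective concrete, the following Python sketch evaluates the congestion of a given assignment of edges to groups, that is, the left-hand side of (2.9) for every edge and its maximum. The input layout (a list of traffic and edge-set pairs) and the function names are assumptions chosen for the example.

```python
def edge_congestion(groups):
    """Sum, for every edge, the traffic of all groups that use it.

    groups: list of (traffic t_k, set of edges used by group k) pairs.
    """
    load = {}
    for traffic, edges in groups:
        for e in edges:
            load[e] = load.get(e, 0) + traffic
    return load


def max_congestion(groups):
    """Maximum congestion over all used edges (0 if no edge is used)."""
    return max(edge_congestion(groups).values(), default=0)


example_groups = [
    (2, {("a", "b"), ("b", "c")}),
    (3, {("b", "c")}),
    (1, {("a", "b")}),
]
example_load = edge_congestion(example_groups)
example_max = max_congestion(example_groups)
```

In this example edge (a, b) carries 3 units and edge (b, c) carries 5 units, so the maximum congestion is 5. The sketch only evaluates a given solution; it does not check that each edge set actually connects its group.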

A number of approaches have been proposed for solving this problem.

For example, Wang et al. (2002) discuss how to set up multiple groups using

routing trees, and formalized this as a packing problem. Two heuristics were









then proposed. The first one is based on known heuristics for constructing

Steiner trees. The second is based on the cut-set problem. The constraints

considered for the Steiner tree problem are, first, the minimum cost under

bounded tree depth; and second, the cost minimization under bounded degree

for intermediate nodes.

Priwan et al. (1995) and Chen et al. (1998) proposed formulations for

the multicast packing problem using integer programming. The last authors

considered two ways of modeling the routing of multicast information. In the

first method, the information is sent according to a minimum cost tree among

nodes in the group, and therefore gives rise to the Steiner tree problem. In the

second, more interesting version, the information is sent through a ring which

visits all elements in the group, and therefore this results in a problem similar

to the traveling salesman problem. Using these formulations they describe

heuristics that can be applied to get approximate solutions. Comparisons

were done between the two proposed formulations with respect to the quality

of the solutions found to the multicast packing problem.

In the integer formulation of the multicast packing problem, there is a

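variable $x_e$ for each edge $e \in E$, which is equal to one if and only if this

edge is selected. Each edge has also an associated cost $w_e$. Then, the integer

formulation for the tree version of the problem is given by

\begin{align}
\min\ & \sum_{e \in E} w_e x_e \tag{2.11} \\
\text{subject to}\ & \sum_{e \in \delta(S)} x_e \ge 1 && \text{for all } S \subset V \text{ such that } m_1 \in S \text{ and } M \not\subseteq S \tag{2.12} \\
& x \in \{0, 1\}^{|E|}, \tag{2.13}
\end{align}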









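where M is the set of nodes participating in a multicast group and $\delta(S)$

represents the edges leaving the set $S \subset V$. The integer program for the ring-based

version is given by

\begin{align}
\min\ & \sum_{e \in E} w_e x_e \tag{2.14} \\
\text{subject to}\ & \sum_{e \in \delta(v)} x_e = 2 && \text{for all } v \in M \tag{2.15} \\
& \sum_{e \in \delta(v)} x_e \le 2 && \text{for all } v \in V \setminus M \tag{2.16} \\
& \sum_{e \in \delta(S)} x_e \ge 2 && \text{for all } S \subset V \text{ s.t. } u \in S \text{ and } M \not\subseteq S \tag{2.17} \\
& x \in \{0, 1\}^{|E|}. \tag{2.18}
\end{align}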


Here, u is any element of M. The integer solution of this problem defines

a ring passing through all nodes participating in group M. In Chen et al.

(1998) these two problems are solved using branch-and-cut techniques, after

the identification of some valid inequalities.

2.4.2 The Multicast Network Dimensioning Problem

Another interesting problem occurs when we consider the design of a new

network, intended to support a specific multicast demand. This is called the

multicast network dimensioning problem, and it has been treated in some recent

papers (Prytz, 2002; Forsgren and Prytz, 2002; Prytz and Forsgren, 2002).

According to Forsgren and Prytz (2002), the problem consists of determin-

ing the topology (which edges will be selected) and the corresponding capacity

of the edges, such that a multicast service can be deployed in the resulting

network. Much of the work for this problem has used mathematical programming

techniques to define and give exact and approximate solutions to the problem.









The technique used in Forsgren and Prytz (2002) has been the Lagrangian

relaxation applied to an integer programming model. We assume that there are

T multicast groups. The model uses variables $x^k_e \in \{0, 1\}$, for $k \in \{1, \ldots, T\}$,

$e \in E$, which represent if edge e is used by group k. There are also variables

$z^l_e \in \{0, 1\}$, for $l \in \{1, \ldots, L\}$, $e \in E$, where L is the highest possible capacity

level, which determine if the capacity level of edge e is equal to l. Now,

let $d^k$, for $k \in \{1, \ldots, T\}$, be the bandwidth demanded by group k; $c^l_e$, for

$l \in \{1, \ldots, L\}$, and $e \in E$, be the capacity available for edge e at the level

l; and $w^l_e$, for $l \in \{1, \ldots, L\}$, and $e \in E$, be the cost of using edge e at the

capacity level l. Also, $b \in \mathbb{Z}^n$ is the demand vector, and $A \in \mathbb{R}^{n \times m}$ is the node-

edge incidence matrix. We can now state the multicast network dimensioning

problem using the following integer program

\begin{align}
\min\ & \sum_{e \in E} \sum_{l=1}^{L} w^l_e z^l_e \tag{2.19} \\
\text{subject to}\ & \sum_{k=1}^{T} d^k x^k_e \le \sum_{l=1}^{L} c^l_e z^l_e && \text{for all } e \in E \tag{2.20} \\
& \sum_{l=1}^{L} z^l_e \le 1 && \text{for all } e \in E \tag{2.21} \\
& A x = b \tag{2.22} \\
& x, z \in \mathbb{Z}. \tag{2.23}
\end{align}

In this integer program, constraint (2.20) ensures that the bandwidth used

on each edge is at most the available capacity. Constraint (2.21) selects at most

one capacity level for each edge. Finally, constraint (2.22) enforces the flow

conservation in the resulting solution.
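
The capacity-related constraints can be checked mechanically for a candidate solution. The Python sketch below tests, edge by edge, that at most one capacity level is selected and that the demand routed on the edge does not exceed the installed capacity. The dictionary layout and the name `violated_edges` are assumptions for illustration; flow conservation (2.22) is not checked here.

```python
def violated_edges(edges, levels, demand, x, z, cap):
    """Return the edges that violate the capacity or level-selection constraints.

    demand[k]   : bandwidth requested by group k
    x[(k, e)]   : 1 if group k uses edge e (missing keys count as 0)
    z[(l, e)]   : 1 if edge e is installed at capacity level l (missing keys count as 0)
    cap[(l, e)] : capacity of edge e at level l
    """
    bad = []
    for e in edges:
        chosen = sum(z.get((l, e), 0) for l in levels)
        used = sum(d * x.get((k, e), 0) for k, d in demand.items())
        available = sum(cap[(l, e)] * z.get((l, e), 0) for l in levels)
        if chosen > 1 or used > available:
            bad.append(e)
    return bad


edges = ["e1", "e2"]
levels = [1, 2]
demand = {1: 3, 2: 4}
x = {(1, "e1"): 1, (2, "e1"): 1, (1, "e2"): 1}
cap = {(1, "e1"): 5, (2, "e1"): 10, (1, "e2"): 2, (2, "e2"): 10}

too_small = violated_edges(edges, levels, demand, x, {(2, "e1"): 1, (1, "e2"): 1}, cap)
repaired = violated_edges(edges, levels, demand, x, {(2, "e1"): 1, (2, "e2"): 1}, cap)
```

In the first call edge e2 carries 3 units but is installed at a level with capacity 2, so it is reported; raising it to level 2 removes the violation.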

The problem proposed above has been solved using a branch-and-cut algorithm,

employing some basic types of cuts. The authors also use Lagrangian

relaxation to reduce the size of the linear program that needs to be solved.

Some primal heuristics have been designed to exploit certain similarities with









the Steiner tree problem. These primal heuristics were used to improve the

upper bounds found during the branching phase. The resulting algorithm has

been able to solve instances with more than 100 nodes.

2.4.3 The Point-to-Point Connection Problem

An interesting generalization of the Steiner problem is known as the point-

to-point connection problem (PPCP). In the PPCP, we are given two disjoint

sets S and D of sources and destinations, respectively. We require that $|S| = |D|$.

However, that is not an important restriction, since for every network

we can extend the set of sources using dummy nodes, if needed. As usual,

there is a cost function $w : E \to \mathbb{N}$. The objective is to find a minimum cost

forest $F \subseteq E$, where each destination is connected to at least one source, and

similarly each source is connected to at least one destination.
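
The connectivity requirement of the PPCP can be verified with a union-find structure: an edge set is acceptable when the connected components that contain sources are exactly the components that contain destinations. The Python sketch below implements this test; the function name `is_ppcp_feasible` is hypothetical, and the sketch checks connectivity only (it does not verify that the edge set is acyclic or of minimum cost).

```python
def is_ppcp_feasible(nodes, edges, sources, destinations):
    """Check that every destination shares a component with some source and vice versa."""
    parent = {v: v for v in nodes}

    def find(v):
        # Path-halving find: follow parents to the component representative.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)

    source_components = {find(s) for s in sources}
    destination_components = {find(d) for d in destinations}
    return source_components == destination_components


nodes = [1, 2, 3, 4, 5, 6]
sources, destinations = {1, 2}, {5, 6}
feasible = is_ppcp_feasible(nodes, [(1, 3), (3, 5), (2, 6)], sources, destinations)
infeasible = is_ppcp_feasible(nodes, [(1, 3), (3, 5), (1, 6)], sources, destinations)
```

In the second edge set both destinations are reached from source 1, but source 2 is isolated, so the requirement that each source be connected to some destination fails.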

This problem was first proposed by Li et al. (1992), who proved that

all four versions of the PPCP (directed, undirected, with fixed or non-fixed

destinations) are NP-hard, when p is given as input. Natu (1995) proposed

a dynamic programming algorithm for p = 2 with time complexity

$O(mn + n^2 \log n)$. Goemans and Williamson (1995) presented an approximation

algorithm for a class of forest constrained problems, including the PPCP

and the Steiner problem, that runs in $O(n^2 \log n)$ and gives its results within

a factor of $2 - 1/p$ of the optimal solution.

The PPCP is useful to model situations where there are multiple sources.

Some metaheuristic algorithms have been applied to the problem by Correa

et al. (2003), and Gomes et al. (1998). The main idea of these methods is to

design simple heuristics and combine them in a framework called asynchronous

teams (Talukdar and de Souza, 1990), where each heuristic is considered an

autonomous agent, capable of improving the existing solutions.









Some of the heuristics proposed in Correa et al. (2003) exploit basic

features of optimal solutions. For example, one of the heuristics uses the

triangle inequality property: given three nodes a, b, and c in one of the paths

in the solution, the total cost of the subpaths between a and b and between b and c

must be at most the cost of the minimum path between a and c. Given a solution, we can

check for each three nodes a, b, and c in a path, if this condition is satisfied.

If it is not, then we can always improve the solution by making the correct

substitution, using the minimum path.

2.5 Concluding Remarks

In this chapter we have shown a number of applications and problems associated

with multicast routing. We have also shown that most of them are

related to other important problems in the area of combinatorial optimization.

The topics addressed show that this is an evolving area, still in its develop-

ment stages. Moreover, most of the interesting problems can be addressed

with techniques developed by the combinatorial optimization and operations

research communities.

We believe that in the next years an increased number of applications and

models will continue to evolve from this field and make it an important source

of problems and results.














CHAPTER 3
STREAMING CACHE PLACEMENT PROBLEMS

We study a problem in the area of multicast networks, called the stream-

ing cache placement problem (SCPP). In the SCPP one wants to determine the

minimum number of multicast routers needed to deliver content to a specified

number of destinations, subject to predetermined link capacities. We initially

discuss the different versions of the SCPP found in multicast networks appli-

cations. Then, a transformation from the SATISFIABILITY problem is used in

order to prove NP-hardness of all of these versions of the SCPP. Complexity

results are derived for the cases of directed and undirected graphs, as well as

with different assumptions about the type of flow in the network.

3.1 Introduction

Multicast protocols are used to send information from one or more sources

to a large number of destinations using a single send operation. Networks sup-

porting multicast protocols have become increasingly important for many organizations

due to the large number of applications of multicasting, which include

data distribution, video-conferencing (Eriksson, 1994), groupware (Chockler

et al., 1996), and automatic software updates (Han and Shahmehri, 2000).

Due to the lack of multicast support in existing networks, there is an arising

need for updating unicast oriented networks. Thus, there is a clear economical

impact in providing support for new multicast enabled applications.

In this chapter we study a problem motivated by the economical plan-

ning of multicast network implementations. The streaming cache placement

problem (SCPP) has the objective of minimizing costs associated with the









implementation of multicast routers. This problem has only recently received

attention (Mao et al., 2003; Oliveira et al., 2003a) and presents many interesting

questions still unanswered from the algorithmic and complexity theoretic

point of view.

3.1.1 Multicast Networks

In multicast networks, nodes interested in a particular piece of data are

called a multicast group. The main objective of such groups is to send data to

destinations in the most efficient way, avoiding duplication of transmissions,

and therefore saving bandwidth. With this aim, special purpose multicast

protocols have been devised in the literature. Examples are the PIM (Deering

et al., 1996) and core-based (Ballardie et al., 1993) distribution protocols. The

basic operation in these routing protocols is to send data to a subset of nodes,

duplicating the information only when necessary.

Network nodes that understand a multicast protocol are called cache

nodes, because they can send multiple copies of the received data. Other

nodes simply act as intermediates in the multicast transmission. The main

problem to be solved is deciding the route to be used by packets in such a

network. One of the simplest strategies for generating multicast routes is to

maintain a routing tree, linking all sources and destinations. A similar strategy,

which may reduce the number of needed cache nodes, consists in determining

a feasible flow from sources to destinations such that all destinations can be

satisfied.

A main economical problem, however, is that not all nodes understand

these multicast routing protocols. Moreover, upgrading all existing nodes can

be expensive (or even impossible, when the whole network is not owned by the

same company, as happens in the Internet).









[Figure: network with nodes s, r, a, and b; edge labels $c_{s,r} = 1$ and $c_{r,b} = 1$.]

Figure 3-1: Simple example for the cache placement problem.


Suppose an extreme situation, where no nodes have multicast capabili-

ties. In this case, the only possible solution consists in sending a separate

copy of the required data to each destination in the group. However, in this

case instances can become quickly infeasible, as shown in Figure 3-1. Here, all

edges have capacity equal to one, and nodes a and b are destinations. In this

example, a feasible solution is found when r becomes a cache node. Thus, it is

interesting to determine the minimum number of cache nodes required to han-

dle a specified amount of multicast traffic, subject to link capacity constraints.

This is called the streaming cache placement problem (SCPP).

Formal Description of the SCPP. Suppose that a graph G = (V, E)

is given, with a capacity function $c : E \to \mathbb{Z}_+$, a distinguished source node

$s \in V$ and a set of destination nodes $D \subset V$. It is required that data be sent

from node s to each destination. Thus, we must determine a set R of cache

nodes, used to retransmit data when necessary, and the amount of information

carried by each edge, which is represented by variables $w_e \in \mathbb{R}_+$, such that

$w_e \le c_e$, for $e \in E$. The objective of the SCPP is to find a set R of minimum

size corresponding to a flow $\{w_e \mid e \in E\}$, such that for each node $v \in D \cup R$

there is a unit flow from some node in $\{s\} \cup R$ to v, and the capacity constraints

$w_e \le c_e$, for $e \in E$, are satisfied.

A characterization of the set of cache nodes can be given in terms of the

surplus of data at each node $v \in V$. Suppose that the number of data units

sent by node v, also called surplus, is given by variable $b_v \in \mathbb{Z}$. Note that

the node s must send at least one unit of information, so $b_s$ must be greater

than zero. Each destination is required to receive a unit of data, so it has

a negative surplus (requirement) of -1. Now, suppose that, due to capacity

constraints, we need to establish v as a cache node. Then, the surplus at this

node cannot be negative, since it is also sending forward the received data.

If v is also a destination, then the minimum surplus is zero (in this case it is

receiving one unit and sending one unit); otherwise, $b_v \ge 1$. Thus, the set of

cache nodes $R \subseteq V \setminus \{s\}$ is the one such that $b_v \ge 0$ and $v \in D$, or $b_v > 0$ and

$v \in V \setminus (D \cup \{s\})$, for all $v \in R$.
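
This characterization translates directly into a small routine. The Python sketch below recovers the set of cache nodes from a surplus vector, assuming the rule just stated (surplus at least 0 for destinations, strictly positive for other nodes, source excluded). The function name `cache_nodes` and the dictionary layout are illustrative choices.

```python
def cache_nodes(surplus, source, destinations):
    """Return the nodes that act as caches for a given surplus vector.

    surplus: dict mapping node -> units sent minus units received.
    """
    caches = set()
    for v, b in surplus.items():
        if v == source:
            continue  # the source is never counted as a cache node
        if v in destinations:
            if b >= 0:  # a destination that also forwards at least one unit
                caches.add(v)
        elif b > 0:  # a non-destination that sends more than it receives
            caches.add(v)
    return caches


# Surpluses for a network where s feeds r, and r serves destinations a and b:
# r receives one unit and sends two, so its surplus is 1.
example_surplus = {"s": 1, "r": 1, "a": -1, "b": -1}
example_caches = cache_nodes(example_surplus, "s", {"a", "b"})
```

For this example the only cache node is r, and the surpluses sum to zero, as a balanced flow requires.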

3.1.2 Related Work

Problems in multicast routing have been investigated by a large number

of researchers in the last decade. The most studied problems relate to the

design of routing tables with optimal cost. In these problems, given a set of

sources and a set of destinations, the objective is to send data from sources to

destinations with minimum cost.

In the case in which there are no additional constraints, this reduces to

the Steiner tree problem on graphs (Du et al., 2001). In other words, it is

required to find a tree linking all destinations to the source, with minimum

cost. Using this technique, the source and destinations are the required nodes,

the remaining ones being the Steiner nodes. Many heuristic algorithms have

been proposed for this kind of problem (Chow, 1991; Feng and Yum, 1999;

Kompella et al., 1993b,a; Kumar et al., 1999; Salama et al., 1997b; Sriram

et al., 1998).

The problems above, however, consider that all nodes support a multicast

protocol. This is not a realistic assumption on existing networks, since most

routers do not support multicasting by default. Thus, some sort of upgrade










must be applied, in terms of software or even hardware, in order to deploy

multicast applications. Despite this important application, only recently re-

searchers have started to look at this kind of problem.

In Mao et al. (2003) the Streaming Cache Placement problem is defined,

in the context of Virtual Private Networks (VPNs). In this paper, the SCPP

was proven to be NP-hard, using a reduction from the EXACT COVER BY

3-SETS problem, and a heuristic was proposed to solve some sample instances.

However, the paper does not give details about possible versions of the prob-

lem, and proceeds directly to deriving local search heuristics.

Another related problem is the Cache Placement Problem (Li et al., 1999).

Here, the objective is to place replicas of some static document on different

points of a network, in order to increase accessibility and also decrease the

average access time by any client. The important difference between this

problem and the SCPP is that Li et al. (1999) do not consider multicast

transmissions. Also, there is no restriction on the capacity of links, and data

is considered to be placed at the locations before the real operation of the

network.

3.2 Versions of Streaming Cache Placement Problems

In this section, we discuss two versions of the SCPP. In the tree streaming

cache placement problem (TSCPP), the objective is to find a routing tree which

minimizes the number of cache nodes needed to send data from a source to a

set of destinations. We also discuss a modification of this problem where we try

to find any feasible flow from source to destinations, minimizing the number

of cache nodes. The problem is called the flow streaming cache placement

problem (FSCPP).








[Figure: network with nodes s, r, a, and b; edge labels $c_{s,r} = 1$, $c_{r,a} = 1$, and $c_{r,b} = 1$.]

Figure 3-2: Simple example for the Tree Cache Placement Problem.

3.2.1 The Tree Cache Placement Problem

Consider a weighted, capacitated network G(V, E) with a source node s

and a routing tree T rooted on s and spanning all nodes in V. Let D be a

subset of the nodes in V that have a demand for a data stream to be sent

from node s. The stream follows the path defined by T from s to the demand

nodes and takes B units of bandwidth on every edge that it traverses. For each

demand node, a separate copy of the stream is sent. Edge capacities cannot

be violated. Note that, depending on the network structure, an instance of

this problem can easily become infeasible.

To handle this, we allow stream splitters, or caches, to be located at

specific nodes in the network. A single copy of the stream is sent from s

to a cache node r and from there multiple copies are sent down the tree.

The optimization problem consists in finding a routing tree and locating a

minimum number of cache nodes.

Figure 3-2 shows a small example for this problem. In this example, if

nodes a and b each require a stream (with B = 1) from s, and node r is not a

cache node, then we send two units from s to r and one unit from r to a and

from r to b. We get an infeasibility on edge (s, r), since two units flow on it,

and it has capacity $c_{s,r} = 1 < 2$. However, if node r becomes a cache node,

we can send one unit from s to r and then one unit from r to a and one unit

from r to b. The resulting flow is now feasible.










To simplify the formulation of the problem, we can consider, without loss

of generality, that the bandwidth used by each message is equal to one.

The tree cache placement problem (TSCPP) is defined as follows. Given
a graph $G(V, E)$ with capacities $c_{ij}$ on the edges, a source node $s \in V$ and a
subset $D \subseteq V$ representing the destination nodes, we want to find a spanning
tree $T$ (which determines the paths followed by a data stream from $s$ to $v \in D$)
such that the subset $R \subseteq V \setminus \{s\}$, which represents the cache nodes, has
minimum size. For each node $v \in D \cup R$, there must be a data stream from
some node $w \in R \cup \{s\}$ to $v$, such that the sum of all streams in each edge
$(i, j) \in T$ does not exceed the edge capacity $c_{ij}$.

To state the problem more formally, consider an integer programming

formulation for the TSCPP. Define the variables


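\[
y_e = \begin{cases} 1 & \text{if edge } e \text{ is in the spanning tree } T \\ 0 & \text{otherwise,} \end{cases}
\qquad
x_i = \begin{cases} 1 & \text{if node } i \neq s \text{ is a cache node} \\ 0 & \text{otherwise,} \end{cases}
\]

$b_i \in \{-1, \ldots, |V| - 1\}$, the flow surplus for node $i \in V$, and

$w_e \in \{0, \ldots, |V|\}$, the amount of flow in edge $e \in E$.

Given the node-arc incidence matrix $A$, the problem can be stated as

\begin{align}
\min \quad & \sum_{i=1}^{|V|} x_i \tag{3.1} \\
\text{subject to} \quad & Aw = b \tag{3.2} \\
& \sum_{i \in V} b_i = 0 \tag{3.3} \\
& b_s \geq 1 \quad \text{for source } s \tag{3.4} \\
& x_i - 1 \leq b_i \leq x_i(|V| - 1) - 1 \quad \text{for } i \in D \tag{3.5} \\
& 0 \leq b_i \leq x_i(|V| - 1) \quad \text{for } i \in V \setminus (D \cup \{s\}) \tag{3.6} \\
& \sum_{e \in E} y_e = |V| - 1 \tag{3.7} \\
& \sum_{e \in G(H)} y_e \leq |H| - 1 \quad \text{for all } H \subset V \tag{3.8} \\
& 0 \leq w_e \leq c_e y_e \quad \text{for } e \in E \tag{3.9} \\
& x \in \{0, 1\}^{|V|}, \quad y \in \{0, 1\}^{|E|} \tag{3.10} \\
& b \in \mathbb{Z}^{|V|}, \quad w \in \mathbb{Z}_+^{|E|}, \tag{3.11}
\end{align}
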


where G(H) is the subgraph induced by the nodes in H. Constraint (3.2)

imposes flow conservation. Constraints (3.3) through (3.6) require that there

must be a number of data streams equal to the number of nodes in $R \cup D$.

Constraints (3.7) and (3.8) are the spanning tree constraints. Finally,

constraint (3.9) determines the bounds for the flow variables, implying that the

flow specified by w can be carried only on edges in the spanning tree.

3.2.2 The Flow Cache Placement Problem

An interesting extension of the TSCPP arises if we relax the constraints

in the previous integer programming formulation that require the solution to

be a tree of the graph G. Then we have the more general case of a flow sent

from the source node s to the set of destination nodes D. To see why this

extension is interesting, consider the example graph, shown in Figure 3-3. In

this example all edges have costs equal to one. If we find a solution to the

TSCPP on this graph, then a stream can be sent through only one of the

two edges (s, a) or (s, b). Suppose that we use edges (s, a) and (a, c). This

implies that c must be a cache node, in order to satisfy demand nodes d1 and








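[Figure: graph with source s, intermediate nodes a, b and c, and demand nodes d1 and d2; the text refers to edges (s, a), (s, b), (a, c) and (b, c).]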

Figure 3-3: Simple example for the Flow Cache Placement Problem.


d2. However, in practice the number of caches in this optimal solution for the

TSCPP can be further reduced.

Routing protocols, like OSPF, achieve load balancing by sending data

through parallel links. In the case of Figure 3-3, the protocol could just send

another stream of data over edges (s, b) and (b, c). If this happens, we do not

need a cache node, and the solution will have fewer caches.

We define the Flow Cache Placement Problem (FSCPP) to be the problem

of finding a feasible flow from source s to the set of destinations D, such that

the number of required caches $R \subseteq V \setminus \{s\}$ is minimized. The integer linear

programming model for this problem is similar to (3.1)-(3.11), without the

integer variable y and relaxing constraints (3.7)-(3.8).

3.3 Complexity of the Cache Placement Problems

We prove that both versions of the SCPP discussed above are NP-hard,

using a transformation from SATISFIABILITY. This transformation allows us

to give a proof of non-approximability by showing that it is a gap-preserving

transformation.

3.3.1 Complexity of the TSCPP

In this section we prove that the TSCPP is NP-hard, by using a reduction

from SATISFIABILITY (SAT) (Garey and Johnson, 1979).









SAT: Given a set of clauses $C_1, \ldots, C_m$, where each clause is the disjunction
of $|C_i|$ literals (each literal is a variable $x_j \in \{x_1, \ldots, x_n\}$ or its negation $\bar{x}_j$),
is there a truth assignment for variables $x_1, \ldots, x_n$ such that all clauses are
satisfied?

Definition 1 The TSCPP-D problem is the following. Given an instance of

the TSCPP and an integer k, is there a solution such that the number of cache

nodes needed is at most k ?

Theorem 2 The TSCPP-D problem is NP-complete.

Proof: This problem is clearly in NP, since for each instance I it is enough
to give the spanning tree and the nodes in R to determine, in polynomial time,
if this is a "yes" instance.

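We reduce SAT to TSCPP-D. Given an instance I of SAT, composed of
m clauses $C_1, \ldots, C_m$ and n variables $x_1, \ldots, x_n$, we build a graph $G(V, E)$,
with $c_e = 1$ for all $e \in E$, and choose $k = n$. The set V is defined as

\[
V = \{s\} \cup \{x_1, \ldots, x_n\} \cup \{\bar{x}_1, \ldots, \bar{x}_n\} \cup \{T'_1, \ldots, T'_n\}
\cup \{T''_1, \ldots, T''_n\} \cup \{T'''_1, \ldots, T'''_n\} \cup \{C_1, \ldots, C_m\},
\]

and the set E is defined as

\begin{align}
E = {} & \bigcup_{i=1}^{n} \{(s, x_i), (s, \bar{x}_i)\} \cup \bigcup_{i=1}^{n} \{(x_i, T'_i), (\bar{x}_i, T'_i)\} \cup \bigcup_{i=1}^{n} \{(x_i, T''_i)\} \nonumber \\
& \cup \bigcup_{i=1}^{n} \{(\bar{x}_i, T'''_i)\} \cup \bigcup_{i=1}^{m} \Big[ \bigcup_{x_j \in C_i} (x_j, C_i) \cup \bigcup_{\bar{x}_j \in C_i} (\bar{x}_j, C_i) \Big]. \tag{3.12}
\end{align}

Figure 3-4 shows the construction of G for a small SAT instance. Define
$D = \{C_1, \ldots, C_m\} \cup \{T'_1, \ldots, T'_n\} \cup \{T''_1, \ldots, T''_n\} \cup \{T'''_1, \ldots, T'''_n\}$. Clearly,
destination nodes $T'_i$, $T''_i$ and $T'''_i$ are there just to saturate the arcs leaving s
and force one of $x_i$, $\bar{x}_i$ to be chosen as a cache node. Also, each node $C_i$ forces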








Figure 3-4: Small graph G created in the reduction given by Theorem 2. In
this example, the SAT formula has three clauses, C1, C2 and C3, over the variables x1, ..., x4.


the existence of at least one cache among the nodes corresponding to literals

appearing in clause Ci.

Suppose that the solution of the resulting TSCPP-D problem is true.

Then, we assign variable $x_i$ to true if node $x_i$ is in R, otherwise we set $x_i$

to false. This assignment is well-defined, since exactly one of the nodes $x_i$, $\bar{x}_i$

must be selected. Clearly, this truth assignment satisfies all clauses Ci, because

the demand of each node Ci is satisfied by at least one node corresponding to

literals appearing in clause Ci.

Conversely, if there is a truth assignment F which makes the SAT formula

satisfiable, we can use it to define the nodes which will be caches, and, by con-

struction of G, all demands will be satisfied. Finally, the resulting construction

is polynomial in size, thus SAT reduces in polynomial time to TSCPP-D. □
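
The construction used in the proof of Theorem 2 can be sketched in Python as follows. This is only an illustrative sketch: the function name `build_tscpp_instance`, the tuple-based node labels and the signed-integer encoding of literals are assumptions made here, and the wiring of the T nodes (T'_i to both literal nodes, T''_i to x_i, T'''_i to the negated literal) follows one reading of (3.12).

```python
def build_tscpp_instance(num_vars, clauses):
    """Build a unit-capacity TSCPP-D instance from a SAT formula.

    clauses is a list of lists of nonzero integers: +j stands for the
    literal x_j and -j for its negation (a hypothetical encoding).
    Returns (nodes, edges, destinations, k) with k = n.
    """
    source = "s"
    edges = []          # every edge is assumed to have capacity 1
    destinations = []
    for i in range(1, num_vars + 1):
        pos, neg = ("x", i), ("not_x", i)
        t1, t2, t3 = ("T'", i), ("T''", i), ("T'''", i)
        edges += [
            (source, pos), (source, neg),   # arcs leaving s
            (pos, t1), (neg, t1),           # T'_i hangs on both literals
            (pos, t2),                      # T''_i hangs on x_i
            (neg, t3),                      # T'''_i hangs on the negation
        ]
        destinations += [t1, t2, t3]
    for c, clause in enumerate(clauses, start=1):
        clause_node = ("C", c)
        destinations.append(clause_node)
        for lit in clause:
            lit_node = ("x", lit) if lit > 0 else ("not_x", -lit)
            edges.append((lit_node, clause_node))
    nodes = {source} | {u for edge in edges for u in edge}
    return nodes, edges, destinations, num_vars


# A made-up formula with 4 variables and 3 clauses of 3 literals each.
nodes, edges, dests, k = build_tscpp_instance(
    4, [[1, 2, -3], [2, 3, 4], [-1, 3, -4]]
)
```

The instance has 1 + 5n + m nodes and 6n + (total number of literals) edges, which is the polynomial size bound used at the end of the proof.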









Input: a tree T
Output: a set R of cache nodes
forall v ∈ V do
    if v ∈ D then demand(v) ← 1
    else demand(v) ← 0
end
call findR(s)
return R
procedure findR(v)
begin
    forall w such that (v, w) ∈ T do findR(w)
1   if v = s then return R
    else p ← parent(v)
    if c_{p,v} < demand(v) then
        R ← R ∪ {v}
        demand(p) ← demand(p) + 1
    end
    else demand(p) ← demand(p) + demand(v)
end

Algorithm 4: Find the optimal R for a fixed tree.
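
A minimal Python sketch of Algorithm 4 is given below. The data layout is an assumption made for illustration: the tree is passed as a dictionary `children` mapping each node to its list of children, and capacities as a dictionary keyed by (parent, child) pairs. The function name `find_cache_nodes` is hypothetical.

```python
def find_cache_nodes(children, capacity, source, destinations):
    """Return the set R of cache nodes for a fixed tree rooted at source.

    children:     dict node -> list of child nodes in the tree T
    capacity:     dict (parent, child) -> integer edge capacity
    destinations: collection of demand nodes D
    """
    demand = {}
    caches = set()

    def find_r(v, parent):
        # Initial demand: one unit if v is itself a destination.
        demand[v] = 1 if v in destinations else 0
        # Post-order traversal: children push their demand up to v first.
        for w in children.get(v, ()):
            find_r(w, v)
        if parent is None:          # v is the source s
            return
        if capacity[(parent, v)] < demand[v]:
            # Edge (p, v) cannot carry all copies: v becomes a cache
            # and then requires a single unit from its parent.
            caches.add(v)
            demand[parent] += 1
        else:
            demand[parent] += demand[v]

    find_r(source, None)
    return caches


# The example of Figure 3-2: s -> r -> {a, b}, all capacities equal to 1.
tree = {"s": ["r"], "r": ["a", "b"]}
caps = {("s", "r"): 1, ("r", "a"): 1, ("r", "b"): 1}
print(find_cache_nodes(tree, caps, "s", {"a", "b"}))   # {'r'}
```

The recursion depth equals the height of the tree, so for very deep trees an explicit stack would be a safer choice than recursion.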


As a simple consequence of this theorem, we have the following corollary.

Corollary 3 The TSCPP is NP-hard.

It is interesting to observe that the problem remains NP-hard even for

unitary-capacity networks, since the proof remains the same for edges with

unitary capacity.

Some simple examples serve to illustrate the problem. For instance, if G

is the complete graph $K_n$, then the optimal solution is simply a star graph

with s at the center, and $R = \emptyset$. On the other hand, if the graph is a tree

with n nodes, then the number of cache nodes is implied by the edges of the

tree, thus the optimum is completely determined.

Algorithm 4 determines an optimal set R from a given tree T. The algo-

rithm works recursively. Initially it finds the demand for all leaves of T. Then









it goes up the tree determining whether the current node must be a cache node.

The correctness of this method is proved below.

Theorem 4 Given an instance of the TSCPP which is a tree T, then an

optimal solution for T is given by Algorithm 4.

Proof: The proof is by induction on the height h of a tree analyzed when

Algorithm 4 arrives at line (1). If h = 0 then the number of cache nodes is

clearly equal to zero. Assume that the theorem is true for trees with height
$h \geq 1$. If the capacity of the arc $(p, v)$ is at least the demand at $v$,
then there is no need of a new cache node, and therefore the solution remains
optimal. If, on the other hand, $(p, v)$ does not have enough capacity to satisfy
all demand at $v$, then we do not have a choice other than making $v$ a cache
node. Combining this with the assumption that the solution for all children
of $v$ is optimal, we conclude that the new solution for a tree of height $h + 1$ is
also optimal. □

3.3.2 Complexity of the FSCPP

We can use the transformation from SAT to TSCPP to show that FSCPP

is also NP-hard. In the case of directed edges, this is simple, since given a

graph G provided by the reduction, we can give an orientation of G from source

to destinations. This is stated in the next theorem.

Theorem 5 The FSCPP is NP-hard if the instance graph is directed.

Proof: The proof is similar to the proof of Theorem 2. We need just to make
sure that the polynomial transformation given for the TSCPP-D also works
for a decision version of the FSCPP. Given an instance of SAT, let G be the
corresponding graph found by the reduction. We orient the edges of G from s
to destinations D, i.e., use the implicit orientation given in (3.12). It can be

checked that in the resulting instance the number of cache nodes cannot be













Figure 3-5: Part of the transformation used by the FSCPP.

reduced by sending additional flow on edges other than the ones which

form the tree in the solution of TSCPP. Thus, the resulting R is the same,

and FSCPP is NP-hard in this case. □

Next we prove a slightly modified theorem for the undirected version. To

do this we need the following variant of SAT:

3SAT(5): Given an instance of SATISFIABILITY with at most three literals per

clause and such that each variable appears in at most five clauses, is there a

truth assignment that makes all clauses true?

3SAT(5) is well known to be NP-complete (Garey and Johnson,

1979).

Theorem 6 The FSCPP is NP-hard if the instance graph is undirected.

Proof: When the instance of FSCPP is undirected, the only thing that can go
wrong is that some of the destinations $T'_i$, $T''_i$, or $T'''_i$ are being satisfied by flow
coming from nodes $C_j$ connected to their respective $x_i$, $\bar{x}_i$ nodes. What we need
to do to prevent this is to bound the number of occurrences of each variable
and add enough absorbing destinations to the subgraph corresponding to that
variable. We do this by reduction from 3SAT(5). The reduction is essentially
the same as the reduction from SAT to TSCPP, but now for each variable $x_i$
we have nodes $x_i$, $\bar{x}_i$, $T'_i$, $T''_i$, and $T^k_i$, for $1 \leq k \leq 6$ (see Figure 3-5). Also,








for each variable $x_i$ we have edges $(s, x_i)$, $(s, \bar{x}_i)$, $(x_i, T'_i)$, $(\bar{x}_i, T''_i)$, $(x_i, T^k_i)$,
$(\bar{x}_i, T^k_i)$, for $1 \leq k \leq 6$.

We claim that in this case for each pair of nodes $x_i$, $\bar{x}_i$, one of them must
be a cache node (which says that the corresponding variable in 3SAT(5) is true
or false). This is true because from the eight destinations not corresponding
to clauses ($T'_i$, $T''_i$, and $T^k_i$, $1 \leq k \leq 6$) attached to $x_i$, $\bar{x}_i$, two can be directly
satisfied from s without caches. However, the remaining six cannot be satisfied
from nodes $C_j$ linked to the current variable nodes, because there are at
most five such nodes. Thus, we must have one cache node at $x_i$ or $\bar{x}_i$, for each
variable $x_i$. It is clear that these are the only cache nodes needed to make
all destinations satisfied. This gives us the correct truth assignment for the
original 3SAT(5) instance. Conversely, any non-satisfiable formula will transform
to a FSCPP instance which needs more than n cache nodes to satisfy all
destinations. Thus, the decision version of FSCPP is NP-complete, and this
implies the theorem. □

Note that there is a case of TSCPP-D that is solvable in polynomial

time, and this happens when k = 0, i.e. determining if any cache node is

needed. The solution is given by the following algorithm. Run the maximum

flow algorithm from node s to all nodes in D. This can be accomplished, for

example, by creating a dummy destination node d and linking all nodes v E D

to d by arcs with capacity equal to 1. If the maximum flow from s reaches each

node in D, then the answer is true, since no cache node is needed to satisfy

the destinations. Otherwise, the answer must be false because then at least

one cache node is needed to satisfy all nodes in D.
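
The k = 0 test described above can be sketched in Python with a small Edmonds-Karp maximum flow routine. This is an illustrative sketch, not the author's implementation: the function names `max_flow` and `needs_no_cache`, the dictionary-based arc representation, and the treatment of arcs as directed are assumptions made here. The answer is "true" exactly when the maximum flow into the dummy node equals |D|.

```python
from collections import defaultdict, deque


def max_flow(cap, s, t):
    """Edmonds-Karp on a dict-of-dicts of residual capacities (modified in place)."""
    flow = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        # Find the bottleneck capacity along the path, then augment.
        bottleneck = float("inf")
        v = t
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck


def needs_no_cache(arcs, source, destinations):
    """Return True if every destination can be served with zero cache nodes.

    arcs: dict (u, v) -> capacity of the directed arc u -> v.
    """
    cap = defaultdict(lambda: defaultdict(int))
    for (u, v), c in arcs.items():
        cap[u][v] += c
        cap[v][u] += 0              # make sure the reverse residual arc exists
    dummy = object()                # the dummy destination node d
    for v in destinations:
        cap[v][dummy] += 1          # unit arc from each destination to d
        cap[dummy][v] += 0
    return max_flow(cap, source, dummy) == len(destinations)


# Figure 3-2 again: with c_{s,r} = 1 a cache is needed, with c_{s,r} = 2 it is not.
arcs = {("s", "r"): 1, ("r", "a"): 1, ("r", "b"): 1}
print(needs_no_cache(arcs, "s", ["a", "b"]))            # False
print(needs_no_cache({**arcs, ("s", "r"): 2}, "s", ["a", "b"]))  # True
```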









3.4 Concluding Remarks

In this chapter we presented and analyzed two combinatorial optimiza-

tion problems, the tree cache placement problem (TSCPP) and its flow-based,

generalized version, the flow cache placement problem (FSCPP). We proved

that both problems, on directed and undirected graphs, are NP-hard. For

this purpose, we used a transformation from the SATISFIABILITY problem.

Many questions remain open for these problems. For example, it would

be interesting to find algorithms with a better approximation guarantee, or

improved non-approximability results. Some of these issues will be considered

in the next chapters.














CHAPTER 4
COMPLEXITY OF APPROXIMATION FOR STREAMING CACHE
PLACEMENT PROBLEMS

As shown in the previous chapter, the SCPP in its two forms is NP-hard.

We improve the hardness results for the SCPP, by showing that it is

very difficult to give approximate solutions for such problems. General non-

approximability is proved using the reduction from SATISFIABILITY given in

the previous chapter. Then, we improve the approximation results for the

FSCPP using a reduction from SET COVER. In particular, given k destinations,

we show that the FSCPP cannot have a $O(\log \log k - \delta)$-approximation

algorithm, for a very small $\delta$, unless NP can be solved in sub-exponential time.

4.1 Introduction

We continue in this chapter the study of the streaming cache placement

problem (SCPP). In the SCPP one wants to determine the minimum number

of multicast routers needed to deliver content to a specified number of desti-

nations, subject to predetermined link capacities. The SCPP is known to be

NP-hard, as shown in Chapter 3. We give approximation results for the SCPP

in its different versions, using properties of the SATISFIABILITY problem. We

use the transformation described in the previous chapter to achieve this

non-approximability result. We show that there is a fixed $\epsilon > 1$ such that no

SCPP problem can be approximated in polynomial time with guarantee better

than $\epsilon$. This is equivalent to saying that the SCPP is in the MAX SNP-hard

class (Papadimitriou and Yannakakis, 1991).









We are also able to improve the approximation results for the FSCPP,

using a reduction from SET COVER. In this case, we are interested in general

flows and directed arcs. In particular, given k destinations, we show that the

FSCPP cannot have a $O(\log \log k - \delta)$-approximation algorithm, for a very

small $\delta$, unless NP can be solved in sub-exponential time.

This chapter is organized as follows. In Section 4.2 we discuss the non-

approximation result for FSCPP based on the SATISFIABILITY problem. Then,

in Section 4.3, we discuss the improved result for the FSCPP based on SET

COVER. Section 4.4 gives some concluding remarks.

4.2 Non-approximability

The transformation used in Theorem 2 provides a method for proving a

non-approximability result for the TSCPP and FSCPP. We employ standard

techniques, based on the gap-preserving transformations. To do this we use

an optimization version of 3SAT(5).

MAX-3SAT(5): Given an instance of 3SAT(5), find the maximum number of

clauses that can be satisfied by any truth assignment.

Definition 2 For any $\epsilon$, $0 < \epsilon < 1$, an approximation algorithm with guarantee
$\epsilon$ (or, equivalently, an $\epsilon$-approximation algorithm) for a maximization
problem $\Pi$ is an algorithm A such that, for any instance $I \in \Pi$, the resulting
cost $A(I)$ of A applied to instance I satisfies $\epsilon\, OPT(I) \leq A(I)$, where we
denote by $OPT(I)$ the cost of the optimum solution. For minimization problems,
$A(I)$ must satisfy $A(I) \leq \epsilon\, OPT(I)$, for any fixed $\epsilon > 1$.

The following theorem from Arora and Lund (1996) is very useful to prove

hardness of approximation results.









Theorem 7 (Arora and Lund (1996)) There is a polynomial time reduction
from SAT to MAX-3SAT(5) which transforms formula $\phi$ into a formula $\phi'$
such that, for some fixed $\epsilon$ ($\epsilon$ is in fact determined in the proof of the theorem),

if $\phi$ is satisfiable, then $OPT(\phi') = m$, and

if $\phi$ is not satisfiable, then $OPT(\phi') < (1 - \epsilon)m$,

where m is the number of clauses in $\phi'$.

In the following theorem we use this fact to show a non-approximability

result for TSCPP.

Theorem 8 The transformation used in the proof of Theorem 2 is a gap-preserving
transformation from MAX-3SAT(5) to TSCPP. In other words,
given an instance $\phi$ of MAX-3SAT(5) with m clauses and n variables, we can
find an instance I of TSCPP such that

If $OPT(\phi) = m$ then $OPT(I) = n$; and

If $OPT(\phi) < (1 - \epsilon)m$ then $OPT(I) > (1 + \epsilon_1)n$,

where $\epsilon$ is given in Theorem 7 and $\epsilon_1 = \epsilon/15$.

Proof: Suppose that $\phi$ is an instance of MAX-3SAT(5). Then, we can use the
transformation given in the proof of Theorem 2 to construct a corresponding
instance I of TSCPP. If $\phi$ has a solution with $OPT(\phi) = m$, where m is the
number of clauses, then by Theorem 2, we can find a solution for I such that
$OPT(I) = n$.

Now, if $OPT(\phi) < (1 - \epsilon)m$, then there are at least $\epsilon m$ clauses unsatisfied.
In the corresponding instance I we have at least n cache nodes due to the
constraints from nodes $T'_i$, $T''_i$ and $T'''_i$, $1 \leq i \leq n$. These cache nodes satisfy
at most $(1 - \epsilon)m$ destinations corresponding to clauses. Let U be the set
of unsatisfied destinations. The nodes in U can be satisfied by setting one
extra cache (in a total of two, for nodes $x_j$ and $\bar{x}_j$) for at least one variable $x_j$
appearing in the clause corresponding to $c_i$, for all $c_i \in U$.

Thus, the number of extra cache nodes needed to satisfy U is at least
$|U|/5$, since a variable can appear in at most 5 clauses. We have

\[
OPT(I) \geq n + |U|/5 > n + \epsilon m/5 \geq (1 + \epsilon/15)n.
\]

The last inequality follows from the trivial bound $m \geq n/3$. The theorem
follows by setting $\epsilon_1 = \epsilon/15$. □

Definition 3 A PTAS (Polynomial Time Approximation Scheme) for a minimization
problem $\Pi$ is an algorithm that, for each $\epsilon > 0$ and instance $I \in \Pi$,
returns a solution $A(I)$, such that $A(I) \leq (1 + \epsilon)OPT(I)$, and A has running
time polynomial in the size of I, depending on $\epsilon$ (see, e.g., Papadimitriou and
Steiglitz, page 425).

Corollary 9 Unless P = NP, the TSCPP cannot be approximated by $(1 + \epsilon_2)$
for any $\epsilon_2 < \epsilon_1$, where $\epsilon_1$ is given in Theorem 8, and therefore there is no
polynomial time approximation scheme (PTAS) for the TSCPP.

Proof: Given an instance $\phi$ of SAT, we can use the transformation given in
Theorem 7, coupled with the transformation given in the proof of Theorem 2,
to give a new polynomial transformation $\mathcal{T}$ from SAT to TSCPP. Now, let I
be the instance created by $\mathcal{T}$ on input $\phi$. Suppose there is an $\epsilon_2$-approximation
algorithm A for TSCPP, with $0 < \epsilon_2 < \epsilon_1$. Then, when A runs on an instance
I constructed by $\mathcal{T}$ from a satisfiable formula $\phi$, the result must have cost
$A(I) \leq (1 + \epsilon_2)n < (1 + \epsilon_1)n$. Otherwise, if $\phi$ is not satisfiable, then the result
given by this algorithm must be greater than $(1 + \epsilon_1)n$, because of the gap
introduced by $\mathcal{T}$. Thus, if there is an $\epsilon_2$-approximation algorithm, then we
can decide in polynomial time if a formula $\phi$ is satisfiable or not. Assuming
P ≠ NP, there is no such algorithm.

The fact that there is no PTAS for the TSCPP is a consequence of this
result and the definition of PTAS. □









The above theorem and corollary can be easily extended to the FSCPP.

The fact that the same transformation can be used for both problems can be

used to demonstrate the non-approximability result to the FSCPP as well. We

state this as a corollary.

Corollary 10 Unless P = NP, the FSCPP has no PTAS.

Proof: The transformation from SAT to FSCPP is identical, so Theorem 8 is
also valid for the FSCPP. This implies that the FSCPP has no PTAS, unless
P = NP. □

4.3 Improved Hardness Result for FSCPP

In this section, we are interested in the case of general flows and directed

arcs. This version of the problem is called the flow streaming cache placement

problem (FSCPP). In particular, given k destinations, we show that the

FSCPP cannot have a $O(\log \log k - \delta)$-approximation algorithm, for a very

small $\delta$, unless NP can be solved in sub-exponential time.

We have shown above that, given an instance of the FSCPP, there is an $\epsilon > 0$

such that the FSCPP cannot be approximated by $1 + \epsilon$, thus demonstrating

that FSCPP is MAX SNP-hard (Papadimitriou and Yannakakis, 1991) and

discarding the possibility of a PTAS. We show a stronger result: there is no

approximation algorithm that can give a performance guarantee better than

log log k, where k is the number of destinations. The proof is based on a

reduction from the SET COVER problem.

SET COVER: Given a ground set $T = \{t_1, \ldots, t_n\}$, with subsets $S_1, \ldots, S_m \subseteq T$,
find the minimum cardinality set $C \subseteq \{1, \ldots, m\}$ such that $\bigcup_{i \in C} S_i = T$.

It is known (Feige, 1998) that SET COVER does not have approximation
algorithms for any guarantee better than $O(\log n)$. Thus, if we find a
transformation from SET COVER to FSCPP that preserves approximation, we can










prove a similar result for FSCPP. We show how this transformation, which

will be represented by $\phi$ : SC → FSCPP, can be done.

For each instance $I_{SC}$ of set cover, we must find a corresponding instance
$I_{FSCPP}$ of the FSCPP. The instance $I_{SC}$ is composed of sets T and $S_1, \ldots, S_m$
as shown above. The transformation consists of defining a capacitated graph
G with a source and a set D of destinations. Let G be the graph composed of
the following nodes:

\[
V = \{s\} \cup \{w_1, \ldots, w_m\} \cup \{v_1, \ldots, v_n\} \cup \{s_1, \ldots, s_m\}.
\]

Also, let the edges E of the graph G be

\[
E = \bigcup_{i=1}^{m} \{(w_i, v_j) \mid t_j \in S_i\} \cup \bigcup_{i=1}^{m} \{(s, w_i)\} \cup \bigcup_{i=1}^{m} \{(w_i, s_i)\}.
\]

In the instance of FSCPP, the set of destination nodes D is given by

\[
D = \{v_1, \ldots, v_n\} \cup \{s_1, \ldots, s_m\},
\]

and s is the source node. Thus, there is a one-to-one correspondence between
nodes $w_i$ and sets $S_i$, for $1 \leq i \leq m$. There is also a one-to-one correspondence
between nodes $v_i$ and ground elements $t_i \in T$, for $1 \leq i \leq n$. There is a
directed edge between the source and each node $w_i$ and between nodes $w_i$ and
nodes representing elements appearing in the set $S_i$. Nodes $w_i$ are also linked
to each $s_i$. Finally, each edge e has capacity $c_e = 1$. See an example of such
reduction in Figure 4-1. The ground set in this example is $T = \{t_1, \ldots, t_6\}$,
and the subsets are $S_1 = \{t_1, t_2, t_4, t_5\}$, $S_2 = \{t_1, t_2, t_4, t_6\}$, and $S_3 = \{t_2, t_4, t_6\}$.
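
The transformation from SET COVER to FSCPP can be sketched in Python as below. The function name `set_cover_to_fscpp`, the tuple-based node labels and the small instance used in the usage lines are hypothetical choices made for illustration; only the node, arc and destination sets follow the construction described above.

```python
def set_cover_to_fscpp(n, subsets):
    """Map a SET COVER instance to a unit-capacity FSCPP instance.

    n:       number of ground elements t_1, ..., t_n
    subsets: list of sets of element indices (1-based), S_1, ..., S_m
    Returns (source, arcs, destinations), where arcs maps each directed
    arc (u, v) to its capacity.
    """
    source = "s"
    arcs = {}
    for i, subset in enumerate(subsets, start=1):
        w_i, s_i = ("w", i), ("s", i)
        arcs[(source, w_i)] = 1         # source to set node
        arcs[(w_i, s_i)] = 1            # set node to its private destination
        for j in subset:
            arcs[(w_i, ("v", j))] = 1   # set node to each element it contains
    destinations = [("v", j) for j in range(1, n + 1)]
    destinations += [("s", i) for i in range(1, len(subsets) + 1)]
    return source, arcs, destinations


# A made-up instance: 4 elements, 3 subsets.
source, arcs, dests = set_cover_to_fscpp(4, [{1, 2}, {2, 3}, {3, 4}])
```

Each private destination s_i consumes the single unit that can enter w_i, so w_i can serve element nodes only if it is made a cache. This is the property the equivalence proof relies on.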

Theorem 11 The transformation described above is a polynomial time reduction

from SET COVER to FSCPP.

Proof: Let Isc be the instance of SET COVER and IFSCPP the corresponding

instance of the FSCPP. It is clear that the transformation is polynomial, since




















[Figure: element nodes v1, ..., v6 and destination nodes s1, s2, s3.]

Figure 4-1: Example for transformation of Theorem 11.


the number of edges and nodes is given by a constant multiple of the number
of elements and sets in the instance of SET COVER. We must prove that $I_{SC}$
and $I_{FSCPP}$ have equivalent optimal solutions. Let $S'$ be an optimal solution for
$I_{SC}$. First we note that the destination nodes $s_i$, $1 \leq i \leq m$, can be reached
only from nodes $w_i$, and therefore each $s_i$ must be satisfied with flow coming
from $w_i$. Thus, each node $s_i$ saturates the corresponding $w_i$, which means
that to satisfy any other node from $w_i$ we must make it a cache node. Then,
we can clearly make $R = \{w_i \mid i \in S'\}$, and serve all remaining destinations
in $v_1, \ldots, v_n$ by definition of $S'$. Each node in R will be a cache node, and
therefore R is a solution for $I_{FSCPP}$. This solution must be optimal, because
otherwise we could use a smaller solution $R'$ to construct a corresponding set
$S'' \subseteq \{1, \ldots, m\}$ with $|S''| < |S'|$, covering all elements of T, and therefore
contradicting the fact that $S'$ is an optimum solution for the SC instance.
Thus, the two instances $I_{SC}$ and $I_{FSCPP}$ have equivalent optimal solutions. □



Corollary 12 Given an instance I of SC, and the transformation $\phi$ described

above, then we have $OPT(I) = OPT(\phi(I))$.








The following theorem, proved by Feige (1998), will be useful for
our main result.

Theorem 13 (Feige (1998)) If there is some $\epsilon > 0$ such that a polynomial
time algorithm can approximate set cover within $(1 - \epsilon) \log n$, then
$NP \subset TIME(n^{O(\log \log n)})$.

This theorem implies that finding approximate solutions with guarantee
better than $(1 - \epsilon) \log n$ for SET COVER would make it possible to solve any
problem in NP in sub-exponential time. It is strongly believed that this is not the case.

We use this theorem and the reduction above to give a related bound for the

approximation of FSCPP. To do this, we need a gap preserving transformation

from SC to FSCPP, as stated in the following lemma.

Lemma 14 If I is an instance of SET COVER, then the transformation $\phi$ from
SC to FSCPP described above is gap preserving, that is, it has the following
property:

(a) If $OPT(I) = k$ then $OPT(\phi(I)) = k$; and

(b) If $OPT(I) > k \log n$ then $OPT(\phi(I)) > k \log \log |D| - \delta$,

where k is a fixed value, depending on the instance, and

\[
\delta = -k \log\left\{ 1 - \frac{\log(1 + n/2^n)}{\log |D|} \right\} \to 0
\]

for large n.

Proof: Part (a) is a simple consequence of Corollary 12. Now, for part (b),
note that the maximum number of sets in an instance of SC with n elements
is $2^n$. Consequently, in the instance of FSCPP created by transformation $\phi$,
$|D| = m + n \leq 2^n + n$. Thus, we have

\[
\log |D| \leq \log(2^n + n) = n + \delta',
\]

where $\delta' = \log(1 + n/2^n)$. This implies that

\[
\log n \geq \log(\log |D| - \delta') = \log \log |D| + \delta'',
\]

where $\delta'' = \log(1 - \delta'/\log |D|)$. Therefore,

\[
OPT(\phi(I)) > k \log n \geq k \log \log |D| - \delta,
\]

where $\delta = -k\delta''$ (note that $\delta$ is a positive quantity). Finally, note that the
quantity

\[
-k \log\left\{ 1 - \frac{\log(1 + n/2^n)}{\log |D|} \right\}
\]

goes very fast to zero, in comparison to n, thus the value $\log \log |D|$ is
asymptotically optimal. □
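
As a numerical illustration of how quickly the correction term in Lemma 14 vanishes, the short Python sketch below evaluates it for the extreme case m = 2^n. The function name `gap_correction`, the choice of base-2 logarithms and the value k = 1 are assumptions made for this illustration only.

```python
import math


def gap_correction(n, m, k=1.0):
    """Evaluate -k * log(1 - log(1 + n / 2**n) / log|D|) with |D| = m + n.

    Logarithms are taken base 2 here, so that log(2**n) = n.
    """
    num_destinations = m + n
    delta_prime = math.log2(1 + n / 2**n)
    return -k * math.log2(1 - delta_prime / math.log2(num_destinations))


# The extreme case used in the proof: as many sets as possible, m = 2**n.
for n in (10, 20, 40):
    print(n, gap_correction(n, 2**n))
```

The printed values are positive and shrink rapidly as n grows, in line with the remark that the quantity goes to zero very fast in comparison to n.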

The reduction shown in Theorem 11 is gap preserving, since it maintains

an approximation gap, introduced by the instances of SET COVER. Note

however that the name "gap preserving" is misleading in this case, since the

new transformation has a smaller gap than the original.

Finally, we get the following result.

Theorem 15 If there is some $\epsilon > 0$ such that a polynomial time algorithm A
can approximate FSCPP within $(1 - \epsilon) \log \log k$, where $k = |D|$, then
$NP \subset TIME(n^{O(\log \log n)})$.

Proof: Suppose that an instance I of the SC is given. The transformation $\phi$
described above can be used to find an instance $\phi(I)$ of the FSCPP. Then, A
can be used to solve the problem for instance $\phi(I)$. According to Lemma 14,
transformation $\phi$ reduces any gap of $\log n$ to $\log \log k$. Thus, with such an
algorithm one can differentiate between instances I with a gap of $\log n$. But
this is not possible in polynomial time, according to (Feige, 1998, Theorem 10)
unless $NP \subset TIME(n^{O(\log \log n)})$. □









4.4 Concluding Remarks

The SCPP is a difficult combinatorial optimization problem occurring in

multicast networks. We have shown that the SCPP in general cannot have
approximation algorithms with guarantee better than $\epsilon$, for some $\epsilon > 1$. Thus,
different from other optimization problems (such as the connected dominating
set in Chapter 7), the SCPP cannot have a polynomial time approximation
scheme (PTAS).

We have also proved that the FSCPP cannot be approximated by less than
$\log \log k$, where k is the number of destinations, unless NP can be solved in
sub-exponential time. This shows that it is very difficult to find near optimal

results for general instances of the FSCPP.














CHAPTER 5
ALGORITHMS FOR STREAMING CACHE PLACEMENT PROBLEMS

The results of the preceding chapter show that the SCPP is very difficult

to solve, even if only approximate solutions are required. We describe some

approximation algorithms that can be used to give solutions to the problem,

and decrease the gap between known solutions and non-approximability re-

sults. We also consider practical heuristics to find good near-optimal solutions

to the problem. We propose two general types of heuristics, based on comple-

mentary techniques, which can be used to give good starting solutions for the

SCPP.

5.1 Introduction

In this chapter, we propose algorithms for solution of SCPP problems.

Initially, we discuss algorithms with performance guarantee, also known as

approximation algorithms. We give a general algorithm for SCPP problems,

and also a better algorithm based on flow techniques. Approximation algo-

rithms are very interesting as a way of understanding the complexity of the

problem but, especially in this case, due to the negative results shown in

Chapter 4, they are not very practical.

Thus, considering the complexity issues, we propose polynomial time con-

struction algorithms for the SCPP, based on two general techniques: adding

destinations to a partial solution, and reducing the number of infeasible nodes

in an initial solution. We report the results of computational experiments

based on these two algorithms and their variations.









This chapter is organized as follows. In Section 5.2 we present algo-

rithms with performance guarantee for the SCPP. In Section 5.3, we turn to

algorithms without performance guarantee, and discuss a number of possible

construction strategies. Then, in Section 5.4 we proceed to an empirical evalu-

ation of the solutions returned by the proposed construction heuristics. Final

remarks and future research directions are discussed in Section 5.5.

5.2 Approximation Algorithms for SCPP

In this section, we present algorithms for the TSCPP and FSCPP and

analyze their approximation guarantee. To simplify our results, we use the

notation $A(I) = |R \cup \{s\}|$, where R is the set of cache nodes found by algorithm

A applied to instance I. Also, $OPT(I) = |R^* \cup \{s\}|$, where $R^*$ is an optimal

set of cache nodes for instance I. Note that $A(I) \geq 1$ and $OPT(I) \geq 1$, which

makes Definition 2 valid for our problems.

5.2.1 A Simple Algorithm for TSCPP

It is easy to construct a simple approximation algorithm for any instance

of the TSCPP. We denote by $\delta_G(v)$ the degree of node v in the graph G.

Input: Graph G, destinations D, source s
Output: a set R of cache nodes
STEP1: Construct a spanning tree T of G
STEP2: Remove recursively all leaves of T which are not in D ∪ {s}
STEP3: Let S1 be the set of internal nodes v with δ_T(v) > 2
STEP4: Let S2 be the set of internal nodes v with δ_T(v) = 2 and
       v ∈ D
STEP5: Return R = S1 ∪ S2.

Algorithm 5: Spanning Tree Algorithm
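
A possible Python rendering of Algorithm 5 is sketched below. Several details are assumptions made here and are not fixed by the algorithm statement: the spanning tree is built by breadth-first search from s, the graph is given as an adjacency dictionary `adj` and is assumed connected, and the source is excluded from R because R ⊆ V \ {s}. The function name `spanning_tree_caches` is hypothetical.

```python
from collections import deque


def spanning_tree_caches(adj, source, destinations):
    """Return the cache set R = S1 ∪ S2 computed by the spanning tree heuristic.

    adj: dict node -> iterable of neighbours (undirected, connected graph).
    """
    # STEP 1: build a spanning tree by breadth-first search (one possible choice).
    tree = {v: set() for v in adj}
    seen = {source}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                tree[u].add(v)
                tree[v].add(u)
                queue.append(v)

    # STEP 2: repeatedly remove leaves that are neither destinations nor s.
    keep = set(destinations) | {source}
    leaves = deque(v for v in tree if len(tree[v]) == 1 and v not in keep)
    while leaves:
        v = leaves.popleft()
        if len(tree[v]) != 1:
            continue
        (u,) = tree[v]
        tree[u].discard(v)
        tree[v].clear()
        if len(tree[u]) == 1 and u not in keep:
            leaves.append(u)

    # STEPS 3-5: branching nodes, plus degree-2 nodes that are destinations.
    s1 = {v for v in tree if v != source and len(tree[v]) > 2}
    s2 = {v for v in tree
          if v != source and len(tree[v]) == 2 and v in keep}
    return s1 | s2


# Star around r with one useless leaf c; destinations a and b.
graph = {"s": ["r"], "r": ["s", "a", "b", "c"],
         "a": ["r"], "b": ["r"], "c": ["r"]}
print(spanning_tree_caches(graph, "s", ["a", "b"]))   # {'r'}
```

Note that the heuristic ignores edge capacities entirely, which is why it corresponds to a worst case for Algorithm 4.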


Note that steps 3 and 4 of Algorithm 5 represent a worst case for Algo-

rithm 4. The correctness of the algorithm is shown in the next lemma.

Lemma 16 Algorithm 5 returns a feasible solution to the TSCPP.











Proof: The operation in step 2 maintains feasibility, since leaves cannot be
used to reach destinations. The result R includes all internal nodes v with
$\delta_T(v) > 2$, and all internal nodes v with $\delta_T(v) = 2$ and $v \in D$. It suffices to
prove that if $\delta_T(v) = 2$ and $v \notin D$ then v is not needed in R.

Suppose that v is an internal node with $\delta_T(v) = 2$ and $v \notin D$. If the
number of destinations down the tree from v is equal to 1, then v does not
need to be a cache. Now, assume that the number of destinations down the
tree from v is two or more. Then, there are two cases. In the first case, there
is a node w, between v and the destinations, with $\delta_T(w) > 2$. In this case, w
is in R, and we need just to send one unit of flow from v to w, thus v does not
need to be in R. In the second case, there must be some destination w with
$\delta_T(w) = 2$ between v and the other destinations. Again, in this case w will be
included in R from $S_2$. Thus v does not need to be in R. This shows that R
is a feasible solution to the TSCPP. □

Lemma 17 Algorithm 5 gives an approximation guarantee of $|D|$.

Proof: Let us partition the set of destinations among $D_1$ and $D_2$, where
$D_1 = D \setminus S_2$ and $D_2 = S_2$. Denote by $D'$ the set of destinations which
are leaves in $T$. Initially, note that for any tree the number of nodes $v$ with
degree $\delta(v) > 2$ is at most $|L| - 2$, where $L$ is the set of nodes with
$\delta(v) = 1$ (the leaves). But $L$ in this case is $D' \cup \{s\} \subseteq D_1 \cup \{s\}$.
This implies that $|S_1| \leq |D_1 \cup \{s\}| - 2$. Thus,

$$|R| = |S_1 \cup S_2| \leq |D_1 \cup \{s\}| - 2 + |D_2| = |D| - 1,$$

and

$$A(I) = |R| + 1 \leq |D| \leq |D| \, OPT(I),$$

since $OPT(I) \geq 1$.









Let $\Delta = \Delta(G)$ be the maximum degree of $G$. In the case in which all
capacities $c_e$, for $e \in E(G)$, are equal to one, we can give a better analysis of
the previous algorithm with an improved performance.

Theorem 18 When $c_e = 1$ for all $e \in E$, then Algorithm 5 is a
$k$-approximation algorithm, where $k = \min\{\Delta(G), |D|\}$.

Proof: The key idea to note is that if $c_e = 1$ for all $e \in E$, then
$\lceil |D| / \Delta \rceil \leq OPT(I)$, for any instance $I$ of the TSCPP. This happens
because each cache node (as well as the source) can serve at most $\Delta$
destinations. Let $A(I)$ be the value returned by Algorithm 5 on instance $I$. We
know from the previous analysis of Lemma 17 that $A(I) \leq |D|$. Thus
$A(I) \leq \Delta \, OPT(I)$. The theorem follows, since we know that this is also a
$|D|$-approximation algorithm. $\Box$
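The step from $A(I) \leq |D|$ to $A(I) \leq \Delta \, OPT(I)$ can be written out as a single chain of inequalities. The numbers in the second line are a made-up illustration ($\Delta = 4$, $|D| = 10$), not values taken from the experiments.

```latex
\begin{align*}
A(I) &\leq |D| \leq \Delta \left\lceil \frac{|D|}{\Delta} \right\rceil
      \leq \Delta \, OPT(I), \\
\text{e.g. } \Delta = 4,\ |D| = 10:\quad
A(I) &\leq 10 \leq 4 \cdot \left\lceil \tfrac{10}{4} \right\rceil = 12
      \leq 4 \, OPT(I),
\qquad k = \min\{4, 10\} = 4 .
\end{align*}
```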

5.2.2 A Flow-based Algorithm for FSCPP

In this section, we present an approximation algorithm for the FSCPP.
The algorithm is based on the idea of sending flow from the source to
destination nodes. We show that this algorithm performs at least as well as the
previous algorithm for the TSCPP. In addition, we show that for a special class
of graphs this algorithm gives essentially the optimum solution. Therefore, for
this class of graphs the FSCPP is solvable in polynomial time.

We now give standard definitions about network flows. For more details
about the subject, see Ahuja et al. (1993). Let $f(x, y) \in \mathbb{R}_+$ be the
amount of flow sent on edge $(x, y)$, for $(x, y) \in E$. A flow is called a feasible
flow if it satisfies flow conservation constraints (3.2).

Let $F(f, s, t) = \sum_{v \in V} f(s, v)$ be the total flow sent from node $s$. We
assume that $s$ can send at most $\sum_{(s,v) \in E} c_{sv}$ units of flow, and $t$ can
receive at most $\sum_{(u,t) \in E} c_{ut}$ units of flow. A feasible flow $f$ is the
maximum flow from $s$ to $t$ if there is no feasible flow $f'$ such that
$F(f', s, t) > F(f, s, t)$. A node $v$ is a reached node from $s$ by flow $f$ if
$\sum_{w \in V} f(w, v) > 0$. It is well known that when $f$ is the maximum flow,
then $F(f, s, t) = C(s, t)$, where $C(s, t)$ represents the minimum capacity of any
set of edges separating $s$ from $t$ in $G$ (the minimum cut). We also use the
notation $C(U, \bar{U})$ to denote the total capacity of edges linking nodes in $U$
to nodes in $\bar{U}$, where $U \subset V$ and $\bar{U} = V \setminus U$.
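To make the quantity $C(s, t)$ concrete, the following sketch computes the value of a maximum flow, which by the max-flow/min-cut theorem equals the minimum cut. It uses the Edmonds-Karp method (shortest augmenting paths), chosen here only for brevity; the text does not prescribe a particular maximum flow algorithm. The nested-dictionary capacity format and the function name `max_flow_value` are assumptions made for illustration.

```python
from collections import deque

def max_flow_value(capacity, s, t):
    """Value of a maximum s-t flow (Edmonds-Karp).

    capacity: dict {u: {v: c_uv}} of directed edge capacities
    (hypothetical input format). Requires s != t.
    """
    if s == t:
        raise ValueError("source and sink must differ")
    # Residual capacities, with reverse edges initialised to zero.
    residual = {s: {}, t: {}}
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0) + c
            residual[v].setdefault(u, 0)

    value = 0
    while True:
        # BFS for a shortest augmenting path in the residual graph.
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, r in residual[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return value            # no augmenting path: flow is maximum
        # Collect the path edges, find the bottleneck, and augment.
        path = []
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(residual[u][v] for u, v in path)
        for u, v in path:
            residual[u][v] -= push
            residual[v][u] += push
        value += push
```

For instance, with edges `s->a` (2), `s->b` (1), `a->t` (1), `a->b` (1) and `b->t` (2), the cut around `s` has capacity 3 and a flow of that value exists, so the function returns 3.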

Denote any feasible flow starting from node $v$ by $f_v$. The algorithm works
by finding the maximum flow from $s$ to all nodes in $D$. If the total value of
this maximum flow is $F(f, s, D) \geq |D|$, then the problem is solved, since all
destinations can be reached without cache nodes. Otherwise, we put all nodes
reachable from $s$ in a set $Q$. Then we repeat the following steps until
$D \setminus Q$ is empty: for all nodes $v \in Q$, compute the maximum flow $f_v$
from $v$ to $D$. Find the node $v^*$ such that $f_{v^*}$ is maximum. Then add $v^*$
to $R$ and add to $Q$ all nodes reachable from $v^*$. Also, reduce the capacity of
the edges in $E$ by the amount of flow used by $f_{v^*}$. These steps are described
more formally in Algorithm 6.


$Q \leftarrow \{s\}$
while $D \setminus Q \neq \emptyset$ do
    forall $v \in Q$ do
        find the maximum flow $f_v$ from $v$ to $D \setminus Q$
    end
    Let $v^*$ be the node such that $F(f_{v^*}, s, D)$ is maximum
    $R \leftarrow R \cup \{v^*\}$
    Add to $Q$ the nodes reached by $f_{v^*}$
    for each edge $(u, v) \in E$ do
        Reduce capacity $c_{u,v}$ by $f_{v^*}(u, v)$
    end
end

Algorithm 6: Flow algorithm
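A literal transcription of Algorithm 6 is sketched below, under several assumptions that are not part of the original description: capacities are given as a dictionary keyed by directed edges, every destination has unit demand (modelled by a unit-capacity edge to a hypothetical super-sink `SINK`), node labels are mutually comparable so that ties are broken in sorted order, and Edmonds-Karp is used as the maximum flow routine. The source is removed from the returned set, since it is counted separately in $A(I)$.

```python
from collections import deque

SINK = "_sink"  # hypothetical super-sink label, must not be a real node

def max_flow(cap, s, t):
    """Edmonds-Karp. cap: {(u, v): c}. Returns (value, {(u, v): flow > 0})."""
    res = {s: {}, t: {}}
    for (u, v), c in cap.items():
        res.setdefault(u, {})
        res.setdefault(v, {})
        res[u][v] = res[u].get(v, 0) + c
        res[v].setdefault(u, 0)
    value = 0
    while True:
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v, r in res[u].items():
                if r > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break
        path = []
        v = t
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(res[u][v] for u, v in path)
        for u, v in path:
            res[u][v] -= push
            res[v][u] += push
        value += push
    flow = {(u, v): c - res[u][v] for (u, v), c in cap.items() if c - res[u][v] > 0}
    return value, flow

def flow_caches(cap, source, dests):
    """Sketch of Algorithm 6; returns the cache nodes (source excluded)."""
    cap = dict(cap)                      # residual capacities, updated in place
    dests = set(dests)
    Q, R = {source}, set()
    while dests - Q:
        ext = dict(cap)
        for d in dests - Q:
            ext[(d, SINK)] = 1           # unit demand per unreached destination
        best = None
        for v in sorted(Q):              # deterministic tie-breaking
            value, flow = max_flow(ext, v, SINK)
            if best is None or value > best[0]:
                best = (value, v, flow)
        value, v_star, flow = best
        if value == 0:
            raise ValueError("remaining destinations cannot be reached")
        R.add(v_star)
        for (u, w), f in flow.items():
            if w != SINK:
                Q.add(w)                 # w is reached by the flow
                cap[(u, w)] -= f         # reduce the capacity used
    return R - {source}
```

On the instance with unit-capacity edges `s->a`, `a->d1`, `a->d2`, the source can serve only one destination, so node `a` is returned as the single cache.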


In the following theorem, we show that the running time of this algorithm

is polynomial and depends on the time needed to find the maximum flow.









Denote by $C_M(G)$ the maximum value of the minimum cut between any pair
of nodes $v, w \in V(G)$, i.e.

$$C_M(G) = \max_{v,w \in V} C(v, w).$$

Similarly, we define

$$C_m(G) = \min_{v,w \in V} C(v, w).$$

Theorem 19 Algorithm 6 has running time equal to $O(n \cdot |D| \cdot T_{mf} / C_m(G))$,
where $T_{mf}$ is the time needed to run the maximum flow algorithm.

Proof: The most costly operations in this algorithm are calls to the maximum
flow algorithm. Therefore we count the number of calls. Note that, at each of
the $N$ iterations of the while loop, a new element is added to the set of cache
nodes. Thus, $A(I)$ is equal to the number of such iterations.

Let $v_i$ be the node added to the set of cache nodes at iteration $i$, and
$Q^i$ be the content of set $Q$ at iteration $i$ of Algorithm 6. At each step, the
number of elements of $D$ found by the algorithm is, according to the
maximum-flow/minimum-cut theorem, equal to the minimum cut from $v_i$ to the
remaining nodes in $D \setminus Q^i$ (recall that all demands are unitary). Then,

$$|D| \geq \sum_{i=1}^{N} C(v_i, D \setminus Q^i) \geq \sum_{i=1}^{N} \min_{w \in D} C(v_i, w) \geq A(I) \min_{v,w \in V} C(v, w).$$

Thus we have

$$N = A(I) \leq \frac{|D|}{C_m(G)}. \qquad (5.1)$$

At each iteration of the while loop, the number of calls to the maximum
flow algorithm is at most $n$. The total number $n_c$ of such calls is given by
$n_c \leq n|D| / C_m(G)$. Thus the running time of Algorithm 6 is
$O(n|D|T_{mf} / C_m(G))$. $\Box$








Based on the performance analysis just shown, the following theorem gives

an approximation guarantee for Algorithm 6.

Theorem 20 Algorithm 6 is a $k$-approximation algorithm, where
$k = C_M(G) / C_m(G)$.

Proof: If we denote by $R$ the set of cache nodes in the optimal solution, we
have

$$|D| \leq \sum_{v_i \in R \cup \{s\}} C(v_i, D) \leq \sum_{v_i \in R \cup \{s\}} \max_{w \in V} C(v_i, w) \leq OPT(I) \, C_M(G). \qquad (5.2)$$

Combining inequalities (5.1) and (5.2) results in

$$A(I) \leq OPT(I) \, \frac{C_M(G)}{C_m(G)}. \qquad \Box$$

Note that the quantity $C_M(G) / C_m(G)$ can become large. However, for
some types of graphs, the preceding algorithm gives us a better understanding
of the problem. For example, if the graph has maximum degree $\Delta(G)$ and
fixed capacity 1, then it is easy to see that

$$\frac{C_M(G)}{C_m(G)} \leq \frac{\Delta(G)}{1} = \Delta(G).$$

If the edge capacity is not fixed, then $C_M(G) / C_m(G)$ becomes at most
$\Delta(G) \, c_M / c_m$, where $c_M$ represents the maximum capacity and $c_m$ the
smallest capacity of edges in $G$.

5.3 Construction Algorithms for the SCPP

In this section, we provide construction algorithms for SCPP that give

good results for a class of problem instances. The algorithms are based on

dual methods for constructing solutions. In the first heuristic, the method

used consists of sequentially adding destinations to the current flow, until all

destinations are satisfied. The second method uses the idea of turning an









initial infeasible solution into a feasible one, by adding cache nodes to the

existing infeasible flow.

The general method used can be summarized by saying that it consists of
the selection, at each step, of subsets of the resulting solution, until a complete
solution is found. To select elements of the solution, an ordering function
is employed. This is done in a way such that parts of the solution which seem
promising in terms of objective function are added first. In the remainder of
this section we describe the specific techniques proposed to create solutions
for the SCPP.

5.3.1 Connecting Destinations

The first method we propose to construct solutions for the SCPP is based

on adding destinations sequentially. The algorithm uses the fact that each

feasible solution for the SCPP can be clearly described as the union of paths

from the source s to the set D of destinations.

More formally, let $D = \{d_1, \ldots, d_k\}$ be the set of destinations. Then,
a feasible flow for the SCPP is the union of a set of paths
$\mathcal{P} = P_1 \cup \cdots \cup P_k$, such that
$P_1 = (s, x^1_1, \ldots, x^1_{j_1 - 1}, x^1_{j_1})$,
$P_2 = (s, x^2_1, \ldots, x^2_{j_2 - 1}, x^2_{j_2})$, ...,
$P_k = (s, x^k_1, \ldots, x^k_{j_k - 1}, x^k_{j_k})$,
and where $x^i_{j_i} = d_i$, for $i \in \{1, \ldots, k\}$.

In the proposed algorithm, we try to construct a solution that is the union
of paths. It is assumed initially that no destination has been connected to the
source, and $A$ is the set of unconnected destinations, i.e., $A = D$. Also,
the set of cache nodes $R$ is initially empty. During the algorithm execution,
$S$ represents the current subgraph of $G$ connected to the source $s$. At each
step of the algorithm, a path is created linking $S$ to one of the destinations
$d \in A$. First, the algorithm tries to find a path $P$ from $d$ to one of the nodes
in $R \cup \{s\}$. If this is not possible (this is represented by saying that $P$ = nil),














Figure 5-1: Sample execution for Algorithm 7. In this graph, all capacities
are equal to 1. Destination d2 is being added to the partial solution, and node
1 must be added to R.

then the algorithm tries to find a path $P$ from $d$ to some node in the connected
subgraph $S$. Let $w$ be the connection point between $P$ and $S$. Then, add $w$
to the set $R$ of caches, since this is necessary to make the solution feasible.
Finally, the residual flow available in the graph is updated, to account for the
capacity used by $P$. The algorithm finishes when all destinations are included,
and $R$ is returned as the solution for the problem.

The formal steps of the proposed procedure are described in Algorithm 7.

A number of important decisions, however, are unspecified in the description of

the algorithm given so far. For example, there are many possible methods that

can be used to select the next destination added to the current partial solution.

Also, a path to a destination v can be found using diverse algorithms, which

can result in different selections for a required cache node. These possible

variations in the algorithms are represented by two functions, get_path(v, S),

and get_dest(A). Thus, changing the definition of these functions we can

actually achieve different implementations.
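The role of the two pluggable functions can be illustrated with a short sketch of the first constructor. Everything below is an assumption made for illustration: capacities are stored in a nested dictionary, each destination consumes one unit of capacity along its path, `get_dest` defaults to the smallest label, and the path search is a plain breadth-first search over edges with remaining capacity (one possible choice of get_path, not necessarily the one used in the experiments).

```python
from collections import deque

def bfs_path(cap, starts, goal):
    """Fewest-edge path from any node in `starts` to `goal`, using only
    edges with positive remaining capacity. Returns None if none exists."""
    parent = {u: None for u in starts}
    queue = deque(sorted(starts))
    while queue:
        u = queue.popleft()
        if u == goal:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for w, c in cap.get(u, {}).items():
            if c > 0 and w not in parent:
                parent[w] = u
                queue.append(w)
    return None

def connect_destinations(cap, source, dests, get_dest=min):
    """Sketch of the first construction algorithm; returns the cache set R."""
    cap = {u: dict(nbrs) for u, nbrs in cap.items()}  # residual capacities
    pending = set(dests)       # unconnected destinations
    reached = {source}         # nodes of the current subgraph S
    caches = set()             # the set R
    while pending:
        v = get_dest(pending)
        pending.discard(v)
        # First try to connect v to the source or to an existing cache.
        path = bfs_path(cap, caches | {source}, v)
        if path is None:
            # Otherwise connect to any reached node, which becomes a cache.
            path = bfs_path(cap, reached, v)
            if path is None:
                raise ValueError(f"destination {v!r} cannot be connected")
            caches.add(path[0])
        for a, b in zip(path, path[1:]):
            cap[a][b] -= 1     # one unit of flow per destination
        reached.update(path)
    return caches
```

Swapping `get_dest` or replacing `bfs_path` by another search changes the implementation without touching the main loop, which is the point of separating the two decisions.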

The first feature that can be changed, by defining a function
get_dest : $2^V \to V$, is the order in which destinations are added to the final
solution. Among the possible variations, we can list the following, which seem
more useful:









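Input: graph $G$, set $D$ of destinations
$A \leftarrow D$
$S \leftarrow \emptyset$ /* current flow */
$R \leftarrow \emptyset$ /* set of cache nodes */
while $A \neq \emptyset$ do
    $v \leftarrow$ get_dest($A$) /* choose a destination */
    $A \leftarrow A \setminus \{v\}$
    $P \leftarrow$ get_path($v$, $R$, $G$)
    if $P$ = nil then
        $P \leftarrow$ get_path($v$, $S$, $G$)
        Let $w$ be the node connecting $P$ to $S$
        $R \leftarrow R \cup \{w\}$
    end
    Remove from $G$ the capacity used in $P$
    $S \leftarrow S \cup P$
end
return $R$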

Algorithm 7: First construction algorithm for the SCPP.


give precedence to destinations closer to the source;

give precedence to destinations further from the source;

give precedence to destinations in lexicographic order.
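The three orderings above can be sketched as small selection functions. The sketch assumes, for simplicity, that "closer" and "further" are measured in number of hops from the source; the instances in this chapter also carry edge distances, so a weighted distance could be substituted. All function names are hypothetical.

```python
from collections import deque

def hop_distances(adj, source):
    """Breadth-first hop counts from the source. adj: {u: iterable of v}."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj.get(u, ()):
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def closest_first(pending, dist):
    # Smallest distance first; ties broken by label.
    return min(pending, key=lambda d: (dist[d], d))

def farthest_first(pending, dist):
    # Largest distance first; ties broken by label.
    return min(pending, key=lambda d: (-dist[d], d))

def lexicographic(pending, dist=None):
    # The distance table is ignored; only the labels matter.
    return min(pending)
```

Each function has the shape expected of get_dest: it receives the set of pending destinations and returns one of them.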

The second basic decision that can be made is, once a destination $v$ is
selected, what type of path will be used to join $v$ to the rest of the graph.
This decision is incorporated by the function get_path : $V \to 2^V$. A second
parameter involved in the definition of get_path is the specific node $w \in S$
which will be connected to $v$. Note that such a node always exists, since at
each step of the construction there is a path from destinations to at least one
node already reached by flow. Using a greedy strategy, the best way to link
a new destination $d$ is through a path from $d$ to some node $v \in R \cup \{s\}$.
However, it may not be possible to find such $v$, and this requires the addition
of a new node to $R$. In both situations it is not clear which node is the optimal
one to be linked to the current destination. Thus, another important decision in










Algorithm 7 concerns how to choose, at each step, the node to be linked to

the current destination.

Shortest Path Policy. Perhaps the simplest and most logical solution
to the above questions is to link destination nodes using shortest paths. This
policy is useful, since it can be applied to answer the two questions raised
above: the path is created as a shortest path, and the node $v \in R \cup \{s\}$
selected is the closest to $d$. Thus, function get_path(v, R, G) in Algorithm 7
becomes: select the node $v \in R \cup \{s\}$ such that $\mathrm{dist}(v, d)$ (the
shortest path distance between $v$ and $d$) is minimum; find the shortest path
$d \to v$ from $d$ to $v$; add $d \to v$ to the current solution. If there is no path
from $d$ to $R \cup \{s\}$, then let $v$ be the node closest to $d$ and reached by flow
from $s$ and add $v$ to $R$.
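One way to realize this policy is a multi-source Dijkstra search started simultaneously from every candidate node, which yields both the closest candidate and the corresponding shortest path in a single pass. The sketch below is an assumption-laden illustration: `length` is a hypothetical nested dictionary of edge lengths that is assumed to contain only edges with remaining capacity, and node labels are assumed to be mutually comparable.

```python
import heapq

def shortest_path_policy(length, targets, d):
    """Return (distance, path) from the closest node in `targets` to `d`,
    or None if `d` cannot be reached. length: {u: {v: edge_length}}."""
    dist = {t: 0 for t in targets}
    parent = {t: None for t in targets}
    heap = [(0, t) for t in sorted(targets)]
    heapq.heapify(heap)
    done = set()
    while heap:
        du, u = heapq.heappop(heap)
        if u in done:
            continue                      # stale heap entry
        done.add(u)
        if u == d:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return du, path[::-1]
        for w, l in length.get(u, {}).items():
            if w not in done and du + l < dist.get(w, float("inf")):
                dist[w] = du + l
                parent[w] = u
                heapq.heappush(heap, (du + l, w))
    return None
```

Calling the function first with `targets` equal to $R \cup \{s\}$ and, if it returns `None`, again with the set of nodes reached by flow mirrors the two cases described in the paragraph above.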

Other Policies. We have tested two other methods of connecting
sources to destinations. In the first method, destinations are connected
through a path found using the depth first search algorithm. In this
implementation, paths are followed until a node already in $R$, or connected to some
node in $R$, is found. In the last case, the connection node must be added to $R$.
The second method used employs random paths starting from the destination
nodes. This method is useful only to understand how good the previous
algorithms are compared to a random solution.

Once the policy to be used in the constructor is defined, it is not difficult to
prove the correctness of the algorithm and to find its complexity.

Theorem 21 Given an instance I of the SCPP, Algorithm 7 returns a correct

solution.

Proof: At each step, a new destination is linked to the source node. Thus, at

the end of the algorithm all destinations are connected. The paths determined

by such connections are valid, since they use some of the available capacity










(according to the information stored in the residual graph G). Nodes are

added to R only when it is required to make the connection possible. Thus,

the resulting set R is correct, and corresponds to a valid flow from s to the set

$D$ of destinations. $\Box$

Theorem 22 Using the closest node policy for destination selection and the
shortest path policy for path creation, the time complexity of Algorithm 7
is $O(|D| n^2)$.

Proof: The external loop is executed $|D|$ times. The steps of highest
complexity inside the while loop are exactly the ones corresponding to the get_path
procedure. As we are proposing to use the shortest path algorithm for the
implementation of get_path, the complexity is $O(n^2)$ (but can be improved
with more clever implementations for the shortest path algorithm). Other
operations have smaller complexity, thus the total complexity for Algorithm 7 is
$O(|D| n^2)$. $\Box$

Other implementations of Algorithm 7 would result in a very similar anal-

ysis of complexity.

5.3.2 Adding Caches to a Solution

We propose a second general technique for creating feasible solutions for

the SCPP. The algorithm consists of adding caches sequentially to an initial

infeasible solution, until it becomes feasible. The steps are presented in Algo-

rithm 8. At the beginning of the algorithm, the set of cache nodes R is empty,

and a possibly infeasible subgraph, linking the source to all destinations gives

the initial flow. Such an initial infeasible solution can be created easily with

any spanning tree algorithm. In the description of our procedure, we define

the set $I$ of infeasible nodes to be the nodes $v \in V \setminus \{s\}$ such that

$$\sum_{(w,v) \in E(G)} c(w, v) - \sum_{(v,w) \in E(G)} c(v, w) \neq b_v,$$









Input: Graph $G$

$S \leftarrow$ spanning_tree($G$)
Remove from $G$ the capacity on edges used by $S$
$I \leftarrow$ infeasible nodes in $S$
while $I \neq \emptyset$ (there are infeasible nodes) do
    $v \leftarrow$ select_unfeasible_node
    Try to find different paths to satisfy $v$
    if a set $P$ of paths is found then
        Remove from $G$ the capacity used by $P$
        $S \leftarrow S \cup P$
        $I \leftarrow I \setminus \{v\}$
    else
        $R \leftarrow R \cup \{v\}$
    end
end
return $R$

Algorithm 8: Second construction algorithm for the SCPP.


where $b_v$ is the demand of $v$, which can be 0 or 1.

In the while loop of Algorithm 8, the current solution is initially checked
for feasibility. This verification determines if there is any node $v \in V$ such that
the amount of flow leaving the node is greater than the arriving flow, or in other
words, $I \neq \emptyset$. The formal description of this verification procedure is given in
Algorithm 9. If the solution is found to be infeasible, then it is necessary to
improve its feasibility by increasing the number of properly balanced nodes.
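The balance test described above can be sketched as follows. The representation is an assumption: the candidate solution is given as a dictionary mapping directed edges to the amount carried on them, the node list is passed explicitly, and the source is skipped, in line with the definition of $I$ over $V \setminus \{s\}$. The function name is hypothetical.

```python
def infeasible_nodes(flow, nodes, source, dests):
    """Return the set of nodes whose net inflow differs from their demand.

    flow:  {(u, v): amount carried on edge (u, v)} for the candidate solution
    nodes: iterable of all nodes
    dests: destinations, each with demand 1 (all other nodes have demand 0)
    """
    nodes = list(nodes)
    dests = set(dests)
    demand = {v: (1 if v in dests else 0) for v in nodes}
    balance = {v: 0 for v in nodes}      # entering minus leaving flow
    for (u, v), amount in flow.items():
        balance[v] += amount
        balance[u] -= amount
    return {v for v in nodes if v != source and balance[v] != demand[v]}
```

For a tree that sends one unit on `s->a`, `a->d1` and `a->d2`, node `a` receives one unit and forwards two, so it is the only node reported as infeasible.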

The correction of infeasible nodes $v \in I$ is done in the body of the while
loop in Algorithm 8. The procedure consists of selecting a node $v$ from the
set of infeasible nodes $I$ in the flow graph, and trying to make it feasible by
sending more data from one of the nodes $w \in R \cup \{s\}$. If this can be done in
such a way that $v$ becomes feasible again, then the algorithm just needs to update
the current subgraph $S$ and the set of infeasible nodes $I$.































Figure 5-2: Sample execution for Algorithm 8, on a graph with unitary capac-
ities. Nodes 1 and 2 are infeasible, and therefore are candidates to be included
in R.

However, if $v$ cannot receive enough additional data, then it must be
added to the list of cache nodes. This clearly will make the node feasible
again, since there will be no restrictions on the amount of flow departing
from $v$. After adding $v$ to $R$, the graph is modified as necessary. Examples
of possible modifications are changing the flow required by $v$ to one unit and
deleting additional paths leading to $v$, since only one is necessary to satisfy
the flow requirements. We assume that these changes are done randomly, if
necessary.

This construction technique can be seen as a dual of the algorithm pre-

sented in the previous subsection. In Algorithm 7 the assumption is that a


Input: current solution $S$, destinations set $D$
for $v \in D$ do $b_v \leftarrow 1$

for $v \in V$ do
    $\delta \leftarrow \sum_{(w,v) \in E(S)} c(w, v) - \sum_{(v,w) \in E(S)} c(v, w)$
    if $\delta \neq b_v$ then
        $I \leftarrow I \cup \{v\}$
    end
end
return $I$ /* returns the infeasible nodes */

Algorithm 9: Feasibility test for candidate solution.









partial solution can be incomplete, but always feasible with relation to the flow

sent from s to the reached destinations. On the other hand, in Algorithm 8

a solution is always complete, in the sense that all destinations are reached.

However, it does not represent a feasible flow, until the end of the algorithm.

Procedure select_unfeasible_node has the objective of finding the

most suitable node to be processed in the current iteration. This is the main

decision in the implementation of Algorithm 8, and can be done using a greedy

function, which will find the best candidate according to some criterion. We

propose some possible candidate functions, and determine empirically (in the

next section) how these functions perform in practice.

Function largest_infeasibility: select the node that has greatest
infeasibility, i.e., the difference between entering and leaving flow minus
demand is maximum (breaking ties arbitrarily). This strategy tries to
add to the set $R$ a node which can benefit most from being a cache.

Function closest_from_source: select the infeasible node which is
closest to the source. The advantage in this case is that a node
$v$ selected by this rule can help to reduce the infeasibility of other nodes
down the path from $s$ to the destinations.

Function uniform_random: select uniformly a node $v \in I$ to be added
to $R$. This rule is useful for breaking the biases existing in the previous
methods. It has also the advantage of being very simple to compute,
and therefore very fast.
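The three candidate rules can be sketched as interchangeable functions. The sketch makes some assumptions that the text leaves open: infeasibility is measured as the absolute difference between net inflow and demand, distances from the source are supplied in a precomputed table, ties are broken by sorted label order, and the argument names (`balance`, `demand`, `dist`, `rng`) are hypothetical.

```python
import random

def largest_infeasibility(infeasible, balance, demand):
    """Pick the node whose net inflow deviates most from its demand."""
    return max(sorted(infeasible), key=lambda v: abs(balance[v] - demand[v]))

def closest_from_source(infeasible, dist):
    """Pick the infeasible node with the smallest distance from the source."""
    return min(sorted(infeasible), key=lambda v: dist[v])

def uniform_random(infeasible, rng=random):
    """Pick an infeasible node uniformly at random."""
    return rng.choice(sorted(infeasible))
```

Passing a seeded generator such as `random.Random(0)` as `rng` makes the random rule reproducible across runs.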

Theorem 23 Given an instance of the SCPP, Algorithm 8 returns a correct

solution.

Proof: In the algorithm, the set of infeasible nodes I will decrease mono-

tonically. This happens because at each step one infeasible node is selected










and turned into a feasible node. Also, feasible nodes cannot become
infeasible, since each operation requires that there is enough capacity in the network
(this is guaranteed by the use of the residual graph $G$). Thus, the algorithm
terminates.

The data flow from the source to the destinations must be valid at the
end, by definition of the set $I$ (which must be empty at the end). Similarly,
the set $R$ must be valid, since it is used only to make nodes feasible in the case
that no additional paths can be found to satisfy their requirements. Thus, the
solution returned by Algorithm 8 is correct for the SCPP. $\Box$

Theorem 24 The time complexity of Algorithm 8 is $O(nmK)$, where $K$
is the sum of capacities in the SCPP instance.

Proof: A spanning tree can be found in $O(m \log n)$. Then follows
the while loop, which performs at most $n$ iterations. The procedure
select_unfeasible_node can be implemented in $O(n)$ by the use of some
preprocessing in each of the proposed implementations. Finding paths to
infeasible nodes is clearly the most difficult operation in the loop. This can be
performed in $O(m + n)$ for each path, using a procedure such as depth first
search. However, it may be necessary to run this step a number of times
proportional to the sum of capacities in the graph ($K$), which results in $O(mK)$.
Other operations have low complexity, thus the maximum complexity for
Algorithm 8 is $O(nmK)$. $\Box$

5.4 Empirical Evaluation

In this section we present computational experiments carried out with the

construction algorithms proposed above.

All algorithms were implemented using the C programming language (the
code is available on request). The resulting program was executed on a PC,









Table 5-1: Computational results for different variations of Algorithm 7 and
Algorithm 8.

Instance Constructor 1 Constructor 2
n m DFS I Shortest Random LI CS UR
50 500 9.9 2.9 18.8 18.3 18.4 3.0 2.8 3.2
50 800 12.9 4.9 29.3 29.6 28.4 5.0 4.8 5.2
60 500 35.1 8.8 56.8 56.4 55.5 9.6 9.2 9.9
60 800 44.7 11.7 77.6 76.9 76.8 12.5 11.7 12.8
70 500 78.9 17.0 111.3 111.2 110.4 18.9 18.1 19.6
70 800 102.2 20.3 139.3 140.1 139.7 22.5 21.4 23.3
80 500 147.6 27.8 180.7 182.2 181.3 32.0 31.1 33.4
80 800 186.1 32.4 218.3 218.8 218.9 36.9 36.2 38.9
90 500 239.8 42.1 268.4 267.8 268.5 49.5 49.9 52.0
90 800 285.4 47.5 312.6 313.4 313.3 56.5 57.4 59.8
100 500 344.1 60.3 368.0 369.0 369.1 71.3 74.2 75.8
100 800 398.6 67.7 420.1 421.6 422.3 80.3 84.3 85.4
110 500 466.4 84.2 482.6 484.8 485.8 100.3 106.1 106.2
110 800 525.5 92.5 541.7 545.1 543.7 111.9 119.5 118.4
120 500 599.7 115.1 611.3 614.9 613.3 137.6 146.9 144.9
120 800 668.7 125.5 676.9 680.6 679.2 151.9 164.0 159.4
130 500 748.8 157.4 751.4 755.6 754.9 180.3 194.6 189.0
130 800 824.7 169.5 823.3 827.7 827.8 199.3 215.7 208.5
140 500 910.6 213.2 906.2 909.8 908.9 233.4 252.4 243.9
140 800 995.9 228.2 984.8 990.2 989.3 254.6 276.3 265.5
150 500 1092.0 281.8 1074.4 1080.3 1080.6 292.0 319.6 303.6
150 800 1185.4 299.6 1162.6 1170.0 1169.1 315.8 347.0 353.6


with 312MB of memory and an 800MHz processor. The GNU gcc compiler
(under the Linux operating system) was used, with the optimization flag -O2
enabled.

Table 5-1 presents a summary of the results of experiments with the

proposed algorithms. The first two columns give the size of instances used. The

remaining columns give results returned by Algorithm 7 and Algorithm 8 under

different policies. Each entry reported in this table represents the averaged

results over 30 instances of the stated size. Each instance was created using

a random generator for connected graphs, first described in (Gomes et al.,

1998). This generator creates graphs with random distances and capacities,

but guarantees that the resulting instance is connected. Destinations were also

defined randomly, with the number of destinations being equal to 40% of the