
A Framework for reliable multicast protocol

University of Florida Institutional Repository

A FRAMEWORK FOR RELIABLE MULTICAST PROTOCOL

By

VENKATA LAKSHMANAN RAMASUBRAMANIAM

A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

UNIVERSITY OF FLORIDA

2002

Copyright 2002 by Venkata Lakshmanan Ramasubramaniam

To My Parents

ACKNOWLEDGMENTS

I would like to express my sincere gratitude to Dr. Richard E. Newman for giving me the opportunity to work with him and for being my advisor. He has been a great source of inspiration and encouragement throughout my stay at the University of Florida. I would also like to thank Dr. Randy Y. C. Chow, Dr. Jonathan C. Liu, and Dr. Michael P. Frank for serving on my committee. Special thanks go to Dr. Christophe Fiorio for generously allowing me to use his LaTeX algorithm template. Thanks go to Mr. Ron Smith for creating this thesis template for the LaTeX community at the University of Florida and for helping me out with the various problems encountered during the formatting. I would like to thank my parents, family, and friends, who have always been there to help me.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF FIGURES
ABSTRACT

CHAPTER

1 INTRODUCTION
    1.1 Motivation
    1.2 Problem Definition
    1.3 Organization of the Thesis
2 BACKGROUND AND PREVIOUS WORK
    2.1 Multicasting
    2.2 Reliable Multicasting
    2.3 Existing Protocols
        2.3.1 Local Group Concept
        2.3.2 Reliable Multicast Transport Protocol (RMTP)
        2.3.3 Tree-based Reliable Multicast Protocol (TRAM)
        2.3.4 Multicast TCP (MTCP)
        2.3.5 Scalable Reliable Multicast (SRM)
        2.3.6 Log-based Receiver-reliable Multicast (LBRM)
        2.3.7 Xpress Transport Protocol (XTP)
    2.4 TCP Congestion Control
    2.5 Existing QoS Approaches
    2.6 IETF Approaches
    2.7 Summary
3 ISSUES IN RELIABLE MULTICASTING
    3.1 ACK Implosion Problem
    3.2 Error Recovery
    3.3 Congestion Control
    3.4 Scalability
    3.5 Fairness
    3.6 Summary
4 PROTOCOL FRAMEWORK
    4.1 Goals
        4.1.1 ACK Handling
        4.1.2 Error Recovery
        4.1.3 Congestion Control
        4.1.4 Scalability and Dynamic Adaptation
    4.2 Architecture
    4.3 Multicast Data Transfer
    4.4 Acknowledgement and Error Recovery Mechanism
        4.4.1 Retransmission by Group Leader
        4.4.2 Retransmission by Source
        4.4.3 Monitoring of Unresponsive Receivers and Group Leaders
        4.4.4 Late Joining Receivers and Data Recovery
    4.5 Congestion Control Mechanism
        4.5.1 Slow Start
        4.5.2 Congestion Control
    4.6 Group Management Schemes
        4.6.1 Group Formation
        4.6.2 Dynamic Reconfiguration of Groups
    4.7 Security Issues
    4.8 Summary
5 FUNCTIONAL MODEL
    5.1 Protocol Components
        5.1.1 Sender Component
        5.1.2 Receiver Component
        5.1.3 Group Leader Component
        5.1.4 Central Logging Server Component
    5.2 Use Case and Sequence Diagrams
    5.3 Flow Diagram
    5.4 Summary
6 CONCLUSION AND FUTURE WORK
    6.1 Conclusion
    6.2 Future Work

APPENDIX

A PSEUDO ALGORITHMS FOR MULTICAST OPERATION
B PSEUDO ALGORITHMS FOR GROUP MANAGEMENT

REFERENCES
BIOGRAPHICAL SKETCH

LIST OF FIGURES

Figure

2.1 Differentiated Service (DS) Field for TOS in IP Header
3.1 ACK Implosion at Multicast Source
3.2 Comparative Analysis of Reliable Multicast Protocols
4.1 Overall Architectural Design of the Protocol
4.2 Multicast Data Transfer
4.3 Local Retransmission of Missing Packets
4.4 Central Logging Server and Monitoring of Unresponsive Members
4.5 Hierarchical Consolidation of Feedback Parameter
4.6 Expanding Ring Structure for Group Membership
4.7 Group Leader's Advertisement for Membership
4.8 Group Leader's Status Message
4.9 Tree Construction for an Application with Less Number of Receivers
4.10 Tree Construction for an Application with Large Number of Receivers
5.1 Functional Components of the Protocol
5.2 Different Messages Used in Protocol Operation
5.3 Use Case and Sequence Analysis for Multicast Data Operation
5.4 Use Case and Sequence Analysis for Acknowledgement
5.5 Use Case and Sequence Analysis for Error Recovery
5.6 Use Case and Sequence Analysis for Group Management
5.7 Sequence Diagram of the Protocol
5.8 Flow Diagram of a Component

Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Science

A FRAMEWORK FOR RELIABLE MULTICAST PROTOCOL

By

Venkata Lakshmanan Ramasubramaniam

December 2002

Chair: Richard E. Newman
Major Department: Computer and Information Science and Engineering

The biggest revolution in networking since the introduction of the World Wide Web (WWW) is Internet Protocol (IP) multicasting. Multicasting is becoming increasingly ubiquitous in today's Internet world. There are many commercial applications, both real-time and non-real-time, that are based on IP multicasting. Many of those applications require reliable delivery of data at the receiver's end for them to be meaningful. Reliable multicast refers to the reliable manner in which a message should reach the group of receivers. It is still an active area of research, and a number of protocols exist. Several issues, such as scalability, error recovery, and congestion control, make the design of a reliable multicast protocol difficult.

This thesis discusses various existing reliable multicast protocols and provides a framework for a scalable, reliable, dynamic multicast transport protocol for point-to-multipoint communication. The proposed protocol provides sequenced, lossless delivery of data from a sender to a set of receivers. The receivers are split into logical groups, each controlled by a special receiver called the Group Leader. Error recovery and retransmissions are performed by the Group Leader, which distributes the processing load of the sender. The logical grouping of the receivers makes the protocol scalable by architectural design. The protocol provides congestion control algorithms and also describes group formation and management techniques. A unique feature of the protocol is the ability of the receivers to reconfigure themselves to suit the network conditions in case of congestion.

CHAPTER 1
INTRODUCTION

Multicasting provides an efficient mechanism for message delivery from a single sender to a group of receivers. Multicasting is done using Internet Protocol (IP) [1], which is an unreliable protocol. Therefore, applications using multicasting that require reliable delivery of data have to use a reliable transport protocol. This is similar to reliable unicast applications using Transmission Control Protocol (TCP) [2] as their transport protocol. Unlike TCP, the design of a transport protocol for multicasting poses several challenges. The key issues include the implosion problem, scalability, error recovery, and congestion/flow control. The presence of multiple receivers causes all the control information from the receivers to flow to the sender, which causes the implosion problem. The operation of the protocol should not deteriorate as the number of receivers increases. Because of the large number of receivers, the recovery time should be minimized, and the protocol should have facilities for congestion and flow control. These topics are discussed in detail in Chapter 3.

There are a number of protocols in existence today for reliable multicasting: Scalable Reliable Multicast [3], Xpress Transport Protocol [4], Log-based Receiver Reliable Multicast Protocol [5], Local Group based Multicast Protocol [6], and Multicast TCP [7]. Some protocols, such as Reliable Multicast Transport Protocol [8], developed by Lucent Technologies, and Tree-based Multicast Protocol [9], developed by Sun Microsystems, are becoming commercially available. Chapter 2 surveys these protocols in detail. Although many transport protocols exist for reliable multicasting, none of them has been standardized so far. The protocols that are currently available have been designed to suit the needs of a custom application; hence there is no generalized protocol that can be used fairly by all applications. Also, most of the protocols do not address important issues such as congestion control, which plays a major role because of the traffic generated by reliable multicasting. Some of the protocols are not scalable to a very large number of receivers, which is required of any multicast application.

1.1 Motivation

The Internet has been emerging as one of the major sources of communication utility. This has had a dramatic effect on the way in which people communicate and share information with each other. A form of group communication called multicasting has led to the development of a number of distributed applications such as video conferencing, distributed interactive simulation, whiteboards, electronic publishing, news/newsgroup services, and digital libraries. Due to the immense requirement for security in the commercial applications of the Internet world, there was a huge amount of research in cryptography, and standards like RSA [10] have been put forth. Similarly, due to the emerging group-communication-based applications mentioned above, there is again a rise in interest in reliability. Most of the real-time distributed applications require reliable delivery of data. Unlike unicasting, in which the data is exchanged between a single source and a receiver, exchanging the data reliably in a multicasting environment with a number of receivers poses many challenges. The protocol's architecture, design, and operation will play a major role in achieving this. Reliable multicast is now an actively researched area in the Internet community. Although a number of reliable multicast protocols exist for research and commercial applications, none of them are standardized.
The reliable multicast protocols in existence today are designed for custom applications. Also, most of the protocols do not address some key issues like congestion/flow control, and some of them are not scalable to a large number of receivers, which typifies multicast applications. Hence there is a need for a reliable multicast transport protocol that addresses all of the issues, such as scalability, error recovery, congestion/flow control, and group management techniques, and that is suitable for a wide variety of applications.

1.2 Problem Definition

This thesis proposes a framework for a reliable multicast transport protocol. Reliable multicast refers to the reliable manner in which a sender sends the data to a set of receivers. The protocol proposed in this framework provides sequenced, lossless delivery of data from a single sender to a set of receivers. The framework addresses all the major issues, such as error recovery, congestion control, scalability, and group management techniques. The architectural design of the protocol makes it scalable to a large number of receivers and aids local error recovery. The logical grouping of the receivers in the protocol makes the error recovery distributed. The logical grouping also aids in combining receivers with similar characteristics: receivers that reside in the same location, receivers that have identical data rates, and receivers that have identical error rates could be grouped together. By making the error recovery distributed, the communication load of the network is greatly reduced. Receivers do not have to send their repair requests all the way up to the source of the multicast tree, and similarly the sender does not have to send the response down the tree to the receiver. The control traffic involved in error recovery is limited to the logical groups. As the receivers get repaired from the Group Leader rather than the sender, the end-to-end latency is greatly reduced. The distributed error recovery also reduces the processing load of the sender, as it is distributed among the Group Leaders. The retransmission performed by the Group Leader is sometimes multicast to the whole group, which slightly reduces the processing load of the Group Leader.
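The effect of distributed error recovery on the sender's load can be illustrated with a small in-memory sketch. It is not the protocol specified in this thesis; the class names `Source` and `GroupLeader`, the `repair` method, and the packet contents are all hypothetical, and the sketch assumes that a Group Leader escalates to the source only when the packet is missing from its own cache.

```python
class Source:
    """Hypothetical sender that holds every packet, keyed by sequence number."""

    def __init__(self, packets):
        self.packets = dict(packets)
        self.repair_requests = 0  # repair requests that reached the source

    def repair(self, seq):
        self.repair_requests += 1
        return self.packets[seq]


class GroupLeader:
    """Hypothetical receiver that caches packets and repairs its own group."""

    def __init__(self, source):
        self.source = source
        self.cache = {}

    def receive(self, seq, payload):
        self.cache[seq] = payload

    def repair(self, seq):
        # Serve from the local cache when possible; escalate to the source otherwise.
        if seq not in self.cache:
            self.cache[seq] = self.source.repair(seq)
        return self.cache[seq]


source = Source({1: b"a", 2: b"b", 3: b"c"})
leader = GroupLeader(source)
leader.receive(1, b"a")
leader.receive(2, b"b")  # assume packet 3 was also lost on the way to the leader

# Three receivers in the group each ask for packets 2 and 3.
repaired = [leader.repair(seq) for _ in range(3) for seq in (2, 3)]
print(source.repair_requests)  # 1: only the first request for packet 3 reached the source
```

Six repair requests are issued inside the group, but only one travels to the source, which is the load reduction the paragraph above describes.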

The protocol responds to the congestion state in the network through TCP-like congestion control algorithms. The protocol is also highly dynamic, with the multicast group receivers adapting themselves to changing network conditions and congestion state. This is done through dynamic reconfiguration of the receivers among the logical groups. The details of the protocol's architecture, design, and operation are discussed in Chapter 4.

1.3 Organization of the Thesis

Chapter 2 discusses multicasting techniques and evaluates the various existing protocols. Chapter 3 discusses the issues pertaining to the design of reliable multicast protocols and analyzes the existing reliable multicast protocols with respect to these issues. Chapter 4 presents the framework for the protocol, discussing its architecture, design, and operation. Chapter 5 presents the functional model of the protocol, and Chapter 6 gives concluding remarks and suggestions for future work.

CHAPTER 2
BACKGROUND AND PREVIOUS WORK

This chapter gives an introduction to reliable multicasting and the prominent works and protocols developed in this area.

2.1 Multicasting

Multicast is an efficient way to transfer data from a source to a group of receivers. Instead of sending a separate copy to each of the receivers, the sender sends a single copy to the network, which then delivers it to all the receivers. Roca et al. have surveyed different multicast technologies [11]. Multicasting is broadly classified into the following types based on the applications [12].

One-to-Many (1toM): The 1toM multicast technique has a single host sending data to more than one receiver. One-to-many applications include scheduled audio/video distribution (lectures, presentations, meetings), push media applications (news headlines, weather updates, sports scores), file distribution and caching, announcements (network time, session information, keys), and monitoring applications (stock prices, sensor equipment, security systems).

Many-to-Many (MtoM): The MtoM multicast technique allows multiple receivers to act as senders too, enabling two-way communication. Many-to-many applications include multimedia conferencing (audio/video, whiteboard), concurrent processing (distributed parallel processing), distance learning, chat groups, distributed interactive simulation, and collaboration (shared document editing).

Many-to-One (Mto1): The Mto1 multicast technique has multiple senders sending data to a single receiver. Many-to-one applications include data collection applications (sensors), auctions, and polling.

With the many multicast applications listed above, the use of IP multicast has grown to a great extent in the past few years. Most of the applications listed above require reliable delivery of data at the receiver's end.

2.2 Reliable Multicasting

In multimedia applications, the loss of some data could be acceptable; the video frames could be sacrificed for the audio information. The main objective of these applications is to guarantee the quality of service at the cost of reliability. But most of the applications discussed above require reliable delivery of data at the receiver end. Adding the word reliable to multicasting imposes several challenges on the way the protocol is designed. In unicast communication between a single source and a receiver, reliability is provided by TCP [2]. TCP provides reliable transmission using feedback from the receiver; the source retransmits the missed packets accordingly. In multicasting, due to the presence of a large number of receivers, closed-loop feedback results in the implosion problem, as discussed in Chapter 3. The term reliable multicasting refers to the reliable manner in which the message should reach the group of receivers. The protocol developed should also be scalable to a large number of receivers.

Earlier multicast protocols were broadly classified as sender-initiated and receiver-initiated [13]. In the case of sender-initiated protocols, the sender maintains considerable state information. All the receivers sending in their ACKs leads to packet implosion: the large number of receivers reporting their control information and acknowledgements to the source virtually overwhelms it. This is called the implosion problem. In the case of receiver-initiated protocols, every receiver maintains state information, thereby shifting the burden from the sender to the receivers. Much comparative analysis has been done on sender-initiated and receiver-initiated reliable multicast protocols.
These analyses have shown that receiver-initiated protocols are more scalable than sender-initiated ones. The protocol framework discussed in this thesis is receiver-initiated with a slight modification. Instead of all the receivers maintaining the state information, only selected receivers called Group Leaders maintain it.

2.3 Existing Protocols

A considerable amount of work has been reported in the literature regarding reliable multicast protocols. Most of the work could be classified into the sender-initiated and the receiver-initiated approaches. Although there are many protocols in the reliable multicast area, this section deals only with the major ones, discussing their highlights. Most of the protocols are based on logical groupings of the receivers. A set of receivers is grouped together to form logical groups that form a hierarchical multicast tree. This architecture improves the scalability of the protocol. Some of the protocols discussed do not form any groupings, and their scalability is limited. Obraczka provides a comparison chart of all the major multicast transport protocols [14].

2.3.1 Local Group Concept

Markus Hofmann proposed a tree-based approach using the so-called Local Group Concept (LGC) [6]. The concept, developed in 1994, has evolved into two separate protocols: Local Group based Multicast Protocol (LGMP) and Local Group Configuration Protocol (LGCP). LGC is one of the few earlier protocols that discussed local logical groupings of receivers. LGMP is also based on the principle of local subgrouping. It has special nodes called group controllers that are responsible for local retransmissions and ACK processing. The selection of the group controller and the management of the local groups are not tasks of a multicast protocol like LGMP. Instead, a separate configuration protocol called the Dynamic Configuration Protocol (DCP) was designed and implemented. Any errors or retransmissions are recovered in the local groups by the group controller.
If none of the group receivers has the data packet, it is requested from the source. Acknowledgement schemes include a positive, a negative, and a novel semi-negative acknowledgement scheme; the last indicates that a data unit has not yet been received correctly but does not request the data unit for retransmission. Error recovery is first performed locally within the groups. Any member of the local group that has the missing packet would retransmit it to the requesting member. A packet is requested for retransmission from the sender only when it is not found with any of the members in the group, including the group controller. LGC defines two different modes of performing local retransmissions: load-sensitive mode and delay-sensitive mode. In load-sensitive mode, retransmissions are performed so as to minimize the network load; the controller withholds the retransmission while waiting for requests from other receivers. In delay-sensitive mode, the retransmissions are performed immediately after the reception of retransmission requests.

The establishment and maintenance of logically structured group hierarchies is left to the Dynamic Configuration Protocol. The major advantage of the DCP is that it can interact with any other protocol that requires a logically structured receiver hierarchy. The DCP and LGC together form the protocol architecture. LGC covers group formation, group characteristics, and dynamic reconfiguration, but there is no mention of any congestion control scheme. The LGMP operates on top of the User Datagram Protocol (UDP) [15] and does not require any changes within internal network equipment such as routers or switches.

2.3.2 Reliable Multicast Transport Protocol (RMTP)

Paul and Sabnani of Bell Laboratories proposed the Reliable Multicast Transport Protocol (RMTP) [8]. RMTP is commercially marketed by Lucent Technologies. In RMTP, the receivers form a dynamic multicast tree with the source at the root of the tree.
RMTP is a protocol for point-to-multipoint reliable multicast. The receivers are grouped logically, with a Designated Receiver (DR) as the representative of each local group. These local groupings of the receivers lead to the formation of several local multicast trees, which together form the global multicast tree. Hence the sender, the receivers, and the designated receivers form the three major entities of RMTP. The leaf receivers periodically send status messages to their DRs. DRs in turn send their status messages to the DRs at the next higher level, and so on until the status message finally reaches the source. Thus there is a hierarchical flow of data in the multicast tree from one level to another. Lost packets are recovered locally, and the DR performs retransmissions through either unicast or multicast. To facilitate error recovery for late-joining receivers, the source and the DRs buffer the data packets for the session. RMTP uses a two-level cache mechanism in which the most recent packets are cached in memory and the rest are cached on disk. Flow control is based on a combination of rate-based and window-based control.

Although the protocol is scalable to a large number of receivers, it does not provide an end-to-end congestion control scheme. The sender only gets feedback from its own children (the DRs) about their receiving status. Hence, the sender has little information about the congestion status of the receivers. When congestion occurs at leaf receivers, it may not be possible for the sender to detect the congestion, especially if the DRs and the leaf receivers do not share the same network path. In this case, the sender will continue to transmit at the same rate, aggravating the existing congestion. RMTP traffic may be completely unresponsive to congestion and may cause congestion collapse. Another drawback of the protocol is its dependence on network equipment such as routers.
The routers have to be modified substantially. Although the information from each receiver is delivered in order, RMTP does not guarantee causal delivery.

2.3.3 Tree-based Reliable Multicast Protocol (TRAM)

TRAM [9] was developed at Sun Microsystems Labs by Kadansky et al. It was designed to support bulk data transfer with a single sender and multiple receivers. Unlike the protocols discussed earlier, TRAM uses dynamic trees in its architecture. These are used for local error recovery and aid scalability to a large number of receivers. The receivers and the sender of the multicast group interact with each other dynamically to form repair groups. The repair groups are linked together hierarchically to form the multicast tree, with the source at the root. A subset of receivers is chosen for the reliable delivery of the data. A special receiver called the repair head is chosen from the subset of receivers and is responsible for local error recovery and retransmission processing. Repair heads may be statically or dynamically selected. The repair head caches the data packets temporarily until all the receivers in the subset receive them. The source of the multicast tree also caches the data packets and retransmits them if requested.

TRAM uses rate-based flow control. The data rate of the sender is dynamically adjusted based on the congestion feedback from the receivers. The repair head also sends control messages such as congestion notifications to its parent; these propagate until they reach the source. Apart from the congestion notifications, each member of the tree periodically sends feedback reports to its repair head. This feedback consists of general information or statistics, which also aids in the construction of the tree. TRAM proposes several optimized tree construction techniques suited to different applications, as well as several tree management techniques for continually optimizing the repair tree.

Though the source and the repair heads cache the data packets for error recovery, TRAM does not support full data recovery. There may be cases where receivers joining late are not able to recover fully because the data is absent at both the repair head and the source, as their buffers are periodically reclaimed. TRAM is currently implemented in Java, and several sample applications (Slinger, Bricks, Stock, and TreeTest) have been developed to test its capabilities.
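The caching behaviour described above, and the reason a late joiner may fail to recover old data, can be sketched as follows. This is an illustration only and not TRAM's actual implementation: the class `RepairHeadCache` and its methods are hypothetical, and the sketch assumes that a buffer is reclaimed as soon as every current member has acknowledged the packet.

```python
class RepairHeadCache:
    """Hypothetical cache that keeps a packet until all current members ACK it."""

    def __init__(self, members):
        self.members = set(members)
        self.pending = {}  # seq -> (payload, set of members that still owe an ACK)

    def store(self, seq, payload):
        self.pending[seq] = (payload, set(self.members))

    def ack(self, member, seq):
        if seq not in self.pending:
            return
        _, waiting = self.pending[seq]
        waiting.discard(member)
        if not waiting:
            del self.pending[seq]  # every member has the packet: reclaim the buffer

    def retransmit(self, seq):
        entry = self.pending.get(seq)
        return entry[0] if entry else None


cache = RepairHeadCache(members={"r1", "r2"})
cache.store(1, b"first")
cache.store(2, b"second")
cache.ack("r1", 1)
cache.ack("r2", 1)  # both members have packet 1, so it is dropped
cache.ack("r1", 2)  # r2 has not acknowledged packet 2 yet

print(cache.retransmit(2))  # b'second': still cached for r2
print(cache.retransmit(1))  # None: already reclaimed, so a late joiner cannot get it here
```

A receiver that joins after packet 1 has been reclaimed finds nothing to recover at this repair head, which mirrors the limitation on full data recovery noted above.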

2.3.4 Multicast TCP (MTCP)

Rhee et al. [7] proposed Multicast TCP (MTCP), a point-to-multipoint protocol developed mainly to provide a detailed congestion control scheme for reliable multicast. The protocol describes a detailed congestion control technique similar to TCP congestion control methods. MTCP also supports a tree-like hierarchical structure. The receivers are not grouped locally. The multicast group is a hierarchical tree with the sender forming the root of the tree and the receivers forming the other nodes. The sender multicasts the data to all the receivers, and the latter send acknowledgements to their parents in the tree. The internal nodes are called Service Agents (SAs). The leaf receivers send ACKs to the SAs immediately above them, which in turn send them to the SA at the level above, and so on until they reach the source. The receivers send either a positive acknowledgement (ACK) or a negative acknowledgement (NACK). Received packets are reported in ACKs and missing packets are reported in NACKs. The service agents are responsible for handling the feedback generated by their children and for retransmitting lost packets.

MTCP provides several features for congestion control. The receivers send a consolidated congestion status report up the hierarchy towards the source. A new concept of relative time delay is introduced to overcome the difficulty of calculating the round-trip time. Flow control is window-based, which allows the sender to control the number of packets it multicasts to the group. MTCP also incorporates a selective acknowledgement scheme at the service agents to prevent independent packet loss from reducing the sender's transmission rate. Each service agent maintains a TCP-like congestion window, which operates in a manner similar to the standard TCP congestion control algorithms [16].
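As a reminder of what a TCP-like congestion window does, the following minimal sketch shows slow start, congestion avoidance, and multiplicative decrease in their textbook form. It is not MTCP's exact algorithm: the class `CongestionWindow`, the initial threshold, the floor of two packets, and the choice to resume from the halved window after a loss are assumptions made for illustration.

```python
class CongestionWindow:
    """Simplified TCP-style window, counted in packets (illustrative only)."""

    def __init__(self, ssthresh=8.0):
        self.cwnd = 1.0
        self.ssthresh = ssthresh

    def on_ack(self):
        if self.cwnd < self.ssthresh:
            self.cwnd += 1.0              # slow start: one packet per ACK
        else:
            self.cwnd += 1.0 / self.cwnd  # congestion avoidance: about one packet per window

    def on_loss(self):
        # Multiplicative decrease: halve the window, with an assumed floor of two packets.
        self.ssthresh = max(self.cwnd / 2.0, 2.0)
        self.cwnd = self.ssthresh


window = CongestionWindow(ssthresh=4.0)
for _ in range(3):
    window.on_ack()       # 1 -> 2 -> 3 -> 4 during slow start
after_slow_start = window.cwnd
window.on_ack()           # 4 -> 4.25 in congestion avoidance
after_avoidance = window.cwnd
window.on_loss()          # ssthresh and cwnd both become 2.125
print(after_slow_start, after_avoidance, window.cwnd)  # 4.0 4.25 2.125
```

In a tree such as MTCP's, one such window would be kept per service agent and driven by the ACKs and NACKs of its children; how the per-agent windows are combined into the sender's rate is not shown here.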
MTCP provides a congestion control scheme similar to the TCP congestion control mechanisms. This is achieved through the hierarchical congestion status reports sent by the leaf receivers to the SAs above them. Each SA monitors the congestion level of its children by independently maintaining a dynamic congestion window using the ACKs and NACKs received from them. The status reports eventually reach the source, the root of the tree, which regulates its transmission rate based on this summary. MTCP also uses the relative time delay (RTD) concept, which overcomes the difficulty of estimating round-trip times in tree-based multicast environments. Unlike TCP, which uses feedback from a single receiver to estimate the round-trip time, MTCP is an open-loop system with multiple receivers. MTCP measures the difference between the clock value taken at the sender when a packet is sent and the clock value taken at the SA when the corresponding ACK is received. This time difference is called the relative time delay.

The protocol is not scalable to a large number of receivers because of its architecture. Also, MTCP does not describe in detail the formation of the multicast tree and its management techniques. MTCP primarily describes a congestion control mechanism for reliable multicasting.

2.3.5 Scalable Reliable Multicast (SRM)

SRM, proposed by Floyd et al. [3], is a reliable multicast framework for light-weight sessions and application-level framing developed at the Lawrence Berkeley Labs (LBL). Although the framework has been designed for a wide range of applications, it has been primarily prototyped in wb, a distributed whiteboard application. SRM is not hierarchical, and hence its scalability is very limited. Whenever a member generates new data, the data are multicast to the whole group. Each member is individually responsible for detecting packet loss and requesting retransmission.
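Receiver-side loss detection of this kind is commonly done by watching for gaps in sequence numbers. The sketch below illustrates that general idea and is not taken from SRM itself: the function `detect_losses` is hypothetical, and it assumes a single source whose packets are numbered consecutively from 1.

```python
def detect_losses(received_seqs):
    """Return (missing, highest) after processing packets in arrival order."""
    highest = 0      # highest sequence number seen so far (numbering assumed to start at 1)
    missing = set()  # sequence numbers skipped over and not yet repaired
    for seq in received_seqs:
        if seq > highest:
            # Every number between the previous highest and this one was skipped.
            missing.update(range(highest + 1, seq))
            highest = seq
        else:
            missing.discard(seq)  # a late or retransmitted packet fills its gap
    return sorted(missing), highest


missing, highest = detect_losses([1, 2, 5, 3, 7])
print(missing, highest)  # [4, 6] 7
```

A gap only reveals losses that are followed by a later packet, so a receiver relying on this alone would not notice that the final packets of a burst are missing.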
To prevent the implosion of control packets sent from receivers in a multicast group, it adopts a mechanism similar to XTP [4], by which the control packets are multicast to the whole group. As with the original data, repair requests and retransmissions are always multicast to the whole group. Other members that are missing the same packet hear that request and suppress their own requests. This prevents a request implosion. A similar technique is adopted to prevent response implosion. The repair requests differ from the traditional NACK in that they are not addressed to a specific sender and they request data by its unique name. In SRM, each member also multicasts periodic session messages that report the sequence number state for active sources. The session messages are also used to determine the current participants of the session.

From the architectural design of SRM, it can be seen that it is not scalable to a large number of receivers. The error recovery scheme is totally different from the other protocols discussed earlier. Separate mechanisms for the suppression of request and response implosion have to be provided, which could otherwise be avoided by a proper architectural design. Also, the design of SRM results in a problem called the crying baby problem. If a single link connecting a member of the group has a very high error rate, then the member sends out repair requests to the whole multicast group and receives one or more responses, so the whole group carries repair traffic caused by one lossy link. Congestion in one part of the group will lead to this problem.

2.3.6 Log-based Receiver-reliable Multicast (LBRM)

The Log-based Receiver-reliable Multicast protocol [5] is a reliable multicast protocol suited for high-performance simulation applications, particularly Distributed Interactive Simulation (DIS). It was developed to suit applications that have requirements such as wide-area data distribution, low latency, and packet loss detection and recovery. Holbrook et al. give detailed discussions of various applications of LBRM, such as traffic reports, file caching, and stock quote dissemination.
In LBRM, a logging server provides the reliability by logging all transmitted packets from the source. Any receiver that missed a data packet requests it directly from the logging server. The logging server plays the role of the retransmission buffer that any transport protocol keeps to serve retransmissions. The protocol is receiver-reliable in the sense that each receiving application defines its own reliability requirements. The sender merely sends the data and makes it possible, via the logging server, for the receiver to retrieve lost packets.

The source includes a sequence number in each packet and defines a Maximum Idle Time (MaxIT) bound. The source guarantees that it will transmit a packet at least once every MaxIT interval. Even if the application does not provide data, the protocol keeps sending special packets, called keep-alive or heartbeat packets, that repeat the previous sequence number without the associated data.

LBRM also proposes the concept of distributed logging. The receivers are grouped together to form sites. Every site has a secondary logging server, in addition to the primary logging server located near the multicast source. The presence of the secondary logging servers makes the error recovery distributed. Each secondary logging server is responsible for the retransmission of data packets at its site. This reduces the end-to-end propagation delay. The secondary logging servers can in turn recover missing packets from the primary logging server. The designs of many reliable multicast protocols with local error recovery, including the framework described in this thesis, are based on the architecture of LBRM.

2.3.7 Xpress Transport Protocol (XTP)

The Xpress Transport Protocol [4] is a high-performance transport protocol designed to meet the needs of distributed, real-time, and multimedia systems in both unicast and multicast environments. Although XTP was first established in 1987, it has since been revised in several versions.

The important features of XTP are the connection paradigms, control algorithms, and group membership controls. XTP designers chose a unique connection-oriented multicast paradigm whereby XTP sets up a one-to-many simplex connection with a set of receivers. The functionality found in point-to-point unicast connections is extended in XTP. With the control algorithms, XTP provides its users with the ability to enable or disable the error-rate and flow-control procedures on a connection-by-connection basis. XTP designers have experimented with two methods for control algorithms: a heuristic algorithm for timer-based processing of control information, and explicit processing of the state information from each receiver in the multicast group.

XTP is principally a sender-based reliable protocol and it supports both unicast and multicast. Control packets are unicast to the sender and there is no mechanism to prevent control packet implosion. Retransmitted packets are multicast and filtered at the sites. XTP supports a number of multicast group management techniques. Though it has support for flow and error control, it uses the mechanisms defined for XTP unicast.

2.4 TCP Congestion Control

This section describes the TCP congestion control mechanisms that have been standardized and researched widely. Early implementations of TCP/IP used the go-back-N model without any modern congestion control algorithms. They used the cumulative acknowledgement mechanism for acknowledging packets and the expiration of the retransmit timer for resending packets lost in the network. These implementations did little to reduce congestion.
Van Jacobson, one of the greatest pioneers of congestion control in the Internet, proposed a series of algorithms for congestion control in TCP that eventually became standardized. The algorithms are discussed briefly as follows:

Slow start. The slow start algorithm uses a new variable called the congestion window (cwnd). It operates by observing that the rate at which new packets should be inserted into the network is the rate at which acknowledgements are received from the other end. The sender can send at most the minimum of cwnd and the receiver's advertised window (rwnd). For each ACK the sender receives, cwnd is increased by one segment. Increasing cwnd by one segment for every ACK results in an exponential increase of cwnd over round trips.

Congestion avoidance. Congestion avoidance uses another variable called the slow start threshold (ssthresh). It indicates the correct window size depending upon the network load. Initially, the slow start phase begins. As long as cwnd is less than ssthresh, slow start continues. Once cwnd crosses ssthresh, the "congestion avoidance" phase starts. For each ACK received, cwnd is increased by 1/cwnd segments. When the sender times out waiting for an ACK, ssthresh is set to the minimum of cwnd/2 and the receiver's advertised window. The cwnd is then reset to one segment and the slow start phase begins again. Slow start remains active as long as cwnd is less than ssthresh; otherwise, the congestion avoidance phase begins.

Fast retransmission. When the TCP receiver receives an out-of-order segment, it immediately sends a duplicate acknowledgement. The duplicate ACK indicates the next segment the receiver is expecting and asks the sender to transmit it. A duplicate ACK may be sent due to the loss of a segment or due to re-ordering of segments. If the duplicate ACK was due to re-ordering of segments, then there may be only one or two duplicate ACKs.
If there are three or more duplicate ACKs, the cause is most likely a missing segment, and the sender retransmits the missing segment without waiting for the retransmit timer to go off. TCP then reduces ssthresh to cwnd/2 and resets cwnd to one segment.
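The window rules for slow start, congestion avoidance, timeout, and fast retransmission can be summarised as a small state machine. The following is a simplified sketch, not a real TCP implementation: the class name `TahoeSender`, the initial `ssthresh` and `rwnd` values, and the choice to count the window in whole segments are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class TahoeSender:
    """Toy model of the window rules described in the text (units: segments)."""

    cwnd: float = 1.0        # congestion window
    ssthresh: float = 64.0   # slow start threshold (initial value assumed)
    rwnd: float = 64.0       # receiver's advertised window (assumed constant)
    dup_acks: int = 0        # duplicate ACKs seen since the last new ACK

    def send_window(self) -> float:
        # The sender may send at most min(cwnd, rwnd).
        return min(self.cwnd, self.rwnd)

    def on_new_ack(self) -> None:
        self.dup_acks = 0
        if self.cwnd < self.ssthresh:
            self.cwnd += 1.0              # slow start: one segment per ACK
        else:
            self.cwnd += 1.0 / self.cwnd  # congestion avoidance: 1/cwnd per ACK

    def on_timeout(self) -> None:
        # Timeout: ssthresh = min(cwnd/2, rwnd), then restart slow start.
        self.ssthresh = min(self.cwnd / 2.0, self.rwnd)
        self.cwnd = 1.0
        self.dup_acks = 0

    def on_dup_ack(self) -> bool:
        """Return True when the third duplicate ACK triggers a fast retransmit."""
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.ssthresh = self.cwnd / 2.0
            self.cwnd = 1.0
            self.dup_acks = 0
            return True
        return False


if __name__ == "__main__":
    sender = TahoeSender(ssthresh=8.0)
    for _ in range(10):
        sender.on_new_ack()
        print(f"cwnd={sender.cwnd:.3f} ssthresh={sender.ssthresh:.3f}")
```

Feeding the model one ACK at a time shows cwnd growing by a full segment per ACK until it reaches ssthresh, and by only 1/cwnd per ACK afterwards.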

Fast recovery. The fast recovery algorithm prevents the communication path from going empty after fast retransmit. Therefore, there is no need to slow start from the beginning after the fast retransmit. Fast recovery keeps track of the number of duplicate acknowledgements and tries to estimate the amount of outstanding data in the network. It increases cwnd by one segment for each duplicate acknowledgement received, thus maintaining the flow of traffic. The sender comes out of fast recovery when it receives an ACK for the segment whose loss resulted in the duplicate acknowledgements. TCP then deflates the window by returning it to ssthresh and enters the congestion avoidance phase.

Slow start, congestion avoidance, and fast retransmission together form TCP Tahoe; TCP Reno additionally includes the fast recovery mechanism.

2.5 Existing QoS Approaches

The Internet today provides only best-effort service. As discussed in the introduction, with the growing demand for real-time and non-real-time multicast applications, the demand for quality of service (QoS) has greatly increased. Although the reliable transport protocol discussed in this document aims at providing reliability, it would be even better if the network layer were to combine with the transport protocol to enhance it. The IETF has proposed a number of mechanisms to meet the demand for QoS. The popular ones are Integrated Services/RSVP, Differentiated Services, Multiprotocol Label Switching (MPLS), traffic engineering, and constraint-based routing [17]. Here, the QoS model of Differentiated Services, and how it can be incorporated, is discussed in detail.

Integrated Services/RSVP. The Integrated Services/RSVP model proposes two service classes in addition to the best-effort service: Guaranteed Service and Controlled-load Service.
RSVP (Resource ReSerVation Protocol) is a signaling protocol that reserves resources in the network for flows. Routers in the network use the PATH and RESV messages to reserve resources. Differentiated Services, MPLS, and other QoS mechanisms are replacing the once-popular Integrated Services, largely because Integrated Services places a huge burden on the routers due to the amount of state information and processing capability it requires. High storage and processing overhead are its major drawbacks.

Differentiated Services. Unlike RSVP, there is no signaling mechanism in Differentiated Services, thus eliminating the QoS setup costs. The QoS requirements are conveyed by redefining the TOS (Type of Service) field in the IP header as a field called the Differentiated Services (DS) field. The elements of the DS field are shown in Figure 2.1. Applications make use of the DS field to mark packets according to their requirements. It is the job of the DiffServ architecture to deliver the packets to the receiver application. Different multicast applications have different requirements, such as time or delay sensitivity. Hence they mark the DS field of their packets as per their requirements and the DiffServ architecture takes care of the rest.

Figure 2.1: Differentiated Service (DS) Field for TOS in IP Header (diagram of the IP header: IHL, DS field consisting of the DiffServ field and unused bits, Total Length, Identification, Flags, Fragment offset, TTL, Protocol, Header checksum, Source address, Destination address, Option + Padding, Data)

The DiffServ architecture consists of entities such as routers that perform the following functions: classifying, metering, shaping, and dropping. The customer/client should have an agreement, called the Service Level Agreement (SLA), with the Internet Service Provider (ISP) to receive Differentiated Services. Customers mark the DS fields of individual packets to indicate the desired service. Edge routers of the network classify, police, and shape the packets according to the SLA. Core routers just forward the packets based on the marking of the DS field. The various functions performed by the router are discussed below.

Classifying: The edge router looks at each packet and identifies the flow to which it belongs. It does this by looking at the DS field.

Metering: After classifying the flow, its resource consumption should be measured. Measuring the traffic is also important for billing.

Shaping: The flow may occasionally include bursts that must be absorbed, and the packets must be paced. It is up to the router to decide on the mechanism for handling the burst. It may hold or even drop the packets in the burst that exceed a particular threshold.

Dropping: Whenever the flow exceeds the SLA, a router may choose to drop packets. Depending upon the type of service subscribed, Assured Service or Premium Service, the packets are handled accordingly. Queue management schemes such as Random Early Detection (RED) and RED with In and Out (RIO) are employed for dropping packets. In RED, packets are dropped randomly. RIO is a modified RED algorithm. RIO maintains two RED algorithms, one for the in packets and another for the out packets. There are two thresholds for the queue. When the queue size is below the first threshold, no packets are dropped.
When the queue size is between the two thresholds, only the out packets are dropped randomly. In extreme congestion, when the queue size exceeds the second threshold, both the out and in packets are dropped randomly.

Hence multicast applications could specify the level of reliability required through the marking of the DS field. In addition to the transport-level reliability provided by the protocol, enhanced reliability could be provided by the network layer using Differentiated Services as discussed above.

2.6 IETF Approaches

The Internet Society (ISOC) is a global not-for-profit membership organization founded in 1991 to provide leadership in Internet-related standards, education, and policy development. The Internet Engineering Task Force (IETF) is an organization which is maintained by ISOC. The IETF is a large, open, international community of network designers, operators, vendors, and researchers concerned with the evolution of the Internet architecture and the smooth operation of the Internet. The IETF is divided into various working groups (WGs) that work on different areas of the Internet, working towards its standardization.

Although a number of reliable multicast transport protocols have been developed, most of them are used in research, while some have been released as commercial products. None of the reliable multicast transport protocols has been standardized. The Reliable Multicast Transport (rmt) working group is chartered to standardize reliable multicast transport protocols for "one-to-many bulk data" transport. This group works closely with other IETF groups such as the Secure Multicast research group (smug) and the Multicast Security working group (msec). The rmt working group is currently working on the following three protocol instantiations:

NORM: NACK-Oriented Reliable Multicast protocol, which uses NACKs for reliability;
TRACK: TRee ACKnowledgement-based protocol, which uses a tree structure for controlling feedback and repairs;
ALC: Asynchronous Layered Coding, which uses Forward Error Correction (FEC) techniques and does not require any feedback.

A number of Internet drafts and RFCs have already been published in the above-mentioned areas. The much-researched areas in reliable multicasting within the IETF include congestion/flow control measures, efficient retransmission, and acknowledgement aggregation. There are numerous publications in the literature on these topics. The IETF has framed some technical criteria [18] for reliable multicast transport protocols.
The criteria include addressing the following key issues:

Scalability: The ability of the protocol to accommodate a large number of senders and receivers, and the mechanisms that limit the scalability;
Congestion Control: Description of the congestion control mechanisms that are incorporated and their behavior during congestion;
Error Recovery and Robustness: Description of how the protocol handles packet loss and node/link failures;
Security and Privacy concerns: Analysis of security at the senders, receivers, routers, and retransmission sources, along with data integrity and authentication.

The framework proposed in this thesis addresses all of the issues mentioned above except the security aspects. The protocol is scalable to a large number of receivers by its architectural design. Logical grouping of receivers localizes the error recovery, and the framework also includes a congestion control scheme that operates on congestion algorithms similar to those of TCP. The security issues are mentioned only briefly and not discussed in detail.

2.7 Summary

Due to the strong application demand for reliable multicasting, there is huge interest in its standardization. It therefore continues to be an active research area in the Internet community. This chapter discussed the various multicasting techniques and their applications. Reliable multicast protocols may be broadly classified as sender-initiated or receiver-initiated protocols. The chapter also reviewed the various reliable multicast transport protocols in the literature. The LGMP, RMTP, and TRAM protocols share the technique of local grouping of receivers in their designs. The local error recovery design of most reliable multicast protocols was inspired by the design of LBRM. Although LBRM proposes a distributed logging system with the receivers grouped as a site, it does not mention any group management techniques like those in TRAM. RMTP groups the receivers locally, leading to the formation of several local multicast trees, which together form a global multicast tree. Group management techniques are not discussed in detail in RMTP either. The local groups of receivers have a special receiver, called the repair head, responsible for retransmissions and recovery. TRAM is rich in group management schemes and has techniques for continually optimizing the multicast tree structure. MTCP was developed primarily to implement a congestion control strategy for reliable multicasting. XTP, the sender-based protocol, was developed for distributed, real-time, and multimedia systems. SRM is not hierarchical and does not provide logical groupings either. The architectural design of SRM makes its scalability very limited.

The next chapter discusses the various issues in reliable multicasting that make the standardization process difficult. In Chapter 4, the framework of the protocol is proposed.

CHAPTER 3
ISSUES IN RELIABLE MULTICASTING

Design of a reliable multicast protocol is very complex when compared to the design of unicast communication protocols. There are several issues to be dealt with in group communication. Listed below are some of the important problems encountered with reliable multicasting.

3.1 ACK Implosion Problem

This is the most important problem to be dealt with in the design of a reliable multicast transport protocol. The sender receives feedback or acknowledgements from the receiver to update and control its regulation parameter for reliability and congestion control. In unicast communication, the sender receives its feedback from the single receiver present. In the case of multicasting, the feedback messages from the multiple receivers may converge on the source. This is called the feedback implosion problem.

There are a few ways to deal with this problem. Some of the existing protocols propose that a negative acknowledgement (NACK) be sent to the sender instead of a positive ACK, so that the sender receives only a limited number of responses from the receivers. If the multicast tree is large and the error probability is relatively high, there would be many losses and the NACKs sent would still cause implosion. Hence there has to be some way of suppressing the NACKs too. Ways to deal with the implosion problem include local grouping schemes and suppression techniques. Figure 3.1 depicts a typical case of the ACK implosion problem. The implosion at the source could also be caused by NACKs. Hence it is called, in general, the feedback implosion problem rather than the ACK implosion problem.
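To make the scale of the problem concrete, the following sketch compares the expected number of feedback messages per data packet under ACK-based and NACK-based schemes. It assumes that every receiver loses a packet independently and with the same probability; that assumption and the function name `expected_feedback` are illustrative and are not taken from any of the protocols discussed.

```python
def expected_feedback(num_receivers: int, loss_prob: float) -> dict:
    """Expected feedback messages reaching the source for one data packet.

    Assumes each receiver loses the packet independently with the same
    probability (an illustrative simplification).
    """
    if not 0.0 <= loss_prob <= 1.0:
        raise ValueError("loss_prob must be between 0 and 1")
    return {
        # ACK scheme: every receiver that got the packet acknowledges it.
        "acks": num_receivers * (1.0 - loss_prob),
        # NACK scheme: only receivers that missed the packet report.
        "nacks": num_receivers * loss_prob,
    }


if __name__ == "__main__":
    for n, p in [(10_000, 0.01), (10_000, 0.20)]:
        print(n, p, expected_feedback(n, p))
```

Under this model the NACK count still grows linearly with the group size, so a large group with a high loss probability can overwhelm the source even without positive ACKs, which is why NACK suppression is needed as well.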

LGMP, RMTP, and TRAM, discussed in the previous chapter, group the receivers to form small local groups. The receivers in a local group send their ACKs to the entity that controls the group and not to the sender. Hence the sender is not bombarded by ACKs when the data are received properly. In all three protocols, the ACK implosion problem is avoided by the design of the protocol. MTCP has a hierarchical structure with the receivers arranged in a tree-like fashion. There is no grouping concept in MTCP and the receivers send their ACKs to their immediate parent. The parents send their ACKs to their own parents, and so on. Hence, the source gets ACKs only from the level-one receivers, and this avoids the ACK implosion problem.

SRM does not provide grouping or any hierarchical structure for the receivers. SRM does not ACK data packets, but NACK packets are multicast to everyone. Since a NACK is sent only for a missing packet, it does not always lead to the implosion problem. In LBRM too, no ACK packets are sent to the source. LBRM sends the NACK packets specifically to the source. XTP unicasts the control packets to the source and there is no mechanism to avoid implosion at the source.

Figure 3.1: ACK Implosion at Multicast Source (diagram: the multicast source sends multicast data to the multicast receivers, and each receiver returns a data packet acknowledgement to the source)
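The effect of sending ACKs only to the immediate parent can be illustrated by counting how many ACKs arrive at the source with and without a hierarchy. This is a minimal sketch; the tree representation (a dictionary mapping each node to its children) and the function name `feedback_fan_in` are assumptions made for the example, not part of MTCP or any other protocol above.

```python
from collections import deque


def feedback_fan_in(children: dict, source: str) -> dict:
    """Compare ACK load at the source for flat versus tree-based feedback.

    `children` maps each node to the list of nodes directly below it.
    """
    receivers = 0
    max_fan_in = 0
    queue = deque([source])
    while queue:
        node = queue.popleft()
        kids = children.get(node, [])
        max_fan_in = max(max_fan_in, len(kids))
        receivers += len(kids)
        queue.extend(kids)
    return {
        # Flat scheme: every receiver ACKs the source directly.
        "flat_acks_at_source": receivers,
        # Tree scheme: only the level-one receivers ACK the source.
        "tree_acks_at_source": len(children.get(source, [])),
        # Largest number of ACKs any single node has to absorb in the tree.
        "max_acks_at_any_node": max_fan_in,
    }


if __name__ == "__main__":
    tree = {"S": ["A", "B", "C"]}
    for parent in ("A", "B", "C"):
        tree[parent] = [f"{parent}{i}" for i in range(4)]
    print(feedback_fan_in(tree, "S"))
```

In the tree case the load at any one node is bounded by its number of children rather than by the size of the whole group.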

3.2 Error Recovery

There are usually two types of error recovery [19]: centralized error recovery (CER) and distributed error recovery (DER). In CER, the retransmissions are performed by the source of the multicast tree. CER is also referred to as source-based recovery. In DER, all the members of the multicast group perform the retransmissions. Hence the burden of error recovery processing is distributed from the source to all the members of the multicast tree. DER is found to outperform CER, because the source may not always have sufficient processing power or buffer space to support error recovery, especially when the number of receivers is very large, which is typical of most multicast applications. Also, for reasons such as fault tolerance, distributed error recovery is recommended over centralized error recovery [19]. Kasera et al. have shown that local recovery has the potential to provide significant performance gains in terms of reduced bandwidth and delay and higher throughput [20].

LGMP, TRAM, and RMTP provide a distributed error recovery mechanism through the local grouping of receivers. In LGMP, any member of the group performs retransmissions, and the packet is requested from the source only when none of the members has it. In TRAM, the repair head performs the retransmissions to the group members, whereas a designated receiver is responsible for error recovery in RMTP. MTCP has a hierarchical structure of receivers in which each internal node is called a Service Agent (SA). SAs are responsible for error recovery, and an SA requests a packet from its parent if it does not have it. In SRM, the repair or retransmission requests are multicast to the whole group. All the members hear the request, and any other member that is missing the same packet suppresses its own request on hearing the same request from another member.
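The request suppression just described can be sketched as follows. The text does not say how members decide when to send, so this example assumes that each member missing the packet waits for its own delay before multicasting a request and stays silent if it hears another request first. The single uniform propagation delay and the function name `requests_sent` are likewise assumptions made for illustration.

```python
import random


def requests_sent(timers: list, propagation_delay: float) -> int:
    """Count repair requests actually multicast for one lost packet.

    `timers` holds the delay each member waits before asking for a repair.
    """
    if not timers:
        return 0
    first = min(timers)
    # A member stays silent only if the first request reached it before
    # its own timer fired.
    return sum(1 for t in timers if t == first or t < first + propagation_delay)


if __name__ == "__main__":
    rng = random.Random(1)
    timers = [rng.uniform(0.0, 1.0) for _ in range(100)]
    print("without suppression:", len(timers))
    print("with suppression:", requests_sent(timers, propagation_delay=0.01))
```

Duplicate requests remain possible whenever two timers fire within one propagation delay of each other, so in this model the benefit depends on how widely the delays are spread.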
Retransmissions are also multicast to the whole group, and on seeing a response, other members suppress their own responses. LBRM proposes two types of error recovery. The protocol has a logging server adjacent to the sender, and all the receivers request missing packets from the logging server. LBRM also provides distributed logging by grouping a set of receivers into a "site". Every site has a secondary logging server that provides the error recovery for the receivers in that site. Hence, in distributed logging, the recovery becomes localized. In XTP, the repair requests and retransmissions are multicast to the whole group, similar to SRM.

3.3 Congestion Control

Congestion control mechanisms in multicasting remain one of the most widely researched areas. Unlike the unicast case, in multicast, where multiple receivers are involved, effective congestion control relies on accurate and timely feedback on the prevalent network conditions. The challenge lies in how economically, speedily, and accurately the feedback information is collected. Golestani and Sabnani [21] have discussed in detail the various issues to be considered by a multicast congestion control scheme.

The two major components in the congestion control structure are the regulation parameter and the regulation algorithm. A regulation parameter is a parameter by which the flow of traffic onto the network is regulated; it may be either the rate at which data is transmitted or the window size. A regulation algorithm is the algorithm by which the regulation parameter is adjusted. In the standard TCP congestion control mechanism, the regulation parameter is the window size and the regulation algorithm is the combination of the standard congestion control algorithms proposed by Van Jacobson, such as slow start and congestion avoidance. Most congestion control schemes today adopt either the transmission rate or the window size as the regulation parameter.
There are difficulties in extending window-based regulation to multicast communications because of concerns like the ACK implosion problem. The major drawback in implementing rate-based regulation is the necessity of calculating the receiver round-trip times. The measurement of receiver round-trip times is fundamentally different from, and more complex than, that in unicast communications. In unicast communications, where there is a single receiver, the round-trip time is measured easily. In multicast communication, with multiple receivers involved, the issue arises of measuring it for each receiver. In multicast protocols with logical groupings, the receivers must measure their distances from the leader of the group.

The main difference between the congestion control schemes implemented in unicast and multicast communication is the place at which the regulation algorithm is run. Accordingly, congestion control schemes may be categorized as source-driven or receiver-driven. In source-driven unicast communication schemes, the task of updating the regulation parameter is left to the source. The source, upon getting feedback from the receiver, is the ideal site to run the regulation algorithm. For multicast communication, Golestani and Sabnani [21] discuss the following problems.

In multicast sessions, due to the presence of a large number of receivers, the complexity of performing the traffic regulation is also increased. Unlike in unicast, if the execution of the regulation algorithm is left solely to the source, the source's processing capability could severely limit it.

In the protocol discussed in this thesis, many retransmissions are not performed by the source; rather, they are performed by the Group Leader. Hence the loss information pertaining to the various receivers is not always available to the source.

In the current hierarchical local grouping architecture, the number and identity of the receivers are not known to the source.

The congestion control decisions are not always implemented by the source.
If a receiver decides to drop out of the group, the receiver alone must implement it.

Another important factor to be taken into account is that the natural approach for congestion control is to adopt TCP's congestion control algorithms for reacting to network congestion. Floyd et al. [22] have proposed a guideline as follows. For any link, the traffic arrival rates for a flow should respond to congestion in a way that is no more aggressive than multiplicative decrease, additive increase, with increase and decrease rates that give behavior that is no more aggressive than current implementations of TCP.

Considering the above-mentioned factors, the real need in congestion control for reliable multicast is to shift the tasks as much as possible to the receivers. It should be a receiver-driven approach, with the receivers sending feedback about congestion to the source. The control algorithms should also be TCP-friendly for the reasons discussed above.

The architectural design of the multicast protocol has a major effect on the design of the algorithms that aid in sending feedback from the receivers to the sender. Golestani and Sabnani have proposed a hierarchical consolidation of receiver feedback in the multicast tree. Each receiver is responsible for consolidating the feedback received from its immediate children and sending the result upward to its parent. At each level inside the tree, the receiver computes an aggregate feedback parameter based on the feedback received from the level below it. The computed aggregate feedback parameter is then sent upward towards the parent until it finally reaches the source.

LGMP does not mention any congestion control mechanisms. TRAM specifies ways in which the receivers in the group send notification about congestion to the source. Upon reception of the congestion notification, the source reduces its transmission rate. TRAM specifies a minimum and maximum transmission rate within which the protocol operation takes place. On reception of congestion reports, TRAM reduces its transmission rate by 50
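The hierarchical consolidation of feedback and a rate-based regulation algorithm at the source can be sketched together. This is a minimal illustration under stated assumptions: the aggregate feedback parameter is taken to be the worst loss rate in a subtree, and the source applies a generic multiplicative-decrease, additive-increase update bounded by a minimum and maximum rate. The function names, the choice of the maximum as the aggregate, and every constant are hypothetical and do not describe the Golestani and Sabnani scheme or TRAM itself.

```python
def consolidate(node, children, local_loss):
    """Aggregate feedback for the subtree rooted at `node`.

    Each node combines its own observed loss rate with the aggregates
    reported by its immediate children and passes one value upward.
    Taking the maximum (worst case) is an assumption made for this sketch.
    """
    worst = local_loss.get(node, 0.0)
    for child in children.get(node, []):
        worst = max(worst, consolidate(child, children, local_loss))
    return worst


def adjust_rate(rate, aggregate_loss, *, loss_threshold=0.05,
                decrease_factor=0.5, increase_step=50.0,
                min_rate=100.0, max_rate=2000.0):
    """Rate update at the source; all constants are illustrative."""
    if aggregate_loss > loss_threshold:
        # Multiplicative decrease, never below the minimum rate.
        return max(min_rate, rate * decrease_factor)
    # Additive increase, never above the maximum rate.
    return min(max_rate, rate + increase_step)


if __name__ == "__main__":
    children = {"S": ["A", "B"], "A": ["r1", "r2"], "B": ["r3"]}
    local_loss = {"B": 0.03, "r1": 0.01, "r2": 0.08, "r3": 0.02}
    aggregate = consolidate("S", children, local_loss)
    print("aggregate feedback at source:", aggregate)
    print("new rate:", adjust_rate(1000.0, aggregate))
```

Because each node forwards a single value, the source handles one feedback parameter per child regardless of how many receivers sit below that child.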

3.4 Scalability

With multimedia applications growing tremendously, the number of receivers in multicast groups can be expected to increase proportionately. Hence the protocol should be designed in such a way that it can handle the increase in the number of receivers and still provide the same level of service. Also, multicast receivers should be able to join or leave the group whenever they wish, and the source should not even be aware of these activities. The architectural design of the protocol plays a major role in deciding its scalability.

LGMP, TRAM, and RMTP combine a set of receivers to form local groups. Due to the logical grouping of receivers, these protocols are scalable to a large number of receivers. MTCP does not provide any grouping of receivers, but the receivers form a hierarchical tree-like structure with the sender at the root of the tree. The scalability of MTCP is limited due to its hierarchical nature. SRM provides neither grouping nor a hierarchical structure for the receivers. The receivers just form one group under the sender, which limits its scalability very much. SRM was mainly developed for the "white board" application, where the number of receivers is limited. Distributed logging in LBRM is scalable to a large number of receivers, as they are grouped together to form "sites". XTP does not provide any grouping of receivers and its scalability is limited.

3.5 Fairness

A particular concern for the developers of reliable multicast protocols is the impact of reliable multicast traffic on other traffic in the Internet at times of congestion, in particular the effect of reliable multicast traffic competing with TCP traffic [18]. The protocol should address the fairness issue. There are many possible ways to define fairness. One type of fairness is global fairness.
Under this denition, eac h en tit y has an equal claim to the net w ork's scarce resources. The m ulticast proto col should ensure fair sharing of net w ork resources with other

PAGE 39

30 w ell-b eha v ed proto cols. It should ensure fairness with other m ulticast and unicast trac. The m ulticast proto col should b eha v e and bac k o in a w a y similar to TCP in case of congestion. This is called TCP friendliness. Hence the reliable m ulticast proto col should b e designed in suc h a w a y that it is fair. LBRM (Secondary) LBRM (Primary) Protocol Scalability ACK Implosion Error Recovery Congestion Control TCP XTP SRM MTCP RMTP TRAM LGMP DERDERDERDERDERDERDER CER Handles Well Suffers from this Not Applicable DER Distributed Error Recovery CER Central Error Recovery Figure 3.2: Comparativ e Analysis of Reliable Multicast Proto cols 3.6 Summary This c hapter discusses v arious imp ortan t issues in v olv ed in reliable m ulticasting. The implosion problem is a primary problem to b e addressed in a reliable m ulticast proto col. The ac kno wledgemen t sen t b y m ultiple receiv ers should not b om bard the sender. The arc hitectural design of the proto col has a ma jor impact up on the implosion problem. The error reco v ery proto col could b e cen tralized or distributed.


Distributed error recovery reduces the load on the sender and provides efficient retransmission mechanisms. The reliable multicast protocol should respond to the congestion state of the network through congestion control algorithms. The multicast protocol should also be scalable to a large number of receivers. From the above issues it can be seen that the design of a reliable multicast protocol is complex when compared to the design of unicast protocols. The architectural design of the protocol plays an important role and has compounding effects on issues such as the implosion problem, error recovery and scalability.

This chapter also evaluated the various reliable multicast protocols in the literature with respect to issues such as the implosion problem, error recovery, congestion control and scalability. Figure 3.2 shows the comparative analysis of LGMP, TRAM, RMTP, MTCP, SRM, LBRM, and XTP. The analysis also depicts the relation of TCP to the reliable multicast protocols.


CHAPTER 4
PROTOCOL FRAMEWORK

4.1 Goals

4.1.1 ACK Handling

One of the major design decisions of any reliable multicast protocol is how to overcome the ACK implosion problem. The protocol framework discussed in this thesis has an architecture designed in a way that avoids the problem automatically. The multicast receivers are grouped locally, with a local leader called the Group Leader (GL) for each group, who is in charge of error recovery and local retransmissions. Instead of sending positive ACKs back to the source, the receivers send only NACKs to the Group Leader whenever they detect the loss of a packet. Hence the source is never flooded or imploded with NACKs either. All it receives is the ACKs from the Group Leaders. This way the ACK implosion problem is mitigated.

4.1.2 Error Recovery

As the multicast receivers are grouped locally, error recovery is taken care of by the Group Leader, which makes it Distributed Error Recovery (DER) [19]. The Group Leader locally handles any retransmissions. Hence the error recovery time is greatly reduced, as the retransmission requests do not have to propagate all the way up to the source every time a packet is lost. This results in speedy recovery and distribution of the processing throughout the multicast tree. The end-to-end delay of the retransmission request and repair is reduced.

4.1.3 Congestion Control

Many of the earlier protocol implementations for reliable multicasting do not specifically address congestion control techniques. The protocol framework proposed in this thesis follows the TCP congestion control algorithms closely. The algorithms are implemented and handled locally in the groups by the Group Leader. The Group Leaders maintain a TCP-like congestion window (cwnd). A Group Leader increments the cwnd by one only when it receives an ACK from each of its children. Until then, the Group Leader has to buffer the packets. As soon as the ACKs are received, the buffers are released. The algorithms are similar to the standard TCP congestion control algorithms such as slow start and congestion avoidance. Details of the algorithm are specified in Section 4.5. The protocol also proposes a scheme in which congestion feedback is propagated up the multicast tree hierarchy.

4.1.4 Scalability and Dynamic Adaptation

Since the receivers are grouped locally with a leader acting on their behalf, the protocol is scalable to a large number of receivers. A new receiver who wishes to join the multicast could join any of the existing local groups, or it could form its own group, announcing itself as the Group Leader. Hence the multicast tree grows hierarchically in groups as the receivers increase in number. Also, a receiver may leave its group any time it wishes, and all this may happen without the knowledge of the multicast source.

The most important feature of this protocol is the dynamic way in which the members of the group may re-arrange themselves in response to changing network conditions. The Group Leaders exchange special control messages that enable them to do so. Some control messages are multicast to the group too, so that the members of the local groups also have updated information about the state of the multicast group.

4.2 Architecture

The basic architectural design of the protocol is shown in Figure 4.1. The multicast receivers are grouped locally to form small groups called the local groups.


Each group contains a special node called the Group Leader (GL). The main motives behind the local groupings are to increase the scalability of the protocol and to localize error recovery. The Group Leader is responsible for acknowledgement processing and buffering of the packets. Group Leaders also handle processing of the control messages from the multicast receivers, used for congestion control and group management, and pass these messages up to the multicast source. As error recovery is localized within each of the local domains, the Group Leader is responsible for local retransmissions. As it contains the buffered data, it is able to retransmit the data requested by any of its children. In all, the Group Leader plays a major role in the functioning of the protocol.

In Figure 4.1, the circled dots represent the logical local groups. The buffering at the Group Leaders is only temporary, whereas the central logging server adjacent to the source provides permanent buffering. In addition to multicasting the packet to the group, the source sends the packets to the central logging server, which retains them permanently.

All the local groups together form a tree-like hierarchical structure as shown in Figure 4.1. Hence the source of the multicast tree is directly connected to the Group Leaders, and the Group Leaders in turn are connected to other Group Leaders and the multicast receivers. The Group Leaders receive ACKs and NACKs from their children: the multicast receivers in the local group and the Group Leaders attached below, if any. The source receives ACKs and NACKs from its immediate children, which are Group Leaders themselves. Due to this hierarchical structure, the protocol is scalable to a very large number of receivers.
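The temporary buffering role of a Group Leader described above can be illustrated with a minimal sketch. It assumes that packets are identified by an integer sequence number and that the Group Leader knows its set of children; the class and method names are hypothetical and are not part of the protocol specification.

```python
class GroupLeaderBuffer:
    """Illustrative temporary packet buffer held by a Group Leader.

    All names here are hypothetical; the thesis does not define this API.
    """

    def __init__(self, children):
        self.children = set(children)  # local receivers and child Group Leaders
        self.packets = {}              # seq -> payload still buffered
        self.pending = {}              # seq -> children that have not ACKed yet

    def on_data(self, seq, payload):
        """Buffer a packet that the source multicast to the tree."""
        self.packets[seq] = payload
        self.pending[seq] = set(self.children)

    def on_ack(self, child, seq):
        """Record an ACK; return True if the packet was released from the buffer."""
        waiting = self.pending.get(seq)
        if waiting is None:
            return False               # unknown or already released packet
        waiting.discard(child)
        if waiting:
            return False               # some child has not ACKed yet
        del self.pending[seq]
        del self.packets[seq]          # every child ACKed: release the buffer
        return True

    def on_nack(self, seq):
        """Return the payload for a local retransmission, or None if the
        packet is no longer buffered (the request must then go elsewhere,
        e.g. to the parent or the central logging server)."""
        return self.packets.get(seq)
```

A packet therefore stays available for local repair exactly as long as at least one child has not acknowledged it, which is why unresponsive children matter for buffer occupancy.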
4.3 Multicast Data Transfer

The following sections discuss in detail the various functions of the protocol, such as the multicast data transfer, acknowledgement and local retransmission schemes, congestion control and the group management techniques.


Figure 4.1: Overall Architectural Design of the Protocol. [Figure not reproduced; labels: Multicast Source, Central Logging Server, Group Leaders, Local Groups, Group Receivers.]

The architecture of the protocol is a hierarchical tree-like structure with local groupings, as discussed in the previous section. This would avoid the end-to-end delay of transmission and retransmission.

The multicast data transfer works as follows. The source, or sender, first multicasts the data globally to all the members of the multicast tree, including the Group Leaders and the individual receivers. The Group Leaders that are logically connected to the source send their ACKs and control messages directly to it. Each Group Leader receives ACKs and NACKs from its children: the receivers in the local group and any Group Leaders attached directly below it. The receivers do not bypass the Group Leader to send ACKs directly to the source. Hence the source is never bombarded by ACKs from the receivers, which avoids the ACK implosion problem.


The Group Leader locally retransmits any missing data units requested. If the Group Leader does not have the data packet, it requests retransmission from its parent Group Leader. The Group Leader either multicasts or unicasts the missing data, and it must also buffer the packets until it receives an acknowledgement from all of its children. The source multicasts data units as long as it has room in its sending window. The flow of data packets and acknowledgements is shown in Figure 4.2.

Figure 4.2: Multicast Data Transfer. [Figure not reproduced; labels: Source, Group Leader, Receiver, Multicast Data Message, Unicast Ack Message.]

The multicast data transfer from the source to the multicast group, and the whole multicast operation, takes place in the following order.

1. The source multicasts the data unit globally to the entire multicast tree, including the Group Leaders, the central log server and the receivers.
2. The Group Leaders attached directly to the source send their ACKs and NACKs to the source.
3. The central log server, which is also attached directly to the source, sends its ACKs and NACKs to it.


4. The Group Leaders receive the ACKs and NACKs from their children, which may include the local receivers and any Group Leaders attached directly below them.
5. The source performs retransmissions to its immediate children, the Group Leaders as well as the central log server.
6. The Group Leaders perform local retransmissions to their children.
7. The central log server performs any retransmissions if requested.
8. Steps 1 through 7 are repeated for protocol operation.

4.4 Acknowledgement and Error Recovery Mechanism

In his paper [19], Nonnenmacher has shown that for Distributed Error Recovery protocols, performance is high with a local multicast retransmission and local feedback-processing scheme. As discussed in the previous sections, one of the major problems encountered in reliable multicast design is the ACK implosion problem. This problem is unavoidable in sender-initiated reliable multicast protocols, wherein all the receivers send their positive ACKs to the sender, virtually overwhelming it. The protocol discussed in this thesis is a receiver-driven approach with a slight modification. Instead of each receiver maintaining the state information, the Group Leader alone maintains the state information for the local group it coordinates.

The acknowledgement scheme of the protocol works as follows. There are two types of acknowledgements: the positive ACK, which indicates correct receipt of a packet, and the NACK, the negative acknowledgement, which indicates the absence of a data packet. As mentioned in the multicast data transfer section, the source transmits the data packet to the whole group. After receipt of the packets, the Group Leaders that are directly attached to the source send their ACKs to the source. Hence the source is not bombarded by ACKs from all the receivers and the implosion problem is avoided.
A Group Leader receives the ACKs from the receivers in its group and also from the other Group Leaders attached directly below it, if any.

The protocol framework proposed in this thesis is designed for sequenced, lossless delivery of data. Missing packets are detected based on the sequence number of the data packet. Both the Group Leaders and the receivers send a NACK if they miss any data packets. The Group Leaders and the source buffer the packets in order to retransmit any missing packets. Thus there is a hierarchy of flow in the multicast tree between the source, Group Leaders and the receivers. In essence, the receivers cannot bypass the Group Leaders to reach the source. Unlike SRM [3], where the NACKs are multicast to everyone, here a receiver sends its NACKs only to its Group Leader.

Figure 4.3: Local Retransmission of Missing Packets. [Figure not reproduced; labels: Group Leader, Local group receiver, NACK, Retransmission of missing packet, Unicast Retransmission, Multicast Retransmission.]

The Group Leader and the source receive a NACK for the missing data packets. After the reception of a NACK, both the Group Leader and the source perform retransmissions. The following sections deal with the retransmissions in detail.

4.4.1 Retransmission by Group Leader

Retransmission may be done by the Group Leader in two different ways: unicast and multicast. Both retransmission techniques are depicted in Figure 4.3.
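As a rough illustration of how a Group Leader might pick between the two modes, the sketch below assumes that the Group Leader collects NACKs for one wait interval, counts the distinct requesters per sequence number, and compares the count with a per-packet threshold. The function name, the tuple format and the treatment of a count exactly equal to the threshold (unicast) are assumptions made for the example, not details fixed by the protocol.

```python
def choose_retransmissions(nacks, threshold):
    """Plan local repairs for the NACKs gathered in one wait interval.

    nacks: iterable of (receiver, seq) pairs (hypothetical format).
    threshold: NACK count above which a packet is multicast (assumed rule).
    Returns a list of (mode, seq, target); target is None for multicast.
    """
    requesters = {}
    for receiver, seq in nacks:
        requesters.setdefault(seq, set()).add(receiver)

    plan = []
    for seq in sorted(requesters):
        who = requesters[seq]
        if len(who) > threshold:
            # Many receivers lost the packet: one multicast to the local group.
            plan.append(("multicast", seq, None))
        else:
            # Few losses: repair each requester individually.
            for receiver in sorted(who):
                plan.append(("unicast", seq, receiver))
    return plan
```

Counting distinct requesters rather than raw NACKs keeps a single receiver that repeats its request from forcing a group-wide multicast.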


Unicast retransmission mechanism. If the children receive the packets properly, they send an ACK to the Group Leader. The Group Leader releases a packet from the buffer when it receives an ACK for the packet from all of its children. The Group Leader retransmits the missing packet locally to a receiver upon receiving a NACK from it. The Group Leader maintains a certain threshold for the number of NACKs it has received from its children for a particular packet. As long as the number of NACKs received for a packet is below the threshold, the Group Leader unicasts the missing packet to the requesting receiver. End-to-end propagation delay is greatly reduced due to the hierarchical structure: the receiver need not send the retransmission request all the way to the source and wait for the packet to make its way all the way back from the source, but instead sends its NACK to its Group Leader and receives the retransmission from it.

Multicast retransmission mechanism. The Group Leader waits for a certain interval of time before it serves the retransmission request. If the number of retransmission requests for a data packet is found to exceed the threshold, then instead of unicasting the missing packet to the individual receivers, the Group Leader multicasts the packet to the whole group. Hence the local traffic caused by the retransmission requests is reduced by this multicast retransmission mechanism. Because of local recovery, the repair requests are not sent all the way up to the source of the multicast tree. The control traffic is greatly reduced through the local recovery mechanism. When the retransmission request is sent by a Group Leader attached directly below the source, the retransmission is always unicast.

4.4.2 Retransmission by Source

The Group Leaders attached directly to the source send a NACK to the source if they miss any data packets.
The source buffers the data packets until it receives an ACK from each of the Group Leaders attached directly to it. The source retransmits any missing data packet from its buffer when it receives a NACK from a Group Leader attached to it.

The other important concerns of the recovery mechanisms are the monitoring of unresponsive receivers and Group Leaders, and late-joining receivers and data recovery. To ensure reliable delivery of data, the Group Leader buffers the packets until it receives an ACK from each of its children. So it is important for a Group Leader to monitor the receivers in its local group continuously. If some of the receivers become unresponsive, the Group Leader will not get an ACK for a particular packet, and the packet remains in the buffer indefinitely. Therefore it is up to the Group Leaders to monitor the receivers. There is also a chance that the Group Leader itself may become unresponsive. In that case, the receivers cannot get their retransmissions from the Group Leader and have to get them from another Group Leader, the sender or the central logger. The monitoring mechanisms adopted by the Group Leaders and receivers are discussed next.

4.4.3 Monitoring of Unresponsive Receivers and Group Leaders

Monitoring for unresponsive receivers and Group Leaders is achieved using a control packet called GL_ALIVE, a Group Leader alive message. The Group Leader sends the GL_ALIVE packet periodically to its local group. Reception of the GL_ALIVE message by the local receivers indicates that the Group Leader is operational. If a receiver does not receive the GL_ALIVE packet for a certain interval, it indicates the absence of the GL_ALIVE in the ACK packet it sends to the Group Leader. If three such ACK messages go unanswered, the receiver infers that the Group Leader is down and joins a different Group Leader. The methods by which receivers subscribe to a different Group Leader are discussed later in the group management section.
If the Group Leader receives an ACK packet indicating the non-reception of GL_ALIVE, it unicasts a GL_ALIVE to the requesting receiver. After the reception of the unicast GL_ALIVE message, the receiver sends an ACK indicating its reception, remains attached to the Group Leader and continues its regular operation.

The Group Leader expects an ACK message within a certain interval from each receiver. If the Group Leader does not receive an ACK message from a receiver for three such intervals, it unicasts a GL_ALIVE message to that receiver, indicating the absence of the ACK message. If the Group Leader does not get a response to the unicast GL_ALIVE message, it assumes that the receiver is down and releases the data in the buffer that it was withholding for that receiver. No further repairs are served for that member. The member has to join again, either with the same Group Leader or a different one, if it wishes to participate in the multicast operation in the future.

4.4.4 Late Joining Receivers and Data Recovery

Since the protocol discussed in this thesis allows receivers to join at any time during the multicast session, receivers joining the group late have to be updated with the data packets sent since the start of the session and catch up with the rest of the group members. Also, because of the highly dynamic nature of the protocol, the receivers in a local group could change their membership to different groups. This may require them to catch up with the current group data reception. There are two ways in which this could be achieved: (1) from the buffer at the Group Leader or the source and (2) directly from the central logging server for complete data recovery.

Buffer at the group leader and source. The Group Leader buffers each data packet sent by the source until it receives ACKs from all of its children for that packet, after which it deletes the buffer entry for the particular packet.
The newly joined member could recover the data from the buffer held by the Group Leader by requesting it. If the Group Leader has already deleted from its buffer the data needed by the new member, only partial recovery is possible. Hence it is the duty of the joining member to ensure that it has all the packets that have been acknowledged by all the members of the group it is joining. If the new member has not yet received some of these packets, it is its responsibility to finish all the pending transactions with the old Group Leader. Alternatively, if the new joining member cannot find the data in the new Group Leader's buffer, it can request the same directly from the central logging server, discussed as follows.

Figure 4.4: Central Logging Server and Monitoring of Unresponsive Members. [Figure not reproduced; labels: data logged into the Central Logging Server (CLS); retransmission request and reply directly with the CLS; GL_ALIVE packets from the Group Leader.]

Complete data recovery from central logging server. There is a central logging server attached to the source of the multicast group, as in Figure 4.4. It buffers the data on disk as the source multicasts the data. Unlike the data buffered by the source or the Group Leaders, the data in the central logging server is never deleted.


Therefore any late-joining member that cannot recover fully from the Group Leader's buffer can request the packets directly from the central logging server. Hence complete recovery is possible. The central logging server also serves as a fault-tolerance mechanism for the entire multicast session by buffering the data permanently.

4.5 Congestion Control Mechanism

Section 3.3 discussed the challenging factors involved in the design of congestion control in a multicast protocol. The congestion control for the protocol framework is designed as follows. There are two phases involved in the whole process: slow start and congestion control. In the slow start phase, the protocol tries to find an appropriate operating point. As the packet flow gradually increases and the network is subjected to load, the congestion control phase starts to operate. Both topics are discussed in detail below.

4.5.1 Slow Start

The approach taken in this scheme is similar to and motivated by the congestion control strategy used in MTCP [7]. The source uses a congestion window, cwnd, to reduce the data transmission rate when experiencing congestion. Each Group Leader too maintains a congestion window, similar to the TCP congestion window. The source and the Group Leaders maintain their congestion windows using TCP congestion control mechanisms such as slow start and congestion avoidance. The manner in which the congestion algorithms differ is that a congestion window at a Group Leader or the source is incremented only when it receives ACKs from all its children. Hence the window size is incremented linearly as and when the ACKs are received from the children. This operation continues as long as the congestion window size is below the slow start threshold.
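The window update rule can be sketched as follows, under stated assumptions: the window is counted in packets, it grows only once every child has acknowledged a given sequence number, and above the slow start threshold it follows the standard TCP congestion avoidance rule of adding 1/cwnd per fully acknowledged packet. The class name, the default threshold and the initial window of one packet are hypothetical choices for the example.

```python
class TreeCongestionWindow:
    """Illustrative TCP-like window for a source or Group Leader."""

    def __init__(self, children, ssthresh=8.0):
        self.children = frozenset(children)
        self.cwnd = 1.0           # assumed initial window, in packets
        self.ssthresh = ssthresh  # hypothetical slow start threshold
        self._acked = {}          # seq -> children that have ACKed so far

    def on_ack(self, child, seq):
        """Register an ACK and return the (possibly updated) window."""
        acked = self._acked.setdefault(seq, set())
        acked.add(child)
        if not self.children <= acked:
            return self.cwnd      # still waiting for at least one child
        del self._acked[seq]
        if self.cwnd < self.ssthresh:
            self.cwnd += 1.0              # slow start: one packet per full ACK
        else:
            self.cwnd += 1.0 / self.cwnd  # congestion avoidance (standard TCP)
        return self.cwnd
```

Because the window advances only on the slowest child's ACK, the sending rate is paced by the worst path below the node; the handling of losses and window reduction is deliberately left out of this sketch.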
If the size exceeds the threshold, the protocol enters the congestion avoidance phase, and the congestion window is incremented by 1/cwnd each time a new packet is acknowledged by all of its children. Also, as discussed in the previous section, the local receivers send NACKs for the missing packets and the Group Leaders immediately retransmit the packets reported missing. It can be seen that the slow start phase discussed for this protocol is very similar to Van Jacobson's [16] standard congestion control algorithm.

4.5.2 Congestion Control

After the initial phase of slow start, congestion could occur and would be detected by the receivers, the Group Leaders, or the source itself. Accordingly, the receivers and the Group Leaders report the congestion through congestion reports up the hierarchy so that the source acts to reduce it. The strategy adopted is similar to and motivated by the congestion control scheme used in TRAM [9].

Congestion at receivers. Receivers detect congestion according to missing packets. If a receiver detects that the number of missing packets in an ACK window has grown beyond a certain threshold, it sends a congestion message to its Group Leader. The congestion message includes the highest sequence number of the packets received. Upon receiving the congestion message, the Group Leader in turn forwards it to the Group Leader above it, or to the source if it is attached to the source directly.
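The receiver-side check just described can be sketched as below. The sketch assumes an ACK window is a contiguous range of sequence numbers and that the congestion message is a small record carrying the highest sequence number received; the function name, the field names and the "CONGESTION" label are hypothetical.

```python
def congestion_report(received_seqs, window_start, window_size, loss_threshold):
    """Return a congestion message if too many packets are missing, else None.

    received_seqs: sequence numbers received so far (any iterable of ints).
    window_start, window_size: the ACK window under inspection (assumed contiguous).
    loss_threshold: number of missing packets that is still tolerated.
    """
    received = set(received_seqs)
    window = range(window_start, window_start + window_size)
    missing = [seq for seq in window if seq not in received]
    if len(missing) <= loss_threshold:
        return None
    return {
        "type": "CONGESTION",                        # hypothetical message label
        "highest_seq": max(received, default=None),  # highest packet received
    }
```

A Group Leader receiving such a record would simply forward it to its parent, or to the source if it is attached to the source directly.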
Congestion at group leaders. Group Leaders detect congestion when their buffer begins to fill up. The buffer space fills up when a receiver in the local group fails to acknowledge data packets: packets accumulate in the buffer as long as the Group Leader does not receive the acknowledgements from the receivers. The buffer has a high threshold limit, and the Group Leader temporarily increases the high threshold limit set for the buffer when it cannot remove any more packets from it. Meanwhile, a congestion message is sent up the hierarchy indicating congestion. If the buffer is filled even after the high threshold is increased and the buffer size is reached, the Group Leader starts to drop new packets. The temporary increase of the threshold limit of the buffer is to allow some time for the Group Leader to send a congestion message up the hierarchy.

Congestion at source. The source of the multicast group too maintains a buffer to retransmit any missing packets for the Group Leaders attached immediately to it. As in the above-mentioned situation, the source detects congestion if its buffer begins to fill up when a Group Leader fails to acknowledge data packets. Similar to the Group Leaders, the source increases the high threshold limit for its buffer. Meanwhile it reacts to the congestion by reducing the data rate, as if it had received a congestion message. After the high limit of the threshold is reached and the buffer is full, it blocks any new data from the application and attempts to solicit an ACK from the Group Leaders that are causing the buffer to fill up. If the Group Leaders do not respond quickly, the source prunes them. There is a significant policy choice here: whether to prune the slow receivers or to slow down to the speed of the slow receivers. If there is just one receiver that affects the rate of the multicast group, it is pruned, and if there are a number of receivers that dictate the rate, the sender is slowed down.

Figure 4.5: Hierarchical Consolidation of Feedback Parameter. [Figure not reproduced; it shows Group Leaders GL_i, GL_j, GL_k and GL_c with feedback parameters F_i, F_j, F_k and F_c.]

Apart from the congestion report sent by the receivers and Group Leaders when encountering congestion, the protocol also sends consolidated congestion control feedback to the source of the multicast group at regular intervals so that it can regulate the flow of transmission accordingly. Section 3.3 discussed the hierarchical consolidation of the feedback from the receivers up the multicast tree proposed by Golestani and Sabnani [21]. Every receiver calculates an aggregate feedback parameter based on the feedback received from the receivers at the lower level. This technique is modified to suit the requirements of this protocol framework. It is the job of every Group Leader to calculate the consolidated feedback of its children and the Group Leaders below it. After calculating the consolidated feedback at its level, it sends the same to the Group Leader above its level. If a particular Group Leader does not have another leader attached below (leaf node), it just passes its feedback up the tree. Also, if a Group Leader is independently attached to the source of the tree, it sends the feedback directly to the source. This is depicted in Figure 4.5.

The feedback parameter $f_j$ denotes the highest packet sequence number that could arrive at $j$ in the case of window-based congestion control. If $f_j$ is the feedback parameter of a node $N_j$, then the consolidated feedback at the current level is calculated as follows.

$$f_j = \min \{\, f_k \mid N_k \text{ is a child of } N_j \,\}$$

The consolidated feedback at the current level, $f_j$, is then propagated to the level above it, if any, or to the source directly.

4.6 Group Management Schemes

This section deals with the various group management schemes involved in the protocol operation. The architecture of the protocol is a hierarchical tree-like structure with local group formation. This necessitates effective organization and management of the groups. Issues such as formation of the group, dynamic configuration of the members and group termination are discussed as follows.
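Because the group hierarchy managed in this section is also the path along which congestion feedback travels, it may help to see the consolidation rule of Section 4.5.2 (each node reports the minimum of its children's feedback parameters) applied to a small tree. The sketch assumes the tree is given as a parent-to-children mapping and that only leaf receivers carry a directly measured feedback value; the node names and numbers are invented for illustration.

```python
def consolidate(node, children, leaf_feedback):
    """Consolidated feedback of `node`: the minimum over all its children.

    children: dict mapping a node to the list of its children (assumed format).
    leaf_feedback: dict mapping each leaf receiver to its own feedback value,
                   i.e. the highest sequence number that could arrive there.
    """
    kids = children.get(node, [])
    if not kids:
        return leaf_feedback[node]  # a receiver reports its own value
    return min(consolidate(k, children, leaf_feedback) for k in kids)


# Hypothetical tree: GL_i sits below the source; GL_j and GL_k sit below GL_i.
tree = {
    "source": ["GL_i"],
    "GL_i": ["GL_j", "GL_k", "r1"],
    "GL_j": ["r2", "r3"],
    "GL_k": ["r4"],
}
feedback = {"r1": 40, "r2": 37, "r3": 42, "r4": 39}
```

With these values, `consolidate("GL_k", tree, feedback)` is 39 and `consolidate("source", tree, feedback)` is 37, so the source is paced by the slowest receiver anywhere in the tree. In a real deployment each Group Leader would compute only its own level and send the result upward, rather than recursing over the whole tree as this sketch does.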


4.6.1 Group Formation

There are two ways in which the local groups are formed: the expanding ring structure [6] and the advertisement method.

Figure 4.6: Expanding Ring Structure for Group Membership. [Figure not reproduced; panels (a) through (d) show a new joining member.]

The expanding ring structure is depicted in Figure 4.6 and works as follows. Any new member who wishes to join the multicast group sends out a GROUP_LEADER_SEARCH request message with a limited scope distance. The limited distance is achieved by setting a small TTL value in the request packet. If there is any Group Leader in the vicinity, it responds by sending a GROUP_LEADER_AVAILABLE response message. If the searching member does not receive any response from any Group Leader, it starts searching again, this time covering more distance. This is done by increasing the TTL value in the GROUP_LEADER_SEARCH packet. On receipt of a response from a Group Leader, the new member may or may not wish to join the group maintained by that Group Leader: the number of members in the group may already be reaching the MAXIMUM_GROUP_MEMBERS limit, or there may be a "better" group to join. This process continues until a suitable Group Leader is found and the new member finds itself a place in a local group.

Another method by which a new member joins a local group is the advertisement method, as shown in Figure 4.7.

Figure 4.7: Group Leader's Advertisement for Membership. [Figure not reproduced; panels "Before Joining" and "After Joining"; labels: Advertising Group Leader, Scope of advertisement messages, New members wishing to join.]

In the advertisement method, the Group Leaders periodically send out a JOIN_MY_GROUP advertisement message to the group-specific multicast address. Nodes who wish to join the multicast group listen for advertisement messages. The JOIN_MY_GROUP message contains information such as the distance of the Group Leader from the multicast source, the number of members already present in the group, data rate, delay, bandwidth, throughput and its error probability. The new member then calculates its distance from the Group Leader using the information in the Group Leader's advertisement message. It may or may not wish to join the group, according to the parameters specified in the JOIN_MY_GROUP message. As all the Group Leaders send the JOIN_MY_GROUP message, a node is able to learn of all the Group Leaders present nearby and to gain partial information about the multicast tree. The time interval at which the JOIN_MY_GROUP message is sent should be chosen in such a way that the control message traffic generated by the Group Leaders does not itself lead to congestion.

In both methods, the new member that wishes to join the group sends an INTERESTED_IN_JOINING message to the Group Leader. The Group Leader responds positively with an ACK_JOIN message if it can accommodate the new member. Otherwise, the Group Leader sends a NACK_JOIN message to the requesting member, indicating its inability to accommodate it. In both methods, if a new joining member cannot find a suitable Group Leader, it may announce itself as a new Group Leader, leading to the formation of a new group. The new Group Leader could attach itself either directly to the source or to another Group Leader.

It should be noted that the advertisement method causes more control traffic than the expanding ring search method. Whenever the load of the network is low, the advertisement method is adopted, and when the load increases, group formation switches to the expanding ring search method.

4.6.2 Dynamic Reconfiguration of Groups

The local groups are able to re-organize themselves, with the receivers changing to different groups according to the current network or congestion conditions. This is one of the most powerful features of the protocol proposed in this thesis. Dynamic reconfiguration could be achieved in different ways: receivers shifting groups, Group Leaders shifting groups, and the termination of receivers and Group Leaders. The following sections discuss all these aspects in detail.

Re-affiliation of members. The members of the multicast groups can rearrange themselves by changing their membership to other local groups. This could happen for reasons such as the current group performing poorly, or the termination of the Group Leader or of the whole group. The re-affiliation can be categorized as follows.

Receivers shifting local groups. Receivers could shift to a different group as follows. Every Group Leader sends out a special control message called GL_STAT_MSG to every adjacent local domain group. The extent to which this control message is received is limited because distant receivers would not, in practice, be willing to change to a group that is far away. Hence this message is sent such that only the adjacent group members receive it. This is taken care of by limiting the TTL field of the control message.

Figure 4.8: Group Leader's Status Message

All the members of the multicast group, viz., source, Group Leaders and the individual receivers, receive the control message sent by the Group Leader. The message contains mostly the same information as the Group Leader's advertisement message, pertaining to a particular group's characteristics. These

include information on its distance from the source, data rate, delay, bandwidth, throughput, error probability, number of receivers in the group, and number of packets actively held in the buffer, which would be a very good indicator of the congestion state of a particular group.

On receipt of the GL_STAT_MSG, the individual receivers may shift to a different group because of the better service offered by the advertising Group Leader. Sending an INTERESTED_IN_JOINING to the Group Leader does this. The Group Leader responds positively with an ACK_JOINING message or negatively with a NACK_JOINING message. A receiver that wishes to change groups thus directly contacts the concerned Group Leader and gets its consent to join that group as a new member.

Since the GL_STAT_MSG is very similar to the Group Leader's advertisement message, the Group Leader does not send out the status message when the group formation technique is through advertisement. Hence when the load of the network is low, the advertisement message acts as a status message too. This reduces the control traffic flow within the multicast group.

The drawback of this method is that all the receivers that are in a bad state will try to move into the healthy group. The Group Leader therefore maintains a threshold and does not permit oversubscription, that is, membership beyond the threshold. Membership is allocated on a first-come, first-served basis.

Shifting of group leaders. It is possible that two Group Leaders could swap their local group leadership. This would happen when the processing capability and memory of a Group Leader are not sufficient to support its current members. Since the GL_STAT_MSG is received by the Group Leaders too, they could also contact their peers and initiate the swapping process, if desired. Sending an INTERESTED_IN_JOINING to the Group Leader does this. The Group Leader responds positively with an ACK_JOINING message or negatively with a NACK_JOINING message. It is also possible that the Group Leader of a particular

local group could swap its leadership with a local receiver of that group itself, for the reasons mentioned earlier.

Termination of group leader. A Group Leader might wish to terminate its operation at some point. This could happen when it wishes to leave the multicast group. Because of the termination of the Group Leader, some other node must be elected as the new Group Leader. Thus the Group Leader election process is invoked. There are two ways in which the Group Leader could be elected. The easiest method is for the old Group Leader to select a receiver in the local group to become the new Group Leader. The candidate receiver should have sufficient processing capability and memory to take on the role. In the other method, a newly joining member of the multicast group could take over the group leadership if the above conditions are satisfied. In both cases, the parent of the Group Leader is informed about the newly elected Group Leader. For security issues such as key management, the source should always have knowledge of the Group Leaders. Under such conditions, the source is informed of the termination or election of Group Leaders.

Termination of local group. All or a large number of receivers could move to a different group because of the congestion state of the group. This shifting of the receivers could happen due to the GL_STAT_MSG exchange mentioned earlier. If a majority of the receivers shift to different groups, the current membership of a group could become very thin. In such a case, the Group Leader sheds all the remaining receivers and terminates itself. When the Group Leader detects that the GROUP_MEMBER count is less than a threshold, it informs all the members about the termination of the group. The Group Leader receives a TERMINATE_ACK from all the members before terminating itself. This leads to the shutting down of the whole group.
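The group termination steps above can be sketched as a small state machine. This is an illustration under stated assumptions rather than the thesis's algorithm: the class `LocalGroup`, its state names, and the numeric threshold are hypothetical, while the rule (notify all members once the count drops below a threshold, then wait for every TERMINATE_ACK) follows the text.

```python
class LocalGroup:
    """Hypothetical bookkeeping for a Group Leader shutting down a thin group."""

    def __init__(self, members, threshold):
        self.members = set(members)
        self.threshold = threshold   # assumed minimum viable GROUP_MEMBER count
        self.awaiting_ack = set()    # members that still owe a TERMINATE_ACK
        self.state = "ACTIVE"

    def member_left(self, node):
        """Remove a departed member; return the members to notify, if any."""
        self.members.discard(node)
        self.awaiting_ack.discard(node)
        if self.state == "ACTIVE" and len(self.members) < self.threshold:
            # Membership is too thin: announce termination to everyone left.
            self.state = "TERMINATING"
            self.awaiting_ack = set(self.members)
            notify = sorted(self.members)
        else:
            notify = []
        self._finish_if_done()
        return notify

    def on_terminate_ack(self, node):
        """Record a TERMINATE_ACK; terminate once every member has answered."""
        self.awaiting_ack.discard(node)
        self._finish_if_done()
        return self.state

    def _finish_if_done(self):
        # The Group Leader terminates only after all acknowledgements arrive.
        if self.state == "TERMINATING" and not self.awaiting_ack:
            self.members.clear()
            self.state = "TERMINATED"
```

A real implementation would also need a timeout for members that never acknowledge; the text does not say how that case is handled, so it is omitted here.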

Monitoring of unresponsive members and group leaders. The group members and Group Leaders could become unresponsive sometimes. The Group Leader buffers the data sent by the source until it gets an acknowledgement from each of its children. If a receiver becomes unresponsive, the buffer begins to fill up. Therefore the receivers and the Group Leaders have to be monitored regularly. This is done using the GL_ALIVE control message sent by the Group Leaders. The mechanism is described in detail in Section 4.4. If a member of the group is found to be unresponsive by the Group Leader, the member is pruned from the group and no further repairs are served to it. If the Group Leader is found to be unresponsive, the receiver joins a different group.

Tree construction techniques. The construction of the hierarchical multicast tree has to be carefully managed for different types of multicast applications. Some multicast applications, like video or teleconferencing, consist of a small number of receivers. Applications like stock quotes and content delivery potentially have a large number of receivers. Hence tree construction should be carefully managed for efficient multicast operation.

Figure 4.9: Tree Construction for an Application with Less Number of Receivers
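The monitoring of unresponsive members and Group Leaders described earlier in this section could be approximated with a simple timeout check. The detailed mechanism belongs to Section 4.4 and is not reproduced here, so everything below is an assumption: the function names, the use of last-heard timestamps, and the timeout value are hypothetical. Only the outcomes (pruning a silent member, treating a silent Group Leader as a reason to join another group) come from the text.

```python
def silent_peers(last_heard, now, timeout):
    """Return the peers not heard from within `timeout` seconds, sorted by name."""
    return sorted(peer for peer, t in last_heard.items() if now - t > timeout)


def prune_members(members, last_heard, now, timeout):
    """Group Leader side: drop silent members so no more repairs are served."""
    dead = set(silent_peers(last_heard, now, timeout))
    return {m for m in members if m not in dead}


def leader_is_unresponsive(last_gl_alive, now, timeout):
    """Receiver side: True when no GL_ALIVE arrived within `timeout` seconds."""
    return now - last_gl_alive > timeout
```

A receiver for which `leader_is_unresponsive` returns true would then restart the join procedure with a different Group Leader.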

Tree construction for small number of receivers. If the application consists of a small number of receivers, the multicast tree is constructed as follows. As the number of receivers is small, the Group Leader allows more receivers to join its group than its regular count. Increasing the MAXIMUM_GROUP_MEMBERS count does this. This avoids the presence of a large number of Group Leaders, each with a minimal number of receivers. Multicast operation is better with a limited number of groups when the number of receivers in the application is small. If the multicast tree consists of many Group Leaders with just a few members deep down, the end-to-end delay involved in the transmission increases, as depicted in Figure 4.9. Hence it is preferable to have a limited number of groups with large populations rather than a large number of groups with sparse populations. Newly joining members are therefore not allowed to declare themselves Group Leaders unless the MAXIMUM_GROUP_MEMBERS threshold is exceeded.

Figure 4.10: Tree Construction for an Application with Large Number of Receivers

Tree construction for large number of receivers. If the application involves a large number of receivers, e.g., stock quotes and content delivery, then the multicast tree

construction would be as follows. The number of receivers in a local group would not be as large as the count for an application with a small number of receivers; MAXIMUM_GROUP_MEMBERS would be lower than for such an application. Figure 4.10 depicts the poorly built tree with a local group serving a large number of receivers. The presence of a large number of receivers would require the Group Leader to have sufficient processing power and memory. Multicast operation for these types of applications is best served by a larger number of local groups, each with fewer receivers. Conversely, if there were a large number of receivers with few Group Leaders, it may lead to the ACK implosion problem at the Group Leader and may also result in congestion at the local groups.

In both the cases discussed above, there are other important factors that affect proper tree construction. These include:

Distance or Hops: Receivers of the multicast that reside in the same network are grouped together to form a local group, rather than grouping receivers that are in different networks wide apart. The number of hops, or the distance of the receiver to its Group Leader, also plays a vital role in the correct formation of the groupings.

Maximum Branching Factor: The maximum branching factor of the local group depends on a number of factors. The local processing power of the Group Leader is one of them: the Group Leader should have sufficient processing power and capacity to accommodate the members of the local group. The multicast rate, such as the number of packets sent per second or the number of bytes per packet, also determines the branching factor of the group. The error rate and the load of the network also influence it.
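The trade-off between group size and tree depth discussed in the two tree construction cases can be illustrated with a rough calculation. This is a simplified model, not part of the thesis: it assumes every node (the source or a Group Leader) accepts at most the same number of children and that the tree is filled evenly. The function names are hypothetical.

```python
def groups_needed(receivers, max_group_members):
    """Smallest number of local groups that can hold all receivers."""
    if max_group_members < 1:
        raise ValueError("max_group_members must be at least 1")
    return -(-receivers // max_group_members)  # ceiling division


def tree_levels(receivers, max_children):
    """Levels below the source needed when each node has at most max_children."""
    if max_children < 2:
        raise ValueError("max_children must be at least 2")
    levels, capacity = 1, max_children
    while capacity < receivers:
        capacity *= max_children  # one more level multiplies the leaf capacity
        levels += 1
    return levels


# Small application, 20 receivers: a tiny group limit gives a deep tree.
print(groups_needed(20, 2), tree_levels(20, 2))
# Same application with a raised limit: one shallow group.
print(groups_needed(20, 20), tree_levels(20, 20))
# Large application, 10000 receivers: a moderate limit keeps each
# Group Leader's ACK load at 100 children with only two levels.
print(groups_needed(10000, 100), tree_levels(10000, 100))
```

Under this model, a larger limit shortens the tree (lower end-to-end delay) but raises the number of ACKs each Group Leader must absorb, which is the tension the text describes.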
4.7 Security Issues

The IETF has posted some evaluation criteria for reliable multicast transport protocols in [23]. Apart from the issues discussed above, it requires that a reliable multicast protocol should discuss the security issues.

The main objectives of multicast security are to preserve the authentication and secrecy of multicast data so that only legitimate senders can send the data and only legitimate receivers can receive the data [23]. The secrecy of the multicast data is provided by a public key cryptography mechanism. The data are encrypted using a group key, which is distributed among the receivers. This requires a good group key management solution in the protocol. To achieve this goal, the architecture of the protocol should be very well designed and should aid in the same. The architecture defined in Section 3.2 satisfies both criteria of scalability and decentralization. Due to the formation of local groupings, the protocol is scalable to a large receiver base. Also, the Group Leaders will act as local key management entities, each managing a set of receivers, rather than a centralized controller. Hence if any of the local key controlling entities is down, it could be taken over by another one. The issues concerning Group Key Management and access control are not considered in detail in this thesis.

4.8 Summary

This chapter provides a framework for a reliable multicast protocol. The framework begins with the discussion of the goals of the proposed protocol. The architectural design of the protocol is discussed, followed by the mechanism of multicast data transfer. The section on acknowledgement and error recovery discusses the various types of retransmission performed by the Group Leader and source. The section also discusses the method by which unresponsive receivers and Group Leaders are monitored and how late-joining receivers recover their data completely through the central logging server. The section on congestion control discusses control algorithms like slow start and congestion avoidance in detail. The group management section discusses the two group formation techniques of expanding

ring structure and advertisement method. Various other group management techniques, like the dynamic reconfiguration of the receivers among different groups and group termination, are discussed. The security issues are not discussed in detail.

CHAPTER 5
FUNCTIONAL MODEL

This chapter provides a functional model for the proposed reliable multicast protocol framework. The functional model identifies the various protocol components and their entities. The model also provides use case and sequence diagrams for basic operations of the protocol like multicast data transfer, acknowledgement, error recovery and group management. The use case and sequence diagrams are constructed based on the Unified Modeling Language (UML) specification. The functional model also provides an overall flow diagram of the protocol.

5.1 Protocol Components

There are four major components in the protocol design: Sender, Receiver, Group Leader, and Central Logging Server. The various components and their subcomponents are shown in Figure 5.1. The functions of each of them are detailed as follows.

5.1.1 Sender Component

The Sender component is responsible for the transmission of the multicast data to the whole group. In addition, it is responsible for a number of other functions like error recovery and congestion control, and it has the interface to interact with the Sender application. It has a number of subcomponents, details of which are listed below.

SNDR_TRANS_CONTROLLER: This subcomponent has a number of modules that take care of the transmission of different types of packets.

SNDR_TR: This is a module of SNDR_TRANS_CONTROLLER and is responsible for the transmission of new data packets.

SNDR_RTR: This is a module of SNDR_TRANS_CONTROLLER and is responsible for the retransmission of lost packets.

SNDR_PROCESS_CONTROLLER: This subcomponent is responsible for the processing of ACK and NACK messages from Group Leaders and the Central Logging Server. It is also responsible for the processing of the control messages for group membership and congestion control.

SNDR_BUFR_MNGR: This subcomponent is responsible for buffer management. Its responsibilities include buffering messages and deleting them as it receives the ACKs from its children.

Figure 5.1: Functional Components of the Protocol

5.1.2 Receiver Component

The Receiver component delivers the received data packets to the receiver application. It also sends the ACK and NACK to the Group Leaders in accordance

with the reception of data packets. Its various subcomponents are mentioned below.

RCVR_TRANS_CONTROLLER: This subcomponent has a number of modules that take care of the transmission of different types of packets.

RCVR_TR: This is a module of RCVR_TRANS_CONTROLLER and is responsible for the transmission of the received data packets to the receiver application.

RCVR_ACK: This is a module of RCVR_TRANS_CONTROLLER and is responsible for the transmission of the ACK messages to the Group Leader.

RCVR_NACK: This is a module of RCVR_TRANS_CONTROLLER and is responsible for the transmission of the NACK messages to the Group Leader and the Central Logging Server to request retransmission of lost packets.

RCVR_CTRL: This is a module of RCVR_TRANS_CONTROLLER and is responsible for the transmission of the control packets for group management techniques and congestion control.

RCVR_PROCESS_CONTROLLER: This subcomponent is responsible for processing of the control messages for group membership and congestion control.

5.1.3 Group Leader Component

The Group Leader component is responsible for temporarily buffering data packets from the sender. It is also involved in local error recovery with the receivers through the retransmission of lost packets. The Group Leader is also responsible for detecting congestion and reporting it up the multicast tree. The various subcomponents of the Group Leader are mentioned below.

GL_TRANS_CONTROLLER: This subcomponent has a number of modules that take care of the transmission of different types of packets.

GL_TR: This is a module of GL_TRANS_CONTROLLER and is responsible for the transmission of the received data packets to the receiver application.

GL_ACK: This is a module of GL_TRANS_CONTROLLER and is responsible for the transmission of the ACK messages to the source or to another Group Leader up in the hierarchy.

GL_RTR: This is a module of GL_TRANS_CONTROLLER and is responsible for the retransmission of the lost packets.

GL_NACK: This is a module of GL_TRANS_CONTROLLER and is responsible for the transmission of the NACK messages to another Group Leader up in the hierarchy or to the Central Logging Server to request retransmission of lost packets.

GL_ADVT: This is a module of GL_TRANS_CONTROLLER and is responsible for the transmission of the group advertisement packets.

GL_STAT: This is a module of GL_TRANS_CONTROLLER and is responsible for the transmission of the local group status messages to the multicast group.

GL_PROCESS_CONTROLLER: This subcomponent is responsible for the processing of ACK and NACK messages received from the receivers and child Group Leaders. It is also responsible for processing of the control messages like group membership and congestion control.

GL_BUFR_MNGR: This subcomponent is responsible for buffer management. Its responsibilities include buffering messages and deleting them as it receives the ACKs from its children.

5.1.4 Central Logging Server Component

The central logger component is responsible for complete data recovery. It buffers the data permanently on disk and helps the receivers and Group Leaders recover in case of loss. Its various subcomponents are discussed as follows.

LOG_TRANS_CONTROLLER: This subcomponent has a number of modules that take care of the transmission of different types of packets.

LOG_RTR: This is a module of LOG_TRANS_CONTROLLER and is responsible for the retransmission of lost packets.

LOG_ACK: This is a module of LOG_TRANS_CONTROLLER and is responsible for the transmission of ACK packets to the sender.

LOG_NACK: This is a module of LOG_TRANS_CONTROLLER and is responsible for the transmission of the NACK packets to the source.

LOG_PROCESS_CONTROLLER: This subcomponent is responsible for the processing of NACK messages from Group Leaders and the receivers.

LOG_BUFR_MNGR: This subcomponent is responsible for buffer management. Its responsibilities include buffering messages on disk and retrieving them during retransmissions.

Message         Description
GL_SRCH         Message sent by clients in search of Group Leaders
INT_IN_JOIN     Message sent by clients in response to the join reply from Group Leaders
GRP_JOIN_ACK    Acknowledgement message sent by Group Leader for group membership
JOIN_MY_GRP     Advertisement message sent by Group Leader for group membership
GL_STATUS       Status message sent by Group Leader to aid the group member's reaffiliation
LEAVE_GRP       Message sent by receivers informing the Group Leader before leaving the group
LEAVE_GRP_ACK   Acknowledgement for the previous message
FIND_GRP_LDR    Message sent by Group Leader to the local group to initiate leader election process
TERM_ACK        Message sent by receivers acknowledging the termination of the Group Leader
GL_ALIVE        Message sent by Group Leaders, which aid the receivers in finding the unresponsive Group Leaders

Figure 5.2: Different Messages Used in Protocol Operation

Appendix A contains pseudo algorithms for the various components of the protocol. Though the operations performed by them are listed sequentially, they are performed concurrently. Appendix B contains the pseudo algorithms for various group management techniques between the Group Leaders and receivers. The various messages involved in the protocol operation are listed in Figure 5.2.

Figure 5.3: Use Case and Sequence Analysis for Multicast Data Operation

5.2 Use Case and Sequence Diagrams

This section deals with the use case design for the various phases of the protocol operation. The use case analysis depicts the various operational phases of the

protocol as different cases and analyzes them, together with the major players involved, in an abstract way without any implementation details.

Figure 5.4: Use Case and Sequence Analysis for Acknowledgement

Multicast data analysis. Figure 5.3 depicts the use case diagram and the sequence flow associated with the operation of multicast data transfer. The sender multicasts the data packets to the whole group and they are received by all the components of the group. The direction of data flow from the sender to the other components (central log server, Group Leaders and receivers) is shown as sequence flows in Figure 5.3.

Acknowledgement analysis. Figure 5.4 depicts the use case diagram and the sequence flow associated with the acknowledgement packets. The data packets sent by the sender are acknowledged by the Group Leaders attached directly to it and by the central log server. Group Leaders are acknowledged by the receivers and by the Group Leaders attached below them. There are four different sequence diagrams involved in the second part of Figure 5.4. Both the central log server and the

Group Leaders that are attached directly to the source send their acknowledgements to the sender in response to the data packets received from it. The receivers send their acknowledgements to the Group Leader rather than the sender, and Group Leaders acknowledge their parent Group Leaders. The different levels in the multicast tree improve the scalability of the protocol. The flows in all four cases are represented as sequence flows in Figure 5.4.

Figure 5.5: Use Case and Sequence Analysis for Error Recovery

Figure 5.6: Use Case and Sequence Analysis for Group Management

Error recovery analysis. Figure 5.5 depicts the use case diagram and the sequence flow associated with the error recovery packets. Error recovery is required in response to the packets lost by the various components of the multicast group. There are six different sequences associated with the error recovery. The sender

retransmits lost packets to the central log server and its child Group Leaders. The central log server retransmits the packets lost by both the receivers and the Group Leaders. The Group Leaders also retransmit the packets for the receivers and their Group Leader children. The flows in all six cases are represented as sequence flows in Figure 5.5.

Group management analysis. The use case diagram for Group Management is depicted in Figure 5.6. The sender and the Group Leaders are involved in activities such as formation and termination of the groups. The other operations, such as shifting among the group members, modifying the leadership and monitoring of the members, are restricted to the receivers and the Group Leaders.

Overall sequence analysis. Figure 5.7 represents the combined sequence diagram of all the operations represented above. This sequence diagram represents the order in which the protocol operation takes place. The four entities are represented along with an extra Group Leader and an extra receiver to show the retransmission operation involved. Each arrow represents the flow of packets from a source entity to a destination entity in the direction of the arrowhead. Arrows are marked as X.Y, with X being the sending entity and Y being the sequence number of the sending entity. It is seen that the source multicasts the data packet to all the entities and every one acknowledges it promptly. Receiver R1 acknowledges the data packet to its Group Leader whereas R2 requests a retransmission. After the reception of the retransmission, R2 acknowledges the same. The other details pertaining to the retransmissions by the central log server and source, congestion control schemes and group management are not shown in Figure 5.7.

Figure 5.7: Sequence Diagram of the Protocol

5.3 Flow Diagram

Figure 5.8 depicts data flow and control flow inside a component of the protocol. The component could be a sender, a Group Leader or the central log server. The component depicted in Figure 5.8 could not be a receiver, as buffer mechanisms are involved. The data flow is depicted with a regular dark line while the control flow is depicted with a dashed line. There are two separate buffers for transmission and retransmission. Similarly, there are separate transmission mechanisms for multicast and unicast. There are separate entities for managing the control messages for group management and flow control. An important feature of the flow inside the component is that the data flows in from the top and flows out through the bottom. Also, the control flows in from the sides and flows out through the sides. Separate queues are provided for ACK and NACK.
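The buffering and queueing structure of such a component can be sketched as follows. This is a minimal illustration, not the design in Figure 5.8 itself: the class `Component`, its method names, and the rule that a packet moves from the transmission buffer to the retransmission buffer once it has been multicast are assumptions. The separate ACK and NACK queues, the two buffers, and deletion of a packet once all children have acknowledged it follow the text.

```python
from collections import deque


class Component:
    """Hypothetical sender or Group Leader with Tx/ReTx buffers and two queues."""

    def __init__(self, children):
        self.children = frozenset(children)
        self.tx_buffer = deque()   # packets waiting for their first multicast
        self.retx_buffer = {}      # seq -> children that have not ACKed yet
        self.ack_inq = deque()     # (child, seq) acknowledgements
        self.nack_inq = deque()    # (child, seq) retransmission requests

    def data_input(self, seq):
        """Data flows in: queue a new packet for transmission."""
        self.tx_buffer.append(seq)

    def multicast_tx(self):
        """Data flows out: multicast queued packets, keep them for repairs."""
        sent = []
        while self.tx_buffer:
            seq = self.tx_buffer.popleft()
            if self.children:
                self.retx_buffer[seq] = set(self.children)
            sent.append(seq)
        return sent

    def process_control(self):
        """Serve NACKs as unicast repairs, then garbage-collect ACKed packets."""
        repairs = []
        while self.nack_inq:
            child, seq = self.nack_inq.popleft()
            if seq in self.retx_buffer:
                repairs.append((child, seq))  # unicast retransmission
        while self.ack_inq:
            child, seq = self.ack_inq.popleft()
            pending = self.retx_buffer.get(seq)
            if pending is None:
                continue
            pending.discard(child)
            if not pending:
                del self.retx_buffer[seq]     # every child has acknowledged
        return repairs
```

With children R1 and R2, an ACK from R1 and a NACK from R2 for the same packet yield one unicast repair to R2, and the packet is released only after R2's later ACK, mirroring the R1/R2 exchange in the overall sequence analysis.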

Figure 5.8: Flow Diagram of a Component

5.4 Summary

This chapter provided a functional model of the protocol framework. The functional model is intended to provide an abstract overview of the protocol without concentrating on the implementation details. This model could also serve as an interface between the functional people and the technical team. The model has use case analysis, sequence analysis, flow diagrams, and sequence diagrams. These were drawn according to the UML specifications.

CHAPTER 6
CONCLUSION AND FUTURE WORK

6.1 Conclusion

Reliable multicast refers to the reliable manner in which a message is sent from a sender to a set of receivers. It is still an active area of research with several works in the literature. Most of the works are classified as sender-initiated or receiver-initiated approaches. Although there are a number of protocols in existence, most of them were developed with a particular application in mind.

There are several important issues that make the design of a reliable multicast protocol difficult. The protocol has to be scalable to a large number of receivers. Since several receivers are involved, the implosion problem must be addressed. Implosion at the sender occurs when all the receivers send their response traffic back to the source. The protocol should provide recovery mechanisms that minimize the control traffic and recovery time. The protocol must also provide mechanisms to recover from congestion in the network. Management mechanisms for the different entities in the protocol must be specified. All these factors make the design of a reliable multicast protocol challenging.

Some of the popular reliable multicast protocols in existence are LGMP, RMTP, TRAM, SRM, LBRM, and XTP. An analysis of these protocols with respect to the issues mentioned previously is given in Chapter 3. The concept of logically grouping the receivers was proposed in LGC. Although LGC recovered the packets locally, it had to go to the sender if none of the members had the missing packet. LGC did not address congestion control. The local grouping concept proposed by LGC was later adopted in the designs of RMTP and TRAM. RMTP builds a number of local subtrees, which together form the global multicast tree. Although

RMTP is scalable to a large number of receivers, it does not support much in the way of group management techniques. RMTP does not provide end-to-end congestion control with any feedback sent from the receivers. Multicast in TRAM is based on repair tree construction. TRAM has several features, like local grouping and tree construction, based upon LGC and RMTP. TRAM provides algorithms for optimized tree construction techniques suitable for a variety of multicast applications. While TRAM does not support complete data recovery, it does mention a number of control and status messages for performing the group management techniques.

LBRM proposes the idea of distributed logging of data packets, which aids in local recovery and reduces the end-to-end propagation delay. Although the receivers are grouped together forming sites, LBRM does not specify group management techniques. LBRM was developed for high performance simulation applications that require low-latency packet loss detection. Hence the protocol has the ability to provide keep-alive or heartbeat data packets, even when the application does not provide them. SRM was developed for supporting a distributed whiteboard application. There is no local grouping of receivers, which severely limits the scalability of the protocol. Retransmission requests made by the receivers are multicast to the whole group, and other receivers that require the same packets back off on seeing the request. MTCP has a hierarchical structure of multicast receivers with the source rooted at the top of the tree. It does not logically group the receivers, which again limits its scalability. It proposes a congestion control algorithm based on hierarchical feedback sent by the receivers up towards the source.

Contribution. This thesis proposes a framework for a reliable multicast protocol that is scalable to a large number of receivers and is rich in group management techniques. The important features of the protocol are discussed as follows.

The architectural design of the protocol, with the local groupings of receivers arranged hierarchically, makes it scalable to a large number of receivers.

Local recovery of packets at the Group Leaders reduces the round-trip time involved in retransmission.

Reporting of ACKs and NACKs by the receivers to the Group Leaders eliminates the ACK implosion problem by design.

Complete recovery of data packets for receivers joining late in the multicast is accomplished through the central logging server.

Unresponsive receivers and Group Leaders are monitored by a special control mechanism.

The protocol provides a receiver-based congestion control mechanism along with TCP-like slow start congestion control algorithms at the source, which make the protocol TCP-friendly. Receivers and Group Leaders report congestion notifications in a hierarchical fashion to the source.

The protocol provides two different group formation techniques, the expanding ring and the advertisement method, which are adapted dynamically to suit the network load.

Receivers can dynamically reconfigure themselves by changing their group membership with the help of status messages from Group Leaders.

The framework also includes a list of pseudo algorithms for the multicast operation and group management techniques. A functional model detailing the various components of the protocol was proposed. The model also includes a list of use-case and sequence diagrams for the basic multicast operation of the protocol.

Some of the features in the protocol framework were adapted from earlier works. The concept of local grouping of receivers was adopted from LGC, but the features have been modified to suit the current framework. Several new features, such as the dynamic group management techniques and complete data recovery through the central logger, were incorporated.
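The TCP-like behaviour at the source can be pictured with a small sketch. The class below is not the thesis's algorithm; it is the generic slow start and congestion avoidance pattern from TCP, and the class name, method names, and default values are assumptions chosen for illustration.

```python
class SourceRateController:
    """Generic TCP-style window control for a multicast source (illustrative)."""

    def __init__(self, initial_window=1, slow_start_threshold=16):
        self.window = initial_window           # packets allowed per round
        self.threshold = slow_start_threshold  # switch point to linear growth

    def on_round_without_congestion(self):
        # Slow start: double the window until the threshold is reached,
        # then grow by one packet per round (congestion avoidance).
        if self.window < self.threshold:
            self.window = min(self.window * 2, self.threshold)
        else:
            self.window += 1

    def on_congestion_notification(self):
        # Assumed reaction to a notification consolidated up the hierarchy:
        # halve the threshold and restart slow start from one packet.
        self.threshold = max(self.window // 2, 1)
        self.window = 1
```

In this sketch a single congestion notification, however many receivers triggered it, causes one reduction at the source; this matches the idea of hierarchical reporting, though the actual consolidation rule is specified elsewhere in the thesis.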

6.2 Future Work

Future work could be done in the following areas.

The framework discussed in this thesis addresses only point-to-multipoint communication. For multipoint-to-multipoint communication with several senders, multicast trees would have to be set up at each sender.

The framework does not discuss security aspects such as source authentication, access control, and key management techniques in detail.

Tree optimization techniques are not discussed in the framework, although the factors that affect correct tree formation are specified. Dynamic optimization by continually reconfiguring the tree structure, breaking or branching the tree to suit the network conditions, is an interesting area to address.

The presence of the central log server helps in complete data recovery. Multiple permanent log servers could be distributed to reduce the propagation delay in retransmission and for scaling purposes. The tradeoff involved in introducing distributed log servers versus buffering data at the Group Leaders could be studied.

The protocol proposed in this thesis provides a framework along with a functional model for a reliable multicast protocol. It also provides pseudo algorithms for the multicast operation and group management techniques. The proposed protocol framework is for point-to-multipoint communication. Analysis and issues relating to multipoint-to-multipoint communication could be studied further.

APPENDIX A
PSEUDO ALGORITHMS FOR MULTICAST OPERATION

Algorithm 1: Sender Operation

while Connection is Open do
    Receive Data from Sending Application;
    Multicast Data packets to the Group;
    Buffer the packets before sending;
    Process Acknowledgements and Retransmissions;
    Manage buffered Data packets;
    if Notified for Congestion then
        Reduce the sending rate;
    end
    Perform Group Management activities;
end

Algorithm 2: Central Log Server Operation

while True do
    Receive Data packets from Sender;
    Send ACK or NACK to Sender;
    if Request for Recovery then
        Perform Retransmissions;
    end
    Store and Retrieve Data from Disk;
end

Algorithm 3: Group Leader Operation

while True do
    Receive Data packets from Sender;
    Buffer Data packets;
    Send ACK or NACK to Parent;
    Process ACK or NACK from Children;
    Manage the buffered Data;
    if Request for Recovery then
        Perform Retransmissions;
    end
    Perform Group Management activities;
    if Congestion or problem then
        Send Notification to Parent;
    end
    Send received packets to Receiving Application;
end

Algorithm 4: Receiver Operation

while True do
    Receive Data packets from Sender;
    Send ACK or NACK to Group Leader;
    Perform Group Management Activities;
    if Congestion or problem then
        Send Notification to Group Leader;
    end
    Send received packets to Receiving Application;
end
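The "Send ACK or NACK to Group Leader" step of the receiver algorithm can be illustrated with a short sketch. It assumes that packets carry consecutive sequence numbers starting at 1 and that a report is a simple dictionary; both the report layout and the function names are hypothetical and not taken from the thesis.

```python
def missing_sequence_numbers(received, highest_expected):
    """Return the sequence numbers in 1..highest_expected that never arrived."""
    seen = set(received)
    return [seq for seq in range(1, highest_expected + 1) if seq not in seen]


def build_report(received, highest_expected):
    """Build the report a receiver would send to its Group Leader."""
    missing = missing_sequence_numbers(received, highest_expected)
    if missing:
        # A gap was detected: ask the Group Leader for the missing packets.
        return {"type": "NACK", "missing": missing}
    # Everything up to highest_expected arrived: acknowledge it.
    return {"type": "ACK", "up_to": highest_expected}
```

For example, a receiver that has packets 1, 2, 4 and 5 out of 5 would report a NACK listing packet 3, which its Group Leader could then retransmit locally.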

APPENDIX B
PSEUDO ALGORITHMS FOR GROUP MANAGEMENT

Algorithm 5: Expanding Ring Search: Client

while GroupLeaderFound == False do
    Multicast a GROUP LEADER SEARCH message;
    Collect Responses;
    if No Replies then
        Multicast the GROUP LEADER SEARCH message with longer TTL;
    else
        foreach Reply collected from GroupLeader do
            Analyze the reply for best characteristics;
        end
        Send an INTERESTED IN JOINING message to GroupLeader;
        if ACK JOINING received then
            GroupLeaderFound ← True;
        else
            Continue Searching;
        end
    end
end

Algorithm 6: Expanding Ring Search: GroupLeader

while True do
    Collect messages from all potential receivers;
    if GROUP LEADER SEARCH message then
        foreach Reply collected do
            if MAX MEMBERS not exceeded then
                Send Reply with the Group characteristics;
            end
        end
    else if INTERESTED IN JOINING message then
        Send ACK JOINING message;
    end
end
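The expanding ring search can be simulated without any networking to show how the TTL scope grows. The sketch below makes several assumptions that the algorithms leave open: the TTL is doubled on each retry, "best characteristics" means fewest hops and then fewest members, and a Group Leader is described by a dictionary with hypothetical keys.

```python
def expanding_ring_search(leaders, initial_ttl=1, max_ttl=32):
    """Return (leader_name, ttl_used), or (None, None) if no leader is found.

    leaders: list of dicts with keys 'name', 'hops', 'members', 'max_members'.
    initial_ttl is assumed to be at least 1.
    """
    ttl = initial_ttl
    while ttl <= max_ttl:
        # Only leaders inside the current TTL scope that still have room reply.
        replies = [gl for gl in leaders
                   if gl["hops"] <= ttl and gl["members"] < gl["max_members"]]
        if replies:
            # Assumed ranking: nearest leader first, then the least loaded one.
            best = min(replies, key=lambda gl: (gl["hops"], gl["members"]))
            best["members"] += 1  # models a successful ACK JOINING handshake
            return best["name"], ttl
        ttl *= 2  # no replies: retry with a longer TTL
    return None, None
```

With one full leader 3 hops away and one free leader 5 hops away, the search fails at TTL 1, 2 and 4 and succeeds at TTL 8 with the farther leader.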

Algorithm 7: Advertisement Method: Client

while GroupLeaderFound == False do
    Collect the JOIN MY GROUP advertisement from GroupLeader;
    foreach Advertisement message collected do
        Analyze and Categorize for best characteristics;
        Choose a Group Leader reachable with minimum hops;
        if Group Leader's MAX MEMBERS not exceeded then
            Send INTERESTED IN JOINING message;
            Wait for ACK JOINING from GroupLeader;
            if ACK JOINING received is Success then
                GroupLeaderFound ← True;
            else
                Continue Searching;
            end
        else
            Analyze another Reply;
        end
    end
end

Algorithm 8: Advertisement Method: Group Leader

while True do
    Send JOIN MY GROUP at certain Time Intervals;
    Collect responses from receivers;
    if INTERESTED IN JOINING message then
        if MAX MEMBERS not exceeded then
            Send Positive ACK JOINING message;
            Increment GROUP MEMBER count;
        else
            Send Negative ACK JOINING message;
        end
    end
end

Algorithm 9: Receivers Shifting Local Group: Client

while True do
    Collect the GROUP LDR STATUS message from Group Leader(s);
    foreach Status message collected do
        Analyze and Choose the Group Leader with best characteristics;
        Send INTERESTED IN JOINING message;
        if Positive ACK JOINING is received then
            Send LEAVE GROUP to Group Leader;
            Wait for the receipt of LEAVE GROUP ACK;
            Join the new group;
        else
            Analyze another status message;
        end
    end
end

Algorithm 10: Receivers Shifting Local Group: Group Leader

while True do
    Send GROUP LDR STATUS message at regular Time Intervals;
    Collect responses from receivers;
    if INTERESTED IN JOINING message then
        if MAX MEMBERS not exceeded then
            Send Positive ACK JOINING message;
            Increment GROUP MEMBER count;
        else
            Send Negative ACK JOINING message;
        end
    end
end
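The Group Leader side of joining and leaving comes down to a membership check against MAX MEMBERS. The following is a minimal sketch under stated assumptions: the class name, the string constants standing in for the protocol messages, and the idempotent handling of a repeated join are all illustrative choices rather than part of the thesis.

```python
class GroupLeaderAdmission:
    """Membership bookkeeping for one Group Leader (illustrative only)."""

    def __init__(self, max_members):
        self.max_members = max_members
        self.members = set()

    def handle_join(self, receiver_id):
        # Reply to an INTERESTED IN JOINING message.
        if receiver_id in self.members:
            return "ACK_JOINING_POSITIVE"  # assumed: repeated joins are harmless
        if len(self.members) < self.max_members:
            self.members.add(receiver_id)
            return "ACK_JOINING_POSITIVE"
        return "ACK_JOINING_NEGATIVE"      # group is full

    def handle_leave(self, receiver_id):
        # Reply to a LEAVE GROUP message and free the slot.
        self.members.discard(receiver_id)
        return "LEAVE_GROUP_ACK"
```

A receiver shifting groups would call `handle_join` on the new leader first and, only after a positive reply, `handle_leave` on the old one, so that it is never left without a group.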

Algorithm 11: Group Leader Shift or Termination: Group Leader

if GROUP MEMBER count less than Threshold then
    Perform regular Group Leader activities;
    if wish to leave or terminate then
        Inform all receivers about termination;
        Receive TERMINATE ACK from all receivers;
        Fulfill all pending obligations;
        Initiate the FIND GROUP LEADER operation;
        Terminate;
    end
else
    Inform all receivers about termination;
    Receive TERMINATE ACK from all receivers;
    Inform source about termination of group;
    Shed all clients;
    Terminate receivers and self;
end

Algorithm 12: Monitoring of Unresponsive Group Leader: Receiver

while True do
    Receive the GL ALIVE messages;
    if Do not receive GL ALIVE messages then
        Mark its absence in the ACK message;
        Send the marked ACK to Group Leader;
        Wait for the response;
        Send 3 marked ACKs until any response;
        if No Response from Group Leader then
            Group Leader is dead;
            Initiate algorithm to join a new group;
        else
            Group Leader is Alive;
            Remain in the same group;
        end
    end
end

Algorithm 13: Monitoring of Unresponsive Receivers: Group Leader

while True do
    Send GL ALIVE messages periodically;
    Receive the ACK from the receivers;
    if Do not receive 3 ACK messages continually then
        Unicast a GL ALIVE to the receiver;
        Wait for the response;
        if No Response from the receiver then
            Receiver is dead or left the group;
            Terminate the receiver from the group;
        else
            Receiver is Alive;
            Continue Multicast Operation;
        end
    end
end
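The monitoring of unresponsive receivers can be sketched as a counter of consecutive missed ACKs per receiver. The limit of three comes from the algorithm; everything else here is an assumption for illustration, including the class name, the idea of evaluating once per reporting interval, and the `probe` callback that stands in for the unicast GL ALIVE check.

```python
class ReceiverMonitor:
    """Tracks consecutive missed ACKs per receiver (illustrative only)."""

    MISSED_ACK_LIMIT = 3  # from the algorithm: three ACKs missed in a row

    def __init__(self, receivers):
        self.missed = {r: 0 for r in receivers}

    def end_of_interval(self, acked, probe):
        """Update counters after one reporting interval.

        acked: set of receivers whose ACK arrived in this interval.
        probe: callable(receiver) -> bool, True if the receiver answers
               a unicast GL ALIVE message.
        Returns the list of receivers removed from the group.
        """
        removed = []
        for receiver in list(self.missed):
            if receiver in acked:
                self.missed[receiver] = 0
                continue
            self.missed[receiver] += 1
            if self.missed[receiver] >= self.MISSED_ACK_LIMIT:
                if probe(receiver):
                    self.missed[receiver] = 0   # alive: keep in the group
                else:
                    del self.missed[receiver]   # dead or left: terminate
                    removed.append(receiver)
        return removed
```

The receiver-side check on the Group Leader follows the same shape with the roles reversed: the receiver counts missing GL ALIVE messages and starts looking for a new group if its marked ACKs go unanswered.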

REFERENCES

[1] Information Sciences Institute, "Internet Protocol," Request for Comment (RFC) 791, Internet Engineering Task Force, September 1981.

[2] Information Sciences Institute, "Transmission Control Protocol," Request for Comment (RFC) 793, Internet Engineering Task Force, September 1981.

[3] S. Floyd, V. Jacobson, C. Liu, S. McCanne, L. Zhang, "A Reliable Multicast Framework for Light-weight Sessions and Application Level Framing," IEEE/ACM Transactions on Networking, vol. 5, no. 6, pages 784-803, December 1997.

[4] J. W. Atwood, O. Catrina, J. Fenton, W. T. Strayer, "Reliable Multicasting in Xpress Transport Protocol," IEEE Conference on LCN, pages 202-211, Minneapolis, MN, USA, October 1996.

[5] H. W. Holbrook, S. K. Singhal, D. R. Cheriton, "Log-based Receiver-reliable Multicast for Distributed Interactive Simulation," SIGCOMM, pages 328-341, 1995.

[6] M. Hofmann, "A Generic Concept for Large-Scale Multicast," International Zurich Seminar on Digital Communications, pages 95-106, 1996.

[7] I. Rhee, N. Ballaguru, G. Rouskas, "MTCP: Scalable TCP-like Congestion Control for Reliable Multicast," IEEE INFOCOM, vol. 3, pages 1265-1273, New York, NY, USA, March 1999.

[8] S. Paul, K. Sabnani, J. Lin, S. Bhattacharyya, "Reliable Multicast Transport Protocol (RMTP)," IEEE Journal of Selected Areas in Communications, vol. 15, pages 407-421, April 1997.

[9] D. Chiu, S. Hurst, M. Kadansky, "TRAM: A Tree-Based Reliable Multicast Protocol," Technical Report TR-98-66, Sun Microsystems, July 1998.

[10] W. Stallings, Cryptography and Network Security: Principles and Practice, Second Edition, Prentice Hall, Upper Saddle River, New Jersey, July 1998.

[11] V. Roca, L. Costa, R. Vida, A. Dracinschi, S. Fdida, "A Survey of Multicast Technologies," Technical Report, LNCS 2236, September 2000.

[12] B. Quinn, K. Almeroth, "IP Multicast Applications: Challenges and Solutions," Request for Comment (RFC) 3170, Internet Engineering Task Force, September 2001.

[13] B. N. Levine, J. J. Garcia-Luna-Aceves, "A Comparison of Reliable Multicast Protocols," International Conference on Network Protocols, pages 112-121, Columbus, OH, USA, October 1996.

[14] K. Obraczka, "Multicast Transport Protocols: A Survey and Taxonomy," IEEE Communications Magazine, vol. 36, pages 94-102, Toronto, Ont., Canada, January 1998.

[15] J. Postel, Information Sciences Institute, "User Datagram Protocol," Request for Comment (RFC) 768, Internet Engineering Task Force, August 1980.

[16] V. Jacobson, "Congestion Avoidance and Control," ACM SIGCOMM, vol. 18, pages 314-329, Stanford, CA, USA, August 1988.

[17] X. Xiao, L. M. Ni, "Internet QoS: A Big Picture," IEEE Network, vol. 13, pages 8-18, March-April 1999.

[18] A. Mankin, A. Romanow, S. Bradner, V. Paxson, "IETF Criteria for Evaluating Reliable Multicast Transport and Application Protocols," Request for Comment (RFC) 2357, Internet Engineering Task Force, June 1998.

[19] J. Nonnenmacher, M. Lacher, M. Jung, E. Biersack, G. Carle, "How bad is Reliable Multicast without Local Recovery?," INFOCOM, vol. 3, pages 972-979, San Francisco, CA, USA, April 1998.

[20] S. K. Kasera, J. Kurose, D. Towsley, "A Comparison of Server-based and Receiver-based Local Recovery Approaches for Scalable Reliable Multicast," Technical Report UM-CS-1997-069, 1997.

[21] S. J. Golestani, K. Sabnani, "Fundamental Observations on Multicast Congestion Control in the Internet," INFOCOM, vol. 2, pages 990-1000, New York, NY, USA, March 1999.

[22] S. Floyd, M. Handley, "Requirements for Congestion Control for Reliable Multicast," Reliable Multicast Workshop in Cannes, September 1997.

[23] M. J. Moyer, J. R. Rao, P. Rohatgi, "A Survey of Security Issues in Multicast Communications," IEEE Network, vol. 13, pages 12-23, November-December 1999.

BIOGRAPHICAL SKETCH

Venkata L. Ramasubramaniam was born in Srivilliputtur, Tamil Nadu, India, on October 17, 1978. He earned his high school diploma from Sir. M. Venkata Subba Rao matriculation school. He graduated with a Bachelor of Engineering degree with distinction in computer science and engineering in 1999 from Madurai Kamaraj University, Madurai. He came to the University of Florida (in Gainesville, Florida) to pursue a Master of Science degree in Computer and Information Science and Engineering.


Permanent Link: http://ufdc.ufl.edu/UFE0000599/00001

Material Information

Title: A Framework for reliable multicast protocol
Physical Description: Mixed Material
Creator: Ramasubramaniam, Venkata Lakshmanan ( Author, Primary )
Publication Date: 2002
Copyright Date: 2002

Record Information

Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
System ID: UFE0000599:00001














A FRAMEWORK FOR RELIABLE MULTICAST PROTOCOL


By

VENKATA LAKSHMANAN RAMASUBRAMANIAM















A THESIS PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE

UNIVERSITY OF FLORIDA


2002

































Copyright 2002

by

Venkata Lakshmanan Ramasubramaniam





































To

My Parents















ACKNOWLEDGMENTS

I would like to express my sincere gratitude to Dr. Richard E. Newman for giving me the opportunity to work with him and being my advisor. He has been a great source of inspiration and encouragement throughout my stay at the University of Florida.

I would also like to thank Dr. Randy Y. C. Chow, Dr. Jonathan C. Liu, and Dr. Michael P. Frank for serving on my committee.

Special thanks go to Dr. Christophe Fiorio for generously allowing me to use his LaTeX algorithm template. Thanks go to Mr. Ron Smith for creating this thesis template for the LaTeX community at University of Florida and helping me out in the various problems encountered during the formatting.

I would like to thank my parents, family and friends who have always been there to help me.















TABLE OF CONTENTS

ACKNOWLEDGMENTS

LIST OF FIGURES

ABSTRACT

CHAPTER

1 INTRODUCTION
    1.1 Motivation
    1.2 Problem Definition
    1.3 Organization of the Thesis

2 BACKGROUND AND PREVIOUS WORK
    2.1 Multicasting
    2.2 Reliable Multicasting
    2.3 Existing Protocols
        2.3.1 Local Group Concept
        2.3.2 Reliable Multicast Transport Protocol (RMTP)
        2.3.3 Tree-based Reliable Multicast Protocol (TRAM)
        2.3.4 Multicast TCP (MTCP)
        2.3.5 Scalable Reliable Multicast (SRM)
        2.3.6 Log-based Receiver-reliable Multicast (LBRM)
        2.3.7 Xpress Transport Protocol (XTP)
    2.4 TCP Congestion Control
    2.5 Existing QoS Approaches
    2.6 IETF Approaches
    2.7 Summary

3 ISSUES IN RELIABLE MULTICASTING
    3.1 ACK Implosion problem
    3.2 Error Recovery
    3.3 Congestion Control
    3.4 Scalability
    3.5 Fairness
    3.6 Summary

4 PROTOCOL FRAMEWORK
    4.1 Goals
        4.1.1 ACK Handling
        4.1.2 Error Recovery
        4.1.3 Congestion Control
        4.1.4 Scalability and Dynamic Adaptation
    4.2 Architecture
    4.3 Multicast Data Transfer
    4.4 Acknowledgement and Error Recovery Mechanism
        4.4.1 Retransmission by Group Leader
        4.4.2 Retransmission by Source
        4.4.3 Monitoring of Unresponsive Receivers and Group Leaders
        4.4.4 Late Joining Receivers and Data Recovery
    4.5 Congestion Control Mechanism
        4.5.1 Slow Start
        4.5.2 Congestion Control
    4.6 Group Management Schemes
        4.6.1 Group Formation
        4.6.2 Dynamic Reconfiguration of Groups
    4.7 Security Issues
    4.8 Summary

5 FUNCTIONAL MODEL
    5.1 Protocol Components
        5.1.1 Sender Component
        5.1.2 Receiver Component
        5.1.3 Group Leader Component
        5.1.4 Central Logging Server Component
    5.2 Use Case and Sequence Diagrams
    5.3 Flow Diagram
    5.4 Summary

6 CONCLUSION AND FUTURE WORK
    6.1 Conclusion
    6.2 Future Work

APPENDIX

A PSEUDO ALGORITHMS FOR MULTICAST OPERATION

B PSEUDO ALGORITHMS FOR GROUP MANAGEMENT

REFERENCES

BIOGRAPHICAL SKETCH















LIST OF FIGURES

Figure

2.1 Differentiated Service (DS) Field for TOS in IP Header
3.1 ACK Implosion at Multicast Source
3.2 Comparative Analysis of Reliable Multicast Protocols
4.1 Overall Architectural Design of the Protocol
4.2 Multicast Data Transfer
4.3 Local Retransmission of Missing Packets
4.4 Central Logging Server and Monitoring of Unresponsive Members
4.5 Hierarchical Consolidation of Feedback Parameter
4.6 Expanding Ring Structure for Group Membership
4.7 Group Leader's Advertisement for Membership
4.8 Group Leader's Status Message
4.9 Tree Construction for an Application with Less Number of Receivers
4.10 Tree Construction for an Application with Large Number of Receivers
5.1 Functional Components of the Protocol
5.2 Different Messages Used in Protocol Operation
5.3 Use Case and Sequence Analysis for Multicast Data Operation
5.4 Use Case and Sequence Analysis for Acknowledgement
5.5 Use Case and Sequence Analysis for Error Recovery
5.6 Use Case and Sequence Analysis for Group Management
5.7 Sequence Diagram of the Protocol
5.8 Flow Diagram of a Component















Abstract of Thesis Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Master of Science

A FRAMEWORK FOR RELIABLE MULTICAST PROTOCOL

By

Venkata Lakshmanan Ramasubramaniam

December 2002

Chair: Richard E. Newman
Major Department: Computer and Information Science and Engineering

The biggest revolution in networking since the introduction of the World Wide Web (WWW) is Internet Protocol (IP) multicasting. Multicasting is becoming increasingly ubiquitous in today's Internet world. There are many commercial applications, both real-time and non real-time, that are based on IP Multicasting. Many of those applications require reliable delivery of data at the receiver's end for them to be meaningful.

Reliable multicast refers to the reliable manner in which a message should reach the group of receivers. It is still an active area of research, and a number of protocols exist. Several issues, such as scalability, error recovery and congestion control, make the design of a reliable multicast protocol difficult. This thesis discusses various reliable multicast protocols in existence and provides a framework for a scalable, reliable, dynamic multicast transport protocol for point-to-multipoint communication.

The proposed protocol provides sequenced, lossless delivery of data from a sender to a set of receivers. The receivers are split into logical groups, with a special receiver called the Group Leader controlling each group. Error recovery and retransmissions are performed by the Group Leader, which distributes the processing load of the sender. The logical grouping of the receivers makes the protocol scalable by its architectural design. The protocol provides congestion control algorithms and also describes the group formation and management techniques. A unique feature of the protocol is the ability of the receivers to reconfigure themselves to suit the network conditions in case of congestion.















CHAPTER 1
INTRODUCTION

Multicasting provides an efficient mechanism for message delivery from a single sender to a group of receivers. Multicasting is done using the Internet Protocol (IP) [1], which is an unreliable protocol. Therefore, multicast applications that require reliable delivery of data have to use a reliable transport protocol, just as reliable unicast applications use the Transmission Control Protocol (TCP) [2] as their transport protocol.

Compared with TCP, the design of a transport protocol for multicasting poses several additional challenges. The key issues include the implosion problem, scalability, error recovery, and congestion/flow control. With multiple receivers, all the control information from the receivers flows to the sender, which causes the implosion problem. The operation of the protocol should not deteriorate as the number of receivers increases. With a large number of receivers, the recovery time should be minimized, and the protocol should have facilities for congestion and flow control. These topics are discussed in detail in Chapter 3.

There are a number of protocols in existence today for reliable multicasting: Scalable Reliable Multicast [3], Xpress Transport Protocol [4], Log-based Receiver-reliable Multicast [5], Local Group based Multicast Protocol [6], and Multicast TCP [7]. Some protocols, such as the Reliable Multicast Transport Protocol [8], developed by Lucent Technologies, and the Tree-based Reliable Multicast Protocol [9], developed by Sun Microsystems, are becoming commercially available. Chapter 2 surveys these protocols in detail. Although many transport protocols exist for reliable multicasting, none of them has been standardized so far. The protocols that are currently available have been designed to suit the needs of a particular application; hence there is no generalized protocol that can be used fairly by all applications. Also, most of the protocols do not address important issues such as congestion control, which plays a major role because of the traffic generated by reliable multicasting. Some of the protocols are not scalable to a very large number of receivers, which is required of any multicast application.

1.1 Motivation

The Internet has emerged as one of the major means of communication. This has had a dramatic effect on the way in which people communicate and share information with each other. A form of group communication called multicasting has led to the development of a number of distributed applications such as video conferencing, distributed interactive simulation, white boards, electronic publishing, news and newsgroup services, and digital libraries. Just as the immense requirement for security in commercial Internet applications led to a huge amount of research in cryptography and to standards like RSA [10], the emergence of these group communication-based applications has raised interest in reliability.

Most real-time distributed applications require reliable delivery of data. Unlike unicasting, in which data is exchanged between a single source and a single receiver, exchanging data reliably in a multicasting environment with a number of receivers poses many challenges. The protocol's architecture, design and operation play a major role in how well it copes with them. Reliable multicast is now an actively researched area in the Internet community. Although a number of reliable multicast protocols exist for research and commercial applications, none of them has been standardized. The reliable multicast protocols in existence today are designed for custom applications. Also, most of the protocols do not address key issues like congestion/flow control, and some of them are not scalable to a large number of receivers, which typifies most multicast applications. Hence there is a need for a reliable multicast transport protocol that addresses all of the issues, such as scalability, error recovery, congestion/flow control and group management, and that is suitable for a wide variety of applications.

1.2 Problem Definition

This thesis proposes a framework for a reliable multicast transport protocol. Reliable multicast refers to the reliable manner in which a sender sends data to a set of receivers. The protocol proposed in this framework provides sequenced, lossless delivery of data from a single sender to a set of receivers. The framework addresses all the major issues: error recovery, congestion control, scalability, and group management.

The architectural design of the protocol makes it scalable to a large number of receivers and aids local error recovery. The logical grouping of the receivers makes error recovery distributed and helps combine receivers with similar characteristics: receivers that reside in the same location, receivers that have identical data rates, and receivers that have identical error rates could be grouped together. By making the error recovery distributed, the communication load on the network is greatly reduced. Receivers do not have to send their repair requests all the way up to the source of the multicast tree, and the sender does not have to send the response down the tree to the receiver. The control traffic involved in error recovery is confined to the logical groups. As the receivers are repaired by the Group Leader rather than the sender, the end-to-end latency is greatly reduced. Distributed error recovery also reduces the processing load of the sender, since that load is shared among the Group Leaders. A retransmission performed by the Group Leader is sometimes multicast to the whole group, which slightly reduces the processing load of the Group Leader.
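A minimal sketch of local recovery follows, assuming that each Group Leader buffers packets by sequence number and asks its parent only when its own buffer lacks a packet. The class, its methods, and the choice to cache a packet fetched from the parent are hypothetical and are meant only to show why repair traffic stays inside the group in the common case.

```python
class GroupLeader:
    """Buffers packets and serves repair requests locally (illustrative)."""

    def __init__(self, parent=None):
        self.parent = parent  # next node towards the source, or None
        self.buffer = {}      # sequence number -> payload
        self.forwarded = 0    # repair requests passed up the tree

    def store(self, seq, payload):
        self.buffer[seq] = payload

    def repair(self, seq):
        # Common case: the packet is in the local buffer, so the request
        # never travels beyond the logical group.
        if seq in self.buffer:
            return self.buffer[seq]
        if self.parent is None:
            return None  # nobody further up holds the packet
        # Otherwise ask the parent once and keep a copy for later requests.
        self.forwarded += 1
        payload = self.parent.repair(seq)
        if payload is not None:
            self.buffer[seq] = payload
        return payload
```

In this model the sender sees one request per missing packet per group at most, rather than one per receiver.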









The protocol responds to the congestion state in the network through TCP-like congestion control algorithms. The protocol is also highly dynamic, with the multicast group receivers adapting themselves to the changing network conditions and congestion state through dynamic reconfiguration of the receivers among the logical groups. The details of the protocol's architecture, design and operation are discussed in Chapter 4.

1.3 Organization of the Thesis

Chapter 2 discusses the multicasting techniques and evaluates the various existing protocols. Chapter 3 discusses the various issues pertaining to the design of reliable multicasting protocols and analyses the existing reliable multicast protocols with respect to these issues. Chapter 4 presents the framework for the protocol, discussing its architecture, design and operation. Chapter 5 presents the functional model of the protocol, and Chapter 6 gives the concluding remarks and suggestions for future work.















CHAPTER 2
BACKGROUND AND PREVIOUS WORK

This chapter gives an introduction to reliable multicasting and to the prominent works and protocols developed for it.

2.1 Multicasting

Multicast is an efficient way to transfer data from a source to a group of receivers. Instead of sending a separate copy to each of the receivers, the sender sends a single copy to the network, which then delivers it to all the receivers. Roca et al. have surveyed different multicast technologies [11]. Multicasting is broadly classified into the following types based on the applications [12].

One-to-Many (1toM): The 1toM multicast technique has a single host sending data to more than one receiver. The one-to-many applications include scheduled audio/video distribution (lectures, presentations, meetings), push media applications (news headlines, weather updates, sports scores), file distribution and caching, announcements (network time, session information, keys) and monitoring applications (stock prices, sensor equipment, security systems).

Many-to-Many (MtoM): The MtoM multicast technique allows the multiple receivers to act as senders too, enabling two-way communication. The many-to-many applications include multimedia conferencing (audio/video, white board), concurrent processing (distributed parallel processing), distance learning, chat groups, distributed interactive simulation and collaboration (shared document editing).

Many-to-One (Mto1): The Mto1 multicast technique has multiple senders sending data to a single receiver. The many-to-one applications include data collection applications (sensors), auctions and polling.

With the many multicast applications listed above, the use of IP multicast has grown considerably in the past few years. Most of these applications require reliable delivery of data at the receiver's end.









2.2 Reliable Multicasting

In multimedia applications, the loss of some data could be acceptable; for example,

video frames could be sacrificed for the audio information. The main objective of these

applications is to guarantee the quality of service at the cost of reliability. But

most of the applications discussed above require reliable delivery of data at the

receiver end. Addition of the word reliable to multicasting imposes several

challenges in the way the protocol is designed. In unicast communication between a

single source and a receiver, reliability is provided by TCP [2]. TCP provides

reliable transmission using feedback from the receiver; the source retransmits

the missed packets accordingly. In multicasting, due to the

presence of a large number of receivers, closed-loop feedback results in the implosion

problem, as discussed in Chapter 3.

The term reliable multicasting refers to the reliable manner in which the message

should reach the group of receivers. The protocol developed should also be scalable

to a large number of receivers. Earlier multicast protocols were broadly classified as

sender-initiated and receiver-initiated [13]. In the case of sender-initiated

protocols, the sender maintains considerable state information. When all the receivers

send their ACKs, the result is packet implosion: a large number

of receivers report their control information and acknowledgements

to the source, virtually overwhelming it. This is called the

implosion problem. In the case of receiver-initiated protocols, every receiver

maintains state information, thereby shifting the burden from the sender to the

receivers. Much comparative analysis has been done on sender-initiated

and receiver-initiated reliable multicast protocols. These analyses have shown that

receiver-initiated protocols are more scalable than the sender-initiated ones. The

protocol framework discussed in this thesis is receiver-initiated with a slight

modification. Instead of all the receivers maintaining the state information, only

selected receivers called Group Leaders will maintain the same.
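
The scalability argument can be made concrete with a rough count of the feedback messages that reach the sender for each data packet. The sketch below is illustrative only: the function names, the fixed group size, and the assumption that each Group Leader forwards exactly one aggregated report are hypothetical simplifications, not part of any protocol specification.

```python
import math


def feedback_at_sender_flat(num_receivers: int) -> int:
    """Every receiver reports directly to the sender."""
    return num_receivers


def feedback_at_sender_grouped(num_receivers: int, group_size: int) -> int:
    """Receivers report to a Group Leader; each leader is assumed to send
    one aggregated report to the sender."""
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    return math.ceil(num_receivers / group_size)
```

With 10,000 receivers and hypothetical groups of 50, the sender would handle 200 reports per packet instead of 10,000 under these assumptions.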

2.3 Existing Protocols

A considerable amount of work has been reported in the literature regarding

reliable multicast protocols. Most of the works can be classified as either

sender-initiated or receiver-initiated approaches. Although there are many

protocols in the reliable multicast area, this section deals only with the major ones,

discussing the highlights of the same. Most of the protocols are based on the logical

groupings of the receivers. A set of receivers is grouped together to form logical

groups that form a hierarchical multicast tree. The architecture improves the

scalability of the protocol. Some of the protocols discussed do not form any

groupings and their scalability is limited. Obraczka provides a comparison chart of

all the major multicast transport protocols [14].

2.3.1 Local Group Concept

Markus Hofmann proposed a tree-based approach using the so-called Local

Group Concept (LGC) [6]. The concept, developed in 1994, has evolved into two

separate protocols: Local Group based Multicast Protocol (LGMP) and Local

Group Configuration Protocol (LGCP). LGC is one of the few earlier protocols

that discussed the local logical groupings of receivers.

LGMP is also based on the principle of local subgrouping. It has special nodes

called group controllers that are responsible for local re-transmissions and ACK

processing. The selection of the group controller and the management of the local

groups are not tasks of the multicast protocol LGMP itself. Instead, a separate

configuration protocol called the Dynamic Configuration Protocol (DCP) was

designed and implemented. Errors are recovered by retransmission within the local

groups by the group controller. If none of the group members has the data

packet, it is requested from the source. Acknowledgement schemes include a









positive, negative and a novel semi-negative acknowledgement scheme, which

indicates that a data unit has not yet been received correctly, but it does not

request the data unit for retransmission. Error recovery is first performed locally

within the groups. Any member of the local group that has the missing packet

would retransmit it to the requesting member. A packet is requested for

retransmission from the sender only when it is not found with any of the members

in the group, including the group controller. LGC defines two different modes of

performing local retransmissions: load-sensitive mode and delay-sensitive mode. In

load-sensitive mode, retransmissions are performed to minimize the network load.

The controller withholds the retransmission while waiting for requests from other

receivers. In delay-sensitive mode, the retransmissions are performed immediately

after the reception of retransmission requests.
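
The difference between the two retransmission modes can be sketched as follows. This is a minimal illustration, not LGC's actual implementation: the class name, the `hold_time` parameter, and the explicit `now` timestamps are assumptions introduced here to keep the example deterministic.

```python
from typing import Dict, List


class GroupController:
    """Toy group controller that decides when to retransmit a packet."""

    def __init__(self, mode: str, hold_time: float = 0.5):
        if mode not in ("load-sensitive", "delay-sensitive"):
            raise ValueError("unknown mode: " + mode)
        self.mode = mode
        self.hold_time = hold_time          # hypothetical waiting period (seconds)
        self.pending: Dict[int, float] = {}  # seq -> time of first request

    def on_request(self, seq: int, now: float) -> List[int]:
        """Return the sequence numbers to retransmit immediately."""
        if self.mode == "delay-sensitive":
            return [seq]
        # Load-sensitive: withhold and wait for further requests for seq.
        self.pending.setdefault(seq, now)
        return []

    def on_tick(self, now: float) -> List[int]:
        """Return withheld sequence numbers whose waiting period has elapsed."""
        due = sorted(s for s, t in self.pending.items()
                     if now - t >= self.hold_time)
        for s in due:
            del self.pending[s]
        return due
```

In load-sensitive mode, several requests for the same packet that arrive during the waiting period are answered by a single retransmission, which is the source of the reduced network load.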

The establishment and maintenance of logically structured group hierarchies is

left to the Dynamic Configuration Protocol. The major advantage of the DCP is

that it can interact with any other protocol that requires a logically structured

receiver hierarchy. The DCP and LGC together form the protocol architecture.

LGC concerns group formation, group characteristics and dynamic reconfiguration.

But there is no mention of any congestion control scheme. The LGMP operates on

top of User Datagram Protocol (UDP) [15] and does not require any changes

within internal network equipment such as routers or switches.

2.3.2 Reliable Multicast Transport Protocol (RMTP)

Paul and Sabnani of Bell Laboratories proposed the Reliable Multicast Transport

Protocol (RMTP) [8]. RMTP is commercially marketed by Lucent Technologies. In

RMTP, the receivers form a dynamic multicast tree with the source rooted on top

of the tree. RMTP is a protocol for point-to-multipoint reliable multicast.

The receivers are grouped logically to form groups with a Designated Receiver

(DR) as the representative of the local group. These local groupings of the









receivers lead to the formation of several local multicast trees, which together form

the global multicast tree. Hence the sender, receivers and the designated receivers

form the three major entities of RMTP. The leaf receivers periodically send status

messages to their designated receivers (DRs). DRs in turn send their status

messages to the DRs in the higher level and so on until the status message finally

reaches the source. Thus there is a hierarchical flow of data in the multicast tree

from one level to another level. Lost packets are recovered locally and the DR does

retransmissions either through unicast or multicast mechanism. To facilitate error

recovery for late joining receivers, the source and the DRs buffer the data packets

for the session. RMTP uses a two-level cache mechanism in which the most recent

packets are cached in memory and the rest are cached on disk. Flow control is

based on a combination of rate and window-based control.

Although the protocol is scalable to a large number of receivers, it does not

provide an end-to-end congestion control scheme. The sender only gets feedback

from its own children (the DRs) about their receiving status. Hence, the sender has

little information about the congestion status of the receivers. When congestion

occurs at leaf receivers, it may not be possible for the sender to detect the

congestion, especially if the DRs and the leaf receivers do not share the same

network path. In this case, the sender will continue to transmit at the same rate,

aggravating the existing congestion. RMTP traffic may be completely unresponsive

to congestion and may cause congestion collapse. Another drawback of the protocol

is the dependence on the network equipment such as the routers. The routers have

to be modified substantially. Although the information from each receiver is

delivered in order, RMTP does not guarantee causal delivery.

2.3.3 Tree-based Reliable Multicast Protocol (TRAM)

TRAM [9] was developed at Sun Microsystems Labs by Kadansky et al. It was

designed to support bulk data transfer with a single sender and multiple receivers.









Unlike the protocols discussed earlier, TRAM uses dynamic trees in its

architecture. These are used for local error recovery and aid scalability to a

large number of receivers.

The receivers and the sender of the multicast group interact with each other

dynamically to form repair groups. The repair groups are linked together

hierarchically to form the multicast tree with the source rooted on top of the tree.

A subset of receivers is chosen for the reliable delivery of the data. A special

receiver called the repair head is chosen among the subset of receivers, and is

responsible for the local error recovery and retransmission processing. These repair

heads may be statically or dynamically selected. The repair head caches the data

packets temporarily until all the receivers in the subset receive them. The source of

the multicast tree also caches the data packet and retransmits it if requested.

TRAM uses a rate-based flow control. The data rate of the sender is dynamically

adjusted based on the congestion feedback from the receivers. The repair head also

sends control messages like congestion notification to its parent; these propagate

until they reach the source. Each member of the tree sends feedback reports to its

repair head periodically apart from the congestion notification. This feedback

consists of general information or statistics, which also aids in the construction of

the tree. TRAM also proposes several optimized tree construction techniques

suited for different applications. TRAM proposes several tree management

techniques for continually optimizing the repair tree.

Though the source and the repair heads cache the data packets for error recovery,

TRAM does not support full-data recovery. There may be cases where the receivers

joining late would not be able to recover fully due to the absence of data both at

the repair head and the source as their buffers are periodically reclaimed. TRAM is

currently implemented in Java, and several sample applications (Slinger, Bricks,

Stock and TreeTest) have been developed to test its capabilities.









2.3.4 Multicast TCP (MTCP)

Rhee et al. [7] proposed the Multicast TCP (MTCP), a point-to-multipoint

protocol mainly developed to provide a detailed congestion control scheme for

reliable multicast. The protocol describes a detailed congestion control technique

similar to TCP congestion control methods.

MTCP also supports a tree-like hierarchical structure. The receivers are not

grouped locally. The multicast group is a hierarchical tree with the sender forming

the root of the tree and the receivers forming the other nodes of the tree. The

sender multicasts the data to all the receivers, and the latter send

acknowledgements to their parents in the tree. The internal nodes are called

Service Agents (SAs). The leaf receivers send ACKs to the SAs immediately above

them, which in turn send them to the SA in the level above and so on until it

reaches the source. The receivers send either a positive acknowledgement (ACK) or

a negative acknowledgement (NACK). Received packets are reported in ACKs and

missing packets are reported in NACKs. The service agents are responsible for

handling the feedback generated by the children and retransmitting lost packets.

MTCP provides several features for congestion control mechanisms. The

receivers send a consolidated congestion status report up in hierarchy towards the

source. A new concept of relative time delay is introduced to overcome the

difficulty of calculating the round-trip time. Flow control is window-based, which

allows the sender to control the number of packets it multicasts to the group. It also

incorporates a selective acknowledgement scheme at the service agents to prevent

independent packet loss from reducing the sender's transmission rate. Each service

agent maintains a TCP-like congestion window, which operates in a manner similar

to the standard TCP congestion control algorithms [16].

MTCP provides a congestion control scheme similar to the TCP congestion

control mechanisms. This is achieved through the hierarchical congestion status









reports sent by the leaf receivers to the SAs above them. Each SA monitors the

congestion level of its children by independently maintaining a dynamic congestion

window using the ACKs and NACKs received from them. The status reports

eventually reach the source, the root of the tree, which regulates its transmission

rate based on its summary. MTCP also uses the relative time delay (RTD) concept,

which overcomes the difficulty of estimating the round-trip times in tree-based

multicast environments. Unlike TCP, which uses feedback from a single receiver to

estimate the round-trip time, MTCP is an open-loop system with multiple receivers.

MTCP measures the difference between the clock value taken at the sender when a

packet is sent, and the clock value taken at the SA when the corresponding ACK is

received. The time difference is called the relative time delay in MTCP.
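
The RTD measurement described above amounts to a single subtraction between two clocks that need not be synchronized. The following sketch is a hedged illustration; the function and parameter names are hypothetical and do not come from the MTCP specification.

```python
def relative_time_delay(sender_send_clock: float, sa_ack_clock: float) -> float:
    """Difference between the SA's clock when the ACK arrives and the
    sender's clock when the packet was sent (both values are assumed to be
    available to the SA, e.g. via a timestamp carried in the packet)."""
    return sa_ack_clock - sender_send_clock
```

One consequence, assuming the offset between the two clocks stays constant, is that the difference between two RTD samples reflects the change in delay even though neither sample is a true round-trip time. For example, samples of 100.25 and 100.5 would indicate that the delay grew by 0.25 time units, whatever the clock offset is.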

The protocol is not scalable to a large number of receivers due to its architecture.

Also, MTCP does not describe in detail the formation of the multicast tree and its

management techniques. MTCP primarily describes a congestion control

mechanism for reliable multicasting.

2.3.5 Scalable Reliable Multicast (SRM)

SRM, proposed by Floyd et al. [3], is a reliable multicast framework for

light-weight sessions and application level framing developed at the Lawrence

Berkeley Labs (LBL). Although the framework has been designed for a wide range

of applications, it has been primarily prototyped in wb, a distributed whiteboard

application.

The SRM is not hierarchical and hence the scalability is very limited. Whenever

a member generates new data, the data are multicast to the whole group. Each

and every member is responsible on its own to detect packet loss and to request

retransmission. To prevent the implosion of control packets sent from receivers in a

multicast group, it adopts a mechanism similar to the XTP [4], by which the

control packets are multicast to the whole group. As with the original data, repair









requests and retransmissions are always multicast to the whole group. The other

members that are also missing the same packet hear that request and suppress

their own request. This prevents a request implosion. A similar technique is

adopted to prevent response implosion. The repair requests are different from the

traditional NACK in that they are not addressed to a specific sender and they

request data by its unique name. In SRM, each member also multicasts periodic

session messages that report the sequence number state for active sources. The

session messages in SRM are used to determine the current participants of the

session.

From the architectural design of SRM, it can be seen that it is not scalable to a

large number of receivers. The error recovery scheme is totally different from the

other protocols discussed earlier. Separate mechanisms for the suppression of the

request and response implosion have to be provided, which could otherwise be

avoided by a proper architectural design. Also, the design of SRM results in a

problem called the crying baby problem. If a single link connecting a member of

the group has a very high error rate, then the member sends out the repair request

to the whole multicast group and receives one or more responses. Congestion in

one part of the group will lead to this problem.
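
The request suppression idea in SRM can be illustrated with a small simulation. The sketch below makes several simplifying assumptions that are not taken from the SRM paper: timers are drawn uniformly from a fixed interval, every member hears a multicast request after the same propagation delay, and all names are hypothetical.

```python
import random


def simulate_request_suppression(num_missing: int, max_backoff: float,
                                 prop_delay: float, seed: int = 0) -> int:
    """Return how many repair requests are multicast for one lost packet.

    Each member missing the packet waits a random time before requesting.
    A member suppresses its own request if it hears another member's
    request before its own timer fires.
    """
    if num_missing < 1:
        raise ValueError("at least one member must be missing the packet")
    rng = random.Random(seed)
    timers = sorted(rng.uniform(0.0, max_backoff) for _ in range(num_missing))
    first = timers[0]
    # The earliest member always sends; others send only if their timer
    # fires before the first request has reached them.
    return 1 + sum(1 for t in timers[1:] if t < first + prop_delay)
```

Under these assumptions a negligible propagation delay yields exactly one request, while a propagation delay longer than the backoff interval defeats suppression entirely.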

2.3.6 Log-based Receiver-reliable Multicast (LBRM)

The Log-based Receiver-reliable Multicast protocol [5] is a reliable multicast

protocol suited for high performance simulation applications, particularly

Distributed Interactive Simulation (DIS). It was developed to suit applications

that have requirements such as wide-area data distribution, low latency, and packet

loss detection and recovery. Holbrook et al. give detailed discussions of various

applications of LBRM, such as traffic reports, file caching, and stock quote dissemination.

In LBRM, a logging server provides the reliability by logging all transmitted

packets from the source. Any receiver that missed a data packet requests the same









directly from the logging server. The presence of the logging server is the

equivalent of any transport protocol buffering the data to serve retransmissions.

The protocol is receiver reliable in the sense that each receiving application defines

its own reliability requirements. The sender merely sends the data and makes it

possible, via the logging server, for the receiver to be able to retrieve the lost

packet. The source includes a sequence number in each packet and defines a

Maximum Idle Time (MaxIT) bound. The source guarantees that it will transmit a

packet at least once every MaxIT interval. Even if the application does not provide

the data, the protocol keeps sending some special packets called keep-alive or

heartbeat packets that repeat the previous sequence numbers and not the

associated data.
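
The way sequence numbers and heartbeat packets let a receiver detect loss can be sketched as below. This is an illustrative reading of the description above rather than LBRM's actual code: the class, its method names, and the convention that sequence numbers start at 1 are assumptions.

```python
from typing import List


class LbrmReceiver:
    """Toy receiver that detects missing packets from sequence numbers."""

    def __init__(self, max_idle_time: float):
        self.max_idle_time = max_idle_time  # MaxIT bound announced by the source
        self.highest_known = 0              # highest sequence number known to be sent
        self.last_heard = 0.0

    def on_packet(self, seq: int, now: float,
                  is_heartbeat: bool = False) -> List[int]:
        """Return sequence numbers newly detected as missing."""
        self.last_heard = now
        # A data packet seq proves that packets up to seq - 1 were sent earlier.
        # A heartbeat repeats the last sequence number, so it proves that
        # packets up to seq were sent.
        last_sent_before = seq if is_heartbeat else seq - 1
        missing = list(range(self.highest_known + 1, last_sent_before + 1))
        self.highest_known = max(self.highest_known, seq)
        return missing

    def source_silent(self, now: float) -> bool:
        """True if nothing has arrived for longer than MaxIT, which suggests
        that packets (or heartbeats) are being lost."""
        return now - self.last_heard > self.max_idle_time
```

The missing sequence numbers returned by `on_packet` are the ones the receiver would request from the logging server.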

LBRM also proposes the concept of distributed logging. The receivers are

grouped together to form sites. Each and every site has a secondary logging server

apart from the primary logging server located near the multicast source. The

presence of the secondary-logging servers makes the error recovery distributed.

Each of the secondary logging servers is responsible for the retransmission of data

packets at their site. This reduces the end-to-end propagation delay. The

secondary-logging servers can in turn recover the missing packets from the primary

logging server. The designs of many reliable multicast protocols with local error

recovery, including the framework described in this thesis, are based on the

architecture of LBRM.

2.3.7 Xpress Transport Protocol (XTP)

The Xpress Transport Protocol [4] is a high-performance transport protocol

designed to meet the needs of distributed, real-time, and multimedia systems in

both unicast and multicast environments. Although XTP was first established in

1987, it has since been revised through several versions.









The important features of XTP are the connection paradigms, control algorithms

and the group membership controls. XTP designers chose a unique

connection-oriented multicast paradigm whereby XTP sets up a one-to-many

simplex connection with a set of receivers. The functionality found in the

point-to-point unicast connections is extended in XTP. With the control

algorithms, XTP provides its users with the ability to enable or disable the

error-rate and flow-control procedures on a connection by connection basis. XTP

designers have experimented with two methods for control algorithms: a heuristic

algorithm for timer-based processing of control information and explicit processing

of the state information from each receiver in the multicast group.

XTP is principally a sender-based reliable protocol and it has the option of both

unicast and multicast. Control packets are unicast to the sender and there is no

mechanism to prevent control packet implosion. Retransmitted packets are

multicast and filtered at the sites. XTP supports a number of multicast group

management techniques. Though it has support for flow and error control, it uses

the mechanisms defined for XTP unicast.

2.4 TCP Congestion Control

This section describes the TCP congestion control mechanisms that have been

standardized and researched widely. Early implementations of TCP/IP

used the go-back-n model without any modern congestion

control algorithms. They used the cumulative acknowledgement mechanism for

acknowledging the packets and the re-transmit timer expiration for sending the

packets again when they are lost in the network. These implementations did not

help in reducing the congestion very much. Van Jacobson, one of the greatest

pioneers of congestion control in the Internet, proposed a series of algorithms for

congestion control in TCP that eventually became standardized. The algorithms

are discussed briefly as follows:









Slow start. The slow start algorithm uses a new variable called congestion

window (cwnd). It operates by observing that the rate at which new packets

should be inserted into the network is the rate at which acknowledgements are

received from the other end. The sender can send at most the minimum of the cwnd

and the receiver's advertised window (rwnd). For each ACK the sender receives,

the cwnd is increased by one segment. Increasing the cwnd by one for every ACK

results in an exponential increase of cwnd over round trips.
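
The exponential growth can be seen in a short sketch that counts segments sent per round trip. It assumes an idealized connection (no loss, one ACK per segment, no delayed ACKs); the function name and parameters are hypothetical.

```python
from typing import List


def slow_start_rounds(initial_cwnd: int, rwnd: int, rounds: int) -> List[int]:
    """Return the number of segments sent in each round trip during slow start."""
    cwnd = initial_cwnd
    sent_per_round = []
    for _ in range(rounds):
        window = min(cwnd, rwnd)   # the sender is limited by both windows
        sent_per_round.append(window)
        cwnd += window             # one ACK per segment, each ACK adds one segment
    return sent_per_round
```

Starting from one segment with a large advertised window, the rounds carry 1, 2, 4, 8, 16 and 32 segments; with an advertised window of 10 segments the growth is capped at 10.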

Congestion avoidance. The congestion avoidance uses another variable called a

slow start threshold (ssthresh). It indicates the correct window size depending

upon the network load. Initially the slow start phase begins. As long as the cwnd

is less than the ssthresh, the slow start continues. Once the cwnd crosses the

ssthresh, the "congestion avoidance" phase starts. For each ACK received, the

cwnd is increased by 1/cwnd segments. When the sender times out waiting for an

ACK, ssthresh is set to half of the current window (the minimum of cwnd and the

receiver's advertised window), but to at least two segments.

The cwnd is then set to one segment and the slow start phase begins again. Slow start

remains active as long as the cwnd is less than the ssthresh. Otherwise, the congestion

avoidance phase begins.
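
The interaction between slow start, congestion avoidance and a timeout can be summarized in two small update rules. This is a sketch in units of segments, not a full TCP implementation. The timeout rule halves the current window (the smaller of cwnd and rwnd) with a floor of two segments, following the wording of RFC 2001; the function names are hypothetical.

```python
from typing import Tuple


def on_ack(cwnd: float, ssthresh: float) -> float:
    """Return the new cwnd after one ACK arrives."""
    if cwnd < ssthresh:
        return cwnd + 1.0          # slow start: one segment per ACK
    return cwnd + 1.0 / cwnd       # congestion avoidance: about one segment per round trip


def on_timeout(cwnd: float, rwnd: float) -> Tuple[float, float]:
    """Return (new_cwnd, new_ssthresh) after a retransmission timeout."""
    ssthresh = max(min(cwnd, rwnd) / 2.0, 2.0)
    return 1.0, ssthresh           # restart from one segment in slow start
```

For example, a timeout with a cwnd of 16 segments and a large advertised window leaves ssthresh at 8 segments and cwnd at 1 segment.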

Fast retransmission. When the TCP receiver receives an out-of-order segment, it

immediately sends a duplicate acknowledgement. The duplicate ACK indicates

the next segment the receiver is expecting and asks the sender to transmit it.

The duplicate ACK may be caused by the loss of a segment

or by re-ordering of the segments. If the duplicate ACK was due

to the re-ordering of segments, then there will typically be only one or two duplicate ACKs.

If there are three or more duplicate ACKs, a segment has most likely been lost,

and the sender retransmits the missing segment without waiting for the retransmit

timer to go off. TCP then reduces the ssthresh to cwnd/2 and resets the cwnd to

one segment.
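
The duplicate-ACK rule can be sketched as a small counter on the sender side. The class and method names are hypothetical, and the sketch covers only the detection step, not the window adjustment that follows.

```python
from typing import Optional


class DupAckDetector:
    """Signals a fast retransmit on the third duplicate ACK."""

    THRESHOLD = 3

    def __init__(self) -> None:
        self.last_ack: Optional[int] = None
        self.dup_count = 0

    def on_ack(self, ack_no: int) -> bool:
        """Return True when the segment starting at ack_no should be
        retransmitted without waiting for the retransmit timer."""
        if ack_no == self.last_ack:
            self.dup_count += 1
            # Trigger exactly once, on the third duplicate.
            return self.dup_count == self.THRESHOLD
        self.last_ack = ack_no
        self.dup_count = 0
        return False
```

Feeding the ACK numbers 100, 200, 200, 200, 200 into the detector triggers a retransmission on the last one, which is the third duplicate of ACK 200.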









Fast recovery. The fast recovery algorithm prevents the communication path from

going empty after fast retransmit. Therefore, there is no need to slow start from

the beginning after the fast retransmit. Fast recovery keeps track of the number of

duplicate acknowledgements and tries to estimate the amount of outstanding data

in the network. It will increase the cwnd by one segment for each duplicate

acknowledgment received, thus maintaining the flow of traffic. The sender comes

out of the fast recovery when it receives an ACK for the segment whose loss

resulted in the duplicate acknowledgements. TCP then deflates the window by

returning it to the ssthresh and enters the congestion avoidance phase.
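
The window inflation and deflation during fast recovery can be sketched as follows, in units of segments. Two details are assumptions beyond the description above: on entering recovery the window is set to ssthresh plus three segments (as in RFC 2581), and cwnd is used in place of the amount of outstanding data when halving. The class name is hypothetical.

```python
class RenoFastRecovery:
    """Toy model of the Reno fast recovery window adjustments."""

    def __init__(self, cwnd: float, ssthresh: float):
        self.cwnd = cwnd
        self.ssthresh = ssthresh
        self.in_recovery = False

    def on_third_dup_ack(self) -> None:
        """Enter fast recovery after the fast retransmit."""
        self.ssthresh = max(self.cwnd / 2.0, 2.0)
        self.cwnd = self.ssthresh + 3.0   # three segments have left the network
        self.in_recovery = True

    def on_dup_ack(self) -> None:
        """Each further duplicate ACK inflates the window by one segment."""
        if self.in_recovery:
            self.cwnd += 1.0

    def on_new_ack(self) -> None:
        """The ACK for the lost segment deflates the window to ssthresh."""
        if self.in_recovery:
            self.cwnd = self.ssthresh
            self.in_recovery = False
```

Starting from a cwnd of 16 segments, the window becomes 11 on entering recovery, grows to 13 after two more duplicate ACKs, and is deflated to 8 when the lost segment is acknowledged, at which point congestion avoidance resumes.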

Slow start, congestion avoidance, and fast retransmission together form TCP

Tahoe; TCP Reno adds the fast recovery mechanism.

2.5 Existing QoS Approaches

The Internet today provides only best-effort service. As discussed in the

introduction, with the growing demand for real-time and non-real-time

multicast applications, the demand for quality of service has greatly increased.

Although the reliable transport protocol discussed in this document aims at

providing reliability, it would be even better if the network layer were to combine

with the transport protocol to enhance the same.

The IETF has proposed a number of mechanisms to meet the demand for QoS.

The popular ones are the Integrated Services/RSVP, Differentiated Services,

Multiprotocol Label Switching (MPLS), traffic engineering and constraint-based

routing [17]. Here, the QoS model of Differentiated Services is discussed in detail,

including how it can be incorporated.

Integrated services/RSVP. The Integrated Services/RSVP model proposes two

service classes in addition to the best-effort service: Guaranteed Service and

Controlled-load service. The RSVP (Resource ReSerVation Protocol) is a signaling

protocol that reserves resources in the network for flows. Routers in the network










use the PATH and RESV messages to reserve resources. Differentiated Services,

MPLS and other QoS mechanisms are replacing the once-popular Integrated

Services, largely due to the following reasons. Integrated Services places a huge

burden on the routers due to the amount of state information and processing

capabilities it requires. High storage and processing overhead are the major drawbacks

involved.

Differentiated services. Unlike RSVP, there is no signaling mechanism in

Differentiated Services, thus eliminating the QoS setup costs. The QoS

requirements are obtained by modifying the TOS (Type of Service) field in the IP

header to a field called Differentiated Services (DS) field. The elements of the DS

field are shown in Figure 2.1. Applications make use of the DS field to mark the

packets according to their requirements. It is the job of the DiffServ architecture to

deliver the packets to the receiver application. Different multicast applications have

different requirements such as time or delay sensitivity. Hence they mark the DS

packets as per their requirements and the DiffServ architecture takes care of the

operation.

IHL | XXXX | Total Length
Identification | Flags | Fragment offset
TTL | Protocol | Header checksum
Source address
Destination address
Option + Padding
Data

XXXX = DS Field: DiffServ Field | Unused

Figure 2.1: Differentiated Service (DS) Field for TOS in IP Header
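
Marking a packet amounts to writing a codepoint into the DS byte shown in Figure 2.1. The sketch below assumes the layout defined in RFC 2474, where the DiffServ codepoint occupies the upper six bits and the lower two bits are unused; the function names are hypothetical, and the value 46 is the Expedited Forwarding codepoint, used here only as an example.

```python
def make_ds_byte(dscp: int) -> int:
    """Place a 6-bit DiffServ codepoint in the upper bits of the DS byte."""
    if not 0 <= dscp <= 0x3F:
        raise ValueError("DSCP must fit in six bits")
    return dscp << 2               # the two low-order bits stay zero (unused)


def dscp_from_ds_byte(ds_byte: int) -> int:
    """Recover the codepoint from a DS byte, ignoring the unused bits."""
    return (ds_byte >> 2) & 0x3F


EXPEDITED_FORWARDING = 46          # binary 101110
```

An application would write the resulting byte into the former TOS position of the IP header, and routers along the path would read the codepoint back to select the forwarding treatment.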

The DiffServ architecture consists of entities like routers that perform the

following functions: classifying, metering, shaping and dropping. The









customer/client should have an agreement called the Service Level Agreement

(SLA) with the Internet Service Provider (ISP) to receive the Differentiated

Services. Customers mark the DS fields of individual packets to indicate the

desired service. Edge routers of the network classify, police and shape the packets

according to the SLA. Core routers just forward the packets based on the marking

of the DS field. The various functions done by the router are discussed below.

Classifying: The edge router looks at each packet and identifies the flow to
which it belongs. It does this by looking at the DS field.

Metering: After classifying the flow, its resource consumption should be
measured. Measuring the traffic is also important for billing information.

Shaping: The flow may occasionally include some bursts that must be
absorbed and the packets must be paced. It is up to the router to decide on
the mechanism to hold the burst. It may hold or even drop the packets in the
burst that exceed the particular threshold.

Dropping: Whenever the flow exceeds the SLA, a router may choose to drop
the packets. Depending upon the type of service subscribed, Assured Service
or Premium Service, the packets are handled accordingly. Queue
management schemes such as Random Early Detection (RED) and RED
with In and Out (RIO) are employed for dropping the
packets. In RED, the packets are dropped randomly. RIO is a modified RED
algorithm. RIO maintains two RED algorithms, one for the in packets and
another for the out packets. There are two thresholds for the queue. When
the queue size is below the first threshold, no packets are dropped. When the
queue size is between the two thresholds, only the out packets are dropped
randomly. In extreme congestion, when the queue size exceeds the second
threshold both the out and in packets are dropped randomly.
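
The three regions of the RIO drop decision described above can be sketched as a single function. This is a deliberate simplification: an actual RIO queue works on averaged queue lengths and uses separate RED parameters for in and out packets, whereas the sketch uses the instantaneous queue size and one fixed drop probability. All names and parameters are hypothetical.

```python
import random


def rio_should_drop(queue_size: int, is_in_packet: bool,
                    first_threshold: int, second_threshold: int,
                    drop_prob: float, rng=random) -> bool:
    """Decide whether to drop an arriving packet."""
    if queue_size < first_threshold:
        return False                       # light load: nothing is dropped
    if queue_size < second_threshold and is_in_packet:
        return False                       # moderate load: only out packets are at risk
    # Out packets between the thresholds, and all packets above the
    # second threshold, are dropped at random.
    return rng.random() < drop_prob
```

In-profile traffic is therefore protected until the queue passes the second threshold, which is how the router favors packets that conform to the SLA.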

Hence the multicast applications could specify the level of reliability required

through the marking of the DS fields. In addition to the transport level reliability

provided by the protocol, enhanced reliability could be provided by the network

layer using the Differentiated Services as discussed above.

2.6 IETF Approaches

The Internet Society (ISOC) is a global not-for-profit membership organization

founded in 1991 to provide leadership in Internet related standards, education, and









policy development. The Internet Engineering Task Force (IETF) is an

organization which is maintained by ISOC. The IETF is a large open international

community of network designers, operators, vendors, and researchers concerned

with the evolution of the Internet architecture and the smooth operation of the

Internet. The IETF is divided into various working groups (WGs) that work on

different areas of the Internet, working towards their standardization. Although a

number of reliable multicast transport protocols have been developed, most of them

are used in research while some of them have been released as commercial products.

None of the reliable multicast transport protocols have been standardized.

The Reliable Multicast Transport (rmt) working group is chartered to standardize

reliable multicast transport protocols for "one-to-many bulk data" transport. This

group works closely with other IETF groups such as the Secure Multicast research

group (smug) and the Multicast Security working group (msec). The rmt working

group is currently working on the following three protocol instantiations:

NORM: Nack Oriented Reliable Multicast protocol, which uses NACKs for
reliability;

TRACK: TRee ACKnowledgement-based protocol, which uses a tree
structure for controlling feedback and repairs;

ALC: Asynchronous Layered Coding, which uses Forward Error Correction
(FEC) techniques and does not require any feedback.

A number of Internet drafts and RFCs have already been published in the above

mentioned areas. The much-researched areas in reliable multicasting by IETF

include congestion/flow control measures, efficient retransmission, and

acknowledgement aggregation. There are numerous publications in the literature on

the same.

IETF has framed some technical criteria [18] for Reliable Multicast transport

protocols. The criteria include addressing the following key issues:









Scalability: The ability of the protocol to accommodate a large number of
senders and receivers and the mechanisms that limit the scalability;

Congestion Control: Description of the congestion control mechanisms that
are incorporated and their behavior during congestion;

Error Recovery and Robustness: Description of how the protocol handles the
packet loss and node/link failures;

Security and Privacy concerns: Analysis of security at the senders, receivers,
routers and retransmission sources along with data integrity and
authentication.

The framework proposed in this thesis addresses all of the issues mentioned

above except the security aspects. The protocol is scalable to a large number of

receivers by its architectural design. Logical grouping of receivers localizes the error

recovery, and the framework also describes a congestion control scheme that operates on

congestion algorithms similar to those of TCP. The security issues are just mentioned and

not discussed in detail.

2.7 Summary

Due to the strong application demand for reliable multicasting, there is great

interest in its standardization. It therefore continues to be an

active research area in the Internet community.

This chapter discussed the various multicasting techniques and their applications.

Reliable multicast protocols may be broadly classified as sender-initiated or

receiver-initiated protocols. The chapter also reviewed the various reliable

multicast transport protocols in the literature.

LGMP, RMTP and TRAM protocols share the technique of local grouping of

receivers in their designs. The local error recovery design of most of the reliable

multicast protocols was inspired by the design of LBRM. Although LBRM proposes

a distributed logging system with the receivers grouped as a site, it does not

mention any of the group management techniques as in TRAM. RMTP groups the









receivers locally leading to the formation of several local multicast trees, which

together form a global multicast tree. Group management techniques are not

discussed in detail in RMTP either. The local groups of receivers have a special

receiver called repair head, responsible for retransmissions and recovery. TRAM is

rich in group management schemes and has techniques for continually optimizing

the multicast tree structure. MTCP was developed primarily to implement a

congestion control strategy for reliable multicasting. XTP, a sender-based

protocol, was developed for distributed, real-time and multimedia systems.

SRM is not hierarchical and does not provide logical groupings either. The

architectural design of SRM makes its scalability very limited.

The next chapter discusses the various issues in reliable multicasting, which make

standardization difficult. In Chapter 4, the framework of the

protocol is proposed.















CHAPTER 3
ISSUES IN RELIABLE MULTICASTING

Design of a reliable multicast protocol is very complex when compared to the

design of unicast communication protocols. There are several issues to be addressed

in group communication. Listed below are some of the

important problems encountered with reliable multicasting.

3.1 ACK Implosion problem

This is the most important problem to be dealt with in the design of a reliable

multicast transport protocol. The sender receives feedback or acknowledgements

from the receiver to update and control its regulation parameter for reliability and

congestion control. In unicast communication, the sender receives its feedback just

from the single receiver present. In the case of multicasting, the feedback messages

received from the multiple receivers may converge on the source. This is called the

feedback implosion problem. There are a few ways to deal with this problem. Some

of the existing protocols propose negative acknowledgement (NACK) to be sent to

the sender, instead of a positive ACK, so that the sender receives only a limited

number of responses from the receivers. If the multicast tree is large and the error

probability is relatively high, there would be many losses and the NACKs sent

would still cause implosion. Hence there has to be some way of suppressing the

NACKs too. Two ways to deal with the implosion problem are to use a

local grouping scheme and suppression techniques. Figure 3.1 depicts a typical

case of the ACK implosion problem. The implosion at the source could also be

caused by the NACKs. Hence it is generally called the feedback implosion

problem rather than the ACK implosion problem.
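
As a rough illustration of the argument above, the expected number of feedback messages converging on the source for one data packet can be estimated. The sketch assumes independent losses with the same probability at every receiver, which the text does not state, and all function names and figures (10,000 receivers, 100 local groups) are hypothetical.

```python
def acks_at_source(num_receivers: int) -> int:
    """Positive-ACK scheme: every receiver acknowledges every packet."""
    return num_receivers


def expected_nacks_at_source(num_receivers: int, loss_prob: float) -> float:
    """NACK scheme: a receiver reports only when it misses the packet.

    Assumption for illustration: losses are independent and equally likely.
    """
    return num_receivers * loss_prob


def feedback_at_source_with_groups(num_groups: int) -> int:
    """Local grouping: only one controlling entity per group reports upward."""
    return num_groups


if __name__ == "__main__":
    n = 10_000
    print("ACKs per packet:", acks_at_source(n))
    print("Expected NACKs at 1% loss:", expected_nacks_at_source(n, 0.01))
    print("Expected NACKs at 20% loss:", expected_nacks_at_source(n, 0.20))
    print("Feedback with 100 local groups:", feedback_at_source_with_groups(100))
```

Under these assumptions, NACKs help at low loss rates, but at high loss rates in a large tree the source still receives thousands of messages, whereas local grouping bounds the feedback by the number of groups.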









LGMP, RMTP and TRAM, discussed in the previous chapter, group the receivers

to form small local groups. The receivers in the local group send their ACKs to the

entity that controls the group and not to the sender. Hence the sender of the group

is not bombarded by ACKs when the data are received properly. In all three

protocols, the ACK implosion problem is avoided by the design of the protocol.

MTCP has a hierarchical structure with the receivers arranged in a tree-like

fashion. There is no grouping concept in MTCP and the receivers send their ACK

to their immediate parent. The parents send their ACK to their corresponding

parent and so on. Hence, the source will get the ACK only from the level one

receivers and this avoids the ACK implosion problem. SRM does not provide

grouping or any hierarchical structure for the receivers. SRM does not ACK the

data packets but the NACK packets are multicast to everyone. Since NACK is sent

only for a missing packet, it does not always lead to the implosion problem. In LBRM

too, no ACK packets are sent to the source. LBRM sends the NACK packets

specifically to the source. XTP unicasts the control packets to the source and there

is no mechanism to avoid the implosion at source.










Figure 3.1: ACK Implosion at Multicast Source (legend: multicast source, multicast receivers, multicast data, data packet acknowledgements)









3.2 Error Recovery

There are usually two types of error recovery [19]: centralized error recovery

(CER) and distributed error recovery (DER). In CER the retransmissions are

performed by the source of the multicast tree. CER is also referred to as source-based

recovery. In DER, all the members of the multicast group perform the

retransmissions. Hence the burden of the error recovery processing is distributed from the

source to all the members of the multicast tree. DER is found to outperform

CER, because the source may not always have sufficient processing power or buffer

space to support error recovery, especially when the number of receivers is very

large, which is typical of most of the multicast applications. Also for reasons such

as fault tolerance, distributed error recovery is recommended compared to

centralized error recovery [19]. Kasera et al. have shown that local recovery has the

potential to provide significant performance gains in terms of reduced bandwidth

and delay, and higher throughput [20].

LGMP, TRAM and RMTP provide a distributed error recovery mechanism

through the local grouping of receivers. In LGMP, any member of the group

performs retransmissions and the packet is requested from the source only when

none of the members have it. In TRAM, the repair head performs the

retransmissions to the group members whereas a designated receiver is responsible

for error recovery in RMTP. MTCP has a hierarchical structure of receivers with

each internal node called a Service Agent (SA). SAs are responsible for error

recovery, and an SA requests a packet from its parent if it does not have it. In SRM,

the repair or retransmission requests are multicast to the whole group. All the

members hear the request and any other member that is missing the same packet

would suppress its request on hearing the same request by another member.

Retransmissions are also multicast to the whole group, and on seeing a

response other members suppress their own. LBRM proposes two types









of error recovery. The protocol has a logging server adjacent to the sender and all

the receivers request missing packets from the logging server. LBRM also provides

distributed logging by grouping a set of receivers into a "site". Every site has a

secondary logging server that provides the error recovery for the receivers in the

particular site. Hence, in distributed logging, the recovery becomes localized. In

XTP, the repair requests and retransmissions are multicast to the whole group

similar to SRM.

3.3 Congestion Control

Congestion control mechanisms in multicasting remain one of the most widely

researched areas. In multicast, unlike unicast, multiple

receivers are involved, and effective congestion control relies on accurate and timely

feedback on the prevalent network conditions. The challenge lies in how

economically, speedily and accurately the feedback information can be collected.

Golestani and Sabnani [21] have discussed in detail the various issues to be

considered by a multicast congestion control scheme. The two major components in

the congestion control structure are the regulation parameter and the regulation

algorithm. A regulation parameter is a parameter by which the flow of traffic onto

the network is regulated, which may be either the rate at which the data is

transmitted or the window size. A regulation algorithm is the algorithm by which

the regulation parameter is adjusted. In the standard TCP congestion control

mechanism, the regulation parameter is the window size and the regulation

algorithm is the combination of the standard congestion control algorithms which

were proposed by Van Jacobson, such as slow start and congestion avoidance. Most of

the congestion control schemes today adopt the regulation parameter as either the

transmission rate or the window size. There are difficulties in extending

window-based regulation to multicast communications because of concerns such as

the ACK implosion problem. The major drawback in implementing rate-based









regulation is the necessity to calculate the receiver round trip times. The

measurement of receiver round trip times is fundamentally different and more

complex than in unicast communication. In unicast communication, where there

is a single receiver, the round trip time is measured easily. In multicast

communication, with multiple receivers involved, the round trip time must be measured for each

receiver. In multicast protocols with logical groupings, the receivers must measure

their distances from the leader of the group.

The main difference between the congestion control schemes implemented in

unicast and multicast communication is the place at which the regulation algorithm

is run. Accordingly, congestion control schemes may be categorized as source-driven

or receiver-driven. In source-driven unicast communication schemes, the task of

updating the regulation parameter is left to the source. The source upon getting

the feedback from the receiver is the ideal site to run the regulation algorithm. In

multicast communication, Golestani [21] discusses the following problems.

In multicast sessions, due to the presence of a large number of receivers, the
complexity of performing the traffic regulation is also increased. Unlike the
unicast methodology, if the execution of the regulation algorithm is left solely
to the source, the source's processing capability could severely limit it.

In the protocol discussed in this thesis, many retransmissions are not
performed by the source; rather, they are performed by the Group
Leader. Hence the loss information pertaining to the various receivers is not
always available to the source.

In the current hierarchical local grouping architecture, the number and
identity of the receivers are not known to the source.

The congestion control decisions are not always implemented by the source. If
a receiver decides to drop out of the group, the receiver alone must implement
it.

Another important factor to be taken into account is the fact that the natural

approach for congestion control is to adopt TCP's congestion control algorithms for









reacting to network congestion. Floyd et al. [22] have proposed a guideline as

follows.


For any link, the traffic arrival rates for a flow should respond to
congestion in a way that is no more aggressive than multiplicative
decrease, additive increase, with increase and decrease rates that give
behavior that is no more aggressive than current implementations of
TCP.

Considering the above-mentioned factors, the real need of the congestion control

in reliable multicast is to shift the tasks as much as possible to the receivers. It

should be a receiver-driven approach with the receivers sending the feedback about

congestion to the source. The control algorithms should also be TCP-friendly for

reasons as discussed above.
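
The guideline quoted above can be illustrated with a minimal additive-increase, multiplicative-decrease (AIMD) update of a rate-based regulation parameter. This is only a sketch: the function name, the increase step of one unit, the decrease factor of one half and the minimum rate are illustrative assumptions modeled on TCP-like behavior, not values taken from [22].

```python
def aimd_update(rate: float, congested: bool,
                increase: float = 1.0, decrease_factor: float = 0.5,
                min_rate: float = 1.0) -> float:
    """Return the next value of the regulation parameter.

    Additive increase while no congestion is reported,
    multiplicative decrease as soon as congestion is reported.
    All default values are assumptions chosen for illustration.
    """
    if congested:
        return max(min_rate, rate * decrease_factor)
    return rate + increase


# Hypothetical feedback sequence: two quiet rounds, one congestion report,
# then one more quiet round.
rate = 10.0
history = []
for congested in [False, False, True, False]:
    rate = aimd_update(rate, congested)
    history.append(rate)
```

A scheme whose increase is larger, or whose decrease is gentler, than what TCP would apply under the same feedback would be more aggressive than the guideline allows.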

The architectural design of the multicast protocol has a major effect on the

design of the algorithms that aid in sending the feedback from receivers to the

sender. Golestani and Sabnani have proposed a hierarchical consolidation of

receiver feedback in the multicast tree. Each receiver is responsible for

consolidating the feedback received from its immediate children and sending the

result upward to its parent. At each level inside the tree, the receiver computes an

aggregate feedback parameter at its level based on the feedback received from the

level below it. The computed aggregate feedback parameter is then sent upward

towards its parent until it reaches the source finally.
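
The hierarchical consolidation described above can be sketched as a recursive aggregation over the multicast tree. The text does not say how the aggregate feedback parameter is computed; the sketch assumes, purely for illustration, that each node reports the worst (highest) loss rate observed in its subtree. The `Node` class and its fields are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A receiver in the multicast tree (hypothetical structure)."""
    name: str
    own_loss_rate: float
    children: List["Node"] = field(default_factory=list)


def aggregate_feedback(node: Node) -> float:
    """Consolidate the feedback of the subtree rooted at `node`.

    Assumed aggregation rule: the maximum loss rate seen in the subtree.
    """
    child_reports = [aggregate_feedback(child) for child in node.children]
    return max([node.own_loss_rate] + child_reports)


# A small example tree: the source has two immediate children.
leaf_a = Node("a", 0.01)
leaf_b = Node("b", 0.08)
inner = Node("inner", 0.02, [leaf_a, leaf_b])
leaf_c = Node("c", 0.03)

# The source receives one consolidated report per immediate child,
# regardless of how many receivers sit below that child.
reports_at_source = [aggregate_feedback(inner), aggregate_feedback(leaf_c)]
```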

LGMP does not mention any congestion control mechanisms. TRAM specifies

ways in which the receivers in the group send notification about congestion to the

source. Upon the reception of the congestion notification, the source reduces its

transmission rate. TRAM specifies a minimum and maximum transmission rate

within which the protocol operation takes place. On reception of congestion

reports, TRAM reduces its transmission rate by 50%.









3.4 Scalability

With the multimedia applications growing tremendously, the number of receivers

in the multicast groups can be expected to increase proportionately. Hence the

protocol should be designed in such a way that it can handle the increase in

number of receivers and still provide the same level of service. Also, the multicast

receivers should be able to join or leave the group whenever they wish and the

source should not even be aware of these activities. The architectural design of the

protocol plays a major role in deciding the scalability of the protocol.

LGMP, TRAM and RMTP combine a set of receivers to form local groups. Due

to the logical grouping of receivers, the protocol is scalable to a large number of

receivers. MTCP does not provide any grouping of receivers, but the receivers form

a hierarchical tree-like structure with the sender rooted on top of the tree. The

scalability of MTCP is limited due to its hierarchical nature. SRM provides neither

grouping nor hierarchical structure for the receivers. The receivers just form one

group under the sender, which severely limits its scalability. SRM was mainly

developed for "white board" applications, where the number of receivers is limited.

Distributed logging in LBRM is scalable to a large number of receivers as they are

grouped together to form "sites". XTP does not provide any grouping of receivers

and its scalability is limited.

3.5 Fairness

A particular concern for the developers of reliable multicast protocols is the

impact of reliable multicast traffic on other traffic in the Internet at times of

congestion, in particular the effect of reliable multicast traffic competing with TCP

traffic [18]. The protocol should address the fairness issue. There are many possible

ways to define fairness. One type of fairness is global fairness. Under this

definition, each entity has an equal claim to the network's scarce resources. The

multicast protocol should ensure fair sharing of network resources with other










well-behaved protocols. It should ensure fairness with other multicast and unicast

traffic. The multicast protocol should behave and back off in a way similar to TCP

in case of congestion. This is called TCP friendliness. Hence the reliable multicast

protocol should be designed in such a way that it is fair.


Protocol   ACK Implosion   Error Recovery   Congestion Control   Scalability

LGMP   DER   X

TRAM   DER   /   /

RMTP   DER   X

MTCP   DER   X

SRM   X   DER   X   X

LBRM (Primary)   CER

LBRM (Secondary)   DER

XTP   X   DER   X   X

TCP   /

/ Handles Well   X Suffers from this   Not Applicable
DER Distributed Error Recovery   CER Central Error Recovery

Figure 3.2: Comparative Analysis of Reliable Multicast Protocols


3.6 Summary

This chapter discussed various important issues involved in reliable multicasting.

The implosion problem is a primary problem to be addressed in a reliable multicast

protocol. The acknowledgement sent by multiple receivers should not bombard the

sender. The architectural design of the protocol has a major impact upon the

implosion problem. The error recovery protocol could be centralized or distributed.









Distributed error recovery reduces the load of sender and provides efficient

retransmission mechanisms. The reliable multicast protocol should respond to the

congestion state in the network through congestion control algorithms. The

multicast protocol should also be scalable to a large number of receivers.

From the above issues it can be seen that the design of a reliable multicast protocol

is complex when compared to the design of unicast protocols. The architectural

design of the protocol plays an important role and has compounding effects on

issues such as the implosion problem, error recovery and scalability. This chapter also

evaluated the various reliable multicast protocols in the literature with respect to

the implosion problem, error recovery, congestion control and scalability. Figure 3.2

shows the comparative analysis of LGMP, TRAM, RMTP, MTCP, SRM, LBRM,

and XTP. The analysis also depicts the relation of TCP with the reliable multicast

protocols.















CHAPTER 4
PROTOCOL FRAMEWORK

4.1 Goals

4.1.1 ACK Handling

One of the major design decisions of any reliable multicast protocol is how to

overcome the ACK implosion problem. The protocol framework discussed in this

thesis has an architecture designed in a way that will avoid the problem

automatically. The multicast receivers are grouped locally, and each group has a local leader, called

the Group Leader (GL), that is in charge of error

recovery and local retransmissions. Instead of the receivers sending positive ACKs

back to the source, they send NACKs only to the Group Leader whenever they

detect the loss of a packet. Hence the source is never flooded with

NACKs either. All the source receives is the ACKs from the Group Leaders. This way,

the ACK implosion problem is mitigated.

4.1.2 Error Recovery

As the multicast receivers are grouped locally, error recovery is taken care of by the

Group Leader which makes it Distributed Error Recovery (DER) [19]. The Group

Leader locally handles any re-transmissions. Hence the error recovery time is

minimized greatly, as the re-transmission requests do not have to propagate all the

way up to the source every time a packet is lost. This results in speedy recovery

and distribution of the processing throughout the multicast tree. The end-to-end

delay of the retransmission request and repair is reduced.

4.1.3 Congestion Control

Many of the earlier protocol implementations for reliable multicasting do not

specifically address the congestion control techniques. The protocol framework









proposed in this thesis follows the TCP congestion control algorithms closely. The

algorithms are implemented and handled locally in the groups by the Group

Leader. The Group Leaders maintain a TCP-like congestion window (cwnd).

The Group Leader increments the cwnd by one only when it receives an ACK from each

of its children. Until then, the Group Leader has to buffer the packets. As soon as

the ACKs are received, the buffers are released. The algorithms are similar to the

standard TCP congestion control algorithms like slow start and congestion

avoidance. Details of the algorithm are specified in Section 4.5. The protocol also

proposes a scheme where a congestion feedback is propagated up in the multicast

tree hierarchy.
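
A minimal sketch of the Group Leader behavior described in this section follows. It covers only what is stated here: a packet stays buffered until every child has acknowledged it, and the congestion window (cwnd) grows by one when that happens. The class and method names are hypothetical, and the slow start and congestion avoidance details of Section 4.5 are not modeled.

```python
class GroupLeaderWindow:
    """Buffering and cwnd bookkeeping at a Group Leader (illustrative only)."""

    def __init__(self, children, cwnd=1):
        self.children = set(children)
        self.cwnd = cwnd
        self.buffer = {}    # sequence number -> payload kept for repairs
        self.pending = {}   # sequence number -> children that have not ACKed

    def on_data(self, seq, payload):
        """Buffer a packet received from upstream until all children ACK it."""
        self.buffer[seq] = payload
        self.pending[seq] = set(self.children)

    def on_ack(self, child, seq):
        """Record an ACK; release the buffer and grow cwnd once all have ACKed."""
        waiting = self.pending.get(seq)
        if waiting is None:
            return  # unknown or already released packet
        waiting.discard(child)
        if not waiting:
            del self.pending[seq]
            del self.buffer[seq]
            self.cwnd += 1


gl = GroupLeaderWindow(children=["r1", "r2"])
gl.on_data(1, b"data")
gl.on_ack("r1", 1)
cwnd_after_first_ack = gl.cwnd   # unchanged: r2 has not acknowledged yet
gl.on_ack("r2", 1)               # now every child has acknowledged packet 1
```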

4.1.4 Scalability and Dynamic Adaptation

Since the receivers are grouped locally with a leader acting on their behalf, the

protocol is scalable to a large number of receivers. A new receiver who wishes to

join in the multicast could join with any of the existing local groups or it could

form its own group announcing itself as the Group Leader. Hence the multicast

tree grows hierarchically in groups as the receivers increase in number. Also, a

receiver may leave its group any time it wishes and all this may happen without

the knowledge of the multicast source.

The most important feature of this protocol is the dynamic way in which the

members of the group may re-arrange themselves in response to changing network

conditions. The Group Leaders exchange special control messages that enable them

to do so. Some control messages are multicast to the group too, so that the

members of the local groups also have updated information about the state of the

multicast group.

4.2 Architecture

The basic architectural design of the protocol is shown in Figure 4.1. The

multicast receivers are grouped locally to form small groups called the local groups.









Each group contains a special node called the Group Leader (GL). The main

motives behind the local groupings are to increase the scalability of the protocol

and to localize error recovery. The Group Leader is responsible for

acknowledgement processing and buffering of the packets. It also handles

processing of the control messages from the multicast receivers, used for congestion

control and group management, and passes these messages up to the multicast

source. As the error recovery is localized within each of the local domains, the

Group Leader is responsible for local re-transmissions. As it contains the buffered

data, it would be able to re-transmit the data requested by any of its children. In

all, the Group Leader plays a major role in the functioning of the protocol. In

Figure 4.1, the circled dots represent the logical local groups. The buffering at the

Group Leaders is only temporary whereas the central logging server adjacent to the

source provides a permanent buffering. In addition to multicasting the packet to

the group, the source sends the packets to the central logging server, which retains

them permanently.

All the local groups together form a tree-like hierarchical structure as shown in

Figure 4.1. Hence the source of the multicast tree is directly connected to the

Group Leaders and the Group Leaders in turn are connected to other Group

Leaders and the multicast receivers. The Group Leaders receive ACKs and NACKs

from their children, the multicast receivers in the local group and the Group

Leaders attached below, if any. The source receives ACKs and NACKs from its

immediate children, which are Group Leaders themselves. Due to this hierarchical

structure, the protocol is scalable to a very large number of receivers.

4.3 Multicast Data Transfer

The following sections discuss in detail the various functions of the protocol such

as the multicast data transfer, acknowledgement and local retransmission schemes,

congestion control and the group management techniques.





























Figure 4.1: Overall Architectural Design of the Protocol (legend: multicast source, local groups, group receivers, central logging server, group leaders)

The architecture of the protocol is a hierarchical tree-like structure with
local groupings, as discussed in the previous section. This structure reduces
the end-to-end delay of transmission and re-transmission. The multicast data
transfer works as follows. The source first multicasts the data
globally to all the members of the multicast tree, including the Group Leaders and
the individual receivers. The Group Leaders that are logically connected to the
source send their ACKs and control messages directly to it. Each Group Leader
receives ACKs and NACKs from its children: the receivers in the local group and
any Group Leaders attached directly below it. The receivers do not bypass the
Group Leader to send ACKs directly to the source. Hence the source is never
bombarded by ACKs from the receivers, which avoids the ACK implosion problem.










The Group Leader retransmits locally any missing data units requested. If the

Group Leader does not have the data packet, it requests a retransmission

from its parent Group Leader. The Group Leader either multicasts or unicasts the

missing data, and it must also buffer the packets until it receives an

acknowledgement from all of its children. The source multicasts the data unit as

long as it has room in its sending window. The flow of data packets and

acknowledgements is shown in Figure 4.2.





Figure 4.2: Multicast Data Transfer (legend: source, receiver, group leader, multicast data message, unicast ACK message)

The multicast data transfer from the source to the multicast group and the

whole multicast operation takes place in the following order.

1. Source multicasts the data unit globally to the entire multicast tree, including
the Group Leaders, central log server and the receivers.

2. The Group Leaders attached directly to the source send their ACKs
and NACKs to the source.

3. The central log server, which is also attached directly to the source, sends its
ACKs and NACKs to it.









4. The Group Leaders receive the ACKs and NACKs from their children which
may include the local receivers and any Group Leaders attached directly
below them.

5. The source performs re-transmissions to its immediate children, the Group
Leaders as well as the central log server.

6. The Group Leaders perform local re-transmissions to their children.

7. The central log server performs any re-transmissions if requested.

8. Steps 1 through 7 are repeated for protocol operation.

4.4 Acknowledgement and Error Recovery Mechanism

In his paper [19], Nonnenmacher has shown that for Distributed Error Recovery

protocols, the performance of the protocol is high with a local multicast

retransmission and local feedback-processing scheme. As discussed in the previous

sections, one of the major problems encountered with reliable multicast design is

the ACK implosion problem. This problem is unavoidable in the sender-initiated

reliable multicast protocols, wherein all the receivers send their positive ACKs to

the sender, virtually swamping it.

The protocol discussed in this thesis is a receiver-driven approach with a slight

modification. Instead of each receiver maintaining the state information, the Group

Leader alone maintains the state information for the local group it coordinates.

The acknowledgement scheme of the protocol works as follows.

There are two types of acknowledgements, the positive ACK that indicates the

receipt of the packet correctly and NACK, the negative acknowledgement that

indicates the absence of a data packet. As mentioned in the multicast data transfer

section, the source transmits the data packet to the whole group. After the receipt

of the packets, the Group Leaders that are directly attached to the source send

their ACKs to the source. Hence the source is not bombarded by ACKs from all

the receivers and the implosion problem is avoided. A Group Leader receives the









ACKs from the receivers in its group and also from the other Group Leaders

attached directly below it, if any. The protocol framework proposed in this thesis is

designed for a sequenced, lossless delivery of data. Missing packets are detected

based on the sequence number of the data packet. Both the Group Leaders and the

receivers send a NACK if they miss any data packets. The Group Leaders and the

source buffer the packets to retransmit any missing packets. Thus there is a

hierarchy of flow in the multicast tree between the source, Group Leaders and the

receivers. In essence, the receivers cannot bypass the Group Leaders to reach the

source. Unlike SRM [3], where the NACKs are multicast to everyone, here a

receiver sends its NACKs only to its Group Leader.
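
Loss detection by sequence number, as described above, can be sketched as follows. The class name, the starting sequence number of 1 and the callback used to deliver a NACK to the Group Leader are assumptions made for the example; the thesis does not specify them in this section.

```python
from typing import Callable, List


class SequencedReceiver:
    """Detects gaps in the sequence numbers and NACKs them (illustrative only)."""

    def __init__(self, send_nack_to_leader: Callable[[int], None]):
        self.next_expected = 1            # assumed first sequence number
        self.missing = set()              # packets NACKed but not yet repaired
        self.send_nack = send_nack_to_leader

    def on_data(self, seq: int) -> None:
        # Every sequence number skipped between the expected one and the
        # arriving one is treated as lost and reported to the Group Leader only.
        for lost in range(self.next_expected, seq):
            self.missing.add(lost)
            self.send_nack(lost)
        # A retransmitted packet fills a previously detected gap.
        self.missing.discard(seq)
        self.next_expected = max(self.next_expected, seq + 1)


nacks: List[int] = []
receiver = SequencedReceiver(nacks.append)
# Packets 3 and 4 are lost; packet 3 later arrives as a retransmission.
for seq in [1, 2, 5, 3]:
    receiver.on_data(seq)
```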











Figure 4.3: Local Retransmission of Missing Packets (panels: unicast retransmission, multicast retransmission; legend: NACK, retransmission of missing packet, group leader, local group receiver)

The Group Leader and the source receive a NACK for the missing data packets.

After the reception of a NACK, both the Group Leader and the source perform

retransmissions. The following section deals with the retransmissions in detail.

4.4.1 Retransmission by Group Leader

Retransmission may be done by the Group Leader in two different ways: unicast

and multicast. Both the retransmission techniques are depicted in Figure 4.3.









Unicast retransmission mechanism. If the children receive the packets properly,

they send an ACK to the Group Leader. The Group Leader releases a packet from

the buffer when it receives an ACK for the packet from all of its children. The

Group Leader retransmits the missing packet locally to a receiver upon receiving

NACK from it. The Group Leader maintains a certain threshold for the number of

NACKs it has received from its children for a particular packet. As long as the

number of NACKs received for a packet is below the threshold, the Group Leader

unicasts the missing packet to the requesting receiver. End-to-end propagation

delay is greatly reduced due to the hierarchical structure, since the receiver need

not send the retransmission request all the way to the source and wait for the next

packet to make its way all the way back from the source, but instead sends its

NACK to its Group Leader and receives the retransmission from it.

Multicast retransmission mechanism. The Group Leader waits for a certain

interval of time before it serves the retransmission request. If the number of

retransmission requests for a data packet is found to exceed the threshold, then

instead of unicasting the missing packet to the individual receiver, the Group

Leader multicasts the packet to the whole group. Hence the local traffic caused by

the retransmission requests is reduced by this multicast retransmission mechanism.

Because of local recovery, the repair requests are not sent all the way up to the

source of the multicast tree. The control traffic is greatly reduced through the local

recovery mechanism. When the retransmission request is sent by a Group Leader

attached directly below the source, the retransmission is always unicast.
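
The choice between unicast and multicast retransmission can be sketched as a threshold test applied once the Group Leader's waiting interval has elapsed. The function name, the return format and the threshold value of 3 are hypothetical; no concrete threshold is given in this section. The sketch treats a count equal to the threshold as not exceeding it, which is also an assumption.

```python
from typing import Iterable, List, Tuple


def plan_retransmission(seq: int, requesters: Iterable[str],
                        threshold: int = 3) -> List[Tuple[str, str, int]]:
    """Decide how a Group Leader repairs packet `seq`.

    `requesters` are the children whose NACKs for `seq` arrived during the
    waiting interval. Returns a list of (mode, destination, seq) actions.
    """
    unique = sorted(set(requesters))  # duplicate NACKs count once
    if len(unique) > threshold:
        # Many children lost the packet: one multicast repairs them all.
        return [("multicast", "group", seq)]
    # Few losses: repair each requesting child individually.
    return [("unicast", child, seq) for child in unique]


few = plan_retransmission(7, ["r1", "r2"])
many = plan_retransmission(7, ["r1", "r2", "r3", "r4"])
```

With these assumed values, two NACKs produce two unicast repairs, while four NACKs produce a single multicast to the local group.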

4.4.2 Retransmission by Source

The Group Leaders attached directly to the source send a NACK to the source if

they miss any data packets. The source buffers the data packets until it receives an

ACK from each of the Group Leaders attached directly to it. The source









retransmits any missing data packet from its buffer when it receives a NACK from

a Group Leader attached to it.

The other important concerns of the recovery mechanisms are as follows:

monitoring of unresponsive receivers and Group Leaders;

late joining receivers and data recovery.

To ensure reliable delivery of data, the Group Leader buffers the packets until it

receives an ACK from each of its children. So it is important for each Group Leader

to monitor the receivers in its local group continuously. If some of the receivers

become unresponsive, the Group Leader will not get an ACK for a particular

packet and the packet remains in the buffer indefinitely. Therefore it is up to the Group

Leaders to monitor the receivers. Also there is a chance that the Group Leader too

may become unresponsive. In that case, the receivers cannot get their

retransmissions from the Group Leader and would have to get them from another

Group Leader, the sender or the central logger. The monitoring mechanisms

adopted by the Group Leaders and receivers are discussed next.

4.4.3 Monitoring of Unresponsive Receivers and Group Leaders

Monitoring for unresponsive receivers and Group Leaders is achieved using a

control packet called GL_ALIVE, a Group Leader alive message. The Group

Leader sends the GL_ALIVE packet periodically to its local group. Reception of

the GL_ALIVE message by the local receivers indicates that the Group Leader is

operational. If a receiver does not receive the GL_ALIVE packet for a certain

interval, it indicates the absence of the GL_ALIVE in the ACK packet it sends to

the Group Leader. If three such ACK messages go unanswered, the receiver infers

that the Group Leader is down and it joins with a different Group Leader. The

methods by which the receivers subscribe to a different Group Leader are discussed

later in the group management section. If the Group Leader receives an ACK

packet indicating the non-reception of GL_ALIVE, it unicasts a GL_ALIVE to the









requesting receiver. After the reception of the unicast GL_ALIVE message, the

receiver sends an ACK indicating its reception and remains attached to the Group

Leader and continues its regular operation.

The Group Leader expects an ACK message within a certain interval from each

receiver. If the Group Leader does not receive an ACK message from a receiver for

three such intervals, the Group Leader then unicasts a GL_ALIVE message to the

specific receiver indicating the absence of the ACK message. If the Group Leader

does not get a response for the unicast GL_ALIVE message, it assumes that the

receiver is down and it releases the data in the buffer it was withholding for that

receiver. Also, the Group Leader presumes that particular member is down and no

further repairs would be served for it. The member has to join again either with

the same Group Leader or a different one if it wishes in the future to participate in

the multicast operation.
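The monitoring rules above can be summarized in a minimal Python sketch. The class names, method names and returned action strings are illustrative assumptions and are not part of the protocol specification; the text fixes only the count of three. Timers and real packet I/O are omitted, so each method is expected to be called by whatever timer or receive loop an implementation uses.

```python
from dataclasses import dataclass, field

MAX_MISSES = 3  # the text fixes the count at three; the name is illustrative


@dataclass
class ReceiverMonitor:
    """Receiver-side liveness check of its Group Leader (hypothetical helper)."""
    unanswered_acks: int = 0

    def on_gl_alive(self) -> None:
        # Any GL_ALIVE, multicast or unicast, shows the Group Leader is operational.
        self.unanswered_acks = 0

    def on_alive_interval_expired(self) -> str:
        # Three flagged ACKs already went unanswered: treat the leader as down.
        if self.unanswered_acks >= MAX_MISSES:
            return "REJOIN_OTHER_GROUP_LEADER"
        # Otherwise report the missing GL_ALIVE in the next ACK.
        self.unanswered_acks += 1
        return "SEND_ACK_FLAGGING_MISSING_GL_ALIVE"


@dataclass
class LeaderMonitor:
    """Group Leader-side liveness check of its receivers (hypothetical helper)."""
    silent_intervals: dict = field(default_factory=dict)  # receiver id -> count
    probed: set = field(default_factory=set)  # awaiting reply to unicast GL_ALIVE

    def add_receiver(self, rid: str) -> None:
        self.silent_intervals[rid] = 0

    def on_ack(self, rid: str) -> None:
        if rid in self.silent_intervals:
            self.silent_intervals[rid] = 0
            self.probed.discard(rid)

    def on_ack_interval_expired(self, rid: str) -> str:
        if rid in self.probed:
            # No reply to the unicast GL_ALIVE: drop the member, free its buffered data.
            del self.silent_intervals[rid]
            self.probed.discard(rid)
            return "PRUNE_AND_RELEASE_BUFFER"
        self.silent_intervals[rid] += 1
        if self.silent_intervals[rid] >= MAX_MISSES:
            self.probed.add(rid)
            return "UNICAST_GL_ALIVE"
        return "WAIT"
```

In this sketch a receiver sends three flagged ACKs before giving up, and a Group Leader waits three silent intervals, probes once with a unicast GL_ALIVE, and prunes the receiver only if the probe also goes unanswered.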

4.4.4 Late Joining Receivers and Data Recovery

Since the protocol discussed in this thesis allows receivers to join anytime during

the multicast session, the receivers joining the group late have to be updated with

the data packets sent since the start of the session and catch up with the rest of the

group members. Also because of the highly dynamic nature of the protocol, the

receivers in the local group could change their membership to different groups.

This may require them to catch up with the current group data reception. There

are two ways in which this could be achieved: (1) from the buffer at the Group

Leader or the source and (2) directly from the central logging server for complete

data recovery.

Buffer at the group leader and source. The Group Leader buffers each data

packet sent by the source until it receives ACKs from all of its children for that

packet, after which it deletes the buffer entry for the particular packet. The newly

joined member could recover the data from the buffer withheld by the Group































Data logged into the Central Logging Server (CLS)
Retransmission request and reply directly with the CLS
GL_ALIVE packets from the Group Leader

Figure 4.4: Central Logging Server and Monitoring of Unresponsive Members

Leader by requesting it. If the Group Leader has already deleted from its buffer the data

needed by the new member, only partial recovery is possible. Hence it is the duty

of the joining member to ensure that it has all the packets that have been

acknowledged by all the members of the joining group. If the new member has not

yet received some of these packets, it is its responsibility to finish all the pending

transactions with the old Group Leader. Alternatively, if the new joining member

cannot find the data in the new Group Leader's buffer, it can request the same

directly from the central logging server, as discussed next.

Complete data recovery from central logging server. There is a central logging

server attached to the source of the multicast group as in Figure 4.4. It buffers the

data on the disk as the source multicasts the data. Unlike the data buffered by the

source or the Group Leaders, the data in the central logging server is never deleted.









Therefore any late-joining member that cannot recover fully from the Group

Leader's buffer, can request the packets directly from the central logging server.

Hence complete recovery is possible. The central logging server also serves as a

fault tolerant mechanism for the entire multicast session by buffering the data

permanently.
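The two recovery paths can be illustrated with a short Python sketch. It assumes, purely for illustration, that the Group Leader's buffer and the central log are both dictionaries keyed by sequence number; the function names are hypothetical.

```python
def recover_packet(seq, leader_buffer, central_log):
    """Return (payload, origin) for the packet with sequence number `seq`.

    The Group Leader's buffer is tried first; packets it has already
    deleted are fetched from the central logging server, which never
    deletes data.
    """
    if seq in leader_buffer:
        return leader_buffer[seq], "group_leader"
    return central_log[seq], "central_logging_server"


def catch_up(first_seq, last_seq, leader_buffer, central_log):
    """Recover every packet in the inclusive range first_seq..last_seq."""
    return [
        recover_packet(seq, leader_buffer, central_log)
        for seq in range(first_seq, last_seq + 1)
    ]
```

For example, a member joining after packets 1 and 2 have been released by the Group Leader would obtain those two from the central logging server and packet 3 from the Group Leader's buffer.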

4.5 Congestion Control Mechanism

Section 3.3 discussed the challenging factors involved in the design of congestion

control in a multicast protocol. The congestion control for the protocol framework is

designed as follows. There are two phases involved in the whole process: slow start

and congestion control. In the slow start phase, the protocol tries to find an

appropriate operating point. As the packet flow gradually increases and when the

network is subjected to load, the congestion control phase starts to operate. Both

phases are discussed in detail below.

4.5.1 Slow Start

The approach taken in this scheme is similar to and motivated by the congestion

control strategy used in MTCP [7]. The source uses a congestion window cwnd to

reduce the data transmission rate when experiencing congestion. Each Group

Leader too maintains a congestion window, similar to the TCP congestion window.

The source and the Group Leaders maintain their congestion windows using TCP

congestion control mechanisms such as slow start and congestion avoidance. The

manner in which the congestion algorithms differ is that a congestion window

present at the Group Leader or the source is incremented only when it receives

ACKs from all its children. Hence the window size is incremented linearly as and

when the ACKs are received from the children. This operation is continued as long

as the congestion window size is below the slow start threshold. If the size exceeds

the threshold, the congestion window is incremented by 1/cwnd each time a new packet

is acknowledged by all of its children and the protocol enters into the congestion









avoidance phase. Also, as discussed in the previous section, the local receivers send

NACKs for the missing packets and the Group Leaders immediately retransmit the

packets reported missing. It can be seen that the slow start phase discussed for

this protocol is very similar to Van Jacobson's [16] standard congestion

control algorithm.
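A minimal Python sketch of the window update follows. It assumes the window is measured in packets, that a packet counts as acknowledged only when every child has ACKed it, and that the congestion avoidance phase uses the standard TCP increase of 1/cwnd per acknowledged packet. The class name, the default threshold and the data structures are illustrative assumptions, and loss handling is left out.

```python
class MulticastWindow:
    """Congestion window kept by the source or a Group Leader (illustrative)."""

    def __init__(self, children, ssthresh=8.0):
        self.children = set(children)
        self.cwnd = 1.0            # window size in packets
        self.ssthresh = ssthresh   # slow start threshold (assumed value)
        self.acked_by = {}         # sequence number -> children that ACKed it

    def on_ack(self, child, seq):
        """Record an ACK; return True once all children have ACKed `seq`."""
        if child not in self.children:
            return False
        got = self.acked_by.setdefault(seq, set())
        got.add(child)
        if got != self.children:
            return False           # still waiting for at least one child
        del self.acked_by[seq]
        if self.cwnd < self.ssthresh:
            self.cwnd += 1.0               # slow start
        else:
            self.cwnd += 1.0 / self.cwnd   # congestion avoidance
        return True
```

The only departure from unicast TCP in this sketch is the condition `got != self.children`: the window grows when the slowest child has acknowledged the packet, not when the first ACK arrives.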

4.5.2 Congestion Control

After the initial phase of slow start, congestion could occur and would be

detected by the receivers, the Group Leaders, or the source itself. Accordingly, the

receivers and the Group Leaders report the congestion through congestion reports

up the hierarchy so that the source acts to reduce the same. The strategy adopted

is similar to and motivated by the congestion control scheme used in TRAM [9].

Congestion at receivers. Receivers detect congestion from missing

packets. If a receiver detects that the number of missing packets in an ACK window

has grown beyond a certain threshold, it sends a congestion message to its Group

Leader. The congestion message would include the highest sequence number of the

packet received. Upon receiving the congestion message, the Group Leader in turn

forwards it to the Group Leader above it or to the source if it is attached to the

source directly.
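The receiver-side check can be sketched in Python as follows. The function name, the dictionary layout of the congestion message and the parameter names are assumptions made for illustration; the text specifies only that the message carries the highest sequence number received.

```python
def congestion_report(received, window_start, window_size, loss_threshold):
    """Return a congestion message if too many packets of the ACK window are missing.

    `received` is the set of sequence numbers seen so far. The result is
    None when the number of missing packets does not exceed the threshold.
    """
    window = range(window_start, window_start + window_size)
    missing = [seq for seq in window if seq not in received]
    if len(missing) > loss_threshold:
        return {
            "type": "CONGESTION",
            "highest_seq": max(received, default=None),
        }
    return None
```

A Group Leader receiving such a message would forward it unchanged to its parent Group Leader, or to the source if it is attached to the source directly.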

Congestion at group leaders. Group Leaders detect congestion when their buffer

begins to fill up. The buffer space is filled up when a receiver in the local group

fails to acknowledge data packets. The buffer gets filled up with the packets as long

as it does not receive the acknowledgements from the receivers. The buffer has a

high threshold limit and the Group Leader temporarily increases the high threshold

limit set for the buffer when it cannot remove any more packets from it.

Meanwhile, a congestion message is sent up the hierarchy indicating congestion. If

the buffer is filled even after the high threshold is increased and the buffer size is

reached, the Group Leader starts to drop the new packets. The temporary increase









of the threshold limit of the buffer is to allow some time for the Group Leader to

send a congestion message up the hierarchy.
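The buffer behaviour at a Group Leader can be sketched in Python as shown below. The class name, the action strings, and the choice to raise the high threshold all the way to the buffer size are assumptions for illustration; the text states only that the threshold is raised temporarily and that new packets are dropped once the buffer is full.

```python
class LeaderBuffer:
    """Retransmission buffer of a Group Leader with a high threshold (illustrative)."""

    def __init__(self, capacity, high_threshold):
        assert 0 < high_threshold <= capacity
        self.capacity = capacity
        self.base_threshold = high_threshold
        self.high_threshold = high_threshold
        self.packets = {}
        self.congestion_reported = False

    def store(self, seq, payload):
        """Buffer a new packet and return the action the Group Leader should take."""
        if len(self.packets) >= self.capacity:
            return "DROP"  # buffer size reached: the new packet is dropped
        self.packets[seq] = payload
        if len(self.packets) >= self.high_threshold and not self.congestion_reported:
            # Raise the threshold temporarily to gain time for the report.
            self.congestion_reported = True
            self.high_threshold = self.capacity
            return "STORED_SEND_CONGESTION_MSG"
        return "STORED"

    def release(self, seq):
        """Delete a packet once all children have acknowledged it."""
        self.packets.pop(seq, None)
        if len(self.packets) < self.base_threshold:
            # Buffer has drained: restore the original threshold.
            self.high_threshold = self.base_threshold
            self.congestion_reported = False
```

With a capacity of 3 and a high threshold of 2, the second buffered packet triggers the congestion message, the third is still stored, and the fourth is dropped.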

Congestion at source. The source of the multicast group too maintains a buffer to

retransmit any missing packets for the Group Leaders attached immediately to it.

Like the above-mentioned situation, the source would detect congestion if its buffer

begins to fill up when a Group Leader fails to acknowledge data packets. Similar to

the Group Leaders, the source would increase the high threshold limit for its buffer.

Meanwhile it reacts to the congestion by reducing the data rate, as if it had

received a congestion message. After the high limit of the threshold is reached and

the buffer is full, it blocks any new data from the application and attempts to

solicit an ACK from the Group Leaders that are causing the buffer to fill up. If the

Group Leaders do not respond quickly, the source would prune them. There is a

significant policy choice here: whether to prune the slow receivers or to slow down

to the speed of the slow receivers. If there is just one receiver that limits the rate of the

multicast group, it is pruned; if a number of receivers dictate the

rate, the sender is slowed down.





Figure 4.5: Hierarchical Consolidation of Feedback Parameter

Apart from the congestion report sent by the receivers and Group Leaders when

encountering congestion, the protocol also sends consolidated congestion control









feedback to the source of the multicast group at regular intervals so that it can

regulate the flow of transmission accordingly. Section 3.3 discussed the

hierarchical consolidation of the feedback from the receivers up the multicast tree

proposed by Golestani and Sabnani [21]. Every receiver calculates an aggregate

feedback parameter based on the feedback received from the receivers at the lower

level. This technique is modified to suit the requirements of this protocol

framework. It is the job of every Group Leader to calculate the consolidated

feedback of its children and the Group Leaders below it. After calculating the

consolidated feedback at its level, it sends the same to the Group Leader above its

level. If a particular Group Leader does not have another leader attached below

(leaf node), it just passes its feedback up the tree. Also, if a Group Leader is

independently attached to the source of the tree, it sends the feedback directly to

the source. This is depicted in Figure 4.5.

The feedback parameter $f_j$ denotes the highest packet sequence number that

could arrive at node j in the case of window-based congestion control. If $f_j$ is the

feedback parameter of a node $N_j$, then the consolidated feedback at the current

level is calculated as follows.

$$f_j = \min \{\, f_k \mid N_k \text{ is a child of } N_j \,\}$$

The consolidated feedback at the current level fj is then propagated to the level

above it, if any or to the source directly.
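The consolidation rule can be illustrated with a small recursive Python sketch. The tree is assumed to be given as a dictionary from each node to its children, and nodes without children are assumed to report their own feedback value; both representations and the function name are illustrative choices.

```python
def consolidated_feedback(node, children, leaf_feedback):
    """Return the consolidated feedback of `node`.

    A node without children reports its own value; every other node
    reports the minimum of its children's consolidated values.
    """
    kids = children.get(node, [])
    if not kids:
        return leaf_feedback[node]
    return min(consolidated_feedback(k, children, leaf_feedback) for k in kids)
```

For instance, if a Group Leader has two receivers reporting 37 and 42, it propagates 37 upward, and its parent in turn takes the minimum over that value and the values of its other children.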

4.6 Group Management Schemes

This section deals with the various group management schemes involved in the

protocol operation. The architecture of the protocol is a hierarchical tree-like

structure with local group formation. This necessitates effective organization and

management of the groups. Issues such as formation of the group, dynamic

configuration of the members and group termination are discussed as follows.









4.6.1 Group Formation

There are two ways in which the local groups are formed: Expanding ring

structure [6] and the advertisement method.


New Joining Member

Figure 4.6: Expanding Ring Structure for Group Membership

Expanding ring structure is depicted in Figure 4.6 and works as follows. Any

new member who wishes to join the multicast group sends out a

GROUP_LEADER_SEARCH request message with a limited scope distance. The

limited distance is achieved by setting a small TTL value in the request packet. If

there is any Group Leader in the vicinity, then the Group Leader responds by

sending a GROUP_LEADER_AVAILABLE response message. If the searching

member does not receive any response from any Group Leader, it will start

searching again, this time covering more distance. This is done by increasing the TTL value in the

GROUP_LEADER_SEARCH packet. On receipt of a response from

a Group Leader, the new member may wish to join the group maintained by the

Group Leader, or it may not wish to join the group. The number of members in

the group may already be reaching the MAXIMUM_GROUP_MEMBERS limit or

there may be a "better" group to join. This process will continue until a suitable

Group Leader is found and the new member finds itself a place in a local group.
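The expanding ring search can be sketched in Python as a simulation. Scoped multicast is modelled by a dictionary giving the hop distance of each Group Leader from the searching node; the doubling of the TTL, the default limits and the `is_suitable` callback are assumptions, since the text states only that the TTL is increased on each round.

```python
def expanding_ring_search(leaders_by_hops, is_suitable, initial_ttl=1, max_ttl=32):
    """Simulate GROUP_LEADER_SEARCH with a growing TTL.

    Returns (leader, ttl) for the first suitable Group Leader found,
    or None if no suitable leader responds within max_ttl hops.
    """
    ttl = initial_ttl
    while ttl <= max_ttl:
        # Only leaders within `ttl` hops see the request and can answer with
        # GROUP_LEADER_AVAILABLE.
        responders = [gl for gl, hops in leaders_by_hops.items() if hops <= ttl]
        # Prefer the nearest responder that the new member is willing to join.
        for gl in sorted(responders, key=lambda g: leaders_by_hops[g]):
            if is_suitable(gl):
                return gl, ttl
        ttl *= 2  # widen the ring and search again
    return None
```

A result of None corresponds to the case in which the new member finds no suitable Group Leader and may instead announce itself as a new Group Leader.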

Another method by which a new member joins a local group is by the

advertisement method as shown in Figure 4.7.

Before Joining / After Joining

New members wishing to join
Advertising Group Leader
Scope of advertisement messages

Figure 4.7: Group Leader's Advertisement for Membership

In the advertisement method, the Group Leaders send out a JOIN_MY_GROUP

advertisement message periodically to the group-specific multicast address. Nodes

who wish to join the multicast group listen for advertisement messages. The

JOIN_MY_GROUP message contains information such as the distance of the Group

Leader from the multicast source, the number of members already present in the

group, data rate, delay, bandwidth, throughput and its error probability. The new

member then calculates its distance from the Group Leader using the information

in the Group Leader's advertisement message. It may wish to join the group or not

according to the parameters specified in the JOIN_MY_GROUP message. As all

the Group Leaders send the JOIN_MY_GROUP message, a node is able to learn of

all the Group Leaders present nearby and to gain partial information of the

multicast tree. The time interval in which the JOIN_MY_GROUP message is sent

should be decided in such a way that the control message traffic generated by the

Group Leaders does not itself lead to congestion.

In both the methods, the new member that wishes to join the group sends an

INTERESTED_IN_JOINING message to the Group Leader. The Group Leader

responds positively with an ACK_JOIN message if it could accommodate the new

member. Otherwise, the Group Leader sends a NACK_JOIN message to the

requesting member indicating its inability to accommodate it.
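The join handshake can be sketched in Python as follows. The class name, the method name and the limit of three members are illustrative assumptions; only the message names come from the text.

```python
MAXIMUM_GROUP_MEMBERS = 3  # hypothetical value chosen for the example


class GroupLeader:
    """Admission side of the join handshake (illustrative)."""

    def __init__(self, limit=MAXIMUM_GROUP_MEMBERS):
        self.limit = limit
        self.members = []  # kept in order of admission

    def on_interested_in_joining(self, node_id):
        """Answer an INTERESTED_IN_JOINING request with ACK_JOIN or NACK_JOIN."""
        if node_id in self.members:
            return "ACK_JOIN"   # duplicate request from an existing member
        if len(self.members) >= self.limit:
            return "NACK_JOIN"  # the group cannot accommodate another member
        self.members.append(node_id)
        return "ACK_JOIN"
```

Requests are served in arrival order, so once the limit is reached later requesters receive NACK_JOIN and have to look for another Group Leader.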

In both the methods, if a new joining member cannot find a suitable Group

Leader, then it may announce itself as a new Group Leader, leading to the

formation of a new group. The new Group Leader could either attach itself directly

to the source or to another Group Leader. It should be noted that the

advertisement method causes more control traffic than the expanding ring search

method. Whenever the load of the network is low, the advertisement method will

be adopted and when the load increases, group formation switches to the expanding

ring search method.

4.6.2 Dynamic Reconfiguration of Groups

The local groups are able to re-organize themselves with the receivers changing

to different groups according to the current network or congestion conditions. This

is one of the most powerful features of the protocol proposed in this thesis. Dynamic

reconfiguration could be achieved in different ways, with the receivers shifting

groups, Group Leaders shifting groups and the termination of receivers and Group

Leaders. The following sections discuss all these aspects in detail.









Re-affiliation of members. The members of the multicast groups can re-arrange

themselves by changing their membership to other local groups. This could happen

for reasons such as the current group performing poorly, or termination of the Group

Leader or of the whole group. The re-affiliation could be categorized as follows.

Receivers shifting local groups. Receivers could shift to a different group as

follows. Every Group Leader would send out a special control message called

GL_STAT_MSG to every adjacent local domain group. The extent to which this

control message is received is limited because receivers farther away would not, in

practice, be willing to join a group at a long distance. Hence this message is

sent such that only the adjacent group members receive it. This is achieved by

limiting the TTL field of the control message.




















Group Leader Status Message

Figure 4.8: Group Leader's Status Message

All the members of the multicast group, viz., source, Group Leaders and the

individual receivers, receive the control message sent by the Group Leader. The

message contains most of the information similar to the Group Leader's

advertisement message pertaining to a particular group's characteristics. These









include information on its distance from the source, data rate, delay, bandwidth,

throughput, error probability, number of receivers in the group, and number of packets

actively held in the buffer, which would be a very good indicator of the congestion

state of a particular group. On receipt of the GL_STAT_MSG, the individual

receivers may shift to a different group because of the better service offered by the

advertising Group Leader. This is done by sending an INTERESTED_IN_JOINING to the Group

Leader. The Group Leader responds positively with an ACK_JOINING

message or negatively with a NACK_JOINING message. A receiver that wishes to

change groups directly contacts the concerned Group Leader to get its consent

to join the particular group, just as a new member does.

Since the GL_STAT_MSG is very similar to the Group Leader's advertisement

message, the Group Leader does not send out the status message when the group

formation technique is through advertisement. Hence when the load of the network

is low, the advertisement message acts as a status message too. This reduces the

control traffic flow within the multicast group.

The drawback of this method is that all the receivers that are in a bad state will

try to move into the healthy group. The Group Leader would therefore maintain a threshold

and would not permit its membership to grow beyond that

threshold. The allocation of the membership is on a first-come first-served basis.

Shifting of group leaders. It is possible that two Group Leaders could swap their

local group leadership. This could happen when the processing capability and

memory of a Group Leader are not sufficient to support its current members.

Since the GL_STAT_MSG is received by the Group Leaders too, they could also

contact their peers and initiate the swapping process, if desired. This is done by sending an

INTERESTED_IN_JOINING to the Group Leader. The Group Leader

responds positively with an ACK_JOINING message or negatively with a

NACK_JOINING message. It is also possible that the Group Leader of a particular









local group could swap its leadership with a local receiver of that group itself

because of the reasons mentioned earlier.

Termination of group leader. A Group Leader might wish to terminate its

operation at some point. This could happen when it wishes to leave the multicast

group. Because of the termination of the Group Leader, some other node must be

elected as the new Group Leader. Thus the Group Leader election process is

invoked. There are two ways in which the Group Leader could be elected. The

easiest method is for the old Group Leader to select a receiver in the local group to

become the new Group Leader. The potential receiver should have sufficient

processing capability and memory to do the same. In another method, a newly

joining member of the multicast group could also take over the group leadership if

the above conditions are satisfied. In both the cases, the parent of the Group

Leader is informed about the newly elected Group Leader. For security issues such

as key management, the source should always have knowledge about the Group

Leaders. Under such conditions, the source is informed of the termination or election

of the Group Leaders.

Termination of local group. All or a large number of receivers could move to a

different group because of the congestion state of the group. This shifting of the

receivers could happen due to the GL_STAT_MSG exchange as mentioned earlier.

If a majority of the receivers shift to different groups, the current membership of a

group could become very thin. In such a case, the Group Leader would shed all the

remaining receivers and terminate itself. When the Group Leader detects that the

GROUPMEMBER count is less than a threshold, it informs all the members

about the termination of the group. The Group Leader waits for a

TERMINATE_ACK from all the members before terminating itself. This leads to

the shutting down of the whole group.









Monitoring of unresponsive members and group leaders. The group members and

Group Leaders could become unresponsive sometimes. The Group Leader buffers

the data sent by the source until it gets an acknowledgement from each of its

children. If a receiver becomes unresponsive, then the buffer begins to fill up.

Therefore the receivers and the Group Leaders have to be monitored regularly.

This is done using the GL_ALIVE control message sent by the Group Leaders. The

mechanism is described in detail in Section 4.4. If a member of the group is

found to be unresponsive by the Group Leader, the member is pruned from the

group and no further repairs are served for it. If the Group Leader is

found to be unresponsive, the receiver joins a different group.

Tree construction techniques. The construction of the hierarchical multicast tree

has to be carefully managed for different types of multicast applications. Some

multicast applications like video or teleconferencing consist of a small number of

receivers. Applications like stock quotes and content delivery potentially have a

large number of receivers. Hence tree construction should be carefully managed for

efficient multicast operation.

















Poorly-built Tree Structure / Well-built Tree Structure


Figure 4.9: Tree Construction for an Application with Less Number of Receivers









Tree construction for small number of receivers. If the application consists of a

small number of receivers, then the multicast tree construction is as follows. As the

number of receivers is small, the Group Leader allows more receivers to join its

group than its regular count. This will avoid the presence of a large number of

Group Leaders each with a minimum number of receivers. This is done by increasing

the MAXIMUM_GROUP_MEMBERS count. The multicast operation performs better with a

limited number of groups when the number of receivers in the application is small.

If the multicast tree consists of many Group Leaders with just a few members deep

down, the end-to-end delay involved in the transmission is increased, as depicted

in Figure 4.9. Hence there would be a limited number of groups with large

populations rather than a large number of groups with sparse populations, and

newly joining members would not be allowed to declare themselves as Group Leader

unless the MAXIMUM_GROUP_MEMBERS threshold is exceeded.














Poorly-built Tree Structure Well-built Tree Formation


Figure 4.10: Tree Construction for an Application with Large Number of Receivers

Tree construction for large number of receivers. If the application involves a large

number of receivers, e.g., stock quotes and content delivery, then the multicast tree









construction would be as follows. The number of receivers in a local group would

not be as large as the count for an application with a small number of receivers. The

MAXIMUM_GROUP_MEMBERS would be lower than for an application

with a small number of receivers. Figure 4.10 depicts the poorly built tree with a

local group serving a large number of receivers. The presence of a large number of

receivers would require the Group Leader to have sufficient processing power and

memory. The multicast operation would be optimal if there were a larger number of local

groups, each with fewer receivers, to suit these types of applications. On the

contrary, if there were a large number of receivers with few Group Leaders, it may

lead to the ACK implosion problem at the Group Leader and may also result in

congestion at the local groups.

In both the cases discussed above, there are other important factors that affect

the proper tree construction. These include:

Distance or Hops: The receivers of the multicast that reside in a network are
grouped together to form a local group rather than grouping receivers that
are in different networks wide apart. The number of hops or the distance of
the receiver to its Group Leader also plays a vital role in correct formation of
the groupings.

Maximum Branching Factor: The maximum branching factor of the local
group depends on a number of factors. The local processing power of
the Group Leader is one factor that affects the branching factor. The Group
Leader should have sufficient processing power and capacity to
accommodate the members of the local group. The multicast rate, such as the
number of packets sent per second or the number of bytes per packet, also
decides the branching factor of the group. The error rate and the load of the
network also influence it.

4.7 Security Issues

The IETF has posted some evaluation criteria for reliable multicast transport

protocols in [23]. Apart from the issues discussed above, it requires that a reliable

multicast protocol should discuss the security issues.









The main objectives of multicast security are to preserve the authentication and

secrecy of multicast data so that only the legitimate senders can send the data

and only legitimate receivers can receive the data [23]. The secrecy of the multicast

data is provided by a public key cryptography mechanism. The data are encrypted

using a group key, which is distributed among the receivers. This requires a good

group key management solution in the protocol. To achieve this goal, the

architecture of the protocol should be very well designed and should aid in the

same.

The architecture defined in Section 3.2 satisfies both criteria of scalability and

de-centralization. Due to the formation of local groupings, the protocol is scalable

to a large receiver base. Also, the Group Leaders will act as a local key

management entity managing a set of receivers, rather than a centralized controller.

Hence if any of the local key controlling entities is down it could be taken over by

another one.

The issues concerning the Group Key Management and access control are not

considered in detail in this thesis.

4.8 Summary

The chapter provides a framework for a reliable multicast protocol. The framework

begins with the discussion of goals of the proposed protocol. The architectural

design of the protocol is discussed followed by the mechanism of multicast data

transfer. The section on acknowledgement and error recovery discusses the various

types of retransmission performed by the Group Leader and source. The section

also discusses the method by which the unresponsive receivers and Group Leaders

are monitored and how the late joining receivers recover their data completely

through the central logging server. The section on congestion control discusses the

control algorithms like slow start and congestion avoidance in detail. The group

management section discusses the two group formation techniques of expanding








ring structure and advertisement method. Various other group management

techniques like the dynamic reconfiguration of the receivers among different groups

and group termination are also discussed. The security issues are not discussed in detail.















CHAPTER 5
FUNCTIONAL MODEL

This chapter provides a functional model for the proposed reliable multicast

protocol framework. The functional model identifies the various protocol

components and their entities. The model also provides use case and sequence

diagrams for basic operations of the protocol like multicast data transfer,

acknowledgement, error recovery and group management. The use case and

sequence diagrams are constructed based on the Unified Modeling Language (UML)

specification. The functional model also provides an overall flow diagram of the

protocol.

5.1 Protocol Components

There are four major components in the protocol design: Sender, Receiver,

Group Leader, and Central Logging Server. The various components and their

subcomponents are shown in Figure 5.1. The functions of each of them are

detailed as follows.

5.1.1 Sender Component

The Sender component is responsible for the transmission of the multicast data

to the whole group. In addition, it is also responsible for a number of other functions

like error recovery and congestion control, and it has the interface to interact with

the Sender application. It has a number of subcomponents, details of which are

listed below.

SNDR_TRANS_CONTROLLER: This subcomponent has a number of
modules that take care of the transmission of different types of packets.

SNDR_TR: This is a module of SNDR_TRANS_CONTROLLER and is
responsible for the transmission of new data packets.










SENDER COMPONENT
SNDR_TRANS_CONTROLLER: SNDR_TR, SNDR_RTR
SNDR_PROCESS_CONTROLLER
SNDR_BUFR_MNGR

RECEIVER COMPONENT
RCVR_TRANS_CONTROLLER: RCVR_TR, RCVR_NACK, RCVR_ACK, RCVR_CTRL
RCVR_PROCESS_CONTROLLER

GROUP LEADER COMPONENT
GL_TRANS_CONTROLLER: GL_TR, GL_STAT, GL_NACK, GL_RTR, GL_ADVT, GL_ACK
GL_PROCESS_CONTROLLER
GL_BUFR_MNGR

CENTRAL LOGGER COMPONENT
LOG_TRANS_CONTROLLER: LOG_RTR, LOG_ACK, LOG_NACK
LOG_PROCESS_CONTROLLER
LOG_BUFR_MNGR

Figure 5.1: Functional Components of the Protocol

SNDR_RTR: This is a module of SNDR_TRANS_CONTROLLER and is
responsible for the retransmission of lost packets.

SNDR_PROCESS_CONTROLLER: This subcomponent is responsible for the
processing of ACK and NACK messages from Group Leaders and the Central
Logging Server. It is also responsible for the processing of the control
messages for group membership and congestion control.

SNDR_BUFR_MNGR: This subcomponent is responsible for buffer
management. The responsibilities include the buffering of messages and
deleting them once ACKs have been received from all children.

5.1.2 Receiver Component

The Receiver component delivers the received data packets to the receiver

application. It also sends the ACK and NACK to the Group Leaders in accordance











with the reception of data packets. The various subcomponents of it are mentioned

below.

RCVR_TRANS_CONTROLLER: This subcomponent has a number of
modules that take care of the transmission of different types of packets.

RCVR_TR: This is a module of RCVR_TRANS_CONTROLLER and is
responsible for the transmission of the received data packets to the receiver
application.

RCVR_ACK: This is a module of RCVR_TRANS_CONTROLLER and is
responsible for the transmission of the ACK messages to the Group Leader.

RCVR_NACK: This is a module of RCVR_TRANS_CONTROLLER and is
responsible for the transmission of the NACK messages to the Group Leader
and the Central Logging Server to request retransmission of lost packets.

RCVR_CTRL: This is a module of RCVR_TRANS_CONTROLLER and is
responsible for the transmission of the control packets for group management
techniques and congestion control.

RCVR_PROCESS_CONTROLLER: This subcomponent is responsible for
processing of the control messages for group membership and congestion
control.

5.1.3 Group Leader Component

The Group Leader component is responsible for buffering data packets from the

sender temporarily. It is also involved in the local error recovery with the receivers

by the retransmission of lost packets. The Group Leader is also responsible for

detecting the congestion and reporting the same up the multicast tree. The various

subcomponents of the Group Leader are mentioned below.

GL_TRANS_CONTROLLER: This subcomponent has a number of modules
that take care of the transmission of different types of packets.

GL_TR: This is a module of GL_TRANS_CONTROLLER and is responsible for
the transmission of the received data packets to the receiver application.

GL_ACK: This is a module of GL_TRANS_CONTROLLER and is responsible
for the transmission of the ACK messages to the source or other Group
Leader up in the hierarchy.









GL_RTR: This is a module of GL_TRANS_CONTROLLER and is responsible
for the retransmission of the lost packets.

GL_NACK: This is a module of GL_TRANS_CONTROLLER and is responsible
for the transmission of the NACK messages to another Group Leader up in
the hierarchy or to the Central Logging Server to request retransmission of
lost packets.

GL_ADVT: This is a module of GL_TRANS_CONTROLLER and is responsible
for the transmission of the group advertisement packets.

GL_STAT: This is a module of GL_TRANS_CONTROLLER and is responsible
for the transmission of the local group status messages to the multicast
group.

GL_PROCESS_CONTROLLER: This subcomponent is responsible for the
processing of ACK and NACK messages received from the receivers and child
Group Leaders. It is also responsible for processing of the control messages
like group membership and congestion control.

GL_BUFR_MNGR: This subcomponent is responsible for buffer management.
Its responsibilities include buffering messages and deleting them once
ACKs have been received from the Group Leader's children.
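The buffer management described for GL_BUFR_MNGR can be sketched as follows. This is a minimal illustration rather than the thesis implementation: the class name `LeaderBuffer`, its method names, the integer sequence numbers, and the rule that a packet is freed only after every child has ACKed it are all assumptions made for the example.

```python
class LeaderBuffer:
    """Hypothetical sketch of GL_BUFR_MNGR: hold packets until every child has ACKed."""

    def __init__(self, children):
        self.children = set(children)   # receivers and child Group Leaders
        self.packets = {}               # seq -> payload
        self.pending = {}               # seq -> children that have not ACKed yet

    def store(self, seq, payload):
        """Buffer a data packet and expect an ACK from every child."""
        self.packets[seq] = payload
        self.pending[seq] = set(self.children)

    def ack(self, seq, child):
        """Record an ACK; free the packet once no child is still pending."""
        waiting = self.pending.get(seq)
        if waiting is None:
            return
        waiting.discard(child)
        if not waiting:
            del self.pending[seq]
            del self.packets[seq]

    def lookup(self, seq):
        """Return the payload for a retransmission, or None if already freed."""
        return self.packets.get(seq)


buf = LeaderBuffer(["R1", "R2"])
buf.store(1, b"data")
buf.ack(1, "R1")
still_buffered = buf.lookup(1)   # R2 has not ACKed, so the packet is kept
buf.ack(1, "R2")
freed = buf.lookup(1)            # every child has ACKed, so the packet is gone
```

Keeping a per-packet set of pending children makes the deletion rule explicit: a NACK from any child that has not yet ACKed can still be served from the buffer.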

5.1.4 Central Logging Server Component

The central logger component is responsible for complete data recovery. It
buffers the data permanently on disk and retransmits lost packets to the receivers
and Group Leaders. Its subcomponents are described below.

LOG_TRANS_CONTROLLER: This subcomponent has a number of modules
that take care of the transmission of different types of packets.

LOG_RTR: This is a module of LOG_TRANS_CONTROLLER and is
responsible for the retransmission of lost packets.

LOG_ACK: This is a module of LOG_TRANS_CONTROLLER and is
responsible for the transmission of ACK packets to the sender.

LOG_NACK: This is a module of LOG_TRANS_CONTROLLER and is
responsible for the transmission of the NACK packets to the source.

SNDR_PROCESS_CONTROLLER: This subcomponent is responsible for the
processing of NACK messages from Group Leaders and the receivers.












LOG_BUFR_MNGR: This subcomponent is responsible for buffer
management. Its responsibilities include buffering messages on disk
and retrieving them during retransmissions.
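The disk-backed buffering of the central logging server could look roughly like the sketch below. It is only an illustration under stated assumptions: the class name `DiskLog`, the one-file-per-packet layout, and the `.pkt` file suffix are hypothetical and do not come from the thesis.

```python
import os
import tempfile


class DiskLog:
    """Hypothetical sketch of the logger's buffer manager: one file per packet."""

    def __init__(self, directory):
        self.directory = directory

    def _path(self, seq):
        # Assumed layout: <directory>/<sequence number>.pkt
        return os.path.join(self.directory, f"{seq}.pkt")

    def store(self, seq, payload):
        """Write a data packet to disk so it survives for late retransmissions."""
        with open(self._path(seq), "wb") as f:
            f.write(payload)

    def retrieve(self, seq):
        """Read a packet back for retransmission, or None if it was never logged."""
        try:
            with open(self._path(seq), "rb") as f:
                return f.read()
        except FileNotFoundError:
            return None


with tempfile.TemporaryDirectory() as d:
    log = DiskLog(d)
    log.store(7, b"payload")
    found = log.retrieve(7)      # packet 7 was logged
    missing = log.retrieve(8)    # packet 8 was never received by the logger
```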


Message          Description

GL_SRCH          Message sent by clients in search of Group Leaders
INT_IN_JOIN      Message sent by clients in response to the join reply from Group Leaders
GRP_JOIN_ACK     Acknowledgement message sent by Group Leader for group membership
JOIN_MY_GRP      Advertisement message sent by Group Leader for group membership
GL_STATUS        Status message sent by Group Leader to aid the group member's re-affiliation
LEAVE_GRP        Message sent by receivers informing the Group Leader before leaving the group
LEAVE_GRP_ACK    Acknowledgement for the previous message
FIND_GRP_LDR     Message sent by Group Leader to the local group to initiate leader election process
TERM_ACK         Message sent by receivers acknowledging the termination of the Group Leader
GL_ALIVE         Message sent by Group Leaders, which aids the receivers in finding the unresponsive
                 Group Leaders

Figure 5.2: Different Messages Used in Protocol Operation


Appendix A contains pseudo algorithms for the various components of the

protocol. Though the operations performed by them are listed sequentially, they

are performed concurrently. Appendix B contains the pseudo algorithms for various

group management techniques between the Group Leaders and receivers. The

various messages involved in the protocol operation are listed in Figure 5.2.
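The messages of Figure 5.2 can be collected into a small lookup structure. The sketch below is illustrative only: the underscore spellings of the message names, the `Role` and `ControlMessage` types, and the helper `messages_sent_by` are assumptions, and the sender of LEAVE_GRP_ACK is inferred from Algorithm 9 rather than stated in the figure.

```python
from enum import Enum


class Role(Enum):
    RECEIVER = "receiver"
    GROUP_LEADER = "group leader"


class ControlMessage(Enum):
    GL_SRCH = "GL_SRCH"
    INT_IN_JOIN = "INT_IN_JOIN"
    GRP_JOIN_ACK = "GRP_JOIN_ACK"
    JOIN_MY_GRP = "JOIN_MY_GRP"
    GL_STATUS = "GL_STATUS"
    LEAVE_GRP = "LEAVE_GRP"
    LEAVE_GRP_ACK = "LEAVE_GRP_ACK"
    FIND_GRP_LDR = "FIND_GRP_LDR"
    TERM_ACK = "TERM_ACK"
    GL_ALIVE = "GL_ALIVE"


# Which role originates each message, following the descriptions in Figure 5.2.
SENT_BY = {
    ControlMessage.GL_SRCH: Role.RECEIVER,
    ControlMessage.INT_IN_JOIN: Role.RECEIVER,
    ControlMessage.GRP_JOIN_ACK: Role.GROUP_LEADER,
    ControlMessage.JOIN_MY_GRP: Role.GROUP_LEADER,
    ControlMessage.GL_STATUS: Role.GROUP_LEADER,
    ControlMessage.LEAVE_GRP: Role.RECEIVER,
    ControlMessage.LEAVE_GRP_ACK: Role.GROUP_LEADER,  # inferred from Algorithm 9
    ControlMessage.FIND_GRP_LDR: Role.GROUP_LEADER,
    ControlMessage.TERM_ACK: Role.RECEIVER,
    ControlMessage.GL_ALIVE: Role.GROUP_LEADER,
}


def messages_sent_by(role):
    """Return the names of all control messages originated by the given role."""
    return sorted(m.name for m, r in SENT_BY.items() if r is role)


receiver_messages = messages_sent_by(Role.RECEIVER)
```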


[Diagram not reproduced. Use case diagram and sequence flows; actors: Sender, Central Logger, Group Leader, Receiver.]

Figure 5.3: Use Case and Sequence Analysis for Multicast Data Operation


5.2 Use Case and Sequence Diagrams

This section deals with the use case design for the various phases of the protocol
operation. The use case analysis depicts the operational phases of the protocol as
different cases and analyzes them, together with the major players involved, in an
abstract way without any implementation details.

[Diagram not reproduced. Use case diagram and sequence flows; actors: Sender, Central Logger, Group Leader, Receiver; flows: Data, Acknowledgement.]

Figure 5.4: Use Case and Sequence Analysis for Acknowledgement

Multicast data analysis. Figure 5.3 depicts the use case diagram and the
sequence flow associated with the operation of multicast data transfer. The sender
multicasts the data packets to the whole group, and they are received by all the
components of the group. The direction of data flow from the sender to the
central log server, Group Leaders, and receivers is shown as sequence flows in
Figure 5.3.

[Diagram not reproduced. Use case diagram and sequence flows; actors: Sender, Central Logger, Group Leader, Receiver; flows: Data, Retransmission.]

Figure 5.5: Use Case and Sequence Analysis for Error Recovery

Acknowledgement analysis. Figure 5.4 depicts the use case diagram and the
sequence flow associated with the acknowledgement packets. The data packets sent
by the sender are acknowledged by the central log server and by the Group Leaders
attached directly to it. Each Group Leader in turn receives acknowledgements from
its receivers and from the Group Leaders attached below it. The second part of
Figure 5.4 contains four sequence diagrams. Both the central log server and the
Group Leaders attached directly to the source send their acknowledgements to the
sender in response to the data packets received from it. The receivers send their
acknowledgements to their Group Leader rather than to the sender, and Group
Leaders acknowledge their parent Group Leaders. The different levels in the
multicast tree improve the scalability of the protocol. The flow in all four cases
is represented as sequence flows in Figure 5.4.
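The effect of hierarchical acknowledgement on the sender's load can be illustrated with a small count. The tree below is a made-up example, not a topology from the thesis; the node names and the helper `acks_received` are assumptions introduced only for this sketch.

```python
from collections import Counter

# Hypothetical multicast tree: each node maps to the parent it acknowledges.
# S is the sender, LOG the central log server, GLn Group Leaders, Rn receivers.
PARENT_OF = {
    "LOG": "S", "GL1": "S", "GL2": "S",
    "R1": "GL1", "R2": "GL1", "GL3": "GL1",
    "R3": "GL3", "R4": "GL3",
    "R5": "GL2",
}


def acks_received(parent_of):
    """Count, per node, how many ACKs arrive for one data packet."""
    return Counter(parent_of.values())


per_node = acks_received(PARENT_OF)
at_sender = per_node["S"]          # only LOG, GL1 and GL2 acknowledge the sender
flat_at_sender = len(PARENT_OF)    # if every node acknowledged the sender directly
```

In this example the sender handles three ACKs per packet instead of nine, and no single node handles more than three, which is the scalability argument made above.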


[Diagram not reproduced. Use cases: Form Group, Shift Group, Modify Leadership, Terminate Group, Monitor Members; actors: Sender, Group Leader, Child Group Leader, Receiver.]

Figure 5.6: Use Case and Sequence Analysis for Group Management

Error recovery analysis. Figure 5.5 depicts the use case diagram and the
sequence flow associated with the error recovery packets. Error recovery is required
when packets are lost by any of the components of the multicast group. There are
six different sequences associated with error recovery. The sender retransmits lost
packets to the central log server and to its child Group Leaders. The central log
server retransmits packets lost by either the receivers or the Group Leaders. The
Group Leaders also retransmit packets to their receivers and to their child Group
Leaders. The flow in all six cases is represented as sequence flows in Figure 5.5.
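The six recovery sequences can be written down as (retransmitter, requester) pairs. The sketch below also shows one possible way to pick a retransmitter; the preference order (parent first, then central log server, then sender) and the function name `choose_retransmitter` are assumptions for illustration, not a rule stated in the thesis.

```python
# The six (retransmitter, requester) sequences described for Figure 5.5.
RECOVERY_PATHS = {
    ("sender", "central_logger"),
    ("sender", "group_leader"),
    ("central_logger", "receiver"),
    ("central_logger", "group_leader"),
    ("group_leader", "receiver"),
    ("group_leader", "group_leader"),
}


def choose_retransmitter(requester_role, parent_role, parent_has_packet):
    """Prefer local recovery from the parent; otherwise fall back (assumed order)."""
    if parent_has_packet and (parent_role, requester_role) in RECOVERY_PATHS:
        return parent_role
    if ("central_logger", requester_role) in RECOVERY_PATHS:
        return "central_logger"
    return "sender"


local = choose_retransmitter("receiver", "group_leader", True)
fallback = choose_retransmitter("receiver", "group_leader", False)
logger_loss = choose_retransmitter("central_logger", "sender", True)
```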

Group management analysis. The use case diagram for the Group Management is

depicted in Figure 5.6. The sender and the Group Leaders are involved in

activities such as formation and termination of the groups. The other operations

such as shifting among the group members, modifying the leadership and

monitoring of the members are restricted to the receivers and the Group Leaders.

Overall sequence analysis. Figure 5.7 represents the combined sequence

diagram of all the operations represented above. This sequence diagram represents

the order in which the protocol operation takes place. The four entities are

represented along with an extra Group Leader and an extra receiver to show the

retransmission operation involved. Each arrow represents the flow of packets from a

source entity to a destination entity in the direction of the pointing arrowhead.

They are marked as X.Y with X being the sending entity and Y being the sequence

number of the sending entity. It is seen that the source multicasts the data packet

to all the entities and everyone acknowledges it promptly. Receiver R1

acknowledges the data packet to its Group Leader whereas R2 requests a

retransmission. After the reception of the retransmission, R2 acknowledges the

same. The other details pertaining to the retransmissions by central log server and

source, congestion control schemes and the group management are not shown in

Figure 5.7.
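The X.Y labelling of arrows can be reproduced with a per-entity counter. This sketch assumes that Y counts the packets each entity has sent so far, and it uses a simplified event list (one Group Leader, two receivers) based on the description above; the entity names and the helper `label_events` are hypothetical.

```python
from collections import defaultdict


def label_events(events):
    """Attach an X.Y label to each (sender, receiver, kind) event.

    X is the sending entity and Y is that entity's own send counter
    (an assumed reading of "sequence number of the sending entity").
    """
    counters = defaultdict(int)
    labelled = []
    for sender, receiver, kind in events:
        counters[sender] += 1
        labelled.append((f"{sender}.{counters[sender]}", receiver, kind))
    return labelled


# Simplified trace: data multicast, R1 ACKs, R2 asks for and then ACKs a retransmission.
EVENTS = [
    ("S", "ALL", "DATA"),
    ("R1", "GL", "ACK"),
    ("R2", "GL", "NACK"),
    ("GL", "R2", "RETRANSMIT"),
    ("R2", "GL", "ACK"),
]

labels = [label for label, _, _ in label_events(EVENTS)]
```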

[Diagram not reproduced. Entities: Sender, Central Logger, Grp Ldr 2, Receiver 1, Receiver 2. Arrows are labelled X.Y, where X is the sending entity and Y is the sequence of the sending entity.]

Figure 5.7: Sequence Diagram of the Protocol

5.3 Flow Diagram

Figure 5.8 depicts the data flow and control flow inside a component of the
protocol. The component could be a sender, a Group Leader, or the central log
server. It cannot be a receiver, because buffer mechanisms are involved. Data flow
is drawn as a solid dark line and control flow as a dashed line. There are two
separate buffers, one for transmission and one for retransmission. Similarly, there
are separate transmission mechanisms for multicast and unicast, and separate
entities for managing the control messages for group management and flow control.
An important feature of the flow inside the component is that data flows in from
the top and out through the bottom, while control flows in and out through the
sides. Separate queues are provided for ACK and NACK.
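The separation of buffers and queues inside a component can be sketched as a small class. This is a simplified illustration under assumptions: the class and method names are hypothetical, a packet is assumed to move from the transmission buffer to the retransmission buffer when it is sent, and a single ACK is treated as enough to garbage-collect a packet.

```python
from collections import deque


class Component:
    """Hypothetical sketch of a sender, Group Leader, or log server component."""

    def __init__(self):
        self.tx_buffer = deque()    # packets waiting for first transmission
        self.retx_buffer = {}       # seq -> payload kept for retransmission
        self.ack_queue = deque()    # separate queue for ACK control input
        self.nack_queue = deque()   # separate queue for NACK control input

    def data_in(self, seq, payload):
        """Data input (top of the figure)."""
        self.tx_buffer.append((seq, payload))

    def data_out(self):
        """Data output (bottom): send the next packet and keep a copy."""
        seq, payload = self.tx_buffer.popleft()
        self.retx_buffer[seq] = payload
        return seq, payload

    def control_in(self, kind, seq):
        """Control input (side): route ACKs and NACKs to their own queues."""
        (self.ack_queue if kind == "ACK" else self.nack_queue).append(seq)

    def process_control(self):
        """Serve NACKs from the ReTx buffer, then garbage-collect ACKed packets."""
        resend = []
        while self.nack_queue:
            seq = self.nack_queue.popleft()
            if seq in self.retx_buffer:
                resend.append((seq, self.retx_buffer[seq]))
        while self.ack_queue:
            self.retx_buffer.pop(self.ack_queue.popleft(), None)
        return resend


c = Component()
c.data_in(1, b"a")
c.data_in(2, b"b")
c.data_out()
c.data_out()
c.control_in("NACK", 2)
c.control_in("ACK", 1)
resend = c.process_control()
remaining = sorted(c.retx_buffer)
```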










[Diagram not reproduced. Labelled elements: Data Input, Tx Buffer, ReTx Buffer, Garbage Collector, Control Input, Control Output, Ctrl Msgs, ACK and NACK queues, Unicast, Data Output.]

Figure 5.8: Flow Diagram of a Component

5.4 Summary

This chapter provided a functional model of the protocol framework. The

functional model is intended to provide an abstract overview of the protocol

without concentrating on the implementation details. This model could also serve

as an interface between the functional people and technical team. The model has

use case analysis, sequence analysis, flow diagrams, and sequence diagrams. These

were drawn according to the UML specifications.















CHAPTER 6
CONCLUSION AND FUTURE WORK

6.1 Conclusion

Reliable multicast refers to the reliable manner in which a message is sent from a

sender to a set of receivers. It is still an active area of research with several works

in literature. Most of the works are classified as sender-initiated and

receiver-initiated approaches. Although there are a number of protocols in

existence, most of them were developed with a particular application in mind.

There are several important issues that make the design of a reliable multicast

protocol difficult. The protocol has to be scalable to a large number of receivers.

Since several receivers are involved, the implosion problem must be addressed.

Implosion at the sender occurs when all the receivers send their response traffic

back to source. The protocol should provide recovery mechanisms that minimize

the control traffic and recovery time. The protocol must also provide mechanisms

to recover from congestion in the network. Mechanisms for managing the different

entities in the protocol must be specified. All these factors make the design of a

reliable multicast protocol challenging.

Some of the popular reliable multicast protocols in existence are LGMP, RMTP,

TRAM, SRM, LBRM, and XTP. An analysis of these protocols with respect to the

issues mentioned previously is given in Chapter 3. The concept of logically

grouping the receivers was proposed in LGC. Although LGC recovered the packets

locally, it had to go to the sender if none of the members had the missing packet.

LGC did not mention congestion control. The local grouping concept proposed by

LGC was later adopted in the designs of RMTP and TRAM. RMTP builds a

number of local subtrees, which together form the global multicast tree. Although









RMTP is scalable to a large number of receivers, it does not support much in the

way of group management techniques. RMTP does not provide end-to-end

congestion control with any feedback sent from the receivers. Multicast in TRAM

is based on the repair tree construction. TRAM has several features, such as local

grouping and tree construction, that are based upon LGC and RMTP. TRAM provides

algorithms for optimized tree construction techniques suitable for a variety of

multicast applications. While TRAM does not support complete data recovery, it

does mention a number of control and status messages for performing the group

management techniques. LBRM proposes the idea of distributed logging of data

packets, which aids in local recovery and reduces the end-to-end propagation delay.

Although the receivers are grouped together forming sites, LBRM does not specify

group management techniques. LBRM was developed for high performance

simulation applications that require low-latency packet loss detection. Hence the

protocol has the ability to provide keep-alive or heartbeat data packets, even when

the application does not provide it. SRM was developed for supporting a

distributed whiteboard application. There is no local grouping of receivers, which

severely limits the scalability of the protocol. Retransmission requests made by the

receivers are multicast to the whole group, and other receivers that require the

same packets back off on seeing the request. MTCP has a hierarchical structure of

multicast receivers with source rooted on top of the tree. It does not logically

group the receivers, which again limits its scalability. It proposes a

congestion control algorithm based on hierarchical feedback sent by the receivers up

towards the source.

Contribution. This thesis proposes a framework for a reliable multicast protocol

that is scalable to a large number of receivers and is rich in group management

techniques. The important features of the protocol are discussed as follows.









The architectural design of the protocol with the local groupings of receivers
arranged hierarchically makes it scalable to a large number of receivers.

Local recovery of packets at the Group Leaders reduces the round trip time
involved in retransmission.

Reporting of ACKs and NACKs by the receivers to the Group Leaders
eliminates the ACK implosion problem by design.

Complete recovery of data packets for the receivers joining late in the
multicast is accomplished through the central logging server.

Unresponsive receivers and Group Leaders are monitored by a special control
mechanism.

The protocol provides a receiver-based congestion control mechanism along
with TCP-like slow start congestion control algorithms at the source, which
make the protocol TCP-friendly.

Receivers and Group Leaders report the congestion notifications in a
hierarchical fashion to the source.

The protocol provides two different group formation techniques, the expanding
ring search and the advertisement method, which are adapted dynamically to suit
the network load.

Receivers can dynamically reconfigure themselves by changing the group
membership with the help of status messages from Group Leaders.
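The TCP-like slow start behaviour at the source can be illustrated with a window that grows per acknowledged round and collapses on a congestion notification. The sketch follows the classic TCP slow start and congestion avoidance pattern; the class name `SourceWindow`, the initial threshold, and the exact reaction to a notification are assumptions, since the details of the source algorithm are not given in this chapter.

```python
class SourceWindow:
    """Hypothetical TCP-like window kept by the multicast source (in packets)."""

    def __init__(self, ssthresh=16):
        self.cwnd = 1
        self.ssthresh = ssthresh

    def on_round_acked(self):
        """All direct children ACKed the current window."""
        if self.cwnd < self.ssthresh:
            # Slow start: double the window, capped at the threshold.
            self.cwnd = min(self.cwnd * 2, self.ssthresh)
        else:
            # Congestion avoidance: grow by one packet per round.
            self.cwnd += 1

    def on_congestion_notified(self):
        """A congestion report reached the source through the hierarchy."""
        self.ssthresh = max(self.cwnd // 2, 2)
        self.cwnd = 1


w = SourceWindow(ssthresh=8)
history = []
for _ in range(5):
    w.on_round_acked()
    history.append(w.cwnd)

w.on_congestion_notified()
after_notification = (w.cwnd, w.ssthresh)
```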

The framework also includes a list of pseudo algorithms for the multicast

operation and group management techniques. A functional model detailing the

various components of the protocol was proposed. The model also included a list of

use-case and sequence diagrams for the basic multicast operation of the protocol.

Some of the features in the protocol framework were adapted from the earlier

works. The concept of local grouping of receivers was adopted from LGC, but the

features have been modified to suit the current framework. Several new features

like the dynamic group management techniques and complete data recovery

through central logger were incorporated.









6.2 Future Work

Future work could be done in the following areas.

The framework discussed in this thesis addresses only point-to-multipoint
communication. For multipoint-to-multipoint communication with several
senders, multicast trees would have to be set up at each sender.

The framework does not discuss the security aspects in detail such as source
authentication, access control and the key management techniques.

Tree optimization techniques are not discussed in the framework, although
factors that affect correct tree formation are mentioned.
Dynamic optimization by continually reconfiguring the tree structure by
breaking or branching the tree to suit the network conditions is an interesting
area to address.

The presence of central log server helps in the complete data recovery.
Multiple permanent log servers could be distributed to reduce the
propagation delay in retransmission and for scaling purposes. The tradeoff
involved in the introduction of distributed log servers versus the buffering of
data at the Group Leaders could be studied.

This thesis provides a framework along with a functional

model for a reliable multicast protocol. It also provides pseudo algorithms for the

multicast operation and group management techniques. The proposed protocol

framework is for point-to-multipoint communication. Analysis and issues

relating to the multipoint-to-multipoint communications could be studied further.















APPENDIX A
PSEUDO ALGORITHMS FOR MULTICAST OPERATION

Algorithm 1: Sender Operation
while Connection is Open do
    Receive Data from Sending Application;
    Buffer the packets before sending;
    Multicast Data packets to the Group;
    Process Acknowledgements and Retransmissions;
    Manage buffered Data packets;
    if Notified for Congestion then
        Reduce the sending rate;
    end
    Perform Group Management activities;
end


Algorithm 2: Central Log Server Operation
while True do
    Receive Data packets from Sender;
    Send ACK or NACK to Sender;
    if Request for Recovery then
        Perform Retransmissions;
    end
    Store and Retrieve Data from Disk;
end














Algorithm 3: Group Leader Operation
while True do
    Receive Data packets from Sender;
    Buffer Data packets;
    Send ACK or NACK to Parent;
    Process ACK or NACK from Children;
    Manage the buffered Data;
    if Request for Recovery then
        Perform Retransmissions;
    end
    Perform Group Management activities;
    if Congestion or problem then
        Send Notification to Parent;
    end
    Send received packets to Receiving Application;
end











Algorithm 4: Receiver Operation
while True do
    Receive Data packets from Sender;
    Send ACK or NACK to Group Leader;
    Perform Group Management Activities;
    if Congestion or problem then
        Send Notification to Group Leader;
    end
    Send received packets to Receiving Application;
end















APPENDIX B
PSEUDO ALGORITHMS FOR GROUP MANAGEMENT

Algorithm 5: Expanding Ring Search: Client
while GroupLeaderFound == False do
    Multicast a GROUP_LEADER_SEARCH message;
    Collect Responses;
    if No Replies then
        Multicast the GROUP_LEADER_SEARCH message with longer TTL;
    else
        foreach Reply collected from GroupLeader do
            Analyze the reply for best characteristics;
        end
        Send an INTERESTED_IN_JOINING message to GroupLeader;
        if ACK_JOINING received then
            GroupLeaderFound ← True;
        else
            Continue Searching;
        end
    end
end
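The client side of the expanding ring search can be simulated without a network. The sketch below is an illustration under assumptions: the TTL is doubled on each empty round (the algorithm only says "longer TTL"), "best characteristics" is taken to mean the fewest hops, and Group Leaders that are full do not reply, following Algorithm 6. The function name and the `leaders` data layout are hypothetical.

```python
def expanding_ring_search(leaders, initial_ttl=1, max_ttl=32):
    """Simulate the search of Algorithm 5.

    leaders maps a Group Leader name to (hops_away, has_room).
    Returns (chosen_leader, ttl_used), or (None, last_ttl) if nothing was found.
    """
    ttl = initial_ttl
    while ttl <= max_ttl:
        # A leader replies only if the search reaches it and it still has room.
        replies = [(hops, name) for name, (hops, has_room) in leaders.items()
                   if hops <= ttl and has_room]
        if replies:
            _, name = min(replies)   # assumed criterion: fewest hops
            return name, ttl
        ttl *= 2                     # assumed policy: double the TTL and retry
    return None, ttl


LEADERS = {"GL1": (3, False), "GL2": (5, True), "GL3": (7, True)}
chosen, ttl_used = expanding_ring_search(LEADERS)
nothing_found = expanding_ring_search({})
```

With these values the rounds at TTL 1, 2, and 4 produce no reply (GL1 is within reach at TTL 4 but full), and the round at TTL 8 reaches GL2 and GL3, of which GL2 is nearer.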


Algorithm 6: Expanding Ring Search: GroupLeader
while True do
    Collect messages from all potential receivers;
    if GROUP_LEADER_SEARCH message then
        foreach Request collected do
            if MAX_MEMBERS not exceeded then
                Send Reply with the Group characteristics;
            end
        end
    else if INTERESTED_IN_JOINING message then
        Send ACK_JOINING message;
    end
end













Algorithm 7: Advertisement Method: Client
while GroupLeaderFound == False do
    Collect the JOIN_MY_GROUP advertisement from GroupLeader;
    foreach Advertisement message collected do
        Analyze and Categorize for best characteristics;
        Choose a Group Leader reachable with minimum hops;
        if Group Leader's MAX_MEMBERS not exceeded then
            Send INTERESTED_IN_JOINING message;
            Wait for ACK_JOINING from GroupLeader;
            if ACK_JOINING received is Success then
                GroupLeaderFound ← True;
            else
                Continue Searching;
            end
        else
            Analyze another Reply;
        end
    end
end








Algorithm 8: Advertisement Method: Group Leader
while True do
    Send JOIN_MY_GROUP at certain Time Intervals;
    Collect responses from receivers;
    if INTERESTED_IN_JOINING message then
        if MAX_MEMBERS not exceeded then
            Send Positive ACK_JOINING message;
            Increment GROUPMEMBER count;
        else
            Send Negative ACK_JOINING message;
        end
    end
end














Algorithm 9: Receivers Shifting Local Group: Client
while True do
    Collect the GROUP_LDR_STATUS message from Group Leader(s);
    foreach Status message collected do
        Analyze and Choose the Group Leader with best characteristics;
        Send INTERESTED_IN_JOINING message;
        if Positive ACK_JOINING is received then
            Send LEAVE_GROUP to Group Leader;
            Wait for the receipt of LEAVE_GROUP_ACK;
            Join the new group;
        else
            Analyze another status message;
        end
    end
end


Algorithm 10: Receivers Shifting Local Group: Group Leader
while True do
    Send GROUP_LDR_STATUS message at regular Time Intervals;
    Collect responses from receivers;
    if INTERESTED_IN_JOINING message then
        if MAX_MEMBERS not exceeded then
            Send Positive ACK_JOINING message;
            Increment GROUPMEMBER count;
        else
            Send Negative ACK_JOINING message;
        end
    end
end
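The Group Leader side of joining and shifting (Algorithms 8 and 10) reduces to an admission check against MAX_MEMBERS. The sketch below is illustrative only: the class and method names are hypothetical, the boolean return value stands in for a positive or negative ACK_JOINING, and the handling of a repeated join request is an assumption.

```python
class GroupLeader:
    """Hypothetical sketch of membership admission at a Group Leader."""

    def __init__(self, max_members):
        self.max_members = max_members
        self.members = set()

    def handle_interested_in_joining(self, receiver):
        """Return True for a positive ACK_JOINING, False for a negative one."""
        if receiver in self.members:
            return True                    # assumed: repeated requests are re-acknowledged
        if len(self.members) < self.max_members:
            self.members.add(receiver)     # "Increment GROUPMEMBER count"
            return True
        return False

    def handle_leave_group(self, receiver):
        """Remove the receiver; the return value stands in for LEAVE_GROUP_ACK."""
        self.members.discard(receiver)
        return True


gl = GroupLeader(max_members=2)
answers = [gl.handle_interested_in_joining(r) for r in ("R1", "R2", "R3")]
gl.handle_leave_group("R1")                # R1 shifts to another local group
retry = gl.handle_interested_in_joining("R3")
```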












Algorithm 11: Group Leader Shift or Termination: Group Leader
if GROUPMEMBER count less than Threshold then
    Perform regular Group Leader activities;
    if wish to leave or terminate then
        Inform all receivers about termination;
        Receive TERMINATE_ACK from all receivers;
        Fulfill all pending obligations;
        Initiate the FIND_GROUP_LEADER operation;
        Terminate;
    end
else
    Inform all receivers about termination;
    Receive TERMINATE_ACK from all receivers;
    Inform source about termination of group;
    Shed all clients;
    Terminate receivers and self;
end


Algorithm 12: Monitoring of Unresponsive Group Leader: Receiver
while True do
    Receive the GL_ALIVE messages;
    if Do not receive GL_ALIVE messages then
        Mark its absence in the ACK message;
        Send the marked ACK to Group Leader;
        Wait for the response;
        Send 3 marked ACKs until any response;
        if No Response from Group Leader then
            Group Leader is dead;
            Initiate algorithm to join a new group;
        else
            Group Leader is Alive;
            Remain in the same group;
        end
    end
end



























Algorithm 13: Monitoring of Unresponsive Receivers: Group Leader
while True do
    Send GL_ALIVE messages periodically;
    Receive the ACK from the receivers;
    if Do not receive 3 ACK messages consecutively then
        Unicast a GL_ALIVE to the receiver;
        Wait for the response;
        if No Response from the receiver then
            Receiver is dead or left the group;
            Terminate the receiver from the group;
        else
            Receiver is Alive;
            Continue Multicast Operation;
        end
    end
end
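The monitoring loop of Algorithm 13 can be sketched as a per-receiver counter of missed ACKs. This is a minimal sketch under assumptions: the class name `ReceiverMonitor`, the round-based interface, and the `probe` callback (which stands in for unicasting GL_ALIVE and waiting for a response) are hypothetical, and the misses are counted as consecutive rounds.

```python
MISSED_ACK_LIMIT = 3  # Algorithm 13 reacts after three missing ACKs


class ReceiverMonitor:
    """Hypothetical sketch of how a Group Leader tracks unresponsive receivers."""

    def __init__(self, receivers):
        self.missed = {r: 0 for r in receivers}

    def end_of_round(self, acked, probe):
        """Update counters after one round and return the receivers removed.

        acked is the set of receivers whose ACK arrived this round.
        probe(receiver) returns True if the receiver answers a unicast GL_ALIVE.
        """
        removed = []
        for r in list(self.missed):
            if r in acked:
                self.missed[r] = 0
                continue
            self.missed[r] += 1
            if self.missed[r] >= MISSED_ACK_LIMIT:
                if probe(r):
                    self.missed[r] = 0      # receiver is alive, keep it
                else:
                    del self.missed[r]      # receiver is dead or has left
                    removed.append(r)
        return removed


monitor = ReceiverMonitor(["R1", "R2"])
removed_per_round = [monitor.end_of_round({"R1"}, lambda r: False) for _ in range(3)]
members_left = sorted(monitor.missed)
```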















REFERENCES


[1] Information Sciences Institute, "Internet Protocol," Request for Comment
(RFC) 791, Internet Engineering Task Force, September 1981

[2] Information Sciences Institute, "Transmission Control Protocol," Request for
Comment (RFC) 793, Internet Engineering Task Force, September 1981

[3] S. Floyd, V. Jacobson, C. Liu, S. McCanne, L. Zhang, "A Reliable
Multicast Framework for Light-weight sessions and Application Level
Framing," IEEE/ACM Transactions on Networking, vol. 5, no. 6, pages
784-803, December 1997

[4] J. W. Atwood, O. Catrina, J. Fenton, W. T. Strayer, "Reliable
Multicasting in Xpress Transport Protocol," IEEE Conference on LCN, pages
202-211, Minneapolis, MN, USA, October 1996

[5] H. W. Holbrook, S. K. Singhal, D. R. Cheriton, "Log-based
Receiver-reliable Multicast for Distributed Interactive Simulation,"
SIGCOMM, pages 328-341, 1995

[6] M. Hofmann "A Generic Concept for Large-Scale Multicast," International
Zurich Seminar on Digital Communications, pages 95-106, 1996

[7] I. Rhee, N. Ballaguru, G. Rouskas, "MTCP: Scalable TCP-like Congestion
Control for Reliable Multicast," IEEE INFOCOM, vol. 3, pages 1265-1273,
New York, NY, USA, March 1999

[8] S. Paul, K. Sabnani, J. Lin, S. Bhattacharyya "Reliable Multicast
Transport Protocol (RMTP)," IEEE Journal of Selected Areas in
Communications, vol. 15, pages 407-421, April 1997

[9] D. Chiu, S. Hurst, M. Kadansky, "TRAM: A Tree-Based Reliable Multicast
Protocol," Technical Report TR-98-66, Sun Microsystems, July 1998

[10] W. Stallings, Cryptography and Network Security: Principles and Practice,
Second Edition, Prentice Hall, Upper Saddle River, New Jersey, July 1998

[11] V. Roca, L. Costa, R. Vida, A. Dracinschi, S. Fdida, "A Survey of
Multicast Technologies," Technical Report, LNCS 2236, September 2000









[12] B. Quinn, K. Almeroth, "IP Multicast Applications: Challenges and
Solutions," Request for Comment (RFC) 3170, Internet Engineering Task
Force, September 2001

[13] B. N. Levine, J. J. Garcia-Luna-Aceves, "A Comparison of Reliable
Multicast Protocols," International Conference on Network Protocols, pages
112-121, Columbus, OH, USA, October 1996

[14] K. Obraczka, "Multicast Transport Protocols: A Survey and Taxonomy,"
IEEE Communications Magazine, vol. 36, pages 94-102, Toronto, Ont.,
Canada, January 1998

[15] J. Postel, Information Sciences Institute, "User Datagram Protocol,"
Request for Comment (RFC) 768, Internet Engineering Task Force, August
1980

[16] V. Jacobson, "Congestion Avoidance and Control," ACM SIGCOMM, vol.
18, pages 314-329, Stanford, CA, USA, August 1988

[17] X. Xiao, L. M. Ni, "Internet QoS: A Big Picture," IEEE Network, vol. 13,
pages 8-18, March-April 1999

[18] A. Mankin, A. Romanow, S. Bradner, V. Paxson, "IETF Criteria for
Evaluating Reliable Multicast Transport and Application Protocols," Request
for Comment (RFC) 2357, Internet Engineering Task Force, June 1998

[19] J. Nonnenmacher, M. Lacher, M. Jung, E. Biersack, G. Carle, "How bad
is Reliable Multicast without Local Recovery ?," INFOCOM, vol. 3, pages
972-979, San Francisco, CA, USA, April 1998

[20] S. K. Kasera, J. Kurose, D. Towsley, "A Comparison of Server-based and
Receiver-based Local Recovery Approaches for Scalable Reliable Multicast,"
Technical Report UM-CS-1997-069, 1997

[21] S. J. Golestani, K. Sabnani, "Fundamental Observations on Multicast
Congestion Control in the Internet," INFOCOM, vol. 2, pages 990-1000, New
York, NY, USA, March 1999

[22] S. Floyd, M. Handley, "Requirements for Congestion Control for Reliable
Multicast," Reliable Multicast Workshop in Cannes, September 1997

[23] M. J. Moyer, J. R. Rao, P. Rohatgi, "A Survey of Security Issues in
Multicast Communications," IEEE Network, vol. 13, pages 12-23,
November-December 1999















BIOGRAPHICAL SKETCH


Venkata L. Ramasubramaniam was born in Srivilliputtur, Tamil Nadu, India, on

October 17, 1978. He earned his high school diploma from Sir. M. Venkata Subba

Rao matriculation school. He graduated with a Bachelor of Engineering degree

with distinction in computer science and engineering in 1999 from Madurai

Kamaraj University, Madurai. He came to the University of Florida (in Gainesville,

Florida) to pursue a Master of Science degree in the Computer and Information

Science and Engineering.