
Network Resource Provisioning in Research Networks

Permanent Link: http://ufdc.ufl.edu/UFE0041996/00001

Material Information

Title: Network Resource Provisioning in Research Networks
Physical Description: 1 online resource (107 p.)
Language: english
Creator: Jung, Eun
Publisher: University of Florida
Place of Publication: Gainesville, Fla.
Publication Date: 2010

Subjects

Subjects / Keywords: cloud, escience, network, optimization, workflow
Computer and Information Science and Engineering -- Dissertations, Academic -- UF
Genre: Computer Engineering thesis, Ph.D.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, territorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract: Advances in optical communication and networking technologies, together with computing and storage technologies, are dramatically changing the ways scientific research is conducted. A new term, e-Science, has emerged to describe large-scale science carried out through distributed global collaborations enabled by networks, requiring access to very large scale data collections, computing resources, and high-performance visualization. E-Science application workflows are complex and require schedulable, high-bandwidth connectivity with known future characteristics. Moreover, these workflows have performance requirements or metrics that have not been considered by conventional networking. For example, a large file transfer may need a guaranteed total turnaround time and rate of progress. Given the long duration of many requests, the available network resources may change before a request is completed. We develop a novel framework for provisioning a variety of e-Science applications that require complex workflows spanning multiple domains. Our framework provides guarantees on performance while incurring minimal overhead, both necessary conditions for such a framework to be adopted in practice.
General Note: In the series University of Florida Digital Collections.
General Note: Includes vita.
Bibliography: Includes bibliographical references.
Source of Description: Description based on online resource; title from PDF title page.
Source of Description: This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility: by Eun Jung.
Thesis: Thesis (Ph.D.)--University of Florida, 2010.
Local: Adviser: Ranka, Sanjay.
Local: Co-adviser: Sahni, Sartaj.

Record Information

Source Institution: UFRGP
Rights Management: Applicable rights reserved.
Classification: lcc - LD1780 2010
System ID: UFE0041996:00001



Full Text

PAGE 1

NETWORK RESOURCE PROVISIONING IN RESEARCH NETWORKS
By
EUN-SUNG JUNG
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
2010

PAGE 2

© 2010 Eun-Sung Jung

PAGE 3

To my parents, my wife, Hyeseon, and my daughter, Lauren

PAGE 4

ACKNOWLEDGMENTS
First of all, I would like to thank my chair, Dr. Sanjay Ranka, and my co-chair, Dr. Sartaj Sahni. Since I started to work with them, they have inspired me, guided me through all the research, and given me invaluable advice, suggestions, comments and support with patience and generosity. I also would like to show my sincere gratitude to my supervisory committee members for insightful comments on my research. I would like to give my deepest gratitude to my family and friends. Without their help and support, this dissertation would not have been possible.

PAGE 5

TABLE OF CONTENTS
page
ACKNOWLEDGMENTS ... 4
LIST OF TABLES ... 7
LIST OF FIGURES ... 8
ABSTRACT ... 11
CHAPTER
1 INTRODUCTION ... 12
1.1 Overview ... 12
1.2 Target Networks and Services ... 14
1.3 Problems Addressed and Our Contributions ... 16
1.3.1 Bandwidth Allocation for Iterative Data-dependent Applications ... 16
1.3.2 Topology Aggregation for E-Science Networks ... 17
1.3.3 Workflow Scheduling ... 18
1.4 Background and Related Work ... 19
1.5 Outline of Dissertation ... 24
2 BANDWIDTH ALLOCATION FOR ITERATIVE DATA-DEPENDENT E-SCIENCE APPLICATIONS ... 26
2.1 Overview ... 26
2.2 Synchronous Dataflow for E-Science Applications ... 28
2.3 Problem Formulation ... 33
2.3.1 Illustrative Example ... 34
2.3.2 Optimal Bandwidth Allocation with a Feasible Schedule ... 37
2.3.2.1 Modeling communication delays ... 38
2.3.2.2 Problem formulation ... 42
2.4 Experimental Evaluation ... 45
2.5 Summary ... 47
3 TOPOLOGY AGGREGATION ... 49
3.1 Overview ... 49
3.2 Related Work ... 52
3.3 TA for Multiple-Path Multiple-Job (MPMJ) ... 54
3.3.1 Problem Statement ... 54
3.3.2 New Topology Aggregation Algorithms ... 56
3.3.2.1 Full-mesh method ... 56
3.3.2.2 Star method ... 57
3.3.2.3 Partitioned star method ... 58
3.4 Routing ... 59

PAGE 6

3.5 Complexity Analysis ... 61
3.6 Experimental Evaluation ... 62
3.6.1 Bulk File Transfers in E-Science ... 62
3.6.2 Experiment Testbed ... 63
3.6.3 Performance Metrics ... 64
3.6.4 Results ... 64
3.7 Summary ... 65
4 WORKFLOW SCHEDULING ... 67
4.1 Overview ... 67
4.2 Workflow Scheduling in E-Science Networks ... 70
4.2.1 System Model and Data Structure ... 71
4.2.1.1 Time model ... 71
4.2.1.2 Network resource model ... 71
4.2.1.3 Workflow model ... 72
4.2.2 Problem Statement ... 72
4.2.3 Construction of an Auxiliary Graph ... 73
4.3 MILP Formulation ... 75
4.3.1 Single Workflow ... 78
4.3.1.1 Multi-commodity flow constraints ... 78
4.3.1.2 Task assignment constraints ... 80
4.3.1.3 Precedence constraints ... 80
4.3.1.4 Deadline constraints ... 81
4.3.2 Multiple Workflows ... 81
4.3.3 Time Complexity ... 81
4.4 LP Relaxation ... 83
4.5 List Scheduling Heuristic ... 86
4.6 Experimental Evaluation ... 89
4.6.1 Experiment Setup ... 89
4.6.2 Results ... 92
4.6.2.1 Schedule length of workflows ... 92
4.6.2.2 Computational time ... 93
4.7 Summary ... 95
5 CONCLUSIONS ... 97
REFERENCES ... 98
BIOGRAPHICAL SKETCH ... 107

PAGE 7

LIST OF TABLES
Table ... page
2-1 Comparison between DSP and e-Science applications ... 30
2-2 Summary of system parameters of the visualization application ... 36
2-3 Notation for problem formulation ... 40
3-1 Time Complexity for MPMJ ... 61
3-2 Space Complexity for MPMJ ... 62
4-1 Notation for problem formulation ... 77
4-2 Single workflow scheduling formulation time complexity analysis ... 84
4-3 Edge-path form single workflow scheduling formulation time complexity analysis ... 88

PAGE 8

LIST OF FIGURES
Figure ... page
2-1 An example of SDFG ... 29
2-2 A homogeneous SDFG converted from Figure 2-1 (a) ... 33
2-3 A real example of e-Science applications [53] ... 35
2-4 An ESDFG model for Figure 2-3 ... 35
2-5 Modeling communication delay in a SDFG ... 39
2-6 Modeling communication delay in the case of multiple communication channels ... 41
2-7 More exploited parallelism in case of multiple communication channels ... 42
2-8 BAFS problem formulation in case of the conservative model ... 42
2-9 BAFS problem formulation in case of the optimistic model ... 43
2-10 The Abilene network ... 46
2-11 Rejection ratio vs. number of requests ... 47
3-1 An example of inter-domain QoS routing ... 51
3-2 An illustrative example for limitations of the line segment algorithm ... 54
3-3 Full-mesh AR ... 56
3-4 Star AR ... 57
3-5 Partitioned star AR ... 60
3-6 Earliest finish time on-line scheduling of multiple file transfers ... 63
3-7 Error ratio vs. the number of nodes ... 65
3-8 Normalized computational time vs. the number of source and destination nodes ... 65
4-1 A DAG consisting of 17 nodes, representing dependencies among 17 tasks of an application. For example, the arc from task E to task B represents the fact that the output generated by task E is utilized by task B. ... 68
4-2 An example of a network resource graph ... 72
4-3 An example of a task graph ... 73
4-4 An example of an auxiliary graph ... 76
4-5 Single workflow scheduling problem formulation via network flow model ... 79

PAGE 9

4-6 Multiple workflow scheduling problem formulation via network flow model ... 82
4-7 Edge-path form of single workflow scheduling problem formulation ... 87
4-8 The Abilene network ... 92
4-9 Makespan vs. CCR for all algorithms in the Abilene network when the number of nodes in a workflow is 3. ... 93
4-10 Makespan vs. CCR and the number of nodes in a workflow for LPREdge and LS in the Abilene network ... 93
4-11 Computational time vs. CCR for all algorithms in the Abilene network when the number of nodes in a workflow is 3. ... 94
4-12 Computational time vs. the number of nodes in a workflow for LPREdge and LS in the Abilene network ... 94

PAGE 10

List of Algorithms
2-1 A heuristic for BAFS problem ... 46
3-1 Full-mesh AR construction ... 56
3-2 Star AR construction ... 58
3-3 Partitioned star AR construction ... 59
4-1 First step - Determination of the mapping of tasks except data transfers ... 85
4-2 Second step - Determination of the mapping of network resources ... 85
4-3 The adapted extended list scheduling algorithm ... 89
4-4 Data transfer finish time computation algorithm ... 90

PAGE 11

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
NETWORK RESOURCE PROVISIONING IN RESEARCH NETWORKS
By
Eun-Sung Jung
December 2010
Chair: Sanjay Ranka
Cochair: Sartaj Sahni
Major: Computer Engineering

Advances in optical communication and networking technologies, together with computing and storage technologies, are dramatically changing the ways scientific research is conducted. A new term, e-Science, has emerged to describe large-scale science carried out through distributed global collaborations enabled by networks, requiring access to very large scale data collections, computing resources, and high-performance visualization [12].

E-Science application workflows are complex and require schedulable, high-bandwidth connectivity with known future characteristics. Moreover, these workflows have performance requirements or metrics that have not been considered by conventional networking. For example, a large file transfer may need a guaranteed total turnaround time and rate of progress. Given the long duration of many requests, the available network resources may change before a request is completed.

We develop a novel framework for provisioning a variety of e-Science applications that require complex workflows spanning multiple domains. Our framework provides guarantees on performance while incurring minimal overhead, both necessary conditions for such a framework to be adopted in practice.

PAGE 12

CHAPTER 1
INTRODUCTION
1.1 Overview
Advances in optical communication and networking technologies, together with computing and storage technologies, are dramatically changing the ways scientific research is conducted. A new term, e-Science, has emerged to describe large-scale science carried out through distributed global collaborations enabled by networks, requiring access to very large scale data collections, computing resources, and high-performance visualization [12]. Frequently cited e-Science (and the related grid computing [47]) examples include high-energy nuclear physics [33], radio astronomy [19], geoscience [3] and climate studies [13]. To support e-Science activities, a new generation of high-speed research and education networks has been developed. These include Internet2 [17], the Department of Energy's ESnet [14], National LambdaRail [21], CA*net4 [9] in Canada, and the pan-European GEANT2 [5]. A large portion of all data traffic supporting U.S. science is carried by ESnet, Internet2, and National LambdaRail [55].

E-Science activities often need to transport large volumes of data at a very high rate among a large number of collaborating sites [33, 78], severely stressing network resources. For instance, high-energy physics (HEP) data is expected to grow from the current petabytes (PB) (10^15) to exabytes (10^18) by 2015 [33]. Beyond the obvious need for large amounts of data to be transferred, e-Science requirements for network use are significantly different from those of traditional network applications [7, 46, 55] in the following ways:

1. Need to support schedulable, long-duration workflows with performance guarantee: The underlying applications require schedulable, high-bandwidth, low-latency connectivity with known future characteristics or performance guarantees [41], for real-time remote visualization, interactions with instruments, distributed simulation or data analysis, etc. In a distributed workflow system that involves many entities such as distant parties, scientific instruments, computation devices, as well as complex feedback in various stages of the workflow, unintended delay due to

PAGE 13

lack of planning for future communication paths can ripple through the entire workflow environment, slowing down other participating systems as they wait for intermediate results, thus reducing the overall effectiveness [55].
2. Need to support a large number of network services with novel performance metrics: There are many different types of sciences and scientific activities, which require different types of network services tailored to the specific science activities. (Also see [1, 2, 7, 46, 55].) Moreover, many of the e-Science activities have performance requirements or metrics that have not been considered by conventional networking. Large file transfer may only be concerned with total turnaround time and the rate of progress; streaming consumer-producer type of jobs running at two different sites may require a minimum and maximum data transfer rate; fusion experiments may care about lowering the probability of failure in the experiments due to inadequate network services.
3. Need to support dynamic user and resource environment with high network efficiency: Given that each job can be a heavy hitter in terms of network resource consumption, the network must handle with great efficiency the dynamic arrivals of service requests, the changes in requirements, traffic pattern and access policies at different stages of experiments or collaboration. The efficiency requirement is especially important for the new-generation, high-speed, coarse-granular networks, such as the wavelength-based systems. In addition, given the long duration of many jobs, the resources available to a particular job or the network topology may change before the job is completed. The network services must be able to adapt to the resource changes by incorporating newly added resources (e.g., links or wavelengths) or falling back to alternative resources when the assigned resources are no longer available.
In short, e-Science activities need schedulable, high-bandwidth, flexible and evolving network services with novel performance guarantees, and the network needs to provide these services efficiently. There is a large body of research on how to provide quality-of-service (QoS) guarantees (e.g., IntServ [31], DiffServ [29], the ATM network [71], or MPLS [86]) for Internet-type networks. Those proposals do not consider advance reservations with start and end times. Bulk transfer is usually regarded as low-priority best-effort traffic, not subject to admission control (AC). AC and scheduling are decoupled from routing in that each connection has a single default path separately determined by a routing protocol; the routing protocol is usually oblivious of the jobs in the system. AC is myopic in that each network element on the path determines whether the connection can be accepted based on a comparison of the remaining link capacity

PAGE 14

at the node itself with the requested resource of the job alone. Once admitted, the path and resource allocation remain fixed throughout the lifetime of the connection.

To meet the needs of e-Science, we propose a framework for conducting advance reservations, admission control (AC) and scheduling of network service requests in research networks that (i) supports an evolving large array of network services required by or useful for e-Science collaboration; (ii) guarantees performance levels that are based on metrics relevant to the underlying applications; (iii) adapts to underlying changes in network topology, resources and user dynamics; and (iv) provides efficient utilization of the underlying resources.

1.2 Target Networks and Services
Recent network signaling protocols, such as Multiprotocol Label Switching (MPLS) and Generalized MPLS (GMPLS), allow applications to overcome deficiencies prevalent in existing routed TCP/IP protocols (e.g., the inability to guarantee bandwidth or offer Quality of Service). Many high-bandwidth network projects are currently deploying these protocols in the research and academic domain. This is the case, for example, in Internet2's HOPI testbed [16], the NSF-supported Ultralight [8], Teragrid [23], CHEETAH [6] and DRAGON [63] networks and the DOE-supported UltraScienceNet (USN) [24], ESnet [14] and LHCNet [20]. We expect these protocols to proliferate into the production and commercial network domain. As the first evidence, both Internet2 and the DOE's ESnet have chosen to offer dedicated bandwidth capability and lightpaths using GMPLS and MPLS control plane techniques, developed in the OSCARS/BRUW projects [10]. The techniques provide a framework to automate the provisioning process for bandwidth and make it easier for users to access Service Oriented Bandwidth Management (SOBM) functions, compared to the current provisioning and bandwidth management practices, which are manual and labor-intensive.

Past and current projects on research networks have focused on addressing the following challenges [6, 10, 16, 63]: 1) Set up the high-speed data plane by a hybrid of

PAGE 15

IP packet-switching and optical circuit-switching technologies with a large footprint and sufficient connectivity by connecting the national labs and universities and peering with other networks, 2) Develop support for end-to-end high-speed circuits statically or on demand, which requires multi-domain interoperability, 3) Set up the basic control plane and develop signaling and control middleware for handling user requests and basic network resource reservation, 4) Develop end-to-end transport protocols for supporting high-speed channels and large volumes of data, 5) Ensure security by encryption, authentication, authorization (AAA), and 6) Ensure reliability.

Bulk transfer. Being able to transfer very large files is a priority in nearly all e-Sciences [1, 2, 7, 46, 55]. If the turnaround time is the performance metric that the user cares about, there is a great deal of flexibility in how the transfer can be carried out. For instance, the transfer of a 100 GB file can be completed in 8 seconds using ten 10 Gbps lightpaths (Internet2 links), or in 1 hour and 26 minutes using a 155 Mbps (OC-3) long-lasting SONET circuit. The transfer choice not only affects the job in question, but also other current or future jobs in complex ways. For large transfers with start and end time constraints, peak bandwidth assignment can lead to an undesirable phenomenon known as fragmentation [77], which in turn leads to low utilization of network resources. This occurs when some time intervals are lightly loaded but not long enough to accommodate new large jobs. Greater transfer flexibility is needed to combat this problem, such as time-varying bandwidth assignment and dynamic re-assignment.

Streaming workflow. For the nanomaterial sciences conducted at DOE's Center for Functional Nanomaterials, research often involves distributed collaboration among smaller research centers with different scientific instruments and capabilities [2]. Data are generally collected from several centers and/or are compared against each other. Then, a medium sized cluster of computers processes and analyzes the data. The visualization is done by special workstations equipped with large memory graphics cards to handle the large images and volumes of data from the output of the data processing.
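The turnaround figures quoted in the bulk-transfer paragraph above (8 seconds versus 1 hour and 26 minutes for a 100 GB file) can be checked with a short calculation. The following is only a minimal sketch: the function name `transfer_time_seconds` is hypothetical, and it assumes decimal units (1 GB = 8 Gbit), the nominal line rates quoted in the text, and no protocol or framing overhead.

```python
def transfer_time_seconds(size_gigabytes: float, rate_gbps: float) -> float:
    """Idealized transfer time: size in decimal GB, rate in Gbit/s.

    Assumption (not from the source): no protocol overhead and a
    constant rate for the whole transfer.
    """
    size_gigabits = size_gigabytes * 8  # 1 byte = 8 bits
    return size_gigabits / rate_gbps


# Ten 10 Gbps lightpaths used in parallel give an aggregate of 100 Gbps.
lightpath_seconds = transfer_time_seconds(100, 10 * 10)

# A single 155 Mbps (OC-3) circuit, i.e. 0.155 Gbps nominal rate.
oc3_seconds = transfer_time_seconds(100, 0.155)

hours, remainder = divmod(oc3_seconds, 3600)
print(f"ten lightpaths: {lightpath_seconds:.0f} s")
print(f"one OC-3 circuit: {int(hours)} h {int(remainder // 60)} min")
```

Under these assumptions the two schedules differ by a factor of roughly 645 in turnaround time, which illustrates how much freedom a scheduler has when only the deadline matters.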

PAGE 16

The generated animation is then streamed to remote scientists' desktops, or in the case where the visualization is in stereo, to a 3D theater. The network requirements vary at each stage of the workflow.

Data-intensive workflow. Large-scale supercomputing is expected to produce data at a similar rate to large-scale experiments. In order to post-process the computed results, high throughput transfers are often required to stage the data at the related computational resources. Similarly, high-end scientific computing also processes large amounts of input data that, from a performance perspective, should be accessible as efficiently as possible. Local parallel file systems are well suited for supporting the demanded I/O capabilities, even when data has to be staged to the respective file systems. Community schedulers need to control multiple distributed computational resources in order to serve individual workflows.

1.3 Problems Addressed and Our Contributions
E-Science networks usually provide QoS guarantees, i.e., bandwidth guarantees through multiprotocol label switching (MPLS) and generalized multiprotocol label switching (GMPLS), to meet the requirements of e-Science applications, e.g., in-advance path reservations for high-volume data transfers. The distinctive features of e-Science applications compared with other distributed applications can be summarized in two keywords: network-centric and in-advance. Unlike other grid computing applications, scheduling of e-Science applications puts more focus on network resources, or considers the network resource the most important among multiple resources such as compute and storage resources. Moreover, in-advance scheduling of e-Science applications satisfies the needs of users requesting periodic or predictable services.

1.3.1 Bandwidth Allocation for Iterative Data-dependent Applications
We present a framework for bandwidth scheduling of streaming e-Science applications. These applications include interactive visualization of simulations, large data streaming coordinated with job execution for producer-consumer applications,

PAGE 17

and networked supercomputing [46]. We have adapted the Synchronous Dataflow (SDF) model to model and analyze iterative data-dependent applications in e-Science. Synchronous dataflow was proposed in the late 1980s as a modeling method for digital signal processing (DSP) applications, but it ignores communication delays. Our model incorporates the communication delays that are inherent in large-scale distributed applications. We have formulated the bandwidth allocation problem of iterative data-dependent e-Science applications with temporal constraints as a multi-commodity linear programming problem. It incorporates optimal rates and buffer minimization for streaming applications that can be represented by an SDFG. Our algorithms determine how much bandwidth is allocated to each edge while satisfying temporal constraints on collaborative tasks. Using the solution of the bandwidth allocation problem, buffer requirements for the schedule are obtained using procedures similar to the ones presented in [50]. To the best of our knowledge, this represents the first attempt to analyze the temporal behavior of collaboratively iterative tasks and to determine the optimal bandwidth allocations among distributed nodes.

1.3.2 Topology Aggregation for E-Science Networks
The network supporting e-Science applications typically comprises multiple domains. Each domain usually belongs to a different organization and is managed based on different operational policies. In such cases, internal topologies of domains may not be visible to the others for security or other reasons. However, aggregated information of internal topology and associated attributes is advertised to the other domains.

A set of techniques to aggregate data to advertise outside one domain is called Topology Aggregation (TA). The aggregated data itself is termed the Aggregated Representation (AR). A survey of TA algorithms is presented in [98]. There exists a tradeoff between the accuracy and the size of the AR. Hence, most algorithms proposed in

PAGE 18

the previous work tried to achieve the most efficient AR in terms of both accuracy and space complexity.

One can classify QoS path requests into two classes: single-path single-job (SPSJ) and multiple-path multiple-job (MPMJ), depending on the nature of requests. SPSJ corresponds to a situation in which requests for a single QoS path arrive and are scheduled in the order of arrival. In contrast, MPMJ corresponds to batch/off-line scheduling of multiple requests for multiple QoS paths. Many e-Science applications require simultaneous transfer of data from multiple sources and destinations. Also, each of these requests (e.g., file transfers) can be more efficiently supported by using concurrent multiple paths.

We show that existing TA approaches developed for SPSJ do not work well with MPMJ applications as they overestimate the amount of bandwidth that is available. We propose a maxflow-based TA approach that is suitable for this purpose. Our simulation results demonstrate that our algorithms result in better accuracy or less scheduling time.

1.3.3 Workflow Scheduling
Workflow/Directed Acyclic Graph (DAG) scheduling has been shown to be NP-hard [91]. A number of practical heuristics have been developed for this problem. Most of these ignore the communication costs [26, 39] or assume a very simple interconnection network model, i.e., a fully-connected network model without contention [59, 60, 79, 97, 99]. The work in [97] proposed the heterogeneous-earliest-finish-time (HEFT) algorithm extended from the classic list scheduling for heterogeneous computing resources.

However, the advances in computing platforms ranging from clusters to grids and emerging clouds for data intensive applications have posed new challenges where network contention is an important issue that needs to be addressed. We propose to address this issue by formulating and solving the overall workflow scheduling that incorporates network contention and overheads of the large scale data transfers. In

PAGE 19

particular, we address the following issues for e-Science grids whose networks are a mix of IP networks and optical networks:
- Malleable resource allocation.
- Dynamic multipath scheduling.
- Multiple workflows.

We have formulated workflow scheduling problems in e-Science networks, whose goal is minimizing either makespan or network resource consumption by jointly scheduling heterogeneous resources such as compute and network resources. The formulations differ from previous work in the literature in that they allow dynamic multiple paths for data transfer between tasks and more flexible resource allocation that may vary over time. Moreover, our work is the first to formulate the workflow scheduling problem incorporating multiple paths as a mixed integer linear program (MILP). We also formulate a linear programming relaxation, LPR, of our MILP, an edge-path based LP relaxation, LPREdge, and a list scheduling heuristic, LS.

The experimental results show that the makespan of LPR schedules is much closer to optimal than that of LS schedules when the communication-to-compute ratio (CCR) is large. The LS algorithm performs roughly similarly to the LPR algorithm when CCR = 0.1 and 1.0, but the performance gap between these non-optimal algorithms grows dramatically as CCR grows from 1 to 10. Our results indicate that data-intensive workflow scheduling, which is common in e-Science applications, will benefit from dynamic multiple paths and malleable resource allocation.

1.4 Background and Related Work
Ongoing research projects for supporting e-Science applications (e.g., HOPI [16], Ultralight [8], Teragrid [23], CHEETAH [6], DRAGON [63], ESnet [14], OSCARS/BRUW [10]) have mainly focused on setting up a fast data plane with a large footprint and sufficient connectivity and setting up a basic but functional control plane, such as developing signaling and control middleware for handling user requests for elementary

PAGE 20

network services, ensuring security and improving reliability. However, the control plane mechanisms lack sophisticated network service support or efficient service reservation algorithms. They normally only support fixed bandwidth guarantees by reserving circuits or lightpaths. Using such a restricted set of services or simplistic resource management algorithms to support diverse e-Science activities can lead to inefficient utilization of the network resources (especially for the new-generation, high-speed, coarse-granular networks, such as the wavelength-based systems) and/or fail to provide the level of performance required by those activities in desired but varied performance metrics.

Compared with the traditional QoS frameworks, such as IntServ [31], DiffServ [29], ATM networks [71], or MPLS [86], admission control and scheduling for research networks are recent concerns with limited published work. Prior work is about dedicated path reservation, bulk data transfers, or jobs that require a minimum bandwidth guarantee (MBG). None has considered as rich a class of job types as we do.

Control plane protocols, architectures and tools. The NSF-supported DRAGON [63, 106] project develops control plane architecture and middleware for multi-domain traffic engineering and resource allocation, e.g., using GMPLS protocols [43] for setting up SONET circuits or lightpaths. It uses a centralized resource computation element per domain, which is responsible for computing paths. It supports advance reservations of label switched paths (LSP) on requested time periods. CHEETAH [6] is a similar project to DRAGON but is more traditional in that it focuses on simpler, distributed operations for path computation and bandwidth management to support high arrival rates of immediate connection requests. OSCARS [10] is the control plane project for DOE's ESnet, also similar to DRAGON. It develops and deploys a prototype service that enables on-demand provisioning of guaranteed bandwidth circuits for ESnet. HOPI [16] is a testbed project on research networks that examines how to provide network services in a hybrid network of shared IP packet switching and dynamically provisioned lightpaths.


[52] presents an architecture for advance reservation of intradomain and interdomain lightpaths. GARA [48], the reservation and allocation architecture for the grid computing toolkit Globus [15], supports advance reservation of network and computing resources. [40] adapts GARA to support advance reservation of lightpaths, MPLS paths and DiffServ paths. Other related work in this category includes GridJIT [96], ODIN [54], [30] and [34]. Much of the objective, architecture framework and capabilities of the proposed project coincide with the NSF's GENI project [22], for instance, the use of network controllers and the support of network virtualization. Most of the above control-plane architectures and tools provide rudimentary admission control (AC) and scheduling algorithms for simple job types. However, much more can be done to support more service types or improve the network resource utilization.

Path reservation. The ability to provide dedicated or on-demand circuits or lightpaths is currently the focus of many projects, including most aforementioned major research networks and associated projects, e.g., Internet2, ESnet, National LambdaRail, GEANT2, UltraScienceNet (USN), HOPI, DRAGON, CHEETAH and OSCARS/BRUW. Further examples include User Controlled LightPaths (UCLP) [25], Enlightened [4], Japanese Gigabit Network II [18], LHCNet [20], and Bandwidth Brokers [109]. In our previous research work, we have proposed novel algorithms for advance path computation and bandwidth scheduling for connection-oriented networks [87] that have considerably better performance [57]. In [56], we have extended these algorithms to incorporate the wavelength sharing and wavelength continuity constraints.

MBG service. Several earlier studies [32, 36, 89, 104] have considered AC at an individual link for the MBG (minimum bandwidth guarantee) job type with start and end times. The concern is typically about designing efficient data structures, such as a segment tree [32], for recording and querying the link bandwidth usage on different time intervals. Admission of a new job is based on the availability of the requested bandwidth between its start time and end time. [35, 44, 51, 100] and [36] tackle the more general


path-finding problem for the MBG class, but typically only for new requests, one at a time. The routes and bandwidth of existing jobs are unchanged. [64] considers a network with known routing in which each admitted job derives a profit. It gives approximation algorithms for admitting a subset of the jobs so as to maximize the total profit.

Bulk transfer. Recent papers on AC and scheduling algorithms for bulk transfer with advance reservations include [35, 37, 51, 70, 73, 75, 77, 82]. In [77], the AC and scheduling problem is considered only for the single link case. Network-level AC and scheduling are considered to be outside the scope of [77]. As a result, multi-path routing and network-level bandwidth allocation and re-allocation have no counterpart in [77]. In contrast, we periodically re-optimize the bandwidth assignment for all the new and old jobs.

For a one-time scheduling problem, our recent work [82] conducts a detailed performance comparison between single-slice scheduling and multi-slice scheduling under various slice sizes, and between single-path routing and multi-path routing. We conclude that a small number of paths per job is usually sufficient to yield near-optimal throughput; multi-slice scheduling leads to significant performance (e.g., throughput) improvement. Other authors have also considered a similar problem but with different emphasis [37].

In [73, 75], the authors consider single-link AC or link-by-link AC under single-path routing. The AC uses heuristic algorithms instead of solutions to optimization problems. Based on its size and the deadline, the average required bandwidth of a bulk transfer job is computed. The AC is based on the job's average bandwidth requirement. The bandwidth of existing jobs may be re-allocated only for the single-link case.

The authors of [35] propose a malleable reservation scheme for bulk transfer, which checks every possible interval between the requested start and end times for the job and tries to find a path that can accommodate the entire job on that interval. The scheme


favors intervals with earlier deadlines. In [51], the computational complexity of a related path-finding problem is studied and an approximation algorithm is suggested. [70] starts with an advance reservation problem for bulk transfer, but converts it into a constant bandwidth allocation problem to maximize the job acceptance rate. All the requests are known at the time of AC; AC/scheduling is carried out only once. The bandwidth constraints are at the ingress and egress links only, and hence, there is no routing issue.

Grid/Utility/Cloud Computing. Network resource provisioning problems in e-Science networks share some design goals, such as the earliest finish time of a job, with resource management problems in grid/utility/cloud computing. However, the network resource provisioning problems in e-Science networks are different from those problems in grid/utility/cloud computing in that network resources, i.e., the bandwidth of links, are assumed to be guaranteed and manageable by emerging technologies such as MPLS and GMPLS. Such QoS-guaranteeing infrastructures for e-Science applications originate from the fact that common e-Science applications transport large volumes of data at very high rates. This difference opens a research area toward more elegant management of network resources, which can improve system performance.

Optical networks. E-Science networks are a mix of IP networks and optical networks. In optical networks, the bandwidth along a given link can be decomposed into multiple wavelengths. For such reasons, optical networks have the following constraints.

Wavelength continuity constraint: This constraint forces a single lightpath to occupy the same wavelength throughout all the links that it spans. This constraint is not required when an optical network is equipped with wavelength converters. When such converters are present, the network is called a wavelength convertible network.

Wavelength sharing constraint: For many deployments, it is most effective to consider the bandwidth on a link as consisting of integer multiples of wavelength and a single wavelength as a unit for assignment, i.e., one wavelength is occupied by only one reservation at a certain point of time. It is worth noting that techniques


based on Time Division Multiplexing (TDM)/Wavelength Division Multiplexing (WDM) [110] allow for decomposing the bandwidth on a wavelength.

The related issues in the research area of optical networks are: the routing and wavelength assignment (RWA) problem, the virtual topology (VT) problem, the traffic grooming (TR) problem, and the task scheduling and lightpath establishment (TSLE) problem.

1.5 Outline of Dissertation

The remainder of this dissertation is organized as follows.

Chapter 2 describes an SDF-based model for iterative data-dependent e-Science applications that incorporates variable communication delays and temporal constraints, such as throughput. We formulate the problem as a variation of multi-commodity linear programming with an objective of minimizing network resource consumption while meeting temporal constraints. The resulting solution can then be used to derive buffer space requirements by previously developed algorithms in the context of DSP applications. Finally, an illustrative example of an e-Science application shows that the framework and algorithm we propose are valid to model and analyze iterative data-dependent e-Science applications. The simulation results show that the optimal bandwidth allocation by the formulated linear program outperforms the bandwidth allocation by a simple heuristic in terms of rejection ratio of requests.

Chapter 3 describes topology aggregation (TA) algorithms for e-Science networks. E-Science applications require higher quality intradomain and interdomain QoS paths, and some of those are distinguished from classic single-path single-job (SPSJ) applications. We define a new class of requests, called multiple-path multiple-job (MPMJ), and propose TA algorithms for the new class of applications. The proposed algorithms, star and partitioned star ARs, are shown to be significantly better than naive approaches.

Chapter 4 describes efficient algorithms for workflow scheduling problems in e-Science networks, whose goal is minimizing either makespan or network resource


consumption by jointly scheduling heterogeneous resources such as compute and network resources. Our algorithms are different from previous work in the literature in the sense that they allow dynamic multiple paths for data transfer between tasks and more flexible resource allocation that may vary over time. In addition, it is advantageous that the formulation for single workflow scheduling can be easily extended to the formulation for multiple workflow scheduling.


CHAPTER 2
BANDWIDTH ALLOCATION FOR ITERATIVE DATA-DEPENDENT E-SCIENCE APPLICATIONS

2.1 Overview

E-Science activities often require the transport of large volumes of data at very high rates among a large number of collaborating sites [33, 78], severely stressing network resources. For instance, the high-energy physics (HEP) data are expected to grow from the current petabytes (10^15) to exabytes (10^18) by 2015 [33]. Beyond the obvious need for large amounts of data to be transferred, e-Science requirements for network use are significantly different from the traditional network applications [7, 46, 55]. The underlying applications require schedulable, high-bandwidth, low-latency connectivity with known future characteristics or performance guarantees [41] for real-time remote visualization, interactions with instruments, distributed simulation or data analysis, and so on. In a distributed workflow system that involves many entities, such as distant parties, scientific instruments, computation devices, as well as complex feedback in various stages of the workflow, unintended delays due to a lack of planning for future communication paths can ripple through the entire workflow environment, slowing down other participating systems as they wait for intermediate results, thus reducing the overall effectiveness [55].

The focus of this chapter is on supporting e-Science applications that require streaming of information between sites. We present a framework for bandwidth scheduling of streaming e-Science applications. These applications include interactive visualization of simulations, large data streaming coordinated with job execution for producer-consumer applications, and networked supercomputing [46]. The main contributions are as follows:

1. We have adapted the Synchronous Dataflow (SDF) model to model and analyze iterative data-dependent applications in e-Science. Synchronous dataflow was proposed in the late 1980s as a modeling method for digital signal processing (DSP)


applications, but it ignores the communication delays. Our model incorporates the communication delays that are inherent in large-scale distributed applications.

2. We have formulated the bandwidth allocation problem of iterative data-dependent e-Science applications with temporal constraints as a multi-commodity linear programming problem. It incorporates optimal rates and buffer minimization for streaming applications that can be represented by an SDF graph (SDFG). Our algorithms determine how much bandwidth is allocated to each edge while satisfying temporal constraints on collaborative tasks. Using the solution of the bandwidth allocation problem, buffer requirements for the schedule are obtained using procedures similar to the ones presented in [50].

To the best of our knowledge, this represents the first attempt to analyze the temporal behavior of collaboratively iterative tasks and to determine the optimal bandwidth allocations among distributed nodes.

Ongoing research projects for supporting e-Science applications (e.g., HOPI [16], Ultralight [8], Teragrid [23], CHEETAH [6], DRAGON [63], ESnet [14], and OSCARS/BRUW [10]) have mainly focused on setting up a fast data plane with a large footprint and sufficient connectivity and setting up a basic but functional control plane, such as developing signaling and control middleware for handling user requests for elementary network services, ensuring security and improving reliability. However, the control plane mechanisms lack sophisticated network service support or efficient service reservation algorithms. They normally only support fixed bandwidth guarantee by reserving circuits or lightpaths. Using such a restricted set of services or simplistic resource management algorithms to support diverse e-Science activities can lead to inefficient utilization of the network resources (especially for the new-generation, high-speed, coarse-granular networks, such as the wavelength-based systems) and/or not provide the level of performance required by those activities in desired but varied performance metrics.
The rest of the chapter is organized as follows. We provide a detailed description of SDF and its operational semantics and examine its applicability to e-Science applications in Section 2.2. We present an overall process of problem-solving, including


a mathematical formulation as a linear program and a discussion of the detailed deployment of the obtained solution in real systems in Section 2.3. We show that our approach outperforms a naive heuristic, also given by us, in Section 2.4. Lastly, we conclude with a summary and discussion of the practicality of our work in Section 2.5.

2.2 Synchronous Dataflow for E-Science Applications

The SDF model of computation was first proposed by Lee in [62]. The SDF model has been found to be very useful for expressing DSP applications that have the following features: infinitely looping execution, discretized communication expressed by tokens, and parallelism to be exploited for maximizing throughput. Most of the existing research for these problems is limited to deriving maximal rates and buffer minimization.

An SDF graph (SDFG) is a directed graph defined by $G = (V, E, I, O, \tau, D)$, where $V$ and $E$ represent a set of nodes and a set of edges, respectively. Each node in an SDFG is called an actor and each edge is called a communication channel or channel. The notation is based on its earlier use in DSP applications comprising function blocks and the communication channels interconnecting them. An actor repeats its task infinitely, and each execution of its task is called a firing. In this chapter, we use the terms node and actor, and the terms edge and channel, interchangeably. An actor can produce and consume data per channel at different rates, which are specified by the number of tokens.

The number of tokens is a positive integer. If multiple inputs and outputs are associated with an actor, it is assumed that the actor waits until all input buffers have their tokens ready to be consumed and all output buffers are available. A homogeneous SDFG, where at most one token can be produced or consumed, is a special case of SDFG.

The number of tokens that actors produce and consume is specified by the sets $I$ and $O$. $I$ is a set of numbers of tokens consumed by destination actors of edges, and $O$ is a set of numbers of tokens produced by source actors of edges. Thus, each edge $(u, v)$


Figure 2-1. An example of SDFG

is associated with two integer values, $I_{uv}$ and $O_{uv}$. Consider the sample SDFG shown in Figure 2-1 (a). The edge $(u, v)$ has two associated integer values, $I_{uv}$ and $O_{uv}$, which are 1 and 2, respectively. This represents the fact that actor $u$ produces 2 tokens at each firing and actor $v$ consumes 1 token at each firing. In addition, $\tau$ is a set of execution times of actors, and the execution time of each actor's firing is denoted by $\tau_i$. Finally, a set $D$ represents the initial numbers of tokens on edges, which are necessary for the start of iterative operations of an SDFG.

Using the known properties of a homogeneous SDFG allows us to derive the maximal computation rates as well as buffer requirements. Also, it can be shown that any arbitrary SDFG can be converted into a homogeneous SDFG, although this conversion may increase the size of the network exponentially.

To adapt the SDFG model for e-Science applications, it is important to understand the key differences between e-Science and DSP applications. A summary of differences between DSP and e-Science applications is provided in Table 2-1. Unlike DSP applications, e-Science applications can be represented by acyclic graphs, have fixed start and end times, and have communication delays to be considered. The time unit of DSP applications is on the order of a few milliseconds, compared to the time unit of e-Science applications that may be from a few hours to several days. Throughput is the most important objective in both DSP and e-Science applications. However, for DSP


Table 2-1. Comparison between DSP and e-Science applications

Category | DSP application | e-Science application
Inter-task dependency | Cycles are allowed. | Usually acyclic.
Execution period | Infinite. | Finite.
Time unit | Small (a few milliseconds). | Ranges from small to large (a few minutes).
Compute resource | Unlimited. | Unlimited or limited if compute resource should be co-allocated.
Communication delay | Assumed to be 0. | Needs to be considered.
Temporal constraints | Objective is maximizing computation rate (throughput). | Throughput.
Schedule | Static or dynamic. | Static.

applications, tradeoffs are between throughput and buffer size, while for e-Science, the tradeoff is generally between throughput and network resource requirements. The focus of our work is on optimizing these resources.

Lee [61] divided scheduling of parallel computation defined by SDFG into four classes: fully dynamic, static assignment, self-timed, and fully static. Fully dynamic scheduling schedules actors at run-time only. In static assignment, assignment of actors to processors is done off-line and a local run-time scheduler of each processor invokes actors assigned to the processor. In self-timed scheduling, the assignment and ordering of actors on each processor is determined off-line and the exact firing time is scheduled at run-time. In other words, the actor that will be executed by a certain processor waits for all input data to be available and is fired once all input data are ready. Finally, fully static scheduling determines all information off-line. Based on this classification, the target e-Science applications can be considered to be self-timed. A node of SDFGs for e-Science applications represents one site, such as a data server or a computing node. This implies that every actor is assigned to a unique processor that only manages that task.


As described earlier, an SDFG is represented by $G = (V, E, I, O, \tau, D)$. Since actors can produce or consume tokens at different rates, a feasible schedule should guarantee that tokens are not infinitely accumulated. In Figure 2-1 (a), actor $u$ produces 2 tokens at each firing, while actor $v$ consumes 1 token. To prevent infinite buffer overflow, actor $u$ should be fired once for every two firings of actor $v$. Formally, this can be stated by the equation $r_u \cdot 2 = r_v \cdot 1$, where $r_u$ and $r_v$ denote firing rates of actor $u$ and $v$, respectively. These kinds of equations are called balance equations or state equations. To solve balance equations formally, we need to define a topology matrix, where $e_i$ denotes the $i$th edge and $O_{e_i}$ and $I_{e_i}$ denote the number of produced tokens and consumed tokens, respectively, on an edge $e_i$.

Definition 1 (Topology matrix). A topology matrix $\Gamma$ is a $|E| \times |V|$ matrix.

\[
\Gamma_{ij} =
\begin{cases}
O_{e_i} & \text{if an edge } e_i = (v_j, v_k), \\
-I_{e_i} & \text{if an edge } e_i = (v_k, v_j), \\
O_{e_i} - I_{e_i} & \text{if an edge } e_i = (v_j, v_j), \\
0 & \text{otherwise.}
\end{cases}
\]

The topology matrix for Figure 2-1 (a) is $\Gamma = \begin{pmatrix} 2 & -1 \end{pmatrix}$. The existence of a solution, as well as a method to solve the balance equations, can be shown using the following theorem.

Theorem 2.1 ([62]). A connected SDF graph with $n$ actors has a periodic schedule if and only if its topology matrix $\Gamma$ has rank $n - 1$. Further, if its topology matrix has rank $n - 1$, then there exists a unique smallest integer solution $q$ to the balance equations $\Gamma q = 0$. It can be shown that the entries in the vector $q$ are coprime.

Given rates of actors obtained by Theorem 2.1, $\{r_1, r_2, \ldots, r_n\}$, one iteration is defined as a schedule containing $r_i$ firings of actor $i$. Figure 2-1 (b) shows the optimal


schedule for the SDFG in Figure 2-1 (a) when both actor $u$ and $v$ have self-dependency loops and the execution times of actors are all 1.

Theorem 2.2 ([84]). For a homogeneous SDFG represented by $G = (V, E, I, O, \tau, D)$, the maximal computation rate of every node in the graph is given by

\[
\min_{\forall C} \frac{\sum_{(i,j) \in C} d_{ij}}{\sum_{i \in C} \tau_i},
\]

where $C$ is any cycle in the graph, $d_{ij}$ is the initial number of tokens on edge $(i, j)$, and $\tau_i$ is the execution time of actor $i$.

Regardless of the SDFG type, i.e., homogeneous or multi-rate, the computation rate of an SDFG is defined as the number of iterations per unit time. The maximal computation rate of a homogeneous SDFG can be derived by examining all cycles in the graph. Theorem 2.2 says the maximal computation rate of a homogeneous SDFG is bounded by the cycle with the minimum ratio of initial tokens to execution time. For a homogeneous SDFG, the maximal computation rate of an iteration equals the maximal computation rate of a node, since the number of firings of a node in one iteration is 1. For a multi-rate SDFG, however, we can compute the maximal computation rate of a node in two steps. First, we can compute the maximal computation rate of an iteration after converting the multi-rate SDFG into a homogeneous SDFG. Figure 2-2 shows the homogeneous SDFG converted from the multi-rate SDFG in Figure 2-1 when putting a self-dependency loop on each node. A node $u$ with a rate $r_u$ in a multi-rate SDFG will be expanded to $r_u$ nodes in the homogeneous SDFG converted from the multi-rate SDFG [49]. Hence, the maximal computation rate of an iteration with regard to Figure 2-2 is $\frac{1}{2}$ through the equation $\min\{\frac{1}{1 \cdot r_u}, \frac{1}{1 \cdot r_v}\} = \min\{1, \frac{1}{2}\}$.

Next, we can compute the maximal rate of each node by multiplying the maximal rate of an iteration by the rate of the node. In this example, the maximal rates of node $u$ and $v$


Figure 2-2. A homogeneous SDFG converted from Figure 2-1 (a)

are $\frac{1}{2}$ ($= \frac{1}{2} \cdot r_u$) and 1 ($= \frac{1}{2} \cdot r_v$), respectively. In this chapter, we call the number of firings of a node per unit time the throughput of the node.

2.3 Problem Formulation

In this section, we propose an algorithm for determining efficient bandwidth allocations to edges of the original network topology graph while satisfying temporal constraints, such as throughput, required by an e-Science application whose data dependency is given by an SDFG. In addition, with these bandwidth allocations, we can minimize buffer size requirements and find the corresponding schedules.

The overall process of finding the full-fledged solution for an e-Science application is summarized as follows.

1. Discretization step: In this step, both time and data size are given discretized values: execution time, data transmission time, and data transfer size. Discretization is an important requirement for using the SDFG model. For target applications, a base unit for execution and communication times can be chosen and appropriate rounding can be performed. A base time unit should be fine-grained enough to differentiate each actor's execution time and temporal constraints. The base unit can be a few seconds to several hours, depending on the application.

2. Firing rates of actors: Using Theorem 2.1, firing rates of actors guaranteeing a well-behaved SDFG, i.e., free of deadlock and infinite buffer accumulation, can be calculated. In MATLAB, the firing rates of actors can be obtained through a simple operation, null($\Gamma$, 'r'), where $\Gamma$ is a topology matrix for the SDFG and null is a MATLAB function returning a solution, $Z$, for $\Gamma Z = 0$.

3. Path bandwidth selection: The e-Science applications are distributed, and connection-oriented communication paths among distributed nodes are set up on demand or in advance. The bandwidth of paths is guaranteed by network technologies, such as multi-protocol label switching (MPLS) and generalized


multi-protocol label switching (GMPLS). The communication delay of a path to which bandwidth is allocated on request within available bandwidth is inversely proportional to the allocated bandwidth. Hence, path bandwidth allocation should also be taken into account since it can affect throughput as well as the total network resource consumption. A formal problem formulation will be presented in Section 2.3.2.2.

4. Buffer space minimization: The amount of buffer space requirements is the total number of tokens queued on every edge. Clearly, different schedules can lead to different buffer space requirements. The following buffer minimization problem is shown to be NP-complete [76]: Given a homogeneous SDFG, is there a valid schedule for the SDFG of which buffer space requirements are less than a constant K? It is easier, however, to find the minimum buffer space when the computation rate is fixed, even though the problem is still NP-complete. Using an approach similar to Govindarajan et al. [50] but adapted to e-Science applications, we use a two-phase approach: we first find the optimal solution for the bandwidth allocation problem, then use this solution to minimize the buffer requirements. After the solution to the BAFS problem (described in the next section) is obtained, we have to find exact schedules and minimize buffer space requirements. Since we choose a model where the communication delays are included in the execution time of actors, the previously developed algorithms for buffer space requirement minimization can be directly applied. The buffer space requirement minimization problem has been solved in the context of DSP applications in many papers [50, 94, 103].

5. Adjust for deployment in a real system: The implementation of the derived solution requires a few considerations. The generated solution consists of discretized values in terms of the properly chosen base time and data unit. As long as we can ensure that the discretized problem has stricter constraints than the original problem, such as higher production rates and lower consumption rates, the resulting solution should be feasible. Additionally, in the absence of a global clock, synchronization issues need to be considered to force firings of tasks to follow the computed schedule. Self-timed scheduling may not achieve the maximal rate without the global clock if buffer space is limited and not properly synchronized with actors' schedules. However, for reasonable buffer sizes, the deterioration of the maximal rate will be small.

2.3.1 Illustrative Example

We pick the visualization application in [53] as a representative example of e-Science applications that can be modeled by an extended synchronous dataflow


Figure 2-3. A real example of e-Science applications [53]

graph (ESDFG). The visualization application shown in Figure 2-3 (a) has a use-case scenario as follows.

For the demonstration in San Diego, CCT/LSU (Louisiana), CESNET/MU (Czech Republic) and iGrid/Calit2 (California) participated in a distributed collaborative session. The visualization front-end is located at LSU running Amira for the 3D texture-based volume rendering for distributed visualization. The visualization back-end (data server) also ran at LSU. The actual data set for the demonstration had a size of 120 Gbytes and contained 400^3 data points at each time step (4 bytes data/point for a 256 Mbyte/time step).

In this chapter, we assume a more general model, similar to the use-case in [46], extended from this application such that data servers reside at different sites from computing sites. This general model can be abstracted as the diagram in Figure 2-3 (b).

Figure 2-4. An ESDFG model for Figure 2-3

The system parameters of the visualization application are summarized in Table 2-2. If not explicitly mentioned, all the parameters are per one firing. The figures marked by bold type are parameters that are not explicitly given in [53], thus arbitrarily


Table 2-2. Summary of system parameters of the visualization application

Item | Parameter | Continuous value | Discretized value
Data centers | Production | 2560 Mbyte | 128
Data centers | Execution time | 1 second | 1
Computing site at LSU | Consumption | 256 Mbyte | 256
Computing site at LSU | Production | 1 frame (1 Mbyte) | 1
Computing site at LSU | Execution time | 100 ms | 2
Visualizing site at San Diego | Consumption | 1 frame (1 Mbyte) | 1
Visualizing site at San Diego | Execution time | 100 ms | 2
Visualizing site at San Diego | Throughput | At least 5 frames/sec | 0.25
Visualizing site at Brno | Consumption | 1 frame (1 Mbyte) | 1
Visualizing site at Brno | Execution time | 50 ms | 1
Visualizing site at Brno | Throughput | At least 5 frames/sec | 0.25

Base time unit: 50 ms. Base data unit: 1 Mbyte.

chosen by us within a reasonable range of the associated hardware's performance. The discretized values for the parameters are computed with appropriately chosen base time and data units. For example, the data production speed of data centers, 2560 Mbyte/s, is discretized into 128 tokens per unit time since the base unit time is 50 ms and the rate of 2560 Mbyte/s equals the rate of 128 Mbyte/50 ms. The resultant ESDFG for the application is shown in Figure 2-4.

Second, the firing rates of nodes are calculated using simple math on a topology matrix of the ESDFG, as described in Section 2.2.

\[
\Gamma =
\begin{pmatrix}
128 & 0 & -256 & 0 & 0 \\
0 & 128 & -256 & 0 & 0 \\
0 & 0 & 1 & -1 & 0 \\
0 & 0 & 1 & 0 & -1
\end{pmatrix}
\]

The solution for rates of nodes is given by $[2, 2, 1, 1, 1]$. Each element of the solution vector corresponds to $r_1$ through $r_5$, respectively.

The next step is to formulate the problem as a linear program.
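The firing-rate calculation can also be reproduced without MATLAB. The following is a minimal Python sketch, assuming that exact rational Gaussian elimination is acceptable for graphs of this size; the function name `repetition_vector` is hypothetical and not part of the original work. It returns the smallest positive integer solution of the balance equations when the topology matrix has rank n - 1, and None otherwise.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(gamma):
    """Return the smallest positive integer q with gamma * q = 0.

    gamma is a topology matrix given as a list of rows (one row per edge,
    one column per actor). Returns None when the null space is not
    one-dimensional or contains no strictly positive vector.
    """
    rows = [[Fraction(x) for x in row] for row in gamma]
    n = len(rows[0])
    pivots = []  # pivot column of each reduced row, in row order
    r = 0        # index of the next row to receive a pivot
    for c in range(n):
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue  # no pivot available: column c is a free column
        rows[r], rows[p] = rows[p], rows[r]
        pivot = rows[r][c]
        rows[r] = [x / pivot for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
    if len(pivots) != n - 1:
        return None  # rank is not n - 1, so Theorem 2.1 does not apply
    free = next(c for c in range(n) if c not in pivots)
    q = [Fraction(0)] * n
    q[free] = Fraction(1)
    for row_index, c in enumerate(pivots):
        q[c] = -rows[row_index][free]
    # Clear denominators, then divide by the gcd to get coprime integers.
    scale = reduce(lambda a, b: a * b // gcd(a, b),
                   (x.denominator for x in q), 1)
    ints = [int(x * scale) for x in q]
    g = reduce(gcd, ints)
    ints = [x // g for x in ints]
    return ints if all(x > 0 for x in ints) else None


# Topology matrix of the ESDFG in the illustrative example.
esdfg = [
    [128, 0, -256, 0, 0],
    [0, 128, -256, 0, 0],
    [0, 0, 1, -1, 0],
    [0, 0, 1, 0, -1],
]
print(repetition_vector([[2, -1]]))  # Figure 2-1 (a): [1, 2]
print(repetition_vector(esdfg))      # [2, 2, 1, 1, 1]
```

Exact fractions are used instead of floating point so that the result is a true integer vector and no tolerance has to be chosen for the rank test.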


2.3.2 Optimal Bandwidth Allocation with a Feasible Schedule

To include temporal constraints such as throughput, we define extended SDFG (ESDFG) as follows.

Definition 2 (Extended SDFG (ESDFG)). An ESDFG is represented by $G = (V, E, I, O, \tau, D, st, et, T)$, where $V, E, I, O, \tau, D$ are the same as for an SDFG, $st$ and $et$ are the start and end time of the execution period of the SDFG, and $T$ is $\{(v, T_v) \mid v \in V, T_v \in \mathbb{R}\}$.

The set $T$ has elements of throughput constraints defined as two-tuples $(v, T_v)$, where $v$ is the node whose throughput should be equal to or greater than $T_v$. $st$ and $et$ are used for in-advance bandwidth reservations. Suppose that we manage data structures for in-advance bandwidth reservations, such as time-bandwidth lists representing how available bandwidth varies over time on each edge. We can then easily obtain a subgraph whose available bandwidth on each edge is set to the maximum bandwidth that is available throughout the period $[st, et)$. For example, if an edge $e_{ij}$ has available bandwidth 1 and 2 over time periods $[0, 1)$ and $[1, 2)$, and $st$ and $et$ are given as 0 and 2, the edge $e_{ij}$ of the subgraph has a value of 1 as its available bandwidth. The BAFS problem formulation works on this subgraph if in-advance bandwidth reservations are considered. Informally, we can define the bandwidth allocation with a feasible schedule (BAFS) problem as follows: Given a network topology represented by $G = (V, E)$ and iterative data-dependent tasks represented by an ESDFG, $G_t = (V_t, E_t, I_t, O_t, \tau_t, st, et, T)$, what is the optimal bandwidth allocation with a feasible schedule that minimizes network resource consumption and meets temporal requirements?

The formal problem formulation will be presented below after discussion of how to model communication delays in established paths for e-Science applications.
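The subgraph construction for in-advance reservations can be sketched as follows. This is a minimal illustration, assuming that each time-bandwidth list is stored as `(start, end, bandwidth)` entries over half-open intervals and that the entries cover the whole reservation window; the function names and the data layout are hypothetical.

```python
def window_bandwidth(profile, st, et):
    """Bandwidth that stays available on one edge throughout [st, et).

    profile is a time-bandwidth list of (start, end, bandwidth) entries with
    half-open intervals [start, end). The result is the minimum bandwidth
    over all entries that overlap the window.
    """
    if st >= et:
        raise ValueError("empty reservation window")
    overlapping = [bw for start, end, bw in profile if start < et and end > st]
    if not overlapping:
        raise ValueError("profile does not cover the window")
    return min(overlapping)


def reservation_subgraph(profiles, st, et):
    """Map each edge to the bandwidth available throughout [st, et)."""
    return {edge: window_bandwidth(p, st, et) for edge, p in profiles.items()}


# The example from the text: bandwidth 1 on [0, 1) and 2 on [1, 2).
profiles = {("i", "j"): [(0, 1, 1), (1, 2, 2)]}
print(reservation_subgraph(profiles, 0, 2))  # {('i', 'j'): 1}
```

Taking the minimum over the window is what makes the resulting capacity safe to use as a constant bound for the whole execution period.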


2.3.2.1 Modeling communication delays

A communication delay is composed of four factors: processing delay, transmission delay, queueing delay, and propagation delay. Processing delay is associated with operations such as packetizing, and thus is proportional to the data size, as is transmission delay. Queueing delay is stochastic, and propagation delay is constant for a certain link. In this chapter, we assume that e-Science applications run on dedicated networks, i.e., MPLS or GMPLS networks, where the paths are established using label switched paths (LSPs). For such scenarios, queueing delay and propagation delay can be ignored. We assume that transmission delay dominates total delay. The processing delay can be incorporated into transmission delay because both kinds of delay are proportional to the data size. We thus optimize with regard to transmission delay.

We now investigate how to incorporate communication delays into an optimal computation rate problem given an SDFG. To the best of our knowledge, this has not been addressed in the literature on SDF modeling for DSP applications. Although communication delays have been considered in multiprocessor scheduling, the focus is mainly on the makespan of schedules, which is the total time taken to execute all the tasks specified by a precedence task graph, not on the throughput of infinitely repeated schedules. The target applications are e-Science applications whose data-dependent distributed nodes collaborate iteratively.

Figure 2-5 (a) shows an SDFG consisting of two actors, u and v. Actor u produces 2 tokens per firing, and actor v consumes 1 token per firing. We assume that it takes 2 units of time for actor u to send 2 tokens to actor v. The value in parentheses inside a node indicates the execution time of the node.

There are two ways to integrate the communication delay within the SDF model.

1. The communication delay can be included in the execution time of the producing actor u (Figure 2-5 (b)).

2. The communication delay can be included by having a dummy actor c whose execution time is set to the communication delay (Figure 2-5 (c)).


Figure 2-5. Modeling communication delay in an SDFG

The first option (Figure 2-5 (b)) implies that communication can occur right after tokens are produced in the producer's buffer and the producer cannot be fired again until the transfer of produced tokens is done. This is the most conservative way of modeling a communication delay since the relation between the execution and communication is assumed to be synchronous. We call this model the conservative model in this chapter. If we are not sure how the program is implemented internally, we can take this conservative model to guarantee that the final solution meets the throughput requirements. The second option (Figure 2-5 (c)) implies that communication can run independently of the producer. This, in general, can lead to higher buffer space requirements, but may result in a higher computation rate. As can be seen in Figure 2-5 (c), the optimal schedule shows more throughput compared to the optimal schedule in Figure 2-5 (b). We call this model the optimistic model, as opposed to the conservative model, in this chapter. Either of these two models can be chosen arbitrarily for each node, and the details on how this issue can be dealt with in the problem formulation are presented in Section 2.3.

In some cases, there may be multiple outgoing communication channels (Figure 2-6 (a)). As with a single communication channel, we can make a choice between two options: a conservative approach and an optimistic approach. The conservative approach adds max{communication delays of outgoing communication channels} to the execution time of


Table 2-3. Notation for problem formulation

Function
  $v_t(v)$: $v_t: Z \to Z$, maps a vertex, $v$, in $V$ into a vertex in $V_t$.
  $Com(a)$: $Com: V \to$ boolean, returns true if an actor $a$ is a dummy node to model communication delay.

Constant or Set
  $G(V, E)$: original network topology.
  $G_t(V_t, E_t, I_t, O_t, \tau_t, st, et, T)$: an ESDFG specifying iterative data-dependent tasks.
  $J_c$: $\{(s_i, d_i) \mid s_i \in V, d_i \in V, (v_t(s_i), v_t(d_i)) \in E_t\}$, a set of communication jobs modeled by the conservative approach, defined by two-tuples of source and destination nodes.
  $J_o$: $\{(s_i, d_i) \mid s_i \in V, d_i \in V, c \in V_t, (v_t(s_i), c) \in E_t, (c, v_t(d_i)) \in E_t\}$, a set of communication jobs modeled by the optimistic approach, also defined by two-tuples of source and destination nodes.
  $J$: $J_c$ or $J_o$, depending on the approach.
  $s_j$: $s_j \in V$, $j \in J_c \lor j \in J_o$, source node of job $j$.
  $d_j$: $d_j \in V$, $j \in J_c \lor j \in J_o$, destination node of job $j$.
  $\tau_i$: execution time of node (actor) $i \in V_t$.
  $r_i$: rate of node (actor) $i \in V_t$.
  $I_j$: $j \in J$, amount of data (number of tokens) consumed by actor $v_t(d_j)$.
  $O_j$: $j \in J$, amount of data (number of tokens) produced by actor $v_t(s_j)$.
  $C_{lk}$: available bandwidth on edge $(l, k) \in E$ during the period $[st, et)$.
  $V_{tf}$: a set of front-end nodes whose throughputs are concerned, $V_{tf} \subseteq V_t$.
  $T_d$: throughput requirement of node (actor) $d \in V_{tf}$, specified by users.

Variable
  $R_{max}$: the maximal computation rate of an iteration.
  $t_d$: throughput of node (actor) $d \in V_{tf}$.
  $f^j_{lk}$: flow of job $j$ on an edge $(l, k) \in E$.
  $D_j$: allocated bandwidth for job $j$.

a producer actor. Figure 2-6 (b) shows such a case where the execution time of actor u increases by 3, max{2, 1, 3}. One of the drawbacks of this model is that it deters early executable actors from starting on their own schedules. For example, actor w in Figure 2-6 (b) cannot be fired 1 unit time after u finishes its execution. Instead it should wait 2 unit times more. The other approach, the optimistic one shown in Figure 2-6 (c), is the same as in the case of a single communication channel. For each channel, a logical actor accounting for a corresponding communication delay is inserted between the original producer/consumer actors.
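The two modeling options can be expressed as simple graph transformations. The sketch below is illustrative only: the dictionary-based graph representation, the function names, the dummy-actor naming scheme, and the execution times used in the example are all assumptions, while the channel delays 2, 1 and 3 follow the multiple-channel case described above.

```python
def conservative_model(exec_time, delays):
    """Fold delays into producers: add the largest outgoing delay to each.

    exec_time: {actor: execution time}
    delays:    {(producer, consumer): communication delay}
    Returns the adjusted execution times and the unchanged edge list.
    """
    worst = {}
    for (src, _dst), delay in delays.items():
        worst[src] = max(worst.get(src, 0), delay)
    adjusted = {a: t + worst.get(a, 0) for a, t in exec_time.items()}
    return adjusted, list(delays)


def optimistic_model(exec_time, delays):
    """Insert one dummy actor per channel whose execution time is the delay."""
    adjusted = dict(exec_time)
    edges = []
    for (src, dst), delay in delays.items():
        dummy = f"c_{src}_{dst}"  # hypothetical naming scheme
        adjusted[dummy] = delay
        edges += [(src, dummy), (dummy, dst)]
    return adjusted, edges


# Assumed execution times; the delays match the example with actors v, w, x.
exec_time = {"u": 2, "v": 1, "w": 1, "x": 1}
delays = {("u", "v"): 2, ("u", "w"): 1, ("u", "x"): 3}

cons_time, _ = conservative_model(exec_time, delays)
print(cons_time["u"])  # 5: execution time 2 plus max{2, 1, 3}

opt_time, opt_edges = optimistic_model(exec_time, delays)
print(opt_time["c_u_x"], len(opt_edges))  # 3 6
```

In the conservative variant the graph keeps its shape and only the producer slows down, whereas the optimistic variant grows the graph by one actor and one edge per channel.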


Figure 2-6. Modeling communication delay in the case of multiple communication channels

A more elaborate analysis of a certain actor's execution pattern may lead to more exact modeling, and Figure 2-7 shows in what cases and how we can improve our models. The semantics of SDF enforces that the output of an actor is generated at the end of the execution of the actor. Hence, in the case of Figure 2-6 (a), actors v, w and x can start their own execution at least 2, 1 and 3 unit times after actor u's firing is done, which means actor x can start at time 5 if actor u is fired at time 0. However, suppose that the output data for actor x is generated at time 1. The communication delay on the channel between actors u and x can then be adjusted by 2, as in Figure 2-7. The next procedures for incorporating communication delay into the SDF model will take either of Figure 2-6 (b) and (c).
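The difference between the two models for multiple outgoing channels can be illustrated with a small sketch. It is only an illustration under stated assumptions: the function name `consumer_start_times` is hypothetical, the channel delays 2, 1 and 3 are the ones quoted above for actors v, w and x, and the execution time of actor u is assumed to be 2 unit times (inferred from the statement that x can start at time 5 when u is fired at time 0).

```python
def consumer_start_times(fire_time, exec_time, channel_delays, model):
    """Earliest start time of each consumer after one firing of the producer.

    channel_delays maps a consumer name to the communication delay of the
    channel from the producer to that consumer (hypothetical interface).
    """
    # SDF semantics: output tokens appear when the producer's firing ends.
    finish = fire_time + exec_time
    if model == "conservative":
        # The producer's execution time is stretched by the largest
        # outgoing delay, so every consumer waits for the slowest channel.
        stretched = finish + max(channel_delays.values())
        return {c: stretched for c in channel_delays}
    if model == "optimistic":
        # Each channel has its own logical communication actor, so each
        # consumer only waits for the delay of its own channel.
        return {c: finish + d for c, d in channel_delays.items()}
    raise ValueError("unknown model: " + model)


delays = {"v": 2, "w": 1, "x": 3}  # delays quoted in the text
print(consumer_start_times(0, 2, delays, "conservative"))
print(consumer_start_times(0, 2, delays, "optimistic"))
```

Under these assumptions the conservative model makes w wait 2 unit times longer than its own channel requires, which is the drawback noted for Figure 2-6 (b).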


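Figure 2-7. More exploited parallelism in case of multiple communication channels

2.3.2.2 Problem formulation

The notation for the BAFS problem is summarized in Table 2-3. The BAFS problem can be formulated as the linear programming problems shown in Figures 2-8 and 2-9, for the conservative and the optimistic models, respectively.

Objective

$$\text{minimize} \sum_{j \in J,\ (l,k) \in E} f^j_{lk}$$

Multi-commodity flow constraints

$$\sum_{k:(l,k) \in E} f^j_{lk} - \sum_{k:(k,l) \in E} f^j_{kl} = 0, \quad l \neq s_j,\ l \neq d_j,\ \forall j \in J$$

$$\sum_{j \in J} f^j_{lk} \le C_{lk}, \quad \forall (l,k) \in E$$

$$\sum_{k:(l,k) \in E} f^j_{lk} - \sum_{k:(k,l) \in E} f^j_{kl} = \begin{cases} D_j, & \text{if } l = s_j \\ -D_j, & \text{if } l = d_j \end{cases}, \quad \forall l \in V,\ j \in J$$

$$0 \le f^j_{lk}, \quad \forall j \in J,\ (l,k) \in E$$

$$0 \le D_j$$

Temporal constraints

$$R_{max} \le \frac{1}{r_i \left( \tau_i + \frac{O_j}{D_j} \right)}, \quad i \in V_t,\ j \in J_c,\ (v_t(s_j), v_t(d_j)) \in E_t$$

$$t_d = R_{max}\, r_d, \quad d \in V_{tf}$$

$$T_d \le t_d \le \frac{r_d}{r_i \left( \tau_i + \frac{O_j}{D_j} \right)}, \quad d \in V_{tf},\ i \in V_t,\ j \in J_c,\ (v_t(s_j), v_t(d_j)) \in E_t$$

$$T_d (\tau_i D_j + O_j) \le \frac{r_d}{r_i} D_j, \quad d \in V_{tf},\ i \in V_t,\ j \in J_c,\ (v_t(s_j), v_t(d_j)) \in E_t$$

Figure 2-8. BAFS problem formulation in case of the conservative model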


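Objective

$$\text{minimize} \sum_{j \in J,\ (l,k) \in E} f^j_{lk}$$

Multi-commodity flow constraints

$$\sum_{k:(l,k) \in E} f^j_{lk} - \sum_{k:(k,l) \in E} f^j_{kl} = 0, \quad l \neq s_j,\ l \neq d_j,\ \forall j \in J$$

$$\sum_{j \in J} f^j_{lk} \le C_{lk}, \quad \forall (l,k) \in E$$

$$\sum_{k:(l,k) \in E} f^j_{lk} - \sum_{k:(k,l) \in E} f^j_{kl} = \begin{cases} D_j, & \text{if } l = s_j \\ -D_j, & \text{if } l = d_j \end{cases}, \quad \forall l \in V,\ j \in J$$

$$0 \le f^j_{lk}, \quad \forall j \in J,\ (l,k) \in E$$

$$0 \le D_j$$

Temporal constraints

$$R_{max} \le \frac{1}{r_i \frac{O_j}{D_j}}, \quad \text{if } Com(i) = \text{true and } j \in J_o,\ (v_t(s_j), i) \in E_t$$

$$t_d = R_{max}\, r_d, \quad d \in V_{tf}$$

$$T_d \le t_d \le \frac{r_d}{r_i \frac{O_j}{D_j}}, \quad \text{if } Com(i) = \text{true and } d \in V_{tf},\ j \in J_o,\ (v_t(s_j), i) \in E_t$$

$$T_d\, O_j \le \frac{r_d}{r_i} D_j, \quad \text{if } Com(i) = \text{true and } d \in V_{tf},\ j \in J_o,\ (v_t(s_j), i) \in E_t$$

Figure 2-9. BAFS problem formulation in case of the optimistic model

The problem formulation allows the use of the multi-commodity flow problem, for which a variety of efficient solutions exists in the literature [27]. The major differences between a typical multi-commodity flow problem and this problem formulation are as follows:

1. The demand of each job is not a constant, but a decision variable. This determines how much bandwidth is allocated to a job, i.e., communication between producer and consumer actors.

2. The decision variables are constrained by temporal constraints pertaining to throughputs of actors.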
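Because the demand $D_j$ is a decision variable, the last temporal constraint of each figure can be read as a lower bound on the bandwidth of a job. The sketch below solves that single inequality for $D_j$; it is not the full linear program, and the function name `min_bandwidth` and the numeric values are hypothetical. Setting `tau_i` to 0 gives the optimistic-model inequality $T_d O_j \le (r_d / r_i) D_j$.

```python
def min_bandwidth(T_d, r_d, r_i, O_j, tau_i=0.0):
    """Smallest D_j with T_d * (tau_i * D_j + O_j) <= (r_d / r_i) * D_j.

    Rearranged: D_j * (r_d / r_i - T_d * tau_i) >= T_d * O_j.
    Returns None when no finite bandwidth can satisfy the requirement,
    i.e. when the execution time alone already violates the throughput.
    """
    slack = r_d / r_i - T_d * tau_i
    if slack <= 0:
        return None
    return T_d * O_j / slack


# Hypothetical numbers: requirement 0.25 firings per unit time,
# 100 tokens per firing, equal rates, execution time 2.
print(min_bandwidth(0.25, 1, 1, 100, tau_i=2))  # conservative model
print(min_bandwidth(0.25, 1, 1, 100))           # optimistic model
```

With these numbers the conservative model needs twice the bandwidth of the optimistic one, because the producer's execution time consumes half of the time budget of each firing. In the full formulation the bound must hold for every applicable combination of $d$, $i$ and $j$, and the flows must additionally respect the link capacities.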


The objective of the linear programs in Figures 2-8 and 2-9 is to minimize network resource consumption, which is the total amount of bandwidth allocated to all edges in the original network topology. If the demands were constant values, the objective could be regarded as minimizing the average hop count of all jobs (communication channels), approximating the average hop count as (total network traffic) / (total demand). However, since the demands are decision variables, this objective can be thought of as minimizing allocated network resources regardless of the average hop count of all jobs.

The constraints can be divided into two parts. The first part is typical of multi-commodity flow constraints. The flow conservation constraints mandate that, for all jobs, the net flow at a node is zero, i.e., the incoming and outgoing flows at a node are balanced unless the node is a source or a destination. The capacity constraints mandate that the flow along any edge cannot exceed the capacity of the edge. The demand constraints ensure that the source and the destination of any job produce and consume, respectively, the flow of the job, $D_j$.

The second part concerns temporal constraints, guaranteeing the throughputs of front-end actors. As discussed earlier, Theorem 2.2 states that the maximum computation rate is limited by the cycle whose cost-to-time ratio is minimum. Accordingly, the upper bounds on $R_{max}$ account for communication delays on outgoing edges of a certain node. Since the target application is an acyclic e-Science application, the cycles we should consider for the maximum computation rate are limited to self-dependency loops, where the number of tokens is 1 and the total execution times are $r_i(\tau_i + O_j/D_j)$ and $r_i\, O_j/D_j$ for the conservative and optimistic models, respectively. The term $O_j/D_j$ accounts for communication delays of outgoing edges of actor $i$. In addition, since Theorem 2.2 is for homogeneous SDFs, considering the conversion from the given ESDFG to the homogeneous SDF [76], the execution time, $(\tau_i + O_j/D_j)$ or $O_j/D_j$, should be multiplied by the firing rate $r_i$, as in the upper bounds on $R_{max}$. Since $R_{max}$ is the number of iterations per unit time and the firing rate is the number of firings per iteration, the throughput of a certain


node $d$ equals $R_{max}\, r_d$, as in the throughput definitions of Figures 2-8 and 2-9. The upper bounds on $R_{max}$ can be transformed into the bounds $T_d \le t_d \le \ldots$ of the two figures, since the required throughput is guaranteed if $R_{max}\, r_d$ is greater than or equal to the $T_d$ specified by users. With a few transformations, the throughput constraints result in the final inequalities of Figures 2-8 and 2-9, which are linear in $D_j$.

The solution of the linear program determines the optimal bandwidth allocation on edges, and exact schedules and the associated buffer space allocation can then be computed as in [50].

2.4 Experimental Evaluation

There is no other research work on the BAFS problem in the context of grid computing. We compare our LP-based algorithm for the BAFS problem with a heuristic that simply uses the definition of throughput between two actors in the assignment process. We also compare the two approaches of our algorithm, the conservative and the optimistic ones. The heuristic is presented in Algorithm 2-1. It enumerates all the paths based on the throughput requirements, and computes delays on edges based on the assumption that all the delays of the edges constituting a given path are the same. If a tighter delay is required while examining paths, the delay of the edge is updated to the tighter delay. For example, in Figure 2-4, the throughput of node 3 comes from two paths: 0 → 2 → 3 and 1 → 2 → 3. Suppose that a communication delay of 2 on the edge (2, 3) achieves the throughput required by node 3 on the path 0 → 2 → 3. If the communication delay on the edge (2, 3) should be 1 considering the other path 1 → 2 → 3, the communication delay on the edge (2, 3) is updated to 1.

The heuristic does not consider possible parallelism of tasks, or possible balanced bandwidth allocations on edges, since it computes delays by assuming all delays on a path to be the same.

We compare the two algorithms in terms of the rejection ratio of requests. The bandwidths of edges are randomly selected from a uniform distribution between 10 and 1024 unit


data per base unit time. We varied the number of requests from 1 to 16 on the Abilene network [11] (see Figure 2-10), and each request is a specified task graph (Figure 2-4). The nodes of a request were constrained to have a matching node in the original network topology graph, and the matching node is randomly assigned using a uniform distribution.

Algorithm 2-1 A heuristic for BAFS problem
Input: An ESDFG
1: Enumerate all the possible paths from front-end nodes to back-end nodes whose throughputs are specified by temporal requirements.
2: Initialize the delay on edge i, ed_i, as ∞.
3: for each path do
4:   Assume a same delay, d, on all the edges of the path.
5:   Compute d satisfying temporal requirements.
6:   if ed_i > d then
7:     ed_i = d
8:   end if
9: end for
10: Compute the bandwidth on each edge based on ed_i.

Figure 2-10. The Abilene network

The experimental results are shown in Figure 2-11. Both of our LP-based approaches have better rejection ratios, which are lower than those of the heuristic by 5 to 30%.
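The structure of Algorithm 2-1 can be sketched in a few lines. The sketch rests on simplifying assumptions that are not part of the original algorithm: each enumerated path is assumed to come with a total communication delay budget already derived from its throughput requirement, the per-edge delay d is taken as that budget divided evenly over the edges of the path, and the bandwidth of an edge is taken as the tokens sent over it divided by its delay. The names `heuristic_delays` and `bandwidths` and all numbers are hypothetical.

```python
import math


def heuristic_delays(paths):
    """paths: list of (edges, delay_budget), edges being a list of (u, v).

    Returns the tightest per-edge delay over all paths (steps 2 to 9).
    """
    edge_delay = {}
    for edges, budget in paths:
        d = budget / len(edges)  # same delay on every edge of the path
        for e in edges:
            # Unseen edges start at infinity; keep the tighter delay.
            if edge_delay.get(e, math.inf) > d:
                edge_delay[e] = d
    return edge_delay


def bandwidths(edge_delay, tokens):
    """Step 10: bandwidth needed to move tokens[e] within edge_delay[e]."""
    return {e: tokens[e] / d for e, d in edge_delay.items()}


# The two paths of the example: 0 -> 2 -> 3 and 1 -> 2 -> 3.
paths = [([(0, 2), (2, 3)], 4), ([(1, 2), (2, 3)], 2)]
delays = heuristic_delays(paths)
print(delays)
print(bandwidths(delays, {(0, 2): 10, (1, 2): 10, (2, 3): 10}))
```

In this toy run the shared edge (2, 3) first receives delay 2 from the path 0 → 2 → 3 and is then tightened to 1 by the path 1 → 2 → 3, mirroring the example given for the heuristic.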


Figure 2-11. Rejection ratio vs. number of requests

The drawback of the heuristic is two-fold. First, it does not consider the fact that schedules of iterations can be overlapped. Second, it cannot allocate bandwidth to links interconnecting nodes according to the current network status, i.e., the currently available bandwidth on each link. In contrast, the optimistic approach, which assumes that data transfers can also occur in parallel with computational executions, uses the least amount of network resources to achieve the throughput requirements given by users. Consequently, the optimistic approach leads to the lowest rejection ratio of requests, since a request has a better chance of being accepted due to its smaller bandwidth allocation requirements, and the following requests benefit from a less loaded network.

2.5 Summary

The dedicated network on which e-Science applications operate guarantees that a certain path can have a reserved bandwidth over a given period, which means the communication delays vary depending on allocated bandwidths. We develop an SDF-based model for iterative data-dependent e-Science applications that incorporates variable communication delays and temporal constraints, such as throughput. We formulate the problem as a variation of multi-commodity linear programming with an objective of minimizing network resource consumption while meeting temporal constraints. The resulting solution can then be used to derive buffer space requirements by previously developed algorithms in the context of DSP applications. Finally,


an illustrative example of an e-Science application shows that the framework and algorithm we propose are valid to model and analyze iterative data-dependent e-Science applications. The simulation results show that the optimal bandwidth allocation by the formulated linear programming outperforms the bandwidth allocation by a simple heuristic in terms of the rejection ratio of requests.

In the future, we will extend our framework so that it also schedules computation jobs onto distributed computing resources when such mappings are not known ahead of time, or so that it maximizes the overall throughput when multiple SDFGs with throughput requirements are given.


CHAPTER 3
TOPOLOGY AGGREGATION

3.1 Overview

The need for transporting large volumes of data in e-Science has been well argued [33, 78]. For instance, the HEP data is expected to grow from the current petabytes (PB) ($10^{15}$) to exabytes ($10^{18}$) sometime between 2012 and 2015. In addition, e-Scientists desire schedulable network services to support predictable work processes [46]. Quality of service (QoS) in network applications has been an active research area for several decades. Recently, new technologies such as multiprotocol label switching (MPLS) and generalized multiprotocol label switching (GMPLS) drew more attention to QoS routing, since those technologies have made it possible for network managers to set up and tear down explicit paths while guaranteeing specified amounts of bandwidth.

The network supporting e-Science applications typically comprises multiple domains. Each domain usually belongs to a different organization, and is managed based on different operational policies. In such cases, internal topologies of domains may not be visible to the others for security or other reasons. However, aggregated information about the internal topology and associated attributes is advertised to the other domains.

A set of techniques to aggregate data to advertise outside one domain is called Topology Aggregation (TA). The aggregated data itself is termed an Aggregated Representation (AR). A survey of TA algorithms is presented in [98]. There exists a tradeoff between the accuracy and the size of an AR. Hence, most algorithms proposed in the previous work tried to achieve the most efficient AR in terms of both accuracy and space complexity.

One can classify QoS path requests into two classes: single-path single-job (SPSJ) and multiple-path multiple-job (MPMJ), depending on the nature of requests. SPSJ corresponds to a situation in which requests for a single QoS path arrive and they


are scheduled in the order of arrival. In contrast, MPMJ corresponds to batch/off-line scheduling of multiple requests for multiple QoS paths. Many e-Science applications require simultaneous transfer of data from multiple sources and destinations. Also, each of these requests (e.g., file transfers) can be more efficiently supported by using concurrent multiple paths.

We show that existing TA approaches developed for SPSJ do not work well with MPMJ applications, as they overestimate the amount of bandwidth that is available. We propose a maxflow-based TA approach that is suitable for this purpose. Our simulation results demonstrate that our algorithms result in better accuracy or less scheduling time.

BGP, which has been deployed as the inter-domain protocol, has limited use for AR techniques, as it is not flexible enough to be extended to accommodate many QoS parameters. This is because it was originally designed only for distributing reachability information [107]. Recently, a new network model based on the PCE has been proposed to overcome the aforementioned drawbacks of BGP [45]. A PCE is an entity that is capable of computing network paths utilizing the traffic engineering database, which contains the required network status information such as topology and delays on links. Recent papers [80, 85, 93] have based their network model on the PCE-based architecture. We develop TA algorithms in the context of a PCE-based architecture that supports most e-Science applications. In particular, the following network model is assumed throughout this chapter.

1. A centralized PCE exists in each domain. A node sends a request to the PCE to make a reservation for a QoS path.

2. Centralized PCEs flood aggregated topology information to the others, so that every centralized PCE maintains a complete view of the network in the form of ARs, except for its own domain.

The first condition states that one active element in a domain acts as a super node of that domain, which knows all the information essential for QoS path computation. One possible implementation is that every node in a domain sends a request for a QoS


path to the designated centralized PCE; therefore, the PCE can manage one consistent set of information on the network status related to QoS parameters. The second condition can be reasonably assumed in e-Science networks, whose size is very small compared to the Internet. This condition enables us to directly apply the QoS routing algorithms that have been developed so far. In this network architecture, one domain can advertise its aggregated topology information and associated QoS parameters to all the other domains.

Figure 3-1. An example of inter-domain QoS routing

Based on the described network model, a scenario of inter-domain QoS routing works as in Figure 3-1.

STEP 1 A source node sends a path computation request to a single centralized PCE in the same domain.

STEP 2 Then the PCE replies with a coarse path, which is a sequence of border nodes without detailed hops between border nodes.

STEP 3 With the coarse path, the source node sends a path setup request that will traverse the border nodes of the coarse path.

STEP 4 and 5 The border node which receives a path setup request gets a strict path for the coarse path from the PCE in the same domain.

STEP 6 The same steps repeat until a path setup request reaches a destination node.

TA algorithms can also be used for scheduling paths in a single domain. These methods are useful as a large domain can be partitioned into subdomains. TA algorithms can then be applied to each subdomain. With ARs of subdomains, the actual scheduling may be performed either on a single node with rich compute resources or


on a distributed set of nodes, such that the time complexity of scheduling paths would be reduced by running scheduling algorithms on the partitioned smaller subdomains.

The rest of the chapter is organized as follows. The related work on TA is described in Section 3.2. Section 3.3 describes novel algorithms for MPMJ. Section 3.4 describes how real routing works for TA algorithms, and Section 3.5 gives a comparative analysis of time and space complexity. The experimental results by simulation are given in Section 3.6, and, finally, we conclude in Section 3.7.

3.2 Related Work

TA consists of algorithms and mechanisms for reducing the size of topological information and associated attributes within a domain or subdomains while maintaining a certain level of accuracy. Uludag et al. [98] presented a survey of these algorithms for multi-domain environments. All TA algorithms have two elements: an aggregated graph and aggregated QoS parameter values, called epitomes, assigned to logical links in the aggregated graph.

Typical topologies for aggregated graphs are full-mesh, simple compaction, tree-based, and star-based topologies. Some other topologies, e.g., Shufflenet [108], have been proposed to reduce space complexity in specific cases such as asymmetric networks. Most TA algorithms start by building a full-mesh graph, which is a complete graph whose nodes are only the border nodes of the original network. Algorithms that are more focused on the size of the AR usually try to transform a full-mesh graph into more compact forms, for example, a spanning tree or a star topology, while trying to keep up with the accuracy of a full-mesh AR. For the aggregated QoS parameter values, the epitomes, the maximum, the minimum or the average of QoS values are typically used.

TA algorithms for SPSJ in large-scale multi-domain networks focus on the compaction of ARs; accuracy is a secondary issue. As for TA algorithms in small-sized networks, accuracy has been the main focus [80, 85, 88, 93]. For a single QoS


constraint, a distortion-free algorithm exists [98]. But for two QoS constraints composed of an additive and a restrictive one, the problem gets more complicated. Even though the problem itself is not intractable, the distortion-free representation is not compact. For such reasons, several approximation algorithms minimizing distortion, such as the line segment algorithm [68], have been proposed. Usually, the multiple QoS constraints problem is generalized as one restrictive constraint with multiple additive constraints, since a multiplicative constraint such as link reliability can be transformed into an additive one through a log operation.

To the best of our knowledge, all existing TA algorithms are limited to routing a single QoS path at a time, i.e., SPSJ, with few exceptions of customized algorithms for special purposes such as computation of reliable paths. MPMJ applications consider a batch of jobs at a time, and multiple paths are allowed for one job. For instance, a request for the earliest finish time for a given multiple-source multiple-destination data transfer, which is one of the important e-Science applications [46], is handled at one time and multiple paths are set up for the request.

Emerging technologies such as MPLS or GMPLS make it possible to implement applications with strict QoS requirements on networks equipped with such facilities. Special purpose networks, such as research networks linking national labs in the U.S., can be set up for that purpose [83]. Especially for inter-domain QoS path routing in such special purpose networks, the accuracy of aggregated topologies and associated QoS parameter values is more important than the size of the data exchanged among domains, since the number of domains is relatively small compared to the Internet, which is constituted by a huge number of hosts and switches. Thus the need for more accurate ARs is prominent.

As described above, one of the most recent works regarding TA for two QoS constraints is the line segment algorithm in delay-bandwidth sensitive networks [68]. The line segment algorithm first computes 2-D charts whose x-axis and y-axis are delay


and bandwidth, respectively, for every pair of border nodes. The chart contains all the information for computing QoS paths with delay and bandwidth constraints. The authors of [68] suggested the line segment algorithm, which approximates that information by a line to reduce the size of the data representing all possible delay-bandwidth combinations between two border nodes; this is possible because the chart takes the shape of an increasing staircase function. The next step is to establish a full-mesh topology and convert it to a star topology to further reduce the space complexity to $O(|B|)$.

Figure 3-2. An illustrative example for limitations of the line segment algorithm

With existing TA algorithms for SPSJ, there is no way to estimate whether more than one path between two border nodes is available. Consider the multi-domain network in Fig. 3-2. The network consists of three domains, where AS1 is connected to AS3 via AS2. Suppose that a host in AS1 wants to find maxflow paths or reliable paths, composed of a primary and a backup path, to a certain host in AS3. If TA algorithms such as the line segment algorithm are deployed in this network, the PCE in AS1 computes paths based on the AR from AS2, which only gives the information on how much bandwidth is available within a certain delay. Since the PCE in AS1 has no clue how many paths exist internally in AS2, the computed max-flow or reliable paths are not necessarily the most accurate paths that would be computed based on the complete network status information.

3.3 TA for Multiple-Path Multiple-Job (MPMJ)

3.3.1 Problem Statement

When it comes to scheduling a batch of multiple jobs allowing for multiple paths, existing TA techniques are not useful because the performance degradation can be


significant. For example, the full-mesh AR for bandwidth scheduling, where each logical link has the maximum available bandwidth between two border nodes as an epitome, has been known as a distortion-free AR for single path bandwidth scheduling. However, it may not be effective for multi-path bandwidth scheduling, e.g., maxflow bandwidth scheduling.

An important class of e-Science applications is bulk file transfers. For example, in high energy physics, large files are routinely transferred between tiered centers that are geographically distributed around the world. The generated data have to be transferred from the places where they are stored to research centers for the purpose of analysis or visualization. In the context of e-Science applications, bandwidth scheduling problems range from single-source single-destination data transfer optimization to multiple-source multiple-destination data transfer optimization. The computational complexity of such problems depends on the space complexity of the network topology. Generally, we can break down network resource provisioning procedures for e-Science applications into the admission control phase and the network resource, i.e., bandwidth, allocation phase. In the admission control phase, acceptance of the requested jobs is determined; then, if they are accepted, explicit bandwidth allocation for each link is executed in the network resource allocation phase. With compact network information abstracted from a complete network topology, chances are that the network resource allocation phase may fail due to inaccurate network status information. Even though a request accepted in the admission control phase can be rejected in the network resource allocation phase due to inaccurate ARs, the benefits from lower space complexity compensate for failed operations if the error rate is fairly small.

In the following subsections, we propose several TA algorithms suited for MPMJ. Each request consists of single or multiple data transfer jobs. However, we want to allow for the use of multiple paths. The only QoS parameter that is considered is bandwidth.


3.3.2 New Topology Aggregation Algorithms

3.3.2.1 Full-mesh method

The most typical way of aggregating networks with QoS parameters is to build a full-mesh topology by connecting every pair of nodes of interest and assigning epitomes to the built logical links. Following this conventional way, we can build a full-mesh AR with maxflow values between nodes assigned to each logical link. Let us consider the edge connecting nodes D1 and D2 in Fig. 3-3. The epitome associated with the edge $E_{D_1 D_2}$, $F_{12}$, can be computed using any known maxflow algorithm. The algorithm for building a full-mesh AR is described in Algorithm 3-1.

Figure 3-3. Full-mesh AR

Algorithm 3-1 Full-mesh AR construction
Input: a graph G = (V, E).
1: Pick nodes of interest from a full set of nodes, V, and add them to the aggregation representation.
2: for each pair of picked nodes do
3:   Create a link between two nodes.
4:   Compute a maxflow value between two nodes.
5:   Assign the computed maxflow value as an epitome to the link created above.
6: end for

This simple method, adapted from existing TA techniques for SPSJ, easily turns out to be inappropriate for MPMJ. Let us take an example of a job requesting maxflow between D1 and D2, where D1, D2, D3 and D4 are nodes of interest. Since we consider


MPMJ in both single and multiple domain network environments, the nodes D1, D2, D3 and D4 do not have to be border nodes. The final maxflow value between D1 and D2 computed on the AR would be far bigger than the right answer, since the AR also offers other paths such as D1 → D3 → D2.

3.3.2.2 Star method

A full-mesh AR does not effectively support MPMJ, as the maximum amount of flow that a specific node can push into a network is not restricted.

For single path computation algorithms, most recent TA techniques start from a full-mesh AR and produce diverse variants stemming from it, such as partial full-mesh, star, tree and so on. For multiple path computation algorithms, the reasons described in the previous subsection prevent the full-mesh AR from being utilized as a base AR for other ARs that are efficient in terms of space complexity.

A star AR as in Fig. 3-4 can overcome the drawbacks of a full-mesh AR by limiting the maxflow value from any node. First, the logical node, L, is created and all nodes of interest are connected to it. Suppose that four nodes of interest (D1, D2, D3 and D4) are connected to the central logical node L. The epitome, assigned to the logical link connecting a certain node and the central logical node L, is the maxflow value from the node to all the remaining nodes. This is easily computed by putting a super source node connected to the node and a super sink node connected to all the remaining nodes, and running a maxflow algorithm between the super source and the super sink nodes. In this case, $F_1$ is the maxflow value that node D1 can send to the network, which is easily

Figure 3-4. Star AR


computed by adding a super sink node connected to D2, D3 and D4 and running a maxflow algorithm between D1 and the super sink node. Likewise, we can also compute the other epitomes, $F_2$, $F_3$ and $F_4$. This AR has only one outgoing link from each node, which keeps a node from sending data flow beyond the epitome assigned to the outgoing link. A formal description of the algorithm is presented in Algorithm 3-2.

Algorithm 3-2 Star AR construction
Input: a graph G = (V, E).
1: Pick nodes of interest from a full set of nodes, V.
2: Create a single logical node, L.
3: for each of picked nodes do
4:   Create a link between the node and the logical node, L.
5:   Compute a maxflow value from a target node to all the remaining nodes.
6:   Assign the computed maxflow value as an epitome to the link created above.
7: end for

3.3.2.3 Partitioned star method

After performing some experiments on the star AR, we found that a star AR shows little distortion compared to the original topology. However, TA algorithms should take into account how the result on an AR can be transformed into real path setups on the original topology.

Originally, TA arose from efforts to deal with scalability issues related to space complexity and security issues regarding intradomain topology in multi-domain network environments. Usually, routing procedures consist of two steps: (1) path computation and bandwidth allocation with ARs, and (2) explicit path computation and bandwidth allocation with the original network topology for each domain. Similar steps can also be applied to single domain network environments, where several subdomains exist for hierarchical routing, or where we can intentionally partition one domain into several logical subdomains. In this case, the benefits from TA are almost the same as those in


multi-domain network environments. In the case of MPMJ applications, however, we can expect more benefits in terms of computational complexity. A detailed computational complexity analysis is given in Section 3.5.

The partitioned star method tries to combine the benefits of the star and full-mesh methods by partitioning a domain into k subdomains, each of which is aggregated using the previous star method. Fig. 3-5 shows an example of a domain with four partitioned subdomains. In this chapter, we use general graph partitioning algorithms, which are widely used in many other computer science areas, including load distribution in parallel computers, sparse matrices and the design of very large scale integrated circuits (VLSI) [58]. The algorithm for building a partitioned star AR is described in Algorithm 3-3.

Algorithm 3-3 Partitioned star AR construction
Input: a graph G = (V, E) and k, the number of partitions (subdomains).
1: Pick nodes of interest from a full set of nodes, V, and add them to the aggregation representation.
2: Partition a graph into k parts so that the number of nodes of interest is evenly distributed over partitioned parts.
3: Identify cut nodes and cut edges, and add them to the aggregation representation.
4: for each part do
5:   Construct star AR with picked nodes and cut nodes in the part.
6: end for

3.4 Routing

With the network model described in Section 3.1, inter-domain QoS path routing is relatively easy compared to QoS path routing in a distance vector routing protocol. Any centralized PCE can compute a path to a destination, which consists of a strict path within its own domain and a coarse inter-domain path to the destination domain. The coarse inter-domain path is composed of border nodes, and when the path setup


request is received by a border node on the intermediate path, it is translated into a strict path composed of intra-domain routers or switches. On the other hand, when the line segment algorithm is deployed, computing the QoS path goes through two steps. First, we assign delay values to the virtual edges of the aggregated representations of all domains except the domain in which the source node resides; a request has a bandwidth requirement, and the corresponding delay is computed through the line segment algorithm. Second, any shortest path algorithm, such as Dijkstra's algorithm, can be run on the resulting graph with delay attributes on its edges.

Figure 3-5. Partitioned star AR

Inter-domain routing for SPSJ applications is described in Section 3.1. The routing procedures for MPMJ applications are the same as those for SPSJ applications. The results of any algorithm, e.g., a maximum bandwidth path algorithm, run on ARs are expanded in each domain or subdomain by running the same algorithm on the original topology of that domain or subdomain. If the operation fails in any of the domains or subdomains, the entire operation fails. Note that MPMJ applications in intradomain environments use ARs of subdomains to reduce the time complexity of scheduling, whereas SPSJ or MPMJ applications in interdomain environments are forced to use ARs for security or administrative reasons. The benefits of using ARs in intradomain environments from the perspective of time complexity are described in Section 3.5.
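The full-mesh and star constructions of Algorithms 3-1 and 3-2, and the overestimation that makes the full-mesh AR unsuitable for MPMJ, can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names `max_flow`, `full_mesh_ar` and `star_ar`, the reserved labels "L" and "_sink", and the four-node toy topology are all hypothetical; undirected links are modeled as two directed edges; and a plain Edmonds-Karp routine stands in for whichever maxflow algorithm is actually used.

```python
from collections import deque
from itertools import permutations


def max_flow(cap, s, t):
    """Edmonds-Karp maxflow; cap maps directed edges (u, v) to capacity."""
    res, adj = {}, {}
    for (u, v), c in cap.items():
        res[(u, v)] = res.get((u, v), 0) + c
        res.setdefault((v, u), 0)
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    flow = 0
    while True:
        parent, queue = {s: None}, deque([s])
        while queue and t not in parent:  # BFS for an augmenting path
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in parent and res[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            return flow
        bottleneck, v = float("inf"), t
        while parent[v] is not None:
            bottleneck = min(bottleneck, res[(parent[v], v)])
            v = parent[v]
        v = t
        while parent[v] is not None:  # push flow along the path
            u = parent[v]
            res[(u, v)] -= bottleneck
            res[(v, u)] += bottleneck
            v = u
        flow += bottleneck


def full_mesh_ar(cap, nodes):
    """Algorithm 3-1: one logical link per ordered pair, epitome = maxflow."""
    return {(a, b): max_flow(cap, a, b) for a, b in permutations(nodes, 2)}


def star_ar(cap, nodes):
    """Algorithm 3-2: epitome = maxflow from a node to all remaining nodes."""
    ar = {}
    for a in nodes:
        ext = dict(cap)
        for b in nodes:
            if b != a:
                ext[(b, "_sink")] = float("inf")  # super sink
        f = max_flow(ext, a, "_sink")
        ar[(a, "L")] = f
        ar[("L", a)] = f
    return ar


# Toy topology: D1, D2, D3 all attached to one internal node m.
links = [("D1", "m"), ("m", "D2"), ("m", "D3")]
cap = {}
for u, v in links:
    cap[(u, v)] = cap[(v, u)] = 10
nodes = ["D1", "D2", "D3"]
print(max_flow(cap, "D1", "D2"))                       # original topology
print(max_flow(full_mesh_ar(cap, nodes), "D1", "D2"))  # full-mesh AR
print(max_flow(star_ar(cap, nodes), "D1", "D2"))       # star AR
```

On this toy topology the true maxflow from D1 to D2 is 10, the full-mesh AR reports 20 because it also counts the logical path D1 → D3 → D2, and the star AR reports 10 because the single link from D1 to L caps what D1 can inject.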


Table 3-1. Time Complexity for MPMJ

Method | Time Complexity
------ | ---------------
Full-mesh | $O(n^3 D^2)$
Star | $O(n^3 D)$
Partitioned star | $O((n/k)^3 (C + D))$

D = number of nodes of interest
C = number of cut nodes
k = number of partitions

3.5 Complexity Analysis

Usually, algorithms for MPMJ have higher computational complexities than algorithms for SPSJ. Dijkstra's algorithm can be used to derive the maximum bandwidth path between two nodes, which can be regarded as the maxflow single path. The complexity of Dijkstra's algorithm is $O(n \log n + m)$, where n is the number of vertices and m is the number of edges. In contrast, the complexity of the push-relabel maxflow algorithm is $O(n^3)$ [27]. This shows that algorithms for MPMJ may require computational costs a few orders of magnitude higher than those for SPSJ.

The time complexities of TA algorithms for MPMJ are summarized in Table 3-1. The full-mesh method requires $O(n^3 D^2)$, and the star and the partitioned star methods require $O(n^3 D)$ and $O((n/k)^3 (C + D))$, respectively, where D is the number of nodes of interest, C is the number of cut nodes, and k is the number of partitions. The complexity of maxflow algorithms is assumed to be $O(n^3)$.

The space complexities are summarized in Table 3-2. The space complexities of ARs for the full-mesh, star and partitioned star methods are $O(D^2)$, $O(D)$ and $O(C + D)$, respectively. Suppose that a certain algorithm for MPMJ applications takes $O(n^3)$. If we run the algorithm on ARs, it will take $O((C + D)^3)$ and $k\,O((n/k)^3)$, which are the time taken for running the algorithm on the ARs and the time taken for explicit routing in each partition, respectively. $(C + D)$ and $k$ are small compared to $n$, and $n^3$ may be greater than $(n/k)^3$ by a few orders of magnitude. Hence, we can expect that the


partitioned star method can expedite the path computation and bandwidth allocation process significantly.

Table 3-2. Space Complexity for MPMJ

Method | Space Complexity
------ | ----------------
Full-mesh | $O(D^2)$
Star | $O(D)$
Partitioned star | $O(C + D)$

D = number of nodes of interest
C = number of cut nodes

3.6 Experimental Evaluation

3.6.1 Bulk File Transfers in E-Science

We chose the bulk file transfer application in [65] as a typical MPMJ e-Science application to show that our proposed algorithms perform better than naive algorithms adapted from SPSJ TA algorithms. In [65], the authors formulated the in-advance scheduling of multiple bulk file transfers as a linear programming problem. We adapted their linear programming formulation to on-demand scheduling of multiple bulk file transfers for our simulation. The linear programming formulation is shown in Figure 3-6. The notations and equations are borrowed from [65] whenever possible. In this formulation, $t_f$ denotes the time by which all file transfers complete. The objective of this linear programming problem is to find the earliest finish time. $f^j_{lk}$ is the amount of the file transferred for request $j \in F$ on link $(l, k) \in E$. $b_{lk}$ is the bandwidth available on link $(l, k)$. The flow conservation constraint ensures that for each transfer request $j \in F$, for each node $l$ that is neither the source nor the destination node, the amount of file $j$ that leaves node $l$ equals the amount that enters this node. The demand constraint requires the source node of request $j$ to send a net $f_j$ units of file $j$ out and requires the destination node to receive a net $f_j$ units. The capacity constraint ensures that the amount of traffic on each link does not exceed the available capacity of the link in the interval $[0, t_f)$. The last constraint ensures that file transfer amounts are non-negative.


$$\text{minimize } t_f$$

subject to

$$\sum_{k:(l,k) \in E} f^j_{lk} - \sum_{k:(k,l) \in E} f^j_{kl} = 0, \quad \forall j \in F,\ \forall l \in V,\ l \neq s_j,\ l \neq d_j$$

$$\sum_{k:(l,k) \in E} f^j_{lk} - \sum_{k:(k,l) \in E} f^j_{kl} = \begin{cases} f_j, & \text{if } l = s_j \\ -f_j, & \text{if } l = d_j \end{cases}, \quad \forall j \in F$$

$$\sum_{j \in F} f^j_{lk} \le b_{lk}\, t_f, \quad \forall (l,k) \in E$$

$$f^j_{lk} \ge 0$$

Figure 3-6. Earliest finish time on-line scheduling of multiple file transfers

3.6.2 Experiment Testbed

For TA algorithms for MPMJ, we performed experiments on random networks with a single domain. Random network topologies are generated by the BRITE internet topology generation package [72]. We tried several models such as Waxman, BRITE, etc., but the results for different models show similar trends. Therefore, we show only results for random network topologies following the Waxman model with an average node degree of 4. The bandwidth values of edges are randomly selected from a uniform distribution between 10 and 1024. The number of nodes in each domain is varied from 100 to 300 in increments of 50. The nodes of interest are picked randomly within a domain, and the number of nodes of interest ranges from 1 to 16, doubling at each step. We generated a synthetic set of data transfer requests. Each request is described by the 3-tuple (source node, destination node, requested file transfer size). The number of requests is also randomly selected within the range of 1 to the maximum possible number of requests determined by the number of nodes of interest. For example, if the number of nodes of interest is 4, the maximum possible number of requests is 4 × 3. The


source and destination nodes for each request are randomly selected using a uniform random number generator. The results are averaged over 100 random networks for each number of nodes.

3.6.3 Performance Metrics

The performance metric we have used to compare the different approaches is the earliest finish time (EFT) needed to complete all the given data transfer requests. One would expect a good AR approach to perform as close as possible to using the original topology.

Hence, we use the error ratio (ER), which measures the deterioration from the correct EFT on the original topology. A TA algorithm with a lower ER shows better performance. ER is formally defined as

$$ER = \frac{\text{TA EFT} - \text{Original EFT}}{\text{Original EFT}}$$

3.6.4 Results

We measured ER according to the equation defined in Section 3.6.3. The computational times taken by each algorithm were also gathered to show how much computation cost reduction we can get from the compact representation. Fig. 3-7 shows that the star and the partitioned star methods give around 5% ER. This is because the application of finding the EFT tends to find and allocate all the available bandwidth in a network, which is limited by the star or the partitioned star ARs in a similar way as by the original network. In addition, we could observe that as the number of requests increases, ER improves because all the network resources, i.e., the bandwidths, are eventually used up. As expected, the performance of the full-mesh AR is the worst.

Our simulation results in Fig. 3-8 also show that the star method is comparable to the partitioned star method in accuracy but significantly faster. This is not only because the star method is a more compact representation, but also


partly because, for randomly generated networks, the number of cut nodes is relatively large. If a domain is structured as a backbone and the remaining network, the number of cut nodes can be reduced to a reasonable value, which can enhance the performance of the partitioned star method.

Figure 3-7. Error ratio vs. the number of nodes

Figure 3-8. Normalized computational time vs. the number of source and destination nodes

3.7 Summary

We propose several topology aggregation algorithms for e-Science networks. E-Science applications require higher quality intradomain and interdomain QoS paths, and some of those are distinguished from classic single-path single-job (SPSJ) applications. We define a new class of requests, called multiple-path multiple-job (MPMJ), and propose TA algorithms for the new class of applications. The proposed algorithms, star and partitioned star ARs, are shown to be significantly better than naive


approaches. In particular, star AR shows the best performance in terms of computational time. Its performance is also very close to using the entire topology for performing the scheduling. Thus, it is well suited for multiple-domain e-Science applications.


CHAPTER 4
WORKFLOW SCHEDULING

4.1 Overview

An application scientist typically solves his/her problem as a series of transformations. Each transformation may require one or more inputs and may generate one or more outputs. The inputs and outputs are predominantly files. However, we expect that files will be replaced by databases for many applications. The sequence of transformations required to solve a problem can be effectively modeled as a Directed Acyclic Graph (DAG) for many practical applications of interest that this chapter is targeting.

Figure 4-1 describes a DAG consisting of 17 nodes, representing dependencies among 17 tasks of an application. For example, the arc from task E to task B represents the fact that the output generated by task E is utilized by task B. Each task involves transforming the data or storing an intermediate result for archiving. The time requirement for solving the entire DAG for large-scale applications may be of the order of hours to days, even assuming that each of the tasks is being executed on a cluster of workstations or a parallel supercomputer. DAGs have been widely used for the development of scheduling algorithms in the computer science literature [38, 105].

The general form of DAG scheduling has been shown to be NP-hard [91], and a number of heuristics have been proposed. Early research regarded communication costs as small [26, 39] or assumed a very simple interconnection network model, i.e., a fully-connected network model without contention [59, 60, 79, 97, 99]. The heterogeneous-earliest-finish-time (HEFT) algorithm extended for heterogeneous computing resources was proposed in [97].

For the distributed applications that we are targeting, the amount of data that needs to be transferred between tasks may be of the order of hundreds of gigabytes to multiple terabytes. Thus, the key challenge is to be able to schedule a workflow such that the total execution time and the communication costs are minimized. The former


requires mapping the tasks to appropriate machines while the latter requires the use of high bandwidth networks and effective scheduling of the communication bandwidth. The past research on scheduling DAGs (e.g. [38, 105]) is generally limited to solving compute intensive problems. In contrast, we will propose new algorithms to map tasks that have large data access requirements onto distributed heterogeneous clusters and supercomputers. For many applications, a node in the task graph can also represent multiple concurrent and interacting subtasks. If these subtasks are mapped to multiple machines, the required interaction has to be mapped onto the underlying network. For such DAGs, the precedence is between sets of multiple subtasks.

Figure 4-1. A DAG consisting of 17 nodes, representing dependencies among 17 tasks of an application. For example, the arc from task E to task B represents the fact that the output generated by task E is utilized by task B.

The extended list scheduling algorithms in [92] and [90], targeted at heterogeneous cluster architectures, address the network contention issue by various priority attributing schemes and assume that the path between any two processors is determined and fixed by the target system using conventional algorithms such as a breadth first search (BFS). On the other hand, similar work regarding DAG scheduling has been done in the


literature of grid computing. Generally, the term workflow is used interchangeably with DAG in the context of grid computing. A taxonomy of previous work on the workflow scheduling problem in grid computing was presented in [102]. The goal of the scheduling algorithms is to map the tasks and subtasks of all the applications on the grid such that the resources are effectively utilized, while the quality of service guarantees given to an application are respected. In this chapter, the actual formulation of the optimization goals will be presented. The network resource mapping can have the following characteristics:

1. Rigid vs. malleable: Rigid mapping is fixed bandwidth mapping over the time period of data transfer whereas malleable mapping allows for variable bandwidth. If there are no quality of service requirements such as a constant data rate, malleable mapping is a viable option to utilize network resources efficiently since solutions can be flexible as long as the total amount of data transmitted over time meets the data transmission requirement.

2. Single path vs. multiple paths: For many transfers, multiple paths can be effectively used to reduce the transfer time. However, finding a set of multiple paths requires more computation time, and thus efficient algorithms are needed.

3. Static vs. dynamic paths: In static mapping, paths determined at the start time of data transmission do not change until the end of data transmission, while in dynamic path mapping, paths can change dynamically over time.

The recent work on workflow scheduling in optical grids can provision network resources dynamically with a guarantee of specified bandwidth [66, 67, 69, 95, 101]. According to the above taxonomy, those methods use rigid and single path mapping. Moreover, the paths are assumed to be static.

The workflow scheduling problem can be classified into in-advance and on-demand scheduling depending on the reservation start time. If the reservation start time is the same as the job request arrival time, it is on-demand scheduling. On the other hand, the reservation start time is equal to or later than the job request arrival time in the case of in-advance scheduling. On-demand scheduling can also be regarded as a special case of in-advance scheduling where the reservation start time equals the job request arrival


time. In this chapter, we solve the in-advance workflow scheduling problem in e-Science networks, which are a mix of IP networks and optical networks. Our framework supports in-advance reservation and provides malleable mapping and dynamic paths. Further, we are able to exploit multiple paths, and our framework is applicable to both heterogeneous and homogeneous resources, especially network resources.

4.2 Workflow Scheduling in E-Science Networks

We develop workflow scheduling algorithms for e-Science networks. A real network topology and workflows are given as inputs for workflow scheduling algorithms. A real network topology is represented by a network resource graph where a node denotes a resource such as a compute resource or a router/switch that only forwards network traffic, and an edge denotes a physical link between two nodes. A workflow is represented by a task graph/DAG where a node denotes a task associated with a resource type and amount, and a directed edge connecting two nodes denotes a producer/consumer relation between them, i.e., a required data transfer from a source node to a destination node. A task in a task graph is executed only once and the execution order complies with the precedence constraints defined by the task graph. In this chapter, the goal of workflow scheduling algorithms is to map a node (task) and an edge (data transfer) in a task graph into a node and dynamic multiple paths in a network resource graph, respectively. The mapping of a node (task) in a task graph into a node in a network resource graph implies that the task is not splittable and the mapping does not vary over time. But the amount of resource allocation can vary over time. In contrast, the mapping of an edge (data transfer) in a task graph into dynamic multiple paths in a network resource graph means that the data transfer is fulfilled by multiple paths varying over time. The time model we assume is the uniform time slice model, which discretizes the timeline into time slices of uniform length. More detailed and formal definitions are given in the following sections.


4.2.1 System Model and Data Structure

4.2.1.1 Time model

The uniform time slice model is represented by the size of a time slice and by M, the maximum number of time slices the system would consider. The start and end times of time slice m are denoted by $T_m$ and $T_{m+1}$, respectively.

4.2.1.2 Network resource model

A network resource model is represented by $G_n = (V, E, r, TR, TB)$, where V and E are a set of nodes and a set of edges, respectively, $r_v$ denotes the resource type of a node v, and $TR_v$ and $TB_e$ denote the data structures for the resource availability of node v and edge e, respectively, over time. More specifically, we use time-resource (TR) or time-bandwidth (TB) arrays as data structures for managing resource availability over time. A TR or TB array is a set of $(a_m)$ where m is the index of a time slice and $a_m$ is the available amount of a resource over the period $[T_m, T_{m+1})$. These data structures are necessary for effective in-advance reservation of resources. The data structures of a TR array and a TB array are the same, except that a TB array represents only the network resource type whereas all other resource types are represented by TR arrays. Thus, a TB array is assigned to each edge and a TR array is assigned to each node in a network resource graph.

Figure 4-2 shows an example of a network resource graph. Each node represents one resource, and is associated with a resource type and a TR array, which tracks the resource availability over time. In Figure 4-2, nodes $V_1$ through $V_3$ are of resource type 1, and nodes $V_4$ and $V_5$ are of resource type 2. We assign a unique number to each resource type other than the network resource. For example, resource type 1 is a pure compute resource and resource type 2 is a database service resource. Each edge represents a physical link connecting two nodes, and is associated with a TB array, which tracks the network resource availability over time.
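The time model and the TR/TB arrays can be sketched as a small data structure. This is a minimal illustration, not the implementation used in this work: the names `TimeSliceModel`, `AvailabilityArray`, `slice_size`, `max_volume`, and `reserve` are hypothetical, and the sketch assumes that the time line starts at 0, so that $T_m$ is m times the slice size.

```python
from dataclasses import dataclass


@dataclass
class TimeSliceModel:
    """Uniform time-slice model with M slices of equal length."""
    slice_size: float  # length of one time slice (hypothetical name)
    M: int             # maximum number of time slices considered

    def start(self, m: int) -> float:
        """T_m, assuming the time line starts at 0."""
        return m * self.slice_size

    def end(self, m: int) -> float:
        """T_{m+1}, the end of slice m."""
        return (m + 1) * self.slice_size


class AvailabilityArray:
    """TR or TB array: avail[m] is a_m, the amount free over [T_m, T_{m+1})."""

    def __init__(self, model: TimeSliceModel, capacity: float):
        self.model = model
        self.avail = [capacity] * model.M

    def max_volume(self, m: int) -> float:
        """Largest amount that slice m can still carry: a_m * (T_{m+1} - T_m)."""
        return self.avail[m] * (self.model.end(m) - self.model.start(m))

    def reserve(self, m: int, amount: float) -> None:
        """Reserve `amount` of the resource in slice m (hypothetical helper)."""
        if amount > self.avail[m]:
            raise ValueError(f"slice {m}: requested {amount}, only {self.avail[m]} free")
        self.avail[m] -= amount


# A link with 10 units of bandwidth over 4 slices of length 2.
model = TimeSliceModel(slice_size=2.0, M=4)
tb = AvailabilityArray(model, capacity=10.0)
tb.reserve(1, 4.0)  # an in-advance reservation that touches only slice 1
```

Keeping one entry per slice makes an in-advance reservation a per-slice update, and the same class serves nodes (TR) and edges (TB) because the two arrays differ only in the resource type they track.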


Figure 4-2. An example of a network resource graph

4.2.1.3 Workflow model

A workflow can be represented by a task graph, which is a directed graph and formally defined as $G_t = (N, L, r, RN, RL, ST, Deadline)$. N and L represent a node set and an edge set, respectively. $r_i$ denotes the resource type of node $N_i$. $RN_i$ denotes the required amount of a resource of node $N_i$, and $RL_i$ denotes the required amount of data transfer between both end nodes of edge $L_i$. We assume all capacities of resources are normalized with regard to the base capacity, so that we can express the required or available amount of resources by rational numbers, multiples of the base capacity. ST is the start time of a workflow, which has to be taken into consideration for in-advance scheduling. Deadline is an optional parameter. If it is given, we may set the optimization objective to minimizing network resource consumption. Otherwise, we may set the optimization objective to minimizing the makespan of the workflow. Figure 4-3 shows an example of a task graph. A resource requirement and a resource type are associated with each node, and only a network resource requirement is associated with each edge.

4.2.2 Problem Statement

We solve in-advance workflow scheduling problems in e-Science networks, which are a mix of IP networks and optical networks. Even though optical networks, where a physical link carries multiple wavelengths, have inherent integrality of bandwidth,


we assume that the bandwidth of a network resource graph is infinitely divisible. The application of the algorithms developed in this chapter to optical networks is left to future work.

Figure 4-3. An example of a task graph

We develop our algorithms for two broad cases:

1. Single workflow: In this case, a single workflow is scheduled based on the available (fractional) resources. It is assumed that the previous workflows have already been scheduled and the goal is to optimize the performance characteristics of a single workflow.

2. Multiple workflows: In this case, multiple workflows are scheduled simultaneously. The expectation is that this will achieve better performance than scheduling one workflow at a time.

For both cases, the goal of our algorithms can be minimization of either network resource consumption or makespan (finish time). For simplicity, when deadlines for workflows are given, we set the objective to minimization of network resource consumption. Otherwise we set the objective to minimization of finish time. Therefore we have four problems in total: (1) minimization of network resource consumption for a single workflow, (2) minimization of finish time for a single workflow, (3) minimization of network resource consumption for multiple workflows, and (4) minimization of finish time for multiple workflows.

4.2.3 Construction of an Auxiliary Graph

We translate the workflow scheduling problem into a network flow problem. The multicommodity flow problem, which optimizes the cost of multiple commodities with different source and destination nodes flowing through the network, is a well-known


network flow problem. To formulate the workflow scheduling problem as a multicommodity flow problem, we first have to construct an auxiliary graph from the given network resource graph and task graph. The workflow scheduling problem is comprised of two mapping problems, a node mapping problem and an edge mapping problem onto a network resource graph. The goal of constructing an auxiliary graph is to convert the node mapping problem into an edge mapping problem, since the multicommodity flow problem can deal with only an edge mapping problem.

An illustrative example of the auxiliary graph corresponding to Figures 4-2 and 4-3 is shown in Figure 4-4. An auxiliary graph $G_A = (V_A, E_A, TB_A)$ is constructed as follows. First, we expand the network resource graph by duplicating each node and connecting the original one to the duplicated one. For convenience, let us call the original one a frontend node, and the duplicated one a backend node. For example, in Figure 4-4, the node $V_1$ is expanded into two nodes, $V_1'$ and $V_1''$, and a new edge connecting these two nodes is inserted with the associated TB array corresponding to $V_1$'s TR array. In this case, $V_1'$ is a frontend node and $V_1''$ is a backend node. This expansion converts a resource allocation problem into a network flow problem. The original topology of the network resource graph remains unchanged among the backend nodes of the expanded graph, as in Figure 4-4. Note that nodes that have no chance of being selected need not be expanded.

Second, we expand the task graph in the same way as we did the network resource graph. But we do not create any edge connecting nodes in the expanded task graph. Lastly, we interconnect the expanded network resource graph and the expanded task graph.

As mentioned above, two kinds of flows are needed to convert a general workflow scheduling problem into a network flow problem. One is the resource allocation flow for the purpose of resource allocation of each task (node) in a task graph. For example, in Figure 4-4, $N_1'$ is connected to all possible frontend nodes of the same


resource type as $N_1$ in the expanded network resource graph. Similarly, $N_1''$ is connected to all backend nodes corresponding to those frontend nodes. Thus, by constraining the flow from $N_1'$ to $N_1''$ demanding $N_1$'s resource requirement to take only a single path, we can solve the problem of resource allocation of each task (node).

The other is the data transfer flow for the purpose of data transfers between tasks. These flows are seamlessly modeled by multiple flows with different source and destination nodes in a typical multicommodity flow problem. The source node of a data transfer flow is set to the backend node corresponding to the source task in the task graph, and the destination node of a data transfer flow is set to the backend node corresponding to the destination task in the task graph. For instance, the data transfer requirement of 10 units between $N_1$ and $N_2$ in a task graph is modeled by a flow of 10 units of data between $N_1''$ and $N_2''$.

The auxiliary graph accounts for a situation where two tasks are mapped onto the same resource and thus the communication cost between them should be ignored. Since the bandwidth of interconnecting edges between an expanded network resource graph and an expanded task graph is set to infinity, the communication cost of tasks mapped onto the same resource will be near zero. Suppose that all tasks and resources are of the same type in Figure 4-4 and $N_1$ and $N_2$ are mapped onto $V_1$. Then the data flow between $N_1$ and $N_2$ will follow the path $N_1'' \to V_1'' \to N_2''$, which is composed only of edges with infinite bandwidth.

The space complexity of an auxiliary graph is summarized as follows: $|V_A| = 2(|V| + |N|)$, $|E_A| = |E| + |V| + 2|N||V|$.

4.3 MILP Formulation

The single or multiple workflow scheduling problem can be formulated as a mixed integer linear programming (MILP) problem, which is a variant of a multicommodity flow problem. The objective of the MILP problem can be minimizing the finish time or minimizing the total network resource consumption depending on whether the deadlines


for workflows are given or not. If a deadline is not imposed on a workflow, the user who requests the workflow job wants to get the job done as fast as possible. If a deadline is imposed on a workflow, the user will be satisfied as long as the deadline is met, which allows the system to utilize resources more efficiently in exchange for the delayed time.

Figure 4-4. An example of an auxiliary graph

The constraints of the MILP problem are composed of four parts: 1) multi-commodity flow constraints, 2) task assignment constraints, 3) precedence constraints, and 4) deadline constraints. Since we have transformed the workflow scheduling problem into a multi-commodity flow problem, the typical multi-commodity flow constraints remain valid. Additional multi-commodity flow constraints are added to account for malleable resource allocation. Secondly, the task assignment constraints are integer constraints to enforce that one task is mapped to only one resource node in the network topology graph. Thirdly, the precedence constraints ensure that the precedence relations of a workflow are obeyed. Finally, the deadline constraints apply only when deadlines for workflows are given. The notation for the MILP formulation is listed in Table 4-1.
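The auxiliary graph construction of Section 4.2.3 can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the construction code used in this work: the function and argument names are hypothetical, edges are stored as plain (endpoint, endpoint, kind) tuples without their TB arrays, and every resource node is expanded, including those that could never be selected.

```python
def front(x):
    """Frontend copy of a resource or task node."""
    return (x, "front")


def back(x):
    """Backend copy of a resource or task node."""
    return (x, "back")


def build_auxiliary_graph(res_type, res_links, task_type):
    """Build the node set and edge list of the auxiliary graph.

    res_type:  dict resource node -> resource type
    res_links: list of (u, v) physical links between resource nodes
    task_type: dict task -> required resource type
    """
    nodes, edges = set(), []

    # Step 1: duplicate every resource node; the new edge stands for the
    # node's own capacity (its TR array becomes the edge's TB array).
    for v in res_type:
        nodes.update((front(v), back(v)))
        edges.append((front(v), back(v), "resource"))

    # The original topology is kept among the backend nodes.
    for u, v in res_links:
        edges.append((back(u), back(v), "link"))

    # Steps 2 and 3: duplicate every task (no edge between its two copies)
    # and interconnect it with all resources of the matching type.
    for n, t in task_type.items():
        nodes.update((front(n), back(n)))
        for v, vt in res_type.items():
            if vt == t:
                edges.append((front(n), front(v), "interconnect"))  # infinite bandwidth
                edges.append((back(v), back(n), "interconnect"))    # infinite bandwidth
    return nodes, edges


# Three resources and two tasks, all of the same type.
res_type = {"V1": 1, "V2": 1, "V3": 1}
res_links = [("V1", "V2"), ("V2", "V3")]
task_type = {"N1": 1, "N2": 1}
nodes, edges = build_auxiliary_graph(res_type, res_links, task_type)
```

When all tasks and resources share one type, as in this example, the sizes agree with the space complexity stated in Section 4.2.3: 2(3 + 2) = 10 nodes and 2 + 3 + 2·2·3 = 17 edges. With several resource types the interconnecting edges are fewer, so the stated edge count acts as an upper bound.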


Table 4-1. Notation for problem formulation

Category          Notation   Description
Function          pred(v)    Returns the set of predecessors of node v
Constant or Set   J          $\{(s_j, d_j, F_j) \mid s_j, d_j \in V_A,\ 0 \le j \ldots$

and d denote the source and destination nodes of the job, and F denotes the required amount of flow (resource). ST and END, the start and end times of the job, are determined by workflow scheduling algorithms. The resource type does not have to be included in this tuple since a flow is forced to take a link of the same resource type due to the carefully chosen connecting edges between a network resource graph and a task graph. Three kinds of binary decision variables are introduced: x, y and z. The discrete nature of the problem is due to the fact that a task cannot be split and we have discrete time intervals to accommodate jobs. Binary decision variables $x^{j}_{lk}$ and $y^{j}_{lk}$ determine which resource is to be allocated to a non-split task. Regarding a job j corresponding to a task in a task graph, the flow of the job can take only one outgoing edge from the frontend node of a task and only one incoming edge into the backend node of a task in the auxiliary graph. These constraints reflect the non-split property of a task. $z^{j}_{m}$ indicates whether time slice m is used for the job j or not. These binary decision variables can be easily extended to the multiple workflow scheduling problem by using separate variables for each workflow.

4.3.1 Single Workflow

The complete formulation is presented in Figure 4-5. First of all, the problem can be optimized for either minimum finish time, $T_f$, or minimum network resource consumption, $\sum_{j \in J} \sum_{(l,k) \in E_A} \sum_{m=0}^{M-1} f^{j}_{lk}(m)$, as in Expression 4. Minimizing network resource consumption can be helpful for saving more resources for future arriving requests so that more requests can be accepted in the long term.

4.3.1.1 Multi-commodity flow constraints

The flow conservation rule at the nodes other than source and destination nodes is ensured by Constraint 4. The amount of flow (resource) to be allocated (reserved) is ensured by Constraint 4. Constraint 4 ensures that the amount of total flows on link (l, k) during time slice m should not exceed the maximum possible amount of flow during the time slice m, which is given by $b_{lk}(m)\,(T_{m+1} - T_m)$, where $b_{lk}(m)$


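Objective

\[ \text{minimize } T_f \quad \text{or} \quad \sum_{j \in J} \sum_{(l,k) \in E_A} \sum_{m=0}^{M-1} f^{j}_{lk}(m) \]

Multi-commodity flow constraints

\[ \sum_{k:(l,k) \in E_A} f^{j}_{lk}(m) - \sum_{k:(k,l) \in E_A} f^{j}_{kl}(m) = 0, \quad \forall j \in J,\ \forall l \in V_A,\ 0 \le m \ldots \]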

is the available bandwidth in the time slice m, and $T_{m+1}$ and $T_m$ are the end time and start time of the time slice m. Constraint 4 ensures that if the time slice m is not used for job j, the amount of flow for the job j during the time slice m is 0. But note that Constraint 4 should not be imposed on edges with infinite available bandwidth, as there is no cost required for communications between tasks assigned to the same resource. Otherwise, one time slice is allocated for such communications. Constraint 4 ensures that if the time slice m is used for job j, the start time of job j should be at most $T_m$, the start time of the time slice m. If multiple time slices are chosen for job j, this constraint enforces $ST_j$ to be less than or equal to the start time of the earliest time slice, which complies with the definition of $ST_j$. Similarly, the end time of job j is ensured to be greater than or equal to the end time of any time slice m in which the job is scheduled by Constraint 4.

4.3.1.2 Task assignment constraints

The second part of the constraints reflects the non-split property of tasks. Thus, Constraints 4 and 4 ensure that only one resource among possible candidate resources is assigned to one task. Constraint 4 relates the discrete selection of a resource with flow decision variables, which means that if a resource is chosen for a job j, there can be a flow on the related links.

4.3.1.3 Precedence constraints

After transforming all the tasks and data transfers into jobs in a network flow problem, we should ensure that precedence constraints inherent in a task graph are also embedded in the network flow problem. Accordingly, Constraint 4 ensures that the start time of jobs with no precedent jobs is set to the start time of a workflow. Constraint 4 ensures that the end time of a job is greater than or equal to the start time of the job. Constraint 4 ensures that the start time of a job is not before the end times of precedent jobs. Constraint 4 ensures that all the end times of jobs should be less


than or equal to the global finish time $T_f$. Constraints 4 and 4 ensure that data transfers between tasks occur between chosen resources.

4.3.1.4 Deadline constraints

Constraint 4 is optional depending on whether we have deadlines on workflows or not.

4.3.2 Multiple Workflows

The formulation for the single workflow scheduling problem can be easily extended to the multiple workflow scheduling problem by using separate variables for each workflow, as in Figure 4-6.

We can set the objective of the multiple workflow scheduling problem formulation to minimizing either the total sum of makespans of all workflows or the total network resource consumption of all workflows, as in Expression 4. The first term of Expression 4 indicates the total sum of makespans of all workflows. Even though we could optimize the finish time of the whole set of workflows by directly applying the objective, Expression 4, of the single workflow scheduling problem formulation, minimizing the finish time of the whole set may not contribute to the efficient resource scheduling of workflows whose timelines end far ahead of that finish time. For this reason, we choose to minimize the total sum of makespans. Yet, there still exists a concern that this objective cannot achieve balanced optimization for the makespan of every workflow. Suppose that each workflow is issued by a different user. From the perspective of the whole system, this objective can achieve balanced scheduling among workflows. But from the perspective of users, the makespan of a certain workflow can be sacrificed to achieve the minimum of the total sum of makespans by reducing the makespans of other workflows.

4.3.3 Time Complexity

The time complexity of a MILP problem depends on the number of decision variables and the number of constraints. To formally analyze the number of decision


Objective

\[ \text{minimize } \sum_{n=0}^{N-1} \left( T^{n}_{f} - WST^{n} \right) \quad \text{or} \quad \sum_{n=0}^{N-1} \sum_{j \in J} \sum_{(l,k) \in E_A} \sum_{m=0}^{M-1} f^{jn}_{lk}(m) \]

Multi-commodity flow constraints

\[ \sum_{k:(l,k) \in E_A} f^{jn}_{lk}(m) - \sum_{k:(k,l) \in E_A} f^{jn}_{kl}(m) = 0, \quad \forall j \in J,\ \forall l \in V_A,\ 0 \le m \ldots \]

variables and the number of constraints, we first define the following variables. $n_n$ and $m_n$ denote the number of nodes and the number of edges, respectively, of a network resource graph. $n_t$ and $m_t$ denote the number of nodes and the number of edges, respectively, of a workflow. $n_J$ denotes the number of jobs. If $n_A$ and $m_A$ represent the number of nodes and the number of edges, respectively, of the auxiliary graph, $n_A$ equals $2(n_n + n_t)$, and $m_A$ equals $m_n + n_n + 2 n_t n_n$ as described in Section 4.2.3. Table 4-2 shows the number of variables and constraints of the single workflow scheduling problem formulation. The flow variables f consist of flow variables of communication jobs and flow variables of non-communication jobs. The first part is accounted for by $(2n_n + m_n) m_t M$ because we need to consider flows on a network resource graph ($m_n$) and on interconnecting edges between a network resource graph and a task graph related to a job ($2n_n$). On the other hand, the second part is accounted for by $(3 n_n n_t) M$ because we need to consider flows only on interconnecting edges between a network resource graph and a task graph and on edges connecting frontend and backend nodes in $G_A$. For simplicity, let us assume that the network resource graph is fixed, as we conduct experiments by varying only the size of workflows in Section 4.6. Then the decision variable f is dominant in the number of decision variables, and the number of decision variables f is proportional to $(m_t + n_t) M$. As shown in Table 4-2, the number of constraints is also proportional to $(m_t + n_t) M$.

4.4 LP Relaxation

As the experimental results will show, the running time of MILP for workflow scheduling increases exponentially as the number of nodes of a workflow grows. The general workaround to solve the MILP problem fast enough to be useful in practice is linear programming relaxation, which transforms binary variables into real variables ranging between 0 and 1. We can turn the solution to the linear programming relaxation of the MILP problem into an approximate solution to the MILP problem via


techniques such as rounding.

Table 4-2. Single workflow scheduling formulation time complexity analysis

Variable/Constraint    Number of variables/constraints
f                      $((2n_n + m_n) m_t + 3 n_n n_t) M$
x                      $n_J n_n = (m_t + n_t) n_n$
y                      $n_J n_n = (m_t + n_t) n_n$
z                      $n_J M = (m_t + n_t) M$
ST                     $n_J = (m_t + n_t)$
END                    $n_J = (m_t + n_t)$
Constraint 4 + 4       $(n_t (2n_n + 2) + m_t (n_n + 2)) M$
Constraint 4           $(m_n + n_n) M$
Constraint 4           $(n_t n_n + m_t (m_n + n_n)) M$
Constraint 4, 4        $n_J M = (m_t + n_t) M$
Constraint 4, 4        $n_J = (m_t + n_t)$
Constraint 4           $n_J n_n = (m_t + n_t) n_n$
Constraint 4, 4, 4     $n_J = (m_t + n_t)$
Constraint 4           $2 m_t$
Constraint 4, 4        $2 m_t n_n$

We propose an LP relaxation (LPR) algorithm, consisting of two steps, for the workflow scheduling problem.

1. We determine which resources are selected for the tasks (nodes) of a task graph.

2. The next step is to iteratively determine the start and end times of jobs along with network resource allocations for data transfer jobs.

The detailed operations of the first-step algorithm are described in Algorithm 4-1. The goal of the first step is to determine the mapping of resources other than network, and the related binary variables, x and y. In the original MILP formulation, Constraints 4 and 4 ensure that only one x/y variable in the constraints becomes 1. Hence we can turn the solution of the LP relaxation problem into a solution of the original MILP problem by picking the variable with the maximum value and setting it to 1 and all the others to 0. In this step, we do not care about z variables, which are related to time slice assignment.
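The rounding used in the first step can be illustrated with a short sketch. It assumes that the LP solver's output has already been collected into a dictionary that maps each task to the fractional values of its candidate resources; the function name `round_assignment` and this data layout are hypothetical, and the sketch rounds one family of variables (for example x) at a time.

```python
def round_assignment(relaxed):
    """Round relaxed assignment variables to a 0/1 assignment.

    relaxed: dict task -> dict resource -> value in [0, 1], where the
             values of one task sum to 1 in the LP relaxation.
    Returns a dict of the same shape with exactly one 1 per task.
    """
    rounded = {}
    for task, values in relaxed.items():
        # Pick the candidate with the largest relaxed value; on a tie the
        # first candidate in dictionary order wins.
        best = max(values, key=values.get)
        rounded[task] = {res: int(res == best) for res in values}
    return rounded


relaxed = {
    "N1": {"V1": 0.7, "V2": 0.3},
    "N2": {"V1": 0.2, "V2": 0.5, "V3": 0.3},
}
result = round_assignment(relaxed)
```

In this example N1 is mapped to V1 and N2 to V2. The rounded values are then fixed, and only the z variables remain to be decided in the second step.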


Algorithm 4-1 First step - Determination of the mapping of tasks except data transfers
Input: A network topology graph $G_n$ and a workflow $G_t$
Relax all the binary variables of the MILP problem, i.e., x, y, and z variables.
Solve the LP relaxation of the MILP problem.
For the x and y variables, within each group of relaxed variables whose sum must be 1, find the maximum relaxed variable, set it to 1, and set all other variables in the group to 0.

With the solution of the first-step algorithm, we can determine the start and end times of jobs by solving small MILP problems iteratively over the unscheduled jobs. The basic idea is that finding a solution to the MILP problem with determined x/y binary variables and undetermined z binary variables for a small number of jobs, e.g., 3, takes little time. Thus, we can divide the problem into many small problems and solve them sequentially. To pick appropriate jobs, we use the same bottom level priority scheme as the list scheduling heuristic of Section 4.5. However, in our case, the node mapping is already determined. The detailed operations of the second-step algorithm are described in Algorithm 4-2.

Algorithm 4-2 Second step - Determination of the mapping of network resources
Input: A network topology graph $G_n$ and a workflow $G_t$ with fixed resource mapping obtained from Algorithm 4-1
while there are network jobs with unfixed end times do
    Pick 3 non-communication jobs and associated communication jobs.
    Solve the MILP problem, which has only those jobs and related z variables as binary variables.
    Update the start and end times of jobs affected by the solution.
end while

As for LP, the computation time is proportional to $p^2 q$ if $q \ge p$, where p is the number of decision variables and q is the number of constraints. The decision variable f is


dominant in the number of decision variables. Suppose that the network resource graph is fixed, as we conduct experiments by varying only the size of workflows in Section 4.6. To address the fast-growing running time with regard to the size of a workflow, we choose another form of multicommodity flow. There exist two kinds of LP formulations for the multicommodity flow problem, the node-arc form and the edge-path form. The MILP formulation in Figure 4-5 takes the node-arc form, which assigns a separate decision variable for a certain job on a certain link. In contrast, the edge-path form assigns a separate decision variable for a certain job on a certain path in a set of paths, P, which the job can take. Accordingly, if we limit the number of paths in the set P, we can reduce the number of decision variables, which improves the time complexity at the cost of solution accuracy. In [81], the authors showed that the edge-path formulation for bulk file transfers can lead to a near optimal solution with a reasonable time complexity by using a limited number of pre-defined paths. The edge-path form of the single workflow scheduling problem formulation is presented in Figure 4-7. We will refer to this edge-path based LP relaxation as LPREdge for the rest of this chapter.

The time complexity analysis for the edge-path form formulation is summarized in Table 4-3. Compared to the original formulation, the number of variables and constraints is much reduced. In particular, the time complexity of the edge-path form formulation is much less influenced by the size of a network resource graph, i.e., $m_n$ and $n_n$.

4.5 List Scheduling Heuristic

The extended list scheduling algorithm with the bottom level priority scheme achieves the best performance among priority schemes, including the top level priority scheme [92]. Even though the authors in [67] tried to enhance the performance considering the properties of a pipelined task graph, the new algorithm does not make much difference in the case of random workflows. Their results show that the new and classic algorithms produce almost the same makespans regarding workflows with up


Objective

\[ \text{minimize } T_f \quad \text{or} \quad \sum_{j \in J} \sum_{p \in P_j} \sum_{m=0}^{M-1} f^{j}_{p}(m) \]

Multi-commodity flow constraints

\[ \sum_{0 \le m \ldots} \]

to 60 tasks and the new algorithm performs at most 5-10% better regarding workflows with 80 to 100 tasks. Hence, we consider adapting the general extended list scheduling algorithm with the bottom level priority scheme to our random workflows.

Table 4-3. Edge-path form single workflow scheduling formulation time complexity analysis

Variable/Constraint    Number of variables/constraints
f                      $(k m_t + n_n n_t) M$
x                      $n_J n_n = (m_t + n_t) n_n$
y                      $n_J n_n = (m_t + n_t) n_n$
z                      $n_J M = (m_t + n_t) M$
ST                     $n_J = (m_t + n_t)$
END                    $n_J = (m_t + n_t)$
Constraint 4           $n_J M = (m_t + n_t) M$
Constraint 4           $(m_n + n_n) M$
Constraint 4           $k n_J M = k (m_t + n_t) M$
Constraint 4, 4        $n_J M = (m_t + n_t) M$
Constraint 4, 4        $n_J = (m_t + n_t)$
Constraint 4           $n_J n_n = (m_t + n_t) n_n$
Constraint 4, 4, 4     $n_J = (m_t + n_t)$
Constraint 4           $2 m_t$
Constraint 4, 4        $2 m_t n_n$

The direct application of the extended list scheduling algorithm proposed in [92] does not fit well into e-Science networks in three aspects. First, the algorithm of [92] allows links on the path to be available at different time periods as long as descendent links become available after the precedent links of a path. This assumption requires buffers at the ends of links and intervention of moderators controlling the start and the end of data transfer on each link. Second, [92] does not consider in-advance reservation, which means only the available bandwidth at the time when the request is made is taken into account for path computation. The extended list scheduling algorithm adapted for e-Science networks is described in Algorithm 4-3.
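The bottom level priority scheme is used above without a formal definition. The sketch below assumes the definition that is common in the list scheduling literature: the bottom level of a task is the length of the longest path from that task to an exit task, counting the task's own weight and the weights of the edges along the way. The function names and the dictionary-based graph encoding are hypothetical, and on heterogeneous resources the weights would have to be replaced by estimates such as averages.

```python
from functools import lru_cache


def bottom_levels(node_weight, edge_weight):
    """Return dict task -> bottom level.

    node_weight: dict task -> computation cost
    edge_weight: dict (u, v) -> communication cost of edge u -> v
    """
    successors = {n: [] for n in node_weight}
    for (u, v) in edge_weight:
        successors[u].append(v)

    @lru_cache(maxsize=None)
    def level(n):
        # Longest remaining path through any successor; 0 for an exit task.
        tail = max((edge_weight[(n, s)] + level(s) for s in successors[n]), default=0)
        return node_weight[n] + tail

    return {n: level(n) for n in node_weight}


def priority_order(node_weight, edge_weight):
    """Tasks in decreasing bottom level; with positive node weights a task
    always precedes its successors in this order."""
    levels = bottom_levels(node_weight, edge_weight)
    return sorted(levels, key=levels.get, reverse=True)


node_weight = {"A": 2, "B": 3, "C": 1}
edge_weight = {("A", "B"): 4, ("A", "C"): 1, ("B", "C"): 2}
levels = bottom_levels(node_weight, edge_weight)
order = priority_order(node_weight, edge_weight)
```

For this three-task graph the bottom levels are 12 for A, 6 for B, and 1 for C, so the tasks are considered in the order A, B, C.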


The changes necessary for adaptation to in-advance workflow reservations in e-Science networks are related to computing the data transfer time as part of the computation of the earliest finish time. The assumption regarding synchronized availability of links on a path is reasonable in e-Science networks and in-advance reservations pose another challenge. In the case of on-demand reservations, we can compute the data transfer time simply as the amount of data divided by the maximum available bandwidth of a path, where the maximum available bandwidth of a path is the minimum of the maximum available bandwidths of the links of the path. Lastly, the varying available bandwidth over time due to the nature of in-advance reservation requires careful handling of the data transfer time. We assume rigid mapping for the extended list scheduling, which means the allocated bandwidth of a path does not change over time. To find the data transfer finish time, we use the simple heuristic described in Algorithm 4-4. We will refer to the extended list scheduling algorithm adapted for e-Science networks as LS for the rest of this chapter.

Algorithm 4-3 The adapted extended list scheduling algorithm
Input: A network resource graph $G_n$ and a workflow $G_t$
Determine the priorities of all nodes in $G_t$ based on the bottom level priority scheme.
Order the nodes with respect to priorities while complying with precedence constraints.
for each node in the ordered list in decreasing order of priority do
    Find the resource node that allows the earliest finish time among all candidate nodes by virtually scheduling all incoming data transfers. // Network paths between two nodes are predetermined by BFS.
end for

4.6 Experimental Evaluation

4.6.1 Experiment Setup

We compare the performance of four algorithms, the optimal MILP algorithm, the LP relaxation algorithm, the edge-path form LP relaxation algorithm and the list


scheduling heuristic of Section 4.5 in terms of the makespan, i.e., the schedule length of workflows, and the computational time of algorithms. In the following, we refer to the MILP algorithm, the LP relaxation algorithm, the edge-path form LP relaxation algorithm, and the general extended list scheduling algorithm of Section 4.5 as MILP, LPR, LPREdge and LS, respectively.

Algorithm 4-4 Data transfer finish time computation algorithm
Input: A network resource graph $G_n$ and a data transfer specified by (source, destination, amount of data, start time)
for each time slice, in increasing order of start time, whose end time is greater than or equal to start time do
    // Basic interval refers to the time period within which the available bandwidth of links of a path is constant.
    AllocBW ← the available bandwidth of the time slice
    RemainingData ← the amount of data to transfer
    CurTimeSlice ← the time slice
    FinishTime ← the start time of the data transfer
    while RemainingData > 0 do
        if CurTimeSlice has at least as much available bandwidth as AllocBW then
            RemainingData ← RemainingData − the amount of data transferred in the current time slice
            Update FinishTime.
        else
            Exit while
        end if
        CurTimeSlice ← the next time slice
    end while
    if RemainingData = 0 then
        return FinishTime.
    end if
end for

We first compare the performance of all four algorithms with regard to workflows with a small number of nodes, 3. This experiment compares the non-optimal algorithms against the optimal algorithm. We then compare two algorithms, LPREdge and LS, with regard to workflows with a larger number of nodes ranging from 10 to 50

PAGE 91

with an increment of 10. The second experiment is to verify that our algorithm performs better than the heuristic algorithm in terms of makespan.

As a network resource graph, we use the Abilene network [11] (see Figure 4-8), which is deployed in practice. The resource capacities of the nodes of the network resource graph, as well as the bandwidth capacities of the edges, are randomly selected from a uniform distribution between 10 and 1024. For workflow generation, we can either generate workflows randomly [26, 28, 42, 59, 66] or synthesize workflows from a set of pre-determined workflows [60]. In our experiments, we use a random workflow generation method that depends on three parameters: the number of nodes, the average degree of nodes, and the communication-to-computation ratio (CCR). The number of nodes is varied according to the aforementioned experiments. The average degree of nodes is related to the level of parallelism of workflows and is fixed to 2. CCRs of 0.1, 1, and 10 are used to assess the impact of the communication factor on the performance of the algorithms. A larger CCR means a workflow is more data-intensive. The weights of the nodes of a workflow are randomly selected from a uniform distribution between 10 and 1024, in the same way as the resource capacities of the Abilene network are determined. Subsequently, the weights of the edges of a workflow are set to the CCR times a value drawn from the uniform distribution between 10 and 1024. One hundred trials were run for every combination of the workflow parameters (the number of nodes and CCR) and the chosen algorithm. We then averaged the results and plotted charts for performance evaluation. Even though we present formulations for both single workflow and multiple workflow scheduling, we have conducted experiments for single workflow scheduling only. This is because every multiple workflow scheduling instance may be transformed into a single big workflow scheduling instance.

As a MILP/LP solver, we used CPLEX, a popular commercial software package. The machines on which CPLEX was installed have the following specification: 2 GHz dual-core AMD Opteron(tm) Processor 280 and 7 Gbyte memory.
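As an illustration of the random workflow generation described above, the following minimal Python sketch draws node weights uniformly between 10 and 1024 and sets each edge weight to the CCR times a value drawn from the same range. The function name `generate_workflow`, the way edges are sampled to approximate the average degree, and the node numbering are assumptions made for this sketch; the generator actually used in the experiments is not specified at this level of detail.

```python
import random


def generate_workflow(num_nodes, avg_degree=2.0, ccr=1.0, seed=None):
    """Return (node_weights, edge_weights) for a random workflow DAG.

    Hypothetical helper: nodes are numbered 0..num_nodes-1 and every edge goes
    from a lower-numbered node to a higher-numbered one, so the graph is acyclic.
    """
    rng = random.Random(seed)

    # Computation weight of each task, uniform between 10 and 1024.
    node_weights = [rng.uniform(10, 1024) for _ in range(num_nodes)]

    # An average (undirected) degree d corresponds to d * n / 2 edges.
    all_pairs = [(u, v) for u in range(num_nodes) for v in range(u + 1, num_nodes)]
    num_edges = min(len(all_pairs), round(avg_degree * num_nodes / 2))
    chosen = rng.sample(all_pairs, num_edges)

    # Communication weight of each edge: CCR times a uniform draw in the same range.
    edge_weights = {edge: ccr * rng.uniform(10, 1024) for edge in chosen}
    return node_weights, edge_weights
```

Calling `generate_workflow(10, ccr=10.0)` would give a data-intensive workflow with 10 tasks and 10 dependencies. Note that this sketch does not force the graph to be connected, which a real generator may want to guarantee.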

Figure 4-8. The Abilene network

4.6.2 Results

We evaluate the performance of the workflow scheduling algorithms with regard to two metrics: the schedule length of workflows, i.e., makespan, and the computational (running) time. The detailed results are presented and explained in this subsection.

4.6.2.1 Schedule length of workflows

Comparison against optimal scheduling results. Since the optimal schedules for randomly generated workflows on the given network resource graph, the Abilene network, are not known ahead of time, the only way of evaluating the makespans of the non-optimal algorithms is to compare them against the makespan of the optimal algorithm.

In Figure 4-9, we can see that the performance of the non-optimal algorithms, i.e., LPR, LPREdge, and LS, is comparable to that of the optimal algorithm, MILP, when CCR = 0.1 and 1.0. However, as CCR grows to 10, the makespan of LS becomes roughly 2 times the makespan of MILP. In contrast, the makespan of LPR and LPREdge is at most 20% more than the optimal makespan.

Comparison between LPREdge and LS. As the general workflow scheduling problem is NP-hard, our corresponding formulation, MILP, requires exponential computational time as the size of workflows increases. For large workflows, it is impractical to determine the optimal makespan using the MILP algorithm. For this reason, we compare the makespans of only the non-optimal algorithms, LPREdge and
LS, in Figure 4-10. We can see that LPREdge yields considerably shorter makespans than LS: it achieves half the makespan of LS in some cases.

Figure 4-9. Makespan vs. CCR for all algorithms in the Abilene network when the number of nodes in a workflow is 3.

Figure 4-10. Makespan vs. CCR and the number of nodes in a workflow for LPREdge and LS in the Abilene network

4.6.2.2 Computational time

Comparison against optimal scheduling results. The running time of the optimal algorithm grows exponentially, as shown in Figure 4-11. This algorithm takes approximately 14 seconds when there are 3 nodes and CCR = 0.1. With 3 nodes, the running time becomes approximately 47 seconds when CCR = 10. When the number of
nodes is increased to 10 and CCR = 0.1, MILP takes more than 1,500 seconds. By contrast, LPREdge takes less than 5 seconds when there are 3 nodes and less than 150 seconds when the number of nodes is less than 50 (Figure 4-12).

Figure 4-11. Computational time vs. CCR for all algorithms in the Abilene network when the number of nodes in a workflow is 3.

Comparison between LPREdge and LS. The running time of the heuristic is a few seconds, whereas the running time of LPREdge increases linearly up to 150 seconds when the number of nodes is 50.

Figure 4-12. Computational time vs. the number of nodes in a workflow for LPREdge and LS in the Abilene network

If requests for workflow scheduling from users are on-demand and should be handled in real time, the computational time of LPREdge, compared with that of the fast greedy algorithm shown in the experiments, is not favorable. However, when the requests are in-advance, there is
enough time between the request arrival time and the request start time, and the centralized server is a more high-end machine, LPREdge is deployable in practice.

4.7 Summary

We have formulated workflow scheduling problems in e-Science networks, whose goal is to minimize either makespan or network resource consumption by jointly scheduling heterogeneous resources such as compute and network resources. The formulations differ from previous work in the literature in that they allow dynamic multiple paths for data transfer between tasks and more flexible resource allocation that may vary over time. In addition, the formulation for single workflow scheduling can be easily extended to multiple workflow scheduling. The computation time of the optimal formulation increases exponentially with the size of a workflow. Accordingly, the LP relaxation algorithm, referred to as LPR, has been developed for deployment in practice, based on the optimal algorithm through the common linear relaxation technique. We also propose the edge-path form LP relaxation algorithm, LPREdge, to improve time complexity.

The experimental results show that the makespan of LPR and LPREdge is comparable to that of the optimal algorithm, less than 20% longer, regardless of CCR for small workflows. In contrast, the general list scheduling algorithm, LS, performs roughly similarly to LPR and LPREdge when CCR = 0.1, but the performance gap between LPR/LPREdge and LS grows dramatically as CCR grows from 1 to 10. Data-intensive workflow scheduling, which is common in e-Science applications, can benefit from dynamic multiple paths and malleable resource allocation. In terms of computational time, the heuristic algorithm is the best because it requires only trivial computations. The LPR and LPREdge algorithms require more computation, which may take a few minutes when the number of nodes is 50. Infrequent workflow scheduling requests from users and a reasonable scheduling time between the arrival time and the start time of requests may relieve this burden a little.
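To make concrete why the heuristic's computations are light, the data transfer finish time step that LS relies on (Algorithm 4-4) can be sketched in Python as follows. The function names, the representation of time slices as (start, end, available bandwidth) tuples, and the requirement that consecutive slices be contiguous in time are assumptions of this sketch rather than part of the original pseudocode. The first helper covers the on-demand case, where the transfer time is the amount of data divided by the bottleneck bandwidth of the path.

```python
def on_demand_transfer_time(amount, link_bandwidths):
    """On-demand case: data amount over the minimum link bandwidth of the path."""
    return amount / min(link_bandwidths)


def transfer_finish_time(slices, amount, start_time):
    """Finish time found by the heuristic, or None if no starting slice works.

    slices: list of (start, end, available_bw) tuples sorted by start time,
            where available_bw is the path's bottleneck bandwidth in that slice.
    """
    for i, (s_start, s_end, bw) in enumerate(slices):
        if s_end <= start_time or bw <= 0:
            continue                       # slice is already over, or unusable
        alloc_bw = bw                      # rigid mapping: the rate stays fixed
        remaining = amount
        finish = max(start_time, s_start)  # cannot begin before the slice starts
        for c_start, c_end, c_bw in slices[i:]:
            if c_bw < alloc_bw or c_start > finish:
                break                      # rate not sustainable, or a gap in time
            capacity = alloc_bw * (c_end - finish)
            if remaining <= capacity:
                return finish + remaining / alloc_bw
            remaining -= capacity
            finish = c_end
    return None
```

For example, with slices `[(0, 10, 5), (10, 20, 2), (20, 40, 5)]` and 60 units of data starting at time 0, a rate of 5 cannot be sustained through the second slice, so the sketch falls back to a rate of 2 starting at time 10 and finishes at time 40. As in the pseudocode, the first feasible starting slice is accepted, so the result is not guaranteed to be the earliest possible finish time.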

To the best of our knowledge, the optimal algorithm, the MILP formulation, is the first algorithm that jointly schedules heterogeneous resources including network resources using dynamic multiple network paths and malleable resource allocation. The approximation based on the optimal algorithm achieves reasonable performance compared with the optimal algorithm in terms of schedule length (makespan). The application of these results to optical networks will be future work.

CHAPTER 5
CONCLUSIONS

We have developed a novel framework for provisioning a variety of e-Science applications that require complex workflows spanning multiple domains. Our framework provides guarantees on the performance while incurring minimal overhead, both necessary conditions for such a framework to be adopted in practice.

We developed an SDF-based model for iterative data-dependent e-Science applications that incorporates variable communication delays and temporal constraints, such as throughput. We formulated the problem as a variation of multi-commodity linear programming with the objective of minimizing network resource consumption while meeting temporal constraints.

We also proposed topology aggregation algorithms for e-Science networks. E-Science applications require higher quality intradomain and interdomain QoS paths, and some of them are distinguished from classic single-path single-job (SPSJ) applications. We defined a new class of requests, called multiple-path multiple-job (MPMJ), and proposed TA algorithms for the new class of applications. The proposed algorithms, star and partitioned star ARs, are shown to be significantly better than naive approaches.

Finally, we formulated workflow scheduling problems in e-Science networks, whose goal is to minimize either makespan or network resource consumption by jointly scheduling heterogeneous resources such as compute and network resources. The formulations differ from previous work in the literature in that they allow dynamic multiple paths for data transfer between tasks and more flexible resource allocation that may vary over time. The LP relaxation algorithm has been developed for deployment in practice, based on the optimal algorithm through the common linear relaxation technique. We also proposed the edge-path form LP relaxation algorithm to improve time complexity.

REFERENCES

[1] BER Network Requirements Workshop Final Report. Lawrence Berkeley National Laboratory, 2007. http://www.es.net/pub/esnet-doc/BER-Net-Req-Workshop-2007-Final-Report.pdf ; cited Sep. 2010.
[2] BES Network Requirements Workshop Final Report. Lawrence Berkeley National Laboratory, 2007. http://www.es.net/pub/esnet-doc/BES-Net-Req-Workshop-2007-Final-Report.pdf ; cited Sep. 2010.
[3] EarthScope: An Earth Science Program. EarthScope, 2007. http://www.earthscope.org/usarray/data_flow/archiving.php ; cited Sep. 2010.
[4] Enlightened Computing. MCNC, 2007. http://www.enlightenedcomputing.org/ ; cited Jan. 2008.
[5] GEANT2. DANTE, 2007. http://www.geant2.net/ ; cited Sep. 2010.
[6] CHEETAH: Circuit-switched High-speed End-to-End Transport Architecture. University of Virginia, 2008. http://www.ece.virginia.edu/cheetah/ ; cited Sep. 2010.
[7] FES Network Requirements Workshop Final Report. Lawrence Berkeley National Laboratory, 2008. http://www.es.net/pub/esnet-doc/FES-Net-Req-Workshop-2008-Final-Report.pdf ; cited Sep. 2010.
[8] Ultralight: An Ultrascale Information System for Data Intensive Research. National Science Foundation, 2008. http://www.ultralight.org ; cited Sep. 2010.
[9] CA*net 4. CANARIE, 2009. http://www.canarie.ca/canet4/index.html ; cited Sep. 2010.
[10] OSCARS: On-demand Secure Circuits and Advance Reservation System. U.S. Department of Energy, 2009. http://www.es.net/oscars ; cited Sep. 2010.
[11] Abilene. Internet2, 2010. http://abilene.internet2.edu/ ; cited Jan. 2009.
[12] e-Science. The U.K. Research Councils, 2010. http://www.rcuk.ac.uk/escience ; cited Sep. 2010.
[13] The Earth System Grid (ESG). University Corporation for Atmospheric Research, 2010. http://www.earthsystemgrid.org/ ; cited Sep. 2010.
[14] Energy Science Network (ESnet). Lawrence Berkeley National Laboratory, 2010. http://www.es.net ; cited Sep. 2010.
[15] The Globus Alliance. Globus, 2010. http://www.globus.org/
[16] Hybrid Optical and Packet Infrastructure. Internet2, 2010. http://www.internet2.edu/networkresearch/projects.html ; cited Sep. 2010.
[17] Internet2. Internet2, 2010. http://www.internet2.edu ; cited Sep. 2010.
[18] JGN II: Advanced Network Testbed for Research and Development. NICT, 2010. http://www.jgn.nict.go.jp ; cited Sep. 2010.
[19] JIVE. Joint Institute for Very Long Baseline Interferometry, 2010. http://www.jive.nl/ ; cited Sep. 2010.
[20] LHCNet: Transatlantic Networking for the LHC and the U.S. HEP Community. U.S. Department of Energy, 2010. http://lhcnet.caltech.edu/ ; cited Sep. 2010.
[21] National LambdaRail. U.S. research and education community, 2010. http://www.nlr.net ; cited Sep. 2010.
[22] NSF Global Environment for Network Innovations (GENI) Project. GENI, 2010. http://geni.net/ ; cited Sep. 2010.
[23] TeraGrid. National Science Foundation, 2010. http://www.teragrid.org/ ; cited Sep. 2010.
[24] UltraScienceNet. U.S. Department of Energy, 2010. http://www.csm.ornl.gov/ultranet/ ; cited Sep. 2010.
[25] User Controlled LightPath Provisioning. Communications Research Centre, 2010. http://www.uclp.ca/ ; cited Sep. 2010.
[26] Adam, Thomas L., Chandy, K. M., and Dickson, J. R. A comparison of list schedules for parallel processing systems. Commun. ACM 17 (1974). 12: 685.
[27] Ahuja, Ravindra, Magnanti, T., and Orlin, J. Network flows: theory, algorithms, and applications. Englewood Cliffs N.J.: Prentice Hall, 1993.
[28] Benoit, Anne, Hakem, Mourad, and Robert, Yves. Contention awareness and fault-tolerant scheduling for precedence constrained tasks in heterogeneous systems. Parallel Computing 35 (2009). 2: 83.
[29] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., and Weiss, W. An architecture for differentiated services. RFC 2475, IETF, 1998.
[30] Boutaba, R., Golab, W., Iraqi, Y., Li, T., and Arnaud, B. Grid-controlled lightpaths for high performance grid applications. Journal of Grid Computing 1 (2003). 4: 387.
[31] Braden, R., Clark, D., and Shenker, S. Integrated services in the internet architecture: An overview. RFC 1633, IETF, 1994.
[32] Brodnik, Andrej and Nilsson, Andreas. A static data structure for discrete advance bandwidth reservations on the Internet. Tech. Rep. Tech report
cs.DS/0308041, Department of Computer Science and Electrical Engineering, Luleå University of Technology, Sweden, 2003.
[33] Bunn, J. and Newman, H. Data-intensive grids for high-energy physics. Grid Computing: Making the Global Infrastructure a Reality. eds. F. Berman, G. Fox, and T. Hey. John Wiley & Sons, Inc, 2003.
[34] Burchard, Lars-O. Networks with advance reservations: applications, architecture, and performance. Journal of Network and Systems Management 13 (2005). 4: 429.
[35] Burchard, Lars-O. and Heiss, Hans-U. Performance issues of bandwidth reservation for grid computing. Proceedings of the 15th Symposium on Computer Architecture and High Performance Computing (SBAC-PAD'03). 2003.
[36] Burchard, Lars-O., Schneider, J., and Linnert, B. Rerouting strategies for networks with advance reservations. Proceedings of the First IEEE International Conference on e-Science and Grid Computing (e-Science 2005). Melbourne, Australia, 2005.
[37] Chen, BinBin and Primet, Pascale Vicat-Blanc. Scheduling deadline-constrained bulk data transfers to minimize network congestion. Proceedings of the Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGRID). 2007.
[38] Chung, Yeh-Ching and Ranka, S. Applications and performance analysis of a compile-time optimization approach for list scheduling algorithms on distributed memory multiprocessors. Supercomputing '92. Proceedings. 1992, 512.
[39] Coffman, E. G. and Graham, R. L. Optimal scheduling for two-processor systems. Acta Informatica 1 (1972). 3: 200.
[40] Curti, C., Ferrari, T., Gommans, L., van Oudenaarde, B., Ronchieri, E., Giacomini, F., and Vistoli, C. On advance reservation of heterogeneous network paths. Future Generation Computer Systems 21 (2005). 4: 525.
[41] DeFanti, T., de Laat, C., Mambretti, J., Neggers, K., and Arnaud, B. TransLight: A global-scale LambdaGrid for e-science. Communications of the ACM 46 (2003). 11: 34.
[42] Dick, Robert P., Rhodes, David L., and Wolf, Wayne. TGFF: task graphs for free. Proceedings of the 6th international workshop on Hardware/software codesign. Seattle, Washington, United States: IEEE Computer Society, 1998, 97.
[43] Mannie, E. (Ed.). Generalized multi-protocol label switching (GMPLS) architecture. RFC 3945, IETF, 2004.
[44] Erlebach, T. Call admission control for advance reservation requests with alternatives. Tech. Rep. TIK-Report Nr. 142, Computer Engineering and Networks Laboratory, Swiss Federal Institute of Technology (ETH) Zurich, 2002.
[45] Farrel, A. A Path Computation Element (PCE)-Based Architecture. 2006.
[46] Ferrari, Tiziana. Grid Network Services Use Cases from the e-Science Community. The Open Grid Forum, 2007.
[47] Foster, I. and Kesselman, C. The Grid: Blueprint for a New Computing Infrastructure. Morgan Kaufmann, 1999.
[48] Foster, I., Kesselman, C., Lee, C., Lindell, R., Nahrstedt, K., and Roy, A. A distributed resource management architecture that supports advance reservations and co-allocation. Proceedings of the International Workshop on Quality of Service (IWQoS'99). 1999.
[49] Govindarajan, R. and Gao, Guang. Rate-optimal schedule for multi-rate DSP computations. The Journal of VLSI Signal Processing 9 (1995). 3: 211.
[50] Govindarajan, R., Gao, Guang R., and Desai, Palash. Minimizing Buffer Requirements under Rate-Optimal Schedule in Regular Dataflow Networks. The Journal of VLSI Signal Processing 31 (2002). 3: 207.
[51] Guerin, R. and Orda, A. Networks with advance reservations: The routing perspective. Proceedings of IEEE INFOCOM 99. 1999.
[52] He, E., Wang, X., Vishwanath, V., and Leigh, J. AR-PIN/PDC: Flexible advance reservation of intradomain and interdomain lightpaths. Proceedings of the IEEE GLOBECOM 2006. 2006.
[53] Hutanu, Andrei, Allen, Gabrielle, Beck, Stephen D., Holub, Petr, Kaiser, Hartmut, Kulshrestha, Archit, Liška, Miloš, MacLaren, Jon, Matyska, Luděk, Paruchuri, Ravi, Prohaska, Steffen, Seidel, Ed, Ullmer, Brygg, and Venkataraman, Shalini. Distributed and collaborative visualization of large data sets using high-speed networks. Future Gener. Comput. Syst. 22 (2006). 8: 1004.
[54] J. Mambretti, et al. The Photonic TeraStream: enabling next generation applications through intelligent optical networking at iGRID 2002. Future Generation Computer Systems 19 (2003). 6: 897-908.
[55] Johnston, W. E., Metzger, J., O'Connor, M., Collins, M., Burrescia, J., Dart, E., Gagliardi, J., Guok, C., and Oberman, K. Network Communication as a Service-Oriented Capability. High Performance Computing and Grids in Action, Vol. 16. ed. L. Grandinetti. IOS Press, 2008.
[56] Jung, E., Li, Y., Ranka, S., and Sahni, S. Performance evaluation of routing and wavelength assignment algorithms for optical networks. IEEE Symposium on Computers and Communications. 2008.
[57] Jung, E., Li, Y., Ranka, S., and Sahni, S. An evaluation of in-advance bandwidth scheduling algorithms for connection-oriented networks. International Symp. on Parallel Architectures, Algorithms, and Networks (ISPAN). 2008.
[58] Karypis, George and Kumar, Vipin. MeTis: Unstructured Graph Partitioning and Sparse Matrix Ordering System, Version 2.0, 1995.
[59] Khan, A. A., Mccreary, C. L., and Jones, M. S. A Comparison of Multiprocessor Scheduling Heuristics. Parallel Processing, 1994. ICPP 1994. International Conference on. vol. 2. 1994, 243.
[60] Kwok, Y.-K. and Ahmad, I. Benchmarking the task graph scheduling algorithms. Parallel Processing Symposium, 1998. IPPS/SPDP 1998. Proceedings of the First Merged International ... and Symposium on Parallel and Distributed Processing 1998. 1998, 531.
[61] Lee, E. A. and Ha, S. Scheduling strategies for multiprocessor real-time DSP. Global Telecommunications Conference, 1989, and Exhibition. Communications Technology for the 1990s and Beyond. GLOBECOM '89., IEEE. 1989, 1279 vol. 2.
[62] Lee, Edward Ashford. A coupled hardware and software architecture for programmable digital signal processors (synchronous dataflow). Ph.D. thesis, University of California, Berkeley, 1986.
[63] Lehman, T., Sobieski, J., and Jabbari, B. DRAGON: A framework for service provisioning in heterogeneous grid networks. IEEE Communications Magazine (2006).
[64] Lewin-Eytan, L., Naor, J., and Orda, A. Routing and admission control in networks with advance reservations. Proceedings of the Fifth International Workshop on Approximation Algorithms for Combinatorial Optimization (APPROX 02). 2002.
[65] Li, Yan, Ranka, S., and Sahni, S. In-advance path reservation for file transfers in e-Science applications. Computers and Communications, 2009. ISCC 2009. IEEE Symposium on. 2009, 176.
[66] Liu, Xin. Application-Specific, Agile and Private (ASAP) Platforms for Federated Computing Services over WDM Networks. Ph.D. thesis, The State University of New York at Buffalo, 2009.
[67] Liu, Xin, Wei, Wei, Qiao, Chunming, Wang, Ting, Hu, Weisheng, Guo, Wei, and Wu, Min-You. Task Scheduling and Lightpath Establishment in Optical Grids. INFOCOM 2008. The 27th Conference on Computer Communications. IEEE. 2008, 1966.
[68] Lui, King-Shan, Nahrstedt, K., and Chen, Shigang. Routing with topology aggregation in delay-bandwidth sensitive networks. Networking, IEEE/ACM Transactions on 12 (2004): 17.
[69] Luo, Xubin and Wang, Bin. Integrated Scheduling of Grid Applications in WDM Optical Light-Trail Networks. Journal of Lightwave Technology 27 (2009). 12: 1785.
[70] Marchal, L., Primet, P., Robert, Y., and Zeng, J. Optimal bandwidth sharing in grid environment. Proceedings of IEEE High Performance Distributed Computing (HPDC). 2006.
[71] McDysan, D. E. and Spohn, D. L. ATM Theory and Applications. McGraw-Hill, 1998.
[72] Medina, A., Lakhina, A., Matta, I., and Byers, J. BRITE: an approach to universal topology generation. Modeling, Analysis and Simulation of Computer and Telecommunication Systems, 2001. Proceedings. Ninth International Symposium on (2001): 346.
[73] Munir, K., Javed, S., and Welzl, M. A Reliable and Realistic Approach of Advance Network Reservations with Guaranteed Completion Time for Bulk Data Transfers in Grids. Proceedings of ACM International Conference on Networks for Grid Applications (GridNets 2007). San Jose, California, 2007.
[74] Munir, K., Javed, S., Welzl, M., Ehsan, H., and Javed, T. An End-to-End QoS Mechanism for Grid Bulk Data Transfer for Supporting Virtualization. Proceedings of IEEE/IFIP International Workshop on End-to-end Virtualization and Grid Management (EVGM 2007). San Jose, California, 2007.
[75] Munir, K., Javed, S., Welzl, M., and Junaid, M. Using an Event Based Priority Queue for Reliable and Opportunistic Scheduling of Bulk Data Transfers in Grid Networks. Proceedings of the 11th IEEE International Multitopic Conference (INMIC 2007). 2007.
[76] Murthy, Praveen. Scheduling techniques for synchronous and multidimensional synchronous dataflow. Berkeley: Electronics Research Laboratory College of Engineering University of California, 1996.
[77] Naiksatam, Sumit and Figueira, Silvia. Elastic Reservations for Efficient Bandwidth Utilization in LambdaGrids. The International Journal of Grid Computing 23 (2007). 1: 1.
[78] Newman, H. B., Ellisman, M. H., and Orcutt, J. A. Data-intensive e-science frontier research. Communications of the ACM 46 (2003). 11: 68.
[79] Palis, M. A., Liou, Jing-Chiou, and Wei, D. S. L. Task clustering and scheduling for distributed memory parallel architectures. Parallel and Distributed Systems, IEEE Transactions on 7 (1996). 1: 46.
[80] Pelsser and Bonaventure. Path Selection Techniques to Establish Constrained Interdomain MPLS LSPs. 2006, 209.
[81] Rajah, K., Ranka, S., and Xia, Ye. Scheduling Bulk File Transfers with Start and End Times. Network Computing and Applications, 2007. NCA 2007. Sixth IEEE International Symposium on. 2007, 295.
[82] Rajah, Kannan, Ranka, Sanjay, and Xia, Ye. Scheduling Bulk File Transfers with Start and End Times. Computer Networks 52 (2008). 5: 1105.
[83] Rao, N. S., Carter, S. M., Wu, Q., Wing, W. R., Zhu, M., Mezzacappa, A., Veeraraghavan, M., and Blondin, J. M. Networking for large-scale science: Infrastructure, provisioning, transport and application mapping. Proceedings of SciDAC Meeting. 2005.
[84] Reiter, Raymond. Scheduling Parallel Computations. J. ACM 15 (1968). 4: 590.
[85] Ricciato, F., Monaco, U., and Ali, D. Distributed schemes for diverse path computation in multidomain MPLS networks. Communications Magazine, IEEE 43 (2005): 138.
[86] Rosen, E., Viswanathan, A., and Callon, R. Multiprotocol label switching architecture. RFC 3031, IETF, 2001.
[87] Sahni, S., Rao, N., Ranka, S., Li, Y., Jung, E., and Kamath, N. Bandwidth scheduling and path computation algorithms for connection-oriented networks. International Conference on Networking. 2007.
[88] Sarangan, V., Ghosh, D., and Acharya, R. Performance analysis of capacity-aware state aggregation for inter-domain QoS routing. Global Telecommunications Conference, 2004. GLOBECOM '04. IEEE 3 (2004): 1458 Vol. 3.
[89] Schelen, O., Nilsson, A., Norrgard, Joakim, and Pink, S. Performance of QoS agents for provisioning network resources. Proceedings of IFIP Seventh International Workshop on Quality of Service (IWQoS'99). London, UK, 1999.
[90] Sinnen, O. and Sousa, L. A. Communication contention in task scheduling. Parallel and Distributed Systems, IEEE Transactions on 16 (2005). 6: 503-515.
[91] Sinnen, Oliver. Task scheduling for parallel systems. Hoboken N.J.: Wiley-Interscience, 2007.
[92] Sinnen, Oliver and Sousa, Leonel. List scheduling: extension for contention awareness and evaluation of node priorities for heterogeneous cluster architectures. Parallel Computing 30 (2004). 1: 81.
[93] Sprintson, A., Yannuzzi, M., Orda, A., and Masip-Bruin, X. Reliable Routing with QoS Guarantees for Multi-Domain IP/MPLS Networks. INFOCOM 2007. 26th IEEE International Conference on Computer Communications. IEEE (2007): 1820.
[94] Stuijk, S., Geilen, M., and Basten, T. Throughput-Buffering Trade-Off Exploration for Cyclo-Static and Synchronous Dataflow Graphs. Computers, IEEE Transactions on 57 (2008). 10: 1331.
[95] Sun, Zhenyu, Guo, Wei, Wang, Zhengyu, Jin, Yaohui, Sun, Weiqiang, Hu, Weisheng, and Qiao, Chunming. Scheduling Algorithm for Workflow-Based Applications in Optical Grid. Journal of Lightwave Technology 26 (2008). 17: 3011.
[96] Thorpe, S. R., Stevenson, D., and Edwards, G. K. Using just-in-time to enable optical networking for grids. First ICST/IEEE International Workshop on Networks for Grid Applications (GridNets 2004). 2004.
[97] Topcuoglu, H., Hariri, S., and Wu, Min-You. Performance-effective and low-complexity task scheduling for heterogeneous computing. Parallel and Distributed Systems, IEEE Transactions on 13 (2002). 3: 260.
[98] Uludag, Suleyman, Lui, King-Shan, Nahrstedt, Klara, and Brewster, Gregory. Analysis of Topology Aggregation techniques for QoS routing. ACM Comput. Surv. 39 (2007): 7.
[99] Wang, Lee, Siegel, Howard Jay, Roychowdhury, Vwani P., and Maciejewski, Anthony A. Task Matching and Scheduling in Heterogeneous Computing Environments Using a Genetic-Algorithm-Based Approach. Journal of Parallel and Distributed Computing 47 (1997). 1: 8.
[100] Wang, Tao and Chen, Jianer. Bandwidth tree - A data structure for routing in networks with advanced reservations. Proceedings of the IEEE International Performance, Computing and Communications Conference (IPCCC 2002). 2002.
[101] Wang, Yan, Jin, Yaohui, Guo, Wei, Sun, Weiqiang, Hu, Weisheng, and Wu, Min-You. Joint scheduling for optical grid applications. Journal of Optical Networking 6 (2007). 3: 304.
[102] Wieczorek, Marek, Hoheisel, Andreas, and Prodan, Radu. Taxonomies of the Multi-Criteria Grid Workflow Scheduling Problem. Grid Middleware and Services. 2008. 237.
[103] Wiggers, M., Bekooij, M., Jansen, P., and Smit, G. Efficient computation of buffer capacities for multi-rate real-time systems with back-pressure. Hardware/software codesign and system synthesis, 2006. CODES+ISSS '06. Proceedings of the 4th international conference. 2006, 10.
[104] Xiong, Qing, Wu, Chanle, Xing, Jianbing, Wu, Libing, and Zhang, Huyin. A linked-list data structure for advance reservation admission control. ICCNMC 2005. 2005. Lecture Notes in Computer Science, Volume 3619/2005.
[105] Yang, Tao and Gerasoulis, Apostolos. PYRROS: static task scheduling and code generation for message passing multiprocessors. Proceedings of the 6th international conference on Supercomputing. Washington, D.C., United States: ACM, 1992, 428.
[106] Yang, Xi, Lehman, Tom, Tracy, Chris, Sobieski, Jerry, Gong, Shujia, Torab, Payam, and Jabbari, Bijan. Policy-Based Resource Management and Service Provisioning in GMPLS Networks. Proceedings of IEEE INFOCOM. 2006.
[107] Yannuzzi, M., Masip-Bruin, X., and Bonaventure, O. Open issues in interdomain routing: a survey. Network, IEEE 19 (2005): 49.
[108] Yoo, Younghwan, Ahn, Sanghyun, and Kim, Chong Sang. Link state aggregation using a shufflenet in ATM PNNI networks. Global Telecommunications Conference, 2000. GLOBECOM '00. IEEE. vol. 1. 2000, 481 vol. 1.
[109] Zhang, Z. L., Duan, Z., and Hou, Y. T. Decoupling QoS control from core routers: A novel bandwidth broker architecture for scalable support of guaranteed services. Proc. ACM SIGCOMM. 2000.
[110] Zheng, Jun, Zhang, Baoxian, and Mouftah, H. T. Toward automated provisioning of advance reservation service in next-generation optical internet. Communications Magazine, IEEE 44 (2006). 12: 68.

BIOGRAPHICAL SKETCH

Eun-Sung Jung received B.S. and M.S. degrees in electrical engineering from Seoul National University, Korea, in 1996 and 1998, respectively. His research interests include network optimization in connection-oriented networks and its applications to existing research networks.