Citation
Network Resource Provisioning in Research Networks

Material Information

Title:
Network Resource Provisioning in Research Networks
Creator:
Jung, Eun
Place of Publication:
[Gainesville, Fla.]
Publisher:
University of Florida
Publication Date:
Language:
english
Physical Description:
1 online resource (107 p.)

Thesis/Dissertation Information

Degree:
Doctorate ( Ph.D.)
Degree Grantor:
University of Florida
Degree Disciplines:
Computer Engineering
Computer and Information Science and Engineering
Committee Chair:
Ranka, Sanjay
Committee Co-Chair:
Sahni, Sartaj
Committee Members:
Peir, Jih-Kwon
Figueiredo, Renato J.
Chun, Paul W.
Graduation Date:
12/17/2010

Subjects

Subjects / Keywords:
Algorithms ( jstor )
Bandwidth ( jstor )
Communication models ( jstor )
File transfers ( jstor )
Flow charting ( jstor )
Heuristics ( jstor )
Linear programming ( jstor )
Propagation delay ( jstor )
Scheduling ( jstor )
Topology ( jstor )
Computer and Information Science and Engineering -- Dissertations, Academic -- UF
cloud, escience, network, optimization, workflow
Genre:
Electronic Thesis or Dissertation
born-digital ( sobekcm )
Computer Engineering thesis, Ph.D.

Notes

Abstract:
Advances in optical communication and networking technologies, together with the computing and storage technologies, are dramatically changing the ways scientific research is conducted. A new term, e-Science, has emerged to describe large-scale science carried out through distributed global collaborations enabled by networks, requiring access to very large scale data collections, computing resources, and high-performance visualization. E-Science application workflows are complex and require schedulable and high-bandwidth connectivity with known future characteristics. Moreover, these workflows have performance requirements or metrics that have not been considered by conventional networking. For example, a large file transfer may need a guaranteed total turnaround time and rate of progress. Given the long duration of many requests, the network resources available may change before a request is completed. We develop a novel framework for provisioning a variety of e-Science applications that require complex workflows that span multiple domains. Our framework provides guarantees on the performance while incurring minimal overhead, both necessary conditions for such a framework to be adopted in practice. ( en )
General Note:
In the series University of Florida Digital Collections.
General Note:
Includes vita.
Bibliography:
Includes bibliographical references.
Source of Description:
Description based on online resource; title from PDF title page.
Source of Description:
This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Thesis:
Thesis (Ph.D.)--University of Florida, 2010.
Local:
Adviser: Ranka, Sanjay.
Local:
Co-adviser: Sahni, Sartaj.
Statement of Responsibility:
by Eun Jung.

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
Copyright Jung, Eun. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Resource Identifier:
750257247 ( OCLC )
Classification:
LD1780 2010 ( lcc )


Full Text

PAGE 1

NETWORK RESOURCE PROVISIONING IN RESEARCH NETWORKS

By

EUN-SUNG JUNG

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2010

PAGE 2

© 2010 Eun-Sung Jung

PAGE 3

To my parents, my wife, Hyeseon, and my daughter, Lauren

PAGE 4

ACKNOWLEDGMENTS

First of all, I would like to thank my chair, Dr. Sanjay Ranka, and my co-chair, Dr. Sartaj Sahni. Since I started to work with them, they have inspired me, guided me through all the research, and given me invaluable advice, suggestions, comments and support with patience and generosity. I also would like to show my sincere gratitude to my supervisory committee members for insightful comments on my research. I would like to give my deepest gratitude to my family and friends. Without their help and support, this dissertation would not have been possible.

PAGE 5

TABLE OF CONTENTS

page

ACKNOWLEDGMENTS .... 4
LIST OF TABLES .... 7
LIST OF FIGURES .... 8
ABSTRACT .... 11

CHAPTER

1 INTRODUCTION .... 12
  1.1 Overview .... 12
  1.2 Target Networks and Services .... 14
  1.3 Problems Addressed and Our Contributions .... 16
    1.3.1 Bandwidth Allocation for Iterative Data-dependent Applications .... 16
    1.3.2 Topology Aggregation for E-Science Networks .... 17
    1.3.3 Workflow Scheduling .... 18
  1.4 Background and Related Work .... 19
  1.5 Outline of Dissertation .... 24
2 BANDWIDTH ALLOCATION FOR ITERATIVE DATA-DEPENDENT E-SCIENCE APPLICATIONS .... 26
  2.1 Overview .... 26
  2.2 Synchronous Dataflow for E-Science Applications .... 28
  2.3 Problem Formulation .... 33
    2.3.1 Illustrative Example .... 34
    2.3.2 Optimal Bandwidth Allocation with a Feasible Schedule .... 37
      2.3.2.1 Modeling communication delays .... 38
      2.3.2.2 Problem formulation .... 42
  2.4 Experimental Evaluation .... 45
  2.5 Summary .... 47
3 TOPOLOGY AGGREGATION .... 49
  3.1 Overview .... 49
  3.2 Related Work .... 52
  3.3 TA for Multiple-Path Multiple-Job (MPMJ) .... 54
    3.3.1 Problem Statement .... 54
    3.3.2 New Topology Aggregation Algorithms .... 56
      3.3.2.1 Full-mesh method .... 56
      3.3.2.2 Star method .... 57
      3.3.2.3 Partitioned star method .... 58
  3.4 Routing .... 59

PAGE 6

  3.5 Complexity Analysis .... 61
  3.6 Experimental Evaluation .... 62
    3.6.1 Bulk File Transfers in E-Science .... 62
    3.6.2 Experiment Testbed .... 63
    3.6.3 Performance Metrics .... 64
    3.6.4 Results .... 64
  3.7 Summary .... 65
4 WORKFLOW SCHEDULING .... 67
  4.1 Overview .... 67
  4.2 Workflow Scheduling in E-Science Networks .... 70
    4.2.1 System Model and Data Structure .... 71
      4.2.1.1 Time model .... 71
      4.2.1.2 Network resource model .... 71
      4.2.1.3 Workflow model .... 72
    4.2.2 Problem Statement .... 72
    4.2.3 Construction of an Auxiliary Graph .... 73
  4.3 MILP Formulation .... 75
    4.3.1 Single Workflow .... 78
      4.3.1.1 Multi-commodity flow constraints .... 78
      4.3.1.2 Task assignment constraints .... 80
      4.3.1.3 Precedence constraints .... 80
      4.3.1.4 Deadline constraints .... 81
    4.3.2 Multiple Workflows .... 81
    4.3.3 Time Complexity .... 81
  4.4 LP Relaxation .... 83
  4.5 List Scheduling Heuristic .... 86
  4.6 Experimental Evaluation .... 89
    4.6.1 Experiment Setup .... 89
    4.6.2 Results .... 92
      4.6.2.1 Schedule length of workflows .... 92
      4.6.2.2 Computational time .... 93
  4.7 Summary .... 95
5 CONCLUSIONS .... 97
REFERENCES .... 98
BIOGRAPHICAL SKETCH .... 107

PAGE 7

LIST OF TABLES

Table / page

2-1 Comparison between DSP and e-Science applications .... 30
2-2 Summary of system parameters of the visualization application .... 36
2-3 Notation for problem formulation .... 40
3-1 Time Complexity for MPMJ .... 61
3-2 Space Complexity for MPMJ .... 62
4-1 Notation for problem formulation .... 77
4-2 Single workflow scheduling formulation time complexity analysis .... 84
4-3 Edge-path form single workflow scheduling formulation time complexity analysis .... 88

PAGE 8

LIST OF FIGURES

Figure / page

2-1 An example of SDFG .... 29
2-2 A homogeneous SDFG converted from Figure 2-1(a) .... 33
2-3 A real example of e-Science applications [53] .... 35
2-4 An ESDFG model for Figure 2-3 .... 35
2-5 Modeling communication delay in a SDFG .... 39
2-6 Modeling communication delay in the case of multiple communication channels .... 41
2-7 More exploited parallelism in case of multiple communication channels .... 42
2-8 BAFS problem formulation in case of the conservative model .... 42
2-9 BAFS problem formulation in case of the optimistic model .... 43
2-10 The Abilene network .... 46
2-11 Rejection ratio vs. number of requests .... 47
3-1 An example of inter-domain QoS routing .... 51
3-2 An illustrative example for limitations of the line segment algorithm .... 54
3-3 Full-mesh AR .... 56
3-4 Star AR .... 57
3-5 Partitioned star AR .... 60
3-6 Earliest finish time on-line scheduling of multiple file transfers .... 63
3-7 Error ratio vs. the number of nodes .... 65
3-8 Normalized computational time vs. the number of source and destination nodes .... 65
4-1 A DAG consisting of 17 nodes, representing dependencies among 17 tasks of an application. For example, the arc from task E to task B represents the fact that the output generated by task E is utilized by task B. .... 68
4-2 An example of a network resource graph .... 72
4-3 An example of a task graph .... 73
4-4 An example of an auxiliary graph .... 76
4-5 Single workflow scheduling problem formulation via network flow model .... 79

PAGE 9

4-6 Multiple workflow scheduling problem formulation via network flow model .... 82
4-7 Edge-path form of single workflow scheduling problem formulation .... 87
4-8 The Abilene network .... 92
4-9 Makespan vs. CCR for all algorithms in the Abilene network when the number of nodes in a workflow is 3. .... 93
4-10 Makespan vs. CCR and the number of nodes in a workflow for LPREdge and LS in the Abilene network .... 93
4-11 Computational time vs. CCR for all algorithms in the Abilene network when the number of nodes in a workflow is 3. .... 94
4-12 Computational time vs. the number of nodes in a workflow for LPREdge and LS in the Abilene network .... 94

PAGE 10

List of Algorithms

2-1 A heuristic for BAFS problem .... 46
3-1 Full-mesh AR construction .... 56
3-2 Star AR construction .... 58
3-3 Partitioned star AR construction .... 59
4-1 First step - Determination of the mapping of tasks except data transfers .... 85
4-2 Second step - Determination of the mapping of network resources .... 85
4-3 The adapted extended list scheduling algorithm .... 89
4-4 Data transfer finish time computation algorithm .... 90

PAGE 11

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

NETWORK RESOURCE PROVISIONING IN RESEARCH NETWORKS

By

Eun-Sung Jung

December 2010

Chair: Sanjay Ranka
Cochair: Sartaj Sahni
Major: Computer Engineering

Advances in optical communication and networking technologies, together with the computing and storage technologies, are dramatically changing the ways scientific research is conducted. A new term, e-Science, has emerged to describe large-scale science carried out through distributed global collaborations enabled by networks, requiring access to very large scale data collections, computing resources, and high-performance visualization [12].

E-Science application workflows are complex and require schedulable and high-bandwidth connectivity with known future characteristics. Moreover, these workflows have performance requirements or metrics that have not been considered by conventional networking. For example, a large file transfer may need a guaranteed total turnaround time and rate of progress. Given the long duration of many requests, the network resources available may change before a request is completed.

We develop a novel framework for provisioning a variety of e-Science applications that require complex workflows that span multiple domains. Our framework provides guarantees on the performance while incurring minimal overhead, both necessary conditions for such a framework to be adopted in practice.

PAGE 12

CHAPTER 1
INTRODUCTION

1.1 Overview

Advances in optical communication and networking technologies, together with the computing and storage technologies, are dramatically changing the ways scientific research is conducted. A new term, e-Science, has emerged to describe large-scale science carried out through distributed global collaborations enabled by networks, requiring access to very large scale data collections, computing resources, and high-performance visualization [12]. Well-quoted e-Science (and the related grid computing [47]) examples include high-energy nuclear physics [33], radio astronomy [19], geoscience [3] and climate studies [13]. To support e-Science activities, a new generation of high-speed research and education networks has been developed. These include Internet2 [17], the Department of Energy's ESnet [14], National Lambda Rail [21], CA*net4 [9] in Canada, and the pan-European GEANT2 [5]. A large portion of all data traffic supporting U.S. science is carried by ESnet, Internet2, and National Lambda Rail [55].

E-science activities often need to transport large volumes of data at a very high rate among a large number of collaborating sites [33, 78], severely stressing network resources. For instance, the high-energy physics (HEP) data is expected to grow from the current petabytes (PB) (10^15 bytes) to exabytes (10^18 bytes) by 2015 [33]. Beyond the obvious need for large amounts of data to be transferred, e-Science requirements for network use are significantly different from the traditional network applications [7, 46, 55] in the following ways:

1. Need to support schedulable, long-duration workflows with performance guarantee: The underlying applications require schedulable, high-bandwidth, low-latency connectivity with known future characteristics or performance guarantees [41], for real-time remote visualization, interactions with instruments, distributed simulation or data analysis, etc. In a distributed workflow system that involves many entities such as distant parties, scientific instruments, computation devices, as well as complex feedback in various stages of the workflow, unintended delay due to

PAGE 13

lack of planning for future communication paths can ripple through the entire workflow environment, slowing down other participating systems as they wait for intermediate results, thus reducing the overall effectiveness [55].

2. Need to support a large number of network services with novel performance metrics: There are many different types of sciences and scientific activities, which require different types of network services tailored to the specific science activities. (Also see [1, 2, 7, 46, 55].) Moreover, many of the e-Science activities have performance requirements or metrics that have not been considered by conventional networking. Large file transfer may only be concerned with total turnaround time and the rate of progress; streaming consumer-producer type of jobs running at two different sites may require a minimum and maximum data transfer rate; fusion experiments may care about lowering the probability of failure in the experiments due to inadequate network services.

3. Need to support dynamic user and resource environment with high network efficiency: Given that each job can be a heavy hitter in terms of network resource consumption, the network must handle with great efficiency the dynamic arrivals of service requests, the changes in requirements, traffic pattern and access policies at different stages of experiments or collaboration. The efficiency requirement is especially important for the new-generation, high-speed, coarse-granular networks, such as the wavelength-based systems. In addition, given the long duration of many jobs, the resources available to a particular job or the network topology may change before the job is completed. The network services must be able to adapt to the resource changes by incorporating newly added resources (e.g., links or wavelengths) or falling back to alternative resources when the assigned resources are no longer available.

In short, e-Science activities need schedulable, high-bandwidth, flexible and evolving network services with novel performance guarantees, and the network needs to provide these services efficiently. There is a large body of research on how to provide quality-of-service (QoS) guarantees (e.g., IntServ [31], DiffServ [29], the ATM network [71], or MPLS [86]) for Internet-type networks. Those proposals do not consider advance reservations with start and end times. Bulk transfer is usually regarded as low-priority best-effort traffic, not subject to admission control (AC). AC and scheduling are decoupled from routing in that each connection has a single default path separately determined by a routing protocol; the routing protocol is usually oblivious of the jobs in the system. AC is myopic in that each network element on the path determines whether the connection can be accepted based on a comparison of the remaining link capacity

PAGE 14

at the node itself with the requested resource of the job alone. Once admitted, the path and resource allocation remain fixed throughout the lifetime of the connection.

To meet the needs of e-Science, we propose a framework for conducting advance reservations, admission control (AC) and scheduling of network service requests in research networks that (i) supports an evolving large array of network services required by or useful for e-Science collaboration; (ii) guarantees performance levels that are based on metrics relevant to the underlying applications; (iii) adapts to underlying changes in network topology, resources and user dynamics; and (iv) provides efficient utilization of the underlying resources.

1.2 Target Networks and Services

Recent network signaling protocols, such as Multiprotocol Label Switching (MPLS) and Generalized MPLS (GMPLS), allow applications to overcome deficiencies prevalent in existing routed TCP/IP protocols (e.g., the inability to guarantee bandwidth, or offer Quality of Service). Many high-bandwidth network projects currently are deploying these protocols in the research and academic domain. This is the case, for example, in the Internet2's HOPI testbed [16], the NSF-supported Ultralight [8], Teragrid [23], CHEETAH [6] and DRAGON [63] networks and the DOE-supported UltraScienceNet (USN) [24], ESnet [14] and LHCNet [20]. We expect these protocols to proliferate into the production and commercial network domain. As the first evidence, both the Internet2 and the DOE's ESnet have chosen to offer dedicated bandwidth capability and lightpaths using GMPLS and MPLS control plane techniques, developed in the OSCARS/BRUW projects [10]. The techniques provide a framework to automate the provisioning process for bandwidth and make it easier for users to access Service Oriented Bandwidth Management (SOBM) functions, compared to the current provisioning and bandwidth management practices, which are manual and labor-intensive.

Past and current projects on research networks have focused on addressing the following challenges [6, 10, 16, 63]: 1) Set up the high-speed data plane by a hybrid of

PAGE 15

IP packet-switching and optical circuit-switching technologies with a large footprint and sufficient connectivity by connecting the national labs and universities and peering with other networks, 2) Develop support for end-to-end high-speed circuits statically or on demand, which requires multi-domain interoperability, 3) Set up the basic control plane and develop signaling and control middleware for handling user requests and basic network resource reservation, 4) Develop end-to-end transport protocols for supporting high-speed channels and large volumes of data, 5) Ensure security by encryption, authentication, authorization (AAA), and 6) Ensure reliability.

Bulk transfer. Being able to transfer very large files is a priority in nearly all e-Sciences [1, 2, 7, 46, 55]. If the turnaround time is the performance metric that the user cares about, there is a great deal of flexibility in how the transfer can be carried out. For instance, the transfer of a 100 GB file can be completed in 8 seconds using ten 10 Gbps lightpaths (Internet2 links), or in 1 hour and 26 minutes using a 155 Mbps (OC-3) long-lasting SONET circuit. The transfer choice not only affects the job in question, but also other current or future jobs in complex ways. For large transfers with start and end time constraints, peak bandwidth assignment can lead to an undesirable phenomenon known as fragmentation [77], which in turn leads to low utilization of network resources. This occurs when some time intervals are lightly loaded but not long enough to accommodate new large jobs. Greater transfer flexibility is needed to combat this problem, such as time-varying bandwidth assignment and dynamic re-assignment.

Streaming workflow. For the nanomaterial sciences conducted at DOE's Center for Functional Nanomaterials, research often involves distributed collaboration among smaller research centers with different scientific instruments and capabilities [2]. Data are generally collected from several centers and/or are compared against each other. Then, a medium sized cluster of computers processes and analyzes the data. The visualization is done by special workstations equipped with large memory graphics cards to handle the large images and volumes of data from the output of the data processing.
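
The bulk-transfer figures quoted on this page (a 100 GB file, ten 10 Gbps lightpaths, one 155 Mbps OC-3 circuit) can be checked with a short calculation. The sketch below is only illustrative: it assumes decimal units (1 GB = 10^9 bytes), treats the link rate as fully usable, and ignores protocol overhead and propagation delay. The function name `transfer_time_seconds` is a hypothetical helper, not something defined in the dissertation.

```python
def transfer_time_seconds(size_bytes: float, rate_bps: float) -> float:
    """Idealized transfer time: payload bits divided by the link rate.

    Assumption: no protocol overhead, no propagation delay, full link use.
    """
    return size_bytes * 8 / rate_bps


GB = 10**9                 # decimal gigabyte (assumed unit convention)
file_size = 100 * GB       # the 100 GB file from the text

# Ten 10 Gbps lightpaths used in parallel: 100 Gbps aggregate.
ten_lightpaths = transfer_time_seconds(file_size, 10 * 10e9)

# One 155 Mbps (OC-3) circuit.
oc3 = transfer_time_seconds(file_size, 155e6)

hours, remainder = divmod(int(oc3), 3600)
minutes = remainder // 60
print(f"ten lightpaths: {ten_lightpaths:.0f} s")
print(f"OC-3 circuit:   {hours} h {minutes} min")
```

Under these assumptions the two results agree with the 8 seconds and 1 hour 26 minutes stated in the text, which illustrates how wide the range of feasible bandwidth assignments is for a single job.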

PAGE 16

The generated animation is then streamed to remote scientists' desktops, or in the case where the visualization is in stereo, to a 3D theater. The network requirements vary at each stage of the workflow.

Data-intensive workflow. Large-scale supercomputing is expected to produce data at a similar rate to large-scale experiments. In order to post-process the computed results, high throughput transfers are often required to stage the data at the related computational resources. Similarly, high-end scientific computing also processes large amounts of input data that, from a performance perspective, should be accessible as efficiently as possible. Local parallel file systems are well suited for supporting the demanded I/O capabilities, even when data has to be staged to the respective file systems. Community schedulers need to control multiple distributed computational resources in order to serve individual workflows.

1.3 Problems Addressed and Our Contributions

E-Science networks usually provide QoS guarantee, i.e., bandwidth guarantee through multi-protocol label switching (MPLS) and generalized multi-protocol label switching (GMPLS) to meet the requirements of e-Science applications, e.g., in-advance path reservations for high-volume data transfers. The distinctive features of e-Science applications compared with other distributed applications can be summarized in two keywords, network-centric and in-advance. Unlike other grid computing applications, scheduling of e-Science applications puts more focus on network resources or considers the network resource as most important among multiple resources such as compute resource and storage resource. Moreover, in-advance scheduling of e-Science applications satisfies the needs of users requesting periodic or predictable services.

1.3.1 Bandwidth Allocation for Iterative Data-dependent Applications

We present a framework for bandwidth scheduling of streaming e-Science applications. These applications include interactive visualization of simulations, large data streaming coordinated with job execution for producer-consumer applications,

PAGE 17

and networked supercomputing [46]. We have adapted the Synchronous Dataflow (SDF) model to model and analyze iterative data-dependent applications in e-Science. Synchronous dataflow was proposed in the late 1980s as a modeling method for digital signal processing (DSP) applications, but it ignores the communication delays. Our model incorporates the communication delays that are inherent in large-scale distributed applications. We have formulated the bandwidth allocation problem of iterative data-dependent e-Science applications with temporal constraints as a multi-commodity linear programming problem. It incorporates optimal rates and buffer minimization for streaming applications that can be represented by an SDFG. Our algorithms determine how much bandwidth is allocated to each edge while satisfying temporal constraints on collaborative tasks. Using the solution of the bandwidth allocation problem, buffer requirements for the schedule are obtained using procedures similar to the ones presented in [50]. To the best of our knowledge, this represents the first attempt to analyze the temporal behavior of collaboratively iterative tasks and to determine the optimal bandwidth allocations among distributed nodes.

1.3.2 Topology Aggregation for E-Science Networks

The network supporting e-Science applications typically comprises multiple domains. Each domain usually belongs to a different organization, and is managed based on different operational policies. In such cases, internal topologies of domains may not be visible to the others for security or other reasons. However, aggregated information of internal topology and associated attributes is advertised to the other domains.

A set of techniques to aggregate data to advertise outside one domain is called Topology Aggregation (TA). The aggregated data itself is termed the Aggregated Representation (AR). A survey of TA algorithms is presented in [98]. There exists a tradeoff between the accuracy and the size of AR. Hence, most algorithms proposed in

PAGE 18

the previous work tried to achieve the most efficient AR in terms of both accuracy and space complexity.

One can classify QoS path requests into two classes: single-path single-job (SPSJ) and multiple-path multiple-job (MPMJ), depending on the nature of requests. SPSJ corresponds to a situation in which requests for a single QoS path arrive and are scheduled in the order of arrival. In contrast, MPMJ corresponds to batch/off-line scheduling of multiple requests for multiple QoS paths. Many e-Science applications require simultaneous transfer of data from multiple sources and destinations. Also, each of these requests (e.g., file transfers) can be more efficiently supported by using concurrent multiple paths.

We show that existing TA approaches developed for SPSJ do not work well with MPMJ applications as they overestimate the amount of bandwidth that is available. We propose a maxflow based TA approach that is suitable for this purpose. Our simulation results demonstrate that our algorithms result in better accuracy or less scheduling time.

1.3.3 Workflow Scheduling

Workflow/Directed Acyclic Graph (DAG) scheduling has been shown to be NP-hard [91]. A number of practical heuristics have been developed for this problem. Most of these ignore the communication costs [26, 39] or assume a very simple interconnection network model, i.e., a fully-connected network model without contention [59, 60, 79, 97, 99]. The work in [97] proposed the heterogeneous-earliest-finish-time (HEFT) algorithm extended from the classic list scheduling for heterogeneous computing resources.

However, the advances in computing platforms ranging from clusters to grids and emerging clouds for data intensive applications have posed new challenges where network contention is an important issue that needs to be addressed. We propose to address this issue by formulating and solving the overall workflow scheduling that incorporates network contention and overheads of the large scale data transfers. In

PAGE 19

particular, we address the following issues for e-Science grids that have networks that are a mix of IP networks and optical networks:

- Malleable resource allocation.
- Dynamic multipath scheduling.
- Multiple workflows.

We have formulated workflow scheduling problems in e-Science networks, whose goal is minimizing either makespan or network resource consumption by jointly scheduling heterogeneous resources such as compute and network resources. The formulations are different from previous work in the literature in the sense that they allow dynamic multiple paths for data transfer between tasks and more flexible resource allocation that may vary over time. Moreover, our work is the first to formulate the workflow scheduling problem incorporating multiple paths as a mixed integer linear programming (MILP) problem. We also formulate a linear programming relaxation, LPR, of our MILP, an edge-path based LP relaxation, LPREdge, and a list scheduling heuristic, LS.

The experimental results show that the makespan of LPR schedules is much closer to optimal than that of LS schedules when the communication-to-compute ratio (CCR) is large. The LS algorithm performs roughly similar to the LPR algorithm when CCR = 0.1 and 1.0, but the performance gap of these non-optimal algorithms grows dramatically as CCR grows from 1 to 10. Our results indicate that data-intensive workflow scheduling, which is common in e-Science applications, will benefit from dynamic multiple paths and malleable resource allocation.

1.4 Background and Related Work

Ongoing research projects for supporting e-Science applications (e.g., HOPI [16], Ultralight [8], Teragrid [23], CHEETAH [6], DRAGON [63], ESnet [14], OSCARS/BRUW [10]) have mainly focused on setting up a fast data plane with a large footprint and sufficient connectivity and setting up a basic but functional control plane, such as developing signaling and control middleware for handling user requests for elementary

PAGE 20

network services, ensuring security and improving reliability. However, the control plane mechanisms lack sophisticated network service support or efficient service reservation algorithms. They normally only support fixed bandwidth guarantee by reserving circuits or lightpaths. Using such a restricted set of services or simplistic resource management algorithms to support diverse e-Science activities can lead to inefficient utilization of the network resources (especially for the new-generation, high-speed, coarse-granular networks, such as the wavelength-based systems) and/or fail to provide the level of performance required by those activities in desired but varied performance metrics.

Compared with the traditional QoS frameworks, such as IntServ [31], DiffServ [29], ATM networks [71], or MPLS [86], admission control and scheduling for research networks are recent concerns with limited published work. Prior work is either about dedicated path reservation, bulk data transfers or jobs that require minimum bandwidth guarantee (MBG). None has considered as rich a class of job types as we do.

Control plane protocols, architectures and tools. The NSF-supported DRAGON [63, 106] project develops control plane architecture and middleware for multi-domain traffic engineering and resource allocation, e.g., using GMPLS protocols [43] for setting up SONET circuits or lightpaths. It uses a centralized resource computation element per domain, which is responsible for computing paths. It supports advance reservations of label switched paths (LSP) on requested time periods. CHEETAH [6] is a similar project to DRAGON but is more traditional in that it focuses on simpler, distributed operations for path computation and bandwidth management to support high arrival rates of immediate connection requests. OSCARS [10] is the control plane project for DOE's ESnet, also similar to DRAGON. It develops and deploys a prototype service that enables on-demand provisioning of guaranteed bandwidth circuits for ESnet. HOPI [16] is a testbed project on research networks that examines how to provide network services in a hybrid network of shared IP packet switching and dynamically provisioned lightpaths.

PAGE 21

[52] presents an architecture for advance reservation of intra- and inter-domain lightpaths. GARA [48], the reservation and allocation architecture for the grid computing toolkit Globus [15], supports advance reservation of network and computing resources. [40] adapts GARA to support advance reservation of lightpaths, MPLS paths and DiffServ paths. Other related work in this category includes GridJIT [96], ODIN [54], [30] and [34]. Much of the objective, architecture framework and capabilities of the proposed project coincide with the NSF's GENI project [22], for instance, the use of network controllers and the support of network virtualization. Most of the above control-plane architectures and tools provide rudimentary AC and scheduling algorithms for simple job types. However, much more can be done to support more service types or improve the network resource utilization.

Path reservation. The ability to provide dedicated or on-demand circuits or lightpaths is currently the focus of many projects, including most aforementioned major research networks and associated projects, e.g., Internet2, ESnet, National Lambda Rail, GEANT2, UltraScienceNet (USN), HOPI, DRAGON, CHEETAH and OSCARS/BRUW. Further examples include User Controlled LightPaths (UCLP) [25], Enlightened [4], Japanese Gigabit Network II [18], LHCNet [20], and Bandwidth Brokers [109]. In our previous research work, we have proposed novel algorithms for advance path computation and bandwidth scheduling for connection oriented networks [87] that have considerably better performance [57]. In [56], we have extended these algorithms to incorporate the wavelength sharing and wavelength continuity constraints.

MBG service. Several earlier studies [32, 36, 89, 104] have considered AC at an individual link for the MBG (minimum bandwidth guarantee) job type with start and end times. The concern is typically about designing efficient data structures, such as a segment tree [32], for recording and querying the link bandwidth usage on different time intervals. Admission of a new job is based on the availability of the requested bandwidth between its start time and end time. [35, 44, 51, 100] and [36] tackle the more general

path-finding problem for the MBG class, but typically only for new requests, one at a time. The routes and bandwidth of existing jobs are unchanged. [64] considers a network with known routing in which each admitted job derives a profit. It gives approximation algorithms for admitting a subset of the jobs so as to maximize the total profit.

Bulk transfer. Recent papers on AC and scheduling algorithms for bulk transfer with advance reservations include [35, 37, 51, 70, 73, 75, 77, 82]. In [77], the AC and scheduling problem is considered only for the single-link case. Network-level AC and scheduling are considered to be outside the scope of [77]. As a result, multi-path routing and network-level bandwidth allocation and re-allocation have no counterpart in [77]. In contrast, we periodically re-optimize the bandwidth assignment for all the new and old jobs.

For a one-time scheduling problem, our recent work [82] conducts a detailed performance comparison between single-slice scheduling and multi-slice scheduling under various slice sizes, and between single-path routing and multi-path routing. We conclude that a small number of paths per job is usually sufficient to yield near-optimal throughput; multi-slice scheduling leads to significant performance (e.g., throughput) improvement. Other authors have also considered a similar problem but with different emphasis [37].

In [73, 75], the authors consider single-link AC or link-by-link AC under single-path routing. The AC uses heuristic algorithms instead of solutions to optimization problems. Based on its size and the deadline, the average required bandwidth of a bulk transfer job is computed. The AC is based on the job's average bandwidth requirement. The bandwidth of existing jobs may be re-allocated only for the single-link case.

The authors of [35] propose a malleable reservation scheme for bulk transfer, which checks every possible interval between the requested start and end times for the job and tries to find a path that can accommodate the entire job on that interval. The scheme

favors intervals with earlier deadlines. In [51], the computational complexity of a related path-finding problem is studied and an approximation algorithm is suggested. [70] starts with an advance reservation problem for bulk transfer, but converts it into a constant bandwidth allocation problem to maximize the job acceptance rate. All the requests are known at the time of AC; AC/scheduling is carried out only once. The bandwidth constraints are at the ingress and egress links only, and hence, there is no routing issue.

Grid/Utility/Cloud Computing. Network resource provisioning problems in e-Science networks share some design goals, such as the earliest finish time of a job, with resource management problems in grid/utility/cloud computing. However, the network resource provisioning problems in e-Science networks differ from those in grid/utility/cloud computing in that network resources, i.e., the bandwidth of links, are assumed to be guaranteed and manageable by emerging technologies such as MPLS and GMPLS. Such QoS-guaranteeing infrastructures for e-Science applications originate from the fact that common e-Science applications transport large volumes of data at very high rates. This difference opens a research area toward more elegant management of network resources, which can improve system performance.

Optical networks. E-Science networks are a mix of IP networks and optical networks. In optical networks, the bandwidth along a given link can be decomposed into multiple wavelengths. For this reason, optical networks have the following constraints.

Wavelength continuity constraint: This constraint forces a single lightpath to occupy the same wavelength throughout all the links that it spans. This constraint is not required when an optical network is equipped with wavelength converters. When such converters are present, the network is called a wavelength-convertible network.

Wavelength sharing constraint: For many deployments, it is most effective to consider the bandwidth on a link as consisting of integer multiples of a wavelength, with a single wavelength as the unit of assignment, i.e., one wavelength is occupied by only one reservation at any given point in time. It is worth noting that techniques

based on Time Division Multiplexing (TDM)/Wavelength Division Multiplexing (WDM) [110] allow for decomposing the bandwidth on a wavelength.

The related issues in the research area of optical networks are: the routing and wavelength assignment (RWA) problem, the virtual topology (VT) problem, the traffic grooming (TR) problem, and the task scheduling and lightpath establishment (TSLE) problem.

1.5 Outline of Dissertation

The remainder of this dissertation is organized as follows.

Chapter 2 describes an SDF-based model for iterative data-dependent e-Science applications that incorporates variable communication delays and temporal constraints, such as throughput. We formulate the problem as a variation of multi-commodity linear programming with an objective of minimizing network resource consumption while meeting temporal constraints. The resulting solution can then be used to derive buffer space requirements by previously developed algorithms in the context of DSP applications. Finally, an illustrative example of an e-Science application shows that the framework and algorithm we propose are valid for modeling and analyzing iterative data-dependent e-Science applications. The simulation results show that the optimal bandwidth allocation obtained from the linear program outperforms the bandwidth allocation by a simple heuristic in terms of the rejection ratio of requests.

Chapter 3 describes topology aggregation (TA) algorithms for e-Science networks. E-Science applications require higher-quality intra-domain and inter-domain QoS paths, and some of those are distinguished from classic single-path single-job (SPSJ) applications. We define a new class of requests, called multiple-path multiple-job (MPMJ), and propose TA algorithms for the new class of applications. The proposed algorithms, star and partitioned star ARs, are shown to be significantly better than naive approaches.

Chapter 4 describes efficient algorithms for workflow scheduling problems in e-Science networks, whose goal is minimizing either makespan or network resource

consumption by jointly scheduling heterogeneous resources such as compute and network resources. Our algorithms differ from previous work in the literature in that they allow dynamic multiple paths for data transfer between tasks and more flexible resource allocation that may vary over time. In addition, the formulation for scheduling a single workflow can be easily extended to a formulation for scheduling multiple workflows.

CHAPTER 2
BANDWIDTH ALLOCATION FOR ITERATIVE DATA-DEPENDENT E-SCIENCE APPLICATIONS

2.1 Overview

E-Science activities often require the transport of large volumes of data at very high rates among a large number of collaborating sites [33, 78], severely stressing network resources. For instance, the high-energy physics (HEP) data are expected to grow from the current petabytes ($10^{15}$) to exabytes ($10^{18}$) by 2015 [33]. Beyond the obvious need for large amounts of data to be transferred, e-Science requirements for network use are significantly different from the traditional network applications [7, 46, 55]. The underlying applications require schedulable, high-bandwidth, low-latency connectivity with known future characteristics or performance guarantees [41] for real-time remote visualization, interactions with instruments, distributed simulation or data analysis, and so on. In a distributed workflow system that involves many entities, such as distant parties, scientific instruments, computation devices, as well as complex feedback in various stages of the workflow, unintended delays due to a lack of planning for future communication paths can ripple through the entire workflow environment, slowing down other participating systems as they wait for intermediate results, thus reducing the overall effectiveness [55].

The focus of this chapter is on supporting e-Science applications that require streaming of information between sites. We present a framework for bandwidth scheduling of streaming e-Science applications. These applications include interactive visualization of simulations, large data streaming coordinated with job execution for producer-consumer applications, and networked supercomputing [46]. The main contributions are as follows:

1. We have adapted the Synchronous Dataflow (SDF) model to model and analyze iterative data-dependent applications in e-Science. Synchronous dataflow was proposed in the late 1980s as a modeling method for digital signal processing (DSP)

applications, but it ignores the communication delays. Our model incorporates the communication delays that are inherent in large-scale distributed applications.

2. We have formulated the bandwidth allocation problem of iterative data-dependent e-Science applications with temporal constraints as a multi-commodity linear programming problem. It incorporates optimal rates and buffer minimization for streaming applications that can be represented by an SDF graph (SDFG). Our algorithms determine how much bandwidth is allocated to each edge while satisfying temporal constraints on collaborative tasks. Using the solution of the bandwidth allocation problem, buffer requirements for the schedule are obtained using procedures similar to the ones presented in [50].

To the best of our knowledge, this represents the first attempt to analyze the temporal behavior of collaboratively iterative tasks and to determine the optimal bandwidth allocations among distributed nodes.

Ongoing research projects for supporting e-Science applications (e.g., HOPI [16], Ultralight [8], Teragrid [23], CHEETAH [6], DRAGON [63], ESnet [14], and OSCARS/BRUW [10]) have mainly focused on setting up a fast data plane with a large footprint and sufficient connectivity and setting up a basic but functional control plane, such as developing signaling and control middleware for handling user requests for elementary network services, ensuring security and improving reliability. However, the control plane mechanisms lack sophisticated network service support or efficient service reservation algorithms. They normally only support a fixed bandwidth guarantee by reserving circuits or lightpaths. Using such a restricted set of services or simplistic resource management algorithms to support diverse e-Science activities can lead to inefficient utilization of the network resources (especially for the new-generation, high-speed, coarse-granular networks, such as the wavelength-based systems) and/or fail to provide the level of performance required by those activities in desired but varied performance metrics.
The rest of the chapter is organized as follows. We provide a detailed description of SDF and its operational semantics and examine its applicability to e-Science applications in Section 2.2. We present an overall process of problem-solving, including

a mathematical formulation as a linear program and a discussion of the detailed deployment of the obtained solution in real systems, in Section 2.3. We show that our approach outperforms a naive heuristic, also given by us, in Section 2.4. Lastly, we conclude with a summary and discussion of the practicality of our dissertation in Section 2.5.

2.2 Synchronous Dataflow for E-Science Applications

The SDF model of computation was first proposed by Lee in [62]. The SDF model has been found to be very useful for expressing DSP applications that have the following features: infinitely looping execution, discretized communication expressed by tokens, and parallelism to be exploited for maximizing throughput. Most of the existing research for these problems is limited to deriving maximal rates and buffer minimization.

An SDFG is a directed graph defined by $G = (V, E, I, O, \tau, \mathcal{D})$, where $V$ and $E$ represent a set of nodes and a set of edges, respectively. Each node in an SDFG is called an actor and each edge is called a communication channel or channel. The notation is based on its earlier use in DSP applications comprising function blocks and the communication channels interconnecting them. An actor repeats its task infinitely, and each execution of its task is called a firing. In this chapter, we use the terms node and actor interchangeably, and likewise edge and channel. An actor can produce and consume data per channel at different rates, which are specified by the number of tokens.

The number of tokens is a positive integer. If multiple inputs and outputs are associated with an actor, it is assumed that the actor waits until all input buffers hold the tokens to be consumed and all output buffers are available. A homogeneous SDFG, where at most one token can be produced or consumed, is a special case of an SDFG.

The number of tokens that actors produce and consume is specified by the sets $I$ and $O$. $I$ is a set of numbers of tokens consumed by destination actors of edges, and $O$ is a set of numbers of tokens produced by source actors of edges. Thus, each edge $(u, v)$

is associated with two integer values, $I_{uv}$ and $O_{uv}$. Consider the sample SDFG shown in Figure 2-1(a). The edge $(u, v)$ has two associated integer values, $I_{uv}$ and $O_{uv}$, which are 1 and 2, respectively. This represents the fact that actor $u$ produces 2 tokens at each firing and actor $v$ consumes 1 token at each firing. In addition, $\tau$ is a set of execution times of actors, and the execution time of each firing of actor $i$ is denoted by $\tau_i$. Finally, the set $\mathcal{D}$ represents the initial numbers of tokens on edges, which are necessary for the start of iterative operations of an SDFG.

Figure 2-1. An example of SDFG

Using the known properties of a homogeneous SDFG allows us to derive the maximal computation rates as well as buffer requirements. Also, it can be shown that any arbitrary SDFG can be converted into a homogeneous SDFG, although this conversion may increase the size of the graph exponentially.

To adapt the SDFG model for e-Science applications, it is important to understand the key differences between e-Science and DSP applications. A summary of differences between DSP and e-Science applications is provided in Table 2-1. Unlike DSP applications, e-Science applications can be represented by acyclic graphs, have fixed start and end times, and have communication delays to be considered. The time unit of DSP applications is on the order of a few milliseconds, compared to the time unit of e-Science applications, which may range from a few hours to several days. Throughput is the most important objective in both DSP and e-Science applications. However, for DSP

applications, tradeoffs are between throughput and buffer size, while for e-Science, the tradeoff is generally between throughput and network resource requirements. The focus of our work is on optimizing these resources.

Table 2-1. Comparison between DSP and e-Science applications

Category | DSP application | e-Science application
Inter-task dependency | Cycles are allowed. | Usually acyclic.
Execution period | Infinite. | Finite.
Time unit | Small (a few milliseconds). | Ranges from small to large (a few minutes).
Compute resource | Unlimited. | Unlimited, or limited if compute resources should be co-allocated.
Communication delay | Assumed to be 0. | Needs to be considered.
Temporal constraints | Objective is maximizing computation rate (throughput). | Throughput.
Schedule | Static or dynamic. | Static.

Lee [61] divided scheduling of parallel computation defined by an SDFG into four classes: fully dynamic, static assignment, self-timed, and fully static. Fully dynamic scheduling schedules actors at run-time only. In static assignment, assignment of actors to processors is done off-line and a local run-time scheduler of each processor invokes actors assigned to the processor. In self-timed scheduling, the assignment and ordering of actors on each processor is determined off-line and the exact firing time is determined at run-time. In other words, the actor that will be executed by a certain processor waits for all input data to be available and is fired once all input data are ready. Finally, fully static scheduling determines all information off-line. Based on this classification, the target e-Science applications can be considered to be self-timed. A node of an SDFG for an e-Science application represents one site, such as a data server or a computing node. This implies that every actor is assigned to a unique processor that only manages that task.

As described earlier, an SDFG is represented by $G = (V, E, I, O, \tau, \mathcal{D})$. Since actors can produce or consume tokens at different rates, a feasible schedule should guarantee that tokens are not infinitely accumulated. In Figure 2-1(a), actor $u$ produces 2 tokens at each firing, while actor $v$ consumes 1 token. To prevent unbounded buffer growth, actor $u$ should be fired once for every two firings of actor $v$. Formally, this can be stated by the equation $r_u \cdot 2 = r_v \cdot 1$, where $r_u$ and $r_v$ denote the firing rates of actors $u$ and $v$, respectively. These kinds of equations are called balance equations or state equations. To solve balance equations formally, we need to define a topology matrix, where $e_i$ denotes the $i$-th edge and $O_{e_i}$ and $I_{e_i}$ denote the number of produced tokens and consumed tokens, respectively, on an edge $e_i$.

Definition 1 (Topology matrix). A topology matrix $\Gamma$ is a $|E| \times |V|$ matrix with entries
\[
\Gamma_{ij} =
\begin{cases}
O_{e_i} & \text{if edge } e_i = (v_j, v_k), \\
-I_{e_i} & \text{if edge } e_i = (v_k, v_j), \\
O_{e_i} - I_{e_i} & \text{if edge } e_i = (v_j, v_j), \\
0 & \text{otherwise.}
\end{cases}
\]

The topology matrix for Figure 2-1(a) is $\Gamma = \begin{pmatrix} 2 & -1 \end{pmatrix}$. The existence of a solution, as well as a method to solve the balance equations, can be shown using the following theorem.

Theorem 2.1 ([62]). A connected SDF graph with $n$ actors has a periodic schedule if and only if its topology matrix $\Gamma$ has rank $n - 1$. Further, if its topology matrix has rank $n - 1$, then there exists a unique smallest integer solution $q$ to the balance equations $\Gamma q = 0$. It can be shown that the entries in the vector $q$ are coprime.

Given the rates of actors obtained by Theorem 2.1, $\{r_1, r_2, \ldots, r_n\}$, one iteration is defined as a schedule containing $r_i$ firings of actor $i$. Figure 2-1(b) shows the optimal

schedule for the SDFG in Figure 2-1(a) when both actors $u$ and $v$ have self-dependency loops and the execution times of the actors are all 1.

Theorem 2.2 ([84]). For a homogeneous SDFG represented by $G = (V, E, I, O, \tau, \mathcal{D})$, the maximal computation rate of every node in the graph is given by
\[
\min_{\forall C} \frac{\sum_{(i,j) \in C} d_{ij}}{\sum_{i \in C} \tau_i},
\]
where $C$ is any cycle in the graph and $d_{ij} \in \mathcal{D}$ is the initial number of tokens on edge $(i, j)$.

Regardless of the SDFG type, i.e., homogeneous or multi-rate, the computation rate of an SDFG is defined as the number of iterations per unit time. The maximal computation rate of a homogeneous SDFG can be derived by examining all cycles in the graph. Theorem 2.2 says that the maximal computation rate of a homogeneous SDFG is bounded by the cycle with the minimum ratio of initial tokens to execution time. For a homogeneous SDFG, the maximal computation rate of an iteration equals the maximal computation rate of a node, since the number of firings of a node in one iteration is 1. For a multi-rate SDFG, however, we compute the maximal computation rate of a node in two steps. First, we compute the maximal computation rate of an iteration after converting the multi-rate SDFG into a homogeneous SDFG. Figure 2-2 shows the homogeneous SDFG converted from the multi-rate SDFG in Figure 2-1 when putting a self-dependency loop on each node. A node $u$ with a rate $r_u$ in a multi-rate SDFG is expanded to $r_u$ nodes in the homogeneous SDFG converted from the multi-rate SDFG [49]. Hence, the maximal computation rate of an iteration with regard to Figure 2-2 is $\frac{1}{2}$, through the equation
\[
\min\left\{ \frac{1}{1 \cdot r_u}, \frac{1}{1 \cdot r_v} \right\} = \min\left\{ 1, \frac{1}{2} \right\}.
\]

Next, we compute the maximal rate of each node by multiplying the maximal rate of an iteration by the rate of the node. In this example, the maximal rates of nodes $u$ and $v$

are $\frac{1}{2}$ ($= \frac{1}{2} \cdot r_u$) and 1 ($= \frac{1}{2} \cdot r_v$), respectively. In this chapter, we call the number of firings of a node per unit time the throughput of the node.

Figure 2-2. A homogeneous SDFG converted from Figure 2-1(a)

2.3 Problem Formulation

In this section, we propose an algorithm for determining efficient bandwidth allocations to edges of the original network topology graph while satisfying temporal constraints, such as throughput, required by an e-Science application whose data dependency is given by an SDFG. In addition, with these bandwidth allocations, we can minimize buffer size requirements and find the corresponding schedules.

The overall process of finding the full-fledged solution for an e-Science application is summarized as follows.

1. Discretization step: In this step, both time and data size are given discretized values: execution time, data transmission time, and data transfer size. Discretization is an important requirement for using the SDFG model. For target applications, a base unit for execution and communication times can be chosen and appropriate rounding can be performed. A base time unit should be fine-grained enough to differentiate each actor's execution time and temporal constraints. The base unit can be a few seconds to several hours, depending on the application.

2. Firing rates of actors: Using Theorem 2.1, firing rates of actors guaranteeing a well-behaved SDFG, i.e., free of deadlock and infinite buffer accumulation, can be calculated. In MATLAB, the firing rates of actors can be obtained through a simple operation, null(Γ, 'r'), where Γ is a topology matrix for the SDFG and null is a MATLAB function returning a solution, Z, for ΓZ = 0.

3. Path bandwidth selection: The e-Science applications are distributed, and connection-oriented communication paths among distributed nodes are set up on demand or in advance. The bandwidth of paths is guaranteed by network technologies, such as multi-protocol label switching (MPLS) and generalized

multi-protocol label switching (GMPLS). The communication delay of a path to which bandwidth is allocated on request, within the available bandwidth, is inversely proportional to the allocated bandwidth. Hence, path bandwidth allocation should also be taken into account since it can affect throughput as well as the total network resource consumption. A formal problem formulation will be presented in Section 2.3.2.2.

4. The amount of buffer space required is the total number of tokens queued on every edge. Clearly, different schedules can lead to different buffer space requirements. The following buffer minimization problem is shown to be NP-complete [76]: Given a homogeneous SDFG, is there a valid schedule for the SDFG whose buffer space requirements are less than a constant K? It is easier, however, to find the minimum buffer space when the computation rate is fixed, even though the problem is still NP-complete. Using an approach similar to Govindarajan [50] but adapted to e-Science applications, we use a two-phase approach: we first find the optimal solution for the bandwidth allocation problem, then use this solution to minimize the buffer requirements. After the solution to the BAFS problem (described in the next section) is obtained, we have to find exact schedules and minimize buffer space requirements. Since we choose a model where the communication delays are included in the execution time of actors, the previously developed algorithms for buffer space requirement minimization can be directly applied. The buffer space requirement minimization problem has been solved in the context of DSP applications in many papers [50, 103, 94].

5. Adjust for deployment in a real system: The implementation of the derived solution requires a few considerations. The generated solution consists of discretized values in terms of the properly chosen base time and data units. As long as we can ensure that the discretized problem has stricter constraints than the original problem, such as higher production rates and lower consumption rates, the resulting solution should be feasible. Additionally, in the absence of a global clock, synchronization issues need to be considered to force firings of tasks to follow the computed schedule. Self-timed scheduling may not achieve the maximal rate without the global clock if buffer space is limited and not properly synchronized with actors' schedules. However, for reasonable buffer sizes, the deterioration of the maximal rate will be small.

2.3.1 Illustrative Example

We pick the visualization application in [53] as a representative example of e-Science applications that can be modeled by an extended synchronous dataflow

graph (ESDFG). The visualization application shown in Figure 2-3(a) has a use-case scenario as follows.

Figure 2-3. A real example of e-Science applications [53]

For the demonstration in San Diego, CCT/LSU (Louisiana), CESNET/MU (Czech Republic) and iGrid/Calit2 (California) participated in a distributed collaborative session. The visualization front-end is located at LSU running Amira for the 3D texture-based volume rendering for distributed visualization. The visualization back-end (data server) also ran at LSU. The actual data set for the demonstration had a size of 120 Gbytes and contained $400^3$ data points at each time step (4 bytes data/point for a 256 Mbyte/time step).

In this chapter, we assume a more general model, similar to the use-case in [46], extended from this application such that data servers reside at different sites from computing sites. This general model can be abstracted as the diagram in Figure 2-3(b).

Figure 2-4. An ESDFG model for Figure 2-3

The system parameters of the visualization application are summarized in Table 2-2. If not explicitly mentioned, all the parameters are per one firing. The figures marked by bold type are parameters that are not explicitly given in [53], thus arbitrarily

chosen by us within a reasonable range of the associated hardware's performance. The discretized values for the parameters are computed with appropriately chosen base time and data units. For example, the data production speed of data centers, 2560 Mbyte/s, is discretized into 128 tokens per unit time since the base unit time is 50 ms and the rate of 2560 Mbyte/s equals the rate of 128 Mbyte/50 ms. The resultant ESDFG for the application is shown in Figure 2-4.

Table 2-2. Summary of system parameters of the visualization application

Site | Item | Continuous value | Discretized value
Data centers | Production | 2560 Mbyte | 128
Data centers | Execution time | 1 second | 1
Computing site at LSU | Consumption | 256 Mbyte | 256
Computing site at LSU | Production | 1 frame (1 Mbyte) | 1
Computing site at LSU | Execution time | 100 ms | 2
Visualizing site at San Diego | Consumption | 1 frame (1 Mbyte) | 1
Visualizing site at San Diego | Execution time | 100 ms | 2
Visualizing site at San Diego | Throughput | At least 5 frames/sec | 0.25
Visualizing site at Brno | Consumption | 1 frame (1 Mbyte) | 1
Visualizing site at Brno | Execution time | 50 ms | 1
Visualizing site at Brno | Throughput | At least 5 frames/sec | 0.25

Base time unit: 50 ms. Base data unit: 1 Mbyte.

Second, the firing rates of nodes are calculated from the topology matrix of the ESDFG, as described in Section 2.2.
\[
\Gamma =
\begin{pmatrix}
128 & 0 & -256 & 0 & 0 \\
0 & 128 & -256 & 0 & 0 \\
0 & 0 & 1 & -1 & 0 \\
0 & 0 & 1 & 0 & -1
\end{pmatrix}
\]

The solution for the rates of nodes is given by [2, 2, 1, 1, 1]. Each element of the solution vector corresponds to $r_1$ through $r_5$, respectively.

The next step is to formulate the problem as a linear program.
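As a rough cross-check of the firing rates above, the balance equations can be solved with exact rational arithmetic. The sketch below is only an illustration, not the procedure used in this chapter: the function name repetition_vector and the constant ESDFG_TOPOLOGY are hypothetical, and the code assumes the topology matrix has rank n - 1 as required by Theorem 2.1.

```python
from fractions import Fraction
from functools import reduce
from math import gcd


def repetition_vector(topology):
    """Return the smallest positive integer q with topology * q = 0.

    Assumes the matrix has rank n - 1 (n = number of actors), so the
    null space is one-dimensional, as required by Theorem 2.1.
    """
    rows = [[Fraction(x) for x in row] for row in topology]
    n = len(rows[0])
    pivot_cols = []
    r = 0  # index of the next pivot row
    for col in range(n):
        # Look for a row at or below r with a nonzero entry in this column.
        p = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        lead = rows[r][col]
        rows[r] = [x / lead for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        pivot_cols.append(col)
        r += 1

    free_cols = [c for c in range(n) if c not in pivot_cols]
    if len(free_cols) != 1:
        raise ValueError("topology matrix does not have rank n - 1")

    # Set the single free variable to 1 and read off the pivot variables.
    free = free_cols[0]
    q = [Fraction(0)] * n
    q[free] = Fraction(1)
    for row_index, col in enumerate(pivot_cols):
        q[col] = -rows[row_index][free]

    # Clear denominators, then divide by the gcd to get coprime entries.
    scale = reduce(lambda a, b: a * b // gcd(a, b), (x.denominator for x in q), 1)
    ints = [int(x * scale) for x in q]
    g = reduce(gcd, ints)
    ints = [x // g for x in ints]
    if any(x <= 0 for x in ints):
        raise ValueError("no positive solution: the graph is inconsistent")
    return ints


# Topology matrix of the ESDFG in the illustrative example (rows = edges).
ESDFG_TOPOLOGY = [
    [128, 0, -256, 0, 0],
    [0, 128, -256, 0, 0],
    [0, 0, 1, -1, 0],
    [0, 0, 1, 0, -1],
]

print(repetition_vector(ESDFG_TOPOLOGY))  # [2, 2, 1, 1, 1]
```

Exact fractions are used instead of floating point so that the result can be scaled to integers without rounding; the same function applied to the one-edge matrix [[2, -1]] of Figure 2-1(a) gives the rates 1 and 2 for actors u and v.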

2.3.2 Optimal Bandwidth Allocation with a Feasible Schedule

To include temporal constraints such as throughput, we define the extended SDFG (ESDFG) as follows.

Definition 2 (Extended SDFG (ESDFG)). An ESDFG is represented by $G = (V, E, I, O, \tau, \mathcal{D}, st, et, T)$, where $V, E, I, O, \tau, \mathcal{D}$ are the same as in an SDFG, $st$ and $et$ are the start and end times of the execution period of the SDFG, and $T$ is $\{(v, T_v) \mid v \in V, T_v \in \mathbb{R}\}$.

The set $T$ has elements of throughput constraints defined as two-tuples $(v, T_v)$, where $v$ is the node whose throughput should be equal to or greater than $T_v$. $st$ and $et$ are used for in-advance bandwidth reservations. Suppose that we maintain data structures for in-advance bandwidth reservations, such as time-bandwidth lists representing how the available bandwidth varies over time on each edge. We can then easily obtain a subgraph in which the available bandwidth on each edge is set to the largest bandwidth that is available throughout the period $[st, et)$, i.e., the minimum of the available bandwidth over that period. For example, if an edge $e_{ij}$ has available bandwidth 1 and 2 over the time periods $[0, 1)$ and $[1, 2)$, and $st$ and $et$ are given as 0 and 2, the edge $e_{ij}$ of the subgraph has a value of 1 as its available bandwidth. The BAFS problem formulation works on this subgraph if in-advance bandwidth reservations are considered. Informally, we can define the bandwidth allocation with a feasible schedule (BAFS) problem as follows: Given a network topology represented by $G = (V, E)$ and iterative data-dependent tasks represented by an ESDFG, $G_t = (V_t, E_t, I_t, O_t, \tau_t, st, et, T)$, what is the optimal bandwidth allocation with a feasible schedule that minimizes network resource consumption and meets temporal requirements?

The formal problem formulation will be presented below after a discussion of how to model communication delays in established paths for e-Science applications.
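The reduction of a time-bandwidth list to a single per-edge capacity for the window [st, et) can be sketched as follows. This is a minimal illustration under assumptions of our own: the list is taken to be a sequence of non-overlapping half-open intervals (start, end, bandwidth), and the name window_bandwidth is hypothetical.

```python
def window_bandwidth(profile, st, et):
    """Bandwidth that stays available on one edge throughout [st, et).

    profile is a list of (start, end, bandwidth) tuples describing
    non-overlapping half-open intervals [start, end).
    """
    if st >= et:
        raise ValueError("empty reservation window")
    covered = 0
    lowest = None
    for start, end, bandwidth in profile:
        overlap = min(end, et) - max(start, st)
        if overlap <= 0:
            continue  # interval lies outside the window
        covered += overlap
        lowest = bandwidth if lowest is None else min(lowest, bandwidth)
    if covered < et - st:
        raise ValueError("profile does not cover the whole window")
    return lowest


# The example from the text: bandwidth 1 on [0, 1) and 2 on [1, 2).
EDGE_PROFILE = [(0, 1, 1), (1, 2, 2)]
print(window_bandwidth(EDGE_PROFILE, 0, 2))  # 1
```

Applying this function to every edge yields the capacities of the subgraph on which the BAFS formulation operates; a segment tree or similar structure would be a natural replacement for the linear scan when the lists are long.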

2.3.2.1 Modeling communication delays

A communication delay is composed of four factors: processing delay, transmission delay, queueing delay, and propagation delay. Processing delay is associated with operations such as packetizing, and thus is proportional to the data size, as is transmission delay. Queueing delay is stochastic, and propagation delay is constant for a given link. In this chapter, we assume that e-Science applications run on dedicated networks, i.e., MPLS or GMPLS networks, where the paths are established using label switched paths (LSPs). For such scenarios, queueing delay and propagation delay can be ignored. We assume that transmission delay dominates the total delay. The processing delay can be incorporated into transmission delay because both kinds of delay are proportional to the data size. We thus optimize with regard to transmission delay.

We now investigate how to incorporate communication delays into an optimal computation rate problem given an SDFG. To the best of our knowledge, this has not been addressed in the literature on SDF modeling for DSP applications. Although communication delays have been considered in multiprocessor scheduling, the focus is mainly on the makespan of schedules, which is the total time taken to execute all the tasks specified by a precedence task graph, not on the throughput of infinitely repeated schedules. The target applications are e-Science applications whose data-dependent distributed nodes collaborate iteratively.

Figure 2-5(a) shows an SDFG consisting of two actors, u and v. Actor u produces 2 tokens per firing, and actor v consumes 1 token per firing. We assume that it takes 2 units of time for actor u to send 2 tokens to actor v. The value in parentheses inside a node indicates the execution time of the node.

There are two ways to integrate the communication delay within the SDF model.

1. The communication delay can be included in the execution time of producing actor u (Figure 2-5(b)).

2. The communication delay can be included by having a dummy actor c whose execution time is set to the communication delay (Figure 2-5(c)).
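The effect of the two options on the iteration rate can be sketched numerically. The sketch assumes, as in the example of Section 2.2, that every actor has a self-dependency loop with one initial token, so that the iteration rate is the minimum of 1/(r_i * tau_i) over all actors. The execution times of 1 for u and v are an assumption made for illustration (the text gives them only in the figure), and all function names are hypothetical.

```python
from fractions import Fraction


def iteration_rate(actors):
    """Iterations per unit time when each actor has a one-token self-loop.

    actors maps a name to (firings per iteration, execution time).
    """
    return min(Fraction(1, rate * time) for rate, time in actors.values())


def delay_in_producer(actors, producer, delay):
    """Option 1: add the communication delay to the producer's execution time."""
    out = dict(actors)
    rate, time = out[producer]
    out[producer] = (rate, time + delay)
    return out


def delay_as_dummy_actor(actors, producer, delay, dummy="c"):
    """Option 2: insert a dummy actor that fires as often as the producer."""
    out = dict(actors)
    rate, _ = out[producer]
    out[dummy] = (rate, delay)
    return out


# u fires once and v twice per iteration; execution times of 1 are assumed.
ACTORS = {"u": (1, 1), "v": (2, 1)}
DELAY = 2  # time units needed to send the 2 tokens of one firing of u

print(iteration_rate(delay_in_producer(ACTORS, "u", DELAY)))     # 1/3
print(iteration_rate(delay_as_dummy_actor(ACTORS, "u", DELAY)))  # 1/2
```

Under these assumed execution times, the dummy-actor variant yields the higher iteration rate because the transfer overlaps with the next firing of u, at the price of extra tokens held in buffers.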

Figure 2-5. Modeling communication delay in an SDFG

The first option (Figure 2-5(b)) implies that communication can occur right after tokens are produced in the producer's buffer and the producer cannot be fired again until the transfer of the produced tokens is done. This is the most conservative way of modeling a communication delay since the relation between the execution and communication is assumed to be synchronous. We call this model the conservative model in this chapter. If we are not sure how the program is implemented internally, we can take this conservative model to guarantee that the final solution meets the throughput requirements. The second option (Figure 2-5(c)) implies that communication can run independently of the producer. This, in general, can lead to higher buffer space requirements, but may result in a higher computation rate. As can be seen in Figure 2-5(c), the optimal schedule shows higher throughput compared to the optimal schedule in Figure 2-5(b). We call this model the optimistic model, as opposed to the conservative model, in this chapter. Either of these two models can be chosen arbitrarily for each node, and the details on how this issue can be dealt with in the problem formulation are presented in Section 2.3.

In some cases, there may be multiple outgoing communication channels (Figure 2-6(a)). As with a single communication channel, we can make a choice between two options: a conservative approach and an optimistic approach. The conservative approach adds max{communication delays of outgoing communication channels} to the execution time of

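a producer actor. Figure 2-6(b) shows such a case where the execution time of actor u increases by 3, max{2, 1, 3}. One of the drawbacks of this model is that it deters early executable actors from starting on their own schedules. For example, actor w in Figure 2-6(b) cannot be fired 1 unit time after u finishes its execution. Instead, it should wait 2 unit times more. The other approach, the optimistic one shown in Figure 2-6(c), is the same as in the case of a single communication channel. For each channel, a logical actor accounting for the corresponding communication delay is inserted between the original producer and consumer actors.

Table 2-3. Notation for problem formulation

Category | Notation | Description
Function | $v_t(v)$ | $v_t : Z \to Z$, maps a vertex $v$ in $V$ into a vertex in $V_t$.
Function | $Com(a)$ | $Com : V \to$ boolean, returns true if an actor $a$ is a dummy node to model communication delay.
Constant or Set | $G(V, E)$ | Original network topology.
Constant or Set | $G_t(V_t, E_t, I_t, O_t, \tau_t, st, et, T)$ | An ESDFG specifying iterative data-dependent tasks.
Constant or Set | $J_c$ | $\{(s_i, d_i) \mid s_i \in V, d_i \in V, (v_t(s_i), v_t(d_i)) \in E_t\}$. A set of communication jobs modeled by the conservative approach, defined by two-tuples of source and destination nodes.
Constant or Set | $J_o$ | $\{(s_i, d_i) \mid s_i \in V, d_i \in V, c \in V_t, (v_t(s_i), c) \in E_t, (c, v_t(d_i)) \in E_t\}$. A set of communication jobs modeled by the optimistic approach, also defined by two-tuples of source and destination nodes.
Constant or Set | $J$ | $J_c$ or $J_o$ depending on the approach.
Constant or Set | $s_j$ | $s_j \in V$, $j \in J_c \vee j \in J_o$, source node of job $j$.
Constant or Set | $d_j$ | $d_j \in V$, $j \in J_c \vee j \in J_o$, destination node of job $j$.
Constant or Set | $\tau_i$ | Execution time of node (actor) $i \in V_t$.
Constant or Set | $r_i$ | Rate of node (actor) $i \in V_t$.
Constant or Set | $I_j$ | $j \in J$, amount of data (number of tokens) consumed by actor $v_t(d_j)$.
Constant or Set | $O_j$ | $j \in J$, amount of data (number of tokens) produced by actor $v_t(s_j)$.
Constant or Set | $C_{lk}$ | Available bandwidth on edge $(l, k) \in E$ during the period $[st, et)$.
Constant or Set | $V_{tf}$ | A set of front-end nodes whose throughputs are of concern, $V_{tf} \subseteq V_t$.
Constant or Set | $T_d$ | Throughput requirement of node (actor) $d \in V_{tf}$, specified by users.
Variable | $R_{max}$ | The maximal computation rate of an iteration.
Variable | $t_d$ | Throughput of node (actor) $d \in V_{tf}$.
Variable | $f^j_{lk}$ | Flow of job $j$ on an edge $(l, k) \in E$.
Variable | $D_j$ | Allocated bandwidth for job $j$.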
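The two treatments of multiple outgoing channels discussed above can be sketched as small graph transformations. The delays 2, 1 and 3 for the channels to v, w and x are those of the Figure 2-6 example; the execution time of 2 for actor u is an assumption, and the function names and the "c_" prefix for dummy actors are hypothetical.

```python
def conservative_channels(exec_time, delays):
    """Fold the largest outgoing delay into the producer's execution time.

    Returns the new producer execution time and, per consumer, how long
    it waits after the producer's own computation has finished.
    """
    worst = max(delays.values())
    waits = {consumer: worst for consumer in delays}
    return exec_time + worst, waits


def optimistic_channels(exec_time, delays):
    """Leave the producer unchanged and add one dummy actor per channel.

    Returns the producer execution time, the dummy actors with their
    execution times, and the per-consumer waiting times.
    """
    dummies = {f"c_{consumer}": delay for consumer, delay in delays.items()}
    return exec_time, dummies, dict(delays)


DELAYS = {"v": 2, "w": 1, "x": 3}  # channel delays from actor u
U_EXEC_TIME = 2                    # assumed execution time of actor u

print(conservative_channels(U_EXEC_TIME, DELAYS))
print(optimistic_channels(U_EXEC_TIME, DELAYS))
```

In the conservative variant actor w waits 3 time units instead of 1, which is the two extra units mentioned in the text; the optimistic variant keeps each consumer's own delay but adds three actors to the graph.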

Figure 2-6. Modeling communication delay in the case of multiple communication channels

A more elaborate analysis of an actor's execution pattern may lead to more exact modeling, and Figure 2-7 shows in what cases and how we can improve our models. The semantics of SDF enforces that the output of an actor is generated at the end of the execution of the actor. Hence, in the case of Figure 2-6(a), actors v, w and x can start their own execution at least 2, 1 and 3 unit times after actor u's firing is done, which means actor x can start at time 5 if actor u is fired at time 0. However, suppose that the output data for actor x is generated at time 1. The communication delay on the channel between actors u and x can be adjusted by 2 as in Figure 2-7. The next procedures for incorporating communication delay into the SDF model will take either of Figure 2-6(b) and (c).

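Figure 2-7. More exploited parallelism in case of multiple communication channels

2.3.2.2 Problem formulation

The notation for the BAFS problem is summarized in Table 2-3. The BAFS problem can be formulated as the linear programming problems shown in Figures 2-8 and 2-9, for the conservative and the optimistic models, respectively.

Objective
\[
\text{minimize} \sum_{j \in J,\ (l,k) \in E} f^j_{lk}
\]
Multi-commodity flow constraints
\[
\sum_{k:(l,k) \in E} f^j_{lk} - \sum_{k:(k,l) \in E} f^j_{kl} = 0, \quad l \neq s_j,\ l \neq d_j,\ \forall j \in J
\]
\[
\sum_{j \in J} f^j_{lk} \le C_{lk}, \quad \forall (l,k) \in E
\]
\[
\sum_{k:(l,k) \in E} f^j_{lk} - \sum_{k:(k,l) \in E} f^j_{kl} =
\begin{cases}
D_j, & \text{if } l = s_j \\
-D_j, & \text{if } l = d_j
\end{cases}
\quad \forall l \in V,\ j \in J
\]
\[
0 \le f^j_{lk}, \quad \forall j \in J,\ (l,k) \in E
\]
\[
0 \le D_j
\]
Temporal constraints
\[
R_{max} \le \frac{1}{r_i \left( \tau_i + \frac{O_j}{D_j} \right)}, \quad i \in V_t,\ j \in J_c,\ (v_t(s_j), v_t(d_j)) \in E_t
\]
\[
t_d = R_{max}\, r_d, \quad d \in V_{tf}
\]
\[
T_d \le t_d \le \frac{r_d}{r_i \left( \tau_i + \frac{O_j}{D_j} \right)}, \quad d \in V_{tf},\ i \in V_t,\ j \in J_c,\ (v_t(s_j), v_t(d_j)) \in E_t
\]
\[
T_d (\tau_i D_j + O_j) \le \frac{r_d}{r_i} D_j, \quad d \in V_{tf},\ i \in V_t,\ j \in J_c,\ (v_t(s_j), v_t(d_j)) \in E_t
\]

Figure 2-8. BAFS problem formulation in case of the conservative model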
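The last, linearized temporal constraint of the conservative model can be solved for D_j in isolation, which gives a per-job lower bound on the allocated bandwidth. The sketch below illustrates only this single constraint, not the full linear program (routing and capacities are ignored), and it assumes that actor i is the producing actor of job j. The function name is hypothetical; the numbers are the discretized values of Table 2-2 and the rates [2, 2, 1, 1, 1].

```python
from fractions import Fraction


def min_bandwidth_conservative(T_d, r_d, r_i, tau_i, O_j):
    """Smallest D_j with T_d * (tau_i * D_j + O_j) <= (r_d / r_i) * D_j.

    Returns None when no finite bandwidth can meet the throughput T_d,
    i.e., when the producer's execution time alone is already too long.
    """
    slack = Fraction(r_d, r_i) - T_d * tau_i
    if slack <= 0:
        return None
    return T_d * O_j / slack


THROUGHPUT = Fraction(1, 4)  # 5 frames/sec with a 50 ms base time unit

# Job from a data center (rate 2, execution time 1, 128 tokens per firing)
# with a visualizing site (rate 1) as the front-end node.
print(min_bandwidth_conservative(THROUGHPUT, r_d=1, r_i=2, tau_i=1, O_j=128))  # 128

# Job from the computing site (rate 1, execution time 2, 1 token per firing).
print(min_bandwidth_conservative(THROUGHPUT, r_d=1, r_i=1, tau_i=2, O_j=1))  # 1/2
```

The bandwidths are expressed in tokens per base time unit. Because the objective minimizes the total flow, a bound of this kind would be expected to be tight in an optimal solution whenever the network has enough capacity to route the job.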


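$$
\begin{aligned}
&\textbf{Objective} \\
&\text{minimize} \sum_{j \in J,\,(l,k) \in E} f^{j}_{lk} \\
&\textbf{Multi-commodity flow constraints} \\
&\sum_{k:(l,k) \in E} f^{j}_{lk} - \sum_{k:(k,l) \in E} f^{j}_{kl} = 0, \quad l \ne s_j,\ l \ne d_j,\ \forall j \in J \\
&\sum_{j \in J} f^{j}_{lk} \le C_{lk}, \quad \forall (l,k) \in E \\
&\sum_{k:(l,k) \in E} f^{j}_{lk} - \sum_{k:(k,l) \in E} f^{j}_{kl} =
\begin{cases} D_j, & \text{if } l = s_j \\ -D_j, & \text{if } l = d_j \end{cases}
\quad \forall l \in V,\ j \in J \\
&0 \le f^{j}_{lk}, \quad \forall j \in J,\ (l,k) \in E \\
&0 \le D_j \\
&\textbf{Temporal constraints} \\
&R_{max} \le \frac{1}{r_i \frac{O_j}{D_j}}, \quad \text{if } Com(i) = \text{true and } j \in J^o,\ (vt(s_j), i) \in E_t \\
&t_d = R_{max}\, r_d, \quad d \in V_{tf} \\
&T_d \le t_d \le \frac{r_d}{r_i \frac{O_j}{D_j}}, \quad \text{if } Com(i) = \text{true and } d \in V_{tf},\ j \in J^o,\ (vt(s_j), i) \in E_t \\
&T_d\, O_j \le \frac{r_d}{r_i}\, D_j, \quad \text{if } Com(i) = \text{true and } d \in V_{tf},\ j \in J^o,\ (vt(s_j), i) \in E_t
\end{aligned}
$$

Figure 2-9. BAFS problem formulation in case of the optimistic model

The problem formulation allows the use of the multi-commodity flow problem, for which a variety of efficient solutions exist in the literature [27]. The major differences between a typical multi-commodity flow problem and this problem formulation are as follows:

1. The demand of each job is not a constant, but a decision variable. This determines how much bandwidth is allocated to a job, i.e., communication between producer and consumer actors.
2. The decision variables are constrained by temporal constraints pertaining to throughputs of actors.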
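The multi-commodity flow part of the formulation can be illustrated with a small feasibility check. The sketch below only verifies a candidate solution against the flow constraints; it does not solve the linear program. The function name, the dictionary-based data layout and the three-node example are assumptions made for illustration.

```python
def check_flow_constraints(nodes, capacity, jobs, flow, demand, tol=1e-9):
    """Return True if `flow` and `demand` satisfy the flow constraints.

    capacity: {(l, k): C_lk} for each directed edge of G(V, E)
    jobs:     {j: (s_j, d_j)}
    flow:     {(j, l, k): f^j_lk}, missing entries are treated as 0
    demand:   {j: D_j}, a decision variable rather than a constant
    """
    # Non-negativity of flows and demands.
    if any(v < -tol for v in flow.values()) or any(v < -tol for v in demand.values()):
        return False
    # Flows may only use edges of the topology.
    if any((l, k) not in capacity for (_, l, k) in flow):
        return False
    # Capacity: the flows of all jobs on an edge must fit within C_lk.
    for (l, k), cap in capacity.items():
        if sum(flow.get((j, l, k), 0.0) for j in jobs) > cap + tol:
            return False
    # Conservation and demand: net outflow is D_j at s_j, -D_j at d_j, else 0.
    for j, (src, dst) in jobs.items():
        for node in nodes:
            out_flow = sum(flow.get((j, node, k), 0.0) for (l, k) in capacity if l == node)
            in_flow = sum(flow.get((j, l, node), 0.0) for (l, k) in capacity if k == node)
            expected = demand[j] if node == src else -demand[j] if node == dst else 0.0
            if abs(out_flow - in_flow - expected) > tol:
                return False
    return True


# Hypothetical three-node topology with one job from a to c.
nodes = ["a", "b", "c"]
capacity = {("a", "b"): 10, ("b", "c"): 10, ("a", "c"): 5}
jobs = {0: ("a", "c")}
flow = {(0, "a", "b"): 4, (0, "b", "c"): 4, (0, "a", "c"): 5}

feasible = check_flow_constraints(nodes, capacity, jobs, flow, {0: 9})
wrong_demand = check_flow_constraints(nodes, capacity, jobs, flow, {0: 10})
objective = sum(flow.values())  # total bandwidth allocated over all edges
```

Here the two-hop path consumes bandwidth on two edges, so the objective value (13) exceeds the demand (9); this is the sense in which the objective penalizes longer paths.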


The objective of the linear programs (the first line of Figures 2-8 and 2-9) is to minimize network resource consumption, which is the total amount of bandwidth allocated to all edges in the original network topology. If the demands were constant values, the objective could be regarded as minimizing the average hop count of all jobs (communication channels), if we approximate the average hop count as $\frac{\text{total network traffic}}{\text{total demand}}$. However, since the demands are decision variables, this objective can be thought of as minimizing allocated network resources regardless of the average hop count of all jobs.

The constraints can be divided into two parts. The first part is typical of multi-commodity flow constraints. The flow conservation constraints in Figures 2-8 and 2-9 mandate that, for all jobs, the net flow at a node is zero, i.e., the incoming and outgoing flows of a node are balanced unless the node is a source or a destination. The capacity constraints mandate that the flow along any edge cannot exceed the capacity of the edge. The demand constraints ensure that the source and the destination of any job produce and consume, respectively, the flow of the job, $D_j$.

The second part concerns temporal constraints, guaranteeing the throughputs of front-end actors. As discussed earlier, Theorem 2.2 states that the maximum computation rate is limited by the cycle whose cost-to-time ratio is minimum. Accordingly, the bounds on $R_{max}$ in Figures 2-8 and 2-9 account for communication delays on outgoing edges of a certain node. Since the target application is an acyclic e-Science application, the cycles we should consider for the maximum computation rate are limited to self-dependency loops, where the number of tokens is 1 and the total execution times are $r_i(\tau_i + \frac{O_j}{D_j})$ and $r_i \frac{O_j}{D_j}$ for the conservative and optimistic models, respectively. The term $\frac{O_j}{D_j}$ accounts for communication delays of outgoing edges of actor $i$. In addition, since Theorem 2.2 is for homogeneous SDFs, considering the conversion from the given ESDFG to the homogeneous SDF [76], the execution time, $(\tau_i + \frac{O_j}{D_j})$ or $\frac{O_j}{D_j}$, should be multiplied by the firing rate $r_i$, as in the bounds on $R_{max}$. Since $R_{max}$ is the number of iterations per unit time and the firing rate is the number of firings per iteration, the throughput of a certain


node $d$ equals $R_{max} r_d$, as in the definitions of $t_d$ in Figures 2-8 and 2-9. The bound on $R_{max}$ and the definition of $t_d$ can be combined into the bounds $T_d \le t_d \le \ldots$ of the two figures, since the required throughput can be guaranteed if any $R_{max} r_d$ is greater than or equal to the $T_d$ specified by users. With a few transformations, the throughput constraints result in the final constraints of Figures 2-8 and 2-9, which are linear inequalities in $D_j$.

The solution of the linear program determines the optimal bandwidth allocation on edges; exact schedules and the associated buffer space allocation can then be computed as in [50].

2.4 Experimental Evaluation

There is no other research work on the BAFS problem in the context of grid computing. We compare our LP-based algorithm for the BAFS problem with a heuristic that simply uses the definition of throughput between two actors in the assignment process. We also compare the two approaches of our algorithm, the conservative and the optimistic ones. The heuristic is presented in Algorithm 2-1. It enumerates all the paths based on the throughput requirements, and computes delays on edges based on the assumption that all the delays of edges constituting a given path are the same. If a tighter delay is required while examining paths, the delay of the edge is updated to the tighter value. For example, in Figure 2-4, the throughput of node 3 comes from two paths: 0→2→3 and 1→2→3. Suppose that the communication delay on the edge (2,3) is 2, to achieve the throughput required by node 3 on the path 0→2→3. If the communication delay on the edge (2,3) should be 1 when considering the other path 1→2→3, the communication delay on the edge (2,3) is updated to 1.

The heuristic does not consider possible parallelism of tasks, or possible balanced bandwidth allocations on edges, since it computes delays by assuming all delays on a path to be the same.

We compare the two algorithms in terms of the rejection ratio of requests. The bandwidths of edges are randomly selected from a uniform distribution between 10 and 1024 units of


data per base unit time. We varied the number of requests from 1 to 16 on the Abilene network [11] (see Figure 2-10), and each request is a specified task graph (Figure 2-4). The nodes of a request were constrained to have a matching node in the original network topology graph, and the matching node is randomly assigned using a uniform distribution.

Algorithm 2-1 A heuristic for BAFS problem
Input: An ESDFG
1: Enumerate all the possible paths from front-end nodes to back-end nodes whose throughputs are specified by temporal requirements.
2: Initialize the delay on edge i, ed_i, as ∞.
3: for each path do
4:   Assume a same delay, d, on all the edges of the path.
5:   Compute d satisfying temporal requirements.
6:   if ed_i > d then
7:     ed_i = d
8:   end if
9: end for
10: Compute the bandwidth on each edge based on ed_i.

Figure 2-10. The Abilene network

The experimental results are shown in Figure 2-11. Both of our LP-based approaches achieve better rejection ratios, which are lower than that of the heuristic by 5 to 30%.
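Algorithm 2-1 can be sketched in a few lines. Step 5 is not spelled out in the algorithm, so the sketch assumes a hypothetical per-path delay budget (`delay_budget`) that is divided equally among the edges of the path; the function name and the input layout are likewise illustrative. Step 10 uses the relation that a communication delay equals the number of tokens divided by the allocated bandwidth.

```python
import math


def heuristic_bandwidths(paths, delay_budget, tokens):
    """Sketch of Algorithm 2-1.

    paths:        list of paths, each a list of edges (u, v)
    delay_budget: total communication delay allowed on each path
                  (hypothetical input standing in for step 5)
    tokens:       {edge: O_j}, data sent over the edge per firing
    """
    edge_delay = {}
    for path, budget in zip(paths, delay_budget):
        d = budget / len(path)  # same delay d on all the edges of the path
        for edge in path:
            # Keep the tightest delay seen so far (initially infinity).
            if edge_delay.get(edge, math.inf) > d:
                edge_delay[edge] = d
    # Communication delay is O_j / D_j, so the bandwidth is D_j = O_j / delay.
    return {edge: tokens[edge] / delay for edge, delay in edge_delay.items()}


# The two paths into node 3 discussed in the text: 0->2->3 and 1->2->3.
paths = [[(0, 2), (2, 3)], [(1, 2), (2, 3)]]
delay_budget = [4, 2]  # assumed budgets giving per-edge delays of 2 and 1
tokens = {(0, 2): 10, (1, 2): 10, (2, 3): 10}  # assumed token counts

bandwidths = heuristic_bandwidths(paths, delay_budget, tokens)
```

With these assumed inputs, the shared edge (2,3) first receives delay 2 from the path 0→2→3 and is then tightened to 1 by the path 1→2→3, which doubles its bandwidth relative to edge (0,2).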


Figure 2-11. Rejection ratio vs. number of requests

The drawback of the heuristic is two-fold. First, it does not consider the fact that schedules of iterations can be overlapped. Second, it cannot allocate bandwidth to links interconnecting nodes according to the current network status, i.e., the current available bandwidth on each link. In contrast, the optimistic approach, which assumes that data transfers can also occur in parallel with computational executions, uses the least amount of network resources to achieve the throughput requirements given by users. Consequently, the optimistic approach leads to the lowest rejection ratio of requests, since a request has a better chance of being accepted when it requires less bandwidth, and the following requests benefit from a less loaded network.

2.5 Summary

The dedicated network on which e-Science applications operate guarantees that a certain path can have a reserved bandwidth over a given period, which means the communication delays vary depending on the allocated bandwidths. We develop an SDF-based model for iterative data-dependent e-Science applications that incorporates variable communication delays and temporal constraints, such as throughput. We formulate the problem as a variation of multi-commodity linear programming with the objective of minimizing network resource consumption while meeting temporal constraints. The resulting solution can then be used to derive buffer space requirements by previously developed algorithms in the context of DSP applications. Finally,


an illustrative example of an e-Science application shows that the framework and algorithm we propose are valid for modeling and analyzing iterative data-dependent e-Science applications. The simulation results show that the optimal bandwidth allocation by the formulated linear programming outperforms the bandwidth allocation by a simple heuristic in terms of the rejection ratio of requests.

In the future, we will extend our framework so that it also schedules computation jobs to distributed computing resources when such mappings are not known ahead of time, or so that it maximizes the overall throughput when multiple SDFGs with throughput requirements are given.


CHAPTER 3
TOPOLOGY AGGREGATION

3.1 Overview

The need for transporting large volumes of data in e-Science has been well-argued [33, 78]. For instance, the HEP data is expected to grow from the current petabytes (PB) ($10^{15}$) to exabytes ($10^{18}$) sometime between 2012 and 2015. In addition, e-Scientists desire schedulable network services to support predictable work processes [46]. Quality of service (QoS) in network applications has been an active research area for several decades. Recently, new technologies such as multiprotocol label switching (MPLS) and generalized multiprotocol label switching (GMPLS) drew more attention to QoS routing, since those technologies have made it possible for network managers to set up and tear down explicit paths while guaranteeing specified amounts of bandwidth.

The network supporting e-Science applications typically comprises multiple domains. Each domain usually belongs to a different organization, and is managed based on different operational policies. In such cases, internal topologies of domains may not be visible to the others for security or other reasons. However, aggregated information on the internal topology and associated attributes is advertised to the other domains.

A set of techniques to aggregate data to advertise outside one domain is called Topology Aggregation (TA). The aggregated data itself is termed the Aggregated Representation (AR). A survey of TA algorithms is presented in [98]. There exists a tradeoff between the accuracy and the size of an AR. Hence, most algorithms proposed in the previous work tried to achieve the most efficient AR in terms of both accuracy and space complexity.

One can classify QoS path requests into two classes: single-path single-job (SPSJ) and multiple-path multiple-job (MPMJ), depending on the nature of requests. SPSJ corresponds to a situation in which requests for a single QoS path arrive and they


are scheduled in the order of arrival. In contrast, MPMJ corresponds to batch/off-line scheduling of multiple requests for multiple QoS paths. Many e-Science applications require simultaneous transfer of data from multiple sources and destinations. Also, each of these requests (e.g., file transfers) can be more efficiently supported by using concurrent multiple paths.

We show that existing TA approaches developed for SPSJ do not work well with MPMJ applications, as they overestimate the amount of bandwidth that is available. We propose a maxflow-based TA approach that is suitable for this purpose. Our simulation results demonstrate that our algorithms result in better accuracy or less scheduling time.

The Border Gateway Protocol (BGP), which has been deployed as the inter-domain protocol, has limited use for AR techniques, as it is not flexible enough to be extended to accommodate many QoS parameters. This is because it was originally designed only for distributing reachability information [107]. Recently, a new network model based on the Path Computation Element (PCE) has been proposed to overcome the aforementioned drawbacks of BGP [45]. A PCE is an entity that is capable of computing network paths utilizing the traffic engineering database, which contains required network status information such as topology, delays on links and so on. Recent papers [80, 85, 93] have based their network model on the PCE-based architecture. We develop TA algorithms in the context of the PCE-based architecture that support most e-Science applications. In particular, the following network model is assumed throughout this chapter.

1. A centralized PCE exists in each domain. A node sends a request to the PCE to make a reservation for a QoS path.
2. Centralized PCEs flood aggregated topology information to the others, so that every centralized PCE maintains a complete view of the network in an AR, except for its own domain.

The first condition states that one active element in a domain acts as a supernode of that domain, which knows all the information essential for QoS path computation. One possible implementation is that every node in a domain sends a request for a QoS


path to the designated centralized PCE; therefore, the PCE can manage one consistent set of information on the network status related to QoS parameters. The second condition can be reasonably assumed in e-Science networks, whose size is very small compared to the Internet. This condition enables us to directly apply the QoS routing algorithms which have been developed so far. In this network architecture, one domain can advertise its aggregated topology information and associated QoS parameters to all the other domains.

Figure 3-1. An example of inter-domain QoS routing

Based on the described network model, a scenario of inter-domain QoS routing works as in Figure 3-1.

STEP 1 A source node sends a path computation request to a single centralized PCE in the same domain.
STEP 2 Then the PCE replies back with a coarse path, which is a sequence of border nodes without detailed hops between border nodes.
STEP 3 With the coarse path, the source node sends a path setup request that will traverse border nodes of the coarse path.
STEP 4 and 5 The border node which receives a path setup request gets a strict path for a coarse path from the PCE in the same domain.
STEP 6 The same steps repeat until a path setup request reaches a destination node.

TA algorithms can also be used for scheduling paths in a single domain. These methods are useful as a large domain can be partitioned into subdomains. TA algorithms can then be applied to each subdomain. With ARs on subdomains, the actual scheduling may be performed either on a single node with a rich compute resource or


on a distributed set of nodes, such that the time complexity of scheduling paths would be reduced by running scheduling algorithms on the partitioned smaller subdomains.

The rest of the chapter is organized as follows. The related work on TA is described in Section 3.2. Section 3.3 describes novel algorithms for MPMJ. Section 3.4 describes how real routing works for TA algorithms, and Section 3.5 gives a comparative analysis of time and space complexity. The experimental results by simulation are given in Section 3.6, and, finally, we conclude in Section 3.7.

3.2 Related Work

TA consists of algorithms and mechanisms for reducing the size of topological information and associated attributes within a domain or subdomains while maintaining a certain level of accuracy. Uludag et al. [98] presented a survey of these algorithms for multi-domain environments. All TA algorithms have two elements: an aggregated graph and aggregated QoS parameter values, called epitomes, assigned on logical links in the aggregated graph.

Typical topologies for aggregated graphs are full-mesh, simple compaction, tree-based, and star-based topologies. Some other topologies, e.g., Shufflenet [108], have been proposed to reduce space complexity in specific cases such as asymmetric networks. Most TA algorithms start by building a full-mesh graph, which is a complete graph whose nodes are only the border nodes of the original network. Algorithms that are more focused on the size of the AR usually try to transform a full-mesh graph into more compact forms, for example, a spanning tree or a star topology, while trying to keep up with the accuracy of a full-mesh AR. For the aggregated QoS parameter values, the epitomes, the maximum, the minimum or the average of QoS values is typically used.

TA algorithms for SPSJ in large-scale multi-domain networks focus on the compaction of ARs; accuracy is a secondary issue. As for TA algorithms in small-sized networks, accuracy has been the main focus [80, 85, 88, 93]. For a single QoS


constraint, a distortion-free algorithm exists [98]. But for two QoS constraints composed of an additive and a restrictive one, the problem gets more complicated. Even though the problem itself is not intractable, the distortion-free representation is not compact. For such reasons, several approximating algorithms minimizing distortion, such as the line segment algorithm [68], have been proposed. Usually, the multiple QoS constraints problem is generalized as one restrictive constraint with multiple additive constraints, since a multiplicative constraint such as link reliability can be transformed into an additive one through a log operation.

To the best of our knowledge, all existing TA algorithms are limited to routing a single QoS path at one time, i.e., SPSJ, with few exceptions of customized algorithms for special purposes such as the computation of reliable paths. MPMJ applications consider a batch of jobs at a time, and multiple paths are allowed for one job. For instance, a request for the earliest finish time for a given multiple-source multiple-destination data transfer, which is one of the important e-Science applications [46], is handled at one time, and multiple paths are set up for the request.

Emerging technologies such as MPLS or GMPLS make it possible for applications with strict QoS requirements to be implemented on networks equipped with such facilities. Special purpose networks, such as research networks linking national labs in the U.S., can be set up for that purpose [83]. Especially for inter-domain QoS path routing in such special purpose networks, the accuracy of aggregated topologies and associated QoS parameter values is more important than the size of the data exchanged among domains, since the number of domains is relatively small compared to the Internet, which is constituted by a huge number of hosts and switches. Thus, the need for more accurate ARs is prominent.

As described above, one of the most recent works regarding TA for two QoS constraints is the line segment algorithm in delay-bandwidth sensitive networks [68]. The line segment algorithm first computes 2-D charts whose x-axis and y-axis are delay


and bandwidth, respectively, for every pair of border nodes. The chart contains all the information for computing QoS paths with delay and bandwidth constraints. The authors of [68] proposed the line segment algorithm, which approximates that information by a line to reduce the size of the data representing all possible delay-bandwidth combinations between two border nodes; this is possible because the chart takes the shape of an increasing staircase function. The next step is to establish a full-mesh topology and convert it to a star topology to further reduce the space complexity to $O(|B|)$.

Figure 3-2. An illustrative example for limitations of the line segment algorithm

With existing TA algorithms for SPSJ, there is no way to estimate whether more than one path between two border nodes is available. Consider the multi-domain network in Fig. 3-2. The network consists of three domains, where AS1 is connected to AS3 via AS2. Suppose that a host in AS1 wants to find maxflow paths or reliable paths, composed of a primary and a backup path, to a certain host in AS3. If TA algorithms such as the line segment algorithm are deployed in this network, the PCE in AS1 computes paths based on the AR from AS2, which only gives the information on how much bandwidth is available within a certain delay. Since the PCE in AS1 has no clue how many paths exist internally in AS2, the computed maxflow or reliable paths are not necessarily the paths that would be computed from the complete network status information.

3.3 TA for Multiple-Path Multiple-Job (MPMJ)

3.3.1 Problem Statement

When it comes to scheduling a batch of multiple jobs allowing for multiple paths, existing TA techniques are not useful because the performance degradation can be


significant. For example, the full-mesh AR for bandwidth scheduling, where each logical link has the maximum available bandwidth between two border nodes as an epitome, is known to be a distortion-free AR for single-path bandwidth scheduling. However, it may not be effective for multi-path bandwidth scheduling, e.g., maxflow bandwidth scheduling.

An important class of e-Science applications is bulk file transfers. For example, in high energy physics, large files are routinely transferred between tiered centers that are geographically distributed around the world. The generated data have to be transferred from the places where they are stored to research centers for the purpose of analysis or visualization. In the context of e-Science applications, bandwidth scheduling problems range from single-source single-destination data transfer optimization to multiple-source multiple-destination data transfer optimization. The computational complexity of such problems depends on the space complexity of the network topology. Generally, we can break down network resource provisioning procedures for e-Science applications into the admission control phase and the network resource, i.e., bandwidth, allocation phase. In the admission control phase, the acceptance of requested jobs is determined; if a request is accepted, explicit bandwidth allocation for each link is executed in the network resource allocation phase. With compact network information abstracted from a complete network topology, chances are that the network resource allocation phase may fail due to inaccurate network status information. Even though a request accepted in the admission control phase can be rejected in the network resource allocation phase due to inaccurate ARs, the benefits from the lower space complexity compensate for failed operations if the error rate is fairly small.

In the following subsections, we propose several TA algorithms suited for MPMJ. Each request consists of single or multiple data transfer jobs. However, we want to allow for the use of multiple paths. The only QoS parameter that is considered is bandwidth.


3.3.2 New Topology Aggregation Algorithms

3.3.2.1 Full-mesh method

The most typical way of aggregating networks with QoS parameters is to build a full-mesh topology by connecting every pair of nodes of interest and assigning epitomes to the resulting logical links. Following this conventional way, we can build a full-mesh AR with maxflow values between nodes assigned to each logical link. Let us consider the edge connecting nodes D1 and D2 in Fig. 3-3. The epitome associated with the edge $E_{D_1 D_2}$, $F_{12}$, can be computed using any known maxflow algorithm. The algorithm for building a full-mesh AR is described in Algorithm 3-1.

Figure 3-3. Full-mesh AR

Algorithm 3-1 Full-mesh AR construction
Input: a graph G = (V, E).
1: Pick nodes of interest from a full set of nodes, V, and add them to the aggregation representation.
2: for each pair of picked nodes do
3:   Create a link between two nodes.
4:   Compute a maxflow value between two nodes.
5:   Assign the computed maxflow value as an epitome to the link created above.
6: end for

This simple method, adapted from existing TA techniques for SPSJ, easily turns out to be inappropriate for MPMJ. Let us take an example of a job requesting maxflow between D1 and D2, where D1, D2, D3 and D4 are nodes of interest. Since we consider


MPMJ in both single and multiple domain network environments, the nodes D1, D2, D3 and D4 do not have to be border nodes. The final maxflow value between D1 and D2 computed on the AR would be far bigger than the right answer, since the AR also contains other logical paths such as D1→D3→D2, whose epitomes are added on top of the direct link.

Figure 3-4. Star AR

3.3.2.2 Star method

A full-mesh AR does not effectively support MPMJ, as the maximum amount of flow that a specific node can push into a network is not restricted.

For single path computation algorithms, most recent TA techniques start from a full-mesh AR and produce diverse variants stemming from it, such as partial full-mesh, star, tree and so on. For multiple path computation algorithms, the reasons described in the previous subsection prevent the full-mesh AR from being utilized as a base AR for other ARs that are more efficient in terms of space complexity.

A star AR as in Fig. 3-4 can overcome the drawbacks of a full-mesh AR by limiting the maxflow value from any node. First, the logical node, L, is created and all nodes of interest are connected to it. Suppose that four nodes of interest (D1, D2, D3 and D4) are connected to the central logical node L. The epitome assigned to the logical link connecting a certain node and the central logical node L is the maxflow value from that node to all the remaining nodes. This is easily computed by putting a super source node connected to the node and a super sink node connected to all the remaining nodes, and running a maxflow algorithm between the super source and the super sink nodes. In this case, F1 is the maxflow value that node D1 can send to the network, which is easily


computed by adding a super sink node connecting D2, D3 and D4 and running a maxflow algorithm between D1 and the super sink node. Likewise, we can also compute the other epitomes such as F2, F3 and F4. This AR has only one outgoing link from each node, which keeps a node from sending data flow beyond the epitome assigned to its outgoing link. A formal description of the algorithm is presented in Algorithm 3-2.

Algorithm 3-2 Star AR construction
Input: a graph G = (V, E).
1: Pick nodes of interest from a full set of nodes, V.
2: Create a single logical node, L.
3: for each of picked nodes do
4:   Create a link between the node and the logical node, L.
5:   Compute a maxflow value from a target node to all the remaining nodes.
6:   Assign the computed maxflow value as an epitome to the link created above.
7: end for

3.3.2.3 Partitioned star method

After performing some experiments on the star AR, we found that a star AR shows little distortion compared to the original topology. However, TA algorithms should take into account how the result on an AR can be transformed into real path setups on the original topology.

Originally, TA arose from efforts to deal with scalability issues related to space complexity and security issues regarding intradomain topology in multi-domain network environments. Usually, routing procedures consist of two steps: (1) path computation and bandwidth allocation with ARs, and (2) explicit path computation and bandwidth allocation with the original network topology for each domain. Similar steps can also be applied to single domain network environments, where several subdomains exist for hierarchical routing or we can intentionally partition one domain into several logical subdomains. In this case, the benefits from TA are almost the same as those in


multi-domain network environments. In the case of MPMJ applications, however, we can expect more benefits in terms of computational complexity. A detailed computational complexity analysis will be given in Section 3.5.

The partitioned star method tries to combine the benefits of the star and full-mesh methods by partitioning a domain into k subdomains, each of which is aggregated using the star method described above. Fig. 3-5 shows an example of a domain with four partitioned subdomains. In this chapter, we use general graph partitioning algorithms, which are widely used in many other computer science areas including load distribution in parallel computers, sparse matrices and the design of very large scale integrated circuits (VLSI) [58]. The algorithm for building a partitioned star AR is described in Algorithm 3-3.

Algorithm 3-3 Partitioned star AR construction
Input: a graph G = (V, E) and k, the number of partitions (subdomains).
1: Pick nodes of interest from a full set of nodes, V, and add them to the aggregation representation.
2: Partition a graph into k parts so that the number of nodes of interest is evenly distributed over partitioned parts.
3: Identify cut nodes and cut edges, and add them to the aggregation representation.
4: for each part do
5:   Construct star AR with picked nodes and cut nodes in the part.
6: end for

3.4 Routing

With the network model described in Section 3.1, inter-domain QoS path routing is relatively easy compared to QoS path routing in a distance vector routing protocol. Any centralized PCE can compute a path to a destination which consists of a strict path within its own domain and a coarse inter-domain path to the destination domain. The coarse inter-domain path is composed of border nodes, and when the path setup


request is received by a border node on the intermediate path, it is translated into a strict path composed of intra-domain routers or switches. On the other hand, when the line segment algorithm is deployed, computing the QoS path goes through two steps. First, we assign delay values to the virtual edges of the aggregated representations of all domains except the domain in which the source node resides; a request has a bandwidth requirement, and the corresponding delay is computed through the line segment algorithm. Second, any shortest path algorithm such as Dijkstra's algorithm can be run on the resulting ordinary graph with delay attributes on its edges.

Figure 3-5. Partitioned star AR

Inter-domain routing for SPSJ applications is described in Section 3.1. The routing procedures for MPMJ applications are the same as those for SPSJ applications. The results from any algorithm run on ARs, e.g., a maximum bandwidth path algorithm, are expanded on each domain or each subdomain by running the same algorithm on the original topology of the domain or subdomain. If operations fail in any of the domains or subdomains, the entire operation fails. Note that the reason MPMJ applications in intradomain environments use ARs of subdomains is to reduce the time complexity of scheduling, whereas SPSJ or MPMJ applications in interdomain environments are forced to use ARs for security or administrative reasons. The benefits of using ARs in intradomain environments from the perspective of time complexity will be described in Section 3.5.
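The full-mesh and star constructions of Algorithms 3-1 and 3-2, on which the routing above operates, can be sketched as follows. The sketch uses the Edmonds-Karp maxflow algorithm for brevity (any maxflow algorithm would do), treats logical links as bidirectional, and uses a hypothetical four-node network; all function and node names are assumptions made for illustration.

```python
from collections import defaultdict, deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow; capacity is {(u, v): bandwidth}."""
    residual = defaultdict(float)
    neighbors = defaultdict(set)
    for (u, v), c in capacity.items():
        residual[(u, v)] += c
        neighbors[u].add(v)
        neighbors[v].add(u)  # reverse arc for the residual graph
    total = 0.0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v in neighbors[u]:
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck and push flow along the path.
        path, v = [], sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[e] for e in path)
        for u, v in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
        total += bottleneck


def full_mesh_ar(capacity, nodes_of_interest):
    """Algorithm 3-1: epitome F_ij is the pairwise maxflow."""
    return {(a, b): max_flow(capacity, a, b)
            for a in nodes_of_interest for b in nodes_of_interest if a != b}


def star_ar(capacity, nodes_of_interest):
    """Algorithm 3-2: epitome F_i is the maxflow to a super sink
    attached to all the other nodes of interest."""
    epitomes = {}
    for node in nodes_of_interest:
        augmented = dict(capacity)
        for other in nodes_of_interest:
            if other != node:
                augmented[(other, "SUPER_SINK")] = float("inf")
        epitomes[node] = max_flow(augmented, node, "SUPER_SINK")
    return epitomes


def both_ways(links):
    """Turn undirected links into a pair of directed arcs."""
    return {e: c for (u, v), c in links.items() for e in ((u, v), (v, u))}


# Hypothetical network: D1 reaches the rest only through hub H.
net = both_ways({("D1", "H"): 10, ("H", "D2"): 10,
                 ("H", "D3"): 10, ("D3", "D2"): 10})
nodes = ["D1", "D2", "D3"]

true_flow = max_flow(net, "D1", "D2")
mesh_flow = max_flow(full_mesh_ar(net, nodes), "D1", "D2")
star = star_ar(net, nodes)
star_links = both_ways({(d, "L"): f for d, f in star.items()})
star_flow = max_flow(star_links, "D1", "D2")
```

In this example the full-mesh AR reports a maxflow of 20 between D1 and D2, because the logical path D1→D3→D2 is counted in addition to the direct logical link, while the original network and the star AR both give 10.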


Table 3-1. Time Complexity for MPMJ

| Method | Time Complexity |
|---|---|
| Full-mesh | $O(n^3 D^2)$ |
| Star | $O(n^3 D)$ |
| Partitioned star | $O\left(\left(\frac{n}{k}\right)^3 (C + D)\right)$ |

D = number of nodes of interest, C = number of cut nodes, k = number of partitions

3.5 Complexity Analysis

Usually, most algorithms for MPMJ have higher computational complexities than algorithms for SPSJ. Dijkstra's algorithm can be used to derive the maximum bandwidth path between two nodes, which can be translated into the maxflow single path. The complexity of Dijkstra's algorithm is $O(n \log n + m)$, where $n$ is the number of vertices and $m$ is the number of edges. In contrast, the complexity of the push-relabel maxflow algorithm is $O(n^3)$ [27]. This shows that algorithms for MPMJ may require computational costs a few orders of magnitude higher than those for SPSJ.

The time complexities of TA algorithms for MPMJ are summarized in Table 3-1. The full-mesh method requires $O(n^3 D^2)$, and the star and the partitioned star methods require $O(n^3 D)$ and $O\left(\left(\frac{n}{k}\right)^3 (C + D)\right)$, respectively, where $D$ is the number of nodes of interest, $C$ is the number of cut nodes, and $k$ is the number of partitions. The complexity of maxflow algorithms is assumed to be $O(n^3)$.

The space complexities are summarized in Table 3-2. The space complexities of ARs for the full-mesh, star and partitioned star methods are $O(D^2)$, $O(D)$ and $O(C + D)$, respectively. Suppose that a certain algorithm for MPMJ applications takes $O(n^3)$. If we run the algorithm on ARs, it will take $O((C + D)^3)$ and $k\,O\left(\left(\frac{n}{k}\right)^3\right)$, which are the time taken for running the algorithm on ARs and the time taken for explicit routing in each partition, respectively. $(C + D)$ and $k$ are small values compared to $n$, and $n^3$ may be greater than $\left(\frac{n}{k}\right)^3$ by a few orders of magnitude. Hence, we can expect that the


partitioned star method can expedite the path computation and bandwidth allocation process significantly.

Table 3-2. Space Complexity for MPMJ

| Method | Space Complexity |
|---|---|
| Full-mesh | $O(D^2)$ |
| Star | $O(D)$ |
| Partitioned star | $O(C + D)$ |

D = number of nodes of interest, C = number of cut nodes

3.6 Experimental Evaluation

3.6.1 Bulk File Transfers in E-Science

We chose the bulk file transfer application in [65] as a typical MPMJ e-Science application to show that our proposed algorithms perform better than naive algorithms adapted from SPSJ TA algorithms. In [65], the authors formulated the in-advance scheduling of multiple bulk file transfers as a linear programming problem. We adapted their linear programming formulation to on-demand scheduling of multiple bulk file transfers for our simulation. The linear programming formulation is shown in Figure 3-6. The notation and equations are borrowed from [65] whenever possible. In this formulation, $t_f$ denotes the time by which all file transfers complete. The objective of this linear programming problem is to find the earliest finish time. $f^j_{lk}$ is the amount of file $j$ transferred for request $j \in F$ on link $(l, k) \in E$. $b_{lk}$ is the bandwidth available on link $(l, k)$. The first constraint in Figure 3-6 ensures that for each transfer request $j \in F$, for each node $l$ that is neither the source nor the destination node, the amount of file $j$ that leaves node $l$ equals the amount that enters this node. The second constraint requires the source node of request $j$ to send a net $f_j$ units of file $j$ out and requires the destination node to receive a net $f_j$ units. The third constraint ensures that the amount of traffic on each link does not exceed the available capacity of the link in the interval $[0, t_f)$. The fourth constraint ensures that file transfer amounts are non-negative.
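As a sanity check on the formulation of Figure 3-6, consider the special case of a single request ($|F| = 1$). Dividing every flow variable by $t_f$ turns the capacity constraint into an ordinary flow problem of value $f_1 / t_f$ under capacities $b_{lk}$, which is feasible exactly when that value does not exceed the maximum flow between the endpoints. The notation $\mathrm{maxflow}(s_1, d_1)$ is introduced here for illustration only.

```latex
\frac{f_1}{t_f} \le \mathrm{maxflow}(s_1, d_1)
\quad\Longrightarrow\quad
t_f^{*} = \frac{f_1}{\mathrm{maxflow}(s_1, d_1)}
```

For instance, a file of 100 units between two nodes with a maximum flow of 20 units per unit time has an earliest finish time of 5 time units. An AR that overestimates the maximum flow therefore underestimates the finish time it reports for that request.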


$$
\begin{aligned}
&\text{minimize } t_f \\
&\text{subject to} \\
&\sum_{k:(l,k) \in E} f^{j}_{lk} - \sum_{k:(k,l) \in E} f^{j}_{kl} = 0, \quad \forall j \in F,\ \forall l \in V,\ l \ne s_j,\ l \ne d_j \\
&\sum_{k:(l,k) \in E} f^{j}_{lk} - \sum_{k:(k,l) \in E} f^{j}_{kl} =
\begin{cases} f_j, & \text{if } l = s_j \\ -f_j, & \text{if } l = d_j \end{cases}
\quad \forall j \in F \\
&\sum_{j \in F} f^{j}_{lk} \le b_{lk}\, t_f, \quad \forall (l,k) \in E \\
&f^{j}_{lk} \ge 0
\end{aligned}
$$

Figure 3-6. Earliest finish time on-line scheduling of multiple file transfers

3.6.2 Experiment Testbed

For TA algorithms for MPMJ, we performed experiments on random networks with a single domain. Random network topologies are generated by the BRITE internet topology generation package [72]. We tried several models such as Waxman, BRITE, etc., but the results for different models show similar trends. Therefore, we show only results for random network topologies following the Waxman model with an average node degree of 4. The bandwidth values of edges are randomly selected from a uniform distribution between 10 and 1024. The number of nodes in each domain is varied from 100 to 300 in increments of 50. The nodes of interest are picked randomly within a domain, and the number of nodes of interest ranges from 1 to 16, doubling at each step. We generated a synthetic set of data transfer requests. Each request is described by the 3-tuple (source node, destination node, requested file transfer size). The number of requests is also randomly selected within the range of 1 to the maximum possible number of requests determined by the number of nodes of interest. For example, if the number of nodes of interest is 4, the maximum possible number of requests is 4 × 3. The


source and destination nodes for each request are randomly selected using a uniform random number generator. The results are averaged over 100 random networks for each number of nodes.

3.6.3 Performance Metrics

The performance metric we have used to compare the different approaches is the earliest finish time (EFT) to complete all the given data transfer requests. One would expect a good AR approach to perform as close as possible to using the original topology.

Hence, we use the error ratio (ER), which measures the deterioration from the correct EFT on the original topology. A TA algorithm with a lower ER shows better performance. ER is formally defined as

$$ER = \frac{\text{TA EFT} - \text{Original EFT}}{\text{Original EFT}}$$

3.6.4 Results

We measured ER according to the equation defined in Section 3.6.3. The computational times taken by each algorithm were also gathered to show how much computation cost reduction we can get from the compact representation. Fig. 3-7 shows that the star and the partitioned star methods give around 5% ER. This is because the application of finding EFT tends to find and allocate all the available bandwidths in a network, which are limited by the star or the partitioned star ARs in a similar way as by the original network. In addition, we could observe that as the number of requests increases, ER improves because all the network resources, i.e., the bandwidths, are eventually used up. As expected, the performance of the full-mesh AR is the worst.

Our simulation results in Fig. 3-8 also show that the star method is comparable to the partitioned star method but significantly faster. This is not only because the star method is a more compact representation, but also


partly because, for randomly generated networks, the number of cut nodes is relatively large. If a domain is structured as a backbone plus the remaining network, the number of cut nodes can be reduced to a reasonable value, which can enhance the performance of the partitioned star method.

Figure 3-7. Error ratio vs. the number of nodes

Figure 3-8. Normalized computational time vs. the number of source and destination nodes

3.7 Summary

We propose several topology aggregation algorithms for e-Science networks. E-Science applications require higher quality intradomain and interdomain QoS paths, and some of them are distinguished from classic single-path single-job (SPSJ) applications. We define a new class of requests, called multiple-path multiple-job (MPMJ), and propose TA algorithms for the new class of applications. The proposed algorithms, the star and partitioned star ARs, are shown to be significantly better than naive

approaches. In particular, the star AR shows the best performance in terms of computational time. Its performance is also very close to that of using the entire topology for scheduling. Thus, it is well suited for multiple-domain e-Science applications.
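As a small worked illustration of the error ratio defined in Section 3.6.3, the following Python sketch computes ER from two finish times. The function name and the sample values are illustrative assumptions, not measurements from the experiments.

```python
def error_ratio(ta_eft: float, original_eft: float) -> float:
    """Relative deterioration of the EFT obtained on an aggregated topology."""
    if original_eft <= 0:
        raise ValueError("original EFT must be positive")
    return (ta_eft - original_eft) / original_eft


# Hypothetical values: an EFT of 105 time units on the aggregated
# representation versus 100 time units on the original topology.
print(error_ratio(105.0, 100.0))  # 0.05
```

An ER of 0 means the aggregated representation loses nothing relative to the original topology.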

CHAPTER 4
WORKFLOW SCHEDULING

4.1 Overview

An application scientist typically solves his/her problem as a series of transformations. Each transformation may require one or more inputs and may generate one or more outputs. The inputs and outputs are predominantly files. However, we expect that files will be replaced by databases for many applications. The sequence of transformations required to solve a problem can be effectively modeled as a Directed Acyclic Graph (DAG) for many practical applications of interest that this chapter is targeting.

Figure 4-1 describes a DAG consisting of 17 nodes, representing dependencies among 17 tasks of an application. For example, the arc from task E to task B represents the fact that the output generated by task E is utilized by task B. Each task involves transforming the data or storing an intermediate result for archiving. The time required to solve the entire DAG for large-scale applications may be of the order of hours to days, even assuming that each of the tasks is executed on a cluster of workstations or a parallel supercomputer. DAGs have been widely used for the development of scheduling algorithms in the computer science literature [38, 105].

The general form of DAG scheduling has been shown to be NP-hard [91], and a number of heuristics have been proposed. Early research regarded communication costs as small [26, 39] or assumed a very simple interconnection network model, i.e., a fully-connected network model without contention [59, 60, 79, 97, 99]. The heterogeneous-earliest-finish-time (HEFT) algorithm, extended for heterogeneous computing resources, was proposed in [97].

For the distributed applications that we are targeting, the amount of data that needs to be transferred between tasks may be of the order of hundreds of gigabytes to multiple terabytes. Thus, the key challenge is to schedule a workflow such that the total execution time and the communication costs are minimized. The former

requires mapping the tasks to appropriate machines, while the latter requires the use of high-bandwidth networks and effective scheduling of the communication bandwidth. Past research on scheduling DAGs (e.g., [38, 105]) is generally limited to solving compute-intensive problems. In contrast, we propose new algorithms to map tasks that have large data access requirements onto distributed heterogeneous clusters and supercomputers. For many applications, a node in the task graph can also represent multiple concurrent and interacting subtasks. If these subtasks are mapped to multiple machines, the required interaction has to be mapped onto the underlying network. For such DAGs, the precedence is between sets of multiple subtasks.

Figure 4-1. A DAG consisting of 17 nodes, representing dependencies among 17 tasks of an application. For example, the arc from task E to task B represents the fact that the output generated by task E is utilized by task B.

The extended list scheduling algorithms in [92] and [90], targeted at heterogeneous cluster architectures, address this network contention issue with various priority attributing schemes and assume that the path between any two processors is determined and fixed by the target system using conventional algorithms such as breadth-first search (BFS). On the other hand, similar work regarding DAG scheduling has been done in the

literature of grid computing. Generally, the term workflow is used interchangeably with DAG in the context of grid computing. A taxonomy of previous work on the workflow scheduling problem in grid computing was presented in [102]. The goal of the scheduling algorithms is to map the tasks and subtasks of all the applications on the grid such that the resources are effectively utilized, while the quality of service guarantees given to an application are respected. In this chapter, the actual formulation of the optimization goals will be presented. The network resource mapping can have the following characteristics:

1. Rigid vs. malleable: Rigid mapping is fixed-bandwidth mapping over the time period of data transfer, whereas malleable mapping allows for variable bandwidth. If there are no quality of service requirements such as a constant data rate, malleable mapping is a viable option for utilizing network resources efficiently, since solutions can be flexible as long as the total amount of data transmitted over time meets the data transmission requirement.

2. Single path vs. multiple paths: For many transfers, multiple paths can be effectively used to reduce the transfer time. However, finding a set of multiple paths requires more computation time, and thus efficient algorithms are needed.

3. Static vs. dynamic paths: In static mapping, paths determined at the start time of data transmission do not change until the end of data transmission, while in dynamic path mapping, paths can change dynamically over time.

The recent work on workflow scheduling in optical grids can provision network resources dynamically with a guarantee of specified bandwidth [66, 67, 69, 95, 101]. According to the above taxonomy, those methods use rigid and single-path mapping. Moreover, the paths are assumed to be static.

The workflow scheduling problem can be classified into in-advance and on-demand scheduling depending on the reservation start time. If the reservation start time is the same as the job request arrival time, it is on-demand scheduling. In in-advance scheduling, on the other hand, the reservation start time is equal to or later than the job request arrival time. On-demand scheduling can also be regarded as a special case of in-advance scheduling where the reservation start time equals the job request arrival

time. In this chapter, we solve the in-advance workflow scheduling problem in e-Science networks, which are a mix of IP networks and optical networks. Our framework supports in-advance reservation and provides malleable mapping and dynamic paths. Further, it is able to exploit multiple paths and is applicable to both heterogeneous and homogeneous resources, especially network resources.

4.2 Workflow Scheduling in E-Science Networks

We develop workflow scheduling algorithms for e-Science networks. A real network topology and workflows are given as inputs to the workflow scheduling algorithms. A real network topology is represented by a network resource graph, where a node denotes a resource, such as a compute resource or a router/switch that only forwards network traffic, and an edge denotes a physical link between two nodes. A workflow is represented by a task graph/DAG, where a node denotes a task associated with a resource type and a required amount, and a directed edge connecting two nodes denotes a producer/consumer relation between them, i.e., a required data transfer from a source node to a destination node. A task in a task graph is executed only once, and the execution order complies with the precedence constraints defined by the task graph. In this chapter, the goal of workflow scheduling algorithms is to map a node (task) and an edge (data transfer) in a task graph onto a node and dynamic multiple paths in a network resource graph, respectively. The mapping of a node (task) in a task graph onto a node in a network resource graph implies that the task is not splittable and that the mapping does not vary over time, although the amount of resource allocated can vary over time. In contrast, the mapping of an edge (data transfer) in a task graph onto dynamic multiple paths in a network resource graph means that the data transfer is fulfilled by multiple paths varying over time. The time model we assume is the uniform time slice model, which discretizes the timeline into many time slices with a uniform period. More detailed and formal definitions are given in the following sections.
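The task graph model described above can be illustrated with a short Python sketch. The four task names and their dependencies are hypothetical; the sketch only shows that an execution order complying with the precedence constraints is a topological order of the DAG. It relies on the standard-library `graphlib` module, so it assumes Python 3.9 or later.

```python
from graphlib import TopologicalSorter

# Hypothetical task graph: each key maps a task to the set of tasks
# whose output it consumes (its predecessors).
predecessors = {
    "A": set(),
    "B": {"A"},
    "C": {"A"},
    "D": {"B", "C"},
}


def precedence_respecting_order(preds):
    """Return one execution order in which every task follows its producers."""
    return list(TopologicalSorter(preds).static_order())


order = precedence_respecting_order(predecessors)
position = {task: i for i, task in enumerate(order)}
# Every producer appears before each of its consumers.
assert all(position[p] < position[t] for t, ps in predecessors.items() for p in ps)
print(order)
```

Such an order only fixes the sequence of tasks; choosing the resource for each task and the paths for each data transfer is the scheduling problem addressed in the rest of the chapter.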

4.2.1 System Model and Data Structure

4.2.1.1 Time model

The uniform time slice model is represented by two parameters: the size of a time slice and M, the maximum number of time slices the system would consider. The start and end times of time slice m are denoted by $T_m$ and $T_{m+1}$, respectively.

4.2.1.2 Network resource model

A network resource model is represented by $G_n = (V, E, r, TR, TB)$, where V and E are a set of nodes and a set of edges, respectively, $r_v$ denotes the resource type of a node v, and $TR_v$ and $TB_e$ denote the data structures for the resource availability over time of node v and edge e, respectively. More specifically, we use time-resource (TR) or time-bandwidth (TB) arrays as data structures for managing resource availability over time. A TR or TB array is a set of $(a_m)$, where m is the index of a time slice and $a_m$ is the available amount of a resource over the period $[T_m, T_{m+1})$. These data structures are necessary for effective in-advance reservation of resources. The data structures of a TR array and a TB array are the same, except that a TB array represents only the network resource type, whereas other resource types are represented by a TR array. Thus, a TB array is assigned to each edge and a TR array is assigned to each node in a network resource graph.

Figure 4-2 shows an example of a network resource graph. Each node represents one resource and is associated with a resource type and a TR array, which tracks the resource availability over time. In Figure 4-2, nodes $V_1$ through $V_3$ are of resource type 1, and nodes $V_4$ and $V_5$ are of resource type 2. We can assign a unique number to each different resource type excluding the network resource. For example, resource type 1 is a pure compute resource and resource type 2 is a database service resource. Each edge represents a physical link connecting two nodes and is associated with a TB array, which tracks the network resource availability over time.
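The TR and TB arrays described above can be sketched as a small Python class. This is a minimal illustration under stated assumptions: the class name, the method names, and the convention that $T_0 = 0$ are hypothetical and are not taken from the dissertation.

```python
class TimeResourceArray:
    """Available amount a_m of one resource for each uniform time slice m."""

    def __init__(self, capacity: float, num_slices: int, slice_size: float):
        self.slice_size = slice_size              # length of one time slice
        self.available = [capacity] * num_slices  # a_m for m = 0 .. M-1

    def slice_bounds(self, m: int):
        """Return (T_m, T_{m+1}) for slice m, with T_0 assumed to be 0."""
        return m * self.slice_size, (m + 1) * self.slice_size

    def can_reserve(self, first: int, last: int, amount: float) -> bool:
        """Check availability in every slice from first to last inclusive."""
        return all(a >= amount for a in self.available[first:last + 1])

    def reserve(self, first: int, last: int, amount: float) -> None:
        """Reserve `amount` in every slice from first to last inclusive."""
        if not self.can_reserve(first, last, amount):
            raise ValueError("insufficient availability")
        for m in range(first, last + 1):
            self.available[m] -= amount


tb = TimeResourceArray(capacity=10.0, num_slices=5, slice_size=60.0)
tb.reserve(1, 2, 4.0)
print(tb.available)        # [10.0, 6.0, 6.0, 10.0, 10.0]
print(tb.slice_bounds(2))  # (120.0, 180.0)
```

The same structure serves as a TB array on an edge or a TR array on a node; only the interpretation of the stored amount differs.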

Figure 4-2. An example of a network resource graph

4.2.1.3 Workflow model

A workflow can be represented by a task graph, which is a directed graph formally defined as $G_t = (N, L, r, RN, RL, ST, Deadline)$. N and L represent a node set and an edge set, respectively. $r_i$ denotes the resource type of node $N_i$. $RN_i$ denotes the required amount of a resource of node $N_i$, and $RL_i$ denotes the required amount of data transfer between both end nodes of edge $L_i$. We assume all capacities of resources are normalized with regard to the base capacity, so that we can express the required or available amount of resources by rational numbers, as multiples of the base capacity. ST is the start time of a workflow, which has to be taken into consideration for in-advance scheduling. Deadline is an optional parameter. If it is given, we may set the optimization objective to minimizing network resource consumption. Otherwise, we may set the optimization objective to minimizing the makespan of the workflow. Figure 4-3 shows an example of a task graph. A resource requirement and a resource type are associated with each node, and only a network resource requirement is associated with each edge.

4.2.2 Problem Statement

We solve in-advance workflow scheduling problems in e-Science networks, which are a mix of IP networks and optical networks. Even though optical networks, where a physical link carries multiple wavelengths, inherently have integrality of bandwidth,

we assume that the bandwidth of a network resource graph is infinitely divisible. The application of the algorithms developed in this chapter to optical networks is left to future work. We develop our algorithms for two broad cases:

1. Single workflow: In this case, a single workflow is scheduled based on the available (fractional) resources. It is assumed that the previous workflows have already been scheduled and the goal is to optimize the performance characteristics of a single workflow.

2. Multiple workflows: In this case, multiple workflows are scheduled simultaneously. The expectation is that this will achieve better performance than scheduling one workflow at a time.

For both cases, the goal of our algorithms can be minimization of either network resource consumption or makespan (finish time). For simplicity, when deadlines for workflows are given, we set the objective to minimization of network resource consumption. Otherwise we set the objective to minimization of finish time. Therefore we have four problems in total: (1) minimization of network resource consumption for a single workflow, (2) minimization of finish time for a single workflow, (3) minimization of network resource consumption for multiple workflows, and (4) minimization of finish time for multiple workflows.

Figure 4-3. An example of a task graph

4.2.3 Construction of an Auxiliary Graph

We translate the workflow scheduling problem into a network flow problem. The multicommodity flow problem, which optimizes the cost of multiple commodities with different source and destination nodes flowing through the network, is a well-known

network flow problem. To formulate the workflow scheduling problem as a multicommodity flow problem, we first have to construct an auxiliary graph from the given network resource graph and task graph. The workflow scheduling problem is comprised of two mapping problems onto a network resource graph: a node mapping problem and an edge mapping problem. The goal of constructing an auxiliary graph is to convert the node mapping problem into an edge mapping problem, since the multicommodity flow problem can deal only with edge mapping problems.

An illustrative example of the auxiliary graph corresponding to Figures 4-2 and 4-3 is shown in Figure 4-4. An auxiliary graph $G_A = (V_A, E_A, TB_A)$ is constructed as follows. First, we expand the network resource graph by duplicating each node and connecting the original one to the duplicated one. For convenience, we call the original one a frontend node and the duplicated one a backend node. For example, in Figure 4-4, the node $V_1$ is expanded into two nodes, $V_1'$ and $V_1''$, and a new edge connecting these two nodes is inserted with the associated TB array corresponding to $V_1$'s TR array. In this case, $V_1'$ is a frontend node and $V_1''$ is a backend node. This expansion converts a resource allocation problem into a network flow problem. The original topology of the network resource graph remains unchanged among the backend nodes of the expanded graph, as in Figure 4-4. Note that nodes with no chance of being selected need not be expanded.

Second, we expand the task graph in the same way as we did the network resource graph, but we do not create any edge connecting nodes in the expanded task graph. Lastly, we interconnect the expanded network resource graph and the expanded task graph.

As mentioned above, two kinds of flows are needed to convert a general workflow scheduling problem into a network flow problem. One is the resource allocation flow, for the purpose of resource allocation of each task (node) in a task graph. For example, in Figure 4-4, $N_1'$ is connected to all possible frontend nodes of the same

resource type as $N_1$ in the expanded network resource graph. Similarly, $N_1''$ is connected to all backend nodes corresponding to those frontend nodes. Thus, by constraining the flow from $N_1'$ to $N_1''$, which demands $N_1$'s resource requirement, to take only one single path, we can solve the problem of resource allocation of each task (node).

The other is the data transfer flow, for the purpose of data transfers between tasks. These flows are seamlessly modeled by multiple flows with different source and destination nodes in a typical multicommodity flow problem. The source node of a data transfer flow is set to the backend node corresponding to the source task in the task graph, and the destination node of a data transfer flow is set to the backend node corresponding to the destination task in the task graph. For instance, a data transfer requirement of 10 units between $N_1$ and $N_2$ in a task graph is modeled by a flow of 10 units of data between $N_1''$ and $N_2''$.

The auxiliary graph accounts for the situation where two tasks are mapped onto the same resource and thus the communication cost between them should be ignored. Since the bandwidth of the interconnecting edges between an expanded network resource graph and an expanded task graph is set to infinity, the communication cost of tasks mapped onto the same resource will be near zero. Suppose that all tasks and resources are of the same type in Figure 4-4 and that $N_1$ and $N_2$ are mapped onto $V_1$. Then the data flow between $N_1$ and $N_2$ will follow the path $N_1'' \to V_1'' \to N_2''$, which is composed only of edges with infinite bandwidth.

The space complexity of an auxiliary graph is summarized as follows: $|V_A| = 2(|V| + |N|)$ and $|E_A| = |E| + |V| + 2|N||V|$.

4.3 MILP Formulation

The single or multiple workflow scheduling problem can be formulated as a mixed integer linear programming (MILP) problem, which is a variant of a multicommodity flow problem. The objective of the MILP problem can be minimizing the finish time or minimizing the total network resource consumption, depending on whether the deadlines

for workflows are given or not. If a deadline is not imposed on a workflow, the user who requests the workflow job wants to get the job done as fast as possible. If a deadline is imposed on a workflow, the user will be satisfied as long as the deadline is met, which allows the system to utilize resources more efficiently in exchange for the delayed finish time.

Figure 4-4. An example of an auxiliary graph

The constraints of the MILP problem are composed of four parts: 1) multi-commodity flow constraints, 2) task assignment constraints, 3) precedence constraints, and 4) deadline constraints. Since we have transformed the workflow scheduling problem into a multi-commodity flow problem, the typical multi-commodity flow constraints remain valid. Additional multi-commodity flow constraints are added to account for malleable resource allocation. Secondly, the task assignment constraints are integer constraints that enforce that one task is mapped to only one resource node in the network topology graph. Thirdly, the precedence constraints ensure that the precedence relations of a workflow are obeyed. Finally, the deadline constraints apply only when deadlines for workflows are given. The notation for the MILP formulation is listed in Table 4-1.
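The auxiliary graph construction of Section 4.2.3 can be sketched in Python as follows. This is an illustrative sketch rather than the author's implementation: the function name, the edge labels, and the treatment of every edge as an undirected link are assumptions, and primes are written as `'` and `''` suffixes on node names.

```python
def build_auxiliary_graph(resource_nodes, resource_links, tasks):
    """Build the node and edge lists of an auxiliary graph.

    resource_nodes: dict  resource node name -> resource type
    resource_links: list of (u, v) physical links
    tasks:          dict  task name -> required resource type
    Edges are (u, v, kind) triples, treated as undirected links here.
    """
    nodes, edges = [], []
    for v in resource_nodes:
        nodes += [v + "'", v + "''"]
        # Frontend-to-backend edge; its TB array mirrors the node's TR array.
        edges.append((v + "'", v + "''", "resource"))
    for u, v in resource_links:
        # The original topology is kept among the backend nodes.
        edges.append((u + "''", v + "''", "link"))
    for t, rtype in tasks.items():
        # Expanded task nodes are not connected to each other.
        nodes += [t + "'", t + "''"]
        for v, vtype in resource_nodes.items():
            if vtype == rtype:
                # Interconnecting edges would carry infinite bandwidth.
                edges.append((t + "'", v + "'", "interconnect"))
                edges.append((v + "''", t + "''", "interconnect"))
    return nodes, edges


# Hypothetical instance in which all resources and tasks share one type.
resources = {"V1": 1, "V2": 1, "V3": 1}
links = [("V1", "V2"), ("V2", "V3")]
task_types = {"N1": 1, "N2": 1}
aux_nodes, aux_edges = build_auxiliary_graph(resources, links, task_types)
print(len(aux_nodes), len(aux_edges))  # 10 17
```

With |V| = 3, |E| = 2 and |N| = 2, the counts agree with the space complexity stated in Section 4.2.3: 2(3 + 2) = 10 nodes and 2 + 3 + 2·2·3 = 17 edges. When resource types differ, fewer interconnecting edges are created, so the stated edge count is an upper bound in this sketch.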

Table 4-1. Notation for problem formulation

Category          Notation    Description
Function          pred(v)     Returns the set of predecessors of node v
Constant or Set   J           {(s_j, d_j, F_j) | s_j, d_j ∈ V_A, 0 ≤ j …

and d denote the source and destination nodes of the job, and F denotes the required amount of flow (resource). ST and END, the start and end times of the job, are determined by the workflow scheduling algorithms. The resource type does not have to be included in this tuple, since a flow is forced to take a link of the same resource type by the carefully chosen connecting edges between a network resource graph and a task graph. Three kinds of binary decision variables are introduced: x, y, and z. The discrete nature of the problem is due to the fact that a task cannot be split and that we have discrete time intervals to accommodate jobs. The binary decision variables $x^j_{lk}$ and $y^j_{lk}$ determine which resource is to be allocated to a non-split task. For a job j corresponding to a task in a task graph, the flow of the job can take only one outgoing edge from the frontend node of the task and only one incoming edge into the backend node of the task in the auxiliary graph. These constraints reflect the non-split property of a task. $z^j_m$ indicates whether time slice m is used for job j or not. These binary decision variables can be easily extended to the multiple workflow scheduling problem by using separate variables for each workflow.

4.3.1 Single Workflow

The complete formulation is presented in Figure 4-5. First of all, the problem can be optimized for either the minimum finish time, $T_f$, or the minimum network resource consumption, $\sum_{j \in J} \sum_{(l,k) \in E_A} \sum_{m=0}^{M-1} f^j_{lk}(m)$, as in Expression 4. Minimizing network resource consumption helps save resources for future arriving requests, so that more requests can be accepted in the long term.

4.3.1.1 Multi-commodity flow constraints

The flow conservation rule at the nodes other than source and destination nodes is ensured by Constraint 4. The amount of flow (resource) to be allocated (reserved) is ensured by Constraint 4. Constraint 4 ensures that the total amount of flow on link (l, k) during time slice m does not exceed the maximum possible amount of flow during time slice m, which is given by $b_{lk}(m)(T_{m+1} - T_m)$, where $b_{lk}(m)$

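Objective

$$\text{minimize} \quad T_f \quad \text{or} \quad \sum_{j \in J} \sum_{(l,k) \in E_A} \sum_{m=0}^{M-1} f^j_{lk}(m) \qquad (4)$$

Multi-commodity flow constraints

$$\sum_{k:(l,k) \in E_A} f^j_{lk}(m) - \sum_{k:(k,l) \in E_A} f^j_{kl}(m) = 0, \quad \forall j \in J,\ \forall l \in V_A,\ 0 \le m \ldots$$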
is the available bandwidth in time slice m, and $T_{m+1}$ and $T_m$ are the end time and start time of time slice m. Constraint 4 ensures that if time slice m is not used for job j, the amount of flow for job j during time slice m is 0. Note, however, that Constraint 4 should not be imposed on edges with infinite available bandwidth, as no cost is required for communications between tasks assigned to the same resource. Otherwise, one time slice would be allocated for such communications. Constraint 4 ensures that if time slice m is used for job j, the start time of job j is at most $T_m$, the start time of time slice m. If multiple time slices are chosen for job j, this constraint enforces $ST_j$ to be less than or equal to the start time of the earliest time slice, which complies with the definition of $ST_j$. Similarly, Constraint 4 ensures that the end time of job j is greater than or equal to the end time of any time slice m in which the job is scheduled.

4.3.1.2 Task assignment constraints

The second part of the constraints reflects the non-split property of tasks. Thus, Constraints 4 and 4 ensure that only one resource among the possible candidate resources is assigned to one task. Constraint 4 relates the discrete selection of a resource to the flow decision variables, which means that if a resource is chosen for a job j, there can be a flow on the related links.

4.3.1.3 Precedence constraints

After transforming all the tasks and data transfers into jobs in a network flow problem, we should ensure that the precedence constraints inherent in a task graph are also embedded in the network flow problem. Accordingly, Constraint 4 ensures that the start time of jobs with no precedent jobs is set to the start time of the workflow. Constraint 4 ensures that the end time of a job is greater than or equal to its start time. Constraint 4 ensures that the start time of a job is not before the end times of its precedent jobs. Constraint 4 ensures that all the end times of jobs are less

than or equal to the global finish time $T_f$. Constraints 4 and 4 ensure that data transfers between tasks occur between the chosen resources.

4.3.1.4 Deadline constraints

Constraint 4 is optional, depending on whether we have deadlines on workflows or not.

4.3.2 Multiple Workflows

The formulation for the single workflow scheduling problem can be easily extended to the multiple workflow scheduling problem by using separate variables for each workflow, as in Figure 4-6. We can set the objective of the multiple workflow scheduling problem formulation to minimizing either the total sum of makespans of all workflows or the total network resource consumption of all workflows, as in Expression 4. The first term of Expression 4 indicates the total sum of makespans of all workflows. Even though we could optimize the finish time of the whole set of workflows by directly applying the objective, Expression 4, of the single workflow scheduling problem formulation, minimizing the finish time of the whole set may not contribute to the efficient resource scheduling of workflows whose timelines are far ahead of that finish time. For such reasons, we choose to minimize the total sum of makespans. Yet, there still exists a concern that this objective cannot achieve balanced optimization for the makespan of every workflow. Suppose that each workflow is issued by a different user. From the perspective of the whole system, this objective can achieve balanced scheduling among workflows. But from the perspective of users, the makespan of a certain workflow can be sacrificed to achieve the minimum of the total sum of makespans by reducing the makespans of other workflows.

4.3.3 Time Complexity

The time complexity of a MILP problem depends on the number of decision variables and the number of constraints. To formally analyze the number of decision

Objective

$$\text{minimize} \quad \sum_{n=0}^{N-1} \left(T^n_f - WST^n\right) \quad \text{or} \quad \sum_{n=0}^{N-1} \sum_{j \in J} \sum_{(l,k) \in E_A} \sum_{m=0}^{M-1} f^{jn}_{lk}(m) \qquad (4)$$

Multi-commodity flow constraints

$$\sum_{k:(l,k) \in E_A} f^{jn}_{lk}(m) - \sum_{k:(k,l) \in E_A} f^{jn}_{kl}(m) = 0, \quad \forall j \in J,\ \forall l \in V_A,\ 0 \le m \ldots$$
variables and the number of constraints, we first define the following variables. $n_n$ and $m_n$ denote the number of nodes and the number of edges, respectively, of a network resource graph. $n_t$ and $m_t$ denote the number of nodes and the number of edges, respectively, of a workflow. $n_J$ denotes the number of jobs. If $n_A$ and $m_A$ represent the number of nodes and the number of edges, respectively, of the auxiliary graph, $n_A$ equals $2(n_n + n_t)$ and $m_A$ equals $m_n + n_n + 2 n_t n_n$, as described in Section 4.2.3. Table 4-2 shows the number of variables and constraints of the single workflow scheduling problem formulation. The flow variables f consist of flow variables of communication jobs and flow variables of non-communication jobs. The first part is accounted for by $(2n_n + m_n) m_t M$, because we need to consider flows on a network resource graph ($m_n$) and on the interconnecting edges between a network resource graph and a task graph related to a job ($2n_n$). The second part is accounted for by $(3 n_n n_t) M$, because we need to consider flows only on the interconnecting edges between a network resource graph and a task graph and on the edges connecting frontend and backend nodes in $G_A$. For simplicity, let us assume that the network resource graph is fixed, as we conduct experiments by varying only the size of workflows in Section 4.6. Then decision variable f is dominant in the number of decision variables, and the number of decision variables f is proportional to $(m_t + n_t) M$. As shown in Table 4-2, the number of constraints is also proportional to $(m_t + n_t) M$.

4.4 LP Relaxation

As the experimental results will show, the running time of MILP for workflow scheduling increases exponentially as the number of nodes of a workflow grows. The general workaround for solving the MILP problem fast enough to be useful in practice is linear programming relaxation, which transforms binary variables into real variables ranging between 0 and 1. We can turn the solution of the linear programming relaxation of the MILP problem into an approximate solution of the MILP problem via

techniques such as rounding.

Table 4-2. Single workflow scheduling formulation time complexity analysis

Variable/Constraint      Number of variables/constraints
f                        $((2n_n + m_n)m_t + 3n_n n_t)M$
x                        $n_J n_n = (m_t + n_t)n_n$
y                        $n_J n_n = (m_t + n_t)n_n$
z                        $n_J M = (m_t + n_t)M$
ST                       $n_J = (m_t + n_t)$
END                      $n_J = (m_t + n_t)$
Constraint 4 + 4         $(n_t(2n_n + 2) + m_t(n_n + 2))M$
Constraint 4             $(m_n + n_n)M$
Constraint 4             $(n_t n_n + m_t(m_n + n_n))M$
Constraint 4 4           $n_J M = (m_t + n_t)M$
Constraint 4 4           $n_J = (m_t + n_t)$
Constraint 4             $n_J n_n = (m_t + n_t)n_n$
Constraint 4 4 4         $n_J = (m_t + n_t)$
Constraint 4             $2m_t$
Constraint 4 4           $2m_t n_n$

We propose an LP relaxation (LPR) algorithm, consisting of two steps, for the workflow scheduling problem.

1. We determine which resources are selected for the tasks (nodes) of a task graph.

2. The next step is to iteratively determine the start and end times of jobs along with the network resource allocations for data transfer jobs.

The detailed operations of the first-step algorithm are described in Algorithm 4-1. The goal of the first step is to determine the mapping of resources other than the network, and the related binary variables, x and y. In the original MILP formulation, Constraints 4 and 4 ensure that only one x/y variable in each constraint becomes 1. Hence we can turn the solution of the LP relaxation problem into a solution of the original MILP problem by picking the variable with the maximum value, setting it to 1, and setting all the others to 0. In this step, we do not care about the z variables, which are related to time slice assignment.
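The rounding rule of the first step can be sketched in Python as follows. The data layout (one dictionary of relaxed values per task) and the sample values are assumptions made for illustration; solving the LP relaxation itself is outside the scope of this sketch.

```python
def round_assignment(relaxed):
    """Round relaxed assignment variables so each task gets one resource.

    relaxed: dict  task -> dict  resource -> LP value in [0, 1];
    the values for one task are expected to sum to 1.
    """
    rounded = {}
    for task, values in relaxed.items():
        # Resource with the largest relaxed value; ties go to the first one.
        chosen = max(values, key=values.get)
        rounded[task] = {res: int(res == chosen) for res in values}
    return rounded


# Hypothetical LP relaxation output for two tasks and three resources.
relaxed_x = {
    "N1": {"V1": 0.6, "V2": 0.3, "V3": 0.1},
    "N2": {"V1": 0.2, "V2": 0.2, "V3": 0.6},
}
print(round_assignment(relaxed_x))
# {'N1': {'V1': 1, 'V2': 0, 'V3': 0}, 'N2': {'V1': 0, 'V2': 0, 'V3': 1}}
```

After rounding, exactly one variable per task equals 1, so the task assignment constraints of the original MILP are satisfied by construction.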

Algorithm 4-1 First step - Determination of the mapping of tasks except data transfers
Input: A network topology graph $G_n$ and a workflow $G_t$
Relax all the binary variables of the MILP problem, i.e., the x, y, and z variables.
Solve the LP relaxation of the MILP problem.
For each group of relaxed x or y variables whose total sum should be 1, find the variable with the maximum value, set it to 1, and set all other variables in the group to 0.

With the solution of the first-step algorithm, we can determine the start and end times of the unscheduled jobs by solving small MILP problems iteratively. The basic idea is that finding a solution to the MILP problem with determined x/y binary variables and undetermined z binary variables for a small number of jobs, e.g., 3, takes little time. Thus, we can divide the problem into many small problems and solve them sequentially. To pick appropriate jobs, we use the same bottom level priority scheme as the heuristic. However, in our case, the node mapping is already determined. The detailed operations of the second-step algorithm are described in Algorithm 4-2.

Algorithm 4-2 Second step - Determination of the mapping of network resources
Input: A network topology graph $G_n$ and a workflow $G_t$ with fixed resource mapping obtained from Algorithm 4-1
while there are network jobs with unfixed end times do
    Pick 3 non-communication jobs and the associated communication jobs.
    Solve the MILP problem, which has only those jobs and the related z variables as binary variables.
    Update the start and end times of jobs affected by the solution.
end while

As for LP, the computation time is proportional to $p^2 q$ if $q \ge p$, where p is the number of decision variables and q is the number of constraints. The decision variable f is

dominant in the number of decision variables. Suppose that the network resource graph is fixed, as we conduct experiments by varying only the size of workflows in Section 4.6. To address the fast-growing running time with regard to the size of a workflow, we choose another form of multicommodity flow. There exist two kinds of LP formulations for the multicommodity flow problem: the node-arc form and the edge-path form. The MILP formulation in Figure 4-5 takes the node-arc form, which assigns a separate decision variable for a certain job on a certain link. In contrast, the edge-path form assigns a separate decision variable for a certain job on a certain path in a set of paths, P, which the job can take. Accordingly, if we limit the number of paths in the set P, we can reduce the number of decision variables, which leads to better performance in terms of time complexity at the expense of the accuracy of the solution. In [81], the authors showed that the edge-path formulation for bulk file transfers can lead to a near-optimal solution with a reasonable time complexity by using a limited number of pre-defined paths. The edge-path form of the single workflow scheduling problem formulation is presented in Figure 4-7. We will refer to this edge-path based LP relaxation as LPREdge for the rest of this chapter.

The time complexity analysis for the edge-path form formulation is summarized in Table 4-3. Compared to the original formulation, the number of variables and constraints is much reduced. In particular, the time complexity of the edge-path form formulation is much less influenced by the size of a network resource graph, i.e., $m_n$ and $n_n$.

4.5 List Scheduling Heuristic

The extended list scheduling algorithm with the bottom level priority scheme achieves the best performance among priority schemes such as the top level priority scheme [92]. Even though the authors in [67] tried to enhance the performance by considering the properties of a pipelined task graph, the new algorithm does not make much difference in the case of random workflows. Their results show that the new and classic algorithms produce almost the same makespans for workflows with up

Objective

$$\text{minimize} \quad T_f \quad \text{or} \quad \sum_{j \in J} \sum_{p \in P_j} \sum_{m=0}^{M-1} f^j_p(m) \qquad (4)$$

Multi-commodity flow constraints

$$\sum_{0 \le m \ldots}$$
to 60 tasks, and the new algorithm performs at most 5-10% better for workflows with 80 to 100 tasks. Hence, we consider adapting the general extended list scheduling algorithm with the bottom level priority scheme to our random workflows.

Table 4-3. Edge-path form single workflow scheduling formulation time complexity analysis

Variable/Constraint      Number of variables/constraints
f                        $(k m_t + n_n n_t)M$
x                        $n_J n_n = (m_t + n_t)n_n$
y                        $n_J n_n = (m_t + n_t)n_n$
z                        $n_J M = (m_t + n_t)M$
ST                       $n_J = (m_t + n_t)$
END                      $n_J = (m_t + n_t)$
Constraint 4             $n_J M = (m_t + n_t)M$
Constraint 4             $(m_n + n_n)M$
Constraint 4             $k n_J M = k(m_t + n_t)M$
Constraint 4 4           $n_J M = (m_t + n_t)M$
Constraint 4 4           $n_J = (m_t + n_t)$
Constraint 4             $n_J n_n = (m_t + n_t)n_n$
Constraint 4 4 4         $n_J = (m_t + n_t)$
Constraint 4             $2m_t$
Constraint 4 4           $2m_t n_n$

The direct application of the extended list scheduling algorithm proposed in [92] does not fit well into e-Science networks in three aspects. First, the algorithm of [92] allows links on a path to be available at different time periods, as long as descendent links become available after the precedent links of the path. This assumption requires buffers at the ends of links and the intervention of moderators controlling the start and the end of data transfer on each link. Second, [92] does not consider in-advance reservation, which means that only the bandwidth available at the time when the request is made is taken into account for path computation. The extended list scheduling algorithm adapted for e-Science networks is described in Algorithm 4-3.
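The bottom level priority scheme mentioned above is not defined in this section, so the following Python sketch assumes the definition commonly used in the list scheduling literature: the bottom level of a task is its own weight plus the largest sum of edge weight and bottom level over its successors. The function name and the sample graph are hypothetical.

```python
from functools import lru_cache


def bottom_levels(node_weight, edge_weight):
    """Compute bottom levels for a DAG.

    node_weight: dict  task -> computation weight
    edge_weight: dict  (u, v) -> communication weight of edge u -> v
    """
    successors = {n: [] for n in node_weight}
    for (u, v) in edge_weight:
        successors[u].append(v)

    @lru_cache(maxsize=None)
    def bl(n):
        # Longest weighted path from n to an exit task, including n itself.
        tail = max((edge_weight[(n, s)] + bl(s) for s in successors[n]), default=0)
        return node_weight[n] + tail

    return {n: bl(n) for n in node_weight}


weights = {"A": 2, "B": 3, "C": 1, "D": 4}
edges = {("A", "B"): 1, ("A", "C"): 5, ("B", "D"): 2, ("C", "D"): 1}
levels = bottom_levels(weights, edges)
print(levels)  # {'A': 13, 'B': 9, 'C': 6, 'D': 4}
print(sorted(levels, key=levels.get, reverse=True))  # ['A', 'B', 'C', 'D']
```

With positive weights, a task always has a larger bottom level than any of its successors, so processing tasks in decreasing order of bottom level also respects the precedence constraints.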

The changes necessary to adapt the algorithm to in-advance workflow reservations in e-Science networks relate to computing the data transfer time as part of the computation of the earliest finish time. The assumption of synchronized availability of links on a path is reasonable in e-Science networks, and in-advance reservations pose another challenge. In the case of on-demand reservations, we can compute the data transfer time simply as the amount of data divided by the maximum available bandwidth of a path, where the maximum available bandwidth of a path is the minimum of the maximum available bandwidths of the links of the path. Lastly, the available bandwidth varies over time due to the nature of in-advance reservation, which requires careful handling of the data transfer time. We assume rigid mapping for the extended list scheduling, which means the allocated bandwidth of a path does not change over time. To find the data transfer finish time, we use the simple heuristic described in Algorithm 4-4. We will refer to the extended list scheduling algorithm adapted for e-Science networks as LS for the rest of this chapter.

Algorithm 4-3 The adapted extended list scheduling algorithm
Input: A network resource graph $G_n$ and a workflow $G_t$
Determine the priorities of all nodes in $G_t$ based on the bottom level priority scheme.
Order the nodes with respect to priorities while complying with precedence constraints.
for each node in the ordered list in decreasing order of priority do
    Find the resource node that allows the earliest finish time among all candidate nodes by virtually scheduling all incoming data transfers. // Network paths between two nodes are predetermined by BFS.
end for

4.6 Experimental Evaluation

4.6.1 Experiment Setup

We compare the performance of four algorithms: the optimal MILP algorithm, the LP relaxation algorithm, the edge-path form LP relaxation algorithm, and the list

scheduling heuristic of Section 4.5, in terms of the makespan, i.e., the schedule length of workflows, and the computational time of the algorithms. In the following, we refer to the MILP algorithm, the LP relaxation algorithm, the edge-path form LP relaxation algorithm, and the general extended list scheduling algorithm of Section 4.5 as MILP, LPR, LPREdge, and LS, respectively.

Algorithm 4-4 Data transfer finish time computation algorithm
Input: A network resource graph $G_n$ and a data transfer specified by (source, destination, amount of data, start time)
for each time slice whose end time is greater than or equal to start time, in increasing order of the start time of the time slices do
    // Basic interval refers to the time period within which the available bandwidth of links of a path is constant.
    AllocBW ← the available bandwidth of the time slice
    RemainingData ← the amount of data to transfer
    CurTimeSlice ← the time slice
    FinishTime ← the start time of the data transfer
    while RemainingData > 0 do
        if CurTimeSlice has at least as much available bandwidth as AllocBW then
            RemainingData ← RemainingData − the amount of data transferred in the current time slice
            Update FinishTime.
        else
            Exit while
        end if
        CurTimeSlice ← the next time slice
    end while
    if RemainingData = 0 then
        return FinishTime.
    end if
end for

We first compare the performance of all four algorithms on workflows with a small number of nodes, 3. This experiment compares the non-optimal algorithms against the optimal algorithm. We then compare two algorithms, LPREdge and LS, on workflows with a large number of nodes, ranging from 10 to 50

PAGE 91

withanincrementof10.Thesecondexperimentistoverifythatouralgorithmperformsbetterthantheheuristicalgorithmintermsofmakespan. Asanetworkresourcegraph,weusetheAbilenenetwork[ 11 ](seeFigure 4-8 ),whichisdeployedinpractice.Theresourcecapacitiesofnodesofthenetworkresourcegraphaswellasthebandwidthcapacitiesofedgesarerandomlyselectedfromauniformdistributionbetween10to1024.Forworkowgeneration,wecanchooseeitherwayofgeneratingrandomly[ 26 28 42 59 66 ]orsynthesizingworkowsfromasetofpre-determinedworkows[ 60 ].Inourexperiments,weusearandomworkowgenerationmethodthatdependsonthreeparameters:thenumberofnodes,theaveragedegreeofnodesandcommunication-to-computationratio(CCR).Thenumberofnodesisvariedaccordingtotheaforementionedexperiments.Theaveragedegreeofnodesisrelatedtothelevelofparallelismofworkowsandxedto2.ThedifferentCCRsof0.1,1,and10areusedtoassesstheimpactofthecommunicationfactorontheperformanceofthealgorithms.AlargerCCRmeansaworkowismoredata-intensive.Theweightsofnodesofaworkowarerandomlyselectedfromauniformdistributionbetween10to1024astheresourcecapacitiesoftheAbilenenetworkaredetermined.Subsequently,theweightsofedgesofaworkowaresettotheCCRtimestheuniformdistributionbetween10to1024.Onehundredtrialswereforeverycombinationofworkowparameters,thenumberofnodes,CCRandthechosenalgorithm.Wethenaveragedtheresultsandplottedchartsforperformanceevaluation.Eventhoughwepresentformulationsforbothsingleworkowandmultipleworkowscheduling,wehaveconductedexperimentsforthesingleworkowschedulingonly.Thisisbecausewecanunderstandeverymultipleworkowschedulinginstancemaybetransformedintoasinglebigworkowschedulinginstance. AsaMILP/LPsolver,weusedCPLEX,apopularcommercialsoftwarepackage,andthecomputermachinesonwhichCPLEXhasbeeninstalledhavethefollowingspecication;2GHzdualcoreAMDOpteron(tm)Processor280and7Gbytememory. 91
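The random workflow generation described in this subsection can be sketched in Python as follows. This is only an illustrative sketch and not the generator used in the experiments. Only the three parameters (number of nodes, average degree, CCR) and the uniform weight distributions come from the text; the function name random_workflow, the seed parameter, and the edge-sampling rule (edges drawn uniformly among forward pairs of a fixed node order, with about n × degree / 2 edges so that each edge counts toward the degree of both endpoints) are assumptions.

```python
import random


def random_workflow(num_nodes, avg_degree=2.0, ccr=1.0, seed=None):
    """Generate a random workflow DAG (hypothetical helper, see assumptions above).

    Returns (node_weight, edge_weight):
      node_weight maps a task id to its computation weight,
      edge_weight maps a (u, v) pair with u < v to its communication weight.
    """
    rng = random.Random(seed)

    # Node weights: uniform between 10 and 1024, as stated in the text.
    node_weight = {v: rng.uniform(10, 1024) for v in range(num_nodes)}

    # Assumption: edges only go from a lower to a higher task id, which
    # guarantees that the generated graph is acyclic.
    candidates = [(u, v) for u in range(num_nodes) for v in range(u + 1, num_nodes)]

    # Assumption: each edge contributes to the degree of two nodes, so an
    # average degree d over n nodes corresponds to about n * d / 2 edges.
    num_edges = min(len(candidates), round(num_nodes * avg_degree / 2))
    edges = rng.sample(candidates, num_edges)

    # Edge weights: CCR times a uniform draw between 10 and 1024.
    edge_weight = {e: ccr * rng.uniform(10, 1024) for e in edges}
    return node_weight, edge_weight
```

Under these assumptions, a call such as random_workflow(50, 2.0, 10.0) produces a data-intensive workflow with 50 tasks and 50 edges; repeating the call with different seeds would correspond to the repeated trials per parameter combination.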

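As a complement to Section 4.5, the two ways of computing the data transfer time discussed there (the on-demand case and the in-advance heuristic of Algorithm 4-4) can be sketched in Python. This is a minimal sketch under stated assumptions, not the implementation used in the experiments: the function names, the representation of a path's availability as a sorted list of contiguous (start, end, bandwidth) time slices, and the handling of a transfer that starts in the middle of a slice are all hypothetical.

```python
def on_demand_transfer_time(amount, link_bandwidths):
    """On-demand case: amount of data over the bottleneck bandwidth of the path."""
    return amount / min(link_bandwidths)


def transfer_finish_time(slices, amount, start_time):
    """In-advance case, following the idea of Algorithm 4-4 with rigid mapping.

    slices: list of (start, end, bandwidth) tuples for one path, sorted by
            start time and contiguous (assumption of this sketch).
    Returns the finish time, or None if no candidate start slice works.
    """
    for i, (s_start, s_end, s_bw) in enumerate(slices):
        # Skip slices that are over before the transfer may start, and
        # slices without bandwidth (avoids a division by zero).
        if s_end <= start_time or s_bw <= 0:
            continue

        alloc_bw = s_bw          # bandwidth fixed for the whole transfer
        remaining = amount
        t = max(s_start, start_time)

        for c_start, c_end, c_bw in slices[i:]:
            if c_bw < alloc_bw:
                break            # rigid allocation no longer fits; try next start slice
            begin = max(c_start, t)
            capacity = alloc_bw * (c_end - begin)
            if capacity >= remaining:
                return begin + remaining / alloc_bw
            remaining -= capacity
            t = c_end
    return None
```

With slices [(0, 10, 5), (10, 20, 2), (20, 60, 4)] and a transfer that may start at time 0, 40 units of data finish inside the first slice, while 100 units cannot keep a bandwidth of 5 beyond time 10, so the sketch falls back to the next candidate slice. As in the heuristic, the first feasible candidate is returned, which is not necessarily the earliest possible finish time.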

Figure 4-8. The Abilene network

4.6.2 Results

We evaluate the performance of the workflow scheduling algorithms with regard to two metrics: the schedule length of workflows, i.e., the makespan, and the computational (running) time. The detailed results are presented and explained in this subsection.

4.6.2.1 Schedule length of workflows

Comparison against optimal scheduling results. Since the optimal schedules for randomly generated workflows on the given network resource graph, the Abilene network, are not known ahead of time, the only way of evaluating the makespans of the non-optimal algorithms is to compare them against the makespans of the optimal algorithm. In Figure 4-9, we can see that the performance of the non-optimal algorithms, i.e., LPR, LPREdge, and LS, is comparable to that of the optimal algorithm, MILP, when CCR = 0.1 and 1.0. However, as CCR grows to 10, the makespan of LS becomes roughly 2 times the makespan of MILP. In contrast, the makespan of LPR and LPREdge is at most 20% longer than the optimal makespan.

Figure 4-9. Makespan vs. CCR for all algorithms in the Abilene network when the number of nodes in a workflow is 3.

Comparison between LPREdge and LS. As the general workflow scheduling problem is NP-hard, our corresponding formulation, MILP, requires exponential computational time as the size of workflows increases. For large workflows, it is impractical to determine the optimal makespan using the MILP algorithm. For this reason, we compare the makespans of only our non-optimal algorithms, LPREdge and LS, in Figure 4-10. We can see that LPREdge is much better than LS. It achieves half the makespan of LS in some cases.

Figure 4-10. Makespan vs. CCR and the number of nodes in a workflow for LPREdge and LS in the Abilene network

4.6.2.2 Computational time

Comparison against optimal scheduling results. The running time of the optimal algorithm grows exponentially, as shown in Figure 4-11. This algorithm takes approximately 14 seconds when there are 3 nodes and CCR = 0.1. With 3 nodes, the running time becomes approximately 47 seconds when CCR = 10. When the number of nodes is increased to 10 and CCR = 0.1, MILP takes more than 1,500 seconds. By contrast, LPREdge takes less than 5 seconds when there are 3 nodes and less than 150 seconds when the number of nodes is less than 50 (Figure 4-12).

Figure 4-11. Computational time vs. CCR for all algorithms in the Abilene network when the number of nodes in a workflow is 3.

Comparison between LPREdge and LS. The running time of the heuristic is a few seconds, whereas the running time of LPREdge increases linearly up to 150 seconds when the number of nodes is 50.

Figure 4-12. Computational time vs. the number of nodes in a workflow for LPREdge and LS in the Abilene network

If workflow scheduling requests from users are on-demand and must be handled in real time, the computational time of LPREdge observed in the experiments compares unfavorably with that of the fast greedy algorithm. However, when the requests are in-advance, there is enough time between the request arrival time and the request start time, and the centralized server is a more high-end machine, LPREdge is deployable in practice.

4.7 Summary

We have formulated workflow scheduling problems in e-Science networks, whose goal is minimizing either makespan or network resource consumption by jointly scheduling heterogeneous resources such as compute and network resources. The formulations differ from previous work in the literature in that they allow dynamic multiple paths for data transfer between tasks and more flexible resource allocation that may vary over time. In addition, the formulation for single workflow scheduling can be easily extended to multiple workflow scheduling. The computation time of the optimal formulation increases exponentially with the size of a workflow. Accordingly, the LP relaxation algorithm, referred to as LPR, has been developed for deployment in practice, based on the optimal algorithm and the common linear relaxation technique. We also propose the edge-path form LP relaxation algorithm, LPREdge, to improve time complexity.

The experimental results show that the makespan of LPR and LPREdge is comparable to that of the optimal algorithm, less than 20% longer, regardless of CCR for small workflows. In contrast, the general list scheduling algorithm, LS, performs roughly similarly to LPR and LPREdge when CCR = 0.1, but the performance gap between LPR/LPREdge and LS grows dramatically as CCR grows from 1 to 10. Data-intensive workflow scheduling, which is common in e-Science applications, can benefit from dynamic multiple paths and malleable resource allocation. In terms of computational time, the heuristic algorithm is the best because it requires only trivial computations. The LPR and LPREdge algorithms require more computation, which may take a few minutes when the number of nodes is 50. Infrequent workflow scheduling requests from users and a reasonable scheduling time between the arrival time and the start time of requests may relieve this burden somewhat.

To the best of our knowledge, the optimal algorithm, the MILP formulation, is the first algorithm that jointly schedules heterogeneous resources including network resources using dynamic multiple network paths and malleable resource allocation. The approximation based on the optimal algorithm achieves reasonable performance compared with the optimal algorithm in terms of schedule length (makespan). The application of these results to optical networks will be future work.

CHAPTER 5
CONCLUSIONS

We have developed a novel framework for provisioning a variety of e-Science applications that require complex workflows spanning multiple domains. Our framework provides guarantees on performance while incurring minimal overhead, both necessary conditions for such a framework to be adopted in practice.

We developed an SDF-based model for iterative data-dependent e-Science applications that incorporates variable communication delays and temporal constraints, such as throughput. We formulated the problem as a variation of multi-commodity linear programming with the objective of minimizing network resource consumption while meeting temporal constraints.

We also proposed topology aggregation algorithms for e-Science networks. E-Science applications require higher quality intradomain and interdomain QoS paths, and some of them are distinguished from classic single-path single-job (SPSJ) applications. We defined a new class of requests, called multiple-path multiple-job (MPMJ), and proposed TA algorithms for the new class of applications. The proposed algorithms, star and partitioned star ARs, are shown to be significantly better than naive approaches.

Finally, we formulated workflow scheduling problems in e-Science networks, whose goal is minimizing either makespan or network resource consumption by jointly scheduling heterogeneous resources such as compute and network resources. The formulations differ from previous work in the literature in that they allow dynamic multiple paths for data transfer between tasks and more flexible resource allocation that may vary over time. The LP relaxation algorithm has been developed for deployment in practice, based on the optimal algorithm and the common linear relaxation technique. We also proposed the edge-path form LP relaxation algorithm to improve time complexity.

REFERENCES

[1] BER Network Requirements Workshop Final Report. Lawrence Berkeley National Laboratory, 2007. http://www.es.net/pub/esnet-doc/BER-Net-Req-Workshop-2007-Final-Report.pdf ; cited Sep. 2010.
[2] BES Network Requirements Workshop Final Report. Lawrence Berkeley National Laboratory, 2007. http://www.es.net/pub/esnet-doc/BES-Net-Req-Workshop-2007-Final-Report.pdf ; cited Sep. 2010.
[3] EarthScope: An Earth Science Program. EarthScope, 2007. http://www.earthscope.org/usarray/data_flow/archiving.php ; cited Sep. 2010.
[4] Enlightened Computing. MCNC, 2007. http://www.enlightenedcomputing.org/ ; cited Jan. 2008.
[5] GEANT2. DANTE, 2007. http://www.geant2.net/ ; cited Sep. 2010.
[6] CHEETAH: Circuit-switched High-speed End-to-End Transport Architecture. University of Virginia, 2008. http://www.ece.virginia.edu/cheetah/ ; cited Sep. 2010.
[7] FES Network Requirements Workshop Final Report. Lawrence Berkeley National Laboratory, 2008. http://www.es.net/pub/esnet-doc/FES-Net-Req-Workshop-2008-Final-Report.pdf ; cited Sep. 2010.
[8] Ultralight: An Ultrascale Information System for Data Intensive Research. National Science Foundation, 2008. http://www.ultralight.org ; cited Sep. 2010.
[9] CA*net4. CANARIE, 2009. http://www.canarie.ca/canet4/index.html ; cited Sep. 2010.
[10] OSCARS: On-demand Secure Circuits and Advance Reservation System. U.S. Department of Energy, 2009. http://www.es.net/oscars ; cited Sep. 2010.
[11] Abilene. Internet2, 2010. http://abilene.internet2.edu/ ; cited Jan. 2009.
[12] e-Science. The U.K. Research Councils, 2010. http://www.rcuk.ac.uk/escience ; cited Sep. 2010.
[13] The Earth System Grid (ESG). University Corporation for Atmospheric Research, 2010. http://www.earthsystemgrid.org/ ; cited Sep. 2010.
[14] Energy Science Network (ESnet). Lawrence Berkeley National Laboratory, 2010. http://www.es.net ; cited Sep. 2010.
[15] The Globus Alliance. Globus, 2010. http://www.globus.org/
[16] Hybrid Optical and Packet Infrastructure. Internet2, 2010. http://www.internet2.edu/networkresearch/projects.html ; cited Sep. 2010.

[17] Internet2. Internet2, 2010. http://www.internet2.edu ; cited Sep. 2010.
[18] JGNII: Advanced Network Testbed for Research and Development. NICT, 2010. http://www.jgn.nict.go.jp ; cited Sep. 2010.
[19] JIVE. Joint Institute for Very Long Baseline Interferometry, 2010. http://www.jive.nl/ ; cited Sep. 2010.
[20] LHCNet: Transatlantic Networking for the LHC and the U.S. HEP Community. U.S. Department of Energy, 2010. http://lhcnet.caltech.edu/ ; cited Sep. 2010.
[21] National LambdaRail. U.S. research and education community, 2010. http://www.nlr.net ; cited Sep. 2010.
[22] NSF Global Environment for Network Innovations (GENI) Project. GENI, 2010. http://geni.net/ ; cited Sep. 2010.
[23] TeraGrid. National Science Foundation, 2010. http://www.teragrid.org/ ; cited Sep. 2010.
[24] UltraScienceNet. U.S. Department of Energy, 2010. http://www.csm.ornl.gov/ultranet/ ; cited Sep. 2010.
[25] User Controlled LightPath Provisioning. Communications Research Centre, 2010. http://www.uclp.ca/ ; cited Sep. 2010.
[26] Adam, Thomas L., Chandy, K. M., and Dickson, J. R. A comparison of list schedules for parallel processing systems. Commun. ACM 17 (1974).12:685.
[27] Ahuja, Ravindra, Magnanti, T., and Orlin, J. Network flows: theory, algorithms, and applications. Englewood Cliffs N.J.: Prentice Hall, 1993.
[28] Benoit, Anne, Hakem, Mourad, and Robert, Yves. Contention awareness and fault-tolerant scheduling for precedence constrained tasks in heterogeneous systems. Parallel Computing 35 (2009).2:83.
[29] Blake, S., Black, D., Carlson, M., Davies, E., Wang, Z., and Weiss, W. An architecture for differentiated services. RFC 2475, IETF, 1998.
[30] Boutaba, R., Golab, W., Iraqi, Y., Li, T., and Arnaud, B. Grid-controlled lightpaths for high performance grid applications. Journal of Grid Computing 1 (2003).4:387.
[31] Braden, R., Clark, D., and Shenker, S. Integrated services in the internet architecture: An overview. RFC 1633, IETF, 1994.
[32] Brodnik, Andrej and Nilsson, Andreas. A static data structure for discrete advance bandwidth reservations on the Internet. Tech. Rep. Tech report cs.DS/0308041, Department of Computer Science and Electrical Engineering, Lulea University of Technology, Sweden, 2003.
[33] Bunn, J. and Newman, H. Data-intensive grids for high-energy physics. Grid Computing: Making the Global Infrastructure a Reality. eds. F. Berman, G. Fox, and T. Hey. John Wiley & Sons, Inc, 2003.
[34] Burchard, Lars-O. Networks with advance reservations: applications, architecture, and performance. Journal of Network and Systems Management 13 (2005).4:429.
[35] Burchard, Lars-O. and Heiss, Hans-U. Performance issues of bandwidth reservation for grid computing. Proceedings of the 15th Symposium on Computer Architecture and High Performance Computing (SBAC-PAD'03). 2003.
[36] Burchard, Lars-O., Schneider, J., and Linnert, B. Rerouting strategies for networks with advance reservations. Proceedings of the First IEEE International Conference on e-Science and Grid Computing (e-Science 2005). Melbourne, Australia, 2005.
[37] Chen, BinBin and Primet, Pascale Vicat-Blanc. Scheduling deadline-constrained bulk data transfers to minimize network congestion. Proceedings of the Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGRID). 2007.
[38] Chung, Yeh-Ching and Ranka, S. Applications and performance analysis of a compile-time optimization approach for list scheduling algorithms on distributed memory multiprocessors. Supercomputing '92. Proceedings. 1992, 512.
[39] Coffman, E. G. and Graham, R. L. Optimal scheduling for two-processor systems. Acta Informatica 1 (1972).3:200.
[40] Curti, C., Ferrari, T., Gommans, L., van Oudenaarde, B., Ronchieri, E., Giacomini, F., and Vistoli, C. On advance reservation of heterogeneous network paths. Future Generation Computer Systems 21 (2005).4:525.
[41] DeFanti, T., de Laat, C., Mambretti, J., Neggers, K., and Arnaud, B. TransLight: A global-scale LambdaGrid for e-science. Communications of the ACM 46 (2003).11:34.
[42] Dick, Robert P., Rhodes, David L., and Wolf, Wayne. TGFF: task graphs for free. Proceedings of the 6th international workshop on Hardware/software codesign. Seattle, Washington, United States: IEEE Computer Society, 1998, 97.
[43] Mannie, E. (Ed.). Generalized multi-protocol label switching (GMPLS) architecture. RFC 3945, IETF, 2004.

[44] Erlebach, T. Call admission control for advance reservation requests with alternatives. Tech. Rep. TIK-Report Nr. 142, Computer Engineering and Networks Laboratory, Swiss Federal Institute of Technology (ETH) Zurich, 2002.
[45] Farrel, A. A Path Computation Element (PCE)-Based Architecture. 2006.
[46] Ferrari, Tiziana. Grid Network Services Use Cases from the e-Science Community. The Open Grid Forum, 2007.
[47] Foster, I. and Kesselman, C. The Grid: Blueprint for a New Computing Infrastructure. Morgan Kaufmann, 1999.
[48] Foster, I., Kesselman, C., Lee, C., Lindell, R., Nahrstedt, K., and Roy, A. A distributed resource management architecture that supports advance reservations and co-allocation. Proceedings of the International Workshop on Quality of Service (IWQoS '99). 1999.
[49] Govindarajan, R. and Gao, Guang. Rate-optimal schedule for multi-rate DSP computations. The Journal of VLSI Signal Processing 9 (1995).3:211.
[50] Govindarajan, R., Gao, Guang R., and Desai, Palash. Minimizing Buffer Requirements under Rate-Optimal Schedule in Regular Dataflow Networks. The Journal of VLSI Signal Processing 31 (2002).3:207.
[51] Guerin, R. and Orda, A. Networks with advance reservations: The routing perspective. Proceedings of IEEE INFOCOM 99. 1999.
[52] He, E., Wang, X., Vishwanath, V., and Leigh, J. AR-PIN/PDC: Flexible advance reservation of intradomain and interdomain lightpaths. Proceedings of the IEEE GLOBECOM 2006. 2006.
[53] Hutanu, Andrei, Allen, Gabrielle, Beck, Stephen D., Holub, Petr, Kaiser, Hartmut, Kulshrestha, Archit, Liška, Miloš, MacLaren, Jon, Matyska, Luděk, Paruchuri, Ravi, Prohaska, Steffen, Seidel, Ed, Ullmer, Brygg, and Venkataraman, Shalini. Distributed and collaborative visualization of large data sets using high-speed networks. Future Gener. Comput. Syst. 22 (2006).8:1004.
[54] Mambretti, J., et al. The Photonic TeraStream: enabling next generation applications through intelligent optical networking at iGRID 2002. Future Generation Computer Systems 19 (2003).6:897-908.
[55] Johnston, W. E., Metzger, J., O'Connor, M., Collins, M., Burrescia, J., Dart, E., Gagliardi, J., Guok, C., and Oberman, K. Network Communication as a Service-Oriented Capability. High Performance Computing and Grids in Action, Vol. 16. ed. L. Grandinetti. IOS Press, 2008.
[56] Jung, E., Li, Y., Ranka, S., and Sahni, S. Performance evaluation of routing and wavelength assignment algorithms for optical networks. IEEE Symposium on Computers and Communications. 2008.

[57] Jung, E., Li, Y., Ranka, S., and Sahni, S. An evaluation of in-advance bandwidth scheduling algorithms for connection-oriented networks. International Symp. on Parallel Architectures, Algorithms, and Networks (ISPAN). 2008.
[58] Karypis, George and Kumar, Vipin. MeTis: Unstructured Graph Partitioning and Sparse Matrix Ordering System, Version 2.0, 1995.
[59] Khan, A. A., Mccreary, C. L., and Jones, M. S. A Comparison of Multiprocessor Scheduling Heuristics. Parallel Processing, 1994. ICPP 1994. International Conference on. vol. 2. 1994, 243.
[60] Kwok, Y.-K. and Ahmad, I. Benchmarking the task graph scheduling algorithms. Parallel Processing Symposium, 1998. IPPS/SPDP 1998. Proceedings of the First Merged International ... and Symposium on Parallel and Distributed Processing 1998. 1998, 531.
[61] Lee, E. A. and Ha, S. Scheduling strategies for multiprocessor real-time DSP. Global Telecommunications Conference, 1989, and Exhibition. Communications Technology for the 1990s and Beyond. GLOBECOM '89., IEEE. 1989, 1279 vol. 2.
[62] Lee, Edward Ashford. A coupled hardware and software architecture for programmable digital signal processors (synchronous dataflow). Ph.D. thesis, University of California, Berkeley, 1986.
[63] Lehman, T., Sobieski, J., and Jabbari, B. DRAGON: A framework for service provisioning in heterogeneous grid networks. IEEE Communications Magazine (2006).
[64] Lewin-Eytan, L., Naor, J., and Orda, A. Routing and admission control in networks with advance reservations. Proceedings of the Fifth International Workshop on Approximation Algorithms for Combinatorial Optimization (APPROX 02). 2002.
[65] Li, Yan, Ranka, S., and Sahni, S. In-advance path reservation for file transfers in e-Science applications. Computers and Communications, 2009. ISCC 2009. IEEE Symposium on. 2009, 176.
[66] Liu, Xin. Application-Specific, Agile and Private (ASAP) Platforms for Federated Computing Services over WDM Networks. Ph.D. thesis, The State University of New York at Buffalo, 2009.
[67] Liu, Xin, Wei, Wei, Qiao, Chunming, Wang, Ting, Hu, Weisheng, Guo, Wei, and Wu, Min-You. Task Scheduling and Lightpath Establishment in Optical Grids. INFOCOM 2008. The 27th Conference on Computer Communications. IEEE. 2008, 1966.

[68] Lui, King-Shan, Nahrstedt, K., and Chen, Shigang. Routing with topology aggregation in delay-bandwidth sensitive networks. Networking, IEEE/ACM Transactions on 12 (2004): 17.
[69] Luo, Xubin and Wang, Bin. Integrated Scheduling of Grid Applications in WDM Optical Light-Trail Networks. Journal of Lightwave Technology 27 (2009).12:1785.
[70] Marchal, L., Primet, P., Robert, Y., and Zeng, J. Optimal bandwidth sharing in grid environment. Proceedings of IEEE High Performance Distributed Computing (HPDC). 2006.
[71] McDysan, D. E. and Spohn, D. L. ATM Theory and Applications. McGraw-Hill, 1998.
[72] Medina, A., Lakhina, A., Matta, I., and Byers, J. BRITE: an approach to universal topology generation. Modeling, Analysis and Simulation of Computer and Telecommunication Systems, 2001. Proceedings. Ninth International Symposium on (2001): 346.
[73] Munir, K., Javed, S., and Welzl, M. A Reliable and Realistic Approach of Advance Network Reservations with Guaranteed Completion Time for Bulk Data Transfers in Grids. Proceedings of ACM International Conference on Networks for Grid Applications (GridNets 2007). San Jose, California, 2007.
[74] Munir, K., Javed, S., Welzl, M., Ehsan, H., and Javed, T. An End-to-End QoS Mechanism for Grid Bulk Data Transfer for Supporting Virtualization. Proceedings of IEEE/IFIP International Workshop on End-to-end Virtualization and Grid Management (EVGM 2007). San Jose, California, 2007.
[75] Munir, K., Javed, S., Welzl, M., and Junaid, M. Using an Event Based Priority Queue for Reliable and Opportunistic Scheduling of Bulk Data Transfers in Grid Networks. Proceedings of the 11th IEEE International Multitopic Conference (INMIC 2007). 2007.
[76] Murthy, Praveen. Scheduling techniques for synchronous and multidimensional synchronous dataflow. Berkeley: Electronics Research Laboratory College of Engineering University of California, 1996.
[77] Naiksatam, Sumit and Figueira, Silvia. Elastic Reservations for Efficient Bandwidth Utilization in LambdaGrids. The International Journal of Grid Computing 23 (2007).1:1.
[78] Newman, H. B., Ellisman, M. H., and Orcutt, J. A. Data-intensive e-science frontier research. Communications of the ACM 46 (2003).11:68.

[79] Palis, M. A., Liou, Jing-Chiou, and Wei, D. S. L. Task clustering and scheduling for distributed memory parallel architectures. Parallel and Distributed Systems, IEEE Transactions on 7 (1996).1:46.
[80] Pelsser and Bonaventure. Path Selection Techniques to Establish Constrained Interdomain MPLS LSPs. 2006, 209.
[81] Rajah, K., Ranka, S., and Xia, Ye. Scheduling Bulk File Transfers with Start and End Times. Network Computing and Applications, 2007. NCA 2007. Sixth IEEE International Symposium on. 2007, 295.
[82] Rajah, Kannan, Ranka, Sanjay, and Xia, Ye. Scheduling Bulk File Transfers with Start and End Times. Computer Networks 52 (2008).5:1105.
[83] Rao, N. S., Carter, S. M., Wu, Q., Wing, W. R., Zhu, M., Mezzacappa, A., Veeraraghavan, M., and Blondin, J. M. Networking for large-scale science: Infrastructure, provisioning, transport and application mapping. Proceedings of SciDAC Meeting. 2005.
[84] Reiter, Raymond. Scheduling Parallel Computations. J. ACM 15 (1968).4:590.
[85] Ricciato, F., Monaco, U., and Ali, D. Distributed schemes for diverse path computation in multidomain MPLS networks. Communications Magazine, IEEE 43 (2005): 138.
[86] Rosen, E., Viswanathan, A., and Callon, R. Multiprotocol label switching architecture. RFC 3031, IETF, 2001.
[87] Sahni, S., Rao, N., Ranka, S., Li, Y., Jung, E., and Kamath, N. Bandwidth scheduling and path computation algorithms for connection-oriented networks. International Conference on Networking. 2007.
[88] Sarangan, V., Ghosh, D., and Acharya, R. Performance analysis of capacity-aware state aggregation for inter-domain QoS routing. Global Telecommunications Conference, 2004. GLOBECOM '04. IEEE 3 (2004): 1458 Vol. 3.
[89] Schelen, O., Nilsson, A., Norrgard, Joakim, and Pink, S. Performance of QoS agents for provisioning network resources. Proceedings of IFIP Seventh International Workshop on Quality of Service (IWQoS '99). London, UK, 1999.
[90] Sinnen, O. and Sousa, L. A. Communication contention in task scheduling. Parallel and Distributed Systems, IEEE Transactions on 16 (2005).6:503-515.
[91] Sinnen, Oliver. Task scheduling for parallel systems. Hoboken N.J.: Wiley-Interscience, 2007.

[92] Sinnen, Oliver and Sousa, Leonel. List scheduling: extension for contention awareness and evaluation of node priorities for heterogeneous cluster architectures. Parallel Computing 30 (2004).1:81.
[93] Sprintson, A., Yannuzzi, M., Orda, A., and Masip-Bruin, X. Reliable Routing with QoS Guarantees for Multi-Domain IP/MPLS Networks. INFOCOM 2007. 26th IEEE International Conference on Computer Communications. IEEE (2007): 1820.
[94] Stuijk, S., Geilen, M., and Basten, T. Throughput-Buffering Trade-Off Exploration for Cyclo-Static and Synchronous Dataflow Graphs. Computers, IEEE Transactions on 57 (2008).10:1331.
[95] Sun, Zhenyu, Guo, Wei, Wang, Zhengyu, Jin, Yaohui, Sun, Weiqiang, Hu, Weisheng, and Qiao, Chunming. Scheduling Algorithm for Workflow-Based Applications in Optical Grid. Journal of Lightwave Technology 26 (2008).17:3011.
[96] Thorpe, S. R., Stevenson, D., and Edwards, G. K. Using just-in-time to enable optical networking for grids. First ICST/IEEE International Workshop on Networks for Grid Applications (GridNets 2004). 2004.
[97] Topcuoglu, H., Hariri, S., and Wu, Min-You. Performance-effective and low-complexity task scheduling for heterogeneous computing. Parallel and Distributed Systems, IEEE Transactions on 13 (2002).3:260.
[98] Uludag, Suleyman, Lui, King-Shan, Nahrstedt, Klara, and Brewster, Gregory. Analysis of Topology Aggregation techniques for QoS routing. ACM Comput. Surv. 39 (2007): 7.
[99] Wang, Lee, Siegel, Howard Jay, Roychowdhury, Vwani P., and Maciejewski, Anthony A. Task Matching and Scheduling in Heterogeneous Computing Environments Using a Genetic-Algorithm-Based Approach. Journal of Parallel and Distributed Computing 47 (1997).1:8.
[100] Wang, Tao and Chen, Jianer. Bandwidth tree - A data structure for routing in networks with advanced reservations. Proceedings of the IEEE International Performance, Computing and Communications Conference (IPCCC 2002). 2002.
[101] Wang, Yan, Jin, Yaohui, Guo, Wei, Sun, Weiqiang, Hu, Weisheng, and Wu, Min-You. Joint scheduling for optical grid applications. Journal of Optical Networking 6 (2007).3:304.
[102] Wieczorek, Marek, Hoheisel, Andreas, and Prodan, Radu. Taxonomies of the Multi-Criteria Grid Workflow Scheduling Problem. Grid Middleware and Services. 2008. 237.

[103] Wiggers, M., Bekooij, M., Jansen, P., and Smit, G. Efficient computation of buffer capacities for multi-rate real-time systems with back-pressure. Hardware/software codesign and system synthesis, 2006. CODES+ISSS '06. Proceedings of the 4th international conference. 2006, 10.
[104] Xiong, Qing, Wu, Chanle, Xing, Jianbing, Wu, Libing, and Zhang, Huyin. A linked-list data structure for advance reservation admission control. ICCNMC 2005. 2005. Lecture Notes in Computer Science, Volume 3619/2005.
[105] Yang, Tao and Gerasoulis, Apostolos. PYRROS: static task scheduling and code generation for message passing multiprocessors. Proceedings of the 6th international conference on Supercomputing. Washington, D.C., United States: ACM, 1992, 428.
[106] Yang, Xi, Lehman, Tom, Tracy, Chris, Sobieski, Jerry, Gong, Shujia, Torab, Payam, and Jabbari, Bijan. Policy-Based Resource Management and Service Provisioning in GMPLS Networks. Proceedings of IEEE INFOCOM. 2006.
[107] Yannuzzi, M., Masip-Bruin, X., and Bonaventure, O. Open issues in interdomain routing: a survey. Network, IEEE 19 (2005): 49.
[108] Yoo, Younghwan, Ahn, Sanghyun, and Kim, Chong Sang. Link state aggregation using a shufflenet in ATM PNNI networks. Global Telecommunications Conference, 2000. GLOBECOM '00. IEEE. vol. 1. 2000, 481 vol. 1.
[109] Zhang, Z. L., Duan, Z., and Hou, Y. T. Decoupling QoS control from core routers: A novel bandwidth broker architecture for scalable support of guaranteed services. Proc. ACM SIGCOMM. 2000.
[110] Zheng, Jun, Zhang, Baoxian, and Mouftah, H. T. Toward automated provisioning of advance reservation service in next-generation optical internet. Communications Magazine, IEEE 44 (2006).12:68.

BIOGRAPHICAL SKETCH

Eun-Sung Jung received B.S. and M.S. degrees in electrical engineering from Seoul National University, Korea, in 1996 and 1998, respectively. His research interests include network optimization in connection-oriented networks and its applications to existing research networks.