
Complex Networks under Attacks

Permanent Link: http://ufdc.ufl.edu/UFE0045412/00001

Material Information

Title: Complex Networks under Attacks: Vulnerability Assessment and Optimization
Physical Description: 1 online resource (148 p.)
Language: english
Creator: Dinh, Thang N
Publisher: University of Florida
Place of Publication: Gainesville, Fla.
Publication Date: 2013

Subjects

Subjects / Keywords: approximation-algorithm -- complex-network -- pairwise-connectivity -- vulnerability-assessment
Computer and Information Science and Engineering -- Dissertations, Academic -- UF
Genre: Computer Engineering thesis, Ph.D.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, territorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract: Complex network systems are extremely vulnerable to attacks. In the presence of uncertainty, assessing network vulnerability before potential malicious attacks is vital for network planning and risk management. In this dissertation, we apply optimization theory and approximation techniques to address the following fundamental questions: How do we quantitatively measure the degree of vulnerability of a network? How do we identify critical infrastructures in the network in the context of individual failures and/or cascading failures that spread from one node to another across the network structure? The dissertation provides several new theoretical frameworks and approximation algorithms to characterize network vulnerability and critical infrastructures, which advances the understanding of network vulnerability. The dissertation tackles the above questions by drawing on and contributing new techniques to several research areas such as graph theory, approximation algorithms, mathematical programming, and computational complexity. This research can potentially impact many applications that benefit from networks, such as the Internet, smart grids, and transportation networks, where vulnerability is an important characteristic.
General Note: In the series University of Florida Digital Collections.
General Note: Includes vita.
Bibliography: Includes bibliographical references.
Source of Description: Description based on online resource; title from PDF title page.
Source of Description: This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility: by Thang N Dinh.
Thesis: Thesis (Ph.D.)--University of Florida, 2013.
Local: Adviser: Thai, My Tra.
Electronic Access: RESTRICTED TO UF STUDENTS, STAFF, FACULTY, AND ON-CAMPUS USE UNTIL 2015-05-31

Record Information

Source Institution: UFRGP
Rights Management: Applicable rights reserved.
Classification: lcc - LD1780 2013
System ID: UFE0045412:00001




Full Text

PAGE 1

COMPLEX NETWORKS UNDER ATTACKS: VULNERABILITY ASSESSMENT AND OPTIMIZATION

By
THANG N. DINH

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA
2013

PAGE 2

© 2013 Thang N. Dinh

PAGE 3

I dedicate this dissertation to my loving parents, Kien Duong and Thuy Dinh, my wife Phuong-Thao, and my daughter Linh-Thu.

PAGE 4

ACKNOWLEDGMENTS

This dissertation would not have been possible without the help, support, guidance and efforts of so many people in so many ways. Firstly, I would like to thank my mentor Dr. My T. Thai for her invaluable support throughout my Ph.D. program. I will always treasure her infectious passion, preciseness, profound knowledge, and valuable advice on becoming a good scientist and, more importantly, a better person.

I am also very grateful to my committee members, Dr. Ahmed Helmy, Dr. Tamer Kahveci, Dr. Panagote Pardalos, and Dr. Sartaj Sahni, for the support they have lent me over all these years as well as their valuable support for my future career. I would also like to thank Dr. Meera Sitharam for helping me with my questions and for many insightful discussions.

I would like to acknowledge and thank my friends Ravi Tiwari, Incheol Shin, Ying Xuan, Nam Nguyen, Dzung Nguyen, Yilin Shen, and Ferhat Ay for their help in research and all the fun we had together. Finally, I would like to take the opportunity to thank all my teachers who have equipped me with the requisite knowledge and skills.

My research was partially funded by NSF CAREER 0953284.

PAGE 5

TABLE OF CONTENTS

page

ACKNOWLEDGMENTS .... 4
LIST OF TABLES .... 8
LIST OF FIGURES .... 9
ABSTRACT .... 11

CHAPTER

1 INTRODUCTION .... 12
  1.1 Connectivity-based Vulnerability Assessment .... 13
    1.1.1 Motivation .... 13
    1.1.2 $\beta$-disruptor Problems .... 14
    1.1.3 Related Work .... 15
    1.1.4 Contributions .... 16
  1.2 Cascading Failures in Critical Infrastructures .... 17
    1.2.1 Problem Definitions .... 18
    1.2.2 Related Work .... 20
    1.2.3 Contributions .... 21
2 MULTIPLE LINK ATTACKS .... 22
  2.1 Complexity of Finding Disruptor .... 23
    2.1.1 NP-completeness of Edge Disruptor .... 24
    2.1.2 Hardness of Approximation: Vertex Disruptor .... 27
  2.2 Bicriteria Approximation Algorithm for $\beta$-edge Disruptor .... 28
    2.2.1 Balanced Tree-Decomposition .... 28
    2.2.2 Dynamic Programming Algorithm on the Decomposition Tree .... 29
  2.3 Bounds on the Size of Edge Disruptor .... 35
    2.3.1 Laplacian Matrix and Its Eigenvalues .... 37
    2.3.2 Spectral Lower-bound for Link Assessment .... 39
      2.3.2.1 Dynamic Programming Method .... 40
      2.3.2.2 Lagrange Multipliers Method .... 43
      2.3.2.3 Time and quality trade-off .... 46
    2.3.3 Experimental Results .... 48
      2.3.3.1 Synthetic Networks .... 48
      2.3.3.2 Real-world Datasets .... 49
3 MULTIPLE NODE ATTACKS .... 51
  3.1 Bicriteria Approximation Algorithm for $\beta$-vertex Disruptor .... 51
  3.2 Connection between Edge Disruptor and Vertex Disruptor .... 55
  3.3 Branch-and-cut Algorithm .... 57

PAGE 6

    3.3.1 Mixed Integer Programming Formulation .... 58
    3.3.2 Sparse Metric Technique .... 59
    3.3.3 Cutting Planes .... 63
      3.3.3.1 Vertex-Connectivity and Invalid Inequalities .... 63
      3.3.3.2 Separation Procedure for VC Inequalities .... 64
    3.3.4 Primal Heuristic .... 65
  3.4 Experimental study .... 66
    3.4.1 Performance of the Branch-and-cut Algorithm .... 69
    3.4.2 Case study: Western States Power Grid .... 70
4 JOINT LINK AND NODE ATTACKS .... 72
  4.1 Joint Links and Node Attacks .... 72
    4.1.1 Mixed Integer Linear Programming .... 76
    4.1.2 Nodes and Edges with Excess Costs .... 77
  4.2 Bicriteria Approximation Algorithm for Joint Link & Node Attacks .... 77
    4.2.1 Algorithm Description .... 78
    4.2.2 Analysis of Approximation Ratio .... 80
  4.3 Hybrid Meta-heuristic .... 85
    4.3.1 Controlling the pairwise connectivity .... 85
    4.3.2 Spectral Partition .... 86
    4.3.3 Variable Neighborhood Search .... 88
  4.4 Experimental Studies .... 88
    4.4.1 Experiment Setups .... 88
    4.4.2 Comparison of the three disruptor types .... 91
    4.4.3 Synthesis Networks of Different Topologies .... 92
    4.4.4 AS Relationships Networks .... 94
5 VULNERABILITY ASSESSMENT IN PROBABILISTIC NETWORKS .... 96
  5.1 Probabilistic Networks .... 97
    5.1.1 Probabilistic Network Model .... 97
    5.1.2 Expected Pairwise Connectivity .... 97
    5.1.3 Vulnerability Assessment .... 98
  5.2 Estimation of Connectivity in Probabilistic Networks .... 98
    5.2.1 #P-Completeness .... 98
    5.2.2 Monte-Carlo Methods to Approximate EPC .... 100
    5.2.3 Fully Polynomial Time Approximation Scheme .... 102
      5.2.3.1 Component Sampling Algorithm .... 102
  5.3 Vulnerability Assessment using EPC .... 105
    5.3.1 Approximating via the Expectation Graph .... 108
    5.3.2 Sample Average Approximation (SAA) Method .... 111
    5.3.3 Local Search Heuristic .... 113

PAGE 7

6 CASCADING-FAILURES IN NETWORKS .... 116
  6.1 Seeding Cost of Massive Outbreak .... 116
    6.1.1 Power-law Network Model .... 116
    6.1.2 Prohibitive Seeding Costs .... 117
  6.2 Algorithm to Identify the Minimum Outbreak Seeding .... 120
  6.3 Hardness of the CFM Problem .... 124
    6.3.1 Feige's Reduction for Set Cover .... 124
    6.3.2 One-hop CFM .... 126
    6.3.3 Multiple-hop CFM .... 131
  6.4 Empirical Study .... 133
    6.4.1 Comparing to Optimal Seeding .... 133
    6.4.2 Large Social Networks .... 136
    6.4.3 Solution Quality in Large Social Networks .... 136
    6.4.4 Scalability .... 138
    6.4.5 Influence factor .... 138
7 CONCLUSION .... 139

REFERENCES .... 141
BIOGRAPHICAL SKETCH .... 148

PAGE 8

LIST OF TABLES

Table    page

2-1 Sizes of the investigated networks and the corresponding running time to compute the lower-bound .... 49
3-1 Size of disruptor on Erdos-Renyi networks at 60% connectivity .... 67
3-2 Size of disruptor on Barabasi-Albert networks at 60% connectivity .... 67
3-3 Comparisons of IPvd and MIPvd on power-law networks .... 69
6-1 Sizes of the investigated networks .... 136

PAGE 9

LIST OF FIGURES

Figure    page

2-1 After removing the vertices with maximum out-going degree (in black), network (A) is still strongly connected while (B) is fragmented; however, removing the grey vertex is enough to destroy (A) .... 23
2-2 Construction of $H(V_H, E_H)$ from $G(V, E)$ .... 25
2-3 $F = \{t_2, t_3, t_5, t_6\}$ is a G-partitionable in the decomposition tree. The corresponding partition $\{V(t_2), V(t_3), V(t_5), V(t_6)\}$ can be obtained by using cuts at $t_0, t_1, t_4$ .... 31
2-4 Minimum cost and lower-bounds for $\beta$-disruptor on the synthesis networks .... 47
2-5 Running time on the synthesis networks .... 47
2-6 Lower bounds on the number of link attacks for real networks found with the LMB algorithm .... 50
3-1 Conversion from the node version in a directed graph (a) into the edge version in a directed graph (b) .... 55
3-2 Disruptors found by different methods in the Western States Power Grid of the United States at different levels of disruption .... 67
3-3 Visualization of disruptors in the Western States Power Grid .... 68
4-1 Minimum cost solutions to reduce 50% of the connectivity assuming links have cost 2 and nodes have cost 3 .... 73
4-2 Removing $(u, v)$ breaks the network into four strongly connected components, including the red and green components which are not even incident to $(u, v)$ .... 80
4-3 SCCs in the residual graph $G[E \setminus M]$. The cut $\langle S, \bar{S} \rangle$ consists of a subset of $M$: edges from $C_4$ to $C_3$ and from $C_5$ to $C_2$ .... 81
4-4 The normalized optimal costs of three different disruptor types on the US Backbone network .... 89
4-5 Costs of disruptor algorithms on the synthesis networks (the lower the better) .... 90
4-6 Running time of disruptor algorithms on the synthesis networks .... 91
4-7 Oregon AS network .... 94
4-8 CAIDA AS network .... 95
6-1 The influence propagation in the network .... 118
6-2 Reduction from SCB to CFM when $d = 1$ .... 127

PAGE 10

6-3 The transmitter gadget .... 131
6-4 Gap-reduction from one-hop CFM to d-hop CFM .... 132
6-5 Seeding size (in percent) on Erdos's Collaboration network. VirAds produces close to the optimal seeding in only fractions of a second .... 133
6-6 Seeding size when the number of propagation hops $d$ varies ($\rho = 0.3$). VirAds consistently has the best performance .... 135
6-7 Running time when the number of propagation hops $d$ varies ($\rho = 0.3$). Even for the largest network of 110 million edges, VirAds takes less than 12 minutes .... 135
6-8 Degree distribution of studied networks .... 136
6-9 Seeding size at different influence factors (the maximum number of propagation hops is $d = 4$) .... 137

PAGE 11

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

COMPLEX NETWORKS UNDER ATTACKS: VULNERABILITY ASSESSMENT AND OPTIMIZATION

By Thang N. Dinh
May 2013
Chair: My T. Thai
Major: Computer Engineering

Complex network systems are extremely vulnerable to attacks. In the presence of uncertainty, assessing network vulnerability before potential malicious attacks is vital for network planning and risk management. In this dissertation, we apply optimization theory and approximation techniques to address the following fundamental questions: How do we quantitatively measure the degree of vulnerability of a network? How do we identify critical infrastructures in the network in the context of individual failures and/or cascading failures that spread from one node to another across the network structure? The dissertation provides several new theoretical frameworks and approximation algorithms to characterize network vulnerability and critical infrastructures, which advances the understanding of network vulnerability. The dissertation tackles the above questions by drawing on and contributing new techniques to several research areas such as graph theory, approximation algorithms, mathematical programming, and computational complexity. This research can potentially impact many applications that benefit from networks, such as the Internet, smart grids, and transportation networks, where vulnerability is an important characteristic.

PAGE 12

CHAPTER 1
INTRODUCTION

Assessing network vulnerability before potential disruptive events such as natural disasters or malicious attacks is vital for network planning and risk management. It enables us to seek and safeguard against the most destructive scenarios in which the overall network connectivity falls dramatically. There have been numerous efforts on proposing evaluation measures of network vulnerability, as summarized in [48]. On one hand, several global graph measures, such as Cyclomatic number, Maximum network circuits, Alpha index, and Beta index, which investigate basic graph properties, i.e., number of vertices, edges and pairwise shortest paths, are adopted to evaluate the network vulnerability. However, these global measures can neither be rigorously mapped to the overall network connectivity nor reveal the set of most critical vertices and edges, and thus are not suitable for assessing network vulnerability in terms of connectivity. On the other hand, researchers have focused on local nodal centrality [19], such as degree centrality, betweenness centrality and closeness centrality, in order to differentiate the critical vertices from the others, and further evaluate the network by quantifying such vertices. Unfortunately, because these local properties cannot be cast to global network connectivity, these measures fail to indicate accurate vulnerabilities and cannot reveal the global damage done to the network under attacks.

To this end, in the first part of this dissertation, we investigate a measure called pairwise connectivity and formulate the vulnerability assessment problem as new graph-theoretical optimization problems. The pairwise connectivity is the sum of the connectivity of every vertex pair, which is quantified as 1 if the pair is (strongly) connected and 0 if not. Our new optimization problems, called $\beta$-vertex disruptor and $\beta$-edge disruptor, aim to discover the set of critical nodes/edges whose removal results in the sharpest decline of the pairwise connectivity. With respect to a level of connectivity disruption, the more

PAGE 13

vertices/edges required to be removed, the less vulnerable the network is; conversely, the fewer vertices/edges need to be removed, the easier the network is to destroy. The $\beta$-disruptor problems are defined in Section 1.1.

The second part of this dissertation focuses on assessing network vulnerability against cascading failures that spread among nodes of a power grid or communication network during a widespread outage, among financial institutions during a financial crisis, or through a human population during the outbreak of an epidemic disease. We develop a new measurement for cascading-resilience in networks subject to such cascading failures. The cascading-resilience (a.k.a. network vulnerability) is measured as the minimum size of a set of nodes that can trigger an outbreak of failure to the whole network in a short amount of time. Thus, we formulate the measuring of cascading-resilience as an optimization problem, called the cost-effective massive outbreak problem (CFM).

Since all the formulated optimization problems are shown to be NP-complete, efficient algorithms that find exact solutions for them are unlikely to exist. Thus, we focus on designing algorithms with guarantees on their performance, which are known as approximation algorithms. Furthermore, we also devote one part of the dissertation to designing scalable algorithms for large-scale networks, which have hundreds of millions of links. Such algorithms are essential for benefiting from the availability of big data.

1.1 Connectivity-based Vulnerability Assessment

1.1.1 Motivation

Connectivity plays a vital role in network performance and is fundamental to vulnerability assessment. Potential disruptive events, such as natural disasters or malicious attacks, which always destroy a set of interacting elements or connections, can dramatically compromise the connectivity and result in a considerable decline of the network QoS, or even break down the whole network [25, 27, 48, 64, 65, 71]. In view of this concern, proactive evaluation of the network vulnerability with respect to connectivity,

PAGE 14

in order to defend against such potential disruptions, is essential and beneficial to the design and maintenance of any infrastructure network, for example, communication, commercial, and social networks.

1.1.2 $\beta$-disruptor Problems

Besides the homogeneous network model consisting of uniform nodes and bidirectional links, the heterogeneous network model, where various interacting elements of different kinds are connected through unidirectional links with non-uniform expenses, finds numerous applications nowadays [55, 61, 77], but also poses multiple difficulties for optimization and maintenance. In light of this, we abstract our general network model as a directed graph $G(V, E)$, where $V$ refers to a set of nodes and $E$ refers to a set of unidirectional links. The expense of each directed edge $(u, v)$ between vertices $u$ and $v$ is quantified as a nonnegative value $c(u, v)$, for all the $m = |E|$ links among $n = |V|$ nodes.

As mentioned above, our evaluation of network vulnerability is based on the value of the overall pairwise connectivity in the abstracted graph, which is defined as follows: given any vertex pair $(u, v) \in V \times V$ in the graph, we say that they are connected iff there exist paths between $u$ and $v$ in both directions in $G$, i.e., they are strongly connected to each other. The pairwise connectivity $p(u, v)$ is quantified as 1 if this pair is connected, and 0 otherwise. Since the main purpose of a network lies in connecting all the interacting elements, we study the aggregate pairwise connectivity between all pairs, that is, the sum of the quantified pairwise connectivity, which we denote as $P(G) = \sum_{u,v \in V \times V} p(u, v)$ for graph $G$. Apparently $P(G)$ is maximized at $\binom{n}{2}$ when $G$ is a strongly connected graph. Based on this, we have:

Definition 1. (Edge disruptor) Given $0 \le \beta \le 1$, a subset $S \subseteq E$ in $G = (V, E)$ is a $\beta$-edge disruptor if the overall pairwise connectivity in $G[E \setminus S]$, obtained by removing $S$ from $G$, is no more than $\beta \binom{n}{2}$. By minimizing the cost of such edges in $S$, we have the

PAGE 15

$\beta$-edge disruptor problem, i.e., find a minimum-cost $\beta$-edge disruptor in a strongly connected graph $G(V, E)$.

Recall that $G$ is strongly connected iff for every vertex $v$ in $G$, there is a directed path from $v$ to all other vertices. A subgraph of $G$ is called a strongly connected component (SCC) iff it is a maximal subgraph of $G$ with all vertex pairs $u, v$ within it connected by directed paths in both directions. Assume that a $\beta$-edge disruptor disrupts the connectivity in $G(V, E)$ by separating it into several smaller SCCs, say $C_i$ for $i = 1, \ldots, l$, i.e., $V = \biguplus_{i=1}^{l} C_i$. We have:

\[ P(G) = \sum_{i=1}^{l} \binom{|C_i|}{2} = \frac{1}{2}\left(\sum_{i=1}^{l} |C_i|^2 - |V|\right) = \frac{1}{2}\left(\frac{n^2}{l} - n\right) + \frac{l}{2}\,\mathrm{Var}(C), \]

where

\[ \mathrm{Var}(C) = \frac{1}{l}\sum_{i=1}^{l} \left(|C_i| - \bar{C}\right)^2 = \frac{1}{l}\sum_{i=1}^{l} \left(|C_i| - \frac{n}{l}\right)^2. \]

Therefore, the two key factors affecting pairwise connectivity are the number of SCCs and the variance of their sizes. They provide us with an alternative measure for evaluating the structural balance and fragmentation of the network.

Similarly, we define the $\beta$-vertex disruptor and its corresponding optimization problem:

$\beta$-vertex disruptor problem: Given a strongly connected graph $G(V, E)$ and a fixed number $0 \le \beta \le 1$, find a subset $S \subseteq V$ with the minimum size such that the total pairwise connectivity in $G[V \setminus S]$, obtained by removing $S$ from $G$, is no more than $\beta \binom{n}{2}$. Such a set $S$ is called a $\beta$-vertex disruptor.

1.1.3 Related Work

The classic vulnerability measurements are mainly based on the centrality of each vertex in the graph, which includes degree centrality, betweenness, closeness, and eigenvector centrality [19]. However, these measures fail to indicate accurate vulnerabilities and cannot reveal the global damage done to the network under attacks.
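Returning to the pairwise connectivity measure of Section 1.1.2, the following is a minimal Python sketch, not the dissertation's implementation, of how $P(G)$ and the edge-disruptor condition could be evaluated on a small directed graph. It assumes pairs are counted as unordered (consistent with the stated maximum of $\binom{n}{2}$) and uses Kosaraju's algorithm to find SCCs; the function names `strongly_connected_components`, `pairwise_connectivity`, `connectivity_from_sizes`, and `is_edge_disruptor` are illustrative choices, not names from the source.

```python
from collections import defaultdict


def strongly_connected_components(nodes, edges):
    """Kosaraju's algorithm (iterative). Returns a list of SCCs as sets."""
    fwd, rev = defaultdict(list), defaultdict(list)
    for u, v in edges:
        fwd[u].append(v)
        rev[v].append(u)

    # Pass 1: record vertices in order of DFS finishing time on G.
    seen, order = set(), []
    for s in nodes:
        if s in seen:
            continue
        seen.add(s)
        stack = [(s, iter(fwd[s]))]
        while stack:
            u, it = stack[-1]
            for w in it:
                if w not in seen:
                    seen.add(w)
                    stack.append((w, iter(fwd[w])))
                    break
            else:  # all out-neighbours explored: u is finished
                order.append(u)
                stack.pop()

    # Pass 2: sweep the reversed graph in decreasing finishing time.
    assigned, comps = set(), []
    for s in reversed(order):
        if s in assigned:
            continue
        assigned.add(s)
        comp, stack = {s}, [s]
        while stack:
            u = stack.pop()
            for w in rev[u]:
                if w not in assigned:
                    assigned.add(w)
                    comp.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps


def pairwise_connectivity(nodes, edges):
    """P(G): number of unordered vertex pairs that are strongly connected."""
    return sum(len(c) * (len(c) - 1) // 2
               for c in strongly_connected_components(nodes, edges))


def connectivity_from_sizes(sizes):
    """Right-hand side of the identity: (n^2/l - n)/2 + (l/2) * Var(C)."""
    l, n = len(sizes), sum(sizes)
    var = sum((s - n / l) ** 2 for s in sizes) / l
    return 0.5 * (n * n / l - n) + 0.5 * l * var


def is_edge_disruptor(nodes, edges, removed, beta):
    """True if deleting `removed` leaves P(G[E \\ S]) <= beta * C(n, 2)."""
    n = len(nodes)
    removed = set(removed)
    remaining = [e for e in edges if e not in removed]
    return pairwise_connectivity(nodes, remaining) <= beta * n * (n - 1) / 2


# Two directed triangles joined by one edge in each direction.
nodes = list(range(6))
edges = [(0, 1), (1, 2), (2, 0), (3, 4), (4, 5), (5, 3), (2, 3), (3, 2)]
print(pairwise_connectivity(nodes, edges))           # 15: G is strongly connected
print(is_edge_disruptor(nodes, edges, [(2, 3)], 0.5))  # True: P drops to 6 <= 7.5
```

Removing the single edge (2, 3) splits the example into two SCCs of size 3, so the sketch also illustrates the identity above: with $l = 2$ equal-sized components the variance term vanishes and $P = \frac{1}{2}(36/2 - 6) = 6$.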

PAGE 16

On the other hand, the global graph measures are mainly functions of graph properties, e.g., the number of vertices/edges, operational O-D pairs, operational paths, and minimum shortest paths [25, 48, 64]. However, some of these attributes cannot be calculated in polynomial time for dense networks. In essence, these functions do not reveal the set of most critical vertices and edges, and thus are not suitable for assessing network vulnerability in terms of connectivity.

Several concepts similar to our pairwise connectivity have recently been investigated in [11, 18, 78], where the terms average pairwise connectivity, pairwise connected ratio and cohesion were used. However, none of these works formulated the calculation of this measure as an optimization problem or provided hardness proofs along with approximation algorithms with performance guarantees. Moreover, the $\beta$-disruptor problem studied in this dissertation takes into account the roles of all edges and vertices in the global network connectivity, and thus provides a more thorough analysis of the underlying vulnerability framework.

As a subproblem of this vulnerability assessment problem, Critical Vertex/Edge, defined as the minimum number of vertices/edges whose removal disconnects the graph, has also been studied and solved using extensive heuristics, however, without performance guarantees. Some works of this kind in the context of wireless networks are [46, 49, 50]. Nevertheless, these works consider only whether or not the graph is disconnected and ignore how fragmented the graph becomes. They are insufficient for evaluating graph vulnerability.

Bissias et al. [15, 16] study the problem of bounding the damage under link attacks. However, the provided methods either require solving a costly semidefinite programming problem [16] or yield weak bounds due to the presence of partitions with negative sizes [15].

1.1.4 Contributions

Our contributions to vulnerability assessment research are as follows:

PAGE 17

- Providing a novel underlying framework for vulnerability assessment by investigating the pairwise connectivity and formulating it as an optimization problem, $\beta$-disruptor, on general graphs, which consists of two versions: $\beta$-vertex disruptor and $\beta$-edge disruptor;
- Proving the NP-completeness of the two problems above and further proving that no PTAS exists for $\beta$-vertex disruptor;
- Presenting an $O(\log^{3/2} n)$ pseudo-approximation algorithm for $\beta$-vertex disruptor, and an $O(\log n \log\log n)$ pseudo-approximation algorithm for $\beta$-edge disruptor. These solutions can be applied to both homogeneous networks and heterogeneous networks with unidirectional links and non-uniform nodal properties.
- We present a spectral lower-bound method for the link vulnerability assessment problem, $\beta$-edge disruptor. The new lower-bound method is useful both for comparing the vulnerability of different networks and for providing guarantees for other heuristic assessment methods.
- In Chapter 4, we present an $O(\sqrt{\log n})$ bicriteria approximation algorithm for the $\beta$-disruptor problem. Since $\beta$-vertex disruptor is a special case of $\beta$-disruptor, the algorithm implies an $O(\sqrt{\log n})$ bicriteria approximation algorithm for $\beta$-vertex disruptor, which improves the best result for $\beta$-vertex disruptor, the $O(\log n \log\log n)$ bicriteria approximation algorithm.
- In probabilistic networks, we first show that computing expected pairwise connectivity is #P-complete. In addition, we develop a Fully Polynomial Randomized Approximation Scheme (FPRAS) to estimate network connectivity with arbitrary precision.

1.2 Cascading Failures in Critical Infrastructures

Malicious attacks can cause failures to spread over the network. Such cascading processes can be found in contagious failures that spread among nodes of a power grid or communication network during a widespread outage, among financial institutions during a financial crisis, or through a human population during the outbreak of an epidemic disease. During the cascade process, nodes are assigned states which change because of the influence of their neighbors. For example, an infected node can pass the infection to its contacts in the network, and the infection could then be passed to more and more nodes. We focus on the case where nodes change their state only when a certain fraction of their neighbors exert influence (see e.g. [22, 86, 87]).

PAGE 18

We develop a new measurement for cascading-resilience in networks subject to such cascading failures. The cascading-resilience (a.k.a. network vulnerability) is measured as the minimum size of a set of nodes that can trigger an outbreak of failure to the whole network in a short amount of time. Thus, we formulate the measuring of cascading-resilience as an optimization problem, called the cost-effective massive outbreak problem (CFM). The key difference in comparison with other works on cascading failure and diffusion processes is that we consider the time aspect of the outbreak. We limit the propagation of failure to within $d$ hops from the failure sources.

Both analytical analysis based on scale-free network theory and numerical analysis demonstrate that a massive outbreak might involve costly seeding. To minimize the seeding cost, we provide a mathematical programming formulation to find optimal seeding for medium-size networks and propose VirAds, an efficient algorithm, to tackle the problem on large-scale networks. VirAds guarantees a relative error bound of $O(1)$ from the optimal solutions in power-law networks and outperforms the greedy heuristics that rely on degree centrality. Moreover, we also show that, in general, approximating the optimal seeding within a ratio better than $O(\log n)$ is unlikely to be possible.

1.2.1 Problem Definitions

We are given a network modeled as an undirected graph $G = (V, E)$ where the vertices in $V$ represent users in the network and the edges in $E$ represent social links between users. We use $n$ and $m$ to denote the number of vertices and edges, respectively. The set of neighbors of a vertex $v \in V$ is denoted by $N(v)$ and we denote by $d(v) = |N(v)|$ the degree of node $v$.

We continue by specifying the diffusion model that governs the process of cascading failures. Existing diffusion models can be categorized into two main groups [51]:

- Threshold model. Each node $v$ in the network has a threshold $t_v \in [0, 1]$, typically drawn from some probability distribution. Each connection $(u, v)$ between nodes $u$

PAGE 19

and $v$ is assigned a weight $w(u, v)$. For a node $v$, let $F(v)$ be the set of neighbors of $v$ that are already influenced. Then $v$ is influenced if $t_v \le \sum_{u \in F(v)} w(u, v)$.
- Cascade model. Whenever a node $u$ is influenced, it is given a single chance to activate each of its neighbors $v$ with a given probability $p(u, v)$.

Most papers on cascading processes assume that the probabilities $p(u, v)$, or the weights $w(u, v)$ and thresholds $t_v$, are given as part of the input. However, they are generally not available, and inferring those probabilities and thresholds has remained a nontrivial problem [45]. Therefore, in addition to the bounded propagation hop, we use a simplified variation of the linear threshold model in which a vertex is activated if a fraction $\rho$ of its neighbors are active, as follows.

Locally Bounded Diffusion Model. Let $R_0 \subseteq V$ be the subset of vertices selected to initiate the influence propagation, which we call the seeding. We also call a vertex $v \in R_0$ a seed. The propagation process happens in rounds, with all vertices in $R_0$ influenced (thus active in adopting the behavior) at round $t = 0$. At a particular round $t \ge 0$, each vertex is either active (adopted the behavior) or inactive, and each vertex's tendency to become active increases when more of its neighbors become active. If an inactive vertex $u$ has more than $\lceil \rho\, d(u) \rceil$ active neighbors at round $t$, then it becomes active at round $t + 1$, where $\rho$ is the influence factor discussed below. The process goes on for a maximum of $d$ rounds, and a vertex, once active, remains active until the end. We call an initial set $R_0$ of vertices a d-seeding if $R_0$ can make all vertices in the network active within at most $d$ rounds.

The influence factor $0 < \rho < 1$ is a constant that decides how widely and quickly the influence propagates through the network. The influence factor reflects real-world factors such as how easy it is to share the content with others, or some intrinsic benefit for those who initially adopt the behavior. In the case $\rho = 1/2$ the model is also known as the majority model, which has many applications in distributed computing, voting systems [69], etc.

Problem Definition. Given the diffusion model, the Cost-effective, Fast, and Massive outbreak (CFM) problem is defined as follows.

PAGE 20

Definition 2 (CFM Problem). Given an undirected graph $G = (V, E)$ modeling a complex network and an influence factor $0 < \rho < 1$, find in $V$ a minimum-size d-seeding, i.e., a subset of vertices that can activate all vertices in the network within at most $d$ rounds.

Generalization. The diffusion model can be generalized in several ways. For example, the model can be extended naturally to cover directed networks or to specify a different influence factor $\rho_v$ for each node $v \in V$. For simplicity we stick with the current model to avoid setting parameters during the experiments. Nevertheless, major results such as the approximation ratio of the VirAds algorithm and the hardness of approximation results still hold for the generalized models.

1.2.2 Related Work

An outbreak can be thought of as a diffusion of information about a product and its adoption over the network. Kempe et al. [51, 52] formulated the influence maximization problem as an optimization problem. They showed the problem to be NP-complete and devised a $(1 - 1/e - \epsilon)$ approximation algorithm. A major drawback of their algorithm is that its accuracy and efficiency depend on the number of Monte-Carlo simulations of the propagation model that are run. Later, Leskovec et al. [57] studied influence propagation from a different perspective, in which they aim to find a set of nodes in networks to detect the spread of a virus as soon as possible. They improve the simple greedy method to run faster. The greedy algorithm is further improved by Chen et al. [23] by using an influence estimation. However, the proposed algorithm might only perform well for small values of propagation probabilities. In addition, the algorithm's time complexity should be $O((m + k) \log n)$ instead of the claimed $O(k \log m + m)$.

Influence propagation with a limited number of hops was first considered in Wang et al. [83, 88], in which the proposed heuristic has high time complexity. Feng et al. [87] show NP-completeness for the problem. We note that none of the mentioned approaches handled large-scale social networks of millions of nodes, as we shall study in Section 6.4.
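The locally bounded diffusion model and the d-seeding notion of Section 1.2.1 can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions, not VirAds or any algorithm from the dissertation: the names `spread`, `is_d_seeding`, and `min_seeding_bruteforce` are hypothetical, the `strict` flag chooses between the literal "more than $\lceil \rho\, d(u) \rceil$" reading of the text and an "at least" reading (the latter is an assumption added here for comparison), and the brute-force search is exponential and only usable on toy graphs.

```python
import math
from itertools import combinations


def spread(adj, seeds, rho, d, strict=True):
    """Simulate the locally bounded diffusion for at most d rounds.

    adj    : dict mapping each vertex to the set of its neighbours (undirected)
    seeds  : the initial active set R0 (active at round 0)
    rho    : influence factor, 0 < rho < 1
    strict : True  -> activate with more than ceil(rho * deg) active neighbours
             False -> activate with at least ceil(rho * deg) (assumed variant)
    Returns the set of vertices that are active after the last round.
    Note: rho * deg is computed in floating point; use values of rho for
    which the product is exact if the boundary cases matter.
    """
    active = set(seeds)
    for _ in range(d):
        newly = set()
        for u in adj:
            if u in active:
                continue
            need = math.ceil(rho * len(adj[u]))
            threshold = need + 1 if strict else max(need, 1)
            have = sum(1 for w in adj[u] if w in active)
            if have >= threshold:
                newly.add(u)
        if not newly:  # no change: further rounds cannot activate anything
            break
        active |= newly  # synchronous update at the end of the round
    return active


def is_d_seeding(adj, seeds, rho, d, strict=True):
    """True if `seeds` activates every vertex within at most d rounds."""
    return len(spread(adj, seeds, rho, d, strict)) == len(adj)


def min_seeding_bruteforce(adj, rho, d, strict=True):
    """Smallest d-seeding by exhaustive search (exponential time)."""
    vertices = sorted(adj)
    for k in range(len(vertices) + 1):
        for cand in combinations(vertices, k):
            if is_d_seeding(adj, cand, rho, d, strict):
                return set(cand)


# A path on five vertices: 0 - 1 - 2 - 3 - 4.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(min_seeding_bruteforce(path, 0.5, 2, strict=False))  # {2}
print(min_seeding_bruteforce(path, 0.5, 1, strict=True))   # {0, 2, 4}
```

On this path, the two readings give very different seeding sizes: under the literal reading a degree-1 vertex can never be activated by its single neighbor and must itself be a seed, which is why the choice of reading is exposed as a parameter rather than fixed.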

PAGE 21

1.2.3 Contributions

Our contributions are summarized as follows:

• Our first finding shows that the seeding for fast and massive spreading must contain a non-trivial fraction of nodes in the networks, which is cost-prohibitive for large-scale networks. This is confirmed by both our theoretical analysis based on the power-law model in [5] and our extensive experiments.

• We propose VirAds, a scalable algorithm to find a minimal seeding that expeditiously propagates the influence to the whole network. VirAds outperforms the greedy heuristics based on well-known degree centrality and scales up to networks of hundreds of millions of links. We prove that the algorithm guarantees a relative error bound of O(1), assuming that the network is power-law.

• We show how hard it is to obtain a near-optimal solution for CFM by proving the impossibility of approximating the optimal solution within a ratio better than $O(\log n)$.

PAGE 22

CHAPTER 2
MULTIPLE LINK ATTACKS

We convert the vulnerability assessment into a graph-theoretical optimization problem: finding a minimum set of vertices/edges whose removal degrades the pairwise connectivity to a desired degree. Considering that disrupting these vertices and edges will considerably degrade the network performance, we refer to them as a $\beta$-disruptor throughout this paper, where $0 \le \beta < 1$ denotes the fraction of desired pairwise connectivity (which we will define later). Two new optimization problems, $\beta$-vertex disruptor and $\beta$-edge disruptor, will be studied and proved to be NP-complete. We address them with several pseudo-approximation algorithms with provable performance bounds, which thus ensure the feasibility and accuracy of this evaluation measure.

The benefit of our new measure can be briefly illustrated in Fig. 2-1, compared with the assessment using degree centrality. Notice that both networks A and B have 7 vertices and are originally strongly connected. According to the nodal degree centrality, removing the black vertex with maximum outgoing degree 5 in Fig. 2-1-(a) leaves network A still strongly connected with 5 vertices; and removing the black vertex with maximum outgoing degree 4 in Fig. 2-1-(b) partitions the graph into two strongly connected components. In this sense, network A is somewhat stronger (less vulnerable) than B. However, our model can discover that deleting only the grey vertex in A is enough to decrease the overall connectivity to 0; on the contrary, at least 3 vertices in B must be removed to make the overall connectivity 0. Therefore, A is actually much more vulnerable. Apparently, our measure provides a more accurate assessment.

Furthermore, our study over multiple disruption levels (different values of $\beta$) presents a deeper meaning and greater potential. Several recent studies in the context of wireless networks have aimed to discover the nodes/edges whose removal disconnects the network, regardless of how disconnected it is [46][49][50]. Apparently,

PAGE 23

Figure 2-1. After removing the vertices with maximum out-going degree (in black), network (A) is still strongly connected while (B) is fragmented; however, removing the grey vertex is enough to destroy (A).

this is a weaker version of our $\beta$-disruptor, since no specification of the quantified network connectivity is involved. However, it is not reasonable to limit the possible disruption to only disconnecting the graph, ignoring how fragmented it is. For instance, a scale-free network can tolerate high random failure rates [13], since the destruction of boundary vertices may not significantly reduce the network connectivity even though the whole graph becomes disconnected. In addition, different disruption levels may require different sets of disruptors, which our model can differentiate whereas existing methods cannot. For example, the node centrality method always returns a set of nodes with non-increasing degrees regardless of the disruption level.

This chapter is organized as follows. We first provide the hardness results in Section 6.3. The pseudo-approximation algorithms for $\beta$-edge disruptor and $\beta$-vertex disruptor are presented in Section 2.2 and Section 3.1 respectively. We propose a sparse metric technique and a branch-and-cut algorithm to find the optimal $\beta$-vertex disruptor in Section 3.3. Section 3.4 presents the simulation results comparing the performance of the proposed approximation algorithms and the exact branch-and-cut algorithm.

2.1 Complexity of Finding Disruptor

In this section we show that both the $\beta$-edge disruptor and $\beta$-vertex disruptor problems in directed graphs are NP-complete, which thus have no polynomial-time exact algorithms

PAGE 24

unless P = NP. We state a stronger result: both problems are NP-complete even in undirected graphs with unit-cost edges. Note that only in this section do we consider the problem for an undirected graph G(V, E). All results in other sections are studied on directed graphs, thus covering both homogeneous and heterogeneous networks.

2.1.1 NP-completeness of Edge Disruptor

We use a reduction from the balanced cut problem.

Definition 3. A cut $\langle S, V \setminus S \rangle$ corresponding to a subset $S \subseteq V$ in G is the set of edges with exactly one endpoint in S. The cost of a cut is the sum of its edges' costs (or simply its cardinality in the case all edges have unit costs). We often denote $V \setminus S$ by $\bar{S}$.

Finding a min cut in the graph is polynomially solvable [76]. However, if one asks for a somewhat balanced cut of minimum size, the problem becomes intractable. A balanced cut is defined as follows:

Definition 4 (Balanced cut). An f-balanced cut of a graph G(V, E), where $f : \mathbb{Z}^+ \to \mathbb{R}^+$, asks us to find a cut $\langle S, \bar{S} \rangle$ with the minimum size such that $|S|, |\bar{S}| \ge f(|V|)$. Abusing notation, for 0 [...]

It follows from Corollaries 1 and 2 that for every $f = \Omega(\alpha n^{\epsilon})$, f-balanced cut is NP-complete. We are ready to prove the NP-completeness of $\beta$-edge disruptor:

PAGE 25

Figure 2-2. Construction of $H(V_H, E_H)$ from G(V, E)

Theorem 2.1 ($\beta$-edge disruptor NP-completeness). $\beta$-edge disruptor in undirected graphs is NP-complete even if all edges have unit weights.

Proof. We prove the result for the special case when $\beta = \frac{1}{2}$. For other values of $\beta$ the proof goes through with a slight modification of the reduction. We shall assume that n, the number of nodes, is a sufficiently large number (for our proof $n > 10^3$). Consider the decision version of the problem that asks whether an undirected graph G(V, E) contains a $\frac{1}{2}$-edge disruptor of a specified size:

$$\tfrac{1}{2}\text{-ED} = \{\langle G, K \rangle \mid G \text{ has a } \tfrac{1}{2}\text{-edge disruptor of size } K\}$$

To show that $\frac{1}{2}$-ED is NP-complete we must show that it is in NP and that all NP problems are polynomial-time reducible to it. The first part is easy; given a candidate subset of edges, we can easily check in polynomial time if it is a $\beta$-edge disruptor of size K. To prove the second part, we show that f-balanced cut is polynomial-time reducible to $\frac{1}{2}$-ED where

$$f = \left\lfloor \frac{n - \sqrt{2\lfloor n^2/3 \rfloor + n}}{2} \right\rfloor.$$

Let G(V, E) be a graph in which one seeks to find an f-balanced cut of size k. Construct the following graph $H(V_H, E_H)$: $V_H = V \cup C_1 \cup C_2$ where $C_1, C_2$ are two cliques

PAGE 26

of size $\lfloor n^2/3 \rfloor$. Denote by $N = |V_H| = 2\lfloor n^2/3 \rfloor + n$ the total number of nodes in H. In addition to the edges in G, $C_1$, and $C_2$, connect each vertex $v \in V$ to $\lfloor n^2/4 \rfloor + 1$ vertices in $C_1$ and $\lfloor n^2/4 \rfloor + 1$ vertices in $C_2$ so that the degree difference of nodes in the cliques is at most one. We illustrate the construction of $H(V_H, E_H)$ in Figure 2-2.

We show that there is an f-balanced cut of size k in G iff H has a $\frac{1}{2}$-edge disruptor of size $K = n(\lfloor n^2/4 \rfloor + 1) + k$ where $0 \le k \le \lfloor n^2/4 \rfloor$. Note that the cost of any cut $\langle S, V \setminus S \rangle$ in G is at most $|S|\,|V \setminus S| \le \lfloor (|S| + |V \setminus S|)^2/4 \rfloor = \lfloor n^2/4 \rfloor$.

On one hand, an f-balanced cut $\langle S, \bar{S} \rangle$ of size k in G induces a cut $\langle C_1 \cup S, C_2 \cup \bar{S} \rangle$ with size exactly $n(\lfloor n^2/4 \rfloor + 1) + k$. If we select the cut as the disruptor, the pairwise connectivity will be at most $\frac{1}{2}\binom{N}{2}$.

On the other hand, assume that H has a $\frac{1}{2}$-edge disruptor of size $K = n(\lfloor n^2/4 \rfloor + 1) + k$. Remove the edges in the disruptor to reduce the pairwise connectivity to at most $\frac{1}{2}\binom{N}{2}$. Since cutting n nodes in $C_1$ or $C_2$ from the cliques requires removing at least $n(\lfloor n^2/3 \rfloor - n) > n(\lfloor n^2/4 \rfloor + 1) + k$ edges, let $C_1' \subseteq C_1$ and $C_2' \subseteq C_2$ be giant connected subsets that induce connected subgraphs in $C_1$ and $C_2$. These subsets must satisfy $|C_1'| + |C_2'| > |C_1| + |C_2| - n$. Denote by $X_1, X_2$ the subsets of nodes in V that are connected to $C_1'$ and $C_2'$ respectively. We have $X_1 \cap X_2 = \emptyset$; otherwise $C_1'$ and $C_2'$ will be connected, and then the pairwise connectivity will exceed $\frac{1}{2}\binom{N}{2}$.

We will modify the disruptor without increasing its size and the pairwise connectivity such that no nodes in the cliques are cut off, i.e., we alter the disruptor until $C_1' = C_1$ and $C_2' = C_2$. For each $u \in C_1 \setminus C_1'$, remove from the disruptor all edges connecting u to $C_1'$ and add to the disruptor all edges connecting u to $X_2$. This will attach u to $C_1'$ while reducing the size of the disruptor by at least $(\lfloor n^2/3 \rfloor - n) - n$. At the same time, select an arbitrary node $v \in X_1$ and add to the disruptor all of v's remaining adjacent edges. This increases the size of the disruptor by at most $(\lfloor n^2/4 \rfloor + 1) + n$ while making v isolated. By doing so we decrease the size of the disruptor by $(\lfloor n^2/3 \rfloor - n) - n - ((\lfloor n^2/4 \rfloor + 1) + n) > 0$.

PAGE 27

In addition, the pairwise connectivity will not increase, as we connect u to $C_1'$ and at the same time disconnect v from $C_1'$. If $X_1 = \emptyset$, we can select $v \in X_2$, as in that case $|C_2' \cup X_2| > |C_1' \cup X_1|$, which makes sure the pairwise connectivity will not increase. We repeat the same process for every node in $C_2 \setminus C_2'$. Since $|(C_1 \setminus C_1') \cup (C_2 \setminus C_2')|$ [...] $n(\lfloor n^2/4 \rfloor + 1) + k$, that is a contradiction.

Since $X_1 \cup X_2 = V$, the disruptor induces a cut in G. To have the pairwise connectivity at most $\frac{1}{2}\binom{N}{2}$, both $(C_1 \cup X_1)$ and $(C_2 \cup X_2)$ must have size at least $\frac{N - \sqrt{N}}{2}$. It follows that $X_1$ and $X_2$ must have size at least $f(n) = \left\lfloor \frac{n - \sqrt{2\lfloor n^2/3 \rfloor + n}}{2} \right\rfloor$. The cost of the cut induced by $\langle X_1, X_2 \rangle$ in G will be $n(\lfloor n^2/4 \rfloor + 1) + k - n(\lfloor n^2/4 \rfloor + 1) = k$.

2.1.2 Hardness of Approximation: Vertex Disruptor

Theorem 2.2. $\beta$-vertex disruptor in undirected graphs is NP-complete.

Proof. We present a polynomial-time reduction from Vertex Cover (VC), an NP-hard problem [42]:
Instance: Given a graph G and a positive integer k.
Question: Does G have a VC of size at most k?
to a decision version of $\beta$-vertex disruptor when $\beta = 0$:
Instance: Given a graph G and a positive integer k.
Question: Does G have a $\beta$-vertex disruptor of size at most k when $\beta = 0$?
The pairwise connectivity equals zero if and only if the complement set of the disruptor is an independent set, or in other words the disruptor is a VC.

Theorem 2.3. Unless P = NP, $\beta$-vertex disruptor cannot be approximated within a factor of 1.36.

PAGE 28

Proof. We use the same reduction as in Theorem 2.2. Assume that we can approximate $\beta$-vertex disruptor within a factor less than 1.36 when $\beta = 0$. In [35], Dinur and Safra showed that approximating VC within a constant factor less than 1.36 is NP-hard. Since we have a one-to-one mapping between the set of vertex disruptors when $\beta = 0$ and the set of VCs, it follows that we can approximate VC within a factor less than 1.36 (contradiction).

2.2 Bicriteria Approximation Algorithm for $\beta$-edge Disruptor

In this section, we present an $O(\log^{3/2} n)$ pseudo-approximation algorithm for the $\beta$-edge disruptor problem in the case when all edges have uniform cost, i.e., $c(u,v) = 1\ \forall (u,v) \in E(G)$. Formally, our algorithm finds in a uniform directed graph G a $\beta'$-edge disruptor whose cost is at most $O(\log^{3/2} n) \cdot OPT_{\beta\text{-ED}}$, where $\beta'/4 < \beta < \beta'$ and $OPT_{\beta\text{-ED}}$ is the cost of an optimal $\beta$-edge disruptor.

As shown in Algorithm 1, the proposed algorithm consists of two main steps. First, we construct a decomposition tree of G by recursively partitioning the graph into two halves with directed c-balanced cuts. Second, we solve the problem on the obtained tree using a dynamic programming algorithm and transfer this solution to the original graph. These two main steps are explained in the next two sections.

2.2.1 Balanced Tree-Decomposition

A tree decomposition of a graph is a recursive partitioning of the node set into smaller and smaller pieces until each piece contains only one single node. We show the tree construction in Algorithm 1 (lines 1 to 11). Our decomposition tree is a rooted binary tree whose leaves represent nodes in G. (Because our decomposition tree is a binary tree with n leaves, it contains exactly n - 1 non-leaf nodes. One can prove this by induction on the number of nodes.)

Definition 5. Given a directed graph G(V, E) and a subset of vertices $S \subset V$, we denote the set of edges outgoing from S by $\delta^+(S)$ and the set of edges incoming to S by $\delta^-(S)$. A cut $(S, V \setminus S)$ in G is defined as $\delta^+(S)$. A c-balanced cut is a cut $(S, V \setminus S)$ s.t.

PAGE 29

$\min\{|S|, |V \setminus S|\} \ge c|V|$. The directed c-balanced cut problem is to find the minimum c-balanced cut. Note that a cut $(S, V \setminus S)$ separates pairs $(u, v) \in S \times (V \setminus S)$, as paths from u to v can no longer exist, i.e., no SCC can contain vertices in both S and $V \setminus S$.

The decomposition procedure is as follows. We start with the tree T containing only one root node $t_0$. We associate the root node $t_0$ with the vertex set V of G, i.e., $V(t_0) = V(G)$. For each node $t_i \in T$ whose $V(t_i)$ contains more than one vertex and has not been partitioned, we partition the subgraph $G[V(t_i)]$ induced by $V(t_i)$ in G using a c-balanced cut algorithm. In detail, we use the directed c-balanced cut algorithm presented in [2] that finds in polynomial time a $c'$-balanced cut within a factor of $O(\sqrt{\log n})$ from the optimal c-balanced cut for $c' = \alpha c$ and a fixed constant $\alpha$. The constant c is chosen to be $1 - \sqrt{\beta/\beta'}$. Create two child nodes $t_{i1}, t_{i2}$ of $t_i$ in T corresponding to the two sets of vertices of $G[V(t_i)]$ separated by the cut. We associate with $t_i$ a cut cost $cost(t_i)$ equal to the cost of the c-balanced cut.

We define the root node $t_0$ to be on level 1. If a node is on level l, all its children are defined to be on level l + 1. Note that the collection of subsets of vertices in G that correspond to nodes on the same level of T induces a partition in G.

One important parameter of the decomposition tree is the height, i.e., the maximum level of nodes in T. Using balanced cuts guarantees a small height of the tree, which in turn leads to a small approximation ratio. When separating $V(t_i)$ using the balanced cut, the size of the larger part is at most $(1 - c')|V(t_i)|$. Hence, we can prove by induction that if a node $t_i$ is on level k, the size of the corresponding set $V(t_i)$ is at most $|V|(1 - c')^{k-1}$. It follows that the tree's height is at most $O(-\log_{1-c'} n) = O(\log n)$.

2.2.2 Dynamic Programming Algorithm on the Decomposition Tree

In this section, we present the second main step, which uses dynamic programming to search for the right set of nodes in T that induces a cost-efficient partition in G

PAGE 30

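Algorithm 1. $\beta$-edge Disruptor
Input: Directed graph G = (V, E) with uniform edge weights and $0 < \beta' < 1$
Output: A $\beta'$-edge disruptor of G.
/* Construct the decomposition tree */
1. $c \leftarrow 1 - \sqrt{\beta/\beta'}$.
2. $T(V_T, E_T) \leftarrow (\{t_0\}, \emptyset)$, $V(t_0) \leftarrow V(G)$; $l(t_0) = 1$
3. while $\exists$ unvisited $t_i$ with $|V(t_i)| \ge 2$ do
4.     Mark $t_i$ visited, create new child nodes $t_{i1}, t_{i2}$ of $t_i$.
5.     $V_T \leftarrow V_T \cup \{t_{i1}, t_{i2}\}$
6.     $E_T \leftarrow E_T \cup \{(t_i, t_{i1}), (t_i, t_{i2})\}$
7.     Separate $G[V(t_i)]$ using directed c-balanced cut.
8.     Associate $V(t_{i1}), V(t_{i2})$ with two separated components.
9.     $cost(t_i) \leftarrow$ The cost of the balanced cut
/* Find the minimum cost G-partitionable */
10. Traverse T in post-order, for each $t_i \in T$ do
11.     for $p \leftarrow 0$ to $\beta'\binom{n}{2}$
12.         if $P(G[V(t_i)]) \le p$ then $cost(t_i, p) \leftarrow 0$
13.         else $cost(t_i, p) \leftarrow \min\{cost(t_{i1}, p_1) + cost(t_{i2}, p_2) + cost(t_i) \mid p_1 + p_2 = p\}$
14. Find $F_{opt}^{\beta'}$ associated with $T_{opt}^{\beta'} = \min_{p \le \beta'\binom{n}{2}} \{cost(t_0, p)\}$
15. Return the union of c-balanced cuts at $t_i \in A(F_{opt}^{\beta'})$.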

PAGE 31

Figure 2-3. $F = \{t_2, t_3, t_5, t_6\}$ is a G-partitionable in the decomposition tree. The corresponding partition $\{V(t_2), V(t_3), V(t_5), V(t_6)\}$ can be obtained by using cuts at $t_0, t_1, t_4$.

whose pairwise connectivity is at most $\beta'\binom{n}{2}$. The details of this step are shown in Algorithm 1 (lines 12 to 18).

Denote a set $F = \{t_{u_1}, t_{u_2}, \ldots, t_{u_k}\} \subseteq V_T$, where $V_T$ is the set of vertices in T, so that $V(t_{u_1}), V(t_{u_2}), \ldots, V(t_{u_k})$ is a partition of V(G), i.e., $V(G) = \biguplus_{h=1}^{k} V(t_{u_h})$. We say such a subset F is G-partitionable. Denote by $A(t_i)$ the set of ancestors of $t_i$ in T and $A(F) = \bigcup_{t_i \in F} A(t_i)$. It is clear that F is G-partitionable if and only if F satisfies:

1. $\forall t_i, t_j \in F : t_i \notin A(t_j)$ and $t_j \notin A(t_i)$
2. $\forall t_i \in V_T$, $t_i$ is a leaf: $A(t_i) \cap F \ne \emptyset$

In case F is G-partitionable, we can separate $V(t_{u_1}), V(t_{u_2}), \ldots, V(t_{u_k})$ in G by performing the cuts corresponding to ancestors of nodes in F during the tree construction. For example, in Figure 2-3 we show a decomposition tree with a G-partitionable set $\{t_2, t_3, t_5, t_6\}$. The corresponding partition $\{V(t_2), V(t_3), V(t_5), V(t_6)\}$ in G can be obtained by cutting $V(t_0), V(t_1), V(t_4)$ successively using balanced cuts in the tree construction. The cut cost, hence, will be $cost(t_0) + cost(t_1) + cost(t_4)$. In general,

PAGE 32

the total cost of all the cuts to separate $V(t_{u_1}), V(t_{u_2}), \ldots, V(t_{u_k})$ will be:

$$cost(F) = \sum_{t_u \in A(F)} cost(t_u)$$

The pairwise connectivity in G then will be:

$$P(F) = \sum_{t_u \in F} P(G[V(t_u)])$$

We wish to find F so that $P(F) \le \beta'\binom{n}{2}$, i.e., the union of cuts to separate $V(t_{u_1}), V(t_{u_2}), \ldots, V(t_{u_k})$ forms a $\beta'$-edge disruptor in G. Because of the optimal substructure in T, finding such a G-partitionable subset F in $V_T$ with minimum cost(F) can be done in $O(n^3)$ using dynamic programming.

Denote by $cost(t_i, p)$ the minimum cut cost to make the pairwise connectivity in $G[V(t_i)]$ at most p using only cuts corresponding to nodes in the subtree rooted at $t_i$. The minimum cost for a G-partitionable subset F that induces a $\beta'$-edge disruptor of G is then $T_{opt}^{\beta'} = \min_{p \le \beta'\binom{n}{2}} \{cost(t_0, p)\}$, where $t_0$ is the root node in T. We can easily derive the recursive formula:

$$cost(t_i, p) = \begin{cases} 0 & \text{if } P(G[V(t_i)]) \le p \\ \min_{\pi \le p} \; cost(t_{i1}, \pi) + cost(t_{i2}, p - \pi) + cost(t_i) & \text{if not} \end{cases}$$

where $t_{i1}, t_{i2}$ are the children of $t_i$. In the first case, when $P(G[V(t_i)]) \le p$, we cut no edges in $G[V(t_i)]$; hence $cost(t_i, p) = 0$. Otherwise, we try all possible combinations of pairwise connectivity $\pi$ in $V(t_{i1})$ and $p - \pi$ in $V(t_{i2})$. The combination with the smallest cut cost is then selected.

We now prove that $T_{opt}^{\beta'} \le O(\log^{3/2} n) \cdot Opt_{\beta\text{-ED}}$, where $Opt_{\beta\text{-ED}}$ denotes the cost of the optimal $\beta$-edge disruptor in G.
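The recursion for $cost(t_i, p)$ can be sketched in a few lines of code. This is an illustrative sketch only, assuming the decomposition tree has already been built: the dictionary layout `tree[t] = (P, cut_cost, children)`, the function name `min_disruptor_cost`, and the toy tree are hypothetical, and the balanced-cut step that produces the tree is not shown.

```python
import math
from functools import lru_cache


def min_disruptor_cost(tree, root, budget):
    """Minimum total cut cost so that the pairwise connectivity is <= budget.

    tree[t] = (P, cut_cost, children), where
      P        : pairwise connectivity of G[V(t)] before any cut below t
      cut_cost : cost(t), the cost of the balanced cut performed at t
      children : () for a leaf, or (left, right) for an internal node
    """

    @lru_cache(maxsize=None)
    def cost(t, p):
        pc, cut_cost, children = tree[t]
        if pc <= p:
            return 0  # first case of the recursion: no cut needed
        if not children:
            return math.inf  # not reached when leaves have P = 0 and p >= 0
        left, right = children
        # second case: split the budget p between the two children
        return cut_cost + min(
            cost(left, q) + cost(right, p - q) for q in range(p + 1)
        )

    return cost(root, budget)


# Toy tree: a strongly connected 4-node graph split into two 2-node halves.
toy = {
    "t0": (6, 2, ("t1", "t2")),
    "t1": (1, 1, ("a", "b")),
    "t2": (1, 1, ("c", "d")),
    "a": (0, 0, ()), "b": (0, 0, ()), "c": (0, 0, ()), "d": (0, 0, ()),
}
for budget in (6, 2, 1, 0):
    print(budget, min_disruptor_cost(toy, "t0", budget))
```

Memoizing on the pair (t, p) mirrors the table $cost(t_i, p)$ filled in post-order by Algorithm 1; the sketch returns only the optimal cost, not the set of cuts.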

PAGE 33

Lemma 1. There exists a G-partitionable subset of T that induces a $\beta'$-edge disruptor whose cost is at most $O(\log^{3/2} n) \cdot Opt_{\beta\text{-ED}}$.

Proof. Let D be an optimal $\beta$-edge disruptor in G of size $Opt_{\beta\text{-ED}}$ and $C = \{C_1, C_2, \ldots, C_k\}$ be the set of SCCs after removing D from G. We construct a G-partitionable subset $X_T$ as in Algorithm 2. We traverse tree T in preorder, i.e., every parent will be visited before its children. For each node $t_i$, we select $t_i$ into $X_T$ if there exists some component $C_j \in C$ such that $|V(t_i) \cap C_j| \ge (1 - c)|V(t_i)|$ and no ancestors of $t_i$ have been selected into $X_T$. We can verify that $X_T$ satisfies the two mentioned conditions of a G-partitionable subset. For each $C_j \in C$, define

$$N(C_j) = \{t_i \in T : |V(t_i) \cap C_j| \ge (1 - c)|V(t_i)|\}.$$

Since the sets $V(t_i)$, $t_i \in X_T$, are disjoint subsets, we have

$$P(X_T) \le \sum_{t_i \in X_T} \binom{|V(t_i)|}{2} = \frac{1}{2}\sum_{C_j \in C}\sum_{t_i \in N(C_j)} |V(t_i)|^2 - \frac{n}{2} \le \frac{1}{2}\sum_{C_j \in C}\Big(\sum_{t_i \in N(C_j)} |V(t_i)|\Big)^2 - \frac{n}{2}$$
$$\le \frac{1}{2}\sum_{C_j \in C}\Big(\sqrt{\beta'/\beta}\,|C_j|\Big)^2 - \frac{n}{2} < \frac{\beta'}{\beta}\cdot\frac{1}{2}\Big(\sum_{C_j \in C} |C_j|^2 - n\Big) \le \beta'\binom{n}{2}$$

Finally we show that $cost(X_T) \le O(\log^{3/2} n) \cdot Opt_{\beta\text{-ED}}$. Denote by h(T) the height of T and by $L_T^i$ the set of nodes at the i-th level in T. We have:

$$cost(X_T) = \sum_{i=1}^{h(T)} \sum_{t_u \in (L_T^i \cap A(X_T))} cost(t_u)$$

PAGE 34

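If $t_u \in A(X_T)$ then $t_u$ is not selected into $X_T$. Hence, there exists $C_j \in C$ so that $|V(t_u) \cap C_j| < (1 - c)|V(t_u)|$ (otherwise $t_u$ would have been selected into $X_T$, as it satisfies the conditions in line 3 of Algorithm 2). To guarantee $c < 1 - c$, we need $c < 1/2$, i.e., $\beta > \beta'/4$. Since the edges in D separate $C_j$ from the other SCCs, they also separate $C_j \cap V(t_u)$ from $V(t_u) \setminus C_j$ in $G[V(t_u)]$. Denote by $\delta(t_u, D)$ the set of edges in D separating $C_j \cap V(t_u)$ from $V(t_u) \setminus C_j$ in $G[V(t_u)]$. Obviously, $\delta(t_u, D)$ is a directed c-balanced cut of $G[V(t_u)]$. Since the cut we used in the tree construction is only $O(\sqrt{\log n})$ times the optimal c-balanced cut, we have $cost(t_u) \le O(\sqrt{\log n})\,|\delta(t_u, D)|$.

Recall that if two nodes $t_u, t_v$ are on the same level then $V(t_u)$ and $V(t_v)$ are disjoint subsets. It follows that $\delta(t_u, D)$ and $\delta(t_v, D)$ are also disjoint sets. Therefore, the cut cost at the i-th level is

$$\sum_{t_u \in (L_T^i \cap A(X_T))} cost(t_u) \le O(\sqrt{\log n}) \sum_{t_u \in (L_T^i \cap A(X_T))} |\delta(t_u, D)| \le O(\sqrt{\log n})\,\Big|\bigcup_{t_u \in (L_T^i \cap A(X_T))} \delta(t_u, D)\Big| = O(\sqrt{\log n}) \cdot Opt_{\beta\text{-ED}}$$

Since the number of levels is $h(T) = O(\log n)$, summing the per-level costs over all levels of T gives $cost(X_T) \le O(\log^{3/2} n) \cdot Opt_{\beta\text{-ED}}$.

Since there exists a G-partitionable subset of T that induces a $\beta'$-edge disruptor whose cost is no more than $O(\log^{3/2} n) \cdot Opt_{\beta\text{-ED}}$, as shown in Lemma 1, and the dynamic programming always finds the best latent solution in T, the following theorem follows.

Theorem 2.4. Algorithm 1 achieves a pseudo-approximation ratio of $O(\log^{3/2} n)$ for the $\beta$-edge disruptor problem.

Time complexity: Construction of the decomposition tree takes $O(n^{9.5})$. The major portion of the time is for solving a semidefinite program with $\Theta(n^3)$ constraints. Finding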

PAGE 35

Algorithm 2. Find a good G-partitionable subset of T that induces a $\beta'$-edge disruptor in G
Initialization: $X_T \leftarrow \emptyset$; Preorder-Selection($t_0$).
Preorder-Selection($t_u$)
1: if ($\exists C_j \in C : |V(t_u) \cap C_j| \ge (1 - c)|V(t_u)|$) then
2:     $X_T \leftarrow X_T \cup \{t_u\}$
3: else let $t_{u1}, t_{u2}$ be children of $t_u$,
4:     Preorder-Selection($t_{u1}$)
5:     Preorder-Selection($t_{u2}$)
6: end if

the optimal solution using dynamic programming takes $O(n^3)$. Hence, the overall time complexity is $O(n^{9.5})$.

2.3 Bounds on the Size of Edge Disruptor

Simultaneous attacks can cause devastating damage, breaking down communication networks into small fragments. To mitigate the risk and develop proactive responses, it is essential to assess the robustness of the network in worst-case scenarios. In this paper, we propose a spectral lower bound on the number of links that must be removed to incur a certain level of disruption in terms of pairwise connectivity. Our lower bound exploits the latent structural information in the network Laplacian spectrum, the set of eigenvalues of the Laplacian matrix, to provide guarantees on the robustness of the network against intentional attacks. Such guarantees often cannot be found in heuristic methods for identifying critical infrastructures. For the first time, attack-resistance proofs of large-scale communication networks against link attacks are presented.

Connectivity plays a vital role in network performance and is fundamental to vulnerability assessment. The number of connected node pairs in the network (a.k.a. pairwise connectivity) lends itself as an effective measure to account for the effect of the attacks [12, 15, 16, 33, 34, 64, 66].

PAGE 36

Vulnerability assessment has recently been formulated as a connectivity optimization problem called $\beta$-edge disruptor, which finds a minimum-cost set of links whose removal causes a significant level ($\beta$) of network pairwise connectivity degradation [34]. The $\beta$-edge disruptor reflects the common sense that when breaking the network by removing links, the more links that must be removed, the less vulnerable the network is. The $\beta$-edge disruptor approach enables the exploration of different network disruption levels, which can be used to gain deeper insight into network structure and robustness in various operating environments.

Unfortunately, the $\beta$-edge disruptor problem is NP-hard [34], i.e., there is no efficient algorithm to solve the problem unless P = NP. A pseudo-approximation algorithm and mathematical approaches for the $\beta$-edge disruptor problem are introduced in [34] and [33], respectively. Although those methods can provide performance guarantees, they are only applicable to small and medium networks of a few thousand nodes. For larger networks, we have to rely on heuristics, which can have arbitrarily bad worst-case performance. Hence, there is a lack of methods to provide robustness proofs against intentional attacks for large networks.

In this paper, we analyze the network spectrum, the eigenvalues of the Laplacian matrix, to give a lower bound for the minimum size of a $\beta$-edge disruptor and thus a certificate on the robustness of the network. Our spectral bound is formulated as an optimization problem over the Laplacian eigenvalues, which are known to contain rich information about the topological structure [24]. Since exact measurement for the $\beta$-edge disruptor is not available in general, our lower bound can be coupled with upper bound methods¹ to narrow down the range for the actual vulnerability/robustness of the network. We emphasize that while upper bounds for $\beta$-edge disruptor (or any other minimization problem) can be designed easily,

¹ Each heuristic to find a $\beta$-edge disruptor is an upper bound for the problem.

PAGE 37

techniques for deriving lower bounds are much more scattered in the literature. Our contributions are summarized as follows.

• We introduce a new spectral lower bound for the $\beta$-edge disruptor problem in the form of an eigenvalue optimization problem. At the same time, we enrich the literature on lower-bound techniques.

• We present two efficient methods to compute the proposed lower bound: 1) the Lagrange multiplier method and 2) the dynamic programming algorithm. Moreover, the Lagrange multiplier method can derive the lower bound with only a small number of the smallest eigenvalues. This is important for large networks, where computing the whole network spectrum is both time and memory consuming.

• We perform experiments on different network types and real large-scale networks to demonstrate the quality of the proposed lower bound and quantify the robustness of the studied networks against intentional attacks.

Organization. We briefly present terminologies and problem definitions in subsection 4.1. In subsection 2.3.2, we introduce the spectral lower bound for the $\beta$-disruptor problem together with two methods to compute the lower bound. Experimental results on different network models and real network instances are presented in subsection 6.4.

2.3.1 Laplacian Matrix and Its Eigenvalues

We abstract our general network model as a graph G = (V, E), where $V = \{v_1, v_2, \ldots, v_n\}$ refers to a set of nodes and E refers to a set of links. Each edge $(v_i, v_j) \in E$ has a removal cost $c_{ij} \ge 0$ (and $c_{ij} = 0$ if $(v_i, v_j) \notin E$). For convenience, we also denote the number of nodes and links by n and m, respectively.

Since the main purpose of a network lies in connecting all the interacting elements in the network, we study the overall pairwise connectivity, which is defined as the number of connected vertex pairs in G. If G is an undirected graph, a vertex pair $(u, v) \in V \times V$ is connected iff there exists a path between u and v. We denote the pairwise connectivity of a graph G by P(G). Apparently, the pairwise connectivity is maximized at $\binom{n}{2}$ when G is a (strongly) connected graph.
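For an undirected graph, the pairwise connectivity P(G) defined above is the sum of $\binom{s}{2}$ over the sizes s of the connected components. The following minimal sketch computes it with a union-find structure; the function name `pairwise_connectivity` and the edge-list input format are illustrative assumptions, not part of the original text.

```python
from collections import Counter


def pairwise_connectivity(n, edges):
    """Number of connected vertex pairs in an undirected graph.

    n     : number of vertices, labeled 0 .. n-1
    edges : iterable of (u, v) pairs
    """
    parent = list(range(n))

    def find(x):
        # Find the component representative, with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in edges:
        parent[find(u)] = find(v)  # merge the two components

    sizes = Counter(find(v) for v in range(n))
    # A component of size s contributes s*(s-1)/2 connected pairs.
    return sum(s * (s - 1) // 2 for s in sizes.values())


# A path 0-1-2-3 is connected, so P = C(4, 2); removing edge (1, 2)
# leaves two components of size 2.
print(pairwise_connectivity(4, [(0, 1), (1, 2), (2, 3)]))
print(pairwise_connectivity(4, [(0, 1), (2, 3)]))
```

Evaluating this function before and after deleting a candidate edge set gives a direct check of whether that set is a $\beta$-edge disruptor for a given $\beta$.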

PAGE 38

Let $A = \{c_{ij}\}$ be the weighted adjacency matrix and D be the degree matrix, defined as the diagonal matrix with the weighted degrees $d_1, d_2, \ldots, d_n$ on the diagonal, where $d_i = \sum_j c_{ij}$. The unnormalized graph Laplacian matrix [63] is defined as

$$L = D - A$$

The matrix L is symmetric and positive semi-definite, since for every vector $x \in \mathbb{R}^n$ we can verify that

$$x^T L x = \frac{1}{2}\sum_{i,j=1}^{n} c_{ij}(x_i - x_j)^2 \ge 0.$$

A direct consequence is that L has n non-negative, real-valued eigenvalues $\lambda_1 \le \lambda_2 \le \ldots \le \lambda_n$. In addition, the smallest eigenvalue $\lambda_1$ is zero and the corresponding eigenvector is the constant one vector $\mathbf{1}$ [63]. The second smallest eigenvalue $\lambda_2$ is known as the algebraic connectivity of the graph and can be used to describe many properties of graphs [63]. For example, the graph G is connected if and only if $\lambda_2 > 0$. For the $\beta$-edge disruptor problem, the following lower bound can be derived from $\lambda_2$.

Lemma 2. For any connected graph G, we have

$$OPT_\beta \ge \frac{1 - \beta}{2}\,\lambda_2\,(n - 1)$$

where $OPT_\beta$ denotes the minimum size of a $\beta$-edge disruptor.

However, the bound provided in Lemma 2 is rather loose, as the value of $\lambda_2$ is often very close to zero (for example when bridges, i.e., edges whose deletion increases the number of connected components, are present in the network). This motivates us to study higher eigenvalues beyond $\lambda_2$ to design a stronger bound for the $\beta$-edge disruptor problem.
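The identity $x^T L x = \frac{1}{2}\sum_{i,j} c_{ij}(x_i - x_j)^2$ can be checked numerically on a small weighted graph. The sketch below builds L = D - A from an edge list and evaluates both sides; the helper names and the toy triangle are hypothetical, and the right-hand side is computed as a sum over undirected edges, which equals the half-sum over ordered pairs.

```python
def laplacian(n, weighted_edges):
    """Unnormalized Laplacian L = D - A as a dense list of lists."""
    L = [[0.0] * n for _ in range(n)]
    for i, j, c in weighted_edges:
        L[i][i] += c  # weighted degree on the diagonal
        L[j][j] += c
        L[i][j] -= c  # minus the adjacency weight off the diagonal
        L[j][i] -= c
    return L


def quadratic_form(L, x):
    """Evaluate x^T L x."""
    n = len(x)
    return sum(x[i] * L[i][j] * x[j] for i in range(n) for j in range(n))


def edge_sum(weighted_edges, x):
    """Sum of c_ij * (x_i - x_j)^2 over undirected edges."""
    return sum(c * (x[i] - x[j]) ** 2 for i, j, c in weighted_edges)


# Toy weighted triangle.
edges = [(0, 1, 1.0), (1, 2, 2.0), (0, 2, 0.5)]
L = laplacian(3, edges)
x = [1.0, -2.0, 3.0]
print(quadratic_form(L, x), edge_sum(edges, x))  # the two values agree
print(quadratic_form(L, [1.0, 1.0, 1.0]))        # constant vector gives 0
```

The second print illustrates why the constant one vector is an eigenvector with eigenvalue zero: every row of L sums to zero.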

PAGE 39

2.3.2 Spectral Lower-bound for Link Assessment

In this subsection, we derive a lower bound on the size of a $\beta$-edge disruptor using higher eigenvalues of the Laplacian matrix L. We first formulate the lower bound as an eigenvalue optimization problem. Then two methods with different trade-offs between time and quality are introduced to compute the lower bound.

Let $E_\beta$ be an optimal $\beta$-edge disruptor and $s_1 \ge s_2 \ge \ldots \ge s_n$ be the sizes of the connected components after removing $E_\beta$ from the network. Then we can relate $OPT_\beta$ to the sizes of the components via the following lemma.

Lemma 3 [36]. Let a k-partition of a graph be a division of the vertices into k disjoint subsets containing $s_1 \ge s_2 \ge \ldots \ge s_k$ vertices. Let $E_{cut}$ be the set of edges whose two endpoints belong to different subsets. Let $\lambda_1 \le \lambda_2 \le \ldots \le \lambda_k$ be the k smallest eigenvalues of the Laplacian matrix plus any diagonal matrix U such that the sum of all the elements of U is zero. Then

$$|E_{cut}| \ge \frac{1}{2}\sum_{i=1}^{k} s_i \lambda_i.$$

Thus, we have $OPT_\beta = |E_\beta| \ge \frac{1}{2}\sum_{i=1}^{n} s_i \lambda_i$. Here we allow imaginary subsets of size zero and assume w.l.o.g. that k = n. Note that $s_1, \ldots, s_n$ are not known without finding $E_\beta$. Thus, we consider all possible values of $\{s_1, \ldots, s_n\}$ which induce network partitions of pairwise connectivity at most $\beta\binom{n}{2}$, and take the minimum of the sum $\frac{1}{2}\sum_{i=1}^{n} s_i \lambda_i$ as a lower bound on $OPT_\beta$. Formally, our spectral lower bound on $OPT_\beta$ is given by solving the following quadratic programming (QP) optimization problem.

PAGE 40

$$\text{minimize} \quad \frac{1}{2}\sum_{i=1}^{n} s_i \lambda_i$$
$$\text{subject to} \quad \sum_{i=1}^{n} s_i = n$$
$$\sum_{i=1}^{n} \binom{s_i}{2} \le \beta\binom{n}{2}$$
$$s_i \in \{0, 1, \ldots, n\}$$

Theorem 2.5. Let $Q_\beta$ be the optimal objective of the QP problem above and $OPT_\beta$ be the size of a minimum $\beta$-edge disruptor of graph G = (V, E). Then $Q_\beta \le OPT_\beta$ for $\beta \in [0, 1]$. Moreover, equality holds when $\beta = 0$ or $\beta = 1$.

Proof. As discussed in the previous paragraph, the sizes of the connected components after removing an optimal $\beta$-edge disruptor satisfy all constraints of the QP. Hence, $Q_\beta \le OPT_\beta$ for all $\beta \in [0, 1]$. We continue with the tightness of the bound at the extreme cases $\beta = 0$ and $\beta = 1$.

Case $\beta = 0$: all subsets are of size one. Hence, $Q_0 = \frac{1}{2}\sum_{i=1}^{n} \lambda_i = \frac{1}{2}\mathrm{Trace}(L) = \frac{1}{2}(2|E|) = |E|$. The only way to cut all pairs in the network is to cut all edges. In other words, $Q_0 = OPT_0 = |E|$.

Case $\beta = 1$: in order to achieve the maximum connectivity $\binom{n}{2}$, there must be a single partition in the network, and the optimal disruptor cuts no edges. That is, $s_1 = n$ and $s_i = 0\ \forall i > 1$. Since $\lambda_1 = 0$, it follows that $Q_1 = 0 = OPT_1$.

Since the $s_i$ are integral values, we propose a dynamic programming algorithm to compute the spectral bound in the next subsubsection.

2.3.2.1 Dynamic Programming Method

We first describe the optimal solution structure for the QP optimization problem.

PAGE 41

Lemma 4. There exists an optimal solution s of the QP such that $s_1 \ge s_2 \ge \ldots \ge s_n$.

Proof. Let $s = \{s_1, s_2, \ldots, s_n\}$ be an optimal solution of the QP. Denote by inv(s) the number of inversions of s, i.e., pairs of indices (i, j) such that $i < j$ and $s_i < s_j$. If inv(s) = 0, then $s_1 \ge s_2 \ge \ldots \ge s_n$; otherwise there exists a pair $i < j$ with $s_i < s_j$. Construct $s'$ by swapping $s_i$ and $s_j$ inside s. Then $s'$ is a feasible solution of the QP and the objective increases by an amount $s_i\lambda_j + s_j\lambda_i - (s_i\lambda_i + s_j\lambda_j) = (s_i - s_j)(\lambda_j - \lambda_i) \le 0$. Thus, we obtain a new optimal solution with a smaller number of inversions. Repeating the process at most $\binom{n}{2}$ times, which is the maximum number of inversions in s, we finally obtain an optimal solution with no inversions. That optimal solution satisfies the lemma's condition.

Algorithm 3: ILB(G, $\beta$)
1: Compute $\lambda_1, \ldots, \lambda_n$
2: $L_k(l, p) = +\infty$, if p [...]
PAGE 42

By Lemma 4, we pay attention only to partitions satisfying $s_1 \ge s_2 \ge \ldots \ge s_n$. We now derive the recursive formula for $L_k(l, p)$ based on the optimal substructure of the QP problem. Consider the two possible cases of $s_k$:

• $s_k = 0$: There are at most k - 1 partitions whose sizes sum up to l. Hence, for this case $L_k(l, p) = L_{k-1}(l, p)$.

• $s_k > 0$: Since $s_1 \ge s_2 \ge \ldots \ge s_k > 0$, let $\tilde{s}_i = s_i - 1 \ge 0$. The vector $\tilde{s} = \{\tilde{s}_1, \tilde{s}_2, \ldots, \tilde{s}_k\}$ satisfies simultaneously the following:

$$\sum_{i=1}^{k} \lambda_i \tilde{s}_i = \sum_{i=1}^{k} \lambda_i s_i - \sum_{i=1}^{k} \lambda_i$$
$$\sum_{i=1}^{k} \tilde{s}_i = \sum_{i=1}^{k} s_i - k = l - k$$
$$\sum_{i=1}^{k} \binom{\tilde{s}_i}{2} = \sum_{i=1}^{k} \left[\binom{s_i}{2} - s_i + 1\right] = \sum_{i=1}^{k} \binom{s_i}{2} - l + k \le p - l + k$$

Therefore, in this case $L_k(l, p) = L_k(l - k, p - l + k) + \sum_{i=1}^{k} \lambda_i$.

In summary, we have

$$L_k(l, p) = \min\Big\{L_{k-1}(l, p),\; L_k(l - k, p - l + k) + \sum_{i=1}^{k} \lambda_i\Big\}$$

We compute the values of $L_k(l, p)$ in increasing order of p and l but in decreasing order of k. The base cases for $L_k(l, p)$ are as follows.

$L_k(l, p) = +\infty$, if p [...]
PAGE 43

Theorem 2.6. Optimal solutions of the QP can be found in $O(n^4)$ time and $O(n^3)$ space.

Thus, the spectral bound can be computed in polynomial time. However, the high time complexity of the dynamic programming algorithm prevents the method from being applied to large networks. Moreover, the dynamic programming algorithm requires computing the whole set of eigenvalues of the network, which is both time and memory consuming. We continue with an approximation of the spectral bound that achieves (almost) the same lower-bound quality in significantly less time.

2.3.2.2 Lagrange Multipliers Method

We relax the integrality conditions on $s_i$ to obtain the following relaxation of the QP, rewritten in vector notation.

$$\text{minimize} \quad \frac{1}{2} s^T \lambda$$
$$\text{subject to} \quad \|s\|_1 - n = 0,$$
$$\|s\|_2^2 - \gamma \le 0,$$
$$s \ge 0,$$

where $\gamma = \beta n(n - 1) + n$ and $\|\cdot\|_2$ denotes the Euclidean norm. The Lagrangian is then

$$L(s, \mu, \psi, \omega) = \frac{1}{2} s^T \lambda + \mu(\|s\|_1 - n) + \psi(\|s\|_2^2 - \gamma) - \omega^T s$$

where $\omega = (\omega_1, \ldots, \omega_n) \ge 0$ is a nonnegative multiplier vector. Notice that the problem is a convex optimization problem with differentiable objective and constraint functions, and it satisfies Slater's condition with $s = (1, 1, \ldots, 1)^T$ [20]. Hence, the following Karush-Kuhn-Tucker (KKT) conditions provide

PAGE 44

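the necessary and sufficient conditions for optimality:

$$\nabla_s L = \frac{1}{2}\lambda + \mu\mathbf{1} + 2\psi s - \omega = 0$$
$$\nabla_\mu L = \|s\|_1 - n = 0$$
$$\nabla_\psi L = \|s\|_2^2 - \gamma = 0$$
$$\omega^T s = 0$$
$$s, \psi, \omega \ge 0$$

Algorithm 4: LMB(G, $\beta$)
1: $t = \lceil 2/\beta \rceil$, $\gamma \leftarrow \lfloor \beta n(n - 1) + n \rfloor$
2: Compute $\lambda_1, \ldots, \lambda_t$
3: for k = 1 to n
4:     if k > t then
5:         $t = \min\{2t, n\}$
6:         Compute $\lambda_1, \ldots, \lambda_t$
7:     Compute $\psi$ from its closed-form expression.
8:     Compute D(k) and C(k) from their closed-form expressions.
9:     if ($\psi \ge 0$ and $C(k) \ge 0$) or (k = n) then
10:        return $\lceil D(k) \rceil$
11: end for

Let $k = \max\{i \mid s_i > 0\}$. By Lemma 4 and the complementary slackness condition $\omega^T s = 0$, we have $s_i > 0$ for $i \le k$; thus, $s_i = 0\ \forall i > k$ and $\omega_j = 0\ \forall j \le k$.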

PAGE 45

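Denote $s^{(k)} = \{s_1, s_2, \ldots, s_k\}$ and $\lambda^{(k)} = \{\lambda_1, \lambda_2, \ldots, \lambda_k\}$. The KKT conditions can be simplified to

$$\nabla_{s^{(k)}} L = \frac{1}{2}\lambda^{(k)} + \mu\mathbf{1} + 2\psi s^{(k)} = 0, \quad i \le k$$
$$\nabla_{s_i} L = \frac{1}{2}\lambda_i + \mu - \omega_i = 0, \quad i > k$$
$$\nabla_\mu L = \|s^{(k)}\|_1 - n = 0,$$
$$\nabla_\psi L = \|s^{(k)}\|_2^2 - \gamma = 0,$$
$$s^{(k)} > 0, \quad \psi > 0, \quad \omega^{(k)} = 0$$

For each value of k, we can solve for the values of $s_i$ and check if all $s_i \ge 0$. The other unknowns can be found as follows. First, substitute the constraint $\|s^{(k)}\|_1 = n$ into the sum of the stationarity constraints for $i \le k$ to obtain $\mu$ in terms of $\psi$:

$$\mu = -\frac{2n}{k}\psi - \frac{\|\lambda^{(k)}\|_1}{2k}$$

Therefore, we can derive $s^{(k)}$ from the stationarity condition as

$$s^{(k)} = \frac{n}{k} + \left(\frac{\|\lambda^{(k)}\|_1}{4k} - \frac{\lambda^{(k)}}{4}\right)\frac{1}{\psi}$$

Substituting the above equation into the condition $\|s^{(k)}\|_2^2 = \gamma$ and solving for $\psi$, we have

$$\|s^{(k)}\|_2^2 - \gamma = 0 \;\Leftrightarrow\; \left(\frac{\|\lambda^{(k)}\|_2^2}{16} - \frac{\|\lambda^{(k)}\|_1^2}{16k}\right)\frac{1}{\psi^2} = \gamma - \frac{n^2}{k} \;\Leftrightarrow\; \psi = \frac{1}{4}\left(\frac{\|\lambda^{(k)}\|_2^2 - \|\lambda^{(k)}\|_1^2/k}{\gamma - \frac{n^2}{k}}\right)^{1/2}$$

The objective is then

$$D(k) = \frac{1}{2} s^{(k)T}\lambda^{(k)} = \frac{n\|\lambda^{(k)}\|_1}{2k} + \left(\frac{\|\lambda^{(k)}\|_1^2}{4k} - \frac{\|\lambda^{(k)}\|_2^2}{4}\right)\frac{1}{2\psi} = \frac{n\|\lambda^{(k)}\|_1}{2k} - \frac{1}{2}\left(\|\lambda^{(k)}\|_2^2 - \frac{\|\lambda^{(k)}\|_1^2}{k}\right)^{1/2}\left(\gamma - \frac{n^2}{k}\right)^{1/2}$$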

PAGE 46

Since $\lambda_1 \le \lambda_2 \le \ldots \le \lambda_n$, the expression for $s^{(k)}$ implies that $s^{(k)}_1 \ge s^{(k)}_2 \ge \ldots \ge s^{(k)}_k$. Hence, in order to satisfy $s^{(k)} > 0$, it is sufficient that

$$C(k) = s^{(k)}_k = \frac{n}{k} + \left(\frac{\|\lambda^{(k)}\|_1}{4k} - \frac{\lambda_k}{4}\right)\frac{1}{\psi} \ge 0.$$

Theorem 2.7. The size of a $\beta$-edge disruptor is lower-bounded by

$$D = \min_{n \ge k \ge n^2/\gamma}\{D(k) \mid C(k) > 0\},$$

where D(k) and C(k) are given by the closed-form expressions above.

The steps to solve the relaxation of the QP are summarized in Algorithm 4 (LMB Algorithm).

Time complexity. The LMB algorithm spends most of its time computing the eigenvalues. This can be done with the Implicitly Restarted Lanczos Method, which has worst-case time complexity $O(mKh + nK^2h + K^3h)$, where K is the number of eigenvalues to be computed and h is the number of iterations for the eigenvalue algorithm to converge [85]. Given the eigenvalues, the rest of LMB takes only O(n) time in the worst case. The number of required eigenvalues K is small in our algorithm. At the beginning, the algorithm computes the $t = \lceil 2/\beta \rceil$ smallest eigenvalues, and the number of computed eigenvalues is doubled each time if necessary. In our experiments, the number of needed eigenvalues is $2/\beta$ in most cases. For example, to bound the number of necessary links whose removal disrupts 90% of the pairwise connectivity, we only need to compute about 20 of the smallest eigenvalues of the Laplacian matrix. We found the LMB algorithm to be scalable, taking linear time with respect to the number of nodes and edges.

2.3.2.3 Time and quality trade-off

On one hand, the ILB algorithm (Algorithm 3) provides a better bound than that of the LMB algorithm. The reason is that ILB solves for exact solutions of the QP while

PAGE 47

AErdos-Reyni(random)network BBarabasi(power-law)network CWatts-StrogatznetworkFigure2-4. Minimumcostandlower-boundsfor-disruptoronthesynthesisnetworks AErdos-Reyni(random)network BBarabasi(power-law)network CWatts-StrogatznetworkFigure2-5. Runningtimeonthesynthesisnetworks LMBonlytargetsarelaxationoftheQP.However,thedifferencebetweentheoutputoftwoalgorithmsisnegligiblesmallandeitherzeroorone2inourexperiments.Ontheotherhand,theLMBhasmuchmorepracticaltimecomplexity.TheILBhashightimecomplexityO(n4)andcanonlyappliedfornetworkuptofewthousandnodes.Incontrast,LMBtakesonlylineartimetocomputeitscompetitivebound.Overallforsmallandmediumnetworks,onecanapplyILBalgorithm(orothermathematicalapproaches[ 33 ])tocomputethelower-bound,however,forlargenetworksLMBremainstheonlychoice. 47

2.3.3 Experimental Results

We compute our spectral lower bound for both synthetic and real-world networks and compare the results with the optimal results whenever possible.

2.3.3.1 Synthetic Networks

We generate the synthetic networks following well-known complex network models. All networks have 100 nodes and around 300 edges. The details of those networks are as follows.

• Erdős-Rényi: A random graph of 100 vertices and 300 edges following the Erdős-Rényi model [37].
• Barabási-Albert: A power-law model using the preferential attachment mechanism [13].
• Small world: A random graph following the Watts and Strogatz model [84]. The dimension of the lattice is set to 3 and the rewiring probability is 0.3.

The optimal solutions are found with integer programming using the sparse metric technique in [33]. The technique in [33] is also applied to compute the lower bound given by solving the linear program. The results produced by the ILB and LMB algorithms are identical (after rounding up) and are plotted under the same name, spectral bound. All algorithms were run on a PC with an Intel Xeon 2.93 GHz processor and 12 GB of memory. The integer program (IP) and the linear program (LP) are solved with the mathematical optimization package GUROBI 4.5.

Footnote 2: Both algorithms round up their results to the nearest integers.

The minimum numbers of links whose removal causes a certain level of disruption are shown in Fig. 2-4. For all three networks, solving the LP gives a good lower bound on the minimum number of links to remove. The spectral bounds are much worse than the LP bounds in the random and small-world networks; however, the spectral bound closely approaches the LP bounds and the optimal solution when the network has the power-law topology of the Barabási model.

As shown in Fig. 2-5, there is a big gap between the running time of the spectral bound and those of LP and IP. Note that all the spectral bounds are computed at once, i.e., the reported running time is the total running time over all different values of $\beta$. Even so, the spectral bound is still thousands of times faster than LP and IP. Overall, while IP is best used for small networks and LP can be used for medium networks of a few thousand nodes, the only feasible method to compute the lower bound in large networks is the spectral bound.

One of the attractive aspects of the LMB spectral bound, described in Algorithm 4, is that the algorithm can be easily implemented in a distributed manner. The most time-consuming part of the algorithm is computing the few smallest eigenvalues. This can be done in a distributed way with existing mathematical software [17].

Table 2-1. Sizes of the investigated networks and the corresponding running time to compute the lower bound

             CAIDA AS    Oregon AS    P2P Gnutella
Vertices        8,020       11,174          22,663
Edges          36,406       23,410         109,386
Time (s)       1530.1        321.0           207.9

2.3.3.2 Real-world Datasets

We compute the spectral lower bounds for real networks; the results are shown in Fig. 2-6. Neither LP nor IP can run on these networks due to both time and memory limits. The studied networks are

• Gnutella P2P: Gnutella peer-to-peer network from Aug. 25, 2002 [58]. Nodes represent hosts in the network and edges are the connections between the Gnutella hosts.
• Oregon AS: AS peering information inferred from Oregon route-views between Mar. 31 and May 26, 2001 [58].
• CAIDA AS: The CAIDA AS Relationships Datasets, from September 17, 2007 [58].

The lower bounds in Fig. 2-6 indicate that it is difficult to destroy major connectivity in communication networks. For example, even after removing 369 links, at least 50% of the node pairs in the CAIDA AS network stay connected; and to bring the connectivity level in the Gnutella P2P network down to 15%, one has to destroy at least 960 links. Due to its low edge density, the Oregon AS network tends to be more vulnerable than the other two networks. Nevertheless, utterly disrupting the connectivity in that network to the 5% level would require removing more than 763 links.

Figure 2-6. Lower bounds on the number of link attacks for real networks found with the LMB algorithm.

CHAPTER 3
MULTIPLE NODE ATTACKS

3.1 Bicriteria Approximation Algorithm for $\beta$-vertex Disruptor

We present a polynomial time algorithm (Algorithm 5) that finds a $\beta'$-vertex disruptor in the directed graph $G(V, E)$ whose size is at most $O(\log n \log\log n)$ times that of the optimal $\beta$-vertex disruptor, where $0 < \beta < \beta'^2$. The algorithm involves two phases. In the first phase, we split each vertex $v \in V$ into two vertices $v^+$ and $v^-$ while putting an edge from $v^-$ to $v^+$, and show that removing $v$ in $G$ has the same effect as removing the edge $(v^- \to v^+)$ in the new graph. In the second phase, we try to decompose the new graph into SCCs, capping the size of the largest component while minimizing the number of removed edges. We relax the constraints on the size of each component until the set of cut edges induces a $\beta'$-vertex disruptor in the original graph $G$.

Given a directed graph $G(V, E)$ for which we want to find a small $\beta'$-vertex disruptor, we split each vertex in $G$ into two new vertices to obtain a new directed graph $G'(V', E')$ where

\[ V' = \{v^-, v^+ \mid v \in V\}, \qquad E' = \{(v^- \to v^+) \mid v \in V\} \cup \{(u^+ \to v^-) \mid (u \to v) \in E\}. \]

The new graph $G'(V', E')$ has twice the number of vertices of $G$, i.e., $|V'| = 2|V| = 2n$. An example of the first phase is shown in Figure 3-1. We set the costs of all edges in $E'_V = \{(v^- \to v^+) \mid v \in V\}$ to 1 and of the other edges in $E'$ to $+\infty$, so that only edges in $E'_V$ can be selected in an edge disruptor set. In an implementation, it is safe to set the costs of edges not in $E'_V$ to $O(n)$, noting that by paying a cost of $2n$ we can effectively disconnect all edges in $E'_V$.

Consider a directed edge disruptor set $D'_e \subseteq E'$ that contains only edges in $E'_V$. We have a one-to-one correspondence between $D'_e$ and a set $D_v = \{v \mid (v^- \to v^+) \in D'_e\}$ in $G(V, E)$, which is a vertex disruptor set in $G$. Since $G$ and $G'$ have different maximum pairwise connectivity, $\frac{(n-1)n}{2}$ for $G$ and $\frac{(2n-1)2n}{2}$ for $G'$, the fractions of pairwise connectivity remaining in $G$ and $G'$ after removing $D_v$ and $D'_e$ are, however, not exactly equal to each other.

Algorithm 5. $\beta'$-vertex disruptor
Input: Directed graph $G = (V, E)$ and fixed $0 < \beta' < 1$.
Output: A $\beta'$-vertex disruptor of $G$
 1. $G'(V', E') \leftarrow (\emptyset, \emptyset)$
 2. $\forall v \in V$: $V' \leftarrow V' \cup \{v^+, v^-\}$
 3. $\forall v \in V$: $E' \leftarrow E' \cup \{(v^- \to v^+)\}$, $c(v^-, v^+) \leftarrow 1$
 4. $\forall (u \to v) \in E$: $E' \leftarrow E' \cup \{(u^+ \to v^-)\}$, $c(u^+, v^-) \leftarrow \infty$
 5. $\underline{\beta} \leftarrow 0$; $\overline{\beta} \leftarrow 1$
 6. $D_V \leftarrow V(G)$
 7. while ($\overline{\beta} - \underline{\beta} > \epsilon$) do
 8.    $\tilde{\beta} \leftarrow \lfloor (\underline{\beta} + \overline{\beta})/2 \rfloor$
 9.    Find $D_e \subseteq E'$ to separate $G'$ into strongly connected components of sizes at most $\tilde{\beta}|V'|$ using the algorithm in [38]
10.    $D_v \leftarrow \{v \in V(G) \mid (v^- \to v^+) \in D_e\}$
11.    if $P(G[V \setminus D_v]) \le \beta'\binom{n}{2}$ then
12.       $\underline{\beta} = \tilde{\beta}$
13.       Remove nodes from $D_v$ as long as $P(G[V \setminus D_v]) \le \beta'\binom{n}{2}$
14.       if $|D_V| > |D_v|$ then $D_V = D_v$
15.    else $\overline{\beta} = \tilde{\beta}$
16. end while
17. Return $D_V$

In the second phase of Algorithm 5, when separating a graph into SCCs, the smaller the sizes of the SCCs, the smaller the pairwise connectivity in the graph. However, the smaller the maximum size of each SCC, the more edges must be cut. We perform a binary search to find the right upper bound for the size of each SCC in $G'$.
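The vertex-splitting construction of the first phase can be sketched in a few lines. This is an illustrative sketch only: the tuple encoding `(v, "-")` / `(v, "+")` for the split vertices and the function names `split_vertices` and `induced_vertex_disruptor` are assumptions made for the example, not part of the original algorithm.

```python
import math


def split_vertices(vertices, edges):
    """Build G' from a directed graph G given as a vertex list and (u, v) edge list.

    Returns (nodes, cost), where cost maps each directed edge of G' to its cost.
    """
    nodes = []
    cost = {}
    for v in vertices:
        nodes.append((v, "-"))
        nodes.append((v, "+"))
        # Internal edge v- -> v+: cutting it corresponds to removing v from G.
        cost[((v, "-"), (v, "+"))] = 1
    for u, v in edges:
        # Original edge u -> v becomes u+ -> v- and must never be cut.
        cost[((u, "+"), (v, "-"))] = math.inf
    return nodes, cost


def induced_vertex_disruptor(edge_disruptor):
    """Map a set of cut internal edges (v-, v+) back to the vertices of G."""
    return {
        tail[0]
        for tail, head in edge_disruptor
        if tail[0] == head[0] and tail[1] == "-" and head[1] == "+"
    }


nodes, cost = split_vertices([1, 2, 3], [(1, 2), (2, 3), (3, 1)])
print(len(nodes), len(cost))
print(induced_vertex_disruptor({((2, "-"), (2, "+"))}))
```

Using infinite cost for the copies of original edges mirrors the text; in a solver that cannot handle infinities, a large finite cost could be substituted, as the text notes.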

In the algorithm, the lower bound and upper bound of the size of each SCC are denoted by $\underline{\beta}|V'|$ and $\overline{\beta}|V'|$, respectively. At each step we try to find a minimum capacity edge set in $G'(V', E')$ whose removal partitions the graph into strongly connected components of size at most $\tilde{\beta}|V'|$, where $\tilde{\beta} = \lfloor (\underline{\beta} + \overline{\beta})/2 \rfloor$. We round the value of $\tilde{\beta}$ to the nearest multiple of $\epsilon$ so that the number of steps of the binary search is bounded by $\log\frac{1}{\epsilon}$. The problem of finding a minimum capacity edge set to decompose a graph of size $n$ into strongly connected components of size at most $\rho n$ is known as the $\rho$-separator problem. We use here the algorithm presented in [38] that, for a fixed $\epsilon > 0$, finds a $\rho$-separator in a directed graph $G$ whose value is at most $O\left(\frac{1}{\epsilon^2} \cdot \log n \log\log n\right)$ times $Opt_{(\rho-\epsilon)\text{-separator}}$, where $Opt_{(\rho-\epsilon)\text{-separator}}$ is the cost of the optimal $(\rho - \epsilon)$-separator. Finally, we derive the cut vertices in $G$ from the cut edges in $G'$ to obtain the $\beta'$-vertex disruptor.

Lemma 5. Algorithm 5 always terminates with a $\beta'$-vertex disruptor.

Proof. We show that whenever $\tilde{\beta} \le \beta'$, the corresponding $D_v$ found in Algorithm 5 is a $\beta'$-vertex disruptor in $G$. Consider the edge disruptor $D'_e$ in $G'$ induced by $D_v$. We first show the mapping between SCCs in $G[V \setminus D_v]$ and SCCs in $G'[E' \setminus D'_e]$, the graph obtained by removing $D'_e$ from $G'$. Partition the vertex set $V$ of $G$ into: (1) $D_v$: the set of removed nodes; (2) $V_{single}$: the set of nodes that are not on any cycle, i.e., they are SCCs of size one; (3) $V_{connected}$: the union of the remaining SCCs, whose sizes are at least two, say $V_{connected} = \biguplus_{i=1}^{l} C_i$, $|C_i| \ge 2$. Vertices in $V_{connected}$ belong to at least one cycle in $G$. We have the following corresponding SCCs in $G'[E' \setminus D'_e]$:

1. $v \in D_v \leftrightarrow$ SCCs $\{v^+\}$ and $\{v^-\}$, since after removing $(v^- \to v^+)$, $v^+$ does not have incoming edges and $v^-$ does not have outgoing edges.
2. $v \in V_{single} \leftrightarrow$ SCCs $\{v^+\}$ and $\{v^-\}$, since $v$ does not lie on any cycle in $G$. Assume $v^+$ belongs to some SCC of size at least 2, i.e., $v^+$ lies on some cycle in $G'$. Because the only incoming edge to $v^+$ is from $v^-$, it follows that $v^-$ precedes $v^+$ on that cycle. Let $u^-, u^+$ be the successive vertices of $v^+$ on that cycle. Then $u$ and $v$ belong to the same SCC in $G$, which yields a contradiction. Similarly, $v^-$ cannot lie on any cycle in $G'$.
3. SCC $C_i \subseteq V_{connected} \leftrightarrow$ SCC $C'_i = \{v^-, v^+ \mid v \in C_i\}$. This can be shown using an argument similar to that in the case $v \in V_{single}$.

Since $D'_e$ is a $\tilde{\beta}$-separator, the sizes of SCCs in $G'[E' \setminus D'_e]$ are at most $\tilde{\beta}\,2n$. It follows that the sizes of SCCs in $G[V \setminus D_v]$ are bounded by $\tilde{\beta} n$. Denote the set of SCCs in $G[V \setminus D_v]$ by $\mathcal{C}$, with the convention that vertices in $D_v$ become singleton SCCs in $G[V \setminus D_v]$. Therefore, we have:

\[ P(G[V \setminus D_v]) = \sum_{C_i \in \mathcal{C}} \binom{|C_i|}{2} = \frac{1}{2}\left(\sum_{C_i \in \mathcal{C}} |C_i|^2 - |V|\right) \le \frac{1}{2}\left(\sum_{C_i \in \mathcal{C}} \tilde{\beta}|V|\,|C_i| - |V|\right) = \frac{1}{2}\left(\tilde{\beta}|V|^2 - |V|\right) < \tilde{\beta}\binom{|V|}{2} \le \beta'\binom{|V|}{2}. \]

This guarantees that the binary search always finds a $\beta'$-vertex disruptor and completes the proof.

Theorem 3.1. Algorithm 5 always finds a $\beta'$-vertex disruptor whose size is at most $O(\log n \log\log n)$ times that of the optimal $\beta$-vertex disruptor for $\beta'^2 > \beta > 0$.

Proof. It follows from Lemma 5 that Algorithm 5 terminates with a $\beta'$-vertex disruptor $D_v$. At some step the capacity of $D_v$ equals the capacity of a $\tilde{\beta}$-separator $D'_e$ in $G'$, where $\tilde{\beta}$ is at least $\beta' - \epsilon$ according to Lemma 5 and the binary search scheme. The cost of the separator is at most $O(\log n \log\log n)$ times $Opt_{(\tilde{\beta}-\epsilon)\text{-separator}}$ using the algorithm in [38].

Consider an optimal $(\beta'^2 - 9\epsilon)$-vertex disruptor $D'_v$ of $G$ and its corresponding edge disruptor $D'_e$ in $G'$. Denote the cost of that optimal vertex disruptor by $Opt_{(\beta'^2-9\epsilon)\text{-VD}}$. If there exists in $G[V \setminus D'_v]$ an SCC $C_i$ such that $|C_i| > (\beta' - 2\epsilon)n$, then

\[ P(G[V \setminus D'_v]) > \frac{1}{2}\big((\beta' - 2\epsilon)n - 2\big)\big((\beta' - 2\epsilon)n - 1\big) > (\beta'^2 - 9\epsilon)\binom{n}{2} \]

when $n > \frac{20(\beta' + 1)}{\epsilon}$. Hence, every SCC in $G'[E' \setminus D'_e]$ has size at most $(\beta' - 2\epsilon)(2n)$, i.e., $D'_e$ is a $(\beta' - 2\epsilon)$-separator in $G'$. It follows that $Opt_{(\beta'^2-9\epsilon)\text{-VD}} \ge Opt_{(\beta'-2\epsilon)\text{-separator}}$ in $G'$.

Since $\tilde{\beta} - \epsilon \ge \beta' - 2\epsilon$, we have $Opt_{(\tilde{\beta}-\epsilon)\text{-separator}} \le Opt_{(\beta'-2\epsilon)\text{-separator}} \le Opt_{(\beta'^2-9\epsilon)\text{-VD}}$. The size of the vertex disruptor $|D_v| = |D'_e|$ is at most $O(\log n \log\log n)$ times $Opt_{(\tilde{\beta}-\epsilon)\text{-separator}}$. Thus, the size of the found $\beta'$-vertex disruptor $D_v$ is at most $O(\log n \log\log n)$ times that of the optimal $(\beta'^2 - 9\epsilon)$-vertex disruptor. As we can choose $\epsilon$ arbitrarily small, setting $\beta = \beta'^2 - 9\epsilon$ completes the proof.

Figure 3-1. Conversion from the node version in a directed graph (a) into the edge version in a directed graph (b).

Time complexity: Finding the separator costs $O(n^9)$ [38]. Hence, the total time complexity is $O(\log\frac{1}{\epsilon}\, n^9)$. However, in our experiments, the algorithm takes much less than its worst-case running time.

3.2 Connection between Edge Disruptor and Vertex Disruptor

We show that an approximation algorithm for the general directed edge disruptor yields an approximation algorithm for the directed vertex disruptor with (almost) the same approximation ratio.

Lemma 6. A $\beta$-edge disruptor set in the directed graph $G'$ induces a $\beta$-vertex disruptor set of the same cost in $G$.

Proof. We use $D_v$ and $D'_e$ for the vertex disruptor in $G$ and the edge disruptor in $G'$. Given $P(G'[E' \setminus D'_e]) \le \beta\binom{2n}{2}$, we need to prove that $P(G[V \setminus D_v]) \le \beta\binom{n}{2}$, where $n = |V|$.

Assume $G[V \setminus D_v]$ has $l$ SCCs of size at least 2, say $C_i$, $i = 1, \ldots, l$. The corresponding SCCs in $G'[E' \setminus D'_e]$ will be $C'_i$, $i = 1, \ldots, l$, where $|C'_i| = 2|C_i|$. Since

\[ \frac{\binom{2k}{2}}{\binom{2n}{2}} - \frac{\binom{k}{2}}{\binom{n}{2}} = \frac{k(n-k)}{(n-1)n(2n-1)} \ge 0 \quad \text{for all } 0 \le k \le n, \]

we have

\[ \frac{P(G[V \setminus D_v])}{\binom{n}{2}} = \sum_{i=1}^{l} \frac{\binom{|C_i|}{2}}{\binom{n}{2}} \le \sum_{i=1}^{l} \frac{\binom{|C'_i|}{2}}{\binom{2n}{2}} \le \beta. \]

Lemma 7. A $\beta$-vertex disruptor set in $G$ induces a $(\beta + \epsilon)$-edge disruptor set of the same cost in $G'$ for any $\epsilon > 0$.

Proof. We use the same notation as in the proof of Lemma 6. Given $P(G[V \setminus D_v]) \le \beta\binom{n}{2}$, we need to prove that $P(G'[E' \setminus D'_e]) \le (\beta + \epsilon)\binom{2n}{2}$. We have:

\begin{align*}
\frac{P(G'[E' \setminus D'_e])}{\binom{2n}{2}} &= \sum_{i=1}^{l} \frac{|C_i|(n - |C_i|)}{(n-1)n(2n-1)} + \frac{P(G[V \setminus D_v])}{\binom{n}{2}} \\
&= \frac{P(G[V \setminus D_v])}{\binom{n}{2}}\left(1 - \frac{1}{2n-1}\right) + \frac{\sum_{i=1}^{l} |C_i|}{n(2n-1)} < \beta + \frac{1}{2n-1} < \beta + \epsilon
\end{align*}

when $n \ge \left\lfloor \frac{1+\epsilon}{2\epsilon} \right\rfloor + 1$.

Theorem 3.2. Given a factor $f(n)$ polynomial time approximation algorithm for $\beta$-edge disruptor, there exists a factor $(1 + \epsilon)f(n)$ polynomial time approximation algorithm for $\beta$-vertex disruptor, where $\epsilon > 0$ is an arbitrarily small constant.

Proof. Let $G$ be a directed graph with uniform vertex costs in which we wish to find a $\beta$-vertex disruptor. Construct $G'$ as described in Section 3.1. Apply the given approximation algorithm to find in $G'$ a $\beta$-edge disruptor, denoted by $D'_e$, with cost at most $f(n)\,Opt_{\beta\text{-ED}}(G')$, where $Opt_{\beta\text{-ED}}(G')$ is the cost of a minimum $\beta$-edge disruptor in $G'$. From Lemma 6, $D'_e$ induces in $G$ a $\beta$-vertex disruptor $D_v$ of the same cost. We shall prove that

\[ Opt_{\beta\text{-ED}}(G') \le Opt_{\beta\text{-VD}}(G) + \beta_0, \]

where $Opt_{\beta\text{-VD}}(G)$ is the cost of a minimum $\beta$-vertex disruptor in $G$ and $\beta_0$ is some positive constant. It follows that the cost of $D_v$ will be at most

\[ f(n)\big(Opt_{\beta\text{-VD}}(G) + \beta_0\big) \le (1 + \epsilon) f(n)\, Opt_{\beta\text{-VD}}(G). \]

Here, we assume that $Opt_{\beta\text{-VD}}(G) > \frac{\beta_0}{\epsilon}$; otherwise we can find $Opt_{\beta\text{-VD}}(G)$ in time $O(n^{\beta_0/\epsilon + 2})$.

From an optimal $\beta$-vertex disruptor of $G$, construct its corresponding edge disruptor $D_e$ in $G'$. If $P(G'[E \setminus D_e]) \le \beta\binom{2n}{2}$, then $Opt_{\beta\text{-ED}}(G') \le Opt_{\beta\text{-VD}}(G)$ and the proof follows. Thus, we consider the case $P(G'[E \setminus D_e]) > \beta\binom{2n}{2}$. Among the SCCs of $G'[E \setminus D_e]$, there must be an SCC of size at least $2\beta n$, or else $P(G'[E \setminus D_e]) \le \beta^{-1}\binom{2\beta n}{2} \le \beta\binom{2n}{2}$ (contradiction). Remove $\beta_0 = \left\lceil \frac{1}{\beta} \right\rceil$ vertices from that SCC. The pairwise connectivity in $G'[E \setminus D_e]$ will decrease by at least $\left(2\beta n - \frac{1}{\beta}\right)\frac{1}{\beta} = 2n - \frac{1}{\beta^2} \ge n$ for sufficiently large $n$. From the inequality in the proof of Lemma 7, the pairwise connectivity after removing these vertices will be less than

\[ \left(\beta + \frac{1}{2n-1}\right)\binom{2n}{2} - n \le \beta\binom{2n}{2}. \]

Therefore, after removing at most $\beta_0$ additional vertices, we get a $\beta$-edge disruptor. Hence, $Opt_{\beta\text{-ED}}(G') \le Opt_{\beta\text{-VD}}(G) + \beta_0$.

3.3 Branch-and-cut Algorithm

Branch-and-cut methods have proven to be a very successful approach for solving a wide variety of integer programming problems. In contrast with meta-heuristics, they can guarantee optimality. They combine a branch-and-bound algorithm with a cutting plane method that is used to improve the solution of the linear programming relaxations. This section presents the components of our branch-and-cut algorithm. We begin with a new lightweight mixed integer programming formulation for $\beta$-vertex disruptor in Subsection 3.3.2. In the next subsection, we introduce a new class of strong cutting planes and the separation procedure to find such cutting planes. The primal heuristic that provides upper bounds for pruning during the search process is presented in Subsection 3.3.4.

3.3.1 Mixed Integer Programming Formulation

We model the network as an undirected graph $G = (V, E)$ of $n$ nodes numbered from 1 to $n$; the degree of node $1 \le i \le n$ is denoted by $d(i)$. The pairwise connectivity of $G$, denoted by $P(G)$, is the number of node pairs with at least one path between them. For example, if $G$ is connected, then $P(G) = \binom{n}{2}$. Given a constant $0 \le \beta \le 1$, a subset of vertices $S \subseteq V$ in $G$ is a $\beta$-vertex disruptor if the subgraph $G[V \setminus S]$, induced by $V \setminus S$ in $G$, has pairwise connectivity at most $\beta\binom{n}{2}$. The $\beta$-vertex disruptor problem asks to find a $\beta$-vertex disruptor of minimum size. The problem can be generalized so that each node $u \in V$ has a removal cost $w(u)$ and we wish to find a $\beta$-vertex disruptor of minimum cost. This generalization is straightforward and shall be ignored to simplify the presentation. The IP formulation for $\beta$-vertex disruptor (IPvd) is as follows:

\begin{align*}
\text{minimize} \quad & \sum_{i=1}^{n} s_i \\
\text{subject to} \quad & d_{ij} \le s_i + s_j, && (i, j) \in E, \\
& d_{ij} + d_{jk} \ge d_{ik}, && \forall i \ne j \ne k, \\
& \sum_{i<j} d_{ij} \ge (1 - \beta)\binom{n}{2}, \\
& s_i \le d_{ij}, && i, j \in [1..n],\ i \ne j, \\
& s_i,\, d_{ij} \in \{0, 1\}, && i, j \in [1..n].
\end{align*}

We use the variable $d_{ij}$ to represent the distance between a pair of nodes $i$ and $j$ in the residual network, i.e.,

\[ d_{ij} = \begin{cases} 0 & \text{if } i \text{ and } j \text{ are in the same connected component,} \\ 1 & \text{otherwise.} \end{cases} \]

An extra variable $s_i$ is used for each node $i \in V$, where

\[ s_i = \begin{cases} 0 & \text{if node } i \text{ is not removed,} \\ 1 & \text{if } i \text{ is removed (selected into the disruptor).} \end{cases} \]

The objective minimizes the total number of removed nodes, i.e., the size of the vertex disruptor. Note that $d_{ij} = d_{ji}$ $\forall (i, j) \in V \times V$. The constraint $d_{ij} + d_{jk} \ge d_{ik}$ is the well-known triangle inequality, which implies that if $i$ and $j$ are connected, and $j$ and $k$ are connected, then $i$ and $k$ must be connected. The constraint on the sum of the $d_{ij}$ limits the pairwise connectivity in $G$ to be at most $\beta\binom{n}{2}$. The constraint $d_{ij} \le s_i + s_j$ implies the base case that if $i$ and $j$ are neighbors and neither $i$ nor $j$ is removed ($s_i = s_j = 0$), then $i$ and $j$ remain connected, i.e., $d_{ij} = 0$. The constraint linking $s_i$ to $d_{ij}$ states that a removed node will not connect to any other node ($s_i = 1 \rightarrow d_{ij} = 1$).

There are several drawbacks to the IP formulation of the $\beta$-vertex disruptor problem (IPvd) (and also to the formulations of k-CND and k-CED [11]). The large number of integral variables, $\Theta(n^2)$, makes the selection of branching variables difficult and significantly increases the depth and size of the search tree. In addition, the excessive number of constraints, $\Theta(n^3)$, leads even for small instances to a large linear programming relaxation that consumes an extremely large amount of memory and computing time.

3.3.2 Sparse Metric Technique

We first devise a new Mixed-Integer Programming (MIP) formulation for the $\beta$-vertex disruptor problem that consists of only $n$ integer variables and a much smaller number of constraints. Since the only role of the triangle inequalities is to guarantee that $d_{ij}$ is a pseudo-metric (as defined later in the proof of Theorem 3.3), we introduce a compact subset of inequalities, the so-called sparse metric, that guarantees the same pseudo-metric property. When the network is sparse, i.e., $|E| \propto |V|$, the number of constraints reduces substantially from $\Theta(n^3)$ to $\Theta(n^2)$.

Our new MIP formulation for the $\beta$-vertex disruptor problem (MIPvd) is similar to IPvd except that the triangle inequalities and the integrality constraints on $d_{ij}$ are replaced by the constraints presented below.

\begin{align*}
& d_{ij} + d_{jk} \ge d_{ik}, && k \in N_{min}(i, j), \\
& d_{ij} \in [0, 1], && i, j \in [1..n],
\end{align*}

where $N_{min}(i, j)$ is the set of neighbors of $i$ excluding $j$ if $d(i) <$ …

The optimal fractional solution of the LP relaxation of IPvd can be found by solving the (smaller) LP relaxation of MIPvd, followed by an $O(mn + n^2\log n)$ tuning procedure (Theorem 3.4).

Proposition 3.1. For every optimal solution of MIPvd, there is a feasible solution of the MIP with the same objective value in which all variables are integral.

Proof. Round all $d_{ij} > 0$ to 1. This will not violate the constraint on the sum of the $d_{ij}$ or the constraints $s_i \le d_{ij}$. For the constraints $d_{ij} \le s_i + s_j$, if $d_{ij}$ is rounded up to 1 then the integrality of $s_i, s_j$ implies $s_i + s_j \ge 1$, or else if $d_{ij} = 0$ then the constraints are still satisfied. Assume the rounding violates the sparse triangle inequality for some triple $(i, j, k)$. This happens if and only if $d_{ik} = 1$ and $d_{ij} = d_{jk} = 0$. Hence, before rounding, $d_{ik} > 0$ and $d_{ij} = d_{jk} = 0$, which contradicts the constraint $d_{ij} + d_{jk} \ge d_{ik}$. It follows that rounding gives a feasible integral solution to the MIP.

Let $D_{MIP} = \{i \mid s_i = 1\}$ be the disruptor induced by the optimal solution of MIPvd and $OPT_{vd}$ be an optimal $\beta$-vertex disruptor. By setting $s_i = 1$ $\forall i \in OPT_{vd}$ (and $s_i = 0$ otherwise), $d_{ij} = 0$ for all $i, j$ in the same connected component of $G[V \setminus OPT_{vd}]$, and $d_{ij} = 1$ if not, we obtain a feasible solution for MIPvd. Therefore, $|D_{MIP}| \le |OPT_{vd}|$.

Theorem 3.3. The optimal solution $D_{MIP} = \{i \mid s_i = 1\}$ obtained by solving MIPvd is a minimum $\beta$-vertex disruptor of $G$.

Proof. Since $|D_{MIP}| \le |OPT_{vd}|$, we only need to show that $D_{MIP}$ is a $\beta$-vertex disruptor. Assume that we can prove that $d_{ij} = 0$ for every connected pair $(i, j)$ in $G[V \setminus D_{MIP}]$. Then, only disconnected pairs $d_{i'j'}$ will contribute to the sum in the pairwise-connectivity constraint. Since $d_{i'j'} \le 1$ $\forall i, j \in [1..n]$, the number of disconnected pairs must be at least $(1 - \beta)\binom{n}{2}$. It will follow that $D_{MIP}$ is a $\beta$-vertex disruptor. Hence, the rest of the proof is to show that $d_{ij} = 0$ for every connected pair $(i, j)$ in $G[V \setminus D_{MIP}]$.

Note that $d$ is a pseudo-metric, i.e., the function $d(i, j) = d_{ij}$ satisfies:

• Non-negativity: $d(i, j) \ge 0$
• Identity: $d(i, i) = 0$
• Symmetry: $d(i, j) = d(j, i)$
• Subadditivity: $d(i, j) \le d(i, k) + d(k, j)$.

For each connected pair $(i, j)$ in $G[V \setminus D_{MIP}]$, we prove that $d_{ij} = 0$ by induction on the length $t$ of the shortest path (in number of hops) between nodes $i$ and $j$.

The basis. The statement holds for $t = 1$. By the constraint $d_{ij} \le s_i + s_j$, if $(i, j) \in E$ and $i, j$ are connected in $G[V \setminus D_{MIP}]$, i.e., $s_i = s_j = 0$, then $d_{ij} \le s_i + s_j = 0$. Since $d_{ij} \ge 0$, it follows that $d_{ij} = 0$.

The inductive step. Assume that the statement holds for all $t \le t_0$; we show that the statement is also true for $t = t_0 + 1$. Let $i, j$ be a pair connected by a shortest path of length $t_0 + 1$. Since removing all nodes in $N_{min}(i, j)$ disconnects $i$ from $j$, the path between $i$ and $j$ must pass through some node $k \in N_{min}(i, j)$. In addition, the shortest paths from $i$ to $k$ and from $k$ to $j$ have lengths at most $t_0$. Thus, by the induction hypothesis we have $d_{ik} = d_{kj} = 0$. It follows from the sparse triangle inequality that $d_{ij} \le d_{ik} + d_{kj} = 0$. Thus, the statement holds for all $t > 0$.

Finally, we show the relationship between the LP relaxation of IPvd and that of MIPvd.

Theorem 3.4. The optimal solution of the LP relaxation of IPvd can be found by solving the LP relaxation of MIPvd, followed by an $O(mn + n^2\log n)$ tuning procedure.

Proof. Let $(s, d)$ be an optimal fractional solution of the LP relaxation of MIPvd. Associate a weight $d_{ij}$ with each edge $(i, j) \in E$. Let $d'_{ij}$ be the shortest distance between two nodes $(i, j)$ with the new edge weights. We have

• $d'_{ij} \ge d_{ij}$ for all $i, j$ and $d'_{ij} = d_{ij}$ $\forall (i, j) \in E$.
• $d'_{ij} = \min_{k=1}^{n}\{d'_{ik} + d'_{kj}\}$. Hence, $d'_{ij}$ is a pseudo-metric.

The first statement can be shown by the same induction as in the proof of Theorem 3.3. The second statement comes from the definition of $d'_{ij}$. Furthermore, we define $\bar{d}_{ij} = \min\{d'_{ij}, 1\}$. If we use Johnson's algorithm [28] to compute the all-pairs shortest paths $d'_{ij}$, the time complexity to construct $\bar{d}_{ij}$ from $d_{ij}$ is $O(mn + n^2\log n)$.

We shall prove that $(s, \bar{d})$ is a feasible solution of the LP relaxation of IPvd by showing that $(s, \bar{d})$ satisfies all constraints in IPvd. By definition, we have $\bar{d}_{ij} = \min\{d'_{ij}, 1\} \ge \min\{d_{ij}, 1\} = d_{ij}$ $\forall i, j$ and $\bar{d}_{ij} = d_{ij}$ $\forall (i, j) \in E$. Thus, for all $(i, j) \in E$, $\bar{d}_{ij} = d_{ij} \le s_i + s_j$. In addition, $\bar{d}$ is also a pseudo-metric, as $\bar{d}_{ik} + \bar{d}_{kj} \ge \min\{d'_{ik} + d'_{kj}, 1\} \ge \min\{d'_{ij}, 1\} = \bar{d}_{ij}$. From $\bar{d}_{ij} \ge d_{ij}$, we have $\sum_{i,j} \bar{d}_{ij} \ge \sum_{i,j} d_{ij} \ge (1 - \beta)\binom{n}{2}$ and $s_i \le d_{ij} \le \bar{d}_{ij}$. Thus, $(s, \bar{d})$ is a feasible solution of the LP relaxation of IPvd.

Obviously, the minimum objective of the LP relaxation of MIPvd is smaller than or equal to that of IPvd. Since the objective values associated with $(s, \bar{d})$ and $(s, d)$, a minimum solution of the LP relaxation of MIPvd, are the same, $(s, \bar{d})$ must be a minimum solution of the LP relaxation of IPvd.

3.3.3 Cutting Planes

We present a class of strong cutting planes together with the separation procedure to identify those cutting planes. These can be used in conjunction with cutting planes generated automatically by optimization packages to improve the convergence of the branch-and-cut algorithm.

3.3.3.1 Vertex-Connectivity and Invalid Inequalities

One often overlooked characteristic of solutions for clustering and partitioning problems on graphs is that clusters must induce connected subgraphs. This characteristic is not reflected in either the IPvd or the MIPvd formulation. A subset $S \subseteq V$ is a vertex-cut for a pair $(u, v)$ if removing $S$ from the graph $G$ disconnects $u$ and $v$. For every vertex-cut $S$ of $(u, v)$, if $\sum_{i \in S} s_i = |S|$, then $d_{uv}$ must be one.

Thus, we have the VC inequality

\[ \sum_{i \in S} s_i - d_{uv} \le |S| - 1. \]

This inequality is valid for all feasible points inside the polyhedron of MIPvd.

Algorithm 6. Separation procedure for VC inequalities
 1: for each pair $(u, v) \in V \times V$ do
 2:    Construct a flow network $G = (V, E)$ as follows
 3:    Assign $u$ and $v$ as source and sink, respectively
 4:    Each node $k \in V$ has capacity $\bar{s}_k$
 5:    Every edge has capacity $\infty$
 6:    if $(u, v) \in E$, then $(u, v)$ has capacity zero.
 7:    Find the maximum flow (min-cut)
 8:    if the maximum flow is less than $\bar{d}_{uv}$, then
 9:       Find the min vertex-cut $S$
10:       Add the VC inequality associated with $S$ to the MIP
11:    end if
12: end for

3.3.3.2 Separation Procedure for VC Inequalities

Given a point (fractional solution) $(s, d) \in \mathbb{R}^{\binom{n+1}{2}}$, an exact separation algorithm for some class of inequalities either finds a member of the class violated by $(s, d)$, or proves that no such member exists. In many cases, finding such an algorithm is intractable (NP-hard) and one has to settle for heuristic procedures. Fortunately, there is an exact algorithm for our separation procedure, based on finding the max-flow in the network with node capacities. The VC inequality can be rewritten as

\[ \sum_{i \in S} \bar{s}_i - \bar{d}_{uv} \ge 0, \quad S \text{ is any vertex-cut of } (u, v), \]

where $\bar{s}_i = 1 - s_i$ and $\bar{d}_{uv} = 1 - d_{uv}$.
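The equivalence between the original and the rewritten VC inequality can be checked on a small fractional point. The sketch below is illustrative only: the function names, the dictionary encoding of $s$, and the numerical tolerance `EPS` are assumptions made for the example. It tests a given vertex-cut and does not implement the max-flow separation itself.

```python
EPS = 1e-9  # tolerance for floating-point comparisons (assumed value)


def vc_violated(s, d_uv, cut):
    """Original form: violated when sum_{i in S} s_i - d_uv > |S| - 1."""
    return sum(s[i] for i in cut) - d_uv > len(cut) - 1 + EPS


def vc_violated_complement(s, d_uv, cut):
    """Rewritten form: violated when sum_{i in S} (1 - s_i) < 1 - d_uv."""
    return sum(1.0 - s[i] for i in cut) < (1.0 - d_uv) - EPS


s = {1: 0.9, 2: 0.8}   # fractional removal values on a vertex-cut S = {1, 2}
cut = [1, 2]
for d_uv in (0.5, 0.8):
    # Both forms should agree for every choice of d_uv.
    print(d_uv, vc_violated(s, d_uv, cut), vc_violated_complement(s, d_uv, cut))
```

In the rewritten form, the left-hand sum is exactly the capacity of the vertex-cut $S$ when node $i$ has capacity $\bar{s}_i$, which is why a minimum vertex-cut computation can serve as the separation routine.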

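Algorithm 7. Sharpest Decreasing Vertices (SDV)
 1: Start with some $\beta$-vertex disruptor $D \subseteq V$.
 2: Repeat
 3:    while (true) do
 4:       $u = \arg\min_{v \in D} \{P(G[V \setminus (D \setminus \{v\})])\}$
 5:       if ($D \setminus \{u\}$ is a $\beta$-disruptor) then $D = D \setminus \{u\}$ else break
 6:    for each ($w \in V \setminus D$) do
 7:       $D_w = D \cup \{w\}$
 8:       $u = \arg\min_{v \in D_w} \{P(G[V \setminus (D_w \setminus \{v\})])\}$
 9:       if ($u \ne w$) then $D = D_w \setminus \{u\}$
10: Until ($D$ not changing)
11: Output $D$.

Therefore, the point $(s, d)$ violates this inequality if and only if $\sum_{i \in S} \bar{s}_i < \bar{d}_{uv}$ …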

… up $s_{i_{k+1}}, s_{i_{k+2}}, \ldots, s_{i_n}$ to one, where $k$ runs from 1 to $n$. If the obtained solution is a $\beta$-vertex disruptor, the local search method described in Algorithm 7 is then used to refine the solution. The local search method refines the solution by repeatedly:

• Removing node(s) from the disruptor if possible
• Swapping a node $w$ outside the disruptor with a node $u$ in the disruptor that gives the sharpest decrease in connectivity.

The local search terminates when no improvement exists.

3.4 Experimental study

We perform experiments to measure the gap between the solution of the pseudo-approximation algorithm (Algorithm 5) and an optimal solution found by solving an integer programming formulation. We generate two types of networks: random networks following the Erdős-Rényi model and power-law networks following the Barabási-Albert model. For each type of network, we generate different instances with the number of nodes ranging from 30 to 100. Edge densities of the generated networks are around 10%. The machine used for the experiments had 8 cores at 2.2 GHz and 64 GB of memory.

The sizes of the disruptors found by Algorithm 5 and the sizes of the optimal disruptors are presented in Tables 3-1 and 3-2. Despite the large theoretical gap of the pseudo-approximation algorithm, the algorithm produces near-optimal solutions and returns optimal solutions in more than half of the cases (marked with bold numbers). In particular, our algorithm performs extremely well on power-law networks: it misses the optimal solution in only one case, when the number of vertices is 90. Between a random network and a power-law network of roughly the same size, the size of the disruptor in the power-law network is significantly smaller (approximately 50%) than that in the random network, showing the extremely high degree of vulnerability of power-law networks to attacks [8].

Table 3-1. Size of disruptor on Erdős-Rényi networks at 60% connectivity.

Vertex      30    40    50    60    70    80    90   100
Edge        43    78   122   177   241   316   400   495
Optimal      2     4     7     9    11    12    16    18
Approx       3     4     8     9    11    13    16    19

Table 3-2. Size of disruptor on Barabási-Albert networks at 60% connectivity.

Vertex      30    40    50    60    70    80    90   100
Edge        54   131   189   208   245   262   354   445
Optimal      1     3     5     6     6     5     7     9
Approx       1     3     5     6     6     5    10     9

Figure 3-2. Disruptors found by different methods in the Western States Power Grid of the United States at different levels of disruption.

Figure 3-3. Visualization of disruptors in the Western States Power Grid. (A) 70% connectivity; (B) 50% connectivity; (C) 30% connectivity; (D) 10% connectivity.

The running time for solving the integer program increases from a few minutes to 10 hours for the largest test cases, while in the longest run, the pseudo-approximation algorithm takes only 29 seconds.

3.4.1 Performance of the Branch-and-cut Algorithm

Table 3-3. Comparisons of IPvd and MIPvd on power-law networks

                          Removed   Time (seconds)        Constraints
Vertex   Edge    β        vertex    IPvd      MIPvd       IPvd           MIPvd
50       141     60.0%    4         63        8           60,167         4,861
150      286     1.0%     18        19,788    2           1,665,362      31,887
-        -       5.0%     15        18,070    7           -              32,161
-        -       8.0%     12        n/a       73          -              33,242
-        -       10.0%    11        n/a       1,363       -              39,615
-        -       20.0%    9         n/a       1,737       -              39,313
-        -       40.0%    7         n/a       2,149       -              42,830
-        -       60.0%    5         n/a       1,610       -              38,458
-        -       90.0%    2         26,277    147         -              34,321
200      387     60.0%    8         n/a       64,860      3,960,488      72,980
600      1,166   0.5%     69        n/a       48,918      107,641,467    516,656
1000     1,959   0.5%     198       n/a       747         499,340,027    1,437,326

We implement our branch-and-cut algorithm using GUROBI 4.0 on a computer with an Intel Xeon 2.93 GHz processor and 12 GB of memory. Table 3-3 shows results for IPvd and our new branch-and-cut algorithm (MIPvd) on power-law networks [13] of various sizes. We report, for each disruption level, the number of removed vertices in the optimal solution, the number of rows (constraints), nonzeros (nonzero coefficients), and the solving time.

As shown in Table 3-3, our branch-and-cut algorithm, utilizing the sparse metric technique and strong cutting planes, is substantially faster and more memory-efficient than the original branch-and-cut equipped in the GUROBI MIP solver. The speedup factor ranges from 8 times for 50 nodes to several thousand times for larger instances. For the network of 150 nodes, MIPvd often takes less than 30 minutes, while IPvd runs out of memory or does not terminate after 100,000 seconds (noted with n/a).

3.4.2 Case study: Western States Power Grid

We study a network of 4941 nodes and 6594 edges representing the topology of the Western States Power Grid of the United States. The network is known to be highly clustered with a small characteristic path length [84]; hence the network is rather vulnerable to targeted attacks. It is intractable to find the optimal disruptor using integer programming for such a large network. Our approximation algorithm uses a row-generation technique to reduce the excessive number of constraints and runs on a cluster of 20 nodes, each equipped with an 8-core 2.2 GHz CPU and 64 GB of memory.

We compare the attack schemes that target nodes based on their centrality with our pseudo-approximation algorithm to show that those methods might not be suitable to reveal network vulnerability in terms of overall network connectivity. The compared methods include

1. Degree Centrality: The algorithm sequentially removes the node with the maximum degree until the pairwise connectivity in the graph is less than $\beta\binom{n}{2}$.
2. Betweenness Centrality: We repeatedly remove the node with maximum betweenness centrality, until the pairwise connectivity in the graph is less than $\beta\binom{n}{2}$. Recall that the betweenness $Bt(v)$ of node $v$ is
   \[ Bt(v) = \sum_{\substack{s \ne v \ne t \in V \\ s \ne t}} \frac{\sigma_{st}(v)}{\sigma_{st}}, \]
   where $\sigma_{st}$ is the number of shortest paths from $s$ to $t$, and $\sigma_{st}(v)$ is the number of shortest paths from $s$ to $t$ that pass through node $v$.
3. Eigenvector Centrality: Nodes are removed in descending order of their eigenvector centrality (PageRank) values with the default damping factor of 85% as in [68].

We show in Figure 3-2 the vulnerability reported by the different methods at various levels of disruption. The network is surprisingly vulnerable to targeted attacks. For example, to reduce the connectivity in the network by 40% (60% connectivity remains) we only need to destroy 0.16% of the stations. To bring the connectivity down to the same level, the average numbers of nodes to remove for random networks and power-law networks are 13% and 3%, respectively. Even destroying only 1% of the stations can dramatically disrupt 90% of the connectivity in the network. None of the other methods can correctly reveal the vulnerability of the power grid. Their disruptor sizes are 6 to 20 times larger than those of our approximation algorithm. Thus, using alternative assessment methods rather than the ones we propose might lead to a dangerous mirage that the network is strongly stable.

Because of the high clustering property, nodes that lie between clusters in the network will often have high betweenness values. Intuitively, we expected the betweenness method to easily identify those nodes and perform well in the experiment. Surprisingly, the performance of the betweenness method turns out to be even worse than that of degree centrality.

We visualize the network fragmentation at varied disruptive levels in Figures 3-3C and 3-3D. The disruptor separates the network into large connected components. We observe that not all nodes in the disruptor at the 30% connectivity level are in the disruptor at the 10% level. Hence, we cannot assume that the disruptor at a lower connectivity level is a superset of the disruptor at a higher level. This explains why centrality assessment methods, in which nodes are selected in a fixed order, fail to exhibit the vulnerability of the network at different disruptive levels.
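The degree-centrality baseline from the comparison above can be sketched as follows. This is a minimal illustration under stated assumptions: the graph is undirected and given as a vertex list and an edge list, degrees are recomputed in the residual graph after each removal, ties are broken by the smallest node label, and removal stops once the pairwise connectivity is at most $\beta\binom{n}{2}$. The function names are hypothetical and the code is not the implementation used in the experiments.

```python
from collections import defaultdict


def pairwise_connectivity(nodes, adj):
    """Count node pairs joined by a path: sum of C(size, 2) over components."""
    seen, total = set(), 0
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:                      # depth-first search over one component
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w in nodes and w not in seen:
                    seen.add(w)
                    stack.append(w)
        total += size * (size - 1) // 2
    return total


def degree_attack(vertices, edges, beta):
    """Remove max-degree nodes until connectivity is at most beta * C(n, 2)."""
    nodes = set(vertices)
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    n = len(nodes)
    limit = beta * n * (n - 1) / 2
    removed = []
    while pairwise_connectivity(nodes, adj) > limit:
        # Degree is measured in the residual graph (an assumption of this sketch).
        u = max(sorted(nodes), key=lambda x: len(adj[x] & nodes))
        nodes.remove(u)
        removed.append(u)
    return removed


star = [(0, 1), (0, 2), (0, 3), (0, 4)]
path = [(1, 2), (2, 3), (3, 4), (4, 5)]
print(degree_attack(range(5), star, beta=0.5))
print(degree_attack([1, 2, 3, 4, 5], path, beta=0.5))
```

On the path example the degree rule picks one of the tied interior nodes by label rather than the middle node, which illustrates how a fixed centrality order can miss the removal that fragments the network most.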

CHAPTER 4
JOINT LINK AND NODE ATTACKS

4.1 Joint Links and Node Attacks

We begin with a sample network in Fig. 4-1A that shows the advantage of the pairwise connectivity metric over node centrality measures. Assume two nodes are to be removed from our simple example. If the two nodes are selected according to their degree centrality, nodes 1 and 2 will be removed and the network remains connected. However, if we remove nodes to minimize the pairwise connectivity, nodes 3 and 7 will be targeted, and the network is effectively broken into two smaller components. The fraction of pairwise connectivity in the residual network, denoted by $\beta$, reduces drastically to $\beta = 18/55 \approx 33\%$. Apparently, optimizing the pairwise-connectivity metric reveals more accurate insights on the network vulnerability.

Fig. 4-1 also illustrates a fundamental shortcoming of existing work: the inability to assess network vulnerability under joint node and link attacks. The three sub-figures show the minimum cost attack strategies to reduce the pairwise connectivity to $\beta = 50\%$, assuming each link has cost 2 and each node has cost 3. While the minimum costs for both node attack (Fig. 4-1A) and link attack (Fig. 4-1B) are 6, the minimum cost for node-link attacks (node 3 and link $(6,7)$) (Fig. 4-1C) is only 5. Thus, it is insufficient to assess link vulnerability and node vulnerability separately when both links and nodes in the network can be targeted. To make matters worse, assume node 3 and link $(6,7)$ have the same cost $\epsilon > 0$; the minimum costs for node, link, and node-link attacks will then be $3+\epsilon$, $4+\epsilon$, and $2\epsilon$, respectively. As the ratios $(3+\epsilon)/(2\epsilon)$ and $(4+\epsilon)/(2\epsilon)$ grow without bound as $\epsilon \to 0$, the existing methods can seriously misjudge the network vulnerability.

To address the shortcoming, we study the effect of joint node and link attacks in terms of connectivity. We introduce a new problem, called $\beta$-disruptor, that finds a minimum cost set of nodes and links whose removal degrades the pairwise connectivity to a great extent (a fraction $\beta$). The $\beta$-disruptor problem aims to provide a more
comprehensive assessment of network vulnerability. It generalizes both the $\beta$-vertex disruptor and the $\beta$-edge disruptor problems proposed in our previous work [34]. To the best of our knowledge, this is the first work to address the effect of simultaneous attacks on both links and nodes on network connectivity.

Figure 4-1. Minimum cost solutions to reduce 50% of the connectivity, assuming links have cost 2 and nodes have cost 3. A) Node attack, min cost = 6. B) Link attacks, min cost = 6. C) Link-node attack, min cost = 5.

Our contributions are summarized as follows.

- Providing an underlying framework toward assessing vulnerability under joint node and link attacks and formulating it as an optimization problem, $\beta$-disruptor. Other performance measures such as the maximum flow between a given source-destination pair [3, 66], the average maximum flow between pairs of nodes [66], etc. can also be used in place of pairwise connectivity to define new problems.

- Our major result is an $O(\sqrt{\log n})$ bicriteria approximation algorithm for both undirected and directed networks. The algorithm finds a $\beta$-disruptor with cost at most $O(\sqrt{\log n})$ times that of an optimal $\beta'$-disruptor, with $\beta'$ slightly less than $\beta$.

- We propose an efficient meta-heuristic which is a special combination of simulated annealing, variable neighborhood search, and spectral methods. The efficacy and
scalability of our proposed algorithms are shown through extensive experiments on both synthetic and real-world datasets.

Related work. Many existing works on network vulnerability assessment mainly focus on local centrality measurements to differentiate between critical links and nodes and the others, see [34, 59]. Other global graph measures have also been proposed to assess network vulnerability. These measures are mainly functions of graph properties, such as the diameter, global clustering coefficient, etc. [48, 64]. Matisziw and Murray [59] first proposed pairwise connectivity as an effective measurement and used mathematical programming to solve for exact solutions. Arulselvan et al. later defined the Critical Node/Edge problems, in which the main objective is to identify the top $k$ nodes/links whose removal minimizes the pairwise connectivity in the residual network, and provided NP-completeness proofs and integer programming formulations. However, the run-time for exact solutions scales exponentially with the network size. Di Summa et al. [31] proved that the critical node detection problem (CNP) is also NP-complete on trees for the total weighted pairwise connectivity metric. Shen et al. [73] proved that the CNP is polynomially solvable in trees and series-parallel graphs for the cases when the nodes have uniform costs and the objective is either minimizing the size of the largest component or maximizing the number of residual components. Neumayer et al. [66]

We first proposed the vulnerability assessment methods in the form of the optimization problems $\beta$-edge/vertex disruptor in [34]. The paper presents the NP-hardness of the $\beta$-edge/vertex disruptor problems, an $O(\log^{1.5} n)$ bicriteria approximation algorithm for $\beta$-edge disruptor, and an $O(\log n \log\log n)$ bicriteria approximation algorithm for $\beta$-vertex disruptor. Finally, we note that none of the previous works considers multiple attacks that happen at both links and nodes at the same time.

Organization. We briefly present terminologies and problem definitions in Section 4.1. Then we propose the $O(\sqrt{\log n})$ bicriteria approximation algorithm for $\beta$-disruptor
in Section 4.2. Section 4.3 presents the efficient heuristic to find a $\beta$-disruptor. We obtain numerical results for the presented algorithms in Section 4.4.

In this chapter, we study the $\beta$-disruptor problem, formulated in [34], in which the goal is to locate a minimum set of edges (nodes) to remove so that the pairwise connectivity falls to a certain level. The $\beta$-disruptor problem takes into account the roles of all edges and vertices in the global network connectivity, and thus provides a more thorough analysis of the underlying vulnerability framework.

We assume that each vertex $u \in V$ is associated with a cost $c(u) \ge 0$ and each edge $(u,v) \in E$ has a cost $c(u,v) \ge 0$. For convenience, we also denote the number of nodes and links by $n$ and $m$, respectively.

$\beta$-disruptor. Given $0 \le \beta \le 1$, a $\beta$-disruptor $D_\beta$ is a pair of subsets $D_\beta = (V_\beta \subseteq V, E_\beta \subseteq E)$ whose removal from $G$ makes the pairwise connectivity in the residual graph $G' = (V \setminus V_\beta,\ E \setminus (E_\beta \cup V_\beta \times V))$ at most $\beta\binom{n}{2}$. The $\beta$-disruptor problem asks for a $\beta$-disruptor with the minimum total cost

$$c(D_\beta) = \sum_{u \in V_\beta} c(u) + \sum_{e \in E_\beta} c(e).$$

There are two special types of $\beta$-disruptor: if $V_\beta = \emptyset$, then $D_\beta$ is a $\beta$-edge disruptor; and if $E_\beta = \emptyset$, then $D_\beta$ is a $\beta$-vertex disruptor. The uniform-cost versions of the $\beta$-edge disruptor problem and the $\beta$-vertex disruptor problem were previously studied in [34]. By definition, $\beta$-vertex disruptor can be seen as a special case of $\beta$-disruptor when all edges have infinite costs, and $\beta$-edge disruptor is a special case of $\beta$-disruptor when all vertices have infinite costs. Since both vertex and edge disruptor are NP-hard, the $\beta$-disruptor problem is also NP-hard for $0 < \beta < 1$.
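As a concrete illustration of the definition above, the following minimal Python sketch checks whether a candidate pair of removed nodes and edges is a $\beta$-disruptor and computes its cost. It assumes an undirected graph given as plain node and edge lists; the function names (`pairwise_connectivity`, `is_beta_disruptor`, `disruptor_cost`) and the 5-node path example are illustrative choices, not part of the original text.

```python
def pairwise_connectivity(nodes, edges):
    """Count unordered node pairs joined by a path in an undirected graph."""
    parent = {v: v for v in nodes}

    def find(v):
        # Union-find root lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)

    sizes = {}
    for v in nodes:
        root = find(v)
        sizes[root] = sizes.get(root, 0) + 1
    # A component with s nodes contributes s*(s-1)/2 connected pairs.
    return sum(s * (s - 1) // 2 for s in sizes.values())


def is_beta_disruptor(nodes, edges, removed_nodes, removed_edges, beta):
    """Check P(G') <= beta * C(n, 2), with n counted in the original graph."""
    n = len(nodes)
    gone_nodes = set(removed_nodes)
    gone_edges = {frozenset(e) for e in removed_edges}
    kept_nodes = [v for v in nodes if v not in gone_nodes]
    kept_edges = [
        (u, v) for u, v in edges
        if u not in gone_nodes and v not in gone_nodes
        and frozenset((u, v)) not in gone_edges
    ]
    return pairwise_connectivity(kept_nodes, kept_edges) <= beta * n * (n - 1) / 2


def disruptor_cost(removed_nodes, removed_edges, node_cost, edge_cost):
    """Total cost c(D): removed node costs plus removed edge costs."""
    return (sum(node_cost[v] for v in removed_nodes)
            + sum(edge_cost[frozenset(e)] for e in removed_edges))


# Hypothetical example: the path 1-2-3-4-5 with node cost 3 and link cost 2.
nodes = [1, 2, 3, 4, 5]
edges = [(1, 2), (2, 3), (3, 4), (4, 5)]
node_cost = {v: 3 for v in nodes}
edge_cost = {frozenset(e): 2 for e in edges}
```

In this example, removing node 3 leaves two components of two nodes each, i.e. 2 connected pairs out of $\binom{5}{2} = 10$, so `is_beta_disruptor(nodes, edges, [3], [], 0.2)` holds while the same removal fails for $\beta = 0.1$.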

4.1.1 Mixed Integer Linear Programming

The $\beta$-disruptor problem can be formulated as a Mixed Integer Linear Programming (MILP) problem as follows:

$$\text{minimize} \quad \sum_{u \in V} c(u)\, s_u + \sum_{e \in E} c(e)\, x_e \tag{4-1}$$
$$\text{subject to} \quad d_{uv} \le s_u + s_v + x_{uv}, \quad (u,v) \in E \tag{4-2}$$
$$d_{uv} + d_{vw} \ge d_{uw}, \quad (u,v) \in E \tag{4-3}$$
$$\sum_{u \ne v} d_{uv} \ge (1-\beta)\binom{n}{2} \tag{4-4}$$
$$0 \le s_u,\ d_{uv},\ x_{uv} \le 1, \quad \forall u, v \tag{4-5}$$
$$s_u,\ x_{uv} \in \{0,1\}, \quad (u,v) \in E \tag{4-6}$$

where $s_u = 1$ if node $u$ is removed and $s_u = 0$ otherwise. Similarly, $x_{uv} = 1$ indicates the removal of edge $(u,v)$. The variables $d_{uv}$ represent the distance (or disconnectivity) between $u$ and $v$ in the residual network, i.e., $d_{uv} = 0$ if $u$ and $v$ are connected and $d_{uv} > 0$ otherwise. The objective minimizes the total cost of the removed vertices and edges, subject to constraint (4-4) that the pairwise connectivity in $G$ is at most $\beta\binom{n}{2}$. Constraint (4-3), known as the triangle inequality, implies that if $u$ and $v$ are connected, and $v$ and $w$ are connected, then $u$ and $w$ must be connected. Constraint (4-2) guarantees that for each edge $(u,v) \in E$, the distance $d_{uv} > 0$ only if either $u$, $v$, or edge $(u,v)$ is removed.

Our MILP formulation offers two special improvements over the direct integer programming formulations [12]. Firstly, it has only $m+n$ integral variables ($s_u$ and $x_e$), as we can prove that the $\binom{n}{2}$ variables $d_{uv}$ are not required to be integers [33]. Secondly, it has only $O(mn)$ constraints, while formulations for similar problems [12] have at least $\Theta(n^3)$ constraints. For many real networks where the average degree is bounded by a constant, the number of constraints is only $O(n^2)$, which leads to a huge
reduction in memory and running time. The following lemma states the correctness of our formulation.

Lemma 8. The optimal solution of the MILP above induces a minimum cost $\beta$-disruptor $D_\beta = (V_\beta, E_\beta)$ of $G$, where $V_\beta = \{u \mid s_u = 1\}$ and $E_\beta = \{(u,v) \mid x_{uv} = 1\}$.

The proof is similar to the case of the MILP for the $\beta$-vertex disruptor problem in [33], and is omitted here.

4.1.2 Nodes and Edges with Excess Costs

We give simple criteria to quickly identify safe nodes and edges that are not vulnerabilities of the network due to their excess costs. This helps narrow down the search space for vulnerabilities and suggests the protection priorities and resource allocation to other network elements. Since removing either $u$ or $v$ causes more disruption than removing the edge $(u,v)$, edges with excess costs can be identified based on the following lemma.

Lemma 9. An edge $(u,v) \in E$ with $c(u,v) > \min\{c(u), c(v)\}$ will not appear in any optimal $\beta$-disruptor for any $\beta \ge 0$.

The lemma also reflects the fact that nodes' costs are often higher than the costs of the incident links. Similarly, removing a node has the same effect as removing all the incident edges; a node $u$ is an excess cost node if $c(u)$ is higher than the total cost of the incident edges.

Lemma 10. A vertex $u$ with $c(u) > \sum_{(u,v) \in E} c(u,v)$ will not appear in any optimal $\beta$-disruptor for any $\beta \ge 0$.

For non-excess nodes and edges, Lemmas 9 and 10 provide the relative caps for how much extra resource we should allocate to those network elements.

4.2 Bicriteria Approximation Algorithm for Joint Link & Node Attacks

In this subsection, we present an $O(\sqrt{\log n})$ bicriteria approximation algorithm for the $\beta$-disruptor problem. Since both $\beta$-vertex disruptor and $\beta$-edge disruptor are
special cases of $\beta$-disruptor, our new algorithm improves on the best results for $\beta$-vertex disruptor, the $O(\log n \log\log n)$ bicriteria approximation algorithm, and for $\beta$-edge disruptor, the $O(\log^{3/2} n)$ bicriteria approximation algorithm [34].

4.2.1 Algorithm Description

We will refer to the input network as the original network. We first reduce the $\beta$-disruptor problem in the original network to an instance of the $\beta$-edge disruptor problem in an auxiliary directed graph. The reduction maps each undirected edge to two alternating directed edges and each node to a surrogate edge. More importantly, we show that the reduction 'preserves' relative performance guarantees. We then apply a recursive cut procedure to find a near-optimal set of both alternating edges and surrogate edges that corresponds to a $\beta$-disruptor in the original network.

Our algorithm JLNA($G$) to find a $\beta$-disruptor in a directed graph $G$ is summarized in Algorithm 8. In the first phase, the algorithm constructs an auxiliary graph $G'$ by splitting each vertex $v \in V$ into two new vertices $v^+$ and $v^-$. Formally, the sets of vertices and edges in $G'$ are defined as

$$V' = \{v^-, v^+ \mid v \in V\}$$
$$E' = \{(v^-, v^+) \mid v \in V\} \cup \{(u^+, v^-) \mid (u,v) \in E\}$$

In addition, we assign costs $c'(\cdot)$ for edges in $G'$ as follows: $c'(v^-, v^+) = c(v)$ for the surrogate edge $(v^-, v^+)$, and $c'(u^+, v^-) = c'(v^+, u^-) = c(u,v)$ for the alternating edges $(u^+, v^-)$ and $(v^+, u^-)$. In the case that $E$ is a mix of both undirected and directed edges, we can also convert each directed edge $(p,q) \in E$ into an alternating edge $(p^+, q^-) \in E'$ with cost $c'(p^+, q^-) = c(p,q)$.

In the second phase, the recursive cut procedure, shown in lines 4 to 11, constructs a $\tilde\beta$-edge disruptor of $G'$, denoted by $E_{\tilde\beta}$. Here, for a given $0 < \beta' < \beta$, $\tilde\beta = \frac{1}{2}(\beta + \beta')$. The $\tilde\beta$-edge disruptor is found by iteratively applying a subroutine SPARSE CUT on the strongly connected components in $G'$. The subroutine SPARSE CUT cuts the components
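into smaller ones, and the edges in a subset of the cuts are added to $E_{\tilde\beta}$. The process continues until the pairwise connectivity in the graph reduces to $\tilde\beta\binom{n}{2}$ or smaller. By the end of the second phase, $E_{\tilde\beta}$ is mapped back to edges and nodes in $G$ to give a $\beta$-disruptor.

Algorithm 8: JLNA($G$)
1. Construct the auxiliary graph $G' = (V', E')$
2. $\tilde\beta \leftarrow \frac{1}{2}(\beta + \beta')$
3. $E_{\tilde\beta} \leftarrow \emptyset$
4. for each SCC $C$ in $G'$
5.     $(C_E, \phi_C) \leftarrow$ SPARSE CUT($C$)
6. while $P(G') > \tilde\beta\binom{n}{2}$
7.     Find a SCC $C$ of $G'$ with minimum cut ratio $\phi_C$
8.     $E_{\tilde\beta} \leftarrow E_{\tilde\beta} \cup C_E$
9.     Remove edges in $C_E$ from $G'$
10.    for each new component $C'$ in $G'$
11.        $(C'_E, \phi_{C'}) \leftarrow$ SPARSE CUT($C'$)
12. $V_\beta \leftarrow \{v \mid (v^-, v^+) \in E_{\tilde\beta}\}$
13. $E_\beta \leftarrow \{(u,v) \mid (u^+, v^-) \in E_{\tilde\beta}\}$
14. return $D_\beta = (V_\beta, E_\beta)$

As shown in lines 4 and 5, the subroutine SPARSE CUT is applied to each strongly connected component $C$ to find a minimum ratio cut $\langle S', \bar S' \rangle$ in $C$. The cut ratio for a cut is defined as follows.

Definition 6. Let $G' = (V', E')$ be a directed graph. The ratio of a cut $\langle S', \bar S' \rangle$ is $\phi(S') = \dfrac{c_{out}(S')}{|S'|\,|\bar S'|}$, where $c_{out}(S')$ is the total cost of edges coming out from $S'$. In addition, a cut with the minimum cut ratio is called a minimum ratio cut, and its ratio is denoted by $\phi(G') = \min_{S' \subset V'} \phi(S')$.

The output of SPARSE CUT is a pair $(C_E, \phi_C)$, where $C_E = \langle S', \bar S' \rangle$ and $\phi_C = \phi(S')$. For simplicity, we postpone the description of SPARSE CUT until the proof of the approximation ratio.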
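The reduction used in line 1 of Algorithm 8, and the mapping back to the original network in lines 12 and 13, can be sketched as follows. This is only an illustration under stated assumptions: the input is an undirected graph, split vertices are represented as tuples `(v, "-")` and `(v, "+")`, and the names `build_auxiliary_graph` and `map_back` are hypothetical. The SPARSE CUT subroutine itself is not implemented here.

```python
def build_auxiliary_graph(nodes, edges, node_cost, edge_cost):
    """Return a dict mapping each directed arc of G' to its cost c'.

    Each node v becomes (v, '-') and (v, '+'), joined by a surrogate arc of
    cost c(v); each undirected edge {u, v} becomes two alternating arcs.
    """
    arcs = {}
    for v in nodes:
        arcs[((v, "-"), (v, "+"))] = node_cost[v]
    for u, v in edges:
        cost = edge_cost[(u, v)]
        arcs[((u, "+"), (v, "-"))] = cost
        arcs[((v, "+"), (u, "-"))] = cost
    return arcs


def map_back(cut_arcs):
    """Translate removed arcs of G' into removed nodes and edges of G."""
    removed_nodes, removed_edges = set(), set()
    for (tail, _), (head, _) in cut_arcs:
        if tail == head:
            removed_nodes.add(tail)                     # surrogate arc (v-, v+)
        else:
            removed_edges.add(frozenset((tail, head)))  # alternating arc
    return removed_nodes, removed_edges


# Hypothetical two-node example with costs c(a) = 3, c(b) = 3, c(a, b) = 2.
aux = build_auxiliary_graph(["a", "b"], [("a", "b")],
                            {"a": 3, "b": 3}, {("a", "b"): 2})
```

Because every node removal in $G$ corresponds to cutting one surrogate arc of equal cost in $G'$, a cut found in the auxiliary graph translates directly into a joint node and link removal of the same total cost.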

In the main loop of JLNA, presented in lines 6 to 11, for each round we select, among the existing SCCs, a SCC $C$ in $G'$ that has the smallest cut ratio. Let $C_E$ and $\phi_C$ be the cut set and the cut ratio of the cut found by SPARSE CUT in $C$. We add $C_E$ to $E_{\tilde\beta}$ and remove $C_E$ from $G'$. Removing $E_{\tilde\beta}$ breaks $C$ into two or more strongly connected components. We again apply SPARSE CUT on those components to find the minimum ratio cuts. The main loop terminates when the pairwise connectivity in $G'$ is no more than $\tilde\beta\binom{n}{2}$. Then we construct the final solution by mapping each surrogate edge $(v^-, v^+) \in E_{\tilde\beta}$ to the node $v$ in $G$, and each alternating edge $(u^+, v^-) \in E_{\tilde\beta}$ to the edge $(u,v)$ in $G$.

Figure 4-2. Removing $(u,v)$ breaks the network into four strongly connected components, including the red and green components which are not even incident to $(u,v)$.

4.2.2 Analysis of Approximation Ratio

We show that the JLNA algorithm is an $O(\sqrt{\log n})$ bicriteria approximation algorithm for the $\beta$-disruptor problem. We first show in Lemma 11 the connection between the cost of an optimal $\beta$-disruptor and the minimum cut ratio. This is the key lemma that shows the relation between (bipartite) sparsest cuts and general (multi-way) cuts.

Cuts in directed networks have different characteristics in comparison to their counterparts in undirected networks. First, the cut ratios of $\langle S, \bar S \rangle$ and $\langle \bar S, S \rangle$ are different in general. In addition, different cuts may be associated with the same set of links. For example, the cuts defined by
$S = \{\text{blue nodes}\}$ and $S = \{\text{blue and green nodes}\}$ are associated with the same set of links $\{(u,v)\}$. To treat these differences, we use a randomized argument in the following lemma. Second, components in directed networks are highly interdependent. As illustrated in Fig. 4-2, the failure of link $(u,v)$ effectively breaks the network into four disconnected components. The red and green components lose communication with other parts of the network, even though none of their incoming and outgoing edges, colored in black, fail. In contrast, the only way to separate a component from the rest in undirected networks is to remove all links incident to the component.

To link the average cost to disrupt connected pairs in an optimal $\beta$-disruptor to the minimum cut ratio, we consider the random partitions of SCCs in the residual graph, subject to their topological order.

Figure 4-3. SCCs in the residual graph $G[E \setminus M]$. The cut $\langle S, \bar S \rangle$ consists of a subset of $M$: edges from $C_4$ to $C_3$ and from $C_5$ to $C_2$.

Lemma 11. Given a directed graph $G = (V, E)$ and a subset of edges $M \subseteq E$, if $\omega_M = P(G) - P(G[E \setminus M]) > 0$, then

$$\frac{c(M)}{\omega_M} \ge \frac{1}{3}\,\phi_{\min}(G),$$

where $\phi_{\min}(G) = \min\{\phi(C) \mid C \text{ is a SCC of } G\}$.

Proof. Firstly, we consider the case that $G$ is strongly connected. Then $\phi_{\min}(G) = \phi(G)$ and $P(G) = \binom{n}{2}$. Let $C_1, C_2, \ldots, C_k$ be the SCCs in $G[E \setminus M]$ and let $C_i(V)$ denote the set of vertices in component $C_i$. We have $\omega_M = \sum_{i}$

Observe that if we contract each SCC into a single node, we obtain the graph of SCCs, which is a directed acyclic graph. Thus, there is a topological order for the SCCs, and we follow the convention that vertices with no incoming edges have the smallest orders. Thus, w.l.o.g., we assume that the removed edges always go from SCCs with higher orders to SCCs with lower orders. Consider all cuts $\langle S, \bar S \rangle$ of $G$ that satisfy the following:

1. Either $C_i(V) \subseteq S$ or $C_i(V) \subseteq \bar S$.
2. If $C_i(V) \subseteq S$ and there exists an edge from $C_i(V)$ to $C_j(V)$ in $G[E \setminus M]$, then $C_j \subseteq S$.

An example of a cut $\langle S, \bar S \rangle$ is given in Fig. 4-3. Clearly, $\langle S, \bar S \rangle \subseteq M$; hence, $c_{out}(S) \le c(M)$. For a given pair of SCCs $C_l$ and $C_k$, the probability that $C_l(V)$ and $C_k(V)$ belong to different sides of the cut is at least $1/3$, since there are four possible ways of assigning $C_l(V)$ and $C_k(V)$ to the two sides of the cut, and at most one out of four is forbidden according to the second condition. Thus, the $|C_l(V)|\,|C_k(V)|$ pairs of vertices between $C_l$ and $C_k$ are separated with probability at least $1/3$. Hence, the expected number of pairs separated by a cut $\langle S, \bar S \rangle$ is at least $E[|S|\,|\bar S|] \ge \frac{1}{3} \sum_{i}$

connected component, we have

$$c(M) = \sum_j c(M^{(j)}) \ge \frac{1}{3} \sum_j \phi(T_j)\big(P(T_j) - P(T'_j)\big) \ge \frac{1}{3}\,\phi_{\min}(G) \sum_j \big(P(T_j) - P(T'_j)\big) = \frac{1}{3}\,\phi_{\min}(G)\big(P(G) - P(G[E \setminus M])\big)$$

Thus, the lemma holds for every directed graph $G$.

The quality and performance of JLNA depend on the selection of SPARSE CUT. For example, an exact algorithm to find the minimum ratio cut will lead to a constant factor bicriteria approximation algorithm for $\beta$-disruptor. Unfortunately, finding the min ratio cut is an NP-hard problem [10]. Thus we have to rely on approximation algorithms to find a good ratio cut in the graph.

Theorem 4.1. For any fixed $0 < \beta' < \beta$, algorithm JLNA finds a $\beta$-disruptor of cost at most $O(\sqrt{\log n}) \cdot OPT_{\beta'}$, where $OPT_{\beta'}$ is the cost of a minimum $\beta'$-disruptor.

Proof. The proof consists of two steps. In the first step, we prove that $D_\beta = (V_\beta, E_\beta)$ is a $\beta$-disruptor of $G$. In the second step, we prove that the cost of $D_\beta$ is at most $O(\sqrt{\log n})$ times the cost of a minimum $\beta'$-disruptor, denoted by $OPT_{\beta'}$.

In order to prove that $D_\beta$ is a $\beta$-disruptor of $G$, we show that the pairwise connectivity in $G$ after removing $D_\beta$, i.e., in $G[-D_\beta] = (V \setminus V_\beta,\ E \setminus (E_\beta \cup V_\beta \times V))$, is at most $\beta\binom{n}{2}$. First, observe that vertices $v^-$ and $v^+$ are either in the same SCC or both isolated. Here, we say a vertex is isolated if it belongs to a SCC of size one. Assume that $G'[E' \setminus E_{\tilde\beta}]$ can be decomposed into SCCs $C'_1, C'_2, \ldots, C'_l$ and $2t$ isolated vertices $w^-_1, w^+_1, \ldots, w^-_t, w^+_t$. Based on the construction of $G'$, we can verify that there are $l$ corresponding SCCs $C_1, C_2, \ldots, C_l$ and $t$ isolated vertices $w_1, w_2, \ldots, w_t$ in $G[-D_\beta]$. Moreover, $|C'_i| = 2|C_i|$ for $i = 1, \ldots, l$. Therefore, we have

$$\tilde\beta\binom{2n}{2} \ge P(G'[E' \setminus E_{\tilde\beta}]) = \sum_i \binom{|C'_i|}{2} = 4\sum_i \binom{|C_i|}{2} + \sum_i |C_i| = 4P(G[-D_\beta]) + (n - t)$$
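The middle equality in the chain above rests on an elementary binomial identity. It is spelled out here for a single component with $|C_i| = k$, so that $|C'_i| = 2k$; the shorthand $k$ is introduced only for this remark.

```latex
\binom{2k}{2} = \frac{2k(2k-1)}{2} = 2k^2 - k
             = 4\cdot\frac{k(k-1)}{2} + k = 4\binom{k}{2} + k .
```

Summing over $i$ gives $\sum_i \binom{|C'_i|}{2} = 4\sum_i \binom{|C_i|}{2} + \sum_i |C_i|$. The last step of the chain then uses the fact that the first sum counts the connected pairs of the residual graph, together with $\sum_i |C_i| = n - t$ as stated in the text.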

Since $\tilde\beta < \beta$, we have

$$P(G[-D_\beta]) \le \frac{1}{4}\left(\tilde\beta\binom{2n}{2} - (n - t)\right) \le \beta\binom{n}{2}$$

Thus, we have completed the first step. We prove the second step as follows. Let $D_{\beta'} = (V_{\beta'}, E_{\beta'})$ be a minimum $\beta'$-disruptor, i.e., $c(D_{\beta'}) = OPT_{\beta'}$. Define

$$E'_{\beta'} = \{(v^-, v^+) \mid v \in V_{\beta'}\} \cup \{(u^+, v^-) \mid (u,v) \in E_{\beta'}\}.$$

By mapping the SCCs of $G[-D_{\beta'}]$ to those of $G'[E' \setminus E'_{\beta'}]$ as in the first step, we can show that $E'_{\beta'}$ is a $\beta'$-edge disruptor of $G'$. Thus,

$$OPT_{\beta'}(G) \ge OPT^E_{\beta'}(G').$$

Since $\beta' < \tilde\beta$, by Lemma 11, if removing a set of edges $M_\omega \subseteq E$ disrupts $\omega$ pairs of vertices, then $c(M_\omega)/\omega \ge \frac{1}{3}\phi_{\min}(G)$. At any round in the while loop of JLNA, since a set of edges $E_{\beta'}$ in a minimum $\beta'$-edge disruptor, for some $0 < \beta' < \beta$, can disrupt at least $(\beta - \beta')\binom{n}{2}$ additional pairs in $G$, we have

$$\frac{OPT^E_{\beta'}}{(\beta - \beta')\binom{n}{2}} \ge \frac{1}{3}\,\phi_{\min}(G).$$

In addition, since our cut procedure is an $O(\sqrt{\log n})$ factor approximation algorithm for the min cut ratio problem, the average cost to disrupt a pair by removing $C_E$ is upper bounded by $O(\sqrt{\log n})\,\phi_{\min}(G)$. By the inequality above, the average cost to disrupt pairs in the graph at any step is at most $O(\sqrt{\log n})\, OPT^E_{\beta'} / \big((\beta - \beta')\binom{n}{2}\big)$. Therefore, even when the removed edges disrupt all $\binom{n}{2}$ pairs in $G$, the total cost is no more than

$$O(\sqrt{\log n})\, \frac{OPT^E_{\beta'}}{(\beta - \beta')\binom{n}{2}} \binom{n}{2} \le \frac{O(\sqrt{\log n})}{\beta - \beta'}\, OPT^E_{\beta'}.$$

Thus we have

$$c(E_{\tilde\beta}) \le O(\sqrt{\log 2n})\, OPT^E_{\beta'}(G') \le O(\sqrt{\log n})\, OPT_{\beta'}(G).$$

That yields the proof.

Remarks. While the JLNA algorithm can provide a performance guarantee on the produced solution, it can be further improved. First, JLNA often disrupts more than the required fraction of the connected pairs, which results in a higher cost solution. Second, the SPARSE CUT procedure in JLNA has a high time complexity of $O(n^{9.5})$ (it involves solving a large semidefinite program). We address these issues of JLNA to provide an improved algorithm in the next section.

4.3 Hybrid Meta-heuristic

We propose in Algorithm 9 a hybrid meta-heuristic (HMH) that improves over JLNA w.r.t. the following two aspects:

1. HMH avoids disrupting more pairs than necessary by controlling the difference between the connectivity in the residual graph and the target connectivity. The pairwise connectivity is kept within $(\beta \pm \delta)\binom{n}{2}$, where $\delta$ is a positive parameter that is iteratively reduced by half. HMH returns the minimum cost $\beta$-disruptor encountered during the search as the solution.

2. HMH improves the running time by replacing the sparse cut algorithm in [2] with an efficient spectral partitioning method in subsection 4.3.2. Further, the solution is refined in each step with lightweight local search methods in subsection 4.3.3.

4.3.1 Controlling the pairwise connectivity

We use a parameter $\delta$, similar to the heating condition in Simulated Annealing [53], to control how far the pairwise connectivity in the graph can diverge from the target connectivity $\beta\binom{n}{2}$. Every round, $\delta$ is reduced by half until it is negligibly small. First, HMH recursively separates the SCCs in the graph with a spectral partitioning method (described in the next part) until the pairwise connectivity is at most $(\beta - \delta)\binom{n}{2}$ (the first repeat loop of Algorithm 9). If multiple SCCs can be cut at the same time, the algorithm selects the SCC with the minimum ratio cut, as in the JLNA algorithm. The algorithm then proceeds in two phases, condensation and exploration, in which we improve the solution in terms of cost and cut ratio with the local search methods in subsection 4.3.3. For simplicity, we use the term neighbor to refer to a candidate solution that can be obtained from the current solution $E_\beta$ by applying one of the local changes in subsection 4.3.3.
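The halving schedule of the control parameter can be made concrete with a few lines of Python. The sketch below assumes the initialization $\delta = \min\{\beta, 1-\beta\}$ and the stopping threshold $1/\binom{n}{2}$ that appear in Algorithm 9; the function name `delta_schedule` and the example values are illustrative only.

```python
from math import comb


def delta_schedule(n, beta):
    """Successive values of delta used by the rounds of the search."""
    delta = min(beta, 1.0 - beta)
    threshold = 1.0 / comb(n, 2)   # below this, delta * C(n, 2) is under one pair
    values = []
    while delta > threshold:
        delta /= 2.0
        values.append(delta)
    return values


# Hypothetical example: n = 10 nodes and target beta = 0.5.
schedule = delta_schedule(10, 0.5)
# Allowed connectivity window (beta - delta, beta + delta) * C(n, 2) per round.
window = [((0.5 - d) * comb(10, 2), (0.5 + d) * comb(10, 2)) for d in schedule]
```

Since $\delta$ is halved every round, the number of rounds is roughly $\log_2\big(\delta_0 \binom{n}{2}\big)$ for a starting value $\delta_0$, i.e. $O(\log n)$; for instance, the sketch performs 18 rounds for $n = 1000$ and $\beta = 0.3$.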

Algorithm 9: HMH($G$)
Construct the auxiliary graph $G' = (V', E')$
$E_\beta \leftarrow \emptyset$; $\delta \leftarrow \min\{\beta, 1 - \beta\}$
while $\delta > 1/\binom{n}{2}$
    $\delta \leftarrow \frac{1}{2}\delta$
    repeat
        Partition SCCs of $G'$ using the spectral method
        Add the edges into $E_\beta$
    until $E_\beta$ is a $(\beta - \delta)$-edge disruptor
    for $k = 1$ to $3$    /* Phase 1: Condensation */
        repeat
            Consider all type $k$ neighbors $E'_\beta$ such that $c(E'_\beta) \le c(E_\beta)$ and $P(G[E \setminus E'_\beta]) \le \beta\binom{n}{2}$
            Find among them the $E'_\beta$ with the smallest cut ratio
            $E_\beta \leftarrow E'_\beta$
        until no change in $E_\beta$
    for $k = 1$ to $4$    /* Phase 2: Exploration */
        repeat
            Consider all type $k$ neighbors $E'_\beta$ such that $(\beta - \delta)\binom{n}{2} \le P(G[E \setminus E'_\beta]) \le (\beta + \delta)\binom{n}{2}$
            Find among them the $E'_\beta$ with the smallest cut ratio
            $E_\beta \leftarrow E'_\beta$
        until no change in $E_\beta$
return the best solution so far

In the condensation phase, we move from the current solution $E_\beta$ to a smaller cost neighbor. Among the possible neighbors, we move to the one which results in the smallest cut ratio. In the exploration phase, we emphasize improving the cut ratio to find a potentially good partition of the network. Moving to neighbors with higher costs is possible during this phase as long as the pairwise connectivity differs by at most $\delta\binom{n}{2}$ from the target connectivity level $\beta\binom{n}{2}$.

4.3.2 Spectral Partition

The algorithm to find sparse cuts in [2] has a high time complexity $O(n^{9.5})$, as it requires solving a large semidefinite program. We replace that algorithm with a more efficient spectral partitioning method which ties directly to the cut ratio. Spectral algorithms often give high quality solutions and can be implemented efficiently by standard linear algebra packages [67, 74].

Let $A = \{c_{ij}\}$ be the cost matrix of $G = (V, E)$, where $c_{ij} = c(v_i, v_j)$ is the cost of edge $(v_i, v_j)$ and $c_{ij} = 0$ if $(v_i, v_j) \notin E$. The unnormalized graph Laplacian matrix [63] is defined as $L = D - A$, where $D$ is a diagonal matrix with the weighted degrees of the vertices on the diagonal. Since for every vector $x \in \mathbb{R}^n$ we have

$$x^T L x = \frac{1}{2} \sum_{i,j=1}^{n} c_{ij}(x_i - x_j)^2 \ge 0,$$

the matrix $L$ is symmetric and positive semi-definite. $L$ has $n$ non-negative, real-valued eigenvalues $0 = \lambda_1 \le \lambda_2 \le \ldots \le \lambda_n$. W.l.o.g., we assume that $G$ is connected. Then the second smallest eigenvalue of $L$, $\lambda_2$, is known as the algebraic connectivity of the graph and can be used to describe many properties of graphs [63]. We shall use the eigenvector corresponding to $\lambda_2$ to derive the bisection of vertices in $G$.

Recall that SPARSE CUT aims to find the min ratio cut

$$\min_{S \subset V} \frac{c(S, \bar S)}{|S|\,|\bar S|}.$$

Consider a vector $x \in \{0,1\}^n$ representing a set of vertices $S$, i.e., $x_i = 1$ if $v_i \in S$ and $x_i = 0$ otherwise. We rewrite the min ratio cut problem as

$$\min_{x \in \{0,1\}^n,\ x \ne \mathbf{0}, \mathbf{1}} \frac{\sum_{(v_i, v_j) \in E} c_{ij}(x_i - x_j)^2}{\sum_i \sum_j (x_i - x_j)^2}.$$

Since the problem is NP-hard, we relax the condition $x_i \in \{0,1\}$ to $x_i \in [0,1]$. Substitute $x$ with the vector $y = x - \frac{\|x\|_1}{n}\mathbf{1}$. After some algebra, we obtain an equivalent form of the relaxed problem:

$$\min_{y \ne \mathbf{0},\ y \perp \mathbf{1}} \frac{1}{n} \cdot \frac{y^T L y}{y^T y}.$$

By the Courant-Fischer theorem [63], the solution of the above minimization problem is exactly the eigenvector corresponding to the second smallest eigenvalue $\lambda_2$. So we can approximate the optimal solution of the min ratio cut problem with the second eigenvector of $L$ by transforming the real-valued $x$ into a zero-one vector. Specifically, we sort the values of $x_i$ to give a linear ordering of the vertices, then determine the splitting index $p$ that yields the best cut ratio. In addition, to handle directed edges we use one of the symmetrization methods such as $(A + A^T)/2$ or $AA^T$ [60] to transform the matrix $A$ into a symmetric matrix.
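The sort-and-sweep step described above can be sketched in plain Python. This is a simplified illustration rather than the implementation used in the experiments: it assumes an undirected (already symmetrized), connected graph with vertices labelled `0..n-1`, and it approximates the second eigenvector by power iteration on a shifted Laplacian instead of calling an eigensolver package. The function name `spectral_sweep_cut` and the example graph are hypothetical.

```python
from math import inf, sqrt


def spectral_sweep_cut(n, costs, iters=500):
    """Approximate a min ratio cut of an undirected, connected graph.

    n     -- number of vertices, labelled 0..n-1 (n >= 2)
    costs -- dict {(i, j): c_ij} with one entry per undirected edge
    Returns (S, ratio) with ratio = c(S, S_bar) / (|S| * |S_bar|).
    """
    adj = [[] for _ in range(n)]
    deg = [0.0] * n
    for (i, j), c in costs.items():
        adj[i].append((j, c))
        adj[j].append((i, c))
        deg[i] += c
        deg[j] += c

    # Power iteration on (shift*I - L), kept orthogonal to the all-ones
    # vector, converges to the eigenvector of the second smallest eigenvalue.
    shift = 2.0 * max(deg)              # upper bound on the eigenvalues of L
    x = [float(i) for i in range(n)]    # deterministic starting vector
    for _ in range(iters):
        mean = sum(x) / n
        x = [xi - mean for xi in x]
        y = [shift * x[i] - (deg[i] * x[i] - sum(c * x[j] for j, c in adj[i]))
             for i in range(n)]
        norm = sqrt(sum(v * v for v in y))
        x = [v / norm for v in y]

    # Sweep: sort vertices by eigenvector value and try every prefix as S.
    order = sorted(range(n), key=lambda i: x[i])
    in_s = [False] * n
    cut, best_ratio, best_size = 0.0, inf, 0
    for size, v in enumerate(order[:-1], start=1):
        for j, c in adj[v]:
            cut += -c if in_s[j] else c   # edge leaves or enters the cut
        in_s[v] = True
        ratio = cut / (size * (n - size))
        if ratio < best_ratio:
            best_ratio, best_size = ratio, size
    return set(order[:best_size]), best_ratio


# Hypothetical example: two unit-cost triangles joined by the bridge (2, 3).
costs = {(0, 1): 1, (0, 2): 1, (1, 2): 1, (2, 3): 1,
         (3, 4): 1, (3, 5): 1, (4, 5): 1}
side, ratio = spectral_sweep_cut(6, costs)
```

On this small example the sweep separates the two triangles by cutting only the bridge, giving a cut ratio of $1/(3 \cdot 3)$. For large sparse graphs a dedicated eigensolver would normally replace the power iteration.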

4.3.3 Variable Neighborhood Search

For the $\beta$-edge disruptor problem, multiple neighborhood structures are essential to obtain high quality solutions. They help in minimizing both the cut ratio and the total cut cost. It is essential that HMH allows both downhill moves that reduce the total cut cost and uphill moves that increase the cost of the solution but reduce the cut ratio. We consider four different neighborhood structures. From a solution or a partial solution $E_\beta \subseteq E$, the set of neighbors in each neighborhood structure can be obtained as follows.

Type 1: Merge two connected components in $G[E \setminus E_\beta]$, i.e., remove the edges between them from $E_\beta$.
Type 2: Move a vertex from one component to an adjacent component in $G[E \setminus E_\beta]$.
Type 3: Swap places of two adjacent vertices $(u,v)$ which belong to two different components.
Type 4: Partition a component in $G[E \setminus E_\beta]$ with the spectral partitioning method in subsection 4.3.2.

Time Complexity. Since the algorithm has at most $\log\binom{n}{2} = O(\log n)$ phases, and it spends at most $O(n^3)$ time to improve the solution within each phase, the HMH algorithm has a time complexity of $O(n^3 \log n)$. If we assume that the eigenvalues can be found within a constant number of iterations [56], HMH will have a time complexity of $O(n^2 \log n)$.

4.4 Experimental Studies

We illustrate through our experiments the need to assess network vulnerability under joint node and link attacks and the efficiency of our proposed algorithms. The experiments are performed on a set of four synthetic networks described in subsection 4.4.3 and three real communication networks, namely IP Backbone [1], CAIDA AS [58], and Oregon [58]. The network details are given in the subsequent subsections and the corresponding references.

4.4.1 Experiment Setups

All algorithms are implemented in C++ and compiled with the GCC 4.4 compiler on a 64-bit Linux machine with a quad-core AMD Opteron 2350 2.0 GHz processor and 32 GB memory. Only a single core is used during the experiments. The mathematical optimization package used to solve the MIP in Section 4.1.1 is GUROBI 4.5.

Figure 4-4. The normalized optimal costs of three different disruptor types on the US Backbone network. A) $\alpha = 0$ and $b = 0.25$. B) $\alpha = 0$ and $b = 2.25$. C) $\alpha = 0.25$ and $b = 2.25$. D) $\alpha = 1.25$ and $b = 1$.

Assigning costs for nodes and edges. Assigning meaningful costs for edges and vertices is a challenging task which usually depends on the availability of the data. For simplicity, we assume that all edges have uniform removal costs $c(e) = 1\ \forall e \in E$. Note that we can always multiply edge and vertex costs simultaneously by a constant, and all optimal disruptors stay optimal (with the costs multiplied by the same constant). We assign the cost of removing a vertex $u$ to be $c(u) = b + \alpha\, d(u)$, where $b$ and $\alpha$ are non-negative constants. In other words, attacking a node requires paying a base cost $b$ and an extra cost that is proportional to the degree centrality. Other centrality measurements, e.g., PageRank or betweenness centrality, can also be used in place of $d(u)$ to weight $u$'s importance.

Solving for the second eigenvector. The major portion of the running time of HMH (Algorithm 9) is spent on finding the second smallest eigenvector of the Laplacian matrix. The eigenvectors are found
using the Implicitly Restarted Arnoldi Method, implemented in ARPACK [56]. We use SuperLU [30] as the linear systems solver. We use the shift-and-invert spectral transformation to enhance the convergence rate. We select a scalar $\sigma = 0.01$, called the shift, and transform the original problem $Lx = \lambda x$ into the shift-and-invert problem $(L - \sigma I)^{-1} x = \nu x$, where $\nu = 1/(\lambda - \sigma)$. Note that setting $\sigma = 0$ will crash ARPACK since $L$ is non-invertible (its rows sum to zero); the 'eigs' function to find eigenvalues in MATLAB crashes for this reason. In addition, spectral partitioning is performed on the symmetrized matrix $A' + A'^T$, where $A'$ is the adjacency matrix of the auxiliary graph $G'$, constructed in Algorithm 8.

Figure 4-5. Costs of disruptor algorithms on the synthetic networks (the lower the better). A) Erdos-Renyi (random) network. B) Barabasi (power-law) network. C) Watts-Strogatz network. D) Forest Fire network.
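The relation between the original eigenproblem and its shift-and-invert form can be verified in one line. In the derivation below, $\sigma$ denotes the shift, $\lambda$ an eigenvalue of $L$ with eigenvector $x$, and $\nu$ the transformed eigenvalue; it assumes that $\sigma$ is not itself an eigenvalue of $L$, so that $L - \sigma I$ is invertible.

```latex
Lx = \lambda x
\;\Longrightarrow\; (L - \sigma I)\,x = (\lambda - \sigma)\,x
\;\Longrightarrow\; (L - \sigma I)^{-1} x = \frac{1}{\lambda - \sigma}\,x = \nu x .
```

The eigenvectors are unchanged, while eigenvalues of $L$ close to the shift acquire the largest $|\nu|$. With $\sigma = 0.01$, for example, $\lambda_1 = 0$ maps to $\nu = -100$, and the small eigenvalues of $L$, including $\lambda_2$, become the extreme eigenvalues of the transformed problem, which is the part of the spectrum that Arnoldi-type iterations typically resolve first.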

Figure 4-6. Running time of disruptor algorithms on the synthetic networks. A) Erdos-Renyi (random) network. B) Barabasi (power-law) network. C) Watts-Strogatz network. D) Forest Fire network.

4.4.2 Comparison of the three disruptor types

In this section, we experiment with different cost schemes for vertices and edges to highlight the connections among the three different disruptor types: edge, vertex, and general (vertex-edge). First, the cost of an optimal $\beta$-disruptor is never more than the costs of both the $\beta$-edge disruptor and the $\beta$-vertex disruptor. This suggests that while many networks are vulnerable to only node attacks (e.g., scale-free networks) or only link attacks, all networks exhibit a higher level of vulnerability to joint attacks on both nodes and links. Thus it is essential to assess the network for such grave attack schemes.

Second, when we apply the cost scheme $c(u) = b + \alpha\, d(u)$, the cost of a minimum $\beta$-disruptor can be strictly less than or equal to the minimum of those of the $\beta$-edge disruptor and the $\beta$-vertex disruptor, depending on the values of $b$ and $\alpha$. To distinguish between these two cases,
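we note that a node $u$ with $c(u) \ge \sum_{(u,v) \in E} c(u,v)$ will not be removed, since we can always remove all of its incident edges instead. Therefore, we obtain the following cases.

- $\alpha = 0$, $b \le 1$: $OPT = OPT^V \le OPT^E$, i.e., the optimal $\beta$-disruptor contains no edges.
- $\alpha = 0$, $b > 1$: the optimal solutions contain no $u$ with $d(u)$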

Watts-Strogatz: A random graph which exhibits the small-world phenomenon, following the model in [84] with lattice dimension 2 and rewiring probability 0.3.

Forest Fire: A random power-law graph following the Forest Fire model by Leskovec et al. [58] with forward and backward burning probabilities 0.3 and 0.9, respectively.

The costs of nodes and links follow the above linear scale $c(u) = b + \alpha\, d(u)$ with $\alpha = 0.25$ and $b = 0.25$. We compare in the experiments the following algorithms.

- HMH, the hybrid meta-heuristic presented in Section 4.3.
- JLNA, the bicriteria approximation algorithm in Section 4.2.
- Betweenness, a greedy algorithm that iteratively removes the edge with the highest betweenness centrality [41].
- Opt. $\beta$-dis., the optimal $\beta$-disruptor found with the MIP in subsection 4.1.1.

Solution Quality. We show in Fig. 4-5 the costs of the solutions produced by these algorithms. It is clear that Opt. $\beta$-dis. gives the smallest cost (optimal solutions). With only the cost of removing 10% of the links in the network, the method can disrupt 50% connectivity in the Erdos-Renyi network, 60% connectivity in the Barabasi and Watts-Strogatz networks, and up to 80% in the Forest Fire network. Although the method to find the optimal $\beta$-disruptor does not scale to large networks, it gives a valuable benchmark to gauge the quality of other algorithms. Second to Opt. $\beta$-dis. is HMH, which provides near-optimal costs in most cases. In comparison, while Betweenness and JLNA produce solutions with costs as high as 650% and 225% of the optimal solutions' costs, on average the costs of HMH's solutions are no more than 25% higher than the optimal costs.

As mentioned, the JLNA algorithm may separate more node pairs than necessary, thus resulting in higher cost solutions. In Fig. 4-5B, JLNA returns the same solution for $\beta$ between 0.8 and 0.5. That is, when $\beta = 0.8$, JLNA's solution disrupts 150% more node pairs than required. The same observation can be made for the case of the Watts-Strogatz network in Fig. 4-5C when $\beta$ is between 0.7 and 0.4. Thanks to the controlling of pairwise connectivity and the variable
neighborhood search method, HMH does not encounter this problem and is able to produce much smaller cost solutions.

Running Time. As shown in Fig. 4-6, the running time of Opt. $\beta$-dis. can range anywhere from a few seconds (in the case of the Forest Fire network) to several hours (in the cases of the Erdos-Renyi and Watts-Strogatz networks). Thus it is not scalable for large networks. All three other algorithms, Betweenness, JLNA, and HMH, take less than 1 second. The fastest among the three is Betweenness, and HMH runs slightly slower than JLNA as it needs to keep the pairwise connectivity not too far from the target level and perform extra local search steps.

Overall, HMH is the best choice among the studied algorithms since it produces solutions with near-optimal costs and its performance is stable across different network topologies. Moreover, its running time is far more reasonable than solving for the optimal solution (Opt. $\beta$-dis.). Despite being the fastest algorithm, the Betweenness method gives unsatisfactory solutions and should be avoided.

Figure 4-7. Oregon AS network

4.4.4 AS Relationships Networks

We further perform experiments, with the same settings as in the last subsection, on the following AS relationships datasets.

CAIDA AS: The CAIDA AS Relationships Dataset from Sep. 17, 2007 [58] with 8,020 nodes and 36,406 links.

Oregon AS: AS peering information inferred from Oregon route-views between March 31 and May 26, 2001 [58]. Only the largest connected component with 11,174 nodes and 23,410 links is considered.

We show the cost and running time of Betweenness, HMH, and JLNA in Figs. 4-7 and 4-8. Unfortunately, the Opt. $\beta$-dis. method cannot handle such large networks and is excluded from the drawings. Similar to the cases of the synthetic networks, the HMH algorithm continues to produce solutions with up to 100% smaller costs than JLNA's. In addition, HMH takes slightly longer to complete. In both HMH and JLNA, the major portion of time is spent on finding the second eigenvector; performing the local search and annealing procedure is relatively inexpensive. Overall, HMH dominates JLNA because of its much higher quality solutions (at the price of a small increase in the running time). The bottleneck in HMH is computing the eigenvectors, which can be done efficiently in a distributed manner. This is one of our future directions to make HMH scalable for much larger networks.

Figure 4-8. CAIDA AS network

CHAPTER5VULNERABILITYASSESSMENTINPROBABILISTICNETWORKSWeinvestigatethevulnerabilityofprobabilisticnetworksundermultipleattacks.Thatisweaimtoidentifythemostcriticalsubsetsofinfrastructurewhoseremovalmaximizethedisruptiveeffectonthenetworkintermofconnectivity.Findingsuchasubsetisextremelychallengingduetotheuncertaintyofthenetworktopologyandtheexponentiallylargenumberofattackschemes.Weshowthatndingexactsolutioniscomputationallyintractableandproposeanefcienttwo-stagestochasticprogrammingtoapproximatetheidenticationofthecriticalinfrastructure.Furthermore,weproposeanovelsamplingscheme,whichndsolutionswithguaranteedprobabilisticaccuracy.Finallywedemonstratetheeffectivenessandefciencyoftheproposedalgorithmsonrealandsyntheticdatasets.Disruptiveevents,rangingfromnaturaldisasterstomaliciousattacks,candrasticallycompromisethenetwork'sabilitytomeetitsquality-of-service(QoS)requirements,ifnotcausewidespreadserviceoutagesandpotentiallytotalnetworkbreakdown[ 48 64 66 71 ].Moreover,thereisasignicantconcernovercriticalinfrastructuresinelectricalpowergridsandhighwaysystemsastargetsforterroristattacks[ 70 ].Tomitigatetheriskanddevelopproactiveresponses,itisessentialtoassessnetworkvulnerabilitytoidentifythemostdestructiveattackscenarios.Althoughtherehasbeenasignicantamountofworkonassessingnetworkvulnerability,mostpreviousworksfocusmainlyonusingcentralitymeasurementse.g.degree,betweeness,andclosenesscentralities[ 7 8 ]toidentifycriticallinksornodes.Unfortunately,theseapproachesonlydeterminetherelativeimportanceofasmallnumberofnodesorlinksandcannotrevealtheenormousdamagepotentialcausedundersimultaneousattacks.Othersetofworksstudieslinksandnodesremovalproblemsthatoptimizeseveralglobalgraphmeasures,suchasclusteringcoefcient,networkdiameter,etc.However,thesemeasuresdonotcastwellforparticularkinds 96


of network vulnerability, when the network connectivity is of high priority. To this end, pairwise connectivity, the number of node pairs that remain connected, has recently been used as an effective measure to account for the effect of the attacks [12, 33, 34, 64, 66].

5.1 Probabilistic Networks

In this section, we first present the considered probabilistic network model, followed by the formal definition of the studied problems.

5.1.1 Probabilistic Network Model

The network with uncertain links is modeled using a tuple $G = (V, E, p)$, where the vertices in $V$ correspond to the set of nodes, the edges in $E$ correspond to the set of links in the network, and $p : E \to [0,1]$ maps each edge $(u,v) \in E$ to a real number $p_{uv} \in [0,1]$ that represents the availability of $(u,v)$, with $p_{uv} = 0$ for all $(u,v) \notin E$. Further, denote by $\xi$ the adjacency matrix of $G$, i.e., $\Pr[\xi_{uv} = 1] = p_{uv}$ and $\Pr[\xi_{uv} = 0] = 1 - p_{uv}$ for all pairs $(u,v)$. For clarity, we consider only undirected networks and assume that the existence of each edge is independent of the others, though our approaches also apply in principle to directed graphs or graphs with edge correlations as long as we can effectively generate samples of the probabilistic graph.

A sample graph (or a realization) $G_l = (V, E_l)$ of $G$ is generated by selecting each edge $e \in E$ with probability $p(e)$. The sample space $S_G$ consists of $N = 2^{|E|}$ possible samples $S_G = \{G_1 = (V, E_1), G_2 = (V, E_2), \ldots, G_N = (V, E_N)\}$ of $G$ that correspond to the $2^{|E|}$ possible subsets of $E$. The probability that $G_l$ is sampled from $G$ is given by

$$f_G(G_l) = \Pr[G = G_l] = \prod_{e \in E_l} p_e \prod_{e \in E \setminus E_l} (1 - p_e).$$

Moreover, the matrix $\{\xi^l_{uv}\}$ is used to denote the adjacency matrix of $G_l$.

5.1.2 Expected Pairwise Connectivity

As mentioned above, our measure for the disruptive effect is based on the value of expected pairwise connectivity (EPC), which is the number of (expected) connected pairs in the


residual network. For a deterministic graph $G_l$, the pairwise connectivity, denoted by $P(G_l)$, is the number of pairs $(u,v)$ with at least one path between $u$ and $v$. Naturally, the expected pairwise connectivity for the probabilistic graph $G$ is defined as

$$EPC(G) = E[P(G)] = \sum_{l=1}^{N} f_G(G_l)\, P(G_l).$$

Lemma 12. Given a probabilistic graph $G = (V, E, p)$, we have

$$EPC(G) = \frac{1}{2} \sum_{u,v \in V,\, u \neq v} REL_{u,v}(G),$$

where $REL_{u,v}(G)$ is the probability that $v$ is reachable from $u$ within $G$.

Partial order. Given two probabilistic graphs $G_A(V_A, E_A, p^A)$ and $G_B(V_B, E_B, p^B)$, we say $G_A$ is dominated by $G_B$, and write $G_A \preceq G_B$, iff $V_A \subseteq V_B$, $E_A \subseteq E_B$, and $p^A_e \le p^B_e$ $\forall e \in E_A$.

Lemma 13. Given two probabilistic graphs $G_A$ and $G_B$, if $G_A \preceq G_B$, then $EPC(G_A) \le EPC(G_B)$.

5.1.3 Vulnerability Assessment

We define the following problems based on their deterministic versions in [11, 34]. On one hand, the number of nodes/edges to remove is given, and we wish to maximize the expected disruptive effect.

k-Probabilistic Critical Nodes Problem (k-pCNP). Given a probabilistic network $G = (V, E, p)$ and an integer $0 \le k \le n$, find a subset $S \subseteq V$ of $k$ nodes whose removal minimizes the expected pairwise connectivity in the residual network.

5.2 Estimation of Connectivity in Probabilistic Networks

5.2.1 #P-Completeness

In this section, we first show that computing expected pairwise connectivity in a probabilistic network is #P-complete. A computation problem $f$ in #P is said to be #P-complete if every problem in #P is reducible to $f$. Here #P is the class of counting


versions of problems in NP, i.e., they are problems of the form "compute the number of solutions for a problem in NP". Showing that a computation problem is #P-complete makes a strong statement about its intractability: if such a problem were computable in polynomial time, then not only P = NP but also P = PH.

Theorem 5.1. Computing the expected pairwise connectivity $EPC(G)$, given a probabilistic graph $G$, is #P-complete.

Proof. We prove the theorem by a reduction from the counting problem of $s$-$t$ connectedness in an undirected graph [80]. The problem is to count the number of subgraphs of a graph $G$ in which there is a path from $s$ to $t$. The problem is equivalent to computing the probability that $s$ is connected to $t$ when each edge in $G$ is present independently with probability 1/2 and absent with probability 1/2. We reduce this problem to the expected pairwise connectivity computation problem as follows. We first construct four probabilistic graphs $G_0, G_1, G_2, G_3$, where
• $G_0 = G$ and $p(e) = 1/2$ for all $e \in E$.
• $G_1$ is obtained from $G_0$ by adding a new node $s'$ and an edge $(s, s')$ with $p(s, s') = 1$.
• $G_2$ is obtained from $G_0$ by adding a new node $t'$ and an edge $(t, t')$ with $p(t, t') = 1$.
• $G_3$ is obtained from $G_0$ by adding nodes $s'$ and $t'$, and edges $(s, s')$ and $(t, t')$ with probabilities $p(s, s') = p(t, t') = 1$.

Next we compute $P_0 = EPC(G_0)$, $P_1 = EPC(G_1)$, $P_2 = EPC(G_2)$, and $P_3 = EPC(G_3)$. Then, we can return $P_0 - P_1 - P_2 + P_3$ as the probability that $s$ is connected to $t$, and thus we solve the $s$-$t$ connectedness counting problem. In addition, there is an obvious reduction from the expected pairwise connectivity computation problem to the $s$-$t$ connectedness problem via the equality $EPC(G) = \frac{1}{2}\sum_{u \neq v} REL_{u,v}(G)$. It is shown in [80] that $s$-$t$ connectedness is #P-complete, and thus the expected pairwise connectivity computation problem is also #P-complete.
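The identity used by the reduction, $P_0 - P_1 - P_2 + P_3 = REL_{s,t}(G)$, can be sanity-checked by brute force on a tiny graph. The following Python sketch is only an illustration and not part of the dissertation: the function names and the edge-dictionary representation are assumptions, and the enumeration over all $2^{|E|}$ realizations is feasible only for a handful of edges.

```python
from collections import Counter
from itertools import product

def component_labels(n, edges):
    """Label nodes 0..n-1 of a deterministic graph by component root (union-find)."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for u, v in edges:
        parent[find(u)] = find(v)
    return [find(x) for x in range(n)]

def pairwise_connectivity(n, edges):
    """P(G_l): number of node pairs joined by at least one path."""
    sizes = Counter(component_labels(n, edges)).values()
    return sum(s * (s - 1) // 2 for s in sizes)

def expectation(n, prob_edges, value):
    """E[value(G_l)] by enumerating every realization (exponential; tiny graphs only)."""
    items = list(prob_edges.items())
    total = 0.0
    for mask in product((0, 1), repeat=len(items)):
        prob, present = 1.0, []
        for keep, (edge, p) in zip(mask, items):
            prob *= p if keep else 1.0 - p
            if keep:
                present.append(edge)
        total += prob * value(n, present)
    return total

def epc(n, prob_edges):
    return expectation(n, prob_edges, pairwise_connectivity)

def reliability(n, prob_edges, s, t):
    """REL_{s,t}(G): probability that s and t lie in the same component."""
    def joined(n_, edges):
        labels = component_labels(n_, edges)
        return 1.0 if labels[s] == labels[t] else 0.0
    return expectation(n, prob_edges, joined)

def st_reliability_via_epc(n, prob_edges, s, t):
    """P0 - P1 - P2 + P3, with the pendant nodes s' = n and t' = n (or n + 1)."""
    g1 = {**prob_edges, (s, n): 1.0}
    g2 = {**prob_edges, (t, n): 1.0}
    g3 = {**prob_edges, (s, n): 1.0, (t, n + 1): 1.0}
    return (epc(n, prob_edges) - epc(n + 1, g1)
            - epc(n + 1, g2) + epc(n + 2, g3))
```

For example, with `g = {(0, 1): 0.5, (1, 2): 0.5, (2, 3): 0.5, (0, 2): 0.5}`, the values `st_reliability_via_epc(4, g, 0, 3)` and `reliability(4, g, 0, 3)` should agree up to floating-point error.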


Finally, we prove that $P_0 - P_1 - P_2 + P_3 = REL_{s,t}(G)$, the probability that $s$ and $t$ are connected in $G$. By the construction of $G_1$, we have $P_1 = P_0 + \sum_{v \in V} REL_{s',v}(G_1) = P_0 + \sum_{v \in V} REL_{s,v}(G)$, with the convention $REL_{s,s}(G) = 1$. Similarly, $P_2 = P_0 + \sum_{v \in V} REL_{v,t}(G)$ and $P_3 = P_0 + \sum_{v \in V} REL_{s,v}(G) + \sum_{v \in V} REL_{v,t}(G) + REL_{s,t}(G)$, where the last term accounts for the pair $(s', t')$. It is straightforward to verify that $REL_{s,t}(G) = P_0 - P_1 - P_2 + P_3$.

We are interested in $(\epsilon, \delta)$-approximations for $EPC(G)$, i.e., algorithms returning an estimate of $EPC(G)$ accurate to within a relative error of $\epsilon$ with probability at least $1 - \delta$. Formally, we define $(\epsilon, \delta)$-approximations as follows.

Definition 7 ($(\epsilon, \delta)$-approximation). A function $\hat{F}(G)$ is an $(\epsilon, \delta)$-approximation for the expected pairwise connectivity $E[P(G)]$ if

$$\Pr\left[(1 - \epsilon)\, E[P(G)] \le \hat{F}(G) \le (1 + \epsilon)\, E[P(G)]\right] > 1 - \delta.$$

An $(\epsilon, \delta)$-approximation is called a fully polynomial randomized approximation scheme (FPRAS) if its running time is bounded by a polynomial in $1/\epsilon$, $\log(1/\delta)$, and the input size. An FPRAS is generally regarded as a robust notion of approximation algorithm for counting problems. Sinclair and Jerrum showed that every #P-complete problem either has an FPRAS, or is essentially impossible to approximate [75].

5.2.2 Monte-Carlo Methods to Approximate EPC

We present a simple Monte-Carlo algorithm to estimate the EPC in Algorithm 10. The algorithm draws $N_1(\epsilon, \delta)$ samples of $G$. Each sample is generated by including each edge $e \in E$ with probability $p_e$. The average pairwise connectivity in the $N_1(\epsilon, \delta)$ sample graphs is computed and returned as an unbiased estimator for $EPC(G)$.


Algorithm 10. $(\epsilon, \delta)$ Monte-Carlo Algorithm to compute $EPC(G)$
1. $C_1 \leftarrow 0$.
2. for $i = 1$ to $N_1(\epsilon, \delta)$ do
     Draw a sample graph $G_i$ of $G$.
     $C_1 = C_1 + P(G_i)$.
3. Return $E_1 = C_1 / N_1$ as an unbiased estimator of $EPC(G)$.

The number of necessary samples to be drawn, denoted by $N_1(\epsilon, \delta)$, is derived based on the following Generalized Zero-One Estimator Theorem introduced by Dagum et al. [29].

Theorem 5.2. (Generalized Zero-One Estimator [29]) Let $X_1, X_2, \ldots, X_N$ be independent identically distributed random variables taking values in $[0,1]$, with mean $\mu > 0$. If $0 < \epsilon < 1$ and $N \ge 4(e - 2) \ln(2/\delta) \cdot 1/(\epsilon^2 \mu)$, where $e \approx 2.718$ is Euler's number, then

$$\Pr\left[(1 - \epsilon)\mu \le \frac{1}{N} \sum_{i=1}^{N} X_i \le (1 + \epsilon)\mu\right] > 1 - \delta.$$

By applying Theorem 5.2 to the i.i.d. random variables $X_i = P(G_i) / \binom{n}{2}$ with mean $\mu = EPC(G) / \binom{n}{2}$, we obtain the following lemma.

Lemma 14. If $N_1(\epsilon, \delta) \ge 4(e - 2) \ln\frac{2}{\delta} \cdot \frac{1}{\epsilon^2 \mu}$, then $E_1$ is an $(\epsilon, \delta)$-approximation.

Time complexity. The time to draw a sample and compute the pairwise connectivity is $O(m + n)$. Since we often regard $\delta$ as a constant, Algorithm 10 has a time complexity of $O((m + n)\, n^2\, \epsilon^{-2}\, EPC(G)^{-1})$. If $EPC(G)$ is bounded below by $1/\mathrm{poly}(n, m)$, then Algorithm 10 is an FPRAS. The difficult case is the estimation of small values of $EPC(G)$, where the algorithm is no longer a polynomial-time algorithm. This motivates the construction of better estimation methods, presented next.
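Algorithm 10 and the sample bound of Theorem 5.2 can be sketched in Python as follows. This is a minimal illustration under stated assumptions rather than the implementation used in the experiments: the function names, the edge-dictionary input, and the explicit random generator are all hypothetical choices. Note that the bound needs the mean $\mu$, which depends on the unknown $EPC(G)$, so in practice a lower bound on $\mu$ would have to be supplied.

```python
import math
import random

def sample_count(eps, delta, mu):
    """Smallest integer N with N >= 4(e-2) ln(2/delta) / (eps^2 * mu), as in Theorem 5.2."""
    return math.ceil(4 * (math.e - 2) * math.log(2 / delta) / (eps * eps * mu))

def pairwise_connectivity(n, edges):
    """Number of connected node pairs in a deterministic graph on nodes 0..n-1."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    for u, v in edges:
        parent[find(u)] = find(v)
    sizes = {}
    for x in range(n):
        root = find(x)
        sizes[root] = sizes.get(root, 0) + 1
    return sum(s * (s - 1) // 2 for s in sizes.values())

def monte_carlo_epc(n, prob_edges, num_samples, rng):
    """Average pairwise connectivity over independently drawn realizations."""
    total = 0
    for _ in range(num_samples):
        # Keep each edge independently with its own probability.
        realization = [e for e, p in prob_edges.items() if rng.random() < p]
        total += pairwise_connectivity(n, realization)
    return total / num_samples
```

On a triangle whose three edges each have probability 1/2, every pair is connected with probability $1 - (1 - \frac{1}{2})(1 - \frac{1}{4}) = 0.625$, so the estimate from `monte_carlo_epc(3, {(0, 1): 0.5, (1, 2): 0.5, (0, 2): 0.5}, 20000, random.Random(1))` should be close to $3 \times 0.625 = 1.875$.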


5.2.3 Fully Polynomial Time Approximation Scheme

5.2.3.1 Component Sampling Algorithm

We present an importance sampling method to estimate $EPC(G)$ in Algorithm 11. Instead of generating the whole sample graph as in Algorithm 10, we select a node $u \in V$ uniformly and perform a Breadth-First Search procedure from $u$, until reaching all nodes in the connected component that contains $u$. The algorithm then computes the average of the size of the component that contains $u$, less one, and multiplies the result by $n/2$ to obtain an unbiased estimator $E_2$.

Algorithm 11. $(\epsilon, \delta)$ Component Sampling Algorithm to compute $EPC(G)$
1. Let $P_E = \sum_{e \in E} p_e$
2. if $P_E < 1/m$ then
3.   return $E_2 = P_E$.
4. $C_2 \leftarrow 0$.
5. for $i = 1$ to $N_2(\epsilon, \delta)$ do
     Select a node $u \in V$ uniformly.
     Simulate a Breadth-First Search from $u$ in $G$. Let $S_i$ be the number of visited nodes.
     $C_2 = C_2 + (S_i - 1)$.
6. Return $E_2 = \frac{n\, C_2}{2 N_2}$ as an unbiased estimator of $EPC(G)$.

Theorem 5.3. For $N_2(\epsilon, \delta) = 4(e - 2) \ln\frac{2}{\delta} \cdot \frac{n(n-1)}{\epsilon^2\, EPC(G)}$, $E_2$ is an $(\epsilon, \delta)$-approximation for $EPC(G)$.

Proof. In the main loop of Algorithm 11, we can compute $S_i$ with the following equivalent steps: 1) draw a sample graph $G_i$; and 2) select a node $u$ in $G_i$ uniformly and compute $S_i$ as the size of the connected component that contains $u$. Assume that there are $k$ connected components with sizes $s_1, s_2, \ldots, s_k$ in $G_i$. Then

$$E[S_i - 1 \mid G = G_i] = \frac{\sum_{j=1}^{k} s_j (s_j - 1)}{\sum_{j=1}^{k} s_j} = \frac{2 P(G_i)}{n}.$$

Hence $E[S_i - 1] = 2\, EPC(G) / n$.
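The main loop of Algorithm 11 can be sketched as below. The sketch is an assumption-laden illustration, not the dissertation's code: the names and the adjacency-list format are hypothetical, and the early return on a small $P_E$ (steps 1 to 3) is omitted. It relies on the observation that a Breadth-First Search only ever tests an edge from a visited endpoint to an unvisited one, so each edge is flipped at most once and can be sampled lazily.

```python
import random
from collections import deque

def build_adjacency(n, prob_edges):
    """Adjacency lists of (neighbor, probability) pairs for an undirected graph."""
    adj = [[] for _ in range(n)]
    for (u, v), p in prob_edges.items():
        adj[u].append((v, p))
        adj[v].append((u, p))
    return adj

def component_sampling_epc(n, prob_edges, num_samples, rng):
    """Estimate EPC(G) as n * C2 / (2 * N2), sampling edges only when the BFS reaches them."""
    adj = build_adjacency(n, prob_edges)
    total = 0
    for _ in range(num_samples):
        root = rng.randrange(n)          # uniformly chosen start node
        visited = {root}
        queue = deque([root])
        while queue:
            t = queue.popleft()
            for w, p in adj[t]:
                # The edge (t, w) is tested only while w is unvisited; once both
                # endpoints are visited it is never looked at again.
                if w not in visited and rng.random() < p:
                    visited.add(w)
                    queue.append(w)
        total += len(visited) - 1        # S_i - 1
    return n * total / (2 * num_samples)
```

Compared with drawing a full realization, each iteration touches only the edges incident to the sampled component, which is where the savings come from when components are small.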


By applying Theorem 5.2 to the i.i.d. variables $Y_i = (S_i - 1)/(n - 1)$ with mean $\mu = EPC(G) / \binom{n}{2}$, it follows that $E_2$ is an $(\epsilon, \delta)$-approximation of $EPC(G)$.

Lemma 15. In any undirected probabilistic graph $G = (V, E, p)$, we have

$$\sum_{e \in E} p_e \le EPC(G) \le \left(1 + \frac{1}{m} \sum_{e \in E} p_e\right)^m.$$

Proof. We prove the lower and upper bounds separately.

Lower bound: By Lemma 12, we have

$$EPC(G) = \frac{1}{2} \sum_{u,v \in V,\, u \neq v} REL_{uv}(G) \ge \sum_{(u,v) \in E} REL_{uv}(G) \ge \sum_{(u,v) \in E} p_{uv}.$$

Upper bound: First, we show that $EPC(G) \le \prod_{e \in E} (1 + p_e)$. Then we can apply the inequality of arithmetic and geometric means [21] for the positive numbers $(1 + p_e)$, $\forall e \in E$, to obtain

$$EPC(G) \le \prod_{e \in E} (1 + p_e) \le \left(1 + \frac{1}{m} \sum_{e \in E} p_e\right)^m.$$

We prove $EPC(G) \le \prod_{e \in E} (1 + p_e)$ by induction on $\hat{E}$, the number of undetermined edges (those with probabilities strictly less than one).

Basis: If $\hat{E} = 0$, we have a deterministic graph with $m = |E|$ edges, for which $\prod_{e \in E}(1 + p_e) = 2^m$. Since the size of the largest component cannot exceed $m + 1$, the pairwise connectivity is at most $\frac{1}{2} m(m + 1) \le 2^m$ $\forall m \ge 0$. Thus, the inequality holds for $\hat{E} = 0$.

Induction step: Assume that the inequality holds for $\hat{E} = t \ge 0$; we show that the inequality also holds when $\hat{E} = t + 1$. Assume that $\hat{E} = t + 1$, select an arbitrary undetermined edge $(u,v) \in E$, and perform branching on $(u,v)$ as shown in Eq. 5. We have

$$EPC(G) = p_{uv}\, EPC(G^+) + (1 - p_{uv})\, EPC(G^-),$$


where $G^+$ is obtained from $G$ by setting the probability of $(u,v)$ to one and $G^-$ is obtained from $G$ by removing $(u,v)$. Since both $G^+$ and $G^-$ have exactly $t$ undetermined edges, we can apply the induction hypothesis to obtain

$$EPC(G) \le p_{uv} (1 + 1) \prod_{e \neq (u,v)} (1 + p_e) + (1 - p_{uv}) \prod_{e \neq (u,v)} (1 + p_e) = (1 + p_{uv}) \prod_{e \neq (u,v)} (1 + p_e) = \prod_{e \in E} (1 + p_e).$$

Thus, the inequality holds for all $\hat{E} \ge 0$.

The bounds in Lemma 15 are asymptotically tight in the sense that there are arbitrarily large graphs in which the bounds differ from the actual value of $EPC(G)$ by only a factor of two. For example, consider $G$ as a star graph of size $n$ that consists of one center vertex and $n - 1$ leaves. All $n - 1$ edges are assigned the same probability $1/(n - 1)$. One can verify that the lower bound, $EPC(G)$, and the upper bound are $1$, $\frac{3}{2} - \frac{1}{2(n-1)}$, and $\left(1 + \frac{1}{n-1}\right)^{n-1}$ … 0.
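The three quantities in the star-graph example follow from closed forms, which the short Python sketch below evaluates. It is an illustrative check only; the function name is an assumption. In the star, a center-leaf pair is connected with probability $p = 1/(n-1)$, and a leaf-leaf pair is connected only through the center, with probability $p^2$.

```python
def star_bounds(n):
    """Return (lower bound, EPC, upper bound) of Lemma 15 for a star on n >= 2 nodes
    whose n - 1 edges each have probability 1 / (n - 1)."""
    m = n - 1
    p = 1.0 / m
    lower = m * p                                  # sum of edge probabilities
    epc = m * p + (m * (m - 1) / 2) * p * p        # center-leaf pairs + leaf-leaf pairs
    upper = (1 + lower / m) ** m                   # (1 + (1/m) * sum p_e)^m
    return lower, epc, upper
```

For $n = 5$ this gives $(1.0, 1.375, 2.44140625)$, matching $\frac{3}{2} - \frac{1}{2(n-1)} = 1.375$, and the upper bound stays below Euler's number $e$ for every $n$.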


Case $P_E \le n^{-2-\epsilon}$: Let $P_l$, $l = 0..m$, be the probability that the graph has exactly $l$ edge(s). We have $\sum_{l=0}^{m} P_l = 1$. In addition, let $P_{2+} = \sum_{l=2}^{m} P_l$. We have

$$P_0 = \prod_{e \in E} (1 - p_e) \ge 1 - P_E$$
$$P_1 = \sum_{e \in E} p_e \prod_{e' \neq e} (1 - p_{e'}) = \left(\sum_{e \in E} \frac{p_e}{1 - p_e}\right) P_0$$
$$P_{2+} = 1 - P_0 - P_1 = 1 - P_0 \left(1 + \sum_{e \in E} \frac{p_e}{1 - p_e}\right) \le 1 - (1 - P_E)(1 + P_E) = P_E^2$$

We have

$$P_E \le EPC(G) \le \binom{1}{2} P_0 + \binom{2}{2} P_1 + \binom{n}{2} P_{2+} \le P_0 \frac{P_E}{1 - P_E} + n^2 P_E^2 \le \frac{e^{-P_E}}{1 - P_E} P_E + o(1) P_E = (1 + o(1)) P_E$$

Therefore, $P_E$ is an $(\epsilon, \delta)$-approximation for $EPC(G)$.

Case $P_E \ge n^{-2-\epsilon}$: From Theorem 5.3, the component sampling procedure can give an $(\epsilon, \delta)$-approximation within polynomial time.

5.3 Vulnerability Assessment using EPC

In this section, we formulate the vulnerability assessment problems as a mathematical programming problem and devise two approaches to overcome the difficulty of having an exponential number of constraints in the mathematical formulation.

Linear programming for deterministic networks. Given a realization $G_l$ of $G$, the k-CNP problem in $G_l$ can be formulated as an integer linear program (ILP),


following[ 12 ].minXi

enhancements from [33].

$$\min_{s \in \{0,1\}^n} E[P(s, x, \xi)] \qquad (5)$$
$$\text{s.t.} \quad \sum_{i=1}^{n} s_i \le k \qquad (5)$$

where $P(s, x, \xi) = \min \sum_{i \dots}$

denoted by $MIP_F$: $\min \sum_{l=1}^{N} f_G(G_l) \sum_{i \dots}$

Our first approach is to regard the weighted matrix as a (binary) adjacency matrix of a deterministic graph. Thus we obtain the following MIP, denoted by $MIP_E$. $\min \sum_{i \dots}$

The rest is to show that $(\tilde{s}, \tilde{x})$ is a feasible solution of $MIP_E$. Clearly, $\tilde{s}$ satisfies (5) and the integrality constraints. Also, since $\tilde{x}$ is a convex combination of $\hat{x}^l$, $l = 1..N$, with the masses $f_G(G_l)$, $\tilde{x}$ satisfies the constraints (5), (5), (5), and (5), as they can be inferred from the same convex combination of the constraints (5) to (5).

One can solve $MIP_E$ optimally using the branch-and-cut method in [33] to obtain 1) a set of $k$ critical nodes and 2) a lower bound on the minimum expected pairwise connectivity after removing $k$ nodes. We note that the non-integrality of $x_{ij}$ is essential for $MIP_E$. When $x_{ij}$ is restricted to $\{0,1\}$, e.g. in (5), the constraint (5) is essentially $x_{ij} \le s_i + s_j$ and $MIP_E$ becomes equivalent to IP (5)-(5). That is, the information encoded in the edge probabilities is disregarded and only the network topology is used in the formulation. In addition, since the convex combination of $\hat{x}^l$ is a fractional vector, we will not be able to derive the lower bound given in Lemma 16.

In large networks, where the branch-and-cut algorithm starts to show its exponential running time, the following randomized rounding algorithm can be used to obtain a set of $k$ critical nodes. The rounding procedure is described in Algorithm 12. The algorithm repeatedly solves an LP relaxation of $MIP_E$ and rounds the maximum $s_i$ up to one, provided that $s_i$ has not been rounded before. After $k$ steps, the $k$ nodes that have $s_i = 1$ are returned as the set of critical nodes. Since the LP relaxation has at most $O(mn)$ constraints, solving the LP relaxation takes $O(m^3 n^3)$ time [33] in the worst case. Thus the total time complexity is at most $O(k m^3 n^3)$.


Algorithm 12. Rounding on the Expectation Graph Algorithm (REGA)
1. Obtain an LP relaxation of $MIP_E$ with the relaxed constraints $s \in [0,1]^n$.
2. Initialize the set of selected nodes $D = \emptyset$.
3. Repeat $k$ times the following steps
     Solve the LP relaxation
     Select $u = \arg\max_{i \in V \setminus D} s_i$.
     Add $u$ to $D$ and fix $s_u = 1$
4. Return the $k$ critical nodes in $D$.

5.3.2 Sample Average Approximation (SAA) Method

Our second approach to reduce the number of realizations is to apply the Sample Average Approximation (SAA) method. We independently generate $T$ samples $\xi^1, \xi^2, \ldots, \xi^T$ using Monte Carlo simulation (i.e., generating each edge $(u,v) \in E$ with probability $p_{uv}$). The expectation objective $q(s) = E[P(s, x, \xi)]$ is then approximated by the sample average $\hat{q}_T(x) = \frac{1}{T} \sum_{l=1}^{T} \sum_{i \dots}$

respectively. For any $\epsilon > 0$, it can be derived from Proposition 2.2 in [54] that

$$\Pr\left[E[P(s, \hat{x}, \xi)] - E[P(s, x, \xi)] > \epsilon\right] \le \exp\left(-\frac{T \epsilon^2}{n^4} + \log\binom{n}{k}\right) \qquad (5)$$

Equivalently, if $T \ge \frac{n^4}{\epsilon^2}\left(\log\binom{n}{k} - \log \delta\right)$, then $\Pr\left[E[P(s, \hat{x}, \xi)] - E[P(s, x, \xi)] < \epsilon\right] > 1 - \delta$ for any $\delta \in (0,1)$. Although this estimate of $T$ may be too conservative for practical purposes, it is expected that the optimal value and optimal solutions of the SAA problem converge to their counterparts even with a reasonably small value of $T$.

The SAA method is summarized in Algorithm 13. The algorithm consists of two phases. In the first phase, the delayed constraints technique is used to incrementally construct and solve an LP relaxation of the SAA. In the second phase, the same iterative rounding procedure as in Algorithm 12 is applied to find $k$ critical nodes by rounding up the fractional solution.


Algorithm 13. Sample Ave. Approx. Algorithm (SA3)
Parameter $T$: the number of samples
Phase 1: Delayed Constraints
1. Initialize an LP with the objective $\frac{1}{T} \sum_{l=1}^{T} \sum_{i \dots}$

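Algorithm 14. Iterative Greedy Algorithm (IGA)
1: $D \leftarrow \emptyset$
2: for $i = 1..k$ do
3:   $u = \arg\min_{v \in V \setminus D} EPC(G[V \setminus (D \cup \{v\})])$
4:   $D \leftarrow D + \{u\}$
5: while $\exists (u,v) \in D \times (V \setminus D)$
6:   & (swapping $u$, $v$ decreases the objective) do
7:   $D \leftarrow D + \{v\} - \{u\}$
8: Output $D$.

Proof. This is derived directly from the definition of $EPC(G)$. Define $conn_{uv}(G_l) = 1$ if there is a path between $u$ and $v$ in a sample graph $G_l$, and $conn_{uv}(G_l) = 0$ otherwise. We have

$$EPC(G) = \sum_{l=1}^{N} f_G(G_l) P(G_l) = \sum_{l=1}^{N} f_G(G_l)\, \frac{1}{2} \sum_{u \neq v} conn_{uv}(G_l) = \frac{1}{2} \sum_{u \neq v} \sum_{l=1}^{N} f_G(G_l)\, conn_{uv}(G_l) = \frac{1}{2} \sum_{u \neq v} REL_{u,v}(G)$$

Proof of Lemma 13

Proof. An edge $(u,v) \in E_B$ is said to be undetermined, if $0 \dots$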

where $G_A^+$ is obtained from $G_A$ by assigning $p_{uv} = 1$, and $G_A^-$ is obtained from $G_A$ by removing the edge $(u,v)$. Similarly, we have

$$EPC(G_B) = p^B_{uv}\, EPC(G_B^+) + (1 - p^B_{uv})\, EPC(G_B^-).$$

Since $G_A \preceq G_B$, it can be verified that $G_A^+ \preceq G_B^+$ and $G_A^- \preceq G_B^-$. Note that the pairs $(G_A^+, G_B^+)$ and $(G_A^-, G_B^-)$ have at most $t$ undetermined edges. By the induction hypothesis, we have

$$EPC(G_B) \ge p^B_{uv}\, EPC(G_A^+) + (1 - p^B_{uv})\, EPC(G_A^-) \ge p^A_{uv}\, EPC(G_A^+) + (1 - p^A_{uv})\, EPC(G_A^-) = EPC(G_A).$$

The last inequality holds because $p^A_{uv} \le p^B_{uv}$ and $EPC(G_A^+) \ge EPC(G_A^-)$, which can be shown based on the fact that each sample of the graph $G_A^+$ can be generated by first generating a sample of $G_A^-$ and then adding $(u,v)$ to the sample. Adding an edge to a (deterministic) graph cannot decrease the pairwise connectivity. Thus, the lemma holds for any number of undetermined edges.


CHAPTER 6
CASCADING-FAILURES IN NETWORKS

In this chapter, we formulate the measurement of vulnerability in the presence of cascading failures as an optimization problem: the Cost-effective, massive and outbreak problem (CFM). In Section 6.1, we analyze the propagation process on power-law networks to give a lower bound on the seeding size. We present VirAds, a scalable algorithm to find a minimal seeding for the CFM problem, in Section 6.2. The hardness of finding a cost-effective seeding is addressed in Section 6.3. Finally, we perform extensive experiments on large social networks such as Facebook and Orkut to confirm the efficiency of our proposed algorithm and analyze the results to give new observations on the information diffusion process in networks.

6.1 Seeding Cost of Massive Outbreak

In this section, we exploit the power-law topology found in most complex networks [13, 14, 26] to demonstrate that when the number of propagation hops is limited, a large number of seeding nodes is needed to spread the influence throughout the network. The size of the seeding is proved to be a constant fraction of the number of vertices $n$, which is prohibitive for large social networks of millions of nodes. We first summarize the well-known power-law model in [4]; then we use the model to prove the prohibitive seeding cost for the CFM problem.

6.1.1 Power-law Network Model

Many complex systems of interest, including OSNs, are found to have degree distributions that approximately follow power laws [13, 14, 26]. That is, the fraction of nodes in the network having $k$ connections to other nodes is proportional to $k^{-\beta}$, where $\beta$ is a parameter whose value is typically in the range $2 < \beta < 3$. Such networks have been used in studying different aspects of scale-free networks [4, 6, 40, 43]. We follow the $P(\alpha, \beta)$ power-law model in [4], in which the number of vertices of degree $k$


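is $\lfloor e^{\alpha} / k^{\beta} \rfloor$, where $e^{\alpha}$ is the normalization factor. For convenience, we shall refer to such a network as a $P(\alpha, \beta)$ network. We can deduce that the maximum degree in a $P(\alpha, \beta)$ network is $e^{\alpha/\beta}$ (since for $k > e^{\alpha/\beta}$, the number of vertices of degree $k$ will be less than 1). The numbers of vertices and edges are

$$n = \sum_{k=1}^{e^{\alpha/\beta}} \frac{e^{\alpha}}{k^{\beta}} \approx \begin{cases} \zeta(\beta)\, e^{\alpha} & \text{if } \beta > 1 \\ \alpha\, e^{\alpha} & \text{if } \beta = 1 \\ \frac{e^{\alpha/\beta}}{1 - \beta} & \text{if } \beta < 1 \end{cases}, \qquad m = \frac{1}{2} \sum_{k=1}^{e^{\alpha/\beta}} k\, \frac{e^{\alpha}}{k^{\beta}} \approx \begin{cases} \frac{1}{2} \zeta(\beta - 1)\, e^{\alpha} & \text{if } \beta > 2 \\ \frac{1}{4} \alpha\, e^{\alpha} & \text{if } \beta = 2 \\ \frac{1}{2} \frac{e^{2\alpha/\beta}}{2 - \beta} & \text{if } \beta < 2 \end{cases} \qquad (6)$$

where $\zeta(\beta) = \sum_{i=1}^{\infty} \frac{1}{i^{\beta}}$ is the Riemann Zeta function [4], which converges for $\beta > 1$ and diverges for all $\beta \le 1$. Without affecting the conclusion, we will simply use real numbers instead of rounding down to integers. The error terms are sufficiently small and can be bounded in our proofs.

While the scale of the network depends on $\alpha$, the parameter $\beta$ decides the connection pattern and many other important characteristics of the network. For instance, the larger $\beta$ is, the sparser and the more power-law the network is. Hence, the parameter $\beta$ is often regarded as the characteristic constant for scale-free networks.

6.1.2 Prohibitive Seeding Costs

We prove that the seeding must contain at least $\Omega(n)$ vertices if the propagation is locally bounded. The result is stated in the following theorem.

Theorem 6.1. Given a power-law network $G \in P(\alpha, \beta)$ with $\beta > 2$ and a constant $0 < \rho < 1$, any d-seeding is of size at least $\Omega(n)$.

Proof. The proof consists of two parts. In the first part, we show that the volume, i.e., the total degree of the vertices, of any d-seeding must be $\Omega(m)$. In the second part, we prove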


Figure 6-1. The influence propagation in the network.

that any subset of vertices $S \subseteq V$ with volume $vol(S) = \Omega(m)$ in a power-law network with power-law exponent $\beta > 2$ will imply that $|S| = \Omega(n)$. Thus, the theorem follows.

In the first part, we consider two separate cases.

Case $\rho > \frac{1}{2}$: Let $S = R_0$ be the optimal solution for the CFM problem on $G = (V, E)$, and let $S = R_0, R_1, R_2, \ldots, R_d$ be the sets of vertices that become active at rounds $0, 1, 2, \ldots, d$, respectively (see Fig. 6-4). Notice that $\{R_i\}_{i=0}^{d}$ form a partition of $V$. Moreover, for each $1 \le t \le d$ the following inequality holds.

$$\left|\phi\left(R_t, \bigcup_{i=0}^{t-1} R_i\right)\right| \ge \frac{\rho}{1 - \rho} \left( \left|\phi\left(R_t, \bigcup_{j=t+1}^{d} R_j\right)\right| + 2\, |\phi(R_t, R_t)| \right) \qquad (6)$$

where $\phi(A, B)$ denotes the set of edges connecting one vertex in $A$ to one vertex in $B$. The inequality means that at least a fraction $\frac{\rho}{1 - \rho}$ among the edges incident with the vertices activated in round $t$ must be incident with active vertices in the previous rounds. Summing up all inequalities in (6) for $t = 1..d$, we have

$$\sum_{t=1}^{d} \left|\phi\left(R_t, \bigcup_{i=0}^{t-1} R_i\right)\right| \ge \frac{\rho}{1 - \rho} \sum_{t=1}^{d} \left( \left|\phi\left(R_t, \bigcup_{j=t+1}^{d} R_j\right)\right| + 2\, |\phi(R_t, R_t)| \right)$$


Eliminating the common factors on both sides, and using $\frac{\rho}{1 - \rho} > 1$ for $\rho > 1/2$, we have

$$\sum_{i=0}^{d-1} \left|\phi\left(R_i, \bigcup_{t=i+1}^{d} R_t\right)\right| \ge \frac{\rho}{1 - \rho} \sum_{j=1}^{d-1} \left|\phi\left(R_j, \bigcup_{t=j+1}^{d} R_t\right)\right| + 2 \sum_{t=1}^{d} |\phi(R_t, R_t)|$$

After some algebra, we obtain

$$vol(R_0) \ge \left|\phi\left(R_0, \bigcup_{t=1}^{d} R_t\right)\right| \ge \frac{2\rho - 1}{1 - \rho} \sum_{j=1}^{d-1} \left|\phi\left(R_j, \bigcup_{t=j+1}^{d} R_t\right)\right| + 2 \sum_{t=1}^{d} |\phi(R_t, R_t)|,$$

which is equivalent to

$$\frac{\rho}{1 - \rho} |\phi(R_0, V)| - |\phi(R_0, R_0)| \ge \frac{2\rho - 1}{1 - \rho} |E| + \frac{3 - 4\rho}{1 - \rho} \sum_{t=1}^{d} |\phi(R_t, R_t)| \qquad (6)$$

Hence, when $\rho > 1/2$, $vol(R_0) \ge \frac{2\rho - 1}{1 - \rho} |E| = \Omega(m)$ for any d-seeding $R_0$.

Case $\rho \le \frac{1}{2}$: We say that an edge is active if it is incident to at least one active vertex. At round $t = 0$, there are at most $vol(R_0)$ active edges, those that are incident to $R_0$. Eq. 6 implies that the number of active edges in each round increases by a factor of at most $\rho^{-1}$. After $d$ rounds, the number of active edges will be bounded by $vol(R_0)\, \rho^{-d}$. Since all edges are active at the end, we have the inequality:

$$vol(R_0)\, \rho^{-d} \ge |E|.$$

In the second part of the proof, we show that if a subset $S \subseteq V$ has $vol(S) = \Omega(m)$, then $|S| = \Omega(n)$ whenever the power-law exponent $\beta > 2$. Assume that $vol(S) \ge cm$, for some positive constant $c$. The size of $S$ is minimum when $S$ contains only the highest-degree vertices of $V$. Let $k_0$ be the minimum degree of the vertices in $S$ in that extreme case; by Eq.


6 we have

$$cm = \frac{c}{2} \sum_{k=1}^{e^{\alpha/\beta}} k\, \frac{e^{\alpha}}{k^{\beta}} \le vol(S) \le \frac{1}{2} \sum_{k=k_0}^{e^{\alpha/\beta}} k\, \frac{e^{\alpha}}{k^{\beta}}$$

Simplifying both sides, we have

$$\sum_{k=1}^{k_0 - 1} \frac{1}{k^{\beta - 1}} \le (1 - c) \sum_{k=1}^{e^{\alpha/\beta}} \frac{1}{k^{\beta - 1}} \approx (1 - c)\, \zeta(\beta - 1)$$

Since the zeta function $\zeta(\beta - 1)$ converges for $\beta > 2$, there exists a constant $k_{\beta,c}$ that depends only on $\beta$ and $c$ and satisfies

$$\sum_{k=1}^{k_{\beta,c}} \frac{1}{k^{\beta - 1}} > (1 - c)\, \zeta(\beta - 1)$$

Obviously, we have $k_0 \le k_{\beta,c}$. Thus, the number of vertices that are in $S$ is at least

$$\sum_{k=k_{\beta,c}}^{e^{\alpha/\beta}} \frac{e^{\alpha}}{k^{\beta}} \approx \left(1 - \frac{1}{\zeta(\beta)} \sum_{k=1}^{k_{\beta,c}} \frac{1}{k^{\beta}}\right) n = \Omega(n)$$

The last step holds because the sum $\sum_{k=1}^{k_{\beta,c}} \frac{1}{k^{\beta}}$ is a constant strictly smaller than $\zeta(\beta)$, since $k_{\beta,c}$ is a constant.

In both cases $\rho > 1/2$ and $\rho \le 1/2$, the size of a d-seeding set is at least $\Omega(n)$. However, we can see a clear difference in the propagation speed with respect to $d$ between the two cases. When $\rho < 1/2$, the number of active edges can increase exponentially (but is still bounded if $d$ is a constant), and it is likely that the number of active vertices also increases exponentially. In contrast, when $\rho > 1/2$, an explosion in the number of active edges (and hence active vertices) is impossible, as the volume of the d-seeding is tied to the number of edges $m$ by a fixed constant $\frac{2\rho - 1}{1 - \rho}$, regardless of the value of $d$.

6.2 Algorithm to Identify the Minimum Outbreak Seeding

In order to understand the influence propagation when the number of propagation hops is bounded, we propose VirAds, an efficient algorithm for the CFM problem. With


the huge magnitude of OSN users and data available on OSNs, scalability becomes the major problem in designing algorithms for CFM. VirAds scales to networks with hundreds of millions of links and provides high-quality solutions in our experiments.

Before presenting VirAds, we consider a natural greedy algorithm for the CFM problem, in which the vertex that can activate the largest number of inactive vertices within $d$ hops is selected in each step. This greedy algorithm is unlikely to perform well in practice for the following two reasons. First, at early steps, when not many vertices have been selected, every vertex is likely to activate only itself after being chosen as a seed. Thus, the algorithm cannot distinguish between good and bad seeds. Second, the algorithm suffers from serious scalability problems. To select a vertex, the algorithm has to evaluate, for each vertex $v$, how many vertices will be activated after adding $v$ to the seeding, e.g., by invoking an $O(m + n)$ Breadth-First Search procedure rooted at $v$. In the worst case, when $O(n)$ vertices need to be evaluated, this alone can take $O(n(m + n))$ time. Moreover, as shown in the previous section, the seeding size can easily be $\Omega(n)$; thus, the worst-case running time of the naive greedy algorithm is $O(n^2 (m + n))$, which is prohibitive for large-scale networks.

As shown in Algorithm 15, our VirAds algorithm overcomes the mentioned problems of the naive greedy algorithm by favoring the vertex that can activate the largest number of edges (indeed, it also considers the number of active neighbors around each vertex). This avoids the first problem of the naive greedy algorithm. At early steps, the algorithm behaves similarly to the degree-based heuristic that favors vertices with high degree. However, once a certain number of vertices have been selected, VirAds makes its selection based on the information within the $d$-hop neighborhood of the considered vertices, rather than only the one-hop neighborhood as in the degree-based heuristic.

The scalability problem is tackled in VirAds by efficiently keeping track of the following measures for each vertex $v$.
• $r_v$: the round in which $v$ is activated


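Algorithm 15: VirAds: Finding Influence Nodes in Networks
Input: Graph $G = (V, E)$, $0 < \rho < 1$, $d \in \mathbb{N}^+$
Output: A small d-seeding
$n^{(e)}_v \leftarrow d(v)$; $n^{(a)}_v \leftarrow d(v)$; $r_v \leftarrow d + 1$, $\forall v \in V$;
$r^{(i)}_v = 0$, $i = 0..d$; $P \leftarrow \emptyset$;
while there exist inactive vertices do
    repeat
        $u \leftarrow \arg\max_{v \notin P} \{n^{(e)}_v + n^{(a)}_v\}$;
        Recompute $n^{(e)}_u$ as the number of new active edges after adding $u$.
    until $u = \arg\max_{v \notin P} \{n^{(e)}_v + n^{(a)}_v\}$;
    $P \leftarrow P \cup \{u\}$;
    Initialize a queue: $Q \leftarrow \{(u, r_u)\}$; $r_u \leftarrow 0$;
    foreach $x \in N(u)$ do
        $n^{(a)}_x \leftarrow \max\{n^{(a)}_x - 1, 0\}$;
    while $Q \neq \emptyset$ do
        $(t, \tilde{r}_t) \leftarrow Q.pop()$;
        foreach $w \in N(t)$ do
            foreach $i = r_t$ to $\min\{\tilde{r}_t - 1, r_w - 2\}$ do
                $r^{(i)}_w = r^{(i)}_w + 1$;
                if $(r^{(i)}_w \ge \rho\, d_w) \wedge (r_w \ge d) \wedge (i + 1 \dots$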

Except for $n^{(e)}_v$, we show that all other measures can be effectively kept track of in only $O((m + n)d)$ time during the whole algorithm. When a vertex $u$ is selected, it causes a chain reaction that activates a sequence of vertices or lowers the rounds in which vertices are activated. Newly activated vertices, together with their active rounds, are successively pushed into the queue $Q$ for further updating, much like what happens in the Bellman-Ford shortest-paths algorithm. Every time we pop a vertex $v$ from $Q$, if $r_v$, the current active round of $v$, is different from $\tilde{r}_v$, the active round of $v$ when $v$ was pushed into $Q$, we update for each neighbor $w$ of $v$ the values of $r_w$ and $r^{(i)}_w$. If any neighbor $w$ of $v$ changes its active round and $w$ is not in $Q$, we push $w$ into $Q$ for further update. The update process stops when $Q$ is empty. Note that for each node $u \in V$, a change of $r_u$ can cause at most $d$ updates of $r^{(\cdot)}_w$, where $w$ is a neighbor of $u$. For all neighbors of $u$, the total number of updates is, hence, $O(d \cdot d(u))$. Thus, the total time for updating $r^{(\cdot)}_w$ $\forall w \in V$ in VirAds will be at most $O((m + n)d)$.

To maintain $n^{(e)}_v$, the easiest approach is to recompute all $n^{(e)}_v$. This approach, called Exhaustive Update, is extremely time-consuming, as discussed for the naive greedy algorithm. Instead, we only update $n^{(e)}_v$ when necessary. In detail, vertices are stored in a max priority queue in which the priority is their effectiveness. In each step, the vertex $u$ with the highest effectiveness is extracted and $n^{(e)}_u$ is recomputed. If, after updating, $u$ still has the highest effectiveness, $u$ is selected. Otherwise, $u$ is pushed back into the priority queue, and the new vertex with the highest effectiveness is considered, and so on.

Approximation Ratio for Power-law Networks. The CFM problem can be easily shown to be NP-hard by a reduction from the set cover problem. Thus, we are left with two choices: designing heuristics that have no worst-case performance guarantees, or designing approximation algorithms that guarantee the produced solutions are within a certain factor of the optimal. Formally,


a $\gamma$-approximation algorithm for a minimization (maximization) problem always returns solutions that are at most $\gamma$ times larger (smaller) than an optimal solution. Unfortunately, an approximation algorithm with a factor less than $O(\log n)$ is unlikely to exist, as shown in the next section. However, if we assume the network is power-law, our VirAds is an approximation algorithm for CFM with a constant factor.

Theorem 6.2. In power-law networks, VirAds is an $O(1)$-approximation algorithm for the CFM problem for bounded values of $d$.

The theorem follows directly from the result in the previous section that the optimal solution has size at least $\Omega(n)$ in power-law networks. Thus, the ratio between the VirAds solution and the optimal solution is bounded by a constant.

6.3 Hardness of the CFM Problem

This section establishes the hardness of approximating the optimal solutions of the CFM problem, i.e., the impossibility of finding near-optimal solutions in polynomial time. As shown in the previous section, we can obtain $O(1)$-approximation algorithms for CFM when the network is power-law. However, without the power-law assumption, there is no algorithm that can approximate the problem within a factor less than $O(\log n)$. We first prove the hardness for the case $d = 1$, which is an essential step in proving the hardness for the general case $d \ge 1$. We begin with Feige's reduction for proving the $\ln n$ threshold for the set cover problem. Our proof of the hardness of approximation for the CFM problem requires understanding Feige's construction together with its parameter settings.

6.3.1 Feige's Reduction for Set Cover

Feige presented a reduction from a $k$-prover proof system for a MAX 3SAT-5 instance $\varphi$, which is a conjunctive normal form formula consisting of $n$ variables and $\frac{5n}{3}$ clauses of exactly 3 literals. The verifier interacts with $k$ provers and asks the provers different questions based on a random string $r$; each question involves $l/2$ clauses and $l/2$ variables. If the formula is satisfiable, then the provers have a strategy that causes the verifier to accept for all random strings. If only a $(1 - \epsilon)$ fraction of the clauses in


$\varphi$ are simultaneously satisfiable, then for all strategies of the provers, the verifier weakly accepts with a probability of at most $k^2 2^{-cl}$, where $c$ is a constant that depends only on $\epsilon$.

The core of the set cover gadget is a partition system $B(m, L, k, d)$, where $B$ is a ground set of $m$ points. The partition system is a collection of $L = 2^l$ partitions $P_1, \ldots, P_L$ of $B$; each partition $P_i$ has exactly $k$ disjoint subsets $p_{i,1}, \ldots, p_{i,k}$. Any cover of the $m$ points in $B$ requires at least $d = \left(1 - \frac{2}{k}\right) k \ln m$ subsets. The condition that makes constructing such a system possible is that $l > \frac{1}{c}(5 \log k + 2 \log \ln m)$. The hardness ratio $(1 - f(k)) \ln m$ of the set cover is obtained from the following key lemma.

Lemma 17. (Lemma 4.1 [39]) If $\varphi$ is satisfiable, then the above set of $N = mR$ points can be covered by $kQ$ subsets. If only a $(1 - \epsilon)$ fraction of the clauses in $\varphi$ are simultaneously satisfiable, the above set requires $(1 - 2f(k))\, kQ \ln m$ subsets in order to be covered, where $f(k) \to 0$ as $k \to \infty$.


Note that $\ln m = (1 - \epsilon) \ln N$ by the setting of $n$, $l$, and $m$ in the proof. Thus, the final hardness ratio is $(1 - \epsilon) \ln N$, where $N = |U|$. However, we can choose different settings of $n$, $l$, and $m$ and obtain different hardness ratios. We finish the presentation of Feige's reduction by giving upper bounds for quantities that appear later in our proofs.
• The number of subsets $|\mathcal{S}| \le |Q|\, 2^{2l}$, since for each question $q \in Q$ there are at most $2^{2l}$ answers of $2l$-bit length.
• The maximum size of a subset $\Delta_S = \max_{S \in \mathcal{S}} |S| \le m\, 3^{l/2}$, since for each $i$ and $q \in Q$ there are at most $3^{l/2}$ random strings $r$ such that the verifier makes query $q$ to the $i$th prover, and $|p^r_{a_r, i}| \le m$.
• The maximum frequency of a point (element) in $U$: $f \le k\, 2^l$, because for a pair $(q, i)$, each partition $p^r_{a_r, i}$ is included at most $2^l$ times, and each point in $B_r$ appears in exactly $k$ partitions.

6.3.2 One-hop CFM

We prove that the CFM problem cannot be approximated within a factor of $\ln \Delta - O(\ln \ln \Delta)$ in graphs of maximum degree $\Delta$, unless P = NP. The proof uses a gap-reduction from an instance of the Bounded Set Cover problem (SCB) to an instance of the CFM problem whose degrees are bounded by $B' = B\, \mathrm{polylog}\, B$. For background on hardness of approximation and gap-reductions we refer to reference [9].

Definition 8 (Bounded Set Cover). Given a set system $(U, \mathcal{S})$, where $U = \{e_1, e_2, \ldots, e_{n_s}\}$ is a universe and $\mathcal{S}$ is a collection of subsets of $U$. Each subset in $\mathcal{S}$ has at most $B$ elements and each element belongs to at most $B$ subsets, for a predefined constant $B > 0$. A cover is a subfamily $C \subseteq \mathcal{S}$ of sets whose union is $U$. Find a cover that uses the minimum number of subsets.

We state the tight inapproximability result for the bounded set cover by Trevisan [79] in the following lemma.

Lemma 18. There exist constants $B_0, c_0 > 0$ such that for every $B \ge B_0$ it is NP-hard to approximate the SCB problem within a factor of $\ln B - c_0 \ln \ln B$.

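Figure 6-2. Reduction from SCB to CFM when $d = 1$.

The proof in [79] reduces an instance of GAP-SAT$_{1,[\ldots]}$ of size $n_S$ to an instance $F = (U, \mathcal{S})$ of SCB by setting the parameters $l$, $m$ in Feige's construction [39] to be $\Theta(\ln \ln B)$ and $\frac{B}{\mathrm{polylog}(B)}$, respectively. Denote by $\Delta_S$ the maximum cardinality of sets and by $\Delta_f$ the maximum frequency of elements in $U$. We have

$$|U| = m\, n_S^{l}\, \mathrm{polylog}\,B, \qquad |\mathcal{S}| = n_S^{l}\, \mathrm{polylog}\,B, \qquad \Delta_S \le B, \qquad \Delta_f \le \mathrm{polylog}\,B$$

for sufficiently large $B$.

SCB-CFM reduction. For each instance $F = (U, \mathcal{S})$ of SCB, we construct a graph $H = (V, E)$ as follows (Fig. 6-2):

• Construct a bipartite graph with the vertex set $U \cup \mathcal{S}$ and edges between $S$ and all elements $e_i \in S$, for each $S \in \mathcal{S}$.
• Add a set $D$ consisting of $t$ vertices and a set $D'$ with the same number of vertices, say $D = \{x_1, x_2, \ldots, x_t\}$ and $D' = \{x'_1, x'_2, \ldots, x'_t\}$, where $t = \frac{|U|}{B \ln^2 B}$.
• Connect $x_i$ to $x'_i$ for all $i = 1, \ldots, t$. This enforces the selection of $x_i$ in the optimal CFM.
• Connect each vertex $e_j \in U$ to $\lceil \frac{\rho}{1-\rho} f(e_j) \rceil - 1$ vertices in $D$ and each vertex $S_k \in \mathcal{S}$ to $\lceil \frac{\rho}{1-\rho} |S_k| \rceil$ vertices in $D$, where $f(e_j)$ is the frequency of element $e_j$. During the connection, we balance the degrees of the vertices in $D$.

We can assume w.l.o.g. that optimal solutions of CFM contain all vertices in $D$ but none in $D'$. Then all vertices in $\mathcal{S}$ will be activated after the first round, and a vertex in $U$ is activated if and only if one of its neighbors in $\mathcal{S}$ is selected into the solution. Thus, the following lemma holds.

Lemma 19. The size difference between the optimal CFM of $H$ and the optimal SCB of $F$ is exactly the cardinality of $D$, i.e., $OPT_{CFM}(H) = OPT_{SC}(F) + t$.

The key to preserving the hardness ratio is to keep the degree of vertices in $H$ bounded and the gap between the optimal solutions' sizes small.

Lemma 20. If $t = \frac{|U|}{B \ln^2 B}$, then the maximum degree of vertices in $H$ will be $B' = \Delta(H) = O(B\,\mathrm{polylog}\,B)$.

Proof. We can verify that vertices in $\mathcal{S}$ and $U$ have degree $O(B)$. Vertices in $D$ have degree at most $\frac{\mathrm{vol}(D)}{t} + 1$, where $\mathrm{vol}(D)$ is the total degree of vertices in $D$. Define $\delta(X, Y)$ as the set of edges crossing between two vertex subsets $X$ and $Y$. We have

$$\mathrm{vol}(D) = |\delta(D, D')| + |\delta(D, U)| + |\delta(D, \mathcal{S})| = |D| + \sum_{S_k \in \mathcal{S}} \left\lceil \frac{\rho}{1-\rho} |S_k| \right\rceil + \sum_{e_j \in U} \left( \left\lceil \frac{\rho}{1-\rho} f(e_j) \right\rceil - 1 \right) \le \frac{2\rho}{1-\rho} |\mathcal{S}| B + |\mathcal{S}| + t = \left( \frac{2\rho}{1-\rho} B + 1 \right) |\mathcal{S}| + t.$$

We have used the facts that $\sum_{S_k \in \mathcal{S}} |S_k| = \sum_{e_j \in U} f(e_j)$ and $|S_k| \le B$ for all $S_k \in \mathcal{S}$. Thus,

$$B' \le \frac{1}{t} \left[ \left( \frac{2\rho}{1-\rho} B + 1 \right) |\mathcal{S}| + t \right] + 1 \le \left( \frac{2\rho}{1-\rho} B + 1 \right) \frac{B \ln^2 B \cdot n_S^{l}\, \mathrm{polylog}\,B}{m\, n_S^{l}\, \mathrm{polylog}\,B} + 2 = O(B\,\mathrm{polylog}\,B).$$

This completes the proof.

Theorem 6.3. When $d = 1$, it is NP-hard to approximate the CFM problem in graphs with degrees bounded by $B'$ within a factor of $\ln B' - c_1 \ln \ln B'$, for some constant $c_1 > 0$.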
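The SCB-CFM construction can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the authors' code: the influence factor is written `rho` and passed as an exact `Fraction`, the number `t` of vertices in $D$ is a free parameter rather than $|U| / (B \ln^2 B)$, the vertex names `x0..` for $D$ and `y0..` for $D'$ are hypothetical, and degree balancing in $D$ is done by simple round-robin assignment.

```python
from fractions import Fraction
from math import ceil
from typing import Dict, Iterable, Set


def build_reduction(subsets: Dict[str, Set[str]],
                    rho: Fraction, t: int) -> Dict[str, Set[str]]:
    """Adjacency sets of H for the set system `subsets` (set name -> elements).
    Set names, element names, and the x*/y* names are assumed distinct."""
    adj: Dict[str, Set[str]] = {}

    def add_edge(u: str, v: str) -> None:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    # Bipartite part: each set is adjacent to its elements.
    freq: Dict[str, int] = {}
    for name, members in subsets.items():
        for e in members:
            add_edge(name, e)
            freq[e] = freq.get(e, 0) + 1

    # D = {x_i}, D' = {y_i}, matched one to one.
    for i in range(t):
        add_edge(f"x{i}", f"y{i}")

    ratio = rho / (1 - rho)
    cursor = 0  # round-robin pointer that balances degrees inside D

    def attach(v: str, count: int) -> None:
        nonlocal cursor
        if count > t:
            raise ValueError("t is too small for this instance")
        for _ in range(count):
            add_edge(v, f"x{cursor % t}")
            cursor += 1

    for e in sorted(freq):
        attach(e, ceil(ratio * freq[e]) - 1)
    for name in sorted(subsets):
        attach(name, ceil(ratio * len(subsets[name])))
    return adj


def one_hop(adj: Dict[str, Set[str]], seeds: Iterable[str],
            rho: Fraction) -> Set[str]:
    """Vertices active after one round: seeds plus every vertex with at
    least a rho fraction of its neighbors among the seeds."""
    active = set(seeds)
    return active | {v for v in adj if len(adj[v] & active) >= rho * len(adj[v])}
```

For the sets S0 = {e1, e2}, S1 = {e2, e3}, S2 = {e3, e4} with `rho = 1/2` and `t = 2`, seeding $D$ together with the cover {S0, S2} activates every vertex in one round, whereas seeding $D$ with S1 alone leaves e1 inactive, in line with Lemma 19.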

Proof. We prove by contradiction. Assume there exists an algorithm $A$ that finds, in graphs with degrees bounded by $B'$ and $d = 1$, a CFM of size at most $(\ln B' - c_1 \ln \ln B')\, OPT_{CFM}$, where $OPT_{CFM}$ is the size of an optimal CFM. Let $F = (U, \mathcal{S})$ be an instance of SCB with an optimal solution of size $OPT_{SC}$. Construct an instance $H$ of the CFM problem using the SCB-CFM reduction shown above. From the bound in the proof of Lemma 20, there exists a constant $\beta > 0$ such that $B' \le B \ln^{\beta} B$. Using algorithm $A$ on $H$, we obtain a solution of size at most $(\ln B' - c_1 \ln \ln B')\, OPT_{CFM}$. We can then convert it to a solution of SCB by excluding the vertices in $D$ (see Lemma 19) and obtain a set cover of size at most

$$(\ln B' - c_1 \ln \ln B')(OPT_{SC} + t) - t.$$

Since each set in $\mathcal{S}$ can cover at most $B$ elements, we have $OPT_{SC} \ge \frac{|U|}{B} = \frac{t B \ln^2 B}{B}$, thus $t \le \frac{OPT_{SC}}{\ln^2 B}$. If we select $c_1 = c_0 + \beta + 1$, then $\ln B' - c_1 \ln \ln B' \le \ln B - (c_0 + 1) \ln \ln B$, and multiplying by $OPT_{SC} + t \le (1 + \frac{1}{\ln^2 B})\, OPT_{SC}$ shows that, for sufficiently large $B$, the solution of SCB has size at most $(\ln B - c_0 \ln \ln B)\, OPT_{SC}$, which contradicts Lemma 18.

Similarly, with an appropriate setting in Feige's construction [39], we obtain the following hardness result in terms of the network size $n$ (the proof details can be found in the technical report on our website).

Theorem 6.4. For any $\epsilon > 0$, the CFM problem, when $d = 1$, cannot be approximated within a factor $(\frac{1}{2} - \epsilon) \ln n$, unless $NP \subseteq DTIME(n^{O(\log \log n)})$.

Proof. We use the same gadget as in Fig. 6-2 to prove the hardness for CFM. Since we no longer need to keep the degrees of vertices in the gadget bounded, we form a clique with the vertices in $D$. We can connect each $v \in (\mathcal{S} \cup U)$ to the number of vertices in $D$ prescribed by the construction, which is at most $\frac{\rho}{1-\rho} \Delta_S$. That is,

$$|D| = O\left( \frac{\rho}{1-\rho} \Delta_S \right) = O(\Delta_S) = O(m 3^{l/2}).$$

Or equivalently

$$|D|^2 = O\Bigl( \sum_{v \in (\mathcal{S} \cup U)} d(v) + x_0 (|\mathcal{S}| + |U|) \Bigr) = O\Bigl( 2 \sum_{v \in U} d(v) + |\mathcal{S}| + |U| \Bigr) = O(mRk2^l).$$

To summarize, the sufficient condition is

$$|D| = O\bigl( m 2^{\Theta(l)} + (mRk2^l)^{1/2} \bigr).$$

By Lemma 17 and the construction, the hardness ratio of our problem is given by

$$\frac{(1 - \frac{4}{k}) kQ \ln m + |D|}{kQ + |D|}.$$

Unfortunately, with the same setting as in Feige's reduction, $|D| = O(\Delta_S) = O\bigl( (5n)^{\frac{2l}{\epsilon}} 2^{\Theta(l)} \bigr)$, and the above hardness ratio gets arbitrarily close to 1. Hence we use a different setting in which $m = (5n)^{cl}$ with a small constant $c > 0$ to reduce the maximum degree. The consequence is that the inapproximability ratio is reduced accordingly. The optimal setting to get the best inapproximability ratio is $m = (5n)^{l(1-\epsilon)}$ for some $\epsilon > 0$. Then $N = mR = (5n)^{l(2-\epsilon)}$, or $m = N^{\frac{1-\epsilon}{2-\epsilon}}$. From the sufficient condition above, it suffices that

$$|D| = \frac{n^l 2^{\Theta(l)}}{n^{l\epsilon/2}} = o(Q).$$

Hence, the hardness ratio will be

$$\frac{(1 - \frac{4}{k}) kQ \ln m + o(Q)}{kQ + o(Q)} > \Bigl( 1 - \frac{5}{k} \Bigr) \ln m.$$

The number of vertices in the graph, denoted by $n_H$, is

$$n_H = 2|D| + |\mathcal{S}| + |U| < \Theta(m 3^{l/2}) + n^l 2^{2l} \Bigl( \frac{5}{3} \Bigr)^{l/2} + (5n)^{l(2-\epsilon)} < 2|U| = 2N.$$

Finally, the hardness ratio is at least

$$\Bigl( 1 - \frac{5}{k} \Bigr) \ln \Bigl( \frac{n_H}{2} \Bigr)^{\frac{1}{2} - \frac{\epsilon}{4 - 2\epsilon}} > \Bigl( 1 - \frac{5}{k} \Bigr) \frac{1}{2} \Bigl( 1 - \frac{\epsilon}{2 - \epsilon} \Bigr) \ln n_H - \Theta(1) > \frac{1}{2} (1 - \epsilon) \ln n_H.$$

Here, we assume $k$ is sufficiently large and $\epsilon$ is sufficiently small.
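The exponent in the final bound of the proof comes from rewriting $m = N^{(1-\epsilon)/(2-\epsilon)}$ and using $n_H < 2N$. The following worked step is a sketch that uses only quantities already defined in the proof; how the constants are absorbed in the last line is stated as an assumption, not taken from the source.

```latex
\begin{align*}
\frac{1-\epsilon}{2-\epsilon}
  &= \frac{(2-\epsilon)-\epsilon}{2(2-\epsilon)}
   = \frac{1}{2} - \frac{\epsilon}{4-2\epsilon}
   = \frac{1}{2}\Bigl(1 - \frac{\epsilon}{2-\epsilon}\Bigr), \\
\ln m
  &= \frac{1-\epsilon}{2-\epsilon}\,\ln N
   > \frac{1}{2}\Bigl(1 - \frac{\epsilon}{2-\epsilon}\Bigr)\bigl(\ln n_H - \ln 2\bigr)
   && \text{since } n_H < 2N, \\
1 - \frac{\epsilon}{2-\epsilon}
  &> 1 - \epsilon
   && \text{for } 0 < \epsilon < 1 .
\end{align*}
% Assumption: the slack \epsilon(1-\epsilon)/(2-\epsilon) between the last two
% quantities is what absorbs the factor (1 - 5/k) and the additive constant
% once k and n_H are large enough.
```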

Note that Theorems 6.3 and 6.4 are incomparable in general. Let $\Delta$ be the maximum degree. Theorem 6.3 implies hardness of approximation with factor $(1 - \epsilon) \ln \Delta$, which is larger than $(\frac{1}{2} - \epsilon) \ln n$ if $\Delta$ […] $n$, but smaller when […]. In addition, Theorem 6.4 uses a stronger assumption than that in Theorem 6.3.

6.3.3 Multiple-hop CFM

We now present a gap reduction from the one-hop CFM problem to the $d$-hop CFM problem with $d \ge 2$. The hardness result then follows immediately from Theorem 6.3 in the previous section. Given a graph $G = (V, E)$ as an instance of the one-hop CFM problem, we construct an instance $G' = (V', E')$ of the $d$-hop CFM problem as follows (and as illustrated in Fig. 6-4).

Figure 6-3. The transmitter gadget.

We add $c(\rho)$ vertices $w_1, w_2, \ldots, w_{c(\rho)}$, called flash points, where $c(\rho) = \min\{ t \in \mathbb{N} \mid \frac{t-1}{t+1}$ […]

Figure 6-4. Gap-reduction from one-hop CFM to $d$-hop CFM.

Hence, the number of activated neighbors of $v$ after $d - 1$ rounds will equal the number of selected neighbors of $v$ in the original graph. Finally, we replace each edge $(w_p, z_p)$ by a transmitter. In order to activate all dummy vertices $z_p$ after $d$ rounds, we can assume, w.l.o.g., that all flash points must be selected in an optimal solution. The following lemma follows directly from the construction.

Lemma 21. Every solution of size $k$ for the one-hop ($d = 1$) CFM problem in $G$ induces a solution of size $k + c(\rho)$ for the $d$-hop CFM problem in $G'$.

In the other direction, we also have the following lemma.

Lemma 22. An optimal solution of size $k'$ for the $d$-hop CFM problem induces a solution of size $k' - c(\rho)$ for the one-hop CFM problem in $G$.

Proof. For a transmitter connecting $u$ to $v$, if the solution of the $d$-hop CFM problem contains any of the intermediate vertices $uv_1, \ldots, uv_{d-1}$, we can replace that vertex in the solution with either $u$ or $v$ to obtain a new solution of the same size (or less). Hence, we can assume, w.l.o.g., that none of the intermediate vertices are selected. Therefore, all flash points must be selected in order to activate the dummy vertices. It is easy to see that the solution of $d$-hop CFM excluding the flash points is a solution of one-hop CFM in $G$ with size $k' - c(\rho)$.

Note that the number of vertices in $G'$ is upper-bounded by $dn^2$, i.e., $\ln |V'| < 2 \ln |V| + \ln d$. Thus, using the same arguments as in the proof of Theorem 6.4, we can show that a $(\frac{1}{4} - \epsilon) \ln n$ approximation algorithm would lead to a $(\frac{1}{2} - \epsilon) \ln n$ approximation algorithm for the one-hop CFM problem, which contradicts Theorem 6.4.

Theorem 6.5. The CFM problem cannot be approximated within $(\frac{1}{4} - \epsilon) \log n$ for $d \ge 1$, unless $NP \subseteq DTIME(n^{O(\log \log n)})$.

6.4 Empirical Study

In this section we perform experiments on OSNs to show the efficiency of our algorithms in comparison with a simple degree centrality heuristic, and we study the trade-off between the number of times the information is allowed to propagate in the network and the seeding size.

6.4.1 Comparing to Optimal Seeding

One advantage of our discrete diffusion model over probabilistic ones [51, 52] is that the exact solution can be found using mathematical programming. This enables us to study the exact behavior of the seeding size when the number of propagation hops varies.

Figure 6-5. Seeding size (in percent) on Erdős's Collaboration network: (A) $\rho = 0.4$, (B) $\rho = 0.6$, (C) $\rho = 0.8$. VirAds produces close to the optimal seeding in only fractions of a second.

We formulate the CFM problem as a 0-1 Integer Linear Programming (ILP) problem below.

$$\text{minimize} \quad \sum_{v \in V} x_v^0 \tag{1}$$
$$\text{subject to} \quad \sum_{v \in V} x_v^d \ge |V| \tag{2}$$
$$\sum_{w \in N(v)} x_w^{i-1} + \lceil \rho\, d(v) \rceil\, x_v^{i-1} \ge \lceil \rho\, d(v) \rceil\, x_v^{i} \qquad \forall v \in V,\ i = 1..d \tag{3}$$
$$x_v^{i} \ge x_v^{i-1} \qquad \forall v \in V,\ i = 1..d \tag{4}$$
$$x_v^{i} \in \{0, 1\} \qquad \forall v \in V,\ i = 0..d \tag{5}$$

where $x_v^i = 0$ if $v$ is inactive at round $i$, and $x_v^i = 1$ otherwise.

The objective of the ILP is to select a minimum number of seeds at the beginning. Constraint (2) guarantees that all nodes are activated at the end, while (3) encodes the propagation condition; constraint (4) simply keeps vertices active once they are activated.

We solve the ILP problem on the Erdős collaboration network, the social network of the famous mathematician [14]. The network consists of 6100 vertices and 15030 edges. The ILP is solved with the optimization package GUROBI 4.5 on an Intel Xeon 2.93 GHz PC, with the time limit for the solver set to 2 days. The running time of the IP solver increases significantly when $d$ increases. For $d = 1, 2,$ and $3$, the solver returns the optimal solutions. However, for $d = 4$, the solver cannot find the optimal solutions within the time limit and returns sub-optimal solutions with relative errors of at most 15%.

The optimal (or sub-optimal) seeding sizes are shown in Figs. 6-5A, 6-5B, and 6-5C for $\rho = 0.4, 0.6$ and $0.8$, respectively. VirAds provides close-to-optimal solutions and performs much better than MaxDegree. In particular, when $\rho = 0.8$ the VirAds seeding differs from the optimal solutions by only one or two nodes. In addition, VirAds takes only fractions of a second to generate the solutions.

As proven in Section 6.1, the seeding takes a constant fraction of the nodes in the network. For the Erdős Collaboration Network, the seeding consists of 3.8% to 7% of the nodes in the network. Further, the seeding can consist of as many as 20% to 40% of the nodes for the larger social networks in the next section.

Although the mathematical programming approach can provide an accurate measurement of the optimal seeding size, it cannot be applied to larger networks. The rest of our experiments measure the quality and scalability of our proposed algorithm VirAds on a collection of large networks.

Figure 6-6. Seeding size when the number of propagation hops $d$ varies ($\rho = 0.3$); panels: (A) Physics, (B) Facebook, (C) Orkut. VirAds consistently has the best performance.

Figure 6-7. Running time when the number of propagation hops $d$ varies ($\rho = 0.3$); panels: (A) Physics, (B) Facebook, (C) Orkut. Even for the largest network of 110 million edges, VirAds takes less than 12 minutes.
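The propagation rule that the ILP encodes can also be checked by direct simulation on very small graphs. The sketch below is not the Gurobi model used in the experiments. It is an exhaustive search, exponential in the number of vertices, that applies the same activation threshold $\lceil \rho\, d(v) \rceil$ round by round. The function names `spread` and `optimal_seeding` are hypothetical, and `rho` is assumed to be an exact `Fraction` to avoid floating-point rounding in the ceiling.

```python
from fractions import Fraction
from itertools import combinations
from math import ceil
from typing import Dict, Hashable, Iterable, List, Set


def spread(adj: Dict[Hashable, Set[Hashable]], seeds: Iterable[Hashable],
           rho: Fraction, d: int) -> Set[Hashable]:
    """Active set after at most d synchronous rounds: an inactive vertex v
    becomes active when at least ceil(rho * deg(v)) neighbors are active."""
    active = set(seeds)
    for _ in range(d):
        newly = {v for v in adj
                 if v not in active
                 and len(adj[v] & active) >= ceil(rho * len(adj[v]))}
        if not newly:
            break
        active |= newly
    return active


def optimal_seeding(adj: Dict[Hashable, Set[Hashable]],
                    rho: Fraction, d: int) -> List[Hashable]:
    """Smallest seed set activating every vertex within d rounds, found by
    trying all subsets in order of increasing size (tiny graphs only)."""
    nodes = sorted(adj)
    for size in range(len(nodes) + 1):
        for seeds in combinations(nodes, size):
            if len(spread(adj, seeds, rho, d)) == len(nodes):
                return list(seeds)
    raise AssertionError("unreachable: seeding every vertex always works")
```

On the path 0-1-2-3-4 with `rho = 1/2`, two seeds are needed when `d = 1`, while the middle vertex alone suffices when `d = 2`, which illustrates the trade-off between the number of hops and the seeding size studied in this section.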

Figure 6-8. Degree distribution of the studied networks; panels: (A) Physics, (B) Facebook, (C) Orkut.

6.4.2 Large Social Networks

We select networks of various sizes, including the coauthor network in the Physics sections of the e-print arXiv [51], Facebook [81], and Orkut [62], a social networking site run by Google. Links in all three networks are undirected and unweighted. The sizes of the networks are presented in Table 6-1. The degree distributions of those networks are shown in Fig. 6-8.

Table 6-1. Sizes of the investigated networks

              Physics    Facebook     Orkut
Vertices      37,154     90,269       3,072,441
Edges         231,584    3,646,662    223,534,301
Avg. Degree   12.5       80.8         145.5

Physics: We shall refer to the physics coauthor network as the Physics network, or simply Physics. Each node in the network represents an author, and there is an edge between two authors if they have coauthored one or more papers. The Facebook dataset consists of 52% of the users in the New Orleans network [81]. The Orkut dataset was collected by crawling in late 2006 [62]. It contains about 11.3% of Orkut's users.

6.4.3 Solution Quality in Large Social Networks

We compare our VirAds algorithm with the following heuristics: the Random method, in which vertices are picked randomly until they form a $d$-hop seeding, and the MaxDegree method, in which vertices with the highest degree are selected until they form a $d$-hop seeding. Finally, we compare VirAds with its naive implementation, called Exhaustive Update, in which the effectiveness of all remaining vertices is recalculated after each vertex is selected into the seeding. With a more accurate estimate of vertex effectiveness, Exhaustive Update is expected to produce higher quality solutions than those of VirAds.

Figure 6-9. Seeding size at different influence factors $\rho$ (the maximum number of propagation hops is $d = 4$); panels: (A) Physics, (B) Facebook, (C) Orkut.

The seeding sizes for different numbers of propagation hops $d$ when $\rho = 0.3$ are shown in Fig. 6-6. To our surprise, VirAds performs as well as or better than Exhaustive Update even though it spends significantly less effort updating vertex effectiveness. VirAds has a smaller seeding in Physics than Exhaustive Update; both give similar results for Facebook; and Exhaustive Update could not finish on Orkut after 48 hours and was forced to terminate. Updating the vertices' effectiveness sparingly turns out to be sufficient because the influence propagation is locally bounded. In addition, the seedings produced by VirAds are almost two times smaller than those of Random.

The gap between VirAds and MaxDegree narrows when the maximum number of hops increases. Hence, selecting nodes with high degrees as the seeding is a good long-term strategy, but might not be efficient for fast propagation when the number of hops is limited. In Facebook and Orkut, when $d = 1$, MaxDegree has 60% to 70% more vertices in the seeding than VirAds. In Physics, the gap between VirAds and MaxDegree is less impressive. Nevertheless, VirAds consistently produces the best solutions in all networks.

6.4.4 Scalability

The running times of all methods for different numbers of propagation hops $d$ are presented in Fig. 6-7. The time is measured in seconds and presented in log scale. The running times increase slightly with the number of propagation rounds $d$ and are proportional to the size of the network. Exhaustive Update has the worst running time, taking up to 15 minutes for Physics and 20 minutes for Facebook. For Orkut, the algorithm cannot finish within 2 days, as mentioned. The three remaining algorithms, VirAds, MaxDegree, and Random, take less than one second for Physics and less than 10 seconds for Facebook. Even on the largest network, Orkut, with more than 220 million edges, VirAds requires less than 12 minutes to complete.

6.4.5 Influence Factor

We study the performance of VirAds and the other methods at different influence factors $\rho$. The number of propagation rounds $d$ is fixed to 4. The sizes of the $d$-seeding sets are shown in Figure 6-9. VirAds is clearly still the best performer. The seeding sizes of VirAds are up to 5 times smaller than those of MaxDegree for small $\rho$ (although this is hard to see on the charts due to the small seeding sizes).

Since all tested networks are social networks with small diameter, the seeding sizes go to zero when $\rho$ is close to zero. The exception is Physics, in which the seeding sizes do not go below 10% of the number of vertices in the network even when $\rho = 0.05$. A closer look at the Physics network reveals that it contains many isolated cliques of small sizes (2, 3, 4, and so on), which correspond to authors that appear in only one paper. In each clique, regardless of the threshold $\rho$, at least one vertex must be selected, so the seeding size cannot fall below the number of isolated cliques in the network. To eliminate the effect of isolated cliques, a possible approach is to restrict the problem to the largest component of the network.
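The two baseline heuristics of Section 6.4.3 can be sketched as follows. This is a minimal illustration, not the code used in the experiments: the helper `spread`, the tie-breaking by vertex identifier, and the choice not to skip vertices that are already active are all assumptions, since the text only says that vertices are added until a $d$-hop seeding is formed. VirAds itself is not reproduced here.

```python
import random
from fractions import Fraction
from math import ceil
from typing import Dict, Hashable, Iterable, List, Set


def spread(adj: Dict[Hashable, Set[Hashable]], seeds: Iterable[Hashable],
           rho: Fraction, d: int) -> Set[Hashable]:
    """Active set after at most d rounds with threshold ceil(rho * deg(v))."""
    active = set(seeds)
    for _ in range(d):
        newly = {v for v in adj
                 if v not in active
                 and len(adj[v] & active) >= ceil(rho * len(adj[v]))}
        if not newly:
            break
        active |= newly
    return active


def seeding_in_order(adj: Dict[Hashable, Set[Hashable]], order: List[Hashable],
                     rho: Fraction, d: int) -> List[Hashable]:
    """Add vertices in the given order until the seeds activate everything."""
    seeds: List[Hashable] = []
    for v in order:
        if len(spread(adj, seeds, rho, d)) == len(adj):
            break
        seeds.append(v)
    return seeds


def max_degree_seeding(adj, rho: Fraction, d: int) -> List[Hashable]:
    """MaxDegree baseline: highest degree first, ties broken by vertex id."""
    order = sorted(adj, key=lambda u: (-len(adj[u]), u))
    return seeding_in_order(adj, order, rho, d)


def random_seeding(adj, rho: Fraction, d: int,
                   rng: random.Random) -> List[Hashable]:
    """Random baseline: vertices are added in a uniformly random order."""
    order = sorted(adj)
    rng.shuffle(order)
    return seeding_in_order(adj, order, rho, d)
```

On the path 0-1-2-3-4 with `rho = 1/2` and `d = 1`, MaxDegree selects the three interior vertices although two seeds suffice, a small instance of the gap at low hop counts discussed above.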

CHAPTER 7
CONCLUSION

Society relies heavily on its networked physical infrastructure and information systems. To detect vulnerability issues in a network, it is of particular importance to analyze how well-connected the network will remain after a disruptive event takes place. We propose the use of pairwise connectivity, the number of connected pairs in the network, as a measure of disruptive effect, and use it to formulate network vulnerability assessment as optimization problems. The objective is to identify the minimum set of critical network elements (nodes or edges) whose removal results in a major degradation of the network's pairwise connectivity. We prove that both critical edges detection (CED) and critical nodes detection (CND) are NP-complete [34], and develop two novel solutions with provable guarantees: 1) an $O(\log^{1.5} n)$ bicriteria approximation algorithm for CED based on constructing a decomposition tree with recursive $c$-balanced cuts, and 2) an $O(\log n \log \log n)$ bicriteria approximation algorithm for CND [32]. Later we design a bicriteria approximation algorithm with performance guarantee $O(\sqrt{\log n})$ when the set of critical elements may include both edges and nodes. This immediately implies improved results for both CED and CND. The extensive experiments have revealed many insights into the relative criticality of edges and nodes on different network topologies.

In dynamic networks, e.g. cellular networks or mobile sensor networks, detecting critical nodes is extremely challenging due to the continual changes in network topology. We abstract dynamic networks as probabilistic graphs and measure the disruptive effect in terms of expected pairwise connectivity (EPC). Computing EPC is tightly related to network reliability problems, some of the most classical open #P-complete problems. Beyond showing the #P-completeness of EPC, we have approximated EPC with an FPRAS, which gives a potential direction for tackling open questions in network reliability. Further, we formulate the problem of detecting critical nodes as a two-level stochastic program and present a sample average approximation algorithm to solve the formulation with guaranteed accuracy.

We investigate cascading failures in complex systems in Chapter 6. Those failures often propagate and lead to much more devastating consequences. Thus, it is crucial to detect critical nodes whose failures will trigger a cascading failure across an entire network, leaving a majority of nodes in the failure state within a given number of steps. My theoretical analysis shows that the cascading of failures may be quite different in power-law networks than in others. First, we prove that a large number of initial failures are required to trigger a network-wide failure. Second, the problem of detecting critical nodes cannot be approximated within a factor $O(\log n)$ in general graphs; however, there is a constant factor approximation algorithm for the problem in power-law networks. Extensive experiments on large-scale OSNs with up to hundreds of millions of edges demonstrate the effectiveness of my proposed algorithm. My study also applies naturally to the problems of information propagation, viral marketing, and disease spreading.
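As a reminder of the measure this conclusion refers to, pairwise connectivity counts the connected vertex pairs, i.e., it sums $|C|(|C|-1)/2$ over the connected components $C$ that remain after removing the chosen nodes. The sketch below is an illustration only; the function name and the adjacency-set representation are assumptions.

```python
from typing import Dict, FrozenSet, Hashable, Set


def pairwise_connectivity(adj: Dict[Hashable, Set[Hashable]],
                          removed: FrozenSet[Hashable] = frozenset()) -> int:
    """Number of connected vertex pairs after deleting `removed` nodes."""
    seen = set(removed)  # removed nodes are never visited
    total = 0
    for start in adj:
        if start in seen:
            continue
        # Depth-first search to measure the size of this component.
        seen.add(start)
        stack = [start]
        size = 0
        while stack:
            u = stack.pop()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        total += size * (size - 1) // 2
    return total
```

For example, the path 0-1-2-3-4 has 10 connected pairs, and removing its middle vertex leaves only 2.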

REFERENCES

[1] US IP Backbone network XO company. URL http://www.xo.com/about/network/Pages/overview.aspx, 2012.
[2] Agarwal, A., Charikar, M., Makarychev, K., and Makarychev, Y. $O(\sqrt{\log n})$ approximation algorithms for min UnCut, min 2CNF deletion, and directed cut problems. STOC. New York, NY, USA: ACM, 2005, 573.
[3] Agarwal, P. K., Efrat, A., Ganjugunte, S. K., Hay, D., Sankararaman, S., and Zussman, G. The Resilience of WDM Networks to Probabilistic Geographical Failures. Networking, IEEE/ACM Transactions on PP (2013). 99: 1.
[4] Aiello, W., Chung, F., and Lu, L. A random graph model for massive graphs. STOC '00. New York, NY, USA: ACM, 2000.
[5] Aiello, W., Chung, F., and Lu, L. A Random Graph Model for Power Law Graphs. Experimental Math 10 (2000): 53.
[6] Aiello, William, Chung, Fan, and Lu, Linyuan. Random Evolution in Massive Graphs. In Handbook of Massive Data Sets. Kluwer Academic Publishers, 2001.
[7] Albert, R., Albert, I., and Nakarado, G. L. Structural Vulnerability of the North American Power Grid. Phys. Rev. E 69 (2004). 2: 10.
[8] Albert, R., Jeong, H., and Barabasi, A. Error and attack tolerance of complex networks. Nature 406 (2000). 6794: 14.
[9] Arora, S. and Barak, B. Computational complexity: a modern approach. Cambridge University Press, 2009. URL http://books.google.com/books?id=nGvI7cOuOOQC
[10] Arora, S., Hazan, E., and Kale, S. $O(\sqrt{\log n})$ Approximation to SPARSEST CUT in $\tilde{O}(n^2)$ Time. SIAM J. Comput. 39 (2010). 5.
[11] Arulselvan, A., Commander, Clayton W., Elefteriadou, L., and Pardalos, Panos M. Detecting critical nodes in sparse graphs. Computers & Operations Research 36 (2009). 7: 2193.
[12] Arulselvan, A., Commander, Clayton W., Elefteriadou, L., and Pardalos, Panos M. Detecting critical nodes in sparse graphs. Computers and Operations Research 36 (2009). 7.
[13] Barabasi, A., Albert, R., and Jeong, H. Scale-free characteristics of random networks: the topology of the world-wide web. Physica A 281 (2000).
[14] Barabasi, A, Jeong, H, Neda, Z, Ravasz, E, Schubert, A, and Vicsek, T. Evolution of the social network of scientific collaborations. Physica A: Statistical Mechanics and its Applications 311 (2002). 3-4: 590. URL http://linkinghub.elsevier.com/retrieve/pii/S0378437102007367
[15] Bissias, G. D. Bounds on service quality for networks subject to augmentation and attack. Ph.D. thesis, University of Massachusetts Amherst, 2010.
[16] Bissias, G. D., Levine, B. N., and Rosenberg, A. Bounding damage from link destruction, with application to the internet. SIGMETRICS Perform. Eval. Rev. (2007).
[17] Blackford, L. S., Choi, J., Cleary, A., D'Azeuedo, E., Demmel, J., Dhillon, I., Hammarling, S., Henry, G., Petitet, A., Stanley, K., Walker, D., and Whaley, R. C. ScaLAPACK user's guide. Philadelphia, PA, USA: SIAM, 1997.
[18] Borgatti, Stephen P. Identifying sets of key players in a social network. Computational & Mathematical Organization Theory 12 (2006). 1.
[19] Borgatti, Stephen P. and Everett, Martin G. A Graph-theoretic perspective on centrality. Social Networks (2006).
[20] Boyd, S. P. and Vandenberghe, L. Convex optimization. Cambridge University Press, 2004.
[21] Cauchy, A. L. B. and polytechnique (France), Ecole. Cours d'analyse de l'Ecole royale polytechnique. No. v. 1 in Cours d'analyse de l'Ecole royale polytechnique. Imprimerie royale, 1821. URL http://books.google.com/books?id=n60AAAAAMAAJ
[22] Centola, Damon and Macy, Michael. Complex Contagions and the Weakness of Long Ties. American Journal of Sociology 113 (2007). 3: 702. URL http://dx.doi.org/10.1086/521848
[23] Chen, N. On the Approximability of Influence in Social Networks. SIAM Journal of Discrete Mathematics 23 (2009). 3: 1400.
[24] Chung, Fan R. K. Spectral Graph Theory (CBMS Regional Conference Series in Mathematics, No. 92). American Mathematical Society, 1997.
[25] Church, R., Scaparra, M., and Middleton, R. Identifying critical infrastructure: the median and covering facility interdiction problems. Ann Assoc Am Geogr 94 (2004). 3.
[26] Clauset, A., Shalizi, C. R., and Newman, M. E. J. Power-law distributions in empirical data. SIAM Reviews (2007).
[27] Colbourn, Charles J. The Combinatorics of Network Reliability. New York, NY, USA: Oxford University Press, Inc., 1987.

[28] Cormen, T. H., Leiserson, C. E., Rivest, R. L., and Stein, C. Introduction to Algorithms. The MIT Press, 2009, 3rd edition ed.
[29] Dagum, P., Karp, R., Luby, M., and Ross, S. An Optimal Algorithm for Monte Carlo Estimation. SIAM Journal on Computing 29 (2000). 5: 1484. URL http://epubs.siam.org/doi/abs/10.1137/S0097539797315306
[30] Demmel, J. W., Eisenstat, S. C., Gilbert, J. R., Li, X. S., and Liu, J. W. H. A supernodal approach to sparse partial pivoting. SIAM J. Matrix Analysis and Applications 20 (1999). 3: 720.
[31] Di Summa, Marco, Grosso, Andrea, and Locatelli, Marco. Complexity of the critical node problem over trees. Comput. Oper. Res. 38 (2011). 12: 1766. URL http://dx.doi.org/10.1016/j.cor.2011.02.016
[32] Dinh, T. N., Dung, N. T., and Thai, M. T. Cheap, Easy, and Massively Effective Viral Marketing in Social Networks: Truth or Fiction? Proceedings of the 23rd ACM conference on Hypertext and Social Media. HT '12. Milwaukee, WI, USA: ACM, 2012.
[33] Dinh, T. N. and Thai, M. T. Precise Structural Vulnerability Assessment Via Mathematical Programming. Proc. of IEEE MILCOM. 2011.
[34] Dinh, T. N., X., Ying, Thai, M. T., Park, E. K., and Znati, T. On Approximation of New Optimization Methods for Assessing Network Vulnerability. Proc. of IEEE INFOCOM. 2010.
[35] Dinur, I. and Safra, S. On the Hardness of Approximating Minimum Vertex Cover. Annals of Mathematics 162 (2004): 2005.
[36] Donath, W. E. and Hoffman, A. J. Lower bounds for the partitioning of graphs. IBM J. Res. Dev. 17 (1973).
[37] Erdos, P. and Renyi, A. On the evolution of random graphs. Publ. Math. Inst. Hungary. Acad. Sci. 5 (1960): 17.
[38] Even, G., Naor, J. S., Rao, S., and Schieber, B. Divide-and-conquer approximation algorithms via spreading metrics. J. of ACM 47 (2000). 4: 585.
[39] Feige, U. A threshold of ln n for approximating set cover. Journal of ACM 45 (1998). 4: 634.
[40] Ferrante, Alessandro. Hardness and Approximation Algorithms of Some Graph Problems. 2006.
[41] Freeman, Linton C. A Set of Measures of Centrality Based on Betweenness. Sociometry 40 (1977). 1: 35-41. URL http://dx.doi.org/10.2307/3033543
[42] Garey, Michael R. and Johnson, David S. Computers and Intractability; A Guide to the Theory of NP-Completeness. New York, NY, USA: W. H. Freeman & Co., 1990.
[43] Gkantsidis, C., Mihail, M., and Saberi, A. Conductance and congestion in power law graphs. SIGMETRICS '03: Proceedings of the International Conference on Measurements and Modeling of Computer Systems. New York, NY, USA: ACM, 2003, 148.
[44] Goldberg, A V and Tarjan, R E. A new approach to the maximum flow problem. Proceedings of the eighteenth annual ACM symposium on Theory of computing. STOC '86. New York, NY, USA: ACM, 1986, 136. URL http://doi.acm.org/10.1145/12130.12144
[45] Goyal, A., Bonchi, F., and Lakshmanan, L. V. S. Learning influence probabilities in social networks. WSDM '10 (2010): 241. URL http://portal.acm.org/citation.cfm?id=1718518
[46] Goyal, D. and Caffery, J. Partitioning avoidance in mobile Ad Hoc networks using network survivability concepts. ISCC (2002).
[47] Grötschel, M. and Wakabayashi, Y. A cutting plane algorithm for a clustering problem. Mathematical Programming 45 (1989). 10.1007/BF01589097.
[48] Grubesic, Tony H., Matisziw, Timothy C., Murray, Alan T., and Snediker, Diane. Comparative Approaches for Assessing Network Vulnerability. Inter. Regional Sci. Review (2008).
[49] Hauspie, M., Carle, J., and Simplot, D. Partition detection in mobile Ad Hoc networks using multiple disjoint paths set. Workshop of Objects, Models and Multimedia technology (2003).
[50] Jorgic, M., Stojmenovic, I., Hauspie, M., and Simplot-Ryl, D. Localized algorithms for detection of critical nodes and links for connectivity in ad hoc networks. 3rd IFIP MED-HOC-NET Workshop (2004).
[51] Kempe, D., Kleinberg, J., and Tardos, E. Maximizing the spread of influence through a social network. KDD '03: Proceedings of the 9th ACM SIGKDD international conference on Knowledge discovery and data mining. ACM New York, NY, USA, 2003, 137.
[52] Kempe, D., Kleinberg, J., and Tardos, E. Influential nodes in a diffusion model for social networks. International Colloquium on Automata, Languages and Programming '05. 2005, 1127.

[53] Kirkpatrick, S., Gelatt, C. D., and Vecchi, M. P. Optimization by Simulated Annealing. Science 220 (1983). 4598: 671.
[54] Kleywegt, A., Shapiro, A., and Homem-de Mello, T. The Sample Average Approximation Method for Stochastic Discrete Optimization. SIAM Journal on Optimization 12 (2002). 2: 479.
[55] Lehman, T., Sobieski, J., and Jabbari, B. DRAGON: a framework for service provisioning in heterogeneous grid networks. IEEE Communication Magazines (2006).
[56] Lehoucq, R. B., Sorensen, D. C., and Yang, C. ARPACK Users Guide: Solution of Large Scale Eigenvalue Problems by Implicitly Restarted Arnoldi Methods. 1997.
[57] Leskovec, J., Krause, A., Guestrin, C., Faloutsos, C., VanBriesen, J., and Glance, N. Cost-effective outbreak detection in networks. ACM SIGKDD Conference on Knowledge Discovery and Data Mining '07. New York, NY, USA: ACM, 2007, 420.
[58] Leskovec, Jure, Kleinberg, Jon, and Faloutsos, Christos. Graphs over time: densification laws, shrinking diameters and possible explanations. KDD. ACM, 2005, 177.
[59] Matisziw, T. C. and Murray, A. T. Modeling s-t path availability to support disaster vulnerability assessment of network infrastructure. Comput. Oper. Res. 36 (2009): 16.
[60] Meila, M. and Pentney, W. Clustering by Weighted Cuts in Directed Graphs. Proceedings of the SIAM Conference on Data Mining. 2007.
[61] Mhatre, V. and Rosenberg, C. Homogeneous vs heterogeneous clustered sensor networks: a comparative study. IEEE ICC (2004).
[62] Mislove, A., Marcon, M., Gummadi, Krishna P., Druschel, P., and Bhattacharjee, B. Measurement and Analysis of Online Social Networks. IMC '07. San Diego, CA, 2007.
[63] Mohar and Poljak. Eigenvalue in combinatorial optimization. Combinatorial and Graph-Theoretical Problems in Linear Algebra (1992).
[64] Murray, A., Matisziw, T., and Grubesic, T. Multimethodological approaches to network vulnerability analysis. Growth Change (2008).
[65] Neumayer, S., Zussman, G., Cohen, R., and Modiano, E. Assessing the Vulnerability of the Fiber Infrastructure to Disasters. Proc. of IEEE INFOCOM. 2009.

[66] Neumayer, Sebastian, Zussman, Gil, Cohen, Reuven, and Modiano, Eytan. Assessing the Vulnerability of the Fiber Infrastructure to Disasters. IEEE/ACM Trans. Netw. (2011): 1610.
[67] Ng, Andrew Y., Jordan, M. I., and Weiss, Y. On spectral clustering: Analysis and an algorithm. NIPS 14 (2001). 14: 849-856.
[68] Page, L., Brin, S., Motwani, R., and Winograd, T. The PageRank Citation Ranking: Bringing Order to the Web. Tech. rep., Stanford InfoLab, 1999.
[69] Peleg, D. Local Majority Voting, Small Coalitions and Controlling Monopolies in Graphs: A Review. SIROCCO '96: Colloquium on Structural Information and Communication Complexity. 1996, 152.
[70] Pinar, A., Meza, J., Donde, V., and Lesieutre, B. Optimization Strategies for the Vulnerability Analysis of the Electric Power Grid. SIAM J. on Optimization 20 (2010).
[71] Sen, A., Murthy, S., and Banerjee, S. Region-based connectivity - a new paradigm for design of fault-tolerant networks. HPSR. 2009.
[72] Shapiro, A., Dentcheva, D., and Ruszczynski, A. P. Lectures on Stochastic Programming: Modeling and Theory. MPS-SIAM Series on Optimization Series. Society for Industrial and Applied Mathematics (SIAM, 3600 Market Street, Floor 6, Philadelphia, PA 19104), 2009.
[73] Shen, Siqian and Smith, J. Cole. Polynomial-time algorithms for solving a class of critical node problems on trees and series-parallel graphs. Netw. 60 (2012). 2: 103. URL http://dx.doi.org/10.1002/net.20464
[74] Shi, J. and Malik, J. Normalized cuts and image segmentation. IEEE Trans. Patt. Anal. Mach. Intell. 22 (2000). 8: 888.
[75] Sinclair, A. and Jerrum, M. Approximate counting, uniform generation and rapidly mixing Markov chains. Inf. Comput. 82 (1989). 1: 93. URL http://dx.doi.org/10.1016/0890-5401(89)90067-9
[76] Stoer, M. and Wagner, F. A simple min-cut algorithm. J. of ACM 44 (1997). 4: 585.
[77] Suh, Y. J., Kim, D. J., Lim, W. S., and Baek, J. Y. Method for supporting quality of service in heterogeneous networks. 2009.
[78] Sun, Fangting and Shayman, Mark A. On pairwise connectivity of wireless multihop networks. International Journal of Security and Networks 2 (2007). 1/2.

[79] Trevisan, L. Non-approximability results for optimization problems on bounded degree instances. ACM Symposium on Theory of Computing '01. New York, NY, USA: ACM, 2001, 453.
[80] Valiant, L. The Complexity of Enumeration and Reliability Problems. SIAM Journal on Computing 8 (1979). 3: 410. URL http://epubs.siam.org/doi/abs/10.1137/0208032
[81] Viswanath, B., Mislove, A., Cha, M., and Gummadi, K. P. On the Evolution of User Interaction in Facebook. WOSN '09. 2009.
[82] Wagner, D. and Wagner, F. Between Min Cut and Graph Bisection. MFCS. London, UK: Springer-Verlag, 1993, 744.
[83] Wang, F., Camacho, E., and Xu, K. Positive Influence Dominating Set in Online Social Networks. Proceedings of the 3rd International Conference on Combinatorial Optimization and Applications. COCOA '09. Berlin, Heidelberg: Springer-Verlag, 2009, 313. URL http://dx.doi.org/10.1007/978-3-642-02026-1_29
[84] Watts, D. J. and Strogatz, S. H. Collective dynamics of 'small-world' networks. Nature 393 (1998). 6684.
[85] White, S. and Smyth, P. A Spectral Clustering Approach To Finding Communities in Graph. SDM. 2005.
[86] Woo, Gordon. Intelligence Constraints on Terrorist Network Plots. Mathematical Methods in Counterterrorism. eds. Nasrullah Memon, Jonathan David Farley, David L. Hicks, and Torben Rosenorn. Springer Vienna, 2009. 205. URL http://dx.doi.org/10.1007/978-3-211-09442-6_12
[87] Z., Feng, Z., Zhao, and W., Weili. Latency-Bounded Minimum Influential Node Selection in Social Networks. Wireless Algorithms, Systems, and Applications. eds. Benyuan Liu, Azer Bestavros, Ding-Zhu Du, and Jie Wang, Lecture Notes in Computer Science. 2009, 519. URL http://dx.doi.org/10.1007/978-3-642-03417-6_51
[88] Zhu, X., Yu, J., Lee, W., Kim, D., Shan, S., and Du, D.-Z. New dominating sets in social networks. Journal of Global Optimization 48 (2010): 633. 10.1007/s10898-009-9511-2. URL http://dx.doi.org/10.1007/s10898-009-9511-2

BIOGRAPHICAL SKETCH

Thang N. Dinh received his B.S. degree in Information Technology from Vietnam National University (2007). His research focuses on developing models, theories, and efficient algorithms for fundamental complex network problems such as community structure, information diffusion, as well as privacy and security in social networks. He has published 3 book chapters and 20+ articles in prestigious journals and conferences such as IEEE/ACM ToN, IEEE TMC, MOBICOM, INFOCOM, and CIKM. He has served as a publicity chair of the CSoNet 2013 workshop, a committee member of the ASE/IEEE SocialCom conference, and a reviewer for several journals including Journal of Combinatorial Optimization, Optimization: A J. of MP & OR., and Social Network Analysis and Mining. He is a recipient of many awards, including a Bronze Medal in the International Olympiad in Informatics, a Microsoft Scholarship, a Distinguished Academic Achievement Award (VNU), an Alumni Fellowship Award (UF), and an Outstanding International Student Award, College of Engineering.