Fault Tolerance and Scalability of Data Aggregation in Sensor Networks

Permanent Link: http://ufdc.ufl.edu/UFE0022108/00001

Material Information

Title: Fault Tolerance and Scalability of Data Aggregation in Sensor Networks
Physical Description: 1 online resource (115 p.)
Language: english
Creator: Chitnis, Laukik
Publisher: University of Florida
Place of Publication: Gainesville, Fla.
Publication Date: 2008

Subjects

Subjects / Keywords: aggregation, algorithms, analysis, data, fault, model, networks, scalability, sensor, tolerance
Computer and Information Science and Engineering -- Dissertations, Academic -- UF
Genre: Computer Engineering thesis, Ph.D.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, territorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract: Sensor networks are finding significant applications in large scale distributed systems. One of the basic operations in sensor networks is data aggregation. Among the various approaches to in-network aggregation, such as gossip and tree, including the hash-based techniques, the tree-based approaches have better performance and energy-saving characteristics. However, sensor networks are highly prone to failures. A few techniques are suggested in literature to counteract the effect of failures, but have not been carefully analyzed for fault tolerance and scalability. Our work is geared toward analyzing the fault tolerance of such aggregation algorithms, and proposing scalable techniques for data aggregation. To this end, we make the following distinct contributions. First, we propose a simple fault model for analysis of various aggregation techniques. We then utilize this fault model to analyze the fault tolerance of existing techniques such as tree and gossip aggregation, and also weigh the performance gain of various techniques suggested to improve fault tolerance, such as multiple trees and local fixes. We also do the cost-benefit analysis of using the hash-based schemes which are based on FM sketches. Then, we propose a novel hybrid aggregation technique that combines the best characteristics of tree and gossip aggregation and improves fault-tolerance and scalability. Finally, we analyze the impact of using a few of the more reliable, though expensive, nodes -- such as the Intel XScale based Stargates--called microservers, in addition to the standard motes, on the fault tolerance and scalability of the aggregation algorithms in sensor networks. We show that our work can be effectively used as a design tool for maximizing fault tolerance while designing scalable aggregation algorithms for sensor networks.
General Note: In the series University of Florida Digital Collections.
General Note: Includes vita.
Bibliography: Includes bibliographical references.
Source of Description: Description based on online resource; title from PDF title page.
Source of Description: This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility: by Laukik Chitnis.
Thesis: Thesis (Ph.D.)--University of Florida, 2008.
Local: Adviser: Ranka, Sanjay.
Local: Co-adviser: Dobra, Alin.
Electronic Access: RESTRICTED TO UF STUDENTS, STAFF, FACULTY, AND ON-CAMPUS USE UNTIL 2009-08-31

Record Information

Source Institution: UFRGP
Rights Management: Applicable rights reserved.
Classification: lcc - LD1780 2008
System ID: UFE0022108:00001


Full Text

PAGE 1

FAULT TOLERANCE AND SCALABILITY OF DATA AGGREGATION IN SENSOR NETWORKS

By

LAUKIK VILAS CHITNIS

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2008

PAGE 2

© 2008 Laukik Vilas Chitnis

PAGE 3

To my parents and teachers

PAGE 4

ACKNOWLEDGMENTS

First of all, let me express my sincere gratitude toward my advisers, Dr. Sanjay Ranka and Dr. Alin Dobra, for guiding me in this endeavor called doctoral research. I thank them for always nudging me in the right direction; beginning with helping me define my area of research. I would like to thank Dr. Dobra for introducing me to the interesting problem of data aggregation in highly distributed systems; his enthusiasm toward solving interesting problems will always be a source of inspiration for me. It has been a pleasure working with Dr. Ranka. Apart from this research, we have also worked together on problems related to scheduling in grid computing, and on developing grid tutorial for enabling e-science. I have always appreciated his management and collaboration skills, and his ability to provide key insights. I could always look up to him for genuine advice. I sincerely thank him with utmost gratitude for his guidance and support throughout my years as a Ph.D. student here at the University of Florida.

Thanks should also go to Dr. Christopher Jermaine and Dr. Sartaj Sahni for serving on my committee. I would like to thank Dr. Jermaine for his constructive comments and suggestions. Among my lasting memories about the CISE department would be the insightful, and entertaining, debates between Dr. Jermaine, Dr. Dobra and Dr. Kahveci at the series of Database seminars. I would also like to thank Dr. Sahni, who, as the department Chair of CISE, has always been very supportive of graduate students and has been instrumental in the ascent of UF as a quality destination for computer science research.

I would also like to thank Dr. Paul Avery, who has not only served as the external member on my committee, but has also led the various grid computing initiatives I have worked on. It was a pleasure working with Dr. Avery. I would also like to acknowledge Dr. Richard Cavanaugh, Dr. Jorge Rodriguez, Jang-uk In, Mandar Kulkarni and Pradeep Padala for our collaboration toward demonstrations and presentations of our work at various grid conferences and workshops. I would also like to thank Dr. Mike Wilde with

PAGE 5

whom I have worked on multiple occasions over the last few years organizing grid summer schools. It was also a pleasure leading a discussion section of a class and assisting Ms. Lola Haskins with her teaching responsibilities.

I would also like to thank my colleagues in the department (Shantanu Joshi, Manas Somaiya, Abhijit Pol, Parbati Manna, Subi Arumugam, Karthik SG, Srijit Kamath, Jun Liu, Xiuyao Song, Mingxi Wu, Seema Degwekar, Amit Dhurandhar) for making it a stimulating environment to work.

And of course, this work would not have been possible without the unflagging support of my parents, Padmaja and Vilas Chitnis, and my sister, Dr. Sonal Chitnis. They have always trusted in me and encouraged me to pursue my goals. It gives me immense happiness in fulfilling this promise.

PAGE 6

TABLE OF CONTENTS

(entry .... page)

ACKNOWLEDGMENTS .... 4
LIST OF TABLES .... 8
LIST OF FIGURES .... 9
ABSTRACT .... 11

CHAPTER

1 INTRODUCTION .... 13
  1.1 Sensor Networks and Aggregation .... 13
  1.2 Importance of Fault Tolerance .... 19
  1.3 Contributions .... 21
2 RELATED WORK .... 22
  2.1 Sensor Networks .... 22
  2.2 Tree Aggregation .... 25
  2.3 Gossip Aggregation .... 25
  2.4 Heterogeneous Sensor Networks .... 27
  2.5 Fault Tolerance and Scalability of Sensor Networks .... 28
  2.6 Research Opportunity .... 30
3 FAULT MODEL .... 32
  3.1 Motivation .... 32
  3.2 Sensor Network Aggregation under Faults .... 32
    3.2.1 A Simple Fault Model .... 33
    3.2.2 Correctness under Faults .... 34
    3.2.3 Tree Aggregation under Faults .... 36
    3.2.4 Gossip Aggregation under Faults .... 40
      3.2.4.1 Push-Sum protocol .... 40
      3.2.4.2 Distributed random grouping (DRG) algorithm .... 40
  3.3 Contributions .... 41
4 IMPROVING FAULT TOLERANCE OF TREE AGGREGATION .... 42
  4.1 Motivation .... 42
  4.2 Multiple Trees .... 42
  4.3 Local Fixes .... 45
  4.4 Experiments and Results .... 46
    4.4.1 Global Restart .... 47
    4.4.2 Local Fixes .... 49
    4.4.3 Multiple Trees .... 50

PAGE 7

  4.5 Sketch-Based Schemes .... 55
  4.6 Contributions .... 57
5 HYBRID AGGREGATION .... 58
  5.1 Motivation .... 58
  5.2 Proposed Hybrid Aggregation Methodology .... 59
  5.3 Empirical Evaluation of the Hybrid Protocol .... 63
    5.3.1 Experimental Setup .... 63
    5.3.2 Experimental Results .... 66
      5.3.2.1 Local gossip communication .... 67
      5.3.2.2 Global gossip communication .... 69
      5.3.2.3 Limited global communication .... 71
  5.4 Variations to the Basic Model and Methodology .... 71
    5.4.1 Model using Fault-Prone Multi-Hop Routing .... 71
    5.4.2 Robustness and Sensitivity Testing .... 76
      5.4.2.1 Sensitivity to estimation of failure rate .... 76
      5.4.2.2 Robustness to correlated failures .... 79
    5.4.3 Seeded Clustering .... 80
  5.5 Discussion .... 81
  5.6 Contributions .... 82
6 AGGREGATION IN HETEROGENEOUS SENSOR NETWORKS .... 84
  6.1 Motivation .... 84
  6.2 Modeling Fault Tolerant Aggregation in Heterogeneous Sensor Network .... 85
    6.2.1 Theoretical Model .... 87
    6.2.2 Design Decisions using Theoretical Model .... 90
  6.3 Experiments .... 92
    6.3.1 Setup .... 93
    6.3.2 Verification of Theoretical Model .... 94
    6.3.3 Improvement in Fault Tolerance .... 97
    6.3.4 Maximum Gain and Optimal Number of Microservers .... 99
  6.4 Hybrid Aggregation with Heterogeneous Nodes .... 101
  6.5 Contributions .... 105
7 CONCLUSION .... 106

REFERENCES .... 107
BIOGRAPHICAL SKETCH .... 115

PAGE 8

LIST OF TABLES

(table .... page)

4-1 Standard error of counting sketches for several values of m, the number of bitmap vectors, as reported in [28] .... 56
6-1 Optimal number of microservers as predicted by the theoretical model and as observed from the simulation results. Here, N = 16384 and f = 16. .... 97

PAGE 9

LIST OF FIGURES

(figure .... page)

1-1 Tree-based aggregation .... 17
1-2 Gossip aggregation .... 18
4-1 Theoretical prediction of improvement in fault-tolerance using k local fixes for sensor network of size N = 8192 .... 47
4-2 Time to completion in rounds plotted against the probability of failure when using the global restart technique .... 48
4-3 Time to completion plotted against the probability of failure when the maximum number of local fixes allowed is k = 0, 2, 8 and 16 .... 49
4-4 Levelized multitree structure .... 51
4-5 Ratio of the variance of the count aggregate using a levelized multitree having 3 parents to the variance of a 3-ary single tree .... 52
4-6 Ratio of the variances of the count for an arbitrary multitree involving 2 trees to the variance of a single tree .... 52
4-7 Mean and standard deviation of the count aggregate as a function of the probability of node failure for a sensor network of size N = 8192 .... 54
5-1 Hybrid protocol. .... 59
5-2 Comparison of performance of tree and gossip aggregation .... 60
5-3 Frequency distribution of the time to completion of gossip protocol for a group of size n = 4096 and p = 0.00065536 .... 62
5-4 Comparison of the message counts when using local communication for gossip. .... 68
5-5 Comparison of the time to completion when using local communication for gossip. .... 68
5-6 Comparison of the message counts for global gossip communication. .... 70
5-7 Comparison of the time to completion for global gossip communication. .... 70
5-8 Comparison of the message counts when using limited global communication for gossip. .... 72
5-9 Comparison of the time to completion when using limited global communication for gossip. .... 72
5-10 Gain of the hybrid protocol in presence of fault-prone multi-hop routing. The gossip phase of the hybrid protocol here uses local communication. .... 75

PAGE 10

5-11 Sensitivity testing: Gain of the hybrid protocol when the estimated p varies from the actual p. .... 77
5-12 Completion rate of sensitivity experiments. .... 78
5-13 Gain of the hybrid protocol in presence of spatially correlated failures. The gossip phase of the hybrid protocol here uses local communication. The performance is still good in spite of the presence of spatially correlated failures. .... 79
5-14 Gain of the hybrid protocol using seeded clustering. The gossip phase here uses global communication. Similar results are obtained when local or limited global communication is used in the gossip phase. .... 81
6-1 Tree aggregation in heterogeneous sensor network .... 86
6-2 Comparing the expected maximum number of subtree restarts as predicted by the theoretical model and as observed in simulation results. The behavioral trend of the number of restarts is accurately predicted by the theoretical model. Here, N = 16384 and f = 16 .... 95
6-3 Behavioral trends in the completion times as observed in simulation experiments are similar to the theoretical predictions. Here, N = 16384 and f = 16. The factors of 4 in the reporting of number of microservers n in the plots is an artifact of the analysis being geared toward comparison with the quad-tree simulation, and is not a limitation of the theoretical model. .... 96
6-4 Sustainable probability of failure versus the number of microservers. The sustainable p does not always increase with more number of microservers. .... 98
6-5 Gain using heterogeneous tree aggregation over homogeneous tree aggregation. The size of the sensor network here is N = 65536 and the reliability factor of microservers is f = 16 .... 99
6-6 Time versus number of microservers for different reliability factors. N = 1048576 and p = 0.01048576 .... 100
6-7 Comparing the expected time to completion as predicted by the theoretical model and as observed in simulation results. The behavioral trend of the time to completion of the hybrid aggregation with heterogeneous nodes is accurately predicted by the theoretical model. Here, N = 4096 and f = 16 and p = 0.16 .... 103
6-8 Hybrid aggregation versus tree aggregation in heterogeneous sensor networks .... 104

PAGE 11

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

FAULT TOLERANCE AND SCALABILITY OF DATA AGGREGATION IN SENSOR NETWORKS

By

Laukik Vilas Chitnis

May 2008

Chair: Sanjay Ranka
Cochair: Alin Dobra
Major: Computer Engineering

Sensor networks are finding significant applications in large scale distributed systems. One of the basic operations in sensor networks is data aggregation. Among the various approaches to in-network aggregation, such as gossip and tree, including the hash-based techniques, the tree-based approaches have better performance and energy-saving characteristics. However, sensor networks are highly prone to failures. A few techniques are suggested in literature to counteract the effect of failures, but have not been carefully analyzed for fault tolerance and scalability. Our work is geared toward analyzing the fault tolerance of such aggregation algorithms, and proposing scalable techniques for data aggregation. To this end, we make the following distinct contributions. First, we propose a simple fault model for analysis of various aggregation techniques. We then utilize this fault model to analyze the fault tolerance of existing techniques such as tree and gossip aggregation, and also weigh the performance gain of various techniques suggested to improve fault tolerance, such as multiple trees and local fixes. We also do the cost-benefit analysis of using the hash-based schemes which are based on FM sketches. Then, we propose a novel hybrid aggregation technique that combines the best characteristics of tree and gossip aggregation and improves fault-tolerance and scalability. Finally, we analyze the impact of using a few of the more reliable, though expensive, nodes (such as the Intel XScale), called

PAGE 12

microservers, in addition to the standard motes, on the fault tolerance and scalability of the aggregation algorithms in sensor networks. We show that our work can be effectively used as a design tool for maximizing fault tolerance while designing scalable aggregation algorithms for sensor networks.

PAGE 13

CHAPTER 1
INTRODUCTION

A wireless sensor network consists of untethered sensor nodes forming an ad-hoc wireless network to co-operatively perform sensing, and maybe even actuating, operations [1][94][16]. Modern sensors can sense and measure odors, vibration, acceleration, pressure, temperature and other physical quantities; all this information is useful in gathering data about our surrounding world. Today, sensor networks find application in various areas ranging from environment monitoring [10][81][61][86] to battlefield surveillance [1][39][11].

Improving processor performance has made it possible to manufacture and embed small but sophisticated chips at considerably low costs. Silicon scaling has also drastically reduced the cost of communication for both wired and wireless systems. The small size and low cost of the components allow deployment of networks of a very large number of such devices in environments that are otherwise out of reach of human beings. Technologies like smart buildings, ingestible health monitors, smart dust, etc., which are now available in experimental settings, will become commonplace [90][61]. We envision that sensor networks consisting of a hundred thousand to a million sensors will not be uncommon in these settings and will be indispensable, for example, for large scale scientific and civilian applications.

1.1 Sensor Networks and Aggregation

The promise of sensor networks is to enable spatial and temporal data collection at a rate much higher than currently prevalent. The various queries for information processing in sensor networks can be roughly categorized as point queries [60][91] and aggregate queries [57][49]. Point queries usually involve routing the query from the point of query injection in the sensor network to the sensor node of interest and then efficiently routing the results back. Finding a set of values at a particular sensor node is an example of a point query. Aggregate queries are relatively more complicated as they involve all the sensor nodes or a subset of sensor nodes for information processing and hence are

PAGE 14

more challenging to execute in a fault-prone environment. Aggregation of data can be used for consolidation of redundant data sensed at nodes deployed in excess for fault tolerance. However, data aggregation, in general, is a wide class of operations involving global consolidation of data which may not necessarily be correlated. Simple examples of data aggregation include calculating the maximum pressure in an experiment chamber, or calculating the average temperature of a surface [61][8][1]. More complex examples of data aggregation include finding a uniformly random set of data observations or quantiles over a distributed set of data streams [49][18]. However, the solutions for simple aggregation operations can be extended to complex operations that have underlying associative and commutative properties.

Sample applications. Sensor networks are finding applications in wide and emerging domains ranging from battlefield surveillance to urban monitoring. With the advancement in silicon technology, smaller and cheaper sensor nodes would be available for deployment on a larger scale. In this section, we identify some domains which would need aggregation methods that scale well, and methodologies which can be applied in deployments of sensor networks in a wide variety of environmental conditions.

1. We envisage that the ability to extract a quick global view of the system using fault-tolerant aggregation operations would enable applications which are today considered too hazardous to even attempt due to extreme conditions. For example, modeling of an active volcano or a hurricane can be made possible by just dispersing a set of sensors in an active volcano, or a hurricane. Since the sensor nodes might be alive for a very short duration due to the extreme conditions, the rate of failure would be extremely high. Techniques such as spatial sampling can then be utilized to build a model; however, in-network processing would be crucial for aggregating the data in a scalable and fault tolerant manner.

2. The range of large scale applications such as EarthScope [21] and Quake-catcher [70] can be further extended with the deployment of sensor motes. This would create an infrastructure of hundreds of thousands to millions of sensor nodes reporting data that would need to be aggregated in order to observe trends or draw high-level conclusions about, say, oceanic monitoring, or early warning systems for earthquakes. The ExScal project [4], for example, seeks to deploy testbeds of thousands of sensor motes, and hundreds of the more capable microserver nodes.
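
The earlier remark that simple aggregation extends to operations with associative and commutative structure can be illustrated with the average. A plain average of averages is not order-independent, but a (sum, count) pair is. The Python below is only a sketch under that assumption; the names to_partial, merge and finalize are hypothetical and are not taken from this dissertation.

```python
from functools import reduce

def to_partial(reading):
    """Wrap one sensor reading as a partial aggregate (sum, count)."""
    return (reading, 1)

def merge(a, b):
    """Combine two partial aggregates; associative and commutative."""
    return (a[0] + b[0], a[1] + b[1])

def finalize(partial):
    """Turn the merged (sum, count) pair into the average."""
    total, count = partial
    return total / count

def average(readings):
    """Average obtained by merging partial aggregates left to right."""
    return finalize(reduce(merge, map(to_partial, readings)))
```

Because merge is associative and commutative, partial aggregates can be combined in whatever order and grouping the network delivers them, which is what lets intermediate nodes aggregate on behalf of the nodes behind them.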

PAGE 15

Data processing in sensor networks has been approached akin to problems in distributed databases [91]. Declarative query processing systems have been designed [60]. As a result, SQL-like queries can now be issued to extract the required information. For example, consider a network of wireless sensors deployed for monitoring in a building [72][95]. A simplified version of an aggregation query to detect elevated temperatures in the building can be a simple MAX query as follows:

    SELECT MAX(temperature) FROM sensors

Aggregation itself is an important operation in large scale networks due to the following reasons:

1. The deployment of sensor networks is often ad-hoc. As a result, more than the actual node id of the sensor, other properties such as current location, or other event or content based properties become important. Thus, content-based addressing and data-centric processing are often utilized in ad-hoc wireless sensor networks [45][60]. For example, a query for finding the average temperature in areas getting direct sunlight can be framed as follows:

    SELECT AVG(temperature) FROM sensors WHERE light > threshold

Aggregation thus becomes an important tool in data processing in such scenarios. Another example of an event-based query, as it appears in [60], for recording the average light and temperature near a bird-sighting is quoted below:

    ON EVENT bird-detect(loc):
    SELECT AVG(light), AVG(temp), event.loc
    FROM sensors AS s
    WHERE dist(s.loc, event.loc) < 10m

2. Typically, sensor nodes are deployed in excess for redundancy. Aggregation would be required to consolidate such data. Aggregation is also utilized to weed out any faulty readings or Byzantine failures.

3. In some cases, the field range of an application can be subdivided into smaller groups of sensors for manageability. Such situations can also be analyzed in our proposed framework, as discussed later under heterogeneous sensor networks.

PAGE 16

4. In general, aggregation offers a global view of the entire field range in which the network is deployed. For example, the distribution function of the values can be obtained by uniform random sampling using these aggregation techniques.

The main challenge in designing algorithms/protocols for distributed computation of aggregation in sensor networks is to keep the resource utilization to the minimum by reducing the communication among nodes and the computation performed. The various techniques for data aggregation can be broadly classified as follows:

Centralized processing. The most straightforward solution involves transmitting the values collected at each sensor node directly to a centralized processing unit. The aggregation operation can then be easily performed using, for example, the current database technology. The main drawback of this approach is the sheer amount of communication required, since each piece of information has to make its way to the central location. This translates immediately into large processing times (which rules out time critical applications) and large power consumption, since each node has to use a strong signal for the wireless transmission if each message is to be delivered to the central base station, also known as the root node¹, in a single hop. Even if multiple hops are used and the message is routed through a number of nodes from the source nodes to the root node, the sheer number of messages would be very large. Also, since the nodes nearer to the root node would be required to route more messages than the nodes farther away, the expenditure of energy across the sensor network is not uniform. This creation of hot-spots in the network eventually leads to the formation of disconnected components in the network graph, rendering the sensor network ineffective. Thus, uniform utilization of energy across the sensor nodes is an important issue.

¹ since it is often the node at which the query is injected into the network
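
The hot-spot argument above can be made concrete by counting transmissions. The Python below is a minimal sketch, assuming a complete tree with a fixed fanout, lossless links, and one reading per node; the function names and the tree layout are illustrative assumptions, not part of the dissertation. It counts per-node transmissions when every raw reading is relayed hop by hop to the root, and compares that with every non-root node sending a single partial aggregate to its parent.

```python
def complete_tree_parents(n, fanout):
    """parent[i] for a complete fanout-ary tree on nodes 0..n-1 rooted at 0."""
    return [None] + [(i - 1) // fanout for i in range(1, n)]

def centralized_transmissions(parent):
    """Per-node transmissions when each raw reading is relayed to the root."""
    sent = [0] * len(parent)
    for source in range(1, len(parent)):
        hop = source
        while hop != 0:          # every node on the path forwards once
            sent[hop] += 1
            hop = parent[hop]
    return sent

def in_network_transmissions(parent):
    """Per-node transmissions when each node sends one partial aggregate."""
    return [0] + [1] * (len(parent) - 1)
```

For a 15-node binary tree under these assumptions, the relayed scheme makes each child of the root transmit once for every node in its subtree, while the aggregated scheme spreads the load evenly at one transmission per non-root node.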

PAGE 17

Thus, the only hope to significantly reduce the resource utilization for aggregate computation is in-network aggregation, i.e., performing partial aggregation as the information is transmitted.

Tree aggregation. The main idea here, depicted in Figure 1-1, is to organize all the sensor nodes into an aggregation tree [57]. The root of the tree is the node where the query is injected and where the aggregation result is retrieved. The query/request is propagated from parent to children. The leaf nodes send the value of the measurement to the parent. Intermediate nodes wait for values from the children, do a local aggregation of these values and their own measurement, and send the aggregate to the parent. The root node computes in the same manner as any intermediate node and presents the value of the overall aggregate to the user. The algorithm requires a spanning tree among the sensor nodes [57][58][7][36]. The efficiency of the algorithm depends on the longest route from the root to a leaf, i.e., how balanced this tree is.

[Figure 1-1. Tree-based aggregation. Figure labels: query, aggregate.]

Gossip-based aggregation. These techniques are based on randomized algorithms that proceed in rounds until all the nodes in the group converge to a single value, or a single set of values from which the aggregate can be calculated. In each round, each node contacts some of its neighbors (either physical neighbors or, in case routing messages through the network is possible, any other nodes) and exchanges information with these

PAGE 18

nodes. We discuss two variations of this protocol proposed in the literature.

Algorithm: Protocol Push-Sum
1: Let $\{(\hat{s}_r, \hat{w}_r)\}$ be all pairs sent to $i$ in round $t-1$
2: Let $s_{t,i} := \sum_r \hat{s}_r$, $w_{t,i} := \sum_r \hat{w}_r$
3: Choose a target $f_t(i)$ uniformly at random
4: Send the pair $(\tfrac{1}{2} s_{t,i}, \tfrac{1}{2} w_{t,i})$ to $f_t(i)$ and $i$ (yourself)
5: $s_{t,i} / w_{t,i}$ is the estimate of the average in step $t$

Figure 1-2. Gossip aggregation

In the Push-Sum protocol [49], depicted in Figure 1-2, each node maintains two quantities: a weight and the aggregate. In each round, each node randomly chooses a node in the network and sends this node half the weight and half the value of the aggregate (i.e., the sending node halves its own weight and aggregate and the receiving node adds these quantities to its own). If every node can contact any other node uniformly at random, the protocol will compute the aggregate in time proportional to the logarithm of the number of nodes.

The Distributed Random Grouping protocol [12] proceeds by randomly selecting a small set of broadcasting nodes. Each such node will send a broadcast message to all its immediate neighbors and get from them their current value of the aggregate. Then, the new local aggregate value is computed and disseminated to all the nodes that responded to the broadcast. The convergence of this protocol depends on the algebraic connectivity [26] of the graph and it is usually slightly better than the convergence of the Push-Sum protocol.

We now broadly discuss some of the important issues in sensor networks that significantly affect data aggregation across the sensor network. First, we discuss why fault tolerance should be an important consideration while designing algorithms for sensor networks. Then we discuss how the size of the sensor network would affect the fault tolerance of aggregation algorithms in sensor networks, and why scalability is an issue.
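
The Push-Sum protocol of Figure 1-2 can be simulated in a few lines. The Python below is a sketch under simplifying assumptions that are not part of the dissertation: rounds are synchronous, no messages are lost, every node can reach every other node, and each node starts with weight 1 so that the ratio estimates the average. The function name push_sum and its parameters are hypothetical.

```python
import random

def push_sum(values, rounds, seed=0):
    """Simulate synchronous, failure-free Push-Sum; return per-node estimates."""
    rng = random.Random(seed)
    n = len(values)
    s = [float(v) for v in values]   # running sums
    w = [1.0] * n                    # running weights
    for _ in range(rounds):
        next_s = [0.0] * n
        next_w = [0.0] * n
        for i in range(n):
            target = rng.randrange(n)            # uniformly random target
            half_s, half_w = s[i] / 2.0, w[i] / 2.0
            next_s[i] += half_s                  # half stays with the sender
            next_w[i] += half_w
            next_s[target] += half_s             # half goes to the target
            next_w[target] += half_w
        s, w = next_s, next_w
    # Each node keeps half of its own weight every round, so w[i] > 0.
    return [si / wi for si, wi in zip(s, w)]
```

The total sum and the total weight are conserved in every round, which is why each local ratio approaches the global average. A lost message would remove mass from the system, which is one reason the behavior of gossip under faults needs separate analysis.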

PAGE 19

1.2 Importance of Fault Tolerance

The presence of failures is a common aspect of applications in sensor networks. The amount of failures varies over a wide range [35][6][95][22], depending, primarily, on the environment and the size and cost of the hardware used. This is due to the following reasons:

1. The applications of sensor networks often require the nodes to be deployed in uncontrolled environments. In fact, with applications usually ranging from habitat monitoring, coastal and hurricane monitoring to sewage control and deployment in battlefields, the environment is usually hostile [10][81][95][61][89]. This results in impairments of sensor nodes, and thus in a higher probability of failure.

2. The signal-to-noise ratio is typically poor. This results in communication failures and transient impairments.

3. Many protocols in sensor networks call for powering down of some sensor nodes at regular intervals [51]. This is done to save energy and lengthen the life of sensor networks. However, from the perspective of an algorithm triggered when some nodes are powered down, say, a dynamic query request, these powered-down sensor nodes are as bad as failed nodes [51].

4. In addition to one or more sensors, for monitoring say temperature, pressure or motion, each sensor node typically consists of a wireless communication device, such as a radio transceiver, and a microcontroller, all powered by a battery. The size of each sensor node, though, may be as small as a grain of dust [87]. This, coupled with the need to keep the cost per sensor node down, results in component failures [51].

5. Large scale applications involving sensor networks usually call for thousands of untethered nodes spread across the system. To constrain the overall cost, as a design decision, the sensor nodes typically consist of inexpensive hardware. Also, the energy source of sensor nodes is typically non-rechargeable. The idea is to improve the overall fault tolerance of the system using redundancy, while keeping the system cost down. However, these factors make the sensor nodes more fault-prone.

Thus, fault-tolerance should be an important characteristic of algorithms designed for sensor networks. In fact, in case of operations such as aggregation, neglecting failures would lead to incorrectness with no guaranteed bounds. For example, in the case of building monitoring introduced earlier, assume that a tree-based algorithm is used for aggregation. The connection between a pair of parent-child nodes might be temporarily disconnected due to an obstruction, say, a closed door or the presence of a janitor's
cleaning cart. There could be numerous other reasons for such failures [95][35][6][51], and the probability of failure might be spread over a wide range depending on the application, as discussed in more detail in Section 2.5. However, if this failure is ignored, and if the actual maximum temperature is recorded at one of the nodes in the subtree below this point of failure, the result returned as the answer to that query will be incorrect.

Fault Tolerance and Scalability. The need for fault-tolerance becomes more pronounced as the size of the sensor network increases. Sensor networks are being deployed for urban monitoring, battlefield surveillance and habitat monitoring [10][81][95][61][89]. Similar to the case of communication disruption in the building monitoring application discussed above, temporary disruptions might be caused by events such as a car parked in an unexpected location in an urban monitoring application, an army tank disrupting communication in battlefield surveillance, or an animal obstructing communication in habitat monitoring applications. As the scale increases, the probability of at least one such failure occurring increases [6]. Applications such as hurricane modeling and tracking are envisioned to utilize tens of thousands of such small sensor nodes. As a result, the probability of the presence of at least one failure in the network at any given time is large. As will be argued later too, fixing such failures does not really help in such cases, since the probability of another failure occurring in the time taken for fixing is very large. Thus, any algorithm for aggregation of data in sensor networks needs to be inherently fault-tolerant for it to be scalable. Hence, it becomes imperative to analyze the behavior of algorithms in faulty environments to better understand the implications and estimate their performance before they are actually deployed, especially in the case of large scale sensor networks. Such an analysis would be equally pertinent in cases where the failure rates are high, even though the size of the network is small.
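
The scaling argument above can be made concrete with a small calculation. The sketch below assumes, as the fault model of Chapter 3 does, that each of n nodes fails independently with the same probability p during the interval of interest. The function name and the sample values of p and n are illustrative and are not taken from the text.

```python
import math


def prob_at_least_one_failure(p: float, n: int) -> float:
    """Probability that at least one of n nodes fails.

    Assumes independent failures, each with probability p
    (an assumption made for illustration).
    """
    if p >= 1.0:
        return 1.0
    if p <= 0.0 or n <= 0:
        return 0.0
    # 1 - (1 - p)^n, evaluated with log1p/expm1 to stay accurate for tiny p.
    return -math.expm1(n * math.log1p(-p))


if __name__ == "__main__":
    p = 1e-4  # hypothetical per-node failure probability
    for n in (100, 1_000, 10_000, 100_000):
        print(f"n={n:>7}: P(at least one failure) = "
              f"{prob_at_least_one_failure(p, n):.4f}")
```

Under these assumptions the probability climbs quickly toward 1 as n grows, which is the qualitative point made in the paragraph above.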

1.3 Contributions

Given the current state of the research on aggregate computation in sensor networks, it is not apparent how the existing protocols and approaches can be adapted for such large networks, since no scientific study to date has considered networks of such large size. To this end, we propose to take the following steps:

1. Fault model: Model the behavior of aggregation trees to include the fault-prone nature of sensor networks. Also, establish a notion of correctness for aggregation algorithms in sensor networks.
2. Analysis of tree aggregation: Analyze the performance of the tree-based aggregation algorithms in the presence of faults. Also analyze the various fixes proposed in literature for tree aggregation, such as multiple trees and local fixes, and analyze the hash-based schemes in terms of their applicability to aggregation in sensor networks and their accuracy guarantees.
3. Hybrid aggregation: Design a methodology to integrate the best of the gossip and tree aggregation schemes to evolve a scalable, fault-tolerant hybrid aggregation scheme for homogeneous sensor networks.
4. Heterogeneous aggregation: With the availability of a few more reliable, though expensive, microservers to form a heterogeneous sensor network, analyze the impact on the fault-tolerance and scalability of aggregation schemes.

CHAPTER 2
RELATED WORK

In this chapter, we discuss the background work related to data aggregation in sensor networks. We review the literature on tree aggregation and gossip aggregation. We also review the state of the art in heterogeneous sensor networks.

2.1 Sensor Networks

The potential of embedded network computing, as envisaged by Weiser [88], has been recognized in the last decade [66][67][24][25][23]. With urban monitoring and battlefield surveillance evolving as killer applications, the field of wireless sensor networks gained attention. A lot of work has focused on the potential applications of sensor networks and the main challenges involved in the successful deployment of this technology. Resource utilization, fault tolerance and scalability have been identified as amongst the most important design factors [1][69][83].

Most of the work in wireless sensor networks has revolved around minimizing resource utilization [25][13][83][93]. The focus has so far been on getting the algorithms working on a small scale: most efforts focus on sensor networks of sizes ranging from tens to hundreds of nodes, with very few reporting results for sensor networks of more than a thousand nodes. However, recently, efforts are being focused on demonstrating capabilities for larger network sizes [4].

Another important factor is fault tolerance: the ability to sustain operations, preferably with some guaranteed quality of service, in the presence of faults [55][51]. Various aspects of sensor network functionality have been addressed from a fault-tolerance point of view in the literature: observability sensing [44], distributed event detection [52][56][14] and synchronization [80].

Directed diffusion [45] is a data-centric data dissemination paradigm for sensor networks. This is a content based scheme where data is named using attribute-value pairs. A query, called a sensing task, is disseminated throughout the sensor network as an interest for named data. This dissemination sets
up gradients within the network as the "interest" is propagated. The data matching the interests, called "event", starts flowing toward the originators of interests along multiple paths, ascending the gradient set up by the flow of interest. The originator then chooses to reinforce one, or a small number, of these paths. This enables robust dissemination of data in sensor networks. Greedy Perimeter Stateless Routing (GPSR) [48] is a fault-tolerant routing protocol for wireless sensor networks. Multipath routing [29] proposed the use of multipaths to enable energy efficient recovery from failures along a single path, and as an alternative to flooding. While solutions like these are geared toward fault tolerant routing and data dissemination, we concentrate on fault tolerant data aggregation algorithms. However, all such studies exemplify the importance of fault tolerance in sensor network applications.

TinyOS [43] is an event-driven operating system for sensor networks. It provides a set of components for managing the hardware with the nesC programming language. TOSSIM [54] is an emulator for sensor network applications. The TinyOS code can be emulated and tested here before being loaded onto the motes. PowerTOSSIM [76] extends TOSSIM to include various radio models; it is a sensor network emulator which focuses on power consumption models.

Query processing in sensor networks has been an area of active research [31][91][60][19][78][8]. COUGAR [91] proposed an architecture for a sensor data management system with a focus on in-network processing. It provides a query processor based interface to sensor networks, focusing on query optimization under limited resource constraints. Distributed query processing is discussed as a scalable alternative to the centralized approach in [9]. A model for sensor database systems was defined in [8], with the design and implementation of the COUGAR sensor database system. A simplified way of deploying data-collection applications using sensor networks, providing a simple, declarative query based interface, was described in [31]. COUGAR, in short, viewed sensor networks as a distributed database system and tried to achieve cross-layer optimization.

TinyDB [60][58][59] advocates acquisitional query processing: considering the locations and cost of acquiring the data to lower the power consumption of query processing in sensor networks. The resolution of SQL-like aggregate queries was demonstrated [59], using in-network aggregation for simple operators like COUNT, MIN, MAX and SUM and proposing to use the shared channel mode of communication in wireless sensor networks for adding fault-tolerance. In-network aggregation has been dealt with exclusively using spanning trees [57], the details of which are discussed in the following sections. TinyDB can be extended to support more sophisticated data analysis by applying it to sample applications like topographic mapping, wavelet-based compression, and vehicle tracking [40]. [41] discusses how database technologies can be applied to solve various problems in the domain of sensor networks.

The cost of data acquisition is also considered a major factor in model driven data acquisition [18]; Bayesian networks are used to learn the correlation between the different sensor types at each sensor node, and the learned model is used to answer queries without actively acquiring the data value as far as possible, depending on the confidence bounds specified in the query. Querying is scheduled among sensor nodes using heuristics for the Traveling Salesman Problem. However, the root node needs to know the entire topology, which makes the approach less scalable. Also, no in-network aggregation is used. An extended work exploited spatial correlation among sensor nodes as well [17].

Digest trees [96] are maintained to compute continually updated summaries of network properties such as loss rates and energy levels. These digests are themselves used to obtain more accurate estimates of the aggregates in the presence of failures. Similarly, q-digests [77] are computed that can be used for in-network processing of aggregation operations like medians, quantiles and frequent itemsets, apart from the usual MIN, MAX, SUM and COUNT. It is a sketch-based technique and requires large q-digests, of around 400 bytes, for controlling the error to within 2%. Also, this technique did not consider messages lost due to failures. The Push-Sum technique [49] and Distributed
Random Grouping [12] provide gossip-based randomized algorithms for computing various such aggregate operations, details of which are discussed in the following sections.

2.2 Tree Aggregation

Madden et al. [57] provided the first system implementation of aggregate computation in sensor networks, forming spanning trees and using in-network aggregation. They opted for simple solutions (this is highly desirable due to the limitations of the hardware used) with a focus on acceptable, if not guaranteed, solutions. While not providing a fault model, they proposed a number of methods to deal with faults: child caches (caching old values of aggregates at parents in case communication with the child cannot be established) and redundant trees (instead of sending the aggregate value to a single parent, the aggregate value is split between multiple parents). However, in the presence of faults, these do not guarantee robust correctness or any strict notion of correctness. While this solution works extremely well for small sensor networks (up to 1000 nodes), scalability to tens of thousands of sensor nodes is a problem, and so is fault-tolerance to high probabilities of failure.

A naive approach is to just neglect the effect of failures. However, in tree-based aggregation, neglecting the failure of an intermediate tree node would entail neglecting not only the value of the failed node in the aggregate calculation, but also the values of all the nodes in the subtree rooted at this failed node. Not surprisingly, such a naive strategy does not guarantee any form of correctness [7]. The other options discussed in literature to tackle failures in tree-based aggregation include maintaining multiple trees [57][15], fixing the trees locally before proceeding with aggregation [46][34] and using order-independent, duplicate-insensitive multimaps [64][15][62] based primarily on FM sketches [28].

2.3 Gossip Aggregation

This set of aggregation algorithms is based on randomized algorithms. We have described the Push-Sum protocol [49] in the previous chapter. Among the other
gossip-based aggregation techniques, Distributed Random Grouping [12] is a variation of the Push-Sum protocol [49] that, in the case when wireless broadcast is available, performs slightly better (…0% in most cases). This algorithm essentially has the same properties as Push-Sum does; thus it is too slow when compared to tree aggregation for small failure probabilities. Generally speaking, this algorithm can be used as a drop-in replacement for the Push-Sum protocol for gossip-based aggregation and would result in slightly different, but qualitatively the same, behavior.

An alternative to the gossip protocols for fault resilient aggregate computation is proposed for peer-to-peer networks [7]. As is the case for gossip, it consists in nodes contacting other nodes randomly and exchanging information (the particular scenario considered in that paper is peer-to-peer network based communication; we consider a similar scenario in our proposed algorithms). In order to limit the amount of information transmitted, approximation techniques based on Flajolet-Martin (FM) sketches [28] are used. The FM sketches can estimate the number of distinct values and are thus immune to duplicates; this is desirable for the type of communication here [7], since the value of a node can make its way to other parts of the network via multiple paths, thus creating duplicates. It is possible to approximately compute aggregates like sum using this scheme. The main disadvantage of this protocol is the fact that the messages exchanged between nodes are large, since they have to contain an entire sketch (the size of the sketch dictates the precision of the computation). The speed of convergence of the algorithm is similar to the speed of gossip (it is essentially the algebraic complexity of the communication graph, as shown in [12]).

Synopsis Diffusion [64] is a multi-path aggregation algorithm proposed to deal with the vulnerability of the tree-based aggregation techniques at higher loss rates. By introducing redundancy, duplicates are possibly created, as was the case in the work of [7]. The method used to deal with this problem is essentially the same, namely FM sketch techniques. While the convergence is faster than gossip in terms of the number
of messages, this method suffers from the same problem: messages have to be large to achieve reasonable precision. Manjhi et al. [62] extended this work to take advantage of regional differences in the failure probability. Here, a hybrid algorithm is proposed that utilizes trees in the low failure regions and synopsis diffusion in the high failure regions.

Grid-box hierarchies [36] use gossiping to reach consensus among members of a grid box. The protocol works by escalating a member of the group (that reaches consensus using gossip) to the next level in the tree, where gossip is again used to reach consensus in the enclosing grid box. The gossip used in this paper is not a gossip aggregation protocol like Push-Sum (which was introduced later) but gossip communication. Each message has to contain the values contributed by all the nodes a particular node has heard about; thus the message size will be proportional to the size of the group, instead of a small constant as for computing averages with the gossip aggregation algorithms, the tree, or the algorithms proposed in this research.

2.4 Heterogeneous Sensor Networks

Heterogeneous sensor networks have often been referenced in literature as organized in a tiered hierarchy [73][33][92][75][86]. The problem of where, how many and what types of heterogeneous resources to deploy to maximize benefit has been explored previously [92]. However, that work deals with characteristics such as the delivery rate and expected lifetime of the sensor network. Our focus is mainly on fault-tolerant performance in aggregation operations and the cost-benefit of the heterogeneous deployment. Thus, our work is complementary to this work.

The use of highly reliable long-range back-haul links, in the form of wires, between regions of the sensor network is investigated in [75]. The results indicate a reduction in the average energy expenditure per sensor node and also a reduction in the non-uniformity in the energy expenditure of the sensor nodes. This indicates that it is actually worth
investigating the use of heterogeneous sensor networks for network longevity, especially when the operations to be performed are as global as the aggregation operations.

Most of the previous work using heterogeneous sensor networks [73][33][92][75][86][68][38][63] deals with lower level protocols. Heterogeneity is shown to have been leveraged for more uniform expenditure of energy [75] and for improvement in the delivery rate and expected lifetime of sensor networks [92]. In this research, we focus on the high-level algorithms, such as the aggregation algorithms, that would be run on the sensor network. We leverage the better reliability of the microserver nodes to improve the fault-tolerance of aggregation algorithms, and optimize the cost given certain performance constraints.

Analysis-based design tools for sensor networks have been studied [68]. Our work can be used as a design tool for analyzing aggregation in sensor networks. Other related problems investigated in heterogeneous sensor networks include clustering [38]. This work can actually be utilized in our work for construction of the heterogeneous aggregation tree. In [63], the authors calculate the optimum average energy cost for transmitting data to the base station when two types of nodes are randomly deployed in a large area. The model we propose in this work can be integrated into any broader optimization problem. For example, pieces of our theoretical model can be used to estimate the total number of messages, and thus the energy requirement, for aggregation operations. This can be weighed into, say, an energy optimization criterion according to the probability distribution of the various operations performed for an application, and aggregation is often a very common operation in sensor network applications.

2.5 Fault Tolerance and Scalability of Sensor Networks

The importance of being able to scale to thousands of nodes for sensor network applications, such as battlefield surveillance, is being recognized. The need for investment in work related to scaling of distributed sensing systems has been highlighted in a strategic assessment by the U.S. Army Research Lab [84], also quoted in [3] and [42], as follows:

"It is not practical to rely on sophisticated sensors with large power supply and communication [demands]. Simple, inexpensive individual devices deployed in large numbers are likely to be the source of battlefield awareness in the future. As the number of devices in distributed sensing systems increases from hundreds to thousands and perhaps millions, the amount of attention paid to networking and to information processing must increase sharply."

There have been experimental studies that highlight the problem in attaining scalability in wireless sensor networks [3][57][6]. It has been observed that making a protocol scale by an order of magnitude would often involve redesigning the protocol itself [6]. The reliability of communications service in wireless sensor networks has been reported to fall from better than 95%, when tested with a few (-15) nodes, to 50% when tested with 100 nodes [3].

Even in the case of data aggregation in sensor networks, fault-tolerance has always been identified as an important issue [57][64][62][34]. Specifically, in in-network aggregation, a large number of partial state records are known to get dropped and not contribute to the final aggregation value. For example, in tree-based aggregation [57], only about 40% of the partial state records are reflected in the final aggregate value for a network of 100 nodes (diameter of 10), and this participation percentage falls to less than 10% as the size of the network reaches 2500 nodes (diameter of 50). Such results about aggregation in realistic environments have made the authors [57] reach the following conclusion:

"Thus, the basic TAG approach presented so far, running on current prototype hardware (with its very high loss rates), is not highly tolerant to loss, especially for large networks." (italics ours)

Techniques such as utilizing child caches [57] have been proposed to augment the basic tree aggregation approach. Though the percentage of participation shows improvement to 70% at a diameter of 50, it comes at a price of increased variance and
"temporal smearing" of the aggregate values due to the use of cached values. The impact of additional techniques proposed in literature, such as multiple trees [57] and ODI-sketch based schemes [64][15][62], has been analyzed in this research.

The variation in the probability of error with respect to the change in signal-to-noise ratio (SNR) has also been studied [35]. It documents how a reduction of SNR from 15 dB to 12 dB can make the probability of error, representative of bit error rate in that study, increase exponentially from $5 \times 10^{-5}$ to $10^{-3}$. With a packet size of 36 bytes, this would amount to an increase in packet loss rate from 0.0576% to 25%. Since the underlying protocols would most likely use retransmission to tide over packet losses, we should compensate for the number of retransmissions. Assuming that the total number of transmission attempts is 3, this packet loss rate would amount to a probability of failure, $p$, in the range of $1.91 \times 10^{-10}$ to $0.0156$. As noted in [35], changes in weather and surroundings, such as the presence of cars and people, would further add to the variance in the probability of failure.

The different studies that have dealt with fault tolerance in data aggregation, or with fault tolerance in general, have considered packet losses as high as 50% [34][64][30][95]. In this research, we have considered a wide range of the probability of failure while conducting our analysis. We have also considered networks of size ranging from a few hundred to hundreds of thousands of sensor nodes in order to study the behavior of various aggregation algorithms across a wide spectrum of scalability and fault tolerance. This makes the theoretical problem we are solving more generic, with solid practical underpinnings.

2.6 Research Opportunity

As is evident from the literature review above, a lot of work has been done in the field of data aggregation in sensor networks. In comparison, we can describe the guiding philosophy for our research work as follows:

1. Most of the work deals with how to get aggregation done in sensor networks. We take a step back and focus more on the feasibility of a wide class of aggregation algorithms. We do not focus on any specific implementation of algorithms. This enables higher level decision-making when choosing aggregation protocols for sensor network applications.
2. We define a simple fault model that enables us to conceptually and quantitatively analyze the scalability and fault tolerance of the aggregation protocols. We apply this fault model uniformly across all our analysis to get a normalized view. Of course, we verify our analysis with experimental simulations.
3. Based on our analysis, we propose a methodology for aggregation in large scale sensor networks to devise a hybrid protocol that is fault-tolerant and scalable. When a few powerful, and more reliable, sensor nodes are available, we analyze the implications for the improvement in fault tolerance and present analytical solutions for optimizing the usage of such heterogeneous sensor networks.
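
The retransmission arithmetic quoted in Section 2.5 can be reproduced with a short calculation. The sketch below assumes that a message fails only if every transmission attempt is lost and that the attempts are independent, which appears to be the reading behind the figures given there. The function name is hypothetical.

```python
def failure_probability(packet_loss: float, attempts: int = 3) -> float:
    """Probability that all `attempts` independent transmissions are lost.

    Illustrative assumption: attempts are independent and each is lost
    with probability `packet_loss`.
    """
    return packet_loss ** attempts


if __name__ == "__main__":
    # Packet loss rates of 0.0576% and 25%, as quoted in Section 2.5.
    for loss in (0.000576, 0.25):
        print(f"loss={loss:.6f} -> p={failure_probability(loss):.3e}")
```

With three attempts, the two quoted loss rates map to the two ends of the range of $p$ given in Section 2.5.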

CHAPTER 3
FAULT MODEL

3.1 Motivation

The various techniques discussed in literature for aggregation in sensor networks, we believe, suffer from a common drawback when it comes to analysis of fault-tolerance and scalability: lack of a fault model. In the absence of a standard fault model, it is difficult to analyze, and qualitatively and quantitatively compare, the scalability and fault-tolerance of aggregation protocols. So, as the first and foremost contribution of our research work, we propose a fault model for aggregation protocols and analyze the existing families of aggregation algorithms against this fault model to compare fault-tolerance and scalability.

3.2 Sensor Network Aggregation under Faults

In the absence of any faults, tree aggregation is clearly the best choice for two main reasons: (a) even if a significant effort is required to build a good tree, the absence of faults would make the maintenance of the tree trivial, and (b) aggregation using a well balanced tree is the most efficient method in terms of the time to completion and the number of messages exchanged. If the nodes are highly prone to faults, the gossip-based aggregation methods are favored since, by design, they do not depend on any stable distributed aggregation infrastructure. These methods would be merely slowed down by faults; the ability to accommodate a lot of faults is their strongest point.

In practical situations, however, the actual fault behavior lies somewhere in between these two extreme situations. For this reason, in this section, we introduce a simple parameterized fault model (a parameter will control the probability of faults in the system) and use it to predict the behavior of tree and gossip aggregation in the presence of faults.

3.2.1 A Simple Fault Model

For a fault model to be useful for analyzing the behavior of aggregation techniques for sensor networks in faulty environments, the model must have two crucial properties: (a) it must be realistic in order for the facts derived using the model to be reasonably extensible to real scenarios, and (b) it must be simple enough to allow theoretical analysis of the protocols (due to the scale of the problem, only theoretical analysis is feasible). One important point to make is the fact that the fault model is exclusively used for the design of the hybrid protocol for aggregation. Even if the actual fault behavior is different, this assumption can be made as long as it leads to good performance of the resulting protocol. The main concerns with choosing a sophisticated fault model are:

1. Sophisticated models depend on a large number of parameters. Accurately estimating these parameters for a given practical scenario can be challenging. Inaccurate values can lead to models that are so far from the actual fault behavior that bad decisions are made.
2. Sophisticated models might not make a significant difference in the performance of the resulting protocol, since they capture secondary effects that may have little influence.
3. In order for a sophisticated fault model to be useful, all nodes have to be aware of the model. Especially when such models capture correlations between failures, large amounts of information have to be communicated to all sensor nodes. This can be problematic from both the local storage and the communication points of view.

Fault model. The fault model we consider here makes the following assumptions:

1. Each of the nodes can fail independently with probability $p$ in the time it takes to make one transmission (this includes the computation time required). Probability of failure in unit time can be selected instead but, we believe, this particular definition leads to simpler and more intuitive formulas. If the probability to fail in unit time is known, the probability to fail in the time it takes to make a transmission can be readily obtained.
2. The failures are not correlated, i.e. they are independent from each other. This is a realistic assumption in most situations. We discuss the impact of correlations after we give the details of our solution.
3. Link failures are absorbed in the probability $p$. This is not a limiting assumption, since it is impossible for a node to distinguish between a node failure and a link failure if only direct communication is used, as is the case in most aggregation protocols. If communication is achieved using routing, link failures are masked (dealt with) at the communication level.
4. Transient failures, which might trigger message retransmissions that do succeed later, are ignored. Short term transitory failures would simply increase the overall time required for the aggregate computation and would affect all the methods we consider in the same way, i.e. by increasing the time and the number of messages by a constant factor. For this reason, the design of the hybrid protocol is not influenced by the transient failures, hence we ignore them.

As is apparent from the above description, apart from the size of the sensor network, the only parameter is the probability of failure per time to make a transmission. Since the failures are independent, the theoretical analysis is greatly simplified and the design of the best hybrid protocol is reasonably simple. We believe that this fault model captures the dominant behavior in a large scale sensor network and that it is simple enough to allow analysis of the various aggregation techniques.

3.2.2 Correctness under Faults

If faults are possible in the system, the usual notion of correctness (i.e. the value of the computed aggregate being exactly equal to the true value of the aggregate) has to be revised. Since the nodes do fail, defining the true aggregate is problematic: should it be the aggregate over the sensors alive at the start or alive at the end of the computation? There are a number of problems with such notions of correctness which are insurmountable or impractical in the presence of faults and asynchronous communication: (a) correctness with respect to the nodes alive at the start of the computation is not achievable, since nodes can fail even before they communicate with any other node, so the information is lost, and (b) correctness with respect to the nodes alive at the end of the computation implies complete knowledge about what nodes are alive, which is not possible in faulty environments [27].

A less restrictive correctness criterion is to provide a result for the aggregate that is between (or approximates well) the smallest and largest value of the aggregate computed over the values generated by the sensors that were alive at some point during the computation. For example, if only one of the nodes becomes faulty and the aggregate is the sum of all the values (one per sensor) then, under this notion of correctness, we allow the result to be between the value of the sum with and without the faulty node. Formally, we require the result to be a linear combination of the values of non-faulty sensors at the beginning of the computation, where a linear combination of values $x_1, \ldots, x_n$ has the form $\frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$ for positive weights $w_i$. Furthermore, the weights of all non-faulty nodes at the end of the computation have to be approximately equal. In the rest of the work, we will refer to this notion of correctness as robust correctness.

While it might seem that robust correctness is overly complicated, we believe that this is the strongest notion of correctness that can be ensured by aggregation protocols, for the reasons stated earlier. All the intuitive notions of correctness we considered depend on common knowledge that some event happened (nodes were alive at some stage); since common knowledge cannot be achieved in distributed systems with failures, no protocol can be correct, which renders the other intuitive notions of correctness not very useful. We believe that the guarantee of our robust correctness is strong enough for the end user to feel confident about the result obtained by the aggregation protocol.

A very important point about correctness in sensor networks with node failures is the fact that correctness cannot be checked in the protocol that computes the aggregate, since there is no way to determine what nodes are or were alive at any given moment [27]. The goal of the notion of correctness we proposed is not to provide a mechanism to check the correctness but to characterize a reasonable behavior of the sensor network. The manner in which we will use robust correctness is to ensure that the protocols we consider
acceptable are implicitly correct, i.e. satisfy this notion of correctness, since no explicit check can be performed.

3.2.3 Tree Aggregation under Faults

The most important question we have to ask, for each aggregation method, is how it can deal with faults and still achieve robust correctness. In the case of aggregation trees, there are three basic approaches to deal with faults:

Ignore fault. When a fault occurs, simply ignore it. However, the information in the subtree rooted at the faulty node is not aggregated, i.e. not only is the information of the faulty node lost, but the information in the entire subtree is lost. This solution does not ensure robust correctness, unless the faulty node is a leaf, and is not usually acceptable in practice since nothing can be guaranteed about the answer.

Local fix. When a fault occurs for an intermediate node (faults in leaves can just be ignored), attempts can be made to fix the tree locally. This possibility, which has not really been explored in depth in the research literature to the best of our knowledge, has the following disadvantages:

1. A crucial requirement for tree aggregation is the fact that the graph formed by the parent-child connections is actually a tree. It is not clear how this requirement can be ensured using only local knowledge; network wide communication might be required to ensure this happens, or conventions on what nodes can be connected to ensure loops cannot be formed.
2. Reconnection of the children/descendants of a faulty node is complicated by the fact that the aggregation procedure is in progress. If the computation has to be restarted, the local fix would most likely cost almost the same amount of time as the complete tree reconstruction [46].
3. Implementing complicated protocols on sensor networks can be problematic, since the computation and communication capabilities of the sensors are low and likely to remain low because the primary goal is prolonging battery life.
4. When the rate of failure is high, the local fixes may not be able to keep up with the rate at which the faults occur, and severe interference problems between parallel fixes are likely to occur.

For the reasons listed above, we believe that local fixes are impractical and alternatives have to be considered. It is possible to combine tree aggregation with other techniques such as caching old values and using them when communication with a child cannot be established [57], and using multiple trees [57]. A careful look at these methods reveals that they do not satisfy our notion of robust correctness; we extensively comment on this issue in Section 2. Hence, we seek an approach that avoids altogether the need for local fixes but retains good performance. Since no strict notion of correctness is ensured by the local fix protocols proposed in the existing literature, we do not explore this option any further. (For the reasons we listed in the text, we believe that no notion of correctness can actually be ensured for local fixes unless global knowledge can be achieved.)

Global fix. Each parent keeps track of the state of its children. When the fault of one of the children is detected (or a link failure is detected, which would have the same consequence), a message is pushed up the tree to inform the root to rebuild the tree. A simple flooding algorithm [20][57] can be used to build/rebuild the tree, and the aggregation query dissemination can be performed simultaneously. In order for the aggregation to be successful, the tree has to survive the length of the computation; this means that both the construction/query dissemination and the aggregation from leaves to the root must be fault free. An interesting observation is the fact that failures of nodes after the aggregate value has been sent to the parent do not invalidate the correctness. Modeling this detail does not add significant precision, as we discuss below.

From the discussion above, the approach we chose for tree aggregation is the global fix, i.e. rebuild the tree when a fault occurs. In what follows, we estimate, under the simple fault model in Section 3.2.1, the amount of time and number of messages required on average to compute an aggregate in a sensor network of size $N$ with maximum depth $d_N$. The maximum depth $d_N$ depends both on the actual topology, the manner of
establishing connections between sensors, and the tree construction algorithm. In the best case $d_N = \log N$, but it can be as bad as $N$ when the tree is a line. Rather than complicating the fault model, we observe that the time for the bottom-up aggregation is equal to the time for aggregation query dissemination/tree construction.

Proposition 1. Suppose we have a sensor network of size $N$ for which aggregation trees of maximum depth $d_N$ can be constructed. Assume the simple fault model and let $p$ be the independent probability of failure for any of the nodes in the network. Then, the average number of restarts (tree reconstructions) before success is $\frac{1}{(1-p)^{d_N N}}$ and the average time to successfully finish the aggregation, as a multiplier of the transmission duration, is $d_N \frac{1}{(1-p)^{d_N N}}$.

Proof. First, we have the following result: if $X$ is a random variable that encodes the number of coin flips it takes to see a head (as opposed to a tail) when the probability of a head in each independent trial is $q$, then $X$ has a Geometric distribution with parameter $q$ and $E[X] = \frac{1}{q}$ (see for example [74]). The variance of such a random variable is $\frac{1-q}{q^2} \le \frac{1}{q^2}$, thus the standard deviation is upper-bounded by $\frac{1}{q} = E[X]$. This means that the behavior of $X$ is really dictated by $\frac{1}{q}$; the values observed are within a small constant multiplier of this quantity.

To estimate the probability of failure in one round, we make the following observations: for the computation to fail, it is enough to have any of the $N$ nodes fail in any of the $d_N$ time slots. Since all failures are independent, the probability that this does not happen is $(1-p)^{d_N N}$ (this is equivalent to having $d_N N$ independent coin flips all turning up tails, with $p$ the probability of a head). Linking the two results together, we have the required results, i.e. the number of tree reconstructions is $\frac{1}{(1-p)^{d_N N}}$ and the time is $d_N \frac{1}{(1-p)^{d_N N}}$.

To get an intuitive idea of the behavior of tree aggregation as a function of the probability of failure, we provide the following result:

Proposition 2. Using the same setup as in Proposition 1 and defining $q = p\,d_N N$, i.e. the ratio of $p$ and $\frac{1}{d_N N}$, for large $d_N N$ the average number of times the tree has to be reconstructed until aggregation succeeds is approximately $e^q$ ($e$ is the base of the natural logarithm).

Proof. Using the result in Proposition 1, the number of tree reconstructions is $R = \left(1 - \frac{q}{n}\right)^{-n}$ where $n = d_N N$. When $n$ is large we have:

$$R \approx \lim_{n \to \infty} \left(1 - \frac{q}{n}\right)^{-n} = e^q$$

Thus, the number of reconstructions until success is $e^q$.

This theoretical result suggests that, when $q \le 1$, the average number of tree reconstructions is at most $e \approx 2.7$, thus the impact of the tree reconstruction for tree aggregation is minimal; this is the best scenario for tree aggregation. When $q > 1$ even by a small margin, the number of reconstructions required might be prohibitively large, thus rendering tree aggregation unacceptably slow. For example, when $q = 10$, about 22000 tree reconstructions are required on average for the tree aggregation to succeed.

To check the correctness, we observe that the result returned is the aggregate over the nodes that were alive in the last, successful round, thus the result of the aggregation has robust correctness (it is one of the possible values of the aggregate obtained with the nodes that were alive at the start and the end of the computation).

This fault model for tree aggregation assumes that the failures are uncorrelated. When this is not the case, a new fault model specific to the particular type of correlation can be constructed. As we will see later, the design of the hybrid protocol depends mainly on the behavior of the tree aggregation in the presence of faults (i.e. the probability that at least one node fails during the tree aggregation), and is not very sensitive to common cases of correlated fault behavior.
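
The two propositions can be checked numerically. The following Python sketch evaluates the exact expression from Proposition 1 and the $e^q$ approximation from Proposition 2. The function names and the example network ($N = 8192$, $d_N = 13$) are illustrative assumptions, not values taken from the experiments.

```python
import math

def expected_reconstructions(p: float, depth: int, n_nodes: int) -> float:
    """Proposition 1: average number of tree constructions until one survives."""
    slots = depth * n_nodes              # d_N * N independent node/time-slot trials
    return (1.0 - p) ** (-slots)

def expected_time(p: float, depth: int, n_nodes: int) -> float:
    """Proposition 1: average completion time, in transmission durations."""
    return depth * expected_reconstructions(p, depth, n_nodes)

def approx_reconstructions(p: float, depth: int, n_nodes: int) -> float:
    """Proposition 2: e^q with q = p * d_N * N."""
    return math.exp(p * depth * n_nodes)

if __name__ == "__main__":
    n_nodes, depth = 8192, 13            # hypothetical network, chosen for illustration
    for q in (0.1, 1.0, 10.0):
        p = q / (depth * n_nodes)
        print(q, expected_reconstructions(p, depth, n_nodes),
              approx_reconstructions(p, depth, n_nodes))
```

For large $d_N N$ the two columns printed by this sketch should agree closely, which is the content of Proposition 2.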

3.2.4 Gossip Aggregation under Faults

Following the description in the introduction, the gossip algorithms do not depend on any distributed data-structure to work. If a node fails, the other nodes will carry out the distributed computation without the failed node and simply not communicate with this node. The only concern is whether the result obtained has robust correctness.

3.2.4.1 Push-Sum protocol

When using the Push-Sum protocol [49], sensors maintain two quantities: a weighted sum of the values to be aggregated and the sum of weights. The ratio of these two quantities is the local estimate of the aggregate. If none of the nodes fail, it is proved [49] that the protocol converges for fully connected networks when each sensor picks a random sensor to communicate with (the sending sensor will send the receiver half its sum and half its weight). If a sensor's connection is faulty, the communication is never initiated, thus nothing is lost; this assumes some form of handshaking protocol, but this is the standard for communication protocols and it is usually implemented in hardware. A complete node failure can be modeled by simply not allowing the sum and the weight to change.

To determine if the Push-Sum protocol has robust correctness, we make the following observations. First, the sum of sums and the sum of weights (including the faulty nodes) in the sensor network is conserved (this is the main observation in [49]). Second, for an active node, the ratio of the sum and the weight converges to a constant value throughout the system. Third, due to the nature of the protocol, the ratio of the sum and the weight for any node is a linear combination of the sensor values (including the faulty sensors). At convergence, the weight, as seen by any live node, of all the nodes alive is approximately the same. This immediately implies that Push-Sum satisfies robust correctness.

3.2.4.2 Distributed random grouping (DRG) algorithm

The DRG protocol [12] consists in having each node decide, with small probability, whether it wants to be a broadcasting node. Then, the broadcasting node sends a
broadcast message to all neighbors, gathers their current value, averages all the values received and distributes the information to all nodes that replied. Since averaging transforms linear combinations into new linear combinations, the partial aggregate value at any node at any time is a linear combination of the values in the sensor network. The protocol is guaranteed to converge for all the nodes alive (the convergence speed depends on the algebraic complexity of the graph, and convergence is guaranteed for any connected graph [12]). This immediately implies that the protocol satisfies robust correctness.

Note an important difference between the gossip-based protocols and tree aggregation: the correctness is achieved without significantly slowing the computation down; the slowdown is due only to the multiple retries to communicate with a node and not due to the need to reconstruct large distributed data-structures. This robustness to faults is the primary motivation for these types of protocols.

3.3 Contributions

With this work, we make the following important contributions:

1. We propose a simple but realistic fault model which we use to analyze the behavior, in the presence of faults, of both tree and gossip aggregation.
2. We propose a realistic notion of correctness for aggregation protocols in sensor networks, that we call robust correctness. We explain why this notion of correctness is the best that can be hoped for when failures are possible.

These contributions act as a stepping stone toward methodical analysis of aggregation algorithms and the protocols we further propose for large scale homogeneous and heterogeneous sensor networks.
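
To illustrate the conservation argument made for Push-Sum in Section 3.2.4.1, here is a minimal simulation sketch in Python. It assumes a fully connected network, synchronous rounds, and that the nodes listed in `failed` stop at round `fail_round` with their sum and weight frozen, following the failure model described in that section. The function name and all parameters are hypothetical, and the code is not the implementation of [49].

```python
import random

def push_sum(values, failed=frozenset(), fail_round=0, rounds=100, seed=0):
    """Simulate Push-Sum; returns the final (sums, weights) of all nodes."""
    rng = random.Random(seed)
    n = len(values)
    sums = [float(v) for v in values]
    weights = [1.0] * n
    for r in range(rounds):
        # Nodes in `failed` take part only before `fail_round`; afterwards they are frozen.
        alive = [i for i in range(n) if r < fail_round or i not in failed]
        inbox_s = [0.0] * n
        inbox_w = [0.0] * n
        for i in alive:
            j = rng.choice(alive)        # handshake: only live nodes are ever contacted
            sums[i] /= 2                 # keep half ...
            weights[i] /= 2
            inbox_s[j] += sums[i]        # ... and send the other half to node j
            inbox_w[j] += weights[i]
        for i in alive:
            sums[i] += inbox_s[i]
            weights[i] += inbox_w[i]
    return sums, weights

if __name__ == "__main__":
    vals = list(range(1, 51))
    s, w = push_sum(vals, failed={3, 7}, fail_round=5, rounds=200, seed=1)
    print(sum(s), sum(w))                # total sum and total weight are conserved
    print(s[0] / w[0], s[10] / w[10])    # live nodes agree on the estimate
```

In this sketch the estimate at every live node is a weighted combination of all sensor values, including those of the nodes that failed after contributing, which is the property the robust correctness argument relies on.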

CHAPTER 4
IMPROVING FAULT TOLERANCE OF TREE AGGREGATION

4.1 Motivation

The inaccuracy introduced due to failures might be prohibitively large. The other options discussed in literature include maintaining multiple trees [57][15], fixing the trees locally before proceeding with aggregation [46][34] and using order-independent, duplicate-insensitive multimaps [64][15][62] based primarily on FM sketches [28]. We analyze each of these options in further detail as part of this work.

The main thrust of this work is on conceptually analyzing the fault tolerance, scalability (and hence the applicability) and accuracy of the different techniques. We first analyze theoretically the extent to which the fault tolerance of an aggregation tree can be improved using each of the techniques. The analysis suggests that the gains of using these techniques might not be as high as previously believed. We identify high level trends in fault tolerance, scalability and accuracy of the various tree-based techniques which would enable informed decision-making in designing aggregation algorithms for sensor networks.

4.2 Multiple Trees

Multiple decision trees have been utilized in data mining to improve robustness [47][53]. In such cases, the use of multiple trees has been shown to reduce the variance. Using multiple aggregation trees at the same time has been proposed [58][15] as a way of reducing variance. The main idea here is that if two trees are independent, the variance of a linear combination of the two final aggregates would be half of the variance of the individual aggregate from a single tree.

Consider the example of a query for finding the count of the nodes alive in the network. Two aggregation trees are constructed independently and the aggregation operation occurs simultaneously in both the trees. The final estimate of the count of the sensors is calculated as the average of the results returned by the two trees. It is easy to see that the expected value of this final aggregate is the same as the expected value of the individual aggregate from each of the trees.

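$$E\left[\frac{X_1+X_2}{2}\right] = E\left[\frac{X_1}{2}+\frac{X_2}{2}\right] = E\left[\frac{X_1}{2}\right]+E\left[\frac{X_2}{2}\right] = \frac{1}{2}\left(E[X_1]+E[X_2]\right) = \frac{1}{2}\cdot 2E[X] = E[X] \qquad (4\text{-}1)$$

Now, consider the variance of the final aggregate, assuming the two trees are independent:

$$\mathrm{Var}\left[\frac{X_1+X_2}{2}\right] = \mathrm{Var}\left[\frac{X_1}{2}+\frac{X_2}{2}\right] = \mathrm{Var}\left[\frac{X_1}{2}\right]+\mathrm{Var}\left[\frac{X_2}{2}\right] = \frac{1}{4}\left(\mathrm{Var}[X_1]+\mathrm{Var}[X_2]\right) = \frac{1}{4}\cdot 2\,\mathrm{Var}[X] = \frac{\mathrm{Var}[X]}{2} \qquad (4\text{-}2)$$

However, the above argument holds only if the two trees are independent. Though the trees may be constructed independently, and even though we assume that the failures are independent, it cannot be guaranteed that the aggregate values obtained from the two trees are independent. The multiple aggregation trees may be highly correlated, as will be observed from the results in the experiments subsection. As a result of this high correlation between the trees, the variance of the final aggregate, as estimated from these multiple trees, does not decrease as much as suggested in Equation 4-2. In fact, we prove that the variance obtained with $m$ trees, even as $m \to \infty$, cannot be better than the variance of the aggregate obtained from a single tree by a factor greater than the inverse of the correlation coefficient between two trees. The following proof establishes a connection between the correlation coefficient and the upper bound on the reduction that can be expected in the variance of the aggregate value.

Theorem 3. The factor of reduction in the variance of the aggregate returned by a single tree using $m$ trees, as $m \to \infty$, is the correlation coefficient $\rho$ between two individual trees.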

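Proof. Consider the following correlation matrix between two aggregation trees: $\begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$. Since the variance of each of the aggregation trees individually is the same, say $\mathrm{Var}[X]$, the covariance matrix would simply be $\mathrm{Var}[X]\begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}$. Then the variance of the final aggregate, as calculated by averaging the individual results, would be

$$\mathrm{Var}\left[\frac{X_1+X_2}{2}\right] = \mathrm{Var}\left[\frac{X_1}{2}\right]+\mathrm{Var}\left[\frac{X_2}{2}\right]+2\,\mathrm{Cov}\left[\frac{X_1}{2},\frac{X_2}{2}\right] = \frac{1}{4}\left(\mathrm{Var}[X_1]+\mathrm{Var}[X_2]\right)+2\cdot\frac{1}{4}\,\mathrm{Cov}[X_1,X_2] = \frac{1}{4}\cdot 2\,\mathrm{Var}[X]+\frac{1}{2}\rho\,\mathrm{Var}[X] = \left(\frac{1}{2}+\frac{\rho}{2}\right)\mathrm{Var}[X] \qquad (4\text{-}3)$$

Generalizing this to $m$ aggregation trees,

$$\mathrm{Cov} = \mathrm{Var}[X]\begin{pmatrix} 1 & \rho & \cdots & \rho \\ \rho & 1 & \cdots & \rho \\ \vdots & \vdots & \ddots & \vdots \\ \rho & \rho & \cdots & 1 \end{pmatrix}$$

and the variance of the final aggregate would be

$$\mathrm{Var}\left[\frac{X_1+X_2+\cdots+X_m}{m}\right] = \frac{1}{m}\mathrm{Var}[X]+\frac{m^2-m}{m^2}\,\rho\,\mathrm{Var}[X] = \frac{1+(m-1)\rho}{m}\,\mathrm{Var}[X] \qquad (4\text{-}4)$$

Hence, as $m \to \infty$, $\frac{1}{m} \to 0$ and $\frac{m-1}{m} \to 1$, and so

$$\lim_{m \to \infty} \mathrm{Var}\left[\frac{X_1+X_2+\cdots+X_m}{m}\right] = \rho\,\mathrm{Var}[X]$$

This immediately has the following implications: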

1. The use of multiple aggregation trees is not going to reduce the variance of the aggregate by a factor greater than the correlation between two trees. So, if the correlation coefficient between two trees is high, the reduced variance may not be worth the overhead cost of maintaining multiple trees.
2. Plotting the factor of reduction in variance $\frac{1+(m-1)\rho}{m}$ for different values of $m$ and $\rho$, we note that most of the reduction in variance (of the maximum possible) is achieved using as few as 5 to 7 trees; the returns diminish with increasing number of trees.

In the experimental evaluation, we show that the correlation coefficient, and hence the reduction in variance obtained by using multiple trees, is dependent on the way the multiple trees are constructed. We also analyze the impact of constructing and maintaining those multitree structures on the fault tolerance of the aggregation system.

4.3 Local Fixes

Unlike general tree maintenance protocols, the failures occurring during the aggregation phase should not impede the partial aggregates from being retained. In the case of fixing the trees locally to tide over failures, the children of the failed node need to reattach themselves to another node in the network which still has an unimpeded path to the root node. Note that satisfying this condition implies that the orphaned children cannot attach themselves to the descendant nodes in their own subtree; this is to avoid creation of cycles. However, since their partial aggregates must be preserved, an additional condition to be satisfied is that they cannot attach to any node which has already sent out its own partial aggregate. Satisfying these conditions makes fixing trees locally during the aggregation phase more difficult than fixing during just the maintenance phase, which is the common case discussed in general tree-based algorithms like routing or subscription-based protocols.

Thus, in the event of multiple failures, fixing the tree locally to compensate for all these failures might actually cost more than rebuilding the tree. Depending on the number of failures that can be sustained before rebuilding the tree becomes cost-effective, the tree reconstruction can be delayed. Once again, since we want to prove that even the best of these techniques do not substantially improve the fault tolerance of the aggregation tree,
we assume that the cost of each local fix is zero. This is also a good way of steering clear of the specifics of local fault-fixing algorithms since, with zero cost, it is as ideal, though probably practically unrealizable, as it can be.

We can extend the expression for the number of global restarts from the previous chapter to allow up to $k$ local fixes before a rebuild-and-restart is triggered. The number of restarts in case of $k$ local fixes can thus be approximated by

$$\frac{1}{\sum_{i=0}^{k} \binom{n}{i}\, p^i (1-p)^{n-i}}$$

Note that we have discounted the depth $d$ of the tree from the formula; this only makes the local-fixing algorithm look better and thus, again, provides a best-case scenario. To visualize this theoretical model, we plot the number of restarts, as calculated from this approximate estimate, for values of $k$ ranging from 0, which is equivalent to global restart, to $k > \log n$ for a sensor network of size $n = 8192$ in Figure 4-1.

Redundancy can be used to take corrective actions, as discussed in the RideSharing [34] approach to fault tolerant aggregation. This approach exemplifies a minimal-overhead paradigm for local fixes, using the concept of backup parents. The approach performs best if transmission ordering is enforced along with parent-clique formation and co-tracking (we encourage the readers to look up [34] for further details); this requires additional work during tree construction. However, not all losses are guaranteed to be accounted for in this approach. Also, node failures are ignored in this approach.

In the following section, we conduct experiments to study the effect on fault tolerance of having the ability to sustain $k$ local fixes before triggering a tree reconstruction and restarting the aggregation operation.

4.4 Experiments and Results

The various techniques discussed in the previous sections for fault tolerance in tree based aggregation are evaluated experimentally in this section. The goals of the experiments are:

1. To verify that global restarts render the trees unusable at $Np > 10$,
2. To study the effect on fault tolerance of having the ability to sustain $k$ local fixes before triggering a tree reconstruction and restarting the aggregation operation,
3. To evaluate the amount of correlation between multiple trees for different multitree structures, and
4. To analyze the impact of constructing and maintaining those multitree structures.

Figure 4-1. Theoretical prediction of improvement in fault-tolerance using k local fixes for sensor network of size N = 8192

We start with studying the scalability and fault tolerance of the global restart technique.

4.4.1 Global Restart

In this technique, on encountering a failure, the current tree is dismantled and a new tree is formed. This new tree will thus span all the nodes alive at the time of reconstruction, and the aggregation process is triggered again. Since the tree is reconstructed every time a failure is encountered, if and when the aggregation operation does manage to run to completion, it implies that all the nodes that are alive are accounted for in the aggregation query result. For our experiments, we use a top-down version of the GIST algorithm [46] for constructing a 4-ary tree. This kind of tree has the additional advantage of being
insensitive to grouping select queries. The aggregation operation then begins at the leaf nodes and propagates up the tree. The query dissemination is assumed to have occurred during tree construction. Also, in an attempt to study the cost in the best case scenario for this technique, the detection of failure is assumed to be free of cost. On detecting a failure, the tree is reconstructed and the aggregation operation is restarted. The cost of reconstructing the tree is also assumed to be zero. Hence the only cost counted is the time spent or the messages sent in the aggregation process.

Figure 4-2. Time to completion (in rounds) plotted against the probability of failure when using the global restart technique

The results in Figure 4-2, which plots the average time to completion versus the probability of failure for various sizes of the network, show that as long as $Np < 1$, the aggregation trees have excellent performance. But once the probability of failure starts to ramp up beyond $Np > 1$, the expected time to completion shoots up exponentially. An important observation here is that the performance of the tree-based technique turns from best to worst within an increase of an order of magnitude of the probability of failure $p$ for any given network of size $N$. These results confirm the theoretical analysis presented earlier about the exponentially increasing number of restarts after $Np > 1$.
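
The restart estimate from Section 4.3 can be evaluated directly; with $k = 0$ it reduces to the global restart case studied above. The Python sketch below is only an illustration under that section's assumptions (independent failures, tree depth ignored, zero-cost fixes). The function name is hypothetical, and the binomial terms are computed in log space purely to avoid floating-point overflow for large $n$.

```python
from math import exp, lgamma, log, log1p

def expected_restarts(n: int, p: float, k: int = 0) -> float:
    """1 / P(at most k of n nodes fail), each failing independently with probability p."""
    def log_pmf(i: int) -> float:
        # log of C(n, i) * p^i * (1 - p)^(n - i)
        return (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                + i * log(p) + (n - i) * log1p(-p))
    survive = sum(exp(log_pmf(i)) for i in range(k + 1))
    return 1.0 / survive

if __name__ == "__main__":
    n = 8192
    for np_product in (0.1, 1.0, 10.0):
        p = np_product / n
        print(np_product, [expected_restarts(n, p, k) for k in (0, 2, 8, 16)])
```

Each printed row corresponds to one value of $Np$ and shows how the estimated number of restarts changes as more local fixes are tolerated.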

4.4.2 Local Fixes

One might argue, from the results of the experiments in the previous subsection concerning global restarts, that this exceptionally bad behavior of the tree may be due to the stringent requirement that the tree be reconstructed on encountering the first failure in the aggregation process. A general belief expressed in the literature [46][57] is that fixing tree structures locally to overcome the faults would provide better performance at higher probabilities of failure.

In this set of experiments, we use the same setup as in the previous set of experiments, except that we now allow up to $k$ failures to occur before we reconstruct the tree. Using $k$ as a free variable, we observe how the fault-tolerance of the tree structure changes with the number of failures it is able to sustain. Here, we also assume that the cost of fixing each fault (up to $k$ faults) is zero. This guarantees the best possible performance of the aggregation tree using local fault fixing techniques for comparison, though this may be practically unrealizable.

Figure 4-3. Time to completion plotted against the probability of failure when the maximum number of local fixes allowed is k = 0, 2, 8 and 16

Figure 4-3 reflects the fault-tolerance of the aggregation tree for different values of $k$: it marks the probabilities of failure for which the expected number of restarts/time to completion is less than a maximum threshold. By observing this plot for increasing values of $k$, we can study the effectiveness of allowing local fixes: the more range covered on the probability of failure axis for a smaller increase in $k$, the better. We vary $k$ from 0, the case
of global restart, to $O(\sqrt{n})$. We observe that, even with the maximum sustainable number of failures before restart as high as $O(\sqrt{n})$, the fault-tolerance of the aggregation tree is improved by only an order of magnitude in $p$. This shows that, even after assuming no cost for local fixes, allowing a very high number of failures before reconstruction does not translate into much gain in fault tolerance.

4.4.3 Multiple Trees

Multiple decision trees have been utilized in data mining to improve robustness [47][53]. In such cases, the use of multiple trees has been shown to reduce the variance. Multiple trees are also proposed to reduce the variance of the estimate of the aggregation operation. However, as explained in the previous section, correlation is an issue. If there is a high amount of correlation between multiple trees, the reduction in variance is limited and may not be worth the additional cost of multiple trees, especially when energy is at a premium. In this section, we perform experiments to determine the possible correlation for different models suggested in literature for utilizing multiple trees. In particular, we study the trade-off between the cost of multiple trees and the advantages they provide in terms of scalability and fault-tolerance.

So far in our previous experiments, we have used a single parameter for probability of failure, knowing full well that failure can occur due to different factors such as temporary fluctuations in link quality between a pair of nodes or a permanent failure of a node. (We do not consider Byzantine failures in this research.) However, the assumption holds for all our previous experiments because:

- We have been using a single tree for aggregation till now.
- Every node has a single parent in a tree.

This implies that when a communication fails, it is not easy to distinguish between the failure of the receiving node and the failure of the link connecting the two nodes. As a result, we do not add any additional information to the fault model by separating out the two sources of failures when we have a single tree.

However, when we have multiple trees performing the aggregation operation simultaneously, each node (except the children of the unique root node) has multiple parents. Hence, even if a parent node, say $p_1$, cannot reach a child node, say $c_1$, due to failure of the link, node $c_1$ may be reached by another parent $p_2$, if there is no node failure at $c_1$. Hence, when dealing with multiple trees, we use two separate parameters, $p_{node}$ and $p_{link}$, in our fault model.

We consider the following models of tree-based aggregation techniques which use multiple trees:

1. Levelized multitrees: We categorize a multiple tree structure, where all the parents of a node lie in the same level, as levelized trees. The multiple tree structure discussed in [57] has two parents in the level just above the child node and thus falls in this category. The multitree structure discussed in [15] has up to 3 parent nodes, all at the same level. Thus, the level of each of the nodes is the same in each of the trees in such cases.
2. Arbitrary multitrees: These are the multitree structures created without any restrictions on the position of the parent in any other tree. Thus, in this case, the node is free to be at any of the levels in the different trees. Multiple trees formed independently using any tree generating algorithm, such as GIST [46], would fall in this category.

Figure 4-4. Levelized multitree structure

In our experiments, we study the correlation between two trees where the multitree structure is (a) a levelized multitree, and (b) an arbitrary multitree. We observe the variance of a single tree, and the variance of the multitree in each case. The reduction in the variance obtained using multiple trees can be used to estimate the required correlation. The ratio of the variance of multitrees to a single tree is plotted for a levelized multitree in Figure 4-5 and for an arbitrary multitree in Figure 4-6.

Figure 4-5. Ratio of the variance of the count aggregate using a levelized multitree having 3 parents to the variance of a 3-ary single tree

Figure 4-6. Ratio of the variances of the count for an arbitrary multitree involving 2 trees to the variance of a single tree
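
The step from an observed variance ratio to a correlation estimate follows from Equation 4-4: if $r$ is the ratio of the multitree variance to the single-tree variance for $m$ trees, then $r = \frac{1+(m-1)\rho}{m}$, so $\rho = \frac{mr-1}{m-1}$. A small Python sketch of both directions is given below; the function names are illustrative, and the inversion assumes that all pairs of trees share the same correlation coefficient, as in Theorem 3.

```python
def variance_factor(m: int, rho: float) -> float:
    """Equation 4-4: Var[average of m trees] / Var[single tree]."""
    return (1 + (m - 1) * rho) / m

def implied_correlation(m: int, ratio: float) -> float:
    """Invert Equation 4-4 for rho, given an observed variance ratio (needs m >= 2)."""
    return (m * ratio - 1) / (m - 1)

if __name__ == "__main__":
    for rho in (0.0, 0.3, 0.8):
        print(rho, [round(variance_factor(m, rho), 3) for m in (1, 2, 3, 5, 7, 50)])
    print(implied_correlation(2, 0.5))   # a ratio of 1/2 with two trees gives rho = 0
```

The first loop also illustrates the diminishing returns noted in Section 4.2: for a fixed $\rho$ the factor approaches $\rho$ as $m$ grows.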

The levelized multitree structure used in our experiments is similar to the multitree structure used in [15], where each node has 3 parent nodes (except the nodes at level 1, which have the root node as the only parent node). For a fair comparison, its variance is compared to a 3-ary single tree structure. We also experimented with the multitree structure as proposed in [57] and observed that the results are qualitatively the same.

The experiments involving arbitrary multitrees rely on the independent formation of $m$ different trees. We use the same tree construction algorithm as used in the previous section. We run the algorithm $m$ times, each run returning a tree formed independently but rooted at the same node.

A distinct advantage of using levelized multitrees is that, because of the way they are constructed, all the parent nodes can be reached with the same broadcast of the intermediate aggregate value from the child node. Each of the $m$ parents shares $1/m$-th of the aggregate value sent by the child. If it is assumed that the multiple trees are independent, the variance of the final aggregate at the root should be $1/m$-th of the variance of the final aggregate from a single tree. However, the reduction in variance is much smaller, as observed in Figure 4-5. This suggests the presence of correlation between the trees in this multitree structure.

Compare this against the reduction of variance when using the arbitrary multitree. Figure 4-6 shows that using an arbitrary multitree consisting of two trees reduces the variance by 50% as compared to a single binary tree. This shows that the assumption of independence holds in this case. We verified that by calculating the correlation coefficient between two trees of the arbitrary multitree. We observed that the correlation coefficient is always close to 0, which indicates that the aggregate values returned by the two trees are indeed independent.

However, in case of arbitrary multitrees, all the parents cannot be communicated to with the same broadcast. In fact, the partial aggregate that a node sends to each
of the parents would typically be different. This is due to the fact that each node now has a different subtree below it in each of the trees in the multitree. Thus, the amount of communication increases $m$-fold in an arbitrary multitree as compared to a single aggregation tree, where $m$ is the number of trees in the multitree. This immediately has the following ramifications:

1. The amount of energy spent increases $m$-fold.
2. Since the communication medium is shared (broadcast) and each node has to send and receive $m$ times the messages, the communication has to be serialized.
3. Since the communication needs to be serialized, the total time required for aggregation can also be assumed to increase $m$ times.
4. The probability of failure of a node is calculated as the probability of the node failing within the time period of an aggregation operation. Now that the aggregation time has increased, more nodes would fail in this interval. Thus, the probability of node failure can be assumed to increase $m$ times too.

Figure 4-7. Mean and standard deviation of the count aggregate as a function of the probability of node failure for a sensor network of size N = 8192

We note that, as the probability of node failure increases, (a) the bias from the true aggregate value increases, and (b) the variance increases, as observed in Figure 4-7. Thus, the reduction in the variance due to independence between trees is somewhat counteracted by

1. the increase in the variance due to the effective $m$-fold increase in the probability of node failure $p_{node}$, and
2. if bias correction is to be applied, the increased bias due to the increase in $p_{node}$ also increases the variance.

Since the probability of link failure depends on the time per transmission, the increase in the aggregation time for the arbitrary multitrees does not have any direct impact on the probability of link failure. However, the increased energy expenditure leads to weakening of transmission power over a longer range of time and thus indirectly affects the probability of link failure.

4.5 Sketch-Based Schemes

Sketch based methods have been proposed in literature [64][15] as fault-tolerant aggregation techniques. These techniques are essentially based on the duplicate-insensitive counting FM-sketches [28]. Each sensor node maintains a sketch (a bitmap vector) indicative of the number of distinct sensor nodes it has heard about. When a sensor node receives a sketch from a neighboring node, the union of the incoming sketch with its current sketch yields a bitmap vector which approximates the count of the union of nodes seen by these two communicating sensors. By ensuring that the sketches are sent along a tree structure to the root node, at the end of the process, the root node has a good estimate of the total count. This idea has been further extended by [64] and [15] to calculate the sum of values, in particular, and hence a lot of other aggregate operations too, in general.

Sketches for increasing fault tolerance. This sketching technique for aggregation is inherently fault tolerant to node and link failures in sensor networks due to its property of being duplicate-insensitive. The same bitmap vector can be transmitted to more than one parent node at a time. Since the process is insensitive to duplicates, this does not affect the aggregate value; however, it improves the fault tolerance of the aggregation network. This is primarily because for a node to be counted in the aggregation, there needs to be an uninterrupted path from that node to the root node. And so, by creating multiple paths, the fault tolerance is greatly improved. Thus, the
immunity of the sketch-based techniques to multipath algorithms, and the fact that a single copy of the sketch finding its way to the root is good enough, make this technique a good candidate for aggregation.

However, as is the case with approximate techniques, this scheme also suffers from inaccuracies. The accuracy of the technique can be improved by using multiple bit vectors (each utilizing a different hash function) instead of a single vector forming the sketch. The improvement in accuracy with increasing number of bitmap vectors $m$, as observed in the decreasing percentage standard error in [28], is listed in Table 4-1.

Table 4-1. Standard error of counting sketches for several values of m, the number of bitmap vectors, as reported in [28]

m       % standard error
2       61.0
4       40.9
8       28.2
16      19.6
32      13.8
64      9.7
128     6.8
256     4.8
512     3.4
1024    2.4

For accommodating aggregates of large scale sensor networks of size ranging from tens of thousands to a million nodes, a safe size of a single bit vector can be 32 bits, as assumed in [64]. The standard packet size of messages is, by default (TinyOS specifications: http://www.tinyos.net/faq.html), 36 bytes (this packet size includes the headers). With this message length, we can barely use $m = 8$ bit vectors, which would bring the standard error down to just about 30%, as seen in Table 4-1. To limit the standard error to within 10%, the number of bit vectors required to be maintained and transmitted in each message is $m = 64$. The number of bit vectors required to get the standard error within 2% is greater than $m = 1024$. Even after employing some of the space saving techniques such as run-length encoding utilized in [15], the size of each sketch message would be around
4 KB. Communication of packets of this length in the low-bandwidth and low-energy paradigm of sensor networks is currently unrealizable and, in fact, is a challenge even for Ethernet-based ad-hoc networks. As a result, even though sketch-based aggregation techniques are most immune to node and link failures in sensor networks, the inaccuracies introduced due to the innate approximation nature of the technique render it difficult to realize for fairly accurate aggregate calculations. Also, by construction, sketches can only deal with integer calculations, and adaptation to floating point aggregation would lead to additional loss of precision.

4.6 Contributions

In this piece of work, we have evaluated, qualitatively and quantitatively, the different tree-based techniques for aggregation in sensor networks from the point of view of fault tolerance and scalability. We make the following observations:

1. The global restarts technique, though it ensures correctness, renders the trees unusable at $Np > 10$.
2. Having the ability to sustain $k$ local fixes before triggering a tree reconstruction and restarting the aggregation operation hardly improves the fault-tolerance by an order of magnitude, and this was under the extremely ideal assumption that local fixes occur instantaneously and do not incur any cost of their own.
3. The amount of correlation between multiple trees differs depending on the way the multitree structures are created. Localized multitree structures, like the levelized multitrees which have minimal aggregation operation time overhead, have higher correlation.
4. If the multiple trees are truly arbitrary, as is the requirement for truly independent trees, the gains in minimizing the variance come at the cost of increasing the overall probability of failure which, invariably, results in an increase in the variance.

This shows that there is no one pure tree-based aggregation technique that can be fault-tolerant in all cases of network sizes and probabilities of failures. A more comprehensive hybrid approach needs to be considered to optimize performance in all cases.
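
The packet-size argument of Section 4.5 can be reproduced with a short calculation. The Python sketch below uses the standard errors of Table 4-1 together with the 32-bit vector width and 36-byte packet size quoted in that section. It computes raw, uncompressed sketch sizes only; the helper names are hypothetical, and header overhead and run-length encoding are deliberately not modeled.

```python
PACKET_BYTES = 36        # default message size quoted in Section 4.5 (headers included)
BITS_PER_VECTOR = 32     # bit-vector width assumed in Section 4.5, following [64]

# Percentage standard error per number of bit vectors m, copied from Table 4-1.
STANDARD_ERROR_PCT = {2: 61.0, 4: 40.9, 8: 28.2, 16: 19.6, 32: 13.8,
                      64: 9.7, 128: 6.8, 256: 4.8, 512: 3.4, 1024: 2.4}

def sketch_bytes(m: int) -> int:
    """Raw size in bytes of a sketch made of m bit vectors."""
    return m * BITS_PER_VECTOR // 8

def max_vectors_per_packet(packet_bytes: int = PACKET_BYTES) -> int:
    """Upper bound on m for one packet, ignoring the bytes taken by headers."""
    return packet_bytes * 8 // BITS_PER_VECTOR

def smallest_m(max_error_pct: float):
    """Smallest tabulated m whose standard error is within the target, or None."""
    candidates = [m for m, err in STANDARD_ERROR_PCT.items() if err <= max_error_pct]
    return min(candidates) if candidates else None

if __name__ == "__main__":
    print(max_vectors_per_packet())          # vectors that fit in one default packet
    for target in (30.0, 10.0, 2.0):
        m = smallest_m(target)
        print(target, m, None if m is None else sketch_bytes(m))
```

A `None` result for the 2% target reflects that no tabulated value of $m$ reaches that accuracy, in line with the statement that more than 1024 bit vectors would be needed.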

CHAPTER 5
HYBRID AGGREGATION

5.1 Motivation

As we have shown in the previous section, when the probability of faults is small (smaller than $\frac{1}{d_N N}$), the behavior of the tree aggregation is excellent. In this case, the gossip type protocols have reasonable performance as well, but they are inferior to tree aggregation by a significant constant factor. When the probability of failure is large (above $\frac{10}{d_N N}$), tree aggregation is poor but the performance of gossip protocols is mostly unaffected; in these circumstances the gossip protocols are clearly preferable. Since we already have a reasonable fault model and the characterization of the two types of protocols, a simple approach for deciding what protocol to use in a particular practical situation is to estimate the failure probability and to compare it with $\frac{1}{d_N N}$. If it is smaller, tree aggregation should be preferred; if it is larger, gossip protocols are a better choice.

A third alternative is possible and potentially interesting: a combination of tree and gossip aggregation. The reason such a hybrid protocol might work better than either tree or gossip aggregation is based on the observation that the performance of the tree aggregation is dramatically influenced by the size of the tree. For example, if $q = p\,d_N N$ is large, say 16, the number of restarts needed for the tree aggregation, according to Proposition 2, is $e^{16} \approx 8.8 \times 10^6$. If we reduce the number of nodes that participate in tree aggregation by a factor of 12, then even at $d_N = 13$, $d_N N$ is reduced by at least a factor of $12\,\frac{\log N}{\log (N/12)} = 14.84$; thus $q$ becomes 1.07. So now, the number of restarts is 2.94, a reduction of 6 orders of magnitude from the previous case. The reduction in the number of nodes in the tree can be achieved by partitioning sensors into groups, computing the aggregate within each group using a more robust approach and using a tree to compute the aggregate over the groups. In the above example we only need groups of size 12 to make the protocol practical.
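
The arithmetic of this example can be retraced in a few lines of Python. The sketch assumes, as the quoted numbers suggest, that $d_N$ is taken as the natural logarithm of $N$ with $\log N = 13$; the function names are illustrative.

```python
import math

def restarts(q: float) -> float:
    """Proposition 2: approximate number of tree reconstructions, e^q."""
    return math.exp(q)

def reduction_factor(group_size: int, log_n: float) -> float:
    """Factor by which d_N * N shrinks when N becomes N / group_size, assuming d_N = log N."""
    return group_size * log_n / (log_n - math.log(group_size))

if __name__ == "__main__":
    q = 16.0
    factor = reduction_factor(12, 13.0)
    q_hybrid = q / factor
    print(restarts(q))          # restarts for a tree over all nodes
    print(factor, q_hybrid)     # reduction factor and the resulting q
    print(restarts(q_hybrid))   # restarts for a tree over group representatives only
```

The printed values can be compared with the figures quoted in the paragraph above, which appear to be truncated rather than rounded.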

Figure 5-1. Hybrid protocol. (Labels in the figure: sensor node, gossip group, group representative, tree-based aggregate.)

5.2 Proposed Hybrid Aggregation Methodology

Based on the above observation, the particular hybrid protocol we propose is depicted in Figure 5-1. First, the sensors are organized, based on local proximity, into $m$ groups, each of average size $\frac{N}{m}$. Within each group, gossip type protocols are used to compute the value of the aggregate within the group. While the aggregation is running, a leader is selected for each group (any of the gossip based leader selection algorithms can be used [37]). The group leaders then organize into an aggregation tree and aggregate the group values using the tree aggregation protocol. If the group leader fails, any other node in the group can take over the role and participate in tree aggregation since, by the nature of gossip protocols, all nodes converge to the same value. The groups can be reused between queries; there is no need to repartition the network for each query. Thus, any algorithm for partitioning the sensor network can be used since efficiency is not crucial. The hybrid protocol has robust correctness since it is a simple composition of tree and gossip aggregation, both of which have this property (assuming that the tree is reconstructed when a node fails).

Choosing the group size for hybrid protocol. Given the qualitative observations about the efficiency of the tree aggregation as a function of the number of nodes, a straightforward approach to determining a reasonable value for the size of the groups is to choose $\frac{N}{m}$ so that $p\,d_{N/m}\,\frac{N}{m} \approx 1$ ($m$ is determined by solving the equation). This would
ensure that the tree aggregation part of the protocol needs few restarts; the rest will be taken care of by the gossip within the groups. This approach, even though reasonable, is based on qualitative observations and ignores the fine details of the actual difference in efficiency between the tree and gossip aggregation, which could be substantial (see Figure 5-2). The important observation here is that no protocol dominates the other for the whole spectrum of probabilities of failure.

Figure 5-2. Comparison of performance of tree and gossip aggregation

A more principled approach is to formulate and solve an optimization problem that gives the optimal value for the number of groups given the characteristics of the sensor network. This approach can be used for determining the optimal group size in order to minimize the time to completion or the total number of messages (the energy used in the system depends mostly on the total number of messages). We explain here in detail how to optimize for the time to completion; we then briefly mention how to change the optimization problem to optimize for the number of messages.

As before, $d_N$ is the amount of aggregation time in a fault-free tree of size $N$ that can be constructed, using the preferred tree construction method, on sensor networks with similar architecture. $d_N$ is simply the product of the maximum depth of the tree and
the number of actual messages required for the parent and child to communicate, i.e. the length of the path between groups. Let $g(N)$ be the convergence time for the particular gossip protocol and communication graph as a function of the size of the network. This function would take into consideration whether the communication is local or long distance by measuring the time in local transmissions. With this, the time to completion is the sum of the completion times for running gossip within each group (maximum time across the groups) and tree aggregation between groups, that is:

$$T = T_{gossip} + T_{tree} = c_1\, g\!\left(c_2 \frac{N}{m}\right) + \frac{d_m}{(1-p)^{m\,d_m}} \qquad (5\text{-}1)$$

In the above equation, the factor $c_1 \ge 1$ is introduced to account for the fact that $g(N)$ is the expected time, and not the actual running time of the gossip protocol; an actual run of the algorithm might require more time. The factor $c_2 \ge 1$ is introduced to account for the fact that the size of individual groups will be $N/m$ only on average; individual groups could be much larger. The product $c_1\, g\!\left(c_2 \frac{N}{m}\right)$ models the pessimistic behavior of the gossip aggregation: the unlucky speed of the slowest gossip group. We discuss how $c_1$ and $c_2$ are determined later in the section. With this, the optimization problem is:

$$m^{time}_{opt} = \underset{m \in [1:N]}{\arg\min}\; c_1\, g\!\left(c_2 \frac{N}{m}\right) + \frac{d_m}{(1-p)^{m\,d_m}} \qquad (5\text{-}2)$$

When the number of messages has to be optimized, the optimization criterion is:

$$m^{msg}_{opt} = \underset{m \in [1:N]}{\arg\min}\; N\, g\!\left(\frac{N}{m}\right) + \frac{m}{(1-p)^{m\,d_m}} \qquad (5\text{-}3)$$

The factors $c_1$ and $c_2$ are not necessary in Equation 5-3 since we want the average, not the extreme, behavior of the groups while counting the number of messages.
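
Because $m$ ranges over a finite set, Equation 5-2 can be solved by exhaustive search. The Python sketch below shows the shape of such a search. The functions `tree_depth` (standing in for $d_m$) and `gossip_time` (standing in for $g$) are hypothetical placeholders chosen only for illustration; in practice both would be measured or derived for the actual network. The constants $c_1 = 2$ and $c_2 = 1 + \frac{m \log m}{N}$ follow the discussion of these factors later in this section.

```python
import math

def tree_depth(m: int, fanout: int = 4) -> int:
    """Hypothetical d_m: depth of a balanced tree with the given fanout over m leaders."""
    depth, capacity = 1, fanout
    while capacity < m:
        capacity *= fanout
        depth += 1
    return depth

def gossip_time(n: float) -> float:
    """Hypothetical g(n): gossip rounds growing logarithmically with group size."""
    return 4.0 * math.log2(max(n, 1.0))

def completion_time(m: int, n_nodes: int, p: float) -> float:
    """Equation 5-1 evaluated with the placeholder d_m and g above."""
    c1 = 2.0
    c2 = 1.0 + m * math.log(m) / n_nodes
    d = tree_depth(m)
    log_restarts = -m * d * math.log1p(-p)        # log of 1 / (1 - p)^(m * d_m)
    t_tree = math.inf if log_restarts > 700 else d * math.exp(log_restarts)
    return c1 * gossip_time(c2 * n_nodes / m) + t_tree

def best_group_count(n_nodes: int, p: float):
    """Equation 5-2 by brute force over m = 1 .. N; returns (m, time)."""
    m = min(range(1, n_nodes + 1), key=lambda k: completion_time(k, n_nodes, p))
    return m, completion_time(m, n_nodes, p)

if __name__ == "__main__":
    for p in (1e-6, 1e-4, 1e-2):
        print(p, best_group_count(4096, p))
```

The message-count criterion of Equation 5-3 can be handled the same way by swapping the objective function.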


Figure 5-3. Frequency distribution of the time to completion of the gossip protocol for a group of size n = 4096 and p = 0.00065536

About the gossip speed factor $c_1$. Determining the variability due to the running of the gossip protocol is very involved theoretically. For this reason, the method we propose here is to determine an appropriate value for the factor $c_1$ using an empirical approximation of the distribution obtained by simulating the behavior of gossip on the given network. Figure 5-3 depicts the empirical approximation of the distribution of the convergence speed for a network of size n = 4096 and p = 0.00065536. As is apparent from the figure, the distribution is concentrated around the expected value, and the probability that the speed is twice as slow as the average is very small. An immediate consequence is the fact that the maximum, even over a large number of such distributions, will be within a factor of 2 of the average; thus 2 is an upper bound of the contribution to the constant $c_1$.

About the maximum group size factor $c_2$. The variability due to uneven group size has to be determined based on the particular grouping algorithm used. Unless the grouping algorithm strives to achieve uniform groups, the grouping can be considered random and each group size modeled by a binomial variable (the group of all sizes forms a multinomial distribution). The average of each such variable is $N/m$, but here we need an


estimate for the largest group; this group will likely have the slowest gossip. Such an estimate is given by a classic result by observing that this is a balls into bins problem [71]. A good estimate of the size of the largest group is $N/m + \log m$. When the number of groups is small, the logarithmic factor has no influence, but it becomes dominant when $m$ approaches $N$. The contribution to the constant $c_2$ is determined by dividing this quantity by the expected size of the group. Putting these two contributions together, the value of $T_{gossip}$ we used in Equation 5-2 and recommend in such scenarios is:

$$T_{gossip} = 2\, g\!\left(\left(1 + \frac{m \log m}{N}\right)\frac{N}{m}\right)$$

5.3 Empirical Evaluation of the Hybrid Protocol

In the previous section, we introduced a hybrid tree/gossip protocol. The optimal combination between the two protocols is determined by formulating, based on the characteristics of the application, and then solving an optimization problem. The main goal of this section is to present empirical evidence that the hybrid protocol is always at least as good as the best of the tree and gossip protocols and sometimes significantly better, by as much as two orders of magnitude.

5.3.1 Experimental Setup

In this section we give a detailed description of the experimental setup used to produce the results reported later in the section. Due to lack of space, we restrict the type and parameter values for the synthetic sensor networks for which we report results; we observed qualitatively similar results for different types of networks and different parameter values.

The experiments we present here were performed on simulated sensor networks constructed by randomly placing the nodes in a unit square field. For each node, the transmission radius is selected such that the number of direct neighbors is 32. We present results for the number of nodes $N$ varying between 16384 and 262144.
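The network construction described above can be sketched as follows. The text does not state how the transmission radius is derived, so this sketch assumes the simplest rule: pick the radius $r$ so that the expected number of nodes inside a disk of radius $r$ is 32, that is $N \pi r^2 = 32$, ignoring boundary effects. The function names are hypothetical.

```python
import math
import random

def transmission_radius(n_nodes, target_neighbors):
    """Radius r with n_nodes * pi * r^2 = target_neighbors (assumed rule;
    nodes near the border of the field will see fewer neighbors)."""
    return math.sqrt(target_neighbors / (math.pi * n_nodes))

def place_nodes(n_nodes, rng):
    """Uniformly random node positions in the unit square."""
    return [(rng.random(), rng.random()) for _ in range(n_nodes)]

def mean_degree(nodes, radius):
    """Average number of other nodes within `radius` of each node.
    Quadratic in the number of nodes, so only suitable for small checks."""
    r2 = radius * radius
    total = 0
    for i, (xi, yi) in enumerate(nodes):
        for j, (xj, yj) in enumerate(nodes):
            if i != j and (xi - xj) ** 2 + (yi - yj) ** 2 <= r2:
                total += 1
    return total / len(nodes)
```

On a small network the measured mean degree falls slightly below the target because of nodes near the border; the effect shrinks as $N$ grows and the radius becomes small relative to the field.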


The hybrid protocol that we introduced can use any gossip protocol for aggregation within a group. In all the experiments reported here, we used the push-sum protocol described in [49] for performing aggregation using gossip; we use three variations of this protocol: (a) nodes are allowed to talk only with the immediate neighbors, (b) nodes are allowed to talk, using routing, with any node (this is the model assumed in [49]), and (c) nodes are allowed to talk to a logarithmic number of nodes that are distributed randomly throughout the network (this is the model used in peer-to-peer networks).

For experiments in Sections 5.3.2 and 5.4.2, the partitioning of nodes into groups is achieved by splitting the space into a grid with as many cells as groups (the possible number of groups has been restricted so that this is possible). By assuming random placement of the sensors, we can indeed use the formula for the factor $c_2$ derived in Section 5.2. We provide some experiments based on a different grouping technique in Section 5.4.3. A point to be noted here is that the groups, once formed, can be used for multiple instances of aggregation. Thus, the grouping cost when amortized for one instance of aggregation is not significant. In fact, when the gain of using the hybrid protocol, as estimated theoretically by our methodology, is very small (say, < 1.5), we default to the pure protocols (just tree or gossip), instead of incurring the overhead costs of grouping which may not be justified in such cases.

In the tree-based approach, aggregation occurs by forming a spanning tree among the participating nodes by starting with the root node, splitting the space into 4 regions, selecting a random node in each part to be the child of this node and recursing until no nodes are left. This method of splitting will produce a reasonably balanced tree with good locality for the lower part of the tree. This approach is similar to the method of [36]. An alternative is to use flooding, as in [57], in which case the communication between parent and children will be more efficient but the tree may be unbalanced.

For the hybrid protocol, the gossip-based aggregation in the groups is followed by tree aggregation. Here, the spanning tree is formed over representative nodes from the gossip


groups. As a result, a parent-child pair of nodes in the spanning tree might be multiple hops apart from each other and hence multi-hop routing needs to be employed. An important point to note here is that the intermediate nodes which participate in this routing (called, say, `hop-nodes') do not contribute to the faultiness in the aggregation tree. Multi-hop routing techniques such as [48] and [79] are known to be fault-resilient. Since multi-hop routing is a far more fault-resilient operation than tree aggregation, the assumption of not counting the hop-nodes for estimation in our methodology in Equations 5-2 and 5-3 is valid. However, we have also analyzed the scenario in which the multi-hop routing is achieved using simple forwarding and is not fault-resilient, and have presented experimental results in Section 5.4.1.

Simulating very large sensor networks. The ideal testbed for evaluating the hybrid protocol is a real sensor network. Since the hybrid protocol we proposed is designed to deal primarily with large networks (hundreds of thousands to millions of nodes), empirical evaluation on a real sensor network is outside the possibilities of a small research group. Even more, accurately simulating such large networks, for example using some of the available emulators [54][65][32], is computationally infeasible; the largest sensor network simulations can reasonably accommodate only thousands of nodes, and the benefits of the hybrid protocol will usually be significant for much larger networks. Under these circumstances, the only reasonable solution is to use a coarse simulator that will ignore most details of sending the messages using radio communication but will simulate faults. Even simulating the multi-hop routing is overkill for simulating large networks; for this reason we approximate the number of hops with the distance between the communicating nodes divided by the radius of communication. It is important to mention though that the simulator does not just use the model introduced in Section 3.2; it actually simulates sending the messages between sensors and acting on behalf of the sensors as they would behave in a real implementation. While the simplifications in how


the sensor network is simulated/tested are likely to introduce inaccuracies, we believe that qualitatively the behavior of the protocols will be the same in practice.

5.3.2 Experimental Results

We perform two types of experiments to determine the efficiency of the hybrid protocol:

1. We minimize the total number of messages required for an aggregation. This is a fair indicator of the amount of power consumed in the execution of the aggregation algorithm since communication consumes the most significant percentage of power in sensor networks [76].
2. We minimize the time to completion of the hybrid protocol and compare it with the best of the tree and the gossip protocol.

We perform such experiments for the three types of gossip communication: local communication with immediate neighbors, communication with any node using routing, and communication with a random subset of size $\log N$ using routing.

For all the experiments reported in the next three subsections, we use the following conventions:

- For each individual experiment, we provide three plots grouped in a single figure. The first plot is the absolute performance of the quantity measured/optimized for; we plot these values for each of the tree aggregation, gossip aggregation and hybrid aggregation, but only for the largest network since otherwise the graphs would be too crowded. The second plot is the relative gain in performance of the hybrid protocol over the best of the tree and gossip aggregation for various network sizes. On the y-axis, we have the ratio of the minimum of the tree and gossip value to the value of the hybrid protocol; thus, the higher this ratio, the better the relative performance of the hybrid protocol. The third plot is the group size determined as the solution of the optimization problem for the hybrid protocol. For all three plots, the dependency is on the probability of failure, which covers a range of four orders of magnitude.


- The optimization criterion for selecting the group size of the hybrid protocol is always the quantity plotted, e.g. for plots of the total number of messages, the hybrid protocol is optimized to minimize the number of messages.

5.3.2.1 Local gossip communication

When using local gossip communication, nodes are allowed to communicate only with their immediate neighbors when running the gossip part of the protocol; no long distance communication is allowed for gossiping, but it is allowed for tree aggregation. Results of experiments using local gossip communication are reported in Figure 5-4 for message count and in Figure 5-5 for total time to completion.

From the plot of the absolute values of the total number of messages for the three protocols in Figure 5-4a, we observe that the intuitions about the behavior of the tree and the pure gossip protocol are confirmed experimentally. In particular:

1. For small probabilities of failure, the tree performance is five orders of magnitude better than gossip; this behavior is well predicted by the tree fault model and intuition.
2. As the failure probability increases, the performance of the tree deteriorates exponentially to the point that it becomes significantly worse than the rather poor performance of the gossip. Notice that the performance goes from good to very bad when the probability of failure increases four-fold from $10^{-5}$ to $4 \times 10^{-5}$. We predicted this behavior with the tree fault model.
3. The performance of gossip is mostly immune to failures; this property is intuitive, as we pointed out earlier, but it was never confirmed experimentally.
4. The performance of the hybrid protocol coincides with the performance of the tree for small failure probabilities but it stays reasonable even when the tree performance becomes unacceptable. Also, at all times, the hybrid protocol significantly outperforms the gossip protocol.
5. The dependency of the hybrid protocol on the probability of failure is almost linear, but not as flat as is the case for gossip.


Figure 5-4. Comparison of the message counts when using local communication for gossip. (a) Absolute values. (b) Gain of hybrid protocol. (c) Optimal gossip group size.

Figure 5-5. Comparison of the time to completion when using local communication for gossip. (a) Absolute values. (b) Gain of hybrid protocol. (c) Optimal gossip group size.


The same trends are observed in the plot of the absolute values of the time to completion in Figure 5-5a, except that the gap between the tree and gossip for small probabilities of failure is significantly smaller.

From the plots of the gain of the hybrid protocol over the best of tree and gossip in Figures 5-4b and 5-5b we observe that, except for small probability of failure, the hybrid protocol is significantly more efficient both in terms of total number of messages and total time. For different network sizes, the shape of the curve of the gain is very similar; the variations are that for larger networks the transition from no gain to significant gain happens for smaller failure probabilities and that the largest gain is more significant. This means that the advantage of the hybrid protocol widens for larger networks; its use is crucial instead of the pure protocols. Large gains, in the order of 10000, are observed for local communication because the gossip is very inefficient.

More insight can be gained on the way the hybrid protocol works by analyzing the group size determined using the optimization problem. From the plots in Figures 5-4c and 5-5c we observe that, as suspected, for small probabilities of failure, only the tree aggregation is used (group size is 1). At some point (the larger the network, the sooner) the group size is increased almost linearly as a function of the probability of failure.

5.3.2.2 Global gossip communication

When using global gossip communication, nodes are allowed to communicate with any node in the network; routing is used to accomplish this. Results of experiments using global gossip communication are reported in Figure 5-6 for message count and in Figure 5-7 for total time to completion. The same trends we observed for local gossip communication are observed in this case with the following exceptions:

- The gap between the tree and gossip communication for small probability of failures is significantly smaller: $2 \times 10^3$ for message counts in Figure 5-6a and only about 50 for the time to completion in Figure 5-7a.


Figure 5-6. Comparison of the message counts for global gossip communication. (a) Absolute values. (b) Gain of hybrid protocol. (c) Optimal gossip group size.

Figure 5-7. Comparison of the time to completion for global gossip communication. (a) Absolute values. (b) Gain of hybrid protocol. (c) Optimal gossip group size.


- Gains of the hybrid protocol are significantly smaller (the opportunities are not as big since the gap between tree and gossip is smaller), as can be observed from Figures 5-6b and 5-7b. Nevertheless, the hybrid protocol still performs significantly better.

5.3.2.3 Limited global communication

Limited global communication allows nodes to communicate with any node in the network but fixes the number of such nodes for any specific node to the logarithm of the size of the network. In this way, knowledge about only a small number of nodes in the network is required. Thus, this type of communication is more practical than the global gossip. As can be observed from the experimental results in Figures 5-8 and 5-9, the behavior of the limited global communication is very similar to the global communication except that the gossip performance is about 5 times worse both in terms of the number of messages and time to completion (Figures 5-8a and 5-9a). This is reflected as well in the performance of the hybrid protocol. We believe that this is an acceptable compromise between performance and the need to keep every node informed about all the other nodes in the network, which might be impossible to achieve.

5.4 Variations to the Basic Model and Methodology

In this section, we discuss several variations to the fault model and the methodology used in the previous sections for obtaining the parameters for the hybrid protocol. We experiment with each of these variations and observe the impact on the performance of the hybrid aggregation technique. We then discuss the implications of these results later in the section.

5.4.1 Model using Fault-Prone Multi-Hop Routing

The hybrid technique consists of utilizing a small aggregation tree over groups of gossiping nodes. As a result, a pair of parent-child nodes in the aggregation tree would most likely not be within communication radius and would utilize multi-hop routing. The methodology discussed so far assumes that the multi-hop routing is fault-resilient


Figure 5-8. Comparison of the message counts when using limited global communication for gossip. (a) Absolute values. (b) Gain of hybrid protocol. (c) Optimal gossip group size.

Figure 5-9. Comparison of the time to completion when using limited global communication for gossip. (a) Absolute values. (b) Gain of hybrid protocol. (c) Optimal gossip group size.


[48][79]. Here, we analyze the scenario in which the multi-hop routing that is used in tree aggregation is not fault-resilient. Even though the intermediate nodes used for routing are equally prone to failures, usually, the routing operation itself can be easily made fault-resilient by, say, maintaining multiple paths for routing. However, considering that no such efforts are made for fault resilience of routing, here we analyze the impact on the performance of the hybrid protocol.

Proposition 4. Given that the number of links in a tree is $L$ and the probability of failure is $p$, the expected number of reconstructions for a successful tree aggregation operation is $\frac{1}{(1-p)^L}$.

Proof. For a tree with $L$ links, the number of transmissions (barring retransmissions due to transient failures) required for calculating the aggregate once is also $L$. If the probability of failure is $p$, the probability of $L$ successful transmissions toward one successful aggregation is $(1-p)^L$ and the probability of at least one failed transmission leading to a failed aggregation attempt is $1-(1-p)^L$. Consider the random variable $X$ which denotes the number of aggregation attempts leading to a success. Thus, $k-1$ failed attempts followed by the successful attempt occur with probability $P[X=k]$, where

$$P[X=k] = \left[1-(1-p)^L\right]^{k-1} (1-p)^L$$


Thus, the expected value of the number of tree aggregation attempts required for a successful aggregation is

$$\begin{aligned}
E[X] &= \sum_{k=1}^{\infty} P[X=k]\, k = \sum_{k=1}^{\infty} \left[1-(1-p)^L\right]^{k-1} (1-p)^L\, k \\
&= \frac{(1-p)^L}{1-(1-p)^L} \sum_{k=1}^{\infty} k \left[1-(1-p)^L\right]^{k} \\
&= \frac{(1-p)^L}{1-(1-p)^L} \cdot \frac{1-(1-p)^L}{\left[1-\left(1-(1-p)^L\right)\right]^2} = \frac{1}{(1-p)^L}
\end{aligned}$$

The task now is to estimate this $L$, assuming that intermediate hop-nodes are used for forwarding messages. A pair of nodes that are representatives of adjacent gossip groups are, on average, $h$ hops away, with $h = \sqrt{\frac{N/m}{d}}$ for deployment density $d$. Since we are forming a quadtree, the approximate distance between a child-parent pair in the tree increases geometrically by a factor of 2 as the aggregation operation proceeds from the leaf nodes toward the root node. Thus, for a Steiner tree spanning all the $m$ gossip group representatives,

$$L = \sum_{i=0}^{\log_4 m - 1} h\, 2^i\, 4^{\log_4 m - i} = 2h\sqrt{m}\left(\sqrt{m}-1\right)$$

and the expected time for each attempt of tree aggregation in terms of number of hops would be

$$h \sum_{i=0}^{\log_4 m - 1} 2^i = h\left(\sqrt{m}-1\right)$$

Plugging these values into the optimization functions for minimizing the total time taken for aggregation or the total number of messages in Equations 5-2 and 5-3 would yield the parameters required for the hybrid protocol as shown in Equations 5-4 and 5-5. The


actual gain in terms of reduction in the number of messages and the aggregation time, as observed from experiments, is plotted in Figure 5-10a and Figure 5-10b.

$$m^{opt}_{time} = \operatorname*{argmin}_{m \in [1 \ldots N]} \; c_1\, g\!\left(c_2 \frac{N}{m}\right) + \frac{h\left(\sqrt{m}-1\right)}{(1-p)^{2h\sqrt{m}\left(\sqrt{m}-1\right)}} \tag{5-4}$$

$$m^{opt}_{msg} = \operatorname*{argmin}_{m \in [1 \ldots N]} \; N\, g\!\left(\frac{N}{m}\right) + \frac{2h\sqrt{m}\left(\sqrt{m}-1\right)}{(1-p)^{2h\sqrt{m}\left(\sqrt{m}-1\right)}} \tag{5-5}$$

Figure 5-10. Gain of the hybrid protocol in presence of fault-prone multi-hop routing. The gossip phase of the hybrid protocol here uses local communication. (a) Message count. (b) Time to completion.

A significant change from previous observations is the drop in the overall gain of the hybrid protocol to 1 at higher probabilities of failure; pure gossip is employed for aggregation in such cases. This occurs due to fault-prone routing in the tree phase. For a given network size $N$, as the size $m$ of the tree used in the hybrid protocol decreases (i.e. as group sizes increase), there is an increasing penalty in terms of an increasing number of intermediate hop-nodes. Soon, maintaining a tree structure, even a small one, becomes uneconomical at higher failure rates. Hence, the hybrid protocol rightly suggests the use of pure gossip at higher probabilities of failure.
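The quantities entering Equations 5-4 and 5-5 can be checked numerically. The sketch below is illustrative only (all function names are hypothetical): it compares the closed form of $L$ against the direct summation over tree levels, and the closed form of Proposition 4 against a truncated version of the series in its proof. The summation assumes $m$ is a power of 4, as in a complete quadtree.

```python
import math

def hops_between_groups(N, m, density):
    """h = sqrt((N/m) / d): average hops between adjacent group representatives."""
    return math.sqrt((N / m) / density)

def tree_links(h, m):
    """Closed form L = 2 h sqrt(m) (sqrt(m) - 1)."""
    root_m = math.sqrt(m)
    return 2.0 * h * root_m * (root_m - 1.0)

def tree_links_by_sum(h, m):
    """Direct sum over levels: sum_i h 2^i 4^(log4(m) - i); m must be a power of 4."""
    levels = round(math.log(m, 4))
    return sum(h * 2 ** i * 4 ** (levels - i) for i in range(levels))

def expected_attempts(p, L):
    """Proposition 4: 1 / (1 - p)^L, evaluated in log space."""
    return math.exp(-L * math.log1p(-p))

def expected_attempts_by_series(p, L, terms=2000):
    """Truncated series sum_k k (1 - s)^(k-1) s with s = (1 - p)^L."""
    success = (1.0 - p) ** L
    return sum(k * (1.0 - success) ** (k - 1) * success
               for k in range(1, terms + 1))
```

For example, with $h = 2$ and $m = 256$ both forms give $L = 960$, and for $p = 0.001$ the expected number of attempts is about 2.6; the exponent grows with $h$, which is why larger groups (fewer, more distant representatives) make the tree phase increasingly expensive.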


However, for an intermediate range of probabilities of failure, the actual range varying according to the size $N$ of the sensor network, the use of the hybrid protocol still yields high gains. For example, as seen in Figure 5-10b, for $N = 262144$, the hybrid protocol outperforms the pure protocols for $p$ in the range from $10^{-5}$ to 0.001, with gains nearly as high as 1000. For $N = 1024$, the hybrid protocol performs better for $p$ in the range from 0.001 to 0.1.

5.4.2 Robustness and Sensitivity Testing

The simplicity of the fault model used to determine the values of the parameters for the hybrid protocol allows for easy theoretical analysis. The results presented so far validate the gain in performance achieved using the hybrid protocol designed using the fault model. Now, we experimentally verify the robustness of the simple fault model to differences between the actual fault behavior and the assumed fault model. We consider the following scenarios:

1. Inaccuracies in estimation of the parameter $p$ for the fault model: It might not always be possible to determine accurately the probability of failure $p$. The actual probability of failure might differ from the estimated $p$. Consequently, we also investigate how sensitive the performance of the hybrid protocol, designed from the simple fault model, is to deviation of the actual failure rate from the estimated $p$ used as input to the fault model.
2. Presence of correlated failures: Though the simple fault model assumes absence of any correlations in the failure rate, environmental conditions might actually create some kind of correlations in failures; e.g. in a battlefield, some sensors might be overrun by a tank, or in habitat monitoring, sensors in damp, swampy areas may fail more often. We investigate how well the hybrid protocol, designed from this simple fault model, copes in situations where faults are in fact correlated.

5.4.2.1 Sensitivity to estimation of failure rate

To test the sensitivity of the fault model to the actual failure rate, we run experiments wherein the actual failure rate $p_a$ is different from the estimated probability of failure $p_e$. We observe the change in gain of the hybrid protocol when $p_e = f p_a$. Here, we report the results for $f = \frac{1}{16}$, $f = \frac{1}{4}$ and $f = \frac{1}{2}$ in Figure 5-11b and for $f = 2$, $f = 4$ and $f = 16$ in Figure 5-11a. The reported results are for experiments that optimize the number of


Figure 5-11. Sensitivity testing: Gain of the hybrid protocol when the estimated p varies from the actual p. (a) Overestimated p. The gain is not as high as when p is accurately estimated, but is still substantial. (b) Underestimated p. The performance varies a lot and is, in general, poor.

messages using local communication in the gossip phase. The other results are qualitatively similar. From Figure 5-11, we observe that when the actual probability of failure is overestimated, the overall gain obtained by using the hybrid protocol is slightly worse in general than the gain achieved when the estimation is accurate, i.e. when $p_e = p_a$. However, the gain is still significant and follows the trend observed so far when the probability of failure is estimated correctly. In contrast, when the probability of failure is underestimated, the performance deteriorates exponentially. An important thing to note when the probability of failure is underestimated is that the number of instances of aggregation experiments that do not complete (in our experiments we used a cutoff time of $5 \times 10^5$ rounds), as seen in Figure 5-12, rapidly increases. Due to the difficulties of dealing with the results of the experiments that did not complete, the results in Figure 5-11b, which include the results of the experiments that failed to complete, should be interpreted


qualitatively rather than quantitatively. In particular, the performance of the hybrid protocol varies considerably due to wide variation in the time it takes to successfully perform the tree aggregation (the distribution of this time is geometric and, as explained in Section 3.2.3, for small $p$, the standard deviation is almost equal to the expected value, which is high), and in general the performance is poor. This suggests that care should be taken not to underestimate the probability of failure.

Figure 5-12. Completion rate of sensitivity experiments.

These results confirm the theoretical analysis of the fault model in Section 3.2.3, where we showed that the expected number of times a tree has to be reconstructed is $e^q$ with $q = p\, d(N)\, N$. When the probability of failure is underestimated, $q$ is underestimated; thus the number of required reconstructions is underestimated. If the target value for $q$ is 1 and we choose a tree size that achieves this for probability of failure $p_e$, and the probability of failure is in fact $p_a = 4 p_e$, then we get $e^4 \approx 54.6$ tree reconstructions rather than the $e^1 \approx 2.7$ we planned for, which leads to the poor performance observed. When we overestimate the probability, say also by a factor of 4, the number of restarts is $e^{1/4} \approx 1.28$ instead of the estimated $e^1 \approx 2.7$. However, this leads to some performance degradation due to the fact that the gossip groups are made larger than the optimal size; but, the performance


Figure 5-13. Gain of the hybrid protocol in presence of spatially correlated failures. The gossip phase of the hybrid protocol here uses local communication. The performance is still good in spite of the presence of spatially correlated failures. (a) Message count. (b) Time to completion.

degradation is small compared to the underestimation case. These experimental results thus confirm our theoretical analysis and suggest that it is better to be on the safer side and overestimate the probability of failure rather than underestimate it.

5.4.2.2 Robustness to correlated failures

We induce spatial correlation in the actual fault behavior by making a set of sensor nodes in the center of the field fail at a different rate than the rest of the sensor nodes. More specifically, the rate of failure $p_c$ of nodes falling in the circular region at the center encompassing $l\%$ of the unit square field area is $x$ times higher. The results presented here are for $x = 2$, $x = 4$ and $x = 8$ and $l = 20\%$. As we can see in Figures 5-13a and 5-13b, the gains of the hybrid protocol, even in cases of correlated failures, are comparable to the gains obtained in cases where all failures are uncorrelated (Figures 5-4b and 5-5b). It is important to note that the hybrid protocol used here is in fact designed from the simple fault model which assumes no correlation. Note that a slight increase in gains is observed in some cases of correlated faults (here, for $p$ between $10^{-5}$ and $4 \times 10^{-4}$). This is due to the fact that the pure tree


protocol performs even worse in such situations and that the gain plotted is the ratio of the performance of the hybrid protocol to that of the best of pure tree and gossip protocols. This implies that the proposed fault model, which assumes that the faults are uncorrelated, is in fact quite robust to common cases involving correlated faults.

5.4.3 Seeded Clustering

The grid clustering technique used so far for clustering sensor nodes into groups needs location information for each of the sensors. If such a requirement is to be avoided, other clustering techniques can be used [2][5]. We experimented using a simple seeded clustering technique similar to [5]: each node selects itself as a cluster head with a probability proportional to the required number of clusters and grows clusters around it using flooding messages. Since the balls in bins theory cannot be used here, we used a learning technique to estimate the size of the largest gossip group (which is expected to be the slowest) for estimating the factor $c_2$ in Equation 5-2. We derive, experimentally, the size, say $largest(x)$, of the largest group that is formed when the expected group size is $x$. Now, the convergence time of a gossip group of $largest(x)$ nodes is considered when we want to account for the maximum convergence time for an expected group size of $x$. If we can learn the expected value of the slowest convergence rate, say $slowest(x)$, when the expected size of the gossip group is $x$, we can use $slowest(x)$ as $T_{gossip}(x)$ in Equation 5-2. The experimental results provided in Figures 5-14a and 5-14b show that the gains achieved follow a trend similar to those (Figures 5-6b and 5-7b) obtained with the grid clustering technique used previously. While this scheme appears to be more involved than grid clustering, it is worth pointing out that the clustering of nodes into groups is a one-time operation and its cost is amortized over time.
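The procedure for learning $largest(x)$ can be sketched as follows. This is a simplified illustration under stated assumptions, not the clustering code used in the experiments: cluster growth by flooding is approximated by assigning every node to its nearest cluster head in Euclidean distance, nodes are placed uniformly at random in the unit square, and all function names are hypothetical.

```python
import random
from collections import Counter

def seeded_clusters(nodes, n_clusters, rng):
    """Each node elects itself head with probability n_clusters / len(nodes);
    every node then joins its nearest head (a stand-in for flooding growth).
    Returns the list of cluster sizes."""
    prob = n_clusters / len(nodes)
    heads = [pt for pt in nodes if rng.random() < prob]
    if not heads:  # guard against the rare case of no self-elected head
        heads = [rng.choice(nodes)]
    sizes = Counter()
    for (x, y) in nodes:
        nearest = min(range(len(heads)),
                      key=lambda h: (x - heads[h][0]) ** 2 + (y - heads[h][1]) ** 2)
        sizes[nearest] += 1
    return list(sizes.values())

def learn_largest(n_nodes, n_clusters, trials, seed=0):
    """Empirical estimate of largest(x) for x = n_nodes / n_clusters:
    the size of the largest cluster, averaged over several random trials."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        nodes = [(rng.random(), rng.random()) for _ in range(n_nodes)]
        total += max(seeded_clusters(nodes, n_clusters, rng))
    return total / trials
```

The learned value would then replace the balls-into-bins estimate $N/m + \log m$ when evaluating the gossip term of Equation 5-2; $slowest(x)$ could be learned the same way by timing a gossip simulation on the largest cluster of each trial.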


Figure 5-14. Gain of the hybrid protocol using seeded clustering. The gossip phase here uses global communication. Similar results are obtained when local or limited global communication is used in the gossip phase. (a) Message count. (b) Time to completion.

5.5 Discussion

We make the following important observations about the hybrid protocol we proposed:

- Even though the fault model we proposed and used to determine the parameters of the hybrid model ignores correlated failures, the hybrid protocol obtained in this manner is robust with respect to such correlations and deviations from the estimated failure probabilities. This suggests that considering complicated failure models might not provide a significant benefit.
- As we have shown, the greatest benefit of the hybrid protocol is obtained either when the failure rate is very high (this would happen, for example, when sensors are deployed in harsh environments such as volcanoes or hurricanes) or when the size of the network is large (for example, for large scale scientific experiments, say for ocean monitoring).
- Implicitly we assumed that all sensors have the same capabilities with respect to communication rates and, to a lesser degree, reliability. Should this not be the case, the more capable nodes should always participate in the tree protocol, where reliability is crucial. We believe that a modification of the hybrid protocol we propose here can naturally accommodate such situations. We plan to explore this possibility in future work.
- In this chapter we did not explore local fixes as a possibility, mostly because this issue is not explored in the literature and, in our opinion, is nontrivial. While some


implementations of simple local fixes exist, it is not obvious how to attach any notion of correctness to the type of fixes proposed/implemented. Providing some guarantees is crucial when sensor networks are used. We have explored the issue of local fixes in detail in the previous chapter about tree aggregation.
- The notion of correctness we proposed in this work, robust correctness, looks complicated and not very intuitive. As we explained previously, it is not possible to give a simple notion of correctness since it would imply achieving global knowledge of one sort or another. Some of the difficulty with understanding robust correctness comes from the fact that any notion of correctness has to be expressed mathematically; an intuitive/descriptive notion of correctness would not be satisfactory since it would not provide any quantifiable guarantee.

The work in this chapter should be viewed more as exploratory work rather than an actual complete solution. Large sensor networks were not explored in previous literature and, as we have shown, they pose severe difficulties that are not encountered in small sensor networks. In this work, we are demonstrating the feasibility of combining, in a principled way, different simple protocols to significantly improve the performance. We believe this is the only way to scale protocols to large sensor networks.

5.6 Contributions

The goal of this piece of work is to develop such a hybrid protocol and to analyze its properties for large and/or highly faulty sensor networks. More precisely, our contributions are:

1. We propose a hybrid protocol that partitions the sensors into groups and uses gossip for aggregation within the groups and a tree for aggregation between groups. We analyze the fault behavior of the hybrid protocol under the fault model. While the simplicity of the fault model might be of concern, we carry out experiments to show that the hybrid protocol designed using the assumed fault model behaves reasonably when correlated faults are actually present.
2. We introduce methodology and algorithms to determine the right combination between tree and gossip aggregation in the hybrid model. This problem results in the formulation of optimization problems; we explain how the optimal solution can be determined.
3. We study empirically the efficiency of the hybrid protocol and show that it can be significantly better, sometimes by as much as a factor of 1000, than the best of the tree and gossip aggregation. We perform experiments to also show that the hybrid


protocol has the potential to scale to hundreds of thousands of sensor nodes and that the protocol is robust with respect to variations to the fault model.


CHAPTER 6
AGGREGATION IN HETEROGENEOUS SENSOR NETWORKS

A wireless sensor network is a collection of nodes that collaborate to perform sensing, processing, and/or actuation operations by forming ad-hoc networks. Broadly speaking, sensor networks can be classified based on the architecture in which they self-organize: when all nodes are peers and are homogeneous in function, the network is said to have a flat architecture. On the other hand, the nodes might be heterogeneous in form and function. Such heterogeneous sensor networks present an opportunity for further optimization on various fronts such as task allocation, and storage and network traffic allocation. In this chapter, we study the feasibility of utilizing the better reliability of stronger sensor nodes in improving the performance of aggregation algorithms, especially in large sensor networks.

6.1 Motivation

Most of the work in sensor network aggregation [57][64][62][46] deals with sensor networks having uniform or homogeneous sensor nodes; the capabilities of all the sensor nodes are assumed to be similar. Interesting possibilities arise when we consider a sensor network wherein the nodes might have varying capabilities. Such heterogeneous sensor networks have been discussed in literature with a view to exploiting their advantages in terms of better bandwidth and processing power [92][75]. In this chapter, we investigate how to exploit the improved reliability of such microserver nodes in sensor networks. Using the simple fault model used for analyzing fault tolerance and scalability, we model the behavior of aggregation algorithms in such a heterogeneous sensor network.

An accurate theoretical model of the behavior of heterogeneous aggregation in sensor networks would enable the analysis of the options available before deploying sensor network applications. In this chapter, we develop such a model and use an asymptotic approximation that makes the model more efficient and computationally stable. The model can be used to design heterogeneous networks far more efficiently than running


time-consuming simulations or experiments, or worse, using a trial-and-error approach to deployment. We exemplify the use of the model by answering design questions such as "How many microservers are needed to achieve a given performance goal?" We also perform a number of studies on the performance of heterogeneous networks as a function of the number of microservers. One surprising result from these studies is the fact that the performance is not always better if more microservers are added; this makes the optimization problem pertinent. We also show that the theoretical model can be used to accurately determine this optimal number of microservers.

Most of the previous work using heterogeneous sensor networks [73][33][92][75][86][68][38][63] deals with lower level protocols. Heterogeneity has been leveraged for more uniform expenditure of energy [75] and for improvement in delivery rate and expected lifetime of sensor networks [92]. In this work, we leverage the better reliability of the microserver nodes to improve the fault tolerance of aggregation algorithms, and optimize the cost given certain performance constraints.

6.2 Modeling Fault Tolerant Aggregation in Heterogeneous Sensor Network

The sensor network considered here consists of numerous low-power, fault-prone sensor nodes (such as the Berkeley motes, MICA2 motes or Smart Dust [87], hereafter also referred to as just motes) and a few high-powered, more reliable and high-bandwidth microservers like the Intel XScale (Intel XScale Technology: http://www.intel.com/design/intelxscale/). In-network aggregation can be performed by forming a spanning tree amongst all the tree nodes [57][15][46]. Aggregation then starts with the leaf nodes of the tree sending their values to their parent nodes. The intermediate nodes calculate a partial aggregate of these values along with their own value and push this partial aggregate up the tree until the final aggregate is obtained at the root node. However, the presence of faults affects the correctness of this tree aggregation. A few options considered in literature to improve the fault tolerance of homogeneous


tree aggregation include utilizing hash-based schemes [64][62] or using a hybrid of tree and gossip aggregation, as suggested earlier in this research. If the option of utilizing the more reliable microservers is available, the microservers can be utilized as the nodes in the top tier of the tree. Thus, the overall architecture would have an interconnect, a tree, of microservers, and each microserver can then serve as the root of a subtree of motes. Since the microservers are more reliable, the partial aggregate value emerging from the subtree of a microserver can be stored temporarily at the microserver.

Figure 6-1. Tree aggregation in heterogeneous sensor network. (Legend: sensor mote, microserver.)

To ensure correctness, a failure at any node in the tree aggregation would result in the rebuilding of the entire tree. We have assessed this to be the simplest workable strategy that guaranteed robust correctness earlier in this work. However, the presence of the more reliable microservers now allows us to explore another option: instead of triggering the rebuild of the entire tree, a failure at any mote would trigger the rebuilding of just the subtree rooted at the microserver which is the nearest ancestor of the failed mote. If the subtree rooted at any other microserver does run to completion in the meanwhile, the partial aggregate can be buffered at the microserver since it is more reliable and assumed to have fairly stable storage. However, if any of the microservers fail -- the probability of


this happening would be much lower than the probability of the motes failing -- the tree aggregation can be restarted.

We make the following assumptions in our theoretical communication model:

1. The presence of heterogeneous sensor nodes results in different kinds of interactions: sensor mote to sensor mote (S-S), sensor mote to microserver (S-R) and microserver to microserver (R-R). Following the simple fault model proposed earlier in Chapter 3, the probability of failure when two sensor motes communicate with each other (S-S) is given by p. The microserver is more resilient to failures. So, the probability of failure in the interactions between two microservers (R-R) is assumed to be better by a factor of f. The reliability factor f is an input parameter, f ≥ 1, and depends on the reliability of the hardware used as microserver. The probability of failure when a mote interacts with a microserver (S-R) would be between p and p/f. However, to avoid complicating the model at this point, this probability of failure is conservatively estimated to be p.

2. The communication model for both the microservers and the motes is assumed to be similar: a unit disk model of density d sensors. Multi-hop routing is assumed when the communicating nodes are not within communication radius. To limit the discussion here, we do not consider the failures of intermediate nodes in multi-hop routing. This is a reasonable assumption since multi-hop routing is more resilient to failures than the aggregation operation, and we count the number of hops as the time units required for communication between the two nodes.

6.2.1 Theoretical Model

Let the total number of sensors be N. Let us also assume that there are n microservers. According to the architecture discussed previously, if the n microservers form the top of the tree, each of the subtrees rooted at each of the n microservers would be, on average, of size N/n. The time to completion of the tree aggregation would thus be bounded by the time required for the slowest aggregation among the n subtrees followed by the aggregation time among the microservers. Again, this entire process is repeated as many times as the tree aggregation among microservers fails. In short, the following factors contribute to the average time to completion of this heterogeneous aggregation:

1. Subtree aggregation time: This is the time required for aggregation in a subtree of size N/n motes. Though in the absence of failures this aggregation time is given by, say, $T_{N/n}$, it is dominated by the number of restarts, $E[R_{N/n}]$, in the presence of failures


and can be estimated to be $E[R_{N/n}] \times T_{N/n}$.

2. Barrier time: Since, in the presence of failures, the process of tree aggregation is probabilistic, the aggregation time varies among the n subtrees. Since the microserver aggregation cannot proceed without the last of the subtrees completing aggregation, instead of just the average completion time of the aggregation in subtrees, the maximum of the expected aggregation time among the n subtrees needs to be considered. Thus, the total subtree aggregation time can be estimated to be the expected value of the maximum number of restarts among n subtrees, each of size N/n, multiplied by the estimated time for faultless aggregation in a tree of size N/n:

$$E[\mathrm{Max}_n\, E[R_{N/n}]] \times T_{N/n} \qquad (6\text{-}1)$$

3. Aggregation amongst microservers: This is the time $T_n$ required for aggregation amongst the n microservers, which form the top tier of the tree. This time is added to the aggregation time derived in Equation 6-1. However, the microservers might fail, and so the actual aggregation time T would need to consider the $E[R_n]$ restarts of the entire aggregation tree due to failures in the microserver tree.

$$T = \left\{ E[\mathrm{Max}_n\, E[R_{N/n}]] \times T_{N/n} + T_n \right\} \times E[R_n] \qquad (6\text{-}2)$$

We now derive the formulae for each of the terms used in this equation, namely $T_n$, $T_{N/n}$, $E[R_n]$ and $E[\mathrm{Max}_n\, E[R_{N/n}]]$.

Evaluating $T_n$ and $T_{N/n}$: The time per aggregation round in a tree of microservers of size n can be expressed in terms of the average number of hops from the leaf to the root. The average number of hops is the ratio of the average distance to the root and the average radius of communication. For a density of d sensors in the radius of communication, and given that the N sensors are spread uniformly at random in the unit square, the radius of communication r is $\sqrt{d/N}$. The average distance is calculated assuming that the distance between the parent and child nodes increases geometrically as the aggregation progresses up the tree. Thus, the time taken for a round of aggregation in the absence of any failures


for the microserver tree $T_n$ and the sensor mote subtrees $T_{N/n}$ can be expressed as follows:

$$T_n = \sum_{i=\log n}^{1} \frac{1}{2^i}\,\frac{1}{r} = \left(1 - \frac{1}{\sqrt{n}}\right)\sqrt{\frac{N}{d}} \qquad (6\text{-}3)$$

$$T_{N/n} = \sum_{i=\log N}^{\log n} \frac{1}{2^i}\,\frac{1}{r} = \left(\frac{1}{\sqrt{n}} - \frac{1}{\sqrt{N}}\right)\sqrt{\frac{N}{d}} \qquad (6\text{-}4)$$

Evaluating $E[R_n]$ and $E[\mathrm{Max}_n\, E[R_{N/n}]]$: The number of times the process of tree aggregation restarts is a geometric random variable. Thus, for a microserver tree of size n and probability of failure p/f, the expected number of times the tree of microservers restarts is

$$E[R_n] = \frac{1}{(1 - p/f)^n} \qquad (6\text{-}5)$$

Estimating the term $E[\mathrm{Max}_n\, E[R_{N/n}]]$ is more complicated since it involves taking the expected value of the maximum of a set of random variables. However, we know that the number of times the tree aggregation restarts, $R_{N/n}$, is a geometric random variable with the probability of success $P = (1 - p)^{N/n}$. The probability distribution of the order statistics of a set of geometric random variables is a well-studied problem [82][50]. The first moment of the maximum of a set, of size n, of geometric random variables having probability of success $P = 1 - Q$ can be expressed as the sum

$$E[\mathrm{Max}_n\, E[R_{N/n}]] = M_n = -\sum_{k=1}^{n} (-1)^k \binom{n}{k} \frac{1}{1 - Q^k} \qquad (6\text{-}6)$$

However, this equation is a sum of alternating positive and negative values and is known to be computationally unstable for large n [82]. Thus, for large n, we use the following asymptotics that hold for the average $M_n$ as described in [82].


$$M_n = \frac{\log n}{\log q} + \frac{\gamma}{\log q} + \frac{1}{2} + P_0(\log_q n) + O(n^{-1}) \qquad (6\text{-}7)$$

where log denotes the natural logarithm, q is defined as $Q^{-1}$ and $\gamma = 0.577$. The fluctuating function $P_r(x)$ is defined as follows:

$$P_r(x) = \frac{1}{\log q} \sum_{k=-\infty,\; k \neq 0}^{\infty} \Gamma(r + \chi_k)\, \exp(-2\pi i k x)$$

where $r = 0, 1, 2, \ldots$ and $\chi_k = 2\pi i k / \log q$, $k = 0, \pm 1, \ldots$

Thus, substituting Equations 6-3, 6-4, 6-5 and 6-7 in Equation 6-2, we get an expression that can be used as a theoretical model for estimating the time to completion of tree aggregation in heterogeneous sensor networks.

6.2.2 Design Decisions using Theoretical Model

The theoretical model is an easy-to-use tool for analyzing the impact of using heterogeneous sensor nodes. The only parameters to the model are the sensor network size N, the number of microservers n, the probability of failure p, and the improvement in reliability of microservers f. N and n are input parameters, while p and f vary according to the application and can be estimated by measuring the error rates using a few sensor motes and microservers.

The availability of an accurate theoretical model aids in making design decisions. With the other options being running simulations or actually deploying to test the configurations, utilizing the theoretical model is the most efficient choice. Deploying a test application to observe the behavior of different configurations can be prohibitively costly. On the other hand, the simulations, since they need to be randomized, take multiple runs


for each configuration of the sensor network to arrive at stable statistics. Thus, simulating various configurations to choose the best configuration can also be very time-consuming. For example, it took days of simulation on tens of high performance Opteron processors to validate our theoretical model. However, with a theoretical model, the behavior of multiple configurations can be predicted in a few seconds. This ability to quickly predict the aggregation performance of various heterogeneous sensor network configurations can be used to answer various design questions:

- Given that the time to complete one round of aggregation in an application is constrained by, say, $T_{max}$, what is the minimum number of microservers required?
Knowing the size of the sensor network N, the probability of failure of motes p, and the reliability factor f of the microservers, we can estimate the average time to completion $T_n$ for a range of number of microservers n = 1, 2, ..., N using the theoretical model. If a feasible solution does exist, the minimum n such that $T_n \le T_{max}$ can be used as the required number of microservers.

- When the size of the sensor network N > c/p, for a small constant c (on the order of 10), the performance of pure tree aggregation can be prohibitively bad. In terms of sustaining the aggregation process at higher probabilities of failure, how much more reliable can you make the system by adding microservers?
We use the more reliable microservers to sustain the aggregation process at a higher probability of failure. The effective improvement in the sustainable probability of failure can then be calculated by equating the time to completion using the heterogeneous model with the maximum acceptable completion time:

$$T = \left\{ E[\mathrm{Max}_n\, E[R_{N/n}]] \times T_{N/n} + T_n \right\} \times E[R_n] \le T_{acceptable}$$

- Now, given that the cost of the microservers varies with their reliability factor f, what is the optimal number and type of microservers in terms of cost?


Knowing N, p, $T_{acceptable}$, and having f as a parameter, this equation can be solved for the minimum n. This gives us a set of feasible {f, n} pairs. One of these feasible configurations can then be chosen depending on the cost of the various microservers. Thus, having a theoretical model, this problem can be modeled and solved as an optimization problem.

A more general question that can be asked is "How do we optimize the cost along with the performance of a sensor network application?" The performance of an application in a sensor network may be measured in terms of the various characteristics of the network (such as the expected lifetime of the network and the delivery rate) and the frequent operations performed in the application (such as sensing, information dissemination and aggregation). All such criteria important for that application, along with the total cost of the hardware, can be factored in to formulate a global optimization problem, with a relative weight associated with each performance criterion and the cost. This weighted sum, or any other optimization function, creates a trade-off between various criteria; for example, a shorter time to completion can be traded for the cost of a larger number of cheaper microservers, or a smaller number of the more reliable and expensive microservers. Having a simple theoretical model enables efficient evaluation of any configuration in the process of, say, convex optimization, instead of performing expensive experiments or time-consuming simulations. If the theoretical model is computationally fast, even a complete linear sweep of the entire configuration space might be plausible.

6.3 Experiments

In this section, we explore the various design questions posed in the previous section. We also verify the accuracy of the theoretical model by comparing it to actual results from experiments using the large scale sensor network simulator used earlier in this research. The goals of our experiments are as follows:

1. To validate the correctness of the theoretical model. We verify that the number of microservers required for optimal performance of heterogeneous aggregation as


observed from the experiment results concurs with the predictions of the theoretical model.

2. To find the effective increase in the probability of sustainable failure of the sensor network.

3. To find the improvement in the completion time of aggregation: we compare the performance of the proposed heterogeneous aggregation with the performance of pure homogeneous tree aggregation.

We now briefly describe how the simulation environment is set up for our experiments, followed by answers to these questions in the following subsections.

6.3.1 Setup

For a sensor network of size N, with n among these as microservers, all the sensors are assumed to be placed uniformly at random in a unit square field. A unit disk communication model is assumed with an average density of d = 32 sensors. A spanning tree is formed by starting with the root node, splitting the space into 4 regions, selecting a random node in each part to be the child of this node and recursing until no nodes are left. This quad-tree approach is similar to the method of [36] and will produce a reasonably balanced tree with good locality for the lower part of the tree. An alternative is to use flooding, as in [57], in which case the communication between parent and children will be more efficient but the tree may be unbalanced. The first n nodes chosen this way are marked as microservers and the rest are treated as sensor motes (the difference being in their probability of failure: p for the motes, and p/f for the microservers). This ensures that the top portion of the tree is made up of microservers. Note that this also implies that the microservers are more evenly spread out across the field, which is the typical case in deployment of heterogeneous sensor networks.
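The quad-tree construction described in the setup can be sketched as follows. This is a minimal illustration, not the simulator used in this research: the function name `build_quadtree` and its arguments are hypothetical, and the sketch assumes that "the first n nodes chosen" refers to breadth-first selection order, so that the microservers end up in the top tiers of the tree.

```python
import random
from collections import deque


def build_quadtree(points, n_micro, rng):
    """Build a quad-tree style spanning tree over points in the unit square.

    Returns (parent, is_micro): parent[i] is the parent index of node i
    (None for the root); is_micro[i] marks the first n_micro nodes chosen.
    """
    parent = [None] * len(points)
    order = []  # nodes in the order they are chosen (breadth-first)

    unassigned = list(range(len(points)))
    root = rng.choice(unassigned)
    unassigned.remove(root)
    order.append(root)

    # Each entry: (node, x0, y0, side, unassigned nodes inside that square).
    queue = deque([(root, 0.0, 0.0, 1.0, unassigned)])
    while queue:
        node, x0, y0, side, members = queue.popleft()
        half = side / 2.0
        # Split the square into (up to) four quadrants.
        quadrants = {}
        for i in members:
            x, y = points[i]
            key = (x >= x0 + half, y >= y0 + half)
            quadrants.setdefault(key, []).append(i)
        # Pick one random node per non-empty quadrant as a child, then recurse.
        for (right, top), group in quadrants.items():
            child = rng.choice(group)
            group.remove(child)
            parent[child] = node
            order.append(child)
            queue.append((child, x0 + half * right, y0 + half * top, half, group))

    is_micro = [False] * len(points)
    for i in order[:n_micro]:
        is_micro[i] = True
    return parent, is_micro
```

Because a parent is always chosen before its children, every microserver other than the root has a microserver parent, which matches the statement that the top portion of the tree is made up of microservers.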


Aggregation starts with the leaf nodes sending their values to the parents. Each intermediate node aggregates (footnote 2) the values from all its children with its own value and sends the partial aggregate to its parent. However, the more important scenario is the behavior in the presence of faults. If any of the motes fail, the aggregation is restarted only for the subtree rooted at the microserver which is the closest ancestor to the failed mote. This may cause some of the microservers to obtain partial aggregates considerably earlier than other microservers whose subtrees might encounter more failures. The microservers, due to their more reliable storage, can buffer these partial aggregates. However, if any of the microservers fail, the whole aggregation process is restarted. The cost of reconstruction of the tree is assumed to be zero; this cost is proportional to the cost of aggregation in a fault-less tree. At the end of a successful aggregation, the final aggregate is obtained at the root node.

Footnote 2: The aggregation operation performed here is Sum, but it can be extended to many other operators [57][64] without any change in the analysis.

6.3.2 Verification of Theoretical Model

First, we validate the theoretical model with results from simulation experiments. We begin by observing a key statistic in our theoretical model, the maximum of the number of restarts among the n subtrees, and comparing it with the experimental observation. We plot the maximum number of restarts against the number of microservers in Figure 6-2 for various probabilities of failure. As seen from Figures 6-2a and 6-2b, the observed number of restarts closely matches the theoretical prediction for the number of restarts. This observation is consistent across different {N, p, f, n} configurations of the sensor network. We also plot the average time to completion of the aggregation operation as observed from the experiments in Figure 6-3b, and as predicted by the theoretical model in Figure 6-3a. We again observe that the simulation results are consistent with the


theoretical model. The predicted time to completion is typically slightly higher in each case because of the conservative approach in modeling, wherein the time spent for each tree aggregation attempt is assumed to be the time required for a successful aggregation in a fault-less environment.

Figure 6-2. Comparing the expected maximum number of subtree restarts as predicted by the theoretical model and as observed in simulation results. The behavioral trend of the number of restarts is accurately predicted by the theoretical model. Here, N = 16384 and f = 16. (a) Theoretical prediction. (b) Simulation result.

The optimum number of microservers can be obtained by minimizing the time to completion. However, as can be observed in Figures 6-3a and 6-3b, most of the improvement in the performance metric is obtained with microservers that are far fewer in number than the absolute optimum in most cases. The curve mostly plateaus out


after this knee. We define the knee to be the minimum number of microservers in a heterogeneous sensor network configuration such that the difference from the optimum performance is within a certain small percentage k of the optimum performance. We tabulate such reasonable number of microservers at the knee, for k = 5%, in Table 6-1. As can be seen, the theoretical model prediction of the knee almost always matches the outcome of the simulation experiments.

Figure 6-3. Behavioral trends in the completion times as observed in simulation experiments are similar to the theoretical predictions. Here, N = 16384 and f = 16. The factors of 4 in the reporting of number of microservers n in the plots is an artifact of the analysis being geared toward comparison with the quad-tree simulation, and is not a limitation of the theoretical model. (a) Theoretical prediction. (b) Simulation result.
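The knee defined above can be located by sweeping the theoretical model of Equation 6-2 over candidate values of n. The following is a minimal sketch under stated assumptions, not the code used to produce the reported results. All function names are hypothetical. The round times use the closed forms of Equations 6-3 and 6-4, d = 32 is taken from the setup, and the sweep covers powers of 4 up to N/4. One deliberate substitution: instead of the asymptotic expansion of Equation 6-7, the expected maximum M_n is evaluated with the tail-sum identity E[max] = sum over j >= 0 of (1 - (1 - Q^j)^n), which is algebraically equal to Equation 6-6 but has only non-negative terms.

```python
import math


def micro_tree_time(N, n, d):
    """Fault-free round time of the microserver tree (closed form of Eq. 6-3)."""
    return (1.0 - 1.0 / math.sqrt(n)) * math.sqrt(N / d)


def subtree_time(N, n, d):
    """Fault-free round time of one mote subtree (closed form of Eq. 6-4)."""
    return (1.0 / math.sqrt(n) - 1.0 / math.sqrt(N)) * math.sqrt(N / d)


def micro_restarts(n, p, f):
    """Expected restarts of the microserver tree, Eq. 6-5: (1 - p/f)^(-n)."""
    exponent = -n * math.log1p(-p / f)
    return math.exp(exponent) if exponent < 700.0 else math.inf


def expected_max_restarts(n, Q, tol=1e-12, max_terms=200_000):
    """Expected maximum of n geometric restart counts with failure probability Q.

    Tail-sum form of Eq. 6-6. Returns inf when the sum has not converged
    within max_terms terms (an assumption made to keep this sketch fast).
    """
    if Q >= 1.0:
        return math.inf
    total = 1.0  # j = 0 term: 1 - (1 - 1)^n
    qj = Q
    for _ in range(max_terms):
        term = -math.expm1(n * math.log1p(-qj))  # 1 - (1 - Q^j)^n
        total += term
        if term < tol:
            return total
        qj *= Q
    return math.inf


def completion_time(N, n, p, f, d=32):
    """Expected completion time T of Eq. 6-2 with n microservers."""
    Q = -math.expm1((N / n) * math.log1p(-p))  # 1 - (1 - p)^(N/n)
    m_n = expected_max_restarts(n, Q)
    inner = m_n * subtree_time(N, n, d) + micro_tree_time(N, n, d)
    return inner * micro_restarts(n, p, f)


def optimum_and_knee(N, p, f, d=32, k_percent=5.0):
    """Return (optimal n, knee n, {n: T}) over n = 1, 4, 16, ..., N/4."""
    candidates = []
    n = 1
    while n <= N // 4:
        candidates.append(n)
        n *= 4
    times = {n: completion_time(N, n, p, f, d) for n in candidates}
    best_n = min(candidates, key=lambda c: times[c])
    threshold = times[best_n] * (1.0 + k_percent / 100.0)
    knee_n = min(c for c in candidates if times[c] <= threshold)
    return best_n, knee_n, times
```

A call such as `optimum_and_knee(16384, 0.00131072, 16)` evaluates one of the configurations listed in Table 6-1. Since this sketch omits details of the original analysis, its output should be treated as indicative rather than as a reproduction of the tabulated values.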


Table 6-1. Optimal number of microservers as predicted by the theoretical model and as observed from the simulation results. Here, N = 16384 and f = 16.

p            Theoretical   Observed
0.00032768   16            4
0.00131072   64            64
0.00524288   256           256
0.02097150   256           256

6.3.3 Improvement in Fault Tolerance

The maximum sustainable probability of failure for a heterogeneous sensor network can be defined as the maximum probability of failure at which the aggregation operation can be completed in a reasonable amount of time. The reasonable amount of time can be assumed to be a small constant c times the amount of time required at a very low probability of failure $p_{low}$. The sustainable probability of failure is an important statistic since it is indicative of the improvement in fault tolerance of tree aggregation using heterogeneous sensor nodes. In the results reported here, c = 2 and $p_{low} = 10^{-8}$. Note that we would observe qualitatively similar results for different values of c and $p_{low}$.

We plot the sustainable probability of failure against the number of microservers used in Figure 6-4. The factor of reliability is held constant at f = 16 in Figure 6-4a and the trend of the sustainable probability of failure versus the number of microservers is observed for various network sizes. As expected, the sustainable p initially increases, almost linearly, with increase in the number of microservers. However, at higher percentages of microservers relative to the sensor network size, the sustainable p in fact decreases, as clearly observed for different network sizes in Figure 6-4a. A similar trend is observed for different values of f in Figure 6-4b. This confirms that the sustainable probability of failure, and hence the fault tolerance, is influenced by both n and f in heterogeneous sensor networks, in addition to the size N and the probability of mote failure p.

In general, the sustainable p is bounded by two linear functions: one due to the maximum p that can be sustained for tree aggregation in each of the subtrees of size roughly N/n, and the other due to the maximum value of p imposed due to aggregation over a tree of n microservers. The first bound restricts sustainable p to be

Figure 6-4. Sustainable probability of failure versus the number of microservers. The sustainable p does not always increase with more number of microservers. (a) For various network sizes at a fixed reliability factor f = 16. (b) For various reliability factors at N = 65536. (c) Maximum bounds for sustainable p.


the second bound restricts the sustainable p to be [...] 10, the gain increases exponentially. The corresponding optimal number of microservers is also plotted. It is interesting to note that the optimal number of microservers quickly transitions from n = 1 (homogeneous tree of motes) to n = N/4 (the maximum number of microservers considered). So we investigated


the actual time to completion as a function of the number of microservers at around those probabilities of failures in Figure 6-5b and observed the following:

- For lower probabilities of failure, the time to completion actually increases with increase in the number of utilized microservers. The area of interest is marked as A in Figure 6-5b. This is due to the behavior of the $M_n$ function (Equation 6-7); that is, even though the average number of restarts might always decrease with the number of microservers, the expected value of the maximum of the expected number of restarts among n subtrees could increase with n. This suggests that an increase in the number of microservers might actually harm the completion time, which underlines the importance of analysis as undertaken in this piece of work.

- The more interesting observation here is that most of the reduction in the time to completion is obtained with the first few microservers. Thus, a considerable improvement in performance is achievable with the cost of a few microservers. This also suggests that having a cost model in place that factors in the cost of the microservers in the optimization criterion would be interesting.

Figure 6-6. Time versus number of microservers for different reliability factors. N = 1048576 and p = 0.01048576.

To aid in performing cost-benefit analysis, the completion time is plotted against the number of microservers for different reliability factors in Figure 6-6. It suggests the existence of trade-offs between performance, in terms of the time to completion, and the cost of the microservers. The best performance is observed at (f, n) = (1024, 16384), and


the least cost is obtained at (f, n) = (16, 4096). However, if the maximum acceptable time to completion is $T_1$, the most cost-effective solution is obtained at point A in Figure 6-6, which is (f, n) = ([...], 4096). Also, if the maximum acceptable time to completion is lowered to $T_2$, among the feasible configuration set {f, n}, either B or C can be the most cost-effective configuration, depending on the cost of microservers with f = 1024 and f = 256. This exemplifies the application of the theoretical model to enable analysis based design for optimizing the cost-benefit of sensor network deployments.

6.4 Hybrid Aggregation with Heterogeneous Nodes

In this chapter so far, we have utilized the availability of a few of the more reliable sensor nodes for improving the fault tolerance of an aggregation tree. We had earlier proposed the hybrid technique to improve the fault tolerance and scalability of aggregation algorithms in homogeneous sensor networks. In this section, we extend the hybrid aggregation technique to heterogeneous sensor networks.

The technique that shows significant improvement in fault tolerance and performance of the system using homogeneous sensor nodes is hybrid aggregation. The hybrid technique involves organizing the homogeneous sensor nodes in ad-hoc groups of optimal size. Within each group, the gossip technique [49] is utilized for performing aggregation. An aggregation tree is then formed between these gossip groups. Since all the nodes in a gossip group converge to a group aggregate, any of the nodes can be elected [37][85] as a group representative, though the more reliable microservers, if present in the group, would be given a preference. The fact that all the group members converge to the same group aggregate also ensures that the gossip phase for group aggregation need not be repeated even if the representative node fails: another node in the group can take over and hence the partial aggregate of the group is not lost.

In this section, we explain how a few microservers can be used to further leverage fault tolerance in such a hybrid aggregation technique. As seen previously in this chapter, the use of such heterogeneous microservers results in the improvement in fault tolerance


of tree aggregation at the cost of a few of the more expensive microservers. Here, we utilize the microservers in the top tier of the tree part of hybrid aggregation. Intuitively, this should strengthen the aggregation in the tree section, thus increasing the overall performance of hybrid aggregation.

Let m be the number of gossip groups formed. Each of the gossip groups, of expected size N/m, would proceed in rounds to converge to a group aggregate, at the end of which a group leader is elected. As noted previously, the cost of group formation and leader election can be amortized since gossip groups, once formed, can be used for multiple rounds of aggregation. More preference can easily be given to a microserver, as compared to a mote, in the leader election process. Thus, at the end of the gossip phase, we would have m elected nodes, n among which are microservers. These m nodes then form a spanning tree, as previously in this chapter, with the n nodes forming the top tier of the tree. Then, aggregation proceeds as explained previously in this chapter. So, an equation can be formulated to characterize the completion time of this algorithm from Equations 5-1 and 6-2 as follows:

$$T = c_1\, g\, c_2\, \frac{N}{m} + \left\{ E[\mathrm{Max}_n\, E[R_{m/n}]] \times T_{m/n} + T_n \right\} \times E[R_n] \qquad (6\text{-}8)$$

Here the first term is the gossip-phase time carried over from Equation 5-1, and the remaining terms follow Equation 6-2 with the tree now formed over the m elected nodes.

Note that at the end of the gossip phase, all the nodes in a gossip group converge to the same partial aggregate. As a result, the gossip phase, once completed, need not be repeated even if any of the microservers fail in the tree; any other node in its gossip group can instead contribute, rendering the entire algorithm more fault tolerant. The completion time T can be optimized for a given size N of the sensor network, and probabilities of failure p and p/f, for different values of m, the number of gossip groups, and n, the number of microservers. We verify this theoretical model with simulation experiments in the next section.

Simulation results. The goals of our experiments with hybrid aggregation in heterogeneous sensor networks are as follows:


1. To verify the theoretical model by simulation, and

2. To compare the performance of heterogeneous hybrid aggregation with heterogeneous tree aggregation.

We use the same setup for our experiments as used previously: a unit square field with sensor nodes distributed uniformly at random. Groups of sensor motes are formed utilizing the grid clustering technique, though other techniques like seeded clustering would essentially result in similar performance. Aggregation then proceeds as explained in the previous section. Here, we note that for all of the n microservers to be present near the root of the logical spanning tree, the microservers would need to be more uniformly spread across the sensor field. However, this would actually be the case, as seen in typical deployments of heterogeneous sensor networks [75][92].

Figure 6-7. Comparing the expected time to completion as predicted by the theoretical model and as observed in simulation results. The behavioral trend of the time to completion of the hybrid aggregation with heterogeneous nodes is accurately predicted by the theoretical model. Here, N = 4096, f = 16 and p = 0.16. (a) Theoretical prediction. (b) Simulation result.

We performed the simulations for different network sizes N and probabilities of failure p, and experimented with different configurations of the hybrid heterogeneous aggregation (different combinations of m and n). We plot the results for N = 4096 and p = 0.16 in Figure 6-7. In this case, the reliability of the microservers is assumed to be f = 16 times better than the motes. The time to completion is plotted for different numbers of groups


m (the tree size) and different numbers of microservers n. The trends as observed from the theoretical model are plotted in Figure 6-7a and the simulation results are plotted in Figure 6-7b. The results show that the trends as observed from simulation experiments closely match the trends as predicted by the theoretical model. We also note that if the number of microservers n that can be used is not constrained, the best performance in terms of time to completion improves with increasing tree size m and then degrades again if the tree size is further increased. This shows that the function is not monotonic, and the optimization problem considered in this piece of work is pertinent.

Figure 6-8. Hybrid aggregation versus tree aggregation in heterogeneous sensor networks.

We also compared the performance between tree aggregation and hybrid aggregation in heterogeneous sensor networks. Results from simulations are plotted in Figure 6-8. For a sensor network of size N = 4096 at mote failure probability p = 0.16 and microserver failure probability p/f = 0.01, hybrid aggregation, with number of gossip groups m = 256, performs almost at its best with n = 16 microservers; incidentally, this configuration would basically have a tree of microservers which are representatives of their gossip groups. In comparison, the pure heterogeneous tree would need n = 256 microservers to be at its best, compared to the n = 16 microservers required for hybrid aggregation. On top


of that, the performance of the hybrid technique is approximately two orders of magnitude better than tree aggregation in this case. We have also plotted the performance of the hybrid aggregation in networks of size N = 16384 in Figure 6-8. It is still considerably better than tree aggregation at N = 4096. This shows that the hybrid technique is considerably more scalable than the pure tree technique, even in heterogeneous sensor networks.

6.5 Contributions

In this chapter, we explored the heterogeneity in the type of sensor nodes to provide more reliability and dependability to tree aggregation. We proposed a simple theoretical model for aggregation in heterogeneous sensor networks that makes it easy to analyze various configuration options while designing sensor networks specific to any application. We verified the accuracy of the theoretical model with simulation results. We observed that a surprisingly small number of microservers is required for obtaining substantial gains over aggregation in homogeneous sensor networks. We also observed that an increase in the number of microservers used might actually be detrimental to the performance of the system.

We then used the hybrid technique, instead of pure tree-based methods, for aggregation. Once again, we verified its theoretical model with simulations. We also observed that the hybrid aggregation performs better than tree aggregation, as expected, even in heterogeneous settings. This also exemplifies the generality of the framework and its utilization as a tool for analysis based design of heterogeneous sensor networks.


CHAPTER 7
CONCLUSION

In this research, we have worked on the fault tolerance and scalability of data aggregation algorithms in sensor networks. We have established a simple fault model that captures parameters important to fault tolerance of sensor networks. We have then used this fault model for the analysis of various existing aggregation techniques, like tree and gossip-based aggregation algorithms. We have also verified all our theoretical analysis with extensive simulation experiments. We observed that the current tree-based algorithms are not scalable; in fact, if the effects of faults are ignored, the tree-based algorithms do not guarantee correctness. We also analyzed some techniques suggested in literature to improve tree aggregation, such as multiple trees and local fixes, and observed that the gain in fault tolerance due to these enhancements is not substantial.

We have also proposed a novel hybrid aggregation technique that optimally leverages the efficiency of aggregation trees and the robustness of gossip aggregation. We observed that such a hybrid scheme is highly scalable, with performance gains several orders of magnitude better than any pure aggregation scheme.

We have also explored the opportunity of leveraging the better reliability of the microserver nodes in aggregation schemes. We have established a framework that helps us analyze the fault tolerance of the aggregation schemes in typical usage of such heterogeneous sensor networks. Since the cost of the more reliable nodes is higher, we also performed a cost-benefit analysis to find an optimal configuration for minimizing cost under some performance constraints.

We believe that this work provides a general framework for analysis based design of aggregation algorithms, before the sensor networks are actually deployed, as compared to taking a calculated guess, or worse still, a trial-and-error approach to deployment.


REFERENCES

[1] I. Akyildiz, W. Su, Y. Sankarasubramaniam, and E. Cayirci. A survey on sensor networks, 2002.
[2] A. D. Amis, R. Prakash, D. Huynh, and T. Vuong. Max-min d-cluster formation in wireless ad hoc networks. In IEEE INFOCOM, pages 32-41, Tel Aviv, Israel, 2000. IEEE Press.
[3] A. Arora, P. Dutta, S. Bapat, V. Kulathumani, H. Zhang, V. Naik, V. Mittal, H. Cao, M. Demirbas, M. Gouda, Y. Choi, T. Herman, S. Kulkarni, U. Arumugam, M. Nesterenko, A. Vora, and M. Miyashita. A line in the sand: A wireless sensor network for target detection, 2004.
[4] A. Arora, R. Ramnath, E. Ertin, P. Sinha, S. Bapat, V. Naik, V. Kulathumani, H. Zhang, H. Cao, M. Sridharan, S. Kumar, N. Seddon, C. Anderson, T. Herman, N. Trivedi, C. Zhang, M. Nesterenko, R. Shah, S. Kulkarni, M. Aramugam, L. Wang, M. Gouda, Y. ri Choi, D. Culler, P. Dutta, C. Sharp, G. Tolle, M. Grimmer, B. Ferriera, and K. Parker. Exscal: Elements of an extreme scale wireless sensor network. rtcsa, 00:102-108, 2005.
[5] S. Bandyopadhyay and E. Coyle. An energy efficient hierarchical clustering algorithm for wireless sensor networks. In Proceedings of the 22nd Annual Joint Conference of the IEEE Computer and Communications Societies (Infocom 2003), volume 3, pages 1713-1723, San Francisco, CA, USA, April 2003. Springer-Verlag.
[6] S. Bapat, V. Kulathumani, and A. Arora. Analyzing the yield of exscal, a large-scale wireless sensor network experiment. In ICNP '05: Proceedings of the 13th IEEE International Conference on Network Protocols, pages 53-62, Washington, DC, USA, 2005. IEEE Computer Society.
[7] M. Bawa, H. Garcia-Molina, A. Gionis, and R. Motwani. Estimating aggregates on a peer-to-peer network. Technical report, Stanford, 2003.
[8] P. Bonnet, J. Gehrke, and P. Seshadri. Towards sensor database systems. In MDM '01: Proceedings of the Second International Conference on Mobile Data Management, pages 3-14, London, UK, 2001. Springer-Verlag.
[9] P. Bonnet, J. Gehrke, and P. Seshadri. Querying the physical world. Personal Communications, IEEE [see also IEEE Wireless Communications], 7:10-15, Oct 2000.
[10] A. Cerpa, J. Elson, D. Estrin, L. Girod, M. Hamilton, and J. Zhao. Habitat monitoring: application driver for wireless communications technology. SIGCOMM Comput. Commun. Rev., 31(2 supplement):20-41, 2001.


[11] K. Chakrabarty, S. S. Iyengar, H. Qi, and E. Cho. Grid coverage for surveillance and target location in distributed sensor networks. IEEE Transactions on Computers, 51:1448-1453, 2002.
[12] J.-Y. Chen, G. Pandurangan, and D. Xu. Robust aggregates computation in wireless sensor networks. In Proceedings of the Fourth International Conference on Information Processing in Sensor Networks (IPSN), Los Angeles, California, USA, 2005. IEEE.
[13] C.-Y. Chong and S. Kumar. Sensor networks: evolution, opportunities, and challenges. Proceedings of the IEEE, 91(8):1247-1256, Aug. 2003.
[14] T. Clouqueur, K. K. Saluja, and P. Ramanathan. Fault tolerance in collaborative sensor networks for target detection. IEEE Transactions on Computers, 53:320-333, 2004.
[15] J. Considine, F. Li, G. Kollios, and J. Byers. Approximate aggregation techniques for sensor databases. In ICDE '04: Proceedings of the 20th International Conference on Data Engineering, page 449, Washington, DC, USA, 2004. IEEE Computer Society.
[16] D. Culler, D. Estrin, and M. Srivastava. Overview of wireless sensor networks. IEEE Computer, Special Issue in Sensor Networks, 37:41-49, August 2004.
[17] A. Deshpande, C. Guestrin, W. Hong, and S. Madden. Exploiting correlated attributes in acquisitional query processing. In ICDE '05: Proceedings of the 21st International Conference on Data Engineering, pages 143-154, Washington, DC, USA, 2005. IEEE Computer Society.
[18] A. Deshpande, C. Guestrin, S. R. Madden, J. M. Hellerstein, and W. Hong. Model-driven data acquisition in sensor networks. In VLDB '04: Proceedings of the Thirtieth International Conference on Very Large Data Bases, pages 588-599. VLDB Endowment, 2004.
[19] A. Deshpande, S. Nath, P. B. Gibbons, and S. Seshan. Cache-and-query for wide area sensor databases. In SIGMOD '03: Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data, pages 503-514, New York, NY, USA, 2003. ACM.
[20] M. Ding, X. Cheng, and G. Xue. Aggregation tree construction in sensor networks. In Vehicular Technology Conference, Orlando, Florida, USA, 2003. IEEE.
[21] EarthScope: An Earth Science Program. www.earthscope.org/.
[22] D. Eckhardt and P. Steenkiste. Measurement and analysis of the error characteristics of an in-building wireless network. In SIGCOMM, pages 243-254, 1996.
[23] D. Estrin, D. Culler, K. Pister, and G. Sukhatme. Connecting the physical world with pervasive networks. IEEE Pervasive Computing, 1:59-69, 2002.

PAGE 109

109 [24] D.Estrin,L.Girod,G.Pottie,andM.Srivastava.Instrumentingtheworldwithwirelesssensornetworks.Acoustics,Speech,andSignalProcessing,2001.Proceedings.ICASSP'01.2001IEEEInternationalConferenceon,4:2033{2036vol.4,2001. [25] D.Estrin,R.Govindan,J.Heidemann,andS.Kumar.Nextcenturychallenges:scalablecoordinationinsensornetworks.InMobiCom'99:Proceedingsofthe5thannualACM/IEEEinternationalconferenceonMobilecomputingandnetworking,pages263{270,NewYork,NY,USA,1999.ACM. [26] M.Fiedler.Algebraicconnectivityofgraphs.CzechoslovakMathematicalJournal,23:298{305,1973. [27] M.J.Fischer,N.A.Lynch,andM.S.Paterson.Impossibilityofdistributedconsensuswithonefaultyprocess.J.ACM,322:374{382,1985. [28] P.FlajoletandG.Martin.Probabilisticcountingalgorithmsfordatabaseapplications.JournalofComputerandSystemSciences,322:182{209,1985. [29] D.Ganesan,R.Govindan,S.Shenker,andD.Estrin.Highly-resilient,energy-ecientmultipathroutinginwirelesssensornetworks.SIGMOBILEMob.Comput.Commun.Rev.,5:11{25,2001. [30] D.Ganesan,B.Krishnamachari,A.Woo,D.Culler,D.Estrin,andS.Wicker.Complexbehavioratscale:Anexperimentalstudyoflow-powerwirelesssensornetworks,2002. [31] J.GehrkeandS.Madden.Queryprocessinginsensornetworks.IEEEPervasiveComputing,031:46{55,2004. [32] L.Girod,J.Elson,A.Cerpa,T.Stathopoulas,N.Ramanathan,andD.Estrin.Emstar:asoftwareenvironmentfordevelopinganddeployingwirelesssensornetworks.InUSENIXTechnicalConference,Boston,MA,USA,2004.USENIX. [33] O.Gnawali,K.-Y.Jang,J.Paek,M.Vieira,R.Govindan,B.Greenstein,A.Joki,D.Estrin,andE.Kohler.Thetenetarchitecturefortieredsensornetworks.InSenSys'06:Proceedingsofthe4thinternationalconferenceonEmbeddednetworkedsensorsystems,pages153{166,NewYork,NY,USA,2006.ACMPress. [34] S.Gobriel,S.Khattab,J.B.DanielMosse,andR.Melhem.Ridesharing:Faulttolerantaggregationinsensornetworksusingcorrectiveactions.InSECON'06,ThirdAnnualIEEECommunicationsSocietyConferenceonSensorandAdHocCommunicationsandNetworks,Reston,VA,2006.IEEE. [35] M.Grimmer.Considerationsforradiocommunicationslinks. 
[36] I.Gupta,R.vanRenesse,andK.Birman.Scalablefault-tolerantaggregationinlargeprocessgroups,2001.

PAGE 110

110 [37] I.Gupta,R.vanRenesse,andK.P.Birman.Aprobabilisticallycorrectleaderelectionprotocolforlargegroups.InDISC'00:Proceedingsofthe14thConferenceonDistributedComputing,pages89{103,London,UK,2000.Springer-Verlag. [38] M.Gupta,G.;Younis.Fault-tolerantclusteringofwirelesssensornetworks.WirelessCommunicationsandNetworking,2003.WCNC2003.2003IEEE,3:1579{1584vol.3,16-20March2003. [39] T.He,S.Krishnamurthy,J.A.Stankovic,T.Abdelzaher,L.Luo,R.Stoleru,T.Yan,L.Gu,J.Hui,andB.Krogh.Energy-ecientsurveillancesystemusingwirelesssensornetworks.InMobiSys'04:Proceedingsofthe2ndinternationalconferenceonMobilesystems,applications,andservices,pages270{283,NewYork,NY,USA,2004.ACM. [40] J.Hellerstein,W.Hong,S.Madden,andK.Stanek.Beyondaverage:Towardssophisticatedsensingwithqueries,2003. [41] J.M.Hellerstein,W.Hong,andS.R.Madden.Thesensorspectrum:technology,trends,andrequirements.SIGMODRec.,324:22{27,2003. [42] M.Hewish.Reformattingghtertactics.Jane'sInternationalDefenseReview,June2001. [43] J.Hill,R.Szewczyk,A.Woo,S.Hollar,D.Culler,andK.Pister.Systemarchitecturedirectionsfornetworkedsensors.SIGPLANNot.,3511:93{104,2000. [44] G.Hoblos,M.Staroswiecki,andA.Aitouche.Optimaldesignoffaulttolerantsensornetworks.ControlApplications,2000.Proceedingsofthe2000IEEEInternationalConferenceon,pages467{472,2000. [45] C.Intanagonwiwat,R.Govindan,andD.Estrin.Directeddiusion:ascalableandrobustcommunicationparadigmforsensornetworks.InMobiCom'00:Proceedingsofthe6thannualinternationalconferenceonMobilecomputingandnetworking,pages56{67,NewYork,NY,USA,2000.ACM. [46] L.Jia,G.Noubir,R.Rajaraman,andR.Sundaram.Group-independentspanningtreefordataaggregationindensesensornetworks.InInternationalConferenceonDistributedComputinginSensorSystemsDCOSS,SanFrancisco,CA,USA,2006.SpringerBerlin/Heidelberg. [47] H.Kargupta,B.-H.Park,andH.Dutta.Orthogonaldecisiontrees.IEEETransac-tionsonKnowledgeandDataEngineering,188:1028{1042,2006. 
[48] B.KarpandH.T.Kung.Gpsr:greedyperimeterstatelessroutingforwirelessnetworks.InMobiCom'00:Proceedingsofthe6thannualinternationalconferenceonMobilecomputingandnetworking,pages243{254,NewYork,NY,USA,2000.ACMPress.

PAGE 111

111 [49] D.Kempe,A.Dobra,andJ.Gehrke.Gossip-basedcomputationofaggregateinformation.In44thAnnualIEEESymposiumonFoundationsofComputerScience,Cambridge,MA,USA,2003.IEEEComputerSociety. [50] P.KirschenhoferandH.Prodinger.Aresultinorderstatisticsrelatedtoprobabilisticcounting.Computing,511:15{27,1993. [51] F.Koushanfar,M.Potkonjak,andA.Sangiovanni-Vincentelli.HandbookofSensorNetworks,chapterFault-ToleranceinSensorNetworks.CRCpress,2004. [52] B.KrishnamachariandS.Iyengar.Distributedbayesianalgorithmsforfault-toleranteventregiondetectioninwirelesssensornetworks.Computers,IEEETransactionson,53:241{250,March2004. [53] S.W.KwokandC.Carter.Multipledecisiontrees.InUAI'88:ProceedingsoftheFourthAnnualConferenceonUncertaintyinArticialIntelligence,pages327{338.North-Holland,1990. [54] P.Levis.Tossim:Accurateandscalablesimulationofentiretinyosapplications,2003. [55] H.Liu,A.Nayak,andI.Stojmenovic.FaultTolerantAlgorithms-ProtocolsinWirelessSensorNetworks.Springer-Verlag,2007. [56] X.Luo,M.Dong,andY.Huang.Ondistributedfault-tolerantdetectioninwirelesssensornetworks.IEEETransactionsonComputers,551:58{70,2006. [57] S.Madden,M.J.Franklin,J.M.Hellerstein,andW.Hong.Tag:atinyaggregationserviceforad-hocsensornetworks.SIGOPSOper.Syst.Rev.,36SI:131{146,2002. [58] S.Madden,M.J.Franklin,J.M.Hellerstein,andW.Hong.Thedesignofanacquisitionalqueryprocessorforsensornetworks.InSIGMODConference,pages491{502,SanDiego,California,2003.ACM. [59] S.Madden,R.Szewczyk,M.Franklin,andD.Culler.Supportingaggregatequeriesoverad-hocwirelesssensornetworks.MobileComputingSystemsandApplications,2002.ProceedingsFourthIEEEWorkshopon,pages49{58,2002. [60] S.R.Madden,M.J.Franklin,J.M.Hellerstein,andW.Hong.Tinydb:anacquisitionalqueryprocessingsystemforsensornetworks.ACMTrans.DatabaseSyst.,301:122{173,2005. [61] A.Mainwaring,J.Polastre,R.Szewczyk,D.Culler,andJ.Anderson.Wirelesssensornetworksforhabitatmonitoring.InACMWorkshoponWirelessSensorNetworksandApplicationsWSNA'02,Atlanta,GA,Sep2002.ACM. 
[62] A.Manjhi,S.Nath,andP.B.Gibbons.Tributariesanddeltas:ecientandrobustaggregationinsensornetworkstreams.InSIGMOD'05:Proceedingsofthe2005ACMSIGMODconferenceonManagementofdata,pages287{298,NewYork,NY,USA,2005.ACMPress.

PAGE 112

112 [63] V.P.Mhatre,C.Rosenberg,D.Kofman,R.Mazumdar,andN.Shro.Aminimumcostheterogeneoussensornetworkwithalifetimeconstraint.IEEETransactionsonMobileComputing,4:4{15,2005. [64] S.Nath,P.B.Gibbons,S.Seshan,andZ.R.Anderson.Synopsisdiusionforrobustaggregationinsensornetworks.InSenSys'04:Proceedingsofthe2ndConferenceonEmbeddednetworkedsensorsystems,pages250{262,NewYork,NY,USA,2004.ACMPress. [65] J.Polley,D.Blazakis,J.McGee,D.Rusk,andJ.S.Baras.Atemu:Ane-grainedsensornetworksimulator.InSECON'04,TheFirstIEEECommunicationsSocietyConferenceonSensorandAdHocCommunicationsandNetworks,SantaClara,CA,2004.IEEE. [66] G.Pottie.Wirelesssensornetworks.InformationTheoryWorkshop,1998,pages139{140,22-26Jun1998. [67] G.J.PottieandW.J.Kaiser.Wirelessintegratednetworksensors.Commun.ACM,43:51{58,2000. [68] V.PrasadandS.H.Son.Classicationofanalysistechniquesforwirelesssensornetworks.NetworkedSensingSystems,2007.INSS'07.FourthInternationalConfer-enceon,pages93{97,6-8June2007. [69] H.Qi,S.S.Iyengar,andK.Chakrabarty.Distributedsensornetworks:Areviewofrecentresearch.JournaloftheFranklinInstitute,338:655{668,2001. [70] Quake-CatcherNetwork.http://qcn.ucr.edu/. [71] M.RaabandA.Steger."ballsintobins"-asimpleandtightanalysis.InRANDOM'98:ProceedingsoftheSecondWorkshoponRandomizationandApproximationTechniquesinComputerScience,pages159{170,London,UK,1998.Springer-Verlag. [72] T.Schmid,H.Dubois-Ferriere,andM.Vetterli.Sensorscope:Experienceswithawirelessbuildingmonitoringsensornetwork.InWorkshoponReal-WorldWirelessSensorNetworksREALWSN05,2005. [73] R.Shah,S.Roy,S.Jain,andW.Brunette.Datamules:Modelingathree-tierarchitectureforsparsesensornetworks.InIEEEInternationalWorkshoponSensorNetworkProtocolsandApplications,pages30{41,2003. [74] J.Shao.MathematicalStatistics.SpringerTextsinStatistics,2edition,2003. [75] G.SharmaandR.Mazumdar.Hybridsensornetworks:asmallworld.InMobi-Hoc'05:Proceedingsofthe6thACMinternationalsymposiumonMobileadhocnetworkingandcomputing,pages366{377,NewYork,NY,USA,2005.ACMPress.

PAGE 113

113 [76] V.Shnayder,M.Hempstead,B.Chen,G.W.Allen,andM.Welsh.Simulatingthepowerconsumptionoflargescalesensornetworkapplications.InSenSys,Baltimore,Maryland,USA,2004.ACM. [77] N.Shrivastava,C.Buragohain,D.Agrawal,andS.Suri.Mediansandbeyond:newaggregationtechniquesforsensornetworks.InSenSys'04:Proceedingsofthe2ndinternationalconferenceonEmbeddednetworkedsensorsystems,pages239{249,NewYork,NY,USA,2004.ACM. [78] U.Srivastava,K.Munagala,andJ.Widom.Operatorplacementforin-networkstreamqueryprocessing.InPODS'05:Proceedingsofthetwenty-fourthACMSIGMOD-SIGACT-SIGARTsymposiumonPrinciplesofdatabasesystems,pages250{258,NewYork,NY,USA,2005.ACM. [79] S.SubramanianandS.Shakkottai.Geographicroutingwithlimitedinformationinsensornetworks.InIPSN'05:Proceedingsofthe4thinternationalsymposiumonInformationprocessinginsensornetworks,page36,Piscataway,NJ,USA,2005.IEEEPress. [80] K.Sun.Fault-tolerantcluster-wiseclocksynchronizationforwirelesssensornetworks.IEEETrans.DependableSecur.Comput.,2:177{189,2005.Member-PengNingandMember-CliWang. [81] R.Szewczyk,A.Mainwaring,J.Polastre,J.Anderson,andD.Culler.Ananalysisofalargescalehabitatmonitoringapplication.InSenSys'04:Proceedingsofthe2ndinternationalconferenceonEmbeddednetworkedsensorsystems,pages214{226,NewYork,NY,USA,2004.ACM. [82] W.SzpankowskiandV.Rego.Yetanotherapplicationofabinomialrecurrence.orderstatistics.Computing,434:401{410,1990. [83] M.TubaishatandS.Madria.Sensornetworks:anoverview.Potentials,IEEE,22:20{23,April-May2003. [84] U.S.ArmyResearchLaboratory.BroadAgencyAnnouncementDAAD19-03-R-0017,2006. [85] S.Vasudevan,B.DeCleene,N.Immerman,andJ.K.D.Towsley.Leaderelectionalgorithmsforwirelessadhocnetworks.InDARPAInformationSurvivabilityConferenceandExposition,volume1,2003. [86] H.Wang,D.Estrin,andL.Girod.Preprocessinginatieredsensornetworkforhabitatmonitoring,2002. [87] B.Warneke,M.Last,B.Liebowitz,andK.S.J.Pister.Smartdust:Communicatingwithacubic-millimetercomputer.Computer,341:44{51,2001. 
[88] M.Weiser.Thecomputerforthetwenty-rstcentury.ScienticAmerican,1991.

PAGE 114

114 [89] G.Werner-Allen,J.Johnson,M.Ruiz,J.Lees,andM.Welsh.Monitoringvolcaniceruptionswithawirelesssensornetwork.WirelessSensorNetworks,2005.Proceeed-ingsoftheSecondEuropeanWorkshopon,pages108{120,31Jan.-2Feb.2005. [90] G.Werner-Allen,P.Swieskowski,andM.Welsh.Motelab:Awirelesssensornetworktestbed,2005. [91] Y.YaoandJ.Gehrke.Thecougarapproachtoin-networkqueryprocessinginsensornetworks.SIGMODRec.,313:9{18,2002. [92] M.Yarvis,N.Kushalnagar,H.Singh,A.Rangarajan,Y.Liu,andS.Singh.Exploitingheterogeneityinsensornetworks,.InINFOCOM2005.24thAnnualJointConferenceoftheIEEEComputerandCommunicationsSocieties.ProceedingsIEEE,vol.2,no.,,pages878{890,13-17March2005. [93] Y.Yu,B.Krishnamachari,andV.Prasanna.Issuesindesigningmiddlewareforwirelesssensornetworks.Network,IEEE,181:15{21,Jan/Feb2004. [94] F.ZhaoandL.Guibas.WirelessSensorNetworks:AnInformationProcessingApproach.MorganKaufmann,2004. [95] J.ZhaoandR.Govindan.Understandingpacketdeliveryperformanceindensewirelesssensornetworks.InSenSys'03:Proceedingsofthe1stinternationalconfer-enceonEmbeddednetworkedsensorsystems,pages1{13,NewYork,NY,USA,2003.ACM. [96] J.Zhao,R.Govindan,andD.Estrin.Computingaggregatesformonitoringwirelesssensornetworks.SensorNetworkProtocolsandApplications,2003.ProceedingsoftheFirstIEEE.2003IEEEInternationalWorkshopon,pages139{148,11May2003.

BIOGRAPHICAL SKETCH

Laukik Chitnis received his Bachelor of Engineering in computer science in July 2001 from the University of Mumbai, India, where he was ranked 4th. After working for a year with Xoriant Solutions in Mumbai on J2EE technologies, he joined the Department of Computer and Information Sciences and Engineering at the University of Florida for graduate studies in fall 2002. In summer 2007, he interned with Google Inc. His research interests include scalability and fault tolerance of parallel and distributed systems, grid computing and query processing in sensor networks. Laukik received his Ph.D. in computer science in August 2008 from the University of Florida and joined the Web Search team at Yahoo! Inc. as a senior technical staff member.