Citation
Design, Implementation, and Applications of Peer-to-Peer Virtual Private Networks from Grids to Social Networks

Material Information

Title:
Design, Implementation, and Applications of Peer-to-Peer Virtual Private Networks from Grids to Social Networks
Creator:
Wolinsky,David Isaac
Place of Publication:
[Gainesville, Fla.]
Florida
Publisher:
University of Florida
Publication Date:
Language:
english
Physical Description:
1 online resource (194 p.)

Thesis/Dissertation Information

Degree:
Doctorate ( Ph.D.)
Degree Grantor:
University of Florida
Degree Disciplines:
Electrical and Computer Engineering
Committee Chair:
Figueiredo, Renato J
Committee Members:
Fortes, Jose A
Boykin, P. Oscar
Chen, Shigang
Graduation Date:
8/6/2011

Subjects

Subjects / Keywords:
Bandwidth ( jstor )
Broadcasting industry ( jstor )
Connectivity ( jstor )
Household appliances ( jstor )
Internet ( jstor )
Local area networks ( jstor )
Professional license revocation ( jstor )
Simulations ( jstor )
Software ( jstor )
Tunnels ( jstor )
Electrical and Computer Engineering -- Dissertations, Academic -- UF
grid -- network -- overlay -- p2p -- peer -- social -- structured -- virtual -- vpn
Genre:
bibliography ( marcgt )
theses ( marcgt )
government publication (state, provincial, terriorial, dependent) ( marcgt )
born-digital ( sobekcm )
Electronic Thesis or Dissertation
Electrical and Computer Engineering thesis, Ph.D.

Notes

Abstract:
Virtual private networks (VPNs) enable existing network applications to run unmodified in insecure and constrained environments by creating an isolated and secure virtual environment providing all-to-all connectivity for VPN members. While there exist both centralized and distributed VPN implementations, current approaches lack self-configuration and organization capabilities that would reduce management overheads and minimize effort by non-experts. Recent use of peer-to-peer (P2P) techniques have focused on alleviating pressure placed upon infrastructure nodes by allowing peers to form direct connections for communication purposes, while infrastructure nodes are used for handling session management and supporting indirect communication by relaying traffic when NAT (Network Address Translation) or firewall traversal fails. In terms of decentralized, P2P-based VPN solutions, the mechanisms explored thus far in related works employ unstructured P2P systems, which can have significant scalability limitations. This thesis constructs a novel decentralized P2P VPN that addresses the following core aspects that are integral to user-friendliness: bootstrapping, discovery, security, and endpoint configuration. A resource joining a distributed system goes through a bootstrapping process. The target environment for VPNs include small systems with many if not all users behind NATs and firewalls making the bootstrapping process challenging. Centralized systems address the bootstrapping problem by using a common resource for peer registration, discovery, and connection establishment. Centralized systems, however, come with additional costs in deploying and managing a dedicated resource with a public Internet address and the capability to handle demands placed upon it by clients. I have investigated, implemented, and evaluated decentralized means to bootstrap private P2P overlays for connectivity-constrained resources, with an approach that supports a recursive overlay organization or the use of third-party free-to-join public overlay infrastructures using technologies such as XMPP (Extensible Messaging and Presence Protocol). Bootstrapping helps establish connectivity into an overlay; however, many systems including P2P VPNs require a means for discovery specific peers. Existing VPNs either rely on large tables hosted on infrastructure nodes or overlay broadcast techniques to find a resource. As a system grows in capacity, these approaches have their limitations, especially in VPNs where all IP (Internet Protocol) addresses are independent of their location inside the VPN. I have employed distributed hash tables to efficiently establish decentralized IP address allocation and discovery seamlessly providing scalability and resilience. In a VPN, other peers are typically either trusted directly by the peer, or indirectly through a trusted third-party. While users may trust a third-party to assist them in creating network links to other peers, they do not desire to have intermediaries that are able to read or modify their IP packets. Unfortunately, most VPNs only encrypt messages on a point-to-point (PtP) basis allowing these intermediaries privileged access to their identity and their messages. In these cases, end-to-end (EtE) security relies on out-of-bound exchanges and applications. To transparently handle security at both PtP and EtE layers across a wide spectrum of communication transports, I have developed a novel security filter, which has been demonstrated to support existing Public Key Infrastructure based security systems (such as DTLS (Datagram Transport Layer Security)) for both PtP and EtE traffic inside connectivity-constrained environments. While security primitives enable private and authenticated communication, the configuration and management overheads involved in establishing trust and maintaining secure connections in VPNs are a significant hindrance to usability and adoption. In my approach, all security links are established from exchanged certificates, so each peer is uniquely identifiable. My approach uniquely handles administrative and user aspects of certificates automatically through the use of online social networking features such as peer relationships and groups. The above self-organizing mechanisms to create VPN links need to be complemented with approaches that support effective bindings to endpoints from which messages are captured/injected from/to the VPN. In a typical approach, called the interface model, each resource in the VPN has a local binding to the VPN by locally installed software. Unfortunately, this introduces significant overheads when two or more such systems are running inside the same trusted LAN (Local Area Network). Alternatively, if all resources in a LAN connect to a common VPN, such as in a grid or for cloud computing environments, the resources can share a common entry point to the VPN through a router model. Unfortunately, existing approaches do not transparently configure the router and connected resources. Additionally, the router model does not work well on shared networks, where there are either untrusted users or some resources should not be accessible through the VPN. I have shown herein how all of these considerations can be handled without the introduction of new protocols by utilizing existing services commonly provided by network stacks, primarily DHCP (Dynamic Host Control Protocol) and ARP (Address Resolution Protocol), which enables a new type of VPN model that balances the benefits of the interface and router models. The premise for this work is to enhance the usability of VPN systems enabling wider adoption by non-expert users in home, small/medium business, and education environments. The concepts for this work have been carefully designed, implemented, and evaluated and then demonstrated through the implementation of novel systems (SocialVPN, GroupVPN, and Grid Appliance) accessed by real users. The SocialVPN creates user-centric VPNs so that peers only have VPN links with their social network friends, whereas the GroupVPN employs a group infrastructure to manage VPN members and distribute VPN configuration. A free GroupVPN bootstrapping environment relying on PlanetLab hosted resources has been available for over three years and has been accessed by over hundreds of users including several universities and commercial entities, whereas the SocialVPN has over 80 active members online at any given time. The Grid Appliance uses the GroupVPN to form ad-hoc and distributed computing pools, facilitating computer architecture research in the Archer project. The Archer project has been accessed by student at several universities and has accumulated over 500,000 CPU hours in a little less than three years. Furthermore, the Grid Appliance has been used as both a teaching tool in distributed computing classrooms as well as by external users to create their own grids. The challenges faced in these deployments have opened the door for other avenues of research into built-in self-simulation, P2P connection establishment, efficient IP broadcasting and multicasting, and decentralized establishment of Internet gateways. ( en )
General Note:
In the series University of Florida Digital Collections.
General Note:
Includes vita.
Bibliography:
Includes bibliographical references.
Source of Description:
Description based on online resource; title from PDF title page.
Source of Description:
This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Thesis:
Thesis (Ph.D.)--University of Florida, 2011.
Local:
Adviser: Figueiredo, Renato J.
Statement of Responsibility:
by David Isaac Wolinsky.

Record Information

Source Institution:
UFRGP
Rights Management:
Copyright Wolinsky,David Isaac. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Resource Identifier:
872113451 ( OCLC )
Classification:
LD1780 2011 ( lcc )

Downloads

This item has the following downloads:


Full Text

PAGE 1

DESIGN,IMPLEMENTATION,ANDAPPLICATIONSOFPEER-TO-PEERVIRTUALPRIVATENETWORKSFROMGRIDSTOSOCIALNETWORKSByDAVIDISAACWOLINSKYADISSERTATIONPRESENTEDTOTHEGRADUATESCHOOLOFTHEUNIVERSITYOFFLORIDAINPARTIALFULFILLMENTOFTHEREQUIREMENTSFORTHEDEGREEOFDOCTOROFPHILOSOPHYUNIVERSITYOFFLORIDA2011

PAGE 2

c2011DavidIsaacWolinsky 2

PAGE 3

Idedicatethistofamilyandthosewhosehavesupportedme. 3

PAGE 4

ACKNOWLEDGMENTSOverthepast5years,therehasbeenmanyconstantsandmanychanges,soontherewillbeonlychanges.Thoughthroughitall,Ihavebeensurroundedbywonderfulpeoplewhohavehelpedandencouragedmetosucceedandcanonlyhopethattherelationshipwasbenecialforthemaswell.Tobeginnamingnames,Iwillbeginwithmyadvisor,ProfessorRenatoFigueiredo.Igreatlyappreciatethetimehehasinvestedintomeandhiswisdomsharedwithme.IamgreatlyblessedtohaveworkedsocloselywithaprofessorwhomIworksowellwith.ThatleadsmeintoProfessorP.OscarBoykin,theotherheadoftheACISP2PGroup,whoalongwithProfessorFigueiredo,moldedmeintotheboldPh.D.Iamtoday.ProfessorBoykinalsohelpedenrichmydesignanddevelopmentskills,forwhich,Iamextremelygrateful.RoundingouttheACISprofessors,leadstoProfessorJoseFortes,whohasalwaysbeenagoodsourceofwisdomandencouragement.AsleaderoftheACISlab,ProfessorForteshasalwaysbeenverygenerousinprovidingbothhistime,whichiswhyIamveryappreciativetohavehimasamemberonmyPh.D.committee.IwouldalsolikethankProfessorsShigangChenandY.PeterShengfortheirtimeinvestmentsinmyresearchandwhosecommentshavebeeninvaluableinshapingmydissertation.Mypeersandfamilyhavealsobeencriticalsourcesofsupport,encouragement,andwisdom.IamthankfultothemembersoftheACISP2Pgroup,bothpastandpresent,namely,TaeWoongChoi,HeungSikEom,ArijitGanguly,KyungyongLee,YonggangLiu,PierreSt.Juste,andJiangyanXu,whosecommentsandcontributionshavepavedthewayformyresearch.Iamthankfultomembersofmysportsgroups,boththeLarsen-BentonBasketballAssociationandtheBadmintonGroupfortheirfriendships,astheyprovidedameanstoredirectfrustrationsdevelopedalongtheway.IappreciatethehardworkanddedicationofmyfellowArchercolleague,GirishVenkatasubramanian.Mygratitudegoestothekindofceladieswhoassistedmesomuch,CatherineReeves,JanetSloan,andDinaStoeber.Iamthankfulforthetime 4

PAGE 5

putforthbymyGridAppliancecolleagues,PanoatChuchaisriandArjunPrakash.MylabexperiencewouldhavebeenmuchmoredifcultwithouttheexpertiseandkindnessofSumalathaAdabala,MatthewCollins,AndreaMatsunaga,andMauricioTsugawa.IwouldliketothankPriyaBhatt,BingyiCao,XinFu,SelviKadrivel,andPrapapornRattanatamrongfortheirkindheartsandencouragementand,insomecases,theirspicyfood.IwouldliketothankDonnaGimbertforhersupportandencouragementthroughouttheyears,likewise,Ihavebeenblessedtohaveparentsthathaveencouragedmetopressforwardandachievemygoalsinlifeandasonwhoprovidesmeimmenseamountsofhappiness.Researchisacollaborativeeffortthat,formeatleast,involvesbothprofessionalandhomelife.MysuccesshaslargelybeentheresultofthequalityindividualsthatIhavebeenfortunateenoughtohaveinmylife.Itisforthosealreadymentionedandthoserememberedthatthisdissertationisowed.Thankyousomuch. 5

PAGE 6

TABLEOFCONTENTS page ACKNOWLEDGMENTS .................................. 4 LISTOFTABLES ...................................... 10 LISTOFFIGURES ..................................... 11 ABSTRACT ......................................... 13 CHAPTER 1INTRODUCTION ................................... 17 1.1VirtualPrivateNetworkBasics ........................ 20 1.2ComputerNetworkArchitectures ....................... 23 1.3StructuredOverlays .............................. 25 1.4NetworkAsymmetries ............................. 28 1.5Contributions .................................. 30 2VIRTUALNETWORKCONFIGURATIONANDORGANIZATION ........ 34 2.1NetworkConguration ............................. 35 2.1.1CentralizedVPNSystems ....................... 35 2.1.2CentralizedP2PVPNSystems .................... 36 2.1.3DecentralizedVPNSystems ...................... 37 2.1.4UnstructuredP2PVPNSystems ................... 37 2.1.5StructuredP2PVPNSystems ..................... 37 2.2LocalConguration ............................... 39 2.2.1LocalVPNArchitecture ........................ 41 2.2.2AddressResolution ........................... 44 2.2.3AddressAllocation ........................... 46 2.2.4DomainNameServersandServices ................. 48 2.3SupportingMigration .............................. 48 2.4EvaluationofVPNNetworkConguration .................. 52 2.5EvaluationofVPNLocalConguration .................... 54 2.5.1OntheGrid ............................... 56 2.5.2IntheClouds .............................. 60 3BOOTSTRAPPINGPRIVATEOVERLAYS ..................... 65 3.1CurrentBootstrapSolutions .......................... 69 3.2CoreRequirements .............................. 71 3.2.1Reection ................................ 71 3.2.2Relaying ................................. 73 3.2.3Rendezvous ............................... 74 3.3Implementations ................................ 75 6

PAGE 7

3.3.1UsingBrunet .............................. 76 3.3.2UsingXMPP .............................. 78 3.4EvaluatingOverlayBootstrapping ....................... 80 3.4.1DeploymentExperiments ....................... 80 3.4.2DeploymentExperiences ....................... 82 4FROMOVERLAYSTOSECUREVIRTUALPRIVATENETWORKS ....... 86 4.1ExperimentalEnvironment ........................... 87 4.2TowardsPrivateOverlays ........................... 88 4.2.1TimetoBootstrapaPrivateOverlay ................. 89 4.2.2OverheadofPathing .......................... 92 4.3SecurityfortheOverlayandtheVPN ..................... 92 4.3.1ImplementingOverlaySecurity .................... 93 4.3.2OverheadsofOverlaySecurity .................... 96 4.3.2.1AddingaSingleNode .................... 97 4.3.2.2BootstrappinganOverlay .................. 97 4.3.3Discussion ................................ 98 4.4HandlingUserRevocation ........................... 99 4.4.1DHTRevocation ............................ 100 4.4.2BroadcastRevocation ......................... 100 4.4.3EvaluationofBroadcast ........................ 100 4.4.4Discussion ................................ 101 4.5ManagingandConguringtheVPN ..................... 102 4.6LeveragingTrustfromOnlineSocialNetworks ................ 104 4.6.1Architecture ............................... 105 4.6.2LeveragingTrustFromFacebook ................... 106 4.6.3LeveragingTrustfromXMPP ..................... 107 4.6.4AddressAllocationsandDiscovery .................. 107 4.7RelatedWork .................................. 108 4.7.1VPNs .................................. 108 4.7.2P2PSystems .............................. 109 5EXTENSIONSTOP2POVERLAYSANDVIRTUALNETWORKS ........ 111 5.1Built-inSelf-Simulation ............................. 111 5.1.1Time-BasedEvents ........................... 112 5.1.2NetworkCommunication ........................ 113 5.1.3UserActions .............................. 115 5.1.4TheRestoftheSystem ........................ 116 5.1.5Optimizations .............................. 116 5.2EfcientRelays ................................. 118 5.2.1MotivationforRelaysintheOverlay .................. 119 5.2.2ComparingRelaySelection ...................... 121 5.3PoliciesforEstablishingDirectConnections ................. 122 5.3.1Limitations ................................ 122 7

PAGE 8

5.3.2On-DemandConnections ....................... 123 5.4BroadcastingIPBroadcastandMulticastPacketsViatheOverlay ..... 126 5.5FullTunnelVPNOperations .......................... 127 5.5.1TheGateway .............................. 128 5.5.2TheClient ................................ 129 5.5.3FullTunnelOverhead .......................... 131 6AD-HOC,DECENTRALIZEDGRIDS ........................ 133 6.1WOWs ...................................... 137 6.1.1P2POverlays .............................. 137 6.1.2VirtualPrivateNetworks ........................ 138 6.1.3VirtualMachinesinGridComputing ................. 139 6.2ArchitecturalOverview ............................. 140 6.2.1WebInterfaceandtheCommunity .................. 142 6.2.2TheOrganizationoftheGrid ..................... 144 6.2.2.1SelectingaMiddleware ................... 145 6.2.2.2Self-OrganizingCondor ................... 146 6.2.2.3PuttingItAllTogether .................... 146 6.2.3SandboxingResources ........................ 147 6.2.3.1SecuringtheResources ................... 147 6.2.3.2RespectingtheHost ..................... 148 6.2.3.3DecentralizedSubmissionofJobs ............. 148 6.3DeployingaCampusGrid ........................... 149 6.3.1Background ............................... 150 6.3.2TraditionalCongurationofaCampusGrid ............. 151 6.3.3GridApplianceinaCampusGrid ................... 152 6.3.4ComparingtheUserExperience ................... 154 6.3.5QuantifyingtheExperience ...................... 154 6.4LessonsLearned ................................ 156 6.4.1Deployments .............................. 156 6.4.2TowardsUnvirtualizedEnvironments ................. 157 6.4.3AdvantagesandChallengesoftheCloud .............. 157 6.4.4StackedFileSystems ......................... 159 6.4.5PriorityinOwnedResources ..................... 160 6.4.6TiminginVirtualMachines ....................... 161 6.4.7SelectingaVPNIPAddressRange .................. 161 6.4.8AdministratorBackdoor ........................ 162 6.5RelatedWork .................................. 163 7SOCIALPROFILEOVERLAYS ........................... 165 7.1RelatedWorks ................................. 167 7.2SocialOverlays ................................. 169 7.2.1FindingFriends ............................. 169 7.2.2MakingFriends ............................. 171 8

PAGE 9

7.2.3TheProleOverlay ........................... 172 7.2.4EventBasedMessageNotication .................. 173 7.2.5ActivePeers ............................... 174 7.2.6Groups .................................. 175 7.3UserInteraction ................................. 175 7.4Challenges ................................... 178 8CONCLUSIONS ................................... 180 APPENDIX:STRUCTUREDOVERLAYBROADCAST ................. 183 REFERENCES ....................................... 185 BIOGRAPHICALSKETCH ................................ 194 9

PAGE 10

LISTOFTABLES Table page 2-1VPNclassications .................................. 35 2-2Qualitativecomparisonofthethreedeploymentmodels ............. 40 2-3WANresultsforinter-cloudnetworking ....................... 60 2-4LANresultsperformedatGoGrid .......................... 61 2-5Virtualnetworkcomparison ............................. 62 3-1Timeinsecondsforvariousprivateoverlayoperations .............. 81 3-2Publicandresearchoverlays ............................ 83 4-1Pathingoverheads .................................. 92 5-1Relaycomparison .................................. 121 5-2Fulltunnelevaluation ................................ 131 6-1Gridmiddlewarecomparison ............................ 141 10

PAGE 11

LISTOFFIGURES Figure page 1-1AtypicalVPNclient ................................. 21 1-21-Dringstructuredoverlay ............................. 26 1-3CommunicationbetweenapeerbehindaNATandonewithapublicaddress 29 2-1ThreeVNapproaches:router,interface,andhybrid ................ 40 2-2Thestatediagramofaself-conguringVN .................... 42 2-3VNinterface ...................................... 43 2-4VNrouter ....................................... 43 2-5VNhybrid ....................................... 44 2-6ARPrequest/replyinteraction ............................ 45 2-7DHCPclient/serverinteraction ........................... 46 2-8VNroutermigration ................................. 52 2-9VNroutermigrationevaluation ........................... 53 2-10SystemtransactionrateforvariousVPNapproaches ............... 54 2-11SystembandwidthforvariousVPNapproaches .................. 55 2-12Gridevaluationsetup ................................ 57 2-13GridNetperfbandwidth(TCPSTREAM)evaluation ................ 57 2-14GridNetperflatency(TCPRR)evaluation ..................... 58 2-15GridSPECjbbevaluationwithNetperfTCPSTREAMload ............ 58 2-16GridSPECjbbevaluationwithNetperfTCPRRload ............... 59 3-1BootstrappingaP2Psystemusinganexisting(generic)overlay ......... 66 3-2BootstrappingaP2PsystemusingBrunet ..................... 78 4-1CDFofprivateoverlaybootstraptime ....................... 92 4-2Securitylter ..................................... 95 4-3DTLShandshake ................................... 96 4-4Asinglenodejoininganinsecureandsecureoverlay ............... 98 11

PAGE 12

4-5Simulataneousbootstrappingofasecureandaninsecureoverlay ....... 99 4-6Overlaybroadcasttime ............................... 101 4-7BootstrappinganewGroupVPN .......................... 104 5-1Creatingrelays .................................... 119 5-2Acomparisonofall-to-alloverlayrouting,two-hoprelay,anddirectconnectioninBrunet ....................................... 120 5-3LatencyinPlanetLabdeploymentcomparedtoiPlane .............. 124 5-4DroprateinPlanetLabdeploymentcomparedtoiPlane ............. 124 5-5Timetoformadirectconnection .......................... 126 5-6AnexampleofbothfullandsplittunnelVPNmodes ............... 128 5-7ThecontentsofafulltunnelEthernetpacket ................... 129 6-1GridAppliancemiddleware ............................. 135 6-2GridAppliancedeploymentscenario ........................ 143 6-3Acollectionofvariouscomputingresourcesatatypicaluniversity ........ 150 6-4Timetoconstructagrid ............................... 155 6-5Timetorunajobonagrid .............................. 155 6-6GridAppliancestackablelesystem ........................ 159 7-1AnexampleOverSocsocialoverlaynetwork .................... 167 7-2AlicerequestsandreceivesafriendshipfromBob ................ 169 7-3Alice,alreadyafriendofBob,connectstohissocialoverlay ........... 171 A-1Tree-basedoverlaybroadcast ............................ 183 12

PAGE 13

AbstractofDissertationPresentedtotheGraduateSchooloftheUniversityofFloridainPartialFulllmentoftheRequirementsfortheDegreeofDoctorofPhilosophyDESIGN,IMPLEMENTATION,ANDAPPLICATIONSOFPEER-TO-PEERVIRTUALPRIVATENETWORKSFROMGRIDSTOSOCIALNETWORKSByDavidIsaacWolinskyAugust2011Chair:RenatoFigueiredoMajor:ElectricalandComputerEngineeringVirtualprivatenetworks(VPNs)enableexistingnetworkapplicationstorununmodiedininsecureandconstrainedenvironmentsbycreatinganisolatedandsecurevirtualenvironmentprovidingall-to-allconnectivityforVPNmembers.WhilethereexistbothcentralizedanddistributedVPNimplementations,currentapproacheslackself-congurationandorganizationcapabilitiesthatwouldreducemanagementoverheadsandminimizeeffortsbynon-experts.Recentuseofpeer-to-peer(P2P)techniqueshavefocusedonalleviatingpressureplaceduponinfrastructurenodesbyallowingpeerstoformdirectconnectionsforcommunicationpurposes,whileinfrastructurenodesareusedforhandlingsessionmanagementandsupportingindirectcommunicationbyrelayingtrafcwhenNAT(NetworkAddressTranslation)orrewalltraversalfails.Intermsofdecentralized,P2P-basedVPNsolutions,themechanismsexploredthusfarinrelatedworksemployunstructuredP2Psystems,whichcanhavesignicantscalabilitylimitations.ThisthesisconstructsanoveldecentralizedP2PVPNthataddressesthefollowingcoreaspectsthatareintegraltouser-friendliness:bootstrapping,discovery,security,andendpointconguration.Aresourcejoiningadistributedsystemgoesthroughabootstrappingprocess.ThetargetenvironmentforVPNsincludesmallsystemswithmanyifnotallusersbehindNATsandrewallsmakingthebootstrappingprocesschallenging.Centralized 13

PAGE 14

systemsaddressthebootstrappingproblembyusingacommonresourceforpeerregistration,discovery,andconnectionestablishment.Centralizedsystems,however,comewithadditionalcostsindeployingandmanagingadedicatedresourcewithapublicInternetaddressandthecapabilitytohandledemandsplaceduponitbyclients.Ihaveinvestigated,implemented,andevaluateddecentralizedmeanstobootstrapprivateP2Poverlaysforconnectivity-constrainedresources,withanapproachthatsupportsarecursiveoverlayorganizationortheuseofthird-partyfree-to-joinpublicoverlayinfrastructuresusingtechnologiessuchasXMPP(ExtensibleMessagingandPresenceProtocol).Bootstrappinghelpsestablishconnectivityintoanoverlay;however,manysystemsincludingP2PVPNsrequireameansfordiscoveryspecicpeers.ExistingVPNseitherrelyonlargetableshostedoninfrastructurenodesoroverlaybroadcasttechniquestondaresource.Asasystemgrowsincapacity,theseapproacheshavetheirlimitations,especiallyinVPNswhereallIP(InternetProtocol)addressesareindependentoftheirlocationinsidetheVPN.IhaveemployeddistributedhashtablestoefcientlyestablishdecentralizedIPaddressallocationanddiscoveryseamlesslyprovidingscalabilityandresilience.InaVPN,otherpeersaretypicallyeithertrusteddirectlybythepeer,orindirectlythroughatrustedthird-party.Whileusersmaytrustathird-partytoassistthemincreatingnetworklinkstootherpeers,theydonotdesiretohaveintermediariesthatareabletoreadormodifytheirIPpackets.Unfortunately,mostVPNsonlyencryptmessagesonapoint-to-point(PtP)basisallowingtheseintermediariesprivilegedaccesstotheiridentityandtheirmessages.Inthesecases,end-to-end(EtE)securityreliesonout-of-boundexchangesandapplications.TotransparentlyhandlesecurityatbothPtPandEtElayersacrossawidespectrumofcommunicationtransports,Ihavedevelopedanovelsecuritylter,whichhasbeendemonstratedtosupportexistingPublic 14

PAGE 15

KeyInfrastructurebasedsecuritysystems(suchasDTLS(DatagramTransportLayerSecurity))forbothPtPandEtEtrafcinsideconnectivity-constrainedenvironments.Whilesecurityprimitivesenableprivateandauthenticatedcommunication,thecongurationandmanagementoverheadsinvolvedinestablishingtrustandmaintainingsecureconnectionsinVPNsareasignicanthindrancetousabilityandadoption.Inmyapproach,allsecuritylinksareestablishedfromexchangedcerticates,soeachpeerisuniquelyidentiable.Myapproachuniquelyhandlesadministrativeanduseraspectsofcerticatesautomaticallythroughtheuseofonlinesocialnetworkingfeaturessuchaspeerrelationshipsandgroups.Theaboveself-organizingmechanismstocreateVPNlinksneedtobecomplementedwithapproachesthatsupporteffectivebindingstoendpointsfromwhichmessagesarecaptured/injectedfrom/totheVPN.Inatypicalapproach,calledtheinterfacemodel,eachresourceintheVPNhasalocalbindingtotheVPNbylocallyinstalledsoftware.Unfortunately,thisintroducessignicantoverheadswhentwoormoresuchsystemsarerunninginsidethesametrustedLAN(LocalAreaNetwork).Alternatively,ifallresourcesinaLANconnecttoacommonVPN,suchasinagridorforcloudcomputingenvironments,theresourcescanshareacommonentrypointtotheVPNthrougharoutermodel.Unfortunately,existingapproachesdonottransparentlyconguretherouterandconnectedresources.Additionally,theroutermodeldoesnotworkwellonsharednetworks,wherethereareeitheruntrustedusersorsomeresourcesshouldnotbeaccessiblethroughtheVPN.Ihaveshownhereinhowalloftheseconsiderationscanbehandledwithouttheintroductionofnewprotocolsbyutilizingexistingservicescommonlyprovidedbynetworkstacks,primarilyDHCP(DynamicHostControlProtocol)andARP(AddressResolutionProtocol),whichenablesanewtypeofVPNmodelthatbalancesthebenetsoftheinterfaceandroutermodels.ThepremiseforthisworkistoenhancetheusabilityofVPNsystemsenablingwideradoptionbynon-expertusersinhome,small/mediumbusiness,andeducation 15

PAGE 16

environments.Theconceptsforthisworkhavebeencarefullydesigned,implemented,andevaluatedandthendemonstratedthroughtheimplementationofnovelsystems(SocialVPN,GroupVPN,andGridAppliance)accessedbyrealusers.TheSocialVPNcreatesuser-centricVPNssothatpeersonlyhaveVPNlinkswiththeirsocialnetworkfriends,whereastheGroupVPNemploysagroupinfrastructuretomanageVPNmembersanddistributeVPNconguration.AfreeGroupVPNbootstrappingenvironmentrelyingonPlanetLabhostedresourceshasbeenavailableforoverthreeyearsandhasbeenaccessedbyoverhundredsofusersincludingseveraluniversitiesandcommercialentities,whereastheSocialVPNhasover80activemembersonlineatanygiventime.TheGridApplianceusestheGroupVPNtoformad-hocanddistributedcomputingpools,facilitatingcomputerarchitectureresearchintheArcherproject.TheArcherprojecthasbeenaccessedbystudentatseveraluniversitiesandhasaccumulatedover500,000CPUhoursinalittlelessthanthreeyears.Furthermore,theGridAppliancehasbeenusedasbothateachingtoolindistributedcomputingclassroomsaswellasbyexternaluserstocreatetheirowngrids.Thechallengesfacedinthesedeploymentshaveopenedthedoorforotheravenuesofresearchintobuilt-inself-simulation,P2Pconnectionestablishment,efcientIPbroadcastingandmulticasting,anddecentralizedestablishmentofInternetgateways. 16

PAGE 17

CHAPTER1INTRODUCTIONAVirtualPrivateNetwork(VPN)providestheillusionofalocalareanetwork(LAN)spanningawideareanetwork(WAN)infrastructurebycreatingencryptedandauthenticated,secure1communicationlinksamongstparticipants.CommonusesofVPNsincludesecureaccesstoenterprisenetworkresourcesfromremote/insecurelocations,connectingdistributedresourcesfrommultiplesites,andestablishingvirtualLANsformultiplayervideogamesovertheInternet.VPNsinthiscontextdifferfromothersthatprovide`emulationofaprivateWANfacilityusingIP(InternetProtocol)facilities'(includingthepublicInternetorprivateIPbackbones).[ 47 ].ThisstyleofVPNisusedtoconnectlargesetsofmachinesbehindrouterstoavirtualprivateWAN,whereasthisdissertationfocusesontheapproachofconnectingindividualresourcesintoaprivateLAN.Asatoolenablingcollaborativeenvironments,VPNscanbeusefulforvariousapplications.Iffriendsandfamilyrequirecomputerassistanceandtheircomputergurunolongerlivesnearby,thegurucanremotelylogintothemachineusingaVPNrunningovertheInternetdespitenetworkingconstraintsbetweenthetwoparties.Whentravelingabroad,ausermaywishthattheirInternettrafcbekeptprivatefromthelocalnetwork.AVPNcanbeusedtorouteallInternetpacketssecurelythroughtheuser'shomeorofcenetwork,ensuringtheuser'sprivacy.Manycomputerandvideogameshavemultiplayernetworkingcomponentsthatrequiredirectconnectivity.Mostofthesegamesrelyoncentralizedserversforbootstrappinglimitingtheirlifespan.PlayersofthesegamescancontinueplayingthroughVPNs.SmallandmediumbusinessesmayndVPNsusefulforconnectingdesktopsandserversacrossdistributedsitessecuringtrafc 1Fortheremainderofthisdocument,unlessexplicitlystatedotherwise,securityimpliesencryptionandmutualauthenticationbetweenpeers. 17

PAGE 18

toenterprisenetworkedresources.IndependentorganizationsthateachhavelimitedresourcescancombinetogethertheirresourcesthroughaVPNtocreateapowerfulcomputinggrid.TheutilityoftheVPNdescribedhereinisillustratedbyacollaborativegridcomputingproject,Archer[ 37 ].Archerconsistsofover700coreresourcesaswellasvoluntaryresourcesfromthecommunitytoprovideadynamicanddecentralizedgridenvironmentforcomputerarchitectureresearcherstoshareandaccesscomputecycles.UseofcentralizedsystemswouldlimitthescopeofArcherandrequirededicatedadministration,whereasexistingdecentralizedsolutionsrequiremanualcongurationoflinksbetweenpeers,whichisbeyondthescopeofArcher'stargetusers.CurrentP2P(peer-to-peer)virtualnetwork(VN)approacheseitherlackscalabilityorpropersecuritycomponentstobeconsideredVPNs;whereasmyapproachappliesnaturallytosuchsystems.TherearevariousVPNarchitecturesthatattempttodealwiththechallengespresentedintheseusecases.Insome,certainVPNapproachesmaywork,whereothersarenotapplicable,andinothersscenarios,nocurrentVPNapproachisapplicable.Ingeneral,successfuldeploymentanduseofVPNsfacethefollowingchallenges: OverlayConguration.Peersmustbeabletondeachotherinordertobootstrapintotheoverlayandtoestablishlinkswithspecicusersinsidetheoverlay. Connectivity.Networkasymmetriescreatedbyrewalls,NAT(networkaddresstranslation),andInternetrouteroutagesmotivatetheuseforVPNs.Oneapproachistoroutealltrafcthroughathirdparty,butthisincursoverheads.TherealsoexistapproachestoallowtwopeersbehindNATdevicestocommunicatedirectlyandfallingbacktoarelayifthetwopeersarebehindtoorestrictivenetworks. PeerManagement.Toensurereliabilityandtrust,adistributedsystemshouldemploysecurity.Peermanagementinvolvesprovidingandobtainingsecuritycredentialsaswellaspreventingmisbehavingpeersfromcommunicatingwithauser'sresourceorexcludingthementirelyfromthesystem. 18

PAGE 19

Privacy.TheoriginalintentionforVPNswasnetworksecurity,thusallcommunicationbetweenpeersisprivate.ManyVPNsonlysecuretrafcbetweenhopsandarethussusceptibletoman-in-the-middleattacks.Unfortunately,establishingend-to-endprivacycanbechallengingasitrequiresadditionalout-of-bandexchanges. EndpointConguration.Applicationstransferpacketsthroughanetworkinterface.EndpointcongurationisnecessarytoenablesendingandreceivingpacketsfromuserspacethroughtheVPN.CollaborativeenvironmentscanstronglybenetfromaVPNthatisbothuser-friendlyaswellasscalable.Asystemthatisonlyuser-friendlywillinitiallyattractinterestbutfrustrateusersinthelongrun,whilesystemslackinguser-friendlinessmayhavelimiteduseradoption.Byapplyingtheserequirementstotheaforementionedchallengesleadstothefollowinggoals:acollaborativeVPNshouldbeeasytocongure,suchthatusersshouldbeabletodeployandusethemwithoutbeingexpertsinoperatingsystems(O/Ss)ornetworks;thesystemshouldnotrequireadditionalresourcestosupportmoreusers;addingnewusersandresourcesshouldbestraight-forwardusingapproachesfamiliartocommonusers;peersshouldbeabletoconnecttoeachotherdirectlyifandwhenpossible;andallcommunication,notjusthop-by-hop,shouldbesecure.WhileexistingVPNsareabletomeetsomeoftheserequirements,theyareunabletomeetthemall.Centralizedapproaches(e.g.OpenVPNR[ 120 ])bytheirverynaturerequirededicatedinfrastructuresanddonotallowdirectcommunicationbetweenpeers,thoughwhenconguredtodosoareabletoguaranteeall-to-allcommunicationregardlessofNATandrewallconditions.P2Pbasedapproaches(e.g.HamachiR[ 67 ],Wippien[ 79 ],GbridgeR[ 66 ],PVC[ 85 ])solvetheissueofdirectcommunication,thoughtheyarevulnerabletoman-in-the-middleattackswhensessionmanagementishandledbyanexternalprovider,relyonacentralresourceforthecreationofVPNlinks,andrequiremanagedrelaysifdirectpeercommunicationacrossNATsandrewallsfails.Distributedapproaches(e.g.,ViNe[ 108 ],Violin[ 53 ],VNET[ 106 ],tinc[ 100 ])requiremanualcongurationoflinksbetweenmembersofthevirtualnetwork.ExistingP2P 19

PAGE 20

overlayapproacheslackscalability(N2N[ 29 ]andP2PVPN[ 46 ])oraredifculttocongureandlackprivacy(I3[ 103 ]).MyworkculminatesinanoveldesignandimplementationofaVPNfromendpointcongurationtooverlayconstructionandorganizationresultinginanautonomicVPNthatbridgesthegapbetweenuser-managedVPNsandthosehostedbythird-partyservices.TheVPNbuildsontopofaP2Psystemusedtotransparentlyhandlenetworkasymmetriesandsupportaddressallocationandresolution.TheP2Psystemorganizesintoastructuredoverlay,whichsupportsscalable,distributeddatasharingviaadistributedhashtable(DHT).PeerssearchforeachotherbyqueryingtheDHTandthenuseconstructsprovidedbytheP2Playertoformdirectlinksorrelayswithremotepeers.PeermanagementishandledthroughcommonsocialnetworkinginterfacessuchasdedicatedgroupinfrastructuresorrelationshipsbaseduponXMPP(ExtensibleMessagingandPresenceProtocol)orFacebookR.BoththeVPNandtheoverlayaresecuredbyacommonsecuritylterframework,whichcanbedecentrallylocatedandbootstrappedthroughexistingoverlays.Finally,throughvariousVPNmodels,usersandsystemadministratorscantakethesameVPNsoftwareandinstallitinvariousenvironmentswithminimalcongurationoverhead. 1.1VirtualPrivateNetworkBasicsVPNsconsistoftwocomponents:clientsandservers2.Clientsdiscoverotherclientsbymeansofserversoroverlays.DependingontheVPNstyle,clientswilltheneithercommunicatewitheachotherthroughserversorusethemtoestablishdirectlinkswitheachother.WhilesetupmaybedifferentamongstthevariousVPNs,duringrun 2ThedenitionofaserverisVPNdependent,thegeneralconceptisaresourceorsetofresourcesthatmaintainstateinthesystem.Itmightbeacentralizedresource,anoverlay,orevenaclientinthecaseofP2Psystems. 20

PAGE 21

Figure1-1.AtypicalVPNclient time,theenvironmentprovidedbyaVPNclientisthesameregardlessofhowtheserveroroverlayisimplemented.Figure 1-1 abstractsthecommonfeaturesofallVPNsclients,aserviceandavirtualnetwork(VN)deviceprovidingcommunicationwiththeVPNsystemandhostintegration,respectively.Duringinitialization,theVPNserviceauthenticateswiththeoverlaybymeansofacentralizedordistributedservice,independentlywitheachpeer,orsomeothermeans;then,optionally,queryingforinformationaboutthenetwork,suchasnetworkaddressspace,addressallocations,anddomainnameservice(DNS)servers.Atwhichpoint,theVPNenablessecurecommunicationamongstparticipants.Clientscanauthenticatewiththeoverlayusingavarietyofmethods.Asystemcanbesetupquicklybyusingnull(no)authenticationorasharedsecretsuchasakeyorapassword.Usingaccountsandpasswordswithorwithoutasharedsecretprovidesindividualizedauthentication,allowinganadministratortoblockallusersifthesharedsecretiscompromisedorindividualuserswhoactmaliciously.Usinguniqueprivatekeyswithcorrespondingsignedcerticatesprovideamoresecureapproach,because 21

PAGE 22

iteliminatesthefeasibilityofbruteforceattacks.Thetrade-offsintheapproachescomeintermsofsecurity,usability,andmanagement.Whiletheuseofsignedcerticatesprovidesbettersecuritythansharedsecrets,certicatesrequiremorecongurationandmaintenance.Inasystemcomprisingofnon-experts,likemanyuniversityVPNs,theusualsetupusesasharedsecretandindividualuseraccounts.SecretscanbepackagedwiththeVPNapplication,solongasitisdistributedthroughsecurechannelssuchasauthenticatedHTTPS(hypertexttransferprotocolsecure).AVNdeviceaspresentedinFigure 1-1 allowsnetworkingapplicationstocommunicatetransparentlyovertheVPN.TheVNdeviceprovidesmechanismsforinjectingincomingpacketsintoandretrievingoutgoingpacketsfromthenetworkingstack,enablingtheuseofcommonnetworkAPIssuchasBerkeleySockets,allowingexistingapplicationstoworkovertheVPNwithoutmodication.WhiletherearemanydifferenttypesofVNdevices,TAP[ 63 ]standsoutfromtherestduetoitsopensourceandpervasivenature.TAPallowsthecreationofoneormoreVirtualEthernetand/orIPdevicesandisavailableforalmostallmodernoperatingsystemsincludingWindows,Linux,MacOS/X,BSD,andSolaris.ATAPdevicepresentsitselfasacharacterdeviceprovidingreadandwriteoperations.IncomingpacketsfromtheVPNarewrittentotheTAPdeviceandthenetworkingstackintheOSdeliversthepackettotheappropriatesocket.OutgoingpacketsfromlocalsocketsarereadfromtheTAPdevice.VNdevicesarenodifferentthananyothernetworkdevice.Theycanbeconguredmanuallythroughcommand-linetoolsorOS'APIs(applicationprogramminginterface)ordynamicallybytheuniversallysupporteddynamichostcongurationprocess(DHCP)[ 4 31 ].UpontheVNdeviceobtaininganIPaddress,thesystemaddsanewruletotheroutingtablethatdirectsallpacketssenttotheVPNaddressspacetobedirectedtotheVNdevice.PacketsreadfromtheTAPdeviceareencryptedandsenttotheoverlayviatheVPNclient.TheoverlaydeliversthepackettoanotherclientoraserverwithaVNstackenabled.Receivedpacketsaredecrypted,veried 22

PAGE 23

forauthenticity,andthenwrittentotheVNdevice.Inmostcases,theIPlayerheaderremainsunchanged,whileVPNcongurationdetermineshowtheEthernetheaderishandled. 1.2ComputerNetworkArchitecturesAllmodelsforcomputercommunicationindistributedsystemsfallundertwocategories:centralizedanddecentralized.Sub-classesofthesecategoriesincludehybridsystemswithcentralizedsessionmanagementanddecentralizedcommunicationandself-conguring,dynamicP2Psystems.ThearchitecturescommonlyusedforimplementingVPNsystemsarecentralizedorganizationandcommunication,centralizedorganizationanddecentralizedcommunication,decentralizedcommunicationwithmanualorganization,anddecentralizedcommunicationwithautomaticorganization.SystemswithCentralizedorganizationandcommunicationconsistofclientsandserverswherealldistributedpeersareclientsdiscoveringandconnecting,ororganizing,throughadedicatedcentralizedresource.Clientsnevercommunicatewitheachotherdirectly,butrathereverymessagebetweentwoclientsmusttraversetheserver.Forinstance,mostonlinesocialnetworks(OSNs)arerepresentativeofthesetypeofsystems.UsersofOSNslikeFacebookR[ 36 ]andMySpaceR[ 74 ]communicatethroughcentralizedenvironments,neverdirectlytoeachother'scomputers.OpenVPNR[ 120 ]representsthisVPNapproach.Thesesystemsrelyondedicatedresources.Inthesituationthataservergoesofineorbecomesoverwhelmedbytheclients,thesystemisrendereduseless.CentralizedorganizationanddecentralizedcommunicationsystemsincludetherstsetofpopularP2Psystems,suchastheoriginalNapsterR,KazaaR,andVPNslikeHamachiR[ 67 ].Similartotheclient-servermodel,clientsconnecttoaservertondotherclients,thoughinsteadofcommunicatingthroughtheserver,theclientsformdirectconnectionswitheachother.Theseapproachesarelimitedbynetworkaddresstranslation(NAT)andrewallsthatmaypreventpeersfromcommunicatingwitheach 23

PAGE 24

other.Inthesecases,thecentralservermayactasarelayallowingthetwoclientstocommunicatethroughit.Unlikesystemsusingcentralizedcommunication,thesesystemsarelesssusceptibletobeingoverwhelmedbyclienttrafcandeveniftheservergoesofineexistingclientlinksremainactive,thoughnewconnectionscannotbeestablished.Systemsemployingdecentralizedcommunicationwithmanualorganizationaddresstheissuesofacentralsystemgoingofine,becauseclientsareconguredtoconnecttoanynumberofdistributedserversforminganoverlay.Inthesesystems,serversareexplicitlyconguredtocommunicatewithotherservers.Thoughthisapproachimprovesupontheperformanceandavailabilityissuesinherenttocompletelycentralizedarchitectures,ifaservergoesofineanysystemscommunicatingthroughitwillnolongerbeconnectedtotherestofthesystemuntiltheadministratorcreatesadditionallinksortheserverbecomesactiveagain.Clientsinthesesystemsdonottypicallyformdirectlinkswitheachother;rather,theyroutepacketsthroughtheoverlay.ThisapproachhasbeenusedtocreatescalableVPNs,likeViNe[ 108 ],VNET[ 106 ],Violin[ 53 ],andLayer2TunnelingProtocolbasedVPNs[ 107 ].Inautomaticorganization-baseddecentralizedcommunicationsystems,thereisnodistinctionamongstpeersastheyactasbothclientandservers,i.e.,aP2Psystemoroverlay.P2Psystemsareusuallydistributedwithalistofcommonpeers.ApeerattemptingtobootstrapintotheP2Poverlayrandomlyselectspeersonthislistuntilitisabletoconnectwithone.Thisconnectionisthenusedtoformconnectionswithotherpeerscurrentlyintheoverlay.Theoverlaycanbeorganizedintwodifferentforms:randomlyordeterministicallycreatingunstructuredorstructuredoverlays,respectively.Inanunstructuredoverlay,linksareformedarbitrarily,thusapeersearchesforanotherpeerbybroadcastingthemessageorusingstochastictechniques.Instructuredoverlays,peersorganizeintotopologiesbydeterministicallyformingconnectionswithpeersnearbyintheoverlayaddressspacecreatingstructuressuchasringand 24

PAGE 25

hybercubes.Peerscanbefounddeterministicallyusinggreedyroutingapproachesinusuallylog(N)time.Gnutella[ 87 ]lesharingsystemandSkypeR[ 99 ]arepopularexamplesofunstructuredsystems,whileP2PSIP[ 14 ]anddistributedhashtables(DHTs)[ 104 ]arepopularinstructuredsystems.Achallengeinunstructuredsystemsisndingdataobjectsinreasonableamountoftime,whilestructuredsystemssufferwhenlargeamountofpeersjoinorleavethesystem,knownaschurn[ 86 ].Ingeneral,bothapproachesaredifculttosecuredependingonthenatureoftheapplicationanddeployment.Whenusedinprivateenvironmentsthough,theyhavebeenshowntobeveryuseful,exempliedbyDynamo[ 28 ]orBigTable[ 22 ].Thisdissertationusesstructuredoverlaysasthefoundationinbuildingscalable,decentralizedVPNs,thefollowingsectionreviewsstructuredoverlays. 1.3StructuredOverlaysStructuredP2Poverlaysprovidedistributedqueryingsystemswithguaranteedsearchtime.Unlikeunstructuredsystems[ 19 ],whichrelyonglobalknowledge/broadcastorstochastictechniquessuchasrandomwalksthattakeO(N)timetoguaranteendingdataintheoverlay,structuredoverlaysorganizeintowell-denedgeometrieswithsupporttoresolvequerieswithinO(log(N))Thereexistsaplethoraofstructuredsystemsfoundbothinresearchandinavailableapplications[ 13 71 72 82 94 104 ].Inordertoobtainguaranteedsearchtime,structuredsystemsself-organizingintowelldenedtopologies,suchasaring(picturedinFigure 1-2 )orahypercube.Peersjoininganoverlaytypicallyfollowtheseabstractedsteps: 1. generateorobtainauniqueidenticationnumber(nodeID)withintheoverlay'saddressspace,usuallyontheorderof128-bitsto256-bits; 2. attempttoconnecttooneormorerandomaddressesfromapre-sharedlistofwell-knownendpoints,dedicatesresourcesfromaserviceprovideroruserswithhighuptime; 3. becomeconnectedtoatleastonepeerinthislist(leafconnection,bootstrappeer); 4. ndthesetofpeersintheaddressspaceclosesttothenode'sID; 25

PAGE 26

Figure1-2.1-Dringstructuredoverlay 5. establishconnectionsorexchangeconnectioninformationwiththosepeers(neighborornearconnections); 6. andnallyconnecttoothernodesintheoverlayoutsidethesetofnearconnectionstoenablequicklytraversingtheaddressspace(shortcutorfarconnections).AllnodesarerequiredtohaveauniquenodeID.Addresscollisionscancauseinconsistenciesintheoverlay,whereoneorbothofthenodeswillnotbeabletoproperlyconnecttotheoverlay.Furthermore,havinguniformlydistributednodeIDsenhancestheutilityoftheshortcutconnections.ToobtainagooddistributionofnodeIDs,eitheracentralservercanprovidetheIDoreachnodeindependentlyofotherscanusea 26

PAGE 27

cryptographicallystrongrandomnumbergenerator.Theformerapproachcanbeusedtocreateatrustedoverlaybyhavingthethird-partysigneachnodeIDs[ 20 ].Inaring,eachnodemustbeconnectedtoclosestneighborsinthenodeIDaddressspace,thatisthenodeimmediatelybeforeandafterit.Optimizationsforfaulttolerancesuggestthatforringtopologiestheamountshouldbeatleast2anduptolog(N)onbothsides.Considerthecasewhenthereisoverlaydisconnectivitypotentiallyduetochurn;apeerreceivesapacketbutcannotrouteitclosertothedestinationthanitselfbecauseitdoesnothaveaconnectionwiththatpeer.Themessagemayeitherbelocallyconsumedorthrownawayneverarrivingatitsintendeddestination.Increasingthenumberofnearneighborpeersreducesthelikelihoodchancesofpacketsbeinglostduetochurn,especiallyifpeersleavesuddenlywithoutwarning.Asmentioned,shortcutsorfarconnectionsenableefcientroutinginring-basedandsimilarlydesignedstructuredoverlays.Thevariousshortcutselectionmethodsinclude:maintaininglargetableswithoutusingconnectionsandonlyverifyingusabilitywhenroutingmessages[ 72 94 ],maintainingaconnectionwithapeeratspeciclocationsintheP2Paddressspace[ 104 ],orusinglocationsdrawnfromaharmonicdistributioninthenodeaddressspace[ 71 ].Structuredoverlayssupportdecentralizedquerysystemsthatcanbeusedtobuilddistributeddatastructuressuchasadistributedhashtable(DHT)bymappingkeysviaahashfunctiontonodeIDsinanoverlay.ThedataassociatedwiththekeyisthenstoredatthenodeclosesttothenodeIDofthekeyandforfaulttolerancecanbestoredbyothernodesnearbyormorekeyscanbegeneratedbyrecursivelyhashingtheoriginalkey.UsingtheDHTprimitives,Past[ 95 ]andKosha[ 17 ]projectshavedesignedmorecomplexdistributeddatastores.TheactualmechanismforqueryingnodesorroutinginaP2Poverlaycanbeeitheriterativeorrecursive.Initerativerouting,thequeryingnodeiterativelycontactsnodescloserandclosertotheaddressuntilndingtheclosestnodeatwhichpointitmakes 27

PAGE 28

therequestdirectlytothatnode.Inmoredetail,thequeryingnodedirectlyqueriesthenodeclosesttothedestination,thatnodereturnsbackoneormorenetwork(IP)andP2Paddressesofcloserpeers,thequeryingnodequeriesthesepeers,andtheprocesscontinuesuntildeterminingthereexistsnoclosernode.Alternativelyinrecursiverouting,aqueryingpeersendsthemessagetothepeerclosesttothedestinationfromitsperspective,repeatingtheprocessuntilthemessagehasarrivedattheclosestpeertotheaddressorthedestination.Comparedtorecursiverouting,iterativecanbeimplementmoreeasilythoughwithconsiderableoverheadaseachoverlayquerywillcauselog(N)connectionstoform.NATsfurthercomplicatetheuseofiterativeroutingaspeersattemptingtoconnectwithanotherpeerbehindaNATwillneedtheassistanceofathird-party,whereasrecursiveroutingmaintainsactiveconnectionsandmessages,seamlesslytraversingNATlinksandnon-NATlinkssincetheconnectionsareestablishedpriortomessagetransmission. 1.4NetworkAsymmetriesNaiveP2Psystemsassumenetworksymmetry,thatisanypeercancommunicatedirectlywithanyotherpeerusingtheunderlyinginfrastructure.UnlessthesoftwareisruninsideaLANoranenvironmentwherethenetworktopologyiswellcontrolledanddened,symmetrycannotbeguaranteed.P2PusedinwideareasystemsoftenreliesontheInternet.BesidesthepotentialroutingoutagesontheInternet,signicantamountofresourceswhicharenotdirectlyaccessibleareconnectedtoit.TheissueisonlyfurtherpressedbythecurrentmeansofconnectingtotheInternet:IPv4(InternetProtocolversion4)withitslimitedaddressspaceofonly232(approximately4billion).WiththeEarth'spopulationatover6.8billionandeachindividualpotentiallyhavingmultipleInternet-capabledevices,theselimitationsbecomemoreapparent.CurrentlythetwoapproachesaddressingIPv4limitationsare:theuseofNATstoenablemanymachinesanddevicestoshareasingleIPaddressbutpreventingbidirectionalconnectioninitiationandIPv6(InternetProtocolversion6)whichsupports 28

PAGE 29

Figure1-3.CommunicationbetweenapeerbehindaNATandonewithapublicaddress 2128addresses.TheuseofNATs,asshowninFigure 1-3 ,complicatesthebootstrappingofP2Psystemsasitpreventspeersfromsimplyexchangingaddresseswitheachothertoformconnections,astheaddressesmaynotbepublic.Inaddition,rewallsmaypreventpeersfromreceivingincomingconnections.Thus,whiletheeventualwidespreaduseIPv6mayeliminatetheneedforaddresstranslation,itdoesnotdealwiththeissueofrewallspreventingP2Papplicationsfromcommunicatingaswellasroutingoutages,anditisnotclearthatIPv6userswillnotcontinuetorelyonNAT/rewalldevicestoprovideawell-denedboundaryofisolationfortheirlocalnetworks.Whenamachine,A,behindatypicalNAT,B,sendsoutapackettoanInternethost,C,theNATdevicetranslatesthepacketsothatitappearsitiscomingfromtheNATdevicemakingtheNATdeviceagateway.WhenthepacketissentfromAtoC,thesourceanddestinationarelistedasIP:portpairs,wherethesourceanddestinationareIPA:PortAandIPC:PortC,respectively.AforwardsthepackettoBwhotransformsthesourcefromIPA:PortAtoIPB:PortB,wherePortAmayormaynotbeequaltoPortB.ThiscreatesaNATmappingsothatincomingpacketsfromIPC:PortCtoIPB:PortBaretranslatedandforwardedtoIPA:PortA.ThereareahandfulofrecognizedNATdevices[ 93 101 ].Thefollowinglistfocusesonthemoreprevalenttypes: FullCone.AllrequestsfromthesameinternalIPandportaremappedtoastaticexternalIPandport,thusanyexternalhostcancommunicatewiththeinternalhostonceamappinghasbeenmade. 29

PAGE 30

RestrictedCone.Likeafullcone,butitrequiresthattheinternalhosthassentamessagetotheexternalhostbeforetheNATwillpassthepackets. PortRestrictedCone.Likearestrictedcone,butitrequiresthattheinternalhosthassentthepackettotheexternalhostsspecicport,beforetheNATwillpasspackets. Symmetric.Eachsourceanddestinationpairhavenorelation,thusonlyamachinereceivingamessagefromaninternalhostcansendamessageback.OfthevariousscenariosinvolvingpeersandNATs,solongasonepeerisonanyoftheconeNATsandtherearenorewalls,itcanreceiveincomingconnectionrequests.ChallengestothisapproachexistwhenrewallsareintroducedorbothpeersarebehindsymmetricNATs.FirewallsmaytrafcthatwouldotherwiseallowNATtraversal,whereassymmetricNATsrequirecomplexmechanismsinanattempttohaveincomingconnectionrequests.Thesetypesofsystemstypicallyrelyonathird-partytopassmessagesbetweenthepeers. 1.5ContributionsTheresultingexpectationsofacollaborativeenvironmentthataddressesthechallengeslistedintheintroductionareself-conguringenvironmentsenablingevennon-expertstosetup,deploy,andmanagetheirownVPNs;peersshouldcommunicatewitheachotherdirectlywhenpossibleorthroughefcientindirectpathswhenconstrained;andthesystemshouldbereliableandensuretheprivacyofitsusersToaddresstheserequirements,IproposeanovelGroupVPNusingstructuredoverlaysconsistingofthefollowingnovelcontributions:SecureOverlays.Typicaloverlaysaresecuredusingheuristicsthatlimittheeffectsofmalicioususers.Challengesofusingsecuresessionsforinstitutingtrustorsecurityintoanoverlaydependsonthecommunicationpathways.Ifthegoalforthesystemistosupportasymmetriesonthenetwork,thenthesystemwillhavetomakesignicantuseofdatagramtechnologies.Thisworkproposesauniqueltermechanismtosupport 30

PAGE 31

encryptinganyformofcommunicationbetweentwopartiesandexaminestheoverheadsofdeployingitinsimulatedandrealenvironments.BootstrappingAd-Hoc,DecentralizedSystems.SecureoverlayspresentachallengewhenthereisaonetoonemappingbetweenoverlayandVPNinordertosecurelyisolateaVPN.Thisstemsfromthefactthatatanygiventime,peersmayormaynotbeconnectedtotheoverlay.Whenusedinsmallgroups,mostorallmembersmaybebehindNATsorremainonlineforshortperiodsoftime,creatingasituationwherenotasingleuseronapubliclyaddressableresourcewillbeonline,limitingtheuseofprivateoverlays.Toaddressthisissue,Iproposethereusingofpublicfree-to-joinoverlaystobootstrapintoaprivateoverlay.Peersusethepublicoverlaytondeachotherandexchangeconnectioninformationusingsecuremessages.Onlypeerswithappropriatesecuritycredentialsareabletojointheprivateoverlay.DecentralizedRelays.Incollaborativeenvironments,mostpeersarebehindNATsandpotentiallyrewallsaswell.WhileingeneralmostNATsaretraversablethroughexistingapproaches,notallare.Firewallsonlycomplicatethematter.Whilethesepeersmaybeabletocommunicatethroughtheoverlay,astheoverlaygrows,thislatencycanbecomeahinderacetousabilityandinteractivity.Toimprovethissituation,Iproposethecreationofautonomic2-hoprelaysbetweenthepeers.UsingSocialInfrastructuresForManagementandDistributionofSecurityCredentials.InordertosimplifythemanagementandaccesstoaVPN,thiscomponentexplorestheuseofsocialnetworksintermsofbothgroupsandpeerstofacilitatetrustestablishmentforaVPN.BeyondthecontributionofuniquelyusingsocialnetworkstoestablishVPNtrust,thisworkshowshowsystemscanleveragetrustinanexistingenvironmentsforuseinanother.Self-ConguringVPNArchitectures.ManyexistingVPNapproachesrequiretheuserstosetuptheirenvironmentanddonotprovideaplugandplaysystem.Inaddition,differentenvironmentscallfordifferenttypesofVPNs,explicitly,individualusersconnect 31

PAGE 32

viatheirownVPNconnections,whileclustersmaybenetfromasharedVPNormaydesirefaulttoleranceofhavingmanybutdonotwantthecommunicationoverheadwhentalkingtoVPNpeersontheLAN.Iaddressthisissuewithaself-conguringVPNapproachthatcanbeappliedtovariouslocalenvironmentsscalingfromasinglecomputertomany.PP2PVPNEnabledInternetTrafcTunneling.Whenininsecureenvironmentssuchasbrowsingprivateinformationinacoffeeshop,usersmaydesiretopreventlocalusersandadministratorsfromsnifngtheirtrafc.TraditionalVPNssupportthisbehavior,buttheapproachisdifculttoimplementinP2Psystemsduetotheirdynamicnature.Currently,nodecentralizedVPNsupportstheabilitytoperformthisbehavior.IproposeamethodthatnotonlyworksfordecentralizedandP2Psystemsbutensuresagreaterlevelofsecuritythanexistingapproachesbysecuringothernon-VPNcommunicationbetweenthepeerandgatewayresources.ApplicationsinAd-Hoc,DistributedSystems.Thevalueinacomplexsystemliketheoneproposedhereincanberealizedwhentiedtogetherforthecreationofad-hoc,distributedsystems.Thetypeofsystemfocusedoninthisdissertationisagrid.Whiletherearemanygridtopologies,approachesthatshareresourcesamongstusersandevenmostthatareusedbyasingleuserrequireauserwithexpertiseinoperatingsystems,networks,andmiddleware.ThisdissertationshowstheapplicabilityofP2PVPNmethodsandtechniquesthatcanbeusedtocreateatrusted,ad-hoc,distributedgridthatrequireslittleifanyexpertiseintheunderlyingtechnologybeingutilized.DecentralizedSocialNetworks.Traditionalapproachestosocialnetworks,suchasFacebookRandMySpaceR,requirestrustinathird-partyentity.Thesethird-partiesmineusersinformationforadvertisements,potentiallyviolatinguser'sprivacy.ThisdissertationpresentsadecentralizedsocialnetworkthataddressesrealproblemsbytakingadvantageoftheP2Psystemdescribedhereinbyprovidingeachuserina 32

PAGE 33

socialnetworktheirownprivateoverlaywhosemembersconstitutethefriendsofthatindividual.ImprovedModelsForDirectConnectionEstablishment.Originally,directlinksintheP2PVNwerebaseduponpacketowpassingathreshold.ThroughtheuseofprolingrealsystemsandpublishedresultsofInternetbehavior,Ihaveconcludedthatthismodeldoesnotscalewellandhavedesignandimplementedamodelthatsatisfactorilysolvesthisproblem.Therestofthisdissertationisorganizedasfollows.Chapter 2 overviewsexistingVNandVPNapproachesanddiscussescongurationandorganizationoftheVPNincludingend-pointconguration.InChapter 3 ,Ireviewthechallengestobootstrappingoverlaysandpresentmysolutionthatreusesexistingoverlaystobootstrapsmaller,ad-hocoverlays.ThisleadsintoChapter 4 ,whichdiscussessecurityissuesinstructuredoverlaysandaddressesthemeanstobootprivateandsecureVPNs.Chapter 5 coversextensionstotheVPNbaseduponpracticaldemandsandexperiences.Chapters 6 describestheGridAppliance,thetargetapplicationformyresearch.Chapter 7 presentsaproposedideaonhowtousethetechnologydiscussedthusfartocreateadecentralizedonlinesocialnetwork.Finally,IconcludeinChapter 8 bydiscussingthevalueinmycontributionsandchallengesthatwererevealedbutnotaddressedinthisbodyofwork,thusmotivatingfuturework. 33

PAGE 34

CHAPTER2VIRTUALNETWORKCONFIGURATIONANDORGANIZATIONVPNsenableseamlesscommunicationindistributedcomputingparticularlywhencombinginglargesetsofremoteresourcesorconnectingtocentralizedorpersonelresources.Similarusecasescanbeextrapolatedontoothercollaborativeenvironmentssuchasmultiplayergames,merginghomenetworksoveraVPN(virtualprivatenetwork),oraccessingaworkcomputerremotely.Eachapplicationhasdifferentrequirementsandinreviewofrelatedresearch[ 29 41 46 53 55 57 66 67 79 85 100 106 108 120 ]notasingleapproachefcientlysupportsthesedynamicenvironments.Certainly,ISP(Internetserviceproviders)largescaleVPNssuchasMultiprotocolLabelSwitching[ 88 ](MPLS)donotaswellduetothemanualcongurationandexpertiserequired.AnoverviewoftheseandtheonedescribedhereinarepresentedinTable 2-5 .TheorganizationofaVPNhasadirecteffectontheamountofusereffortrequiredtoconnectmultiplesites.InthisregardtherearetwocomponentsofaVPN,thelocalorganizationandtheremoteornetworkorganization.Thesetupofthevirtualnetworkinordertohavebeadestinationandrecognizedsourceforremotepacketsconstitutethelocalorganization,whereastheroutingofthepacketsamongstpeersishandledbythenetworkorganization.Priorresearchworksprimarilyfocusedonthelatterissue,whileignoringtheformer.ThisleftuserstosetuptheirownaddressallocationseitherthroughmanuallyconguringeachenvironmentordealingwiththeproblemscausedbyDHCP(dynamichostcontrolprotocol)serversincrossdomainnetworkconstruction,aswellastheirownsecuritydistributionsystems.Inaddition,organizinganetworkcanbeanevenmorecomplicatedtaskthanlocallyconguringthenetwork,becauseitmayrequirethecooperationofmanyadministratorsatthevarioussites.ThischapterpresentsanovelapproachtoVPNsthatachievesbothlocalandnetworkself-conguration. 34

PAGE 35

Table2-1.VPNclassications TypeDescription CentralizedClientscommunicatethroughoneormoreserverswhicharestaticallyconguredCentralizedServers/P2PClientsServersprovideauthentication,sessionmanagement,andoptionallyrelaytrafc;peersmaycommunicatedirectlywitheachotherviaP2PlinksifNATtraversalsucceedsDecentralizedServersandClientsNodistinctionbetweenclientsandservers;eachmemberinthesystemauthenticatesdirectlywitheachother;linksbetweenmembersmustbeexplicitlydenedUnstructuredP2PNodistinctionbetweenclientsandservers;memberseitherknowtheentirenetworkorusebroadcasttodiscoverroutesbetweeneachotherStructuredP2PNodistinctionbetweenclientsandservers;membersareusuallywithinO(logN)hopsofeachotherviaagreedyroutingalgorithm;usedistributeddatastorefordiscovery 2.1NetworkCongurationThekeytocommunicatinginaVPNiscreatinglinkstotheVPNandndingthepeerintheVPN.ThedifferentarchitecturesforVPNlinkcreationarebasedonthemethodsdescribedinTable 2-1 .Theseapproachesaredescribedinmoredetailbelow. 2.1.1CentralizedVPNSystemsOpenVPNRisanopenandwell-documentedplatformfordeployingcentralizedVPNs.Inthisdissertation,itisusedasthebasisforunderstandingcentralizedVPNsasitrepresentsfeaturescommontomostcentralizedVPNs.IncentralizedVPNsystems,clientsforwardallVPNrelatedpacketstotheserver.ClientresponsibilitiesarelimitedtoconguringtheVN(virtualnetwork)deviceandauthenticatingwiththeVPNserver,whereastheserversareresponsibleforauthenticationandroutingbetweenclientsandprovidingaccesstotheservers'localresourcesandtheInternet(fulltunnel).Likewise,broadcastandmulticastpacketsalsomustpassthroughthecentralserver. 35

PAGE 36

CentralizedVPNscansupportmultipleservers:uponstarting,theclientcanrandomlyselectfromalistofknownservers,implementingasimpleloadbalance.Onceconnected,theserversprovidetheclientanIP(InternetProtocol)addressintheVPNaddressspace.Dependingonconguration,thisallowsaclienttocommunicatewithotherclients,resourcesontheserver'snetwork,orInternethostsviatheVPN.Serversrequireadditionalcongurationtocommunicatewitheachother.Allinter-clientcommunicationowsthroughacentralserver.Bydefault,aclientencryptsapacketandsendsittotheserver.Uponreceivingthepacket,theserverdecryptsit,determineswheretorelayit,encryptsit,andthensendsthepackettoitsdestination.Thismodelallowsaservertoeavesdroponcommunication.Whileasecondlayerofencryptionispossiblethroughasharedsecret,itrequiresout-of-bandcommunicationandincreasesthecomputingoverheadoncommunication. 2.1.2CentralizedP2PVPNSystemsHamachiR[ 67 ]istherstwell-knowncentralizedVPNthatusedtheambiguousmonikerP2PVPN.Inreality,thesesystemsarebetterclassiedascentralizedVPNserverswithP2P(peer-to-peer)links.SimilarVPNsincludeWippien[ 79 ],GbridgeR[ 66 ],PVC[ 85 ],andP2PVPN1[ 46 ].TheP2Pinthesesystemsislimitedtodirectconnectivitybetweenclientsorchestratedthroughacentralserver:inWippienitisachatserver,whileP2PVPNusesaBitTorrentRtracker.IfNAT(networkaddresstranslation)traversalorrewallspreventdirectconnectivity,thecentralservercanactasarelay.Eachapproachusestheirownsecurityprotocolswithmostusingaservertoverifytheauthenticityandsetupsecureconnectionsbetweenclients.InregardstotheP2PVPN,longtermgoalsinvolvethecreationofanunstructured,whichwouldprovideamethodofdecentralizedorganization. 1DuetothesimilaritiesbetweenthenameP2PVPNandfocusofthisdissertation,P2PVPNrefersto[ 46 ]andP2PVPNtototheuseofP2PinVPNs. 36

PAGE 37

2.1.3DecentralizedVPNSystemsSomeexamplesofsystemsthatassistindistributingloadinVPNsystemsaretinc[ 100 ],CloudVPN[ 34 ],ViNe[ 108 ],VNET[ 106 ],andViolin[ 53 ].Thesesystemsarenotautonomicandrequireexplicitspecicationoflinksbetweenresources.Thismeansthat,likeOpenVPNR,thesesystemscansufferVPNoutageswhennodesgoofine,thusadministratorsmustmaintaintheVPNconnectiontable.UnlikeOpenVPNR,theseapproachestypicallydonotrequireall-to-alldirectconnectivityforall-to-allcommunication.Userscaneithersetupout-of-bandNATtraversalorroutethroughrelays.Linksaremanuallycongured. 2.1.4UnstructuredP2PVPNSystemsUnlikecentralizedanddecentralizedsystems,P2Penvironmentsrequiretheusertoconnecttotheoverlay,whichthenautomaticallycongureslinks.Thesimplestformofoverlaysareunstructured,wherepeersformrandomconnectionswitheachotherandusebroadcastandstochastic(e.g.randomwalks)techniquestondinformationandotherpeers;however,duetoitsunstructurednature,thesystemcannotguaranteedistanceandroutabilitybetweenpeers.TheonlyexampleofanunstructuredVPNisN2N[ 29 ].InN2N,peersrstconnecttoasupernodeandthen,tondanotherpeer,theybroadcastdiscoverymessagestotheentireoverlay.Inthecasethatpeerscannotformdirectconnection,peerscanroutetoeachotherovertheN2Noverlay.IntherealmofVPNs,allclientVPNsarealsoserversperformingauthenticationthoughneitherapproachdealswithdecentralizedaddressallocation. 2.1.5StructuredP2PVPNSystemsToaddressthescalabilityconcernsinunstructuredsystems,thisworkusesstructuredP2Poverlays.Asdescribedintherstchapter,structuredP2Poverlaysprovidedistributedlookupserviceswithguaranteedsearchtimewithinlog(N)timeincontrasttounstructuredsystemswithNtime.Ingeneral,structuredsystemsareabletomaketheseguaranteesbyself-organizingastructuredtopology,suchasa 37

PAGE 38

one-dimensional(1-D)ringorahypercube,deterministicallybyrandomlygeneratednodeidentiers.Theprimaryfeatureusedbystructuredoverlaysisadistributeddatastoreknownasadistributehashtable(DHT),whichstoreskey,valuepairs.Intheoverlay,thekeyisanoverlayaddress,wherethevalueisstored.Thepeerclosesttothekey'soverlayaddressisresponsibleformaintainingthevalue.CryptographichasheslikeSHA(SecureHashAlgorithm)andMD5(Message-Digestalgorithm5)canbeusedtoobtainthekey'soverlayaddressfromastringorsomeotherbytearray.Gangulyetal.[ 44 ]andStoicaetal.[ 103 ]describemethodsforaddressallocationusingaDHT(distributedhashtable).EachVPNhasauniquenameornamespace,whenapeerrequestsanIPaddress,amappingofhash(namespace,IP)tothepeersoverlayaddressisatomicallywrittentotheDHT.AsuccessimpliesthatthewriterwastherstwritertothatvalueandotherpeersreadingthatvaluewillbeabletoidentifythatpeerasownerofthatIPaddressinthatnamespace.LikewisewhenapeerwantstorouteapackettoaremoteVPNpeer,theyquerytheDHTusingthemapping,whichreturnstheoverlayaddress.TheIPpacketisthensenttotheoverlaydestination.Unicastmessagesaresentbetweentwoendpointsontheoverlayusingnormaloverlayroutingmechanisms.Directoverlaylinkscanbeusedtoimproveperformancebetweenendpoints.Gangulyetal.[ 41 ]describesamethodbywhichpeerscanformautonomicdirectconnectionswitheachotherusinganunstructuredoverlay.AsIPtrafcincreasesoveraperiodoftime,adirectconnectiontobypasstheoverlayisinitiatedbythereceiverofthepackets.Alternatively,aVPNmaywishtoformall-to-allconnectionswithVPNpeers[ 38 ].Tosupportbroadcastandmulticastinanoverlay,allmembersofasubnetassociatethroughtheDHTbyplacingtheiroverlayaddressataspecickey,i.e.,namespace:broadcast.Thenwhensuchapacketisreceived,itissenttoalladdressesassociated 38

PAGE 39

withthatkey.ItisuptotheVNateachsitetolterthepacket.Thisissufcienttosupportdeploymentswheremulticastorbroadcastisnotrelieduponextensively. 2.2LocalCongurationAtrstorder,therearetwoapproachestolocalVPNconguration:asingleVNendpointperahost,Interface,andaVNrouterendpointformanyhostsonthesameLAN(localareanetwork),router.Thecomponentsdifferingbetweenthetwoapproachesare: SoftwareLocation.InterfacesexecutethesoftwareoneachVPNconnectedresource,whereasanymachineconnectedtothesameLANasaRouterwillbeabletoaccesstheVPN.TheRouterrequiresadedicatedresource. NetworkConguration.SincetheInterfacesoftwarerunsoneachmachine,itisabletodirectlycongurenetworkingparameters,whereasaRoutermustuseexternalmethodstoconguretheresources. CommunicationonaLAN.WhentwopeersonaLANusingaVPNInterfacetocommunicate,alltrafcmustpassthroughtheVPNaddingunnecessaryoverhead,thoughinaRouterthetwopeershaveamergedphysicalandvirtualnetworkbetweenthemandthetrafcisabletobypasstheVPN. FaultTolerance.TheRouteronlyhasasingleinstancerunning,whenitgoesofine,allresourceswilllosetheirVPNaccess,whereaseachindividualresourcehastheirownInterfaceandisresponsiblefortheirownVPNconnectivity. CommunicationOvertheWAN.PerformingencryptioncanbeexpensiveandmaylimitthebandwidthavailableduetoCPU(centralprocessingunit)constraints.ARoutermaystruggletousealltheavailablebandwidth,whereasenoughInterfaceswilleventuallybeabletouseallthebandwidth.AlthougheachadditionalVPNInterfacealsohasidletrafc,potentiallyreducingusablebandwidth.Thisdissertationidentiesmethodsbywhichasinglesoftwarestackcanbeimplementedtosupportself-congurationandresourcemigrationinawaythatisplatformindependent.ThismethodlendsitselftoanewarchitectureknownasHybrid,allowinganinstancetoberunoneachVPNresourcebutenablingdirectcommunicationamongstpeersonaLAN[ 116 ].ThearchitecturesareshowncommunicatingviaanoverlayinFigure 2-1 andcomparedinTable 2-2 .ThetwoaspectsthatneedcongurationinthelocalcongurationbeyondtheVPNarchitectureareaddress 39

PAGE 40

Table2-2.Qualitativecomparisonofthethreedeploymentmodels InterfaceRouterHybrid HostLANNoassumptionIdeally,VLANNoassumption,thoughmayhaveduplicateaddressallocationinthesamesubnetfordifferentnamespaces.2HostsoftwareIPOP,tapEndnode:none.Router:IPOP,tap,bridgeIPOP,tap,VETH,bridgeHostoverheadCPU,memoryEndnode:none.Router:CPU,memoryCPU,memoryLANtrafcThroughIPOPBypassesIPOP*BypassesIPOP*MigrationHandledbynodeInvolvessourceandtargetroutersHandledbynodeTolerancetofaultsNodesareindependentRouterfaultaffectsallLANnodesNodesareindependent Figure2-1.ThreeVNapproaches:router,interface,andhybrid 40

PAGE 41

allocation,obtainingandsettinganIPaddressonaresource,andaddressresolution,determiningwheretorouteaVPNpacket.Thekeystocreatingthisenvironmentinvolvetheuseofstandardnetworkprotocolsimplementeduniformlyacrossoperatingsystems,includingDHCPandARP(addressresolutionprotocol).ManyapplicationsmakeuseofnamesinsteadofIPaddressestoresolvepeers,assuchanamingsystem,likeDNS(domainnameservice)isalmostasimportantasaddressresolutionandallocation.AstatemachinerepresentationofthisarchitectureisshowninFigure 2-2 .Inthisrepresentation,aVNinterfaceisidenticaltoaVNrouterwiththecaveatthattheTAPdeviceisnotbridged,thusisolatingtheVNtrafc.TheShouldHandlewithdashedlinesisafeaturethatisspecictotheVNhybrid;thatis,aVNhybridmustbeconguredtocommunicateforasinglenetworkdevice. 2.2.1LocalVPNArchitectureAsdescribedintheintroduction,theTAPdeviceisthegluebywhichthelocalresourcescommunicatewiththeVPN.EachapproachreliesontheTAPdevicethoughindifferentcongurations.IntheInterface(Figure 2-3 ),theTAPdeviceisuseddirectlybytheuserasanyothernetworkdevice.Inshort,packetsarewrittentotheTAPdevicebytheO/S(operatingsystem)socketsandreadbytheVPNsoftwaretosendtotheremotelocation,packetsreceivedbytheVPNarewrittentotheTAPdeviceanddeliveredtosocketsbytheO/S.TheRouter(Figure 2-4 )bridgestheTAPdevicetoaLAN,thuspacketscanberoutedtoitandsentthroughtheVPN.TAPdevicevirtualizesabridgetootherphysicalnetworks.Finally,theHybrid(Figure 2-5 )liketheRouterconnectstotheLANbutonlyallowscongurationfromthelocalhost.InLinuxthisispossiblethroughtheuseofaVETHpseudodevicethatprovidesavirtualEthernetpair,sothatoneendcanbebridgedwiththeTAPdeviceandLANwhiletheotherprovidesanotherinterfacethatcanbeconguredontheLAN,whichwillbeusedbytheVPN.Thereasonforthisliesinthenatureofthestateoftheinterfacesconnectedtothebridge,whichgointopromiscuous 41

PAGE 42

Figure2-2.Thestatediagramofaself-conguringVN mode,sothatallpacketssenttothemareforwardedonasiftheyareonawireasiftherewereonlyasinglenetworkinterface.Innon-promiscuousmode,thenetworkcardwilldroppacketsthatarenotdestinedforthatnetworkcard.Sointhatcase,itisnotpossibletoassignmorethanoneIPaddresstoabridge,becauseitandalldevicesconnectedtoitareviewedasonebignetworkinterface.ConnectingtheVETHdeviceallowsanadditionaluniquelyidentiableEthernetaddressesandthusadditionalIPaddresses.Incontrast,aliasingaEthernetcardonlyprovideadditionalIPaddresses 42

PAGE 43

Figure2-3.VNinterface Figure2-4.VNrouter 43

PAGE 44

Figure2-5.VNhybrid andservicesthatrelyonlayer2networking.Inthiscase,someservicesmaynotwork,forexample,DHCPdoesnotworkonaliasednetworkcards. 2.2.2AddressResolutionIPisalayer3protocol.Layer2devicessuchasswitches,bridges,andhubsarenotawareofIPaddresses.Whenasystemwantstosendalayer3packetoveralayer2network,itrstusesARPtondthelayer2addressowningthelayer3address.Thisprocess,asshowninFigure 2-6 ,beginsbythesendingofalayer2broadcastmessagewhichcontainsanARPrequest,askingallmembersintheLANthatthenodeowningthetargetIPaddressrespondtothesenderoftherequest.IfanodeownsthetargetIPaddress,itrespondswithanARPreply,makingthemselvesthesenderandtheoriginalsenderisthemessagerecipient.TheEthernetheaderconsistsofthesourceaddressbeingthesenderandthetargetbeingthedestination.Bylisteningtotheserequests,layer2devicessuchasaswitchcanautonomouslylearnthelocationofnodesholding 44

PAGE 45

Figure2-6.ARPrequest/replyinteraction Ethernetaddressesareandcanforwardpacketsthroughappropriateportsasopposedtobroadcastingordroppingthem.InatypicalIPsubnet,allmachinestalkdirectlywitheachotherthroughswitches.Assuch,theymustlearneachother'sEthernetaddress.TheVNmodelusedhereinfocusonalarge,atsubnetspanningacrossallnodesconnectedtotheVPN.Toaccomplishthis,theVNprovidestheabilitytovirtualizeabridge,similartoproxyARPs[ 80 ]usedtoimplementatransparentsubnetgateway[ 18 ].Inthisscenario,theVNwouldneedtorespondtotheARPpacketswithafakelayer2address.Layer2devicesinthesystemwouldthenrouteallpacketsdestinedforthatlayer2addresstotheVN.Asshowninthestatemachine(Figure 2-2 ),ARPsareonlyrespondedtoif(a)theyareinquiringaboutaVNIPaddress,(b)theVNaddressisnotlocallyallocated,and(c)thereisaP2P:IPmapping.Ifallthosearetrue,thenanARPresponseissentbacktothesender.ARPsareoccasionallysentoutduringthecourseofcommunicationandthusifamachinemigratestoaVNrouter,theVNrouterwillnolongerrespondwithARPs.AnARPresponsesentbytheVNrequiresasourceEthernetaddress,bridgesandswitcheswillseetheresponseandwillforwardalltrafctowardstheTAPdeviceforthatEthernetaddress.AVNdevicecanusethesameEthernetaddressforremoteentities.PriortotheintroductionoftheVNhybrid,theVNsusedtheEthernetaddressFE:FD:00:00:00:00torefertoremoteentities.IfeachVNhybridusedthisaddress,therewouldbelayer2collisioncausingasinglehybridtohavealltrafcsenttoit.In 45

PAGE 46

Figure2-7.DHCPclient/serverinteraction hybridmode,eachVNmustgenerateauniqueremoteEthernetaddressatruntime.Experienceandresearchhasledtothefollowingsolution:(1)useFE:FDforthersttwobytesastheytendtobeunallocatedand(2)assignrandomvaluestothe4remainingbytes.Applyingthebirthdayprobleminthiscontext,theexpectedprobabilityofaddresscollisionsissmallfortypicalLANenvironments(lessthan50%iftheaveragenumberofVNhybridnodesonthesameL2networkis65,000).ThekeydifferencefromtheHybridandRouteristhattheHybridroutesforonlyasinglenode,sayA,andthusmustignoremessagesthatdonotoriginatefromA.TheHybridmodeldoesnotnecessarilyknowabouttheexistenceofallmachinesinaLAN,becauseitdoesnotownthem.SowhenanARPrequestofsomeremotemachine,sayB,issentbyA,theHybridmustsendoutamatchingrequestwiththeresultbeingsentbacktothepseudo-entityofthetransparentsubnetgatewaysothattheVPNcandetermineifBexistslocally.Ifnomessageisreturnedafterasetamountoftime(thereferenceimplementationused2seconds),thenassumingthatthereisapeerintheoverlaywiththeIPaddress,theoriginalARPwillberespondedtowiththepseudo-entitybeingthetarget. 2.2.3AddressAllocationIPaddressesaretraditionallyallocatedinoneofthreeways:1)statically,2)dynamicallythroughDHCP,or3)throughpseudo-randomlink-localaddressing.Thismodelfocusesonstaticanddynamicaddressing. 46

PAGE 47

ThenetworkcomponentscongurablebyDHCP[ 4 31 ]thatareinterestingtoaVPNareaddresses,routing,andothernetworkingrelatedfeatures.Whilemanydifferentclientandserversexist,theyalltendtosupportthebasicfeaturesofallowingtheservertospecifytoaclientandIPaddress,agatewayaddress,anddomainnameservers.AsshowninFigure 2-7 ,thestepsinDHCPare: 1. ClientsendsDiscoverpacketrequestingaddress. 2. Serverreceivesthepacket,allocatesanaddress,andsendsanOfferoftheaddressandothernetworkconguration. 3. ClientreceivesandacknowledgestheOfferbysendingaRequestmessagetoaccepttheOffer. 4. ServerreceivesRequestmessageandreturnsanACKmessagecontainingthesamedetailsastheOffer.DuringtheDHCPphase,theVPNcommunicateswithaDHCPserverfortheVPN,whichwillallocateanaddressfortherequester.Similarly,aVNmodelcanreviewpacketscomingintotheVPN,reviewthesenderIPaddress,andrequestandnotifytheserverofthisallocation.TreatingstaticaddresseslikeDHCPenableseasiercongurationoftheVPN,thoughitisdifculttohandleaddressconicts.Inthismodel,thisisdonebytheserverignoringtheduplicaterequests,anditisuptotheusertocongureforanewaddress.ThusDHCPprovidesamorereliablemethodinthesesystems.Tosupportscalableaddressallocationindecentralizedsystems,theDHCPserverisavirtualentity,parsingDHCPpacketsandinteractingwithanoverlaybasedDHT.ThisapproachdoesnotneedtobelimitedtostructuredoverlaybasedVPNsbutcanbeintroducedasanaddedvaluecomponent.AnimportantaspectofDHCPisthatafteramachinehasreceivedanIPaddressfromtheDHCPserver,italwayscheckstoensurethattheaddresshasnotbeenallocated,assuchtheVPNshouldneverrespondtoaddressresolutionsforlocalIPaddresses. 47

PAGE 48

IfanoverlayallocatesanaddresstotheVN,thentheVNownsit.TheotheraddressthattheVNownsisthenulladdress,0.0.0.0,whichissentduringDHCPtoindicatethatthemachinehasnoaddresspriortotherequest. 2.2.4DomainNameServersandServicesNameservicesallowmachinestobeaddressedwithnamesthataremoremeaningfultousersthannumericaddresses.Certainapplicationsandservicesrequiredomainnamechecking,suchasCondor.TosupportDNS,thisrequiresthattheO/SbeprogrammedwiththeVN'sDNSserversIP,typicallythelowestavailableIPaddressinasubnet.Instaticconguration,thisprocessrequirestheusertomanuallyaddthisaddress,thoughthroughDHCPthisissetautomatically.InthestaterepresentationoftheVN(Figure 2-2 ),theVNcheckstheIPpackettoensurethatthedestinationIPandportmatchthatofthevirtualDNSserverandthewell-knownDNSport,53.Intheeventofamatch,thepacketispassedtotheVN'shandlerfordomainnames.Namesaretypicallyusedforthefollowingpurposes:1)becauseapplicationsrequireit,and2)toassistusersinndingresources.Todealwith1),theDNScandeterministicallymapsIPaddressestonames,suchas10.250.5.5mapstoC250005005.2)canbesolvedbyusingtheDHTandplacingkey:valuepairsoftheformhash(namespace:hostname)toIPaddressandhash(namespace:IPaddress)tohostname. 2.3SupportingMigrationTherehasbeenarapidincreaseinthedeploymentofVirtualMachines(VMs)foruseinresourceconsolidationintheserverindustryaswellasthedomainofcloudcomputing.Providersofcloudcomputingserviceshaveadoptedvirtualmachinesastheunitofgranularityforprovidingservicesandservicelevelagreementtotheusers.UsersarebilledaccordingtothenumberofVMsandtheiruptime.Majorcloud-computingprovidersincludingAmazonREC2(ElasticCloud2)andGo-GridhaveadoptedXenas 48

PAGE 49

thevirtualizationplatformfortheirservicesandsellcomputeresourcesintheformofvirtualmachines.Apartfromadvantageslikeperformanceisolation,security,andportability,oneofthesignicantadvantagesofusingVMsisthecapabilitytomigratetheVMwithitsentiresoftwarestackfromonephysicalhosttoanother.Thismigrationmaybeperformedinastop-restartmanner,wheretheVMispaused,migratedtoanotherhostandrestarted,orinalivemode,whichattemptstominimizedowntimetoreduceinterruptionofservicesrunningontheVM.VMsincludingXen[ 61 ],VMwareESX[ 75 ]andKVM[ 81 ]supportmigrationwithtwocriticalrequirements:(1)lesystems(diskimages)mustbeonasharedstoragesystem(i.e.networklesystemsorstorageareanetworks)and(2)tomaintainnetworkconnectivity,themigrationmustoccurwithinanIPsubnet.Inordertoretainnetworkconnectivityaftermigration,theVMM(virtualmachinemanager)mustnotifytheLANoftheVM'snewlocation.ThenewVMMhostgeneratesanunsolicitedARPreplywhichbroadcaststotheentirenetworktheVM'snewlocation.TheVNInterfaceandHybridmodelssupportmigrationofthevirtualaddressusingtechniquespreviouslydescribedbyGangulyetal.[ 41 ].Thisisaproductofthedecentralized,isolatedoverlayapproachwhereeachoverlayendpointhasaone-to-onemappingtoVNendpoint,e.g.,P2PtoIP.WhenaVNInterfaceorHybridmodelmigrates,theoverlaysoftwaremustreconnecttotheoverlay,atwhichpoint,packetswillbegintoberoutedtotheVNendpointagain,completingmigration.UnlikeInterfaceandHybridmodels,theVNRouterdoesnotsupportaone-to-onemapping.Infact,aVNroutertendstohaveoneP2PaddressformanyIPaddresses.WhenamachinewithaVNIPwantstomigrate,itcannotalsotakeitsP2PaddresswithitotherwiseitwouldendconnectivityfortherestofthemembersoftheVNroutersharedoverlayendpoint.AsolutiontothisproblemsrequirestheabilitytodeleteIP-to-P2PmappingsintheDHT,detectnewaddressesonthenetwork,andinformsendersthat 49

PAGE 50

anIPisnolongerlocatedatthatoverlayendpoint.Withthesecapabilities,transparentmigrationcanbeachievedfortheVNroutermodelasfollows.TheVMMinitiatesamigrationonasourcenode.Untilthemigrationcompletes,theVNrouteratthesourcecontinuestoroutevirtualIPpacketsfortheVM.Uponcompletionofmigration,theVNrouteratthetargetlearnsaboutthepresenceofthemigratedVMbyeitherreceiptofanunsolicitedARPorbyproactivelyissuingperiodicbroadcastICMP(InternetControlMessageProtocol)messagesonitsLAN.TheVNrouterattemptstostore(Put)theIP:P2PaddressmappingintheDHT,andqueriesfortheexistenceofotherIP:P2Pmapping(s).Ifnopreviousmappingsarefound,theVNrouterassumesresponsibilityfortheIPaddress.Otherwise,theVNroutersendsamigrationrequesttoeachP2PaddressreturnedbytheDHT.TheVNrouterreceivingamigrationrequestconrmstheexistenceoftheIPaddressinitsroutingtableandthatifthereisthatthereisnoresponsetoARPrequestssenttotheIPaddress.Iftheseconditionshold,itdeletesitsIP:P2PmappingfromtheDHTandreturnstruetothemigrationrequest;otherwise,itreturnsfalse.Ifthemigrationrequestreturnstrue,theVNrouteratthetargetLANstartsroutingforthevirtualIPaddress;ifitreturns,false,theVNrouterdoesnotrouteforthevirtualIPaddressuntilthepreviousIP:P2PmappingexpiresfromtheDHT.InadditiontoVNrouterssynchronizingownershipofthemigratedvirtualIPaddress,anyhostthatisconnectedtothatmachinemustbeinformedofthenewP2Phost.Overtime,thiswillhappennaturallyasARPcacheentriesexpireandtheIP:P2PmappingislookedupfromtheDHT.Additionally,theVNrouteratthesourcemaykeepforwardingrulesforthemigratedIPaddressforacertainperiodoftime,akintomobileIPbutnotonapermanentbasis.Amoredirectapproach,asimplementedintheprototype,involvestheVNrouternotifyingtheconnectedhostofachangeinownership,resultinginthehostqueryingtheDHTfortheupdatedP2Pendpoint.An 50

PAGE 51

evaluationoftradeoffsinthemigrationdesign,whileinteresting,isoutsidethescopeofthisdissertation.AstaticaddressallocationissimilartoamigrationwithouttherebeinganIP:P2PvalueintheDHT,thoughwithoutqueryingtheDHT,thesituationisunclear.SystemsthatuseDHCPonlymusthavesomemethodfordetectingnewaddresses,becausethereisnoguaranteethataDHCPwilloccurimmediatelyfollowingmigration,infact,dependingontheleasetimethatishighlyunlikely.UsinganinsecureDHTthatsupportsdeletesissketchyasitwouldberelativelyeasyformachinestoperformmaninthemiddleattacksbydeletingkeyswhichtheydonotown.EventheuseofpasswordsmentionedinDHTliteratureisnotsufcientasitisnotimmunetocollusion,orSybil,attacks.VNroutermigrationwasanalyzedthroughtheuseoftwoXen-basedVMwareVMsco-locatedonthesamequad-coreXeon53352GHzmachineeachwith1GBmemoryallocatedusingaminimallyconguredO/SwithaSSH(secureshell)server.Theevaluationattemptstounderstandoverlayoverheadsoftheapproach.Theexperiment,asshowninFigure 2-8 ,involvedmigratingaXenguestVMbetweentwoXenhostVMMsrunninginVMware.Althoughtheyarehostedinthesameinfrastructure,thetwodomainsareconnectedtotwoseparateVLANs,andthusisolated.TheresourceinformationisstoredinaDHTrunningontopofPlanetLab.Thusthemigrationoverheadsintheexperimentcapturethecostofwide-areamessaginginarealisticenvironment.Duringthecourseoftheexperiment,over50differentIPaddressesweremigrated10timeseachinanattempttogainsomeinsightsinthecostofusingtheDHTwithsupportfordeletesandVNroutermessagesasameanstoimplementmigration.Theresult,presentedinFigure 2-9 gatheredfromtheexperimentwashowlongtheVNIPwasofine,measuredbymeansofICMPpingpackets.Onaverage,theoverheadofVNmigrationwas20seconds.ThisoverheadisinadditiontothetimetakentomigrateaVM,sincetheVNroutersbegintocommunicateonlyaftermigrationnishes. 51

PAGE 52

Figure2-8.VNroutermigration 2.4EvaluationofVPNNetworkCongurationThisexperimentsexploresbandwidthandlatencyinadistributedVPNsystemtomotivatetheusageofP2PlinksinaVPN.TheVPNsusedareincludetheprototype,whichextendsfromIPOP;OpenVPNR;andHamachiR.OpenVPNRrepresentsatypicalcentralizedVPN,whileHamachiRrepresentsawell-tunedP2P-linkVPN.TheevaluationwasperformedonAmazonREC2usingsmallinstancesizedUbuntui386instancestocreatevarioussizednetworksrangingfrom1to32.OpenVPNRusesanadditionalnodeasthecentralserverandHamachiRhasanupperboundof16due 52

PAGE 53

Figure2-9.VNroutermigrationevaluation tolimitationsintheLinuxversionatthetimeofthisevaluation.Toperformbandwidthtests,theinstancesarebootedandqueryanNFSforthelistvirtualIPaddresses,peersareorderedsuchthathalfthepeersareactasclientsandtheotherhalfthepeerscreatinga1to1mappingbetweenallsets.Latencyandbandwidthtestsareperformedusingnetperf'srequest-replyandstreamingtestsrespectively.Priortothestartofthetests,peershavenoknowledgeofeachother,exceptthevirtualIPaddresses,thusconnectionstartupcostsareincludedinthetest.Testarerunfor10minutesdilutingtheconnectioninitiationoverheadbutrepresentanexampleofrealusage.Resultsfromtheclientsarepolledatalllocationsandaveragedtogether,thoughtheOpenVPNRserverismeasuredseparately.IPOPandOpenVPNRuseauthenticated128-bitAES,while 53

PAGE 54

Figure2-10.SystemtransactionrateforvariousVPNapproaches HamachiRdoesnotallowcongurationofthesecurityparametersandusesthedefaultHamachiRsettings.Figure 2-10 and 2-11 presenttheresultsforlatencyandbandwidthrespectively.Latencyismeasuredintransactionsofsuccessfulrequest/replymessages.Inthelatencytest,itisobviousthathavingthecentralserverincreasesthedelaybetweentheclientandserverandtheresultsdegrademorequicklyasadditionalpeersareaddedtothesystem.Insmallsystems,OpenVPNRshinesprobablyduetooptimizedsoftware,thoughasthesystemgrows,thesystembandwidthdoesnot.Bythetime8peershaveenteredintothesystem,bothdecentralizedapproachesperformbetterthantheOpenVPNRsolution.Tosummarize,decentralizedVPNapproachesprovidebetterscalability,whichcanbeimmediatelynoticedbylowlatencytimesand,asthesystemgrows,availablebandwidth. 2.5EvaluationofVPNLocalCongurationThissectionpresentsanevaluationofthedifferentVNmodels,usingprototypeimplementationsbuiltuponIPOP.Thegridevaluationsimulatesaclient/serverenvironmentandinvestigateCPU/networkingoverheadsrelatedwitheachapproach. 54

PAGE 55

Figure2-11.SystembandwidthforvariousVPNapproaches InadditionaclouddeploymentshowsaproofofconceptthatconnectsmultiplecloudandlocalresourcesaswellasevaluationofoverheadofthedifferentapproachesinWANandLANenvironments.InallWANexperiments,awide-areaIPOPoverlaynetworkwithapproximately500overlaynodesdistributedacrosstheworldonPlanetLabresourcesisusedtobootstrapVNconnectionsandtosupportDHT-basedstorageandP2Pmessaging.TheproposedVNmodelsplacevaryingdemandsontheresourcesofthesystemsinvolved.TheevaluationfocusesonCPUasexperiencesuggestthatthisisthemostsignicantlimitingfactor.Aswillbepresented,theCPUloadofferedbythesemodelsdependsonthebandwidthoftheunderlyingnetworklink,sincealargerbandwidthrequiresmoreprocessingofpackets.ThetoolsforevaluatingtheseVNmodelsareNetperfandSPECjbbR.Netperf[ 54 ]isusedtoestimatethelatencyandbandwidthofthedifferentVNmodels.ThelatencyismeasuredbydeployingNetperfintheTCP RRmode,whichmeasuresthenumberof1-byterequest-receivetransactionsthatcanbecompletedinasecond.ThebandwidthisestimatedbyrunningNetperfintheTCP STREAM 55

PAGE 56

mode,whichisabulktransfermode.Itshouldbenotedthatinsituationswherethelinkbandwidthswereasymmetric,Netperfisdeployedinbothdirections.SincebothlatencyandbandwidtharedependentontheCPUcomparison,evaluationsthatincludeCPUutilizationtasksrequirecreatingabaselinerstwhereonlyNetperfistheonlyactiveworkload.SPECjbbR[ 102 ]simulatesathree-tierwebapplicationwithalltheclients,themiddletier,andthedatabaserunningonasinglesysteminasingleaddressspace(insideaJavavirtualmachine).Oncompletion,thebenchmarkprovidesthemetricintermsofbusinessofoperationspersecond(bops).ThebopsscoreofthesystemundertestdependsonboththeCPUandthememoryinthesystem,astheentiredatabaseforthebenchmarkisheldinmemory.Thisbenchmarkgeneratesnegligiblediskactivityandnonetworkactivity. 2.5.1OntheGridTheinitialevaluationinvolvestestingaclient-serverenvironment.Thebaselinehardwareconsistedofquad-core2.3GHz5140Xeonwith5GBmemoryandGigabitnetworkconnectivity.EachVMwasallocated512MBofRAMandranDebian4.0usingaLinux2.6.25kernel.Theclientsideconsistedof4VMson5machines.Theserversideconsistedof5VMsononemachinewith4actingasserversand1actingasagateway,whichwasnecessarytocontrolbandwidthintothesystem,donethroughtheLinuxutilitytc[ 52 ],trafccontrol.Inthisenvironment,eachserverhad5clientscommunicatingwithit.ThesetupisshowninFigure 2-12 .TheVMServersranSPECjbbRandwerealsothesiteforthecollectionofthenetperfbenchmarks.AlltheVMServerswereconnectedthroughtheTCGatewaythroughhost-onlynetworkingtotheVMClients.AlltrafcfortheVMServerspassesthroughtheTCGateway,whichalsodoubledastheRouterintheRouterexperiments.AlltheevaluationspresentedinFigures 2-13 2-14 2-15 ,and 2-16 aremarkedupinthesamefashion.TheevaluationswereperformedwithandwithoutaSPECjbbR 56

PAGE 57

Figure2-12.Gridevaluationsetup Figure2-13.GridNetperfbandwidth(TCPSTREAM)evaluation load.Linesareoftheform(nospec,spec).(phys,interface,router).WherespecindicatesSPECjbbRbenchmarkisactive,whilenospecindicatesthatSPECJbbisinactive.physimpliestheabsenceofIPOPwithbenchmarksoccurringdirectlyoverthephysicalnetworkcard.interfaceandrouterpresenttheresultsforVNinterfaceandRouterrespectively.Themaximumbandwidthof600Mbpsisachievedwhenneithervirtualnetworknortrafcshapingareenabled(nospec.physat1000MbpslimitinFigure 2-13 ),whichisonly60%ofthetheoreticalmaximum.ThislimitismostlikelythecostofVMMs,specicallythetimerequiredforapackettotraversebothVMMsnetworkingstackaswellasthehostsnetworkingstack.Anotherobservationwasthattransactionspersecond(Figure 2-16 )donotimprovesignicantlyfortcbandwidthlimitabove25Mbpsinallcases;thusfocusisononlytherelevantdatauptothislimit. 57

PAGE 58

Figure2-14.GridNetperflatency(TCPRR)evaluation Figure2-15.GridSPECjbbevaluationwithNetperfTCPSTREAMload DistinguishingfeaturesofthedifferentVNmodelsincludethefollowing.Figure 2-13 showsthatbandwidthinallVNmodelsiscomparablewithtrafccontrollimitupto75Mbps.Beyondthispoint,theinterfacemodelachievesbetterbandwidththantheRouter(VNprocessingisdistributedacrossmultipleprocesses);thespec/nospecratiointheroutermodelissmallerthanintheinterfacemodelbecausethereislessresourcecontentioncausedbyVNprocessingonendnodes.Forthesamereason,theRoutertendstoachievebetterSPECresults(Figure 2-15 )thantheinterface.Figure 2-14 showsthattheRouterperformspoorlycomparedtotheinterfacemodelintermsoftransactions/second,thoughitachievesabetterratioofSPECjbbRscore(Figure 2-16 )totransactionsthantheinterfaceatconstrainedbandwidths(lessthan5Mbps).Thehybridmethodwastested,andresultswerenearlyidenticaltothoseoftheinterface,fromthepointofviewoftheWANpartoftheVN,itisthesamearchitecture. 58

PAGE 59

Figure2-16.GridSPECjbbevaluationwithNetperfTCPRRload Theseresultsarenotreportedintheplotsastheyaddlittlevalueandfurtherobfuscatetheresults.ThebandwidthcapobservedintheRouterapproachreectstheperformanceachievedbythecurrentprototypeoftherouter,subjecttoVMoverheads.TheuseofVMisanassumptionthatisvalidinthedomainofcloudcomputingwhereallresourcesruninaVM.Thisexperimentfocusedontheinterplaybetweenresourceconsumptionbyoverlayroutersandapplicationperformance.Optimizeduser-leveloverlayroutersrunningondedicatedphysicalmachineshavebeenreportedtoachieveperformancenearGbit/sinrelatedwork[ 109 ].OnethingthatleftunevaluatedthatmayprovidemoreinterestingdatawouldbeprovidingtheVNrouterdedicatedhardware.Inthetestenvironments,thiswasinfeasible,becauseallbutoneofthemachinesinthelabrunVMwareServer1,whichhasabugwithsettingthevirtualnetworkcardinpromiscuousmode.ThiseffectivelymakesitimpossibleforaVMtobeaVNrouterasnopacketswillevermaketheirwayintotheVM,astheVMMwillrejectallpackets.Assuch,themachineshostingtheservershadVMwareServer2,whichdoesallowsettinganetworkinterfaceintopromiscuousmode. 59

PAGE 60

Table2-3.WANresultsforinter-cloudnetworking EC2/UFEC2/GoGridRUF/GoGridR StreamPhys(Mbs)89.2135.9330.17StreamVN(Mbs)75.3119.2125.65RRPhys(Trans./s)13.3511.099.97RRVN(Trans./s)13.3310.699.76 2.5.2IntheCloudsThegoalofthisexperimentistodemonstratethefeasibilityofconnectingmultiplecloudprovidersaswellaslocalresourcestogetherthroughvirtualnetworking.ThesiteschosenforevaluationwerelocalresourcesatUniversityofFloridaandcloudresourcesprovidedbyAmazonREC2andGoGridR.Aqualitativeobservationherewasthatthedifferencesinthenetworkinginfrastructureexposedbydifferentcloudprovidersreinforcetheimportanceofthevirtualnetworktoallowexibilityinhowendnodesareconnected.Specicnetworkcongurationsforthecloudswereasfollows: AmazonREC2providesstaticIPnetworking(publicandprivate),noEthernetconnectivity,andnoabilitytorecongureIPaddressesfornetwork.Currently,onlytheVNinterfacemodelissupported. GoGridRprovides3interfaces(onepublic,staticallycongured,andtwoprivate,whichcanbeconguredinanymanner);the2privateinterfacesareonseparateVLANssupportingEthernetconnectivity.TheVNinterface,router,andhybridmodelsaresupported.ThisexperimentnarrowsdowntheperformanceevaluationtofocusonWANandLANperformanceofVNsincloudenvironmentsandconsiderNetperfsingleclient-serverinteractionsonly.AmazonRonlysupportsInterfacemode,thusitisonlyevaluatedintheWANexperiment.Ithasbeenobservedthat,withinAmazonR,theVNisabletoself-organizedirectoverlayconnections[ 42 ].Eachtestwasrun5timesfor30seconds,thestandarddeviationforallresultswaslessthan1.Becauseofthis,onlytheaverageispresentedinTable 2-3 60

PAGE 61

Table2-4.LANresultsperformedatGoGrid VNInterfaceVNRouterVNHybridPhysical Stream(Mbs)109325324327RR(Trans./s)1863227722533121 ItcanbeseeninTable 2-3 thattheVNaddslittleoverheadintheNetperf-RRexperiment.BetweenUFandGoGridRaswellasbetweenUFandAmazonREC2,theoverheadfortheStreamexperimentwasabout15%.Thismaybeattributedtotheadditionalper-packetoverheadoftheVNandthesmallMTUsetfortheVNinterface(1200).TheMTU,ormaximumtransmissionunit,isthelargestpacketthatissentfromaninterface.IPOPconservativelylimitstheVNMTUto1200downfromthedefault1500toallowforoverlayheadersandtoworkproperlywithpoorlyconguredrouters,whichhasencounteredinpracticaldeployments.AmoredynamicMTU,whichwillimproveperformance,isleftasfuturework.TheEC2/GoGridRexperimenthadgreateroverheadwhichcouldpossiblybeattributedtobytheVMencapsulationofcloudresources.Table 2-4 showsthatsomeoftheperformanceexpectationsforthedifferentmodelsinaLANwereaccuratelypredictedwhileotherswerenotsoclear.StreamresultsmatchtheexpectationthatVNmodelshybridandrouterbypassvirtualizationandgetnearphysicalspeeds,whereasinterfacedoesnot.Interestingly,RRhadratherpoorresultsforRouterandHybridthoughfurthertestingseemstoindicatethatthisisanissueofusingtheVLANconnectednetworkinterfacesasopposedtothepublicnetworkconnectedinterface. 61

PAGE 62

Table2-5.Virtualnetworkcomparison OverlayRoutingCongurationMiscellaneous IPOPStructuredP2Poverlaywithlog(N)routinghops,whereNisthesizeofP2Pnetwork.Self-optimizingshortcutsandSTUN-basedNATtraversal.MappingstoredinDHTresolvesvirtualIPaddresstoP2Paddress.Virtualnetworkpacketsareroutedtocor-respondingP2Paddress.EachmachinerunsP2PVPNsoftwarewithadynamicIPaddressinacom-monsubnet.Com-moncongurationsharedamongstallhosts.SupportsencryptedP2Plinksandend-to-endVPNtunnels(unpublishedwork).Migrationpossi-ble;routesself-congurewithoutuserintervention,productoftheP2Poverlay.N2NUnstructuredP2Pnetwork,supernodesprovidecontrolpaths,formsdirectcon-nectionsfordata.Broadcastfordiscov-eryandoverlayforcontrol.Noorganiza-tion,noguaranteesaboutroutingtime.RequiresN2Nsoftwareateachhost,mustconnecttoasupernode.Supportslayer2Ethernetnetwork.Supportssharedsecretstocreateprivatetunnelsbetweenedges.Migrationnotdis-cussed,butpoten-tiallyfreeduetolayer2approach.OCALANottiedtoanyspecicoverlay,layer3middle-ware.Baseduponchosenoverlay.RequiresOCALAstack,overlaycon-guration,andIPtooverlaymapping.SecurityisoverlaybasedorSSHtun-nels.Migrationnotmentioned.SoftUDCVNETDecentralizedwithexplicitlycon-guredoverlayroutes.Broadcastfordiscov-ery.Requiressoftwareoneachhostandoneproxypersite.Layer2networking.Securityisnotdiscussednoriswide-areamigra-tion.ViNeViNeauthorityconguresglobalnetworkdescrip-tortable(GNDT)explicitlyateachrouter.SupportsproxyingtoonelocationthroughanotherandNATtraversal.GNDTprovidesoverlayroutesforallroutersinoverlay.Eachsubnetisal-locatedasinglerouter.EachhostmustbeconguredforregularandViNenetworks,butnoVNsoftwareneededonhost.SupportsencryptedtunnelsbetweenViNerouters,migra-tionnotdiscussed.ViolinDecentralizednet-workwithstaticallyconguredoverlayroutes.BroadcastdiscoveryforEthernet,staticroutesforIPsubnet.Virtualhostscon-nectVMstotheVN.Hostsconnecttovirtualswitchesorproxies(gateways).Switchesconnecttoproxies.Sitesaretypicallyallo-catedanIPaddressspace.SecuritypotentiallythroughtheuseofSSHTunnels.Migrationpossible;requiresrecongu-rationofswitches. 62

PAGE 63

Table 2-5 .Continued OverlayRoutingCongurationMiscellaneous VirtuosoVNETDecentralizedwithexplicitlycon-guredoverlayroutes.Broadcastfordiscov-ery.Bridginglearnspathsafterinitialdiscovery.VirtualnetworkpacketsareroutedbetweenVNETproxies.Canbeconguredmanu-ally.EachsiterunsaproxyprovidingEthernetbridgetootherproxies.VMhostsforwardpack-etstolocalproxy.Proxiesconguredtoconnecttootherproxies.SecuritythroughtheuseofSSLandSSHTunnels.Layer3migration,productoflayer2virtualization.OpenVPNRCentralizedCentralserverServersmanu-allyconguredtoconnectwitheachother.Clientsran-domlyselectserverfrompre-sharedlistAllcommunicationtraversescentralserver,endtoendtrafcbydefaultisnotprotectedfromcentralserverTincandCloudVPNDecentralizedwithexplicitlycon-guredoverlayroutesBroadcastfordis-covery,messagestraverseoverlayManualcongura-tionNATtraversalthroughrelaysonlyHamachiRCentralizedDis-covery,P2PlinksPeersestablishse-curitylinksandendpointinformationfromacentralserver,attempttoformdi-rectconnections,iffails,relaythroughcentralserverSelectanetworktojoinorcreateandspecifyapassword,communicateswithacentralizedservertomanagetheVPNLacksportability,Linuxversionoutofdate,inabilitytorunexternalrelayservers,UDPNATtraversalGBridgeCentralizedDis-covery,P2PlinksPeersestablishse-curitylinksandendpointinformationfromacentralserver,attempttoformdi-rectconnections,iffails,relaythroughcentralserverSelectanetworktojoinorcreateandspecifyapassword,communicateswithacentralizedservertomanagetheVPNLacksportabilityandinabilitytorunexternalrelayservers,usesTCPNATtraversalWippienCentralizedDis-covery,P2PlinksPeersdiscoverandauthenticateeachotherthroughXMPPchatserver,securityprovidedunknown,peersattempttoformdirectcon-nectionswitheachother,ifthatfails,nocommunicationAllpeersmustbemembersofas-sociatedXMPPchatroomsandbeconnectedtothechatRequiresaGUI,difcultypenetratingNATs,claimstobeopensourcethoughmostofthecodeisunavailable,Linuxclientoutofdateanddoesnotsup-portNATtraversal 63

PAGE 64

Table 2-5 .Continued OverlayRoutingCongurationMiscellaneous P2PVPNCentralizedDis-covery,P2PlinksPeersdiscovereachotherthroughaBitTorrentRtrackerandattempttoformdirectlinkswitheachother,attemptstoformall-to-allcon-nectivity,ifdirectlinksareunavailable,indirectlinkscanbeusedtoforwardpacketsPeersmustjointhesametrackerandusecommonsharedsecretWorkinprogresstomakemoreun-structured,currentlyacrossbetweencentralizedanddecentralized 64

PAGE 65

CHAPTER3BOOTSTRAPPINGPRIVATEOVERLAYSWhileP2Poverlaysprovideascalable,resilient,andself-conguringplatformfordistributedapplications,theiradoptionrateforuseacrosstheInternethasbeenslowoutsideoflarge-scalesystems,suchasdatadistributionandcommunication.Generaluseofdecentralized,P2P(peer-to-peer)applicationstargetinghomesandsmall/mediumbusinesses(SMBs)hasbeenlimitedinlargepartduetodifcultyindecentralizeddiscoveryofP2Psystems,thebootstrapproblem,furtherinhibitedbyconstrainednetworkconditionsduetorewallsandNATs(networkaddresstranslators).WhiletheseenvironmentscouldbenetfromP2P,manyoftheseuserslacktheresourcesorexpertisenecessarytobootstrapprivate1P2Poverlaysparticularlywhenthemembershipisunsteadyanddistributedacrosswide-areanetworkenvironmentswhereasignicantamountof(orall)peersmaybeunabletoinitiatedirectcommunicationwitheachotherduetorewallsandNAT(networkaddresstranslation).Examplesoflarge-scaleP2PsystemsincludeSkypeR,BitTorrentR,andGnutella.SkypeRisavoiceoverP2Psystem,whereasBitTorrentRandGnutellaareusedforlesharing.Thebootstrappinginthesesystemstypicallyreliesonoverlaymaintainersusinghighavailabilitysystemsforbootstrapping,bundlingtheirconnectioninformationwiththeapplicationdistribution.Theapplicationthenusesthesesserversduringtheinitializationphasetoconnectwithotherpeersinthesystem.Alternatively,someservicesconstantlycrawlthenetworkandplacepeerlistsondedicatedwebsites.Anewpeerwishingtojointhenetworkqueriesthewebsiteandthenattemptstoconnecttothepeersonthislist. 1Inthecontextofthischapter,privateimpliesthattheoverlay'spurposeisnotforgeneraluse.Onceestablished,suchoverlayscansupportprivacyincommunication;however,overlaysecurityisbeyondthescopeofthischapterandcoveredinmoredepthinChapter 4 65

PAGE 66

Figure3-1.BootstrappingaP2Psystemusinganexisting(generic)overlay Insmaller-scalesystems,P2Pinterestsfocusondecentralization.Forexample,usersmaydesiretorunanapplicationatmanydistributedsites,buttheapplicationlacksdedicatedcentralserverstoprovidediscoveryorrendezvousserviceforpeers.Incontrast,dedicated,centralizedP2Pserviceproviders,suchasLogMeIn'sHamachiR,aP2PVPN(virtualprivatenetwork),maycollectusagedata,whichtheusersmaywishtoremainprivate,orarenotfreeforuse.Manyapplicationsmakesenseforsmall-scaleoverlayusage,includingmultiplayergames,especiallythosethatlackdedicatedonlineservices;privatedatasharing;anddistributedlesystems.Clearly,asmallP2Psystemcouldbebootstrappedbyoneormoreusersofthesystemrunningonpublicaddresses,distributingaddressesout-of-band,instructingtheirpeerstoaddthataddresstotheirP2Papplication,andtheninitiatebootstrapping;butthesetypesofsituationsareanexceptionandnotthenorm.Ultimately,theuserswouldbeenhancedsignicantlythroughapproachesthatcanmakedecentralizedbootstrappingtransparentthroughminimalandintuitiveinteractionwiththeP2Pcomponent.Thebasicbootstrappingprocesscanbebrokendownintotwocomponents:ndingandconnectingtoanactivepeerinthesystem.Whenanodestarts,itcontactsvariousbootstrapservers,untilitsuccessfullyconnectswithone,uponwhichtheyexchange 66

PAGE 67

information.Thebootstrapservermayinquireintotheoverlayforthebestsetofpeersforthenewpeerandrespondwiththatinformationoritmayrespondwithitsexistingneighborset.Atwhichpoint,thepeerattemptstoconnectwiththosepeers.Thisprocesscontinuesaggressivelyuntilthepeerarrivesatasteadystate,eitherconnectingwithaspecicsetoforanumberofpeers.Afterwards,theP2Plogicbecomespassive,onlyreactingtochurnfromnewincomingoroutgoingpeers.Overlaysupportforconstrainedpeers,i.e.,thosebehindNATsandrestrictiverewalls,requiresadditionalfeaturestosupportall-to-allconnectivityforpeersintheoverlay.TheinstantiationofP2Psystemsforprivateusecouldbecomeoverlyburdensome,potentiallyrelyingonsignicanthumaninteractiontobootstrapthem,forexample,byrelayingconnectioninformationthroughphonecallsande-mail.Evenifthisisfeasible,thissortofinteractionisundesirable.P2Psystemsshouldbeself-discovering,minimizingtheamountofworkusersneedtodoinordertotakeadvantageofthem,afeaturestressedbyad-hocsystems.Inaddition,theseapproachesmayrelyoncentralizedcomponents;iftheybecomeunavailable,whichisapossibilitysincemostuserslacktheexpertiseinconguringhighlyavailablesystems,thesystemwillnotbeaccessible.Toaddressthis,Ihaveexploredthepossibilityofusingexistingpublicoverlaysasameanstobootstrapprivateoverlays.Therearemanyexistingpublicoverlayswithhighavailability,suchasSkypeR,Gnutella,XMPP(ExtensibleMessagingandPresenceProtocol),andBitTorrentR;byleveragingthesesystems,systemintegratorscaneasilyenableuserstoseamlesslybootstraptheirownprivateP2Psystems.Intheprecedingparagraphs,Ihaveidentiedthecomponentsnecessaryforbootstrappingahomogeneoussystem;inthefollowing,Iwillexpandthemforenvironmentstosupportthebootstrappingofaprivateoverlayfromapublicoverlaywithconsiderationfornetworkconstrainedpeers.ThepublicoverlaymustsupportthefollowingmechanismsasillustratedinFigure 3-1 : 67

PAGE 68

Reection.AmethodforobtainingglobalapplicationandIPaddressesoridentierforapeerthatcanbesharedwithotherstoenabledirectcommunication. Relaying.Amethodforpeerstoexchangearbitrarydata,whenadirectIPlinkisunavailable. Rendezvous.AmethodforidentifyingpeersinterestedinthesameP2Pservice.Thisworkmotivatesfromthebeliefthatwhilesmall-scaleP2Psystemsareattractivefordecentralizedsystems,theoverheadsrelatingtocreatingandmaintainingbootstrapservicesmakethemunfeasible.Apublicoverlaycanbeusedtotransparentlybootstrapaprivateoverlaywithminimaluserinteraction.Therequirementsarepresentedandveriedinthecontextoftwoprototypeimplementations:aXMPP/Jabber[ 96 ]andBrunet[ 13 ].XMPP-basedoverlaysarecommonlyusedaschatportals,suchasGoogleTalkandFacebookRChat.XMPPalsosupportsanoverlayamongstserversformingthroughtheXMPPFederation,whichallowsinter-domaincommunicationamongstchatpeers,sothatusersfromvariousXMPPserverscancommunicatewitheachother.BrunetprovidesgenericP2PabstractionsaswellasanimplementationoftheSymphonystructuredoverlay.Ipresentthearchitectureforthesesystems,thelessonslearnedinconstructingandevaluatingthem,andprovideananalysisofthelatencytoestablishpeerconnectivityinasmall-scaleprivateBrunetoverlaywithNAT-constrainednodes.Theorganizationofthischapterfollows.Section 3.1 overviewsexistingsolutionstothebootstrappingproblem,andNATchallengesinP2Psystems.Section 3.2 presentsasurveyofoverlays,applyingtherequirementsforprivateoverlaybootstrappingtothem,andthenshowindetailhowtheycanbeappliedtoBrunetandXMPP.MyimplementationisdescribedinSection 3.3 .InSection 3.4 ,IperformatimingevaluationofbootstrappingoverlaysusingmyprototypeonPlanetLabanddiscussexperiencesindeployingthesystem. 68

PAGE 69

3.1CurrentBootstrapSolutionsAsdescribedintheintroduction,thesimplecaseofbootstrappingislimitedtoonepeerattemptingtondanactivepeerintheoverlayinorderforitselftobecomeamember.Thelarge-scaleprovidershaveresourcesnotreadilyavailabletosmall-scaleoverlays.Thissectionreviewsexistingtechniquesandthosebeingdevelopedanddescribestheirapplicationtosmall-scalesystems.Whenusingdedicatedbootstrapoverlays,aserviceproviderhostsoneormorebootstrapresources.Peersdesiringtojointheoverlayquerybootstrapnodes,untilasuccessfulconnectionismadetoone.ThebootstrapserverwillthenassistinconnectingthepeertoothernodesintheP2Psystem.Bootstrapnodesareeitherpackagedwiththeapplicationatdistributiontimeorthroughametadatale,suchasinBitTorrentR.Drawbackstothisapproachforsmall,ad-hocpoolsincludethatthesameserverwouldhavetobeusedeverytimetobootstrapthesystem,oruserswouldhavetoreconguretheirsoftwaretoconnecttonewbootstrapserversovertime;atleastonepeermusthaveapubliclyaccessibleaddress;andabootstrapservercanbecomeasinglepointoffailure.Anothercommonlyusedapproachforlarge-scalesystemsistheuseofahostcache[ 27 ].Clientspostcurrentconnectioninformationtodedicatedwebservices,ahostcache,thatinturncommunicatewithotherhostcaches.Forsmall,ad-hocnetworks,ahostcacheactsnodifferentlythanacentralizedrendezvouspoint,requiringthatatleastonepeerhasapubliclyaccessibleaddress.P2PVPN's[ 46 ]useofaBitTorrentRtrackerissimilartothehostcacheconcept.Thetrackerhostslemetadataandpeersinvolvedinsharing.FortheVPN,thepeerregistersavirtualleusedtoorganizethepeers,aformofrendezvous.EachpeerintheVPNqueriesthetrackerregardingthele,registersitsIPaddress,andreceivesotheractivesharersIPaddresses.PeersonpublicaddressesorusingUPnP(universalplugandplay)areabletoreceiveincomingconnectionsfromallotherpeers.The 69

PAGE 70

problemwiththisapproachisthatitisheavilyuser-driven.AusermustregisterwitheachBitTorrentRtrackerindividuallyandmaintainaconnectionwitheachofthem,inordertohandlecaseswhereBitTorrentRtrackersgoofine.Inaddition,thisdoesnotusetheBitTorrentRtrackersinanormalfashion,soitmaybebannedbytrackerhosts.ResearchhasshownthatpeerscanusethelocalitypropertiesofrecentIP(InternetProtocol)addressesinalarge-scaleP2PsystemtomakeintelligentguessesaboutotherpeersintheP2Psystemusinganapproachcalledrandomprobing[ 25 45 ].Theresultsshowthat,inanetworkoftenstohundredsofthousandsofpeers,abootstrappingpeercanndanactivepeerin100guessesto2,000guesses,dependingontheoverlay.Theapproachdoesnotreallyapplywelltosmall-scalesystems,especiallywhenpeersareconstrainedbyNATsandrewalls.RatherthandistributeanIPaddress,whichpointsexplicitlytosomelocationintheInternet,asmallP2PnetworkcanapplyanameabstractionaroundonepeerintheoverlayusingDynamicDNS[ 62 ](domainnameservice).PeersshareaDNSentry,whichpointstoabootstrapserver.Whenthepeersdetectthatthebootstrapserverisofine,atrandomtimeintervalstheywillupdatetheDNSentrywiththeirown.Theapplicationofthisapproachiswell-suitedtosmall,ad-hocgroups,astheservicecouldbedistributedacrossmultipleDynamicDNSregistrations.However,sharingaDNSentryrequirestrustingallpeersintheoverlay,makingiteasyformaliciouspeerstoinhibitsystembootstrapping.Alsotheapproachrequiresthatatleastonepeerbepubliclyaddressable;ifanon-publiclyaddressablepeerupdatesthecacheinadvertently,itcoulddelayorpermanentlypreventpeersfromcreatingaP2Psystem.Thereportedresults[ 62 ]weresimulation-basedanddidnotdeterminehowwelladynamicDNShandlesrapidchangingofnametoIPmappings.IPsupportsmulticastingtogroupsinterestedinacommonservice.InthecaseofbootstrappingaP2Psystem[ 25 94 ],allpeerswouldbemembersofaspecicgroup.Whenanewpeercomesonline,itqueriesthegroupforconnectioninformationand 70

PAGE 71

connectstothosethatrespond.Theapproach,byitself,requiresthatallpeersarelocatedinamulticastcapablenetwork,restrictingthisapproachtypicallytolocalareanetworks.Alarge-scalestructuredoverlay[ 21 24 ]couldenablepeerstopublishtheirinformationintoadedicatedlocationfortheirserviceorapplicationandthenquerythatlisttoobtainalistofonlinepeers.Peerscouldsearchforotherpeersintheiroverlayandconnectwiththemusingtheirconnectioninformation.Sincetheservicewouldbealarge-scalesystem,itcouldeasilybebootstrappedbyadedicatedbootstraporhostcaches.Asitstands,thedescribedworkswerepositionpapersandthesystemshavenotbeenfullyeshedout.Theprimarychallengeinrelationshiptosmall,ad-hocnetworksisthatitlacksdetailsbootstrappingofpeersbehindNATsintooverlaysasitprovidesonlyameansforrendezvousandnotreectionnorrelaying. 3.2CoreRequirementsAspresentedintheprecedingsections,asolutiontobootstrappingsmallP2Poverlaysmustaddressseveralchallenges,namelyreection,rendezvous,andrelaying.Thissectionpresentsagenericsolutiontothisproblem.Thebasisformysolutionisreusingexisting,free-to-joinpublicoverlay.Inordertosupportthesefeaturesthepublicoverlaymusthavemechanismsforpeerstoobtainapublicnetworkidentity(reection);searchforotherpeersthatarebootstrappingthesameP2Pservice(rendezvous);andsendmessagestopeersthroughtheoverlay(relaying).Thesearetheminimumrequirementstobootstrapadecentralized,P2PsystemwhenallpeersarebehindNATs. 3.2.1ReectionReectionprovidesapeerwithaglobally-addressableidentierforreceivingincomingmessagesfromotherpeers.Withoutreection,peersondifferentnetworkswithnon-publicaddressesareunabletocommunicatedirectlywitheachother.ReectionisnotlimitedtoIP.Forexample,whenapeerjoinsaservice,suchasa 71

PAGE 72

chatapplicationoraP2Psystem,theoverlayprovidesauniqueidentier,whichalsoservesasaformofreection.InIPcommunication,reectionenablesNATtraversal.ThesimplestmethodforNATtraversalreliesonobtainingthepublicinformationforanexistingUDP(userdatagramprotocol)socketandthensharingthatwithotherpeers.Thisbehaviorcanbesupportedthrougheitherlocalserviceorremoteassistance.ThelocalapproachapproachreliesonhavingarouterwithapublicIPaddresssupportingeitherUPnP[ 110 ]orportforwarding/tracking.Inmanycases,UPnPisnotenabledbydefaultandinmostcommercialvenuesitwillrarelybeenabled.Portforwarding/trackingrequiresnon-trivialrouterconguration,outsidethecomfortrangeofmanyindividualsandisnotuniformacrossrouters.ApeerusingUPnPneedsnofurtherservices,asUPnPenablesapeertosetandobtainbothpublicIPaddressandportmappings.PortforwardingandtrackingmechanismsstillrequirethattheuserobtainsandinputsintotheapplicationtheirpublicIPaddressorusein-bandassistancedescribednext.Intheremotelyassistedscenario,apeerrstsendsamessagetoareectionprovider,perhapsusingSTUN[ 91 ](SimpleTraversalofUDPthroughNATs).TheresponsefromtheprovidertellsthepeerfromwhichIPaddressandportthemessagewassent.InthecaseofallconeNATs,thiswillcreateabindingsothatthepeercanthensharethatIPaddressandportwithotherpeersbehindNATs.Whenthetwopeerscommunicatesimultaneously,alltypesofconeNATscanbetraversed;thetimingofmessagesneedstobecarefullyconsidered,however,sinceNATmappingsmaychangeovertime.SolongasonepeerisbehindaconeNAT,NATtraversalusingthismechanismispossible.ThesituationbecomescomplicatedwhenbothpeersarebehindsymmetricNATs,orwheneitheroneofthemhavearewallpreventingUDPcommunication.PeersbehindsymmetricNATscannoteasilycommunicatewitheachother,sincethereisnorelationbetweenremotehostsandportsandlocalports.Further 72

PAGE 73

complicatingthematteristhattherearevarioustypesofsymmetricNATs,havingbehaviorssimilartothevariousconeNATtypes.TheredoesexistsmethodstotraversetheseNATssolongasthereisapredictablepatterntoportselection[ 89 ].UnlikeUDP,TCP(transmissioncontrolpacket)NATtraversaliscomplicatedbythestateassociatedwithTCP.Inmanysystems,thesocketAPI(applicationprogramminginterface)canbeusedtoenableapeertobothlistenforincomingconnectionsandformoutgoingconnectionsusingthesamelocaladdressinginformation.ThismethodworksforvarioustypesofsystemsthoughthesuccessrateonNATsislow,40%[ 78 ].Othermechanismsrelyonout-of-bandcommunication[ 85 ],oruseofcomplicatedpredictivemodels[ 11 ]. 3.2.2RelayingNATtraversalservicesonlydealwithoneaspectofthebootstrapproblem:reection.Thatis,peersareabletoobtainapublicaddressforreceivingincomingconnectionswithnomeansfortoexchangeaddresseswithotherpeersnorperformasimultaneousopentotraverserestrictiveNATs.Toaddressthisissue,manysystemsincorporatetheseNATtraversallibrarieswhileusingintermediariestoexchangeaddressesasamethodofrelaying.AnotherformofrelayingexistswhentwopeersareunabletoformdirectIPconnectionswitheachotherandroutedatamessagesbetweenathird-party.ThemostcommonmethodforrelayinginIPistheuseofTURN[ 90 ](TraversalUsingRelayNAT).ApeerusingTURNobtainsapublicIPaddressandportthatcanbeusedasaforwardingaddress.Whenaremotepeersendstothisaddress,theTURNserverwillforwardtheresponsetothepeerwhohasbeenallocatedthatmapping.ThelackofabstractioninTURNmakesthesystemheavilycentralized,makingitsapplicationinsmall-scalesystemscomplicated.Inoverlays,peerstypicallyhaveanabstractedidentierthatdoesnotassociatethemwithasingleserverenablingmoredecentralizedapproachestorelaying.Whena 73

PAGE 74

remotepeersendsamessagetotheidentier,theoverlayshouldtranslatetheidentierintonetworkleveladdressesandforwardittothedestination.Becauseofthisrestriction,messagessentbyrelayingcannothaveexpectationsmorethanthatofsendingapacketbyUDP.Inotherwords,apacketwilleitherbereceivedinareasonableamountoftimeornotatall.Supportforreliability,streaming,andowcontrol,ifnecessary,mustbeprovidedinuser-space.Finally,theserviceshouldbeasynchronousoreventdriven.ThepreviousrequirementswouldallowpeerstorelaythroughamessageboardorevenbypostingmessagestoaDHT.Theproblemwiththesetwoapproachesisthatpeersmayverywellcommunicateforlongperiodsoftimeusingtheseservices.Thatmeansthepotentialforpostinglargeamountsofdatatoaservicethatwillretainitandconstantlyqueryingtheservicetodetermineifanupdateisavailable.Bothofthesearehighlyundesirableandmaybeviewedasdenialofserviceorspamattacks. 3.2.3RendezvousArendezvousserviceallowspeerstodiscovertheglobalidentierofpeersinterestedinthesameservice.Foranygivenoverlay,anaiveapproachforrendezvousistheuseofabroadcastqueryorrandomprobingtodetermineifanyotherpeersareusingthesameservice.Thisapproachisunreasonable,dependingonthesizeofthebootstrapoverlaycomparedtothedestinationoverlay,itmaybeverydifculttondanotherpeer,someorallpeersmaybebehindNATsandunreachablewithoutassistance,andintheworstcasescenarioamaliciousattackercouldbewaitingforbootstraprequestsintothesystem.Ratherthanattempttomakeasingleuniedrendezvoustechnique,eachoverlaystyleusuallyprovideanefcientmeansforrendezvous,thusreducingthenetworkandtimeoverheadofndinganotherpeer.Forexample,inthecaseofaDHT,peerscanuseasingleDHTkeytostoremultiplevalues,allofwhichwouldbeaddressesusedto 74

PAGE 75

communicatewithpeersintheoverlay.Alternatively,inasystemlikeBitTorrentR,peerscouldusethesametrackerandbecomeseedstothesamevirtualle. 3.3ImplementationsTable 3-2 reviewsvariousoverlays,themajorityofwhicharehighavailability,public,free-to-joinoverlays,thoughsomeresearchonlyoverlaysareincluded.Fromthislist,IchosetoextendBrunetandXMPPtosupportprivateoverlaybootstrapping.BrunetprovidesastructuredP2Pinfrastructure,thoughlacksanactive,large-scaledeploymentoutsideofacademicdeployment(mine)duetobeingrootedinanacademicproject.XMPP,ontheotherhand,hassupportfromalargecontigencyofprivateusersandenablesconnectionsbetweenfriendswithroutingoccurringacrossadistributedoverlay.MyimplementationmakesheavyuseofthetransportsincorporatedintoBrunet[ 13 ].ThekeydistinguishingfeatureofthislibraryistheabstractionofsendingoveracommunicationlinkasitsupportsprimitivessimilartosendandreceivethatenabletheabilitytocreateP2Pcommunicationchannelsoveravarietyoftransports.Inthenextsections,IwilldescribehowIextendedBrunettobeself-bootstrappingaswellasextensionstoenablebootstrappingfromXMPP.Theapplicationofstructuredoverlaysasthebasisprivateoverlaysfocusesontheautonomous,self-managingpropertyoftheoverlaynetworkratherthantheabilitytoscaletoverylargenumbers.Thishasalsobeenthemotivationofrelatedworkwhichhasemployedstructuredoverlaysinsystemsintheorderof10sto100sofnodes.Forexample,AmazonR'sshoppingcartrunsonDynamo[ 28 ]usingacoupleofhundredofnodesorless.FacebookRprovidesaninboxsearchsystemusingCassandra[ 64 ]runningon+cores.Structuredoverlayssimplifyorganizationofanoverlayandprovideeachmemberauniqueidentierabstractedfromtheunderlyingnetwork.Asmentionedinthecitedworks,theyprovidehighavailabilityandautonomicfeaturesthathandlechurnwell.Whenusedinsmallnetworks,moststructuredoverlays(includingBrunetandPastry)ineffectactasO(1)systems,self-organizinglinksthatestablish 75

PAGE 76

all-to-allconnectivityamongpeers.Brunetexplicitlysupportsall-to-allconnectivity,thoughinsomecasesmayrequireconstrainedpeerstoroutethroughrelays.Thiscanfurtherbeensuredbysettingtheamountofnearconnectionsfortheinfrastructures,whichinBrunetiscongurableatruntime. 3.3.1UsingBrunetPriortothiswork,BrunetbootstrappedusingarecentlyonlinecacheofpeersandIPmulticast.BrunetalreadysupportsbehaviorsimilartoSTUN,suchthat,witheveryconnectionBrunetmakes,peersinformeachotheroftheirviewoftheremotepeersnetworkstate,aformofpassivereection.Peersalsogenerateaunique160-bitnodeidentierthatcanbeusedintheoverlayasadirectlyreceivepacketsregardlessoftheunderlayconditions.Inasingleoverlay,BrunetsupportsrelayingeitherthroughtheoverlayorpseudodirectconnectionscalledTunnels[ 43 ],wherepeersroutetoeachotherthroughcommonneighboringconnections.Therelayinginthiscontextisusedeithertomaintainanecessaryoverlayconnection,ortoexchangeintentionstoconnectwitheachotherthroughConnectToMessagemessages.Thuswhenapeerdesiresaconnectiontoanother,bothpeerssimultaneouslyattempttoconnecttoeachotherafterexchangingendpointsdiscoveredthroughreectionusingtheoverlayrelaymechanisms,dealingwiththeissueofmorerestrictiveconeNATsandthecasewhenthepeerisbehindanon-traversableNAT.Tosupportrelayingwithinthescopeofaprivateoverlay,IhavefurtherextendedBrunet'stransportlibrarytosupporttreatinganexistingoverlayasamediumforpoint-to-pointcommunication.ThisiscalledaSubringtransport,becauseitsupportstheabstractionofmultipleprivatesub-ringswithinacommonlargestructuredring.Whentheprivateoverlaytransmitsdataacrossthepublicoverlay,theprivateoverlaypacketisencapsulated(andpossiblyencrypted)inapacketthatensuresitwillbedeliveredtothecorrectprivatedestinationusuallybymeansofgreedyroutingonthe 76

PAGE 77

publicoverlay.InordertoinstructpeerstoestablishSubringlinks,theyexchangeanidentieroftheformbrunet://P2P ID.PeersstoretheirSubringidentiersintotheDHTforrendezvous.TheDHTprovidesascalableandself-maintainingmechanismformaintainingabootstrap,solongastheDHTsupportsmultiplevaluesatthesamekey,asBrunetdoes.ThekeyusedfortheDHTrendezvousisahashoftheservicesnameanditsversionnumber,whichIcallanamespace.PeerscanthenquerythisentryintheDHTtoobtainalistofpeersintheprivateoverlay.SinceDHTsaresoft-state,orleasesystems,wheredataisreleasedafteracertainperiodoftime,anonlinepeermustactivelymaintainitsDHTentry.Inthecasethatapeergoesofine,theDHTwillautomaticallyremovethevalueafteritsleasehasexpired.Tosupportreectionintheprivateoverlay,thereweretwopotentialpaths.TherstwouldhavebeentoextendBrunettosupportSTUNineachoftheremoteserversandthenhaveaprivatenodequerythemfortheirpublicinformation.TheproblemwiththisapproachisthatitwouldrequiremaintainingadditionalstateinordertodiscernwhichoftheremotepeersareonpublicaddressesandcanprovideSTUNservices.Instead,Ioptedtomultiplexthesocketusedforthepublicoverlayasitalreadyhadgonethroughtheprocessofreection.ThemultiplexingofasinglesocketformultipleoverlayiscalledPathing.Inthiscontext,thepublicandprivateoverlaysaregivenavirtualtransportlayerthathooksintoananothertransportlayer,thusnotlimitedpurelytosockettransportlayers.Whenpeersexchangeidentiers,insteadoftransmittingasimpleidentierlikeudp://192.168.1.1:15222,thePathinglibraryextendsittoudp://192.168.1.1:15222/path,whereeachpathmightsignifyauniqueoverlay.ThecompletedapproachisillustratedinFigure 3-2 .TheapproachofSubringandPathingenabledthereuseofthecorecomponentsofBrunet.UsingSubringenablespeerstoformbootstrapconnectionstothenexchangeConnectToMessagemessages.Ifthedirectconnectionsfailed,thentheSubringconnectionscouldbe 77

PAGE 78

Figure3-2.BootstrappingaP2PsystemusingBrunet usedaspermanentconnections.TheuseofPathingmeantreuseofexistingNATtraversaltechniquesandlimitedtheamountofsystemresourcesrequiredtorunmultipleoverlays.Intermsoftotallinesofcode,theseabstractionsenabledarecursiveoverlaybootstrappingwitharelativelysmallcodefootprint,lessthan1000linesofcode. 3.3.2UsingXMPPInadditiontosupportingrecursivebootstrappingofprivateoverlays,thetechniquesdescribedabovecanbeextendedtouseadifferentpublicoverlay,anXMPP-basedfederation,tosupportthebootstrappingofprivateoverlays.ThekeyfeaturesthatmakeXMPPattractivearethedistributednatureofthefederationandtheopennessoftheprotocol.AsofDecember2009,thereareover70activeXMPPserversintheXMPPFederation[ 119 ].TheseincludeGoogleTalk,Jabber.org,andLiveJournalTalk.InXMPP,eachuserhasauniqueidentieroftheformusername@domain.Wherethedomainspeciestheclient'sXMPPserverandtheusernameuniquelyidentiesasingleindividual.XMPPsupportsconcurrentinstancesforeachuserbyappendingaresourceidentiertotheuserID:username@domain/resource.Aresourceidentiercaneitherbeprovidedbytheclientorgeneratedbytheserver.Forusersinthesame 78

PAGE 79

domain,theserverforwardsthemessagefromsourcetodestination.Whentwousersareindifferentdomains,thesender'sserverforwardsthemessagetothereceiver'sserver,whothenrelaysittothereceiver.XMMPallowsforsendingarbitrarybinarymessagescalledIQ.Whilepeerrelationshipsaremaintainedbytheserver,theyareinitiatedbetweenpeersusingIQ.Oncepeershaveestablishedaconnectionorsubscription,theyareinformedthroughaPresencenoticationthatthepeerhascomeonline,thisincludethefulluseridentier.TherstformofreectioninXMPPistheuniqueclientidentier.AnotherisanIPreectionserviceavailablefromsomeXMPPserviceproviderscalledJingle[ 69 ].JingleusesIQtodetermineavailableSTUNandTURNservers.Fortunately,theseservicesareprovidedfreeofchargethroughGoogleTalk.InBrunet,IextendedtheUDPtransporttosupportqueryingSTUNserverstoobtainandmaintainandopenanaddressmapping.STUNpacketsareeasilydistinguishedfromotherpackesasthersttwobitsaresetto0aswellasastaticcookiefoundinallmessages.Inordertosupportthesituationwheretwopeersareunabletocommunicatethroughtheexchangedaddresses,IhaveextendedXMPPIQasatransporttosupportrelaying.OncepeershasformedaconnectionthroughXMPP,theyareabletorouteconnectioninformationtoeachotherandattempttoformadirectconnection.Inthecasethatthisisunsuccessful,theyareabletofallbacktothislinkasameanstotransmitP2Pdata.Thisapproachalsohasthebenetthat,ifaXMPPserverdoesnotsupportJingle,thetwopeerscanstillformlinkswitheachother.SinceBrunetinternallysupportsIPreection,eventually,ifoneofthepeersinthesystemhasapublicaddress,itwillautomaticallyassisttheotherpeersintoformingdirectlinkswitheachother.Rendezvoususesatwostepapproach.Firstpeersadvertisetheiruseofprivateoverlayintheresourceidentier.Thenameishashedtoensurethattheuserscompleteidentierdoesnotextendpast1,023bytes,themaximumlengthfortheseidentiers. 79

PAGE 80

Inaddition,acryptographicallygeneratedrandomnumberisappendedtotheresourceidentiertodistinguishbetweenmultipleinstancesoftheusersapplicationinthesameprivateoverlay.Onceapeerreceivesapresencenoticationfromaremotepeerandthebasecomponentsmatch,thatisthehashoftheservice,thepeeraddsittoalistofknownonlinepeers.Ifthepeerlacksconnections,thesystembroadcaststothatlistarequestforaddresses.ThepeersrespondwithalistofaddressesincludingUDP,TCP,andXMPPaddresses,concludingrendezvous.Ideally,peerswouldnotneedtocreateXMPPconnectionswitheachother;iftheyareonapublicaddress,therendezvousphasealonewillsufce.Whenpeersdonothaveapublicaddress,theycanobtainamappingthroughSTUN,thenformanXMPPconnectionwitheachother,andnallyperformsimultaneousconnectionattempts.IfNATtraversalfails,thepeerscancontinueroutingthroughtheXMPPconnection.Duetotheabstractionsemployedbythetransportlibrary,theadditionalsupportforXMPP-basedbootstrappingrequiredonlyanadditional700linesofcodetoBrunetandnomodicationtothecoresystem. 3.4EvaluatingOverlayBootstrappingThissectionpresentsaqualitativeevaluationofthissystemprototypebootstrappingasmall-scalenetworkaswellassomeoftheexperiencesindeployingbootstrappingoverlays. 3.4.1DeploymentExperimentsTheseexperimentsverifythatthetechniquesworkanddetermineexpectedoverheadsinusingBrunetandXMPPtobootstrapanoverlay.RatherthananextensiveexperimentoverlyfocusedonoverheadsofBrunetandXMPP,thisexperimentisprimarilyfocusedonthefeasibilityofformingsmall-scaleoverlaysamongnetwork-constrainedpeers.Theexperimentrepresents5peersdesiringall-to-alldirectconnectivity,afeaturetransparentlyavailabletothemiftheybootstrapintoaprivateBrunetoverlay.Theexperimentswererunonpeersdeployedon5distinctvirtualmachines.Eachvirtual 80

PAGE 81

Table3-1.Timeinsecondsforvariousprivateoverlayoperations ReectionRendezvousRelayingConnected XMPP0.0350.1100.24320.3Brunet3.050.3300.53323.22 machinehaditsownseparateNAT,andthuspeerswereunabletocommunicatedirectlywithoutassistance.ThepublicBrunetoverlayusedinthisexperimentconsistedofover600nodesrunningonPlanetLab.PlanetLab[ 23 ]isaconsortiumofresearchinstitutessharinghundredsofgloballydistributednetworkandcomputingresources.GoogleTalkprovidedtheXMPPoverlayusedinthisexperiment.ThoughthisexperimentdoesnottakeintoadvantagethefeaturesoftheXMPPFederation,thisaspectispresentedinmoredetailinthenextsectionreviewingexperiencesdeployingoverlaysusingXMPP.Intheexperiment,5P2Pnodeswerestartedsimultaneously,whilemeasuringthetimespentforreection,rendezvous,reection,andconnection.TheresultsarepresentedinTable 3-1 .ForXMPP,thesearetranslatedasfollows:reectionmeasuresthetimetoobtainIPaddressesfromtheSTUNserver,rendezvousisthetimetoreceiveapresencenotication,relayingisthetimetoreceiveamessageacrossXMPP,andconnectedisonceallnodesintheprivateoverlayhasall-to-allconnectivity.ForBrunet,thesearetranslatedasfollows:reectionmeasuresthetimetoconnecttothepublicoverlay,rendezvousisthetimetoquerytheDHT,relayingistheaveragetimetosendamessageacrosstheoverlay,andconnectedisthetimeuntiltheprivateoverlayhasall-to-allconnectivity.TheresultsarehighlycorrelatedtotimeoutsinBrunet,whichemploysamixtureofeventsandpollingtostabilizetheoverlay,aswellasthelatencybetweentheclientandGoogleTalk.Asthiswasmoreofaqualitativeexperiment,theresultsareclear:privateoverlaysprovidingall-to-allconnectivityamongNATednodescanbootstrapwithinaveryreasonableamountoftime. 81

PAGE 82

3.4.2DeploymentExperiencesRecently,FacebookRannouncedthattheywouldbesupportingXMPPasameanstoconnecttoFacebookRchat.Thiswasratherexcitingandfurthermotivatedthiswork,asFacebookRhasover400millionactiveusers,whichwouldhavemadetheirXMPPoverlay,potentially,thelargestfree-to-joinoverlay.Unfortunately,FacebookRdoesnotemployatraditionalXMPPsetup,insteaditprovidesaproxyintotheirchatnetwork,preventingfeatureslikearbitraryIQsandotherformsofout-of-bandmessagestobeexchangedbetweenpeers.Useridentiersarealsotranslated,soapeercannotobtainaremotepeersrealidentier.Thusthereexistsnoout-of-bandmechanismforrendezvous.Peerscouldpotentiallysendrendezvousmessagesthroughthein-bandXMPPmessaging,butthismaybeviewedbymostrecipientsasspamasitwouldarriveasnormalchatmessages.Unfortunately,therealizationisthatnotallXMMPservers,especiallythoseunrelatedtotheFederation,supportfeaturesnecessarytobootstrap.DuringinitialtestsinverifyingtheworkingsoftheXMPPcodebase,IbootstrappedaprivateBrunetoverlayonPlanetLabthroughvariousXMPPserviceproviders.Unfortunately,someservers(GoogleTalk)ignoredclientsonPlanetLab.Anotherservercrashedafter257concurrentinstancesofthesameaccountloggedin.Becausetheproviderhadnocontactinformation,Iwasunabletoascertainthereasonforthecrash.Thoughtheredidexistsomeserversthathadnotroublehostingover600concurrentinstancesrunningonPlanetLab.OncethesystemwasrunningonPlanetLab,moretestswereperformedtodeterminetheabilitytobootstrapacrosstheXMPPFederation.Forthispurpose,severalfriendships,orsubscriptions,wereformedbetweenusersacrossvariousXMPPserviceproviders.Inthemostevaluatedcase,asinglepeeronGoogleTalkalongwith600peersonPlanetLabsystemusingjabber.rootbash.com,theGoogleTalkpeerwouldnotalwaysreceivepresencenoticationsforallpeersonline,thoughalwayswouldreceivesome.Whenapeerbegantherelayingmechanism,itwouldbroadcasttoevery 82

PAGE 83

peerfromwhomitreceivedapresencenotication.WhenperformingthisbetweenGoogleTalkandrootbash,theGoogleTalkpeerwouldnotreceivearesponse.Thoughinreducingthebroadcasttoarandomselectionof10peers,every10secondsuntiltheGoogleTalkpeerwasconnected,thepeerreceivedresponses.ThebehaviorindicatesthattheXMPPserversmayhavebeenlteringtopreventdenialofserviceattacks.PeersonthesameXMPPserverseemtobeconnectedveryquickly,thoughpeersondifferentservicescantakesignicantlylonger.Forexample,whenbootstrappingasinglepeerfromGoogleTalkintotherootbashsystem,italwaystook1minuteforthenodetobecomefullyconnectedtotheprivateoverlay.Whenthepeerusedrootbash,thepeeralwaysconnectedwithin30seconds.ItseemsasifthecommunicationbetweenXMPPserverswasbeingdelayedforsomereason.Thesamebehaviorwasnotexperienced,whenchattingbetweenthetwopeers. Table3-2.Publicandresearchoverlays DescriptionReectionRendezvousRelay BitTorrentRDefaultBitTorrentRimplementationsrelyonacentralizedtrackertoprovidetheinitialbootstrapping.Peerscanestablishnewconnectionsthroughinformationobtainedfromestablishedcon-nections.Thisrelegatesthetrackerasameansofmonitoringthestateoftheledistribution.BitTorrentRspeciesaprotocol,thougheachclientmaysupportad-ditionalfeaturesnotcoveredbytheprotocol.Thecurrentspec-icationdoesnotsupportNATtraver-sal,thoughfutureversionsmaypo-tentiallyuseUDPNATtraversal.Atwhichpoint,BitTorrentRmaysupportareectionservice.Peerscanreg-isterasseedstothesamelehash,thustheirIPaddresswillbestoredwiththetracker.Peersreceiveeachother'sIPaddressesfromthetracker,thereisnoinherentrelaying. 83

PAGE 84

Table 3-2 .Continued DescriptionReectionRendezvousRelay GnutellaGnutellaisalarge-scaleunstructuredover-laywithoveramillionpeers;primarily,itisusedforlesharing.Gnutellaconsistsofacouplehundredthou-sandultra(super)peerstoprovidereliabilitytotheoverlay.Gnutellaisfree-to-joinandrequiresnoregistrationtouse.Workinprogress.Peersattempttoconnecttoasharer'sre-source,thoughaPushnotica-tionreversesthisbehavior.ThusapeerbehindaNATcansharewithapeeronapublicaddress.Peerscanper-formbroadcastsearcheswithTTLupto2;whennetworksconsistofmil-lionsofpeers,smalloverlayswillmostlikelynotbeabletodiscovereachother.Notexplicitly,couldpoten-tiallyutilizepingmessagestoexchangemes-sages.SkypeRSkypeRisalarge-scaleunstructuredoverlay,consistingofoveramillionactivepeers,andprimarilyusedforvoiceoverP2Pcommu-nication.SkypeR,likeGnutella,alsohassuperpeers,thoughtheown-ersofSkypeRprovideauthenticationandboot-strapservers.ThoughSkypeRisfree-to-join,itrequiresregistrationtouse.SkypeRAPIspro-videnomeansforreection.SkypeRsup-portsapplica-tions,oradd-ons,whichcanusedtotranspar-entlybroadcastqueriestoausersfriendtodetermineifthepeerhastheapplica-tioninstalled.ThusSkypeRdoessupportrendezvous.SkypeRap-plicationsareallowedtoroutemessagesviatheSkypeRoverlay,butbe-causeSkypeRlacksreection,allcommuni-cationmusttraversetheSkypeRoverlay.XMPPXMPPconsistsofafederationofdistributedservers.Peersmustregisteranaccountwithaserver,thoughreg-istrationcanbedonethroughXMPPAPIswithoutuserinterac-tion.XMPPisnotatraditionalP2Psystem,thoughithassomeP2Pfeatures.XMPPserversondistinctserversareabletocommunicatewitheachother.Linksbetweenserversarecreatedbaseduponclientdemand.Duringlinkcreation,serversexchangeXMPPFeder-ationsignedcerticates.Whilenotpro-videdbyallXMPPservers,thereexistextensionsforNATtraversal.GoogleTalk,forexample,providesbothSTUNandTURNservers.SimilartoSkypeR,XMPPfriendscanbroadcastqueriestoeachothertondotherpeersus-ingthesameP2Pservice.ThusXMPPsupportsren-dezvous.TheXMPPspecicationallowspeerstoexchangearbi-traryout-of-bandcommunica-tionwitheachother.Mostserverssupportthisbehavior,evenwhensentacrosstheFed-eration.ThusXMPPsupportsrelaying. 84

PAGE 85

Table 3-2 .Continued DescriptionReectionRendezvousRelay Kademlia[ 72 ]Thereexiststwopopu-larKademliasystems,oneusedbymanyBitTorrentRsystems,Kad,andtheotherusedbyGnutella,calledMojito.Kademliaim-plementsaniterativestructuredoverlays,wherepeersqueryeachotherdirectlywhensearchingtheoverlay.ThusallresourcesofaKademliaoverlaymusthaveapubliclyaddressablenetworkendpoint.Existingimplemen-tationsofKademliadonotsupportmechanimsforpeerstodeter-minetheirnetworkidentity.PeerscanusetheDHTasarendezvousservice,storingtheirconnectiv-ityinformationintheDHTatkeylocation:hash(SERVICE).Aniterativestructuredover-layhasnosup-portforrelayingmessages.OpenDHT[ 86 ]OpenDHTisarecentlydecommissionedDHTrunningonPlanetLab.OpenDHTisbuiltusingBamboo,aPastry-likeprotocol[ 94 ].Pastryimplementsrecursiverouting,peersroutemessagesthroughtheoverlay.Existingimplemen-tationsofBambooandPastrydonotsupportmechan-imsforpeerstodeterminetheirnetworkidentity.Thoughthisisongoingwork.PeerscanusetheDHTasarendezvousservice,storingtheirconnectiv-ityinformationintheDHTatkeylocation:hash(SERVICE).BecausePastryusesrecursiverouting,itcanbeusedasarelay.Further-more,extensionstoPastryhaveenabledexplicitrelayscalledvirtualconnec-tions[ 73 ].Brunet[ 13 ]BrunetlikeOpenDHTisafreelyavailableDHTrunningonPlan-etLab,thoughstillinactivedevelopment.BrunetcreatesaSym-phony[ 71 ]overlayusingrecursiverouting.Brunetsupportsinherentreectionservices,whenapeerformsaconnectionwitharemotepeer,thepeersexchangetheirviewofeachother.PeerscanusetheDHTasarendezvousservice,storingtheirconnectiv-ityinformationintheDHTatkeylocation:hash(SERVICE).LikePastry,Brunetsupportsrecursiveroutingandrelayscalledtunnels[ 43 ]. 85

PAGE 86

CHAPTER4FROMOVERLAYSTOSECUREVIRTUALPRIVATENETWORKSInthischapter,ItaketheresultsfromChapter 3 andapplythemtoIPOP[ 41 ]inordertoconstructafullydecentralizedP2PVPN(peer-to-peervirtualprivatenetwork).WhilesharingoverlaysinIPOPmakesforsimplieduseofthesystem,inreality,itintroducessignicantsecuritychallenges.Forexample,amisconguredormaliciouspeercouldpotentiallydisabletheentireoverlay,renderingallVNs(virtualnetworks)useless.Ifsecurityandhenceisolationisimportant,priortoVNdeployment,auserwouldneedtodeployasecureoverlayandconguretheirVPNtobootstrapfromit,giventhecomplexitymanyusersmayreconsidertheP2PapproachanduseasimplecentralizedVPn.ToaddressthischallengeandtomakeafullydecentralizedP2PVPN,IhaveextendedtheIPOPconcepttosupportbootstrappingfrompublicinfrastructuresandoverlaysintoprivateandsecureP2PoverlayswhosemembershipislimitedtoanindividualVPNuserbase.Chapter 3 focusedonasmallscalefeasibilityofbootstrappingdecentralizedoverlays.ThischapterfurtherextendsintoperformanceoverheadsofrecursiveBrunetoverlaysandlargernetworksizes.IthenconsidersecurityintheoverlayandpresenttherstimplementationandevaluationofanoverlaywithsecurecommunicationbothbetweenendpointsintheP2Poverlay(e.g.VPNnodes)aswellasbetweennodesconnectedbyoverlayedges.Securityrequiresameansforpeerrevocation;however,currentrevocationtechniquesrelyoncentralizedsystemssuchascerticaterevocationlists(CRLs).TheproposedapproachallowsrevocationusingscalabletechniquesprovidedbytheP2Poverlayitself.IcallthecompletedsystemandtheinterfaceusedtoadministrateitGroupVPN,anoveldecentralizedP2PVPN.Therestofthischapterisorganizedasfollows.Throughoutthechapter,therearetwotechniquesusedtoevaluatemyapproaches,simulationandrealsystem 86

PAGE 87

deployments;thesearedescribedinSection 4.1 .Section 4.2 describestechniquesthatallowuserstocreatetheirownprivateoverlaysfromasharedpublicoverlayinspiteofNAT(networkaddresstranslation).UseofsecurityprotocolshasbeenassumedinmanyP2Pworks,thoughwithoutconsiderationofimplementationandoverheads.IinvestigateimplementationissuesandoverheadsofsecurityinP2PwithemphasisonP2PVPNsinSection 4.3 .Withoutrevocation,useofsecurityislimited,andindecentralizedsystems,theuseofcentralizedrevocationmethodsisarenotsufcient,IpresentnovelmechanismsfordecentralizedrevocationinSection 4.4 .Thecompletesystem,GroupVPN,ispresentedinSection 4.5 .Section 4.7 comparesandcontraststhisworkwithrelatedwork. 4.1ExperimentalEnvironmentThroughoutthispaper,myquantitativeevaluationenvironmentusesbothrealdeploymentsonPlanetLabandsimulation.Theevaluationrequirementsdictatetheenvironmentused.Whentheperspectiveofasinglenodeisuseful,PlanetLab'soverloadednaturemakescomplexsystemanalysischallenging,especiallywhenattemptingtosimulateaninstantaneousbehavioronasystem,whichhasrandomoutageanddelaysinaccess.IPOPusesBrunetastheunderlyingP2Pinfrastructureforconnectivity.Brunethasbeeninactivedevelopmentforthepast5yearsandisroutinelyrunonPlanetLab[ 23 ]forexperimentsandtests.PlanetLabconsistsofofnearly1,000resourcesdistributedacrossEarth.Inpracticalapplications,though,roughly40%oftheresourcesareunavailableatanygiventimeandtheremainingbehavesomewhatunpredictably.PlanetLabdeploymenttakesapproximately15minutesforallresourcestohaveBrunetinstalledandconnecttotheoverlayandthenmuchmoretimetoobservecertainbehaviors,makingregressionandvericationtestscomplicated.Toaddressthis,IhaveextendedBrunettosupportasimulationmode.ThesimulatorinheritsalloftheBrunetP2Poverlaylogicbutusessimulatedvirtualtimebaseduponanevent-drivenscheduler 87

PAGE 88

insteadofrealtime.Furthermore,thesimulationframeworkusesaspecializedtransportlayertoavoidtheoverheadofusingTCP(transmissioncontrolprotocol)orUDP(userdatagramprotocol)onthehostsystem,bothofwhicharelimitedresourcesandcanhampertheabilitytosimulatelargesystems.Thespecializedtransportusesdatagramstopassmessagesbetweennodes,thusfromthenode'sperspective,itisverysimilartoaUDPtransportandcansimulatebothlatencyandpacketdropping.Latencybetweenallnodepairsissetto100msbydefault.Bothsimulationandrealsystemevaluationprovideuniqueadvantages.Simulationsallowfasterthanrealtimeexecutionofreasonablesizednetworks(uptoafewthousand)usingasingleresource,whileenablingeasydebugging.Incontrast,deploymentonrealsystems,inparticularPlanetLab,presentsopportunitiestoaddnon-deterministic,dynamicbehaviorintothesystemwhichcanbedifculttoreplicate,suchasnetworkglitchesandlongCPU(centralprocessingunit)delaysonprocessing. 4.2TowardsPrivateOverlaysManyusersofIPOPbeginbyusingthepublicsharedoverlayand,oncecomfortable,movetowardshostingtheirowninfrastructure.Somearesuccessfulwithoutassistance,whileamajorityarenot.NetworkcongurationissuestendtobethemostcommonissuepreventingusersfromhostingtheirownindependentIPOPsystems.Whileuserswereabletoeasilyjointhesharedoverlay,similarattemptstoconstructtheirownwerehinderedandultimatelyonlysuccessfulafterreceivingfeedback.PriorworkinIPOP[ 44 ]enabledmanyVPNstoshareasingleP2PoverlaybystoringIP(InternetProtocol)addressintotheDHT(distributedhashtable)atthekeyhash(Namespace:IP).Unfortunately,thisapproachisfraughtwithsecurityissues.Inthepreviouschapter,IestablishedmethodsthatenabledbootstrappingprivateBrunetoverlaysaseasilyasconnectingtoapublicP2Poverlay.ThischapterbeginsbyfocusingontheintegrationofthemethodologiesemployedinrecursiveBrunetoverlaysasappliedtoIPOP. 88

PAGE 89

TobootstrapfromanexistingBrunetoverlay,peersrstinserttheirpublicoverlaynodeaddressintothekeyrepresentedbyhash($PrivateOverlayNamespace)andcontinuetodosoregularlyuntiltheydisconnect,soastonotlettheentrybecomestaleanddisappear.Peersattemptingtobootstrapintotheprivateoverlaycanthenquerythiskeyandobtainalistofpublicoverlaynodesthatarecurrentlyactingasproxiesintotheprivateoverlay.Byusingthepublicoverlayasatransport,similartoUDPorTCP,theprivateoverlaynodeformsbootstrappingconnectionsviathepublicoverlay.Atwhichpoint,overlaybootstrappingproceedsasnormal.TheentireprocessisrepresentedinFigure 3-1 .Asmentionedinthepreviouschapter,smalloverlaysmayhavenomemberswithapublicaddress,makingitdifculttoprovideoverlaybasedNATtraversal.ToavoidhavingaspecialcaseforNATtraversalinprivateoverlays,inmymodel,theprivateoverlayshareTCPandUDPsocketswiththepublicoverlay.Thismechanism,referredtoaspathing,allowsmultiplexingasingleUDPsocketandlisteningTCPsocketbymanyoverlays.ThisisonlypossibleduetothegenerictransportslibraryoftheBrunetP2Poverlay,whichdoesnotdifferentiateUDP,TCP,orevenrelayedlinks.Pathingworksasaproxy,interceptingalinkcreationrequestfromalocalentity,mappingthattoapath,andthenrequestingfromtheremoteentityalinkforthatpath.Theunderlyinglinkisthenwrappedbypathingandgiventothecorrectoverlaynode,resultinginacompletelytransparentmultiplexingofaTCPandUDPsockets,therebyenablingtheNATtraversalinoneoverlaytobenettheother.Oncealinkhasbeenestablished,thepathinginformationisirrelevant,limitingtheoverheadintothesystemtoasinglemessageexchangeduringlinkestablishment. 4.2.1TimetoBootstrapaPrivateOverlayThisexperimentfocusesontheoverheadsinbootstrappingaprivateoverlayusingthetechniquesmentionedintheprevioussection.Thetimetobootstrapcanbederivedanalyticallybyconsideringtheminimumstepsforanodetojointhepublicoverlay, 89

PAGE 90

obtainprivateoverlaypeersfromthepublicoverlayDHT,andthenconnecttotheprivateoverlay.InBrunet,peersbeginbyformingleaforbootstrappingconnectionsandusethesetocommunicatewiththeneighbororpeerintheP2PnetworknearesttotheirP2Paddress.Theprocesstoformaconnectioncanbedoneinasfewas4messagesandupto6,ifthepeersonlyknoweachother'sP2Paddress,whichisthecaseforneighborconnections.AssumingapeeralreadyhasIPaddressinformationforanother,aconnectioncanbeinitiatedbythepeersendingamessagetotheremotepeerexpressingthedesireforaconnection.Theremotenoderespondsbyeitherrejectingtherequestorcommittingtotheconnection.Inthenextexchange,theinitiatingpeercommitstoformingtheconnectionandtheremotepeeracknowledges.Thetwophasecommitprocessisusedtohandlethecomplexitythatensueswhenmultiplesimultaneousconnectionattemptsoccurinparallel.Allthesemessagestake1hop,sincetheyaredirectlinksbetweenpeers.Whenpeersonlyhaveeachother'sP2Paddressand/ortheinitiatingpeerisbehindaNAT,itmaytakefthandsometimesasixthmessage.Thesemessagesarerequestsfortheremotepeer'sIPaddressesaswellasaskingthepeertoconnectwiththeinitiatingpeer,addressingthecasewheretheremotepeerisbehindaNATandcannothandleinboundmessages.Thesemessagesareroutedovertheoverlaytakinglog(N)hops,whereNisthenetworksizeofthepublicoverlay.Privateoverlaybootstrappingfollowsasimilarprocess,though,rst,thepeeracquiresP2PaddressesofotherparticipantsthroughthepublicDHT,anoperationtaking2log(N)hops.Intheprivateoverlay,theleafconnectionsdonotcommunicatedirectly;rather,theyusethepublicoverlay,causingsomeofthe1hopoperationsabovetotakelog(N)hops.Finally,ndingthenearestremotepeerintheprivateoverlaytakeslog(N)+log(n),wherenisthenetworksizeoftheprivateoverlay. 90

PAGE 91

Giventhismodel,eachoperationtakesthefollowinghopcounts:publicoverlaybootstrapping=>8+log(N),DHToperations=>2log(N),andprivateoverlaybootstrapping=>4+5log(N)+log(n).Thecumulativeoperationtakes12+8log(N)+log(n)hops.Thedominatingoverheadinbootstrappingtheprivateoverlayisthetimeittakestoperformoverlayoperationsonthepublicoverlay(log(N)).Forinstance,assuminganetworksizeof512publicand8private,anodeshouldbeconnectedwithin87hops.ToevaluatemyimplementationforGroupVPN,IusedbothPlanetLabandthesimulator.100testswererunforvariousnetworksizes.ThoughduetodifcultyincontrollingnetworksizesinPlanetLab,IseteachPlanetLabnodetorandomlydecideifitwouldconnecttotheprivateoverlay.Thenetworksizeswerethenusedinthesimulatorandtheanalyticalmodel.Theaveragepublicnetworksizeforeachofthesetestswas600.TheresultsarepresentedinFigure 4-1 1.Usinga100msdelaylikethesimulatorresultsin9.2and9.3secondsfortheanalyticalmodelforprivatenetworksizesof68and147,respectively.BasedupontheresultspresentedinFigure 4-1 ,thebootstrappingtimefortheimplementationperformsbetterthantheanalyticalmodel,duetothesimplicityoftheanalyticalmodelandthesmallnetworksizes.Itisofinterestthatwhilethesimulatorresultstendtobeinawelldenedrange,thePlanetLabresultshaveafewoutlierswithlongbootstraptimes.SomeoftheexpectedcausesforthisarechurninthesystemandstatemachinetimeoutsinBrunet,thoughIhavenotconsideredthisinmuchdepth. 1Iperformedmeasurementsformanymoreprivatenetworksizes,butalltheresultsweresosimilarthatitdidnotintroduceanythingofinterestandareomittedfromtheplotstoimproveclarity. 91

PAGE 92

Figure4-1.CDFofprivateoverlaybootstraptime 4.2.2OverheadofPathingMuchlikethepreviousexperiment,thisveriesthatthepathingtechniquehasnegligibleoverheadsforVPNusage.Todeterminetheoverheads,twoGroupVPNsaredeployedonresourcesonthesamegigabitLAN(localareanetwork).Tomeasurelatencyandthroughput,netperfexperimentsarerunfor30seconds,5timeseachonanunutilizednetworkswitch.Otherspecicationsofthemachineareignoredasthesystemwithoutpathingisusedasthebaseline.Theresults,Table 4-1 ,indicatethattheuseofpathingpresentsnegligibleoverheadforboththroughputandlatency,justifyingtheuseofthisapproachtotransparentlydealwithNATandrewalltraversal. Table4-1.Pathingoverheads Latency(ms)Throughput(Mbit/s) Standard0.303225.27Pathing0.308224.36 4.3SecurityfortheOverlayandtheVPNStructuredoverlaysaredifculttosecureandaprivateoverlayisnotsecureifitprovidesnomeanstolimitaccesstothesystem.MalicioususerscanpollutetheDHT,sendbogusmessages,andevenpreventtheoverlayfromfunctioning,renderingthe 92

PAGE 93

VPNuseless.ToaddressthisinmeansthatmakesenseforVPNsandcommonusers,Ihaveemployedapublickeyinfrastructure(PKI)toencryptandauthenticatebothcommunicationbetweenpeersaswellascommunicationacrosstheoverlay,calledpoint-to-point(PtP)andend-to-end(EtE)communication,respectively.UseofaPKI(publickeyinfrastructure)motivatesfromtheabilitytoauthenticatewithoutathirdparty,idealforP2Puse,unlikeakeydistributioncenters(KDC)usedbyotherVPNs.APKIcanuseeitherpre-exchangepublickeysoracerticateauthority(CA)tosignpublickeys,i.e.,certicates.Thuspeerscanexchangekeysandcerticateswithoutrequiringathird-partytobeonline.ThereasonsforsecuringPtPandEtEaredifferent.SecuringPtPcommunicationpreventsunauthorizedaccesstotheoverlay,aspeersmustauthenticatewitheachotherforeverylinkcreated.Thoughonceauthenticated,apeercanperformmaliciousactsandsincetheoverlayallowsforroutingoverit,thepeercandisguisetheoriginationofthemaliciousacts.ByalsoemployingEtEsecurity,theauthenticityofmessagestransferredthroughanoverlaycanbeveried.ThoughEtEsecuritybyitself,willnotpreventunauthorizedaccessintotheoverlay.ByemployingbothPtPandEtE,overlayscanbesecuredfromuninvitedguestsfromtheoutsideandcanidentifymalicioususersontheinside.Implementingbothleadstoimportantquestions:whatmechanismscanbeusedtoimplementbothandwhataretheeffectsofbothonanoverlayandtoaVPNonanoverlay. 4.3.1ImplementingOverlaySecurityTherearevarioustypesofPtPlinks,suchasTCPandUDPsocketsandrelaysacrossindividualnodesandtheoverlay.EtEcommunicationisdatagram-orientedinIPOP.TraditionalapproachesofsecuringcommunicationsuchasIPsecarenotconvenientduetocomplexity,i.e.,operatingsystemspecic,portabilityconstraints,andlackofcommonAPIs.Securityprotocolsthatrelyonreliableconnections,suchasSSL(securesocketslayer)orTLS(transportlayersecurity)areundesirableaswellasthey 93

PAGE 94

wouldrequireauserspaceimplementationofreliablestreams(akintoTCP).Assuch,IhaveimplementedanabstractioncalledasecuritylteraspresentedinFigure 4-2 ,whichenablesnearlytransparentuseofsecuritylibrariesandprotocols.Inthisexample,thesecuritylterabstractionisusedbysendersandreceiversthroughanEtEsecuredchatapplication.Eachreceiverandsenderusethesameabstractedmodelandthusthechatapplicationrequiresonlyhigh-levelchanges,suchasverifyingthecerticateusedisAlice'sandBob's,tosupportsecurity.Asecuritylterhastwocomponents:themanager,andindividualsessionsorlters.Whiletheindividualsessionscouldactasltersbythemselves,bycombiningwithamanager,theycanbeconguredforacommonpurposeandsecuritycredentials.Thisapproachenablestheuseofsecuritytobetransparenttotheothercomponentsofthesystemasthemanagerhandlessessionestablishment,garbagecollectionofexpiredsessions,andrevocationofpeers.Tothisdate,IhaveimplementedbothaDTLS[ 83 ](datagramtransportlayersecurity)lterusingtheOpenSSLimplementationofDTLSaswellasaprotocolthatreusescryptographiclibrariesprovidedby.NETthatbehavessimilarlytoIPsec.Certicateembedidentityoftheowner,thusasignedcerticatestatesthatthesignertruststhattheidentityisaccurate.Innetworksystems,thecerticateusesthedomainnametouniquelyidentifyandlimittheuseofacerticate.WhenaCAsignsthecerticate,byincludingthedomainname,itensuresthatuserscantrustthatacerticateisvalid,whileusedtosecuretrafctothatdomain.Communicationwithanotherdomainusingthesamecerticatewillraiseaagandwillresultintheusernottrustingthecerticate.InenvironmentswithNATs,dynamicIPaddresses,orportabledevices,typicalofP2Psystems,assigningacerticatetoadomainnamewillbeahassleasitconstrainsmobilityandthetypeofusersinthesystem.Furthermore,mostusersareunawareoftheirIPaddressandchangestoit.Instead,acerticateissignedagainsttheuser'sP2PaddressanduniqueusernameasdelegatedbytheCA.The 94

PAGE 95

Figure4-2.Securitylter purposeoftheformerisforefciencyofrevocationasdiscussedinSection 4.4 .DuringtheformationofPtPlinksorwhileparsingEtEmessages,thetwonodesdiscovereachother'sP2Paddresses.Iftheaddressesdonotmatchtheaddressontheveriedcerticate,thecommunicationneednotproceedfurther.Priortotrustingthesecuritylter,thecoresoftwareorthesecurityltermustensurethattheP2Paddressoftheremoteentitymatchesthatofthecerticate.Inmyapproach,Ididthisbymeansofacallback,whichpresentstheunderlyingsendingmechanism,EtEorPtP,andtheoverlayaddressstoredinthecerticate.Thereceiverofthecallbackcanattempttocastitintoknownobjects.Ifsuccessful,itwillcomparetheoverlayaddresswiththesendertype.Ifunsuccessful,itignorestherequest.Ifanycallbacksreturnthatthesenderdoesnotmatchtheidentier,thesessionisimmediately 95

PAGE 96

closed.Thusthesecuritylterneednotunderstandthesendingmechanismandthesendingmechanismneednotunderstandthesecuritylter.ThelastconsiderationcomesinthecaseofEtEcommunicationthatprovidesanabstractionlayer.Forexample,inthecaseofVPNs,whereaP2PpacketcontainsanIPpacketandthusaP2PaddressmapstoaVPNIPaddress,amaliciouspeermayestablishatrustedlink,butthenhijackanotheruser'sIPsession.Assuch,theapplicationmustverifythattheIPaddressintheIPpacketmatchestheP2PaddressofthesenderoftheP2Ppacket.Ingeneral,anapplicationaddressshouldbematchedagainstaP2Paddress. 4.3.2OverheadsofOverlaySecurityWhenapplyinganadditionallayertoaP2Psystem,thereareoverheadsintermsoftimetoconnectwiththeoverlay.Otherlessobviouseffectsarethroughput,latency,andprocessingoverheads,assumingthattheP2Psystemwillbeusedoverawideareanetwork,wherethelatencyandthroughputlimitationsbetweentwopointswillmaketheoverheadofsecuritynegligible.Thoughbootstrappingwillbeaffectedduetoadditionalroundtripmessagesusedforformingsecureconnections. Figure4-3.DTLShandshake TheDTLShandshakeaspresentedinFigure 4-3 ,whichconsistsof6messagesor3roundtrips.PtPsecuritymayverywellhaveaneffectonthedurationofoverlay 96

PAGE 97

bootstrapping.Thereevenexistsapossibilitythatwithmoremessagesduringbootstrap,theprobabilityonedropsishigher,whichcould,inturn,alsohaveaneffect,thoughpossiblynegligible,ontimetoconnect.Toevaluatetheseconcerns,Ihaveemployedbothsimulationandrealsystemexperiments.ThefollowingexperimentsusebothsimulationandPlanetLabdeploymenttoevaluatetimetoconnectanewnodetoanexistingresource.Thenanotherexperimentisperformedtoevaluatehowlongittakestobootstrapvarioussizedoverlaysifallnodesjoinatthesametime.Thisexperimentisonlyfeasibleviasimulationasattemptingtoreproduceinarealsystemisextremelydifcultduetohowquicklytheoperationscomplete. 4.3.2.1AddingaSingleNodeThisexperimentdetermineshowlongittakesasinglenodetojoinanexistingoverlaywithandwithoutDTLSsecurity.TheexperimentisperformedusingbothsimulationandPlanetLab.AfterdeployingasetofnodeswithoutsecurityandwithsecurityonPlanetLab,thenetworkiscrawledtodeterminethesizeofthenetwork.Inbothcases,theoverlaymaintainedanaveragesizeofaround600nodes.Atwhichpoint,Iconnectedanode1,000,eachtimeusinganew,randomlygeneratedP2Paddress,thusconnectingtoadifferentpointintheoverlay.TheexperimentconcludesassoonasthenodehasconnectedtothepeersintheP2PoverlayimmediatelybeforeandafteritintheP2Paddressspace.Inthesimulation,anewoverlayiscreatedandafterwardanewnodejoins,thisisrepeated100times.ThecumulativedistributionfunctionsobtainedfromthedifferentexperimentsarepresentedinFigure 4-4 ,whichusesthefollowingnotations:secure(dtls),insecure(nosec),PlanetLab(plab),andtheSimulator(sim). 4.3.2.2BootstrappinganOverlayThepurposeofthisexperimentistodeterminehowquicklyanoverlayusingDTLScanbootstrapincomparisontoonethatdoesnotgiventhattherearenoexisting 97

PAGE 98

Figure4-4.Asinglenodejoininganinsecureandsecureoverlay participants.Nodesinthisevaluationarerandomlygiveninformationabout5differentnodesintheoverlayandthenallattempttoconnectwitheachotheratthesametime.Theevaluationcompletesaftertheentireoverlayhasallnodesconnectedandintheirproperposition.Foreachnetworksize,thetestisperformed100timesandtheaverageresultispresentedinFigure 4-5 ,whichusesthefollowingnotations:secure(dtls)andinsecure(nosec). 4.3.3DiscussionBothevaluationsshowthattheoverheadinusingsecurityispracticallynegligible,whenanoverlayissmall.Inthecaseofaddingasinglenode,itisclearthatthesimulationanddeploymentresultsagree,asthedifferencebetweenbootstrappingintoanoverlaywithandwithoutsecurityremainsnearlythesame.Clearlythismotivatestheuseofsecurityiftimetoconnectisthemostpressingquestion.Thetimetobootstrapasecureoverlaywasnotsignicantlymorethanthatofaninsecureoverlay.WhatIrealizedisthatcomplexconnectionhandshaking,asimplementedinBrunet,seemstodominateconnectionestablishmenttime.Forexample,inBrunet,twopeersmustcommunicateviatheoverlaypriortoforminga 98

PAGE 99

Figure4-5.Simulataneousbootstrappingofasecureandaninsecureoverlay connection,andthesystemdifferentiatesbetweenbootstrappingconnectionsandoverlayconnections.Thuseventhoughapeermayhaveabootstrappingconnection,itwillneedtogothroughtheentireprocesstoformanoverlayconnectionwithapeer.Whilethismayleadtoinefciencies,thissimplicationkeepsthesoftwaremoremaintainableandeasiertounderstand. 4.4HandlingUserRevocationUnlikedecentralizedsystemsthatusesharedsecrets,inwhichthecreatoroftheoverlaybecomespowerlesstocontrolmalicioususers,PKIsenabletheircreatorstoeffectivelyremovemalicioususers.TypicalPKIseitheruseacerticaterevocationlist(CRL)oronlinecerticatevericationprotocolssuchasOnlineCerticateStatusProtocol(OCSP).Theseapproachesareorthogonaltodecentralizedsystemsastheyrequireadedicatedserviceprovider.Iftheserviceproviderisofine,anapplicationcanonlyrelyonhistoricalinformationtomakeadecisiononwhetherornottotrustalink.Inadecentralizedsystem,thesefeaturescanbeenhancedsonottorelyonasingleprovider.Inthissection,Ipresenttwomechanismsofdoingso:storingrevocationsintheDHTandperformingoverlaybroadcastbasedrevocations. 99

PAGE 100

4.4.1DHTRevocationADHTcanbeusedtoproviderevocationsimilartothatofOCSPorCRLs.Revocations,ahashofthecerticateandatimestampsignedbytheCA,arestoredarestoredintheDHTatthekeyformedbythehashingofthecerticate.Indoingso,revocationswillbeuniformlydistributedacrosstheoverlay,notrelyingonanysingleentity.TheproblemwiththeDHTapproachisthatitdoesnotprovideaneventnoticationformemberscurrentlycommunicatingwiththepeer.WhilepeerscouldcontinuetopolltheDHTtodeterminearevocation,doingsoisinefcient.Furthermore,amaliciouspeer,whohasavalidbutrevokedcerticatecouldforceeverymemberintheoverlaytoquerytheDHT,negativelyaffectingtheDHTnodesstoringtherevocation. 4.4.2BroadcastRevocationBroadcastrevocationusesastructuredoverlaybasedbroadcastapproachasdescribedinAppendix 8 .Theformofbroadcastcanbeusedtoperformtonotifytheentireoverlayimmediatelyaboutanewrevocation.Itisimportanttonote,thatthemessageneedstobedeliveredlocallypriortoforwarding,sothatpeerswhohaveaconnectiontothemaliciouspeer,willendtheconnectionpriortoaccidentallyforwardingthemessagetothepeerbyreceivingandactingupontherevocationpriortoforwardingthemessage. 4.4.3EvaluationofBroadcastIperformedanevaluationonthebroadcastusingthesimulationtodeterminehowquicklypeersintheoverlaywouldreceivethemessage.Thetestednetworksizesrangedfrom2to256inpowersof2.Thetestswereevaluationswereperformed100timesforeachnetworksize.TheCDFofhopsforeachnodearepresentedinFigure 4-6 .Theresultsmakeitquiteclearthatthebroadcastcanefcientlydistributearevocationmuchmorequicklythanlog(N)time. 100

PAGE 101

Figure4-6.Overlaybroadcasttime 4.4.4DiscussionIncontrasttotheDHTsolution,broadcastrevocationoccursonlyonceandleavesnostatebehind.Thusthebroadcastisnotacompletesolution,asnewpeersconnectedtotheoverlayorthosewhomissedthebroadcastmessagewillbeunawareofarevocation.Furthermore,ifanoverlayissharedbymanyVPNs,itmaypreventoverlaybroadcastingoritselfmaybeinefcient.TheDHTsolutionbyitselfmayalsonotsufcientasrevocationsmaybelostovertimeastheentriesmusthavetheirleasesrenewedintheDHT.Toaddressthiscondition,eachpeermaintainsalocalCRLandtheowneroftheoverlaycanoccasionallysendupdatestotheCRLthroughanoutofbandmedium,suchase-mail.Abetterlongtermsolutionmaybetheuseofagossipprotocolssothatpeerscansharetheirlistswitheachotherduringbootstrappingphases.AkeyassumptioninusingtheseisthataSybil[ 30 ],orcollusionattack,isdifcultinthesecuredoverlay.IfaSybilattackissuccessful,bothaDHTandbroadcastrevocationmaybeunsuccessful,thoughpeerscouldxthisproblembyobtainingtheCRLoutofband.Inaddition,previouswork[ 20 ]hasdescribeddecentralizedtechniquestolimitthe 101

PAGE 102

probabilityofsuchattacksfromoccurring.Inmyapproach,theuseofcentralauthoritytoreviewcerticaterequestscanbeusedtolimitasingleuserfromobtainingtoomanycerticatesaswellasensuringuniformdistributionofthatuser'sP2Paddresses,furtherhamperingthelikelihoodofaSybilattack.Theabilitytoautomatethisisleftasfuturework.OnewaytomitigateSybilattacksusingthebroadcastapproachistobundlecolludingoffendersintoasinglerevocationmessage.Thatwouldpreventthosefromcolludingtogethertopreventeachother'srevocations.Furthermore,whilenotemphasizedabove,revocationinmysystemrevokesbyusernameandnotindividualcerticates.CombinedthesetwocomponentslimitSybilattacksagainstbroadcast. 4.5ManagingandConguringtheVPNWhilethePKImodelappliestoP2Poverlays,actualdeploymentandmaintenanceofsecuritycredentialscanbetoocomplextomanage,particularlyfornon-experts.MostPKI-enabledsystemsrequiretheuseofcommand-lineutilitiesandlackmethodsforassistinginthedeploymentofcerticatesandpolicingusers.MysolutiontofacilitateuseofPKIsfornon-expertsisapartially-automatedPKIreliantonagroup-basedWebinterfacedistributableinformsofJoomlaadd-onsaswellasavirtualmachineappliance.Inthisenvironment,groupscanshareacommonWebsite,whileeachgrouphastheirownuniqueCA.AlthoughthisdoesnotprecludeothermethodsofCAinteraction,experiencehasshownthatitprovidesamodelthatissatisfactoryformanyusecases.Group-basedWeb2.0sitesenablelowoverheadcongurationofcollaborativeenvironments.Therolesinagroupenvironmentcanbedividedintoadministratorsandusers.Usershavetheabilitytojoinandcreategroups;whereasadministratorsdenenetworkparameters,canacceptordenyjoinrequests,removeusers,andpromoteotheruserstoadministrators.ByapplyingthistoaVPN,thegroupenvironmentprovidesasimpletousewrapperaroundPKI,wheretheadministratorsofthegroupactastheCAandthemembershavetheabilitytoobtainsignedcerticates. 102

PAGE 103

Elaboratingfurther,whenauserjoinsagroup,theadministratorcanenableautomaticsigningofcerticatesorrequirepriorreview;andwhenpeershaveoverstayedtheirwelcome,anadministratorcanrevoketheircerticatebyremovingthemfromthegroup.RevocationsarehandledasdescribedinSection 4.4 .InthecontextofGroupVPNsystems,auserrevocationlistasopposedtoaCRLsimpliesrevocation,sinceusersandnotindividualcerticateswillberevoked.Registereduserswhocreategroupsbecomeadministratorsoftheirowngroups.Whenauserhasbeenacceptedintoagroupbyitsadministrator,theyareabletodownloadVPNcongurationdatafromtheWebsite.CongurationdataisloadedbytheGroupVPNduringitscongurationprocesstospecifyIPaddressrange,namespace,andsecurityoptions.Thecongurationdataalsostoresasharedsecret,whichuniquelyidentiestheuser,enablingtheWebsitetoautomaticallysignthecerticate(orenqueueitformmanualsigning,dependingonthegroup'spolicy).CerticaterequestsconsistofsendingapublickeyandasharedsecretoveranHTTPSconnectiontothewebserver.Uponreceivingthesignedcerticate,peersareabletojointheprivateoverlayandGroupVPN,enablingsecurecommunicationamongsttheVPNpeers.Theentirebootstrappingprocess,includingaddressresolutionandcommunicationwithapeer,isillustratedinFigure 4-7 .TherearemanywaysofimplementingandhostingtheWebsite.Forexample,GoogleRoffersfreehostingofPythonwebapplicationsthroughGoogleRApps,anoptionavailableiftheuserownsadomain.Alternatively,theusercouldhostthegroupsiteonapublicvirtualnetwork.Inthiscase,peersinteractingwiththeGroupVPNwouldneedtoconnectwiththepublicvirtualnetworkinordertocreateanaccount,getthecongurationdata,andretrieveasignedcerticate,atwhichpointtheycoulddisconnectfromit.ThisdoesnotprecludetheuseofothersocialmediumsnoracentralsitededicatedtotheformationofmanyGroupVPNs.ManyGroupVPNscanshareasinglesite,solongasthegroupmemberstrustthesitetohosttheCAprivatekey. 103

PAGE 104

Figure4-7.BootstrappinganewGroupVPN 4.6LeveragingTrustfromOnlineSocialNetworksGroupsareveryusefulforcoordinatingasetofindividualswhenasubsetofthemcanbeusedtoestablishtrustamongstthemall.However,groupscanlackclearandconciseindividualityandlimitindependencefromthecollective.Trustcanbeleveragedfromexistingsocialnetworkstocreatetrustinotherdomains.Muchlikethatofasocialnetwork,aVPNconsistsoftrustedlinks,tyingthetwotogetherproducedtheSocialVPN[ 56 ].Inthiswork,IalongwithmyfellowresearchersimplementedaprototypeforSocialVPN,whichexempliestheutilityofmyapproachesinhandlingsecuritybothintermsoftrustandsessionestablishmentaswellasendpointcongurationoftheVPN.Besidesthecontentdescribedinthefollowingsubsections,namelyestablishingidentityandtrustaswellasaddressallocationanddiscovery,SocialVPNreusesexistingcomponentsalreadyprovidedinIPOP,suchassecurelinkestablishment,endpointconguration,andpackethandling.EventhefunctionalityusedbySocialVPNfor 104

PAGE 105

addressallocationanddiscoverybuildsuponexistingabstractionsalreadyprovidedbyIPOP. 4.6.1ArchitectureSocialVPNleveragestheonlinesocialnetworktoestablishtrustandexchangecerticates.Thusasocialnetworkmustprovideameansforanexternalapplicationtodeterminefriendshipsandstorearbitrarydataintothesocialnetwork.Thearbitrarydatainthiscasewouldbethecerticatethatcanbeusedtondapeerinthenetworkoverlayandverifyitsidentity.Thecerticateconsistsofthepeer'ssocialnetworkinformationandP2Paddress.Thusonceapeerhasconnectedtoasocialnetworkthersttime,itneedonlyberepeatedtoobtainthelatestinformation.Existingcerticatesremainvaliduntilthefriendshiphasendedforthecerticatecachehasbeenexplicitlyushed.Onceapeerhasacerticate,connectionsareimmediatelyestablishedwithallfriendsthatarecurrentlyconnectedtotheoverlay.Asnewpeerscomeonline,theyestablishconnectionswiththosealreadythere.Duetopotentialnetworkproblems,thismaynotoccur,andsoallmembersofSocialVPNwilloccasionallycheckthelivenessofpeersnotconnectedtotheoverlay.Becauseonlinepeershaveanactiveconnection,thereisnoneedtoexplicitlymonitortheirstate.Whentheygoofine,theconnectionwillbebrokenandcanberepresentedtotheuserappropriately.Themotivationinestablishingconnectionsimmediatelycomesfromtwopurposes:nooverheadinbootstrappingondemandconnectionsandbetterabilitytodistributeIPmulticastandbroadcastpackets.UsingthetraditionalIPOPstyletoestablishadirectconnectioncantakesomewherebetweenseveralhundredsofmillisecondsuptoseveralseconds,whichmaybedisappointingtouserswhohaveusedcentralizedVPNsthathavemuchfasterconnectionestablishment.Becausetherecurrentlyexistsnosupportforefcientbroadcast/multicastmessagedistributioninsideSocialVPN,maintaining 105

PAGE 106

anactivelinktoallpeersallowsapeertopushthatmessagetoalltheirpeerswithouthavingtoestablishanewtrustedlinkrst. 4.6.2LeveragingTrustFromFacebookTrustorfriendshipsalreadyestablishedinFacebookRusedanowdeprecatedtechnologythatalloweddesktopapplicationsaccesstoFacebookR.Certicateexchangerelieduponaweb-baseddatastorecomponentprovidedbyFacebookR,whichwaspresentedasadatabase.WhenSocialVPNrstcontactsFacebookR,itwouldaddthecurrentcerticate,ifitdidnotexistandthendownloadcerticatesforallfriendsthatitdidnotalreadyhave.BecauseausermighthavemorethanoneinstanceofSocialVPNrunning,thedatabasewasdesignedtoallowtheusertostoremultiplecerticatesandtocleartheircerticates.Asmentionedearlier,eachcerticatecontainsthefriendsP2Paddress,whichallowsapeertodiscoveraremotepeerandestablishatrusted,directVPNlinkwiththem.Unfortunately,FacebookR'sinterfacewaspoorlyconstructedandnolongerexists.EachapplicationhadtoembedinitselfprivatekeyinformationtoauthenticateitselfwithFacebookR.Amaliciousattackercouldeasilydiscoverthisandchangethestoreddatatosuittheirneeds.WhileFacebookRnevergaveareasonforshuttingdownthedesktopapplicationcomponentoftheirsystem,thisisaprobablereason.Asanalternative,Idevelopedawebapplication,whichwasusedforashortperiodoftimetoreplacethedesktopapplication;however,thiswasfraughtwithproblems.UnlikethedesktopapplicationthatonlyrequiredtrustwithFacebookR,thewebapplicationrequiredhostingathird-partywebsitetosupportthesystem.Thetrustmodelisnotsignicantlydifferentastheadministratorfortheapplicationhasaccesstothetrustedmaterialregardless,itsimplymeantanothercentralizedcomponent.ThesecomplicationsledtothedevelopmentofanXMPP-basedSocialVPN. 106

PAGE 107

4.6.3LeveragingTrustfromXMPPUnlikeFacebookR,XMPPisawellstandardized,opensystemwithmanyindividualmemberscontributingcompatibleservices.ThusifoneofthemdecidestobreakfromtheXMPPspecication,userscaneasilymigratetoanotherserviceprovider.AftertheunfortunateincidentwithFacebookR,thisopenaspectwasmuchmoreattractivetoSocialVPNasaresearchprojectAsdiscussedinSection 3.3.2 ,XMPPisprimarilyusedasanopenprotocolinstantmessengerservice.ThoughithassupportforexchangingbinarymessagesthroughIQ.WhenapeerusingSocialVPNconnectstoXMPP,theyareinformedoftheirfriendsthatareonline.Eachfriendhasauniquenameintheformusername@domain/resource.Whenapeerreceivesthismessage,theycandetermineifthatfriendisusingSocialVPNbytheresourcename.IfthepeerisdiscoveredtobeusingSocialVPNtheywillexchangecerticatesandproceedtoestablishtrustedlinksintheoverlay. 4.6.4AddressAllocationsandDiscoveryIncreatingaVPNwherelinksaredenedbysocialnetworkingrelationships,themechanismsforIPaddressallocationviatheDHTdonotapplywell.AsocialnetworkingbasedVPNwillformanoverlaythatdoesnotneedtorelyonastructuredoverlayandsotodowithoutrequiresanewaddressingscheme.Additionally,attemptingtoplaceallpeersinsideasocialnetworkwithinanIPaddressrange,especiallyIPv4,isfraughtwithproblems[ 92 ].NamelythatitcanbedifculttondacommonaddressspaceforallpeersinsideaVPN,whichcanbemadeevenmoredifcultifthosepeersuseanotherVPNproduct.TheconceptemployedinSocialVPNistoplaceeachuserinsidetheirownprivateaddressspaceindependentofotherusers.EachfriendofthatpeerhasauniqueIPaddressunbeknownsttotheminsidethisaddressspace.TheIPismappedtotheusersP2Paddress.Thenthroughtheuseofpackettranslation,IPaddressesaretransparentlychangedasthepacketistransferredbetweenpeers.Priortodelivery,the 107

PAGE 108

packet'sdestinationaddressisconvertedtothepeersownpre-denedIPaddressandthesourceaddressisbaseduponamappingstoredinsideahashtablethatmapsP2PaddresstoIPaddress.Onlytrustedpeerswillhaveamappinglikethis. 4.7RelatedWork 4.7.1VPNsHamachiR[ 67 ]isacentralizedP2PVPNproviderusingthewebsiteforauthentication,peerdiscovery,andconnectionestablishment.WhiletheHamachiRprotocolclaimstosupportvarioustypesofsecurity[ 68 ],theimplementationappearstoonlysupporttheKDCrequiringthatallpeersestablishtrustedrelationshipthroughthecentralwebsite.TheHamachiRapproachmakesiteasyforuserstodeploytheirownservices,butplaceslimitationsonnetworksize,usesaproprietarysecuritystack,anddoesnotallowindependentVPNdeployments.Incontrast,myapproachpresentsacompletelydecoupledenvironmentallowingpeerstostartusingthesharedsystemtobootstrapprivateoverlaysandmigrateawaywithoutcostifneedbe.Furthermoremyapproachreliesonlyonacentralservertoobtainthecerticateotherwise,itisdecentralized.InHamachiR,ifthecentralservergoesofine,nonewpeerscanjointheVPN.CampagnolVPN[ 12 ]providessimilarfeaturestoHamachiR:aP2PVPNthatreliesonacentralserverforrendezvousordiscoveryofpeers.ThekeydifferencesbetweenHamachiRandCampagnolisthatCampagnolisfreeanddoesnotprovideaservice;usersmustdeploytheirownrendezvousservice.TheauthorsofCampagnolalsostatethatthecurrentapproachlimitsthetotalnumberofpeerssharingaVPNto100sonottooverloadtherendezvousservice.Thecurrentimplementationdoesnotsupportasetofrendezvousnodes,thoughdoingsowouldmaketheapproachmuchmorelikeours.Inaddition,thesystemreliesontraditionaldistributionofaCRLtohandlerevocation.Tinc[ 100 ]isadecentralizedVPNrequiringuserstomanuallyorganizeanoverlaywithsupportforndingoptimalpaths.Incomparisontomyapproach,TincdoesnotautomaticallyhandlechurnintheVPN.Ifanodeconnectingtwoseparatepiecesofthe 108

PAGE 109

VPNoverlaygoesofine,theVPNwillbepartitioneduntilausermanuallycreatesalinkconnectingthepieces.Furthermore,Tincdoesnotformdirectconnectionsforimprovedlatencyandthroughputreasons,thusmembersactingasroutesintheoverlayincurthepriceofactingaspacketforwarders.ThelastVPN,IdiscussisthemostsimilartoIPOP,itscalledN2N[ 29 ].N2Nusesunstructuredp2ptechniquestoformanEthernetbasedVPN.Whiletheirapproach,likeours,hasbuilt-inNATtraversal,itrequiresthatusersdeploytheirownbootstrapandlimitssecuritytoasinglepre-sharedkeyfortheentireVPN,thususerscannotberevoked.SinceN2NprovidesEthernet,usersmustprovidetheirownmechanismforIPaddressallocation,whilediscoveryutilizesoverlaybroadcasting.Thusthereareconcernsthatassystemsgetlarger,N2Nmaynotbeveryefcient. 4.7.2P2PSystemsBitTorrentR[ 10 ],aP2Pdatasharingservice,supportsstreamencryptionbetweenpeerssharingles.ThepurposeofBitTorrentRsecurityistoobfuscatepacketstopreventtrafcshapingduetopacketsnifng.ThusBitTorrentRsecurityusesaweakstreamcipher,RC4,andlackspeerauthenticationassymmetrickeysareexchangedthroughanunauthenticatedDife-Hellmanprocess.SkypeR[ 99 ]providesdecentralizedaudioandvideocommunicationtooveramillionconcurrentusers.WhileSkypeRdoesnotprovidedocumentationdetailingthesecurityofitssystem,researchers[ 35 48 ]havediscoveredthatSkypeRsupportsbothEtEandPtPsecurity.ThoughsimilartoHamachiR,SkypeRusesaKDCanddoesnotletuserssetuptheirownsystems.AsofDecember2009,theFreePastrygroupreleasedanSSLenabledFreePastry[ 94 ].Thoughrelativelylittleispublishedregardingtheirsecurityimplementation,theuseofSSLpreventsitsapplicationforuseintheoverlayandforoverlaylinksthatdonotuseTCP,suchasrelaysandUDP.Thustheirapproachislimitedtosecuringenvironments 109

PAGE 110

thatarenotbehindNATsandrewallsthatwouldpreventdirectTCPlinksfromformingbetweenpeers. 110

PAGE 111

CHAPTER5EXTENSIONSTOP2POVERLAYSANDVIRTUALNETWORKSThischaptercontainscomponentsthatextendtheVPN(virtualprivatenetwork)softwaretoprovideadditionalimportantfeatures.ManyofthesecomponentsderivefromexperiencesanddemandsthathavearisenasaresultofthedeploymentoftheVPNsoftwareinrealsystems.DeploymentexperiencesincludebutarenotlimitedtousageonPlanetLab,resourcesincludingpersonalcomputersandclustersinresidentialandacademicsenvironments,virtualmachines,andcloudresources.EachoftheseenvironmentsexposesadifferentsetofrequirementstothedesignandimplementationofapracticalP2P(peer-to-peer)VPN. 5.1Built-inSelf-SimulationSoftwaresystemsarecomplexandinvolvemanymovingparts.Traditionally,systemdesignbeginsbyconsideringthegoalsofthesystem,choosingalgorithmsanddatastructuresthatcanachievethosegoals,andsimulatingormodelingthesystem.Thoseresultsthentranslateintoarealsystemthatconsistsofanewcodebasedupontheconceptinthesimulation.Inthisprocess,simulationisappliedprimarilytovalidateadesignconceptbutnotitsimplementation.Thentheentiresoftwarebasemustbeindependentlycheckedforbugsandotherissuesthatmayhavealreadyappearedinthesimulationcode,doublingdeveloperefforts.Toreduceeffortsindevelopmentandevaluation,IhaveinvestigatedandimplementedmechanismsfordistributedsystemsandinparticularBrunettosupportbuilt-inself-simulationusingevent-drivensimulationtechniques.Inotherwords,eventhoughBrunetiswrittenforrealsystemdeployment,thesamecodecanrunusingsimulatedcommunicationlinksandsimulatedtime,allowingmanynodestorunonthesameresourceandpotentiallyfasterthanwallclocktime.Thisapproachallowstransitioningfeaturesfromsimulationdirectlyintodeployment,hasteningdevelopmentcycles.Furthermore,interestingdiscoveriesintherealsystemcanbemodeledinsimulationto 111

PAGE 112

makethesimulationbehaviormoreaccurate.Becauseasimulationcanrunonasinglecomputer,scalinguptoasignicantlylargesystem,newfeaturescanbeconstructedandevaluatedlocally,removingmanybugsandreusingandapplyingtestcasesalreadypresentinthesimulationenvironmentsignicantlyreducingtestingoverheads.Theconceptcanbeappliedtonetworking/distributedsystemsingeneral.Distributedsystemsoftwarecanusuallybedividedintomanypieces,suchasnetworkcommunication,state,time-basedevents,useractions,andsoon.Simulationofthesesystemsfocusesonthreeaspectshandlingoftime-basedevents,communicationbetweenthevariousmembersofthesystem,andtheinjectionandhandlingofuseractions.Intherestofthissection,IwilldiscusstheseinmoredepthanddiscusshowIaddressedtheminthecontextofBrunet. 5.1.1Time-BasedEventsEventsoractionscausechangesinasystem.Someareduetoexternalstimuli,suchashardwareorsoftwareinterruptsinaprocessorsoruserinput,othersarearesultoftimers,whichmaybeasubsetofhardwareinterrupts.Inthecontextofsimulation,timersandexternalstimulicanbeviewedastwodifferentcomponents.Theexternalstimulimaybedeliveredbaseduponatimedeventoranactioninitiatedbyaremoteparty.Iftimeisignored,thenanodewillruninaloopuntilitssteadystatehasbeenachievedandthenconstantlyverifyingthatitstillisinsteadystate.Timingallowsanodetodelaythisbehavior,suchasestablishingmoreconnectionsorverifyingitsconnectivity,behaviorwhichproducesmoreefcientsystems.Asystemcouldbemadeentirelywithouttimersandrunonexternaleventsalone.Inthiscase,timingisstillrequiredtomodelthecommunicationdelaybetweenpeers.Amessagesentfromonepeershouldnotinstantaneouslyarriveatanotherpeer.AswillbedescribedinSection 5.1.2 ,peerscanusetimerstosimulatelatencybetweenpeers.Eventsinasimulationarestoredinatimerwithdelaysspeciedintermsofavirtualclock.Methodsinordertoretrievethecurrenttimeshouldbebaseduponthesame 112

PAGE 113

clockinbothsimulationanddeploymentsystems.Inarealsystem,thiswouldthenrevealtheactualcurrenttime,whereasinasimulatedenvironment,thiswouldrevealthecurrentsimulatedtime.Byvirtualizingtimeretrievalcalls,thecallercanbedirectedtotheappropriateclockdependingonwhetherthesystemisrunninginsimulationmodeornot.Howthisisimplementeddependsonthelanguagethesoftwareiswrittenin.Forexample,languageswithnamespacescaneasilyreplacetheclockfunctionswiththeirown.LanguageslikeCmayrequirepre-processormacrostospecifyrealorvirtualtime.Aseventsarequeuedintothesystem,theymustbestoredinordered.Thestructureshouldbesuchthattheeventtoexecutenextisalwaysavailableinminimumtime,whileoptimizingforinsertingandremovingeventsfromthetimer.Forthisapplication,aminimumheapworkswell.Aminimumheapprovidesconstantseektimeforthesmallestvalueaswellaslog(N)insertanddeletiontime.InBrunet,thishasbeenimplementedasabinaryheap.Afterthesystemhasinitialized,itmayaddoneormoreeventsintothetimertocauseanactiontooccur.Thesimulatorwillthenadvancethevirtualclocktothetimethenexteventissupposedtooccur,executealleventsthatoccurreduptothatpointincludingthenexteventandthenrepeat.Therunningofeventsshouldnotstopuntilthenexteventtoexecuteshouldberuninthefuture,becauseoneeventmaycauseanothereventtooccurimmediatelyrequiringnodelay.Itshouldbenotedthateventsmaywanttoexecuteotherevents.Theseeventsshouldnotbeexecutedin-lineandinsteadshouldbeaddedintothequeuetobeexecutedatthesamevirtualclocktime.Ifthisisnotdone,thereisapotentialforstackoverowsduetoextremelydeepcallsintothecode. 5.1.2NetworkCommunicationUsingcommunicationmodelsortransportsthatrelyonlimitedresourcessuchasthenumberofopensocketsorinteractionswiththeoperatingsystemcanseverelyhindertheusabilityandfunctionalityofasimulator.Asystemusingsocketswillquicklyhitawall,Linux,forexample,limitstheamountofopenledescriptorsto1,024,which 113

PAGE 114

meansthatinaUDP(userdatagramprotocol)systemasimulationwouldbelimitedtohavingasfewas1,024peersinthesystemwhereasaTCP(transmissioncontrolprotocol)systemcouldbeunabletoproceedwithmoreconnectionsthan32(32peerswithall-to-allconnectionswouldresultin1,024activeTCPsockets).Furthermore,eachinteractionwithasocketrequiresatleastoneifnotmoretransitionsbetweenuser-spaceandkernel-space.Sowhileexistingtransportscouldbeusedforsimulatedcommunication,theoverheadindoingsoisundesirableasitwouldlimitlarge-scalesimulations.Assumingthatthesystemismodularlywritten,itispossibleforvariousformsoftransportlayerstobeusedfornetworkcommunication.Thusforscalabilitypeerscouldexchangebuffersorpointerstomessageswitheachother.ThiswouldremoveanyrestrictionsonO/S(operatingsystem)resourcesandwouldnotrequirethateachcommunicationpathwaypassthroughasystemcall.Brunetsupportsagenerictransportsframeworkthatprovidesthemethodforsendingamessageandtheabilitytoregisteracallbackwhenamessageisreceived.Thisconceptisbuiltintoanedge.Eachedgeisassociatedwitharemotepeer,andwhensendingorreceivingapacket,thedestinationorsourcewouldbethepeerassociatedwiththatedge.Edgescomeinpairs,iftheyareconnected,thusasimulatededgeconsistsoftwocomponents:knowledgeoftheremoteedgeandtimingmeasurementofthelatencybetweenthetwopeers.Whenapeersendsamessagetotheremotepeer,thesimulatededgeenqueuesthemessageintothetimerwithacallbackintotheremoteedge'sreceivehandler.TCPandUDPuseIP(InternetProtocol)addressesandportstolocallyand,potentially,globallyuniquelydistinguishthemselvesandthat,moreimportantly,canbesharedwithothers.Inotherwords,theconceptofaddressesiskeytotransports.Sincethesimulatedtransportsareallrunninginthesameaddressspace,theredoesnotneedtobeamultilevelnamingschemeasprovidedbyIPaddressesandports.Instead, 114

PAGE 115

simulatedtransportsuseasingleinteger,whichcanthenbeusedasthekeyintoahashtable,whosevalueisthenodematchedtotheinteger.Whenapeerwantstoconnecttoaspecicnode,insteadofconnectingtoapeerataremoteIP,portpair,itseekstheremotenodeinthehashtable.Ifnopeerexists,dependingontheprotocolsimulated,theresultwillbeabrokenlinkoraconnectionerror.Ifitndsanentry,thetwopeerscreateedgesassociatedwitheachother.Atwhichpoint,thepeerscaneasilycommunicatewitheachotherusingthetimerandtheexchangeofbuffers. 5.1.3UserActionsAuserinadistributedsystemdoesnotnecessarilyimplyahuman,butrather,anexternalinputfromeitheranapplication,auser,asensor,orbysomeothermeans.Inasimulatedenvironment,thesetypesofbehaviorsshouldbeproperlymodeled.Thatis,ifauserrequestsinformationfromanotheruserusingthesimulator,itshouldbedeliveredwhenavailable,notafterpollingsomeentrypointaftersomeperiodoftime.Toeffectivelymodelthisbehaviorrequirestheuseofasynchronousinterfaces.Aftertheinitiationactionistriggered,aregisteredcallbackwillbetriggereduponcompletion.Asynchronouscallcanbeinefcientlyturnedintoanasynchronouscallthroughtheuseofathreadorbypolling.Thoughforperformancepurposes,itisbesttouseanasynchronousinterfacethatonlygetsinvokeduponcompletionofthetask.Fortunately,itisveryeasytomakesynchronousinterfacesfromasynchronous,soifdesignedproperly,thisisnotdifculttoimplementforsystemdesigners.Ifasynchronoushandlersarenotavailable,theinterfacecanbemadeasynchronousthroughpollingatthecostofoverhead.InBrunet,thisbehaviorhasbeenmodeledininteractionswiththeDHT(distributedhashtable)andsendingmessagesthroughtheoverlay.Acommonabstractclasscontainsamethodtostarttheaction.Itnotesthestartingtimewhenthismethodisexecuted.Itthenwaitsfortheasynchronousresponsefromtheunderlyingcomponentto 115

PAGE 116

informthattheuseractionhascompleted.Optionally,itwillcallanotheruser-speciedcallbackuponcompletion. 5.1.4TheRestoftheSystemTheothercomponentsofthesystemmayhaveanimpactonthespeedofsimulation,butingeneralshouldnotaffecttheabilityofthesystemtobesimulated.Thusthekeytomakingasystemself-simulatingismodularityandsupportforasynchronousinterfaces.Inthefollowingsection,Idiscussoptimizationsthatcanbemadetothesecomponentsandotherstoimprovesimulation. 5.1.5OptimizationsSimulationscanbeslowforanumberofreasonsandthatonlyincreasesbyattemptingtosimulatesoftwarethatwasnotintendedtobesimulated.Overlaysoftware,forexample,typicallyusesverylargeaddresses(16bytesorlarger)justtorepresentanothernode,whereasinasimulationormodelthisistypicallyrepresentedasanintegerornotatall.Additionally,duetothefactthatthelifetimeofvariousbuffersinthesystemcanbehardtopredict,wheninteractingwithincomingmessagesandevenoutgoingmessages,manydatastructureseitherstayinscopeforalongtimeorthereisheavychurnonmemoryintheheap.Formanagedlanguages,thiscanresultinsignicantoverheadduetogarbagecollection.Finally,sincetheentiresystemdependsonorderedtime,themechanismorderingtimingeventsplaysakeyrole.WhilethetypicaladdressinaP2Psystemmaybelargeinordertoallownodestoobtainaddressesindependentlyofeachotherthroughrandomnumbergeneration,inasimulation,thislargeaddressspaceisunnecessary,becausethereisnoneedtogeneratetheaddresseswithoutknowledgeofotheraddresses.Oneconditionthatmayneedlargeaddressspacesisextremelylargesimulations,butgiventhata32-bitnumberallowsfor4billionnodes,thisshouldnotbeanissue.Manycommondatastructuresaregeneratedinadistributedsystemeveninsideasinglenode.Thisincludestransportaddresses,P2Paddresses,andcommon 116

PAGE 117

stringsinsidethesystem.Bycachingthesevalues,thesystemcanreduceitsmemoryconsumptionandbenicertothegarbagecollector.Acacheinthissenseconsistsofahashtable,whosekeyistheobjectofinterestandthevalueisasingletonoravaluethatisidenticaltothekeyineverywayexceptbutpotentiallytheyrefertodifferentlocationsinmemory.Thuswhenapeerconstructsanotherpeer'saddress,itcancheckthehashtableforasingleton.Ifoneexists,itusesthesingletonandnoadditionalmemoryisrequiredbesidesapointertothissingleton.Ifonedoesnotexist,thisnewvalueisstoredasasingletonintothesystem.Therearevariousmeanstolimitingtheentriesinahashtable,suchasonlykeepingthelastNentries,keepingtrackofthelastaccesstime,countingthenumberofreferences,orusingaconceptknownasweakreferences.Weakreferencesprovideanattractiveoptionasitrequiresnoadditionalstateinthecache,agarbagecollectorwillremoveanobjectwhentherearenoreferencestoitbesidesweakreferences.Thusstaleentriesinthehashtablewillreturnnullobjects.Soacacheusingweakreferenceswillneedtoiteratethroughtheentirecacheoccasionallytoremovethesestalereferences.Messagesareusuallyassembledfromasetofmemoryblocks.Priortotransferringthem,theymustbeplacedintoacontiguousbuffer.Unfortunately,thiscanleadtosignicantmemoryallocationsandgarbagecollections.Toaddressthis,Ihaveutilizedmemoryheaps,whichcanbeusedtocreatemultiplememoryblocks.Theconceptistoallocatealargememoryblock.Whenassemblingamessage,itiswrittentoanoffsetintothisblock.Theblockcanthenbesharedwithothersbyprovidingareferencetothememoryblockandtheoffsetandlengthofthemessageinsidethisblock.Whentheblockisnolongerinscope,itisgarbagecollected.Thisapproachsignicantlyreducesdynamicallocationofdataandinturnsignicantlyimprovestheperformanceofthesimulation. 117

PAGE 118

5.2EfcientRelaysSometimesNAT(networkaddresstranslation)traversalusingSTUN[ 93 ]failsduetorestrictiverewallsandNAT.Occasionallythereareother,hardertodiagnose,connectivityissues.SomeP2PVPNs[ 66 67 ]supportrelaying,similartoTraversalUsingRelayNAT(TURNTraversalUsingRelayNAT)[ 90 ]providedbyamanagedrelayinfrastructure.CentralizedanddecentralizedVPNsdonotsufferfromthisproblemasalltrafcpassesthroughthecentralserverormanagedlinks.Toaddressthemanagementandoverheadconcernsinthesesystems,Iproposetheuseofdistributed,autonomicrelayingsystembaseduponpreviouswork[ 43 73 ].ThispreviousworkinvolvedtheuseoftriangularroutingthatallowedpeersnexttoeachotherinthenodeID(identication)spacetocommunicatedespitebeingunabletocommunicatedirectlybecauseofrewall,NAT,orInternetfragmentationissues.Theprocessforforminglocalrelaysortunnels[ 43 ]beginswithtwonodesdiscoveringeachotherviaexistingpeersanddeterminingtheneedtobeconnected.Ifadirectconnectionattemptfails,thepeersexchangeneighborsetsthroughtheoverlay.Uponreceivingthislist,thetwopeersusetheoverlapintheneighborsetstoformatwo-hopconnection.Inthiswork,Ihavefurtherextendedthismodeltosupportcaseswhennodesdonothaveanoverlapset.Thisinvolveshavingthepeersconnecttoeachother'sneighborsetsproactivelycreatingoverlap.InFigure 5-1 ,twomembers,0000andABCD,desireadirectconnectionbutareunabletodirectlyconnect,perhapsduetoNATsorrewalls.Theyexchangeneighborinformationthroughtheoverlayandconnecttooneofeachother'sneighbors,creatinganoverlap.Theoverlapthenbecomesarelaypath(representedbydashedlines),improvingperformanceoverroutingacrosstheentireoverlay.Additionally,Ihaveaddedthefeaturetoexchangearbitraryinformationalongwiththeneighborlist.Thusfar,Ihaveimplementedsystemsthatpassinformationaboutnodestability(measuredbytheageofaconnection)andproximity(baseduponping 118

PAGE 119

Figure5-1.Creatingrelays latencytoneighbors).Furthermore,whenoverlapchanges,anothermechanismcandeterminewhichsubsetofthepeerstouse;forexample,apeermayonlyroutethroughthefastestormorestableoverlapintheset.Toverifytheusefulnessoftwo-hopoveroverlayrouting,IperformedexperimentsandsharetheresultsinSection 5.2.1 .Inalivesystem,Ihaveveriedtheaccuracyandusefulnessofthelatency-basedrelayselectionalgorithminSection 5.2.2 5.2.1MotivationforRelaysintheOverlayThepurposeofthisexperimentistoquantifytheperformancebenetsofautonomicrelays.ForthisexperimentIusedtheMITKingdataset[ 49 ],whichcontainsall-to-alllatenciesbetween1,740well-distributedInternethosts.Varioussizesofnetworksupto1,740nodeswereevaluated100timeseach.TheexperimentswereexecutedbyrunningtheBrunetinsimulatedmode.Onceatsteadystate,Ithencalculatedtheaverageall-to-alllatencyforallmessagesthatwouldhavetakentwooverlayhops 119

PAGE 120

Figure5-2.Acomparisonofall-to-alloverlayrouting,two-hoprelay,anddirectconnectioninBrunet ormore,theaverageofthelowlatencyrelaymodel,andtheaverageofsinglehopcommunication.Inthelowlatencyrelaymodel,eachdestinationnodeformaconnectiontothesourcenode'sphysicallyclosestpeerasdeterminedvialatency(inalivesystembyapplicationlevelping).Thenthispathwayisusedasatwo-hoprelaybetweensourceandnode.Ionlylookattwooverlayhopsandmore,asasinglehopwouldnotnecessarilybenetfromtheworkandwouldbethecauseofatriangularinequality.TheresultsarepresentedinFigure 5-2 .Theinitialstartingsizeforthenetworkwassetto25,becausenetworksizesaround20andundertendtobefullyconnectedduetotheconnectivityrequirementsofthesystem.Itisnotuntilthenetworksizeexpandspast100andtowards200nodesthatrelaysbecomesignicantlybenecial.At100nodes,thereisapproximatelya54%performanceincrease,whereasat200thereisan87%increaseanditappearstogrowproportionatelytothesizeofthepool.Thekeytakeawayisthatlatency-boundapplicationsusingareasonablysizedoverlaywouldsignicantlybenetfromtheuseoftwo-hoprelays. 120

PAGE 121

Table5-1.Relaycomparison LatencyBandwidth (ms)stdevKbit/sstdev Hamachi-Free60.82.5440.20.87Hamachi-Pro60.21.6810001.29Latency-aware58.135.522451080 5.2.2ComparingRelaySelectionInthisexperiment,Isharemyexperiencesoftestingtheuseoflatency-awarerelaysusingthepublicP2PpoolrunningonPlanet-LabaswellasHamachi-FreeandHamachi-Prorelays.DuetoHamachiRnotsupportingrelaysinLinux,thisexperimentwasperformedinWindowsVista64-bit.HamachiRisdiscussedingreaterdepthinChapter 2 .Thetestingplatformconsistsoftwovirtualmachinelocatedonthesamehostwitharewallpreventingthemfromestablishingdirectconnections.Allexperimentswererepeated5timesusingacleancongurationeachtime.InHamachiR,thismeantthattheserverwouldneedtore-evaluateNATtraversingcapabilitiesandtheoptimalrelaytouse.InBrunet,thismeantanewnodeIDandestablishingrelayswithpeersindifferentregionsoftheoverlay.TheresultsarepresentedinTable 5-1 .AsHamachiRwasstartedandguredoutthatNATtraversalwasnotpossible,itbeganusingmultipledifferentrelaysasevidentbyseveraldifferentpingtimes.EventuallyHamachiRsettledonarelayserveranditappearedtobethesameoneeverytime,forbothHamachi-FreeandHamachi-Pro.TheonlydifferencebetweenHamachi-ProandHamachi-FreeisthatinProthereisabandwidthcapofapproximately1Mbit/swhereasFreeislimitedto40Kbit/s.BrunethasnodesbothonPlanet-LabbutalsodedicatedsystemsforArcher[ 37 ].ThesemachinesareatUniversitiesandthushaveahighbandwidthandlowlatencyconnectiontothetestingsite.Aswitnessedbytheresults,itappearsthatinmostifnotalltheseexperimentspeershadalowlatencyconnectiontoaUniversitycomputeresourceanditwaschosenaheadofPlanet-Lab. 121

PAGE 122

Thetwotakeawaysarethebenetofbeingabletodynamicallydeployrelayserversandreusecomputenodesasrelaysystems.Asthenetworkgrows,theremaybeneedtoimplementsomeformofbandwidthlimitatrelaynodes. 5.3PoliciesforEstablishingDirectConnectionsRoutingthrougharing-structuredoverlayusingagreedyroutingalgorithmtakeslog(N)timeandaddslog(N)overallbandwidthforasinglemessage.Therefore,sendingmessagesfrequentlybetweentwopeersthroughtheoverlayisnotcosteffective.Whatisnotapparentthroughthealgorithmiccomplexityanalysisisthefactthatmanypathsinanoverlaycanbeinefcientduetopeersroutingthroughdistantpartsoftheworldorhavinglimitedbandwidth.Lessfrequently,packetsroutedviatheoverlaycanjustdisappearduetonodesdisconnectingorpacketdropsacrosstheInternet.Toaddressthis,Gangulyetal.[ 42 ]madeasystemforcreatingadaptiveshortcuts.AdaptiveshortcutsenablepeerstoestablishdirectlinkswitheachotherusingBrunet'sbuiltinNATtraversalcapabilities.TheapproachtakenbyGangulyetal.wastomonitorincomingpacketsfromremotepeersandafteracertainthresholdwaspassedthesystemautomaticallymakesadirectconnectiontotheremotepeer.Asaresultofthistransparency,softwareusingtheoverlaycouldsimplystartthisservicewithoutmakinganyadditionalchangestotheapplication. 5.3.1LimitationsUnfortunatelythisapproachcomeswithlimitations.Therewasneveranysystematicunderstandingofbehaviorsthatshouldsignifythecreationofadirectlinkorwhenadirectlinkshouldbeclosed.Thusapplyingthelayernaivelycouldresultinconnectionchurn,whichwouldhaveramicationontheroutabilityofthenetwork.Thusacompromisewasmadetohaveitenabledonlyforselectedtrafc,whichinthecaseasdescribedbyGangulyetal.[ 42 ]wasIPorVNtrafc.Twoissuesmadetheapproachnolongerfeasible:theincreaseinsizeoftheoverlaynetworkandthesecuringofIPlinks.TheInternetdropspacketsatapproximately 122

PAGE 123

.00835%ofthetimeaccordingtothedataprovidedbyiPlane[ 70 ].Thiscompoundedwiththefactthatanoverlaymessagemaytakelog(N)hopssignicantlyincreasesthelikelihoodofapacketdroppingbeforearrivingatitsdestination.Addingasecuritymodelmakesthisevenmorecomplicated,becauseatrustedlinkmustbeestablishedbeforeroutinganymessagesbetweentheendpoints.InthecaseofDTLS(datagramtransportlayersecurity),thiscanresultin6messagestraversingtheoverlay 4-3 ,priortotherstIPpacket.Ifpeersmustrstestablishatrustedlinkandthentransmitacertainamountofpacketsinagiventime,thereisareasonablechancethattheymaynevernaturallytriggerthecreationofanadaptivelink.Inpractice,itwasquitecommonforthisnottosucceedand,infact,securitylinkswereoftentimesnotevenformed.Toverifythis,IimplementedanetworkprolingtoolanddeployeditontoPlanetLab.Themonitoringtoolmeasuredthedelayandsuccessofsendingmessagesbetweenthenodeandeveryothernodeintheoverlay.ThedroprateandlatencyforaroundtripmessageperhopdistancebetweentwopeersarepresentedinFigure 5-4 andFigure 5-3 ,respectively.ThedatainthoseguresiscomparedtodataretrievedfromiPlane,whichmakesitclearthatPlanetLabexacerbatesthesituation.Whilethisisawell-knownissue,itisaveryimportantconclusionsincemanyofthepublicsystemsprovidedbymyresearchgrouprelyonPlanetLab. 5.3.2On-DemandConnectionsBeforedeninganewarchitecture,ImeasuredthenetworktrafcofactiveandidleapplicationsthatwereofinteresttotypicalP2PVPNs,Condor[ 65 ]anddatatransfers.Datatransferstendtobesimple,iftheyareTCPdriven,rstaTCPlinkmustbeestablished,thendatatransferred,andnallythelinkisclosed.Condorisajobschedulemanagementtool,whichisdiscussedinmoredepthinChapter 6 .TheimportantaspectforthissectionisunderstandingthatallnodesinaCondorpoolhavearelationshipwithamanagernode.Theirbehavioristoinitiallysendaregistration 123

PAGE 124

Figure5-3.LatencyinPlanetLabdeploymentcomparedtoiPlane Figure5-4.DroprateinPlanetLabdeploymentcomparedtoiPlane messagecontainingthedetailsofthenodeandthereaftertosendaone-waymessagestatingtheirpresenceevery5minutesorso.Thenextstepwasdeterminingthecostofcreatingaconnectionversusroutingviatheoverlay.ThiscostneedstoconsiderthatbeforeasingleIPpacketcanberouted,asecuritylinkmustbeestablished.InBrunet,thecreationofalinkrequires1roundtripmessageacrosstheoverlay,whereasthesecuritylink,asmentionearlier,takes3.Soitisintuitivethatsendingasinglepacketsecuredthroughanend-to-endchannelviaadirectlinkisfarmoreefcientthandoingsoviatheoverlay.Soinsteadofhavingameterdeterminingwhentocreateconnections,connectionsshouldbemadeassoonasthereisinterestincommunication,inotherwords,on-demand.InaDHTsystem, 124

PAGE 125

thismaybedonepriortosendingorretrievingdatafromtheDHT.InaVPN,thismaybeduringthemappingofIPtoP2PasdescribedinSection 2.2.2 ,whichoccursbeforesecurelinkestablishment.Unfortunately,thishasthesideaffectofnotbeingtransparenttoapplicationsusingtheP2Psoftware'sinterfaceatthecostofbeingmoreresponsive.Thecreationofon-demandconnectionsresultsinahigherfrequencyofconnectionestablishment.Asaresult,betterheuristicsarenecessaryinordertodeterminewhentocloseunusedconnections.UsingtheprolinginformationretrievedbeforewithregardstoCondor'sone-wayheartbeatmessages,itwasimportantthataconnectionwasonlyclosedifitwasunusedinbothdirections.OtherwisethemanagerwouldbeconstantlyclosingconnectionsandpeerswouldrandomlydisappearfromCondorforperiodsoftime.AnalgorithmthatseemstohaveworkedsofarisbaseduponsomethingIcallatime-basedcache.Initially,entriesarestoredinahashtable;afteracertainperiodoftime,theyaremovedtoasecondhashtable,andthoseinthesecondhashtablearelost.Ifanentryisaccessedwhileinthesecondhashtableornotatall,itisaddedtothersthashtableandifapplicableremovedfromthesecondhashtable.Whenanentryisremovedfromthecache,itcausesanevictionnotice,whichresultsintheconnectionbeingremoved.Thetimerisbasedupona7.5minutetimer,sothataninactiveconnectionwillbeclosedwithin7.5to15minutes.Theapplicabilityofon-demandconnectionscomparedtoChotawasevaluatedusingthesimulatorwith1,024nodesandadroprateof0.00925%asfoundonPlanetLab.TheOn-demandconnectionswereestablishedusingtheexactsemanticsoftheOn-demandprotocol;however,thebehaviorofestablishingChotaconnectionsisalittlecomplicated,sincethetrafcbehaviorofsuccessfulandunsuccessfulconnectionattemptsisnotidentical.Toaddressthis,IsimulatedanidealChotasituation:thenodesoptionallyestablishasecurityconnectionviatheoverlay,thentheyexchangearoundtripmessage,andnallyestablishadirectconnectionbetweeneachother.The 125

PAGE 126

Figure5-5.Timetoformadirectconnection On-demandapproachinvolvedoptionallycreatingasecurityconnectionfollowedbythedirectconnection.WhilethisevaluationmodelcreatesahighlyidealsituationforChotaetablishment.TheresultsinFigure 5-5 makeitclearthatChotaisnotidealforthistypeofapplicationandthatsecurityonlymakestheissuesworse.TheOn-demandconnectionsshowasignicantimprovement,buttherestillexistsanobviousissuewithconnectionestablishmentthatwillonlybemadeworseasthesystemexpands.Perhapsusingmultiplepathsontheoverlaycanimprovethissituationpacketdropsontheoverlaymayhavehighcorrelationratherthanbeinguniformlyrandom.Inapplication,thismodicationmadetheArcher,whichusesbothsecurelinksandCondor,signicantlymorestable.Asthesystemexpanded,nodeswereconstantlyappearinganddisappearing,usersjobswerebeinglostduetodisconnectivities,anduserswerecomplainingaboutbeingunabletoevensubmitjobsintothesystem.Sincethechange,theissueshavebeenresolved. 5.4BroadcastingIPBroadcastandMulticastPacketsViatheOverlayTheuseofaprivatevirtualoverlayenablesanewmethodforsendingmulticastandbroadcastpackets.IntheoriginalapproachtoIPOP,broadcastingapackettotheentireoverlayisnotsuitablebecausetheoverlaycouldconsistofpeersfromotherVPNsandthosenoteveninvolvedwithVPNoperations,whileapproachesthatgenerate 126

PAGE 127

unicastmessageswhenabroadcastormulticastpacketarrivesattheVPN(e.g.byqueryingaDHTkeywhereallpeersintheVPNwouldplacetheiroverlayaddresssothattheycouldreceivethepackets)donotscalewell.TheabstractionofaprivatevirtualoverlayenablesscalablebroadcastingwithinaVPNbecausetheonlypeersintheprivateoverlayarepeersforasingleVPN.Likethebroadcastrevocationdiscussedearlier,IPbroadcastingandmulticastingusethemethoddescribedinAppendix 8 toefcientlydistributemessages.ThoughinVPNsituations,manypeersmayalreadyhaveconnectionstomostifnotalloftheirVPNpeers,thusthebroadcastalgorithmhasbeenmodiedtoallowapeertoselecthowmanypeerstheywouldliketoforwardthemessageto.Otherwiseinmanycases,thisalgorithmwilldegenerateintoonesimilartothepreviousapproach.TheoverlaybroadcastmethodforIPbroadcastandmulticastcaneasilybeanalyticallycomparedtotheoriginalDHTmethod.Amessageroutedviatheoverlaywilltakeapproximatelylog(N)hops.SofortheDHTmethod,thisinvolvesNmessageswithlog(N)hopseachorNlog(N)messagestotalandcompletinginatotaltimeoflog(N)excludingthelimitationsofbandwidth.WhereasefcientoverlaytakesexactlyNmessagesandcompletesinlog2(N)time.Theoverlaybroadcastsignicantlyreducesbandwidth,whichcanhaveadirecteffectonthesuccessofpacketsactuallymakingittotheendpeer.Alsoastheoverlaygrowsinsize,storingallpeersinsideasingleDHTmaycreateotherproblemsthatcannotberesolvedeasilythroughanalyticalmodeling. 5.5FullTunnelVPNOperationsThecongurationdetailedsofardescribesasplittunnel:aVPNconnectionthathandlesinternalVPNtrafconly,notInternettrafc.Priortothiswork,onlycentralizedVPNscurrentlysupportfulltunnel:providingthefeaturesofasplittunnelinadditiontosecurelyforwardingalltheirInternettrafcthroughaVPNgateway.Afulltunnelprovidesnetwork-layerprivacywhenauserisinaremote,insecurelocationsuchasan 127

PAGE 128

Figure5-6.AnexampleofbothfullandsplittunnelVPNmodes openwirelessnetworkatacoffeeshopbysecurelyrelayingallInternettrafcthroughatrustedthirdparty,theVPNgateway.BothmodelsareillustratedinFigure 5-6 .CentralVPNclientsusefulltunnelingthrougharoutingruleswap,settingthedefaultgatewaytobeanendpointintheVPNsubnetandtrafcfortheVPNserverisroutedexplicitlytotheLANgateway.ThisruleswapcausesallInternetpacketstoberoutedtotheVNdeviceandtheVPNsoftwarecanthensendthemtotheremoteVPNgateway.AttheVPNgateway,thepacketisdecryptedanddeliveredtotheInternet.AP2Psystemencounterstwochallengesinsupportingfulltunnels:P2PtrafcmustnotberoutedtotheVPNgatewayandtheremaybemorethanoneVPNgateway.IaddresstheseissuesandprovideasolutiontothisprobleminSection 5.5 .ThechallengesfacedinadecentralizedP2PVPNareprovidingdecentralizeddiscoveryofaVPNgatewayandsupportingfulltunnelmodeinaP2PenvironmentsuchthatallP2Ptrafcissenttotheintendedreceiverdirectlyinsteadofthroughthegateway.Theremainderofthissectioncoversgatewayandclientsolutionstoaddressthesechallenges. 5.5.1TheGatewayAgatewaycanbeconguredthroughNATsoftware,likemasqueradinginIPtablesorInternetConnectionSharingwithWindows.ThisautomaticallyhandlestheforwardingofpacketsreceivedontheNATinterfacetoanotherinterfacebringingthepacketcloser 128

PAGE 129

Figure5-7.ThecontentsofafulltunnelEthernetpacket toitsdestination.Similarly,incomingpacketsontheoutgoinginterfacemustbeparsedinordertodeterminethedestinationNATclient.FollowingfromtheoriginaldesignoftheVPNstatemachineinFigure 2-2 ,ifaVPNisagateway,theVPNstatemachinenolongerrejectspackets,whenthedestinationisnotintheVPNsubnet,thoughwhentheVPNgatewaymodeisdisabledthesepacketsarestillrejected.Whenenabled,allInternetandnon-VPNbasedtrafciswrittentotheTAPdevicesettingthedestinationEthernetaddresstotheTAPdevice.TheremainingcongurationisidenticaltoothermembersofthesystemaspacketsfromtheInternetwillautomaticallyhavetheclientsIPasthedestinationasaproductoftheNAT.Toprovidefordynamic,self-conguringsystems,VPNgatewaysannouncetheiravailabilityviaanentryintheDHT.Asfuturework,thisapproachcanbeexploredtoprovideintelligentselectionandloadbalancingofgateways. 5.5.2TheClientVPNClientswishingtousefulltunnelmustredirecttheirdefaulttrafctotheirVNdevice.IntheprototypeVPNmodel,avirtualIPaddressisallocatedforthepurposeofprovidingdistributedVNservicesDHCPandDNS.Thissameaddressisusedasthedefaultgateway'sIP.BecausethisIPaddressneverappearsinaInternetboundpacket,onlyitsEthernetaddressdoes,asshowninFigure 5-7 ,thisapproachenablestheuseofanyandmultipleremotegateways.Inthisgure,PNandVNtranslatetophyiscalandvirtualnetwork,respectively. 129

PAGE 130

Tosupportfulltunnelmode,theVPN'sstatemachinehastobeslightlymodiedtohandleoutgoingpacketsdestinedforIPaddressesoutsideoftheVPN,onlyrejectingthemwhenfulltunnelclientmodeisdisabled.Whenenabled,theVPNsoftwaresendspacketstotheremotepeeractingasafulltunnelgateway.Likewise,incomingpacketsthathaveasourceaddressoutsidethesubnetshouldnotberejectedbutinsteadtheoverlayaddressshouldbeacertiedVPNgatewaypriortoforwardingthepacket.Toselectaremotegateway,peersquerytheDHT.Astheremaybemultiplegatewaysinthesystem,thepeerrandomlyselectsone,forwardingpacketstothatnode.Toensurereliability,whentheclienthasnotheardfromthegatewayrecently,theclientsendsalivenessquerytothegateway.Ifthegatewayisdown,thetakenpessimisticapproachndsanewgatewaywhenthenextInternetpacketarrives.TherealchallengeinapplyingfulltunnelVPNmodetoP2PVPNsisthenatureoftheP2Psystem,namelydynamicconnections.Peersdonotknowaheadoftimewhatremotepeerconnectionswillbethusasimpleruleswitchdoesnotwork.Theoriginalapproachwastowatchincomingconnectionrequestsandaddingadditionalroutingrulesondemand,thoughthisisonlyreasonablyfeasiblewithUDPasaTCPhandshakemessagewouldneedtobeinterceptedandpotentiallyreplayedbythelocalhostinordertoenabletheruleandallowproperrouting.TherealdrawbackoftheapproachthoughisthatUDPmessagescaneasilybespoofedbyremotepeersenablingunsecuredInternetpacketstobeleakedinthepublicenvironment.Eveniftheconnectionsaresecured,itcouldtakesometimeforthepeerstorecognizeafalseconnectionattemptanddeletetherule.AsolutiontothesecurityproblemistohavealltrafcdirectlyroutedtotheVNdevicewithnoadditionalroutingrules.TheVNisthenresponsibleforlteringP2PtrafcandforwardingittotheLAN'sgatewayviaEthernetpackets.IntheVPNapplication,outgoingIPpackets'sourceportsarecomparedtoVPNapplication'ssourceports.Uponamatch,theVPNapplicationdirectsthepackettotheLAN'sgateway.The 130

PAGE 131

Table5-2.Fulltunnelevaluation GoogleRGatewayVPNaddressGatewaypublicaddress Ethernet70.612.913.9Routing71.413.211.0NoVPN66.1N/A10.9 threestepsinvolvedinthisprocessaretranslatingthesourceIPaddresstomatchthephysicalEthernet'sIPaddress,encapsulatingtheIPpacketinanEthernetpacketwitharandomlysourceaddress[ 116 ]andthedestinationtheLAN'sgateway,andsendingthepacketviathephysicalEthernetdevice.SendinganEthernetpacketisnottrivialasWindowslackssupportforthisoperationandmostUnixsystemsrequireadministratorprivilege.Analternative,platformindependentsolutionusesasecondTAPdevicebridgedtothephysicalEthernetdevice,allowingEthernetpacketstobesentindirectlythroughtheEthernetdeviceviatheTAPdevice.BecausethesolutionresultsinincomingpacketstoarriveatadifferentIPaddressthantheactualoriginalsourceIPaddressTCPdoesnotworkinthissolution.ThismethodhasbeenveriedtoworkonbothLinuxandWindowsusingOSdependentTAPdevicesandbridgeutilities. 5.5.3FullTunnelOverhead Whilethefulltunnelclientmethodeffectivelyresolvesthelingeringproblemofensuringthatallpacketsinafulltunnelwillbesecure,itraisesanissue:couldtheeffectofhavingallpacketstraversetheVPNapplicationbeprohibitivelyexpensive.Analysisofthisapproachcomparesitwithonethatusesthetraditionalroutingruleswitch.Figure 5-2 presentthepingtimefromaresidentiallocationtooneofGoogleR'sIPaddressesusingagatewaylocatedattheUniversityofFloridawhentheVPNisinsplittunnelmode,fulltunnelusingtheroutingruleswitch,andfulltunnelusingEthernetforwarding.Theresultsexpressthatthereisnegligibledifferencebetweenthefulltunnelapproaches.Oneinterestingresultisthelatencytogatewayspublicaddressinthe 131

PAGE 132

routingtest,whichmostlikelyisaresultofthepingbeingsentinsecurelyavoidingtheVPNstackcompletely. 132

PAGE 133

CHAPTER6AD-HOC,DECENTRALIZEDGRIDSGiveamanash,feedhimforaday.Teachamantosh,feedhimforalifetimeLauTzuLarge-scalegridcomputingprojectssuchasTeraGridandOpenScienceGridprovideresearchersvastamountsofcomputeresourcesbutwithrequirementsthatcouldlimitaccess,resultsdelayedduetopotentiallylongjobqueues,andenvironmentsandpoliciesthatmightaffectauser'sworkow.InmanyscenariosandinparticularwiththeadventofInfrastructureasaService(IaaS)cloudcomputing,individualusersandcommunitiescanbenetfromlessrestrictive,dynamicsystemsthatincludeacombinationoflocalresourcesandon-demandresourcesprovisionedbyoneormoreIaaSprovider.Thesetypesofscenariosbenetfromexibilityindeployingresources,remoteaccess,andenvironmentconguration.Gridcomputingpresentsopportunitiestocombinedistributedresourcestoformpowerfulsystems.Duetothechallengesincoordinatingresourcecongurationanddeployment,researcherstendtoeitherbecomemembersofexistinggridsordeploytheirownprivateresources.Theformerapproachislimitedbylackofexibilityintheenvironmentandpolicies,whilethelatterrequiresexpertiseinsystemscongurationandmanagement.Thoughthereexistsawealthofmiddlewareavailable,includingresourcemanagerssuchasCondor[ 65 ],Torque(PBS)[ 84 ],andSunGridEngine[ 105 ],manyseethecostofinstallingandmanagingthesesystemsasbeinggreaterthantheirusefulnessandasaresultturntoinefcientadhocresourcediscoveryandallocation.TocombineresourcesacrossmultipledomainssolutionsthereexistsolutionssuchastheGlobusToolkit[ 40 ]orgLite[ 9 ];however,thesetoolsetscomewiththeirownchallengesthatrequirethelevelofexpertisemostresearchersineldsoutsideofinformationtechnologylack. 133

PAGE 134

Withtherecentadventofcost-effectiveon-demandcomputingthroughInfrastructureasaServiceclouds,newopportunitiesforuser-deployedgridshavearisen;where,forexample,asmalllocalcomputerclustercanbecomplementedbydynamicallyprovisionedresourcesthatruncloud-burstworkloads.However,whilecloud-provisionedresourcessolvetheproblemofon-demandinstantiation,theproblemofhowtoconguretheseresourcestoseamlesslyandsecurelyintegratewithone'sinfrastructureremainsachallenge.Inparticular,consideringthatusersmayprovisionresourcesfrommultipleIaaSproviders,thecongurationdemandsaresimilartoadistributedgrid:whileacloudimagecanbeencapsulatedwithagridcomputingstack,itstillneedscongurationintermsofallocatinganddistributingtheappropriatecerticates,networkcongurationtoestablishend-to-endconnectivity,andpropercongurationofthemiddlewaretoestablishworker,submit,andschedulernodes.Inthischapter,Ipresenttechniquesthatreducetheentrybarrierintermsofnecessaryexpertiseandtimeinvestmentindeployingandextendingadhoc,distributedgrids.Toverifythisassertion,IhaveimplementedasystemsupportingtheseideasintheGridAppliance,whichaswillbedemonstrated,allowsuserstofocusonmakinguseofagridwhileminimizingtheireffortsinsettingupandmanagingtheunderlyingcomponents.Thecorechallengessolvedbymyapproachinclude: decentralizeddirectoryservicefororganizinggrids, decentralizedjobsubmission, gridsinglesignonthroughwebservicesandinterfaces, sandboxingwithnetworksupport, andall-to-allconnectivitydespitenetworkasymmetries.TheGridApplianceprojectandconceptshavebeenactivelydevelopedandusedinseveralprojectsforthepastsixyears.Oftheseprojects,Archer,adistributedgridforcomputerarchitectureresearch,hasdemonstratedthefeasibilityandutility 134

PAGE 135

Figure6-1.GridAppliancemiddleware ofthisapproachbydeployingasharedcollaborativeinfrastructurespanningclustersacrosssixUSuniversities,wherethemajorityofthenodesareconstrainedbynetworkaddresstranslation(NAT).EveryresourceinArcherisconguredinthesame,simplemanner:bydeployingaGridAppliancethatself-congurestojoinawide-areagrid.Researchersinterestedordesiringtheabilitytoaccessbothgridresourcesandspecializedcommercialsimulationtools(suchasSimics)caneasilyuseandcontributeresourcesfromthissharedpoolwithlittleeffortbyjoiningawebsite,downloadingacongurationimageandavirtualmachine(VM),andstartingtheVMinsideaVMmanager(VMM).Uponcompletionofthebootingprocess,usersareconnectedtothegridandabletosubmitandreceivejobs.AttheheartofmyapproachliesaP2P(peer-to-peer)infrastructurebaseduponadistributedhashtable(DHT)usefulfordecentralizedcongurationandorganizationofsystems.Peersareabletostorekey,valuepairsintotheDHTandtoquerytheDHTwithakeyandpotentiallyreceivemultiplevaluesefciently.TheDHTprovidesdiscoveryandcoordinationprimitivesforthecongurationofadecentralizedP2Pvirtualprivatenetwork(VPN),whichsupportsunmodiedapplicationsacrossanetworkoverlay.TheDHTisalsousedforthedecentralizedcoordinationofthegrid.Userscanconguretheir 135

PAGE 136

gridthroughawebinterface,whichoutputscongurationlesthatcanbeusedwiththeGridAppliance.Thetechniquesdescribedinthispaperhavemanyapplications.ThebasicsystemsupportsthecreationoflocalgridsbystartingavirtualmachineonthecomputersintendedforusewithinthegridandusingLANmulticastfordiscovery.Itallowsuserstoseamlesslycombinetheirdedicatedgridswithexternalresourcessuchasworkstationsandcloudresources.Theleveloffamiliaritywithsecurity,operatingsystems,andnetworkingisminimalasallthecongurationdetailsarehandledascomponentsofthesystem.Managementofthesystemincludingusersandnetworkcongurationutilizesasocialnetworkinglikegroupinterface,whiledeploymentusespre-builtvirtualmachineimages.AgraphicaloverviewofthesystemisillustratedinFigure 6-1 .ThesetechniquessimplifythetetheringofresourcesacrossdisparatenetworksThesetupofsecurity,connectivity,andtheircontinuousmanagementimposesconsiderableadministrativeoverhead,inparticularwhennetworksareconstrainedbyrewallsandNATdevicesthatpreventdirectcommunicationwitheachother,andwhicharetypicallyoutsidethecontrolofauserorlab.OurapproachintegratesdecentralizedsystemsbehindNATsinamannerthatdoesnotrequirethesetupofexceptionsandcongurationatNAT/rewallbysystemadministrators.Therestofthepaperisasfollows.Section 6.1 highlightsofmyresearchgroupspreviousworktoprovidebackgroundformycontributionsinthispaper.InSection 6.2 ,IdescribethecomponentsoftheGridApplianceWOW.Section 6.3 providesacasestudyofagriddeploymentusingstandardgriddeploymenttechniquescomparedtoourGridAppliance,describingqualitativelythebenetsandevaluatingquantitativelytheoverheadsofthisapproach.IsharemyexperiencesfromthislongrunningprojectinSection 6.4 .Finally,Section 6.5 comparesandcontrastsothersolutionstotheseproblems. 136

PAGE 137

6.1WOWsThisworkfurthersthevisionbeganbymyselfandmyresearchlabinearlierdescribedasworkwide-areaoverlayofvirtualworkstations[ 42 ](WOW).TheWOWpaperestablishedtheuseofvirtualizationtechnologies,primarilyvirtualnetworkingandvirtualmachines,tosupportdynamicallocationofadditionalresourcesingridsthatspanwideareanetworks.Forreference,theextensionsmadeinthispapertotheWOWconceptaremeansforthedynamiccreationofgridswithsupportforsecurity,decentralizedaccess,anduser-friendlyapproachestogridmanagement.ThissectioncoversthedevelopmentofWOWsovertheyearsasitrelatestootherpublicationsandasmeanstodistinguishthecontributionsmadebymeandinthischapter. 6.1.1P2POverlaysPeer-to-peerorP2Psystemscreateenvironmentswheremembershaveacommonfunctionality.P2Psystemsareoftenusedfordiscoveryinadditiontosomeuser-specicservice,suchasvoiceandvideowithSkypeRordatasharingwithBitTorrentR.ManyformsofP2Phaveautonomicfeaturessuchasself-healingandself-optimizationwiththeabilitytosupportdecentralizedenvironments.AsIwillshow,thismakestheirapplicationinthesystemveryattractive.FortheGridAppliance,IhavechosentouseBrunet[ 13 ],atypeofstructuredoverlay.Structuredoverlaystendtobeusedtoconstructdistributedhashtables(DHT)andincomparisontounstructuredoverlaysprovidefasterguaranteedsearchtimes(O(logN)comparedtoO(N),whereNisthesizeofthenetwork).ThetwomostsuccessfulstructuredoverlaysareKademlia[ 72 ],commonlyusedfordecentralizedBitTorrentR,andDynamo[ 28 ],tosupportAmazonR'swebsiteandservices.BrunetsupportforNATtraversalmakesituniquefromotherstructuredoverlays.OriginallyintheWOWs[ 42 ],Brunetfacilitatedthedynamicconnectionsamongstpeersinthegrid.Sincethen,ithasbeenextendedtosupportDHTwithatomicoperations[ 44 ], 137

PAGE 138

efcientrelayswhendirectNATtraversalfails[ 115 ],resilientoverlaystructureandrouting[ 43 ],andcryptographicallysecuremessaging[ 115 ]. 6.1.2VirtualPrivateNetworksAcommonquestionwithregardstothisworkiswhyVPNs?Thecorereasonisconnectivity.IPv4(InternetProtocolversion4)hasalimitedaddressspace,whichhasbeenextendedthroughtheuseofNATallowingasingleIPtobemultiplexedbymultipledevices.Thiscreatesaproblem;however,asitbreakssymmetryintheInternetlimitingtheabilityforcertainpeerstobecomeconnectedandwhichpeerscaninitiateconnections.WiththeadventofIPv6(InternetProtocolversion6),thesituationmightimprove,buttherearenoguaranteesthatNATswilldisappearnorcanusersbecertainthatrewallswillnotbeinplacethatinhibitsymmetry.AVPNcircumventstheseissues,solongastheusercanconnecttotheVPN,asalltrafcisroutedthroughasuccessfullyconnectedpathway.TheproblemwithtraditionalVPNapproachesismanagementoverheadincludingmaintainingresourcesonpublicIPaddressesandestablishinglinksamongstmembersintheVPN.TheVPNusedinthesystemiscalledIPOP[ 41 115 ].IPOP(IPoverP2P),asthenameimplies,usesaP2Poverlay(Brunet)torouteIPmessages.ByusingP2P,maintainingdedicatedbootstrapnodeshavelessoverhead,myapproachwithIPOPallowsanexistingBrunetinfrastructuretobootstrapindependentBrunetinfrastructuresinordertoisolateIPOPnetworksintheirownenvironments[ 117 ].OnceIPOPhasentereditsuniqueBrunetoverlay,itobtainsanIPaddress.IPaddressreservationanddiscoveryreliesonBrunet'sDHT.EachVPNstoresitsP2PidentierintotheDHTatthegeneratedbythedesiredIPaddress,suchthatthekey,valuepairis(hash(IP),P2P).Inordertoensuretherearenoconicts,thestoringofthisvalueintotheDHTusesanatomicoperation,whichsucceedsonlyifnootherpeerhasstoredavalueinthash(IP). 138

PAGE 139

TheprocessforcreatingconnectionsbeginswhenIPOPreceivesanoutgoingmessage.FirstitparsesthedestinationaddressandqueriestheDHTfortheremotepeersP2Paddress.Thepeerthenattemptstoformasecure,directconnectionwiththeremotepeerusingBrunet'ssecuremessaginglayer.Oncethathasformed,packetstothatIPaddressaredirectedoverthatsecurelink.Inmyoriginaldesign[ 113 ],thevirtualnetworkwassecuredthroughakernel-levelIPsecstack,amodelkeptthroughtherstgenerationArcherdeployment.ThisapproachonlysecuresvirtualnetworklinksbetweenpartiesanddoesnotsecuretheP2Player;furthermore,inIPseccongurationeachpeerrequiresauniqueruleforeveryotherpeer,whichlimitedthemaximumnumberofpeersintheVPN.SecuringtheP2Playerisimportant,otherwisemalicioususerscouldeasilyderailtheentiresystem,butsecuringwithIPsecwouldpracticallynegatethebenetsoftheP2Psystem,becauseofnetworkcongurationissuesrelatedtoNATsandrewalls.Inmoderndeployments,IhaveemployedthesecuritylayerattheP2Player,whichinturnalsosecuresvirtualnetworkinglinks.ForgridsthatrelyuponVPNstoconnectresourcesandusers,thiscanimposetheneedforacerticatefortheVPNandoneforthegrid.Thoughinourapproach,IavoidthisproblembyusingaVPNthatallowsausertoverifytheidentityofaremotepeerandobtainitscerticate,andhavetakenadvantageofhooksingridsoftwarethatarecalledtoverifyaremotepeersauthenticity.Inotherwords,useraccessislimitedbytheVPNandidentityinsidethegridismaintainedbythatsamecerticate.Thismightnotbepossibleifallusersweresubmittingfromthesameresourcesbutisfeasibleinthesystemsinceeachusersubmitsfromtheirownsystem. 6.1.3VirtualMachinesinGridComputingEarlierwork[ 39 ]advocatedtheuseofvirtualmachines(VMs)ingridcomputingforimprovedsecurityandcustomization.Otherssince[ 7 58 97 ]havebeenestablishedVMsasmeansforsandboxing,thatisenvironmentsthatallowuntrusteduserstouse 139

PAGE 140

trustedresourcesinalimitedfashion.VMsrunasaprocessonasystem,whereprocessesrunninginsidetheVMhavenoaccesstothehostoperatingsystem.Furthermore,VMscanhavelimitedornonetworkingaccessascontrolledbythehost,whicheffectivelysealstheminacageorsandboxprotectingthehostsenvironment.VMsarealsousefulforcustomizationandlegacyapplications,sinceadevelopercanconguretheVMandthendistributeitasanappliance,withtheonlyrequirementontheenduserbeingthattheyhaveaVMsoftwareormanager.Quantitatively,previousworkhasshownthatCPU-boundtasksperformfairlywellrunningwithnomorethan10%overheadandinsomecases0%,whichisthecasewithVMslikeXen.Whilenotadirectcorrelationtogridcomputing,cloudshavebenetedsignicantlyfromVMs.VMsarethemagicbehindcloudinfrastructuresthatprovideIaaS,suchasEC2.Intheseenvironments,usersareabletocreatecustomizedinstances,orpackagedoperatingsystemsandapplications,insideofcloudenvironments,sharewitheachother,anddynamicallycreateorshutdownthemasnecessary.Whiletheapplicationofcloudsisgeneric,itcaneasilybeappliedtowardsgrids.Ausercancreatepushexcessjobsintothecloud,whenthereisoverow,highdemands,ortheuserdoesnotwanttomaintaintheirownhardware.Onechallenge,however,isthedynamiccreationofagridaswellasextensionofanexistinggridusingthecloud,challengesthatareaddressedinthispaper. 6.2ArchitecturalOverviewMyapproachattemptstoreuseasmanyavailablecomponentstodesignagridmiddlewaregenericenoughthatthideascanbeappliedtoothermiddlewarestacks.Asaresult,mycontributioninthischapterandinparticularthissectionfocusesprimarilyonthefollowingkeytasks:makinggridconstructioneasy,supportingdecentralizeduseraccess,sandboxingtheusersenvironment,limitingaccesstothegridtoauthorizedidentities,andensuringpriorityonusersownresources. 140

PAGE 141

Table6-1.Gridmiddlewarecomparison DescriptionScalabilityJobqueue/submis-sionsiteAPIRequirements BoincVolunteercomput-ing,applicationsshipwithBoincandpollheadnodefordatasetsNotexplicitlymen-tioned,limitedbytheabilityofthesched-ulertohandlethedemandsoftheclientEachapplicationhasadifferentsite,noseparationfromjobqueueandsub-missionsiteBoincAPIandmid-dlewarebundlingrequiredBonjourGridDesktopgrid,usezeroconf/BonjourtondavailableresourcesinaLANNoboundstested,limitsincludemulti-castingoverheadsandprocessingpowerofjobqueuenodeEachuserhastheirownjobqueue/submissionsiteNoneCondorHighthroughputcomputing/onde-mand/desktop/etc/generalgridcomputingOver10,0001Globaljobqueue,nolimitonsub-missionsites,submissionsitecommunicatesdi-rectlywithworkernodesOptionalAPItosupportjobmi-grationandcheckpointingPastryGridUsestructuredover-layPastrytoformdecentralizedgridsDecentralized,singlenodelimitedbyitsprocessingpower,thoughcollectivelylimitedbythePastryDHTEachconnectedpeermaintainsitsownjobqueueandsubmissionsiteNonePBS/Torque[ 84 ]Traditionalapproachtodedicatedgridcomputingupto20,000CPUs2GlobaljobqueueandsubmissionsiteNoneSGETraditionalapproachtodedicatedgridcomputingTestedupto63,000coresonalmost4,000hosts3GlobaljobqueueandsubmissionsiteNoneXtremWebDesktopgrid,similartoCondorbutusespullinsteadofpush,likeBoincNotexplicitlymen-tioned,limitedbytheabilityofthesched-ulertohandlethedemandsofclientsGlobaljobqueue,separatesubmis-sionsite,optionallyoneperuserNobuilt-insup-portforsharedlesystems 1 http://www.cs.wisc.edu/condor/CondorWeek2009/condor_presentations/sfiligoi-Condor_WAN_scalability.pdf 2 http://www.clusterresources.com/docs/211 3 http://www.sun.com/offers/docs/Extreme_Scalability_SGE.pdf 141

PAGE 142

6.2.1WebInterfaceandtheCommunityBeforedeployinganysoftwareorconguringanyhardware,agridneedsorganizationincludingcerticatemanagement,gridaccess,useraccountmanagement,anddelegationofresponsibilities.Thesearecomplexquestions,whichcanbechallengingtoaddress,thoughforlessrestrictivesystems,likeacollectionofacademiclabssharingclusters,theymaybeveryeasy.Oneoftheprofessorscouldhandletheinitialauthorizationofalltheotherlabsandthendelegatetothemtheresponsibilityofallowingtheirafliates,suchasstudentsandscholarsaccess.Foracademicenvironments,gridsbecomemorechallengingwhentheprofessororworseyetstudentsmustmaintainthecerticates,handlingcerticaterequests,andplacingsignedcerticatesinthecorrectlocation.Oursolutiontothispotentiallyconfusingareawasagroupinterface,akintosomethinglikeFacebookR'sorGoogleR'sgroups.Albeit,thosetypesofgroupsarenothierarchal,whichisanecessityinordertohavedelegatedresponsibilities.ThusIhaveatwolayerapproach,agridgroupformembersofthegridtrustedbythegridorganizersandusergroupsforthosewhoaretrustedbythoseinthegridgroup.Membersofthegridgroupcancreatetheirownusergroups.Amemberofausergroupcangainaccesstothegridbydownloadinggridcongurationdataavailablewithintheusergroupwebinterface.Thiscongurationdatacomesintheformatofadiskimage,whenaddedtoaGridApplianceVM,itisusedtoobtaintheuser'scredentialsandenablingthemtoconnecttothegrid.Togiveanexample,considerthecomputerarchitecturegrid,Archer.ArcherwasseededinitiallybytheUniversityofFlorida,somygroupandIarethefoundersandmaintainersoftheArchergridgroup.AsnewuniversitiesandindependentresearchershavejoinedArcher,theyrequestaccesstothisgroup.Uponreceivingapproval,theythenneedtoformtheirownusergroupsothattheycanallowotherstoconnecttothegrid.SoatrustedmembermightcreateausergrouptitledArcherforUniversityXandallmembersofuniversityXwillapplyformembershipinthatgroup.Thecreatorcan 142

PAGE 143

Figure6-2.GridAppliancedeploymentscenario makedecisionstoeitheracceptordenytheseusers.Oncetheuserhasaccess,theywilldownloadtheircongurationdataformattedasavirtualdiskimageandtheGridApplianceVMandstarttheVM.AfterstartingtheVM,theuserwillbeconnectedtothegridandabletosubmitandreceivejobs.Joiningiseasy;agridrequiresausertosignontoawebsiteanddownloadacongurationdata,whichcanthenbeusedonmultiplesystems.Tosupportthisprocess,thecongurationdatacontainscryptographicinformationthatfacilitatesacquisitionofasignedcerticatefromthewebinterfacethroughXML-RPCoverHTTPS(ExtensibleMarkupLanguageRemoteProcedureCalloverHypertextTransferProtocolSecure).TheprocessbeginsbyeitherbootingtheGridApplianceorrestartingaGridApplianceservice.Whenstartingtheservicewilldetectifthereisnewcongurationdata,andifthereis,itcontactsthewebinterfacewiththecryptographicinformationandapublickey.Thewebinterfaceveriestheuser'sidentity,retrievestheirprolefromitsdatabaseandbindsthatinformationwiththepublickeytocreateacerticaterequest,whichwillthenbesignedandreturnedtotheuser. 143

PAGE 144

Withapublicwebinterface,Ihavebeenabletocreateavarietycommunities.Oneofparticularinterestisnotthegriditselfbutratherabootstrappingcommunityforgrids.Thewebinterfacehasbeendesignedtosupportmanygridgroups,sotoohastheP2PinfrastructureasitsupportsbootstrappingintouniqueprivateoverlaysforindividualgridsbymeansofBrunet'sabilitytosupportrecursivebootstrapping.Byusingthepublicinterface,usershaveanopportunitytoreuseapublicbootstrapinfrastructureandonlyneedtofocusonthecongurationoftheirVPNandgridservices,whichhasbeentrivializedtoacceptingordenyingusersaccesstoagroupandturningonresources.Wewouldliketonotethatthereisnoneedtomakeanexplicitpublicgridcommunitythroughthewebinterface,sinceallGridAppliancescomewithadefaultcongurationlethatwillconnectthemtoaninsecurepublicgrid. 6.2.2TheOrganizationoftheGridTheprevioussectionfocusedfacilitationofgridcongurationusingthewebinterfaceandskirtedtheissuesofdetailedcongurationandorganization.Thecongurationofthegridmirrorsthatoftheconnectionprocess.ThersttiergroupmapstoacommongridandeachgridmapstoaVPN.Thuswhenausercreatesanewgridgroup,theyareactuallyconguringanewVPN,whichinvolvesaddressrange,securityparameters,useragreements,andthenameofthegroup.Thesystemprovidesdefaultsforaddressrangeandsecurityparameters,souserscanfocusonhighleveldetailsliketheuseragreementandthegrid'sname.Asmentionedearlier,thesecondtierofgroupsenablesmembersinthegridgrouptoprovideaccesstotheircommunity.Itisalsothelocationthatusersdownloadtheircongurationdata.Thecongurationlescomeinthreeavors:submission,worker,ormanager.Workernodesstrictlyrunjobs.Submissionnodescanrunjobsaswellassubmitjobsintothegrid.Managernodesareakintoheadnodes,thosethatmanagetheinteractionbetweenworkerandsubmissionnodes. 144

PAGE 145

WhilethecongurationdetailsarehandledbythewebinterfaceandscriptsinsidetheGridAppliance,organizationofthegrid,morespecicallythelinkingofworkerandsubmissionnodestomanagernodes,reliesontheDHT.ManagersstoretheirIPaddressesintotheDHTatthekeymanagers.Whenworkersandclientsjointhegrid,theyautomaticallyquerythiskey,usingtheresultstoconguretheirgridsoftware.Managerscanalsoquerythiskeytolearnofothermanagerstocoordinatewitheachother. 6.2.2.1SelectingaMiddlewareMygridcompositionislargelybaseduponadesiretosupportadecentralizedenvironment,whilestillretainingreliabilityandlimitingdocumentationsupportefforts.Asthereexistmanymiddlewarestosupportjobsubmissionandscheduling,Isurveyedavailableandestablishedmiddlewaretodeterminehowwelltheymatchedmyrequirements.MyresultsarepresentedinTable 6-1 ,whichcoversmostofthewellestablishedmiddlewareandsomerecentresearchprojectsfocusedondecentralizedorganization.Oftheresourcemanagementmiddlewaressurveyed,IchosetouseCondorasitmatchesclosestwithmygoalsduetoitsdecentralizedpropertiesandfocusondesktopgrids.Condorallowsmultiplesubmissionpoints,anon-trivialobstacleinsomeoftheothersystems.Additionally,addingandremovingresourcesinCondorcanbedonewithoutanycongurationfromthemanagers.Conversely,inSGEandTorque,afterresourceshavebeenaddedintothesystem,theadministratormustmanuallycongurethemanagertocontrolthem.Mostschedulingsoftwareassumesthatresourcesarededicated,whileCondorsupportsopportunisticcycles,bydetectingthepresenceofotherentitiesandwillsuspend,migrate,orterminateajob,thusenablingdesktopgrids.Acommondrawbacktoestablishedmiddlewaresistherequirementofamanagernode;havingnomanagerinanadhocgridwouldbeideal. 145

PAGE 146

6.2.2.2Self-OrganizingCondorWhiletherequirementofacentralmanagermaybeundesirable,theycaneasilyberuninsideaVMandCondorsupportstheabilitytorunmanyinparallelthroughtheuseofocking[ 33 ].Flockingallowssubmissionsitestoconnecttomultiplemanagers.Thisservestwopurposes:1)toprovidetransparentreliabilitybysupportingmultiplemanagersand2)userscansharetheirresourcesthroughtheirownmanager.Flockingallowseachsitetorunitsownmanagerorsharethecommonmanager.TocongureCondor,managerIPaddressesarestoredintotheDHTusingthekeymanagers.JoiningpeersquerytheDHTtoobtainalistofmanagers,selectingonerandomlytouseasitsprimarymanagerwiththeresultusedforocking.Ifthesystemprefersmanagersfromitsgroup,itwillrandomlycontacteachmanagerinanattempttondamatch,selectingoneatrandomifnomatchisfound.Untilamanagerisfound,theprocessrepeatsevery60seconds.Uponndingamanager,thestateofthesystemisveriedevery10minutesandnewmanagersareaddedtotheocklist. 6.2.2.3PuttingItAllTogetherThefollowingsummarizesthecongurationandorganizationofthegrid.Minimallyagridwillconstituteamanager,someworkers,andasubmitter.ReferencingFigure 6-2 step,duringsystemboot,withoutuserinteraction,eachmachinecontactsthegroupwebsitetoobtainavalidVPNcerticate.Whereupon,itconnectstotheP2Poverlaywhosebootstrappeersarelistedinsidethecongurationle,step2.Atwhichpoint,themachinestartstheVPNservicerunningontopoftheP2Poverlay,alsopartofstep.Theself-conguringVPNcreatesatransparentlayerhidingfromtheuserandadministratorsthecomplexityinsettingupacommonfabricthatcanhandlepotentialnetworkdynamics.MachinesautomaticallyobtainauniqueIPaddressandndtheirplaceinsidethegrid.Foramanagermachine,thismeansregisteringintheDHT(notshown),whileclientsandworkerssearchforavailablemanagersbyqueryingtheDHT, 146

PAGE 147

step;IPOPtranslatestheIPtoaP2Paddress,step;andthenclientcontactsthemanagerdirectly,step. 6.2.3SandboxingResourcesAstaskscanrunonworkerandpotentiallysubmissionnodes,Ihavedevisedmeanstosandboxtheenvironmentsthatdonotlimituserinteractionswiththesystem.Whilemoretraditionalapproachestosandboxingemphasizeaseparationbetweenworkerandsubmissionmachine,inactualdeployments,veryfewusersexplicitlydeployworkermachines,mostaresubmissionmachines.ThusIdevelopedsandboxingtechniquestolimittheabilityofsubmittedjobsonsystemsthataresimultaneouslybeingusedforsubmission.Sothesesandboxingtechniqueconsidersmorethanjustlockingdownthemachinebutalsoensuringareasonablelevelofaccess. 6.2.3.1SecuringtheResourcesThecoreofmysandboxingapproachistolimitattackstosoftwareinthesystemandnotpoorlycongureduserspace,suchaspoorlychosenpasswordsorresourcesexternaltotheGridAppliance.Alljobsarerunasasetofpredeneduseridentities.Whenthejobsarenishedexecuting,whetherforciblyshutdownorcompletedsuccessfully,allprocessesfromthatuserareshutdown,preventingmalicioustrojanattacks.Thoseusersonlyhaveaccesstotheworkingdirectoryforthejobandthosewithpermissionforeverybody.Escalationofprivilegeattacksduetopoorpasswordsarepreventedbydisallowinguseofsuorsudofortheseusers.Finally,networkaccessislimitedtotheVPN,thustheyareunabletoperformdenialofserviceattacksontheInternet.Additionally,systemscanbeconguredsuchthattheonlynetworkpresentedtothemisthatofthevirtualnetwork.Tosupportthis,IPOPhasbeenenhancedtosupportaroutermode,whichcanbebridgedtoavirtualmachineadapterrunningonthehostmachinethatconnectstothenetworkdevicerunninginsidetheVM.Notonlydoesthis 147

PAGE 148

improveperformance,duetoreducedI/Ooverhead,thesamevirtualnetworkroutercanbeusedformultipleVMs.Toensurethatsubmitmachinesstillhaveahighleveloffunctionalitywithoutriskingthesystemtoexternalattacksevenfromusersonthesamenetwork,userservicesarerunonlyonahost-onlynetworkdevicewithinthevirtualmachine.ThisincludesanSSHserverandaSambaorWindowsFileShare.Theusernamematchesthatfromthewebsite,whilethepassworddefaultstopassword.IwouldliketonotethatlesharingservicesworktheoppositetothatofhosttoguestasmostVMsalreadyhaveinplace.InsteaduserscanaccesstheirlesontheVMfromthehost.Thiswasdonetolimitpotentialattacksonsubmissionmachine. 6.2.3.2RespectingtheHostAnotheraspectofsandboxingisrespectingtheusageofthehost.WhileCondorcandetecthostusageonamachineitisrunning,whenruninsideaVMitcannotdetectusageonthehost.Thusitisimperativetosupportsuchacongurationotherwisemyapproachwouldbelimitedinthatitcanonlyberunduringidletimes.IntheGridAppliance,thisisaddressedbyrunningalight-weightagentonthehostthatcommunicatestotheVMthroughthesecondEthernetinterface.TheagentdiscoversaVMthroughmulticastservicediscoveryexecutedonlyonhost-onlyvirtualnetworkdevices.Whenauseraccessesthehost,theagentnotiesaserviceintheVM,whichresultsinrunningtasksbeingsuspended,migrated,orterminated.Themachineremainsofflimitsuntiltherehasbeennouseractivityfor10minutes. 6.2.3.3DecentralizedSubmissionofJobsFromtheadministrator'sperspective,notrequiringasubmissionmachineisalsoaformofsandboxing.Maintainingaworkermachinerequiresverylowoverhead,sincejobsandtheirassociatedlesareremoveduponthecompletionofajobandcorruptedworkerscanbedeletedandredeployed.Maintainingasubmissionmachinemeansuseraccounts,networkaccess,providingdatastorage,andtrustinguserstoplaynicelyona 148

PAGE 149

sharedresource.Sohavingusersbeabletosubmitfromtheirownresourcesreducestheoverheadinmanagingagrid.Itdoescomewithaconsequence,mostgridsprovidesharedlesystems,whicharestaticallymountedinallnodes.Inadynamicgridthatmighthavemultipleshares,thistypeofapproachmaynotbeveryfeasible.Allisnotlost,forexample,Condorprovidesdatadistributionmechanismsforsubmittedjobs.Thiscanbeaninconvenience,however,ifonlyaportionoftheleisnecessary,astheentirelemustbedistributedtoeachworker.Thiscanbeparticularlytruewithdiskimagesusedbycomputerarchitecturesimulationsandapplicationsbuiltwithmanymodulesordocumentation.Tosupportsparsedatatransfersandsimplifyaccesstolocaldata,eachGridAppliancehasalocalNFSshareexportedwithread-onlypermission.Toaddresstheissueofmountingalesystem,thereexistsatooltoautomaticallymountlesystems,autofs.autofstoolworksbyinterceptinglesystemcallsinsideaspecicdirectory,parsingthepath,andmountingaremotelesystem.IntheGridAppliance,accessingthepath /mnt/ganfs/hostname ,wherehostnameiseithertheIPaddressorhostnameofanappliance,willautomaticallythatappliance'sNFSexportwithouttheneedforsuper-userintervention.Mountsareautomaticallyunmountedafterasufcientperiodoftimewithoutanyaccesstothemountedlesystem. 6.3DeployingaCampusGridInowpresentacasestudyexploringaqualitativeandquantitativecomparisonindeployingacampusgridandextendingitintotheCloudusingtraditionaltechniquesversusagridconstructedbyGridAppliance.OneofthetargetenvironmentsfortheGridApplianceisresourcesprovidedindistributedcomputerlabsandmanysmalldistributedclustersononeormoreuniversitycampusasshowninFigure 6-3 .Thegoalsinboththesecasesaretousecommoditysoftware,whereavailable,andtoprovideasolutionthatisbothsimplebutcreatesanadequategrid.Inbothcases,Condorischosenasthemiddleware,whichisapushschedulerandbydefaultrequiresthatall 149

PAGE 150

Figure6-3.Acollectionofvariouscomputingresourcesatatypicaluniversity resourcesbeonacommonnetworkthusaVPNwillbeutilized.Additionally,inthissection,IcoverdetailsoftheGridAppliancethatdidnottinthecontextofpreviousdiscussionsinthepaper. 6.3.1BackgroundInthiscasestudy,Iwillcompareandcontrasttheconstructionoftwotypesofgrids:astaticgridconguredbyhandandadynamicgridconguredbytheGridAppliance.EachgridisinitiallyconstructedusingresourcesattheUniversityofFloridaandlaterextendedtoAmazonR'sEC2andFutureGridatIndiaUniversityusingEucalyptus.EachenvironmenthasaNATlimitingsymmetriccommunication:UniversityofFloridaresourcesarebehindtwolayers,rstaniptablesNATandthenaCiscoNAT;EC2resourceshaveasimple1:1NAT;andtheEucalyptusresourcesappeartohaveaniptablesNAT. 150

PAGE 151

6.3.2TraditionalCongurationofaCampusGridAVPNmustbeusedtoconnecttheresourcesduetothelackofnetworksymmetryacrossthesites.ThereexistsawealthofVPNsavailable[ 67 100 120 ]andsomeexplicitlyforgrids[ 53 106 108 ].Forsimplicitysake,OpenVPNRwaschosenduetothesimplicityinitsconguration.Inreality,OpenVPNRmakesapoorchoicebecauseitiscentralized,thusalltrafcbetweensubmitterandworkermusttraversetheVPNsserver.Whereasothersinthelistaredistributedandthusallownodestocommunicatedirectly,butinordertodoso,manualsetupisrequired,aprocess,thatwouldoverwhelmmanynovicegriddeployers.Inallthesecases,theVPNrequiresthatatleastasinglenodehaveapublicaddress,thusIhadtomakeasingleconcessioninthedesignofthisgrid,thatis,theOpenVPNRserverrunsonapublicnode.InordertoconnecttoOpenVPNR,itmustknowtheserver'saddressandhaveasignedcerticate.Whiletypically,mostadministratorswouldwantauniqueprivatekeyforeachmachinejoiningthegrid,inmycasestudyandevaluation,Iavoidedthisprocessandusedacommonkey,certicatepair.Indoingso,therearepotentialdangers,forexample,ifanyofthemachineswerehijacked,thecerticatewouldhavetoberevokedandallmachineswouldberenderedinoperable.Tocreateaproperlysecuredenvironment,eachresourcewouldhavetogenerateorbeprovidedaprivatekey,acerticaterequestsubmittedtothecerticateauthority,andasignedcerticateprovidedtotheresource.Withthenetworkingandsecuritycomponentsinplace,thenextstepisconguringgridmiddleware.Priortodeployinganyresources,themanagermustbeallocatedanditsIPaddressprovidertootherresourcesinthesystem.Submissionpointsarenotafocusonthiscasestudy,thoughingeneralmostsystemsofthisnaturehaveasinglesharedsubmissionsite.Thechallengesinsupportingmultiplesubmissionpointsinthisenvironmentincludecreatingcerticatessameasworkernodes,requiringuserstocongureOpenVPNRandCondor,andhandlingNFSmounts.Whereashavingasingle 151

PAGE 152

submissionpointcreatesmoreworkforthesystemadministratorasmentionedearlier.Bothapproacheshavetheirassociatedcostsandneitheristrivial.Theevaluationassumesasingleusersubmittingfromasingleresource.Toaddresspotentialheterogeneityissues.Anadministratorwouldneedtocollaboratewithotherstoensurethatallresourcesarerunningacommonsetoftoolsandlibraries.Otherwiseanapplicationthatworkswellononeplatformcouldcauseasegmentationfaultonanother,throughnofaultoftheuser,butratherduetolibraryincompatibilities.Toexportthissystemintovariousclouds,anadministratorstartsbyrunninganinstancethatcontainstheirdesiredLinuxdistributionandtheninstallingthegridutilitieslikeCondorandOpenVPNR.Supportingindividualizationoftheresourcesischallenging.Thesimplestapproachistostoreallthecongurationinthatinstanceincludingthesingleprivatekey,certicatepairaswellastheIPaddressofthemanagernode.Alternatively,theadministratorcouldbuildaninfrastructurethatreceivescerticaterequestsandreturnsacerticate.TheIPaddressofthemanagernodeandofthecerticaterequesthandlercouldbeprovidedtothecloudviauserdata,afeaturecommontomostIaaScloudsthatallowsuserstoprovideeithertextorbinarydatathatisavailableviaaprivateURLinsideacloudinstance. 6.3.3GridApplianceinaCampusGridAllthesecongurationissuesareexactlythereasonswhyGridApplianceanditsassociatedgroupWebinterfacearedesirableforsmallandmediumscalegrids.Therstcomponentisdecidingwhichwebinterfacetouse,public( www.grid-appliance.org )orprivatehostedontheirownresources.Similarly,userscandeploytheirownP2Poverlayorusethesharedoverlay.Thewebinterfaceenforcesuniquenamesforboththeusersandthegroups.Oncetheuserhasmembershipinthesecondtierofgroups,theycandownloadalethatwillbeusedtoautomaticallyconguretheirresources.Asmentionedearlier,this 152

PAGE 153

handledobtainingauniquesignedcerticate,connectingtotheVPN,anddiscoveringthemanagerinthegrid.CongurationoftheVPNandgridarehandledseamlessly,theVPNautomaticallyestablishesdirectlinkswithpeersondemandandpeerscongurebaseduponinformationavailableintheP2Poverlaydynamicallyallowingforcongurationchanges.Heterogeneityisaproblemthatwillalwaysexistifindividualsaregivengovernanceoftheirownresources.Ratherthanghtthatprocess,theGridApplianceapproachistoprovideareferencesystemandthenincludethatversionandadditionalprogramsintheresourcedescriptionexportedbyCondor.Thusauserlookingforaspecicapplication,library,orcomputerarchitecturecanspecifythatintheirjobdescription.Additionally,bymeansofthetransparentNFSmounts,userscaneasilycompiletheirownapplicationsandlibrariesandexportthemtoremoteworkernodes.ExtendingtheGridAppliancesystemintothecloudsiseasy.ThesimilaritybetweenaVMapplianceandacloudinstancearestriking.TheonlydifferencefromtheperspectiveoftheGridAppliancesystemiswheretocheckforcongurationdata.OnceauserhascreatedaGridApplianceinacloud,everyoneelsecanreuseitandjustsupplytheircongurationdataastheuserdataduringtheinstantiationoftheinstances.AsIdescribeinSection 6.4.2 ,creatingGridAppliancefromscratchisatrivialprocedure.Asdescribedindetailearlier,anadministratorneedstoinstallthenecessarysoftwareeitherbydeployingVMMsandVMappliancesorinstallingGridAppliancepackagesonDebian/Ubuntusystems.Additionally,thesesystemsneedtobepackagedwiththecongurationlesoroppydiskimages.Atwhichpoint,thesystemswillautomaticallycongureandconnecttothegrid.AnadministratorcanverifythisbymonitoringCondor.Additionalresourcescanbeaddedseamlessly,likewiseresourcescanberemovedbyshuttingthemoffwithoutdirectinteractionwiththeGridApplianceormanagernode. 153

PAGE 154

6.3.4ComparingtheUserExperienceInthecaseofatraditionalgrid,mostuserswillcontacttheadministratorandmakearequestforanaccount.Uponreceivingconrmation,theuserwillhavetheabilitytoSSHintoasubmissionsite.Theirconnectivitytothesystemisinstantaneous,theirjobswillbeginexecutingassoonasitistheirturninthequeue.User'swillmostlikelyhaveaccesstoaglobalNFS.Fromtheuser'sperspective,thetraditionalapproachisveryeasyandstraightforward.WiththeGridAppliance,auserwillobtainanaccountatthewebinterface,downloadaVMandacongurationle,andstarttheVM.Uponbooting,theuserwillbeabletosubmitandreceivejobs.Toaccessthegrid,userscaneitherSSHintothemachineorusetheconsolesintheVM.Whilethereisnosingle,globalNFS,eachuserhastheirownuniqueNFSandmustmaketheirjobsubmissionlescontaintheiruniquepath.Forthemostpart,theuser'sperspectiveoftheGridApplianceapproachhasmuchofthesamefeelasthetraditionalapproach.AlthoughusershaveadditionalfeaturessuchasaccessingtheirlesviaSambaandhavingaportableenvironmentfordoingtheirsoftwaredevelopment. 6.3.5QuantifyingtheExperienceTheevaluationoftheseenvironmentsfocusesonthetimetakentodynamicallyallocatetheresources,connecttothegrid,andsubmitasimplejobtoallresourcesinthegrid.Inbothsystems,asinglemanagerandsubmissionnodewereinstantiatedinseparateVMs.Inthetraditionalsetup,OpenVPNRisrunfromthemanagernode.Eachcomponentintheevaluationwasrunthreetimes.Betweeniterations,thesubmissionnodeandthemanagernodewererestartedtoclearanystate.Thetimesmeasuredincludethetimefromwhenthelastgridresourcewasstartedtothetimeitreportedtothemanagernode,Figure 6-4 ,aswellasthetimerequiredforthesubmitnodetoqueueandruna5minutejobonalltheconnectedworkers,Figure 6-5 .Thepurposeofthesecondtestistomeasurethetimeittakesfora 154

PAGE 155

Figure6-4.Timetoconstructagrid Figure6-5.Timetorunajobonagrid submissionsitetoqueueatasktoallworkers,connecttotheworkers,submitthejob,andtoreceivetheresults;thusastresstestontheVPN'sabilitytodynamicallycreatelinksandverifyingall-to-allconnectivity.Thetestswererunon50resources(virtualmachines/cloudinstances)ineachenvironmentandthenonagridconsistingofall150resourceswith50ateachsite.Intheprevioussection,Iqualiedwhytheapproachwaseasierthanconguringagridbyhand,thoughbydoingsoIintroduceoverheadsrelatedtocongurationandorganization.Theevaluationveriesthattheseoverheadsdonotconictwiththeutilityofmyapproach.NotonlydoresourceswithinaclusterinstalltheVMsandconnecttothegridquickly,thecloudsdoaswell.Whiletheresultsweresimilar,itshouldbenotedthatthetimerequiredtocongurethestaticapproachwasnottakenintoeffect.A 155

PAGE 156

processthatisdifculttomeasureandislargelyreliantontheabilityoftheadministratorandthetoolsused.WhereasthetimefortheGridAppliancedoesincludemanyofthesecomponents.Itshouldbestatedthattheevaluationonlyhasasinglesubmissionnode.Inasystemwithmultiplesubmitters,theOpenVPNRservercouldeasilybecomeabandwidthbottleneckinthesystemasalldatamustpassthroughit,whichcanbeavoidedusingIPOP.Additionally,thecurrentGridAppliancereliesonpollingwithlongdelays,soastonothavenegativeeffectsonthesystem.Eithershrinkingthosetimesormovingtoaneventbasedsystemshouldsignicantlyimprovethespeedatwhichconnectivityoccurs. 6.4LessonsLearnedThissectionhighlightssometheinterestingdevelopmentsandexperiences,wehavehadthatdonottthetopicsdiscussedsofar. 6.4.1DeploymentsAsignicantcomponentofmyexperiencestemsfromthecomputationalgridprovidedbyArcher[ 37 ],anactivegriddeployedforcomputerarchitectureresearch,whichhasbeenonlineforover3years.Archercurrentlyspanssixseeduniversitiescontributingover600CPUsaswellascontributionsandactivitiesfromexternalusers.TheArchergridhasbeenaccessedbyhundredsofstudentsandresearchersfromoveradozeninstitutionssubmittingjobstotalingover500,000hoursofjobexecutioninthepasttwoyearsalone.TheGridAppliancehasalsobeenutilizedbygroupsattheUniversitiesofFlorida,Clemson,Arkansas,andNorthwesternSwitzerlandasatoolforteachinggridcomputing.MeanwhiletheuniversitiesofClemsonandPurdueareusingtheGridAppliance'sVPN(GroupVPN/IPOP)tocreatetheirowngridsystems.Overtime,therehavebeenmanyprivate,small-scalesystemsusingthesharedsystemavailableat www.grid-appliance.org withothergroupsconstructingtheirownindependent 156

PAGE 157

systems.Feedbackfromusersthroughsurveyshaveshownthatnon-expertusersareabletoconnecttothepublicGridappliancepoolinamatterofminutesbysimplydownloadingandbootingaplug-and-playVMimagethatisportableacrossVMware,VirtualBox,andKVM. 6.4.2TowardsUnvirtualizedEnvironmentsBecauseofthedemandsputonArcherintermsofavoidingtheoverheadsofvirtualizationandtheperceivedsimplicityofmanagingphysicalresourcesasopposedtovirtualresourcesrunningontopofaphysicalresources,manyusershaverequestedtheabilitytorunGridAppliancesdirectlyontheirmachine.UnlikecloudswithmachineimagessuchasAMIs(AmazonRMachineImage)orVMappliances,physicalmachinesimagescannotbeeasilyexported.MostphysicalOSinstalledonphysicalmachineswillneedsomesomecustomtailoringtohandleenvironmentspecicissues.Withthisinmind,Imovedawayfromstackablelesystemsandtowardscreatingrepositorieswithinstallablepackages,suchasDEBorRPM.TheimplicationsofpackagesmeanthatuserscaneasilyproduceGridAppliancesfrominstalledsystemsorduringsysteminstallation.WiththeVPNroutermode,mentionedearlier,resourcesinaLANcancommunicatedirectlywitheachotherratherthanthroughtheVPN.Thatmeansiftheyareonagigabitnetwork,theycanfullnetworkspeedsasopposedtobeinglimitedto20%ofthatduetotheVPN,overheadsdiscussedin[ 116 ]. 6.4.3AdvantagesandChallengesoftheCloudIhavehadtheexperienceofdeployingtheGridApplianceonthreedifferentcloudstacks:AmazonR'sEC2[ 5 ],FutureGrid'sEucalyptus[ 76 ],andFutureGrid'sNimbus[ 60 ].Allofthesystems,encounteredsofar,allowfordatatobeuploadedwitheachcloudinstancestarted.TheinstancecanthendownloadthedatafromastaticURLonlyaccessiblefromwithintheinstance,forexample,EC2userdataisaccessibleat http://169.254.169.254/latest/user-data .AGridAppliancecloudinstancescanbeconguredviauser-data,whichisthesamecongurationdatausedasthe 157

PAGE 158

virtualandphysicalmachines,albeitzipcompressed.TheGridApplianceseeksthecongurationdatabyrstcheckingforaphysicaloppydisk,theninspecicdirectory( /opt/grid_appliance/var/floppy.img ),followedbytheEC2/EucalyptusURL,andnallytheNimbusURL.Uponndingaoppyandmountingit,thesystemcontinuesonwithconguration.Cloudshavebeenalsoveryusefulfordebugging.ThoughAmazonRisnotfree,withFutureGrid,gridresearchersnowhavefreeaccesstobothEucalyptusandNimbusclouds.Manybugscanbedifculttoreproduceinsmallsystemtestsorbootingonesystematatime.Bystartingmanyinstancessimultaneously,Ihavebeenabletoquicklyreproduceproblemsandisolatethem,leadingtotimelyresolutions,andvericationofthosexes.Beyondtheuseofextendingintocloudsforon-demandresources,theyarealsoveryconvenientfordebugging.DoingsoonAmazonRthoughisnotfree.Fortunately,gridresearchersnowcanhavefreeaccesstoFutureGridwithbothEucalyptusandNimbusstyleclouds.Ididhavetodosometinkeringtogetthesesystemstowork.First,becausetheuserdataisbinarydataandthecommunicationexchangeusesRPC,whichmayhavedifcultyhandlingbinarydata,itmustbeconvertedtobase64beforetransferringandconvertedbackintobinarydataafterward.EC2handlesthistransparently,ifusingcommand-linetools.Unfortunately,EucalyptusandNimbusdonot,eventhoughEucalyptusissupposedtobecompatiblewithEC2.Furthermore,whenstartinganEC2instance,networkingisimmediatelyavailable,whereaswithEucalyptusandNimbus,networkingoftentimestakesmorethan10secondsafterstartingtobeavailable.Thusastartupscriptmustbepreparedfornetworkingnottobereadyandhenceunabletoimmediatelydownloaduserdata.ThebestapproachtodealwiththisinadistributionindependentmanneristowaituntiltheprimaryEthernetinterface(eth0)hasanIPandthencontinuing. 158

PAGE 159

Figure6-6.GridAppliancestackablelesystem 6.4.4StackedFileSystemsConguringsystemscanbedifcult,whichmakesitimportanttohavetheabilitytosharetheresultingsystemwithothers.Theapproachofactuallycreatingpackagescanbeoverlycomplicatedfornovices.Toaddressthisconcern,theoriginalGridAppliancesupportedabuilt-inmechanismtocreatepackagesthroughastackablelesystemusingcopy-on-write[ 113 ].Inthisenvironment,theVMused3disks:theGridAppliancebaseimage,thesoftwarestackconguredbyus;amodule;andahomedisk.Innormalusage,boththebaseandmoduleimagesaretreatedasread-onlylesystemswithalluserchangestothesystembeingrecordedbythehomeimage,asdepictedinFigure 6-6 .Toupgradethesystem,usersreplacedtheircurrentbaseimagewithanewerone,whilekeepingtheirmoduleandhomedisks.WhilethepurposeofthemodulewastoallowuserstoextendthecongurationoftheGridAppliance.Tocongureamodulethesystemwouldbebootedintodevelopermode,anoptionduringthebootphase,whereonlythebaseandmoduleimagesareincludedinthestackedlesystem.Uponcompletingthechanges,auserwouldrunascriptthatwouldcleanthesystemandprepareitforsharing.Ausercouldthensharetheresultingmoduleimagewithothers. 159

PAGE 160

Issueswiththisapproachmadeitunattractivetocontinueusing.First,thereexistsnokernellevelsupportforstackablelesystems,IhadtoaddUnionFS[ 118 ]tothekernel,addingtheweightofmaintainingakerneluntomyshoulders.WhileFUSE(lesysteminuserspace)solutionsexist,theyrequiremodicationstotheinitialramdisk,whichisreproducedautomaticallyduringtheinstallationofeverynewkernel,furthermore,ourexperiencewiththemsuggeststheyarenotwellsuitedforproductionsystems.Additionally,theapproachwasnotportabletocloudsorphysicalresources.SowhileIhavedeprecatedthefeaturefornow,IseeitasapotentialmeanstoeasilydeveloppackageslikeDEBandRPM. 6.4.5PriorityinOwnedResourcesInArcher,seeduniversitiesshouldhavepriorityontheresourcesattheiruniversity.Similarly,usersshouldhavepriorityontheircontributions.Otherwise,userswillremovetheirresourcesfromthegrid,whentheywantguaranteedaccess.Tosupportuserandgroupbasedpriorities,Condorhasmechanismsthatcanbeenforcedattheserverthatallowforarbitrarymeanstospecifyuserpriorityforaspecicresource.Sothecongurationspeciesthatiftheresource'suserorgroupmatchesthatofthesubmitter,thepriorityishigherthanotherwise.Thisaloneisnotsufcientasmalicioususerscouldeasilytweaktheirusernameorgrouptoobtainpriorityonallresources.Thuswheneverthischeckismadetheuser'sidentityinthesubmissioninformationisveriedagainsttheirP2PVPNcerticate.Failedmatchesarenotscheduledandarestoredinalogatthemanagerfortheadministratortodealwithlater.Tosupportthisbehavior,thefollowingstatementshavebeenaddedtotherespectivesystem'sCondorcongurationle: Theformatforthiscongurationisasfollows:Jobqueue(server):NEGOTIATOR PRE JOB RANK=10*(MY.RANK)Worker: 160

PAGE 161

GROUP RANK=TARGET.Group=?=MY.GroupUSER RANK=TARGET.User=?=My.UserRANK=GROUP RANKjjUSER RANKWorkerandSubmitter:Group=Group'sNameUser=User'sName 6.4.6TiminginVirtualMachinesCertainapplications,particularlylicenseservers,aresensitivetotime.Becauseofthenatureofgrids,thereexistpossibilitiesofhavinguncoordinatedtiming,suchasimproperlyspecifyingthetimezoneornotusinganetworktimeprotocol(NTP)serverWithregardstoVMs,VMWare[ 112 ]suggestssynchronizingwiththehost'stimeandtoavoidusingserviceslikeNTP,whichmayhaveadverseaffectsontiminginsidethevirtualmachine.WhileNTPmighthavesomestrangebehavior,relyingonhosttimemayproduceerraticjumpsintimethatsomesoftwarecannothandle.MyexperiencesrecommendstheuseofNTPtoaddresstheseconcerns,whichhasresolvedmanyissueswithstrangesoftwarebehaviorandfrustrationfromuserswhentheirjobsfailduetobeingunabletoobtainalicenseduetoatimingmismatch. 6.4.7SelectingaVPNIPAddressRangeOnechallengeindeployingaVPNisensuringthattheaddressspacedoesnotoverlapwiththatovertheenvironmentswhereitwillbeused.Ifthereisoverlap,userswillbeunabletoconnecttotheVPN.Doingsowillconfusethenetworkstack,astherewillbetwonetworkinterfacesconnectedtothesameaddressspacebutdifferentnetworks.Aguaranteed,thoughnotnecessarilypracticalsolutionistoruntheresourceonaVMNAToraclusterNATthatdoesnotoverlaptheIPaddressspaceoftheVPN.UsersoftheGridApplianceshouldnothavetoconcernthemselveswiththisissues.PriorworkonthetopicbyAlaRezmeritaetal.[ 85 ]recommendsusingtheexperimentaladdressclassErangingbetween240.0.0.0-255.255.255.254, 161

PAGE 162

unfortunatelythisrequiresLinuxkernelmodications.Withtheamountofbugsandsecurityxesregularlypushedintothekernel,maintainingaforkedkernelrequiresasignicantamountoftime,duplicatingtheworkalreadybeingperformedbytheOSdistributionmaintainers.Thiswouldalsolimittheabilitytoeasilydeployresourcesinphysicalandcloudenvironments.Additionally,usersthatwantedtomultipurposeaphysicalresourcemaynotwanttorunamodiedkernel,whileinmostcloudsetupsthekernelchoiceislimited.Ihavesincemovedtowardsusingthe5.0.0.0-5.255.255.255addressrange.LiketheclassEaddressspaceitisunallocated,butitrequiresnochangestoanyoperatingsystems.TheonlylimitationisthatsomeotherVPNsalsouseit,thusauserwouldnotbeabletoruntwoVPNsonthesameaddressspaceconcurrently.Thisapproachismuchbetterthanprovidingkernelsordealingwithnetworkaddressoverlaps.Interestingly,evenwiththisinplace,westillseesomeGroupVPNsusingaddressrangesinnormalprivatenetworkaddressrangesfortheVPN,like10.0.0.0-10.255.255.255and192.168.0.0-192.168.255.255. 6.4.8AdministratorBackdoorWhilemostadministratorswillagreethatmostproblemsthatusersencounterareself-inicted,therearetimes,whenthesystemisatfault.Debuggingsystemsfaultsinadecentralizedsystemcanbeverytricky,sinceitisverydifculttotrackdownaresourceinordertogaindirectphysicalaccess.Additionally,havingauserbringtheirresourcetoanadministratormaybeprohibitivelycomplicated,astheuserwouldneedtorelocatetheirGridApplianceinstanceandhavenetworkconnectivityinordertoconnecttothegridandshowtheproblemtotheadministrator.Toaddressthisandotherconcernsthatonlyappearafterrunningthesystemforlongperiodsoftime,wehavesuppliedanadministratorbackdoorintoallresourcesbyinstallingourpublicsshkey,thoughusersareinformedofthisandarefreetoremoveitforprivacyconcerns.Intypicalcongurations,thisapproachmightnotbefeasible,butbecausetheGridAppliance 162

PAGE 163

shipswithadecentralizedVPNsupportingall-to-allconnectivity,anyresourceconnectedtotheVPNisaccessibleforremotedebuggingbyanadministrator.Mostusersinvolvedareextremelydelightedwiththeprocessasithasanappearancethatthesystemjustworks. 6.5RelatedWorkExistingworkthatfallsunderthegeneralareaofdesktopgrids/opportunisticcomputingincludeBoinc[ 6 ],BonjourGrid[ 2 ],andPVC[ 85 ].Boinc,usedbymany@homesolutions,focusesonaddingexecutenodeseasy;however,jobsubmissionandmanagementrelyoncentralizationandalltasksmustusetheBoincAPIs.BonjourGridremovestheneedforcentralizationthroughtheuseofmulticastresourcediscovery;theneedforwhichlimitsitsapplicabilitytolocalareanetworks.PVCenablesdistributed,wide-areasystemswithdecentralizedjobsubmissionandexecutionthroughtheuseofVPNs,butreliesoncentralizedVPNandresourcemanagement.Eachapproachaddressesauniquechallengeingridcomputing,butnoneaddressesthechallengepresentedasawhole:easilyconstructingdistributed,cross-domaingrids.ChallengesthatIconsiderinthedesignofmysystemincludeallowingsubmissionsitestoexistanywherewithoutbeingconnedtocomplexcongurationorhighlyavailable,centralizedlocations;theabilitytodynamicallyaddandremoveresourcesbystartingandstoppingaaresource;andthesharingofcommonserverssothatnogroupinthegridisdependentonanother.Iemphasizethesepoints,whilestillretainingtheeaseofuseofBoinc,theconnectivityofPVC,andtheexibilityofBonjourGrid.TheendresultisasystemsimilartoOurGrid[ 8 ];however,OurGridrequiresmanualcongurationofthegridandnetworkingamongstsites,administrationofuserswithinasite,andlimitsnetworkconnectivityamongstresources,whereasGridAppliancetransparentlyhandlestheseissueswithaP2PoverlayandVPNtohandlenetworkconstraintsandsupportnetworksandboxingandawebinterfacetocongureandmanagethegrid. 163

PAGE 164

Withregardstoclouds,thereexistscontextualization[ 59 ].UsersconstructanXMLcongurationlethatdescribeshowacloudinstanceshouldbeconguredandprovidethistoabroker.Duringbootingofacloudinstance,itwillcontactathird-partycontextualizationbrokertoreceivethisleandcongurethesystem.ThisapproachhasbeenleveragedtocreatedynamicgridsinsidetheNimbuscloud[ 51 ].WhilethisapproachcanreproducesimilarfeaturesoftheGridAppliance,suchascreatinggridsinsidethecloud,therearechallengesinaddressingcloudbursting,automatedsigningofcerticates,andcollaborationamongstdisparategroups. 164

PAGE 165

CHAPTER7SOCIALPROFILEOVERLAYSOnlinesocialnetworkinghasbecomepervasiveindailylife,thoughassocialnetworksgrowsodoesthewealthofpersonalinformationthattheystore.Onceinformationhasbeenreleasedonasocialnetwork,knownasauser'sprole,theuserandthedataareatthemercyofthetermsdictatedbythesocialnetworkinfrastructure,whichtodayistypicallythird-party,centrallyowned.Ifthesocialnetworkengagesinactivitiesdisagreeabletotheuser,duetochangeoftermsoropt-outprogramsnotwellunderstoodbyuserssuchasrecentissueswithFacebookR'sBeaconprogram[ 77 ],theoptionspresentedtotheuserarelimited.Theoptionsincludeleavingthesocialnetwork,surrenderingtheiridentityandfeaturesprovidedbythesocialnetwork;acceptingthedisagreeableactivities;ortopetitionandhopethatthesocialnetworkchangesitsbehavior.Astheuseofsocialnetworkingexpandstobecometheprimarywayinwhichuserscommunicateandexpresstheiridentityamongsttheirpeers,theusersbecomemoredependentonthepoliciesofsocialnetworkinfrastructureowners.Recentwork[ 15 ]exploresthecouplingbetweensocialnetworksandP2P(peer-to-peer)systemsasameanstoreturnownershiptotheusers,notingthatasocialnetworkmadeupofsociallinksisinherentlyaP2Psystemwiththeasidethattheyarecurrentlydevelopedontopofcentralizedsystems.Thischapterextendsthisideawithfocusonthetopicoftopology;thatis,howtoorganizesocialprolesthatleveragethebenetsofferedbyastructuredP2Poverlayabstraction.StructuredP2Poverlaysprovideascalable,resilient,autonomicplatformfordistributedapplications.Structuredoverlaysenableuserstoeasilycreatetheirowndecentralizedsystemsforthepurposeofdatasharing,interactiveactivities,andothernetworking-enabledactivities.Thischapterisbaseduponmypreviouswork[ 115 117 ]discussedinChapters 3 and 4 toenablesocialnetworkproleoverlays.Theseworks 165

PAGE 166

addressthechallengesofbootstrappingsecure,privateoverlaysinenvironmentsconstrainedbynetworkaddresstranslators(NATs)andrewallsthroughapublicoverlayusedfordiscoveryandasarelayorcommunicationtransport.Atypicalsocialnetworkconsistsofusersandgroups.Eachuserhasaprole,asetoffriends,andtheabilitytosendandreceiveprivatemessages;eachgroupconsistsofoneormoremanagers,users,andamessagingboard.Prolescontainuser'spersonalinformation,statusupdates,andpublicconversations,similartoamessageboard.Friendsareindividualstrustedsufcientlybyausertoviewtheuser'sprole.Privatemessagingsendsmessagesdiscretelybetweenuserswithoutleakingthemessagetoothermembers.Groupshavesimilarfeatures,thoughidentityissharedbymanyusers.Usingthissocialnetworkingmodel,IhavedesignedOverSoc.OverSocusesapublicoverlayasadirectoryforndingandbefriendingpeersorndingandaccessinggroups.Oncegroupandproleaccesshasbeenoffered,thepublicoverlaycanbeusedtobootstrapconnectivitytoexistingproleandgroupoverlays.Securityforaproleisprovidedbyapublickeyinfrastructure(PKI),whereproleownersorgroupmanagersarethecerticateauthorities(CA)andallmembershavesignedcerticates.Theoverlaystoresproledataorgroupinformationinitsdistributeddatastore,supportingdecentralizedaccessusingscalablemechanismsregardlessoftheproleowner'sonlinepresence.Inthischapter,Ipresentthearchitectureoftheseoverlays,aspresentedinFigure 7-1 .AlicehasafriendshipwithBobandCarol,hencebotharemembersofherproleoverlay.BobhasafriendshipwithAliceandDavebutnotCarol;henceAliceandDavearemembersofhisproleoverlay,whileCarolisnot.Eachpeerhasmanyoverlaymembershipsbutasinglerootrepresentedbydashedlinesinvariousshadesofgray.Forclarity,overlayshortcutconnectionsarenotshown.Therestofthischapterisorganizedasfollows.Section 7.1 discussesrelatedwork.Section 7.2 describesOverSoc,explaininghowtomapsocialnetworksontostructured 166

PAGE 167

Figure7-1.AnexampleOverSocsocialoverlaynetwork P2Poverlays.Section 7.3 expressesexpectationsforuserinteractioninthesystem.InSection 7.4 ,Iexploresomeoftheremainingchallengesintroducedbythisapproach. 7.1RelatedWorksBucheggeretal.[ 16 ]describehowtouseaDHT(distributedhashtable)tostoresocialnetworkingprole.TheDHTprovideslook-upservicesforstoringmeta-datapertainingtoapeer'sprole.PeersquerytheDHTforupdatedcontentfromtheirfriendsbyhashingtheiruniqueidentiers(e.g.friends'emailaddresses).Theretrievedmeta-datacontainsinformationforobtainingtheproledatasuchasIPaddressandleversion.TheirworkreliesonaPKIsystemthatprovidesidentication,encryption,andaccesscontrol.Incontrast,OverSocmapsindividualuserprolesandgroupstoaprivateoverlaysecuredbypoint-to-pointencryptionandauthenticationamongstallpeersintheoverlay.Theprivateoverlayprovidesacleanabstractionofaccesscontrol,wherebyonceadmittedtoaprivateoverlay,userscanaccessadistributeddatastorewhichholdsthecontentsoftheowner'sprole. 167

PAGE 168

Shakimovetal.[ 98 ]takeadifferentapproachbydependingonvirtualindividualservers(VIS)hostedonacloudinfrastructuresuchasAmazonREC2.Friendscontacteachother'sVISdirectlyforupdates.ADHTisusedasadirectoryforgroupsandinterest-basedsearches.Theirapproachassumesbidirectionalend-to-endconnectivitybetweeneachVIS,whereaproleisonlyavailableduringtheuptimeoftheVIS.Becauseofthedemandsonnetworkconnectivityanduptime,theapproachassumesacloud-hostedVISandhasdifcultybeingusedonuser-ownedresources.OverSocallowspeerstohaveasymmetricconnectivityanddoesnotrequireconstantuptimethroughtheuseofNATtraversalsupportandtheabilitytostoretheproleintheoverlay'sdistributeddatastore.TheapproachpresentedbyCutilloetal.[ 26 ]reliesonacentralsystemtohostidentitiesandcerticatesthatcanthenbeusedtoqueryaDHTtodiscoveraninitialhopinaroutetoaspecicpeerthroughtheircircleoffriends.Thecircleoffriendsconsistsofanunstructuredoverlay,wheredirectfriendsmaintaindirectconnectionswiththepeer,andoutercirclesconsistoffriendsoffriendsandfriendsoffriendsoffriends.Themaingoalofthisworkistoremovetheprivatecomponentsofaprolefromacentralentity,whereasOverSocmakesacleanbreakfromallcentralizationandenablesscalabilitythroughdistributedreplicationtechniques.Unliketheaboveapproaches,theP2PsocialnetworkpresentedbyAbbasetal.[ 1 ]usesanunstructuredoverlaywithoutaDHTwherepeersconnectdirectlytoeachotherratherthanthroughtheoverlayestablishinguniqueidentierstodealwithdynamicIPs.Peerscacheeachother'sdatatoimproveavailability,whilehelpernodesareusedtoassistwithcommunicationbetweenpeersbehindNATs.Theapproachlackssecurityandaccesscontrolconsiderationsandlackstheguaranteesandthesimplicityoftheabstractionofferedbyastructuredoverlay. 168

PAGE 169

Figure7-2.AlicerequestsandreceivesafriendshipfromBob 7.2SocialOverlaysInthissection,IexplainhowOverSocmapsonlinesocialnetworkingtovirtualprivateoverlaysconsistingofapublicdirectoryoverlaywithmanyprivateproleoverlays.Thedirectoryoverlaysupportsfrienddiscoveryandvericationandstoresalistsofpeerscurrentlyactiveineachproleoverlay.Proleoverlayssupportmessageboards,privatemessages,andmediasharing. 7.2.1FindingFriendsInatraditionalsocialnetwork,directoriesareusedtosearchforusersbaseduponpublicinformation,suchastheuser'sfullname,userID(identication),e-mailaddress,groupafliations,andfriends.Theresultingsearchreturnszeroormorematchingdirectoryentries.InOverSoc,directoryentriesareinsertedintotheDHTofapublicoverlay.Sincethepublicinformationhasmanycomponents,varioussubsetsformDHTkeysthatallpointtoacommon,completelistingofthematchingpublicinformation.Forexample,ausercanstoreapointerattheDHTkeyhash("alice")orhash("alicebob").Thekeyhereisthatanysubsetoftheuser'spublicinformationinlower-caseformatcan 169

PAGE 170

behashedintoaDHTindexthatwouldeventuallydirectthesearchingusertooneormoreusers'publicinformation.Moreexplicitsearchescouldsiftthroughtheresultsandpresenttotheuseronlythosepeersmatchingallthesearchparameters.Theamountofinformationsharedpubliclyshouldbecongurablebytheuser.Whilelookingforanindividual,apeermaydiscoverthatmanyindividualshaveoverlappingpublicinformationcomponents,suchastheuser'sname.Assumingallentriesarelegitimate,theoverlaymusthavesomemethodofsupportingmultiple,distinctvaluesatthesamekey,requiringtheapplicationandusertoparsetheresponsesanddeterminethebestmatchbyreviewingthecontentsofeachcerticate.Alternatively,atechniquelikeSword[ 3 ],whichsupportsattributebasedsearching,couldbeusedtoefcientlyndpeersinanoverlay.Toaddresstrustlevelswhensearchingforfriends,aPGP(prettygoodprivacy)certicatecanbeusedtostoreuser'spublicinformationandverifyuser'sfriendsandgroups.InOverSoc,themainportionofaPGPcerticatecontainsinformationsuchasusername,fullname,e-mailaddress,potentiallyotheruser-deneddata,andsignaturepacketsfromtheuserandthosethattrustthecerticateincludinggroupsandindividuals.Thesesignaturepacketsrepresentalistofveriablefriendsandgroupsassistingtofurtheruniquelyidentifyauser.Eachtimeauserbefriendssomeone,theyshouldexchangesignaturepacketscontainingataminimumthefriend'sPGPcerticateID,asignatureexpirationtime,andasignaturebindingthisinformationwiththenewfriend'sexistingPGPcerticate.Thisincreasesthetrustlevelofindividualssearchingforothersespeciallyiftheyhavecommonfriendshipsorgroupmembership.Theuseofatimestampinthesignatureassistsindecidingwhetherornotafriendshiplinkisstillactivewithoutaccessingtheproleoverlayofeitherpeers.Thuspeersthatmaintainfriendshipsneedtoperiodicallyexchangesignaturepackets. 170

PAGE 171

Figure7-3.Alice,alreadyafriendofBob,connectstohissocialoverlay 7.2.2MakingFriendsInthisexample,AlicebecomesfriendswithBob,asillustratedinFigure 7-2 .Onceauser,Alice,hasfoundafriendcandidate,Bob,AlicecanissueafriendshiprequestandstoreitintheDHTusingthehashofBob'scerticateasanindex,thisactsapublicoverlaymailbox.BobcanreviewthepublicinformationofAlicepriortomakingadecision.IfBobacceptstherequest,AliceandBobexchangesignaturepacketsandaregrantedaccesstoeachother'sproles.Onceproleaccesshasbeenenabled,theAliceandBobcanlearnmoreinformation,andifitturnsouttobeamistake,eitheroneofthemcanunilaterallyendtherelationship.Alice'sfriendshiprequestshouldcontainapointertohercerticateintheoverlay,atimestamp,andBob'scerticateidentier.ThefriendshiprequestisencryptedusingBob'spublickeyandsignedusingAlice'sprivatekeyforthepurposesofanonymityandauthenticity.WhenBobreceivesthefriendshiprequest,hecanverifythattherequestwasmadeforBobbyAlice.Uponreceivingthefriendshiprequest,hehasthreechoices:aconditionalaccept,anunconditionalaccept,orareject.Duringanunconditionalaccept,BobsignsAlice'sPGPcerticateandissuesarequesttobefriend 171

PAGE 172

her.Alternatively,hecouldissuearequesttobefriendherandwaitforhertosignhiscerticateandinvestigatesherprolepriortosigninghers.Discoveryofauserisnotlimitedtothedirectoryentries.Becauseusershaveapublicoverlaybasedmailbox,theyarenotrequiredtodiscovereachotheronlythroughthedirectory.Instead,theycanuseoutofbanddiscovery,usingmechanismslikee-mail,chat,orpersonalwebsitestoexchangecerticates.Onceapeerhasreceivedanotherpeer'scerticate,theycansubmitsecurefriendshiprequestsusingthepublicoverlay.Infact,thissortofsystemcanleveragethetrustestablishedbyanexistingsocialnetworktosignandexchangeOverSoc'scerticates. 7.2.3TheProleOverlayInatraditionalsocialnetwork,theproleoruser-centricportionconsistsofprivatemessaging,datasharing,friendshipmaintenance,andapublicmessageboardforstatusupdatesorpublicmessages.Inthissection,weexplainhowthesecomponentscanbeappliedtoastructuredoverlaydedicatedtoanindividualprole.Usingthetechniquessuchasthosedescribedearlier,itisfeasibletoefcientlymultiplexaP2Psystemacrossmultiple,virtualprivateoverlaysenablingeachproleownertohaveaproleoverlayconsistingoftheironlinefriends.Foraccesscontrol,OverSocemployspoint-to-pointencryptionandauthentication,peersbootstrapprivateconnectionsbyexchangingthebaseofthePGPcerticateandtheproleoverlayssignaturepacketobtainedinthemakingfriendsstage.BecausetheproleowneralsoistheCA,controlofwhichcouldbedistributedacrosstheusersresources,forallmembersoftheoverlay,theycaneasilyrevokeusersfromaccesstotheproleoverlay.Chapter 4 describesefcientmechanismsforoverlayrevocationthroughtheuseofbroadcastingforimmediaterevocationandtheuseofDHTforindirectandpermanentrevocation.Themessageboardofaprolecanbestoredintwoways:distributedwithintheproleoverlayviaadatastoreorstoredontheproleowner'spersonalcomputing 172

PAGE 173

devices.Thedistributeddatastoreprovidestheprolewhentheownerisofineandalsodistributestheloadforpopularproles.Forhigheravailability,eachpeeralwaysstoresandprovidesalldataintheirprolewhentheyareonline.Toensureauthenticityandintegrity,peerssigntheirmessagesandeachpeer'scerticateisavailableintheoverlayaswellasstoredbymutualfriendsforverication.Messagesthatareunsignedareignoredbyallmembersoftheoverlay.Anidealoverlayforthispurposeshouldsupportcomplexqueries[ 50 ]allowingeasyaccesstodatastoredchronologically,bycontent,bytype,i.e.,media,statusupdates,ormessageboarddiscussions.Privatemessagingintheproleoverlayisunidirectional;onlytheproleownercanreceiveprivatemessagesusingtheiroverlay.Toenforcethis,aprivatemessageshouldbeprependedwithasymmetrickeyencryptedbytheproleownerspublickey,themessageshouldbeappendedbyasignatureofthemessageusingtheprivatekeyofthemessagesender,andtheentiremessageencryptedbythesymmetrickey.Thisapproachensuresthatonlythesenderandtheproleownercandecrypttheprivatemessageandverifythesendersidentity.Thecontentsoftheprivatemessageincludethesender,timesent,andthesubject.MessagesarebestoredinwellknownlocationsintheDHT,likeprivatemessagesforme,sothattheproleownercaneitherpollthelocation. 7.2.4EventBasedMessageNoticationBoththedirectoryandproleoverlayshavemethodsbywhichpeerscanreceivemessages.Inthedirectoryoverlay,thesetakeformbymeansoffriendshiprequestsandfriendshipaccepts,certicatesignaturepackets.Theproleoverlaysupportsprivatemessages.WhilepollingthelocationintheDHToccasionallywillallowpeerstoreceivethemessages,pollinghasinherentdelaysandnetworkcosts.Alternatively,eventenablepeerstoreceivesentmessagesveryquicklyaftertheyhavebeensentwithminimalimpactonnetworkthroughput. 173

PAGE 174

AsimplemethodforimplementinganeventnoticationsysteminvolvesusingtheDHT.Eacheventwouldhaveanidenticationthatwouldmaptoalistofpeerswantingtoknowwhenaneventoccurredandthedataassociatedwithit.Thusmappingthe(eventid,listener)totheDHTcouldbedonebyhashingastringsuchasprivatemessagesformeortakingahashoftheuser'scerticatehashforpublicoverlaymessagesandstoringtheproleownersactivenodesintothelistoflisteners.Whenamessageisinsertedintotheuser'smailbox,thesendercouldquerythislistandsendtoeachlisteneranoticationofthenewprivatemessage.Alternatively,ifahigherdegreeofanonymityisrequired,theDHTservercouldbemodiedtoforwardtheresponsetothelistenersdirectlyratherthanreturningalistoflisteners.Ofcourse,thisdoesnotpreventpotentialraceconditionsoccurring,suchasasituationwhereapeerrecentlyjoinedtheirproleoverlay,hadalreadyqueriedtheirmailboxandfounditempty,whilesimultaneouslyaprivatemessagewassenttothemyettheywerenotinthelistenerslist.Thusoccasionalpollingisrequired,thoughcanbeminimized,thelongeranodehasbeenonline. 7.2.5ActivePeersThedirectoryoverlayshouldbeusedtoassistinndingcurrentlyactivepeersintheproleoverlays.ByplacingtheirnodeIDsatawell-known,uniqueper-proleoverlaykeysintheDHT,activepeerscanbootstrapincomingpeersintotheproleoverlay.IimplementedandevaluatedthisconceptinChapter 4 .BecausetheproleoverlaymembersallusePKItoensuremembership,evenifmaliciouspeersinserttheirIDintotheactivelist,itwouldbeuselessasthepeerwouldonlyformconnectionswithpeerswhoalsohaveasignedcerticate.Extendingfromtheearlierexample,whereAlicebecameBob'sfriends,Figure 7-3 presentsindetailhowshewouldjoinhisprivateoverlay. 174

PAGE 175

7.2.6GroupsGroupscanbeconsideredextensionsofproleoverlays.Thefundamentaldifferencebetweenagroupandaproleisthatagrouplacksprivatemessagingandhassharedownership.Sojustasapeercanndaproleinthedirectorybyhashingthenameoftheuserandotheridentiableinformation,socantheuserndthegroup.Likethecerticateoftheuser,themembersofagroupsignthegroup'scerticatetorepresenttheirmembershiptothatgroup.InOverSoc,usersrequestmembershiptothegroupliketheydofriendshiprequests,inresponseagroupmanagercansigntheircerticateallowingthatmemberaccesstothegroup.Finally,thegroupcanbebootstrappedinthesamewayastheproleoverlaythroughthedirectoryoverlay.TheuniquechallengepresentedbygroupsisthesharingoftheCAtask.AdecentralizedsolutionwouldbeforallmembersofthegrouptobelistedinthegroupsDHTandwhenapeerbecomesamanager,theyobtainanewsignaturepacketthatcontainsauser-denedcomponentstatingthattheyaremanagers.Ifanadministratorlosestheirposition,thenallmemberswhohadtheircerticatesignedbythatadministratorwouldneedtoobtainanewcerticate.Toavoidmemberchurn,theownercouldprovidesignaturepacketsforallgroupmembers.Thusthemanagersjustallowtemporaryaccessuntiltheownercomesonlineandprovidesmorepermanentaccess. 7.3UserInteractionOverSocconsistsofmanycomponentsthataretransparenttotheuser,theuserexperienceshouldappeartotheusernodifferentlythananexistingonlinesocialnetwork.TheOverSoccouldbeadownloadableapplicationorabrowserbasedFlashorSilverlightapplication.Iftheuser,Bob,hadalreadycreatedanaccount,Bobwouldbepresentedwithaninterfaceshowingtheirfriendsproles.BaseduponBob'sconguration,thesocialapplicationcouldretrieveproleupdatesashenavigates 175

PAGE 176

toindividualprolesorassoonastheapplicationjoinsanindividualproleoverlay,reactiveversusproactiveprolequerying.IfthiswasBob'srsttimestartingOverSoc,hewouldbepresentedwithscreensaskingforhisprivacypreferences,suchaswhetherornothewantshisinformationinthedirectoryoverlay,ifhefeltcomfortableenoughwiththeideaofpeopleknowinghewasamemberofthesocialnetworkandwhohisfriendsare.ThenOverSocwouldaskforpersonalinformationtopopulatehisproleandtogeneratehisdirectoryinformation.Atwhichpoint,theOverSocwouldjointheoverlayandcreateBob'sprivateoverlay.Bobcouldthenstartsearchingforfriends,makefriendrequests,andrespondtofriendrequests.Recently,BobhadbeenthinkingabouthishighschooldaysandwascuriousifAlicewasalsoamemberofOverSoc,thoughBobdidnothaveAlice'se-mailaddress,justherrstandlastname.BobentersAlice'snameintotheOverSocsearchboxandispresentedbyalistofAlice's.AsBobreviewseachoftheentries,herecognizesanAlicethatisfriend'swithsomeofthesamepeopleBobwasinhighschool.Bobselectstobecomeherfriend.Atwhichpoint,theOverSoctransparentlyinsertsafriendshiprequesttoAliceandsignsAlice'scerticatesoAlicecanviewBob'sprole.OfcoursethatisbecauseBobhaschosentoallowuser-initiatedfriendrequestsaccesstohisprole.AlicereceivesBob'srequest,peruseshisproleandfeelsnebecomingfriendswithBob,whichinitiatesatransparentprocessofsigningBob'scerticateandplacingtheresultinthepublicoverlay.Thereisoneproblemthough,whenBobreceivesAlice'ssignatureandviewsherprole,herealizesthatthisissomeotherAlice.Hequicklychoosestodefriendher.ThiscausesBob'sOverSocinstancetobroadcastarevocationforAlice'ssignatureandtostoretherevocationintheDHT.Alice,whowasviewingBob'sprole,isnotiedofthissuddenlossoftrustandwhilesheisabletoviewthecontentsofBob'sprole,whichshehasalreadyaccessedandobtained,shecannolongerreceiveupdatesasmembersofBob'soverlaypreventherfromaccessingit. 176

PAGE 177

Inanotherinstance,BobbumpedintoCarol,whoe-mailedBobacopyofhercerticate.BobpointsOverSoctothecerticate,andOverSocveriesthathewantstobecomefriendswiththeidentityassociatedwiththecerticate.Whenheaccepts,OverSocimmediatelysubmitsarequesttobecomeCarol'sfriend.CarolreceivesnoticationandacceptsBob'sfriendshiprequest.Atthispoint,bothBobandCarolhavetransparentlyexchangedsignedcerticatesandhavemutualaccesstoeachotherproles.AsBobreadsCarol'slatestnews,heremembersafunnypersonalstoryandthathewouldliketosharewithCarol.SohesendsCarolaprivatemessage.Carolisofinethough.ThenexttimeCarolgoesonline,hersocialapplicationdiscoversthemessageandpresentsittoher.Inthisscenario,OverSochastakentheprivatemessage,secureditwithherpublickeyandasymmetrickeyandsigneditwithhisprivatekey.Afterwhich,itinsertsthemessageintotheDHTandsendsanoticetotheeventnoticationsystem,whichdetectsthattherewerenolisteners.WhenCarol'sapplicationcomesonline,itqueriestheDHTreceivingthemessage.PriortopresentingCarolthemessage,theOverSocdecryptsandveriesthemessage.TheOverSocarchitecturecanleverageexistingsocialnetworkstobootstraptrust.Forexample,considerBobandDavidaretwofriendsonFacebookR.BobjoinsaFacebookRapplicationcalledOverSoc/FacebookRBridge,whichstoresacopyofhisOverSoccerticateinhispersonalprole.BobhasbeenbraggingtoDavidaboutOverSocandmentionstohimhoweasyitistomigratefromFacebookRtoOverSocusingthisapplication.SoDavidjoinsOverSocaswellastheapplication.WhenDavidaccessestheapplication,itpasteshiscerticatetohisprole,notiesnotieshimthathehasafriendalreadyusingit,Bob,andthathecanimmediatelysignBob'scerticate,andleavesarequestforBobtosignhiscerticate.Additionally,whenDavidlogsintoOverSoc,hecanleaveafriendrequestthereaswell,sothatthenexttimeBobaccessesFacebookRorOverSoc,hewillreceiveDavid'srequestandcansignDavid'scerticate.Atwhichpoint,bothwillhaveaccesstoeachothersOverSocproleoverlays. 177

PAGE 178

7.4ChallengesWhilestructuredP2Poverlayshavebeenwell-studiedinavarietyofapplications,theiruseinsocialproleoverlaysraisesnewinterestingquestions,including:Handlingsmalloverlaynetworks-P2Poverlayresearchtypicallyfocusesonnetworkslargerthanthetypicaluser'sfriendcount(FacebookR'saverageis1301).Becausesocialproleoverlaysarecomparativelysmaller,thiscanimpactthereliabilityoftheoverlayandavailabilityofproledata.Ausercanhosttheirownprole;howeverwhentheuserisdisconnecteditisimportantthattheirproleremainsavailableevenunderchurn.Itisthusimportanttocharacterizechurninthisapplicationtounderstandhowtobestapproachthisproblem.Anoptionalofper-userdeploymentofavirtualindividualserver(VIS)andtheuseofreplicationschemesawareofauser'sresourcesprovidepossibledirectionstoaddressthisissue.Overlaysupportforlowthroughput,unconnecteddevices-devicessuchassmartphonescannotconstantlybeactivelyconnectedtotheoverlayandtheconnectiontimenecessarytoretrievesomethinglikeaphonenumbermaybetoomuchtomakethisapproachuseful.Similartothepreviouschallenge,thisapproachcouldbenetfromusingaVISenablingusersaccesstotheirsocialoverlaysbyproxywithoutestablishingadirectconnectiontotheoverlaynetwork.Reliabilityofthedirectoryandproleoverlay-Overlaysaresusceptibletoattacksthatcannullifytheirusefulness.Whiletheproleoverlaydoeshavepoint-to-pointsecurity,inthepublic,directoryoverlay,thelackofanyformcentralizationmakespolicingthesystemacomplicatedprocedure.Whiletheapproachofappendingfriendslistcanassistusersinmakingdecisionsonidentity,itdoesnotprotectagainstdenialofserviceattacks.Forexample,userscouldattemptcreatemanysimilaridentitiesinanattempttooverwhelmauserintheirattempttondaspecicpeer. 1http://www.facebook.com/press/info.php?statistics 178

PAGE 179

Previousworkhasproposedmethodstoensuretheusabilityofoverlaysevenwhileunderattack.Forthesocialoverlaytobesuccessful,onemustidentifywhichmethodsshouldbeused.Apossibleapproachistoreplicatepublicinformationwithinauser'sproleoverlaythusprovidinganalternativedirectoryoverlayforqueryingpriortousingthepublicdirectoryoverlay.Socialproledatastorage-Inpreviousworks,DHTshavebeenusedasthebuildingblockstoformmorecomplexdistributeddatastoresaspresentedinPast[ 95 ]andKosha[ 17 ].Applicationofdatastoreswillbeheavilydependentonthechurnrateassociatedwiththeoverlay.Ifthesystemlacksanyreasonablystablemembership,largedatalesmaybecorruptedwhilesmallerdatasetsarecompletelylost.Ideally,theusagemodelwouldbesimilartothoseofSkypeRandTwitter,whichhaveactiveprocessesforthedurationofthecomputersusage.Inanenvironmentlikethis,datastoragewouldbelimitedonlybytheavailablebandwidthoftheparticipants. 179

PAGE 180

CHAPTER8CONCLUSIONSThisworkbringssignicantadvancestotheusabilityofVPNsthroughunderstandingimportantpracticalapplicationsandvericationinbothsimulationandrealdeployments.ThearchitectureexploredhereinprovidesageneralframeworkforcreatingVPNsthathascontributedtovariousend-pointandoverlaycongurationsusefulforbothlargeandsmallscaledeploymentsforgrouporpersonaluse.Inordertosupportacompletelyad-hoc,decentralizedVPN,usersbeginbyconnectingtoapublicoverlay,suchasXMPPorKademlia,inordertodiscoverotherusers.Afterexchangingtheirinformationviathesemediums,peerscanestablishdirectcommunicationlinkswitheachotherandwithotherpeersalreadyintheVPN.Peerscanexchangeorobtaintrustedidentitiesusingestablishedpeerorgrouprelationshipsinexistingsocialnetworks.BecausemostpeersarebehindNATsandrewalls,thisdissertationcoversmethods,whichallowpeerstouseexistingoverlaystobootstrapthroughNATsandrewalls.Thiscanmeanusingathird-partysystemuntilapeeronapublicIPaddresscomesonline,ormorelikely,usingaservicetoobtainapublicIPandportmappingforthepeer'sprivateaddress.Whenpeerscannotdirectlyestablishdirectlinks,theoverlaystillprovidestheabilitytoroutemessagesbetweenpeers.Routingmessagesacrosstheoverlaycanincursignicantoverheadandisnotoptimizedforanyspecicpurpose.Toremedythis,Ihaveestablishedamechanismtocreatetwo-hoplinksbetweenpeersemphasizedonthelatencybetweenthepeers.ThisworkdescribestwonovelmechanismsforhandlingaddressassignmentsinsideaVPN:usingaDHTwithatomiccapabilitiesaswellasindependentnetworkswithexplicitlinksbaseduponsocialconnections.TheDHTapproachallowsforhighlyscalablesystemsincomparisontootherapproachesthatrequirestatetobemanuallyspreadacrossthesystemorthroughtheuseofbroadcastmechanisms.Bymakingthe 180

PAGE 181

networkaddressesdependentonsociallinksandindependentoftheactualoverlay,peersneednotworryaboutaddresscollisions.Furthermore,thisworkexploresmeanstotransparentlymigrateresourcesusingDHTstyleaddressingevenwhenthoseresourcesareconnectedviatheVPNrouter.ExistingapproachestoVPNplacementuseeitherinterfaceorroutermodels.Interfacemodelscaneasilybeconstructedtobetransparent,whereasexistingroutermodelshavenosuchfeatures.Throughtheuseofnetworkprotocolssupportedinnetworkstacksfoundincommonoperatingsystems,mechanismsforsupporttransparentcongurationofbothinterfaceandroutingmodelshavebeendetailed.Whereroutersaredesirableduetoperformance,andinterfaceisattractiveforsecuritypurposes,ahybridmodelprovidesamiddlegroundthatcombinesthesetwoaspectstosupporthigh-performance,thoughsecurevirtualnetworking.Becauseattemptingtoverifythesystemineverypossibleenvironmentaftermakingaddingfeaturesorxingbugsrequiresasignicanttimeinvestment,Ihaveemployedabuilt-inself-simulatingenvironmentintoBrunet.Theapplicationhasreducedthetimenecessarytodevelop,evaluate,anddebugnewcontributionsaswellasreproducebugsandprotectfromhavingthemreoccur.Inthecontextofthisdissertation,ithasbeenusedtoverifyandmotivatethenecessityforondemandasopposedtopassiveconnections,aswellastherelay,bootstrapping,andsecuritywork.Thisworkhasbeenthecornerstoneinarealsystemusedtoprovidead-hocgridcomputingnamedtheGridAppliance[ 114 ],whichhasbeenrealizedinavoluntarycomputinggridforcomputerarchitectureresearchcalledArcher[ 37 ].Archercurrentlyspanssixuniversitieswithover600resources.Overhundredsofusershaveconnectedseamlesslytotheseresourcesfrommanylocations.APlanetLabbackenddistributedacrossover600resourcesprovidesnearconstantoverlayuptimeforArcherandexternalusers.Externalusersincludeclassesandgroupsatotheruniversities.Mostrecently,agridatLaJollaInstituteforAllergyandImmunologywentlivewithminimal 181

PAGE 182

communicationwithourgroup.ResearchersattheClemsonUniversityandPurduehaveoptedforthisapproachovercentralizedVPNsasthebasisoftheirfuturedistributedcomputeclustersandhaveactivelytestednetworksofover1000nodes.Themajorityofthisdissertationfocusedonservicestoenableuser-friendlyVPNs.Duringthedesignandevaluation,itbecameapparentthattherestillexistsignicantdecienciesinthedesigndescribedherein.Thefollowingareimportantresearchtopicsthatareleftasfutureresearchtopics.DuringtheevaluationofpacketdropratesacrosstheInternetandinparticularlyPlanetLab,itisapparentthatrecursivepacketroutingwillnotscalewellespeciallywhendealingwithnon-negligibletrafc.AnotheraspectoflimitedscalabilityispresentedwhenusingasingleUDPsocketmultiplexedacrosspotentiallymanyVPNconnections.EachUDPsockethasasmallbufferassociatedwithit.Whenthatbufferisexhaustedsendseitherbecomeblockingorarethrownaway,dependingonusage.IntermsofVPNs,therestillexistsawidegapbetweenIPv6andIPv4.MyworkcouldbeusedtoassistindeployingIPv6toIPv4tunnels,whichunlikeexistingapproaches,havenaturalfailoversupportandefcientpathsbetweensourceanddestination.Finally,thelastmajordrawbacktotheGridApplianceisthenecessityforacentralizedscheduler/managementcomponent.Ideally,thiscouldbehandledindecentralizedmeanswithtrust,auserranksystemforpriority,abilitytohandlesimultaneousschedulingoftasks,andlimitingtherelianceofanodebeingonlinetoreceiveresultsfortasks. 182

PAGE 183

APPENDIX:STRUCTUREDOVERLAYBROADCAST FigureA-1.Tree-basedoverlaybroadcast BroadcastrevocationcanbeusedtoaddressthedecienciesofDHTrevocation.Asatopicofpreviousresearchworks[ 32 111 ],structuredoverlayscanbeusedwithoutadditionalstatetoperformefcientbroadcastsfromanypointintheoverlaytotheentireoverlay.Inthesepapers,analysisandsimulationshaveshownthattheapproachcanbecompletedinanetworksizeofninO(log2n)timewithnmessages.TheoverlaybroadcastalgorithmusedinthispaperprovidesacompleteoverlaybroadcastinO(log2n)timewithnmessages.WhenappliedtoBrunet,asillustratedinFigure A-1 ,itutilizestheorganizationofastructuredsystemwithacircularaddressspacethatrequirespeersbeconnectedtothosewhosenodeaddressesaretheclosesttotheirown,featurestypicalofone-dimensionalstructuredoverlaysincludingChord[ 104 ],Pastry[ 94 ],andSymphony.Usingsuchanorganization,itispossibletodoperformabroadcastwithnoadditionalstate.Toperformabroadcast,eachnodeperformsthefollowingrecursivealgorithm: BROADCAST(start,end,message): RECEIVE(message) 183

PAGE 184

foriinlength(connections)do n start ADDRESS(connections[i]) ifn start62[start,end)then continue endif n end ADDRESS(connections[i+1]) ifn end62[start,end)then n end end endif msg (BROADCAST,n start,n end,message) SEND(connections[i],msg) endforwithconnectionsasacircularlistofconnectionsinnon-decreasingorderfromtheperspectiveofthenodeperformingthecurrentrecursive,broadcaststep.Inthisalgorithm,thebroadcastinitiatorusesitsownaddressasthestartandend,thusthebroadcastwillspantheentireoverlayaftercompletingrecursivecallsateachconnectednode.Arecursiveend,n end,mustbeinsidetheregionbetweenstartandend,thusiftheconnectionfollowingthecurrentsendingconnection,connections[i+1],isnotinthatregion,itwillonlybroadcastuptoendandnottheaddressspeciedbythatconnection.Tosummarize,theoverlayisrecursivelypartitionedamongstthenodesateachhopinthebroadcast.Bydoingso,allnodesreceivethebroadcastwithoutreceivingduplicatebroadcastmessages. 184

PAGE 185

REFERENCES [1] S.M.A.Abbas,J.A.Pouwelse,D.H.J.Epema,andH.J.Sips.Agossip-baseddistributedsocialnetworkingsystem.InEnablingTechnologies,IEEEInternationalWorkshopson,2009. [2] H.Abbes,C.Cerin,andM.Jemni.Bonjourgrid:Orchestrationofmulti-instancesofgridmiddlewaresoninstitutionaldesktopgrids.InInternationalParallelandDistributedProcessingSymposium(IPDPS),2009. [3] J.Albrecht,D.Oppenheimer,A.Vahdat,andD.A.Patterson.Designandimplementationtrade-offsforwide-arearesourcediscovery.InACMTrans.InternetTechnol.,2008. [4] S.AlexanderandR.Droms.RFC2132DHCPOptionsandBOOTPVendorExtensions,March1997. [5] Amazon.com,Inc.Amazonelasticcomputecloud. http://aws.amazon.com/ec2 ,2009. [6] D.P.Anderson.Boinc:Asystemforpublic-resourcecomputingandstorage.IntheInternationalWorkshoponGridComputing,2004. [7] N.Andrade,L.Costa,G.Germoglio,andW.Cirne.Peer-to-peergridcomputingwiththeourgridcommunity.InBrazilianSymposiumonComputerNetworks,May2005. [8] N.Andrade,L.Costa,G.Germglio,andW.Cirne.Peer-to-peergridcomputingwiththeourgridcommunity.InBrazilianSymposiumonComputerNetworks(SBRC)-4thSpecialToolsSession,2005. [9] P.Andreetto,S.Andreozzi,G.Avellino,S.Beco,A.Cavallini,M.Cecchi,V.Ciaschini,A.Dorise,F.Giacomini,A.Gianelle,U.Grandinetti,A.Guarise,A.Krop,R.Lops,A.Maraschini,V.Martelli,M.Marzolla,M.Mezzadri,E.Molinari,S.Monforte,F.Pacini,M.Pappalardo,A.Parrini,G.Patania,L.Petronzio,R.Piro,M.Porciani,F.Prelz,D.Rebatto,E.Ronchieri,M.Sgaravatto,V.Venturi,andL.Zangrando.Thegliteworkloadmanagementsystem.JournalofPhysics:ConferenceSeries,119(6):062007,2008. [10] Azurues.Messagestreamencryption. http://www.azureuswiki.com/index.php/Message_Stream_Encryption ,December2007. [11] A.Biggadike,D.Ferullo,G.Wilson,andA.Perrig.NATBLASTER:EstablishingTCPconnectionsbetweenhostsbehindNATs.InACMSIGCOMMAsiaWork-shop,April2005. [12] F.Bondoux.Campagnol:distributedvpnoverudp/dtls. http://campagnol.sourceforge.net ,2010. 185

PAGE 186

[13] P.O.Boykin,J.S.A.Bridgewater,J.S.Kong,K.M.Lozev,B.A.Rezaei,andV.P.Roychowdhury.Asymphonyconductedbybrunet. http://arxiv.org/abs/0709.4048 ,2007. [14] D.Bryan,B.Lowekamp,andC.Jennings.Sosimple:Aserverless,standards-based,p2psipcommunicationsystem.InAdvancedArchitecturesandAlgorithmsforInternetDeliveryandApplications,June2005. [15] S.BucheggerandA.Datta.AcaseforP2Pinfrastructureforsocialnetworks-opportunities&challenges.InWONS'09:TheSixthInternationalConferenceonWirelessOn-demandNetworkSystemsandServices,2009. [16] S.Buchegger,D.Schioberg,L.H.Vu,andA.Datta.Peerson:P2psocialnetworking:earlyexperiencesandinsights.InWorkshoponSocialNetworkSystems,2009. [17] A.R.Butt,T.A.Johnson,Y.Zheng,andY.C.Hu.Kosha:Apeer-to-peerenhancementforthenetworklesystem.InIEEE/ACMSupercomputing,2004. [18] S.Carl-MitchellandJ.S.Quarterman.RFC1027-usingarptoimplementtransparentsubnetgateways,October1987. [19] M.Castro,M.Costa,andA.Rowstron.Debunkingsomemythsaboutstructuredandunstructuredoverlays.InSymposiumonNetworkedSystemsDesign&Implementation,2005. [20] M.Castro,P.Druschel,A.Ganesh,A.Rowstron,andD.S.Wallach.Securityforstructuredpeer-to-peeroverlaynetworks.InSymposiumonOperatingSystemsDesignandImplementaion(OSDI),December2002. [21] M.Castro,P.Druschel,A.-M.Kermarrec,andA.Rowstron.Oneringtorulethemall:Servicediscoverandbindinginstructuredpeer-to-peeroverlaynetworks.InSymposiumonOperatingSystemsPrinciples(SOSP)EuropeanWorkshop,Sept.2002. [22] F.Chang,J.Dean,S.Ghemawat,W.C.Hsieh,D.A.Wallach,M.Burrows,T.Chandra,A.Fikes,andR.E.Gruber.Bigtable:adistributedstoragesystemforstructureddata.InSymposiumonOperatingSystemsDesignandImplementation(OSDI),2006. [23] B.Chun,D.Culler,T.Roscoe,A.Bavier,L.Peterson,M.Wawrzoniak,andM.Bowman.Planetlab:anoverlaytestbedforbroad-coverageservices.SIG-COMMComput.Commun.Rev.,2003. [24] M.ConradandH.-J.Hof.Ageneric,self-organizing,anddistributedbootstrapserviceforpeer-to-peernetworks.InInternationalWorkshoponSelf-OrganizingSystems(IWSOS),2007. 186

PAGE 187

[25] C.Cramer,K.Kutzner,andT.Fuhrmann.Bootstrappinglocality-awarep2pnetworks.InInternationalConferenceonNetworks(ICON),2004. [26] L.A.Cutillo,R.Molva,andT.Strufe.Privacypreservingsocialnetworkingthroughdecentralization.InWirelessOn-DemandNetworkSystemsandServices(WONS),2009. [27] H.Damfpling.Gnutellawebcachingsystem. http://www.gnucleus.com/gwebcache/specs.html ,2003. [28] G.DeCandia,D.Hastorun,M.Jampani,G.Kakulapati,A.Lakshman,A.Pilchin,S.Sivasubramanian,P.Vosshall,andW.Vogels.Dynamo:amazon'shighlyavailablekey-valuestore.InSymposiumonOperatingSystemsPrinciples(SOSP),NewYork,NY,USA,2007.ACM. [29] L.DeriandR.Andrews.N2N:Alayertwopeer-to-peervpn.InInternationalconferenceonAutonomousInfrastructure,ManagementandSecurity,2008. [30] J.R.Douceur.Thesybilattack.InInternationalWorkshoponPeer-to-PeerSystems,pages251.Springer-Verlag,2002. [31] R.Droms.RFC2131DynamicHostCongurationProtocol,March1997. [32] S.El-Ansary,L.Alima,P.Brand,andS.Haridi.Efcientbroadcastinstructuredp2pnetworks.InInternationalWorkshoponPeer-to-PeerSystems(IPTPS),2003. [33] D.H.J.Epema,M.Livny,R.vanDantzig,X.Evers,andJ.Pruyne.Aworldwideockofcondors:Loadsharingamongworkstationclusters.FutureGenerationComputerSystems,12(1):5365,1996. [34] E.Exa.CloudVPN. http://e-x-a.org/?view=cloudvpn ,September2009. [35] D.Fabrice.Skypeuncovered. http://www.ossir.org/windows/supports/2005/2005-11-07/EADS-CCR_Fabrice_Skype.pdf ,November2005. [36] Facebook.Facebook. http://www.facebook.com ,January2010. [37] R.J.Figueiredo,P.O.Boykin,J.A.B.Fortes,T.Li,J.Peir,D.Wolinsky,L.K.John,D.R.Kaeli,D.J.Lilja,S.A.McKee,G.Memik,A.Roy,andG.S.Tyson.Archer:Acommunitydistributedcomputinginfrastructureforcomputerarchitectureresearchandeducation.InCollaborateCom,November2008. [38] R.J.Figueiredo,P.O.Boykin,P.S.Juste,andD.Wolinsky.Integratingoverlayandsocialnetworksforseamlessp2pnetworking.InWorkshoponEnablingTechnologies:InfrastructureforCollaborativeEnterprises,2008. [39] R.J.Figueiredo,P.A.Dinda,andJ.A.B.Fortes.Acaseforgridcomputingonvirtualmachines.InInternationalConferenceonDistributedComputingSystems.IEEEComputerSociety,2003. 187

PAGE 188

[40] I.Foster.Globustoolkitversion4:Softwareforservice-orientedsystems.JournalofComputerScienceandTechnology,21:513,2006.10.1007/s11390-006-0513-y. [41] A.Ganguly,A.Agrawal,O.P.Boykin,andR.Figueiredo.IPoverP2P:Enablingself-conguringvirtualIPnetworksforgridcomputing.InInternationalParallelandDistributedProcessingSymposium,2006. [42] A.Ganguly,A.Agrawal,P.O.Boykin,andR.Figueiredo.Wow:Self-organizingwideareaoverlaynetworksofvirtualworkstations.InIEEEHighPerformanceDistributedComputing(HPDC),June2006. [43] A.Ganguly,P.O.Boykin,D.Wolinsky,andR.J.Figueiredo.Improvingpeerconnectivityinwide-areaoverlaysofvirtualworkstations.ClusterComputingJournal,72009. [44] A.Ganguly,D.Wolinsky,P.Boykin,andR.Figueiredo.Decentralizeddynamichostcongurationinwide-areaoverlaysofvirtualworkstations.InInternationalParallelandDistributedProcessingSymposium,March2007. [45] C.GauthierDickeyandC.Grothoff.Bootstrappingofpeer-to-peernetworks.InInternationalSymposiumonApplicationsandtheInternet,2008. [46] W.Ginolas.P2PVPN. http://p2pvpn.org ,2009. [47] B.Gleeson,A.Lin,J.Heinanen,T.Finland,G.Armitage,andA.Malis.RFC2764aframeworkforIPbasedvirtualprivatenetworks,February2000. [48] S.Guha,N.Daswani,andR.Jain.Anexperimentalstudyoftheskypepeer-to-peervoipsystem.InInternationalWorkshoponPeer-to-PeerSystems,2006. [49] K.P.Gummadi,S.Saroiu,andS.D.Gribble.King:Estimatinglatencybetweenarbitraryinternetendhosts.InSIGCOMMInternetMeasurementWorkshop,2002. [50] M.Harren,J.M.Hellerstein,R.Huebsch,B.T.Loo,S.Shenker,andI.Stoica.Complexqueriesindht-basedpeer-to-peernetworks.InInternationalWorkshoponPeer-to-PeerSystems,2002. [51] A.Harutyunyan,P.Buncic,T.Freeman,andK.Keahey.DynamicvirtualAliEngridsitesonnimbuswithCernVM.JournalofPhysics:ConferenceSeries,2010. [52] B.Hubert.tc-linuxadvancedrouting&trafccontrol. http://lartc.org/ ,June2009. [53] X.JiangandD.Xu.Violin:Virtualinternetworkingonoverlay.InInternationalSymposiumonParallelandDistributedProcessingandApplications,pages937,2003. 188

PAGE 189

[54] R.Jones.Netperf:Anetworkperformancemonitoringtool. http://www.netperf.org ,2009. [55] D.Joseph,J.Kannan,A.Kubota,K.Lakshminarayanan,I.Stoica,andK.Wehrle.Ocala:anarchitectureforsupportinglegacyapplicationsoveroverlays.InSymposiumonNetworkedSystemsDesign&Implementation,pages20,2006. [56] P.S.Juste,D.Wolinsky,P.OscarBoykin,M.J.Covington,andR.J.Figueiredo.SocialVPN:Enablingwide-areacollaborationwithintegratedsocialandoverlaynetworks.ComputerNetworking,54:1926,August2010. [57] M.Kallahalla,M.Uysal,R.Swaminathan,D.E.Lowell,M.Wray,T.Christian,N.Edwards,C.I.Dalton,andF.Gittler.SoftUDC:Asoftware-baseddatacenterforutilitycomputing.Computer,37(11):38,2004. [58] K.Keahey,K.Doering,andI.Foster.Fromsandboxtoplayground:Dynamicvirtualenvironmentsinthegrid.InInternationalWorkshopinGridComputing,November2004. [59] K.KeaheyandT.Freeman.Contextualization:Providingone-clickvirtualclusters.IneScience,2008. [60] K.KeaheyandT.Freeman.Scienceclouds:Earlyexperiencesincloudcomputingforscienticapplications.InCloudComputingandItsApplications,2008. [61] C.C.Keir,C.Clark,K.Fraser,S.H,J.G.Hansen,E.Jul,C.Limpach,I.Pratt,andA.Wareld.Livemigrationofvirtualmachines.InSymposiumonNetworkedSystemsDesignandImplementation,pages273,2005. [62] M.Knoll,A.Wacker,G.Schiele,andT.Weis.Bootstrappinginpeer-to-peersystems.InInternationalConferenceonParallelandDistributedSystems(IPDPS),2008. [63] M.Krasnyansky.Universaltun/tapdevicedriver. http://vtun.sourceforge.net/tun ,2005. [64] A.Lakshman.CassandraastructuredstoragesystemonaP2Pnetwork. http://www.facebook.com/note.php?note_id=24413138919 ,August2008. [65] M.Livny,J.Basney,R.Raman,andT.Tannenbaum.Mechanismsforhighthroughputcomputing.SPEEDUPJournal,11(1),June1997. [66] G.LLC.Gbridge. http://www.gbridge.com ,September2009. [67] LogMeIn.Hamachi. https://secure.logmein.com/products/hamachi2/ ,2009. [68] LogMeIn,Inc.LogMeInhamachi2security,2009. 189

PAGE 190

[69] S.Ludwig,J.Beda,P.Saint-Andre,R.McQueen,S.Egan,andJ.Hildebran.XEP-0166:Jingle,December2009. [70] H.Madhyastha,T.Isdal,M.Piatek,C.Dixon,T.Anderson,A.Krishnamurthy,andA.Venkataramani.iplane:aninformationplanefordistributedservices.InUSENIXSymposiumonOperatingSystemsDesignandImplementation(OSDI),2006. [71] G.S.Manku,M.Bawa,andP.Raghavan.Symphony:distributedhashinginasmallworld.InUSITS,2003. [72] P.MaymounkovandD.Mazieres.Kademlia:Apeer-to-peerinformationsystembasedontheXORmetric.InInternationalWorkshoponPeer-to-PeerSystems,2002. [73] A.Mislove,A.Post,A.Haeberlen,andP.Druschel.Experiencesinbuildingandoperatingepost,areliablepeer-to-peerapplication.InSymposiumonOperat-ingSystemsPrinciples(SOSP)/EuroSysEuropeanConferenceonComputerSystems,2006. [74] MySpace,Inc.Myspace. http://www.myspace.com ,January2010. [75] M.Nelson,B.-H.Lim,andG.Hutchins.Fasttransparentmigrationforvirtualmachines.InUSENIXAnnualTechnicalConference,pages25,Berkeley,CA,USA,2005. [76] D.Nurmi,R.Wolski,C.Grzegorczyk,G.Obertelli,S.Soman,L.Youseff,andD.Zagorodnov.Theeucalyptusopen-sourcecloud-computingsystem.InIEEE/ACMInternationalSymposiumonClusterComputingandtheGrid(CCGrid),2009. [77] J.C.Perez.Facebook'sbeaconmoreintrusvethanpreviouslythought. http://www.pcworld.com/article/140182/facebooks_beacon_more_intrusive_than_previously_thought.html ,2007. [78] S.PerreaultandJ.Rosenberg.TCPcandidateswithinteractiveconnectivityestablishment(ICE). http://tools.ietf.org/html/draft-ietf-mmusic-ice-tcp-08 ,October2009. [79] K.Petric.Wippien. http://wippien.com/ ,August2009. [80] J.Postel.RFC0925-multi-lanaddressresolution,1984. [81] Qumranet.Kernel-basedvirtualmachineforlinux. http://kvm.qumranet.com/kvmwiki ,March2007. [82] S.Ratnasamy,P.Francis,S.Shenker,andM.Handley.Ascalablecontent-addressablenetwork.InACMSIGCOMM,2001. 190

PAGE 191

[83] E.RescorlaandN.Modadugu.RFC4347datagramtransportlayersecurity,April2006. [84] C.Resources.Torqueresourcemanager. http://www.clusterresources.com/pages/products/torque-resource-manager.php ,March2007. [85] A.Rezmerita,T.Morlier,V.Neri,andF.Cappello.Privatevirtualcluster:Infrastructureandprotocolforinstantgrids.InEuro-Par,November2006. [86] S.Rhea,B.Godfrey,B.Karp,J.Kubiatowicz,S.Ratnasamy,S.Shenker,I.Stoica,andH.Yu.Opendht:apublicdhtserviceanditsuses.InConferenceonAppli-cations,technologies,architectures,andprotocolsforcomputercommunications,pages73,NewYork,NY,USA,2005.ACM. [87] M.Ripeanu.Peer-to-peerarchitecturecasestudy:Gnutellanetwork. http://www.cs.uchicago.edu/%7Ematei/PAPERS/gnutella-rc.pdf ,2001. [88] E.RosenandY.Rekhter.RFC2547BGP/MPLSVPNs,March1999. [89] J.Rosenberg.Interactiveconnectivityestablishment(ICE):Aprotocolfornetworkaddresstranslator(NAT)traversalforoffer/answerprotocols. http://tools.ietf.org/html/draft-ietf-mmusic-ice-19 ,October2008. [90] J.Rosenberg,R.Mahy,andP.Matthews.traversalusingrelaysaroundnat(turn). http://tools.ietf.org/html/draft-ietf-behave-turn-16 ,2009. [91] J.Rosenberg,R.Mahy,P.Matthews,andD.Wing.RFC3489sessiontraversalutilitiesfornat(STUN),October2008. [92] J.Rosenberg,R.Mahy,P.Matthews,andD.Wing.RFC3489sessiontraversalutilitiesfornat(STUN),October2008. [93] J.Rosenberg,J.Weinberger,C.Huitema,andR.Mahy.RFC3489stun-simpletraversalofuserdatagramprotocol(udp)throughnetworkaddresstranslators(nats),2003. [94] A.RowstronandP.Druschel.Pastry:Scalable,decentralizedobjectlocationandroutingforlarge-scalepeer-to-peersystems.InIFIP/ACMInternationalConferenceonDistributedSystemsPlatforms(Middleware),November2001. [95] A.RowstronandP.Druschel.StoragemanagementandcachinginPAST,alarge-scale,persistentpeer-to-peerstorageutility.InSymposiumonOperatingSystemsPrinciples(SOSP),2001. [96] P.Saint-Andre.RFC3920extensiblemessagingandpresenceprotocol(XMPP):Core,October2004. [97] S.Santhanam,P.Elango,A.A.Dusseau,andM.Livny.Deployingvirtualmachinesassandboxesforthegrid.InWORLDS,2005. 191

PAGE 192

[98] A.Shakimov,H.Lim,L.P.Cox,andR.Caceres.Vis-a-vis:onlinesocialnetworkingviavirtualindividualservers,May2008. [99] SkypeLimited.Skype. http://www.skype.com [100] G.Sliepen.tinc. http://www.tinc-vpn.org/ ,September2009. [101] P.Srisuresh,B.Ford,andD.Kegel.RFC5128StateofPeer-to-Peer(P2P)CommunicationacrossNetworkAddressTranslators(NATs),March2008. [102] StandardPerformanceEvaluationCorporation.Specjbb2005. http://www.spec.org/jbb2005/ ,2005. [103] I.Stoica,D.Adkins,S.Zhuang,S.Shenker,andS.Surana.Internetindirectioninfrastructure.IEEE/ACMTransactionsonNetworking,2004. [104] I.Stoica,R.Morris,D.Liben-Nowell,D.R.Karger,M.F.Kaashoek,F.Dabek,andH.Balakrishnan.Chord:ascalablepeer-to-peerlookupprotocolforinternetapplications.IEEE/ACMTransactionsonNetworking,11(1),2003. [105] Sun.gridengine. http://gridengine.sunsource.net/ ,March2007. [106] A.I.SundararajandP.A.Dinda.Towardsvirtualnetworksforvirtualmachinegridcomputing.InConferenceonVirtualMachineResearchAndTechnologySymposium,pages14,2004. [107] W.Townsley,A.Valencia,A.Rubens,G.Pall,G.Zorn,andB.Palter.RFC2661LayerTwoTunnelingProtocol,August1999. [108] M.TsugawaandJ.Fortes.Avirtualnetwork(vine)architectureforgridcomputing.InternationalParallelandDistributedProcessingSymposium,2006. [109] M.TsugawaandJ.Fortes.Characterizinguser-levelnetworkvirtualization:Performance,overheadsandlimits.InIEEEInternationalConferenceoneScience,pages206,Dec.2008. [110] UPnPForum.UPnPdevicearchitecture1.1. http://www.upnp.org/specs/arch/UPnP-arch-DeviceArchitecture-v1.1.pdf ,October2008. [111] V.Vishnevsky,A.Safonov,M.Yakimov,E.Shim,andA.D.Gelman.Scalableblindsearchandbroadcastingoverdistributedhashtables.ComputerCommunications,31(2),2008. [112] VMware,Inc.Timekeepinginvmwarevirtualmachines. http://www.vmware.com/pdf/vmware_timekeeping.pdf ,2008. [113] D.I.Wolinsky,A.Agrawal,P.O.Boykin,J.Davis,A.Ganguly,V.Paramygin,P.Sheng,andR.J.Figueiredo.Onthedesignofvirtualmachinesandboxesfordistributedcomputinginwideareaoverlaysofvirtualworkstations.InInternationalWorkshoponVirtualizationTechnologiesinDistributedComputing,2006. 192

PAGE 193

[114] D.I.WolinskyandR.Figueiredo.Gridapplianceuserinterface. http://www.grid-appliance.org ,September2009. [115] D.I.Wolinsky,K.Lee,P.O.Boykin,andR.Figueiredo.Onthedesignofautonomic,decentralizedvpns.InInternationalConferenceonCollaborativeComputing:Networking,ApplicationsandWorksharing,2010. [116] D.I.Wolinsky,Y.Liu,P.S.Juste,G.Venkatasubramanian,andR.Figueiredo.Onthedesignofscalable,self-conguringvirtualnetworks.InIEEE/ACMSupercomputing2009,November2009. [117] D.I.Wolinsky,P.St.Juste,P.O.Boykin,andR.Figueiredo.AddressingtheP2Pbootstrapproblemforsmalloverlaynetworks.In10thIEEEInternationalConferenceonPeer-to-PeerComputing(P2P),2010. [118] C.P.WrightandE.Zadok.Unionfs:Bringinglesystemstogether.InLinuxJournal,December2004. [119] XMPPStandardsFoundation.PublicXMPPservices. http://xmpp.org/services/ ,December2009. [120] J.Yonan.OpenVPN. http://openvpn.net/ ,2009. 193

PAGE 194

BIOGRAPHICALSKETCHDavidIsaacWolinskywasbornonOctober31,1982.Hewasblessedwithanawesome,IsaacEmmanuel,bornNovember30,2009.BeginninghisstudiesinAugust2001attheUniversityofFlorida,Davidobtainedthefollowingdegreesinelectricalandcomputerengineering:BachelorofScienceinspring2005,MasterofScienceinspring2007,andDoctorateofPhilosophyinsummer2011.HisadvisorattheUniversityofFloridawasProfessorRenatoFigueiredo,whomhebeganworkingwithsincetheduringthespringof2006attheAdvancedComputingandInformationSystemsLab.HisprimaryresearchfocusesarenetworkvirtualizationusingstructuredP2P(peer-to-peer)overlaysandgridcomputing.ThenetworkingresearchhasbeenrealizedinIPOP,afree(BSD-BerkeleySoftwareDistributionlicence))networkvirtualizationsoftware.Additionally,hehasworkedonenablingDHTs,decentralizedNAT(networkaddresstranslation)traversalthroughrelays,softwaremodelsforimprovednetworkvirtualization,andautonomicvirtualnetworkingstacks.Thisworkisamajorcontributiontohisgridcomputingresearchfocus,GridAppliance,whichenablesthecreationofdecentralized,distributedgridsusingvirtualized,physical,andcloudresources.Goingforward,heexpressedgreatinterestedinusingtheseconceptsinotherdistributedsystemssuchassensornetworks,socialnetworks,cloudservices,orevenwebservices.Duringhisfreetime,heenjoystimewithmyboy,running,playingbasketball,andoccasionallyplayingvideogames.Atonepoint,hewasrankedinthetop20ontheUSEastWarcraftIIIFreeForAllLadder. 194