
Inference of User Behavior and Investigation of Privacy Issues in WLAN Trace Analysis

Permanent Link: http://ufdc.ufl.edu/UFE0022541/00001

Material Information

Title: Inference of User Behavior and Investigation of Privacy Issues in WLAN Trace Analysis
Physical Description: 1 online resource (29 p.)
Language: English
Creator: Kumar, Udayan
Publisher: University of Florida
Place of Publication: Gainesville, Fla.
Publication Date: 2008

Subjects

Subjects / Keywords: gender, grouping, traces, wlan
Computer and Information Science and Engineering -- Dissertations, Academic -- UF
Genre: Computer Engineering thesis, M.S.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, territorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract: Science and society have long interacted with and changed each other. A scientific invention brings new challenges to society, and society reacts to it. Wireless networks are no exception. In this thesis, we present methodologies to classify WLAN (Wireless Local Area Network) users belonging to a large community, such as a university campus, into social groups such as gender and major. With a few examples we illustrate how WLAN user behavior and preferences can be studied. Similar studies have been done for technologies like the Internet, but so far no such study has been conducted for WLANs. One probable reason is the unavailability of data about user groups. Our method overcomes this limitation by providing a novel way to infer such groups, and thus opens the door to further studies. The results of these studies help not only in understanding interactions between society and technology but also in improving technology and in providing specific services to those whose interests are unique. For example, we show that the gender of a WLAN user affects the preference for the brand of device. This can be used for better marketing of products and for understanding the bias. We also bring out the privacy concerns of using WLAN traces. We believe this work will open doors for further studies in understanding user behavior in wireless environments.
General Note: In the series University of Florida Digital Collections.
General Note: Includes vita.
Bibliography: Includes bibliographical references.
Source of Description: Description based on online resource; title from PDF title page.
Source of Description: This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility: by Udayan Kumar.
Thesis: Thesis (M.S.)--University of Florida, 2008.
Local: Adviser: Helmy, Ahmed H.

Record Information

Source Institution: UFRGP
Rights Management: Applicable rights reserved.
Classification: lcc - LD1780 2008
System ID: UFE0022541:00001


Full Text

PAGE 4

TABLE OF CONTENTS
                                                        page
LIST OF TABLES ..... 5
LIST OF FIGURES ..... 6
LIST OF SYMBOLS ..... 7
ABSTRACT ..... 8

CHAPTER
1 INTRODUCTION ..... 9
  1.1 Introduction ..... 9
  1.2 Contributions ..... 10
  1.3 Organization of the Work ..... 10
2 USER INFERENCES AND GROUPING ..... 11
  2.1 Challenges ..... 12
  2.2 Approach ..... 12
  2.3 Choice of Trace ..... 14
  2.4 Filtering ..... 15
  2.5 Verification of Filtering ..... 17
  2.6 Time Evolution ..... 17
3 USER BEHAVIOR EVALUATION AND RESULTS ..... 21
  3.1 WLAN Usage by Area ..... 21
  3.2 Average Session Duration ..... 22
  3.3 Manufacturer Preferences ..... 22
  3.4 Applications ..... 24
  3.5 Future Work ..... 24
  3.6 Conclusions ..... 25

REFERENCES ..... 27
BIOGRAPHICAL SKETCH ..... 29

PAGE 5

LIST OF TABLES
Table                                                   page
2-1 Similarity in the user population selected after filtering fraternity users ..... 17
2-2 Similarity in the user population selected after filtering sorority users ..... 20

PAGE 6

LIST OF FIGURES
Figure                                                  page
2-1 Query based user grouping technique ..... 13
2-2 A sample trace database snapshot ..... 13
2-3 Gender grouping in fraternities and sororities ..... 14
2-4 Session count for fraternity and sorority users ..... 16
2-5 Fraternity users vs session plot at various trace and session duration cut-offs (time in seconds) ..... 18
2-6 Sorority users vs session plot at various trace and session duration cut-offs (time in seconds) ..... 19
3-1 Distribution of users across the campus ..... 22
3-2 Average duration of males and females in different areas of the campus ..... 23
3-3 Device distribution by manufacturer ..... 24

PAGE 7

LIST OF SYMBOLS
WLAN         Wireless Local Area Network (IEEE 802.11)
AP           Access Point used for supporting the infrastructure mode of IEEE 802.11
MAC          Medium Access Control address
Path         The time based order in which a user logs into APs.
Wired trace  A tcpdump or the netflow of the Ethernet usage.
Session      An event in a WLAN trace which starts with the association with the AP and ends with the disassociation from the AP.

PAGE 9

[2, 12]. The traces are anonymized and lack any information about the social context, attributes, affiliation or gender, and hence hide some potentially interesting characteristics of group behavior in mobile societies. Thus, it becomes challenging to mine user behavior from these traces. In this work, we present novel techniques which can be used to group users in social context (like affiliations). While researchers have been studying WLAN deployment issues [4], issues of mobility [5] and user association patterns [1, 6], we aim to address issues of

PAGE 10

2, we explain the techniques for grouping users with a case example of gender based grouping. We show how filtering can be done to remove visitors. In Chapter 3, we present statistical results showing differences in usage patterns of the two genders. We then identify possible applications of this kind of study. Conclusion and future work are presented at the end of this chapter.

PAGE 11

[4], issues of mobility [5] and user association patterns [6][1], we address in this research the issues of user classification based on social grouping, and analyze WLAN usage patterns based on gender, majors and other interest groups. This study allows us to examine the trends among different social groups. An insight into users' social behavior can facilitate the design of network protocols, such as those for delay tolerant networks (DTNs) and mobile social networks. Incorporating users' social behavior has already improved mobility analysis [5] and user predictions in mobile networks. Understanding the social behavior of the user is important for future context aware services of mobile networks, which would require understanding of the context from the user's perspective.

We propose to use WLAN traces, which are generally considered for studying network characteristics, to mine the social behavior of the users. We present a general methodology with an example case study of grouping by gender, and investigate gender gaps in WLAN usage. The lack of such empirical data poses an interesting challenge and raises several research (and privacy) questions, such as: How can we meaningfully infer gender information from such anonymous traces? Does gender influence user behavior and preference in a significant and consistent manner?

In this work, we introduce a novel technique to mine WLAN usage patterns based on gender, majors and other interest groups. Some of the central ideas of our work include the usage of building maps for the WLAN traces, knowledge of locations of departments, fraternities and sororities, and the use of statistical methods to classify users into majors and genders and to study the cause and effect of such classification on network activity and preference. The method we provide can be used further to analyze behavior based on various other groupings.

PAGE 12

[3]. This work is the first [9, 11], to our knowledge, to analyze WLAN adoption patterns across these groups. Among the parameters we have considered for evaluating the gender gaps, we found enough statistical evidence to conclude that (for the traces in our study) the usage patterns of males and females are different, and that gender does indeed affect user activity and vendor preference. Our success also indicates that the problem of mobile user privacy should be revisited, a topic that we want to address in our future work.

[2][12]. Often, because of user privacy issues, the MAC addresses are anonymized. Having a meaningful classification with this partial information is the main challenge that we address in this work. Ideally, we would want to classify all students into groups. Taking a first step in this direction, we present a general technique which can be used to classify a smaller section of WLAN users into groups. Doing it for all the users still remains a challenge, as we shall see. Instead, we focus on obtaining a sample significant enough for a statistical analysis.

2-1. We also use the location information of the APs, in the form of the buildings in which they are located. This helps in identifying the geographic locations of a user at a later stage. Mobility of users can be tracked by looking at the approximate geographic locations of the APs. The processed data is fed into a database on which SQL

PAGE 13

Figure 2-1. Query based user grouping technique

Figure 2-2. A sample trace database snapshot

queries can be run easily (and generically) to extract information of interest to us. Figure 2-2 illustrates the trace database layout which was used in our experiment. The fields include the following:
1. MAC addresses of the wireless devices logged onto the WLAN,
2. the starting session time in seconds,
3. the AP with which the wireless device associated,
4. the duration of the association with the AP,
5. the manufacturer (which can be inferred from the MAC address), and
6. the building in which the AP is (approximately) located, which can be checked against access point location information that is external to the actual traces.
Two-dimensional co-ordinates can be built into the database based on a campus grid map to allow mobility based queries to be performed as well. The trace database provides enough information on which to run SQL queries. For instance, a simple query returns the number of MACs logged into building a or b with durations

PAGE 14

Figure 2-3. Gender grouping in fraternities and sororities

within a certain range. We have used this same database framework to analyze traces from USC [12], Dartmouth [2], UF and UNC [13]; the method is general and applicable to many traces, campuses and societies. Completing these analyses is part of our future work.

The grouping parameter we use in this document for investigation is gender. To do this categorization, we propose the following novel technique. Most universities have sororities and fraternities as social organizations. Sororities are female organizations while fraternities represent male organizations. Given the physical location of APs on campus, APs located in sororities and fraternities are identified, and the users associated with them are classified as female or male. Figure 2-3 shows how grouping is done in this setting. The fact that visitors may frequent these locations also needs to be taken into account. We deal with visitors in the filtering section.
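The database query and the fraternity/sorority grouping described above can be sketched as follows. This is only a minimal illustration using Python's built-in sqlite3 module: the table name `sessions`, its column names, the building labels and every row are hypothetical, chosen to mirror the six fields listed for the trace database. Leaving out MACs that appear at both a fraternity and a sorority is an assumption of the sketch, not a rule taken from the text.

```python
import sqlite3

# Hypothetical schema mirroring the six trace-database fields.
# Table name, column names, building labels and rows are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE sessions (
           mac TEXT, start_time INTEGER, ap TEXT,
           duration INTEGER, manufacturer TEXT, building TEXT)"""
)
rows = [
    ("mac01", 1000, "ap1", 1200, "VendorA", "frat_1"),
    ("mac01", 9000, "ap1", 300, "VendorA", "frat_1"),
    ("mac02", 2000, "ap2", 2400, "VendorB", "sor_1"),
    ("mac03", 3000, "ap3", 60, "VendorA", "library"),
    ("mac04", 4000, "ap2", 900, "VendorB", "sor_1"),
]
conn.executemany("INSERT INTO sessions VALUES (?, ?, ?, ?, ?, ?)", rows)


def count_macs(conn, buildings, min_dur, max_dur):
    """Count distinct MACs seen in any of `buildings` with a session
    duration inside [min_dur, max_dur] seconds."""
    marks = ",".join("?" for _ in buildings)
    sql = (
        "SELECT COUNT(DISTINCT mac) FROM sessions "
        f"WHERE building IN ({marks}) AND duration BETWEEN ? AND ?"
    )
    return conn.execute(sql, [*buildings, min_dur, max_dur]).fetchone()[0]


def group_by_house(conn, fraternities, sororities):
    """Label MACs seen at fraternity APs as 'male' and at sorority APs as
    'female'. MACs seen at both are skipped as ambiguous (sketch assumption)."""
    def macs_in(buildings):
        marks = ",".join("?" for _ in buildings)
        cur = conn.execute(
            f"SELECT DISTINCT mac FROM sessions WHERE building IN ({marks})",
            list(buildings),
        )
        return {r[0] for r in cur}

    male, female = macs_in(fraternities), macs_in(sororities)
    both = male & female
    labels = {m: "male" for m in male - both}
    labels.update({m: "female" for m in female - both})
    return labels


n = count_macs(conn, ["frat_1", "sor_1"], 600, 3000)
labels = group_by_house(conn, ["frat_1"], ["sor_1"])
```

Keeping the building list as a query parameter means the same two functions could be pointed at department buildings to group by major, which is the kind of generic reuse the text describes.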

PAGE 15

[12], UNC [13] and Dartmouth [2]. Dartmouth traces do not provide AP-to-building mapping, which makes it difficult to do this kind of study. UNC traces, on the other hand, have a limited number of APs in sororities and fraternities. We chose the USC traces for our study because 12 fraternities and 7 sororities are included in the WLAN traces and the AP-to-building mapping is also available. We have chosen 3 months for the study from three different semesters: Feb 2006, Oct 2006 and Feb 2007. The reason for having traces from multiple periods is to look at consistency in the results and also at the trends. Traces have been taken from different semesters in order to check and verify the semester effect in the results.

2-4 represents session counts per MAC address in decreasing order. Figure 2-4 presents the graphs that are produced using the average session duration (in sororities and fraternities, respectively) as the threshold for session duration. We observe an interesting distinct characteristic in Figure 2-4: the presence of a sharp bend (knee) as the number of sessions per MAC address decreases. Intuitively, this means that MAC addresses below the knee have an order of magnitude fewer sessions

PAGE 16

Figure 2-4. Session count for fraternity and sorority users. A) Feb 2006. B) Oct 2006. C) Feb 2007.

PAGE 17

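2-1 and Table 2-2 show the results we obtain for both fraternity and sorority users. We see that for fraternities, before filtering, the percentage of common MACs in two consecutive months is around 56 to 61 percent, and after filtering it goes up to between 73 and 80 percent. In the case of sororities, before filtering we see that common users are between 61 and 65 percent, and after filtering the percentage of common users shoots up to 88 to 90 percent. This shows that filtering is selecting regular users, as the percentage of common users rises after filtering.

Table 2-1. Similarity in the user population selected after filtering fraternity users

Before filtering
Period 1    Period 2        Users (period 1)   Users (period 2)   Common users   Common (%)
Feb 2006    Mar-Apr 2006    1350               1441               816            56.63
Oct 2006    Nov 2006        1520               1572               969            61.64
Feb 2007    Mar-Apr 2007    1692               1875               1050           56

After filtering
Period 1    Period 2        Users (period 1)   Users (period 2)   Common users   Common (%)
Feb 2006    Mar-Apr 2006    473                463                378            79.92
Oct 2006    Nov 2006        474                445                371            78.27
Feb 2007    Mar-Apr 2007    446                482                354            73.44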
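The similarity check above can be sketched in a few lines. The source does not state the formula for the percentage of common users. Dividing the common-user count by the larger of the two monthly populations is an assumption here; it happens to reproduce the Table 2-1 values (for example, 816 common users out of max(1350, 1441) gives 56.63). The function names are illustrative.

```python
def common_user_percentage(users_a, users_b):
    """Percentage of users common to two periods, relative to the larger
    of the two populations (denominator choice is an assumption)."""
    a, b = set(users_a), set(users_b)
    larger = max(len(a), len(b))
    if larger == 0:
        return 0.0
    return 100.0 * len(a & b) / larger


def percentage_from_counts(n_a, n_b, n_common):
    """Same measure computed from population sizes and the overlap count."""
    return 100.0 * n_common / max(n_a, n_b)


# Counts taken from the first rows of Table 2-1 (before and after filtering).
before = round(percentage_from_counts(1350, 1441, 816), 2)
after = round(percentage_from_counts(473, 463, 378), 2)
```

With sets of anonymized MAC addresses for two months, `common_user_percentage` gives the same measure directly, so it can be evaluated once on the raw populations and once on the filtered ones.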

PAGE 18

Figure 2-5. Fraternity users vs session plot at various trace and session duration cut-offs (time in seconds). A) 3 days. B) 4 days. C) 5 days. D) 6 days. E) 7 days. F) 14 days.

PAGE 19

Figure 2-6. Sorority users vs session plot at various trace and session duration cut-offs (time in seconds). A) 3 days. B) 4 days. C) 5 days. D) 6 days. E) 7 days. F) 14 days.

PAGE 20

Table 2-2. Similarity in the user population selected after filtering sorority users

Before filtering
Period 1    Period 2        Users (period 1)   Users (period 2)   Common users   Common (%)
Feb 2006    Mar-Apr 2006    991                1155               717            62.08
Oct 2006    Nov 2006        1264               1305               844            64.67
Feb 2007    Mar-Apr 2007    1169               1327               821            61.87

After filtering
Period 1    Period 2        Users (period 1)   Users (period 2)   Common users   Common (%)
Feb 2006    Mar-Apr 2006    463                474                429            90.51
Oct 2006    Nov 2006        493                456                432            87.63
Feb 2007    Mar-Apr 2007    439                458                405            88.43

(in terms of days) required to do the filtering, which essentially means the visibility of the knee feature [10]. This knee feature can be identified by a sharp change in the slope of the curve. We can see in Figure 2-5 and Figure 2-6 that this knee starts to appear from day 4 onwards for both sorority users and fraternity users. For this study, we have used the trace belonging to the month of Feb 2006. This indicates the suitability of our analysis to traces of shorter duration in similar environments. In the following chapter we present behavior analysis of users based on the classification done in this chapter.
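One simple way to locate such a knee automatically is sketched below. It is not the procedure used in this work, which reads the knee off the plots. As a stand-in for "a sharp change in the slope", the sketch ranks users by session count and cuts at the largest ratio between consecutive counts, which matches the intuition that users below the knee have roughly an order of magnitude fewer sessions. The function name and the sample counts are hypothetical.

```python
def split_at_knee(session_counts):
    """Split users into (regular, visitors) at the sharpest drop.

    session_counts: dict mapping MAC -> number of sessions.
    The knee is taken as the position with the largest ratio between
    consecutive counts in the descending ranking (a heuristic assumption).
    """
    ranked = sorted(session_counts.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) < 2:
        return [m for m, _ in ranked], []
    best_i, best_ratio = 0, 0.0
    for i in range(len(ranked) - 1):
        higher, lower = ranked[i][1], ranked[i + 1][1]
        ratio = higher / max(lower, 1)  # guard against zero counts
        if ratio > best_ratio:
            best_i, best_ratio = i, ratio
    regular = [m for m, _ in ranked[: best_i + 1]]
    visitors = [m for m, _ in ranked[best_i + 1:]]
    return regular, visitors


# Made-up counts: five frequent users followed by three occasional ones.
counts = {"u1": 120, "u2": 115, "u3": 110, "u4": 104, "u5": 100,
          "v1": 12, "v2": 10, "v3": 9}
regular, visitors = split_at_knee(counts)
```

On real traces the ranking is noisier, so a ratio-based cut like this would need checking against the plotted curve before it is trusted.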

PAGE 21

(a). WLAN usage and gender distribution by area: What are the trends in WLAN usage across different (building) areas on campus?
(b). Average session duration: Are there trends in the average on-line times of users, and can differences be spotted based on gender and (building) areas within the campus?
(c). Manufacturer preferences: Which device vendors do different genders prefer?

3-1 shows the usage distribution per area type based on our definition of a regular user. Economics buildings show a higher population of male users, while social science buildings have a higher count of female users. It is interesting to see that female WLAN users outnumber male users in Engineering buildings for the Feb 2006 sample; however, male users take the lead in Feb 2007. We see that the absolute number of students classified as male and female increases in Oct 2006 and then drops in Feb 2007. This may be attributed to the fact that more students join in Fall semesters than in any other period of the year and many students graduate after the Fall semesters. It may also indicate that several courses offered in the Fall require the use of laptops (and the wireless network) for the course work.

PAGE 22

Figure 3-1. Distribution of users across the campus

3-2 we observe that males spend more time on-line than females in most of the areas. Females show dominant usage in the Social Science, Economics and Medicine areas across campus. We see that the average on-line time of males is decreasing over time and that of females is increasing. In Feb 2006 males have a higher average on-line time than females in most of the buildings; however, in Feb 2007, a year later, we see that males have a lower average on-line time than females in almost all the buildings. Are males getting more mobile? Another observation of interest is that the average duration per session decreases from Feb 2006 to Feb 2007 in almost all the cases (Engineering, residence, social, sports, music). This points to the possibility that students are becoming more mobile, and thus having shorter sessions in the same location.

3-3. It is interesting to note that while Apple computers are more popular

PAGE 23

Figure 3-2. Average duration of males and females in different areas of the campus

amongst females, males prefer Intel computers. For this study, only major vendors were considered. For example, using the Feb 2006 trace, we find that, in the case of males, 25% use Apple and 32% use Intel, so there are 28% more male users using Intel than Apple. In the case of females, 30% use Apple and 27% use Intel, which indicates that 12% more female users use Apple than Intel. To test whether gender provides a bias towards specific vendors, we use the Chi-Square test of statistical significance. The Chi-Square test shows with 90% confidence that there is a bias between gender and vendor/brand. Another interesting observation we make from Figure 3-3 is the consistent trend of an increasing percentage of Apple computer usage in both genders. We also see that vendors like Enterasys, Linksys and Askey Corp. show a decreasing trend in terms of percentage of users. One of the reasons is that these manufacturers mostly make external Wi-Fi devices for old laptops (with no built-in Wi-Fi NICs), and currently almost all new laptops come with built-in Wi-Fi, shrinking the share of users of external devices.
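A Chi-Square test of independence between gender and vendor can be sketched as below. The contingency counts are invented for illustration, since the text reports only percentages and not the underlying table, so the resulting statistic is not a result of this study. The constant 2.706 is the standard chi-square critical value for one degree of freedom at the 10% significance level (90% confidence).

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c contingency table given as
    a list of rows of observed counts (all row/column totals must be > 0)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat


# Critical value for df = (2 - 1) * (2 - 1) = 1 at 90% confidence.
CRITICAL_90_DF1 = 2.706

# Hypothetical counts, NOT from the traces: rows are male/female,
# columns are Apple/Intel.
observed = [[100, 128],
            [120, 108]]
stat = chi_square_statistic(observed)
gender_vendor_dependent = stat > CRITICAL_90_DF1
```

If the statistic exceeds the critical value, the hypothesis that vendor choice is independent of gender is rejected at that confidence level. A table with more vendors would need the critical value for the corresponding degrees of freedom.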

PAGE 24

Figure 3-3. Device distribution by manufacturer

[7].

PAGE 27

REFERENCES

[1] G. Chen, H. Huang, and M. Kim, "Mining Frequent and Periodic Association Patterns," Dartmouth College Computer Science Technical Report TR2005-550, July 2005.
[2] CRAWDAD, "Community Resource for Archiving Wireless Data At Dartmouth,"
[3] R. R. Dholakia et al., "Gender and Internet Usage," The Internet Encyclopedia, Wiley, 2003.
[4] T. Henderson, D. Kotz, and I. Abyzov, "The Changing Usage of a Mature Campus-wide Wireless Network," in Proceedings of ACM MobiCom 2004, September 2004.
[5] W. Hsu, T. Spyropoulos, K. Psounis, and A. Helmy, "Modeling Time-Variant User Mobility in Wireless Mobile Networks," in Proceedings of IEEE INFOCOM, May 2007.
[6] W. Hsu and A. Helmy, "On Modeling User Associations in Wireless LAN Traces on University Campuses," in Proceedings of The Second International Workshop on Wireless Network Measurement (WiNMee), April 2006.
[7] W. Hsu, D. Dutta, and A. Helmy, "Profile-Cast: Behavior-Aware Mobile Networking," in Proceedings IEEE Wireless Communications and Networking Conference (WCNC), March 2008.
[8] D. Kotz, T. Henderson, and I. Abyzov, "CRAWDAD Data Set Dartmouth/Campus, December 2004," Downloaded from
[9] U. Kumar, N. Yadav, A. Helmy, "Gender-based Feature Analysis in Campus-wide WLANS," ACM Mobile Computing and Communication Review, MC2R journal, Vol 12, Issue 1, pp. 40-42, 2008.
[10] U. Kumar, N. Yadav, A. Helmy, "POSTER: Analyzing Gender-gaps in Mobile Student Societies," CRAWDAD Workshop (colocated with ACM MOBICOM), Montreal, September, 2007.
[11] U. Kumar, N. Yadav, A. Helmy, "Gender-based Grouping of Mobile Student Societies," The International Workshop on Mobile Device and Urban Sensing (MODUS), IPSN workshop, St. Louis, MO, April, 2008.
[12] MobiLib, "Community-wide Library of Mobility and Wireless Networks Measurements (Investigating User Behavior in Wireless Environments),"

PAGE 28

[13] UNC/FORTH, "Repository of Traces and Models for Wireless Networks, Syslog Dataset #2,"

PAGE 29

BIOGRAPHICAL SKETCH

Udayan Kumar was born and brought up in Northern India. Udayan attended Dhirubhai Ambani Institute of Information and Technology, Gandhinagar, Gujarat, India, where he received a Bachelor of Technology degree in information and communication technology in 2005. In 2006, he entered the Computer and Information Science and Engineering graduate program at the University of Florida-Gainesville, where he worked as a research assistant and attended school full-time all the while completing the requirements for a Master of Science degree in computer engineering. His research interests include mobile social networks, embedded systems, and networking protocols.