Citation
Inference of User Behavior and Investigation of Privacy Issues in WLAN Trace Analysis

Material Information

Title:
Inference of User Behavior and Investigation of Privacy Issues in WLAN Trace Analysis
Creator:
Kumar, Udayan
Place of Publication:
[Gainesville, Fla.]
Publisher:
University of Florida
Publication Date:
Language:
English
Physical Description:
1 online resource (29 p.)

Thesis/Dissertation Information

Degree:
Master's (M.S.)
Degree Grantor:
University of Florida
Degree Disciplines:
Computer Engineering
Computer and Information Science and Engineering
Committee Chair:
Helmy, Ahmed H.
Committee Members:
Mishra, Prabhat
Thai, My Tra
Graduation Date:
8/9/2008

Subjects

Subjects / Keywords:
Apples ( jstor )
Databases ( jstor )
Fraternities ( jstor )
Information behavior ( jstor )
Narrative devices ( jstor )
School campuses ( jstor )
Social behavior ( jstor )
Sororities ( jstor )
Universities ( jstor )
Vendors ( jstor )
Computer and Information Science and Engineering -- Dissertations, Academic -- UF
gender, grouping, traces, wlan
Genre:
Electronic Thesis or Dissertation
born-digital ( sobekcm )
Computer Engineering thesis, M.S.

Notes

Abstract:
Science and society have long interacted with and changed each other. A scientific invention brings new challenges to society, and society reacts to it. Wireless networks are no exception. In this thesis, we present methodologies to classify WLAN (Wireless Local Area Network) users belonging to a large community, such as a university campus, into social groups such as gender and major. With a few examples we illustrate how WLAN user behavior and preferences can be studied. Similar studies have been done for technologies like the Internet, but so far no study has been conducted for WLANs. One probable reason is the unavailability of data about user groups. Our method overcomes this limitation by providing a novel way to obtain such data, and thus opens the door to further studies. The results of these studies help not only in understanding interactions between society and technology but also in improving technology and in providing specific services to those whose interests are unique. For example, we show that the gender of a WLAN user affects the preference for the brand of device. This can be used for better marketing of products and for understanding the bias. We also bring out the privacy concerns of using WLAN traces. We believe this work will open doors for further studies in understanding user behavior in wireless environments. ( en )
General Note:
In the series University of Florida Digital Collections.
General Note:
Includes vita.
Bibliography:
Includes bibliographical references.
Source of Description:
Description based on online resource; title from PDF title page.
Source of Description:
This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Thesis:
Thesis (M.S.)--University of Florida, 2008.
Local:
Adviser: Helmy, Ahmed H.
Statement of Responsibility:
by Udayan Kumar.

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
Copyright Kumar, Udayan. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Classification:
LD1780 2008 ( lcc )

Full Text

PAGE 4

page
LIST OF TABLES ..... 5
LIST OF FIGURES ..... 6
LIST OF SYMBOLS ..... 7
ABSTRACT ..... 8
CHAPTER
1 INTRODUCTION ..... 9
  1.1 Introduction ..... 9
  1.2 Contributions ..... 10
  1.3 Organization of the Work ..... 10
2 USER INFERENCES AND GROUPING ..... 11
  2.1 Challenges ..... 12
  2.2 Approach ..... 12
  2.3 Choice of Trace ..... 14
  2.4 Filtering ..... 15
  2.5 Verification of Filtering ..... 17
  2.6 Time Evolution ..... 17
3 USER BEHAVIOR EVALUATION AND RESULTS ..... 21
  3.1 WLAN Usage by Area ..... 21
  3.2 Average Session Duration ..... 22
  3.3 Manufacturer Preferences ..... 22
  3.4 Applications ..... 24
  3.5 Future Work ..... 24
  3.6 Conclusions ..... 25
REFERENCES ..... 27
BIOGRAPHICAL SKETCH ..... 29

PAGE 5

Table page
2-1 Similarity in the user population selected after filtering fraternity users ..... 17
2-2 Similarity in the user population selected after filtering sorority users ..... 20

PAGE 6

Figure page
2-1 Query based user grouping technique ..... 13
2-2 A sample trace database snapshot ..... 13
2-3 Gender grouping in fraternities and sororities ..... 14
2-4 Session count for fraternity and sorority users ..... 16
2-5 Fraternity users vs session plot at various trace and session duration cut-offs (time in seconds) ..... 18
2-6 Sorority users vs session plot at various trace and session duration cut-offs (time in seconds) ..... 19
3-1 Distribution of users across the campus ..... 22
3-2 Average duration of male and females in different Areas of the campus ..... 23
3-3 Device distribution by manufacturer ..... 24

PAGE 7

WLAN: Wireless Local Area Network (IEEE 802.11)
AP: Access Point used for supporting Infrastructural mode of IEEE 802.11
MAC: Medium Access Control address
Path: The time based order in which a user logs into APs.
Wired trace: A tcpdump or the netflow of the Ethernet usage.
Session: A session means an event in a WLAN trace, which starts with the association with the AP and ends with the disassociation with the AP.

PAGE 9

[2, 12]. The traces are anonymized and lack any information about the social context, attributes, affiliation or gender, and hence hide some potentially interesting characteristics of group behavior in mobile societies. Thus, it becomes challenging to mine user behavior from these traces. In this work, we present novel techniques which can be used to group users in social context (like affiliations). While researchers have been studying WLAN deployment issues [4], issues of mobility [5] and user association patterns [1, 6], we aim to address issues of

PAGE 10

In Chapter 2, we explain the techniques for grouping users with a case example of gender based grouping. We show how filtering can be done to remove visitors. In Chapter 3, we present statistical results showing differences in usage patterns of the two genders. We then identify possible applications of this kind of study. Conclusion and future work are presented at the end of this chapter.

PAGE 11

While researchers have been studying WLAN deployment issues [4], issues of mobility [5] and user association patterns [6][1], in this research we address the issues of user classification based on social grouping and analyze WLAN usage patterns based on gender, majors and other interest groups. This study allows us to examine the trends among different social groups. An insight into users' social behavior can facilitate the design of network protocols, such as delay tolerant networks (DTNs) and mobile social-networks. Incorporating users' social behavior has already improved mobility analysis [5] and user predictions in mobile networks. Understanding the social behavior of the user is important for future context aware services of mobile networks, which would require understanding of the context from the user's perspective.

We propose to use WLAN traces, which are generally considered for studying network characteristics, to mine the social behavior of the users. We present a general methodology with an example case study of grouping by gender, and investigate gender gaps in WLAN usage. The lack of such empirical data poses an interesting challenge and raises several research (and privacy) questions, such as: How can we meaningfully infer gender information from such anonymous traces? Does gender influence user behavior and preference in a significant and consistent manner?

In this work, we introduce a novel technique to mine WLAN usage patterns based on gender, majors and other interest groups. Some of the central ideas of our work include the usage of building maps for the WLAN traces, knowledge of locations of departments, fraternities and sororities, and the use of statistical methods to classify users in majors and genders and the cause and effect of such classification on network activity and preference. The method we provide can be used further to analyze behavior based on various other groupings.

PAGE 12

[3]. This work is the first [9, 11], to our knowledge, to analyze WLAN adoption patterns across these groups. Among the parameters we have considered for evaluating the gender gaps, we found enough statistical evidence to conclude that (for the traces in our study) the usage patterns of males and females are different, and that gender does indeed affect user activity and vendor preference. Our success also indicates that the problem of mobile user privacy should be revisited, a topic that we want to address in our future work.

[2][12]. Often, because of user privacy issues, the MAC addresses are anonymized. Having a meaningful classification with this partial information is the main challenge that we address in this work. Ideally, we would want to classify all students into groups. Taking a first step in this direction, we present a general technique which can be used to classify a smaller section of WLAN users into groups. Doing it for all the users still remains a challenge, as we shall see. Instead, we focus on obtaining a sample significant enough for a statistical analysis.

Figure 2-1. We also use the location information of the APs, in the form of the buildings in which they are located. This helps in identifying the geographic locations of a user at a later stage. Mobility of users can be tracked by looking at the approximate geographic locations of the APs. The processed data is fed into a database on which SQL

PAGE 13

Figure 2-1. Query based user grouping technique
Figure 2-2. A sample trace database snapshot

queries can be run easily (and generically) to extract information of interest to us. Figure 2-2 illustrates the trace database layout which was used in our experiment. The fields include the following:
1. MAC addresses of the wireless devices logged on to the WLAN,
2. the starting session time in seconds,
3. the AP with which the wireless device associated,
4. duration of the association with the AP,
5. the manufacturer (which can be inferred from the MAC address), and
6. the building in which the AP is located (approximately), which can be checked based on access point location information, which is external data to the actual traces.

Two-dimensional co-ordinates can be built into the database based on a campus grid map to allow mobility based queries to be performed as well. The trace database provides enough information on which to run SQL queries. For instance, a simple query returns the number of MACs logged into building a or b with durations

PAGE 14

Figure 2-3. Gender grouping in fraternities and sororities

within a certain range. We have used this same database framework to analyze traces from USC [12], Dartmouth [2], UF and UNC [13]; the method is general and applicable to many traces, campuses and societies. Completing these analyses is part of our future work.

The grouping parameter we use in this document for investigation is gender. To do this categorization, we propose the following novel technique. Most universities have sororities and fraternities as social organizations. Sororities are female organizations while fraternities represent male organizations. Given the physical location of APs on campus, APs located in sororities and fraternities are identified, and the users associated with them are classified as female or male. Figure 2-3 shows how grouping is done in this setting. The fact that visitors may frequent these locations also needs to be taken into account. We deal with visitors in the filtering section.
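The trace database and the location based grouping described above can be sketched as follows. This is a minimal illustration, not the code used in the thesis: the table name `trace`, the column names, the building names and the sample rows are all hypothetical, and the rule that a MAC seen at both kinds of houses is left unlabeled is an assumption made here for simplicity (the text handles visitors separately through filtering).

```python
import sqlite3

# Hypothetical schema mirroring the six fields listed in the text.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE trace (
           mac TEXT, start_time INTEGER, ap TEXT,
           duration INTEGER, manufacturer TEXT, building TEXT)"""
)
# Made-up sample sessions (anonymized MACs, durations in seconds).
rows = [
    ("aa:01", 0, "ap1", 3600, "Apple", "frat_house_1"),
    ("aa:01", 7200, "ap7", 600, "Apple", "library"),
    ("aa:02", 100, "ap2", 1800, "Intel", "sorority_house_1"),
    ("aa:03", 200, "ap7", 300, "Intel", "library"),
    ("aa:04", 300, "ap1", 50, "Intel", "frat_house_1"),
    ("aa:04", 900, "ap2", 40, "Intel", "sorority_house_1"),
]
conn.executemany("INSERT INTO trace VALUES (?, ?, ?, ?, ?, ?)", rows)


def macs_in_buildings(conn, buildings, min_duration, max_duration):
    """Number of distinct MACs seen in any of the buildings with a
    session duration inside [min_duration, max_duration]."""
    placeholders = ",".join("?" for _ in buildings)
    sql = (
        "SELECT COUNT(DISTINCT mac) FROM trace "
        f"WHERE building IN ({placeholders}) AND duration BETWEEN ? AND ?"
    )
    return conn.execute(sql, (*buildings, min_duration, max_duration)).fetchone()[0]


def label_by_building(conn, fraternity_buildings, sorority_buildings):
    """Label a MAC 'male' if it associated with an AP in a fraternity and
    'female' if in a sorority. MACs seen in both are dropped (assumption)."""
    seen = {}
    for mac, building in conn.execute("SELECT DISTINCT mac, building FROM trace"):
        if building in fraternity_buildings:
            seen.setdefault(mac, set()).add("male")
        elif building in sorority_buildings:
            seen.setdefault(mac, set()).add("female")
    return {mac: next(iter(tags)) for mac, tags in seen.items() if len(tags) == 1}


count = macs_in_buildings(conn, ["frat_house_1", "library"], 300, 4000)
labels = label_by_building(conn, {"frat_house_1"}, {"sorority_house_1"})
```

Keeping the AP-to-building mapping as a plain column lets the same query shape serve other groupings (for example, department buildings for majors) by changing only the building lists.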

PAGE 15

[12], UNC [13] and Dartmouth [2]. Dartmouth traces do not provide AP-to-building mapping, which makes it difficult to do this kind of study. UNC traces, on the other hand, have a limited number of APs in sororities and fraternities. We chose the USC traces for our study, as 12 fraternities and 7 sororities are included in the WLAN traces and the AP-to-building mapping is also available. We have chosen 3 months for the study from three different semesters: Feb 2006, Oct 2006 and Feb 2007. The reason for having traces from multiple periods is to look at consistency in the results and also at the trends. Traces have been taken from different semesters in order to check and verify the semester effect in the results.

Figure 2-4 represents session counts per MAC address in decreasing order. Figure 2-4 presents the graphs that are produced using the average session duration (in sororities and fraternities, respectively) as the threshold for session duration. We observe an interesting distinct characteristic in Figure 2-4: the presence of a sharp bend (knee) as the number of sessions per MAC address decreases. Intuitively, this means that MAC addresses below the knee have an order of magnitude fewer sessions

PAGE 16

Figure 2-4. Session count for fraternity and sorority users. A) Feb 2006. B) Oct 2006. C) Feb 2007.
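The filtering idea around Figure 2-4 (rank MAC addresses by session count and cut at the knee) can be sketched as below. This is only one possible heuristic under stated assumptions, not the procedure used in the thesis: sessions shorter than the average session duration are ignored before counting, and the knee is taken to be the largest drop in log10(session count) between neighbouring ranks. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def ranked_session_counts(sessions, min_duration):
    """Sessions per MAC in decreasing order, ignoring sessions shorter
    than min_duration. `sessions` is a list of (mac, duration) pairs."""
    counts = Counter(mac for mac, duration in sessions if duration >= min_duration)
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)


def knee_index(ranked):
    """Index of the last MAC above the knee: the position followed by the
    largest drop in log10(count). Needs at least two ranked MACs."""
    values = [count for _, count in ranked]
    drops = [
        math.log10(values[i]) - math.log10(values[i + 1])
        for i in range(len(values) - 1)
    ]
    return max(range(len(drops)), key=drops.__getitem__)


def regular_users(sessions):
    """MACs above the knee, using the average duration as the threshold."""
    average = sum(duration for _, duration in sessions) / len(sessions)
    ranked = ranked_session_counts(sessions, average)
    return [mac for mac, _ in ranked[: knee_index(ranked) + 1]]


# Toy data: three residents with many long sessions, three visitors with few.
toy_sessions = (
    [("r1", 3600)] * 40 + [("r2", 3600)] * 35 + [("r3", 3600)] * 30
    + [("v1", 3600)] * 3 + [("v2", 3600)] * 2 + [("v3", 3600)] * 1
    + [("v1", 60)] * 20 + [("v2", 60)] * 20
)
residents = regular_users(toy_sessions)
```

On real traces the curve is noisier than in this toy case, so a smoothed slope or a manually chosen cut-off may be more robust than the single largest drop.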

PAGE 17

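Table 2-1 and Table 2-2 show the results we obtain for both fraternity and sorority users. We see that for fraternities, before filtering, the percentage of common MACs in two consecutive months is around 56 to 61 percent, and after filtering it goes up to between 73 and 80 percent. In the case of sororities, before filtering we see that common users are between 61 and 65 percent, and after filtering the percentage of common users shoots up to 88 to 90 percent. This shows that filtering is selecting regular users, as the percentage of common users rises after filtering.

Table 2-1. Similarity in the user population selected after filtering fraternity users

Before filtering
First period   Second period   Users (first)   Users (second)   Common users   Common (%)
Feb 2006       Mar-Apr 2006    1350            1441             816            56.63
Oct 2006       Nov 2006        1520            1572             969            61.64
Feb 2007       Mar-Apr 2007    1692            1875             1050           56

After filtering
First period   Second period   Users (first)   Users (second)   Common users   Common (%)
Feb 2006       Mar-Apr 2006    473             463              378            79.92
Oct 2006       Nov 2006        474             445              371            78.27
Feb 2007       Mar-Apr 2007    446             482              354            73.44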
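The similarity figures in Table 2-1 can be checked with a few lines of arithmetic. The sketch below assumes that the percentage is the number of common MACs divided by the larger of the two user populations; this divisor is inferred from the reported numbers (it reproduces all six fraternity rows) and is not stated in the text. The function names are hypothetical.

```python
def percent_common_from_counts(n_first, n_second, n_common):
    """Common users as a percentage of the larger population (inferred rule)."""
    return 100.0 * n_common / max(n_first, n_second)


def percent_common(users_first, users_second):
    """Same measure computed directly from two collections of MAC addresses."""
    first, second = set(users_first), set(users_second)
    return percent_common_from_counts(len(first), len(second), len(first & second))


# (users in first period, users in second period, common users, reported %)
# taken from the fraternity rows of Table 2-1.
table_2_1 = [
    (1350, 1441, 816, 56.63),   # before filtering
    (1520, 1572, 969, 61.64),
    (1692, 1875, 1050, 56.0),
    (473, 463, 378, 79.92),     # after filtering
    (474, 445, 371, 78.27),
    (446, 482, 354, 73.44),
]
recomputed = [round(percent_common_from_counts(a, b, c), 2) for a, b, c, _ in table_2_1]
```

A symmetric measure such as this one penalizes both users who leave and users who newly appear, which fits the goal of checking that filtering keeps a stable set of regular users.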

PAGE 18

Figure 2-5. Fraternity users vs session plot at various trace and session duration cut-offs (time in seconds). A) 3 Days. B) 4 Days. C) 5 Days. D) 6 Days. E) 7 Days. F) 14 Days.

PAGE 19

Figure 2-6. Sorority users vs session plot at various trace and session duration cut-offs (time in seconds). A) 3 days. B) 4 days. C) 5 days. D) 6 days. E) 7 days. F) 14 days.

PAGE 20

Table 2-2. Similarity in the user population selected after filtering sorority users

Before filtering
First period   Second period   Users (first)   Users (second)   Common users   Common (%)
Feb 2006       Mar-Apr 2006    991             1155             717            62.08
Oct 2006       Nov 2006        1264            1305             844            64.67
Feb 2007       Mar-Apr 2007    1169            1327             821            61.87

After filtering
First period   Second period   Users (first)   Users (second)   Common users   Common (%)
Feb 2006       Mar-Apr 2006    463             474              429            90.51
Oct 2006       Nov 2006        493             456              432            87.63
Feb 2007       Mar-Apr 2007    439             458              405            88.43

(in terms of days) required to do the filtering, which essentially means the visibility of the knee feature [10]. This knee feature can be identified by a sharp change in the slope of the curve. We can see in Figure 2-5 and Figure 2-6 that this knee starts to appear from day 4 onwards for both sorority users and fraternity users. For this study, we have used the trace belonging to the month of Feb 2006. This indicates the suitability of our analysis to traces of shorter duration in similar environments. In the following chapter we present behavior analysis of users based on the classification done in this chapter.

PAGE 21

(a) WLAN usage and gender distribution by area: What are the trends in WLAN usage across different (building) areas on campus?
(b) Average session duration: Are there trends in the average on-line times of users, and can differences be spotted based on gender and (building) areas within the campus?
(c) Manufacturer preferences: Which device vendors do different genders prefer?

Figure 3-1 shows the usage distribution per area type based on our definition of regular user. Economics buildings show a higher population of male users, while social science buildings have a higher count of female users. It is interesting to see that female WLAN users outnumber male users in Engineering buildings for the sample Feb 2006; however, male users take the lead in Feb 2007. We see that the absolute number of students classified as male and female increases in Oct 2006 and then drops in Feb 2007. This may be attributed to the fact that more students join in Fall semesters than in any other period of the year and many students graduate after the Fall semesters. It also indicates that several courses offered in the Fall may require the use of laptops (and the wireless network) for the course work.
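Questions (a) and (b) amount to grouping sessions by area type and gender. The sketch below shows one way to compute, for each (area, gender) pair, the number of distinct users and the mean session duration. It is an illustration under assumptions, not the thesis's code: the mappings `gender_of` (MAC to gender label from the grouping step) and `area_of` (building to area type), the function name and the sample data are hypothetical.

```python
from collections import defaultdict


def usage_by_area(sessions, gender_of, area_of):
    """Return {(area, gender): (distinct users, mean session duration)}.

    `sessions` holds (mac, building, duration_seconds) tuples. Sessions whose
    MAC has no gender label or whose building has no area type are skipped.
    """
    users = defaultdict(set)
    total_duration = defaultdict(float)
    session_count = defaultdict(int)
    for mac, building, duration in sessions:
        gender = gender_of.get(mac)
        area = area_of.get(building)
        if gender is None or area is None:
            continue
        key = (area, gender)
        users[key].add(mac)
        total_duration[key] += duration
        session_count[key] += 1
    return {
        key: (len(users[key]), total_duration[key] / session_count[key])
        for key in users
    }


# Made-up example data.
toy_sessions = [
    ("m1", "eng_hall", 600),
    ("m1", "eng_hall", 1200),
    ("f1", "eng_hall", 300),
    ("f1", "soc_hall", 900),
    ("x9", "eng_hall", 100),   # unlabeled MAC, ignored
]
toy_gender = {"m1": "male", "f1": "female"}
toy_area = {"eng_hall": "Engineering", "soc_hall": "Social Science"}
summary = usage_by_area(toy_sessions, toy_gender, toy_area)
```

Running the same function on each monthly trace gives the per-period values needed to compare trends across semesters.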

PAGE 22

Figure 3-1. Distribution of users across the campus

In Figure 3-2 we observe that males spend more time on-line than females in most of the areas. Females show dominant usage in the Social Science, Economics and Medicine areas across campus. We see that the average on-line time of males is decreasing over time and that of females is increasing. In Feb 2006, males have more average on-line time than females in most of the buildings; however, in Feb 2007, a year later, we see that males have less average on-line time than females in almost all the buildings. Are males getting more mobile? Another observation of interest is that the average duration per session decreases from Feb 2006 to Feb 2007 in almost all the cases (Engineering, residence, social, sports, music). This indicates a possibility that students are becoming more mobile, and thus having shorter sessions in the same location.

Figure 3-3. It is interesting to note that while Apple computers are more popular

PAGE 23

Figure 3-2. Average duration of male and females in different Areas of the campus

amongst females, males prefer Intel computers. For this study, only major vendors were considered. For example, using the Feb 2006 trace, we find that: in the case of males, 25% use Apple and 32% use Intel, so there are 28% more male users using Intel than Apple. In the case of females, 30% use Apple and 27% use Intel, which indicates that 12% more female users use Apple than Intel. To test whether gender provides a bias towards specific vendors, we use the Chi-Square test of statistical significance. The Chi-Square test shows with 90% confidence that there is a bias between gender and vendor/brand.

Another interesting observation we make from Figure 3-3 is the consistent trend of increasing percentage of Apple computer usage in both genders. We also see that vendors like Enterasys, Linksys and Askey Corp. show a decreasing trend in terms of percentage of users. One of the reasons is that these manufacturers mostly make external Wi-Fi devices for old laptops (with no built-in Wi-Fi NICs), and currently almost all new laptops come with built-in Wi-Fi, shrinking the share of users of external devices.
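The vendor comparison above can be illustrated with a small worked example of the relative difference and of a Chi-Square test of independence. The counts below are hypothetical: they are obtained by applying the quoted Feb 2006 percentages to assumed group sizes and restricting to two vendors, so the statistic is not the one computed in the thesis, which considered more vendors. The critical value 2.706 is the standard Chi-Square threshold for 1 degree of freedom at the 10% significance level (90% confidence).

```python
def relative_excess(larger_share, smaller_share):
    """How much larger one share is than another, as a fraction."""
    return larger_share / smaller_share - 1.0


def chi_square(table):
    """Pearson Chi-Square statistic and degrees of freedom for a
    contingency table given as a list of rows of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Males: 32% Intel vs 25% Apple, as quoted in the text.
male_intel_excess = relative_excess(0.32, 0.25)

# Hypothetical counts (rows: male, female; columns: Apple, Intel).
observed = [
    [118, 151],
    [139, 125],
]
CRITICAL_90_DF1 = 2.706  # Chi-Square critical value, 1 dof, 10% level
statistic, dof = chi_square(observed)
gender_vendor_dependent = statistic > CRITICAL_90_DF1
```

With these illustrative counts the statistic exceeds the 90% threshold, which is the kind of outcome the text reports; real conclusions would require the actual per-vendor counts from the traces.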

PAGE 24

Figure 3-3. Device distribution by manufacturer

[7].

PAGE 27

[1] G. Chen, H. Huang, and M. Kim, "Mining Frequent and Periodic Association Patterns," Dartmouth College Computer Science Technical Report TR2005-550, July 2005.
[2] CRAWDAD, "Community Resource for Archiving Wireless Data At Dartmouth,"
[3] R. R. Dholakia et al., "Gender and Internet Usage," The Internet Encyclopedia, Wiley, 2003.
[4] T. Henderson, D. Kotz, and I. Abyzov, "The Changing Usage of a Mature Campus-wide Wireless Network," in Proceedings of ACM MobiCom 2004, September 2004.
[5] W. Hsu, T. Spyropoulos, K. Psounis, and A. Helmy, "Modeling Time-Variant User Mobility in Wireless Mobile Networks," in Proceedings of IEEE INFOCOM, May 2007.
[6] W. Hsu and A. Helmy, "On Modeling User Associations in Wireless LAN Traces on University Campuses," in Proceedings of The Second International Workshop on Wireless Network Measurement (WiNMee), April 2006.
[7] W. Hsu, D. Dutta, and A. Helmy, "Profile-Cast: Behavior-Aware Mobile Networking," in Proceedings IEEE Wireless Communications and Networking Conference (WCNC), March 2008.
[8] D. Kotz, T. Henderson, and I. Abyzov, "CRAWDAD Data Set Dartmouth/Campus, December 2004." Downloaded from
[9] U. Kumar, N. Yadav, and A. Helmy, "Gender-based Feature Analysis in Campus-wide WLANs," ACM Mobile Computing and Communication Review, MC2R journal, Vol. 12, Issue 1, pp. 40-42, 2008.
[10] U. Kumar, N. Yadav, and A. Helmy, "POSTER: Analyzing Gender-gaps in Mobile Student Societies," CRAWDAD Workshop (colocated with ACM MOBICOM), Montreal, September 2007.
[11] U. Kumar, N. Yadav, and A. Helmy, "Gender-based Grouping of Mobile Student Societies," The International Workshop on Mobile Device and Urban Sensing (MODUS), IPSN workshop, St. Louis, MO, April 2008.
[12] MobiLib, "Community-wide Library of Mobility and Wireless Networks Measurements (Investigating User Behavior in Wireless Environments),"

PAGE 28

[13] UNC/FORTH, "Repository of Traces and Models for Wireless Networks, Syslog Dataset #2,"

PAGE 29

Udayan Kumar was born and brought up in Northern India. Udayan attended Dhirubhai Ambani Institute of Information and Technology, Gandhinagar, Gujarat, India, where he received a Bachelor of Technology degree in information and communication technology in 2005. In 2006, he entered the Computer and Information Science and Engineering graduate program at the University of Florida, Gainesville, where he worked as a research assistant and attended school full-time, all the while completing the requirements for a Master of Science degree in computer engineering. His research interests include mobile social networks, embedded systems, and networking protocols.