
Online Adaptive Appearance Models for Robust Visual Tracking

Permanent Link: http://ufdc.ufl.edu/UFE0043750/00001

Material Information

Title: Online Adaptive Appearance Models for Robust Visual Tracking
Physical Description: 1 online resource (109 p.)
Language: English
Creator: Nejhum, S M Shahed
Publisher: University of Florida
Place of Publication: Gainesville, Fla.
Publication Date: 2011

Subjects

Subjects / Keywords: appearance -- gpu -- models -- tracking -- visual
Computer and Information Science and Engineering -- Dissertations, Academic -- UF
Genre: Computer Engineering thesis, Ph.D.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, territorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract: Robust tracking of visual targets is a very challenging task in the field of computer vision. The target has to be reliably modeled and the model needs to be updated according to the target's appearance and shape variations over time. Visual tracking algorithms available in the literature do not fully explore mid-level image cues. This dissertation presents visual tracking algorithms where mid-level image cues are used efficiently and effectively to model the target. The first algorithm tracks articulated objects by constantly modeling the changing target shape by a small number of rectangular blocks whose positions are updated accordingly. To improve the tracking speed a modified algorithm processes the computationally extensive steps in parallel using a GPU. Both algorithms are evaluated on several videos of articulated targets undergoing significant shape variations. We compare the results with the mean shift tracker and the histogram-based tracker. Our algorithms consistently outperform these algorithms and produce robust tracking results. We present a novel technique to generate coherent superpixels from a pair of successive video frames. We show that the similarity of corresponding superpixels can be increased by generating superpixels jointly from the images. We present a visual tracking algorithm that uses a novel superpixel-based appearance model. The model is continuously updated to handle variations of the target. To evaluate the performance of the tracker, we report experimental results on several publicly available challenging sequences. We show that our superpixel-based visual tracker produces improved performance over recently published state-of-the-art tracking algorithms.
General Note: In the series University of Florida Digital Collections.
General Note: Includes vita.
Bibliography: Includes bibliographical references.
Source of Description: Description based on online resource; title from PDF title page.
Source of Description: This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility: by S M Shahed Nejhum.
Thesis: Thesis (Ph.D.)--University of Florida, 2011.
Local: Adviser: Ho, Jeffrey.

Record Information

Source Institution: UFRGP
Rights Management: Applicable rights reserved.
Classification: lcc - LD1780 2011
System ID: UFE0043750:00001


Full Text

PAGE 1

ONLINE ADAPTIVE APPEARANCE MODELS FOR ROBUST VISUAL TRACKING

By

S. M. SHAHED NEJHUM

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2011

PAGE 2

© 2011 S. M. Shahed Nejhum

PAGE 3

To my Parents for all their love and support

PAGE 4

ACKNOWLEDGMENTS

I am extremely grateful to Dr. Jeffrey Ho for his continuous guidance and support during my graduate studies. He has been a constant source of inspiration and encouragement for me. I am also thankful to Dr. Baba C. Vemuri, Dr. Anand Rangarajan, Dr. Jorg Peters and Dr. Qun Gu for being on my supervisory committee and providing extremely useful insights into the work presented in this dissertation. I would like to thank the Department of Computer and Information Science and Engineering (CISE) and the University of Florida (UF) for giving me the opportunity to pursue my graduate studies in a very constructive environment. During my graduate studies, I enjoyed my job as a teaching and research assistant. I want to thank Dr. Ming-Hsuan Yang for his advice and support in my research work. I am thankful to all of my current and former lab-mates Muhammad Rushdi, Mohsen Ali, Jason Yu-Tseh Chi, Venkatakrishnan Ramaswamy, Karthik S. Gurumoorthy, Manu Sethi, Subhajit Sengupta, Nathan VanderKraats, Leila Kalantari and Shaoyu Qi. I have always found them whenever I needed them. Lastly and most importantly, I am thankful to my family, for their unconditional love and support.

PAGE 5

TABLE OF CONTENTS

page

ACKNOWLEDGMENTS ..... 4
LIST OF TABLES ..... 7
LIST OF FIGURES ..... 8
ABSTRACT ..... 10

CHAPTER

1 VISUAL TRACKING ..... 12
    1.1 Introduction ..... 12
    1.2 Motivation and Objectives ..... 12
    1.3 Proposed Algorithms ..... 13
    1.4 Organization ..... 15
2 VISUAL TRACKING USING ARTICULATING BLOCKS ..... 16
    2.1 Introduction ..... 16
    2.2 Related Work ..... 18
    2.3 Tracking Algorithm ..... 20
        2.3.1 Detection ..... 22
        2.3.2 Scaling ..... 23
        2.3.3 Refinement ..... 24
        2.3.4 Update ..... 25
        2.3.5 Discussion ..... 26
    2.4 Experiments and Results ..... 27
        2.4.1 Qualitative Results ..... 28
        2.4.2 Quantitative Performance ..... 29
    2.5 Discussion ..... 30
3 SIMULTANEOUS MULTI-FRAME TRACKING FOR ARTICULATED TARGETS ..... 45
    3.1 Introduction ..... 45
    3.2 Graphics Processing Units ..... 46
        3.2.1 Overview of GPU ..... 46
        3.2.2 Application of GPU in Computer Vision ..... 46
    3.3 CUDA Architecture and Programming Framework ..... 47
        3.3.1 CUDA Hardware Architecture ..... 47
        3.3.2 CUDA Programming Framework ..... 48
    3.4 Algorithm Description ..... 50
    3.5 Implementation Details ..... 51
    3.6 Experimental Results ..... 53

PAGE 6

        3.6.1 Qualitative Results ..... 54
        3.6.2 Quantitative Performance ..... 54
        3.6.3 Computational Performance ..... 54
    3.7 Discussion ..... 56
4 CO-SUPERPIXELS: GENERATING CONSISTENT SUPERPIXELS ACROSS IMAGE FRAMES ..... 66
    4.1 Introduction ..... 66
    4.2 Superpixel Generation from a Single Image ..... 67
    4.3 Consistent Superpixel Generation ..... 68
    4.4 Co-superpixel Generation Framework ..... 70
        4.4.1 Superpixel Similarity Metrics ..... 72
        4.4.2 Iterative Energy Minimization ..... 72
    4.5 Experimental Results ..... 74
5 TRACKING USING SUPERPIXEL-BASED APPEARANCE MODEL ..... 82
    5.1 Introduction ..... 82
    5.2 Related Work ..... 83
    5.3 The Appearance Model ..... 85
    5.4 Tracking Algorithm ..... 87
        5.4.1 Target Localization ..... 87
        5.4.2 Occlusion Detection ..... 89
        5.4.3 Model Update ..... 89
        5.4.4 Scaling ..... 90
    5.5 Experimental Results ..... 91
        5.5.1 Experimental Setup ..... 91
        5.5.2 Results and Discussion ..... 92
6 CONCLUSION ..... 99

REFERENCES ..... 101
BIOGRAPHICAL SKETCH ..... 109

PAGE 7

LIST OF TABLES

Table  page

2-1 Center location errors of the integral histogram-based tracker, the mean-shift tracker and the proposed BHT ..... 32
2-2 Coverage errors of the integral histogram-based tracker, the mean-shift tracker and the proposed BHT ..... 32
2-3 Center location errors of our tracker with the fixed and the adaptive scaling window ..... 32
2-4 Coverage errors of our tracker with the fixed and the adaptive scaling window ..... 32
2-5 RMS errors in tracking window size ..... 32
3-1 Center location errors of the integral histogram-based tracker, the mean-shift tracker, the BHT and the p-BHT ..... 57
3-2 Coverage errors of the integral histogram-based tracker, the mean-shift tracker, the BHT and the p-BHT ..... 57
3-3 Performance analysis of different sub-steps in the multi-frame detection step ..... 57
3-4 Computational performance of the p-BHT algorithm ..... 58
4-1 Superpixel similarity at the initial and final iterations of our iterative joint superpixel generation scheme ..... 75
5-1 Quantitative evaluation of the SSP tracking algorithm ..... 93

PAGE 8

LIST OF FIGURES

Figure  page

2-1 Motivation of our block-based tracking algorithm ..... 33
2-2 High-level outline of the BHT algorithm ..... 33
2-3 Initialization of the BHT algorithm ..... 34
2-4 Examples of segmentation within the tracking window ..... 34
2-5 Tracking results using fixed and adaptive scaling window ..... 35
2-6 Overview of the block configuration update process ..... 35
2-7 Visual comparison of tracking results of the Female Skater sequence ..... 36
2-8 Visual comparison of tracking results of the Dancer sequence ..... 36
2-9 Visual comparison of tracking results of the Male Skater sequence ..... 37
2-10 Examples of ground truth windows for articulated targets ..... 38
2-11 Visual comparison of tracking results of the Indian Dancer sequence ..... 38
2-12 Tracking results of the cartoon sequence using the BHT algorithm ..... 38
2-13 Tracking occluded target ..... 39
2-14 Tracking with a cluttered background ..... 39
2-15 Tracking of a wildlife target using the BHT algorithm ..... 39
2-16 Tracking results of the BHT algorithm with fixed and adaptive scale ..... 40
2-17 Per-frame center location error plots using different tracking algorithms ..... 41
2-18 Per-frame coverage error plots using different tracking algorithms ..... 42
2-19 Comparison between our fixed and adaptive scaling window based tracker ..... 43
2-20 Results using the BHT algorithm with fixed and adaptive scaling ..... 44
3-1 Comparison of computational speed between the p-BHT and the BHT algorithm ..... 59
3-2 Block diagram of the CUDA Hardware Architecture ..... 59
3-3 Block diagram of the CUDA Programming Model and Memory Model ..... 60
3-4 Block diagram of the p-BHT algorithm ..... 60
3-5 Block diagram of the CPU-GPU implementation of the p-BHT algorithm ..... 61

PAGE 9

3-6 Visual comparison of tracking results of the Female Skater Sequence ..... 61
3-7 Visual comparison of tracking results of the Male Skater Sequence ..... 62
3-8 Visual comparison of tracking results of the Indian Dancer Sequence ..... 62
3-9 Visual comparison of tracking results of the Dancer Sequence ..... 63
3-10 Per-frame center location error plots using different tracking algorithms ..... 64
3-11 Per-frame coverage error plots using different tracking algorithms ..... 65
4-1 Consistent superpixel generation for a pair of images ..... 76
4-2 Steps for computing the shape similarity between a pair of superpixels ..... 77
4-3 Improvement in the average superpixel similarity score ..... 78
4-4 Superpixel generation results for the Sponza sequence ..... 79
4-5 Superpixel generation results for the Wooden sequence ..... 79
4-6 Superpixel generation results for the Urban3 sequence ..... 80
4-7 Superpixel generation results for the Mequon sequence ..... 80
4-8 Superpixel generation results for the Urban2 sequence ..... 81
4-9 Superpixel generation results for the Army sequence ..... 81
5-1 Construction of the superpixel-based appearance model ..... 94
5-2 High-level outline of the SSP tracking algorithm ..... 94
5-3 Tracking results using the SSP tracking algorithm ..... 95
5-4 Tracking results using the SSP tracking algorithm on PROST sequences ..... 96
5-5 Visual comparison of tracking results obtained using different algorithms ..... 96
5-6 Visual comparison of tracking results obtained using different algorithms on PROST sequences ..... 97
5-7 Per-frame tracking error plots using different tracking algorithms ..... 98

PAGE 10

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

ONLINE ADAPTIVE APPEARANCE MODELS FOR ROBUST VISUAL TRACKING

By

S. M. Shahed Nejhum

December 2011

Chair: Jeffrey Ho
Major: Computer Engineering

Robust tracking of visual targets is a very challenging task in the field of computer vision. The target has to be reliably modeled and the model needs to be updated according to the target's appearance and shape variations over time. Visual tracking algorithms available in the literature do not fully explore mid-level image cues. This dissertation presents visual tracking algorithms where mid-level image cues are used efficiently and effectively to model the target.

The first algorithm tracks articulated objects by constantly modeling the changing target shape by a small number of rectangular blocks whose positions are updated accordingly. To improve the tracking speed a modified algorithm processes the computationally extensive steps in parallel using a GPU. Both algorithms are evaluated on several videos of articulated targets undergoing significant shape variations. We compare the results with the mean shift [1] tracker and the histogram-based tracker [2]. Our algorithms consistently outperform these algorithms [1 2] and produce robust tracking results.

We present a novel technique to generate coherent superpixels from a pair of successive video frames. We show that the similarity of corresponding superpixels can be increased by generating superpixels jointly from the images. We present a visual tracking algorithm that uses a novel superpixel-based appearance model. The model is continuously updated to handle variations of the target. To evaluate the performance

PAGE 11

of the tracker, we report experimental results on several publicly available challenging sequences. We show that our superpixel-based visual tracker produces improved performance over recently published state-of-the-art tracking algorithms [3 5].

PAGE 12

CHAPTER 1
VISUAL TRACKING

1.1 Introduction

While human vision can easily locate a target undergoing unknown motion in video, developing an automatic vision system for accurate tracking is still very difficult even after decades of research. The success of making a robust tracker depends on reliable modeling of the target and the flexibility to update the model according to its variations over time. The focus of this dissertation is to develop visual tracking algorithms using efficient modeling of the target's shape and appearance. We propose simple, flexible and generic models for robust tracking of visual targets.

The need for robust tracking appears in a myriad of applications, across multiple disciplines. Tracking is the main component in visual surveillance systems [6 7]. In autonomous traffic surveillance systems, tracking the vehicles in real time and predicting their motions is very important. For medical data analysis, non-rigid structures have been tracked using deformable templates [8]. Autonomous robots track objects around them in order to navigate in 3D environments. Visual tracking algorithms are also used in human-computer interactions [9], computer games and animation [10], activity and event analysis (e.g. gesture tracking [11]), multimedia applications (e.g. face and people tracking for video conferencing [12]) and in many other applications.

1.2 Motivation and Objectives

Developing an accurate, efficient and robust visual tracker has always been challenging, and the task becomes even more difficult when the target is expected to undergo significant and rapid variations in shape as well as appearance. In fact, the goals of achieving both efficiency and robustness are somewhat conflicting. A simple representation of the target makes the tracker efficient, but its capability of handling significant shape, appearance and scale variations and tracking the target under severe occlusion becomes limited. These difficulties motivate us to design efficient

PAGE 13

representations of visual targets and track them robustly. In particular, we use mid-level image cues to design the appearance of the target.

In this dissertation, we discuss algorithms for articulated object tracking where the target continuously changes its shape. We present a block-based appearance model in which the target is represented using a set of rectangular blocks. As well, we introduce an algorithm for tracking non-articulated targets. These targets often change their appearances and sizes, and undergo partial occlusions. In the algorithm, the target is represented using a novel superpixel-based appearance model. The appearance models are updated adaptively in our algorithms to handle the variations of the target. Brief descriptions of these algorithms are outlined in the next section.

1.3 Proposed Algorithms

We present two novel algorithms for tracking articulated objects. In the first algorithm, the target is efficiently modeled using a small number of rectangular blocks. These blocks are represented by intensity histograms, and we refer to this algorithm as the Block Histogram Tracking (BHT) algorithm. The sizes and locations of these blocks efficiently encode object shape and appearance. Our algorithm adaptively updates the block locations according to the target articulation over time. The algorithm makes efficient use of integral image histograms to rapidly search the target over the entire frame. This gives the algorithm the ability to track targets undergoing rapid motion. The BHT algorithm consists of detection, refinement and update steps. The tracker iteratively loops through these steps for visual tracking. Under the general assumption of stationary foreground appearance, we show that robust object tracking is possible by adaptively adjusting the locations of these blocks. We have implemented the BHT algorithm in MATLAB and the tracker runs at 8-10 fps on a 2.66 GHz machine.

The second algorithm is a modified version of the BHT algorithm. Instead of updating the block configuration at every frame, the modified algorithm computes the block configuration regularly after several frames, i.e., multiple detection steps are

PAGE 14

followed by a refinement and an update step. These detection steps can be processed in parallel, and this makes the algorithm suitable for a GPU-based implementation. We refer to this version as parallel BHT or p-BHT. Moreover, the detection step of the BHT algorithm involves evaluating a large number of candidate windows over the image frame, and the evaluations of these windows are independent of each other. We also exploit this in the GPU-based implementation. The p-BHT algorithm has been implemented on a GPU using NVIDIA's CUDA programming model and the implementation runs at 55-80 fps.

This dissertation also discusses a novel algorithm for non-articulated object tracking. Unlike previous algorithms in which the articulated target is modeled using a set of blocks, we use a set of small and homogeneously textured regions. These small and homogeneously textured regions are known as superpixels. Most of the existing superpixel-based tracking algorithms generate superpixels independently without enforcing any consistency measurements across the images. To address this issue, we have developed a novel approach for consistent superpixel generation from consecutive video frames. A novel energy function is formulated that incorporates superpixel shape and appearance similarity measures. We develop an iterative algorithm to minimize our proposed energy function and generate consistent superpixels.

Finally, we present the details of the tracking algorithm. The algorithm robustly handles occlusions, complex background, scale and appearance variations of non-articulated targets. We introduce a superpixel-based appearance model for visual tracking. From the initial tracking window, we extract superpixels and compute their histogram features. These superpixel features are organized using a grid; hence we refer to this algorithm as the structured superpixel (SSP) tracking algorithm. In subsequent frames, we search for the region that maximizes the similarity of these organized superpixel features. Our algorithm detects target occlusion, and updates the appearance model accordingly. As well, the appearance model is updated to handle large-scale variations.

PAGE 15

1.4 Organization

The remainder of the dissertation is organized as follows. The BHT algorithm is presented in Chapter 2. In Chapter 3, we discuss the GPU-based implementation of the p-BHT algorithm. The consistent superpixel generation framework from similar images is described in Chapter 4. The SSP tracking algorithm is described in Chapter 5. Finally, we summarize our contributions and conclude this dissertation in Chapter 6.

PAGE 16

CHAPTER 2
VISUAL TRACKING USING ARTICULATING BLOCKS

2.1 Introduction

One of the simplest ways to model the target appearance is to compute the intensity histogram of the target region, and many tracking algorithms based on this idea are available in the literature (e.g., [1 13]). Using a rectangular bounding box for the target, the intensity histogram can be computed efficiently from the integral image [14], and the entire image can be scanned rapidly to locate the target for visual tracking. However, computing the intensity histogram from a region bounded by some irregular shape cannot be done efficiently using the integral image. One of the challenges in visual tracking is to robustly handle shape variations of the target. In the context of histogram based tracking, one general idea is to use a (circular or elliptical) kernel [15 16] to define a region around the target from which a weighted histogram can be computed. The kernel imposes a regularity constraint on the irregular shape, thereby relaxing the more difficult problem of efficiently computing the intensity histogram from an irregular shape to that of a simpler one of estimating histogram from a regular shape. Rapid scanning of the image using this approach is still not possible; instead, differential algorithms can be designed to iteratively converge to the target object [1]. However, these differential approaches often fail to track the targets with rapid and large motions.

Another way to deal with irregular shapes is to enclose the target with a regular shape (e.g., a rectangular window) and compute histogram from the enclosed region. However, this inevitably includes background pixels when the foreground shape cannot be closely approximated. Consequently, the resulting histogram can be corrupted by background pixels, and the tracking result degrades accordingly (e.g., unstable or jittered results as shown in Figure 2-1). Furthermore, complete lack of spatial information in histograms is also undesirable. For problems such as face tracking that do not have significant shape variation, it is adequate to use intensity histogram as the main tracking

PAGE 17

feature [13]. However, for a target undergoing significant shape variation, the spatial component of the appearance is very prominent, and the plain intensity histogram becomes inadequate as it alone often yields unstable tracking results.

Each of the aforementioned problems has been addressed to some extent (e.g., spatiogram [17] for encoding spatial information in histogram). However, most of them require substantial increase of computation time, thereby making these algorithms applicable only to local search and infeasible for global scans of images. Consequently, such algorithms are not able to track objects undergoing rapid motions. In this chapter, we present a tracking algorithm that solves the above problems, and at the same time, it still has comparable running time as the tracking algorithm using (plain) integral histogram [2]. The algorithm consists of global scanning, local refinement and update steps. The main idea is to exploit an efficient appearance representation using histograms that can be easily evaluated and compared so that the target object can be located by scanning the entire image. Shape update, which typically requires more elaborated algorithms, is carried out by adjusting a few small blocks within tracked window. Specifically, we approximate the irregular shape with a small number of blocks that cover the foreground object with minimal overlaps. As the tracking window is typically small, we can extract the target contour using a fast segmentation algorithm without increasing the run-time complexity significantly. We then update the target shape by adjusting these blocks locally so that they provide a maximal coverage of the foreground target.

The adaptive structure in our algorithm contains the block configuration and their associated weights. Shape of the target object is loosely represented by block configuration, while its appearance is represented by intensity distributions and weights of these blocks. In doing so, spatial component of the object's appearance is also loosely encoded in block structure. Furthermore, these rectangular blocks allow rapid evaluations and comparisons of histograms. Note that our goal is not to represent

PAGE 18

both shape and appearance precisely since this will most likely require substantial increase in computation. Instead, we strive for a simple but adequate representation that can be efficiently computed and managed. We refer to the algorithm as the Block Histogram Tracking (BHT) algorithm. Compared with tracking methods based on integral histograms, our tracker is also able to efficiently scan the entire image to locate the target, which amounts to the bulk of the processing time for these algorithms. The extra increase in running time of the BHT algorithm results from the refinement and update steps. Since segmentation is carried out only locally in a (relatively) small window and the weights can be computed very efficiently, such computation overhead is generally small. Experimental results demonstrate that the algorithm renders much more accurate and stable tracking results compared to the integral histogram based tracker, with a negligible increase in running time.

2.2 Related Work

There is a rich literature on shape and appearance modeling for visual tracking. In this section, we discuss the most relevant works within the context of single articulated object tracking. Specifically, we aim to track generic articulated objects from images acquired with one camera at a distance while undergoing large and rapid deformation in shape as well as appearance. We note that there exist tracking algorithms for specific objects operating under different imaging conditions and constraints, e.g., human tracking [18 21], hand tracking [15 22 24], model-based tracking [25 26], to name a few.

Articulated objects can be modeled with parameterized shapes or contours. Active contours using parametric models [24 27] typically require offline training, and expressiveness of these models (e.g., splines) is somewhat restrictive. Furthermore, with all the offline training, it is still difficult to predict the tracker's behavior when hitherto unseen target is encountered. For example, a number of exemplars have to be learned from training data prior to tracking in [28], and the tracker does not provide

PAGE 19

any mechanism to handle shapes that are drastically different from the templates. Likewise, there is also an offline learning process involved in the active shape and appearance models [29]. Level set algorithms have also been successfully applied to track articulated objects [30 33]. However, these methods rely mainly on the information near the contours and do not exploit the rich appearance or texture information. In addition, these algorithms usually do not have mechanisms to handle drifting effects.

Instead of using contours to model shapes, kernel-based methods represent target's appearance with intensity, gradients, and color statistics [1 13 34]. These methods have demonstrated successes in tracking targets whose shapes can be well enclosed by ellipses. Although methods using multiple kernels [15 35] and adaptive scaling [36] have been proposed to cope with this problem, it is not clear such methods are able to effectively track articulated objects whose shapes vary rapidly and significantly.

In a somewhat different direction, the use of Haar-like features plays an important role in the success of real-time object detection [14]. However, fast algorithms for computing Haar-like features and histograms such as integral images [14] or integral histograms [2] require rectangular windows to model the target's shape. Consequently, it is not straightforward to apply efficient methods to track and detect articulated object with varying shapes. Haar-like and related features play a significant role in several recent work on online boosting and its applications to tracking and object detection [37]. One interesting aspect of this latter work is to treat tracking as sequential detection problems, and an important component in the tracking algorithm is the online construction of an object-specific detector. However, the capability of the tracker is somewhat hampered by the Haar-like features it uses in that this invariably requires the shapes of the target to be well approximated by rectangular windows.

Finally, the BHT algorithm shares some similarity with the part-based object detection algorithm proposed in [38] as both algorithms use rectangular blocks to define

PAGE 20

the target object. However, the similarity is only superficial since, in our method, there is no specific part definition as the blocks are adjusted online to provide the coverage for the foreground target only. While decompositions of the target using rectangular blocks are employed in both methods, the decomposition in our case is geometrical, with the explicit purpose of covering the foreground and accurately estimating the intensity histogram, while theirs is semantical in that each block or part has its own unique appearance and characteristic. Our goal is to have a general-purpose tracker and this necessarily requires us to avoid detecting object parts as it will involve more extensive training and require more assumptions on the target's appearance.

2.3 Tracking Algorithm

We present the details of the BHT algorithm in this section. The output of the tracker consists of a rectangular window enclosing the target in each frame. Furthermore, an approximated foreground region is also estimated. Our objective is to achieve a balance among the three somewhat conflicting goals of efficiency, accuracy and robustness. Specifically, we treat the tracking problem as a sequence of detection problems, and the main feature that we use to detect the target is the intensity histogram. The detection process is carried out by matching foreground intensity histogram and we employ integral histograms for efficient computation. In the following discussion, we will use the terms histogram and density interchangeably. The main technical problem that we solve within the context of visual tracking is how to approximate the foreground histogram under significant shape variations so that efficient and accurate articulated object tracking is possible under the general assumption (held by most tracking algorithms) that the foreground histogram stays roughly stationary.

The high-level outline of the BHT algorithm is shown in Figure 2-2. It consists of four sequential steps: detection, scaling, refinement, and update. At the outset, the tracker is initialized with the contour of the target; it then automatically determines the initial tracking window $W$ and $K$ rectangular blocks $B_i$ as well as their weights $\lambda_i$ according


to the procedure described below. The foreground intensity histogram $H^{f}_{0}$ for the initial frame is kept throughout the sequence.

The shape of the foreground target is approximated by $K$ rectangular blocks, $B_i$, $1 \le i \le K$, within the main tracking window $W$, as shown in Figure 2-3. The positions of the blocks within the tracking window are adaptively adjusted throughout the tracking sequence, and the blocks may overlap to account for extreme shape variations. At each frame $t$, the tracker maintains the following: 1) a tracking window $W_t$ with a block configuration, 2) a foreground histogram $H^{f}_{t}$ represented by a collection of local foreground histograms $H^{B_i}_{f,t}$, computed from the blocks, and their associated weights $\lambda_i$, and 3) a background histogram $H^{b}_{t}$. The tracker first detects the most likely location of the target by scanning the entire image (i.e., it finds the window with the highest similarity when compared with the tracking window $W$).

After detection, the tracking window size can be adjusted so that the window tightly encloses the target without unnecessary background pixels. Note that for tracking articulated objects, it is inevitable for tracking windows to enclose some background pixels, as the shapes and sizes of targets vary significantly. We introduce an adaptive scaling step to update the size of the tracking window based on the scale of the target object.

In the refinement step, the tracker works exclusively in the detected window, and the target is segmented from the background using the current foreground density. This result is then used in the update step to adjust the block positions in the tracking window $W$, after which the weight assigned to each block is recomputed. In addition, the background density $H^{b}_{t}$ is also updated.

While it is expected that the union of the blocks will cover most of the target, these blocks will nevertheless contain both foreground and background pixels. This happens often when the shape of the target object is far from convex and exhibits strong concavity. In particular, blocks containing large percentages of background pixels should be downweighted in their importance when compared with blocks that contain mostly


foreground pixels. Therefore, each block $B_i$ is assigned a weight $\lambda_i$, which will be used in all three steps. In this framework, the shape information is represented by the block configuration and the associated weights. Compared with other formulations of shape priors [31 33], it is a rather fuzzy representation of shapes. However, this is precisely what is needed here: since rapid and sometimes extreme shape variation is expected, the shape representation should not be rigid or too heavily constrained, so as to permit greater flexibility in anticipating and handling hitherto unseen shapes.

2.3.1 Detection

For each frame, the tracker first scans the entire image to locate the target object. As with many other histogram-based trackers, the target window $W^*$ selected in this step is the one that has the maximum foreground similarity measure with respect to the initial tracking window $W$. After scanning all possible candidate windows $W_T$ in the current frame, we select as $W^*$ the window that minimizes our proposed distance function $D$:

$$W^* = \arg\min_{W_T} D(W_T, W) \tag{2-1}$$

The distance function $D$ is constructed from local foreground histograms computed from the blocks as follows. First, we transfer the block configuration of the tracking window $W_{t-1}$ onto each scanned window $W_T$, and accordingly, we can evaluate $K$ local foreground histograms in each of the transferred blocks. The local foreground histogram $H^{B_i}_{f,t}$ for the block $B_i$ is the intersection of the raw histogram $H^{B_i}_{t}$ with the initial foreground histogram of the corresponding block:

$$H^{B_i}_{f,t}(b) = \min\left(H^{B_i}_{t}(b),\, H^{B_i}_{f,0}(b)\right)$$

where $b$ indexes the bins. The distance function is defined as the weighted sum of the Bhattacharyya distances between the densities $H^{B_i}_{f,0}(b)$ and $H^{B_i}_{f,t}(b)$:

$$D(W_T, W) = \sum_{i=1}^{K} \lambda_i\, \rho\left(H^{B_i}_{f,0},\, H^{B_i}_{f,t}\right) \tag{2-2}$$


where $\lambda_i$ is the weight associated with block $B_i$ and $\rho$ is the Bhattacharyya distance between two densities,

$$\rho\left(H^{B_i}_{f,0},\, H^{B_i}_{f,t}\right) = \sqrt{1 - \sum_{b=1}^{N} \sqrt{H^{B_i}_{f,0}(b)\, H^{B_i}_{f,t}(b)}}$$

where $N$ is the number of bins. Since the blocks are rectangular, all histograms can be computed by a few subtractions and additions using integral histograms. Because of $\lambda_i$, $D$ will downweight blocks containing more background pixels, and this is desirable because it provides some protection against background noise and clutter. Note that, compared with most histogram-based trackers, which invariably use only one histogram intersection, the distance function $D$ defined in Equation 2-2 actually encodes some amount of shape and spatial information through the block configuration and the block weights.

2.3.2 Scaling

After the detection step, we randomly vary the size of the tracking window while keeping the target object at the center of these windows. For each scaled window $W_s = \mathrm{scale}(W^*, s_h, s_w)$, we estimate the foreground density $H^{f}_{t,W_s} = \{H^{B_i}_{f,t,W_s}\}$ and the background density $H^{b}_{t,W_s}$ (the values of $s_h$ and $s_w$ are selected within a range, i.e., $0.8 \le s_h, s_w \le 1.2$). We select the scale of the tracking window for which foreground matching and background mismatching are maximized. In other words, the adaptive window $W^*$ should minimize the following objective function:

$$W^* = \arg\min_{W_s}\; \eta \sum_{i=1}^{K} \lambda_i\, \rho\left(H^{B_i}_{f,0},\, H^{B_i}_{f,t,W_s}\right) + (1 - \eta)\left(1 - \rho\left(H^{f}_{0},\, H^{b}_{t,W_s}\right)\right) \tag{2-3}$$

where $\eta$ is a parameter specifying the weights of the two matching terms. We set $\eta$ to 0.3 in all our experiments to put more weight on the background mismatching term for scale selection. With this scheme, we can better determine the window that tightly encloses the target object. Figure 2-5 shows some tracking results using fixed and adaptive scaling.
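As a minimal sketch, the block-weighted distance function of Section 2.3.1 and the scale-selection objective of Section 2.3.2 could be computed as follows. This is Python rather than the MATLAB/MEX code of the actual implementation. The function names, the use of plain lists for histograms, and the assumption that every input histogram is already a normalized density are illustrative choices, not details taken from the text.

```python
import math


def bhattacharyya(p, q):
    """Bhattacharyya distance sqrt(1 - sum_b sqrt(p[b] * q[b])) between two densities."""
    coefficient = sum(math.sqrt(a * b) for a, b in zip(p, q))
    # Clamp at zero to guard against tiny negative values caused by rounding.
    return math.sqrt(max(0.0, 1.0 - coefficient))


def local_foreground(raw, initial_foreground):
    """Bin-wise intersection of a block's raw histogram with its initial foreground histogram."""
    return [min(r, f) for r, f in zip(raw, initial_foreground)]


def block_distance(weights, initial_blocks, raw_blocks):
    """Weighted sum over blocks of the Bhattacharyya distance (the distance function D)."""
    total = 0.0
    for weight, initial, raw in zip(weights, initial_blocks, raw_blocks):
        # Assumption: the intersected histogram is compared without renormalizing.
        total += weight * bhattacharyya(initial, local_foreground(raw, initial))
    return total


def scale_objective(mix, weights, initial_blocks, raw_blocks, initial_foreground, background):
    """Scale objective: mix * foreground distance + (1 - mix) * (1 - background distance)."""
    foreground_term = block_distance(weights, initial_blocks, raw_blocks)
    background_term = 1.0 - bhattacharyya(initial_foreground, background)
    return mix * foreground_term + (1.0 - mix) * background_term


def best_candidate(candidates, score):
    """Return the candidate window with the smallest score (the argmin)."""
    return min(candidates, key=score)
```

With the mixing parameter set to 0.3, as in the experiments, a scaled window whose background closely resembles the initial foreground is penalized more heavily than one whose blocks match the foreground slightly less well.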


2.3.3 Refinement

Once the global scan produces the tracking window $W^*$ in which the target is located, the next step is to extract an approximate foreground region so that the shape variation can be better accounted for. We apply a graph-cut segmentation algorithm to segment out the foreground region in $W^*$. Previous work on this type of segmentation in the context of visual tracking (e.g., [31 33 39]) always defines the cost function in the form

$$E = E_A + E_S$$

where $E_A$ and $E_S$ are terms relating to appearance and shape, respectively. However, we have found that hard-coding the shape prior in a separate term $E_S$ is more of a hindrance than a help in our problem because of the extreme shape variation, as strong shape priors without dynamic information often lead to unsatisfactory results. Instead, our solution is to use only the appearance term $E_A$ but to incorporate a shape component through the definition of the foreground density.

Specifically, let $p$ denote a pixel and $P$ denote the set of all pixels in $W^*$. Let $P_B$ denote the background density that we estimated in the previous frame, and $P_i$, $1 \le i \le K$, the foreground density from $B_i$ (obtained by normalizing the histogram $H^{B_i}_{f,t}$). Furthermore, we denote by $P_f$ the foreground density obtained by normalizing the current foreground histogram $H^{f}_{t}$. Following [39 40], the graph-cut algorithm minimizes the cost function

$$E(C_p) = \sum_{p \in P} R_p(C_p) + \mu \sum_{(p,q) \in \mathcal{N}:\, C_p \ne C_q} B_{p,q} \tag{2-4}$$

where $C_p : P \to \{0, 1\}$ is a binary assignment function on $P$ such that, for a given pixel $p$, $C(p) = 1$ if $p$ is a foreground pixel and 0 otherwise (we denote $C(p)$ by $C_p$). Here $\mu$ is a weighting factor and $\mathcal{N}$


denotes the set of neighboring pixels. We set $\mu$ to 0.5 in our algorithm. We define

$$B_{p,q} \propto \frac{\exp\left(-(I(p) - I(q))^2 / 2\sigma^2\right)}{\lVert p - q \rVert}$$

where $I(p)$ is the intensity value at pixel $p$ and $\sigma$ is the kernel width. The term $R_p(C_p)$ is given as

$$R_p(C_p = 0) = -\log P_F(I(p), p), \qquad R_p(C_p = 1) = -\log P_B(I(p))$$

where $P_F(I(p)) = P_i(I(p))$ if $p \in B_i$, and $P_F(I(p)) = P_f(I(p))$ if $p$ is not contained in any block $B_i$. Note that the shape information is now implicitly encoded through $P_F$. A fast combinatorial algorithm with polynomial complexity exists for minimizing the energy function $E$, based on the problem of computing a minimum cut across a graph [40]. Since we only perform the graph cut in a (relatively) small window, this can be done very quickly and does not substantially increase the computational load. Figure 2-4 presents some example segmentation results.

2.3.4 Update

After the object contour is extracted from the segmentation result, we update the positions of the blocks $B_i$ within $W^*$. The idea is to locally adjust these blocks so that they provide maximal coverage of the segmented foreground region. We employ a greedy strategy to cover the entire segmented foreground by moving each block locally, using a priority based on block size. Note that such an approach (i.e., local jittering) has often been adopted in object detection and tracking algorithms for later-stage refinement and fine-tuning. An example of the block configuration update process is shown in Figure 2-6.

As the foreground definition is now known, we can compute the foreground histogram $H^{B_i}_{f,t}$ from each block $B_i$. After that, we recompute the corresponding block


weights according to the following equation:

$$\lambda_i = \frac{\sum_{b=1}^{N} H^{B_i}_{f,t}(b)}{\sum_{p \in W} C(p)}$$

The weights $\lambda_i$ are normalized to enforce the requirement that their sum is one.

2.3.5 Discussion

Compared with recent work that employs discriminative models (classifiers) for tracking (e.g., [37]), our approach is mainly generative through the use of intensity histograms. While we assume that the intensity distribution stays stationary, the features we constantly update are the block configurations and the associated weights. Online appearance updates (e.g., [37 41 42]) have been shown to be effective for tracking rigid objects. However, as the examples shown in these works are almost without significant shape variation, it is difficult to see how these techniques can be generalized immediately to handle shape updates. On the other hand, shape variation has often been managed in visual tracking algorithms using shape templates learned offline and the dynamics among the templates [24 28 31 33]. It is also not clear how these algorithms can deal with sequences containing unseen shapes or dynamics. Instead of hard-coding the shape prior, our algorithm provides a soft update on shape in the form of updating the block configuration, and the update is constrained by the appearance model through the requirement that the foreground intensity distribution stays roughly stationary.

Our use of an adaptive block structure is easily associated with recent work that tracks and detects parts of an articulated object (e.g., [43 44]). However, our goal and motivation are quite different in that the blocks are employed to provide a convenient structure for approximating the object's shape and estimating the intensity histogram. Our objective is an accurate and efficient tracker, not the precise localization of parts, which in general requires substantially more processing. Nevertheless, it is interesting to


investigate the possibility of applying our technique to this type of tracking/detection problem, and we leave this to future work.

2.4 Experiments and Results

The BHT algorithm has been implemented in MATLAB with some optimization using MEX C++ subroutines. The code and data are available at http://www.cise.ufl.edu/~smshahed/tracking.htm. In this implementation, we use intensity histograms with 16 bins for grayscale videos. Each video consists of 320 × 240 pixel images recorded at 15 frames per second. The number of blocks, $K$, is set to two or three. The tracker has been tested on a variety of video sequences, and eight of the most representative sequences are presented here. We compare the tracking results of our algorithm with a tracker using the plain integral histogram [2] and the mean shift tracker [1]. On a Dell 2.66 GHz machine, our tracker runs at 8-10 frames per second, while the integral histogram tracker has slightly better performance at 12 frames per second (in our MATLAB implementation, both algorithms share the same MEX C++ subroutines). The additional overhead incurred in our algorithm comes from the update of the block configuration, which amounts to a small fraction of the time spent on computing the integral histogram over the entire image. However, experimental comparisons show that this modest run-time overhead allows our tracker to consistently produce much more stable and satisfactory tracking results.

In the following experiments, for initialization, we manually outline the contour of the target in the first frame, and all trackers start with the same tracking window. The sequences shown below are all collected from the web. In these sequences, the foreground targets undergo significant appearance changes, mainly due to shape variation. We first present the qualitative tracking results and then quantitative comparisons with ground truth data.
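The integral histogram on which both the BHT tracker and the baseline tracker rely can be sketched as follows, assuming 16 bins as in the implementation described above. This is an illustrative pure-Python version, not the MEX C++ code; the function names and the mapping of 8-bit intensities to bins are assumptions made for the example.

```python
def integral_histogram(image, num_bins=16, max_value=256):
    """Build an integral histogram for a grayscale image given as a list of rows.

    The entry ih[y][x] holds the histogram of all pixels above and to the left,
    i.e. rows 0..y-1 and columns 0..x-1.
    """
    height = len(image)
    width = len(image[0])
    ih = [[[0] * num_bins for _ in range(width + 1)] for _ in range(height + 1)]
    for y in range(height):
        row_sum = [0] * num_bins  # running histogram of the current row
        for x in range(width):
            # Assumed bin mapping: equal-width bins over [0, max_value).
            row_sum[image[y][x] * num_bins // max_value] += 1
            above = ih[y][x + 1]
            ih[y + 1][x + 1] = [a + r for a, r in zip(above, row_sum)]
    return ih


def region_histogram(ih, x0, y0, x1, y1):
    """Histogram of the rectangle x0 <= x < x1, y0 <= y < y1 from four corner lookups."""
    bottom_right = ih[y1][x1]
    top_right = ih[y0][x1]
    bottom_left = ih[y1][x0]
    top_left = ih[y0][x0]
    return [
        br - tr - bl + tl
        for br, tr, bl, tl in zip(bottom_right, top_right, bottom_left, top_left)
    ]
```

Once the integral histogram of a frame is built, the histogram of any rectangular block costs a fixed number of additions and subtractions per bin, independent of the block size, which is what makes scanning every candidate window affordable.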


2.4.1 Qualitative Results

The female skating sequence contains over 150 frames, and the dazzling performance is accompanied by an equally dazzling pose variation. As shown in Figure 2-7, while the background is relatively simple, the integral histogram tracker and the mean-shift tracker are not able to locate the skater accurately, producing jittery and unstable tracking windows. In particular, it is impossible to use these unsatisfactory tracking results for other vision applications such as gait or pose recognition. However, our tracker is able to track the skater well and provides tracking windows that are much more accurate and consistent. As shown in the figures, one major reason for this improvement is that the spatial locations of the blocks are updated correctly by our algorithm as the skater undergoes significant changes in pose.

The second sequence contains 435 frames with a figure skater performing in a cluttered environment, and the tracking results using our method and the mean-shift algorithm are shown in Figure 2-9. Our algorithm is able to accurately track the skater throughout the whole sequence (i.e., the tracking windows are accurately centered around the skater), as shown in Figure 2-9, while the integral histogram tracker again produces unsatisfactory results. Perhaps more importantly, our algorithm is able to track the skater across shots taken from two different cameras (e.g., from frame 372 to 373 and onwards), which is difficult to handle for most visual tracking algorithms, particularly those using differential techniques. The results also demonstrate the advantage of being able to efficiently scan the entire image for the target, as the mean-shift tracker loses the target when the camera angle changes (e.g., frame 372 to 373 and onwards).

The third and fourth sequences contain two stylistically different dances. In both sequences, adverse conditions such as cluttered backgrounds, scale changes, and rapid movements have a significant presence, and the shape variations are even more pronounced than in the two previous sequences. In both


experiments (Figures 2-8 and 2-11), our tracker is able to track the dancers accurately, while the integral histogram and the mean-shift trackers fail to produce consistent and accurate results.

Figure 2-12 presents the tracking results for a very challenging sequence in which there is a large variation in the shape, scale, and appearance of the target. Furthermore, the target undergoes ballistic movements. Notwithstanding these difficulties, the proposed tracker is able to follow the target accurately using only two blocks.

In Figure 2-13, we apply our tracking algorithm to a sequence in which the target object is fully occluded at some point. In this sequence, two persons walk past each other and the person being tracked is fully occluded. As shown in Figure 2-13, our tracking algorithm is able to track the target correctly before and after the occlusion, while the mean-shift tracker is confused by the occlusion and loses the target afterwards.

We also test our tracker on a sequence in which the target soccer player appears against a cluttered and changing background. As shown in Figure 2-14, our tracker produces more stable results than the mean shift tracker. Finally, in Figure 2-15, we apply the proposed tracker to a wildlife sequence with a natural scene as the background.

As described in Section 2.3.2, our tracker is able to adjust the size of the tracking window to tightly enclose the target object. In this section, we present some results of our tracker with fixed and adaptive scaling. As shown in Figure 2-16, our tracker with adaptive scaling encloses the target object better than the one with fixed scaling, although both are centered at the same target locations.

2.4.2 Quantitative Performance

For quantitative performance evaluation of our tracker, we manually label the ground truth by selecting the minimal window that encloses the target in every frame. As our sequences contain articulating targets, we do not include parts (hands, legs) in the


ground truth window when they are spread out too much. Some of these examples are shown in Figure 2-10.

We use two error metrics for quantitative evaluation. The first measures the deviation of the center of the tracking window from the ground truth, whereas the second measures the coverage of the tracking window against the ground truth. An ideal tracker is expected to have small errors in both metrics. The quantitative performance of our tracker, the integral histogram based tracker, and the mean-shift tracker with respect to these error measurements is summarized in Tables 2-1, 2-2 and Figures 2-17 as well as 2-18. We observe that our tracker outperforms the other two trackers by a large margin, as our tracker achieves the lowest mean errors in all sequences with small standard deviations.

We also present quantitative comparisons of our tracker with fixed and adaptive scaling. From each original sequence, we select a subsequence that contains substantial scale variation of the target for these experiments. Experimental results, as summarized in Tables 2-3, 2-4 and Figure 2-19, show that adaptive scaling improves the accuracy in location in all cases and coverage in most cases.

As the size of the target objects varies significantly in our experiments, it is of great interest to further analyze whether the algorithm is able to adjust the tracking window size in terms of width and height. Using ground truth data, we compute the variation of the width and height of the target object and then compare it with the results obtained from our tracker with adaptive scaling. Figure 2-20 shows the plots for this analysis and Table 2-5 summarizes the errors for this experiment. Overall, our tracker with adaptive scaling is able to adjust both the width and height of the tracking window when the target object undergoes large variation in scale.

2.5 Discussion

In this work, we have introduced an algorithm for accurate tracking of objects undergoing significant shape variation (e.g., articulated objects). Under the general


assumption that the foreground intensity distribution is approximately stationary, we show that it is possible to rapidly and efficiently estimate it amidst substantial shape changes using a collection of adaptively positioned rectangular blocks. The algorithm first locates the target by scanning the entire image using the estimated foreground intensity distribution. The refinement step that follows provides an estimated target contour from which the blocks can be repositioned and weighted. Our algorithm is efficient and simple to implement. Experimental results have demonstrated that the BHT algorithm consistently provides more precise tracking results when compared with the integral histogram based tracker [2] and the mean shift tracker [1].


Table 2-1. Center location errors of the integral histogram-based tracker, the mean-shift tracker, and the proposed BHT

                Integral histogram tracker Mean-shift tracker         Our proposed BHT
Sequence        Max      Mean     Std      Max      Mean     Std      Max      Mean     Std
Female skater   55.2     29.7     11.8     69.2     26.9     16.1     33.1     13.1     6.8
Male skater     165.0    45.7     29.6     168.0    69.8     26.6     35.0     13.6     7.5
Indian dancer   25.5     12.1     5.8      49.5     25.4     11.6     29.8     8.4      4.8
Dancer          58.9     31.9     10.2     61.5     41.6     11.8     33.7     16.1     6.9

Table 2-2. Coverage errors of the integral histogram-based tracker, the mean-shift tracker, and the proposed BHT

                Integral histogram tracker Mean-shift tracker         Our proposed BHT
Sequence        Max      Mean     Std      Max      Mean     Std      Max      Mean     Std
Female skater   79.91    61.15    9.37     100.00   63.76    16.29    63.45    48.24    9.39
Male skater     100.00   67.48    13.63    100.00   80.12    19.01    74.79    53.28    11.77
Indian dancer   66.68    50.19    6.38     89.21    61.25    13.13    58.96    46.26    5.66
Dancer          87.59    62.55    9.42     87.38    71.49    8.95     69.82    48.23    8.70

Table 2-3. Center location errors of our tracker with the fixed and the adaptive scaling window

                Fixed scale window         Adaptive scaling window
Sequence        Max      Mean     Std      Max      Mean     Std
Female skater   33.11    12.11    7.72     21.84    9.15     4.98
Male skater     22.64    10.28    5.27     26.73    9.48     5.56
Indian dancer   29.83    8.36     4.75     24.63    6.58     4.35

Table 2-4. Coverage errors of our tracker with the fixed and the adaptive scaling window

                Fixed scale window         Adaptive scaling window
Sequence        Max      Mean     Std      Max      Mean     Std
Female skater   63.49    52.44    5.89     59.72    44.95    6.07
Male skater     72.64    52.02    13.37    71.07    44.75    11.08
Indian dancer   58.95    46.26    5.66     57.29    37.29    8.41

Table 2-5. RMS errors in tracking window size (in %)

                Error in width    Error in height
Sequence        Mean     Std      Mean     Std
Indian dancer   9.22     7.13     6.13     5.02
Male skater     8.18     6.46     20.15    13.94
Female skater   27.84    21.05    13.55    10.76
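For readers who wish to produce statistics of the kind reported in the tables above, the following Python sketch shows one way the per-frame errors and their Max/Mean/Std summaries might be computed. The center location error follows the description in Section 2.4.2. The coverage error formula, however, is not specified in the text, so the intersection-over-union form used here is purely an assumption, as are the function names, the (x, y, width, height) window format, and the use of the population standard deviation.

```python
import math
import statistics


def center_error(tracked, truth):
    """Euclidean distance between the centers of two (x, y, width, height) windows."""
    tracked_cx = tracked[0] + tracked[2] / 2.0
    tracked_cy = tracked[1] + tracked[3] / 2.0
    truth_cx = truth[0] + truth[2] / 2.0
    truth_cy = truth[1] + truth[3] / 2.0
    return math.hypot(tracked_cx - truth_cx, tracked_cy - truth_cy)


def coverage_error(tracked, truth):
    """ASSUMED definition: 100 * (1 - intersection area / union area), in percent."""
    left = max(tracked[0], truth[0])
    top = max(tracked[1], truth[1])
    right = min(tracked[0] + tracked[2], truth[0] + truth[2])
    bottom = min(tracked[1] + tracked[3], truth[1] + truth[3])
    intersection = max(0.0, right - left) * max(0.0, bottom - top)
    union = tracked[2] * tracked[3] + truth[2] * truth[3] - intersection
    return 100.0 * (1.0 - intersection / union)


def summarize(errors):
    """Return (max, mean, standard deviation) of a list of per-frame errors."""
    # Assumption: population standard deviation; the text does not say which form is used.
    return max(errors), statistics.mean(errors), statistics.pstdev(errors)
```

Applying `summarize` to the list of per-frame errors of one tracker on one sequence yields one Max/Mean/Std triple of the kind shown in a table row.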


Frames shown: Female Skater #43, Dancer #78, Male Skater #160, Indian Dancer #212

Figure 2-1. Motivation of our block-based tracking algorithm. Top: Using only a histogram to represent the object's appearance, the tracking results are often unsatisfactory. Bottom: Using the BHT algorithm, the tracking results are much more consistent and satisfactory.

Tracking Algorithm Outline
1. Detection: The entire image is scanned and the window with the highest similarity is determined to be the tracking window $W^*$.
2. Scaling: The size of the tracking window $W^*$ is adjusted according to the scale of the target.
3. Refinement: Within $W^*$, the target is segmented out using a graph-cut based segmentation, which divides the tracking window into foreground and background regions. The segmentation uses both the estimated foreground and background distributions.
4. Update: The block configuration is adjusted locally based on the segmentation result obtained in the previous step. The non-negative weights $\lambda_i$ of the blocks are recomputed.

Figure 2-2. High-level outline of the BHT algorithm.
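The four-step outline above can be read as a simple per-frame loop. The Python sketch below only illustrates that control flow. The step functions `detect`, `scale`, `refine`, and `update` are hypothetical callables supplied by the caller, and their signatures are assumptions made for this example rather than the interface of the actual MATLAB implementation.

```python
def track_sequence(frames, state, detect, scale, refine, update):
    """Run detection, scaling, refinement, and update on every frame, in that order.

    `state` stands for whatever the tracker carries between frames (block
    configuration, block weights, histograms); its structure is left to the
    caller-supplied step functions.
    """
    windows = []
    for frame in frames:
        window = detect(frame, state)          # 1. scan the image for the best window
        window = scale(frame, window, state)   # 2. adjust the window size
        mask = refine(frame, window, state)    # 3. segment the foreground in the window
        state = update(state, window, mask)    # 4. move blocks and recompute weights
        windows.append(window)
    return windows, state
```

Passing the steps in as functions keeps the loop independent of how each step is realized, so the same driver could be used with a fixed-scale variant by supplying a `scale` function that returns the window unchanged.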


Frames shown: Indian Dancer #58, Male Skater #355

Figure 2-3. Initialization of the BHT algorithm. Top: Examples of articulating targets. Bottom: Given the contour of the target object, we select those blocks (and associated weights) with non-empty intersection with the interior region of the target defined by the contour. Blocks containing only background pixels are not selected. The importance of a block is proportional to the percentage of its pixels belonging to the foreground.

Figure 2-4. Examples of segmentation within the tracking window.
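The block selection rule described in the caption of Figure 2-3 can be sketched in Python as follows. The binary mask format (a list of rows with 1 for foreground), the (x, y, width, height) block format, and the function names are assumptions made for this example. How the candidate blocks are generated is not covered here.

```python
def foreground_fraction(mask, block):
    """Fraction of the pixels of an (x, y, width, height) block that are foreground."""
    x, y, width, height = block
    inside = sum(
        mask[row][col]
        for row in range(y, y + height)
        for col in range(x, x + width)
    )
    return inside / float(width * height)


def select_blocks(mask, candidates):
    """Keep blocks that touch the foreground and weight them by foreground fraction.

    Returns a list of (block, weight) pairs whose weights sum to one.
    """
    scored = [(block, foreground_fraction(mask, block)) for block in candidates]
    # Blocks containing only background pixels are not selected.
    kept = [(block, fraction) for block, fraction in scored if fraction > 0.0]
    total = sum(fraction for _, fraction in kept)
    return [(block, fraction / total) for block, fraction in kept]
```

The normalization in the last line mirrors the requirement in Section 2.3.4 that the block weights sum to one.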


Frames shown: Male Skater #113, Female Skater #87, Indian Dancer #126

Figure 2-5. Top: Tracking with a fixed scale window. Bottom: Tracking using an adaptive scaling window.

Figure 2-6. Overview of the block configuration update process. The block configuration within the tracking window is regularly updated using the segmentation result of the refinement step, thereby enabling the tracker to follow a target with large shape variation.


Frames shown: Female Skater #41, #73, #96, #125, #144

Figure 2-7. Visual comparison of tracking results on the Female Skater sequence. Top: Tracking results using only the integral histogram. Middle: Tracking results using the mean-shift tracker. Bottom: Tracking results using the BHT algorithm. The shape variation in this sequence is substantial. Notice the unsatisfactory results produced by the integral histogram tracker. Such inaccurate tracking results are difficult for other vision applications to use.

Frames shown: Dancer #33, #73, #127, #166, #213

Figure 2-8. Visual comparison of tracking results on the Dancer sequence. Top: Tracking results using only the integral histogram. Middle: Tracking results using the mean-shift tracker. Bottom: Tracking results using the BHT algorithm. Note that the target undergoes large shape and scale variation.


Frames shown: Male Skater #17, #96, #156, #226, #239; Male Skater #256, #293, #342, #373, #376

Figure 2-9. Visual comparison of tracking results on the Male Skater sequence. First and fourth rows: Tracking results using only the integral histogram. Second and fifth rows: Tracking results using the mean-shift tracker. Third and sixth rows: Tracking results using the BHT algorithm.


Frames shown: Indian Dancer #96, Female Skater #14, Male Skater #228

Figure 2-10. Examples of ground truth windows for articulated targets.

Frames shown: Indian Dancer #20, #26, #110, #122, #134

Figure 2-11. Visual comparison of tracking results on the Indian Dancer sequence. Top: Tracking results using only the integral histogram. Middle: Tracking results using the mean-shift tracker. Bottom: Tracking results using the BHT algorithm. Note that the target undergoes significant shape and appearance variation.

Frames shown: Cartoon #3, #18, #28, #60, #74; Cartoon #104, #112, #141, #150, #176

Figure 2-12. Tracking results on the cartoon sequence using the BHT algorithm. Note that there is a large variation in the shape, scale, and appearance of the target. In addition, the target exhibits ballistic and non-rigid motions.


Frames shown: Occluded #19, #36, #50, #63

Figure 2-13. Tracking an occluded target. Top row: Tracking results using the BHT algorithm. Bottom row: Tracking results using the mean-shift tracker. The proposed tracker can recover the target after occlusion, while the mean-shift tracker fails.

Frames shown: Soccer Player #53, #75, #85, #95

Figure 2-14. Tracking with a cluttered background. Top row: Tracking results using the BHT algorithm. Bottom row: Tracking results using the mean-shift tracker. The proposed tracker can track the target more consistently than the mean-shift tracker.

Frames shown: Deer #10, #20, #27, #41; Deer #48, #57, #71, #77

Figure 2-15. Tracking of a wildlife target using the BHT algorithm. Eight selected frames from a sequence of 100 frames are shown here.


Frames shown: Female Skater #12, #34, #57, #87; Indian Dancer #28, #41, #89, #126; Male Skater #47, #91, #113, #153

Figure 2-16. Top rows: Tracking results of our algorithm with fixed scale. Bottom rows: Tracking results of our algorithm with adaptive scaling.


Figure 2-17. Per-frame center location error plots using different tracking algorithms. Top row: (Left) Male Skater, (Right) Female Skater. Bottom row: (Left) Indian Dancer and (Right) Dancer sequence.


Figure 2-18. Per-frame coverage error plots using different tracking algorithms. Top row: (Left) Male Skater, (Right) Female Skater. Bottom row: (Left) Indian Dancer and (Right) Dancer sequence.


Figure 2-19. Comparison between our trackers with fixed and adaptive scaling windows. Top: Indian dancer sequence. Middle: Female skater sequence. Bottom: Male skater sequence. Center location errors are shown in the left column and coverage errors in the right column.


Figure 2-20. Results using the BHT algorithm with fixed and adaptive scaling. Top: Indian dancer sequence. Middle: Male skater sequence. Bottom: Female skater sequence. Variations in width and height are shown in the left and the right columns, respectively.


CHAPTER 3
SIMULTANEOUS MULTI-FRAME TRACKING FOR ARTICULATED TARGETS

3.1 Introduction

In Chapter 2, we discussed the BHT algorithm for tracking articulated objects, and our experiments showed that the algorithm produces robust tracking results. The algorithm has been implemented in MATLAB, and to improve the running time, the detection step has been coded using C++ MEX subroutines. The current implementation of the algorithm runs at 8-10 fps on a 2.66 GHz CPU. In this chapter, we focus on improving the performance of the BHT algorithm. More specifically, we identify the computationally intensive steps of the algorithm and observe that these steps are amenable to parallelization. The structure of the BHT algorithm is also modified to work on several consecutive frames in parallel. We design and develop an optimized implementation of the modified BHT algorithm, referred to as parallel-BHT or p-BHT, on a graphics processing unit (GPU).

Many computer vision algorithms have high data parallelism, i.e., the same set of instructions is executed on many data elements. The performance of these algorithms can be improved significantly using the GPUs available in modern graphics hardware. For example, a particle filtering algorithm is proposed in [45] for 3D tracking of faces using GPUs. Their algorithm extracts $N$ feature points from the detected face and generates $M$ particles for tracking. The most expensive task of the algorithm is to compute the matching errors of all feature/particle pairs. The error calculations for the pairs are independent of each other, and this is exploited in [45]. The errors are calculated in parallel using a GPU, and their algorithm runs 4-7 times faster than the CPU-only version.

Similarly, in our case, the evaluations of the many candidate windows in the detection step are independent of each other, and we can process them in parallel. We have used MATLAB to implement the sequential steps of the p-BHT algorithm, and the detection


step of the algorithm has been implemented on a graphics processing unit (GPU) for parallel processing. The GPU is programmed using NVIDIA's CUDA programming framework. The implementation of the p-BHT algorithm produces tracking results very similar to those obtained by the BHT algorithm. However, the p-BHT algorithm runs considerably faster than the BHT algorithm.

The rest of the chapter is organized as follows. In Section 3.2, we briefly describe graphics processing units (GPUs) along with some relevant computer vision applications developed using GPUs. A brief description of the CUDA programming framework is given in Section 3.3. The p-BHT algorithm is presented in Section 3.4. The CPU-GPU based implementation of the algorithm is described in Section 3.5. We discuss the experimental results and the performance of the p-BHT algorithm in Section 3.6. Finally, we conclude the chapter with a short summary in Section 3.7.

3.2 Graphics Processing Units

3.2.1 Overview of GPU

A GPU contains many cores, and it distributes the data onto all of these cores for parallel computation. A program called a kernel is executed on all of the available processing units with different input data. While many computers today contain multiple CPU cores and support the execution of multi-threaded applications, such processors are not data parallel, and they usually execute different programs on different threads. GPUs, on the other hand, are designed for data parallelism, and their cores can cooperate on data fetching and other operations since they all execute the same instructions.

3.2.2 Application of GPU in Computer Vision

The parallel data-processing capability of modern GPUs has recently been used in many computer vision algorithms for real-time applications. In this section, we describe several relevant works on visual tracking.

A GPU-based Kanade-Lucas-Tomasi (KLT) feature tracker has been developed in [46] that can track 1000 feature points at 30 fps, which is 20 times faster than the CPU


implementation. Their GPU implementation can extract about 800 SIFT features from 640 × 480 images at 10 fps. In [47], the authors developed a framework for eye-blink detection and tracking, which can detect eye blinks with 97% accuracy. They used openVidia's [48] GPU-based SIFT implementation for feature extraction. Their tracker can process up to 25 frames per second. The GPU-based particle filtering framework developed in [49] for visual tracking runs 2.5-3.5 times faster than previous CPU/GPU versions of particle filter-based trackers [50 51]. This work has been extended in [52] for more accurate tracking by adopting multi-resolution and local search. The overall tracking speed was up to 40 fps for 320 × 240 video sequences. Recently, in [45], a particle filtering framework has been used to detect and track multiple faces simultaneously using NVIDIA's CUDA architecture. The system can detect and track faces in 3D environments at 25-40 fps, and the GPU implementation runs 4-9 times faster than the CPU version. The SSD-based planar tracking algorithm proposed in [53] uses a template of the target for tracking. The tracker achieves real-time performance on a CPU if the size of the template is smaller than 100 × 100 pixels. For larger templates, however, the tracker runs slowly (the speed is only 4 fps for templates of size 256 × 256). On the other hand, the GPU-based implementation of this algorithm developed in [54] runs at 18-96 fps, depending on the GPU used.

3.3 CUDA Architecture and Programming Framework

3.3.1 CUDA Hardware Architecture

In this section, we describe the hardware architecture of a typical CUDA-enabled GPU. The GPU contains a large number of highly threaded streaming multiprocessors (SMs), where each SM has several streaming processors (SPs). All the SPs in an SM execute the same instruction at the same time. Also, these SMs contain a small amount of high-speed memory that is used to share data among the threads running on a single SM. The DRAM available inside the GPU is called the device memory. The block diagram of the hardware architecture of a CUDA-enabled GPU is shown in Figure 3-2.


Different types of memory reside in the device memory. Constant and texture memories are cached for faster access times. However, they do not have write access. While the global memory is write-enabled, its access time is much slower since it is never cached.

3.3.2 CUDA Programming Framework

The computing system consists of a host (CPU) and one or more devices (GPUs) with parallel processing capability. An application must have high data parallelism to take advantage of the programming framework. The devices exploit the data parallelism to reduce the execution time.

CUDA program structure. A CUDA program can be divided into several modules. Some of these modules do not show any data parallelism, and these sequential modules are executed on the host. The other modules, which show data parallelism, are executed on the GPU. During compilation, the sequential and parallel modules are separated, and code for both the CPU and the GPU is generated. The program that runs on the GPU is known as the kernel. The kernel program creates a large number of threads to process the data in parallel. These threads are very lightweight, and the GPU contains the necessary hardware to schedule them efficiently.

Device memory and data transfer. The host and the devices have different memory spaces. Before a kernel starts execution, memory is allocated on the devices and the input is copied from the host memory to the device memory. Similarly, when the kernel execution is finished on the devices, the output is copied from the device memory to the host memory. These memory transfer operations are performed using the API functions available in the CUDA programming framework.

The CUDA device memory contains three different types of memory: the global memory, the constant memory, and the texture memory. The global memory residing inside the device has long access latencies and finite bandwidth. This is the only memory where the kernel can perform both read and write operations. Many


threads trying to access the global memory simultaneously can reduce the execution speed due to its long latency. However, executing a large number of threads can hide these long latencies. The constant memory offers short latency and high bandwidth in contrast to the global memory. However, the constant memory only allows read operations. The texture memory is also read-only memory, and it can be cached according to spatial locality. The registers and shared memory in Figure 3-3 are on-chip memories that can be accessed at very high speeds. Threads can access only those registers that are allocated to them. Shared memory is allocated to a thread block, and all the threads in the block can access the allocated memory. The portion of the input data that is used by all the threads should be copied into the shared memory for faster access.

Threads. The kernel program is executed inside the GPU as a grid of parallel threads. Each thread grid contains thousands of GPU threads. All these threads execute the same code but perform operations on different parts of the input data. To identify the threads, each thread is given a unique ID. This ID is often used by a thread to identify its specific part of the input data. The two-level hierarchical organization of the threads is shown in Figure 3-3. At the top level, each grid contains one or more thread blocks. All blocks in a grid have the same number of threads. A grid can be organized as a multi-dimensional array of blocks, and similarly, each thread block can be organized as a multi-dimensional array of threads. A thread in this hierarchical model can be identified by the built-in CUDA variables blockIdx and threadIdx, which are assigned by the CUDA runtime system. These variables are accessible inside the kernel, and most of the time they determine the part of the data to be used by the thread. There are two additional variables, gridDim and blockDim, which specify the dimensions of the grid and the blocks, respectively.

All the threads in a block can be synchronized. CUDA provides the function __syncthreads() to achieve this synchronization. When a kernel calls this function, the


thread that executes the call will be held until every thread in the block reaches the same location. This ensures that all threads in a block have finished executing a phase of the kernel before they begin executing the next phase. However, threads in different blocks cannot be synchronized with each other.

3.4 Algorithm Description

In this section, we describe the p-BHT algorithm. This algorithm is a modified version of the BHT algorithm that improves the running time by utilizing the parallel processing capability of the GPU. Specifically, we modify the structure of the BHT algorithm. The BHT algorithm uses a small number of blocks to represent the object's appearance, and the positions of the blocks approximately represent the target's shape. It loops through the detection, the refinement, and the update steps. In the detection step, the target is localized by scanning the whole image. The refinement and the update steps are used to compute new block positions. The BHT algorithm computes the block positions for each frame. However, we observed that it is not necessary to update the block configuration at each frame. The configuration should be changed whenever the distance D(W, W) gets large, because a larger distance implies that the target's shape is no longer well represented by the current block configuration and therefore a new block configuration should be computed. Thus, we can avoid the refinement and the update steps of the BHT algorithm as long as the distance D(W, W) is reasonably small. This observation provides the motivation to combine the detection steps of several consecutive frames. We refer to this as the multi-frame detection step. In the p-BHT algorithm, we compute the block configuration every Nth frame. Thus, a multi-frame detection step that detects the target in N frames is followed by a refinement and an update step. The block diagram of the tracking algorithm is shown in Figure 3-4.

There is a trade-off in choosing the value of N. A larger value of N speeds up the tracker by avoiding the refinement and the update steps. However, this also means that the block configuration is not updated for N - 1 frames. This may cause a problem if


the target shape changes substantially over these N frames. Therefore, the chosen N should be small enough that the target's shape does not vary too much in these N frames. In our experiments, we found that selecting a value in the range 8 to 16 works well.

We also identify the two computationally intensive components of the p-BHT algorithm which can be processed on a parallel architecture to reduce the running time: the computation of integral histograms and the computation of the distances D(WT, W). In the multi-frame detection step, the target is detected in N frames. Therefore, N integral histograms need to be computed first. Since the computation of the integral histogram for a frame does not depend on other frames, we exploit this to compute the integral histograms in parallel. Also, in the detection step, a large number of candidate windows over the entire frame are evaluated. The current block configuration is transferred onto the candidate window WT and the local foreground histograms of the blocks are estimated. Then, the distance D(WT, W) is calculated using equation 2. These distance calculations are independent of each other and we can evaluate them in parallel on a GPU.

3.5 Implementation Details

Figure 3-5 shows the flow diagram of the CPU-GPU implementation of the p-BHT algorithm. The tracker is first initialized and then it loops through the multi-frame detection, the refinement, and the update steps. At each iteration, N frames are processed in parallel. The initialization step of the tracking algorithm is executed on the CPU. In this step, the contour of the target is manually outlined, and then the algorithm computes the initial block positions and their weights. The initial appearance of the target is also computed in this step. The multi-frame detection step runs on the GPU and detects the target in N consecutive frames. The frames are first loaded into the GPU memory. Then, the GPU computes the integral histograms from the frames. After this, the distance D(W, WT) of each candidate window WT is calculated. At the end, for each frame, the distances of the candidate windows are


compared and the candidate window with the smallest distance is selected as the detection result. This completes the execution of the multi-frame detection step. The GPU then copies the detection results into the CPU memory. The refinement and the update steps run on the CPU and these steps are performed only on the last frame of the N frames processed in the current iteration. In the refinement step, the window returned by the detection step is segmented into the foreground and the background. This segmentation is used to compute the block positions. These two steps are the same as in the BHT algorithm.

Next, we present our optimized parallel implementation of the multi-frame detection step on the GPU using the CUDA programming framework. As discussed above, this consists of three steps: (1) computing the integral histograms from the frames, (2) computing the distances of the candidate windows, and (3) finding the best candidate window by comparing the distances. Each of these steps runs as a kernel on the GPU.

The integral histogram of an image of size R × C is a multidimensional array of size R × C × B, where B is the number of bins in the histogram. The computation of the integral histogram for each bin is performed in parallel using separate threads. In the implementation, we use N × B threads for N frames, where each thread is assigned a frame and a bin index. These threads are divided into N/2 blocks, and each block computes the integral histograms of two frames. Each thread scans the frame assigned to it and calculates the bin values of its assigned index. We store the frames in the CUDA texture memory, which is cached. Hence, when the threads scan the frames, spatial locality is exploited, which results in improved performance. The output integral histograms are stored in the CUDA global memory and they are used by the subsequent kernels during their executions.

For every frame, the tracking algorithm finds the window which has the maximum foreground similarity with respect to the initial tracking window W. It evaluates the distances of all possible candidate windows WT in the frame. For N frames where each frame has the dimension H × W, the total number of candidate windows is


approximately N × H × W. The implementation uses N × W threads and each thread computes the distances of H candidate windows. These threads are divided into 2W blocks and each block contains N/2 threads. To calculate the distance, the foreground appearance estimated from the candidate window WT is compared with the target appearance. Thus, each thread needs to access the target appearance. This appearance is copied into the shared memory by the first thread of each block. The shared memory is on-chip and fast. Further, multiple threads reading the same data from shared memory results in a broadcast of information rather than multiple reads. Thus, using shared memory to store the target's appearance reduces the execution time. The __syncthreads() function is called by the threads to make sure that the distance computation starts only after the target appearance has been copied into the shared memory. The candidate window's appearance is calculated using the integral histogram residing in the CUDA global memory. The implementation uses the CUDA data type float4 to read from the global memory, which can load 4 values in a single memory access. Use of this special CUDA data type improves the global memory access performance.

Once the distances of the candidate windows are computed, the next step is to find the tracking window with the minimum distance. Again, this computation can be performed for each frame independently. The kernel for computing the minimum distance is executed in N threads. These threads run in parallel and select the best tracking window containing the target.

Finally, the computed integral histograms and tracking windows are copied from the GPU memory to the CPU memory. The detected tracking windows are the output of the tracker, while the integral histograms are used in the refinement and the update steps to compute a new block configuration.

3.6 Experimental Results

To evaluate the performance of the tracking algorithm, we have tested the tracker with the four sequences used in Chapter 2. These are the female skater, the male


skater, the dancer, and the Indian dancer sequences. These videos were collected from YouTube and each of them consists of 320 × 240 pixel frames recorded at 15 fps. We present qualitative and quantitative analyses of the tracking results, where our tracking results are compared with the BHT algorithm, the integral histogram tracker [2], and the mean shift tracker [1]. Then, we analyze the computational performance of the p-BHT algorithm and compare its tracking speed with that of the BHT algorithm.

3.6.1 Qualitative Results

We compared the results obtained using the tracker with the results of the histogram-based tracker [2] and the mean shift tracker [1]. These results are shown in Figures 3-6, 3-7, 3-8, and 3-9. We found that our algorithm tracks the articulated target more robustly and consistently than the other trackers.

3.6.2 Quantitative Performance

For quantitative performance evaluation of the tracker, the center location error and the coverage error are computed in each frame. The quantitative performances of the p-BHT, the BHT, the integral histogram-based tracker [2], and the mean-shift tracker [1] with respect to these error measurements are summarized in Tables 3-1 and 3-2. These tables show that the p-BHT algorithm behaves very similarly to the BHT algorithm and outperforms the other two trackers by large margins. Plots related to these tables are shown in Figures 3-10 and 3-11.

3.6.3 Computational Performance

The p-BHT algorithm has been tested using an Intel Quad processor (Q9450) 2.66 GHz host system with 4 GB of RAM and an NVIDIA GeForce 8800 GT GPU. This GPU features 14 multiprocessors, 16 KB of shared memory per multiprocessor, and 512 MB of device memory. There can be a maximum of 512 threads per block and 768 active threads per multiprocessor. For comparison purposes, the single-threaded BHT algorithm has also been tested on the same host CPU.
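As a rough check of the launch configuration described in Section 3.5 against these hardware limits, the thread and block counts of the three kernels can be tabulated. This is only a sketch: the function name `launch_config` is hypothetical, W is assumed to be the frame width (320), N = 14 is the batch size listed for the female skater sequence in Table 3-4, and the bin count B = 16 is a made-up value because the text does not state it.

```python
MAX_THREADS_PER_BLOCK = 512          # limit quoted in the text
MAX_RESIDENT_THREADS = 14 * 768      # 14 multiprocessors x 768 active threads


def launch_config(n_frames, width, bins):
    """Thread/block counts for the three kernels as described in Section 3.5."""
    if n_frames % 2:
        raise ValueError("the description assumes an even number of frames N")
    # Integral histogram kernel: N x B threads in N/2 blocks.
    hist = {"threads": n_frames * bins, "blocks": n_frames // 2}
    hist["threads_per_block"] = hist["threads"] // hist["blocks"]   # 2 * B
    # Distance kernel: N x W threads in 2W blocks.
    dist = {"threads": n_frames * width, "blocks": 2 * width}
    dist["threads_per_block"] = dist["threads"] // dist["blocks"]   # N / 2
    # Best-window kernel: one thread per frame.
    best = {"threads": n_frames}
    return {"integral_histogram": hist, "distance": dist, "best_window": best}


# bins=16 is a hypothetical value; the thesis does not give B here.
cfg = launch_config(n_frames=14, width=320, bins=16)
for kernel, counts in cfg.items():
    print(kernel, counts)
```

Under these assumptions, every block stays well below the 512-thread limit, and the distance kernel requests fewer threads than the GPU can keep resident at once.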


Figure 3-1 shows the running times of the BHT and the p-BHT algorithms on different sequences. While the BHT algorithm runs at 8 to 10 fps, the p-BHT tracker runs at 56 to 78 fps. This shows that the p-BHT algorithm runs about seven times faster than the BHT algorithm. In Table 3-4, the average time taken by the steps of the p-BHT algorithm is presented. In the fourth column of Table 3-4, the multi-frame detection time per iteration is given. The detection time per frame is calculated by dividing the detection time per iteration by the number of frames processed in an iteration. The fifth and the sixth columns show the running times of the refinement and the update steps. These two steps run only once per iteration and they only process the last frame of the batch of N frames. The total execution time for each iteration is calculated by summing the detection time per iteration and the time taken by the refinement and the update steps.

We now discuss the performance of the multi-frame detection step of the p-BHT algorithm. This step has been executed on the GPU. First, the input is sent from the host memory to the device memory. The amount of time elapsed to copy the input from the host memory to the device memory is shown in the first column of Table 3-3. In our implementation, three CUDA kernel functions are executed on the GPU. The average execution times of these kernels are presented in the third, fourth, and fifth columns of Table 3-3, respectively. When all the kernels finish their execution, the output is copied from the device to the host. The time spent on this operation is shown in the second column. The total execution time for each iteration is calculated by summing the memory transfer times and the kernel execution times. The last two columns of the table contain the average execution time per frame and the frame rates. From the table, we see that the average GPU execution time per frame is around 10 to 15 msec. However, the detection step of the BHT algorithm takes around 80 to 90 msec per frame. Thus, the GPU-based implementation of the detection step runs 6 to 8 times faster.
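The per-frame times and frame rates quoted above follow from the per-iteration totals by simple division. The short sketch below reproduces that arithmetic from the values listed in Tables 3-3 and 3-4; the helper name `per_frame_stats` is hypothetical, and small differences in the last digit are to be expected because the tables were presumably computed from unrounded measurements.

```python
def per_frame_stats(iteration_ms, frames_per_iteration):
    """Return (time per frame in msec, frame rate in fps) for one iteration."""
    frame_ms = iteration_ms / frames_per_iteration
    return frame_ms, 1000.0 / frame_ms


# (sequence, frames per iteration N, detection msec/iteration, total msec/iteration)
ROWS = [
    ("Female skater", 14, 151.70, 180.54),
    ("Male skater", 10, 134.34, 160.99),
    ("Dancer", 10, 152.57, 177.27),
    ("Indian dancer", 14, 177.65, 208.39),
]

for name, n, detection_ms, total_ms in ROWS:
    det_frame_ms, det_fps = per_frame_stats(detection_ms, n)
    tot_frame_ms, tot_fps = per_frame_stats(total_ms, n)
    print(f"{name:14s} detection {det_frame_ms:6.2f} ms/frame ({det_fps:5.1f} fps), "
          f"total {tot_frame_ms:6.2f} ms/frame ({tot_fps:5.1f} fps)")
```

Because the refinement and update steps run once per batch, their cost is amortized over N frames, which is why the total per-frame time stays close to the detection time.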


3.7 Discussion

In this chapter, we have developed a GPU-accelerated, real-time, and robust tracking system. Our system uses the p-BHT algorithm, which produces robust tracking results for articulated objects. The system is implemented on the GPU using the CUDA programming model proposed by NVIDIA. The implementation creates a large number of threads in order to efficiently utilize the parallel processing ability offered by the GPU. These threads occupy the cores available in the GPU and perform the computations in parallel. Our system efficiently utilizes the global and shared memory bandwidths. For creating the integral histograms, the frames were stored in texture memory, which uses data caching and reduces the data read time. The GPU's shared memory access time is several times faster than that of global memory. To take advantage of this, the appearance histogram of the target was loaded into the shared memory. While reading the data from the global memory, we used the float4 data type. This helped to read four floating-point values in one memory read operation and reduced the total number of global memory read instructions. In terms of performance, the system runs at 56 to 78 fps, which is several times faster than the BHT algorithm developed in Chapter 2. In the future, we plan to implement our system on a more powerful GPU and analyze the performance of our algorithm.
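For reference, the integral histogram that the detection kernels rely on (Section 3.5) can be sketched sequentially on the CPU. This is a minimal illustration rather than the GPU implementation: the function names are hypothetical, the array is padded to (R+1) × (C+1) × B for convenience instead of the R × C × B layout mentioned in the text, and uniform intensity binning is an assumption.

```python
def integral_histogram(frame, bins, levels=256):
    """Build H where H[r][c][b] counts pixels of bin b in frame[0:r][0:c].

    frame is a list of rows of integer intensities in [0, levels).
    """
    rows, cols = len(frame), len(frame[0])
    H = [[[0] * bins for _ in range(cols + 1)] for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            b = frame[r][c] * bins // levels   # uniform binning (assumption)
            for k in range(bins):
                # Standard inclusion-exclusion recurrence, one copy per bin.
                H[r + 1][c + 1][k] = H[r][c + 1][k] + H[r + 1][c][k] - H[r][c][k]
            H[r + 1][c + 1][b] += 1
    return H


def region_histogram(H, top, left, bottom, right):
    """Histogram of frame[top:bottom][left:right] using four lookups per bin."""
    bins = len(H[0][0])
    return [H[bottom][right][k] - H[top][right][k]
            - H[bottom][left][k] + H[top][left][k] for k in range(bins)]


frame = [[0, 1, 2, 3],
         [3, 2, 1, 0],
         [0, 0, 3, 3]]
H = integral_histogram(frame, bins=2, levels=4)
print(region_histogram(H, 1, 1, 3, 3))
```

The inner loop over bins is independent for each bin index, which is the property the text exploits by giving each GPU thread one frame and one bin. Once H is built, the histogram of any candidate window or block costs a constant number of reads per bin, regardless of its size.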


Table 3-1. Center location errors of the integral histogram-based tracker, the mean-shift tracker, the BHT and the p-BHT

                Integral hist. tracker    Mean-shift tracker        BHT                       p-BHT
Sequence        Max     Mean    Std       Max     Mean    Std       Max     Mean    Std       Max     Mean    Std
Female skater   55.2    27.9    11.9      69.2    26.9    17.2      35.8    12.3    6.6       26.7    10.5    4.9
Male skater     168.0   36.8    25.9      165.4   45.3    33.6      36.4    13.2    7.4       37.4    12.8    57.8
Indian dancer   25.5    12.2    6.0       42.0    22.3    12.1      23.8    9.3     4.3       21.6    9.6     5.0
Dancer          57.5    30.2    10.3      59.4    39.7    12.0      33.1    16.0    6.7       41.1    20.9    7.5

Table 3-2. Coverage errors of the integral histogram-based tracker, the mean-shift tracker, the BHT and the p-BHT

                Integral hist. tracker    Mean-shift tracker        BHT                       p-BHT
Sequence        Max     Mean    Std       Max     Mean    Std       Max     Mean    Std       Max     Mean    Std
Female skater   79.9    59.7    10.6      100.0   65.5    16.9      60.7    48.2    7.8       60.7    48.4    7.8
Male skater     100.0   63.6    14.4      100.0   68.7    19.8      78.9    52.4    13.4      78.9    52.3    13.1
Indian dancer   66.7    49.8    7.8       86.6    59.0    15.5      59.0    45.6    7.1       59.0    45.8    7.0
Dancer          87.2    61.1    10.5      87.0    70.0    10.3      71.7    47.1    9.9       70.5    51.6    9.1

Table 3-3. Performance analysis of different sub-steps in the multi-frame detection step.

                Copy host   Copy device  Integral    Detection  Best window  Total time   Total time  Frame rate
                to device   to host      histogram   score      selection    /iteration   /frame
Sequence        (msec)      (msec)       (msec)      (msec)     (msec)       (msec)       (msec)      (fps)
Female skater   5.73        5.19         83.09       57.51      0.17         151.70       10.84       92
Male skater     4.17        4.99         81.92       43.11      0.16         134.34       13.43       74
Dancer          4.36        5.05         82.04       60.96      0.16         152.57       15.26       66
Indian dancer   5.68        5.43         83.12       83.24      0.17         177.65       12.69       79


Table 3-4. Computational performance of the p-BHT algorithm.

                Frames                                           Detection  Refinement  Update   Total
Sequence        /iteration  Measure                              (GPU)      (CPU)       (CPU)    time
Female skater   14          average time/iteration (in msec)     151.70                          180.54
                            average time/frame (in msec)         10.84      8.40        20.44    12.89
                            frame rate (in fps)                  92.00      119.00      49.00    78.00
Male skater     10          average time/iteration (in msec)     134.34                          160.99
                            average time/frame (in msec)         13.43      7.72        18.93    16.10
                            frame rate (in fps)                  74.00      130.00      53.00    62.00
Dancer          10          average time/iteration (in msec)     152.57                          177.27
                            average time/frame (in msec)         15.26      6.43        18.27    17.73
                            frame rate (in fps)                  66.00      155.00      55.00    56.00
Indian dancer   14          average time/iteration (in msec)     177.65                          208.39
                            average time/frame (in msec)         12.69      9.74        21.00    14.88
                            frame rate (in fps)                  79.00      102.00      648.00   67.00


Figure 3-1. Comparison of computational speed between the p-BHT and the BHT algorithm.

Figure 3-2. Block diagram of the CUDA Hardware Architecture (Source: [55])


Figure 3-3. Left: CUDA Programming Model. Right: CUDA Memory Model. (Source: [55])

Figure 3-4. Block diagram of the p-BHT algorithm.
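The control flow summarized in Figure 3-4 can be sketched as a batching loop: each iteration detects the target in a batch of N frames, then runs refinement and update once on the last frame of that batch. The sketch below is illustrative only. The names `p_bht_schedule` and `track` are hypothetical, the three step functions are passed in as placeholders for the real detection, refinement, and update code, and letting the final batch be shorter than N is an assumption the text does not address.

```python
def p_bht_schedule(num_frames, n):
    """Yield (batch, last) pairs: frame indices detected together, and the
    single frame on which refinement and update are run."""
    for start in range(0, num_frames, n):
        batch = list(range(start, min(start + n, num_frames)))
        yield batch, batch[-1]


def track(frames, n, detect_batch, refine, update, config):
    """Run the multi-frame detection / refinement / update loop."""
    windows = []
    for batch, last in p_bht_schedule(len(frames), n):
        # Multi-frame detection (the GPU step in the thesis): one window per frame,
        # all computed with the same block configuration.
        detected = detect_batch([frames[i] for i in batch], config)
        windows.extend(detected)
        # Refinement and update (CPU steps) only on the last frame of the batch.
        segmentation = refine(frames[last], detected[-1])
        config = update(config, segmentation)
    return windows, config


# Toy stand-ins for the real steps, just to show the schedule.
frames = list(range(25))
windows, final_config = track(
    frames, 10,
    detect_batch=lambda batch, cfg: [(frame, cfg) for frame in batch],
    refine=lambda frame, window: frame,
    update=lambda cfg, segmentation: cfg + 1,
    config=0,
)
print([last for _, last in p_bht_schedule(len(frames), 10)])
```

With the toy stand-ins, the configuration counter makes the trade-off in choosing N visible: every frame in a batch is detected with the configuration computed at the end of the previous batch.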


Figure 3-5. Block diagram of the CPU-GPU implementation of the p-BHT algorithm.

Frames shown: Female Skater #35, #52, #68, #94, #117

Figure 3-6. Visual comparison of tracking results of the Female Skater Sequence. Top: Tracking results using only integral histogram. Middle: Tracking results using the mean-shift tracker. Bottom: Tracking results using the p-BHT algorithm.


Frames shown: Male Skater #29, #145, #175, #232, #281

Figure 3-7. Visual comparison of tracking results of the Male Skater Sequence. Top: Tracking results using only integral histogram. Middle: Tracking results using the mean-shift tracker. Bottom: Tracking results using the p-BHT algorithm.

Frames shown: Indian Dancer #9, #26, #82, #115, #134

Figure 3-8. Visual comparison of tracking results of the Indian Dancer Sequence. Top: Tracking results using only integral histogram. Middle: Tracking results using the mean-shift tracker. Bottom: Tracking results using the p-BHT algorithm.


Frames shown: Dancer #35, #59, #122, #160, #212

Figure 3-9. Visual comparison of tracking results of the Dancer Sequence. Top: Tracking results using only integral histogram. Middle: Tracking results using the mean-shift tracker. Bottom: Tracking results using the p-BHT algorithm.


Figure 3-10. Per-frame center location error plots using different tracking algorithms. Top row: (Left) Male Skater, (Right) Female Skater. Bottom row: (Left) Indian Dancer and (Right) Dancer sequence.


Figure 3-11. Per-frame coverage error plots using different tracking algorithms. Top row: (Left) Male Skater, (Right) Female Skater. Bottom row: (Left) Indian Dancer and (Right) Dancer sequence.


CHAPTER 4
CO-SUPERPIXELS: GENERATING CONSISTENT SUPERPIXELS ACROSS IMAGE FRAMES

4.1 Introduction

Many computer vision algorithms rely on parsing the image into small homogeneous image segments. Superpixels represent such spatially coherent regions that respect image edges and whose pixels share common color or texture information. Computing superpixels has recently received growing interest in the literature. For example, superpixels are used as building blocks in object detection [56], recognition [57 58], foreground segmentation [59] and many other vision algorithms [60]. Nowadays, many algorithms are available for computing superpixels. For instance, normalized cut algorithms [61 63] are often used for extracting superpixels from images. However, these algorithms are computationally very expensive. Levinshtein [64] proposed a level-set method that efficiently computes superpixels from an image. However, the algorithm provides no way to control the extent or size of superpixels.

Recently, Veksler et al. [65] proposed a discrete energy minimization framework for superpixel generation. The algorithm generates superpixels which are regular in size and shape. However, their approach shows inconsistency when applied to two very similar images independently. For example, the top row in Figure 4-1 shows superpixel maps of two similar images generated using their algorithm. In several areas, superpixel boundaries are inconsistent across the two images (some examples are marked with bold rectangles). Their framework generates superpixels that respect image edges. However, this is not sufficient for generating consistent superpixels across two similar images. Indeed, some consistency measure across the two images is required.

In this chapter, we address this problem by introducing an energy minimization framework which establishes consistency between corresponding superpixels of image pairs. This consistency is established by introducing a term which encodes shape and appearance measures between corresponding superpixels in the energy function. We


develop an iterative algorithm to minimize the energy function. The middle row in Figure 4-1 shows the consistent superpixel maps obtained using our algorithm.

Our contributions in this chapter are as follows. Firstly, we discuss an algorithm to generate superpixels from an image which are consistent with already computed superpixels of a similar image. Secondly, we develop an algorithm for computing consistent superpixels for pairs of successive video frames. We evaluate our algorithm with several examples to demonstrate its ability to generate consistent superpixels.

The rest of this chapter is organized as follows. In Section 4.2, we review the energy minimization framework for superpixelization of a single image. We develop an energy function to generate consistent superpixel maps in Section 4.3. Then, in Section 4.4, we present the details of our iterative algorithm, and finally, experimental results are discussed in Section 4.5.

4.2 Superpixel Generation from a Single Image

In this section, we review the energy minimization framework [65] for superpixel generation from a single image. Let $P$ be the set of pixels in the image and $L = \{1, 2, \ldots, |L|\}$ be the set of superpixel labels. Each superpixel with the label $l \in L$ is assumed to be contained in the bounding box $R_l$. The algorithm assigns a superpixel label $l \in L$ to each pixel $p \in P$. Let $f_p$ denote the superpixel assigned to pixel $p$ and let $f$ be the collection of all superpixel assignments. We refer to $f$ as a superpixel map. It is a map from $P$ to $L$, i.e. $f : P \mapsto L$. The assignment problem is formulated as an energy minimization problem, where the energy function is defined as

\[ E(f) = \sum_{p \in P} D_p(f_p) + \lambda \sum_{\{p,q\} \in N} w_{pq} V_{pq}(f_p, f_q) \tag{4} \]

This energy equation consists of two kinds of terms. The unary term $D_p(l)$ in Equation 4 specifies the cost of assigning the superpixel label $l$ to the pixel $p$. This term is used to regulate the size of the generated superpixels. More specifically, a pixel $p$ is


allowed to be assigned the label $l$ only if $p \in R_l$. Therefore, the unary term is defined as

\[ D_p(l) = \begin{cases} 1 & \text{if } p \in R_l \\ \infty & \text{otherwise} \end{cases} \tag{4} \]

The binary term $V_{pq}(l_p, l_q)$ in Equation 4 specifies the cost of labeling two neighboring pixels $p$ and $q$ with superpixel labels $l_p$ and $l_q$, respectively. It is defined as $V_{pq}(f_p, f_q) = \min(1, |f_p - f_q|)$ for all neighboring pixel pairs $\{p, q\}$. That is to say, if neighboring pixels are assigned to the same superpixel, then no cost is added to the energy function. On the other hand, if neighboring pixels are assigned different labels, then the penalty $w_{pq}$ is used. This binary term allows the superpixel boundaries to align with edges of the image.

A parameter, $\lambda$, controls the relative weight between the unary and the binary terms. The coefficient $w_{pq}$ decreases as the intensity difference between the neighboring pixels $p$ and $q$ grows. Let $I_p$ denote the intensity of pixel $p$ and $\mathrm{dist}(p, q)$ denote the distance between the two pixels $p$ and $q$. Then, following [66], $w_{pq}$ is computed from the image $I$ using

\[ w_{pq} = \exp\left( -\frac{(I_p - I_q)^2}{\mathrm{dist}(p, q)\, 2\sigma^2} \right) \tag{4} \]

Veksler et al. [65] used the $\alpha$-expansion algorithm [67] to minimize the energy function $E(f)$.

4.3 Consistent Superpixel Generation

Our goal in this section is to develop a framework for generating a superpixel map $f$ which is consistent with a given superpixel map $g$. We define $g$ as a map from $P$ to $L \cup \{0\}$, i.e. $g : P \mapsto L \cup \{0\}$. The superpixel map $g$ specifies preferred labels for a subset $Q$ of the pixels in $P$. For each $p \in Q$, $g_p \in L$, and if $p \notin Q$ then $g_p = 0$. In other words, a positive $g_p$ denotes a preferred superpixel label for the pixel $p$, since each superpixel has a distinct label which is a positive integer. Otherwise, if $g_p = 0$, then $g$ does not specify


any label preference for the pixel $p$. We refer to $g$ as a constraining map for $f$, and $Q$ is the set of constrained pixels.

We extend the energy function $E(f)$ to include the constraining map $g$. To measure the consistency between the superpixel maps $f$ and $g$, we introduce an additional unary term $S_p(f_p, g_p)$ in the extended energy function $E(f, g)$. For a given constraining map $g$, the map $f$ can be found by minimizing the extended energy function $E(f, g)$. The energy function can be written as follows

\[ E(f, g) = \sum_{p \in P} D_p(f_p) + \mu \sum_{p \in P} S_p(f_p, g_p) + \lambda \sum_{\{p,q\} \in N} w_{pq} V_{pq}(f_p, f_q) \tag{4} \]

where $\mu$ is the weight of the additional unary term. The additional unary term $S_p(f_p, g_p)$ adds a penalty to the energy function $E(f, g)$ whenever $f_p \neq g_p$ and thus encourages consistency between $f$ and $g$. The function $S_p(f_p, g_p)$ is defined as

\[ S_p(f_p, g_p) = \begin{cases} 0 & \text{if } f_p = g_p \\ 1 & \text{otherwise} \end{cases} \tag{4} \]

As stated earlier, we minimize the extended energy function $E(f, g)$ over $f$ for a given $g$ using the $\alpha$-expansion algorithm [67].

Now, we discuss some properties of the function $E(f, g)$. Let us assume that a superpixel map $g^0$ is given as $g^0_p = 0$ for all $p$. This means that $g^0$ does not specify preferred labels for any pixel $p \in P$. Then the energy function $E(f, g^0)$ becomes equivalent to the energy function $E(f)$: since $f_p \neq 0$ for every pixel, each term $S_p(f_p, g^0_p)$ equals 1, so $E(f)$ and $E(f, g^0)$ have the same optimal solution $f$ and their minimum energy values differ only by the constant $\mu |P|$. Moreover, for any other $g \neq g^0$, the following relation holds

\[ \min_f E(f, g) \leq \min_f E(f, g^0) \tag{4} \]

We also consider another case where the subset $Q$ is extended to a larger subset $S$, i.e. $Q \subset S$. Let the corresponding constraining superpixel maps be $g_Q$ and $g_S$,


respectively. We also assume that $g_Q$ and $g_S$ have the same preferred superpixel label for all pixels $p \in Q$. In such cases, the following relation holds

\[ \min_f E(f, g_S) \leq \min_f E(f, g_Q) \tag{4} \]

Thus, extending the subset of pixels where $g$ is positive does not increase $E(f, g)$. Moreover, the superpixel map $f$ tries to match $g$, which decreases the value of $E(f, g)$.

In the next section, we present the formulation of our novel energy function to compute a consistent superpixelization of a similar image pair. We also discuss an iterative algorithm to minimize the energy function.

4.4 Co-superpixel Generation Framework

Let us assume that we are given a pair of images $I^{(1)}$ and $I^{(2)}$, and we want to generate superpixels from these images. Since the images are similar to each other, the resultant superpixel maps $f^{(1)}$ and $f^{(2)}$ should also be similar and they should satisfy the following properties:

- For each superpixel in the map $f^{(1)}$, there should be a corresponding superpixel in the map $f^{(2)}$.
- Each corresponding superpixel pair should have very similar appearances.
- Each corresponding superpixel pair should have very similar shapes and sizes.

We formulate a novel energy function that enforces these constraints on the maps $f^{(1)}$ and $f^{(2)}$, and minimization of the energy function gives the consistent superpixelization of the images $I^{(1)}$ and $I^{(2)}$. Let $g^{(t)}$ be the constraining map for the map $f^{(t)}$ and $C$ be the binary matrix that encodes the correspondence between the superpixel sets of the images $I^{(1)}$ and $I^{(2)}$. The binary matrix $C$ has dimensions $|L^{(1)}| \times |L^{(2)}|$, where $|L^{(t)}|$ represents the maximum number of superpixels that can be generated from the image $I^{(t)}$. We impose the following constraints on the matrix $C$ to ensure that each


superpixel in one set has at most one corresponding superpixel in the other set

\[ \sum_{i=1}^{|L^{(1)}|} C(i, j) \leq 1 \tag{4} \]

\[ \sum_{j=1}^{|L^{(2)}|} C(i, j) \leq 1 \tag{4} \]

We now present the energy function for the joint superpixelization scheme of the image pair as follows

\[ E(f^{(1)}, g^{(1)}, f^{(2)}, g^{(2)}, C) = E^{(1)}(f^{(1)}) + E^{(2)}(f^{(2)}) + \mu \sum_{p \in P^{(1)}} S_p(f^{(1)}_p, g^{(1)}_p) + \mu \sum_{p \in P^{(2)}} S_p(f^{(2)}_p, g^{(2)}_p) - \beta \sum_{l_1=1}^{|L^{(1)}|} \sum_{l_2=1}^{|L^{(2)}|} C(l_1, l_2)\, \mathrm{Sim}(l_1, l_2; g^{(1)}, g^{(2)}) \tag{4} \]

where the term $E^{(t)}(f^{(t)})$ is the superpixelization cost of the image $I^{(t)}$ with the map $f^{(t)}$, and this is defined in the same way as the single-image energy $E(f)$:

\[ E^{(t)}(f^{(t)}) = \sum_{p \in P^{(t)}} D_p(f^{(t)}_p) + \lambda \sum_{\{p,q\} \in N^{(t)}} w^{(t)}_{pq} V_{pq}(f^{(t)}_p, f^{(t)}_q) \tag{4} \]

The third and the fourth terms of the joint energy function measure the consistency between the superpixel map $f^{(t)}$ and the constraining map $g^{(t)}$. Finally, the last term measures the similarity between the corresponding superpixels of the constraining maps $g^{(1)}$ and $g^{(2)}$. As mentioned earlier, the superpixel correspondence is specified by the binary matrix $C$. The negative sign indicates that increasing the superpixel similarity decreases the value of the energy function. The superpixel similarity term is weighted by the parameter $\beta$. In the joint energy function, the similarity between two superpixels $l_1$ and $l_2$, obtained from the constraining maps $g^{(1)}$ and $g^{(2)}$ respectively, is denoted by $\mathrm{Sim}(l_1, l_2; g^{(1)}, g^{(2)})$. We define this superpixel similarity term in Section 4.4.1, and to simplify the notation, the similarity term is denoted as $\mathrm{Sim}(l^{(1)}, l^{(2)})$ in the subsequent discussions.
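To make the structure of the joint energy concrete, the following sketch evaluates it for given label maps on tiny images. It only evaluates the function; it does not minimize it. All names are hypothetical, the parameter names `lam`, `mu`, `beta`, and `sigma` stand for the weights and the intensity scale, the similarity values are assumed to be precomputed and passed in as a table, the correspondence matrix is represented as a set of matched label pairs, and 4-connected neighbors at distance 1 are assumed.

```python
import math


def neighbor_pairs(rows, cols):
    """Yield each 4-connected neighbor pair {p, q} once."""
    for r in range(rows):
        for c in range(cols):
            if c + 1 < cols:
                yield (r, c), (r, c + 1)
            if r + 1 < rows:
                yield (r, c), (r + 1, c)


def single_image_energy(f, image, boxes, lam, sigma):
    """E(f): unary bounding-box term plus lam-weighted boundary term."""
    rows, cols = len(f), len(f[0])
    energy = 0.0
    for r in range(rows):
        for c in range(cols):
            top, left, bottom, right = boxes[f[r][c]]    # half-open box R_l
            inside = top <= r < bottom and left <= c < right
            energy += 1.0 if inside else math.inf        # D_p(l)
    for (r1, c1), (r2, c2) in neighbor_pairs(rows, cols):
        if f[r1][c1] != f[r2][c2]:                       # V_pq = 1 across a label change
            diff = image[r1][c1] - image[r2][c2]
            # w_pq with dist(p, q) = 1 for 4-connected neighbors
            energy += lam * math.exp(-diff * diff / (2.0 * sigma * sigma))
    return energy


def consistency_penalty(f, g):
    """Sum of S_p(f_p, g_p): 1 wherever f_p != g_p (g_p = 0 never matches)."""
    return sum(fp != gp for f_row, g_row in zip(f, g) for fp, gp in zip(f_row, g_row))


def joint_energy(maps, constraints, images, boxes, C, sim, lam, sigma, mu, beta):
    """maps, constraints, images, and boxes are pairs indexed by image (0 or 1)."""
    energy = sum(single_image_energy(maps[t], images[t], boxes[t], lam, sigma)
                 for t in (0, 1))
    energy += mu * sum(consistency_penalty(maps[t], constraints[t]) for t in (0, 1))
    energy -= beta * sum(sim[l1, l2] for (l1, l2) in C)  # C: set of matched label pairs
    return energy


maps = ([[1, 2]], [[1, 1]])
constraints = ([[1, 0]], [[1, 1]])
images = ([[0, 10]], [[0, 10]])
boxes = ({1: (0, 0, 1, 2), 2: (0, 0, 1, 2)}, {1: (0, 0, 1, 2)})
value = joint_energy(maps, constraints, images, boxes, C={(1, 1)},
                     sim={(1, 1): 0.8}, lam=2.0, sigma=5.0, mu=1.0, beta=0.5)
print(value)
```

In the toy call, the first image pays one boundary cost and one consistency penalty (its second pixel has no preferred label), while the matched pair lowers the energy through the similarity term.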


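4.4.1 Superpixel Similarity Metrics

We define the similarity $\mathrm{Sim}(l^{(1)}, l^{(2)})$ between the superpixels $l^{(1)}$ and $l^{(2)}$ as a weighted sum of appearance and shape similarity terms

\[ \mathrm{Sim}(l^{(1)}, l^{(2)}) = \gamma\, S_{appearance}(l^{(1)}, l^{(2)}) + (1 - \gamma)\, S_{shape}(l^{(1)}, l^{(2)}) \tag{4} \]

where $\gamma$ specifies the weight. The first term, $S_{appearance}(l^{(1)}, l^{(2)})$, measures the appearance similarity between the superpixels $l^{(1)}$ and $l^{(2)}$. To evaluate this similarity metric, we compute intensity histograms of the superpixels and compare them using histogram intersection [68]. The second term, $S_{shape}(l^{(1)}, l^{(2)})$, is computed as follows. Let $P^{(i)}_{l^{(i)}}$ be the set of all pixels which are assigned to the superpixel $l^{(i)}$. Also, let $\delta$ be a displacement, which can be added to a pixel in $P^{(1)}$ to map it to a pixel in $P^{(2)}$. We apply the displacement $\delta$ to all the pixels in $P^{(1)}_{l^{(1)}}$, and then find the mapped pixels in the second image, $P^{(2)}_{l^{(1)}} = P^{(1)}_{l^{(1)}} + \delta$. Now, the shape similarity metric of two superpixels $l^{(1)}$ and $l^{(2)}$ is defined as

\[ S_{shape}(l^{(1)}, l^{(2)}) = \max_{\delta} \frac{|P^{(2)}_{l^{(1)}} \cap P^{(2)}_{l^{(2)}}|}{|P^{(2)}_{l^{(1)}} \cup P^{(2)}_{l^{(2)}}|} \tag{4} \]

Also, the optimal displacement between the superpixels is determined by

\[ \delta(l^{(1)}, l^{(2)}) = \arg\max_{\delta} \frac{|P^{(2)}_{l^{(1)}} \cap P^{(2)}_{l^{(2)}}|}{|P^{(2)}_{l^{(1)}} \cup P^{(2)}_{l^{(2)}}|} \tag{4} \]

For a superpixel pair $l^{(1)}$ and $l^{(2)}$, $P^{(1)}_{l^{(1)}}$ maps to $P^{(1)}_{l^{(1)}} + \delta(l^{(1)}, l^{(2)})$ in the image $I^{(2)}$ and similarly, $P^{(2)}_{l^{(2)}}$ maps to $P^{(2)}_{l^{(2)}} + \delta(l^{(1)}, l^{(2)})^{-1}$ in the image $I^{(1)}$, where $\delta^{-1}$ denotes the inverse displacement. To simplify our discussion, we define the following notations

\[ P^{(2)}_{l_1 \to l_2} = P^{(1)}_{l^{(1)}} + \delta(l^{(1)}, l^{(2)}) \tag{4} \]

\[ P^{(1)}_{l_2 \to l_1} = P^{(2)}_{l^{(2)}} + \delta(l^{(1)}, l^{(2)})^{-1} \tag{4} \]

4.4.2 Iterative Energy Minimization

Direct minimization of the joint energy function in Equation 4 is difficult because of the presence of the similarity terms. To tackle this problem, we follow an alternating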


minimization approach. In this approach, the set of variables $V$ is partitioned into $k$ disjoint subsets, i.e. $V = V_1 \cup V_2 \cup \ldots \cup V_k$. Then, $k$ phases are performed iteratively. At each phase $i$, the variables in the set $V_i$ are estimated while fixing the remaining variables. The iterations over these $k$ phases are continued until a convergence criterion is met. Although this approach does not solve the original problem exactly, in most cases it produces very good approximate solutions.

The joint energy function contains the variables $f^{(1)}$, $f^{(2)}$, $g^{(1)}$, $g^{(2)}$, and $C$. We partition these variables into three disjoint subsets: $V_1 = f^{(1)}$, $V_2 = f^{(2)}$ and $V_3 = \{g^{(1)}, g^{(2)}, C\}$. Thus, our iterative minimization algorithm contains three phases. In the first phase, we estimate $f^{(1)}$ while fixing $f^{(2)}$, $g^{(1)}$, $g^{(2)}$ and $C$. Fixing these variables simplifies the joint energy function, and $f^{(1)}$ can be estimated using the equation

\[ E(f^{(1)}, g^{(1)}) = \sum_{p \in P^{(1)}} D_p(f^{(1)}_p) + \mu \sum_{p \in P^{(1)}} S_p(f^{(1)}_p, g^{(1)}_p) + \lambda \sum_{\{p,q\} \in N^{(1)}} w^{(1)}_{pq} V_{pq}(f^{(1)}_p, f^{(1)}_q) \tag{4} \]

This equation has exactly the same form as the extended energy function $E(f, g)$, and we can apply the $\alpha$-expansion algorithm [67] to estimate $f^{(1)}$. The second phase is exactly the same as the first phase. In this phase, we estimate the variable $f^{(2)}$ while the variables $f^{(1)}$, $g^{(1)}$, $g^{(2)}$ and $C$ are fixed. Again, we use the $\alpha$-expansion algorithm [67] to estimate $f^{(2)}$. In the third phase, we estimate the variables $g^{(1)}$, $g^{(2)}$ and $C$ while fixing the variables $f^{(1)}$ and $f^{(2)}$. First, the variable $C$ is estimated from the superpixel maps $f^{(1)}$ and $f^{(2)}$. Then, we estimate $g^{(1)}$ and $g^{(2)}$ from $C$.

We use the Hungarian algorithm [69] to determine the correspondence between the superpixel sets obtained from these maps. To compute the similarity between two superpixels in the Hungarian algorithm [69], we use our proposed superpixel similarity metric $\mathrm{Sim}(l^{(1)}, l^{(2)})$. For each corresponding pair $l^{(1)}$ and $l^{(2)}$, we set $C(l^{(1)}, l^{(2)}) = 1$ only if $\mathrm{Sim}(l^{(1)}, l^{(2)}) \geq \tau$. Here, $\tau$ is the similarity threshold. All other entries in $C$ are set to 0.
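The construction of the correspondence matrix can be illustrated with a small sketch. Note the substitution: instead of the Hungarian algorithm used in the text, the sketch searches all one-to-one assignments by brute force, which yields the same optimal assignment but is only practical for a handful of superpixels. The function names are hypothetical, similarities are assumed to be non-negative, the similarity table is assumed to have no more rows than columns, and the threshold comparison is taken as "greater than or equal".

```python
from itertools import permutations


def best_assignment(sim):
    """One-to-one assignment of rows to columns maximizing total similarity.

    Brute-force stand-in for the Hungarian algorithm; sim is a rectangular
    list of lists of non-negative values with len(sim) <= len(sim[0]).
    """
    n_rows, n_cols = len(sim), len(sim[0])
    if n_rows > n_cols:
        raise ValueError("expected at most as many rows as columns")
    best_total, best_cols = -1.0, None
    for cols in permutations(range(n_cols), n_rows):
        total = sum(sim[i][j] for i, j in enumerate(cols))
        if total > best_total:
            best_total, best_cols = total, cols
    return list(enumerate(best_cols))


def correspondence_matrix(sim, tau):
    """Binary matrix C: keep an assigned pair only if its similarity reaches tau."""
    C = [[0] * len(sim[0]) for _ in sim]
    for i, j in best_assignment(sim):
        if sim[i][j] >= tau:
            C[i][j] = 1
    return C


sim = [[0.9, 0.2, 0.1],
       [0.3, 0.4, 0.8]]
print(correspondence_matrix(sim, tau=0.85))
print(correspondence_matrix(sim, tau=0.50))
```

Lowering the threshold admits more matched pairs, while the one-to-one assignment guarantees that each row and each column of C contains at most one nonzero entry.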


The constraining maps $g^{(1)}$ and $g^{(2)}$ are constructed using the matrix $C$. Initially, we set $g^{(t)}_p = 0$ for all $p \in P^{(t)}$. For every superpixel pair $l^{(1)}$ and $l^{(2)}$, if $C(l^{(1)}, l^{(2)}) = 1$, we update $g^{(1)}$ and $g^{(2)}$ as follows. For each $p \in P^{(1)}_{l_2 \to l_1}$ we set $g^{(1)}_p = l^{(1)}$. Similarly, for each $p \in P^{(2)}_{l_1 \to l_2}$ we set $g^{(2)}_p = l^{(2)}$.

We repeatedly perform these phases and update the variables. The threshold $\tau$ is initially set to a high value, and the algorithm decreases the value of $\tau$ at every iteration to allow more superpixel correspondences in the matrix $C$. Decreasing $\tau$ introduces more nonzero entries in the constraining maps $g^{(1)}$ and $g^{(2)}$, and this helps to reduce the value of the joint energy function further. We continue the iterations until no further decrease is observed for a few iterations. The output of the iterative algorithm is the superpixels computed from the final superpixel maps $f^{(1)}$ and $f^{(2)}$.

4.5 Experimental Results

We use several examples to evaluate the proposed algorithm. Most of the examples were taken from the Middlebury dataset [70]. The weight $\mu$ of the additional unary term was set to 1 and the smoothness weighting parameter $\lambda$ was set to 200 empirically. The superpixel similarity weight parameter $\beta$ was set to 0.05. For each pair of images, we first used the superpixelization algorithm by Veksler et al. [65] to generate superpixels independently over the two given image frames. We found repeatedly that the superpixel boundaries at corresponding image points are quite different from each other. This happens mainly in areas where edges are close to each other. Secondly, we applied our joint framework that takes superpixel consistency into consideration. Our proposed algorithm was able to detect most of the inconsistent regions and rectify them. In some cases, our algorithm followed an already existing superpixel boundary present in one of the images and modified the corresponding superpixel in the other image. In other cases, the algorithm completely reconstructed the region for both images to establish the consistency. The results shown in the following figures demonstrate that the proposed algorithm produces more consistent superpixels across the frames. In


each of the Figures (4-4 to 4-5), we show the superpixel maps of the given image pair using Veksler's algorithm, our algorithm, and then an overlay of the superpixel maps from the two approaches to reveal the points of inconsistency in the former approach. We also report the increase in the average similarity of corresponding superpixel pairs between the initiation and the termination of the iterative algorithm for each of the shown image pairs. Table 4-1 summarizes these increases of superpixel similarities. Figure 4-3 shows the improvement in the average superpixel similarity score over iterations for the tested image pairs.

Table 4-1. Superpixel similarity at the initial and final iterations of our iterative joint superpixel generation scheme.

Sequence    Initial Similarity    Final Similarity
Army        0.83                  0.96
Sponza      0.72                  0.82
Wooden      0.75                  0.88
Urban2      0.67                  0.73
Urban3      0.70                  0.80
Mequon      0.62                  0.79


Figure 4-1. Consistent superpixel generation for a pair of images. Top row: Superpixel maps generated independently using Veksler's [65] approach. Middle row: Superpixel maps generated by our algorithm. Bottom row: Overlay of the superpixel maps of the two approaches. Our method produces more consistent superpixels. Parts with noticeable differences are enlarged. Images are best viewed in color.


Figure 4-2. Steps for computing the shape similarity between a pair of superpixels.
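The similarity computation of Section 4.4.1, whose shape part is illustrated in Figure 4-2, can be sketched as follows. This is an illustrative sketch under several assumptions: the function names are hypothetical, superpixels are given as sets of (row, column) coordinates, the displacement search is restricted to a small window of `max_shift` pixels in each direction, histograms are normalized to unit sum before the intersection is taken, and `gamma` stands for the weight between the appearance and shape terms.

```python
def histogram_intersection(h1, h2):
    """Intersection of two histograms after normalizing each to unit sum."""
    s1, s2 = float(sum(h1)), float(sum(h2))
    return sum(min(a / s1, b / s2) for a, b in zip(h1, h2))


def shape_similarity(pixels1, pixels2, max_shift):
    """Maximum overlap ratio (intersection over union) over displacements.

    Returns (best score, best displacement as (d_row, d_col)).
    """
    target = set(pixels2)
    best_score, best_shift = -1.0, (0, 0)
    for dr in range(-max_shift, max_shift + 1):
        for dc in range(-max_shift, max_shift + 1):
            moved = {(r + dr, c + dc) for r, c in pixels1}
            score = len(moved & target) / len(moved | target)
            if score > best_score:
                best_score, best_shift = score, (dr, dc)
    return best_score, best_shift


def similarity(h1, h2, pixels1, pixels2, gamma, max_shift=3):
    """Weighted sum of the appearance and shape similarity terms."""
    shape, _ = shape_similarity(pixels1, pixels2, max_shift)
    return gamma * histogram_intersection(h1, h2) + (1.0 - gamma) * shape


square1 = {(0, 0), (0, 1), (1, 0), (1, 1)}
square2 = {(2, 3), (2, 4), (3, 3), (3, 4)}
print(shape_similarity(square1, square2, max_shift=3))
print(similarity([4, 0], [2, 2], square1, square2, gamma=0.5))
```

The displacement returned by `shape_similarity` plays the role of the optimal displacement between the two superpixels, which the text later reuses to transfer the pixels of one superpixel into the other image.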


Figure 4-3. Improvement in the average superpixel similarity score using our iterative joint superpixel generation scheme.


Figure 4-4. Superpixel generation results for the Sponza sequence (Frames 11 and 12). Top row: independent superpixel generation approach. Middle row: our joint superpixel generation algorithm. Bottom row: Overlay of the superpixel maps from the two methods.

Figure 4-5. Superpixel generation results for the Wooden sequence (Frames 11 and 12). Top row: independent superpixel generation approach. Middle row: our joint superpixel generation algorithm. Bottom row: Overlay of the superpixel maps from the two methods.


Figure 4-6. Superpixel generation results for the Urban3 sequence (Frames 11 and 12). Top row: independent superpixel generation approach. Middle row: our joint superpixel generation algorithm. Bottom row: Overlay of the superpixel maps from the two methods.

Figure 4-7. Superpixel generation results for the Mequon sequence (Frames 11 and 12). Top row: independent superpixel generation approach. Middle row: our joint superpixel generation algorithm. Bottom row: Overlay of the superpixel maps from the two methods.


Figure 4-8. Superpixel generation results for the Urban2 sequence (Frames 11 and 12). Top row: independent superpixel generation approach. Middle row: our joint superpixel generation algorithm. Bottom row: overlay of the superpixel maps from the two methods.

Figure 4-9. Superpixel generation results for the Army sequence (Frames 7 and 8). Top row: independent superpixel generation approach. Middle row: our joint superpixel generation algorithm. Bottom row: overlay of the superpixel maps from the two methods.


CHAPTER 5
TRACKING USING SUPERPIXEL-BASED APPEARANCE MODEL

5.1 Introduction

Visual tracking in uncontrolled environments is an important problem in computer vision research because of its applications in many fields including surveillance [71], human-computer interaction [72] and medical image analysis [73]. Despite significant advances in recent years, visual tracking is still a challenging problem. The main difficulties in tracking a target over a long period of time lie in handling appearance changes, illumination, pose and scale variations, and in recovering from target occlusions. Therefore, the target appearance model needs to be updated adaptively to handle these challenges. Adaptive models of the target appearance have been proposed in many visual tracking algorithms [4, 74, 78]. In these models, the appearance is continuously updated with new tracking results. Some tracking algorithms also model the background surrounding the target. Depending on the foreground and the background modeling approach, trackers are divided into two classes, generative and discriminative.

Discriminative tracking methods [5, 79] are posed as binary classification problems where the task is to distinguish the target region from the background. This requires modeling the foreground and the background separately, which also helps to handle partial occlusions of the target. However, the update of these appearance models is often very costly. On the other hand, generative tracking methods [76, 77, 80] track a target object by searching for the region most similar to the reference model. For these methods, the update of the appearance model is more efficient, but without an occlusion detection mechanism the update can cause the tracker to drift away from the target when it is partially occluded. In this chapter we propose a generative tracking algorithm that addresses this problem and provides a solution for robust visual tracking. In particular, our tracking algorithm presents a novel superpixel-based appearance model with a simple but effective model update mechanism. Our algorithm also robustly handles occlusions and scale variations of the target.

The rest of the chapter is organized as follows. We first describe related work in Section 5.2. The proposed superpixel-based appearance model is presented in Section 5.3. Section 5.4 outlines the details of the proposed tracking algorithm, including the model update and occlusion detection. Experimental results with qualitative and quantitative evaluation and a comparison of the tracker with other algorithms are presented in Section 5.5.

5.2 Related Work

Reliable modeling of object appearance is very important for robust tracking. It helps the tracker to find the target amid various challenges in the scene. Also, to handle the variations of the target, the appearance model should be easy to update. For robust and adaptive appearance modeling, tracking algorithms available in the literature use various high-, mid- or low-level image features. In high-level appearance modeling paradigms, the target is considered as a single region and features such as the color or intensity histogram [2, 81] are used to describe the region. However, this type of modeling completely disregards the spatial configuration of the target, and hence the tracker often confuses the target with similar backgrounds. Moreover, this holistic modeling produces jittered tracking results for targets with large shape variations even if the appearance of the target remains stationary [82]. Low-level image cues (for example SIFT [83], SURF [84], LBP [85]) are mainly used in tracking feature points. However, these low-level image cues are not suitable for tracking objects in long sequences.

Recently, mid-level cues were effectively used for tracking objects in long and challenging sequences [75, 86, 88]. Specifically, the target region is divided into smaller regions and the appearance model is constructed from the feature vectors describing those regions. In [75], the authors propose a tracker that models the target by dividing the target region into fragments. Local histograms computed from the fragments are used to model the target. This makes the tracker robust against partial occlusions. However, this fragment-based appearance model is not adaptive, and hence the tracker fails to track targets having large appearance variations over time. In [86], small image patches are selected from the target to model its appearance. These patches are called attentional regions (AR) of the target. The tracker combines the vote maps obtained from these ARs to decide the target location in the new image frame. However, the authors did not show whether the tracker handles large shape or appearance variations of the target. A similar patch-based appearance model is also proposed in [3], where a particle filtering-based framework is used for tracking. While the authors demonstrate robust tracking results for targets having large shape variations, they report few experiments with occluded targets. Liu et al. [87] present a tracking algorithm where local patches are sparsely coded using a learned dictionary. However, sparse coding of a large number of patches and dynamic update of the dictionary are very costly for tracking applications.

While all of the aforementioned approaches use rectangular patches, some algorithms use superpixels to model the target. In [89], a superpixel-based tracking algorithm was proposed that computes superpixels from an image frame and classifies them as foreground or background. It is very inefficient to superpixelize a whole image just to track a small target in the image. To alleviate this problem, the authors in [90] compute superpixels from a limited region around the target.

Apart from reliable modeling of the visual target, another important aspect of a tracking algorithm is updating the model to adapt to the variations of the target. In the Visual Tracking Decomposition (VTD) framework [78], several basic appearance models are constructed using the initial target region and the four most recent tracking results. Basic trackers constructed from these models are used interactively to track the target. This method performs well with the proposed adaptive appearance model for targets having pose and illumination variations. However, the algorithm loses the target under heavy occlusion. The PROST tracking algorithm [4] uses three different tracking modules: non-adaptive, fully adaptive and semi-adaptive. The PROST algorithm proposes criteria for updating the semi-adaptive module based on the outputs from the two other modules. Although the algorithm can track occluded targets, it does not explicitly detect occlusion. The MILTrack algorithm [5] is considered to be a state-of-the-art discriminative tracking algorithm that uses multiple-instance learning [91]. The foreground and background appearance (positive and negative samples) are updated regularly to deal with the appearance variations of the target.

Our proposed algorithm uses mid-level image features to model the target appearance. In particular, the algorithm models the target using superpixels [60, 64, 65]. Our algorithm computes a superpixel map and then superimposes a rectangular grid on top of the map. Each grid point is assigned the feature vector associated with the corresponding superpixel. A superpixel-based similarity measure is then evaluated between the model and the candidate windows to find the target in the new frame. In order to account for appearance variations, the model is updated with new tracking results. To keep the model robust, the model uses superpixels from the initial frame and the most recent frame. Using these superpixels, we construct the adaptive appearance model. The proposed algorithm also uses an occlusion detection module that guides the model update process. While superpixels have been used in tracking before [89, 90], our work is distinguished by three aspects. First, we incorporate spatial information by placing a grid on top of the superpixel map. Second, our model update process is much simpler than that of [90]. Third, our model is generative while the previous approaches are discriminative. Our experimental results, presented in Section 5.5, show that the algorithm produces better results.

5.3 The Appearance Model

In this section, we describe the construction of the target appearance model. The initial appearance model is constructed from the first frame of the sequence. The target is manually specified as a rectangular region $R$ in this frame. Inside $R$, we establish a rectangular grid $G$ as shown in Figure 5-1. Assuming that the grid $G$ contains $N$ grid points, the collection of these grid points is specified by $G = \{g_i\}_{i=1}^{N}$. The relative location of a grid point $g_i$ is denoted by $p_i = \mathrm{pos}(g_i) - \mathrm{pos}(\mathrm{upperLeft}(R))$, where $\mathrm{pos}(g_i)$ is the pixel location of the grid point $g_i$ and $\mathrm{pos}(\mathrm{upperLeft}(R))$ is the pixel location of the upper left corner of the initial rectangle $R$. Each of these grid points is annotated with a feature vector $f_i$. The collection of these feature vectors, $F = \{f_i\}_{i=1}^{N}$, on the grid constitutes the initial appearance model of the target. The spatial information of the target is thus encoded in the proposed appearance model through the grid $G$ and the feature set $F$. This spatial model makes the tracker robust against partial occlusions and also helps it to deal with targets of complex appearance.

In this work, we use mid-level image cues to compute the feature vector set $F$. More specifically, the superpixels computed inside the rectangle $R$ are used for this purpose as shown in Figure 5-1. To extract the superpixels, we use the energy minimization algorithm proposed in [65]. This algorithm efficiently computes superpixels whose boundaries are aligned with the edges present in the image. The algorithm produces a superpixel map $M$ that has the same dimensions as the rectangle $R$. For each grid point $g_i$, we determine the associated superpixel from the superpixel map $M$. In particular, let $m_i = M(p_i)$ specify the superpixel associated with the grid point $g_i$ located at the position $p_i$. The intensity or color histogram $h_{m_i}$ of the superpixel $m_i$ is assigned as the feature $f_i$ for the grid point $g_i$.

Instead of computing the superpixels, one possible alternative would have been to use the histogram computed from the rectangular image patch centered at $\mathrm{pos}(g_i)$. Although this alternative is computationally efficient, it completely ignores the edges of the image. In contrast, our approach of using superpixels implicitly encodes the edges in the target appearance model as well. This makes the appearance model more informative about the target than the patch-based alternative. Moreover, the superpixel computation algorithm we use runs reasonably fast in the rectangle $R$, which is only a small region of the entire image. The update of our proposed appearance model is also very simple and intuitive. The feature vector set $F$ is updated to handle target appearance changes due to pose or illumination variations. To handle target scale variations, the grid $G$ is updated. In the next section, we present our tracking algorithm using this novel appearance model.

5.4 Tracking Algorithm

At the beginning of the proposed algorithm, the initial appearance model is constructed using the first frame as described in Section 5.3. Then, for each subsequent frame $I_t$ of the sequence, the proposed tracking algorithm produces a rectangle $R_t$ enclosing the target. Localizing the target in the new image is formulated as a search problem in our algorithm (Section 5.4.1). Briefly, this search is performed as follows. First, a set of candidate rectangles $\mathcal{R} = \{R_j\}_{j=1}^{J}$ is generated. Then, the algorithm evaluates the similarity between the current target appearance model and the appearance of each candidate rectangle. Finally, the candidate rectangle having the highest similarity score is selected as the rectangle $R_t$ containing the target. The next step after the localization of the target is to determine whether the target is occluded (Section 5.4.2). The update of the appearance model is performed next (Section 5.4.3). This update depends on the outcome of the occlusion detection step. The final step is to adjust the scale of the appearance model (Section 5.4.4). These four steps (target localization, occlusion detection, model update and scaling) are repeated in each frame to track the target. A high-level overview of the algorithm is presented in Figure 5-2.

5.4.1 Target Localization

In this step, we determine the rectangle $R_t$ in the image $I_t$ that is the most similar to the current target appearance model. The best rectangle $R_t$ is selected from a set of candidate rectangles $\mathcal{R}$ as follows. Let $R_{t-1}$ be the best window in the image $I_{t-1}$. Also let $c_{t-1}$, $w_{t-1}$ and $h_{t-1}$ be the center, width and height of $R_{t-1}$, respectively.


Any rectangle $R_j$ in the image $I_t$ whose dimensions are $w_{t-1} \times h_{t-1}$ and whose center lies within distance $d$ of the pixel location $c_{t-1}$ is included in the set $\mathcal{R}$. The union of the rectangles in the set $\mathcal{R}$ forms a rectangular search region $\bar{R}$ in the image $I_t$ whose center is at the position $c_{t-1}$ and whose dimensions are $(w_{t-1} + 2d) \times (h_{t-1} + 2d)$. The region $\bar{R}$ is superpixelized in order to compute the superpixel-based appearance of each candidate rectangle $R_j$ in the set $\mathcal{R}$. The superpixel generation algorithm [65] produces a superpixel map $\bar{M}$ for the region $\bar{R}$. The intensity histograms $\{h_k\}_{k=1}^{S}$ of these superpixels $\{s_k\}_{k=1}^{S}$ are then estimated, where $S$ is the number of superpixels generated by the algorithm. For each pixel $p$ in $\bar{R}$, the corresponding superpixel can be determined from $\bar{M}(l)$, where $l = \mathrm{pos}(p) - \mathrm{pos}(\mathrm{upperLeft}(\bar{R}))$ and $\mathrm{pos}(x)$ is the pixel location of $x$. The appearance of the rectangle $R_j$ is computed as follows.

First, we superimpose the grid $G$ on the rectangle $R_j$. This gives us the grid-point locations $\{p'_i\}_{i=1}^{N}$ of $R_j$. Then, using these locations, we determine the superpixel and histogram associated with each of the grid points. The resulting histogram set $F^j = \{f^j_i\}_{i=1}^{N}$ represents the appearance of the candidate rectangle. We then compute the similarity between the histogram set $F^j$ of the candidate rectangle $R_j$ and the corresponding set $F$ of the target model. We define the similarity score between two histogram sets, $\mathrm{sim}(F^1, F^2)$, as follows:

$$\mathrm{sim}(F^1, F^2) = \sum_{i=1}^{N} \mathrm{hist\_sim}(f^1_i, f^2_i) \tag{5-1}$$

where $\mathrm{hist\_sim}(f^1_i, f^2_i)$ represents the similarity between the histograms $f^1_i$ and $f^2_i$. We use histogram intersection [68] to calculate the similarity between the histograms:

$$\mathrm{hist\_sim}(f^1_i, f^2_i) = \sum_{b=1}^{B} \min\left(f^1_i(b), f^2_i(b)\right) \tag{5-2}$$

where $B$ is the number of bins in the histogram. We use Equation 5-1 to calculate the appearance similarity of each candidate rectangle in the set $\mathcal{R}$. Then, the rectangle with the highest similarity is chosen to be the target rectangle $R_t$.

5.4.2 Occlusion Detection

Our tracking algorithm uses an adaptive appearance model of the target. The appearance model is updated when a new tracking result is available. However, the update may cause the tracker to drift away from the target if the target is currently under partial occlusion and the appearance model is updated with this partially occluded tracking result. For this reason, our proposed tracking algorithm first detects whether the target is occluded and only then updates the appearance model. Next, we describe the occlusion detection step.

In the target localization step, the rectangle $R_t$ containing the target in the image $I_t$ is determined. The appearance of $R_t$ is further analyzed in the occlusion detection step. In particular, we estimate the appearance similarity of $R_t$ with recent backgrounds. If this appearance similarity score is large (i.e., the appearance of $R_t$ is highly similar to recent backgrounds), we infer that the target is occluded. The recent backgrounds are modeled as a set of rectangles $\mathcal{B}$ taken from the $L$ most recently processed frames $(I_{t-1}, I_{t-2}, \ldots, I_{t-L})$. At each frame $I_{t-j}$, we include the rectangle $B_{t-j}$, centered at $c_t$ and with dimensions $w_t \times h_t$, in the set $\mathcal{B}$ only if $|c_t - c_{t-j}| > d_{occ}$ (i.e., if the center of the rectangle $B_{t-j}$ is more than $d_{occ}$ away from the center of the rectangle $R_{t-j}$). We estimate the similarities between $R_t$ and each rectangle in the set $\mathcal{B}$ and calculate the average value $\rho$. If $\rho > \tau_{occ}$, we conclude that the target is partially occluded. Here, $\tau_{occ}$ is the occlusion detection threshold.

5.4.3 Model Update

As described in Section 5.3, the proposed target appearance model consists of a grid $G$ and a collection of feature vectors $F$ (each grid point $g_i \in G$ has a feature vector $f_i \in F$). Let $F^1$ and $F^t$ denote the feature vector sets obtained by imposing the grid $G$ on the rectangles $R_1$ and $R_t$, respectively. In order to handle the appearance variations of the target, the model is continuously updated using the feature sets $F^1$ and $F^t$. Using only the initial and the most recent frames keeps the model update simple and, more importantly, robust against drift [78]. The model feature set $F$ is constructed from $F^1$ and $F^t$ as follows.

Each feature vector $f_i \in F$ is constructed as a convex combination of the vectors $f^1_i \in F^1$ and $f^t_i \in F^t$. The weight of this convex combination depends on the outcome of the occlusion detection step. If the target is not occluded ($\rho \le \tau_{occ}$), we put more weight on the current features $f^t_i$; otherwise, more weight is given to the initial features $f^1_i$:

$$f_i = \begin{cases} w_1 f^1_i + (1 - w_1) f^t_i & \text{if } \rho > \tau_{occ} \\ w_2 f^1_i + (1 - w_2) f^t_i & \text{if } \rho \le \tau_{occ} \end{cases} \tag{5-3}$$

where $w_1 > w_2$. Also, when the algorithm detects occlusion ($\rho > \tau_{occ}$), the target is likely to remain occluded for some time. Therefore, for the next few frames, the algorithm skips the occlusion detection step and favors the initial features.

5.4.4 Scaling

The final step of our algorithm is to determine the scale of the target. The target appearance model is updated accordingly to handle scale variations. To determine the scale of the target in the current frame, a set of rectangles is generated by varying the size of the rectangle $R_t$ while keeping the centers of the new rectangles fixed at $c_t$ (the center of $R_t$). The appearances of these rectangles are compared with the target appearance model, and the scale of the rectangle having the highest similarity is selected as the updated scale of the target. The update of the model is performed as follows. First, the appearance model is resized to the selected scale. Then, a new grid $G'$ is created to replace $G$. Finally, the feature vector set $F$ is recomputed using the new grid $G'$. In our implementation, we perform this scaling step every 4 to 5 frames rather than at every frame.
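The occlusion test of Section 5.4.2, the convex-combination update of Section 5.4.3 and the scale candidates of Section 5.4.4 can be sketched in a few lines of Python. This is a minimal illustration, not the dissertation's implementation. The function names, the (x, y, width, height) rectangle layout, the behavior when no background rectangle qualifies, and the scale factors are all assumptions. The default threshold and weights are mid-range picks from the ranges reported in Section 5.5.1.

```python
def is_occluded(background_similarities, occ_threshold=0.65):
    """Compare the average similarity to recent backgrounds with a threshold.

    background_similarities holds one normalized similarity score per
    background rectangle. An empty list is treated as "not occluded",
    which is an assumption of this sketch.
    """
    if not background_similarities:
        return False
    average = sum(background_similarities) / len(background_similarities)
    return average > occ_threshold


def update_features(initial, current, occluded, w1=0.85, w2=0.55):
    """Blend initial and current histograms per grid point.

    Each output histogram is w * initial + (1 - w) * current, where w = w1
    when the target is occluded and w = w2 otherwise (w1 > w2), so an
    occluded frame contributes less to the model.
    """
    w = w1 if occluded else w2
    return [
        [w * a + (1.0 - w) * b for a, b in zip(f_init, f_cur)]
        for f_init, f_cur in zip(initial, current)
    ]


def scaled_rectangles(center, width, height, scales=(0.95, 1.0, 1.05)):
    """Candidate rectangles (x, y, w, h) of varying size with a fixed center.

    The scale factors are hypothetical; the text does not list them.
    """
    cx, cy = center
    return [
        (cx - s * width / 2.0, cy - s * height / 2.0, s * width, s * height)
        for s in scales
    ]
```

For example, `update_features([[1.0, 0.0]], [[0.0, 1.0]], occluded=True, w1=0.8, w2=0.5)` keeps 80% of the initial histogram, whereas the same call with `occluded=False` gives an even blend.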


5.5 Experimental Results

We evaluated our algorithm using twelve challenging sequences: the Sylvester sequence [76], the Woman and Faceocc1 sequences [75], the Basketball and Singer1 sequences [78], the Tiger1, Tiger2 and Faceocc2 sequences [5], and the Lemming, Liquor, Box and Board sequences [4]. The tracking performance of our proposed structured superpixel-based tracker (SSP) has been compared with several state-of-the-art tracking algorithms: the multiple-instance-learning tracker (MIL) [5], the visual tracking decomposition tracker (VTD) [78], the SemiBoost tracker (SB) [79], the super-pixel tracker (SPT) [90], the Online AdaBoost tracker (OAB) [92], the Online Random Forest (ORF) tracker [93], the Parallel Robust Online Simple Tracker (PROST) [4], and the fragment-based tracker (FRAG) [75].

5.5.1 Experimental Setup

For each video sequence, we compute the superpixel map of the initial frame using the energy minimization framework of Veksler et al. [65], where the patch size is set to 8 to 14 pixels, the smoothness weight term to 100 and the number of iterations to 10. Then a grid with a spacing of 4 to 7 pixels is placed on top of the superpixel map. This grid spacing is selected based on the size of the target so that the number of grid points is roughly 200 in all experiments. The superpixel features are intensity histograms of 16 to 20 bins, where different color channels are processed independently. The similarity between tracking windows is computed according to Equation 5-1 and then normalized between 0 and 1. For occlusion detection, the 5 most recent frames are used to model the background. A distance threshold of 10 to 15 pixels is used to decide whether a window belongs to the background. The occlusion detection threshold is set to 0.6 to 0.7 depending on the foreground-background similarity of the specific sequence. The model update weights $w_1$ and $w_2$ in Equation 5-3 are set within the ranges 0.8 to 0.9 and 0.5 to 0.6, respectively.
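The similarity used throughout the experiments (histogram intersection summed over grid points, then normalized to the range 0 to 1) and the selection of the best candidate window can be sketched as follows. This is an illustrative sketch under stated assumptions, not the dissertation's code. Feature sets are assumed to be lists of per-grid-point histograms, each histogram is assumed to sum to 1 so that dividing by the number of grid points yields a score in [0, 1], and all function names are hypothetical.

```python
def hist_sim(h1, h2):
    """Histogram intersection: the sum of bin-wise minima."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def sim(features_a, features_b):
    """Sum of histogram intersections over corresponding grid points."""
    return sum(hist_sim(f1, f2) for f1, f2 in zip(features_a, features_b))


def normalized_sim(features_a, features_b):
    """Similarity scaled to [0, 1], assuming every histogram sums to 1."""
    return sim(features_a, features_b) / len(features_a)


def localize(model_features, candidate_features):
    """Return (index, score) of the candidate most similar to the model.

    candidate_features holds one feature set per candidate rectangle.
    """
    scores = [normalized_sim(model_features, c) for c in candidate_features]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]
```

A candidate whose histograms match the model exactly scores 1.0, and a candidate sharing no histogram mass with the model scores 0.0.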


5.5.2 Results and Discussion

Table 5-1 reports the average center location error [5], the quality metric used here, on the ten sequences for which ground-truth data are available. Because of space limitations, we show comparisons with five methods only; further experimental results are given in the supplemental materials. In most sequences, our method (SSP) tracked the target most accurately even in the presence of occlusions, pose variations, illumination changes and abrupt motions. In spite of the motion blur and heavy occlusions in the long Liquor sequence, our method achieved a significantly lower error than all other methods. Our error on the Lemming sequence is also low although this sequence exhibits large scale variations and motion blur.

In Figures 5-3 and 5-4, we present the results of our algorithm for the twelve aforementioned sequences. The Singer1 sequence contains large scale and illumination variations. Still, our tracker follows the singer accurately thanks to the robust adaptivity of our model. The Tiger1 and Tiger2 sequences exhibit fast motion, heavy occlusion and drastic appearance changes. Despite all of these challenges, our tracker robustly tracks the tiger faces in both sequences. The numerical error results in Table 5-1 support this conclusion. Indeed, our algorithm achieves the best performance on the Tiger2 sequence and is close to the PROST tracker on the Tiger1 sequence.

The Faceocc1, Faceocc2 and Woman sequences all suffer from severe occlusion and heavy appearance changes. However, because of the spatial information encoded into the proposed appearance model, our tracker managed to keep track of the respective objects (Figure 5-3). It achieves the lowest errors on Faceocc2 and Woman and is within 0.4 pixels of the best result (FRAG) on Faceocc1.

Figures 5-5 and 5-6 compare the performance of our algorithm with other algorithms on a few selected frames. As the examples show, our tracker recovers from severe occlusions and appearance changes while most of the other trackers fail. For example, in the Board sequence, our tracker follows the moving circuit accurately while other trackers drift away. Similarly, for the Box sequence, our tracker closely matches the PROST tracker in following the target. Figure 5-7 compares the per-frame tracking errors of our method with the other reference methods. The error of our method is lower than or comparable to the errors of the other methods.

Table 5-1. Quantitative evaluation of the SSP tracking algorithm. The numbers indicate the average center location errors (in pixels) of the tracking windows. The number of frames for each sequence is shown in parentheses next to the sequence name. A dash marks a result that is not reported.

Sequence (frames)   SSP (Ours)   MIL [5]   SB [79]   SPT [90]   PROST [4]   FRAG [75]
Board (698)         14.2         51.2      -         -          39.0        90.1
Box (1161)          17.1         104.6     -         -          12.1        57.4
Lemming (1336)      9.4          14.9      -         7.0        25.4        82.8
Liquor (1741)       5.3          165.1     -         9.0        21.6        30.7
Tiger1 (354)        10.0         15.0      46.0      -          7.0         40.0
Tiger2 (360)        14.0         17.0      53.0      -          -           17.0
Sylvester (1344)    7.0          9.4       16.0      -          11.0        11.0
Faceocc1 (886)      6.9          18.4      7.0       -          7.0         6.5
Faceocc2 (812)      13.1         20.0      23.0      -          17.0        45.0
Woman (539)         5.0          120.0     -         9.0        -           112.0
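The average center location error reported in Table 5-1 is the mean Euclidean distance, in pixels, between the center of the tracked window and the center of the ground-truth window. A minimal sketch follows; the (x, y, width, height) rectangle layout and the function names are assumptions made for illustration.

```python
import math


def center(rect):
    """Center (x, y) of a rectangle given as (x, y, width, height)."""
    x, y, w, h = rect
    return (x + w / 2.0, y + h / 2.0)


def average_center_error(tracked, ground_truth):
    """Mean Euclidean distance in pixels between tracked and true centers."""
    errors = []
    for tracked_rect, true_rect in zip(tracked, ground_truth):
        tx, ty = center(tracked_rect)
        gx, gy = center(true_rect)
        errors.append(math.hypot(tx - gx, ty - gy))
    return sum(errors) / len(errors)
```

Because only window centers enter the metric, a tracker can score well here while still misestimating the target's scale.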


Figure 5-1. Construction of the superpixel-based appearance model. Given an image window, we first generate the superpixels defined by the red boundaries. Second, the histogram features are computed for each superpixel. Third, we superimpose a grid on top of the superpixel map. Finally, each grid point (in green) is associated with the enclosing superpixel and its histogram feature vector.

Algorithm Outline
1. Initialization: Initialize the tracker and the target appearance model (Section 5.3).
2. Target Localization: Find the rectangle $R_t$ containing the target (Section 5.4.1).
3. Occlusion Detection: Decide whether the target is undergoing occlusion or not (Section 5.4.2).
4. Model Update: Update the appearance model using the most recent tracking result (Section 5.4.3).
5. Scaling: Adjust the scale of the appearance model (Section 5.4.4).

Figure 5-2. High-level outline of the SSP tracking algorithm.
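The construction summarized in Figure 5-1 (the initialization step of the outline) can be sketched as below. The superpixel map is taken as given, since the energy minimization algorithm of [65] is not reimplemented here. The function names, the nested-list image layout, the grid offset of half a spacing, and the per-superpixel normalization of the histograms are all assumptions of this sketch.

```python
from collections import defaultdict


def superpixel_histograms(image, label_map, bins=16, max_value=256):
    """Normalized intensity histogram for every superpixel label.

    image and label_map are equally sized lists of rows; image holds
    integer intensities in [0, max_value), label_map holds superpixel ids.
    """
    counts = defaultdict(lambda: [0] * bins)
    for pixel_row, label_row in zip(image, label_map):
        for value, label in zip(pixel_row, label_row):
            counts[label][value * bins // max_value] += 1
    histograms = {}
    for label, hist in counts.items():
        total = sum(hist)
        histograms[label] = [c / total for c in hist]
    return histograms


def grid_points(height, width, spacing):
    """Relative (row, col) grid locations inside a height x width rectangle."""
    start = spacing // 2
    return [
        (r, c)
        for r in range(start, height, spacing)
        for c in range(start, width, spacing)
    ]


def build_appearance_model(image, label_map, spacing=4, bins=16):
    """Return the grid and one histogram per grid point.

    Each grid point receives the histogram of the superpixel that
    encloses it, looked up through the superpixel map.
    """
    histograms = superpixel_histograms(image, label_map, bins)
    grid = grid_points(len(image), len(image[0]), spacing)
    features = [histograms[label_map[r][c]] for (r, c) in grid]
    return grid, features
```

Grid points that fall inside the same superpixel share one histogram, so the model stores spatial layout through the grid while the features follow superpixel boundaries.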


[Figure panels, frame numbers per sequence: Sylvester #129, #715, #1105; Woman #122, #240, #332; Faceocc1 #102, #277, #537; Faceocc2 #93, #180, #657; Tiger1 #24, #78, #200; Tiger2 #6, #149, #228; Singer1 #58, #136, #191; Basketball #84, #284, #625]

Figure 5-3. Tracking results using the SSP tracking algorithm.


[Figure panels, frame numbers per sequence: Box #236, #622, #977; Board #180, #456, #510; Lemming #330, #586, #948; Liquor #359, #778, #1042]

Figure 5-4. Tracking results using the SSP tracking algorithm on PROST sequences.

[Figure panels, frame numbers per sequence: Tiger1 #47, #85, #141; Tiger2 #185, #232, #265; Sylvester #833, #1087, #1177]

Figure 5-5. Visual comparison of tracking results obtained using different algorithms. Tracking results by our method (SSP), MIL, PROST and FRAG are represented by red, blue, cyan and green rectangles, respectively.


[Figure panels, frame numbers per sequence: Box #306, #500, #1093; Board #442, #527, #649; Liquor #449, #779, #1097; Lemming #328, #589, #1064]

Figure 5-6. Visual comparison of tracking results obtained using different algorithms on PROST sequences. Tracking results by our method (SSP), MIL, PROST and FRAG are represented by red, blue, cyan and green rectangles, respectively.


[Figure panels: Box, Board, Lemming, Liquor, Tiger1, Tiger2, Sylvester, Faceocc2]

Figure 5-7. Per-frame tracking error plots using different tracking algorithms. For each frame the center location error is shown on the Y-axis. Our method (SSP), MIL, PROST, FRAG, SB and OAB are represented by red, blue, cyan, green, black and magenta lines, respectively.


CHAPTER 6
CONCLUSION

In this dissertation, we presented novel algorithms for tracking articulated and non-articulated objects. We used mid-level image cues to efficiently model the targets. Our novel appearance models have been successfully used to track the targets in many challenging sequences.

We developed a tracking algorithm for articulated targets that uses a small number of rectangular blocks for efficient and rapid evaluation of intensity histograms. Trackers using integral histograms can quickly scan the entire image to localize the target, which gives them the capability to track rapid motions and to track across different shots. However, for objects undergoing significant shape deformation, accurate foreground histogram estimation becomes difficult, and the challenge is to track these objects reliably and robustly while maintaining similar performance (frame rate). We started with the natural idea of approximating an irregular shape with rectangular blocks, and adaptively adjusted the block structure to approximate the foreground shape. The main point was not to estimate the shape exactly as in many contour trackers (which invariably require shape priors and dynamics) but to approximate it with a union of rectangular blocks, with which we rapidly evaluated the foreground histogram. The BHT algorithm put this idea to work, and experimental results showed great improvements in robustness and stability over a typical histogram-based tracker, without compromising performance.

We also presented a modified version of the BHT algorithm and developed a GPU-accelerated, real-time and robust tracking system. The system was implemented on the GPU using the CUDA programming model proposed by NVIDIA. The modified p-BHT algorithm combined the detection steps of several frames and processed them in parallel. In addition, the distances of the candidate tracking windows in the detection step were computed in parallel, which further improved the computational performance. To deal with the target's shape changes, the block structure of the appearance model was updated regularly every Nth frame. Our experimental results showed that the algorithm runs 7 to 8 times faster than the BHT algorithm.

We then discussed a novel algorithm to find consistent superpixels from image pairs of the same scene. In particular, we introduced novel superpixel shape and appearance measures that were used to guide the joint superpixelization process. Our approach showed good consistency results when compared with traditional superpixel techniques.

Finally, we described a novel structured superpixel-based appearance model for non-articulated object tracking. Our appearance model was simple, reliable and easy to update. The SSP tracker successfully tracked targets undergoing severe shape, scale and pose variations. The SSP tracker also robustly handled occluded targets. Qualitative and quantitative evaluations of the tracker demonstrated that our algorithm outperforms many state-of-the-art tracking algorithms on challenging sequences.


REFERENCES

[1] D. Comaniciu, V. Ramesh, P. Meer, Real-time tracking of non-rigid objects using mean shift, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2000, pp. 142.
[2] F. M. Porikli, Integral histogram: A fast way to extract histograms in cartesian spaces, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2005, pp. 829.
[3] J. Kwon, K. M. Lee, Tracking of a non-rigid object via patch-based dynamic appearance modeling and adaptive basin hopping monte carlo sampling, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 1208.
[4] J. Santner, C. Leistner, A. Saffari, T. Pock, H. Bischof, Prost: Parallel robust online simple tracking, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2010, pp. 723.
[5] B. Babenko, M.-H. Yang, S. Belongie, Robust object tracking with online multiple instance learning, IEEE Transactions on Pattern Analysis and Machine Intelligence 33 (8) (2011) 1619.
[6] C. Stauffer, W. E. L. Grimson, Learning patterns of activity using real-time tracking, IEEE Transactions on Pattern Analysis and Machine Intelligence 22 (2000) 747.
[7] B. Leibe, N. Cornelis, K. Cornelis, L. J. V. Gool, Dynamic 3d scene analysis from a moving vehicle, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2007, pp. 1.
[8] T. McInerney, D. Terzopoulos, A dynamic finite element surface model for segmentation and tracking in multidimensional medical images with application to cardiac 4d image analysis, in: Computerized Medical Imaging and Graphics, Vol. 19, 1995, pp. 69.
[9] Z. Zhu, Q. Ji, Eye gaze tracking under natural head movements, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Vol. 1, 2005, pp. 918.
[10] V. B. Zordan, J. K. Hodgins, Motion capture-driven simulations that hit and react, in: Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer animation, 2002, pp. 89.
[11] Y. Wu, J. Lin, T. S. Huang, Analyzing and capturing articulated hand motion in image sequences, IEEE Transactions on Pattern Analysis and Machine Intelligence 27 (2005) 1910.


[12] D. N. Zotkin, V. C. Raykar, R. Duraiswami, L. S. Davis, Multimodal tracking for smart videoconferencing and video surveillance, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2007, pp. 1.
[13] S. Birchfield, Elliptical head tracking using intensity gradients and color histograms, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 1998, pp. 232.
[14] P. A. Viola, M. J. Jones, Rapid object detection using a boosted cascade of simple features, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2001, pp. 511.
[15] Z. Fan, M. Yang, Y. Wu, G. Hua, T. Yu, Efficient optimal kernel placement for reliable visual tracking, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2006, pp. 658.
[16] V. Parameswaran, V. Ramesh, I. Zoghlami, Tunable kernels for tracking, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2006, pp. 2179.
[17] S. Birchfield, S. Rangarajan, Spatiograms versus histograms for region-based tracking, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2005, pp. 1158.
[18] J. K. Aggarwal, Q. Cai, Human motion analysis: A review, Computer Vision and Image Understanding 73 (1999) 428.
[19] D. Gavrila, The visual analysis of human movement: A survey, Computer Vision and Image Understanding 73 (1999) 82.
[20] T. B. Moeslund, E. Granum, A survey of computer vision-based human motion capture, Computer Vision and Image Understanding 81 (2001) 231.
[21] T. Moeslund, A. Hilton, V. Kruger, A survey of advances in vision-based human motion capture and analysis, Computer Vision and Image Understanding 104 (2) (2006) 90.
[22] J. M. Rehg, T. Kanade, Model-based tracking of self-occluding articulated objects, in: Proceedings of the International Conference on Computer Vision, 1995, pp. 612.
[23] M. J. Black, A. D. Jepson, Eigentracking: Robust matching and tracking of articulated objects using a view-based representation, International Journal of Computer Vision 26 (1998) 63.
[24] A. Blake, M. Isard, Active Contours, Springer, 1998.


[25] M. La Cascia, S. Sclaroff, V. Athitsos, Fast, reliable head tracking under varying illumination: An approach based on registration of texture-mapped 3D models, IEEE Transactions on Pattern Analysis and Machine Intelligence 22 (2000) 322.
[26] S. Sclaroff, J. Isidoro, Active blobs, in: Proceedings of the International Conference on Computer Vision, 1998, pp. 1146.
[27] M. Kass, A. P. Witkin, D. Terzopoulos, Snakes: Active contour models, International Journal of Computer Vision 1 (1988) 321.
[28] K. Toyama, A. Blake, Probabilistic tracking in a metric space, in: Proceedings of the International Conference on Computer Vision, 2001, pp. 50.
[29] T. F. Cootes, G. J. Edwards, C. J. Taylor, Active appearance models, in: Proceedings of the European Conference on Computer Vision, 1998, pp. 484.
[30] V. Caselles, R. Kimmel, G. Sapiro, Geodesic active contours, International Journal of Computer Vision 22 (1997) 61.
[31] D. Cremers, Nonlinear dynamical shape priors for level set segmentation, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2007, pp. 1.
[32] N. Paragios, R. Deriche, Geodesic active contours and level sets for the detection and tracking of moving objects, IEEE Transactions on Pattern Analysis and Machine Intelligence 22 (2000) 266.
[33] T. Zhang, D. Freedman, Tracking objects using density matching and shape priors, in: Proceedings of the International Conference on Computer Vision, 2003, pp. 1056.
[34] R. T. Collins, Y. Liu, On-line selection of discriminative tracking features, in: Proceedings of the International Conference on Computer Vision, 2003, pp. 346.
[35] G. D. Hager, M. Dewan, C. V. Stewart, Multiple kernel tracking with ssd, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2004, pp. 790.
[36] R. T. Collins, Mean-shift blob tracking through scale space, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2003, pp. 234.
[37] H. Grabner, H. Bischof, On-line boosting and vision, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2006, pp. 260.
[38] P. F. Felzenszwalb, D. P. Huttenlocher, Pictorial structures for object recognition, International Journal of Computer Vision 61 (2005) 55.

PAGE 104

[39] D.Freedman,T.Zhang,Interactivegraphcutbasedsegmentationwithshapepriors,in:ProceedingsofIEEEConferenceonComputerVisionandPatternRecognition,2005,pp.755. [40] Y.Boykov,M.-P.Jolly,Interactivegraphcutsforoptimalboundaryandregionsegmentationofobjectsinn-dimages,in:ProceedingsoftheInternationalConferenceonComputerVision,2001,pp.105. [41] J.Ho,K.-C.Lee,M.-H.Yang,D.J.Kriegman,Visualtrackingusinglearnedsubspaces,in:ProceedingsofIEEEConferenceonComputerVisionandPatternRecognition,2004,pp.782. [42] J.Lim,D.A.Ross,R.-S.Lin,M.-H.Yang,Incrementallearningforvisualtracking,in:AdvancesinNeuralInformationProcessingSystems,2004,pp.793. [43] A.O.Balan,M.J.Black,Anadaptiveappearancemodelapproachformodel-basedarticulatedobjecttracking,in:ProceedingsofIEEEConferenceonComputerVisionandPatternRecognition,2006,pp.758. [44] D.Ramanan,C.Sminchisescu,Trainingdeformablemodelsforlocalization,in:ProceedingsofIEEEConferenceonComputerVisionandPatternRecognition,2006,pp.206. [45] O.M.Lozano,K.Otsuka,Simultaneousandfast3dtrackingofmultiplefacesinvideobygpu-basedstreamprocessing,in:ProceedingsoftheIEEEInternationalConferenceonAcoustics,Speech,andSignalProcessing,2008,pp.713. [46] S.N.Sinha,J.-M.Frahm,M.Pollefeys,Y.Genc,Featuretrackingandmatchinginvideousingprogrammablegraphicshardware,MachineVisionandApplications22(2011)207. [47] M.Lalonde,D.Byrns,L.Gagnon,N.Teasdale,D.Laurendeau,Real-timeeyeblinkdetectionwithGPU-basedSIFTtracking,in:CanadianConferenceonComputerandRobotVision,2007,pp.481. [48] J.Fung,S.Mann,Openvidia:Parallelgpucomputervision,in:ProceedingsofACMInternationalConferenceonMultimedia,2005,pp.849. [49] A.Montemayor,B.Payne,J.Pantrigo,R.Cabido,A.Sanchez,F.Fernandez,ImprovingGPUparticlelterwithshadermodel3.0forvisualtracking,in:ACMSIGGRAPHResearchposters,2006,pp.1. [50] A.Montemayor,J.Pantrigo,A.Sanchez,F.Fernandez,ParticlelteronGPUsforreal-timetracking,in:ACMSIGGRAPHposters,2004,pp.1. 
[51] P.Lanvin,J.-C.Noyer,M.Benjelloun,Anhardwarearchitecturefor3Dobjecttrackingandmotionestimation,in:ProceedingsofIEEEInternationalConferenceonMultimediaandExpo,2005,pp.326. 104

PAGE 105

[52] R.Cabido,A.S.Montemayor,J.J.Pantrigo,B.R.Payne,Multi-resolutionandlocalsearchmethodsforoptimizingvisualtrackingprocessesonGPU,in:ACMSIGGRAPHposters,2007,pp.1. [53] S.Baker,I.Matthews,Lucas-Kanade20YearsOn:AUnifyingFramework,InternationalJournalofComputerVision56(3)(2004)221. [54] L.A.Leiva,A.Sanz,J.M.Buenaposada,PlanartrackingusingtheGPUforaugmentedrealityandgames,in:ACMSIGGRAPHposters,2007,pp.1. [55] Nvidiacudaprogrammingguide1.0(2007). [56] A.Toshev,B.Taskar,K.Daniilidis,Objectdetectionviaboundarystructuresegmentation,in:ProceedingsofIEEEConferenceonComputerVisionandPatternRecognition,2010,pp.950. [57] C.Pantofaru,C.Schmid,M.Hebert,Objectrecognitionbyintegratingmultipleimagesegmentations,in:ProceedingsoftheEuropeanConferenceonComputerVision,Vol.3,2008,pp.481. [58] G.Mori,X.Ren,A.A.Efros,J.Malik,Recoveringhumanbodycongurations:Combiningsegmentationandrecognition,in:ProceedingsofIEEEConferenceonComputerVisionandPatternRecognition,Vol.2,2004,pp.326. [59] H.Zhang,J.Malik,Learningadiscriminativeclassierusingshapecontextdistances,in:ProceedingsofIEEEConferenceonComputerVisionandPatternRecognition,2003,pp.242. [60] A.P.Moore,S.J.D.Prince,J.Warrell,U.Mohammed,G.Jones,Superpixellattices,in:ProceedingsofIEEEConferenceonComputerVisionandPatternRecognition,2008,pp.1. [61] J.Shi,J.Malik,Normalizedcutsandimagesegmentation,IEEETransactionsonPatternAnalysisandMachineIntelligence22(8)(2000)888. [62] X.Ren,J.Malik,Learningaclassicationmodelforsegmentation,in:ProceedingsoftheInternationalConferenceonComputerVision,2003,pp.10. [63] X.He,R.S.Zemel,D.Ray,Learningandincorporatingtop-downcuesinimagesegmentation,in:ProceedingsoftheEuropeanConferenceonComputerVision,2006,pp.338. [64] A.Levinshtein,A.Stere,K.N.Kutulakos,D.J.Fleet,S.J.Dickinson,K.Siddiqi,Turbopixels:Fastsuperpixelsusinggeometricows,IEEETransactionsonPatternAnalysisandMachineIntelligence31(12)(2009)2290. 
[65] O.Veksler,Y.Boykov,P.Mehrani,Superpixelsandsupervoxelsinanenergyoptimizationframework,in:ProceedingsoftheEuropeanConferenceonComputerVision,2010,pp.211. 105

PAGE 106

[66] Y.Boykov,G.Funka-Lea,Graphcutsandefcientn-dimagesegmentation,InternationalJournalofComputerVision70(2006)109. [67] Y.Boykov,O.Veksler,R.Zabih,Fastapproximateenergyminimizationviagraphcuts,IEEETransactionsonPatternAnalysisandMachineIntelligence23(11)(2001)1222. [68] M.J.Swain,D.H.Ballard,Colorindexing,InternationalJournalofComputerVision7(1991)11. [69] H.Kuhn,TheHungarianmethodfortheassignmentproblem,Vol.2,WileyOnlineLibrary,1955. [70] S.Baker,S.Roth,D.Scharstein,M.J.Black,J.Lewis,R.Szeliski,Adatabaseandevaluationmethodologyforopticalow,in:ProceedingsoftheInternationalConferenceonComputerVision,2007,pp.1. [71] I.Haritaoglu,D.Harwood,L.S.Davis,W4:Arealtimesystemfordetectingandtrackingpeople,in:ProceedingsofIEEEConferenceonComputerVisionandPatternRecognition,Vol.0,1998,pp.962. [72] M.deLaGorce,N.Paragios,D.J.Fleet,Model-basedhandtrackingwithtexture,shadingandself-occlusions,in:ProceedingsofIEEEConferenceonComputerVisionandPatternRecognition,2008,pp.1. [73] X.S.Zhou,D.Comaniciu,A.Gupta,Aninformationfusionframeworkforrobustshapetracking,IEEETransactionsonPatternAnalysisandMachineIntelligence27(1)(2005)115. [74] R.Collins,Y.Liu,M.Leordeanu,On-lineselectionofdiscriminativetrackingfeatures,IEEETransactionsonPatternAnalysisandMachineIntelligence27(10)(2005)1631. [75] A.Adam,E.Rivlin,I.Shimshoni,Robustfragments-basedtrackingusingtheintegralhistogram,in:ProceedingsofIEEEConferenceonComputerVisionandPatternRecognition,2006,pp.798. [76] D.A.Ross,J.Lim,R.-S.Lin,M.-H.Yang,Incrementallearningforrobustvisualtracking,InternationalJournalofComputerVision77(1-3)(2008)125. [77] A.D.Jepson,D.J.Fleet,T.F.El-Maraghi,Robustonlineappearancemodelsforvisualtracking,in:ProceedingsofIEEEConferenceonComputerVisionandPatternRecognition,2001,pp.415. [78] J.Kwon,K.M.Lee,Visualtrackingdecomposition,in:ProceedingsofIEEEConferenceonComputerVisionandPatternRecognition,2010,pp.1269. 106

PAGE 107

[79] H.Grabner,C.Leistner,H.Bischof,Semi-supervisedon-lineboostingforrobusttracking,in:ProceedingsoftheEuropeanConferenceonComputerVision,2008,pp.234. [80] X.Mei,H.Ling,Robustvisualtrackingusingl1minimization,in:ProceedingsoftheInternationalConferenceonComputerVision,2009,pp.1436. [81] D.Comaniciu,V.Ramesh,P.Meer,Kernel-basedobjecttracking,IEEETransactionsonPatternAnalysisandMachineIntelligence25(5)(2003)564. [82] S.M.S.Nejhum,J.Ho,M.-H.Yang,Visualtrackingwithhistogramsandarticulatingblocks,in:ProceedingsofIEEEConferenceonComputerVisionandPatternRecognition,2008,pp.1. [83] D.G.Lowe,Objectrecognitionfromlocalscale-invariantfeatures,in:ICCV,1999,pp.1150. [84] H.Bay,A.Ess,T.Tuytelaars,L.J.V.Gool,Speeded-uprobustfeatures(surf),ComputerVisionandImageUnderstanding110(3)(2008)346. [85] T.Ojala,M.Pietikainen,D.Harwood,Acomparativestudyoftexturemeasureswithclassicationbasedonfeatureddistributions,PatternRecognition29(1)(1996)51. [86] M.Yang,J.Yuan,Y.Wu,Spatialselectionforattentionalvisualtracking,in:ProceedingsofIEEEConferenceonComputerVisionandPatternRecognition,2007,pp.1. [87] B.Liu,J.Huang,L.Yang,C.A.Kulikowski,Robusttrackingusinglocalsparseappearancemodelandk-selection,in:ProceedingsofIEEEConferenceonComputerVisionandPatternRecognition,2011,pp.1313. [88] X.Mei,H.Ling,Y.Wu,E.Blasch,L.Bai,Minimumerrorboundedefcientl1trackerwithocclusiondetection,in:ProceedingsofIEEEConferenceonComputerVisionandPatternRecognition,2011,pp.1257. [89] X.Ren,J.Malik,Trackingasrepeatedgure/groundsegmentation,in:ProceedingsofIEEEConferenceonComputerVisionandPatternRecognition,2007,pp.1. [90] S.Wang,H.Lu,F.Yang,M.-H.Yang,Superpixeltracking,in:ProceedingsoftheInternationalConferenceonComputerVision,2011,pp.1. [91] T.G.Dietterich,R.H.Lathrop,T.Lozano-Perez,Solvingthemultipleinstanceproblemwithaxis-parallelrectangles,ArticialIntelligence89(1-2)(1997)3171. [92] H.Grabner,M.Grabner,H.Bischof,Real-timetrackingviaon-lineboosting,in:ProceedingsoftheBritishMachineVisionConference,2006,pp.47. 107

PAGE 108

[93] A.Saffari,C.Leistner,J.Santner,M.Godec,H.Bischof,On-linerandomforests,in:IEEEInternationalConferenceonComputerVisionWorkshops(ICCVWorkshops),2009,pp.1393. 108


BIOGRAPHICAL SKETCH

S. M. Shahed Nejhum received the BS degree from the Bangladesh University of Engineering and Technology. He received his PhD degree in Computer Engineering from the Department of Computer and Information Science and Engineering at the University of Florida. His research interests include computer vision and machine learning.