Citation
Online Adaptive Appearance Models for Robust Visual Tracking

Material Information

Title:
Online Adaptive Appearance Models for Robust Visual Tracking
Creator:
Nejhum, S M Shahed
Place of Publication:
[Gainesville, Fla.]
Publisher:
University of Florida
Publication Date:
Language:
English
Physical Description:
1 online resource (109 p.)

Thesis/Dissertation Information

Degree:
Doctorate ( Ph.D.)
Degree Grantor:
University of Florida
Degree Disciplines:
Computer Engineering
Computer and Information Science and Engineering
Committee Chair:
Ho, Jeffrey
Committee Members:
Peters, Jorg
Vemuri, Baba C
Rangarajan, Anand
Gu, Qun Jane
Graduation Date:
12/17/2011

Subjects

Subjects / Keywords:
Algorithms ( jstor )
Computer conferencing ( jstor )
Computer memory ( jstor )
Computer pattern recognition ( jstor )
Computer vision ( jstor )
Conference proceedings ( jstor )
Histograms ( jstor )
Optical tracking ( jstor )
Pixels ( jstor )
Rectangles ( jstor )
Computer and Information Science and Engineering -- Dissertations, Academic -- UF
appearance -- gpu -- models -- tracking -- visual
Genre:
Electronic Thesis or Dissertation
born-digital ( sobekcm )
Computer Engineering thesis, Ph.D.

Notes

Abstract:
Robust tracking of visual targets is a very challenging task in the field of computer vision. The target has to be reliably modeled and the model needs to be updated according to the target's appearance and shape variations over time. Visual tracking algorithms available in the literature do not fully explore mid-level image cues. This dissertation presents visual tracking algorithms where mid-level image cues are used efficiently and effectively to model the target. The first algorithm tracks articulated objects by constantly modeling the changing target shape by a small number of rectangular blocks whose positions are updated accordingly. To improve the tracking speed a modified algorithm processes the computationally extensive steps in parallel using a GPU. Both algorithms are evaluated on several videos of articulated targets undergoing significant shape variations. We compare the results with the mean shift tracker and the histogram-based tracker. Our algorithms consistently outperform these algorithms and produce robust tracking results. We present a novel technique to generate coherent superpixels from a pair of successive video frames. We show that the similarity of corresponding superpixels can be increased by generating superpixels jointly from the images. We present a visual tracking algorithm that uses a novel superpixel-based appearance model. The model is continuously updated to handle variations of the target. To evaluate the performance of the tracker, we report experimental results on several publicly available challenging sequences. We show that our superpixel-based visual tracker produces improved performance over recently published state-of-the-art tracking algorithms. ( en )
General Note:
In the series University of Florida Digital Collections.
General Note:
Includes vita.
Bibliography:
Includes bibliographical references.
Source of Description:
Description based on online resource; title from PDF title page.
Source of Description:
This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Thesis:
Thesis (Ph.D.)--University of Florida, 2011.
Local:
Adviser: Ho, Jeffrey.
Statement of Responsibility:
by S M Shahed Nejhum.

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
Copyright Nejhum, S M Shahed. Permission granted to the University of Florida to digitize, archive and distribute this item for non-profit research and educational purposes. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder.
Resource Identifier:
818025587 ( OCLC )
Classification:
LD1780 2011 ( lcc )


Full Text

PAGE 1

ONLINE ADAPTIVE APPEARANCE MODELS FOR ROBUST VISUAL TRACKING

By S. M. SHAHED NEJHUM

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2011

PAGE 2

© 2011 S. M. Shahed Nejhum

PAGE 3

To my Parents for all their love and support

PAGE 4

ACKNOWLEDGMENTS

I am extremely grateful to Dr. Jeffrey Ho for his continuous guidance and support during my graduate studies. He has been a constant source of inspiration and encouragement for me. I am also thankful to Dr. Baba C. Vemuri, Dr. Anand Rangarajan, Dr. Jorg Peters and Dr. Qun Gu for being on my supervisory committee and providing extremely useful insights into the work presented in this dissertation. I would like to thank the Department of Computer and Information Science and Engineering (CISE) and the University of Florida (UF) for giving me the opportunity to pursue my graduate studies in a very constructive environment. During my graduate studies, I enjoyed my job as a teaching and research assistant. I want to thank Dr. Ming-Hsuan Yang for his advice and support in my research work. I am thankful to all of my current and former lab-mates Muhammad Rushdi, Mohsen Ali, Jason Yu-Tseh Chi, Venkatakrishnan Ramaswamy, Karthik S. Gurumoorthy, Manu Sethi, Subhajit Sengupta, Nathan VanderKraats, Leila Kalantari and Shaoyu Qi. I have always found them whenever I needed them. Lastly and most importantly, I am thankful to my family, for their unconditional love and support.

PAGE 5

TABLE OF CONTENTS

(entry ... page)

ACKNOWLEDGMENTS ... 4
LIST OF TABLES ... 7
LIST OF FIGURES ... 8
ABSTRACT ... 10

CHAPTER

1 VISUAL TRACKING ... 12
  1.1 Introduction ... 12
  1.2 Motivation and Objectives ... 12
  1.3 Proposed Algorithms ... 13
  1.4 Organization ... 15
2 VISUAL TRACKING USING ARTICULATING BLOCKS ... 16
  2.1 Introduction ... 16
  2.2 Related Work ... 18
  2.3 Tracking Algorithm ... 20
    2.3.1 Detection ... 22
    2.3.2 Scaling ... 23
    2.3.3 Refinement ... 24
    2.3.4 Update ... 25
    2.3.5 Discussion ... 26
  2.4 Experiments and Results ... 27
    2.4.1 Qualitative Results ... 28
    2.4.2 Quantitative Performance ... 29
  2.5 Discussion ... 30
3 SIMULTANEOUS MULTI-FRAME TRACKING FOR ARTICULATED TARGETS ... 45
  3.1 Introduction ... 45
  3.2 Graphics Processing Units ... 46
    3.2.1 Overview of GPU ... 46
    3.2.2 Application of GPU in Computer Vision ... 46
  3.3 CUDA Architecture and Programming Framework ... 47
    3.3.1 CUDA Hardware Architecture ... 47
    3.3.2 CUDA Programming Framework ... 48
  3.4 Algorithm Description ... 50
  3.5 Implementation Details ... 51
  3.6 Experimental Results ... 53

PAGE 6

    3.6.1 Qualitative Results ... 54
    3.6.2 Quantitative Performance ... 54
    3.6.3 Computational Performance ... 54
  3.7 Discussion ... 56
4 CO-SUPERPIXELS: GENERATING CONSISTENT SUPERPIXELS ACROSS IMAGE FRAMES ... 66
  4.1 Introduction ... 66
  4.2 Superpixel Generation from a Single Image ... 67
  4.3 Consistent Superpixel Generation ... 68
  4.4 Co-superpixel Generation Framework ... 70
    4.4.1 Superpixel Similarity Metrics ... 72
    4.4.2 Iterative Energy Minimization ... 72
  4.5 Experimental Results ... 74
5 TRACKING USING SUPERPIXEL-BASED APPEARANCE MODEL ... 82
  5.1 Introduction ... 82
  5.2 Related Work ... 83
  5.3 The Appearance Model ... 85
  5.4 Tracking Algorithm ... 87
    5.4.1 Target Localization ... 87
    5.4.2 Occlusion Detection ... 89
    5.4.3 Model Update ... 89
    5.4.4 Scaling ... 90
  5.5 Experimental Results ... 91
    5.5.1 Experimental Setup ... 91
    5.5.2 Results and Discussion ... 92
6 CONCLUSION ... 99

REFERENCES ... 101
BIOGRAPHICAL SKETCH ... 109

PAGE 7

LIST OF TABLES

(table ... page)

2-1 Center location errors of the integral histogram-based tracker, the mean-shift tracker and the proposed BHT ... 32
2-2 Coverage errors of the integral histogram-based tracker, the mean-shift tracker and the proposed BHT ... 32
2-3 Center location errors of our tracker with the fixed and the adaptive scaling window ... 32
2-4 Coverage errors of our tracker with the fixed and the adaptive scaling window ... 32
2-5 RMS errors in tracking window size ... 32
3-1 Center location errors of the integral histogram-based tracker, the mean-shift tracker, the BHT and the p-BHT ... 57
3-2 Coverage errors of the integral histogram-based tracker, the mean-shift tracker, the BHT and the p-BHT ... 57
3-3 Performance analysis of different sub-steps in the multi-frame detection step ... 57
3-4 Computational performance of the p-BHT algorithm ... 58
4-1 Superpixel similarity at the initial and final iterations of our iterative joint superpixel generation scheme ... 75
5-1 Quantitative evaluation of the SSP tracking algorithm ... 93

PAGE 8

LIST OF FIGURES

(figure ... page)

2-1 Motivation of our block-based tracking algorithm ... 33
2-2 High-level outline of the BHT algorithm ... 33
2-3 Initialization of the BHT algorithm ... 34
2-4 Examples of segmentation within the tracking window ... 34
2-5 Tracking results using fixed and adaptive scaling window ... 35
2-6 Overview of the block configuration update process ... 35
2-7 Visual comparison of tracking results of the Female Skater sequence ... 36
2-8 Visual comparison of tracking results of the Dancer sequence ... 36
2-9 Visual comparison of tracking results of the Male Skater sequence ... 37
2-10 Examples of ground truth windows for articulated targets ... 38
2-11 Visual comparison of tracking results of the Indian Dancer sequence ... 38
2-12 Tracking results of the cartoon sequence using the BHT algorithm ... 38
2-13 Tracking occluded target ... 39
2-14 Tracking with a cluttered background ... 39
2-15 Tracking of a wildlife target using the BHT algorithm ... 39
2-16 Tracking results of the BHT algorithm with fixed and adaptive scale ... 40
2-17 Per-frame center location error plots using different tracking algorithms ... 41
2-18 Per-frame coverage error plots using different tracking algorithms ... 42
2-19 Comparison between our fixed and adaptive scaling window based tracker ... 43
2-20 Results using the BHT algorithm with fixed and adaptive scaling ... 44
3-1 Comparison of computational speed between the p-BHT and the BHT algorithm ... 59
3-2 Block diagram of the CUDA Hardware Architecture ... 59
3-3 Block diagram of the CUDA Programming Model and Memory Model ... 60
3-4 Block diagram of the p-BHT algorithm ... 60
3-5 Block diagram of the CPU-GPU implementation of the p-BHT algorithm ... 61

PAGE 9

3-6 Visual comparison of tracking results of the Female Skater Sequence ... 61
3-7 Visual comparison of tracking results of the Male Skater Sequence ... 62
3-8 Visual comparison of tracking results of the Indian Dancer Sequence ... 62
3-9 Visual comparison of tracking results of the Dancer Sequence ... 63
3-10 Per-frame center location error plots using different tracking algorithms ... 64
3-11 Per-frame coverage error plots using different tracking algorithms ... 65
4-1 Consistent superpixel generation for a pair of images ... 76
4-2 Steps for computing the shape similarity between a pair of superpixels ... 77
4-3 Improvement in the average superpixel similarity score ... 78
4-4 Superpixel generation results for the Sponza sequence ... 79
4-5 Superpixel generation results for the Wooden sequence ... 79
4-6 Superpixel generation results for the Urban3 sequence ... 80
4-7 Superpixel generation results for the Meqoun sequence ... 80
4-8 Superpixel generation results for the Urban2 sequence ... 81
4-9 Superpixel generation results for the Army sequence ... 81
5-1 Construction of the superpixel-based appearance model ... 94
5-2 High-level outline of the SSP tracking algorithm ... 94
5-3 Tracking results using the SSP tracking algorithm ... 95
5-4 Tracking results using the SSP tracking algorithm on PROST sequences ... 96
5-5 Visual comparison of tracking results obtained using different algorithms ... 96
5-6 Visual comparison of tracking results obtained using different algorithms on PROST sequences ... 97
5-7 Per-frame tracking error plots using different tracking algorithms ... 98

PAGE 10

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

ONLINE ADAPTIVE APPEARANCE MODELS FOR ROBUST VISUAL TRACKING

By S. M. Shahed Nejhum

December 2011

Chair: Jeffrey Ho
Major: Computer Engineering

Robust tracking of visual targets is a very challenging task in the field of computer vision. The target has to be reliably modeled and the model needs to be updated according to the target's appearance and shape variations over time. Visual tracking algorithms available in the literature do not fully explore mid-level image cues. This dissertation presents visual tracking algorithms where mid-level image cues are used efficiently and effectively to model the target.

The first algorithm tracks articulated objects by constantly modeling the changing target shape by a small number of rectangular blocks whose positions are updated accordingly. To improve the tracking speed a modified algorithm processes the computationally extensive steps in parallel using a GPU. Both algorithms are evaluated on several videos of articulated targets undergoing significant shape variations. We compare the results with the mean shift [1] tracker and the histogram-based tracker [2]. Our algorithms consistently outperform these algorithms [1, 2] and produce robust tracking results.

We present a novel technique to generate coherent superpixels from a pair of successive video frames. We show that the similarity of corresponding superpixels can be increased by generating superpixels jointly from the images. We present a visual tracking algorithm that uses a novel superpixel-based appearance model. The model is continuously updated to handle variations of the target. To evaluate the performance

PAGE 11

of the tracker, we report experimental results on several publicly available challenging sequences. We show that our superpixel-based visual tracker produces improved performance over recently published state-of-the-art tracking algorithms [3, 5].

PAGE 12

CHAPTER 1
VISUAL TRACKING

1.1 Introduction

While human vision can easily locate a target undergoing unknown motion in video, developing an automatic vision system for accurate tracking is still very difficult even after decades of research. The success of making a robust tracker depends on reliable modeling of the target and the flexibility to update the model according to its variations over time. The focus of this dissertation is to develop visual tracking algorithms using efficient modeling of the target's shape and appearance. We propose simple, flexible and generic models for robust tracking of visual targets.

The need for robust tracking appears in a myriad of applications, across multiple disciplines. Tracking is the main component in visual surveillance systems [6, 7]. In autonomous traffic surveillance systems, tracking the vehicles in real time and predicting their motions is very important. For medical data analysis, non-rigid structures have been tracked using deformable templates [8]. Autonomous robots track objects around them in order to navigate in 3D environments. Visual tracking algorithms are also used in human-computer interactions [9], computer games and animation [10], activity and event analysis (e.g., gesture tracking [11]), multimedia applications (e.g., face and people tracking for video conferencing [12]) and in many other applications.

1.2 Motivation and Objectives

Developing an accurate, efficient and robust visual tracker has always been challenging, and the task becomes even more difficult when the target is expected to undergo significant and rapid variations in shape as well as appearance. In fact, the goals of achieving both efficiency and robustness are somewhat conflicting. A simple representation of the target makes the tracker efficient, but its capability of handling significant shape, appearance and scale variations and tracking the target under severe occlusion becomes limited. These difficulties motivate us to design efficient

PAGE 13

representations of visual targets and track them robustly. In particular, we use mid-level image cues to design the appearance of the target.

In this dissertation, we discuss algorithms for articulated object tracking where the target continuously changes its shape. We present a block-based appearance model in which the target is represented using a set of rectangular blocks. As well, we introduce an algorithm for tracking non-articulated targets. These targets often change their appearances and sizes, and undergo partial occlusions. In the algorithm, the target is represented using a novel superpixel-based appearance model. The appearance models are updated adaptively in our algorithms to handle the variations of the target. Brief descriptions of these algorithms are outlined in the next section.

1.3 Proposed Algorithms

We present two novel algorithms for tracking articulated objects. In the first algorithm, the target is efficiently modeled using a small number of rectangular blocks. These blocks are represented by intensity histograms, and we refer to this algorithm as the Block Histogram Tracking (BHT) algorithm. The sizes and locations of these blocks efficiently encode object shape and appearance. Our algorithm adaptively updates the block locations according to the target articulation over time. The algorithm makes efficient use of integral image histograms to rapidly search the target over the entire frame. This gives the algorithm the ability to track targets undergoing rapid motion. The BHT algorithm consists of detection, refinement and update steps. The tracker iteratively loops through these steps for visual tracking. Under the general assumption of stationary foreground appearance, we show that robust object tracking is possible by adaptively adjusting the locations of these blocks. We have implemented the BHT algorithm in MATLAB and the tracker runs at 8-10 fps on a 2.66 GHz machine.

The second algorithm is a modified version of the BHT algorithm. Instead of updating the block configuration at every frame, the modified algorithm computes the block configuration regularly after several frames, i.e., multiple detection steps are

PAGE 14

followed by a refinement and an update step. These detection steps can be processed in parallel, and this makes the algorithm suitable for a GPU-based implementation. We refer to this version as parallel BHT or p-BHT. Moreover, the detection step of the BHT algorithm involves evaluating a large number of candidate windows over the image frame, and the evaluations of these windows are independent of each other. We also exploit this in the GPU-based implementation. The p-BHT algorithm has been implemented on a GPU using NVIDIA's CUDA programming model and the implementation runs at 55-80 fps.

This dissertation also discusses a novel algorithm for non-articulated object tracking. Unlike previous algorithms in which the articulated target is modeled using a set of blocks, we use a set of small and homogeneously textured regions. These small and homogeneously textured regions are known as superpixels. Most of the existing superpixel-based tracking algorithms generate superpixels independently without enforcing any consistency measurements across the images. To address this issue, we have developed a novel approach for consistent superpixel generation from consecutive video frames. A novel energy function is formulated that incorporates superpixel shape and appearance similarity measures. We develop an iterative algorithm to minimize our proposed energy function and generate consistent superpixels.

Finally, we present the details of the tracking algorithm. The algorithm robustly handles occlusions, complex background, scale and appearance variations of non-articulated targets. We introduce a superpixel-based appearance model for visual tracking. From the initial tracking window, we extract superpixels and compute their histogram features. These superpixel features are organized using a grid; hence we refer to this algorithm as the structured superpixel (SSP) tracking algorithm. In subsequent frames, we search for the region that maximizes the similarity of these organized superpixel features. Our algorithm detects target occlusion, and updates the appearance model accordingly. As well, the appearance model is updated to handle large-scale variations.

PAGE 15

1.4 Organization

The remainder of the dissertation is organized as follows. The BHT algorithm is presented in Chapter 2. In Chapter 3, we discuss the GPU-based implementation of the p-BHT algorithm. The consistent superpixel generation framework from similar images is described in Chapter 4. The SSP tracking algorithm is described in Chapter 5. Finally, we summarize our contributions and conclude this dissertation in Chapter 6.

PAGE 16

CHAPTER 2
VISUAL TRACKING USING ARTICULATING BLOCKS

2.1 Introduction

One of the simplest ways to model the target appearance is to compute the intensity histogram of the target region, and many tracking algorithms based on this idea are available in the literature (e.g., [1, 13]). Using a rectangular bounding box for the target, the intensity histogram can be computed efficiently from the integral image [14], and the entire image can be scanned rapidly to locate the target for visual tracking. However, computing the intensity histogram from a region bounded by some irregular shape cannot be done efficiently using the integral image. One of the challenges in visual tracking is to robustly handle shape variations of the target. In the context of histogram-based tracking, one general idea is to use a (circular or elliptical) kernel [15, 16] to define a region around the target from which a weighted histogram can be computed. The kernel imposes a regularity constraint on the irregular shape, thereby relaxing the more difficult problem of efficiently computing the intensity histogram from an irregular shape to that of a simpler one of estimating histogram from a regular shape. Rapid scanning of the image using this approach is still not possible; instead, differential algorithms can be designed to iteratively converge to the target object [1]. However, these differential approaches often fail to track the targets with rapid and large motions.

Another way to deal with irregular shapes is to enclose the target with a regular shape (e.g., a rectangular window) and compute histogram from the enclosed region. However, this inevitably includes background pixels when the foreground shape cannot be closely approximated. Consequently, the resulting histogram can be corrupted by background pixels, and the tracking result degrades accordingly (e.g., unstable or jittered results as shown in Figure 2-1). Furthermore, complete lack of spatial information in histograms is also undesirable. For problems such as face tracking that do not have significant shape variation, it is adequate to use intensity histogram as the main tracking

PAGE 17

feature [13]. However, for a target undergoing significant shape variation, the spatial component of the appearance is very prominent, and the plain intensity histogram becomes inadequate as it alone often yields unstable tracking results.

Each of the aforementioned problems has been addressed to some extent (e.g., spatiogram [17] for encoding spatial information in histogram). However, most of them require substantial increase of computation time, thereby making these algorithms applicable only to local search and infeasible for global scans of images. Consequently, such algorithms are not able to track objects undergoing rapid motions. In this chapter, we present a tracking algorithm that solves the above problems, and at the same time, it still has comparable running time as the tracking algorithm using (plain) integral histogram [2]. The algorithm consists of global scanning, local refinement and update steps. The main idea is to exploit an efficient appearance representation using histograms that can be easily evaluated and compared so that the target object can be located by scanning the entire image. Shape update, which typically requires more elaborated algorithms, is carried out by adjusting a few small blocks within tracked window. Specifically, we approximate the irregular shape with a small number of blocks that cover the foreground object with minimal overlaps. As the tracking window is typically small, we can extract the target contour using a fast segmentation algorithm without increasing the run-time complexity significantly. We then update the target shape by adjusting these blocks locally so that they provide a maximal coverage of the foreground target.

The adaptive structure in our algorithm contains the block configuration and their associated weights. Shape of the target object is loosely represented by block configuration, while its appearance is represented by intensity distributions and weights of these blocks. In doing so, spatial component of the object's appearance is also loosely encoded in block structure. Furthermore, these rectangular blocks allow rapid evaluations and comparisons of histograms. Note that our goal is not to represent

PAGE 18

both shape and appearance precisely since this will most likely require substantial increase in computation. Instead, we strive for a simple but adequate representation that can be efficiently computed and managed. We refer to the algorithm as the Block Histogram Tracking (BHT) algorithm. Compared with tracking methods based on integral histograms, our tracker is also able to efficiently scan the entire image to locate the target, which amounts to the bulk of the processing time for these algorithms. The extra increase in running time of the BHT algorithm results from the refinement and update steps. Since segmentation is carried out only locally in a (relatively) small window and the weights can be computed very efficiently, such computation overhead is generally small. Experimental results demonstrate that the algorithm renders much more accurate and stable tracking results compared to the integral histogram based tracker, with a negligible increase in running time.

2.2 Related Work

There is a rich literature on shape and appearance modeling for visual tracking. In this section, we discuss the most relevant works within the context of single articulated object tracking. Specifically, we aim to track generic articulated objects from images acquired with one camera at a distance while undergoing large and rapid deformation in shape as well as appearance. We note that there exist tracking algorithms for specific objects operating under different imaging conditions and constraints, e.g., human tracking [18, 21], hand tracking [15, 22, 24], model-based tracking [25, 26], to name a few.

Articulated objects can be modeled with parameterized shapes or contours. Active contours using parametric models [24, 27] typically require offline training, and expressiveness of these models (e.g., splines) is somewhat restrictive. Furthermore, with all the offline training, it is still difficult to predict the tracker's behavior when hitherto unseen target is encountered. For example, a number of exemplars have to be learned from training data prior to tracking in [28], and the tracker does not provide

PAGE 19

any mechanism to handle shapes that are drastically different from the templates. Likewise, there is also an offline learning process involved in the active shape and appearance models [29]. Level set algorithms have also been successfully applied to track articulated objects [30, 33]. However, these methods rely mainly on the information near the contours and do not exploit the rich appearance or texture information. In addition, these algorithms usually do not have mechanisms to handle drifting effects.

Instead of using contours to model shapes, kernel-based methods represent target's appearance with intensity, gradients, and color statistics [1, 13, 34]. These methods have demonstrated successes in tracking targets whose shapes can be well enclosed by ellipses. Although methods using multiple kernels [15, 35] and adaptive scaling [36] have been proposed to cope with this problem, it is not clear such methods are able to effectively track articulated objects whose shapes vary rapidly and significantly.

In a somewhat different direction, the use of Haar-like features plays an important role in the success of real-time object detection [14]. However, fast algorithms for computing Haar-like features and histograms such as integral images [14] or integral histograms [2] require rectangular windows to model the target's shape. Consequently, it is not straightforward to apply efficient methods to track and detect articulated object with varying shapes. Haar-like and related features play a significant role in several recent work on online boosting and its applications to tracking and object detection [37]. One interesting aspect of this latter work is to treat tracking as sequential detection problems, and an important component in the tracking algorithm is the online construction of an object-specific detector. However, the capability of the tracker is somewhat hampered by the Haar-like features it uses in that this invariably requires the shapes of the target to be well approximated by rectangular windows.

Finally, the BHT algorithm shares some similarity with the part-based object detection algorithm proposed in [38] as both algorithms use rectangular blocks to define

PAGE 20

the target object. However, the similarity is only superficial since, in our method, there is no specific part definition as the blocks are adjusted online to provide the coverage for the foreground target only. While decompositions of the target using rectangular blocks are employed in both methods, the decomposition in our case is geometrical with the explicit purpose of covering the foreground and accurately estimating the intensity histogram, while theirs is semantical in that each block or part has its own unique appearance and characteristic. Our goal is to have a general-purpose tracker and this necessarily requires us to avoid detecting object parts as it will involve more extensive training and require more assumptions on the target's appearance.

2.3 Tracking Algorithm

We present the details of the BHT algorithm in this section. The output of the tracker consists of a rectangular window enclosing the target in each frame. Furthermore, an approximated foreground region is also estimated. Our objective is to achieve a balance among the three somewhat conflicting goals of efficiency, accuracy and robustness. Specifically, we treat the tracking problem as a sequence of detection problems, and the main feature that we use to detect the target is the intensity histogram. The detection process is carried out by matching foreground intensity histogram and we employ integral histograms for efficient computation. In the following discussion, we will use the terms histogram and density interchangeably. The main technical problem that we solve within the context of visual tracking is how to approximate the foreground histogram under significant shape variations so that efficient and accurate articulated object tracking is possible under the general assumption (held by most tracking algorithms) that the foreground histogram stays roughly stationary.

The high-level outline of the BHT algorithm is shown in Figure 2-2. It consists of four sequential steps: detection, scaling, refinement, and update. At the outset, the tracker is initialized with the contour of the target; it then automatically determines the initial tracking window $W$ and $K$ rectangular blocks $B_i$ as well as their weights $\lambda_i$ according

PAGE 21

to the procedure described below. The foreground intensity histogram $H^{f}_{0}$ for the initial frame is kept throughout the sequence.

The shape of the foreground target is approximated by $K$ rectangular blocks, $B_i$, $1 \le i \le K$, within the main tracking window $W$ as shown in Figure 2-3. The positions of the blocks within the tracking window are adaptively adjusted throughout the tracking sequence, and they may have some overlaps to account for extreme shape variations. At each frame $t$, the tracker maintains the following: 1) a tracking window $W_t$ with a block configuration, 2) a foreground histogram $H^{f}_{t}$ represented by a collection of local foreground histograms $H^{B^{f}_{i}}_{t}$ and their associated weights $\lambda_i$, computed from the blocks, and 3) a background histogram $H^{b}_{t}$. The tracker first detects the most likely location of the target by scanning the entire image (i.e., the window with the highest similarity when compared with the tracking window $W$).

After detection, tracking window size can be adjusted to make it tightly enclose the target without unnecessary background pixels. Note that for tracking articulated objects, it is inevitable for tracking windows to enclose some background pixels as the shapes and sizes of targets vary significantly. We introduce an adaptive scaling step to update the size of the tracking window based on scale of the target object.

In the refinement step, the tracker works exclusively in the detected window and the target is segmented from the background using the current foreground density. This result is then used in the update step to adjust the block positions in the tracking window $W$, and then the weights assigned to the blocks are recomputed. In addition, the background density $H^{b}_{t}$ is also updated.

While it is expected that the union of the blocks will cover most of the target, these blocks will nevertheless contain both foreground and background pixels. This happens often when the shape of the target object is far from convex and exhibits strong concavity. In particular, blocks containing large percentages of background pixels should be down-weighted in their importance when compared with blocks that contain mostly


foreground pixels. Therefore, each block $B_i$ is assigned a weight $\lambda_i$, which will be used in all three steps. In this framework, the shape information is represented by the block configuration and the associated weights. Compared with other formulations of shape priors [31 33], it is a rather fuzzy representation of shapes. However, this is precisely what is needed here: since rapid and sometimes extreme shape variation is expected, the shape representation should not be rigid or too heavily constrained, so as to permit greater flexibility in anticipating and handling hitherto unseen shapes.

2.3.1 Detection

For each frame, the tracker first scans the entire image to locate the target object. As with many other histogram-based trackers, the target window $W^*$ selected in this step is the one that has the maximum foreground similarity measure with respect to the initial tracking window $W$. After scanning all possible candidate windows $W_T$ in the current frame, we select as $W^*$ the window that minimizes our proposed distance function $D$:

$$W^* = \arg\min_{W_T} D(W_T, W) \qquad (2)$$

The distance function $D$ is constructed from local foreground histograms computed from the blocks as follows. First, we transfer the block configuration of the tracking window $W_{t-1}$ onto each scanned window $W_T$, and accordingly, we can evaluate $K$ local foreground histograms in each of the transferred blocks. The local foreground histogram $H^{B_i,f}_t$ for the block $B_i$ is the intersection of the raw histogram $H^{B_i}_t$ with the initial foreground histogram of the corresponding block:

$$H^{B_i,f}_t(b) = \min\big(H^{B_i}_t(b),\; H^{B_i,f}_0(b)\big)$$

where $b$ indexes the bins. The distance function is defined as the weighted sum of the Bhattacharyya distances between the densities $H^{B_i,f}_0$ and $H^{B_i,f}_t$:

$$D(W_T, W) = \sum_{i=1}^{K} \lambda_i \, \rho\big(H^{B_i,f}_0,\; H^{B_i,f}_t\big) \qquad (2)$$


where $\lambda_i$ is the weight associated with block $B_i$ and $\rho$ is the Bhattacharyya distance between two densities,

$$\rho\big(H^{B_i,f}_0,\; H^{B_i,f}_t\big) = \sqrt{1 - \sum_{b=1}^{N} \sqrt{H^{B_i,f}_0(b)\, H^{B_i,f}_t(b)}}$$

where $N$ is the number of bins. Since the blocks are rectangular, all histograms can be computed by a few subtractions and additions using integral histograms. Because of $\lambda_i$, $D$ will downweight blocks containing more background pixels, and this is desirable because it provides some measure against background noise and clutter. Note that, compared with most histogram-based trackers, which invariably use only one histogram intersection, the distance function $D$ defined in Equation 2 actually encodes some amount of shape and spatial information through the block configuration and the block weights.

2.3.2 Scaling

After the detection step, we randomly vary the size of the tracking window while keeping the target object at the center of these windows. For each scaled window $W_s = \mathrm{scale}(W, s_h, s_w)$, we estimate the foreground density $H^f_{t,W_s} = \{H^{B_i,f}_{t,W_s}\}$ and the background density $H^b_{t,W_s}$ (the values of $s_h$ and $s_w$ are selected within a range, i.e., $0.8 \le s_h, s_w \le 1.2$). We select the scale of the tracking window for which foreground matching and background mismatching are maximized. In other words, the adaptive window $W^*$ should minimize the following objective function:

$$W^* = \arg\min_{W_s} \; \alpha \sum_{i=1}^{K} \lambda_i \, \rho\big(H^{B_i,f}_0,\; H^{B_i,f}_{t,W_s}\big) + (1 - \alpha)\big(1 - \rho(H^f_0,\; H^b_{t,W_s})\big) \qquad (2)$$

where $\alpha$ is a parameter specifying the weights of the two matching terms. We set $\alpha$ to 0.3 in all our experiments to put more weight on the background mismatching term for scale selection. With this scheme, we can better determine the window that tightly encloses the target object. Figure 2-5 shows some tracking results using fixed and adaptive scaling.
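The distance of Section 2.3.1 and the scale objective of Section 2.3.2 can be sketched in a few lines of plain Python. This is only an illustrative sketch, not the MATLAB/MEX implementation described later: the function names, the list-based histograms, the explicit normalization before the Bhattacharyya distance, and the name `alpha` for the mixing parameter (set to 0.3 in the text) are all assumptions made here.

```python
import math

def normalize(hist):
    """Scale a histogram so that its bins sum to one (assumed convention)."""
    total = float(sum(hist))
    return [h / total for h in hist] if total > 0 else [0.0] * len(hist)

def bhattacharyya(p, q):
    """Bhattacharyya distance sqrt(1 - sum_b sqrt(p(b) q(b))) between two densities."""
    coeff = sum(math.sqrt(a * b) for a, b in zip(p, q))
    return math.sqrt(max(0.0, 1.0 - coeff))  # clamp guards against rounding

def block_distance(init_fg_hists, raw_hists, weights):
    """Weighted sum over blocks of the distance between each block's initial
    foreground histogram and the intersection of the raw histogram with it."""
    total = 0.0
    for h0, raw, w in zip(init_fg_hists, raw_hists, weights):
        fg = [min(r, h) for r, h in zip(raw, h0)]  # histogram intersection
        total += w * bhattacharyya(normalize(h0), normalize(fg))
    return total

def scale_objective(fg_distance, init_fg_hist, bg_hist, alpha=0.3):
    """Foreground matching term plus background mismatching term for one
    scaled window; smaller is better. alpha is a hypothetical name."""
    bg_dist = bhattacharyya(normalize(init_fg_hist), normalize(bg_hist))
    return alpha * fg_distance + (1.0 - alpha) * (1.0 - bg_dist)
```

A candidate window whose blocks reproduce the initial foreground histograms gets distance zero, and a scaled window whose background looks like the foreground is penalized by the second term of `scale_objective`.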


2.3.3 Refinement

Once the global scan produces the tracking window $W^*$ in which the target is located, the next step is to extract an approximate foreground region so that the shape variation can be better accounted for. We apply a graph-cut segmentation algorithm to segment out the foreground region in $W^*$. Previous work on this type of segmentation in the context of visual tracking (e.g., [31 33 39]) always defines the cost function in the form

$$E = E_A + E_S$$

where $E_A$ and $E_S$ are terms relating to appearance and shape, respectively. However, we have found that hard-coding the shape prior in a separate term $E_S$ is more of a hindrance than a help in our problem because of the extreme shape variation, as strong shape priors without dynamic information often lead to unsatisfactory results. Instead, our solution is to use only the appearance term $E_A$ but to incorporate the shape component through the definition of the foreground density.

Specifically, let $p$ denote a pixel and $P$ denote the set of all pixels in $W^*$. Let $P_B$ denote the background density that we estimated in the previous frame, and $P_i$, $1 \le i \le K$, the foreground density from $B_i$ (obtained by normalizing the histogram $H^{B_i,f}_t$). Furthermore, we denote by $P_f$ the foreground density obtained by normalizing the current foreground histogram $H^f_t$. Following [39 40], the graph-cut algorithm minimizes the cost function

$$E(C_p) = \mu \sum_{p \in P} R_p(C_p) + \sum_{(p,q) \in N :\, C_p \ne C_q} B_{p,q} \qquad (2)$$

where $C_p : P \to \{0, 1\}$ is a binary assignment function on $P$ such that for a given pixel $p$, $C(p) = 1$ if $p$ is a foreground pixel and 0 otherwise (we will denote $C(p)$ by $C_p$). $\mu$ is a weighting factor and $N$


denotes the set of neighboring pixels. We set $\mu$ to 0.5 in our algorithm. We define

$$B_{p,q} \propto \frac{\exp\big(-(I(p) - I(q))^2 / 2\sigma^2\big)}{\lVert p - q \rVert}$$

where $I(p)$ is the intensity value at pixel $p$ and $\sigma$ is the kernel width. The term $R_p(C_p)$ is given as

$$R_p(C_p = 0) = -\log P_F(I(p), p), \qquad R_p(C_p = 1) = -\log P_B(I(p))$$

where $P_F(I(p)) = P_i(I(p))$ if $p \in B_i$, and $P_F(I(p)) = P_f(I(p))$ if $p$ is not contained in any block $B_i$. Note that the shape information is now implicitly encoded through $P_F$. A fast combinatorial algorithm with polynomial complexity exists for minimizing the energy function $E$, based on the problem of computing a minimum cut across a graph [40]. Since we only perform the graph cut in a (relatively) small window, this can be done very quickly and does not substantially increase the computational load. Figure 2-4 presents some example segmentation results.

2.3.4 Update

After the object contour is extracted from the segmentation result, we update the positions of the blocks $B_i$ within $W^*$. The idea is to locally adjust these blocks so that they provide maximal coverage of the segmented foreground region. We employ a greedy strategy to cover the entire segmented foreground by moving each block locally, using a priority based on block size. Note that such an approach (i.e., local jittering) has often been adopted in object detection and tracking algorithms for later-stage refinement and fine-tuning. An example of the block configuration update process is shown in Figure 2-6.

As the foreground definition is now known, we can compute the foreground histogram $H^{B_i,f}_t$ from each block $B_i$. After that, we recompute the corresponding block


weights according to the following equation:

$$\lambda_i = \frac{\sum_{b=1}^{N} H^{B_i,f}_t(b)}{\sum_{p \in W^*} C(p)}$$

The weights $\lambda_i$ are normalized to enforce the requirement that their sum is one.

2.3.5 Discussion

Compared with recent work that employs discriminative models (classifiers) for tracking (e.g., [37]), our approach is mainly generative through the use of intensity histograms. While we assume that the intensity distribution stays stationary, the features we constantly update are the block configurations and the associated weights. Online appearance updates (e.g., [37 41 42]) have been shown to be effective for tracking rigid objects. However, as the examples shown in these works are almost without significant shape variation, it is difficult to see how these techniques can be generalized immediately to handle shape updates. On the other hand, shape variation has often been managed in visual tracking algorithms using shape templates learned offline and the dynamics among the templates [24 28 31 33]. It is also not clear how these algorithms can deal with sequences containing unseen shapes or dynamics. Instead of hard-coding the shape prior, our algorithm provides a soft update on shape in the form of updating the block configuration, and the update is constrained by the appearance model through the requirement that the foreground intensity distribution stays roughly stationary.

Our use of an adaptive block structure is easily associated with recent work that tracks and detects parts of an articulated object (e.g., [43 44]). However, our goal and motivation are quite different in that the blocks are employed to provide a convenient structure for approximating the object's shape and estimating the intensity histogram. Our objective is an accurate and efficient tracker, not the precise localization of parts, which in general requires substantially more processing. Nevertheless, it is interesting to


investigate the possibility of applying our technique to this type of tracking/detection problem, and we leave this to future work.

2.4 Experiments and Results

The BHT algorithm has been implemented in MATLAB with some optimization using MEX C++ subroutines. The code and data are available at http://www.cise.ufl.edu/~smshahed/tracking.htm. In this implementation, we use intensity histograms with 16 bins for grayscale videos. Each video consists of 320 × 240 pixel images recorded at 15 frames per second. The number of blocks, $K$, is set to two or three. The tracker has been tested on a variety of video sequences, and eight of the most representative sequences are presented here. We compare the tracking results of our algorithm with a tracker using the plain integral histogram [2] and the mean shift tracker [1]. On a Dell 2.66 GHz machine, our tracker runs at 8-10 frames per second, while the integral histogram tracker is slightly faster at 12 frames per second (in our MATLAB implementation, both algorithms share the same MEX C++ subroutines). The additional overhead incurred in our algorithm comes from the update of the block configuration, which amounts to a small fraction of the time spent on computing the integral histogram over the entire image. However, experimental comparisons show that this negligible overhead in run-time complexity allows our tracker to consistently produce much more stable and satisfactory tracking results.

In the following experiments, for initialization, we manually outline the contour of the target in the first frame, and all trackers start with the same tracking window. The sequences shown below are all collected from the web. In these sequences, the foreground targets undergo significant appearance changes, which are mainly due to shape variation. We first present the qualitative tracking results and then quantitative comparisons with ground truth data.
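Since the integral histogram dominates the running time reported above, a small sketch may help show why rectangular block histograms are cheap once it is built. The code below is a plain-Python illustration under stated assumptions: 8-bit intensities quantized uniformly into 16 bins, half-open rectangle coordinates, and the function names `integral_histogram` and `rect_histogram`, none of which come from the thesis code.

```python
def integral_histogram(image, num_bins=16, max_value=256):
    """Return ih with ih[r][c][b] = number of pixels in rows 0..r-1 and
    columns 0..c-1 of `image` whose intensity falls in bin b."""
    rows, cols = len(image), len(image[0])
    ih = [[[0] * num_bins for _ in range(cols + 1)] for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            b = image[r][c] * num_bins // max_value  # uniform quantization (assumed)
            for k in range(num_bins):
                ih[r + 1][c + 1][k] = (ih[r][c + 1][k] + ih[r + 1][c][k]
                                       - ih[r][c][k] + (1 if k == b else 0))
    return ih

def rect_histogram(ih, top, left, bottom, right):
    """Histogram of rows top..bottom-1 and columns left..right-1,
    obtained from four corner lookups per bin."""
    return [ih[bottom][right][k] - ih[top][right][k]
            - ih[bottom][left][k] + ih[top][left][k]
            for k in range(len(ih[0][0]))]
```

After the one-time pass over the image, the histogram of any block in any candidate window costs a constant number of additions and subtractions per bin, regardless of the block size.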


2.4.1 Qualitative Results

The female skating sequence contains over 150 frames, and the dazzling performance is accompanied by an equally dazzling pose variation. As shown in Figure 2-7, while the background is relatively simple, the integral histogram tracker and the mean-shift tracker are not able to locate the skater accurately, producing jittered and unstable tracking windows. In particular, it is impossible to utilize this unsatisfactory tracking result for other vision applications such as gait or pose recognition. However, our tracker is able to track the skater well and provides tracking windows that are much more accurate and consistent. As shown in the figures, one major reason for this improvement is that the spatial locations of the blocks are updated correctly by our algorithm as the skater undergoes significant changes in pose.

The second sequence contains 435 frames with a figure skater performing in a cluttered environment, and the tracking results using our method and the mean-shift algorithm are shown in Figure 2-9. Our algorithm is able to accurately track the skater throughout the whole sequence (i.e., the tracking windows are accurately centered around the skater) as shown in Figure 2-9, while the integral histogram tracker again produces unsatisfactory results. Perhaps more importantly, our algorithm is able to track the skater across shots taken from two different cameras (e.g., from frame 372 to 373 and onwards), which is difficult to handle for most visual tracking algorithms, particularly those using differential techniques. The results also demonstrate the advantage of being able to efficiently scan the entire image for the target, as the mean-shift tracker loses the target when the camera angle changes (e.g., frame 372 to 373 and onwards).

The third and fourth sequences contain two stylistically different dances. In both sequences, adverse conditions such as cluttered backgrounds, scale changes and rapid movements have a significant presence, and the shape variations in them are even more pronounced when compared with the two previous sequences. In both


experiments (Figures 2-8 and 2-11), our tracker is able to track the dancers accurately while the integral histogram and the mean-shift trackers fail to produce consistent and accurate results.

Figure 2-12 presents the tracking results on a very challenging sequence in which there is a large variation in the shape, scale, and appearance of the target. Furthermore, the target undergoes ballistic movements. Notwithstanding these difficulties, the proposed tracker is able to follow the target accurately using only two blocks.

In Figure 2-13, we apply our tracking algorithm to a sequence in which the target object is fully occluded at some point. In this sequence, two persons walk past each other and the person being tracked is fully occluded. As shown in Figure 2-13, our tracking algorithm is able to track the target correctly before and after occlusion, while the mean-shift tracker is confused by the occlusion and loses the target afterwards.

We also test our tracker on a sequence in which the target soccer player appears against a cluttered and changing background. As shown in Figure 2-14, our tracker produces more stable results than the mean shift tracker. Finally, in Figure 2-15, we apply the proposed tracker to a wildlife sequence with a natural scene as the background.

As described in Section 2.3.2, our tracker is able to adjust the size of the tracking window to tightly enclose the target object. In this section, we present some results of our tracker with fixed and adaptive scaling. As shown in Figure 2-16, our tracker with adaptive scaling is able to better enclose the target object than the one with fixed scaling, although both are centered at the same target locations.

2.4.2 Quantitative Performance

For quantitative performance evaluation of our tracker, we manually label the ground truth by selecting the minimal window that encloses the target in every frame. As our sequences contain articulating targets, we do not include parts (hands, legs) in the


ground truth window when they are spread out too much. Some of these examples are shown in Figure 2-10.

We use two error metrics for quantitative evaluation. The first one measures the deviation of the center of the tracking window from the ground truth, whereas the second one measures the coverage of the tracking window against the ground truth. Certainly an optimal tracker is expected to have small errors in both metrics. The quantitative performance of our tracker, the integral histogram based tracker and the mean-shift tracker with respect to these error measurements is summarized in Tables 2-1, 2-2 and Figures 2-17 as well as 2-18. We observe that our tracker outperforms the other two trackers by a large margin, as our tracker achieves the lowest mean errors in all sequences with small standard deviations.

We present quantitative comparisons of our trackers with fixed and adaptive scaling. From each original sequence, we select a subsequence that contains substantial scale variation of the target for the experiments. Experimental results, as summarized in Tables 2-3, 2-4 and Figure 2-19, show that adaptive scaling improves the accuracy in location in all cases and coverage in most cases.

As the size of the target objects varies significantly in our experiments, it is of great interest to further analyze whether the algorithm is able to adjust the tracking window size in terms of width and height. Using ground truth data, we compute the variation of the width and height of the target object and then compare it with the results obtained from our tracker with adaptive scaling. Figure 2-20 shows the plots for this analysis and Table 2-5 summarizes the errors for this experiment. Overall, our tracker with adaptive scaling is able to adjust both the width and height of the tracking window when the target object undergoes large variation in scale.

2.5 Discussion

In this work, we have introduced an algorithm for accurate tracking of objects undergoing significant shape variation (e.g., articulated objects). Under the general


assumption that the foreground intensity distribution is approximately stationary, we show that it is possible to rapidly and efficiently estimate it amidst substantial shape changes using a collection of adaptively positioned rectangular blocks. The algorithm first locates the target by scanning the entire image using the estimated foreground intensity distribution. The refinement step that follows provides an estimated target contour from which the blocks can be repositioned and weighted. Our algorithm is efficient and simple to implement. Experimental results have demonstrated that the BHT algorithm consistently provides more precise tracking results when compared with the integral histogram based tracker [2] and the mean shift tracker [1].
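The reweighting of blocks after segmentation (Section 2.3.4) can be illustrated with a short sketch. It assumes that the foreground histogram of a block simply counts the segmented foreground pixels inside that block, so that its bin sum equals the number of such pixels; the function name `block_weights`, the half-open rectangle convention, and the uniform fallback for an empty segmentation are assumptions made for illustration only.

```python
def block_weights(mask, blocks):
    """Compute normalized block weights from a binary segmentation.

    mask   : 2D list of 0/1 labels over the tracking window (1 = foreground)
    blocks : list of (top, left, bottom, right) rectangles, half-open
    """
    fg_total = sum(sum(row) for row in mask)
    raw = []
    for top, left, bottom, right in blocks:
        # Foreground pixels covered by this block (equals the bin sum of the
        # block's foreground histogram under the assumption stated above).
        fg_in_block = sum(mask[r][c]
                          for r in range(top, bottom)
                          for c in range(left, right))
        raw.append(fg_in_block / fg_total if fg_total else 0.0)
    total = sum(raw)
    if total == 0:
        # Hypothetical fallback: no foreground found, use uniform weights.
        return [1.0 / len(blocks)] * len(blocks)
    return [w / total for w in raw]  # weights sum to one
```

For example, with a mask whose left half is foreground, a block covering the left half receives weight 1 and a block covering only the right half receives weight 0, so background-heavy blocks contribute little to the distance in the next detection step.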


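Table 2-1. Center location errors of the integral histogram-based tracker, the mean-shift tracker and the proposed BHT

| Sequence | Integral histogram tracker: Max | Mean | Std | Mean-shift tracker: Max | Mean | Std | Our proposed BHT: Max | Mean | Std |
|---|---|---|---|---|---|---|---|---|---|
| Female skater | 55.2 | 29.7 | 11.8 | 69.2 | 26.9 | 16.1 | 33.1 | 13.1 | 6.8 |
| Male skater | 165.0 | 45.7 | 29.6 | 168.0 | 69.8 | 26.6 | 35.0 | 13.6 | 7.5 |
| Indian dancer | 25.5 | 12.1 | 5.8 | 49.5 | 25.4 | 11.6 | 29.8 | 8.4 | 4.8 |
| Dancer | 58.9 | 31.9 | 10.2 | 61.5 | 41.6 | 11.8 | 33.7 | 16.1 | 6.9 |

Table 2-2. Coverage errors of the integral histogram-based tracker, the mean-shift tracker and the proposed BHT

| Sequence | Integral histogram tracker: Max | Mean | Std | Mean-shift tracker: Max | Mean | Std | Our proposed BHT: Max | Mean | Std |
|---|---|---|---|---|---|---|---|---|---|
| Female skater | 79.91 | 61.15 | 9.37 | 100.00 | 63.76 | 16.29 | 63.45 | 48.24 | 9.39 |
| Male skater | 100.00 | 67.48 | 13.63 | 100.00 | 80.12 | 19.01 | 74.79 | 53.28 | 11.77 |
| Indian dancer | 66.68 | 50.19 | 6.38 | 89.21 | 61.25 | 13.13 | 58.96 | 46.26 | 5.66 |
| Dancer | 87.59 | 62.55 | 9.42 | 87.38 | 71.49 | 8.95 | 69.82 | 48.23 | 8.70 |

Table 2-3. Center location errors of our tracker with the fixed and the adaptive scaling window

| Sequence | Fixed scale window: Max | Mean | Std | Adaptive scaling window: Max | Mean | Std |
|---|---|---|---|---|---|---|
| Female skater | 33.11 | 12.11 | 7.72 | 21.84 | 9.15 | 4.98 |
| Male skater | 22.64 | 10.28 | 5.27 | 26.73 | 9.48 | 5.56 |
| Indian dancer | 29.83 | 8.36 | 4.75 | 24.63 | 6.58 | 4.35 |

Table 2-4. Coverage errors of our tracker with the fixed and the adaptive scaling window

| Sequence | Fixed scale window: Max | Mean | Std | Adaptive scaling window: Max | Mean | Std |
|---|---|---|---|---|---|---|
| Female skater | 63.49 | 52.44 | 5.89 | 59.72 | 44.95 | 6.07 |
| Male skater | 72.64 | 52.02 | 13.37 | 71.07 | 44.75 | 11.08 |
| Indian dancer | 58.95 | 46.26 | 5.66 | 57.29 | 37.29 | 8.41 |

Table 2-5. RMS errors in tracking window size (in %)

| Sequence | Error in width: Mean | Std | Error in height: Mean | Std |
|---|---|---|---|---|
| Indian dancer | 9.22 | 7.13 | 6.13 | 5.02 |
| Male skater | 8.18 | 6.46 | 20.15 | 13.94 |
| Female skater | 27.84 | 21.05 | 13.55 | 10.76 |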


Frames shown: Female Skater #43, Dancer #78, Male Skater #160, Indian Dancer #212

Figure 2-1. Motivation of our block-based tracking algorithm. Top: Using only a histogram to represent the object's appearance, the tracking results are often unsatisfactory. Bottom: Using the BHT algorithm, the tracking results are much more consistent and satisfactory.

Tracking Algorithm Outline

1. Detection: The entire image is scanned and the window with the highest similarity is determined to be the tracking window $W^*$.
2. Scaling: The size of the tracking window $W^*$ is adjusted according to the scale of the target.
3. Refinement: Within $W^*$, the target is segmented out using a graph-cut based segmentation that divides the tracking window into foreground and background regions. The segmentation uses both the estimated foreground and background distributions.
4. Update: The block configuration is adjusted locally based on the segmentation result obtained in the previous step. The non-negative weights $\lambda_i$ of the blocks are recomputed.

Figure 2-2. High-level outline of the BHT algorithm.


Frames shown: Indian Dancer #58, Male Skater #355

Figure 2-3. Initialization of the BHT algorithm. Top: Examples of articulating targets. Bottom: Given the contour of the target object, we select those blocks (and associated weights) with non-empty intersection with the interior region of the target defined by the contour. Blocks containing only background pixels are not selected. The importance of a block is proportional to the percentage of its pixels belonging to the foreground.

Figure 2-4. Examples of segmentation within the tracking window.


Frames shown: Male Skater #113, Female Skater #87, Indian Dancer #126

Figure 2-5. Top: Tracking with a fixed scale window. Bottom: Tracking using an adaptive scaling window.

Figure 2-6. Overview of the block configuration update process. The block configuration within the tracking window is regularly updated using the segmentation result of the refinement step, thereby enabling the tracker to follow a target with large shape variation.


Frames shown: Female Skater #41, #73, #96, #125, #144

Figure 2-7. Visual comparison of tracking results on the Female Skater sequence. Top: Tracking results using only the integral histogram. Middle: Tracking results using the mean-shift tracker. Bottom: Tracking results using the BHT algorithm. The shape variation in this sequence is substantial. Notice the unsatisfactory results produced by the integral histogram tracker. Such inaccurate tracking results are difficult for other vision applications to utilize.

Frames shown: Dancer #33, #73, #127, #166, #213

Figure 2-8. Visual comparison of tracking results on the Dancer sequence. Top: Tracking results using only the integral histogram. Middle: Tracking results using the mean-shift tracker. Bottom: Tracking results using the BHT algorithm. Note that the target undergoes large shape and scale variation.


Frames shown: Male Skater #17, #96, #156, #226, #239, #256, #293, #342, #373, #376

Figure 2-9. Visual comparison of tracking results on the Male Skater sequence. First and fourth rows: Tracking results using only the integral histogram. Second and fifth rows: Tracking results using the mean-shift tracker. Third and sixth rows: Tracking results using the BHT algorithm.


Frames shown: Indian Dancer #96, Female Skater #14, Male Skater #228

Figure 2-10. Examples of ground truth windows for articulated targets.

Frames shown: Indian Dancer #20, #26, #110, #122, #134

Figure 2-11. Visual comparison of tracking results on the Indian Dancer sequence. Top: Tracking results using only the integral histogram. Middle: Tracking results using the mean-shift tracker. Bottom: Tracking results using the BHT algorithm. Note that the target undergoes significant shape and appearance variation.

Frames shown: Cartoon #3, #18, #28, #60, #74, #104, #112, #141, #150, #176

Figure 2-12. Tracking results on the cartoon sequence using the BHT algorithm. Note that there is a large variation in the shape, scale and appearance of the target. In addition, the target exhibits ballistic and non-rigid motions.


Frames shown: Occluded #19, #36, #50, #63

Figure 2-13. Tracking an occluded target. Top row: Tracking results using the BHT algorithm. Bottom row: Tracking results using the mean-shift tracker. The proposed tracker can recover the target after occlusion, while the mean-shift tracker fails.

Frames shown: Soccer Player #53, #75, #85, #95

Figure 2-14. Tracking with a cluttered background. Top row: Tracking results using the BHT algorithm. Bottom row: Tracking results using the mean-shift tracker. The proposed tracker can track the target more consistently than the mean-shift tracker.

Frames shown: Deer #10, #20, #27, #41, #48, #57, #71, #77

Figure 2-15. Tracking of a wildlife target using the BHT algorithm. Eight selected frames from a sequence of 100 frames are shown.


Frames shown: Female Skater #12, #34, #57, #87; Indian Dancer #28, #41, #89, #126; Male Skater #47, #91, #113, #153

Figure 2-16. Top rows: Tracking results of our algorithm with fixed scale. Bottom rows: Tracking results of our algorithm with adaptive scaling.


Figure 2-17. Per-frame center location error plots using different tracking algorithms. Top row: (Left) Male Skater, (Right) Female Skater. Bottom row: (Left) Indian Dancer and (Right) Dancer sequence.


Figure 2-18. Per-frame coverage error plots using different tracking algorithms. Top row: (Left) Male Skater, (Right) Female Skater. Bottom row: (Left) Indian Dancer and (Right) Dancer sequence.


Figure 2-19. Comparison between our fixed and adaptive scaling window based trackers. Top: Indian dancer sequence. Middle: Female skater sequence. Bottom: Male skater sequence. Center location errors are shown in the left column and coverage errors in the right column.


Figure 2-20. Results using the BHT algorithm with fixed and adaptive scaling. Top: Indian dancer sequence. Middle: Male skater sequence. Bottom: Female skater sequence. Variations in width and height are shown in the left and the right columns, respectively.


CHAPTER 3
SIMULTANEOUS MULTI-FRAME TRACKING FOR ARTICULATED TARGETS

3.1 Introduction

In Chapter 2, we discussed the BHT algorithm for tracking articulated objects, and our experiments showed that the algorithm produces robust tracking results. The algorithm has been implemented in MATLAB, and to improve the running time, the detection step has been coded using C++ MEX subroutines. The current implementation of the algorithm runs at 8-10 fps on a 2.66 GHz CPU. In this chapter, we focus on improving the performance of the BHT algorithm. More specifically, we identify the computationally intensive steps of the algorithm and observe that these steps are amenable to parallelization. The structure of the BHT algorithm is also modified to work on several consecutive frames in parallel. We design and develop an optimized implementation of the modified BHT algorithm, referred to as parallel-BHT or p-BHT, on a graphics processing unit (GPU).

Many computer vision algorithms have high data parallelism, i.e., the same set of instructions is executed on many data items. The performance of these algorithms can be improved significantly using the GPUs available in modern graphics hardware. For example, a particle filtering algorithm is proposed in [45] for 3D tracking of faces using GPUs. Their algorithm extracts $N$ feature points from the detected face and generates $M$ particles for tracking. The most expensive task of the algorithm is to compute the matching errors of all feature/particle pairs. The calculations of these errors are independent of each other, and this is exploited in [45]. The errors are calculated in parallel using a GPU, and their algorithm runs 4-7 times faster than the CPU-only version.

Similarly, in our case, the evaluations of the many candidate windows in the detection step are independent of each other, and we can process them in parallel. We have used MATLAB to implement the sequential steps of the p-BHT algorithm, and the detection


step of the algorithm has been implemented on a graphics processing unit (GPU) for parallel processing. The GPU is programmed using NVIDIA's CUDA programming framework. The implementation of the p-BHT algorithm produces tracking results very similar to those obtained by the BHT algorithm. However, the p-BHT algorithm runs considerably faster than the BHT algorithm.

The rest of the chapter is organized as follows. In Section 3.2, we briefly describe graphics processing units (GPUs) along with some relevant computer vision applications developed using GPUs. A brief description of the CUDA programming framework is given in Section 3.3. The p-BHT algorithm is presented in Section 3.4. The CPU-GPU based implementation of the algorithm is described in Section 3.5. We discuss the experimental results and the performance of the p-BHT algorithm in Section 3.6. Finally, we conclude the chapter with a short summary in Section 3.7.

3.2 Graphics Processing Units

3.2.1 Overview of GPU

A GPU contains many cores, and it distributes the data onto all of these cores for parallel computation. A program called a kernel is executed on all of the available processing units with different input data. While many computers today contain multiple CPU cores and support the execution of multi-threaded applications, such processors are not data parallel, and they usually execute different programs on different threads. GPUs, on the other hand, are designed for data parallelism, and their cores can cooperate on data fetching and other operations since they all execute the same instructions.

3.2.2 Application of GPU in Computer Vision

The parallel data-processing capability of modern GPUs has recently been used in many computer vision algorithms for real-time applications. In this section, we describe several relevant works on visual tracking.

A GPU-based Kanade-Lucas-Tomasi (KLT) feature tracker has been developed in [46] that can track 1000 feature points at 30 fps, which is 20 times faster than the CPU


implementation. Their GPU implementation can extract about 800 SIFT features from 640 × 480 images at 10 fps. In [47], the authors developed a framework for eye-blink detection and tracking, which can detect eye blinks with 97% accuracy. They used openVidia's [48] GPU-based SIFT implementation for feature extraction. Their tracker can process up to 25 frames per second. The GPU-based particle filtering framework developed in [49] for visual tracking runs 2.5-3.5 times faster than previous CPU/GPU versions of particle filter-based trackers [50 51]. This work has been extended in [52] for more accurate tracking by adopting multi-resolution and local search. The overall tracking speed was up to 40 fps for 320 × 240 video sequences. Recently in [45], a particle filtering framework has been used to detect and track multiple faces simultaneously using NVIDIA's CUDA architecture. The system can detect and track faces in 3D environments at 25-40 fps, and the GPU implementation runs 4-9 times faster than the CPU version. The SSD-based planar tracking algorithm proposed in [53] uses a template of the target for tracking. The tracker achieves real-time performance on a CPU if the size of the template is smaller than 100 × 100 pixels. For larger templates, however, the tracker runs slowly (the speed is only 4 fps for 256 × 256 templates). On the other hand, the GPU-based implementation of this algorithm developed in [54] runs at 18-96 fps, depending on the GPU used.

3.3 CUDA Architecture and Programming Framework

3.3.1 CUDA Hardware Architecture

In this section, we describe the hardware architecture of a typical CUDA-enabled GPU. The GPU contains a large number of highly threaded streaming multiprocessors (SMs), where each SM has several streaming processors (SPs). All the SPs in an SM execute the same instruction at the same time. Also, each SM contains a small amount of high-speed memory that is used to share data among the threads running on a single SM. The DRAM available inside the GPU is called the device memory. The block diagram of the hardware architecture of a CUDA-enabled GPU is shown in Figure 3-2.


Different types of memory reside in the device memory. Constant and texture memories are cached for faster access. However, they do not have write access. While the global memory is write-enabled, its access time is much slower since it is never cached.

3.3.2 CUDA Programming Framework

The computing system consists of a host (CPU) and one or more devices (GPUs) with parallel processing capability. A potential application must have high data parallelism to take advantage of the programming framework. The devices exploit the data parallelism to reduce the execution time.

CUDA program structure. A CUDA program can be divided into several modules. Some of these modules do not show any data parallelism, and these sequential modules are executed on the host. The other modules, which show data parallelism, are executed on the GPU. During compilation, these sequential and parallel modules are separated, and code for both the CPU and the GPU is generated. The program that runs on the GPU is known as the kernel. The kernel program creates a large number of threads to process the data in parallel. These threads are very lightweight, and the GPU contains the necessary hardware to schedule them efficiently.

Device memory and data transfer. The host and the devices have different memory spaces. Before a kernel starts execution, memory is allocated on the devices and the input is copied from the host memory to the device memory. Similarly, when the kernel execution is finished on the devices, the output is copied from the device memory to the host memory. These memory transfer operations are performed using the API functions available in the CUDA programming framework.

The CUDA device memory contains three different types of memory: the global memory, the constant memory, and the texture memory. The global memory residing inside the device has long access latencies and finite bandwidth. This is the only memory where the kernel can perform both read and write operations. Many


threads trying to access the global memory simultaneously can reduce the execution speed due to its long latency. However, executing a large number of threads can hide these long latencies. The constant memory offers short latency and high bandwidth in contrast to the global memory. However, the constant memory only allows read operations. The texture memory is also read-only memory, and it is cached according to spatial locality. Registers and shared memory in Figure 3-3 are on-chip memories that can be accessed at very high speeds. A thread can access only those registers that are allocated to it. Shared memory is allocated to a thread block, and all the threads in the block can access the allocated memory. The portion of the input data that is used by all the threads should be copied into the shared memory for faster access.

Threads. The kernel program is executed inside the GPU as a grid of parallel threads. Each thread grid contains thousands of GPU threads. All these threads execute the same code but perform operations on different parts of the input data. To identify the threads, each thread is given a unique ID. This ID is often used by a thread to identify its specific part of the input data. The two-level hierarchical organization of the threads is shown in Figure 3-3. At the top level, each grid contains one or more thread blocks. All blocks in a grid have the same number of threads. A grid can be organized as a multi-dimensional array of blocks, and similarly, each thread block can be organized as a multi-dimensional array of threads. A thread in this hierarchical model can be identified by the built-in CUDA variables blockIdx and threadIdx, which are assigned by the CUDA runtime system. These variables are accessible inside the kernel, and most of the time they determine the part of the data to be used by the thread. There are two additional variables, gridDim and blockDim, which specify the dimensions of the grid and the blocks, respectively.

All the threads in a block can be synchronized. CUDA provides the function __syncthreads() to achieve this synchronization. When a kernel calls this function, the


thread that executes the call will be held until every thread in the block reaches the same location. This ensures that all threads in a block have finished executing a phase of the kernel before they begin executing the next phase. However, threads in different blocks cannot be synchronized with each other.

3.4 Algorithm Description

In this section, we describe the p-BHT algorithm. This algorithm is a modified version of the BHT algorithm that improves the running time by utilizing the parallel processing capability of the GPU. Specifically, we modify the structure of the BHT algorithm. The BHT algorithm uses a small number of blocks to represent the object's appearance, and the positions of the blocks approximately represent the target's shape. It loops through the detection, the refinement, and the update steps. In the detection step, the target is localized by scanning the whole image. The refinement and the update steps are used to compute new block positions. The BHT algorithm computes the block positions for each frame. However, we observed that it is not necessary to update the block configuration at each frame. The configuration should be changed whenever the distance $D(W^*, W)$ gets large, because a larger distance implies that the target's shape is no longer well represented by the current block configuration and therefore a new block configuration should be computed. Thus, we can avoid the refinement and the update steps of the BHT algorithm as long as the distance $D(W^*, W)$ is reasonably small. This observation provides the motivation to combine the detection steps of several consecutive frames. We refer to this as the multi-frame detection step. In the p-BHT algorithm, we compute the block configuration every $N$th frame. Thus, a multi-frame detection step that detects the target in $N$ frames is followed by a refinement and an update step. The block diagram of the tracking algorithm is shown in Figure 3-4.

There is a trade-off in choosing the value of $N$. A larger value of $N$ speeds up the tracker by avoiding the refinement and the update steps. However, this also means that the block configuration is not updated for $N - 1$ frames. This may cause a problem if


the target shape changes substantially over these N frames. Therefore, the chosen N should be small enough that the target's shape does not vary too much in these N frames. In our experiments, we found that selecting a value within 8–16 works well.

We also identify the two computationally intensive components of the p-BHT algorithm which can be processed in a parallel architecture to speed up the running time: the computation of integral histograms and the computation of the distances D(WT, W). In the multi-frame detection step, the target is detected in N frames. Therefore, N integral histograms need to be computed first. Since the computation of the integral histogram for a frame does not depend on other frames, we exploit this to compute the integral histograms in parallel. Also, in the detection step, a large number of candidate windows over the entire frame are evaluated. The current block configuration is transferred onto the candidate window WT and local foreground histograms of the blocks are estimated. Then, the distance D(WT, W) is calculated using equation 2. These distance calculations are independent of each other and we can evaluate them in parallel on a GPU.

3.5 Implementation Details

Figure 3-5 shows the flow diagram of the CPU-GPU implementation of the p-BHT algorithm. The tracker is first initialized and then it loops through the multi-frame detection, the refinement, and the update steps. At each iteration, N frames are processed in parallel. The initialization step of the tracking algorithm is executed on the CPU. In this step, the contour of the target is manually outlined, and then the algorithm computes the initial block positions and their weights. The initial appearance of the target is also computed in this step. The multi-frame detection step runs on the GPU and detects the target in N consecutive frames. The frames are first loaded onto the GPU memory. Then, the GPU computes the integral histograms from the frames. After this, the distance D(W, WT) of each candidate window WT is calculated. At the end, for each frame, the distances of the candidate windows are


compared and the candidate window with the smallest distance is selected as the detection result. This completes the execution of the multi-frame detection step. The GPU then copies the detection results into the CPU memory. The refinement and the update steps run on the CPU, and these steps are performed only on the last frame of the N frames processed in the current iteration. In the refinement step, the window returned by the detection step is segmented into the foreground and the background. This segmentation is used to compute the block positions. These two steps are the same as in the BHT algorithm.

Next, we present our optimized parallel implementation of the multi-frame detection step on the GPU using the CUDA programming framework. As discussed above, this consists of three steps: (1) computing the integral histograms from the frames, (2) computing the distances of the candidate windows, and (3) finding the best candidate window by comparing the distances. Each of these steps runs as a kernel on the GPU.

The integral histogram of an image of size R × C is a multidimensional array of size R × C × B, where B is the number of bins in the histogram. The computation of the integral histogram for each bin is performed in parallel using separate threads. In the implementation, we use N × B threads for N frames, where each thread is assigned a frame and a bin index. These threads are divided into N/2 blocks, and each block computes the integral histograms of two frames. Each thread scans the frame assigned to it and calculates the bin values of its assigned index. We store the frames in the CUDA texture memory, which is cached. Hence, when the threads scan the frames, spatial locality is exploited, which results in improved performance. The output integral histograms are stored in the CUDA global memory and they are used by the subsequent kernels during their executions.

For every frame, the tracking algorithm finds the window which has the maximum foreground similarity with respect to the initial tracking window W. It evaluates the distances of all possible candidate windows WT in the frame. For N frames where each frame has the dimension H × W, the total number of candidate windows is


approximately N × H × W. The implementation uses N × W threads and each thread computes the distances of H candidate windows. These threads are divided into 2W blocks and each block contains N/2 threads. To calculate the distance, the foreground appearance estimated from the candidate window WT is compared with the target appearance. Thus, each thread needs to access the target appearance. This appearance is copied onto the shared memory by the first thread of each block. The shared memory is on-chip and fast. Further, multiple threads reading the same data from shared memory results in a broadcast of information rather than multiple reads. Thus, using shared memory to store the target's appearance reduces the execution time. The __syncthreads() function is called by the threads to make sure that the distance computation starts after the target appearance is copied onto the shared memory. The candidate window's appearance is calculated using the integral histogram residing in the CUDA global memory. The implementation uses the CUDA data type float4 to read from the global memory, which can load 4 values in a single memory access. Use of this special CUDA data type improves the global memory access performance.

Once the distances of the candidate windows are computed, the next step is to find the tracking window with the minimum distance. Again, this computation can be performed for each frame independently. The kernel for computing the minimum distance is executed in N threads. These threads run in parallel and select the best tracking window containing the target.

Finally, the computed integral histograms and tracking windows are copied from the GPU memory to the CPU memory. The detected tracking windows are the output of the tracker, while the integral histograms are used in the refinement and the update steps to compute a new block configuration.

3.6 Experimental Results

To evaluate the performance of the tracking algorithm, we have tested the tracker with the four sequences used in Chapter 2. These are the female skater, the male


skater, the dancer, and the Indian dancer sequences. These videos were collected from YouTube and each of them consists of 320 × 240 pixel frames recorded at 15 fps. We present qualitative and quantitative analyses of the tracking results, where our tracking results are compared with the BHT algorithm, the integral histogram tracker [2], and the mean shift tracker [1]. Then, we analyze the computational performance of the p-BHT algorithm and compare its tracking speed with that of the BHT algorithm.

3.6.1 Qualitative Results

We compared the results obtained using the tracker with the results of the histogram-based tracker [2] and the mean shift tracker [1]. These results are shown in Figures 3-6, 3-7, 3-8, and 3-9. We found that our algorithm tracks the articulated target more robustly and consistently than the other trackers.

3.6.2 Quantitative Performance

For quantitative performance evaluation of the tracker, the center location error and the coverage error are computed in each frame. The quantitative performances of the p-BHT, the BHT, the integral histogram-based tracker [2], and the mean-shift tracker [1] with respect to these error measurements are summarized in Tables 3-1 and 3-2. These tables show that the p-BHT algorithm behaves very similarly to the BHT algorithm and outperforms the other two trackers by large margins. Plots related to these tables are shown in Figures 3-10 and 3-11.

3.6.3 Computational Performance

The p-BHT algorithm has been tested using an Intel Core 2 Quad (Q9450) 2.66 GHz host system with 4 GB of RAM and an NVIDIA GeForce 8800 GT GPU. This GPU features 14 multiprocessors, 16 KB shared memory per multiprocessor, and 512 MB of device memory. There can be a maximum of 512 threads per block and 768 active threads per multiprocessor. For comparison purposes, the single-threaded BHT algorithm has also been tested on the same host CPU.


Figure 3-1 shows the running times of the BHT and the p-BHT algorithms on different sequences. While the BHT algorithm runs at 8–10 fps, the p-BHT tracker runs at 56–78 fps. This shows that the p-BHT algorithm runs about seven times faster than the BHT algorithm. In Table 3-4, the average time taken by the steps of the p-BHT algorithm is presented. In the fourth column of Table 3-4, the multi-frame detection time per iteration is given. The detection time per frame is calculated by dividing the detection time per iteration by the number of frames processed in an iteration. The fifth and the sixth columns show the running times of the refinement and the update steps. These two steps run only once per iteration and they only process the last frame of the batch of N frames. The total execution time for each iteration is calculated by summing the detection time per iteration and the time taken by the refinement and the update steps.

We now discuss the performance of the multi-frame detection step of the p-BHT algorithm. This step is executed on the GPU. First, the input is sent from the host memory to the device memory. The time taken to copy the input from the host memory to the device memory is shown in the first column of Table 3-3. In our implementation, three CUDA kernel functions are executed on the GPU. The average execution times of these kernels are presented in the third, fourth, and fifth columns of Table 3-3, respectively. When all the kernels finish their execution, the output is copied from the device to the host. The time spent on this operation is shown in the second column. The total execution time for each iteration is calculated by summing the memory transfer times and the kernel execution times. The last two columns of the table contain the average execution time per frame and the frame rates. From the table, we see that the average GPU execution time per frame is around 10–15 msec. In contrast, the detection step of the BHT algorithm takes around 80–90 msec per frame. Thus, the GPU-based implementation of the detection step runs 6–8 times faster.
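The per-frame figures described above follow from simple arithmetic on the per-iteration timings. The sketch below reproduces that arithmetic for the female skater sequence, using the values reported in Table 3-4 (14 frames per iteration, 151.70 msec detection, 8.40 msec refinement, 20.44 msec update). The function name and the dictionary keys are illustrative choices, not part of the original implementation.

```python
def per_frame_stats(frames_per_iteration, detection_ms, refinement_ms, update_ms):
    """Derive per-frame timings from per-iteration timings.

    Detection covers all frames of the batch; refinement and update
    run once per iteration (on the last frame of the batch).
    """
    total_ms = detection_ms + refinement_ms + update_ms
    return {
        "total_ms_per_iteration": total_ms,
        "detection_ms_per_frame": detection_ms / frames_per_iteration,
        "total_ms_per_frame": total_ms / frames_per_iteration,
        "frame_rate_fps": 1000.0 * frames_per_iteration / total_ms,
    }


# Female skater sequence, values taken from Table 3-4.
stats = per_frame_stats(14, 151.70, 8.40, 20.44)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

The same function applied to the other rows of Table 3-4 should give totals that agree with the tabulated values up to rounding.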


3.7 Discussion

In this chapter, we have developed a GPU-accelerated, real-time, and robust tracking system. Our system uses the p-BHT algorithm, which produces robust tracking results for articulated objects. The system is implemented on the GPU using the CUDA programming model proposed by NVIDIA. The implementation creates a large number of threads in order to efficiently utilize the parallel processing ability offered by the GPU. These threads occupy the cores available in the GPU and perform the computations in parallel. Our system efficiently utilizes the global and shared memory bandwidths. For creating the integral histograms, the frames were stored in texture memory, which uses data caching and reduces the data read time. The GPU's shared memory access time is several times shorter than that of global memory. To take advantage of this, the appearance histogram of the target was loaded into the shared memory. While reading the data from the global memory, we used the float4 data type. This allowed four floating-point values to be read in one memory read operation and reduced the total number of global memory read instructions. In terms of performance, the system runs at 56–78 fps, which is several times faster than the BHT algorithm developed in Chapter 2. In the future, we plan to implement our system on a more powerful GPU and analyze the performance of our algorithm.
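To make the role of the integral histogram concrete, the following is a minimal sequential sketch in Python, not the CUDA kernel used in this chapter. It assumes a grayscale image given as a list of rows with intensities in [0, 255] and uniform bin quantization; the function names and the zero-padded (R + 1) × (C + 1) × B layout are illustrative assumptions. The inner loop over bins is the part that the GPU implementation distributes over threads, one bin per thread.

```python
def integral_histogram(image, num_bins, max_value=256):
    """Return ih, where ih[r][c][b] counts the pixels of image[0:r][0:c] in bin b.

    A zero row and column are added in front so that window queries
    need no boundary checks (an assumption of this sketch).
    """
    rows, cols = len(image), len(image[0])
    ih = [[[0] * num_bins for _ in range(cols + 1)] for _ in range(rows + 1)]
    for r in range(rows):
        for c in range(cols):
            pixel_bin = image[r][c] * num_bins // max_value
            for b in range(num_bins):  # each bin is independent of the others
                ih[r + 1][c + 1][b] = (
                    ih[r][c + 1][b] + ih[r + 1][c][b] - ih[r][c][b]
                    + (1 if b == pixel_bin else 0)
                )
    return ih


def window_histogram(ih, top, left, bottom, right):
    """Histogram of rows top..bottom-1 and columns left..right-1, four lookups per bin."""
    num_bins = len(ih[0][0])
    return [
        ih[bottom][right][b] - ih[top][right][b] - ih[bottom][left][b] + ih[top][left][b]
        for b in range(num_bins)
    ]


image = [[0, 64, 128],
         [192, 255, 0]]
ih = integral_histogram(image, num_bins=4)
print(window_histogram(ih, 0, 0, 2, 3))  # histogram of the whole image
print(window_histogram(ih, 0, 1, 1, 3))  # histogram of the pixels 64 and 128
```

Once the integral histogram is available, the histogram of any rectangular window costs a constant number of lookups per bin, which is what makes evaluating many candidate windows per frame affordable.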


Table 3-1. Center location errors of the integral histogram-based tracker, the mean-shift tracker, the BHT and the p-BHT

                 Integral hist. tracker   Mean-shift tracker      BHT                     p-BHT
Sequence         Max     Mean    Std      Max     Mean    Std     Max     Mean    Std     Max     Mean    Std
Female skater    55.2    27.9    11.9     69.2    26.9    17.2    35.8    12.3    6.6     26.7    10.5    4.9
Male skater      168.0   36.8    25.9     165.4   45.3    33.6    36.4    13.2    7.4     37.4    12.8    57.8
Indian dancer    25.5    12.2    6.0      42.0    22.3    12.1    23.8    9.3     4.3     21.6    9.6     5.0
Dancer           57.5    30.2    10.3     59.4    39.7    12.0    33.1    16.0    6.7     41.1    20.9    7.5

Table 3-2. Coverage errors of the integral histogram-based tracker, the mean-shift tracker, the BHT and the p-BHT

                 Integral hist. tracker   Mean-shift tracker      BHT                     p-BHT
Sequence         Max     Mean    Std      Max     Mean    Std     Max     Mean    Std     Max     Mean    Std
Female skater    79.9    59.7    10.6     100.0   65.5    16.9    60.7    48.2    7.8     60.7    48.4    7.8
Male skater      100.0   63.6    14.4     100.0   68.7    19.8    78.9    52.4    13.4    78.9    52.3    13.1
Indian dancer    66.7    49.8    7.8      86.6    59.0    15.5    59.0    45.6    7.1     59.0    45.8    7.0
Dancer           87.2    61.1    10.5     87.0    70.0    10.3    71.7    47.1    9.9     70.5    51.6    9.1

Table 3-3. Performance analysis of different sub-steps in the multi-frame detection step.

                 Copy host   Copy device   Integral    Detection   Best window   Total time   Total time   Frame
                 to device   to host       histogram   score       selection     /iteration   /frame       rate
Sequence         (msec)      (msec)        (msec)      (msec)      (msec)        (msec)       (msec)       (fps)
Female skater    5.73        5.19          83.09       57.51       0.17          151.70       10.84        92
Male skater      4.17        4.99          81.92       43.11       0.16          134.34       13.43        74
Dancer           4.36        5.05          82.04       60.96       0.16          152.57       15.26        66
Indian dancer    5.68        5.43          83.12       83.24       0.17          177.65       12.69        79


Table 3-4. Computational performance of the p-BHT algorithm.

                 Frames                                       Detection   Refinement   Update    Total
Sequence         /iteration   Measure                         (GPU)       (CPU)        (CPU)     time
Female skater    14           average time/iteration (msec)   151.70                             180.54
                              average time/frame (msec)       10.84       8.40         20.44     12.89
                              frame rate (fps)                92.00       119.00       49.00     78.00
Male skater      10           average time/iteration (msec)   134.34                             160.99
                              average time/frame (msec)       13.43       7.72         18.93     16.10
                              frame rate (fps)                74.00       130.00       53.00     62.00
Dancer           10           average time/iteration (msec)   152.57                             177.27
                              average time/frame (msec)       15.26       6.43         18.27     17.73
                              frame rate (fps)                66.00       155.00       55.00     56.00
Indian dancer    14           average time/iteration (msec)   177.65                             208.39
                              average time/frame (msec)       12.69       9.74         21.00     14.88
                              frame rate (fps)                79.00       102.00       48.00     67.00


Figure 3-1. Comparison of computational speed between the p-BHT and the BHT algorithm.

Figure 3-2. Block diagram of the CUDA Hardware Architecture (Source [55])


Figure 3-3. Left: CUDA Programming Model. Right: CUDA Memory Model. (Source [55])

Figure 3-4. Block diagram of the p-BHT algorithm.
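The control flow summarized by the block diagram of the p-BHT algorithm can be sketched as a simple batch loop. The code below is only an outline under stated assumptions: detect_batch, refine, and update are hypothetical callbacks standing in for the multi-frame detection, refinement, and update steps, and state stands for the current block configuration and appearance model. None of these names come from the original implementation.

```python
def run_p_bht(frames, n, detect_batch, refine, update, state):
    """Track through `frames` in batches of `n`.

    Every frame of a batch is detected with the same block configuration
    (`state`); refinement and update run once per batch, on its last frame.
    """
    windows = []
    for start in range(0, len(frames), n):
        batch = frames[start:start + n]
        batch_windows = detect_batch(batch, state)      # multi-frame detection step
        windows.extend(batch_windows)
        segmentation = refine(batch[-1], batch_windows[-1], state)  # last frame only
        state = update(state, segmentation)             # new block configuration
    return windows, state


# Toy callbacks that only record which configuration was used for each frame.
def toy_detect(batch, state):
    return [(frame, state) for frame in batch]

def toy_refine(frame, window, state):
    return frame

def toy_update(state, segmentation):
    return state + 1

windows, final_state = run_p_bht(list(range(25)), 10, toy_detect, toy_refine, toy_update, 0)
print(len(windows), final_state)
```

With a batch size in the range the text recommends, the costly refinement and update steps are amortized over the whole batch, which is the source of the trade-off discussed for the choice of N.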


Figure 3-5. Block diagram of the CPU-GPU implementation of the p-BHT algorithm.

Figure 3-6. Visual comparison of tracking results of the Female Skater sequence (frames #35, #52, #68, #94, #117). Top: Tracking results using only integral histogram. Middle: Tracking results using the mean-shift tracker. Bottom: Tracking results using the p-BHT algorithm.


Figure 3-7. Visual comparison of tracking results of the Male Skater sequence (frames #29, #145, #175, #232, #281). Top: Tracking results using only integral histogram. Middle: Tracking results using the mean-shift tracker. Bottom: Tracking results using the p-BHT algorithm.

Figure 3-8. Visual comparison of tracking results of the Indian Dancer sequence (frames #9, #26, #82, #115, #134). Top: Tracking results using only integral histogram. Middle: Tracking results using the mean-shift tracker. Bottom: Tracking results using the p-BHT algorithm.


Figure 3-9. Visual comparison of tracking results of the Dancer sequence (frames #35, #59, #122, #160, #212). Top: Tracking results using only integral histogram. Middle: Tracking results using the mean-shift tracker. Bottom: Tracking results using the p-BHT algorithm.


Figure 3-10. Per-frame center location error plots using different tracking algorithms. Top row: (Left) Male Skater, (Right) Female Skater. Bottom row: (Left) Indian Dancer and (Right) Dancer sequence.


Figure 3-11. Per-frame coverage error plots using different tracking algorithms. Top row: (Left) Male Skater, (Right) Female Skater. Bottom row: (Left) Indian Dancer and (Right) Dancer sequence.


CHAPTER 4
CO-SUPERPIXELS: GENERATING CONSISTENT SUPERPIXELS ACROSS IMAGE FRAMES

4.1 Introduction

Many computer vision algorithms rely on parsing the image into small homogeneous image segments. Superpixels represent such spatially coherent regions that respect image edges and whose pixels share common color or texture information. Computing superpixels has recently received growing interest in the literature. For example, superpixels are used as building blocks in object detection [56], recognition [57 58], foreground segmentation [59] and many other vision algorithms [60]. Nowadays, many algorithms are available for computing superpixels. For instance, normalized cut algorithms [61 63] are often used for extracting superpixels from images. However, these algorithms are computationally very expensive. Levinshtein [64] proposed a level-set method that efficiently computes superpixels from an image. However, the algorithm provides no way to control the extent or size of superpixels.

Recently, Veksler et al. [65] proposed a discrete energy minimization framework for superpixel generation. The algorithm generates superpixels which are regular in size and shape. However, their approach shows inconsistency when applied to two very similar images independently. For example, the top row in Figure 4-1 shows superpixel maps of two similar images generated using their algorithm. In several areas, superpixel boundaries are inconsistent across the two images (some examples are marked with bold rectangles). Their framework generates superpixels that respect image edges. However, this is not sufficient for generating consistent superpixels across two similar images. Indeed, some consistency measure across the two images is required.

In this chapter, we address this problem by introducing an energy minimization framework which establishes consistency between corresponding superpixels of image pairs. This consistency is established by introducing a term in the energy function which encodes shape and appearance measures between corresponding superpixels. We


develop an iterative algorithm to minimize the energy function. The middle row in Figure 4-1 shows the consistent superpixel maps obtained using our algorithm.

Our contributions in this chapter are as follows. Firstly, we discuss an algorithm to generate superpixels from an image which are consistent with already computed superpixels of a similar image. Secondly, we develop an algorithm for computing consistent superpixels for pairs of successive video frames. We evaluate our algorithm with several examples to demonstrate its ability to generate consistent superpixels.

The rest of this chapter is organized as follows. In Section 4.2, we review the energy minimization framework for superpixelization of a single image. We develop an energy function to generate consistent superpixel maps in Section 4.3. Then, in Section 4.4, we present the details of our iterative algorithm, and finally, experimental results are discussed in Section 4.5.

4.2 Superpixel Generation from a Single Image

In this section, we review the energy minimization framework [65] for superpixel generation from a single image. Let $P$ be the set of pixels in the image and $L = \{1, 2, \ldots, |L|\}$ be the set of superpixel labels. Each superpixel with the label $l \in L$ is assumed to be contained in the bounding box $R_l$. The algorithm assigns a superpixel label $l \in L$ to each pixel $p \in P$. Let $f_p$ denote the superpixel assigned to pixel $p$ and let $f$ be the collection of all superpixel assignments. We refer to $f$ as a superpixel map. It is a map from $P$ to $L$, i.e., $f : P \mapsto L$. The assignment problem is formulated in an energy minimization framework, where the energy function is defined as

$$E(f) = \sum_{p \in P} D_p(f_p) + \lambda \sum_{\{p,q\} \in \mathcal{N}} w_{pq} V_{pq}(f_p, f_q) \tag{4}$$

This energy equation consists of two kinds of terms. The unary term $D_p(l)$ in Equation 4 specifies the cost of assigning the superpixel label $l$ to the pixel $p$. This term is used to regulate the size of the generated superpixels. More specifically, a pixel $p$ is


allowed to be assigned the label $l$ only if $p \in R_l$. Therefore the unary term is defined as

$$D_p(l) = \begin{cases} 1 & \text{if } p \in R_l \\ \infty & \text{otherwise} \end{cases} \tag{4}$$

The binary term $V_{pq}(l_p, l_q)$ in Equation 4 specifies the cost of labeling two neighboring pixels $p$ and $q$ with superpixel labels $l_p$ and $l_q$, respectively. It is defined as $V_{pq}(f_p, f_q) = \min(1, |f_p - f_q|)$ for all neighboring pixel pairs $\{p, q\}$. That is to say, if neighboring pixels are assigned to the same superpixel, then no cost is added to the energy function. On the other hand, if neighboring pixels are assigned different labels, then a penalty proportional to $w_{pq}$ is incurred. This binary term allows the superpixel boundaries to align with edges of the image.

A parameter $\lambda$ controls the relative weight between the unary and the binary terms. The coefficient $w_{pq}$ decreases as the intensity difference between the neighboring pixels $p$ and $q$ increases. Let $I_p$ denote the intensity of pixel $p$ and $\mathrm{dist}(p, q)$ denote the distance between the two pixels $p$ and $q$. Then, following [66], $w_{pq}$ is computed from the image $I$ using

$$w_{pq} = \exp\left(-\frac{(I_p - I_q)^2}{\mathrm{dist}(p, q)\, 2\sigma^2}\right) \tag{4}$$

Veksler et al. [65] used the $\alpha$-expansion algorithm [67] to minimize the energy function $E(f)$.

4.3 Consistent Superpixel Generation

Our goal in this section is to develop a framework for generating a superpixel map $f$ which is consistent with a given superpixel map $g$. We define $g$ as a map from $P$ to $L \cup \{0\}$, i.e., $g : P \mapsto L \cup \{0\}$. The superpixel map $g$ specifies preferred labels for a subset $Q$ of the pixels in $P$. For each $p \in Q$, $g_p \in L$, and if $p \notin Q$ then $g_p = 0$. In other words, a positive $g_p$ denotes a preferred superpixel label for the pixel $p$, since each superpixel has a distinct label which is a positive integer. Otherwise, if $g_p = 0$, then $g$ does not specify


any label preference for the pixel $p$. We refer to $g$ as a constraining map for $f$, and $Q$ is the set of constrained pixels.

We extend the energy function $E(f)$ to include the constraining map $g$. To measure the consistency between the superpixel maps $f$ and $g$, we introduce an additional unary term $S_p(f_p, g_p)$ in the extended energy function $E(f, g)$. For a given constraining map $g$, the map $f$ can be found by minimizing the extended energy function $E(f, g)$. The energy function can be written as follows

$$E(f, g) = \sum_{p \in P} D_p(f_p) + \mu \sum_{p \in P} S_p(f_p, g_p) + \lambda \sum_{\{p,q\} \in \mathcal{N}} w_{pq} V_{pq}(f_p, f_q) \tag{4}$$

where $\mu$ is the weight of the additional unary term. The additional unary term $S_p(f_p, g_p)$ adds a penalty to the energy function $E(f, g)$ whenever $f_p \neq g_p$ and thus encourages consistency between $f$ and $g$. The function $S_p(f_p, g_p)$ is defined as

$$S_p(f_p, g_p) = \begin{cases} 0 & \text{if } f_p = g_p \\ 1 & \text{otherwise} \end{cases} \tag{4}$$

As stated earlier, we minimize $E(f, g)$ in Equation 4 over $f$ for a given $g$ using the $\alpha$-expansion algorithm [67].

Now, we discuss some properties of the function $E(f, g)$. Let us assume that a superpixel map $g_0$ is given with $(g_0)_p = 0$ for all $p$. This means $g_0$ does not specify preferred labels for any pixel $p \in P$. Then the energy function $E(f, g_0)$ becomes equivalent to the energy function $E(f)$. For $g_0$, the energy functions $E(f)$ and $E(f, g_0)$ (Equations 4 and 4) have the same optimal solution $f$, and the minimum energy values differ only by the constant $\mu |P|$. Moreover, for any other $g \neq g_0$, the following relation holds

$$\min_f E(f, g) \le \min_f E(f, g_0) \tag{4}$$

We also consider another case where the subset $Q$ is extended to a larger subset $S$, i.e., $Q \subset S$. Let the corresponding constraining superpixel maps be $g_Q$ and $g_S$,


respectively. We also assume that $g_Q$ and $g_S$ have the same preferred superpixel label for all pixels $p \in Q$. In such cases, the following relation holds

$$\min_f E(f, g_S) \le \min_f E(f, g_Q) \tag{4}$$

Thus, extending the subset of pixels where $g$ is positive does not increase $E(f, g)$. Moreover, the superpixel map $f$ tries to match $g$, which decreases the value of $E(f, g)$.

In the next section, we present the formulation of our novel energy function to compute a consistent superpixelization of a similar image pair. We also discuss an iterative algorithm to minimize the energy function.

4.4 Co-superpixel Generation Framework

Let us assume that we are given a pair of images $I^{(1)}$ and $I^{(2)}$, and we want to generate superpixels from these images. Since the images are similar to each other, the resultant superpixel maps $f^{(1)}$ and $f^{(2)}$ should also be similar and they should satisfy the following properties:

- For each superpixel in the map $f^{(1)}$, there should be a corresponding superpixel in the map $f^{(2)}$.
- Each corresponding superpixel pair should have very similar appearances.
- Each corresponding superpixel pair should have very similar shapes and sizes.

We formulate a novel energy function that enforces these constraints on the maps $f^{(1)}$ and $f^{(2)}$, and minimization of the energy function gives the consistent superpixelization of the images $I^{(1)}$ and $I^{(2)}$. Let $g^{(t)}$ be the constraining map for the map $f^{(t)}$ and $C$ be the binary matrix that encodes the correspondence between the superpixel sets of the images $I^{(1)}$ and $I^{(2)}$. The binary matrix $C$ has dimensions $|L^{(1)}| \times |L^{(2)}|$, where $|L^{(t)}|$ represents the maximum number of superpixels that can be generated from the image $I^{(t)}$. We impose the following constraints on the matrix $C$ to ensure that each


superpixel in one set has at most one corresponding superpixel in the other set

$$\sum_{i=1}^{|L^{(1)}|} C(i, j) \le 1 \tag{4}$$

$$\sum_{j=1}^{|L^{(2)}|} C(i, j) \le 1 \tag{4}$$

We now present the energy function for the joint superpixelization scheme of the image pair as follows

$$E(f^{(1)}, g^{(1)}, f^{(2)}, g^{(2)}, C) = E^{(1)}(f^{(1)}) + E^{(2)}(f^{(2)}) + \mu \sum_{p \in P^{(1)}} S_p(f^{(1)}_p, g^{(1)}_p) + \mu \sum_{p \in P^{(2)}} S_p(f^{(2)}_p, g^{(2)}_p) - \gamma \sum_{l_1=1}^{|L^{(1)}|} \sum_{l_2=1}^{|L^{(2)}|} C(l_1, l_2)\, \mathrm{Sim}(l_1, l_2; g^{(1)}, g^{(2)}) \tag{4}$$

where the term $E^{(t)}(f^{(t)})$ is the superpixelization cost of the image $I^{(t)}$ with the map $f^{(t)}$, and this is defined in the same way as $E(f)$ is defined in Equation 4

$$E^{(t)}(f^{(t)}) = \sum_{p \in P^{(t)}} D_p(f^{(t)}_p) + \lambda \sum_{\{p,q\} \in \mathcal{N}^{(t)}} w^{(t)}_{pq} V_{pq}(f^{(t)}_p, f^{(t)}_q) \tag{4}$$

The third and the fourth terms of Equation 4 measure the consistency between the superpixel map $f^{(t)}$ and the constraining map $g^{(t)}$. Finally, the last term measures the similarity between the corresponding superpixels of the constraining maps $g^{(1)}$ and $g^{(2)}$. As mentioned earlier, the superpixel correspondence is specified by the binary matrix $C$. The negative sign indicates that increasing the superpixel similarity decreases the value of the energy function. The superpixel similarity term is weighted by the parameter $\gamma$. In Equation 4, the similarity between two superpixels $l_1$ and $l_2$, obtained from the constraining maps $g^{(1)}$ and $g^{(2)}$ respectively, is denoted by $\mathrm{Sim}(l_1, l_2; g^{(1)}, g^{(2)})$. We define this superpixel similarity term in Section 4.4.1, and to simplify the notation, the similarity term is denoted as $\mathrm{Sim}(l^{(1)}, l^{(2)})$ in the subsequent discussions.
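The per-image terms that enter the joint energy can be evaluated directly for a small labeling, which helps to see how the unary, consistency, and smoothness terms interact. The sketch below is an illustration under stated assumptions, not the thesis implementation: it uses a 4-connected neighborhood (so the pixel distance is 1), bounding boxes given as (top, left, bottom, right) with exclusive lower-right corners, and the argument names lam (smoothness weight), mu (weight of the consistency term), and sigma (intensity scale) are chosen here for readability.

```python
import math


def constrained_energy(image, f, g, boxes, lam, mu, sigma):
    """Evaluate the single-image energy with a constraining map on a small grid.

    image, f, g are lists of rows; boxes maps each label to its bounding box.
    A label of 0 in g means "no preference" and therefore never matches f.
    """
    rows, cols = len(image), len(image[0])
    unary, mismatches, pairwise = 0.0, 0, 0.0
    for r in range(rows):
        for c in range(cols):
            top, left, bottom, right = boxes[f[r][c]]
            inside = top <= r < bottom and left <= c < right
            unary += 1.0 if inside else math.inf      # bounding-box (size) term
            mismatches += 0 if f[r][c] == g[r][c] else 1  # consistency term
            for dr, dc in ((0, 1), (1, 0)):           # right and lower neighbors
                rr, cc = r + dr, c + dc
                if rr < rows and cc < cols:
                    diff = image[r][c] - image[rr][cc]
                    w = math.exp(-(diff ** 2) / (2.0 * sigma ** 2))
                    pairwise += w * min(1, abs(f[r][c] - f[rr][cc]))
    return unary + mu * mismatches + lam * pairwise


image = [[10, 10], [10, 10]]
boxes = {1: (0, 0, 2, 2), 2: (0, 0, 2, 2)}
one_label = [[1, 1], [1, 1]]
no_preference = [[0, 0], [0, 0]]
print(constrained_energy(image, one_label, one_label, boxes, lam=2.0, mu=1.0, sigma=5.0))
print(constrained_energy(image, one_label, no_preference, boxes, lam=2.0, mu=1.0, sigma=5.0))
```

Comparing the two printed values illustrates the property noted in Section 4.3: with a constraining map that expresses no preference, the energy of any labeling is shifted by the consistency weight times the number of pixels.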


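4.4.1 Superpixel Similarity Metrics

We define the similarity $\mathrm{Sim}(l^{(1)}, l^{(2)})$ between the superpixels $l^{(1)}$ and $l^{(2)}$ as a weighted sum of appearance and shape similarity terms

$$\mathrm{Sim}(l^{(1)}, l^{(2)}) = \beta\, S_{\text{appearance}}(l^{(1)}, l^{(2)}) + (1 - \beta)\, S_{\text{shape}}(l^{(1)}, l^{(2)}) \tag{4}$$

where $\beta$ specifies the weight. The first term in Equation 4, $S_{\text{appearance}}(l^{(1)}, l^{(2)})$, measures the appearance similarity between the superpixels $l^{(1)}$ and $l^{(2)}$. To evaluate this similarity metric, we compute intensity histograms of the superpixels and compare them using histogram intersection [68]. The second term, $S_{\text{shape}}(l^{(1)}, l^{(2)})$, is computed as follows. Let $P^{(i)}_{l^{(i)}}$ be the set of all pixels which are assigned to the superpixel $l^{(i)}$. Also, let $\delta$ be a displacement, which can be added to a pixel in $P^{(1)}$ to map it to a pixel in $P^{(2)}$. We apply the displacement $\delta$ to all the pixels in $P^{(1)}_{l^{(1)}}$, and then find the mapped pixels in the second image, $P^{(2)}_{l^{(1)}} = P^{(1)}_{l^{(1)}} \oplus \delta$. Now, the shape similarity metric of two superpixels $l^{(1)}$ and $l^{(2)}$ is defined as

$$S_{\text{shape}}(l^{(1)}, l^{(2)}) = \max_{\delta} \frac{|P^{(2)}_{l^{(1)}} \cap P^{(2)}_{l^{(2)}}|}{|P^{(2)}_{l^{(1)}} \cup P^{(2)}_{l^{(2)}}|} \tag{4}$$

Also, the optimal displacement $\delta^*$ between the superpixels is determined by

$$\delta^*(l^{(1)}, l^{(2)}) = \arg\max_{\delta} \frac{|P^{(2)}_{l^{(1)}} \cap P^{(2)}_{l^{(2)}}|}{|P^{(2)}_{l^{(1)}} \cup P^{(2)}_{l^{(2)}}|} \tag{4}$$

For a superpixel pair $l^{(1)}$ and $l^{(2)}$, $P^{(1)}_{l^{(1)}}$ maps to $P^{(1)}_{l^{(1)}} \oplus \delta^*(l^{(1)}, l^{(2)})$ in the image $I^{(2)}$ and similarly, $P^{(2)}_{l^{(2)}}$ maps to $P^{(2)}_{l^{(2)}} \oplus \delta^*(l^{(1)}, l^{(2)})^{-1}$ in the image $I^{(1)}$. To simplify our discussion, we define the following notations

$$P^{(2)}_{l_1 \to l_2} = P^{(1)}_{l^{(1)}} \oplus \delta^*(l^{(1)}, l^{(2)}) \tag{4}$$

$$P^{(1)}_{l_2 \to l_1} = P^{(2)}_{l^{(2)}} \oplus \delta^*(l^{(1)}, l^{(2)})^{-1} \tag{4}$$

4.4.2 Iterative Energy Minimization

Direct minimization of the energy function in Equation 4 is difficult because of the presence of the similarity terms. To tackle this problem, we follow an alternating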


minimization approach. In this approach, the set of variables $V$ is partitioned into $k$ disjoint subsets, i.e., $V = V_1 \cup V_2 \cup \ldots \cup V_k$. Then, $k$ phases are performed iteratively. At each phase $i$, the variables in the set $V_i$ are estimated while fixing the remaining variables. The iterations over these $k$ phases are continued until a convergence criterion is met. Although this approach does not solve the original problem exactly, in most cases it produces very good approximate solutions.

The energy function in Equation 4 contains the variables $f^{(1)}$, $f^{(2)}$, $g^{(1)}$, $g^{(2)}$, and $C$. We partition these variables into three disjoint subsets: $V_1 = f^{(1)}$, $V_2 = f^{(2)}$ and $V_3 = \{g^{(1)}, g^{(2)}, C\}$. Thus, our iterative minimization algorithm contains three phases. In the first phase, we estimate $f^{(1)}$ while fixing $f^{(2)}$, $g^{(1)}$, $g^{(2)}$ and $C$. Fixing the variables simplifies Equation 4, and $f^{(1)}$ can be estimated using the equation

$$E(f^{(1)}, g^{(1)}) = \sum_{p \in P^{(1)}} D_p(f^{(1)}_p) + \mu \sum_{p \in P^{(1)}} S_p(f^{(1)}_p, g^{(1)}_p) + \lambda \sum_{\{p,q\} \in \mathcal{N}^{(1)}} w^{(1)}_{pq} V_{pq}(f^{(1)}_p, f^{(1)}_q) \tag{4}$$

This equation has exactly the same form as the energy function $E(f, g)$ in Equation 4, and we can apply the $\alpha$-expansion algorithm [67] to estimate $f^{(1)}$. The second phase is exactly the same as the first phase. In this phase, we estimate the variable $f^{(2)}$ while the variables $f^{(1)}$, $g^{(1)}$, $g^{(2)}$ and $C$ are fixed. Again, we use the $\alpha$-expansion algorithm [67] to estimate $f^{(2)}$. In the third phase, we estimate the variables $g^{(1)}$, $g^{(2)}$ and $C$ while fixing the variables $f^{(1)}$ and $f^{(2)}$. First, the variable $C$ is estimated from the superpixel maps $f^{(1)}$ and $f^{(2)}$. Then, we estimate $g^{(1)}$ and $g^{(2)}$ from $C$.

We use the Hungarian algorithm [69] to determine the correspondence between the superpixel sets obtained from these maps. To compute the similarity between two superpixels in the Hungarian algorithm [69], we use our proposed superpixel similarity metric $\mathrm{Sim}(l^{(1)}, l^{(2)})$. For each corresponding pair $l^{(1)}$ and $l^{(2)}$, we set $C(l^{(1)}, l^{(2)}) = 1$ only if $\mathrm{Sim}(l^{(1)}, l^{(2)}) \ge \tau$. Here, $\tau$ is the similarity threshold. All other entries in $C$ are set to 0.
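The construction of the correspondence matrix described above can be illustrated on a small similarity matrix. The sketch below is deliberately simplified: instead of the Hungarian algorithm it finds the best one-to-one matching by exhaustive search over permutations, which gives the same optimum but is only practical for a handful of superpixels. The function names, the list-of-lists matrix layout, and the use of "greater than or equal to" for the threshold test are assumptions made for this example.

```python
from itertools import permutations


def best_assignment(sim):
    """Exhaustively find the one-to-one matching with maximum total similarity.

    sim[i][j] is the similarity between superpixel i of the first image and
    superpixel j of the second image. This is a stand-in for the Hungarian
    algorithm and is only suitable for very small matrices.
    """
    n1, n2 = len(sim), len(sim[0])
    if n1 > n2:  # search over the smaller side by transposing
        transposed = [[sim[i][j] for i in range(n1)] for j in range(n2)]
        return [(i, j) for j, i in best_assignment(transposed)]
    best_pairs, best_score = [], float("-inf")
    for cols in permutations(range(n2), n1):
        score = sum(sim[i][cols[i]] for i in range(n1))
        if score > best_score:
            best_pairs, best_score = list(enumerate(cols)), score
    return best_pairs


def correspondence_matrix(sim, threshold):
    """Binary matrix C: keep a matched pair only if its similarity reaches the threshold."""
    c = [[0] * len(sim[0]) for _ in sim]
    for i, j in best_assignment(sim):
        if sim[i][j] >= threshold:
            c[i][j] = 1
    return c


sim = [[0.9, 0.2, 0.1],
       [0.3, 0.4, 0.8]]
print(correspondence_matrix(sim, threshold=0.85))
print(correspondence_matrix(sim, threshold=0.50))
```

Lowering the threshold admits more matched pairs into the matrix, while every row and column still contains at most one nonzero entry.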


The constraining maps $g^{(1)}$ and $g^{(2)}$ are constructed using the matrix $C$. Initially, we set $g^{(t)}_p = 0$ for all $p \in P^{(t)}$. For every superpixel pair $l^{(1)}$ and $l^{(2)}$, if $C(l^{(1)}, l^{(2)}) = 1$, we update $g^{(1)}$ and $g^{(2)}$ as follows. For each $p \in P^{(1)}_{l_2 \to l_1}$ we set $g^{(1)}_p = l^{(1)}$. Similarly, for each $p \in P^{(2)}_{l_1 \to l_2}$ we set $g^{(2)}_p = l^{(2)}$.

We repeatedly perform these phases and update the variables. The similarity threshold is initially set to a high value, and the algorithm decreases it at every iteration to allow more superpixel correspondences in the matrix $C$. Decreasing the threshold introduces more nonzero entries in the constraining maps $g^{(1)}$ and $g^{(2)}$, and this helps to reduce the value of the joint energy function further. We continue the iterations until this value stops decreasing for a few iterations. The output of the iterative algorithm is the superpixels computed from the final superpixel maps $f^{(1)}$ and $f^{(2)}$.

4.5 Experimental Results

We use several examples to evaluate the proposed algorithm. Most of the examples were taken from the Middlebury dataset [70]. The weight of the additional unary term was set to 1 and the smoothness weighting parameter was set to 200 empirically. The superpixel similarity weight parameter was set to 0.05. For each pair of images, we first used the superpixelization algorithm by Veksler et al. [65] to generate superpixels independently over the two given image frames. We found repeatedly that the superpixel boundaries at corresponding image points are quite different from each other. This happens mainly in areas where edges are close to each other. Secondly, we applied our joint framework that takes superpixel consistency into consideration. Our proposed algorithm was able to detect most of the inconsistent regions and rectify them. In some cases, our algorithm followed an already existing superpixel boundary present in one of the images and modified the corresponding superpixel in the other image. In other cases, the algorithm completely reconstructed the region for both images to establish the consistency. The results shown in the following figures demonstrate that the proposed algorithm produces more consistent superpixels across the frames. In


each of the Figures (4-4 to 4-5), we show the superpixel maps of the given image pair using Veksler's algorithm, our algorithm, and then an overlay of the superpixel maps from the two approaches to reveal the points of inconsistency in the former approach. We also report the increase in the average similarity of corresponding superpixel pairs between the initiation and the termination of the iterative algorithm for each of the shown image pairs. Table 4-1 summarizes these increases of superpixel similarities. Figure 4-3 shows the improvement in the average superpixel similarity score over iterations for the tested image pairs.

Table 4-1. Superpixel similarity at the initial and final iterations of our iterative joint superpixel generation scheme.

Sequence   Initial Similarity   Final Similarity
Army       0.83                 0.96
Sponza     0.72                 0.82
Wooden     0.75                 0.88
Urban2     0.67                 0.73
Urban3     0.70                 0.80
Mequon     0.62                 0.79


Figure 4-1. Consistent superpixel generation for a pair of images. Top row: Superpixel maps generated independently using Veksler's [65] approach. Middle row: Superpixel maps generated by our algorithm. Bottom row: Overlay of the superpixel maps of the two approaches. Our method produces more consistent superpixels. Parts with noticeable differences are enlarged. Images are best viewed in color.


Figure 4-2. Steps for computing the shape similarity between a pair of superpixels.
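The shape similarity computation can be sketched as a search over displacements for the largest intersection-over-union between two pixel sets. The code below is an illustration under assumptions that the text does not state: displacements are restricted to integer translations within a small window (max_shift), pixels are (row, column) tuples, and the function name is chosen for this example.

```python
def shape_similarity(pixels_1, pixels_2, max_shift=3):
    """Return (score, shift): the best intersection-over-union between
    pixels_1 translated by an integer shift and pixels_2.

    Only translations with |dr|, |dc| <= max_shift are tried, which is an
    assumption of this sketch rather than a documented search strategy.
    """
    target = set(pixels_2)
    best_score, best_shift = -1.0, (0, 0)
    for dr in range(-max_shift, max_shift + 1):
        for dc in range(-max_shift, max_shift + 1):
            moved = {(r + dr, c + dc) for r, c in pixels_1}
            union = len(moved | target)
            score = len(moved & target) / union if union else 0.0
            if score > best_score:
                best_score, best_shift = score, (dr, dc)
    return best_score, best_shift


square = [(0, 0), (0, 1), (1, 0), (1, 1)]
shifted_square = [(2, 1), (2, 2), (3, 1), (3, 2)]
print(shape_similarity(square, shifted_square))
```

The shift returned alongside the score plays the role of the optimal displacement: it is what would be used to transfer the pixels of one superpixel into the other image when the constraining maps are built.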


Figure 4-3. Improvement in the average superpixel similarity score using our iterative joint superpixel generation scheme.


Figure 4-4. Superpixel generation results for the Sponza sequence (Frames 11 and 12). Top row: independent superpixel generation approach. Middle row: our joint superpixel generation algorithm. Bottom row: Overlay of the superpixel maps from the two methods.

Figure 4-5. Superpixel generation results for the Wooden sequence (Frames 11 and 12). Top row: independent superpixel generation approach. Middle row: our joint superpixel generation algorithm. Bottom row: Overlay of the superpixel maps from the two methods.


Figure 4-6. Superpixel generation results for the Urban3 sequence (Frames 11 and 12). Top row: independent superpixel generation approach. Middle row: our joint superpixel generation algorithm. Bottom row: Overlay of the superpixel maps from the two methods.

Figure 4-7. Superpixel generation results for the Mequon sequence (Frames 11 and 12). Top row: independent superpixel generation approach. Middle row: our joint superpixel generation algorithm. Bottom row: Overlay of the superpixel maps from the two methods.


Figure 4-8. Superpixel generation results for the Urban2 sequence (Frames 11 and 12). Top row: independent superpixel generation approach. Middle row: our joint superpixel generation algorithm. Bottom row: Overlay of the superpixel maps from the two methods.

Figure 4-9. Superpixel generation results for the Army sequence (Frames 7 and 8). Top row: independent superpixel generation approach. Middle row: our joint superpixel generation algorithm. Bottom row: Overlay of the superpixel maps from the two methods.


CHAPTER 5
TRACKING USING SUPERPIXEL-BASED APPEARANCE MODEL

5.1 Introduction

Visual tracking in uncontrolled environments is an important problem in computer vision research because of its applications in many fields including surveillance [71], human-computer interaction [72] and medical image analysis [73]. Despite significant advances in recent years, visual tracking is still a challenging problem. The main difficulties in tracking a target over a long period of time lie in handling the appearance changes, illumination, pose and scale variations, and recovering from target occlusions. Therefore, the target appearance model needs to be updated adaptively to handle these challenges. Adaptive models of the target appearance have been proposed in many visual tracking algorithms [4 74 78]. In these models, the appearance is continuously updated with new tracking results. Some tracking algorithms also model the background surrounding the target. Depending on the foreground and the background modeling approach, trackers are divided into two classes, generative and discriminative.

Discriminative tracking methods [5 79] are posed as binary classification problems where the task is to distinguish the target region from the background. This requires modeling the foreground and the background separately, which also helps to handle partial occlusions of the target. However, the update of these appearance models is often very costly. On the other hand, generative tracking methods [76 77 80] track a target object by searching for the region most similar to the reference model. For these methods, the update of the appearance model is more efficient, but the update can cause the tracker to drift away from the target when it is partially occluded. Without an occlusion detection mechanism in generative tracking algorithms, adaptive target modeling can cause the tracker to drift away. In this chapter we propose a generative tracking algorithm that addresses this problem and provides a solution for robust visual tracking. In particular, our tracking algorithm presents a novel superpixel-based

PAGE 83

appearance model with a simple but effective model update mechanism. Our algorithm also robustly handles occlusions and scale variations of the target.

The rest of the Chapter is organized as follows. We first describe related work in Section 5.2. The proposed superpixel-based appearance model is presented in Section 5.3. Section 5.4 outlines the details of the proposed tracking algorithm including the model update and occlusion detection. Experimental results with qualitative and quantitative evaluation and comparison of the tracker with other algorithms are presented in Section 5.5.

5.2 Related Work

Reliable modeling of object appearance is very important for robust tracking. This helps the tracker to find the target amid various challenges in the scene. Also, to handle the variations of the target, the appearance model should be easy to update. For robust and adaptive appearance modeling, tracking algorithms available in the literature use various high-, mid- or low-level image features. In high-level appearance modeling paradigms, the target is considered as a single region and features such as color or intensity histogram [2 81] are used to describe the region. However, this type of modeling completely disregards the spatial configuration of the target and hence the tracker often confuses the target with similar backgrounds. Moreover, this holistic modeling produces jittered tracking results for targets with large shape variations even if the appearance of the target remains stationary [82]. Low-level image cues (for example SIFT [83], SURF [84], LBP [85]) are mainly used in tracking feature points. However, these low-level image cues are not suitable for tracking objects in long sequences.

Recently, mid-level cues were effectively used for tracking objects in long and challenging sequences [75 86 88]. Specifically, the target region is divided into smaller regions and the appearance model is constructed from the feature vectors describing those regions. In [75], the authors propose a tracker that models the target by dividing the target region into fragments. Local histograms computed from the fragments are
used to model the target. This makes the tracker robust against partial occlusions. However, this fragment-based appearance model is not adaptive and hence the tracker fails to track targets having large appearance variations over time. In [86], small image patches are selected from the target to model its appearance. These patches are called attentional regions (AR) of the target. The tracker combines the vote maps obtained from these ARs to decide the target location in the new image frame. However, the authors did not show whether the tracker handles large shape or appearance variations of the target. A similar patch-based appearance model is also proposed in [3] where a particle filtering-based framework is used for tracking. While the authors demonstrate robust tracking results for targets having large shape variations, they report few experiments with occluded targets. Liu et al. [87] present a tracking algorithm where local patches are sparsely coded using a learned dictionary. However, sparse coding of a large number of patches and dynamic update of the dictionary are very costly for tracking applications.

While all of these aforementioned approaches use rectangular patches, some algorithms use superpixels to model the target. In [89], a superpixel-based tracking algorithm was proposed that computes superpixels from an image frame and classifies them as foreground or background. It is very inefficient to superpixelize a whole image just to track a small target in the image. To alleviate this problem, the authors in [90] compute superpixels from a limited region around the target.

Apart from reliable modeling of the visual target, another important factor of a tracking algorithm is to update the model to adapt to the variations of the target. In the Visual Tracking Decomposition (VTD) framework [78], several basic appearance models are constructed using the initial target region and the four most recent tracking results. Basic trackers constructed from these models are used interactively to track the target. This method performs well with the proposed adaptive appearance model for targets having pose and illumination variations. However, the algorithm loses the
target under heavy occlusion. The PROST tracking algorithm [4] uses three different tracking modules: non-adaptive, fully adaptive and semi-adaptive. The PROST algorithm proposes some criteria to update the semi-adaptive module based on the outputs from the two other modules. Although the algorithm can track occluded targets, it does not explicitly detect occlusion. The MILTrack algorithm [5] is considered to be a state-of-the-art discriminative tracking algorithm that uses multiple-instance learning [91]. The foreground and background appearance (positive and negative samples) are updated regularly to deal with the appearance variations of the target.

Our proposed algorithm uses mid-level image features to model the target appearance. In particular, the algorithm models the target using superpixels [60 64 65]. Our algorithm computes a superpixel map and then superimposes a rectangular grid on top of the map. Each grid point is assigned the feature vector associated with the corresponding superpixel. Hence, a superpixel-based similarity measure is evaluated between the model and the candidate windows to find the target in the new frame. In order to account for appearance variations, the model is updated with new tracking results. To keep the model robust, the model uses superpixels from the initial frame and the most recent frame. Using these superpixels, we construct the adaptive appearance model. The proposed algorithm also uses an occlusion detection module that guides the model update process. While using superpixels in tracking has been done before [89 90], our work is distinguished by three aspects. First, we incorporate spatial information by placing a grid on top of the superpixel map. Second, our model update process is much simpler than that of [90]. Third, our model is generative while the previous approaches are discriminative ones. Our experimental results show that the algorithm produces better results. The results will be illustrated in Section 5.5.

5.3 The Appearance Model

In this section, we describe the construction of the target appearance model. The initial appearance model is constructed from the first frame of the sequence. The target
is manually specified as a rectangular region $R$ in this frame. Inside $R$, we establish a rectangular grid $G$ as shown in Figure 5-1. Assuming that the grid $G$ contains $N$ grid points, the collection of these grid points is specified by $G = \{g_i\}_{i=1}^{N}$. The relative location of a grid point $g_i$ is denoted by $p_i = \mathrm{pos}(g_i) - \mathrm{pos}(\mathrm{upperLeft}(R))$, where $\mathrm{pos}(g_i)$ is the pixel location of the grid point $g_i$ and $\mathrm{pos}(\mathrm{upperLeft}(R))$ is the pixel location of the upper left corner of the initial rectangle $R$. Each of these grid points is annotated with a feature vector $f_i$. A collection of these feature vectors, $F = \{f_i\}_{i=1}^{N}$, on the grid constitutes the initial appearance model of the target. The spatial information of the target is thus encoded in the proposed appearance model through the grid $G$ and the feature set $F$. This spatial model makes the tracker robust against partial occlusions and also helps it to deal with targets of complex appearances.

In this work, we use mid-level image cues to compute the feature vector set $F$. More specifically, the superpixels computed inside the rectangle $R$ are used for this purpose as shown in Figure 5-1. To extract the superpixels, we use the energy minimization algorithm proposed in [65]. This algorithm efficiently computes superpixels whose boundaries are aligned with the edges present in the image. The algorithm produces a superpixel map $M$ that has the same dimension as the rectangle $R$. For each grid point $g_i$, we determine the associated superpixel from the superpixel map $M$. In particular, let $m_i = M(p_i)$ specify the associated superpixel of the grid point $g_i$ located at the position $p_i$. The intensity or color histogram $h_{m_i}$ of the superpixel $m_i$ is assigned as the feature $f_i$ for the grid point $g_i$.

Instead of computing the superpixels, one possible alternative would have been to use the histogram computed from the rectangular image patch centered at $\mathrm{pos}(g_i)$. Although this way is computationally efficient, it completely ignores the edges of the image. On the other hand, our approach of using the superpixels implicitly encodes the edges in the target appearance model as well. This in fact makes the appearance model more informative about the target than the patch-based alternative. Moreover, the
superpixel computation algorithm we use runs reasonably fast in the rectangle $R$ which is only a small region of the entire image. Update of our proposed appearance model is also very simple and intuitive. The feature vector set $F$ is updated to handle the target appearance change due to pose or illumination variations. To handle the target scale variations, the grid $G$ is updated. In the next section, we present our tracking algorithm using this novel appearance model.

5.4 Tracking Algorithm

At the beginning of the proposed algorithm, the initial appearance model is constructed using the first frame as described in Section 5.3. Then, for each subsequent frame $I_t$ of the sequence, the proposed tracking algorithm produces a rectangle $R_t$ enclosing the target. Localizing the target in the new image is formulated as a search problem in our algorithm (Section 5.4.1). Briefly, this search is performed as follows. At first, a set of candidate rectangles $\mathcal{R} = \{R_j\}_{j=1}^{J}$ is generated. Then, the algorithm evaluates the similarity between the current target appearance model and the appearance of the candidate rectangles. And finally, the candidate rectangle having the highest similarity score is selected as the rectangle $R_t$ containing the target. The next step after the localization of the target is to determine whether the target is being occluded or not (Section 5.4.2). Update of the appearance model is performed next (Section 5.4.3). This update depends on the outcome of the occlusion detection step. The final step is to adjust the scale of the appearance model (Section 5.4.4). These four steps (target localization, occlusion detection, model update and scaling) are repeated in each frame to track the target. The high-level overview of the algorithm is presented in Figure 5-2.

5.4.1 Target Localization

In this step, we determine the best rectangle $R_t$ in the image $I_t$ that is the most similar to the current target appearance model. The best rectangle $R_t$ is selected from a set of candidate rectangles $\mathcal{R}$ as follows. Let $R_{t-1}$ be the best window in the image $I_{t-1}$. Also let $c_{t-1}$, $w_{t-1}$ and $h_{t-1}$ be the center, width and height of $R_{t-1}$, respectively.
Any rectangle $R_j$ in the image $I_t$ whose dimensions are $w_{t-1} \times h_{t-1}$ and whose center lies within $d$ distance from the pixel location $c_{t-1}$ is included in the set $\mathcal{R}$. The union of the rectangles in the set $\mathcal{R}$ forms a rectangular region $\bar{R}$ in the image $I_t$ whose center is at the position $c_{t-1}$ and whose dimensions are $(w_{t-1} + 2d) \times (h_{t-1} + 2d)$. The region $\bar{R}$ is superpixelized in order to compute the superpixel-based appearance of each candidate rectangle $R_j$ in the set $\mathcal{R}$. The superpixel generation algorithm [65] produces a superpixel map $M$ for the region $\bar{R}$. The intensity histograms $\{h_k\}_{k=1}^{S}$ of these superpixels $\{s_k\}_{k=1}^{S}$ are then estimated, where $S$ is the number of superpixels generated by the algorithm. For each pixel $p$ in $\bar{R}$, the corresponding superpixel can be determined from $M(l)$ where $l = \mathrm{pos}(p) - \mathrm{pos}(\mathrm{upperLeft}(\bar{R}))$ and $\mathrm{pos}(x)$ is the pixel location of $x$. The computation of the appearance of the rectangle $R_j$ is done as follows.

First, we superimpose the grid $G$ on the rectangle $R_j$. This gives us the grid-point locations $\{p'_i\}_{i=1}^{N}$ of $R_j$. Then, using these locations, we determine the superpixels and histograms associated with each of the grid-points. The resultant histogram set $F^j = \{f^j_i\}_{i=1}^{N}$ represents the appearance of the candidate rectangle. Then we compute the similarity between the histogram set $F^j$ of the candidate rectangle $R_j$ and the corresponding set $F$ of the target model. We define the similarity score between two histogram sets $\mathrm{sim}(F^1, F^2)$ as follows

$$\mathrm{sim}(F^1, F^2) = \sum_{i=1}^{N} \mathrm{hist\_sim}(f^1_i, f^2_i) \qquad (5\text{-}1)$$

where $\mathrm{hist\_sim}(f^1_i, f^2_i)$ represents the similarity between the histograms $f^1_i$ and $f^2_i$. We use histogram intersection [68] to calculate the similarity between the histograms

$$\mathrm{hist\_sim}(f^1_i, f^2_i) = \sum_{b=1}^{B} \min\left(f^1_i(b), f^2_i(b)\right) \qquad (5\text{-}2)$$
where $B$ is the number of bins in the histogram. We use Equation 5-1 to calculate the appearance similarity of each candidate rectangle in the set $\mathcal{R}$. Then, the rectangle with the highest similarity is chosen to be the target rectangle $R_t$.

5.4.2 Occlusion Detection

Our tracking algorithm uses an adaptive appearance model of the target. The appearance model is updated when a new tracking result is available. However, the update may cause the tracker to drift away from the target if it is currently under partial occlusion and the appearance model is updated with this partially occluded tracking result. For this reason, our proposed tracking algorithm first detects whether the target is being occluded or not. Then, the algorithm updates the appearance model. Next, we describe the occlusion detection step.

In the target localization step, the rectangle $R_t$ containing the target in the image $I_t$ is determined. The appearance of $R_t$ is further analyzed in the occlusion detection step. In particular, we estimate the appearance similarity of $R_t$ with recent backgrounds. If this appearance similarity score is large (i.e. the appearance of $R_t$ is highly similar to recent backgrounds), we infer that the target is occluded. The recent backgrounds are modeled as a set of rectangles $\mathcal{B}$ taken from the $L$ most recently processed frames $(I_{t-1}, I_{t-2}, \ldots, I_{t-L})$. At each frame $I_{t-j}$, we include the rectangle $B_{t-j}$, centered at $c_t$ and with dimensions $w_t \times h_t$, in the set $\mathcal{B}$ only if $\lVert c_t - c_{t-j} \rVert > d_{occ}$ (i.e. if the center of the rectangle $B_{t-j}$ is more than $d_{occ}$ away from the center of the rectangle $R_{t-j}$). We estimate the similarities between $R_t$ and each rectangle in the set $\mathcal{B}$ and calculate the average value $\tau$. If $\tau > \tau_{occ}$, we conclude that the target is partially occluded. Here, $\tau_{occ}$ is the occlusion detection threshold.

5.4.3 Model Update

As described in Section 5.3, the proposed target appearance model consists of a grid $G$ and a collection of feature vectors $F$ (each grid point $g_i \in G$ has a feature vector $f_i \in F$). Let $F^1$ and $F^t$ denote the feature vector sets obtained by imposing the grid $G$
on the rectangles $R_1$ and $R_t$, respectively. In order to handle the appearance variations of the target, the model is continuously updated using the feature sets $F^1$ and $F^t$. Using only the initial and the most recent frames keeps the model update simple but more importantly robust against any drift [78]. Now, the model feature set $F$ is constructed from $F^1$ and $F^t$ as follows.

Each feature vector $f_i \in F$ is constructed as a convex combination of the vectors $f^1_i \in F^1$ and $f^t_i \in F^t$. The weight of this convex combination depends on the outcome of the occlusion detection step. If the target is not occluded ($\tau \le \tau_{occ}$), we put more weight on the current features $f^t_i$, otherwise, more weight is given to the initial features $f^1_i$

$$f_i = \begin{cases} w_1 f^1_i + (1 - w_1) f^t_i & \text{if } \tau > \tau_{occ} \\ w_2 f^1_i + (1 - w_2) f^t_i & \text{if } \tau \le \tau_{occ} \end{cases} \qquad (5\text{-}3)$$

where $w_1 > w_2$. Also, when the algorithm detects occlusion ($\tau > \tau_{occ}$), the target is more likely to be occluded for some time. Therefore, for the next few frames, the algorithm skips the occlusion detection step and favors the initial features.

5.4.4 Scaling

The final step of our algorithm is to determine the scale of the target. The target appearance model is updated accordingly to handle the scale variations. To determine the scale of the target on the current frame, a set of rectangles are generated by varying the size of the rectangle $R_t$ while keeping the new rectangle centers fixed at $c_t$ (center of $R_t$). The appearances of these rectangles are compared with the target appearance model and the scale of the rectangle having the highest similarity is selected as the updated scale of the target. The update of the model is performed as follows. Firstly, the appearance model is resized with the selected scale. Then, a new grid $G'$ is created to replace $G$. And finally, the feature vector set $F$ is recomputed using the new grid $G'$. In our implementation, we perform this scaling step after every 4 to 5 frames rather than at every frame.
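The similarity score used for localization and the occlusion-dependent model update can be sketched in a few lines of Python. This is a minimal illustration and not the dissertation's implementation: histograms are assumed to be plain lists of floats, the function names `hist_sim`, `sim`, `localize` and `update_model` are hypothetical, and the default weights are arbitrary picks from the ranges reported for the experiments.

```python
from typing import List, Sequence, Tuple

FeatureSet = List[List[float]]  # one histogram per grid point


def hist_sim(f1: Sequence[float], f2: Sequence[float]) -> float:
    """Histogram intersection: sum of bin-wise minima."""
    if len(f1) != len(f2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(f1, f2))


def sim(F1: FeatureSet, F2: FeatureSet) -> float:
    """Similarity of two feature sets: sum of hist_sim over grid points."""
    if len(F1) != len(F2):
        raise ValueError("feature sets must cover the same grid")
    return sum(hist_sim(a, b) for a, b in zip(F1, F2))


def localize(model: FeatureSet, candidates: List[FeatureSet]) -> Tuple[int, float]:
    """Return the index and score of the candidate most similar to the model."""
    scores = [sim(model, Fj) for Fj in candidates]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores[best]


def update_model(F_init: FeatureSet, F_cur: FeatureSet, occluded: bool,
                 w1: float = 0.85, w2: float = 0.55) -> FeatureSet:
    """Convex combination of initial and current features.

    w1 weights the initial features when the target is occluded, w2 when it
    is not; the defaults are illustrative assumptions, with w1 > w2 required.
    """
    if not w1 > w2:
        raise ValueError("w1 must be larger than w2")
    w = w1 if occluded else w2
    return [[w * a + (1.0 - w) * b for a, b in zip(f1, ft)]
            for f1, ft in zip(F_init, F_cur)]
```

The same `localize` call could also serve the scaling step, with the candidates taken from rectangles of different sizes centered at the current target location.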

5.5 Experimental Results

We evaluated our algorithm using twelve challenging sequences: Sylvester sequence [76], Woman and Faceocc1 sequences [75], Basketball and Singer1 sequences [78], Tiger1, Tiger2 and Faceocc2 [5] and Lemming, Liquor, Box and Board sequences [4]. The tracking performance of our proposed structured superpixel-based tracker (SSP) has been compared with several state-of-the-art tracking algorithms: the multiple-instance-learning tracker (MIL) [5], the visual tracking decomposition tracker (VTD) [78], the SemiBoost tracker (SB) [79], the super-pixel tracker (SPT) [90], the Online AdaBoost tracker (OAB) [92], the Online Random Forest (ORF) tracker [93], the Parallel Robust Online Simple Tracker (PROST) [4], and the fragment-based tracker (FRAG) [75].

5.5.1 Experimental Setup

For each video sequence, we compute the superpixel map of the initial frame using the energy minimization framework of Veksler et al. [65] where the patch size is set to 8-14 pixels, the smoothness weight term to 100 and the number of iterations to 10. Then a grid with a spacing of 4-7 pixels is placed on top of the superpixel map. This grid spacing is selected based on the size of the target so that the number of grid points is roughly 200 in all experiments. The superpixel features are intensity histograms of 16-20 bins where different color channels are processed independently. The similarity between tracking windows is computed according to Equation 5-1 and then normalized between 0 and 1. For occlusion detection, the 5 most recent frames are used to model the background. A distance threshold of 10-15 pixels is used to decide whether a window belongs to the background. The occlusion detection threshold is set to 0.6-0.7 depending on the foreground-background similarity of a specific sequence. The model update weights $w_1$ and $w_2$ in Equation 5-3 are set within the ranges 0.8-0.9 and 0.5-0.6, respectively.
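To make the role of these settings concrete, the occlusion test of Section 5.4.2 can be sketched as below. Several details are assumptions made for illustration only: the names `OcclusionParams`, `normalized_sim` and `is_occluded` are hypothetical, the concrete defaults are arbitrary picks from the reported ranges, the similarity is normalized by dividing by the number of grid points (which yields a value in [0, 1] only if every histogram sums to 1), and a frame with no admissible background window is treated as not occluded.

```python
from dataclasses import dataclass
from math import hypot
from typing import List, Tuple

FeatureSet = List[List[float]]
Point = Tuple[float, float]


@dataclass(frozen=True)
class OcclusionParams:
    num_frames: int = 5     # number of recent frames used as background
    d_occ: float = 12.0     # distance threshold in pixels (reported range 10-15)
    tau_occ: float = 0.65   # occlusion threshold (reported range 0.6-0.7)


def normalized_sim(F1: FeatureSet, F2: FeatureSet) -> float:
    """Summed histogram intersection divided by the number of grid points.

    Assumes unit-sum histograms so that the result lies in [0, 1].
    """
    total = sum(sum(min(a, b) for a, b in zip(f1, f2)) for f1, f2 in zip(F1, F2))
    return total / len(F1)


def is_occluded(F_t: FeatureSet, c_t: Point,
                history: List[Tuple[Point, FeatureSet]],
                params: OcclusionParams = OcclusionParams()) -> bool:
    """Compare the current target window with recent background windows.

    history holds, most recent first, pairs of (target center in an earlier
    frame, features of the window cut from that frame at the current center).
    A window counts as background only if the earlier target center is
    farther than d_occ from the current center.
    """
    scores = []
    for c_prev, F_b in history[:params.num_frames]:
        if hypot(c_t[0] - c_prev[0], c_t[1] - c_prev[1]) > params.d_occ:
            scores.append(normalized_sim(F_t, F_b))
    if not scores:
        return False  # assumption: nothing to compare against
    return sum(scores) / len(scores) > params.tau_occ
```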

5.5.2 Results and Discussion

Table 5-1 summarizes the results on the ten sequences whose ground-truth data are available, using the average center location error [5] as the quality metric. For space limitations we show the comparisons with five methods only. Please check the supplemental materials for further experimental results. In most sequences, our method (SSP) most accurately tracked the target even with the presence of occlusions, pose variations, illumination changes and abrupt motions. In spite of the motion blur and heavy occlusions in the long Liquor sequence, our method achieved a significantly lower error than all other methods. Also, our error result for the Lemming sequence is very low although this sequence experiences large scale variations and motion blur.

In Figures 5-3 and 5-4, we present the results of our algorithm for the twelve aforementioned sequences. The Singer1 sequence contains large scale and illumination variations. Still, our tracker follows the singer accurately thanks to the robust adaptivity of our model. The Tiger1 and Tiger2 sequences exhibit fast motion, heavy occlusion and drastic appearance changes. Despite all of these challenges, our tracker robustly tracks the tiger faces in both sequences. Numerical error results in Table 5-1 support this conclusion. Indeed, our algorithm achieves the best performance for the Tiger2 sequence and closely matches the PROST tracker for the Tiger1 sequence.

The Faceocc1, Faceocc2 and Woman sequences all suffer from severe occlusion and heavy appearance changes. However, because of the spatial information encoded into the proposed appearance model, our tracker managed to keep track of the respective objects (Figure 5-3). It achieved the lowest errors on Faceocc2 and Woman and, at 6.9 pixels, came within half a pixel of the best result on Faceocc1 (6.5 pixels, FRAG).

Figures 5-5 and 5-6 compare the performance of our algorithm with other algorithms on a few selected frames. As we can see from the examples, our tracker recovers from severe occlusions and appearance changes while most of the other trackers fail. For example, in the Board sequence, our tracker detects the moving circuit
accurately while other trackers drift away. Similarly, for the Box sequence, our tracker closely matches the PROST tracker in following the target. Figure 5-7 compares the per-frame tracking errors of our method with the other reference methods. On most sequences, the error of our method is consistently lower than the errors of the other methods.

Table 5-1. Quantitative evaluation of the SSP tracking algorithm. The numbers indicate the average center location errors (in pixels) of the tracking windows. The number of frames for each sequence is shown in parentheses next to the sequence name.

Sequence (frames)   SSP (Ours)   MIL [5]   SB [79]   SPT [90]   PROST [4]   FRAG [75]
Board (698)         14.2         51.2      -         -          39.0        90.1
Box (1161)          17.1         104.6     -         -          12.1        57.4
Lemming (1336)      9.4          14.9      -         7.0        25.4        82.8
Liquor (1741)       5.3          165.1     -         9.0        21.6        30.7
Tiger1 (354)        10.0         15.0      46.0      -          7.0         40.0
Tiger2 (360)        14.0         17.0      53.0      -          -           17.0
Sylvester (1344)    7.0          9.4       16.0      -          11.0        11.0
Faceocc1 (886)      6.9          18.4      7.0       -          7.0         6.5
Faceocc2 (812)      13.1         20.0      23.0      -          17.0        45.0
Woman (539)         5.0          120.0     -         9.0        -           112.0
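The metric reported in Table 5-1 can be computed as in the following sketch. It assumes that each tracking window and each ground-truth window is given as an `(x, y, width, height)` tuple with `(x, y)` the upper left corner; this representation and the function names are illustrative assumptions, not taken from the evaluation code used for the table.

```python
from math import hypot
from typing import List, Sequence, Tuple

Rect = Tuple[float, float, float, float]  # (x, y, width, height), assumed layout


def center(rect: Rect) -> Tuple[float, float]:
    """Center of a rectangle given by its upper left corner and size."""
    x, y, w, h = rect
    return x + w / 2.0, y + h / 2.0


def center_location_errors(tracked: Sequence[Rect],
                           ground_truth: Sequence[Rect]) -> List[float]:
    """Per-frame Euclidean distance (in pixels) between window centers."""
    if len(tracked) != len(ground_truth):
        raise ValueError("need one ground-truth window per tracked window")
    errors = []
    for r, g in zip(tracked, ground_truth):
        (rx, ry), (gx, gy) = center(r), center(g)
        errors.append(hypot(rx - gx, ry - gy))
    return errors


def average_center_location_error(tracked: Sequence[Rect],
                                  ground_truth: Sequence[Rect]) -> float:
    """Mean of the per-frame center location errors."""
    errors = center_location_errors(tracked, ground_truth)
    return sum(errors) / len(errors)
```

The per-frame list corresponds to the kind of curve plotted in Figure 5-7, and its mean to one cell of the table.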

Figure 5-1. Construction of the superpixel-based appearance model. Given an image window, we first generate the superpixels defined by the red boundaries. Secondly, the histogram features are computed for each superpixel. Thirdly, we superimpose a grid on top of the superpixel map. Finally, each grid point (in green) is associated with the enclosing superpixel and its histogram feature vector.

Algorithm Outline
1. Initialization: Initialize the tracker and the target appearance model (Section 5.3).
2. Target Localization: Find the rectangle $R_t$ containing the target (Section 5.4.1).
3. Occlusion Detection: Decide whether the target is undergoing occlusion or not (Section 5.4.2).
4. Model Update: Update the appearance model using the most recent tracking result (Section 5.4.3).
5. Scaling: Adjust the scale of the appearance model (Section 5.4.4).

Figure 5-2. High-level outline of the SSP tracking algorithm.
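The model construction illustrated in Figure 5-1 can be sketched as follows, assuming the superpixel map is already available as a label array of the same size as the target window. The superpixel algorithm itself is not reproduced here. The function names, the grayscale 0-255 intensity range, the unit-sum normalization of the histograms and the placement of the first grid point at half the spacing are all assumptions made for this illustration.

```python
from typing import Dict, List, Tuple

Image = List[List[int]]   # grayscale intensities in 0..255 (assumed)
Labels = List[List[int]]  # superpixel id per pixel, same shape as the image


def superpixel_histograms(image: Image, labels: Labels,
                          bins: int = 16, max_val: int = 256) -> Dict[int, List[float]]:
    """Unit-sum intensity histogram for every superpixel id in the label map."""
    counts: Dict[int, List[float]] = {}
    for img_row, lab_row in zip(image, labels):
        for value, sp in zip(img_row, lab_row):
            hist = counts.setdefault(sp, [0.0] * bins)
            hist[min(value * bins // max_val, bins - 1)] += 1.0
    return {sp: [c / sum(h) for c in h] for sp, h in counts.items()}


def build_model(image: Image, labels: Labels, spacing: int,
                bins: int = 16) -> Tuple[List[Tuple[int, int]], List[List[float]]]:
    """Return grid points (x, y) relative to the window's upper left corner
    and, for each, the histogram of the superpixel that encloses it."""
    hists = superpixel_histograms(image, labels, bins)
    height, width = len(labels), len(labels[0])
    grid: List[Tuple[int, int]] = []
    features: List[List[float]] = []
    for y in range(spacing // 2, height, spacing):
        for x in range(spacing // 2, width, spacing):
            grid.append((x, y))
            features.append(list(hists[labels[y][x]]))  # copy per grid point
    return grid, features
```

Because the grid locations are stored relative to the window corner, the same grid can be superimposed on any candidate rectangle of equal size to obtain its feature set.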

Figure 5-3. Tracking results using the SSP tracking algorithm. Frames shown: Sylvester #129, #715, #1105; Woman #122, #240, #332; Faceocc1 #102, #277, #537; Faceocc2 #93, #180, #657; Tiger1 #24, #78, #200; Tiger2 #6, #149, #228; Singer1 #58, #136, #191; Basketball #84, #284, #625.

Figure 5-4. Tracking results using the SSP tracking algorithm on PROST sequences. Frames shown: Box #236, #622, #977; Board #180, #456, #510; Lemming #330, #586, #948; Liquor #359, #778, #1042.

Figure 5-5. Visual comparison of tracking results obtained using different algorithms. Tracking results by our method (SSP), MIL, PROST and FRAG are represented by red, blue, cyan and green rectangles, respectively. Frames shown: Tiger1 #47, #85, #141; Tiger2 #185, #232, #265; Sylvester #833, #1087, #1177.

Figure 5-6. Visual comparison of tracking results obtained using different algorithms on PROST sequences. Tracking results by our method (SSP), MIL, PROST and FRAG methods are represented by red, blue, cyan and green rectangles, respectively. Frames shown: Box #306, #500, #1093; Board #442, #527, #649; Liquor #449, #779, #1097; Lemming #328, #589, #1064.

Figure 5-7. Per-frame tracking error plots using different tracking algorithms. For each frame the center location error is shown on the Y-axis. Our method (SSP), MIL, PROST, FRAG, SB and OAB are represented by red, blue, cyan, green, black and magenta lines, respectively. Panels: Box, Board, Lemming, Liquor, Tiger1, Tiger2, Sylvester, Faceocc2.

CHAPTER 6
CONCLUSION

In this dissertation, we presented novel algorithms for tracking articulated and non-articulated objects. We used mid-level image cues to efficiently model the targets. Our novel appearance models have been successfully used to track the targets in many challenging sequences.

We developed a tracking algorithm for articulated targets that uses a small number of rectangular blocks for efficient and rapid evaluations of intensity histograms for tracking. Trackers using integral histograms have the ability to quickly scan the entire image for localizing the target, and this gives them the capability to track rapid motions and across different shots. However, for objects undergoing significant shape deformation, accurate foreground histogram estimations become difficult and the challenge is to reliably and robustly track these objects while maintaining similar performance (frame rate). We started with the natural idea of approximating irregular shape with rectangular blocks, and adaptively adjusted the block structure to approximate the foreground shape. The main point was not to estimate the shape exactly as in many contour trackers (which invariably require shape priors and dynamics) but to approximate it with a union of rectangular blocks, with which we rapidly evaluated the foreground histogram. The BHT algorithm put this idea to work and experimental results showed great improvements in robustness and stability over a typical histogram-based tracker, without compromising the performance.

We also presented a modified version of the BHT algorithm and developed a GPU-accelerated real-time and robust tracking system. The system was implemented on the GPU using the CUDA programming model proposed by NVIDIA. The modified p-BHT algorithm combined detection steps of several frames together and processed them in parallel. Apart from this, distances of the candidate tracking windows in the detection step were also computed in parallel. This further improved the computational
performance. To deal with the target's shape changes, the block structure of the appearance model was updated regularly every Nth frame. Our experimental results showed that the algorithm runs 7-8 times faster than the BHT algorithm.

We then discussed a novel algorithm to find consistent superpixels from image pairs of the same scene. In particular, we introduced novel superpixel shape and appearance measures that were used to guide the joint superpixelization process. Our approach showed good consistency results when compared with traditional superpixel techniques.

Finally, we described a novel structured superpixel-based appearance model for non-articulated object tracking. Our appearance model was simple, reliable and easy to update. The SSP tracker successfully tracked targets undergoing severe shape, scale and pose variations. As well, the SSP tracker robustly handled occluded targets. Qualitative and quantitative evaluations of the tracker demonstrated that our algorithm outperforms many state-of-the-art tracking algorithms on challenging sequences.

REFERENCES

[1] D. Comaniciu, V. Ramesh, P. Meer, Real-time tracking of non-rigid objects using mean shift, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2000, pp. 142.
[2] F. M. Porikli, Integral histogram: A fast way to extract histograms in cartesian spaces, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2005, pp. 829.
[3] J. Kwon, K. M. Lee, Tracking of a non-rigid object via patch-based dynamic appearance modeling and adaptive basin hopping monte carlo sampling, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2009, pp. 1208.
[4] J. Santner, C. Leistner, A. Saffari, T. Pock, H. Bischof, Prost: Parallel robust online simple tracking, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2010, pp. 723.
[5] B. Babenko, M.-H. Yang, S. Belongie, Robust object tracking with online multiple instance learning, IEEE Transactions on Pattern Analysis and Machine Intelligence 33 (8) (2011) 1619.
[6] C. Stauffer, W. E. L. Grimson, Learning patterns of activity using real-time tracking, IEEE Transactions on Pattern Analysis and Machine Intelligence 22 (2000) 747.
[7] B. Leibe, N. Cornelis, K. Cornelis, L. J. V. Gool, Dynamic 3d scene analysis from a moving vehicle, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2007, pp. 1.
[8] T. Mcinerney, D. Terzopoulos, A dynamic finite element surface model for segmentation and tracking in multidimensional medical images with application to cardiac 4d image analysis, in: Computerized Medical Imaging and Graphics, Vol. 19, 1995, pp. 69.
[9] Z. Zhu, Q. Ji, Eye gaze tracking under natural head movements, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Vol. 1, 2005, pp. 918.
[10] V. B. Zordan, J. K. Hodgins, Motion capture-driven simulations that hit and react, in: Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer animation, 2002, pp. 89.
[11] Y. Wu, J. Lin, T. S. Huang, Analyzing and capturing articulated hand motion in image sequences, IEEE Transactions on Pattern Analysis and Machine Intelligence 27 (2005) 1910.
[12] D. N. Zotkin, V. C. Raykar, R. Duraiswami, L. S. Davis, Multimodal tracking for smart videoconferencing and video surveillance, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2007, pp. 1.
[13] S. Birchfield, Elliptical head tracking using intensity gradients and color histograms, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 1998, pp. 232.
[14] P. A. Viola, M. J. Jones, Rapid object detection using a boosted cascade of simple features, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2001, pp. 511.
[15] Z. Fan, M. Yang, Y. Wu, G. Hua, T. Yu, Efficient optimal kernel placement for reliable visual tracking, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2006, pp. 658.
[16] V. Parameswaran, V. Ramesh, I. Zoghlami, Tunable kernels for tracking, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2006, pp. 2179.
[17] S. Birchfield, S. Rangarajan, Spatiograms versus histograms for region-based tracking, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2005, pp. 1158.
[18] J. K. Aggarwal, Q. Cai, Human motion analysis: A review, Computer Vision and Image Understanding 73 (1999) 428.
[19] D. Gavrila, The visual analysis of human movement: A survey, Computer Vision and Image Understanding 73 (1999) 82.
[20] T. B. Moeslund, E. Granum, A survey of computer vision-based human motion capture, Computer Vision and Image Understanding 81 (2001) 231.
[21] T. Moeslund, A. Hilton, V. Kruger, A survey of advances in vision-based human motion capture and analysis, Computer Vision and Image Understanding 104 (2) (2006) 90.
[22] J. M. Rehg, T. Kanade, Model-based tracking of self-occluding articulated objects, in: Proceedings of the International Conference on Computer Vision, 1995, pp. 612.
[23] M. J. Black, A. D. Jepson, Eigentracking: Robust matching and tracking of articulated objects using a view-based representation, International Journal of Computer Vision 26 (1998) 63.
[24] A. Blake, M. Isard, Active Contours, Springer, 1998.
[25] M. La Cascia, S. Sclaroff, V. Athitsos, Fast, reliable head tracking under varying illumination: An approach based on registration of texture-mapped 3D models, IEEE Transactions on Pattern Analysis and Machine Intelligence 22 (2000) 322.
[26] S. Sclaroff, J. Isidoro, Active blobs, in: Proceedings of the International Conference on Computer Vision, 1998, pp. 1146.
[27] M. Kass, A. P. Witkin, D. Terzopoulos, Snakes: Active contour models, International Journal of Computer Vision 1 (1988) 321.
[28] K. Toyama, A. Blake, Probabilistic tracking in a metric space, in: Proceedings of the International Conference on Computer Vision, 2001, pp. 50.
[29] T. F. Cootes, G. J. Edwards, C. J. Taylor, Active appearance models, in: Proceedings of the European Conference on Computer Vision, 1998, pp. 484.
[30] V. Caselles, R. Kimmel, G. Sapiro, Geodesic active contours, International Journal of Computer Vision 22 (1997) 61.
[31] D. Cremers, Nonlinear dynamical shape priors for level set segmentation, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2007, pp. 1.
[32] N. Paragios, R. Deriche, Geodesic active contours and level sets for the detection and tracking of moving objects, IEEE Transactions on Pattern Analysis and Machine Intelligence 22 (2000) 266.
[33] T. Zhang, D. Freedman, Tracking objects using density matching and shape priors, in: Proceedings of the International Conference on Computer Vision, 2003, pp. 1056.
[34] R. T. Collins, Y. Liu, On-line selection of discriminative tracking features, in: Proceedings of the International Conference on Computer Vision, 2003, pp. 346.
[35] G. D. Hager, M. Dewan, C. V. Stewart, Multiple kernel tracking with ssd, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2004, pp. 790.
[36] R. T. Collins, Mean-shift blob tracking through scale space, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2003, pp. 234.
[37] H. Grabner, H. Bischof, On-line boosting and vision, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2006, pp. 260.
[38] P. F. Felzenszwalb, D. P. Huttenlocher, Pictorial structures for object recognition, International Journal of Computer Vision 61 (2005) 55.
[39] D. Freedman, T. Zhang, Interactive graph cut based segmentation with shape priors, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2005, pp. 755.
[40] Y. Boykov, M.-P. Jolly, Interactive graph cuts for optimal boundary and region segmentation of objects in n-d images, in: Proceedings of the International Conference on Computer Vision, 2001, pp. 105.
[41] J. Ho, K.-C. Lee, M.-H. Yang, D. J. Kriegman, Visual tracking using learned subspaces, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2004, pp. 782.
[42] J. Lim, D. A. Ross, R.-S. Lin, M.-H. Yang, Incremental learning for visual tracking, in: Advances in Neural Information Processing Systems, 2004, pp. 793.
[43] A. O. Balan, M. J. Black, An adaptive appearance model approach for model-based articulated object tracking, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2006, pp. 758.
[44] D. Ramanan, C. Sminchisescu, Training deformable models for localization, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2006, pp. 206.
[45] O. M. Lozano, K. Otsuka, Simultaneous and fast 3d tracking of multiple faces in video by gpu-based stream processing, in: Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 2008, pp. 713.
[46] S. N. Sinha, J.-M. Frahm, M. Pollefeys, Y. Genc, Feature tracking and matching in video using programmable graphics hardware, Machine Vision and Applications 22 (2011) 207.
[47] M. Lalonde, D. Byrns, L. Gagnon, N. Teasdale, D. Laurendeau, Real-time eye blink detection with GPU-based SIFT tracking, in: Canadian Conference on Computer and Robot Vision, 2007, pp. 481.
[48] J. Fung, S. Mann, Openvidia: Parallel gpu computer vision, in: Proceedings of ACM International Conference on Multimedia, 2005, pp. 849.
[49] A. Montemayor, B. Payne, J. Pantrigo, R. Cabido, A. Sanchez, F. Fernandez, Improving GPU particle filter with shader model 3.0 for visual tracking, in: ACM SIGGRAPH Research posters, 2006, pp. 1.
[50] A. Montemayor, J. Pantrigo, A. Sanchez, F. Fernandez, Particle filter on GPUs for real-time tracking, in: ACM SIGGRAPH posters, 2004, pp. 1.
[51] P. Lanvin, J.-C. Noyer, M. Benjelloun, An hardware architecture for 3D object tracking and motion estimation, in: Proceedings of IEEE International Conference on Multimedia and Expo, 2005, pp. 326.

[52] R. Cabido, A. S. Montemayor, J. J. Pantrigo, B. R. Payne, Multi-resolution and local search methods for optimizing visual tracking processes on GPU, in: ACM SIGGRAPH posters, 2007, pp. 1.
[53] S. Baker, I. Matthews, Lucas-Kanade 20 Years On: A Unifying Framework, International Journal of Computer Vision 56 (3) (2004) 221.
[54] L. A. Leiva, A. Sanz, J. M. Buenaposada, Planar tracking using the GPU for augmented reality and games, in: ACM SIGGRAPH posters, 2007, pp. 1.
[55] Nvidia cuda programming guide 1.0 (2007).
[56] A. Toshev, B. Taskar, K. Daniilidis, Object detection via boundary structure segmentation, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2010, pp. 950.
[57] C. Pantofaru, C. Schmid, M. Hebert, Object recognition by integrating multiple image segmentations, in: Proceedings of the European Conference on Computer Vision, Vol. 3, 2008, pp. 481.
[58] G. Mori, X. Ren, A. A. Efros, J. Malik, Recovering human body configurations: Combining segmentation and recognition, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Vol. 2, 2004, pp. 326.
[59] H. Zhang, J. Malik, Learning a discriminative classifier using shape context distances, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2003, pp. 242.
[60] A. P. Moore, S. J. D. Prince, J. Warrell, U. Mohammed, G. Jones, Superpixel lattices, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2008, pp. 1.
[61] J. Shi, J. Malik, Normalized cuts and image segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence 22 (8) (2000) 888.
[62] X. Ren, J. Malik, Learning a classification model for segmentation, in: Proceedings of the International Conference on Computer Vision, 2003, pp. 10.
[63] X. He, R. S. Zemel, D. Ray, Learning and incorporating top-down cues in image segmentation, in: Proceedings of the European Conference on Computer Vision, 2006, pp. 338.
[64] A. Levinshtein, A. Stere, K. N. Kutulakos, D. J. Fleet, S. J. Dickinson, K. Siddiqi, Turbopixels: Fast superpixels using geometric flows, IEEE Transactions on Pattern Analysis and Machine Intelligence 31 (12) (2009) 2290.
[65] O. Veksler, Y. Boykov, P. Mehrani, Superpixels and supervoxels in an energy optimization framework, in: Proceedings of the European Conference on Computer Vision, 2010, pp. 211.

[66] Y. Boykov, G. Funka-Lea, Graph cuts and efficient n-d image segmentation, International Journal of Computer Vision 70 (2006) 109.
[67] Y. Boykov, O. Veksler, R. Zabih, Fast approximate energy minimization via graph cuts, IEEE Transactions on Pattern Analysis and Machine Intelligence 23 (11) (2001) 1222.
[68] M. J. Swain, D. H. Ballard, Color indexing, International Journal of Computer Vision 7 (1991) 11.
[69] H. Kuhn, The Hungarian method for the assignment problem, Vol. 2, Wiley Online Library, 1955.
[70] S. Baker, S. Roth, D. Scharstein, M. J. Black, J. Lewis, R. Szeliski, A database and evaluation methodology for optical flow, in: Proceedings of the International Conference on Computer Vision, 2007, pp. 1.
[71] I. Haritaoglu, D. Harwood, L. S. Davis, W4: A real time system for detecting and tracking people, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, Vol. 0, 1998, pp. 962.
[72] M. de La Gorce, N. Paragios, D. J. Fleet, Model-based hand tracking with texture, shading and self-occlusions, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2008, pp. 1.
[73] X. S. Zhou, D. Comaniciu, A. Gupta, An information fusion framework for robust shape tracking, IEEE Transactions on Pattern Analysis and Machine Intelligence 27 (1) (2005) 115.
[74] R. Collins, Y. Liu, M. Leordeanu, On-line selection of discriminative tracking features, IEEE Transactions on Pattern Analysis and Machine Intelligence 27 (10) (2005) 1631.
[75] A. Adam, E. Rivlin, I. Shimshoni, Robust fragments-based tracking using the integral histogram, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2006, pp. 798.
[76] D. A. Ross, J. Lim, R.-S. Lin, M.-H. Yang, Incremental learning for robust visual tracking, International Journal of Computer Vision 77 (1-3) (2008) 125.
[77] A. D. Jepson, D. J. Fleet, T. F. El-Maraghi, Robust online appearance models for visual tracking, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2001, pp. 415.
[78] J. Kwon, K. M. Lee, Visual tracking decomposition, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2010, pp. 1269.

[79] H. Grabner, C. Leistner, H. Bischof, Semi-supervised on-line boosting for robust tracking, in: Proceedings of the European Conference on Computer Vision, 2008, pp. 234.
[80] X. Mei, H. Ling, Robust visual tracking using l1 minimization, in: Proceedings of the International Conference on Computer Vision, 2009, pp. 1436.
[81] D. Comaniciu, V. Ramesh, P. Meer, Kernel-based object tracking, IEEE Transactions on Pattern Analysis and Machine Intelligence 25 (5) (2003) 564.
[82] S. M. S. Nejhum, J. Ho, M.-H. Yang, Visual tracking with histograms and articulating blocks, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2008, pp. 1.
[83] D. G. Lowe, Object recognition from local scale-invariant features, in: ICCV, 1999, pp. 1150.
[84] H. Bay, A. Ess, T. Tuytelaars, L. J. V. Gool, Speeded-up robust features (surf), Computer Vision and Image Understanding 110 (3) (2008) 346.
[85] T. Ojala, M. Pietikainen, D. Harwood, A comparative study of texture measures with classification based on featured distributions, Pattern Recognition 29 (1) (1996) 51.
[86] M. Yang, J. Yuan, Y. Wu, Spatial selection for attentional visual tracking, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2007, pp. 1.
[87] B. Liu, J. Huang, L. Yang, C. A. Kulikowski, Robust tracking using local sparse appearance model and k-selection, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2011, pp. 1313.
[88] X. Mei, H. Ling, Y. Wu, E. Blasch, L. Bai, Minimum error bounded efficient l1 tracker with occlusion detection, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2011, pp. 1257.
[89] X. Ren, J. Malik, Tracking as repeated figure/ground segmentation, in: Proceedings of IEEE Conference on Computer Vision and Pattern Recognition, 2007, pp. 1.
[90] S. Wang, H. Lu, F. Yang, M.-H. Yang, Superpixel tracking, in: Proceedings of the International Conference on Computer Vision, 2011, pp. 1.
[91] T. G. Dietterich, R. H. Lathrop, T. Lozano-Perez, Solving the multiple instance problem with axis-parallel rectangles, Artificial Intelligence 89 (1-2) (1997) 31-71.
[92] H. Grabner, M. Grabner, H. Bischof, Real-time tracking via on-line boosting, in: Proceedings of the British Machine Vision Conference, 2006, pp. 47.

[93] A. Saffari, C. Leistner, J. Santner, M. Godec, H. Bischof, On-line random forests, in: IEEE International Conference on Computer Vision Workshops (ICCV Workshops), 2009, pp. 1393.

BIOGRAPHICAL SKETCH

S. M. Shahed Nejhum received the BS degree from the Bangladesh University of Engineering and Technology. He received his PhD degree in Computer Engineering from the Department of Computer and Information Science and Engineering at the University of Florida. His research interests include computer vision and machine learning.