
Statistical modeling and segmentation of sky/ground images


STATISTICAL MODELING AND SEGMENTATION OF SKY/GROUND IMAGES

By

SINISA TODOROVIC

A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

UNIVERSITY OF FLORIDA

2002

ACKNOWLEDGMENTS

Herewith, I would like to express my sincere gratitude to Dr. Michael Nechyba for his wise guidance of my research for this thesis. As my advisor, Dr. Nechyba has been directing but on no account confining my interests. Therefore, I have had the impression of complete freedom, which, in my opinion, is a prerequisite for fruitful scientific research. I especially appreciate his readiness and expertise to help me solve numerous implementation issues. Most importantly, I am thankful for the friendship that we have developed collaborating on this work.

Also, I thank Dr. Antonio Arroyo, whose brilliant lectures on machine intelligence have inspired me to pursue research in the field of robotics. As the director of the Machine Intelligence Lab (MIL), Dr. Arroyo has created a warm, friendly, and hardworking atmosphere among the "MIL-ers." Thanks to him, I decided to join the MIL, which has proved on numerous occasions to be the right decision. Therefore, I thank Dr. Arroyo and all the members of the MIL for helping me assimilate faster into the new environment.

I thank Dr. Peter Ifju and his team of students at the Mechanical and Aerospace Engineering Department for enabling me to participate in their micro air vehicle (MAV) project. My research was initially inspired by the mutual work of Dr. Nechyba and Dr. Ifju. Certainly, the theory developed in this thesis would have been futile had there not been the perfect application: the MAV.

Finally, special thanks go to Ashish Jain for innumerable interventions in cases in which the Linux operating system has been my top adversary.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
KEY TO ABBREVIATIONS
KEY TO SYMBOLS
ABSTRACT

1 INTRODUCTION
  1.1 Overview of the Proposed Computer Vision System
  1.2 Overview of the Thesis
2 FEATURE SPACE
  2.1 Color
  2.2 Texture
    2.2.1 Wavelet Transform
    2.2.2 Wavelet Properties
    2.2.3 Complex Wavelet Transform
    2.2.4 Color Multi-resolution Representation
3 STATISTICAL MODELING
  3.1 Marginal Density
  3.2 Graph Models
  3.3 HMT Model
  3.4 HMT Properties
4 TRAINING OF THE HMT MODEL
  4.1 Implementation of the EM Algorithm
    4.1.1 E Step
    4.1.2 Up Step
    4.1.3 Down Step
    4.1.4 M Step
  4.2 Implementation Issues
    4.2.1 Scaled Up Step
    4.2.2 Scaled Down Step
5 MULTISCALE SEGMENTATION
6 EXPERIMENTAL RESULTS
  6.1 Real-Time Realization Problems
  6.2 Classification of the First Type of Test Images
  6.3 Segmentation of the Second Type of Test Images
7 CONCLUSION
REFERENCES
BIOGRAPHICAL SKETCH

LIST OF TABLES

2.1 Coefficients of the filters used in the Q-shift DT CWT
6.1 The performance of the RGB-based classifier for ground test images
6.2 The performance of the RGB-based classifier for sky test images
6.3 Performance of the wavelet-based classifier

LIST OF FIGURES

1.1 A computer vision system
2.1 Two levels of the DWT of a two-dimensional signal
2.2 The original image (left) and its two-scale dyadic DWT (right)
2.3 The Q-shift Dual-Tree CWT
2.4 The CWT is strongly oriented at angles ±15°, ±45°, ±75°
3.1 The HMT of the three-level CWT
5.1 The context vector
6.1 Training images of the sky and ground
6.2 The classification error using RGB and HSI color spaces
6.3 A MAV's flight: favorable sky/ground patterns
6.4 Water surface with ice patches similar to clouds
6.5 A mountain view with a fuzzy horizon line
6.6 Segmentation of cut and uncut grass regions

KEY TO ABBREVIATIONS

B: The blue feature of the RGB color space
CWT: Complex Wavelet Transform
DWT: Discrete Wavelet Transform
G: The green feature of the RGB color space
H: The hue feature of the HSI color space
HMT: Hidden Markov Tree
HSI: The color space that consists of hue, saturation and intensity features
I: The intensity feature of the HSI color space
MAP: Maximum A Posteriori
MAV: Micro Air Vehicle
pdf: Probability density function
R: The red feature of the RGB color space
RGB: The color space that consists of red, green and blue features
S: The saturation feature of the HSI color space

KEY TO SYMBOLS

The list shown below gives a brief description of the major mathematical symbols defined in this work.

a^J_{i,ρ(i)}(m,n), the transition probability that the node i at the scale J is in a state m, given that its parent node is in a state n
α^J_i(m), the joint probability of the tree under the root node 0, with the sub-tree under the node i at the scale J removed, and that the node i is in the state m
α̂^J_i(m), the dynamically scaled α^J_i(m) value
β^J_i(m), the probability of the whole tree under the node i at the scale J, given that the node i is in a state m
β̂^J_i(m), the dynamically scaled β^J_i(m) value
β^J_{ρ(i),i}(m), the probability of the whole tree under the node i at the scale J, given that the parent node ρ(i) is in a state m
β^{J+1}_{ρ(i)\i}(m), the probability of the cut tree under the parent node ρ(i) at the scale J+1 with all the children nodes removed, given that the node ρ(i) is in a state m
χ(i), the children nodes at the finer scale J−1 of the node i at the scale J
c^J_i, the context vector of the coefficient i at the scale J
g(W^J, Ω^J), the discriminant function
J, the number of the scale
M, the number of states in the Gaussian mixture model
μ^J_i, the mean value of the random variable w^J_i
ω^J_i, the class label of the coefficient i at the scale J
Ω^J, the collection of all class labels at the scale J
ρ(i), a parent node at the coarser scale J+1 of the node i at the scale J
S^{J−1}_{χ(i)}, a random variable representing the state of a child node at the scale J−1
σ^J_i(m), the standard deviation of the random variable w^J_i
S^J_i, a random variable representing the state of a coefficient at the position i and at the scale J
S^{J+1}_{ρ(i)}, a random variable representing the state of a parent node at the scale J+1
Θ, a random vector that consists of all parameters which determine the HMT model
Θ_g, all parameters of the statistical model for the ground
Θ_s, all parameters of the statistical model for the sky
W_HH, the set of wavelet coefficients of the DWT obtained filtering rows with highpass and columns with highpass filters
W_HL, the set of wavelet coefficients of the DWT obtained filtering rows with highpass and columns with lowpass filters
W_LH, the set of wavelet coefficients of the DWT obtained filtering rows with lowpass and columns with highpass filters
w^J_i, a random variable representing the magnitude of a coefficient of the CWT or HSI at the position i and at the scale J
W^J, the collection of all random variables w^J_i at the scale J
x_max, the maximum number of columns of an image
y_max, the maximum number of rows of an image

Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Science

STATISTICAL MODELING AND SEGMENTATION OF SKY/GROUND IMAGES

By

Sinisa Todorovic

December 2002

Chair: Michael C. Nechyba
Major Department: Electrical and Computer Engineering

In this thesis, a novel computer vision algorithm for statistical image modeling is proposed. The necessity for such an algorithm initially came as a requirement for vision-based stability and control of micro air vehicles (MAVs). MAV flight autonomy can be achieved by real-time horizon tracking, using the on-board vision system. Occasionally, this algorithm fails in scenarios where the underlying Gaussian assumption for the sky and ground appearances is not appropriate. Therefore, we propose a statistical modeling framework to build prior models of the sky and ground. Once trained, these models can be incorporated into the existing horizon-tracking algorithm for MAVs.

Since the appearances of the sky and ground vary enormously, no single feature is sufficient for accurate modeling. As such, we rely both on color and texture as critical features in our modeling framework. Specifically, we choose the hue, saturation and intensity (HSI) color system for color representation, and the complex wavelet transform (CWT) for texture representation. We then use the Hidden Markov Tree (HMT) model as the underlying statistical model over our feature space. With this approach, we have achieved reliable and robust image segmentation of flight images from on-board our MAVs as well as on more difficult-to-classify sky/ground images. Furthermore, we demonstrate the generality of our modeling framework through another segmentation task: classifying the cut/uncut grass regions on images taken by an on-board camera of a lawn mower.

CHAPTER 1
INTRODUCTION

Recently, the researchers in the Department of Mechanical and Aerospace Engineering and in the Department of Electrical and Computer Engineering, at the University of Florida, Gainesville, have successfully implemented a vision-based horizon tracking algorithm for flight stability and autonomy of Micro Air Vehicles (MAVs) [1]. The horizon tracking algorithm works well, especially when the sky and ground distributions are relatively coherent. Occasionally, however, horizon detection fails in scenarios where the underlying Gaussian assumption for the sky and ground appearances is not appropriate. Moreover, the horizon detection algorithm is bootstrapped by assuming that initially the sky occupies the upper part of the image. For complex mission scenarios, this may be an incorrect assumption with potentially fatal consequences to the flight vehicle. For example, in the case of deploying MAVs on munitions for post-impact bomb damage assessment, a MAV would separate from the munition prior to impact, and an upright attitude with respect to the ground cannot be guaranteed. Correct identification of sky and ground, therefore, becomes imperative.

While modeling the appearance of sky and ground regions in images may seem intuitively easy, it is, in fact, a very challenging task. Depending on lighting, weather, landscape, etc., the appearance of the sky and ground can vary enormously. Moreover, in some cases, even human perception fails, tricked by "unusual" patterns of sky and ground. For instance, images of a water surface reflecting the sky, or of mountain tops covered with snow, represent the extreme cases in which human detection of sky and ground is unreliable.

Consequently, given the complex variations in our two image classes (i.e., sky and ground), careful consideration must be given to selecting sufficiently discriminating features and a sufficiently expressive modeling framework. Next we present the overview of our approach.

1.1 Overview of the Proposed Computer Vision System

In this thesis, we propose a computer vision algorithm for image segmentation. The proposed vision system represents, in fact, a Bayes classifier, as illustrated in Figure 1.1. A Bayes classifier assigns a pixel to a particular class if the a posteriori probability of the class given the pixel value is maximum. Designing a Bayes classifier guarantees the best performance, subject to the correctness of the available statistical models [2]. Therefore, it is necessary to statistically model interdependencies among pixels as accurately as possible. This is achieved in the computer vision system, whose components are explained further in the text.

Figure 1.1 A computer vision system

The first module is a preprocessing block, which subsamples the input image and thus reduces the overall computation cost.

In the following block, the key features of an image are extracted. Having experimented with color and texture features separately, we conclude that only a feature set that includes both color and texture clues enables accurate statistical modeling for our application. Previous experiments also suggest that it is important to represent both local as well as regional interdependencies in the feature space. As such, we first employ the hue, intensity and saturation (HSI) color space to represent color. Also, we resort to wavelet-based multi-resolution analysis in

the form of the Complex Wavelet Transform (CWT). Thus, from the original complex representation of image classes, we obtain a finite set of features. Given our feature selection, we then choose the Hidden Markov Tree (HMT) model as our underlying statistical model. Since the true distribution of the chosen feature set is not known, it is necessary to statistically model the interdependencies among the features. Therefore, in the following block, we estimate the statistical parameters which completely characterize the underlying distribution of the features. Once the model parameters are computed from the available training data, it is possible to find the likelihood of the image classes for each pixel. Employing the maximum a posteriori (MAP) criterion, we assign a class label to each pixel, and, thus, perform segmentation.

In conclusion, the mutual dependence of all modules in the system should also be emphasized. The system performs poorly overall if only one of the components is badly designed. For instance, a perfect set of features and a sophisticated statistical model could yield horrible results if only the segmentation block is bad. Therefore, one should always bear in mind that the modularity depicted in Figure 1.1 serves only for presentation purposes.

1.2 Overview of the Thesis

In the following chapters we discuss the main components of our computer vision system. First, in Chapter 2, we explain our choice of the feature space for statistical modeling and review fundamental theory necessary for better understanding of our work. Specifically, we give an overview of the most important aspects of the HSI color system and the CWT. Next, in Chapter 3 we introduce the properties of the Hidden Markov Tree model, and in Chapter 4 we develop the algorithm for training HMT models, pointing out implementation issues. Further, in Chapter 5, we

explain the procedure for multiscale image segmentation. In Chapter 6, we present experimental results. We illustrate the reliability and robustness of our approach on examples of flight images from on-board our MAVs as well as on more difficult-to-classify sky/ground images. Furthermore, we demonstrate the generality of our modeling framework through another segmentation task: classifying the cut/uncut grass regions on images taken by an on-board camera of a lawn mower. Finally, we discuss the experimental results in Chapter 7.
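The Bayes decision rule outlined in Section 1.1 can be sketched in a few lines. The sketch below is our illustration, not the thesis implementation: the one-dimensional Gaussian class likelihoods are hypothetical stand-ins for the trained sky/ground HMT likelihoods, and the feature values are made up.

```python
import numpy as np

def map_classify(features, likelihood_sky, likelihood_ground,
                 prior_sky=0.5, prior_ground=0.5):
    """Assign each feature value to the class with maximum a posteriori
    probability: argmax_c P(c) * p(x | c)."""
    post_sky = prior_sky * likelihood_sky(features)
    post_ground = prior_ground * likelihood_ground(features)
    return np.where(post_sky >= post_ground, "sky", "ground")

def gauss(mu, sigma):
    """Hypothetical 1-D Gaussian likelihood standing in for a trained model."""
    return lambda x: (np.exp(-(x - mu) ** 2 / (2 * sigma ** 2))
                      / (sigma * np.sqrt(2 * np.pi)))

# Two toy feature values: one near the 'ground' mode, one near the 'sky' mode.
labels = map_classify(np.array([0.2, 0.9]), gauss(1.0, 0.3), gauss(0.0, 0.3))
```

With equal priors, the rule reduces to a maximum-likelihood comparison; the priors become relevant when one class is known to dominate the image.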

CHAPTER 2
FEATURE SPACE

For our statistical models, we seek to identify features that lead to improved segmentation performance without unnecessarily increasing computational complexity. Color or texture clues by themselves yield poor segmentation results. Therefore, we consider a feature space that spans both color and texture domains.

2.1 Color

The color information in a video signal is usually encoded in the RGB color space. Unfortunately, the R, G and B color channels are highly correlated; therefore, we choose the HSI space as a more appropriate color representation [3]. Though HSI exhibits numerical instability at low saturation, this problem can be easily overcome by assigning some application-dependent, predefined values to the hue (H), intensity (I) and saturation (S) features. Thus, for sky/ground segmentation, we assume that low S and/or I values most likely appear for the ground class, which then sets the H value appropriately.

2.2 Texture

For the choice of texture-based features, we have considered several filtering, model-based and statistical methods for texture feature extraction. Our conclusion complies with the comparative study of Randen and Husoy [4]: for problems with many textures with subtle spectral differences, as in the case of our complex classes, it is reasonable to assume that spectral decomposition by a filter bank yields consistently superior results over other texture analysis methods. Our experimental results also suggest that it is crucial to analyze both local as well as

regional properties of texture. As such, we employ the wavelet transform, due to its inherent representation of texture at different scales and locations.

2.2.1 Wavelet Transform

Wavelet atom functions, being well localized both in space and frequency, retrieve texture information quite successfully [5]. The conventional discrete wavelet transform (DWT) may be regarded as equivalent to filtering the input signal with a bank of bandpass filters, whose impulse responses are all given by scaled versions of a mother wavelet. The scaling factor between adjacent filters is 2:1, leading to octave bandwidths and center frequencies that are one octave apart. The octave-band DWT is most efficiently implemented by the dyadic wavelet decomposition tree of Mallat [6], where the wavelet coefficients of an image are obtained by convolving every row and column with the impulse responses of lowpass and highpass filters, as shown in Figure 2.1. Practically, the coefficients of one scale are obtained by convolving every second row and column from the previous finer scale. Thus, the filter output is a wavelet subimage that has four times fewer coefficients than the one at the previous scale. The lowpass filter is denoted with H0 and the highpass filter with H1. The wavelet coefficients W carry the index L for lowpass output and H for highpass output.

Separable filtering of rows and columns produces four subimages at each level, which can be arranged as shown in Figure 2.2. The same figure also illustrates well the directional selectivity of the DWT, because the W_LH, W_HL and W_HH bandpass subimages can select horizontal, vertical and diagonal edges, respectively.

2.2.2 Wavelet Properties

The following properties of the DWT have made wavelet-based image processing very attractive in recent years [5,7,8]:

Figure 2.1 Two levels of the DWT of a two-dimensional signal.

Figure 2.2 The original image (left) and its two-scale dyadic DWT (right).
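The separable filter bank of Figure 2.1 can be sketched concretely. In the sketch below, a Haar filter pair serves as a hypothetical stand-in for the H0/H1 filters (the thesis does not specify a DWT filter pair at this point); filtering and downsampling rows and then columns yields the four subimages of Figure 2.2.

```python
import numpy as np

def dwt_level_haar(img):
    """One level of a separable 2-D Haar DWT: filter and downsample rows,
    then columns, yielding the four subimages W_LL, W_LH, W_HL, W_HH."""
    def rows(x):
        # Haar lowpass H0 (sum) and highpass H1 (difference) on pixel pairs.
        lo = (x[:, 0::2] + x[:, 1::2]) / np.sqrt(2)
        hi = (x[:, 0::2] - x[:, 1::2]) / np.sqrt(2)
        return lo, hi
    def cols(x):
        lo = (x[0::2, :] + x[1::2, :]) / np.sqrt(2)
        hi = (x[0::2, :] - x[1::2, :]) / np.sqrt(2)
        return lo, hi
    row_lo, row_hi = rows(img)
    W_LL, W_LH = cols(row_lo)  # approximation, horizontal-edge detail
    W_HL, W_HH = cols(row_hi)  # vertical-edge detail, diagonal detail
    return W_LL, W_LH, W_HL, W_HH

img = np.arange(16, dtype=float).reshape(4, 4)
subimages = dwt_level_haar(img)
# Each subimage has four times fewer coefficients than the input,
# and the orthonormal Haar pair preserves the total energy.
```

Iterating the function on W_LL produces the dyadic pyramid of Figure 2.2.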

a. locality: each wavelet coefficient represents local image content in space and frequency, because wavelets are well localized simultaneously in space and frequency

b. multi-resolution: the DWT represents an image at different scales of resolution in the space domain (i.e., in the frequency domain); regions of analysis at one scale are divided up into four smaller regions at the next finer scale (Fig. 2.2)

c. edge detector: edges of an image are represented by large wavelet coefficients at the corresponding locations

d. energy compression: wavelet coefficients are large only if edges are present within the support of the wavelet, which means that the majority of wavelet coefficients have small values

e. decorrelation: wavelet coefficients are approximately decorrelated, since the scaled and shifted wavelets form an orthonormal basis; dependencies among wavelet coefficients are predominantly local

f. clustering: if a particular wavelet coefficient is large/small, then the adjacent coefficients are very likely to also be large/small

g. persistence: large/small values of wavelet coefficients tend to propagate through scales

h. non-Gaussian marginal pdf: wavelet coefficients have peaky and long-tailed marginal distributions; due to the energy compression property only a few wavelet coefficients have large values, therefore a Gaussian pdf for an individual coefficient is a poor statistical model

It is also important to introduce the shortcomings of the DWT. Discrete wavelet decompositions suffer from two main problems, which hamper their use for many applications, as follows [9]:

a. lack of shift invariance: small shifts in the input signal can cause major variations in the energy distribution of wavelet coefficients

b. poor directional selectivity: for some applications horizontal, vertical and diagonal selectivity is insufficient

When we analyze the Fourier spectrum of a signal, we expect the energy in each frequency bin to be invariant to any shifts of the input. Unfortunately, the DWT has a significant drawback: the energy distribution between various wavelet scales depends critically on the position of key features of the input signal,

whereas ideally the dependence would be on just the features themselves. Therefore, the real DWT is unlikely to give consistent results when used in texture analysis.

Figure 2.3 The Q-shift Dual-Tree CWT.

In the literature, there are several approaches proposed to overcome this problem (e.g., Discrete Wavelet Frames [5,10]), all increasing the computational load with inevitable redundancy in the wavelet domain. In our opinion, the Complex Wavelet Transform (CWT) offers the best solution, providing additional advantages, described in the following subsection.

2.2.3 Complex Wavelet Transform

The structure of the CWT is the same as in Figure 2.1, except that the CWT filters have complex coefficients and generate complex output. The output sampling rates are unchanged from the DWT, but each wavelet coefficient contains a real and an imaginary part; thus a redundancy of 2:1 for one-dimensional signals is introduced. In our case, for two-dimensional signals, the redundancy becomes 4:1, because two adjacent quadrants of the spectrum are required to represent fully a real two-dimensional signal, adding an extra 2:1 factor. This is achieved by additional filtering with complex conjugates of either the row or column filters [9,11].
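The shift-dependence of the real DWT, which motivates the move to the CWT, is easy to demonstrate numerically. The sketch below is our illustration (not from the thesis): the same step edge, moved by a single sample, either produces a large highpass coefficient or vanishes entirely from the level-1 highpass band of a Haar DWT.

```python
import numpy as np

def haar_highpass_energy(x):
    """Energy in the level-1 highpass band of a 1-D Haar DWT."""
    hi = (x[0::2] - x[1::2]) / np.sqrt(2)
    return float(np.sum(hi ** 2))

# The same step edge, placed at two positions one sample apart.
x0 = np.array([0., 0., 0., 1., 1., 1., 1., 1.])  # edge splits a sampling pair
x1 = np.array([0., 0., 0., 0., 1., 1., 1., 1.])  # edge aligns with the pairs

e0 = haar_highpass_energy(x0)  # the edge is detected in this band
e1 = haar_highpass_energy(x1)  # the edge vanishes from this band entirely
```

A texture feature built from such energies would change drastically under a one-pixel translation, which is exactly the inconsistency the near-shift-invariant CWT avoids.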

Table 2.1 Coefficients of the filters used in the Q-shift DT CWT.

H13 (symmetric)   H19 (symmetric)   H6
  -0.0017581        -0.0000706        0.03616384
   0                 0                0
   0.0222656         0.0013419       -0.08832942
  -0.0468750        -0.0018834        0.23389032
  -0.0482422        -0.0071568        0.76027237
   0.2968750         0.0238560        0.58751830
   0.5554688         0.0556431        0
   0.2968750        -0.0516881       -0.11430184
  -0.0482422        -0.2997576        0
      .              0.5594308
      .             -0.2997576
                         .

Despite its higher computational cost, we prefer the CWT over the DWT because of the CWT's following attractive properties. The CWT is shown to possess almost shift and rotational invariance, given suitably designed biorthogonal or orthogonal wavelet filters. We implement the Q-shift Dual-Tree CWT scheme, proposed by Kingsbury [12], as depicted in Figure 2.3. The figure shows the CWT of only a one-dimensional signal x, for clarity. The outputs of the trees a and b can be viewed as the real and imaginary parts of complex wavelet coefficients, respectively. Thus, to compute the CWT, we implement two real DWT's (see Fig. 2.1), obtaining a wavelet frame with redundancy two. As for the DWT, here lowpass and highpass filters are denoted with 0 and 1 in the index, respectively. The level 0 comprises the odd-length filters H_0a(z) = H_0b(z) = H13(z) (13 taps) and H_1a(z) = H_1b(z) = H19(z) (19 taps). Levels above the level 0 consist of the even-length filters H_00a(z) = z^{-1} H6(z^{-1}), H_01a(z) = H6(-z), H_00b(z) = H6(z), H_01b(z) = z^{-1} H6(-z^{-1}), where the impulse responses of the filters H13, H19 and H6 are given in Table 2.1.

Figure 2.4 The CWT is strongly oriented at angles ±15°, ±45°, ±75°.

Aside from being shift invariant, the CWT is superior to the DWT in terms of directional selectivity, too. A two-dimensional CWT produces six bandpass subimages (analogous to the three subimages in the DWT) of complex coefficients at each level, which are strongly oriented at angles of ±15°, ±45°, ±75°, as illustrated in Figure 2.4.

Another advantageous property of the CWT emerges in the presence of noise. The phase and magnitude of the complex wavelet coefficients collaborate in a non-trivial way to describe the data [11]. The phase encodes the coherent (in space and scale) structure of an image, which is resilient to noise, and the magnitude captures the strength of local information, which can be very susceptible to noise corruption. Hence, the phase of complex wavelet coefficients might be used as a principal clue for image denoising. However, our experimental results have shown that phase is not a good feature choice for sky/ground modeling. Therefore, we consider only magnitudes.

To conclude, we consider the CWT a suitable tool to analyze image texture. The CWT inherited the attractive properties of the DWT, such as locality, multi-resolution and energy compression. Its shift invariance and increased directional selectivity give additional power to statistical modeling of image classes. Therefore,

we choose, beside the color features HSI, the CWT to complement our feature space.

2.2.4 Color Multi-resolution Representation

In order to unify the color and texture features into a unique modeling framework, we also form a multi-resolution representation for the H, S and I features, analogous to the CWT structure. The H, S and I values at coarser scales are computed as the mean of the corresponding four values at the next higher-resolution scale. Hence, the H, S and I features also exhibit the clustering and persistence properties to some extent.

Naturally, the choice of the feature space determines the statistical image model. In the next chapter, we describe the HMT model as an appropriate statistical framework for modeling our complex classes in the chosen feature space.
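The multi-resolution color representation of Section 2.2.4 amounts to repeated 2x2 block averaging. A minimal sketch (our illustration, assuming a square channel with power-of-two side length):

```python
import numpy as np

def color_pyramid(channel, levels):
    """Build coarser scales of an H, S or I channel: each coarse value is
    the mean of the corresponding four values at the next finer scale."""
    pyramid = [channel]
    for _ in range(levels):
        f = pyramid[-1]
        coarse = (f[0::2, 0::2] + f[0::2, 1::2] +
                  f[1::2, 0::2] + f[1::2, 1::2]) / 4.0
        pyramid.append(coarse)
    return pyramid

# A toy 4x4 hue channel with four homogeneous 2x2 blocks.
hue = np.array([[1., 1., 3., 3.],
                [1., 1., 3., 3.],
                [5., 5., 7., 7.],
                [5., 5., 7., 7.]])
pyr = color_pyramid(hue, 2)  # scales J = 0, 1, 2
```

Because each coarse value summarizes its four children, smooth regions stay smooth across scales, which is why the color features inherit the clustering and persistence properties to some extent.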

CHAPTER 3
STATISTICAL MODELING

In order to compute the joint pdf of the chosen features, which fully characterizes a class for our Bayes classifier (see Fig. 1.1), we must account for the key dependencies. In our case, we need to examine thoroughly all the properties of the CWT and HSI features, and to best incorporate this knowledge into our statistical model.

3.1 Marginal Density

We model the true marginal pdf of a feature as a mixture of Gaussian pdf's. Let us refer to the magnitude of a CWT coefficient and/or the H, S and I color values simply as a coefficient. Then, a coefficient can be in one of M states, which results in an M-state Gaussian mixture model. It means that we associate to each coefficient w_i, at location (x, y), a state random variable S_i.¹ In general, S_i can take on the values m ∈ {0, ..., M−1}. The mixture model is completely parameterized with the probability mass function (pmf) of the state random variable, P(S_i = m), and with the means μ_i(m) and the variances σ²_i(m) of each state. By increasing the number of states, we can make our model fit arbitrarily close to real densities with a finite number of discontinuities [7].

Thus, the marginal pdf of a wavelet coefficient w_i is given by

    p(w_i) = Σ_{m=0}^{M−1} P(S_i = m) p(w_i | S_i = m),     (3.1)

¹ Note that boldface letters represent random variables, and their particular realizations are not boldface.

Figure 3.1 The HMT of the three-level CWT.

where

    p(w_i | S_i = m) = (1 / (σ_i(m) √(2π))) exp( −(w_i − μ_i(m))² / (2σ²_i(m)) ).     (3.2)

Although each coefficient w_i is conditionally Gaussian, given its state random variable S_i, the overall density is non-Gaussian due to the randomness of S_i.

3.2 Graph Models

A probabilistic graph associates each random variable with a node in the graph [13]. Correlation between pairs of variables is represented by links connecting the corresponding nodes. The chosen feature space can be modeled by a probabilistic graph in the following way. A node is assigned to each coefficient and an edge is placed between two mutually dependent nodes. All coefficients that belong to the same scale are placed in the corresponding horizontal plane. The coefficients of the finer scales form planes under the current one, and the coefficients that belong to the coarser scales form planes above the current one, as illustrated in Figure 3.1. Four adjacent coefficients at one scale have a unique parent belonging to the upper coarser scale, only within the same feature (shaded area). States S^J_i are depicted as white nodes and coefficient values w^J_i as black nodes.
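Equations (3.1) and (3.2) define an M-state Gaussian mixture. A small numerical sketch follows; the two-state parameters are illustrative values of our choosing, not trained ones, picked to mimic the peaky, long-tailed wavelet marginals (a narrow "small" state plus a wide "large" state).

```python
import numpy as np

def gaussian(w, mu, sigma):
    """The conditional pdf p(w_i | S_i = m) of equation (3.2)."""
    return (np.exp(-(w - mu) ** 2 / (2 * sigma ** 2))
            / (sigma * np.sqrt(2 * np.pi)))

def mixture_pdf(w, pmf, mus, sigmas):
    """The marginal pdf p(w_i) of equation (3.1): a pmf-weighted
    sum of M conditional Gaussians."""
    return sum(P * gaussian(w, mu, s) for P, mu, s in zip(pmf, mus, sigmas))

# Illustrative two-state mixture (M = 2).
pmf, mus, sigmas = [0.8, 0.2], [0.0, 0.0], [0.5, 3.0]
p0 = mixture_pdf(0.0, pmf, mus, sigmas)   # density at the peak
```

Because the pmf weights sum to one and each component integrates to one, the mixture is itself a valid pdf, yet it is clearly non-Gaussian: sharply peaked at zero with heavy tails.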

Thus, we obtain a pyramid structure, because the coarser the scale the fewer nodes there are. We can develop graphs that capture arbitrary interdependencies, but the computational complexity increases substantially for graphs more complicated than trees. If we keep only horizontal edges, which connect adjacent nodes of one scale, we obtain the Hidden Markov Chain Model (HMC). By connecting only the adjacent nodes vertically across scale, we obtain the Hidden Markov Tree Model (HMT).

3.3 HMT Model

The HMT models both the clustering and persistence properties. A tree structure is formed connecting every parent coefficient² from the coarser scale with its children at the finer scale. There are four direct children from one parent node, and every child has only one parent node, as depicted in Figure 3.1. Leaf nodes have no children and they are formed from the coefficients at the finest resolution. Although we do not include horizontal edges among the nodes of one scale in our model, the clustering property is still well modeled, because the adjacent nodes at one scale have the same parent. Therefore the local correlation is indirectly modeled by interscale dependencies [14].

In order to discuss HMT properties, first we need to introduce the following notation. The random field of all wavelet coefficients at a scale J is denoted with W^J = {w^J_i}. A node i has a parent node ρ(i) and a set of children nodes χ(i). This implies that J(χ(i)) = J(i) − 1 and J(ρ(i)) = J(i) + 1. The scales are ordered from the finest J = 0 to the coarsest J = L − 1, where L = log₂ x_max. In this project, the dimensions x_max and y_max of an image are assumed to be equal, to facilitate computation of the CWT.

² Here, coefficient refers to the magnitude of a CWT coefficient and/or the H, S and I color values.
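The parent map ρ(i) and children map χ(i) of the quad-tree have simple index arithmetic when the coefficients at each scale are stored on a square grid, as in Figure 3.1. The sketch below is our illustration of that arithmetic under the stated storage assumption, not code from the thesis.

```python
def parent(r, c):
    """rho(i): the parent of coefficient (r, c) at scale J lives at the
    coarser scale J+1, at the halved grid coordinates."""
    return (r // 2, c // 2)

def children(r, c):
    """chi(i): the four children of coefficient (r, c) at scale J live
    at the finer scale J-1, at the doubled grid coordinates."""
    return [(2 * r + dr, 2 * c + dc) for dr in (0, 1) for dc in (0, 1)]

kids = children(1, 2)
# Every child maps back to the same unique parent, so four adjacent
# fine-scale nodes share one parent: the clustering property.
```

The integer division in `parent` is what makes the parent unique, while `children` enumerates exactly the 2x2 block of finer-scale nodes under it.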

3.4 HMT Properties

The HMT imposes that w^J_i is conditionally independent of all other random variables given its associated state S^J_i [15]. Secondly, given a parent state S^{J+1}_{ρ(i)}, w^J_i is independent of ρ(i)'s ancestors. Conversely, given a child state S^{J−1}_{χ(i)}, w^J_i is independent of the child's descendants. Combining these properties shows that w^J_i is conditionally independent of the entire tree, given only the parent state S^{J+1}_{ρ(i)} and the children states S^{J−1}_{χ(i)}. Here follows a mathematical formulation of these statements:

    p(w^J_i | {w^J_k}_{k≠i}, {S^J_k}_{k≠i}, S^J_i = m) = p(w^J_i | S^J_i = m),     (3.3)

    p(w^J_i | {w_k}_{J(k) > J(ρ(i))}, w^{J+1}_{ρ(i)}) = p(w^J_i | w^{J+1}_{ρ(i)}),     (3.4)

    p(w^J_i | {w_k}_{J(k) < J(χ(i))}, w^{J−1}_{χ(i)}) = p(w^J_i | w^{J−1}_{χ(i)}).     (3.5)

The HMT model is thus completely determined by the parameter vector Θ, which consists of the pmf of the states, the transition probabilities a^J_{i,ρ(i)}(m,n), and the means μ^J_i(m) and variances σ²_i^J(m) of all states.

For training the HMT, we employ the expectation maximization (EM) algorithm [16,17], which iteratively estimates the model parameters Θ. Under mild conditions, the iteration converges to a local maximum of the function p(W | Θ). The training of the HMT is explained in the following chapter.

CHAPTER 4
TRAINING OF THE HMT MODEL

Our aim is to determine the parameter vector Θ, given the pyramidal structure of observed coefficients W. Here, the usual EM algorithm for training HMM's [15] is modified for the continuous-outcome HMT.

First, let us introduce the following notation. We define T^J_i to be the subtree of observed coefficients with a root at the node w^J_i, at the level J; T^J_i contains the coefficient w^J_i and all its descendants. If T^J_k is a subtree of T^J_i, then T^J_{i\k} denotes the cut tree obtained by removing the subtree T^J_k from T^J_i. In terms of this notation, and consistent with the Key to Symbols, we work with the conditional likelihoods

    β^J_i(m) = p(T^J_i | S^J_i = m, Θ),
    β^J_{ρ(i),i}(m) = p(T^J_i | S^{J+1}_{ρ(i)} = m, Θ),
    β^{J+1}_{ρ(i)\i}(m) = p(T^{J+1}_{ρ(i)\i} | S^{J+1}_{ρ(i)} = m, Θ),

and the joint probability

    α^J_i(m) = P(S^J_i = m, T^{L−1}_{0\i} | Θ),

where T^{L−1}_{0\i} denotes the whole tree under the root node 0 with the subtree under the node i removed.
To determine the HMT parameter vector Θ, it is necessary first to compute the state and transition probabilities for all the observed coefficients W, P(S^J_i = m | W, Θ) and P(S^J_i = m | S^{J+1}_{ρ(i)} = n, W, Θ). From the HMT properties (3.3), (3.4) and (3.5) it follows that the trees T^J_i and T^{L−1}_{0\i} are independent, given the state random variable S^J_i. This fact, together with the Markov chain rule, leads to the desired expressions in terms of the defined α and β. First, we obtain

    P(S^J_i = m, T^{L−1}_0 | Θ) = α^J_i(m) β^J_i(m),     (4.5)

    P(S^J_i = m, S^{J+1}_{ρ(i)} = n, T^{L−1}_0 | Θ) = β^J_i(m) a^J_{i,ρ(i)}(m,n) α^{J+1}_{ρ(i)}(n) β^{J+1}_{ρ(i)\i}(n).     (4.6)

Then, the joint pdf of W can be expressed as

    p(W | Θ) = p(T^{L−1}_0 | Θ) = Σ_{m=0}^{M−1} P(S^J_i = m, T^{L−1}_0 | Θ) = Σ_{m=0}^{M−1} α^J_i(m) β^J_i(m).     (4.7)

Applying the Bayes rule to (4.5)-(4.7) results in the desired conditional probabilities

    P(S^J_i = m | W, Θ) = α^J_i(m) β^J_i(m) / Σ_{n=0}^{M−1} α^J_i(n) β^J_i(n),     (4.8)

    P(S^J_i = m, S^{J+1}_{ρ(i)} = n | W, Θ) = β^J_i(m) a^J_{i,ρ(i)}(m,n) α^{J+1}_{ρ(i)}(n) β^{J+1}_{ρ(i)\i}(n) / Σ_{n=0}^{M−1} α^J_i(n) β^J_i(n).     (4.9)

From (4.8), (4.9) and (3.3) follows the expression for the transition probability

    P(S^J_i = m | S^{J+1}_{ρ(i)} = n, W, Θ) = P(S^J_i = m, S^{J+1}_{ρ(i)} = n | W, Θ) / P(S^{J+1}_{ρ(i)} = n | W, Θ)
        = β^J_i(m) a^J_{i,ρ(i)}(m,n) α^{J+1}_{ρ(i)}(n) β^{J+1}_{ρ(i)\i}(n) / (α^{J+1}_{ρ(i)}(n) β^{J+1}_{ρ(i)}(n))
        = β^J_i(m) a^J_{i,ρ(i)}(m,n) / β^J_{ρ(i),i}(n).     (4.10)
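Equation (4.8) is simply a normalization of the elementwise products α^J_i(m) β^J_i(m) over the M states. A numerical sketch with illustrative (made-up) values for one node:

```python
import numpy as np

def state_posterior(alpha_i, beta_i):
    """P(S_i = m | W, Theta) of equation (4.8): normalize the elementwise
    products alpha_i(m) * beta_i(m) over the M states."""
    prod = np.asarray(alpha_i, dtype=float) * np.asarray(beta_i, dtype=float)
    return prod / prod.sum()

# Illustrative two-state alpha and beta values for a single node.
post = state_posterior([0.3, 0.1], [0.2, 0.4])
```

The normalization makes the result a proper pmf regardless of how the α's and β's are individually scaled, which is exactly the property the scaled algorithm of Section 4.2 exploits.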

Note that the expressions (4.8) and (4.10) guarantee that the computed probabilities are less than or equal to 1. From the state and transition probabilities, all other components of the random vector Θ can be computed.

4.1.1 E Step

In determining the probabilities for the state variables, the state information is propagated throughout the tree by means of the upward-downward¹ algorithm. The up step calculates the β's by transmitting information about the fine-scale wavelet/color coefficients up to coarser scales. The down step computes the α's by propagating information about the coarse-scale wavelet/color coefficients down to finer scales. In the following subsections we present the up and down steps for the l-th iteration of the EM algorithm.

4.1.2 Up Step

The up part of the EM algorithm consists of the following steps:

1. For all S⁰_i = m, where m ∈ {0, ..., M−1}, calculate

       β⁰_i(m) = N(w⁰_i; μ⁰_i(m), σ⁰_i(m)).     (4.11)

2. For all S^J_i = m, where m ∈ {0, ..., M−1}, calculate

       β^J_{ρ(i),i}(n) = Σ_{m=0}^{M−1} a^J_{i,ρ(i)}(m,n) β^J_i(m),     (4.12)

       β^{J+1}_{ρ(i)}(m) = N(w^{J+1}_{ρ(i)}; μ^{J+1}_{ρ(i)}(m), σ^{J+1}_{ρ(i)}(m)) Π_{k∈χ(ρ(i))} β^J_{ρ(i),k}(m),     (4.13)

       β^{J+1}_{ρ(i)\i}(m) = β^{J+1}_{ρ(i)}(m) / β^J_{ρ(i),i}(m).     (4.14)

¹ The E step of the EM algorithm is often referred to as the forward-backward algorithm in the HMM literature and as the upward-downward algorithm in the AI literature. Since the tree structure imposes moves up and down in the E step computation, here we use the latter term.

3. Move up the tree, setting J = J + 1.

4. If J = L, then stop; else return to step 2.

4.1.3 Down Step

The down part of the EM algorithm consists of the following steps:

1. For S^{L−1}_0 = m, where m ∈ {0, ..., M−1}, set

       α^{L−1}_0(m) = P(S^{L−1}_0 = m | W, Θ = Θ_l).     (4.15)

2. Move down the tree, setting J = J − 1.

3. For all S^J_i = m, where m ∈ {0, ..., M−1}, calculate

       α^J_i(m) = Σ_{n=0}^{M−1} a^J_{i,ρ(i)}(m,n) α^{J+1}_{ρ(i)}(n) β^{J+1}_{ρ(i)\i}(n).     (4.16)

4. If J = 0, then stop; else return to step 2.

4.1.4 M Step

Once the α and β values are computed for all nodes, the maximization step of the EM algorithm follows straightforwardly. The state and transition probabilities could be computed applying (4.8) and (4.10). We could then find μ^J_i(m) and σ^J_i(m) for all coefficients. However, all that would make our HMT model too complex. In order to simplify, and to avoid the risk of overfitting the HMT model, we assume that the statistical parameters are equal for all coefficients belonging to the same scale. It means that all nodes at the scale J are characterized by the following unique parameters: P(S^J = m | W, Θ), μ^J(m), σ^J(m), and P(S^J = m | S^{J+1} = n, W, Θ). Here, S^J denotes the state random variable for all nodes i at the same scale J.

PAGE 33

22

Applying this assumption in the M step results in the following expressions for all $K$ nodes at the scale $J$, for the $l$-th iteration:

$P(S^J = m \mid W, \Theta) = \frac{1}{K} \sum_{i=0}^{K-1} P(S^J_i = m \mid W, \Theta = \Theta^l) = \frac{1}{K} \sum_{i=0}^{K-1} \frac{\alpha^J_i(m)\, \beta^J_i(m)}{\sum_{n=0}^{M-1} \alpha^J_i(n)\, \beta^J_i(n)},$   (4.17)

$P(S^J = m \mid S^{J+1} = n, W, \Theta) = \frac{1}{K} \sum_{i=0}^{K-1} P(S^J_i = m \mid S^{J+1}_{\rho(i)} = n, W, \Theta = \Theta^l) = \frac{1}{K} \sum_{i=0}^{K-1} \frac{\beta^J_i(m)\, a^J_{i,\rho(i)}(m,n)}{\beta^J_{\rho(i),i}(n)},$   (4.18)

$\mu^J(m) = \frac{\sum_{i=0}^{K-1} w^J_i\, P(S^J_i = m \mid W, \Theta)}{K\, P(S^J = m \mid W, \Theta)},$   (4.19)

$(\sigma^J(m))^2 = \frac{\sum_{i=0}^{K-1} (w^J_i - \mu^J(m))^2\, P(S^J_i = m \mid W, \Theta)}{K\, P(S^J = m \mid W, \Theta)}.$   (4.20)

4.2 Implementation Issues

The computer implementation of the upward-downward algorithm, if the equations (4.11)-(4.16) are used directly, suffers from numerical underflow. Unfortunately, it is not possible to implement logarithm scaling here. In order to make the algorithm stable, we need to introduce adaptive scaling of the $\beta$ values. The dynamic scaling is performed for $\beta$ values only. Since $\beta$'s are computed only in the up step, we need only rewrite the equations (4.11)-(4.14). The scaled values are denoted with a hat. In the following subsections we present the scaled up-step and down-step algorithms.
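A minimal sketch of the tied M-step updates (4.17), (4.19), (4.20): the per-node posteriors from the E step are averaged over all K nodes of a scale, so each scale keeps a single parameter set. The posteriors and coefficients below are invented:

```python
K = 4
M = 2
w = [0.2, -0.1, 3.1, 2.9]                 # coefficients at one scale
# Hypothetical per-node state posteriors P(S_i = m | W, Theta) from the E step.
post = [[0.95, 0.05], [0.97, 0.03], [0.04, 0.96], [0.06, 0.94]]

# Eq. (4.17): tied state probability = average of per-node posteriors.
p_state = [sum(post[i][m] for i in range(K)) / K for m in range(M)]

# Eq. (4.19): tied mean, weighted by the posteriors.
mu = [sum(w[i] * post[i][m] for i in range(K)) / (K * p_state[m]) for m in range(M)]

# Eq. (4.20): tied variance, weighted by the posteriors.
var = [sum((w[i] - mu[m]) ** 2 * post[i][m] for i in range(K)) / (K * p_state[m])
       for m in range(M)]
```

With these numbers, state 0 captures the coefficients near zero and state 1 those near three.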

PAGE 34

23

4.2.1 Scaled Up Step

The scaled up part of the EM algorithm consists of the following steps:

1. $\forall\, i$ at the scale $J = 0$ compute

   $\beta^0_i(m) = N(w^0_i;\, \mu^0_i(m),\, \sigma^0_i(m)),$

   $s^0_i = \sum_{m=0}^{M-1} \beta^0_i(m),$  the scaling factor for the $\beta^0_i$'s,

   $\hat\beta^0_i(m) = \frac{1}{s^0_i}\, \beta^0_i(m).$   (4.21)

2. $\forall\, i$ at a scale $J$ compute

   $\hat\beta^J_{\rho(i),i}(n) = \sum_{m=0}^{M-1} a^J_{i,\rho(i)}(m,n)\, \hat\beta^J_i(m) = \frac{1}{s^J_i} \sum_{m=0}^{M-1} a^J_{i,\rho(i)}(m,n)\, \beta^J_i(m) = \frac{1}{s^J_i}\, \beta^J_{\rho(i),i}(n),$   (4.22)

   $\tilde\beta^{J+1}_{\rho(i)}(m) = N(w^{J+1}_{\rho(i)};\, \mu^{J+1}_{\rho(i)}(m),\, \sigma^{J+1}_{\rho(i)}(m)) \prod_{k \in \chi(\rho(i))} \hat\beta^J_{k,\rho(i)}(m) = \frac{1}{\prod_{k \in \chi(\rho(i))} s^J_k}\, \beta^{J+1}_{\rho(i)}(m),$   (4.23)

   $\tilde s^{J+1}_{\rho(i)} = \sum_{m=0}^{M-1} \tilde\beta^{J+1}_{\rho(i)}(m) = \frac{1}{\prod_{k \in \chi(\rho(i))} s^J_k} \sum_{m=0}^{M-1} \beta^{J+1}_{\rho(i)}(m),$   (4.24)

   $\hat\beta^{J+1}_{\rho(i)}(m) = \frac{\tilde\beta^{J+1}_{\rho(i)}(m)}{\tilde s^{J+1}_{\rho(i)}} = \frac{\beta^{J+1}_{\rho(i)}(m)}{\sum_{m=0}^{M-1} \beta^{J+1}_{\rho(i)}(m)} = \frac{\beta^{J+1}_{\rho(i)}(m)}{s^{J+1}_{\rho(i)}},$   (4.25)

   $\hat\beta^{J+1}_{\rho(i)\setminus i}(m) = \frac{\hat\beta^{J+1}_{\rho(i)}(m)}{\hat\beta^J_{\rho(i),i}(m)} = \frac{\beta^{J+1}_{\rho(i)}(m)/s^{J+1}_{\rho(i)}}{\beta^J_{\rho(i),i}(m)/s^J_i} = \frac{s^J_i}{s^{J+1}_{\rho(i)}}\, \beta^{J+1}_{\rho(i)\setminus i}(m).$   (4.26)

3. $J = J + 1$.

4. If $J = L$, then stop; else return to step 2.

From (4.21) and (4.25) it is obvious that scaling at each step of the up algorithm limits the dynamics of the $\hat\beta$ values to the interval $[0,1]$. If we carefully
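The need for scaling is easy to demonstrate: multiplying per-level likelihoods as in (4.13) underflows double precision after enough levels, while renormalizing by s = sum over m of beta(m) at each level, as in (4.24)-(4.25), keeps the scaled values in [0, 1]. The depth and observations below are arbitrary:

```python
import math

def gauss(w, mu, sigma):
    # Gaussian pdf N(w; mu, sigma)
    return math.exp(-0.5 * ((w - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

M = 2
levels = 400                       # deep enough to underflow double precision
raw = [1.0, 1.0]                   # unscaled beta-like products
scaled = [1.0, 1.0]                # dynamically scaled counterparts
for _ in range(levels):
    # Each level multiplies in small emission likelihoods (analogue of Eq. 4.13).
    like = [gauss(5.0, 0.0, 1.0), gauss(4.0, 0.0, 1.0)]
    raw = [raw[m] * like[m] for m in range(M)]
    tmp = [scaled[m] * like[m] for m in range(M)]
    s = sum(tmp)                   # per-level scaling factor, Eq. (4.24)
    scaled = [t / s for t in tmp]  # Eq. (4.25): stays within [0, 1]
```

After 400 levels the unscaled products have collapsed to exact zeros, while the scaled values remain a valid distribution.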

PAGE 35

24

observe the equation (4.16) in the down-step algorithm, the $\alpha$ values need not be scaled, because limiting $\hat\beta^{J+1}_{\rho(i)\setminus i}(m)$ stabilizes the $\hat\alpha$'s in (4.16), as well.

4.2.2 Scaled Down Step

The scaled down part of the EM algorithm consists of the following steps:

1. For $J = L - 1$ compute

   $t^L = 1,$  the scaling factor for $\alpha^{L-1}_0$,

   $\hat\alpha^{L-1}_0(m) = \frac{1}{t^L}\, \alpha^{L-1}_0(m).$   (4.27)

2. $J = J - 1$.

3. $\forall\, i$ at the scale $J$,

   $\hat\alpha^J_i(m) = \sum_{n=0}^{M-1} a^J_{i,\rho(i)}(m,n)\, \hat\alpha^{J+1}_{\rho(i)}(n)\, \hat\beta^{J+1}_{\rho(i)\setminus i}(n) = \frac{s^J_i}{s^{J+1}_{\rho(i)}\, t^{J+1}_{\rho(i)}} \sum_{n=0}^{M-1} a^J_{i,\rho(i)}(m,n)\, \alpha^{J+1}_{\rho(i)}(n)\, \beta^{J+1}_{\rho(i)\setminus i}(n) = \frac{s^J_i}{s^{J+1}_{\rho(i)}\, t^{J+1}_{\rho(i)}}\, \alpha^J_i(m).$   (4.28)

4. If $J = 0$, then stop; else return to step 2.

Finally, it is necessary to check the expressions for the state and transition probabilities to account for the scaled $\hat\alpha$'s and $\hat\beta$'s. From the equations (4.17), (4.25) and (4.28) it follows:

$\hat P(S^J = m \mid W, \Theta) = \frac{1}{K} \sum_{i=0}^{K-1} \frac{\hat\alpha^J_i(m)\, \hat\beta^J_i(m)}{\sum_{n=0}^{M-1} \hat\alpha^J_i(n)\, \hat\beta^J_i(n)} = \frac{1}{K} \sum_{i=0}^{K-1} \frac{\frac{s^J_i}{s^{J+1}_{\rho(i)} t^{J+1}_{\rho(i)}}\, \alpha^J_i(m)\, \frac{1}{s^J_i}\, \beta^J_i(m)}{\sum_{n=0}^{M-1} \frac{s^J_i}{s^{J+1}_{\rho(i)} t^{J+1}_{\rho(i)}}\, \alpha^J_i(n)\, \frac{1}{s^J_i}\, \beta^J_i(n)} = P(S^J = m \mid W, \Theta),$   (4.29)

PAGE 36

25

and from the equations (4.18), (4.22) and (4.25):

$\hat P(S^J = m \mid S^{J+1} = n, W, \Theta) = \frac{1}{K} \sum_{i=0}^{K-1} \frac{\hat\beta^J_i(m)\, a^J_{i,\rho(i)}(m,n)}{\hat\beta^J_{\rho(i),i}(n)} = \frac{1}{K} \sum_{i=0}^{K-1} \frac{\frac{1}{s^J_i}\, \beta^J_i(m)\, a^J_{i,\rho(i)}(m,n)}{\frac{1}{s^J_i}\, \beta^J_{\rho(i),i}(n)} = P(S^J = m \mid S^{J+1} = n, W, \Theta).$   (4.30)

Clearly, the equations (4.29) and (4.30) show that it is not necessary to perform additional computation due to scaling. Thus, the substitution of the scaled $\hat\alpha$'s and $\hat\beta$'s in the M step does not change the results of the expressions (4.17)-(4.20).

For the next, $(l+1)$-th, iteration step of the EM algorithm, we use the computed $\Theta = \Theta^l$ of the $l$-th step. In this way the EM converges to a local maximum of the joint pdf $f(W \mid \Theta)$. There are various criteria for when to stop the iteration. In order to simplify the computation, we compare the values of the components of $\Theta^l$ and $\Theta^{l+1}$. If they differ by less than some preset value, then the iteration is stopped.

After training the HMT, we are able to compute the joint pdf of all pixels and to classify images using a Bayes classifier. In the next chapter we explain how to obtain the joint pdf of an image and how to classify it.
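Equations (4.29)-(4.30) say that the per-node scale factors cancel in the normalized posteriors; a quick numerical check with arbitrary constants:

```python
M = 2
alpha = [0.3, 0.7]
beta = [2.0e-5, 5.0e-5]
# Scaling multiplies a node's alpha and beta by constants that do not depend
# on the state m, so they cancel in the normalized ratio (arbitrary factors).
alpha_hat = [a * 417.0 for a in alpha]
beta_hat = [b * 3.1e-2 for b in beta]

def posterior(a_, b_):
    # P(S = m | W) proportional to alpha(m) * beta(m), normalized over m.
    z = sum(a_[m] * b_[m] for m in range(M))
    return [a_[m] * b_[m] / z for m in range(M)]

p_raw = posterior(alpha, beta)
p_scaled = posterior(alpha_hat, beta_hat)
```

The two posteriors agree to floating-point precision, which is exactly the claim of (4.29).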

PAGE 37

CHAPTER 5
MULTISCALE SEGMENTATION

The overall statistical model consists of Hidden Markov Trees $T_f$ that assign a node $i$ to each coefficient[1] $w^J_{i,T_f}$, at a scale $J$, of the same feature $f$. We assume that the features are mutually independent. In other words, connecting coefficients that belong only to the same feature $f$, we obtain nine mutually independent probability trees $T_f$, $f \in \mathcal{F} = \{\pm 15°, \pm 45°, \pm 75°, H, S, I\}$. Each tree $T_f$ represents, in fact, a distinct HMT model, which can be trained separately from the other trees, as explained in the previous chapter. After training, we perform Bayesian classification, where the independent trees $T_f$ are unified into a unique statistical model.

As a result of training the HMT for sky and ground images, we obtain the model parameter vectors $\Theta_s$ and $\Theta_g$, respectively. It follows that, for each coefficient $w^J_{i,T_f}$ of the tree $T_f$, it is possible to compute the likelihood for the class sky as

$p(w^J_{i,T_f} \mid \Theta^f_s) = \sum_m p(w^J_{i,T_f}, S^J_{T_f} = m \mid \Theta^f_s) = \sum_m p(w^J_{i,T_f} \mid S^J_{T_f}, \Theta^f_s)\, P(S^J_{T_f} \mid \Theta^f_s) = \sum_m p(w^J_{i,T_f} \mid S^J_{T_f}, \Theta^f_s)\, P(S^{L-1}_{T_f} = m \mid \Theta^f_s) \prod_J P(S^J_{T_f} = m \mid S^{J+1}_{T_f} = n),$   (5.1)

where $p(w^J_{i,T_f} \mid S^J_{T_f}, \Theta^f_s)$ represents the Gaussian distribution, fully characterized by $\mu^J_{f,s}$ and $\sigma^J_{f,s}$, which are the components of the known $\Theta_s$. Clearly, from the assumption that the trees $T_f$ are independent, for all $f \in \mathcal{F}$, the likelihood of any

[1] Here, coefficient refers to the magnitude of a CWT coefficient and/or the H, S and I color values.

26
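For a single coefficient, the marginalization in (5.1) reduces to a Gaussian mixture over the M hidden states; a minimal sketch with invented mixture parameters:

```python
import math

def gauss(w, mu, sigma):
    # Gaussian pdf N(w; mu, sigma)
    return math.exp(-0.5 * ((w - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def mixture_likelihood(w, p_state, mu, sigma):
    # p(w) = sum_m P(S = m) * N(w; mu_m, sigma_m): the state-marginalized
    # Gaussian mixture that Eq. (5.1) evaluates for one coefficient.
    return sum(p * gauss(w, m_, s_) for p, m_, s_ in zip(p_state, mu, sigma))

# Hypothetical two-state mixture, evaluated at w = 0.
like = mixture_likelihood(0.0, [0.5, 0.5], [0.0, 3.0], [1.0, 1.0])
```

With these numbers, the first mixture component dominates, so the likelihood is close to half the peak of a unit Gaussian.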

PAGE 38

27

coefficient $w^J_i$, at a scale $J$, given a class, say sky, can be computed as

$p(w^J_i \mid \Theta_s) = \prod_{f \in \mathcal{F}} p(w^J_{i,T_f} \mid \Theta^f_s).$   (5.2)

Consequently, we are able to perform Bayesian classification at all scales, and to combine these results to perform segmentation of an image.

Under successful image segmentation, we imply both accurate classification of large, homogeneous regions, and distinct detection of their boundaries. To achieve this goal, both large- and small-scale neighborhoods should be analyzed. Therefore, we implement a multiscale segmentation algorithm, based on the algorithm proposed by Choi and Baraniuk [18].

Denoting with $W^J$ all coefficients from all features at a scale $J$, $W^J = \{w^J_i\}$, and with $\Omega^J$ the collection of all class labels at the scale $J$, $\Omega^J = \{\omega^J_i\}$, where $\omega^J_i \in \{s, g\}$, the classification is performed according to the MAP rule, as follows:

$\hat\Omega^J = \arg\max_{\Omega^J} \{P(\Omega^J \mid W^J)\},$   (5.3)

$\quad\;\;\, = \arg\max_{\Omega^J} \{p(W^J \mid \Omega^J)\, P(\Omega^J)\}.$   (5.4)

Thus, our goal is to compute the discriminant function $g(W^J, \Omega^J) = p(W^J \mid \Omega^J)\, P(\Omega^J)$ in (5.4), which we present further in the text.

Assuming that a class label $\omega^J_i$ completely determines the distribution of the corresponding coefficient $w^J_i$, it follows that all coefficients are mutually independent, given their class labels:

$p(W^J \mid \Omega^J) = \prod_{i \in J} p(w^J_i \mid \omega^J_i),$   (5.5)

where $p(w^J_i \mid \omega^J_i)$ is already known from (5.2).

To compute the joint probability $P(\Omega^J)$, we assume that the distribution of $\Omega^J$ is completely determined by $\Omega^{J+1}$ at the coarser $J+1$ scale. Here, we once again introduce a new Markov tree, where class labels $\omega^J_i$ play a role analogous to
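The product over the nine independent feature trees in (5.2), followed by a comparison of the two class likelihoods, can be sketched as follows (all likelihood values are made up):

```python
from math import prod

# Hypothetical per-feature likelihoods p(w | class) for one coefficient, one
# value per feature in F = {+-15 deg, +-45 deg, +-75 deg, H, S, I} (9 trees).
p_sky =    [0.8, 0.7, 0.9, 0.6, 0.5, 0.7, 0.8, 0.6, 0.9]
p_ground = [0.2, 0.4, 0.1, 0.5, 0.6, 0.3, 0.3, 0.5, 0.2]

# Eq. (5.2): likelihood of the coefficient = product over independent trees.
like_sky = prod(p_sky)
like_ground = prod(p_ground)

# Pick the class with the larger likelihood (flat priors assumed here).
label = 's' if like_sky > like_ground else 'g'
```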

PAGE 39

28

Figure 5.1  The context vector.

the hidden states $S^J_i$ in the HMT. Combined with the Markov property that, given $\omega^{J+1}_{\rho(i)}$, $\omega^J_i$ is independent of all $\omega^J_k$, $k \neq i$, the Markov chain rule reads:

$P(\Omega^J) = P(\Omega^L) \prod_{i \in J} \prod_{j=J}^{L-1} P(\omega^j_i \mid \Omega^{j+1}),$   (5.6)

where $L$ denotes the coarsest level, which is still able to provide statistically reliable segmentations $\Omega^L$.

The conditional probability $P(\omega^j_i \mid \Omega^{j+1})$ in (5.6), being unknown in general, must be estimated using a prohibitive amount of data. In order to surmount this problem, we introduce contexts [18]. To each coefficient $w^J_i$, with the hidden class label $\omega^J_i$, we assign the context $c^J_i$, which represents the information on $\Omega^{J+1}$. The choice of $c^J_i$, poised between complexity and accuracy, in our case, is a vector that consists of the two components:

1. the class label $\omega^{J+1}_{\rho(i)}$ of the corresponding direct parent (as in the HMT),

2. the majority vote of the class labels $\omega^{J+1}_{\{\rho(i)\}}$, where $\{\rho(i)\}$ denotes the eight neighboring coefficients of the direct parent $\omega^{J+1}_{\rho(i)}$.

Recall that in the HMT a coefficient $w^J_i$ has exactly one parent $w^{J+1}_{\rho(i)}$ and four children $w^{J-1}_{\chi(i)}$. It follows that all four neighboring coefficients at a scale $J$ have the same context vector $c^J_i = [\omega^{J+1}_{\rho(i)},\, \omega^{J+1}_{\{\rho(i)\}}]$, as illustrated in Figure 5.1.
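The two-component context vector can be computed directly from the label map at scale J+1. The helper below is a hypothetical implementation (with simple boundary clipping) that returns [parent's label, majority label of the parent's eight neighbors]:

```python
def context(parent_labels, r, c):
    # parent_labels: 2-D grid of class labels ('s' or 'g') at scale J+1;
    # (r, c): position of the direct parent of the coefficient in question.
    own = parent_labels[r][c]
    votes = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the parent itself
            rr, cc = r + dr, c + dc
            if 0 <= rr < len(parent_labels) and 0 <= cc < len(parent_labels[0]):
                votes.append(parent_labels[rr][cc])
    majority = max(set(votes), key=votes.count)  # majority vote of neighbors
    return [own, majority]
```

For an interior parent whose neighborhood is mostly sky, the context is its own label plus 's', regardless of its own value.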

PAGE 40

29

We assume that $c^J_i$ represents a reliable source of information on the distribution of all class labels at the $J+1$ level. Thus, we rewrite the equation (5.6) as

$P(\Omega^J) = P(\Omega^L) \prod_{i \in J} \prod_{j=J}^{L-1} P(\omega^j_i \mid c^j_i).$   (5.7)

Finally, we derive a more convenient expression for the discriminant function from (5.4) as

$g(W^J, \Omega^J) = P(\Omega^L) \prod_{i \in J} p(w^J_i \mid \omega^J_i) \prod_{j=J}^{L-1} P(\omega^j_i \mid c^j_i).$   (5.8)

The unknown transition probabilities $P(\omega^J_i \mid c^J_i)$ can be obtained by maximizing the discriminant function (5.8) in cascades, employing the EM algorithm at each scale $J$ [18]. Using the already known likelihoods $p(w^J_i \mid \omega^J_i)$ from training, $P(\omega^J_i \mid c^J_i)$ is estimated in the ML sense by averaging over the entire scale $J$, because it is reasonable to assume that the transition probabilities are equal for all coefficients at the same scale.

The initial parameters for the EM at the level $L$ are $p(w^L_i \mid \omega^L_i)$, obtained in training (5.2). Also, for the coarsest level $L$, we assume that $P(\omega^L_i \mid c^L_i) = 1$ for all coefficients. Then, the EM is performed in cascades towards the finer scales, estimating $P(\omega^J_i \mid c^J_i)$ and $P(\omega^J_i)$, until the finest level $J = 0$ is reached. Experimental results show that the algorithm converges rapidly if the initial parameters are set to the values estimated at the previous coarser scale.

Once the transition probabilities $P(\omega^J_i \mid c^J_i)$ are estimated for all class labels and context values, it is possible to perform image segmentation, using the MAP rule as follows:

$\hat\omega^0_i = \arg\max_{\omega^0_i \in \{s,g\}} P(\omega^0_i)\, P(\omega^0_i \mid c^0_i)\, p(w^0_i \mid \omega^0_i).$   (5.9)

In the next chapter we present experimental results for sky/ground image segmentation, using the proposed computer vision system.
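The final MAP rule (5.9) is just an argmax over the two classes of the product of prior, context-transition probability, and likelihood; all numbers below are hypothetical:

```python
# Per-pixel MAP decision at the finest scale, Eq. (5.9), with invented values.
classes = ['s', 'g']
prior      = {'s': 0.55, 'g': 0.45}   # P(omega)
trans      = {'s': 0.90, 'g': 0.20}   # P(omega | context) for this pixel's context
likelihood = {'s': 0.02, 'g': 0.07}   # p(w | omega) from the trained HMTs

score = {k: prior[k] * trans[k] * likelihood[k] for k in classes}
label = max(classes, key=lambda k: score[k])
```

Here the strong context support for sky outweighs the slightly larger ground likelihood.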

PAGE 41

CHAPTER 6
EXPERIMENTAL RESULTS

In our experiments, first, the HMT model is trained using a database of training images. We recorded two sets of 500 images. One set represents only the sky and the other only the ground. Images are carefully chosen to account for great variability within classes, as illustrated in Figure 6.1.

Then, the trained HMT is tested on another set of test images. There are two types of testing images. The first testing set contains images similar to the training database. The second type of testing images consists of flight images from our MAVs, where both the sky and ground appear. In the following sections we present the experimental set-up and results.

6.1 Real-Time Realization Problems

In order to implement the HMT-based Bayes classifier in real time, which is our ultimate goal, the amount of computation must be as small as possible. There are several avenues to decrease the amount of computation. The greatest computational load comes from the EM algorithm. It is known that, given M training images of N pixels each, the total computational cost per EM iteration is O(MN) [7].

Thus, first, we reduce N, the number of pixels to be analyzed. Since the fast CWT algorithm requires that image dimensions be a power of 2, we subsample each 640 x 480 image in the database to dimensions 128 x 128. Hence, the maximum number of levels in the HMT is L = log2(128) = 7. Segmentation results show that the 128 x 128 image resolution is sufficient for reliable classification.

30
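The preprocessing arithmetic above can be sketched as follows; the nearest-neighbor subsampler is a stand-in, since the thesis does not specify the decimation filter it used:

```python
import math

def levels(n):
    # Number of dyadic decomposition levels for an n x n image (n a power of 2).
    assert n > 0 and n & (n - 1) == 0
    return int(math.log2(n))

def subsample(img, out_h, out_w):
    # Nearest-neighbor subsampling: a stand-in for the actual decimation used.
    in_h, in_w = len(img), len(img[0])
    return [[img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]
```

For 128 x 128 inputs, levels(128) gives the L = 7 quoted above.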

PAGE 42

31

Figure 6.1  Training images of the sky and ground.

Second, by reducing the number of HMT parameters, we lower the number M of images necessary for training the HMT. Recall that we compute only the components of Θ for all coefficients at a scale J, not for each particular coefficient w^J_i. This reduces the number of iteration steps for the EM to converge. In our experiments, the convergence criterion compares the values of the components of Θ^{l+1} with the values from the previous iteration step in Θ^l. The EM algorithm stops if

|Θ^{l+1} − Θ^l| < ε,

where ε should not be an arbitrarily small value, because that would lead to overfitting the HMT model. If ε = 10^{-4}, convergence is achieved in 5 iteration steps, in the worst case.

Finally, an intelligent choice of initial values also reduces the number of iteration steps. Appropriate setting of all the parameters and code optimization made the execution of training relatively fast. Training on 900 images (450 of each class), on an Athlon 900 MHz PC, takes less than 3 minutes, and it takes less than 1 minute to test 100 images (50 of each class).
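The stopping rule above amounts to a component-wise comparison of successive parameter vectors; a minimal sketch:

```python
def converged(theta_old, theta_new, eps=1e-4):
    # Stop the EM iteration once every parameter component moves by less
    # than eps between successive iterations.
    return all(abs(a - b) < eps for a, b in zip(theta_old, theta_new))
```

A larger eps stops earlier, which, as noted above, also guards against overfitting.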

PAGE 43

32

6.2 Classification of the First Type of Test Images

Having trained the HMT on the training set, we obtained parameters Θ for all features f ∈ {±15°, ±45°, ±75°, H, S, I}. Then, we tested the performance of our computer vision system on the first type of test images, where either the sky or the ground is present. Here, the goal is to recognize the whole image either as the sky or as the ground.

First, we show that the choice of H, S, I color features is justified. As we have already pointed out, the R, G, B color features exhibit significant statistical dependence. Here, we present the classification error of our algorithm for both R, G, B features and H, S, I features in Table 6.1 and Table 6.2. Better results are achieved when the number of possible color states is increased. However, a better model is computationally more expensive.

Table 6.1  The performance of the RGB-based classifier for ground test images.

class    number of states    error HSI    error RGB
ground           2               4%           6%
                 3               2%           4%
                 4               2%           4%
                 5               2%           4%
                 6               2%           4%

Table 6.2  The performance of the RGB-based classifier for sky test images.

class    number of states    error HSI    error RGB
sky              2              12%          26%
                 3              10%          18%
                 4               6%          16%
                 5               6%          12%
                 6               6%          10%

Comparing the classification results for the HSI-based and RGB-based classifiers, presented in Figure 6.2, we conclude that the HSI color feature space is superior to the RGB space.

PAGE 44

33

Figure 6.2  The classification error using RGB and HSI color spaces.

The performance of the wavelet-based classifier, in the case when only the CWT features are used (f ∈ {±15°, ±45°, ±75°}), is presented in Table 6.3.

Table 6.3  Performance of the wavelet-based classifier.

class     error CWT
ground       2%
sky          6%

From Tables 6.1, 6.2, and 6.3, we conclude that even when the number of states is increased, the color-based classifier performs worse than the wavelet-based classifier. Therefore, for reliable classification, both color and texture features have to be incorporated.

6.3 Segmentation of the Second Type of Test Images

Having trained the HMT for all features f ∈ {±15°, ±45°, ±75°, H, S, I}, we tested the performance of our computer vision system on the second type of test images. This set of images consists of flight images from our MAVs, as well as other difficult-to-segment sky/ground images. Here, the goal is to segment an image into sky and ground regions. The image segmentation is presented in Figures 6.3-6.5.

PAGE 45

34

Figure 6.3  A MAV's flight: favorable sky/ground patterns.

Figure 6.4  Water surface with ice patches similar to clouds.

Figure 6.5  A mountain view with fuzzy horizon line.

PAGE 46

35

Finally, we illustrate the generality of our algorithm in segmentation of cut/uncut grass regions (see Figure 6.6). We present the original image (left), the subsampled image (center), and the segmentation of the subsampled image (right). These images[1] show a grass lawn from the perspective of a camera mounted on an autonomous lawn mower. Even at image resolutions as low as 64 x 64, we achieve satisfactory results at very fast processing speeds.

Figure 6.6  Segmentation of cut and uncut grass regions.

[1] We thank Rand Chandler for giving us access to his image database.

PAGE 47

CHAPTER 7
CONCLUSION

Segmentation of complex image classes, such as sky and ground, demands an elaborate consideration of class properties. Clearly, in some cases, color provides sufficient information for sky and ground detection. However, due to video noise and/or unfavorable class patterns, both color and texture clues are necessary for successful recognition.

In this thesis, first, we presented our choice of features, consisting of H, S and I values from the HSI color space, and CWT coefficients. In our experiments, the HSI color space proved to be superior to the RGB space. Also, in the early stage of considering the method for texture analysis, experimental results showed that the DWT (realized with Daubechies wavelets) is inferior to the CWT.

Second, we showed the implementation of the HMT model and the training steps for obtaining its parameters. The contribution of this thesis is reflected in the derivation of the formulae which must be used in the EM algorithm to surmount the implementation issues, specifically the numerical underflow. We proposed a method where dynamic scaling is performed for the β values only, whereas in the literature [7, 11, 15] other approaches are used.

Further, we described how the learned parameter set could be used for computing likelihoods of all nodes at all scales of our HMT. We then developed multiscale Bayesian classification for our application.

The most important contribution of this thesis is reflected in the successful implementation of color features into the HMT framework. In the relevant literature [7, 11, 18], the HMT model is related only to wavelets. Incorporating color features resulted

36

PAGE 48

37

in more reliable image segmentation, which is illustrated for diverse sky/ground images and for cut/uncut grass images. We incorporated in our design results from the available literature, modifying the original algorithms for our purposes where appropriate.

We designed our computer vision system with real-time constraints in mind. Therefore, some aspects are suboptimal with respect to the classification error.

PAGE 49

REFERENCES

[1] Scott M. Ettinger, Michael C. Nechyba, Peter G. Ifju, and Martin Waszak, "Vision-guided flight stability and control for Micro Air Vehicles," in Proc. IEEE Int. Conference on Intelligent Robots and Systems, Lausanne, October 2002.

[2] Richard O. Duda, Peter E. Hart, and David G. Stork, Pattern Classification, 2nd edition, John Wiley & Sons, New York, 2000.

[3] Heng-Da Cheng, Xihua Jiang, Ying Sun, and Jingli Wang, "Color image segmentation: advances and prospects," Pattern Recognition, vol. 34, pp. 2259-2281, 2001.

[4] Trygve Randen and Hakon Husoy, "Filtering for texture classification: A comparative study," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, no. 4, pp. 291-310, 1999.

[5] Stephane Mallat, A Wavelet Tour of Signal Processing, 2nd edition, Academic Press, San Diego, 2001.

[6] Stephane G. Mallat, "A theory for multiresolution signal decomposition: the wavelet representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 674-693, 1989.

[7] Matthew S. Crouse, Robert D. Nowak, and Richard G. Baraniuk, "Wavelet-based statistical signal processing using Hidden Markov Models," IEEE Transactions on Signal Processing, vol. 46, no. 4, pp. 886-902, 1998.

[8] Jerome M. Shapiro, "Embedded image coding using zerotrees of wavelet coefficients," IEEE Transactions on Signal Processing, vol. 41, no. 12, pp. 3445-3462, 1993.

[9] Nick Kingsbury, "Image processing with complex wavelets," Philosophical Transactions Royal Society London, vol. 357, no. 1760, pp. 2543-2560, 1999.

[10] Michael Unser, "Texture classification and segmentation using wavelet frames," IEEE Transactions on Image Processing, vol. 4, no. 11, pp. 1549-1560, 1995.

[11] Diego Clonda, Jean-Marc Lina, and Bernard Goulard, "Mixed memory model for image processing and modeling with complex Daubechies wavelets," in Proc. SPIE's International Symposium on Optical Science and Technology, San Diego, August 2000.

38

PAGE 50

39

[12] Nick Kingsbury, "Complex wavelets for shift invariant analysis and filtering of signals," Journal of Applied and Computational Harmonic Analysis, vol. 10, no. 3, pp. 234-253, 2001.

[13] Judea Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Mateo, 1988.

[14] Justin K. Romberg, Hyeokho Choi, and Richard G. Baraniuk, "Bayesian tree-structured image modeling using wavelet-domain Hidden Markov Models," IEEE Transactions on Image Processing, vol. 10, no. 7, pp. 1056-1068, 2001.

[15] Lawrence R. Rabiner, "A tutorial on Hidden Markov Models and selected applications in speech recognition," Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, 1989.

[16] Geoffrey J. McLachlan and Krishnan T. Thriyambakam, The EM Algorithm and Extensions, John Wiley & Sons, New York, 1996.

[17] Helmut Lucke, "Which stochastic models allow Baum-Welch training?," IEEE Transactions on Signal Processing, vol. 44, no. 11, pp. 2746-2756, 1996.

[18] Hyeokho Choi and Richard G. Baraniuk, "Multiscale image segmentation using wavelet-domain Hidden Markov Models," IEEE Transactions on Image Processing, vol. 10, no. 9, pp. 1309-1321, 2001.

PAGE 51

BIOGRAPHICAL SKETCH

In 1994, Sinisa Todorovic graduated from the School of Electrical Engineering, the University of Belgrade, Yugoslavia, with a major in electronics and telecommunications. From 1994 until 2001, he worked as a software engineer in the communications industry. Specifically, from 1998 to 2001, as an employee of Siemens, he worked on a huge network installation project in Russia.

In fall 2001, Sinisa Todorovic was admitted to the master's degree program at the Department of Electrical and Computer Engineering, University of Florida, Gainesville. His main field of interest, throughout his graduate studies, was computer vision and pattern recognition.

40


Permanent Link: http://ufdc.ufl.edu/UFE0000616/00001

Material Information

Title: Statistical modeling and segmentation of sky/ground images
Physical Description: Mixed Material
Creator: Todorovic, Sinisa ( Author, Primary )
Publication Date: 2002
Copyright Date: 2002

Record Information

Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
System ID: UFE0000616:00001





Full Text










STATISTICAL MODELING AND SEGMENTATION
OF SKY/GROUND IMAGES















By

SINISA TODOROVIC


A THESIS PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT
OF THE REQUIREMENTS FOR THE DEGREE OF
MASTER OF SCIENCE

UNIVERSITY OF FLORIDA


2002














ACKNOWLEDGMENTS


Herewith, I would like to express my sincere gratitude to Dr. Michael Nechyba

for his wise guidance of my research for this thesis. As my advisor, Dr. Nechyba

has been directing but on no account confining my interests. Therefore, I have

had the impression of complete freedom, which, in my opinion, is a prerequisite for

fruitful scientific research. I especially appreciate his readiness and expertise to help

me solve numerous implementation issues. Most importantly, I am thankful for the

friendship that we have developed collaborating on this work.

Also, I thank Dr. Antonio Arroyo, whose brilliant lectures on machine in-

telligence have inspired me to endeavor research in the field of robotics. As the

director of the Machine Intelligence Lab (MIL), Dr. Arroyo has created a warm,

friendly, and hard working atmosphere among the "MIL-ers." Thanks to him, I

have decided to join the MIL, which has proved on numerous occasions to be the

right decision. Therefore, I thank Dr. Arroyo and all the members of the MIL for

helping me assimilate faster to the new environment.

I thank Dr. Peter Ifju and his team of students at the Mechanical and

Aerospace Engineering Department for enabling me to participate in their micro air

vehicle (MAV) project. My research has been initially inspired by the mutual work

of Dr. Nechyba and Dr. Ifju. Certainly, the theory developed in the thesis would

have been futile had there not been the perfect application: the MAV.

Finally, special thanks go to Ashish Jain for innumerable interventions in cases

in which the Linux operating system has been my top adversary.















TABLE OF CONTENTS

page

ACKNOWLEDGMENTS ................................................ ii

LIST OF TABLES ................................................. v

LIST OF FIGURES ................................................ vi

KEY TO ABBREVIATIONS ........................................... vii

KEY TO SYMBOLS ................................................. viii

ABSTRACT ....................................................... x

1 INTRODUCTION ................................................. 1

1.1 Overview of the Proposed Computer Vision System ............ 2
1.2 Overview of the Thesis ..................................... 3

2 FEATURE SPACE ................................................ 5

2.1 Color ...................................................... 5
2.2 Texture .................................................... 5
2.2.1 Wavelet Transform ........................................ 6
2.2.2 Wavelet Properties ....................................... 6
2.2.3 Complex Wavelet Transform ................................ 9
2.2.4 Color Multi-resolution Representation .................... 12

3 STATISTICAL MODELING ......................................... 13

3.1 Marginal Density ........................................... 13
3.2 Graph Models ............................................... 14
3.3 HMT Model .................................................. 15
3.4 HMT Properties ............................................. 16

4 TRAINING OF THE HMT MODEL .................................... 18

4.1 Implementation of the EM Algorithm ......................... 18
4.1.1 E Step ................................................... 20
4.1.2 Up Step .................................................. 20
4.1.3 Down Step ................................................ 21
4.1.4 M Step ................................................... 21
4.2 Implementation Issues ...................................... 22
4.2.1 Scaled Up Step ........................................... 23
4.2.2 Scaled Down Step ......................................... 24

5 MULTISCALE SEGMENTATION ...................................... 26

6 EXPERIMENTAL RESULTS ......................................... 30

6.1 Real-Time Realization Problems ............................. 30
6.2 Classification of the First Type of Test Images ............ 32
6.3 Segmentation of the Second Type of Test Images ............. 33

7 CONCLUSION ................................................... 36

REFERENCES ..................................................... 38

BIOGRAPHICAL SKETCH ............................................ 40















LIST OF TABLES


Table page

2.1 Coefficients of the filters used in the Q-shift DTCWT. . ... 10

6.1 The performance of the RGB-based classifier for ground test images. 32

6.2 The performance of the RGB-based classifier for sky test images. 32

6.3 Performance of the wavelet-based classifier. ............. 33















LIST OF FIGURES

Figure page

1.1 A computer vision system .................. ..... 2

2.1 Two levels of the DWT of a two-dimensional signal. . . ... 7

2.2 The original image (left) and its two-scale dyadic DWT (right). .. 7

2.3 The Q-shift Dual-Tree CWT. .................. 9

2.4 The CWT is strongly oriented at angles ±15°, ±45°, ±75°. ..... 11

3.1 The HMT of the three-level CWT. ................. 14

5.1 The context vector. .................. ....... 28

6.1 Training images of the sky and ground. ............... 31

6.2 The classification error using RGB and HSI color spaces. ...... .. 33

6.3 A MAV's flight: favorable sky/ground patterns. . . 34

6.4 Water surface with ice patches similar to clouds. . . 34

6.5 A mountain view with fuzzy horizon line. ............ 34

6.6 Segmentation of cut and uncut grass regions. ............. 35














KEY TO ABBREVIATIONS


B: The blue feature of the RGB color space

CWT: Complex Wavelet Transform

DWT: Discrete Wavelet Transform

G: The green feature of the RGB color space

H: The hue feature of the HSI color space

HMT: Hidden Markov Tree

HSI: The color space that consists of hue, saturation and intensity features

I: The intensity feature of the HSI color space

MAP: Maximum A Posteori

MAV: Micro Air Vehicle

pdf: Probability density function

R: The red feature of the RGB color space

RGB: The color space that consists of red, green and blue features

S: The saturation feature of the HSI color space














KEY TO SYMBOLS


The list shown below gives a brief description of the mathematical

symbols defined in this work. For each symbol, the page number corresponds to the

place where the symbol is first used.

a^J_{i,ρ(i)}(m,n), The transition probability that the node i at the scale J is in a
state m, given that its parent node is in a state n. ............. 16

α^J_i(m), A joint probability of the tree under the root node 0 with the removed
sub-tree under the node i at the scale J and that the node i is in the state
m. .............................................................. 18

α̂^J_i(m), The dynamically scaled α^J_i(m) value. ................ 23

β^J_i(m), A probability of the whole tree under the node i at the scale J, given
that the node i is in a state m. ................................ 18

β̂^J_i(m), The dynamically scaled β^J_i(m) value. ................ 23

β^J_{ρ(i),i}(m), A probability of the whole tree under the node i at the scale J,
given that the parent node ρ(i) is in a state m. ................ 18

β^{J+1}_{ρ(i)\i}(m), A probability of the cut tree under the parent node ρ(i) at the
scale J+1 with all the children nodes removed, given that the node ρ(i) is in
a state m. ...................................................... 18

χ(i), Children nodes at the finer scale J−1 of the node i at the scale J. .... 15

c^J_i, The context vector of the coefficient i at the scale J. ... 28

g(W^J, Ω^J), The discriminant function. ......................... 27

J, The number of the scale. ..................................... 15

M, The number of states in the Gaussian mixture model. .......... 13

μ^J_i(m), The mean value of the random variable w^J_i. .......... 16

ω^J_i, The class label of the coefficient i at the scale J. ..... 27

Ω^J, The collection of all class labels at the scale J. ......... 27

ρ(i), A parent node at the coarser scale J+1 of the node i at the scale J. .. 15

S^{J−1}_{χ(i)}, A random variable representing the state of a child node at the scale
J−1. ............................................................ 16

σ^J_i(m), The standard deviation of the random variable w^J_i. .. 16

S^J_i, A random variable representing the state of a coefficient at the position i
and at the scale J. ............................................. 16

S^{J+1}_{ρ(i)}, A random variable representing the state of a parent node at the
scale J+1. ...................................................... 16

Θ, A random vector that consists of all parameters which determine the HMT
model. .......................................................... 16

Θ_g, All parameters of the statistical model for the ground. .... 26

Θ_s, All parameters of the statistical model for the sky. ....... 26

W_HH, The set of wavelet coefficients of the DWT obtained filtering rows with
highpass and columns with highpass filters. ..................... 6

W_HL, The set of wavelet coefficients of the DWT obtained filtering rows with
highpass and columns with lowpass filters. ...................... 6

W_LH, The set of wavelet coefficients of the DWT obtained filtering rows with
lowpass and columns with highpass filters. ...................... 6

w^J_i, A random variable representing the magnitude of a coefficient of the CWT
or HSI at the position i and at the scale J. .................... 16

W^J, The collection of all random variables w^J_i at the scale J. 15

x_max, The maximum number of columns of an image. ............... 15

y_max, The maximum number of rows of an image. .................. 15
ymax, The maximum number of rows of an image. ............ 15















Abstract of Thesis Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Master of Science



STATISTICAL MODELING AND SEGMENTATION
OF SKY/GROUND IMAGES




By

Sinisa Todorovic

December 2002

Chair: Michael C. Nechyba
Major Department: Electrical and Computer Engineering

In this thesis, a novel computer vision algorithm for statistical image modeling

is proposed. The necessity for such an algorithm initially came as a requirement

for vision-based stability and control of micro air vehicles (MAVs). The MAV

flight autonomy can be achieved by real-time horizon tracking, using the on-board

vision system. Occasionally, this algorithm fails in scenarios where the underlying

Gaussian assumption for the sky and ground appearances is not appropriate.

Therefore, we propose a statistical modeling framework to build prior models of

the sky and ground. Once trained, these models can be incorporated into the

existing horizon-tracking algorithm for MAVs. Since the appearances of the sky

and ground vary enormously, no single feature is sufficient for accurate modeling.

As such, we rely both on color and texture as critical features in our modeling

framework. Specifically, we choose the hue, saturation and intensity (HSI) color

system for color representation, and the complex wavelet transform (CWT) for









texture representation. We then use the Hidden Markov Tree (HMT) model,

as the underlying statistical model over our feature space. With this approach,

we have achieved reliable and robust image segmentation of flight images from

on-board our MAVs as well as on more difficult-to-classify sky/ground images.

Furthermore, we demonstrate the generality of our modeling framework through

another segmentation task: classifying the cut/uncut grass regions on images taken

by an on-board camera of a lawn mower.















CHAPTER 1
INTRODUCTION


Recently, the researchers in the Department of Mechanical and Aerospace

Engineering and in the Department of Electrical and Computer Engineering, at

the University of Florida, Gainesville, have successfully implemented a vision-

based horizon tracking algorithm for flight stability and control of Micro Air

Vehicles (MAVs) [1]. The horizon tracking algorithm works well, especially when

the sky and ground distributions are relatively coherent. Occasionally, however,

horizon detection fails in scenarios where the underlying Gaussian assumption

for the sky and ground appearances is not appropriate. Moreover, the horizon

detection algorithm is bootstrapped by assuming that initially the sky occupies

the upper part of the image. For complex mission scenarios, this may be an

incorrect assumption with potentially fatal consequences to the flight vehicle.

For example, in the case of deploying MAVs on munitions for post-impact bomb

damage assessment, a MAV would separate from the munition prior to impact,

and an upright attitude with respect to the ground cannot be guaranteed. Correct

identification of sky and ground, therefore, becomes imperative.

While modeling the appearance of sky and ground regions in images may

seem intuitively easy, it is, in fact, a very challenging task. Depending on lighting,

weather, landscape, etc., the appearance of the sky and ground can vary enor-

mously. Moreover, in some cases, even human perception fails, tricked by ambiguous

patterns of sky and ground. For instance, the images of water surface reflecting

the sky or mountain tops covered with snow represent the extreme cases, in which

human detection of sky and ground is unreliable.









Consequently, given the complex variations in our two image classes (i.e.,

sky and ground), careful consideration must be given to selecting sufficiently

discriminating features and a sufficiently expressive modeling framework. Next we

present the overview of our approach.

1.1 Overview of the Proposed Computer Vision System

In this thesis, we propose a computer vision algorithm for image segmentation.

The proposed vision system represents, in fact, a Bayes classifier, as illustrated

in Figure 1.1. A Bayes classifier assigns a pixel to a particular class if the a posteriori

probability of the class given the pixel value is maximum. Designing a Bayes

classifier guarantees the best performance, subject to the correctness of available

statistical models [2]. Therefore, it is necessary to statistically model interdepen-

dencies among pixels as accurately as possible. This is achieved in the computer

vision system, whose components are explained further in the text.

The first module is a preprocessing block, which subsamples the input image

and thus reduces the overall computation cost.

In the following block, the key features of an image are extracted. Having

experimented with color and texture features separately, we conclude that only the

feature set that includes both color and texture clues enables accurate statistical

modeling for our application. Previous experiments also suggest that it is impor-

tant to represent both local as well as regional interdependencies in the feature

space. As such, we first employ the hue, saturation and intensity (HSI) color space

to represent color. Also, we resort to wavelet-based multi-resolution analysis in


Figure 1.1 A computer vision system (input image -> preprocessing -> feature extraction -> computation of statistical parameters -> Bayes classifier -> segmented image).









the form of the Complex Wavelet Transform (CWT). Thus, from the original complex

representation of image classes, we obtain a finite set of features.

Given our feature selection, we then choose the Hidden Markov Tree (HMT)

model as our underlying statistical model. Since the true distribution of the chosen

feature set is not known, it is necessary to statistically model the interdependencies

among the features. Therefore in the following block, we estimate statistical

parameters which completely characterize the underlying distribution of the

features.

Once, the model parameters are computed from the available training data,

it is possible to find the likelihood of image classes for each pixel. Employing the

maximum a posteriori (MAP) criterion, we assign a class label to each pixel, and,

thus, perform segmentation.
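The MAP labeling step described above can be sketched in a few lines of NumPy. This is an illustrative fragment, not the thesis code: the per-class likelihood maps and priors (`likelihoods`, `priors`) stand in for the quantities the statistical model would supply.

```python
import numpy as np

def map_segment(likelihoods, priors):
    """Assign each pixel the class maximizing p(x | class) * P(class).

    likelihoods: dict, class name -> (H, W) array of p(pixel | class)
    priors:      dict, class name -> scalar prior P(class)
    Returns an (H, W) array of class indices, in dict-key order.
    """
    classes = list(likelihoods)
    # Unnormalized posteriors; the common evidence term cancels in argmax.
    post = np.stack([likelihoods[c] * priors[c] for c in classes])
    return np.argmax(post, axis=0)

# Toy 2x2 "image": bright pixels look like sky, dark ones like ground.
brightness = np.array([[0.9, 0.8],
                       [0.2, 0.1]])
labels = map_segment({"sky": brightness, "ground": 1.0 - brightness},
                     {"sky": 0.5, "ground": 0.5})
```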

In conclusion, the mutual dependence of all modules in the system should also

be emphasized. The system performs poorly overall if only one of the components

is badly designed. For instance, a perfect set of features and a sophisticated

statistical model could yield horrible results if only the segmentation block is

bad. Therefore, one should always bear in mind that modularity, as depicted in

Figure 1.1, serves only for presentation purposes.

1.2 Overview of the Thesis

In the following chapters we discuss the main components of our computer

vision system.

First, in Chapter 2, we explain our choice of the feature space for statistical

modeling and review fundamental theory, necessary for better understanding of

our work. Specifically, we give an overview of the most important aspects of the

HSI color system and the CWT. Next, in Chapter 3 we introduce the properties of

the Hidden Markov Model, and in Chapter 4 we develop the algorithm for training

HMT models, pointing out implementation issues. Further, in Chapter 5, we








explain the procedure for multiscale image segmentation. In Chapter 6, we present

experimental results. We illustrate the reliability and robustness of our approach

on examples of flight images from on-board our MAVs as well as on more difficult-

to-classify sky/ground images. Furthermore, we demonstrate the generality of our

modeling framework through another segmentation task: classifying the cut/uncut

grass regions on images taken by an on-board camera of a lawn mower. Finally, we

discuss the experimental results in Chapter 7.















CHAPTER 2
FEATURE SPACE


For our statistical models, we seek to identify features that lead to improved

segmentation performance without unnecessarily increasing computational com-

plexity. Color or texture clues by themselves yield poor segmentation results.

Therefore, we consider a feature space that spans both color and texture domains.

2.1 Color

The color information in a video signal is usually encoded in the RGB color

space. Unfortunately, the R, G and B color channels are highly correlated; there-

fore, we choose the HSI space as a more appropriate color representation [3].

Though HSI exhibits numerical instability at low saturation, this problem can be

easily overcome by assigning some application-dependent, predefined values to the

hue (H), intensity (I) and saturation (S) features. Thus, for sky/ground segmenta-

tion, we assume that low S and/or I values most likely appear for the ground class,

which then sets H value appropriately.

2.2 Texture

For the choice of texture-based features, we have considered several filtering,

model-based and statistical methods for texture feature extraction. Our conclusion

complies with the comparative study of Randen and Husoy [4] that for problems

with many textures with subtle spectral differences, as in the case of our complex

classes, it is reasonable to assume that the spectral decomposition by a filter

bank yields consistently superior results over other texture analysis methods. Our

experimental results also suggest that it is crucial to analyze both local as well as









regional properties of texture. As such, we employ the wavelet transform, due to its

inherent representation of texture at different scales and locations.

2.2.1 Wavelet Transform

Wavelet atom functions, being well localized both in space and frequency,

retrieve texture information quite successfully [5]. The conventional discrete

wavelet transform (DWT) may be regarded as equivalent to filtering the input

signal with a bank of bandpass filters, whose impulse responses are all given by

scaled versions of a mother wavelet. The scaling factor between adjacent filters

is 2:1, leading to octave bandwidths and center frequencies that are one octave

apart. The octave-band DWT is most efficiently implemented by the dyadic

wavelet decomposition tree of Mallat [6], where wavelet coefficients of an image are

obtained convolving every row and column with impulse responses of lowpass and

highpass filters, as shown in Figure 2.1. Practically, coefficients of one scale are

obtained convolving every second row and column from the previous finer scale.

Thus, the filter output is a wavelet subimage that has four times fewer coefficients

than the one at the previous scale. The lowpass filter is denoted with H0 and

the highpass filter with H1. The wavelet coefficients W carry the subscript L for the

lowpass output and H for the highpass output.

Separable filtering of rows and columns produces four subimages at each level,

which can be arranged as shown in Figure 2.2. The same figure also illustrates well

the directional selectivity of the DWT, because WLH, WHL and WHH, bandpass

subimages can select horizontal, vertical and diagonal edges, respectively.
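One level of this separable row/column filtering can be sketched with the 2-tap Haar pair (the shortest wavelet filter, chosen here only for brevity; subband naming conventions vary between references):

```python
import numpy as np

def haar_dwt2(x):
    """One level of the separable 2-D Haar DWT.

    Rows and then columns are filtered with the lowpass [1, 1]/sqrt(2)
    and highpass [1, -1]/sqrt(2) pair and decimated by 2, yielding the
    four quarter-size subimages W_LL, W_LH, W_HL, W_HH.
    """
    # Filter + decimate along rows.
    lo = (x[:, 0::2] + x[:, 1::2]) / np.sqrt(2.0)
    hi = (x[:, 0::2] - x[:, 1::2]) / np.sqrt(2.0)
    # Filter + decimate along columns of each row output.
    ll = (lo[0::2, :] + lo[1::2, :]) / np.sqrt(2.0)
    lh = (lo[0::2, :] - lo[1::2, :]) / np.sqrt(2.0)
    hl = (hi[0::2, :] + hi[1::2, :]) / np.sqrt(2.0)
    hh = (hi[0::2, :] - hi[1::2, :]) / np.sqrt(2.0)
    return ll, lh, hl, hh
```

Each subimage has four times fewer coefficients than the input, and the orthonormal filters preserve total energy.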

2.2.2 Wavelet Properties

The following properties of the DWT have made wavelet-based image process-

ing very attractive in recent years [5, 7, 8]:










































Figure 2.1 Two levels of the DWT of a two-dimensional signal.




Figure 2.2 The original image (left) and its two-scale dyadic DWT (right).











a. locality: each wavelet coefficient represents local image content in space and
frequency, because wavelets are well localized simultaneously in space and
frequency
b. multi-resolution: DWT represents an image at different scales of resolution in
the space domain (i.e., in the frequency domain); regions of analysis at one scale
are divided up into four smaller regions at the next finer scale (Fig. 2.2)

c. edge detector: edges of an image are represented by large wavelet coefficients
at the corresponding locations
d. energy compression: wavelet coefficients are large only if edges are present
within the support of the wavelet, which means that the majority of wavelet
coefficients have small values

e. decorrelation: wavelet coefficients are approximately decorrelated, since the
scaled and shifted wavelets form an orthonormal basis; dependencies among
wavelet coefficients are predominantly local
f. clustering: if a particular wavelet coefficient is large/small, then the adjacent
coefficients are very likely to also be large/small
g. persistence: large/small values of wavelet coefficients tend to propagate
through scales
h. non-Gaussian marginal pdf: wavelet coefficients have peaky and long-tailed
marginal distributions; due to the energy compression property only a
few wavelet coefficients have large values, therefore a Gaussian pdf for an
individual coefficient is a poor statistical model
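Properties (d) and (h) are easy to check numerically: for a piecewise-constant signal, only the Haar detail coefficient whose support straddles the edge is nonzero, while the orthonormal transform still preserves energy. A minimal NumPy illustration:

```python
import numpy as np

# A step edge: Haar detail coefficients are nonzero only where the
# support of the wavelet straddles the edge (locality + energy
# compression), so the marginal histogram is peaky around zero.
x = np.concatenate([np.ones(7), 5.0 * np.ones(9)])   # piecewise-constant signal
lo = (x[0::2] + x[1::2]) / np.sqrt(2.0)              # level-1 approximation
hi = (x[0::2] - x[1::2]) / np.sqrt(2.0)              # level-1 detail coefficients
n_large = int(np.sum(np.abs(hi) > 1e-9))             # only the pair at the edge
```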

It is also important to introduce shortcomings of the DWT. Discrete wavelet

decompositions suffer from two main problems, which hamper their use for many

applications, as follows [9]:

a. lack of shift invariance: small shifts in the input signal can cause major
variations in the energy distribution of wavelet coefficients
b. poor directional selectivity: for some applications horizontal, vertical and
diagonal selectivity is insufficient

When we analyze the Fourier spectrum of a signal, we expect the energy in

each frequency bin to be invariant to any shifts of the input. Unfortunately, the

DWT has a significant drawback that the energy distribution between various

wavelet scales depends critically on the position of key features of the input signal,









Figure 2.3 The Q-shift Dual-Tree CWT (tree a and tree b).


whereas ideally dependence is on just the features themselves. Therefore, the real

DWT is unlikely to give consistent results when used in texture analysis.
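This shift sensitivity is easy to demonstrate: moving a step edge by a single sample relative to the decimation grid moves all of the level-1 Haar detail energy in or out of the band. A small NumPy illustration (Haar chosen only for brevity):

```python
import numpy as np

def detail_energy(x):
    """Energy in the level-1 Haar detail band of a 1-D signal."""
    hi = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return float(np.sum(hi ** 2))

# The same step edge in two positions, one sample apart: the detail
# energy at this scale depends on where the edge falls relative to the
# decimation grid, not just on the edge itself.
xa = np.concatenate([np.ones(7), 5.0 * np.ones(9)])  # edge inside a coefficient pair
xb = np.concatenate([np.ones(8), 5.0 * np.ones(8)])  # edge on a pair boundary
ea, eb = detail_energy(xa), detail_energy(xb)
```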

In literature, there are several approaches proposed to overcome this problem

(e.g., Discrete Wavelet Frames [5, 10]), all increasing computational load with

inevitable redundancy in the wavelet domain. In our opinion, the Complex Wavelet

Transform (CWT) offers the best solution providing additional advantages,

described in the following subsection.

2.2.3 Complex Wavelet Transform

The structure of the CWT is the same as in Figure 2.1, except that the CWT

filters have complex coefficients and generate complex output. The output sampling

rates are unchanged from the DWT, but each wavelet coefficient contains a real and

imaginary part, thus a redundancy of 2:1 for one-dimensional signals is introduced.

In our case, for two-dimensional signals, the redundancy becomes 4:1, because

two adjacent quadrants of the spectrum are required to represent fully a real

two-dimensional signal, adding an extra 2:1 factor. This is achieved by additional

filtering with complex conjugates of either the row or column filters [9, 11].











Table 2.1 Coefficients of the filters used in the Q-shift DTCWT.

    H13 (symmetric)    H19 (symmetric)    H6
      -0.0017581         -0.0000706         0.03616384
       0                  0                 0
       0.0222656          0.0013419         ...
      -0.0468750         -0.0018834         ...
      -0.0482422         -0.0071568         0.76027237
       0.2968750          ...               0.58751830
       0.555...           0.0556431         0
       0.2968750         -0.0516881        -0.11430184
      -0.0482422         -0.2997576         0
       ...               -0.2997576         ...

Despite its higher computational cost, we prefer the CWT over the DWT

because of the CWT's following attractive properties. The CWT is shown to possess

almost shift and rotational invariance, given suitably designed biorthogonal or

orthogonal wavelet filters. We implement the Q-shift Dual-Tree CWT scheme,

proposed by Kingsbury [12], as depicted in Figure 2.3. The figure shows the

CWT of only one-dimensional signal x, for clarity. The output of the trees a

and b can be viewed as real and imaginary parts of complex wavelet coefficients,

respectively. Thus, to compute the CWT, we implement two real DWT's (see

Fig. 2.1), obtaining a wavelet frame with redundancy two. As for the DWT,

here, lowpass and highpass filters are denoted with 0 and 1 in index, respectively.

The level 0 comprises odd-length filters H0a(z) = H0b(z) = H13(z) (13 taps)

and H1a(z) = H1b(z) = H19(z) (19 taps). Levels above the level 0 consist of

even-length filters H00a(z) = z^{-1} H6(z^{-1}), H01a(z) = H6(-z), H00b(z) = H6(z),

H01b(z) = z^{-1} H6(-z^{-1}), where the impulse responses of the filters H13, H19 and H6

are given in Table 2.1.





















Figure 2.4 The CWT is strongly oriented at angles ±15°, ±45°, ±75°.

Aside from being shift invariant, the CWT is superior to the DWT in terms of

directional selectivity, too. A two-dimensional CWT produces six bandpass subim-

ages (analogous to the three subimages in the DWT) of complex coefficients at each

level, which are strongly oriented at angles of ±15°, ±45° and ±75°, as illustrated in

Figure 2.4.

Another advantageous property of the CWT emerges in the presence of noise.

The phase and magnitude of the complex wavelet coefficients collaborate in a

nontrivial way to describe data [11]. The phase encodes the coherent (in space and

scale) structure of an image, which is resilient to noise, and the magnitude captures

the strength of local information that could be very susceptible to noise corruption.

Hence, the phase of complex wavelet coefficients might be used as a principal clue

for image denoising. However, our experimental results have shown that phase is

not a good feature choice for sky/ground modeling. Therefore, we consider only

magnitudes.

To conclude, we consider the CWT a suitable tool to analyze image texture.

The CWT inherits the attractive properties of the DWT, such as locality, multi-

resolution, energy compression, etc. Its shift invariance and increased directional

selectivity give additional power to statistical modeling of image classes. Therefore,









we choose, besides the HSI color features, the CWT to complement our feature

space.

2.2.4 Color Multi-resolution Representation

In order to unify the color and texture features into a unique modeling

framework, we also form multi-resolution representation for the H, S and I

features, analogous to the CWT structure. The H, S and I values at coarser scales

are computed as the mean of the corresponding four values at the next higher-

resolution scale. Hence, the H, S and I features also exhibit the clustering and

persistence properties to some extent.
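This 2x2 mean-pooling pyramid can be sketched as follows; `color_pyramid` is an illustrative helper name, not code from the thesis.

```python
import numpy as np

def color_pyramid(plane, levels):
    """Build coarser versions of one color plane (H, S or I), where each
    coarse value is the mean of the corresponding 2x2 block at the next
    finer scale, mirroring the CWT's dyadic structure."""
    pyr = [plane]
    for _ in range(levels):
        p = pyr[-1]
        p = 0.25 * (p[0::2, 0::2] + p[0::2, 1::2] +
                    p[1::2, 0::2] + p[1::2, 1::2])
        pyr.append(p)
    return pyr

# A 4x4 plane reduced twice: the final level is the global mean.
pyr = color_pyramid(np.arange(16.0).reshape(4, 4), 2)
```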

Naturally, the choice of the feature space determines the statistical image

model. In the next chapter, we describe the HMT model as an appropriate

statistical framework for modeling our complex classes in the chosen feature space.















CHAPTER 3
STATISTICAL MODELING


In order to compute the joint pdf of the chosen features, which fully charac-

terizes a class for our Bayes classifier (see Fig. 1.1), we must account for the key

dependencies. In our case, we need to examine thoroughly all the properties of the

CWT and HSI features, and to best incorporate this knowledge into our statistical

model.

3.1 Marginal Density

We model the true marginal pdf of a feature as a mixture of Gaussian pdf's.

Let's refer to the magnitude of a CWT coefficient and/or the H, S and I color

values, simply as coefficient. Then, a coefficient can be in one of the M states,

which results in an M-state Gaussian mixture model. It means that we associate

to each coefficient w_i, at location (x, y), a state random variable S_i.1 In general,

S_i can take on values m \in {0, ..., M-1}. The mixture model is completely

parameterized with the probability mass function (pmf) of the state random

variable, P(S_i = m), and with the means \mu_i(m) and the variances \sigma_i^2(m) of each

state. By increasing the number of states, we can make our model fit arbitrarily

close to real densities with a finite number of discontinuities [7].

Thus, the marginal pdf of a wavelet coefficient w_i is given by

    p(w_i) = \sum_{m=0}^{M-1} P(S_i = m) p(w_i | S_i = m),    (3.1)



1 Note that boldface letters represent random variables and their particular real-
izations are not boldface.










Figure 3.1 The HMT of the three-level CWT.


where

    p(w_i | S_i = m) = \frac{1}{\sqrt{2\pi\sigma_i^2(m)}} \exp\left( -\frac{(w_i - \mu_i(m))^2}{2\sigma_i^2(m)} \right).    (3.2)

Although each coefficient w_i is conditionally Gaussian, given its state random

variable S_i, the overall density is non-Gaussian due to the randomness of S_i.
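A small NumPy sketch of the M-state Gaussian mixture marginal of Eq. (3.1): with a narrow "small" state and a wide "large" state sharing a zero mean, the resulting density has exactly the peaky, long-tailed shape described earlier. The particular pmf, means and variances are illustrative.

```python
import numpy as np

def mixture_pdf(w, pmf, mu, sigma):
    """Marginal pdf of a coefficient under an M-state Gaussian mixture:
    p(w) = sum_m P(S = m) N(w; mu_m, sigma_m^2), as in Eq. (3.1)."""
    w = np.asarray(w, dtype=float)[..., None]
    gauss = (np.exp(-0.5 * ((w - mu) / sigma) ** 2)
             / (sigma * np.sqrt(2.0 * np.pi)))
    return np.sum(pmf * gauss, axis=-1)

# Two-state mixture: mostly "small" coefficients, a few "large" ones.
pmf = np.array([0.8, 0.2])
mu = np.array([0.0, 0.0])
sigma = np.array([0.5, 3.0])
p0 = mixture_pdf(0.0, pmf, mu, sigma)   # sharp peak at zero
```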


3.2 Graph Models


A probabilistic graph associates each random variable with a node in the graph

[13]. Correlation between pairs of variables is represented by links connecting the

corresponding nodes. The chosen feature space can be modeled by a probabilistic

graph in the following way. A node is assigned to each coefficient and an edge is

placed between two mutually dependent nodes. All coefficients that belong to the

same scale are placed in the corresponding horizontal plane. The coefficients of the

finer scales form planes under the current one, and the coefficients that belong to

the coarser scales form planes above the current one, as illustrated in Figure 3.1.

Four adjacent coefficients at one scale have a unique parent belonging to the upper

coarser scale, only within the same feature (shaded area). States S_i are depicted as

white nodes and coefficient values w_i as black nodes.









Thus, we obtain a pyramid structure, because the coarser the scale the fewer

nodes there are. We can develop graphs that can capture arbitrary interdepen-

dencies, but the computational complexity increases substantially for graphs more

complicated than trees. If we keep only horizontal edges, which connect adjacent

nodes of one scale, we obtain the Hidden Markov Chain Model (HMC). By connect-

ing only the adjacent nodes vertically across scale, we obtain the Hidden Markov

Tree Model (HMT).

3.3 HMT Model

The HMT models both the clustering and persistence properties. A tree

structure is formed connecting every parent coefficient2 from the coarser scale with

its children at the finer scale. There are four direct children from one parent node,

and every child has only one parent node, as depicted in Figure 3.1. Leaf nodes

have no children and they are formed from the coefficients at the finest resolution.

Although we do not include horizontal edges among the nodes of one scale in our

model, the clustering property is still well modeled, because the adjacent nodes

at one scale have the same parent. Therefore the local correlation is indirectly

modeled by interscale dependencies [14].

In order to discuss HMT properties, first we need to introduce the following

notation. The random field of all wavelet coefficients at a scale J is denoted by

W^J = {w_i^J}. A node i has a parent node \rho(i) and a set of children nodes \chi(i).

This implies that J(\chi(i)) = J(i) - 1 and J(\rho(i)) = J(i) + 1. The scales are ordered

from the finest J = 0 to the coarsest J = L - 1, where L = \log_2 x_max. In this

project, the dimensions x_max and y_max of an image are assumed to be equal to

facilitate computation of the CWT.



2 Here, coefficient refers to the magnitude of a CWT coefficient and/or the H, S
and I color values.
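With coefficients stored as 2-D arrays per scale, the parent map \rho(i) and the children map \chi(i) reduce to integer index arithmetic; a minimal sketch (the helper names are ours, not the thesis'):

```python
def parent(r, c):
    """Index of the parent node rho(i), one scale coarser (J + 1)."""
    return (r // 2, c // 2)

def children(r, c):
    """Indices of the four child nodes chi(i), one scale finer (J - 1)."""
    return [(2 * r + dr, 2 * c + dc) for dr in (0, 1) for dc in (0, 1)]
```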








3.4 HMT Properties

The HMT imposes that w_i^J is conditionally independent of all other random variables given its associated state S_i^J [15]. Secondly, given a parent state S_{\rho(i)}^{J+1}, w_i^J is independent of \rho(i)'s ancestors. Conversely, given a child state S_{\chi(i)}^{J-1}, w_i^J is independent of the child's descendants. Combining these properties shows that w_i^J is conditionally independent of the entire tree, given only the parent state S_{\rho(i)}^{J+1} and the children states S_{\chi(i)}^{J-1}. A mathematical formulation of these statements follows:

    p(w_i^J | {w_k}_{k \ne i}, {S_k}_{k \ne i}, S_i^J = m) = p(w_i^J | S_i^J = m),    (3.3)

    p(w_i^J | {w_k}_{J(k) > J(\rho(i))}, S_{\rho(i)}^{J+1} = m) = p(w_i^J | S_{\rho(i)}^{J+1} = m),    (3.4)

    p(w_i^J | {w_k}_{J(k) < J(i)}, S_{\chi(i)}^{J-1} = m) = p(w_i^J | S_{\chi(i)}^{J-1} = m).    (3.5)
It is important to note that the Markov structure is related to the states of

coefficients and not to the coefficient values. Further, since the states are never

known exactly, the Markov model is hidden with real continuous outcomes. Also,

unlike the usual HMM treatment in literature [15], here transitions between states

are not related to time. In fact, the HMT deals with transitions between scales

(i.e., horizontal planes) of the pyramidal graph structure.

Using an M-state Gaussian mixture model for each wavelet coefficient wi, the

parameters for the HMT model are

i. P(S_0^{L-1} = m), m \in [0, M-1], the pmf of the root node S_0^{L-1};
ii. a_{i,\rho(i)}^{J,J+1}(m, n) = P(S_i^J = m | S_{\rho(i)}^{J+1} = n), the transition probability that S_i^J
is in a state m given that S_{\rho(i)}^{J+1} is in a state n, where m \in [0, M-1] and
n \in [0, M-1];
iii. \mu_i^J(m) and (\sigma_i^J(m))^2, the mean and variance, respectively, of the wavelet
coefficient w_i^J, given that S_i^J is in a state m, where m \in [0, M-1].

These parameters can be grouped into a model parameter random vector \Theta.







For training the HMT, we employ the expectation maximization (EM)

algorithm [16, 17], which iteratively estimates model parameters \Theta. Under mild

conditions, the iteration converges to a local maximum of the function p(W | \Theta).

The training of the HMT is explained in the following chapter.














CHAPTER 4
TRAINING OF THE HMT MODEL

Our aim is to determine the parameter vector \Theta, given the pyramidal structure of observed coefficients W. Here, the usual EM algorithm for training HMMs [15] is modified for the continuous-outcome HMT.

First, let's introduce the following notation. We define T_i^J to be the subtree of observed coefficients with a root at node w_i^J, at the level J. T_i^J contains the coefficient w_i^J and all its descendants. If T_k^{J(k)} is a subtree of T_i^J (i.e., w_k^{J(k)} and all its descendants are members of T_i^J), then we define T_{i\k}^J to be the set of coefficients obtained by removing the subtree T_k^{J(k)} from T_i^J. Without loss of generality, we order W so that w_0^{L-1} is at the root of the entire tree. Thus, T_0^{L-1} is the entire tree of observed coefficients, and we interchange the symbols T_0^{L-1} and W when convenient.

4.1 Implementation of the EM Algorithm

Customary for the EM algorithm [15], for each subtree T_i^J, we define the following conditional densities:

    \beta_i^J(m) = p(T_i^J | S_i^J = m, \Theta),    (4.1)

    \beta_{i,\rho(i)}^{J+1}(m) = p(T_i^J | S_{\rho(i)}^{J+1} = m, \Theta),    (4.2)

    \beta_{\rho(i)\i}^{J+1}(m) = p(T_{\rho(i)\i}^{J+1} | S_{\rho(i)}^{J+1} = m, \Theta),    (4.3)

and the joint probability

    \alpha_i^J(m) = P(S_i^J = m, T_{0\i}^{L-1} | \Theta),    (4.4)

with S_i^J taking discrete values and T_{0\i}^{L-1} taking continuous values.









To determine the HMT parameter vector \Theta, it is necessary first to compute the state and transition probabilities for all the observed coefficients W: P(S_i^J = m | W, \Theta) and P(S_i^J = m | S_{\rho(i)}^{J+1} = n, W, \Theta). From the HMT properties (3.3), (3.4) and (3.5) it follows that the trees T_i^J and T_{0\i}^{L-1} are independent, given the state random variable S_i^J. This fact, together with the Markov chain rule, leads to the desired expressions in terms of the defined \alpha and \beta. First, we obtain

    P(S_i^J = m, T_0^{L-1} | \Theta) = \alpha_i^J(m) \beta_i^J(m),    (4.5)

    P(S_i^J = m, S_{\rho(i)}^{J+1} = n, T_0^{L-1} | \Theta) = \beta_i^J(m) a_{i,\rho(i)}^{J,J+1}(m, n) \alpha_{\rho(i)}^{J+1}(n) \beta_{\rho(i)\i}^{J+1}(n).    (4.6)

Then, the joint pdf of W can be expressed as

    p(W | \Theta) = \sum_{m=0}^{M-1} P(S_i^J = m, T_0^{L-1} | \Theta) = \sum_{m=0}^{M-1} \alpha_i^J(m) \beta_i^J(m).    (4.7)

Applying the Bayes rule to (4.5)-(4.7) results in the desired conditional probabilities

    P(S_i^J = m | W, \Theta) = \alpha_i^J(m) \beta_i^J(m) / \sum_{n=0}^{M-1} \alpha_i^J(n) \beta_i^J(n),    (4.8)

    P(S_i^J = m, S_{\rho(i)}^{J+1} = n | W, \Theta) = \beta_i^J(m) a_{i,\rho(i)}^{J,J+1}(m, n) \alpha_{\rho(i)}^{J+1}(n) \beta_{\rho(i)\i}^{J+1}(n) / \sum_{n=0}^{M-1} \alpha_i^J(n) \beta_i^J(n).    (4.9)

From (4.8), (4.9) and (4.3) follows the expression for the transition probability

    P(S_i^J = m | S_{\rho(i)}^{J+1} = n, W, \Theta) = P(S_i^J = m, S_{\rho(i)}^{J+1} = n | W, \Theta) / P(S_{\rho(i)}^{J+1} = n | W, \Theta)
                                                   = \beta_i^J(m) a_{i,\rho(i)}^{J,J+1}(m, n) / \beta_{i,\rho(i)}^{J+1}(n).    (4.10)









Apparently, the expressions (4.8) and (4.10) guarantee that the computed probabilities are less than or equal to 1. From the state and transition probabilities, all other components of the random vector \Theta can be computed.

4.1.1 E Step

In determining probabilities for the state variables, the state information is propagated throughout the tree by means of the upward-downward algorithm.1 The up step calculates \beta's by transmitting information about the fine-scale wavelet/color coefficients up to coarser scales. The down step computes \alpha's by propagating information about the coarse-scale wavelet/color coefficients down to finer scales.

In the following subsections we present the up and down steps for the l-th iteration of the EM algorithm.

4.1.2 Up Step

The up part of the EM algorithm consists of the following steps:

1. \forall S_i^0 = m, where m \in {0, ..., M-1}, calculate

    \beta_i^0(m) = N(w_i^0, \mu_i^0(m), \sigma_i^0(m)).    (4.11)

2. \forall S_i^J = m, where m \in {0, ..., M-1}, calculate

    \beta_{i,\rho(i)}^{J+1}(n) = \sum_{m=0}^{M-1} a_{i,\rho(i)}^{J,J+1}(m, n) \beta_i^J(m),    (4.12)

    \beta_{\rho(i)}^{J+1}(n) = N(w_{\rho(i)}^{J+1}, \mu_{\rho(i)}^{J+1}(n), \sigma_{\rho(i)}^{J+1}(n)) \prod_{k \in \chi(\rho(i))} \beta_{k,\rho(i)}^{J+1}(n),    (4.13)

    \beta_{\rho(i)\i}^{J+1}(n) = \beta_{\rho(i)}^{J+1}(n) / \beta_{i,\rho(i)}^{J+1}(n).    (4.14)




1 E step of the EM algorithm is often referred to as the forward-backward al-
gorithm in the HMM literature and as the upward-downward algorithm in the AI
literature. Since the tree structure imposes moves up and down in the E step com-
putation, here we use the latter term.









3. Move up the tree, setting J = J + 1.
4. If J = L, then stop; else return to step 2.

4.1.3 Down Step

The down part of the EM algorithm consists of the following steps:

1. \forall S_0^{L-1} = m, where m \in {0, ..., M-1}, set

    \alpha_0^{L-1}(m) = P(S_0^{L-1} = m | W, \Theta = \Theta_l).    (4.15)

2. Move down the tree, setting J = J - 1.
3. \forall S_i^J = m, where m \in {0, ..., M-1}, calculate

    \alpha_i^J(m) = \sum_{n=0}^{M-1} a_{i,\rho(i)}^{J,J+1}(m, n) \alpha_{\rho(i)}^{J+1}(n) \beta_{\rho(i)\i}^{J+1}(n).    (4.16)

4. If J = 0, then stop; else return to step 2.
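To make the recursions concrete, here is a toy two-level tree (one root with four children, M = 2 states, parameters shared within a scale) run through the unscaled up and down steps. The numbers are invented for illustration, and for this tiny tree the root's alpha is simply taken to be its state pmf; the posteriors of Eq. (4.8) then sum to one at every node.

```python
import numpy as np

def gauss(w, mu, sig):
    """Gaussian density N(w; mu, sig^2), evaluated elementwise."""
    return np.exp(-0.5 * ((w - mu) / sig) ** 2) / (sig * np.sqrt(2.0 * np.pi))

prior = np.array([0.5, 0.5])              # pmf of the root state
A = np.array([[0.9, 0.1],                 # A[m, n] = P(child = m | parent = n)
              [0.1, 0.9]])
mu = np.array([0.0, 4.0])
sig = np.array([1.0, 1.0])
w_child = np.array([0.1, -0.2, 3.9, 4.2])  # observed leaf coefficients
w_root = 3.8                               # observed root coefficient

# Up step (Eqs. 4.11-4.14): betas at the leaves, then at the root.
beta_child = gauss(w_child[:, None], mu, sig)      # beta_i(m), shape (4, M)
beta_child_par = beta_child @ A                    # beta_{i,rho(i)}(n)
beta_root = gauss(w_root, mu, sig) * np.prod(beta_child_par, axis=0)
beta_excl = beta_root / beta_child_par             # beta_{rho(i)\i}(n)

# Down step (Eqs. 4.15-4.16); root alpha approximated by its prior pmf.
alpha_root = prior
alpha_child = (alpha_root * beta_excl) @ A.T       # alpha_i(m)

# Posterior state probabilities (Eq. 4.8) sum to one at each node.
post_root = alpha_root * beta_root
post_root = post_root / post_root.sum()
post_child = alpha_child * beta_child
post_child = post_child / post_child.sum(axis=1, keepdims=True)
```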

4.1.4 M Step

Once \alpha and \beta values are computed for all nodes, the maximization step of the EM algorithm follows straightforwardly. The state and transition probabilities could be computed applying (4.8) and (4.10). We could then find \mu_i^J(m) and \sigma_i^J(m) for all coefficients. However, all that would make our HMT model too complex.

In order to simplify and to avoid the risk of overfitting the HMT model, we assume that the statistical parameters are equal for all coefficients belonging to the same scale. It means that all nodes at the scale J are characterized by the following unique parameters: P(S^J = m | W, \Theta), \mu^J(m), \sigma^J(m), and P(S^J = m | S^{J+1} = n, W, \Theta). Here, S^J denotes the state random variable for all nodes i at the same scale J.









Applying the assumption in the M step results in the following expressions for all K nodes at the scale J, for the l-th iteration:

    P(S^J = m | W, \Theta) = (1/K) \sum_{i=0}^{K-1} P(S_i^J = m | W, \Theta = \Theta_{l-1})
                           = (1/K) \sum_{i=0}^{K-1} \alpha_i^J(m) \beta_i^J(m) / \sum_{n=0}^{M-1} \alpha_i^J(n) \beta_i^J(n),    (4.17)

    P(S^J = m | S^{J+1} = n, W, \Theta) = (1/K) \sum_{i=0}^{K-1} P(S_i^J = m | S_{\rho(i)}^{J+1} = n, W, \Theta = \Theta_{l-1})
                                        = (1/K) \sum_{i=0}^{K-1} \beta_i^J(m) a_{i,\rho(i)}^{J,J+1}(m, n) / \beta_{i,\rho(i)}^{J+1}(n),    (4.18)

    \mu^J(m) = \sum_{i=0}^{K-1} w_i^J P(S_i^J = m | W, \Theta) / (K P(S^J = m | W, \Theta)),    (4.19)

    (\sigma^J(m))^2 = \sum_{i=0}^{K-1} (w_i^J - \mu^J(m))^2 P(S_i^J = m | W, \Theta) / (K P(S^J = m | W, \Theta)).    (4.20)


4.2 Implementation Issues

The computer implementation of the upward-downward algorithm, if the equations (4.11)-(4.16) are directly used, suffers from numerical underflow. Unfortunately, it is not possible to implement logarithm scaling. In order to make the algorithm stable, we need to introduce adaptive scaling of \beta values. The dynamic scaling is performed for \beta values only. Since \beta's are computed only in the up step, we need only to rewrite the equations (4.11)-(4.14). The scaled values are denoted with \hat{\beta}. In the following subsections we present the scaled up-step and down-step algorithms.










4.2.1 Scaled Up Step

The scaled up part of the EM algorithm consists of the following steps:

1. \forall i at the scale J = 0, compute

    \beta_i^0(m) = N(w_i^0, \mu_i^0(m), \sigma_i^0(m)),

    s_i^0 = \sum_{m=0}^{M-1} \beta_i^0(m),    (scaling factor for \beta's)

    \hat{\beta}_i^0(m) = \beta_i^0(m) / s_i^0.    (4.21)

2. \forall i at a scale J, compute

    \hat{\beta}_{i,\rho(i)}^{J+1}(n) = \sum_{m=0}^{M-1} a_{i,\rho(i)}^{J,J+1}(m, n) \hat{\beta}_i^J(m),    (4.22)

    \beta_{\rho(i)}^{J+1}(n) = N(w_{\rho(i)}^{J+1}, \mu_{\rho(i)}^{J+1}(n), \sigma_{\rho(i)}^{J+1}(n)) \prod_{k \in \chi(\rho(i))} \hat{\beta}_{k,\rho(i)}^{J+1}(n),    (4.23)

    s_{\rho(i)}^{J+1} = \sum_{m=0}^{M-1} \beta_{\rho(i)}^{J+1}(m),    (4.24)

    \hat{\beta}_{\rho(i)}^{J+1}(m) = \beta_{\rho(i)}^{J+1}(m) / s_{\rho(i)}^{J+1},    (4.25)

    \hat{\beta}_{\rho(i)\i}^{J+1}(n) = \hat{\beta}_{\rho(i)}^{J+1}(n) / \hat{\beta}_{i,\rho(i)}^{J+1}(n).    (4.26)

3. J = J + 1.

4. If J = L, then stop; else return to step 2.

From (4.21) and (4.25) it is obvious that scaling at each step of the up algorithm limits the dynamics of \hat{\beta} values to the interval [0, 1]. If we carefully observe the equation (4.16) in the down-step algorithm, \alpha values need not be scaled, because limiting \hat{\beta}_{\rho(i)\i}^{J+1}(m) stabilizes the \hat{\alpha}'s in (4.16), as well.
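The need for this scaling is easy to demonstrate numerically: repeated products of small beta-like values underflow to zero in double precision after a few dozen levels, while per-node normalization keeps every value in [0, 1] and, since Eq. (4.8) is a ratio, leaves the posterior unchanged. A schematic NumPy illustration (not the actual recursion):

```python
import numpy as np

# Unscaled betas shrink geometrically with tree depth; normalizing each
# one to sum to 1 keeps every value in [0, 1] while the posterior, being
# a ratio, is unaffected by the common scaling factor s.
beta = np.array([1e-5, 3e-5])
alpha = np.array([0.4, 0.6])

unscaled = beta.copy()
scaled = beta / beta.sum()
for _ in range(80):                    # mimic repeated products up the tree
    unscaled = unscaled * beta
    scaled = scaled * beta
    scaled = scaled / scaled.sum()     # per-node scaling factor s

post_unscaled = alpha * unscaled       # underflows to [0, 0]
post_scaled = alpha * scaled
post_scaled = post_scaled / post_scaled.sum()
```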

4.2.2 Scaled Down Step

The scaled down part of the EM algorithm consists of the following steps:

1. For J = L - 1, compute

    t_0^{L-1} = 1,    (scaling factor for \hat{\alpha}'s)

    \hat{\alpha}_0^{L-1}(m) = t_0^{L-1} \alpha_0^{L-1}(m).    (4.27)

2. J = J - 1.

3. \forall i, J(i) = J, compute

    \hat{\alpha}_i^J(m) = \sum_{n=0}^{M-1} a_{i,\rho(i)}^{J,J+1}(m, n) \hat{\alpha}_{\rho(i)}^{J+1}(n) \hat{\beta}_{\rho(i)\i}^{J+1}(n),

    t_i^J = t_{\rho(i)}^{J+1} s_{\rho(i)}^{J+1}.    (4.28)
4. If J = 0, then stop; else return to step 2.

Finally, it is necessary to check the expressions for the state and transition probabilities to account for the scaled \hat{\alpha}'s and \hat{\beta}'s. From the equations (4.17), (4.25) and (4.28) it follows that

    (1/K) \sum_{i=0}^{K-1} \hat{\alpha}_i^J(m) \hat{\beta}_i^J(m) / \sum_{n=0}^{M-1} \hat{\alpha}_i^J(n) \hat{\beta}_i^J(n)
        = (1/K) \sum_{i=0}^{K-1} \alpha_i^J(m) \beta_i^J(m) / \sum_{n=0}^{M-1} \alpha_i^J(n) \beta_i^J(n)
        = P(S^J = m | W, \Theta),    (4.29)

since the common scaling factors cancel in the ratio, and from the equations (4.18), (4.22) and (4.25)

    (1/K) \sum_{i=0}^{K-1} \hat{\beta}_i^J(m) a_{i,\rho(i)}^{J,J+1}(m, n) / \hat{\beta}_{i,\rho(i)}^{J+1}(n) = P(S^J = m | S^{J+1} = n, W, \Theta).    (4.30)

Clearly, the equations (4.29) and (4.30) show that it is not necessary to perform additional computation due to scaling. Thus, the substitution of the scaled \hat{\alpha}'s and \hat{\beta}'s in the M step does not change the results of the expressions (4.17)-(4.20).

For the next, (l + 1)-th, iteration step of the EM algorithm, we use the computed \Theta = \Theta_l of the l-th step. In this way the EM converges to a local maximum of the joint pdf p(W | \Theta). There are various criteria for when to stop the iteration. In order to simplify the computation, we compare the values of the components of \Theta_l and \Theta_{l+1}. If they differ by less than some preset value, then the iteration is stopped.

After training the HMT, we are able to compute the joint pdf of all pixels and

to classify images using a Bayes classifier. In the next chapter we explain how to

obtain the joint pdf of an image and how to classify it.














CHAPTER 5
MULTISCALE SEGMENTATION

The overall statistical model consists of Hidden Markov Trees T^f that assign a node i to each coefficient1 w_i^J at a scale J, of the same feature f. We assume that the features are mutually independent. In other words, connecting coefficients that belong only to the same feature f, we obtain nine mutually independent probability trees T^f, f \in F = {±15°, ±45°, ±75°, H, S, I}. Each tree T^f represents, in fact, a distinct HMT model, which can be trained separately from the other trees, as explained in the previous chapter. After training, we perform Bayesian classification, where the independent trees T^f are unified into a unique statistical model.

As a result of training the HMT for sky and ground images, we obtain the model parameter vectors Θ^s and Θ^g, respectively. It follows that, for each coefficient w_i^J of the tree T^f, it is possible to compute the likelihood for the class sky as

$$ p(w_i^J \mid \Theta^s) = \sum_{m=0}^{M-1} p(w_i^J \mid S_i^J = m, \Theta^s) \, P(S_i^J = m \mid \Theta^s), \qquad (5.1) $$

where p(w_i^J | S_i^J = m, Θ^s) represents the Gaussian distribution, fully characterized by (μ_i^J)^s and (σ_i^J)^s, which are components of the known Θ^s. Clearly, from the assumption that the trees T^f are independent, for all f ∈ F, the likelihood of any


1 Here, coefficient refers to the magnitude of a CWT coefficient and/or the H, S
and I color values.









coefficient w_i at a scale J, given the class sky, can be computed as

$$ p(w_i \mid \Theta^s) = \prod_{f \in F} p(w_i^f \mid \Theta^s_f). \qquad (5.2) $$

Consequently, we are able to perform Bayes classification at all scales, and to combine these results to perform segmentation of an image.
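Equations (5.1) and (5.2) amount to evaluating a Gaussian mixture per feature and multiplying across the independent feature trees. The sketch below illustrates this; the parameter layout (per-feature tuples of means, standard deviations and state probabilities) and all names are hypothetical, not the thesis implementation.

```python
import numpy as np

def coeff_likelihood(w, mu, sigma, P_state):
    """Eq. (5.1): mixture likelihood of one coefficient w at a given scale.
    mu, sigma, P_state are (M,) arrays: per-state Gaussian parameters and
    state probabilities P(S = m), taken from a trained parameter vector."""
    gauss = np.exp(-0.5 * ((w - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
    return float(np.sum(gauss * P_state))

def pixel_likelihood(w_by_feature, theta):
    """Eq. (5.2): product over the mutually independent feature trees."""
    p = 1.0
    for f, w in w_by_feature.items():
        mu, sigma, P_state = theta[f]
        p *= coeff_likelihood(w, mu, sigma, P_state)
    return p
```

In practice one would compute this in the log domain to avoid underflow, for the same reason scaling was introduced in the EM training.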

By successful image segmentation, we mean both accurate classification of

large, homogeneous regions, and distinct detection of their boundaries. To achieve

this goal, both large and small scale neighborhoods should be analyzed. Therefore,

we implement a multiscale segmentation algorithm, based on the algorithm

proposed by Choi and Baraniuk [18].

Denoting by W^J all coefficients from all features at a scale J, W^J = {w_i^J}, and by Ω^J the collection of all class labels at the scale J, Ω^J = {ω_i^J}, where ω_i^J ∈ {s, g}, the classification is performed according to the MAP rule, as follows:

$$ \hat{\Omega}^J = \arg\max_{\Omega^J} \{ P(\Omega^J \mid W^J) \} \qquad (5.3) $$
$$ \qquad\;\; = \arg\max_{\Omega^J} \{ p(W^J \mid \Omega^J) \, P(\Omega^J) \}. \qquad (5.4) $$

Thus, our goal is to compute the discriminant function g(W^J, Ω^J) = p(W^J | Ω^J) P(Ω^J) in (5.4), which we present further in the text.

Assuming that a class label ω_i^J completely determines the distribution of the corresponding coefficient w_i^J, it follows that all coefficients are mutually independent, given their class labels:

$$ p(W^J \mid \Omega^J) = \prod_{i \in J} p(w_i^J \mid \omega_i^J), \qquad (5.5) $$

where p(w_i^J | ω_i^J) is already known from (5.2).

To compute the joint probability P(Ω^J), we assume that the distribution of Ω^J is completely determined by Ω^{J+1} at the coarser J + 1 scale. Here, we once again introduce a new Markov tree, where the class labels ω_i^J play a role analogous to









Figure 5.1 The context vector.


the hidden states S in the HMT. Combined with the Markov property that, given Ω^{J+1}, ω_i^J is independent of all ω_k^J, k ≠ i, the Markov chain rule reads:

$$ P(\Omega^J) = \prod_{i \in L} P(\omega_i^L) \prod_{j=J}^{L-1} P(\omega_i^j \mid \Omega^{j+1}), \qquad (5.6) $$

where L denotes the coarsest level which is still able to provide statistically reliable segmentations Ω^L.

The conditional probability P(ω_i^J | Ω^{J+1}) in (5.6), being unknown in general, would have to be estimated from a prohibitive amount of data. In order to surmount this problem, we introduce contexts [18]. To each coefficient w_i^J, with the hidden class label ω_i^J, we assign the context c_i^J, which represents the information on Ω^{J+1}. The choice of c_i^J, poised between complexity and accuracy, is in our case a vector that consists of the two components:

1. the class label ω_{ρ(i)}^{J+1} of the corresponding direct parent (as in the HMT),

2. the majority vote of the class labels ω_k^{J+1}, where k ranges over the eight coefficients neighboring the direct parent ρ(i).

Recall that in the HMT a coefficient w_i^J has exactly one parent w_{ρ(i)}^{J+1} and four children at the scale J - 1. It follows that all four children of one parent at a scale J have the same context vector c_i^J = [ω_{ρ(i)}^{J+1}, vote{ω_k^{J+1}}], as illustrated in Figure 5.1.
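The context construction above can be sketched as follows. This is a minimal sketch under an assumed array layout: the class labels at the coarser scale are given as a 2-D array, and every parent's four children inherit the same two-component context; the function name is hypothetical.

```python
import numpy as np
from collections import Counter

def context_vectors(labels_coarse):
    """Build c_i = [parent label, majority vote of the parent's (up to)
    eight neighbours] for every coefficient at the next finer scale.
    labels_coarse: 2-D array of class labels (e.g. 's'/'g') at scale J + 1.
    Returns an array of shape (2H, 2W, 2): all four children of a parent
    share the same context vector, as in Figure 5.1."""
    H, W = labels_coarse.shape
    ctx = np.empty((2 * H, 2 * W, 2), dtype=labels_coarse.dtype)
    for r in range(H):
        for c in range(W):
            # collect the (up to) eight neighbours of the parent (r, c)
            neigh = [labels_coarse[rr, cc]
                     for rr in range(max(r - 1, 0), min(r + 2, H))
                     for cc in range(max(c - 1, 0), min(c + 2, W))
                     if (rr, cc) != (r, c)]
            vote = Counter(neigh).most_common(1)[0][0]
            # all four children of (r, c) get the same context
            ctx[2 * r:2 * r + 2, 2 * c:2 * c + 2] = (labels_coarse[r, c], vote)
    return ctx
```

Border parents simply have fewer than eight neighbours; the majority vote is taken over whatever neighbours exist.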









We assume that c_i^J represents a reliable source of information on the distribution of all class labels at the J + 1 level. Thus, we rewrite the equation (5.6) as

$$ P(\Omega^J) \approx P(\Omega^L) \prod_{i \in J} \prod_{j=J}^{L-1} P(\omega_i^j \mid c_i^j). \qquad (5.7) $$

Finally, we derive a more convenient expression for the discriminant function from (5.4) as

$$ g(W^J, \Omega^J) = P(\Omega^L) \prod_{i \in J} p(w_i^J \mid \omega_i^J) \prod_{j=J}^{L-1} P(\omega_i^j \mid c_i^j). \qquad (5.8) $$
The unknown transition probabilities P(ω_i^j | c_i^j) can be obtained by maximizing the discriminant function (5.8) in cascades, employing the EM algorithm at each scale J [18]. Using the likelihood p(w_i^J | ω_i^J) already known from training, P(ω_i^j | c_i^j) is estimated in the ML sense by averaging over the entire scale J, because it is reasonable to assume that the transition probabilities are equal for all coefficients at the same scale.

The initial parameters for the EM at the level L are the likelihoods p(w_i^L | ω_i^L) obtained in training (5.2). Also, for the coarsest level L, we assume that P(ω_i^L | c_i^L) = 1 for all coefficients. Then, the EM is performed in cascades towards the finer scales, estimating P(ω_i^j | c_i^j) and P(ω_i^j), until the finest level J = 0 is reached. Experimental results show that the algorithm converges rapidly if the initial parameters are set to the values estimated at the previous coarser scale.

Once the transition probabilities P(ω_i^J | c_i^J) are estimated for all class labels and context values, it is possible to perform image segmentation, using the MAP rule as follows:

$$ \hat{\omega}_i^J = \arg\max_{\omega \in \{s,g\}} \{ P(\omega) \, P(\omega \mid c_i^J) \, p(w_i^J \mid \omega) \}. \qquad (5.9) $$
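Per coefficient, the MAP rule (5.9) is a simple arg max over the two classes. A sketch (hypothetical names; the context-weighted priors and likelihoods are assumed to be precomputed by the cascaded EM and the trained sky/ground HMTs):

```python
def map_label(ctx_prob, likelihood, classes=("s", "g")):
    """MAP decision of eq. (5.9) for a single coefficient.
    ctx_prob[omega]  : P(omega) * P(omega | c_i), hypothetical precomputed dict
    likelihood[omega]: p(w_i | omega), from the trained sky/ground models"""
    return max(classes, key=lambda omega: ctx_prob[omega] * likelihood[omega])
```

Applying `map_label` at the finest scale J = 0 yields the per-pixel sky/ground segmentation.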
In the next chapter we present experimental results for sky/ground image

segmentation, using the proposed computer vision system.














CHAPTER 6
EXPERIMENTAL RESULTS


In our experiments, first, the HMT model is trained using a database of training images. We recorded two sets of 500 images: one set contains only sky images and the other only ground images. The images are carefully chosen to account for great variability within the classes, as illustrated in Figure 6.1.

Then, the trained HMT is tested on another set of test images. There are two types of test images. The first test set contains images similar to the training database. The second type consists of flight images of our MAVs, where both the sky and the ground appear. In the following sections we present the experimental set-up and results.

6.1 Real-Time Realization Problems

In order to implement the HMT-based Bayes classifier in real-time, which

is our ultimate goal, the amount of computation must be as small as possible.

There are several avenues to decrease the amount of computation. The greatest

computational load comes from the EM algorithm. It is known that given M

training images of N pixels each, the total computational cost per EM iteration is

O(MN) [7].

Thus, first, we reduce N, the number of pixels to be analyzed. Since the fast CWT algorithm requires image dimensions to be a power of 2, we subsample each 640 x 480 image in the database to dimensions 128 x 128. Hence, the maximum number of levels in the HMT is L = log_2 128 = 7. Segmentation results show that the 128 x 128 image resolution is sufficient for reliable classification.
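The dyadic constraint and the resulting tree depth can be expressed compactly (a sketch; `hmt_levels` is a hypothetical helper name):

```python
def hmt_levels(side):
    """Number of dyadic scales for a square image of side `side`."""
    # the fast CWT requires a power-of-2 side
    assert side > 0 and side & (side - 1) == 0, "side must be a power of 2"
    return side.bit_length() - 1   # log2(side)

# subsampling the 640 x 480 source frames to 128 x 128 gives L = 7 levels
```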




























Figure 6.1 Training images of the sky and ground.


Second, by reducing the number of HMT parameters, we lower the number M of images necessary for training the HMT. Recall that we compute the components of Θ jointly for all coefficients at a scale J, not for each particular coefficient w_i^J. This also reduces the number of iteration steps for the EM to converge. In our experiments, the convergence criterion compares the values of the components of Θ_{l+1} with the values from the previous iteration step in Θ_l. The EM algorithm stops if

$$ \| \Theta_{l+1} - \Theta_l \| < \varepsilon, $$

where ε should not be an arbitrarily small value, because that would lead to overfitting the HMT model. For ε = 10^{-4}, convergence is achieved in 5 iteration steps in the worst case.
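This stopping rule can be sketched as follows (the dict-of-arrays parameter layout is an assumption, not the thesis code):

```python
import numpy as np

def em_converged(theta_new, theta_old, eps=1e-4):
    """True when every component of Theta changed by less than eps,
    i.e. the stopping criterion described above."""
    return all(np.max(np.abs(theta_new[k] - theta_old[k])) < eps
               for k in theta_new)
```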

Finally, an intelligent choice of initial values also reduces the number of iteration steps. Appropriate setting of all the parameters and code optimization made the execution of training relatively fast. Training on 900 images (450 of each class) on an Athlon 900 MHz PC takes less than 3 minutes, and testing 100 images (50 of each class) takes less than 1 minute.









6.2 Classification of the First Type of Test Images

Having trained the HMT on the training set, we obtained the parameters Θ for all features f ∈ {±15°, ±45°, ±75°, H, S, I}. Then, we tested the performance of

our computer vision system on the first type of test images, where either the sky or

ground is present. Here, the goal is to recognize the whole image either as the sky

or as the ground.

First, we show that the choice of H, S, I color features is justified. As we

have already pointed out, the R, G, B color features exhibit significant statistical

dependence. Here, we present the classification error of our algorithm for both R,

G, B features and H, S, I features in Table 6.1 and Table 6.2. Better results are

achieved when the number of possible color states is increased. However, a better

model is computationally more expensive.

Table 6.1 The performance of the color-based classifiers for ground test images.

class     number of states    error HSI [%]    error RGB [%]
ground            2                 4                6
ground            3                 2                4
ground            4                 2                4
ground            5                 2                4
ground            6                 2                4


Table 6.2 The performance of the color-based classifiers for sky test images.

class     number of states    error HSI [%]    error RGB [%]
sky               2                12               26
sky               3                10               18
sky               4                 6               16
sky               5                 6               12
sky               6                 6               10


Comparing the classification results for HSI-based and RGB-based classifier,

presented in Figure 6.2, we conclude that the HSI color feature space is superior to

the RGB space.










[Plot: the color-based classification error for sky images, for the RGB and HSI feature spaces, as a function of the number of states.]

Figure 6.2 The classification error using RGB and HSI color spaces.


The performance of the wavelet-based classifier, for the case when only the CWT features are used (f ∈ {±15°, ±45°, ±75°}), is presented in Table 6.3.

Table 6.3 The performance of the wavelet-based classifier.

class     error CWT [%]
ground          2
sky             6



From Tables 6.1, 6.2, and 6.3, we conclude that, even when the number of states is increased, the color-based classifier performs worse than the wavelet-based classifier. Therefore, for reliable classification, both color and texture features have to be incorporated.


6.3 Segmentation of the Second Type of Test Images


Having trained the HMT for all features f ∈ {±15°, ±45°, ±75°, H, S, I}, we tested the performance of our computer vision system on the second type of test images. This set consists of flight images of our MAVs, as well as other difficult-to-segment sky/ground images. Here, the goal is to segment an image into sky and ground regions. The segmentation results are presented in Figures 6.3-6.5.
































Figure 6.3 A MAV's
flight: favorable
sky/ground patterns.


Figure 6.4 Water surface
with ice patches similar
to clouds.







Figure 6.5 A mountain
view with fuzzy horizon
line.









Finally, we illustrate the generality of our algorithm in segmentation of

cut/uncut grass regions (see Figure 6.6). We present the original image (left), the

subsampled image (center), and segmentation of the subsampled image (right).

These images1 show a grass lawn from the perspective of a camera mounted on an

autonomous lawn mower. Even at image resolutions as low as 64x64, we achieve

satisfactory results at very fast processing speeds.












Figure 6.6 Segmentation of cut and uncut grass regions.


1 We thank Rand ('!i i~i1.' I for giving us access to his image database.















CHAPTER 7
CONCLUSION


Segmentation of complex image classes, such as sky and ground, demands elaborate consideration of class properties. Clearly, in some cases, color provides sufficient information for sky and ground detection. However, due to video noise and/or unfavorable class patterns, both color and texture cues are necessary for successful recognition.

In this thesis, first, we presented our choice of features, consisting of the H, S and I values from the HSI color space, and the CWT coefficients. In our experiments, the HSI color space proved to be superior to the RGB space. Also, in the early stage of selecting the method for texture analysis, experimental results showed that the DWT (realized with Daubechies wavelets) is inferior to the CWT.

Second, we presented the implementation of the HMT model and the training steps for obtaining its parameters. One contribution of this thesis is the derivation of the formulae which must be used in the EM algorithm to surmount the implementation issues, specifically the numerical underflow. We proposed a method where dynamic scaling is performed for the β values only, whereas in the literature [7, 11, 15] other approaches are used.

Further, we described how the learned parameter set can be used for computing the likelihood of all nodes at all scales of our HMT. We then developed multiscale Bayes classification for our application.

The most important contribution of this thesis is the successful implementation of color features into the HMT framework. In the relevant literature [7, 11, 18], the HMT model is related only to wavelets. Incorporating color features resulted








in more reliable image segmentation, which is illustrated for diverse sky/ground

images and for cut/uncut grass images.

We incorporated into our design results from the available literature, modifying the original algorithms for our purposes where appropriate. We designed our computer vision system with real-time constraints in mind. Therefore, some aspects are suboptimal with respect to the classification error.














REFERENCES


[1] Scott M. Ettinger, Michael C. Nechyba, Peter G. Ifju, and Martin Waszak, "Vision-guided flight stability and control for Micro Air Vehicles," in Proc. IEEE Int. Conference on Intelligent Robots and Systems, Lausanne, October 2002.

[2] Richard O. Duda, Peter E. Hart, and David G. Stork, Pattern Classification, 2nd edition, John Wiley & Sons, New York, 2000.

[3] Heng-Da Cheng, Xihua Jiang, Ying Sun, and Jingli Wang, "Color image segmentation: advances and prospects," Pattern Recognition, vol. 34, pp. 2259-2281, 2001.

[4] Trygve Randen and John Hakon Husoy, "Filtering for texture classification: A comparative study," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, no. 4, pp. 291-310, 1999.

[5] Stephane Mallat, A Wavelet Tour of Signal Processing, 2nd edition, Academic Press, San Diego, 2001.

[6] Stephane G. Mallat, "A theory for multiresolution signal decomposition: the wavelet representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 674-693, 1989.

[7] Matthew S. Crouse, Robert D. Nowak, and Richard G. Baraniuk, "Wavelet-based statistical signal processing using Hidden Markov Models," IEEE Transactions on Signal Processing, vol. 46, no. 4, pp. 886-902, 1998.

[8] Jerome M. Shapiro, "Embedded image coding using zerotrees of wavelet coefficients," IEEE Transactions on Signal Processing, vol. 41, no. 12, pp. 3445-3462, 1993.

[9] Nick Kingsbury, "Image processing with complex wavelets," Philosophical Transactions of the Royal Society of London, vol. 357, no. 1760, pp. 2543-2560, 1999.

[10] Michael Unser, "Texture classification and segmentation using wavelet frames," IEEE Transactions on Image Processing, vol. 4, no. 11, pp. 1549-1560, 1995.

[11] Diego Clonda, Jean-Marc Lina, and Bernard Goulard, "Mixed memory model for image processing and modeling with complex Daubechies wavelets," in Proc. SPIE's International Symposium on Optical Science and Technology, San Diego, August 2000.

[12] Nick Kingsbury, "Complex wavelets for shift invariant analysis and filtering of signals," Journal of Applied and Computational Harmonic Analysis, vol. 10, no. 3, pp. 234-253, 2001.

[13] Judea Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference, Morgan Kaufmann, San Mateo, 1988.

[14] Justin K. Romberg, Hyeokho Choi, and Richard G. Baraniuk, "Bayesian tree-structured image modeling using wavelet-domain Hidden Markov Models," IEEE Transactions on Image Processing, vol. 10, no. 7, pp. 1056-1068, 2001.

[15] Lawrence R. Rabiner, "A tutorial on Hidden Markov Models and selected applications in speech recognition," Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, 1989.

[16] Geoffrey J. McLachlan and Thriyambakam Krishnan, The EM Algorithm and Extensions, John Wiley & Sons, New York, 1996.

[17] Helmut Lucke, "Which stochastic models allow Baum-Welch training?," IEEE Transactions on Signal Processing, vol. 44, no. 11, pp. 2746-2756, 1996.

[18] Hyeokho Choi and Richard G. Baraniuk, "Multiscale image segmentation using wavelet-domain Hidden Markov Models," IEEE Transactions on Image Processing, vol. 10, no. 9, pp. 1309-1321, 2001.















BIOGRAPHICAL SKETCH


In 1994, Sinisa Todorovic graduated from the School of Electrical Engineering, the University of Belgrade, Yugoslavia, with a major in electronics and

telecommunications. From 1994 until 2001, he worked as a software engineer in

the communications industry. Specifically, from 1998 to 2001, as an employee of

Siemens, he worked on a huge network installation project in Russia. In fall 2001,

Sinisa Todorovic was admitted to the master's degree program at the Department

of Electrical and Computer Engineering, University of Florida, Gainesville. His

main field of interest, throughout his graduate studies, was computer vision and

pattern recognition.