
Structured Graphical Models for Unsupervised Image Segmentation

Permanent Link: http://ufdc.ufl.edu/UFE0043604/00001

Material Information

Title: Structured Graphical Models for Unsupervised Image Segmentation
Physical Description: 1 online resource (174 p.)
Language: english
Creator: Kampa, Kittipat
Publisher: University of Florida
Place of Publication: Gainesville, Fla.
Publication Date: 2011

Subjects

Subjects / Keywords: bayesian -- cbir -- expectation-maximization -- graphical -- image -- models -- networks -- probabilistic -- segmentation -- superpixel -- tree-structured -- unsupervised
Electrical and Computer Engineering -- Dissertations, Academic -- UF
Genre: Electrical and Computer Engineering thesis, Ph.D.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, territorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract: In this dissertation, we pursue the following goals: (1) to develop a probabilistic graphical model framework for unsupervised segmentation of structured data, and (2) to find a computationally efficient and reliable solution to image segmentation that operates on superpixels rather than pixels. We develop the Data-Driven Tree-structured Bayesian network (DDT), a novel probabilistic graphical model for hierarchical unsupervised image segmentation. Like tree-structured belief networks (TSBNs), the DDT captures both long- and short-range correlations between neighboring regions in each image using a tree-structured prior. Unlike other approaches, the DDT first segments an input image into superpixels and learns a tree-structured prior based on the topology of the superpixels at different scales. Such a tree structure is referred to as a data-driven tree structure. Each superpixel is represented by a variable node taking a discrete value of segmentation class/label. The probabilistic relationships among the nodes are represented by edges in the network. Hence, unsupervised image segmentation can be viewed as an inference problem on the DDT structure nodes, which can be carried out efficiently. The final image segmentation is obtained by applying the maximum posterior marginal to each variable node in the network. We derive the parameter estimation procedure using the Expectation-Maximization (EM) algorithm combined with the sum-product algorithm. With respect to the objectives, we hypothesize that: 1) hierarchical segmentation gives more meaningful results than results from only one scale; 2) the tree-structured prior smooths the segmentation results, yielding better segmentation; 3) exploiting superpixels already smooths the segmentation, so the difference between the models with and without the tree-structured prior is smaller for superpixel-level segmentation than for pixel-level segmentation; and 4) the model with evidence at all scales gives better results than the one without. We evaluate our results quantitatively against the ground-truth segmentations in the Berkeley Segmentation Dataset and Benchmark 500 (BSDS500), a well-known image database benchmark, demonstrating that our proposed framework performs competitively with the state of the art in unsupervised image segmentation and contour detection.
General Note: In the series University of Florida Digital Collections.
General Note: Includes vita.
Bibliography: Includes bibliographical references.
Source of Description: Description based on online resource; title from PDF title page.
Source of Description: This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility: by Kittipat Kampa.
Thesis: Thesis (Ph.D.)--University of Florida, 2011.
Local: Adviser: Principe, Jose C.
Electronic Access: RESTRICTED TO UF STUDENTS, STAFF, FACULTY, AND ON-CAMPUS USE UNTIL 2013-06-30

Record Information

Source Institution: UFRGP
Rights Management: Applicable rights reserved.
Classification: lcc - LD1780 2011
System ID: UFE0043604:00001





Full Text

STRUCTURED GRAPHICAL MODELS FOR UNSUPERVISED IMAGE SEGMENTATION

By

KITTIPAT KAMPA

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2011

© 2011 Kittipat Kampa

Dedicated to my family

ACKNOWLEDGMENTS

I would like to express my sincere gratitude to Dr. Jose Principe, not only for his wise and patient guidance of my research, but also for his wisdom and outlook on life in general. It has been a wonderful time collaborating with him, and I have learned numerous lessons. As my former adviser, I would like to thank Dr. Kenneth C. Slatton for the opportunity and for his invaluable advice during my time under his supervision. I especially appreciate his great determination to solve numerous real-world problems with engineering solutions. I thank Dr. Anand Rangarajan for his invaluable advice and excellent lectures that sparked many ideas; discussing with him is educationally entertaining and utterly inspiring. I thank Dr. Kshitij Khare for his valuable advice, helpful discussions, and his effort toward my better understanding of statistics. I would also like to thank Dr. John Harris for his valuable suggestions on putting my work into perspective with regard to real-world applications, and for the support I received from him during the academic transition period.

A million thanks to my friends and colleagues at the National Center for Airborne Laser Mapping (NCALM), the Adaptive Signal Processing Laboratory (ASPL), and the Computational NeuroEngineering Laboratory (CNEL), not only for the excellent learning environment, but also for their wonderful friendship and support throughout my academic years here at UF. Another million thanks go to my friends in the Thai Student Association (TSA) and the Thai community in Gainesville for their kindness and generosity. Without them, Friday music and sport day would be missing from my life.

Last but not least, I would like to thank my parents, my sister, and my wife for all the love, support, and motivation inundating me every day of my life. My life would have been much more difficult without Gif, the better half of me, staying here to make our house a home.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER

1 INTRODUCTION
  1.1 Motivation
  1.2 Image Tasks in Computer Vision
  1.3 Probabilistic Framework for Image Segmentation
  1.4 Fixed Quad-Tree Structure
  1.5 Dynamic Tree Architecture
  1.6 Our Proposed Method to Image Segmentation
  1.7 Contributions
  1.8 Dissertation Outline

2 DATA-DRIVEN TREE-STRUCTURED BAYESIAN NETWORKS
  2.1 Probabilistic Model of DDT
  2.2 Inference and Parameter Learning
  2.3 Summary

3 VARIANTS OF DDT
  3.1 Leaf-Evidence DDT (leDDT)
  3.2 Combined Pixel-Superpixel Evidence DDT (coDDT)
  3.3 TSBN as a Special Case of leDDT
  3.4 Summary

4 IMAGE SEGMENTATION USING DDTS
  4.1 Image Segmentation Using Superpixels
  4.2 The Choice of Superpixel Generator and Tree Structure
    4.2.1 Comparison of Superpixel Generators
    4.2.2 Human Performance
    4.2.3 The Choice of Scale and Superpixel Image Cardinality
  4.3 Building a Data-Driven Tree Structure
  4.4 Image Data Set
  4.5 Performance Evaluation
    4.5.1 Results
    4.5.2 Contour Detection

    4.5.3 Region Image Segmentation
  4.6 Summary

5 FEATURE AND MODEL SELECTION
  5.1 Feature Selection
    5.1.1 Color Feature
    5.1.2 Texture Feature
    5.1.3 Feature Vector Construction
      5.1.3.1 Populate samples within a superpixel
      5.1.3.2 Transformation of information
    5.1.4 Experiment
  5.2 Model Selection Using superGMM
    5.2.1 Sampling-Superpixel Gaussian Mixture Model (superGMM)
    5.2.2 Model Selection Using Modified Bayesian Information Criteria
  5.3 Summary

6 EXPERIMENT AND DISCUSSION
  6.1 Determining the Number of Segments in Each Image
  6.2 Performances of meDDT and Its Variants
    6.2.1 Evidence in Multiple Scales
    6.2.2 Combining Pixel Level Features
    6.2.3 Independence Assumption in Superpixel Level
  6.3 Comparison with Existing Approaches and State of the Art
    6.3.1 Comparison with a Restriction on the Number of Contour Pixels

7 CONCLUSIONS AND FUTURE WORK
  7.1 Summary of Contributions
  7.2 Opportunities for Future Work

APPENDIX

A IMAGE DATA SET
  A.1 The Berkeley Segmentation Dataset and Benchmark 500 (BSDS500)

B DERIVATION OF meDDT
  B.1 E-step
  B.2 M-step
    B.2.1 Maximize F w.r.t. μ_c
    B.2.2 Maximize F w.r.t. Λ_c
    B.2.3 Maximize F w.r.t. θ
  B.3 Parameter Estimation in Case of Diagonal Covariance Matrix
  B.4 Final Update Equations for meDDT

C DERIVATION OF leDDT
  C.1 E-step
  C.2 M-step
  C.3 Final Update Equations for leDDT

D DEFORMABLE BAYESIAN NETWORKS FOR DATA CLUSTERING AND FUSION
  D.1 Overview
  D.2 Deformable Bayesian Networks
    D.2.1 Bayesian Networks
    D.2.2 Probabilistic Model of Generalized DEBAN
    D.2.3 A Special Case of DEBAN
  D.3 Learning Parameters of DEBAN
    D.3.1 E-step
    D.3.2 M-step
    D.3.3 Initializing the CPT
    D.3.4 Kernel Perspective of KDW
  D.4 Inference in DEBAN
  D.5 Experiment and Results
    D.5.1 Summary of DEBAN for Clustering
    D.5.2 Structure Updating Algorithm
    D.5.3 Unsupervised Clustering Using DEBAN
  D.6 Discussion
  D.7 Summary

E DEFORMABLE BAYESIAN NETWORK: A ROBUST FRAMEWORK FOR UNDERWATER SENSOR FUSION
  E.1 Overview
  E.2 Probabilistic Graphical Models
    E.2.1 Deformable Bayesian Networks (DFBNs)
    E.2.2 Inference in the DFBN
    E.2.3 DFBN for the Data Clustering and Fusion Algorithm
  E.3 Sensor Fusion Framework Using DFBN
    E.3.1 Reformulating the DFBN for Multiple Features
  E.4 DFBN Algorithm Implementation
    E.4.1 Feature Extraction
      E.4.1.1 Sea bottom texture extraction
    E.4.2 Graph Structure Initialization
      E.4.2.1 Geographic data clustering via vector quantization
      E.4.2.2 Divide and conquer strategy
    E.4.3 Probabilistic Inference for DFBN
      E.4.3.1 Sum-product algorithm for DFBN
      E.4.3.2 Structure prior probability P(Z)
      E.4.3.3 Penalties for structural complexity

    E.4.4 DFBN Optimization Scheme
      E.4.4.1 Exhaustive search
      E.4.4.2 DFBN optimization via simulated annealing
      E.4.4.3 Structure update using directed perturbation
  E.5 Experiments and Discussion
  E.6 Summary

REFERENCES

BIOGRAPHICAL SKETCH

LIST OF TABLES

5-1 The PRI of each feature extraction algorithm performed on the superpixel image set generated by UCM.
5-2 The BDE of each feature extraction algorithm performed on the superpixel image set generated by UCM.
5-3 The PRI of each feature extraction algorithm performed on the superpixel image set generated by Mori.
5-4 The BDE of each feature extraction algorithm performed on the superpixel image set generated by Mori.
6-1 Boundary detection results obtained from meDDT when applying scale parameter values 1, 4, and 7.
6-2 Region segmentation results obtained from meDDT when applying scale parameter values 1, 4, and 7.
6-3 Boundary detection benchmarks on BSDS
6-4 Region segmentation benchmarks on BSDS-val
6-5 Region segmentation benchmarks on BSDS-test
6-6 Run-time of each method
6-7 Summary of performances of the proposed models on boundary detection criteria.
6-8 Summary of performances of the proposed models on region criteria.
6-9 Comparison between the OIS segmentation of meDDT and UCM with the number of contour pixels restricted to be as close as possible.
D-1 Parameters for simulated annealing (SA) optimization
E-1 Number of possible structures of the 2-layer DFBN.
E-2 Parameters used in simulated annealing optimization in DFBN.
E-3 Estimated relative x-y locations and uncertainty.
E-4 Computational time of each algorithm.

LIST OF FIGURES

1-1 Tree-structured graphical models.
1-2 Samples of segmentation results.
1-3 The overview of the DDT framework.
1-4 The diagram of superGMM.
1-5 The multiscale segmentation results from meDDT.
1-6 The workflow diagram of the dissertation.
2-1 The probabilistic graphical model of the multi-scale evidence Data-Driven Tree-Structured Bayesian Network (meDDT).
2-2 The factor graph representation of meDDT.
3-1 All three architectures of DDT
4-1 Overview of the DDT framework.
4-2 Comparison of recall values achieved by candidate superpixel generators.
4-3 Comparison of recall values achieved by candidate superpixel generators when using boundary pixel ratio.
4-4 Comparison of the number of superpixels versus the boundary pixel ratio of each superpixel generator.
4-5 Example of superpixel images generated from Mori.
4-6 Example of superpixel images generated from UCM.
4-7 Comparison of ground-truth and generated superpixel contours.
6-1 Segmentation results from using scale parameter values 1, 4, and 7 on BSDS-train.
6-2 The multi-scale contour results on BSDS-test.
6-3 The optimal dataset scale segmentation results on BSDS-test.
6-4 Comparison between the OIS segmentation of meDDT and UCM with the number of contour pixels restricted to be as close as possible.
D-1 Bayesian network graphical model.
D-2 A graphical overview of Deformable Bayesian Networks (DEBAN)
D-3 The k-nn density walk map when S = 3.

D-4 Resulting CPTs from the KDW algorithm.
D-5 Dataset frame 2
D-6 Results of DEBAN
E-1 Underwater sensing platforms survey a field of targets.
E-2 The flow diagram of the deformable Bayesian networks (DFBNs) for underwater sensor fusion.
E-3 DFBN sensor fusion architecture.
E-4 Bayesian network graphical model.
E-5 Interpretation of DFBN structure.
E-6 Parallel DFBN sensor fusion structure.
E-7 The overview of the 3 features extracted from 27 sensor measurements.
E-8 Four SAS images of a cylinder in different sea bottom environments.
E-9 LBG-VQ clustering to initialize the DFBN forest.
E-10 The summarized diagram of the 2-level DFBN.
E-11 Comparison of the clustering results.
E-12 The best graph structure obtained at the final stage of DFBN.

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

STRUCTURED GRAPHICAL MODELS FOR UNSUPERVISED IMAGE SEGMENTATION

By

Kittipat Kampa

December 2011

Chair: Jose C. Principe
Major: Electrical and Computer Engineering

In this dissertation, we pursue the following goals: (1) to develop a probabilistic graphical model framework for unsupervised segmentation of structured data, and (2) to find a computationally efficient and reliable solution to image segmentation that operates on superpixels rather than pixels.

We develop the Data-Driven Tree-structured Bayesian network (DDT), a novel probabilistic graphical model for hierarchical unsupervised image segmentation. Like tree-structured belief networks (TSBNs), the DDT captures both long- and short-range correlations between neighboring regions in each image using a tree-structured prior. Unlike other approaches, the DDT first segments an input image into superpixels and learns a tree-structured prior based on the topology of the superpixels at different scales. Such a tree structure is referred to as a data-driven tree structure. Each superpixel is represented by a variable node taking a discrete value of segmentation class/label. The probabilistic relationships among the nodes are represented by edges in the network. Hence, unsupervised image segmentation can be viewed as an inference problem on the DDT structure nodes, which can be carried out efficiently. The final image segmentation is obtained by applying the maximum posterior marginal to each variable node in the network. We derive the parameter estimation procedure using the Expectation-Maximization (EM) algorithm combined with the sum-product algorithm.

With respect to the objectives, we hypothesize that: 1) hierarchical segmentation gives more meaningful results than results from only one scale; 2) the tree-structured prior smooths the segmentation results, yielding better segmentation; 3) exploiting superpixels already smooths the segmentation, so the difference between the models with and without the tree-structured prior is smaller for superpixel-level segmentation than for pixel-level segmentation; and 4) the model with evidence at all scales gives better results than the one without.

We evaluate our results quantitatively against the ground-truth segmentations in the Berkeley Segmentation Dataset and Benchmark 500 (BSDS500), a well-known image database benchmark, demonstrating that our proposed framework performs competitively with the state of the art in unsupervised image segmentation and contour detection.

CHAPTER 1
INTRODUCTION

Image and scene understanding is a fundamental yet difficult problem in computer vision. The problem seems trivial to humans, but it can be extremely challenging for an automated computer algorithm. Humans recognize and identify objects in an image without much effort, regardless of whether those objects appear in everyday life or are being seen for the first time. It is fascinating how humans can memorize such a large number of object models in their brains. In computer vision, in order to recognize and understand objects and scenes, a divide-and-conquer approach must be undertaken, breaking the problem down into several classes of tasks (e.g., edge detection, unsupervised and supervised image segmentation, contour matching, object tracking, etc.) that are synergistically interrelated. Computer vision has expanded into an area where hundreds or thousands of researchers are working, with tremendous impact on real-world applications where human quality of life can be greatly improved, for instance, medical image analysis, smart auto-pilot control for cars, face detection, and image search. Computer vision can greatly benefit from modeling techniques that use probabilistic models and inference to take into account uncertainties in the decision process and noise in the system.

1.1 Motivation

Image segmentation spans a broad spectrum of the computer vision community, and can be divided into three levels, each of which is of equal importance. First, the low-level task aims to identify and enhance edges in the image by asking whether a given pixel lies on an edge, a line, or a boundary. In the middle-level task, the main goal is to group the edge pixels into lines or curves and assign a label to each pixel in the image. Contiguous areas where the pixels have the same label are said to be in the same segment, whose pixel members are similar in some sense. Although no semantic meaning is assigned to each segment, the middle-level task is the key to

identifying the notion of objects in an image without a training data set, and hence is a critically important step for the next level. The high-level task involves a near-human level of interpretation, collecting and combining information from the middle level. It focuses on assigning a semantic label to each pixel in the image or a theme label to the scene.

Our main interest lies at the junction of the middle-level and high-level tasks. Our proposed method attempts to parse the image into several segments, each of which is made as informative as possible. Hence, each segment forms the notion of a semantic object in the image, despite the unavailability of training data. In other words, our methodology is an unsupervised image segmentation algorithm whose output is as semantic as possible. The underlying model of the proposed method is a generative probabilistic model capturing the joint probability distribution of labels (permutable classes or annotations given to objects in the image) and observations (feature vectors extracted from the input image). The image labels and the feature vectors are modeled using discrete random variables and multidimensional continuous random variables, respectively. Unlike in supervised image segmentation problems, each label in this class of problems is not assigned any semantic meaning; hence the labels are permutable, and there is no need for a training data set for semantic purposes. However, a training set can be helpful when determining the scale of the segmentation result. In this dissertation, such a problem is referred to as the unsupervised image segmentation problem.

Our proposed methodology seamlessly integrates two main approaches, bottom-up and top-down, in the same framework using a tree-structured prior distribution. The standard bottom-up approach is a powerful way to aggregate image features locally, starting from small units and growing to bigger ones (e.g., from pixels to the entire image), but it struggles with long-range image correlations. Our framework overcomes this shortcoming of the standard bottom-up approach by incorporating a top-down tree-structured prior,

which encapsulates the relationships of the labels in the image in a hierarchical manner. The overall joint probability of the labels in the tree can then be systematically factorized into a product over the levels of the tree. This fact not only allows multi-scale representations of the image labels, but also enables an efficient inference algorithm. The probabilistic relationship between feature vectors and labels is encoded via a bottom-up approach using the conditional probability of feature vectors given the labels. All the probabilistic models mentioned earlier are systematically integrated under the same graphical model, resulting in a complete joint distribution of both labels and image features.
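To make the factorization concrete, the joint distribution implied by such a tree-structured prior with per-node evidence can be written as follows; this is a sketch consistent with the meDDT model defined in Chapter 2, where pa(j) denotes the parent of hidden node j:

    p(X, Y) = p(x_{\mathrm{root}}) \prod_{j \in H \setminus \{\mathrm{root}\}} p\big(x_j \mid x_{\mathrm{pa}(j)}\big) \prod_{e \in E} p\big(y_e \mid x_{\mathrm{pa}(e)}\big)

Every factor is local to a single edge of the tree, which is precisely what permits the level-by-level factorization mentioned above and an inference cost linear in the number of nodes.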

Although a tree-structured prior is not a new idea, having been exploited in image segmentation applications for years, there is still a lot of ongoing research on tree structure construction and on novel ways to apply it. The quad-tree structure is one of the most primitive tree structures used in the community; however, when applied to image segmentation, it suffers from blocky segmentation because the quad-tree structure usually does not agree with the natural boundaries of the image. An improvement has been proposed by introducing more connections from parents to a child node to induce more correlations among the corresponding neighborhood pixels. Indeed, the blockiness problem is substantially ameliorated, at the price of a more complicated inference algorithm. The results were further improved by introducing an adaptive structure to the segmentation framework, which effectively changes the tree structure in every iteration, and hence demands more computational power and requires approximate inference algorithms to infer the label distribution. We observed the results produced by the previous approaches, as well as their analytical solutions, and propose the data-driven tree-structured Bayesian network (DDT), a framework where the tree structure is learned from the data (i.e., the input image) only once, prior to inference. In this work, instead of a pixel, the finest image unit is a superpixel, defined as a contiguous image region whose members share similar characteristics in some sense. A superpixel provides not only rich information collected from its members but also computational efficiency, due to the significantly smaller number of samples to be considered. Therefore, the proposed model helps to keep the cost of the inference and the parameter estimation down, while the natural boundaries are still preserved well to some extent.

1.2 Image Tasks in Computer Vision

It is an ambitious goal to make a computer understand an image, because there are so many aspects of information contained in an image, as stated in the proverb "A picture is worth a thousand words." Scene understanding is arguably the ultimate goal of the computer vision community, and it comprises three main ingredients: low-level, middle-level, and high-level vision, each of which can be regarded as a main stream of research in the community. Our work, an unsupervised image segmentation algorithm, belongs to the class of middle-level vision, which is crucial to accomplishing the ultimate goal. Many approaches have been proposed for the purpose of image segmentation, which can be broadly divided into two groups: non-probabilistic and probabilistic approaches. Recently the latter have gained more popularity, due not only to advances in probability theory and computational instruments, but also to their advantages for modeling and interpretation. This chapter focuses on the probabilistic approaches to image segmentation.

As described earlier, the three vision tasks in computer vision are of equal importance: the low level is the foundation of the rest, and the middle level can be regarded as a pre-processing stage prior to the near-human interpretation framework of the high-level approaches. Missing any one of them could prevent us from reaching the ultimate goal. Each task can be described as follows.

Early works in computer vision can mostly be categorized as low-level vision, which comprises edge detection and enhancement, where image pixels are assigned to edge or non-edge [1-3]. Nevertheless, there is still a lot of ongoing research in

this category, such as model-based shadow removal [4], salient-point detection [5], the scale-invariant feature transform (SIFT) [6] and its successor SURF [7], the histogram of oriented gradients (HoG) [8], etc.

With its foundation in the low-level outputs, the middle-level vision tasks group edge segments into lines and curves, and small regions into bigger and more meaningful regions. Although semantic meaning is not a main focus at this level, some regions, curves, and contours are observed to be semantically interpretable, as shown in [9-13], including our work in this dissertation.

High-level tasks are mostly built on the middle-level outputs. The information obtained from segments and objects is collected and combined to make a near-human interpretation, for example scene understanding [14-16], part-based object recognition [17-19], and action recognition [20, 21].

In this dissertation, our proposed unsupervised image segmentation method is categorized as a middle-level vision task, which is one of the largest research areas in computer vision. It is worth noting the differences between image labeling and image segmentation. Image labeling is the process of assigning a semantically meaningful label, a.k.a. class, to an image region (i.e., a pixel or superpixel). This type of process requires a training image data set in order to model the predefined objects of interest. Image segmentation is a process that groups similar image regions into one segment such that the similarity measured between a pair of image regions within a segment is greater than that between regions from different segments. Unlike in image labeling, the segments resulting from the process are not assigned any meaningful labels and are therefore fully permutable. In other words, image labeling is supervised segmentation, whereas image segmentation is mostly used in the context of unsupervised segmentation, as no training data set is required. Even though image labeling is referred to as supervised segmentation, where each label possesses some

semantic meaning, further in the text the term "label" will be used to refer to the notion of a discrete-valued class variable which does not hold any semantic meaning.

Since there have been a multitude of approaches to image segmentation, which can be categorized into non-probabilistic and probabilistic approaches, we focus our effort on reviewing the literature of the second approach, to which our proposed method belongs.

1.3 Probabilistic Framework for Image Segmentation

Ultimately, an image segmentation problem can be reformulated as a statistical inference problem for the posterior distribution over a field of image regions (e.g., pixels or superpixels). Once inferred, the resulting posterior is used to determine the most probable label (MPL) for each image region using the maximum a posteriori (MAP) or maximum posterior marginal (MPM) criterion. In general, this framework requires three principal ingredients: 1) a probabilistic model for the image labels over the image-region random fields, 2) parameter learning and inference algorithms for calculating the posterior distribution of the image labels, and 3) a framework to assign labels to image regions according to the resulting posterior distribution.
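As a toy illustration of how the MAP and MPM criteria can disagree (our own example, not from the dissertation), consider a two-node model with an explicit joint table:

    # A minimal sketch contrasting MAP and MPM on a toy two-node model.
    # MAP picks the single most probable joint configuration; MPM picks the
    # mode of each node's marginal separately, so the two can differ.
    import numpy as np

    # Hypothetical joint p(x1, x2) over two binary labels (rows: x1, cols: x2).
    joint = np.array([[0.4, 0.0],
                      [0.3, 0.3]])

    map_config = np.unravel_index(np.argmax(joint), joint.shape)
    mpm_config = (int(np.argmax(joint.sum(axis=1))),   # marginal of x1
                  int(np.argmax(joint.sum(axis=0))))   # marginal of x2

    print(map_config)  # (0, 0): the single highest joint mass
    print(mpm_config)  # (1, 0): the marginals favor x1 = 1 and x2 = 0

MPM minimizes the expected number of mislabeled regions, which is why it is the natural choice for per-region segmentation.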

The choice of probabilistic model is a challenging problem, as a multitude of such models are available, with different advantages and disadvantages. Essentially, we consider two main types of probabilistic models: discriminative and generative. The generative model focuses its efforts on modeling the joint distribution of the hidden and observed variables, which concerns how the data samples are generated from such a model [18, 22-24]. Models of this type can be very complex and hence may require significant resources for modeling. On the other hand, the discriminative model focuses on modeling the posterior of the labels given the observations directly, which can be less complex than the generative model [25-29]. However, the resulting posterior distribution might not contain information about how the data are generated.

Nevertheless, as of now, there is no principled way to prefer one approach over the other. Consequently, the choice of model is steered by the following objectives:

1. Capture both short- and long-range correlations of object labels in the image
2. Provide the notion of components and sub-components
3. Benefit from the original structure of the image

To include the aforementioned characteristics in a unified model, multiscale probabilistic graphical models are adopted as the main approach in this work. Recently, a number of successful applications of this type have been reported [30-37]. Along this line of research, we propose a generative graphical model, the data-driven tree-structured Bayesian network (DDT), whose structure is a multiscale hierarchical tree that facilitates both component and sub-component interpretation and an efficient inference algorithm. The principal reason we select the generative approach over the discriminative approach is that the latter does not provide a natural interpretation of the parameters used to create the data. On the other hand, DDT models intrinsically provide the physical meaning of such parameters and can incorporate the segments at multiple scales seamlessly. Before discussing the details of the DDT, it is useful to review the advantages and disadvantages of tree-structured probabilistic graphical models in general, which are presented in the following sections.

1.4 Fixed Quad-Tree Structure

The tree-structured probabilistic model has been applied not only in machine learning but also in the computer vision community, because it provides a principled approach to incorporating random processes in a multiscale manner and has efficient learning and inference algorithms. The simplest tree-structured model for image segmentation is the fixed balanced quad-tree structure, where each parent node is connected to 4 children in the level below. Therefore, the number of nodes in each level is a power of four (i.e., 1, 4, 16, 64, and so on). The tree-structured belief network (TSBN) [34] was introduced using the balanced quad-tree structure, as depicted

in Fig. 1-1, and the structure is fixed throughout the inference process. Each node in the tree, except at the bottom, represents a discrete-valued hidden random variable whose value represents a class or label. Each node at the bottom level of a TSBN represents a continuous-valued observed variable, which is sometimes referred to as evidence. The edges in the network represent the dependencies between parent and child nodes from adjacent layers of hidden variables. Applying conditional independence, hidden nodes in the same level are independent of each other given their parent nodes. In [34], an observed variable depends only on its corresponding parent in the level above. TSBNs have an efficient linear-time inference algorithm with respect to the number of nodes in the network, which can be implemented using belief propagation [38], a special case of the sum-product algorithm used when the network is reformulated as a factor graph [39]. TSBNs have been applied in [40] for multiscale document analysis, in [41] for man-made structure detection in natural scene images, in [37, 42] for natural image segmentation, and in [43] for airborne image segmentation. All the examples mentioned above show the power of the tree-structured prior for capturing the short- and long-range correlations of segments in the image, and the computational efficiency of their inference algorithms, both of which are critically important for our purposes.
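For intuition, the balanced quad-tree over a 2^L x 2^L image admits a trivial indexing scheme; the following is our own small illustration, not code from the dissertation:

    # A small sketch of balanced quad-tree bookkeeping: level l (root = 0)
    # holds 4**l blocks, and the parent of block (r, c) at one level is block
    # (r // 2, c // 2) at the next coarser level.
    def quadtree_level_sizes(L):
        return [4 ** l for l in range(L + 1)]

    def parent_block(r, c):
        return r // 2, c // 2

    print(quadtree_level_sizes(3))  # [1, 4, 16, 64]
    print(parent_block(5, 2))       # (2, 1)

This rigidity is exactly what the next paragraph identifies as the source of blocky segmentations.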

One main drawback of the TSBN, however, is that the fixed tree structure used to enforce local homogeneity between neighboring pixels disregards the natural object boundaries, which often results in blocky segmentation output, as shown in Fig. 1-2. Several approaches have been proposed to address this problem by introducing complex, cross-linked models [30], which in turn require more complex inference algorithms such as the junction tree algorithm [44] or loopy belief propagation [45, 46]. In the next section, the blockiness problem is resolved by introducing an architecture that adapts its structure to the image.

Figure 1-1. Tree-structured graphical models. (Left) The tree-structured belief network (TSBN) has a fixed quad-tree structure. (Right) The multiscale-evidence data-driven tree-structured Bayesian network (meDDT) has an irregular structure depending on the input image, and has evidence (black squares) in every scale of the tree.

Figure 1-2. Samples of segmentation results. From left to right: original image, human-made ground-truth contour image, optimal segmentation result obtained from meDDT, and from meTSBN, respectively. The human-made contour contains a lot of detail.

1.5 Dynamic Tree Architecture

Recently, the use of graphical models with adaptable structures has been proposed in [18, 35, 36, 47]. By treating the model structure as a random variable and adapting the structure to each input image, the proposed models significantly mitigate the disagreement between the segmentation boundary and the natural object boundary. The dynamic tree (DT) was proposed to segment road scene images in [35] and is one of the first dynamic tree models introduced in the computer vision community. A DT model has its observations merely at the leaves of the tree, and the approach focuses on observations that are class variables as opposed to feature vectors. Subsequently, from the same research group, the position-encoding dynamic tree (PEDT), a DT architecture with node positions incorporated in the joint distribution, was introduced in [36]. Further developed from the DT, the PEDT uses affinities of nodes in neighboring levels as a prior to regulate the joint distribution, and makes seamless connections between the observations and the hidden variables. The results of both DT and PEDT on pixel-level image segmentation significantly reduce the blockiness that occurs with a TSBN. When a structure variable is introduced into the joint distribution, however, exact inference is computationally intractable, as structure learning in this case is an NP-hard problem [48, 49]. Variational approximate inference [50] is then adopted in those frameworks in order to keep the inference algorithm computationally tractable and practical. In such works, the mean-field variational approximation, the simplest form of variational approximation, is made by comparing the true distribution with a variational distribution in which all the variables are assumed independent. Following the same trends, the irregular tree-structured belief network, or irregular tree (IT) for short, was introduced in [18]. The main difference between the IT and its predecessors is that the variational distribution of the IT utilizes a structured variational approximation [51, 52], which preserves the structure of the true distribution better than the mean-field approximation.

Nevertheless, the need to re-estimate the model structure in every iteration incurs significant computational cost, a major drawback of this line of approach.

1.6 Our Proposed Method to Image Segmentation

In this work, we propose a novel probabilistic graphical model called the Data-Driven Tree-Structured Bayesian Network (DDT), in which all the merits of tree-structured and hierarchical multi-scale models are preserved. The tree structure of the DDT is built according to the similarity of image regions in an input image, and thus can describe approximately the orientation of objects in the image. As a result, the short- and long-range correlations encoded by the DDT can be more precise than those of the TSBN [34]. Unlike the flexible-structure graphical models proposed in [18, 35, 36], the DDT does not need to adapt its structure in every iteration, which drastically reduces the computation of the algorithm. In addition, instead of using the pixel as the finest information scale, we use the superpixel [53], a group of locally smooth labeling regions, which contains more descriptive contextual information about the corresponding image region than the pixel level. Experimental results in Tables 6-3, 6-4, and 6-5 demonstrate that our method performs competitively with the state of the art. Our proposed model, the DDT, is discussed briefly in this section as follows.

First, an input image is superpixelized by a superpixel generator, which over-segments the image into small regions called superpixels. Due to the homogeneity within a superpixel, every pixel in the same superpixel is assumed to share the same label, and this motivates us to use the superpixel, instead of the pixel, as the finest image unit when performing segmentation. Not only can richer and more informative feature vectors be obtained from superpixels than from pixels, but the number of data samples in the feature space is substantially reduced, say from 150,000 down to several hundred, which helps improve the computational characteristics of the algorithm. In this work, the multiscale superpixels are obtained by varying the scale parameter in the superpixel generator, which results in a multiple-level superpixel image.
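As an illustration of sweeping the scale parameter, the following is a minimal sketch under our own choices; scikit-image's SLIC stands in for the superpixel generators (Mori, UCM) actually compared in Chapter 4:

    # A minimal sketch of multiscale superpixelization, assuming scikit-image
    # is available; SLIC is a stand-in for the generators used in the text.
    from skimage import data
    from skimage.segmentation import slic

    image = data.astronaut()  # any RGB image

    # Coarser levels use fewer superpixels; each level is one scale of the tree.
    levels = [600, 150, 40, 10]
    superpixel_images = [
        slic(image, n_segments=n, compactness=10, start_label=0) for n in levels
    ]
    for lvl, labels in enumerate(superpixel_images, start=1):
        print(f"level {lvl}: {labels.max() + 1} superpixels")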

Each superpixel is then associated with a hidden variable node, or hidden node for short, in the DDT model. Each hidden node, as a parent, has a corresponding observation node, a.k.a. evidence node, connected as a child. The directed edge from a parent hidden node x to the child evidence node y represents the conditional probability p(y | x) of the observation y given the label x. Such a conditional probability distribution is referred to as the label-evidence conditional probability distribution (CPD), or label-evidence for short, and is modeled using a multivariate Gaussian distribution, which offers closed-form expressions for the parameter update equations. The observation y is instantiated with the observed feature vector extracted from its corresponding superpixel. It is worth mentioning that the feature extraction procedure in each level might be different, depending on the physical meaning of the scale that the level resides in. This grants the freedom of designing a different feature extraction procedure for each different scale, which is another strength of the DDT.

While the connections between hidden nodes and their corresponding evidence nodes are simple and transparent, the connections among hidden nodes are data-dependent and comply with two rules: 1) every hidden node, except the root node, must have as a parent a hidden node from the level directly above; 2) every hidden node is allowed to connect to only one parent, in order to preserve the tree-structure characteristic. The edge connecting a pair of hidden nodes represents the label-label conditional probability table (CPT), sometimes referred to simply as the CPT. The CPT in this case characterizes the relationship between two discrete random variables and is thus modeled using a multinomial distribution.

Our approach can be seen as a compromise between the fixed quad-tree structure and the adaptive structure, because we choose a single tree structure for the input image and do not perform further structural adaptation. This approach relaxes the rigid quad-tree structure in the DDT, and also eliminates the need to recalculate the structure in every iteration of the inference algorithm. In Section 4.3 we propose an insightful methodology, the maximum-overlapping algorithm, to build such a tree structure from an input image. The overview of the framework is depicted in Fig. 1-3.
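To make the parent-assignment idea concrete, here is a minimal sketch in the spirit of the maximum-overlap rule (our own simplified illustration, not the exact algorithm of Section 4.3): each superpixel at one scale is attached to the superpixel at the next coarser scale with which it shares the most pixels:

    # A minimal sketch of maximum-overlap parent assignment. fine and coarse
    # are integer label maps of the same image at adjacent scales; the result
    # maps each fine superpixel to one coarse parent, preserving a tree.
    import numpy as np

    def max_overlap_parents(fine, coarse):
        n_fine = fine.max() + 1
        n_coarse = coarse.max() + 1
        # overlap[j, i] counts pixels shared by fine superpixel j and coarse i.
        overlap = np.zeros((n_fine, n_coarse), dtype=np.int64)
        np.add.at(overlap, (fine.ravel(), coarse.ravel()), 1)
        return overlap.argmax(axis=1)  # parent index for each fine node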

Since a DDT contains both observations and hidden variables, the Expectation-Maximization (EM) algorithm [54, 55] is a principled framework for maximum likelihood estimation of the parameters. Once the tree structure is learned and the feature vectors are extracted for each evidence node, the parameters are estimated and statistical inference is applied. The inference process is an iterative procedure alternating between the E-step and the M-step. The E-step is essentially the inference problem and employs the sum-product algorithm for efficient computation, whereas the M-step re-estimates each of the model parameters, such as the label-label CPT and the label-evidence CPD.

After the inference process, the marginal posterior distribution given the observations is available at each hidden node in the DDT. The final label for each node can then be determined using the maximum posterior marginal (MPM), which is simply the mode of the marginal posterior distribution obtained directly from the E-step. Each label in the result is regarded as a segment of the image, which can be interpreted as a distinct object in the input image. Segmentation at different scales brings a different physical meaning to each segment, and therefore creates the parent-child notion between objects and their parts. For example, in Fig. 1-5, the details of the clothes are sub-objects of the clothes itself, and, eventually, the clothes, face, and body parts are sub-objects of the person.
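To give a flavor of the E-step machinery, the following is a minimal sketch (our own illustration, omitting meDDT specifics such as level-tied CPTs and the EM parameter updates of Appendix B) of the exact upward-downward sum-product recursion on a rooted tree; its output marginals are exactly what the MPM rule takes the mode of:

    # Exact per-node posterior marginals on a rooted tree (sum-product).
    import numpy as np

    def tree_marginals(children, prior, cpt, lik):
        """children[j]: list of j's children (node 0 is the root);
        prior: (C,) distribution over the root label;
        cpt[j]: (C, C) table, cpt[j][u, v] = p(x_j = v | x_parent(j) = u);
        lik[j]: (C,) evidence likelihood p(y_j | x_j = v)."""
        beta = {}                       # beta[j][v] = p(subtree evidence | x_j = v)

        def up(j):                      # upward (leaves-to-root) pass
            b = lik[j].astype(float).copy()
            for c in children.get(j, []):
                up(c)
                b *= cpt[c] @ beta[c]   # message child c sends up to j
            beta[j] = b

        marg = {}

        def down(j, d):                 # d[u] summarizes all evidence outside
            p = d * beta[j]             # j's subtree (d = prior at the root)
            marg[j] = p / p.sum()
            for c in children.get(j, []):
                others = d * lik[j]
                for s in children.get(j, []):
                    if s != c:
                        others *= cpt[s] @ beta[s]
                down(c, others @ cpt[c])  # sum over the parent state u

        up(0)
        down(0, prior)
        return marg                     # MPM label at node j: marg[j].argmax()

Each node is visited once per pass, so the cost is linear in the number of nodes, matching the efficiency claims made for TSBN-style models above.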

Moreover, the complementary (or dual) part of segmentation provides contours, because when a segment is created, the corresponding contour is simply the boundary of the segment. At this point each pixel is decided to be contour or non-contour by assigning the value 1 or 0, respectively. A real-valued contour estimate is formed by counting the times a pixel lies on a contour across the multiple scales of segmentation. This contour estimate is referred to as the multiscale contour image, as shown in the last column of Fig. 1-5. The value of each pixel in the multiscale contour image ranges from 0, when the pixel is not a contour at any level, to 1, when the pixel is a contour at every level of the tree; a greater value of the multiscale contour pixel indicates increased prevalence across scales. From preliminary empirical results, we conjecture that contours appearing in more levels represent more salient objects than those appearing in fewer levels. For example, the multiscale contour of the starfish shape appears in every scale of the segmentation and, hence, should be more important than the stripes of the starfish. In summary, the inference in a DDT provides not only a multiscale hierarchical segmentation, but also the multiscale contours of objects.
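This counting scheme is simple enough to state in a few lines; the following is our own illustration, not the dissertation's code:

    # A minimal sketch of the multiscale contour image: the average, over
    # scales, of per-scale boundary masks, so each pixel's value is the
    # fraction of scales (0 to 1) in which it lies on a segment boundary.
    import numpy as np
    from skimage.segmentation import find_boundaries

    def multiscale_contour(label_maps):
        """label_maps: list of per-scale integer segmentations of one image."""
        masks = [find_boundaries(lm, mode='thick') for lm in label_maps]
        return np.mean(masks, axis=0)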

Figure 1-3. The overview of the DDT framework.

In this dissertation, we first introduce the multiscale-evidence DDT (meDDT) as the main architecture, followed by two variants. The first variant is the leaf-evidence DDT (leDDT), whose observations appear at the leaf nodes of the tree only, and which is shown to be inferior to meDDT. We also attempt to combine the feature vectors from the pixel and superpixel levels, which leads to the second variant, the combined pixel-superpixel evidence DDT (coDDT), which slightly outperforms meDDT at the expense of a much longer run-time. We also show that the TSBN is a special case of the DDT and propose the multiscale-evidence TSBN (meTSBN), which outperforms the original TSBN. It is illustrated that the update equations of the aforementioned proposed methods have very similar forms because they share the same DDT structure.

Determining the number of objects or segments in an input image is a critically important issue in computer vision and machine learning in general. This problem belongs to the class of model selection problems. Many approaches have been proposed to resolve it: [56] derives a mean-shift-like algorithm, which determines the number of clusters automatically, from the principle of relevant information (PRI); [57] reformulates the number of objects using lossy compression; [58] casts a prior over the parameter space of the Gaussian mixture model (GMM) so that some numbers of segments are favored over others; [22] employs the Fisher information matrix as an indicator to annihilate clusters; and the Bayesian information criterion (BIC) [59] is a criterion for model selection that balances the likelihood against the model complexity by giving a greater penalty to more complex models. Interestingly, besides being simple and intuitive, BIC outperforms both [22] and [58] on the image data set in our experimental results. Consequently, building on BIC, we propose the modified BIC (mBIC) as the model selection framework, which can be seamlessly integrated into the Gaussian-mixture-model segmentation framework.

We also propose the sampling-superpixel Gaussian mixture model (superGMM) algorithm, used in conjunction with mBIC, to determine the number of objects in an image efficiently. superGMM, an approximate model of coDDT, starts by sampling, say, 10% of the pixels in the image, then segmenting the sampled data in the feature space using a GMM (it works well with k-means too). The samples are then labeled according to the GMM clustering before being put back in their original positions in the input image. Within each superpixel, a majority-vote scheme is applied to select a winner label. Finally, the winner label is propagated to every pixel in the superpixel, enforcing the same label for all pixels in the same superpixel. The flow diagram of superGMM is illustrated in Fig. 1-4.
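The recipe above translates into a short sketch; this is a hedged illustration under our own simplifications: standard BIC (via scikit-learn) stands in for the dissertation's mBIC, and for brevity we predict a label for every pixel, whereas the text places only the sampled labels back before the within-superpixel vote:

    # A superGMM-style sketch: sample ~10% of pixels, cluster with a GMM whose
    # order is chosen by BIC, then majority-vote labels within each superpixel.
    import numpy as np
    from sklearn.mixture import GaussianMixture

    def super_gmm(features, superpixels, max_k=8, frac=0.1, seed=0):
        """features: (H, W, d) per-pixel features; superpixels: (H, W) labels."""
        rng = np.random.default_rng(seed)
        X = features.reshape(-1, features.shape[-1])
        idx = rng.choice(len(X), size=int(frac * len(X)), replace=False)

        # Model selection: pick the number of segments minimizing BIC.
        fits = [GaussianMixture(n_components=k, random_state=seed).fit(X[idx])
                for k in range(1, max_k + 1)]
        best = min(fits, key=lambda g: g.bic(X[idx]))

        labels = best.predict(X).reshape(superpixels.shape)
        out = np.empty_like(labels)
        for s in np.unique(superpixels):
            mask = superpixels == s
            out[mask] = np.bincount(labels[mask]).argmax()  # majority vote
        return out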

Figure 1-4. The diagram of superGMM. First, the original image is superpixelized. We randomly sample some pixels, say 10%, from each superpixel and perform unsupervised clustering in the feature space. Finally, we apply a majority-vote scheme within each superpixel to determine the winner label and propagate that label to all the pixels in the superpixel. (Left) Original image. (Middle) Segmented sampled points placed in the superpixel image. (Right) The final segmentation result after the majority-vote scheme and label propagation.

Figure 1-5. The multiscale segmentation results from meDDT. The notion of objects and object parts is captured by a parent and its corresponding child nodes, respectively.

With respect to the objectives addressed in Section 1.3, we hypothesize that:

1. Hierarchical segmentation gives more meaningful results than results from only one scale.
2. The tree-structured prior smooths the segmentation results, yielding better segmentation.
3. Exploiting superpixels already smooths the segmentation, so the difference between the models with and without the tree-structured prior is smaller for superpixel-level segmentation than for pixel-level segmentation.
4. The model with evidence in all scales gives better results than the one without.

The comparison of meDDT, leDDT, coDDT, TSBN, meTSBN, and superGMM can be found in Tables 6-7 and 6-8. We also compare our proposed methods against well-known algorithms, including the state of the art, made available by the researchers on the Internet. The experimental results show that:

1. Visually, the contours of important objects have greater values in the multiscale contour image, and the notion of object and object part (sub-object) can be observed in the multiscale results.
2. The tree-structured prior helps improve the smoothness of the resulting segmentation. However, the gained smoothness does not give an advantage to the model with the tree prior, because the human-made ground-truth segmentation contains a lot of detail.
3. The difference between GMiND and meDDT is not significant, which implies that exploiting superpixels can reduce the need for a structure prior.
4. The models with evidence in every scale outperform the ones without, implying that the multiple-scale evidence helps improve the results.
5. The number of nodes incorporated in the structure is crucial, especially in the unsupervised segmentation scenario. Having many superpixels makes the inference more robust but not computationally efficient. On the other hand, having only a few superpixels provides a preferable run-time but not robust inference. That might be a reason why coDDT and superGMM outperform meDDT.

The details of the experiments and the discussion can be found in Chapter 6.

1.7 Contributions

Our main contributions to image segmentation are outlined below.

- We propose the DDT, a probabilistic graphical model for unsupervised image segmentation. The model not only resolves the blocky segmentation caused by the rigid fixed quad-tree architecture, but also provides an exact and efficient inference algorithm, as opposed to those of adaptive-structure models.
- Since the DDT model requires the tree structure to be learned prior to inference, we propose the maximum-overlap algorithm, an insightful and intuitive approach to learning the tree structure from an input image. The approach is substantially robust and can be applied to any type of superpixel.
- The quality and characteristics of the superpixel generator are crucial for successful segmentation. We provide principled guidelines for studying and comparing quantitatively the quality of superpixel generators.
- We study the effect of models having observations at multiple scales, which proved to outperform the ones having observations at the leaf scale only.

- We propose mBIC, a modified version of BIC, to resolve the model selection problem in the DDT. More specifically, we use mBIC to determine the number of segments in each input image.
- By observing the results from GMM and coDDT, we propose superGMM, a simple and efficient unsupervised image segmentation method that samples only a few pixels from each superpixel for clustering. superGMM outperforms the other proposed methods in both performance and run-time.
- We also discover a provocative but useful fact about the structure prior: it is less important when using superpixels instead of pixels.

Our main contributions to data clustering and fusion are outlined as follows.

- We propose deformable Bayesian networks (DFBN) to learn the tree structure of the data obtained from underwater platforms.
- We propose a novel way to initialize the CPT of the DFBN using a stochastic simulation called the k-nn density walk (KDW).
- We propose directed perturbation, a Gibbs-like sampling methodology, for the inference in the DFBN.

For more details, please refer to Appendices D and E. The workflow diagram is illustrated in Fig. 1-6.

1.8 Dissertation Outline

After the survey of existing models in this chapter, the data-driven tree-structured Bayesian network is introduced, together with its parameter estimation algorithm, in Chapter 2. More specifically, we discuss the multiscale-evidence DDT (meDDT), where observations are present at every level of the tree, as the main template whose variant models are discussed in Chapter 3. The connection between DDT models and image segmentation is made in Chapter 4, in which the image segmentation problem is mapped seamlessly to probabilistic inference in DDTs. Subsequently, the feature extraction methodologies and model selection problems are discussed, and the corresponding empirical results are reported at the end of Chapter 5. More specifically, we propose a modified version of the Bayesian information criterion, called mBIC, as the main algorithm to determine the number of segments or objects in an input image.

Figure 1-6. The workflow diagram of the dissertation.

Additionally, we also propose superGMM, a majority-vote-based algorithm benefiting from superpixels, as an efficient way to estimate the number of segments and to initialize the DDT's parameters. Experiments and comparisons among the proposed models and the state of the art are reported in Chapter 6, and conclusions are drawn in Chapter 7, the last chapter.

CHAPTER 2
DATA-DRIVEN TREE-STRUCTURED BAYESIAN NETWORKS

In this chapter, we discuss the probabilistic model of data-driven tree-structured Bayesian networks (DDTs), derive the learning algorithm using the maximum likelihood approach, and derive the inference algorithm using the sum-product algorithm. First, the probabilistic graphical model of the DDT is discussed with minimal connection to the image segmentation application. This way, the DDT model is proposed as a more general framework that can be applied to many applications beyond hierarchical image segmentation. Although this chapter concentrates only on the graphical models of the DDT, connections to image segmentation applications are introduced when appropriate, for the sake of better understanding.

Figure 2-1. The probabilistic graphical model of the multi-scale evidence Data-Driven Tree-Structured Bayesian Network (meDDT). Hidden variables and evidence variables are denoted by round-shaped nodes and rectangular-shaped nodes, respectively. The graph structure is confined to be of tree type and is usually unbalanced. Such tree structures can be constructed by many choices of algorithm.

2.1 Probabilistic Model of DDT

The DDT is a directed acyclic graph (DAG) with two disjoint sets of random variables, hidden and evidence (a.k.a. observed), graphically represented by round-shaped and rectangular-shaped nodes, respectively, as depicted on the right-hand side of Fig. 2-1. In graphical models, variables are represented by nodes in the network, and the underlying structure of the joint probability is encoded by the graph structure of the network, which is represented by edges.

Figure 2-2. The factor graph representation of the meDDT model illustrated in Fig. 2-1. The conditional probability tables (CPTs) of class transitions are represented by the triangular-shaped factor nodes and are tied together within the same level. The evidence-class conditional distributions are represented by the pentagon-shaped factor nodes and are tied together within the same level for robustness. The variable nodes are denoted by the same symbols as in Fig. 2-1.

Thus, from this point on, the terms node and variable are used interchangeably. While hidden nodes in a DDT model are connected via directed edges representing conditional dependencies among the variables, evidence nodes are connected only to their corresponding hidden-node parents, as depicted on the left-hand side of Fig. 2-1. The DDT model presented in this section incorporates evidence from multiple scales by having an evidence node for each hidden-node parent in every level of the network, and thus it is named the multi-scale evidence DDT (meDDT). Hidden and observed nodes are associated with image sites, which in general can be any arbitrary image region, e.g., pixels, blocks of pixels, or superpixels. Throughout the paper, we associate image sites with superpixels, as mentioned earlier in the text. Moreover, the meDDT is a hierarchical tree-structured model, meaning that the structure of the meDDT is confined to be a tree, each of whose levels has the interpretation of a scale, and hence the whole structure represents multiscale information about the underlying data.

The hierarchical tree structure of the meDDT can be described as follows. There are L levels in the hierarchy, where H_l denotes the set of indices in level l ∈ {1, ..., L}, and, when appropriate, H_l is used to denote level l of the meDDT. The root

PAGE 35

The root level $H_L$ contains only a single superpixel, which is equivalent to the entire input image (the corresponding index set is $H_L = \{1\}$). As the level index $l$ decreases along the depth of the tree, the number of nodes contained in each level increases. The level $H_1$ thus corresponds to the finest scale of the resulting segmentation of the meDDT. Lying beneath the level $H_l$ is the evidence level $E_l$, whose nodes are the observed image features; these are connected to their corresponding parent nodes in the level $H_l$ in a one-to-one manner. The indices of nodes in the tree are used in ascending order from the root to the leaves. The set of all indices in the structure is denoted by $H \cup E$, with the hidden and observed nodes denoted as $H = \{H_l\}_{l=1}^{L}$ and $E = \{E_l\}_{l=1}^{L}$ respectively. The number of nodes in the hidden and evidence level $l$ is $|H_l|$ and $|E_l|$ respectively, and we denote the total number of nodes in the structure as $N = |H| + |E|$. Generally, we have $|H_1| > |H_2| > \cdots > |H_L|$, and the same applies for $E$, which constitutes a pyramidal tree.

The structure connectivity is characterized by an $N \times N$ adjacency matrix $Z$, where $z_{ji}$ takes the value 1 if nodes $j \in H_l$ and $i \in \{0, H_{l+1}\}$ are connected. Connections are established under the constraint that a node in level $l$ can only connect to a parent node in the adjacent upper level $l+1$, except the root node, which connects to null. Therefore, a realization of the structure matrix $Z$ can have at most one entry equal to 1 in each row. Unlike [18, 35, 36, 60], the structure $Z$ in our framework is not a random variable, as it is learned from each input image by a separate algorithm and remains fixed throughout the process of segmenting the image.

Each round-shaped node in Fig. 2-1 represents a discrete random variable $x_j$ of an image site $j$ in level $l$, which takes a value from a label set $\mathcal{C}_l = \{1, \dots, C_l\}$ such that $x_{jc} = 1$ and $x_{j\bar{c}} = 0$, $\bar{c} \neq c$, if the image site $j$ has class label $c \in \mathcal{C}_l$. This notation facilitates the use of the following expression for the multinomial conditional probability of hidden node $x_j$ given its parent $x_i$:

\[ p(x_j \mid x_i, \theta_{ji}; l) = \prod_{v=1}^{C_l} \prod_{u=1}^{C_{l+1}} \theta_{jivu}^{x_{jv} x_{iu}} \]
where $\theta_{jivu}$ denotes the class transition probability $p(x_{jv} = 1 \mid x_{iu} = 1)$, which is conventionally referred to as the conditional probability table (CPT). For the sake of computational stability, let us assume that the CPT is shared among all the nodes in the same level, i.e., $\theta_{jivu} = \theta_{j'i'vu} = \theta_{lvu}$ for $l \in \{1, \dots, L-1\}$. Consequently, $\theta = \{\theta_{lvu}\}$ collectively denotes the CPT of the model, which is given by

\[ p(x_j \mid x_i; \theta_l) = \prod_{v=1}^{C_l} \prod_{u=1}^{C_{l+1}} \theta_{lvu}^{x_{jv} x_{iu}} \]

Note that this framework is unsupervised, hence the probabilistic model of a class $c$ is not provided a priori, and the number of labels allowed for each input image must be defined before executing the algorithm. Estimating the appropriate number of classes for each image (the model selection problem) is explained in more detail in Section 5.2.

We introduce observed variables represented by shaded square-shaped nodes in the structure, as illustrated in Fig. 2-1. Each observed random variable $y_e \in \mathbb{R}^d$ of an image site $e \in E$ represents the relevant image features, such as color or texture, which take on continuous values. Extensive details on our choice of features will be discussed in the experimental results section. We model the feature vector $y_e$ using a multivariate Gaussian distribution given as:

\[ p(y_e \mid x_i; \theta_l) = \prod_{c=1}^{C_l} \mathcal{N}(y_e \mid \mu_c, \Lambda_c^{-1}; l)^{x_{ic}} \]

where

\[ \mathcal{N}(y_e \mid \mu_c, \Lambda_c^{-1}; l) = \frac{|\Lambda_c|^{1/2}}{(2\pi)^{D_l/2}} \exp\left( -\frac{1}{2} (y_e - \mu_c)^{\top} \Lambda_c (y_e - \mu_c) \right) \]

is the Gaussian distribution, where $\mu_c$ and $\Lambda_c$ are the $D_l \times 1$ mean parameter and the $D_l \times D_l$ precision matrix for class $c$ in level $l$ respectively. Generally, the cardinality of the label set $\mathcal{C}_l$ decreases as the level $l$ decreases.

Using the notation described above, we can now write the hidden labels collectively as $X = \{x_j\}_{j \in H}$ and the observed image features as $Y = \{y_e\}_{e \in E}$.
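To make the notation concrete, the following minimal sketch (in Python with NumPy and SciPy; the function and variable names are ours and not part of any released DDT implementation) evaluates the level-tied class-transition probability and the Gaussian class-conditional density defined above.

```python
import numpy as np
from scipy.stats import multivariate_normal

def transition_prob(theta_l, x_child, x_parent):
    """p(x_j | x_i) under the level-tied CPT: theta_l[v, u] is
    p(x_jv = 1 | x_iu = 1). The one-hot indicators x_child and
    x_parent play the role of the exponents x_jv * x_iu."""
    return theta_l[int(np.argmax(x_child)), int(np.argmax(x_parent))]

def class_conditionals(y_e, means, precisions):
    """N(y_e | mu_c, Lambda_c^{-1}) for every class c in one level.
    precisions[c] is the D x D precision matrix Lambda_c, so the
    covariance handed to scipy is its inverse."""
    return np.array([
        multivariate_normal.pdf(y_e, mean=means[c],
                                cov=np.linalg.inv(precisions[c]))
        for c in range(len(means))])
```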

The probabilistic graphical model of meDDT can be factorized as

\[ p(X, Y \mid Z) = \prod_{l=1}^{L} \prod_{e \in E_l} \prod_{i \in H_l} p(y_e \mid x_i; \theta_l)^{z_{ei}} \times \prod_{l=1}^{L} \left( \prod_{j \in H_l} \prod_{i \in \{0, H_{l+1}\}} p(x_j \mid x_i; \theta_l)^{z_{ji}} \right) \]

and, when taking into account all the conditional probabilities mentioned above, the full joint probability distribution can be written as:

\[ p(X, Y \mid Z) = \prod_{l=1}^{L} \prod_{e \in E_l} \prod_{i \in H_l} \prod_{c \in \mathcal{C}_l} \mathcal{N}(y_e \mid \mu_c, \Lambda_c^{-1}; l)^{x_{ic} z_{ei}} \times \prod_{l=1}^{L} \left( \prod_{j \in H_l} \prod_{i \in \{0, H_{l+1}\}} \prod_{v \in \mathcal{C}_l} \prod_{u \in \mathcal{C}_{l+1}} \theta_{lvu}^{x_{jv} x_{iu} z_{ji}} \right) \]

and, therefore, the log-likelihood of the complete data can be expressed as:

\[ \log p(X, Y \mid Z) = \sum_{l=1}^{L} \sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \sum_{c \in \mathcal{C}_l} x_{ic} \log \mathcal{N}(y_e \mid \mu_c, \Lambda_c^{-1}; l) + \sum_{l=1}^{L} \sum_{j \in H_l} \sum_{i \in \{0, H_{l+1}\}} z_{ji} \sum_{v \in \mathcal{C}_l} \sum_{u \in \mathcal{C}_{l+1}} x_{jv} x_{iu} \log \theta_{lvu} \]

All the parameters of the joint can be grouped in the set $\Theta = \{\theta_{lvu}, \mu_{c \in \mathcal{C}_l}, \Lambda_{c \in \mathcal{C}_l}\}$, $\forall l \in \{1, \dots, L\}$, $\forall c, v, u \in \{1, \dots, C_l\}$. Note that the connectivity structure matrix $Z$ is assumed known from each input image, hence it is excluded from the parameter set.

2.2 Inference and Parameter Learning

In this section, we present a maximum likelihood estimation algorithm for the DDT model parameters $\Theta$. For models with hidden variables, Expectation-Maximization (EM) [54, 55] is a principled framework that alternates between inferring the posterior over the hidden variables in the E-step, and maximizing the expected value of the complete-data likelihood with respect to the model parameters in the M-step. More specifically, we infer the posterior probability over the hidden labels in the E-step and compute the relevant sufficient statistics needed in the M-step for maximizing $\Theta$.
Since exact inference on a tree-structured graph is tractable, the objective function can be written as:

\[ F(\Theta; \Theta^{t-1}) \triangleq \langle \log p(X, Y \mid Z, \Theta) \rangle_{p(X \mid Y, Z, \Theta^{t-1})} \]

where $\langle f(x) \rangle_{q(x)}$ denotes the expectation of the function $f(x)$ with respect to the distribution $q(x)$. In the M-step, by maximizing $F$ w.r.t. $\Theta$, we derive the closed-form update equations for $\mu_c$, $\Lambda_c$ and $\theta_{lvu}$ as follows:

\[ \mu_{c \in \mathcal{C}_l} = \frac{\sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle_{p(X \mid Y, Z, \Theta^{t-1})} \, y_e}{\sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle_{p(X \mid Y, Z, \Theta^{t-1})}} \]

\[ \Lambda^{-1}_{c \in \mathcal{C}_l} = \frac{\sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle_{p(X \mid Y, Z, \Theta^{t-1})} (y_e - \mu_c)(y_e - \mu_c)^{\top}}{\sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle_{p(X \mid Y, Z, \Theta^{t-1})}} \]

\[ \theta_{lvu} = \frac{\hat{\theta}_{lvu}}{\sum_{\bar{v}} \hat{\theta}_{l\bar{v}u}}, \quad \text{where} \quad \hat{\theta}_{lvu} = \sum_{j \in H_l} \sum_{i \in \{0, H_{l+1}\}} z_{ji} \langle x_{jv} x_{iu} \rangle_{p(X \mid Y, Z, \Theta^{t-1})} \]

denotes the unnormalized class-transition CPT. More details of the derivation can be found in Appendix B.
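As an illustration only, the closed-form updates can be written compactly once the expectations $\langle x_{ic} \rangle$ are available. The sketch below (our own names, not the dissertation's code) folds the structure matrix $Z$ into a per-evidence responsibility matrix, since each evidence node has exactly one hidden parent.

```python
import numpy as np

def m_step_gaussian(Y, R):
    """M-step for one level: Y is (n, D) with rows y_e, R is (n, C)
    with R[e, c] = <x_ic> for the unique parent i of evidence e."""
    Nc = R.sum(axis=0)                       # effective class counts
    mu = (R.T @ Y) / Nc[:, None]             # weighted-mean update
    cov = []
    for c in range(R.shape[1]):
        d = Y - mu[c]                        # (y_e - mu_c)
        cov.append((R[:, c, None] * d).T @ d / Nc[c])  # Lambda_c^{-1}
    return mu, np.stack(cov)

def m_step_cpt(theta_hat):
    """Normalize the unnormalized CPT over the child label v:
    theta_lvu = theta_hat_lvu / sum_v theta_hat_lvu."""
    return theta_hat / theta_hat.sum(axis=0, keepdims=True)
```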

The update equations above require that we compute the expectation terms $\langle x_{ic} \rangle_{p(X \mid Y, Z, \Theta^{t-1})}$ and $\langle x_{jv} x_{iu} \rangle_{p(X \mid Y, Z, \Theta^{t-1})}$, which can be done efficiently using a sum-product algorithm. To compactly explain our inference step, we express the DDT in terms of a factor graph [39] using 2 types of nodes: 1) a variable node for each random variable $x_i$, and 2) a factor node for each local function, namely the CPT in this case. The factor graph representation of meDDT is illustrated in Fig. 2-2. In the operation of the sum-product algorithm, we compute a message from variable node $x$ to factor node $f$, $m_{x \to f}(x)$, and a message from factor node $f$ to variable node $x$, $m_{f \to x}(x)$, which can be expressed as:

\[ m_{x \to f}(x) = \prod_{h \in n(x) \setminus \{f\}} m_{h \to x}(x) \]

\[ m_{f \to x}(x) = \sum_{n(f) \setminus \{x\}} \left( f(n(f)) \prod_{y \in n(f) \setminus \{x\}} m_{y \to f}(y) \right) \]

where $n(x)$ denotes the set of neighbors of a given node $x$, and $n(x) \setminus \{f\}$ denotes the remaining nodes after $f$ is removed from the set $n(x)$. The posterior marginal $p(x_j \mid Y)$ can be calculated by multiplying all incoming messages to the variable node $x_j$. The EM algorithm alternately iterates between the above E- and M-steps until the cost function $F$ converges, yielding the set of optimal parameters $\Theta^* = \{\theta_{lvu}, \mu_c, \Lambda_c\}$.

We can infer the distribution at each node of the hidden level $H_l$ using the maximum posterior marginal (MPM), which is optimal for the sitewise 0-1 loss function [25, 26]. That is, we label image site $j$ with the label $c^* = \arg\max_{c \in \mathcal{C}_l} p(x_j = c \mid Y, Z; \Theta^*, l)$, which can be computed using the sum-product algorithm as $p(x_j = c \mid Y, Z) = \langle x_{jc} \rangle_{p(X \mid Y, Z)}$. Note: the difference between maximum a posteriori (MAP) and MPM is that the label of a node under the latter method can be calculated independently of the others. The inference and parameter estimation procedure is summarized in Algorithm 1.
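For a pairwise factor such as the class-transition CPT, the two message equations reduce to an elementwise product and a matrix-vector product. The toy sketch below is hypothetical (a real DDT implementation would schedule these messages over the whole tree) and only shows one factor-to-variable step.

```python
import numpy as np

def msg_var_to_factor(incoming, domain_size):
    """m_{x->f}(x): product of the messages m_{h->x} from all
    neighbors h of x except f; a leaf sends the all-ones message."""
    out = np.ones(domain_size)
    for m in incoming:
        out *= m
    return out

def msg_factor_to_var(factor, msg_from_other_var):
    """m_{f->x}(x) for a pairwise factor f(x, y) stored as a matrix:
    sum over y of f(x, y) * m_{y->f}(y)."""
    return factor @ msg_from_other_var

# toy example: child marginal from the parent message through the CPT
theta = np.array([[0.7, 0.2],      # theta[v, u] = p(child=v | parent=u)
                  [0.3, 0.8]])
m_parent = msg_var_to_factor([np.array([0.6, 0.4])], domain_size=2)
belief = msg_factor_to_var(theta, m_parent)
print(belief / belief.sum())       # (toy) posterior marginal p(x_j | Y)
```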

Algorithm 1 Parameter estimation and inference for meDDT

1: procedure InferenceMeDDT(structure $Z$, observations $Y$)
2:   Insert the evidence $y_{e \in E_l}$ into every level $l \in \{1, \dots, L\}$
3:   Run GMM on the evidence in each level $l$ separately
4:   Use the GMM parameters to initialize the parameters $\mu_c$, $\Lambda_c$ for $c \in \mathcal{C}_l$, where $l \in \{1, \dots, L\}$
5:   Calculate the initial value of the marginal posterior of each $x_j$ from the corresponding GMM classifier
6:   while not converged do
7:     Calculate $\theta_{lvu} = \hat{\theta}_{lvu} / \sum_{\bar{v}} \hat{\theta}_{l\bar{v}u}$
8:     Calculate $\mu_c = \left( \sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle \, y_e \right) / \left( \sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle \right)$
9:     Calculate $\Lambda_c^{-1} = \left( \sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle (y_e - \mu_c)(y_e - \mu_c)^{\top} \right) / \left( \sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle \right)$ when a full covariance matrix is desired, or $\Lambda_c^{-1} = \left( \sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle \, \mathrm{diag}[(y_e - \mu_c)^2] \right) / \left( \sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle \right)$ when a diagonal covariance matrix is desired
10:    Run the sum-product algorithm to calculate each $\langle x_j \rangle$
11:  end while
12:  Return the distributions $p(x_j \mid Y, Z; \Theta^*, l)$ for all $l \in \{1, \dots, L\}$ and the maximum-likelihood parameters $\Theta^*$
13: end procedure

2.3 Summary

In this chapter, the probabilistic model of meDDT has been discussed together with its learning and inference algorithms, using an EM-like approach and the maximum posterior marginal. Some connections between meDDT and image segmentation applications have been made. In Chapter 3, variant models of DDT are developed on the template of meDDT.
CHAPTER 3
VARIANTS OF DDT

In Chapter 2, we proposed an architecture of DDT, namely meDDT, which has evidence at all scales and is fully characterized by the following joint distribution:

\[ p(X, Y \mid Z) = \prod_{l=1}^{L} \prod_{e \in E_l} \prod_{i \in H_l} \prod_{c \in \mathcal{C}_l} \mathcal{N}(y_e \mid \mu_c, \Lambda_c^{-1}; l)^{x_{ic} z_{ei}} \times \prod_{l=1}^{L} \left( \prod_{j \in H_l} \prod_{i \in \{0, H_{l+1}\}} \prod_{v \in \mathcal{C}_l} \prod_{u \in \mathcal{C}_{l+1}} \theta_{lvu}^{x_{jv} x_{iu} z_{ji}} \right) \]

The log-likelihood can then be expressed as:

\[ \log p(X, Y \mid Z) = \sum_{l=1}^{L} \sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \sum_{c \in \mathcal{C}_l} x_{ic} \log \mathcal{N}(y_e \mid \mu_c, \Lambda_c^{-1}; l) + \sum_{l=1}^{L} \sum_{j \in H_l} \sum_{i \in \{0, H_{l+1}\}} z_{ji} \sum_{v \in \mathcal{C}_l} \sum_{u \in \mathcal{C}_{l+1}} x_{jv} x_{iu} \log \theta_{lvu} \]

In this chapter, the formulation above motivates two additional architectures of DDTs: leDDT and coDDT. The former removes all the evidence nodes from the hidden-node parents except at the bottom-most level $H_1$, and the latter incorporates nodes randomly selected within a superpixel node. In the subsequent sections, we discuss the probabilistic models of these newly derived models, which have a similar form to meDDT.

3.1 Leaf-Evidence DDT (leDDT)

All the evidence nodes except those at the bottom-most level of the tree ($H_1$) have been removed, as shown in Fig. 3-1A. This is the simplest among all the DDT architectures, yet the leDDT enables us to understand more about the effect of having versus not having evidence nodes in all scales. We anticipate the experiment will bring about a better understanding of the DDT.
A) Leaf-Evidence DDT (leDDT). B) Multiscale-Evidence DDT (meDDT). C) Combined pixel-superpixel evidence DDT (coDDT).

Figure 3-1. All three architectures of DDT. This figure illustrates the evolution of the DDT architectures in order of structural complexity, from least in leDDT to most in coDDT. (Top) Start with leDDT, whose structure contains evidence nodes at the bottom-most level only. (Middle) meDDT can be obtained from leDDT as evidence nodes are added to all of its levels. (Bottom) Randomly sampling 1% from each superpixel, the pixel-level nodes are added to the hidden nodes in the bottom-most level of the tree, resulting in coDDT.
The probabilistic model of leDDT is similar to that of meDDT except that all the evidence in the levels $H_l$ where $l \geq 2$ is removed, and so the joint distribution becomes

\[ p(X, Y \mid Z) = \prod_{e \in E_1} \prod_{i \in H_1} \prod_{c \in \mathcal{C}_1} \mathcal{N}(y_e \mid \mu_c, \Lambda_c^{-1})^{x_{ic} z_{ei}} \times \prod_{l=1}^{L} \left( \prod_{j \in H_l} \prod_{i \in \{0, H_{l+1}\}} \prod_{v \in \mathcal{C}_l} \prod_{u \in \mathcal{C}_{l+1}} \theta_{lvu}^{x_{jv} x_{iu} z_{ji}} \right) \]

The update equations can be derived using the EM algorithm, the derivation of which can be found in Appendix C. The inference and parameter estimation procedure is summarized in Algorithm 2.

3.2 Combined Pixel-Superpixel Evidence DDT (coDDT)

In addition to the meDDT structure, nodes in level $H_1$ of a coDDT connect to child nodes representing pixel-level features, as illustrated in Fig. 3-1C. Each rectangular plate in Fig. 3-1C represents a pixel-level node with its corresponding evidence. The pixel-level nodes are randomly selected within a superpixel. The coDDT seamlessly combines both superpixel- and pixel-level nodes in order to make a richer representation of an image. Due to the large number of pixels in an image, the pixel-level nodes are obtained by randomly sampling a small portion (1%) of the total number of pixels in each superpixel. The probabilistic model of coDDT can be expressed in exactly the same way as that of meDDT except that the nodes at the bottom-most level of coDDT are pixel-level nodes. To avoid reference confusion, we shall call the pixel level $H_0$, and hence the level $H_1$ continues to represent the finest scale of superpixels, as in meDDT and leDDT. Consequently, the probabilistic model of coDDT can be written as

\[ p(X, Y \mid Z) = \prod_{l=0}^{L} \prod_{e \in E_l} \prod_{i \in H_l} \prod_{c \in \mathcal{C}_l} \mathcal{N}(y_e \mid \mu_c, \Lambda_c^{-1}; l)^{x_{ic} z_{ei}} \times \prod_{l=0}^{L} \left( \prod_{j \in H_l} \prod_{i \in \{0, H_{l+1}\}} \prod_{v \in \mathcal{C}_l} \prod_{u \in \mathcal{C}_{l+1}} \theta_{lvu}^{x_{jv} x_{iu} z_{ji}} \right) \]
Algorithm 2 Parameter estimation and inference for leDDT

1: procedure InferenceLeDDT(structure $Z$, observations $Y$)
2:   Insert the evidence $y_{e \in E_1}$ at the bottom-most level of the tree
3:   Run GMM on the evidence in the level $l = 1$
4:   Use the GMM parameters to initialize the parameters $\mu_c$, $\Lambda_c$ for $c \in \mathcal{C}_1$
5:   Calculate the initial value of the marginal posterior of each $x_j$ from the corresponding GMM classifier
6:   while not converged do
7:     Calculate $\theta_{lvu} = \hat{\theta}_{lvu} / \sum_{\bar{v}} \hat{\theta}_{l\bar{v}u}$ for all $l \in \{1, \dots, L\}$
8:     Calculate $\mu_c = \left( \sum_{e \in E_1} \sum_{i \in H_1} z_{ei} \langle x_{ic} \rangle \, y_e \right) / \left( \sum_{e \in E_1} \sum_{i \in H_1} z_{ei} \langle x_{ic} \rangle \right)$
9:     Calculate $\Lambda_c^{-1} = \left( \sum_{e \in E_1} \sum_{i \in H_1} z_{ei} \langle x_{ic} \rangle (y_e - \mu_c)(y_e - \mu_c)^{\top} \right) / \left( \sum_{e \in E_1} \sum_{i \in H_1} z_{ei} \langle x_{ic} \rangle \right)$ when a full covariance matrix is desired, or $\Lambda_c^{-1} = \left( \sum_{e \in E_1} \sum_{i \in H_1} z_{ei} \langle x_{ic} \rangle \, \mathrm{diag}[(y_e - \mu_c)^2] \right) / \left( \sum_{e \in E_1} \sum_{i \in H_1} z_{ei} \langle x_{ic} \rangle \right)$ when a diagonal covariance matrix is desired
10:    Run the sum-product algorithm to calculate each $\langle x_j \rangle$
11:  end while
12:  Return the distributions $p(x_j \mid Y, Z; \Theta^*, l)$ for all $l \in \{1, \dots, L\}$ and the maximum-likelihood parameters $\Theta^*$
13: end procedure

The only difference between the coDDT joint distribution above and that of meDDT is that the former incorporates the pixel-level nodes as the bottom-most level, hence its index starts from $l = 0$ as opposed to $l = 1$. Therefore, the update equations for coDDT are the same as those of meDDT except that they start from $l = 0$, as shown in Algorithm 3.
Algorithm 3 Parameter estimation and inference for coDDT

1: procedure InferenceCoDDT(structure $Z$, observations $Y$)
2:   Insert the evidence $y_{e \in E_l}$ into every level $l \in \{0, \dots, L\}$
3:   Run GMM on the evidence in each level $l$ separately
4:   Use the GMM parameters to initialize the parameters $\mu_c$, $\Lambda_c$ for $c \in \mathcal{C}_l$, where $l \in \{0, \dots, L\}$
5:   Calculate the initial value of the marginal posterior of each $x_j$ from the corresponding GMM classifier
6:   while not converged do
7:     Calculate $\theta_{lvu} = \hat{\theta}_{lvu} / \sum_{\bar{v}} \hat{\theta}_{l\bar{v}u}$
8:     Calculate $\mu_c$ as in Algorithm 1, with the sums running over $e \in E_l$ and $i \in H_l$ for $l \in \{0, \dots, L\}$
9:     Calculate $\Lambda_c^{-1}$ as in Algorithm 1 (full or diagonal covariance), again with the level index starting from $l = 0$
10:    Run the sum-product algorithm to calculate each $\langle x_j \rangle$
11:  end while
12:  Return the distributions $p(x_j \mid Y, Z; \Theta^*, l)$ for all $l \in \{0, \dots, L\}$ and the maximum-likelihood parameters $\Theta^*$
13: end procedure
3.3 TSBN as a Special Case of leDDT

At this point it is worth mentioning that the TSBN can be seen as a special case of leDDT. A TSBN is a tree structure whose nodes represent rectangular regions of the image, and it takes as input the image features at the bottom-most level of the tree only. The nodes in the network are connected via a quad-tree structure, which discards all the intrinsic structures and boundaries of objects in the image. Because of this, TSBN tends to produce blocky segmentations. Due to these characteristics, TSBN can be modeled using leDDT where all the superpixels are rectangular and are connected with a quad-tree structure. Additionally, we add multi-scale evidence nodes to all the levels of the TSBN tree, which we refer to as the multi-scale evidence TSBN (meTSBN).

3.4 Summary

In this chapter we proposed two architectures which stem from meDDT, namely leDDT and coDDT. Both architectures have a probabilistic model similar to that of meDDT. Additionally, we demonstrated that TSBN is a special case of leDDT and proposed meTSBN as an analog of meDDT, where the superpixels are rectangular and the connections form a quad-tree structure.
CHAPTER 4
IMAGE SEGMENTATION USING DDTS

In Chapters 2 and 3, we discussed the probabilistic graphical models of meDDT and its variants. Such graphical models consist of edges and nodes forming a tree structure. Each node in the network represents an image region, ranging from a pixel to a superpixel. The structure of the joint distribution is encoded by the connectivities in the networks, which represent the relationships of objects within an input image. In this chapter we propose an insightful approach to construct such a tree to capture the relationships among the different regions of the image, which is a key to accurate segmentation results. Two superpixel generators, denoted by Mori and UCM, are compared quantitatively in terms of their deviations from the ground-truth boundaries made by human subjects.

Following the recent work in [12, 17, 53, 61-64], we adopt a superpixel representation of images. Indeed, for the image segmentation problem, the use of superpixels, each of which represents a set of locally similar pixels, can drastically reduce the number of leaf nodes and greatly simplify the structure of the trees that we need to build. In addition to the computational efficiency, richer and more descriptive information can be gathered from a group of similar pixels combined together. In recent years, a number of superpixel algorithms have been made publicly available. While a detailed comparison of different superpixel algorithms is beyond the scope of our work, our preliminary results show that the quality of the image segmentation output does depend heavily on the output of the superpixel algorithm.

4.1 Image Segmentation using Superpixels

Generally, a graphical model is composed of nodes and edges, as mentioned earlier in the text. Thus far, the DDTs have been presented as a hierarchical unsupervised segmentation method that can be applied to many applications. From this point on, we focus our attention on the application of DDTs to unsupervised image segmentation.
Figure 4-1. The overview of the Data-Driven Tree-Structured Bayesian Network (DDT) framework. The original image (1) is over-segmented in a multiscale hierarchical manner in process (2). (3) The corresponding DDT is built according to the superpixels in each level. (4) The features are extracted from the original image corresponding to each superpixel. (5)-(10) After the learning and inference algorithm, the resulting multiscale segmentation can be interpreted from all the levels.

In this section we describe the connection between the probabilistic model of the DDT and multiscale image segmentation using superpixels as the finest image unit.

To avoid possible confusion later, we shall establish a clear terminology for each step in constructing a DDT. For a given input image, we over-segment it, a.k.a. superpixelize it, by applying a superpixel generator to the image with some predefined parameters, including the number of superpixels. The resulting segmentation, referred to as the superpixel image, contains multiple contiguous regions, each of which is a superpixel. The boundary line encompassing a superpixel is denoted a superpixel contour. The union of all superpixel contours in an image forms a superpixel contour image, which is a complement of the superpixel image. The total number of superpixels within a superpixel image is referred to as the superpixel image cardinality, which is determined by the predefined parameters fed to the superpixel generator. The different settings of such parameters determine the scale of the superpixel image; a superpixel image with a greater superpixel image cardinality is regarded as having a finer scale than another.
When an input image is superpixelized with different predefined parameters into several scales, the set containing all the resulting superpixel images is collectively denoted a multiscale superpixel set, whose total number of scales is termed the scale cardinality. Each scale in such a set can be referred to by a scale index; the finest superpixel image is always assigned the scale index 1, the second finest 2, and so on until the coarsest scale. More formally, the terminology can be mathematically defined as follows.

Let an input image $i$, $\mathrm{IMG}_i$, comprise $N^{i}_{pix}$ pixels; that is, we can write the image $i$ as a set of pixels $\mathrm{IMG}_i = \{p_n\}_{n=1}^{N^{i}_{pix}}$, where each pixel $p_n \in \mathbb{R}^5$ is a 5-dimensional vector containing the intensities of the red, green and blue (RGB) channels and the (integer) x-y location of the pixel. A superpixel generator can be expressed as a function $\Phi(\mathrm{IMG}_i; f)$, which takes an input image $\mathrm{IMG}_i$ and some scale parameter denoted by $f$, and outputs a superpixel image $S^{f}_{i}$ of the same size as the original image $\mathrm{IMG}$. The superpixel image contains a number of superpixels, which depends on the user-defined scale parameter $f$, and can be written as $S^{f}_{i} = \{s^{f}_{j}\}_{j=1}^{N^{f,i}_{supx}}$, where $s^{f}_{j}$ denotes the $j$-th superpixel retrieved at the scale $f$, and the superpixel image cardinality $N^{f,i}_{supx}$ denotes the number of superpixels of image $i$ obtained at the scale $f$. To avoid notational clutter, we neglect the image index $i$ when appropriate. A superpixel is essentially a connected set of pixels $s^{f}_{j} = \{p_n\}_{n \in S^{f}_{j}}$, where $S^{f}_{j}$ denotes the set of indices of pixels sharing similar properties or forming a homogeneous object. Let $B$ denote the set of scale parameters, $B = \{f_l\}_{l=1}^{L}$, and, for the sake of consistency with the models in Chapters 2 and 3, the finest and coarsest scales are denoted by the scale indices $l = 1$ and $l = L$ respectively. In general, the scale is coarser as the scale index increases, and therefore $N^{f_l}_{supx} > N^{f_{l+1}}_{supx} > \cdots > N^{f_L}_{supx} = 1$. We define the set of superpixel images at different scales as a multiscale superpixel image set, denoted by $M = \{S^{f_l}\}_{l \in B}$.
For an input image, we can calculate the corresponding multiscale superpixel image set $M$ by feeding the scale parameters in $B$ one at a time to a superpixel generator. It is required in this framework that the coarsest-scale superpixel image at $l = L$ contains only 1 superpixel, namely the entire input image. The indices of superpixels in $M$ will be rearranged such that the indices of the coarser scale are smaller than those of the finer scale (similar to those in the DDT graph structure). After the rearranging, it is convenient to define $H_l$ as the set of superpixel indices in the scale $l$, and, therefore, the relationship $\forall l \in \{1, \dots, L-1\}, \forall j \in H_l, \forall i \in H_{l+1}: j > i$ holds. Let us also define $H = \cup_{l=1}^{L} H_l$ as the collection of indices within $M$. Since the $H_l$, for $l \in \{1, \dots, L\}$, are mutually exclusive ($H_{l_1} \cap H_{l_2} = \emptyset$ for $l_1 \neq l_2$), we will have that $|H| = \sum_{l=1}^{L} |H_l| = \sum_{l=1}^{L} N^{l}_{supx} = N_{supx}$, the total number of superpixels in $M$. Therefore, at scale $l = L$ the superpixel $s^{f_L}_{j}$ starts with the index $j = 1$; the scale $l = L-1$ starts with $j = 2$, and so on, until the last superpixel $s^{f_1}_{N_{supx}}$ at the finest scale $l = 1$. It is convenient to write $M$ as a set of superpixels $M = \cup_{l=1}^{L} \{s_j\}_{j \in H_l} = \{s_j\}_{j=1}^{N_{supx}}$ and omit the scale notation $f_l$ when $M$ is included in the context.

Next, a tree structure can be constructed on $M$ by connecting a directed edge to a superpixel $s^{f_l}_{j}$ in scale $l$ from its most likely parent $s^{f_{l+1}}_{i}$ in scale $l+1$ (the scale right above). In this chapter we propose a maximum-overlapping algorithm, an intuitive strategy to be discussed later, to build such a tree, which can be applied to any superpixel generator algorithm. In fact, we also found that there is a multitude of possible strategies to construct a tree structure for $M$, including agglomerative clustering, hierarchical clustering, graph-based region merging, mean-shift clustering, and the principle of relevant information (PRI) for clustering (a more general case of mean-shift). The structure of the tree is encoded by an adjacency matrix $Z$, whose details are discussed in Section 4.3.

Each superpixel $s_j \in M$ is then associated with a class-label random variable $x_j$, represented by a node in a tree structure constructed from the structure matrix $Z$ defining the connectivities among the nodes $x_j$ in the network.
In the meDDT framework, for each node $x_j$ there will be an observation node $y_j$ instantiated by a feature vector derived from the superpixel $s_j$. The directed edge from $x_j$ to $y_j$ indicates that the observed feature vector $y_j$ is generated from the label variable $x_j$ taking a discrete value $c$ from the pre-defined label set $\mathcal{C} = \{1, \dots, C\}$. The feature vectors are obtained from the feature extraction algorithm to be discussed in Chapter 5. Therefore, in this section we have illustrated how to convert the multiscale superpixel image set $M$ into an meDDT graphical model. Intuitively, the conversion to leDDT, coDDT, TSBN and meTSBN can be done in a similar manner. After the conversion to DDTs, the unsupervised segmentation can be performed systematically in such frameworks. The unsupervised image segmentation process using meDDT is summarized graphically in Fig. 4-1.

4.2 The Choice of Superpixel Generator and Tree Structure

In this section we introduce and compare 2 superpixel generators quantitatively. The ground-truth segmentation results made by human subjects are discussed accordingly. We also propose criteria to pick the scale of superpixels for unsupervised image segmentation on the dataset BSDS500, a popular benchmark used in the computer vision community.

4.2.1 Comparison of Superpixel Generators

The choice of the scale cardinality and the superpixel image cardinality in each scale determines the achievable accuracy of the segmentation algorithm. More specifically, a multiscale superpixel set determines the maximum recall rate of a segmentation. That is, the set establishes the upper bound of the end accuracy, and without a good choice of such a set, the overall accuracy can be limited considerably. Therefore, the first step is to select a good superpixel generator. A rigorous comparison of superpixel generators per se can be a major task, and is beyond the scope of this dissertation. Nevertheless, we conduct some experiments to empirically verify that the generator we selected meets our required accuracy.
In this experiment, we compare 2 superpixel generator candidates, [61] and [13], denoted by Mori and ultrametric contour mapping (UCM) respectively, due to their computational efficiency and the homogeneity of the resulting superpixel output (e.g., pixels with similar texture and color, residing within the same object boundary, are grouped into the same superpixel). Additionally, the candidates are compared against a quad-tree generator, like those produced by a TSBN, whose superpixels are of the same rectangular shape, which disagrees with the natural boundaries of the objects in the image; this generator is used as a baseline method.

First we compare the quality of the three superpixel generators on the evaluation set of BSDS500 by calculating the average recall value that each generator can achieve while varying the number of superpixels, as shown in Fig. 4-2. Generally, contour disagreement between any two compared contours (e.g., ground-truth contour versus computed contour) naturally occurs, even between two contours generated twice by the same human. Such disagreement can be compensated for by allowing some threshold margin between both contours, and in this experiment that threshold value is 3 pixels. We also illustrate the effective recall values when such margin threshold values are 0, 1, 2 and 3 pixels in Fig. 4-2. The experimental results suggest that UCM is ranked the best among the candidates, followed by Mori and quad-tree. For instance, in order to achieve 80% recall, UCM, Mori and quad-tree have to use approximately 80, 120 and 800 superpixels respectively at the threshold thr = 3. Therefore, UCM is chosen as the superpixel generator in most of our experiments. It is worth noting that the average recall of the contours drawn by human subjects has intrinsic variations and is approximately 70%. At first glance, it may seem surprising how a quad-tree can reach an almost-unity recall rate with the number of superpixels around 4000. Such a resolution of the superpixel image, combined with a 3-pixel threshold margin, covers almost all of the pixels in the image already, contributing to why the almost-unity recall has been reached.
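The recall-with-margin computation used above can be approximated with a distance transform, as in the following sketch (a simplification of the benchmark's bipartite boundary matching; names and defaults are ours).

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

def boundary_recall(gt_boundary, pred_boundary, thr=3):
    """Fraction of ground-truth boundary pixels lying within `thr`
    pixels of some predicted boundary pixel. Both inputs are boolean
    images; the benchmark's one-to-one matching is not reproduced."""
    if not pred_boundary.any():
        return 0.0
    # distance from every pixel to the nearest predicted boundary pixel
    dist = distance_transform_edt(~pred_boundary)
    return float((dist[gt_boundary] <= thr).mean())
```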

In terms of computational run-time, a 7-level quad-tree structure can be generated within a second and is the fastest superpixel generator here. The implementation of Mori spends 45 seconds, 1 minute, 1.5 minutes and 4 minutes on average to generate superpixel images of cardinality 20, 60, 80 and 200 respectively, for an original image of size 321x481 pixels. Due to our limited computational resources, the maximum superpixel image cardinality is confined to 200 superpixels. On the same image set, UCM spends 5 minutes to produce the whole multiscale superpixel set, of cardinality 1400 on average. Essentially, UCM first generates the finest-scale superpixel image, of cardinality approximately more than a thousand, then simply uses a graph-based region-merging algorithm to merge two similar adjacent superpixels into one, and keeps going until the cardinality is unity. In terms of memory efficiency, we found that when it is generated from a region-merging algorithm, a multiscale superpixel set can be stored efficiently in memory by marginalizing the contours across the scales. That is, the summed contour can be thought of as a linear combination of all the superpixel contours with equal weight coefficients. Since the superpixel contour in each scale is a proper subset of the finest contour, there is only one unique set of weight coefficients satisfying the summed contour, which means the summed contour can be decomposed in one unique way. However, the implementation of Mori used in this experiment does not adopt the region-merging scheme.

In general, we would like to compare the recall values of two superpixel generators obtained at approximately the same scale. While the notion of scale in such a setting is not well established, we propose that the scale can be determined by the number of boundary pixels in the contour image. That is because the more boundary pixels we have, the more contours we can construct from the set of given superpixels; in other words, the price we pay to make regions is the number of boundary pixels we put in the image. Therefore, it is sensible to ask the question: how much can a superpixel generator recall given the same number of boundary pixels?
Consequently, we present in Fig. 4-3 the recall performance of each generator versus the boundary pixel ratio, defined as the fraction of the number of boundary pixels to the total number of pixels in the image. The reasons we use the boundary pixel ratio rather than the absolute number of boundary pixels are as follows. First, we want to account for the situation where the sizes of the images are different. Second, in the extreme case, every pixel in the image is regarded to be on the contour boundary, in which case the boundary pixel ratio is unity, representing the finest details in the image; we cannot possibly get more detail than at this level.

It is also an interesting question to ask: what is the relationship between the number of superpixels and the boundary pixel ratio? Analytically, given a boundary pixel ratio value, we can construct many different numbers of superpixels. However, interestingly, the curves in Fig. 4-4 present a similar trend of the relationship across the superpixel generators. That means the number of superpixels can be regarded as a measure of scale, just like the boundary pixel ratio, and is arguably invariant to the superpixel generators used in this experiment. In practice, we found that it is more convenient to use the number of superpixels than the boundary pixel ratio to regulate the scale of the superpixel contour. That is because the number of superpixels can be directly input to most superpixel generators when generating the superpixel contour. Hence, from this point on, the number of superpixels is used to refer to the scale of the superpixel contour.

Observing the variance in the curves shown in Fig. 4-4, we can rank them from the biggest to the smallest variance as UCM, Mori and quad-tree respectively. It is not surprising that the quad-tree has zero variance; that is because the quad-tree superpixel image is held fixed for every input image, as opposed to UCM and Mori, whose superpixel images are adapted to each input image. The bigger variance in UCM implies that it varies the contour to a greater extent than Mori. This fact seems noticeable even with a simple visual observation.
Figure 4-2. Comparison of recall values achieved by candidate superpixel generators. We compare the recall value of each superpixel generator versus the number of superpixels per image when the pixel threshold values are varied (i.e., thr = 0, 1, 2, 3). The curves indicate UCM is the best, Mori is the second, and quad-tree, the baseline method, is the last.

The superpixel image generated by Mori seems to spread across the input image almost equally, as shown in Fig. 4-5, but this is not the case for UCM, shown in Fig. 4-6. If a histogram of the number of pixels in each superpixel were to be plotted, it is not difficult to see that such histograms would peak around a value for Mori, as opposed to being evenly distributed for UCM, which suggests higher entropy in UCM than in Mori.

4.2.2 Human Performance

In order to identify the scale that a human is interested in, we average, across the human ground truth, the number of superpixels versus the boundary pixel ratio, as shown in Fig. 4-4. We found that human subjects tend to make an average of 20 regions per input image with a boundary pixel ratio of 0.017. On the other hand, each input image contains fewer than 10 semantic objects, which means the human-made ground-truth contours usually contain non-semantic regions, implying the ground truth is noisy. From the scatter plot in the same figure, fewer than 20 images have an average number of regions less than 10.
Figure 4-3. Comparison of recall values achieved by candidate superpixel generators when using the boundary pixel ratio. In this case we fix the threshold margin at thr = 3, and UCM is still the best, followed by Mori and quad-tree.

Figure 4-4. Comparison of the number of superpixels versus the boundary pixel ratio for each superpixel generator. Note the similar trends of the curves from different superpixel generators. This also suggests that the number of superpixels and the boundary pixel ratio can be used interchangeably to determine the scale of the superpixel contour.
Figure 4-5. Superpixel images generated by Mori using cardinality = 5, 20, 80 and 320 in the first, second, third and fourth columns respectively. The summation of all superpixel contour images is illustrated in the last column, where the contours appearing in several scales are indicated by higher intensity compared to the others.

The contour images drawn by human subjects are qualitatively contrasted with those generated by UCM and Mori in Fig. 4-7.

It is interesting that the scatter plot of the human-made contours lies almost between the UCM curve and the quad-tree curve. More importantly, its average lies on the quad-tree curve almost perfectly, indicating that humans might spread the contours almost equally over an image. This is still an interesting open problem, potentially leading to a better understanding of how humans deal with object recognition.

4.2.3 The Choice of Scale and Superpixel Image Cardinality

While UCM is selected as our choice of superpixel generator in this experiment, intuitively not all the scales matter for the segmentation.
Figure 4-6. Superpixel images generated by UCM using cardinality = 5, 20, 80 and 320 in the first, second, third and fourth columns respectively. The summation of all superpixel contour images is illustrated in the last column, where the contours appearing in several scales are indicated by higher intensity compared to the others.

Therefore, our experimental results suggest that using only 5 scales (out of 1400) from the UCM is sufficient to produce a visually meaningful result and yield a reasonable computational time. From the coarsest scale to the finest scale, the superpixel images contain 1, 5, 20, 80 and 320 superpixels respectively. The coarsest scale is comprised of only 1 superpixel, namely the entire image, and is referred to as the root level, whose scale index is set to 5 in this 5-level architecture. By having 320 superpixels, the finest-scale superpixel image can maximally achieve 90% segmentation accuracy.

Thus far in this section we have discussed the quality of superpixel generators and the choice of scale and image cardinality. No assumption has been made about the connections within the multiscale superpixel set, and, therefore, there are many different approaches to construct such connections among the superpixels in the set.
Figure 4-7. Comparison of the human-made ground-truth contours and the ones generated by UCM and Mori in the first, second and last columns respectively.
In the subsequent section, we propose a robust approach to form such connections, which can be used with most superpixel generators besides UCM.

4.3 Building a Data-Driven Tree Structure

In this section, we discuss the methodology applied to each input image in order to create a data-driven tree structure, where each level in the hierarchy can be regarded as a scale of visualization (coarse-to-fine representation). Therefore, the terms level and scale are used interchangeably when appropriate from this point on.

We describe a method to build a data-driven tree structure from each input image and its corresponding multiscale superpixel set derived from the output of UCM or any superpixel generator. Two constraints are required in order to build such a tree in our experiment. First, each observed node $y_e$ in the leaf level can connect to its parent in a one-to-one, transparent manner, as shown in Fig. 2-1, since both levels are indeed derived from the same over-segmenting process. Second, each variable node $x_j$ in level $l \in \{1, \dots, L-1\}$ can only be connected to a parent in level $l+1$. This regularization enables smoothness of the information transfer across the scales of the image. The constraint is formulated formally as follows. For all $j \in H_l$, $i \in H_{l+1}$ and $l \in \{1, \dots, L-1\}$, the connectivity $z_{ji} = 1$ when $i = i^*$, and otherwise $z_{ji} = 0$, where $i^* = \arg\max_{i \in H_{l+1}} |s_i \cap s_j|$; here $s_j$ denotes the superpixel (or image site) $j$ associated with node $x_j$, and the operator $|\cdot|$ is a measure of an image site. In this particular case, $|s_i \cap s_j|$ means the number of overlapping pixels when intersecting superpixels $s_i$ and $s_j$. Hence, from this point on, this approach is referred to as the maximum-overlapping algorithm. When regarding $s_j$ as an arbitrary set, this idea can be extended to build a data-driven tree structure in other settings beyond superpixels, for instance block pixels, voxels, or abstract image regions. Furthermore, the proposed method still works in the case where $s_j$ is not a strict subset of $s_i$, which makes the construction process quite robust and general.
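A minimal sketch of the maximum-overlapping rule on integer label maps is given below (the names are ours, not the dissertation's code). It returns, for every superpixel at scale $l$, the index of the scale-$(l+1)$ superpixel with the largest pixel overlap, i.e., the nonzero entries of one block of $Z$.

```python
import numpy as np

def max_overlap_parents(fine_labels, coarse_labels):
    """fine_labels, coarse_labels: integer label maps of the same image
    at scales l and l+1. Returns {fine_id: coarse_id}, the directed
    edges of the data-driven tree between the two scales."""
    parents = {}
    for j in np.unique(fine_labels):
        mask = fine_labels == j                 # pixels of superpixel s_j
        ids, counts = np.unique(coarse_labels[mask], return_counts=True)
        parents[int(j)] = int(ids[np.argmax(counts)])  # argmax |s_i ∩ s_j|
    return parents
```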

It is worth mentioning that, unlike Mori, the multiscale contour obtained from UCM has a special, desirable property: the contours appearing in level $l$ strictly appear in any level $k < l$, which makes the resulting multiscale contours obtained from UCM more consistent than the ones from Mori. In addition, our experimental results suggest that UCM matches the human-made contours better than Mori does.

4.4 Image Data Set

We report our results from experiments on segmentation and matching of boundary images from the Berkeley Segmentation Data Set and Benchmarks 500 (BSDS500), reported in [13]. The BSDS500 comprises 500 color images of size 321x481 pixels portraying natural scenes with high diversity, and hence is considered a challenging data set. The images are divided into 3 sets: 200 for training (BSDS_train), 100 for evaluating (BSDS_val) and 200 for testing (BSDS_test). Throughout the dissertation, we use BSDS_train as the training data set and BSDS_val and BSDS_test as test sets. More details regarding the data set can be found in Appendix A.1.

Despite the data set being designed for unsupervised segmentation, the training images can be used to determine the appropriate scale for image segmentation and the number of clusters/segments in a particular image. Since this is an unsupervised framework, the models of the objects in the image are not available beforehand; hence the results from this experiment cannot be reasonably compared against supervised object recognition or supervised boundary detection results reported in the literature.

4.5 Performance Evaluation

Although the main focus of this framework is unsupervised image region segmentation, we also report our results on the contour detection problem. The evaluation procedure for each approach is explained in this section.

4.5.1 Results

Since our proposed DDTs produce a hierarchical segmentation tree, obtaining a single segmentation as output involves a choice of optimal scale.
As adopted in [13], one possibility is to use a fixed scale parameter for all the images in the dataset, calibrated to yield the best performance on the training set. Such an approach is referred to as the optimal dataset scale (ODS). On the other hand, when the scale parameter is adapted on a per-image basis, the approach is referred to as the optimal image scale (OIS), whose results are naturally better segmentations than those of ODS.

4.5.2 Contour Detection

Since scale is an important factor in unsupervised image segmentation, the current evaluation method adopts a multiscale contour like that discussed in more detail in [13]. The meDDT has multiple levels of segmentation, hence it naturally outputs multiscale contours for an input image. In a multiscale contour, each pixel is assigned a probability of being on a contour boundary, from 0 to 1. In meDDT, such a boundary image can be produced by first assigning the value 1 to all the boundary pixels and 0 otherwise, and second, simply averaging the contour images across the levels in the tree, resulting in a multiscale contour image whose pixel values are bounded above and below by 1 and 0 respectively. The evaluation of such contours can be done based on the precision-recall framework for image segmentation developed by [65]. The measure evaluates the contour detection performance in terms of precision (P), the fraction of true positives, and recall (R), the fraction of ground-truth boundary pixels recovered. The harmonic mean of precision and recall, namely the F-measure, at the optimal detector threshold, summarizes the detection quality.
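For reference, the summary statistic reduces to a few lines; the sketch below assumes the matched true-positive counts have already been produced by the benchmark's correspondence step (names ours).

```python
def contour_scores(n_matched, n_pred, n_gt):
    """Precision P = matched / predicted, recall R = matched /
    ground truth, and their harmonic mean, the F-measure."""
    p = n_matched / n_pred if n_pred else 0.0
    r = n_matched / n_gt if n_gt else 0.0
    f = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f
```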

4.5.3 Region Image Segmentation

Even though the contour detection evaluation can be done on multiple scales, the region image segmentation, on the other hand, is performed on a single optimal scale depending on the user and model selection. In this experiment, the finest level ($l = 1$) is selected. The quantitative comparison of the region segmentation is based on three performance measures:

1. Segmentation Covering (Covering) [13] measures the average overlap between the calculated segmentation and the ground-truth segmentation. This can be done by first calculating the best compatibility between 2 partitions using the maximum overlap criterion, and next calculating the overlap, defined as the fraction of the number of pixels shared by 2 compatible regions to the number of pixels of their union.

2. The Probabilistic Rand Index (PRI) [66] measures the consistency of segmentation between the calculated and the ground-truth segmentation by counting the fraction of pairs of pixels whose labels are consistent over those that are not. The value of PRI ranges within [0, 1]; higher is better.

3. The Variation of Information (VoI) metric [67] measures the distance between two segmentations as the average conditional entropy of one image given the other, which ranges within [0, ∞); lower is better [68].

4.6 Summary

In this chapter, we discussed the process of unsupervised image segmentation using DDTs, starting from superpixelizing the input image, constructing a corresponding tree, and performing inference in the DDT framework. Although the details are illustrated for meDDT only, they can be applied in a similar manner to other DDT models such as leDDT, coDDT, TSBN and meTSBN. Two superpixel generators, Mori and UCM, are compared quantitatively; the latter approach is shown to be more accurate. Furthermore, the dataset BSDS500 is introduced and will be used further in the text as a benchmark dataset. We also discussed the performance evaluation methods in both regimes: contour detection and image region evaluation. Next, feature extraction will be discussed.
CHAPTER 5
FEATURE AND MODEL SELECTION

5.1 Feature Selection

In Chapters 2 and 3, we introduced the meDDT for unsupervised image segmentation and its variants leDDT and coDDT. The details of the feature selection and feature extraction approaches are described in this section. While pixel-level features are well studied in the community and very convenient to obtain, one main drawback is that the feature information is usually derived from one single pixel. In our work, we employ the superpixel as the smallest unit of the segmentation, each of which contains many pixels with similar properties. Therefore, richer information, including group statistics of the member pixels, can be extracted from each superpixel.

In fact, feature extraction for a discriminative model (e.g., supervised image segmentation and object recognition) is not a new idea, and there are a number of approaches adopted as standard tools, such as SIFT, SURF, HoF, textons, histograms of k-means, etc. However, few papers discuss how to use those features in generative models, especially when incorporating superpixels, whose shapes and sizes are not usually consistent among them. Moreover, because in DDT the full-covariance matrix is preferred in order to precisely capture the probabilistic model of the data, the dimensionality of the feature vectors needs to be handled with care: greater dimensionality contributes to more numerical instability.

In this chapter, we experiment on several combinations of color and texture features and feature vector construction methods. The color features of interest are generalized RGB (gRGB), generalized-standardized CIE Lab (gsLab) and standardized CIE Luv (sLuv). The choice of texture feature is MR8, composed of several scales and orientations of textons, which reportedly performs well in several texture classification regimes. There are 4 feature vector construction methods used in this experiment, the details of which are described subsequently in the chapter.
In total, we experiment on 24 different combinations of features and construction methods. The best combination will be used in the further experiments.

5.1.1 Color Feature

Generally, the color information in a natural image is encoded by the RGB color space, in which the pairwise distance between any two points does not usually convey human perception. Thus, we transform the RGB values into CIE Lab and CIE Luv, which are more perceptually uniform, and then normalize the values to be in the range from 0 to 1 across each image. This yields the generalized-standardized CIE Lab (gsLab) and standardized CIE Luv (sLuv) respectively. The generalization and standardization of the feature vectors enable features with different ranges of values to be comparable in the same space, yielding improvements in both accuracy and numerical stability. Nevertheless, we also experiment on the generalized RGB (gRGB) because it is relatively inexpensive computationally and can therefore potentially be useful for real-time applications.

5.1.2 Texture Feature

We have considered several filtering techniques, as well as statistical and model-based approaches, for feature extraction. With the constraints in our generative models, some current features are disqualified. The Scale Invariant Feature Transform (SIFT) [6] and SURF [7] are two well-known scale-invariant features that calculate the histogram of gradients around a keypoint, which usually lives on the edge or the corner of an object in the image. Consequently, the keypoints naturally almost overlap with the boundaries of superpixels and hence are not good superpixel descriptors. Furthermore, the dimensionality of the descriptor is, in general, 128, which is relatively high for a generative model with several hundreds of samples in the feature space.
In [69], the performance of four filter banks, including the filter banks used by Leung and Malik [70], Cula and Dana [71], and Schmid [72], was compared, and it was demonstrated that the rotationally invariant, multi-scale, maximum-response MR8 filter bank outperforms the other three. Therefore, in this framework, the texture features are extracted with MR8.

The MR8 filter bank consists of 38 filters but outputs only 8 filter responses. The filter bank is composed of:

- a Gaussian and a Laplacian of a Gaussian (LOG) isotropic filter, both with σ = 10;
- first-derivative filters, associated with edges in the image, at 6 orientations (θ = iπ/6, i = 0, ..., 5) and 3 scales ((σ_x, σ_y) = {(1,3), (2,6), (4,12)});
- second-derivative filters, associated with bars in the image, also at the same 6 orientations and the same 3 scales.

The responses of the Gaussian and the LOG are used directly, but the responses of the rotational filters are selected at each scale using only the maximum filter response across all orientations, in order to ensure rotational invariance. This gives 8 filter responses in total.
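The step that turns 38 filters into 8 responses is just a per-scale maximum over orientations. The sketch below assumes the 36 oriented edge/bar responses have been computed elsewhere (e.g., with anisotropic Gaussian-derivative filters) and only shows the collapse plus the two isotropic channels; it is not the reference MR8 implementation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, gaussian_laplace

def mr8_collapse(oriented, image, sigma_iso=10.0):
    """oriented: dict keyed by ('edge' | 'bar', scale) -> list of six
    orientation responses (2-D arrays). Keeping only the maximum over
    orientations per key gives 6 rotation-invariant channels; the
    isotropic Gaussian and LOG responses complete the 8."""
    channels = [np.max(np.stack(resps), axis=0)
                for resps in oriented.values()]          # 6 channels
    channels.append(gaussian_filter(image, sigma_iso))    # Gaussian
    channels.append(gaussian_laplace(image, sigma_iso))   # LOG
    return np.stack(channels)                             # (8, H, W)
```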

5.1.3 Feature Vector Construction

We categorize the construction process of a feature vector for a superpixel into two main steps as follows. First, populate the relevant samples in a particular superpixel; second, transform and represent the retrieved information in an appropriate format. We experimentally found that those two steps are crucial for the performance of the segmentation algorithm.

5.1.3.1 Populate samples within a superpixel

As mentioned before, most pixels in a superpixel share similar properties, so it is unnecessary to calculate the superpixel statistics using all of them. Instead, using some portion of them is sufficient to capture such statistics. We adopt two approaches to aggregate such samples, 1) random sample and 2) centroid patch, whose details are presented below:

- Random sample (Rnd): Randomly sample pixels from each superpixel. With a sufficient number, the aggregated samples spread throughout the superpixel and therefore yield good statistics of the superpixel. Empirically we found that utilizing only 5% of the entire superpixel can yield an acceptable segmentation accuracy. We use 10% in this experiment.

- Centroid patch (Cen-Patch): Retrieve the pixels in the immediate rectangular vicinity of each superpixel centroid. Start by computing the centroid of each superpixel, then collect the pixels within a rectangular cut-off window centered at the centroid. In this experiment, we use a rectangular patch of size 7x7 pixels.

5.1.3.2 Transformation of information

After the samples are gathered within each superpixel, we can then calculate the group statistics of such samples and construct the feature vectors further used in the segmentation algorithm. We experiment on two types of transformations:

- Median of the feature (med): Calculate the median of the values for each dimension of the samples. The median can be replaced with any sufficient statistic such as the mean and variance, but we found that the median is a more robust estimator in this framework.

- Normalized histogram of k-means membership (norm-histo-kmean): Treat each sample as a feature vector in the feature space and apply k-means clustering to those samples. Assign a label 1 to k to each sample according to which centroid is nearest to the sample. Within each superpixel, create a k-bin histogram counting the number of labels 1 to k, and finally normalize the histogram such that the values in all bins sum to unity. We empirically set the initial number of clustering seeds to k = 20. After the k-means clustering, the number can possibly be reduced as some seeds may not attract any member samples. (A code sketch of this construction is given at the end of this section.)

5.1.4 Experiment

In this experiment, we aim to find the best combination out of the feature construction processes and the features, in order to use it further in the DDT experiments. There are mainly 6 features: gRGB, gsLab, sLuv, all color features concatenated (all-color), MR8, and all features combined together (all-color+MR8). In addition, when combining all possible combinations of the 2 aggregating methods with the 2 transformations, there are 4 ways to construct a feature vector from a superpixel:

- Random sample + Median, denoted by rnd-median
- Random sample + Normalized histogram of k-means membership, denoted by rnd-histo-kmean
- Centroid patch + Median, denoted by cen-patch-median
- Centroid patch + Normalized histogram of k-means membership, denoted by cen-patch-histo-kmean

There are two sets of superpixel images involved in this experiment. Both sets are created by running the UCM and Mori superpixel generators on the training set of BSDS300, which consists of 200 natural color images. The parameters of both generators are set to yield resulting superpixel images, each of which contains approximately 80 superpixels. Therefore, when taking into account all possible feature construction methods and features, the total number of experiments that has to be conducted for each superpixel set is 24.

Gaussian mixture model (GMM) unsupervised segmentation with 7 Gaussian components is applied to each data set. Each superpixel is then represented by one feature vector obtained from one of the 24 experiments. The experiment yielding the best GMM segmentation performance will be selected as the best feature construction method for use in the further experiments. We evaluate the segmentation results using the probabilistic Rand index (PRI) and the boundary displacement error (BDE), whose details are discussed briefly in Chapter 4.

The PRI and BDE evaluated on the UCM superpixel image set are reported in Table 5-1 and Table 5-2, and those on the Mori set in Table 5-3 and Table 5-4. From the tables, some interesting facts can be summarized as follows. When compared within a feature construction method, the top-3-ranking features are all-color, sLuv and gsLab respectively, and the trend can be observed across the multiple construction methods. The MR8 response feature does not perform as well as the color features; however, the combination of MR8 and color significantly improves the performance of MR8 in the construction methods that use the median to represent the information. Within a type of feature, randomly sampling pixels in each superpixel outperforms, on average, using a 7x7-pixel center-patch.
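As a concrete illustration, a sketch of the rnd-histo-kmean construction follows (Python with scikit-learn; the array layouts and names are our assumptions): pixels are sampled per superpixel, clustered globally with k-means, and each superpixel is described by the normalized histogram of its members' cluster labels.

```python
import numpy as np
from sklearn.cluster import KMeans

def rnd_histo_kmean(features, labels, rate=0.10, k=20, seed=0):
    """features: (n_pixels, d) per-pixel feature array; labels: flat
    superpixel id per pixel. Returns {superpixel_id: k-bin histogram}."""
    rng = np.random.default_rng(seed)
    keep = np.zeros(len(labels), dtype=bool)
    for j in np.unique(labels):
        idx = np.flatnonzero(labels == j)
        n = max(1, int(rate * len(idx)))         # sample `rate` of s_j
        keep[rng.choice(idx, size=n, replace=False)] = True
    km = KMeans(n_clusters=k, n_init=10, random_state=seed)
    assign = km.fit_predict(features[keep])       # global k-means labels
    sampled = labels[keep]
    vectors = {}
    for j in np.unique(sampled):
        hist = np.bincount(assign[sampled == j], minlength=k)
        vectors[int(j)] = hist / hist.sum()       # normalize to unity
    return vectors
```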

As expected, the performance of the texture feature with UCM is inferior to that with Mori. If we plot the histogram of the number of pixels per superpixel in a superpixel image generated by UCM and by Mori, we will find that the distribution obtained from UCM is more uniform than that from Mori, but has a relatively high peak at a small number of pixels. Therefore, rnd-median, which randomly samples pixels within a superpixel regardless of how small the superpixel is, does not capture the texture information, which requires a sufficient number of samples.

Overall, the experimental results suggest that the color histogram alone is a better feature than the texture and color histograms together. The best combination in this experiment is the UCM superpixel image used with all-color and 10% rnd-histo-kmean, which will be used from this point on in the dissertation.

5.2 Model Selection using superGMM

Although the DDT framework can learn the model of each potential object in the scene automatically, it requires a pre-defined number of classes/labels indicating how many objects there are in the image scene. Such a problem is known as a model selection problem, which is crucial not only for the DDT framework, but also for other unsupervised image segmentation algorithms proposed in the computer vision community. In this chapter, we propose using the Sampling-Superpixel Gaussian Mixture Model (superGMM) as an efficient way to estimate the number of objects in each input image, which is performed as a pre-processing unit and independently of the DDT model itself. The core of superGMM relies heavily on the GMM, and therefore it is natural to enhance the existing GMM with the Bayesian information criterion (BIC), which is ubiquitously used for model selection problems. We reformulate the BIC cost function to incorporate the scale parameter of image segmentation, whose details are discussed later in this chapter.
Table 5-1. The PRI of each feature extraction algorithm performed on the superpixel image set generated by UCM.

| methods | gRGB | gsLab | sLuv | all-color | MR8 | all-color+MR8 |
| --- | --- | --- | --- | --- | --- | --- |
| rnd-histo-kmean | 0.76488 | 0.76138 | 0.76844 | 0.77265 | 0.69875 | 0.69323 |
| rnd-median | 0.73507 | 0.74822 | 0.75915 | 0.76193 | 0.44875 | 0.73294 |
| cen-patch-histo-kmean | 0.72329 | 0.73657 | 0.73996 | 0.73521 | 0.62331 | 0.62162 |
| cen-patch-median | 0.73079 | 0.74022 | 0.74975 | 0.75871 | 0.58036 | 0.72491 |

Table 5-2. The BDE of each feature extraction algorithm performed on the superpixel image set generated by UCM.

| methods | gRGB | gsLab | sLuv | all-color | MR8 | all-color+MR8 |
| --- | --- | --- | --- | --- | --- | --- |
| rnd-histo-kmean | 11.78410 | 11.3325 | 11.221 | 11.4412 | 12.2791 | 12.5081 |
| rnd-median | 12.0577 | 11.8197 | 11.6566 | 11.4069 | 17.7793 | 12.0375 |
| cen-patch-histo-kmean | 12.4118 | 12.2993 | 11.974 | 11.8591 | 13.5076 | 13.7478 |
| cen-patch-median | 12.1925 | 12.0786 | 11.5857 | 11.5021 | 14.8909 | 12.4038 |

Table 5-3. The PRI of each feature extraction algorithm performed on the superpixel image set generated by Mori.

| methods | gRGB | gsLab | sLuv | all-color | MR8 | all-color+MR8 |
| --- | --- | --- | --- | --- | --- | --- |
| rnd-histo-kmean | 0.73315 | 0.73526 | 0.7403 | 0.74379 | 0.66013 | 0.65982 |
| rnd-median | 0.73827 | 0.74017 | 0.75055 | 0.75062 | 0.60875 | 0.73972 |
| cen-patch-histo-kmean | 0.71523 | 0.71976 | 0.72467 | 0.73141 | 0.63765 | 0.63814 |
| cen-patch-median | 0.72161 | 0.72607 | 0.7357 | 0.73611 | 0.60474 | 0.72558 |

Table 5-4. The BDE of each feature extraction algorithm performed on the superpixel image set generated by Mori.

| methods | gRGB | gsLab | sLuv | all-color | MR8 | all-color+MR8 |
| --- | --- | --- | --- | --- | --- | --- |
| rnd-histo-kmean | 13.7282 | 13.4416 | 13.1881 | 13.0415 | 15.8193 | 15.8722 |
| rnd-median | 13.6721 | 13.6353 | 12.9701 | 13.5077 | 15.8908 | 13.6492 |
| cen-patch-histo-kmean | 14.7149 | 13.9128 | 13.8136 | 13.5350 | 15.4553 | 15.5735 |
| cen-patch-median | 14.2835 | 14.1418 | 13.8471 | 13.8599 | 16.0034 | 14.0653 |

5.2.1 Sampling-Superpixel Gaussian Mixture Model (superGMM)

Generally, the Gaussian mixture model (GMM) is an insightful and versatile graphical model whose nodes are assumed independent in the entire network. In the computer vision community, GMM has been employed for pixel-level unsupervised image segmentation applications, but the results are noisy.
On the other hand, our experiments also demonstrate that a whole superpixel can be used as a feature vector sample in the GMM segmentation framework. However, robustness can be a problem when the number of superpixels is reduced below a certain threshold, depending on the dimensionality of the feature vector.

We propose the sampling-superpixel Gaussian mixture model (superGMM), which can be described as follows. First, we observe that there is redundancy in the image representation; that is, it is not necessary to use every pixel in the image in order to perform segmentation. Instead, we just want to use some portion of the pixels and hope that they will represent the important parts of the image. We also observe that, most of the time, important parts/objects in an image are captured well by a handful of superpixels. Therefore, instead of using every pixel within the input image, we can just draw a portion of them, say 10%, from each superpixel, and then perform segmentation on those sampled pixels. The resulting segmentation is still noisy because it is done at the pixel level. The next step is to provide a smoothing prior to the results by applying a majority-vote scheme within each superpixel, so that every pixel within the superpixel will be assigned the same label.

The algorithm is described formally as follows. A superpixel image $S^{f}_{i}$ is generated from a superpixel generator $\Phi(\mathrm{IMG}_i; f)$, where $\mathrm{IMG}$ denotes an input image and $f$ represents the scale parameter. To avoid notational clutter, the scale parameter $f$ will be omitted. Note that superGMM does not require the multi-scale representation of an image, so it can be used for each scale independently. For each superpixel $s_j$ within a superpixel image $S_i$, we randomly sample pixels $\{p^{j}_{k}\}_{k=1}^{N^{j}_{samp}}$, where the number of pixels to be sampled in superpixel $s_j$ is denoted by $N^{j}_{samp}$, which depends on the image scenario. At the end of the sampling process, we will have a set of pixels from every superpixel in the image, and the experimental results suggest that using only a 5% sampling rate can yield good segmentation results on BSDS500. When the size of the superpixel is big, the number of samples is big too; this can be seen as applying uniform random sampling to the entire image.
Next, we extract a feature vector y_k from each sampled pixel p^j_k and perform unsupervised segmentation using a GMM in the feature space, where the number of segments C is defined by the user. The label variable for each pixel p^j_k is denoted by v^j_k, which takes a discrete value in the set \mathcal{C} = \{1, ..., C\}. The most probable label for each v^j_k can be obtained using the maximum a posteriori (MAP) or maximum likelihood rule. Unless a smoothing prior is provided, the most probable labels of the pixels within a superpixel are usually not homogeneous and hence contribute to a noisy segmentation result. In this case we use the majority-vote scheme within each superpixel as the smoothing prior, and the label x_j for each superpixel s_j is obtained from

x_j = \arg\max_{c \in \mathcal{C}} \sum_{k=1}^{N^j_{samp}} \delta(v^j_k, c)

where \delta is the Kronecker delta: \delta(v, c) = 1 when v = c and \delta(v, c) = 0 otherwise.

The proposed method not only outputs a homogeneous and smooth segmentation, but also provides a good balance in the number of samples used for segmentation, as opposed to having too few samples in a superpixel-level segmentation and too many samples in a pixel-level segmentation. Moreover, using sampled pixels does not restrict us to pixel-level feature vectors; for instance, for each sampled pixel we can construct a feature vector that includes the properties of its neighbors. However, this method relies heavily on superpixel quality: a bad superpixel, such as one containing parts of different objects, can lead to imperfect segmentation. Therefore, it is crucial to ensure that each superpixel encompasses homogeneous information in some sense. Note that when the superpixel is used as the finest unit of segmentation, it is not possible to correct the segmentation below that level.
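The procedure is straightforward to prototype. The following is a minimal Python sketch, not the MATLAB implementation used in our experiments: it assumes a precomputed superpixel label map sp_labels and uses raw color values as a stand-in for the feature vectors of Section 5.1.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def super_gmm(image, sp_labels, C, rate=0.10, seed=0):
    """superGMM sketch: sample pixels per superpixel, fit a GMM,
    then majority-vote the pixel labels within each superpixel."""
    rng = np.random.default_rng(seed)
    feats = image.reshape(-1, image.shape[-1]).astype(float)  # per-pixel features
    flat_sp = sp_labels.ravel()

    # Draw roughly `rate` of the pixels from every superpixel.
    sampled = []
    for s in np.unique(flat_sp):
        idx = np.flatnonzero(flat_sp == s)
        n = max(1, int(rate * idx.size))
        sampled.append(rng.choice(idx, size=n, replace=False))
    sampled = np.concatenate(sampled)

    # Pixel-level GMM segmentation on the sampled features only.
    gmm = GaussianMixture(n_components=C, covariance_type='full',
                          random_state=seed).fit(feats[sampled])
    v = gmm.predict(feats[sampled])  # most probable label per sampled pixel

    # Majority vote: every pixel in superpixel s_j receives label x_j.
    seg = np.empty_like(flat_sp)
    for s in np.unique(flat_sp):
        votes = v[flat_sp[sampled] == s]
        seg[flat_sp == s] = np.bincount(votes, minlength=C).argmax()
    return seg.reshape(sp_labels.shape)
```

Sampling before fitting keeps the cost of the GMM's EM iterations roughly proportional to the sampling rate, which is what makes the repeated runs in Chapter 6 affordable.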
5.2.2 Model Selection Using Modified Bayesian Information Criteria

When fitting models, it is always possible to increase the likelihood of the model given the data by adding more parameters, but doing so generally results in overfitting. The Bayesian information criterion (BIC) [59] is a well-known criterion for model selection among a finite set of models. BIC counteracts overfitting by introducing a penalty term on the number of parameters in the model, which offers a balance between the likelihood and the model complexity. More formally, BIC is formulated as

BIC = -2 \log L_{max} + K \log(N)

where L_{max} is the maximized value of the likelihood function for the estimated model, K is the number of free parameters to be estimated, and N is the number of data samples. The first term on the right-hand side is proportional to the likelihood of the model, and the second term is proportional to the model complexity. In the GMM case, L_{max} can be obtained by evaluating the GMM cost function at the maximum likelihood parameters estimated by the EM algorithm for GMM. Although the EM-evaluated cost function might correspond to a local maximum, multiple initializations can alleviate this problem. The parameter K can be calculated from the number of parameters in the GMM, which is

K(C) = \frac{C}{2}(D^2 + 3D) + C - 1

where D is the dimensionality of the data and C is the pre-defined number of segments/clusters.

We propose a modified version of BIC by adding a scale-dependent parameter \rho to the penalty term; our goal is to calculate the optimal parameter \rho for each image dataset. Essentially, the parameter \rho controls the sensitivity of the segmentation: the number of segments tends to decrease as \rho increases. The modified BIC (mBIC) can be expressed as

mBIC(C; \rho) = -2 \log L_{max}(C) + \rho \, K(C) \log(N)

which is a function of \rho and C. For each image, the optimal number of segments C* given a known scale \rho can be obtained by minimizing mBIC(C; \rho) with respect to C:

C^* = \arg\min_{C} mBIC(C; \rho)

which contains the EM algorithm for GMM as a sub-problem to solve for L_{max}(C).
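As a concrete illustration, the sketch below scores candidate segment counts with mBIC using scikit-learn's GMM. It is a minimal sketch, not the experiment code, and it assumes the feature matrix Y (one row per sampled pixel) has already been extracted:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def mbic_select(Y, rho, C_range=range(3, 12), seed=0):
    """Pick C* = argmin_C mBIC(C; rho) for one image's sampled features Y."""
    N, D = Y.shape
    best_C, best_score, best_gmm = None, np.inf, None
    for C in C_range:
        gmm = GaussianMixture(n_components=C, covariance_type='full',
                              n_init=5, random_state=seed).fit(Y)
        log_Lmax = N * gmm.score(Y)            # score() is mean log-likelihood
        K = 0.5 * C * (D**2 + 3*D) + C - 1     # free parameters of a full-cov GMM
        mbic = -2.0 * log_Lmax + rho * K * np.log(N)
        if mbic < best_score:
            best_C, best_score, best_gmm = C, mbic, gmm
    return best_C, best_gmm
```

With \rho = 1 the quantity reduces to the standard BIC, using the same parameter count K(C) given above.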
The remaining question is how to derive an appropriate \rho. Unsupervised segmentation in general requires a notion of scale to make the output segments meaningful; without one, the output segments can be arbitrary. For example, consider an image containing a tiger: if we are interested in the big picture, we would consider the tiger as one object, but when the scale of interest is very fine, each of the tiger's stripes could possibly be an object. There is no right or wrong answer here, as it depends on what the user is interested in when looking at the picture. In other words, we propose that the resulting segmentation and the number of segments should be defined and presented within a certain scale parameter, in this case \rho.

Interestingly, we can accurately determine what the objects of interest are in most of the photos we normally take. This fact suggests that there is a range of scales humans share in common, and such scales can be estimated by allowing human subjects to segment objects in an image dataset, which is the case for BSDS500. The scale parameter \rho can be estimated from the dataset by regarding a segmentation algorithm as a function that takes \rho as an input and produces a segmentation output with the optimal number of segments. When applied to superGMM, we have

[C^*_i(\rho), X_i(\rho)] = \mathrm{superGMM}(IMG_i; \rho)

where X_i(\rho) is the resulting segmentation indicating which pixel in the image takes which label in \{1, ..., C^*\} given the scale \rho. The segmentation accuracy can be determined by computing the similarity Sim(X_i(\rho), X_i^{gt}) between X_i and X_i^{gt}, the ground truth of image i. Therefore, we can estimate the optimal \rho^* from the training dataset by maximizing the accuracy with respect to \rho over a range A for the whole training set T:

\rho^* = \arg\max_{\rho \in A} \sum_{i \in T} Sim(X_i(\rho), X_i^{gt})
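In practice, \rho^* can be found by a simple grid search over A on the training images, reusing the sketches above. In the sketch below, similarity and majority_vote are hypothetical placeholders for whichever criterion plays the role of Sim (e.g., the PRI) and for the superpixel voting step of Section 5.2.1:

```python
def estimate_rho(train_set, A=(0.5, 1, 2, 4, 7), similarity=None):
    """Grid search for rho*: maximize total Sim(X_i(rho), X_i_gt) over the
    training set. Each item of train_set is assumed to hold the sampled
    features Y, the superpixel label map, and the ground-truth segmentation."""
    totals = {}
    for rho in A:
        total = 0.0
        for Y, sp_labels, gt in train_set:
            C_star, gmm = mbic_select(Y, rho)               # model selection at this scale
            seg = majority_vote(gmm.predict(Y), sp_labels)  # hypothetical helper
            total += similarity(seg, gt)
        totals[rho] = total
    return max(totals, key=totals.get)
```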
One way to estimate \rho^* is to use cross-validation over \rho. In Section 6.1, we estimate \rho for BSDS500.

5.3 Summary

In this chapter, we discussed feature selection methodologies for superpixels and model selection strategies for the DDT framework. In Section 5.1.4, the experimental results suggest that the best combination is the UCM superpixel image employed with all-color features and 10% rnd-hist-kmean, which will be used from this point on in the dissertation. For the model selection strategy, we proposed superGMM, which performs GMM segmentation on the feature vectors extracted from pixels randomly sampled from an input image. The proposed algorithm assumes all pixels are independent but governed by a smoothing prior in the form of a majority-vote scheme applied at the end. We also proposed mBIC for model selection, which can be seamlessly integrated into superGMM. In Chapter 6, superGMM and mBIC are employed to select the scale parameters and the number of segments for each image in BSDS500.
CHAPTER 6
EXPERIMENT AND DISCUSSION

We report experimental results on image segmentation for three sets of images, BSDS_train, BSDS_val and BSDS_test, whose details can be found in Appendix A.1. We first utilize modified BIC and superGMM for model selection, more specifically for determining the number of segments or objects in each image of BSDS_train, the training set. We then use the scale parameters obtained from the model selection step for BSDS_val and BSDS_test, the validation and test sets respectively.

6.1 Determining the Number of Segments in Each Image

We first report experiments on unsupervised image segmentation using meDDT with mBIC+superGMM. The 5-level meDDTs, with 1, 5, 20, 80 and 320 nodes from root to leaf respectively, are generated from two superpixel generators, Mori and UCM, denoted by Mori+meDDT and UCM+meDDT respectively. The scale parameter \rho is varied over the values 1, 4 and 7; a bigger \rho makes superGMM solutions with larger numbers of segments less preferred, so the resulting segmentation tends to contain fewer segments. Given a 5-level meDDT, we first run mBIC+superGMM on each level independently using the approaches described in Sections 5.2.1 and 5.2.2. The result provides segmentations with an estimated number of segments at each level, which are used to initialize the CPTs of the meDDT. To reduce the run-time, we randomly sample only 10% of the pixels in each superpixel, and the number of classes ranges from 3 to 11.
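The experimental loop is easy to misread from prose alone, so here is a schematic driver. It is only a sketch of the procedure just described: sample_level_features and init_cpt_from_gmm are hypothetical helpers, and mbic_select is the sketch from Section 5.2.2.

```python
LEVELS = [1, 5, 20, 80, 320]   # meDDT nodes per level, root to leaf

def model_selection_pass(image_set, rho, rate=0.10):
    """Run mBIC+superGMM independently on every meDDT level of every image."""
    selections = []
    for img in image_set:
        per_level = []
        for n_sp in LEVELS:
            Y = sample_level_features(img, n_sp, rate)        # hypothetical helper
            C_star, gmm = mbic_select(Y, rho, C_range=range(3, 12))
            per_level.append((C_star, gmm))
        # The per-level GMMs initialize the CPTs of the 5-level meDDT.
        selections.append(init_cpt_from_gmm(per_level))       # hypothetical helper
    return selections
```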
In Table 6-1, the boundary detection performances are reported using the criteria described in Section 4.5.2. For both UCM+meDDT and Mori+meDDT, the empirical results suggest \rho = 1 as the optimal scale parameter, because the highest F-measure is achieved at that value of \rho in both ODS and OIS. Furthermore, when all the levels are combined, the area covered by the PR curve (PR-Area) is at its maximum at \rho = 1. The performance is better when \rho is smaller, suggesting that the ground truth prefers more detailed segmentations.

In Table 6-2, the region segmentation performances are reported using the criteria described in Section 4.5.3. For UCM+meDDT, the scale parameter \rho = 1 yields the best results measured by Covering, PRI and VoI (in both ODS and OIS). On the other hand, Mori+meDDT is at its best when \rho = 4, as opposed to 1 for UCM+meDDT. This implies that the boundary and region criteria are correlated, but not equivalent.

The resulting segmentations using UCM+meDDT, illustrated in Fig. 6-1, show noisier segmentation at smaller \rho than at bigger values. When the two algorithms are compared, UCM+meDDT outperforms Mori+meDDT, especially on the boundary detection criteria. We decide to use only one scale parameter in the subsequent experiments, and \rho = 1 is selected as it performs substantially well on both UCM and Mori and on both the boundary and region criteria. Since similar characteristics are shared among the BSDS sets (e.g., BSDS_train, BSDS_val and BSDS_test contain similar objects obtained at similar scales), the optimal scale parameter \rho = 1 is employed when evaluating meDDT on the remaining BSDS sets, namely BSDS_val and BSDS_test.

The mBIC+superGMM used in these experiments sweeps the number of segments for each level from 3 to 11, and each case is repeated 5 times to alleviate the local maxima problem. At scales l = 1 and 2, more than 60% of the images are assigned 7 segments/objects, and only a couple of images are decided to have 11 segments. On the other hand, the estimated numbers of objects in the tied levels l = 3, 4 and 5 are 4 and 5 in most images. One disadvantage of mBIC+superGMM is that an evaluation must be carried out for each candidate number of segments, and it therefore requires a good range for that number. However, with relatively few samples used in the segmentation and the majority-vote scheme, mBIC+superGMM runs sufficiently fast and can be repeated several times in order to reduce the sensitivity caused by initializations. Table 6-6 summarizes the run-time spent by each algorithm.
Table 6-1. Boundary detection results obtained from meDDT when applying scale parameters \rho = 1, 4 and 7 with mBIC+superGMM for model selection. The performances, evaluated on BSDS_train, are measured using recall (R), precision (P), F-measure (F) and the area under the PR-curve (PR-Area).

                         ODS                 OIS
Method                   R     P     F      R     P     F      PR-Area
UCM+meDDT, rho=1         0.71  0.70  0.71   0.73  0.72  0.72   0.52
UCM+meDDT, rho=4         0.66  0.73  0.69   0.68  0.73  0.71   0.48
UCM+meDDT, rho=7         0.63  0.74  0.68   0.66  0.74  0.70   0.46
Mori+meDDT, rho=1        0.63  0.64  0.63   0.67  0.65  0.66   0.45
Mori+meDDT, rho=4        0.59  0.67  0.63   0.64  0.67  0.65   0.42
Mori+meDDT, rho=7        0.57  0.67  0.62   0.60  0.69  0.64   0.39

Table 6-2. Region segmentation results obtained from meDDT when applying scale parameters \rho = 1, 4 and 7 with mBIC+superGMM for model selection. The performances are evaluated on BSDS_train.

                         Covering            PRI           VoI
Method                   ODS   OIS   Best   ODS   OIS     ODS   OIS
UCM+meDDT, rho=1         0.55  0.58  0.64   0.80  0.82    1.90  1.71
UCM+meDDT, rho=4         0.54  0.57  0.62   0.78  0.81    1.90  1.71
UCM+meDDT, rho=7         0.52  0.55  0.59   0.76  0.79    1.95  1.76
Mori+meDDT, rho=1        0.48  0.55  0.61   0.76  0.81    1.92  1.81
Mori+meDDT, rho=4        0.49  0.54  0.59   0.77  0.79    1.95  1.82
Mori+meDDT, rho=7        0.49  0.54  0.58   0.75  0.78    1.95  1.81

6.2 Performances of meDDT and its Variants

In this section, we compare the performance of the DDT models (i.e., meDDT, leDDT, coDDT and meTSBN) and some relevant models (i.e., GMiND and superGMM) on BSDS_val and BSDS_test with the scale parameter \rho set to 1, as obtained from the training set BSDS_train in Section 6.1. Each experiment is repeated 5 times, and the measures are averaged before being displayed in Tables 6-3, 6-4 and 6-5. All the DDT models are 5-level hierarchical trees with 1, 5, 20, 80 and 320 nodes from root to leaf.
Figure 6-1. Segmentation results from using \rho = 1, 4 and 7 on BSDS_train. From left to right: original image, human-made ground truth segmentation, hierarchical contours from \rho = 1, 4 and 7 respectively, and segmentations obtained using the optimal dataset scale (ODS) for \rho = 1, 4 and 7 respectively. All images are from BSDS_train.
Table 6-3. Boundary detection benchmarks on BSDS.

                    BSDS_val               BSDS_test
Method              ODS   OIS   PR-Area   ODS   OIS   PR-Area
Human               0.79  0.79  -         0.80  0.80  -
UCM+coDDT           0.70  0.70  0.51      0.71  0.73  0.53
UCM+GMiND           0.69  0.70  0.51      0.71  0.73  0.53
UCM+superGMM        0.69  0.71  0.50      0.71  0.73  0.53
UCM+meDDT           0.69  0.69  0.50      0.71  0.73  0.53
UCM+leDDT           0.67  0.68  0.46      0.69  0.70  0.51
baseline UCM        0.70  0.72  0.41      0.72  0.74  0.42
Mori+coDDT          0.62  0.64  0.44      0.64  0.67  0.47
Mori+GMiND          0.62  0.64  0.44      0.64  0.67  0.47
Mori+superGMM       0.62  0.63  0.44      0.64  0.66  0.46
Mori+meDDT          0.61  0.63  0.43      0.64  0.66  0.46
Mori+leDDT          0.59  0.62  0.41      0.62  0.64  0.44
baseline Mori       0.58  0.61  0.36      0.60  0.62  0.38
meTSBN              0.38  0.38  0.13      0.39  0.39  0.14
TSBN                0.36  0.38  0.11      0.37  0.39  0.12
gPb-owt-UCM         0.71  0.74  -         0.73  0.76  -
CM-MeanShift        0.63  0.66  -         0.64  0.68  -
NCuts               0.62  0.66  -         0.64  0.68  -
Canny-owt-UCM       0.58  0.63  -         0.60  0.64  -
Felz-Hutt           0.58  0.62  -         0.61  0.64  -
SWA                 0.56  0.59  -         -     -     -

6.2.1 Evidence in Multiple Scales

We first report our results comparing the tree-structured belief network (TSBN) [34] against our multi-scale evidence TSBN (meTSBN), a modified version of TSBN with observations in every level of the tree. The empirical results show that meTSBN's performance is superior to that of the original TSBN in both boundary detection and region segmentation, as illustrated in Tables 6-3, 6-4 and 6-5. The resulting segmentation of meTSBN is shown in the last column of Fig. 6-3. Although meTSBN performs poorly on the boundary detection criteria, it does relatively well in region segmentation.
Table 6-4. Region segmentation benchmarks on BSDS_val.

                    Covering            PRI           VoI
Method              ODS   OIS   Best   ODS   OIS     ODS   OIS
Human               0.73  0.73  -      0.87  0.87    1.16  1.16
UCM+superGMM        0.54  0.57  0.62   0.77  0.80    1.83  1.69
UCM+meDDT           0.50  0.55  0.61   0.77  0.80    1.91  1.77
UCM+coDDT           0.50  0.55  0.61   0.76  0.80    1.90  1.77
UCM+GMiND           0.50  0.55  0.61   0.76  0.79    1.90  1.77
UCM+leDDT           0.48  0.54  0.60   0.75  0.79    1.95  1.80
baseline UCM        0.58  0.62  0.70   0.79  0.84    1.76  1.58
Mori+superGMM       0.48  0.54  0.59   0.75  0.78    1.90  1.79
Mori+coDDT          0.46  0.53  0.59   0.75  0.79    1.94  1.82
Mori+GMiND          0.46  0.53  0.59   0.75  0.79    1.95  1.84
Mori+meDDT          0.46  0.53  0.59   0.75  0.79    1.98  1.85
Mori+leDDT          0.45  0.52  0.57   0.73  0.77    2.01  1.87
baseline Mori       0.50  0.53  0.61   0.73  0.80    1.84  1.78
meTSBN              0.42  0.47  0.53   0.73  0.76    2.34  2.15
TSBN                0.40  0.46  0.51   0.71  0.74    2.38  2.20
gPb-owt-UCM         0.59  0.65  0.75   0.81  0.85    1.65  1.47
CM-MeanShift        0.54  0.58  0.66   0.78  0.80    1.83  1.63
Canny-owt-UCM       0.48  0.56  0.66   0.77  0.82    2.11  1.81
Felz-Hutt           0.51  0.58  0.68   0.77  0.82    2.15  1.79
SWA                 0.47  0.55  0.66   0.75  0.80    2.06  1.75
NCuts               0.44  0.53  0.66   0.75  0.79    2.18  1.84

That is because the boundary detection criteria depend on how precisely the output contours match the ground truth: even a deviation of a few pixels can make the F-measure poor. In this experiment, the threshold for the deviation is 3 pixels, a value broadly adopted in the computer vision community. On the other hand, the region segmentation criteria focus on the areas, that is, the numbers of pixels, shared between two corresponding segments, which is less sensitive to boundary deviations and hence gives appropriate values to meTSBN.

Next we compare the performance of two of our proposed methods, meDDT and leDDT. The characteristics of the model with and without evidence/observations in multiple scales can be illustrated by comparing meDDT and leDDT.
Table 6-5. Region segmentation benchmarks on BSDS_test.

                    Covering            PRI           VoI
Method              ODS   OIS   Best   ODS   OIS     ODS   OIS
Human               0.72  0.72  -      0.88  0.88    1.17  1.17
UCM+superGMM        0.54  0.57  0.63   0.80  0.82    1.85  1.72
UCM+coDDT           0.53  0.57  0.62   0.80  0.82    2.01  1.81
UCM+GMiND           0.52  0.56  0.61   0.80  0.82    2.01  1.81
UCM+meDDT           0.52  0.55  0.61   0.80  0.82    2.02  1.82
UCM+leDDT           0.51  0.53  0.59   0.78  0.80    2.05  1.85
baseline UCM        0.59  0.62  0.70   0.82  0.85    1.76  1.60
Mori+superGMM       0.48  0.53  0.59   0.77  0.79    1.99  1.86
Mori+coDDT          0.47  0.52  0.58   0.77  0.79    2.05  1.94
Mori+GMiND          0.46  0.52  0.58   0.77  0.79    2.05  1.93
Mori+meDDT          0.46  0.52  0.58   0.77  0.79    2.07  1.93
Mori+leDDT          0.44  0.50  0.56   0.75  0.78    2.09  1.94
baseline Mori       0.48  0.51  0.60   0.75  0.79    1.96  1.87
meTSBN              0.42  0.46  0.52   0.75  0.77    2.45  2.23
TSBN                0.40  0.44  0.49   0.73  0.71    2.48  2.27
gPb-owt-UCM         0.59  0.65  0.74   0.83  0.86    1.69  1.48
Felz-Hutt           0.52  0.57  0.69   0.80  0.82    2.21  1.87
CM-MeanShift        0.54  0.58  0.66   0.79  0.81    1.85  1.64
Canny-owt-UCM       0.49  0.55  0.66   0.79  0.83    2.19  1.89
NCuts               0.45  0.53  0.67   0.78  0.80    2.23  1.89

Table 6-6. Run-time of each method.

Method        run-time per image
GMiND         34 sec / 10 rep
TSBN          15 iterations, 34 sec
meTSBN        15 iterations, 38 sec
superGMM      3-11 classes, 47 sec / 5 rep
leDDT         15 iterations, 52 sec
meDDT         15 iterations, 58 sec
coDDT         15 iterations, 20 min
UCM           > 1000 levels, 7.3 min
Mori          5-level, 8.1 min
meDDT outperforms leDDT for both Mori and UCM, on both criteria and on both datasets, which demonstrates the importance of having observations in multiple scales of the tree. The results are shown in the first and second parts of Tables 6-3, 6-4 and 6-5. When the sum-product algorithm is performed, the messages encoding the observations sent from the leaves of leDDT are attenuated as they go up the tree. The higher up the tree they travel, the less correlated the messages become with the observations at the bottom, and the more they are affected by \theta_l, the CPT at level l. Since the CPT itself depends heavily on the messages, the segmentation accuracy degrades at the higher levels of the tree. Consequently, meDDT alleviates this shortcoming by providing observations in every scale, serving as evidence boosters that reinforce the messages. Additionally, having observations in multiple scales allows more freedom to incorporate different feature vectors in each level of the tree, which is desirable in many applications such as multi-spectral or multi-scale remote sensing image analysis.

6.2.2 Combining Pixel Level Features

The motivation behind coDDT is to incorporate features retrieved from both superpixels and pixels to achieve a more complete model of the image. In the first part of Table 6-3, UCM+coDDT slightly outperforms UCM+meDDT on both BSDS_val and BSDS_test. The same trend applies to Mori+coDDT and Mori+meDDT. However, UCM+coDDT is slightly outperformed by UCM+meDDT when compared on the region segmentation criteria, whereas, in contrast, Mori+coDDT outperforms Mori+meDDT. The experimental evidence suggests that, overall, coDDT yields slightly superior results to meDDT. However, coDDT's run-time, as shown in Table 6-6, is significantly inferior to that of meDDT, which takes less than a minute per input image. From a practical perspective, therefore, the coDDT implementation still needs significant improvement. Despite the fact that the complexity of the sum-product algorithm is linear in the number of nodes in the network, the implementation in MATLAB can be slow due to the for-loops.
Inspired by coDDT, we consider a model where every node in the network is connected to pixel-level nodes sampled from the corresponding superpixels. The new model incorporates pixel-level nodes in every scale, as opposed to coDDT, in which the pixel-level nodes are incorporated at the level l = 1 only. However, the new model would suffer from its node count, which is 5 times that of coDDT and would cause implementation issues similar to coDDT's. Therefore, we approximate the new model by discarding all the connections except the class-observation distributions, granting independence to all the nodes in the network. We do, however, enforce that interaction can occur among nodes within the same level and that the parameters in the same level are tied together. At the end, the labels of the sampled nodes are propagated to the rest using the majority-vote scheme. The approximate model is exactly superGMM, as explained in Section 5.2.1.

The boundary detection performance in Table 6-3 shows that coDDT achieves better accuracy than superGMM for both UCM and Mori on both datasets. The characteristics are opposite when compared using the region segmentation criteria, as shown in Tables 6-4 and 6-5, in which case superGMM outperforms coDDT for both UCM and Mori. In general, the empirical results imply that superGMM and coDDT perform competitively, but the former has a run-time advantage, as shown in Table 6-6.

6.2.3 Independence Assumption at the Superpixel Level

In this experiment we compare meDDT against Gaussian mixture models for independent superpixels (GMiND), which is essentially meDDT with the entire tree-structured prior removed. Visually, GMiND segmentation results seem noisier than those of meDDT, as shown in Fig. 6-3; however, that can occasionally be an advantage for images containing a lot of detail. For example, in the last row of Fig. 6-3, the segmentation result of UCM+GMiND shows the sock's black strap, which appears in the ground truth segmentation, whereas UCM+meDDT's result does not. Quantitatively, on the BSDS dataset, GMiND is superior to meDDT in both the boundary detection and region segmentation benchmarks, because BSDS's human-made ground truth seems to prefer multiscale contours with significant details.
Table 6-7. Summary of performances of the proposed models on boundary detection criteria. The expression A > B indicates A outperforms B.

Dataset          Performance
UCM+BSDS_val     coDDT > GMiND > superGMM > meDDT > leDDT
UCM+BSDS_test    coDDT > GMiND > superGMM > meDDT > leDDT
Mori+BSDS_val    coDDT > GMiND > superGMM > meDDT > leDDT
Mori+BSDS_test   coDDT > GMiND > superGMM > meDDT > leDDT

Table 6-8. Summary of performances of the proposed models on region criteria. The expression A > B indicates A outperforms B.

Dataset          Performance
UCM+BSDS_val     superGMM > meDDT > coDDT > GMiND
UCM+BSDS_test    superGMM > coDDT > GMiND > meDDT
Mori+BSDS_val    superGMM > coDDT > GMiND > meDDT
Mori+BSDS_test   superGMM > coDDT > GMiND > meDDT

Additionally, the results from superGMM are observed to contain even more detail than those of GMiND and meDDT. Interestingly, superGMM also outperforms both methods mentioned earlier. The multiscale contours and the ODS segmentations produced by meDDT, GMiND, superGMM and meTSBN are illustrated in Figs. 6-2 and 6-3 respectively. The performances of the proposed methods are summarized in Tables 6-7 and 6-8.

6.3 Comparison with Existing Approaches and the State of the Art

In this section, our proposed algorithms are compared against 5 publicly available contour detection algorithms: CM-MeanShift [73], NCuts [74], Felzenszwalb and Huttenlocher (FH) [10], and two versions of ultrametric contour mapping (UCM), Canny-owt-UCM and gPb-owt-UCM [13], the latter being the state of the art. The results are listed in the last parts of Tables 6-3, 6-4 and 6-5. In terms of boundary detection performance, our proposed algorithms outperform most of the prior art except gPb-owt-UCM, the state of the art at the time.
When compared using the region segmentation criteria, UCM+superGMM, our best candidate, outperforms Canny-owt-UCM, FH, SWA and NCuts, performs competitively with CM-MeanShift, but is outperformed by gPb-owt-UCM. The same trends can be found in both BSDS_val and BSDS_test.

Our proposed methods rely on the 5-level multiscale superpixel image sets obtained from UCM and Mori, denoted baseline UCM and baseline Mori respectively. In this experiment, we are interested in the improvement of each proposed method over the baseline methods as measured by the total area under the PR-curve (PR-Area), which takes into account every level of the multiscale segmentations. The results in Table 6-3 demonstrate significant improvements in PR-Area for both UCM and Mori.

The results support our hypothesis that the tree-structured prior induces smooth segmentation results, but the ground truth segmentation made by humans can contain a lot of detail, for which the smoothing prior can be a disadvantage.

6.3.1 Comparison with a Restriction on the Number of Contour Pixels

In this section, we evaluate the quality of the optimal image scale (OIS) segmentation by an alternative method that tries to keep the numbers of contour pixels in the two compared segmentations equal. In this experiment we first search for the OIS segmentation of UCM; we then keep accumulating meDDT levels until the OIS segmentation of meDDT is as close to that of UCM as possible. Since meDDT has discrete scales, the number of contour pixels of each OIS segmentation is also discrete. This prohibits us from comparing the segmentations (from UCM and meDDT) with exactly equal numbers of contour pixels. Finding the OIS segmentation by accumulating scales can be labor-intensive, so we select only a portion of interesting images from the database. The segmentation results are illustrated in Fig. 6-4 and the comparison is shown in Table 6-9. The results show that when the number of contour pixels is restricted, both methods perform competitively, but the meDDT segmentation results tend to use fewer contour pixels than those of UCM. This supports our hypothesis that the tree-structured prior of meDDT smooths the segmentation results of UCM.
Figure 6-2. The multi-scale contour results on BSDS_test. From left to right: original image, human-made ground truth segmentation, baseline UCM, baseline Mori, UCM+meDDT, Mori+meDDT, UCM+GMiND and UCM+superGMM respectively.
Figure 6-3. The optimal dataset scale segmentation results on BSDS_test. From left to right: original image, UCM+meDDT, Mori+meDDT, UCM+GMiND, Mori+GMiND, UCM+superGMM, Mori+superGMM and meTSBN respectively.
Table 6-9. Comparison between the OIS segmentations of meDDT and UCM with the number of contour pixels restricted to be as close as possible. The first and second rows give the F-measure of meDDT and UCM, respectively, for the single optimal image scale (OIS) segmentation. The 3rd, 4th and 5th rows give the number of contour pixels for each method and the difference in percent. A minus sign indicates that the meDDT segmentation contains fewer contour pixels than that of UCM.

                         image #1   image #2   image #3   image #4
OIS-meDDT                F=0.83     F=0.78     F=0.76     F=0.60
OIS-UCM                  F=0.84     F=0.76     F=0.78     F=0.54
#meDDT                   3782       3265       3548       1339
#UCM                     4596       3142       5036       1754
(#meDDT - #UCM)/#UCM     -17%       4%         -29%       -23%

It is interesting that the segmentation characteristics in this experiment are similar to the ODS segmentations in the previous experiments, which can also be explained by the characteristic curve in Fig. 4-4, where the number of contour pixels and the number of superpixels in the same image show a strong correlation.
Figure 6-4. Comparison between the OIS segmentations of UCM and meDDT with the number of contour pixels restricted to be as close as possible. From left to right: image #1 to image #4 respectively. From the first to the last row: the original image, the meDDT OIS segmentation and the UCM OIS segmentation, respectively. The corresponding numerical comparison is shown in Table 6-9.
CHAPTER 7
CONCLUSIONS AND FUTURE WORK

7.1 Summary of Contributions

In this dissertation, we have addressed unsupervised segmentation of natural images, a fundamental yet challenging problem in the computer vision community. We proposed an unsupervised image segmentation algorithm using data-driven tree-structured Bayesian networks (DDTs), whose tree structure is constructed according to object information in the input image. The experimental results show improvement in both segmentation quality and computational cost compared to previous methods of the same class. As opposed to pixels, superpixels, groups of pixels having similar properties, are used as the smallest units of segmentation in order to reduce the computational cost of inference. We emphasized the importance of having multiple scales of segmentation in the unsupervised image segmentation framework by assuming that the more scales on which a segment appears, the more likely the segment is to be a semantic object. The segmentation problem in this framework is cast as inference on the discrete label nodes given the observed feature vectors extracted from the input image. The whole tree-structured probabilistic model can be represented as a factor graph, and hence the inference can be carried out using the sum-product algorithm. Consequently, the segmentations at all scales can be computed concurrently, facilitated by the factor graph, and accumulated at the end of the process, resulting in a multiscale contour of the objects in the image. We report the results as two complementary problems: 1) multiscale contour detection, and 2) single optimal-scale image region segmentation. Finally, the evaluation results demonstrate that the proposed algorithm improves on the previous work and performs competitively with the state of the art.

In Chapter 2, we proposed an architecture within the DDT framework where evidence is presented in all the levels of the tree, called the multiscale-evidence DDT (meDDT).
The probabilistic model capturing the label nodes and the evidence nodes was formulated, along with the efficient inference algorithm and the parameter estimation scheme based on the EM algorithm.

In Chapter 3, we additionally proposed two DDT architectures, referred to as coDDT and leDDT. We demonstrated that for each structure the inference algorithm and the parameter estimation scheme can be obtained via a similar methodology. We also showed that tree-structured belief networks (TSBNs) can be regarded as a special case of DDTs in which the structure is a quadtree. Furthermore, we proposed the multiscale-evidence TSBN (meTSBN), an enhanced version of TSBN, which is shown empirically to outperform the original.

We made the connection between the probabilistic DDT models and image segmentation in Chapter 4. We proposed an approach to analyze superpixel quality quantitatively, which can be used to analyze any superpixel generator in general. More specifically, in this work we focused our attention on two superpixel generators, namely UCM and Mori. The maximum-overlap algorithm was proposed as an approach to construct a DDT structure. The image dataset and the performance evaluation criteria were introduced.

In Chapter 5, the feature selection methodologies were discussed. We first introduced color and texture features, then explored several approaches to constructing a feature vector from the proposed features. Experiments were conducted to rank the feature vector construction methodologies, and the best approach was picked for use in the subsequent experiments in the dissertation. Moreover, we discussed model selection, more specifically, determining the number of segments/objects in each input image. The modified Bayesian information criterion (mBIC), a modified version of BIC, was proposed for the model selection task, as mBIC incorporates a scale parameter which can be adjusted to fit the image dataset. We also proposed superGMM, an approximation of coDDT with pixel-level nodes incorporated in every level of the tree, which is used as the main algorithm to determine the number of segments within an image.
In Chapter 6, we reported the experimental results of the proposed algorithms. We first used mBIC+superGMM to select the number of segments for meDDT by varying the scale parameter \rho, similar to a cross-validation approach. The optimal scale for the BSDS dataset occurs at \rho = 1, which is used in the subsequent experiments. The proposed algorithms were compared against one another and against existing approaches including UCM, the state-of-the-art method.

The results support our hypothesis that the tree-structured prior facilitates smoothness in the final segmentation results. However, we also discovered that the human-made ground truth segmentation can contain a lot of detail, a scenario which puts the smoothness prior at a disadvantage. In the end, superGMM, the best algorithm among our proposed methods, demonstrates a significant improvement over its baseline superpixel image, and performs competitively with the state of the art.

Ultimately, the generative model framework we have proposed allows us to obtain an unsupervised image segmentation algorithm that produces quality multiscale contour and image region segmentations in a computationally efficient manner. Furthermore, the framework represents the hierarchical segmentation in an intuitive manner, which potentially establishes a key factor for improving the detection of unknown objects not available in the library, a capability necessary for object-scene understanding.

7.2 Opportunities for Future Work

The analysis in Chapter 6 suggests several opportunities for future work.

In the DDT framework, the number of object components plays an important role in the segmentation, yet it was not a major concern in this work. The Bayesian information criterion (BIC) was simply applied for this purpose and gives satisfactory results; however, it still shows signs of over-segmenting the data. It would be interesting to come up with a model selection method that takes into account the semantic meaning of object classes.
Feature selection is a key factor for any image segmentation algorithm, including our proposed method. It is true that our focus is on the image segmentation model alone, but our study of the image database made us realize that the performance could be improved greatly if feature selection were brought into focus. More importantly, since the image unit is the superpixel, which carries richer information than a pixel alone, we speculate that new ways of extracting features from superpixels would greatly impact the segmentation performance.

The segmentation results at different levels could carry different weights depending on the scale of interest. In our work, the segmentations at all scales are assumed to share equal weight, and the optimal scale is the single-scale segmentation decided to be the best; in reality, the final segmentation might be derived by weighting each level differently.

Another critically important issue is the question of what unsupervised image segmentation at one scale really means, and how segments form a semantically meaningful object, because not all segments are meaningful. A more sophisticated interpretation of an object should be studied and defined. Empirically, we envision that scale is necessary when an object is defined; without the notion of scale, the notion of an object can be ambiguous.
APPENDIX A
IMAGE DATASET

In this appendix, we provide more details of the image database used in our experiments.

A.1 The Berkeley Segmentation Dataset and Benchmark 500 (BSDS500)

The Berkeley Segmentation Dataset and Benchmark 500 (BSDS500) [65] is a publicly available image dataset built specifically for the contour detection problem, and it is commonly used to evaluate image segmentation algorithms as well. The dataset comprises 500 natural images in total: 200 for the training set, 100 for the evaluation set and 200 for the test set, each of size 321 x 481 pixels. BSDS500 covers a variety of natural scene categories, for instance animals, landscapes, beaches, buildings, humans and portraits. The dataset can be accessed via the URL below:

http://www.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/segbench/

At the website, both the images and the supplemental MATLAB code are available for testing. In our opinion, this dataset is very suitable for contour detection, as the contours of the objects in each image are created by 5 human subjects on average. However, we found that some contours do not correspond to any semantically meaningful object, so the dataset might not be the best one for unsupervised image segmentation aiming for semantic objects. Nevertheless, this dataset has been used for such a purpose by several researchers [10, 13, 63-65].
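For readers reproducing the experiments in Python rather than MATLAB, the following is a minimal sketch of reading one image and its human segmentations, assuming the standard BSDS500 directory layout with JPEG images and MATLAB ground-truth files; the paths in the usage comment are placeholders for a local copy of the dataset.

```python
import numpy as np
from PIL import Image
from scipy.io import loadmat

def load_bsds_pair(img_path, gt_path):
    """Read a BSDS500 image and its list of human segmentations.
    Each .mat ground-truth file stores a cell array 'groundTruth' with one
    struct per human subject, holding 'Segmentation' and 'Boundaries'."""
    image = np.asarray(Image.open(img_path))
    gt = loadmat(gt_path)['groundTruth'].ravel()
    segs = [g['Segmentation'][0, 0] for g in gt]   # integer label maps
    bnds = [g['Boundaries'][0, 0] for g in gt]     # binary boundary maps
    return image, segs, bnds

# Example usage (hypothetical local paths):
# image, segs, bnds = load_bsds_pair('BSDS500/data/images/train/2092.jpg',
#                                    'BSDS500/data/groundTruth/train/2092.mat')
```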
APPENDIX B
DERIVATION OF MEDDT

In Section 2.1, the joint distribution of meDDT was formulated as

p(X, Y \mid Z) = \prod_{l=1}^{L} \prod_{e \in E_l} \prod_{i \in H_l} \prod_{c \in C_l} \mathcal{N}(y_e \mid \mu_c, \Lambda_c^{-1})^{x_{ic} z_{ei}} \; \prod_{l=1}^{L} \prod_{j \in H_l} \prod_{i \in \{0, H_{l+1}\}} \prod_{v \in C_l} \prod_{u \in C_{l+1}} \theta_{lvu}^{x_{jv} x_{iu} z_{ji}}

and hence the log-likelihood can be written as

\log p(X, Y \mid Z) = \sum_{l=1}^{L} \sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \sum_{c \in C_l} x_{ic} \log \mathcal{N}(y_e \mid \mu_c, \Lambda_c^{-1}) + \sum_{l=1}^{L} \sum_{j \in H_l} \sum_{i \in \{0, H_{l+1}\}} z_{ji} \sum_{v \in C_l} \sum_{u \in C_{l+1}} x_{jv} x_{iu} \log \theta_{lvu}

Since the log-likelihood contains both the observed variables Y and the hidden variables X, the maximum likelihood parameter estimate can be obtained using the EM algorithm, which alternates two procedures: the Expectation step (E-step) and the Maximization step (M-step). The details are described as follows.

B.1 E-step

In the E-step, we take the expectation of the complete log-likelihood with respect to the posterior distribution of the hidden variables given the observations, which is written as

F(\Theta; \Theta^{t-1}) = \langle \log p(X, Y \mid Z) \rangle_{p(X \mid Y, Z, \Theta^{t-1})}
= \sum_{l=1}^{L} \sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \sum_{c \in C_l} \langle x_{ic} \rangle \left( \tfrac{1}{2} \log |\Lambda_c| - \tfrac{D}{2} \log 2\pi - \tfrac{1}{2}(y_e - \mu_c)^\top \Lambda_c (y_e - \mu_c) \right)
+ \sum_{l=1}^{L} \sum_{j \in H_l} \sum_{i \in \{0, H_{l+1}\}} z_{ji} \sum_{v \in C_l} \sum_{u \in C_{l+1}} \langle x_{jv} x_{iu} \rangle \log \theta_{lvu}

where \langle \cdot \rangle denotes the expectation with respect to p(X \mid Y, Z, \Theta^{t-1}).
The expectation terms can be calculated from

\langle x_{ic} \rangle = \sum_X x_{ic} \, p(X \mid Y, Z, \Theta^{t-1}) = \sum_{x_i} x_{ic} \, p(x_i \mid Y, Z, \Theta^{t-1}) = p(x_{ic} = 1 \mid Y, Z, \Theta^{t-1})

and, similarly,

\langle x_{jv} x_{iu} \rangle = p(x_{jv} = 1, x_{iu} = 1 \mid Y, Z, \Theta^{t-1})

Both expectation terms can be efficiently calculated by the sum-product algorithm.

B.2 M-step

In the M-step, we maximize the objective function F obtained in the E-step with respect to each member of the parameter set \Theta, which can be derived as follows.

B.2.1 Maximize F w.r.t. \mu_c

In this section we derive the update equation for the parameter \mu_c, where c \in C_l:

\frac{\partial F(\Theta; \Theta^{t-1})}{\partial \mu_c} = \sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle \, \Lambda_c (y_e - \mu_c)

Equating the derivative to 0, we have

\mu_c = \frac{\sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle \, y_e}{\sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle}

B.2.2 Maximize F w.r.t. \Lambda_c

In this section we derive the update equation for the parameter \Lambda_c, where c \in C_l:

\frac{\partial F(\Theta; \Theta^{t-1})}{\partial \Lambda_c} = \sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle \, \tfrac{1}{2} \left( \Lambda_c^{-1} - (y_e - \mu_c)(y_e - \mu_c)^\top \right)

Equating the derivative to 0, we have

\Lambda_c^{-1} = \frac{\sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle \, (y_e - \mu_c)(y_e - \mu_c)^\top}{\sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle}
B.2.3 Maximize F w.r.t. \theta

Here we learn each \theta_{lvu} subject to the constraint \sum_v \theta_{lvu} = 1, so the Lagrangian term -\lambda(\sum_v \theta_{lvu} - 1) is incorporated into the objective function:

\frac{\partial F(\Theta; \Theta^{t-1})}{\partial \theta_{lvu}} = \sum_{j \in H_l} \sum_{i \in \{0, H_{l+1}\}} z_{ji} \langle x_{jv} x_{iu} \rangle \frac{1}{\theta_{lvu}} - \lambda

Equating the derivative to 0, we have

\theta_{lvu} = \frac{1}{\lambda} \sum_{j \in H_l} \sum_{i \in \{0, H_{l+1}\}} z_{ji} \langle x_{jv} x_{iu} \rangle

where \lambda = \sum_v \sum_{j \in H_l} \sum_{i \in \{0, H_{l+1}\}} z_{ji} \langle x_{jv} x_{iu} \rangle. In fact, it is more convenient to write

\theta_{lvu} = \frac{\hat{\theta}_{lvu}}{\sum_{v'} \hat{\theta}_{lv'u}}

where \hat{\theta}_{lvu} = \sum_{j \in H_l} \sum_{i \in \{0, H_{l+1}\}} z_{ji} \langle x_{jv} x_{iu} \rangle denotes the unnormalized class CPT.

B.3 Parameter Estimation in the Case of a Diagonal Covariance Matrix

On some occasions we might want to impose a diagonal covariance assumption, which is more stable than the full-covariance case when the number of samples is small. In that case, the objective function F(\Theta; \Theta^{t-1}) takes the same form as in the E-step above, but the diagonal precision matrix is of the form \Lambda_c = \mathrm{diag}(\lambda_{c,1}, ..., \lambda_{c,D}); hence \log|\Lambda_c| = \sum_{d=1}^{D} \log \lambda_{c,d} and (y_e - \mu_c)^\top \Lambda_c (y_e - \mu_c) = \sum_{d=1}^{D} \lambda_{c,d}(y_{e,d} - \mu_{c,d})^2. All the update equations remain the same as in the general case except the one for the precision matrix, which is derived as follows, learning the parameter \Lambda_c for c \in C_l:
\frac{\partial F(\Theta; \Theta^{t-1})}{\partial \lambda_{c,d}} = \sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle \, \tfrac{1}{2} \left( \lambda_{c,d}^{-1} - (y_{e,d} - \mu_{c,d})^2 \right)

Equating the derivative to 0, we have

\lambda_{c,d}^{-1} = \frac{\sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle \, (y_{e,d} - \mu_{c,d})^2}{\sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle}

which can be written in matrix form as

\Lambda_c^{-1} = \mathrm{diag}\!\left( \frac{\sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle \, (y_e - \mu_c)^2}{\sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle} \right)

where (y_e - \mu_c)^2 is the element-wise square.

B.4 Final Update Equations for meDDT

The update equations are summarized as follows:

1. Insert the evidence y_e at all scales.
2. Run a GMM for the evidence in each scale separately.
   (a) The parameters \mu_c, \Lambda_c can be initialized from the GMM parameters.
   (b) The initial value of the marginal posterior of each x_j can be calculated from the GMM classifier.
3. \theta_{lvu} can be calculated from

\theta_{lvu} = \frac{\hat{\theta}_{lvu}}{\sum_{v'} \hat{\theta}_{lv'u}}

4. The parameters \mu_c, \Lambda_c can be calculated from

\mu_c = \frac{\sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle \, y_e}{\sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle}

and

\Lambda_c^{-1} = \frac{\sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle \, (y_e - \mu_c)(y_e - \mu_c)^\top}{\sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle}
or

\Lambda_c^{-1} = \mathrm{diag}\!\left( \frac{\sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle \, (y_e - \mu_c)^2}{\sum_{e \in E_l} \sum_{i \in H_l} z_{ei} \langle x_{ic} \rangle} \right)

for the full-covariance matrix and the diagonal covariance matrix respectively.

5. Run the sum-product algorithm to calculate the marginal posterior of each x_j.
6. Repeat steps 3, 4 and 5 until the objective function converges.
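To make the summary concrete, here is a minimal numpy sketch of the M-step (steps 3 and 4) for a single level. It assumes the posteriors \langle x_{ic} \rangle (array q, one row per hidden node) and the pairwise posteriors \langle x_{jv} x_{iu} \rangle (array q_pair, one slice per child node and its parent) have already been produced by the sum-product pass, and that z maps each evidence node to its hidden node:

```python
import numpy as np

def m_step_level(Y, q, z, q_pair):
    """One M-step for a single meDDT level.
    Y: (E, D) evidence vectors; q: (H, C) posteriors <x_ic>;
    z: (E,) index of the hidden node attached to each evidence node;
    q_pair: (H, C, C_parent) pairwise posteriors <x_jv x_iu>."""
    w = q[z]                                   # (E, C): responsibility of class c for y_e
    Nc = w.sum(axis=0)                         # effective counts per class
    mu = (w.T @ Y) / Nc[:, None]               # weighted-mean update for mu_c
    cov = np.empty((q.shape[1], Y.shape[1], Y.shape[1]))
    for c in range(q.shape[1]):                # Lambda_c^{-1}: weighted covariance
        d = Y - mu[c]
        cov[c] = (w[:, c, None] * d).T @ d / Nc[c]
    theta_hat = q_pair.sum(axis=0)             # unnormalized CPT, summed over children j
    theta = theta_hat / theta_hat.sum(axis=0, keepdims=True)  # normalize over v
    return mu, cov, theta
```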
APPENDIX C
DERIVATION OF LEDDT

In Section 3.1, the joint distribution of leDDT was formulated as

p(X, Y \mid Z) = \prod_{e \in E_1} \prod_{i \in H_1} \prod_{c \in C_1} \mathcal{N}(y_e \mid \mu_c, \Lambda_c^{-1})^{x_{ic} z_{ei}} \; \prod_{l=1}^{L} \prod_{j \in H_l} \prod_{i \in \{0, H_{l+1}\}} \prod_{v \in C_l} \prod_{u \in C_{l+1}} \theta_{lvu}^{x_{jv} x_{iu} z_{ji}}

and hence the log-likelihood can be written as

\log p(X, Y \mid Z) = \sum_{e \in E_1} \sum_{i \in H_1} z_{ei} \sum_{c \in C_1} x_{ic} \log \mathcal{N}(y_e \mid \mu_c, \Lambda_c^{-1}) + \sum_{l=1}^{L} \sum_{j \in H_l} \sum_{i \in \{0, H_{l+1}\}} z_{ji} \sum_{v \in C_l} \sum_{u \in C_{l+1}} x_{jv} x_{iu} \log \theta_{lvu}

Since the log-likelihood contains both the observed variables Y and the hidden variables X, the maximum likelihood parameter estimate can be obtained using the EM algorithm, which alternates the Expectation step (E-step) and the Maximization step (M-step). The details are described as follows.

C.1 E-step

In the E-step, we take the expectation of the complete log-likelihood with respect to the posterior distribution of the hidden variables given the observations:

F(\Theta; \Theta^{t-1}) = \langle \log p(X, Y \mid Z) \rangle_{p(X \mid Y, Z, \Theta^{t-1})}
= \sum_{e \in E_1} \sum_{i \in H_1} z_{ei} \sum_{c \in C_1} \langle x_{ic} \rangle \left( \tfrac{1}{2} \log |\Lambda_c| - \tfrac{D}{2} \log 2\pi - \tfrac{1}{2}(y_e - \mu_c)^\top \Lambda_c (y_e - \mu_c) \right)
+ \sum_{l=1}^{L} \sum_{j \in H_l} \sum_{i \in \{0, H_{l+1}\}} z_{ji} \sum_{v \in C_l} \sum_{u \in C_{l+1}} \langle x_{jv} x_{iu} \rangle \log \theta_{lvu}

Similar to meDDT, the expectation terms can be efficiently calculated using the sum-product algorithm, resulting in

\langle x_{ic} \rangle = p(x_{ic} = 1 \mid Y, Z, \Theta^{t-1})
and

\langle x_{jv} x_{iu} \rangle = p(x_{jv} = 1, x_{iu} = 1 \mid Y, Z, \Theta^{t-1})

C.2 M-step

In the M-step, we maximize the objective function F obtained from the E-step with respect to each member of the parameter set \Theta. Since the second term on the RHS of the log-likelihood expression is unchanged from that of meDDT, the M-step update for \theta_{lvu} is the same as before:

\theta_{lvu} = \frac{\hat{\theta}_{lvu}}{\sum_{v'} \hat{\theta}_{lv'u}}

where \hat{\theta}_{lvu} = \sum_{j \in H_l} \sum_{i \in \{0, H_{l+1}\}} z_{ji} \langle x_{jv} x_{iu} \rangle denotes the unnormalized class CPT.

The first term on the RHS of the log-likelihood is not identical to, yet similar to, that of meDDT; in fact, it is the case l = 1. Therefore, the update equations for \mu_c, \Lambda_c from the M-step are:

\mu_c = \frac{\sum_{e \in E_1} \sum_{i \in H_1} z_{ei} \langle x_{ic} \rangle \, y_e}{\sum_{e \in E_1} \sum_{i \in H_1} z_{ei} \langle x_{ic} \rangle}

\Lambda_c^{-1} = \frac{\sum_{e \in E_1} \sum_{i \in H_1} z_{ei} \langle x_{ic} \rangle \, (y_e - \mu_c)(y_e - \mu_c)^\top}{\sum_{e \in E_1} \sum_{i \in H_1} z_{ei} \langle x_{ic} \rangle}

for the full-covariance matrix, and

\Lambda_c^{-1} = \mathrm{diag}\!\left( \frac{\sum_{e \in E_1} \sum_{i \in H_1} z_{ei} \langle x_{ic} \rangle \, (y_e - \mu_c)^2}{\sum_{e \in E_1} \sum_{i \in H_1} z_{ei} \langle x_{ic} \rangle} \right)

for the diagonal-covariance matrix, where (y_e - \mu_c)^2 is the element-wise square.

C.3 Final Update Equations for leDDT

The update equations are summarized as follows:

1. Insert the evidence y_e at the level H_1.
2. Run a GMM for the evidence at the level H_1.
   (a) The parameters \mu_c, \Lambda_c can be initialized from the GMM parameters.
   (b) The initial value of the marginal posterior of each x_j can be calculated from the GMM classifier.
3. \theta_{lvu} can be calculated from \theta_{lvu} = \hat{\theta}_{lvu} / \sum_{v'} \hat{\theta}_{lv'u}, where \hat{\theta}_{lvu} = \sum_{j \in H_l} \sum_{i \in \{0, H_{l+1}\}} z_{ji} \langle x_{jv} x_{iu} \rangle denotes the unnormalized class CPT.
4. The parameters \mu_c, \Lambda_c can be calculated from the full-covariance or the diagonal-covariance update equations of Section C.2.
5. Run the sum-product algorithm to calculate the marginal posterior of each x_j.
6. Repeat steps 3, 4 and 5 until the objective function converges.
APPENDIX D
DEFORMABLE BAYESIAN NETWORKS FOR DATA CLUSTERING AND FUSION

In this work, we propose DEformable BAyesian Networks (DEBAN), a probabilistic graphical model framework in which model selection and statistical inference can be viewed as two key ingredients of the same iterative process. While this concept has shown successful results in the computer vision community [35, 36, 60, 75], our proposed approach generalizes it so that it is applicable to any data type. Our goal is to infer the optimal structure/model to fit the given observations. The optimal structure provides an automatic way to find not only the number of clusters in the dataset, but also the multiscale graph structure illustrating the dependence relationships among the variables in the network. Finally, the marginal posterior distribution at each root node is regarded as the fused information of its corresponding observations, and the most probable state can be found from the maximum a posteriori (MAP) solution, with the uncertainty of the estimate available in the form of a probability distribution, which is desirable for a variety of applications.

D.1 Overview

Probabilistic graphical models have been successfully applied to many real-world problems. As a marriage between graph theory and probability theory, a probabilistic graphical model provides intuitive ways to represent data that are observed partially or with some uncertainty due to noise in the system. In this work, we propose a novel graphical model framework for data clustering and fusion, which remain challenging problems in many applications.

In order to apply a graphical model to an application, there are two main issues to consider: 1) the graphical model structure learning problem, and 2) the inference problem. The former involves how to come up with a good structure to associate the random variables such that the graphical model best fits the observed data. This problem is, in fact, equivalent to the model selection problem in graphical models, which has been a large and active research area for decades [76-79].
The latter problem deals with inferring the probability distribution/mass function (pdf/pmf) at the unobserved (hidden) nodes in the model. At present, advances in statistical machine learning and related areas provide many approaches to solving this problem efficiently and systematically.

Both the structure learning problem and the inference problem are so challenging that most of the time they are treated separately and independently, even though in many applications they should be treated as dependent processes. In a particular application, the structure is usually assumed known by a human operator and is fixed throughout the procedure, leaving only the corresponding inference problem to be solved. Unfortunately, in a system where there are many variables or where the variables are non-stationary, a fixed-structure graphical model may be extremely inaccurate in its predictions. These potential problems emphasize the need for a graphical model framework in which structure learning and inference are treated dependently.

In this work, we propose DEformable BAyesian Networks (DEBAN), a probabilistic graphical model framework where model selection and statistical inference can be viewed as two principal ingredients of the same iterative process. While this concept has shown successful results in the computer vision community [35, 36, 60, 75], our proposed approach generalizes the concept to be applicable to any data type. Our framework is based on directed acyclic graphical models, a.k.a. Bayesian networks. In the Bayesian networks we describe here, there are two types of random variables, hidden and observed, denoted by X and Y respectively. The underlying graph structure Z encodes the dependence relationships among the random variables X and Y, each of which is represented by a node in the Bayesian network structure. In order to incorporate the behavior of the dataset at various scales and to preserve long-range correlations between pairs of nodes, the structure Z is restricted to a hierarchical tree structure like that of tree-structured Bayesian networks (TSBNs) [34].
The tree structure enables not only an exact inference algorithm, but also efficient computation for the hidden nodes. Each directed edge of the graph structure Z represents the conditional probability table of the child node (the node at the head of the arrow) given its corresponding parent (the node at the other end of the arrow). We restrict our attention to generative models for all the CPTs in the network.

Unlike a TSBN, the graph structure in our proposed model can adapt according to the log-likelihood of the structure given the distribution at each hidden node. In Dynamic Tree (DT) Bayesian networks [35, 60, 75], this concept has been demonstrated to alleviate the blocky effect caused by the fixed graph structure of TSBNs and to improve the accuracy of image classification significantly. While the DT formulation assumes that the observed variables are discrete, DEBAN makes a clear separation between the observed variables and the hidden variables, which makes it more general. Our work also provides a special case where the state-transition distribution p(X_j = x_j | X_i = x_i) is multinomial and the state-observation distribution p(Y_e = y_e | X_i = x_i) is Gaussian distributed, which is applicable to many problems and easy to understand. The structure Z is treated as a random variable in the system. DEBAN can be seen as an iterative process composed of two alternating algorithms:

1. Given the likelihood of the structure Z and the conditional probabilities in the network, the posterior marginal distribution at each node in the network is calculated using the sum-product algorithm [38].
2. Given the posterior marginal distribution at each node, we adapt the structure Z such that the cost function is increased.

The two processes are performed alternately until the algorithm converges. Using this process, we can infer the maximum a posteriori (MAP) structure Z* that maximizes the posterior p(Z | Y). Two criteria decide the optimal structure Z*: 1) the number of clusters in the dataset, and 2) the likelihood that an observation is generated from a particular cluster. The number of clusters can be determined from the number of root nodes in the final graph structure Z*, and the membership assigned to each observation is found from the optimal structure and the MAP state of its corresponding parent node.
D.2 Deformable Bayesian Networks

In this section, before discussing our proposed framework, we first present an overview of Bayesian networks. Next, the probabilistic model of DEBAN is described mathematically, followed by its special case where the state-observation probability distribution of each class is multivariate Gaussian.

D.2.1 Bayesian Networks

In a Bayesian network (BN) graphical model, the joint probability of a set of random variables (or nodes) is defined by the connections of the graph, which in turn define the conditional probability relationships among the nodes. In Fig. D-1, a set of random variables A, B, C, D is connected as shown. Directed links from parent nodes to child nodes define the probability of the states of the children conditioned on the parents. For example, the arrow from node A to node C in Fig. D-1 denotes the conditional probability P(C | A). Root nodes, or nodes which do not have a parent, are defined through a prior probability. Through this notation, the joint probability of the graph in Fig. D-1 can be factored as

P(A, B, C, D) = P(A) \, P(B \mid A) \, P(C \mid A) \, P(D \mid B, C)

Normally there are two types of variables or nodes in a graphical model: 1) observed or evidential variables, and 2) hidden or unobserved variables. Most of the time, our goal is to infer the probability distribution of a subset of the hidden nodes given the evidential nodes, via the posterior distribution derived from Bayes' rule, marginalizing out the other hidden nodes that are not of interest. For example, inferring the probability of the states of the random variable C given the observed values B = b and D = d in Fig. D-1 requires that we sum over the probability mass function of variable A in the numerator, and over A and C in the denominator, of the posterior expression given below.
Figure D-1. Bayesian network graphical model. Nodes A, B, C, and D are random variables, and links between the nodes define conditional probability relationships between parent and child nodes.

This expression (i.e., the posterior) can be written according to Bayes' rule as

p(c \mid b, d) = \frac{\sum_{a \in A} p(a) \, p(b \mid a) \, p(c \mid a) \, p(d \mid b, c)}{\sum_{(a, c) \in A \times C} p(a) \, p(b \mid a) \, p(c \mid a) \, p(d \mid b, c)}

The computational complexity of marginalization increases as the numbers of nodes, states and links in the BN increase. In most inference settings, the probability mass/distribution function at each node in the BN needs to be updated after the introduction of new evidence into the network. When the number of hidden nodes to be marginalized out is large, the normalization factor (i.e., the denominator above) can be computationally intractable. Fortunately, there are several algorithms that can make marginalization tractable. For tree-structured networks, i.e., where each child node has only one parent, the marginalization or belief update at each node can be carried out exactly (i.e., exact inference) using a fast sum-product algorithm known as belief propagation (BP) or Pearl's message passing [38]. BP gives exact inference for a tree-structured network and approximate inference for a network containing cycles, in which case it is called loopy belief propagation (LBP) [80]. BP greatly reduces computation time and provides a standard programming framework to accomplish the marginalization over the entire network. In this work, we restrict the framework to a tree-structured network and use BP because of its computational advantages.
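For the four-node network above, the posterior can be checked directly by enumeration before resorting to BP. The sketch below does exactly that for binary variables, using made-up CPT values purely for illustration:

```python
import numpy as np

# Made-up CPTs for binary A, B, C, D: pA[a], pB[b, a] = P(B=b|A=a), etc.
pA = np.array([0.6, 0.4])
pB = np.array([[0.7, 0.2], [0.3, 0.8]])            # P(B|A)
pC = np.array([[0.9, 0.4], [0.1, 0.6]])            # P(C|A)
pD = np.array([[[0.8, 0.3], [0.5, 0.1]],           # P(D=0|B,C)
               [[0.2, 0.7], [0.5, 0.9]]])          # P(D=1|B,C)

def posterior_c(b, d):
    """p(c | b, d) = sum_a p(a)p(b|a)p(c|a)p(d|b,c), renormalized over c."""
    joint = np.array([sum(pA[a] * pB[b, a] * pC[c, a] * pD[d, b, c]
                          for a in (0, 1)) for c in (0, 1)])
    return joint / joint.sum()

print(posterior_c(b=1, d=0))   # a length-2 distribution summing to 1
```

Enumeration like this is exponential in the number of hidden variables, which is precisely the cost that BP avoids on trees.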
D.2.2 Probabilistic Model of Generalized DEBAN

Let X = \{x_h\}_{h \in H} denote the set of hidden nodes, where H is the set of indices of the hidden nodes, and let Y = \{y_e\}_{e \in E} denote the set of observed nodes, where E is the set of indices of the observed nodes. The number of nodes in the network is N = |H| + |E|. Moreover, let H_l denote the set of indices of nodes in hidden layer l \in \{1, 2, ..., L-1\}, where L-1 is the root (top-most) layer as shown in Fig. D-2. We index the nodes in the network in ascending order from root to leaf; in other words, the index of a parent node is smaller than those of its children. A model with a fixed structure Z is defined by the joint distribution p(X, Y \mid Z, \Theta). Since DEBAN is a directed acyclic graph (DAG), the joint factorizes over the nodes, i.e., p(X, Y \mid Z, \Theta) = \prod_j p(u_j \mid pa_j, \theta_j, Z), where u_j \in X \cup Y, pa_j is the set of parents of u_j, and \theta_j \in \Theta parameterizes the edges (conditional probabilities) pointing toward the child node u_j. The structure of the model is determined by the variable Z = \{z_{ji}\}_{(j,i) \in A}, where the set of all possible connections is A = \{E \times H_1\} \cup \{H_l \times H_{l+1}\}_{l=1}^{L-1}, and z_{ji} = 1 when the child node u_j is connected to the parent u_i, and z_{ji} = 0 otherwise. Normally, Z is written in the form of an N x N matrix whose element in the j-th row and i-th column is z_{ji}. The joint probability P(X, Y, Z \mid \Theta) defines all possible state configurations and all possible structures of the DEBAN, and can be factorized as

P(X, Y, Z \mid \Theta) = P(X, Y \mid Z, \Theta) \, P(Z \mid \Theta_Z)

where \Theta = \{\theta, \Theta_Z\}; \theta_{ji} \in \theta denotes the parameters of the edge from u_i toward u_j, and \pi_{ji} \in \Theta_Z denotes the probability that the child node u_j is connected to the parent node u_i, i.e., p(z_{ji} = 1) = \pi_{ji}. In this work, we restrict our attention to a tree-structured architecture for Z because of its computational efficiency for inference.
Figure D-2. A graphical overview of Deformable Bayesian Networks (DEBAN).

The joint probability given a structure Z (a.k.a. the structure likelihood) can be written as:

p(X, Y \mid Z, \Theta) = \prod_{e \in E} \prod_{i \in H_1} p(y_e \mid x_i, \theta_{ei})^{z_{ei}} \; \prod_{l=1}^{L-1} \prod_{j \in H_l} \prod_{i \in \{0, H_{l+1}\}} p(x_j \mid x_i, \theta_{ji})^{z_{ji}}

Note that the form of the structure likelihood does not depend on the functional form of Z at all. Also note that we include 0 in the set of parent nodes \{0, H_{l+1}\}, indicating that any node in H_l connecting to the 0-node becomes a root node; H_L is an empty set, so root nodes in layer H_{L-1} are automatically connected to 0.

D.2.3 A Special Case of DEBAN

In this section, we restrict our attention to a simple special case of DEBAN where the state-transition distribution p(x_j \mid x_i, \theta_{ji}) is multinomial and the state-observation distribution p(y_e \mid x_i) is Gaussian-distributed, which is applicable to many problems and is easy to understand. In this paper, the structure is updated using a heuristic method, which will be discussed in later sections. For simplicity, we assume for the moment that the structure Z is fixed and the value of each z_{ji} is known. From this point forward, we use the terms state-transition distribution and conditional probability table (CPT) interchangeably.
interchangeab ly.Eachhiddennode x i cantakeavectorvaluefromasetofstates 1 C = f1 1 ,...,1 C g where 1 c isa C 1 columnvectorwith1onlyattheposition c and0 elsewhere.Hence,theCPTisgivenby: p (x j jx i ji )= C Y v =1 C Y u =1 x jv x iu jivu (D) where jivu denotesthevalueofthestate-transitionprobability p (x jv =1jx iu =1, ji j ) and x ju denotesthe u th rowofvector x j .AssumethattheCPT issharedamong nodesinthesamelevel,i.e. jivu = j ivu = lvu where j j 2H l and i i 2H l +1 for l 2f1,..., L )Tj/T1_4 11.955 Tf12.32 0 Td(1 g; v u 2f 1,..., C g .Hencelet = f lvu g l 2f 1,..., L)Tj/T1_5 7.97 Tf11.41 0 Td(1g ,v u 2fCCg collectively denotetheCPTofthemodelwhichisgivenby p (x j jx i l )= C Y v =1 C Y u =1 x jv x iu lvu (D) Each y e isacontinuous-valuedrandomvariablenodewhichcantakeany D 1 real-valuedvector.Inthisworkweassumethatthenumberofthestatesisknownas C andaclasscanberepresentedbyonemultivariateGaussian,thus,thestate-observation probabilitydistribution p ( y e jx i ) isgivenby p (y e jx i )= C Y c =1 N (y e j c )Tj/T1_5 7.97 Tf(1 c ) x ic (D) where N (y e j c )Tj/T1_5 7.97 Tf(1 c )= j c j 1 =2 (2 ) D =2 exp )Tj/T1_5 7.97 Tf10.5 4.71 Td(1 2 (y e )Tj/T1_6 11.955 Tf11.96 0 Td( c ) > c (y e )Tj/T1_6 11.955 Tf11.96 0 Td( c ) ; c and c are respectively D 1 meanvectorand D D precisionmatrixforclass c .Each x ic canonlytakethe 1 In thispaper,thenotionofastatecanmeanthesamethingasclass.Thatiswhy weuse C torepresentthenumberofstatesforhiddenvariables. 111


From all of the above, the joint distribution can be written as
\[
p(X, Y \mid Z, \Theta) = \prod_{e \in \mathcal{E}} \prod_{i \in \mathcal{H}_1} \prod_{c=1}^{C} \mathcal{N}(y_e \mid \mu_c, \Lambda_c^{-1})^{x_{ic} z_{ei}} \prod_{l=1}^{L-1} \left( \prod_{j \in \mathcal{H}_l} \prod_{i \in \{0,\, \mathcal{H}_{l+1}\}} \prod_{v=1}^{C} \prod_{u=1}^{C} \theta_{lvu}^{\,x_{jv} x_{iu} z_{ji}} \right),
\]
and the corresponding log-likelihood is
\[
\log p(X, Y \mid Z, \Theta) = \sum_{e \in \mathcal{E}} \sum_{i \in \mathcal{H}_1} z_{ei} \sum_{c=1}^{C} x_{ic} \log \mathcal{N}(y_e \mid \mu_c, \Lambda_c^{-1}) + \sum_{l=1}^{L-1} \left( \sum_{j \in \mathcal{H}_l} \sum_{i \in \{0,\, \mathcal{H}_{l+1}\}} z_{ji} \sum_{v=1}^{C} \sum_{u=1}^{C} x_{jv} x_{iu} \log \theta_{lvu} \right).
\]

D.3 Learning Parameters of DEBAN

Since the variables $X$ in the log-likelihood above are not observed, in order to learn the parameters $\mu_c$, $\Lambda_c$, and $\Theta$ we use the EM algorithm [54, 55]. In this section, we derive for DEBAN the set of EM update equations obtained from the Expectation step (E-step) and the Maximization step (M-step), respectively.

D.3.1 E-step

In this step, we calculate the expected value, denoted by the symbol $\langle \cdot \rangle$, of the complete-data log-likelihood $\log p(X, Y \mid Z, \Theta)$ with respect to the unknown data $X$ given the observed data $Y$. That is, we define
\[
F(\Theta; \Theta^{t-1}) \triangleq \big\langle \log p(X, Y \mid Z, \Theta) \big\rangle_{p(X \mid Y, Z, \Theta^{t-1})} = \sum_{e \in \mathcal{E}} \sum_{i \in \mathcal{H}_1} z_{ei} \sum_{c=1}^{C} \langle x_{ic} \rangle_{p(X \mid Y, Z, \Theta^{t-1})} \left( \tfrac{1}{2} \log |\Lambda_c| - \tfrac{D}{2} \log 2\pi - \tfrac{1}{2} (y_e - \mu_c)^{\top} \Lambda_c (y_e - \mu_c) \right) + \sum_{l=1}^{L-1} \left( \sum_{j \in \mathcal{H}_l} \sum_{i \in \{0,\, \mathcal{H}_{l+1}\}} z_{ji} \sum_{v=1}^{C} \sum_{u=1}^{C} \langle x_{jv} x_{iu} \rangle_{p(X \mid Y, Z, \Theta^{t-1})} \log \theta_{lvu} \right).
\]


The expectation terms can be calculated from
\[
\langle x_{ic} \rangle_{p(X \mid Y, Z, \Theta^{t-1})} = \sum_{X} x_{ic}\, p(X \mid Y, Z, \Theta^{t-1}) = \sum_{x_i} x_{ic}\, p(x_i \mid Y, Z, \Theta^{t-1}) = p(x_{ic} = 1 \mid Y, Z, \Theta^{t-1})
\]
and
\[
\langle x_{jv} x_{iu} \rangle_{p(X \mid Y, Z, \Theta^{t-1})} = p(x_{jv} = 1, x_{iu} = 1 \mid Y, Z, \Theta^{t-1}).
\]
Both expectation terms can be efficiently calculated using the sum-product algorithm [39].

D.3.2 M-step

In this step, we equate to zero the derivative of the cost function $F(\Theta; \Theta^{t-1})$ with respect to each parameter.

Learning the parameter $\mu_c$:
\[
\frac{\partial F(\Theta; \Theta^{t-1})}{\partial \mu_c} = \sum_{e \in \mathcal{E}} \sum_{i \in \mathcal{H}_1} z_{ei} \langle x_{ic} \rangle_{p(X \mid Y, Z, \Theta^{t-1})}\, \Lambda_c (y_e - \mu_c),
\]
and equating the derivative to 0, we have
\[
\mu_c = \frac{\sum_{e \in \mathcal{E}} \sum_{i \in \mathcal{H}_1} z_{ei} \langle x_{ic} \rangle_{p(X \mid Y, Z, \Theta^{t-1})}\, y_e}{\sum_{e \in \mathcal{E}} \sum_{i \in \mathcal{H}_1} z_{ei} \langle x_{ic} \rangle_{p(X \mid Y, Z, \Theta^{t-1})}}.
\]

Learning the parameter $\Lambda_c$:
\[
\frac{\partial F(\Theta; \Theta^{t-1})}{\partial \Lambda_c} = \sum_{e \in \mathcal{E}} \sum_{i \in \mathcal{H}_1} z_{ei} \langle x_{ic} \rangle_{p(X \mid Y, Z, \Theta^{t-1})}\, \tfrac{1}{2} \left( \Lambda_c^{-1} - (y_e - \mu_c)(y_e - \mu_c)^{\top} \right),
\]


and equating the derivative to 0, we have
\[
\Lambda_c^{-1} = \frac{\sum_{e \in \mathcal{E}} \sum_{i \in \mathcal{H}_1} z_{ei} \langle x_{ic} \rangle_{p(X \mid Y, Z, \Theta^{t-1})}\, (y_e - \mu_c)(y_e - \mu_c)^{\top}}{\sum_{e \in \mathcal{E}} \sum_{i \in \mathcal{H}_1} z_{ei} \langle x_{ic} \rangle_{p(X \mid Y, Z, \Theta^{t-1})}}.
\]

Learning the parameter $\Theta$: Here we learn each $\theta_{lvu}$ under the constraint $\sum_v \theta_{lvu} = 1$, so the Lagrange multiplier term $-\lambda(\sum_v \theta_{lvu} - 1)$ is incorporated into the log-likelihood term:
\[
\frac{\partial F(\Theta; \Theta^{t-1})}{\partial \theta_{lvu}} = \sum_{j \in \mathcal{H}_l} \sum_{i \in \{0,\, \mathcal{H}_{l+1}\}} z_{ji} \langle x_{jv} x_{iu} \rangle_{p(X \mid Y, Z, \Theta^{t-1})} \frac{1}{\theta_{lvu}} - \lambda.
\]
Equating the above derivative to 0, we have
\[
\theta_{lvu} = \frac{\sum_{j \in \mathcal{H}_l} \sum_{i \in \{0,\, \mathcal{H}_{l+1}\}} z_{ji} \langle x_{jv} x_{iu} \rangle_{p(X \mid Y, Z, \Theta^{t-1})}}{\lambda},
\]
where $\lambda = \sum_v \sum_{j \in \mathcal{H}_l} \sum_{i \in \{0,\, \mathcal{H}_{l+1}\}} z_{ji} \langle x_{jv} x_{iu} \rangle_{p(X \mid Y, Z, \Theta^{t-1})}$. It is more convenient to write
\[
\theta_{lvu} = \frac{\hat{\theta}_{lvu}}{\sum_{v'} \hat{\theta}_{lv'u}},
\]
where $\hat{\theta}_{lvu} = \sum_{j \in \mathcal{H}_l} \sum_{i \in \{0,\, \mathcal{H}_{l+1}\}} z_{ji} \langle x_{jv} x_{iu} \rangle_{p(X \mid Y, Z, \Theta^{t-1})}$ denotes the unnormalized CPT.
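To make the M-step concrete, the sketch below (in Python with NumPy; the experiments in this appendix were run in MATLAB, so this is an illustrative re-expression rather than the original implementation) applies the three update equations above. It assumes the E-step marginals $\langle x_{ic} \rangle$ and the summed pairwise marginals $\langle x_{jv} x_{iu} \rangle$ have already been obtained from the sum-product algorithm; all array names are hypothetical.

import numpy as np

def m_step(Y, resp, pair_marg):
    """One M-step of the special-case DEBAN.
    Y:         (E, D) observations at the leaf level
    resp:      (E, C) E-step marginals, aligned so resp[e, c] = z_ei * <x_ic>
               for the parent node i of leaf e
    pair_marg: (L-1, C, C) summed pairwise marginals per hidden level,
               pair_marg[l, v, u] = sum_j sum_i z_ji <x_jv x_iu>
    Returns updated means, covariances (Lambda_c^{-1}), and per-level CPTs."""
    E, D = Y.shape
    C = resp.shape[1]
    Nc = resp.sum(axis=0)                          # normalizing denominators
    mu = (resp.T @ Y) / Nc[:, None]                # mean update
    cov = np.zeros((C, D, D))
    for c in range(C):
        diff = Y - mu[c]                           # residuals for class c
        cov[c] = (resp[:, c, None] * diff).T @ diff / Nc[c]
    # CPT update: normalize over the child state v (axis 1), per parent state u
    theta = pair_marg / pair_marg.sum(axis=1, keepdims=True)
    return mu, cov, theta

The divisions by the effective class counts mirror the normalizing denominators in the update equations above.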


D.3.3 Initializing $\Theta$

One problem with the EM algorithm is local minima, so a good initialization usually contributes to a good final solution. We propose the $k$-nearest-neighbor density walk (KDW) algorithm, a fast and intuitive approach to calculating the initial CPT. First, the user picks three parameters: the length of the density walk $L$, the number of neighbors $k$ for the $k$-nearest-neighbor ($k$-nn) search, and the number of samples $S$. Next, a $C$-component Gaussian mixture model (GMM) is fit to the observations $Y$, and a simple $C$-class Bayes classifier is built from this GMM; this classifier is used repeatedly in the following steps. After that, randomly pick a starting point $y_1$ from the observations $Y$, then calculate the $k$-nn of $y_1$ and randomly pick one of them; call the selected point $y_2$. Repeat this process until the sample $y_L$ is obtained. This creates a series of points generated by a random-walk process guided by the regions of high density; this is called a $k$-nn density walk of size $L$. As a result, the first density walk $R_1$ is made. The density walk is generated $S$ times. Let $y_l^s$ denote the $l$th sample of the walk $R_s$. As $S$ gets bigger, the CPT is better represented. Next, the walks $R_1, \dots, R_S$ are used to produce the initial CPT, $\theta^0$, for the whole network. The element in the $r$th row and $c$th column is
\[
\theta^0_{rc} = \frac{\hat{\theta}^0_{rc}}{\sum_{r'} \hat{\theta}^0_{r'c}}, \quad \text{where} \quad \hat{\theta}^0_{rc} = \sum_{s=1}^{S} \delta_c\big(\mathrm{MAP}(y_1^s)\big) \sum_{l=2}^{L} \delta_r\big(\mathrm{MAP}(y_l^s)\big),
\]
$\mathrm{MAP}(y_l^s)$ denotes the maximum a posteriori state of $y_l^s$ classified by the GMM described earlier, and $\delta_r(\cdot)$ denotes the Kronecker delta, i.e., $\delta_r(x) = 1$ when $x = r$, and 0 otherwise.

The elements of the CPT indicate the strength of the connections among the classes. The CPT built by the KDW algorithm captures the relations among the Gaussian distribution components and proposes the topology of the clusters in a probabilistic way. Different ways of building the CPT would capture different meanings of the relations among the classes. There is a very strong connection, discussed further below, between the parameters $k$ and $L$ and the kernel width in non-parametric pdf estimation.
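The KDW initialization itself reduces to a few lines once a MAP classifier is available. The following sketch (Python/NumPy; `classify` stands in for the Bayes classifier built from the fitted GMM, and the brute-force neighbor search is for clarity, not speed) builds the column-normalized initial CPT exactly as defined above.

import numpy as np

def kdw_init_cpt(Y, classify, C, k=3, L=100, S=50, rng=None):
    """k-nearest-neighbor density walk (KDW) initialization of the CPT.
    Y:        (N, D) observations
    classify: function mapping a point to its MAP class in {0, ..., C-1}
    Returns a C x C initial CPT with columns summing to 1."""
    rng = np.random.default_rng(rng)
    N = Y.shape[0]
    counts = np.zeros((C, C))
    for _ in range(S):
        idx = rng.integers(N)                    # random starting point y_1
        start_class = classify(Y[idx])           # column index c
        for _ in range(L - 1):
            d = np.linalg.norm(Y - Y[idx], axis=1)
            knn = np.argsort(d)[1:k + 1]         # k nearest neighbors (excluding self)
            idx = rng.choice(knn)                # random step of the density walk
            counts[classify(Y[idx]), start_class] += 1
    # column-normalize; guard against empty columns
    return counts / np.maximum(counts.sum(axis=0, keepdims=True), 1e-12)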


D.3.4 Kernel Perspective of KDW

As in non-parametric pdf estimation, choosing the parameters $k$ and $L$ is crucial in this application. There is a strong connection between the physical meaning of $k$ and $L$ and the kernel size in non-parametric pdf estimation. As mentioned earlier in this section, the CPT is supposed to capture the relationships among the Gaussian components, but such a relationship can be subjective and dataset-dependent, so in order to make an appropriate choice we need to fully understand the behavior of the parameters. One way to understand it is to compare them with those of non-parametric pdf estimation, which share similar behavior.

When the number of nearest neighbors $k$ is big, the radius of candidate points is big, so the density walk has a greater chance of taking a long walk than with small $k$. Thus, the chance that the density walk passes through multiple components is high; however, the chance that the path includes undesired clusters is high too, so picking an appropriate size of $k$ is very important. This behavior is equivalent to having a big kernel size in non-parametric pdf estimation. Furthermore, $L$ has very similar behavior to $k$, and the two complement each other. In an extreme case, when $k$ and $L$ are on the order of the number of points in the space, each column of the CPT shifts toward the prior probability.

A numerical experiment is conducted on the frame2 dataset containing 400 samples of 2-dimensional vectors. The values of $k$ are 3, 10, and 20 from left to right; the values of $L$ are 30, 100, and 200 from top to bottom. $C = 4$ was assumed to be the number of components in the GMM. The $k$-nn density walk map for $S = 3$ is shown in Fig. D-3. The CPTs are shown in Fig. D-4 for $S = 50$, the value used in this experiment. Fig. D-3 supports the claim that when $k$ and $L$ increase, the chance that the density walk skips between several clusters increases as well, and each column of the CPT converges to the prior probability, as shown in Fig. D-4.

D.4 Inference in DEBAN

The goal of a DEBAN is to search for the optimal structure $Z$ that best explains the observations, that is, the structure that maximizes the posterior marginal $P(Z \mid Y, \Theta)$ [35]. From this point on, it will be understood that the probabilities are given the model parameters $\Theta$, but we will not write it explicitly. The posterior of the structure can be calculated from
\[
p(Z \mid Y) = \frac{\sum_X p(X, Y, Z)}{\sum_{X, Z} p(X, Y, Z)}.
\]
The numerator can be calculated efficiently using the sum-product algorithm [39], which is a generalized case of belief propagation, or Pearl's message-passing algorithm [38], but the denominator requires a summation over all possible structures $Z$, which can make the computation intractable when the number of nodes is large. There are approximation techniques and heuristic optimization techniques to avoid this computational problem (e.g., simulated annealing and variational approximation).


Figure D-3. The $k$-nn density walk map when $S = 3$. Panels A through I show $k \in \{3, 10, 20\}$ (left to right) and $L \in \{30, 100, 200\}$ (top to bottom).

One way to avoid the calculation of the denominator is to realize that the denominator is effectively a constant depending only on the observations $Y$, and to treat the maximum a posteriori problem as an optimization problem in which we do not have to explicitly evaluate the denominator:
\[
Z^\ast = \arg\max_Z P(Z \mid Y),
\]


Figure D-4. Resulting CPTs from the KDW algorithm illustrated in Fig. D-3. Panels A through I show $k \in \{3, 10, 20\}$ (left to right) and $L \in \{30, 100, 200\}$ (top to bottom) for $S = 50$.

which is equivalent to
\[
Z^\ast = \arg\max_Z \{\log P(Y \mid Z) + \log P(Z)\}.
\]
However, for simplicity we assume $P(Z)$ is uniformly distributed over $Z$; therefore we do not need to take $\log P(Z)$ into account when optimizing the structure.


The likelihood of a structure $Z$ given the evidence $Y$ can be calculated from
\[
P(Y \mid Z) = \sum_X P(X, Y \mid Z),
\]
which can be evaluated exactly and efficiently using the sum-product algorithm since the structure of DEBAN is a tree. In this work we adopt a relatively simple simulated annealing optimization framework to solve for the best structure. A summary of the process can be found in Section D.5.1.

Once the optimal structure $Z^\ast$ is determined, we can gather the information at the root node $x_r$ of each tree from its corresponding leaf nodes via probabilistic inference:
\[
P(x_r \mid Y, Z^\ast) = \frac{P(x_r, Y \mid Z^\ast)}{P(Y \mid Z^\ast)} = \frac{\sum_{X_{\mathcal{H} \setminus \{r\}}} P(X, Y \mid Z^\ast)}{\sum_X P(X, Y \mid Z^\ast)},
\]
where $r$ is the index of a root node and $X_{\mathcal{H} \setminus \{r\}}$ denotes the set of all hidden variables except $x_r$. Note that the denominator here is the same as the structure likelihood above, and the numerator is obtained one step before the denominator in the sum-product algorithm.

D.5 Experiment and Results

Before we present the experiments and results, we summarize the procedure used in the experiment and demonstrate the proposed structure-updating algorithm.

D.5.1 Summary of DEBAN for Clustering

The overview of DEBAN for unsupervised data clustering is summarized as follows:

1. Initialization:
   (a) Initialize $Z$ with the balanced-tree structure.
   (b) Initialize $\Theta$ using KDW.
   (c) Initialize the parameters $\mu$ and $\Lambda$ with appropriate values.

2. Start the simulated annealing routine using the parameters in Table D-1 to maximize $p(Y \mid Z)$ with respect to $Z$.


Table D-1. Parameters for the simulated annealing (SA) optimization in the experiment.

    parameter                          value
    initial temperature $T_0$          25
    temperature limit $T_{limit}$      0.05
    temperature schedule               $T_k = 0.9\, T_{k-1}$
    max accept to update               20
    max anneal interval                50
    max iteration                      10000

3. Run the EM-DEBAN algorithm to re-estimate the parameters $\mu$, $\Lambda$, and $\Theta$ using the update equations derived in Section D.3.2.

4. Calculate the likelihood of the given structure $Z$, $p(Y \mid Z)$, using the sum-product algorithm.

5. Update the structure according to our proposed method given in Section D.5.2.

6. Repeat steps 3-5 in the simulated annealing (SA) optimization framework, and obtain the optimal structure $Z^\star$.

D.5.2 Structure Updating Algorithm

In this section, we propose a fast way to update the structure based on a stochastic method. Naturally, we want to connect a child node $x$ to a parent $\mathrm{Pa}(x)$ such that the state posterior marginals $p(x \mid Y)$ and $p(\mathrm{Pa}(x) \mid Y)$ are similar. Since the number of classes $C$ is the same from the child level to the parent level, the support of the posterior marginal distribution is the same for both layers. This allows us to use a simple distance measure between two distributions, such as the $L_2$-norm distance defined by
\[
D(f(x), g(x)) = \sqrt{\sum_{i \in \mathcal{I}} \big(f(x_i) - g(x_i)\big)^2},
\]
where $\mathcal{I}$ is the support of the discrete distributions $f(x)$ and $g(x)$. For a child node $x_j$ and a parent node $x_i$ in levels $\mathcal{H}_l$ and $\mathcal{H}_{l+1}$, respectively, we calculate $D(p(x_j \mid Y), p(x_i \mid Y))$, denoted $D_{ji}$, for all $i \in \mathcal{H}_{l+1}$. We then construct a multinomial distribution over the parent node


index $i$:
\[
\mathrm{Mult}(i \mid j) = \frac{\left(1/D_{ji}\right)^{\gamma}}{\sum_{i' \in \mathcal{H}_{l+1}} \left(1/D_{ji'}\right)^{\gamma}},
\]
where the randomness factor $\gamma$ determines the degree of randomness when updating the structure. When $\gamma = 0$, the structure-updating algorithm becomes a completely random algorithm, and in the limit as $\gamma$ goes to infinity, the algorithm becomes a stochastic greedy algorithm in which the child node $x_j$ is connected to the parent $x_i$ with minimum distance. The choice of $\gamma$ depends on the dataset and application; in our work we pick $\gamma = 2$ so that we do not have to calculate the square root and because the result can then be interpreted as depending on the squared distances. The parent node $x_i$ to which the child $x_j$ is connected in an iteration is determined stochastically by $i \sim \mathrm{Mult}(i \mid j)$, that is, by sampling $i$ from the multinomial $\mathrm{Mult}(i \mid j)$.
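A minimal sketch of this stochastic parent selection is given below (Python/NumPy; the array names are illustrative). It computes the $L_2$ distances between posterior marginals, raises their reciprocals to the power $\gamma$, and samples a parent index from the resulting multinomial.

import numpy as np

def sample_parent(post_child, post_parents, gamma=2.0, rng=None):
    """Stochastically pick a parent for one child node.
    post_child:   (C,) posterior marginal p(x_j | Y) of the child
    post_parents: (P, C) posterior marginals of the candidate parents
    gamma:        randomness factor (gamma = 0: uniform; gamma -> inf: greedy)
    Returns the index of the sampled parent."""
    rng = np.random.default_rng(rng)
    # L2 distance D_ji between the child's and each candidate parent's marginal
    d = np.sqrt(((post_parents - post_child) ** 2).sum(axis=1))
    w = 1.0 / np.maximum(d, 1e-12) ** gamma    # guard against zero distance
    return rng.choice(len(w), p=w / w.sum())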


D.5.3 Unsupervised Clustering Using DEBAN

The dataset frame2 comprises 2 classes, each of which is formed in an L-shape and has 200 points, as shown in Fig. D-5. It is obvious that each cluster cannot be modeled properly using a single Gaussian, so a GMM is used to model the data. We use the Bayesian information criterion (BIC) to determine the optimal number of components for the GMM, which is 4, as shown in Fig. D-5. The CPT is initialized using the KDW parameters $k = 3$, $L = 100$, $S = 50$, as shown in Fig. D-6f. In order to test the robustness, and for a reasonable run-time, the dataset is downsampled from 400 to 32 samples, so the corresponding DEBAN, whose structure is initialized using a balanced tree, has 5 hidden levels and one observed level to which the 32 observations are fed. The state-observation distribution $p(y_e \mid x_i)$ is modeled using parameters $\mu$ and $\Lambda$ calculated from the GMM described earlier.

Figure D-5. Dataset frame2. (A) The raw observations: the point cloud of the dataset frame2. (B) The corresponding GMM fit to the observations.

In the process, the state-transition CPT is calculated locally using the CPT update equation from Section D.3.2. To be more precise, each level $\mathcal{H}_l$ has its own CPT $\theta_{lvu}$, except for the top 3 levels. Each of the top levels has very few nodes; therefore, it is a good idea to let the top 3 layers ($L-1$ to $L-3$) share the same CPT $\theta_{(L-1)vu}$ (i.e., parameter tying). We follow the process given in Section D.5.1, and the results are shown in Fig. D-6.

From the resulting clusters shown in Fig. D-6(a), the data samples are correctly separated into two clusters, each of which is L-shaped. The resulting optimal graph structure $Z$, shown in Fig. D-6(b), has two root nodes, each corresponding to one of the L-shaped clusters in Fig. D-6(a).

There are six rows in Fig. D-6(c), each of which, from top to bottom, represents the array of final posterior distributions $p(x_j \mid Y)$ for each node $x_j$ from the root level ($\mathcal{H}_6$) to the label level ($\mathcal{H}_1$), respectively. In each row, the x-axis and y-axis represent the node number and the class, respectively, and the magnitude of the probability ranges from black (0) to white (1). In level $\mathcal{H}_1$ there are 32 nodes (nodes #32-#63), and for simplicity we align samples from the same Gaussian component together; that is, samples #32-#39 are from the same component, #40-#47 are also from the same component, and so on. The similarity between the distributions in level $\mathcal{H}_l$ and level $\mathcal{H}_{l+1}$ indicates that a child node and its corresponding parent tend to take the same state.


Figure D-6. Results of DEBAN. (A) DEBAN clustering results. (B) The best tree structure from the optimization. (C) Posterior distribution at each hidden node. (D) The final CPTs; from bottom to top, $\theta^{(1)}_{vu}$, $\theta^{(2)}_{vu}$, $\theta^{(3)}_{vu}$, and $\theta^{(4-6)}_{vu}$, respectively. (E) The learning curve; red and blue represent accepted and rejected moves, respectively. (F) The initial CPT for the whole model.


Level $\mathcal{H}_5$ has a unique pattern of posteriors which looks like the combination of its children. This is not a surprise, because it indicates the merging of two subclusters into an L-shaped cluster.

In Fig. D-6(d) there are 4 matrices; the top $4 \times 4$ matrix represents the shared CPT of levels $\mathcal{H}_4$ to $\mathcal{H}_6$, and the rest represent the CPTs of levels $\mathcal{H}_3$ to $\mathcal{H}_1$ from top to bottom, respectively. In each matrix, the states of the child node and the parent node are indexed by row and column, respectively. The CPTs of $\mathcal{H}_3$ to $\mathcal{H}_1$ have strong diagonal elements, meaning that children and their corresponding parents are likely to have the same state. The interesting part is the shared CPT of levels $\mathcal{H}_4$ to $\mathcal{H}_6$, where some off-diagonal elements take high values. In fact, this pattern encodes the information about which Gaussian components should be merged into an L-shaped cluster, which can be associated with the posterior distributions in Fig. D-6(c).

The learning curve in Fig. D-6(e) shows the iterations and the log-likelihood on the x- and y-axes, respectively. SA is guaranteed to converge to the globally optimal solution when the annealing period is long enough; therefore, in this experiment we used a long annealing period, as shown in Table D-1. The experiment takes 1.5 hours to finish on a desktop with an Intel Pentium D 2.8 GHz CPU and 3.62 GB of RAM, programmed in MATLAB R2010a. In fact, the curve shows that the algorithm reaches a good solution in the early iterations, so a quicker solution is possible with a better SA annealing schedule.

D.6 Discussion

Theoretically, this framework can be viewed as a generalization of the GMM and the HMM. In a GMM, each state variable (or component label) $x_j$ is assumed independent of the others, whereas DEBAN assumes conditional independences among those variables by putting a tree-structured prior over the state variables. In an HMM, the state variables are correlated by a Markov chain, which can be seen as a 1-D special case of DEBAN. DEBAN also generalizes the DT in the sense that it makes an explicit connection between the state nodes and the observation nodes, and it provides a framework that applies to other types of data besides images.


There can be many more extensions to our proposed method, for instance: using a more complex model for the state-observation distribution $p(y_e \mid x_i)$; using a different number of states for $x_j$ in different levels, which makes more sense in real-world problems; applying a Bayesian prior over the hidden variables; etc. DEBAN also intrinsically includes the ability to determine the number of clusters in the dataset accurately, which is derived from the deformable graph structure $Z$. KDW demonstrates good results in initializing the CPT, but it would also be interesting to use other kinds of graphs to initialize $\Theta$. We are currently exploring a way to come up with a better annealing schedule such that the solution is found in an acceptable period of time. We are also considering using approximate inference techniques in order to make the whole process faster.

D.7 Summary

We have developed DEBAN, a novel probabilistic framework based on a hierarchical tree-structured Bayesian network that is capable of adapting its graph structure to incorporate feature vectors at the leaf level. The multiscale nature of DEBAN enables us to discover (sub)structures in the data and the dependence relationships among them. DEBAN yields a general framework which can be applied to any structured data. We have demonstrated that DEBAN performs unsupervised clustering well on the challenging dataset frame2, which cannot be modeled successfully using a GMM since each cluster is not a single Gaussian. Moreover, DEBAN solves for the appropriate number of clusters automatically from the data, which eliminates the need for explicit model selection. Together with DEBAN, we also propose the $k$-nn density walk as a powerful and fast way to initialize the state-transition CPT. In addition, we propose a stochastic greedy structure-updating algorithm which yields good results. In future work we wish to find a better SA annealing schedule and to develop variational approximation algorithms, such as mean-field variational approximation, in order to speed up the algorithm run-time.


APPENDIX E
DEFORMABLE BAYESIAN NETWORK: A ROBUST FRAMEWORK FOR UNDERWATER SENSOR FUSION

The dynamic tree (DT) graphical model is a popular analytical tool for image segmentation and object classification tasks. A DT is a useful model in this context because its hierarchical property enables the user to examine information at multiple scales, and its flexible structure can more easily fit complex region boundaries compared to rigid quadtree structures such as tree-structured Bayesian networks. This paper proposes a novel framework for data fusion called a deformable Bayesian network (DFBN), which uses a DT model to fuse measurements from multiple sensing platforms into a non-redundant representation. The structural flexibility of the DFBN is used to fuse common information across different sensor measurements. Appropriate structure-update strategies for the DFBN, and its parameters for the data fusion application, are discussed. A real-world example application using sonar images collected from a survey mission is presented. The fusion results using the presented DFBN framework are shown to outperform state-of-the-art approaches such as the Gaussian mean shift and spectral clustering algorithms. The DFBN's complexity and scalability are discussed to address its potential for larger datasets.

E.1 Overview

In many underwater sensing applications, human operators are required to manually combine, or fuse, multiple measurements of the same region or object into a single observation with no redundant information. This is usually the case in underwater surveys, where the path trajectories of diverse sensing systems overlap the same sensed region many times, producing a heterogeneous set of measurements from the same target. The answer most commonly sought regarding a particular sensed target is an accurate geographic position linked to the set of corresponding sensor data gathered from the sensing systems. In this paper, we use probabilistic graphical models as the analytical framework for automatically combining redundant sonar image target information gathered from multiple underwater sensing platforms.


Fig. E-1 depicts an underwater sensing scenario in which a platform with a side-looking sonar has sensed a group of targets on various passes during a typical survey mission. When all the survey data are collected, an automated target recognition algorithm and/or a human operator reviews all the sonar images and marks locations in the imagery where targets are discovered. Following this review, the multiple target instances must be combined, or fused, into knowledge about the target locations on the seafloor. In this fusion scenario, there is uncertainty about each target's absolute position due to imprecision in the platform navigation system. Thus, targets that are located close to one another must be discriminated by additional means. Additionally, images of a given target are distorted between passes due to varying ranges and sensing angles; thus any clustering or classification algorithm must be invariant to, or anticipate, such distortions. These distortions are depicted in the small image chips in Fig. E-1, where the three targets look different depending on the location and heading of the sensing platform. The graphical model formulation we describe in this paper uses location information along with seabottom texture and target image features to discriminate and group targets located within the navigational uncertainty of the sensing platform.

A spectrum of statistical approaches has been proposed to resolve this particular problem, as explained comprehensively in [81]; however, due to limitations in length, we shall address only the prior work related to graphical models. Various graphical model formulations have been applied to different sensor fusion tasks; one of the first attempts can be found in [82], where heterogeneous sensors work cooperatively to achieve a common goal. In that paper, the data fusion problem is mapped onto a graphical model, and a message-passing scheme is derived to fuse the data efficiently. Moreover, combining several sensors using graphical models adds robustness to the framework, as reported in [83].


Figure E-1. Underwater sensing platforms survey a field of targets. The targets are seen multiple times from a variety of passes. The sensor fusion task is to correctly associate target labels between the various sensing platforms and provide an estimated position. In this figure, the y-direction and x-direction of a sample image associate with the corresponding platform direction (denoted by an arrow) and side-scan direction, respectively.

Approaches that transform data association problems into inference problems on graphical models are formally discussed at length in [84]. That work focuses on efficiently solving the problem using local message-passing algorithms, which can be seen as solving optimization problems in a distributed manner where information exchange is allowed only among neighboring nodes on the graph. It is a well-known fact that message-passing algorithms give exact inference on graphs without loops and approximate inference on graphs with loops. Recently, generalized belief propagation has been applied to sensor network localization problems on graphs with loops [85]. Graphical models have also been applied to tracking problems with complex sensor network topologies, as explored in [86], where the target locations are inferred accurately using message-passing algorithms. In cases where complex probability distributions are utilized to capture more details of the system, nonparametric message-passing algorithms are required for efficient inference [87].


Graphical model approaches have been applied to traditional decentralized data fusion [88], a framework in which there is no single central fusion center and communications happen only on a node-to-node basis. In such frameworks, the network of sensors is represented by a graph, and the communications can be calculated efficiently using message-passing algorithms [89, 90].

In general, graphical models can be classified into two main categories: 1) fixed structure and 2) flexible structure. Many of the sensor fusion applications mentioned earlier fall into the first category, where the graph structure is fixed throughout the learning and inference process. However, the main drawback of this approach is the discrepancy between the data sample and the fixed graph structure. It is true that the fixed graph structure can be learned a priori in a model selection process, but in some situations the training data samples are obtained at great expense. Frequently, training data samples are not available a priori for traditional model selection. In this case the data-graph discrepancy significantly affects the resulting accuracy. Therefore, it is crucial for a framework of this type to be capable of performing model selection simultaneously with the inference process. Hence, graphical models in the second category, where the graph structure is flexible and driven by the data samples, have been developed to mitigate the discrepancy. Some successful examples of such models include the dynamic tree (DT) graphical model, a probabilistic graphical model whose structure is flexible throughout the inference and learning process, which has been successfully implemented in image segmentation applications [47, 91, 92]. Furthermore, the analytical framework provides a flexible hierarchical structure which is desired in many image processing applications, e.g., texture segmentation across complex region boundaries.


Figure E-2. The flow diagram of the deformable Bayesian networks (DFBNs) for underwater sensor fusion.

In addition to their hierarchical structure, DTs are based on a probabilistic framework, using probability distributions to model the relationships among the sensor measurements and thus explicitly providing an estimate of uncertainty for the DT maximum a posteriori (MAP) solution. In other words, the solution of a DT is in terms of the posterior marginal distribution of each node rather than a single value. The DT solution can be contrasted with an unsupervised clustering routine such as $k$-means clustering, where the solution does not have an explicit measure of uncertainty. Moreover, our proposed method outputs the comprehensive relationships among the observations in the dataset in the form of a graph structure, which contains rich information about the data. Such graph structures can also be beneficial in many high-level applications requiring a good representation of the inter-relationships among the observations.

In this paper we extend the merits of the DT framework to solve an underwater sensor fusion problem. We propose a hierarchical DT architecture for the sensor fusion task named deformable Bayesian networks (DFBNs). The ultimate goal of the proposed framework is to reduce the redundancy in a given dataset by means of clustering algorithms whose outputs are tree-structured graphs. The summarized flow diagram of the framework is illustrated in Fig. E-2. The DFBN flow diagram presented here starts with raw measurements from mission surveys. The measurements then undergo the feature extraction process, whose details are problem-dependent, producing a feature vector for each object of interest. The DFBN framework takes the feature vectors as input and automatically outputs a graph structure from which the number of clusters and the estimated associations can be interpreted directly. The flow diagram of Fig. E-2 is significantly detailed in Algorithm 5.


Figure E-3. DFBN sensor fusion architecture. In a sensing scenario, data are organized by a forest of dynamic trees. Each tree represents the information known about a target sensed in the environment. Within each tree, sensor measurements (leaf nodes) are linked to the true state of the target (root nodes) through an intermediate set of nodes that define the uncertainty interjected by a particular sensor.

The details of the graph-structural interpretation are depicted in Fig. E-3. The top of the figure illustrates the sensing scenario, where each tree represents a single target of interest. The bottom of the figure illustrates the structure of each tree. The root node, at the top level of the tree, represents the target in question; the intermediate level of the tree represents the particular sensing platform that collected the measurements on the target; and the leaf nodes, at the bottom level of the tree, are represented by the raw measurements. Prior knowledge about sensor platform identification and navigation information is easily incorporated into this hierarchical formulation and eases the computational burden on the algorithm.

It is important to contrast the sensor fusion approach we assume here with the large body of research in simultaneous localization and mapping (SLAM), which has the capacity to accurately determine spatial locations of targets within the field of view of a mobile sensing platform. SLAM is a popular algorithmic approach to building a map of target locations and the reconstructed sensor platform path by estimating target and platform positions (with associated uncertainty) within a Kalman filter framework [93-96].


With appropriate sensors, accurate dynamic models of the sensing platform, and correct landmark association between sensing passes (a.k.a. the data association problem), a highly accurate spatial map of a sensed area can be constructed [97-102].

The approach we present here departs from, or degrades, the measurement assumptions in SLAM in several important ways. First, we assume that there is no continuous stream of sensor and platform navigation data; we receive only snapshots of relevant targets in the environment. Second, the dynamic model of the sensing platform is not available or is not consistent between multiple snapshots of the same target. Third, the features used in our approach are richer than those commonly used in SLAM and include target and seabottom texture characteristics in addition to target location measurements. Finally, the multiple views of a target can occur at disparate times using different sensing platforms, e.g., combining current measurements with historical measurements. Within our sensor fusion paradigm, the measurements we receive depend upon the image characteristics of the targets of interest, rather than the usefulness of the targets as landmarks in deriving a spatial map. Nevertheless, there may be applications where SLAM and DT sensor fusion can be beneficially coupled, particularly in resolving the data association problem [103-105].

It is crucial to differentiate our proposed method from the joint compatibility branch and bound (JCBB) algorithm [104], which is widely adopted for data association in the computer vision and SLAM communities due to its robustness and efficiency. Generally, JCBB performs data association by evaluating the compatibility between two sets of partially overlapping observations, namely joint compatibility, which makes JCBB more robust than the compatibility calculated from each pair of single observations. A heuristic approach using the Mahalanobis distance to confine the search domain enables more efficient computation in JCBB than in joint compatibility methods in general. Nevertheless, JCBB is not naturally suitable for our scenario, where measurements are obtained in batch rather than online and the platform dynamics are highly ambiguous or not available at all.


On the other hand, our proposed method is designed specifically for this hostile paradigm, and it is also applicable to batch data clustering in general. Furthermore, DT sensor fusion provides rich information in the form of a graph structure, which can be a good resource for other relevant applications.

This paper uses ideas from both the SLAM and graphical model communities, so occasionally there might be some terminological and notational confusion, which we clarify as follows. First, the words "measurement" and "observation" refer to the raw measurement acquired from the sensor platforms, and in some contexts both words are used interchangeably. Second, the term "feature" refers to the discriminative numerical information extracted from each measurement; it is normally arranged in vector form and hence is called a feature vector. Third, in the graphical model community, X and Y commonly refer to hidden and observed variables, respectively, whereas in the SLAM community they are commonly used to describe either 2D (x and y) coordinates or the state and observed variables of a discrete-time dynamical system. To avoid confusion, here we use the mathematical font of $X$ and $Y$ or $x$ and $y$ to denote the hidden and observed variables, respectively, and regular font x and y to describe 2D coordinates. Nevertheless, the meaning of the variables will be clear from the context of their appearance.

The rest of the paper presents a solution to the sensor fusion problem of combining redundant sensor information using the analytical framework of DFBNs. Summaries of the properties of Bayesian networks and DTs are presented in the next section. A description of the DFBN sensor fusion architecture is then introduced. Procedures for optimizing the DFBN, and thus finding the sensor fusion solution, are described next. The remaining sections explain the incorporation of target and sonar image texture features into the DFBN framework for sensor measurement discrimination and include results from a real-world sensing scenario.


The results from DFBNs are compared with state-of-the-art methods, and the computational complexity is discussed. Finally, the paper is concluded in the last section.

E.2 Probabilistic Graphical Models

In a Bayesian network graphical model, the joint probability of a set of random variables (or nodes) is defined by the connections of the graph, which in turn define the conditional probability relationships among the nodes. In Fig. E-4, a set of discrete random variables $A, B, C, D$ is connected as shown. A directed link from a parent node to a child node defines the probability state of the child conditioned on the parent. For example, the arrow from node $A$ to node $C$ denotes the conditional probability $P(C \mid A)$. Root nodes, or nodes which do not have a parent, are defined through a prior probability. Through this notation, the joint probability of the graph in Fig. E-4 can be factored as
\[
P(A, B, C, D) = P(A)\, P(B \mid A)\, P(C \mid A)\, P(D \mid B, C).
\]
Normally, there are two types of variables, or nodes, in a graphical model: 1) observed or evidential variables, and 2) hidden or unobserved variables. Most of the time, the goal is to infer the probability distribution of a subset of hidden nodes given the evidential nodes via the posterior distribution derived from Bayes' rule, marginalizing out the remaining hidden nodes. For example, to infer the probability of the states of the random variable $C$ in Fig. E-4 when the values at nodes $B = b$ and $D = d$ are observed, we must sum over the probability mass function of variable $A$ in the numerator, and over $A$ and $C$ in the denominator, of the joint factorization above. This expression, i.e., the posterior, can be written according to Bayes' rule as
\[
p(c \mid b, d) = \frac{\sum_{a \in \mathcal{A}} p(a)\, p(b \mid a)\, p(c \mid a)\, p(d \mid b, c)}{\sum_{(a, c) \in \mathcal{A} \times \mathcal{C}} p(a)\, p(b \mid a)\, p(c \mid a)\, p(d \mid b, c)}.
\]
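For a concrete sense of this computation, the following sketch evaluates the posterior by brute-force enumeration for binary-valued $A, B, C, D$ (Python/NumPy; the CPT values are invented purely for illustration and are not taken from any experiment).

import numpy as np

# Hypothetical CPTs for the binary network of Fig. E-4 (illustrative values only)
pA = np.array([0.6, 0.4])                           # p(a)
pB_A = np.array([[0.7, 0.3], [0.2, 0.8]])           # p(b | a), rows indexed by a
pC_A = np.array([[0.9, 0.1], [0.4, 0.6]])           # p(c | a)
pD_BC = np.array([[[0.5, 0.5], [0.1, 0.9]],         # p(d | b, c), indexed [b][c]
                  [[0.3, 0.7], [0.8, 0.2]]])

def posterior_c(b, d):
    """p(c | b, d) by enumeration over the hidden variable A (numerator)
    and over A and C (denominator)."""
    joint = np.array([sum(pA[a] * pB_A[a, b] * pC_A[a, c] * pD_BC[b, c, d]
                          for a in range(2)) for c in range(2)])
    return joint / joint.sum()

print(posterior_c(b=1, d=0))   # e.g., the posterior over C given B=1, D=0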


Figure E-4. Bayesian network graphical model. Nodes $A$, $B$, $C$, and $D$ are random variables, and links between the nodes define conditional probability relationships.

The computational complexity of marginalization increases as the number of nodes, states, and links in the Bayesian network increases. In most inference cases, the probability mass function (or probability density function in the case of continuous random variables) at each node in the Bayesian network needs to be updated after the introduction of new evidence into the network. When the number of hidden nodes to be marginalized is large, the normalization factor, i.e., the denominator in the posterior above, can be computationally intractable. Fortunately, there are several algorithms that can make marginalization tractable. For tree-structured networks, where each child node has only one parent, the marginalization, or belief update, at each node can be carried out exactly, i.e., exact inference, using the fast sum-product algorithm, a more generalized version of belief propagation and Pearl's message passing [106]. The sum-product algorithm greatly reduces computation time and provides a standard programming framework to accomplish the marginalization over the entire network. It is worth mentioning that the method also works with any network in general: it gives exact inference for tree-structured networks and approximate inference for networks containing cycles, in which case it is known as loopy belief propagation [80]. In this work, we restrict the framework to tree-structured networks and use the sum-product algorithm for its computational advantages.

E.2.1 Deformable Bayesian Networks (DFBNs)

A dynamic tree (DT) graphical model [47] is a tree-structured Bayesian network whose joint probability $p(X, Y, Z)$ includes a random variable $Z$ that defines the structure of the network, which can be viewed as the linkage between nodes in the tree.


To avoid confusion with dynamic Bayesian networks (Footnote 1: To clarify further, dynamic Bayesian networks focus on the time-series aspect of Bayesian networks and have a fixed graph structure. Dynamic tree graphical models add flexibility to the graph structure by defining the linkage of the model as an additional random variable.), and because of some extensions that differ from the traditional DT architecture, we call this architecture deformable Bayesian networks (DFBNs). Let $Y = \{y_j\}_{j \in \mathcal{E}}$ denote the observed data nodes, where $\mathcal{E}$ is the set of indices of the observed nodes, and let $X = \{x_j\}_{j \in \mathcal{H}}$ denote the hidden nodes, where $\mathcal{H}$ is the set of indices of the hidden nodes. The total number of nodes in the entire network is $N = |\mathcal{H}| + |\mathcal{E}|$. We set the index for each node in the network in ascending order from root to leaf; in other words, the index of a parent node is smaller than that of its children. A model with a fixed structure $Z$ is defined by the joint distribution $p(X, Y \mid Z, \Theta)$. Since a DFBN is a directed acyclic graph (DAG), the joint factorizes over the nodes, i.e., $p(X, Y \mid Z, \Theta) = \prod_j p(u_j \mid \mathrm{pa}_j, \theta_j, Z)$, where $u_j \in X \cup Y$, $\mathrm{pa}_j$ is the set of parents of $u_j$, and $\theta_j \in \Theta$ parameterizes the edges (conditional probabilities) pointing toward the child node $u_j$. The structure of the model is determined by the variable $Z = \{z_{ji} \mid j, i \in \mathcal{H} \cup \mathcal{E}\}$, where $z_{ji} = 1$ when the child node $u_j$ is connected to the parent $u_i$, and $z_{ji} = 0$ otherwise. Normally, $Z$ is written in the form of an $N \times N$ matrix whose element in the $j$th row and $i$th column is $z_{ji}$. All possible state configurations and all possible structures of the DFBN can be defined as a joint probability distribution whose functional form can be factorized as
\[
P(X, Y, Z \mid \Theta, \Pi) = P(X, Y \mid Z, \Theta)\, P(Z \mid \Pi),
\]
where $\theta_{ji}$ denotes the conditional probability of the arrowed edge from $u_i$ toward $u_j$, and $\pi_{ji}$ denotes the probability that the child node $u_j$ is connected to the parent node $u_i$. For convenience, both types of parameters are collectively denoted as $\Theta = \{\theta_{ji}\}_{j,i \in \mathcal{H} \cup \mathcal{E}}$ and $\Pi = \{\pi_{ji}\}_{j,i \in \mathcal{H} \cup \mathcal{E}}$, respectively.


E.2.2 Inference in the DFBN

The goal of a DFBN is to search for the optimal structure $Z^\star$ that best explains the evidence, that is, the structure that maximizes the posterior marginal $P(Z \mid Y)$ [47]. From this point on, it will be understood that the probabilities are given the model parameters $\Theta$ and $\Pi$, but we will not write them explicitly. The posterior of the structure can be calculated from
\[
p(Z \mid Y) = \frac{\int p(X, Y, Z)\, dX}{\sum_Z \int p(X, Y, Z)\, dX}.
\]
The numerator can be calculated efficiently using the sum-product algorithm, but the denominator requires a summation over all possible structures $Z$, which can make the computation intractable when the number of nodes is large. There are approximation techniques and heuristic optimization techniques to avoid this computational problem, for instance simulated annealing and variational approximation, which will be discussed in detail subsequently. One way to avoid the calculation of the denominator is to realize that the denominator is effectively a constant depending only on the evidence $Y$, and to treat the maximum a posteriori (MAP) problem as an optimization problem in which we do not have to explicitly evaluate the denominator:
\[
Z^\star = \arg\max_Z P(Z \mid Y),
\]
which is equivalent to
\[
Z^\star = \arg\max_Z \{\log P(Y \mid Z) + \log P(Z)\}.
\]
The likelihood of a structure $Z$ given the evidence $Y$ can be calculated from
\[
P(Y \mid Z) = \int P(X, Y \mid Z)\, dX,
\]
which can be evaluated exactly and efficiently using the sum-product algorithm since the model is a tree.


In this section we adopt a relatively simple simulated annealing optimization framework to solve for the best structure. The details of this are found in Section E.4.

Once the optimal structure $Z^\star$ is determined, we can gather the information at the root node $x_r$ of each tree from its corresponding leaf nodes via probabilistic inference:
\[
P(x_r \mid Y, Z^\star) = \frac{P(x_r, Y \mid Z^\star)}{P(Y \mid Z^\star)} = \frac{\int P(X, Y \mid Z^\star)\, dX_{\mathcal{H} \setminus \{r\}}}{\int P(X, Y \mid Z^\star)\, dX},
\]
where $r$ is the index of a root node and $X_{\mathcal{H} \setminus \{r\}}$ denotes the set of all hidden variables except $x_r$. Note that the denominator here is the same as the structure likelihood above, and the numerator can be obtained one step before the denominator in the sum-product algorithm, as shown in Section E.4.3.1.

In this paper, we restrict the conditional probability between nodes to a linear multivariate Gaussian. For a child node $x_k$ connected to the parent $x_j$, the conditional distribution is defined as $p(x_k \mid x_j) = \mathcal{N}(x_k;\, W_{kj} x_j + \mu_{kj},\, \Lambda_{kj})$, where $W_{kj}$ is the loading matrix (weight matrix), $\mu_{kj}$ is the translation vector, and $\Lambda_{kj}$ is the precision matrix of the node $x_k$ with respect to the parent $x_j$. For more robust performance, the parameters sharing the same parent are tied to have the same value, i.e., $W_{kj} = W_j$, $\mu_{kj} = \mu_j$, and $\Lambda_{kj} = \Lambda_j$.
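As an illustration of this edge model, the sketch below evaluates the log-density of a child node given its parent under the linear multivariate Gaussian conditional (Python/NumPy; a minimal stand-alone function, not the implementation used in the experiments).

import numpy as np

def child_given_parent_logpdf(x_child, x_parent, W, mu, Lam):
    """log p(x_k | x_j) for the linear-Gaussian edge model
    p(x_k | x_j) = N(x_k; W x_j + mu, Lam^{-1}), with Lam a precision matrix."""
    d = x_child - (W @ x_parent + mu)          # residual w.r.t. the predicted mean
    D = d.size
    sign, logdet = np.linalg.slogdet(Lam)      # |Lam| > 0 for a valid precision
    return 0.5 * (logdet - D * np.log(2 * np.pi) - d @ Lam @ d)

Parameter tying amounts to calling this function with the same $(W_j, \mu_j, \Lambda_j)$ for every child of a given parent.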


E.2.3 DFBN for the Data Clustering and Fusion Algorithm

DFBNs have the ability to infer the information at each node in the form of a distribution using well-established statistical inference algorithms. Furthermore, the structure $Z$ of the DFBN can be deformed according to the evidence, indicating that model selection is intrinsically included in the DFBN. Therefore, the DFBN approach described here is appropriate for data clustering and fusion where the data have an underlying interpretable structure. In terms of the sensor fusion problem we propose in this paper, the MAP solution $Z^\star$ yields a non-redundant set of target root nodes connected to the most likely sensing platform nodes, which are connected to the most likely set of measurements originating from the particular sensing platforms. Hence, the most important part of translating a DFBN framework into the sensor fusion problem is to define the sensor-target relationships in terms of the structure variable $Z$ and the model parameters $\Theta$ and $\Pi$ that define the internodal relationships, so that the DFBN solutions are within the realm of possibility and, more importantly, are the most likely answer given a priori constraints.

The next section discusses the sensor fusion problem within the framework of DFBNs and details the steps necessary to optimize the structure given sensor measurements on the leaf nodes of each tree.

E.3 Sensor Fusion Framework Using DFBN

Using this framework, an algorithm was developed that takes the raw sensor measurements and sensor identification from an underwater sensing mission and automatically builds a tree for each unique target using a structural optimization algorithm combined with the sum-product algorithm. The algorithm outputs a forest, or group of trees, each having an inferred probability distribution of the summarized target measurement at the root node. Each node in the network is a random variable representing a feature. In this paper, we use 3 features: 1) x-y target location, 2) corresponding seabottom texture parameters, and 3) target shape parameters. The details of the feature extraction algorithm are discussed in a subsequent section. In this application, we also want to know the true, or summarized, location of an individual target, which can be obtained from the root node of each tree. Thus, the DFBN sensor fusion framework yields two valuable pieces of information: 1) a target location (and other measurements) in terms of a probability distribution function, and 2) a parent-child relationship that links the target to the sensing platforms and raw measurements that were used to gather the information about its position.

In the sensor fusion DFBN framework, there are multiple levels of nodes, and the number of nodes in each layer depends on the physical model of the problem. For example, in our scenario shown in Fig. E-3, we have 3 levels:


Figure E-5. Interpretation of DFBN structure. (a) The initial structure for the DFBN. (b) The optimal structure after structure optimization via simulated annealing. (c) The simplified structure, obtained when irrelevant nodes are removed.

1. Leaf level: Each node in the leaf level represents a feature vector extracted from a measurement. This information is combined in the upper level.

2. Intermediate level: Each node in the intermediate level represents the fused information collected from the level below. Furthermore, each node in this level provides the information about the platform or sensor taking the measurements, so this level is also called Sensor ID or Platform ID.

3. Root level: Each tree represents a single target; thus each root node represents the probability distribution of the fused information for each target.

Let us assume that the measurement in the diagram is location; then each root node of the tree will be the summarized location of each target gathered from all the measurements at its corresponding leaf nodes. In fact, this framework can be seen as two simultaneous processes:

1. Probabilistic clustering: Each structure $Z$ can be interpreted as the result of a clustering process because it implies the parent-child relationship between two nodes from adjacent levels. The number of trees in the final structure indicates the number of actual targets. Having a set of child nodes under the same parent (or grandparent) implies that they are from the same physical target. For example, in Fig. E-5, we can interpret that 10 measurements were taken, but after all the redundancy is reduced by the DFBN, there are only 3 individual targets. Each tree has only one root node, and it represents a single target. So, each tree and its root node can be viewed as a resulting cluster and its corresponding centroid, respectively.

2. Bayesian data fusion: Each root node serves as the centroid of a cluster, so it represents the fused information of the remaining members of the cluster, particularly its corresponding children at the leaf level. Unlike other clustering algorithms, in this framework the information is combined via hierarchical Bayesian probabilistic inference. Consequently, the fused information is in the form of a posterior probability, which provides both an estimate and its uncertainty.


E.3.1 Reformulating the DFBN for Multiple Features

To improve clustering performance we extract several independent features from the data. From this point on, we abuse notation by using $P(X, Y, Z \mid \Theta, \Pi)$ as the joint probability over all features $f \in \{1, \dots, F\}$:
\[
P(X, Y, Z \mid \Theta, \Pi) = P(Z \mid \Pi) \prod_{f=1}^{F} P(X^{(f)}, Y^{(f)} \mid Z, \Theta_f),
\]
where $X^{(f)}$, $Y^{(f)}$, and $\Theta_f$ are the hidden variables, observed variables, and edge parameters for feature $f$, respectively. The optimization problem can then be written as
\[
Z^\star = \arg\max_Z J(Z),
\]
where the objective function $J(Z)$ is defined as:
\[
J(Z) = \sum_f \log P(Y^{(f)} \mid Z, \Theta_f) + \log P(Z \mid \Pi).
\]
Furthermore, each feature contributes differently to the clustering, so we want to assign a scalar weight $w_f$ to each feature $f$. Finally, after modifying the objective function, we have our augmented objective function (Footnote 2: This new objective function may not be a valid probability, but it has physical meaning. For instance, in the 1-D Gaussian case we have $w_f \log \mathcal{N}(x_f; \mu_f, \sigma_f^2) \propto \log \mathcal{N}(x_f; \mu_f, \sigma_f^2 / w_f)$.):
\[
J(Z) = \sum_f w_f \log P(Y^{(f)} \mid Z, \Theta_f) + \log P(Z \mid \Pi).
\]
The first term can be calculated efficiently using the sum-product algorithm [106, 107]. The second term is the structure probability, into which we can incorporate other penalty terms, such as structure complexity or an initial condition. Note that the structure $Z$ is restricted to be the same for all the features.
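Given per-feature structure log-likelihoods from the sum-product algorithm, the augmented objective is a one-line computation; the sketch below (Python/NumPy; argument names are hypothetical) makes the weighting explicit.

import numpy as np

def objective(loglik_per_feature, weights, log_prior_Z):
    """Augmented objective J(Z) = sum_f w_f log P(Y^(f) | Z, Theta_f) + log P(Z | Pi).
    loglik_per_feature: (F,) per-feature structure log-likelihoods
    weights:            (F,) scalar weights w_f (e.g., from a training process)
    log_prior_Z:        log P(Z | Pi) for the candidate structure"""
    return float(np.dot(weights, loglik_per_feature) + log_prior_Z)

A candidate structure with the largest $J(Z)$ among those evaluated is then retained, as in Algorithm 4 below.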


Fig. E-6 is an illustration of the parallel DFBN structure; the three boxes from left to right are the three features: location, seabottom texture, and target shape. Note that more features can be added to the network as needed.

Figure E-6. Parallel DFBN sensor fusion structure. The objective function $J(Z)$ is a weighted sum over the DFBNs formed for 1) x-y target location, 2) corresponding seabottom texture parameters, and 3) target shape parameters. Ultimately, the Bayesian information criterion (BIC) is used as the objective function in order to penalize the structure complexity.

In the sensor fusion framework described here, we assume a uniform prior distribution on each root node. As mentioned earlier, the model parameters $\Theta$ parameterize linear multivariate Gaussian relationships between each parent-child pair. Each parameter has physical meaning. For instance, the precision matrices of the edges between the root and intermediate nodes for Feature 1 are defined by the navigational uncertainty of the sensing platform.

Given a set of measurements, the optimal graph structure is the one for which the objective function $J(Z)$ above is maximized. A general algorithm for finding the optimal graph structure is shown in Algorithm 4.


Algorithm 4: Optimal Graph Structure
 1: Input parameters: structure parameters $\Pi$, edge parameters (conditional probabilities) $\Theta$, and weights $w_1, w_2, w_3$ from the training process.
 2: Input variables: a set of $N$ possible graph structures and evidence on the leaf nodes.
 3: for $i = 1 : N$ do
 4:   Assign structure $Z_i$.
 5:   Calculate $P(Z_i \mid \Pi)$.
 6:   Calculate the structure likelihoods $P(Y^{(f)} \mid Z_i, \Theta_f)$ for features $f = 1, 2, 3$ using the sum-product algorithm.
 7:   Calculate $J(Z_i)$.
 8: end for
 9: The maximum value of $J(Z_i)$ over all $i$ gives the globally optimal solution. The optimal structure $Z^\star$ is the structure that maximizes the objective function.

E.4 DFBN Algorithm Implementation

This section describes the details of the DFBN implementation. First, the features and the feature extraction algorithm are explained. Then the graph structure initialization using vector quantization is introduced. Next, the efficient inference algorithm and the suggested choice of parameter settings for the objective function are addressed. At the end of this section, a stochastic-search optimization framework and a heuristic structure update strategy are presented.

E.4.1 Feature Extraction

Features extracted from raw measurements occupy the leaf nodes of DFBN1, DFBN2, and DFBN3 and are ultimately used to resolve the structure of the tree. In fact, the DFBN framework is not restricted to any specific set of features or to a fixed number of features; in this paper, we use the 3 fundamental features described as follows. In DFBN1, Feature 1, the geographic position of the target, is taken directly from the sensor measurement.


In DFBN2, Feature 2, three sonar image seabottom texture parameters (the correlation lengths in the x- and y-directions and the $K$-distribution shape parameter) are extracted via the algorithm described in [108]. In DFBN3, the target shape is represented as an ellipsoid, and the two radii are extracted as Feature 3 to describe the target shape characteristics. The 3 features extracted from the 27 samples are plotted separately in Fig. E-7. Among the three features, the scatter plot of Feature 1 appears to be the most separable. However, being very discriminative does not guarantee a correct clustering result; for instance, it can lead to over-segmenting the data. On the other hand, the other features may give information about which samples should be merged into the same cluster. Therefore, further information gathered from the other features is critically important for an accurate result.

Figure E-7. (Best viewed in color.) An overview of the 3 features extracted from 27 sensor measurements: (A) Feature 1, x-y location; (B) Feature 2, seabottom texture x- and y-direction correlation lengths; (C) Feature 3, target shape parameters. Samples with the same color come from the identical target; here we have 5 actual targets. The same sample shape indicates that measurements were obtained from the same sensor platform; in this case, there are 2 sensor platforms.

For DFBN1, each target location is assumed to be a multivariate Gaussian random variable with the mean centered about the measured geographic location and the covariance matrix determined by the navigational accuracy of the sensor. For DFBN2, the texture features are assumed to be drawn from a multivariate Gaussian distribution with the covariance matrix commensurate with the number of samples used to estimate the texture features [109].


For DFBN3, the target shape parameters are also assumed to be multivariate Gaussian distributed.

E.4.1.1 Seabottom texture extraction

The seabottom textures that surround a target of interest contain discriminatory information. These textures can be considered a context in which a given target occurs. Fig. E-8 depicts a cylindrical target with the same orientation and range from the sensor, placed in four different seabottom environments. While the extracted target features from the cylinder should be the same for the four images, features based on the texture of the seabottom should be different. Seabottom texture features are extracted in the target fusion scheme to provide another set of features with different information than that provided by the geographic location and target features in the sonar image snippet.

Figure E-8. Four SAS images of a cylinder in different seabottom environments.

Seabottom features are extracted from the sonar intensity image based on parameterized models of the autocorrelation function (ACF) of the image pixels [110, 111].


The normalized ACF of a sonar intensity image is defined, with spatial lags in the x- and y-directions denoted by $\tau_x$ and $\tau_y$ respectively, as [108]
\[
R_I(\tau_x, \tau_y) = \mu_I^2 \left( 1 + \big[R_h(\tau_x, \tau_y)\big]^2 + \frac{1}{\alpha} \sum_i \lambda_i\, l_{x_i} l_{y_i}\, |b_i|^{-\frac{1}{2}} \left[ e^{-\frac{1}{2} [\tau_x\ \tau_y]\, b_i^{-1} [\tau_x\ \tau_y]^{\top}} + \big[R_h(\tau_x, \tau_y)\big]^2 \right] \right),
\]
where $\mu_I$ is the mean of the intensity image, $R_h(\tau_x, \tau_y)$ is the ACF of the imaging point spread function imposed by the beamformer or image formation algorithm, $\alpha$ is the shape parameter of the single-point image statistics, $\lambda_i$ is the mixture parameter corresponding to the spatial correlation parameters $l_{x_i}$ and $l_{y_i}$, and $b_i$ is composed of a rotation matrix
\[
t_i = \begin{bmatrix} \cos \phi_{r_i} & \sin \phi_{r_i} \\ -\sin \phi_{r_i} & \cos \phi_{r_i} \end{bmatrix},
\]
a diagonal spatial correlation parameter matrix
\[
\Sigma_i = \begin{bmatrix} l_{x_i}^2 & 0 \\ 0 & l_{y_i}^2 \end{bmatrix},
\]
and a diagonal matrix of imaging point spread function parameters
\[
B = \begin{bmatrix} f_x^2 & 0 \\ 0 & f_y^2 \end{bmatrix},
\]
via the following equation:
\[
b_i = t_i\, (\Sigma_i + 2B)\, t_i^{\top}.
\]
The mixing parameter is constrained such that $\sum_i \lambda_i = 1$. In this application a single component, i.e., $\lambda_1 = 1$, is used in the ACF model for all features extracted.

The algorithm for estimating the parameters of the ACF model is explained in detail in [108].


We are interested primarily in three parameters to distinguish between different image seabottom textures: $l_x$, the spatial correlation length in the x spatial-lag direction; $l_y$, the spatial correlation length in the y spatial-lag direction; and $\alpha$, the shape parameter of the single-point $K$-distributed probability density function [112]. The estimated rotation angle $\phi_r$ in the rotation matrix $t$ also plays an important role in the feature extraction by reflecting the change in sensing angle between multiple looks at the same target. If the underlying texture is not angularly dependent on the sensor for the given seabottom area, the set of texture features will be rotation invariant. There are instances where this assumption will not hold, e.g., a sensor angle boresight to the troughs in a sand ripple field [113], but for many seabottom types it is valid.

For this application, prior to seabed texture parameter estimation we removed a patch of the sonar image snippet containing the target and replaced it with a patch of seabed extracted above the target region. This simple two-dimensional bootstrap step is intended to give better estimates of the shape parameter and the x and y correlation lengths in a seabed region by removing the highly correlated target and shadow pixels from the image snippet.

E.4.2 Graph Structure Initialization

In many situations, we can use prior knowledge about the targets' geographic locations to facilitate the algorithm, and the sensor platform ID is known for each measurement, so we can use a divide-and-conquer strategy to reduce the computation time. This section describes the DFBN initialization and the divide-and-conquer strategy.

E.4.2.1 Geographic data clustering via vector quantization

Unsupervised clustering via vector quantization, or more specifically Linde-Buzo-Gray vector quantization (LBG-VQ) [114, 115], is used to initially cluster the sensor data by geographic proximity. The geographic constraint is reasonable because it is not necessary to fuse targets that are too distant from one another, assuming reasonable accuracy in the sensor's navigation system. The output of the algorithm yields subgroups of sensor measurements that are in close proximity to one another, which reduces the number of possible structures of the DFBN.


LBG-VQ is an iterative algorithm that meets two criteria: 1) the nearest-neighbor condition, where each member of a cluster is assigned the centroid that is closest to that member, and 2) a distortion criterion, where the sum of the distances from all the points in a cluster to a valid centroid is less than some threshold value. A minimal sketch of this split-and-refine procedure appears at the end of this subsection.

A sample problem is now introduced to illustrate the use of the LBG-VQ clustering algorithm in the DFBN initialization. Referring to Fig. E-9, an initial map of targets is detected with sensors from two different sensing platforms; the targets have locations defined by x and y. In Step 1 of the algorithm, the LBG-VQ algorithm clusters targets in the x-y input space, with an output that assigns centroids (stars) to possible geographic centers of the targets. In Step 2, the targets are separated according to the assigned sensor platform ID. In this step, a DFBN forest is initialized with the centroids as the root nodes and the target locations as the leaf nodes. The leaf nodes are partitioned by centroid assignment. In Step 3, the centroids are linked to the corresponding leaf nodes to initialize the DFBN forest.

There are several alternatives for geographic data clustering, for instance Gaussian mean shift, hierarchical clustering, $k$-means, Gaussian mixture models, and spectral clustering. However, keep in mind that candidate algorithms must be computationally inexpensive, must not over-segment the data, and, most importantly, must not require a pre-defined number of clusters, as this pre-processing step is unsupervised. Of the algorithms mentioned above, the last three need cross-validation to find the number of clusters and thus are not practical for our application here. In addition, it is much simpler to pick a scalar threshold in LBG-VQ than to choose a kernel size matrix (with possibly many degrees of freedom) in the Gaussian mean shift algorithm. In light of these considerations, LBG-VQ was chosen for initial data clustering due to its simplicity and its ability to automatically determine cluster cardinality.
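The following is the promised minimal sketch of LBG-VQ-style clustering (Python/NumPy). It is a simplified split-and-refine variant written from the two criteria above, with a scalar distortion threshold; it is meant to convey the procedure, not to reproduce the exact variant of [114, 115].

import numpy as np

def lbg_vq(points, dist_thresh, max_iter=100, rng=None):
    """Split-and-refine vector quantization.
    points:      (N, 2) geographic locations
    dist_thresh: scalar distortion threshold per cluster
    Returns (centroids, assignments)."""
    rng = np.random.default_rng(rng)
    centroids = points.mean(axis=0, keepdims=True)   # start with one centroid
    for _ in range(max_iter):
        # nearest-neighbor condition: assign each point to its closest centroid
        d = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        assign = d.argmin(axis=1)
        members = [points[assign == k] for k in range(len(centroids))]
        centroids = np.array([m.mean(axis=0) if len(m) else centroids[k]
                              for k, m in enumerate(members)])
        # distortion criterion: split any cluster that is still too spread out
        spread = np.array([d[assign == k, k].mean() if len(m) else 0.0
                           for k, m in enumerate(members)])
        bad = spread > dist_thresh
        if not bad.any():
            return centroids, assign
        jitter = 1e-3 * rng.standard_normal((int(bad.sum()), points.shape[1]))
        centroids = np.vstack([centroids, centroids[bad] + jitter])
    return centroids, assign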


Figure E-9. LBG-VQ clustering to initialize the DFBN forest. Targets are first clustered by their locations, then connected into an initial tree based on centroid membership and sensing platform association.

E.4.2.2 Divide and conquer strategy

Since the sensor platform ID is known for each measurement, we can reduce the computational time of learning the structure by grouping the measurements and applying the data fusion framework to each group. Therefore, we can use only 2 layers at a time instead of 3. The summarized procedure is illustrated in Fig. E-10. First, we group measurements according to their sensor platform ID (pfm#) as in Fig. E-10(b); at this point we can bypass the platform ID level, so we have only 2 levels: 1) the measurement level and 2) the object/target level. Next, we apply our data fusion algorithm to each group separately and obtain a result which looks like Fig. E-10(c). However, there might be some duplicated targets across the platforms. In other words, an individual target might be sensed by several platforms, so we have to merge the results from the previous step by letting the root nodes from the previous step become leaf nodes and adding a third level, which will be the individual root.

PAGE 150

run again, and the final result is shown in Fig. E-10(d). The output in Fig. E-10(d) is then used as the initial input to the DFBN algorithm, greatly increasing the accuracy and speeding the computation of the final solution. A minimal sketch of this two-pass procedure is given below.

Figure E-10. The summarized diagram of the implementation where we use only 2 layers at a time. (a) The initial structure; (b) the features separated according to the sensor platform ID; (c) the sensor fusion algorithm applied to 2 layers to obtain an interim solution; (d) a third layer introduced in order to merge the targets in each platform ID; and (e) the simplified diagram with orphan nodes removed.
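The sketch below outlines the grouping-and-merging pass in Python. It is a scaffold only: fuse_two_layer stands in for one run of the 2-layer DFBN optimization and is assumed, not defined here; the dictionary-based measurement records are likewise illustrative.

```python
from collections import defaultdict

def divide_and_conquer(measurements, fuse_two_layer):
    """Two-pass fusion sketch: fuse measurements per platform, then fuse the
    per-platform root nodes to merge targets sensed by several platforms.
    `fuse_two_layer(items)` must return the fused root nodes for its inputs."""
    by_platform = defaultdict(list)
    for m in measurements:                  # m: dict with 'platform' and 'features'
        by_platform[m['platform']].append(m)
    interim_roots = []
    for platform, group in by_platform.items():
        interim_roots.extend(fuse_two_layer(group))   # per-platform pass, Fig. E-10(c)
    # The interim roots become the leaves of a final 2-layer pass, Fig. E-10(d).
    return fuse_two_layer(interim_roots)
```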
E.4.3 Probabilistic Inference for DFBN

This section describes the sum-product algorithm, an efficient inference algorithm using the factor graph representation, which provides exact inference for DFBN. Furthermore, the choice of prior probability of the structure Z and the penalty for structure complexity are described subsequently.

E.4.3.1 Sum-product algorithm for DFBN

Given a fixed structure Z, we use a factor graph [39, 116] representation and the sum-product algorithm to marginalize the hidden variables in the DFBN. The sum-product rule is a generalized case of Pearl's message passing algorithm and uses 2 types of nodes: 1) a variable node for each random variable and 2) a factor node for each local function, namely the conditional probability in this case. In the operation of the sum-product algorithm, for each tree, we first send the message from each evidential node upward to its corresponding factor node. Each of the factor nodes then sends
the message upward to its corresponding parent variable node. The two processes are performed alternately until the messages reach the root of the tree. Next, messages are sent from the root to the leaf nodes in a similar manner, but downward. Each message transmitted in the network represents local knowledge, or belief, that the nodes in the network share with one another in order to combine it into global knowledge for the network, namely the posterior marginal at each node. Once the process is completed, each edge in the network carries two types of messages, upward and downward, and the posterior marginal distribution p(x_j | Y) for a particular node x_j given the evidence Y can be obtained by multiplying all the incoming messages together.

The sum-product algorithm gives efficient and exact inference for DFBN because of its tree structure. In this case, the computational complexity is linear in the number of samples and polynomial in the feature dimensionality. The complexity will be discussed again in Section E.5. More detail on the sum-product algorithm can be found in [39]. A toy upward-downward pass on a small tree is sketched below.
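The sketch below runs the upward and downward passes on a tiny discrete tree. It is a toy, not the DFBN's linear-Gaussian model: the shared conditional probability table cpt, the evidence likelihoods lam, and the five-node topology are all invented for illustration.

```python
import numpy as np

parent = [-1, 0, 0, 1, 1]               # parent[j]; node 0 is the root
K = 2                                   # discrete states per node (toy)
prior = np.array([0.6, 0.4])            # p(x_root)
cpt = np.array([[0.9, 0.1],             # p(x_child | x_parent): rows = parent state
                [0.2, 0.8]])
lam = {2: np.array([0.5, 0.5]),         # evidence likelihoods at observed nodes
       3: np.array([1.0, 0.2]),
       4: np.array([0.3, 1.0])}

children = {j: [c for c, p in enumerate(parent) if p == j] for j in range(len(parent))}

def up(j):
    """Upward pass: evidence below j combined into a K-vector over x_j."""
    belief = lam.get(j, np.ones(K)).copy()
    for c in children[j]:
        belief *= cpt @ up(c)           # sum over x_c of p(x_c|x_j) * evidence below c
    return belief

def marginals():
    """Downward pass combined with upward messages: p(x_j | Y) for all j."""
    out = {}
    def down(j, pi):                    # pi: downward message over x_j
        ups = {c: cpt @ up(c) for c in children[j]}
        b = pi * lam.get(j, np.ones(K))
        for c in children[j]:
            b = b * ups[c]              # product of all incoming messages
        out[j] = b / b.sum()            # posterior marginal at node j
        for c in children[j]:
            to_c = pi * lam.get(j, np.ones(K))
            for s in children[j]:
                if s != c:              # exclude c's own upward contribution
                    to_c = to_c * ups[s]
            down(c, to_c @ cpt)         # push through p(x_c | x_j)
    down(0, prior)
    return out

print(marginals())
```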
E.4.3.2 Structure prior probability P(Z)

There are many ways to come up with the prior probability of the structure. In this data fusion application, geographical location is one strong feature that we can exploit to define the prior of the structure. A Gaussian kernel is used to determine the associations among targets:

\[ p(z_{ji}=1) = \frac{\exp\left(-\frac{1}{2\sigma^2}\,\|r_j - r_i\|_2^2\right)}{\sum_{k \in V_{pa(j)}} \exp\left(-\frac{1}{2\sigma^2}\,\|r_j - r_k\|_2^2\right)} \qquad \text{(E)} \]

where r_j denotes the location vector (x-y coordinate) of the node X_j, V_{pa(j)} is the index set of nodes in the same level as the parent of node X_j, and \sigma^2 is the variance, or kernel width, of the prior structure. A large \sigma^2 suppresses the effect of the affinity between two targets, where the probability p(z_{ji}=1) tends to the uniform distribution over the
parent candidates X_i. In this experiment, we use a uniform prior distribution for the structure, that is, p(z_{ji}=1) = 1/|pa(j)|, where |pa(j)| denotes the cardinality of the set of possible parents of j.

E.4.3.3 Penalties for structural complexity

A more complex structure tends to fit a broad range of models equally, so it is hard to distinguish the best model from the rest. Thus, it is usually the case that the structure complexity has to be regularized or penalized by some criterion so that the best model is sufficiently simple yet still able to explain the observations well. The Bayesian information criterion (BIC) is a criterion for model selection among a class of parametric models with different numbers of parameters [59]. In this case, the parameters of interest are the dimensionality of the data, the number of edges, and the number of trees in the graph structure Z. The objective function derived from the BIC is given by:

\[ \mathrm{BIC}(Z) = -2\log p(Y \mid Z) + K(Z)\,\log(\#\mathrm{sample}) \qquad \text{(E)} \]

where K(Z) denotes the complexity of structure Z, which is

\[ K_{2\text{-layer}}(Z) = \#\mathrm{sample} + \#\mathrm{root}\left(\frac{D^2}{2} + \frac{3D}{2}\right) \qquad \text{(E)} \]

for a 2-layer structure and

\[ K_{3\text{-layer}}(Z) = \#\mathrm{sample} + \#\mathrm{intermediate}\left(\frac{D^2}{2} + \frac{3D}{2} + 1\right) + \#\mathrm{root}\left(\frac{D^2}{2} + \frac{3D}{2}\right) \qquad \text{(E)} \]

for a 3-layer structure, respectively, where D is the feature dimensionality, #sample is the number of leaf nodes, #intermediate is the number of nodes in the middle layer, and #root is the number of root nodes. The objective function BIC(Z) is small when the structure Z gives high log-likelihood and has an uncomplicated structure. Thus the optimal structure Z* is obtained when BIC(Z) is minimized. The sketch below scores a structure using this prior and penalty.
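The two ingredients above, the Gaussian-kernel prior of Equation E and the 2-layer BIC penalty, are small enough to write out directly. In this sketch the log-likelihood argument is assumed to come from a sum-product pass, and the numeric values are illustrative only.

```python
import numpy as np

def structure_prior(r_j, parent_locs, sigma2, i):
    """Gaussian-kernel prior p(z_ji = 1): affinity of the child at r_j to the
    i-th parent candidate, normalized over all candidates (Equation E)."""
    aff = np.exp(-np.sum((parent_locs - r_j) ** 2, axis=1) / (2.0 * sigma2))
    return aff[i] / aff.sum()

def bic_two_layer(log_likelihood, n_sample, n_root, D):
    """BIC(Z) = -2 log p(Y|Z) + K(Z) log(#sample), with the 2-layer K(Z)."""
    K = n_sample + n_root * (D ** 2 / 2.0 + 3.0 * D / 2.0)
    return -2.0 * log_likelihood + K * np.log(n_sample)

# Illustrative values: a 17-sample, 2-root structure over 3 features.
print(bic_two_layer(log_likelihood=-120.4, n_sample=17, n_root=2, D=3))
```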
E.4.4 DFBN Optimization Scheme

This section first describes the exhaustive search algorithm, which allows all possible structures to be evaluated. Then simulated annealing, a stochastic-search optimization framework for DFBN that in general provides better run-time than an exhaustive search, is elaborated. Also, we propose a heuristic method that significantly speeds up the simulated annealing algorithm when there are a large number of samples.

E.4.4.1 Exhaustive search

It can be shown using Bell's number [117] that solving for the globally optimal structure of the DFBN using an exhaustive search algorithm requires a search over an extremely large number of possible linkage combinations, even for a relatively small number of input nodes, and is computationally infeasible. For example, with only 10 measurements from a single sensor, the possible number of 2-layer DFBN representations/structures is 115,975. By enforcing some a priori constraints on the problem, such as the geographic threshold mentioned earlier in this section for possible target combinations, and by initializing via unsupervised clustering, the number of possible DFBN solutions is reduced, and thus the search over the possible structures is less time-consuming. In this paper, we recommend using an exhaustive search strategy when the number of samples is less than 9; in the case of 9 or more, stochastic-search optimization, explained subsequently, is recommended. The growth of this count is easy to verify numerically, as in the sketch following Table E-1.

Table E-1. Number of possible structures of the 2-layer DFBN vs. number of samples. It can be shown that the number of possible structures of the 2-layer DFBN is given by Bell's number.

# of samples              1   2   3   4    5    6     7     8      9       10
# of possible structures  1   2   5   15   52   203   877   4140   21147   115975
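The sequence in Table E-1 can be reproduced with the Bell triangle recurrence; the short function below is a straightforward sketch of that computation.

```python
def bell_numbers(n):
    """Bell numbers B_1..B_n via the Bell triangle; B_k is the number of
    possible 2-layer DFBN structures for k samples (Table E-1)."""
    out, row = [], [1]
    for _ in range(n):
        out.append(row[-1])          # B_k is the last entry of triangle row k
        nxt = [row[-1]]              # the next row starts with that value
        for v in row:
            nxt.append(nxt[-1] + v)  # each entry adds the neighbor above
        row = nxt
    return out

print(bell_numbers(10))  # [1, 2, 5, 15, 52, 203, 877, 4140, 21147, 115975]
```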
E.4.4.2 DFBN optimization via simulated annealing

Simulated annealing is an alternative optimization scheme to exhaustive search, one that can drastically speed up the algorithm when there is a large number of samples.
After the DFBN forest is initialized by the LBG-VQ clustering algorithm, the optimum structure is calculated using simulated annealing. Simulated annealing (SA) is a stochastic search algorithm used to find maxima in large optimization problems, and it draws its name from the method of allowing magnetic systems to find low-energy structures through annealing [118, 119]. In the DFBN sensor fusion framework, parent-child connections are changed at random and accepted if the new structure either results in a higher log posterior or randomly exceeds a threshold governed by the decreasing temperature of the annealing process. The stochastic nature of SA frequently allows unfavorable structures, which have low -BIC(Z) values, in early iterations of the algorithm, in hopes of avoiding local maxima. At later iterations, the algorithm performs like a stochastic greedy search and only accepts new structures that increase the MAP value. The structure acceptance is governed by a temperature parameter that decreases as the number of iterations increases.

Following the outline of the DFBN optimization algorithm described in Section E.3, a summary of the DFBN sensor fusion is presented in Algorithm 5; an executable sketch of its accept-and-cool loop is given after that pseudocode. The algorithm requires as inputs the feature vectors extracted from the sensor measurements, the conditional probability distribution at each edge, an initial structure, an annealing schedule with an initial temperature, and a structure update procedure. The random structure approach is a simple methodology to update the structure in SA by randomly picking a child node and randomly changing its corresponding parent, as presented in Algorithm 6 and sketched immediately below it.

Algorithm 6 Random structure approach for graph structure update
1: procedure RandomStructure(structure Z)
2:   randomly pick a child node j in structure Z
3:   list all parent candidates A of the child node j, excluding its current parent
4:   randomly pick a parent node i from A
5:   remove the connection between j and its current parent
6:   make a connection between j and i in the new structure Z_new
7:   Return Z_new
8: end procedure
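A minimal sketch of Algorithm 6 follows. The representation is an assumption made for illustration: the structure Z is a dict mapping each child node to its parent, and the candidate parents are the nodes of the layer above.

```python
import random

def random_structure(parent, parent_layer):
    """Random structure update (Algorithm 6): pick a child at random and move
    it to a different parent candidate. `parent` maps child -> parent id;
    `parent_layer` lists all candidate parent nodes in the layer above."""
    new = dict(parent)
    j = random.choice(list(new))                           # step 2
    candidates = [i for i in parent_layer if i != new[j]]  # step 3
    new[j] = random.choice(candidates)                     # steps 4-6
    return new                                             # step 7
```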
Algorithm 5 Summary of the DFBN optimization algorithm
1: Determine the following user-defined parameters according to Table E-2:
     It_max: maximum number of iterations allowed for the whole process
     T_0: initial temperature
     α: temperature decay constant, 0 < α < 1
     T_limit: limiting temperature
     P_max: maximum number of proposed structure changes allowed before changing temperature
     A_max: maximum number of accepted structure changes allowed before changing temperature
2: Initialize parameters:
     reset the proposed structure changes counter P ← 0
     reset the accepted structure changes counter A ← 0
     initialize the temperature T ← T_0
3: Z ← initialize the parallel DFBN structure via the LBG-VQ procedure in Section E.4.2.1
4: set the best structure Z* ← Z
5: calculate -BIC(Z) using the sum-product algorithm
6: set E_a ← -BIC(Z)
7: set the best score E* ← E_a
8: repeat
9:   Z ← structure update algorithm, e.g., random structure as in Algorithm 6 or directed perturbation as in Algorithm 7
10:  recalculate -BIC(Z)
11:  set E_b ← -BIC(Z)
12:  if exp((E_b - E_a)/(kT)) ≥ Uniform_random[0, 1] then
13:    E_a ← E_b   (accept the new structure)
14:    A ← A + 1
15:  else if E_b ≥ E* then
16:    E* ← E_b
17:    Z* ← Z   (keep the best structure)
18:  end if
19:  P ← P + 1
20:  if A > A_max or P > P_max then
21:    T ← αT   (decrease the temperature)
22:    A ← 0   (reset the counter)
23:    P ← 0   (reset the counter)
24:  end if
25: until T < T_limit or It > It_max
26: Return Z*   (the optimal structure)
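The sketch below condenses the accept-and-cool loop of Algorithm 5 into Python. It assumes neg_bic(Z) returns -BIC(Z) from a sum-product pass and update(Z) is one of the structure update procedures; unlike the pseudocode, it tracks the best structure on every proposal rather than only on rejected moves. Defaults follow Table E-2.

```python
import math
import random

def sa_search(z0, neg_bic, update, T0=10.0, T_limit=0.1, alpha=0.9, k=1.0,
              A_max=10, P_max=20, it_max=1000):
    """Simulated-annealing skeleton of Algorithm 5. Higher neg_bic is better."""
    z, e_a = z0, neg_bic(z0)
    z_best, e_best = z0, e_a
    T, A, P, it = T0, 0, 0, 0
    while T >= T_limit and it < it_max:
        it += 1
        z_new = update(z)                       # propose a structure change
        e_b = neg_bic(z_new)
        # Accept uphill moves always; downhill moves with prob exp(dE / kT).
        if math.exp(min(0.0, (e_b - e_a) / (k * T))) >= random.random():
            z, e_a, A = z_new, e_b, A + 1
        if e_b >= e_best:                       # track the best structure seen
            z_best, e_best = z_new, e_b
        P += 1
        if A > A_max or P > P_max:
            T, A, P = alpha * T, 0, 0           # cool down, reset counters
    return z_best
```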
E.4.4.3 Structure update using directed perturbation

The DFBN can be accelerated significantly when a series of appropriate structures is proposed to SA. Inspired by Gibbs sampling, we propose an intuitive stochastic methodology, called directed perturbation, to update the structure in the SA optimization scheme. First, a random child node is selected; then the affinity measure is computed between the child node and each of the parent candidates in the upper level. A multinomial distribution is then formed to represent the probabilities that the child node would connect to each of the parent candidates. The successful parent is sampled from this distribution, which is updated every time a structural change has been made. This heuristic approach enables faster convergence than the random structure update. Empirically, it is sufficient to use 100 iterations in SA with directed perturbation, as opposed to 1000 iterations in SA with the random structure update, to get a good result in a scenario with about 20 samples. The algorithm is summarized in Algorithm 7, with a short sketch afterward.

Algorithm 7 Directed perturbation for graph structure update
1: procedure DirectedPerturbation(structure Z)
2:   randomly pick a child node j in structure Z
3:   list all parent candidates A of the child node j, excluding its current parent
4:   for i ∈ A do
5:     calculate the likelihood A_j(i) that the child j and the parent i are connected, using Equation E
6:   end for
7:   [optional] re-normalize A_j(i) over i to ensure a multinomial distribution
8:   sample the successful parent i from A_j(i)
9:   remove the connection between j and its current parent
10:  make a connection between j and i in the new structure Z_new
11:  Return Z_new
12: end procedure
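The sketch below mirrors Algorithm 7 under the same dict-based structure representation assumed in the earlier random-update sketch; locs is an assumed map from node id to x-y location, and the kernel width sigma2 plays the role of \sigma^2 in Equation E.

```python
import numpy as np

def directed_perturbation(parent, locs, parent_layer, sigma2=1.0, rng=None):
    """Directed perturbation (Algorithm 7): move a random child to a parent
    sampled in proportion to the Gaussian-kernel affinities of Equation E,
    rather than uniformly as in the random structure update."""
    rng = rng or np.random.default_rng()
    new = dict(parent)
    j = rng.choice(list(new))                                   # step 2
    cands = [i for i in parent_layer if i != new[j]]            # step 3
    aff = np.array([np.exp(-np.sum((locs[j] - locs[i]) ** 2) / (2 * sigma2))
                    for i in cands])                            # steps 4-6
    probs = aff / aff.sum()                                     # step 7
    new[j] = cands[rng.choice(len(cands), p=probs)]             # steps 8-10
    return new                                                  # step 11
```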
E.5 Experiments and Discussion

The DFBN algorithm's ability to automate the sensor fusion task was demonstrated on a side-looking sonar dataset comprising 27 different sonar images, with associated location measurements, of 5 targets sensed multiple times via different
sensor tracks. The objective of the experiment is to correctly categorize the 27 images into the 5 target classes. The sonar images were created by a high-resolution sonar stripmap imaging system integrated onto an unmanned underwater vehicle. The multiple images of each target were the result of overlapping sensing tracks where the target of interest was placed in the sensor field of view at various orientations and ranges.

Given the sonar image dataset described in the previous paragraph, the parameters needed to initialize the DFBN are: 1) the structure prior, 2) the contribution weight for each feature, and 3) the linear Gaussian parameters. In these experiments, all the parameters are learned from a training dataset of 15 images from the survey dataset, where the ground truth labels for all images are already known. For each experiment, the parallel DFBN structures were initialized using the worst-case pillar structure, and the structure prior probability is assumed uniformly distributed, i.e., p(Z_i) = p(Z), thus simplifying the calculation of the objective function. The weights for the 3 features are equal, i.e., each weight is 1/3, based on results from a previously simulated scenario used as a training example. For the linear Gaussian parameters we assume an identity weight matrix W_j^{(f)} = I, zero translation μ_j^{(f)} = 0, and a full precision matrix Λ_j^{(f)} for every node and every feature (a sketch of this conditional model is given below). The precision matrices can be estimated from the sonar images in the training process. The sonar image texture features of x correlation length l_x and y correlation length l_y were estimated from the sonar images using the procedure described in [111].

Since we have geographical clustering as a preprocess, the number of samples within a group is reduced significantly, as shown in Fig. E-11. The computational cost of DFBN using the exhaustive search, denoted by DFBN+All, depends on the number of possible structures created from the nodes within a group. Table E-1 shows the number of possible structures of a 2-layer DFBN as the number of nodes increases. In the case where the number of samples is less than 9, we can perform an exhaustive search efficiently over the structure domain. When the number of samples is greater than or equal to 9, SA can perform faster and provide competitively good results.
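For concreteness, the assumed linear-Gaussian edge density p(x_j | x_i) = N(x_j ; W x_i + μ, Λ^{-1}) with the experiment's choices W = I and μ = 0 can be evaluated as below; the covariance used to build Λ here is illustrative only.

```python
import numpy as np

def linear_gaussian_logpdf(x_child, x_parent, W, mu, Lam):
    """Log density of p(x_j | x_i) = N(x_j ; W x_i + mu, Lam^{-1}),
    with Lam a full precision matrix as in the experiments."""
    d = x_child - (W @ x_parent + mu)
    _, logdet = np.linalg.slogdet(Lam)
    return 0.5 * (logdet - len(d) * np.log(2 * np.pi) - d @ Lam @ d)

D = 2
W, mu = np.eye(D), np.zeros(D)                       # identity weights, zero translation
Lam = np.linalg.inv(np.array([[3.9, 1.2],            # illustrative covariance only
                              [1.2, 1.3]]))
print(linear_gaussian_logpdf(np.array([1.0, 0.5]), np.zeros(D), W, mu, Lam))
```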
Table E-2. Parameters used in SA optimization. The second and third columns give the parameters used in the DFBN with the random structure update approach (DFBN+Rand) and the directed perturbation approach (DFBN+DP), respectively.

Parameter   DFBN+Rand   DFBN+DP
T_0         10          10
T_limit     0.1         0.1
α           0.9         0.9
k           1           1
A_max       10          10
P_max       20          20
It_max      1000        100

In addition to DFBN+All, we experiment with the two types of structure update algorithms implemented for SA: 1) random structure update and 2) directed perturbation, denoted DFBN+Rand and DFBN+DP, respectively. The parameters for the SA optimization algorithm of both approaches are given in Table E-2.

In order to compare the performance of DFBN, we use Gaussian mean shift (GMS) and spectral clustering [120] as baseline methods, since they have been adopted in many data clustering applications. Like DFBN, GMS requires only the covariance matrix of each feature as an input and, more importantly, it does not require any information about the number of clusters. Spectral clustering, on the other hand, does require such information; therefore, we use [121], an information-theoretic procedure, to determine the optimal number of clusters, which is less convenient than DFBN, where the number of clusters is determined automatically in the process. The optimal DFBN graph structures obtained from the data in Fig. E-7 are illustrated in Fig. E-12. The empirical results obtained using DFBN+All, DFBN+Rand, and DFBN+DP are identical, and hence are depicted by the single result in Fig. E-11(c); this provides some important information. First, SA converges to the globally optimal solution within the given number of iterations. Second, the directed perturbation strategy can supply appropriate structures to SA, which in turn permits the DFBN to converge to the globally optimal solution in a shorter time. The clustering results of all the approaches mentioned earlier are shown in Fig. E-11. Each tree structure in Fig. E-12 can be interpreted as a clustering result, and each root node distribution captures the estimated location of the actual target, depicted as red dots in Fig. E-11(c).

Figure E-11. Comparing the clustering results (feature #1 only). (A) Gaussian mean shift (GMS); (B) spectral clustering; (C) DFBN (+All, +Rand, +DP). The dashed boxes represent groups of x-y locations separated by geographic data clustering. Shaded ellipses represent ground truth clusters; the feature vectors (points) under the same shaded ellipse are from the same actual target. The unfilled closed contours represent the resulting clusters obtained from each method.

The estimated locations and
uncertainties obtained from DFBN (as described in Section E.2.2) are listed in Table E-3. The run-times are compared in Table E-4. All the experiments were run in MATLAB R2010a on an Intel Core 2 Duo CPU E8400 @ 3.00 GHz machine with the Linux Ubuntu operating system.

Figure E-12. The best graph structure obtained at the final stage of DFBN. From left to right: (A) Group 1, (B) Group 2, and (C) Group 3. The graph structures are interpreted as the clusters in Fig. E-11.

Table E-3. Estimated relative x-y locations and uncertainty using DFBN for each detected target, according to the results shown in Fig. E-11.

detected target #   group #   root node #   estimated location (x, y)   locational uncertainties
1                   1         1             (38.99, 3.84)               [3.95 1.25; 1.25 1.26]
2                   2         1             (0, 0)                      [2.51 0.77; 0.77 0.78]
3                   2         2             (16.00, 2.24)               [3.81 1.18; 1.18 1.19]
4                   3         1             (67.67, 57.16)              [3.95 1.25; 1.25 1.26]
5                   3         2             (80.82, 72.25)              [3.91 1.23; 1.23 1.24]

Table E-4. Computational time for GMS, spectral clustering (SPC), and DFBN with exhaustive search, random structure update, and directed perturbation. Note that when the number of samples per group is less than 9, only DFBN+All is implemented, due to its efficiency; hence "n/a" (not available) appears in some entries of the table.

group #   # samples   GMS     SPC       DFBN+All   +Rand    +DP
1         3           1 sec   0.5 sec   4 sec      n/a      n/a
2         17          1 sec   0.6 sec   10 hr      20 min   3 min
3         7           1 sec   0.5 sec   18 sec     n/a      n/a

The empirical results show that DFBN outperforms GMS given the same parameter information, i.e., the covariance matrices in the Gaussian kernels. In this case, the samples are sparse, and GMS tends to overestimate the number of clusters
despite using the parameters learned from the training data. Similar drawbacks are also found in spectral clustering, where the optimal kernel size and the optimal number of clusters are learned from the training data and an information-theoretic approach.

On the other hand, DFBN performs better because it treats a cluster as a structure, which generally contains more information than the notion of membership in traditional clustering frameworks. Structures within the DFBN contain information that is smoothly combined both locally (bottom-up) and globally (top-down) via the sum-product algorithm. Moreover, the Bayesian framework helps stabilize DFBN in the case of sparse data, whereas the other methods have no such mechanism; therefore, DFBN does not require a large number of samples.

The deformable structure of DFBN not only adjusts itself to fit the given data, but also searches clustering configurations in graph-structure space, which cannot be done in traditional distribution-based clustering algorithms. This behavior is similar to model selection.

Although the linear multivariate Gaussian model is used in this paper, there is no limitation on the conditional probability model used in DFBN in general. For instance, the linear Gaussian DFBN would over-segment clusters whose actual uncertainties follow a Gaussian mixture model, so it might be more appropriate to model the conditional probability distribution of the DFBN with a Gaussian mixture model instead. This flexibility makes DFBN versatile across a broad array of applications.
Although DFBN outperforms both GMS and spectral clustering, its run-time is sensitive to the choice of structure update strategy, and an inappropriate choice can result in a poor clustering result or inefficient run-time. The cost of inference for a structure using the sum-product algorithm is O(|E| D^2) (recall that |E| and D denote the number of samples in the data and the dimensionality of the features, respectively). The exhaustive search depends on the number of possible structures, bounded by O([|E| / log(|E|+1)]^{|E|}) [122]. When the number of samples is small, DFBN+All, whose complexity is O(|E| D^2 [|E| / log(|E|+1)]^{|E|}), can still be carried out; however, that might not be the case when the number of samples is large. The first effort to overcome the problem is DFBN+Rand, whose complexity is O(|E| D^2 · It_max^{(Rand)}), where It_max^{(Rand)} denotes the maximum number of iterations for SA using the random structure update; this yields the better run-time in group 2 shown in Table E-4. However, a shortcoming of DFBN+Rand is the randomness of the proposed structures, which requires a large number of iterations to converge to a good solution. Such a hindrance is overcome by exerting more control over the proposed structures, resulting in the DFBN+DP approach described in the previous section. DFBN+DP, whose complexity is O(|E| D^2 · It_max^{(DP)}), where It_max^{(DP)} denotes the maximum number of iterations for SA using directed perturbation, mitigates the problem by providing appropriate structures to SA and thus in general requires a significantly smaller number of iterations to converge, i.e., It_max^{(DP)} << It_max^{(Rand)}, as the snippet below illustrates. There are many more possible strategies to accelerate DFBN, for instance, greedy structure update, Gibbs sampling, and variational approximate inference over possible structures. These issues remain open for future research.
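The snippet below plugs the experiment's group 2 (17 samples, 3 features) and the Table E-2 iteration budgets into the complexity expressions above. These are unit-less operation counts meant only to show the asymptotic gap, not measured run-times.

```python
import math

def cost_all(n, D):
    """Exhaustive search: O(|E| D^2 [|E| / log(|E|+1)]^{|E|}) evaluations."""
    return n * D**2 * (n / math.log(n + 1)) ** n

def cost_sa(n, D, it_max):
    """SA variants: O(|E| D^2) per evaluation, times the iteration budget."""
    return n * D**2 * it_max

n, D = 17, 3
print(f"DFBN+All : {cost_all(n, D):.3g}")       # astronomically large
print(f"DFBN+Rand: {cost_sa(n, D, 1000):.3g}")  # It_max values from Table E-2
print(f"DFBN+DP  : {cost_sa(n, D, 100):.3g}")
```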
E.6 Summary

In many survey-based sensing tasks, multiple measurements are taken of the same target of interest. At the end of these surveys, the redundancy in the information must be removed to declutter data products and information displays for the end-user. The DFBN architecture yields a hierarchical analytical framework to automate the
task of combining similar targets that have discriminating measurements. A sonar survey of 5 targets sensed 27 times from a platform traversing a target field was used to illustrate the performance of the approach. Measured geographic locations and features extracted from both the target and the corresponding sea bottom texture were used to fuse the redundantly sensed targets via optimization of the DFBN structure, or nodal linkage. The DFBN framework was compared against the GMS and spectral clustering algorithms and demonstrated significant improvement over these algorithms. The resulting tree structure obtained from DFBN is a rich representation of clusters that shows more detail of the interrelationships among the data samples than traditional clustering methods such as k-means and GMM. Data availability for our problem is severely limited, hence the small training/test set and conditions presented in this work. We are building a synthetic environment to increase the data testing capabilities and plan to show more detailed results as part of future work.

The DFBN's run-time is sensitive to the choice of structure update strategy, and an improper choice of strategy can result in an undesired clustering result or inefficient run-time. Several update strategies were proposed that worked well in the various data scenarios. DFBN+All, which evaluates all possible structures, is suitable when the number of samples is small. DFBN+Rand, a simulated annealing optimization regime with a random structure update strategy that encourages globally optimal solutions, is suited to larger optimization problems. A third strategy, DFBN+DP, uses directed perturbation to greatly improve the run-time over DFBN+Rand while attaining the desired solution. With a computational cost of O(|E| D^2 · It_max), DFBN+DP has potential use with larger datasets and is an avenue of future research.
REFERENCES

[1] B. Schachter, A. Lev, S. Zucker, and A. Rosenfeld, "An application of relaxation to edge reinforcement," IEEE Transactions on Systems, Man, and Cybernetics, vol. 7, pp. 813, 1977.

[2] A. Hanson and E. Riseman, "Segmentation of natural scenes," Computer Vision Systems, pp. 129, 1978.

[3] S. Geman and D. Geman, "Stochastic relaxation, Gibbs distributions, and the Bayesian restoration of images," IEEE Trans. Pattern Anal. Machine Intell., vol. 6, no. 6, pp. 721, 1984.

[4] G. Finlayson, S. Hordley, C. Lu, and M. Drew, "On the removal of shadows from images," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 59, 2006.

[5] N. Sebe and M. Lew, "Comparing salient point detectors," Pattern Recognition Letters, vol. 24, no. 1-3, pp. 89, 2003.

[6] D. Lowe, "Distinctive image features from scale-invariant keypoints," International Journal of Computer Vision, vol. 60, no. 2, pp. 91, 2004.

[7] H. Bay, T. Tuytelaars, and L. Van Gool, "SURF: Speeded up robust features," Computer Vision - ECCV 2006, pp. 404, 2006.

[8] N. Dalal and B. Triggs, "Histograms of oriented gradients for human detection," in Computer Vision and Pattern Recognition, 2005. CVPR 2005. IEEE Computer Society Conference on, vol. 1, 2005, pp. 886.

[9] J. Shi and J. Malik, "Normalized cuts and image segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no. 8, pp. 888, Aug. 2000.

[10] P. Felzenszwalb and D. Huttenlocher, "Efficient graph-based image segmentation," International Journal of Computer Vision, vol. 59, no. 2, pp. 167, 2004.

[11] ——, "Efficient graph-based image segmentation," Int. J. Comput. Vision, vol. 59, pp. 167, September 2004.

[12] A. Levinshtein, A. Stere, K. Kutulakos, D. Fleet, S. Dickinson, and K. Siddiqi, "TurboPixels: Fast superpixels using geometric flows," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 2290, 2009.

[13] P. Arbelaez, M. Maire, C. Fowlkes, and J. Malik, "Contour detection and hierarchical image segmentation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 33, pp. 898, 2011.
[14] A. Torralba, K. Murphy, W. Freeman, and M. Rubin, "Context-based vision system for place and object recognition," 2003.

[15] L. Li and L. Fei-Fei, "What, where and who? Classifying events by scene and object recognition," in Computer Vision, 2007. ICCV 2007. IEEE 11th International Conference on. IEEE, 2007, pp. 1.

[16] L. Li, R. Socher, and L. Fei-Fei, "Towards total scene understanding: Classification, annotation and segmentation in an unsupervised framework," 2009.

[17] G. Mori, X. Ren, A. Efros, and J. Malik, "Recovering human body configurations: Combining segmentation and recognition," IEEE Computer Society Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 326, 2004.

[18] S. Todorovic and M. Nechyba, "Dynamic trees for unsupervised segmentation and matching of image regions," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 1762, 2005.

[19] P. Felzenszwalb and D. Huttenlocher, "Pictorial structures for object recognition," International Journal of Computer Vision, vol. 61, no. 1, pp. 55, 2005.

[20] P. Dollar, V. Rabaud, G. Cottrell, and S. Belongie, "Behavior recognition via sparse spatio-temporal features," in 2005 IEEE International Workshop on Visual Surveillance and Performance Evaluation of Tracking and Surveillance. IEEE, 2005, pp. 65.

[21] I. Laptev, M. Marszalek, C. Schmid, and B. Rozenfeld, "Learning realistic human actions from movies," in Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on. IEEE, 2008, pp. 1.

[22] M. Figueiredo and A. Jain, "Unsupervised learning of finite mixture models," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 24, no. 3, pp. 381, 2002.

[23] J. Coughlan and A. Yuille, "Algorithms from statistical physics for generative models of images," Image and Vision Computing, vol. 21, no. 1, pp. 29, 2003.

[24] J. Goldberger and S. Roweis, "Hierarchical clustering of a mixture model," in NIPS. MIT Press, 2005, pp. 505.

[25] S. Kumar and M. Hebert, "Discriminative random fields: A discriminative framework for contextual interaction in classification," Computer Vision, IEEE International Conference on, vol. 2, p. 1150, 2003.

[26] ——, "Discriminative random fields," International Journal of Computer Vision, vol. 68, no. 2, pp. 179, 2006.
[27] X. He, R. Zemel, and D. Ray, "Learning and incorporating top-down cues in image segmentation," Computer Vision - ECCV 2006, pp. 338, 2006.

[28] D. Batra, R. Sukthankar, and T. Chen, "Learning class-specific affinities for image labelling," Computer Vision and Pattern Recognition, IEEE Computer Society Conference on, vol. 0, pp. 1, 2008.

[29] J. Shotton, J. Winn, C. Rother, and A. Criminisi, "TextonBoost for image understanding: Multi-class object recognition and segmentation by jointly modeling texture, layout, and context," International Journal of Computer Vision, vol. 81, no. 1, pp. 2, 2009.

[30] C. Bouman and M. Shapiro, "A multiscale random field model for Bayesian image segmentation," Image Processing, IEEE Transactions on, vol. 3, no. 2, pp. 162, Mar. 1994.

[31] W. Irving, P. Fieguth, and A. Willsky, "An overlapping tree approach to multiscale stochastic modeling and estimation," Image Processing, IEEE Transactions on, vol. 6, no. 11, pp. 1517, 1997.

[32] M. Crouse, R. Nowak, and R. Baraniuk, "Wavelet-based statistical signal processing using hidden Markov models," Signal Processing, IEEE Transactions on, vol. 46, no. 4, pp. 886, 1998.

[33] N. Adams, A. Storkey, Z. Ghahramani, and C. Williams, "MFDTs: Mean field dynamic trees," in Proc. 15th Int'l Conf. Pattern Recognition, vol. 3, Sep. 2000, pp. 147.

[34] X. Feng, C. Williams, and S. Felderhof, "Combining belief networks and neural networks for scene segmentation," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 24, no. 4, pp. 467, 2002.

[35] N. Adams and C. Williams, "Dynamic trees for image modelling," Image and Vision Computing, vol. 21, no. 10, pp. 865, 2003.

[36] A. Storkey and C. Williams, "Image modeling with position-encoding dynamic trees," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 859, 2003.

[37] X. He, R. Zemel, and M. Carreira-Perpiñán, "Multiscale conditional random fields for image labeling," in Computer Vision and Pattern Recognition, 2004. CVPR 2004. Proceedings of the 2004 IEEE Computer Society Conference on, vol. 2. IEEE, 2004, pp. II.

[38] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Mateo, CA, USA: Morgan Kaufmann, September 1997.
[39] F. R. Kschischang, B. J. Frey, and H. A. Loeliger, "Factor graphs and the sum-product algorithm," IEEE Transactions on Information Theory, vol. 47, pp. 498, 2001.

[40] H. Cheng and C. Bouman, "Multiscale Bayesian segmentation using a trainable context model," Image Processing, IEEE Transactions on, vol. 10, no. 4, pp. 511, 2001.

[41] S. Kumar and M. Hebert, "Man-made structure detection in natural images using a causal multiscale random field," 2003.

[42] P. Awasthi, A. Gagrani, and B. Ravindran, "Image modeling using tree structured conditional random fields," in Proceedings of the 20th International Joint Conference on Artificial Intelligence (IJCAI 2007), 2007, pp. 2060.

[43] C. D'Elia, G. Poggi, and G. Scarpa, "A tree-structured Markov random field model for Bayesian image segmentation," Image Processing, IEEE Transactions on, vol. 12, no. 10, pp. 1259, 2003.

[44] S. Lauritzen, Graphical Models. Oxford University Press, USA, 1996.

[45] B. Frey and D. MacKay, "A revolution: Belief propagation in graphs with cycles," in Advances in Neural Information Processing Systems 10: Proceedings of the 1997 Conference. The MIT Press, 1998, p. 479.

[46] Y. Weiss, "Correctness of local probability propagation in graphical models with loops," Neural Comput., vol. 12, no. 1, pp. 1, 2000.

[47] N. Adams and C. Williams, "Dynamic trees for image modeling," Image and Vision Computing, vol. 21, pp. 865, 2003.

[48] M. Jordan, Learning in Graphical Models (Adaptive Computation and Machine Learning). Cambridge, MA: MIT Press, 1999.

[49] M. I. Jordan, "Graphical models," vol. 19, no. 1, pp. 140, 2004.

[50] C. Bishop, Pattern Recognition and Machine Learning. Springer, 2007, vol. 4.

[51] Z. Ghahramani and M. Jordan, "Factorial hidden Markov models," Machine Learning, vol. 29, no. 2, pp. 245, 1997.

[52] T. Jaakkola, "Tutorial on variational approximation methods," in Advanced Mean Field Methods: Theory and Practice. MIT Press, 2000, pp. 129.

[53] X. Ren and J. Malik, "Learning a classification model for segmentation," IEEE International Conference on Computer Vision, vol. 1, p. 10, 2003.

[54] A. Dempster, N. Laird, and D. Rubin, "Maximum likelihood from incomplete data via the EM algorithm," Journal of the Royal Statistical Society. Series B (Methodological), pp. 1, 1977.
[55] J. Bilmes, "A gentle tutorial of the EM algorithm and its application to parameter estimation for Gaussian mixture and hidden Markov models," International Computer Science Institute, vol. 4, p. 126, 1998.

[56] S. Rao, D. Erdogmus, D. Xu, and K. Hild, "Self-organizing ITL principles for unsupervised learning," Information Theoretic Learning, pp. 299, 2010.

[57] A. Yang, J. Wright, Y. Ma, and S. Sastry, "Unsupervised segmentation of natural images via lossy data compression," Computer Vision and Image Understanding, vol. 110, no. 2, pp. 212, 2008.

[58] H. Attias, "A variational Bayesian framework for graphical models," in Advances in Neural Information Processing Systems 12. MIT Press, 2000, pp. 209.

[59] G. Schwarz, "Estimating the dimension of a model," The Annals of Statistics, vol. 6, no. 2, pp. 461, 1978.

[60] N. Adams, A. Storkey, Z. Ghahramani, and C. Williams, "MFDTs: Mean field dynamic trees," vol. 3, pp. 147, 2000.

[61] G. Mori, "Guiding model search using segmentation," in Proceedings of the Tenth IEEE International Conference on Computer Vision - Volume 2. IEEE Computer Society, 2005, pp. 1417.

[62] X. Ren, C. Fowlkes, and J. Malik, "Scale-invariant contour completion using conditional random fields," in Proc. 10th Int'l. Conf. Computer Vision, vol. 2, 2005, pp. 1214.

[63] P. Arbelaez, "Boundary extraction in natural images using ultrametric contour maps," in Computer Vision and Pattern Recognition Workshop, 2006. CVPRW '06. Conference on. IEEE, 2006, pp. 182.

[64] H. Mobahi, S. Rao, A. Yang, S. Sastry, and Y. Ma, "Segmentation of natural images by texture and boundary compression," International Journal of Computer Vision, pp. 1, 2010.

[65] D. Martin, C. Fowlkes, and J. Malik, "Learning to detect natural image boundaries using local brightness, color, and texture cues," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 26, no. 5, pp. 530, May 2004.

[66] C. Pantofaru and M. Hebert, "A comparison of image segmentation algorithms," The Robotics Institute, Carnegie Mellon University, Tech. Rep. CMU-RI-TR-05-40, 2005.

[67] M. Meila, "Comparing clusterings: An axiomatic view," in Proceedings of the 22nd International Conference on Machine Learning. ACM, 2005, pp. 577.
[68] D. Martin, C. Fowlkes, D. Tal, and J. Malik, "A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics," 2001.

[69] M. Varma and A. Zisserman, "Classifying images of materials: Achieving viewpoint and illumination independence," Computer Vision - ECCV 2002, pp. 255, 2002.

[70] T. Leung and J. Malik, "Representing and recognizing the visual appearance of materials using three-dimensional textons," International Journal of Computer Vision, vol. 43, no. 1, pp. 29, 2001.

[71] O. Cula and K. Dana, "Compact representation of bidirectional texture functions," in Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference on, vol. 1. IEEE, 2001, pp. I.

[72] C. Schmid, "Constructing models for content-based image retrieval," in Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference on, vol. 2. IEEE, 2001, pp. II.

[73] D. Comaniciu and P. Meer, "Mean shift: A robust approach toward feature space analysis," IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 603, 2002.

[74] J. Shi and J. Malik, "Normalized cuts and image segmentation," Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 22, no. 8, pp. 888, Aug. 2000.

[75] C. Williams and N. Adams, "DTs: Dynamic trees," in Proc. Advances in Neural Information Processing Systems, M. Kearns, S. Solla, and D. Cohn, Eds., vol. 11, 1999.

[76] C. I. Chow and C. N. Liu, "Approximating discrete probability distributions with dependence trees," IEEE Transactions on Information Theory, vol. 14, pp. 462, 1968.

[77] G. Cooper and E. Herskovits, "A Bayesian method for the induction of probabilistic networks from data," vol. 9, no. 4. Springer, 1992, pp. 309.

[78] D. Heckerman and D. Chickering, "Learning Bayesian networks: The combination of knowledge and statistical data," in Machine Learning, 1995, pp. 20.

[79] M. Meila and M. Jordan, "Learning with mixtures of trees," Journal of Machine Learning Research, 2000.

[80] Y. Weiss, "Correctness of local probability propagation in graphical models with loops," Neural Comput., vol. 12, no. 1, pp. 1, 2000.
[81] I. Cox, "A review of statistical data association techniques for motion correspondence," International Journal of Computer Vision, vol. 10, no. 1, pp. 53, 1993.

[82] J. Moura, J. Lu, and M. Kleiner, "Intelligent sensor fusion: A graphical model approach," in International Conference on Acoustics, Speech, and Signal Processing, 2003, ser. ICASSP '03, vol. 6, Apr. 2003, pp. VI.

[83] A. Kushal, M. Rahurkar, L. Fei-Fei, J. Ponce, and T. Huang, "Audio-visual speaker localization using graphical models," in 18th International Conference on Pattern Recognition, 2006, ser. ICPR 2006, vol. 1, Sep. 2006, pp. 291.

[84] L. Chen, M. Wainwright, M. Cetin, and A. Willsky, "Data association based on optimization in graphical models with application to sensor networks," Mathematical and Computer Modelling, vol. 43, no. 9-10, pp. 1114, 2006.

[85] V. Savic and S. Zazo, "Sensor localization using nonparametric generalized belief propagation in network with loops," in 12th International Conference on Information Fusion, 2009, ser. FUSION '09, Jul. 2009, pp. 1966.

[86] L. Shi, J. Tan, and Z. Zhao, "Target tracking in sensor networks using statistical graphical models," in IEEE International Conference on Robotics and Biomimetics, 2008, ser. ROBIO 2008, Feb. 2009, pp. 2050.

[87] P. Paul and V. Rajbabu, "Nonparametric techniques for graphical model-based target tracking in collaborative sensor groups," in 2010 National Conference on Communications, Jan. 2010, pp. 1.

[88] S. Grime and H. Durrant-Whyte, "Data fusion in decentralized sensor networks," Control Engineering Practice, vol. 2, no. 5, pp. 849-863, 1994.

[89] A. Makarenko and H. Durrant-Whyte, "Decentralized Bayesian algorithms for active sensor networks," Information Fusion, vol. 7, no. 4, pp. 418, 2006.

[90] A. Makarenko, A. Brooks, T. Kaupp, H. Durrant-Whyte, and F. Dellaert, "Decentralised data fusion: A graphical model approach," in 12th International Conference on Information Fusion, 2009, ser. FUSION '09, Jul. 2009, pp. 545.

[91] A. Storkey and C. Williams, "Image modeling with position-encoded dynamic trees," vol. 25, no. 7, pp. 859, 2003.

[92] S. Todorovic and M. Nechyba, "Dynamic trees for unsupervised segmentation and matching of image regions," vol. 27, no. 11, pp. 1762, 2005.

[93] B. Yamauchi, A. Schultz, and W. Adams, "Mobile robot exploration and map-building with continuous localization," in Robotics and Automation, 1998. Proceedings. 1998 IEEE International Conference on, vol. 4. IEEE, 1998, pp. 3715.
[94] M. Dissanayake, P. Newman, S. Clark, H. Durrant-Whyte, and M. Csorba, "A solution to the simultaneous localization and map building (SLAM) problem," Robotics and Automation, IEEE Transactions on, vol. 17, no. 3, pp. 229, Jun. 2001.

[95] M. Montemerlo, S. Thrun, D. Koller, and B. Wegbreit, "FastSLAM: A factored solution to the simultaneous localization and mapping problem," in Proceedings of the National Conference on Artificial Intelligence. Menlo Park, CA; Cambridge, MA; London: AAAI Press; MIT Press, 2002, pp. 593.

[96] S. Thrun, W. Burgard, and D. Fox, "A probabilistic approach to concurrent mapping and localization for mobile robots," Autonomous Robots, vol. 5, no. 3, pp. 253, 1998.

[97] G. Kantor and S. Singh, "Preliminary results in range-only localization and mapping," in Robotics and Automation, 2002. Proceedings. ICRA '02. IEEE International Conference on, vol. 2, 2002, pp. 1818-1823.

[98] R. J. Rikoski and J. J. Leonard, "Trajectory sonar perception," Proc. IEEE International Conference on Robotics and Automation (ICRA '03), vol. 1, pp. 963, 2003.

[99] R. Rikoski, J. Leonard, P. Newman, and H. Schmidt, "Trajectory sonar perception in the Ligurian Sea," Experimental Robotics IX, pp. 557, 2006.

[100] E. Olson, J. Leonard, and S. Teller, "Robust range-only beacon localization," Oceanic Engineering, IEEE Journal of, vol. 31, no. 4, pp. 949, Oct. 2006.

[101] R. Eustice, O. Pizarro, and H. Singh, "Visually augmented navigation for autonomous underwater vehicles," Oceanic Engineering, IEEE Journal of, vol. 33, no. 2, pp. 103, Apr. 2008.

[102] N. Gracias, S. van der Zwaan, A. Bernardino, and J. Santos-Victor, "Mosaic-based navigation for autonomous underwater vehicles," Oceanic Engineering, IEEE Journal of, vol. 28, no. 4, pp. 609-624, Oct. 2003.

[103] T. Fortmann, Y. Bar-Shalom, and M. Scheffe, "Sonar tracking of multiple targets using joint probabilistic data association," Oceanic Engineering, IEEE Journal of, vol. 8, no. 3, pp. 173-184, Jul. 1983.

[104] J. Neira and J. Tardos, "Data association in stochastic mapping using the joint compatibility test," Robotics and Automation, IEEE Transactions on, vol. 17, no. 6, pp. 890, 2001.

[105] J. Tardos, J. Neira, P. Newman, and J. Leonard, "Robust mapping and localization in indoor environments using sonar data," The International Journal of Robotics Research, vol. 21, no. 4, p. 311, 2002.
[106] J. Pearl, Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. San Francisco, CA: Morgan Kaufmann, 1988.

[107] F. Jensen, Bayesian Networks and Decision Graphs, 2nd ed. New York, NY: Springer-Verlag, 2007.

[108] J. T. Cobb, K. Slatton, and G. Dobeck, "A parametric model for characterizing seabed textures in synthetic aperture sonar images," Oceanic Engineering, IEEE Journal of, vol. 35, no. 2, pp. 250, 2010.

[109] C. Oliver, "The sensitivity of texture measures for correlated random clutter," Inverse Problems, vol. 5, pp. 875, 1989.

[110] ——, "The interpretation and simulation of clutter textures in coherent images," Inverse Problems, vol. 2, pp. 481, 1986.

[111] J. Cobb and K. Slatton, "A parameterized statistical sonar image texture model," in Proc. SPIE Defense and Security Symposium, vol. 6953, Orlando, FL, Apr. 2008.

[112] D. Abraham and A. Lyons, "Novel physical interpretations of K-distributed reverberation," Oceanic Engineering, IEEE Journal of, vol. 27, no. 4, pp. 800, 2002.

[113] S. Johnson, "Synthetic aperture sonar image statistics," Ph.D. dissertation, Pennsylvania State University, State College, Pennsylvania, May 2009.

[114] S. Lloyd, "Least squares quantization in PCM," vol. 28, no. 2, pp. 129, January 1982.

[115] Y. Linde, A. Buzo, and R. Gray, "An algorithm for vector quantizer design," vol. 28, no. 1, pp. 84, Jan. 1980.

[116] K. Kampa, J. C. Principe, and K. C. Slatton, "Dynamic factor graphs: A novel framework for multiple features data fusion," in Acoustics, Speech and Signal Processing, 2010. ICASSP 2010. IEEE International Conference on, 2010.

[117] E. Weisstein. (2010, Apr.) Bell number. [Online]. Available: http://mathworld.wolfram.com/BellNumber.html

[118] S. Kirkpatrick, C. D. Gelatt, and M. P. Vecchi, "Optimization by simulated annealing," vol. 220, pp. 671, 1983.

[119] R. Duda, P. Hart, and D. Stork, Pattern Classification, 2nd ed. New York, NY: John Wiley and Sons, 2001.

[120] A. Ng, M. Jordan, and Y. Weiss, "On spectral clustering: Analysis and an algorithm," in Advances in Neural Information Processing Systems 14: Proceedings of the 2001 Conference, 2001, pp. 849.
[121] C. Sugar and G. James, "Finding the number of clusters in a dataset," Journal of the American Statistical Association, vol. 98, no. 463, pp. 750, 2003.

[122] D. Berend and T. Tassa, "Improved bounds on Bell numbers and on moments of sums of random variables," Probability and Mathematical Statistics, vol. 30, no. 2, 2010.
BIOGRAPHICAL SKETCH

Kittipat "Bot" Kampa is a machine learning scientist who intends to apply learning theory to real-world problems and enjoys using mathematics to explain physio-psychological phenomena, for instance, music and art. He attended Chulalongkorn University, where he received a Bachelor of Engineering degree in electrical engineering in 2001. He pursued his graduate study with Dr. Kenneth C. Slatton in the Adaptive Signal Processing Laboratory (ASPL) and with Dr. Jose C. Principe in the Computational NeuroEngineering Laboratory (CNEL) at the University of Florida. He received a Master of Science degree in electrical and computer engineering in 2006, with work concentrated on machine learning applications for LiDAR. Kittipat received his Ph.D. from the University of Florida in the fall of 2011. His Ph.D. dissertation aimed to develop data-driven tree-structured Bayesian networks for unsupervised image segmentation and deformable-structure graphical models for underwater data clustering and fusion.