
Optimized Dictionary Design and Classification Using the Matching Pursuits Dissimilarity Measure

Permanent Link: http://ufdc.ufl.edu/UFE0024309/00001

Material Information

Title: Optimized Dictionary Design and Classification Using the Matching Pursuits Dissimilarity Measure
Physical Description: 1 online resource (135 p.)
Language: english
Creator: Mazhar, Raazia
Publisher: University of Florida
Place of Publication: Gainesville, Fla.
Publication Date: 2009

Subjects

Subjects / Keywords: agglomeration, camp, classification, clustering, competitive, detection, dictionary, dissimilarity, enhanced, fuzzy, ksvd, learning, machine, matching, measure, outlier, prototype, pursuits
Computer and Information Science and Engineering -- Dissertations, Academic -- UF
Genre: Computer Engineering thesis, Ph.D.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, territorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract: Discrimination-based classifiers differentiate between two classes by drawing a decision boundary between their data members in the feature domain. These classifiers are capable of correctly labeling the test data that belongs to the same distribution as the training data. However, since the decision boundary is meaningless beyond the training points, the class label of an outlier determined with respect to this extended decision boundary will be a random value. Therefore, discrimination-based classifiers lack a mechanism for outlier detection in the test data. To counter this problem, a prototype-based classifier may be used that assigns a class label to a test point based on its similarity to the prototype of that class. If a test point is dissimilar to all class prototypes, it may be considered an outlier. Prototype-based classifiers are usually clustering-based methods. Therefore, they require a dissimilarity criterion to cluster the training data and also to assign class labels to test data. Euclidean distance is a commonly used dissimilarity criterion. However, the Euclidean distance may not be able to give accurate shape-based comparisons of very high-dimensional signals. This can be problematic for some classification applications where high-dimensional signals are grouped into classes based on shape similarities. Therefore, a reliable shape-based dissimilarity measure is desirable. In order to build reliable prototype-based classifiers that can utilize shape-based information for classification, we have developed a matching pursuits dissimilarity measure (MPDM). The MPDM is capable of performing shape-based comparisons between very high-dimensional signals. The MPDM extends the matching pursuits (MP) algorithm, which is a well-known signal approximation method. The MPDM is a versatile measure as it can also be adopted for magnitude-based comparisons between signals, similar to the Euclidean distance.
The MPDM has been used with the competitive agglomeration fuzzy clustering algorithm (CA) to develop a prototype-based probabilistic classifier, called CAMP. The CAMP algorithm is the first method of its kind as it builds a bridge between clustering and matching pursuits algorithms. The preliminary experimental results also demonstrate its superior performance over a neural network classifier and a prototype-based classifier using the Euclidean distance. The performance of CAMP has been tested on high-dimensional synthetic data and also on real landmine detection data. The MPDM is also used to develop an automated dictionary learning algorithm for MP approximation of signals. This algorithm uses the MPDM and the CA clustering algorithm to learn the required number of dictionary elements during training. Under-utilized and replicated dictionary elements are gradually pruned to produce a compact dictionary, without compromising its approximation capabilities. The experimental results show that the dictionary learned by our method is 60% smaller than dictionaries produced by existing dictionary learning algorithms, with the same approximation capabilities.
General Note: In the series University of Florida Digital Collections.
General Note: Includes vita.
Bibliography: Includes bibliographical references.
Source of Description: Description based on online resource; title from PDF title page.
Source of Description: This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility: by Raazia Mazhar.
Thesis: Thesis (Ph.D.)--University of Florida, 2009.
Local: Adviser: Gader, Paul D.
Local: Co-adviser: Wilson, Joseph N.

Record Information

Source Institution: UFRGP
Rights Management: Applicable rights reserved.
Classification: lcc - LD1780 2009
System ID: UFE0024309:00001


Mom and Adeel, it's because of you.

ACKNOWLEDGMENTS

First and foremost I would like to thank God for all his blessings and for enabling me to successfully complete my studies. Then I would like to thank my committee chair Dr. Paul Gader for being a patient teacher and mentor, and for helping me achieve my goals. My co-chair Dr. Joseph Wilson, for guiding and encouraging me in my studies and for being such a great person to work with. My committee members Dr. Gerhard Ritter, Dr. Randy Y. C. Chow and Dr. Clint Slatton. My labmates, especially Xuping Zhang for his continued support and friendship ever since I started my Ph.D., Jeremy Bolton for being a good friend, and Kenneth Watford for helping me run various experiments. Thanks are also due to Dr. Qasim Sheikh, my undergraduate advisor, for encouraging me to pursue a Ph.D. To my family, especially to my father for inspiring me toward higher studies. To my husband Adeel, for all his love and support during the most difficult of times and for always being so patient with me. In the end, I would like to thank my mother for always being there for me and for all her prayers and encouragement; I couldn't have done it without her.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
ABSTRACT

CHAPTER

1 INTRODUCTION
  1.1 Problem Statement and Motivation
  1.2 Proposed Solution

2 LITERATURE REVIEW
  2.1 The Matching Pursuits Algorithm
    2.1.1 Orthogonal Matching Pursuit
    2.1.2 Basis Pursuit
    2.1.3 Weighted Matching Pursuit
    2.1.4 Genetic Matching Pursuit
    2.1.5 Tree-Based Approaches to Matching Pursuits
    2.1.6 Other Variations of the Matching Pursuits Algorithm
  2.2 Dictionaries for Matching Pursuits
    2.2.1 Parametric Dictionaries
    2.2.2 Learning Dictionaries from Data
    2.2.3 Dictionary Pruning Methods
  2.3 Classification Using Matching Pursuits
    2.3.1 Discrimination Based Approaches
    2.3.2 Model Based Approaches

3 TECHNICAL APPROACH
  3.1 Matching Pursuits Dissimilarity Measure
  3.2 Dictionary Learning Using the EK-SVD Algorithm
    3.2.1 Relationship Between the Matching Pursuits and the Fuzzy Clustering Algorithms
    3.2.2 The Enhanced K-SVD Algorithm
  3.3 Classification Using the CAMP Algorithm
  3.4 Detecting Outliers in the Test Data

4 EXPERIMENTAL RESULTS
  4.1 Shape and Magnitude Based Comparisons Using Matching Pursuits Dissimilarity Measure
  4.2 …
  4.3 Classification of Synthetic High-Dimensional Data Using CAMP
  4.4 Cluster Validation of CAMP Algorithm
  4.5 Classification of Landmines Vector Data
    4.5.1 Performance Comparison with Discrimination-Based Classifiers
    4.5.2 Effect of Choosing Various Dictionaries on Classification
    4.5.3 Detecting Outliers in Test Data
    4.5.4 Automated Outlier Detection
  4.6 Classification of Landmines Image Data
    4.6.1 Performance Comparison with Existing Classifiers
    4.6.2 Effect of Choosing Various Dictionaries on Classification

5 CONCLUSION

APPENDIX
A COMPETITIVE AGGLOMERATION FUZZY CLUSTERING ALGORITHM

BIOGRAPHICAL SKETCH

LIST OF TABLES

4-1 The set of parameters used to generate the Gaussian dictionary
4-2 The t-test outcomes for CAMP and EUC ROCs
4-3 Number of mines and non-mines in the landmines data sets
4-4 The set of parameters used to generate the 1-D Gabor dictionary
4-5 The set of parameters used to generate the 2-D Gabor dictionary

LIST OF FIGURES

1-1 Outlier detection problem in discrimination-based classification methods
1-2 The model-based approach to classification
1-3 Shape-based comparison using the Euclidean distance
3-1 Importance of projection sequence, coefficient vector and the residue
3-2 Role of … in determining the value of MPDM
3-3 Update equation for ct
3-4 CA algorithm merges clusters based on their cardinality
4-1 Shape and magnitude based comparisons using MPDM
4-2 RMSE as function of M during EK-SVD dictionary training
4-3 Image approximations using K-SVD and EK-SVD trained dictionaries
4-4 Sample synthetic data for classification
4-5 The Gaussian dictionary
4-6 MPDM vs. the Euclidean distance for prototype-based classification
4-7 Data set used to validate CAMP and FCM
4-8 Histogram of number of clusters discovered by CAMP
4-9 Fuzzy validity measures
4-10 Sample EMI data
4-11 Mean PFA as …2 and C are varied
4-12 The EK-SVD dictionary
4-13 Classification results for training, T1 and T2 data sets with error bars
4-14 Classification results for training, T1 and T2 data sets with error bars
4-15 Classification results for T1 and T2 data sets for FOWA, SVM, EUC and CAMP
4-16 Error bars for T1 and T2 data sets for FOWA and CAMP
4-17 Reduction of Gaussian dictionary using EK-SVD
4-18 The Gabor dictionary
4-19 …
4-20 Cross validation results for the training data using various dictionaries
4-21 Results for T1 test data set using various dictionaries
4-22 Results for T2 test data set using various dictionaries
4-23 MSE for MP reconstruction of the training set for different dictionaries
4-24 Outlier detection by FOWA, SVM, EUC and CAMP classifiers
4-25 Error bars for mine class of synthetic data
4-26 Rank normalization for automated outlier detection
4-27 Sample GPR image data
4-28 Image dictionary learned from data
4-29 Rest of the elements of image dictionary learned from data
4-30 Classification results for GPR data for LMS, HMM, SPEC, EHD and CAMP
4-31 ROC of GPR data trained using the EK-SVD and the Gabor dictionaries

ABSTRACT

Discrimination-based classifiers differentiate between two classes by drawing a decision boundary between their data members in the feature domain. These classifiers are capable of correctly labeling the test data that belongs to the same distribution as the training data. However, since the decision boundary is meaningless beyond the training points, the class label of an outlier determined with respect to this extended decision boundary will be a random value. Therefore, discrimination-based classifiers lack a mechanism for outlier detection in the test data. To counter this problem, a prototype-based classifier may be used that assigns a class label to a test point based on its similarity to the prototype of that class. If a test point is dissimilar to all class prototypes, it may be considered an outlier.

Prototype-based classifiers are usually clustering-based methods. Therefore, they require a dissimilarity criterion to cluster the training data and also to assign class labels to test data. Euclidean distance is a commonly used dissimilarity criterion. However, the Euclidean distance may not be able to give accurate shape-based comparisons of very high-dimensional signals. This can be problematic for some classification applications where high-dimensional signals are grouped into classes based on shape similarities. Therefore, a reliable shape-based dissimilarity measure is desirable.

In order to build reliable prototype-based classifiers that can utilize shape-based information for classification, we have developed a matching pursuits dissimilarity measure (MPDM). The MPDM is capable of performing shape-based comparisons between very high-dimensional signals. The MPDM extends the matching pursuits (MP) algorithm [1], which is a well-known signal approximation method. The MPDM is a versatile measure as it can also be adopted for magnitude-based comparisons between signals, similar to the Euclidean distance.

The MPDM has been used with the competitive agglomeration fuzzy clustering algorithm (CA) [2] to develop a prototype-based probabilistic classifier, called CAMP. The CAMP algorithm is the first method of its kind as it builds a bridge between clustering and matching pursuits algorithms. The preliminary experimental results also demonstrate its superior performance over a neural network classifier and a prototype-based classifier using the Euclidean distance. The performance of CAMP has been tested on high-dimensional synthetic data and also on real landmine detection data.

The MPDM is also used to develop an automated dictionary learning algorithm for MP approximation of signals. This algorithm uses the MPDM and the CA clustering algorithm to learn the required number of dictionary elements during training. Under-utilized and replicated dictionary elements are gradually pruned to produce a compact dictionary, without compromising its approximation capabilities. The experimental results show that the dictionary learned by our method is 60% smaller than dictionaries produced by existing dictionary learning algorithms, with the same approximation capabilities.

CHAPTER 1
INTRODUCTION

1.1 Problem Statement and Motivation

The received signals are usually very high-dimensional time or frequency domain signals. They are analyzed using signal processing and machine learning algorithms for existence and identification of the target objects. The object detection task can simply be to determine if an object exists in the test data, as for the radars used by air traffic controllers to determine the location of aircraft, or it can be more complicated, for example, by including the recognition of the target. Landmine detection systems that use EMI sensors not only need to determine whether an object is buried in the ground, but they also need to recognize whether the buried object is a mine or a non-mine. The EMI response of a buried object depends on its metallic composition and geometry and stays consistent across most weather and soil conditions. Therefore, the high-dimensional EMI response contains shape-based information about the target. This information can be characterized to identify the object as a mine or a non-mine.

One approach to classification is to extract features that capture the shape and distinguishing characteristics of signals in the training data set. These features are then used to train a discrimination-based classifier which learns a decision rule for assigning class labels to the test data. A discrimination-based classifier learns the decision rule by drawing a decision boundary between training data of both classes in the feature

domain [3]. On the other hand, dimensionality-reduction transforms like LDA draw the decision boundary in a lower dimension [4].

Discrimination-based classifiers are capable of correctly labeling the test data which belong to the same distribution as the training data. However, these classifiers suffer from two major weaknesses. First, their performance relies heavily on the goodness of features extracted from the training data. If the extracted features are able to capture the distinguishing characteristics of both classes, the classification accuracy will be high for the test data. On the other hand, if the features fail to capture the differences between the classes, the classification accuracy for test data will be low. Devising good features for a given problem requires ample domain knowledge and creativity. Therefore, the features to be extracted are usually defined manually. However, manual feature definition is subject to human interpretation and hinders the scalability and generalization of the classification systems.

The second major weakness of discrimination-based classifiers is their inability to accurately classify outliers in test data. As discussed earlier, discrimination-based classifiers learn the discrimination rule by defining a decision boundary between training data of the two classes. Therefore, these classifiers are able to accurately classify the test points that belong to the same distribution as the training data. Although the decision boundary extends to infinity in all directions, it is meaningless beyond the region containing training points. Therefore, the class label of an outlier t determined with respect to this extended decision boundary will essentially be a random value (Fig. 1-1).
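The point about extended decision boundaries can be made concrete with a toy sketch. The weights and test points below are invented for illustration (they are not from the dissertation); the idea is that a linear discriminant returns a confident label for every input, no matter how far it lies from the training data:

```python
import numpy as np

# Hypothetical linear decision boundary w.x + b = 0, as if learned from
# two well-separated training classes. Any point gets a label, even one
# arbitrarily far from the region that contained the training data.
w, b = np.array([1.0, 1.0]), 0.0

def discriminate(x):
    return "class 1" if w @ x + b > 0 else "class 0"

print(discriminate(np.array([1.0, 2.0])))    # near the training data
print(discriminate(np.array([1e6, 1e6])))    # extreme outlier: still labeled
```

Both calls print "class 1"; the classifier has no way to report that the second point is unlike anything it was trained on.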

Since the relative location of every test point to the decision boundary can always be determined, discrimination-based classifiers are incapable of detecting outliers in the test data. Therefore, a prototype-based classifier must be used instead of a discrimination-based classifier. In a prototype-based classifier, each class is represented by a set of prototypes. A test point is assigned a class label based on its similarity to the prototypes of that class. Consequently, if a test point is dissimilar to prototypes of all the classes, it will not be assigned a class label and may be considered an outlier (Fig. 1-2). Therefore, the outlier detection mechanism is inherently built into the prototype-based classifiers.

Prototype-based classifiers usually employ clustering techniques to build class prototypes. This requires use of a dissimilarity criterion to cluster the training data into various clusters. Class assignment to test points also needs the dissimilarity measure to determine the closest prototype. Therefore, choosing an appropriate dissimilarity measure is important for the performance of prototype-based classifiers. A commonly used dissimilarity measure is Euclidean distance. Euclidean distance measures the shortest distance between two points in space along a straight line. In other words, if two points x and z are connected by a line segment, their Euclidean distance will be the length of this line segment. However, in high dimensions, the shortest distance in space may not be

reflective of shape-based similarity (Fig. 1-3).

Figure 1-3 shows three vectors A, B, C in R^100. Suppose signals A and B belong to a class characterized by curves with shapes similar to theirs. Furthermore, suppose signal C belongs to a class with a different characteristic shape. But the Euclidean distance between A and C is 617.7 and between A and B is equal to 725.3. Therefore, the Euclidean distance may not be the most suitable dissimilarity measure to perform shape-based comparisons between signals in high dimensions.

In short, prototype-based classification methods are more suitable than discrimination-based approaches for classification of high-dimensional data that has outliers. However, prototype-based classifiers require a shape-based dissimilarity measure for building class prototypes and for assigning class labels to test points.

1.2 Proposed Solution

In order to achieve the above objectives, a Matching Pursuits Dissimilarity Measure is presented. The MPDM extends the well-known signal approximation technique Matching Pursuits (MP) for signal comparison purposes [1]. MP is a greedy algorithm that approximates a signal x as a linear combination of signals from a pre-defined dictionary. MP is commonly used for signal representation and compression, particularly image and video compression [5, 6]. The dictionary and coefficients information produced by the MP algorithm has been previously used in some classification applications. However, most of these applications work on some underlying assumptions about the data and the MP
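A small numeric sketch (with made-up signals, not the dissertation's actual A, B and C) shows the same effect: Euclidean distance can rank a differently shaped signal as the closer one whenever magnitudes dominate, while a normalized, scale-invariant comparison recovers the shape-based ranking:

```python
import numpy as np

# Illustrative signals: A and B share a shape but differ in magnitude;
# C has a different shape but a magnitude comparable to A.
t = np.linspace(0, 2 * np.pi, 100)
A = np.sin(t)
B = 10 * np.sin(t)   # same shape as A, 10x the magnitude
C = np.cos(t)        # different shape, similar magnitude

d_ab = np.linalg.norm(A - B)
d_ac = np.linalg.norm(A - C)
print(d_ac < d_ab)   # True: Euclidean distance calls A "closer" to C

# A scale-invariant (shape-based) comparison reverses the ranking:
def unit(v):
    return v / np.linalg.norm(v)

print(np.linalg.norm(unit(A) - unit(B)))   # ~0: identical shapes
print(np.linalg.norm(unit(A) - unit(C)))   # large: different shapes
```

Normalizing to unit vectors is only one crude way to get scale invariance; the MPDM developed in this dissertation is a richer, dictionary-based alternative.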

algorithm (Section 2.3). The MPDM is the first MP-based comparison measure that does not require any assumptions about the problem domain. It is versatile enough to perform shape-based comparisons of very high-dimensional signals and it can also be adopted to perform magnitude-based comparisons, similar to the Euclidean distance. Since the MPDM is a differentiable measure, it can be seamlessly used with existing clustering or discrimination algorithms. Therefore, the MPDM may find application in a variety of classification and approximation problems of very high-dimensional signals, including image and video signals. The experimental results show that MPDM is more useful than the Euclidean distance for shape-based comparison between signals in high dimensions.

The potential usefulness of the MPDM for a variety of problems is demonstrated by devising two important MPDM-based algorithms. The first algorithm, called CAMP, deals with the prototype-based classification of high-dimensional signals. The second algorithm is called the EK-SVD algorithm and it automates the dictionary learning process for the MP approximation of signals.

In the CAMP algorithm, MPDM is used with the Competitive Agglomeration (CA) clustering algorithm by Frigui and Krishnapuram to propose a probabilistic classification model [2]. The CA algorithm is a fuzzy clustering algorithm that learns the optimal number of clusters during training. Therefore, it eliminates the need for manually specifying the number of clusters beforehand. This algorithm has been named CAMP as an abbreviation of the CA and MP algorithms. For a two-class problem (y in {0, 1}), CAMP clusters members of each class separately and uses the cluster representatives as prototypes. The prior probability p(y|cj) of a class is computed based on similarity of the cluster cj to clusters of the other class. The likelihood p(x|cj) of a point x is determined using MPDM. The likelihood p(x|cj) and the prior p(y|cj) are used to compute the posterior probability p(y|x) of x belonging to a class y. The test point t that has low posterior probabilities for both classes may be considered to be an outlier.
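As a rough illustration of the prototype-based decision rule (not the actual CAMP computation: a Gaussian kernel stands in for the MPDM-based likelihood, and the prototypes, bandwidth and threshold below are all invented), a test point is labeled by its most likely prototype and rejected as an outlier when every likelihood is negligible:

```python
import numpy as np

# Minimal sketch of prototype-based classification with outlier rejection.
def classify(x, prototypes, labels, sigma=1.0, threshold=1e-3):
    # Likelihood of x under each cluster prototype (Gaussian stand-in
    # for the MPDM-based likelihood p(x|cj)).
    lik = np.array([np.exp(-np.linalg.norm(x - p) ** 2 / (2 * sigma ** 2))
                    for p in prototypes])
    if lik.max() < threshold:          # dissimilar to every prototype
        return "outlier"
    post = lik / lik.sum()             # normalize to posterior-like scores
    return labels[int(np.argmax(post))]

prototypes = [np.array([0.0, 0.0]), np.array([5.0, 5.0])]
labels = ["non-mine", "mine"]
print(classify(np.array([0.2, -0.1]), prototypes, labels))   # non-mine
print(classify(np.array([50.0, -40.0]), prototypes, labels))  # outlier
```

The second call illustrates the mechanism a discrimination-based classifier lacks: a point far from all prototypes receives no class label at all.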

… (Section 2.3). However, the new CAMP algorithm is the first method that builds a bridge between clustering and matching pursuits techniques. Therefore, it can be used to combine existing MP-based image compression techniques with the prototype-based image recognition and retrieval applications in one framework. The experimental results also show the usefulness of CAMP for classification of high-dimensional data. The CAMP algorithm has been used for classification of real landmine detection data collected using an electromagnetic induction sensor, discussed in Section 1.1. The classification performance of the CAMP algorithm has been found to be better than an existing multi-layer perceptron based system for this data. Our CAMP algorithm also outperformed support vector machines using a non-linear radial basis function kernel. The experimental results also demonstrate the superiority of MPDM over the Euclidean distance for shape-based comparisons in high dimension. An extensive experiment using simulated data is also reported to demonstrate the outlier detection capabilities of CAMP over discrimination-based classifiers and the prototype-based classifier using the Euclidean distance.

The CAMP algorithm may be useful as a bridge between clustering and MP algorithms, with outlier detection capabilities. However, it also has a potential weakness because of the dependence of the MP algorithm on the choice of dictionary being used for approximations. If the dictionary is well-suited to the data, the MP algorithm will be able to give good approximations in fewer iterations. Otherwise, the approximation error may still be large even after many MP iterations. Therefore, the MP dictionary used with the CAMP algorithm is learned using the training data. There are a number of dictionary learning algorithms available. One of these is K-SVD [7].

K-SVD is a useful state-of-the-art dictionary learning algorithm that has been demonstrated to outperform other dictionary learning methods [7]. However, the drawback of the K-SVD algorithm is that the total number K of the dictionary elements to be learned needs to be specified beforehand. Choosing K is a cumbersome manual activity and there is a possibility of choosing too big or too small a number. In order to have a fully automated and reliable dictionary learning process, the learning algorithm should not depend on manual specification of K. Instead, the correct number of required dictionary elements should be discovered during the training process.

Therefore, using MPDM, an enhancement over the K-SVD algorithm is developed, called the enhanced K-SVD (EK-SVD) algorithm. EK-SVD obviates the need for specifying the total number K of required dictionary elements beforehand. The dictionary members update stage is the same for both K-SVD and EK-SVD algorithms. However, in the coefficient update stage, instead of using MP for the update, EK-SVD uses MPDM with CA to learn the coefficients as fuzzy cluster memberships. The cluster pruning capabilities of the CA algorithm are used to gradually prune under-utilized and replicated dictionary elements, while using MPDM ensures consistency of the learned coefficients with the MP algorithm.

The EK-SVD algorithm has two important properties. First, it produces smaller dictionaries with good approximation capabilities for the given data. In signal approximation and compression applications of MP, not only the approximation accuracy but also the computation speed is important. The experimental results show that both K-SVD and EK-SVD learn dictionaries with similar approximation capabilities. The only difference is the size of the learned dictionaries.
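The pruning idea can be sketched as follows. This is not the EK-SVD/CA update itself; it is a simplified stand-alone pass, with invented usage counts and thresholds, that removes under-utilized atoms and near-duplicate (replicated) atoms from a dictionary:

```python
import numpy as np

# Illustrative pruning pass over a dictionary D whose columns are
# unit-norm atoms. `usage` would come from counting how often each atom
# is selected by MP over a training set (values here are made up).
def prune_dictionary(D, usage, min_usage=2, max_coherence=0.99):
    keep = usage >= min_usage                 # drop under-utilized atoms
    idx = [i for i in range(D.shape[1]) if keep[i]]
    kept = []
    for i in idx:
        # drop atoms nearly collinear with an already-kept atom
        if all(abs(D[:, i] @ D[:, j]) < max_coherence for j in kept):
            kept.append(i)
    return D[:, kept]

D = np.eye(8)[:, :5].copy()        # 5 toy atoms in an 8-dim space
D[:, 4] = D[:, 0]                  # atom 4 replicates atom 0
usage = np.array([10, 0, 7, 3, 9]) # atom 1 is never selected
pruned = prune_dictionary(D, usage)
print(pruned.shape)                # (8, 3)
```

Atom 1 is removed for lack of use and atom 4 for duplicating atom 0, leaving a compact three-atom dictionary; EK-SVD achieves the analogous effect through the CA algorithm's cluster pruning rather than hard thresholds.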

The other important property of EK-SVD is that it does not depend on manual specification of the total number K of dictionary elements. This property is especially useful in the context of classification applications, like CAMP. Since CAMP heavily uses the MP algorithm during the training and the classification processes, the dictionary learning may be considered as a pre-processing step for training of the classifier. Automating the dictionary learning process reduces dependence of the classifier on domain knowledge, thus minimizing human intervention in the form of user-specified parameters.

The rest of this document elaborates upon the concepts introduced here. Chapter 2 is an overview of existing methods for dictionary learning and classification using the MP algorithm. In Chapter 3, the technical approach of the new methods is discussed in detail. Chapter 4 reports some preliminary experimental results about the proposed algorithms.

Figure 1-1. Outlier detection problem in discrimination-based classification methods.

Figure 1-2. The model-based approach to classification. The test point t is not assigned any class label as it is far from all the class models.

Figure 1-3. Under a shape-based comparison, signals A and B should be more similar. But the Euclidean distance declares A and C to be more similar because their magnitudes are comparable.

CHAPTER 2
LITERATURE REVIEW

Matching pursuits (MP) is a well-known technique for sparse signal representation. MP is a greedy algorithm that finds linear approximations of signals by iteratively projecting them over a redundant, possibly non-orthogonal set of signals called a dictionary. Since MP is a greedy algorithm, it may give a suboptimal approximation. However, it is useful for approximations when it is hard to come up with optimal orthogonal approximations, as in the case of high-dimensional signals or images. Historically, the matching pursuits (MP) technique is used for signal compression, particularly audio, video and image signal compression. However, MP has also been used in some classification applications, usually as a feature extractor.

This chapter is an overview of the matching pursuits algorithm, its dictionaries and its application to the classification problems. Therefore, in Section 2.1, we discuss in detail the definition and characteristics of the MP algorithm and also some commonly used improvements over the basic MP algorithm. The dictionary plays a pivotal role in performance of the MP algorithm, therefore in Section 2.2 we discuss in detail some well known MP dictionaries and also the dictionary learning methods. Since we are trying to adopt the MP algorithm for classification purposes, in Section 2.3 we review the existing discrimination and model based classification systems that use the MP algorithm.

2.1 The Matching Pursuits Algorithm

… [1]. It was reintroduced from the statistical community to the signal processing community by Mallat and Zhang in 1993 [8]. Let H be a Hilbert space; then matching pursuits decomposes a signal x in H through an iterative, greedy process over an overcomplete set of signals, called the dictionary D = {g_γ1, g_γ2, ..., g_γM} contained in H. Each g_γi in H is called an atom in the dictionary D and ‖g_γi‖ = 1. Here γi denotes a set of parameters that define the atom.


Given a signal x, the matching pursuits algorithm generates an approximation of x as a linear combination of p atoms from D:

\hat{x} = \sum_{j=0}^{p-1} w_j^{(x)} g_{d_j}^{(x)}    (2-1)

where \hat{x} denotes the approximation of x and w_j^{(x)} denotes the coefficient corresponding to each g_{d_j}^{(x)}. The difference between x and its approximation is called the residue:

R^{(x)} = x - \hat{x}    (2-2)

The algorithm starts by finding the g_{d_0}^{(x)} that gives the maximum projection of x:

g_{d_0}^{(x)} = \arg\max_{g \in D} \left| \langle x, g \rangle \right|    (2-3)

The residue is updated by subtracting g_{d_0}^{(x)} times its magnitude of projection w_0^{(x)} from x:

R_1^{(x)} = x - w_0^{(x)} g_{d_0}^{(x)}    (2-4)

where w_0^{(x)} = \langle x, g_{d_0}^{(x)} \rangle is called the coefficient of g_{d_0}^{(x)}. Since R_1^{(x)} is orthogonal to w_0^{(x)} g_{d_0}^{(x)}, we have:

\| x \|^2 = \left| w_0^{(x)} \right|^2 + \| R_1^{(x)} \|^2    (2-5)

This process continues iteratively by projecting R_i^{(x)} onto the dictionary atoms and updating R_{i+1}^{(x)} accordingly. After p iterations, x can be written as:

x = \sum_{j=0}^{p-1} w_j^{(x)} g_{d_j}^{(x)} + R_p^{(x)}    (2-6)
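The iteration above can be sketched in a few lines of NumPy. This is an illustrative sketch under the assumption that the dictionary is stored as a matrix whose columns are unit-norm atoms and that the iteration count p is fixed; it is not the dissertation's implementation.

```python
import numpy as np

def matching_pursuit(x, D, p):
    """Greedy MP: approximate x with p atoms from dictionary D.

    D is an (n, M) array whose columns are unit-norm atoms. Returns the
    chosen atom indices, their coefficients, and the final residue R_p.
    """
    residue = x.astype(float).copy()
    atoms, coeffs = [], []
    for _ in range(p):
        projections = D.T @ residue              # inner products <R, g_j>
        j = int(np.argmax(np.abs(projections)))  # best-matching atom
        w = projections[j]
        atoms.append(j)
        coeffs.append(w)
        residue = residue - w * D[:, j]          # subtract the projection
    return atoms, coeffs, residue
```

With an orthonormal dictionary and p large enough, the residue is driven exactly to zero, mirroring the convergence property discussed below.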


Matching pursuits is similar to vector quantization (VQ) of the signals, where a dictionary element is chosen to represent x based on some dissimilarity criteria [9]. However, the difference between VQ and MP is that VQ chooses only one dictionary element to represent x, while MP chooses p elements. If the dictionary is complete (i.e., span(D) = H), it has been shown by Mallat and Zhang [1] that as p → ∞, R_p^{(x)} → 0 and \hat{x} → x. The total number of iterations p can either be fixed beforehand or it can be chosen dynamically by iterating until R_p^{(x)} is less than a threshold.

At each iteration, the dictionary element that is the most similar to the residue is chosen and subtracted from the current residue. If the angle of projection at each iteration is small, then it will take only a few iterations to drive the residue to zero. Conversely, if at each iteration the angle of projection between the residue and the chosen element is large, it will take more iterations and dictionary elements to reduce the residue significantly. In addition, if the dictionary is large, then the computation time of the iterations will be large. Hence the proper choice of dictionary is essential. Since MP is a greedy algorithm, the chosen coefficients should get smaller as the iteration index, j, gets larger. Hence, the maximum information about the signal x is contained in the first few coefficients. Therefore, MP also has a denoising effect on the signal x. Sparsity of representation is an important issue, both for the computational efficiency of the resulting representations and for its theoretical and practical influence on generalization performance. The MP algorithm provides explicit control over the sparsity of the approximation solution through the choice of a suitable value of p.

Various enhancements to the MP algorithm have been proposed in the literature. These enhancements deal with improving various aspects of the MP algorithm.


The disadvantage of this suboptimality of MP for finite p is that it will take more iterations to reduce R_p^{(x)} below a given threshold. This shortcoming of MP is removed by the Orthogonal Matching Pursuits (OMP) algorithm [10, 11]. At every iteration, OMP gives the optimal approximation with respect to the selected subset of dictionary elements by making the residue orthogonal to all of the chosen dictionary elements. Note that OMP still uses the MP technique to find the next dictionary element and to compute its initial coefficient. In fact, OMP only adds an orthogonalization step at the end of each MP iteration by recomputing all the coefficients of the dictionary elements chosen so far. For this purpose, all the dictionary elements chosen up to the pth iteration are taken and their coefficients are recomputed by solving the least-squares problem:

\min_{w_j^{(x)}} \left\| x - \sum_{j=0}^{p-1} w_j^{(x)} g_{d_j}^{(x)} \right\|^2    (2-8)

This ensures that the residue R_p^{(x)} ∈ V_p^⊥ and the approximation of x is optimal for the given set of p dictionary elements. OMP also converges faster than MP. For a dictionary of finite size M, OMP converges to its span in no more than M iterations [10].
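The orthogonalization step can be sketched as follows: after each greedy selection, the coefficients of all chosen atoms are recomputed by least squares (Equation 2-8), which leaves the residue orthogonal to their span. This is an illustrative sketch with assumed variable names, not a reference implementation.

```python
import numpy as np

def orthogonal_matching_pursuit(x, D, p):
    """OMP sketch: greedy atom selection as in MP, followed by a
    least-squares re-fit of all chosen coefficients at every iteration."""
    residue = x.astype(float).copy()
    chosen = []
    coeffs = np.zeros(0)
    for _ in range(p):
        j = int(np.argmax(np.abs(D.T @ residue)))   # MP-style selection
        if j not in chosen:
            chosen.append(j)
        # recompute all coefficients over the chosen atoms (Equation 2-8)
        subdict = D[:, chosen]
        coeffs, *_ = np.linalg.lstsq(subdict, x, rcond=None)
        residue = x - subdict @ coeffs              # now in V_p-perp
    return chosen, coeffs, residue
```

Because the residue is re-projected out of the span of every chosen atom, no atom is ever selected twice with a nonzero new contribution, which is what drives the faster convergence noted above.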


The Optimized Orthogonal Matching Pursuits (OOMP) algorithm refines this idea further [12]. At each iteration j, OOMP only chooses from the dictionary elements that reside in the span of V_i^⊥. Therefore, OOMP ensures sparse representations through orthogonality of the chosen dictionary elements and the final residue. However, the computational complexities of OMP and OOMP are higher than that of MP because of the requirement of solving a least-squares problem in each iteration.

Basis pursuit (BP) is another principle for sparse signal representation [13]. BP chooses the dictionary elements and coefficients that minimize the ℓ1 norm of the coefficients used to represent x exactly. Let D be a matrix whose columns are the dictionary elements and W^{(x)} be the corresponding vector of dictionary coefficients for x; then BP solves the following problem:

\min \left\| W^{(x)} \right\|_1 \quad \text{subject to} \quad D W^{(x)} = x    (2-9)

BP requires the solution of a convex, nonquadratic optimization problem. It has been shown by Chen et al. [13] that it can be translated into a linear programming problem; in this sense, BP is an optimization principle and not an algorithm [13]. Once BP has been translated into a linear program, it can be solved using any standard linear programming technique like the simplex or interior-point methods [14].
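The translation of Equation 2-9 into a linear program can be sketched by splitting the coefficient vector into its positive and negative parts. This is an illustrative sketch using SciPy's generic LP solver; the variable split and solver choice are assumptions, not the formulation of Chen et al.

```python
import numpy as np
from scipy.optimize import linprog

def basis_pursuit(x, D):
    """BP sketch: min ||W||_1 subject to D W = x, as a linear program.

    Write W = u - v with u, v >= 0, so that ||W||_1 = sum(u + v) at the
    optimum, and solve min 1'(u, v) subject to [D, -D](u, v) = x.
    """
    n, M = D.shape
    c = np.ones(2 * M)                   # objective: sum of |w_j|
    A_eq = np.hstack([D, -D])            # equality constraint D(u - v) = x
    res = linprog(c, A_eq=A_eq, b_eq=x, bounds=[(0, None)] * (2 * M))
    u, v = res.x[:M], res.x[M:]
    return u - v
```

For a dictionary containing an atom aligned with x, the ℓ1 objective concentrates the representation on that single atom rather than spreading it over several.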


The Weighted Matching Pursuits (WMP) algorithm incorporates prior knowledge about the dictionary elements into the selection process [15]. For this purpose, a Bayesian interpretation has been given to MP. MP chooses the next dictionary element based on its similarity to the current residue. This can be regarded as searching for the element g_j^{(x)} with maximum likelihood for R_j^{(x)}. Assuming that the residue R_j^{(x)} is made of a superposition of a dictionary element g_j^{(x)} and Gaussian noise (i.e., R_j^{(x)} = g_j^{(x)} + v_j), the inner product ⟨R_j^{(x)}, g_j^{(x)}⟩ can be seen as a maximization of the probability p(g_j^{(x)} | R_j^{(x)}). In the standard MP, where all dictionary elements are equally likely, maximizing p(R_j^{(x)} | g_j^{(x)}) is equivalent to maximizing p(g_j^{(x)} | R_j^{(x)}). However, assume that all dictionary elements are not equally probable to appear in the approximation of x. Then each dictionary element has a prior probability p(g_j) associated with it. In this case, using Bayes' rule, the probability to maximize becomes:

p(g_j^{(x)} \mid R_j^{(x)}) = \frac{ p(R_j^{(x)} \mid g_j^{(x)}) \, p(g_j) }{ p(R_j^{(x)}) }    (2-10)

where p(R_j^{(x)}) is constant for all j. This process has the effect of multiplying a weighting factor c_j ∈ (0, 1] by each ⟨R_j^{(x)}, g_j^{(x)}⟩ during the dictionary element selection process. The weighting factor c_j is specific to each dictionary element g_j^{(x)} and is pre-defined heuristically. In this way the a-priori knowledge about each dictionary element is considered by the WMP algorithm. Note that when c_j = 1, WMP reduces to the standard MP algorithm.

The Genetic Matching Pursuit (GMP) algorithm uses a genetic algorithm [16] in each iteration to find a suitable dictionary element [17]. Instead of iterating over all the dictionary elements to find the best-matching g_j^{(x)} for the residue R_j^{(x)}, GMP finds a good match for R_j^{(x)} based on some acceptability criteria. GMP is useful in situations where the size of the dictionary is quite large and iterating over the whole dictionary per MP iteration can be a performance bottleneck.
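The weighted selection step can be sketched in one line of NumPy. The sketch below assumes the dictionary is a matrix of unit-norm columns and the per-atom priors are supplied as an array of weights c_j; both names are illustrative.

```python
import numpy as np

def weighted_mp_select(residue, D, prior_weights):
    """One WMP selection step: scale each inner product <R, g_j> by a
    per-atom weight c_j in (0, 1] encoding prior knowledge. With all
    weights equal to 1 this reduces to the standard MP selection."""
    scores = prior_weights * np.abs(D.T @ residue)
    return int(np.argmax(scores))
```

Down-weighting an atom makes it lose the selection even when its raw inner product with the residue is the largest, which is exactly how the prior steers the decomposition.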


The authors of [17] used a Time-Frequency-Scale (TFS) dictionary to illustrate the GMP algorithm. It is generated by applying scaling, time shifting and harmonic modulation to the unit-energy Gauss function. Each dictionary index l is uniquely identified by three parameters m, n and k, denoting the scale, time shift and harmonic modulation indices respectively. Therefore, at each iteration j of GMP, this triplet of indices is identified so as to maximize a fitness function measuring the quality of the match between the atom g[m, n, k] and the current residue R_j^{(x)}. The time shift and harmonic modulation are encoded as a pair of successive binary genes Λ_{n,k} = [n|k]. For each scale m, a population of chromosomes is used in the optimization. The genetic operators crossover, mutation and inversion are applied in this order with their respective probabilities P_c, P_m and P_i. P_c determines the percentage of fittest chromosomes inherited by the next population. The rest of the chromosomes are placed in the temporary population, where they combine with others to produce chromosomes for the next population. The fittest chromosome from each population is entered into a set of outstanding chromosomes. This process of generating populations continues for a specified number of iterations. In the end, the fittest chromosome from the pool of outstanding chromosomes is chosen and the corresponding dictionary member g[m, n, k] is chosen to approximate the residue R_j^{(x)} for the current iteration j.

GMP speeds up the computation when the size of the dictionary is quite large. However, GMP only guarantees the best match in the dictionary up to an insurance level. Therefore, the solution might be suboptimal from an MP standpoint. Also, the uniqueness of the solution is not guaranteed. Still, searching the dictionary space using GMP may be considered in a situation where computation speed is more critical than approximation accuracy. Another related method of using a genetic algorithm to choose dictionary elements for MP iterations is presented in [18].


Since MP is a greedy algorithm, it does not guarantee that the final signal representation is the best one possible given the same dictionary. A tree-based approach can be used to find a better approximation than that given by the greedy approaches of the MP and OMP algorithms. This algorithm is called the MP:K algorithm [19]. In each iteration j of MP, instead of picking only the top-matching dictionary element, MP:K chooses the top K ≥ 1 elements and uses them in parallel to approximate the current residue R_j^{(x)}. For a total of p iterations, this gives rise to a tree of depth p where each node has K children. The subset corresponding to the smallest residue at the leaf nodes is chosen as the representation of the input signal x. Note that with K = 1, MP:K is equal to the MP algorithm. Karabulut et al. [20] enhanced the MP:K algorithm to make K an exponentially decaying function of the depth of the tree. K is larger in earlier iterations, as the dictionary elements chosen earlier on play a bigger role in the approximation of x than the ones chosen later on in the MP iterations. Making K smaller with later iterations makes the tree search faster, as the size of the tree is then smaller than in the case of a fixed K.

As discussed in the previous section, the exhaustive comparisons of MP with each dictionary element can make the algorithm slow if the size of the dictionary is quite large. Computations can be made faster by introducing a tree structure in the dictionary [21]. The dictionary is divided into two parts and a group representative is chosen by averaging the dictionary members of that group. In this way further sub-groups can be introduced within these groups and so on, giving rise to a binary tree-structured dictionary. The residue R_j^{(x)} is first compared with the group representatives of the top groups. Whichever dictionary group correlates better with R_j^{(x)} is chosen for traversal and the other group is ignored. This process continues till the whole tree is traversed and a dictionary element is chosen.
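The tree traversal can be sketched as follows. The node representation (nested dicts holding a group-representative vector 'rep', two 'children', and a leaf 'atom' index) is an assumption for illustration only.

```python
import numpy as np

def tree_dictionary_search(residue, node):
    """Descend a binary tree-structured dictionary: at each internal node,
    follow the child whose group representative (the average of its atoms)
    correlates better with the residue; return the leaf atom index."""
    while 'children' in node:
        left, right = node['children']
        if abs(left['rep'] @ residue) >= abs(right['rep'] @ residue):
            node = left
        else:
            node = right
    return node['atom']
```

Each selection then costs a number of comparisons logarithmic in the dictionary size, at the price of possibly missing the globally best-correlated atom when a group average is unrepresentative.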


Another variation is the SMP algorithm [22, 23]. The only difference between MP and SMP is that instead of choosing the dictionary member with the largest inner product, the dictionary element that maximizes the expected value of the square of the inner product between the dictionary element g_j and the residue R_j^{(x)} is chosen at each iteration j. SMP was developed to detect speech events in order to differentiate between utterances of various words. The chosen coefficients W_x = {w_1^{(x)}, w_2^{(x)}, ..., w_p^{(x)}} are assumed to have a Gaussian mixture distribution whose parameters are learned using the expectation-maximization algorithm [24].

Another application-specific variation of MP is to make the calculation of the MP inner products faster by taking advantage of the separable nature of the dictionaries being used. For example, fast inner product implementations using the separable Gabor dictionary [5]


and the Gaussian derivatives dictionary have been developed [25]. These dictionaries and other MP dictionaries are discussed in detail in the following section.

However, sometimes it may be hard to come up with a good parametric dictionary, or the parametric dictionary may not be expressive enough to give accurate and sparse representations of the data. A straightforward solution for obtaining more variety in the dictionary is to combine various types of dictionaries [1]. But this also increases the size of the dictionary, resulting in higher computation time. Therefore, instead of using a general parameterized form of a dictionary, it is sometimes more useful to learn the dictionary from the given training data. Tailoring the dictionary to the data produces sparser solutions with better approximations, while keeping the size of the dictionary manageable. The only drawback of learned dictionaries is that they need more storage space than parametric dictionaries. Therefore, special attention needs to be paid to choosing the size of the learned dictionary.

For both parametric and learned dictionaries, the size of the dictionary is an important factor not only for storage considerations, but also for computational speed.


In the following sections, we review these three aspects of the MP dictionary selection process, namely, parametric dictionaries, dictionary learning methods and dictionary pruning methods.

Mallat and Zhang [1] not only introduced the matching pursuits algorithm but also gave general guidelines for choosing families of time-frequency atoms as dictionaries. A general family of time-frequency atoms can be generated by scaling, translating and modulating a single window function g(t) ∈ L²(ℝ). The space L²(ℝ) is the Hilbert space of complex-valued functions such that:

\int_{-\infty}^{+\infty} |g(t)|^2 \, dt < +\infty    (2-13)

The window function g(t) should be real and continuously differentiable with unit norm (i.e., ||g(t)|| = 1). Let s > 0 be the scale, u the translation and ξ the frequency modulation index. Given the set γ = (s, u, ξ) of these indices, each dictionary member g_γ can be defined as:

g_\gamma(t) = \frac{1}{\sqrt{s}} \, g\!\left( \frac{t - u}{s} \right) e^{i \xi t}    (2-14)


The Gabor dictionary is one such commonly used dictionary for image and video signal approximation [5], [26], [6]. It is built from the Gaussian window function g(t):

g(t) = 2^{1/4} e^{-\pi t^2}    (2-15)

The 1-D discrete Gabor dictionary [1] is defined as:

g_\gamma(i) = K_\gamma \, g\!\left( \frac{i}{s} \right) \cos\!\left( \frac{2 \pi \xi i}{N} + \phi \right)    (2-16)

where γ = (s, φ, ξ) is a triple consisting of the scale, phase shift and frequency modulation indices respectively, i is an index over the total number of samples N in g_γ (i.e., i ∈ {0, 1, ..., N−1}) and K_γ is chosen to make the norm of g_γ equal to 1. If γ_1 and γ_2 are two such triples, then the 2-D separable Gabor dictionary used for image and signal approximation is given by:

G_{\gamma_1, \gamma_2}(i, j) = g_{\gamma_1}(i) \, g_{\gamma_2}(j)    (2-17)
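A 1-D Gabor atom along the lines of Equation 2-16 can be generated as follows. This is an illustrative sketch: the explicit shift parameter u and the exact constants are assumptions for the example, not the dictionary's precise parameterization.

```python
import numpy as np

def gabor_atom(N, s, u, xi, phi):
    """Sketch of a discrete real Gabor atom: a Gaussian window at scale s
    and shift u, modulated at frequency xi with phase phi, then scaled to
    unit norm (the role of the constant K in Equation 2-16)."""
    i = np.arange(N)
    window = np.exp(-np.pi * ((i - u) / s) ** 2)   # Gaussian envelope
    atom = window * np.cos(2 * np.pi * xi * i / N + phi)
    return atom / np.linalg.norm(atom)
```

Enumerating such atoms over grids of (s, u, ξ, φ) values yields an overcomplete dictionary whose columns can be fed directly to the MP loop sketched earlier.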


The Gabor dictionary has been shown to give better and sparser approximations than the discrete-cosine-transform-based compression standard H.263 [5]. However, Vandergheynst and Frossard [27] have shown that anisotropic refinement atoms are more useful for describing edges and oriented contours in images than the Gabor dictionary. Such dictionaries should have parameters that allow translation, rotation and anisotropic scaling in both the x and y directions. They have proposed a dictionary of combinations of Gaussians and their second derivatives:

g(x, y) = (4x^2 - 2) \, e^{-(x^2 + y^2)}    (2-18)

with

\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} \cos\theta_i & \sin\theta_i \\ -\sin\theta_i & \cos\theta_i \end{bmatrix} \begin{bmatrix} (\tilde{x} - p_{x_i}) / s_{x_i} \\ (\tilde{y} - p_{y_i}) / s_{y_i} \end{bmatrix}

where (p_{x_i}, p_{y_i}) is the translation, θ_i the rotation and (s_{x_i}, s_{y_i}) the anisotropic scales. Similar dictionaries have also been used in [27], [28] and [29].

The family of Gaussian derivatives is also a useful dictionary that has been successfully applied to image approximation [25]. It is a one-dimensional basis of Gaussian derivatives of order n at scale σ and location μ. For γ = (n, σ, μ), the Gaussian dictionary of derivatives is defined as:

g_\gamma(t) = K_\gamma \, \frac{d^n}{dt^n} \, e^{-(t - \mu)^2 / 2\sigma^2}    (2-19)

Like the separable 2-D Gabor dictionary, a separable 2-D Gaussian dictionary can also be built:

G_{\gamma_1, \gamma_2}(x, y) = g_{\gamma_1}(x) \, g_{\gamma_2}(y)    (2-20)


Problem-domain-specific parametric dictionaries have also been proposed, for example in [30] and [31]. These dictionaries have been specifically designed to approximate the waveforms scattered from the targets of interest.

Although parametric dictionaries may have a storage advantage, it is hard to come up with a well-suited parametric dictionary for all sorts of data, particularly for classification problems. Therefore, the MP dictionary may instead be learned using the training data. In the following section, we discuss dictionary learning methods for the MP algorithm.

A state-of-the-art dictionary learning method is the K-SVD algorithm [7]. It treats the dictionary elements g_j as cluster centers and the coefficients w_{ij} as the membership of a signal x_i in the cluster g_j. K-SVD minimizes the following:

\min_{D, W} \{ \| X - D W \|_F^2 \}    (2-21)

subject to \| w_i \|_0 \le T_0, for i ∈ {1, ..., N}.


Here \| A \|_F = \sqrt{ \sum_{ij} A_{ij}^2 } denotes the Frobenius norm.

Like K-means, K-SVD uses a two-phase approach to update the values of W and D. In the first phase, the dictionary coefficients W are updated using MP. In the second phase, W is assumed to be fixed and only one column g_k of D is updated at a time. Let the kth row in W, which gets multiplied with g_k, be denoted by w^k. Then g_k w^k can be separated from Equation 2-21 as follows:

\| X - D W \|_F^2 = \Big\| \Big( X - \sum_{j \ne k} g_j w^j \Big) - g_k w^k \Big\|_F^2 = \| E_k - g_k w^k \|_F^2    (2-22)

where E_k = (X - \sum_{j \ne k} g_j w^j) is the approximation error of all the x_i when the kth atom is removed. The Singular Value Decomposition (SVD) of E_k will produce the closest rank-1 matrix that minimizes the above error. After removing the columns from E_k that do not use g_k, the SVD of E_k yields E_k = U Δ V^T. Then g_k is replaced with the first column of U, and w^k with the first column of V scaled by the largest singular value Δ(1,1). All dictionary elements g_j are updated using the same method. Iterating through the two phases of K-SVD produces a dictionary that approximates the given x_i sparsely and accurately. K-SVD is an excellent state-of-the-art dictionary learning method that has been shown to give better dictionary learning performance than other existing methods. However, the drawback of K-SVD is that the total number of elements K is chosen heuristically by human interpretation. Therefore, the problem of training a good dictionary using K-SVD boils down to the cumbersome process of manually selecting a good value of K.

Another clustering-based dictionary learning algorithm has been presented by Schmid-Saugeon and Zakhor [32]. Like K-SVD, it also treats the dictionary members to be learned as cluster centers and optimizes the following distortion measure between a normalized training pattern x_i and the dictionary member g_j:

d(x_i, g_j) = 1 - \langle x_i, g_j \rangle^2    (2-23)
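The K-SVD atom-update step described above can be sketched as follows. This is an illustrative sketch (variable names and the in-place update are assumptions, not the reference implementation): only the signals whose sparse code actually uses atom k enter the error matrix E_k, and the atom and its coefficient row are replaced by the leading rank-1 SVD factors.

```python
import numpy as np

def ksvd_update_atom(X, D, W, k):
    """One K-SVD atom update: form E_k with atom k removed, restricted to
    the signals that use atom k, and replace (g_k, w^k) with the best
    rank-1 approximation of E_k given by the SVD (Equation 2-22)."""
    users = np.nonzero(W[k, :])[0]       # signals whose code uses g_k
    if users.size == 0:
        return D, W                      # unused atom: nothing to update
    E_k = X[:, users] - D @ W[:, users] + np.outer(D[:, k], W[k, users])
    U, S, Vt = np.linalg.svd(E_k, full_matrices=False)
    D[:, k] = U[:, 0]                    # new unit-norm atom
    W[k, users] = S[0] * Vt[0, :]        # coefficient row, scaled by S[0]
    return D, W
```

Because the previous (g_k, w^k) pair is itself a rank-1 candidate, this update can never increase the approximation error, which is why alternating it with the sparse-coding phase converges.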


X_{j,k} = \{ x_i \in X : d(x_i, g_{j,k}) \le d(x_i, g_{l,k}), \; \forall l \in \{1, \ldots, M\} \}    (2-24)

where all partitions are disjoint sets and the union of these partitions is equal to X. The dictionary element g_j is updated by minimizing the total weighted distortion in the set X_{j,k}. For this purpose, define two sets X_{j,k}^{(+)} and X_{j,k}^{(-)} of patterns having positive and negative inner products with g_j respectively, and let \tilde{x}_i represent the energy of each pattern x_i. The update for g_{j,k+1} is then given in terms of the energy-weighted members of X_{j,k}^{(+)} and X_{j,k}^{(-)}; the derivation of this update (Equation 2-25) can be found in [32]. Like K-SVD, this method also learns a pre-defined number of dictionary vectors. But for practical purposes, a dictionary pruning method is also used, where the least frequently used or highly correlated dictionary elements are excluded from the dictionary.

A probabilistic approach to dictionary learning has also been investigated [33–36]. We want to match the probability distribution of our input signals given the set of dictionary elements, p(x_i | D), as closely as possible to the probability distribution of the input signals, p(x_i). For a given set of input signals x_i and the dictionary D, there can be infinitely many ways to choose the coefficients w_j^{(x_i)}. The choice of these coefficients determines the sparseness as well as the accuracy of the solution. If we generate signals x_i stochastically by drawing each w_j^{(x_i)} independently from some distribution, the probability distribution of the generated signals will be:

P(x_i \mid D) = \int P(x_i \mid W, D) \, P(W) \, dW    (2-26)


From Equation 2-6 we know that x_i can be written as the sum of a linear combination of elements from the dictionary D and the residue R_p^{(x_i)}. If we assume the residue R_p^{(x_i)} to be Gaussian noise with variance σ², then the probability of the signals x_i arising from a particular choice of coefficients w_j^{(x_i)} and a given dictionary D can be written as:

P(X \mid W, D) = \frac{1}{Z} \, e^{- \frac{ | X - D W |^2 }{ 2 \sigma^2 } }    (2-27)

where Z is a normalization constant and |X - DW|^2 denotes the squared sum over all the residues, \sum_{i=1}^{N} \big[ x_i - \sum_{j=1}^{M} w_j^{(x_i)} g_j \big]^2.

Since all the coefficients w_j^{(x_i)} are assumed to be drawn independently, the probability P(W) is given by the product of the individual probabilities p(w_j^{(x_i)}). The distribution p(w_j^{(x_i)}) should be sparsity-promoting: most of the coefficients in the approximation of x_i should be zero or close to zero. Thus the probability distribution of the activity of each w_j^{(x_i)} should be a uni-modal distribution peaked at zero:

p(w_j^{(x_i)}) = \frac{1}{Z_\beta} \, e^{- \beta S( w_j^{(x_i)} / \sigma_w ) }    (2-28)

where Z_β and β are constants and σ_w is a scaling parameter. The function S(t) is a suitable sparsity-promoting prior [33–36].

Given the probabilistic framework of Equation 2-26, our goal is to find a set of dictionary elements D such that:

D^* = \arg\max_D P(X \mid D)    (2-29)

However, the above computation requires the integration of P(X | D) over all values of W, which is in general intractable. Therefore, we evaluate P(X | W, D) only at the maximum values of W, and the process of finding D becomes a two-step maximization:

D^* = \arg\max_D \max_W P(X \mid W, D) \, P(W)    (2-30)


If we define J(X; W, D) = \log P(X \mid W, D) P(W), then we have:

J(X; W, D) = - \frac{1}{2\sigma^2} | X - D W |^2 - \beta \sum_{i=1}^{N} \sum_{j=1}^{M} S\!\left( \frac{ w_j^{(x_i)} }{ \sigma_w } \right)    (2-31)

Maximizing J is then equivalent to the minimization:

\min_{W, D} \sum_{i=1}^{N} \Big[ x_i - \sum_{j=1}^{M} w_j^{(x_i)} g_j \Big]^2 + \lambda \sum_{i=1}^{N} \sum_{j=1}^{M} S\!\left( w_j^{(x_i)} \right)    (2-32)

where λ is a regularization constant that controls the sparsity of the solution. Equation 2-32 is an unconstrained minimization problem and the solution can be obtained by differentiating J(X; W, D) with respect to the coefficients w_t^{(x_s)} and by gradient descent over each dictionary element g_t [33].

A convex optimization approach uses an objective similar to Equation 2-32, minimizing the sum of squared residues together with a sparsity constraint on the dictionary coefficients [37]. The only difference is that the convex optimization formulation imposes an additional constraint on the vectors in D to have a norm equal to some constant c:

\min_{W, D} \sum_{i=1}^{N} \Big[ x_i - \sum_{j=1}^{M} w_j^{(x_i)} g_j \Big]^2 + \lambda \sum_{i=1}^{N} \sum_{j=1}^{M} S\!\left( w_j^{(x_i)} \right)    (2-33)

subject to \sum_{i=1}^{N} D_{i,j}^2 \le c, \; \forall j = 1, \ldots, M.

The optimization problem is convex in D while holding W fixed and convex in W while holding D fixed, but it is not convex in both simultaneously. In Lee et al. [37], the above objective is optimized by alternately optimizing with respect to D and W while holding the other fixed. For learning the dictionary D, the optimization problem is a least-squares problem with quadratic constraints. There are several approaches to solving this problem, such as generic convex optimization solvers as well as gradient descent using iterative projections [14]. For learning the coefficients W, the optimization problem can similarly be handled with standard convex optimization techniques [14].


A problem-domain-specific dictionary learning algorithm is presented by Dangwei et al. [38], which deals with learning dictionaries for data scattered from various targets. For each type of target, a separate dictionary is learned by first calculating the representation error:

E = X_m - D_m A    (2-34)

where the subscript m represents the target class m. The dictionary D_m is updated by finding the perturbation ΔD_m that accounts for the residual error:

(D_m + \Delta D_m) A = X_m \;\Rightarrow\; \Delta D_m A = E \;\Rightarrow\; \Delta D_m = E A^H (A A^H)^{-1}

The dictionary is iteratively updated till a stopping criterion is met. Since this update method involves matrix inversions, it is not suitable for very large training data sets and dictionaries.


Redmill et al. [39] introduced a dictionary learning method where each element in the learned dictionary D is composed of a weighted sum of functions from a simpler elementary dictionary S. This structure makes it possible to compute inner products of target signals x_i with the dictionary D as weighted summations of the elementary inner products with the dictionary S. Suppose g_j ∈ D and s_1, s_2 ∈ S are defined such that g_j = c_1 s_1 + c_2 s_2. Then the inner product of g_j with x_i can be written as:

\langle x_i, g_j \rangle = c_1 \langle x_i, s_1 \rangle + c_2 \langle x_i, s_2 \rangle

Therefore, the inner product computations can be implemented as fast two-stage filtering structures with weights c_k connecting each dictionary member in D to vectors in S. However, this technique adds to the storage requirement of the algorithm, as now we also need to store the dictionary S and its coefficients.

Neff and Zakhor [40] extended the idea of Redmill et al. They assumed that the dictionary D is trained using some dictionary training algorithm and is optimized to give good approximations for the given data set. To make D more computationally feasible, they approximated the elements of D using a simpler dictionary S up to a user-defined approximation accuracy parameter. The MP algorithm is used to approximate each member of D using the dictionary S, and the coefficients are stored. This also gives rise to a two-stage approximation system which can be implemented in an efficient way as described in Redmill et al. [39]. By varying the complexity of the approximation of D, a tradeoff between the coding efficiency and the complexity of the resulting matching pursuit encoder can be achieved.

Another two-stage dictionary learning and fast computation model has been presented by Lin et al. [41]. Similar to the approach of Neff and Zakhor [40], the dictionary D is assumed to be already optimized by some other dictionary learning method. Principal Component Analysis (PCA) is applied to the elements of D and the top K eigenvectors corresponding to the K largest eigenvalues are chosen to represent D.


The resulting reduced representation plays a role similar to the tree-structured dictionary introduced in [21] and discussed in Section 2.1.5.

Another slightly different dictionary learning algorithm using sub-dictionaries is presented by Lesage et al. [42]. They consider the dictionary as a union of orthonormal bases:

D = [D_1, D_2, \ldots, D_L]

where the D_j are orthonormal bases.

As discussed in Section 2.1, the dictionaries used for MP approximations are an overcomplete set of vectors in a Hilbert space. Overcompleteness of a set means that it has more members than the dimensionality of its members (i.e., M > n). The advantage of overcompleteness of a dictionary is its robustness in the case of noisy or degraded signals. Also, it introduces a greater variety of shapes in the dictionary, thus leading to sparser representations of a variety of input signals.

Overcompleteness of dictionaries for sparse representations is certainly desirable. However, there are no set guidelines about choosing the optimal size of a dictionary. For example, for n-dimensional input signals, both a size n+1 and a size 2n dictionary may be considered overcomplete. A bigger dictionary may seem to give more variety of shapes, but it also adds to the approximation time. Also, bigger may not always be better, as the dictionary can contain some similar-looking elements or some elements that are seldom used for representation. Excluding such elements can enhance the encoding speed of the dictionary but will not compromise its approximation accuracy.

The importance of a dictionary member can be judged on two factors, the first being its frequency of usage, or how many times a dictionary member g_j has been used in the approximation of signals x_i from the given data set X. If g_j is seldom or never used, it may be excluded without much loss. The second factor is how different g_j is from the other members of the dictionary. If there is some dictionary element g_k which closely resembles g_j, it can be used in place of g_j for approximations. Therefore, g_j can be excluded from the dictionary D.

Using the above factors, some heuristic dictionary pruning methods have been proposed in the literature. For example, Schmid-Saugeon and Zakhor [32] proposed a clustering-based dictionary learning algorithm; it is discussed in Section 2.2.2.


The authors of [44] excluded atoms based on usage frequency from the 2-D separable Gabor dictionary proposed by Neff and Zakhor [5]. In several methods [45, 46], a dictionary member is pruned if it can be expressed as a linear combination of other members of the dictionary.

Adopting a slightly different approach, instead of starting from a bigger dictionary and pruning its members, Monro [47] builds a dictionary from scratch based on the usefulness of the candidate elements. In fact, like MP, this basis-picking method is also greedy in nature. Every member of the training data set is approximated with only one element from a pool of candidate dictionary elements of the Gabor dictionary. Whichever element gives the overall best approximation performance is chosen as the first dictionary member of the new dictionary. For choosing subsequent dictionary elements, the training data set is approximated using the elements of the new dictionary plus one element of the candidate dictionary at each iteration. Whichever candidate element gives the overall best performance is entered into the new dictionary. This process terminates when the desired dictionary size is achieved.

As noted earlier, all the dictionary pruning methods are based on usage frequency and similarity of dictionary members. However, all these methods are quite application-dependent heuristic methods. We have not come across a general dictionary pruning method with a sound theoretical foundation.
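The two pruning heuristics above (usage frequency and similarity to other atoms) can be sketched as follows. The thresholds min_usage and max_coherence are assumed values for illustration, not from any of the cited methods.

```python
import numpy as np

def prune_dictionary(D, usage_counts, min_usage=1, max_coherence=0.99):
    """Heuristic pruning sketch: drop atoms used fewer than min_usage
    times, then drop any atom that is nearly collinear with an atom
    already kept (|<g_j, g_k>| above max_coherence for unit-norm atoms)."""
    keep = [j for j in range(D.shape[1]) if usage_counts[j] >= min_usage]
    kept = []
    for j in keep:
        redundant = any(abs(D[:, j] @ D[:, k]) > max_coherence for k in kept)
        if not redundant:
            kept.append(j)
    return D[:, kept], kept
```

This kind of rule-based filter illustrates the application-dependent nature criticized above: both thresholds must be tuned by hand for each data set.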




The features extracted using the MP algorithm can be used to train a neural network classifier. For example, Herrero et al. [48] extract features from electrocardiogram (ECG) signals using MP and use them with the Multilayer Perceptron Network (MLP) [3] for heartbeat classification. For each class c_i, a projection sequence G_i is chosen that gives the best MP approximation of its class members. For a test point t, the MP coefficients W_i^{(t)} and residue R_i^{(t)} are found by projecting it onto each G_i. In order to discriminate between the various classes using this information, the W_i^{(t)} and R_i^{(t)} from all classes are concatenated together to build the feature vector. This feature vector is then fed into the MLP to determine the class label of t. Several authors [49, 50] use the Gabor dictionary (Equation 2-16) for MP feature extraction. Parameters of the dictionary elements in the projection sequence of the training signals and their corresponding coefficients are used to train the Radial Basis Function Network [3] for classification. Similarly, Krishnapuram et al. [51] use Gabor dictionary parameters and their coefficients as features with the Support Vector Machine [3] for object detection in images.

When temporal or spatial relationships exist between the observations of an event and can be expressed as a state sequence using the feature vector, Hidden Markov Models (HMM) [52] can be used for classification. Bharadwaj et al. [31] use the MP features with the HMM to identify targets based on their scattered data. Based on target complexity and sensor bandwidth, each target responds differently under various target-sensor orientations. Therefore, the sequence of responses for an object over various observations is characterized as the state model of the HMM. A feature vector for the data collected at each angle is built by concatenating the parameters of the dictionary elements and the coefficients chosen at each MP iteration. These feature vectors are then used to train HMMs for each target type. The sequence of responses for a test pattern is shown to each HMM.


Several related methods [53–55] also use MP as a feature extractor and an HMM for classification and object recognition.

Other classifiers that have been used with the MP feature extraction method include stepwise logistic regression analysis [56] and the adaptive neuro-fuzzy inference system [57].

Although the dimensionality of the features generated using the MP algorithm is lower than that of the input signals, their dimensionality can be further reduced using Linear Discriminant Analysis (LDA) [3]. This approach has been used to map the MP feature vectors into lower dimensions where linear boundaries can be drawn between classes for classification [58, 59].

Kernel Matching Pursuits (KMP) extends the MP algorithm to learn classification functions directly [60]. Given the training set X = {x_1, x_2, ..., x_N} and corresponding class labels Y = {y_1, y_2, ..., y_N}, machine-learning applications try to learn the function f(x) such that for each training pattern x_i, f(x_i) = y_i. Therefore, when presented with a test point x_t, the function f(x_t) tries to predict its class label y_t. To learn f(x), kernel-based learning algorithms represent f(x) as a linear combination of terms K(x, x_i), where K is a symmetric positive definite kernel function. Since the MP algorithm also writes a signal as a linear combination of elements from a dictionary, we can learn the function f(x) using the MP algorithm, using the terms K(x, x_i) as our dictionary. The function learned in this way essentially has the same form as a Support Vector Machine (SVM) [3], which is a popular kernel-based classifier. However, the sparsity of the solutions found by SVMs is not controllable and these solutions are often not very sparse. On the other hand, the MP algorithm allows direct control over the sparsity of the solution.


After p iterations, the function learned by KMP has the form:

\hat{f}_N(x) = \sum_{j=0}^{p-1} w_j K(x, x_j) + b    (2-42)

where j represents the index of the training pattern whose corresponding kernel function K(x, x_j) was chosen as the dictionary element at the jth iteration of the KMP. The constant b is called the bias term, which may be included to offset the approximated function \hat{f}_N(x). Note that b is not the pth residue, as here we are only dealing with the approximated function \hat{f}_N(x) and not the original function f(x). The choice of the kernel function K is usually problem-specific. Two commonly used kernel functions are the Gaussian kernel function and the polynomial kernel function [3].

Since KMP is an extension of MP designed specifically for classification, it finds application in various problem domains. For example, a classification-oriented image compression technique has been presented by Chang and Carin [61]. The wavelet-based set partitioning in hierarchical trees (SPIHT) image compression technique tries to minimize the mean squared error (MSE) between the original and the decoded image. However, the wavelet coefficients chosen by the MSE criteria may not be suitable if the decoded image will be used for classification or image recognition purposes. Therefore, Chang and Carin first ranked the wavelet coefficients using KMP to choose coefficients useful for discrimination purposes before compressing them using SPIHT. Similarly, Stack et al. [62] use the KMP algorithm to rank features to choose an optimal set for classification.

Zhang et al. [63] have used KMP for the detection of buried unexploded ordnance (UXO) targets. They build the Fisher information matrix of the training data X to choose a subset X_S that is most informative in characterizing the distribution of the target test site and whose members can be used to build the kernel dictionary.
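The greedy fitting loop behind Equation 2-42 can be sketched as follows. This is a toy sketch under several assumptions: squared-loss fitting, no bias term b, a Gaussian kernel with an assumed width parameter, and illustrative variable names.

```python
import numpy as np

def kernel_mp_fit(X, y, p, gamma=1.0):
    """KMP sketch: treat the kernel columns K(., x_i) as dictionary atoms
    and greedily fit the label vector y with p of them."""
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    K = np.exp(-gamma * sq)                       # N x N Gaussian kernel
    residue = y.astype(float).copy()
    chosen, weights = [], []
    for _ in range(p):
        # pick the kernel column most correlated with the residue
        scores = (K.T @ residue) / np.linalg.norm(K, axis=0)
        j = int(np.argmax(np.abs(scores)))
        g = K[:, j]
        w = (g @ residue) / (g @ g)               # least-squares coefficient
        chosen.append(j)
        weights.append(w)
        residue -= w * g                          # update the fit residual
    return chosen, weights, residue
```

Unlike an SVM, the number of kernel terms in the final decision function is fixed in advance by p, which is the direct sparsity control mentioned above.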


The idea of choosing a subset of the training data for building the dictionary and \hat{f}_N(x) is quite useful when the training data set is large and using the whole data set would considerably increase the computation time without adding to the classification fidelity. Therefore, Popovici et al. [64] have introduced a stochastic version of KMP. Assuming a probability distribution over the training patterns used to build the kernel dictionary, at each iteration j the dictionary is first randomly sampled to build a subset D_S from the original dictionary D. The selection of the dictionary element K(x, x_j) is then made from D_S instead of the whole dictionary D. This gives rise to a weak greedy algorithm, but also considerably speeds up the computations.

Other applications of KMP include detecting airports in aerial optical imagery [65] and simultaneously learning multiple classifiers [66].

The authors of [67] have presented a theoretical framework for signal classification that combines the objective function of LDA with sparse representation. Let X be a matrix containing all input signals x_i, D the matrix of all dictionary elements, W the matrix of coefficients of X corresponding to D, and w^{(x_i)} a vector of coefficients for x_i. Each x_i also belongs to a class C_j, where 1 ≤ j ≤ C, and N_j denotes the number of samples belonging to class C_j. The objective function J of Equation 2-43 then combines the standard Fisher discriminant, computed on the dictionary coefficients, with the sparse approximation objective, weighted by regularization constants λ_1 and λ_2.


The first term in J is the standard Fisher discriminant that maximizes the ratio of the between-class scatter and the within-class scatter. The mean m_j is defined as:

m_j = \frac{1}{N_j} \sum_{x_i \in C_j} w^{(x_i)}    (2-44)

Note that the mean is over the dictionary coefficients of the input signals instead of the actual signals. Similarly, the variance s_j^2 is defined as:

s_j^2 = \sum_{x_i \in C_j} \left\| w^{(x_i)} - m_j \right\|^2    (2-45)

Therefore, the first term in Equation 2-43 tries to maximize the separation between the MP coefficients of patterns belonging to different classes. The second and third terms make up the standard sparse approximation objective function with a sparsity constraint on the number of non-zero elements in w^{(x_i)}. With a suitable choice of λ_1 and λ_2, the solution of Equation 2-43 gives sparse dictionary coefficients that maximize the separation between the various classes.


A model-based approach is presented in [25], where the authors are concerned with the object and pose recognition problem. Each object in class d is parameterized by three variables: the row index x, the column index y and the orientation θ. The image is represented as f^{(d)}(x, y, θ). At the coarse recognition level, the task is to identify the object class d whose model best matches a test image f(x, y) (Equation 2-46), and at the second level, the task is to identify the orientation θ that best aligns the model with the test image (Equation 2-47). These two tasks are quite similar, as both require comparisons between f^{(d)}(x, y, θ) and f(x, y). The model is built for f^{(d)}(x, y, θ) by approximating it using the MP algorithm and storing the chosen sequence of dictionary elements and the coefficient information. The Gaussian derivatives dictionary (Equation 2-19) is used for this purpose. The test image f(x, y) is compared with f^{(d)}(x, y, θ) by projecting it on the sequence of dictionary elements of its model and comparing the corresponding coefficients. In order to efficiently implement this recognition system, the Gaussian derivative dictionary is approximated by cubic B-splines. This approximation reduces Equations 2-46 and 2-47 to the problem of solving polynomials of degree six, which can be implemented efficiently [25].


Class-specific dictionaries have been used for target detection in images [28, 68]. The authors use the anisotropic refinement atoms (Equation 2-18) to generate a separate dictionary for each target type. The MP approximation for the representative image of each target class is found using the anisotropic refinement dictionary, choosing a subset O_{c_i} of dictionary elements to represent the target class c_i. Keeping the intrinsic relations of O_{c_i} intact, its scaled and rotated versions are introduced into the dictionary corresponding to the class c_i. Using these flexible class-specific dictionaries, the rotated and scaled versions of each target c_i can be found in a test image. A similar approach is also used by Dangwei et al. [38], where a separate dictionary is trained for each class. Their dictionary learning algorithm is discussed in Section 2.2.2 under the heading Convex Optimization Approach.

Matching pursuits has also been used to build filters for locating objects of interest in images [69]. The matching pursuit filters (MPF) are built by finding the MP decomposition of the object of interest in the sample image. The dictionary elements, their relative positions with respect to an origin and the corresponding coefficients are stored for an MPF. For locating an object in an image, the MPF can be compared in two ways with the test image. In the first method, the MPF can be correlated with every pixel location in the image like an ordinary filter. In the second approach, the image can be projected onto the dictionary elements of the MPF and the corresponding coefficients can be compared to ascertain the presence of the object of interest. Care must be taken in choosing the total number p of dictionary elements used to build the MPF, as using too many coefficients will record too much information or even noise in the MPF that may not generalize well to all instances of the object. On the other hand, choosing p to be very small will not record enough information about the object of interest in the filter to be able to differentiate it from other objects in the image.

If there is more than one sample for the object of interest, the MPF coefficients need to be learned by optimizing the coefficients of all these samples together.

The dictionary element chosen at iteration j+1 is the one that gives minimum variance of the coefficients from all x_i around the mean when its inner product ⟨R(x)_{j+1}, g⟩ is concatenated to the end of the coefficient vector W(x_i)_j. The mean is simply the average of the concatenated vectors [W(x_i)_j, ⟨R(x_i)_{j+1}, g⟩] for i = 1 ... N. Similar to the single-training-image MPF, this multi-image MPF also consists of an ordered list of p dictionary elements and their corresponding coefficients. The identification of an object in the test image is carried out in the same way as with a single-image MPF. MPF filters have been applied to face recognition [70, 71] and have been used to interpret road signs from images [72].

Although MPFs have been shown useful for face [70, 71] and road sign [72] recognition, the dictionary element choosing procedure employed by multi-image templates may not produce the best filters for the objects of interest. Even if all the training images have been centered and normalized, they may still have differences in their shape and intensity variations. Therefore, a dictionary element chosen for one image at a particular location may not be the most suitable for another image. Hence, the dictionary element chosen by minimizing the variance of all the coefficients may not have been the preferred choice for each individual image. Such a choice of dictionary elements can not only lead to suboptimal MP approximations, but can also produce false artifacts in images after subtraction of the dictionary element from the current residue. Therefore, instead of trying to optimize MP

filters over multiple images [70], taking the mean image of all the training images and finding an MPF from it using the single-image method may produce better filters.
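The variance-minimizing atom selection rule for multi-image MPFs described above can be sketched as follows. This is an illustrative sketch, not the cited implementation: atoms are assumed to be unit-norm rows of a matrix, with one current residue per training image.

```python
import numpy as np

def select_atom_min_variance(residues, dictionary):
    """Pick the dictionary atom whose inner products with the current
    residues of ALL training images have minimum variance, so the chosen
    coefficient is consistent across images.

    residues:   (N, D) array, one current residue per training image
    dictionary: (K, D) array of unit-norm atoms
    """
    coeffs = dictionary @ residues.T        # coeffs[k, i] = <residue_i, atom_k>
    variances = coeffs.var(axis=1)          # spread of each atom's coefficients
    best = int(np.argmin(variances))
    return best, coeffs[best]
```

As the text notes, this joint choice can be suboptimal for any individual image, which motivates the mean-image alternative.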

A model-based classification method for very high-dimensional signals is presented which is capable of detecting outliers in the test data. This method achieves the following:

The method is based on a modification of the matching pursuits (MP) algorithm, commonly used for signal approximation and representation, for classification purposes, coupled with a novel dissimilarity measure and competitive agglomeration (CA) clustering. The biggest strength of the MP algorithm is that, given a dictionary, it is a general method that can be used to approximate signals of any type without modifying the basic algorithm. As discussed in Chapter 2, the matching pursuits algorithm has previously been used in some classification applications. However, all these applications are strictly bound to their respective problem domains, thus making their generalization difficult. Therefore, we devise a generalized matching pursuits dissimilarity measure (MPDM) that requires no assumptions about the data.

The MP dictionary is the basic ingredient of the feature extraction process. However, it is bound to domain knowledge. Therefore, we automate the dictionary learning process. For this purpose, we generalize the state-of-the-art dictionary learning algorithm K-SVD [7]. Using the MPDM and the competitive agglomeration (CA) clustering algorithm, our enhanced K-SVD algorithm automates the dictionary learning process by discovering the total number of required dictionary elements during training. We call this algorithm the Enhanced K-SVD (EK-SVD) algorithm.

In the end, we adopt a Bayesian approach to model-based classification. This approach uses the MPDM for shape-based comparisons with the CA clustering algorithm to build models for classification. We call this algorithm CAMP, as an abbreviation of CA and MPDM.

We discuss our methods in detail in the following sections.

Consider Figure 3-1. Fig. 3-1-A shows two dictionary elements chosen from the parametric Gaussian dictionary (Equation 2-19). By varying the coefficients of g1 and g2, signals of various different shapes can be constructed, as shown in Fig. 3-1-B. Note that both x1 and x2 in Fig. 3-1-B have zero residues, as they were constructed as linear combinations of g1 and g2. However, the signals shown in Fig. 3-1-C have very similar MP coefficient vectors, but their residues are very different (W(x1, G(x1)) = (13, 5), ||R(x1, G(x1))|| = 0.25, W(x2, G(x2)) = (13, 2) and ||R(x2, G(x2))|| = 83.66).

Therefore, in order to compare two signals x1 and x2, we need to compare all three factors. This can be done by projecting x1 onto the projection sequence G(x2) of x2 and noting the corresponding coefficient vector W(x1, G(x2)) and the residue R(x1, G(x2)). Based on these factors, the matching pursuits dissimilarity measure (MPDM) is defined as:

δ(x1, x2) = β D_R(x1, x2) + (1 − β) D_W(x1, x2)   (3-1)

where D_R(x1, x2) compares the residues R(x1, G(x2)) and R(x2, G(x2)) of the two signals,

and D_W(x1, x2) compares their corresponding MP coefficients similarly; β determines the relative importance of D_R(x1, x2) and D_W(x1, x2) in Equation 3-1.

The MPDM compares two signals by projecting them onto the subspace defined by G(x2). The value of the residue term D_R(x1, x2) compares the distance of x1 and x2 from this subspace. The value of the coefficient term D_W(x1, x2) compares the distance between the projections of x1 and x2 within the subspace G(x2). Note that the projection of x1 and x2 onto the subspace of G(x2) is not necessarily orthogonal unless the orthogonal matching pursuit or some other explicit orthogonalization method is used. However, if the dictionary forms an orthogonal basis and all its elements are used in the approximation (i.e., p = M), then the approximation is exact. But usually in applications where the MPDM is used, the signals or images are very high-dimensional (hundreds or thousands of dimensions) and the dictionaries are not orthogonal. Also, for ease of computation, the number of dictionary elements used is much smaller than the size of the dictionary (i.e., p << M).
Figure 3-2 illustrates the role of β in determining the type of comparison being made using the MPDM.

The symmetric MPDM is defined as:

δ̂(x1, x2) = (1/2)(δ(x1, x2) + δ(x2, x1))   (3-4)

A metric on a set X is a function d: X × X → ℝ which satisfies the following four conditions for all x1, x2, x3 in X:

If a dissimilarity measure satisfies only the first three conditions, it is called a pseudometric. The symmetric MPDM defined in Equation 3-4 satisfies the first two conditions by construction. The third condition is satisfied when β ∈ (0, 1). However, it may not

always hold for the boundary values of β. Figure 3-1 shows one such example where the coefficients of x1 and x2 are quite similar. Also, in a very unlikely situation, it may be possible for two signals with different projection sequences and different coefficient vectors to have the same residues. Therefore, the third condition may not always hold when β = 1.

For β ∈ (0, 1), the third condition of being a pseudometric can be proved by contradiction. Suppose that x1 ≠ x2 but δ̂(x1, x2) = 0, which also implies that δ(x1, x2) = 0 and δ(x2, x1) = 0. If δ(x1, x2) = 0, it means that the residues R(x1, G(x2)) and R(x2, G(x2)) are equal, and also that the coefficients W(x1, G(x2)) and W(x2, G(x2)) are equal to each other, respectively. Since the residues of two different signals can be the same, we will only investigate the equality of W(x1, G(x2)) and W(x2, G(x2)). Note that in the MPDM we compare these coefficient vectors using the Euclidean distance, which is a metric and satisfies condition 3. Therefore, W(x1, G(x2)) and W(x2, G(x2)) have to be the same for δ(x1, x2) to be equal to 0. Now, W(x1, G(x2)) and W(x2, G(x2)) were obtained by projecting both x1 and x2 on the same projection sequence G(x2). However, the iterative greedy projection method of MP ensures that each signal x has a unique projection sequence G(x), coefficient vector W(x, G(x)) and residue R(x, G(x)). Therefore, if the projection sequence, residue and coefficient vector of x1 and x2 are exactly the same, it must be that x1 = x2. This contradicts our assumption. Therefore, δ̂(x1, x2) is a pseudometric.

A preliminary dictionary learning method [73]: the high-dimensional training signals are segmented based on their zero crossings. The dictionary obtained in this manner may have many elements of similar shape with different displacements. Therefore, the dictionary

is refined by clustering [74]. Clustering is made invariant to signal displacement by clustering the power spectra of the signals using the Euclidean norm. The cluster centers of all clusters are then used as the elements of the dictionary. Since the shift information was removed from the dictionary, the correct shift of each dictionary element during an MP iteration can be found by correlating it with the current residue.

Unlike the compression applications of MP, where the dictionaries need to be highly redundant to give good approximation results, this dictionary can be very compact, as it is used for classification purposes. The dictionary only needs to be expressive enough to approximate signals reasonably well for classification and differentiation purposes; it is not necessarily required to give accurate approximations from the signal reconstruction standpoint. Therefore, the emphasis is on making the dictionary quite compact, as this gives a speed advantage during computations.

The above method summarizes the use of an automated dictionary learning method for a classification application. However, it is a sequential-clustering based method. The advantage of using sequential clustering is that it discovers the required number of clusters, thus eliminating the need to specify the total number of clusters beforehand. But its biggest disadvantage is its sensitivity to initialization and to the order in which the data is presented. Therefore, sequential clustering may fail to produce a globally optimal partitioning of the data.

As discussed in Section 2.2.2, the dictionary learning problem can be interpreted as a clustering problem. Each dictionary element is treated as a cluster center c_j, while the coefficients of each signal x_i corresponding to c_j are treated as its membership u_ij in the jth cluster. Usually the dictionary learning problem is interpreted as a hard clustering problem, where each x_i can be a member of only one cluster at a time. The most notable of such algorithms is

K-SVD [7]. However, while treating dictionary learning as a clustering problem, each signal x_i should be allowed membership in p clusters, where p is the total number of dictionary elements used for the MP approximation of x_i. Therefore, it is more logical to treat the dictionary learning problem as a fuzzy clustering problem instead of a hard clustering problem.

In the following subsection, we formally establish the relationship between the matching pursuits and fuzzy clustering algorithms. This relationship will help us to develop a fuzzy clustering based dictionary learning algorithm in the subsequent subsection.

[74]. Given the data set X = {x_i, i = 1, ..., N} and the cluster centers C = {c_j, j = 1, ..., M}, the fuzzy C-means (FCM) algorithm minimizes the following cost function:

J = Σ_{i=1}^N Σ_{j=1}^M u_ij^m d²(x_i, c_j)   (3-5)

subject to

Σ_{j=1}^M u_ij = 1, for i = 1, ..., N   (3-6)

where d²(x_i, c_j) is the squared distance or dissimilarity of x_i to the cluster center c_j, and u_ij ∈ [0, 1] represents the degree of membership of x_i in the cluster represented by c_j.

In sparse signal approximation algorithms like MP, given a dictionary D = {g_j, j = 1, ..., M}, p dictionary members take part in the sparse representation of a signal x_i. Therefore, at any given time, a signal x_i is similar to p ≥ 1 dictionary members. This concept is similar to fuzzy clustering. In fact, a direct correspondence between the fuzzy clustering memberships u_ij and the dictionary coefficients w_ij for MP can be established. The MP approximation of a signal x_i conserves its overall energy between the chosen dictionary

elements and the residue. From Equation 2-7 we have:

||x_i||² = Σ_{j=1}^M w_ij² + ||R(x_i)||²

or, dividing through by ||x_i||² and summing over all dictionary elements, we get:

Σ_{j=1}^M w_ij² / ||x_i||² = 1 − R_i   (3-8)

where only the p dictionary coefficients corresponding to the MP approximation of x_i are non-zero and R_i = ||R(x_i)||² / ||x_i||² is the normalized residue of x_i.

The relation in Equation 3-8 is similar to the fuzzy clustering constraint in Equation 3-6, with the additional term R_i. Therefore, assuming the dictionary elements g_j to be cluster centers and the normalized dictionary coefficients u_ij = w_ij² / ||x_i||² in Equation 3-8 to be the memberships of x_i with g_j, the simultaneous sparse representation and dictionary learning problem can be framed as a fuzzy clustering problem as follows:

min Σ_{i=1}^N Σ_{j=1}^M u_ij² δ²(x_i, g_j)   (3-9)

subject to: Σ_{j=1}^M u_ij = 1 − R_i and ||u_i||_0 = p.

Therefore, the first constraint on Equation 3-9 is exactly Equation 3-8. The relation between u_ij and w_ij establishes a correspondence between the MP and the fuzzy clustering algorithms, allowing a fuzzy clustering problem to be interpreted as an MP problem and vice versa. To convert a fuzzy clustering problem into an MP problem, we use the fact

stated in Equation 3-10, w_ik = ±||x_i|| √u_ik. The sign of w_ik can be resolved by inspecting the sign of the inner product ⟨x_i, g_k⟩. Similarly, an MP problem can be converted into a fuzzy clustering problem by using Equation 3-10 and treating the dictionary elements as the fuzzy cluster centers.

The competitive agglomeration (CA) algorithm is discussed in Appendix A. Thus the automated dictionary design algorithm is defined by the following objective function:

J = Σ_{j=1}^M Σ_{i=1}^N u_ij² δ²(x_i, g_j) − α Σ_{j=1}^M [Σ_{i=1}^N u_ij]²   (3-11)

subject to Σ_{j=1}^M u_ij = 1 − R_i, for i ∈ {1, ..., N}, and ||u_i||_0 = p. Note that Equation 3-11 is the same as the objective function of CA, except for the constraint. R_i is the normalized residue of the signal x_i. The MPDM is used as the dissimilarity measure between x_i and g_j. Since g_j is a dictionary element, its projection sequence contains only itself (i.e., G(g_j) = {g_j}).

The values of u_ij are updated by minimizing Equation 3-11. For this purpose, we apply Lagrange multipliers to obtain:

L = Σ_{j=1}^M Σ_{i=1}^N u_ij² δ²(x_i, g_j) − α Σ_{j=1}^M [Σ_{i=1}^N u_ij]² + Σ_{i=1}^N λ_i (Σ_{j=1}^M u_ij − 1 + R_i)

Assuming D and X fixed, setting ∂L/∂u_st = 0 yields:

u_st = (2α N_t − λ_s) / (2 δ²(x_s, g_t))   (3-15)

where N_t = Σ_{i=1}^N u_it. The value of N_t is assumed not to change drastically over consecutive iterations; therefore, for computational ease, the value of N_t from the previous iteration is used. To solve for λ_s, apply the constraint in Equation 3-11 to Equation 3-15. By rearranging terms, λ_s is obtained (Equation 3-17), and substituting Equation 3-17 back into Equation 3-15 gives the membership update

in Equation 3-18. When the u^(Update) term dominates Equation 3-19, the value of u_ij is updated as usual; when the u^(Prune) term dominates, the value of u_ij may be increased or decreased based on the usage of g_j. The method used by Frigui and Krishnapuram [2] to calculate α is discussed in Appendix A.

The coefficients w_ij are found using the above modified CA algorithm. Once all u_ij have been updated, Equation 3-10 can be used to find the corresponding w_ij. The sign of w_ij can be resolved by inspecting the sign of the projection of x_i onto g_j. Given the coefficients w_ij, any dictionary learning algorithm can be used to update the dictionary elements g_j. We use the K-SVD algorithm, discussed in Section 2.2.2, to update g_j. Once the dictionary has been updated, the value of R_i is updated using the MP algorithm. Therefore, the dictionary learning is a three-fold optimization algorithm.

The average R_i can go up during an iteration where a few dictionary elements are dropped simultaneously, but it comes down in successive iterations. In the early pruning stages of the algorithm, the coefficients chosen by the CA algorithm may not be optimal in terms of sparse representation. But once the correct number of dictionary elements has been discovered, the coefficients w_ij can be chosen according to the strict sparsity constraint of the K-SVD algorithm using the standard MP algorithm.

The learning task for classification is to find the parameters for p(y|x). Let C = {c_1, ..., c_M} be a set of prototypes representing the data set X, where M << N.
Equation 3-20 can be expanded over the prototypes, or, by Bayes' rule, rewritten in terms of p(x|c_j). Assuming that all the prototypes are equally likely and ignoring the normalization constant in the denominator, p(y=1|x) is proportional to:

p(y=1|x) ∝ Σ_{j=1}^M p(x|c_j) p(y=1|c_j)   (3-23)

Similarly, the probability of x having the label y = 0 is given by:

p(y=0|x) ∝ Σ_{j=1}^M p(x|c_j) p(y=0|c_j)   (3-24)

Therefore, the training process requires estimation of the prototypes C = {c_j, j = 1, ..., M} and the associated probabilities p(y|c_j). The testing process (i.e., assigning a class label to a point x) then requires estimation of the probabilities p(x|c_j) and applying the Bayes rules of Equations 3-23 and 3-24. The idea is to use the unsupervised clusters C to build a fuzzy nearest-prototype based classification system using the MPDM, and to use the fuzzy membership of a point x in the cluster c_j as p(x|c_j) to assign a class label y to x. An equivalent fuzzy interpretation of the above equations can be found in Frigui et al. [75], where they treat the class prior p(y=1|c_j) as the fuzzy membership of the prototype c_j in class y. Here we are addressing the two-class problem, but the generalization to n classes is straightforward.
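The confidence computation of Equations 3-23 and 3-24 can be sketched as below, with the similarities p(x|c_j) supplied as precomputed values and deliberately left unnormalized so that outliers keep low values for both classes. The interface is an illustrative assumption, not the dissertation's implementation.

```python
import numpy as np

def class_confidences(sims_to_prototypes, proto_label_probs):
    """Unnormalized class confidences:
       p(y=1|x) ~ sum_j p(x|c_j) * p(y=1|c_j)
       p(y=0|x) ~ sum_j p(x|c_j) * p(y=0|c_j)

    sims_to_prototypes: p(x|c_j) for each prototype (a similarity)
    proto_label_probs:  p(y=1|c_j) for each prototype
    """
    s = np.asarray(sims_to_prototypes, dtype=float)
    p1 = np.asarray(proto_label_probs, dtype=float)
    return float(s @ p1), float(s @ (1.0 - p1))
```

A point far from every prototype gets small similarities and hence small confidences for both classes, which is what later enables outlier rejection.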

in Appendix A. Since we are dealing with high-dimensional data x_i, we will use the MPDM as the dissimilarity measure with the CA algorithm. Using the MPDM will give us meaningful shape-based comparisons between the cluster centers c_j and x_i. The MPDM is used as the dissimilarity measure in Equation A-1 to yield:

J = Σ_{j=1}^M Σ_{i=1}^N u_ij² δ²(x_i, c_j) − α Σ_{j=1}^M [Σ_{i=1}^N u_ij]²   (3-25)

subject to Σ_{j=1}^M u_ij = 1, for i ∈ {1, ..., N}.

By using the MPDM with CA, we get a three-phase optimization algorithm comprising: updating the cluster centers c_j, updating the MP approximations of the cluster centers ĉ_j(G(c_j)), and updating the memberships u_ij. We call this algorithm CAMP, an abbreviation of CA and MPDM.

In order to find an update equation for c_j, let us make the dependence of Equation 3-25 on c_j explicit. From Equation 2-6, we have R(x_i, G(x_i)) = x_i − x̂_i, where x̂_i is defined in Equation 2-1. Thus Equation 3-25 becomes Equation 3-26, where x̂_i(G(c_j)) ≡ x_i − R(x_i, G(c_j)) is the approximation of x_i when projected on the projection sequence of c_j. Let us assume that the memberships u_ij and the MP approximations ĉ_j(G(c_j)) are constants for this computation. Then, differentiating Equation 3-26 with

respect to c_t and setting the derivative to zero, rearranging terms gives an update equation for c_t (Equation 3-28), which we write more succinctly as:

c_t = ĉ_t(G(c_t)) + (μ_t − μ̃_t)   (3-30)

where μ_t and μ̃_t denote the weighted average of the original and the approximated signals in cluster t, respectively. Note that μ̃_t is not the projection of μ_t in the subspace G(c_t), but a weighted average of the projections x̂_i(G(c_t)). After c_t has been estimated, ĉ_t(G(c_t)) can be recomputed by calculating the matching pursuits decomposition of c_t over the dictionary D. The memberships u_ij can then be updated, and the three-phase alternating optimization proceeds in this fashion.

Once the cluster centers c_j have been found, the probabilities p(y|c_j) for each cluster represented by c_j can be assigned in many ways. However, since we are building separate models for both classes to compare with x, we cluster the samples from both classes separately using the CAMP algorithm. Now all the cluster centers c_j belonging to class 1 will have p(y=1|c_j) = 1 and p(y=0|c_j) = 0, and vice versa. Therefore, we need to adjust the probabilities for c_j based on its similarity to a c_l which belongs to the other class:

The function f(·) is a monotonically decreasing function used to convert the dissimilarity measure MPDM into a similarity measure. If both c_j and c_l have the same class label k, then c_j = c_l and ω_{c_j,k} will have the maximum value. On the other hand, when c_j and c_l have opposite labels, the value of ω_{c_j,k} will depend on their dissimilarity, as measured by the MPDM. Similar to Equation 3-31, an analogous expression (Equation 3-33) gives p(y=0|c_j).

The probabilities p(y=0|c_j) and p(y=1|c_j) can be interpreted as the fuzzy memberships of the prototype c_j in the classes y=0 and y=1, respectively. Equations 3-31 and 3-33 use the minimum distance between prototypes of both classes and fuzzy C-means based labelling to assign these memberships [75].

Once the probabilities p(y|c_j) have been updated for all c_j, the probability p(x|c_j) of a point x belonging to a cluster j can be calculated as a similarity between x and c_j (Equation 3-34).

Since we are anticipating the presence of outliers in the test data, we will not normalize Equation 3-34 over all c_j, because normalization would remove the notion of distance from p(x|c_j). Suppose x_1 is an outlier and x_2 is not. Then, regardless of their actual distance from the prototypes, p(x_1|c_j) = p(x_2|c_j) if both x_1 and x_2 lie at the same relative distance from all the c_j (Figure 8 in [76]). Therefore, in order to be able to detect outliers in the test data, we use the absolute similarity of x to c_j in Equation 3-34.

Given c_j, p(y|c_j) and p(x|c_j), the relations defined in Equations 3-23 and 3-24 can be used to assign p(y=1|x) and p(y=0|x) to x, respectively. If both p(y=1|x) and

p(y=0|x) are low, x can be identified as an outlier; otherwise, a class label is assigned using the Bayes decision rule of Equation 3-20.

Therefore, given a labeled data set X, the following steps summarize the process of training a CAMP classifier on X:

1. Build the prototypes c_j that represent the data set X using the CAMP algorithm.
2. Find the probabilities p(y=0|c_j) and p(y=1|c_j) using Equations 3-31 and 3-33.

A test point t can be assigned a confidence value by the following steps:

1. Find p(t|c_j) using Equation 3-34.
2. Use Equations 3-23 and 3-24 to find the probabilities p(y=1|t) and p(y=0|t).
3. If both p(y=1|t) and p(y=0|t) are low, identify t as an outlier.
4. Otherwise, apply the Bayes decision rule of Equation 3-20 to assign a class label to t.

We replace the second term of Equation 3-30 with R+ and call it the pseudo-residue. Then Equation 3-30 can be written as:

c_t = ĉ_t(G(c_t)) + R+   (3-35)

The closeness of each point x_i to the subspace G(c_t) and the memberships u_it affect the overall value of R+. If an x_i is similar to c_t, the weighted difference between x_i and x̂_i(G(c_t)) will be small. Their difference can also be small if the membership u_it of x_i in c_t is low. Figure 3-3 shows a graphical interpretation of Equation 3-35. Equation 3-35 is quite similar in construction to the MP approximation in Equation 2-6. However, Equation 3-35 defines a process that is the exact opposite of MP. In MP, the approximated signal x̂ is built using the original signal x, and the residue R(x)_p is the by-product. On the other hand, while updating the cluster center c_t, the approximated signal ĉ_t is calibrated using the residue R+ to build the original signal c_t. This interpretation of Equation 3-35 is consistent with the standard update equations of clustering algorithms, where cluster centers are updated using the cluster members. Therefore, the MPDM builds a bridge between the matching pursuits and clustering algorithms. Since the MPDM can work

From Equation A-2 we know that the update equation of u_ij is given by:

u_ij = [1/d²(x_i, c_j)] / Σ_{k=1}^M [1/d²(x_i, c_k)] + [α/d²(x_i, c_j)] (N_j − Σ_{k=1}^M [N_k/d²(x_i, c_k)] / Σ_{k=1}^M [1/d²(x_i, c_k)])

where N_j = Σ_{i=1}^N u_ij is the cardinality of cluster j. Let D_ij = δ²(x_i, c_j); then the update equation for u_ij using the MPDM can be written in the same form, with D_ij in place of d²(x_i, c_j) (Equation 3-37).

If a small change in the value of β results in a large change in the value of u_ij, it would mean that u_ij is sensitive to the parameter β. In order to determine whether the memberships u_ij are sensitive to the value of β, let us differentiate u_ij with respect to β. For this purpose, define S_i = Σ_{k=1}^M (1/D_ik), let A, B and C denote the corresponding terms of Equation 3-37, and write the derivative of D_ij as D′_ij = dD_ij/dβ.

Differentiating A with respect to β expresses dA/dβ in terms of D′_ij, S_i and the D_ik; the derivatives of B and C follow similarly. The only terms that appear in the denominators of these expressions are D_ij and 1/D_ij. Regardless of the value of β, these terms will be zero when x_i = c_j, and the value of u_ij will be undefined. This is the property of fuzzy clustering that the membership at a point x_i is undefined if it coincides with a cluster center.

However, for the boundary values of β (i.e., for β ∈ {0, 1}), the value of D_ij can be zero even if x_i ≠ c_j. This is because two different signals can have the same coefficients with different residues, or vice versa (Figure 3-1). In such a scenario, setting β equal to 0 or 1, respectively, makes D_ij = 0, thus making the value of u_ij undefined.

Therefore, u_ij is stable with respect to β for β ∈ (0, 1).

From Equation A-5 we know that the regularization parameter α is given by an expression scaled by η(t), where η(t) is an exponential decay function.

Since CA prunes the clusters based on their cardinalities, it can sometimes merge clusters that are dissimilar to each other. For example, consider the three clusters shown in Figure 3-4. Clusters A and B should have been one big cluster. Therefore, CA should diminish the memberships of points assigned to A so that all members of A can be merged into B. However, cluster C is a distinct cluster located at a distance from B. But there are a few points from both clusters that connect them through a thin link. This will lead the CA algorithm to believe that B and C should be one cluster, and it will try to prune C.

Therefore, a mechanism should be introduced in the CAMP training that protects smaller clusters from being merged into bigger clusters, even if they have some outliers. This can be achieved by modifying the regularization parameter α. We know from Equation A-4 that u^Bias_ij alters the memberships of a cluster j based on its size relative to other clusters. If the distance of cluster j from the rest of the clusters is larger than some threshold T, u^Bias_ij should be zero, and the membership values of cluster j should be updated using only u^FCM_ij. This can be done by modifying the regularization parameter α

where η(t) is an exponential decay function and γ_j is defined as an indicator of whether cluster j has a close neighbor: γ_j is set to zero if c_j is farther than T from every other cluster center c_k. Therefore, γ_j defines the bound on the maximum size of the clusters, preventing them from growing in an unbounded fashion. When γ_j = 1, the memberships of the smaller clusters, like cluster A, are gradually diminished so that they can be merged into a nearby bigger cluster, like B. But when γ_j = 0, the smaller clusters, like cluster C in Figure 3-4, are preserved.

Based on CAMP's ability to assign a low p(y=1|x) as well as a low p(y=0|x) to unseen patterns, an automated decision rule to identify outliers in the data can be devised. For this purpose, the p(y=1|x) and p(y=0|x) assigned by the CAMP algorithm to each pattern in the training set are rank normalized. Rank normalization is a technique

The advantage of rank normalizing p(y=1|x) and p(y=0|x) is that both probabilities now have a uniform distribution. Therefore, a single operating threshold can be chosen for both. Plotting the rank-normalized p(y=1|x) vs. p(y=0|x) of the training set defines a 2-D axis. Each test pattern is assigned a rank-normalized p(y=1|x) and p(y=0|x) by looking up the ranks of the training data set. The test patterns are then plotted on the axis defined by the training data. All outliers (i.e., test patterns with low p(y=1|x) and low p(y=0|x)) should bunch around the origin in this plot. Now an iso-circle centered at the origin and having radius r can be defined, within which all test patterns will be considered outliers. Outside this radius, a test pattern will be assigned a class label using the Bayesian decision rule of Equation 3-20. The radius r can be considered the operating threshold of the system for a given application.
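The rank-normalization and iso-circle decision rule described above can be sketched as below; the function names and interface are illustrative assumptions.

```python
import numpy as np

def rank_normalize(train_vals, test_vals):
    """Map each test value to its empirical rank among the training
    values, scaled to [0, 1] (training confidences become uniform)."""
    train = np.sort(np.asarray(train_vals, dtype=float))
    ranks = np.searchsorted(train, np.asarray(test_vals, dtype=float),
                            side='right')
    return ranks / len(train)

def classify_with_outliers(p1, p0, r):
    """Outlier if the rank-normalized point (p1, p0) lies inside the
    iso-circle of radius r around the origin; otherwise pick the class
    with the larger probability (Bayes decision rule)."""
    if np.hypot(p1, p0) < r:
        return 'outlier'
    return 1 if p1 > p0 else 0
```

The radius r plays the role of the single operating threshold discussed above.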

Figure 3-1. Example to demonstrate the importance of W(x, G(x)), G(x) and R(x, G(x)) for comparing two signals. A) The dictionary elements chosen from the Gaussian parametric dictionary (Equation 2-19). B) Two different signals generated by the above g_i's using different coefficients. C) Two signals with similar MP coefficients but quite different residues.

Figure 3-2. Role of β in determining the value of the MPDM. To compare the signals x1 and x3 with x2, they are projected onto the projection sequence G(x2) = {g1} of x2. When β → 1, x2 and x3 are deemed more similar, as they are similar in shape (or orientation). But when β → 0, more emphasis is given to comparing the projection coefficients of the two signals in the subspace defined by G(x2), making x1 and x2 more similar.

Figure 3-3. Update equation for c_t. For some membership values, R+ may update c_t as shown. Note that R+ is not necessarily orthogonal to ĉ_t. On the other hand, for the MP approximation, the residue R(c_j, G(c_j)) is always orthogonal to the approximation ĉ_t of the signal c_t.
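The center update in the Figure 3-3 caption can be sketched as below, assuming the reconstructed form of Equations 3-30/3-35, c_t = ĉ_t(G(c_t)) + R+, with R+ a u²-weighted average of the differences x_i − x̂_i(G(c_t)). All names here are illustrative assumptions.

```python
import numpy as np

def update_center(c_hat, X, X_hat, u_t):
    """c_t = c_hat_t + R_plus, where R_plus (the pseudo-residue) is the
    squared-membership-weighted average of x_i - x_hat_i(G(c_t)).

    c_hat: current MP approximation of the center, shape (D,)
    X:     cluster members, shape (N, D)
    X_hat: members projected on G(c_t), shape (N, D)
    u_t:   memberships of the members in cluster t, shape (N,)
    """
    w = np.asarray(u_t, dtype=float) ** 2
    r_plus = (w[:, None] * (X - X_hat)).sum(axis=0) / w.sum()
    return c_hat + r_plus
```

As the caption notes, R+ need not be orthogonal to ĉ_t, unlike a true MP residue.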

Figure 3-4. The CA algorithm merges clusters based on their cardinality. Appropriate changes are made to α so that CA merges clusters A and B, but not C.

We have implemented and tested the algorithms proposed in Chapter 3. In the following sections we discuss the experimental results of our proposed methods.

As discussed in Section 3.1, the MPDM can be used for a variety of shape- as well as magnitude-based comparisons. In order to demonstrate this, let us revisit the example in Section 1.1, where we discussed the inability of the Euclidean distance to do shape-based comparisons in high dimensions. Recall that for a shape-based comparison, we would like signals A and B to be deemed more similar than signals A and C, but a magnitude-based comparison will deem signals A and C to be more similar.

Figure 4-1 shows the MPDM values for comparisons between (A, B) and (A, C), respectively, over various values of β. For smaller values of β, MPDM(A, C) < MPDM(A, B). Therefore, for smaller values of β, the MPDM performs magnitude-based comparisons between signals. For larger values of β, MPDM(A, B) < MPDM(A, C): the signals that are more similar shape-wise are closer than signals that have different shapes. Therefore, for larger values of β, the MPDM performs shape-based comparisons between signals. On the other hand, the Euclidean distance can only perform magnitude-based comparisons. Therefore, the MPDM is more suitable for shape-based comparisons of high-dimensional data.

[77]. A total of 9 images from the image database were randomly chosen as test images. None of the patches from these

test images were included in the training set. Since the authors of K-SVD [7] set the total number of dictionary elements K to 441, we also trained K-SVD with K = 441. EK-SVD was also initialized at M = 441. For both K-SVD and EK-SVD, the total number of MP iterations used for approximation during training was set to p = 6. For demonstration purposes, EK-SVD was allowed to run until M = 1, and at each pruning step, the test images were approximated using the intermediate dictionary.

Figure 4-2 shows the average root mean square error (RMSE) of all test images against the size M of the dictionary. The value at M = 441 is the performance of K-SVD. The RMSE stays almost constant until the dictionary size is reduced to about 40% of its original size. This shows that the approximation capability of the dictionary is not compromised by reducing the size of the dictionary up to that point. After that, the RMSE starts increasing, which shows that the remaining dictionary elements are unable to maintain the same approximation accuracy level. Therefore, the terminating condition of the EK-SVD algorithm is set based on its approximation accuracy.

During the actual dictionary training using the EK-SVD algorithm, when the performance goals for the final dictionary were set to be the same as those for the K-SVD dictionary, EK-SVD learned a dictionary with a total number of elements M = 179. Figure 4-3 shows two of the test images approximated using the dictionaries learned with the K-SVD and EK-SVD algorithms. For images 1 and 2, K-SVD and EK-SVD both gave RMSEs of 0.03 and 0.02, respectively. The EK-SVD dictionary thus gives approximation accuracy similar to the K-SVD dictionary, but it has a huge speed advantage, as the dictionary learned with EK-SVD is 60% smaller than that learned using K-SVD.
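The three-fold EK-SVD loop evaluated above can be illustrated structurally as below. This is only a sketch under stated simplifications: the CA membership update is replaced by a plain usage-based pruning test, and the true K-SVD atom update is replaced by a least-squares refit, so it must not be read as the actual EK-SVD implementation.

```python
import numpy as np

def ek_svd_sketch(X, D, p, prune_thresh=1e-3, iters=3):
    """Structural sketch of EK-SVD: (1) sparse-code with MP,
    (2) prune atoms with negligible total usage (stand-in for CA's
    cardinality-based pruning), (3) refit surviving atoms (stand-in
    for the K-SVD atom update)."""
    D = D / np.linalg.norm(D, axis=1, keepdims=True)
    for _ in range(iters):
        # (1) sparse coding: W[i, j] = coefficient of atom j for signal i
        W = np.zeros((len(X), len(D)))
        for i, x in enumerate(X):
            residue = x.astype(float).copy()
            for _ in range(p):
                inner = D @ residue
                k = int(np.argmax(np.abs(inner)))
                W[i, k] += inner[k]
                residue -= inner[k] * D[k]
        # (2) prune atoms that carry almost no energy
        used = (W ** 2).sum(axis=0) > prune_thresh
        D, W = D[used], W[:, used]
        # (3) refit each surviving atom to the signals that use it
        for j in range(len(D)):
            users = W[:, j] != 0
            if users.any():
                atom = W[users, j] @ X[users]
                D[j] = atom / (np.linalg.norm(atom) + 1e-12)
    return D
```

On redundant initial dictionaries, the pruning step shrinks M toward the number of atoms the data actually uses, mirroring the behavior reported in Figure 4-2.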

described in Section 3.3. For comparison purposes, a classifier similar to CAMP is trained that uses the Euclidean distance for comparisons instead of the MPDM. Let us call these trained classifiers CAMP and EUC, respectively. We compare the classification performance of CAMP and EUC on a two-class synthetic data set and plot their receiver operating characteristic (ROC) curves.

A member x_i of class 1 is generated as a single Gaussian bump:

x_i = c e^{−(t − 15μ)² / (10σ)²}   (4-1)

where c, μ, σ ~ N(0, 1) and t = [−50 ... 50]. Otherwise, generate a member x_i of class 0 as:

x_i = 10 c_1 e^{−(t − 15μ_1)² / (10σ_1)²} + 10 c_2 e^{−(t − 15μ_2)² / (10σ_2)²}   (4-2)

where c_1, c_2, μ_1, μ_2, σ_1, σ_2 ~ N(0, 1) and t = [−50 ... 50].

Members of class 1 have a smooth bell-curve shape. On the other hand, since the members of class 0 are a linear combination of two Gaussians, their shapes will have more variation. Drawing all parameters independently from identical normal distributions ensures that there is a variety of shapes in both classes. Figure 4-4 shows some sample signals from both classes.

The training set consists of 1000 points generated by the above process. A total of 30 test sets, each containing 1000 points, are also generated. The trained CAMP and EUC classifiers are tested on all 30 test data sets in order to achieve a statistically reliable test.
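The synthetic data generator can be sketched as follows. The scalings are reconstructed from the garbled Equations 4-1 and 4-2 and may not match the originals exactly; the function name is an assumption.

```python
import numpy as np

def make_signal(label, rng):
    """Generate one synthetic signal on t = -50..50.  Class 1 is a
    single Gaussian bump; class 0 is a sum of two Gaussians, giving
    more shape variation (parameters drawn from N(0, 1))."""
    t = np.arange(-50, 51, dtype=float)
    if label == 1:
        c, mu, sigma = rng.standard_normal(3)
        return c * np.exp(-((t - 15 * mu) ** 2) / (10 * sigma) ** 2)
    c1, c2, m1, m2, s1, s2 = rng.standard_normal(6)
    return (10 * c1 * np.exp(-((t - 15 * m1) ** 2) / (10 * s1) ** 2)
            + 10 * c2 * np.exp(-((t - 15 * m2) ** 2) / (10 * s2) ** 2))
```

Drawing every parameter independently keeps the magnitude ranges of the two classes overlapping while their shapes differ, which is exactly the regime where a shape-based measure should win.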

We use the Gaussian dictionary (Equation 2-19) for the matching pursuits approximation of the signals. Each member of the Gaussian dictionary is indexed by a set of three parameters γ = (n, σ, μ), where the values of these parameters used to learn the dictionary are listed in Table 4-1. Note that for a signal x_i ∈ ℝ^101, the dictionary is centered at μ = 51. The correct shift of each dictionary element with respect to the residue is computed using correlation during each MP iteration. The resulting Gaussian dictionary consists of 33 elements and is shown in Figure 4-5.

The maximum number of MP iterations p was set to 6. For the CA algorithm, 40 data points were randomly chosen as the cluster centers. The training using CAMP was then conducted using the method outlined in Section 3.3. The parameter β was set equal to 0.6 for the MPDM. For f(z) in Equation 3-32, a monotonically decreasing inverse function was used.

To compute p(y=1|t_j) for each test point t_j, instead of summing the probabilities conditioned over all prototypes, only the top K = 3 nearest-neighbor prototypes were determined and used to compute p(y=1|t_j).

The experimental setup for EUC was identical to that of CAMP, except that instead of using the MPDM, the Euclidean distance was used for comparison.

Figure 4-6 shows the average ROC curves over all 30 data sets for the CAMP and EUC classifiers. The classification performance of CAMP is superior to that of EUC at most levels, with the difference particularly large between 70% and 90% correct classification rate of class 1. Since there were shape differences between the members of the two classes, the shape-based comparison capability of the MPDM was able to detect these differences better than the Euclidean distance.
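The MPDM comparisons used in these experiments can be sketched as plain matching pursuit plus a projection onto a fixed sequence, assuming D_R and D_W are squared Euclidean distances between residues and coefficient vectors (the dissertation's exact normalizations in Equations 3-1 through 3-3 may differ). Atoms are assumed to be unit-norm rows.

```python
import numpy as np

def mp_decompose(x, dictionary, p):
    """Plain matching pursuit: return the chosen atom indices G(x),
    the coefficients W(x, G(x)) and the final residue R(x, G(x))."""
    residue = x.astype(float).copy()
    seq, coeffs = [], []
    for _ in range(p):
        inner = dictionary @ residue
        k = int(np.argmax(np.abs(inner)))
        seq.append(k)
        coeffs.append(inner[k])
        residue = residue - inner[k] * dictionary[k]
    return seq, np.array(coeffs), residue

def project_on_sequence(x, dictionary, seq):
    """Greedily project x onto a fixed projection sequence."""
    residue = x.astype(float).copy()
    coeffs = []
    for k in seq:
        c = dictionary[k] @ residue
        coeffs.append(c)
        residue = residue - c * dictionary[k]
    return np.array(coeffs), residue

def mpdm(x1, x2, dictionary, p, beta):
    """delta(x1, x2) = beta * D_R + (1 - beta) * D_W, on G(x2)."""
    seq2, w2, r2 = mp_decompose(x2, dictionary, p)
    w1, r1 = project_on_sequence(x1, dictionary, seq2)
    d_res = np.sum((r1 - r2) ** 2)      # residue comparison, D_R
    d_coef = np.sum((w1 - w2) ** 2)     # coefficient comparison, D_W
    return beta * d_res + (1.0 - beta) * d_coef
```

With beta near 1 the residue (shape) term dominates; with beta near 0 the coefficient (magnitude) term dominates, matching the behavior in Figure 4-1.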

To determine whether the difference between the two ROC curves in Figure 4-6 is statistically significant, the misclassification rates of class 0 were collected from all 30 data sets for both CAMP and EUC, at probabilities 0.1 to 1 of correct classification of class 1, and subjected to a t-test. Table 4-2 shows the outcome of the t-test. Except at places where both curves come very close to each other, the null hypothesis is rejected at most levels. This shows that the MPDM is more suitable than the Euclidean distance for shape comparisons of high-dimensional data in prototype-based classifiers.

In order to build a data set for clustering, six signals of distinct shapes were chosen. Each signal was multiplied by a factor between 5 and 10, with an increment of 0.05. Hence, a total of 101 signals were generated for each signal and introduced into the data set.
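The per-operating-point significance test can be computed with a pooled two-sample t statistic. This sketch returns the statistic only, to be compared against the critical value for the chosen significance level; the function name is an assumption.

```python
import numpy as np

def two_sample_t(a, b):
    """Pooled (equal-variance) two-sample t statistic, as used for
    comparing the 30 per-dataset class-0 misclassification rates of
    two classifiers at one operating point (Table 4-2 style)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    na, nb = len(a), len(b)
    # pooled variance estimate
    sp2 = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(sp2 * (1 / na + 1 / nb))
```

With 30 samples per classifier (about 58 degrees of freedom), |t| greater than roughly 2.0 rejects the null hypothesis of equal means at the 5% level.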

These signals are shown in Figure 4-7. For MP, β was set equal to 0.7.

For the CAMP algorithm, the initial number of clusters was set equal to 15. The data was independently clustered with CAMP 50 times, where the initial cluster centers were chosen randomly for each run. Figure 4-8 shows the histogram of the number of clusters found by CAMP over the 50 runs of the experiment. The numbers of clusters found across all trials form a normal distribution around 6, the actual number of clusters in the data. Figure 4-7 shows the cluster centers found by CAMP on one of the trials. Therefore, the CAMP algorithm was able to discover the underlying structure in the data.

For FCM, the MPDM was used as the dissimilarity measure with β = 0.7. Let M denote the total number of clusters for FCM. For each M ∈ {2, ..., 15}, the data was clustered using FCM, where the initial cluster centers were chosen randomly. This experiment was repeated 50 times for all values of M. Since no single validity measure is absolutely able to determine the optimal number of clusters for FCM, we computed four fuzzy validity indices for each run of the FCM algorithm [74], among them the partition coefficient (PC), partition entropy (PE) and Fukuyama-Sugeno (FS) indices.

Except for PC, all the other indices are minimized for the optimal number of clusters. Figure 4-9 shows the histogram of the optimal number of clusters found by each index. The PC, PE and FS indices peak at 6, the correct number of clusters, for FCM. However, their distributions have a wider spread than

the distribution obtained with CAMP. Therefore, both the FCM and CAMP algorithms were able to discover the underlying structure of the data when the MPDM was used as the dissimilarity measure. However, using CAMP for clustering is faster because, unlike FCM, where one needs to re-run clustering for every candidate number of clusters, CAMP discovers the correct number of clusters by running the algorithm only once.

The landmine data used here was collected with an electromagnetic induction (EMI) sensor [78]. The response of a target to the EMI sensor depends on the metallic properties of the buried object, such as metal content and conductivity. Each target object has a signature response that is fairly consistent across various weather and soil conditions. However, there may be considerable similarities between the metallic signatures of landmines and metallic clutter objects, thus making the classification problem hard. Figure 4-10 shows metallic signatures of some mine and non-mine objects. It can be seen that the samples of both mines and non-mines lie in the same magnitude ranges, with subtle shape differences. Therefore, we seek to build reliable shape-based models of mine and non-mine targets using the CAMP algorithm. The EK-SVD algorithm provides a useful tool to learn a data-specific dictionary that gives suitable MP approximations of this EMI data. Also, in an actual minefield, there is always a possibility of encountering a target object whose sample does not exist in the training set. Therefore, instead of assigning a random value to an outlier, it is important that the system is able to identify the outliers in the test data set.

Table 4-3 summarizes the mine/non-mine composition of these data sets.

The following experiments demonstrate the performance of the proposed methods as compared to discrimination-based classifiers. The importance of choosing an appropriate dictionary is also studied, and the outlier rejection and reporting capabilities are analyzed.

The FOWA system [78, 79] is used as a baseline. We compare the classification performance of CAMP against the FOWA system. Also, since support vector machines (SVM) are considered robust discrimination-based classifiers, we also train and evaluate an SVM-based classifier for this

problem. The EUC classifier, described in Section 4.3, is also evaluated for comparison purposes.

A higher confidence value means that the probability of the target being a mine is high, and vice versa. In order to model the uncertainty in an actual test, 10 FOWA networks were learned using 10-fold cross validation. The final confidence of a test sample is reported as the mean of the values assigned by these 10 networks. A detailed discussion of these features and the training of the FOWA system can be found in Ngan et al. [78].

The SVM slack parameter C [3] and the RBF kernel parameter σ² were found by searching over a range of values between 0 and 100 for each parameter. While looking for optimal values of

PAGE 87

Figure 4-11 shows the plot of the mean PFAs at all the tested values of σ² and C. The parameter values corresponding to the reported SVM classifier were σ² = 0.1 and C = 0.1. Like FOWA, 10 cross-validated SVM networks were trained using the same cross validation folds as FOWA.

Figure 4-12 shows the dictionary learned by the EK-SVD algorithm. Samples from each mine and non-mine type were clustered separately using the CAMP algorithm. The cluster centers for each class were initialized using the global-FCM (G-FCM) clustering algorithm proposed by Heo and Gader [80]. The G-FCM gives a partition of the data that is insensitive to initialization and outliers. Since CA is a fuzzy clustering algorithm and fuzzy clustering is known to be sensitive to initialization, using G-FCM gives a reasonable initial estimate to the CAMP algorithm and also helps it converge faster.
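Since G-FCM itself is not described in this excerpt, a minimal sketch of the standard fuzzy c-means updates that underlie such an initialization may be helpful (the G-FCM of Heo and Gader adds insensitivity to initialization and outliers on top of this). The fuzzifier m = 2 and the synthetic two-blob data below are illustrative assumptions:

```python
import numpy as np

def fcm(X, M, m=2.0, iters=100, seed=0):
    """Standard fuzzy c-means: alternate membership and center updates."""
    rng = np.random.default_rng(seed)
    U = rng.random((len(X), M))
    U /= U.sum(axis=1, keepdims=True)          # memberships sum to 1 per point
    for _ in range(iters):
        W = U ** m
        C = W.T @ X / W.sum(axis=0)[:, None]   # centers: membership-weighted means
        d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2) + 1e-12
        U = (1.0 / d2) ** (1.0 / (m - 1.0))
        U /= U.sum(axis=1, keepdims=True)      # FCM membership update
    return U, C

# Two well-separated blobs: centers should land near (0, 0) and (5, 5).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.1, (50, 2)), rng.normal(5, 0.1, (50, 2))])
U, C = fcm(X, M=2)
```

With this partition in hand, one would hand the resulting centers to CAMP as its initial prototypes.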


Figure 4-13 shows the ROC results corresponding to a weighting-parameter value of 0.26, with error bars for values 0:0.02:1. Each horizontal error bar for a given PD corresponds to the variation in the PFA over all values of the weighting parameter. In an ideal situation, all the reported ROCs should be on the left boundary of their respective error bars. However, this is not the case, as the same value was chosen for all three datasets. Note that for the training set, the reported ROC corresponds to the cross-validated results, while the error bars were plotted for test-on-train results. Therefore, the error bars on the left of the ROC also record the difference between test-on-train and cross validation results on the training dataset.

Another parameter that can affect the clustering outcome is T in Equation 3-45, the minimum allowed distance between two cluster centers. If T is very small, it can lead to overtraining, as the prototypes will be modeled very tightly around the training samples. Similarly, choosing a big T can lead to overgeneralization of the prototypes. In order to find a suitable value for T, the CAMP prototypes were trained using T values between 0 and 1, varying with an increment of 0.1. Based on the mean PDs, the value of T was chosen to be 0.2. Figure 4-14 shows the classification results for T = 0.2, with the weighting parameter at 0.26. Once again, for the training set the error bars for test-on-train and the ROC for cross validation are shown. Looking only at the PDs of test-on-train of the training dataset favors a value of T = 0.1. However, T = 0.1 gives bad classification results for both datasets T1 and T2. This shows that making T smaller overtrains the prototypes. Using cross validation with T = 0.2 ensures generalization of the network while still avoiding overgeneralization.
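The role of T can be illustrated with a small sketch: any two prototypes closer than T are merged, so a larger T yields fewer, more general prototypes. Plain Euclidean distance and merging-by-averaging are illustrative assumptions here (Equation 3-45 and the MPDM-based procedure are not reproduced in this excerpt):

```python
import numpy as np

def merge_close_prototypes(C, T):
    """Greedily merge prototypes whose pairwise distance is below T."""
    C = [c.astype(float) for c in C]
    merged = True
    while merged:
        merged = False
        for i in range(len(C)):
            for j in range(i + 1, len(C)):
                if np.linalg.norm(C[i] - C[j]) < T:
                    C[i] = (C[i] + C[j]) / 2.0   # replace the pair by its mean
                    del C[j]
                    merged = True
                    break
            if merged:
                break
    return np.array(C)

protos = np.array([[0.0, 0.0], [0.05, 0.0], [1.0, 1.0]])
tight = merge_close_prototypes(protos, T=0.1)   # merges only the first two
loose = merge_close_prototypes(protos, T=3.0)   # collapses everything
```

A small T leaves many tightly fitted prototypes (risking overtraining), while a large T collapses them into a few coarse ones (risking overgeneralization), mirroring the trade-off described above.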


Equation 3-32, the exponential decay function was used, where σ² is a normalization constant. The value σ² = 0.16 was empirically found to be useful for both mine and non-mine clusters on the training data.

Figure 4-15 shows ROC curves for the classification results of each dataset for the four classifiers. Each test pattern is assigned a confidence value p(y=1|x) of being a mine. Therefore, these ROCs record the percentage of detection of mines vs. the percentage of incorrect classification of non-mines.

CAMP performs better than the discrimination-based classifiers FOWA and SVM on both datasets at PD 90 because it uses a prototype-based approach to classification with a robust shape-based dissimilarity measure. Note that although EUC is also a prototype-based classifier, it does no better than FOWA or SVM. This is because the Euclidean distance is unable to perform accurate shape-based comparisons on high-dimensional vectors.

The results on dataset T2 are especially interesting, as the performance of all four classifiers suffers from the presence of outliers in both the mine and non-mine classes. The PD of FOWA and SVM is adversely affected for T2 by misclassifying these outliers from both classes. On the other hand, CAMP gives a low mine confidence to any test pattern different from its mine prototypes (i.e., the non-mines and the outliers). Therefore, the performance of CAMP for detecting mines suffers only because of assigning low confidence to outliers in the mine class. Note that CAMP also assigns a low non-mine confidence to
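The mapping from dissimilarity to confidence can be sketched as below. The exact form of Equation 3-32 is not reproduced in this excerpt, so the Gaussian-style decay exp(−d²/σ²) is an assumed stand-in with the stated properties (confidence 1 at zero dissimilarity, decaying toward 0 for patterns far from every prototype), using the quoted σ² = 0.16:

```python
import math

SIGMA2 = 0.16  # normalization constant quoted in the text

def confidence(d, sigma2=SIGMA2):
    """Map a dissimilarity d >= 0 to a confidence in (0, 1]."""
    return math.exp(-d * d / sigma2)

def class_confidence(dists, sigma2=SIGMA2):
    """Class confidence from the dissimilarity to the nearest prototype."""
    return confidence(min(dists), sigma2)

# A pattern close to some prototype gets a high confidence; an outlier
# far from all prototypes gets a confidence near zero.
near = class_confidence([0.1, 0.9, 1.4])
far = class_confidence([2.0, 2.5, 3.0])
```

This is exactly the behavior exploited for outlier rejection: a pattern dissimilar to every class prototype receives a low confidence for every class.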


Figure 4-16 shows the error bars for the CAMP and FOWA ROCs, denoting the degree of uncertainty associated with their results. These error bars were created by treating each instance of a mine found in the test dataset as a success in a binomial trial. Both CAMP and FOWA show higher confidence in their outputs for higher values of p(y=1|x), and vice versa. Note that the confidence intervals of FOWA and CAMP overlap for the test dataset T1. This is because dataset T1 is quite similar to the training dataset. On the other hand, since T2 is somewhat different from the training dataset, their confidence intervals are farther apart.

As discussed in Section 2.2, the performance of the MP algorithm depends highly on the choice of dictionary. Therefore, a suitable parametric dictionary may be used, or the dictionary may be learned from the data for MP. Since the choice of dictionary affects the MP approximations, it may also indirectly affect the performance of the MPDM and CAMP. Therefore, in order to understand the effect of various dictionary choices on the classification of the landmine data, CAMP was trained with a variety of dictionaries and the classification results were noted.

One of these is the EK-SVD dictionary shown in Figure 4-12. The Gaussian dictionary, shown in Figure 4-5, has been found to be an excellent parametric dictionary for this EMI dataset. Therefore, this dictionary is also used as a preprocessor for CAMP.
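As a reminder of what these dictionaries feed into, a minimal matching pursuit loop in the spirit of Mallat and Zhang [1] can be sketched as follows: at each step, pick the dictionary atom with the largest inner product with the current residue, subtract its projection, and repeat. The random Gaussian dictionary and two-atom test signal are illustrative assumptions:

```python
import numpy as np

def matching_pursuit(x, D, n_iters=10):
    """Greedy MP: approximate x as a sum of scaled dictionary atoms.

    D has unit-norm atoms as columns. Returns (coeffs, atom indices, residue).
    """
    r = x.astype(float).copy()
    coeffs, idxs = [], []
    for _ in range(n_iters):
        proj = D.T @ r                  # inner products with every atom
        k = int(np.argmax(np.abs(proj)))
        coeffs.append(proj[k])
        idxs.append(k)
        r = r - proj[k] * D[:, k]       # remove the chosen component
    return np.array(coeffs), idxs, r

rng = np.random.default_rng(0)
D = rng.normal(size=(64, 40))
D /= np.linalg.norm(D, axis=0)          # unit-norm atoms
x = 3.0 * D[:, 5] - 2.0 * D[:, 17]      # signal built from two atoms
_, picked, residue = matching_pursuit(x, D, n_iters=10)
```

The residue norm decreases at every step, and a signal actually composed of dictionary atoms is recovered almost exactly; how quickly the residue shrinks for real data is precisely what the dictionary comparison below measures.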


The reduced Gaussian dictionary is shown in Figure 4-17. In order to draw a comparison with the "good" dictionaries for the given data, the Gabor dictionary (Equation 2-16) is also used as one of the pre-processors for CAMP. This dictionary is shown in Figure 4-18. Each element of the Gabor dictionary is governed by a triple of parameters, namely the scale s, the phase shift φ and the frequency modulation. The values of these parameters are shown in Table 4-4. Each member of the Gabor dictionary in Figure 4-18 was produced as a combination of these parameters. Note that these parameters are a subset of the dictionary used by Neff and Zakhor [5]. The fifth dictionary consists of 40 random noise signals generated using the rand function of Matlab, shown in Figure 4-19. Since there is no structure in the random dictionary, its classification performance acts as a lower bound for the CAMP algorithm.

Figure 4-20 shows the 10-fold cross validation results on the training dataset for all dictionaries. Similarly, Figures 4-21 and 4-22 show the results for datasets T1 and T2, respectively. For the testing datasets, the final confidence assigned to each target is the mean of the outcomes of the 10 trained classifiers. Looking at the ROCs, one notices that the outcomes for all dictionary types are quite similar. The PFAs at various PD levels produced by the 10 classifiers for each dictionary type were then subjected to a t-test. The t-test found that these values were not statistically different. Therefore, the choice of dictionary seems to have no significant effect on the classification performance of the CAMP algorithm. However, the MP algorithm depends highly on the choice of dictionary being used for approximation. This can be seen from Figure 4-23, which shows the sum of normalized MSE for all signals in the training dataset. The EK-SVD, Gauss, and reduced Gauss dictionaries give similar, small MP reconstruction errors. Since the Gabor dictionary is not very similar to the data, it performs twice as badly as the EK-SVD and Gauss dictionaries.
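A sketch of how such 1-D Gabor atoms can be generated from (scale, frequency, phase) triples follows. Since the exact functional form of Equation 2-16 and the parameter values of Table 4-4 are not reproduced in this excerpt, the Gaussian-windowed cosine below and the small example parameter grid are assumptions:

```python
import numpy as np

def gabor_atom(n, s, xi, phi):
    """Unit-norm 1-D Gabor atom: a Gaussian window of scale s modulated
    by a cosine of frequency xi with phase shift phi."""
    t = np.arange(n) - n / 2.0
    g = np.exp(-np.pi * (t / s) ** 2) * np.cos(2 * np.pi * xi * t / n + phi)
    return g / np.linalg.norm(g)

# Build a small dictionary from all combinations of a parameter grid.
n = 64
scales = [4.0, 8.0, 16.0]
freqs = [1.0, 2.0, 4.0]
phases = [0.0, np.pi / 2]
D = np.column_stack([gabor_atom(n, s, xi, phi)
                     for s in scales for xi in freqs for phi in phases])
```

Every combination of the parameter triple contributes one unit-norm column, which is how the members of Figure 4-18 are produced from Table 4-4.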


The insensitivity of the CAMP algorithm seems counter-intuitive when MP is highly dependent on the choice of dictionary. However, looking at the MPDM closely answers this question. The MP projection sequence of x2 defines a subspace which is unique for a given dictionary, regardless of how well the dictionary can MP-approximate the data. The term DW(x1, x2) measures the distance between x̂1 and x̂2 within this subspace, while DR(x1, x2) measures the difference in the distances of x1 and x2 from this subspace. Using a different dictionary only changes the composition of the subspace for x2. If x1 and x2 are similar shape-wise, their approximations should be closely located in any MP-generated subspace, and vice versa, regardless of the dictionary used to generate this subspace. Therefore, for a given value of the weighting parameter, the MPDM comparison between two signals is not highly affected by the choice of dictionary. Consequently, CAMP's performance is not affected by the choice of the dictionary.

However, looking at the ROCs, we can see that the EK-SVD dictionary slightly outperforms the rest of the dictionaries. The reason for this is that the Euclidean distance is used to compare the residues of x1 and x2 in DR(x1, x2). Using a dictionary closely modeled around the data ensures that not a lot of information is left in the residue of x2. Therefore, when a non-negligible residue of x1 is compared to a negligible residue of x2, the Euclidean distance is able to give a meaningful comparison. However, when the dictionary is not related to the data, the information contained in the residue of x2 is not negligible, and the Euclidean distance may not be able to give an accurate comparison between the high-dimensional signals x1 and x2. Therefore, one would be wise to use a dictionary closely related to the data for the CAMP algorithm.
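Based on the description above, the MPDM can be sketched roughly as: run MP on x2 to obtain its atom sequence, project x1 onto the same sequence, compare the two coefficient vectors (DW) and the two residues (DR), and mix the terms with the weighting parameter. This is a paraphrase of the measure, not its exact definition (Equation 3-32 is not reproduced in this excerpt); the sequential projection of x1 and the linear mix with weight 0.26 are assumptions:

```python
import numpy as np

def mpdm(x1, x2, D, beta=0.26, n_iters=5):
    """Rough MPDM sketch: compare x1 and x2 inside the MP subspace of x2
    (coefficient distance DW) and via their residues (DR)."""
    # MP projection sequence of x2 defines the subspace.
    r2, a2, idxs = x2.astype(float).copy(), [], []
    for _ in range(n_iters):
        proj = D.T @ r2
        k = int(np.argmax(np.abs(proj)))
        idxs.append(k)
        a2.append(proj[k])
        r2 = r2 - proj[k] * D[:, k]
    # Project x1 onto the same atom sequence.
    r1, a1 = x1.astype(float).copy(), []
    for k in idxs:
        c = D[:, k] @ r1
        a1.append(c)
        r1 = r1 - c * D[:, k]
    dw = np.linalg.norm(np.array(a1) - np.array(a2))  # within-subspace distance
    dr = np.linalg.norm(r1 - r2)                      # residue difference
    return beta * dw + (1 - beta) * dr

rng = np.random.default_rng(0)
D = rng.normal(size=(32, 20))
D /= np.linalg.norm(D, axis=0)
x1 = rng.normal(size=32)
x2 = rng.normal(size=32)
d_self = mpdm(x1, x1, D)   # identical signals -> zero dissimilarity
d_diff = mpdm(x1, x2, D)
```

Whatever dictionary is used, identical signals fall on the same point of the induced subspace and yield a zero measure, which is the intuition behind the dictionary-insensitivity argument above.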


[81] described a parametric model that emulates the output of the EMI sensor used to collect the landmine data in the previous experiments. This mathematical model produces the signal that the EMI sensor would produce when swept over an object buried in the ground. Therefore, by altering only four parameters of the model, realistic EMI responses can be generated that are different from the actual landmine data used to train the classifiers in the previous experiments. Since these generated signals are different from the training dataset, they can be considered outliers. Using this model, a total of 10,000 signals were generated as test patterns. Since these test patterns are not mine samples, theoretically they should be assigned low confidence values by all classifiers. Therefore, this experiment was designed to determine the outlier rejection capabilities of the CAMP, FOWA, SVM and EUC classifiers.

Section 4.5.1. Mine confidences for the FOWA and SVM classifiers were assigned using Equation 4-4, while CAMP and EUC assigned the mine confidence as p(y=1|x). The mine confidences assigned by each classifier were thresholded by their respective PD values, and the number of patterns below each threshold was noted for each of the four classifiers.

An unseen pattern should have a low mine as well as a low non-mine confidence to be considered an outlier. Therefore, the same experiment was repeated, assigning a non-mine confidence to each unseen pattern. For FOWA and SVM, the following relation was used to assign the non-mine confidence:

Figure 4-24A shows the percentage of test patterns below the PD thresholds for the mine class for all classifiers. Since none of the patterns were from the mine class, the superior shape-based prototype approach of CAMP was able to reject almost all the patterns. On the other hand, FOWA and SVM rejected fewer patterns. Similarly, Figure 4-24B shows the percentage of test patterns below the PD threshold (of the non-mine class) for all classifiers. Once again, CAMP assigned a very low non-mine confidence to almost all the patterns, while FOWA and SVM rejected significantly fewer patterns. Figure 4-25 shows the corresponding error bars for the mine class for the FOWA and CAMP algorithms.

Figure 4-26 shows one such rule. The p(y=1|x) and p(y=0|x) assigned by the CAMP algorithm to each pattern in the training set are rank normalized and plotted in Figure 4-26. Rank normalization is a technique to convert an unknown distribution into a uniform distribution. All the values in a given distribution are sorted in ascending order, and each point of the original distribution is assigned a new value based on its index in this sorted list. The mine and non-mine confidence values of the training dataset were rank normalized separately and are plotted against each other in Figure 4-26. Using these rank indices, 1000 unseen patterns were also assigned mine and non-mine rank confidences and are plotted in Figure 4-26. Almost all of the unseen patterns lie within an iso-circle of radius 0.25. Therefore, this rank-normalized plot can act as a guide to choosing a radius within which each data point will be considered an outlier. Outside this radius, a test pattern can be assigned a class label using the Bayesian decision rule of Equation 3-20.
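Rank normalization as described (sort the values, then replace each value by its index in the sorted list) can be sketched directly; scaling the index by N − 1 so the result lands in [0, 1] is an assumption about the exact convention:

```python
import numpy as np

def rank_normalize(values):
    """Map values of an unknown distribution to a uniform grid on [0, 1]
    by replacing each value with its (scaled) index in the sorted list."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values)           # indices that would sort the values
    ranks = np.empty_like(order)
    ranks[order] = np.arange(len(values))
    return ranks / max(len(values) - 1, 1)

conf = np.array([0.9, 0.1, 0.5, 0.7])
r = rank_normalize(conf)                 # order-preserving, uniform in [0, 1]
```

The transform preserves the ordering of the confidences while flattening their distribution, which is what makes a fixed radius (such as the 0.25 iso-circle above) meaningful across both axes of the plot.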


[75]. The samples in each channel are aligned using the peaks of the ground bounce. In order to minimize the effects of noise, a whitening transform is then applied to each channel. The pixels contained in a 17-pixel-wide window around the suspected pixel location of the target constitute the foreground, while the rest of the pixels are used as background for computing the whitening statistics. Only 5 channels per target are used for testing and training, where each channel comprises only the foreground pixels. Figure 4-27 shows sample mine and non-mine images, where the left images in each subfigure show the channel image. The right images show their MP approximations and are discussed shortly.

From Figure 4-27 we can see that the raw channel images are quite noisy. Hence, the patches extracted based on zero crossings will be numerous and usually only a few pixels wide. Therefore, we need criteria for selecting the patches that should be introduced into the 2D dictionary. Note that the mine signatures usually consist of a set of hyperbolic curves, while the non-mine samples usually consist of low-energy background noise with a few irregular higher-energy patches. Therefore, we would like our chosen dictionary to contain ample hyperbolic shapes to approximate mines well, and some other geometric shapes to approximate non-mines. For this purpose, the binary mask of each patch extracted from the images is correlated with a set of shape templates. The image patches having a high correlation with any shape template are introduced into the initial dictionary. The basic shape for these templates is the image patch resulting from the intersection of two hyperbolas of different widths in 2D. The rest of the shapes are created by scaling and segmenting this basic hyperbolic shape. The initial dictionary extracted from the data in this manner consisted of 1022 elements, which were reduced to 93 using the EK-SVD algorithm. The learned dictionary is shown in Figures 4-28 and 4-29.

[75], the hidden Markov model (HMM) approach, the spectral (SPEC) method and the edge histogram (EHD) algorithm [75]. The ROC of PD vs. PFA for all classifiers is shown in Figure 4-30A. The CAMP algorithm outperforms the LMS, SPEC and HMM algorithms and is surpassed in some places by the EHD algorithm. Since EHD and CAMP alternately outperform each other, combining them should improve the overall classification performance. Figure 4-30B shows the ROC plotted by taking the product of the confidences of the EHD and CAMP algorithms.

CAMP is a useful algorithm for landmine discrimination using GPR image data. We can also see from Figure 4-30B that combining the outputs of CAMP and EHD improves the classification results of both algorithms over a wide range of PD values.

[5] for video coding, shown in Figure 2 in [5]. The parameters used to generate the Gabor dictionary are listed in Table 4-5. The classification results obtained using the EK-SVD and Gabor dictionaries are shown in Figure 4-31. Unlike the 1D data, here the classification performance with the Gabor dictionary is worse than that of the EK-SVD dictionary. This is because the residues here are vectors of much higher dimensionality than those in the 1D data. Therefore, the Euclidean distance used to compare two residues fails to provide accurate comparisons in this high dimension. This result re-enforces the need to use a dictionary closely related to the data, preferably learned from the data using the EK-SVD algorithm. This also shows the need to use a more robust measure than the Euclidean distance to compare the residues for the MPDM. Choosing a better measure is one of the possible avenues for future research.

Revisiting Fig. 1-3: shape- and magnitude-based comparisons using MPDM

The plot of the RMSE of test images as a function of the total number of dictionary elements M during the EK-SVD dictionary training

Figure 4-3. Two test images approximated using the dictionaries learned by the K-SVD and EK-SVD algorithms

Sample synthetic data for classification. A) Sample data from the training set of class 1. Drawing parameters from a normal distribution gives a lot of variation in the spread and magnitude of the signals. B) Sample data from the training set of class 0. Since the means μ1 and μ2 are also drawn from a normal distribution, class 0 can have members that look like members of class 1. This makes the data linearly inseparable, making the classification problem harder.

The Gaussian dictionary

Table 4-1. The set of parameters used to generate the Gaussian dictionary

ROC curves for the classification performance of the MPDM and EUC dissimilarity measures

Table 4-2. The t-test outcomes for the CAMP and EUC ROCs

The 6-class dataset used for validating CAMP and FCM clustering. In each subplot, the 101 class members are shown in black and the respective cluster representative found by CAMP is shown in gray.

Histogram of the number of clusters discovered by CAMP on 50 independent runs, producing a normal distribution with μ = 6.3 and σ² = 1.2

Figure 4-9. Histogram of the optimal number of clusters discovered by the partition coefficient (PC), partition entropy (PE), Xie-Beni (XB) and Fukuyama-Sugeno (FS) cluster validity indices for FCM on 50 independent runs

Sample EMI Data

Table 4-3. Number of mines and non-mines in the landmine datasets

  Dataset       Mines   Non-mines   Total
  Testing T1       44         108     152
  Testing T2      112          34     146

Mean PFA of T1 at PDs 90, 95 and 100 as the σ² and C parameters of the SVM are varied

The EK-SVD dictionary learned for the training data

Classification results for the training, T1 and T2 datasets with error bars for weighting-parameter values 0:0.02:1. A) Cross validation results for the training dataset. B) Result for the T1 dataset. C) Result for the T2 dataset.

Classification results for the training, T1 and T2 datasets with error bars for T = 0:0.1:1. A) Cross validation results for the training dataset. B) Result for the T1 dataset. C) Result for the T2 dataset.

Classification results for the T1 and T2 datasets for FOWA, SVM, EUC and CAMP. A) Result for the T1 dataset. B) Result for the T2 dataset.

Error bars for the T1 and T2 datasets for FOWA and CAMP. A) Error bars for the T1 dataset. B) Error bars for the T2 dataset.

Reduction of the Gaussian dictionary using EK-SVD

The Gabor dictionary

Table 4-4. The set of parameters used to generate the 1-D Gabor dictionary

The random dictionary

Cross validation results for the training data using various dictionaries

Figure 4-21. Results for the T1 test dataset using various dictionaries

Results for the T2 test dataset using various dictionaries

Figure 4-23. MSE for the MP reconstruction of the training set for different dictionaries

Performance comparison of the FOWA, SVM, EUC and CAMP classifiers for performing accurate shape-based classification of the data. A) Percentage of unseen patterns below the PD of the mine class. B) Percentage of unseen patterns below the PD of the non-mine class.

Error bars for the mine class of the synthetic data.

Figure 4-26. Rank-normalized plot of p(y=1|x) vs. p(y=0|x) for the training dataset and the resultant confidence values assigned to the unseen patterns

Sample GPR images for mines and non-mines. A) Sample mine image and its MP approximation. B) Sample mine image and its MP approximation. C) Sample non-mine image and its MP approximation. D) Sample non-mine image and its MP approximation.

Image dictionary learned from the data.

The rest of the elements of the image dictionary learned from the data.

Classification results for the GPR data for LMS, HMM, SPEC, EHD and CAMP. A) Results for all classifiers. B) Combined result for CAMP and EHD.

ROC of the GPR data trained using the EK-SVD and the Gabor dictionaries

Table 4-5. The set of parameters used to generate the 2-D Gabor dictionary

A matching pursuits dissimilarity measure has been presented which is capable of performing accurate shape-based comparisons between high-dimensional data. It extends the matching pursuits signal approximation technique and uses its dictionary and coefficient information to compare two signals. The MPDM is capable of performing shape-based comparisons of very high dimensional data, and it can also be adapted to perform magnitude-based comparisons, similar to the Euclidean distance. Since the MPDM is a differentiable measure, it can be seamlessly integrated with existing clustering or discrimination algorithms. Therefore, the MPDM may find application in a variety of classification and approximation problems involving very high dimensional data.

The MPDM has been used to develop an automated dictionary learning algorithm for MP approximation of signals, called Enhanced K-SVD. The EK-SVD algorithm uses the MPDM and the CA clustering algorithm to learn the required number of dictionary elements during training. Under-utilized and replicated dictionary elements are gradually pruned to produce a compact dictionary without compromising its approximation capabilities. The experimental results show that the dictionary learned by our method is 60% smaller than, but has the same approximation capabilities as, those produced by existing dictionary learning algorithms.

The MPDM has also been used with the competitive agglomeration fuzzy clustering algorithm to build a prototype-based classifier called CAMP. The CAMP algorithm builds robust shape-based prototypes for each class and assigns a confidence to a test pattern based on its dissimilarity to the prototypes of all classes. If a test pattern is different from all the prototypes, it will be assigned a low confidence value. Our experimental results show that the CAMP algorithm is therefore able to identify outliers in the given test data better than discrimination-based classifiers such as multilayer perceptrons and support vector machines.

The Competitive Agglomeration (CA) algorithm is a useful clustering algorithm that combines the strengths of hierarchical and partitional clustering by iteratively minimizing a fuzzy prototype-based objective function to find the optimal partitioning and number of clusters [2]. It starts by partitioning the dataset into a large number of small clusters. As the algorithm progresses, clusters compete for data points, and the memberships of redundant clusters are progressively diminished. Eventually such clusters become empty and are then dropped. When the CA algorithm terminates, it produces the optimal clustering of the data using the fewest possible number of clusters.

Let X = {x_i | i = 1, ..., N} be a set of N vectors in an n-dimensional feature space and let C = (c_1, ..., c_M) represent an M-tuple of prototypes characterizing M clusters. Then the CA algorithm minimizes the following objective function:

    J = \sum_{j=1}^{M} \sum_{i=1}^{N} u_{ij}^2 \, d^2(x_i, c_j) - \alpha \sum_{j=1}^{M} \Big[ \sum_{i=1}^{N} u_{ij} \Big]^2    (A-1)

subject to \sum_{j=1}^{M} u_{ij} = 1, for i \in {1, ..., N}.

The first component of Equation (A-1) is similar to the Fuzzy C-Means (FCM) objective function, and it controls the shapes and sizes of the clusters to obtain compact clusters. Its global minimum is achieved when the number of clusters M is equal to the number of data points N. The second component of Equation (A-1) is the sum of the squares of the fuzzy cardinalities of the clusters and controls the number of clusters. Its global minimum is achieved when all data points are lumped into one cluster. When both components are combined and α is chosen properly, the final partition will minimize the sum of intra-cluster distances while partitioning the dataset into the smallest possible number of clusters.

Minimizing Equation (A-1) with respect to the memberships, subject to the constraint, yields the update

    u_{ij} = \frac{1/d^2(x_i, c_j)}{\sum_{k=1}^{M} 1/d^2(x_i, c_k)} + \frac{\alpha}{d^2(x_i, c_j)} \big( N_j - \bar{N}_i \big)    (A-2)

where N_j = \sum_{i=1}^{N} u_{ij} is the cardinality of cluster j and \bar{N}_i is defined as the weighted average of the cluster cardinalities:

    \bar{N}_i = \frac{\sum_{k=1}^{M} (1/d^2(x_i, c_k)) \, N_k}{\sum_{k=1}^{M} 1/d^2(x_i, c_k)}

The derivation of Equation (A-2) can be found in [2]. More succinctly, we can write u_{ij} as

    u_{ij} = u_{ij}^{FCM} + u_{ij}^{Bias}

where the definitions of u_{ij}^{FCM} and u_{ij}^{Bias} follow from (A-2). The parameter α controls the relative weight of the u_{ij}^{FCM} term, which updates the membership u_{ij} using the standard FCM method, against the u_{ij}^{Bias} term, which increases or decreases u_{ij} based on the size of cluster j. The u_{ij}^{Bias} term should dominate initially, to quickly remove the redundant clusters. Later on, u_{ij}^{FCM} should dominate, to refine the cluster memberships of the data points. Therefore, α is an exponential decay function of the iteration number t, proportional to the update and bias terms of the objective function, where the multiplying factor is an exponential decay function of t.

In the second step of the optimization, the memberships u_{ij} are assumed constant and the cluster centers c_j are updated by differentiating Equation (A-1) with respect to c_j. The choice of the dissimilarity function d depends on the type of data being clustered. For example, Frigui and Krishnapuram [2] have used the Euclidean and Mahalanobis distances
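The two-part update in Equation (A-2) can be sketched directly. The code below is a minimal illustration of one membership update with a Euclidean d² and a fixed α; it is not the full CA loop, which also decays α over iterations and drops empty clusters:

```python
import numpy as np

def ca_membership_update(X, C, U, alpha):
    """One CA membership update: u = u_FCM + u_Bias (Equation A-2)."""
    # Squared Euclidean distances from every point to every cluster center.
    d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(axis=2) + 1e-12
    inv = 1.0 / d2
    u_fcm = inv / inv.sum(axis=1, keepdims=True)         # standard FCM part
    Nj = U.sum(axis=0)                                   # fuzzy cardinalities N_j
    Nbar = (inv * Nj).sum(axis=1) / inv.sum(axis=1)      # weighted average N̄_i
    u_bias = (alpha / d2) * (Nj[None, :] - Nbar[:, None])
    return u_fcm + u_bias

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))
C = rng.normal(size=(4, 2))
U = np.full((30, 4), 0.25)
U_new = ca_membership_update(X, C, U, alpha=0.1)
```

Note that the bias term sums to zero over j for each point, so the updated memberships still satisfy the constraint that each row sums to 1; clusters larger than the weighted average gain membership and smaller ones lose it, which is the agglomeration mechanism.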



[1] S. Mallat and Z. Zhang, "Matching pursuits with time-frequency dictionaries," IEEE Transactions on Signal Processing, vol. 41, no. 12, pp. 3397-3415, 1993.

[2] Hichem Frigui and Raghu Krishnapuram, "Clustering by competitive agglomeration," Pattern Recognition, vol. 30, no. 7, pp. 1109-1119, 1997.

[3] Simon Haykin, Neural Networks: A Comprehensive Foundation, Prentice Hall PTR, Upper Saddle River, NJ, USA, 1998.

[4] Richard O. Duda, Peter E. Hart, and David G. Stork, Pattern Classification (2nd Edition), Wiley-Interscience, November 2000.

[5] R. Neff and A. Zakhor, "Very low bit-rate video coding based on matching pursuits," IEEE Transactions on Circuits and Systems for Video Technology, vol. 7, no. 1, pp. 158-171, Feb 1997.

[6] A. Ebrahimi-Moghadam and S. Shirani, "Matching pursuit-based region-of-interest image coding," IEEE Transactions on Image Processing, vol. 16, no. 2, pp. 406-415, Feb. 2007.

[7] M. Aharon, M. Elad, and A. Bruckstein, "K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation," IEEE Transactions on Signal Processing, vol. 54, no. 11, pp. 4311-4322, 2006.

[8] Jerome H. Friedman and Werner Stuetzle, "Projection pursuit regression," Journal of the American Statistical Association, vol. 76, no. 376, pp. 817-823, 1981.

[9] H. Abut, R. Gray, and G. Rebolledo, "Vector quantization of speech and speech-like waveforms," IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 30, no. 3, pp. 423-435, Jun 1982.

[10] Y. Pati, R. Rezaiifar, and P. Krishnaprasad, "Orthogonal matching pursuit: Recursive function approximation with applications to wavelet decomposition," 1993.

[11] G. M. Davis, S. G. Mallat, and Z. Zhang, "Adaptive time-frequency decompositions with matching pursuit," in Proc. SPIE Vol. 2242, Wavelet Applications, H. H. Szu, Ed., Mar. 1994, pp. 402-413.

[12] L. Rebollo-Neira and D. Lowe, "Optimized orthogonal matching pursuit approach," IEEE Signal Processing Letters, vol. 9, no. 4, pp. 137-140, Apr 2002.

[13] Scott Shaobing Chen, David L. Donoho, and Michael A. Saunders, "Atomic decomposition by basis pursuit," SIAM Review, vol. 43, no. 1, pp. 129-159, 2001.

[14] Stephen Boyd and Lieven Vandenberghe, Convex Optimization, Cambridge University Press, March 2004.

[15] O. Divorra Escoda, L. Granai, and P. Vandergheynst, "On the use of a priori information for sparse signal approximations," IEEE Transactions on Signal Processing, vol. 54, no. 9, pp. 3468-3482, Sept. 2006.

[16] Darrell Whitley, "A genetic algorithm tutorial," Statistics and Computing, vol. 4, pp. 65-85, 1994.

[17] D. Stefanoiu and F. Ionescu, "A genetic matching pursuit algorithm," Seventh International Symposium on Signal Processing and Its Applications, 2003, vol. 1, pp. 577-580, 1-4 July 2003.

[18] R. Figueras i Ventura and P. Vandergheynst, "Matching pursuit through genetic algorithms."

[19] S. F. Cotter and B. D. Rao, "Application of tree-based searches to matching pursuit," 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '01), vol. 6, pp. 3933-3936, 2001.

[20] G. Z. Karabulut, L. Moura, D. Panario, and A. Yongacoglu, "Integrating flexible tree searches to orthogonal matching pursuit algorithm," IEE Proceedings - Vision, Image and Signal Processing, vol. 153, no. 5, pp. 538-548, Oct. 2006.

[21] A. Shoa and S. Shirani, "Tree structure search for matching pursuit," IEEE International Conference on Image Processing (ICIP 2005), vol. 3, pp. III-908-11, 11-14 Sept. 2005.

[22] Kuansan Wang, Chin-Hui Lee, and Biing-Hwang Juang, "Selective feature extraction via signal decomposition," IEEE Signal Processing Letters, vol. 4, no. 1, pp. 8-11, Jan 1997.

[23] K. Wang, C.-H. Lee, and B.-H. Juang, "Maximum likelihood learning of auditory feature maps for stationary vowels," Fourth International Conference on Spoken Language (ICSLP 96), vol. 2, pp. 1265-1268, 3-6 Oct 1996.

[24] K. Wang and D. M. Goblirsch, "Extracting dynamic features using the stochastic matching pursuit algorithm for speech event detection," 1997 IEEE Workshop on Automatic Speech Recognition and Understanding, pp. 132-139, 14-17 Dec 1997.

[25] T. V. Pham and A. W. M. Smeulders, "Sparse representation for coarse and fine object recognition," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 4, pp. 555-567, April 2006.

[26] F. Bergeaud and S. Mallat, "Matching pursuit of images," in ICIP, 1995, pp. 53-56.

[27] P. Vandergheynst and P. Frossard, "Efficient image representation by anisotropic refinement in matching pursuit," 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '01), vol. 3, pp. 1757-1760, 2001.

[28] Francois Mendels, Pierre Vandergheynst, and Jean-Philippe Thiran, "Rotation and scale invariant shape representation and recognition using matching pursuit," 16th International Conference on Pattern Recognition (ICPR '02), vol. 04, p. 40326, 2002.

[29] Francois Mendels, Jean-Philippe Thiran, and Pierre Vandergheynst, "Matching pursuit-based shape representation and recognition using scale-space," International Journal of Imaging Systems and Technology, vol. 6, no. 15, pp. 162-180, 2006.

[30] M. R. McClure and L. Carin, "Matching pursuits with a wave-based dictionary," IEEE Transactions on Signal Processing, vol. 45, no. 12, pp. 2912-2927, Dec 1997.

[31] P. K. Bharadwaj, P. R. Runkle, and L. Carin, "Target identification with wave-based matched pursuits and hidden Markov models," IEEE Transactions on Antennas and Propagation, vol. 47, no. 10, pp. 1543-1554, Oct 1999.

[32] P. Schmid-Saugeon and A. Zakhor, "Dictionary design for matching pursuit and application to motion-compensated video coding," IEEE Transactions on Circuits and Systems for Video Technology, vol. 14, no. 6, pp. 880-886, June 2004.

[33] B. A. Olshausen and D. J. Field, "Sparse coding with an overcomplete basis set: a strategy employed by V1?," Vision Res., vol. 37, no. 23, pp. 3311-25, Dec 1997.

[34] Bruno A. Olshausen and David J. Field, "Natural image statistics and efficient coding," Network, vol. 7, no. 2, pp. 333-339, 1996.

[35] Kenneth Kreutz-Delgado, Joseph F. Murray, Bhaskar D. Rao, Kjersti Engan, Te-Won Lee, and Terrence J. Sejnowski, "Dictionary learning algorithms for sparse representation," Neural Comput., vol. 15, no. 2, pp. 349-396, 2003.

[36] Bruno A. Olshausen and K. Jarrod Millman, "Learning sparse codes with a mixture-of-Gaussians prior," in NIPS, 1999, pp. 841-847.

[37] Honglak Lee, Alexis Battle, Rajat Raina, and Andrew Y. Ng, "Efficient sparse coding algorithms," in Advances in Neural Information Processing Systems 19, B. Scholkopf, J. Platt, and T. Hoffman, Eds., pp. 801-808. MIT Press, Cambridge, MA, 2007.

[38] W. Dangwei, M. Xinyi, and S. Yi, "Radar target identification using a likelihood ratio test and matching pursuit technique," IEE Proceedings - Radar, Sonar and Navigation, vol. 153, no. 6, pp. 509-515, December 2006.

[39] D. W. Redmill, D. R. Bull, and P. Czerepinki, "Video coding using a fast non-separable matching pursuits algorithm," 1998 International Conference on Image Processing (ICIP 98), vol. 1, pp. 769-773, 4-7 Oct 1998.

[40] R. Neff and A. Zakhor, "Matching pursuit video coding. I. Dictionary approximation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 12, no. 1, pp. 13-26, Jan 2002.

[41] Jian-Liang Lin, Wen-Liang Hwang, and Soo-Chang Pei, "A joint design of dictionary approximation and maximum atom extraction for fast matching pursuit," 2004 International Conference on Image Processing (ICIP '04), vol. 2, pp. 1125-1128, 24-27 Oct. 2004.

[42] S. Lesage, R. Gribonval, F. Bimbot, and L. Benaroya, "Learning unions of orthonormal bases with thresholded singular value decomposition," IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), vol. 5, pp. v/293-v/296, 18-23 March 2005.

[43] Sylvain Sardy, Andrew G. Bruce, and Paul Tseng, "Block coordinate relaxation methods for nonparametric wavelet denoising," Journal of Computational and Graphical Statistics, vol. 9, no. 2, pp. 361-379, 2000.

[44] P. Czerepinski, C. Davies, N. Canagarajah, and D. Bull, "Matching pursuits video coding: dictionaries and fast implementation," IEEE Transactions on Circuits and Systems for Video Technology, vol. 10, no. 7, pp. 1103-1115, Oct 2000.

[45] Yannick Morvan, Dirk Farin, and Peter H. N. de With, "Matching-pursuit dictionary pruning for MPEG-4 video object coding," in Internet and Multimedia Systems and Applications, 2005.

[46] L. Rebollo-Neira, "Dictionary redundancy elimination," IEE Proceedings - Vision, Image and Signal Processing, vol. 151, no. 1, pp. 31-34, 5 Feb. 2004.

[47] D. M. Monro, "Basis picking for matching pursuits audio compression," 2006 IEEE International Symposium on Circuits and Systems (ISCAS 2006), 21-24 May 2006.

[48] G. G. Herrero, A. Gotchev, I. Christov, and K. Egiazarian, "Feature extraction for heartbeat classification using independent component analysis and matching pursuits," IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP '05), vol. 4, pp. iv/725-iv/728, 18-23 March 2005.

[49] T. A. Hoang and D. T. Nguyen, "Matching pursuit for the recognition of power quality disturbances," 2002 IEEE 33rd Annual Power Electronics Specialists Conference (PESC 02), vol. 4, pp. 1791-1796, 2002.

A.P.LoboandP.C.Loizou,\Voiced/unvoicedspeechdiscriminationinnoiseusinggaboratomicdecomposition,"Acoustics,Speech,andSignalProcessing,2003.Proceedings.(ICASSP'03).2003IEEEInternationalConferenceon,vol.1,pp.I{820{I{823vol.1,6-10April2003. [51] B.Krishnapuram,J.Sichina,andL.Carin,\Physics-baseddetectionoftargetsinsarimageryusingsupportvectormachines,"SensorsJournal,IEEE,vol.3,no.2,pp.147{157,April2003. [52] L.R.Rabiner,\Atutorialonhiddenmarkovmodelsandselectedapplicationsinspeechrecognition,"ProceedingsoftheIEEE,vol.77,no.2,pp.257{286,Feb1989. [53] P.RunkleandL.Carin,\Multi-aspecttargetidenticationwithwave-basedmatchingpursuitsandcontinuoushiddenmarkovmodels,"Acoustics,Speech,andSignalProcessing,1999.ICASSP'99.Proceedings.,1999IEEEInternationalConferenceon,vol.4,pp.2115{2118vol.4,15-19Mar1999. [54] P.Bharadwaj,P.Runkle,L.Carin,J.A.Berrie,andJ.A.Hughes,\Multiaspectclassicationofairbornetargetsviaphysics-basedhmmsandmatchingpursuits,"AerospaceandElectronicSystems,IEEETransactionson,vol.37,no.2,pp.595{606,Apr2001. [55] Chi-WeiChuandI.Cohen,\Postureandgesturerecognitionusing3dbodyshapesdecomposition,"ComputerVisionandPatternRecognition,2005IEEEComputerSocietyConferenceon,vol.3,pp.69{69,20-26June2005. [56] S.Krishnan,R.M.Rangayyan,G.D.Bell,andC.B.Frank,\Adaptivetime-frequencyanalysisofkneejointvibroarthrographicsignalsfornoninvasivescreeningofarticularcartilagepathology,"BiomedicalEngineering,IEEETransactionson,vol.47,no.6,pp.773{783,Jun2000. [57] S.E.HusseinandM.H.Granat,\Intentiondetectionusinganeuro-fuzzyemgclassier,"EngineeringinMedicineandBiologyMagazine,IEEE,vol.21,no.6,pp.123{129,Nov.-Dec.2002. [58] K.Umapathy,S.Krishnan,andS.Jimaa,\Multigroupclassicationofaudiosignalsusingtime-frequencyparameters,"Multimedia,IEEETransactionson,vol.7,no.2,pp.308{315,April2005. 
[59] J.Marchal,C.Ioana,E.Radoi,A.Quinquis,andS.Krishnan,\Soccervideoretrivalusingadaptivetime-frequencymethods,"Acoustics,SpeechandSignalProcessing,2006.ICASSP2006Proceedings.2006IEEEInternationalConferenceon,vol.5,pp.V{V,14-19May2006. [60] P.VincentandY.Bengio,\Kernelmatchingpursuit,"MachineLearning,vol.48,no.1,pp.165{187,Apr2002. 132
[61] Shaorong Chang and L. Carin, "A modified SPIHT algorithm for image coding with a joint MSE and classification distortion measure," Image Processing, IEEE Transactions on, vol. 15, no. 3, pp. 713–725, March 2006.
[62] J. R. Stack, R. Arrieta, X. Liao, and L. Carin, "A kernel machine framework for feature optimization in multi-frequency sonar imagery," OCEANS 2006, pp. 1–6, Sept. 2006.
[63] Yan Zhang, Xuejun Liao, and L. Carin, "Detection of buried targets via active selection of labeled data: application to sensing subsurface UXO," Geoscience and Remote Sensing, IEEE Transactions on, vol. 42, no. 11, pp. 2535–2543, Nov. 2004.
[64] V. Popovici, S. Bengio, and J. Thiran, "Kernel Matching Pursuit for Large Datasets," Pattern Recognition, vol. 38, no. 12, pp. 2385–2390, 2005.
[65] Dehong Liu, Lihan He, and L. Carin, "Airport detection in large aerial optical imagery," Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on, vol. 5, pp. V-761–764, 17-21 May 2004.
[66] Xuejun Liao, Hui Li, and B. Krishnapuram, "An M-ary KMP classifier for multi-aspect target classification," Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on, vol. 2, pp. ii-61–64, 17-21 May 2004.
[67] Ke Huang and Selin Aviyente, "Sparse representation for signal classification," in Advances in Neural Information Processing Systems 19, B. Scholkopf, J. Platt, and T. Hoffman, Eds., pp. 609–616. MIT Press, Cambridge, MA, 2007.
[68] F. Mendels, P. Vandergheynst, and J. Thiran, "Affine invariant Matching Pursuit-based shape representation and recognition using scale-space," Tech. Rep., Ecublens, 2003, ITS.
[69] P. J. Phillips, "Matching pursuit filter design," Pattern Recognition, 1994. Vol. 3 - Conference C: Signal Processing, Proceedings of the 12th IAPR International Conference on, pp. 57–61, 9-13 Oct 1994.
[70] P. J. Phillips, "Matching pursuit filters applied to face identification," Image Processing, IEEE Transactions on, vol. 7, no. 8, pp. 1150–1164, Aug 1998.
[71] W. Zhao and N. Nandhakumar, "Linear discriminant analysis of MPF for face recognition," Pattern Recognition, 1998. Proceedings. Fourteenth International Conference on, vol. 1, pp. 185–188, 16-20 Aug 1998.
[72] Chung-Lin Huang and Shih-Hung Hsu, "Road sign interpretation using matching pursuit method," Pattern Recognition, 2000. Proceedings. 15th International Conference on, vol. 1, pp. 329–333, 2000.
[73] Raazia Mazhar, Joseph N. Wilson, and Paul D. Gader, "Use of an application-specific dictionary for matching pursuits discrimination of landmines and clutter," Geoscience and Remote Sensing Symposium, 2007. IGARSS 2007. IEEE International, pp. 26–29, 23-28 July 2007.
[74] Sergios Theodoridis and Konstantinos Koutroumbas, Pattern Recognition, Second Edition, Elsevier Academic Press, 2003.
[75] H. Frigui and P. Gader, "Detection and discrimination of landmines based on edge histogram descriptors and fuzzy k-nearest neighbors," Fuzzy Systems, 2006 IEEE International Conference on, pp. 1494–1499, 2006.
[76] H. Frigui and P. D. Gader, "Detection and Discrimination of Land Mines in Ground-Penetrating Radar Based on Edge Histogram Descriptors and a Possibilistic K-Nearest Neighbor Classifier," Fuzzy Systems, IEEE Transactions on, 2008, forthcoming.
[77] "Yale face database," Retrieved April 10, 2008, from
[78] P. Ngan, S. P. Burke, R. Cresci, J. N. Wilson, P. D. Gader, D. K. C. Ho, E. E. Bartosz, and H. A. Duvoisin, "Development of processing algorithms for HSTAMIDS: status and field test results," in Proceedings of the SPIE Conference on Detection and Remediation Technologies for Mines and Minelike Targets X, April 2007.
[79] W.-H. Lee, P. D. Gader, and J. N. Wilson, "Optimizing the area under a receiver operating characteristic curve with application to landmine detection," Geoscience and Remote Sensing, IEEE Transactions on, vol. 45, no. 2, pp. 389–397, Feb. 2007.
[80] Gyeongyong Heo and Paul Gader, "KG-FCM: Kernel-Based Global Fuzzy C-Means Clustering Algorithm," Technical Report, Computer and Information Science and Engineering, University of Florida, 2009.
[81] K. C. Ho, L. M. Collins, L. G. Huettel, and P. D. Gader, "Discrimination Mode Processing for EMI and GPR Sensors for Hand-Held Landmine Detection," Geoscience and Remote Sensing, IEEE Transactions on, vol. 42, no. 1, pp. 249–263, Jan. 2004.
Raazia Mazhar received her Bachelor of Science in Computer Science from the National University of Computer and Emerging Sciences, Islamabad, Pakistan, in July 2003. From July 2003 to June 2004, she worked as a software developer at Landmark Resources, Islamabad, Pakistan. She started her Ph.D. at the University of Florida in August 2004 and was a research assistant with the Computational Science and Intelligence Lab in the Computer and Information Science and Engineering department at the University of Florida from January 2005 to December 2008. She received her Master's degree in Computer Engineering in December 2008 and graduated with her Ph.D. in Computer Engineering in May 2009. Her research interests include machine learning, image and signal analysis, signal compression and approximation, automated feature learning, and clustering.