Deconstructive Learning


Material Information

Title:
Deconstructive Learning
Physical Description:
1 online resource (88 p.)
Language:
english
Creator:
Ali, Mohsen
Publisher:
University of Florida
Place of Publication:
Gainesville, Fla.
Publication Date:

Thesis/Dissertation Information

Degree:
Doctorate (Ph.D.)
Degree Grantor:
University of Florida
Degree Disciplines:
Computer Engineering, Computer and Information Science and Engineering
Committee Chair:
HO,JEFFREY YIH CHIAN
Committee Co-Chair:
PETERS,JORG
Committee Members:
BANERJEE,ARUNAVA
RANGARAJAN,ANAND
KHARE,KSHITIJ

Subjects

Subjects / Keywords:
classifiers -- computer-vision -- deconstruction -- machine-learning
Computer and Information Science and Engineering -- Dissertations, Academic -- UF
Genre:
Computer Engineering thesis, Ph.D.
bibliography   ( marcgt )
theses   ( marcgt )
government publication (state, provincial, territorial, dependent)   ( marcgt )
born-digital   ( sobekcm )
Electronic Thesis or Dissertation

Notes

Abstract:
This dissertation introduces the novel notion of deconstructive learning and proposes a practical computational framework for deconstructing a broad class of binary classifiers commonly used in computer vision applications. While the ultimate objective of most learning problems is the determination of classifiers from labeled training data, for deconstructive learning, the objects of study are the classifiers themselves. As its name suggests, the goal of deconstructive learning is to deconstruct a given classifier by determining and characterizing (as much as possible) the full extent of its capability, revealing all of its powers, subtleties and limitations. In particular, this work is motivated by the seemingly innocuous question: given an image-based binary classifier $\mathcal{C}$ as a black-box oracle, how much can we learn of its internal working by simply querying it? To formulate and answer this question computationally, I will first describe a general two-component design model employed by many current computer vision binary classifiers, a clear demonstration of the division of labor between practitioners in computer vision and machine learning. In this model, an input image is first transformed via a (nonlinear) feature transform to a feature space, and a classifier is applied to the transformed feature to produce the classification output. The deconstruction of such a classifier therefore aims to identify the specific feature transform and the feature-space classifier used in the model. Accordingly, the process of deconstructing a classifier $\mathcal{C}$ will be formulated as the identification problem for its two internal components from a finite list $\mathcal{F}$ of candidate features and a finite list $\mathcal{C}$ of candidate classifiers. As the main technical components of the deconstruction algorithm, I will introduce three novel ideas: feature identifiers, classifier deconstructors and the geometric feature-classifier compatibility.
Specifically, feature identifiers are a set of image-based operations that can be applied to the input images, and the different degrees of sensitivity and stability of the features in the feature list $\mathcal{F}$ under these operations allow us to exclude elements in $\mathcal{F}$. The classifier deconstructors, on the other hand, are algorithms that can deconstruct classifiers in the candidate list $\mathcal{C}$ using a (relatively) small number of features by recognizing certain geometric characteristics of the classifier's decision boundary, such as its parametric form. In this dissertation, I will present deconstruction algorithms for two popular families of classifiers in computer vision: cascades of linear classifiers and support vector machines. The interaction between elements in the feature and classifier lists during the deconstruction process is based on the notion of geometric feature-classifier compatibility, which provides a principled and effective criterion for selecting the correct pair of feature and classifier as the output of the deconstruction process. The bulk of this work will be devoted to realizing the deconstruction framework in concrete and practical terms. In particular, I will present a variety of experimental results that validate the proposed deconstruction methods and demonstrate the viability of deconstructing computer vision algorithms. Interesting highlights of the experimental results include the deconstruction of the popular OpenCV pedestrian and face detectors and the demonstration of a kernel machine update/upgrade without using its source code. To the best of my knowledge, no similar results have been reported in the literature previously. Finally, in the last part of this dissertation, I will briefly speculate on the future application potential of deconstructive learning.
General Note:
In the series University of Florida Digital Collections.
General Note:
Includes vita.
Bibliography:
Includes bibliographical references.
Source of Description:
Description based on online resource; title from PDF title page.
Source of Description:
This bibliographic record is available under the Creative Commons CC0 public domain dedication. The University of Florida Libraries, as creator of this bibliographic record, has waived all rights to it worldwide under copyright law, including all related and neighboring rights, to the extent allowed by law.
Statement of Responsibility:
by Mohsen Ali.
Thesis:
Thesis (Ph.D.)--University of Florida, 2014.
Local:
Adviser: HO,JEFFREY YIH CHIAN.
Local:
Co-adviser: PETERS,JORG.

Record Information

Source Institution:
UFRGP
Rights Management:
Applicable rights reserved.
Classification:
lcc - LD1780 2014
System ID:
UFE0046149:00001


This item is only available as the following downloads:


Full Text

PAGE 1

DECONSTRUCTIVE LEARNING

By

MOHSEN ALI

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2014

PAGE 2

© 2014 Mohsen Ali

PAGE 3

To all who were there and to all who will come. For ones who know.

PAGE 4

ACKNOWLEDGMENTS

To all teachers; ones that adorned the form of humans and ones that came in the shape of moments. To gods and devils, to friends and foes. To unrequited loves and countless crushes, ones who destroyed me and ones who made me better. To untamed desires and unlimited dreams (which one is better?) that kept urging me to climb when thrown into the valleys of failure, ones that never let you be at peace, never let you be happy with what you have. To Allah, the most merciful. And lastly, to mere mortals, like me.

PAGE 5

TABLE OF CONTENTS

ACKNOWLEDGMENTS ....... 4
LIST OF TABLES ....... 7
LIST OF FIGURES ....... 8
ABSTRACT ....... 10

CHAPTER

1 INTRODUCTION ....... 12
  1.1 Deconstructive Learning And Computer Vision ....... 16
  1.2 Related Work ....... 21
  1.3 Road Map ....... 22
  1.4 Etymology ....... 23

2 DECONSTRUCTING KERNEL MACHINES ....... 24
  2.1 Problem Statement ....... 25
  2.2 Preliminaries ....... 28
  2.3 Deconstruction Method ....... 30
    2.3.1 Bracketing ....... 31
    2.3.2 Estimating Normal Vectors ....... 33
    2.3.3 Determine Kernel Subspace SY ....... 33
    2.3.4 Determine Kernel Type ....... 36
    2.3.5 Complexity Analysis and Exact Recovery of … ....... 37
  2.4 Experiments ....... 39
    2.4.1 Evaluation of Deconstruction Algorithm ....... 40
    2.4.2 Kernel Machine Upgrade Without Source Code ....... 42

3 DECONSTRUCTING CASCADE OF LINEAR CLASSIFIERS ....... 47
  3.1 Problem Introduction ....... 48
  3.2 Preliminaries ....... 50
  3.3 Deconstruction Algorithm ....... 54
  3.4 Experiments ....... 59
    3.4.1 Soft Cascade ....... 59
    3.4.2 Viola-Jones Haar Cascade ....... 60
  3.5 Concluding Remarks ....... 62

4 TWO STAGE CLASSIFICATION SYSTEMS ....... 63
  4.1 Problem Introduction ....... 63
  4.2 Deconstruction Process and Method ....... 67
    4.2.1 Feature Identifiers ....... 69

PAGE 6

    4.2.2 Deconstructor Methods ....... 70
    4.2.3 Geometric Compatibility Between Features and Deconstructors ....... 70
  4.3 Experiments ....... 71
    4.3.1 Distinguish Between HOG and SIFT ....... 72
    4.3.2 Deconstructing OpenCV HOG-based Pedestrian Detector ....... 73
    4.3.3 Deconstructing Airplane Detector ....... 76
    4.3.4 Deconstructing When Dimensionality Reduction Is Being Employed ....... 76

5 CONCLUSIONS AND FUTURE WORK ....... 80

REFERENCES ....... 84

BIOGRAPHICAL SKETCH ....... 88

PAGE 7

LIST OF TABLES

Table ....... page

2-1 Confusion matrices for the original kernel machine and the kernel machine defined by the recovered quasi-support vectors ....... 46
2-2 Comparisons of classification results for the original kernel machine and the upgraded kernel machine ....... 46
4-1 Effects of different image transforms on classification results. HOG-based features, as expected, produce unstable results under rotation and scale change. ....... 73

PAGE 8

LIST OF FIGURES

Figure ....... page

1-1 Schematic illustration of deconstructive learning ....... 14
1-2 Proposed deconstruction algorithm for a classifier with two-component design ....... 21
2-1 Bracketing process and the PN-pairs ....... 32
2-2 Calculating normals using PN-pairs ....... 34
2-3 Intersections of … and two-dimensional affine subspaces ....... 35
2-4 Singular values of the normal matrix N for different choices of … ....... 42
2-5 Means and variances of the cosines of the twelve principal angles between SY and S̃Y ....... 43
2-6 Means and variances of the twelve principal angles between SY and S̃Y ....... 43
2-7 Distribution of votes on kernel type ....... 44
2-8 Cosines of the principal angles between the recovered kernel subspace and the ground-truth kernel subspace ....... 44
2-9 Singular values of the matrix N ....... 45
3-1 Algorithms for deconstruction of classifiers ....... 50
3-2 Filters recovered by probing the decision boundary using positive and negative pairs ....... 51
3-3 Multiple Haar-like features present in the same retrieved filter ....... 53
3-4 Average running time for the 4400 features rejected by the classifier C ....... 55
3-5 Average time taken by features rejected at a given level ....... 56
3-6 Dictionary elements ordered according to the average running time taken by negative features in the pairs ....... 58
3-7 Recovered soft cascade ....... 60
3-8 Percentage of the OpenCV Haar cascade filters recovered at different correlation thresholds ....... 60
4-1 Schematic illustration of deconstructive learning and proposed algorithm ....... 66
4-2 PN-pairs obtained by the bracketing process, during deconstruction of the OpenCV person detector ....... 74
4-3 PN-pairs obtained by the bracketing process, during deconstruction of the motorbike detector ....... 75

PAGE 9

4-4 Classification errors exemplifying the concept of geometric compatibility between features and deconstructors ....... 76
4-5 2D projections of the labeled PN-pairs in three different feature spaces, exemplifying the concept of geometric compatibility between features and deconstructors ....... 77
4-6 Deconstruction of a classification system with HOG features and linear SVM as classifier ....... 78
4-7 Singular values of the normal matrix ....... 79
4-8 Identifying kernel type ....... 79

PAGE 10

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

DECONSTRUCTIVE LEARNING

By

Mohsen Ali

May 2014

Chair: Jeffrey Ho
Major: Computer Engineering

This dissertation introduces the novel notion of deconstructive learning and proposes a practical computational framework for deconstructing a broad class of binary classifiers commonly used in computer vision applications. While the ultimate objective of most learning problems is the determination of classifiers from labeled training data, for deconstructive learning, the objects of study are the classifiers themselves. As its name suggests, the goal of deconstructive learning is to deconstruct a given classifier by determining and characterizing (as much as possible) the full extent of its capability, revealing all of its powers, subtleties and limitations. In particular, this work is motivated by the seemingly innocuous question: given an image-based binary classifier C as a black-box oracle, how much can we learn of its internal working by simply querying it? To formulate and answer this question computationally, I will first describe a general two-component design model employed by many current computer vision binary classifiers, a clear demonstration of the division of labor between practitioners in computer vision and machine learning. In this model, an input image is first transformed via a (nonlinear) feature transform to a feature space, and a classifier is applied to the transformed feature to produce the classification output. The deconstruction of such a classifier therefore aims to identify the specific feature transform and the feature-space classifier used in the model.

PAGE 11

Accordingly, the process of deconstructing a classifier C will be formulated as the identification problem for its two internal components from a finite list F of candidate features and a finite list C of candidate classifiers. As the main technical components of the deconstruction algorithm, I will introduce three novel ideas: feature identifiers, classifier deconstructors and the geometric feature-classifier compatibility. Specifically, feature identifiers are a set of image-based operations that can be applied to the input images, and the different degrees of sensitivity and stability of the features in the feature list F under these operations allow us to exclude elements in F. The classifier deconstructors, on the other hand, are algorithms that can deconstruct classifiers in the candidate list C using a (relatively) small number of features by recognizing certain geometric characteristics of the classifier's decision boundary, such as its parametric form. In this dissertation, I will present deconstruction algorithms for two popular families of classifiers in computer vision: cascades of linear classifiers and support vector machines. The interaction between elements in the feature and classifier lists during the deconstruction process is based on the notion of geometric feature-classifier compatibility, which provides a principled and effective criterion for selecting the correct pair of feature and classifier as the output of the deconstruction process. The bulk of this work will be devoted to realizing the deconstruction framework in concrete and practical terms. In particular, I will present a variety of experimental results that validate the proposed deconstruction methods and demonstrate the viability of deconstructing computer vision algorithms. Interesting highlights of the experimental results include the deconstruction of the popular OpenCV pedestrian and face detectors and the demonstration of a kernel machine update/upgrade without using its source code. To the best of my knowledge, no similar results have been reported in the literature previously. Finally, in the last part of this dissertation, I will briefly speculate on the future application potential of deconstructive learning.
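The two-component design described in the abstract can be made concrete with a small sketch. Everything here is hypothetical: the toy gradient-orientation histogram stands in for a real vision feature such as HOG, and the linear feature-space classifier stands in for a real learned component such as an SVM. Only the ±1 label is exposed, matching the dissertation's laconic black-box setting.

```python
import numpy as np

def toy_feature(image):
    # Hypothetical stand-in for a vision feature transform (e.g., HOG):
    # a 9-bin orientation histogram weighted by gradient magnitude.
    gy, gx = np.gradient(image.astype(float))
    mag, ang = np.hypot(gx, gy), np.arctan2(gy, gx)
    hist, _ = np.histogram(ang, bins=9, range=(-np.pi, np.pi), weights=mag)
    return hist / (np.linalg.norm(hist) + 1e-9)

def make_oracle(w, b, feature=toy_feature):
    # Two-component design: a feature transform followed by a feature-space
    # classifier.  The returned oracle is "laconic": it reveals only +/-1.
    def oracle(image):
        return 1 if feature(image) @ w + b > 0 else -1
    return oracle
```

Deconstruction, in these terms, is the attempt to identify `feature` and `(w, b)` while being allowed to call only `oracle`.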

PAGE 12

CHAPTER 1
INTRODUCTION

Significant progress has been made in computer vision and machine learning in the past decade, and increasingly, this advance has manifested itself in the form of artificial intelligence-based products (applications) that are starting to permeate our daily life, providing the coveted convenience and expedience that make them more indispensable each passing day. Such examples are readily available, and they range from the more typical applications, such as the new generation of hand-held cameras that use face detection to re-focus and cameras used in smart surveillance systems that incorporate human detection and tracking, to the more esoteric displays of futuristic vision, such as the recent announcement by Facebook of its face verification system [46], Google's development of self-driving cars [2] and the advanced robots that are replacing pack animals in difficult terrain under battlefield conditions [1]. It is transparently clear that with the current pace of technology development, our life in a not-so-distant future will be largely shared with, and perhaps even dominated by, these AI-based applications, in such a way that the majority of the decision-making processes in our life will be delegated to machines and the algorithms powering them.

Decision makers and their decision-making processes, however insignificant in appearance (butterfly effect), make history. Precisely because of this, the study of decision makers (hitherto humans) and their decision-making processes has occupied a central place in the intellectual heritage of men everywhere. Since time immemorial, philosophers have pondered and argued over the nature of men and their decisions, and historians have expounded on and occasionally rationalized the decision makers and the processes that shaped history. With the rise of scientific methods, modern social scientists such as psychologists and economists use ingenious experiments or construct sophisticated mathematical models to analyze and explain human behavior. Almost until recently, all these efforts have been indirect and external in their methodology in

PAGE 13

that conclusions are drawn based on the responses of the decision makers to external events and the consequences of their decisions. Notable examples include the ink-blot test administered by psychologists and the various market surveys and opinion polls conducted by economists and political scientists.¹

Historically, the motivations for undertaking such intellectual pursuits are various, and they can be broadly classified as constructive or adversarial. In particular, out of their insatiable intellectual curiosity and perhaps also some boredom, ancient philosophers embarked on this profound intellectual journey that is still far from complete after several millennia. However, with the increasing understanding of ourselves and our decision-making process, we gain new insights and ideas on how to improve ourselves individually and collectively as a society. In an ideal world, this knowledge can be used for the betterment of mankind, optimizing the world for harmony and productivity. In a less altruistic way, this knowledge also confers advantage in a competitive world. For example, in many economic models individuals are abstractly considered as autonomous agents that aim to maximize their rewards, and in this constantly evolving and competitive environment, understanding the opponent's decision-making process provides a critical advantage, as articulated by Sun Tzu more than two millennia ago.²

With the ascendancy of machines and their ever-increasing roles as decision makers, I would argue that a parallel intellectual pursuit should commence, this time, for the machines. While humans have tried to understand humans in human terms over the past few millennia, the rise of machines calls for the machines to be understood in their own terms. This is the gist of the new type of learning,

¹ While powerful medical imaging devices such as functional MRI can now look into the brain and perhaps offer a physiological perspective on the human decision-making process, many details are still elusive and obscure at this time.

² It is said that if you know your enemies and know yourself, you will not be imperiled in a hundred battles (The Art of War, Chapter Three).

PAGE 14

Figure 1-1. Schematic illustration of deconstructive learning. Deconstructive learning aims to understand the inner working of an AI system without access to its source code. Here, we assume a given classification system is a black-box oracle, and our objective is to figure out how it works by simply querying it.

deconstructive learning, proposed and investigated in this dissertation. While the ultimate objective of most learning problems in machine learning and computer vision is the determination of classifiers from labeled training data, for deconstructive learning, the objects of study are the classifiers themselves. As its name suggests, the goal of deconstructive learning is to deconstruct a given classifier by determining and characterizing (as much as possible) the full extent of its capability, revealing all of its powers, subtleties and limitations. This is an uncharted territory that has not been studied previously in the literature, and in this dissertation, I will focus my attention on a (broad) class of object detectors (binary classifiers) in computer vision. My starting point is the seemingly innocuous question: given an image-based binary classifier C as a black-box oracle, how much can we learn of its internal working by simply querying it (Figure 1-1)? The main contribution of this dissertation is my answer to this question, in the form of a novel computational framework that aims to capture the spirit of deconstructive learning outlined above. Specifically, I will propose an algorithm

PAGE 15

that, by simply querying a given detector with images, can identify its internal feature and classifier.

In this novel context of understanding the inner working of a (smart) machine, several parallels with the example of understanding human decision making described above are immediately noticeable. First, our approach is also indirect and external in methodology in that our algorithm has access only to the responses of the machine (detector), and its similarity with the often-conjured image of a psychologist psychoanalyzing his patient by issuing probing questions is quite apparent. Second, there are several practical motivations for pursuing deconstructive learning, with both constructive and adversarial components. On the practical side, I believe that deconstructive learning can provide greater flexibility to the users of AI/machine learning products because it allows the users to determine the full extent of an AI/ML system, and therefore to create their own adaptation, modification or improvement of the given system for specific and specialized tasks. For example, once a kernel machine³ has been deconstructed, it can be subject to various kinds of improvement and upgrade in terms of its application scope, runtime efficiency and others. Imagine a kernel machine that was originally trained to recognize humans in images. By deconstructing the kernel machine and knowing its kernel type and possibly its support vectors, we can improve and upgrade it to a kernel machine that also recognizes other objects such as vehicles, scenes and other animals. On the other hand, in the context of adversarial learning, deconstructive learning allows a given system to be defeated and its deficiencies revealed. Finally, perhaps the most compelling reason for studying deconstructive learning is inscribed in the famous motto uttered by David Hilbert more than eighty years ago: we must know and we will know! Indeed, when presented with a black-box classifier (especially one with great repute), I have found the problem of determining

³ Support vector machines (SVM) using kernel functions

PAGE 16

the secret of its inner working by simply querying it with images both fascinating and challenging, a problem with its peculiar elegance and charm.

1.1 Deconstructive Learning And Computer Vision

As an example of deconstructive learning, imagine that we are presented with a classifier of great repute, say a pedestrian (human) detector. The detector, as a binary classifier of images, is presented only as a binary executable that takes images as inputs and outputs ±1 as its decision for each image. The classifier is laconic in the sense that except for the predicted label ±1, it does not divulge any other information, such as the confidence level or classification margin associated with each decision. However, we are allowed to query the detector (classifier) using images, and the problem studied in this dissertation is to determine the inner working of the classifier using only image queries and the classifier's laconic responses. For example, can we determine the type of features it uses? What kind of internal classifier does it deploy? Support vector machine (SVM) or cascade of linear classifiers or something else? If it uses an SVM internally, what kind of kernel function does it use? How many support vectors are there, and what are the support vectors?

Similar to many problems tackled in computer vision, deconstructive learning is an inverse problem; therefore, without an appropriate regularization, the problem is ill-posed and it is impossible to define desired solutions. In particular, since we are allowed to access only the laconic responses of the classifier, the scope seems almost unbounded. The appropriate notion of regularization in this context is to define a tractable domain on which solutions can be sought, and the main contribution of this dissertation is the proposal of a computational framework that allows us to pose and answer the above questions as computationally tractable problems. Our proposal is based on a specific assumption on the classifier C, namely that its internal structure follows the common two-component design: a feature-transform component that transforms the input image into a feature and a machine-learning component that produces the output by applying

PAGE 17

its internal classifier to the feature (Figure 1-2). Many existing binary classifiers in computer vision follow this type of design, a clear demonstration of the division of labor between practitioners in computer vision and machine learning. For example, most of the well-known detectors such as face and pedestrian detectors (e.g., [8, 34, 50]) conform to this particular design, with other lesser-known but equally important examples in scene classification, object recognition and others (e.g., [10, 27]) adopting the same design. By clearly delineating the vision and learning components, we can formulate a computational framework for deconstructing C as the identification problem for its two internal components from a finite collection of potential candidates. More precisely, for a given vision classifier C (e.g., an object detector), the deconstruction process requires a list of features (and their associated transforms) F and a list of (machine learning) classifiers C. Based on these two lists, the algorithm would either identify the components of C among the elements in F and C or return a void to indicate failure in identification. Computationally, the lists define the problem domain, and they constitute the required minimal prior knowledge of C. In practice, the general outline of the feature used in a particular vision algorithm is often known and can be ascertained through various sources such as publications. However, important design parameters such as smoothing values, cell/block/bin sizes, etc., are often not available, and these parameters can be determined by searching over an expected range of values that make up the elements in F. Similarly, the type of classifier used can often be narrowed down to a small number of choices (e.g., an SVM or a cascade of linear classifiers). Within this context, we introduce three novel notions, feature identifiers, classifier deconstructors and geometric feature-classifier compatibility, as the main technical components of the deconstruction process. Specifically, feature identifiers are a set of image-based operations, such as image rotations and scalings, that can be applied to the input images, and the different degrees of sensitivity and stability of the features in F under these operations would allow us to exclude elements in F, making

PAGE 18

the process more efficient. For example, suppose F contains both Scale-Invariant Feature Transform (SIFT) and Histogram of Oriented Gradients (HOG)-based features. Since SIFT is in principle rotationally invariant, SIFT-based features are more stable under image rotations than HOG-based features; and therefore, if C uses a SIFT-based feature internally, its outputs would be expected to be more stable under image rotations. Therefore, by querying C with rotated images and comparing the results with those of un-rotated images, we can exclude features in F that are rotationally sensitive. The classifier deconstructors, on the other hand, are algorithms that can deconstruct classifiers in C using a (relatively) small number of features by recognizing certain geometric characteristics of the classifier's decision boundary (e.g., its parametric form). For example, an SVM deconstructor algorithm is able to (given sufficiently many features) determine the number of support vectors and the type of kernel used by a kernel machine by recognizing certain geometric characteristics of its decision boundary. The interaction between elements in F and C during the deconstruction process is based on the notion of geometric feature-classifier compatibility: a pair (f, c) of feature f and classifier c is compatible if, given sufficiently many features defined by f, the deconstructor algorithm associated with c can correctly recognize its decision boundary. More specifically, given a vision classifier C internally represented by a pair (f, c) of feature f and classifier c, we can query C using a set of images I1, ..., In, and using the feature (and its associated transform) f, we can transform the images into features in the feature space specified by f. The deconstructor algorithm associated with c then determines the classifier based on these features. However, for an incorrect hypothetical pair (f̃, c̃), the difference between the transformed features specified by f̃ and f is generally non-linear, and this non-linearity changes the geometric characteristics of the decision boundary in the feature space specified by f̃, rendering the deconstructor algorithm c̃ unable to identify the decision boundary (Figure 1-2).
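The rotation-based feature-identifier test described above can be sketched in a few lines. Assumptions (not from the dissertation's implementation): images are NumPy arrays, the image-based operation is a 90-degree rotation, and the black box is an `oracle` callable returning ±1.

```python
import numpy as np

def rotation_stability(oracle, images):
    # Feature-identifier sketch: fraction of query images whose black-box
    # label is unchanged after a 90-degree in-plane rotation.  A score near
    # 1 is consistent with a rotation-invariant internal feature
    # (SIFT-like); a low score points to a rotation-sensitive one
    # (HOG-like), letting us prune candidates from the feature list F.
    agree = sum(oracle(img) == oracle(np.rot90(img)) for img in images)
    return agree / len(images)
```

In practice one would compare such stability scores across several image-based operations (rotations, scalings) against the known invariances of each candidate feature in F.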

PAGE 19

The abstract framework outlined above provides a practical and useful modularization of the deconstruction process, so that the individual elements, such as the formation of feature and classifier lists, feature identifiers and classifier deconstructors, can be subject to independent development and study. In the following chapters, we will realize the abstract framework in concrete terms. Specifically, we introduce two deconstructor algorithms: one for support vector machines (SVM) and one for the cascade of linear classifiers. The former is a popular family of classifiers widely used in vision applications, and the latter is often deployed in object detectors, with the face detector of Viola-Jones as perhaps the most well-known example [50].

Deconstructing Kernel Machines: An important technical contribution of this work is an algorithm for deconstructing support vector machines. More specifically, given a kernel machine C, we ask the following four questions:

1. Can the kernel function be determined?
2. Can the number, m, of support vectors be determined?
3. Can the (kernel) subspace spanned by the support vectors of C be determined?
4. Can the support vectors themselves be determined?

My investigation has shown that the first three questions can be answered affirmatively. I will assume that the kernel machine uses one of five types of kernel functions: polynomial kernels (linear, quadratic, and cubic), the hyperbolic tangent kernel and the RBF kernel. Let Rl denote the feature space in which C is defined, and I will also assume that m <
PAGE 20

on computing rudimentary differential geometric features of the decision boundary, its normal vectors.

Deconstructing Cascades of Linear Classifiers: Recall that a cascade of linear classifiers is structurally a degenerate decision tree (e.g., [50]) where each node is a linear classifier. Unlike the case of kernel machines above, the decision boundary of a linear cascade is generally not smooth (unless the cascade has only one linear classifier) but piecewise linear; therefore, its deconstruction requires a larger number of randomly generated PN-pairs to capture locally linear areas. More specifically, the deconstructor algorithm is able to:

- Determine the number of levels (depth of the tree) in the cascade.
- Show that for each linear classifier in the cascade, h(x) = n^T x + b, the normal vector n and bias b can be determined up to a multiplicative constant.

Compared to the SVM deconstructor above, the deconstructor for a cascade of linear classifiers is considerably simpler, since only linear classifiers are involved. In particular, the continuity assumption on the feature domain is necessary for recovering the normal vector n using bracketing. However, the method we have developed for deconstructing linear cascades is more general, and it also works for discrete domains (integral lattices and integral images) as in the original setting [50]. For practical evaluation of our method, we deconstructed a Soft Cascade (based on [11]) trained on face images. Deconstructing the implementation of the Viola-Jones face detector [49], present in the OpenCV library, is more challenging. In [49], each linear classifier is constructed by AdaBoosting [25] Haar-like features. Evaluation of the Haar-like features used by Viola-Jones on any given image can be represented by the dot product between the image vector and a vector h whose elements have values equal to the weights of the positive and negative regions of the given Haar feature; thus they behave like a linear classifier. For complete deconstruction we need not only to find the linear classifier at each node of the decision tree but also the Haar-like features used to construct those linear classifiers. In both the original

PAGE 21

Boosted Cascade by Viola-Jones [49] and Soft Cascade [11], the ordering and placement of the classifiers is very important; using the computational time required to reject negative samples, we were able to recover the number of stages in the cascades, the location of each linear classifier and the features that make up those classifiers.

Figure 1-2. Proposed deconstruction algorithm for a classifier with two-component design. Left: Schematic illustration of deconstructive learning. Center: Two-component design of a classifier: a feature-transform component provided by computer vision followed by a feature-classification component furnished by machine learning. Right: Schematic illustration of the proposed deconstruction algorithm. Internally, the algorithm searches over a set of candidate feature spaces and probes the spaces for decision boundaries. Only in the correct feature space would the parametric form of the decision boundary be recognized by the deconstructor algorithm.

1.2 Related Work

To the best of our knowledge, there is no previous work on deconstructive learning (DL) as described above. However, [36] studied the problem of deconstructing linear classifiers in a context that is slightly different from ours. This corresponds to linear kernel machines, and consequently their scope is considerably narrower than ours, as (single) linear classifiers are relatively straightforward to deconstruct. Active learning (e.g., [20][3][4]) shares certain similarities with deconstructive learning in that it also has a notion of querying a source. However, the main distinction is their specificities and outlooks: for active learning, it is general and relative, while for DL, it is specific and absolute. More precisely, for active learning, the goal is to determine a classifier from a concept class with some prescribed (PAC-like) learning error bound using samples generated from the underlying joint distribution of feature and label. In this model, the learning target is the joint distribution, and the optimal learned classifier is relative to the given concept class. On the other hand, in DL, the learning target is a given classifier

PAGE 22

and the classifier defines an absolute partition of the feature space into two disjoint regions of positive and negative features. Furthermore, the classifier is assumed to belong to a specific concept class (e.g., kernel machines with known types of kernel function), and the goal of DL is to identify the classifier within the concept class using the geometric features of the decision boundary. In this absolute setting, geometry replaces probability, as the joint feature-label distribution gives way to the geometric notion of the decision boundary as the main target of learning. In particular, bracketing is a fundamentally geometric notion that is generally incompatible with a probabilistic approach, and with it, DL possesses a much more efficient and precise tool for exploring the spatial partition of the feature space; consequently, it allows for a direct and geometric approach without requiring much probability.

1.3 Road Map

In Chapter 2, we will present the deconstructor algorithm for support vector machines. In this chapter, we will also introduce most of the technical concepts used in the subsequent deconstruction processes: bracketing, PN-pairs, finding normals and their subspace, etc. In Chapter 3, we will present the deconstructor algorithm for cascades of linear classifiers, and in that chapter we will present detailed experimental results on the deconstruction of the OpenCV face detector. Using these deconstructor algorithms for two different families of classifiers, we will present the detailed method and algorithm for deconstructing classification systems employing the two-component design described above. In particular, technical constructs such as the feature and classifier lists $\mathcal F$, $\mathcal C$ and notions such as geometric feature-classifier compatibility will be discussed in detail, with multiple experimental results that successfully deconstruct linear and nonlinear classification systems using common computer vision features such as HOG. Finally, in the last chapter of this dissertation, I will briefly speculate on the future application potential of deconstructive learning.
1.4 Etymology

The word Deconstruction in Deconstructive Learning takes part of its meaning from how it has been used in philosophical and literary analysis. Mainly derived from Jacques Derrida's 1967 work Of Grammatology [22], the method of Deconstruction tends to look at a philosophical argument or a literary piece by analyzing the fundamental conceptual distinctions presented in it, with the aim of showing contradictions and/or the possibility of multiple meanings. Further clarification on the meaning of deconstruction has been given by Barbara Johnson [18], who described it as much closer to the original meaning of the word 'analysis' itself, which etymologically means 'to undo', a virtual synonym for 'to de-construct'. Instead of finding multiple versions by assigning different meanings to parts of an argument, in deconstructive learning we are trying to find relationships between parts (e.g., support vectors) that can explain the given classifier. In particular, deconstructive learning provides a framework for analyzing classification systems and exposing their functionality and mechanics. For this, deconstructive learning does a form of exploration in the domain specified by the feature and classifier lists in order to properly explain the given classification system, an aspect shared by deconstructive reading, where multiple versions are generated by assigning different meanings to parts of arguments. However, this is where the similarity between the intention and objective of our work and Deconstruction in literary analysis ends. Our work deals with more concrete settings and explores classification systems with tools that generate measurable and verifiable results.
CHAPTER 2
DECONSTRUCTING KERNEL MACHINES

While the ultimate objective of most learning problems is the determination of classifiers from labeled training data, for deconstructive learning the objects of study are the classifiers themselves. As its name suggests, the goal of deconstructive learning is to deconstruct a given classifier $\mathcal C$ by determining and characterizing (as much as possible) the full extent of its capability, revealing all of its powers, subtleties and limitations. Since classifiers in machine learning come in a variety of forms, deconstructive learning can correspondingly be formulated and posed in many different ways. This chapter focuses on a family of binary classifiers based on support vector machines [48], and deconstructive learning will be formulated and studied using geometric and algebraic methods without recourse to probability and statistics. Given an SVM (kernel)-based binary classifier $\mathcal C$ as a black-box oracle, we want to figure out how much we can learn of its internal working by querying it. Specifically, we assume the feature space $\mathbb R^d$ is known and the kernel machine has $m$ support vectors such that $d > m$ (or $d \gg m$); in addition, the classifier $\mathcal C$ is laconic in the sense that for a feature vector, it only provides a predicted label ($\pm 1$) without divulging other information such as margin or confidence level. We formulate the problem of understanding the inner working of $\mathcal C$ as characterizing the decision boundary of the classifier, and we introduce the simple notion of bracketing to sample points on the decision boundary within a prescribed accuracy. For the five most common types of kernel function, i.e., the linear, quadratic and cubic polynomial kernels, the hyperbolic tangent kernel and the Gaussian kernel, we show that with $O(dm)$ queries, the type of kernel function and the (kernel) subspace spanned by the support vectors can be determined. In particular, for polynomial kernels, an additional $O(m^3)$ queries are sufficient to reconstruct the entire decision boundary, providing a set of quasi-support vectors that can be used to efficiently evaluate the deconstructed classifier. We speculate briefly
on the future application potential of deconstructing kernel machines, and we present experimental results validating the proposed method.

2.1 Problem Statement

Let us assume that the (continuous) feature space in which the classifier $\mathcal C$ is defined is a $d$-dimensional vector space $\mathbb R^d$, and the classifier $\mathcal C$ is given as a binary-valued function $\mathcal C : \mathbb R^d \to \{-1, +1\}$, indicating the class assignment of each feature $\mathbf x \in \mathbb R^d$. As a kernel machine, $\mathcal C$ is specified by a set of $m$ support vectors $\mathbf y_1, \ldots, \mathbf y_m \in \mathbb R^d$ and a kernel function $K(\mathbf x, \mathbf y)$ such that the decision function $\phi(\mathbf x)$ is given as the sum
$$\phi(\mathbf x) = \omega_1 K(\mathbf x, \mathbf y_1) + \cdots + \omega_m K(\mathbf x, \mathbf y_m),$$
where $\omega_1, \ldots, \omega_m$ are the weights. With the bias $b$,
$$\mathcal C(\mathbf x) = \begin{cases} +1 & \text{if } \phi(\mathbf x) \le b, \\ -1 & \text{if } \phi(\mathbf x) > b. \end{cases}$$
The classifier $\mathcal C$ is also assumed to be laconic in the sense that, except for the binary label, it does not divulge any other potentially useful information such as margin or confidence level. With these assumptions, we formulate the problem of deconstructing $\mathcal C$ through the following list of four questions (ordered in increasing difficulty):

1. Can the kernel function $K(\mathbf x, \mathbf y)$ be determined?
2. Can the subspace $S_Y$ spanned by the support vectors be determined?
3. Can the number $m$ of support vectors be determined?
4. Can the support vectors themselves be determined?

Without loss of generality, we will henceforth assume $b = 1$. Therefore, if the support vectors and the kernel function are known, the weights $\omega_i$ can be determined completely given enough points $\mathbf x$ on the decision boundary
$$\Gamma = \{\mathbf x \mid \mathbf x \in \mathbb R^d,\ \phi(\mathbf x) = b\}.$$
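The laconic setting above is easy to emulate in code. The following minimal sketch (all values are made up for illustration: a single support vector at the origin, weight 2, Gaussian kernel with $\sigma = 1$, bias $b = 1$) wraps a kernel decision function so that only the label $\pm 1$ escapes, which is exactly the interface the deconstruction algorithm is allowed to query.

```python
import numpy as np

# Illustrative kernel machine (hidden internals): one support vector at the
# origin, weight 2, Gaussian kernel with sigma = 1, bias b = 1.
Y = np.zeros((2, 1))                     # support vectors as columns
w = np.array([2.0])                      # weights omega_i
sigma, b = 1.0, 1.0

def phi(x):
    # decision function phi(x) = sum_i w_i exp(-||x - y_i||^2 / (2 sigma^2))
    sq = np.sum((Y - x[:, None]) ** 2, axis=0)
    return np.sum(w * np.exp(-sq / (2.0 * sigma ** 2)))

def C(x):
    # Laconic oracle: divulges only the binary label, never phi(x) itself.
    return +1 if phi(x) <= b else -1

print(C(np.zeros(2)))                    # -1: phi(0) = 2 > b near the support vector
print(C(np.array([2.0, 0.0])))           # +1: phi decays below b away from it
```

Everything that follows treats such a `C` as an opaque function of this form: the deconstructor may call it as often as it likes, but sees nothing except the returned sign.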
That is, a kernel machine $\mathcal C$ can be completely deconstructed if its support vectors and kernel function are known. The four questions above are impossible to answer without further quantification on the type of kernel function and the number of support vectors. In this chapter, we assume 1) the unknown kernel function belongs to one of the following five types: polynomial kernels of degree one, two and three (linear, quadratic and cubic kernels), the hyperbolic tangent kernel and the RBF kernel, and 2) the number of support vectors is less than the feature dimension, $d > m$ (or $d \gg m$), and the support vectors are linearly independent. For most applications of kernel machines, these two assumptions are not particularly restrictive, since the five types of kernel are arguably among the most popular ones. Furthermore, as the feature dimensions are often very high and the support vectors are often thought to be a small number of the original training features that are critical for the given classification problem, it is generally observed that $d > m$. With these two assumptions, the method proposed in this chapter shows that the first three questions can be answered affirmatively. While the last question cannot be answered for transcendental kernels, we show that using recent results on tensor decomposition (e.g., [12]), a set of quasi-support vectors can be computed for a polynomial kernel that recovers the decision boundary exactly. Given the laconic nature of $\mathcal C$, it seems that the only effective approach is to probe the feature space by locating points on the decision boundary and to answer the above questions using local geometric features computed from these sampled points. More precisely, the proposed algorithm takes the classifier $\mathcal C$ and a small number of positive features in $\mathbb R^d$ as the only inputs. Starting with this small number of positive features, the algorithm proceeds to explore the feature space by generating new features and utilizing these new features and their class labels provided by $\mathcal C$ to produce points on the decision boundary. The challenge is therefore to use only a comparably small number of sampled features (i.e., queries to $\mathcal C$) to learn enough about $\Gamma$ in order to
answer the questions, and our main contribution is an algorithm whose complexity (to be defined later) is linear in the dimension $d$ of the ambient space. Sampling points on $\Gamma$ can be accomplished easily using bracketing, the same idea used in finding the root of a function (e.g., [30]). Given a pair of positive and negative features (a PN-pair), the intersection of $\Gamma$ and the line segment joining the two features cannot be empty, and bracketing allows at least one such point on $\Gamma$ to be determined up to any prescribed precision. Using bracketing as the main tool, the first two questions can be answered by exploring the geometry of $\Gamma$ in two different ways. First, the decision boundary is given as the implicit surface of the multi-variate function $\phi$, $\phi(\mathbf x) = b$. With high-dimensional features, it is difficult to work directly with $\Gamma$ or $\phi(\mathbf x)$; instead, the idea is to examine the intersection of $\Gamma$ with a two-dimensional subspace formed by a PN-pair. The locus of such an intersection is in fact determined by the kernel function, and by computing the intersection, we can ascertain the kernel function on this two-dimensional subspace. For the second question, the answer is to be found in the normal vectors of the hypersurface $\Gamma$. Using bracketing, the normal vector at a given point on $\Gamma$ can be determined, again in principle, up to prescribed precision. From the parametric forms of the kernel functions, it readily follows that the normal vectors of $\Gamma$ are generally quite well-behaved in the sense that they either belong to the kernel subspace $S_Y$ spanned by the support vectors or they are affine translations of the kernel subspace $S_Y$. For the former, a quick application of singular value decomposition immediately yields the kernel subspace $S_Y$, and for the latter, the kernel subspace $S_Y$ can be computed via a rank-minimization problem that can be solved (in many cases) as a convex optimization problem with the nuclear norm. If we define the complexity of the algorithm as the required number of sampled points in the feature space, it will be shown that the complexity of the proposed method is essentially $O(dm)$, as it requires $O(m)$ normal vectors to determine the $m$-dimensional kernel subspace and $O(d)$ points to determine the normal vector at a point in $\mathbb R^d$. The constant depends on the number of
steps used for bracketing, and if the features are assumed to be drawn from a bounded subset of $\mathbb R^d$, this constant is then independent of the dimension $d$. We note that for a polynomial kernel of degree $D$, the decision function $\phi(\mathbf x)$ is a degree-$D$ polynomial in $d$ variables. Therefore, in principle, $\mathcal C$ can be deconstructed by fitting a polynomial of degree $D$ in $\mathbb R^d$ given enough sampled points on $\Gamma$. However, this solution is in general not useful because it does not extend readily to transcendental kernels. Furthermore, the number of required points is in the order of $d^D$, and correspondingly, a direct polynomial fitting would require the inversion of a large dense (Vandermonde) matrix that is in the order of $d^D \times d^D$. With a moderate dimension of $d = 100$ and $D = 3$, this would require $10^6$ points and the inversion of a $10^6 \times 10^6$ dense matrix. Our method, on the other hand, encompasses both the transcendental and polynomial kernels, and at the same time it avoids the direct polynomial fitting in $\mathbb R^d$ and has an overall complexity that is linear in $d$, making it a truly practical algorithm.

2.2 Preliminaries

Let $\mathbb R^d$ denote the feature space equipped with its standard Euclidean inner product, and for $\mathbf x, \mathbf y \in \mathbb R^d$, $\|\mathbf x - \mathbf y\|^2 = (\mathbf x - \mathbf y)^\top(\mathbf x - \mathbf y)$. For the kernel machines studied in this chapter, we assume their kernel functions are of the following five types:

Linear kernel: $K(\mathbf x, \mathbf y) = \mathbf x^\top \mathbf y$,
Quadratic kernel: $K(\mathbf x, \mathbf y) = (\mathbf x^\top \mathbf y + 1)^2$,
Cubic kernel: $K(\mathbf x, \mathbf y) = (\mathbf x^\top \mathbf y + 1)^3$,
Hyperbolic tangent kernel: $K(\mathbf x, \mathbf y) = \tanh(\kappa\,\mathbf x^\top \mathbf y + \theta)$,
Gaussian kernel: $K(\mathbf x, \mathbf y) = \exp\left(-\frac{\|\mathbf x - \mathbf y\|^2}{2\sigma^2}\right)$,

for some constants $\kappa$, $\theta$, $\sigma$. We will further refer to the three polynomial kernels and the hyperbolic tangent kernel as the Type-A kernels and the Gaussian kernel as the Type-B kernel. This particular taxonomy is based on their forms, which can be generically written
as
$$K(\mathbf x, \mathbf y) = f(\mathbf x^\top \mathbf y), \qquad K(\mathbf x, \mathbf y) = g(\|\mathbf x - \mathbf y\|^2),$$
for some smooth univariate functions $f, g : \mathbb R \to \mathbb R$. Given the forms of the kernel function, an important consequence is that the decision boundary $\Gamma$ is determined in large part by its intersection with the kernel subspace $S_Y$ spanned by the support vectors. More precisely, for $\mathbf x \in \mathbb R^d$, let $\bar{\mathbf x}$ denote the projection of $\mathbf x$ on $S_Y$:
$$\bar{\mathbf x} = \operatorname{argmin}_{\mathbf y \in S_Y} \|\mathbf x - \mathbf y\|^2.$$
For a Type-A kernel $K(\mathbf x, \mathbf y) = f(\mathbf x^\top \mathbf y)$, we have $K(\mathbf x, \mathbf y_i) = K(\bar{\mathbf x}, \mathbf y_i)$ for every support vector $\mathbf y_i$. In particular, $\bar{\mathbf x}$ is on the decision boundary if and only if $\mathbf x$ is. For Type-B kernels, we have (using the Pythagorean theorem with $q^2 = \|\mathbf x\|^2 - \|\bar{\mathbf x}\|^2$)
$$K(\mathbf x, \mathbf y_i) = g(\|\mathbf x - \mathbf y_i\|^2) = g(\|\bar{\mathbf x} - \mathbf y_i\|^2 + q^2),$$
and with the Gaussian kernel $g$, we have $g(a + q^2) = e^{-\frac{q^2}{2\sigma^2}} g(a)$. It then follows that for any $\mathbf x \in \Gamma$, its projection $\bar{\mathbf x}$ on $S_Y$ must satisfy
$$\phi(\bar{\mathbf x}) = e^{\frac{q^2}{2\sigma^2}} \phi(\mathbf x) = e^{\frac{q^2}{2\sigma^2}} b.$$
In other words, the decision boundary $\Gamma$ is essentially determined by the level sets of $\phi(\mathbf x)$ on the kernel subspace $S_Y$. Since the decision boundary is given as the implicit surface $\phi(\mathbf x) = b$, a normal vector $\mathbf n(\mathbf x)$ at a point $\mathbf x \in \Gamma$ can be given as the gradient of $\phi(\mathbf x)$:
$$\mathbf n(\mathbf x) = \nabla\phi(\mathbf x) = \sum_{i=1}^m \omega_i \nabla_{\mathbf x} K(\mathbf x, \mathbf y_i).$$
For the two types of kernels we are interested in, the gradient vectors assume the following forms:
$$\nabla_{\mathbf x} K(\mathbf x, \mathbf y) = f'(\mathbf x^\top \mathbf y)\,\mathbf y,$$
$$\nabla_{\mathbf x} K(\mathbf x, \mathbf y) = 2\,g'(\|\mathbf x - \mathbf y\|^2)\,(\mathbf x - \mathbf y).$$
Using the above formulas, it is clear that for Type-A kernels, the normal vector $\mathbf n(\mathbf x)$ depends on $\mathbf x$ only through the coefficients in the linear combination of the support vectors, while for Type-B kernels, $\mathbf x$ actually contributes to the vectorial component of $\mathbf n(\mathbf x)$. An important element in the deconstruction method introduced below is to exploit this difference in how the normal vectors are computed for the two types of kernels. For example, for a polynomial kernel of degree $D$, a normal vector at a point $\mathbf x \in \Gamma$ is
$$\mathbf n(\mathbf x) = \sum_{i=1}^m D\,\omega_i (\mathbf x^\top \mathbf y_i + 1)^{D-1}\,\mathbf y_i.$$
As a special case, for the linear kernel ($D = 1$), we have
$$\mathbf n(\mathbf x) = \sum_{i=1}^m \omega_i \mathbf y_i,$$
which is independent of $\mathbf x$. For the Gaussian kernel, we have
$$\mathbf n(\mathbf x) = -\sum_{i=1}^m \frac{\omega_i}{\sigma^2} \exp\left(-\frac{\|\mathbf x - \mathbf y_i\|^2}{2\sigma^2}\right)(\mathbf x - \mathbf y_i).$$
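The structural difference between the two gradient formulas above can be checked numerically. The sketch below (a minimal illustration with made-up dimensions, weights and support vectors, not the deconstruction algorithm itself) evaluates the analytic normals of a cubic-kernel decision function at random points and confirms that the stacked normal matrix has rank exactly $m$, i.e., for a Type-A kernel the normals span precisely the kernel subspace $S_Y$.

```python
import numpy as np

rng = np.random.default_rng(0)
d, m, D = 30, 5, 3                       # ambient dim, #support vectors, cubic kernel
Y = rng.standard_normal((d, m))          # support vectors (columns)
w = rng.standard_normal(m)               # weights omega_i

def normal(x):
    # n(x) = sum_i D * w_i * (x.y_i + 1)^(D-1) * y_i   (Type-A, polynomial kernel)
    coef = D * w * (Y.T @ x + 1.0) ** (D - 1)
    return Y @ coef

# Stack normals at s random points; for a Type-A kernel they all lie in span(Y).
s = 50
N = np.column_stack([normal(rng.standard_normal(d)) for _ in range(s)])
sv = np.linalg.svd(N, compute_uv=False)
rank = int(np.sum(sv > 1e-8 * sv[0]))
print(rank)                              # 5: the normals span the kernel subspace
```

For the Gaussian kernel the same experiment would yield a full-rank $\mathbf N$, since each normal carries an additional multiple of $\mathbf x$ itself; this is exactly the distinction the method below exploits.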
intersectthedecisionboundaryinatleastonepoint.Usingbracketing,wecanlocateonesuchpointxonthedecisionboundarywithinanygivenaccuracy,i.e.,wecanusebracketingtoobtainaPN-pairp,nsuchthatkp)]TJ /F4 11.955 Tf 12.6 0 Td[(nk0.Withasmallenough,themidpointbetweenp,ncanbeconsideredapproximatelyasasampledpointxonanditsnormalvectorcanthenbeestimated.Thealgorithmproceedstosampleacollectionofpointsandtheirnormalsonthedecisionboundary,andusingthisinformation,thealgorithmrstcomputesthekernelsubspaceSYandthisstepseparatestheType-AkernelsfromtheType-Bkernels(Gaussiankernel).ThefourType-Akernelscanfurtherbeidentiedbycomputingtheintersectionofwithafewrandomlychosentwo-dimensionalsubspaces.Thesetwostepsprovidetheafrmativeanswerstotherstthreequestionsintheintroduction.Forpolynomialkernels,wecandetermineasetofquasi-supportvectorsthatprovidetheexactrecoveryofthedecisionboundary.However,nosuchresultsforthetwotranscendentalkernelsareknownatpresentandweleaveitsresolutiontofutureresearch. 2.3.1BracketingGivenaPN-pair,p,n,thedecisionboundarymustintersectthelinesegmentjoiningthetwofeatures.Therefore,wecanusebracketing,thewell-knownroot-ndingmethod(e.g.,[ 30 ]),tolocatethepointon.Notethatbracketingdoesnotrequirethefunctionvalue,onlyitssign.ThisiscompatiblewithourclassierCthatonlygivesbinaryvalues1.Inparticular,ifwebisecttheintervalineachstepofbracketing,thelengthoftheintervalishalvedateachiteration,andforagivenprecisionrequirement>0,thenumberofstepsrequiredtoreachitisintheorderofjlogj.IfwefurtherassumethatthefeaturesaregeneratedfromaboundedsubsetofRd(whichisoftenthecase)withdiameterlessthanK,thenforanyPN-pairp,n,bracketingterminatesafteratmost log2K)]TJ /F8 11.955 Tf 11.95 0 Td[(log2+1(2) 31

PAGE 32

steps,anumberthatisindependentoftheambientdimensiond.Figure 2-1 demonstratesbracketingprocedure.Figureonbottom-rightindicatesresultantinformationwegetwhenweperformbracketingmanytimeovermanyrandompairsofpositiveandnegativesamples,wehavepeggeddownthedecisionboundary.ItalsoshowshowthecomplexityofdecisionboundarywilleffecthowmanyPN-pairswillberequired.Ifdecisionboundaryislinearwecandowithveryfewsamples(atleastd),quadraticorcubicwewillrequiremore;howeverlinearly-piecewisedecisionboundariesaremostcomplexandwillrequirelargenumberofPN-pairs. Figure2-1. BracketprocessandthePN-pairs.Bottom-Right:Bracketingprocessisrepeatedusingdifferentpositive-negativesamplestoobtainmultiplePN-pairsThesePN-pairsallowustoobtaingeometricalinformationofthedecisionboundaryoftheclassierC. 32

PAGE 33

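The bisection form of bracketing described above takes only a few lines. In the sketch below, the classifier is a hypothetical stand-in oracle (a linear boundary chosen purely for illustration); as in the laconic setting, only its $\pm 1$ output is ever consulted.

```python
import numpy as np

def bracket(C, p, n, eps=1e-6):
    """Bisect the segment [p, n] until ||p - n|| <= eps.

    C is a black-box classifier returning only +1 or -1;
    p must be labeled +1 and n labeled -1."""
    p, n = np.asarray(p, float), np.asarray(n, float)
    while np.linalg.norm(p - n) > eps:
        mid = 0.5 * (p + n)
        if C(mid) == +1:
            p = mid              # midpoint is positive: shrink from the positive end
        else:
            n = mid              # midpoint is negative: shrink from the negative end
    return 0.5 * (p + n)         # approximate point on the decision boundary

# Stand-in laconic oracle: +1 on one side of the hyperplane x_0 + x_1 = 1.
C = lambda x: +1 if x[0] + x[1] <= 1.0 else -1
x = bracket(C, p=[0.0, 0.0], n=[2.0, 2.0], eps=1e-8)
print(np.round(x, 6))            # [0.5 0.5], a point on the boundary x_0 + x_1 = 1
```

The number of iterations matches the bound above: the initial segment length is halved at every step, so the loop runs roughly $\log_2(\|\mathbf p - \mathbf n\|/\epsilon)$ times regardless of the dimension.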
2.3.2 Estimating Normal Vectors

Given the pair $\mathbf p, \mathbf n$, let $\mathbf p, \mathbf n$ denote the two points near $\Gamma$ after the bracketing step and $\mathbf x$ denote their midpoint. To estimate the normal vector at $\mathbf x$, we use the fact that the (unknown) kernel function is assumed to be smooth and $\Gamma$ is a level surface of the decision function $\phi(\mathbf x)$, which is a linear combination of smooth functions. Consequently, a randomly chosen point on $\Gamma$ is almost surely non-singular [32] in that it has a small neighborhood in $\Gamma$ that can be well approximated using a linear hyperplane (its tangent space) in $\mathbb R^d$. Accordingly, we estimate the normal vector at $\mathbf x$ by linearly fitting a set of points on $\Gamma$ that belong to a small neighborhood of $\mathbf x$ (Figure 2-2). More specifically, we choose a small $\delta \gg \epsilon > 0$ and generate PN-pairs on the sphere centered at $\mathbf x$ with radius $\delta$. Using bracketing and the convexity of the ball enclosed by the sphere, we obtain PN-pairs that are near $\Gamma$ and no more than $\delta$ away from $\mathbf x$. Taking the midpoints of these PN-pairs, we obtain a set of randomly generated $O(d)$ points on $\Gamma$. We linearly fit a $(d-1)$-dimensional hyperplane to these points, and the normal vector is then computed as the eigenvector associated with the smallest eigenvalue of the normalized covariance matrix. The result can be further sharpened by repeating the step over multiple $\delta$ and taking the (spherical) average of the unit normal vectors. However, in practice, we have observed that good normal estimates can be consistently obtained using one small $\delta \approx 10^{-3}$ (with $\epsilon = 10^{-6}$) and $2d$ sampled points.¹

2.3.3 Determine Kernel Subspace $S_Y$

To determine the kernel subspace $S_Y$, we will use the formulas for the normal vectors given in the preliminaries above. Assume that we have sampled $s > m$ points on $\Gamma$ and

¹ We note that for sufficiently small $\delta$, the angular error of the estimated normal is approximately in the order of $\tan^{-1}(\epsilon/2\delta)$.
Figure 2-2. Calculating normals using PN-pairs. (A) PN-pair due to bracketing, (B) sampling points around the PN-pair, (C) using the classifier system $\mathcal C$ to label the points, (D) recovering the normal on the decision boundary.

their corresponding normal vectors. Let $\mathbf N, \mathbf X$ denote the following two matrices
$$\mathbf X = [\mathbf x_1\ \mathbf x_2\ \ldots\ \mathbf x_s], \qquad \mathbf N = [\mathbf n_1\ \mathbf n_2\ \ldots\ \mathbf n_s],$$
which horizontally stack together the points $\mathbf x_i$ and their normal vectors $\mathbf n_i$, respectively. If all $\mathbf n_i$ are correctly recovered (without noise), we have the following:

- For Type-A kernels, $\mathbf n_i \in S_Y$, i.e., $\mathbf n_i$ is a linear combination of the support vectors.
- For Type-B kernels, $\mathbf n_i \in \lambda_i \mathbf x_i + S_Y$ for some $\lambda_i \in \mathbb R$, i.e., $\mathbf n_i - \lambda_i \mathbf x_i \in S_Y$.

Note that $\lambda_i$ depends on $\mathbf x_i$, and the two statements can be readily checked using the gradient formulas above. Therefore, the kernel subspace $S_Y$ can be recovered, for Type-A
Figure 2-3. Intersections of $\Gamma$ and two-dimensional affine subspaces. An SVM using the cubic kernel is trained on the MNIST dataset. Top row: Midpoints of PN-pairs near the decision boundary after bracketing. Bottom row: Sampled polynomial curves given by the intersections of the decision boundary with two-dimensional affine subspaces containing the images above.

kernels, using singular value decomposition (SVD). Specifically, let $\mathbf N = \mathbf U \mathbf D \mathbf V^\top$ denote the singular value decomposition of $\mathbf N$. There are precisely $m$ nonzero singular values, and $S_Y$ is spanned by the first $m$ columns of $\mathbf U$. For Type-B, a slight complication arises because we must determine $s$ constants $\lambda_1, \ldots, \lambda_s$ such that the span of the following matrix is $S_Y$:
$$\mathbf N - \mathbf X\boldsymbol\Lambda = [\mathbf n_1\ \mathbf n_2\ \ldots\ \mathbf n_s] - [\lambda_1\mathbf x_1\ \lambda_2\mathbf x_2\ \ldots\ \lambda_s\mathbf x_s],$$
where $\boldsymbol\Lambda$ is a diagonal matrix with the $\lambda_i$ as its entries. Note that in general, $\mathbf N, \mathbf X$ are of full rank $\min(d, s)$, and we are trying to find a set of $\lambda_i$ such that the above matrix has rank $m < \min(d, s)$. This rank-minimization problem can be relaxed using the nuclear norm,
and there are efficient algorithms for solving this type of convex optimization problem [41]. We note that for Type-A kernels, the rank is minimized at $\lambda_1 = \cdots = \lambda_s = 0$. In both cases, the span of $\mathbf N - \mathbf X\boldsymbol\Lambda$ gives the kernel subspace $S_Y$. As the support vectors are assumed to be linearly independent, the dimension of $S_Y$ then gives the number of support vectors. For noisy recovery, the above method requires the standard modification that uses the significant gap between singular values as the indicator. For Type-A kernels, this is applied to the SVD decomposition of $\mathbf N$ directly, and for Type-B kernels, this is applied to the SVD decomposition of $\mathbf N - \mathbf X\boldsymbol\Lambda$, with $\boldsymbol\Lambda$ determined by the nuclear norm minimization.

2.3.4 Determine Kernel Type

For determining the four Type-A kernels, we will examine the locus of the intersection of the decision boundary $\Gamma$ with a two-dimensional affine subspace containing a point close to the decision boundary. More specifically, let $\mathbf x_+, \mathbf x_-$ denote a PN-pair that is sufficiently close to the decision boundary. We can randomly generate a two-dimensional subspace containing $\mathbf x_+, \mathbf x_-$ by, for example, taking the subspace $A$ formed by $\mathbf x_+, \mathbf x_-$ and the origin. For a generic two-dimensional subspace $A$, its intersection with $\Gamma$ is a one-dimensional curve, and the parametric form of this curve is determined by the (yet unknown) kernel function. See Figure 2-3. Take a polynomial kernel of degree $D$ as an example. By its construction, the intersection of the decision boundary $\Gamma$ and the affine subspace $A$ is nonempty, and the locus of the intersection forms a curve in $A$ that satisfies a polynomial equation of degree $D$. This can be easily seen as follows: take $\mathbf x_+$ as the origin on $A$ and choose (arbitrary) orthonormal vectors $\mathbf U_1, \mathbf U_2 \in \mathbb R^d$ such that the triplet $\mathbf x_+, \mathbf U_1, \mathbf U_2$ identifies $A$ with $\mathbb R^2$. Therefore, any point $\mathbf p \in A$ can be uniquely identified with a two-dimensional vector $[p_1, p_2] \in \mathbb R^2$ as $\mathbf p = \mathbf x_+ + p_1\mathbf U_1 + p_2\mathbf U_2$.
If $\mathbf p \in A$ is a point in the intersection of $A$ with the decision boundary, $\phi(\mathbf p) = b$, we have
$$\sum_{i=1}^m \omega_i\left(\mathbf x_+^\top \mathbf y_i + p_1 \mathbf U_1^\top \mathbf y_i + p_2 \mathbf U_2^\top \mathbf y_i + 1\right)^D = b,$$
which is a polynomial of degree $D$ in the two variables $p_1, p_2$. Therefore, to ascertain the degree of the polynomial kernel, we can (assuming $D < 4$):

- Sample at least nine points on the intersection of the decision boundary and $A$.
- Fit a bivariate polynomial of degree $D$ to the points. If the fitting error is sufficiently small, this gives an indication that the polynomial kernel is indeed of degree $D$.

We note that, up to a multiplicative constant, a bivariate cubic polynomial in $\mathbb R^2$ has nine coefficients, and this gives the minimum number of points required to fit a cubic polynomial. In addition, since the degree of the polynomial is invariant under any linear transform, this shows that the choice of the two basis vectors is immaterial. The advantage of the reduction from $\mathbb R^d$ to $\mathbb R^2$ is considerable, as it implies that the complexity of this step is essentially independent of the ambient dimension $d$. For a transcendental kernel such as the hyperbolic tangent kernel, the locus of the intersection is generally not a polynomial curve, and this can be detected by the curve-fitting error. Although, in principle, one affine subspace $A$ is sufficient to distinguish the four Type-A kernels (as shown by the above equation), in practice, due to various issues such as possible degeneracy of the polynomial curve and the curve-fitting error, we randomly sample several affine subspaces and use a majority voting scheme to determine the kernel type.

2.3.5 Complexity Analysis and Exact Recovery of $\Gamma$

The steps outlined above essentially aim to ascertain the parametric form of the decision boundary using a (relatively) small number of sampled points on $\Gamma$. We note that the bracketing error in general can be explicitly controlled, and there are only two steps above that incur uncertainty: the normal estimate and the nuclear norm relaxation of the rank minimization problem. Our approach of using the local linear approximation to estimate the normal vector at a point is the standard one common in computational
geometry and machine learning (e.g., [6][44][43]), and the nuclear norm relaxation is the standard convex relaxation for the original NP-hard rank minimization problem [14]. A complete complexity analysis of the proposed algorithm would require detailed probabilistic estimates pertaining to these two steps, and although there are partial and related results scattered in the literature (e.g., [14][7]), we are unable to provide a definitive result at this point. Instead, we present a simple complexity analysis below under the assumption that these two steps can be determined exactly, i.e., the convex relaxation using the nuclear norm gives the same result as the original rank minimization problem. The computational complexity can be defined as the number of features (not necessarily only on the decision boundary) in $\mathbb R^d$ sampled during the process, and this number is the same as the number of queries to the classifier $\mathcal C$. From the above, it is clear that to determine the $m$-dimensional kernel subspace, at least $O(m)$ sampled normals are required, i.e., $\mathbf N$ has at least $m$ columns. Furthermore, to determine each normal vector at a given point $\mathbf x$, $O(d)$ points are required, as the ambient dimension is $d$. Therefore, the total complexity is $O(dm)$. The multiplicative constant here, as can be readily seen, is bounded by the maximum number of steps required for the bracketing, and this number is independent of the dimensions $d, m$, provided the features are drawn from a bounded subset of $\mathbb R^d$ (see the bracketing bound above). Once the kernel subspace $S_Y$ and the kernel type are determined, this allows us to focus on the intersection $\Gamma \cap S_Y$. In the case $m \ll d$, this represents a substantial reduction in dimension. Using recent results on
tensor decomposition (e.g., [12][5]),² we can decompose $\phi(\mathbf x)$ (more precisely, its homogenized version) as
$$\phi(\mathbf x) = \sum_{i=1}^r \ell_i(\mathbf x)^D,$$
where $\ell_1, \ldots, \ell_r$ are linear (homogeneous) polynomials. The smallest integer $r$ for such a decomposition gives the rank of the (homogeneous) polynomial (as a symmetric tensor), and in general, such a decomposition is also possible for $r$ greater than the rank. If we write the linear polynomials (after de-homogenization) as $\ell_i(\mathbf x) = \mathbf z_i^\top \mathbf x + 1$ for some vector $\mathbf z_i$, it is tempting to infer $\mathbf z_i$ as the support vector $\mathbf y_i$ from the above equation. However, because of the non-uniqueness of the decomposition, $\mathbf z_i \ne \mathbf y_i$ in general. Nevertheless, the $\mathbf z_i$ do act as if they were support vectors in the sense that the evaluation of the polynomial $\phi(\mathbf x)$ becomes computationally trivial using the above decomposition. For polynomial kernels, the recovery of these quasi-support vectors $\mathbf z_i$ then determines the decision boundary exactly, essentially completing the deconstruction process. Although the general algorithms for tensor decomposition [12][5] require some mathematical machinery, the special case of quadratic kernels (degree-two polynomials) can be readily solved using the eigen-decomposition of a symmetric matrix. For transcendental kernels, no similar results are known at present. Although the reduction from $\mathbb R^d$ to $\Gamma \cap S_Y \subset S_Y$ offers the possibility of reconstructing the decision boundary in $S_Y$, due to the nature of the transcendental functions, the details are considerably more difficult than in the polynomial case, and we leave their resolution to future research.

2.4 Experiments

We present two sets of experiments in this section. The first set of experiments evaluates various components of the proposed method, and the second set of experiments

² Algorithm 5.1 in http://arxiv.org/pdf/0901.3706v2.pdf, the archived version of [12].
applies the proposed method to explicitly deconstruct a kernel machine and subsequently improve it using incremental SVM [24].

2.4.1 Evaluation of Deconstruction Algorithm

We present two experiments, one using kernel machines whose support vectors are randomly generated (first experiment) and one using support vectors trained on real image data (second experiment). We remark that there is no qualitative difference between deconstructing kernel machines with randomly generated support vectors and deconstructing kernel machines trained with real data since, in both cases, the kernel function and decision function are the same. Using randomly generated kernel machines allows us to study the behavior of the deconstruction algorithm over a much wider range of support vector configurations, demonstrating its accuracy and robustness. In the first set of experiments, we set the feature dimension $d = 30$, and we randomly generate 12 support vectors. For determining the kernel type, we sample 25 points close to the decision boundary, and at each point, we compute the intersection of $\Gamma$ and a two-dimensional subspace. We fit a quadratic and then a cubic polynomial to these points, and the smallest degree giving an error below some threshold value is declared as the degree of the kernel. However, if in both cases the fitting errors are greater than the threshold value, the kernel is declared to be a Gaussian kernel at this location. This is repeated at 25 sampled locations, and a majority vote is used to determine the kernel type. Once the kernel type is determined, we use SVD to determine the dimension of the kernel subspace $S_Y$ and the subspace itself. For the Gaussian kernel, the nuclear-norm minimization is performed before using SVD to locate the subspace $S_Y$. In this experiment, we sample $s = 100$ points on the decision boundary in order to form the matrices $\mathbf N, \mathbf X$, and the tolerance $\epsilon$ in the bracketing step is set at $10^{-6}$. Let $\bar S_Y$ denote the kernel subspace computed by our method. We use the principal angles [29] between the two subspaces $S_Y, \bar S_Y$ as the metric for quantifying the error.
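The principal-angle metric used here can be computed directly from orthonormal bases of the two subspaces: the cosines of the principal angles are the singular values of $Q_1^\top Q_2$, where $Q_1, Q_2$ are orthonormal bases. A minimal sketch (the dimensions below are chosen arbitrarily for illustration):

```python
import numpy as np

def principal_angle_cosines(A, B):
    """Cosines of the principal angles between span(A) and span(B).

    A, B are d x k matrices whose columns span the subspaces."""
    Qa, _ = np.linalg.qr(A)            # orthonormal basis of span(A)
    Qb, _ = np.linalg.qr(B)            # orthonormal basis of span(B)
    return np.linalg.svd(Qa.T @ Qb, compute_uv=False)

rng = np.random.default_rng(1)
d, m = 30, 12
S = rng.standard_normal((d, m))        # "ground-truth" kernel subspace basis
# A perfectly recovered subspace: any other basis of the same span.
S_hat = S @ rng.standard_normal((m, m))
print(np.round(principal_angle_cosines(S, S_hat), 6))   # all ones: same subspace
```

All cosines equal to one indicates identical subspaces; deviations from one quantify the angular error of a noisy recovery, which is exactly how Figures 2-5, 2-6 and 2-8 report accuracy.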
Summary. The gap between the singular values of $\mathbf N$ is an important indicator of the dimension of the kernel subspace, and it is affected by the accuracy of the normals. Figure 2-4 shows the effect in terms of the radius $\delta$ used in computing the normals, showing the expected result that the ratio $\delta/\epsilon$ is directly related to the accuracy of the recovered normals (larger ratios provide more accuracy). For determining the kernel type, the specificity for the polynomial kernels is close to 100%, with a specificity of approximately 80% for the Gaussian kernel (and the hyperbolic tangent kernel). This can be attributed to the majority voting scheme used in assigning the kernel type, and we leave the design of more robust criteria as important future work. The accuracy of the recovered kernel subspaces is shown in Figures 2-5 and 2-6. The first figure shows the means and variances of the (cosines of the) twelve principal angles, taken over one hundred randomly generated kernel machines using polynomial kernels. Note that $\cos^{-1}(0.99)$ is approximately $8°$, and this gives a good indication of the accuracy. In the second figure, the twelve principal angles computed before and after the rank minimization are shown, indicating the correctness and necessity of performing rank minimization. Finally, each deconstruction makes between 60,000 and 70,000 queries to the classifier, and on a typical 3 GHz machine, it takes no more than a few minutes to complete the deconstruction process. Since the algorithm is readily parallelizable (which would be important for deconstruction in high-dimensional feature spaces), a fully parallelized and optimized implementation can be expected to shorten the running time considerably, perhaps into the range of only a few seconds. In the second experiment, we train a kernel machine with a cubic polynomial kernel using 1000 images from the MNIST dataset [34]. The positive class consists of images of the digit 2, and the negative class consists of the digits 0, 5, 7, 8. The trained kernel machine has 275 support vectors. Figure 2-3 displays the intersections of the decision boundary with several two-dimensional affine subspaces; note the superpositions of the images of 2 with images of other digits. In this experiment, we randomly generate 200
Figure 2-4. Singular values of the normal matrix $\mathbf N$ for different choices of $\delta$. The kernel machine uses a cubic polynomial kernel with 12 support vectors. The expected gaps between the 12th and 13th singular values are indicated by the green markers. Note that for a fixed tolerance $\epsilon = 10^{-6}$, the optimal $\delta = 10^{-3}$. For smaller $\delta$ without changing $\epsilon$ accordingly, the estimated normals become less accurate. (Image best viewed in color)

two-dimensional affine subspaces, and for each subspace, its vote on the type of kernel is determined as above. Figure 2-7 shows the distribution of votes, clearly indicating the correct result. For this experiment, the gap in the singular values of $\mathbf N$ indicates the correct dimension of the kernel subspace (275), and the kernel subspace is also successfully recovered.

2.4.2 Kernel Machine Upgrade Without Source Code

In the second experiment, we demonstrate the possibility of upgrading a kernel machine without access to the kernel machine's source code. As outlined in the introduction, we apply the deconstruction algorithm to deconstruct the kernel machine. This step provides us with the kernel type and the quasi-support vectors (for a polynomial kernel machine). For the subsequent upgrade (or update), we use the incremental SVM algorithm [24] to retrain the kernel machine given the new training data. Specifically, we first train a kernel machine using the MNIST dataset: images of digit 1 as positive
Figure2-5. Meansandvariancesofthecosinesofthetwelveprincipalanglesbetween SYandSY.Meansandvariancesaretakenoveronehundredindependentdeconstructionresultsforkernelmachineswithtwelvesupportvectorsusingapolynomialkernel(Quadratickernelontheleftandcubickernelontheright).(Imagebestviewedincolor) Figure2-6. Meansandvariancesofthethetwelveprincipalanglesbetween SYandSY.MeansandvariancesaretakenoveronehundredindependentdeconstructionresultsforkernelmachineswithtwelvesupportvectorsusingaGaussiankernel.Theprincipalanglesbeforeandafterrankminimizationareshown.(Imagebestviewedincolor) samplesandthenegativetrainingsamplescomprisetheremainingdigitsexcept8.DimensionalityreductionisappliedtotheimagesusingPCAtoafeaturespaceofdimension60.AnSVMwithquadratickernelistrainedonthesetrainingsamples,resultingin97.30%truepositivedetectionrateand99.17%truenegativedetection 43


Figure 2-7. Distribution of votes on kernel type. For a cubic kernel machine trained on 1000 MNIST images, the distribution of votes on kernel type for 200 randomly sampled two-dimensional affine subspaces. The correct result is clearly indicated.

rate on the test dataset. The initial kernel machine has 48 support vectors. During

Figure 2-8. Cosines of the principal angles between the recovered kernel subspace and the ground-truth kernel subspace.

deconstruction, the kernel subspace is recovered using 800 sampled normal vectors. Let N denote the matrix obtained by horizontally stacking together the normal vectors and N = USD its SVD decomposition. The plot of the singular values is shown in Figure 2-9, and the significant gap between the 48th and 49th singular values indicates the


correct dimension (and the number of support vectors). The principal angles between the kernel subspace estimated by the first 48 columns of U and the ground truth are shown in Figure 2-8. Once the kernel subspace is recovered, we proceed to recover the

Figure 2-9. Singular values of the matrix N. The gap between the 48th and 49th singular values is significant, as the gaps among the remaining singular values are substantially smaller. The correct dimension of the kernel subspace (and the number of support vectors) is 48.

quasi-support vectors. The kernel machine defined by the quasi-support vectors should be a good approximation of the original kernel machine, and this is shown in Table 2-1, where we compare the classification results using the recovered kernel machine and the original one. In this example, the results as expected are quite similar, with the recovered kernel machine actually performing slightly better. Once we have recovered the quasi-support vectors, we next proceed to upgrade the kernel machine. The task is to upgrade a kernel machine that recognizes only digit 1 to a kernel machine that recognizes digits 1 and 8. The classification results for the initial and upgraded kernel machines are tabulated in Table 2-2. As shown in the table, before the upgrade, the


original kernel machine performs poorly on the images of digit 8, and for the upgraded machine, both digits can now be successfully classified.

Table 2-1. Confusion matrices for the original kernel machine and the kernel machine defined by the recovered quasi-support vectors. Both machines are tested on the same test dataset.

             Quasi-SV Machine          Original Machine
             outcome                   outcome
             +ve         -ve           +ve         -ve
  Positive   100.00%     0.00%         97.30%      2.70%
  Negative   3.73%       96.27%        0.83%       99.17%

Table 2-2. Comparison of classification results for the original kernel machine and the upgraded kernel machine

  Classification Rate    Original Machine    Upgraded Machine
  Digit 1                97.30%              100.00%
  Digit 8                0.00%               92.31%
  Negative               99.17%              97.93%
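Throughout this experiment, the kernel-subspace dimension (and hence the number of support vectors) is read off from the largest gap in the singular-value spectrum of the stacked normal matrix N. A minimal sketch of that step, on synthetic data mimicking the 48-support-vector setting (the gap statistic and noise level here are illustrative, not the dissertation's exact implementation):

```python
import numpy as np

def estimate_subspace_dim(N, max_dim=None):
    """Estimate the kernel-subspace dimension from stacked normal vectors.

    N: (k, m) array whose columns are (noisy) sampled normal vectors.
    Returns the 1-indexed dimension d where the gap between singular
    values sigma_d and sigma_{d+1} is largest.
    """
    s = np.linalg.svd(N, compute_uv=False)      # descending singular values
    if max_dim is None:
        max_dim = len(s) - 1
    gaps = s[:max_dim] - s[1:max_dim + 1]       # successive spectral gaps
    return int(np.argmax(gaps)) + 1

# Synthetic check: 800 normals drawn from a 48-dimensional subspace of R^60.
rng = np.random.default_rng(0)
basis = np.linalg.qr(rng.standard_normal((60, 48)))[0]
N = basis @ rng.standard_normal((48, 800)) + 1e-6 * rng.standard_normal((60, 800))
print(estimate_subspace_dim(N))  # expect 48
```

With near-noiseless normals, the gap at index 48 dwarfs the gaps inside the signal spectrum, matching the staircase behavior seen in Figure 2-9.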


CHAPTER 3
DECONSTRUCTING CASCADE OF LINEAR CLASSIFIERS

A linear classifier is perhaps the simplest classifier for vectorial features in terms of its computational complexity and the geometric complexity of its decision boundary, a linear hyperplane. However, the geometric simplicity often renders it ineffective for nontrivial classification problems that almost always require nonlinear decision boundaries. The nonlinearity can be achieved using multiple linear classifiers, and a popular method for organizing these linear classifiers into an effective classifier is to structure them using a linear decision tree, a cascade of (binary) classifiers. The combination of linear classifiers and decision trees creates a flexible and versatile platform for learning highly nonlinear decision boundaries, with only a modest increase in computational complexity. In particular, the face detector originally proposed by Viola and Jones [49] (and its many variants that have been incorporated into a wide variety of commercially-available devices) is certainly the most well-known example of a successful binary classifier that employs the cascading architecture, and there is a substantial body of literature devoted to methods for learning (linear) cascades. For example, cascade classifiers were used in object detection with rotated Haar-like features [35], contour fragments [45], and deformable part models [28]. Numerous applications of cascade schemes were developed, including face detection [52], pedestrian classification [40], biomedical signal analysis [42], and text detection in natural images [16].

As per our discussion in Chapter 2, linear classifiers can be deconstructed quite easily by the methods listed in that chapter; a cascade of linear classifiers, however, poses a more challenging scenario. The cascade of linear classifiers makes the decision boundary piecewise linear: on the one hand, this guarantees the success of bracketing with high probability, but on the other, it requires bracketing to be performed a large number of times by pairing random positive and negative samples. Note that, being piecewise


linear, the decision boundary cannot be represented in parametric form. The order of the classifiers in a decision tree is very important; we explore using computational time to figure out the size of the cascade and the ordering of the linear classifiers. We present experimental results on deconstructing the face detector implemented in the OpenCV library to confirm the validity and viability of the proposed approach.

3.1 Problem Introduction

Let us assume a binary classifier C, internally a cascade of linear classifiers, is given and the feature space R^k is known. Examples of the latter include images of a given size (k pixels) for face detection and image features such as SIFT [37] and HOG [19] with known dimensions that are popular for image-based classification problems. The object of interest in deconstructive learning is the classifier C, and the goal is to determine as much as possible the internal working of C. In the context of deconstructing a linear cascade, the aim is then to recover the constituent linear classifiers in the cascade by probing the feature space and locating the decision boundary.

As the simplest linear cascade, a single linear classifier provides an illustrative example. A binary linear classifier h(v) for v in R^k is determined by a normal vector h in R^k and a threshold value ω such that a feature v is considered positive (h(v) = 1) if h^T v ≥ ω, and negative (h(v) = −1) otherwise. Therefore, C is completely deconstructed if h and ω can be determined, and this can be easily accomplished using the idea of bracketing a pair of positive and negative features to locate sufficiently many points on the decision boundary. In theory, once k points on the decision boundary have been located, both the normal vector h and the threshold constant ω can be determined (up to an unimportant multiplicative constant).

For a general linear cascade, its decision boundary D, although highly nonlinear, is readily shown to be piecewise linear. Essentially, each linear piece of D corresponds to a part of the decision boundary for one of its constituent linear classifiers (details below), and geometrically, the local linearity allows the constituent linear classifiers to be


recovered using bracketing, as for the single linear classifiers described above. In particular, if C has m constituent linear classifiers, recovering these linear classifiers requires bracketing points on the m local linear pieces of the decision boundary D. Unfortunately, there is no a priori way to know these m local linear pieces on D, and this inevitably requires a randomized approach that initiates bracketing using a large number of pairs of positive and negative features. On the other hand, the cascading structure cannot be determined by considering geometry alone; instead, a careful analysis of the CPU running times for a large collection of features will allow us to determine, with reasonable certainty, the location of each linear classifier in the cascade. Once the linear classifiers and their locations in the cascade have been determined, the entire makeup of the classifier C can be determined in a straightforward manner.

The specific algorithm proposed in this chapter closely follows the outline above: using bracketing, it samples a large number of points on the decision boundary D, and for each such sampled point, its associated local linear piece is estimated, yielding a normal vector and its location in the cascade. In real and practical experiments, two important complications become apparent. First, due to the local linearity of D and the finite precision of the computation, the algorithm often overestimates the number of linear classifiers by approximately 30%, and extra steps are required to eliminate spurious linear classifiers. Second and more importantly, many image-based applications (e.g., face detection) use integral features, often because the intensity values themselves are quantized integers. In particular, the features belong to the integer lattice Z^k in R^k, and not surprisingly, the deconstruction problem is considerably easier for real features than for integer features (linear programming vs. integer programming), in part because bracketing is no longer effective for integer features. Furthermore, the discrete nature of the integer features also requires more constraints to determine the parameters in full.
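The bracketing procedure referred to above can be sketched as a straightforward bisection against the black-box oracle. A minimal illustration (the `classify` oracle here is a toy linear stand-in, and the tolerance is illustrative):

```python
import numpy as np

def bracket(classify, v_pos, v_neg, tol=1e-8):
    """Bisect the segment between a positive and a negative feature
    until the pair brackets the decision boundary within `tol`."""
    v_pos, v_neg = np.asarray(v_pos, float), np.asarray(v_neg, float)
    assert classify(v_pos) == 1 and classify(v_neg) == -1
    while np.linalg.norm(v_pos - v_neg) > tol:
        mid = (v_pos + v_neg) / 2.0
        if classify(mid) == 1:
            v_pos = mid          # midpoint is positive: shrink from that side
        else:
            v_neg = mid          # midpoint is negative
    return (v_pos + v_neg) / 2.0  # a point (approximately) on the boundary

# Toy oracle: a single linear classifier with normal h and threshold w.
h, w = np.array([1.0, 2.0]), 0.5
classify = lambda v: 1 if h @ v >= w else -1
p = bracket(classify, np.array([1.0, 1.0]), np.array([0.0, 0.0]))
print(abs(h @ p - w))  # close to zero: p lies near the hyperplane h.v = w
```

As in root finding, the label C(v) plays the role of the sign of the function value; each iteration halves the bracket length.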


Algorithm 1: Calculating Normal from PN-Pairs
  Input: v+, v−, r
  Output: n
    m = (v+ + v−)/2
    S = samples in an r-neighbourhood of m
    r_S = Classify(S)
    n = FB(S, r_S)

Algorithm 2: Bracketing Algorithm for samples in the integer lattice
  Input: v in R^n
  Output: r in {1, −1, 0}^n
    r = 0_n
    r_v = Classify(v)
    for i = 1 to n do
      v1 = v, v2 = v
      v1(i) = v1(i) + 1
      v2(i) = v2(i) − 1
      r1 = Classify(v1)
      r2 = Classify(v2)
      if r1 * r2 < 0 then
        r_i = 1
        if r1 == r_v then
          r_i = −1
        end if
      end if
    end for

Algorithm 3: Bracketing
  Input: v+, v−, ε
    while |v+ − v−| > ε do
      d = (v+ + v−)/2
      if Classify(d) > 0 then
        v+ = d
      else
        v− = d
      end if
    end while

Figure 3-1. Algorithms for deconstruction of classifiers. Left: finding the normal for the real lattice; FB locates the boundary normal by Fisher discriminant. Middle: bracketing when the integer lattice is being used. Right: bracketing algorithm.

3.2 Preliminaries

We will denote by C the classifier to be deconstructed, and it is assumed to be a binary function defined on the k-dimensional feature space R^k taking values (labels) in {1, −1}: for a feature v in R^k, the binary classifier C returns C(v) = ±1, indicating the feature's label. Other than the labels, the black-box classifier C does not provide any other information (such as confidence level) about the feature v. In the following discussion, features will always refer to real-valued features, vectors in R^k. However, on occasion, we will discuss integral features, and they refer to features on the integer lattice Z^k in R^k. A classifier C taking only integral features is then a binary function defined on Z^k.
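As noted in Section 3.1, once sufficiently many boundary points of a single linear classifier are located, (h, ω) follows, up to scale, from the homogeneous system h^T p − ω = 0 over the sampled points p. A small sketch of that recovery (synthetic plane and sample points; NumPy only):

```python
import numpy as np

def hyperplane_from_boundary_points(P):
    """Recover (h, w), up to scale, from points satisfying h.p = w.

    P: (n, k) array of points on the hyperplane, n >= k.
    Solves [P, -1] [h; w] = 0 via the smallest right singular vector.
    """
    A = np.hstack([P, -np.ones((P.shape[0], 1))])
    _, _, Vt = np.linalg.svd(A)
    x = Vt[-1]                  # null-space direction of the stacked system
    return x[:-1], x[-1]        # normal h (length k) and threshold w

# Synthetic check against a known hyperplane h.v = w in R^3.
rng = np.random.default_rng(1)
h_true, w_true = np.array([2.0, -1.0, 0.5]), 1.5
xy = rng.standard_normal((10, 2))               # pick two coordinates freely
z = (w_true - xy @ h_true[:2]) / h_true[2]      # solve for the third
P = np.hstack([xy, z[:, None]])
h_est, w_est = hyperplane_from_boundary_points(P)
scale = w_true / w_est                          # fix the arbitrary scale/sign
print(np.allclose(h_est * scale, h_true, atol=1e-8))
```

The multiplicative ambiguity mentioned in the text appears here as the free scale of the null-space vector; fixing w (or normalizing h) removes it.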


Figure 3-2. Filters recovered by probing the decision boundary using positive and negative pairs. Black, white and gray regions indicate the negative, positive and null regions of the Haar feature. (a) Recovered Haar features considered as real features, and (b) recovered Haar features considered as integral features. Compared with integer features, the real features are considerably easier to recover. For integer features, there are several incorrect and spurious features.

Internally, C is assumed to employ a cascade of binary linear classifiers. Specifically, the cascade is a linear decision tree with l nodes (or layers), and for each node indexed by an integer 1 ≤ i ≤ l, there is an associated binary classifier (node classifier) L_i. For a feature v, the value C(v) is determined by the values L_i(v) according to

    C(v) = 1 if L_i(v) = 1 for all 1 ≤ i ≤ l, and C(v) = −1 otherwise.

Each node classifier L_i in turn is formed by a collection of n_i (weak) binary linear classifiers h_i1, h_i2, ..., h_in_i, their associated weights a_i1, a_i2, ..., a_in_i, and a threshold


valueisuchthatthevalueLi(v)isgivenby Li(v)=8><>:1,Pnj=1aijhij(v)i)]TJ /F8 11.955 Tf 9.29 0 Td[(1,otherwise.(3)Finally,weaklinearclassierhijisspeciedbya(normal)vectorhijandathresholdvalue!ijsuchthathij(v)=8><>:1,h>ijv!ij,)]TJ /F8 11.955 Tf 9.29 0 Td[(1,otherwise.Forfacedetectors(e.g.,[1]),eachweakclassierhijisassociatedwithaHaarfeature,andinparticular,thenormalvectorhij,whenviewedasanimage,ismadeupofasmallnumberofrectangularregionswithconstantintensity(asmallintegerwithabsolutevaluestypicallynogreaterthanfour).Insummary,theblack-boxclassierCisspeciedbythefollowinginternalparameters:L,ni,i,aij,hij,!ij,andtheclassierChasm=n1+n2+...+nlconstituentlinearclassiershij.ThegoalofthedeconstructionlearningistorecovertheseparametersbyprobingtheclassierCusingfeaturesinRk,predenedorrandomlygenerated.Specically,thelearningalgorithmaimsto determinethenumberofnodes(l)inthecascade, determine,foreachnodei,thenumberofweaklinearclassiers(ni)andtheirassociatednormalvectorshij,weightsaijandthresholdvalues!ij. determine,foreachnodei,itsthresholdi.Itisclearthatthethresholdparametersi,!ijcanbenormalizedtoone,inexchangeforrescalingtheweightsaijandthenormalvectorhij,respectively.Therefore,thedeconstructionprocesscanbeformulatedintworelatedparts: Determinethenumberofnodes,l,andthelocationinthecascadeofeachweaklinearclassierhij(i.e.,iinhij). Determinetheweaklinearclassiershijanditsassociatednormalvectorhij. 52


Figure 3-3. Multiple Haar-like features present in the same retrieved filter, indicating that the decision boundary is defined by a combination of more than one Haar classifier. (a) Integer case. (b) Real case.

The first step can be determined by exploiting the computational consequence of the cascading architecture, and the second part requires knowing some geometry of the decision boundary D of C. For computational efficiency, the cascade is often arranged in increasing order of the node classifiers' computational complexity, i.e., n_1 ≤ n_2 ≤ ... ≤ n_l. A feature v is either rejected at some node i, in which case C(v) = −1, or it passes all l nodes, in which case C(v) = 1. In particular, the running time is longer for a feature that is rejected at a deeper part of the cascade, and positive features always take the longest. The CPU running time is a useful consequence of the cascading architecture, and careful analysis of running times will allow us to determine the structure of the cascade.

Geometry of the Decision Boundary D. While the decision boundary D for C can be highly nonlinear, it is nonetheless piecewise linear. In order to analyze D, we will start with the decision boundary of each node classifier L_i. Let R+_i denote the positive region for L_i in R^k: v is in R+_i iff L_i(v) = 1. It is clear from Equation (3) that R+_i is the union of a collection of cells in R^k, where the cells are intersections of n_i halfspaces in R^k defined by the values of the weak classifiers h_ij. For example, if L_i has three weak classifiers h_i1, h_i2, h_i3, then R^k is partitioned into eight cells according to the eight different values of (h_i1, h_i2, h_i3), and the set R+_i is the union of the cells for which a_i1 h_i1 + a_i2 h_i2 + a_i3 h_i3 ≥ θ_i. In particular, the boundary of R+_i must be piecewise linear. Consequently, let R+ denote the positive


region for C in R^k, and it is immediate that R+ = R+_1 ∩ R+_2 ∩ ... ∩ R+_l, and this also implies that the boundary of R+, which is the decision boundary D of C, is also piecewise linear and a polytope (possibly unbounded). More specifically, D has a subset of measure zero containing points belonging to the intersection of the decision boundaries of two or more weak classifiers. Outside of this subset, D is locally linear (piecewise linear) in the sense that each point is contained in a (local) linear facet that is a connected component of the (open) interior of the intersection of D with the decision boundary of one unique weak linear classifier. We remark that the normal vector to a linear facet is also the normal vector for its corresponding linear classifier.

3.3 Deconstruction Algorithm

The deconstruction algorithm has two sequential steps. In the first step, the algorithm locates a large number of points on the decision boundary D. The deconstruction algorithm starts with a collection of features, and the main idea is to use these features, if necessary, to generate more features in R^k, and initiate bracketing on pairs of positive and negative samples to approximately locate points on the decision boundary D. The sampled points on D will have a large probability¹ of belonging to a linear facet of D, and the associated normal vector can then be estimated. Furthermore, for each sampled boundary point, its location in the cascade is also determined by monitoring its running times. In the second step, these sampled boundary points and their associated normal vectors and cascade locations are used to determine the weights a_ij, and hence the entire classifier C.

¹ Although theoretically the probability is one, due to finite precision, in practice the probability is strictly less than one.
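The facet-normal estimation mentioned above can be illustrated by sampling labels in a small ball around a bracketed boundary point and fitting a linear separator. In this sketch a least-squares fit stands in for the Fisher discriminant of Figure 3-1, and the oracle, radius, and sample count are all illustrative:

```python
import numpy as np

def estimate_normal(classify, v_pos, v_neg, r=1e-3, n_samples=200, seed=0):
    """Estimate the local boundary normal near a bracketed pair (v_pos, v_neg).

    Samples points in an r-ball around the midpoint, queries the black-box
    classifier, and fits a linear separator by least squares; its weight
    vector approximates the facet normal as r -> 0 and |v_pos - v_neg|/r -> 0.
    """
    rng = np.random.default_rng(seed)
    m = (np.asarray(v_pos) + np.asarray(v_neg)) / 2.0
    S = m + r * rng.standard_normal((n_samples, len(m)))
    y = np.array([classify(s) for s in S], dtype=float)
    # least-squares fit y ~ [S, 1] @ [n; b]; n points toward the +1 side
    A = np.hstack([S, np.ones((n_samples, 1))])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    n = coef[:-1]
    return n / np.linalg.norm(n)

# Toy check against a known linear boundary in R^3.
h = np.array([3.0, -1.0, 2.0]); h /= np.linalg.norm(h)
classify = lambda v: 1 if h @ v >= 0.0 else -1
n_hat = estimate_normal(classify, 1e-4 * h, -1e-4 * h)
print(abs(h @ n_hat))  # close to 1: the estimate aligns with the true normal
```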


Figure 3-4. Average running time for the 4400 features rejected by the classifier C. (a) For each plotted point (t, i), t is the average running time (vertical axis) taken by feature i (horizontal axis). (b) Average running time sorted in increasing order, showing that the features can be divided into different groups according to their running times and hence their locations in the cascades. We perform mean-shift clustering using the processing times taken by the features. Horizontal lines indicate boundaries of clusters. Clusters are visible in color print.

Bracketing. To locate a point on or near the decision boundary we perform bracketing (Section 2.3.1). Specifically, starting with a pair of positive and negative features, v+ and v− respectively, we can successively halve the interval between v+ and v− to produce a sequence of positive-negative pairs, all of which lie on the line segment joining v+ and v−:

    (v+, v−) → (v+^1, v−^1) → (v+^2, v−^2) → (v+^3, v−^3) → ...

such that |v+^(i+1) − v−^(i+1)| = (1/2)|v+^i − v−^i|, as in root finding, using the label provided by C(v) as the sign of the function value. For real features, this process can be repeated indefinitely to locate the point on the decision boundary within the required precision, and the pseudo-code for the bracketing function is given in Figure 3-1. For integral features, the process stops once |v+^i − v−^i| < 1.

Estimating Normal Vectors. For a given sampled boundary point, given as a pair of positive and negative samples (v+, v−) such that s = |v+ − v−| << 1, the algorithm


Figure 3-5. Average time taken by features rejected at a given level.

proceeds to estimate the normal vector associated with its local linear facet. For real features, the idea is to randomly generate a number of positive and negative samples in a small neighborhood (of radius r) of v+, v− and compute a linear classifier that separates these features. Note that for a sufficiently small neighborhood, these randomly generated samples are linearly separable, and the normal vector of the computed linear classifier is used as an approximation of the desired normal vector. We remark that it is possible to show that this approximation becomes exact asymptotically as r → 0, s/r → 0 (for details, see Section 2.3.2). For the integral features, the above approach is not viable since the interval cannot be decreased indefinitely. However, this can be circumvented by individually perturbing each component of the feature vector v by ±1. Given the pair (v+, v−), the behaviors of the positive and negative regions of the vector h_ij are different under ±1 perturbations of v+ and v−. For instance, suppose that for a given component of h_ij, an increase by one pushes v+ into the negative region. It then follows that this component change has no effect on v−, i.e., v− will still be classified as a negative sample using the perturbed


h_ij. However, if the given component is negative, it is straightforward to see that the component change of −1 will affect the label of v− but not v+. Therefore, by observing this qualitative difference for each component of h_ij, the positive and negative (and also zero) regions of h_ij can be completely determined. The pseudocode for estimating the normal vectors for real and integral features is given in Figure 3-1. We note that for h_ij modelled after Haar features, the above process is particularly easy to implement because the positive and negative regions of h_ij are rectangular and the components typically have integral values with small absolute values.

Determining Cascade Locations. For a sampled boundary point represented as a pair v+, v−, let h_ij denote the weak linear classifier on whose local linear facet the point resides. Since v− lies on the negative side of the decision boundary for h_ij, the negative sample v− is rejected by L_i (and hence C) at node i. Therefore, by timing the running time for C to reject v−, the location of h_ij in the cascade can be determined. In particular, the negligible difference between the running times of two samples rejected at neighboring nodes can be considerably magnified by running each sample multiple times. Figure 3-4B shows an expected staircase-like plot for the running times.

Recovering the Weights. The data collected in the first step provides us with the number l of nodes in the cascade and also, for each node, the constituent linear classifiers h_ij in the node classifier L_i. The final step is to determine the weights a_ij for node i. As mentioned before, we can assume the threshold value θ_i = 1. Given a sampled boundary point represented as the pair (v+, v−), we have

    1 + ε > a_i1 h_i1(v+) + ... + a_in_i h_in_i(v+) ≥ 1 > a_i1 h_i1(v−) + ... + a_in_i h_in_i(v−) > 1 − ε

for a small ε, which gives us the approximate

Figure 3-6. Dictionary elements ordered according to the average running time taken by the negative features in the pairs. The graph follows a familiar staircase pattern, indicating the features' locations in the cascade.

constraint a_i1 h_i1(v+) + ... + a_in_i h_in_i(v+) = 1 for pairs v+, v− such that |v+ − v−| is sufficiently small. Therefore, by generating sufficiently many such pairs of positive and negative features, we have enough linear equations to solve for a set of weights².

Practical Working of the Algorithm. In practice, the main qualitative bottleneck is the step that estimates the normal vectors. Because of the finite precision in computation, we cannot compute the limit described above, and in our simplified implementation reported below, the algorithm invariably overestimates the number of weak classifiers,

² The error analysis for this approximation will be addressed in a future work.
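Given the recovered weak classifiers and their node assignments, each boundary pair supplies linear constraints on the weights a_ij: the accepted-side sign pattern must score at least the (normalized) threshold 1, and the rejected-side pattern must score below it. A small sketch posing this as a linear feasibility problem with SciPy's `linprog`; the synthetic node, the enumeration of facet pairs, and the strict-inequality margin are all illustrative choices, not the dissertation's exact procedure:

```python
import numpy as np
from itertools import product
from scipy.optimize import linprog

a_true = np.array([0.8, 0.5, 0.45, 0.3])    # hidden node weights, threshold 1

# Enumerate boundary "facet pairs": sign patterns differing in one weak
# classifier, with the node accepting one side and rejecting the other.
pairs = []
for r in product([-1.0, 1.0], repeat=4):
    r = np.array(r)
    if r @ a_true >= 1.0:
        for j in range(4):
            r2 = r.copy(); r2[j] = -r2[j]
            if r2 @ a_true < 1.0:
                pairs.append((r, r2))

# Feasibility LP: find weights a with  r+.a >= 1  and  r-.a <= 1 - margin.
margin = 1e-3
A_ub = np.vstack([[-rp for rp, rm in pairs], [rm for rp, rm in pairs]])
b_ub = np.concatenate([-np.ones(len(pairs)), (1 - margin) * np.ones(len(pairs))])
res = linprog(np.zeros(4), A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * 4)
a_rec = res.x
print(res.status, a_rec)
```

The recovered weights need not equal the hidden ones, but any feasible solution reproduces the node's accept/reject decisions on all observed facet pairs, which is all the deconstruction requires.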


typically by about 30%. Improvements are certainly possible, and this is the focus of ongoing work. However, in the context of deconstructive learning, this currently achievable result is already quite interesting, since presumably the true weak classifiers can be recovered by simply training on this collection of overestimated linear classifiers. Compared with the original training algorithm, which must search for these weak classifiers over a huge set of possible candidates, the search space in our case has been immensely reduced (only 30% more than the actual number of weak classifiers). Recovery of the actual linear classifiers, in principle, should be straightforward given enough training features.

3.4 Experiments

In this section, we present a summary of experiments on deconstructing the Soft-Cascade [11] and the Viola-Jones face detector [49] included in the OpenCV library. The Viola-Jones face detector is a linear cascade classifier (as described above) trained on 20x20 face images. For the experiments, we use face images that are detected as face images by the detector, and we also randomly generate seventy-nine thousand pairs of positive and negative samples, i.e., face and non-face image pairs. They are of the same size as the training images. The OpenCV face detector only accepts integral features, and we modify the implementation so that the detector also accepts real features. Due to limited space, we will present only the results from the first step of the algorithm that determines the weak classifiers and the number of nodes in the cascade.

3.4.1 Soft Cascade

The soft-cascade was trained using 33 Haar-like features on face images, and mathematically it is of a much simpler form. Instead of levels of cascades, it has a single stage consisting of T weak classifiers. These weak classifiers are ordered according to how soon they can reject non-face images. Let c_t(x) = α_t h_t(x) be the response of the t-th classifier; then at any stage t, if the accumulative sum H_t(x) = Σ_{i=1}^t c_i(x) is less than the stage threshold r_t, the sample is rejected. Using only about 2,000 random pairings of positive and negative samples


Figure 3-7. Recovered soft-cascade.

Figure 3-8. Percentage of the OpenCV Haar cascade filters recovered at different correlation thresholds.

in the algorithm described above, we were able to recover 32 out of 33 classifiers. The recovered features are shown in Figure 3-7, where the last classifier is visibly non-Haar in structure.

3.4.2 Viola-Jones Haar Cascade

In comparison to the self-trained soft-cascade, the Haar cascade included in OpenCV has been trained on a much larger dataset; it has more than two thousand weak Haar classifiers distributed over its 22 stages. For such a complex classifier, one has to recover the number of stages and assign classifiers to each stage. We were able to recover 98% of the weak classifiers in the OpenCV Haar cascade, where the recovered classifiers


had a correlation of 0.9 or more with the original classifiers. More detailed results are given in Figure 3-8. These results prove that the recovered classifier set does contain the classifiers used in the OpenCV cascade. We discuss below how to construct the cascade given these recovered classifiers.

Recovering Weak Classifiers. Figure 3-2 demonstrates the recovered Haar features (weak classifiers) using one hundred pairs of positive and negative features. In this experiment, we show all the recovered Haar features, considering them as real and as integral features. The deconstruction process uses bracketing and estimates the local normal vectors as described in the previous section. It is clear from the figure that 1) there are some spurious and incorrect Haar features recovered, particularly for the integer case, and 2) the integer case is substantially more difficult than the real case, as there are more spurious recoveries for the integral features.

Recovering Locations in the Cascade (stages). We present two results on recovering the locations of weak classifiers in the cascade. In the first experiment, we take 4400 features (images), and each feature is repeated 5000 times for monitoring their running times as described before. The plot is shown in Figure 3-4. For each cascade level, we calculated the average running time taken by images rejected at that level and also the standard deviation. This indicates that there is a difference between the times taken at two cascade levels, and sufficient samples would help differentiate between the layers. In the second experiment (Figure 3-5), we use 30,000 pairs of positive and negative samples to discover possible classifiers and the pairs v+, v− making the brackets. Since we know that negative images are rejected at different stages, the time it took to reject v− can be indicative of where the classifier resides (note that v− is not just any negative sample but one very near to the boundary, due to the bracketing algorithm). The collection of all the distinct weak classifiers generated in the above process is used to learn a dictionary D, and we associate each dictionary element with the average running time of the pairs with


the given element. The plot is shown in Figure 3-6. To recover the number of stages, we performed mean-shift clustering on the processing times, as shown in Figure 3-4B. This gives us a prediction of 20 stages (the original classifier has 22 stages). A few of the earlier stages are combined into one cluster. The reason is that the earlier stages not only have a small number of classifiers but also reject negative images quite quickly; therefore the processing times in the earlier stages are indistinguishable. This could be rectified by running the negative images showing low processing times more often in the loop, thus making the times of rejection more distinguishable.

Recovering Stage Information. After we cluster the recovered features into cascades, it is easy to estimate stage-level information (i.e., the right and left weights for each classifier, as well as the stage threshold). Due to the bracketing algorithm, we have a positive and negative pair near the boundary associated with each recovered classifier. All such image pairs belonging to classifiers in cluster i are used to solve for the right and left weights and the stage threshold. In addition to these images, all the boundary pairs associated with classifiers above this stage are taken as positive images.

3.5 Concluding Remarks

We conclude this chapter with a few pertinent remarks and observations. This chapter exposes a fundamental asymmetry between constructive learning (learning classifiers from training samples) and deconstructive learning. As reported in [49], the training time for the Viola-Jones face detector using AdaBoost takes more than one month. However, for deconstructive learning, the (almost) entire set of Haar features selected by AdaBoost can be recovered in a matter of a few hours. Deconstructive learning is highly parallelizable and can deconstruct a given classifier in a few hours. We have shown how to deconstruct software provided by the OpenCV library, where we dealt with the issues of the software only taking integer images. However, we still have to deal with the issues of the image processing and normalizations taking place before the image is handed over to the classifier. This could be part of our future work.
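The mean-shift step used in Section 3.4.2 to recover the number of stages can be illustrated with a crude one-dimensional implementation on synthetic rejection times; the bandwidth, cluster locations, and data below are made up, not OpenCV timing measurements:

```python
import numpy as np

def cluster_1d(times, bandwidth):
    """Crude 1-D mean-shift: shift every point to the mean of its bandwidth
    neighbourhood until convergence, then merge coincident modes."""
    x = np.sort(np.asarray(times, float))
    pts = x.copy()
    for _ in range(100):
        new = np.array([x[np.abs(x - p) < bandwidth].mean() for p in pts])
        if np.allclose(new, pts):
            break
        pts = new
    centers = []                       # merge modes closer than the bandwidth
    for p in np.sort(pts):
        if not centers or p - centers[-1] > bandwidth:
            centers.append(p)
    return centers

# Synthetic rejection times from three cascade depths (arbitrary units).
rng = np.random.default_rng(3)
times = np.concatenate([rng.normal(1.0, 0.02, 50),
                        rng.normal(2.0, 0.02, 60),
                        rng.normal(3.5, 0.02, 40)])
print(len(cluster_1d(times, bandwidth=0.3)))  # expect 3 clusters
```

As in the experiment, well-separated timing groups yield one mode per cascade depth, while stages whose timings overlap (the early, cheap stages) would merge into a single cluster.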


CHAPTER 4
TWO STAGE CLASSIFICATION SYSTEMS

In previous chapters we laid down much of the foundation of how to deconstruct a classifier. We studied the deconstruction of SVMs and cascades of linear classifiers; now, using these tools, we present a framework to deconstruct a complete classification system, in which the classifier is just one part. In the following sections we re-introduce the classification system and explain why deconstructing such a system is a complex and difficult, but at the same time rewarding, task. To evaluate the effectiveness of our method, we present results on the deconstruction of the OpenCV human/person detector, a motorbike detector (trained by us), and the deconstruction of a few examples on the MNIST dataset.

4.1 Problem Introduction

As in previous chapters, let us assume a binary classification system C is given. The system, as a binary classifier of images, is presented only as a binary executable that takes images as inputs and outputs ±1 as its decision for each image. The classifier is laconic in the sense that, except for the predicted label ±1, it does not divulge any other information such as confidence level or classification margin associated with each decision. However, we are allowed to query the detector (classifier) using images, and the problem studied in this dissertation is to determine the inner working of the classifier using only image queries and the classifier's laconic responses. Unlike previous chapters, here we are looking for complex information. We want to determine the type of features it uses. Instead of assuming, we want to figure out what kind of internal classifier it deploys: a support vector machine (SVM), a cascade of linear classifiers, or something else? And if it uses an SVM internally, what kind of kernel function does it use? How many support vectors are there and what are the support vectors?

Deconstructive learning as presented above is an inverse problem; therefore, without an appropriate regularization, the problem is ill-posed and it is impossible to


define desired solutions. In particular, since we are allowed to access only the laconic responses of the classifier, the scope seems almost unbounded. The appropriate notion of regularization in this context is to define a tractable domain on which solutions can be sought, and the main contribution of this dissertation is the proposal of a computational framework that would allow us to pose and answer the above questions as computationally tractable problems. Our proposal is based on a specific assumption on the classifier C that its internal structure follows the common two-component design: a feature-transform component that transforms the input image into a feature and a machine-learning component that produces the output by applying its internal classifier to the feature (Figure 4-1). Many existing binary classifiers in computer vision follow this type of design, a clear demonstration of the division of labor between practitioners in computer vision and machine learning. For example, most of the well-known detectors such as face and pedestrian detectors (e.g., [8, 34, 50]) conform to this particular design, with other lesser-known but equally-important examples in scene classification, object recognition and others (e.g., [10, 27]) adopting the same design. By clearly delineating the vision and learning components, we can formulate a computational framework for deconstructing C as the identification problem for its two internal components from a finite collection of potential candidates. More precisely, for a given vision classifier C (e.g., an object detector), the deconstruction process requires a list of features (and their associated transforms) F and a list of (machine learning) classifiers C. Based on these two lists, the algorithm would either identify the components of C among the elements in F and C or return a void to indicate failure in identification. Computationally, the lists define the problem domain, and they constitute the required minimal prior knowledge of C. In practice, the general outline of the feature used in a particular vision algorithm is often known and can be ascertained through various sources such as publications. However, important design parameters such as smoothing values, cell/block/bin sizes etc.,


are often not available, and these parameters can be determined by searching over an expected range of values that make up the elements in F. Similarly, the type of classifier used can often be narrowed down to a small number of choices (e.g., an SVM or a cascade of linear classifiers). Within this context, we introduce three novel notions, feature identifiers, classifier deconstructors and geometric feature-classifier compatibility, as the main technical components of the deconstruction process. Specifically, feature identifiers are a set of image-based operations such as image rotations and scalings that can be applied to the input images, and the different degrees of sensitivity and stability of the features in F under these operations allow us to exclude elements in F, making the process more efficient. For example, suppose F contains both SIFT- and HOG-based features. Since SIFT is in principle rotationally invariant, SIFT-based features are more stable under image rotations than HOG-based features; therefore, if C uses a SIFT-based feature internally, its outputs would be expected to be more stable under image rotations. Therefore, by querying C with rotated images and comparing the results with those for un-rotated images, we can exclude features in F that are rotationally sensitive. The classifier deconstructors, on the other hand, are algorithms that can deconstruct classifiers in C using a (relatively) small number of features by recognizing certain geometric characteristics of the classifier's decision boundary (e.g., its parametric form). For example, an SVM deconstructor algorithm is able to (given sufficiently many features) determine the number of support vectors and the type of kernel used by a kernel machine by recognizing certain geometric characteristics of its decision boundary. The interaction between elements in F and C during the deconstruction process is based on the notion of geometric feature-classifier compatibility: for a pair (f, c) of feature f and classifier c, they are compatible if, given sufficiently many features defined by f, the deconstructor algorithm associated to c can correctly recognize its decision boundary. More specifically, given a vision classifier C internally represented by a pair (f, c) of feature f and classifier c, we can query C

PAGE 66

usingasetofimagesI1,...,In,andusingthefeature(anditassociatedtransform)f,wecantransformtheimagesintofeaturesinthefeaturespacespeciedbyf.Thedeconstructoralgorithmassociatedwithcthendeterminestheclassierbasedonthesefeatures.However,foranincorrecthypotheticalpair( f,c),thedifferencebetweenthetransformedfeaturesspeciedby fandfaregenerallynon-linear,andthisnon-linearitychangesthegeometriccharacteristicsofthedecisionboundaryinthefeaturespacespeciedby f,renderingthedeconstructoralgorithmcunabletoidentifythedecisionboundary(Figure 4-1 ). Figure4-1. Schematicillustrationofdeconstructivelearningandproposedalgorithm.Left:Two-componentdesignofaclassier:afeature-transformcomponentprovidedbycomputervisionfollowedbyafeature-classicationcomponentfurnishedbymachinelearning.Right:Schematicillustrationoftheproposeddeconstructionalgorithm.Internally,thealgorithmsearchesoverasetofcandidatefeaturespacesandprobesthespacesfordecisionboundaries.Onlyinthecorrectfeaturespacetheparametricformofthedecisionboundarywouldberecognizedbythedeconstructoralgorithm. Theabstractframeworkoutlinedaboveprovidesapracticalandusefulmodularizationofthedeconstructionprocesssothattheindividualelementssuchastheformationoffeatureandclassierlists,featureidentiersandclassierdeconstructorscanbesubjecttoindependentdevelopmentandstudy.Inthisdissertatoin,werealizetheabstractframeworkinconcreteterms.Specically,weintroducedtwodeconstructoralgorithmsforsupportvectormachine(SVM)andforthecascadeoflinearclassiers.Theformerisapopularfamilyofclassierswidelyusedinvisionapplicationsandthelatterisoftendeployedinobjectdetectors,withthefacedetectorofViola-Jonesasperhapsthemost 66

PAGE 67

well-knownexample[ 50 ].Intheexperimentalsection,wepresentfourpreliminaryexperimentalresultsdemonstratingtheviabilityoftheideasproposedinthischapter. Intherstexperiment,weshowtheapplicationofafewsimpleheuristicscansubstantiallyreducethesizeoffeaturelistFandtherefore,allowforamoreefcientdeconstructionprocess. Inthesecondexperiment,experiment,wepresenttheresultofacompletedeconstructionofOpenCV'sHOG-basedpedestriandetector.Theentiredeconstructionprocesssearchesoveronehundredpotentialfeaturestocorrectlyidentifythelinearclassierusedinthedetector.Thenormalvectorofthelinearclassierrecoveredbyouralgorithmhasthenormalizedcorrelationofmorethan0.99withthegroundtruth(i.e.,withanangulardifferencesmallerthan2). Inthirdexperimentwerepeatdeconstructclassiertrainedtodetectaeroplanes. Infourthandlastexperimentwerecoverthelinearsubspaceusedtoperformdimensionreductionforfeatureextractionpurposes,andthenrecoverthequadratickernelbasedSVMclassier.TheMATLABimplementationofthedeconstructionalgorithmisonlyaround100linesofcodeandittakesnolongerthananhourtocorrectlyidentifythefeature,theclassiertype(linearSVM)andthelinearclassieritself.Letusreiteratethattothebestofourknowledge,thereisnotapreviousworkondeconstructivelearningcomparabletothejointrecoveryoffeaturetransformandclassierasoutlinedabove.Wheremostofthelearningproblemsdealwiththedistributionandprobabilitiesinoursettinggeometryreplacesprobabilityasthejointfeature-labeldistributiongiveswaytothegeometricnotionofdecisionboundaryasthemaintargetoflearning. 4.2DeconstructionProcessandMethodLetF,Cdenotethefeatureandclassierlists.GivenaclassierCwiththetwo-componentdesignasdescribedabove,thedeconstructionalgorithmattemptstoidentifythefeature-transformandfeature-classicationcomponentsofCwiththeelementsinFandC.Specically,weassumethat 67

PAGE 68

Eachfeaturefi2Fdenesafeaturetransformfi:Rd!RnifromtheimagespaceRdtoafeaturespaceRniofdimensionni.Fortechnicalreason,thefeaturetransformfi:Rd!RniisassumedtobeLipschitzcontinuousinthatkfi(Ia))]TJ /F11 11.955 Tf -405.13 -14.44 Td[(fi(Ib)k20andIa,Ib2Rd.Furthermore,weassumethataninverseofthefeaturetransformficanalsobecomputed:forvi2Rni,animageinf)]TJ /F5 7.97 Tf 6.58 0 Td[(1i(vi)canbecomputed1. Eachelementci2Crepresentsaknownfamilyofclassiers(e.g.,SVMandcascadeoflinearclassiersastwodifferentfamilies)andhasitsassociateddeconstructoralgorithm(alsodenotedasci).ForeachfeaturespaceRni,withsufcientlymany(feature)pointslocatedonahypotheticaldecisionboundary,cicandetermineifsuchdecisionboundaryistheresultofoneofitsmemberclassiersandprovideothermoredetailedinformationaboutthespecicclassier.Forexample,forthedeconstructorassociatedwithSVM,withenoughfeaturepointslocatedonahypotheticaldecisionboundaryinRni,itcandetermineifthedecisionboundaryistheresultofanSVMclassierandifso,itwillreturnthetypeofkernelandthenumberofsupportvectors,etc.Thenumberofrequiredpointsonthedecisionboundarydependsoneachdeconstructoralgorithm.Wehaveassumedthatthefeaturespacesareallcontinuous(Rni)andthefeaturetransformsfiaresurjectivemaps.Technically,workingincontinuousdomainsissimplerbecauseusefuldifferential-geometricfeaturessuchasthenormalvectorsoftheclassier'sdecisionboundaryareavailable.Furthermore,continuousdomainsallowustolocatethedecisionboundarywithinanyprescribedaccuracyusingthesimpleideaofbracketing(asinroot-nding[ 30 
]):givenapairofpositiveandnegativeimages(PN-pair),wecanproduceaPN-pairofimagesnearthedecisionboundaryintheimagespacebysuccessivelyhalvingtheintervalbetweenapairofpositiveandnegativeimages,usingthelabelsprovidedbyC.ByLipschitzcontinuity,aPN-pair(sufciently)nearthedecisionboundaryintheimagespacecanbetransformedbyafeaturefi2FintoaPN-pairnearthedecisionboundaryinthefeaturespaceRni.BysamplingenoughPN-pairsthatarenearthedecisionboundaryintheimagespace,weobtainthecorrespondingPN-pairsnearthedecisionboundaryineach 1Forsimplicity,weassumethatthetransformfiissurjectiveandasaset(ofimages),f)]TJ /F5 7.97 Tf 6.59 0 Td[(1i(vi)isnonemptyandwecancomputeanelement(image)inf)]TJ /F5 7.97 Tf 6.58 0 Td[(1i(vi)(e.g.,[ 51 ]). 68
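The bracketing step above can be sketched in a few lines. The following is a minimal illustrative sketch (not the dissertation's MATLAB implementation); `classify` is a hypothetical stand-in for the black-box oracle C, assumed to return a positive value for positive images and a negative value otherwise:

```python
import numpy as np

def bracket_pn_pair(classify, pos_img, neg_img, tol=1e-3):
    """Bisect the segment between a positive and a negative image until
    the PN-pair brackets the decision boundary within `tol` (L2 norm),
    using only the labels returned by the black-box classifier."""
    a, b = pos_img.astype(float), neg_img.astype(float)
    assert classify(a) > 0 and classify(b) < 0
    while np.linalg.norm(a - b) > tol:
        mid = 0.5 * (a + b)
        if classify(mid) > 0:
            a = mid   # midpoint is positive: move the positive endpoint
        else:
            b = mid   # midpoint is negative: move the negative endpoint
    return a, b       # a PN-pair straddling the boundary
```

By the Lipschitz assumption above, the resulting image-space PN-pair maps, under any candidate feature transform, to a feature-space pair that is also close to the decision boundary, which is what the deconstructors consume.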


For these sampled PN-pairs in each feature space R^(ni), we apply the deconstructor algorithm c to see whether it recognizes the decision boundary from the samples. Furthermore, the inverse feature transform fi^(−1) permits the deconstructor algorithm, which operates in the (opaque) feature space R^(ni), to sample additional features near the decision boundary if necessary. Essentially, each deconstructor algorithm is designed to probe a given feature space for the decision boundary, and it recognizes the parametric form of a decision boundary arising from a classifier in its associated family. In particular, starting with a small number of positive features, the deconstructor algorithm proceeds to explore each feature space R^(ni) by generating points near the decision boundary. Given the two lists F, C, the deconstruction process proceeds in a direct manner: run all deconstructor algorithms in C in parallel over all the candidate features in F; for each pair (fi, cj) of feature (space) and deconstructor, cj either succeeds in detecting a recognizable boundary or fails to do so. If there is no successful pair (fi, cj), the algorithm fails to deconstruct C. Otherwise, it provides the user with all the successful pairs (fi, cj) as potential candidates for further investigation (this further investigation is beyond the scope of this dissertation). In this dissertation, the classifier list C contains two elements: the family of support vector machines (SVM) and the family of cascades of linear classifiers. In the following three subsections, we provide the remaining details of the proposed deconstruction process.

4.2.1 Feature Identifiers

An important first step in the deconstruction process is to properly define the lists F and C, with the aim of making them as short as possible. Various heuristics based on exploiting the differences between various types of features can be developed to accomplish this goal. For example, it is possible to exclude simple HOG-based features from F by perturbing the images slightly. In particular, because HOG (unlike SIFT) is in general not rotationally invariant, if the images are slightly rotated, we can expect less-than-stable results from the classifier C if it uses HOG as its main feature. In the experimental section, we study several such feature identifiers in the form of simple image operations, such as rotations and scalings, and demonstrate their usefulness in excluding features from F. On the other hand, the differences between the classifiers in C can also be exploited to shorten the list C. In our case, it is straightforward to determine whether the given C uses an SVM or a cascade of linear classifiers as its internal classifier. Recall that a cascade of linear classifiers is a decision tree with a linear classifier associated with each tree node. Consequently, positive features always take longer to process than negative features, and among the negative features the running time can vary considerably depending on the depth of the tree. However, for an SVM-based C, all features are expected to have the same or similar running times. Therefore, by checking the distribution of the running times over positive and negative features, we can determine with great certainty the type of classifier used internally by C.

4.2.2 Deconstructor Methods

Given a feature space R^l, we can transform our samples from the image space to the feature space and perform bracketing. Generating many PN-pairs near the decision boundary, we can use the methods detailed in Chapter 2 and Chapter 3 to recover different classifiers. Repeating this process for all the feature spaces available in F, we build our classifier list C.

4.2.3 Geometric Compatibility Between Features and Deconstructors

In the deconstruction process, the interaction between the features in F and the deconstructors in C is based on the notion of geometric compatibility, and the deconstruction algorithm selects the pair (f, c) as a solution if the deconstructor algorithm c recognizes the decision boundary from a collection of sampled points in the feature space R^n specified by f. The geometric picture is neatly captured by the following diagram:

    R^d --f2--> R^(n2)
     ‖             ^
     ‖             | ψ
    R^d --f1--> R^(n1)

where the left-hand equality indicates that the same image space is used, and ψ denotes the induced map between the two feature spaces. Suppose f1, f2 are two features in F with their respective feature spaces R^(n1), R^(n2), and the vision classifier C internally employs the pair (f1, c). If we reconstruct the decision boundary in the correct feature space R^(n1), the deconstructor algorithm will be able to recognize the decision boundary, and hence the pair (f1, c) will be selected by the deconstruction algorithm. However, for the incorrect feature f2, the decision boundary reconstructed in R^(n2) is related to the decision boundary reconstructed in R^(n1) via the map ψ, which arises from the fact that we use the same set of images in the image space R^d to reconstruct the decision boundary in both R^(n1) and R^(n2). The important observation (or assumption) is that for different features f1, f2, the map ψ is generally nonlinear, and it maps the decision boundary in R^(n1) to a decision boundary in R^(n2) with an unknown parametric form. For example (as will be shown later), for a linear decision boundary in R^(n1), the corresponding decision boundary in R^(n2) would generally be nonlinear. Therefore, if c is a deconstructor for linear classifiers, it would fail to recognize the decision boundary in R^(n2). For the SVM deconstructor, ψ essentially maps a decision boundary of a known parametric form to a boundary with an unknown parametric form, i.e., the decision boundary in R^(n2) is not compatible with the deconstructor c.

4.3 Experiments

In this section, we present four experimental results. In the first experiment, we demonstrate the idea of using simple heuristics (image operations) to shorten the feature list F. In the second and third experiments, we detail the experimental results of deconstructing the pedestrian (human) detector in the OpenCV library and an airplane detector (trained by us). In the fourth and last experiment, we show how to discover the feature space when linear dimensionality reduction is used as the feature transform, and we then show results on the deconstruction of a quadratic-kernel SVM.

4.3.1 Distinguishing between HOG and SIFT

Many vision algorithms use features derived from well-known gradient-based features such as HOG [19] or SIFT [38]. To shorten the search list F, we use various invariance properties of these features. For example, non-dense SIFT, as used in the bag-of-words model, is generally invariant under scaling, rotation and even shifting transformations. On the other hand, dense SIFT, when used in a pyramid scheme, is generally invariant under a reasonable amount of scale change and a small amount of translation; it is generally not invariant under rotation or flipping (unless the object is symmetric). HOG, as mentioned previously, is not invariant under rotation, although it is invariant under translations smaller than the size of its cells. In this experiment, we experimentally verify these general impressions of the invariance properties of HOG and SIFT under various image transforms. We compared the properties of four different types of features (Table 4-1) by constructing airplane SVM classifiers using images from the Caltech101 dataset [26]. We randomly selected 100 airplane images as positive samples and 100 images from other categories as negative samples. We used the four types of features extracted from these samples to train linear SVMs. In the test phase, three simple image transforms are applied to the 200 randomly selected test images: a 180° rotation, a translation of eight pixels in both the x and y directions, and a simple zoom-in (achieved by scaling the image by a factor of 1.2 and cropping the boundaries). The classification rates of the SVMs constructed using the four different types of features under the three transforms are shown in Table 4-1. As the table shows, the rotational invariance of SIFT makes it relatively stable under the rotation and scale transforms. We use these invariance results to shorten our feature list during our experiments.
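The pruning heuristic behind this experiment amounts to measuring how often the black box's label survives an image transform. A minimal sketch follows; the oracle `classify` here is a toy, hypothetical stand-in (its decision depends on pixel position, so it is rotation-sensitive by construction), not a real detector:

```python
import numpy as np

def output_stability(classify, images, transform):
    """Fraction of images whose black-box label is unchanged by `transform`.
    A low score suggests the internal feature is sensitive to the transform
    (e.g., HOG under rotation), so such candidates can be pruned from F."""
    agree = sum(classify(im) == classify(transform(im)) for im in images)
    return agree / len(images)

# Toy oracle whose decision depends on a fixed pixel position:
rng = np.random.default_rng(0)
imgs = [rng.random((8, 8)) for _ in range(50)]
classify = lambda im: 1 if im[0, 0] > 0.5 else -1
rot180 = lambda im: np.rot90(im, 2)   # stand-in for a 180-degree rotation

s_id = output_stability(classify, imgs, lambda im: im)   # 1.0 by construction
s_rot = output_stability(classify, imgs, rot180)         # well below 1.0
```

In practice the transforms would be the rotations, translations and zooms of Table 4-1, and a stability profile across transforms would be compared against the known invariance properties of each candidate feature.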


Table 4-1. Effects of different image transforms on classification results. HOG-based features, as expected, produce unstable results under rotation and scale change.

                      SIFT+BoW   Dense-SIFT+Spatial Pyramid   HOG(4)   HOG(8)
Rotation (180°)        0.8000            0.4300              0.6800   0.7450
Zoom-in                0.9950            0.9600              0.3550   0.3850
Translation (8,8)      0.9550            0.8500              0.9550   0.9350

4.3.2 Deconstructing OpenCV's HOG-based Pedestrian Detector

In this experiment, we deconstruct the pedestrian (human) detector provided in the OpenCV library [13]. This implementation is based on the algorithm proposed in [19], where a histogram of oriented gradients (HOG) is used as the feature, with a linear (SVM) classifier as the internal feature classifier. The goal of this deconstruction experiment is to recover (a) the three important design parameters for the HOG feature: cell size, block size and block stride; and (b) the parameters of the linear (SVM) classifier: its weight (normal) vector and bias. For this experiment, a quick check on the running times of a few positive images immediately rules out the cascade of linear classifiers as a viable candidate (Section 3.1); more precisely, it shows that the cascade would have to be a very shallow tree, and we simply interpret the result as ruling out the cascade as a candidate. Furthermore, because of the high dimensionality of the feature space (usually in the thousands), the SVM-based classifier is almost certainly linear, since other types of nonlinear kernels are often computationally demanding for high-dimensional features. In particular, checking the normal vectors of the decision boundary at forty different places yields essentially only one normal vector, indicating an underlying linear classifier. Therefore, the classifier list C has only one element and the kernel type is assumed to be linear. For the feature list F, we define approximately 30 candidate parameter settings and, accordingly, the feature list F has length 100. More specifically, the cell size can take the three integral values {4, 8, 16}, and for each cell size, the block size can equal the cell size, double the cell size or triple the cell size. Similarly, the block stride is set either to half of the block size, the full block size or twice the block size. We note that different parameters give different HOG features that typically reside in different feature spaces (in particular, of different dimensions). Note that since we randomly pair positive and negative images, the PN-pair set can have size equal to the product of the sizes of the positive and negative image sets; this restricts which feature dimensions we can take into consideration. We use the classifier as provided in OpenCV without any modification, and positive and negative images are obtained by running the classifier over a set of images. We remark that the images are positive or negative according to the outputs of the classifier C, not according to the class they visually appear to belong to. We randomly pair these positive and negative images and run the bracketing algorithm to locate PN-pairs (Figures 4-2, 4-3) close to the decision boundary in the (fixed) image space, and for each such PN-pair close to the decision boundary in the image space, we compute its corresponding PN-pair in each of the feature spaces R^(ni) for fi ∈ F. In this experiment, we do not use the inverse feature transform to determine the feature labels in the feature space, although an inverse transform for HOG-based features has been proposed in [51].

Figure 4-2. PN-pairs obtained by the bracketing process during the deconstruction of the OpenCV person detector. Notice that even when the two images appear to be very similar, one is labeled as a positive sample by the classifier and the other as negative.

An important experimental result is that a linear decision boundary is observed only for the correct parameter setting, while for incorrect parameter settings the decision boundary is generally nonlinear. To efficiently and accurately detect the linear boundary in these high-dimensional feature spaces, we use the Fisher linear discriminant (FLD).

Figure 4-3. PN-pairs obtained by the bracketing process during the deconstruction of a motorbike detector. The motorbike detector is trained on the Caltech101 dataset [26] using HOG as the feature and a quadratic kernel machine as the classifier.

Specifically, for the labeled PN-pairs in each candidate feature space, we train a Fisher linear discriminant. If the labeled features are indeed linearly separable, the Fisher linear discriminant will detect it by correctly classifying a large portion of the labeled features. Furthermore, in the right feature space, as we generate more PN-pairs close to the true decision boundary, the linear classifier provided by FLD will move closer to the true linear classifier. This latter statement is easily visualized, and its rigorous justification seems straightforward. In particular, the linear classifier determined by FLD provides good approximations to the weight (normal) vector and the bias of the true linear classifier. On the other hand, in the wrong feature space, the decision boundary will be nonlinear, and the trained FLD is not expected to correctly classify a large portion of the labeled features, regardless of the number of PN-pairs generated. Experimental confirmations of these observations are shown in Figures 4-4 and 4-5. In Figure 4-4, the classification error of the trained FLD decreases only in the correct feature space. In Figure 4-5, 2D projections of the labeled PN-pairs in three different feature spaces are displayed, and linear separability is clearly shown only for features from the correct feature space. The weight (normal) vector recovered using FLD with 10,000 PN-pairs has a normalized correlation of 0.99 with the ground truth, i.e., an angular difference of roughly 2°.
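The FLD check described above can be sketched as follows. This is an illustrative sketch only: an isotropic Gaussian cloud split by a known hyperplane stands in for the labeled PN-pairs, and `w_true` plays the role of the detector's ground-truth normal (both are assumptions introduced here, not the dissertation's data):

```python
import numpy as np

def fisher_direction(X_pos, X_neg, reg=1e-6):
    """Fisher linear discriminant direction w ~ Sw^(-1) (m_pos - m_neg),
    with a small ridge term added for numerical stability."""
    m_pos, m_neg = X_pos.mean(axis=0), X_neg.mean(axis=0)
    Sw = np.cov(X_pos.T) + np.cov(X_neg.T) + reg * np.eye(X_pos.shape[1])
    w = np.linalg.solve(Sw, m_pos - m_neg)
    return w / np.linalg.norm(w)

# Synthetic stand-in for labeled features near a linear boundary w_true . x = 0:
rng = np.random.default_rng(1)
w_true = np.array([3.0, -1.0, 2.0])
w_true /= np.linalg.norm(w_true)
X = rng.normal(size=(2000, 3))
w_hat = fisher_direction(X[X @ w_true > 0], X[X @ w_true <= 0])
corr = abs(w_hat @ w_true)   # normalized correlation with the ground truth
```

In the correct feature space this correlation approaches 1 as more PN-pairs are added; in a wrong feature space the boundary is nonlinear, so no linear direction separates the labels well and the FLD error stays high, which is exactly the diagnostic used in Figure 4-4.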


Figure 4-4. Classification errors exemplifying the concept of geometric compatibility between features and deconstructors. Left: Deconstruction of the OpenCV pedestrian detector: classification errors for the FLD trained in five different feature spaces. The feature parameters are given in the legend (cell size, block size), and the visible decrease in classification error is observed only in the correct feature space. Right: Deconstruction of an airplane detector: similar results obtained in the deconstruction of an airplane detector (HOG features + linear SVM).

4.3.3 Deconstructing an Airplane Detector

To further test the algorithm, we constructed a classifier for labeling airplanes (Caltech101 [26]). HOG features were calculated (with cell size 16, block size 2×2 cells, and stride equal to the cell size) for the training images, and a linear SVM was trained on these features as the classifier. This two-stage classification system was provided as a black box to our deconstruction algorithm. We follow all the steps described in the OpenCV pedestrian-detector experiment; however, this time we also vary the number of orientations (number of bins) used to construct the HOG. As indicated in Figure 4-4, the separation between PN-pairs increases (hence the error decreases) for the correct parameters as the sample size is increased. In Figure 4-6 we show the recovered linear classifier alongside the original linear classifier.

4.3.4 Deconstructing when Dimensionality Reduction is Employed

Computer vision problems are marked by the high dimensionality of the data. Dimensionality reduction is commonly used in machine learning tasks (including computer vision problems) not only to reduce the complexity of the data and make processing computationally less expensive, but also as a form of feature extraction. In the following experiment we explore how to recover the subspace used for dimensionality reduction in a given classifier.

Figure 4-5. 2D projections of the labeled PN-pairs in three different feature spaces, exemplifying the concept of geometric compatibility between features and deconstructors. The 2D projected subspace is spanned by the normal vector determined by FLD and the first singular vector of the collection of features. The results from the correct feature space are shown in the last row; the two plots display the projections at two different scales (for further details refer to Section 4.3.2).

We train an SVM-based classifier for the MNIST dataset with the following two stages. The first stage consists of a linear data transformation (principal component analysis) used to extract features (we use the 10 eigenvectors corresponding to the 10 largest eigenvalues of the covariance matrix of the training data). In the second stage, an SVM with a quadratic kernel is used to classify the extracted features. As in the previous cases, this classifier appears as a black box to us, and we can only obtain the classification result once we input an image.

Figure 4-6. Deconstruction of a classification system with HOG features and a linear SVM as classifier. Left: original linear classifier. Right: recovered linear classifier.

The first step in this experiment is to recover the subspace on which the SVM classifier works. Using the bracketing procedure, we extract PN-pairs and the normals at these locations. Since these normals (let their set be called N) are computed on the decision boundary, and our decision boundary should exist in the subspace, these normals should provide us with information about the subspace. We perform a singular value decomposition (SVD) to decompose N = UDV^T. Let s be the diagonal elements of D; the differences between the elements of s (the singular values) give us an estimate of the dimensionality k of the subspace. Figure 4-7 shows the singular values approaching zero and becoming very similar to each other after the k-th value. Once k is determined, we collect the first k columns of U to represent the subspace S.

Figure 4-7. Singular values of the normal matrix. The singular values are sorted in decreasing order, and the figure displays the singular values starting from the ninth singular value. The plot highlights the substantial gap between the ninth and the tenth singular values and the absence of visible gaps for the remaining singular values.

Once the subspace is recovered, we can proceed to recover the kernel type of the classifier. We follow the procedure detailed in Chapter 2, with a small variation. In the present scenario, the decision boundary is to be checked in the feature space, whereas the classifier black box only works in the image space. Fortunately, due to the linear nature of the transformation used for dimensionality reduction, we can transform points on a 2D plane from the feature space to the image space and compute their labels. A change of label, as we move from a positive element to a negative element, indicates an intersection of the 2D plane with the decision boundary. We sample many such curves and vote for the type of curve each could be. Figure 4-8 shows our result. Once the correct type is found, we continue to recover the quasi-support vectors as described in Chapter 2. Note: the use of a linear SVM in such a setup results in a degenerate case, and the correct subspace cannot be recovered.

Figure 4-8. Identifying the kernel type: the histogram shows the voting obtained by fitting curves to the intersections of 2D planes with the decision boundary. The quadratic polynomial kernel is clearly the most voted, and the correct result.
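The subspace-recovery step (an SVD of the normal matrix and reading k off the singular-value gap) can be sketched as follows. Detecting the gap via the largest ratio of consecutive singular values is one simple heuristic among several, and the planted 3-dimensional subspace below is purely illustrative:

```python
import numpy as np

def recover_subspace(N):
    """Estimate the subspace spanned by boundary normals.
    N: (d, m) matrix whose columns are normals sampled on the boundary.
    The dimension k is read off the largest relative gap in the singular
    values; the first k left singular vectors then span the subspace."""
    U, s, _ = np.linalg.svd(N, full_matrices=False)
    ratios = s[:-1] / np.maximum(s[1:], 1e-12)
    k = int(np.argmax(ratios)) + 1
    return U[:, :k], k

# Normals confined (up to small noise) to a 3-dimensional subspace of R^10:
rng = np.random.default_rng(2)
B = np.linalg.qr(rng.normal(size=(10, 3)))[0]   # orthonormal basis, planted
N = B @ rng.normal(size=(3, 200)) + 1e-3 * rng.normal(size=(10, 200))
S, k = recover_subspace(N)   # k == 3; columns of S span the planted subspace
```

In the experiment above, the same idea applied to the normals estimated by bracketing yields k = 10, matching the number of principal components used by the black box.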


CHAPTER 5
CONCLUSIONS AND FUTURE WORK

We have proposed a new type of learning problem, termed Deconstructive Learning, whose objective is to learn a given (binary) classifier by characterizing, as much as possible, the full extent of its capability. We have introduced and described a two-component design model for a broad class of classifiers used in computer vision. In this two-component design model, the classifier first transforms the input image into a feature in the feature space, and the classification result is obtained by applying a feature-space classifier to the transformed feature. We have proposed a deconstruction algorithm tailored to classifiers with such a two-component design, with the deconstruction algorithm aiming to identify, for a given classifier, its internal feature transform and feature-space classifier from a list of potential candidates. We have introduced the notion of geometric feature-classifier compatibility as the criterion for selecting the deconstruction output, and we have also presented algorithms for deconstructing two broad classes of classifiers: cascades of linear classifiers and support vector machines. In particular, for SVM-based classifiers, our method can determine the kernel type and the dimension of the kernel subspace (spanned by the support vectors) and recover the subspace itself. Although we were not able to recover the exact support vectors for general SVMs, we can recover quasi-support vectors for SVMs using polynomial kernels. For a cascade of linear classifiers, we are able to determine the (weak) classifiers themselves and their locations in the cascade. Multiple experiments have demonstrated the validity and viability of the proposed approach, and successful and practical applications of our method include the deconstruction of the pedestrian and face detectors in the OpenCV library.

Work is currently ongoing to expand the feature and classifier lists F, C to include structured SVM [47], dictionary-learning-based classifiers [39], bag-of-words features [17] and others. Recently, neural network-based classifiers, such as those that emerged from the deep learning community [9, 31], have gained considerable interest and popularity. At first glance, the direct approach proposed in this dissertation seems insufficient and inadequate for deconstructing this type of classifier. However, we believe that more indirect approaches and formulations are possible, with the aim of identifying crucial information necessary for the deconstruction process. Additionally, to further improve the applicability and practicality of our method, it is necessary to develop the concept of a Feature Class Signature. The signature refers to a set of image operations which, if performed on input images, can help determine the feature-space classifier. One possible way to define such a signature is to build a decision tree that can generate the signature for the given classifier; much like the signatures generated by virus-detection software, this signature list can be kept in a database and compared against whenever a new classification system is provided.

The Case and Outlook for Deconstructive Learning: I conclude this dissertation with a brief discussion of the potential usefulness of deconstructive learning, providing several examples that illustrate its significance in terms of its future prospects for theoretical development as well as practical applications. The geometric approach taken in this dissertation shares some visible similarities with the low-dimensional reconstruction problems studied in computational geometry [21], and in fact, it is partially inspired by various 3D surface reconstruction algorithms studied in computational geometry (and computer vision) [23][33]. However, due to the high dimensionality of the feature space, deconstructive learning offers a brand new setting that is qualitatively different from the low-dimensional spaces studied in computational geometry and various branches of geometry in mathematics. High dimensionality of the feature space has been a hallmark of machine learning, a realm that has not been actively explored by geometers, mainly for the lack of interesting examples and motivation. Perhaps deconstructive learning's emphasis on the geometry of the decision boundary in high-dimensional space, and its connection with machine learning, could provide stimulating examples or even counterexamples unbeknown to geometers, and thereby provide the needed motivation for the development of a new type of high-dimensional geometry [15].

On a more practical side, image-based classifications such as face, pedestrian and various object detections and scene recognition are important computer vision applications that have begun to have a visible and noticeable impact on our daily life. Indeed, with the current trend in technology development, it is not difficult to envision a not-so-distant future in which the world is partially powered by such applications. Against the backdrop of such a futuristic vision, deconstructive learning points to an interesting and uncharted territory, perhaps a promising new direction with the potential for generating important impact. Several potential consequences of this new capability are interesting to ponder. Our belief is supported by the increasing presence and influence of AI/ML systems in daily life and by the fundamental asymmetry between constructive learning (learning classifiers from training samples) and deconstructive learning. Computationally, the former is essentially serial while the latter is parallel in nature. More specifically, deconstruction is a highly parallelizable process, and in our algorithm, the processes of sampling points on the decision boundary are completely independent and parallelizable. However, for constructive learning, parallelization is often ineffective because of the immensity of the search space, and often the only viable solutions are local greedy approaches, which are invariably serial. A case in point is OpenCV's HOG-based pedestrian detector. To develop such an application, the designers must have spent weeks if not months of effort in, among other things, gathering useful training images, managing other often time-consuming logistic matters, and tuning both the feature parameters (cell/block/bin sizes) and the learning algorithm in order to obtain the best (linear) classifier. However, as demonstrated in this dissertation, the detector can be completely deconstructed in a few hours, and the user of the deconstruction algorithm only needs to collect a few positive images to start the deconstruction process, since the negative images can be obtained randomly (with labels provided by the classifier C). The result is certainly not surprising, since it basically mirrors the well-known fact that finding a solution is always more time-consuming than checking the solution. However, its implications are multiple and perhaps profound. For example, the result of months or even years of hard work can be deconstructed in a matter of a few hours, even when it is hidden under seemingly impenetrable binary code. Additionally, we believe that deconstructive learning could provide greater flexibility to the users of AI/machine-learning products in the future, because it allows the users to determine the full extent of an AI/ML program or system, and therefore to create their own adaptation or modification of the given system for specific and specialized tasks. For example, how would a reviewer for a machine learning journal know that the binary code submitted with a paper really does implement the algorithm proposed in the paper, and is not some clever implementation of some known algorithm? Deconstructive learning, as proposed in this dissertation, offers a possible solution by explicitly deconstructing the submitted code. It also provides the opportunity to update a system (e.g., updating an existing SVM with new training data for new instances or even new classes), to speed it up (e.g., moving filter operations from CPU to GPU), and to learn which parameters work in a given classification system so that the same parameters can be used in other systems.


REFERENCES [1] LS3-LeggedSquadSupportSystems.2013.URL http://www.bostondynamics.com/robot_ls3.html [2] Self-DrivingCarTest:SteveMahan.2013.URL http://www.google.com/about/careers/lifeatgoogle/self-driving-car-test-steve-mahan.html [3] Balcan,M.,Beygelzimer,A.,andLangford,J.Agnosticactivelearning.Proc.Int.Conf.MachineLearning(ICML).2006. [4] Balcan,M.,Broder,A.,andZhang,T.Margin-basedactivelearning.LearningTheory(2007):35. [5] Ballico,E.andBernardi,A.DecompoisionofHomogeneousPolynomialswithLowRank.MathematischeZeitschrift271(2012).3:1141. [6] Belkin,MikhailandNiyogi,Partha.Semi-supervisedlearningonRiemannianmanifolds.Machinelearning56(2004).1-3:209. [7] Belkin,Mikhail,Niyogi,Partha,andSindhwani,Vikas.Manifoldregularization:Ageometricframeworkforlearningfromlabeledandunlabeledexamples.TheJournalofMachineLearningResearch7(2006):2399. [8] Benenson,R.,Mathias,M.,Timofte,R.,andVanGool,L.Pedestriandetectionat100framespersecond.CVPR.2012. [9] Bengio,Yoshua.LearningdeeparchitecturesforAI.FoundationsandtrendsRinMachineLearning2(2009).1:1. [10] Bosch,Anna,Zisserman,Andrew,andMunoz,Xavier.SceneclassicationviapLSA.ComputerVisionECCV2006.Springer,2006.517. [11] Bourdev,LubomirandBrandt,Jonathan.Robustobjectdetectionviasoftcascade.ComputerVisionandPatternRecognition,2005.CVPR2005.IEEEComputerSocietyConferenceon.vol.2.IEEE,2005,236. [12] Brachat,J.,Comon,P.,Mourrain,B.,andTsigaridas,E.SymmetricTensorDecomposition.LinearAlgebraAppl433(2010).11-12:1851. [13] Bradski,G.TheOpenCVLibrary.Dr.Dobb'sJournalofSoftwareTools(2000). [14] Candes,E.andRecht,B.Exactmatrixcompletionviaconvexoptimization.Commun.ACM55(2012).6:111. [15] Carlsson,Gunnar.Topologyanddata.BulletinoftheAmericanMathematicalSociety46(2009).2:255. 84

PAGE 85

[16] Chen,XiangrongandYuille,A.L.Detectingandreadingtextinnaturalscenes.ComputerVisionandPatternRecognition,2004.CVPR2004.Proceedingsofthe2004IEEEComputerSocietyConferenceon.vol.2.2004,IIIIVol.2. [17] Csurka,Gabriella,Dance,Christopher,Fan,Lixin,Willamowski,Jutta,andBray,Cedric.Visualcategorizationwithbagsofkeypoints.Workshoponstatisticallearningincomputervision,ECCV.vol.1.2004,1. [18] Cuddon,JohnAnthony.Dictionaryofliterarytermsandliterarytheory.JohnWiley&Sons,2012. [19] Dalal,NavneetandTriggs,Bill.Histogramsoforientedgradientsforhumandetection.ComputerVisionandPatternRecognition,2005.CVPR2005.IEEEComputerSocietyConferenceon.vol.1.IEEE,2005,886. [20] Dasgupta,S.Analysisofagreedyactivelearningstrategy.InAdvancesinNeuralInformationProcessingSystems.2004. [21] deBerg,M.,Cheong,O.,vanKreveld,M.,andOvermars,M.ComputationalGeometry:AlgorithmsandApplications.Springer,2010. [22] Derrida,Jacques.Delagrammatologie(OfGrammatology).LesditionsdeMinuit,1967. [23] Dey,T.CurveandSurfaceReconstruction:AlgorithmswithMathematicalAnalysis.CambridgeUniversityPress,2006. [24] Diehl,ChristopherPandCauwenberghs,Gert.SVMincrementallearning,adaptationandoptimization.NeuralNetworks,2003.ProceedingsoftheInterna-tionalJointConferenceon.vol.4.IEEE,2003,2685. [25] Duda,RichardO,Hart,PeterE,andStork,DavidG.Patternclassication.JohnWiley&Sons,2012. [26] Fei-Fei,Li,Fergus,Rob,andPerona,Pietro.One-shotlearningofobjectcategories.PatternAnalysisandMachineIntelligence,IEEETransactionson28(2006).4:594. [27] Fei-Fei,LiandPerona,Pietro.Abayesianhierarchicalmodelforlearningnaturalscenecategories.ComputerVisionandPatternRecognition,2005.CVPR2005.IEEEComputerSocietyConferenceon.vol.2.IEEE,2005,524. [28] Felzenszwalb,P.F.,Girshick,R.B.,andMcAllester,D.Cascadeobjectdetectionwithdeformablepartmodels.ComputerVisionandPatternRecognition(CVPR),2010IEEEConferenceon.2010,2241. [29] Golub,G.andLoan,C.Van.MatrixComputation.JohnHopkinsUniversityPress,1996. 85
[30] Heath, M. Scientific Computing. The McGraw-Hill Companies, Inc., 2002.

[31] Hinton, Geoffrey E., Osindero, Simon, and Teh, Yee-Whye. A fast learning algorithm for deep belief nets. Neural Computation 18 (2006). 7: 1527.

[32] Hirsch, M. Differential Topology. Springer, 1997.

[33] Horn, B. Robot Vision. MIT Press, 2010.

[34] LeCun, Yann, Bottou, Leon, Bengio, Yoshua, and Haffner, Patrick. Gradient-based learning applied to document recognition. Proceedings of the IEEE 86 (1998). 11: 2278.

[35] Lienhart, R. and Maydt, J. An extended set of Haar-like features for rapid object detection. Image Processing. 2002. Proceedings. 2002 International Conference on. vol. 1. 2002, II vol. 1.

[36] Lowd, D. and Meek, C. Adversarial learning. Proceedings of the Eleventh ACM SIGKDD International Conference on Knowledge Discovery in Data Mining. ACM, 2005, 641.

[37] Lowe, David. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision 60 (2004). 2: 91.

[38] Lowe, David G. Object recognition from local scale-invariant features. Computer Vision, 1999. The Proceedings of the Seventh IEEE International Conference on. vol. 2. IEEE, 1999, 1150.

[39] Mairal, Julien, Bach, Francis, Ponce, Jean, Sapiro, Guillermo, and Zisserman, Andrew. Discriminative learned dictionaries for local image analysis. Computer Vision and Pattern Recognition, 2008. CVPR 2008. IEEE Conference on. IEEE, 2008, 1.

[40] Munder, S. and Gavrila, D. M. An Experimental Study on Pedestrian Classification. Pattern Analysis and Machine Intelligence, IEEE Transactions on 28 (2006). 11: 1863.

[41] Nesterov, Y. and Nemirovski, A. Some First-Order Algorithms for l1/Nuclear Norm Minimization. Acta Numerica (2014).

[42] Osowski, Stanislaw and Linh, Tran Hoai. ECG beat recognition using fuzzy hybrid neural network. Biomedical Engineering, IEEE Transactions on 48 (2001). 11: 1265.

[43] Rifai, Salah, Dauphin, Yann, Vincent, Pascal, Bengio, Yoshua, and Muller, Xavier. The Manifold Tangent Classifier. NIPS. 2011, 2294.

[44] Roweis, Sam T. and Saul, Lawrence K. Nonlinear dimensionality reduction by locally linear embedding. Science 290 (2000). 5500: 2323.
[45] Shotton, J., Blake, A., and Cipolla, R. Multiscale Categorical Object Recognition Using Contour Fragments. Pattern Analysis and Machine Intelligence, IEEE Transactions on 30 (2008). 7: 1270.

[46] Taigman, Yaniv, Yang, Ming, Ranzato, Marc'Aurelio, and Wolf, Lior. DeepFace: Closing the Gap to Human-Level Performance in Face Verification. Computer Vision and Pattern Recognition, 2014. CVPR 2014. IEEE Computer Society Conference on. IEEE, 2014.

[47] Tsochantaridis, Ioannis, Hofmann, Thomas, Joachims, Thorsten, and Altun, Yasemin. Support vector machine learning for interdependent and structured output spaces. Proceedings of the Twenty-First International Conference on Machine Learning. ACM, 2004, 104.

[48] Vapnik, V. Statistical Learning Theory. Wiley-Interscience, 1998.

[49] Viola, P. and Jones, M. Rapid object detection using a boosted cascade of simple features. Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference on. vol. 1. 2001, II vol. 1.

[50] Viola, Paul and Jones, Michael. Rapid object detection using a boosted cascade of simple features. Computer Vision and Pattern Recognition, 2001. CVPR 2001. Proceedings of the 2001 IEEE Computer Society Conference on. vol. 1. IEEE, 2001, I.

[51] Vondrick, C., Khosla, A., Malisiewicz, T., and Torralba, A. HOGgles: Visualizing Object Detection Features. Proc. Int. Conf. on Computer Vision. 2013.

[52] Wu, Jianxin, Brubaker, S. C., Mullin, M. D., and Rehg, J. M. Fast Asymmetric Learning for Cascade Face Detection. Pattern Analysis and Machine Intelligence, IEEE Transactions on 30 (2008). 3: 369.
BIOGRAPHICAL SKETCH

Mohsen Ali received his Ph.D. from the University of Florida in spring 2014. He was awarded a Fulbright scholarship by the Institute of International Education for his Ph.D. studies. Prior to that, he completed his pre-doctoral graduate studies in computer science at the Lahore University of Management Sciences and Punjab University (Lahore). His research interests include computer vision and machine learning.