
Efficient Object Recognition Using Color Quantization


EFFICIENT OBJECT RECOGNITION USING COLOR QUANTIZATION

By

SIGNE ANNE REDFIELD

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA

2001

Copyright 2001 by Signe Anne Redfield

ACKNOWLEDGMENTS

I thank my husband, Karl, for helping me to ignore distractions and keeping me motivated to finish, and my son, Rowan, for providing distractions when I needed them most. I would not have finished this without the exceptional help of my advisor, Dr. John G. Harris. Many thanks go to Dr. Michael Nechyba and Dr. A. Antonio Arroyo, for the use of equipment and their constant support and encouragement, and to Dr. Keith Doty, who got me interested in this project in the first place. Particular thanks go to my family, who let me bounce ideas off of them until I finally figured out what I needed to do, and especially to Mary Javorski, who gave me the kernel of the idea that evolved into this dissertation.

This research was partially supported by a National Science Foundation grant through the Minority Engineering Doctoral Initiative program, which among other things supplied the computer I used to do the experiments and write the thesis, and travel money to attend conferences. Many thanks also go to Publix Supermarkets, the Pepsi Bottling Company, and the denizens of the Machine Intelligence and Computational NeuroEngineering Laboratories for their donations of soda cans to my database.

TABLE OF CONTENTS

ACKNOWLEDGMENTS ....... iii
LIST OF FIGURES ....... vii
LIST OF TABLES ....... xi
ABSTRACT ....... xiii

CHAPTERS

1 INTRODUCTION ....... 1

2 ROBOTIC OBJECT IDENTIFICATION ....... 8
  2.1 Platform ....... 9
  2.2 Identification Methods ....... 10
  2.3 Histogram Indexing ....... 11
  2.4 Color Constancy ....... 13
    2.4.1 Retinex ....... 15
    2.4.2 Gamut Mapping ....... 17
    2.4.3 Other Methods ....... 20
  2.5 Fixing Histogram Indexing ....... 21

3 COLOR MEMORY ....... 23
  3.1 Experimental methods ....... 24
  3.2 Data analysis ....... 31
    3.2.1 Sensitivity ....... 33
    3.2.2 Calibration ....... 37
  3.3 Summary ....... 39

4 QUANTIZATION ....... 42
  4.1 Overview of Algorithms ....... 43
    4.1.1 Uniform Quantization ....... 43
    4.1.2 Dithering ....... 44
    4.1.3 Modified Uniform Quantization ....... 45
    4.1.4 Tree Quantization ....... 46
    4.1.5 Median-Cut Quantization ....... 48
    4.1.6 Vector Quantization ....... 48
  4.2 Number of Colors ....... 49
    4.2.1 Optimizing Accuracy ....... 50
    4.2.2 Lighting Shifts ....... 51
  4.3 Summary ....... 58

5 THEORY AND RESULTS ON SYNTHETIC DATA ....... 61
  5.1 Theory ....... 61
    5.1.1 Structure and Methods ....... 61
    5.1.2 Theoretical Results ....... 62
    5.1.3 Larger databases ....... 65
  5.2 Synthetic Data ....... 68
    5.2.1 Single Hue Region ....... 69
    5.2.2 All Hues ....... 72
    5.2.3 Multiple Hues ....... 73
    5.2.4 More Complex Lighting Shifts ....... 77
  5.3 Over-fitting ....... 80
  5.4 Conclusions ....... 81

6 STILL IMAGE RESULTS ....... 83
  6.1 Theoretical Comparison ....... 84
  6.2 Lighting Compensation ....... 91
    6.2.1 Workplace Lighting ....... 91
    6.2.2 Household Lighting ....... 95
    6.2.3 Full Database ....... 97
  6.3 Localized Hue Data ....... 99
  6.4 Orientation vs. Lighting Change ....... 100
  6.5 Chromaticity Results ....... 104
  6.6 Retinex ....... 106
  6.7 Summary ....... 112

7 ROBOT PROTOTYPES ....... 115
  7.1 Off-Line Implementation ....... 116
  7.2 Real-Time Implementation ....... 118
    7.2.1 First Implementation ....... 118
    7.2.2 Results on First Implementation ....... 118
    7.2.3 Second Implementation ....... 120
    7.2.4 Fixed Lighting ....... 122

8 CONCLUSION ....... 125

APPENDICES

A COLOR SPACES AND SENSORY SYSTEMS ....... 127
  A.1 Human Vision ....... 127
    A.1.1 Color Constancy ....... 129
    A.1.2 Color Memory ....... 132
  A.2 Color Spaces ....... 133
  A.3 Robot Sensory Systems ....... 137

B DATABASES AND IMAGES ....... 140

REFERENCES ....... 144

BIOGRAPHICAL SKETCH ....... 150

LIST OF FIGURES

3.1 Interface window for program enabling user to choose focal colors. Primary window shown here allows user to choose colors; secondary window shows focal colors after user has moved on to next color ....... 27
3.2 Interface for user to choose boundaries between focal colors. For each of seven levels of brightness, the user places lines where boundaries are perceived. Focal colors are assigned to the regions between the lines by the user. ....... 28
3.3 Screenshots of psychophysical testing interface. (a) original color; (b) intersample noise; (c) test color; (d) simultaneous comparison. ....... 30
3.4 Focal colors and initially displayed colors at LSB 5. ....... 32
3.5 Sample receiver operator characteristics curve. ....... 34
3.6 Monitor Calibration Results ....... 38
3.7 Results of pA analysis with actual change derived from monitor calibration. ....... 39
4.1 Original image used to demonstrate results of different quantization schemes. ....... 43
4.2 Results of uniform quantization ....... 44
4.3 Results of dithering on uniform quantization. ....... 45
4.4 Results of modified uniform quantization ....... 46
4.5 Diagram for tree quantization. ....... 48
4.6 Results of median-cut quantization ....... 49
4.7 Sample color changes under a single day's sunlight. ....... 52
4.8 Average and standard deviation for hue under varying lighting. The x location of each bar indicates the average of the average values for the day. ....... 53
4.9 Mean values for cans and calibrator hues. ....... 54
4.10 Probability density functions for chromatic can colors along the hue axis. ....... 55
4.11 Probability density functions for all can colors along the saturation axis. ....... 56
4.12 Probability density functions for all can colors along the value axis. ....... 57
4.13 Probability density functions for the colors in the red category. ....... 58
4.14 Sample image quantized using the 8-color lighting-shift based classifier. ....... 59
5.1 Diagram showing examples of each variable used in the theoretical equations. ....... 63
5.2 Theoretical results for c = 2^24, p=2 and p=3, n=3, and k varying. ....... 65
5.3 Theoretical results for c = 2^24, p=3, and k varying along the x axis. Each line corresponds to a different value of n. ....... 66
5.4 Theoretical results for c = 2^24, n=4, and k varying along the x axis. Each line corresponds to a different value of p. ....... 67
5.5 Simulation (averaged) results for c = 2^24, p=2 and p=3, k=10, and n varying. ....... 68
5.6 Image of single hue region synthetic database. Red corresponds to higher values; blue to lower values. ....... 69
5.7 Results on single hue region database for varying numbers of bins, using uniform (blue) and accuracy sweep (red) quantization. Average accuracy across the different shifts is shown in the lower right hand plot. Standard deviation of average value is shown with dotted lines. Progressively larger shifts from none (upper left hand plot) to 7 (middle lower plot) are shown in the remaining plots. ....... 70
5.8 Database of objects completely spanning the hue axis. ....... 71
5.9 Results of uniform (blue) and accuracy sweep (red) methods. ....... 72
5.10 Original database for histograms with two colors. ....... 73
5.11 Average accuracy of uniform (blue) and accuracy sweep (red) methods as a function of shift. ....... 74
5.12 Results of uniform (blue) and accuracy sweep (red) methods. ....... 75
5.13 Second two-peak database. ....... 76
5.14 Average accuracy for second two-peak database as a function of shift. ....... 77
5.15 Results of uniform and accuracy sweep methods on second two-peak database. ....... 78
5.16 Transformation from average illuminant to early morning and evening illuminants. ....... 79
5.17 Comparison of original and warped databases. ....... 80
5.18 Results for training "original" and testing "warped." ....... 81
5.19 Over-fitting of accuracy sweep method. ....... 82
6.1 Sample soda images used in 14-can database. Soda shown here is Publix brand Diet Cola, under each of eight different illuminants. ....... 84
6.2 Color maps for 2, 3, 4, 5, 6, 8 and 14 colors. ....... 85
6.3 Theoretical predictions (blue solid lines) for c = 2^24, p=2 and p=3, n=3, and k varying. Real database results (red dashed lines) for c = 2^24, p=3 and p=5, n=3 averaged over 20 sets, and k varying ....... 86
6.4 Real database results for n varying, with k=8 and k=14, p=2 and p=3. The blue solid lines show k=14 and the red dashed lines show k=8. The blue and red lines with triangle markers show the results from the comparison between theoretical data (blue) and real data (red). ....... 87
6.5 Real and theoretical results for k=1 to k=25. Dotted line shows theoretical results for p=2. Dashed line shows theoretical results for p=3. Crosses show averaged real results for p=2. Stars show averaged real results for p=3. ....... 90
6.6 Uniform quantization results for the workplace database. ....... 93
6.7 Example of the characteristic knee in the accuracy curve for real data. ....... 94
6.8 Comparison of uniform and accuracy sweep methods on the workplace database. ....... 95
6.9 Full database accuracy vs. number of bins for uniform quantization. ....... 98
6.10 Localized database accuracy vs. number of bins for uniform quantization. ....... 99
6.11 Comparison of accuracy sweep (red) and uniform (blue) quantization methods. ....... 101
6.12 Comparison of different numbers of orientations with respect to accuracy ....... 102
6.13 Comparison of lighting condition change to orientation change. ....... 103
6.14 Images and 2-D histograms using [r, g] chromaticities. ....... 104
6.15 Comparison of uniform quantization methods. ....... 106
6.16 Histogram results for the full database, without color constancy. ....... 108
6.17 Histogram results for the full database, with color constancy. ....... 109
6.18 Comparison between uniform quantization results on full database for data with and without retinex pre-processing. ....... 110
6.19 Comparison between results with different color constancy methods. ....... 111
7.1 Sample image of input to system. Region used to create histogram is outlined in red; remainder of image is shadowed. ....... 116
7.2 Sample histograms from database. ....... 117
7.3 Comparison of 7-Up(TM) and Mountain Dew(TM) nutritional information ....... 124
A.1 RGB color space. ....... 134
A.2 Munsell color space. ....... 135
A.3 Munsell colors used by Berlin and Kay, with focal colors (dots) and region boundaries. ....... 136
B.1 Sample image used to generate full database of 86 cans, with 8 different orientations and 4 different illuminants. ....... 142
B.2 Samples of images used to generate full database of over 82 cans, with 8 different orientations and 4 different illuminants. Top row is not processed; bottom row is processed with the retinex algorithm. ....... 143

LIST OF TABLES

3.1 Number of trials given bit-depth ....... 33
3.2 Number of trials given bit-depth ....... 33
3.3 Results of the d′ and pA analyses. ....... 35
3.4 Gamma Calibration Data ....... 37
6.1 Real data results for p=3. CC shows the expected results if the algorithm is performing correct color constancy and accurately identifying each object. ....... 89
6.2 Object recognition accuracy generated from database of 9 soda cans under 4 different lighting conditions. Ill. in the table refers to illuminant, and indicates the images that were used as the templates in the database, while the test data consisted of the remaining 27 images. ....... 92
6.3 Accuracy on household database. ....... 96
6.4 Accuracy on full database for lighting shift. ....... 97
6.5 Accuracy on full database with orientation varied and lighting constant. ....... 100
6.6 Accuracy on full database for lighting shift quantization methods. ....... 107
6.7 Accuracy on full database with retinex pre-processing. ....... 112
7.1 Number correct when identification of sodas is tested using orientation midway between those in the database. Eight cans with two orientations in database, fourteen colors. The numbers 1 and 2 correspond to orientations; Err indicates what was chosen instead of the correct soda. Codes are: Sch = Schweppes(TM), CT = Country-Time Lemonade(TM), EO = Eckerds Orange Drink(TM), WG = Welch's Grape Drink(TM), C = Coca-Cola(TM), D7 = Diet 7-Up(TM), CD = Canada Dry(TM), and 7U = 7-Up(TM). ....... 119
7.2 Number correct when identification of sodas is tested using orientation midway between those in the database. Eight cans with four orientations in the database, eleven colors. The numbers 1, 2, 3, and 4 correspond to orientations; Err indicates what was chosen instead of the correct soda. Codes are: Sch = Schweppes(TM), CT = Country-Time Lemonade(TM), EO = Eckerds Orange Drink(TM), WG = Welch's Grape Drink(TM), C = Coca-Cola(TM), D7 = Diet 7-Up(TM), CD = Canada Dry(TM), and 7U = 7-Up(TM). ....... 120
7.3 Number correct when identification of sodas is tested using orientation midway between those in the database. Ten cans with four orientations in the database, eleven colors. The numbers 1, 2, 3, and 4 correspond to orientations; Err indicates what was chosen instead of the correct soda. Codes are: Sch = Schweppes(TM), CT = Country-Time Lemonade(TM), EO = Eckerds Orange Drink(TM), WG = Welch's Grape Drink(TM), C = Coca-Cola(TM), CD = Canada Dry(TM), MD = Mountain Dew(TM), P = Pepsi(TM), SL = Slice(TM), and LDL = Lipton Diet Lemon Brisk Iced Tea(TM). ....... 121
7.4 Number correct when identification of sodas is tested using orientation midway between those in the database. Eleven cans with four orientations in the database, sixteen colors. The numbers 1, 2, 3, and 4 correspond to orientations; Err indicates what was chosen instead of the correct soda. Codes are: Sch = Schweppes(TM), CT = Country-Time Lemonade(TM), EO = Eckerds Orange Drink(TM), WG = Welch's Grape Drink(TM), C = Coca-Cola(TM), P = Pepsi(TM), SL = Slice(TM), LDL = Lipton Diet Lemon Brisk Iced Tea(TM), 7U = 7-Up(TM), WC = Wild Cherry Pepsi(TM), and DP = Diet Pepsi(TM). ....... 122
7.5 Number correct when identification of sodas is tested using orientation midway between those in the database. Twelve cans with four orientations in the database, sixteen colors. The numbers 1, 2, 3, and 4 correspond to orientations; Err indicates what was chosen instead of the correct soda. Codes are: Sch = Schweppes(TM), CT = Country-Time Lemonade(TM), MMO = Minute Maid Orange Soda(TM), WG = Welch's Grape Drink(TM), C = Coca-Cola(TM), P = Pepsi(TM), SL = Slice(TM), LDL = Lipton Diet Lemon Brisk Iced Tea(TM), 7U = 7-Up(TM), WC = Wild Cherry Pepsi(TM), DP = Diet Pepsi(TM), and SP = Sprite(TM). ....... 123
B.1 Sodas in final database. ....... 141

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

EFFICIENT OBJECT RECOGNITION USING COLOR QUANTIZATION

By

Signe Anne Redfield

December 2001

Chairman: J. Harris
Major Department: Electrical and Computer Engineering

A simplification of the color histogram indexing algorithm is proposed and analyzed. Instead of taking a histogram consisting of hundreds of colors, each input image is first quantized to only a few colors (between eight and sixteen) and the feature vector is generated by taking a histogram of this smaller space. This increases the efficiency of the system by orders of magnitude. We also proposed that this would reduce the effects of lighting change on the algorithm and that this would be a better model for the human object recognition mechanism than the algorithm combined with color constancy alone.

In support of the contention that this may be a better human model, a psychophysical experiment was conducted. The bit-depth of human color memory was shown to lie between 3 and 4 bits, corresponding to 8 to 16 color categories, when a color is remembered for five seconds. This experiment created a bridge between the worlds of the psychophysical results and the computer testbed. The research showed that quantization can occasionally compensate for small lighting changes, but that the compensation is highly database-dependent and erratic. However, quantization always produced a much more efficient system and generally did not substantially reduce the accuracy.

The results of this work were threefold. First, human color memory is relatively poor, indicating that a system incorporating quantization will be far closer to mimicking human abilities than a system without it. Second, quantization alone is insufficient to perform color constancy in most cases. Third, with or without a color constant pre-processor, our results consistently showed that quantization has little effect on accuracy when using more than sixteen bins. Object recognition accuracy degrades substantially as the number of color categories drops below six. From 10 categories to 256, accuracy is essentially unchanged. Quantization is a very efficient way to reduce the computational complexity and storage requirements of this algorithm without substantially affecting its object recognition accuracy.

CHAPTER 1
INTRODUCTION

The purpose of this research is to explore the effect of quantization on object recognition accuracy when color is the only cue. There are many ways in which a system can identify an object. These sensory systems range from very simple (such as distinguishing between obstacle and air via infra-red return) to extremely complex (such as using multi-sensory input and sensor fusion to make sharp distinctions between specific objects). Object recognition using color images lies somewhere in the middle of this range. Generally the system is being required to make fairly sophisticated choices (track the red object but ignore the other red object), with limited resources. There are many other cues that can and often should be used, even when vision is stipulated as the primary sense, but here we will be exploring the effects of quantization when only color is considered.

Object recognition using images is generally inefficient. The vast quantities of input data tend to overwhelm systems attempting to extract only useful information. Systems that recognize objects based on shape alone require moderate databases and impose a substantial computational burden. Systems that rely on color have to deal with triple the input information required for shape-based systems, and have difficulty when the lighting changes. However, under controlled conditions, or when the lighting changes are somewhat predictable, efficient object recognition using color is possible. Furthermore, if a truly efficient object recognition algorithm using color can be derived, more expensive techniques can be used on only the subset of data produced by the color recognition process. This could dramatically speed up the recognition process, even when multiple object properties are used as features.

Objects can be identified using many possible features. Here we look exclusively at color, used as the only feature. Obviously, not all databases are suited for this approach. Soda cans, however, form a very good dataset for this experiment. They are uniform in shape and thus easily segmented from their background, and they come in a wide variety of colors. In addition, there are many potential uses for the system, such as a butler robot. If you send your robot to get you an ice cold Coca-Cola(TM), the robot should be able to identify it reliably and not bring you Fresca(TM) instead. Our research shows that quantization can make an appropriate object recognition algorithm much more efficient.

Vision is a passive sense. Assuming we are using a robot to identify objects, the robot does not need to disturb what it is looking at to identify the object. The robot does not even have to be close to the object to be capable of identifying it. In addition, vision is not object specific. The same sensor can be used to identify many different types of objects. For example, humans can tell simply by looking which object is a chair and which is a desk. They can even differentiate between the blue chair and the red chair. This makes vision one of the most potentially flexible sensory modalities. Furthermore, vision enables humans (and, potentially, robots) to precisely determine the locations of relatively distant objects. Other senses are not as useful. Sonar, for example, would allow a robot to approximately determine distant objects and closely determine nearby objects. Infra-red reflections allow a robot to only approximately determine nearby objects, and require the use of additional sensors to further define its environment. In addition, in order to use such senses effectively, and eliminate possible confusion, the placement of these sensors is critical. If they are poorly placed, the robot will be unable to distinguish between critical situations. Furthermore, many sensors are needed in order for these alternative sensory modalities to operate well. With vision, however, a single camera can be placed anywhere with a good view and will provide far more information. Vision is in many ways an ideal robotic sensor.

However, vision is also the source of many problems. Vision-based object recognition is plagued by the computationally intensive nature of image data. Color images in particular require vast storage capabilities. An image of 128 by 128 pixels, with 8 bits (or 256 levels) per color band, requires 393,216 bits just to store all the pixel values. Almost all the algorithms using visual data also require substantial processing to extract the relevant information. Image data contains remarkable quantities of information, but extracting that information can be very time-consuming and complicated. The same characteristic that makes vision such a wonderful sensory modality also makes it very difficult to work with.

It is, of course, possible to obtain tremendous amounts of information about the world using only grayscale images. Why increase the robot's computational burden by using color? If we assume that all the edge-detection and shape-based techniques we would use for a grayscale image are already available, the addition of color allows us to make the task of separating objects from their environment much easier. Suddenly we can distinguish between an apple and an orange without complicated texture analysis. We can even easily distinguish the ripe oranges from the unripe oranges.

Unfortunately, the computational complexity of the system will be greatly increased. How can we use the wealth of information inherent in this sense without overwhelming our system? Given the computational burden of shape-based algorithms, and the relatively smooth variation of chromaticity from pixel to pixel, perhaps a statistical method working only with chromaticity would be an appropriate solution.

Swain and Ballard [57] proposed the histogram indexing algorithm in 1991. Fundamentally, the algorithm simply looks at the statistical probability of the occurrence of each color for a given object, and uses that information to determine the most likely match. Chapter 2 explores the original algorithm in detail. It is sufficient to say here that the algorithm has almost all the properties required for robust object recognition in a realistic environment. It is robust to orientation changes, scaling, rotation and deformation. It is unfortunately fragile when exposed to lighting changes and large databases. Using color as the only feature dramatically reduces both storage and processing requirements.

Several suggestions of ways to make the algorithm more lighting invariant have been made, and are explored in more detail in Chapter 2. The authors suggested the use of a color constant preprocessor before the algorithm is run. The field of color constancy includes all the algorithms that attempt to convert an image taken under one light to an image taken under a different, default light. There have been many approaches to this problem. Of these, Land's retinex theory [36, 37, 38, 39, 40, 30, 51, 10, 33, 49, 31, 67, 42] is one of the most widely researched and is relatively simple to implement. Alternative methods include Forsyth's gamut mapping algorithm [23, 21] and global methods such as subtracting the dominant color from the entire image, or normalizing each pixel by its luminance.

Instead of color constancy methods, some researchers opt for a local method utilizing shape information [25]. Instead of taking histograms of the colors themselves, this method incorporates color constancy into the algorithm by taking histograms of color ratios. However, this simply adds a step to our research, as we would still be left with the problem of quantizing the color ratios. All methods that incorporate color constancy into the algorithm are more computationally complex than the original. In addition, many simply are not good enough for object recognition. Funt et al.'s paper [24] showed fairly conclusively that of the simple versions of the algorithms, Forsyth's and Land's algorithms work almost equally well, obtaining roughly 70% accuracy on their database. Their database is unrealistic, containing images of single objects under dramatically colored single illuminants with a white patch available for calibration purposes. The authors concluded that their results meant that current color constancy algorithms were insufficient to reliably recognize objects in the real world. For our purposes, the algorithms may or may not be sufficient, but they are unquestionably too complicated and require far too many resources. In order for our robot to function efficiently, it should not have to spend five minutes in front of the fridge with the door open compensating for current lighting conditions.

Histogram indexing was biologically inspired. Psychophysical experiments showed that this was a plausible approximation to one of the methods humans use to identify objects. Color constancy is also biologically inspired. The retinex theory was devised as a model for human color constancy. However, algorithms that approximate human color constancy are not perfect. In fact, although the general perception may be that humans are quite good at compensating for the illuminant, research has shown that humans are decidedly imperfect [19, 69, 1]. Consensus in the biological literature is that color constancy is instantaneous. Humans see a scene, and they are instantly capable of determining the approximate hue of a given object without respect to lighting. Background information on color constancy and other psychophysical phenomena is discussed in Chapter 2 and Appendix A.

The object recognition task, in and of itself, must incorporate a delay of some sort. In order to recognize an object, one must have seen it before; thus the element of time is implicit in the task. What happens to colors when they are stored in human memory? Chapter 3 explores this issue. Human sensitivity to shifts in hue with a five second delay corresponds to between 8 and 16 color categories.

We implement the degradation of color memory in a robotic system with quantization. For the many possible colors that our camera can capture, we sort them into categories via quantization and then use those categories to generate our histograms. Chapter 4 explores the background of quantization, and specific methods, in detail. Swain and Ballard showed that under strictly controlled lighting conditions objects could be reliably recognized using color histograms. They used 512 colors to prove their point, but showed little degradation for as few as 64 colors. However, our previous experiments [28, 53, 52, 54] have shown that quantization to between 8 and 16 colors can produce improved recognition accuracy under varying lighting conditions, and rarely decreases accuracy. Analysis of lighting shifts showed that when fluorescent light and daylight are the primary sources, 6 chromatic categories and two achromatic categories are sufficient for our databases.

Our results on synthetic data are presented in Chapter 5. This chapter also includes theoretical and simulation results, showing that quantization is an effective way of making the algorithm more efficient and is a potential replacement for more traditional color constancy algorithms. Chapter 6 contains the results when the database is made up of still images of real objects. These databases range in size from 9 objects to over 80 objects, generally with at least 4 common lighting conditions. These results show that using quantization as a substitute for color constancy algorithms is impractical. However, quantization produces a far more efficient system, with or without color constancy algorithms. Chapter 7 describes several prototype systems using 10, 11, 14, and 16 colors with varying results. These systems show that for small databases and minimally varying lighting, good lighting invariance can be obtained. For more varied lighting, additional data under representative lighting conditions will allow the algorithm to perform well. When color constancy algorithms are used with quantization, the system combines the increased accuracy of the color constancy algorithms and the increased efficiency of the quantization.

Chapter 8 summarizes the results of this research. In general, human color memory is capable of distinguishing between members of 8 to 16 categories. The results of experiments on still images and synthetic databases show that there is little to no degradation in accuracy as the number of quantization categories is decreased to between 8 and 16. Accuracy tends to decrease as the number of categories falls below 8. In general, our results indicate that quantization is a very effective means of increasing the efficiency of a system without decreasing its accuracy. In some cases, quantization may increase the accuracy as the number of bins is decreased to between 8 and 16. Our prototypes show that near perfect accuracy is obtainable in real situations, with minimally varying lighting, using only 11 color categories.
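The quantize-then-histogram pipeline described in this chapter can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the dissertation's implementation: it uses uniform quantization of 8-bit RGB into levels_per_channel^3 categories (2 levels per channel giving 8 categories, in the 8-to-16 range argued for above), and the function names are invented for the example.

```python
import numpy as np

def quantize_uniform(image, levels_per_channel=2):
    """Map each 8-bit RGB pixel to one of levels_per_channel**3 uniform bins."""
    # Integer bin index per channel: 0 .. levels_per_channel - 1
    bins = (image.astype(np.uint32) * levels_per_channel) // 256
    # Combine the three channel indices into a single category label
    return (bins[..., 0] * levels_per_channel**2
            + bins[..., 1] * levels_per_channel
            + bins[..., 2])

def color_histogram(image, levels_per_channel=2):
    """Normalized histogram of quantized colors: the feature vector."""
    k = levels_per_channel ** 3
    labels = quantize_uniform(image, levels_per_channel)
    hist = np.bincount(labels.ravel(), minlength=k).astype(float)
    return hist / hist.sum()

# A synthetic 4x4 'image' stands in for a segmented soda-can region
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(4, 4, 3), dtype=np.uint8)
h = color_histogram(img)  # 8-element feature vector, sums to 1
```

The storage savings follow directly: the feature vector shrinks from hundreds of bins to 8-16, and the same nearest-neighbor comparison runs over proportionally fewer elements.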

PAGE 22

CHAPTER2ROBOTICOBJECTIDENTIFICATIONOurpurposeistodemonstratetheeectofquantizationoncolorobjectrecog-nition.Unfortunately,almostallcolorconstancyalgorithmspublishedaredeemedgood"bytheauthors,withnoindicationofametricwithwhichtojudgethemotherthanthehumaneye.Ifwearetodeterminequantizationcanhavethesideeectofcolorconstancy,weneedamoreindependent,reproduciblemetric.Theaccuracyofcolorobjectrecognitionprovidessuchametric.Letusassumewehaveabutlerrobotwhosefunctionistodeliversodacans.Wewouldlikeourrobottoidentifythesecansinarealisticlaboratoryenvironment.Manyexperimentshavetestedalgorithmsinveryaustereenvironments[ 57 ],withxedenvironmentalvariablessuchasonlyallowingtherobottomovewithinarigidlycon-trolledareasuchasahallway[ 55 ].Generally,theconstraintsontheseenvironmentsarejustiedbyassumingthealgorithmwillbeimplementedinanequallyconstrainedenvironment,suchasafactoryoor.However,thesealgorithmsgenerallybreakdowninmorerealistic,uncontrolledenvironments[ 57 ].Someoftheearliestautonomousrobotsrespondedtostimuliinsimpleways,butwerecapableofexhibitingbehaviorsthatwe,ashumans,associatewithemotion[ 12 ].Theserobotshadverylimitedsen-sors,butwereveryrobust.Vision,asthemostcomplexsenseavailabletorobots,isbothextremelydata-intensiveandextremelyfragileinthewayitisusuallyusedforrobots.Inthelaboratorywhereourrobotmustwork,thelightingvariesandtheoverallenvironmentchangesoften.Furnitureismoved,creatingnewshadowsandreections.Thearrangementofdrinkswithintherefrigeratorchanges.Thisisnotarigidlycontrolledenvironment.Infact,itcontainsmanyofthecharacteristicsofthereal 8

PAGE 23

9 world:itisminimallyconstrained,chaoticandunpredictable.Thisdissertationisnotconcernedwiththephysicaldesignofourrobot,orwithitsmanipulators,orwithhowourrobotndstherefrigerator,theoces,oreventhecanswithintherefrigerator.Wecouldevenuseasimplermethodtodeterminewhichcaniswhich,suchasreadingthebarcodesonthesidesofthecans.Onepurposeofthisresearchistodeterminetheeectofquantizationoncolorobjectrecognitionwithaviewtodeterminingitseectivenessasasubstituteformoreelaboratecolorconstancyalgorithms,andweuseourrobot'sidenticationofthecorrectcanasthemetricwithwhichtocomparethealgorithms.2.1PlatformQuantizationhasseveraladvantagesovercolorconstancy.Forexample,itismoreusefulinthecaseofanactualroboticimplementation.Inorderforourrobottofunction,theobjectrecognitionalgorithmmustsatisfycertainconstraints.First,thereistheissueofprice.Inordertokeepthecostoftherobottoaminimum,wewantanalgorithmthatwillminimizeuseoftheprocessor,minimizethenecessarymemory,andnotrequireexpensivesensoryequipment.Intheory,abettersensorimpliesbetterresults.Inactuality,givenniteresources,amoreexpensivesensorwillaecttherestoftherobot,possiblyresultinginacheaperprocessor,lessmemory,andperhapsevencompromisesinthedesignoftherobot'smanipulatorsanditsmaneuverability.Second,ourrobotshouldworkinrealtime.Inordertoidentifythecan,therobotwillhavetokeepthedooroftherefrigeratoropen.Thefastertherobotcandecide,thesoonertherefrigeratordoorwillcloseandthelesselectricitywillbewasted.Inaddition,thefastertherobotknowswhichisthecorrectcan,thefasterthecanwillbedelivered.Thesetwoconstraintspriceandspeeddenethedegreeofacceptablecomputa-tionalcomplexity.Pricedetermineshowfastaprocessorwecanuse,withhowmuch

PAGE 24

10 memory.Pricealsodeterminesthequalityoftheinputdata,bydeterminingthecapabilititesofthecameraandframegrabber.Thespeedconstraintcombinedwiththeprocessorandmemoryconstraintsdeterminethecomplexityofthenalobjectrecognitionalgorithm.Thenalcriterionisthatthesystemmustberobusttoasmanychangesaspossibleintheobjecttobeidentiedandintheenvironment.Pricemustbeminimized,whilespeedandrobustnessmustbemaximized.Quantizationwillmakethesystemmoreecient,bothbyreducingthenumberofelementsthatneedtobestoredinthedatabaseandbyreducingthenumberofcomputationaloperationsthatneedtobeexecutedinordertoidentifyanobject.2.2IdenticationMethodsTherearemanypossiblesolutionstothesodaidenticationproblem.Insteadofactuallyaskingourrobottoidentifythecans,wecouldstandardizetheorganizationoftherefrigerator.SpriteTMwouldalwaysbeinthesameplace,aswouldeveryothersoda.Unfortunately,thismethodisunlikelytoworkinthechaoticenvironmentassumedabove.Thereisnoguaranteethatthepersonresponsibleforllingtherefrigeratorwillchoosetobuythesamedrinkseverytime,orputtheminthesameplace.Furthermore,thecontentsoftherefrigeratormaybeshiftedtoallowroomforitemsotherthandrinks.Wecouldtrytosimplifythesystembyusinggrayscaleimagesinsteadofcolorimages.However,inordertorecognizecanswithgrayscaleimages,wewouldneedtouseshape-basedmethods.Canstendtohavesimilardistributionsofcurvesandangles,andeverycanhasonesidededicatedtonutritionalinformation.Thenutri-tionalinformationhasapproximatelythesameshapedistributionforeverycan.Ifwewantedtoidentifycansbasedonthenutritionalinformationwewouldhavetobeabletoextractthewordsfortheingredientlist,andmatchthemuptoadatabaseofwordstodeterminetheingredients,whichwouldthenbecomparedtoaknown
database of objects. The nutritional content of different sodas is remarkably similar. If we want to identify the cans based on the logos of the different drinks, we would need some form of shape-matching, and possibly character recognition [8, 2]. Fundamentally, shape-based methods tend to be both computationally intensive and time-consuming, violating our cost criteria.

Instead, we implement Swain and Ballard's histogram indexing method [57]. This method, inspired by human psychophysical experiments, meets our criteria in almost every way. It is very efficient. Instead of performing complicated shape-based calculations to extract information, the computer generates a histogram of the colors present in an image of the desired object. This histogram is normalized and used as the feature vector to describe the object in the database. When an unknown object is presented, its histogram is generated and compared to the histograms in the database. Simple Euclidean distance is used to compare the histograms, and a nearest neighbor classifier [18] is used to identify the objects. The algorithm satisfies the cost criteria above. It is also robust to orientation changes, scaling, and rotation and deformation of the object. The only area in which it fails is robustness to the environment when the lighting changes. Because of this failure, the algorithm is a good choice for testing the effectiveness of color constancy methods. The histogram indexing algorithm's accuracy will become close to perfect when the color constancy method is compensating adequately for the lighting conditions. As the input becomes less and less color constant, the accuracy will decrease.

2.3 Histogram Indexing

The color histogram indexing algorithm was originally proposed as an instance of animate vision for robotic systems. The two requirements it must satisfy are that it must work in real time and that it must display environmental robustness. The
stated goals of the researchers were, first, to determine the identity of an object with a known location and, second, to determine the location of a known object.

To solve the first problem, that of determining the identity of an object, the authors define a similarity metric called histogram intersection, which compares the number of pixels of each color in the new image to the number of pixels of that color in the model in the database. The model in the database is assumed free of background (non-object) pixels, while segmentation of real data is assumed incomplete, or at best inaccurate. Histogram intersection is robust to distractions in the background of the object, to viewing the object from a variety of viewpoints, to occlusion, and to varying image resolution. It is not robust to varying lighting conditions or to large database sizes. Swain and Ballard's definition of the intersection of two histograms is

    \sum_{j=1}^{n} \min(I_j, M_j)                                (2.1)

where I and M are histograms with n bins each. They obtained a match value from zero to one by normalizing this intersection by the number of pixels in the reference histogram, and scaling the image of the unknown object to the same number of pixels. For efficient indexing into a large database, they calculated a match using only some number of the largest bins. Their database contained 66 images of real objects that were used as the training set, and a testing set of images including occluded and rotated objects. They obtained 100% accuracy keeping the 10 largest bins, and 99.86% accuracy keeping 200 bins. They simulated changing lighting conditions, but only linear intensity changes, not chromatic or non-linear changes.
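The intersection of Eq. 2.1 and the normalized match value reduce to a few lines of code. This is a minimal sketch, not Swain and Ballard's implementation; the function names and the tiny four-bin histograms are my own:

```python
def histogram_intersection(image_hist, model_hist):
    # Eq. 2.1: sum over bins of the smaller of the two counts.
    return sum(min(i, m) for i, m in zip(image_hist, model_hist))

def match_value(image_hist, model_hist):
    # Normalizing by the model's pixel count gives a match value in [0, 1];
    # the unknown image is assumed already scaled to the same pixel count.
    return histogram_intersection(image_hist, model_hist) / sum(model_hist)

model = [4, 0, 6, 2]                      # hypothetical 4-bin color histogram
print(match_value(model, model))          # identical histograms -> 1.0
print(match_value([0, 4, 0, 6], model))   # mostly disjoint bins -> low match
```

Classifying an unknown object then amounts to picking the database model with the largest match value (or, as in Section 2.2, the nearest histogram in Euclidean distance).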
To solve the second problem, that of finding the location of a known object, they define a ratio histogram R as

    R_i = \min\left(\frac{M_i}{I_i}, 1\right)                    (2.2)

The values of the histogram R replace the image values and the result is convolved with a mask of the appropriate shape and size for the object in question. If the target is in the image, a peak in the convolved image indicates its most likely location. Results were good, with 5 occluded objects out of 32 corresponding to the 2nd highest peak, 1 occluded object corresponding to the 7th highest peak, and the rest identified correctly.

The histogram indexing algorithm is extremely robust. It retains its accuracy for many changes in the object, such as rotation, occlusion, scaling and deformation. It is, however, extremely sensitive to lighting changes. If the lighting changes, then many pixels will change sufficiently to move from one histogram bin to another, profoundly altering the overall histogram. To solve this problem, the authors suggested a simple color constancy preprocessor. Many approaches to the solution of this problem have been attempted. Color constancy based systems have shown limited success, while shape based systems have shown slightly more. All these solutions require dramatically more computations and memory than the original algorithm.

2.4 Color Constancy

The general color constancy problem has been studied extensively [1, 4, 9, 19, 23, 29, 39, 40, 42, 58, 69, 47]. Two basic assumptions underlie most of these solutions. First, people do it well. This is addressed in more detail in Appendix A, but is open to debate, depending on your criteria for "well." People are capable of color constancy to some degree, but actual color-to-color matching exercises show that
people can perform well only if the color is perceived as an aspect of a concrete object [1]. Simply correcting for a given lighting condition without this association seems to be very difficult. The second common assumption is that color constancy is an instantaneous phenomenon, and therefore the effect of color memory on color constancy is negligible in this context. This may be true. Many color constancy experiments involving the same object rely on first one exemplar, and then a delay, and then the exemplar under a different light. This was, in fact, one of the first examples of color constancy in Land's and McCann's work [36, 37, 40]. However, this delayed experiment inherently involves color memory, as color memory has been shown to degrade over delays on the order of 200 ms [61]. Even within a scene, as we look from one object to another under the same light, it is not always obvious that two colors are different. It is often necessary to place the objects side by side to enable an observer to see the substantial difference between their colors. Researchers attempt to get around this by having two known identical scenes visible to the participant, and asking the participants to adjust the lighting on one scene so it matches the other.

There are two types of solutions to the color constancy problem. Global solutions make up the first type. Generally, color constancy solutions rely on the stricture that lighting changes have relatively low spatial frequency compared to surface property changes. The most extreme manifestation of this, therefore, is to remove the DC color component of a given image. Many global algorithms exist along these lines. Some methods attempt to normalize the color values, to eliminate dependence on intensity or some other characteristic. Finlayson [22] has developed a color constant space that effectively eliminates the illuminant. Unfortunately, in order to determine the required projection that will take you into this space, you must first have examples of the changes in lighting you wish to compensate for.

Local algorithms make up the second type of solutions to the color constancy problem. These include Land's retinex theory of color constancy, first published in
1959 [36, 37]. The algorithm uses a very large local region (almost global), comparing the current pixel value to the overall color surrounding it, with a steep discount as a function of distance. In 1971 [40], he and McCann published additional results using the retinex as a model for human color vision. In the mid-eighties, Land published papers containing new versions of this theory [38, 39], which was revised and expanded by other researchers throughout the eighties and nineties [9, 10, 4, 33, 49, 31, 42].

Most color constancy algorithms are based on what is known as von Kries' principle. Coefficients independently adjust the gain of each photoreceptor or channel to obtain surface color descriptors. These factors vary according to the author cited: for example, Brill and West [13] use factors of one over the output of that photoreceptor for a known white patch. In addition to failing when the photoreceptor classes are not independent of each other, this algorithm also illustrates a common failing among color constancy algorithms. In general, color constancy algorithms require either known surfaces (either known whites or known types of surfaces, such as Mondrians: flat, matte images of smooth color regions) or known lighting conditions, or both.

2.4.1 Retinex

In 1959, Edwin Land [36, 37] proposed the first biologically-based color constancy algorithm. In the mid-eighties, with the advent of computers fast enough to simulate the process, he revisited his previous work [38, 39]. These new algorithms explored possible methods to improve the biological accuracy (incorporation of Mach band effects, for example) and the simplicity of the computation. In the final form, Land's algorithm is similar to homomorphic filtering, and relies on similar properties of images. Homomorphic filtering enhances high frequencies and reduces low frequencies, in the frequency domain. In the retinex algorithm, high frequencies are enhanced and low frequencies are reduced in the spatial domain. The center/surround retinex uses
the following three steps to obtain a final (relatively) color constant image. First, for each color channel independently, take the log of the value of each pixel. Second, again for each color channel independently, subtract the result of convolving a local surround function with the image from each pixel. Third, globally scale the resulting values appropriately for image display.

Much research has been done on which type of surround function is best [33, 49, 31, 10, 42] and on the placement of the log function. Jobson et al. [33] conclude that the Gaussian surround with the log taken after the surround function (instead of before) provided the best results. However, their interpretation of "best" seems distressingly free of any metric besides that of the opinion of the researchers.

Moore et al. [42] designed and built an analog VLSI chip that performed the retinex algorithm on real time video data, with good results. Their research demonstrated a fundamental problem with the retinex algorithm, as it then was. In images with large regions of a single color, the gray world assumption forces those regions to gray, even when the actual color is highly saturated. They solved this problem by introducing a variance compensation mechanism, which uses the local variance to determine how much to change the overall color. Again, the metric for "good" and "poor" results depends entirely on the opinion of the authors.

In 1986, Brainard and Wandell [10] published an analysis of the retinex theory in terms of color constancy. Their concern was with color constancy that retains the colors of objects independent of both nearby objects and illumination. They analyzed the retinex algorithm developed by Land in 1983 [38] in this context. Land and McCann [40] concluded that the retinex performs similarly to humans when only the spectral composition of the illuminant is considered. Brainard and Wandell's paper deals specifically with the effect of the algorithm on the perceived colors of objects close to the object in question. They determined that retinex "is not a color constant algorithm and that it is not an adequate model of human performance." ([10], p. 1657)
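The three steps above can be sketched directly. This is a toy single-scale version for one color channel, with a Gaussian surround (the choice Jobson et al. favor) and the log applied to both center and surround; the function name, the clamped border handling, and the small offset guarding log(0) are my own assumptions, not Land's implementation:

```python
import math

def single_scale_retinex(channel, sigma=2.0):
    # Steps from the text, per channel: (1) log of each pixel, (2) subtract the
    # log of a local Gaussian surround, (3) globally rescale for display.
    h, w = len(channel), len(channel[0])
    r = int(3 * sigma)
    # Gaussian surround weights, truncated at 3 sigma.
    weights = {(dy, dx): math.exp(-(dy * dy + dx * dx) / (2 * sigma * sigma))
               for dy in range(-r, r + 1) for dx in range(-r, r + 1)}
    total = sum(weights.values())

    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Convolve the surround with the image, clamping at the borders.
            surround = sum(wgt * channel[min(max(y + dy, 0), h - 1)]
                                        [min(max(x + dx, 0), w - 1)]
                           for (dy, dx), wgt in weights.items()) / total
            out[y][x] = math.log(channel[y][x] + 1e-6) - math.log(surround + 1e-6)

    # Step 3: rescale to [0, 1] for display.
    lo = min(min(row) for row in out)
    hi = max(max(row) for row in out)
    scale = (hi - lo) or 1.0
    return [[(v - lo) / scale for v in row] for row in out]

# A left-to-right illumination gradient is low spatial frequency, so the
# retinex output varies far less than the input does.
gradient = [[0.2 + 0.15 * x for x in range(6)] for _ in range(6)]
out = single_scale_retinex(gradient)
print(min(min(r) for r in out), max(max(r) for r in out))  # 0.0 1.0
```

A multi-scale retinex, as discussed below, would simply sum several such outputs computed with different values of sigma.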
The retinex is not a color constant algorithm because it normalizes the photoreceptor gains to values that depend on the input image, rather than to a constant. It is not an adequate model of human performance because it depends on the surfaces present in a scene to a greater degree than a human would. However, the authors present this conclusion with only the assertion that "A human observer ... perceives virtually no change in the appearance of the upper two rows of chips" ([10], p. 1656), whereas the algorithm produces noticeable change in the chips' appearance. They do not provide or use any quantitative metric.

More recently, researchers [33, 49, 31] have developed a multi-scale retinex algorithm, with better results. These papers detail the development and testing of a multi-scale retinex which achieves both color/lightness rendition and dynamic range compression simultaneously. They use a Gaussian surround for their kernel, and perform the log function after the convolution with the surround kernel. They also use a canonical gain/offset. The multi-scale version is simply the summation of several single-scale retinexes, each with a different standard deviation Gaussian surround. This is one of the few papers that compares the performance of the retinex to the performance of other algorithms.

The follow-up to this study is the development of the multi-scale retinex with color restoration [32]. This algorithm adds a color restoration step after the multi-scale retinex processing, and purports to produce images that closely mimic the human viewing experience. The researchers have taken out a patent on their method [50] which contains more explicit instructions for setting the various parameters.

2.4.2 Gamut Mapping

Forsyth's [23] work is the only other color constancy algorithm to approach the performance of the retinex algorithm when used in conjunction with color indexing. He presents two algorithms, Crule and Mwext, which are effective under two different
sets of conditions. Crule works for any surface reflectance if the photoreceptors are narrow band. Mwext functions in the case where both surface reflectances and illuminants are chosen from finite dimensional spaces. Experimental work with Crule shows that for "good" constancy, "a color constancy system will need to adjust the gain of the receptors it employs in a fashion analogous to adaptation in humans" ([23], p. 5). He uses 5 assumptions for the development of this algorithm:

1. All surfaces are flat, frontally presented, and there are no shadowing or mutual illumination effects.

2. There is a single, spatially uniform illuminant. Here, this means only that there is only one illuminant at a time, not that there is only one type of illuminant. The single, spatially uniform illuminant's chromaticity can be changed.

3. All surfaces are Lambertian and all reflection is diffuse. Surfaces vary only with wavelength and do not fluoresce.

4. The problem he solves is defined in two parts: first, that the illuminant must be estimated, and second, that some statement about the properties of the surfaces in the image must be obtained.

5. The product of any surface reflectance function, any illuminant function, and any photoreceptor sensitivity can be integrated with respect to wavelength. Surface reflectance functions are neither greater than one, nor less than zero. "These are very weak assumptions." (p. 7)

Obviously, he is not solving the real world problems of naturally illuminated objects. Even the simplest real environment continually violates assumptions 1 and 2. The wings of birds and butterflies and all specular reflections violate assumption 3.

There are several additional assumptions about the character of the illuminant.

1. "Illuminants are 'reasonable.'" (p. 7) This means that, for instance, a sample object seen under monochromatic light could not reasonably be used to identify the object under white light. Also, "it must be possible to describe each member of the set of illuminants that one observes (e.g. with a parameterization)." (p. 7)

2. Photoreceptors are also "reasonable," in that the illuminants must excite the receptors.
3. If the illuminant produces metamers (a pair of metamers are a pair of patches whose underlying pigment is different but that evoke the same receptor response under a given illuminant), there is no constraint on the algorithm to predict that they will look different under different lighting.

4. The colors of the objects in the image are not "unreasonably" distributed, and there are "sufficient" different colors. Examples of an unreasonable distribution of colors would include a forest scene (almost entirely green and brown shades) or a scene viewed through colored glasses. According to this, a reasonable distribution with sufficient different colors would include scenes of man-made environments, such as a child's playroom.

5. Photoreceptor outputs do not degrade substantially for deeply colored illuminants ("For any illuminant, a reasonable measurement of the photoreceptor outputs is possible." (p. 7)).

These assumptions are generally valid in the real world, except possibly for assumption 4. Metamers, while common in theory, are rare in fact. Assumption 5 is a restriction on the illuminants where color constancy is possible rather than a restriction on the receptors.

The basic task of the algorithm is to recover the RGB descriptor of a given surface under a canonical illuminant. The canonical gamut is a convex set containing all possible RGB responses under the canonical illuminant. Using this gamut, and the above constraints, some illuminants can be ruled out and the remainder are a linear transform away from the canonical illuminant. Crule solves for all the diagonal matrices of coefficients that take the gamut of the image into the canonical gamut. Due to the computational expense of calculating these matrices, an approximate solution was calculated. Diagonal matrices can approximate matrices in the feasible set due to the non-overlapping narrowband sensor constraint. Cuboid approximations determine the intersection of the image gamut and the canonical gamut. The feasible set with the largest volume is chosen as the mapping most likely to achieve color constancy. This algorithm provided good results for Mondrian images with a sufficiently diverse selection of colors.
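Under the cuboid approximation just described, the feasible diagonal maps can be sketched very simply. This is a deliberately crude illustration, not Forsyth's Crule: it anchors both gamuts at the origin, treats them as axis-aligned boxes, and picks the largest feasible per-channel scale, which maximizes the volume of the mapped gamut. All names and the sample data are hypothetical:

```python
def feasible_scale_bounds(image_pixels, canonical_max):
    # Cuboid approximation: a diagonal map diag(s_r, s_g, s_b) keeps the image
    # gamut inside the canonical box iff s_k * max_k(image) <= canonical_max_k.
    maxima = [max(p[k] for p in image_pixels) for k in range(3)]
    return [canonical_max[k] / maxima[k] for k in range(3)]

def correct_to_canonical(image_pixels, canonical_max):
    # Choosing the upper bound in each channel maximizes the mapped volume.
    scales = feasible_scale_bounds(image_pixels, canonical_max)
    return [tuple(c * s for c, s in zip(p, scales)) for p in image_pixels]

# A scene under a reddish illuminant: red responses inflated, blue depressed.
scene = [(0.9, 0.4, 0.1), (0.6, 0.5, 0.2)]
print(correct_to_canonical(scene, canonical_max=(1.0, 1.0, 1.0)))
```

With box gamuts anchored at the origin this collapses to white-patch scaling; the full algorithm instead intersects constraints from every image color against a convex canonical gamut, which is what makes the real feasible set interesting.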
Finlayson's [21] extension of Forsyth's algorithm relaxes the constraints on the illuminant, the surface shape, and the specularities (the illuminant assumptions, and parts of assumptions 1 and 3 above). The author shows that confounding factors in real images such as specularities and shape affect the intensity but not the color orientation. Therefore, the algorithm uses 2D perspective chromaticity coordinates instead of 3D RGB coordinates, and the intensity is not recovered. The 2D space is given by r = R/B and g = G/B. Theoretically, if B is small, values could become noisy, but the author reports no difficulties with this in practice. He also implements a canonical illuminant gamut constraint similar to Forsyth's canonical surface color constraint. Again, tests show that the algorithm performs well. The results shown in this paper indicate that the angular error on real, highly colored images under standard daylight and tungsten illuminants is less than 10 degrees. This is only a few degrees more inaccurate than the best possible with diagonal approximations to illuminant transformations.

Yet another approach based on the gamut mapping method, Barnard et al.'s [4] algorithm identifies illumination variations across an image, and removes them. It also uses the illumination information gathered to constrain the color constant solution. Instead of a restriction on the reflectances, it requires sufficient variation in either the reflectances, the illuminants, or in a combination of both. They interpret the color constancy problem in the same way as Forsyth, taking images of scenes under unknown illumination and determining the camera response to the same scene under a known, canonical light.

2.4.3 Other Methods

Some researchers avoided color constancy by introducing a shape-based method, using local color information. In 1995, Ennesser and Medioni [20] used the Where's Waldo™ images to test their local color information adaptation of the histogram
indexing algorithm. Novak and Shafer [47] explicitly assumed the availability of a color calibrator in the field of view of the camera. If you have something to calibrate, the problem unquestionably becomes simpler, but we cannot assume a color chart will be available or usable.

2.5 Fixing Histogram Indexing

In 1995, Funt and Finlayson [25] attempted to eliminate the need for a color constancy preprocessor by incorporating illumination independence into the algorithm. Instead of indexing on the photoreceptor class triples, they indexed on an illumination-invariant set of color descriptors: the derivative (Laplacian) of the logarithm of the colors. This is equivalent to indexing on the ratio of neighboring colors, similar to Land and McCann's work on color constancy [40]. This variation does not work with saturated image pixels, because ratios based on those pixels are unlikely to be constant. Their algorithm works almost as well as Swain and Ballard's under the same lighting conditions. Their Laplacian of Gaussian operator identifies 19 objects correctly out of 25 images of real objects under the original illuminant, but the tolerance is poor and two of the objects are very poorly matched (ranks of 18 and 27). Swain and Ballard's algorithm identifies 23 of these 25 real objects correctly, and the two identified incorrectly were the second most likely choices. Next, the algorithms are compared using synthetic Mondrian input. Both algorithms should show perfect accuracy on this data when the illuminant is unchanged. When under differing, spatially constant illuminants, Funt and Finlayson's algorithm correctly identifies all of the objects, while Swain and Ballard's produces zero intersections for 155 of the 180 objects and correctly identifies only 20. When tested with illuminants that varied spatially in intensity and color, again color constant color indexing (Funt and Finlayson's method) produced perfect results while the original color indexing algorithm identified only 7 of 30 correctly and failed on 12 of the Mondrians. Results on real
data were similar, with color constant color indexing producing one error (with a rank of 2) out of 11 objects under different illuminations. Color indexing alone in the best case identified only 14 of 22 correctly, with 5 images having a rank of 2 and 3 having a rank greater than 3. Plainly, the color constant color indexing algorithm is better than color indexing alone, as long as you can guarantee lighting shifts. If the lighting is the same, the algorithm does not perform as well as the original. All of the methods described here are more computationally intensive than the original algorithm, and require much greater storage.

Chapter 3 addresses the color memory of the human visual system in terms of bit-depth, paving the way for a detailed exploration of quantization methods in Chapter 4.
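Funt and Finlayson's illumination-invariant descriptor from Section 2.5, the Laplacian of the log of each channel (equivalently, ratios of neighboring colors), can be sketched for one channel as follows. The function name, the 4-neighbor Laplacian stencil, and the toy 3×3 channel are my own assumptions:

```python
import math

def laplacian_of_log(channel):
    # Discrete Laplacian of the log image; pixels must be nonzero and
    # unsaturated, as the text notes, for the ratios to be meaningful.
    h, w = len(channel), len(channel[0])
    log_im = [[math.log(v) for v in row] for row in channel]
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            out[y][x] = (log_im[y - 1][x] + log_im[y + 1][x] +
                         log_im[y][x - 1] + log_im[y][x + 1] -
                         4 * log_im[y][x])
    return out

# A spatially constant illuminant change multiplies every pixel by the same
# factor; the log turns that into an additive constant, which the Laplacian
# removes, so the descriptor is unchanged.
scene  = [[2, 4, 8], [4, 16, 4], [8, 4, 2]]
dimmed = [[v * 0.5 for v in row] for row in scene]
print(abs(laplacian_of_log(scene)[1][1] -
          laplacian_of_log(dimmed)[1][1]) < 1e-9)   # True
```

Indexing on histograms of this descriptor, instead of raw colors, is what buys the illumination independence reported above.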
CHAPTER 3
COLOR MEMORY

Psychophysical experiments are the primary method available for characterizing the visual processes of the human brain. The only other major method is the analysis of case studies of patients with usual and unusual problems, ranging from simple color-blindness to true color constancy.

Both color constancy and color memory have been studied extensively. Unfortunately, due to the time-consuming and tedious nature of the experiments, the number of participants and the extent of the tests are often severely limited. In addition, the variability of color perception between individuals is often high. Even for a single individual, results from a given experiment will vary substantially from one time to another. Background information on color constancy was presented in depth in Chapter 2.

Various researchers have investigated color memory [60, 59, 61, 5, 3, 16, 14, 15, 41, 27, 46, 63]. Uchikawa et al. [60, 63] showed that for both single and multiple colors, humans remember colors as members of the Berlin and Kay color categories [7]. Furthermore, instead of becoming less accurate as more colors were added to the task, participants would forget one or more of five colors completely while retaining roughly the same ability with the remaining colors. Other research [62, 61, 59, 14] indicated that our ability to discriminate between colors deteriorates with memory. However, none of these experiments was performed in a color space easily transformed into the color coordinates used in a computer, or in a robotic system. The research indicated that discrimination thresholds were larger for successive viewing than for simultaneous viewing (both colors viewed simultaneously on the fovea). In addition, colors were remembered with a shift towards higher
saturation [59, 15]. There was no discernable shift of hue in memory [46, 15].

We performed an experiment to determine a rough estimate of the accuracy of human color memory in terms of bit-depth. In general, psychophysical experiments to determine characteristics of human color vision are designed to obtain results that are as precise as possible, in whatever coordinates are most convenient for the researcher's interpretation. These coordinates include OSA uniform color scales [60, 63], wavelength [34], or the CIE spaces [43]. Many experiments [7] use a physical stimulus such as standardized colored chips, which, except under certain fully controlled conditions, are impossible to perfectly transform into a computer's representation of color. These experiments also generally allow the subject to view the stimulus only for a short, fixed period of time, resulting in a more controlled experiment but a less relevant assessment for real-world situations. Because our algorithm is concerned with the smallest possible number of colors, we want an optimistic estimate of the accuracy of human color memory. By finding the largest realistic value for human color memory, we can determine a good target for our robotic system. The purpose of this experiment was to determine this estimate of the accuracy of human color memory under conditions close to those a person might actually encounter in daily life.

3.1 Experimental methods

The experiment was performed on an Apple 400 MHz Rev. D iMac DV™, running the Student Version of MATLAB® 5.0. The overall luminance characteristic was used to calibrate the monitor, and the luminance characteristic of each gun was used to determine each gun's gamma. However, because the purpose of this experiment was specifically to find an estimate of the accuracy of color memory in terms of computer bit-depth, no compensation was incorporated for human perception of lightness or saturation variation along the hue axis. Thus, if a shift in the hue value of a color
produced a shift in perceived lightness or saturation of that color, that shift was considered part of the perturbation being tested. In order to maintain as much realism in the structure of the experiment as possible, subjects were given as much time as they wanted to determine their answers. In addition, they were also allowed to use mental tricks to fix the color in their memories. The environment was austere (no color charts) and they were not allowed to hold objects up to the computer monitor. However, other purely mental games, such as comparing the color to a known color (such as the color associated with a sports team), were allowed.

The experiments took place in two locations, one in a room with windows and the blinds closed, and the other in a room without windows. In both cases, the lights were always in the same configurations, with all overhead fluorescent lights on. However, in the first room there was a small amount of additional light from cracks in the blinds. The subjects were allowed to adjust the computer monitor and their position for the best viewing. The only requirement was that the participants be able to see the colors clearly. In each case, subjects were allowed to look away from the monitor whenever they liked. In all likelihood, under normal circumstances humans would not perform as well on color memory tasks as this experiment implies. But for our purposes, we wish to find the most generous estimate.

First the subjects were asked to provide some baseline data. Using an interface programmed in MATLAB®, they indicated their choices for the best example of each of the Berlin and Kay [7] color categories. This interface is shown in Figure 3.1. In this figure, the primary window (Figure No. 1) shows the color plane. The participants were assigned one focal color at a time. In this image, pink, red, orange and yellow have been assigned, and the participant is selecting a color to represent green. The participant was asked to use the mouse to choose the color that best represented the given focal color. The color chosen was displayed in the black box, labelled "green" in this image. If the participant was unsatisfied with the color, they continued to select
colors until they were comfortable with the assignment. Once they were satisfied, they clicked on the "Done" button. This reset the window to the next color and assigned the chosen color to its space in the colormap on the right. This window, entitled Figure No. 2 in Figure 3.1, showed the colors chosen so far. Different color planes derived from the HLS and RGB spaces were available to the participant, but the default was the Hue-Lightness plane. This plane, from Hue-Lightness-Saturation space, was displayed with saturation fixed at the maximum. This space was chosen because it uses a single axis to represent chromaticity, thus reducing the number of trials necessary to characterize each participant's sensitivity.

Next, again using an interface programmed in MATLAB®, they were asked to provide the boundary information for the same plane, this time subdivided into seven lightness levels. This is shown with the middle level in Figure 3.2. In this figure, the participant has placed lines indicating where their boundaries between focal colors occur, using the "Create Boundary Line" button and the mouse. The participant is using the "Select Region Color" menu to assign a color to one of the outlined regions. Once all regions were assigned, the participant clicked the "Show Selection" button to display the results of their mapping, using the focal colors chosen in the previous step. This allowed the participant to make sure they allocated the focal colors to the regions as intended. When the participant was satisfied with the boundary locations and color regions, they clicked the "Lightness Done" button. This reset the window with the next lightness level. This procedure was repeated until seven lightness levels were set.

Once these preliminary tasks were over, the main experiment began. Three sets of ten pairs of colors were generated. One of each pair was displayed first, and the second (a perturbed version of the first color) was displayed after a five second delay. For each set of ten pairs, one of the initial colors was the focal color chosen by the participant and the other nine were offset from this focal color in RGB space by shifts
Figure 3.1: Interface window for program enabling user to choose focal colors. Primary window shown here allows user to choose colors; secondary window shows focal colors after user has moved on to next color.

randomly generated from a Gaussian distribution with a 0.05 standard deviation. The original RGB values ranged from zero to one. Values greater than one or less than zero after the shift were rounded to one or zero, respectively. The resulting randomly shifted points were checked to ensure there were no duplications within a set. For each set of ten pairs, one delayed color was not perturbed, and the rest were perturbed by altering the value of the hue component of the color in question, flipping the bit of the bit-depth being tested. This corresponded to a shift in one direction or the other of some multiple of two. For example, if the program's bit-depth was set to five (the default starting value) and the color's hue value was 1001 1000, then the perturbed color would be 1000 1000, with the fifth least significant bit flipped, corresponding to a shift of 16 along the hue axis.

In theory we could have simply added or subtracted a given number from a certain color, and then swept that number from the smallest to the largest. In order to reduce the search space, several constraints were introduced. First, to simplify the transition to a robotic system, only integer bit-depth shifts were introduced. If we accept the blue-yellow, red-green perceptual dichotomy, then an integer greater than two in base-two bit-depth will always be capable of being interpreted in terms of these
Figure 3.2: Interface for user to choose boundaries between focal colors. For each of seven levels of brightness, the user places lines where boundaries are perceived. Focal colors are assigned to the regions between the lines by the user.

channels (e.g. flipping the 5th least significant bit is equivalent to 16 regions, which is equivalent to segmenting the space into 4 regions for each quadrant (blue-red, blue-green, yellow-red, yellow-green) of the perceptual color space). Second, we used the HLS space, rather than RGB or one of the CIE spaces. Again, this reduces our search space, and thus the time necessary for each participant to donate. Instead of having three axes to search (RGB) or two (CIE spaces) we have only one, hue. Eight bits were allocated for each axis in HLS space. Therefore, the maximum possible bit-depth used would be eight, which would change the value to the opposite hue. Flipping the fifth least significant bit, as above, would correspond to shifting the hue value by 2^4, or 16 out of 256. Third, we reduced the experimental search space by a factor of two by doing only one shift per color. We could have presented two trials for each color, one with the perturbation added and one with the perturbation
subtracted. Instead, in the interests of being able to test a larger number of colors, we allowed only one perturbation per color.

The subject viewed each color on a black background, for as long as they felt necessary. This initial display is shown in Figure 3.3, part (a). The color is shown in a square in the center of the screen. The remainder of the screen is black, except for instructions in white text in the upper right-hand corner. When the participant was satisfied with their knowledge or perception of the color, they hit a key, which triggered the computer to display 5 seconds of black and white temporal and spatial noise over a square slightly larger than the color sample. A sample screenshot of the noise is shown in Figure 3.3, part (b). Then the second color in the pair (perturbed or not perturbed) was displayed, as in Figure 3.3 part (c), and again the subject had as much time as they felt necessary to make up their minds whether they were seeing the same color or a different color, and to indicate their choice to the computer. After ten of these trials, the same ten color pairs were displayed simultaneously, one pair at a time, as nested squares (shown in Figure 3.3, part (d)). The subject was asked whether they perceived one square of a single color or two nested squares of different colors. Again, MATLAB® was used to control the monitor and the subjects were given as much time as they needed to come to a decision. This experiment was created with the free Psychophysics Toolbox [11] for MATLAB® to display the colors and the interval noise, and to record the keyboard responses.

Consensus in the literature [60, 46, 5, 63] is that human memory for colors is worse than human instantaneous perception of colors. In signal detection terms, we are attempting to determine the sensitivity of human observers to visual noise (color perturbations) when a given delay is present. Consistent and inconsistent responses were generated, as well as the usual hit rate and false alarm rate. This portion of the experiment also identified the people who were careless in their responses: if a subject saw two colors every time, they were obviously not paying close enough
Figure 3.3: Screenshots of psychophysical testing interface. (a) original color; (b) intersample noise; (c) test color; (d) simultaneous comparison.

attention, or mistaking the inverse delay response (the afterimage of the previous set) for a difference in the current set (an actual difference in hue), as a priori there was always one pair of identical colors in each set of ten. The participants were unaware of any information about how many pairs contained perturbed colors and how many were not perturbed out of any given set.

Three sets of ten trials were performed for each focal color. If the subject performed well enough on a given set of ten, the bit-depth was reduced by one (the number of values shifted was reduced by a factor of two) for the next set. If the subject performed poorly enough, the bit-depth was increased. In order for the hue shift to be reduced, the subject had to respond to at least 70% of the trials consistently (simultaneous and successive responses were the same). To increase the hue shift, the subject had to provide inconsistent responses (meaning the simultaneous and successive responses differed) on at least 70% of the trials. Breaks were scheduled every 20 minutes, and many people chose to continue another day rather than continue the
session after the first break. The preliminary data-gathering happened each time a subject started a session. Thus, if someone chose to continue the experiment on a new day, they would select new focal colors and new boundaries as before, and those colors would be used in the new session. However, if the subject chose to continue after the break, new focal colors and boundaries were not generated. As a result, the colors used were representative of that subject's focal colors on that day, under those lighting conditions. If the subject was willing to continue after the trials for the eight chromatic focal colors, the same procedure was carried out for the boundary colors chosen at the central lightness value.

3.2 Data analysis

Twenty-two people participated in this experiment. This resulted in a total of 5885 signal trials and 635 noise trials. A signal trial is defined as a single pair of colors whose hues differed and which had valid responses from the participant. A noise trial is defined as a single pair of colors whose hues did not differ and which also had valid responses from the participant. If the participant gave a response that was not one of the acceptable responses ("same" or "different" for the delayed response, one of the two valid choices for the simultaneous response) the trial was discarded.

Figure 3.4 shows the focal colors chosen by the participants in terms of hue (black x's) and the actual initial hues used for the fifth LSB testing (red). The focal colors are concentrated in the left side of the plot, corresponding to red, oranges, yellows and browns. The remaining colors (green, blue and purple) are slightly clustered in groups in the remainder of the hue space. The testing data covers the hue axis well, with more representation in the regions with clustered focal colors. The y-axis shows the number of responses obtained for each focal color/test color. Each hue was rounded to a number between 0 and 255, resulting in multiple trials for most hues.
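The perturbation applied in the successive trials above (Section 3.1) is just a bit flip on the 8-bit hue value; a minimal sketch, with a function name of my own:

```python
def perturb_hue(hue, bit_depth):
    # Flip one bit of the 8-bit hue: XOR with 2**(bit_depth - 1), a shift of
    # that power of two in one direction or the other along the hue axis.
    return hue ^ (1 << (bit_depth - 1))

# The example from the text: at bit-depth 5, hue 1001 1000 becomes 1000 1000.
print(format(perturb_hue(0b10011000, 5), '08b'))      # 10001000
print(abs(perturb_hue(0b10011000, 5) - 0b10011000))   # a shift of 16
```

Flipping the same bit twice restores the original hue, which is why each color needed only one perturbation per trial.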


Figure 3.4: Focal colors and initially displayed colors at LSB 5.

Table 3.1 shows the number of hits and the number of signal trials. The hit rate is calculated by taking the percentage of signal trials that were hits. The z-score of the hit rate is used to determine the value of the sensitivity metric. Table 3.2 shows the same data for the false alarms. These tables indicate clearly that many people were able to reliably identify shifts in color when the fifth LSB was flipped, but that few were good enough to progress to the stage where the third LSB was flipped. Very few did so badly that they got to the sixth LSB, but on determining this, additional trials were run at both the sixth and seventh to compensate. Only one person made any errors at the seventh LSB. Simply looking at these results would cause one to hypothesize that a good sensitivity threshold would lie somewhere between the fourth and sixth LSB.
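The mapping from the raw counts in these tables to the z-score columns can be sketched in a few lines (a minimal illustration; the function names are mine, and the inverse normal CDF comes from Python's standard library):

```python
from statistics import NormalDist

def hit_rate(hits, trials):
    """Fraction of signal trials on which the observer reported a difference."""
    return hits / trials

def z_score(rate):
    """Inverse of the standard normal CDF, as used for the z(hit rate) columns."""
    return NormalDist().inv_cdf(rate)

# Bit-depth 6 from Table 3.1: 449 hits out of 543 signal trials.
rate = hit_rate(449, 543)   # about 0.8269
z = z_score(rate)           # about 0.9419
```

The same two steps applied to the noise trials of Table 3.2 produce the z(F.A. rate) column.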


Table 3.1: Number of trials given bit-depth

 Bit-depth   Signal Trials   Hits   Hit Rate       z(Hit Rate)
 3           428             248    0.579439252    0.200459453
 4           1477            814    0.551117129    0.128484317
 5           3340            1812   0.54251497     0.106771267
 6           543             449    0.826887661    0.941936378
 7           97              95     0.979381443    2.041142579

Table 3.2: Number of trials given bit-depth

 Bit-depth   Noise Trials   False Alarms   F.A. Rate      z(F.A. Rate)
 3           42             19             0.452380952    -0.119648575
 4           153            44             0.287581699    -0.560462468
 5           330            92             0.278787879    -0.586446731
 6           97             10             0.103092784    -1.264124876
 7           13             0              0              -∞

Sensitivity analyses were performed on the data to determine roughly how sensitive humans are to bit-depth variation. The two sensitivity metrics used produced similar results.

3.2.1 Sensitivity

Two metrics derived from signal detection theory produce very similar results. A parametric sensitivity measure, known as the d′ metric [18], and a non-parametric sensitivity measure, known as the pA metric, are presented. The receiver operator characteristics (ROC) curve is helpful in visualizing the d′ metric. The ROC curve is generated by plotting the hit rate against the false alarm rate. The hit rate is the probability that an observer will generate a true positive, given that the two colors are different. The false alarm rate is the probability that an observer will generate a false positive, given that the two colors are the same. An observer who is simply guessing will produce a hit rate equal to the false alarm rate.


Figure 3.5: Sample receiver operator characteristics curve.

An observer who can discriminate perfectly will have no false alarms and a hit rate of one. The d′ metric takes a [hit rate, false alarm rate] point, extrapolates a Gaussian function, and looks at the difference between the means (assuming both Gaussians have the same variance). A constant d′ produces a curve like the dashed line in Figure 3.5. The pA metric looks at the same point, interpolates linearly between the endpoints of the ROC curve, and finds the area under the interpolated curve.

The d′ metric measures sensitivity as the separation between the means of the distribution of the signal trials (there is a difference between the two colors) and the noise trials (there is no difference between the two colors). First, the distance of the observer's criterion from the means of the two distributions is calculated. This is done by converting the hit and false alarm rates to z-scores, as shown in Tables 3.1 and 3.2. The hit rate is the probability that the observer said the stimuli were different with the delay, given that the observer viewed stimuli that were different. The false alarm rate is the probability that the observer said the stimuli were different with the delay, given that the stimuli were the same. The data are mapped to a Gaussian probability distribution, and the z-score gives the distances desired. The d′ measure is computed by taking the difference between the z-score for the hits and the z-score for the false alarms, as shown below:

    d′ = z(hit rate) − z(false alarm rate)    (3.1)

This is equivalent to calculating

    d′ = |μS − μN|    (3.2)

where μS is the mean of the signal-trial distribution and μN is the mean of the noise-trial distribution, both expressed in units of their common standard deviation. The lower limit for d′ is zero, which indicates no ability to discriminate between signal trials (different colors with delay) and noise trials (same colors with delay). This is indicated as the dotted line in Figure 3.5. The theoretical maximum is ∞, corresponding to perfect ability to discriminate (and thus no false alarms), represented by the point in the upper left corner of the plot in Figure 3.5. The usual threshold for ability to discriminate is a d′ of one, plotted as a red dashed line. Table 3.3 shows our experimental results for humans' overall ability to determine the presence of color perturbations as a function of bit-depth.

Table 3.3: Results of the d′ and pA analyses.

 Bit-depth   d′       pA
 3           0.3192   0.5634
 4           0.6889   0.6317
 5           0.6931   0.6319
 6           2.2061   0.8619
 7           ∞        0.9897

Clearly, the threshold of one is passed between the fifth and sixth least significant bits. Using linear interpolation, we find that the threshold is crossed at just over fourteen bins.

The pA metric is computed by finding the area under the curve denoted by a linear interpolation from [0, 0] to [1, 1] through a given [hit rate, false alarm rate] point. This corresponds to

    pA = fh/2 + h(1 − f) + (1 − f)(1 − h)/2    (3.3)

where h is the hit rate and f is the false alarm rate. When the d′ metric is 0, pA is 0.5. When the d′ metric is ∞, pA is 1.0. The typical threshold used with the pA metric is 0.75. The advantage of this metric, as opposed to the d′ metric, is that the pA metric is bounded.

If we plot the results of this calculation for our data as a function of bit-depth, as shown in Table 3.3, it is clear that the same pattern as that shown in the d′ results is present. Again, the results are almost identical for the 5th and 4th least significant bits, and increase dramatically for the 6th LSB. Clearly the threshold lies between the 5th and 6th least significant bits, corresponding to hue shifts of between 8 and 16 out of 256. This corresponds to a bit-depth of between 3 and 4. A bit-depth of 3 corresponds to 8 categories, which is the result if the threshold is set to the 6th LSB. A bit-depth of 4 corresponds to 16 categories, which is the result if the threshold is set to the 5th LSB. Using linear interpolation, we find that the pA threshold is crossed at 12 bins.

Using binomial analyses¹, we find that the proportions of correct responses for the 4th, 5th and 6th least significant bits are .566 ± .031, .558 ± .017 and .838 ± .031 respectively, at a 95% confidence interval. The 4th and 6th least significant bits are significantly different (p < .05), as are the 5th and 6th least significant bits. However, the 4th and 5th least significant bits do not differ significantly. This supports the d′ and pA results.

¹ Analyses performed by Dr. Keith D. White
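Equations 3.1 and 3.3 can be checked directly against Tables 3.1 through 3.3. A minimal sketch (the function names are mine):

```python
from statistics import NormalDist

def d_prime(hit_rate, fa_rate):
    """Equation 3.1: difference of the z-scores of hit and false alarm rates."""
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)

def p_a(hit_rate, fa_rate):
    """Equation 3.3: area under the linear ROC interpolation through the point."""
    h, f = hit_rate, fa_rate
    return f * h / 2 + h * (1 - f) + (1 - f) * (1 - h) / 2

# Bit-depth 6 from Tables 3.1 and 3.2: h = 449/543, f = 10/97.
d = d_prime(449 / 543, 10 / 97)   # about 2.2061, matching Table 3.3
a = p_a(449 / 543, 10 / 97)       # about 0.8619
```

Note that the area formula collapses to (1 + h − f)/2, so h = f (pure guessing) always gives pA = 0.5, consistent with the d′ = 0 case in the text.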


3.2.2 Calibration

The monitor calibration consisted of determining the luminance of the screen for different RGB combinations. Table 3.4 shows the results at four different luminance levels. The gammas computed from these levels are 2.1144 for the overall luminance, 2.1958 for the red channel, 2.2722 for the green channel, and 1.7114 for the blue channel.

Table 3.4: Gamma Calibration Data

 Level   Grayscale   Red     Green   Blue
 1.0     7.5         1.950   5.60    0.740
 0.75    3.8         1.000   3.00    0.420
 0.5     1.5         0.375   1.20    0.195
 0.25    0.4         0.093   0.24    0.069

The actual change in hue viewed by the participants, corresponding to each possible hue value, was determined using the following procedure. Saturation is first defined as 255 and lightness as 192. For each possible hue, the HLS coordinate and the perturbed coordinate for each bit-depth from 3 to 7 are first converted to computer RGB coordinates. These values are converted to relative luminances using the gammas computed above. The relative luminances are scaled by the appropriate maximum value for each channel (corresponding to the results for level 100% in Table 3.4). These normalized RGB luminance values are converted back to HLS space. The output of the procedure is the absolute difference between the two resulting hue values.

Figure 3.6 shows the results of this procedure for each bit-depth/number of bins. The desired values (the differences expected when the axis is uniformly divided into the appropriate numbers of bins) are shown as dotted lines. The solid lines show the change in hue as produced by the monitor. The average values of the solid lines are extremely close to the desired values, differing by less than 1%. The numbers in the legend refer to the number of bins corresponding to the given bit-depth.

Figure 3.6: Monitor Calibration Results

Figure 3.7 shows the results when the actual shifts in hue after monitor calibration are used to determine a pA value for each range of shifts. First, histograms of the amount of shift for each trial were generated from the calibration results. Each bin corresponded to a 0.01 shift in hue (using the y-axis in Figure 3.6). Any histogram bins containing no signal trials or no noise trials were eliminated. The remaining bins were used to calculate the hit rate and false alarm rate for each bin. The resulting values from each bin were used to determine the pA result shown in Figure 3.7. The pA results from Section 3.2.1 are plotted in green, and clearly correspond well to the sweep results, even crossing the threshold at the same number of bins. The d′ threshold was 14 bins, corresponding well to the pA threshold of 12 bins. This supports our conclusion in Section 3.2.1, that the threshold for hue discrimination with a five second delay lies between the fifth and sixth least significant bit.

Figure 3.7: Results of pA analysis with actual change derived from monitor calibration.

3.3 Summary

The implication is that human color memory has a bit-depth of between 3 and 4. Use of two metrics shows a significant difference in sensitivity between the 5th and 6th least significant bits. In general, we can only reliably remember colors as members of between 8 and 16 categories. This corresponds nicely to simple gradations of the four perceptual chromatic channels, where a color is remembered as being on one side or another of a given channel. For example, it would be possible to remember a color as being on the blue side of green, rather than on the yellow side of green, but distinguishing between a remembered green blue-green and a remembered blue blue-green would be difficult.

Subjects reported that by the end of the 20 minute session, they were beginning to lose track of what color they were looking at, and at the end of five seconds would have forgotten whether they had just seen a green or a yellow patch. In addition, virtually all participants reported using mental tricks to remember a specific color. For example, one person said that he remembered the oranges by associating them with sports teams ("This is Tennessee orange"). In the real world, people rarely need to remember the precise color of an object. They simply assign it to a category and use other cues and a priori information to recognize the object ("I have only one orange mug; I am in my house; I am looking at an orange mug; therefore this must be my orange mug."). If the precise color of an object is important, humans do not trust to color memory alone. Therefore, given the intense and specific nature of the experimental task, it is unlikely that humans are even this precise in remembering the colors of real objects under real world conditions.

So one question remains: Is this color segmentation sufficient for color object recognition? If so, separation of the hue axis into eight chromatic color categories should be sufficient.

Color indexing is a very powerful and robust object recognition algorithm, suffering from one problem: it is not robust to illumination changes. Various methods attempt to compensate for this without losing the algorithm's inherent robustness to occlusion, orientation, and scaling, without marked success. The color constancy algorithms with the best results were detailed in Chapter 2, although they are currently insufficient for adequate object recognition. Color constant color indexing [25] is promising, but requires elimination of saturated pixels for consistently good results. Any color constancy preprocessor improves accuracy compared to the original algorithm alone. Unfortunately, this does not improve recognition accuracy sufficiently for a truly robust object recognition system.
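The calibration procedure can be sketched with Python's standard colorsys module. Everything below is an illustrative reconstruction: the gammas and channel maxima are the Table 3.4 values, but the function names and the final normalization step (dividing by the largest channel maximum to bring the scaled luminances back into [0, 1]) are my assumptions:

```python
import colorsys

GAMMAS = (2.1958, 2.2722, 1.7114)   # red, green, blue gammas fit from Table 3.4
MAXIMA = (1.950, 5.60, 0.740)       # channel luminances at level 1.0 in Table 3.4

def displayed_hue(h, gammas=GAMMAS, maxima=MAXIMA):
    """Hue as produced by the monitor: HLS -> RGB -> gamma -> scale -> HLS."""
    # Saturation 255/255 and lightness 192/255, as defined in the procedure.
    r, g, b = colorsys.hls_to_rgb(h, 192 / 255, 1.0)
    lum = [c ** gam * mx for c, gam, mx in zip((r, g, b), gammas, maxima)]
    scale = max(maxima)              # assumed normalization back into [0, 1]
    hue, _, _ = colorsys.rgb_to_hls(*(v / scale for v in lum))
    return hue

def hue_shift(h, n_values):
    """Absolute displayed hue difference for a nominal shift of n_values/256."""
    d = abs(displayed_hue((h + n_values / 256) % 1.0) - displayed_hue(h))
    return min(d, 1 - d)             # the hue axis wraps around
```

With unit gammas and equal maxima the transform is the identity, so the displayed shift equals the nominal shift; with the measured gammas it wobbles around the nominal value, which is the effect plotted in Figure 3.6.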


The algorithm does not currently incorporate any version of color memory. Having shown that color memory has a dramatic effect on human resolution of hue and hue perception over time, we would like to incorporate it into the color indexing algorithm. If the human system can be taken as a high-level model of the processing occurring in the color object recognition task, then the effect of memory should certainly be taken into account. How can that be done?

Quantization is the engineering equivalent of the color categorization that takes place in memory. In some ways, our computers already have lossy memory for color images, as many of our video compression algorithms involve quantization to many fewer colors than our displays are capable of reproducing. In order to incorporate the effects of color memory into the algorithm, we need to incorporate quantization. Clearly, if we reduce the number of histogram bins, we will make the algorithm substantially more efficient. Unfortunately, nothing is known about how the human visual system actually determines given categories, or how a given perceived color is assigned to a given category. Therefore, we will have to derive insight on quantization methods from engineering.
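To make the efficiency point concrete, here is a minimal sketch of the normalized-histogram feature vector and the Euclidean match measure used by color indexing (the function names are mine; the method itself is described in Chapter 5):

```python
def hue_histogram(hues, n_bins):
    """Normalized histogram of hue values in [0, 1): the feature vector."""
    counts = [0] * n_bins
    for h in hues:
        counts[min(int(h * n_bins), n_bins - 1)] += 1
    return [c / len(hues) for c in counts]

def match_distance(a, b):
    """Euclidean distance between two feature vectors; smaller is closer."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
```

Reducing n_bins from 512 to 8 shrinks both the stored vector and the per-comparison work by the same factor, which is where the efficiency gain comes from.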


CHAPTER 4
QUANTIZATION

Most quantization algorithms to date have been concerned primarily with one of two cases. First, in the case of monitor displays with limited resources, implementations are concerned with a 256-color limitation of display devices, rather than with an 8 or 16 color limitation. The goal of these algorithms is to display an image as close as possible to the original image. Ideally, a human viewer should find the output image indistinguishable from the original. In the second case, quantization is used to store or transmit the image as efficiently as possible. Again, the desired output is assumed to be as close as possible to an observer's perception of the original. Time-dependent effects of the human visual system, such as chromatic adaptation and degradation of the image due to memory, are explicitly ignored because the purpose of this quantization is to allow the human observer to view the image independent of these effects. Because of this goal, a second assumption inherent in all these quantization methods is that comparison to the original image is the best measure of the method's impact. This means that generally, quantization itself is seen as introducing noise and degrading the image.

In our case, the best measure of the effect of the algorithm is comparison to other images of the same scene. The original image contains noise in the pixel values. A second image of the same view will generally contain different (although similar) values. If the lighting is unchanged, a quantization algorithm that is effective for these purposes should map the images to the same result, rather than to different results. Comparison to the original image will give a sense of the degradation to the image caused by the algorithm. Comparison to different images, rather than comparison of storage or display constraints, should provide the metric for the effectiveness of the quantization algorithm. Instead, we will measure an algorithm by its effectiveness when used in the context of the object recognition algorithm. A good algorithm will produce good object recognition results, and a poor one will not identify objects correctly. In the same way as color constancy algorithms, simply deeming a result "good" is insufficient. Object recognition accuracy provides an excellent metric for determining the ability of an algorithm to compensate for lighting variation, but viewing the quantized images can be helpful in understanding the mistakes made by the algorithm. Figure 4.1 shows the original image used to display the results of the different quantization algorithms.

Figure 4.1: Original image used to demonstrate results of different quantization schemes.

4.1 Overview of Algorithms

4.1.1 Uniform Quantization

This is the simplest of the quantization algorithms. The color space is divided up into n blocks of equal volume. The centroid of each block is the color used in the color palette. The color palette is therefore fixed, data-independent, and takes no notice of the combinations that humans may find more pleasing. Generally this algorithm is considered to produce poor results from a human standpoint. Figure 4.2 shows the results of uniform quantization to 8 colors.

Figure 4.2: Results of uniform quantization

4.1.2 Dithering

This process is used to improve the results of quantization for human viewing. A small amount of noise is added to each pixel of the image before quantization occurs. This improves the quantized image for a human by eliminating banding, the effect that occurs when a smooth transition between two colors is replaced by distinct bands of colors from the palette. However, for object recognition purposes, and for low numbers of colors, banding may be useful. At very low numbers of colors, dithering will have very little effect, as the effect of low numbers of colors is increased resistance to noise. The results of dithering are shown in Figure 4.3.
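For a single pixel, uniform quantization amounts to finding the pixel's block and returning the block center. A sketch, assuming 8-bit channels and k levels per axis (so k³ palette colors); the function name is mine, and the block center stands in for the centroid of a uniformly filled block:

```python
def uniform_quantize(pixel, k):
    """Map an (r, g, b) pixel to the center of its block in a k x k x k grid."""
    step = 256 // k

    def center(v):
        idx = min(v // step, k - 1)      # clamp the top edge into the last block
        return idx * step + step // 2    # block center approximates the centroid

    return tuple(center(v) for v in pixel)
```

For example, uniform_quantize((10, 200, 130), 2) gives (64, 192, 192); every pixel falling in the same block maps to the same palette entry, which is the property that matters for histogram indexing.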


Figure 4.3: Results of dithering on uniform quantization.

4.1.3 Modified Uniform Quantization

In this algorithm, instead of dividing the space evenly into n blocks of equal volume in three dimensions, the n blocks are either cubic or rectangular, with an integer number of them to a side. The process can be seen simply in two dimensions. If n = 5, we start with a 2x2 array. Then one block is added in one row, to give five blocks from one row of two and one row of three. If n = 6 we have a 2x3 array. If n = 7, one block is added to one column, yielding two columns with two blocks, and one with three. For n = 8, there would be one column with two blocks and two with three, and for n = 9 we would have a 3x3 array. In this way, the largest difference between the maximum and minimum number of blocks in any single dimension is one, and the number of blocks in all other dimensions has neither a maximum nor a minimum. This is a data-independent method for simple quantization to few colors. The axis along which the variable block size occurs can be chosen in advance to fit the data best, making this a data-dependent method, or it can be determined in advance and fixed, making this a data-independent method. In RGB space, modified uniform quantization reduces to uniform quantization for n³ colors, when n is an integer. Quantization to 9 colors is shown in Figure 4.4.

Figure 4.4: Results of modified uniform quantization

4.1.4 Tree Quantization

In this quantization scheme, the RGB values are first converted to the Hue-Lightness-Saturation space, covered in detail in Appendix A. The advantage of this space is that most meaningful color demarcations (chromatic/achromatic; light/dark; dark color/black; light color/white) can be made with simple threshold operations. Separating the color space into chromatic and achromatic regions can be achieved by thresholding the saturation and lightness axes. Saturation of one represents maximal saturation for the entire length of the lightness axis. One threshold on the saturation axis and two on the lightness axis determine a ring of chromatic colors. Values above the upper lightness threshold are assigned to the lightest achromatic value. Colors whose lightness is below the lower lightness threshold are assigned to the darkest achromatic value, regardless of saturation. All colors whose saturation is below the saturation threshold are assigned to achromatic values based on their lightness component.

The first division occurs between achromatic and chromatic. Then the achromatic region is uniformly quantized along the lightness axis to A regions. The chromatic region is uniformly quantized in the hue dimension to C regions. All chromatic values are arbitrarily assigned to a lightness of 0.55 and saturation of 0.6, while all achromatic values are assigned zero saturation and hue. Both A and C are set by the user, and A + C is the total number of colors in the final colormap. For example, if A = 0 and C = 2, the final map would contain two colors, diametrically opposed in hue. If A = 1 and C = 1, the final map would contain medium gray and saturated green.

This method is shown in Figure 4.5, for 11 chromatic regions and three achromatic regions. The image is first converted to HLS space. A saturation threshold of 0.15 determines whether each pixel is chromatic or achromatic. If the pixel is achromatic (saturation below 0.15), it is put into one of three achromatic bins. Below a lightness of 0.2, the pixel is put in the black bin. Above a lightness of 0.9, the pixel is put in the white bin. Between the values, the pixel is assumed gray. If the pixel is chromatic, it is placed in one of n chromatic bins. The boundaries for the chromatic bins are determined by uniformly quantizing the hue axis into n regions. In this figure, there are 11 chromatic regions, and the hue axis of the HLS space is offset by 60 degrees, placing pink at both 0 and 1.
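The thresholds above translate directly into a per-pixel decision rule. A sketch of the tree classifier (the labels and function name are mine, the thresholds are those quoted in the text, and the 60-degree hue offset is omitted for brevity):

```python
def tree_quantize(h, l, s, n_chromatic=11):
    """Assign an HLS pixel (all components in [0, 1]) to a palette entry."""
    if l < 0.2:                  # very dark: black, regardless of saturation
        return "black"
    if l > 0.9:                  # very light: white, regardless of saturation
        return "white"
    if s < 0.15:                 # low saturation in between: gray
        return "gray"
    # chromatic: uniform bins along the hue axis
    return ("chroma", min(int(h * n_chromatic), n_chromatic - 1))
```

The lightness tests come first here because the text states that the lightness extremes override saturation; swapping the order would only change how very dark or very light chromatic pixels are labeled.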


Figure 4.5: Diagram for tree quantization.

4.1.5 Median-Cut Quantization

This is a data-dependent method using the median of the values of the three channels in the input image, given a colormap. Assume n is the number of colors desired in the final colormap. The median values of the input map are used to split the color space into two new boxes. The limits of these boxes are used to find the median values of the new colormap in RGB space, which are in turn used to split the current boxes into new boxes along the largest axis, and so on until n boxes have been generated. The average of the colors in each of these n boxes are the values in the new colormap. Figure 4.6 shows the results of this algorithm, again for eight colors.

Figure 4.6: Results of median-cut quantization

4.1.6 Vector Quantization

Vector quantization is a general term for data-dependent quantization methods. The color palette is generated via an iterative procedure, similarly to median-cut quantization above. These algorithms are even more time-consuming to run, but produce subjectively good results. One possible implementation follows these steps. First the most common color is assigned to the palette, and the error is evaluated (using mean square error or another method of choice). Then the next most common color is assigned to the palette, and the remaining colors are assigned to one of the two values so as to minimize the error. This process is repeated until n colors are chosen.

4.2 Number of Colors

Generally, the optimal case for human viewing is where quantization is unnecessary, so computer vision research tends to assume a fixed maximum number of colors (usually 256), and generates algorithms that produce the most pleasing results. We have not found any algorithms in the literature on choosing this maximum by any


methods other than an analysis of the transmission times or storage constraints for the image data, or by using the maximum number of colors a display can show. These are important, but not particularly helpful for this research. Our assumption is that the necessary storage can be added to the robot or computer, as we will be reducing the data to substantially fewer than 256 colors. One constraint of interest to us may be the time necessary to perform the quantization. Another possible metric would consider more closely the task we are asking quantization to perform.

4.2.1 Optimizing Accuracy

The previous techniques are all designed to minimize artifacts visible to the human viewer. However, for our purposes, the human viewer is irrelevant. Instead, we need a metric that will enable the system to correctly classify colors even when the lighting changes. Because the algorithm is primarily concerned with chromatic responses, and for speed and simplicity, only the hue axis is considered.

The accuracy sweep method is a simple, greedy approximation to the brute force method of trying all possible combinations to obtain the bin locations that produce the highest possible accuracy on the training data. The first two bin locations are found by sweeping all possible combinations of bin boundaries along the hue axis. There are 256 possible locations for the first bin and 255 for the second (because boundaries at the same location would effectively produce only one bin). Once the first two bins have been determined, they are fixed and the third bin boundary is found by calculating the accuracy on the training data when the third bin boundary is swept through all 254 possible locations. Then the third bin boundary is fixed and the fourth is swept, and so on up to the nth bin boundary, producing the desired n − 1 bins.

Again, here we assume we know how many bins are desired. We want the quantization algorithm to free us from needing a color constancy preprocessor. How many colors are necessary if we wish each bin to contain most of the hues that result from images of a single color under a variety of lighting changes? To determine this, we gathered data of different colors at different times of day and analyzed the results.

4.2.2 Lighting Shifts

Images of a color calibrator and a set of cans were taken under a variety of conditions. Data for 2 different lighting conditions was taken at every hour, starting at 9am and finishing after dark, with an additional image at 6:20pm at dusk. The lighting conditions consisted of the four permutations of the status of the camera's white balance (on or off) and the status of the blinds (open or closed). The resulting images were segmented into color patches representative of different colors. A GretagMacbeth ColorChecker® was used as the calibrator, and a set of cans representative of common colors used in soda can label design were also present in the images. The layout was unchanged over the course of the day. The color calibrator consisted of 24 different colors, including 6 achromatic colors. The remainder of the image produced another 24 sample colors, including both common representations of black and white and dark background regions, such as the shadow under the table where the cans and the calibrator were set up. Figure 4.7 shows the range of colors that a single white region of a single can took over the course of the day.

This data was analyzed to determine the scope of possible lighting shifts, and potentially to determine a likely function for mapping colors taken under one illuminant to those taken under another. The range of values for a single patch of color, under a single lighting condition, was much higher than anticipated. Generally, along the hue axis, a single patch would have a standard deviation below .05 (where the original data are scaled to have potential values from 0 to 1). To encompass one standard deviation to each side of the mean would require a bin size of roughly 28 out of 256 possible bins. Thus, only


8 to 10 bins along the hue axis would be possible. However, this small number is very close to what we know of the accuracy of human memory. Instantaneously, we average over a given spatial region to get an estimate of the local color. Over time, we do not store the precise color, but only a rough representation of it. Strangely enough, the standard deviation along the hue axis when all the pixels of a given color over the course of a day were included produced a similar maximum. From the variability of the data, it appears that 8 to 10 bins is a very good approximation to the accuracy of the input data.

Figure 4.7: Sample color changes under a single day's sunlight.

A graph of these results for the set of colors found in soda cans is shown in Figure 4.8. This graph shows the mean and standard deviation for chromatic (blue) and achromatic (black) colors. This plot is for the soda can colors, rather than the calibrator colors. The achromatic colors included black, the dark background color, the desk the cans were placed on, the middle brightness background color, silver, white, and the lightest background color. The desk was closest to the gold or orange hue. Black, silver and white were all colors taken from regions of cans. Clearly the hue axis is a good choice for quantizing chromatic regions, as the standard deviation for chromatic colors is very low and the separation between colors is reasonably good.

Figure 4.8: Average and standard deviation for hue under varying lighting. The x location of each bar indicates the average of the average values for the day.

Figure 4.9 shows the difference between the averages for the cans (lower row) and the calibrator (upper row) hue results. These data points clearly seem to cluster along the hue axis. The region between 0.3 and 0.4 is almost empty, while the region including values just below 1.0 and values from 0.0 to 0.2 is very crowded. From 0.5 to 0.7, there are three distributed can classes, but the calibrator classes are more tightly clustered, without such clear divisions between colors.

If we assume that the mean and standard deviation calculated over the course of the day are representative of Gaussian distributions, we can derive a Bayes classifier


to determine the best color to choose for a given HSV triple. Figure 4.10 shows the probability density functions for the chromatic can colors along the hue axis. It is assumed that each color is equally likely. The color of each line is determined by the average RGB coordinates for that color for the day. Clearly, the hue axis is separable into distinct hue categories. Multiple peaks are present in these regions because the 17 chromatic can colors include more than one representative of important colors. For example, a light yellow and a darker yellow were both taken from images of the Country Time Lemonade™ can. Similarly, there were two purple measurements (Welch's Grape Drink™), two cyan measurements (Fresca™) and two pink measurements (Diet Cranberry Canada Dry Ginger Ale™ and Tab™). Gold (Caffeine-Free Coca-Cola™) seems to have a separate distribution from yellow, while many colors overlap in the red region of the hue axis. Natural separations fall between yellow and green, green and blue, blue and purple, purple and reds, reds and yellow. The reds class contains not only red and dark red, but also both pinks and both oranges.

Figure 4.9: Mean values for cans and calibrator hues.

Figure 4.10: Probability density functions for chromatic can colors along the hue axis.

In addition, the achromatic colors are separable on the other two axes. Figure 4.11 shows the probability density functions for all the can colors along the saturation axis. Clearly, the light achromatic colors are much lower in saturation than the others. The dark achromatic colors, however, are just as saturated as the lighter chromatic colors. Figure 4.12 shows that the dark achromatic colors can be distinguished with a threshold on the value axis.

Using these probability density functions we can derive a classifier with 8 or 10 colors. For the 8-color case, gold and yellow are considered one category, and red contains red and dark red, both pinks, and both oranges. The 10-color classifier separates gold and yellow into 2 classes, and finds a separate category for pinks.


Figure 4.11: Probability density functions for all can colors along the saturation axis.

The eight color regions common to both classifiers are black, white/silver, red, yellow, green, cyan, blue, and purple. We determine that black is present when the value coordinate is below 28 on a scale of 1 to 256. White and silver are grouped into a single class, identified by saturation of less than 95. The remainder of the classes are defined on the hue axis. Reds are in the range below 18 and above 206, while yellow and gold lie in the 18 to 47 range. Greens are present from 47 to 92, cyans from 92 to 144, and blues from 144 to 164. Purples fill the hole from 164 (blues) to 206 (reds). The calibrator had a magenta patch that formed a clear peak between the purple and red clusters, but magenta is not well represented among can label colors, so magenta is included in the purple category.

The sample image quantized with the 8-color scheme is shown in Figure 4.14. The wheels and pavement are categorized as achromatic/light and the shadows are categorized as blue and cyan, as they are not dark enough to fit in the black category used by the soda cans. This is primarily because the truck image is a generic image included with the Adobe Photoshop™ package, for use in tutorials, and the soda can images were taken with a Sony Digital8 camera in the laboratory. Images of soda cans, quantized with this method, assign black regions on the cans to the black category.

Figure 4.12: Probability density functions for all can colors along the value axis.

The two additional regions in the ten-color scheme consist of gold, from 18 to 31 along the hue axis (leaving yellow from 31 to 47), and pink. Figure 4.13 shows the probability density functions along each axis for the colors included in the red category. Clearly, one color is separable along the value axis, and one is separable along the saturation axis. These two colors are the two pinks. The two pinks are put in a tenth category, consisting of two subdivisions of the red category. The pink category is defined as those colors that fit in the red category along the hue axis,


whose saturation is below 163 and above the white/silver saturation boundary, or whose value is above 123, or both.

Figure 4.13: Probability density functions for the colors in the red category.

4.3 Summary

The main problem with quantization implementations is that of the final goal of the quantization. The intent of most algorithms is maintenance of the image in terms of instantaneous human perception, while incorporating reduced storage and display constraints. This means that the maximum possible number of colors is used. Metrics used to rate images and techniques include bit-rate in coding and quality in terms of human perception. My algorithm is concerned with using the minimum possible number of colors to represent an image while maintaining the ability to recognize an object, not the maximum available while maintaining human perception of the image. Because the data stored are in the form of a series of histogram values, not in the form of an image, bit-rate values for an image are not a useful metric. This makes it very difficult to use the research in the literature to analyze an algorithm's suitability for this project.

Figure 4.14: Sample image quantized using the 8-color lighting-shift based classifier.

We take the approach that the degradation in color representation introduced by quantizing an image of an object can actually improve the desired result: namely identifying the same object in multiple images. Generally, quantization algorithms are judged qualitatively. The field is in agreement that mean square error is a poor metric of image quality, and no other metric has been proposed with sufficient acceptance to provide a clear quantitative measure of image quality.

Analysis of data gathered under typical laboratory lighting conditions resulted in a plausible quantization scheme. In addition, this analysis provided a clearer understanding of the nature of the problem. The variance in lightness or value is high, rendering that feature unlikely to be helpful except in the broadest sense. The segmentation into different color categories is not intuitive in the RGB space, while it


is very straightforward using hue and saturation as salient features of the data. This resulted in eight or ten color categories, close to the number of categories resulting from the work in Chapter 3.

Little work has been done for cases of extreme quantization for object recognition. Generally, color quantization has focussed almost exclusively on compression with minimal loss in quality with regard to instantaneous human image perception. In this case, quantization is usually regarded as introducing noise, rather than as reducing it. There are a few exceptions to this rule [17, 6]. They attempt to compensate for non-linearities in the data by transforming the original data to a new space, via the K-L transform or a DCT, and then uniformly quantizing, rather than by using an adaptive quantization algorithm. Because they are doing this with respect to Swain and Ballard's algorithm, their results are of more interest here than the other results. In general, they have found that using these transforms combined with uniform quantization resulted in high accuracy that dropped off sharply when the number of bins fell below eight. The resulting number of categories is similar, regardless of how the number is obtained.
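As a concrete recap, the eight-color lighting-shift scheme derived in Section 4.2.2 reduces to a handful of threshold tests on an HSV triple (hue, saturation, and value on the 1 to 256 scale used there). The function name and label strings are mine; the thresholds are the ones quoted in the text:

```python
def classify_8color(h, s, v):
    """Threshold classifier from the lighting-shift analysis (HSV in 1-256)."""
    if v < 28:                 # dark pixels are black, whatever their hue
        return "black"
    if s < 95:                 # low saturation: white and silver share a class
        return "white/silver"
    if h < 18 or h > 206:      # reds wrap around the ends of the hue axis
        return "red"
    if h < 47:
        return "yellow"        # includes gold in the 8-color scheme
    if h < 92:
        return "green"
    if h < 144:
        return "cyan"
    if h < 164:
        return "blue"
    return "purple"            # 164 to 206, magenta folded in
```

The two extra categories of the ten-color scheme (gold at hue 18 to 31, and pink via the extra saturation and value tests) would slot in as additional branches.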


CHAPTER5THEORYANDRESULTSONSYNTHETICDATA5.1Theory5.1.1StructureandMethodsThecolorhistogramindexingmethodderivesmuchofitsrobustnessfromitssimplicity.Thefeaturevectorsconsistofnormalizedcolorhistogramsofimagesoftheobjectsinthedatabase.Inourimplementation,wedeterminematchclosenessbyasimpleEuclideandistancemeasurebetweenfeaturevectors.Wehavevariablesforthenumberofobjectsinthedatabase,thenumberofcolorsintheoriginalspace,thenumberofcolorsinthenalspace,andthenumberofvaluesusedtoformthehistogram.Forexample,supposethenalhistogramcontained512values.Ingeneral,onlythefewlargesthaveobject-relatedinformationinthem.Thesmallestvaluesareverynoisy.Insteadofkeepingall512values,wecouldkeeponlyafewofthelargest,andsettheresttozero.Thiswouldgreatlyincreasethestorageeciencyofthesystem,asonlythenon-zerovaluesandtheirindexwouldneedtobestored.Becauseofstorageandprocessingconstraints,thepreferableoptionwouldbetosimplyquantizethespacetoonlyafewcolorsintherstplace.Iffewercolorsareused,onlythemagnitudesneedbestoredinanorderedarray,ratherthanamorecomplexstructureincorporatingtheindexvaluesandeliminatingthezeros.Sohowmanycolorcategoriesaresucient?Wehaveshownthathumanmemoryforhuesdegradesbetween8and16divisionsofthehueaxis[ 54 ]inChapter 3 .Drewetal.[ 17 ]andBerensetal.[ 6 ]foundaccuracybegantodegradeatbetween8and16categoriesintheirrespectivetransformedspaces.Analyzingthevariationin 61


illumination due to sunlight over the course of a day resulted in 8 or 10 categories useful in soda can identification. All these results indicate that the region between 8 and 16 categories may be where we should investigate. Now that we have an approximate number of categories to explore, how do we choose the categories themselves? Computationally, this should have little or no impact on the final requirements of the system. Ideally, the categories should be fixed for the expected environment of the robot, so that a lookup table is all that is needed. We begin with a modified version of uniform quantization. Obviously, uniform quantization in RGB space produces quite different categories than uniform quantization in a hue-based space. Given that we are concerned exclusively with color information, a hue-based space seems most promising. We first test quantization to 512 colors in RGB space, which allows us to roughly determine the robustness of the original algorithm under multiple lighting conditions. Second, we test quantization to 64 colors in RGB space. We then test quantization of each original image to 14 colors in HLS space, obtaining 3 achromatic colors (white, gray, and black) and 11 chromatic colors (pink, red, orange, yellow, three greens, a dark cyan, light blue, dark blue, and purple) using the "tree" quantization described in Section 4.1.4. Feature vectors are generated by scaling the histograms of the indexed images to sum to one.

5.1.2 Theoretical Results

Assuming all colors and combinations of colors are equally likely, our equations find the expected value of the error. In reality, a given database will produce better or worse results with this technique, depending on the quantization method used and the characteristics of the database. Noise in the form of camera calibration, non-uniformity of color distribution in a given database, and lighting variations will all affect any results on real data. However, given appropriate conditions (objects easily


Figure 5.1: Diagram showing examples of each variable used in the theoretical equations.

distinguishable by color alone, or the algorithm used as one method among many for identifying the objects, and appropriate quantization methods for the characteristics of the data in question) real results should at least follow the same trends as those outlined here. There is a tractable solution to the effects of the variables on expected error for 3 objects [53, 52]. For more objects, it is possible to derive a specific analytic solution, but simpler to implement a program to calculate appropriate values. For larger databases (more than 10 objects) even this approach becomes unwieldy and time-consuming. The 3-object solution is presented here, along with a description of the program used to generate the accuracy values for up to 6 objects and simulation results for larger databases. For databases containing only three objects (n = 3), we wish to determine the expected error. In this case, the variable c is used to represent the number of colors in the original (not quantized) space, k is used to represent the number of colors in the quantized space, and p is used to represent the number of values of the histogram. These variables are illustrated in Figure 5.1. In this figure, there are six colors in


the original space. As a result of quantization, red and orange are merged to form a single new color, represented here as orange. As a result of this, the first two objects become indistinguishable, while the other two are still separable. There are c^p possible different objects in the input space. We choose three of these for our database, and then quantize them. The resulting three objects in our database are members of the k^p possible objects in the quantized space. For this case, we assume that the only degradation occurring over time is modeled by the quantization. The only time the error increases is when more than one of the objects in our database quantizes to the same object in the new space. Otherwise, the objects remain perfectly distinguishable and the error remains at 0. If there is more than one copy of a given object, we choose randomly between them. The expected error for a given set of objects can take only three values: 0 (all objects distinguishable), 1/3 (two objects the same) or 2/3 (all three objects identical). However, when we look at the expected value of this error over all possible sets of objects, in terms of c, k and p, we get

E[error] = (1/N_all) [ (1/3) N_2 + (2/3) N_3 ]    (5.1)

Here N_all is the number of possible cases, N_2 is the number of databases where there are two objects which are identical and N_3 is the number of databases where all three objects are duplicates. N_2 is defined as

N_2 = sum_{j=1}^{k^p} s_j [ (s_j - 1)(c^p - s_j) + sum_{t != j} s_t (s_t + s_j - 2) ]    (5.2)

and N_3 is defined as

N_3 = sum_{j=1}^{k^p} s_j (s_j - 1)(s_j - 2)    (5.3)

where s_i is the number of objects created in the original c-color space that are quantized to the same i-th object in k-space. N_all is the number of possible n-object


Figure 5.2: Theoretical results for c = 2^24, p = 2 and p = 3, n = 3, and k varying.

databases in the c-color space.

N_all = c^p! / (c^p - n)!    (5.4)

Figure 5.2 shows the results for this equation when p = 2 (crosses) and p = 3 (stars) for k varying. The effect of keeping p of the bins is clearly distinguishable. The y-axis is the error displayed on a log scale. Clearly, a larger p corresponds to a measurable reduction in error. Increasing p by one produces a reduction in error of more than an order of magnitude for k > 10. k = 10 is also the threshold for error below 1% for p = 2.

5.1.3 Larger Databases

For more than 3 objects in the database, the computational complexity of the theoretical solution increases greatly. However, for small n (up to 6 on our equipment and with our time limitations), it is possible to compute the expected accuracy, with one assumption. We assume that c^p >= k^p is larger than n. This ensures that


Figure 5.3: Theoretical results for c = 2^24, p = 3, and k varying along the x axis. Each line corresponds to a different value of n.

all combinations of n of the k^p possible objects are possible. This is not a severe limitation, as in general we quantize from 2^24 colors down to tens of colors. In Figure 5.3, we can see the effect of changing the number of bins in the new space on the error. As k increases, the error decreases. If we plot the log of k versus the log of the error, we get the lines shown. Each line corresponds to the results for a given value of n. As n increases, the error also increases, but the relative increase in error is small. Figure 5.4 shows the behaviour of log error versus log k as p is increased. Again, the results are shown on a log-log plot. As p increases, not only does the error decrease, but the lines also become steeper. Thus, increasing p has a greater effect on error when k is larger. Experimentally, the error when n = 4, p = 3 and k is on the order of 1000 is of the same order of magnitude as the error when n = 4, p = 6 and k = 25. Doubling the number of features kept reduced the number of colors needed by a factor of 40 without changing the error. For larger databases, we implemented a Monte Carlo simulation. These results were obtained by averaging the error from 10,000 randomly generated databases of n


Figure 5.4: Theoretical results for c = 2^24, n = 4, and k varying along the x axis. Each line corresponds to a different value of p.

objects. This average error is plotted in Figure 5.5. Predictably, error increases with the number of objects in the database. As the histogram space becomes less sparse, the system's robustness to fluctuations in hue deteriorates. Theoretical results show coarse quantization can dramatically reduce necessary storage and processing while producing a minimal reduction in accuracy. Increasing the number of features kept dramatically increases the accuracy, while increasing the number of objects reduces the accuracy. Large decreases in the number of colors available can be compensated for by small increases in the number of features kept. The results presented to this point are for the best possible case: only degradation resulting from quantization is interfering with perfect accuracy. In the real world, other factors affect accuracy. These include lighting variation (illuminant noise) and differences in camera placement and calibration. In general, as long as the illuminant noise is not too great (actual magnitude will depend on the database and the degree of quantization), some amount of quantization will help eliminate this noise and improve accuracy.



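The Monte Carlo estimate described above (averaging the identification error over randomly generated databases) can be sketched in miniature. This is a hypothetical simplification, not the original program: an "object" is reduced to a tuple of p color indices in the original c-color space, and the function name `expected_error_mc` and the uniform index-quantization step are assumptions of this sketch.

```python
import random
from collections import Counter

def expected_error_mc(c_bits, k, p, n, trials=2000, seed=0):
    """Monte Carlo estimate of expected identification error when n objects
    drawn from a c = 2**c_bits color space are quantized to k colors, with
    p histogram features per object."""
    rng = random.Random(seed)
    c = 2 ** c_bits
    total = 0.0
    for _ in range(trials):
        # draw n distinct "objects": tuples of p color indices
        objs = set()
        while len(objs) < n:
            objs.add(tuple(rng.randrange(c) for _ in range(p)))
        # uniform quantization of every color index down to k colors
        counts = Counter(tuple(v * k // c for v in o) for o in objs)
        # m objects colliding on one quantized object are matched by a
        # random choice, correct 1/m of the time -> m - 1 expected misses
        total += sum(m - 1 for m in counts.values()) / n
    return total / trials
```

For n = 3 the per-trial error takes exactly the values 0, 1/3 and 2/3 discussed above, so the estimate converges toward equation (5.1) as the trial count grows.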
Figure 5.5: Simulation (averaged) results for c = 2^24, p = 2 and p = 3, k = 10, and n varying.

5.2 Synthetic Data

There are two interlinked factors still to be tested. First, the theoretical results assume both uniform distribution of colors/objects and uniform quantization. There is no guarantee that the database will be in any way similar to the quantization procedure followed. Thus a quantization scheme that is in some sense data-dependent may be required. Second, and more importantly, these results show what happens when there is no noise in the form of lighting shifts. In order to simulate the effect of lighting shifts and test various quantization schemes, further research was performed using synthetic data. The first experiment tested the case when the objects all fall into a single region of the hue axis. Section 5.2.1 describes the results of this experiment. Section 5.2.2 covers the results when the objects are equally distributed in hue. The results when each object is concentrated in more than one hue region are covered in Section 5.2.3, and Section 5.2.4 shows the results when these multiply-hued objects are shifted in hue not with a simple linear shift, as in the previous sections, but with the lighting shifts measured in Section 4.2.2 of Chapter 4.



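Before moving to the synthetic experiments, the basic pipeline of Section 5.1.1 (normalized histograms, optionally keeping only the p largest values, Euclidean nearest-neighbour matching) can be sketched as below. `feature_vector` and `identify` are hypothetical names, and the sketch assumes pixels have already been mapped to quantized color indices 0..k-1.

```python
import numpy as np

def feature_vector(indices, k, p=None):
    """Normalized histogram of quantized color indices (0..k-1); if p is
    given, keep only the p largest values and zero the rest."""
    hist = np.bincount(indices, minlength=k).astype(float)
    hist /= hist.sum()
    if p is not None and p < k:
        cutoff = np.sort(hist)[-p]       # value of the p-th largest bin
        hist[hist < cutoff] = 0.0
    return hist

def identify(vec, database):
    """Index of the database feature vector closest in Euclidean distance."""
    return int(np.argmin([np.linalg.norm(vec - ref) for ref in database]))
```

With small k, only the k magnitudes need to be stored per object, which is the storage saving argued for above.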
Figure 5.6: Image of single hue region synthetic database. Red corresponds to higher values; blue to lower values.

5.2.1 Single Hue Region

A synthetic database of 16 objects was generated. Each object consisted of a sinusoidal bump, 8 (of 256) bins wide. Figure 5.6 shows an image of this database. Larger values are red; 0 is blue. The entire database takes up only half of the possible hue axis. The joint purposes of this experiment were to verify that the accuracy sweep method of optimizing bin selection described in Section 4.2.1 of Chapter 4 was working properly and to compare the results of the uniform and the accuracy sweep methods of quantization. The accuracy sweep method should outperform uniform quantization when the database is localized within one or more regions of the hue axis, and should perform as well as the uniform method when the database consists of colors evenly spread throughout the histogram space. When small lighting shifts occur, the accuracy sweep method should dramatically outperform the uniform quantization


Figure 5.7: Results on single hue region database for varying numbers of bins, using uniform (blue) and accuracy sweep (red) quantization. Average accuracy across the different shifts is shown in the lower right hand plot. Standard deviation of the average value is shown with dotted lines. Progressively larger shifts, from none (upper left hand plot) to 7 (middle lower plot), are shown in the remaining plots.

method on training data, except when uniform quantization results in precisely the correct bin locations. The number of bins was swept from 2 to 256 for uniform quantization and from 2 to 128 for accuracy sweep quantization. The accuracy results from 2 to 128 bins are shown in Figure 5.7. Only 7 shifts were performed, as the 8th shift would cause all but one of the objects to line up perfectly with a different object, producing 100% error on those objects. The single object that didn't line up with another object would line up with nothing, producing error of random chance (1 in 16 chance of identifying the object correctly, given that object). As the shifts increase, the expected value of


Figure 5.8: Database of objects completely spanning the hue axis.

the accuracy would increase until a shift of 128, and then decrease until we were 8 bins away on the other side. Clearly the accuracy sweep method produces superior results on this type of data set. It outperforms the uniform quantization by a larger and larger margin as the values are shifted away from the original. Even for the largest shift, where only one pixel of the shifted version overlaps with the original object, the accuracy sweep method correctly identifies over half the objects, while uniform quantization fails dramatically. Looking at the average accuracy, it is clear that overall the accuracy sweep method performs better than the uniform method on training data. The characteristic shape of the uniform quantization method for large shifts (compared to the size of the objects, "large" is defined as shifts greater than half the size of the object bumps) is of more interest than the average accuracy results. For small shifts, accuracy degrades as the number of bins is reduced. For a medium shift (exactly


Figure 5.9: Results of uniform (blue) and accuracy sweep (red) methods.

half the object) the uniform accuracy is extremely unpredictable from one number of bins to another. However, for large shifts, the uniform accuracy follows a predictable trajectory. First, for few bins, the accuracy is very low. As the number of bins is increased, the accuracy increases, compatible with the theoretical results. Eventually a peak is reached, and the accuracy begins to decrease. This peak is the point at which the maximum number of pixels are being consolidated into the correct bin as a result of the widened bin sizes. As the bin sizes decrease and the number of bins increases, the accuracy decreases, until at 256 bins the accuracy is 0. No objects are identified correctly, because every object overlaps more with a different object than with itself.

5.2.2 All Hues

Instead of limiting the objects to a single region, the new database has identical objects spread along the entire hue axis. Instead of 16 objects, the new database


Figure 5.10: Original database for histograms with two colors.

(shown in Figure 5.8) has 32 objects. Each of these 32 objects is the same as any one of the original 16 objects, merely shifted to a new region of the axis. Figure 5.9 compares uniform and accuracy sweep methods. Clearly, when the colors of the objects within the database are uniformly spread, and there is no noise in the form of offsets, the uniform method does as well as the more optimized method. However, when offsets are introduced, again the uniform accuracy degrades substantially while the accuracy sweep method continues to perform well on training data.

5.2.3 Multiple Hues

Now that we know what happens when the database consists of identical objects with single color regions, we wish to determine how the accuracy is affected by hue shifts when each object consists of multiple hues. We begin with objects consisting of two colored regions of equal size. This corresponds to histograms with two identical


Figure 5.11: Average accuracy of uniform (blue) and accuracy sweep (red) methods as a function of shift.

bumps. Figure 5.10 shows the new database. Again, red represents larger values and blue, smaller values. The offset is 64 elements, so a sweep of 72 bins will show the full range of possible accuracies. Each bump of each object is still 8 hue values wide, generated with half a period of a sine wave. Figure 5.11 shows the accuracy as a function of shift. The dotted lines indicate standard deviation. The curve is bimodal, with the first peak corresponding to a correct match of the first histogram peak with the first histogram peak for a given object, and the second accuracy peak corresponding to a match of the first histogram peak of an object with its shifted second peak. Because the second peak of this plot corresponds to only one peak contributing to the accuracy, rather than two, its peak accuracy is lower.



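The bump databases and the shift experiments above can be reproduced in miniature. This sketch is an assumption-laden illustration (hypothetical `make_database` and `accuracy_under_shift`, using the 256-bin hue axis and 8-bin-wide half-sine bumps from the text): it quantizes the hue axis uniformly into k bins and measures recognition accuracy under a circular hue shift.

```python
import numpy as np

def make_database(n_objects, spacing, bins=256, width=8):
    """Synthetic histograms: identical half-sine bumps placed at
    different positions along the hue axis (cf. Figures 5.6 and 5.8)."""
    bump = np.sin(np.linspace(0.0, np.pi, width))
    db = np.zeros((n_objects, bins))
    for i in range(n_objects):
        db[i, i * spacing:i * spacing + width] = bump
    return db / db.sum(axis=1, keepdims=True)

def accuracy_under_shift(db, shift, k):
    """Quantize hue uniformly into k bins, circularly shift the test
    copies by `shift` hue values, and report recognition accuracy."""
    bins = db.shape[1]
    edges = np.linspace(0, bins, k + 1).astype(int)[:-1]
    quantize = lambda h: np.add.reduceat(h, edges, axis=-1)
    ref = quantize(db)
    test = quantize(np.roll(db, shift, axis=1))
    hits = sum(np.argmin(np.linalg.norm(ref - t, axis=1)) == i
               for i, t in enumerate(test))
    return hits / len(db)
```

With 16 objects spaced 16 bins apart, a shift of 8 still lands inside the same bin when k = 16 (accuracy survives), but scatters every object into the wrong bin when k = 32, mirroring the text's observation that coarse bins can absorb shifts smaller than the bin width.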
Figure 5.12: Results of uniform (blue) and accuracy sweep (red) methods.

Figure 5.12 shows the results for selected shifts. Clearly, the results are best when there is no shift. The same pattern as before is evident. The accuracy sweep results (red line) show the same smooth rise and larger accuracy than the uniform results. The uniform results (blue line) show the same abrupt peak followed by a decrease to zero for larger shifts. The plots in this figure were chosen to show the effects of the bimodal curve in Figure 5.11. Note that small shifts (to 10) are similar to the results for the previous database, with roughly 50% accuracy at shifts of 4 and 5. For the region between the two peaks in the bimodal curve (to 55) the uniform results show a negligible bump for few bins while the accuracy sweep results are non-zero and varying. For shifts between 55 and 72, the curves follow the results for the first peak, but attain a lower maximum. Our second synthetic database with two peaks is shown in Figure 5.13. This database covers the entire hue axis and has an offset between the two objects of 128


Figure 5.13: Second two-peak database.

(half the potential hue space). This is a more uniform database, so we anticipate that the accuracy sweep method should not increase the accuracy substantially. Because of the symmetry in the database, we need only sweep the shifts from 0 to 8. Figure 5.14 shows the average results as a function of shift. There is a clear decrease in accuracy as the shift is increased. Continuing to a shift of 16, it is reasonably clear that the accuracy levels off at a shift of 8. We would expect that the accuracy of shifts between 16 and 120 would be approximately the same as the accuracy for shifts between 8 and 16, with shifts between 120 and 128 mirroring the shifts from 1 to 8. Results for a variety of shifts are shown in Figure 5.15. Again, we have roughly 50% accuracy when the shift is 4, or half the bump size. We also continue to see the clear pattern of increased accuracy for few bins when the shift is greater than half but less than the full width of the bump. When the shift is larger than the bump size, accuracy is decreased to almost zero. The accuracy sweep results for the flat section


Figure 5.14: Average accuracy for second two-peak database as a function of shift.

of the shift curve show a similar small bump for few bins as the uniform results in the previous database.

5.2.4 More Complex Lighting Shifts

In this section, the databases from the previous subsections are tested with a more complex lighting shift. The shift is generated from the images of cans under different laboratory lighting conditions described in Section 4.2.2. Figure 5.16 shows sample transforms. The black line of points corresponds to the transformation from the average lighting condition (very close to the midday response) to the earliest morning lighting condition (a.m. daylight with fluorescent lights). The blue line corresponds to the transformation from the average to the latest evening condition (fluorescent lights, no daylight). In each case, the values from the cans in the image, rather than the calibrator, were used to generate the transformations. If we want to


Figure 5.15: Results of uniform and accuracy sweep methods on second two-peak database.

perform color constancy with this data, we should make sure that a given location on the x-axis fits into a bin that contains the entire difference between the blue and black curves. Using the width between the curves between 75 and 125, roughly 16 categories would be necessary. Figure 5.17 shows the original input in the upper plot and the warped input (generated assuming the original data were taken under the average illuminant and the warped version was taken under the first lighting condition) in the lower plot. The original database corresponds to the single-peak database that spans the entire hue space, as described in Section 5.2.2. However, the warped version is substantially different from the simple shifts tested previously. Instead of a linear shift, the warping stretches some regions and compresses others.



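A measured, non-uniform hue transform of the kind plotted in Figure 5.16 can be applied by piecewise-linear interpolation through its control points. The control points below are hypothetical illustrations, not the measured values:

```python
import numpy as np

# Hypothetical control points of a measured hue transform (in the style
# of Figure 5.16): hue under the average illuminant -> hue under a test
# illuminant. Real values would come from the calibration images.
SRC = np.array([0.0, 64.0, 128.0, 192.0, 255.0])
DST = np.array([0.0, 58.0, 140.0, 197.0, 255.0])  # stretches and compresses

def warp_hues(hues):
    """Non-uniform hue warp by piecewise-linear interpolation through the
    control points, unlike the simple linear shifts of earlier sections."""
    return np.interp(hues, SRC, DST)
```

Applying `warp_hues` to every pixel's hue before histogramming produces a warped database analogous to the lower plot of Figure 5.17.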
Figure 5.16: Transformation from average illuminant to early morning and evening illuminants.

This non-uniform warping is apparent when we compare the results of the linear shift with the results using the warped data. Figure 5.18 compares the linear shift of one to the closest warped data (midday to the average), and the most distant warped data (early morning). The results in Figure 5.9 are comparable to the results with a shift of 6. The accuracy sweep method reaches a peak of roughly the same accuracy, but in the warped case, there is a substantial drop in accuracy between the 45th and 100th bins. In addition, the uniform results never drop to zero in the warped data case in the way they do with a simple linear shift. However, at least for the accuracy sweep method, there is a clear initial bump in the accuracy, followed by a decrease as the number of bins increases.



Figure 5.17: Comparison of original and warped databases.

5.3 Over-fitting

The accuracy sweep method does a wonderful job of enabling the algorithm to adapt to given lighting changes. If the lighting is only going to change in one way, the accuracy sweep method will design the bins to optimize for that shift. However, if the examples that the method is using to train do not reflect the actual lighting conditions to be encountered, the bin locations are less optimal than uniform. Figure 5.19 shows the results when the bins generated for a single lighting shift are used to determine the accuracy for the other lighting shifts. The dotted lines are the accuracies generated in the test case, with the optimal accuracy sweep for the training data in red and the uniform results in blue. The solid red lines show the accuracy for the given shift used as cross-validation data when the training data is the same training data used to generate the bin locations.



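The accuracy sweep idea (place bin edges so as to maximize accuracy on shifted training data) is defined in Chapter 4; one brute-force reading of it is sketched here. `accuracy_sweep`, `train_accuracy`, and the candidate-grid search are assumptions of this sketch, not the dissertation's implementation:

```python
import numpy as np
from itertools import combinations

def train_accuracy(db, test, edges):
    """Nearest-neighbour accuracy after summing hue histograms into the
    bins delimited by `edges` (first edge must be 0)."""
    quantize = lambda h: np.add.reduceat(h, edges, axis=-1)
    ref, tst = quantize(db), quantize(test)
    hits = [np.argmin(np.linalg.norm(ref - t, axis=1)) == i
            for i, t in enumerate(tst)]
    return float(np.mean(hits))

def accuracy_sweep(db, shifted, k, candidates):
    """Exhaustively pick k - 1 interior bin edges from a candidate grid
    so as to maximize accuracy on the shifted training data."""
    best = (-1.0, None)
    for interior in combinations(candidates, k - 1):
        edges = np.array((0,) + interior)
        acc = train_accuracy(db, shifted, edges)
        if acc > best[0]:
            best = (acc, edges)
    return best
```

Because the edges are chosen to fit one particular shift of the training data, this construction also exhibits the over-fitting behaviour described above when the test shift differs from the training shift.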
Figure 5.18: Results for training original and testing warped.

5.4 Conclusions

The theoretical results showed that when there is no noise in the system, we can expect at most a minor drop in accuracy when we quantize from many bins to on the order of ten. For large databases, more than ten colors may be necessary, depending on the size and composition of the database. These results indicate that systems using quantization as a technique to improve efficiency are likely to work very well. However, the theoretical results did not predict whether or not the algorithm is capable of performing limited color constancy. The synthetic results showed that in the training case, we can expect the accuracy sweep method of determining the bin locations to outperform the uniform quantization method. In addition, when the uniform quantization method is used with values that have been shifted by more than half the width of the histogram peaks, accuracy is


Figure 5.19: Over-fitting of accuracy sweep method.

substantially higher for few categories than for many. Thus, for small linear hue shifts, quantization can perform limited color constancy. This is shown by the increase in the accuracy when only a few bins are used. The realistic lighting shift, however, did not show perceptible color constancy. On training data, the accuracy sweep method did seem to perform some form of color constancy, as the accuracy increased for fewer bins. However, the uniform method did not show any significant change in accuracy as a function of the number of bins, and the accuracy sweep method overfitted the data and was thus unfit for use in environments where lighting is unpredictable.



CHAPTER 6
STILL IMAGE RESULTS

To test our results from Chapter 5, we generated a series of databases. The smallest database consisted of images taken under typical laboratory lighting conditions. The database contained images of 9 different soda cans, under 4 different illuminants. Each image was segmented to contain only the soda can in question, and no pre-processing was done before the quantization stage. The four illuminants consisted of ambient daylight, ambient fluorescent light, frontal daylight with left fluorescent light, and frontal daylight with right fluorescent light. Each can was in the same location, in the same orientation, with the same surroundings, so differences due to reflections, orientation shifts, and shadows were minimized. A second database of 14 cans under 8 different illuminants was used to verify the theoretical results. This database was intended to be much more difficult, as the illuminants were common household illuminants, not pre-processed to compensate for intensity or hue variations. As a result, the images in this database varied far more than the others in terms of overall brightness and hue. A third database of 86 cans under 4 common laboratory illuminants and 8 different orientations was used to test the effects of quantization on larger databases. Of this larger database, a subset of 15 cans was used to test the results on a database of objects whose primary colors all come from a single region of the hue axis. Databases were quantized to 2, 3, 4, 5, 6, 8, and 14 colors using the "tree" method described in Section 4.1.4. The number of bins was also swept from 2 to 256 using the uniform and accuracy sweep quantization methods along the hue axis, with only the hue information used to identify the objects, as in the results on synthetic data in Chapter 5. In addition, the lighting shift method proposed at the end of Chapter 4 was used. All these results on


Figure 6.1: Sample soda images used in 14-can database. Soda shown here is Publix brand Diet Cola, under each of eight different illuminants.

the unprocessed database are compared to the results when the large database is pre-processed with the multi-scale retinex with color restoration described in Chapter 2, and to the results using simple normalization to an [r, g] chromaticity space.

6.1 Theoretical Comparison

Predictions were tested against a real database containing images of 14 soda cans. Each can was oriented the same way in each image, to minimize the effects of orientation changes. In Figure 6.1 the eight different lighting conditions are shown for one can. The can shown is white, with markings in dark red and gray. Three images were taken of each can under incandescent light: one with no other ambient light, one with ambient but not direct sunlight, and one with full sun. One image was taken of each can under ambient but not direct sun. Two images were taken of each can in


Figure 6.2: Color maps for 2, 3, 4, 5, 6, 8 and 14 colors.

bright and dim halogen light, with and without sunlight, for a total of four images of each can. Each of these 112 images was quantized to 2, 3, 4, 5, 6, 8, and 14 colors using the data-independent tree method described in Section 4.1.4. The color maps generated for our quantization to 2, 3, 4, 5, 6, 8 and 14 colors are shown in Figure 6.2. As the number of available colors increased, the number of chromatic colors also increased. Because the object recognition algorithm is based on color, it seemed appropriate to include as many hues as possible and minimize achromatic discrimination. Colors whose saturation was below 0.2 were considered achromatic, and colors whose lightness was below 0.1 were assigned to the darkest achromatic color. The upper lightness threshold was set to 1, so even very pale colors were considered chromatic. In the theoretical equation, each area of uniform color is assigned an index, and so it is possible to derive accuracy values for k < p. Here, each histogram value is computed

Figure 6.3: Theoretical predictions (blue solid lines) for c = 2^24, p = 2 and p = 3, n = 3, and k varying. Real database results (red dashed lines) for c = 2^24, p = 3 and p = 5, n = 3 (averaged over 20 sets), and k varying.

as the sum of all pixels of the same quantized color, so k could not be smaller than p. For comparison with the theoretical predictions for varying k, we averaged the results from 20 sets of three images, where each image in a given set was of a different can. For comparison to the simulation data, we determined the accuracy for sets of 10, 20, 30, 50 and 100 images, without averaging over different combinations. Our database of real images utilizes resubstitution, to better compare accuracies derived from real data to the theoretical expected accuracy. Figure 6.3 compares the theoretical and real data, for several values of p, as k varies from 1 to 25 and c is held to 2^24. The theoretical results (blue solid lines) predict high accuracy even for values of k as low as 5 (>96% for p = 2 and >99.5% for p = 3). The results from the real image database (red dashed lines) are somewhat lower, with accuracy below 90% when k is 6 or less. However, when k equals 8 or 14 the accuracy


Figure 6.4: Real database results for n varying, with k = 8 and k = 14, p = 2 and p = 3. The blue solid lines show k = 14 and the red dashed lines show k = 8. The blue and red lines with triangle markers show the results from the comparison between theoretical data (blue) and real data (red).

is well above 90% for both p = 3 and p = 5. We expect the real results to be lower than the theoretical prediction because the input images do not span the complete color space (c) as the simplified prediction assumes. In addition, the values here are derived based on the assumption that the sets s_i are the result of quantization that is as uniform as possible. The color space separation into achromatic and chromatic regions violates that assumption. As expected, higher values of p give greater accuracy, because by increasing the number of features we exponentially increase the potential number of distinguishable objects. Reducing c has little effect on accuracy. When c is 16, and k is 10, accuracy decreases by only 0.2%. As in Chapters 3, 4 and 5, the accuracy begins to degrade at around 8 bins.



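The 14-color HLS classification can be illustrated with the thresholds quoted above (saturation below 0.2 is achromatic; lightness below 0.1 maps to the darkest achromatic color). The hue boundaries and the gray/white split below are hypothetical, since the text names the 11 chromatic categories but not their exact edges:

```python
import colorsys

# Hypothetical hue boundaries for the 11 chromatic categories named in
# Section 5.1.1; the dissertation does not list the exact edges here.
HUE_EDGES = [0.05, 0.12, 0.20, 0.30, 0.36, 0.42, 0.52, 0.58, 0.68, 0.80, 0.97]
NAMES = ["red", "orange", "yellow", "green 1", "green 2", "green 3",
         "dark cyan", "light blue", "dark blue", "purple", "pink"]

def classify(r, g, b):
    """Map an RGB pixel (0-1 floats) to one of 14 color names using the
    quoted thresholds: lightness < 0.1 -> darkest achromatic color,
    saturation < 0.2 -> achromatic (white/gray/black)."""
    h, l, s = colorsys.rgb_to_hls(r, g, b)
    if l < 0.1:
        return "black"
    if s < 0.2:
        return "white" if l > 0.66 else ("gray" if l > 0.33 else "black")
    for edge, name in zip(HUE_EDGES, NAMES):
        if h < edge:
            return name
    return "red"  # hue wraps past the last edge back around to red
```

A per-pixel lookup table over quantized HLS values would give the same categories without per-pixel trigonometry, matching the fixed-environment lookup-table argument of Section 5.1.1.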
Figure 6.4 compares simulated and real data for variable n. The simulation results (blue solid lines) predict that accuracy will drop as a function of database size, and that changes in p will have a larger effect as the database size increases. The black dotted line (circles) shows the accuracy that would be expected if the quantization was performing correct color constancy and each image of a given object were being incorrectly assigned to a single image of that object. If enough features are retained (p larger), high accuracy is maintained even for large databases (>95% for p = 3 and n = 100). The results from the real image database (red dashed lines) are again lower for similar values of p. However, accuracy similar to the theoretical prediction is obtained for larger p. The theoretical results are shown for k = 10, while the real values are for k = 8 and k = 14. Setting p = 8 for each shows that a combination of sufficient colors (k) and sufficient features (p) is necessary to obtain reasonably high accuracy. The real data were generated from databases derived from random collections of images from the original database of 112 images. Accuracy increased to just over 80% with large database size, for k = 8 and p = 8. As the real data for any given n are generated for only one database, the increase in accuracy can be attributed to the particular composition of the larger database in comparison to the smaller. Larger values of k and p are required to obtain comparable accuracy because the database objects do not span the color space in the same way as the simulated objects. The simulated objects' colors are chosen using a uniform random variable, while the real database is limited not only in the number of colors available (limited by the camera and lighting conditions) but also by the objects themselves. All the objects in this database are taken from one class of objects (soda cans) and therefore may be expected to be limited in style (affecting color region size) and material (affecting range of colors). The values of k and p shown are the values that the accuracy converged to as p increased for a given k. These are the results that represented each


Table 6.1: Real data results for p = 3. CC shows the expected results if the algorithm is performing correct color constancy and accurately identifying each object.

   n  | k=14 | k=8   | CC
  10  | 90%  | 70%   | 100%
  20  | 70%  | 50%   | 70%
  30  | 70%  | 46.7% | 46.7%
  50  | 66%  | 48%   | 28%
 100  | 59%  | 42%   | 14%

image in the database. Larger values of p did not result in appreciable increases in accuracy. Thus, these values illustrate the effect of quantization on accuracy on this particular database of images, rather than illustrating the effect of quantization on accuracy as a function of the object in question. A database containing a larger variety of object classes would be more likely to produce curves closer to the theoretical and simulated values. In addition, all the databases with more than 14 objects of necessity have object duplications. Ideally, this duplication would result in lower accuracy as the number of objects increases, because accuracy here is considered perfect recognition of each image in the database, rather than identification of individual objects. If all the images of each object after quantization were identical (correct color constancy), the algorithm would be unable to distinguish between them, and would arbitrarily assign all the images of a given object to one image of that object. This would result in 14 correctly identified images and, for the rest of the database, incorrectly identified images but correctly identified objects. The values for p were chosen to display the best accuracy obtained with k colors. Thus, for k = 14, higher values of p did not change the accuracy. From Table 6.1 it is obvious that the algorithm is still able to distinguish between several images of each object. Quantizing to 14 colors results in recognition accuracy of 59%, in comparison to the expected 14% if the lighting variability is being correctly


Figure 6.5: Real and theoretical results for k = 1 to k = 25. Dotted line shows theoretical results for p = 2. Dashed line shows theoretical results for p = 3. Crosses show averaged real results for p = 2. Stars show averaged real results for p = 3.

compensated for. Note that accuracy is mediocre for small database sizes, but is better than color constancy would predict for larger databases. This problem is not present in the graph for Figure 6.3, as one of the criteria used to generate those combinations of images was that only one image of a given object be present in any given set. Figure 6.4 shows the results when smaller numbers of colors and smaller numbers of features kept begin compensating for lighting variation. Unfortunately, the level of quantization also interferes with object recognition, as the accuracy is well below theoretical even for small database sizes. To check the real data against the theoretical values in terms of image identification, we generated 10 sets of three objects each, and averaged the error for various values of k. These results are plotted in Figure 6.5. The theoretical and real results match very closely. No database had more than one image of a given object. Clearly,


the assumption of uniform distribution of colors and uniform distribution of objects will produce triples with different elements more often than triples with the same elements, and the real database will closely mimic the theoretical results. Here, we tested sets of different objects, where "different" was simply defined as different images, not as different objects.

6.2 Lighting Compensation

6.2.1 Workplace Lighting

To test the ability of quantization to compensate for lighting conditions, we constructed the workplace database from images of 9 different soda cans, again all oriented the same way, under 4 different lighting conditions [28]. The reference database consisted of feature vectors for all cans under one lighting condition, and the test database consisted of vectors for all cans under the other lighting conditions. Multiple sets of reference and test databases were constructed, one quantized to 256 colors (8 red, 8 green and 4 blue values), and the other to 14 colors (3 achromatic and 11 chromatic). For the 256 color set, the 14 highest histogram values were kept and the rest were set to zero, and for the 14 color set, all 14 values shown in Figure 6.2 were kept. The 512 color database contained histograms generated from uniform quantization of the RGB space to 8 bins along each axis, and the 64 color database was the result of uniform quantization of the RGB space to 4 bins along each axis. The 10 and 8 color databases were the result of the quantization based on the lighting shifts described in Chapter 4. From Table 6.2 we can see a dramatic improvement in object recognition accuracy. Accuracy is defined as correctly assigning an image of an object to that object's template image, rather than correctly correlating that image with itself. We obtained a high of 81% accuracy when the images were quantized to


Table 6.2: Object recognition accuracy generated from database of 9 soda cans under 4 different lighting conditions. "Ill." in the table refers to illuminant, and indicates the images that were used as the templates in the database, while the test data consisted of the remaining 27 images.

 Ill. | 512 Colors | 256 Colors | 64 Colors | 14 Colors | 10 Colors | 8 Colors
   1  |    33%     |    41%     |    52%    |    70%    |    59%    |   67%
   2  |    40%     |    52%     |    44%    |    74%    |    67%    |   78%
   3  |    40%     |    41%     |    44%    |    67%    |    63%    |   63%
   4  |    40%     |    37%     |    59%    |    81%    |    63%    |   74%

14 values before taking the histograms. With 256 values, the highest accuracy was only 52%, and with 512 colors, quantized to 8 by 8 by 8 in RGB space, the highest accuracy was only 40%. Even when the histograms are the result of quantization to 8 colors (6 chromatic and 2 achromatic), the accuracy is still far above that with 512 or 64 colors. For this database, quantization is extremely effective at compensating for lighting changes. Table 6.2 shows the results of quantization comparing, in some sense, apples and oranges. In this case, results from uniform quantization in RGB space are compared to results using categories that attempt to optimize bin location in some way. How does quantization affect accuracy when only a single type of quantization is used? The accuracy does seem to be increasing, as the values for 64 bins are higher than those for 512 bins, and usually higher than those for 256. Figure 6.6 shows the results of uniform quantization along the hue axis for different numbers of bins on the workplace database. For these experiments, all the pixels are classified according to their hue value. This figure shows a gradual decline in accuracy as the number of bins decreases. Between 30 and 50 bins we see a characteristic drop, followed by a peak attaining roughly the 256-bin accuracy, and then a final drop to random chance when all the pixels are placed in a single bin. This peak followed by a sharp knee at low numbers of bins is characteristic of every quantization method tested.



Figure 6.6: Uniform quantization results for the workplace database.

Figure 6.7 shows the abrupt drop visible in Figure 6.6 in more detail. Again, we show the accuracy of the workplace database as a function of the number of bins when uniform quantization of the hue axis is used. For this data, the knee between 3 bins (accuracy sweep) and 5 bins (uniform) is particularly striking. The accuracy drops off for the uniform case between 10 and 35 bins, and then increases to roughly the same level as the peak just before the knee. This general trend is characteristic of all the quantization methods applied to this data. An overall level of accuracy is set by the accuracy for 256 bins along the hue axis. As the hue axis is quantized using one or another of these methods, to larger and larger bin sizes and fewer and fewer bins, the accuracy oscillates around this value. Somewhere between 30 and 50 bins, there is a slight drop in accuracy from this level. Immediately before the final drop, there is usually a slight peak, attaining the original or slightly higher than the


Figure 6.7: Example of the characteristic knee in the accuracy curve for real data.

original accuracy. As the standard deviations show, the actual value around which the accuracy oscillates varies from one set of training data to another.

This database was also used to compare the accuracy sweep method of bin location optimization to the uniform quantization method, to determine whether or not the overfitting shown in Chapter 5 appears under real world conditions. Figure 6.8 shows the results for training and testing various combinations of data under various methods. Plot (a) shows the results when lighting condition one is used as the training data, for both methods, both for testing and training cases. Plot (a) also shows the results for up to 256 bins. The remaining plots, (b) through (d), show results up to 50 bins, for the cases when the other three lighting conditions are used as training data. Plot (b) corresponds to lighting condition two, plot (c) corresponds to lighting condition three, and plot (d) corresponds to lighting condition four. Results are similar in all cases, with the optimized training results producing higher accuracy than


Figure 6.8: Comparison of uniform and accuracy sweep methods on the workplace database.

the uniform training results. However, under test conditions, the uniform method reliably outperforms the optimization method. Again, as in Section 5.2 of Chapter 5, the accuracy sweep optimization method is overfitting the training data, producing bin locations that do not generalize well.

6.2.2 Household Lighting

The database used in Section 6.1 was also used to determine the efficacy of the algorithm in the face of more extreme lighting variation. This second database was generated to deliberately test the more difficult case of common household lighting, as opposed to common workplace lighting. The household lighting represented in this database consists of various combinations of early morning sun (shown in Chapter 4 to be substantially different chromatically from midday sun and far more chromatically


Table 6.3: Accuracy on household database.

 Ill. | 512 Colors | 64 Colors | 14 Colors | 10 Colors | 8 Colors
  1   |    32%     |    30%    |    27%    |    35%    |   35%
  2   |    18%     |    18%    |    37%    |    25%    |   28%
  3   |    31%     |    34%    |    27%    |    37%    |   44%
  4   |    38%     |    40%    |    37%    |    41%    |   46%
  5   |    36%     |    29%    |    30%    |    32%    |   32%
  6   |    23%     |    23%    |    40%    |    40%    |   39%
  7   |    16%     |    15%    |    32%    |    33%    |   42%
  8   |    27%     |    31%    |    26%    |    39%    |   34%
 Mean |    28%     |    29%    |    32%    |    35%    |   37%

intense), halogen light, and incandescent light with an off-white lampshade. The database includes a wide variety of illuminant colors, as well as a wide variety of illuminant intensities. Although there are still relatively few cans, the magnitude of the lighting changes makes the household database suitable for testing the limits of the algorithm.

Results followed the same trend as for the workplace database. Given the high degree of variation between the lighting conditions, it is not surprising that the overall accuracy is lower than for the workplace database. However, it is surprising that the lighting shift quantization scheme produces substantially higher results than the other methods. The lighting shift method was derived from measurements of lighting shifts under only two illuminants: daylight and fluorescent light. This database was taken under early morning daylight, halogen light, and incandescent light. It is unexpected that the broad eight categories suggested by the laboratory lighting conditions would work so well on this data. The average accuracy, as shown in Table 6.3, increases by 9% as the number of bins is decreased from 512 in RGB space to 8 using the lighting shift data. Again, we see an increase in accuracy when the number of bins is reduced.


Table 6.4: Accuracy on full database for lighting shift.

 Ill. | 512 Colors | 64 Colors | 14 Colors | 10 Colors | 8 Colors
  1   |   25.33%   |  10.00%   |  24.95%   |  27.66%   |  27.97%
  2   |   28.65%   |  13.95%   |  25.76%   |  26.65%   |  27.42%
  3   |   21.28%   |  11.51%   |  17.82%   |  20.09%   |  21.55%
  4   |   24.11%   |  13.76%   |  24.52%   |  28.46%   |  27.61%
 Mean |   24.84%   |  12.31%   |  23.26%   |  25.72%   |  26.14%

6.2.3 Full Database

Finally, a database of 86 cans was generated. This database again reverted to common laboratory lighting, with images taken in four different rooms under four different lighting conditions. Each can was imaged in 8 orientations, at intervals of roughly 45 degrees. The cans were placed in a small portable refrigerator, and images were taken at roughly the height of the anticipated finished robot. The upper shelf was somewhat flexible, allowing the cans to tilt at different angles. All these effects combine to produce a database that is as representative as possible of the conditions a robot is likely to encounter. The full list of soda cans in the database is contained in Appendix B, along with a typical sample image from the database.

Figure 6.9 shows the results of uniform quantization of the hue axis. Again, we see the same trend: a slow decrease as the number of bins decreases, then a short peak before the final drop. The peak in this case is spread across the region from 10 bins to 25 bins, in the form of multiple peaks. Again, with the uniform RGB compared to the optimized quantization schemes, we see the accuracy decreasing for 64 colors, then increasing for 14 colors with tree quantization, and increasing still further for the lighting shift-based quantization. The accuracy on this database is lower than that of the other databases because this database contains 6 times as many objects.

For the data shown, in each case all the orientations for a given lighting condition are used as training data, and all the


Figure 6.9: Full database accuracy vs. number of bins for uniform quantization.

orientations under the different lighting conditions are used as testing data. If we instead take a single orientation under a single lighting condition as training data, we obtain the following results. When the training data is from lighting condition 1, we get 41% of the objects under lighting condition 1 correct, 21% of the objects from lighting condition 2 correct, 15% of the objects from lighting condition 3, and 24% of the objects from lighting condition 4 for quantization to the 10-color space defined by the lighting shift data. These numbers are substantially lower than the results in Table 6.4. Obviously, a single orientation is insufficient to accurately describe these objects. Although the original algorithm showed strong robustness to orientation change [57], perhaps this database changes more from one orientation to another.
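Uniform quantization of the hue axis, as used for these curves, reduces to a one-dimensional histogram over the hue values of all pixels. A minimal sketch, assuming RGB input scaled to [0, 1] and the standard-library `colorsys` conversion (the function name is illustrative):

```python
import numpy as np
import colorsys

def hue_histogram(rgb_image, n_bins):
    """Histogram of an RGB image (H x W x 3, values in [0, 1]) after
    uniform quantization of the hue axis into n_bins bins."""
    pixels = rgb_image.reshape(-1, 3)
    hues = np.array([colorsys.rgb_to_hsv(r, g, b)[0] for r, g, b in pixels])
    hist, _ = np.histogram(hues, bins=n_bins, range=(0.0, 1.0))
    return hist / hist.sum()   # normalize so images of any size compare
```

Sweeping `n_bins` from 256 down to 1 and re-running recognition at each setting reproduces the kind of accuracy-versus-bins curve shown in Figure 6.9.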


Figure 6.10: Localized database accuracy vs. number of bins for uniform quantization.

6.3 Localized Hue Data

As in Section 5.2, results were gathered for a database consisting of objects whose primary colors were in a single region of the spectrum. One would anticipate that the accuracy sweep method's performance would be much higher than the uniform method's performance. In fact, as shown in Figure 6.10, the accuracy sweep method does produce better accuracy on training data, as it places the bins to optimize the accuracy. However, on test data, the two methods perform equally well. This indicates that the optimization produced by the accuracy sweep method is again over-fitting the training data, and not generalizing to the test case.
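One plausible reading of the accuracy sweep is a greedy search that moves each interior bin edge to whatever position maximizes accuracy on the training set, which is exactly the mechanism that invites over-fitting. The sketch below is an illustrative reading, not the dissertation's exact procedure; `accuracy_fn` stands in for a full train-and-score pass over the training histograms.

```python
import numpy as np

def accuracy_sweep(edges, accuracy_fn, candidates=16):
    """Greedy sweep: move each interior bin edge, in turn, to the candidate
    position that maximizes training accuracy. accuracy_fn(edges) -> float."""
    edges = list(edges)
    for i in range(1, len(edges) - 1):
        lo, hi = edges[i - 1], edges[i + 1]
        best = max(np.linspace(lo, hi, candidates + 2)[1:-1],
                   key=lambda e: accuracy_fn(edges[:i] + [e] + edges[i + 1:]))
        edges[i] = best
    return edges
```

Because each edge is placed to please the training data alone, nothing constrains the resulting bins to suit unseen lighting conditions or orientations.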


Table 6.5: Accuracy on full database with orientation varied and lighting constant.

 Or.  | 512 Colors | 64 Colors | 14 Colors | 10 Colors | 8 Colors
  1   |   49.19%   |  32.21%   |   2.13%   |  26.26%   |  26.82%
  2   |   53.26%   |  36.59%   |   2.17%   |  29.87%   |  28.25%
  3   |   51.62%   |  37.47%   |   2.83%   |  21.21%   |  20.98%
  4   |   49.29%   |  35.28%   |   3.48%   |  27.12%   |  26.76%
 Mean |   50.84%   |  35.39%   |   2.65%   |  26.11%   |  25.70%

6.4 Orientation vs. Lighting Change

In order to analyze the effects of quantization on the accuracy of this method, we must also investigate the effects of quantization when orientation shifts occur. Table 6.5 shows the corresponding results to Table 6.4 with orientation change rather than lighting change.

Figure 6.11 shows the accuracy for the two quantization methods tested in Chapter 5. The accuracy sweep method (red) again produces higher accuracy than the uniform quantization method (blue) on training data. Perhaps most interesting is the location of the knee of the curves. In both cases, the knee lies between 8 and 16 bins (marked in black vertical dashed lines).

Figure 6.12 shows the results when different numbers of orientations are included in the training (known) data set for the hue-localized database. Obviously, as more different orientations are included in the training data, the likelihood increases that the training database will include similar histograms to the testing database. The different lines correspond to the accuracy for different numbers of orientations in the training database. The line corresponding to "One" (blue) shows the average accuracy over all 8 orientations when a single orientation is used to generate the training database and the remaining 7 orientations are used as testing data. The red line, corresponding to "Two", shows the average accuracy when two maximally


Figure 6.11: Comparison of accuracy sweep (red) and uniform (blue) quantization methods.

different orientations are used as training data, with the remaining 6 used as testing data. While this line is almost the same shape as the original, it is noticeably higher. The black line shows the results when half of the orientations, at 90° intervals, are used as training, and the other half are used as testing. These results are averaged over the possible cases. Using more orientations in the database does dramatically improve the accuracy. Again the knee of the curves lies between 8 and 16 bins.

Obviously, the effects of orientation are far greater than anticipated. The results of changing only the orientation (black line), only the lighting condition (blue line), and both at once (red line) are shown in Figure 6.13. Clearly, changing the orientation has a significant effect on accuracy (this corresponds to the case where a single orientation is used in the training database). Both lines were generated using uniform quantization on the subset of the full database consisting only of green cans. The red


Figure 6.12: Comparison of different numbers of orientations with respect to accuracy.

line indicates the accuracy for a given number of bins averaged over the 8 different orientation cases, with one standard deviation to each side shown as magenta dots. The blue line shows the accuracy averaged over the 4 different lighting conditions, with one standard deviation to each side plotted in cyan dots. The lighting changes produce a decrease in accuracy that is only slightly less than the decrease in accuracy produced by orientation changes! This is likely to be caused by the related nature of the objects in the database. In Swain and Ballard's [57] original tests, their database was made up of arbitrary objects: detergent, clothing, and food. Our database is made up of different examples of very similar objects. Furthermore, each can is required by law to have nutritional information visible. Thus, for at least one of the eight orientations, the logo and colors by which we identify the cans will not be visible. Instead, only the base color of the can and a contrasting color (white for dark cans and black for light ones) will be visible and used to generate the histogram. In


Figure 6.13: Comparison of lighting condition change to orientation change.

addition, certain brands are very non-uniform in color as the can is rotated. Minute Maid™ juices are primarily colored on one side and black and white on the other.

When both lighting and orientation are changed, the result is not multiplicative or additive. The resulting accuracy is greater than the result of multiplying the two accuracies. Changing the lighting condition alone also has a significant impact on the accuracy. However, the results of changing both lighting and orientation simultaneously are not as bad as might be expected from the individual results. Instead of an additive or multiplicative effect, only a slight decrease in accuracy is noted from the lower of the two individual cases to the combined case.


Figure 6.14: Images and 2-D histograms using [r, g] chromaticities.

6.5 Chromaticity Results

Figure 6.14 shows the effects of normalization on two images from the household lighting database. These images were normalized by the pixel intensity as follows:

    X^k_{i,j} = x^k_{i,j} / Σ_{l=1}^{m} x^l_{i,j}        (6.1)

where X^k_{i,j} is the normalized value at location [i, j] in the image of pixel element x^k_{i,j}. The m color channels are represented by k. In this case, m = 3, consisting of the red, green, and blue channels. The net effect of this process is brightness constancy. Because of the redundancy associated with normalizing by the summation, it is usual to keep only the red and green elements. Images of different intensities and no hue differences are normalized to the same pixel values with this process. This


can be thought of as an intermediate color constancy, as hue values are not affected, while brightness values are.

Results with this method on the household database were better than those without any color constancy at all. Figure 6.15 compares this to several other possibilities. The accuracy using the alternate two-dimensional chromaticity method advocated by Finlayson [22] produced substantially lower results (average of 11.44%) than the other methods. In this method, the chromaticities are generated by normalizing the red and green channel values by the blue value. The results for the case when each pixel channel value is normalized by the sum of the values for that pixel is shown in red. The black line indicates the accuracy derived by uniformly quantizing based on the hue value in HSV space, and the blue line shows the results when the RGB space is quantized uniformly on each axis. Normalizing by intensity generally outperformed everything, including the uni-dimensional hue quantization. The hue and RGB uniform quantizations produced similar results, with the hue quantization slightly outperforming the RGB quantization. This is intuitively correct, given that a substantial part of the noise in this database is related to the relative brightness of the images. Using the hue axis and eliminating the value and saturation axes gives the hue quantization a slight edge. Once again, the knee of each curve lies between 8 and 16 bins.

Converting to chromaticity coordinates by normalizing each pixel value is a computationally negligible step, compared to most color constancy algorithms. Only three additions and a single division are needed for each pixel to move to the two-dimensional space, for a total of 3N additions and N divisions for an N-pixel image. It provides some additional robustness to lighting change, and shows the same characteristic peak and drop at low numbers of bins.
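The [r, g] conversion of Equation 6.1 can be sketched in a few lines. The function name is illustrative, and the guard against all-zero pixels is an added assumption (a black pixel would otherwise divide by zero):

```python
import numpy as np

def rg_chromaticity(image):
    """Normalize each pixel by its channel sum (Eq. 6.1) and keep
    only the red and green components; the blue one is redundant."""
    image = image.astype(float)
    total = image.sum(axis=2, keepdims=True)
    total[total == 0] = 1.0            # avoid dividing by zero on black pixels
    chrom = image / total
    return chrom[:, :, :2]             # [r, g] chromaticities
```

Scaling an image by any overall brightness factor leaves its [r, g] chromaticities unchanged, which is the brightness constancy described above.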


Figure 6.15: Comparison of uniform quantization methods.

6.6 Retinex

To test the effects of hue constancy, the final database was pre-processed with Woodell et al.'s MSRCR method [50]. Because we were interested only in color constancy, not in dynamic range compression, we used only one surround, as in the traditional retinex method, instead of the three they recommend. However, we did implement the color restoration as indicated in the patent, with α = 128 and β = 48. Also as indicated in the patent, to perform color constancy we set the single surround constant to half of the largest image dimension. Appendix B shows examples of the original images from this database and the results of the retinex pre-processing.

Results from lighting shift quantization in conjunction with the retinex preprocessing showed an increase in accuracy when the lighting shift quantization scheme is altered to take into account the washed-out character of the retinex pre-processed


Table 6.6: Accuracy on full database for lighting shift quantization methods.

 Ill. | 8 Colors | 8 Colors Adjusted | 10 Colors | 10 Colors Adjusted

 Retinex Pre-Processed Input
  1   |  14.39%  |      30.51%       |  14.90%   |      32.27%
  2   |  13.03%  |      35.09%       |  15.87%   |      36.05%
  3   |  13.90%  |      30.40%       |  13.75%   |      32.54%
  4   |  15.04%  |      36.03%       |  16.11%   |      36.24%
 Mean |  14.09%  |      33.01%       |  15.16%   |      34.28%

 Plain Input
  1   |  26.82%  |      12.88%       |  26.26%   |      14.24%
  2   |  28.25%  |      15.42%       |  29.87%   |      16.68%
  3   |  20.98%  |       8.96%       |  21.21%   |       9.93%
  4   |  26.76%  |      13.46%       |  27.12%   |      15.09%
 Mean |  25.70%  |      12.68%       |  26.11%   |      13.99%

images. Generally, the results with the warping adjusted to allow for the washed-out nature of the retinex images were better than the results with no adjustment to the lighting shift quantization scheme, by as much as 17%. This is expected because without adjusting the warping, almost all the pixels are placed in the white/silver bin, and very few are placed in chromatic bins. The adjustment to the lighting shift quantization scheme consisted of increasing the lightness value used to threshold the white/silver category and increasing the saturation used to threshold the black category. In addition, for the 10-color shifting, lightness was also adjusted for the pink category threshold. Table 6.6 compares the results of the two different lighting-shift based methods on input data with and without retinex pre-processing. Clearly the adjusted method outperforms the original method on the retinex-processed data, while the original method outperforms the adjusted method on the unprocessed data.

Figures 6.16 and 6.17 show the histograms in the full database with (Figure 6.17) and without (Figure 6.16) the retinex color constancy pre-processing. Low values are represented in blue, and the highest value is represented in red. Both images are


Figure 6.16: Histogram results for the full database, without color constancy.

scaled so that the largest value corresponds to the same red. Clearly, from the generally lighter blue of the retinex-processed histograms, the histograms with the color constancy pre-processing are much less peaky and much more spread out along the hue axis. This corresponds well to the changes in the images, as generally lighter images will correspond to a larger range of hue values. There are slight changes in peak locations between the retinex pre-processed histograms and the unprocessed histograms, indicating that the algorithm is performing both brightness and hue constancy. These changes do appear to increase accuracy. Figure 6.18 shows the results of uniform quantization for different numbers of bins with and without retinex pre-processing. The red line shows the mean of the results with retinex pre-processing. Each value contributing to the mean consists of the results when all orientations under a single lighting condition are used as training data and the remainder of the data are used for testing. The standard deviation is shown with a dotted line in the same


Figure 6.17: Histogram results for the full database, with color constancy.

color. The blue lines show the results without the retinex pre-processing. Clearly, the color constancy processing is buying an increase in accuracy. However, it took over a day to perform the retinex algorithm using the "conv" function in Matlab® on all 2,626 images in the database, so the computational cost may outweigh the increase in accuracy. The value at 256 hue levels was a mean of 0.3756 for the retinex pre-processing and a mean of 0.2569 for the database without pre-processing. With uniform quantization on the hue axis, the knee of each curve lies between 5 and 15 bins along the hue axis. This corresponds well to the results in Chapters 3 and 4.

Figure 6.19 compares the results of the two different color constancy approaches. Here, the hue-limited database of green cans is used to show how accuracy changes with color constancy method. The black line shows the results using the hue-only approach with uniform quantization. These results, with the [r, g] normalization of all three values used to calculate the hue value, are almost identical to the results without


Figure 6.18: Comparison between uniform quantization results on full database for data with and without retinex pre-processing.

any color constancy processing. The [r, g] conversion is only performing brightness constancy, so there should be no change in the hue results for this case. The red line shows the results using two-dimensional chromaticity histograms. Clearly, brightness constancy without hue constancy improves the accuracy substantially. The increase in accuracy as the number of bins increases does not continue; the accuracy goes up to 0.35 at 64 bins, and does not substantially exceed that even for 144 bins. The blue line shows the results with the retinex pre-processing. In each case, the solid lines represent the average accuracy over different combinations of test and training data, and the dotted lines represent a single standard deviation to each side.

Results when retinex pre-processing was used in conjunction with the 512-color uniform RGB quantization, the 64-color uniform RGB quantization, and the 14-color tree quantization are shown in Table 6.7. Quantizing to 512 or 64 colors produced


Figure 6.19: Comparison between results with different color constancy methods.

only a modest gain over the results without the retinex pre-processing in Table 6.4. Quantizing to 14 colors with the tree method produced substantially worse results with the retinex pre-processing than without it.

When the retinex algorithm is implemented on real data, without a color patch in the image from which to judge the illuminant, the washed-out character of the image is the consequence of an offset of the majority of points in the color space towards white. Therefore, when color constancy is implemented, the usual distribution of colors (spread in a cloud along the black-white axis) is transformed into a teardrop, with the point toward black and the base of the teardrop towards white. Some quantization schemes will produce better results with this data, and others will not. Unfortunately, the retinex color constant pre-processing imposes a substantial computational burden.
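A minimal single-surround retinex with a color-restoration step might look like the following. This is a sketch under stated assumptions, not the patented MSRCR implementation: the surround is a truncated Gaussian, the restoration term uses the α = 128, β = 48 form quoted above, and the gain/offset constants of the full method are omitted.

```python
import numpy as np

def gaussian_blur(channel, sigma):
    """Separable Gaussian surround, truncated so the kernel fits the image."""
    radius = int(min(3 * sigma, (min(channel.shape) - 1) // 2))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-x**2 / (2.0 * sigma**2))
    kernel /= kernel.sum()
    rows = np.apply_along_axis(np.convolve, 1, channel, kernel, mode='same')
    return np.apply_along_axis(np.convolve, 0, rows, kernel, mode='same')

def single_surround_retinex(image, alpha=128.0, beta=48.0):
    """Single-surround retinex with color restoration (a sketch).
    The surround constant is half the largest image dimension, as in the text."""
    img = image.astype(float) + 1.0                  # avoid log(0)
    sigma = max(img.shape[:2]) / 2.0
    total = img.sum(axis=2)
    out = np.empty_like(img)
    for c in range(3):
        surround = gaussian_blur(img[:, :, c], sigma)
        retinex = np.log(img[:, :, c]) - np.log(surround)
        restore = beta * (np.log(alpha * img[:, :, c]) - np.log(total))
        out[:, :, c] = restore * retinex
    return out
```

The per-channel Gaussian convolution is the expensive part, which is consistent with the day-long Matlab run reported above.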


Table 6.7: Accuracy on full database with retinex pre-processing.

 Ill. | 512 Colors | 64 Colors | 14 Colors
  1   |   22.32%   |  15.45%   |  12.12%
  2   |   32.96%   |  19.47%   |  14.86%
  3   |   27.90%   |  15.68%   |  11.41%
  4   |   28.95%   |  18.25%   |  15.09%
 Mean |   28.03%   |  17.21%   |  16.71%

6.7 Summary

Experiments with real data confirm the theoretical result that, for a given image database, extreme quantization has a small effect on expected accuracy. More values are needed than predicted by theory and simulations, but similar accuracy can be obtained with an order of magnitude fewer values than have been used previously. Color proportion was the only feature used to identify these objects, ignoring other cues used by humans such as shape and texture. Quantizing this single feature to only 14 possible colors produced little degradation in accuracy if enough histogram values were retained. For larger databases, accuracy remained above 90% while reducing the number of colors to 14 and the number of histogram values to 8. Quantization did not enable the algorithm to compensate fully for lighting variation when few histogram values were kept. Object recognition accuracy (as opposed to image recognition accuracy) did not decrease as far as it would have for large numbers of images, nor was it as high as it would have been for small numbers of images if all objects were identified correctly.

Real data, when subject to noise from illumination variation, not only proved that quantization can produce a more efficient system, but also showed that quantization can actually increase accuracy when used in appropriate circumstances. In other words, if a given system will be used under conditions where the lighting's


exact composition is unpredictable, but its variation is within the tolerances of the quantization, the system will be able to use the color histogram indexing algorithm and take advantage of all its associated robustness without sacrificing its processor and storage advantages. Quantization can help reduce object recognition error by reducing small amounts of noise due to illumination variation.

This chapter has answered three remaining questions. First, how many categories are sufficient to retain a given accuracy? In general, no more than 50 divisions of the hue axis will be required to obtain accuracy commensurate with accuracy when all 256 divisions are used. Frequently, a much smaller number is sufficient, with accuracies remaining close to the highest accuracy even for as few as 8 bins. If uniform quantization is used, fewer bins (between 8 and 16) can produce lower error than more bins.

Second, how many categories are sufficient to describe a given number of objects? The number of categories seems to be relatively invariant to the number of objects. For both large and small databases, between 8 and 20 categories seems to be sufficient. Additional bins do not seem to increase or decrease accuracy, while fewer than 8 bins generally degrades accuracy.

Third, does quantization reduce the effects of lighting change on accuracy? The answer to this question is a qualified "yes". For some database and lighting conditions, yes, quantization can help reduce the effects of lighting change on accuracy. As lighting changes, accuracy decreases. In some cases, as the number of bins decreases to between 8 and 20, accuracy increases. However, in many cases accuracy does not increase. Instead, accuracy simply remains low before finally dropping to random chance in the single-bin case. In all cases, quantization results in a much more efficient system.

Throughout this chapter, we have seen again and again that the knee of the accuracy curve occurs between 8 and 16 bins. Even when the curve is gradually


sloping towards lower and lower accuracies as the number of bins decreases, the accuracy drops off sharply beyond the 8- to 16-bin region. The levels of the curves above the knee vary, depending on the database and the type of quantization used, but the number of features at which the accuracy drops off is relatively constant. Furthermore, that number of features is roughly the same as the number of features used by humans to represent color in memory, and the number of features found by other researchers to be necessary in transformed spaces.


CHAPTER 7
ROBOT PROTOTYPES

The complete prototype system consisted of a camera connected via an S-video cable to a workstation running Linux. The computer captured images from the camera via an OmniMedia Sequence P1S framegrabber. These images were cropped to a specified range of pixels and quantized. The computer generated the corresponding histogram, and then matched that histogram to the histograms in its database. Finally, the soda name corresponding to the closest database histogram was displayed on the screen. Various combinations of soda cans were used to test the system, beginning with 6 cans and gradually expanding to our current total of 14. Two cameras were used, a Sony TRV-310 and a Sharp VL-H860. As long as the reference database was generated with the same camera used for testing, the results were equivalent.

The camera and the can were assumed to be arranged such that the can occupied a given region of the image. Each 640 by 480 image was cropped to a rectangular region of columns 283 to 389 and rows 170 to 343. When the database was generated, each can was placed in the same location with respect to the camera. Figure 7.1 shows an image taken with this arrangement, with the cropped region used to generate the histogram outlined in red, and the remainder of the image shaded. This arrangement allows some freedom on the part of the tester with regard to can placement: roughly half an inch to either side and an inch or more of forward/backward movement.

HLS space was used for the quantization step. To simplify visualization of the results, the hue axis was offset by one-sixth. This moved red to 60 degrees and put pink at both zero and one. Quantization was done using the data-independent tree method described in Section 4.1.4, with the 14 colors illustrated in Figure 4.5.
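The capture-to-name loop described above can be sketched as follows. The crop indices are those given in the text; `quantize_pixel` stands in for the tree quantizer of Section 4.1.4 (not reproduced here), nearest-histogram matching by intersection is an assumption, and the counts are left unnormalized since every cropped image has the same number of pixels.

```python
import numpy as np

def identify_can(frame, database, quantize_pixel, n_colors=14):
    """Crop a 640x480 frame to the fixed can region, quantize each pixel,
    histogram the color codes, and return the name of the closest database
    entry. `database` maps soda names to reference histograms."""
    region = frame[170:344, 283:390]               # rows 170-343, cols 283-389
    codes = np.apply_along_axis(quantize_pixel, 2, region)
    hist = np.bincount(codes.ravel(), minlength=n_colors).astype(float)
    best = max(database,
               key=lambda name: np.minimum(hist, database[name]).sum())
    return best
```

A real deployment would replace `frame` with a framegrabber capture and print "This soda is …" with the returned name.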


Figure 7.1: Sample image of input to system. Region used to create histogram is outlined in red; remainder of image is shadowed.

7.1 Off-Line Implementation

The first implementation did not operate in real time. Each image was captured by one program and stored to disk as a file. A second program performed the object recognition component of the system, reading in the information from the image file and displaying the name of the soda corresponding to the matching histogram. For example, if a can of Country-Time Lemonade™ was placed in the appropriate location in front of the camera, the user would run both programs and the computer would respond "This soda is Country-Time Lemonade."

The database consisted of thirty-six histograms representing six cans. Each can was represented by histograms generated from six images, taken in two different orientations (pull-tab towards the camera and pull-tab away from the camera) and three different lighting conditions (overhead fluorescent lighting on and blinds closed, overhead fluorescent lighting on and blinds open, and overhead fluorescent lighting off and blinds open). Each cropped image was quantized to 14 values (n above equal


Figure 7.2: Sample histograms from database.

to 11). With the same number of pixels, the histograms did not need to be normalized, so each value in a given histogram was the number of pixels that fell into that bin during quantization. The cans used in this initial implementation were Welch's Grape Drink™, Eckerds Orange Soda™, Country-Time Lemonade™, Coca-Cola™, Sprite™ and Canada Dry Ginger Ale™. Representative histograms from the database are shown in Figure 7.2.

The final database was generated in early afternoon and tested in the early evening. The system was robust to combinations of lighting represented in the database. For example, the database contained images under fluorescent lighting with the blinds fully open and fully closed. The system accurately identified cans regardless of whether the blinds were partially open, fully open, or closed. The system also correctly recognized cans upside down or on their sides. When two cans were placed side by side in the imaging area, the system reliably chose one of the two cans as the correct can. The system was less reliable when cans were placed so their bottom or top faced the camera, but still managed to correctly identify the can


roughly half the time. However, because the white balance on the camera was on, changing the background reduced the system's accuracy.

The system perfectly identified five of the six cans in the database. However, Canada Dry Ginger Ale™ proved exceptionally difficult to recognize. The default orientations' images contained no green and few gold pixels. Thus, when green or gold was the predominant color the system routinely misidentified the can as Sprite™. Only the exact orientations included in the database were correctly identified. The other cans were always identified correctly, regardless of orientation.

7.2 Real-Time Implementation

7.2.1 First Implementation

A second, real-time, system also produced satisfactory results. For this implementation, only seven chromatic bins were used, although the three achromatic bins remained the same. The seven bins, chosen by hand, were pink, red, orange, yellow, green, blue, and purple. All testing and database generation were done under as close to a single lighting condition as possible (blinds fully closed). Initially 8 cans were used with the two orientations as defined in Section 7.1. Unfortunately, the system did not perform as well as before, even when no people were allowed into the image area while data was being taken.

7.2.2 Results on First Implementation

Results with two test orientations are shown in Table 7.1. The test orientations used were the midpoints between the reference database orientations, corresponding to 90 and 270 degrees with respect to the pull tab (reference database orientations corresponded to 0 and 180 degrees). While the system identified Eckerds Orange Soda™ and Welch's Grape Drink™ without error, the results on the remaining sodas


Table 7.1: Number correct when identification of sodas is tested using orientation midway between those in the database. Eight cans with two orientations in database, fourteen colors. The numbers 1 and 2 correspond to orientations; Err indicates what was chosen instead of the correct soda. Codes are: Sch = Schweppes™, CT = Country-Time Lemonade™, EO = Eckerds Orange Drink™, WG = Welch's Grape Drink™, C = Coca-Cola™, D7 = Diet 7-Up™, CD = Canada Dry™, and 7U = 7-Up™.

 Soda |   1   | Err |   2   | Err
 Sch  | 21/30 | 7U  | 21/30 | 7U
 CT   | 27/30 | CD  |  0/30 | CD
 EO   | 30/30 |     | 30/30 |
 WG   | 30/30 |     | 30/30 |
 C    | 15/30 | EO  |  0/30 | EO
 D7   | 13/30 | CD  |  0/30 | CD (21/30), 7U (9/30)
 CD   |  1/30 | D7  | 13/30 | D7
 7U   | 28/30 | Sch | 13/30 | Sch

were remarkably poor. A few seconds delay was introduced to allow the camera time to adjust the white balance, without improvement.

Accuracy improved dramatically when four orientations were included in the database. Again, the intermediate orientations were used to test the database. The four orientations included were at 0, 90, 180 and 270 degrees with respect to the pull tab. The test orientations were at 45 degrees from each of these. One additional chromatic category was added to compensate for consistently large values in the blue bin over all cans. The results are shown in Table 7.2. Each number in the table corresponds to the results over 50 trials without changing the environment or the camera. Obviously Schweppes™, Diet 7-Up™, and 7-Up™ had the worst accuracy, and Schweppes™, Diet 7-Up™, 7-Up™, and Canada Dry™ were responsible for the most errors.

Diet 7-Up™ and 7-Up™ were removed from the database for the next trial, replaced by Pepsi™, Slice™, Mountain Dew™ and Lipton Diet Lemon Brisk Iced Tea™. The substitution was intended to jointly reduce errors resulting from


Table 7.2: Number correct when identification of sodas is tested using orientation midway between those in the database. Eight cans with four orientations in the database, eleven colors. The numbers 1, 2, 3, and 4 correspond to orientations; Err indicates what was chosen instead of the correct soda. Codes are: Sch = Schweppes™, CT = Country-Time Lemonade™, EO = Eckerds Orange Drink™, WG = Welch's Grape Drink™, C = Coca-Cola™, D7 = Diet 7-Up™, CD = Canada Dry™, and 7U = 7-Up™.

 Soda |  1   | Err |  2   | Err |  3   | Err                |  4   | Err
 Sch  | 20%  | 7U  | 98%  | 7U  | 100% |                    | 54%  | 7U
 CT   | 100% |     | 100% |     | 100% |                    | 100% |
 EO   | 100% |     | 100% |     | 100% |                    | 100% |
 WG   | 100% |     | 100% |     | 100% |                    | 100% |
 C    | 100% |     | 100% |     | 100% |                    | 100% |
 D7   | 100% |     | 58%  | CD  | 46%  | CD                 | 82%  | CD
 CD   | 100% |     | 100% |     | 62%  | D7                 | 100% |
 7U   | 44%  | D7  | 10%  | Sch | 2%   | D7 (4%), Sch (94%) | 0%   | Sch

misclassification of green pixels and determine whether the blue category was subject to the same problems.

Results on this new arrangement are shown in Table 7.3, with an average accuracy of 86%. Again, Canada Dry™ (now with Mountain Dew™) were responsible for many errors. Other than Mountain Dew™ and Canada Dry™, Eckerds Orange Drink™ produced the only major error. This occurred when the nutritional information was facing the camera. The orange from this soda and the red from Coca-Cola™ were largely in the same histogram bin, so when no blue was present, the system incorrectly identified the can. When Canada Dry™ and Mountain Dew™ were replaced by 7-Up™, Diet Pepsi™, and Wild Cherry Pepsi™, the results were unreliable.

7.2.3 Second Implementation

A third version of the system was created for public display. This system was also real-time, with a completely automatic display function. A one second delay between


Table 7.3: Number correct when identification of sodas is tested using orientation midway between those in the database. Ten cans with four orientations in the database, eleven colors. The numbers 1, 2, 3, and 4 correspond to orientations; Err indicates what was chosen instead of the correct soda. Codes are: Sch = Schweppes™, CT = Country-Time Lemonade™, EO = Eckerds Orange Drink™, WG = Welch's Grape Drink™, C = Coca-Cola™, CD = Canada Dry™, MD = Mountain Dew™, P = Pepsi™, SL = Slice™, and LDL = Lipton Diet Lemon Brisk Iced Tea™.

 Soda |  1   | Err |  2   | Err |  3   | Err |  4   | Err
 Sch  | 100% |     | 94%  | MD  | 100% |     | 100% |
 CT   | 100% |     | 100% |     | 100% |     | 100% |
 EO   | 100% |     | 100% |     | 16%  | C   | 100% |
 WG   | 100% |     | 100% |     | 100% |     | 100% |
 C    | 100% |     | 100% |     | 100% |     | 100% |
 CD   | 58%  | MD  | 10%  | MD  | 100% |     | 0%   | MD
 MD   | 16%  | CD  | 100% |     | 68%  | CD  | 0%   | CD
 P    | 100% |     | 100% |     | 98%  | WG  | 100% |
 SL   | 100% |     | 100% |     | 100% |     | 100% |
 LDL  | 98%  | P   | 100% |     | 100% |     | 100% |

each reading made viewing easier. This system used 16 histogram bins, of which 3 were achromatic and 13 were chromatic. At first only the eleven cans from the final second implementation were used, with results shown in Table 7.4. The error was significantly reduced, with Eckerds Orange nutritional information producing the only major remaining error. The average accuracy increased to 95%.

Finally, Eckerds was replaced with Minute Maid Orange Soda™, and Sprite™ was added. Results on the Minute Maid soda were far below those for the Eckerds soda (…% in one orientation and 66% in the other), both mistaken for Wild Cherry Pepsi™. Average accuracy on the Minute Maid was 67%, compared to 78% for the Eckerds soda. Minor additional errors were introduced for Wild Cherry Pepsi™ and Coca-Cola™, resulting in an increase in error of 1-2% each. Sprite™ was always recognized correctly, and introduced no new errors.

A new database was generated with the white balance off and fixed background. With the blinds closed but some light leaking in, the database was robust to outdoor


Table 7.4: Number correct when identification of sodas is tested using orientations midway between those in the database. Eleven cans with four orientations in the database, sixteen colors. The columns 1, 2, 3, and 4 correspond to orientations; entries in parentheses indicate what was chosen instead of the correct soda. Codes are: Sch = Schweppes™, CT = Country-Time Lemonade™, EO = Eckerds Orange Drink™, WG = Welch's Grape Drink™, C = Coca-Cola™, P = Pepsi™, SL = Slice™, LDL = Lipton Diet Lemon Brisk Iced Tea™, 7U = 7-Up™, WC = Wild Cherry Pepsi™, and DP = Diet Pepsi™.

Soda | 1         | 2        | 3        | 4
Sch  | 100%      | 100%     | 100%     | 100%
CT   | 100%      | 100%     | 100%     | 100%
EO   | 100%      | 100%     | 12% (C)  | 100%
WG   | 100%      | 100%     | 100%     | 100%
C    | 100%      | 100%     | 98% (EO) | 100%
P    | 100%      | 100%     | 100%     | 100%
SL   | 100%      | 100%     | 100%     | 100%
LDL  | 92% (DP)  | 96% (DP) | 96% (DP) | 98% (DP)
7U   | 100%      | 100%     | 100%     | 100%
WC   | 96% (EO)  | 100%     | 10% (EO) | 100%
DP   | 92% (LDL) | 100%     | 100%     | 100%

lighting changes (overcast, time of day, sunlight). In Table 7.5, the results for this database show that overall accuracy was very good. Other than Diet Pepsi™, the only errors occurred when the nutritional information on a given can was facing the camera. Average accuracy including Diet Pepsi™ was 92.2%. Excluding the results from Diet Pepsi™, accuracy increased to 94.73%.

7.2.4 Fixed Lighting

The system was on display in a room with fixed lighting. A new database was generated at the display location, consisting of four histograms each for Country-Time Lemonade™, Schweppes™, Welch's Grape Drink™, Minute Maid Orange Soda™, Coca-Cola™, Diet Coca-Cola™, Pepsi™, Slice™, Lipton Diet Lemon Brisk Iced Tea™, 7-Up™, Wild Cherry Pepsi™, Sprite™, Mountain Dew™, and Dr. Pepper™.
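The identification loop these tables evaluate can be sketched as follows. This is a minimal illustration only: it assumes pixels have already been quantized to bin indices, and uses histogram intersection as the comparison metric; the system's actual quantizer and distance measure may differ.

```python
# Minimal sketch of histogram-based identification: build a small color
# histogram per object, then pick the database object whose stored histogram
# overlaps the query most. The 16-bin default and intersection metric are
# illustrative assumptions, not the dissertation's exact implementation.

def histogram(pixels, n_bins=16):
    # pixels: iterable of bin indices already produced by a color quantizer
    h = [0] * n_bins
    for b in pixels:
        h[b] += 1
    return h

def intersection(h1, h2):
    # Histogram intersection, normalized by the stored histogram's mass
    return sum(min(a, b) for a, b in zip(h1, h2)) / max(1, sum(h2))

def identify(pixels, database):
    # database: {object_name: stored_histogram}
    h = histogram(pixels)
    return max(database, key=lambda name: intersection(h, database[name]))

# Hypothetical two-object database with toy pixel data
db = {"cola": histogram([0, 0, 1, 1]), "lemon": histogram([2, 2, 3, 3])}
print(identify([0, 1, 1, 0], db))   # cola
```

With four stored orientations per can, the same loop simply treats each orientation's histogram as a separate database entry, as the tables above do.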


Table 7.5: Number correct when identification of sodas is tested using orientations midway between those in the database. Twelve cans with four orientations in the database, sixteen colors. The columns 1, 2, 3, and 4 correspond to orientations; entries in parentheses indicate what was chosen instead of the correct soda. Codes are: Sch = Schweppes™, CT = Country-Time Lemonade™, MMO = Minute Maid Orange Soda™, WG = Welch's Grape Drink™, C = Coca-Cola™, P = Pepsi™, SL = Slice™, LDL = Lipton Diet Lemon Brisk Iced Tea™, 7U = 7-Up™, WC = Wild Cherry Pepsi™, DP = Diet Pepsi™, and SP = Sprite™.

Soda | 1         | 2                    | 3                   | 4
Sch  | 100%      | 100%                 | 100%                | 100%
CT   | 100%      | 100%                 | 100%                | 100%
MMO  | 56% (C)   | 100%                 | 100%                | 100%
WG   | 100%      | 100%                 | 100%                | 100%
C    | 100%      | 100%                 | 100%                | 100%
P    | 100%      | 100%                 | 100%                | 100%
SL   | 100%      | 100%                 | 0% (LDL)            | 100%
LDL  | 100%      | 36% (DP)             | 100%                | 100%
7U   | 100%      | 100%                 | 98% (Sch)           | 100%
WC   | 100%      | 100%                 | 82% (MMO)           | 96% (DP)
DP   | 96% (MMO) | 74% (8% WC, 18% LDL) | 68% (6% SL, 26% WC) | 24% (22% LDL, 54% WC)
SP   | 100%      | 100%                 | 100%                | 100%

Passers-by were encouraged to place sodas in front of the camera and test the system. The camera was placed close to the desired can location. This made it less likely that people would stand in the image and throw off the camera's white balance (the Sony TRV-310's white balance cannot be turned off).

The system identified all the cans without error, with one exception. When Mountain Dew™'s nutritional information was the only thing visible, it was mistaken for 7-Up™. The images used to generate their histograms are shown in Figure 7.3. Even to a human, these images are virtually indistinguishable.

This implementation required 128 bytes per object, assuming a generous maximum of 2 bytes per histogram element (up to 65536 pixels per bin). Our implementation had a maximum possible of 18338 per bin. The 128 bytes per object is based on 16


Figure 7.3: Comparison of 7-Up™ and Mountain Dew™ nutritional information.

colors. Assuming that an additional byte will be required to note which histogram goes with which object (allowing databases of up to 128 objects), the total bytes per object needed will be 129. Our 14-object database requires only 1,806 bytes of storage space for the entire database. In theory, with four images of each object, a database with 100 objects would require only 12,900 bytes of storage space. Of course, this changes as more varied lighting conditions are introduced and more database histograms are necessary, but the size of the database should increase linearly, not exponentially, with the number of lighting conditions. A database of 100 objects under 80 different lighting conditions would fit into 1 MB of RAM.

Our program, incorporating the camera drivers and database, takes up 37 kilobytes of space. However, this code has not been optimized in any way. The program analyzes a single frame in 70 milliseconds, including the time to crop the image. This method results in extremely accurate object identification, as long as the lighting doesn't vary beyond the limits imposed by the database. Accuracy of almost 100% was achieved under fixed lighting conditions.
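The storage arithmetic above can be written out directly (parameter names are illustrative):

```python
# Storage-requirement arithmetic from the text: 16 bins x 2 bytes per bin x
# 4 orientations = 128 bytes per object, plus one ID byte = 129 bytes.
BINS = 16
BYTES_PER_BIN = 2
ORIENTATIONS = 4
ID_BYTES = 1

def bytes_per_object(n_lightings=1):
    # Histogram storage grows linearly with the number of lighting conditions.
    return BINS * BYTES_PER_BIN * ORIENTATIONS * n_lightings + ID_BYTES

def database_bytes(n_objects, n_lightings=1):
    return n_objects * bytes_per_object(n_lightings)

print(bytes_per_object())        # 129
print(database_bytes(14))        # 1806
print(database_bytes(100))       # 12900
print(database_bytes(100, 80))   # 1024100 -- under 1 MB, as claimed
```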


CHAPTER 8
CONCLUSION

This research has contributed to the body of scientific knowledge in three ways. First, an experiment was performed to determine the bit-depth of color memory. We determined that the bit-depth of color memory is between 3 and 4 bits. In general, with a 5 second delay, humans can remember colors in terms of between 8 and 16 hue categories. Previous research had addressed the question of the accuracy of human color memory in terms of other systems of color measurement, such as wavelength or accuracy of remembering color chips. This experiment was unique in that it estimated human color memory accuracy in terms of a computer's representation of color. In so doing, it enabled a transition from the psychophysical world of cognition and perception to the engineering world of signal processing.

Second, we proposed that the combination of quantization and color constancy is a more reasonable model for our ability to recognize objects than simply considering color constancy alone. The results on the data with and without color constancy pre-processing and quantization support this contention. Color constancy, as anticipated, does improve object recognition. However, results when both color constancy and quantization are employed show that the point at which accuracy begins to degrade as the number of color bins is decreased corresponds well to the number of colors that humans are capable of perceiving when a delay is present. In fact, accuracy appears to degrade when the number of features is between eight and sixteen, regardless of quantization method, compression, or whether or not color constancy is used.

Third, as a direct result of the answers to questions one and two, we proposed that quantization be considered as an alternative or supplement to traditional color constancy mechanisms. This proposal has been investigated, and the results showed


that quantization generally does not decrease accuracy significantly, and can in some cases increase it. In fact, depending on the type of quantization and the database, the addition of a color constant pre-processor can add error as well as dramatically increasing the number of required computations.

All the results in this dissertation combine to say one thing: high resolution in hue is unnecessary. The experiments in human color memory showed that humans do not possess high resolution in memory for chromaticities. At a given instant, to perceive a displayed scene as similar to a scene that might be perceived in real life, high resolution will result in a more realistic experience. When we remember a color, however, we remember it only approximately, and all the results presented here indicate that higher resolution than that which we possess is unnecessary.

Color constancy augments our ability to perceive colors. With color constancy, both the robotic system and humans are better at identifying colors. However, the traditional approach of giving the robotic system all 1024 possible colors as inputs simply slows down the system. Results on real and synthetic data indicate that there is an optimal range of numbers of categories, and that this range is far below the traditional 256 levels used by researchers attempting to speed up their systems. Our results show unequivocally that more than 50 categories is unlikely to produce a substantial (>5%) increase in accuracy. Generally, between 8 and 16 categories will produce acceptable accuracy while minimizing the storage and computation requirements. Human capabilities also lie between 8 and 16 categories when memory is taken into account. By lowering the number of colors the robotic system employs, we are unlikely to substantially affect the resulting object recognition accuracy, and we will dramatically increase the efficiency of the system. Color quantization can produce far more efficient object recognition.
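As a rough illustration of quantizing into a perceptually small number of categories, the sketch below assigns an RGB color to the nearest of eleven focal colors. The focal RGB values here are hypothetical stand-ins chosen for illustration; the actual focal colors in the psychophysical literature are defined in Munsell space, not RGB.

```python
import math

# Hypothetical RGB focal colors for eleven basic color categories -- rough
# illustrative values, not measured focal chips.
FOCAL = {
    "white": (255, 255, 255), "black": (0, 0, 0), "gray": (128, 128, 128),
    "red": (200, 0, 0), "green": (0, 160, 0), "blue": (0, 0, 200),
    "yellow": (255, 220, 0), "orange": (255, 140, 0), "purple": (128, 0, 160),
    "pink": (255, 160, 190), "brown": (140, 80, 20),
}

def categorize(rgb):
    # Nearest focal color by Euclidean distance in RGB: a crude stand-in for
    # perceptual category boundaries, but enough to show 11-way quantization.
    return min(FOCAL, key=lambda name: math.dist(rgb, FOCAL[name]))

print(categorize((250, 10, 10)))   # red
```

Quantizing every pixel this way reduces a 24-bit color to one of eleven labels, which is the scale of reduction the conclusion argues is sufficient.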


APPENDIX A
COLOR SPACES AND SENSORY SYSTEMS

This appendix contains the necessary background information in the fields of color spaces and sensory systems. In Section A.1, we will address the human visual system at a functional level. Section A.2 will summarize the different spaces used to describe colors, and the topic of robot vision will be addressed in Section A.3. A full description of various alternate color constancy methods and a short description of color memory research are included.

A.1 Human Vision

Our knowledge of the human visual system is incomplete, with researchers still arguing over where certain types of computations take place, but the fundamental tasks performed are fairly clear. First, light passes through the eye and reaches the retina, where four types of sensors transduce the signal into neural pulses and possibly perform some additional processing. From each eye, the signals reach the Lateral Geniculate Nucleus (LGN), where the crossover process begins. By the time the signals reach the visual cortex, all the information from the eyes has been roughly sorted by location into the right and left halves of the visual field and the left and right halves of the visual cortex. Various types of processing occur, with the net result that we are able to identify the objects we look at in a very robust way [48].

For our purposes, only three aspects of this process are important. First, we are interested in how the brain performs color constancy: our ability to identify colors correctly under widely varying lighting conditions. Second, we are interested in


the brain's ability to remember colors. Third, we are interested in how the brain represents color.

In 1969, Berlin and Kay [7] published the results of their study of the fundamental color categories used by different cultures. They determined that humans tend to categorize colors into a few general "unique" categories. Culture determines the focal colors of these categories, while the combination of the culture and the individual determines the boundaries between categories. Generally, a population will agree on the focal color of a category to within one or two Munsell chips, but there will be wide variation in the location of the boundaries between categories. For example, everyone will agree on the best exemplar of 'yellow,' but disagree on where exactly yellow becomes orange or green. For English speakers, they found eleven basic categories: white, black, gray, red, blue, green, yellow, orange, purple, pink, and brown. They used the Munsell color space to define their subjects' responses, with chips produced by the Munsell company for human psychophysical experiments. Unfortunately, their conclusions are suspect due to the uncontrolled nature of their experiments (few people with no exposure to English used as representatives of their culture; tests performed outside with no attempt to control illumination). However, the existence of color categories has been verified through several experiments into the nature and abilities of the human visual system. For example, research into instantaneous color perception [26] showed that color categorization is used instantaneously to categorize objects.

In 1998, Healey [65] explored the use of color in scientific visualization. He determined that humans were able to preattentively segment colors from different Berlin and Kay color categories, while humans were unable to preattentively segment the same number of different colors when those colors were chosen to be equidistant on the color wheel. By using equi-luminant colors from the eleven Berlin and Kay categories, subjects were able to distinguish a given color from up to six others preattentively. Even in a perceptually


uniform color space, described in Section A.2, categorization rather than distance had an impact on the ease with which subjects identified objects.

A.1.1 Color Constancy

Opinions vary as to the method we use to compensate for lighting variation. Clearly, if a person looks at an object under different lighting conditions, the object will be reflecting different wavelengths of light to the observer's eye. If an object is partially shadowed, the observer can both discriminate between the relative colors of the well-lit and shadowed regions, and determine the underlying color of the object. This holds true even for cases where there are multiple colored lights striking the object. Again, assuming the lights are not all highly saturated, the observer can distinguish between the colors produced by the lighting and the underlying color of the object.

Qasim Zaidi, at the OSA 1998 Annual Meeting, gave an invited paper entitled "A different look at color constancy: heuristic-based algorithms" [68]. Its thrust was that the human visual system does not bring about stable color appearance by discounting illuminants, but recognizes when identical objects are being viewed under different lights and derives illuminant properties. This does seem feasible for the case where the object is in plain view continually, while the subject adapts to lighting changes. However, for cases where the object is removed while the illumination conditions change, color memory will certainly play a role.

The study of human color constancy focuses on our ability to determine the underlying color of the object, regardless of lighting variation. Biological systems in general seem to have no trouble using color. Many animals (humans, pigeons, bees, butterflies, goldfish and monkeys, to name a few) possess color vision systems. Humans lie in the midst of the possible systems. We have sensors for three pigments at bright light levels and one at low light levels, instead of five at bright light levels as in pigeons, or two as in other animals. We use our color


vision to search our visual space [57] and to identify objects. Shape and color appear to be processed concurrently in the brain, with motion lagging behind [45, 44]. If shape were the only characteristic necessary for object recognition, color would not need to be processed simultaneously. Humans obviously have some method for dealing with lighting changes.

Because of the difficulty in doing any testing on humans besides psychophysical testing, most of what we know about the physical properties of the human visual system is derived from experiments on macaque monkeys. Kulikowski et al. [64] showed that monkeys with lesions in area V4 of the visual cortex were able to distinguish between different hues under a single lighting condition, but that their ability to recognize a color when the illumination changed was impaired, compared to monkeys with no lesions. However, more recent research [56] has shown that it is possible to approximate the color constancy abilities of humans (determined through psychophysical experiments) by simply implementing the local and remote adaptation to colors that exists in the retina.

The other method for exploring the processes used in the brain is extensive testing of patients with unusual problems. In 1995, Ruddock [35] published the results of a series of experiments with a patient who exhibited symptoms of partial achromatopsia (inability to perceive color). His cones and immediate postreceptoral processing were normal, but his ability to name colors consistently and his ability to categorize colors were both significantly disturbed. Some luminance changes did not affect the patient's color naming, but the paper notes a tendency to use brown and orange interchangeably. Although the paper does not address this, these seem like typical responses for the case when the perceived signal is a normalized signal (the sum of the three elements normalizes each element). This unreliable color naming is the primary problem with normalization alone as a color constancy mechanism.

Arend and Reeves [1] used a computer monitor to test human color constancy with limited adaptation to the lighting. The experiments tested the effect of psychological aspects on simultaneous color constancy (color constancy where the observer is viewing two scenes with two different lighting conditions at the same time). The test subjects were able to view both the original and the test color on adjacent monitors, at the same time. The experiment was designed to eliminate the possible dependence on surround alteration in observer determination of hue values. Each subject's chromatic adaptation was limited. Each monitor displayed a control patch and a test patch (or control patches and a test Mondrian) under a given illuminant. Observers were allowed to change the color of the test patch or Mondrian on the monitor whose patches were displayed as though under a different illuminant. In the first experiment, observers compensated for the test illuminant's hue and saturation. Results of the hue-saturation matching were poor: they "showed little color constancy." However, when the observers adjusted the color of the test patch so that it "looked as if it were cut from the same piece of paper as the standard patch," two of the three observers showed approximate color constancy. The small number of people involved in the experiment, and the lack of explicit numeric error data with its accompanying significance analysis, is typical of research in this field. Other research has shown that human observers can extract the color of the illuminant from an image, although the results are not as conclusive as the color constancy results. Arend and Reeves conclude, "Our data show that simultaneous mechanisms alone (e.g., simultaneous color contrast) alter hues and saturations too little to produce hue constancy."

Other research [9] compares simultaneous color constancy (here defined as color constancy that takes place before the observer has had a chance to adapt to the prevailing illuminant) to successive color constancy (that which takes place after the observer has adapted to the prevailing illuminant). The author is concerned with how surround objects affect color constancy. In his experiments, the subjects have ten seconds to adapt to the illuminant before performing the tests. He obtains substantially better results from testing the subjects in a room environment than those reported


by others using stimuli on a monitor. His results show approximate color constancy when observers adapt to the illuminant. Brainard defines a constancy index with values from one (perfect) to zero (none). One observer averages 0.87 and another averages 0.84 over eight different illuminants compared to a standard illuminant. When the background is changed, they define a similar metric, with zero indicating that the background has no effect and one indicating that the background shifts the perceived illuminant. Average values were 0.07 for a yellow illuminant and 0.08 for a blue illuminant. Human color constancy is good but not perfect when the observers are allowed to adapt to the illuminant. However, changing the background color appeared to have little impact on our color constancy abilities. This conflicts with the simulation results showing good color constancy when only our local and remote background adaptation is used to perform color constancy. The only real conclusion we can make is that adaptation clearly plays a role in our color constancy abilities.

A.1.2 Color Memory

Human color memory is definitely not perfect either. Our ability to perceive differences in brightness deteriorates with the time delay between lights. The variability of successive brightness comparisons is 1.5 to 2.0 times greater than the variability between simultaneous comparisons [62]. These results are derived from tests including delays of more than eleven seconds, after a one second stimulus. In general, shifts occurred towards darker colors.

Several experiments into the factors affecting color memory showed that humans tend to remember colors as members of categories [63, 60]. Using Berlin and Kay's eleven unique color names, the first experiment [63] showed that, given a color difference, colors in the same category were more likely to be confused than colors in different categories. The second experiment [63] showed that colors identified with


memory "tend to distribute within their own color-category regions or the neighbour color-category regions, depending on their positions in a color space." In other words, when a color is stored in memory, only its approximate color is stored, as part of a category region. The follow-up experiment to these results [60] tested whether these effects were also present when remembering multiple colors simultaneously. Results showed essentially the same phenomena as above. For five colors, one or two were occasionally forgotten entirely, rather than classified into a category. This would seem to indicate that perhaps humans are only using the three or four most important colors to identify objects, and either ignoring the rest or putting the object in a special class of 'too many colors to remember specifics.'

A.2 Color Spaces

There are many ways to represent the colors that humans perceive. Almost all color spaces addressed here are three-dimensional. In general, they fall into one of four categories, with a special category for non-axis-based spaces.

In the first category, each axis represents a particular chromaticity. This type of space is used for color reproduction. Monitors, for example, use the RGB (red-green-blue) color space. Each axis represents the intensity of a given light, although red, green and blue may be slight misnomers, as the blue is a very dark, purply blue and the green tends to have a blue tinge. Printers, in general, use four- or six-dimensional spaces, with one dimension for each ink available to reproduce a given color. These spaces include redundant color definitions, as three axes is generally sufficient to reproduce most colors perceptible by humans. CMYK, for instance, consists of cyan (C), magenta (M), yellow (Y), and black (K). The black axis is redundant, as black can be produced by using the maximum of each of the other three axes. However, it is a much more efficient use of ink to have a separate black for printing text. The RGB space is illustrated in Figure A.1.
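The redundancy of the K axis can be made concrete with the idealized conversion below (channels in [0, 1]); real printer color profiles are far more involved than this textbook formula.

```python
# Idealized RGB -> CMYK conversion: CMY is the complement of RGB, and the
# shared gray component of C, M, Y is pulled out as black ink (K).

def rgb_to_cmyk(r, g, b):
    c, m, y = 1 - r, 1 - g, 1 - b
    k = min(c, m, y)               # common gray component printed as black
    if k == 1.0:                   # pure black: no chromatic ink needed
        return 0.0, 0.0, 0.0, 1.0
    return (c - k) / (1 - k), (m - k) / (1 - k), (y - k) / (1 - k), k

print(rgb_to_cmyk(0.0, 0.0, 0.0))   # (0.0, 0.0, 0.0, 1.0) -- black uses only K
print(rgb_to_cmyk(1.0, 0.0, 0.0))   # (0.0, 1.0, 1.0, 0.0) -- red = magenta + yellow
```

The first line shows exactly the redundancy described above: black could be printed as maximum C, M, and Y, but the separate K channel lets it be printed with a single ink.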


Figure A.1: RGB color space.

The second category contains spaces with two chromatic axes and a single achromatic axis. The YIQ space used for television transmissions is in this category. Because human achromatic resolution is far superior to chromatic resolution, a television signal consists of a single high spatial resolution achromatic channel and two lower spatial resolution chromatic channels. These three channels are converted at the receiver into the red, green and blue space mentioned earlier for display. Because the chromatic channels are lower resolution, they require that less information be transmitted. This results in a far more efficient system than transmitting all three channels at the full final resolution [66]. Usually these spaces use a lightness axis for the achromatic axis, with lighter colors corresponding to larger values.

Spaces with a single chromatic axis make up the third category. Here, one axis represents possible chromatic values, and the other two dimensions are traditionally


Figure A.2: Munsell color space.

represented by some form of lightness and saturation. The HSV space is based on a conical model, rather than a cubic one. It is a warping of the RGB space. The black corner lies at the point of the cone. The hue axis circumnavigates the chromatic edges of the cube, from red to yellow to green to blue to magenta and back to red, and the white corner is pushed into the center of the flat face of the cone. In this space, the hue (H) axis represents chromatic values as angular measurements around the width of the cone, with value (V) corresponding to lightness along the axis of the cone and saturation (S) as radial measurements from the center of the cone. HLS is an alternative to HSV, and consists of two cones, with the most saturated colors around the widest part and white and black at the tips.

The fourth category consists of perceptually-based spaces. These generally fall into one of the second and third categories, but are warped in some way to make them approximate human perception more closely. These spaces include the Munsell space (a single chromatic axis, with more depth in the red region than in the blue region) and the CIE spaces (two chromatic axes, attempting to mimic the red-green and blue-yellow perceptual axes). The set of Munsell chips used by Berlin and Kay in Section A.1.2 consists of 320 color chips of forty equally spaced hues and eight degrees of brightness, all at maximum saturation, with an additional nine chips of neutral hue (white, black, and grays). Note that "equally spaced" is arbitrary. The hues are


Figure A.3: Munsell colors used by Berlin and Kay, with focal colors (dots) and region boundaries.

equally spaced with respect to the Munsell system, which is warped compared to the HSV or HLS systems. The Munsell space is shown in Figure A.2, and a representation of the Berlin and Kay results in terms of the chips used is shown in Figure A.3. The most useful space for biological experiments is strictly physiologically based; colors are represented by the relative stimuli from each type of cone in the retina.

There is a fifth category of color representation: the arbitrary spaces. The Pantone system of color representation is an example of this. This is an example of the indexed color space. In this arrangement, every color is given a single unique number or name and always referred to in that way. The Pantone space is used for print reproduction, to ensure that a given color always has the most similar wavelength reflectance as possible. Each color in the Pantone system represents a specific printed "spot" color. These colors can be almost perfectly represented with their Hexachrome™ six-color print process. A more usual consumer color printer will have only four colors, consisting of cyan, magenta, yellow and black inks. These colors are generally not sufficient to perfectly reproduce the colors in the Pantone system. Indexed spaces are often used when only a few colors need to be represented, or when colors need to be


represented precisely. In print, each index may correspond to a specific reflectance. In computers, each index will correspond to a vector indicating the corresponding color in one of the other spaces (usually RGB). The array that maps each index to its corresponding color is known as a colormap. However, because these spaces are inconsistent (every indexed space is unique and arbitrary; index 1 can correspond to any color the user desires) and require the use of one of the axis-based spaces above, I have not included them in the previous categorization scheme.

A.3 Robot Sensory Systems

Vision is only one of the many systems that a robot will use to get information about its environment. The most popular sensory systems include active systems, generally either reactive (such as a "bump" sensor that reacts when it is touched) or reflective (such as sonar). These systems are very useful in situations where only local (within a short distance of the robot) information is needed. They each have their drawbacks. For example, sonar and infra-red reflections are exceptionally poor at distinguishing tall, narrow objects, such as the legs of chairs. Bumpers are only useful if the object actually comes into contact with the robot. Some problems robots are asked to solve require the use of more sophisticated sensors. As processors and memory become faster and cheaper, vision is becoming more and more prevalent.

As a sensory system, vision has many advantages. It uses passive sensors, so the robot does not need to affect its environment to gather information. There is a wealth of information present in the data stream. Humans use vision to identify entire classes of objects, and to identify specific objects within those classes. We can recognize a diverse set of faces, gestures, rigid objects (such as chairs and automobiles) and deformable objects (such as shirts). We use that information about our environment to perform tasks.
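The colormap arrangement described earlier in this section can be sketched in a few lines (the map contents here are illustrative):

```python
# Minimal indexed-color scheme: each pixel stores only a small index, and a
# colormap array resolves that index to an RGB vector at display time.
colormap = [
    (0, 0, 0),        # index 0: black
    (255, 255, 255),  # index 1: white
    (255, 0, 0),      # index 2: red
]

indexed_image = [[2, 2, 0],
                 [1, 0, 1]]   # 2x3 image, one small index per pixel

# Expand indices to full RGB only when the image must be displayed.
rgb_image = [[colormap[i] for i in row] for row in indexed_image]
print(rgb_image[0][0])   # (255, 0, 0)
```

This also shows why indexed spaces depend on an axis-based space, as noted above: the indices are arbitrary until the colormap grounds them in RGB.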


The can identification problem in this thesis is a good example of a problem most easily solved with vision. It may be theoretically possible to identify the cans using some other sense, but it is unlikely to be simpler or require less interaction with the can. With a sensory system based on visual information, the robot can identify the can without touching it; possibly without even getting particularly close to the can.

Most robotic systems that incorporate a vision system use an off-the-shelf video or still camera for input, linked to a frame-grabber of some sort. These systems generally produce data as either RGB or a variant thereof, which the frame-grabber receives from the camera and converts to raw pixel values. The system described in Chapter 7 stores these pixel values with four 8-bit numbers, representing red, green, blue, and an alpha channel. The alpha channel is used to store data such as "This pixel is something I'm interested in" or "This pixel does not contain any useful information." Each of the red, green and blue channels contains an integer from 0 to 255 to represent the value of that axis at that pixel location. The darkest possible color is mapped to 0, and the brightest possible value on that axis is mapped to 255. Thus, [0, 0, 0] would correspond to black (or the darkest possible perception) and [255, 255, 255] would correspond to white (or the lightest possible perception).

The primary problem with vision as a sensory system is the sheer quantity of information. A visual data stream contains 8 bits for each channel, 3 channels, and as many pixels as you need to complete the task. If we assume a 32 by 32 pixel image is sufficient (which it may not be), that results in 1024 pixels, for a total of 24,576 bits to be processed in each image. For consumer grade video signals, assuming you down-sample to 32 by 32 before anything else, there are 30 of these images in each second. Clearly, data reduction is the first step in any visual sensory system. In the human visual system, this data reduction occurs in the first stage. In the robotic system, the data reduction generally occurs after the frame-grabber, when the computer first sees the information.
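The data-volume arithmetic in the paragraph above, written out:

```python
# Data volume for a 32x32 RGB image at 8 bits per channel, 30 frames per
# second (consumer-grade video), matching the figures in the text.
WIDTH = HEIGHT = 32
CHANNELS = 3
BITS_PER_CHANNEL = 8
FPS = 30

pixels = WIDTH * HEIGHT                              # 1024
bits_per_image = pixels * CHANNELS * BITS_PER_CHANNEL
print(bits_per_image)          # 24576 bits per image
print(bits_per_image * FPS)    # 737280 bits/second, even at this tiny resolution
```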


Noise is another common problem with robotic visual systems. In general, two neighboring pixels will have similar but different values. Thus, even in perfectly smooth regions, individual pixels will have slightly different values in each axis. In order to get an accurate estimate of the region's color, the pixels must be smoothed. If texture is to be one of the characteristics that the robot uses to identify objects, this smoothing may destroy the very information being used to determine relevant features. In addition, smoothing in RGB space does not produce an accurate estimate of colors at the borders between regions. Hue-based spaces produce much more realistic colors at boundaries when simple averaging is used to smooth an image.
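As an illustration of the boundary claim above (constructed for this edit, not taken from the dissertation), compare averaging two saturated pixels channel-wise in RGB against averaging in HSV with the hue treated circularly:

```python
import colorsys

# Averaging a saturated red and a saturated blue pixel (channels in [0, 1]):
# channel-wise RGB averaging yields a dark, half-intensity purple, while
# HSV averaging keeps a fully bright, fully saturated intermediate color.
red, blue = (1.0, 0.0, 0.0), (0.0, 0.0, 1.0)

rgb_mean = tuple((a + b) / 2 for a, b in zip(red, blue))

def hue_mean(h1, h2):
    # Circular mean on the shorter arc of the hue circle (hues in [0, 1)).
    d = (h2 - h1) % 1.0
    if d > 0.5:
        d -= 1.0
    return (h1 + d / 2) % 1.0

h1, s1, v1 = colorsys.rgb_to_hsv(*red)
h2, s2, v2 = colorsys.rgb_to_hsv(*blue)
hsv_mean = colorsys.hsv_to_rgb(hue_mean(h1, h2), (s1 + s2) / 2, (v1 + v2) / 2)

print(rgb_mean)                               # (0.5, 0.0, 0.5): dark purple
print(tuple(round(c, 3) for c in hsv_mean))   # (1.0, 0.0, 1.0): bright magenta
```

The RGB average has lost half its brightness at the boundary, while the hue-based average stays at full saturation and value, which is the behavior the paragraph describes.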


APPENDIX B
DATABASES AND IMAGES

This appendix contains the soda cans used in each database and additional images showing results that supplement those presented in the main body of the text. Figure B.1 shows a sample image from the images used to generate the full database. The sodas used to generate that database are listed in Table B.1. In that table, the abbreviation 'Min.' stands for 'Minute', and the indicated sodas (numbered 34 through 39, inclusive) should be read as 'Minute Maid' sodas.

Figure B.2 shows Publix Diet Cola under all four illuminants, for a single orientation. The top row of images shows the original pixels used to calculate the results without retinex pre-processing. The bottom row shows the results of the retinex pre-processing. Letters a through d indicate the four lighting conditions. The remaining images in this appendix illustrate the results of the lighting shift experiment described in Chapter 4.


Table B.1: Sodas in final database.

ID  Soda Name                             | ID  Soda Name
001 A&W Cream Soda                        | 044 Pepsi One
002 A&W Root Beer                         | 045 Publix Black Cherry Soda
003 Barq's Root Beer                      | 046 Publix Caffeine-Free Cola
004 Caffeine-Free Coca-Cola               | 047 Publix Cola
005 Caffeine-Free Pepsi                   | 048 Publix Cream Soda
006 Canada Dry Ginger Ale                 | 049 Publix Diet Caffeine-Free Cola
007 Cherry Coca-Cola                      | 050 Publix Diet Cola
008 Citra                                 | 051 Publix Diet Ginger Ale
009 Coca-Cola                             | 052 Publix Diet Lemon-Lime
010 Country Time Lemonade                 | 053 Publix Diet Root Beer
011 Cranberry Canada Dry Ginger Ale       | 054 Publix Ginger Ale
012 Diet A&W Root Beer                    | 055 Publix Grape Soda
013 Diet Caffeine-Free Coca-Cola          | 056 Publix Lemon-Lime
014 Diet Caffeine-Free Pepsi              | 057 Publix Orange Soda
015 Diet Canada Dry Ginger Ale            | 058 Publix Root Beer
016 Diet Cherry Coca-Cola                 | 059 Raspberry Lipton Brisk Iced Tea
017 Diet Coca-Cola                        | 060 RC Cola
018 Diet Cranberry Canada Dry Ginger Ale  | 061 Sam's Choice Cola
019 Diet Dr. Pepper                       | 062 Sam's Choice Diet Caffeine-Free Cola
020 Diet Lemon Lipton Brisk Iced Tea      | 063 Sam's Choice Diet Cola
021 Diet Mountain Dew                     | 064 Sam's Choice Dr. Thunder
022 Diet Pepsi                            | 065 Sam's Choice Grape Soda
023 Diet Rite Cola                        | 066 Sam's Choice Mountain Lightning
024 Diet Seven-Up                         | 067 Sam's Choice Root Beer
025 Diet Sprite                           | 068 Sam's Choice Twist Up
026 Diet Sunkist                          | 069 Schweppe's Ginger Ale
027 Dr. Pepper                            | 070 Seagram's Ginger Ale
028 Eckerd Grape Soda                     | 071 Seven-Up
029 Eckerd Up                             | 072 Slice
030 Fresca                                | 073 Sprite
031 Hawaiian Punch                        | 074 Squirt
032 Lemon Lipton Brisk Iced Tea           | 075 Sunkist
033 Mello Yello                           | 076 Sunny Delight
034 Min. Maid Apple Fruit Drink           | 077 Surge
035 Min. Maid Fruit Punch Fruit Drink     | 078 Sweetened Lemon Nestea Iced Tea
036 Min. Maid Grape Soda                  | 079 Sweetened Lipton Brisk Iced Tea
037 Min. Maid Orange Blend Fruit Drink    | 080 Tab
038 Min. Maid Orange Soda                 | 081 Vernor's Ginger Ale
039 Min. Maid Pink Grapefruit Fruit Drink | 082 Welch's Fruit Punch Fruit Drink
040 Mountain Dew                          | 083 Welch's Grape Fruit Drink
041 Mr. Pibb                              | 084 Welch's Orange Pineapple Fruit Drink
042 Mug Root Beer                         | 085 Wild Cherry Pepsi
043 Pepsi                                 | 086 Publix Cherry Cola


Figure B.1: Sample image used to generate full database of 86 cans, with 8 different orientations and 4 different illuminants.


Figure B.2: Samples of images used to generate full database of over 82 cans, with 8 different orientations and 4 different illuminants. Top row is not processed; bottom row is processed with the retinex algorithm.


REFERENCES

[1] Lawrence Arend and Adam Reeves. Simultaneous color constancy. Journal of the Optical Society of America A, 3(10):1743–1751, 1986.

[2] Naz Arica and Fatos T. Yarman-Vural. An overview of character recognition focused on off-line handwriting. Systems, Man and Cybernetics, Part C: Applications and Reviews, 31(2):216–233, 2001.

[3] William P. Banks and Grayson Barber. Color information in iconic memory. Psychological Review, 84(6):536–546, 1977.

[4] Kobus Barnard, Graham Finlayson, and Brian Funt. Color constancy for scenes with varying illumination. Computer Vision and Image Understanding, 65:311–321, 1997.

[5] C. J. Bartleson. Memory colors of familiar objects. Journal of the Optical Society of America, 50(1):73–77, 1960.

[6] J. Berens, G. D. Finlayson, and G. Qiu. Image indexing using compressed colour histograms. IEE Proc.-Vis. Image Signal Process., 147, August 2000.

[7] Brent Berlin and Paul Kay. Basic Color Terms: Their Universality and Evolution. University of California Press, Berkeley, 2nd edition, 1991.

[8] Aaron F. Bobick and Robert C. Bolles. The representation space paradigm of concurrent evolving object descriptions. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(2):146–156, 1992.

[9] David H. Brainard. Color constancy in the nearly natural image. 2. Achromatic loci. Journal of the Optical Society of America A, 15:307–325, 1998.

[10] David H. Brainard and Brian A. Wandell. Analysis of the retinex theory of color vision. Journal of the Optical Society of America A, 3(10):1651–1661, 1986.

[11] D. H. Brainard. The psychophysics toolbox. Spatial Vision, 10:433–436, 1997.

[12] Valentino Braitenberg. Vehicles: Experiments in Synthetic Psychology, pages 1–19, 94–109. The MIT Press, Cambridge, MA, 1984.


[13] M. Brill and G. West. Contributions to the theory of invariance of color under the condition of varying illumination. Journal of Mathematical Biology, 11:337–350, 1981.

[14] Robert W. Burnham and Joyce R. Clark. A test of hue memory. The Journal of Applied Psychology, 39(3):164–172, 1955.

[15] R. W. Burnham, Joyce R. Clark, and S. M. Newhall. Space error in color matching. Journal of the Optical Society of America, 47(10):959–966, 1957.

[16] Sandra E. Clark. Retrieval of color information from preperceptual memory. Journal of Experimental Psychology, 82(2):263–266, 1969.

[17] Mark S. Drew, Jie Wei, and Ze-Nian Li. Illumination-invariant color object recognition via compressed chromaticity histograms of color-channel-normalized images. In International Conference on Computer Vision 1998, pages 533–540, January 1998.

[18] Richard O. Duda, Peter E. Hart, and David G. Stork. Pattern Classification. John Wiley and Sons, New York, 2nd edition, 2001.

[19] M. D'Zmura and P. Lennie. Mechanisms of color constancy. Journal of the Optical Society of America A, 3:1662–1672, 1986.

[20] Francois Ennesser and Girard Medioni. Finding Waldo, or focus of attention using local color information. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17(8):805–809, 1995.

[21] G. D. Finlayson. Color in perspective. IEEE Transactions on Pattern Analysis and Machine Intelligence, 18(10):1034–1038, 1996.

[22] Graham Finlayson. Computational colour constancy. In Proceedings of CGIP 2000, pages 100–105, 2000.

[23] D. A. Forsyth. A novel algorithm for color constancy. International Journal of Computer Vision, 5:5–36, 1990.

[24] Brian Funt, Kobus Barnard, and Lindsay Martin. Is colour constancy good enough? In 5th European Conference on Computer Vision, pages 445–459, 1998.

[25] Brian D. Funt and Graham D. Finlayson. Color constant color indexing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 17:522–529, 1995.

[26] Robert L. Goldstone. Effects of categorization on color perception. Psychological Science, 6:298–304, 1995.

[27] Aura Hanna and Roger Remington. The representation of color and form in long-term memory. Memory & Cognition, 24(3):322–330, 1996.

PAGE 160

146 [28] JohnG.HarrisandSigneRedeld.Theroleofcolorcategorizationinobjectrecognition.InvestOphthalmolVisSci,41:S240.Abstractnr1260.,2000. 6 91 [29] AnyaC.Hurlbert.TheComputationofColor.PhDthesis,MassachssettsInsti-tuteofTechnology,1989. 13 [30] DavidJ.Ingle.Thegoldshasaretinexanimal.Science,227February:651{654,1985. 4 [31] DanielJ.Jobson,Zia-urRahman,andGlennA.Woodell.Retineximagepro-cessing:Improveddelitytodirectvisualobservation.ProceedingsoftheISandTFourthColorImagingConference:ColorScience,Systems,andApplications,1996. 4 15 16 17 [32] DanielJ.Jobson,Zia-urRahman,andGlennA.Woodell.Amultiscaleretinexforbridgingthegapbetweencolorimagesandthehumanobservationofscenes.IEEETransactionsonImageProcessing,6:965{976,1997. 17 [33] DanielJ.Jobson,Zia-urRahman,andGlennA.Woodell.Propertiesandper-formanceofacenter/surroundretinex.IEEETransactionsonImageProcessing,63:451{462,1997. 4 15 16 17 [34] H.JordanandJ.J.Kulikowski.Arecolourcategoricalbordersstableundervariousilluminants?InChristineDickinson,IanMurray,andDavidCarden,editors,JohnDalton'sColorVisionLegacy,pages421{430.TaylorandFrancis,London,1997. 24 [35] C.Kennard,M.Lawdon,A.B.Morland,andK.H.Ruddock.Colouridenticationandcolourconstancyareimpairedinapatientwithincompleteachromatopsiaassociatedwithprestriatecorticallesions.ProceedingsoftheRoyalSocietyofLondon,SeriesB,260:169{275,1995. 130 [36] EdwinH.Land.Colorvisionandthenaturalworld.parti.ProceedingsoftheNationalAcademyofSciencesUSA,45:115{129,1959. 4 14 15 [37] EdwinH.Land.Colorvisionandthenaturalworld.partii.ProceedingsoftheNationalAcademyofSciencesUSA,45:636{644,1959. 4 14 15 [38] EdwinH.Land.Recentadvancesinretinextheoryandsomeimplicationsforcorticalcomputations:Colorvisionandthenaturalimage.ProceedingsoftheNationalAcademyofSciencesUSA,80:5163{5169,1983. 4 15 16 [39] EdwinH.Land.Analternativetechniqueforthecomputationofthedesignatorintheretinextheoryofcolorvision.ProceedingsoftheNationalAcademyofSciencesUSA,83:3078{3080,1986. 
4 13 15 [40] EdwinH.LandandJohnJ.McCann.Lightnessandretinextheory.JournaloftheOpticalSocietyofAmerica,611:1{11,1971. 4 13 14 15 16 21

PAGE 161

147 [41] SveinMagnussen.Low-levelmemoryprocessesinvision.TrendsinNeuro-sciences,23:247{251,2000. 23 [42] AndrewMoore,JohnAllman,andRodneyM.Goodman.Areal-timeneuralsystemforcolorconstancy.IEEETransactionsonNeuralNetworks,2:237{247,1991. 4 13 15 16 [43] A.B.Morland,J.H.MacDonald,andK.F.Middleton.Colourconstancyinac-quiredandcongenitalcolourvisiondeciencies.InChristineDickinson,IanMurray,andDavidCarden,editors,JohnDalton'sColorVisionLegacy,pages463{468.TaylorandFrancis,London,1997. 24 [44] K.MoutoussisandS.Zeki.Adirectdemonstrationofperceptualasynchronyinvision.ProceedingsoftheRoyalSocietyofLondon,SeriesB,264:393{399,1997. 130 [45] K.MoutoussisandS.Zeki.Functionalsegregationandtemporalhierarchyofthevisualperceptivesystems.ProceedingsoftheRoyalSocietyofLondon,SeriesB,264387:1407{1414,1997. 130 [46] T.H.NilssonandT.M.Nelson.Delayedmonochromatichuematchesindicatecharacteristicsofvisualmemory.JournalofExperimentalPsychology:HumanPerceptionandPerformance,7:141{150,1981. 23 24 29 [47] C.L.NovakandS.A.Shafer.Supervisedcolorconstancyusingacolorchart.TechnicalReportCMU-CS-90-140,CarnegieMellonUniversity,SchoolofCom-puterScience,1990. 13 21 [48] JoelPokornyandVivianneC.Smith.Colorvisionandnightvision.InThomasE.Ogden,TimothyC.Hengst,andStephenRyan,editors,Retina,volume1,pages127{145.Mosby,Baltimore,2ndedition,1994. 127 [49] Zia-urRahman,DanielJ.Jobson,andGlennA.Woodell.Multi-scaleretinexforcolorimageenhancement.InternationalConferenceonImageProcessingICIP,1996. 4 15 16 17 [50] Zia-urRahman,DanielJ.Jobson,andGlennA.Woodell.Methodofimprovingadigitalimage.U.S.Patentnumber5,991,456,November1999. 17 106 [51] Zia-urRahman,GlennA.Woodell,andDanielJ.Jobson.Acomparisonofthemultiscaleretinexwithotherimageenhancementtechniques.website. 4 [52] SigneRedeldandJohnG.Harris.Theroleofextremecolourquantizationinobjectrecognition.ProceedingsoftheFirstInternationalConferenceonColorinGraphicsandImageProcessing,pages225{230,2000. 
6 63 [53] SigneRedeldandJohnG.Harris.Theroleofmassivecolorquantizationinobjectrecognition.ProceedingsoftheInternationalConferenceonImagePro-cessing,1:57{60,2000. 6 63

PAGE 162

148 [54] SigneRedeldandJohnG.Harris.Bit-depthofcolormemory.InvestOphthalmolVisSci,424:S50.Abstractnr281.,2001. 6 61 [55] JosSantos-Victor,GuilioSandini,FrancescaCurotto,andStefanoGaribaldi.Divergentstereoinautonomousnavigation:Frombeestorobots.InternationalJournalofComputerVision,14:159{177,1995. 8 [56] HedvaSpitzerandSaritSemo.Abiologicalcolorconstancymodelanditsap-plicationforrealimages.ProceedingsofCGIP2000,pages111{115,2000. 130 [57] MichaelJ.SwainandDanaH.Ballard.Colorindexing.InternationalJournalofComputerVision,7:11{32,1991. 3 8 11 98 102 130 [58] KeisukeTakabe,ShigeiNakauich,andShiroUsui.Acomputationalmodelforcolorconstancybyseparatingreectanceandilluminantedgeswithinascene.NeuralNetworks,9:1405{1415,1996. 13 [59] K.Uchikawa.Puritydiscrimination:Successivevs.simultaneouscomparisonmethod.VisionResearch,23:53{58,1983. 23 24 [60] K.Uchikawa,H.Ujike,andT.Sugiyama.Categoricalcharacteristicsofmultiple-colormemory.InChristineDickinson,IanMurray,andDavidCarden,editors,JohnDalton'sColorVisionLegacy,pages409{414.TaylorandFrancis,London,1997. 23 24 29 132 133 [61] KeijiUchikawaandMitsuoIkeda.Temporaldeteriorationofwavelengthdiscrim-inationwithsuccessivecomparisonmethod.VisionResearch,21:591{595,1981. 14 23 [62] KeijiUchikawaandMitsuoIkeda.Accuracyofmemoryforbrightnessofcol-oredlightsmeasuredwithsuccessivecomparisonmethod.JournaloftheOpticalSocietyofAmerica,A,3:34{39,1986. 23 132 [63] KeijiUchikawaandHiroyukiShinoda.Inuenceofbasiccolorcategoriesoncolormemorydiscrimination.ColorResearchandApplication,21:430{439,1996. 23 24 29 132 [64] V.Walsh,D.Carden,S.R.Butler,andJ.J.Kulikowski.Theeectsofv4lesionsonthevisualabilitiesofmacaques:huediscriminationandcolourconstancy.BehaviouralBrainResearch,53:51{62,1992. 130 [65] LizhiWangandGlennHealey.Usingzernikemomentsfortheilluminationandgeometryinvariantclassicationofmultispectraltexture.IEEETransactionsonImageProcessing,7:196{203,1998. 
128 [66] G.WyszeckiandW.S.Stiles.ColorScience:ConceptsandMethods,QuantitativeDataandFormulas.JohnWileyandSons,NewYork,2ndedition,1982. 134

PAGE 163

149 [67] RichardA.Young.Colorvisionandtheretinextheory.Science,238:1731{1732,1987. 4 [68] QasimZaidi.Adierentlookatcolorconstancy:heuristic-basedalgorithms.InvitedtalkatOSA-98,SymposiumonAlgorithmsforExtractingObjectandIl-luminantColors,October1998.SymposiumonAlgorithmsforExtractingObjectandIlluminantColors. 129 [69] QasimZaidi,BrankaSpehar,andJeremyDeBonet.Colorconstancyinvariegatedscenes:roleoflow-levelmechanismsindiscoutingilluminationchanges.JournaloftheOpticalSocietyofAmericaA,1410:2608{2621,1997. 5 13


BIOGRAPHICAL SKETCH

Signe Redfield was born in North Vancouver, British Columbia, Canada, in the fall of 1970. She led a nomadic childhood, living in Canada, Australia, and the United States. In 1988, she graduated from Emma Willard School, in Troy, NY. During 1989 she participated in an exchange program sponsored by the English Speaking Union and attended school at Haberdashers' Aske's School for Girls at Elstree, just outside of London. After taking an A-Level in physics, she entered Johns Hopkins University as a freshman in the fall of 1989. As an undergraduate, she studied physics, music, and electrical engineering, eventually graduating with a B.A. in general engineering with concentrations in electrical engineering and music. Her extra-curricular time was completely filled with live theater. Having spent enough time on auditory stimuli, on her arrival in graduate school at the University of Florida in 1995 she began studying image processing. Quickly becoming affiliated with the Machine Intelligence Laboratory and the Computational NeuroEngineering Laboratory, she began working on an object recognition system for an autonomous mobile robot. This dissertation is the result of that labor.


Material Information

Title: Efficient Object Recognition Using Color Quantization
Creator: Redfield, Signe Anne
Contributor: John G. Harris
Publisher: University of Florida
Date Issued: December 15, 2001
Language: English
Subject Keywords: histogram indexing, quantization, psychophysics, color vision, color memory, object recognition, autonomous robotics
Rights: Public. Electronic version created 2002, State University System of Florida.

Abstract

A simplification of the color histogram indexing algorithm is proposed and analyzed. Instead of taking a histogram consisting of hundreds of colors, each input image is first quantized to only a few colors (between 8 and 16) and the feature vector is generated by taking a histogram of this smaller space. This increases the efficiency of the system by orders of magnitude. We also proposed that this would reduce the effects of lighting change on the algorithm and that this would be a better model for the human object recognition mechanism than the algorithm combined with color constancy alone.

In support of the contention that this may be a better human model, a psychophysical experiment was conducted. The bit-depth of human color memory was shown to lie between 3 and 4 bits, corresponding to between 8 and 16 color categories when a color is remembered for 5 seconds. This experiment created a bridge between the psychophysical results and the computer testbed.

The research showed that quantization can occasionally compensate for small lighting changes, but that the compensation is highly database-dependent and erratic. However, quantization always produced a much more efficient system and generally did not substantially reduce the accuracy.

The results of this work were threefold. First, human color memory is relatively poor, indicating that a system incorporating quantization will be far closer to mimicking human abilities than a system without it. Second, quantization alone is insufficient to perform color constancy in most cases. Third, with or without a color constant pre-processor, our results consistently showed that quantization has little effect on accuracy when using more than sixteen bins. Object recognition accuracy degrades substantially as the number of color categories drops below 6; from 10 categories to 256, accuracy is essentially unchanged. Quantization is a very efficient way to reduce the computational complexity and storage requirements of this algorithm without substantially affecting its object recognition accuracy.


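The quantize-then-histogram pipeline described in the abstract can be sketched in a few lines. This is a minimal illustration, not the dissertation's actual implementation: it assumes 24-bit RGB input, uses plain uniform quantization (one of several schemes the dissertation compares), and matches feature vectors with Swain and Ballard's histogram intersection. All function names here are invented for the example.

```python
import numpy as np

def quantize(img, bits_per_channel=1):
    # Uniform quantization: keep only the top bits of each RGB channel.
    return (np.asarray(img, dtype=np.uint8) >> (8 - bits_per_channel)).astype(np.uint32)

def color_histogram(img, bits_per_channel=1):
    # Normalized histogram over the quantized space: 2**(3*bits) bins total.
    b = bits_per_channel
    q = quantize(img, b)
    idx = (q[..., 0] << (2 * b)) | (q[..., 1] << b) | q[..., 2]
    hist = np.bincount(idx.ravel().astype(np.int64), minlength=2 ** (3 * b))
    return hist.astype(float) / hist.sum()

def intersection(h1, h2):
    # Swain-Ballard histogram intersection; 1.0 for identical histograms.
    return float(np.minimum(h1, h2).sum())

def identify(img, database, bits_per_channel=1):
    # database: dict mapping object name -> stored model histogram.
    h = color_histogram(img, bits_per_channel)
    return max(database, key=lambda name: intersection(h, database[name]))
```

With bits_per_channel=1 the feature vector has only 2^3 = 8 bins, at the low end of the 8-16 color range the abstract identifies; conventional histogram indexing would use hundreds of bins, which is the source of the efficiency gain claimed above.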
Permanent Link: http://ufdc.ufl.edu/UFE0000347/00001

Material Information

Title: Efficient Object Recognition Using Color Quantization
Physical Description: Mixed Material
Copyright Date: 2008

Record Information

Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.
System ID: UFE0000347:00001


This item has the following downloads:


Full Text





















TABLE OF CONTENTS

ACKNOWLEDGMENTS iii
LIST OF FIGURES vii
LIST OF TABLES xi
ABSTRACT xiii

CHAPTERS

1 INTRODUCTION 1

2 ROBOTIC OBJECT IDENTIFICATION 8
2.1 Platform 9
2.2 Identification Methods 10
2.3 Histogram Indexing 11
2.4 Color Constancy 13
2.4.1 Retinex 15
2.4.2 Gamut Mapping 17
2.4.3 Other Methods 20
2.5 Fixing Histogram Indexing 21

3 COLOR MEMORY 23
3.1 Experimental methods 24
3.2 Data analysis 31
3.2.1 Sensitivity 33
3.2.2 Calibration 37
3.3 Summary 39

4 QUANTIZATION 42
4.1 Overview of Algorithms 43
4.1.1 Uniform Quantization 43
4.1.2 Dithering 44
4.1.3 Modified Uniform Quantization 45
4.1.4 Tree Quantization 46
4.1.5 Median-Cut Quantization 48
4.1.6 Vector Quantization 48
4.2 Number of Colors 49
4.2.1 Optimizing Accuracy 50
4.2.2 Lighting Shifts 51
4.3 Summary 58

5 THEORY AND RESULTS ON SYNTHETIC DATA 61
5.1 Theory 61
5.1.1 Structure and Methods 61
5.1.2 Theoretical Results 62
5.1.3 Larger databases 65
5.2 Synthetic Data 68
5.2.1 Single Hue Region 69
5.2.2 All Hues 72
5.2.3 Multiple Hues 73
5.2.4 More Complex Lighting Shifts 77
5.3 Over-fitting 80
5.4 Conclusions 81

6 STILL IMAGE RESULTS 83
6.1 Theoretical Comparison 84
6.2 Lighting Compensation 91
6.2.1 Workplace Lighting 91
6.2.2 Household Lighting 95
6.2.3 Full Database 97
6.3 Localized Hue Data 99
6.4 Orientation vs. Lighting Changes 100
6.5 Chromaticity Results 104
6.6 Retinex 106
6.7 Summary 112

7 ROBOT PROTOTYPES 115
7.1 Off-Line Implementation 116
7.2 Real-Time Implementation 118
7.2.1 First Implementation 118
7.2.2 Results on First Implementation 118
7.2.3 Second Implementation 120
7.2.4 Fixed Lighting 122

8 CONCLUSION 125

APPENDICES

A COLOR SPACES AND SENSORY SYSTEMS 127
A.1 Human Vision 127
A.1.1 Color Constancy 129
A.1.2 Color Memory 132
A.2 Color Spaces 133
A.3 Robot Sensory Systems 137

B DATABASES AND IMAGES 140

REFERENCES 144

BIOGRAPHICAL SKETCH 150













LIST OF FIGURES

3.1 Interface window for program enabling user to choose focal colors. Primary window shown here allows user to choose colors; secondary window shows focal colors after user has moved on to next color.
3.2 Interface for user to choose boundaries between focal colors. For each of seven levels of brightness, the user places lines where boundaries are perceived. Focal colors are assigned to the regions between the lines by the user.
3.3 Screen shots of psychophysical testing interface. (a) original color; (b) intersample noise; (c) test color; (d) simultaneous comparison.
3.4 Focal colors and initially displayed colors at LSB 5.
3.5 Sample receiver operating characteristic curve.
3.6 Monitor calibration results.
3.7 Results of p(A) analysis with actual change derived from monitor calibration.
4.1 Original image used to demonstrate results of different quantization schemes.
4.2 Results of uniform quantization.
4.3 Results of dithering on uniform quantization.
4.4 Results of modified uniform quantization.
4.5 Diagram for tree quantization.
4.6 Results of median-cut quantization.
4.7 Sample color changes under a single day's sunlight.
4.8 Average and standard deviation for hue under varying lighting. The x location of each bar indicates the average of the average values for the day.
4.9 Mean values for cans and calibrator hues.
4.10 Probability density functions for chromatic can colors along the hue axis.
4.11 Probability density functions for all can colors along the saturation axis. 56
4.12 Probability density functions for all can colors along the value axis. 57
4.13 Probability density functions for the colors in the red category. 58
4.14 Sample image quantized using the 8-color lighting-shift based classifier. 59
5.1 Diagram showing examples of each variable used in the theoretical equations. 63
5.2 Theoretical results for c = 2^24, p = 2 and p = 3, n = 3, and k varying. 65
5.3 Theoretical results for c = 2^24, p = 3, and k varying along the x axis. Each line corresponds to a different value of n. 66
5.4 Theoretical results for c = 2^24, n = 4, and k varying along the x axis. Each line corresponds to a different value of p. 67
5.5 Simulation (averaged) results for c = 2^24, p = 2 and p = 3, k = 10, and n varying. 68
5.6 Image of single hue region synthetic database. Red corresponds to higher values; blue to lower values. 69
5.7 Results on single hue region database for varying numbers of bins, using uniform (blue) and accuracy sweep (red) quantization. Average accuracy across the different shifts is shown in the lower right hand plot. Standard deviation of average value is shown with dotted lines. Progressively larger shifts from none (upper left hand plot) to 7 (middle lower plot) are shown in the remaining plots. 70
5.8 Database of objects completely spanning the hue axis. 71
5.9 Results of uniform (blue) and accuracy sweep (red) methods. 72
5.10 Original database for histograms with two colors. 73
5.11 Average accuracy of uniform (blue) and accuracy sweep (red) methods as a function of shift. 74
5.12 Results of uniform (blue) and accuracy sweep (red) methods. 75
5.13 Second two-peak database. 76
5.14 Average accuracy for second two-peak database as a function of shift. 77
5.15 Results of uniform and accuracy sweep methods on second two-peak database. 78
5.16 Transformation from average illuminant to early morning and evening illuminants. 79
5.17 Comparison of original and warped databases. 80
5.18 Results for training (original) and testing (warped). 81
5.19 Over-fitting of accuracy sweep method. 82
6.1 Sample soda images used in 14-can database. Soda shown here is Publix brand Diet Cola, under each of eight different illuminants. 84
6.2 Colormaps for 2, 3, 4, 5, 6, 8 and 14 colors. 85
6.3 Theoretical predictions (blue solid lines) for c = 2^24, p = 2 and p = 3, n = 3, and k varying. Real database results (red dashed lines) for c = 2^24, p = 3 and p = 5, n = 3 averaged over 20 sets, and k varying. 86
6.4 Real database results for n varying, with k = 8 and k = 14, p = 2 and p = 3. The blue solid lines show k = 14 and the red dashed lines show k = 8. The blue and red lines with triangle markers show the results from the comparison between theoretical data (blue) and real data (red). 87
6.5 Real and theoretical results for k = 1 to k = 25. Dotted line shows theoretical results for p = 2. Dashed line shows theoretical results for p = 3. Crosses show averaged real results for p = 2. Stars show averaged real results for p = 3. 90
6.6 Uniform quantization results for the workplace database. 93
6.7 Example of the characteristic knee in the accuracy curve for real data. 94
6.8 Comparison of uniform and accuracy sweep methods on the workplace database. 95
6.9 Full database accuracy vs. number of bins for uniform quantization. 98
6.10 Localized database accuracy vs. number of bins for uniform quantization. 99
6.11 Comparison of accuracy sweep (red) and uniform (blue) quantization methods. 101
6.12 Comparison of different numbers of orientations with respect to accuracy. 102
6.13 Comparison of lighting condition change to orientation change. 103
6.14 Images and 2-D histograms using [r, g] chromaticities. 104
6.15 Comparison of uniform quantization methods. 106
6.16 Histogram results for the full database, without color constancy. 108
6.17 Histogram results for the full database, with color constancy. 109
6.18 Comparison between uniform quantization results on full database for data with and without retinex pre-processing. 110
6.19 Comparison between results with different color constancy methods. 111
7.1 Sample image of input to system. Region used to create histogram is outlined in red; remainder of image is shadowed. 116
7.2 Sample histograms from database. 117
7.3 Comparison of 7-Up and Mountain Dew nutritional information. 124
A.1 RGB color space. 134
A.2 Munsell color space. 135
A.3 Munsell colors used by Berlin and Kay, with focal colors (dots) and region boundaries. 136
B.1 Sample image used to generate full database of 86 cans, with 8 different orientations and 4 different illuminants. 142
B.2 Samples of images used to generate full database of over 82 cans, with 8 different orientations and 4 different illuminants. Top row is not processed; bottom row is processed with the retinex algorithm. 143














LIST OF TABLES


3.1 Number of trials given bit-depth ................ ... 33

3.2 Number of trials given bit-depth ................ ... 33

3.3 Results of the d' and p(A) analyses. ................. 35

3.4 Gamma Calibration Data .................. .... .. 37

6.1 Real data results for p = 3. CC shows the expected results if the algo-
rithm is performing correct color constancy and accurately identifying
each object. .................. ............. .. 89

6.2 Object recognition accuracy generated from database of 9 soda cans
under 4 different lighting conditions. Ill. in the table refers to illumi-
nant, and indicates the images that were used as the templates in the
database, while the test data consisted of the remaining 27 images. 92

6.3 Accuracy on household database ............ .. .. .. 96

6.4 Accuracy on full database for lighting shift. . . ...... 97

6.5 Accuracy on full database with orientation varied and lighting constant. 100

6.6 Accuracy on full database for lighting shift quantization methods. .. 107

6.7 Accuracy on full database with retinex pre-processing. . ... 112

7.1 Number correct when identification of sodas is tested using orientation
midway between those in the database. Eight cans with two orienta-
tions in database, fourteen colors. The numbers 1 and 2 correspond to
orientations; Err indicates what was chosen instead of the correct soda.
TM TM
Codes are: Sch = Schweppes CT =Country-Time Lemonade ,
TM
EO = Eckerds Orange Drink WG Welch's Grape DrinkTM, C
TM TM TM
SCoca-Cola D7 Diet 7-Up CD Canada Dry and 7U =
TM
7-Up ......... ...................................119










7.2 Number correct when identification of sodas is tested using orientation
midway between those in the database. Eight cans with four orien-
tations in the database, eleven colors. The numbers 1, 2, 3, and 4
correspond to orientations; Err indicates what was chosen instead of
the correct soda. Codes are: Sch = Schweppes™, CT = Country-Time
Lemonade™, EO = Eckerd's Orange Drink™, WG = Welch's Grape
Drink™, C = Coca-Cola™, D7 = Diet 7-Up™, CD = Canada Dry™,
and 7U = 7-Up™. .................................. 120

7.3 Number correct when identification of sodas is tested using orientation
midway between those in the database. Ten cans with four orientations
in the database, eleven colors. The numbers 1, 2, 3, and 4 correspond to
orientations; Err indicates what was chosen instead of the correct soda.
Codes are: Sch = Schweppes™, CT = Country-Time Lemonade™, EO
= Eckerd's Orange Drink™, WG = Welch's Grape Drink™, C = Coca-
Cola™, CD = Canada Dry™, MD = Mountain Dew™, P = Pepsi™,
SL = Slice™, and LDL = Lipton Diet Lemon Brisk Iced Tea™ ... 121

7.4 Number correct when identification of sodas is tested using orientation
midway between those in the database. Eleven cans with four orien-
tations in the database, sixteen colors. The numbers 1, 2, 3, and 4
correspond to orientations; Err indicates what was chosen instead of
the correct soda. Codes are: Sch = Schweppes™, CT = Country-Time
Lemonade™, EO = Eckerd's Orange Drink™, WG = Welch's Grape
Drink™, C = Coca-Cola™, P = Pepsi™, SL = Slice™, LDL = Lip-
ton Diet Lemon Brisk Iced Tea™, 7U = 7-Up™, WC = Wild Cherry
Pepsi™, and DP = Diet Pepsi™. ....................... 122

7.5 Number correct when identification of sodas is tested using orientation
midway between those in the database. Twelve cans with four orien-
tations in the database, sixteen colors. The numbers 1, 2, 3, and 4
correspond to orientations; Err indicates what was chosen instead of
the correct soda. Codes are: Sch = Schweppes™, CT = Country-Time
Lemonade™, MMO = Minute Maid Orange Soda™, WG = Welch's
Grape Drink™, C = Coca-Cola™, P = Pepsi™, SL = Slice™, LDL
= Lipton Diet Lemon Brisk Iced Tea™, 7U = 7-Up™, WC = Wild
Cherry Pepsi™, DP = Diet Pepsi™, and SP = Sprite™ ........ 123

B.1 Sodas in final database. ................... .... 141









Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

EFFICIENT OBJECT RECOGNITION
USING COLOR QUANTIZATION

By

Signe Anne Redfield

December 2001


Chairman: John G. Harris
Major Department: Electrical and Computer Engineering


A simplification of the color histogram indexing algorithm is proposed and an-

alyzed. Instead of taking a histogram consisting of hundreds of colors, each input

image is first quantized to only a few colors (between eight and sixteen) and the fea-

ture vector is generated by taking a histogram of this smaller space. This increases

the efficiency of the system by orders of magnitude. We also proposed that this would

reduce the effects of lighting change on the algorithm and that this would be a bet-

ter model for the human object recognition mechanism than the algorithm combined

with color constancy alone.

In support of the contention that this may be a better human model, a psy-

chophysical experiment was conducted. The bit-depth of human color memory was

shown to lie between 3 and 4 bits, corresponding to 8 to 16 color categories when a

color is remembered for five seconds. This experiment created a bridge between the

worlds of psychophysical experimentation and computer-tested algorithms.

The research showed that quantization can occasionally compensate for small

lighting changes, but that the compensation is highly database-dependent and erratic.

However, quantization always produced a much more efficient system and generally

did not substantially reduce the accuracy.









The results of this work were threefold. First, human color memory is relatively

poor, indicating that a system incorporating quantization will be far closer to mimick-

ing human abilities than a system without it. Second, quantization alone is insufficient

to perform color constancy in most cases. Third, with or without a color constant

pre-processor, our results consistently showed that quantization has little effect on

accuracy when using more than sixteen bins. Object recognition accuracy degrades

substantially as the number of color categories drops below six. From 10 categories to

256, accuracy is essentially unchanged. Quantization is a very efficient way to reduce

the computational complexity and storage requirements of this algorithm without

substantially affecting its object recognition accuracy.













CHAPTER 1
INTRODUCTION


The purpose of this research is to explore the effect of quantization on object

recognition accuracy when color is the only cue. There are many ways in which a

system can identify an object. These sensory systems range from very simple (such

as distinguishing between obstacle and air via infra-red return) to extremely complex

(such as using multi-sensory input and sensor fusion to make sharp distinctions be-

tween specific objects). Object recognition using color images lies somewhere in the

middle of this range. Generally the system is being required to make fairly sophis-

ticated choices (track the red object but ignore the other red object), with limited

resources. There are many other cues that can and often should be used, even when

vision is stipulated as the primary sense, but here we will be exploring the effects of

quantization when only color is considered.

Object recognition using images is generally inefficient. The vast quantities of

input data tend to overwhelm systems attempting to extract only useful information.

Systems that recognize objects based on shape alone require moderate databases

and impose a substantial computational burden. Systems that rely on color have to

deal with triple the input information required for shape-based systems, and have

difficulty when the lighting changes. However, under controlled conditions, or when

the lighting changes are somewhat predictable, efficient object recognition using color

is possible. Furthermore, if a truly efficient object recognition algorithm using color

can be derived, more expensive techniques can be used on only the subset of data

produced by the color recognition process. This could dramatically speed up the

recognition process, even when multiple object properties are used as features.









Objects can be identified using many possible features. Here we look exclusively

at color, used as the only feature. Obviously, not all databases are suited for this

approach. Soda cans, however, form a very good data set for this experiment. They

are uniform in shape and thus easily segmented from their background, and they come

in a wide variety of colors. In addition, there are many potential uses for the system,

such as a butler robot. If you send your robot to get you an ice cold Coca-Cola™,

the robot should be able to identify it reliably and not bring you Fresca™ instead.

Our research shows that quantization can make an appropriate object recognition

algorithm much more efficient.

Vision is a passive sense. Assuming we are using a robot to identify objects, the

robot does not need to disturb what it is looking at to identify the object. The robot

does not even have to be close to the object to be capable of identifying it. In addition,

vision is not object specific. The same sensor can be used to identify many different

types of objects. For example, humans can tell simply by looking which object is a

chair and which is a desk. They can even differentiate between the blue chair and the

red chair. This makes vision one of the most potentially flexible sensory modalities.

Furthermore, vision enables humans and, potentially, robots to precisely determine

the locations of relatively distant objects. Other senses are not as useful. Sonar, for

example, would allow a robot to approximately determine distant objects and closely

determine nearby objects. Infra-red reflections allow a robot to only approximately

determine nearby objects, and require the use of additional sensors to further define

its environment. In addition, in order to use such senses effectively, and eliminate

possible confusion, the placement of these sensors is critical. If they are poorly placed,

the robot will be unable to distinguish between critical situations. Furthermore, many

sensors are needed in order for these alternative sensory modalities to operate well.

With vision, however, a single camera can be placed anywhere with a good view and

will provide far more information. Vision is in many ways an ideal robotic sensor.









However, vision is also the source of many problems. Vision-based object recogni-

tion is plagued by the computationally intensive nature of image data. Color images

in particular require vast storage capabilities. An image of 128 by 128 pixels, with

8 bits or 256 levels per color band requires 393,216 bits just to store all the pixel

values. Almost all the algorithms using visual data also require substantial process-

ing to extract the relevant information. Image data contains remarkable quantities of

information, but extracting that information can be very time-consuming and compli-

cated. The same characteristic that makes vision such a wonderful sensory modality

also makes it very difficult to work with.
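The storage figure above follows directly from the stated image dimensions; a trivial arithmetic sketch (using the 128 by 128 size and 8-bit bands given in the text):

```python
# Raw storage for a 128x128 RGB image at 8 bits per color band.
width, height = 128, 128
bands = 3            # R, G, B
bits_per_band = 8    # 256 levels per band

total_bits = width * height * bands * bits_per_band
print(total_bits)    # 393216 bits, matching the figure in the text
```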

It is, of course, possible to obtain tremendous amounts of information about the

world using only grayscale images. Why increase the robot's computational burden by

using color? If we assume that all the edge-detection and shape-based techniques we

would use for a grayscale image are already available, the addition of color allows us

to make the task of separating objects from their environment much easier. Suddenly

we can distinguish between an apple and an orange without complicated texture

analysis. We can even easily distinguish the ripe oranges from the unripe oranges.

Unfortunately, the computational complexity of the system will be greatly in-

creased. How can we use the wealth of information inherent in this sense without

overwhelming our system? Given the computational burden of shape based algo-

rithms, and the relatively smooth variation of chromaticity from pixel to pixel, per-

haps a statistical method working only with chromaticity would be an appropriate

solution.

Swain and Ballard [57] proposed the histogram indexing algorithm in 1991. Fun-

damentally, the algorithm simply looks at the statistical probability of the occurrence

of each color for a given object, and uses that information to determine the most likely

match. Chapter 2 explores the original algorithm in detail. It is sufficient to say here

that the algorithm has almost all the properties required for robust object recognition









in a realistic environment. It is robust to orientation changes, scaling, rotation and

deformation. It is unfortunately fragile when exposed to lighting changes and large

databases. Using color as the only feature dramatically reduces both storage and

processing requirements.

Several suggestions of ways to make the algorithm more lighting invariant have

been made, and are explored in more detail in Chapter 2. The authors suggested the

use of a color constant preprocessor before the algorithm is run. The field of color

constancy includes all the algorithms that attempt to convert an image taken under

one light to an image taken under a different, default light. There have been many

approaches to this problem. Of these, Land's retinex theory [36, 37, 38, 39, 40, 30, 51,

10, 33, 49, 31, 67, 42] is one of the most widely researched and is relatively simple to

implement. Alternative methods include Forsyth's gamut mapping algorithm [23, 21]

and global methods such as subtracting the dominant color from the entire image, or

normalizing each pixel by its luminance.

Instead of color constancy methods, some researchers opt for a local method uti-

lizing shape information [25]. Instead of taking histograms of the colors themselves,

this method incorporates color constancy into the algorithm by taking histograms of

color ratios. However, this simply adds a step to our research, as we would still be left

with the problem of quantizing the color ratios. All methods that incorporate color

constancy into the algorithm are more computationally complex than the original.

In addition, many simply are not good enough for object recognition. Funt et al.'s

paper [24] showed fairly conclusively that of the simple versions of the algorithms,

Forsyth's and Land's algorithms work almost equally well, obtaining roughly 70%

accuracy on their database. Their database is unrealistic, containing images of single

objects under dramatically colored single illuminants with a white patch available for

calibration purposes. The authors concluded that their results meant that current









color constancy algorithms were insufficient to reliably recognize objects in the real

world.

For our purposes, the algorithms may or may not be sufficient, but they are

unquestionably too complicated and require far too many resources. In order for our

robot to function efficiently, it should not have to spend five minutes in front of the

fridge with the door open compensating for current lighting conditions.

Histogram indexing was biologically inspired. Psychophysical experiments showed

that this was a plausible approximation to one of the methods humans use to identify

objects. Color constancy is also biologically inspired. The retinex theory was devised

as a model for human color constancy. However, algorithms that approximate human

color constancy are not perfect. In fact, although the general perception may be that

humans are quite good at compensating for the illuminant, research has shown that

humans are decidedly imperfect [19, 69, 1]. Consensus in the biological literature is

that color constancy is instantaneous. Humans see a scene, and they are instantly

capable of determining the approximate hue of a given object without respect to

lighting. Background information on color constancy and other psychophysical phe-

nomena is discussed in Chapter 2 and Appendix A.

The object recognition task, in and of itself, must incorporate a delay of some

sort. In order to recognize an object, one must have seen it before; thus the element

of time is implicit in the task. What happens to colors when they are stored in human

memory? Chapter 3 explores this issue. Human sensitivity to shifts in hue with a

five second delay corresponds to between 8 and 16 color categories.

We implement the degradation of color memory in a robotic system with quan-

tization. For the many possible colors that our camera can capture, we sort them

into categories via quantization and then use those categories to generate our his-

tograms. Chapter 4 explores the background of quantization, and specific methods,

in detail. Swain and Ballard showed that under strictly controlled lighting conditions









objects could be reliably recognized using color histograms. They used 512 colors to

prove their point, but showed little degradation for as few as 64 colors. However,

our previous experiments [28, 53, 52, 54] have shown that quantization to between 8

and 16 colors can produce improved recognition accuracy under varying lighting con-

ditions, and rarely decreases accuracy. Analysis of lighting shifts showed that when

fluorescent light and daylight are the primary sources, six chromatic categories and two

achromatic categories are sufficient for our databases.
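The quantize-then-histogram feature described above can be sketched as follows. This is a minimal illustration using uniform quantization with two levels per band (eight categories); the actual experiments compare several quantization methods and bin counts, as detailed in Chapter 4.

```python
import numpy as np

def quantized_histogram(image, levels_per_channel=2):
    """Quantize an RGB image to a few colors, then histogram the result.

    image: (H, W, 3) uint8 array. With levels_per_channel=2, each band
    is cut to 2 levels, giving 2**3 = 8 color categories in total.
    """
    step = 256 // levels_per_channel
    quantized = image // step                      # each band -> 0..levels-1
    # Combine the three band indices into a single category index.
    codes = (quantized[..., 0] * levels_per_channel + quantized[..., 1]) \
            * levels_per_channel + quantized[..., 2]
    hist = np.bincount(codes.ravel(), minlength=levels_per_channel ** 3)
    return hist / hist.sum()                       # normalized feature vector

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(128, 128, 3), dtype=np.uint8)
h = quantized_histogram(img)
print(h.shape)  # (8,)
```

The resulting eight-element vector replaces the hundreds-of-bins histogram of the original algorithm, which is where the efficiency gain comes from.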

Our results on synthetic data are presented in Chapter 5. This chapter also in-

cludes theoretical and simulation results, showing that quantization is an effective way

of making the algorithm more efficient and is a potential replacement for more tradi-

tional color constancy algorithms. Chapter 6 contains the results when the database

is made up of still images of real objects. These databases range in size from 9 ob-

jects to over 80 objects, generally with at least 4 common lighting conditions. These

results show that using quantization as a substitute for color constancy algorithms

is impractical. However, quantization produces a far more efficient system, with or

without color constancy algorithms. Chapter 7 describes several prototype systems

using 10, 11, 14, and 16 colors with varying results. These systems show that for small

databases and minimally varying lighting, good lighting invariance can be obtained.

For more varied lighting, additional data under representative lighting conditions will

allow the algorithm to perform well. When color constancy algorithms are used with

quantization, the system combines the increased accuracy of the color constancy

algorithms and the increased efficiency of the quantization.

Chapter 8 summarizes the results of this research. In general, human color memory

is capable of distinguishing between members of 8 to 16 categories. The results of

experiments on still images and synthetic databases show that there is little to no

degradation in accuracy as the number of quantization categories is decreased to

between 8 and 16. Accuracy tends to decrease as the number of categories falls








below 8. In general, our results indicate that quantization is a very effective means

of increasing the efficiency of a system without decreasing its accuracy. In some

cases, quantization may increase the accuracy as the number of bins is decreased to

between 8 and 16. Our prototypes show that near perfect accuracy is obtainable in

real situations, with minimally varying lighting using only 11 color categories.













CHAPTER 2
ROBOTIC OBJECT IDENTIFICATION


Our purpose is to demonstrate the effect of quantization on color object recog-

nition. Unfortunately, almost all color constancy algorithms published are deemed

"good" by the authors, with no indication of a metric with which to judge them other

than the human eye. If we are to determine whether quantization can have the side effect of

color constancy, we need a more independent, reproducible metric. The accuracy of

color object recognition provides such a metric.

Let us assume we have a butler robot whose function is to deliver soda cans.

We would like our robot to identify these cans in a realistic laboratory environment.

Many experiments have tested algorithms in very austere environments [57], with fixed

environmental variables such as only allowing the robot to move within a rigidly con-

trolled area such as a hallway [55]. Generally, the constraints on these environments

are justified by assuming the algorithm will be implemented in an equally constrained

environment, such as a factory floor. However, these algorithms generally break down

in more realistic, uncontrolled environments [57]. Some of the earliest autonomous

robots responded to stimuli in simple ways, but were capable of exhibiting behaviors

that we, as humans, associate with emotion [12]. These robots had very limited sen-

sors, but were very robust. Vision, as the most complex sense available to robots, is

both extremely data-intensive and extremely fragile in the way it is usually used for

robots.

In the laboratory where our robot must work, the lighting varies and the overall

environment changes often. Furniture is moved, creating new shadows and reflections.

The arrangement of drinks within the refrigerator changes. This is not a rigidly

controlled environment. In fact, it contains many of the characteristics of the real









world: it is minimally constrained, chaotic and unpredictable. This dissertation is not

concerned with the physical design of our robot, or with its manipulators, or with how

our robot finds the refrigerator, the offices, or even the cans within the refrigerator.

We could even use a simpler method to determine which can is which, such as reading

the bar codes on the sides of the cans. One purpose of this research is to determine

the effect of quantization on color object recognition with a view to determining its

effectiveness as a substitute for more elaborate color constancy algorithms, and we

use our robot's identification of the correct can as the metric with which to compare

the algorithms.


2.1 Platform


Quantization has several advantages over color constancy. For example, it is

more useful in the case of an actual robotic implementation. In order for our robot

to function, the object recognition algorithm must satisfy certain constraints. First,

there is the issue of price. In order to keep the cost of the robot to a minimum, we want

an algorithm that will minimize use of the processor, minimize the necessary memory,

and not require expensive sensory equipment. In theory, a better sensor implies better

results. In actuality, given finite resources, a more expensive sensor will affect the

rest of the robot, possibly resulting in a cheaper processor, less memory, and perhaps

even compromises in the design of the robot's manipulators and its maneuverability.

Second, our robot should work in real time. In order to identify the can, the robot

will have to keep the door of the refrigerator open. The faster the robot can decide,

the sooner the refrigerator door will close and the less electricity will be wasted. In

addition, the faster the robot knows which is the correct can, the faster the can will

be delivered.

These two constraints (price and speed) define the degree of acceptable computa-

tional complexity. Price determines how fast a processor we can use, with how much









memory. Price also determines the quality of the input data, by determining the

capabilities of the camera and frame grabber. The speed constraint combined with

the processor and memory constraints determine the complexity of the final object

recognition algorithm. The final criterion is that the system must be robust to as

many changes as possible in the object to be identified and in the environment. Price

must be minimized, while speed and robustness must be maximized. Quantization

will make the system more efficient, both by reducing the number of elements that

need to be stored in the database and by reducing the number of computational

operations that need to be executed in order to identify an object.


2.2 Identification Methods


There are many possible solutions to the soda identification problem. Instead of

actually asking our robot to identify the cans, we could standardize the organization

of the refrigerator. Sprite™ would always be in the same place, as would every other

soda. Unfortunately, this method is unlikely to work in the chaotic environment

assumed above. There is no guarantee that the person responsible for filling the

refrigerator will choose to buy the same drinks every time, or put them in the same

place. Furthermore, the contents of the refrigerator may be shifted to allow room for

items other than drinks.

We could try to simplify the system by using grayscale images instead of color

images. However, in order to recognize cans with grayscale images, we would need

to use shape-based methods. Cans tend to have similar distributions of curves and

angles, and every can has one side dedicated to nutritional information. The nutri-

tional information has approximately the same shape distribution for every can. If

we wanted to identify cans based on the nutritional information we would have to be

able to extract the words for the ingredient list, and match them up to a database

of words to determine the ingredients, which would then be compared to a known









database of objects. The nutritional content of different sodas is remarkably simi-

lar. If we want to identify the cans based on the logos of the different drinks, we

would need some form of shape-matching, and possibly character recognition [8, 2].

Fundamentally, shape-based methods tend to be both computationally intensive and

time-consuming, violating our cost criteria.

Instead, we implement Swain and Ballard's histogram indexing method [57]. This

method, inspired by human psychophysical experiments, meets our criteria in almost

every way. It is very efficient. Instead of performing complicated shape-based cal-

culations to extract information, the computer generates a histogram of the colors

present in an image of the desired object. This histogram is normalized and used as

the feature vector to describe the object in the database. When an unknown object is

presented, its histogram is generated and compared to the histograms in the database.

Simple Euclidean distance is used to compare the histograms, and a nearest neighbor

classifier [18] is used to identify the objects. The algorithm satisfies the cost criteria

above. It is also robust to orientation changes, scaling, and rotation and deformation

of the object. The only area in which it fails is robustness to the environment when

the lighting changes.
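The matching step just described, normalized color histograms compared by Euclidean distance under a nearest-neighbor classifier, can be sketched as follows; the database contents here are made-up illustrative vectors, not real soda-can data:

```python
import numpy as np

def identify(unknown_hist, database):
    """Return the database label whose normalized histogram is nearest
    (in Euclidean distance) to the unknown object's histogram."""
    best_label, best_dist = None, float("inf")
    for label, model_hist in database.items():
        dist = np.linalg.norm(unknown_hist - model_hist)
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Toy database of two 4-bin normalized histograms (illustrative values).
database = {
    "cola":  np.array([0.7, 0.1, 0.1, 0.1]),
    "lemon": np.array([0.1, 0.7, 0.1, 0.1]),
}
print(identify(np.array([0.6, 0.2, 0.1, 0.1]), database))  # cola
```

With only a handful of quantized bins per histogram, each comparison is a few arithmetic operations, which is what keeps the system within the price and speed constraints above.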

Because of this failure, the algorithm is a good choice for testing the effectiveness of

color constancy methods. The histogram indexing algorithm's accuracy will become

close to perfect when the color constancy method is compensating adequately for the

lighting conditions. As the input becomes less and less color constant, the accuracy

will decrease.


2.3 Histogram Indexing


The color histogram indexing algorithm was originally proposed as an instance of

animate vision for robotic systems. The two requirements it must satisfy are that

it must work in real time and that it must display environmental robustness. The









stated goals of the researchers were, first, to determine the identity of an object with

a known location and second, to determine the location of a known object.

To solve the first problem, that of determining the identity of an object, the

authors define a similarity metric called histogram intersection, which compares the

number of pixels of each color in the new image to the number of pixels of that color in

the model in the database. The model in the database is assumed free of background

(non-object) pixels, while segmentation of real data is assumed incomplete, or at best

inaccurate.

Histogram intersection is robust to distractions in the background of the object,

to viewing the object from a variety of viewpoints, to occlusion, and to varying image

resolution. It is not robust to varying lighting conditions or to large database sizes.

Swain and Ballard's definition of the intersection of two histograms is


\sum_{j=1}^{n} \min(I_j, M_j)                    (2.1)

where I and M are histograms with n bins each. They obtained a match value from

zero to one by normalizing this intersection by the number of pixels in the reference

histogram, and scaling the image of the unknown object to the same number of pixels.
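This intersection and normalization can be sketched as follows (a minimal illustration; it assumes both histograms are over the same n bins and that the unknown image has already been scaled to the model's pixel count, as described above):

```python
def histogram_intersection(I, M):
    """Swain-Ballard histogram intersection: sum of bin-wise minima,
    normalized by the pixel count of the model histogram M."""
    raw = sum(min(i_j, m_j) for i_j, m_j in zip(I, M))
    return raw / sum(M)   # match value in [0, 1] when I is scaled to M's size

image_hist = [40, 10, 50]   # unknown object (100 pixels, 3 bins)
model_hist = [45, 15, 40]   # database model (100 pixels, 3 bins)
print(histogram_intersection(image_hist, model_hist))  # 0.9
```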

For efficient indexing into a large database, they calculated a match using only

some number of the largest bins. Their database contained 66 images of real objects

that were used as the training set, and a testing set of images including occluded

and rotated objects. They obtained high accuracy keeping only the 10 largest bins, and

99.9% accuracy keeping 200 bins. They simulated changing lighting conditions, but

only linear intensity changes, not chromatic or non-linear changes.









To solve the second problem, that of finding the location of a known object, they

define a ratio histogram R as


R_i = \min\left(\frac{M_i}{I_i}, 1\right)                    (2.2)


The values of the histogram R replace the image values and the result is convolved

with a mask of the appropriate shape and size for the object in question. If the target

is in the image, a peak in the convolved image indicates its most likely location.

Results were good, with 5 occluded objects out of 32 corresponding to the 2nd highest

peak, 1 occluded object corresponding to the 7th highest peak, and the rest identified

correctly.
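The backprojection-and-convolution procedure can be sketched as follows. This is an illustrative simplification: a square box mask stands in for the object-shaped mask of the original, and the image is assumed to be pre-quantized to per-pixel color codes.

```python
import numpy as np

def backproject_locate(image_codes, image_hist, model_hist, mask_size=3):
    """Histogram backprojection (Eq. 2.2): replace each pixel by
    R[c] = min(M[c] / I[c], 1), convolve with a box mask, and return
    the top-left of the peak window as the most likely location."""
    ratio = np.minimum(model_hist / np.maximum(image_hist, 1e-9), 1.0)
    back = ratio[image_codes]                 # per-pixel ratio values
    # Box-mask convolution over the valid region, via shifted sums.
    h, w = back.shape
    k = mask_size
    conv = np.zeros((h - k + 1, w - k + 1))
    for dy in range(k):
        for dx in range(k):
            conv += back[dy:dy + h - k + 1, dx:dx + w - k + 1]
    y, x = np.unravel_index(np.argmax(conv), conv.shape)
    return int(y), int(x)

# Tiny example: a 2-color image where color 1 belongs to the target.
codes = np.zeros((8, 8), dtype=int)
codes[2:5, 3:6] = 1                           # 3x3 target patch of color 1
image_hist = np.bincount(codes.ravel(), minlength=2).astype(float)
model_hist = np.array([0.0, 9.0])             # model is entirely color 1
print(backproject_locate(codes, image_hist, model_hist))  # (2, 3)
```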

The histogram indexing algorithm is extremely robust. It retains its accuracy

for many changes in the object, such as rotation, occlusion, scaling and deforma-

tion. It is, however, extremely sensitive to lighting changes. If the lighting changes,

then many pixels will change sufficiently to move from one histogram bin to another,

profoundly altering the overall histogram. To solve this problem, the authors sug-

gested a simple color constancy preprocessor. Many approaches to the solution of this

problem have been attempted. Color constancy based systems have shown limited

success, while shape based systems have shown slightly more. All these solutions

require dramatically more computations and memory than the original algorithm.


2.4 Color Constancy


The general color constancy problem has been studied extensively [1, 4, 9, 19, 23,

29, 39, 40, 42, 58, 69, 47]. Two basic assumptions underlie most of these solutions.

First, people do it well. This is addressed in more detail in Appendix A, but is

open to debate, depending on your criteria for "well." People are capable of color

constancy to some degree, but actual color-to-color matching exercises show that









people can perform well only if the color is perceived as an aspect of a concrete

object [1]. Simply correcting for a given lighting condition without this association

seems to be very difficult. The second common assumption is that color constancy

is an instantaneous phenomenon, and therefore the effect of color memory on color

constancy is negligible in this context. This may be true. Many color constancy

experiments involving the same object rely on first one exemplar, and then a delay,

and then the exemplar under a different light. This was, in fact, one of the first

examples of color constancy in Land's and McCann's work [36, 37, 40]. However,

this delayed experiment inherently involves color memory, as color memory has been

shown to degrade over delays on the order of 200 ms [61]. Even within a scene, as we

look from one object to another under the same light, it is not always obvious that two

colors are different. It is often necessary to place the objects side by side to enable an

observer to see the substantial difference between their colors. Researchers attempt to

get around this by having two known identical scenes visible to the participant, and

asking the participants to adjust the lighting on one scene so it matches the other.

There are two types of solutions to the color constancy problem. Global solutions

make up the first type. Generally, color constancy solutions rely on the stricture that

lighting changes have relatively low spatial frequency compared to surface property

changes. The most extreme manifestation of this, therefore, is to remove the DC color

component of a given image. Many global algorithms exist along these lines. Some

methods attempt to normalize the color values, to eliminate dependence on intensity

or some other characteristic. Finlayson [22] has developed a color constant space

that effectively eliminates the illuminant. Unfortunately, in order to determine the

required projection that will take you into this space, you must first have examples

of the changes in lighting you wish to compensate for.

Local algorithms make up the second type of solutions to the color constancy

problem. These include Land's retinex theory of color constancy, first published in









1959 [36, 37]. The algorithm uses a very large local region (almost global) comparing

the current pixel value to the overall color surrounding it, with a steep discount as a

function of distance. In 1971 [40], he and McCann published additional results using

the retinex as a model for human color vision. In the mid-eighties, Land published

papers containing new versions of this theory [38, 39], which was revised and expanded

by other researchers throughout the eighties and nineties [9, 10, 4, 33, 49, 31, 42].

Most color constancy algorithms are based on what is known as von Kries' prin-

ciple. Coefficients independently adjust the gain of each photoreceptor (or channel)

to obtain surface color descriptors. These factors vary according to the author cited:

for example, Brill and West [13] use factors of one over the output of that photore-

ceptor for a known white patch. In addition to failing when the photoreceptor classes

are not independent of each other, this algorithm also illustrates a common failing

among color constancy algorithms. In general, color constancy algorithms require

either known surfaces (either known whites or known types of surfaces, such as Mon-

drians: flat, matte images of smooth color regions) or known lighting conditions, or

both.
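The von Kries adjustment described above is easy to sketch. The following is an illustrative sketch only (Python with NumPy assumed; function and variable names are ours), using Brill and West's choice of coefficients: one over each channel's response to a known white patch.

```python
import numpy as np

def von_kries_correct(image, white_response):
    """Von Kries-style diagonal correction: independently scale each
    channel's gain by 1 / (that channel's response to a known white
    patch), following Brill and West's choice of coefficients."""
    white_response = np.asarray(white_response, dtype=float)
    # Broadcasting divides along the last (channel) axis.
    return np.asarray(image, dtype=float) / white_response

# Under a reddish illuminant the white patch reads (0.9, 0.6, 0.5);
# dividing every pixel by that response maps the patch back to (1, 1, 1).
patch = np.array([[0.9, 0.6, 0.5]])
corrected = von_kries_correct(patch, [0.9, 0.6, 0.5])
```

Note that this presumes an independent gain per photoreceptor class; as discussed above, it fails when the classes are not independent.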


2.4.1 Retinex

In 1959, Edwin Land [36, 37] proposed the first biologically-based color constancy

algorithm. In the mid-eighties, with the advent of computers fast enough to simulate

the process, he revisited his previous work [38, 39]. These new algorithms explored

possible methods to improve the biological accuracy (incorporation of Mach band

effects, for example) and the simplicity of the computation. In the final form, Land's

algorithm is similar to homomorphic filtering, and relies on similar properties of

images. Homomorphic filtering enhances high frequencies and reduces low frequencies,

in the frequency domain. In the retinex algorithm, high frequencies are enhanced and

low frequencies are reduced in the spatial domain. The center/surround retinex uses









the following three steps to obtain a final relatively color constant image. First, for

each color channel independently, take the log of the value of each pixel. Second,

again for each color channel independently, subtract the result of convolving a local

surround function with the image from each pixel. Third, globally scale the resulting

values appropriately for image display. Much research has been done on which type

of surround function is best [33, 49, 31, 10, 42] and on the placement of the log

function. Jobson et al. [33] conclude that the Gaussian surround with the log taken

after the surround function instead of before provided the best results. However,

their interpretation of "best" seems distressingly free of any metric besides that of the

opinion of the researchers.
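The three steps can be sketched as follows. This is a minimal illustration (Python with NumPy assumed; function names are ours), using a Gaussian surround with the log taken after the surround convolution, the placement Jobson et al. prefer.

```python
import numpy as np

def gaussian_blur(img, sigma):
    """Separable Gaussian blur, used here as the local surround function.
    Assumes the kernel is shorter than the image side."""
    x = np.arange(-int(3 * sigma), int(3 * sigma) + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    k /= k.sum()
    img = np.apply_along_axis(lambda r: np.convolve(r, k, mode='same'), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode='same'), 0, img)

def single_scale_retinex(channel, sigma=8.0, eps=1e-6):
    """The three steps, applied to one color channel independently:
    take the log of each pixel, subtract the log of the blurred
    surround, then globally rescale for display."""
    channel = np.asarray(channel, dtype=float)
    out = np.log(channel + eps) - np.log(gaussian_blur(channel, sigma) + eps)
    return (out - out.min()) / (out.max() - out.min() + eps)
```

As with homomorphic filtering, the effect is to suppress the slowly varying (illumination) component and enhance local contrast.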

Moore et al. [42] designed and built an analog VLSI chip that performed the

retinex algorithm on real time video data, with good results. Their research demon-

strated a fundamental problem with the retinex algorithm, as it then was. In images

with large regions of a single color, the gray world assumption forces those regions

to gray, even when the actual color is highly saturated. They solved this problem

by introducing a variance compensation mechanism, which uses the local variance to

determine how much to change the overall color. Again, the metric for "good" and

"poor" results depends entirely on the opinion of the authors.

In 1986, Brainard and Wandell [10] published an analysis of the retinex theory

in terms of color constancy. Their concern was with color constancy that retains the

colors of objects independent of both nearby objects and illumination. They analyzed

the retinex algorithm developed by Land in 1983 [38] in this context. Land and

McCann [40] concluded that the retinex performs similarly to humans when only the

spectral composition of the illuminant is considered. Brainard and Wandell's paper

deals specifically with the effect of the algorithm on the perceived colors of objects

close to the object in question. They determined that "retinex is not a color constant

algorithm and that it is not an adequate model of human performance." [10], p. 1657.









The retinex is not a color constant algorithm because it normalizes the photoreceptor

gains to values that depend on the input image, rather than to a constant. It is not

an adequate model of human performance because it depends on the surfaces present

in a scene to a greater degree than a human would. However, the authors present this

conclusion with only the assertion that "A human observer ... perceives virtually no

change in the appearance of the upper two rows of chips" [10], p. 1656 whereas the

algorithm produces noticeable change in the chips' appearance. They do not provide

or use any quantitative metric.

More recently, researchers [33, 49, 31] have developed a multi-scale retinex al-

gorithm, with better results. These papers detail the development and testing of a

multi-scale retinex which achieves both color/lightness rendition and dynamic range

compression simultaneously. They use a Gaussian surround for their kernel, and per-

form the log function after the convolution with the surround kernel. They also use

a canonical gain/offset. The multi-scale version is simply the summation of several

single-scale retinexes, each with a different standard deviation Gaussian surround.

This is one of the few papers that compares the performance of the retinex to the

performance of other algorithms.
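The multi-scale summation can be sketched in a few lines (Python with NumPy assumed; the scales and equal weighting below are illustrative, not the published parameters):

```python
import numpy as np

def _gaussian_blur(img, sigma):
    # Separable Gaussian surround; assumes the kernel is shorter
    # than the image side.
    x = np.arange(-int(3 * sigma), int(3 * sigma) + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    k /= k.sum()
    img = np.apply_along_axis(lambda r: np.convolve(r, k, mode='same'), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode='same'), 0, img)

def multi_scale_retinex(channel, sigmas=(2.0, 4.0, 8.0), eps=1e-6):
    """Sum of single-scale retinexes, one per surround scale, equally
    weighted, with the log taken after the surround convolution."""
    channel = np.asarray(channel, dtype=float)
    out = sum(np.log(channel + eps) - np.log(_gaussian_blur(channel, s) + eps)
              for s in sigmas)
    return out / len(sigmas)
```

The small standard deviations give dynamic range compression; the large ones give color and lightness rendition; summing trades between the two.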

The follow-up to this study is the development of the multi-scale retinex with color

restoration [32]. This algorithm adds a color restoration step after the multi-scale

retinex processing, and purports to produce images that closely mimic the human

viewing experience. The researchers have taken out a patent on their method [50]

which contains more explicit instructions for setting the various parameters.


2.4.2 Gamut Mapping

Forsyth's [23] work is the only other color constancy algorithm to approach the

performance of the retinex algorithm when used in conjunction with color indexing.

He presents two algorithms, Crule and Mwext, which are effective under two different









sets of conditions. Crule works for any surface reflectance if the photoreceptors are

narrowband. Mwext functions in the case where both surface reflectances and illumi-

nants are chosen from finite dimensional spaces. Experimental work with Crule shows

that for "good" constancy, "a color constancy system will need to adjust the gain of

the receptors it employs in a fashion analogous to adaptation in humans." [23], p. 5.

He uses 5 assumptions for the development of this algorithm:

1. All surfaces are flat, frontally presented, and there are no shadowing or mutual
illumination effects.

2. There is a single, spatially uniform illuminant. Here, this means only that there
is only one illuminant at a time, not that there is only one type of illuminant.
The single, spatially uniform illuminant's chromaticity can be changed.

3. All surfaces are Lambertian and all reflection is diffuse. Surfaces vary only with
wavelength and do not fluoresce.

4. The problem he solves is defined in two parts: first, that the illuminant must
be estimated, and second, that some statement about the properties of the
surfaces in the image must be obtained.

5. "The product of any surface reflectance function, and any illuminant function,
and any photoreceptor sensitivity can be integrated with respect to wavelength.
Surface reflectance functions are neither greater than one, nor less than zero.
These are very weak assumptions." p. 7


Obviously, he is not solving the real world problems of naturally illuminated ob-

jects. Even the simplest real environment continually violates assumptions 1 and 2.

The wings of birds and butterflies (and all specular reflections) violate assumption 3.

There are several additional assumptions about the character of the illuminant.



1. "Illuminants are 'reasonable.'" p. 7 This means that, for instance, a sample
object seen under monochromatic light could not be reasonably used to identify
the object under white light. Also, "it must be possible to describe each member
of the set of illuminants that one observes (e.g. with a parameterization)." p. 7

2. Photoreceptors are also "reasonable," in that the illuminants must excite the
receptors.









3. If the illuminant produces metamers (a pair of metamers are a pair of patches
whose underlying pigment is different but that evoke the same receptor response
under a given illuminant), there is no constraint on the algorithm to predict that
they will look different under different lighting.

4. The colors of the objects in the image are not "unreasonably" distributed, and
there are "sufficiently many" different colors. Examples of an unreasonable
distribution of colors would include a forest scene (almost entirely green and
brown shades) or a scene viewed through colored glasses. According to this, a
reasonable distribution with sufficiently many different colors would include
scenes of man-made environments, such as a child's playroom.

5. Photoreceptor outputs do not degrade substantially for deeply colored illumi-
nants ("For any illuminant, a reasonable measurement of the photoreceptor
outputs is possible." p.7)


These assumptions are generally valid in the real world, except possibly for as-

sumption 4. Metamers, while common in theory, are rare in fact. Assumption 5

is a restriction on the illuminants where color constancy is possible rather than a

restriction on the receptors.

The basic task of the algorithm is to recover the RGB descriptor of a given sur-

face under a canonical illuminant. The canonical gamut is a convex set containing

all possible RGB responses under the canonical illuminant. Using this gamut, and

the above constraints, some illuminants can be ruled out and the remainder are a

linear transform away from the canonical illuminant. Crule solves for all the diagonal

matrices of coefficients that take the gamut of the image into the canonical gamut.

Due to the computational expense of calculating these matrices, an approximate so-

lution was calculated. Diagonal matrices can approximate matrices in the feasible set

due to the non-overlapping narrowband sensor constraint. Cuboid approximations

determine the intersection of the image gamut and the canonical gamut. The feasible

set with the largest volume is chosen as the mapping most likely to achieve color con-

stancy. This algorithm provided good results for Mondrian images with a sufficiently

diverse selection of colors.
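The core of the feasible-set computation can be illustrated with a toy one-sided cuboid approximation. This is our simplification, not Forsyth's implementation: each diagonal coefficient must keep every mapped pixel inside the canonical gamut, so each channel contributes an upper bound, and the corner of largest volume is simply the vector of those bounds.

```python
import numpy as np

def cuboid_diagonal_map(image_rgb, canonical_max):
    """Toy cuboid approximation of the feasible set of diagonal maps:
    a coefficient d_c is feasible only if d_c * p_c stays inside the
    canonical gamut for every image pixel p, so
    d_c <= min over pixels of (canonical_max_c / p_c).
    Choosing the feasible corner of largest volume picks those bounds."""
    pixels = np.asarray(image_rgb, dtype=float).reshape(-1, 3)
    canonical_max = np.asarray(canonical_max, dtype=float)
    upper = (canonical_max / np.maximum(pixels, 1e-9)).min(axis=0)
    return np.diag(upper)
```

The full algorithm intersects convex sets rather than axis-aligned boxes; the cuboid version above only conveys why the computation reduces to per-channel ratios.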









Finlayson's [21] extension of Forsyth's algorithm relaxes the constraints on the

illuminant, the surface shape, and the specularities (the illuminant assumptions, and

parts of assumptions 1 and 3 above). The author shows that confounding factors in

real images (such as specularities and shape) affect the intensity but not the color

orientation. Therefore, the algorithm uses 2D perspective chromaticity coordinates

instead of 3D RGB coordinates, and the intensity is not recovered. The 2D space is

given by r = R/B and g = G/B. Theoretically, if B is small, values could become

noisy, but the author reports no difficulties with this in practice. He also implements

a canonical illuminant gamut constraint similar to Forsyth's canonical surface color

constraint. Again, tests show that the algorithm performs well. The results shown

in this paper indicate that the angular error on real, highly colored images under

standard daylight and tungsten illuminants is less than 10 degrees. This is only a

few degrees more inaccurate than the best possible with diagonal approximations to

illuminant transformations.
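The 2D perspective coordinates are straightforward to compute. The sketch below (Python with NumPy assumed, names ours) also illustrates why shading and specular scaling cancel: multiplying a pixel by any intensity factor leaves r and g unchanged.

```python
import numpy as np

def perspective_chromaticity(rgb, eps=1e-9):
    """Finlayson's 2D coordinates r = R/B, g = G/B. Intensity effects
    (shading, specular scaling) cancel in the ratios, leaving only the
    color orientation; intensity itself is not recovered."""
    rgb = np.asarray(rgb, dtype=float)
    b = rgb[..., 2] + eps  # eps guards the theoretical small-B case
    return np.stack([rgb[..., 0] / b, rgb[..., 1] / b], axis=-1)

# Tripling a pixel's intensity leaves its chromaticity unchanged.
pixel = np.array([0.2, 0.4, 0.8])
c_dim = perspective_chromaticity(pixel)
c_bright = perspective_chromaticity(3.0 * pixel)
```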

Yet another approach based on the gamut mapping method, Barnard et al.'s [4]

algorithm identifies illumination variations across an image, and removes them. It also

uses the illumination information gathered to constrain the color constant solution.

Instead of a restriction on the reflectances, it requires sufficient variation in either

the reflectances, the illuminants, or in a combination of both. They interpret the

color constancy problem in the same way as Forsyth, taking images of scenes under

unknown illumination and determining the camera response to the same scene under

a known, canonical light.


2.4.3 Other Methods

Some researchers avoided color constancy by introducing a shape-based method,

using local color information. In 1995, Ennesser and Medioni [20] used the Where's
Waldo™ images to test their local color information adaptation of the histogram









indexing algorithm. Novak and Shafer [47] explicitly assumed the availability of a

color calibrator in the field of view of the camera. If you have something to calibrate,

the problem unquestionably becomes simpler, but we cannot assume a color chart

will be available or usable.


2.5 Fixing Histogram Indexing


In 1995, Funt and Finlayson [25] attempted to eliminate the need for a color

constancy preprocessor by incorporating illumination independence into the algo-

rithm. Instead of indexing on the photoreceptor class triples, they indexed on an

illumination-invariant set of color descriptors: the derivative (Laplacian) of the loga-

rithm of the colors. This is equivalent to indexing on the ratio of neighboring colors,

similar to Land and McCann's work on color constancy [40]. This variation does not

work with saturated image pixels, because ratios based on those pixels are unlikely to

be constant. Their algorithm works almost as well as Swain and Ballard's under the

same lighting conditions. Their Laplacian of Gaussian operator identifies 19 objects

correctly out of 25 images of real objects under the original illuminant, but the toler-

ance is poor and two of the objects are "very poorly matched" (ranks of 18 and 27).

Swain and Ballard's algorithm identifies 23 of these 25 real objects correctly, and the

two identified incorrectly were the second most likely choices. Next, the algorithms

are compared using synthetic Mondrian input. Both algorithms should show perfect

accuracy on this data when the illuminant is unchanged. When under differing, spa-

tially constant illuminants, Funt and Finlayson's algorithm correctly identifies all of

the objects, while Swain and Ballard's produces zero intersections for 155 of the 180

objects and correctly identifies only 20. When tested with illuminants that varied

spatially in intensity and color, again color constant color indexing (Funt and

Finlayson's method) produced perfect results while the original color indexing algorithm

identified only 7 of 30 correctly and failed on 12 of the Mondrians. Results on real









data were similar, with color constant color indexing producing one error with a rank

of 2 out of 11 objects under different illuminations. Color indexing alone in the best

case identified only 14 of 22 correctly, with 5 images having a rank of 2 and 3 having

a rank greater than 3.

Plainly, the color constant color indexing algorithm is better than color indexing

alone, as long as you can guarantee lighting shifts. If the lighting is the same, the

algorithm does not perform as well as the original.
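The descriptor itself is easy to sketch. The toy version below (Python with NumPy assumed; a 4-neighbor Laplacian with wrap-around boundaries, our simplification) shows the key invariance: a spatially uniform illuminant scaling shifts the log image by a constant, which the Laplacian removes.

```python
import numpy as np

def log_ratio_features(channel, eps=1e-9):
    """Illumination-invariant descriptor in the spirit of Funt and
    Finlayson: the Laplacian of the log image, equivalent to indexing
    on ratios of neighboring colors."""
    L = np.log(np.asarray(channel, dtype=float) + eps)
    # Discrete 4-neighbor Laplacian (periodic boundaries for brevity).
    return (np.roll(L, 1, 0) + np.roll(L, -1, 0) +
            np.roll(L, 1, 1) + np.roll(L, -1, 1) - 4.0 * L)
```

Histograms would then be built over these features instead of raw color triples; as noted above, the invariance breaks down at saturated pixels, where the ratios are no longer stable.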

All of the methods described here are more computationally intensive than the

original algorithm, and require much greater storage. Chapter 3 addresses the color

memory of the human visual system in terms of bit-depth, paving the way for a

detailed exploration of quantization methods in Chapter 4.













CHAPTER 3
COLOR MEMORY


Psychophysical experiments are the primary method available for characterizing

the visual processes of the human brain. The only other major method is the analysis

of case studies of patients with usual and unusual problems, ranging from simple

color-blindness to true color constancy.

Both color constancy and color memory have been studied extensively. Unfortu-

nately, due to the time-consuming and tedious nature of the experiments, the number

of participants and the extent of the tests are often severely limited. In addition, the

variability of color perception between individuals is often high. Even for a single

individual, results from a given experiment will vary substantially from one time

to another. Background information on color constancy was presented in depth in

Chapter 2.

Various researchers have investigated color memory [60, 59, 61, 5, 3, 16, 14, 15,

41, 27, 46, 63]. Uchikawa et al. [60, 63] showed that for both single and multiple

colors, humans remember colors as members of the Berlin and Kay color categories

[7]. Furthermore, instead of becoming less accurate as more colors were added to the

task, participants would forget one or more of five colors completely while retaining

roughly the same ability with the remaining colors.

Other research [62, 61, 59, 14] indicated that our ability to discriminate between

colors deteriorates with memory. However, none of these experiments was performed

in a color space easily transformed into the color coordinates used in a computer, or in

a robotic system. The research indicated that discrimination thresholds were larger

for successive viewing than for simultaneous viewing (both colors viewed simultane-

ously on the fovea). In addition, colors were remembered with a shift towards higher









saturation [59, 15]. There was no discernible shift of hue in memory [46, 15]. We

performed an experiment to determine a rough estimate of the accuracy of human

color memory in terms of bit-depth.

In general, psychophysical experiments to determine characteristics of human color

vision are designed to obtain results that are as precise as possible, in whatever

coordinates are most convenient for the researcher's interpretation. These coordinates

include OSA uniform color scales [60, 63], wavelength [34], or the CIE spaces [43].

Many experiments [7] use a physical stimulus such as standardized colored chips,

which, except under certain fully controlled conditions, are impossible to perfectly

transform into a computer's representation of color. These experiments also generally

allow the subject to view the stimulus only for a short, fixed period of time, resulting in

a more controlled experiment but a less relevant assessment for real-world situations.

Because our algorithm is concerned with the smallest possible number of colors,

we want an optimistic estimate of the accuracy of human color memory. By finding

the largest realistic value for human color memory, we can determine a good target for

our robotic system. The purpose of this experiment was to determine this estimate of

the accuracy of human color memory under conditions close to those a person might

actually encounter in daily life.


3.1 Experimental methods


The experiment was performed on an Apple 400 MHz Rev. D iMac DVT, running

the Student Version of MATLAB 5.0. The overall luminance characteristic was used

to calibrate the monitor, and the luminance characteristic of each gun was used to

determine each gun's gamma. However, because the purpose of this experiment was

specifically to find an estimate of the accuracy of color memory in terms of computer

bit-depth, no compensation was incorporated for human perception of lightness or

saturation variation along the hue axis. Thus, if a shift in the hue value of a color









produced a shift in perceived lightness or saturation of that color, that shift was

considered part of the perturbation being tested. In order to maintain as much

realism in the structure of the experiment as possible, subjects were given as much

time as they wanted to determine their answers. In addition, they were also allowed to

use mental tricks to fix the color in their memories. The environment was austere (no

color charts) and they were not allowed to hold objects up to the computer monitor.

However, other purely mental games, such as comparing the color to a known color

such as the color associated with a sports team, were allowed. The experiments took

place in two locations, one in a room with windows and the blinds closed, and the

other in a room without windows. In both cases, the lights were always in the same

configuration, with all overhead fluorescent lights on. However, in the first room

there was a small amount of additional light from cracks in the blinds. The subjects

were allowed to adjust the computer monitor and their position for the best viewing.

The only requirement was that the participants be able to see the colors clearly. In

each case, subjects were allowed to look away from the monitor whenever they liked.

In all likelihood, under normal circumstances humans would not perform as well on

color memory tasks as this experiment implies. But for our purposes, we wish to find

the most generous estimate.

First the subjects were asked to provide some baseline data. Using an interface

programmed in MATLAB, they indicated their choices for the best example of each

of the Berlin and Kay [7] color categories. This interface is shown in Figure 3.1. In this

figure, the primary window (Figure No. 1) shows the color plane. The participants

were assigned one focal color at a time. In this image, pink, red, orange and yellow

have been assigned, and the participant is selecting a color to represent green. The

participant was asked to use the mouse to choose the color that best represented the

given focal color. The color chosen was displayed in the black box, labelled "green" in

this image. If the participant was unsatisfied with the color, they continued to select









colors until they were comfortable with the assignment. Once they were satisfied,

they clicked on the "Done" button. This reset the window to the next color and

assigned the chosen color to its space in the colormap on the right. This window,

entitled Figure No. 2 in Figure 3.1, showed the colors chosen so far. Different color

planes derived from the HLS and RGB spaces were available to the participant, but

the default was the Hue-Lightness plane. This plane, from Hue-Lightness-Saturation

space, was displayed with saturation fixed at the maximum. This space was chosen

because it uses a single axis to represent chromaticity, thus reducing the number of

trials necessary to characterize each participant's sensitivity.
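A plane of this kind can be generated with a standard HLS conversion. The sketch below (Python, using the standard library's colorsys module; the dimensions are ours) fixes saturation at its maximum so that hue remains the single chromatic axis.

```python
import colorsys
import numpy as np

def hue_lightness_plane(n_hue=256, n_light=7):
    """Hue-Lightness plane at maximum saturation: hue varies along one
    axis, lightness along the other, so a single chromatic dimension
    needs to be searched per participant."""
    plane = np.empty((n_light, n_hue, 3))
    for i, l in enumerate(np.linspace(0.1, 0.9, n_light)):
        for j, h in enumerate(np.linspace(0.0, 1.0, n_hue, endpoint=False)):
            # Saturation fixed at the maximum (1.0).
            plane[i, j] = colorsys.hls_to_rgb(h, l, 1.0)
    return plane
```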

Next, again using an interface programmed in MATLAB, they were asked to

provide the boundary information for the same plane, this time subdivided into seven

lightness levels. This is shown with the middle level in Figure 3.2. In this figure, the

participant has placed lines indicating where their boundaries between focal colors

occur, using the "Create Boundary Line" button and the mouse. The participant is

using the "Select Region Color" menu to assign a color to one of the outlined regions.

Once all regions were assigned, the participant clicked the "Show Selection" button

to display the results of their mapping, using the focal colors chosen in the previous

step. This allowed the participant to make sure they allocated the focal colors to the

regions as intended. When the participant was satisfied with the boundary locations

and color regions, they clicked the "Lightness Done" button. This reset the window

with the next lightness level. This procedure was repeated until seven lightness levels

were set.

Once these preliminary tasks were over, the main experiment began. Three sets

of ten pairs of colors were generated. One of each pair was displayed first, and the

second (a perturbed version of the first color) was displayed after a five second delay.

For each set of ten pairs, one of the initial colors was the focal color chosen by the

participant and the other nine were offset from this focal color in RGB space by shifts









File Edit Window Resolution Help















Figure 3.1: Interface window for program enabling user to choose focal colors. Pri-
mary window shown here allows user to choose colors; secondary window shows focal
colors after user has moved on to next color

randomly generated from a Gaussian distribution with a .05 standard deviation. The

original RGB values ranged from zero to one. Values greater than one or less than

zero after the shift were rounded to one or zero, respectively. The resulting randomly

shifted points were checked to ensure there were no duplications within a set.
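The generation procedure can be sketched as follows (Python with NumPy assumed; function and parameter names are ours). Shifts are drawn from a Gaussian with 0.05 standard deviation, out-of-range values are forced back to zero or one, and duplicates within a set are redrawn.

```python
import numpy as np

def perturb_colors(focal_rgb, n=9, std=0.05, seed=0):
    """Generate n distinct RGB offsets of a focal color: Gaussian shifts
    (std 0.05 per channel) clipped to [0, 1], redrawing duplicates."""
    rng = np.random.default_rng(seed)
    focal = np.asarray(focal_rgb, dtype=float)
    seen, out = set(), []
    while len(out) < n:
        cand = np.clip(focal + rng.normal(0.0, std, 3), 0.0, 1.0)
        key = tuple(np.round(cand, 6))
        if key not in seen:  # ensure no duplications within a set
            seen.add(key)
            out.append(cand)
    return np.array(out)
```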

For each set of ten pairs, one displayed color was not perturbed, and the rest were

perturbed by altering the value of the hue component of the color in question, flipping

the bit of the bit-depth being tested. This corresponded to a shift in one direction

or the other of some multiple of two. For example, if the program's bit-depth was

set to five (the default starting value) and the color's hue value was 10011000, then

the perturbed color would be 10001000, with the fifth least significant bit flipped,

corresponding to a shift of 16 along the hue axis.
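The perturbation is a single XOR. The short sketch below (Python, names ours) reproduces the worked example from the text.

```python
def flip_hue_bit(hue, bit_depth):
    """Perturb an 8-bit hue value by flipping its bit_depth-th least
    significant bit, i.e. XOR with 2**(bit_depth - 1)."""
    return hue ^ (1 << (bit_depth - 1))

# Example from the text: hue 10011000 at bit-depth 5 becomes 10001000,
# a shift of 16 along the hue axis.
perturbed = flip_hue_bit(0b10011000, 5)  # -> 0b10001000 (136)
```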

In theory we could have simply added or subtracted a given number from a certain

color, and then swept that number from the smallest to the largest. In order to

reduce the search space, several constraints were introduced. First, to simplify the

transition to a robotic system, only integer bit-depth shifts were introduced. If we

accept the blue-yellow, red-green perceptual dichotomy, then an integer greater than

two in base-two bit-depth will always be capable of being interpreted in terms of these


































Figure 3.2: Interface for user to choose boundaries between focal colors. For each of
seven levels of brightness, the user places lines where boundaries are perceived. Focal
colors are assigned to the regions between the lines by the user.

channels (e.g. flipping the 5th least significant bit is equivalent to 16 regions, which

is equivalent to segmenting the space into 4 regions for each quadrant (blue-red,

blue-green, yellow-red, yellow-green) of the perceptual color space). Second, we used

the HLS space, rather than RGB or one of the CIE spaces. Again, this reduces our

search space, and thus the time necessary for each participant to donate. Instead of

having three axes to search (RGB) or two (CIE spaces) we have only one, hue. Eight

bits were allocated for each axis in HLS space. Therefore, the maximum possible

bit-depth used would be eight, which would change the value to the opposite hue.

Flipping the fifth least significant bit, as above, would correspond to shifting the

hue value by 2^4, or 16 out of 256. Third, we reduced the experimental search space

by a factor of two by doing only one shift per color. We could have presented two

trials for each color, one with the perturbation added and one with the perturbation









subtracted. Instead, in the interests of being able to test a larger number of colors,

we allowed only one perturbation per color.

The subject viewed each color on a black background, for as long as they felt

necessary. This initial display is shown in Figure 3.3, part (a). The color is shown

in a square in the center of the screen. The remainder of the screen is black, except

for instructions in white text in the upper right-hand corner. When the participant

was satisfied with their knowledge or perception of the color, they hit a key, which

triggered the computer to display 5 seconds of black and white temporal and spatial

noise over a square slightly larger than the color sample. A sample screen shot of the

noise is shown in Figure 3.3, part (b). Then the second color in the pair (perturbed

or not perturbed) was displayed, as in Figure 3.3 part (c), and again the subject

had as much time as they felt necessary to make up their minds whether they were

seeing the same color or a different color, and to indicate their choice to the computer.

After ten of these trials, the same ten color pairs were displayed simultaneously, one

pair at a time, as nested squares (shown in Figure 3.3, part (d)). The subject was

asked whether they perceived one square of a single color or two nested squares of

different colors. Again, MATLAB was used to control the monitor and the subjects

were given as much time as they needed to come to a decision. This experiment was

created with the free Psychophysics Toolbox [11] for MATLAB to display the colors

and the interval noise, and to record the keyboard responses.

Consensus in the literature [60, 46, 5, 63] is that human memory for colors is

worse than human instantaneous perception of colors. In signal detection terms, we

are attempting to determine the sensitivity of human observers to visual noise (color

perturbations) when a given delay is present. Consistent and inconsistent responses

were generated, as well as the usual hit rate and false alarm rate. This portion of

the experiment also identified the people who were careless in their responses: if

a subject saw two colors every time, they were obviously not paying close enough































Figure 3.3: Screen shots of psychophysical testing interface. (a) original color; (b)
intersample noise; (c) test color; (d) simultaneous comparison.


attention, or were mistaking the inverse delay response (the afterimage of the previous set

for a difference in the current set) for an actual difference in hue, as a priori there was

always one pair of identical colors in each set of ten. The participants were unaware

of any information about how many pairs contained perturbed colors and how many

were not perturbed out of any given set.

Three sets of ten trials were performed for each focal color. If the subject per-

formed well enough on a given set of ten, the bit-depth was reduced by one (the

number of values shifted was reduced by a factor of two) for the next set. If the sub-

ject performed poorly enough, the bit-depth was increased. In order for the hue shift

to be reduced, the subject had to respond to at least 70% of the trials consistently

(simultaneous and successive responses were the same). To increase the hue shift, the

subject had to provide inconsistent responses (meaning the simultaneous and succes-

sive responses differed) on at least 70% of the trials. Breaks were scheduled every 20

minutes, and many people chose to continue another day rather than continue the









session after the first break. The preliminary data-gathering happened each time a

subject started a session. Thus, if someone chose to continue the experiment on a

new day, they would select new focal colors and new boundaries as before, and those

colors would be used in the new session. However, if the subject chose to continue

after the break, new focal colors and boundaries were not generated. As a result, the

colors used were representative of that subject's focal colors on that day, under those

lighting conditions. If the subject was willing to continue after the trials for the eight

chromatic focal colors, the same procedure was carried out for the boundary colors

chosen at the central lightness value.
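The adaptive rule can be summarized in a few lines (Python, names ours). The exact percentage threshold is partly garbled in the source; 70% is assumed here for illustration only.

```python
def next_bit_depth(bit_depth, responses, threshold=0.70):
    """Adaptive rule from the text: after a set of ten trials, reduce the
    bit-depth (halving the hue shift) if enough responses were consistent
    (successive answer matched the simultaneous answer), increase it if
    enough were inconsistent, and otherwise keep it unchanged.
    `responses` is a list of (successive, simultaneous) answer pairs.
    The 70% threshold is an assumption; the source value is garbled."""
    consistent = sum(1 for a, b in responses if a == b) / len(responses)
    if consistent >= threshold:
        return max(1, bit_depth - 1)   # smaller hue shift next set
    if (1.0 - consistent) >= threshold:
        return min(8, bit_depth + 1)   # larger hue shift next set
    return bit_depth
```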


3.2 Data analysis


Twenty-two people participated in this experiment. This resulted in a total of

5885 signal trials and 635 noise trials. A signal trial is defined as a single pair of

colors whose hues differed and which had valid responses from the participant. A

noise trial is defined as a single pair of colors whose hues did not differ and which also

had valid responses from the participant. If the participant gave a response that was

not one of the acceptable responses ("same" or "different" for the delayed response,

"1" or "2" for the simultaneous response) the trial was discarded.

Figure 3.4 shows the focal colors chosen by the participants in terms of hue (black

x's) and the actual initial hues used for the fifth LSB testing (red). The focal colors

are concentrated in the left side of the plot, corresponding to red, oranges, yellows

and browns. The remaining colors (green, blue and purple) are slightly clustered

in groups in the remainder of the hue space. The testing data covers the hue axis

well, with more representation in the regions with clustered focal colors. The y-axis

shows the number of responses obtained for each focal color/test color. Each hue was

rounded to a number between 0 and 255, resulting in multiple trials for most hues.
















Figure 3.4: Focal colors and initially displayed colors at LSB 5.



Table 3.1 shows the number of hits and the number of signal trials. The hit rate

is calculated by taking the percentage of signal trials that were hits. Z(hit rate) is used

to determine the value of the sensitivity metric. Table 3.2 shows the same data for

the false alarms.

These tables indicate clearly that many people were able to reliably identify shifts

in color when the fifth LSB was flipped, but that few were good enough to progress

to the stage where the third LSB was flipped. Very few did so badly that they got to

the sixth LSB, but on determining this, additional trials were run at both the sixth

and seventh to compensate. Only one person made any errors at the seventh LSB.

Simply looking at these results would cause one to hypothesize that a good sensitivity

threshold would lie somewhere between the fourth and sixth LSB.












Table 3.1: Number of trials given bit-depth


Bit-depth Signal Trials Hits Hit Rate Z(Hit Rate)
3 428 248 0.579439252 0.200459453
4 1477 814 0.551117129 0.128484317
5 3340 1812 0.54251497 0.106771267
6 543 449 0.826887661 0.941936378
7 97 95 0.979381443 2.041142579


Table 3.2: Number of noise trials given bit-depth


Bit-depth Noise Trials False Alarms F.A. Rate Z(F.A. Rate)
3 42 19 0.452380952 -0.119648575
4 153 44 0.287581699 -0.560462468
5 330 92 0.278787879 -0.586446731
6 97 10 0.103092784 -1.264124876
7 13 0 0 −∞


Sensitivity analyses were performed on the data to determine roughly how sensitive

humans are to bit-depth variation. The two sensitivity metrics used produced similar

results.


3.2.1 Sensitivity

Two metrics derived from signal detection theory produce very similar results. A

parametric sensitivity measure, known as the d' metric [18], and a non-parametric

sensitivity measure, known as the p(A) metric, are presented.

The receiver operator characteristics (ROC) curve is helpful in visualizing the d'

metric. The ROC curve is generated by plotting the hit rate against the false alarm

rate. The hit rate is the probability that an observer will generate a true positive,

given that the two colors are different. The false alarm rate is the probability that

an observer will generate a false positive, given that the two colors are the same. An

observer who is simply guessing will produce a hit rate equal to the false alarm rate.


















Figure 3.5: Sample receiver operator characteristics curve.


An observer who can discriminate perfectly will have no false alarms and a hit rate

of one.

The d' metric takes a [hit rate, false alarm rate] point, extrapolates a Gaussian

function, and looks at the difference between the means assuming both Gaussians have

the same variance. A constant d' produces a curve like the dashed line in Figure 3.5.

The p(A) metric looks at the same point and interpolates linearly between the end

points of the ROC curve and finds the area under the interpolated curve.

The d' metric measures sensitivity as the separation between the means of the

distribution of the signal trials (there is a difference between the two colors) and the

noise trials (there is no difference between the two colors). First, the distance of the

observer's criterion from the means of the two distributions is calculated. This is done

by converting the hit and false alarm rates to z scores, as shown in Tables 3.1 and 3.2.

The hit rate is the probability that the observer said the stimuli were different with









Table 3.3: Results of the d' and p(A) analyses.


Bit-depth d' p(A)
3 0.3192 0.5634
4 0.6889 0.6317
5 0.6931 0.6319
6 2.2061 0.8619
7 00 0.9897


the delay, given that the observer viewed stimuli that were different. The false alarm

rate is the probability that the observer said the stimuli were different with the delay,

given that the stimuli were the same. The data are mapped to a Gaussian probability

distribution, and the z score gives the distances desired. The d' measure is computed

by taking the difference between the z score for the hits and the z score for the false

alarms, as shown below.


d' = z(hit rate) − z(false alarm rate) (3.1)


This is equivalent to calculating


d' = μS − μN (3.2)


where μS is the mean of the signal trials and μN is the mean of the noise trials.

The lower limit for d' is zero, which indicates no ability to discriminate between

signal trials (different colors with delay) and noise trials (same colors with delay).

This is indicated as the dotted line in Figure 3.5. The theoretical maximum is ∞,

corresponding to perfect ability to discriminate and thus no false alarms, represented

by the point in the upper left corner of the plot in Figure 3.5. The usual threshold for

ability to discriminate is d' of one, plotted as a red dashed line. Table 3.3 shows our

experimental results for humans' overall ability to determine the presence of color









perturbations as a function of bit-depth. Clearly, the threshold of one is passed

between the fifth and sixth least significant bits. Using linear interpolation, we find

that the threshold is crossed at just over fourteen bins.
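The z-score computation in Equation 3.1 can be checked directly against the table values; the following is a minimal sketch using the standard normal inverse CDF, with the bit-depth-5 rates taken from Tables 3.1 and 3.2:

```python
from statistics import NormalDist

def d_prime(hit_rate, fa_rate):
    # d' = z(hit rate) - z(false alarm rate), per Equation 3.1
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(fa_rate)

# bit-depth 5 rates from Tables 3.1 and 3.2
print(round(d_prime(1812 / 3340, 92 / 330), 4))  # 0.6932, matching Table 3.3 to three decimals
```

An observer at chance (hit rate equal to false alarm rate) yields d' = 0, as described above.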

The p(A) metric is computed by finding the area under the curve denoted by

a linear interpolation from [0,0] to [1,1] through a given [hit rate, false alarm rate]

point. This corresponds to


p(A) = hf/2 + h(1 − f) + (1 − f)(1 − h)/2 (3.3)

where h is the hit rate and f is the false alarm rate.


When the d' metric is 0, p(A) is 0.5. When the d' metric is ∞, p(A) is 1.0. The typical

threshold used with the p(A) metric is 0.75. The advantage of this metric, as opposed

to the d' metric, is that the p(A) metric is bounded. If we plot the results of this

calculation for our data as a function of bit-depth, as shown in Table 3.3, it is clear

that the same pattern as that shown in the d' results is present. Again, the results are

almost identical for the 4th and 5th least significant bits, and increase dramatically

for the 6th LSB. Clearly the threshold lies between the 5th and 6th least significant

bits, corresponding to hue shifts of between 8 and 16 out of 256. This corresponds to

a bit-depth of between 3 and 4. A bit-depth of 3 corresponds to 8 categories, which

is the result if the threshold is set to the 6th LSB. A bit-depth of 4 corresponds to

16 categories, which is the result if the threshold is set to the 5th LSB. Using linear

interpolation, we find that the p(A) threshold is crossed at 12 bins.
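The two-segment ROC area is simple to compute; a minimal sketch, again using the bit-depth-5 rates from Tables 3.1 and 3.2:

```python
def p_a(h, f):
    # area under the ROC formed by line segments from (0,0) to (f,h) to (1,1),
    # where h is the hit rate and f is the false alarm rate
    return h * f / 2 + h * (1 - f) + (1 - f) * (1 - h) / 2

# bit-depth 5 rates from Tables 3.1 and 3.2
print(round(p_a(1812 / 3340, 92 / 330), 4))  # 0.6319, as in Table 3.3
```

At chance (h = f) the area reduces to 0.5, and with a perfect observer (h = 1, f = 0) it is 1.0, matching the bounds stated above.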

Using binomial analyses1, we find that the proportion of correct responses for the

4th, 5th and 6th least significant bits are .566 ± .031, .558 ± .017 and .838 ± .031

respectively at a 95% confidence interval. The 4th and 6th least significant bits are

significantly different (p < .05), as are the 5th and 6th least significant bits. However,

the 4th and 5th least significant bits do not differ significantly. This supports the d'

and p(A) results.


1Analyses performed by Dr. Keith D. White
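The quoted proportions can be reproduced by pooling hits and correct rejections over all trials; this pooling is an assumption about how the proportions were computed (the footnoted analyses may have used a different method), and the normal-approximation interval shown is likewise only a sketch:

```python
import math

def proportion_correct(signal, hits, noise, false_alarms):
    # pool hits with correct rejections (an assumed reading of
    # "proportion of correct responses")
    correct = hits + (noise - false_alarms)
    total = signal + noise
    p = correct / total
    half_width = 1.96 * math.sqrt(p * (1 - p) / total)  # 95% Wald interval
    return p, half_width

# 4th-LSB row of Tables 3.1 and 3.2
p, hw = proportion_correct(1477, 814, 153, 44)
print(round(p, 3))  # 0.566, as quoted above
```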










Table 3.4: Gamma Calibration Data


Level Overall Red Green Blue
1.0 7.5 1.950 5.60 0.740
0.75 3.8 1.000 3.00 0.420
0.5 1.5 0.375 1.20 0.195
0.25 0.4 0.093 0.24 0.069


3.2.2 Calibration

The monitor calibration consisted of determining the luminance of the screen for

different RGB combinations. Table 3.4 shows the results at four different luminance

levels. The gammas computed from these levels are 2.1144 for the overall luminance,

2.1958 for the red channel, 2.2722 for the green channel, and 1.7114 for the blue

channel.
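These gammas can be recovered from Table 3.4 under the model L = Lmax · level^γ; the sketch below solves for γ from the level-0.25 readings (which levels were actually used in the fit is an assumption):

```python
import math

def gamma_from_pair(l_max, l_measured, level):
    # solve L = L_max * level**gamma for gamma
    return math.log(l_measured / l_max) / math.log(level)

# overall-luminance column of Table 3.4: 7.5 at full drive, 0.4 at level 0.25
print(round(gamma_from_pair(7.5, 0.4, 0.25), 4))  # 2.1144, as computed in the text
```

The same pairing reproduces the green and blue channel gammas to four decimals and the red gamma to within 0.001.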

The actual change in hue viewed by the participants corresponding to each possible

hue value was determined using the following procedure. Saturation is first defined as

255 and lightness as 192. For each possible hue, the HLS coordinate and the perturbed

coordinate (for each bit-depth from 3 to 7) are first converted to computer RGB

coordinates. These values are converted to relative luminances using the gammas

computed above. The relative luminances are scaled by the appropriate maximum

value for each channel corresponding to the results for level 1 (100%) in Table 3.4.

These normalized RGB luminance values are converted back to HLS space. The

output of the procedure is the absolute difference between the two resulting hue

values.

Figure 3.6 shows the results of this procedure for each bit-depth/number of bins.

The desired values, the differences expected when the axis is uniformly divided into

the appropriate numbers of bins, are shown as dotted lines. The solid lines show the

change in hue as produced by the monitor. The average values of the solid lines are


















Figure 3.6: Monitor Calibration Results


extremely close to the desired values, differing by less than 1%. The numbers in the

legend refer to the number of bins corresponding to the given bit-depth.

Figure 3.7 shows the results when the actual shifts in hue after monitor calibration

are used to determine a p(A) value for each range of shifts. First, histograms of the

amount of shift for each trial were generated from the calibration results. Each bin

corresponded to a 0.01 shift in hue using the y-axis in Figure 3.6. Any histogram

bins containing no signal trials or no noise trials were eliminated. The remaining bins

were used to calculate the hit rate and false alarm rate for each bin. The resulting

values from each bin were used to determine the p(A) result shown in Figure 3.7.

The p(A) results from Section 3.2.1 are plotted in green, and clearly correspond well

to the sweep results, even crossing the threshold at the same number of bins. The

d' threshold was 14 bins, corresponding well to the p(A) threshold of 12 bins. This

















Figure 3.7: Results of p(A) analysis with actual change derived from monitor calibra-
tion.


supports our conclusion in Section 3.2.1, that the threshold for hue discrimination

with a five second delay lies between the fifth and sixth least significant bit.



3.3 Summary


The implication is that human color memory has a bit-depth of between 3 and 4.

Use of two metrics shows a significant difference in sensitivity between the 5th and

6th least significant bits. In general, we can only reliably remember colors as members

of between 8 and 16 categories. This corresponds nicely to simple gradations of the

four perceptual chromatic channels, where a color is remembered as being on one

side or another of a given channel. For example, it would be possible to remember

a color as being on the blue side of green, rather than on the yellow side of green,









but distinguishing between a remembered greenish blue-green and a remembered bluish

blue-green would be difficult.

Subjects reported that by the end of the 20 minute session, they were beginning

to lose track of what color they were looking at, and at the end of five seconds would

have forgotten whether they had just seen a green or a yellow patch. In addition,

virtually all participants reported using mental tricks to remember a specific color.

For example, one person said that he remembered the oranges by associating them

with sports teams ("This is Tennessee orange"). In the real world, people rarely need

to remember the precise color of an object. They simply assign it to a category and

use other cues and a priori information to recognize the object ("I have only one

orange mug; I am in my house; I am looking at an orange mug; therefore this must

be my orange mug.") If the precise color of an object is important, humans do not

trust to color memory alone. Therefore, given the intense and specific nature of the

experimental task, it is unlikely that humans are even this precise in remembering

the colors of real objects under real world conditions. So one question remains: Is

this color segmentation sufficient for color object recognition? If so, separation of the

hue axis into eight chromatic color categories should be sufficient.

Color indexing is a very powerful and robust object recognition algorithm, suf-

fering from one problem: it is not robust to illumination changes. Various methods

attempt to compensate for this without losing the algorithm's inherent robustness

to occlusion, orientation, and scaling, without marked success. The color constancy

algorithms with the best results were detailed in Chapter 2, although they are cur-

rently insufficient for adequate object recognition. Color constant color indexing [25]

is promising, but requires elimination of saturated pixels for consistently good results.

Any color constancy preprocessor improves in accuracy compared to the original al-

gorithm alone. Unfortunately, this does not improve recognition accuracy sufficiently

for a truly robust object recognition system.









The algorithm does not currently incorporate any version of color memory. Having

shown that color memory has a dramatic effect on human resolution of hue and hue

perception over time, we would like to incorporate it into the color indexing algorithm.

If the human system can be taken as a high-level model of the processing occurring

in the color object recognition task, then the effect of memory should certainly be

taken into account. How can that be done?

Quantization is the engineering equivalent of the color categorization that takes

place in memory. In some ways, our computers already have lossy memory for color

images, as many of our video compression algorithms involve quantization to many

fewer colors than our displays are capable of reproducing. In order to incorporate

the effects of color memory into the algorithm, we need to incorporate quantization.

Clearly, if we reduce the number of histogram bins, we will make the algorithm

substantially more efficient.

Unfortunately, nothing is known about how the human visual system actually

determines given categories, or how a given perceived color is assigned to a given

category. Therefore, we will have to derive insight on quantization methods from

engineering.













CHAPTER 4
QUANTIZATION


Most quantization algorithms to date have been concerned primarily with one of

two cases. First, in the case of monitor displays with limited resources, implementa-

tions are concerned with a 256-color limitation of display devices, rather than with an

8 or 16 color limitation. The goal of these algorithms is to display an image as close as

possible to the original image. Ideally, a human viewer should find the output image

indistinguishable from the original. In the second case, quantization is used to store

or transmit the image as efficiently as possible. Again, the desired output is assumed

to be as close as possible to an observer's perception of the original. Time-dependent

effects of the human visual system, such as chromatic adaptation and degradation of

the image due to memory are explicitly ignored because the purpose of this quanti-

zation is to allow the human observer to view the image independent of these effects.

Because of this goal, a second assumption inherent in all these quantization methods

is that comparison to the original image is the best measure of the method's im-

pact. This means that generally, quantization itself is seen as introducing noise and

degrading the image.

In our case, the best measure of the effect of the algorithm is comparison to other

images of the same scene. The original image contains noise in the pixel values.

A second image of the same view will generally contain different (although similar)

values. If the lighting is unchanged, a quantization algorithm that is effective for these

purposes should map the images to the same result, rather than to different results.

Comparison to the original image will give a sense of the degradation to the image

caused by the algorithm. Comparison to different images, rather than comparison of

storage or display constraints, should provide the metric for the effectiveness of the















Figure 4.1: Original image used to demonstrate results of different quantization
schemes.

quantization algorithm. Instead, we will measure an algorithm by its effectiveness

when used in the context of the object recognition algorithm. A good algorithm

will produce good object recognition results, and a poor one will not identify objects

correctly. As with the color constancy algorithms, simply deeming a result

"good" is insufficient. Object recognition accuracy provides an excellent metric for

determining the ability of an algorithm to compensate for lighting variation, but

viewing the quantized images can be helpful in understanding the mistakes made by

the algorithm. Figure 4.1 shows the original image used to display the results of the

different quantization algorithms.


4.1 Overview of Algorithms


4.1.1 Uniform Quantization

This is the simplest of the quantization algorithms. The color space is divided up

into n blocks of equal volume. The centroid of each block is the color used in the color

palette. The color palette is therefore fixed, data-independent, and takes no notice of











Truck Image: Uniform Quantization in RGB Space to 8 Levels
Figure 4.2: Results of uniform quantization


the combinations that humans may find more pleasing. Generally this algorithm is

considered to produce poor results from a human standpoint. Figure 4.2 shows the

results of uniform quantization to 8 colors.
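A minimal sketch of the scheme for RGB pixels with 8-bit channels (two levels per channel yields the 8-color case of Figure 4.2):

```python
def uniform_quantize(pixel, levels_per_channel=2):
    # map each channel to the centroid of its equal-width block
    step = 256 / levels_per_channel
    return tuple(int((c // step) * step + step / 2) for c in pixel)

print(uniform_quantize((200, 30, 130)))  # (192, 64, 192)
```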


4.1.2 Dithering

This process is used to improve the results of quantization for human viewing.

A small amount of noise is added to each pixel of the image before quantization

occurs. This improves the quantized image (for a human) by eliminating banding:

the effect that occurs when a smooth transition between two colors is replaced by

distinct bands of colors from the palette. However, for object recognition purposes,

and for low numbers of colors, banding may be useful. At very low numbers of colors,

dithering will have very little effect, as the effect of low numbers of colors is increased

resistance to noise. The results of dithering are shown in Figure 4.3.
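Dithering can be sketched as a small random perturbation applied before uniform quantization; the ±16 noise amplitude below is an arbitrary illustrative choice, not a value from the text:

```python
import random

def dither_then_quantize(pixel, noise_amplitude=16, levels_per_channel=2):
    # perturb each channel, clamp to [0, 255], then uniformly quantize
    step = 256 / levels_per_channel
    out = []
    for c in pixel:
        noisy = min(255, max(0, c + random.uniform(-noise_amplitude, noise_amplitude)))
        out.append(int((noisy // step) * step + step / 2))
    return tuple(out)

print(dither_then_quantize((200, 30, 130)))  # each channel lands on 64 or 192
```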











Truck Image: Uniform Quantization in RGB Space to 8 Levels With Dithering


Figure 4.3: Results of dithering on uniform quantization.


4.1.3 Modified Uniform Quantization

In this algorithm, instead of dividing the space evenly into n blocks of equal

volume in three dimensions, the n blocks are either cubic or rectangular, with an

integer number of them to a side. The process can be seen simply in two dimensions.

If n = 5, we start with a 2x2 array. Then one block is added in one row, to give five

blocks from one row of two and one row of three. If n = 6 we have a 2x3 array. If

n = 7, one block is added to one column, yielding two columns with two blocks, and

one with three. For n = 8, there would be one column with two blocks and two with

three, and for n = 9 we would have a 3x3 array. In this way, the largest difference

between the maximum and minimum number of blocks in any single dimension is

one, keeping the blocks as close in size as possible. This is a simple method for

quantization to few colors. The axis along which the variable number of blocks

occurs can be chosen











Truck Image: Modified Uniform Quantization in RGB Space to 9 Levels


Figure 4.4: Results of modified uniform quantization


in advance to fit the data best, making this a data-dependent method, or it can be

determined in advance and fixed, making this a data-independent method. In RGB

space, modified uniform quantization reduces to uniform quantization for n³ colors,

when n is an integer. Quantization to 9 colors is shown in Figure 4.4.
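The block counts can be generated programmatically; the sketch below reproduces the 2-D examples above by splitting n into round(√n) rows whose lengths differ by at most one:

```python
import math

def row_sizes(n):
    # distribute n blocks across round(sqrt(n)) rows as evenly as possible
    rows = round(math.sqrt(n))
    base, extra = divmod(n, rows)
    return [base + 1 if i < extra else base for i in range(rows)]

print(row_sizes(5))  # [3, 2]
print(row_sizes(7))  # [3, 2, 2]
```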


4.1.4 Tree Quantization

In this quantization scheme, the RGB values are first converted to the Hue-

Lightness-Saturation space, covered in detail in Appendix A. The advantage of this

space is that most meaningful color demarcations (chromatic/achromatic; light/dark;

dark color/black; light color/white) can be made with simple threshold operations.

Separating the color space into chromatic and achromatic regions can be achieved

by thresholding the saturation and lightness axes. Saturation of one represents max-

imal saturation for the entire length of the lightness axis. One threshold on the









saturation axis and two on the lightness axis determine a ring of chromatic colors.

Values above the upper lightness threshold are assigned to the lightest achromatic

value. Colors whose lightness is below the lower lightness threshold are assigned to

the darkest achromatic value, regardless of saturation. All colors whose saturation

is below the saturation threshold are assigned to achromatic values based on their

lightness component.

The first division occurs between achromatic and chromatic. Then the achromatic

region is uniformly quantized along the lightness axis to A regions. The chromatic

region is uniformly quantized in the hue dimension to C regions. All chromatic

values are arbitrarily assigned to a lightness of 0.55 and saturation of 0.6, while all

achromatic values are assigned zero saturation and hue. Both A and C are set by the

user, and A + C is the total number of colors in the final color map. For example, if

A = 0 and C = 2, the final map would contain two colors, diametrically opposed in

hue. If A = 1 and C = 1, the final map would contain medium gray and saturated

green.

This method is shown in Figure 4.5, for 11 chromatic regions and three achromatic

regions. The image is first converted to HLS space. A saturation threshold of 0.15

determines whether each pixel is chromatic or achromatic. If the pixel is achromatic

(saturation below 0.15), it is put into one of three achromatic bins. Below a lightness

of 0.2, the pixel is put in the black bin. Above a lightness of 0.9, the pixel is put in the

white bin. Between the values, the pixel is assumed gray. If the pixel is chromatic,

it is placed in one of n chromatic bins. The boundaries for the chromatic bins are

determined by uniformly quantizing the hue axis into n regions. In this figure, there

are 11 chromatic regions, and the hue axis of the HLS space is offset by 60 degrees,

placing pink at both 0 and 1.
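The classification itself reduces to a few threshold tests; a sketch using the thresholds above (the 60-degree hue offset used in the figure is omitted here for simplicity):

```python
def tree_quantize(hue, lightness, saturation, n_chromatic=11):
    # all components assumed scaled to [0, 1]
    if lightness < 0.2:       # below the lower lightness threshold
        return "black"
    if lightness > 0.9:       # above the upper lightness threshold
        return "white"
    if saturation < 0.15:     # low saturation: achromatic
        return "gray"
    # chromatic: uniform bins along the hue axis
    return "hue bin %d" % min(int(hue * n_chromatic), n_chromatic - 1)

print(tree_quantize(0.05, 0.5, 0.8))   # hue bin 0
print(tree_quantize(0.5, 0.95, 0.05))  # white
```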











[Diagram: the input color is split at saturation 0.15 into achromatic and chromatic
branches. The achromatic branch is divided by lightness into black (below 0.2), gray,
and white (above 0.9). The chromatic branch divides the hue axis uniformly into
eleven bins at multiples of 0.091, beginning with red and orange and ending with sky
blue, royal blue, purple, and magenta.]


Figure 4.5: Diagram for tree quantization.


4.1.5 Median-Cut Quantization


This is a data-dependent method based on the medians of the three channel

values in the input image. Assume n is the number of colors desired in the final

colormap. The colorspace is first split into two boxes at the median value along its

largest axis. Each resulting box is in turn split at the median along its own largest

axis, and so on until n boxes have been generated. The average of the colors in

each of these n boxes becomes a value in the new colormap. Figure 4.6 shows the

results of this algorithm, again for eight

colors.
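A simplified sketch of the splitting procedure (assuming more input pixels than requested colors):

```python
def channel_range(box, c):
    values = [p[c] for p in box]
    return max(values) - min(values)

def median_cut(pixels, n):
    boxes = [sorted(pixels)]
    while len(boxes) < n:
        # split the box whose widest channel range is largest
        box = max(boxes, key=lambda b: max(channel_range(b, c) for c in range(3)))
        boxes.remove(box)
        widest = max(range(3), key=lambda c: channel_range(box, c))
        box.sort(key=lambda p: p[widest])
        mid = len(box) // 2
        boxes.extend([box[:mid], box[mid:]])
    # the palette is the average color of each box
    return [tuple(sum(p[c] for p in b) / len(b) for c in range(3)) for b in boxes]

pixels = [(0, 0, 0)] * 4 + [(255, 255, 255)] * 4
print(median_cut(pixels, 2))  # one black and one white palette entry
```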



4.1.6 Vector Quantization


Vector quantization is a general term for data-dependent quantization methods.

The color palette is generated via an iterative procedure, similarly to median-cut












Truck Image: Median Cut Quantization in RGB Space to 8 Levels


Figure 4.6: Results of median-cut quantization


quantization above. These algorithms are even more time-consuming to run, but

produce subjectively good results. One possible implementation follows these steps.

First the most common color is assigned to the palette, and the error is evaluated

(using mean square error or another method of choice). Then the next most common

color is assigned to the palette, and the remaining colors are assigned to one of the

two values so as to minimize the error. This process is repeated until n colors are

chosen.



4.2 Number of Colors


Generally, the optimal case for human viewing is where quantization is unneces-

sary, so computer vision research tends to assume a fixed maximum number of colors

(usually 256), and generates algorithms that produce the most pleasing results. We

have not found any methods in the literature for choosing this maximum by any









methods other than an analysis of the transmission times or storage constraints for

the image data, or by using the maximum number of colors a display can show. These

are important, but not particularly helpful for this research. Our assumption is that

the necessary storage can be added to the robot or computer, as we will be reducing

the data to substantially fewer than 256 colors. One constraint of interest to us may

be the time necessary to perform the quantization. Another possible metric would

consider more closely the task we are asking quantization to perform.


4.2.1 Optimizing Accuracy

The previous techniques are all designed to minimize artifacts visible to the human

viewer. However, for our purposes, the human viewer is irrelevant. Instead, we need

a metric that will enable the system to correctly classify colors even when the lighting

changes. Because the algorithm is primarily concerned with chromatic responses, and

for speed and simplicity, only the hue axis is considered.

The accuracy sweep method is a simple, greedy approximation to the brute force

method of trying all possible combinations to obtain the bin locations that produce

the highest possible accuracy on the training data.

The first two bin locations are found by sweeping all possible combinations of bin

boundaries along the hue axis. There are 256 possible locations for the first bin and

255 for the second because boundaries at the same location would effectively produce

only one bin. Once the first two bins have been determined, they are fixed and the

third bin boundary is found by calculating the accuracy on the training data when

the third bin boundary is swept through all 254 possible locations. Then the third

bin boundary is fixed and the fourth is swept, and so on up to the nth bin boundary,

producing the desired n 1 bins.
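The greedy sweep above can be sketched as follows; `evaluate` stands in for computing recognition accuracy on the training data (the toy score at the bottom is purely illustrative), and the 256 candidate positions follow the text:

```python
def accuracy_sweep(evaluate, n_boundaries, n_positions=256):
    # sweep all distinct pairs for the first two boundaries
    boundaries = list(max(
        ((a, b) for a in range(n_positions) for b in range(a + 1, n_positions)),
        key=lambda pair: evaluate(list(pair)),
    ))
    # fix the chosen boundaries and sweep one new boundary at a time
    while len(boundaries) < n_boundaries:
        new = max(
            (p for p in range(n_positions) if p not in boundaries),
            key=lambda p: evaluate(sorted(boundaries + [p])),
        )
        boundaries = sorted(boundaries + [new])
    return boundaries

# toy score: boundaries should land on three target hues
targets = [50, 100, 200]
score = lambda bs: -sum(min(abs(b - t) for b in bs) for t in targets)
print(accuracy_sweep(score, 3))  # [50, 100, 200]
```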

Again, here we assume we know how many bins are desired. We want the quanti-

zation algorithm to free us from needing a color constancy preprocessor. How many









colors are necessary if we wish each bin to contain most of the hues that result from

images of a single color under a variety of lighting changes? To determine this, we

gathered data of different colors at different times of day and analyzed the results.


4.2.2 Lighting Shifts

Images of a color calibrator and a set of cans were taken under a variety of con-

ditions. Data for 2 different lighting conditions was taken at every hour, starting at

9 am and finishing after dark, with an additional image at 6:20 pm (at dusk). The

lighting conditions consisted of the four permutations of the status of the camera's

white balance (on or off) and the status of the blinds (open or closed). The resulting

images were segmented into color patches representative of different colors. A Gretag

Macbeth ColorChecker was used as the calibrator, and a set of cans representative

of common colors used in soda can label design were also present in the images. The

layout was unchanged over the course of the day. The color calibrator consisted of 24

different colors, including 6 achromatic colors. The remainder of the image produced

another 24 sample colors, including both common representations of black and white

and dark background regions, such as the shadow under the table where the cans and

the calibrator were set up. Figure 4.7 shows the range of colors that a single white

region of a single can took over the course of the day.

This data was analyzed to determine the scope of possible lighting shifts, and po-

tentially to determine a likely function for mapping colors taken under one illuminant

to those taken under another.

The range of values for a single patch of color, under a single lighting condition,

was much higher than anticipated. Generally, along the hue axis, a single patch

would have a standard deviation below .05 where the original data are scaled to have

potential values from 0 to 1. To encompass one standard deviation to each side of

the mean would require a bin size of roughly 28 out of 256 possible bins. Thus, only














Figure 4.7: Sample color changes under a single day's sunlight.


8 to 10 bins along the hue axis would be possible. However, this small number is very

close to what we know of the accuracy of human memory. Instantaneously, we average

over a given spatial region to get an estimate of the local color. Over time, we do not

store the precise color, but only a rough representation of it. Strangely enough, the

standard deviation along the hue axis when all the pixels of a given color over the

course of a day were included produced a similar maximum. From the variability of

the data, it appears that 8 to 10 bins is a very good approximation to the accuracy

of the input data.

A graph of these results for the set of colors found in soda cans is shown in

Figure 4.8. This graph shows the mean and standard deviation for chromatic (blue)

and achromatic (black) colors. This plot is for the soda can colors, rather than the

calibrator colors. The achromatic colors included black, the dark background color,

the desk the cans were placed on, the middle brightness background color, silver,













Figure 4.8: Average and standard deviation for hue under varying lighting. The x
location of each bar indicates the average of the average values for the day.


white, and the lightest background color. The desk was closest to the gold or orange

hue. Black, silver and white were all colors taken from regions of cans. Clearly the

hue axis is a good choice for quantizing chromatic regions, as the standard deviation

for chromatic colors is very low and the separation between colors is reasonably good.

Figure 4.9 shows the difference between the averages for the cans (lower row) and

the calibrator (upper row) hue results. These data points clearly seem to cluster

along the hue axis. The region between 0.3 and 0.4 is almost empty, while the region

including values just below 1.0 and values from 0.0 to 0.2 is very crowded. From

0.5 to 0.7, there are three distributed can classes, but the calibrator classes are more

tightly clustered without such clear divisions between colors.

If we assume that the mean and standard deviation calculated over the course of

the day are representative of Gaussian distributions, we can derive a Bayes classifier













Figure 4.9: Mean values for cans and calibrator hues.


to determine the best color to choose for a given HSV triple. Figure 4.10 shows the

probability density functions for the chromatic can colors along the hue axis. It is

assumed that each color is equally likely. The color of each line is determined by the

average RGB coordinates for that color for the day. Clearly, the hue axis is separable
into distinct hue categories. Multiple peaks are present in these regions because

the 17 chromatic can colors include more than one representative of important col-

ors. For example, a light yellow and a darker yellow were both taken from images of

the Country Time Lemonade™ can. Similarly, there were two purple measurements
(Welch's Grape Drink™), two cyan measurements (Fresca™) and two pink
measurements (Diet Cranberry Canada Dry Ginger Ale™ and Tab™). Gold (Caffeine-Free
Coca-Cola™) seems to have a separate distribution from yellow, while many colors

overlap in the red region of the hue axis. Natural separations fall between yellow and










Figure 4.10: Probability density functions for chromatic can colors along the hue
axis.


green, green and blue, blue and purple, purple and reds, reds and yellow. The reds

class contains not only red and dark red, but also both pinks and both oranges.
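Under the equal-prior assumption above, the Bayes decision reduces to picking the class with the largest Gaussian likelihood at the observed hue. A minimal sketch on a 0-to-1 hue axis; the per-class means and standard deviations below are illustrative placeholders, not the measured can statistics:

```python
import math

def gaussian_pdf(x, mu, sigma):
    """1-D Gaussian probability density."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Hypothetical (mean, std) hue statistics per class; the real numbers would
# come from the day-long lighting measurements. Hue wraparound (red appearing
# near both 0 and 1) is ignored in this sketch.
HUE_CLASSES = {
    "red":    (0.02, 0.02),
    "yellow": (0.12, 0.02),
    "green":  (0.30, 0.03),
    "cyan":   (0.48, 0.03),
    "blue":   (0.60, 0.03),
    "purple": (0.75, 0.04),
}

def bayes_hue_classify(hue):
    """With equal priors, the maximum-likelihood class is the Bayes choice."""
    return max(HUE_CLASSES, key=lambda name: gaussian_pdf(hue, *HUE_CLASSES[name]))
```

With measured distributions substituted for the placeholders, the same rule reproduces the classifier derived from the probability density functions discussed here.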

In addition, the achromatic colors are separable on the other two axes. Figure 4.11

shows the probability density functions for all the can colors along the saturation axis.

Clearly, the light achromatic colors are much lower in saturation than the others.

The dark achromatic colors, however, are just as saturated as the lighter chromatic

colors. Figure 4.12 shows that the dark achromatic colors can be distinguished with

a threshold on the value axis.

Using these probability density functions we can derive a classifier with 8 or 10

colors. For the 8-color case, gold and yellow are considered one category, and red

contains red and dark red, both pinks, and both oranges. The 10-color classifier

separates gold and yellow into 2 classes, and finds a separate category for pinks. The










Figure 4.11: Probability density functions for all can colors along the saturation
axis.


eight color regions common to both classifiers are black, white/silver, red, yellow,

green, cyan, blue, and purple. We determine that black is present when the value

coordinate is below 28 on a scale of 1 to 256. White and silver are grouped into a

single class, identified by saturation of less than 95. The remainder of the classes are

defined on the hue axis. Reds are in the range below 18 and above 206, while yellow

and gold lie in the 18 to 47 range. Greens are present from 47 to 92, cyans from 92 to

144, and blues from 144 to 164. Purples fill the hole from 164 (blues) to 206 (reds).

The calibrator had a magenta patch that formed a clear peak between the purple and

red clusters, but magenta is not well represented among can label colors, so magenta

is included in the purple category.
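The thresholds just described collapse into a simple decision rule. A minimal sketch, assuming hue, saturation, and value each lie on the same 1-to-256 scale used above, and that boundary values fall in the upper category (the text does not specify boundary handling):

```python
def classify_8(h, s, v):
    """8-color lighting-shift classifier; h, s, v on a 1-to-256 scale."""
    if v < 28:
        return "black"
    if s < 95:
        return "white/silver"
    if h < 18 or h >= 206:
        return "red"         # red wraps around the top of the hue axis
    if h < 47:
        return "yellow"      # gold is folded into yellow in the 8-color scheme
    if h < 92:
        return "green"
    if h < 144:
        return "cyan"
    if h < 164:
        return "blue"
    return "purple"          # magenta is folded into purple
```

Applied per pixel, this lookup produces the quantized images discussed below.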

The sample image quantized with the 8-color scheme is shown in Figure 4.14.

The wheels and pavement are categorized as achromatic/light and the shadows are













Figure 4.12: Probability density functions for all can colors along the value axis.


categorized as blue and cyan, as they are not dark enough to fit in the black category

used by the soda cans. This is primarily because the truck image is a generic image

included with the Adobe Photoshop package, for use in tutorials, and the soda

can images were taken with a Sony Digital8 camera in the laboratory. Images of

soda cans, quantized with this method, assign black regions on the cans to the black

category.

The two additional regions in the ten-color scheme consist of gold, from 18 to

31 along the hue axis, leaving yellow from 31 to 47, and pink. Figure 4.13 shows

the probability density functions along each axis for the colors included in the red

category. Clearly, one color is separable along the value axis, and one is separable

along the saturation axis. These two colors are the two pinks. The two pinks are

put in a tenth category, consisting of two subdivisions of the red category. The pink

category is defined as those colors that fit in the red category along the hue axis,













Figure 4.13: Probability density functions for the colors in the red category.


whose saturation is below 163 (and above the white/silver saturation boundary), or

whose value is above 123, or both.
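The ten-color scheme extends the same rule: gold is split from yellow at hue 31, and pink is carved out of the red category with the saturation and value tests just described. A sketch under the same 1-to-256 scale assumption:

```python
def classify_10(h, s, v):
    """10-color classifier: adds gold and pink to the 8-color scheme."""
    if v < 28:
        return "black"
    if s < 95:
        return "white/silver"
    if h < 18 or h >= 206:
        # Pink: red hue range, but desaturated (below 163) or bright (above 123).
        if s < 163 or v > 123:
            return "pink"
        return "red"
    if h < 31:
        return "gold"
    if h < 47:
        return "yellow"
    if h < 92:
        return "green"
    if h < 144:
        return "cyan"
    if h < 164:
        return "blue"
    return "purple"
```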



4.3 Summary


The main problem with existing quantization implementations lies in their final
goal. The intent of most algorithms is preservation of the image in terms

of instantaneous human perception, while incorporating reduced storage and display

constraints. This means that the maximum possible number of colors is used. Metrics

used to rate images and techniques include bit-rate (in coding) and quality in terms

of human perception. My algorithm is concerned with using the minimum possible

number of colors to represent an image while maintaining the ability to recognize

an object, not the maximum available while maintaining human perception of the

image. Because the data stored are in the form of a series of histogram values, not



Figure 4.14: Sample image quantized using the 8-color lighting-shift based classifier.


in the form of an image, bit-rate values for an image are not a useful metric. This

makes it very difficult to use the research in the literature to analyze an algorithm's

suitability for this project. We take the approach that the degradation in color

representation introduced by quantizing an image of an object can actually improve

the desired result: namely identifying the same object in multiple images. Generally,

quantization algorithms are judged qualitatively. The field is in agreement that mean

square error is a poor metric of image quality, and no other metric has been proposed

with sufficient acceptance to provide a clear quantitative measure of image quality.

Analysis of data gathered under typical laboratory lighting conditions resulted

in a plausible quantization scheme. In addition, this analysis provided a clearer

understanding of the nature of the problem. The variance in lightness or value is

high, rendering that feature unlikely to be helpful except in the broadest sense. The

segmentation into different color categories is not intuitive in the RGB space, while it









is very straightforward using hue and saturation as salient features of the data. This

resulted in eight or ten color categories, close to the number of categories resulting

from the work in Chapter 3.

Little work has been done for cases of extreme quantization for object recognition.

Generally, color quantization has focused almost exclusively on compression with

minimal loss in quality with regard to instantaneous human image perception. In this

case, quantization is usually regarded as introducing noise, rather than as reducing

it. There are a few exceptions to this rule [17, 6]. They attempt to compensate

for non-linearities in the data by transforming the original data to a new space,

via the K-L transform or a DCT, and then uniformly quantizing, rather than by

using an adaptive quantization algorithm. Because they are doing this with respect

to Swain and Ballard's algorithm, their results are of more interest here than the

other results. In general, they have found that using these transforms combined

with uniform quantization resulted in high accuracy that dropped off sharply when

the number of bins fell below eight. The resulting number of categories is similar,

regardless of how the number is obtained.













CHAPTER 5
THEORY AND RESULTS ON SYNTHETIC DATA


5.1 Theory


5.1.1 Structure and Methods

The color histogram indexing method derives much of its robustness from its

simplicity. The feature vectors consist of normalized color histograms of images of

the objects in the database. In our implementation, we determine match closeness

by a simple Euclidean distance measure between feature vectors. We have variables

for the number of objects in the database, the number of colors in the original space,

the number of colors in the final space, and the number of values used to form the

histogram. For example, suppose the final histogram contained 512 values. In general,

only the few largest have object-related information in them. The smallest values

are very noisy. Instead of keeping all 512 values, we could keep only a few of the

largest, and set the rest to zero. This would greatly increase the storage efficiency

of the system, as only the non-zero values and their index would need to be stored.

Because of storage and processing constraints, the preferable option would be to

simply quantize the space to only a few colors in the first place. If fewer colors are

used, only the magnitudes need be stored in an ordered array, rather than a more

complex structure incorporating the index values and eliminating the zeros.
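Both storage options reduce to the same feature-vector machinery: normalize the histogram, zero all but the p largest bins, and match by Euclidean distance. A minimal numpy sketch with toy histograms (the bin counts are illustrative):

```python
import numpy as np

def feature_vector(hist, p):
    """Normalize a color histogram and zero all but its p largest bins."""
    v = np.asarray(hist, dtype=float)
    v = v / v.sum()                # histogram scaled to sum to one
    keep = np.argsort(v)[-p:]      # indices of the p largest values
    out = np.zeros_like(v)
    out[keep] = v[keep]
    return out

def best_match(query, database):
    """Index of the database vector closest to the query in Euclidean distance."""
    dists = np.linalg.norm(np.asarray(database) - query, axis=1)
    return int(np.argmin(dists))

# Toy example: a two-object, 4-bin database and a slightly perturbed query.
db = [feature_vector([8, 1, 0, 1], 2), feature_vector([0, 5, 5, 0], 2)]
query = feature_vector([7, 2, 0, 1], 2)
```

Quantizing to few colors up front makes the `p` largest bins essentially the whole vector, so the sparse bookkeeping disappears.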

So how many color categories are sufficient? We have shown in Chapter 3 that
human memory for hues degrades between 8 and 16 divisions of the hue axis [54]. Drew

et al. [17] and Berens et al. [6] found accuracy began to degrade at between 8

and 16 categories in their respective transformed spaces. Analyzing the variation in









illumination due to sunlight over the course of a day resulted in 8 or 10 categories

useful in soda can identification. All these results indicate that the region between 8

and 16 categories may be where we should investigate.

Now that we have an approximate number of categories to explore, how do we

choose the categories themselves? Computationally, this should have little or no

impact on the final requirements of the system. Ideally, the categories should be

fixed for the expected environment of the robot, so that a lookup table is all that is

needed.

We begin with a modified version of uniform quantization. Obviously, uniform

quantization in RGB space produces quite different categories than uniform quan-

tization in a hue-based space. Given that we are concerned exclusively with color

information, a hue-based space seems most promising. We first test quantization to

512 colors in RGB space, which allows us to roughly determine the robustness of the

original algorithm under multiple lighting conditions. Second, we test quantization

to 64 colors in RGB space. We then test quantization of each original image to 14

colors in HLS space, obtaining 3 achromatic colors (white, gray, and black) and 11

chromatic colors (pink, red, orange, yellow, three greens, a dark cyan, light blue, dark

blue, and purple) using the "tree" quantization described in Section 4.1.4. Feature

vectors are generated by scaling the histograms of the indexed images to sum to one.


5.1.2 Theoretical Results

Assuming all colors and combinations of colors are equally likely, our equations

find the expected value of the error. In reality, a given database will produce better

or worse results with this technique, depending on the quantization method used and

the characteristics of the database. Noise in the form of camera calibration, non-

uniformity of color distribution in a given database, and lighting variations will all

affect any results on real data. However, given appropriate conditions (objects easily









Figure 5.1: Diagram showing examples of each variable used in the theoretical equa-
tions (c = 6 colors in the original space, k = 5 colors after quantization, p = 3
histogram values kept).

distinguishable by color alone, or the algorithm used as one method among many for

identifying the objects, and appropriate quantization methods for the characteristics

of the data in question) real results should at least follow the same trends as those

outlined here.

There is a tractable solution to the effects of the variables on expected error

for 3 objects [53, 52]. For more objects, it is possible to derive a specific analytic

solution, but simpler to implement a program to calculate appropriate values. For

larger databases (more than 10 objects) even this approach becomes unwieldy and

time-consuming. The 3-object solution is presented here, along with a description of

the program used to generate the accuracy values for up to 6 objects and simulation

results for larger databases.

For databases containing only three objects (n = 3), we wish to determine the

expected error. In this case, the variable c is used to represent the number of colors in

the original (not quantized) space, k is used to represent the number of colors in the

quantized space, and p is used to represent the number of values of the histogram.

These variables are illustrated in Figure 5.1. In this figure, there are six colors in









the original space. As a result of quantization, red and orange are merged to form a

single new color, represented here as orange. As a result of this, the first two objects

become indistinguishable, while the other two are still separable.

There are C(c, p) possible different objects in the input space. We choose three of
these for our database, and then quantize them. The resulting three objects in our
database are members of the C(k, p) possible objects in the quantized space. For this

case, we assume that the only degradation occurring over time is modeled by the

quantization. The only time the error increases is when more than one of the objects

in our database quantizes to the same object in the new space. Otherwise, the objects

remain perfectly distinguishable and the error remains at 0. If there is more than one

copy of a given object, we choose randomly between them. The expected error for

a given set of objects can take only three values: 0 (all objects distinguishable), 1/3

(two objects the same) or 2/3 (all three objects identical). However, when we look at

the expected value of this error over all possible sets of objects, in terms of c, k and

p, we get

    E[error] = (1/N_all) [ (1/3) N_2 + (2/3) N_3 ]                (5.1)

Here N_all is the number of possible cases, N_2 is the number of databases in which
two objects are identical and N_3 is the number of databases in which all three
objects are duplicates. N_2 is defined as


    N_2 = Σ_j C(s_j, 2) (C(c, p) − s_j)                           (5.2)

and N3 is defined as

    N_3 = Σ_j C(s_j, 3)                                           (5.3)

where s_j is the number of objects created in the original c-color space that are
quantized to the same jth object in the k-color space. N_all is the number of possible n-object











Figure 5.2: Theoretical results for c = 2^24, p = 2 and p = 3, n = 3, and k varying.


databases in the c-color space.


    N_all = C(C(c, p), n),  where C(a, b) = a! / (b! (a − b)!)    (5.4)


Figure 5.2 shows the results for this equation when p = 2 (crosses) and p = 3

(stars) for k varying. The effect of keeping p of the bins is clearly distinguishable.

The y-axis is the error, displayed on a log scale. Clearly, a larger p corresponds to a

measurable reduction in error. Increasing p by one produces a reduction in error of

more than an order of magnitude for k > 10. k = 10 is also the threshold for error

below 1% for p = 2.



5.1.3 Larger databases


For more than 3 objects in the database, the computational complexity of the

theoretical solution increases greatly. However, for small n (up to 6 on our equipment

and with our time limitations), it is possible to compute the expected accuracy,

with one assumption. We assume that C(c, p)/C(k, p) is larger than n. This ensures that










Figure 5.3: Theoretical results for c = 2^24, p = 3, and k varying along the x axis.
Each line corresponds to a different value of n (n = 3 solid, n = 4 dotted, n = 5
dashed).


all combinations of n of the C(k, p) possible objects are possible. This is not a severe

limitation, as in general we quantize from 2^24 colors down to tens of colors.

In Figure 5.3, we can see the effect of changing the number of bins in the new

space on the error. As k increases, the error decreases. If we plot the log of k versus

the log of the error, we get the lines shown. Each line corresponds to the results for

a given value of n. As n increases, the error also increases, but the relative increase

in error is small.

Figure 5.4 shows the behaviour of log error versus log k as p is increased. Again,

the results are shown on a log-log plot. As p increases, not only does the error

decrease, but the lines also become steeper. Thus, increasing p has a greater effect on

error when k is larger. Experimentally, the error when n = 4, p = 3 and k is on the

order of 1000 is of the same order of magnitude as the error when n = 4, p = 6 and

k = 25. Doubling the number of features kept reduced the number of colors needed

by a factor of 40 without changing the error.

For larger databases, we implemented a Monte Carlo simulation. These results

were obtained by averaging the error from 10, 000 randomly generated databases of n










Figure 5.4: Theoretical results for c = 2^24, n = 4, and k varying along the x axis.
Each line corresponds to a different value of p (p = 3 solid, p = 4 dotted, p = 5
dashed).


objects. This average error is plotted in Figure 5.5. Predictably, error increases with

the number of objects in the database. As the histogram space becomes less sparse,

the system's robustness to fluctuations in hue deteriorates.
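The simulation can be sketched as follows, assuming each object is a random set of p distinct colors, uniform quantization from c down to k colors, and random tie-breaking among identical quantized objects (a group of m identical quantized objects contributes m − 1 expected misidentifications). The trial counts in the usage lines are reduced for speed:

```python
import random
from collections import Counter

def monte_carlo_error(c, k, p, n, trials=10000, seed=1):
    """Average identification error over random n-object databases."""
    rng = random.Random(seed)
    total_err = 0.0
    for _ in range(trials):
        # Each object: p distinct colors drawn from the c-color space.
        objects = [frozenset(rng.sample(range(c), p)) for _ in range(n)]
        # Uniform quantization maps color i to coarse bin i*k // c.
        quantized = [frozenset(col * k // c for col in obj) for obj in objects]
        # m identical quantized objects -> m - 1 expected misidentifications.
        groups = Counter(quantized)
        total_err += sum(m - 1 for m in groups.values()) / n
    return total_err / trials

# Error should fall as the number of quantized colors k grows.
coarse = monte_carlo_error(c=256, k=4, p=2, n=5, trials=2000)
fine = monte_carlo_error(c=256, k=64, p=2, n=5, trials=2000)
```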

Theoretical results show coarse quantization can dramatically reduce necessary

storage and processing while producing a minimal reduction in accuracy. Increasing

the number of features kept dramatically increases the accuracy, while increasing the

number of objects reduces the accuracy. Large decreases in the number of colors

available can be compensated for by small increases in the number of features kept.

The results presented to this point are for the best possible case: only degradation

resulting from quantization is interfering with perfect accuracy. In the real world,

other factors affect accuracy. These include lighting variation (illuminant noise) and

differences in camera placement and calibration. In general, as long as the illuminant

noise is not too great (actual magnitude will depend on the database and the degree of

quantization), some amount of quantization will help eliminate this noise and improve

accuracy.









Figure 5.5: Simulation (averaged) results for c = 2^24, p = 2 and p = 3, k = 10, and n
varying.


5.2 Synthetic Data


There are two interlinked factors still to be tested. First, the theoretical results

assume both uniform distribution of colors/objects and uniform quantization. There

is no guarantee that the database will be in any way similar to the quantization

procedure followed. Thus a quantization scheme that is in some sense data-dependent

may be required. Second, and more importantly, these results show what happens

when there is no noise in the form of lighting shifts.

In order to simulate the effect of lighting shifts and test various quantization

schemes, further research was performed using synthetic data. The first experiment

tested the case when the objects all fall into a single region of the hue axis. Sec-

tion 5.2.1 describes the results of this experiment. Section 5.2.2 covers the results

when the objects are equally distributed in hue. The results when each object is
concentrated in more than one hue region are covered in Section 5.2.3, and Section 5.2.4

shows the results when these multiply-hued objects are shifted in hue not with a

simple linear shift, as in the previous sections, but with the lighting shifts measured in

Section 4.2.2 of Chapter 4.












Figure 5.6: Image of single hue region synthetic database. Red corresponds to higher
values; blue to lower values.


5.2.1 Single Hue Region

A synthetic database of 16 objects was generated. Each object consisted of a

sinusoidal bump, 8 of 256 bins wide. Figure 5.6 shows an image of this database.

Larger values are red; 0 is blue. The entire database takes up only half of the possible

hue axis. The joint purposes of this experiment were to verify that the accuracy

sweep method (described in Section 4.2.1 of Chapter 4) of optimizing bin selection was

working properly and to compare the results of the uniform and the accuracy sweep

methods of quantization. The accuracy sweep method should outperform uniform

quantization when the database is localized within one or more regions of the hue axis,

and should perform as well as the uniform method when the database consists of colors

evenly spread throughout the histogram space. When small lighting shifts occur, the

accuracy sweep method should dramatically outperform the uniform quantization














Figure 5.7: Results on single hue region database for varying numbers of bins, using
uniform (blue) and accuracy sweep (red) quantization. Average accuracy across the
different shifts is shown in the lower right hand plot. Standard deviation of average
value is shown with dotted lines. Progressively larger shifts from none (upper left
hand plot) to 7 (middle lower plot) are shown in the remaining plots.



method on training data except when uniform quantization results in precisely the

correct bin locations.
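The single-region database is easy to regenerate from its description: 16 half-sine bumps, each 8 of 256 bins wide, confined to half the hue axis. A sketch assuming the bumps are evenly spaced across that half:

```python
import numpy as np

def make_single_region_db(n_objects=16, n_bins=256, width=8, span=128):
    """Synthetic histograms: one half-sine bump of `width` bins per object."""
    bump = np.sin(np.pi * (np.arange(width) + 0.5) / width)  # half a sine period
    db = np.zeros((n_objects, n_bins))
    step = span // n_objects                  # bumps confined to the first `span` bins
    for i in range(n_objects):
        db[i, i * step : i * step + width] = bump
    return db / db.sum(axis=1, keepdims=True)  # normalize each histogram to sum to one

db = make_single_region_db()
```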


The number of bins was swept from 2 to 256 for uniform quantization and from 2


to 128 for accuracy sweep quantization. The accuracy results from 2 to 128 bins are

shown in Figure 5.7. Only 7 shifts were performed, as the 8th shift would cause all but


one of the objects to line up perfectly with a different object, producing 100% error

on those 7 objects. The single object that didn't line up with another object would

line up with nothing, producing error of random chance (1 in 16 chance of identifying


the object correctly, given that object). As the shifts increase, the expected value of













Figure 5.8: Database of objects completely spanning the hue axis.


the accuracy would increase until a shift of 128, and then decrease until we were 8

bins away on the other side.

Clearly the accuracy sweep method produces superior results on this type of data

set. It outperforms the uniform quantization by a larger and larger margin as the

values are shifted away from the original. Even for the largest shift, where only

one pixel of the shifted version overlaps with the original object, the accuracy sweep

method correctly identifies over half the objects, while uniform quantization fails

dramatically. Looking at the average accuracy, it is clear that overall the accuracy

sweep method performs better than the uniform method on training data.
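The shifted-recognition experiment can be reproduced end to end: circularly shift each histogram, re-bin both the originals and the shifted versions uniformly, and count nearest-neighbour self-matches. A minimal sketch on a toy 4-object database (the sizes are illustrative, not the 16-object database above):

```python
import numpy as np

def rebin_uniform(hists, k):
    """Uniformly quantize 1-D histograms to k bins by summing adjacent groups."""
    n_bins = hists.shape[1]
    coarse = (np.arange(n_bins) * k) // n_bins       # fine bin -> coarse bin
    out = np.zeros((hists.shape[0], k))
    for j in range(k):
        out[:, j] = hists[:, coarse == j].sum(axis=1)
    return out

def shifted_accuracy(db, shift, k):
    """Fraction of objects whose shifted version still matches itself."""
    shifted = np.roll(db, shift, axis=1)             # simple linear hue shift
    train, test = rebin_uniform(db, k), rebin_uniform(shifted, k)
    hits = 0
    for i, q in enumerate(test):
        dists = np.linalg.norm(train - q, axis=1)
        hits += int(np.argmin(dists) == i)
    return hits / len(db)

# Toy database: 4 objects, 8-bin-wide bumps on a 256-bin hue axis.
db = np.zeros((4, 256))
for i in range(4):
    db[i, i * 32 : i * 32 + 8] = 1.0
```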

The characteristic shape of the uniform quantization method for large shifts (com-

pared to the size of the objects, large is defined as shifts greater than half the size of

the object bumps) is of more interest than the average accuracy results. For small

shifts accuracy degrades as the number of bins is reduced. For a medium shift (exactly














Figure 5.9: Results of uniform (blue) and accuracy sweep (red) methods.



half the object) the uniform accuracy is extremely unpredictable from one number of

bins to another. However, for large shifts, the uniform accuracy follows a predictable


trajectory. First the accuracy is very low (2 bins). As the number of bins is increased,


the accuracy increases (compatible with the theoretical results). Eventually a peak

is reached, and the accuracy begins to decrease. This peak is the point at which the

maximum number of pixels are being consolidated into the correct bin as a result of

the widened bin sizes. As the bin sizes decrease and the number of bins increase,


the accuracy decreases, until at 256 bins the accuracy is 0. No objects are identified


correctly, because every object overlaps more with a different object than with itself.




5.2.2 All Hues


Instead of limiting the objects to a single region, the new database has identical


objects spread along the entire hue axis. Instead of 16 objects, the new database













Figure 5.10: Original database for histograms with two colors.


(shown in Figure 5.8) has 32 objects. Each of these 32 objects is the same as one

of the original 16 objects, merely shifted to a new region of the axis.

Figure 5.9 compares uniform and accuracy sweep methods. Clearly, when the col-

ors of the objects within the database are uniformly spread, and there is no noise in

the form of offsets, the uniform method does as well as the more optimized method.

However, when offsets are introduced, again the uniform accuracy degrades substan-

tially while the accuracy sweep method continues to perform well on training data.



5.2.3 Multiple Hues


Now that we know what happens when the database consists of identical objects

with single color regions, we wish to determine how the accuracy is affected by hue

shifts when each object consists of multiple hues. We begin with objects consisting of

two colored regions of equal size. This corresponds to histograms with two identical













Figure 5.11: Average accuracy of uniform (blue) and accuracy sweep (red) methods
as a function of shift.


bumps. Figure 5.10 shows the new database. Again, red represents larger values and

blue, smaller values. The offset is 64 elements, so a sweep of 72 bins will show the

full range of possible accuracies. Each bump of each object is still 8 hue values wide,

generated with half a period of a sine wave.

Figure 5.11 shows the accuracy as a function of shift. The dotted lines indicate

standard deviation. The curve is bimodal, with the first peak corresponding to a

correct match of the first histogram peak with the first histogram peak for a given

object, and the second accuracy peak corresponding to a match of the first histogram

peak of an object with its shifted second peak. Because the second peak of this plot

corresponds to only one peak contributing to the accuracy, rather than two, its peak

accuracy is lower.















Figure 5.12: Results of uniform (blue) and accuracy sweep (red) methods.



Figure 5.12 shows the results for selected shifts. Clearly, the results are best when

there is no shift. The same pattern as before is evident. The accuracy sweep results

(red line) show the same smooth rise and larger accuracy than the uniform results.

The uniform results (blue line) show the same abrupt peak followed by a decrease

to zero for larger shifts. The plots in this figure were chosen to show the effects of

the bimodal curve in Figure 5.11. Note that small shifts (1 to 10) are similar to the

results for the previous database, with roughly 50% accuracy at shifts of 4 and 5. For

the region between the two peaks in the bimodal curve (10 to 55), the uniform results

show a negligible bump for few bins while the accuracy sweep results are non-zero

and varying. For shifts between 55 and 72, the curves follow the results for the first

peak, but attain a lower maximum.

Our second synthetic database with two peaks is shown in Figure 5.13. This

database covers the entire hue axis and has an offset between the two objects of 128










Figure 5.13: Second two-peak database.


(half the potential hue space). This is a more uniform database, so we anticipate that

the accuracy sweep method should not increase the accuracy substantially. Because of

the symmetry in the database, we need only sweep the shifts from 0 to 8. Figure 5.14

shows the average results as a function of shift. There is a clear decrease in accuracy

as the shift is increased. Continuing to a shift of 16, it is reasonably clear that the

accuracy levels off at a shift of 8. We would expect that the accuracy of shifts between

16 and 120 would be approximately the same as the accuracy for shifts between 8

and 16, with shifts between 120 and 128 mirroring the shifts from 1 to 8.

Results for a variety of shifts are shown in Figure 5.15. Again, we have roughly

50% accuracy when the shift is 4, or half the bump size. We also continue to see the

clear pattern of increased accuracy for few bins when the shift is greater than half but

less than the full width of the bump. When the shift is larger than the bump size,

accuracy is decreased to almost zero. The accuracy sweep results for the flat section













Figure 5.14: Average accuracy for second two-peak database as a function of shift.


of the shift curve show a similar small bump for few bins as the uniform results in

the previous database.



5.2.4 More Complex Lighting Shifts


In this section, the databases from the previous subsections are tested with a

more complex lighting shift. The shift is generated from the images of cans under

different laboratory lighting conditions described in Section 4.2.2. Figure 5.16 shows

sample transforms. The black line of points corresponds to the transformation from

the average lighting condition (very close to the midday responses) to the earliest

morning lighting condition (9 am daylight with fluorescent lights). The blue line

corresponds to the transformation from the average to the latest evening condition

(fluorescent lights, no daylight). In each case, the values from the cans in the image,

rather than the calibrator, were used to generate the transformations. If we want to


Figure 5.15: Results of uniform and accuracy sweep methods on second two-peak database. (Panels: Shift of 1 through Shift of 6, Shift of 11, Shift of 15, and average accuracy; curves: Uniform, Acc. Sweep; x-axis: Number of Bins.)


perform color constancy with this data, we should make sure that a given location


on the x-axis fits into a bin that contains the entire difference between the blue and


black curves. Using the width between the curves between 75 and 125, roughly 16


categories would be necessary.
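The bin-count argument above can be made concrete: if the worst-case displacement between the two transform curves is d hue units, each uniform bin must be at least d wide for a hue and its transformed version to land in the same bin, which bounds the number of usable categories. A toy calculation (assuming a 256-unit hue axis; the helper name is illustrative):

```python
def max_uniform_bins(max_displacement, hue_range=256):
    """Largest number of uniform hue bins for which each bin is at
    least as wide as the worst-case hue displacement between two
    illuminant transforms, so a hue and its transformed version can
    share a bin."""
    return hue_range // max_displacement

# Worst-case separation between the transform curves is about 16 hue
# units (measured between hues 75 and 125), giving 256 // 16 = 16
# usable categories.
```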


Figure 5.17 shows the original input in the upper plot and the warped input


generated assuming the original data were taken under the average illuminant and


the warped version was taken under the first lighting condition in the lower plot.


The original database corresponds to the single-peak database that spans the entire


hue space, as described in Section 5.2.2. However, the warped version is substantially


different from the simple shifts tested previously. Instead of a linear shift, the warping


stretches some regions and compresses others.
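A non-uniform warp of this kind can be modeled as a lookup table over the hue axis rather than a constant offset. A sketch with a purely illustrative toy transform (the actual transforms come from the measured lighting data, not this formula):

```python
import numpy as np

def warp_hues(hues, transform_lut):
    """Apply a measured illuminant transform to integer hue values.

    transform_lut is a 256-entry lookup table mapping each input hue
    to its hue under the new illuminant; unlike a linear shift it can
    stretch some regions of the hue axis and compress others.
    """
    lut = np.asarray(transform_lut)
    return lut[np.asarray(hues, dtype=int)]

# A toy non-uniform warp: compress the lower half of the hue axis
# and stretch the upper half (illustrative only).
x = np.arange(256)
toy_lut = np.where(x < 128, x // 2, 2 * x - 256).clip(0, 255)
warped = warp_hues([10, 100, 200], toy_lut)
```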











Figure 5.16: Transformation from average illuminant to early morning and evening illuminants. (Plot "Lighting Shift Comparison"; curves: Mean to 12, Mean to 1; x-axis: Hue In.)


This non-uniform warping is apparent when we compare the results of the linear

shift with the results using the warped data. Figure 5.18 compares the linear shift of

one to the closest warped data (midday) to the average, and the most distant warped

data (early morning). The results in Figure 5.9 are comparable to the results with a

shift of 6. The accuracy sweep method reaches a peak of roughly the same accuracy,

but in the warped case, there is a substantial drop in accuracy between the 45th and

100th bins. In addition, the uniform results never drop to zero in the warped data

case in the way they do with a simple linear shift. However, at least for the accuracy

sweep method, there is a clear initial bump in the accuracy, followed by a decrease

as the number of bins increases.











Figure 5.17: Comparison of original and warped databases. (Panels: Original Database, Warped Database (Mean to LC1); x-axis: Hue.)


5.3 Over-fitting


The accuracy sweep method is highly effective at adapting the algorithm to known lighting changes. If the lighting will only ever change in one way, the accuracy sweep method places the bins to optimize for that shift. However, if the training examples do not reflect the lighting conditions actually encountered, the resulting bin locations can perform worse than uniform ones. Figure 5.19 shows

the results when the bins generated for a single lighting shift are used to determine

the accuracy for the other lighting shifts.

The dotted lines are the accuracies generated in the test case, with the optimal

accuracy sweep for the training data in red and the uniform results in blue. The solid

red lines show the accuracy for the given shift when it is used as cross-validation data against bin locations generated from the training data.
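The cross-validation procedure behind Figure 5.19 (evaluating bins trained on one shift against every other shift) can be sketched as follows; `accuracy_fn` and the toy accuracy model are hypothetical stand-ins for the full recognition pipeline:

```python
def cross_validate_bins(bin_edges, accuracy_fn, shifts):
    """Evaluate one set of trained bin edges against every lighting
    shift. If the bins were tuned to a single training shift,
    accuracy on the other shifts exposes over-fitting.

    accuracy_fn(bin_edges, shift) -> recognition accuracy in [0, 1].
    """
    return {s: accuracy_fn(bin_edges, s) for s in shifts}

# Toy accuracy model (hypothetical): accuracy falls off as the test
# shift moves away from the shift the bins were trained on.
train_shift = 1
toy_acc = lambda edges, s: max(0.0, 1.0 - 0.1 * abs(s - train_shift))
results = cross_validate_bins(None, toy_acc, shifts=range(8))
```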











Figure 5.18: Results for training (original) and testing (warped). (Plot "Hue-Warped Database"; curves: Uniform training, Uniform 13 vs 1 Warp, Acc Sweep training, Acc Sweep 13 vs 1 Warp; x-axis: Number of Bins.)


5.4 Conclusions


The theoretical results showed that when there is no noise in the system, we

can expect at most a minor drop in accuracy when we quantize from many bins to

on the order of ten. For large databases, more than ten colors may be necessary,

depending on the size and composition of the database. These results indicate that

systems using quantization as a technique to improve efficiency are likely to work very

well. However, the theoretical results did not predict whether or not the algorithm is

capable of performing limited color constancy.

The synthetic results showed that in the training case, we can expect the accuracy

sweep method of determining the bin locations to outperform the uniform quantiza-

tion method. In addition, when the uniform quantization method is used with values

that have been shifted by more than half the width of the histogram peaks, accuracy is













Figure 5.19: Over-fitting of accuracy sweep method. (Panels: Shift of 0 through Shift of 7; curves: Uniform, Acc Sw Training, Acc Sw Testing.)



substantially higher for few categories than for many. Thus, for small linear hue shifts,


quantization can perform limited color constancy. This is shown by the increase in


the accuracy when only a few bins are used. The realistic lighting shift, however, did


not show perceptible color constancy. On training data, the accuracy sweep method


did seem to perform some form of color constancy, as the accuracy increased for fewer


bins. However, the uniform method did not show any significant change in accuracy


as a function of the number of bins, and the accuracy sweep method overfitted the


data and was thus unfit for use in environments where lighting is unpredictable.















CHAPTER 6
STILL IMAGE RESULTS


To test our results from Chapter 5, we generated a series of databases. The smallest database consisted of images taken under typical laboratory lighting conditions.

The database contained images of 9 different soda cans, under 4 different illuminants.

Each image was segmented to contain only the soda can in question, and no pre-

processing was done before the quantization stage. The four illuminants consisted

of ambient daylight, ambient fluorescent light, frontal daylight with left fluorescent

light, and frontal daylight with right fluorescent light. Each can was in the same

location, in the same orientation, with the same surroundings, so differences due to

reflections, orientation shifts, and shadows were minimized.

A second database of 14 cans under 8 different illuminants was used to verify the

theoretical results. This database was intended to be much more difficult, as the

illuminants were common household illuminants, not pre-processed to compensate

for intensity or hue variations. As a result, the images in this database varied far

more than the others in terms of overall brightness and hue. A third database of 86

cans under 4 common laboratory illuminants and 8 different orientations was used to

test the effects of quantization on larger databases. Of this larger database, a subset

of 15 cans was used to test the results on a database of objects whose primary

colors all come from a single region of the hue axis. Databases were quantized to

2, 3, 4, 5, 6, 8, and 14 colors using the "tree" method described in Section 4.1.4.

The number of bins was also swept from 2 to 256 using the uniform and accuracy

sweep quantization methods along the hue axis, with only the hue information used to

identify the objects, as in the results on synthetic data in Chapter 5. In addition, the

lighting shift method proposed at the end of Chapter 4 was used. All these results on





















Figure 6.1: Sample soda images used in 14-can database. Soda shown here is Publix brand Diet Cola, under each of eight different illuminants.


the unprocessed database are compared to the results when the large database is pre-

processed with the multi-scale retinex with color restoration described in Chapter 2,

and to the results using simple normalization to an [r, g] chromaticity space.
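The [r, g] chromaticity normalization mentioned above divides each channel by the total intensity, discarding brightness. A minimal sketch (the zero-division convention for pure black is an assumption, not specified in the text):

```python
def to_rg_chromaticity(r, g, b):
    """Normalize an RGB triple to [r, g] chromaticity coordinates.

    Dividing by the total intensity discards brightness, so the
    coordinates are invariant to uniform changes in illumination
    intensity (though not to changes in illuminant color).
    """
    total = r + g + b
    if total == 0:
        return (0.0, 0.0)  # assumed convention for pure black
    return (r / total, g / total)

# A pixel and the same pixel under twice the light intensity map to
# the same chromaticity.
assert to_rg_chromaticity(60, 120, 20) == to_rg_chromaticity(120, 240, 40)
```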



6.1 Theoretical Comparison


Predictions were tested against a real database containing images of 14 soda cans.

Each can was oriented the same way in each image, to minimize the effects of ori-

entation changes. In Figure 6.1 the eight different lighting conditions are shown for

one can. The can shown is white, with markings in dark red and gray. Three images

were taken of each can under incandescent light: one with no other ambient light, one

with ambient (but not direct) sunlight, and one with full sun. One image was taken

of each can under ambient but not direct sun. Two images were taken of each can in




























Figure 6.2: Colormaps for 2, 3, 4, 5, 6, 8 and 14 colors.

bright and dim halogen light, with and without sunlight, for a total of four images of

each can. Each of these 112 images was quantized to 2, 3, 4, 5, 6, 8, and 14 colors

using the data-independent tree method described in Section 4.1.4.
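Quantization to a fixed colormap amounts to mapping each pixel to a color index. The sketch below uses plain nearest-neighbor assignment in RGB as a stand-in; the actual tree method's splitting rules are defined in Section 4.1.4, but it produces the same kind of indexed image:

```python
import numpy as np

def quantize_to_colormap(pixels, colormap):
    """Map each RGB pixel to the index of its nearest colormap entry
    (Euclidean distance in RGB). A stand-in for the data-independent
    tree quantization of Section 4.1.4."""
    pixels = np.asarray(pixels, dtype=float)      # shape (N, 3)
    colormap = np.asarray(colormap, dtype=float)  # shape (K, 3)
    # Squared distance from every pixel to every colormap entry.
    d = ((pixels[:, None, :] - colormap[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

# Quantize two pixels against a 2-color map (black and red).
cmap = [[0, 0, 0], [255, 0, 0]]
idx = quantize_to_colormap([[10, 10, 10], [200, 30, 30]], cmap)
```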

The colormaps generated for our quantization to 2, 3, 4, 5, 6, 8 and 14 colors

are shown in Figure 6.2. As the number of available colors increased, the number of

chromatic colors also increased. Because the object recognition algorithm is based

on color, it seemed appropriate to include as many hues as possible and minimize

achromatic discrimination. Colors whose saturation was below 0.2 were considered

achromatic, and colors whose lightness was below 0.1 were assigned to the darkest

achromatic color. The upper lightness threshold was set to 1, so even very pale colors

were considered chromatic.
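The chromatic/achromatic assignment described above can be sketched directly from the stated thresholds; the function name and return labels are illustrative:

```python
def classify_color(saturation, lightness,
                   sat_threshold=0.2, dark_threshold=0.1):
    """Classify a colormap entry using the thresholds from the text:
    lightness below 0.1 maps to the darkest achromatic color,
    saturation below 0.2 is achromatic, and there is no upper
    lightness cutoff (threshold of 1), so very pale colors remain
    chromatic."""
    if lightness < dark_threshold:
        return "darkest achromatic"
    if saturation < sat_threshold:
        return "achromatic"
    return "chromatic"
```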

In the theoretical equation, each area of uniform color is assigned an index, and so

it is possible to derive accuracy values for k < p. For real data, an area was designated











Figure 6.3: Theoretical predictions (blue solid lines) for c = 224, p = 2 and p = 3, n = 3, and k varying. Real database results (red dashed lines) for c = 224, p = 3 and p = 5, n = 3, averaged over 20 sets, and k varying. (Plot "Results for Varying k"; curve labels: pReal=3, pReal=5, pSim=2, pSim=3; x-axis: k.)


as the sum of all pixels of the same quantized color, so k could not be smaller than p.

For comparison with the theoretical predictions for varying k, we averaged the results

from 20 sets of three images, where each image in a given set was of a different can.

For comparison to the simulation data, we determined the accuracy for sets of 10, 20,

30, 50 and 100 images, without averaging over different combinations. Our database

of real images utilizes resubstitution, to better compare accuracies derived from real

data to the theoretical expected accuracy.
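The averaging procedure (20 random sets of three images, each of a different can, evaluated with resubstitution) can be sketched as follows; `classify_fn` is a hypothetical stand-in for the recognition algorithm:

```python
import random

def average_set_accuracy(image_ids, classify_fn,
                         set_size=3, n_sets=20, seed=0):
    """Average recognition accuracy over n_sets random sets of
    set_size distinct images. Under resubstitution the same images
    form both the database and the test set, so this estimates the
    upper bound that the theory predicts."""
    rng = random.Random(seed)
    accs = []
    for _ in range(n_sets):
        subset = rng.sample(image_ids, set_size)
        correct = sum(classify_fn(img, subset) == img for img in subset)
        accs.append(correct / set_size)
    return sum(accs) / n_sets

# A perfect classifier under resubstitution gives accuracy 1.0.
ids = list(range(14))
perfect = lambda img, db: img
```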

Figure 6.3 compares the theoretical and real data, for several values for p, as k

varies from 1 to 25 and c is held at 224. The theoretical results (blue solid lines) predict

high accuracy even for values of k as low as 5 (greater than 99.5% for p = 3).

The results from the real image database (red dashed lines) are somewhat lower with

accuracy below 91% when k is 6 or less. However, when k equals 8 or 14, the accuracy