CORRENTROPY FOR SHAPE-BASED MATCHING AND RETRIEVAL OF OBJECTS IN COLOR IMAGES

By PRIYANK D. BAGRECHA

A THESIS PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF MASTER OF SCIENCE

UNIVERSITY OF FLORIDA 2011

© 2011 Priyank D. Bagrecha

To my parents and my sister

ACKNOWLEDGMENTS

A lot of people have been instrumental in making my stay for a master's degree at the University of Florida successful and fruitful. I would like to acknowledge and thank my advisor Dr. Jose C. Principe for providing me an opportunity to work under his guidance. His constant encouragement and motivation, along with the numerous technical discussions with him, provided me with good insight and helped me structure this thesis to its present state. I am also thankful to Dr. John G. Harris and Dr. Paul D. Gader for being on my thesis committee and for their invaluable suggestions.

I would like to acknowledge my colleagues at the Computational NeuroEngineering Laboratory, especially Alex and Erion, for their constant help throughout the course of this thesis. I am thankful to my friends at the University of Florida who made my stay in Gainesville a memorable one and who always pushed me to work harder. Last but not least, I am highly thankful to my parents and my sister for their infinite support, love and encouragement.
TABLE OF CONTENTS

CHAPTER

1 INTRODUCTION
   1.1 Image Retrieval
   1.2 Motivation and Goals
   1.3 Outline

2 SHAPE-BASED IMAGE RETRIEVAL
   2.1 Shape-based Image Retrieval
   2.2 Shape Descriptors
      2.2.1 Shape Parameters
      2.2.2 Shape Signatures for Shape Representation
      2.2.3 Polygonal Approximation
      2.2.4 Space Interrelation Feature
      2.2.5 Moments
      2.2.6 Transform Domain Features
   2.3 MPEG-7 Shape Descriptors
      2.3.1 Region Shape Descriptor
      2.3.2 Contour based Shape Descriptor
   2.4 Shape Similarity Measures

3 INFORMATION THEORETIC LEARNING AND CORRENTROPY
   3.1 Information Theoretic Learning
      3.1.1 Renyi's Entropy and Information Potential
      3.1.2 Divergence
   3.2 Correntropy and its Properties

4 CORRENTROPY FOR SHAPE-BASED MATCHING OF PARTIALLY OCCLUDED OBJECTS
   4.1 Shape-based Matching of Partially Occluded Objects
   4.2 Problem Model and Solution
   4.3 Experimental Results

5 SHAPE EXTRACTION
   5.1 Shape Extraction in Color Images
      5.1.1 Kass's Snake
      5.1.2 Gradient Vector Flow Snakes
      5.1.3 Level Set based Active Contour
      5.1.4 Geodesic Active Contour
      5.1.5 Region based Active Contour
   5.2 Nonparametric KDE based Active Contour
      5.2.1 Approach
      5.2.2 Segmentation Results

6 CORRENTROPY FOR SHAPE-BASED RETRIEVAL OF COLOR IMAGES
   6.1 Problem Model and Solution
   6.2 Experimental Results

7 CONCLUSION

APPENDIX

A UPDATE EQUATION FOR CORRENTROPY BASED SHAPE MATCHING
B DERIVATION OF UPDATE EQUATION FOR LEVEL SET

LIST OF REFERENCES

BIOGRAPHICAL SKETCH

LIST OF TABLES

4-1 Recognition and Misclassification Rates
6-1 Best Retrieval Accuracy for Correntropy and LMedS

LIST OF FIGURES

3-1 Probabilistic interpretation of $V_{XY}$ (the maximum of each curve is normalized to 1 for visual convenience)
4-1 The fish template database
4-2 Occluded fish template
4-3 Extracted boundary of the fish in Figure 4-2
4-4 Recognition rate for different kernel sizes
4-5 Recognition rate vs occlusion
5-1 Image with object regions (A) The true segmented regions $R_1$ and $R_2$ and their associated density functions $p_1(x)$ and $p_2(x)$ and (B) The evolving contour $C$, the region inside the curve $R_{in}$ with its associated density function $p_{in}$ and the region outside the curve $R_{out}$ with its associated density function $p_{out}$
5-2 Segmentation results using Nonparametric KDE based Active Contour (A) for a black and white image (B) for a simple color image
5-3 Segmentation Result for (A) Image of an Ibis (B) Extracted contour for Figure 5-3 (A) scaled down to 1/4th its size
5-4 Segmentation Result for (A) Image of a Crab with complex background (B) Extracted contour for Figure 5-4 (A) scaled down to 1/4th its size
5-5 Segmentation Result for (A) Image of a Crab with grainy background (B) Extracted contour for Figure 5-5 (A) scaled down to 1/4th its size
6-1 Representatives of 10 categories in the image database (A) Airplane (B) Crayfish (C) Crab (D) Butterfly (E) Bird (F) Flower (G) Seahorse (H) Starfish (I) Motorbike (J) Revolver
6-2 Average Retrieval Accuracy for Correntropy and LMedS
6-3 Retrieval Result for (A) Correntropy and (B) LMedS. First image is the query image. Retrieval ranks are displayed at the top of each image.
6-4 Retrieval Result for Crab Category for (A) Correntropy (B) LMedS. Retrieval ranks are displayed at the top of each image.

Abstract of Thesis Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Master of Science

CORRENTROPY FOR SHAPE-BASED MATCHING AND RETRIEVAL OF OBJECTS IN COLOR IMAGES

By Priyank D. Bagrecha

May 2011

Chair: Jose Carlos Principe
Major: Electrical and Computer Engineering

The majority of shape matching algorithms use either shape landmarks or features extracted from the shape boundary.
Automatic extraction of these landmarks or features from the shape boundary under occlusion is a problem. The problem is mainly due either to missing or misplaced correspondences for the occluded portion of the object, or to the difficulty of feature extraction under occlusion. The central idea of this thesis is that occlusion can be considered as outliers and the shape boundary can be modeled as a heavy-tailed non-Gaussian distribution. Our methodology uses the shape boundary directly and we do not need to extract shape landmarks or shape descriptors from the object boundary. Correntropy matched filters have been shown to perform better for non-Gaussian signal processing applications. We propose Correntropy as a cost function for shape-based matching of partially occluded objects and shape-based retrieval of color images. The shapes of the objects from color images are extracted using a nonparametric kernel density estimation method formulated in the level set framework. The results obtained are promising, and they suggest that Correntropy as a shape-matching criterion provides higher retrieval accuracy in comparison to LMedS as a shape-matching criterion.

CHAPTER 1 INTRODUCTION

1.1 Image Retrieval

Improved computing power and networking speeds have generated a significant amount of interest in multimedia services and applications (Furht 1994). Powerful processors, high speed network connectivity, high capacity storage devices, improvement in compression of digital media, and advancement of audio, image and video technologies have made multimedia systems technically and economically feasible. These advances in technology have initiated a variety of potential applications such as online video rental services, online movie catalogs, and digital media libraries, which strive to provide the user with on-demand multimedia services.
Owing to an increased interest in such services and to a huge explosion of visual data, it is necessary to provide an improved and robust searching, indexing and retrieval system. Although advances in compression algorithms and storage equipment technologies have alleviated the problem of storing gigabytes of digital data, a user looking for a specific multimedia file finds it difficult to browse through the entire database due to its high volume. Therefore an automated indexing and retrieval system is needed.

Traditionally, textual features such as file names, file metadata, captions and keywords are used for representing an image or a video for indexing and retrieval purposes. However, there are several inadequacies with this system. Firstly, such an indexing system needs human intervention for tagging. The problems with such a supervised system in terms of manual tagging are manifold. Consider, for instance, a system with images of modern art. Different people have varying opinions and understandings about the content of these images. As a result, different people will have different tags and keywords for the same multimedia file, making the keywords subjective and not unique. Such an indexing scheme and the management of a huge database take a long time and may not be economically viable. With manual tagging, there is always a need to update the set of keywords and to maintain a proper hierarchy of these keywords. This may result in missing several multimedia files during retrieval because they may have been indexed with synonyms of the keywords used for the query, rather than with the keywords themselves.

Secondly, since a picture is worth a thousand words, the use of textual features is inadequate and ineffective in representing attributes such as the color, texture, shape and layout of the image or the video frame.
Keywords are not sufficient for abstract images, because abstraction, by nature, is elusive, and is expressible only in its currently presented form. While a shortage of keywords poses a problem, an excess of keywords, such as those necessary to describe a scenic view of a bustling market, may result in abundant and therefore ineffective search results.

Finally, the subjective nature of visual search is also an issue. An image containing several objects will generate different interests in different users. The precision that a user looks for in the content of the images is also subjective. For example, in a database of images of American presidents, a particular user might be interested in images of George Bush while another user might be looking for images of George Bush wearing a navy colored suit. Due to these difficulties with manual annotation of visual data, it is imperative to index images and videos automatically for efficient search and retrieval based on their content. As a result, there is a need to automatically extract features based on visual cues from the images and to index, search and retrieve on the basis of these features. Since humans use color, texture and shape to differentiate and identify the contents of an image, it is natural to use features based on these visual cues for image retrieval.

Most of the prior works have concentrated on identifying the correct model or framework for image retrieval. A typical image retrieval application involves preprocessing of input images, extraction of features, and storage with other images in the database. When a query in the form of an image or a sketch is provided, it too is preprocessed, its feature(s) are extracted, and they are compared with those of the images in the database. A ranked list of images with the best scores based on some similarity or dissimilarity measure is generated as the retrieval result.
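The query step described above (compare the query's features with every stored feature, then return a ranked list) can be sketched as follows. This is only an illustrative sketch: the function names `rank_database` and `euclidean`, the two-dimensional toy features and the use of Euclidean distance as the dissimilarity measure are assumptions made for the example, not the features or the measure used in this thesis.

```python
import numpy as np


def euclidean(a, b):
    """Hypothetical dissimilarity measure: plain Euclidean distance."""
    return float(np.linalg.norm(np.asarray(a, float) - np.asarray(b, float)))


def rank_database(query_feature, database_features, dissimilarity):
    """Return (indices, scores) sorted from the best to the worst match.

    A smaller dissimilarity score means a better match.
    """
    scores = np.array([dissimilarity(query_feature, f) for f in database_features])
    order = np.argsort(scores, kind="stable")
    return order, scores[order]


# Toy database of three pre-extracted feature vectors (assumed values).
database = [np.array([0.0, 0.0]), np.array([3.0, 4.0]), np.array([1.0, 1.0])]
query = np.array([0.9, 1.2])

order, scores = rank_database(query, database, euclidean)
print("ranked indices:", order.tolist())
print("scores:", scores.tolist())
```

Any similarity or dissimilarity measure with the same call signature can be passed in place of `euclidean`; for a similarity measure the sort order would simply be reversed.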
1.2 Motivation and Goals

For years, researchers have been working on different feature extraction schemes and corresponding feature matching techniques for the problem of image retrieval (Datta, et al. 2008), (Smeulders, et al. 2000), (Lew, et al. 2006). Several low level visual descriptors based on the color, texture (Manjunath, Ohm, et al. 2001) and shape (Ansari and Delp 1990), (Jain and Vailaya 1996), (Bober 2001) of the objects in the image have been proposed in the literature for image retrieval. Object shape gives the best semantic representation of image content. Hence we focus on shape as a feature for content based matching of images. Most shape descriptors or shape matching techniques have a major limitation: they focus on global features and hence are sensitive to local changes (Dryden and Mardia 1992), (Li, Lee and Adjeroh 2005), (Jain and Vailaya 1996). The ones that focus on local features are prone to noise (Latecki and Lakämper 2000). Most of the shape matching techniques are based on computing feature(s) from the shape boundary (Bober 2001). The major difficulty in shape matching can be attributed to the difficulty of extracting features, or to missing correspondences or misplaced shape landmarks.

We propose an information theoretic measure, Correntropy, for matching the boundaries of the objects in the images. Correntropy is defined as a generalized correlation function between two random variables in a high dimensional feature space (Santamaria, Pokharel and Principe 2006). This definition induces a Reproducing Kernel Hilbert Space (RKHS). Correntropy also includes all the even order moments of the difference of the two random variables in the input space (Santamaria, Pokharel and Principe 2006). It is observed that Correntropy has a strong outlier rejection capability in the input space (Liu, Pokharel and Principe 2007).
Correntropy based matched filters have been shown to perform better for non-Gaussian signal processing (Pokharel, Agrawal and Principe 2005). These properties strongly motivate us to investigate Correntropy as a measure for shape matching and shape-based retrieval of color images. Another advantage of our approach is that we use the object boundary directly and that there is no need to extract features from the shape contour.

1.3 Outline

The thesis comprises seven chapters. Chapter 2 discusses the problem of shape-based image retrieval. It also reviews the different methodologies existing in the literature. Chapter 3 deals with Information Theoretic Learning, followed by Correntropy and its properties. Chapter 4 proposes the Correntropy based algorithm for shape-based matching of partially occluded objects. Chapter 5 discusses the active contour based shape extraction algorithms in detail and describes the nonparametric level set based shape extraction algorithm used in this thesis. In Chapter 6, results for Correntropy as a similarity metric for shape-based retrieval of color images are presented. The thesis is concluded in Chapter 7.

CHAPTER 2 SHAPE-BASED IMAGE RETRIEVAL

2.1 Shape-based Image Retrieval

Among the large number of low level features for image retrieval such as color, texture and layout, the shape of the object provides a better semantic description of the content of the image. The shape in an image can be understood as a medium level feature, since shape is something which is also recognized by human perception.

Shape-based image retrieval is the process of searching an image database for shapes which are similar to the query shape. Usually either all shapes within a given distance from the query are determined, or the first few shapes at the smallest distances are returned. An efficient shape-based image retrieval system needs good shape descriptor(s) and similarity measure(s).
A good shape descriptor must possess the following properties (Latecki and Lakämper 2000):

Identifiability: Shapes which are found perceptually similar should have similar shape descriptors.
Translation, Rotation and Scale Invariance: Translation, rotation or scaling of the shape should not change the shape descriptor.
Affine Invariance: The descriptor should be invariant to affine transforms like shears, flips etc.
Robustness: It should not be affected by the presence of noise.
Occlusion Invariance: If the shape is partially occluded, the descriptor should be similar to that of the non-occluded shape.

Good retrieval accuracy requires the shape descriptor to be able to effectively find perceptually similar shapes in the database. Hence, the descriptor(s) should be complete so as to represent the content of the image. It should be compact for effective indexing and storage. The shape similarity measure between two descriptors should not be computationally expensive.

2.2 Shape Descriptors

The shape description techniques can be classified in three ways (Mingqiang, Kidiyo and Joseph 2008): (i) contour based methods and region based methods, (ii) space domain and transform domain, and (iii) information preserving and non-information preserving.

Contour based methods and Region based methods: They will be covered in Section 2.3, MPEG-7 Shape Descriptors.
Space domain and Transform domain methods: Methods in the space domain match shapes on a point or point feature basis, while transform domain methods match shapes on the basis of features computed through some transform.
Information preserving and Non-information preserving: The distinction is made on the basis of whether the method allows for an appreciable reconstruction of the shape given the shape descriptor.
2.2.1 Shape Parameters

Shape can be described by simple geometric properties or shape parameters such as center of gravity, axis of least inertia, digital bending energy, eccentricity, circularity ratio, elliptic variance, rectangularity, convexity, solidity, Euler number, profiles, hole area ratio etc. (Mingqiang, Kidiyo and Joseph 2008). These shape parameters can only discriminate shapes with large differences and hence are best used to filter false positives or combined with other shape descriptors.

2.2.2 Shape Signatures for Shape Representation

A shape signature is a one-dimensional feature which captures the visual or perceptual feature of the shape. Shape signatures are usually computed from the shape boundary coordinates. Examples include complex coordinates (representation of shape boundary coordinates as complex numbers), distance of the coordinates from the centroid, tangent angle (turning angle), curvature at the boundary coordinates, area of the triangle formed by two consecutive boundary points and the centroid of the shape, and area of the triangles formed by consecutive boundary points. These methods are sensitive to noise, as small changes in the boundary points can cause large errors in matching. Also, to accommodate rotation invariance, shift matching is required. Normally feature histograms are used for rotational invariance.

2.2.3 Polygonal Approximation

These methods focus on capturing the overall shape information while ignoring the minor variations along the edge. This reduces the effects due to discretization of the shape contour. Merging and splitting are two ways of implementing polygonal approximation. Merging methods add successive pixels to a line segment as long as each newly added pixel does not cause the segment to deviate too much from a straight line. The polygon evolution based method presented in (Latecki and Lakämper 2000) is one such method.
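The merging idea just described can be sketched in a few lines. The sketch below is a generic greedy merging scheme written for illustration; it is not the polygon evolution method of Latecki and Lakämper, and the names `merge_polyline`, `point_to_line_distance` and the tolerance `tol` are hypothetical. It assumes an open, ordered list of 2D boundary points and measures "deviation from a line" as the largest perpendicular distance of the covered points from the chord.

```python
import numpy as np


def point_to_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    ab = b - a
    norm = np.hypot(ab[0], ab[1])
    if norm == 0.0:
        # Degenerate chord: fall back to the point-to-point distance.
        return float(np.hypot(*(p - a)))
    cross = ab[0] * (p[1] - a[1]) - ab[1] * (p[0] - a[0])
    return float(abs(cross) / norm)


def merge_polyline(points, tol):
    """Greedy merging: extend the current segment one point at a time and
    start a new segment when an intermediate point deviates more than tol."""
    pts = np.asarray(points, dtype=float)
    if len(pts) < 3:
        return pts.copy()
    vertices = [0]
    start = 0
    for end in range(2, len(pts)):
        inner = pts[start + 1:end]  # points the chord start->end must cover
        worst = max(point_to_line_distance(p, pts[start], pts[end]) for p in inner)
        if worst > tol:
            start = end - 1         # close the segment at the previous point
            vertices.append(start)
    vertices.append(len(pts) - 1)
    return pts[vertices]


# An L-shaped boundary fragment collapses to its three corner points.
boundary = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
print(merge_polyline(boundary, tol=0.1))
```

A smaller `tol` keeps more vertices and follows the contour more closely; a larger `tol` gives a coarser polygon.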
Splitting methods work by connecting one point on the shape boundary to another with a line. The perpendicular distance of each point on the boundary to this line is calculated. If this distance exceeds some threshold, the line is broken at the point of greatest distance. Polygonal approximation methods simplify the shapes without causing any blurring effects and reduce the noise. These methods are generally used as a preprocessing step for computing other features.

2.2.4 Space Interrelation Feature

Spatial interrelation features describe the shape of the object through the relation of the pixels in the shape region, or of contour segments or curves. Generally geometric features such as length, curvature, relative orientation and area are used for the representation. Convex hull, chain codes and shape contexts are several examples of spatial interrelation features. (Xu, Saber and Tekalp 2001) employ a hierarchical spatially interrelated feature for shape matching of partially occluded objects. Spatial feature based methods provide a direct and compact way of representing the shape.

2.2.5 Moments

The concept of moments originated from moments in physics. They can be used in both contour based and region based methods for shape representation. Statistical features can be used to reduce the size of the boundary representation. The normalized statistical features are rotation, translation and scale invariant. Invariant moments (geometric moment invariants) and Zernike moments are region based moments which are widely used. These features are global in nature and hence cannot be associated with salient features of the shape or given a physical significance.

2.2.6 Transform Domain Features

Unlike the spatial interrelation feature based methods, these methods project the shape contour or region into another domain to obtain some intrinsic features.
Transforms can thus be used to characterize the images or the shape contours from the images. The shape features are then represented by all or a few coefficients of the transform. Fourier descriptors are features obtained by computing the Fourier transform of a one-dimensional function of the shape contour. Usually normalized Fourier descriptors are used. A region based Fourier descriptor is obtained by applying a modified polar Fourier transform on the image. Wavelet transforms and the Radon transform are other transforms that are used to compute transform domain features. The Shapelets descriptor proposes the formation of a shape via linear superposition of a number of shape bases.

More information or references for the features discussed in this chapter can be found in (Mingqiang, Kidiyo and Joseph 2008), (Li, Lee and Adjeroh 2005), and (Bober 2001).

2.3 MPEG-7 Shape Descriptors

To facilitate effective use of audio, video and motion descriptors, ISO/IEC has launched the MPEG-7 standard (Martinez 2004) for multimedia retrieval. The MPEG-7 standard includes three shape descriptors: (i) Region based shape descriptor, (ii) Contour based shape descriptor and (iii) 3D shape descriptor. Since we are focusing only on 2D images, we will not be discussing the 3D shape descriptor. The interested reader can find more information in (Bober 2001).

2.3.1 Region Shape Descriptor

The MPEG-7 region based descriptor (Bober 2001) can provide a compact representation of several disconnected regions as well as simple connected regions with or without holes. It is also robust to segmentation noise. This feature belongs to the broad class of features based on moments. It is computed via a 2D Angular Radial Transformation (ART) defined in a unit disc in polar coordinates.
A set of ART coefficients $F_{nm}$ is extracted for each shape, defined as

$$F_{nm} = \langle V_{nm}(\rho,\theta),\, I(\rho,\theta) \rangle = \int_0^{2\pi}\!\!\int_0^1 V_{nm}^{*}(\rho,\theta)\, I(\rho,\theta)\, \rho\, d\rho\, d\theta \tag{2-1}$$

where $I(\rho,\theta)$ is the image intensity function represented in polar coordinates and

$$V_{nm}(\rho,\theta) = \frac{1}{2\pi}\, e^{jm\theta}\, R_n(\rho) \tag{2-2}$$

where

$$R_n(\rho) = \begin{cases} 1, & n = 0 \\ 2\cos(\pi n \rho), & n \neq 0 \end{cases} \tag{2-3}$$

The default region based shape descriptor has 140 bits. It uses 35 coefficients quantized to 4 bits per coefficient.

2.3.2 Contour based Shape Descriptor

This descriptor characterizes the shape of an object or region in the image on the basis of its outline or boundary. Objects having disconnected regions can be characterized on the basis of the boundaries of all the segmented regions. Naturally, the contour based shape descriptor is aligned towards semantic image retrieval as it reflects properties of the human visual system.

The descriptor is based on the Curvature Scale Space (CSS) representation of the contour. The contour is smoothed repeatedly until the shape becomes convex. A so called CSS image is generated by repeatedly smoothing the contour, with every row corresponding to the resultant contour. The horizontal index is the coordinate index of the contour and the vertical index is the number of times smoothing has been repeated. The coordinate values of the contour for prominent CSS peaks are recorded along with the eccentricity and circularity of the contour. For more details the interested reader can refer to (Bober 2001).

The MPEG-7 contour based descriptor is efficient in the retrieval of objects even with deformation (rigid, non-rigid or perspective deformation) in shape. It can also distinguish between shapes which have similar region shape properties or similar regional pixel distribution, and it supports search for shapes which are semantically similar according to human vision.

2.4 Shape Similarity Measures

A good shape similarity measure should satisfy all the properties required of a good shape descriptor that are mentioned in Section 2.1.
Depending upon the task at hand, the similarity function should be symmetric or not strictly symmetric. A relaxed triangle inequality is desired for the partial matching problem (Veltkamp 2001). The similarity function should be continuous, robust to noise and invariant under some transformations. The group of transformations for which the measure should be invariant is also application dependent. As a result, different similarity measures are used for different shape descriptors, depending upon the application.

The $L_p$ distance or Minkowski distance, Bottleneck distance, Hausdorff distance (HD), Turning function distance, Fréchet distance, Earth Mover's Distance (EMD), and Hashing based similarity functions are a few measures which are widely used in the literature. (Grauman and Darrell 2004) proposed an approximate EMD based similarity measure for shape-based image retrieval. A weighted least squares based method is presented for shape matching in (Salberg, Hanssen and Scharf 2007). (Orrite and Herrero 2004) modified the Hausdorff distance for matching of partially occluded curves under projective transformation.

CHAPTER 3 INFORMATION THEORETIC LEARNING AND CORRENTROPY

3.1 Information Theoretic Learning

The central idea of Information Theoretic Learning, proposed by (Principe, Dongxin, et al. 2000), is to nonparametrically estimate the descriptors from information theory (entropy, divergence etc.) directly from the data, to substitute for the widely used conventional statistical measures of variance and covariance. Information Theoretic Learning (ITL) is used in the adaptation of linear or nonlinear filters and also in supervised or unsupervised machine learning applications.

3.1.1 Renyi's Entropy and Information Potential

Renyi's order-$\alpha$ entropy is defined as

$$H_\alpha(X) = \frac{1}{1-\alpha} \log \int f_X^{\alpha}(x)\, dx \tag{3-1}$$

The special case with $\alpha = 2$ is called Renyi's Quadratic Entropy. The estimator of Renyi's Quadratic Entropy forms the basis of Information Theoretic Learning.
It is defined over a random variable $X$ as

$$H_2(X) = -\log \int f_X^{2}(x)\, dx \tag{3-2}$$

The pdf of $X$ is estimated using Parzen density estimation (Parzen, 1962) and is given by

$$\hat{f}_X(x) = \frac{1}{N} \sum_{i=1}^{N} k_\sigma(x - x_i) \tag{3-3}$$

where $x_i$, $i = 1, 2, \ldots, N$ are $N$ samples from the random variable $X$. Popularly, a Gaussian kernel defined as

$$G_\sigma(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{x^2}{2\sigma^2}} \tag{3-4}$$

is used for density estimation. In the case of multidimensional density estimation, the use of joint or separable kernels of the type

$$G_\sigma(x) = \prod_{d=1}^{D} G_{\sigma_d}(x(d)) \tag{3-5}$$

where $D$ is the dimension of the vector $x$, is suggested. Substituting Eq. 3-3 in Eq. 3-2 with the Gaussian kernel, we get

$$H_2(X) = -\log \int \left( \frac{1}{N} \sum_{i=1}^{N} G_\sigma(x - x_i) \right)^{2} dx$$

$$H_2(X) = -\log \frac{1}{N^2} \int \sum_{i=1}^{N} \sum_{j=1}^{N} G_\sigma(x - x_i)\, G_\sigma(x - x_j)\, dx$$

$$H_2(X) = -\log \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} \int G_\sigma(x - x_i)\, G_\sigma(x - x_j)\, dx$$

$$H_2(X) = -\log \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} G_{\sigma\sqrt{2}}(x_j - x_i) \tag{3-6}$$

The last step uses the fact that the integral of the product of two Gaussians of size $\sigma$ centered at $x_i$ and $x_j$ is a Gaussian of size $\sigma\sqrt{2}$ evaluated at $x_j - x_i$. Hence

$$H_2(X) = -\log IP(X)$$

where

$$IP(X) = \frac{1}{N^2} \sum_{i=1}^{N} \sum_{j=1}^{N} G_{\sigma\sqrt{2}}(x_j - x_i) \tag{3-7}$$

Here Eq. 3-7 is called the Information Potential of $X$. It defines the state of the data particles or points in the input space and the interaction between the information particles. It has also been shown that the Parzen pdf estimate of $X$ is equivalent to the information potential field (Erdogmus and Principe 2002).

Another measure which can estimate the interaction between the points of two random variables in the input space is the cross-information potential (CIP) (Principe, Dongxin, et al. 2000). It is defined as an integral of the product of two probability density functions and characterizes the similarity between two random variables:

$$CIP(X, Y) = \int f(x)\, g(x)\, dx \tag{3-8}$$

It determines the effect of the potential created by $g$ at a particular location in the input space defined by $f$ (or vice versa), where $f$ and $g$ are the two probability density functions. Using Parzen density estimation again with the Gaussian kernel, we get

$$CIP(X, Y) = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \int G_{\sigma_f}(x - x_i)\, G_{\sigma_g}(x - y_j)\, dx \tag{3-9}$$

Proceeding as for the Information Potential, we get

$$CIP(X, Y) = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} G_\sigma(x_i - y_j) \tag{3-10}$$

with $\sigma^2 = \sigma_f^2 + \sigma_g^2$.

3.1.2 Divergence

The dissimilarity between two probability density functions can be quantified using a divergence measure. (Renyi 1961) proposed an order-$\alpha$ divergence measure for which the popular KL divergence is a special case. Renyi's divergence between the two pdfs $f$ and $g$ is defined as

$$D_\alpha(f, g) = \frac{1}{\alpha - 1} \log \int f(x) \left( \frac{f(x)}{g(x)} \right)^{\alpha - 1} dx \tag{3-11}$$

Again, by varying the value of $\alpha$, the user can obtain divergences of different orders. As a limiting case, when $\alpha \to 1$, $D_\alpha \to D_{KL}$, where $D_{KL}$ refers to the KL divergence and is defined as

$$D_{KL}(f, g) = \int f(x) \log \frac{f(x)}{g(x)}\, dx \tag{3-12}$$

Using an approach similar to that used for calculating Renyi's entropy, $D_{KL}$ can be estimated from the input samples by substituting the Parzen density estimates for the probability density functions. If $f$ and $g$ are two probability density functions of a random variable $X$, then

$$D_{KL}(f, g) = \int f(x) \log(f(x))\, dx - \int f(x) \log(g(x))\, dx$$

$$D_{KL}(f, g) = E_x[\log(f(x))] - E_x[\log(g(x))]$$

$$\hat{D}_{KL}(f, g) = E_x\!\left[ \log \frac{1}{N} \sum_{i=1}^{N} G_{\sigma_f}(x - x_i) \right] - E_x\!\left[ \log \frac{1}{N} \sum_{i=1}^{N} G_{\sigma_g}(x - x_i) \right] \tag{3-13}$$

Several other measures based on the information potential (IP) have been proposed, such as the Cauchy-Schwarz divergence and the quadratic divergence. They are not discussed here and interested readers can refer to (Principe, Information Theoretic Learning: Renyi's Entropy and Kernel Perspectives, 1st ed. 2010).

3.2 Correntropy and its Properties

Let $\{x_t, t \in T\}$ be a stochastic process where $T$ is an index set.
The generalized correlation function or autocorrentropy $V_X(s, t)$ is defined as a function from $T \times T$ to $\mathbb{R}$ (Liu, Pokharel and Principe 2007) as

$$V_X(s, t) = E[\kappa(x_s, x_t)]$$

A generalized similarity measure between two arbitrary random variables, cross-correntropy, is defined by (Liu et al., 2007) as

$$V_\sigma(X, Y) = E[\kappa_\sigma(X - Y)] \tag{3-14}$$

In the rest of this thesis, cross-correntropy is referred to simply as correntropy. If $\kappa$ is a kernel satisfying Mercer's Theorem (Vapnik, 1995), then it induces a Reproducing Kernel Hilbert Space (RKHS) denoted $H_\kappa$. Hence, correntropy can also be defined as the dot product of the two random variables in the feature space $H_\kappa$:

$$V(X, Y) = E[\langle \varphi(X), \varphi(Y) \rangle] \tag{3-15}$$

where $\varphi$ is a nonlinear mapping from the input space to the feature space such that $\kappa(X, Y) = \langle \varphi(X), \varphi(Y) \rangle$, with $\langle \cdot, \cdot \rangle$ denoting the inner product. Here we use the Gaussian kernel

$$G_\sigma(x, y) = \frac{1}{(\sqrt{2\pi}\,\sigma)^d}\, e^{-\frac{\|x - y\|^2}{2\sigma^2}} \tag{3-16}$$

where $d$ is the input dimension, since it is the kernel most widely used in the information theoretic learning (ITL) literature. In practice, the joint pdf is unknown and only a finite number of samples $\{(x_i, y_i)\}_{i=1}^{N}$ are available. Hence, correntropy is estimated as

$$\hat{V}_{N,\sigma}(X, Y) = \frac{1}{N}\sum_{i=1}^{N} \kappa_\sigma(x_i - y_i) \tag{3-17}$$

Some of the properties of correntropy that are important in this thesis are discussed here. A detailed analysis can be found in (Liu et al., 2007; Santamaria et al., 2006).

Property 1: Correntropy is a symmetric function, $V(X, Y) = V(Y, X)$.

Property 2: Correntropy is positive and bounded,

$$0 < V(X, Y) \le \frac{1}{\sqrt{2\pi}\,\sigma}$$

It reaches its maximum if and only if $X = Y$.

Property 3: Correntropy involves all the even order moments of $X - Y$. Using the series expansion of the Gaussian kernel, the correntropy function can be expanded as

$$V_\sigma(X, Y) = \frac{1}{\sqrt{2\pi}\,\sigma} \sum_{n=0}^{\infty} \frac{(-1)^n}{2^n\, n!}\, E\!\left[\frac{(X - Y)^{2n}}{\sigma^{2n}}\right] \tag{3-18}$$

which involves all the even order moments of $X - Y$. The term corresponding to $n = 1$ is proportional to $E[X^2] + E[Y^2] - 2E[XY]$. This shows that correntropy includes the information provided by the conventional cross-correlation function.
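
As an illustration of Eq. 3-16 and Eq. 3-17, the following is a minimal sketch of the sample estimator, assuming NumPy is available. The function names `gaussian_kernel` and `correntropy` and the toy data are our own choices for this sketch and do not come from the cited references. The example also shows how little a single gross outlier changes the estimate.

```python
import numpy as np

def gaussian_kernel(diff, sigma):
    """Gaussian kernel of Eq. 3-16 evaluated on difference vectors of shape (N, d)."""
    d = diff.shape[1]
    norm = (np.sqrt(2.0 * np.pi) * sigma) ** d
    sq_dist = np.sum(diff ** 2, axis=1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2)) / norm

def correntropy(x, y, sigma):
    """Sample estimate of Eq. 3-17 for paired samples (x_i, y_i)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = x.reshape(x.shape[0], -1)  # scalar samples become (N, 1) vectors
    y = y.reshape(y.shape[0], -1)
    return float(np.mean(gaussian_kernel(x - y, sigma)))

if __name__ == "__main__":
    sigma = 0.5
    x = np.array([0.0, 1.0, 2.0, 3.0])
    y_outlier = np.array([0.0, 1.0, 2.0, 50.0])  # last sample is a gross outlier
    print(correntropy(x, x, sigma))          # upper bound of Property 2
    print(correntropy(x, y_outlier, sigma))  # the outlier only removes its own term
```

With identical inputs the estimate equals the bound of Property 2, and swapping the arguments leaves the value unchanged, in line with Property 1.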
Property 4: Let $\{(x_i, y_i)\}_{i=1}^{N}$ be i.i.d. data drawn from the joint pdf $f_{XY}(x, y)$ and let $\hat{f}_{XY,\sigma}(x, y)$ be its Parzen density estimate with kernel bandwidth $\sigma$. The cross-correntropy estimate approaches, through Parzen density estimation, a measure of the joint pdf along the line $X = Y$:

$$\hat{V}_{\sigma\sqrt{2}}(X, Y) = \iint \hat{f}_{XY,\sigma}(x, y)\, \delta(x - y)\,dx\,dy \tag{3-19}$$

Now

$$\hat{f}_{XY,\sigma}(x, y) = \frac{1}{N}\sum_{i=1}^{N} G_\sigma(x - x_i)\, G_\sigma(y - y_i)$$

$$\iint \hat{f}_{XY,\sigma}(x, y)\, \delta(x - y)\,dx\,dy = \frac{1}{N}\sum_{i=1}^{N} \iint G_\sigma(x - x_i)\, G_\sigma(y - y_i)\, \delta(x - y)\,dx\,dy$$

$$= \frac{1}{N}\sum_{i=1}^{N} \int G_\sigma(x - x_i)\, G_\sigma(x - y_i)\,dx$$

$$= \frac{1}{N}\sum_{i=1}^{N} G_{\sigma\sqrt{2}}(x_i - y_i) = \hat{V}_{\sigma\sqrt{2}}(X, Y)$$

Thus correntropy takes its maximum value along the line $X = Y$, i.e. when the two random variables are similar, as shown in Figure 3-1.

Figure 3-1. Probabilistic interpretation of $V_{XY}$ (the maximum of each curve is normalized to 1 for visual convenience)

Property 5: Correntropy as an estimator induces a metric in the sample space called the Correntropy Induced Metric (CIM). Given two vectors $X = (x_1, x_2, \ldots, x_N)^T$ and $Y = (y_1, y_2, \ldots, y_N)^T$,

$$CIM(X, Y) = \left(\kappa(0) - V(X, Y)\right)^{1/2} \tag{3-20}$$

1. $CIM(X, Y) \ge 0$
2. $CIM(X, Y) = 0$ if and only if $X = Y$
3. $CIM(X, Y)$ is symmetric.
4. $CIM(X, Y)$ obeys the triangle inequality.

CIM induces a nonlinear similarity measure in the input space (Liu, Pokharel and Principe 2007). It has been observed that CIM behaves like an $L_2$ norm when the two vectors are close, like an $L_1$ norm outside the $L_2$ zone, and as the vectors move farther apart it becomes insensitive to the distance between them ($L_0$ norm). The extent of the space over which CIM acts as an $L_2$ or $L_0$ norm is directly related to the kernel bandwidth. This unique property of CIM localizes the similarity measure and is very helpful in rejecting outliers.

CHAPTER 4
CORRENTROPY FOR SHAPE-BASED MATCHING OF PARTIALLY OCCLUDED OBJECTS

4.1 Shape-based Matching of Partially Occluded Objects

For years, researchers have been working on different schemes for the problem of shape matching (Bhanu 1982) (Ansari and Delp 1990) (Dryden and Mardia 1992).
However, automated extraction of shape features is a problem when the objects are partially occluded (Ansari and Delp 1990) (Orrite and Herrero 2004) (Salberg, Hanssen and Scharf 2007). The problem arises mainly from missing or misplaced correspondences for the occluded portion of the object, or from the difficulty of extracting features under occlusion. Using correntropy as a similarity metric reduces the influence of the occluded portion. The idea is that the occlusion can be considered as a set of outliers and the shape boundary can be modeled as a heavy-tailed non-Gaussian distribution. It has been observed that correntropy has a strong outlier rejection capability in the input space (Liu, Pokharel and Principe 2007). Correntropy based matched filters have been shown to perform better for non-Gaussian signal processing (Pokharel, Agrawal and Principe 2005). In our methodology, we use the object boundary directly and hence there is no need to extract features from the shape contour. The extracted contours are then matched with the known shape templates.

4.2 Problem Model and Solution

We consider the shape matching problem as a classification problem. Consider the following M-ary classification problem. For $m = 0, 1, \ldots, M - 1$ the $m$-th hypothesis is

$$Y = A X_m + E \tag{4-1}$$

where $Y$ is a $K \times N$ measurement matrix and $A X_m$ is the measurement component due to the signal of interest when model or class $m$ is in force. $E$ is a zero mean additive noise matrix of size $K \times N$, $A$ is an unknown matrix of size $K \times P$ and $X_m$ is the $m$-th signal matrix of size $P \times N$. All the quantities involved may be complex. Hence, given the measurement matrix $Y$, the problem is to decide which of the $M$ signal components $A X_m$ is present in $Y$. The signal components $A X_m$ can be called subspace signals (Salberg, Hanssen and Scharf 2007). If some of the parameters are unknown, as is the case here, the Generalized Likelihood Ratio (GLR) is employed (Scharf and Friedlander 1994).
GLR replaces the unknown parameters with their Maximum Likelihood (ML) estimates. However, if the pdf of the noise matrix is not known, GLR based methods are hard to implement. ML estimates are also not the best estimates if the noise model deviates from Gaussianity, as is the case with our problem. In this chapter we show the effectiveness of correntropy as a robust detector for shape-based matching of partially occluded objects.

Figure 4-1 displays the set of 13 fish silhouettes downloaded from http://www.lems.brown.edu/vision/software/ which constitutes the template database. For each image, the boundary of the fish is extracted. Thus each column of $X_m$ in Eq. 4-1 holds the coordinates of a point on the boundary of the fish silhouette. $A$ is the scaling and rotation matrix of type

$$A = \begin{bmatrix} a & -b \\ b & a \end{bmatrix} \tag{4-2}$$

An occluded or distorted version of one of the images from the template database is presented to the detector. The occluded version is then matched to one of the images in the template database. In our case, 13 matching operations are performed. The optimal scaling, rotation and translation are estimated by maximizing the correntropy between the given shape boundary and the shape boundary of the image from the template database. The corresponding cost function is given by

$$J(A, d) = \frac{1}{N}\sum_{i=1}^{N} e^{-\frac{\|y_i - A x_i - d\|^2}{2\sigma^2}} \tag{4-3}$$

where $x_i$ and $y_i$ are the $i$-th columns of $X_m$ and $Y$ respectively, $A$ is the scaling and rotation matrix, $d$ is the translation vector and $N$ is the number of points in the shape boundary.

Figure 4-1. The fish template database (templates numbered 1 to 13)

Differentiating Eq. 4-3 with respect to $a$ and equating to zero gives the following fixed point update equation:

$$a = \frac{\sum_{i=1}^{N} e^{-\frac{\|y_i - A x_i - d\|^2}{2\sigma^2}} \left(x_{i1} y_{i1} + x_{i2} y_{i2} - x_{i1} d_1 - x_{i2} d_2\right)}{\sum_{i=1}^{N} e^{-\frac{\|y_i - A x_i - d\|^2}{2\sigma^2}} \left(x_{i1}^2 + x_{i2}^2\right)} \tag{4-4}$$

The update equation for $b$ is

$$b = \frac{\sum_{i=1}^{N} e^{-\frac{\|y_i - A x_i - d\|^2}{2\sigma^2}} \left(x_{i1} y_{i2} - x_{i2} y_{i1} - x_{i1} d_2 + x_{i2} d_1\right)}{\sum_{i=1}^{N} e^{-\frac{\|y_i - A x_i - d\|^2}{2\sigma^2}} \left(x_{i1}^2 + x_{i2}^2\right)} \tag{4-5}$$

where $x_{ik}$ and $y_{ik}$ are the $k$-th components of the vectors $x_i$ and $y_i$, and $d_k$ is the $k$-th component of $d$. Similarly, differentiating Eq. 4-3 with respect to the translation vector $d$ and equating the gradient to zero gives the following update rule:

$$d = \frac{\sum_{i=1}^{N} e^{-\frac{\|y_i - A x_i - d\|^2}{2\sigma^2}} \left(y_i - A x_i\right)}{\sum_{i=1}^{N} e^{-\frac{\|y_i - A x_i - d\|^2}{2\sigma^2}}} \tag{4-6}$$

After the optimization process for each of the templates is finished, the template corresponding to the largest value of the cost function is chosen as the matched shape.

The shape boundaries of the 13 fish images are extracted and stored. Since a shape is completely described by its boundary, the shape boundary can be extracted by any conventional method. We used a level set method for extracting the object contour. We also have to extract the shape boundary of the occluded silhouette. The points on the boundary should be ordered as in a time series. In addition, we have to extract a fixed number of points, equal to the number of points in the template boundaries. Note that these points have to be properly ordered, as the kernel function is evaluated for each point in the boundary against one point in the template. Hence we order the extracted points as follows, a procedure we call Nearest Point Ordering: for each point in the boundary, we pick the closest point on the template boundary as its corresponding point for evaluation of the kernel function. Thus for each shape template there is an ordered set of points $X_m$, $m = 1, 2, \ldots, M$, and a corresponding set of boundary points for the occluded image given by $Y$. $M$ is the number of templates or classes (here 13). The algorithm can be represented as follows:

Step 1: Center and scale the boundary points of the occluded shape.
Step 2: for template index $m = 1, 2, \ldots, M$
            for fixed point update iteration index $i$ = 1 to number of iterations
                do Nearest Point Ordering of $Y$ with respect to $X_m$
                update $A$ and $d$ using Eq. 4-4, Eq. 4-5 and Eq. 4-6
            end for loop for fixed point update
            compute $J_{opt}(m) = J(A, d)$
        end for loop for templates
Step 3: $m_{opt} = \arg\max_m J_{opt}(m)$. Choose the template $m_{opt}$ as the most likely shape for the given occluded shape.

4.3 Experimental Results

The inputs given to the detector were occluded versions of shape template 1 shown in Figure 4-1. The occluded versions were constructed by random rotation (within 30 to 330 degrees), random scaling between 1 and 2, and translation of two fishes of shape number 1. Only occluded silhouettes where the area of the non-occluded portion was between 50% and 80% of the original silhouette were considered. Figure 4-2 and Figure 4-3 show an occluded version of shape 1 and its shape boundary respectively.

Figure 4-2. Occluded fish template

Table 4-1. Recognition and misclassification rates (%)
Shape Number    Correntropy    LMedS
1               94             89
2 to 13         6              11

1000 simulations were performed by randomly generating occluded versions and classifying each one as one of the 13 known shape templates. Table 4-1 shows the recognition and misclassification rates for the proposed method along with the LMedS method using Row Weighted Residuals as reported in (Salberg, Hanssen and Scharf 2007). LMedS follows the same idea as our method and uses the shape boundary directly. It treats occlusion as outliers and formulates the problem of matching an occluded shape as subspace matching for classification based on a weighted least squared error. The idea is to use the median of the least squared errors for a number of subsets of contour points to determine a threshold. A weight of 0 or 1 is assigned to each point on the contour on the basis of this threshold value. A weight of 1 means that the error for the corresponding contour point is used to evaluate the final matching error, and a weight of 0 means that it is not. (Salberg, Hanssen and Scharf 2007) reported a classification accuracy of 63%.
We were able to do much better than that for two reasons. Firstly, Salberg et al. considered only the largest unoccluded portion of the occluded shape and discarded the rest of the shape, while we used the complete unoccluded shape. Secondly, we used the Nearest Point Ordering mechanism for selecting the correspondence between the points on the shape boundaries of the two shapes to be compared. It should also be noted that in (Salberg, Hanssen and Scharf 2007) the recognition rate of 63% was obtained when the ratio of the largest non-occluded part to the complete silhouette under consideration was between 70% and 90%. In our case we obtain much better recognition rates with even more occlusion.

Figure 4-3. Extracted boundary of the fish in Figure 4-2

As in all methods that use a Gaussian kernel, the kernel size is important. In our case, the kernel bandwidth determines what is considered close or distant. Points which are distant do not contribute significantly towards correntropy, while points which are close together return a much higher value. Figure 4-4 depicts the performance of the proposed method under different kernel sizes. Kernel sizes of 0.01 and of 0.05 to 0.5 in steps of 0.05 were tested. The number of iterations for the fixed point update was set to 5, 10 and 20. The best performance of 94% recognition rate was obtained for $\sigma = 0.05$ with 20 iterations. Figure 4-5 shows the shape recognition rate against the ratio of the non-occluded portion to the area of the silhouette for 20 iterations. A larger number of contour points contribute significantly towards correntropy as the kernel bandwidth is increased. As a result, the classification accuracy deteriorates with larger kernel bandwidth. A smaller kernel bandwidth implies a smaller space over which correntropy induces an $L_2$ norm and a larger outlier rejection space, which also resulted in a reduced classification accuracy.

Figure 4-4. Recognition rate for different kernel sizes

Figure 4-5. Recognition rate vs occlusion

CHAPTER 5
SHAPE EXTRACTION

5.1 Shape Extraction in Color Images

Image segmentation is one of the core problems in computer vision and related fields. Many approaches have been used to address the problem in the past, and they form the basis for recent developments. Some of the earlier attempts are Canny edge detection (Canny 1986), region growing (Adams and Bischof 1994), Mumford-Shah functional based global optimization (Mumford and Shah 1989) and graph theoretic approaches (Malcolm, Rathi and Tannenbaum 2007). Here, we use the shape boundary of the object directly for shape-based retrieval of color images. The shape of an object gives a better semantic representation of the content of the image than low level image descriptors dealing with color, texture, layout etc. Curve evolution based segmentation has been of great interest in recent times (Kass, Witkin and Terzopoulos 1987), (Malladi, Sethian and Vemuri 1995), (Xu and Prince 1998), (Caselles, Kimmel and Sapiro 1997), (Sethian 1999), (Chan and Vese 2001), (Ozertem, Erdogmus and B 2007). We use a curve evolution based algorithm for extraction of shape boundaries. This chapter discusses the progression of curve evolution or active contour based algorithms for image segmentation. The nonparametric statistical method formulated in the level set framework, which we use for extraction of shapes from color images, is described in detail in Section 5.2.

5.1.1 Kass's Snake

The basic idea of the active contour or Kass's Snake (Kass, Witkin and Terzopoulos 1987) is to evolve a curve, under constraints defined from the image, to detect objects in that image. Let $\Omega$ be a bounded open subset of $\mathbb{R}^2$ with $\partial\Omega$ its boundary. $i_0 : \bar{\Omega} \to \mathbb{R}$ is the image under consideration, with $C(s) : [0, 1] \to \mathbb{R}^2$ the parameterized curve under evolution. The idea is then to evolve the curve under smoothness and image constraints depending on the gradient of the image $i_0$ computed through an edge detector. The model can be defined as $\inf_C J_{snake}(C)$ where

$$J_{snake}(C) = \alpha \int_0^1 |C'(s)|^2\,ds + \beta \int_0^1 |C''(s)|^2\,ds + \lambda \int_0^1 g^2(\nabla i_0(C(s)))\,ds \tag{5-1}$$

where $\alpha$, $\beta$ and $\lambda$ are positive parameters which can be tuned locally on the contour. $g(\nabla i_0)$ is a general gradient based edge-detector function which can be defined as follows:

$$g(\nabla i_0) = \frac{1}{1 + |\nabla (G_\sigma * i_0)|^p}, \quad p \ge 1 \tag{5-2}$$

$G_\sigma * i_0$ is a smoother version of $i_0$ obtained through the convolution of the image $i_0$ with a Gaussian

$$G_\sigma(x, y) = \frac{1}{2\pi\sigma^2}\, e^{-\frac{x^2 + y^2}{2\sigma^2}}$$

The first and second terms in Eq. 5-1 impose a smoothness constraint, while the third term imposes image based constraints to ensure that the curve stops on the object boundary marked by the presence of a strong edge. The optimization problem in Eq. 5-1 can be reformulated as a force equation through the Euler-Lagrange method, given as

$$\alpha v_{ss}(s) - \beta v_{ssss}(s) + f(s) = 0 \tag{5-3}$$

The first two terms correspond to the spline's inherent smoothness constraints while the third term corresponds to the external image forces. $v(s) = (x(s), y(s))$ are the coordinates of points on the snake and $f(s)$ is the static external force field. $v_s(s)$ denotes the derivative of $v(s)$ with respect to $s \in [0, 1]$. The equations for iteratively updating the snake points are

$$x_{n+1} = x_n + \alpha (x_n)_{ss} - \beta (x_n)_{ssss} + f_x(x_n, y_n) \tag{5-4}$$

$$y_{n+1} = y_n + \alpha (y_n)_{ss} - \beta (y_n)_{ssss} + f_y(x_n, y_n) \tag{5-5}$$

where $(x_n, y_n)$ gives the coordinates of the snake at the $n$-th iteration.

Kass's Snake has several shortcomings. Firstly, the parameterization of the active contour changes as the curve evolves. Secondly, the snake is unable to cling to a concave boundary and does not allow for cusps or corners. The major shortcoming of a snake is the need to initialize it close to the object boundary. Multiple instances of snakes need to be initialized for capturing multiple objects.
(Cohen 1991) proposed a ballooning force to address the problem of initialization, but the inability to adhere to concave boundaries and the need for multiple initializations to capture more than one object still persist.

5.1.2 Gradient Vector Flow Snakes

(Xu and Prince 1998) proposed GVF Snakes to remove the need for close initialization of a snake and its inability to cling to concave boundaries. The method assumes that a general static field comprises a solenoidal as well as an irrotational component. The gradient vector field of external forces $f(x, y) = (u(x, y), v(x, y))$ is defined as the vector field that minimizes the energy $E_{GVF}(f(x, y))$ computed using

$$E_{GVF}(f(x, y)) = \iint \mu \left(u_x^2 + u_y^2 + v_x^2 + v_y^2\right) + |\nabla i_{0,edge}|^2\, |f - \nabla i_{0,edge}|^2\,dx\,dy \tag{5-6}$$

$\mu$ is a regularization parameter and $i_{0,edge}$ is the edge map of the image $i_0$. The first term in the integrand is the smoothness constraint, which imposes smooth variation on the gradient vector field. The second term imposes the image based constraint so that the field is stronger in magnitude near object boundaries. Note that $\nabla i_{0,edge}$ is zero or nearly zero in the homogeneous regions, and thus the energy defined in Eq. 5-6 is minimized there by a smoothly varying field which is nonzero in homogeneous regions, in contrast to the external energy potential derived from the image gradient. At edges, $f$ equal to $\nabla i_{0,edge}$ minimizes the energy functional. The Euler-Lagrange equations for minimizing the energy functional are

$$\mu \nabla^2 u - \left(u - I^{edge}_x(x, y)\right)\left(\left(I^{edge}_x(x, y)\right)^2 + \left(I^{edge}_y(x, y)\right)^2\right) = 0 \tag{5-7}$$

$$\mu \nabla^2 v - \left(v - I^{edge}_y(x, y)\right)\left(\left(I^{edge}_x(x, y)\right)^2 + \left(I^{edge}_y(x, y)\right)^2\right) = 0 \tag{5-8}$$

where $I^{edge}_x$ and $I^{edge}_y$ denote the partial derivatives of the edge map $i_{0,edge}$. The GVF snake solved the problems of a small capture region and of clinging to concave boundaries. However, noisy images present a problem for computing the gradient vector flow, and the need for multiple initializations when segmenting more than one object could not be alleviated.
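
To make Eq. 5-7 and Eq. 5-8 concrete, the sketch below relaxes them with a simple explicit gradient descent, assuming NumPy. It is an illustration under our own assumptions and not the implementation of (Xu and Prince 1998): the function names, the periodic border handling via `np.roll`, the unit grid spacing and the values of `mu`, `dt` and `n_iter` are all choices made for this example.

```python
import numpy as np

def laplacian(f):
    """Five-point Laplacian on a unit grid with periodic borders (a simplification)."""
    return (np.roll(f, 1, axis=0) + np.roll(f, -1, axis=0)
            + np.roll(f, 1, axis=1) + np.roll(f, -1, axis=1) - 4.0 * f)

def gradient_vector_flow(edge_map, mu=0.2, dt=0.5, n_iter=200):
    """Relax Eq. 5-7 and Eq. 5-8 by explicit gradient descent on (u, v)."""
    fy, fx = np.gradient(edge_map.astype(float))  # derivatives along rows, then columns
    mag2 = fx ** 2 + fy ** 2
    u, v = fx.copy(), fy.copy()  # start from the plain edge-map gradient
    for _ in range(n_iter):
        u = u + dt * (mu * laplacian(u) - (u - fx) * mag2)
        v = v + dt * (mu * laplacian(v) - (v - fy) * mag2)
    return u, v

if __name__ == "__main__":
    edge = np.zeros((32, 32))
    edge[:, 16] = 1.0  # a single vertical edge
    u, v = gradient_vector_flow(edge)
    # The plain gradient is zero at column 10; the diffused field is not,
    # and it points towards the edge from both sides.
    print(u[16, 10], u[16, 22])
```

In this toy example the raw edge-map gradient vanishes a few pixels away from the edge, while the relaxed field still points towards it, which is the enlarged capture range described above.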
5.1.3 Level Set based Active Contour

In the level set based active contour algorithm (Malladi, Sethian and Vemuri 1995) (Sethian 1999), the curve is deformed by means of a velocity which contains two terms. One term is based on the regularity of the curve and the other attracts the curve towards the object boundary. This method avoids the Lagrangian formulation and adopts a geometric flow based formulation using partial differential equations. The geometric flow is derived through mean curvature motion of the curve: the curve travels along the normal at a speed proportional to the curvature. The curve is embedded in a higher dimensional surface and is represented implicitly through a Lipschitz function $\phi(x, y)$ as

$$C = \{(x, y) \mid \phi(x, y) = 0\} \tag{5-9}$$

The evolution of the curve at time $t$ is given by the zero level of $\phi(t, x, y)$. Starting with an initial contour represented by the zero level of $\phi_0(x, y) = \phi(0, x, y)$, the curve is deformed using the update equation

$$\phi_t = F\,|\nabla\phi| \tag{5-10}$$

$\phi_t$ is the derivative of $\phi$ with respect to time. $F = \mathrm{div}\!\left(\frac{\nabla\phi}{|\nabla\phi|}\right)$ is the curvature of the evolving contour. The contour update equation can be expanded as

$$\phi_t = (F_A + F_G)\,|\nabla\phi|$$

The curve deforms uniformly under the effect of the advection term $F_A$, which is analogous to the inflation force defined in (Cohen 1991). $F_G$ on the other hand is based on the local properties of the evolving contour. (Malladi, Sethian and Vemuri 1995) presented two variations of the expanded update equation. The first is

$$\phi_t = (F_A + F_I)\,|\nabla\phi| \quad \text{where} \quad F_I = \frac{-F_A}{M_1 - M_2}\left(|\nabla G_\sigma * i_0| - M_2\right) \tag{5-11}$$

$M_1$ and $M_2$ are the maximum and minimum values of the image gradient magnitude $|\nabla G_\sigma * i_0|$. The second is

$$\phi_t = k_I\,(F_A + F_G)\,|\nabla\phi| \quad \text{where} \quad k_I = \frac{1}{1 + |\nabla G_\sigma * i_0|}, \quad F_G = \mathrm{div}\!\left(\frac{\nabla\phi}{|\nabla\phi|}\right) \tag{5-12}$$

The level set method alleviated the major drawbacks of classical snakes. It allows for cusps and corners, clings to concave boundaries and increases the capture range. Its major advantage, however, is that several objects can be detected simultaneously without the need for multiple initializations, and that it seamlessly segments objects with holes. The computations are done on a fixed rectangular grid.
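
The curvature term and the update of Eq. 5-10 can be sketched on a fixed grid as follows, assuming NumPy. This is a minimal illustration under our own assumptions rather than the scheme of (Malladi, Sethian and Vemuri 1995): we use plain central differences, an explicit Euler step, and a level set function that is negative inside the contour. The function names, the small constant `eps` and the step size are choices made for this example.

```python
import numpy as np

def curvature(phi, eps=1e-8):
    """Curvature div(grad(phi) / |grad(phi)|) computed with central differences."""
    phi_y, phi_x = np.gradient(phi)  # axis 0 is y (rows), axis 1 is x (columns)
    norm = np.sqrt(phi_x ** 2 + phi_y ** 2) + eps  # eps avoids division by zero
    nx, ny = phi_x / norm, phi_y / norm
    return np.gradient(nx, axis=1) + np.gradient(ny, axis=0)

def evolve(phi, dt=0.2, n_iter=20):
    """Explicit Euler steps of Eq. 5-10 with F equal to the curvature."""
    phi = phi.astype(float).copy()
    for _ in range(n_iter):
        phi_y, phi_x = np.gradient(phi)
        grad_mag = np.sqrt(phi_x ** 2 + phi_y ** 2)
        phi = phi + dt * curvature(phi) * grad_mag
    return phi

def circle_level_set(size=64, radius=10.0):
    """Signed distance to a centred circle: negative inside (assumed convention)."""
    yy, xx = np.mgrid[0:size, 0:size]
    return np.sqrt((xx - size // 2) ** 2 + (yy - size // 2) ** 2) - radius

if __name__ == "__main__":
    phi0 = circle_level_set()
    print(curvature(phi0)[32, 42])  # close to 1 / radius on the contour
    phi1 = evolve(phi0)
    print(np.sum(phi0 < 0), np.sum(phi1 < 0))  # enclosed pixel count before and after
```

For a circle the computed curvature on the contour is close to the reciprocal of the radius, and under pure curvature motion the enclosed region shrinks. In a segmentation setting the advection and image dependent terms of Eq. 5-11 or Eq. 5-12 would be added to the speed.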
5.1.4 Geodesic Active Contour

Geodesic Active Contours (Caselles, Kimmel and Sapiro 1997) improved the classical level set based method, allowing for stable boundary detection under larger variance in gradient magnitude and even in the presence of gaps. (Caselles, Kimmel and Sapiro 1997) formally presented results on existence, uniqueness, stability and correctness for contour propagation. The work connected classical snakes and the level set based formulation. The geodesic model is given by

$$J_{GAC}(C) = \int_0^1 |C'(s)|\; g(\nabla i_0(C(s)))\,ds \tag{5-13}$$

This is a problem of geodesic computation in a Riemannian space according to a metric induced by the image $i_0$. The solution is obtained by minimizing the energy functional in Eq. 5-13, which gives the minimal length path in the induced metric. The geodesic active contour model in the level set framework is given by

$$\phi_t = g(\nabla i_0)\,|\nabla\phi| \left(\mathrm{div}\!\left(\frac{\nabla\phi}{|\nabla\phi|}\right) + \nu\right), \quad \nu \ge 0 \text{ is a constant} \tag{5-14}$$

5.1.5 Region based Active Contour

All the classical snakes and active contour models rely on the edge detector $g$, which depends upon the gradient of the image, as a stopping criterion. As a result these models can only detect objects whose boundaries are defined by the image gradient. Discretization of the grid and of the gradient computations implies that the gradient is bounded. As a result the stopping function might never become zero and the contour might simply pass over the object boundary. In short, these models are only as good as the edge detector involved. Another drawback arises from the fact that the Gaussian smoothing needs to be stronger if the image is noisier, which might reduce the edge strength. To overcome these problems, (Chan and Vese 2001) presented an active contour model without stopping edge detector functions. Let $C$ be the deforming contour.
We define $inside(C) = \{(x, y) \mid \phi(x, y) > 0\}$ and $outside(C) = \{(x, y) \mid \phi(x, y) < 0\}$, where $C$ is the zero-level set of the Lipschitz function $\phi$. The contour deformation can be presented as a problem of minimizing the energy functional given by

$$J_{region}(c_1, c_2, C) = \mu \oint_C ds + \nu \iint_{inside(C)} dx\,dy + \lambda_1 \iint_{inside(C)} |i_0 - c_1|^2\,dx\,dy + \lambda_2 \iint_{outside(C)} |i_0 - c_2|^2\,dx\,dy \tag{5-15}$$

$\mu$, $\nu$, $\lambda_1$ and $\lambda_2$ are positive constants, and $c_1$ and $c_2$ are the average intensities of $i_0$ inside and outside $C$. The level set formulation of the above equation is given by

$$\phi_t = \delta(\phi) \left[\mu\, \mathrm{div}\!\left(\frac{\nabla\phi}{|\nabla\phi|}\right) - \nu - \lambda_1 (i_0 - c_1)^2 + \lambda_2 (i_0 - c_2)^2\right] \tag{5-16}$$

where $\delta(z) = \frac{dH(z)}{dz}$ and $H(z)$ is the Heaviside function.

5.2 Nonparametric KDE based Active Contour

The region based active contour improved on the classical snake model and on edge detector based level set methods. However, its basic assumption of modeling an image region by a constant intensity is not justified in general. There have been approaches in the literature where a parametric model was assumed for a region instead (Yezzi, Tsai and Willsky 1999) (Paragios and Deriche 2002). However, the performance of a parametric method can be severely affected if the assumed parametric model is incorrect. To solve this problem, (Kim, et al. 2005) proposed a nonparametric method formulated in the level set based curve evolution framework. The method uses kernel density estimation to estimate the probability distributions of the regions to be segmented. The geometric flow is obtained in an information theoretic framework via maximization of the mutual information between the pixel intensity and the region label at that pixel. An important aspect of this method is that it does not require training. We use the two region segmentation model for our implementation of shape extraction.

5.2.1 Approach

Let $R_1$ and $R_2$ be the unknown regions in the image, i.e. the true segmented regions. Let $p_1(x)$ and $p_2(x)$ be the probability distributions for the two regions, where $x$ is the pixel index. Let $R_{in}$ be the region inside the curve $C$ under evolution while $R_{out}$ is the region outside the curve.
$$P(i_0(x) \mid x \in R_1) = p_1(x), \qquad P(i_0(x) \mid x \in R_2) = p_2(x) \tag{5-17}$$

Figure 5-1. Image with object regions (A) The true segmented regions $R_1$ and $R_2$ and their associated density functions $p_1(x)$ and $p_2(x)$ (B) The evolving contour $C$, the region inside the curve $R_{in}$ with its associated density function $p_{in}$ and the region outside the curve $R_{out}$ with its associated density function $p_{out}$

Let $p_{in}$ and $p_{out}$ be the density functions associated with the regions $R_{in}$ and $R_{out}$ respectively. The idea is to evolve the curve $C$ so that $R_{in}$ converges to $R_1$ and $R_{out}$ to $R_2$, or vice versa. Pixels are assigned the label $L_{in}$ or $L_{out}$ depending on whether they lie in $R_{in}$ or $R_{out}$. Let $L \in \{L_{in}, L_{out}\}$. The mutual information is defined as

$$I(i_0; L) = H(i_0) - H(i_0 \mid L) \tag{5-18}$$

where $H(z) = -\int p(z) \log(p(z))\,dz$, so that

$$I(i_0; L) = H(i_0) - P(L = L_{in})\, H(i_0 \mid L = L_{in}) - P(L = L_{out})\, H(i_0 \mid L = L_{out}) \tag{5-19}$$

Now

$$P(i_0 \mid L = L_{in}) = \frac{|R_{in} \cap R_1|}{|R_{in}|}\, p_1(i_0) + \frac{|R_{in} \cap R_2|}{|R_{in}|}\, p_2(i_0) \tag{5-20}$$

and

$$P(i_0 \mid L = L_{out}) = \frac{|R_{out} \cap R_1|}{|R_{out}|}\, p_1(i_0) + \frac{|R_{out} \cap R_2|}{|R_{out}|}\, p_2(i_0) \tag{5-21}$$

Since the unknowns $R_1$ and $R_2$ appear in Eq. 5-20 and Eq. 5-21, and hence in the computation of the mutual information, the mutual information has to be estimated. Thus the energy functional to be minimized for evolving the active contour is defined as

$$E(C) = -|\Omega|\, \hat{I}(i_0; L) + \alpha \oint_C ds \tag{5-22}$$

The second term is a regularizing constraint to minimize the contour length. The update equation for the contour is given by

$$\frac{\partial C}{\partial t} = \left[\log \frac{\hat{p}_{in}(i_0(C))}{\hat{p}_{out}(i_0(C))} + \frac{1}{|R_{in}|} \int_{R_{in}} \frac{G_\sigma(i_0(x) - i_0(C))}{\hat{p}_{in}(i_0(x))}\,dx - \frac{1}{|R_{out}|} \int_{R_{out}} \frac{G_\sigma(i_0(x) - i_0(C))}{\hat{p}_{out}(i_0(x))}\,dx\right] \vec{N} - \alpha \kappa \vec{N} \tag{5-23}$$

where $\alpha$ is a positive regularizing constant, $\kappa$ is the curvature of the contour and $\vec{N}$ is the normal to the contour. For a detailed derivation of the equations refer to (Kim, et al. 2005). As can be seen, the complexity of direct computation of the gradient flow at each step is $O(N^2)$, where $N$ is the number of pixels in the image. We use the improved fast Gauss transform (IFGT) (Yang, et al. 2003) to reduce the computational complexity.
Thus the level set update equation is given by

$$\phi_t(y) = |\nabla\phi(y)| \left[\log \frac{\hat{p}_{in}(i_0(y))}{\hat{p}_{out}(i_0(y))} + \frac{1}{|R_{in}|} \int_{x \in R_{in}} \frac{G_\sigma(i_0(x) - i_0(y))}{\hat{p}_{in}(i_0(x))}\,dx - \frac{1}{|R_{out}|} \int_{x \in R_{out}} \frac{G_\sigma(i_0(x) - i_0(y))}{\hat{p}_{out}(i_0(x))}\,dx - \alpha \kappa(y)\right] \tag{5-24}$$

Refer to Appendix B for the derivation of Eq. 5-24 as used in the implementation.

5.2.2 Segmentation Results

We use the nonparametric active contour based algorithm for extracting shapes for shape-based retrieval of color images. Figure 5-2 (A) shows the extracted shape boundary superimposed on the image for a partially occluded object in a black and white image. Figure 5-2 (B) shows the extracted contour for a simple color image.

Figure 5-2. Segmentation results using Nonparametric KDE based Active Contour (A) for a black and white image (B) for a simple color image.

Figure 5-3. Segmentation result for (A) Image of an Ibis (B) Extracted contour for Figure 5-3 (A) scaled down to 1/4th its size

Figure 5-4. Segmentation result for (A) Image of a Crab with complex background (B) Extracted contour for Figure 5-4 (A) scaled down to 1/4th its size

Figures 5-3 (B), 5-4 (B) and 5-5 (B) show the segmentation results for the images in Figures 5-3 (A), 5-4 (A) and 5-5 (A). The extracted contours are marked in red. The images were scaled down to 1/4th of their size to reduce the computation time. The kernel bandwidth was set to ten times the standard deviation of the image density. The values of the regularizing constant $\alpha$ and the time step $dt$ were experimentally set to 0.01 and 10 respectively. The level set was updated for 10 iterations for the black and white images and for 50 iterations for the color images.

Figure 5-5. Segmentation result for (A) Image of a Crab with grainy background (B) Extracted contour for Figure 5-5 (A) scaled down to 1/4th its size

CHAPTER 6
CORRENTROPY FOR SHAPE-BASED RETRIEVAL OF COLOR IMAGES

Shape-based retrieval of images has been an active research topic for quite a few years now.
(Veltkamp and Tanase, 2002) provides a good survey of existing image retrieval systems in use. Most of the shape-based image retrieval methodologies in the literature are for black and white images, which provide a perfect shape contour (Ansari and Delp 1990), (Jain and Vailaya 1996), (Latecki and Lakämper 2000), (Grauman and Darrell 2004). Almost all the methods existing in the literature compute several shape descriptors from the shape boundary. These descriptors are used hierarchically to provide a good retrieval result. Several region based methods also exist (Xu, Saber and Tekalp 2001) which use a graph or tree based algorithm for shape-based image retrieval. SQUID (http://www.ee.surrey.ac.uk/CVSSP/demos/css/demo.html) is a shape-based image retrieval system which tries to match shape contours. Firstly, it filters out images through 3 global shape parameters: eccentricity, circularity and aspect ratio of the curvature scale space (CSS). The final retrieval result is obtained via a descriptor obtained from CSS over the images retrieved in the first pass. MPEG-7, the ISO/IEC standard for multimedia retrieval, includes 3 shape descriptors. However, these descriptors are designed on the assumption that a perfect segmentation is available. For retrieval of color images, mostly image features which measure the spatial coherency of color or texture within the image have been employed. In this chapter we present a novel color image retrieval system formulated in the information theoretic framework. The system compares the shape boundaries of the objects directly. We will show the effectiveness of correntropy as a shape similarity measure for image retrieval.

6.1 Problem Model and Solution

A shape-based image retrieval system has two components: (i) shape (feature) extraction and (ii) shape (feature) matching. We do not need to compute any shape descriptor from the shape boundary and use the shape contour directly for matching.
The boundary of the object describes the shape completely and can be extracted using any conventional image segmentation or shape extraction method. We use the nonparametric active contour based method discussed in Section 5.2. The reason for using an active contour based method formulated in the level set framework is that it gives a single shape boundary for a disconnected object or an object with holes. This reduces the bookkeeping for storing the s hape and results in an effective and compact description of the shape. On the other hand, if a conv entional feature based image segmentation method is used, there is a need to ensure that disconnected regions of the same object are combined (figurativel y) before extracting the shape boundary. Also, any feature can also be included within the active contour based methodology. The shapes of the objects are extract ed and stored in a database. Given a query image, shape boundary of the object in the q uery is extracted and compared with the contours stored in the database. Correntropy is used as the similarity measure. The top 10 images with largest correntropy value are re turned as the retrieval result. The shape matching algorithm explained in Chapter 4 is used for comparing shape boundaries. We have already proved the effectiveness of corr entropy as a shape similarity measure for matching partially occluded objects in Section 43. A database of 100 color images, comprising of 10 categories with 10 images in each category, is constructed to test the algorithm. The 10 categor ies are.(a) Airplane, (b) Butterfly, (c) Crab, (d) Crayfish, (e) Bird, (f) Flower, (g) Motorbike, (h) Revolver, (i) 53 PAGE 54 Seahorse, and (j) Starfish. The images were selectively download ed from the Caltech 101 database website ( http://www.vision.caltech. edu/Image_Datasets/Caltech101/ ). Figure 61 shows the represent ative images of each category from the image database. A B C D E F Figure 61. 
Representatives of the 10 categories in the image database [Reprinted with permission from L. Fei-Fei, R. Fergus and P. Perona. Learning generative visual models from few training examples: an incremental Bayesian approach tested on 101 object categories, IEEE CVPR 2004] (A) Airplane (B) Crayfish (C) Crab (D) Butterfly (E) Bird (F) Flower (G) Seahorse (H) Starfish (I) Motorbike (J) Revolver

6.2 Experimental Results

Each image in the database was used as a query image. The shapes of the objects in the images were extracted and stored in a database prior to the query-retrieval phase. Given a query image, the shape contour for that image was compared with the contours of all the images in the database. Hence, in our case, we have 100 comparisons for each query. The correntropy value is computed for each comparison. Images are then ranked in order of decreasing correntropy value, so the image with the maximum correntropy value is ranked 1st, and so on. The top 10 images with the largest correntropy values are retrieved and displayed as the retrieval result. The image retrieval system is compared with a similar system employing LMedS (Salberg, Hanssen and Scharf 2007) as the similarity measure. In the case of LMedS, images are ranked in order of increasing error, so the image with the least error is ranked 1st. The 10 images with the least error are displayed as the retrieval result. Table 6-1 shows the best retrieval accuracy for both correntropy and LMedS as a similarity measure for each category of images in the database. The kernel bandwidth was kept constant at 0.05 for the whole experiment.

Table 6-1.
Best Retrieval Accuracy for Correntropy and LMedS

Category     Correntropy   LMedS
Airplane     50%           50%
Butterfly    40%           40%
Crab         70%           60%
Crayfish     40%           40%
Bird         40%           50%
Flower       30%           30%
Motorbike    100%          100%
Revolver     70%           70%
Seahorse     50%           40%
Starfish     60%           60%

Figure 6-2 shows the average retrieval accuracy for correntropy and LMedS for the 10 image categories.

Figure 6-2. Average Retrieval Accuracy for Correntropy and LMedS

As can be inferred from both Table 6-1 and Figure 6-2, the best performance of correntropy and of LMedS as similarity measures is very similar. However, the average accuracy for correntropy is much better. It should be noted that retrieval accuracy is shown for only one value of the kernel bandwidth.

Figure 6-3. Retrieval Result for [Reprinted with permission from L. Fei-Fei, R. Fergus and P. Perona. Learning generative visual models from few training examples: an incremental Bayesian approach tested on 101 object categories, IEEE CVPR 2004] (A) Correntropy and (B) LMedS. The first image is the query image. Retrieval ranks are displayed at the top of each image.

An image retrieval system should not only retrieve the images similar to the query image but also rank them higher than less similar images (if retrieved). One standout feature of correntropy during this experiment was its consistency in retrieving similar images. The similar images were ranked higher (better, as they should be) than with LMedS. Evidence of this can be seen in the retrieval results for both methodologies shown in Figure 6-3. The first image is the query image, which is also ranked 1. Figure 6-4 shows the retrieval result for the Crab category.

Figure 6-4. Retrieval Result for Crab Category [Reprinted with permission from L. Fei-Fei, R. Fergus and P. Perona. Learning generative visual models from few training examples: an incremental Bayesian approach tested on 101 object categories, IEEE CVPR 2004] for (A) Correntropy (B) LMedS.
Retrieval ranks are displayed at the top of each image.

One important point to note is that the results reported in this thesis are for the complete image retrieval system and not just for shape matching. We extract the shapes from the images and do not use the ground truth segmentation for shape matching. As can be seen from Figure 6-2, the retrieval results are appreciable overall. For the Motorbike category we obtained very good results (as good as 100% for some query images), which are a consequence of very good shape extraction for those images. Most of the false positives encountered during this experiment were from two classes: (a) Crab and (b) Crayfish. This could be attributed to the relatively high number of contour points for these categories. Thus, for a given query shape, a part of the shape could always be found that fitted these false positive shapes really well (shape-wise). Kernel annealing could probably improve the performance of the system by filtering out the false positives.

CHAPTER 7 CONCLUSION

Kernel space methods have been used in several fields. We attempted to use such a methodology for the shape-based matching of partially occluded images and for shape-based retrieval of color images. The Correntropy Induced Metric (CIM) behaves as an L2 metric when the vectors are close in the input space and as an L1 metric outside the L2 region. As they go farther apart, correntropy becomes insensitive to the distance. The distance in the input space over which CIM induces an L2 norm is dictated by the kernel bandwidth. A larger kernel bandwidth implies a larger L2 induced space, while a smaller kernel bandwidth focuses more on closer points. The advantage of correntropy as a similarity measure thereby lies in its strong outlier rejection ability and the presence of higher order moments. These properties were successfully utilized for matching of partially occluded objects.
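The dependence of CIM on the size of the error can be checked numerically. The sketch below uses the definition CIM(x, y) = sqrt(k(0) - V(x, y)) from Liu, Pokharel and Principe (2007), where V is the sample correntropy. It assumes an unnormalized Gaussian kernel, so that k(0) = 1; the vectors and the bandwidth are hypothetical values chosen only to show the two regimes, and the helper names `cim` and `euclidean` are illustrative.

```python
import math


def cim(x, y, sigma):
    """Correntropy induced metric between two equal-length sequences.

    Assumption: unnormalized Gaussian kernel, so kernel(0) = 1 and
    CIM(x, y) = sqrt(1 - V(x, y)), with V the sample correntropy.
    """
    if not x or len(x) != len(y):
        raise ValueError("sequences must be non-empty and of equal length")
    v = sum(math.exp(-((a - b) ** 2) / (2.0 * sigma ** 2))
            for a, b in zip(x, y)) / len(x)
    # Guard against tiny negative values caused by rounding.
    return math.sqrt(max(0.0, 1.0 - v))


def euclidean(x, y):
    """Ordinary L2 distance, shown for comparison."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


reference = [0.0] * 10
small_error = [0.01] * 10             # every sample slightly off
one_outlier = [0.0] * 9 + [100.0]     # one grossly wrong sample
worse_outlier = [0.0] * 9 + [1000.0]  # the same sample, ten times worse

for name, vector in [("small_error", small_error),
                     ("one_outlier", one_outlier),
                     ("worse_outlier", worse_outlier)]:
    print(name, euclidean(reference, vector), cim(reference, vector, sigma=1.0))
```

For the small errors, CIM is close to the Euclidean distance scaled by 1 / (sigma * sqrt(2N)), which is the L2-like regime. For the two outlier vectors the Euclidean distance grows tenfold while CIM does not change, which is the saturation that gives correntropy its outlier rejection.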
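We were able to obtain classification accuracy as good as 94% even under a high degree of occlusion. We were also able to improve on the existing LMedS method.

Correntropy as a similarity measure for retrieval of similar images produced promising results. Retrieval accuracy of 100% could be obtained for several query images. It seems that with better shape extraction, the results could be further improved. Use of kernel annealing could also improve the results. Hierarchical shape matching, where false positives are first weeded out using standard shape descriptors before applying the correntropy based methodology, remains a promising avenue to be explored.

APPENDIX A UPDATE EQUATION FOR CORRENTROPY BASED SHAPE MATCHING

The update equations of the scaling and rotation matrix (Eq. 4-4, Eq. 4-5) and the translation matrix (Eq. 4-6) are derived here. The correntropy based cost function is defined in Eq. 4-3 as

$$J(A,d) = \frac{1}{N}\sum_{i=0}^{N-1} e^{-\frac{\|Ax_i - y_i + d\|^2}{2\sigma^2}} \tag{A-1}$$

We want to estimate the scaling and rotation matrix $A$ and the translation matrix $d$ by maximizing $J(A,d)$. Substituting the matrix $A$ from Eq. 4-2, and using $x_i = [x_{i1}\; x_{i2}]^T$, $y_i = [y_{i1}\; y_{i2}]^T$ and $d = [d_1\; d_2]^T$, one gets

$$J(A,d) = J(a,b,d) = \frac{1}{N}\sum_{i=0}^{N-1} \exp\left(-\frac{\left\|\begin{bmatrix} a & -b \\ b & a \end{bmatrix}[x_{i1}\; x_{i2}]^T - [y_{i1}\; y_{i2}]^T + [d_1\; d_2]^T\right\|^2}{2\sigma^2}\right)$$

$$J(a,b,d) = \frac{1}{N}\sum_{i=0}^{N-1} \exp\left(-\frac{\left\|[a x_{i1} - b x_{i2} - y_{i1} + d_1 \;\;\; b x_{i1} + a x_{i2} - y_{i2} + d_2]^T\right\|^2}{2\sigma^2}\right)$$

$$J(a,b,d) = \frac{1}{N}\sum_{i=0}^{N-1} \exp\left(-\frac{(a x_{i1} - b x_{i2} - y_{i1} + d_1)^2 + (b x_{i1} + a x_{i2} - y_{i2} + d_2)^2}{2\sigma^2}\right) \tag{A-2}$$

Differentiating Eq. A-2 with respect to $a$ and equating to zero, we obtain

$$\frac{\partial J}{\partial a} = -\frac{1}{N\sigma^2}\sum_{i=0}^{N-1} e^{-\frac{(a x_{i1} - b x_{i2} - y_{i1} + d_1)^2 + (b x_{i1} + a x_{i2} - y_{i2} + d_2)^2}{2\sigma^2}} \left[(a x_{i1} - b x_{i2} - y_{i1} + d_1)\,x_{i1} + (b x_{i1} + a x_{i2} - y_{i2} + d_2)\,x_{i2}\right] = 0$$

$$a\sum_{i=0}^{N-1} e^{-\frac{\|Ax_i - y_i + d\|^2}{2\sigma^2}}\left(x_{i1}^2 + x_{i2}^2\right) = \sum_{i=0}^{N-1} e^{-\frac{\|Ax_i - y_i + d\|^2}{2\sigma^2}}\left(x_{i1}y_{i1} + x_{i2}y_{i2} - x_{i1}d_1 - x_{i2}d_2\right)$$

$$a = \frac{\sum_{i=0}^{N-1} e^{-\frac{\|Ax_i - y_i + d\|^2}{2\sigma^2}}\left(x_{i1}y_{i1} + x_{i2}y_{i2} - x_{i1}d_1 - x_{i2}d_2\right)}{\sum_{i=0}^{N-1} e^{-\frac{\|Ax_i - y_i + d\|^2}{2\sigma^2}}\left(x_{i1}^2 + x_{i2}^2\right)}$$

thereby giving the fixed point update equation

$$a = \frac{\sum_{i=0}^{N-1} e^{-\frac{\|Ax_i - y_i + d\|^2}{2\sigma^2}}\left(x_{i1}y_{i1} + x_{i2}y_{i2} - x_{i1}d_1 - x_{i2}d_2\right)}{\sum_{i=0}^{N-1} e^{-\frac{\|Ax_i - y_i + d\|^2}{2\sigma^2}}\,\|x_i\|^2} \tag{A-3}$$

Similarly, differentiating Eq.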
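A-2 with respect to $b$ and equating to zero, we get

$$\frac{\partial J}{\partial b} = -\frac{1}{N\sigma^2}\sum_{i=0}^{N-1} e^{-\frac{(a x_{i1} - b x_{i2} - y_{i1} + d_1)^2 + (b x_{i1} + a x_{i2} - y_{i2} + d_2)^2}{2\sigma^2}} \left[-(a x_{i1} - b x_{i2} - y_{i1} + d_1)\,x_{i2} + (b x_{i1} + a x_{i2} - y_{i2} + d_2)\,x_{i1}\right] = 0$$

$$b\sum_{i=0}^{N-1} e^{-\frac{\|Ax_i - y_i + d\|^2}{2\sigma^2}}\left(x_{i1}^2 + x_{i2}^2\right) = \sum_{i=0}^{N-1} e^{-\frac{\|Ax_i - y_i + d\|^2}{2\sigma^2}}\left(x_{i1}y_{i2} - x_{i2}y_{i1} - x_{i1}d_2 + x_{i2}d_1\right)$$

giving

$$b = \frac{\sum_{i=0}^{N-1} e^{-\frac{\|Ax_i - y_i + d\|^2}{2\sigma^2}}\left(x_{i1}y_{i2} - x_{i2}y_{i1} - x_{i1}d_2 + x_{i2}d_1\right)}{\sum_{i=0}^{N-1} e^{-\frac{\|Ax_i - y_i + d\|^2}{2\sigma^2}}\,\|x_i\|^2} \tag{A-4}$$

Differentiating Eq. A-1 with respect to $d$ and equating to zero gives

$$\frac{\partial J}{\partial d} = -\frac{1}{N\sigma^2}\sum_{i=0}^{N-1} e^{-\frac{\|Ax_i - y_i + d\|^2}{2\sigma^2}}\left(Ax_i - y_i + d\right) = 0$$

$$d = \frac{\sum_{i=0}^{N-1} e^{-\frac{\|Ax_i - y_i + d\|^2}{2\sigma^2}}\left(y_i - Ax_i\right)}{\sum_{i=0}^{N-1} e^{-\frac{\|Ax_i - y_i + d\|^2}{2\sigma^2}}} \tag{A-5}$$

APPENDIX B DERIVATION OF UPDATE EQUATION FOR LEVEL SET

This section derives the level set update Eq. 5-24. The interested reader can find the derivation of Eq. 5-23 in (Kim, et al. 2005). The active contour $C(t)$ is embedded in the level set $\phi(x,y,t)$ and at any given time $t$ is given by

$$C(t) = \{(x,y) \mid \phi(x,y,t) = 0\} \tag{B-1}$$

where $(x,y)$ is the pixel location. Differentiating Eq. B-1 with respect to $t$ one gets

$$\phi(x,y,t) = 0 \;\Rightarrow\; \frac{d\phi(x,y,t)}{dt} = 0 \tag{B-2}$$

Now using the chain rule we have

$$\frac{d\phi(x,y,t)}{dt} = \frac{\partial\phi(x,y,t)}{\partial t} + \frac{\partial\phi(x,y,t)}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial\phi(x,y,t)}{\partial y}\frac{\partial y}{\partial t} \tag{B-3}$$

$$= \frac{\partial\phi(x,y,t)}{\partial t} + \nabla\phi\cdot\left[\frac{\partial x}{\partial t}\;\; \frac{\partial y}{\partial t}\right]^T \tag{B-4}$$

Substituting this in Eq. B-2 and using the fact that $\frac{\partial C}{\partial t} = \left[\frac{\partial x}{\partial t}\;\; \frac{\partial y}{\partial t}\right]^T$, one obtains

$$\frac{d\phi(C,t)}{dt} = \frac{\partial\phi(C,t)}{\partial t} + \nabla\phi\cdot\frac{\partial C}{\partial t} = 0, \qquad \frac{\partial\phi(C,t)}{\partial t} = -\nabla\phi(C)\cdot\frac{\partial C}{\partial t} \tag{B-5}$$

Now $\frac{\partial C}{\partial t} = C_{t,N}N + C_{t,T}T$ and $N = -\frac{\nabla\phi}{|\nabla\phi|}$, where $N$ is the unit vector in the normal direction and $T$ in the tangent direction. $C_{t,N}$ is the component of $\frac{\partial C}{\partial t}$ in the normal direction and $C_{t,T}$ is the component of $\frac{\partial C}{\partial t}$ in the tangent direction. Substituting in Eq. B-5 gives

$$\frac{\partial\phi(C,t)}{\partial t} = |\nabla\phi(C)|\,C_{t,N} \tag{B-6}$$

Substituting $C_{t,N}$ from Eq. 5-23 and assuming continuity of the update, we obtain the level set update equation at pixel index $y$ in Eq. 5-24:

$$\phi_t(y) = \left[\log\frac{p_{in}(i_0(y))}{p_{out}(i_0(y))} + \frac{1}{|R_{in}|}\int_{x\in R_{in}}\frac{G(i_0(x) - i_0(y))}{p_{in}(i_0(x))}\,dx - \frac{1}{|R_{out}|}\int_{x\in R_{out}}\frac{G(i_0(x) - i_0(y))}{p_{out}(i_0(x))}\,dx\right]|\nabla\phi(y)|$$

LIST OF REFERENCES

Adams, R., and L. Bischof. "Seeded Region Growing." IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 16(6), 1994: 641-647.
Ansari, N, and E J Delp. "Partial Shape Recognition: A landmark-based approach."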
IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 12(5), 1990: 470-483.
Bhanu, Bir. "Shape Matching of Two Dimensional Occluded Objects." International Conference on Pattern Recognition. 1982. 742-744.
Bober, M. "MPEG-7 Visual Shape Descriptor." IEEE Transactions on Circuits and Systems for Video Technology, Vol. 11(6), 2001: 716-719.
Canny, J F. "A Computational Approach to Edge Detection." IEEE Transactions of Pattern Analysis and Machine Intelligence, Vol. 8(6), 1986.
Caselles, V, R Kimmel, and G Sapiro. "Geodesic Active Contours." International Journal of Computer Vision, Vol. 22(1), 1997: 66-79.
Chan, T, and L Vese. "Active Contours without Edges." IEEE Transactions on Image Processing, Vol. 10(2), 2001.
Cohen, L. "On Active Contour Models with Balloon." Journal of Computer Vision, Graphics and Image Processing, Vol. 53(2), 1991.
Datta, Ritendra, Dhiraj Joshi, Jia Li, and James G Wang. "Image Retrieval: Ideas, Influences, and Trends of the New Age." ACM Computing Surveys, Vol. 40(2), 2008: 5:1-60.
Dryden, I. L., and K. V. Mardia. "Size and Shape Analysis of Landmark Data." Biometrika, Vol. 79(1), 1992: 57-68.
Erdogmus, Deniz, and Jose C. Principe. "Generalized Information Potential Criterion for Adaptive System Training." IEEE Transactions on Neural Networks, Vol. 12(2), 2002: 1035-1044.
Furht, Borko. "Multimedia Systems: An Overview." IEEE Multimedia, Vol. 1(1), 1994: 47-59.
Grauman, Kristen, and Trevor Darrell. "Fast Contour Matching Using Approximate Earth Mover's Distance." IEEE Conference on Computer Vision and Pattern Recognition. 2004. 220-227.
Jain, Anil K, and Aditya Vailaya. "Image Retrieval Using Color and Shape." Pattern Recognition, Vol. 29(8), 1996: 1233-1244.
Kass, M, A Witkin, and D Terzopoulos. "Snakes: Active Contour Models." International Journal of Computer Vision, Vol. 1, 1987: 321-331.
Kim, J, J W Fisher III, A Yezzi, M Cetin, and A S Willsky. "A Nonparametric Statistical Method for Image Segmentation Using Information Theory and Curve Evolution." IEEE Transactions on Image Processing, Vol. 14(10), 2005: 1486-1502.
Latecki, Longin J., and Rolf Lakämper. "Shape Similarity Measure Based on Correspondence of Visual Parts." IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22(10), 2000: 1185-1190.
Lew, Michael S, Nicu Sebe, Chabane Djeraba, and Ramesh Jain. "Content Based Multimedia Information Retrieval: State of the Art and Challenges." ACM Transactions on Multimedia Computing, Communications, and Applications, Vol. 2(1), 2006: 1-19.
Li, Shan, Moon-Chuen Lee, and Donald Adjeroh. "Effective Invariant Features for Shape Based Image Retrieval." Journal of the American Society for Information Science and Technology, Vol. 56(7), 2005: 729-740.
Liu, Weifeng, Puskal Pokharel, and Jose C Principe. "Correntropy: Properties and Applications in Non-Gaussian Signal Processing." IEEE Transactions on Signal Processing, Vol. 55(11), 2007: 5286-5298.
Malcolm, J., Y. Rathi, and A. Tannenbaum. "A Graph Cut Approach to Image Segmentation in Tensor Space." IEEE Conference on Computer Vision and Pattern Recognition. 2007. 1-8.
Malladi, R, J Sethian, and B C Vemuri. "Shape Modeling with Front Propagation: A Level Set Approach." IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 17(1), 1995: 158-175.
Manjunath, B S, J R Ohm, V V Vasudevan, and A Yamada. "Color and Texture Descriptors." IEEE Transactions on Circuits and Systems for Video Technology, Vol. 11(6), 2001: 710-715.
Martinez, Jose M. MPEG-7 Overview. ISO/IEC JTC1/SC29/WG11N6828, 2004.
Mingqiang, Y., K. Kidiyo, and R. Joseph. "A Survey of Shape Feature Extraction Techniques." Pattern Recognition Techniques, 2008: 626-684.
Mumford, D., and J. Shah. "Optimal Approximations by Piecewise Smooth Functions and Associated Variational Problems." Communications on Pure and Applied Mathematics, Vol. 42(4), 1989: 577-685.
Orrite, Carlos, and J.
Elias Herrero. "Shape Matching of Partially Occluded Curves Invariant Under Projective Transformation." Computer Vision and Image Understanding, Vol. 93(1), 2004: 34-64.
Ozertem, U., D. Erdogmus, and B. "Nonparametric Snakes." IEEE Transactions on Image Processing, Vol. 16(9), 2007: 2361-2368.
Paragios, N., and R. Deriche. "Geodesic Active Regions and Level Set Methods for Supervised Texture Segmentation." International Journal on Computer Vision, Vol. 46(3), 2002: 223-247.
Parzen, Emanuel. "On Estimation of a Probability Density Function and Mode." The Annals of Mathematical Statistics, 1962: 1065-1076.
Pokharel, Puskal, Rati Agrawal, and Jose C Principe. "Correntropy based Matched Filtering." IEEE International Workshop on Machine Learning for Signal Processing. 2005. 148-155.
Principe, Jose C. Information Theoretic Learning: Renyi's Entropy and Kernel Perspectives, 1st ed. Springer Series in Information Sciences and Statistics, 2010.
Principe, Jose C, Xu Dongxin, Qun Zhao, and John W Fisher III. "Learning from Examples with Information Theoretic Criteria." Journal of VLSI and Signal Processing Systems, Vol. 26, 2000: 61-77.
Renyi, A. "On Measures of Information and Entropy." Proceedings of the 4th Berkeley Symposium on Mathematics, Statistics and Probability. University of California Press, 1961. 547-561.
Salberg, Arnt B., Alfred Hanssen, and Louis L. Scharf. "Robust Multidimensional Matched Subspace Classifiers Based on Weighted Least Squares." IEEE Transactions on Signal Processing, Vol. 55(3), Mar 2007: 873-880.
Santamaria, I, Puskal Pokharel, and Jose C Principe. "Generalized Correlation Function: Definition, Properties and Application to Blind Equalization." IEEE Transactions on Signal Processing, Vol. 54(6), 2006: 2187-2197.
Scharf, L. L., and B. Friedlander. "Matched Subspace Detectors." IEEE Transactions of Signal Processing, Vol. 42(8), 1994: 2146-2157.
Sethian, J.
Level Set Methods and Fast Marching Methods: Evolving Interfaces in Computational Geometry, Fluid Mechanics, Computer Vision, and Material Science. Cambridge University Press, 1999.
Smeulders, Arnold, Marcel Worring, Simone Santini, Amarnath Gupta, and Ramesh Jain. "Content Based Image Retrieval at the End of the Early Years." IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 22(12), 2000: 1349-1380.
Vapnik, Vladimir N. The Nature of Statistical Learning Theory. New York, USA: Springer-Verlag New York, 1995.
Veltkamp, Remco C. "Shape Matching Similarity Measures and Algorithms." International Conference on Shape Modeling & Applications. 2001. 188-197.
Veltkamp, Remco C., and Mirela Tanase. Content-Based Image Retrieval Systems: A Survey. Technical Report, Department of Computing Science, Utrecht University, 2002.
Xu, C, and P L Prince. "Snakes, Shapes and Gradient Vector Flow." IEEE Transactions on Image Processing, Vol. 7(3), 1998: 359-369.
Xu, Y, S Saber, and M Tekalp. "Shape Matching of Partially Occluded Objects for Image Retrieval Using Hierarchical Content Description." SPIE, Vol. 4315(86). 2001. 86-96.
Yang, C., R. Duraiswami, N. Gumerov, and L. Davis. "Improved Fast Gauss Transform and Efficient Kernel Density Estimation." IEEE Conference on Computer Vision. 2003. 464-471.
Yezzi, A., A. Tsai, and A. Willsky. "A Statistical Approach to Snakes for Bimodal and Trimodal Imagery." International Conference on Computer Vision. 1999. 898-903.

BIOGRAPHICAL SKETCH

Priyank D. Bagrecha was born in Sirohi, India in 1985. He obtained the degree of Bachelor of Technology from the Department of Electrical Engineering, Sardar Vallabhbhai National Institute of Technology, Surat, India (better known as NIT Surat) in 2007. He worked as a Project Linked Personnel at the Machine Intelligence Unit, Indian Statistical Institute, Kolkata, India on Color and Texture based Image Retrieval till July 2008.
He joined the Department of Electrical and Computer Engineering at the University of Florida in August 2008. In summer 2009 he did an internship in the TEXMEX research group at INRIA, Rennes, France, where he worked on a tree based nearest neighbor approach and a Random Forest based approach for indexing of images for content based image retrieval. He has been working with Dr. Jose C. Principe since 2009 and received his Master of Science degree in May 2011. His current interests include Computer Vision, Computer Graphics, Machine Learning and Mathematics.