Group Title: Department of Computer and Information Science and Engineering Technical Reports
Title: Shape L'Ane Rouge : sliding wavelets for indexing and retrieval
 Material Information
Title: Shape L'Ane Rouge : sliding wavelets for indexing and retrieval
Alternate Title: Department of Computer and Information Science and Engineering Technical Report
Physical Description: Book
Language: English
Creator: Peter, Adrian
Rangarajan, Anand
Ho, Jeffrey
Publisher: Department of Computer and Information Science and Engineering, University of Florida
Place of Publication: Gainesville, Fla.
Copyright Date: 2007
 Record Information
Bibliographic ID: UF00095658
Volume ID: VID00001
Source Institution: University of Florida
Holding Location: University of Florida
Rights Management: All rights reserved by the source institution and holding location.



Technical Report

Shape L'Ane Rouge: Sliding Wavelets for Indexing and Retrieval

Adrian Peter1, Anand Rangarajan2 and Jeffrey Ho2
1Dept. of ECE, 2Dept. of CISE, University of Florida, Gainesville, FL


Shape representation and retrieval of stored shape models
are becoming increasingly prominent in fields such
as medical imaging, molecular biology and remote sensing.
We present a novel framework that directly addresses the
necessity for a rich and compressible shape representation,
while simultaneously providing an accurate method to index
stored shapes. The core idea is to represent point-set shapes
as the square root of probability densities expanded in a
wavelet basis. We then use this representation to develop a
natural similarity metric that respects the geometry of these
probability distributions, i.e. under the wavelet expansion,
densities are points on a unit hypersphere and the distance
between densities is given by the separating arc length. The
process uses a linear assignment solver for non-rigid alignment
between densities prior to matching; this has the connotation
of "sliding" wavelet coefficients akin to the sliding
block puzzle L'Ane Rouge. We illustrate the utility of this
framework by matching shapes from the MPEG-7 data set
and provide comparisons to other similarity measures, such
as Euclidean distance shape distributions.

1. Introduction

Today's scientific (and non-scientific) community generates
information at a frantic pace, placing a paramount
emphasis on developing flexible and robust systems for
mining the data. Often, the goal is to classify test data,
cluster similar groups or discover the closest match to an
incoming query. The key enablers of these operations are
the similarity metrics used for querying the data [ ]. In
this paper we focus on similarity metrics for shape models
having applicability to a variety of disciplines, e.g. medical
imaging, remote sensing and robotics. Our framework in-
troduces a new shape representation and then uses the nat-
ural geometry arising from this representation to derive a
geodesic-distance, similarity metric.
The present effort is motivated by a recent wavelet density
estimation method [ ] that estimates $\sqrt{p(x)}$ and then
obtains a bona fide density as $(\sqrt{p(x)})^2$. This has several
advantages over estimating $p(x)$ directly, such as guaranteeing
non-negativity and imposing a simple constraint on
the wavelet coefficients. This new density estimator uses a
wavelet expansion of $\sqrt{p(x)}$, i.e.

$$\sqrt{p(x)} = \sum_{j_0,k} \alpha_{j_0,k}\,\phi_{j_0,k}(x) + \sum_{j \geq j_0,k} \beta_{j,k}\,\psi_{j,k}(x), \qquad (1)$$

where $\alpha_{j_0,k}$ and $\beta_{j,k}$ are coefficients for the father $\phi(x)$ and
mother $\psi(x)$ basis functions; the $j$-index represents the current
scale level and the $k$-index the integer translation value.
(Note: $\phi(x)$ and $\psi(x)$ are also referred to as the scaling and
wavelet functions respectively.) For numerical implementation,
the infinite expansion in (1) is truncated to some finite set
of scale levels and we must also select a starting scale level
$j_0$. The coefficients in (1) are estimated with a maximum
likelihood objective function which is minimized using a
modified Newton's method.
Expanding $\sqrt{p}$ in a wavelet basis serves as the
springboard to our development of an efficient similarity
metric between shapes. We will show that given point-set
shapes, we can use this density estimation method to represent
shapes as probability densities; a natural by-product
is that the densities visually resemble the shapes. (For the
purposes of this paper, we consider only two dimensional
shapes but the theory and algorithmic procedures readily
extend to higher dimensions.) All shapes in a given data
set can be similarly represented. This representation has
excellent properties like: (1) the multiscale wavelet coeffi-
cients of the densities can be thresholded [ ] to compress
the storage requirements (2) several different orthonormal
bases can be used to estimate the densities, thus enhancing
their descriptive capabilities and (3) the compact nature of
wavelets provides both spatial and frequency localization
enabling the densities to closely mimic shape features.
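
Property (1) can be illustrated with a minimal sketch. It assumes hard thresholding by magnitude and stores the coefficients in a dictionary keyed by a hypothetical (level, index) pair; the rescaling back to unit norm is an added assumption so that the thresholded coefficients still describe a bona fide density, and none of these names come from the report.

```python
import math

def threshold_coefficients(coeffs, tau):
    """Keep coefficients with magnitude >= tau and rescale to unit norm.

    coeffs maps a hypothetical (level, index) key to a coefficient value.
    The rescaling step is an assumption of this sketch.
    """
    kept = {key: value for key, value in coeffs.items() if abs(value) >= tau}
    norm = math.sqrt(sum(value * value for value in kept.values()))
    if norm == 0.0:
        raise ValueError("threshold removed every coefficient")
    return {key: value / norm for key, value in kept.items()}

# Five coefficients whose squares sum to one.
coeffs = {(0, 0): 0.8, (0, 1): 0.5, (1, 0): 0.3, (1, 1): 0.1, (1, 2): 0.1}
compressed = threshold_coefficients(coeffs, tau=0.2)
print(len(coeffs), "->", len(compressed), "stored coefficients")
```

Only the surviving entries need to be stored, which is where the compression comes from; how aggressively to threshold is left open here.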
Based on this representation, the intuition for the similarity
metric follows from considering the coefficients of
the probability density $\{\alpha_{j_0,k}, \beta_{j,k}\}$ as the coordinates
$c = [\alpha_{j_0,1}, \ldots, \alpha_{j_0,m}, \beta_{j_0,1}, \ldots, \beta_{j_1,m}]^T$ indexing the location of
a density on a unit hypersphere; then the distance between
two distributions $p_1$ and $p_2$, indexed by their coordinates $c_1$
and $c_2$ respectively, is given by

$$d(p_1, p_2) = \cos^{-1}(c_1^T c_2). \qquad (2)$$
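
As a small illustration of the arc-length distance in Eq. (2), the following sketch computes the angle between two unit coefficient vectors. The function name and the example vectors are made up for illustration; the clamping step is a numerical safeguard added here, not something the report discusses.

```python
import math

def geodesic_distance(c1, c2):
    """Arc length between two unit coefficient vectors, as in Eq. (2)."""
    if len(c1) != len(c2):
        raise ValueError("coefficient vectors must have the same length")
    dot = sum(a * b for a, b in zip(c1, c2))
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    dot = max(-1.0, min(1.0, dot))
    return math.acos(dot)

c_a = [1.0, 0.0, 0.0]
c_b = [0.0, 1.0, 0.0]
c_c = [math.sqrt(0.5), math.sqrt(0.5), 0.0]
print(geodesic_distance(c_a, c_a))  # identical densities
print(geodesic_distance(c_a, c_b))  # orthogonal coefficient vectors
print(geodesic_distance(c_a, c_c))  # partially overlapping densities
```

Because the distance only needs one inner product, the per-pair query cost is linear in the number of stored coefficients.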


(The unit hypersphere comes about from the constraints on
the coefficients in conjunction with a diagonal metric tensor
as discussed in Section 2.2.) We expand on these intuitive ideas to
develop a matching procedure that casts density matching as
a linear assignment problem. The linear assignment is used
to handle non-rigid differences between shapes. It "warps"
the densities while preserving their defining properties, e.g.
unit integrability and non-negativity. Since the densities
closely resemble the shapes, we are in effect warping the
shapes. It will be shown that this non-rigid alignment is
necessary to obtain more accurate recognition. When one
uses a Haar basis (box function) for the density expansion,
the permutation of the wavelet coefficients due to the linear
assignment visually looks like sliding blocks. Thus we have
informally branded this process as Shape L'Ane Rouge after
the French moniker for sliding block puzzles. Our method
has several benefits, including:

* Shapes are not limited by topological constraints (such
  as the need to represent shapes as closed curves),
  eliminating extra effort often spent in developing
  parametrizations or other preprocessing.

* All of the intensive computations happen offline, e.g.
  the density estimation.

* For querying, the similarity metric computation between
  source and target shape is fast, satisfying the
  requirements for demanding indexing and retrieval applications.

* Use of wavelet representations enables flexibility in
  compression and storage.

1.1. Related Work

Existing work in shape modeling and matching spans a
broad spectrum of representations and their corresponding
metrics. There are several recent surveys, e.g. [ ], that suc-
cinctly describe shape representations such as unstructured
point-sets or curves. They also detail the myriad of simi-
larity measures that provide a means by which to compare
shapes under a common representation. The advances made
by all these methods have been instrumental in enabling ro-
bust indexing and retrieval mechanisms.
Because we incorporate a linear assignment solver to
handle non-rigid deformations, our method is situated in
close proximity to techniques that use transportation and as-
signment problem formulations [ ] to obtain their distance
measures. One such measure is the Earth Mover's Distance
(EMD) [ ], a metric between general mass distributions of
objects. Given two distributions x and y, the goal becomes
to find a matrix $f_{ij}$ that establishes a flow between all features
$x_i$ and $y_j$ in $x$ and $y$. Feasible flows must satisfy row
sum, column sum and total sum constraints. Obtaining the

flow and subsequently the EMD is generally based on the
solution to the transportation problem [ ]. Hence, one of
the main differences between our approach and EMD is that
we solve a matching problem in contrast to the transporta-
tion problem. The EMD also requires one to decide on the
features as well as the appropriate weighting of each fea-
ture per object. For some applications these choices may
already be readily apparent, but for most this requires an
added level of effort and investigation. Our method sim-
ply works on the point sets that naturally arise either from
sampling or preprocessing.
Our present method also falls in the same paradigm as
a recently introduced shape analysis framework [ ] which
uses geodesic distances on the manifold of Gaussian mixture
models to establish a shape similarity metric. The authors
represent shapes as mixture models and use the Fisher-Rao
metric derived directly from the representation to obtain
intrinsic distances on the manifold of parametric mixtures.
Like their method, we also leverage the geometry that
results directly from the shape representation. However, it
is not feasible to use their metric for retrieval due to the
computational demand of solving for geodesic distances on
arbitrary, high-dimensional manifolds, a by-product of using
Gaussian mixtures. We, on the other hand, have a well
understood geometry with an easy to compute metric.
The remainder of this paper is organized as follows.
In the next section we provide detailed discussions of our
method-the representation of shapes as density functions
expanded in a wavelet basis, the geometry that arises from
this representation and the derivation of the similarity metric.
We then follow with experimental verification of our
method in Section 3. The indexing and retrieval accuracies
are tested on a shape database consisting of 1400 shapes
from the MPEG-7 Core Experiment CE-Shape-1 [ ]. Our
method is compared with another density-matching
technique for retrieval: D2 shape distributions [ ], for
which we compute four different similarity measures. We
also compare our results with published recognition rates of
other algorithms on the MPEG-7 data. The last section con-
cludes by summarizing our effort and proposing directions
for future work.

2. Shape L'Ane Rouge

Our similarity metric, the geodesic distance on a unit hy-
persphere, is obtained directly from our representation of
shapes as probability densities expanded in an orthonormal
wavelet basis. This shape representation is detailed next,
followed by a discussion on how this leads to the hyper-
sphere geometry for the distributions. Afterwards, we il-
lustrate the need for non-rigid alignment and how it can be
accomplished on the space of distributions through a linear
assignment formulation. It will turn out that the linear as-
signment process has to be regularized to improve matching


performance. To this end, we formulate a penalty term that
restricts large movements of the wavelet bases.

2.1. From Shapes to Wavelet Densities
The idea of representing shapes as densities is usually
brought to fruition in two ways: either the density is directly
estimated from the shape's discrete samples [ ] or
some other feature is first extracted from the shape and then
the density is fit to these features [ ]; our method falls
in line with the former. To our knowledge, this is the first
time a wavelet density estimator has been used to directly
represent shapes.
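Many of the issues of estimating a bona fide density can
be overcome by first estimating $\sqrt{p(x)}$ and then obtaining
the desired density as $(\sqrt{p(x)})^2$ [ ]. For two dimensional
densities the wavelet expansion of the square root of the
density is given by:

$$\sqrt{p(\mathbf{x})} = \sum_{j_0,\mathbf{k}} \alpha_{j_0,\mathbf{k}}\,\phi_{j_0,\mathbf{k}}(\mathbf{x}) + \sum_{j \geq j_0,\mathbf{k}}^{j_1} \sum_{w=1}^{3} \beta^{w}_{j,\mathbf{k}}\,\psi^{w}_{j,\mathbf{k}}(\mathbf{x}) \qquad (3)$$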

where $\mathbf{x} \in \mathbb{R}^2$, $j_1$ is some stopping scale level for the multiscale
decomposition and $\mathbf{k} = (k_1, k_2) \in \mathbb{Z}^2$ is a multi-index
that represents the spatial location of the basis. (The translation
range of $\mathbf{k}$ can be computed from the span of the data
and basis function support size.) The father and mother basis
functions are tensor product combinations of their one dimensional
counterparts, i.e.

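$$\begin{aligned}
\phi_{j_0,\mathbf{k}}(\mathbf{x}) &= 2^{j_0}\,\phi(2^{j_0}x_1 - k_1)\,\phi(2^{j_0}x_2 - k_2) \\
\psi^{1}_{j,\mathbf{k}}(\mathbf{x}) &= 2^{j}\,\phi(2^{j}x_1 - k_1)\,\psi(2^{j}x_2 - k_2) \\
\psi^{2}_{j,\mathbf{k}}(\mathbf{x}) &= 2^{j}\,\psi(2^{j}x_1 - k_1)\,\phi(2^{j}x_2 - k_2) \\
\psi^{3}_{j,\mathbf{k}}(\mathbf{x}) &= 2^{j}\,\psi(2^{j}x_1 - k_1)\,\psi(2^{j}x_2 - k_2)
\end{aligned} \qquad (4)$$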

The goal is to estimate the set of coefficients $\{\alpha_{j_0,\mathbf{k}}, \beta^{w}_{j,\mathbf{k}}\}$
and reconstruct the density using (3). An efficient maximum
likelihood method to estimate them, with fast convergence,
is discussed in [ ]. Due to the increased indexing notation
for two dimensional wavelet expansion, we will typically
resort to one dimensional arguments, as in Section 1,
with it being understood that all results directly translate
to two dimensions. Under a wavelet expansion of $\sqrt{p(x)}$,
the unit integrability requirement of all probability densities
translates to a constraint on the wavelet coefficients

$$\int \left(\sqrt{p(x)}\right)^2 dx = \sum_{j_0,k} \alpha_{j_0,k}^2 + \sum_{j \geq j_0,k} \beta_{j,k}^2 = 1. \qquad (5)$$
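
The unit-norm constraint on the coefficients can be seen in a simplified setting. The sketch below is not the maximum likelihood estimator used in this report; it is a histogram-style stand-in that uses only Haar father (scaling) functions in 2D, where a cell of side $2^{-j_0}$ holding a fraction $m$ of the points receives the coefficient $\sqrt{m}$. The function names and the sample points are hypothetical.

```python
import math
from collections import Counter

def haar_scaling_sqrt_density(points, j0=1):
    """Histogram-style stand-in: one Haar father coefficient per occupied cell.

    A cell of side 2**-j0 holding a fraction m of the points gets the
    coefficient sqrt(m), so the squared coefficients sum to one.
    """
    scale = 2 ** j0
    counts = Counter((math.floor(scale * x), math.floor(scale * y))
                     for x, y in points)
    n = len(points)
    return {k: math.sqrt(c / n) for k, c in counts.items()}

def density_at(coeffs, x, y, j0=1):
    """Evaluate p = (sum_k alpha_k * phi_k)^2 at a single point."""
    scale = 2 ** j0
    k = (math.floor(scale * x), math.floor(scale * y))
    # The 2D Haar father function equals 2**j0 on its own cell and 0
    # elsewhere, so at most one term of the expansion is non-zero.
    return (coeffs.get(k, 0.0) * scale) ** 2

points = [(0.1, 0.2), (0.3, 0.4), (0.6, 0.1), (1.2, 1.7)]
alpha = haar_scaling_sqrt_density(points, j0=1)
print(sum(a * a for a in alpha.values()))   # unit-norm constraint
print(density_at(alpha, 0.2, 0.2, j0=1))    # density inside an occupied cell
```

In this simplified case non-negativity and unit integrability hold by construction, which is the practical appeal of working with the square root of the density.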

Recall that we are using only orthonormal bases such as
Haar, Coiflets or Symlets. Figure 1 illustrates estimated
densities for four point set shapes, using a single level
wavelet decomposition. The points were extracted from the


Figure 1. Example wavelet densities estimated from point sets of MPEG-7 shapes.
Top row are point sets, cardinality from left to right: 4,948; 5,578; 7,773; 11,984. Second
row is a nadir view of the estimated densities using the following wavelet families
(from left to right): Haar ($j_0 = 2$), Coiflet-4 ($j_0 = 1$), Symlet-10 ($j_0 = 0$) and
Haar ($j_0 = 2$). Third row is the perspective view. Notice how the wavelet densities
accurately represent the shapes.

MPEG-7 binary image data set. Notice how the compact
nature of the bases does an excellent job in modeling the
shape features. In the overhead views, it is readily apparent
how closely the densities resemble the shapes. We feel this
direct visual association of the density and the shape pro-
vides a nice advantage over trying to extract features and
then fit the density to the features. Also, notice that shapes
exhibit a variety of topological properties like interior struc-
tures and disconnected components.

2.2. The Geometry of Wavelet Densities: Geodesic
Distances on the Hypersphere
Equation (5) showed that a natural by-product of working
with the square root of the density and then expanding it
in an orthonormal wavelet basis is a constraint on the
basis coefficients; namely, the sum of
squared coefficient values must equal one. This immediately
leads to the interpretation that the basis coefficients
(which are unique to a particular density since wavelets serve
as a true basis for the space of continuous distributions)
give the coordinates for a position on the unit hypersphere.
The ordering of the coefficients in the coordinate vector can
be taken in any arrangement but it must be consistent across
all densities. The dimensionality of the hypersphere is determined
by the cardinality of the set containing all the coefficients.
The hypersphere geometry of the densities can
be more rigorously justified when we analyze the $\sqrt{p(x \mid \Theta)}$
representation under the theoretical basis of information geometry
[ ]. In this context, the Fisher information matrix
(FIM) serves as the metric tensor on the manifold of a parametric
family of distributions. One of the algebraic forms
of the FIM is given by

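$$g_{uv} = 4 \int \frac{\partial \sqrt{p(x \mid \Theta)}}{\partial \theta^{u}}\, \frac{\partial \sqrt{p(x \mid \Theta)}}{\partial \theta^{v}}\, dx \qquad (6)$$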


Figure 2. Hypersphere of densities. Unit integrability for densities requires
$\sum_{j_0,k} \alpha_{j_0,k}^2 + \sum_{j \geq j_0,k} \beta_{j,k}^2 = 1$ and the Fisher information matrix is diagonalized
when $\sqrt{p}$ is expanded in an orthonormal basis. This places the shapes represented
by the densities on a unit hypersphere with coordinates given by the wavelet
coefficients. The above figure shows two densities, see coefficient superscript, on the
hypersphere; their geodesic distance is the angle between the unit vectors.

where $\Theta = \{\theta^1, \ldots, \theta^m\}$ denotes the parameters of the
distribution and $u$ and $v$ indicate the row and column index,
i.e. for a family with $m$ parameters the FIM is $m \times m$.
Under an orthonormal expansion of $\sqrt{p(x \mid \Theta)}$, Eq. (6) reduces
to $g_{uv} = 4\delta_{uv}$ (that is, $g = 4I$), which when taken in conjunction with
a constraint $\sum_i (\theta^i)^2 = 1$ gives the unit hypersphere geometry.
Such is the case in our framework where $\sqrt{p(x \mid \Theta)}$
has been expanded in an orthonormal wavelet basis with the
coefficients of the expansion serving as the parameters of
the density, i.e. $\Theta = \{\alpha_{j_0,k}, \beta_{j,k}\}$. Hence the orthonormal
wavelet bases serve as eigenfunctions diagonalizing certain
classes of parametric densities with a particular class defined
by regularity properties of the density which come
about from the wavelet's order of vanishing moments [ ].
Two shapes represented as wavelet densities end up as two
points on the hypersphere, see Figure 2. Since this is a unit
hypersphere with the wavelet coefficients for each shape
playing the role of two unit vectors, the angle between these
unit vectors immediately gives the geodesic distance between
the shapes.
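
The reduction of the Fisher information matrix to a multiple of the identity can be checked in a few lines. The following is a short sketch that writes the orthonormal basis functions generically as $\varphi_i$, a notational assumption made here to cover both father and mother wavelets with a single index.

```latex
\begin{align*}
\sqrt{p(x \mid \Theta)} &= \sum_{i=1}^{m} \theta^{i} \varphi_i(x),
  \qquad \int \varphi_u(x)\,\varphi_v(x)\,dx = \delta_{uv}, \\
\frac{\partial \sqrt{p(x \mid \Theta)}}{\partial \theta^{u}} &= \varphi_u(x), \\
g_{uv} &= 4 \int \varphi_u(x)\,\varphi_v(x)\,dx = 4\,\delta_{uv}, \\
\int p(x \mid \Theta)\,dx &= \sum_{u,v} \theta^{u}\theta^{v}\,\delta_{uv}
  = \sum_{i=1}^{m} (\theta^{i})^{2} = 1 .
\end{align*}
```

The last line is the unit-norm constraint, so a constant diagonal metric together with this constraint places the parameters on a sphere.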
It is also interesting to note that we can obtain this same
inner product interpretation by taking the approach of work-
ing with a similarity measure directly between the densities,
instead of analyzing the geometry implied by the coefficient
constraints and the metric tensor. In particular, using the
Hellinger divergence [ ] to calculate the distance between
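two densities $p_1$ and $p_2$ gives

$$D(p_1, p_2) = \int \left(\sqrt{p_1} - \sqrt{p_2}\right)^2 dx = 2 - 2\left[\sum_{j_0,k} \alpha^{(1)}_{j_0,k}\,\alpha^{(2)}_{j_0,k} + \sum_{j \geq j_0,k} \beta^{(1)}_{j,k}\,\beta^{(2)}_{j,k}\right] \qquad (7)$$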

where $\{\alpha^{(1)}, \beta^{(1)}\}$ and $\{\alpha^{(2)}, \beta^{(2)}\}$ are the wavelet parameters
of $p_1$ and $p_2$ respectively. Notice that we can factor
out a $-2$ and drop the constant without affecting the metric
qualities of the measure. This reduces (7) to an inner product
between the coefficients of the densities, hence giving
the same metric as the one we derived above by analyz-
ing the geometry of the space of distributions. There are
other notions of similarity measures between densities such
as the Kullback-Leibler divergence and Euclidean distance
but none of them operate on the square root of the density
and they also do not provide a closed-form expression for
the distance. We refer the reader to [ ] for a summary of
other distance measures between densities.
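
The link between the Hellinger divergence and the coefficient inner product can be checked numerically. The sketch below assumes a simplified 1D setting with Haar father functions only, so each square-root density is piecewise constant; the function names and the example masses are illustrative and not taken from the report.

```python
import math

def hellinger_from_coeffs(c1, c2):
    """Closed form: 2 - 2 <c1, c2> for unit-norm coefficient vectors."""
    return 2.0 - 2.0 * sum(a * b for a, b in zip(c1, c2))

def hellinger_by_integration(c1, c2, width):
    """Integrate (sqrt(p1) - sqrt(p2))^2 for piecewise-constant sqrt-densities.

    Each coefficient multiplies a 1D Haar father function that equals
    1/sqrt(width) on its own cell of the given width and 0 elsewhere.
    """
    total = 0.0
    for a, b in zip(c1, c2):
        root_p1 = a / math.sqrt(width)
        root_p2 = b / math.sqrt(width)
        total += width * (root_p1 - root_p2) ** 2
    return total

# Coefficients are square roots of the probability mass in each cell.
c1 = [math.sqrt(m) for m in (0.5, 0.3, 0.2, 0.0)]
c2 = [math.sqrt(m) for m in (0.1, 0.4, 0.4, 0.1)]
closed = hellinger_from_coeffs(c1, c2)
numeric = hellinger_by_integration(c1, c2, width=0.5)
# The inner product recovered from the divergence gives the arc length.
geodesic = math.acos(max(-1.0, min(1.0, 1.0 - closed / 2.0)))
print(closed, numeric, geodesic)
```

Both routes agree up to rounding, and the arc length is a monotone function of the divergence, so the two induce the same ranking of matches.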

2.3. Sliding Wavelets
If our analysis ended with the previous section, we would
be equipped with a very fast similarity metric. Given a pair
of point-set shapes we would merely estimate the coeffi-
cients for the wavelet, square-root densities of each shape
and then take their inner product to get a measure of their
closeness to each other. However, this approach is some-
what naive in that it does not leverage the full mathemat-
ical formalisms that relate one shape to another. Follow-
ing the Klein school of thought [ ], similarity between
shapes is often considered after quotienting out some trans-
formation group, typically the group of similarity transfor-
mations [ ]. The motivation for this is that without remov-
ing transformations we are not really analyzing effects that
are intrinsic to the shapes. Non-rigid transformations are
the most general, basically encompassing any continuous
transformation. Practically it is expected that most shapes
from the same category should differ by smaller non-rigid
warps compared to shapes from other arbitrary categories;
hence correcting for this prior to evaluating the similarity
metric should enhance its discriminability. In our frame-
work, we could incorporate non-rigid alignment in one of
two ways: perform non-rigid alignment of the point sets
prior to fitting the wavelet density or fit the density to the
data and then adjust for non-rigid deformations by warping
the densities. The former method usually involves adopt-
ing a spline based model to represent the non-rigid transfor-
mation [ ] and can involve iterative optimization to solve
for the spline parameters. Though these methods are able
to model a large class of non-rigid deformations, they do
not possess the computational efficiency needed for querying
systems. Our method takes the second option of warping
the densities, which we accomplish by locally translating
wavelet coefficients.
Figure 3. Local non-rigid effects and the need for linear assignment. (a) is the density
$p_1$ of the first shape, with only scaling coefficients, $c_1$, shown. (b)
is the second shape with density $p_2$ and coefficients $c_2$. Locally the
point sets only differed by a translation, which resulted in the densities differing by
a translation. Without linear assignment the coefficient vectors of these would give
an inner product of 0 and consequently a large geodesic distance on the hypersphere.
Linear assignment can correctly recover the local translation and then the geodesic
distance will be small, reflecting the true similarity between the shapes.

We now give a simple example to illustrate how warping
the densities by local translations can increase recognition.
Suppose two shapes have been affine aligned and there only
remains a non-rigid warp between the two. We model the
non-rigid deformation, in the infinitesimal, as local translations.
Figure 3 shows the estimated densities of two hypothetical
shapes, see (a) and (b). The coefficients for the basis
functions of each shape are indicated by a red bar. The
density function shown in (a) only differs by a translation
to density (b). Notice that if we were to stack the coefficients
in a vector (from bottom left to top right) for each
density and perform an inner product between them, the resulting
value would be zero. This leads to a high geodesic
distance, $\cos^{-1}(0) = \pi/2$. However, if we simply slide the
wavelet bases of one shape to correspond to locations on the
other, our inner product would then yield a very high correlation
indicating the true similarity between the shapes.
Also we must be careful that whatever mechanism we use
to translate the bases does not alter the values of their coefficients
and compromise the properties of a bona fide density,
i.e. (5) must hold to maintain unit integrability. The most
straightforward way to accommodate these objectives is to
reformulate our similarity metric under the action of a permutation
group on the ordering of the coefficients. These
specific requirements can be addressed within a linear assignment
construct [ ]; thus our deformation model can
be interpreted as a "sliding grammar" wherein we only allow
wavelets at each level $j$ to independently slide to get
a good match. The independent sliding assumption at each
level can be rigorously justified. The hypersphere nature
of the manifold (discussed in the previous section) implies
that the "probability density mass" corresponding to each
wavelet is independent of the rest. Consequently, this allows
us to independently slide each wavelet to get a best
match. While this justifies the independence assumption,
deformations more complex than sliding could
be considered, e.g. splitting coefficients. However, in this
paper, we restricted ourselves to only sliding the wavelets,
leaving more exotic rules for future research. Even though
each wavelet is allowed to slide, we cannot allow the sliding
wavelets to collide and end up at the same spatial location.
This imposes a permutation constraint on the sliding
wavelets and the resulting deformation picture evokes the
L'Ane Rouge puzzle, see Figure 4. Thus our new objective
to minimize becomes

$$D(p_1, p_2; \pi) = -2 + 2\left[\sum_{j_0,k} \alpha^{(1)}_{j_0,k}\,\alpha^{(2)}_{j_0,\pi(k)} + \sum_{j \geq j_0,k} \beta^{(1)}_{j,k}\,\beta^{(2)}_{j,\pi(k)}\right] \qquad (8)$$

where $\pi(k)$ is a permutation operator that takes as input the
wavelet spatial index $k$ and returns a new index $k'$ at the
same level. (Since the wavelet coefficients can all be reversed
to get the same density, there's an overall sign symmetry
which is accounted for in the linear assignment algorithm
by running it twice: once with the set of coefficients
$\{\alpha_{j_0,k}, \beta_{j,k}\}$ and a second time with $\{-\alpha_{j_0,k}, -\beta_{j,k}\}$.)
The space of possible permutations is large and hence this
objective needs to be regularized to yield useful results.
Otherwise, every source shape's coefficients could be reordered
to be in the shape of the target; this is a detriment to
recognition since any shape can essentially match another.
To overcome this effect, we penalize large spatial movements
by incorporating a cost based on the Euclidean distance
between the centers of basis functions. This restricts
large movements of the coefficients, forcing them to be only
locally translated. Incorporating this penalty gives our final
objective function

$$D(p_1, p_2; \pi) + \lambda\left[\sum_{j_0,k} \|r(j_0,k) - r(j_0,\pi(k))\|^2 + \sum_{j \geq j_0,k} \|r(j,k) - r(j,\pi(k))\|^2\right] \qquad (9)$$

where $r(j, k)$ is a location operator, essentially giving us
the center of the wavelet basis at $(j, k)$, which has two
inputs, the level $j$ (and this includes $j_0$) and the wavelet spatial
index $k$, and returns a spatial location $r \in \mathbb{R}^2$. The
basic idea here is that as the regularization parameter $\lambda$ is
increased, the objective increasingly favors shorter wavelet
sliding movements and hence smaller deformations. The
optimal permutation $\pi^*$ can be obtained by setting up the
cost matrix

$$C = c_1 c_2^T + \lambda d \qquad (10)$$

where $c_i$ is a vectorized representation of all the density
wavelet coefficients for shape $i$ and the matrix $d$ contains
pairwise distances between the wavelet basis locations. Figure 4
illustrates the effect of $\lambda$ on the linear assignment and
hence the similarity metric.

Figure 4. Effects of $\lambda$ on linear assignment. Top row far left is target shape and
far right is the source. Second row shows that for small $\lambda$ the source shape is almost
perfectly transformed to the target while for large $\lambda$ the source shape retains its original
shape; $\lambda$ values from left to right: 10, 250, 500, and 1000. Third row illustrates the
wavelet coefficients movement in row two (best viewed in color). The densities were
estimated using the Haar family with $j_0 = 2$.

3. Experiments

The presented technique was evaluated on the MPEG-7
database [ ]. The original data set consists of 70 different
categories with 20 observations per category for a total of
1400 binary images. Each image consists of a single shape.
One of the main strengths of our method is its accessibility
and ease of use. The first part involves simply taking
the data samples for each object and using them to estimate
$\{\alpha_{j_0,k}, \beta_{j,k}\}$ for the wavelet expansion of $\sqrt{p}$. In
the context of shape indexing this phase is completely offline,
i.e. wavelet densities for the entire database can be
estimated once, before the actual similarity computation
takes place. Next, to compare two shapes, we first use the
regularized linear assignment (9) to handle non-rigid effects
and then use the closed-form distance on the unit hypersphere to
obtain the similarity measure between them. We compare
the performance of our method, Shape L'Ane Rouge, to D2
shape distributions [ ].
For the MPEG-7 data, each shape was represented with
a subset of points. There are no topology or equal point-set
cardinality requirements amongst shapes, allowing shapes
with richer features to be represented with a greater num-
ber of points, see Figure 1 for some examples. In this
preliminary effort, we have focused on handling non-rigid
effects. To this end, shapes within each category were
affine aligned to a category reference shape. We used a re-
cently introduced affine alignment algorithm that enables
alignment of 2D point-set data without iterative optimiza-
tion [ ]. Once the shapes were aligned, all of them were
brought into a common field of view by placing them in a
[-10, 10] x [-10, 10] coordinate system. This was done
to control the range over which we estimate the densities.
Next we estimated coefficients for the wavelet density of
each shape using a Haar basis with $j_0 = 1$. Note it is

possible to use several other families, but the Haar basis
is available in closed form and reduces the time required to
estimate the densities (on average about 2 to 3 minutes per
shape). It is worth mentioning that regardless of the num-
ber of points used to represent each shape, once the den-
sities are estimated all of them will have the same number
of wavelet coefficients. (Recall that the densities are all es-
timated in the same square coordinate system.) Per these
specifications, each wavelet density was represented with
1,764 coefficients.
Once the densities are estimated for all the shapes, pair-
wise matching between densities only involves working
with the wavelet coefficients of the densities. When match-
ing two shapes, the wavelet density coefficients of each are
used to create the cost matrix in Eq. (10). With this cost ma-
trix, we can then use the linear assignment solver presented
in [ ] to obtain the wavelet-coefficient rearrangements of
the source shape with respect to the target. All of our experiments
were conducted with multiple values of $\lambda$. For a
shape pair, it typically takes less than 5 seconds to perform
the linear assignment. Once the coefficients are re-ordered
we can use Eq. (2) to obtain the geodesic distance between
the shapes. In fact we experimented with three possible
similarity measures that can be computed after the linear
assignment: (1) the standard arc length geodesic distance
after linear assignment, (2) geodesic distance plus the total
distance penalty incurred for sliding and (3) just the total
sliding penalty. (Note: The last two metrics are not to be confused
with the fact that the distance penalty is also used to
regularize the sliding process which is different from treat-
ing the total amount of movement as a metric.)
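
The matching step described above can be sketched on a toy example in the spirit of Figure 3. The code below is a brute-force search over permutations, a stand-in that is only practical for a handful of coefficients and is not the linear assignment solver cited in the report. It enters the inner-product term with a negative sign so that minimizing the cost rewards well-correlated matches; that sign convention, the squared-distance penalty on basis centers, the omission of the second sign-flipped run, and all names are choices made for this illustration.

```python
import math
from itertools import permutations

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def sliding_match(c1, c2, centers, lam):
    """Brute-force search for the permutation with the lowest regularized cost.

    cost(pi) = -sum_k c1[k] * c2[pi[k]] + lam * sum_k ||r_k - r_pi[k]||^2
    """
    n = len(c1)
    best_cost, best_perm = math.inf, None
    for perm in permutations(range(n)):
        corr = sum(c1[k] * c2[perm[k]] for k in range(n))
        slide = sum(squared_distance(centers[k], centers[perm[k]])
                    for k in range(n))
        cost = -corr + lam * slide
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_perm

def geodesic_after_sliding(c1, c2, perm):
    """Arc length between c1 and the re-ordered c2."""
    dot = sum(c1[k] * c2[perm[k]] for k in range(len(c1)))
    return math.acos(max(-1.0, min(1.0, dot)))

centers = [(0.5, 0.5), (1.5, 0.5), (2.5, 0.5), (3.5, 0.5)]
c1 = [math.sqrt(0.6), 0.0, math.sqrt(0.4), 0.0]   # source shape
c2 = [0.0, math.sqrt(0.6), 0.0, math.sqrt(0.4)]   # same shape shifted one cell

for lam in (0.01, 1.0):
    perm = sliding_match(c1, c2, centers, lam)
    print(lam, perm, geodesic_after_sliding(c1, c2, perm))
```

With the small penalty weight the shifted coefficients slide back into correspondence and the distance collapses, while the large weight forbids any movement and leaves the two vectors orthogonal. This mirrors the role of the regularization parameter discussed in the text.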
We compared our method to D2 shape distributions as
this is also a density-based shape retrieval metric. For each
shape, a D2 shape distribution was created by taking 10,000
random pairwise distances between points on the shape. In
[ ], the authors then use these distances to construct a 1D
histogram for each shape; this serves as a unique shape
signature. Instead of using histograms, we estimate a 1D
wavelet density for each shape. Distance metrics between
shapes can be obtained by using a variety of 1D density dis-
similarity measures. In addition to the Hellinger divergence,
Eq. (7), we computed three other measures:

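* Bhattacharyya: $D(p_1, p_2) = 1 - \int \sqrt{p_1 p_2}\, dx$

* $\chi^2$: $D(p_1, p_2) = \int \frac{(p_1 - p_2)^2}{p_1 + p_2}\, dx$

* $L_2$: $D(p_1, p_2) = \left(\int (p_1 - p_2)^2\, dx\right)^{1/2}$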

Figure 5 shows some example D2 shape distributions using
the 1D wavelet density estimator; these distributions corre-
spond to shapes shown in Figure 1.
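
The D2 comparison can be sketched as follows. This illustration departs from the setup above in one stated way: it uses a plain histogram as the 1D density instead of the wavelet density estimator. The point sets, bin count, number of sampled pairs and function names are all hypothetical.

```python
import math
import random

def d2_density(points, max_dist, n_pairs=2000, n_bins=20, seed=0):
    """Piecewise-constant density of distances between random point pairs."""
    rng = random.Random(seed)
    width = max_dist / n_bins
    counts = [0] * n_bins
    for _ in range(n_pairs):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        d = math.hypot(x1 - x2, y1 - y2)
        counts[min(int(d / width), n_bins - 1)] += 1
    # Divide by n_pairs * width so the heights integrate to one.
    return [c / (n_pairs * width) for c in counts], width

def bhattacharyya(p, q, w):
    return 1.0 - sum(math.sqrt(a * b) for a, b in zip(p, q)) * w

def chi_square(p, q, w):
    return sum((a - b) ** 2 / (a + b) for a, b in zip(p, q) if a + b > 0) * w

def l2(p, q, w):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)) * w)

def hellinger(p, q, w):
    return sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)) * w

rng = random.Random(1)
square = [(rng.random(), rng.random()) for _ in range(300)]
bar = [(3.0 * rng.random(), 0.2 * rng.random()) for _ in range(300)]
p, w = d2_density(square, max_dist=3.5)
q, _ = d2_density(bar, max_dist=3.5)
for name, fn in [("chi2", chi_square), ("hellinger", hellinger),
                 ("bhattacharyya", bhattacharyya), ("l2", l2)]:
    print(name, fn(p, q, w))
```

With these definitions the Hellinger value is exactly twice the Bhattacharyya value, so the two measures rank matches identically.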
Performance on the MPEG-7 is most commonly evalu-
ated using the bulls-eye criterion [ ]. Each shape is used


Figure 5. Example D2 shape distributions using wavelet density estimators.
These distributions correspond to shapes in Figure 1 from left to right. All densities
were estimated using a Symlet-7, $j_0 = 1$.

Shape L'Ane Rouge
  Metrics            $\lambda = 500$    $\lambda = 1000$
  Geodesic w/ LA     81.7%              84.4%
  Geodesic + EDP     32.6%              18.5%
  EDP                32.5%              18.3%

D2 Shape Distributions
  Metrics            Recognition rate
  $\chi^2$           59.3%
  Hellinger          58.6%
  Bhattacharyya      58.6%
  $L_2$              56.6%

Table 1. MPEG-7 recognition rate. Our method Shape L'Ane Rouge outperforms
D2 Shape Distributions [ ]. In our method the choice of $\lambda$ affects the recognition
rate. See text for explanation of metrics. (LA: linear assignment, EDP: Euclidean
distance penalty).

as a query shape and the top 40 matches are retrieved from
all 1400 shapes (the test shape is not removed). For a single
query, the maximum number of correct retrievals is 20, coinciding
with the number of shapes in each category. Hence there
are a total of 28,000 possible matches with the recognition
rate reflecting the number of correct matches divided by this
total. Table 1 lists the recognition rates using several density
similarity measures for both Shape L'Ane Rouge and D2
shape distributions. Shape L'Ane Rouge significantly out-
performs D2 shape distributions. This gives credence to the
idea of working with feature representations that mimic the
true visual properties of shapes, i.e. D2 shape distributions
represent objects using a 1D signature derived from the 2D
points whereas Shape L'Ane Rouge represents shapes using
2D densities which are visually similar to the shapes. The
three different metrics computed for Shape L'Ane Rouge
illustrate how $\lambda$ impacts recognition performance. A
judicious choice for $\lambda$ can be made by optimizing over a
training set. The different metrics also show that the wavelet
density representation provides a rich set of features, as
evidenced by the fact that the geodesic distance (with linear
assignment) outperforms the metrics that include the Euclidean
distance penalty. Hence the sliding alone is not sufficient to
discriminate between shapes. (For high $\lambda$, the total sliding
penalty dominates the second metric, giving performance similar
to the third.)
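The bulls-eye evaluation described above can be sketched in a few lines of Python. This is a minimal sketch, assuming a precomputed pairwise dissimilarity matrix `dist` and a list of class `labels`; the function name `bullseye_score` and its arguments are hypothetical and not taken from any benchmark toolkit.

```python
def bullseye_score(dist, labels, top_k=40):
    """Correct same-class retrievals among the top_k matches of every query,
    divided by the total number of possible correct matches."""
    n = len(labels)
    class_sizes = {}
    for lab in labels:
        class_sizes[lab] = class_sizes.get(lab, 0) + 1

    hits = 0
    possible = 0
    for q in range(n):
        # Rank all shapes by dissimilarity to the query; the query itself is kept.
        ranked = sorted(range(n), key=lambda j: dist[q][j])
        hits += sum(1 for j in ranked[:top_k] if labels[j] == labels[q])
        possible += class_sizes[labels[q]]
    return hits / possible
```

With 1400 shapes in categories of 20, `possible` accumulates to 1400 x 20 = 28,000, matching the denominator used for the recognition rates in Table 1.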
Recently, methods based on hierarchical representations
[ ] have reported recognition rates greater than .-".'
on the MPEG-7 data set. However, these methods work on
a simplified version of the problem we have addressed.
They assume shapes are represented by their boundary outlines
and typically use fewer than 200 points per shape. A
hierarchical representation is used to capture both global and
local properties. These methods have the drawback of requiring
the extraction of oriented boundary curves, which can be a
troublesome preprocessing procedure. They also lose the
descriptive power afforded by allowing arbitrary shape topologies
and unconstrained point-set cardinalities. The closest method,
in terms of operating on unstructured point sets and not
restricting shape topology, is [ ], which has a published
recognition rate of 76.51% on the MPEG-7
data set. Our results clearly show reasonable gains over this
method. We are still in the preliminary stages of exploiting
the full capabilities of the Shape L'Ane Rouge framework,
i.e. using multiscale representations to get more descrip-
tive attributes, experimenting with different wavelet fami-
lies, etc. Since we are already exceeding .-1' ., we believe
in the future these enhancements will improve our recog-
nition rates significantly without sacrificing our ease-of-use
and rich descriptive power.

4. Discussion

The development of robust and effective shape indexing
and retrieval mechanisms largely depends on the represen-
tation model for the data and also the metrics used to distin-
guish one observation from another. In this paper, we have
presented a novel shape representation scheme which gives
rise to a natural metric that comes directly from the
representation. Given an unstructured point-set model of a shape,
our Shape L'Ane Rouge framework estimates $\sqrt{p}$, under
a wavelet expansion, directly from the point data and recovers
the probability density as $(\sqrt{p})^2$. As we illustrated,
these densities have a direct visual similarity to the original
shape. The unit integrability property of all densities translates
to a constraint on the wavelet coefficients, i.e. the sum of the
squared coefficients equals one; see Eq. (5). Since the densities
are uniquely identified by their wavelet coefficients, these
are in effect the coordinates by which probability densities are
indexed on a unit hypersphere. Because the densities represent
the original shapes, intuitively the shapes also lie on the unit
hypersphere. Arising from this representation, we immediately
gain a natural similarity measure between shapes by computing
the arc length between probability densities on the unit
hypersphere. Shape recognition
can be improved if we adjust for non-rigid differences between
a pair of shapes before computing a similarity measure.
Rather than do this in the original shape space, we have
introduced a novel way of deforming their wavelet density
representations through the use of penalized linear assignment,
allowing us to locally warp the density while maintaining its
defining integrability and positivity properties.

Our framework has several advantages over other contemporary
shape modeling and matching schemes, including:

* Each shape can have an arbitrary number of points without
  topological restrictions. This is in sharp contrast to methods
  that work only on shape silhouettes or are limited to only a
  few sample points. Hence, the cardinality of a shape point set
  is dictated by the number of points needed to accurately
  represent a shape's features, not by algorithmic limitations.

* Limited preprocessing is required since we directly take the
  shape points and estimate the density.

* The metric is in closed form, and even when incorporating
  linear assignment our method is still computationally
  efficient enough for querying applications.
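The closed-form metric mentioned in the list above, the arc length between two densities on the unit hypersphere, can be illustrated with a short Python sketch. It assumes each shape is stored as a flat list of wavelet coefficients of the square-root density, ordered identically for both shapes; the function names `normalize` and `geodesic_distance` are hypothetical.

```python
import math


def normalize(coeffs):
    """Rescale coefficients to unit Euclidean norm. The coefficients should
    already satisfy this constraint; the rescaling only guards against drift."""
    norm = math.sqrt(sum(c * c for c in coeffs))
    return [c / norm for c in coeffs]


def geodesic_distance(a, b):
    """Arc length between two unit-norm coefficient vectors on the hypersphere."""
    dot = sum(x * y for x, y in zip(a, b))
    # Clamp to [-1, 1] so rounding error cannot push acos outside its domain.
    return math.acos(max(-1.0, min(1.0, dot)))
```

Identical coefficient vectors give a distance of zero, and orthogonal ones give pi/2. The cost of one comparison is a single inner product followed by an inverse cosine.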

We are still in the preliminary stages of fleshing out the
capabilities of our Shape L'Ane Rouge technique. In the
immediate future, we plan to incorporate multiscale wavelet
densities and to study the effects of multiple wavelet families.
We anticipate these will provide additional attributes for each
shape, which will further increase shape discriminability and
subsequently improve recognition rates. We are also planning to
investigate other penalty terms for the linear assignment
objective function and better mechanisms for choosing $\lambda$.
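To make the role of the penalty weight concrete, the following Python sketch shows one way a penalized assignment between two coefficient vectors could look. It is an illustration under loud assumptions rather than the objective used in this paper: the cost is taken to be the negative inner product of the matched coefficients plus a weight `lam` times the squared displacement of each coefficient, basis locations are reduced to 1D `positions`, and a brute-force search over permutations stands in for a linear assignment solver. All names are hypothetical.

```python
import itertools
import math


def slide_coefficients(a, b, positions, lam):
    """Permute the coefficients of b to best match a under a displacement penalty.

    ASSUMED cost for a permutation perm:
        -sum_i a[i] * b[perm[i]] + lam * sum_i (positions[i] - positions[perm[i]])**2
    Brute force is only feasible for a handful of coefficients.
    """
    n = len(a)
    best_perm, best_cost = None, math.inf
    for perm in itertools.permutations(range(n)):
        match = -sum(a[i] * b[perm[i]] for i in range(n))
        move = sum((positions[i] - positions[perm[i]]) ** 2 for i in range(n))
        cost = match + lam * move
        if cost < best_cost:
            best_perm, best_cost = perm, cost
    # A permutation only reorders coefficients, so the sum of squares is unchanged.
    warped = [b[j] for j in best_perm]
    return warped, best_perm
```

Because the warped vector is a reordering of the original coefficients, its sum of squares still equals one, and squaring the reconstructed function keeps the density non-negative. With a weight of zero the coefficients slide freely to the best match; with a large weight the identity permutation wins and no sliding occurs.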


References

[1] J. Tangelder and R. Veltkamp, "A survey of content based 3d
shape retrieval methods," in Proceedings of the Shape Modeling
International 2004. IEEE Computer Society, 2004, pp. 145-156.
[2] Authors, "Maximum likelihood wavelet density estimation
with applications to image and shape matching," IEEE Trans.
Image Processing (in submission), 2007.
[3] D. Donoho, I. Johnstone, G. Kerkyacharian, and D. Picard,
"Density estimation by wavelet thresholding," Ann. Statist.,
vol. 24, no. 2, pp. 508-539, 1996.
[4] P. Shilane, P. Min, M. Kazhdan, and T. Funkhouser, "The
Princeton shape benchmark," in Proceedings of the Shape
Modeling International 2004. IEEE Computer Society, 2004,
pp. 167-178.
[5] D. Luenberger, Linear and Nonlinear Programming. Reading,
MA: Addison-Wesley, 1984.
[6] Y. Rubner, C. Tomasi, and L. J. Guibas, "The earth mover's
distance as a metric for image retrieval," International Journal
of Computer Vision, vol. 40, no. 2, pp. 99-121, 2000.
[7] F. Hitchcock, "The distribution of a product from several
sources to numerous localities," J. Math. Phys., vol. 20,
pp. 224-230, 1941.
[8] A. Peter and A. Rangarajan, "Shape matching using the
Fisher-Rao Riemannian metric: Unifying shape representation
and deformation," IEEE International Symposium on
Biomedical Imaging (ISBI), pp. 1164-1167, 2006.
[9] L. J. Latecki, R. Lakämper, and U. Eckhardt, "Shape
descriptors for non-rigid shapes with a single closed contour,"
in IEEE Conf. on Computer Vision and Pattern Recognition
(CVPR), 2000, pp. 424-429.
[10] R. Osada, T. Funkhouser, B. Chazelle, and D. Dobkin,
"Shape distributions," ACM Transactions on Graphics, no. 4,
pp. 807-832, 2004.
[11] F. Wang, B. Vemuri, A. Rangarajan, I. Schmalfuss, and
S. Eisenschenk, "Simultaneous nonrigid registration of multiple
point sets and atlas construction," in European Conference
on Computer Vision (ECCV), 2006, pp. 551-563.
[12] S. Penev and L. Dechevsky, "On non-negative wavelet-based
density estimators," Journal of Nonparametric Statistics,
vol. 7, pp. 365-394.
[13] A. Srivastava, I. Jermyn, and S. Joshi, "Riemannian analysis
of probability density functions with applications in vision,"
in IEEE Conf. on Computer Vision and Pattern Recognition
(CVPR), 2007, pp. 1-8.
[14] S.-I. Amari and H. Nagaoka, Methods of Information
Geometry. American Mathematical Society, 2001.
[15] I. Daubechies, Ten Lectures on Wavelets, ser. CBMS-NSF
Reg. Conf. Series in Applied Math. SIAM, 1992.
[16] R. Beran, "Minimum Hellinger distance estimates for
parametric models," Annals of Statistics, vol. 5, no. 3,
pp. 445-463, 1977.
[17] F. Klein, "Vergleichende Betrachtungen über neuere
geometrische Forschungen," 1872.
[18] I. L. Dryden and K. V. Mardia, Statistical Shape Analysis.
Wiley, 1998.
[19] F. L. Bookstein, "Principal warps: Thin-plate splines and
the decomposition of deformations," IEEE Trans. Patt. Anal.
Mach. Intell., vol. 11, no. 6, pp. 567-585, June 1989.
[20] J. Ho, M. Yang, A. Rangarajan, and B. Vemuri, "A new affine
registration algorithm for matching 2d point sets," in
Proceedings of the Eighth IEEE Workshop on Applications of
Computer Vision, 2007, p. 25.
[21] R. Jonker and A. Volgenant, "A shortest augmenting path
algorithm for dense and sparse linear assignment problems,"
Computing, vol. 38, pp. 325-340, 1987.
[22] G. McNeill and S. Vijayakumar, "Hierarchical procrustes
matching for shape retrieval," in IEEE Conf. on Computer
Vision and Pattern Recognition (CVPR), 2006, pp. 885-894.
[23] P. Felzenszwalb and J. Schwartz, "Hierarchical matching of
deformable shapes," in IEEE Conf. on Computer Vision and
Pattern Recognition (CVPR), 2007, pp. 1-8.
[24] S. Belongie, J. Malik, and J. Puzicha, "Shape matching and
object recognition using shape contexts," IEEE Trans. Patt.
Anal. Mach. Intell., vol. 24, no. 4, pp. 509-522, 2002.
