Recognition of Handwritten Characters by
Voronoi Representations
Niranjan Mayya
Andrew F. Laine
UF-CIS-TR-94-18
Department of Computer & Information Sciences
University of Florida, Gainesville, Fl 32611
Submitted to IEEE PAMI
Submitted for publication to the
IEEE Transactions on Pattern A,,,,lii.' and Machine Intelligence
as a Regular Paper.
Recognition of Handwritten Characters by
Voronoi Representations
Niranjan Mayya and Andrew F. Laine1
Department of Computer & Information Sciences
University of Florida,
Gainesville, FL 32611
Abstract
We present a new skeletonization algorithm well suited for the problem of handprinted
character recognition. Our approach employs a novel algorithm for computing the Voronoi
diagram of a polygon with holes. We show that Voronoi skeletons can serve as efficient
shape descriptors because they preserve connectivity and Euclidean metrics. Compared
to traditional skeletonization techniques, we -i,--.. -1 that shape representations based on
Voronoi skeletons may increase the reliability of production quality character recognition
systems.
A feasibility study is described in which more than 10,000 handprinted characters were
recognized with an error rate of 2.3 I'. by a neural network trained using Voronoi skeletons
of character shapes from a class of 52 distinct alphanumeric patterns and graphical symbols.
These results show that feature vectors extracted from Voronoi skeletons provide for high
reliability in handprinted character recognition at a reduced cost of representation.
KEYWORDS: Character Recognition, Voronoi Diagrams, Skeletons, Medial Axis Transform.
1Please address all correspondence to:
Dr. Andrew F. Laine
Computer and Information Sciences Department
CSE Building, Room 301
P.Oi Box 116120
University of Florida
Gainesville, FL 32611-2024
E-mail: laine@cis.ufl.edu
Phone/Fax: ('in 1) 392-1239
1 Introduction
Handwritten character recognition is an important subproblem within the broader field of Optical
Character Recognition (OCR). While many researchers have successfully provided partial solutions
to this difficult problem, the problem of character recognition in general and handwritten character
recognition in particular, remains unsolved. Specifically, more reliable methods need to be found
before human performance in recognizing characters can be matched. A survey article by Mori et.
al [20] documents the state-of-the-art in OCR, and papers by Suen et. al [30, 31] describe more
recent advances made in handwriting recognition.
The steps involved in most character recognition systems can be broadly classified into prepro-
cessing, feature extraction and classification. Of these steps, methods of feature extraction have
received the largest amount of research and development. There are two parts to this step: (a)
defining distinctive features of characters, (b) extracting features once they have been defined.
Part (a) of this problem remains open. That existing character recognition systems have yet to
match human performance, can in part be ascribed to the intractable nature of defining "dis-
tinctive characteristics" for character patterns. In [31], a broad taxonomy of feature extraction
approaches are discussed. These include global ,i,,i.,,l'; (techniques such as template matching,
measurement densities of points, moments and mathematical transforms) and structural ,,,i,;..;';
(methods that extract loops, endpoints, junctions and arcs from the contour or skeleton of a char-
acter). The latter class is concerned with capturing the essential shape of each character. In the
next subsection, we shall look at skeletons and skeletonization techniques in greater detail.
1.1 Skeletons (Symmetric Axis Transform)
The skeleton of a planar object, also known as the Symmetric Axis Transform (SAT) or the Medial
Axis Transform (MAT) is a well known tool in shape modeling. Pavlidis [26] provides a formal
definition: "Let R be a plane set, B its boundary, and P a point in R. A nearest neighbor of P on
B is a point M in B such that there is no other point in B whose distance from P is less than the
distance PM. If P has more than one nearest neighbor, then P is said to be a skeletal point of R.
The union of all skeletal points is called the skeleton or medial axis of R." An analogous definition
is given by Blum using the well-known prairie fire analogy [3]; if one applies fire to all the sides
of P, and lets the fire propagate at constant speed, then the skeleton is the locus of points where
(fire) wave fronts meet.
The skeleton of a planar object relates the internal structure of the object to significant bound-
ary features. It is a compact descriptor for the "natural" shape of an object, that well describes
its global topological and geometric properties. Historically, skeletons have been previously used
in higher level computer vision tasks such as object/character recognition, as they provide a more
explicit, compact and stable representation, compared to an original intensity map of an image.
In the context of character recognition, skeletons have an added advantage in that they provide
a thin line representation similar to human handwriting. This allows for a more intuitive design
of recognition algorithms [31]. Another important feature exhibited by skeletons is that their
compact thin line representation is invariant to minor distortions, which allows for some measure
of invariance to differences in individual writing style. Finally, skeletons make possible simple
extraction of critical points such as end points and junctions which are cornerstone features for
syntactic recognition.
However, skeletons have some disadvantages. The first among these, is that the thinning process
may discard some amount of useful detail, that would be retained in other representations, such as
contours. A good example of this is a filled in hole or concavity. This is an important problem in
the context of character recognition, and an easy solution is not immediately apparent. Contours
and skeletons can complement each other as feature descriptors for character recognition. A
particularly good discussion of the relative merits and demerits of using skeletons versus contours
in character recognition is given in [30] and [31], where a system is built that combines the powers
of both representations. Another important disadvantage of skeletons is that some skeletonization
algorithms introduce distortions of their own (sometimes known as noise spurs.). In later sections,
we look in greater detail at how researchers across the field have handled this problem.
In the next sub section we describe previous skeletonization algorithms and introduce our
approach to the skeletonization problem.
1.2 Algorithms for Skeletonization
Most existing implementations for computing skeletons use discrete space concepts that only
approximate Blum's definition. In particular, preserving important properties such as conne, l''.'/i
and Euclidean metrics have been difficult to achieve in the discrete world. Previous algorithms
(and implementations) that have been developed can be classified [13, 23] into three distinct
groups: (1) Topological thinning, (2) Medial Axis extraction from a distance map, and (3) Analytic
computation of a skeleton based on an approximation of the object contour.
Topological Thinning. A large class of these algorithms examine the topological relevance of
object pixels rather than the metric properties of a shape. Typically, object pixels are repetitively
tested and subsequently deleted, whenever their removal does not alter the topology of a thinned
shape. Most character recognition systems that utilize skeletons, employ thinning techniques to
compute them. The advantage of this approach is that connected skeletons can be ensured by
relatively fast algorithms. However a significant disadvantage is that their discrete domain gives
rise to non-Euclidean metrics. Different thinning algorithms applied on the same image can result
in skeletons that may vary. These problems are in part due to different pixel "removal condil .ii -
that are defined in terms of local configurations.
Medial Axis extraction from a distance map. The second method of skeleton extraction
requires the computation of a distance map; determine for each point inside an object, the distance
of the closest point from its boundary. Depending on the metric used for distance, a wide array
of possible distance maps can be obtained. Unfortunately, the easiest and simplest algorithms are
those based on non-Euclidean metrics, which lead to skeletons that are not accurate in terms of
the fire front paradigm described earlier. However, algorithms that compute correct Euclidean
distance maps do exist [4]. Skeletonization algorithms using quasi-Euclidean and Euclidean maps
follow ridges in the distance map to construct a skeleton. However, the problem with these methods
are that conne. i,'.t,'li is not guaranteed [13].
Analytic computation of symmetric axes. The third method involves computation of sym-
metric axes by a direct analytical method based on polygonal approximation of a shape. Early
work using this approach was by Montanari [21] who solved a system of linear equations to compute
loci of equidistant points.
The Voronoi diagram 7_] is a useful geometric structure which contains complete planar prox-
imity information for a set of points [1]. This allows for simplified computation of a distance map
directly from the Voronoi diagram. Furthermore, the Voronoi diagram of boundary line segments
of a polygon is closely associated to its medial axis. In fact, the medial axis is exactly contained in
the set of Voronoi edges of the polygon, and can be obtained simply by deleting the two Voronoi
edges incident with each concave vertex [12]. Thus construction of the Voronoi diagram is clearly
a technique for the skeletonization of polygonal shapes.
1.3 Skeletons Derived from Voronoi Diagrams
Using the Voronoi diagram to compute the skeleton of a polygonal shape is attractive because
it results in skeletons which are connected while retaining Euclidean metrics. Furthermore, we
obtain an exact medial axis, compared to an approximation. Thus we may reconstruct exactly
an original polygon from its skeleton, (invertibility or one-to-one mapping). Finally, algorithms
to compute the Voronoi diagram (and hence the skeleton) are much faster than methods that
compute a distance map.
However, there are some disadvantages of using Voronoi diagrams to derive skeletons. We
.i-.-, -1 that any method that utilizes Voronoi diagrams of polygons to compute skeletons must
somehow overcome the disadvantages listed below, before it can be of practical value.
i) Natural shapes are non-polygonal. Thus, accurate polygonal approximations of such shapes
are required in order to compute skeletons without loss of accuracy.
ii) The skeleton of a many sided polygon (of very short sides) will have a large number of
redundant edges because of the Voronoi edges at these vertices. This results in increased
complexity of the skeleton, without the addition of any shape information.
iii) Finally, robust and practical algorithms (affording ease of implementation) for Voronoi dia-
gram construction of polygons are uncommon. Most existing algorithms make assumptions
about cocircularity of no more than three points, and colinearity of no more than two. These
constraints are difficult to satisfy in most practical applications.
In the next subsection we describe the basis of our approach to the skeletonization problem
and show how we overcome the disadvantages of Voronoi diagrams described in this section.
1.4 Voronoi Skeletons for Character Recognition
We now describe a skeletonization algorithm that is well suited for the character recognition
problem. The basis of our approach is the application of a algorithm for computing the Voronoi
diagram of a polygon [16]. Our skeletonization algorithm retains the advantages of Voronoi dia-
grams described in the previous section, (connectivity, Euclidean metrics and high accuracy). This
approach is thus a marked improvement over traditional skeletonization methods previously used
in character recognition studies. Furthermore we overcome the disadvantages of Voronoi skeletons,
identified in the previous section:
i) Character shapes are not extremely complex; hence accurate polygonization is realistic.
ii) We use a pruning approach to delete redundant skeleton edges that is guaranteed to retain
connectivity. The pruning step is a simple consequence of the Voronoi diagram algorithm
and does not require postprocessing. This also ensures that our method is stable with respect
to being invariant to perturbations along a boundary. This allows for a certain degree of
invariance to differences in handwriting style.
iii) Our Voronoi diagram algorithm is simple to implement. Assumptions about points being in
general position are unnecessary. Unlike most geometric algorithms, special cases such as
cocircularity of greater than three points or colinearity of more than two points are handled
elegantly [16]. In addition, our algorithm features a robust numerical scheme to compute
non-linear parabolic edges that avoids having to solve equations of degree greater than two.
We present experimental results, in which over ten thousand handprinted characters were rec-
ognized with a 2.34 error rate by a neural net classifier [7] trained with Voronoi skeletons of
character shapes, from a class of 52 distinct alphanumeric and graphical patterns. These results
represent an improvement over earlier work by one of the authors of this paper [8, 9].
In the remaining sections we present the details of our approach. The rest of this paper is
organized as follows. Section 2 contains a brief description of our Voronoi diagram algorithm. In
Section 3 we explain how the Voronoi skeletons of character shapes are derived from their Voronoi
diagrams. Finally, we discuss performance and recognition results in Section 4 and summarize our
study in Section 5.
2 The Voronoi Diagram
In this section, we provide a brief description of our Voronoi diagram algorithm. For greater details
of the algorithm and the proofs the reader is directed to [16].
The Voronoi diagram of a set of sites in two dimensions 12-] is the partition of the plane
into regions; each region i containing the set of points in the plane closest to the site i. In
the most common case which has been exhaustively researched 12'] in the past decade, the sites
under consideration are points in the plane. In this case, edges of the diagram are straight line
segments that are perpendicular bisectors of pairs of sites. Optimal algorithms as well as robust
implementations have been devised for this version of the problem. The notion of a site has been
generalized to include a collection of two-dimensional objects such as line segments and circular
arcs. In the case of line segments, the edges of the ensuing Voronoi diagram are not just straight-
line segments, but also arcs of parabola, since they separate the loci of proximity of object pairs of
the types (point, point), (line, line), (point, line); the last of these pairs give rise to arcs of parabola
(See Figure 1). The Voronoi diagram of the interior of a polygon is of interest to researchers across
different fields. In this case the set of sites include the edges and vertices of the polygon. It has
been used [25] to generate meshes for multiply-connected polygonal domains and has been long
known to provide the medial axis of polygonal shapes [12] that serve as a basis for efficient shape
el el
B(el,e) B(e,ejq)
ej ej
(a) 2 point sites (b) 1 point and 1 line
(a) 2 point sites .
site
elej
B(el,ej)
(c) the point site is an
end point of the line
site
Figure 1: Bisectors of e and ej
representation schemes [17]. While algorithms that compute the Voronoi diagram of the interior
of a polygon exist [24, 19], easy to implement algorithms that are robust and make no assumptions
about the position of the sites in a plane have not been proposed thus far. We believe that our
algorithm fills this void. (See [16] for a discussion of previous work on Voronoi diagrams.)
2.1 Preliminaries
In this section, we will introduce some definitions and notation. Definitions 1 through 8 pertain
to the definition of polygonal domains and Voronoi diagrams and are taken from [24].
Definition 1 A closed line segment [a, b] is the union of two endpoints a and b and the open line
segment (a, b).
Definition 2 A ,,i,,lli'l,-connected j.1li,,,.',,l domain P is the closure of a ,...r.', ,,, iii. bounded,
connected, open (in the relative topology) subset of R2 whose boundary is the union of a finite
number of closed line segments.
The boundary of P, denoted by 6P, consists of one or more disjoint subsets. The outer boundary
of the polygonal domain is denoted by 6Po and contains P. The inner boundaries represent the
holes of the polygonal region and are denoted by 6Pi, 1 < i < H, where H denotes the number of
holes.
Definition 3 The vertices of 6P are the points of intersection of the closed line segments which
constitute 6P. The edges of 6P are the open line segments obtained by deleting the endpoints of
the closed line segments which constitute SP
Definition 4 The projection p(q, [a, b]) of a point q onto a closed segment [a, b] is the intersection
of the line through a and b and the line perpendicular to [a, b] and passing through q.
Definition 5 The bisector B(ei, cj) of two sites ec, and ej, is the locus of points equidistant from
ei and cj.
Definition 6 The half-plane h(e, ej) is the set of points closer to site ec than to site cj. Its
complement, h(ie, cj), is the set of points not closer to site ei than to cj.
The nature of the bisector is determined by the nature of the sites (point, line). In particular,
when both ei and cj are point sites, the bisector is a straight line (the perpendicular bisector of
ei and cj. When one of the sites is a point, and the other is an open line segment the bisector
is in general a parabolic arc. For the special case where the point is one of the endpoints of the
open line segment, the bisector is a straight line passing through the point and perpendicular to
the open line segment. See Figure 1.
Definition 7 Given a set of sites S, and a site ei, ei S, the Voronoi region of ei with respect
to S, denoted by V(ci, S) is the set of all points closer to ei than to any site in S.
Lemma 1 V(ci, S) = n,,sh(ei, cj)
Proof: Corollary 2, Lemma 1 of [24] m
Definition 8 The Voronoi diagram, VOD(S), of a set of elements, S = ei is given by
Uees V(ei, S ei)
2.2 Primitives used in the Algorithm
A site e can be either a point or an open line segment. Each such site is associated with the
following information.
1. A contact point xpi. For a point site, the contact point is the site itself; for a line site, the
contact point is one of the endpoints.
2. A distance function di(xo); returning the square of the distance of a point x0 from site ei.
For a line site, the distance of a point to the site is defined as the distance from the point
x0 to its projection on the line site e. Given this definition, we can define the distance of
a point to any site (either a point or a line) as follows, Given a line site, let Tr be its unit
vector. If I is the identity matrix, and T denotes the transpose of a matrix, we define a 2 x 2
matrix M as
M = I for a point site,
= I Wn nT for a line site
Now we define the distance function as,
di(xo) = (xo Xp)T M (xo _- Xp)
3. A gradient vector g(xo); which is defined by.
gi(xo) = 2(xo XpQ" M
2.3 Algorithm
2.3.1 Overview
The Voronoi diagram gives us a complete description of the function t(x), that returns the distance
of the point x to the closest site in the set S. In particular,
A Voronoi region (face) is characterized by a single site e; the function is given by ti(x).
A Voronoi edge is characterized by two sites, e and ej; the edge comprises the set of points
where ti(x) = tj(x).
A Voronoi vertex is characterized by 3 or more sites, i, j, k,...m; the vertex satisfies the locus
ti(x) = tj(x) = tk(x) = ...tm(.X)
Given an initial Voronoi vertex and the initial direction of the Voronoi edge emanating out of
that vertex, we follow the path traced by this edge to determine the vertex at the other end. Every
other site is examined to find the closest site that determines the new vertex. The new vertex is
equidistant from three or more sites, every pair of which gives rise to a possible new Voronoi edge.
The new Voronoi edges are added to an unexamined Edge List. The program terminates when all
the edges have been traced.
B
/ 12 2
pc C
(a) Voronoi edges of 2 (b) Initialization for a
sides of a polygon polygon with holes
Figure 2: Initialization
2.3.2 Initialization
Given a pair of successive segments of any polygonal region, this gives rise to the following 3
Voronoi edges. See Figure 2a. BA and AC are 2 successive sides of a polygon. There are 3 sites
corresponding to these 2 sides; two line sites el and e2 corresponding to the open line segments
BA and AC, and the point site e3. The 3 Voronoi edges corresponding to these sites are shown by
the dashed lines. 11 is the bisector of e2 and e3, 12 that of el and e3, and 13 the 1e, e2 bisector. If
we are considering the Voronoi diagram of the interior of the polygonal region alone, we are only
concerned with those Voronoi edges inside the region. These are easily obtained by considering only
12
the (line,line) bisectors (the 13 type edges) of every convex pair of successive polygon segments. for
the outer boundary of the polygon, and the (point,line) edges (the 11 and 12 type edges) for every
concave pair of successive polygon segments. For the inner boundaries (the holes), we perform the
reverse. Figure 2b shows the complete initialization for a simple case. The vertices of the polygon
are also Voronoi vertices. Each of the edges determined in this step are added to the unexamined
Edge List.
ei
ei.
A
B i ) B(ei,ej -
B(ei,ej)
xo
ek ek
*ej ej
(a) B(ei, ej) is along the (b) F is tangent to B(ei, ej)
direction of V at xo
Figure 3: Initial Direction for a Voronoi edge
2.3.3 Curve Tracing
The unexamined Edge List holds edges which have been only partly determined. Specifically, an
unexamined Voronoi edge E is a bisector of two sites e and ej, containing the starting point xo,
and the initial direction U along which we must traverse to determine the other points along the
edge. The initial direction of the bisector is determined upto the linear order and is given by iU
= (gi(xo) gj(xo))', where a1 denotes a unit vector perpendicular to the vector a. As has been
noted earlier, if the Voronoi edge is a (point,point) or (line, line) bisector, it will be a straight
line, in which case there is only a single terminating Voronoi vertex to be determined. If we have
a (point, line) bisector, the Voronoi edge is a parabolic curve; this curve has to be traced and
intermediate points along it computed to fully determine the Voronoi edge.
In the linear case, the bisector B(e,, ej) of sites e, and ej (see Figure 3), lies along the direction
-...g(xo)/2 B(ei,ej)
''**-. .xt "
xt
-v
ek xZ
ej
Figure 4: Finding an intermediate point on the curve
of U*. In this case, the bisector is fully determined by computing the Voronoi vertex Xt at the other
end. Finding the terminating vertex on the bisector involves determining the site ei, (1 i,j, k),
that is closest to the sites e and ej. The algorithm examines every site e( (1 i,j, k), and computes
the point xt that is the center of the circle passing through e, ej and el. Each such point xt1 will
lie on the bisector B(e, ej) along the direction iU; the point closest to the starting vertex xo is the
new Voronoi vertex Xt.
The situation in the non-linear case is slightly more complicated. In this case, the bisector
B(e, ej) is a parabolic curve, and the initial direction U is a tangent to the bisector at the starting
point xo, (Figure 3b) and is the linear approximation of the bisector.
In order to determine the Voronoi vertex at the other end of a parabolic curve, an iterative
technique is employed. First, a similar procedure as in the linear case is followed to determine a
point Xt on the linear approximation of the bisector. Since Xt does not lie on the bisector itself,
we need to move back to a point on the curve itself. It can be seen that we will intersect the
bisector B(e, ej) if we move along the contact vector -g(xt) towards the point site. The point of
intersection x1l is an intermediate point along B(e, ej). See Figure 4.
It is easy to prove [16] that this procedure is guaranteed not to overshoot the Voronoi vertex;
namely, the open interval of the bisector B(e, ej) between the points xo and xl1 contains no Voronoi
vertex. Having determined an intermediate point on the bisector, the procedure is repeated using
a new value for U that is given by U = (gi(xti) gj(xnt))'. The procedure terminates when we
arrive at a point xt, which corresponds to a site e, such that xt, is equidistant from e, ej and
ec. x1t is the new Voronoi vertex.
/2ek el
a 3 c c n (b) 4 co-circular sites with 1 line
.gi gj/2
S_ l/2.: %
(a) 3 co-circular point sites (b) 4 co-circular sites with line
site
Figure 5: Determining edges from the convex hull of points of contact
We note that the rate of convergence of the iteration is the same as that of Newton's method,
since we are approximating an arc by a straight line.
2.3.4 Adding New Edges
Having computed a new Voronoi vertex Xj, we must first perform a check to determine if this
vertex has been found before. If this vertex has been computed earlier, it means that there must
exist in the unexamined Edge List, an unexamined edge with starting point Xt. This edge must
be deleted from the Edge List, since it has now been fully determined. If the new vertex has not
been computed before, we proceed to determine all the edges coming out of this vertex.
We first need to determine all the other sites equidistant from this vertex, and then determine
the new Voronoi edges that emanate out of the new vertex.
The first step is easy to perform; all sites el, which satisfy
di(x ) = di(x.t) = dj(x t)
lie on the circle of radius V x), centered at the new Voronoi vertex Xt.
The second step is to determine all the edges coming out of this vertex. In general, if we have k
sites equidistant from a vertex, there can be pairs of possible edges but all of these will not
be Voronoi edges. From the vector g(xt), for each site e, we can determine the point of contact
of -.; /2 with e. If we denote the points of contact as yi, the convex hull of all the yi determines
which pairs of sites correspond to Voronoi edge. Every convex hull edge gives a pair of sites that
correspond to a Voronoi edge. Thus if we have k equidistant sites from the vertex xt, there are
exactly k Voronoi edges. See Figure 5.
One of these k edges is the edge which was just traced. The other k 1 edges are appended
to the unexamined Edge List. The head of the Edge List is then examined and the curve tracing
procedure is repeated until the Edge List is empty.
Due to the incremental nature of the edge tracing algorithm, we do not need to make any
assumptions about points in general position. Hence co-circularity of more than 3 sites can be
handled easily.
2.3.5 Time Complexity
Let the number of sites in the input be np, the number of Voronoi edges be n, and the number of
Voronoi vertices be n,.
The initialization step is of the order of the number of sites in the input, since a single pass
over the input suffices to create the initial edges and append them to the unexamined Edge List.
This step takes O(np) time.
Each edge that is added to the diagram in the edge tracing step requires us to examine each
site to determine the closest site. Also, we need to search the list of current Voronoi vertices to
check for the existence of a new vertex. Hence, we need O(np + n,) time for each new edge found.
The total time required for the algorithm is therefore O(npn, + nfne). By Euler's formula the
number of Voronoi edges is 2n, 3 and the number of Voronoi vertices n, 2. Hence the time
complexity is O(n ).
3 Skeletonization using Voronoi Diagrams
In this section, we describe our technique for obtaining skeletons of character patterns that are
derived from Voronoi diagrams.
3.1 Preprocessing: Computing a Polygonal Approximation
In order to compute the Voronoi diagram of a shape, we first segment the shape's boundary and
derive a polygonal approximation of its bounding contour. In our implementation, we used the zero
crossings obtained from an image filtered with a Laplacian of a Gaussian [14], to obtain a bounding
contour. The advantage of this technique is that it guarantees closed contours, which ensures
that a contour tracing algorithm such as chain coding will converge quickly. After computing a
chain coded bounding contour, critical points (those that reflect significant change in curvature)
are retained; these serve as the vertices of polygonal approximation for each (planar) shape. The
input shapes were approximated to ensure that significant changes in curvature are retained, while
minor distortions that could result in noise spurs in the skeleton representation are smoothed over.
This limited the occurrence of redundant edges in each representation.
3.2 Computing the Voronoi Diagram and Extracting Skeletons
Given the vertices for a polygonal approximation as determined above, we computed the Voronoi
diagram using the algorithm described earlier in Section 2. The medial axis [12] of a polygonal
shape can be derived from its Voronoi diagram by deleting those edges that arise from concave
vertices of the polygon (see Section 2.) The data structure created to store Voronoi edges contained
the information required to identify these edges. Hence it was easy to identify (and delete) edges
that emerged from concave vertices of a polygon. However, resultant medial axes are characterized
by a very large number of edges, as every vertex of the polygonal approximation (both concave
and convex) gave rise to a Voronoi edge. Many of these Voronoi edges were redundant; i.e. they
did not provide any additional structural information about the planar shape. Hence we designed
a pruning technique to delete redundant edges.
(a) Planar Shape
(b) Polygonal
Approximation
(c) Voronoi (d) Pruned
Diagram Skeleton
Figure 6: Voronoi Skeletonization of Planar Shapes.
3.3 The Pruning Operator
As shown in Figure 6c, the vertices of the initial polygonal approximation of a shape lead to a large
number of edges that do not contribute to overall shape information. In a Voronoi diagram every
Voronoi edge is a bisector of two sites on the boundary of a polygon; in particular the Voronoi
edges at the vertices of a polygon are bisectors of adjacent sites. If the sides of a polygon are
numbered in counterclockwise order along the boundary, we can define the I,,,1i .. i .i of a Voronoi
edge as follows. Let E be a Voronoi edge that is a bisector of two sites numbered i and j. Then
Adj ... i. ;.(E) = i j1. We observe that Voronoi edges that lie deep inside an object have higher
adjacency than those on the perimeter. Furthermore segments that describe "global" topological
(symmetry) relations are bisectors of high adjacency. This fact gives us a means of filtering out
unimportant segments. We simply discard edges that are of adjacency lower than some preset
threshold. Since the adjacency information is implicitly contained in our edge data structure (see
Section 2), no additional post-processing is required to obtain this reduced efficient set of edges.
We note that the authors of [23] use a similar (but not identical) technique for pruning the
edges of Voronoi diagrams. The problem in their case was more critical, since they started out
with a set of raster crack end-points along the boundary of an object and hence had an extremely
large number of Voronoi edges to process.
Figure 6 shows the result of the three step procedure described above on a planar shape of the
character 'S'. Figure 7 presents additional examples of skeletons derived using our skeletonization
technique.
3.4 Preserving Connectivity
As described in Section 1, one of the advantages of skeletons derived from Voronoi diagrams is
that we are guaranteed connected skeletons. However, what is the effect of the pruning step
on connectivity? Furthermore, how does one determine an optimal threshold for deleting the
maximum number of redundant skeletal edges without losing connectivity? Unfortunately, it is
not possible to determine a threshold value a priori and apply it to a large class of objects. However,
by constraining the threshold value to 1 (retaining only those edges that are of adjacency greater
than 1), we can prove that the resultant skeleton remains connected.
(a) (b)
(c) (d)
Figure 7: (a),(c) Sample Voronoi Diagrams, (b), (d) Corresponding pruned skeletons
Lemma 2 Pruning Voronoi edges of Ii,,.,1I .. 1, is guaranteed to preserve .,,i. 1/..,,;1
Proof: The Voronoi diagram of a polygon is a Planar Straight Line Graph [24], which by
definition, is connected. For the removal of an edge to result in a disconnected graph, the edge
must be a "bridge"; namely the edge must be part of every path between any two vertices of
the graph. It is easy to see that no edge of adjacency 1 can be a bridge, since (by definition of
adjacency) every edge of adjacency 1 terminates at a vertex of a polygon. Hence removal of all
edges of adjacency 1 is guaranteed to preserve connectivity of the resulting skeleton. m
3.5 Skeletonizing Shapes with Holes
One of the attractive features of our method is that it handles shapes with and without holes in
a uniform manner. Figure 8 shows an example of our approach applied on a holed object. No
additional initialization is required. However, when carrying out the pruning step, care must be
taken to define the adjacency information correctly, in order that the last numbered site on the
outer boundary and the first numbered site on the next hole are not considered adjacent. Shapes
with multiple holes are handled similarly.
IF
(a) Planar Shape
(c) Voronoi
Diagram
(b) Polygonal
Approximation
(d) Pruned
Skeleton
Figure 8: Skeletons of Character shapes with holes
(b)
(d)
Figure 9: (a,b): Very small edges cause noise spurs. (c,d): Smoothing contour eliminates noise
spurs.
Time Complexity Actual Running Time
Segmentation O(N log N) 2.26 sec 1
Polygonization O(B) 0.10 sec
Voronoi Skeleton O(B2) 0.81 sec
Table 1: Performance Evaluation: Running times reported per 64x64 character image.
3.6 Performance Evaluation
In this section, we briefly look at the time complexity of the entire skeletonization process. In the
discussion that follows, let N denote the size of a square image, (N = M M), where M is the
length of each row and column. Let B represent the average size of the bounding contour of each
planar shape. In Section 2, we saw that the time complexity of the Voronoi diagram algorithm
is O(B2), where B is the number of sites on the bounding contour. Existing skeletonization
algorithms have time complexities that are a function of N alone, and hence valid comparisons are
difficult. In practice, we have observed that in spite of the quadratic complexity, the algorithm is
efficient in practice, since the number of boundary points B, is generally an order of magnitude
less than N. For completeness, however, the preprocessing steps must also be taken into account.
The table below summarizes both the theoretical time complexity of each stage, as well as the
actual execution time taken on a Sparc 10 computer to compute the skeleton of each character
shape. The timing measurements are reported for each character of size 64x64, averaged over a
total of over 1,350 distinct character samples. Note that the Voronoi construction processing time
accounts for only 2".' of the total time required for skeletonization.
In the next section we show that Voronoi skeletons computed by the procedure described above,
provide a more efficient representation compared to existing skeletonization algorithms.
3.7 Comparison of Techniques
Our approach exploits the fact that the medial-axis of a polygonal shape is implicitly contained
in its Voronoi diagram. This fact immediately ensures (a) that the computed skeleton lies in R2,
(Euclidean metrics) and (b) conne. /1,'i.'l, is guaranteed. We compute the exact medial axis as
opposed to a discrete approximation. In the following paragraphs we show that these properties
provide a significant advantage over traditional skeletonization techniques.
Skeletons computed by thinning algorithms are constrained by 4 or 8 connectivity of a discrete
grid in which the object shapes are embedded [5]. The typical drawback of such approaches is
the loss of Euclidean metrics. Furthermore thinning algorithms also have to deal with redundant
edges as a result of noise artifacts on the boundary of object shapes.
An alternative approach to skeletonization involves ridge following techniques based on distance
maps computed from object shapes. The quality of the resulting skeletons is critically dependent on
the metric used to derive the distance map. Non-Euclidean metrics (city-block distance, chessboard
distance among others), lead to simple skeletonization algorithms, but can result in inaccuracy
of upto 411 with respect to Euclidean distances [13]. While methods to compute Euclidean [4]
or quasi-Euclidean [22] distance maps exist, these methods do not guarantee connectivity. Gaps
occur due to the discrete domain on which pattern shapes are embedded, and are usually filled in
by postprocessing steps.
More recently, new skeletonization algorithms [23], [13] have appeared in the literature that are
marked improvements over the traditional techniques described above. In the sequel, we briefly
describe these methods, and compare them with our approach.
In [23], the authors compute the Voronoi diagram of the set of points along the boundary of
an object shape. Two points arise with respect to the general scope of this technique: (a) A very
large number of points are required to correctly approximate object shapes; however this may not
be efficient. Further the large number of points lead to an extremely large number of redundant
Voronoi edges, necessitating the use of complex pruning techniques. (b) A more critical drawback
of this approach is that it is not easily applicable to objects with holes. In particular, it is not
possible to decide a priori, the number of sampled points on an object, necessary to ensure that
the Voronoi edges between the inner and outer boundaries will be computed.
Our approach on the other hand, overcomes both these deficiencies. The number of redundant
edges are small. Thus a simple pruning step, that involves no postprocessing suffices to eliminate
almost all spurious edges. In addition, by controlling the polygonal approximation process to
discard edges of very small length on a contour, we can eliminate the occurrence of redundant
edges that might not be deleted by the pruning step. Figures 8a and 8b, show examples where the
polygonal approximations of the character shapes contained very small length edges that resulted
in redundant edges that were not deleted by the pruning step. By ensuring that the polygonal
approximation will not contain very small length edges these spurs are eliminated in Figures 9c
and d.
Furthermore, polygons with and without holes are handled in a uniform manner. Polygons are
defined in terms of line segments, and are therefore well defined at every point p(x, y) E IR2 along
the boundary. This overcomes the drawbacks of discretizing the bounding contour as in [23].
The authors of [13] employ the snake model (an active contour model), to compute skeletons.
In their technique, initial control points are defined for the snake at curvature extrema of a shape's
bounding contour. The l, ,i .ii;'.- grassfire is then simulated by snake propagation. This technique
maintains correct Euclidean techniques by computing the Euclidean distance map, while the nature
of the contour model ensures that connectivity is maintained. However, one limitation of this
approach is the problem of computing the correct curvature in the discrete domain. Furthermore,
a very 1.i,--. d boundary implies a large number of curvature extrema leading to redundant skeleton
edges; in this case other criterion need to be applied to reduce the number of initial control
points. In addition, a special case arises when the bounding contour includes a circular arc whose
center (a) lies within the object and (b) is an end-point of a skeleton branch. In such a case,
additional control points need to be defined to ensure accurate skeleton computation. In contrast,
our technique offers a elegant and uniform approach to skeletonization that is independent of the
shapes /,,/,,,1..,,;
An implicit advantage offered by our method, is that a graph representation of the skeleton
is .',,, ,.i,/,'1 li available. Recall that the Voronoi diagram of a polygon is a planar straight line
graph, and that by Lemma 2, the pruning step maintains connectivity. Skeletonization methods
that compute the skeleton as a pixel map (as compared to an edge map representation provided
by our method), require an intermediate vectorization step before a graph representation may
be obtained. In the context of character recognition systems, our technique is amenable to the
simple extraction of structural features such as end-points and junctions by a simple traversal of
the graph comprising a skeleton.
In summary, we have shown that Voronoi skeletons are powerful shape descriptors which over-
come several significant disadvantages of existing techniques. In the next section we present results
of our character recognition system based on this novel representation.
4 Experimental Results
In this section, we describe experimental results for our method exercised on a large database of
handprinted characters.
4.1 The Database of Sample Patterns
In order to both evaluate and compare the performance of our feature extraction and represen-
tation methods, we exercised a large database of handprinted characters previously studied [8].
The database consisted of 52 distinct patterns containing over 10,000 samples, collected from 17
different writers. Table 2 summarizes the distribution of patterns in our database.
The characters included were well defined. To avoid ambiguity, authors were asked to slash the
numeral '0', to print '1' without serifs and to place horizontal bars on the letter 'I'. In addition,
only upper-case alphabets were considered, and the numeral '4' was required to be closed. Sample
characters are shown in Figure 10.
Figure 10: Samples patterns for ''0', 'O' ', ', 2', 'Z' and '4'.
Sample patterns were collected on forms and were digitized at 300dpi/8bit resolution. Each
sample was then normalized in size to 64 x 64 before processing [8].
Number of Number
Patterns Distinct of Samples
Patterns Training Testing Total
0-9 10 1,360 680 2,040
Graphics 16 2,176 1,-s 3,264
A-Z 26 3,536 1,768 5,304
All Above 52 7,072 3,536 10,608
Table 2: Distribution for database of sample patterns.
4.2 A Neural Network Classifier
The neural network used in our recognition system required that every input pattern be of constant
size. Recall that, the output of our Voronoi skeleton algorithm was defined by a set of connected
edges. Since the number of edges will vary for distinct instances of characters, we needed to ensure
that each input was represented by a uniform size. The edge set representing the skeleton of each
input pattern was rendered onto a binary image of size 16 by 16. The binary skeletons consisting
of 256 input nodes were then used to drive the neural network with distinct training and testing
samples from the 52 alphanumeric patterns in our database.
4.3 Neural Network Topology
We employed a conjugate gradient method to train a neural network [7] containing at most three
hidden nodes and 52 output nodes. The number of hidden nodes was determined experimentally
by varying the number of hidden nodes and identifying the best results. The input patterns
were preclassified according to their Euler number (the number of connected components minus
the number of holes.) This topological sorting step was performed primarily to reduce training
time. We found that training three smaller neural networks (for shapes with 0, 1 and 2 holes
respectively), took considerably less time than training a single more complex network. We note
that classifying characters according to the topology is justified on account of the constrained
nature of our database and the quality of our characters (no gaps.) When dealing with databases of
poorer quality, it may not be reasonable [27, 2] to attempt such preclassification without additional
preprocessing.
Table 3: Performance evaluation for three classification cases: [0-9], [A-Z] and [A-Z, Graphics,
0-9] for Voronoi skeleton representation.
We trained the network with three subsets of the database: (a) Numeric characters 0-9, (b)
Alphabetic characters A-Z and (c) the entire database of 52 patterns (A-Z, 0-9, and 16 graphic
characters).
Table 3 summarizes our existing experimental results. As shown, results for recognizing nu-
merals and characters were very reliable; correct recognition rates of 9 ,'s ." and 98.51' were
observed respectively.
4.4 Comparison of Performance
In the context of handwritten character recognition, it is difficult to exactly compare distinct
approaches. There are a large number of variables that can bias results. These include the
database size, the method of partitioning training and testing data sets, the kinds of errors reported
(substitution and rejection errors), the quality of characters in the database, the nature of the
character set (constrained or unconstrained) and so on. However, it is certainly meaningful to
compare methods that have been tested under similar (if not identical) testing conditions. In
[8], an earlier study by Laine et al, a neural net classifier was trained on rectangular wavelet
representations of character shapes. The database used in our present study was identical to the
one used in [8], as was the type of classifier used. A comparison of our results is provided in Table
4 and shows that Voronoi skeletons provided more reliable classification. The percentage of correct
Classification Number Error Classification
Cases of Errors Rate Rate
0-9 7 1.112., 98.98
A-Z 25 1.41 98.5!',
A-Z,
Graphics, 83 2.34 97.11,',
0-9
Performance Evaluation
Voronoi Skeletons Rectangular Wavelets
Classification Number Error Number Error
Cases of Errors Rate of Errors Rate
0-9 7 1.ii" 11 1.61 ,
A-Z 25 1.41, 26 1.47'.
A-Z,
Graphics, 83 2.34 93 2.1 .1
0-9
Table 4: Performance evaluation for three classification cases: [0-9], [A-Z] and [A-Z, Graphics,
0-9] for Voronoi skeletons and rectangular wavelet representations.
classification is comparable to those obtained by some of the leading practitioners in this area. In
[18], the authors extract structural features from the skeleton and uses 11 specialized modules to
come to a decision on the identity of patterns. The authors of [10] also use primitives derived from
skeletons for recognition. Both of the last two methods were tested on the same database (US ZIP
code database of CENPARMI (Concordia University); the training and testing sets consisted of
4000 and 2000 numerals respectively. Correct classification rates of i" 11 '' [18] and 93.1 [10] were
observed. In [31], a multi-expert system is described which combines the results of four techniques.
These include the last two methods described above, a structural technique that extracts features
from the contours of characters, and a statistical approach. The combined system (tested on the
same database) reported a correct classification rate of 93.0I' but had a zero substitution rate
to achieve 1011' reliability. Srihari [29], presents several different approaches ranging from stroke
based recognizers to structural contour based chain code classifiers. The classifiers were based on
different methodologies: statistical, structural, and syntactic. Each approach was trained on a
set of 18,468 digits and tested on a set of 2,711 digits with correct classification results ranging
from 83.1' to 96.4 The data was derived from handwritten addresses obtained from the US
Postal Service. Le Cun et al. [11] achieved good results with a back propagation neural network
using size-normalized images as direct input. With a training and testing set size of 7291 and 2007
handwritten digits respectively, the system achieved a correct classification result of '"-,
5 Conclusions and Summary
In this paper we have presented a new skeletonization algorithm derived from Voronoi diagrams.
Our approach relies on a new method for constructing Voronoi diagrams of polygons that is ro-
bust, easy to implement and attractive for practical applications. We have shown that Voronoi
skeletons are connected, accurate (characterized by Euclidean metrics), and are superior to skele-
tons obtained by existing thinning algorithms. Redundant edges were deleted in a pruning step,
that was guaranteed to preserve connectivity of each skeleton. The technique implicitly provides a
graph representation that enables the simple extraction of end points and junctions. Feature vec-
tors consisting of Voronoi skeletons of character shapes were shown to be well suited for character
recognition. Correct classification rates of 97.1i' ., 98.,1-.' and 95 '- for the entire database,
characters 'A'-'Z', and numeric characters '0'-'9' respectively, were obtained, when a neural net-
work was trained with Voronoi skeleton representations of the character shapes. These results
i- i.-. -1 that Voronoi skeletons can provide efficient shape descriptors for character recognition.
Our future research efforts will be directed towards building a better and more robust character
recognition system. Specifically, we shall work with larger databases (such as the NIST database),
and test our representation using unconstrained characters. For character recognition systems to
be practically viable, it is essential that error rates be split into substitution errors and rejection
errors [27]. Modifying our existing recognition system to incorporate such distinctions is another
area that will receive our attention in the near future. Finally, our skeletonization algorithm
makes it easy to extract structural information such as end-points and junctions that are used in
many syntactic recognition schemes. Since the skeleton is output in the form of a planar straight
line graph, a single graph traversal is sufficient to determine end points and junctions. We shall
investigate whether syntactic character recognition schemes based on Voronoi skeletons would
provide efficient representations of more complex patterns.
References
[1] F. Aurenhammer [1991], "Voronoi diagrams a survey of a fundamental geometric data
structure," ACM Computing Surveys, Vol. 23, 345-405.
[2] H. S. Baird [1993], "Recognition technology froni Pattern Recognition Letters Vol 14,
327-334.
[3] H. Blum, [1967] "A transformation for extracting new descriptors of shape," Models for the
Perception of Speech and Visual Form (W. Wathen-Dunn, ed.), Cambridge MA: MIT Press.
[4] [1'' ii] P. E. Danielsson, "Euclidean distance mapping," Comput. Graphics Image Processing,
Vol 14, 227-248.
[5] [1981] E. R. Davies and A. P. N. Plummer, "Thinning algorithms: a critique and a new
methodology," Pattern Recognition, Vol. 14, 53-63.
[6] R. C. Gonzalez and R. E. Woods[1992], "Digital Image Processing", Addison-Wesley.
[7] B. Kalman [1990], "Super linear learning in back propagation neural nets," Dept. of Computer
Science Technical Report WUCS-90-21, Washinton University, St. Louis.
[8] A. Laine, S. Schuler, and V. Girish [1993], "Wavelet representations for recognizing complex
annotations," Machine Vision and Applications Vol 6, 110-123.
[9] A. Laine, W. E. Ball and Arun Kumar [1991], "A Multi-Scale Approach for Recognizing
Complex Annotations in Engineering Drawings," Proceedings of the IEEE CVPR, Lahaina,
Hawaii.
[10] Lam, L., S. W. Lee and C. Y. Suen [1'""], "Structural classification and relaxation matching
of totally unconstrained handwritten zip=code numbers", Pattern Recognition Vol 21(1), 19
31.
[11] Le Cun Y., B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, L. D. Jackel,
and H. S. Baird [1990], "Constrained neural network for unconstrained handwritten digit
recognition", Proc. International Workshop on Frontiers in Handwriting Recognition, Con-
cordia University, Montreal, April 1990, 145-154.
[12] D. T. Lee [1'- '], 11. .l.,i Axis Transformation of a Planar Shape," IEEE Trans. Patt. Anal.
Machine Intell. PAMI-4, No. 4, 363-369.
[13] F. Leymarie and M. D. Levine [1992], "Simulating the grass-fire transform using an active
contour model," IEEE Trans. Patt. Anal. Machine Intell. PAMI-14 No 1, 56-75.
[14] D. C. Marr and E. Hildreth,[1'1-ii] "Theory of edge detection," Proc. R...,,J Soc. London., B
207, 187-217.
[15] N. Mayya and V. T. Rajan [1994] "Voronoi Diagrams of Polygons: A Framework for Shape
Representation," Proceedings of IEEE CVPR 1994, Seattle, Washington (to appear).
[16] N. Mayya and V. T. Rajan [1993] "Voronoi Diagrams of Polygons: A Framework for Shape
Representation," IBM Research Report RC 191-_' 11/23/93.
[17] N. Mayya and V. T. Rajan [1993] "An Efficient Shape Representation Scheme using Voronoi
Skeleli- ", IBM Research Report RC 19161, 09/14/93.
[18] Nadal C. and C. Y. Suen [1'l"], "Recognition of totally unconstrained handwritten digit by
decomposition and vectorisation," Technical Report, Concordia University, Montreal.
[19] Siavash N. Meshkat & Constantine M. Sakkas [1987], "Voronoi Diagram for Multiply Con-
nected Polygonal Domains II: Implementation and Application," IBM J. Res. Develop. 31,
No. 3, 373-381.
[20] S. Mori, C. Y. Suen and K. Yamamoto [1992], "Historical Review of OCR Research and
Development," Proceedings of the IEEE, Vol 80, No. 7. 1029-1058.
[21] U. Montanari [1969], "Continuous Skeletons from Digitized Images," JACM, 16, No. 4, 534-
549.
[22] U. Montanari [1968], "A method for obtaining skeletons using a quasi-Euclidean distance,"
J. Assoc. Comput. Machinery, Vol. 15, 60-624.
[23] R. Ogniewicz and M. Ilg [1992], "Voronoi Skeletons: Theory and Appli. .l;i ,,." Proc. ICPR,
pp 63-69.
[24] Vijay Srinivasan & Lee R. Nackman [1987], "Voronoi diagram for multiply connected polyg-
onal domains I: Algorithm," IBM J. Res. Develop., 31, No 3, 361-372.
[25] V. Srinivasan, L.R. Nackman, J. Tang and S.N. Meshkat [1992], "Automatic Mesh Generation
using the Symmetric Axis Transformation of Polygonal Domains," Proceedings of the IEEE,
Vol. 80, No. 9, 14S- 1501.
[26] Theo Pavlidis [1'-'2], "Algorithms for Graphics & Image Processing", Computer Science
Press.
[27] Theo Pavlidis [1993], "Recognition of printed text under realistic condil. *i,", Pattern Recog-
nition Letters, Vol. 14 317-326.
2-] F. P. Preparata and M. I. Shamos [l-'r], Computational Geometry-An Introduction, Springer
Verlag, New York.
[29] Sargur N. Srihari [1993], "Recognition of handwritten and machine-printed text for postal
address interpretation", Pattern Recognition Letters, Vol 14, 291-302.
[30] C. Y. Suen, R. Legault, C. Nadal, M. Cheriet and L. Lam [1993], "Building a new generation
of handwriting recognition systems," Pattern Recognition Letters Vol 14, 303-315.
[31] C. Y. Suen, R. Legault, C. Nadal, M. Cheriet and L. Lam [1992], "Computer Recognition of
Unconstrained Handwritten Numerals," Proceedings of the IEEE VO1 80, No 7, 1162-1180.
|