Citation
Knowledge base for consultation and image interpretation

Material Information

Title:
Knowledge base for consultation and image interpretation
Creator:
Cheng, Ming-Chieh, 1948-
Publication Date:
Language:
English
Physical Description:
vii, 230 leaves : illustrations ; 28 cm

Subjects

Subjects / Keywords:
Agriculture ( jstor )
Circuit diagrams ( jstor )
Coordinate systems ( jstor )
Knowledge bases ( jstor )
Knowledge representation ( jstor )
Line segments ( jstor )
Mathematical vectors ( jstor )
Pixels ( jstor )
Symbolism ( jstor )
Symbols ( jstor )
APRIKS (Electronic computer system) ( lcsh )
Image processing ( lcsh )
Pattern perception ( lcsh )
Pattern recognition systems ( lcsh )
Genre:
bibliography ( marcgt )
theses ( marcgt )
non-fiction ( marcgt )

Notes

Bibliography:
Includes bibliographical references (leaves 223-229).
General Note:
Typescript.
General Note:
Vita.
General Note:
REPL*
Statement of Responsibility:
by Ming-Chieh Cheng.

Record Information

Source Institution:
University of Florida
Holding Location:
University of Florida
Rights Management:
The University of Florida George A. Smathers Libraries respect the intellectual property rights of others and do not claim any copyright interest in this item. This item may be protected by copyright but is made available here under a claim of fair use (17 U.S.C. §107) for non-profit research and educational purposes. Users of this work have responsibility for determining copyright status prior to reusing, publishing or reproducing this item for purposes other than what is allowed by fair use or other copyright exemptions. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder. The Smathers Libraries would like to learn more about this item and invite individuals or organizations to contact the RDS coordinator (ufdissertations@uflib.ufl.edu) with any additional information they can provide.
Resource Identifier:
11795739 ( OCLC )
ocm11795739

Downloads

This item has the following downloads:


Full Text














KNOWLEDGE BASE FOR CONSULTATION AND IMAGE INTERPRETATION


BY




MING-CHIEH CHENG













A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY






UNIVERSITY OF FLORIDA


1983

















TO MY PARENTS















ACKNOWLEDGMENTS

The author expresses his deep appreciation to the chairman of his supervisory committee, Dr. Julius T. Tou, for his guidance and encouragement during the course of the work presented in this dissertation. The author also thanks the other members of his supervisory committee, Dr. John Staudhammer, Dr. Jack R. Smith, Dr. Leslie H. Oliver, and Dr. Gerhard Ritter, for their friendly help and for their participation on the committee, with special thanks to Dr. John Staudhammer for extra assistance and guidance.

The author gratefully acknowledges the stimulating discussions with his colleagues and friends, Ms. Dragana Brzakovic and Mr. Malek Adjouadi, and the great help of Ms. Patricia Lindsay in preparing this dissertation. Thanks are also extended to Ms. Carole Boone for her excellent typing work.

Special thanks are extended for the financial support of the Center for Information Research under the research projects APRIKS from the Kellogg Foundation, KUTE from the National Science Foundation, ATI from Florida State, AUTORED from the National Science Foundation, and the Center-of-Excellence Program from the University of Florida.

Last, but not least, the author thanks his wife for her patience and encouragement through his graduate school career.


iii
















TABLE OF CONTENTS

ACKNOWLEDGMENTS ....................................................... iii

ABSTRACT ............................................................... vi

CHAPTER                                                              Page

1 INTRODUCTION .......................................................... 1

2 IMAGE PROCESSING TECHNIQUES AND
  KNOWLEDGE REPRESENTATION .............................................. 3

  2.1 Introduction ...................................................... 3
  2.2 Image Segmentation ................................................ 3
      2.2.1 Edge Detection .............................................. 4
      2.2.2 Smoothing ................................................... 4
      2.2.3 Automatic Threshold Selection ............................... 5
      2.2.4 Gap Filling ................................................. 7
  2.3 Shape Analysis .................................................... 9
      2.3.1 Chain Coding ................................................ 9
      2.3.2 Fourier Descriptor ......................................... 10
      2.3.3 Extension of FD to a Non-Closed Curve ...................... 12
      2.3.4 Expression of Fourier Coefficients
            in Terms of Chain Codes .................................... 14
      2.3.5 Predictive Searching for Chain Encoding .................... 17
  2.4 Decision Criterion ............................................... 20
  2.5 Knowledge Representation ......................................... 21
  2.6 Knowledge-Based System ........................................... 24

3 DESIGN OF A KNOWLEDGE-BASED EXPERT SYSTEM FOR
  APPLICATION IN AGRICULTURE ........................................... 28

  3.1 Introduction ..................................................... 28
  3.2 Design Concepts for Knowledge-Based
      Expert Systems ................................................... 31
  3.3 Knowledge Representation ......................................... 33
  3.4 Knowledge Base Generation ........................................ 37
  3.5 Data Structure for the APRIKS System ............................. 46
  3.6 Knowledge-Seeking Strategies ..................................... 53
  3.7 Experimental Results ............................................. 63
  3.8 Conclusion ....................................................... 63


iv










4 FALSIFIED DOCUMENT DETECTION AND FONT IDENTIFICATION ................. 68

  4.1 Introduction ..................................................... 68
  4.2 Noise Background ................................................. 70
  4.3 Design Methodology ............................................... 71
      4.3.1 Document Data Acquisition .................................. 71
            4.3.1.1 Scanning, Reflection and Windowing ................. 71
            4.3.1.2 Image Preprocessing ................................ 71
            4.3.1.3 Adaptive Threshold Determination
                    and Binary Picture Generation ...................... 74
            4.3.1.4 Binary Filtering (Spur Removal
                    and Gap Filling) ................................... 78
            4.3.1.5 Character Isolation ................................ 83
      4.3.2 Character Recognition and Grouping ......................... 95
            4.3.2.1 Correlation Technique .............................. 95
            4.3.2.2 Feature Pattern Matching .......................... 101
      4.3.3 Falsification Detection ................................... 106
            4.3.3.1 Alignment Analysis ................................ 110
            4.3.3.2 Shape Analysis .................................... 114
            4.3.3.3 Intensity Change Analysis ......................... 119
      4.3.4 Type-Font Identification .................................. 125
      4.3.5 Knowledge Base Design ..................................... 133
  4.4 Discussion ...................................................... 139

5 COMPUTER RECOGNITION OF ELECTRONIC CIRCUIT DIAGRAMS ................. 142

  5.1 Introduction .................................................... 142
  5.2 Analysis of Electronic Circuit Diagram .......................... 143
  5.3 System Architecture ............................................. 150
  5.4 Multi-Pass Pattern Extraction ................................... 156
      5.4.1 Extraction of Junction Dots ............................... 157
      5.4.2 Extraction of Horizontal Connecting
            Line Segments ............................................. 169
      5.4.3 Extraction of Functional Elements ......................... 173
      5.4.4 Denotation Recognition .................................... 190
      5.4.5 Reconstruction of Rectangular Shape Elements .............. 198
      5.4.6 Processing of Unrecognizable Page ......................... 200
  5.5 Pictorial Manipulation Language ................................. 200
      5.5.1 Symbol Description Language (SDL) ......................... 202
      5.5.2 Picture Generation Language (PGL) ......................... 207
  5.6 Knowledge Base Configuration .................................... 214
  5.7 Discussion ...................................................... 215

6 CONCLUSION .......................................................... 218

  6.1 Summary ......................................................... 218
  6.2 Areas for Future Work ........................................... 221

REFERENCES ............................................................ 223

BIOGRAPHICAL SKETCH ................................................... 230


v














Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy

KNOWLEDGE BASE FOR CONSULTATION AND IMAGE INTERPRETATION

By

Ming-Chieh Cheng

December 1983

Chairman: Dr. Julius T. Tou
Major Department: Electrical Engineering

An entity-attribute relationship associated with a certainty factor is proposed to represent knowledge. To support this knowledge representation, we propose a modified top-down approach to generate the knowledge-based system. To improve the user interface, we further propose three operation modes (information retrieval, decision-making, and question answering) for accessing the system through either menu selection input or simple natural language input. A working system, APRIKS, is established for agricultural pest control and some other applications.

We further integrate image processing techniques and pattern recognition principles with a knowledge-based system to form a pictorial knowledge-based system that conducts falsified document detection and font identification, and electronic circuit diagram recognition and interpretation. To describe the interrelationships among the functional elements of a circuit diagram, we propose two pictorial manipulation languages (a symbol description language and a picture generation language)


vi









using the concept of the associative network. Finally, we propose conversion rules to link the electronic circuit recognition system with the SPICE package to enhance the system's capability. This link demonstrates that the pictorial knowledge-based system can be integrated with current CAD machines to make diagnoses and reduce manpower.


vii


















CHAPTER 1
INTRODUCTION

Many researchers have worked on object recognition within the past decade. Typical applications are optical character recognition (OCR) [1]-[4], industrial parts inspection and assembly [5]-[9], target detection and identification [10]-[12], and agricultural remote sensing [13]-[14]. The techniques employed in these studies include statistical pattern recognition, chain code correlation, moment invariants, Fourier descriptors, and syntactic (structural) pattern recognition. However, all of these approaches are successful only in their respective domains because they lack the assistance of a knowledge base. A knowledge-based system integrates all of these techniques and performs hypothesis tests to select the necessary tasks automatically. Thus the knowledge base makes the recognition system efficient and cost effective.

In Chapter 2, we discuss image processing techniques and knowledge representation. The image processing techniques include image segmentation and shape analysis. The image segmentation used consists of edge detection, smoothing, threshold selection, and gap filling. For shape analysis, the 2D object boundary is represented by chain coding and linked with the Fourier descriptor to compute four shape measures (SF1 to SF4) as extracted features. Then we investigate knowledge representation techniques and propose a method which represents the











knowledge by the entity-attribute relationship associated with a certainty factor or a conditional probability.

In Chapter 3 we illustrate knowledge base design by using the APRIKS system as an example. The modified top-down approach is proposed to generate the knowledge-based system. The APRIKS system performs three operation modes (information retrieval mode, decision-making mode, and question-answering mode) and improves the user interface through menu selection input or simple natural language input.

In Chapters 4 and 5 we integrate the knowledge-based system with the image processing techniques to form a pictorial knowledge-based system which performs falsified document detection and font identification, and electronic circuit diagram recognition and interpretation. Two pictorial manipulation languages, a symbol description language and a picture generation language, are proposed to interpret circuit diagrams. Conversion rules to the SPICE package are proposed to show that the knowledge base can be integrated with current CAD machines to enhance the system's potential. Finally, in Chapter 6, we present some concluding remarks and project future directions of this important area of image understanding.


















CHAPTER 2
IMAGE PROCESSING TECHNIQUES AND KNOWLEDGE REPRESENTATION



2.1 Introduction

In the first part of this chapter we investigate existing image processing techniques and tailor them to suit our system requirements. The image processing techniques we adopt are image segmentation and shape analysis. Secondly, we discuss the decision function of pattern recognition theory. Finally, we study knowledge representation methods and integrate a knowledge-based system.



2.2 Image Segmentation

Image segmentation is the division of an image into different regions, each having certain properties. It is the first step of image analysis, which aims at either a description of an image or a classification of the image if a class label is meaningful. Moreover, it is a critical component of an image recognition system because errors in segmentation may propagate to the feature extraction and classification stages.

During the past decade, many image segmentation techniques have been proposed; they can be categorized into three classes: (1) characteristic feature extraction or clustering, (2) edge detection, and (3) region extraction. For a comprehensive discussion, see Fu and Mui [15].











In the following sections, we discuss and modify some of the image segmentation techniques which could be utilized in our design.

2.2.1 Edge Detection

The intent here is to enhance the image by increasing the values of those pixels along the boundaries (edges) of the objects. Edges are detected between regions of different intensity. To detect edges by mask matching, we convolve a set of difference-operator-like masks, in various orientations, with the picture. The mask giving the highest value at a given point determines the edge orientation at that point, and that value determines the edge strength. The masks used in our system are the generalized Sobel operators [16] with constant factor 1/4, as shown below.



 1  2  1      2  1  0      1  0 -1      0 -1 -2
 0  0  0      1  0 -1      2  0 -2      1  0 -1
-1 -2 -1      0 -1 -2      1  0 -1      2  1  0

-1 -2 -1     -2 -1  0     -1  0  1      0  1  2
 0  0  0     -1  0  1     -2  0  2     -1  0  1
 1  2  1      0  1  2     -1  0  1     -2 -1  0
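As a rough illustrative sketch (not code from the dissertation), the mask-matching step can be written in Python. The `MASKS` table transcribes the eight generalized Sobel operators above; the function name and the row-major image layout are assumptions for illustration.

```python
# Edge detection by mask matching: apply eight directional masks
# (generalized Sobel operators, constant factor 1/4) at a pixel and keep
# the strongest response (edge strength) and the winning mask's index
# (edge orientation).

MASKS = [
    [[ 1,  2,  1], [ 0,  0,  0], [-1, -2, -1]],
    [[ 2,  1,  0], [ 1,  0, -1], [ 0, -1, -2]],
    [[ 1,  0, -1], [ 2,  0, -2], [ 1,  0, -1]],
    [[ 0, -1, -2], [ 1,  0, -1], [ 2,  1,  0]],
    [[-1, -2, -1], [ 0,  0,  0], [ 1,  2,  1]],
    [[-2, -1,  0], [-1,  0,  1], [ 0,  1,  2]],
    [[-1,  0,  1], [-2,  0,  2], [-1,  0,  1]],
    [[ 0,  1,  2], [-1,  0,  1], [-2, -1,  0]],
]

def edge_response(image, x, y):
    """Return (strength, orientation_index) at interior pixel (x, y)."""
    best, best_k = 0.0, 0
    for k, mask in enumerate(MASKS):
        s = sum(mask[i][j] * image[x - 1 + i][y - 1 + j]
                for i in range(3) for j in range(3)) / 4.0  # factor 1/4
        if s > best:
            best, best_k = s, k
    return best, best_k
```

On a vertical step edge the mask oriented along the edge wins, and the response equals one quarter of the raw convolution sum.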


2.2.2 Smoothing

The object of smoothing is to "remove" isolated edges corresponding to noise in the background, and to "insert" more edges along the object boundaries. A powerful smoothing technique that does not blur edges is median filtering [17], in which we replace the value at a point by the median of the values in a neighborhood of the point. For a 3x3 neighborhood, we use the fifth largest value. Since median filtering does not blur edges, it can be reiterated. However, a problem









with two-dimensional median filtering is that it destroys thin lines as well as isolated points, and it also "clips" corners.
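A minimal sketch of the 3x3 median filtering described above (the function name and the choice to leave border pixels unchanged are illustrative assumptions, not the dissertation's implementation):

```python
def median_filter3(image):
    """3x3 median filtering: replace each interior pixel by the fifth
    largest (i.e. the median) of the nine values in its neighborhood.
    Border pixels are left unchanged in this sketch."""
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for x in range(1, h - 1):
        for y in range(1, w - 1):
            window = sorted(image[x + i][y + j]
                            for i in (-1, 0, 1) for j in (-1, 0, 1))
            out[x][y] = window[4]  # fifth largest of nine values
    return out
```

An isolated noise spike is removed while a straight step edge is preserved, which is exactly the behavior the text attributes to median filtering.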

Another edge-preserving smoothing method, using discrete bar masks, was proposed by Nagao and Matsuyama [18]. The discrete bar masks are shown in Figure 2.1. The procedure of the edge-preserving smoothing is as follows:

Step 1: Examine the set of a point (X,Y)'s neighborhoods which are covered by the discrete bar masks.

Step 2: Detect the position of the mask where its gray-level variance is minimum.

Step 3: Give the average gray-level of the mask at the selected position to the point (X,Y).

Step 4: Apply Steps 1-3 to all points in the picture.

Step 5: Iterate the above process until the gray-levels of all points in the picture do not change.

This smoothing method takes the average over a certain region, so it too will destroy fine lines. The smoothing techniques above cannot be used for the character recognition in Chapter 4 or the circuit diagram recognition in Chapter 5 because, if the characters or the line segments are very thin, these line segments will be lost.

2.2.3 Automatic Threshold Selection

To conduct binary picture generation, we segment the gray-level image by thresholding. That is, the pixels greater than or equal to a threshold T are set to 1, and all other pixels are set to 0:







[Figure 2.1 The Discrete Bar Masks for the Edge-Preserving Smoothing: bar masks about the center point at orientations 0°, 45°, 90°, 135°, 180°, 225°, 270°, and 315°]


$$f_T(x,y) = \begin{cases} 1, & \text{if } f(x,y) \ge T \\ 0, & \text{if } f(x,y) < T \end{cases}$$



In the ideal case, the histogram of the image has a deep, sharp valley between two peaks representing object and background, respectively, so that the threshold can be chosen at the bottom of this valley. However, for most real pictures it is often difficult to detect the valley bottom precisely, especially in cases where the valley is flat and broad, imbued with noise, or where the two peaks are extremely unequal in height, all of which make the valley untraceable. Some techniques have been proposed to overcome these difficulties [19]-[21].

Using clustering techniques, we develop a thresholding method which chooses a threshold based upon the maximum interset variance criterion. Our approach is similar to Otsu's method [21]. The theoretical development of our approach is given in Chapter 4.

2.2.4 Gap Filling

We now wish to assure that the boundaries of all objects are indeed closed in the binary image. This is done by connecting any two pixels within three pixels of each other (see Figure 2.2). Given two pixels at $(x_0, y_0)$ and $(x_1, y_1)$, the distance $d_0$ and the angle $\theta_0$ are

$$d_0 = \left[(x_1 - x_0)^2 + (y_1 - y_0)^2\right]^{1/2}$$

$$\theta_0 = \tan^{-1}\left[(y_1 - y_0)/(x_1 - x_0)\right]$$















[Figure 2.2 Spatial Configuration of Pairs of Points and the Manner in Which They Are To Be Connected — shaded area: gap-filling area between $(x_0, y_0)$ and $(x_1, y_1)$; filled symbols: extent points; triangles: points to be filled in]


If $d_0 \le 3$, then the area with $y_0 \le y \le y_1$ and $0 \le \theta \le \theta_0$ will be filled in. That is, the pixels lying in the gap-filling area are changed to "1." However, after gap filling a side effect is noted: the edges of the object become "fat." This phenomenon deforms the shape of an object.
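The distance and angle test for a candidate pixel pair can be sketched as below; the function names and the use of `atan2`/`hypot` are illustrative assumptions, not the dissertation's code.

```python
import math

def gap_fill_params(x0, y0, x1, y1):
    """Distance d0 and angle theta0 between two boundary pixels, as in
    the formulas above."""
    d0 = math.hypot(x1 - x0, y1 - y0)
    theta0 = math.atan2(y1 - y0, x1 - x0)
    return d0, theta0

def should_fill(x0, y0, x1, y1, max_gap=3):
    """Pairs no farther apart than max_gap pixels are bridged."""
    d0, _ = gap_fill_params(x0, y0, x1, y1)
    return d0 <= max_gap
```

Diagonal neighbors two pixels apart qualify for bridging; pixels four apart do not.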



2.3 Shape Analysis

To recognize and interpret an object, we need to extract the discriminatory features of the object and perform detection and classification. Common features are area, perimeter, compactness, convexity measure, aspect ratio, gray-level statistics (mean and variance), and invariant moments (M1 to M48) [22]-[23]. However, some feature properties are highly desirable, in particular invariance with respect to size, orientation, and position. A spatial attribute which has these properties is the concept of shape. Two regions are said to have the same shape if they are congruent after applying the operations of translation, rotation, and scale change. In this section, we discuss chain coding and focus on the shape analysis algorithm. The Fourier descriptor is a candidate for shape analysis.

2.3.1 Chain Coding

Using the standard grid-intersection digitization scheme, a curve (or boundary) can be represented by its chain code. In this representation, a curve is specified, relative to a given starting point, as a sequence of integers representing vectors whose slopes are multiples of 45°, and whose lengths are $\sqrt{2}$ (if diagonal) and 1 (if








horizontal or vertical). The chain code notation is shown in Figure 2.3.

In mathematical form, a chain code element can be expressed as

$$\lambda_k \exp\left[j\frac{\pi}{4}c_k\right] \qquad (2.1)$$

where

$$\lambda_k = \begin{cases} 1, & c_k = 0, 2, 4, 6 \\ \sqrt{2}, & c_k = 1, 3, 5, 7 \end{cases}$$

Chain codes can provide a compact representation of regions. Freeman [24] proposed a correlation scheme for chain-coded curves. If $c_1, c_2, \ldots, c_n$ is one chain code, $d_1, d_2, \ldots, d_m$ is another chain code, and $m < n$, then the chain-code correlation $C(j)$ of $d$ at $c_j$ is

$$C(j) = \frac{1}{m}\sum_{i=1}^{m}\cos\left[\frac{\pi}{4}\left((d_i - c_{j+i}) \bmod 8\right)\right] \qquad (2.2)$$
Chain-code matching is computationally efficient, but it cannot be considered a general tool for shape matching since it is not rotation invariant, is very sensitive to local changes in the number of chain elements, and is also quite sensitive to small global changes in scale. To overcome these drawbacks, we combine chain coding with the Fourier descriptor for shape analysis.

2.3.2 Fourier Descriptor

Two types of Fourier descriptors (FD) are used for shape description. One is the Fourier transform of a boundary expressed in terms of tangent angle versus arc length [25]. The other uses the complex function of a boundary [26]-[27].






[Figure 2.3 Chain Code of a Contour C — starting point marked; chain code: 556667021001224443]









A closed contour C as in Figure 2.4 is expressed as a complex function $u(t)$, where $u(t) = x(t) + jy(t)$. The FD of C is defined as

$$a_n = \frac{1}{L}\int_0^L u(t)\exp[-jn2\pi t/L]\,dt \qquad (2.3)$$

(L: the total length of the contour). The curve $u(t)$ can be expressed by the $a_n$ as

$$u(t) = \sum_{n=-\infty}^{\infty} a_n\exp[jn2\pi t/L] \qquad (2.4)$$

For brevity, we assume that $L = 2\pi$. Formulas (2.3) and (2.4) then simplify to

$$a_n = \frac{1}{2\pi}\int_0^{2\pi} u(t)\exp[-jnt]\,dt \qquad (2.5)$$

and

$$u(t) = \sum_{n=-\infty}^{\infty} a_n\exp[jnt] \qquad (2.6)$$

The FD has the properties of translation, rotation, and scale invariance [26].

2.3.3 Extension of FD to a Non-Closed Curve

From the previous analysis, we know that the FD is efficient only for a closed contour. But in the real world, a closed contour is not always easy to acquire, due to poor imagery or a partial view. Thus, extending the FD to a non-closed curve is very important.

Consider a segment as a closed curve (refer to Figure 2.5) in the following way:

(a) Take one end point of the segment as the starting point.

(b) Trace the segment in the counterclockwise direction to the other end.

(c) Retrace the segment to the starting point in the clockwise direction.







[Figure 2.4 An Example of a Contour Function — contour traced in the counterclockwise direction in the complex (Re, Im) plane]

[Figure 2.5 An Example of the FD of a Curve Segment — the starting and end point coincide]







From this modification, we can apply the FD to any arbitrary curve whether it is closed or not.
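The out-and-back construction above can be sketched as a one-line Python helper (an illustrative assumption, not the dissertation's code):

```python
def close_curve(points):
    """Turn an open curve into a closed traversal by tracing it to the
    far end and then retracing it back toward the start, as in steps
    (a)-(c) above. The starting point is not duplicated at the end,
    since the parameterization is cyclic."""
    return points + points[-2:0:-1]
```

For a four-point segment the closed traversal visits the interior points twice, once in each direction.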

2.3.4 Expression of Fourier Coefficients in Terms of Chain Codes

In this section, we want to compute the FD of the contour in terms of its chain code sequences. Let the grid spacing f or chain code be

equivalent to the unit of coordinate system and let 919. mbe the

chain code string of a contour C. This chain code string should keep

the interior region enclosed by C on the left (see Figure 2.3). Let P

be the perimeter of the contour C, and let 0 =to < tj < <.. < l <

t M =2w be a partition of [0,2N].

From equation (2.5), we get

a =- f 2rexp(-jnt)du(t)
n 2nwrj 0



2nwrj m [u~t) - u(tM71)]exp(-jntM) (2.7)
From equation (2.1), the chain code sequence gk can be expressed as

Ukexp[J2k

where 1 if Ck is even



v(2 if C kis odd , k =1,.M


Further, by approximation, we obtain
m
P M I -k(2.8) mk=l

M
P X k (2.9)
k=l







15


21EP m M
t m P 7 Z/I '.' (2.10)
k=l 1~
and

U(tm)-u(tmi I exp(~m "E M = ,. (2.11)



Substituting (2.10) and (2.11) into (2.7), we obtain


M In M
a 1 1 exp[j(ng - 2nir n ) 1, 2n,
n =2nicj m1 k=1 '/I k
(2.12)



For the coefficient a0



1 2
0



M7 1



M rn-i M
u(t0) - I2 Y. ( k)exp[i~ (2.13)



The coefficient a 0 represents the position of the shape center of the contour C. Therefore equation (2.13) can be used to compute the

shape center asymptotically. If we shift the shape center-to the origin of the new coordinates, then the Fourier expression of the contour C can be rewritten as

S[a nexp(jnt) - an exp(-jnt)], 0 < t < 2n (2.14)
n1
To describe the shape feat-ores, we propose the-following measures:









(a) Circularity

$$SF1 = \frac{|a_1| + |a_{-1}|}{\sum_{n=1}^{\infty}\left(|a_n| + |a_{-n}|\right)} \qquad (2.15)$$

SF1 = 1 when C is a circle and 0 < SF1 < 1 otherwise.

(b) Elongatedness

$$SF2 = \frac{\text{short semi-axis}}{\text{long semi-axis}} = \frac{|a_1| - |a_{-1}|}{|a_1| + |a_{-1}|} \qquad (2.16)$$

SF2 = 1 when C is a circle and 0 < SF2 < 1 otherwise.

(c) Complexity

$$SF3 = \frac{P^2}{4\pi A} = \frac{\left(\sum_{k=1}^{M}\lambda_k\right)^2}{4\pi^2\sum_{n=1}^{\infty} n\left(|a_n|^2 - |a_{-n}|^2\right)} \qquad (2.17)$$

where

$$\text{area } A = \frac{1}{2}\,\mathrm{Re}\left[\int_0^{2\pi}\overline{u(t)}\,ju'(t)\,dt\right] = \pi\sum_{n=1}^{\infty} n\left(|a_n|^2 - |a_{-n}|^2\right)$$

$$\text{perimeter } P = \sum_{k=1}^{M}\lambda_k$$

SF3 = 1 if C is a circle and SF3 > 1 otherwise.

(d) Convexity

The curvature of C can be written as

$$\kappa = \frac{x'y'' - x''y'}{\left[(x')^2 + (y')^2\right]^{3/2}}$$

where $t = 2\pi p/P$ increases in the counterclockwise sense, so that $\kappa = (\text{const.})(x'y'' - x''y')/P^3$, and

$$x'y'' - x''y' = \mathrm{Re}\left[j\,u'(t)\overline{u''(t)}\right] = \mathrm{Re}\left[\sum_m\sum_n n^2m\,a_m\overline{a_n}\exp\left(j(m-n)t\right)\right] = \sum_n n^3|a_n|^2 \quad \text{for } m = n$$

Now, we define

$$SF4 = \frac{\sum_{n=1}^{\infty} n^3\left(|a_n|^2 - |a_{-n}|^2\right)}{\sum_{n=1}^{\infty} n\left(|a_n|^2 - |a_{-n}|^2\right)} \qquad (2.18)$$

SF4 = 1 when C is a circle, and SF4 > 1 when C has more convexity.
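For illustration, SF1 and SF2 can be computed directly from the Fourier coefficients; the function names and the dictionary representation of the $a_n$ are assumptions, not the dissertation's code.

```python
def circularity(coeffs):
    """SF1: (|a_1| + |a_-1|) over the total sum of |a_n| + |a_-n| for
    n >= 1. `coeffs` maps the index n to the complex coefficient a_n;
    missing indices are treated as zero."""
    num = abs(coeffs.get(1, 0)) + abs(coeffs.get(-1, 0))
    den = sum(abs(coeffs.get(n, 0)) + abs(coeffs.get(-n, 0))
              for n in range(1, max(abs(k) for k in coeffs) + 1))
    return num / den

def elongatedness(coeffs):
    """SF2: ratio of short to long semi-axis of the n = 1 ellipse."""
    a1, a_1 = abs(coeffs.get(1, 0)), abs(coeffs.get(-1, 0))
    return (a1 - a_1) / (a1 + a_1)
```

A circle has only the $a_1$ coefficient, so both measures are exactly 1; adding higher-order coefficients lowers SF1 below 1.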



2.3.5 Predictive Searching for Chain Encoding

To generate the chain codes from a geometrical configuration, the search direction of successive codes is predetermined. Such a method works very well when applied to simple geometrical configurations. However, when the geometrical configuration contains overlapping and touching points, chain-coding methods may become less satisfactory. Xu and Tou [28] have proposed a predictive searching method based upon past information. That is, the subsequent chain code is predicted from information on the preceding chain codes. The outline of the predictive searching method is as follows.

Assume that at the i-th step of chain encoding, the predictive chain-code direction is $D_p(i)$, the real chain direction is $D_r(i)$, the predictive code-angle increment is $\alpha_p(i)$, and the real code-angle increment is $\alpha_r(i)$. Then we have the following relationships:









$$D_p(i) = D_r(i-1) + \alpha_p(i) \qquad (2.19)$$

$$\alpha_p(i) = \sum_{j=1}^{L} a_j\,\alpha_r(i-j) \qquad (2.20)$$

$$\alpha_r(k) = \alpha_p(k) + v(k) \qquad (2.21)$$

where $a_j$, $j = 1, 2, \ldots, L$, are the weighting coefficients; L is the number of angle increments employed in the prediction process; $v(k)$ denotes the difference in code-angle increment, $k = i-1, \ldots, i-M$; and M is an integer equal to or larger than L. When the curve is very smooth with small curvature, a large number is assigned to L. When the curve has a larger curvature, a small number is assigned to L.

By considering the past M code-angle increments, we may express equation (2.21) in vector form as

$$Y = HA + V \qquad (2.22)$$

where $Y = [\alpha_r(i-1)\ \alpha_r(i-2)\ \cdots\ \alpha_r(i-M)]^T$ is the real code-angle increment vector,

$$H = \begin{bmatrix} \alpha_r(i-2) & \alpha_r(i-3) & \cdots & \alpha_r(i-L-1) \\ \alpha_r(i-3) & \alpha_r(i-4) & \cdots & \alpha_r(i-L-2) \\ \vdots & \vdots & & \vdots \\ \alpha_r(i-M-1) & \alpha_r(i-M-2) & \cdots & \alpha_r(i-M-L) \end{bmatrix} \qquad (2.23)$$

is an M x L matrix, $A = [a_1\ a_2\ \cdots\ a_L]^T$ is the coefficient vector, and $V = [v(i-1)\ v(i-2)\ \cdots\ v(i-M)]^T$ is the error vector.









We can determine $\hat{A}$, the least-squares estimate of A, so that the following criteria are satisfied:

(1) $E\{\hat{A}\} = A$

(2) $E\{\varepsilon^T\varepsilon\} = E\{(A - \hat{A})^T(A - \hat{A})\} = \text{minimum} \qquad (2.24)$

The least-squares estimate of the coefficient vector A is given by

$$\hat{A} = (H^TH)^{-1}H^TY, \qquad \hat{A} = [\hat{a}_1\ \hat{a}_2\ \cdots\ \hat{a}_L]^T \qquad (2.25)$$

If $(H^TH)^{-1}$ does not exist, we may use the pseudo-inverse matrix $H^\#$ instead of $(H^TH)^{-1}H^T$. The pseudo-inverse matrix of H is

$$H^\# = \frac{r}{\mathrm{tr}(C_rB)}\,C_rH^T$$

where $B = H^TH$; $C_1 = I$; $C_{i+1} = \frac{\mathrm{tr}(C_iB)}{i}I - C_iB$, $i = 1, 2, \ldots, r-1$; r is the rank of H; and $\mathrm{tr}(C_rB)$ indicates the trace of the matrix $C_rB$.

Consequently, the optimal prediction of the i-th code-angle increment is given by

$$\alpha_p(i) = \sum_{j=1}^{L}\hat{a}_j\,\alpha_r(i-j) = \left\lfloor\sum_{j=1}^{L}\hat{a}_jK(i-j)\right\rfloor\frac{\pi}{4} \qquad (2.26)$$

where $K(\cdot)$ denotes the integer code-angle increment corresponding to the octal chain codes $0, 1, 2, \ldots, 7$, and $\lfloor x\rfloor$ denotes the largest integer not exceeding x.

In [28], Xu and Tou chose L = 3 and M = 3 for predictive searching. From equation (2.24), we know that the performance measure is essentially based on the view that all the errors are equally important. This is not necessarily so. We may know, for example, that data taken later in the experiment were much more in error than data taken early on, and it would seem reasonable to weight the errors accordingly. Such a scheme is referred to as weighted least squares and is based on the performance criterion








$$J = \varepsilon^TW\varepsilon \qquad (2.27)$$

Then the least-squares estimate of the coefficient vector A is given by

$$\hat{A} = (H^TWH)^{-1}H^TWY \qquad (2.28)$$

We note that equation (2.28) reduces to ordinary least squares when W = I, where I is the identity matrix. Predictive searching may help preserve the continuity of the boundary without resorting to interpolation or extrapolation.
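For illustration, the (optionally weighted) normal equations of (2.25) and (2.28) can be solved with a small Gaussian-elimination routine; the function name and the list-based matrix layout are assumptions, not the dissertation's implementation.

```python
def lstsq_coeffs(H, Y, W=None):
    """Solve A = (H^T W H)^-1 H^T W Y by Gaussian elimination.
    W is a list of per-observation weights (identity if None)."""
    M, L = len(H), len(H[0])
    if W is None:
        W = [1.0] * M
    # Build the normal equations: G = H^T W H (L x L), b = H^T W Y (L)
    G = [[sum(W[m] * H[m][i] * H[m][j] for m in range(M))
          for j in range(L)] for i in range(L)]
    b = [sum(W[m] * H[m][i] * Y[m] for m in range(M)) for i in range(L)]
    # Forward elimination with partial pivoting
    for col in range(L):
        piv = max(range(col, L), key=lambda r: abs(G[r][col]))
        G[col], G[piv] = G[piv], G[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, L):
            f = G[r][col] / G[col][col]
            for c in range(col, L):
                G[r][c] -= f * G[col][c]
            b[r] -= f * b[col]
    # Back substitution
    A = [0.0] * L
    for i in reversed(range(L)):
        A[i] = (b[i] - sum(G[i][j] * A[j]
                           for j in range(i + 1, L))) / G[i][i]
    return A
```

With W omitted the routine reproduces the ordinary least-squares estimate of (2.25), as noted in the text.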



2.4 Decision Criterion

During the classification process, statistical pattern recognition techniques combined with production rules form a decision tree to guide the system in making a decision. Two decision criteria are used in pattern recognition: a time-domain classifier and a frequency-domain classifier. In the time domain, we consider that most random processes are governed by the Gaussian probability law; therefore, the Bayesian decision function [29] is given by

$$d_i(X) = \ln P(\omega_i) - \frac{1}{2}\ln|C_i| - \frac{1}{2}(X - \mu_i)^TC_i^{-1}(X - \mu_i), \qquad i = 1, \ldots, M \qquad (2.29)$$

where $P(\omega_i)$ is the a priori probability of occurrence of the i-th class $\omega_i$; $X = [x_1\ x_2\ \cdots\ x_n]^T$ is the pattern vector; $x_1, \ldots, x_n$ are the n selected features; M is the number of classes; and $\mu_i$ and $C_i$ are the mean vector and covariance matrix of the i-th class. The pattern X is assigned to class $\omega_i$ if for that pattern

$$d_i(X) > d_j(X) \quad \text{for all } j \ne i \qquad (2.30)$$
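For a diagonal covariance matrix, the decision function (2.29) and rule (2.30) can be sketched as follows; this is an illustrative assumption-laden sketch (function names, data layout), not the dissertation's classifier.

```python
import math

def bayes_discriminant(x, prior, mean, var):
    """Gaussian discriminant for one class, with diagonal covariance:
    ln P(w) - (1/2) ln|C| - (1/2)(x - mu)^T C^-1 (x - mu)."""
    d = math.log(prior)
    for xi, mu, v in zip(x, mean, var):
        d -= 0.5 * math.log(v) + 0.5 * (xi - mu) ** 2 / v
    return d

def classify(x, classes):
    """`classes` is a list of (prior, mean, var) triples; return the
    index i maximizing d_i(x), i.e. the class assigned to pattern x."""
    scores = [bayes_discriminant(x, p, m, v) for p, m, v in classes]
    return scores.index(max(scores))
```

A one-dimensional pattern is assigned to whichever of two equal-prior Gaussian classes lies closer in the Mahalanobis sense.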

In the frequency domain (e.g. the Fourier descriptor), for similar reasons it is assumed that the distribution of certain descriptor coefficients ($f_n$, $n = 1, \ldots, N$) of a class is also a two-dimensional Gaussian distribution $p_{f_n}(r_f, i_f)$, where $r_f$ and $i_f$ are the real and imaginary components of a coefficient $f_n$:

$$f_n(r_f) = (|a_n| + |a_{-n}|)\cos(nt), \qquad f_n(i_f) = (|a_n| - |a_{-n}|)\sin(nt)$$

To classify the patterns, a simple decision function is defined as

$$D_i(r_f, i_f) = \sum_n p_{f_n}(r_f, i_f), \qquad i = 1, 2, \ldots, M \qquad (2.31)$$

where the summation is taken over all the $f_n$ coefficients used for classification. A pattern is assigned to class $\omega_i$ if for that pattern

$$D_i(r_f, i_f) > D_j(r_f, i_f) \quad \text{for all } j \ne i \qquad (2.32)$$



2.5 Knowledge Representation

One of the most common methods for representing knowledge is to use condition-action rules or productions, e.g. Davis, Buchanan, and Shortliffe [30]. The condition part of such a rule is typically a logical product of several conditions, and the action part describes a decision, an action, or an assignment of values to variables that is to be performed when a situation satisfies the condition part. Basically, production rules represent how-type knowledge and take the form

IF ... THEN

or

Condition → Action.









Using the variable-valued logic calculus VL1 [31], we may represent a production rule as

$$[x_i \# R] \rightarrow [D]$$

where # stands for one of the relational symbols $=$, $\ne$, $>$, $\ge$, $<$, or $\le$; R denotes a subset of the value set of variable $x_i$; D denotes a term describing the decision to be assigned when the condition part (on the left of $\rightarrow$) is satisfied; and $\rightarrow$ denotes the decision assignment operator. It is easy to see that if the condition part has more than one element, it becomes a disjunction of the individual conditions:

$$[x \# a, b, c, \ldots] \equiv [x \# a] \vee [x \# b] \vee [x \# c] \vee \cdots$$

The LISP and Prolog programming languages [32] are designed to perform such logical statements.
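A toy sketch of such condition-action rules follows; the facts and decisions are hypothetical (loosely flavored by the agricultural application of Chapter 3), and the representation is an illustrative assumption, not the APRIKS implementation.

```python
# Minimal condition-action rules of the IF ... THEN form discussed
# above: each rule pairs a condition over the observed fact set with
# a decision to assign when the condition is satisfied.

RULES = [
    (lambda facts: "leaf_spots" in facts and "yellowing" in facts,
     "suspect fungal infection"),      # hypothetical rule
    (lambda facts: "wilting" in facts,
     "suspect water stress"),          # hypothetical rule
]

def fire(facts):
    """Return the decisions of all rules whose condition part is
    satisfied by the observed facts."""
    return [action for cond, action in RULES if cond(facts)]
```

Because each rule is a self-contained (condition, action) pair, adding or removing a rule does not affect the others, which is exactly the modularity advantage listed below.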

The advantages of a production system are

(1) Each production rule is completely modular and independent of the other rules.

(2) The stylized nature of the production rules makes the coding easy to examine.

(3) Each rule represents a small, isolated chunk of knowledge. Thus a user familiar with the system may be able to formulate new rules if necessary.

The disadvantages are

(1) Sometimes we cannot easily represent a piece of knowledge by a production rule.

(2) Some impact may arise from unexpected interactions with other rules due to a rule change or an addition of a rule.

Another way to represent knowledge is to take conditional probabilities into account. This representation is used to diagnose entities such as diseases by repeatedly applying Bayes' theorem to compute the probabilities that certain diseases are present, given that certain symptoms have been observed.
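The repeated application of Bayes' theorem can be sketched as below. This is a generic naive-Bayes illustration under the usual assumption that symptoms are conditionally independent given the disease; the disease names and probability values are hypothetical, not drawn from any system described here.

```python
def bayes_update(prior, likelihoods):
    """One application of Bayes' theorem: P(D_i | s) ~ P(s | D_i) P(D_i)."""
    posterior = {d: prior[d] * likelihoods[d] for d in prior}
    total = sum(posterior.values())
    return {d: p / total for d, p in posterior.items()}

# Hypothetical example: two diseases, two observed symptoms applied in turn.
prior = {"rust": 0.5, "blight": 0.5}
for symptom_lik in ({"rust": 0.9, "blight": 0.2},   # P(symptom 1 | disease)
                    {"rust": 0.7, "blight": 0.4}):  # P(symptom 2 | disease)
    prior = bayes_update(prior, symptom_lik)
```

Each observed symptom sharpens the posterior; after both updates the more consistent disease dominates, while the probabilities still sum to one.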

Another way in which knowledge is sometimes organized is by static descriptions of phenomena. This representation is used in INTERNIST [33],[34] and MEDIKS [35].

We represent the knowledge in an entity-attribute relationship associated with conditional probabilities. Initially, the conditional probabilities P(E_i|A_j) and P(A_j|E_i) of the entity-attribute relationship are determined by the knowledge base. That is, if there are n possible entities associated with a particular attribute A_j, then the same initial conditional probability is attached to each E_i, i = 1, ..., n:

    P(E_1|A_j) = P(E_2|A_j) = ... = P(E_n|A_j) = 1/n
However, using the frequency of occurrence to replace the conditional probability is not accurate. This assignment will blur the importance of the key features and lead to a misclassification problem. To compensate for this drawback, we have to manually replace the conditional probability by a weight (or certainty factor). However, experts will not feel comfortable deciding the weight. This weight assignment is updated until the system performs correct decision making.
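The uniform initialization above, and the manual weight override that replaces it, can be sketched as follows. The attribute and entity names here are illustrative (borrowed loosely from the soybean examples later in the text), and the 0.9 certainty factor is a hypothetical expert assignment.

```python
def initial_probabilities(attribute_entities):
    """Attach the same initial P(E_i | A_j) = 1/n to each of the n entities
    associated with attribute A_j."""
    probs = {}
    for attr, entities in attribute_entities.items():
        n = len(entities)
        probs[attr] = {e: 1.0 / n for e in entities}
    return probs

# Hypothetical attribute-to-entity table.
table = {"two-pair abdominal prolegs": ["soybean looper", "cabbage looper"]}
probs = initial_probabilities(table)

# An expert-assigned weight (certainty factor) later replaces the uniform
# value when the attribute is a key characteristic of one entity.
probs["two-pair abdominal prolegs"]["soybean looper"] = 0.9
```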









2.6 Knowledge-Based System

What distinguishes such a knowledge-based system (or expert system) from an ordinary application program is that in most expert systems, the knowledge is explicitly in view as a separate entity, rather than appearing only implicitly as part of the coding of the program. Ordinary computer programs organize knowledge into two levels: data and program. Most expert computer systems, however, organize knowledge on three levels: data, knowledge base, and control. For a general review, see Nau [36].

We propose that our knowledge-based system incorporate the concept of pattern recognition as the classification reference to compensate for the lack of a suitable rule. There are three goals in our knowledge-based system. The first is to structure the problem domain and to develop analysis techniques. These techniques include algorithms, heuristic rules, and the use of contextual information. The second is to use the techniques developed in Sections 2.2 to 2.4 for analyzing visual imagery. The third is to develop analysis techniques for effective object classification that are less computationally expensive.

The knowledge-based approach to visual imagery involves formulating and evaluating hypotheses about the objects observed in the imagery. This is accomplished by extracting features from the imagery and then associating those features with high-level models of possible objects that are stored in the system's knowledge base. The basic system block diagram is shown in Figure 2.6. It consists of four main parts:









Figure 2.6 The Block Diagram of Pictorial Knowledge Base
(The diagram links the image data base of results obtained about the image, the control module, the knowledge base of methods for preprocessing, feature extraction, and classification, and the data base of information about structural properties and interpretation.)









(1) A data base which contains results about the images. This comprises spatial information, possible alternatives, and different levels of representation.

(2) A module which executes control; it decides which methods to apply, which information to use, and how to access the results contained in the knowledge base.

(3) A knowledge base which contains methods for preprocessing an image, extraction of simple features, and classification.

(4) A data base which contains knowledge about structural properties of images and possibly about the field of problems.

A typical processing sequence will include the following steps:

Step 1: Extract some strong major features.

Step 2: Use the available information (extracted features) to prune the list of possible object types and suggest correct hypotheses.

Step 3: Use the library (data base) information to predict other lower level features.

Step 4: Associate the predicted lower level features with the image.

Step 5: Iterate between Steps 2 to 4 until a classification decision is reached.
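The five steps above amount to an iterative hypothesize-and-test loop, which can be sketched as follows. Everything here is a simplified stand-in: the four callables represent system modules, object models are reduced to flat sets of feature names, and the circuit-symbol example is hypothetical.

```python
def classify_object(extract_major, predict_features, match, library, max_iters=10):
    """Sketch of the Step 1-5 loop over a library {object_type: feature set}."""
    evidence = extract_major()                                          # Step 1
    hypotheses = set(library)
    for _ in range(max_iters):                                          # Step 5
        # Step 2: prune to hypotheses consistent with all evidence so far.
        hypotheses = {h for h in hypotheses if evidence <= library[h]}
        if len(hypotheses) <= 1:
            break                                         # decision reached
        # Step 3: predict lower level features the surviving hypotheses imply.
        predicted = predict_features(hypotheses) - evidence
        # Step 4: try to associate each predicted feature with the image.
        confirmed = {f for f in predicted if match(f)}
        if not confirmed:
            break
        evidence |= confirmed
    return hypotheses
```

With a two-entry library of circuit symbols and an image containing a zigzag stroke, the loop first keeps both hypotheses, then confirms the predicted "zigzag" feature and prunes down to the resistor.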

In the following chapters, we illustrate the knowledge-based system design using the APRIKS system (Agricultural Productivity Improvement Knowledge System). Then, incorporating the image processing techniques, we develop the pictorial knowledge-based system (some call it an image understanding system) to support falsified-document detection and font identification in Chapter 4, and circuit diagram recognition and interpretation in Chapter 5.

















CHAPTER 3
DESIGN OF A KNOWLEDGE-BASED EXPERT SYSTEM FOR APPLICATIONS IN AGRICULTURE

3.1 Introduction

In a technology-intensive industrial environment, scientific and technological knowledge is not only the most important element but also a key factor in productivity improvement. The whole world today has moved into a knowledge-based mode of operation. As a result, under the present circumstances, we are confronted by three major problems in dealing with knowledge [37]:

(1) proliferation of knowledge,

(2) utilization of knowledge,

(3) unavailability of knowledge.

The proliferation of knowledge has raised the question of how much a person should selectively read and how much knowledge and information he should acquire every day in order to meet his needs. The second problem concerns the most efficient way of transferring the knowledge to a user when his needs arise. The third problem is the fact that what masters really know is normally not in the textbooks written by the masters. The introduction of knowledge-based expert systems has shown great promise of solving these problems. Knowledge-based expert systems may be designed to provide selective material for reading and viewing by an individual, depending upon the scope of his interest and










the nature of his problem. Knowledge-based expert systems may be designed to bring the specific knowledge to the user when his needs arise and to provide general consultation services to him. Knowledge-based expert systems may be designed to transfer the experience and know-how of masters into the knowledge base of the expert system. These three may be considered the important types of knowledge-based expert systems which we will need in the coming decades.

In recognizing the above-mentioned problems, a great deal of interest has developed in recent years in the design of knowledge-based expert systems for various applications, especially in medical consultation. To conduct medical diagnosis, for instance, a knowledge-based expert system may provide comprehensive analysis of possible disorders, suggest suitable treatment plans, and serve as a ready source of expertise. Among the well-known computer-based systems for medical applications are MYCIN [30],[38] for antimicrobial therapy advice, INTERNIST [33],[34] for diagnosis in internal medicine, CASNET [39] for glaucoma, MEDIKS [35] for diagnosis of multiple disorders, and MEDAS [40] for interactive diagnosis.

Other applications include DENDRAL [41] for determining molecular structures of complex organic chemicals from mass spectrograms and related data, PROSPECTOR [42] for consultation about potential mineral deposits, PLANT [43] for diagnosis of soybean diseases, Hearsay-II [44] for speech understanding, APRIKS [45] for knowledge transfer and utilization in agriculture, and a number of applications in other fields [46]-[50].









Food productivity may be enhanced by taking several approaches: (1) crop pest control, (2) plant disease control, and (3) fertilizer management. The primary function of crop pest control is to make optimum use of pesticides on plants. The main objective of plant disease control is to prevent the spread of plant diseases. The primary goal of fertilizer management is to provide an adequate amount of nutrition to the plants. To enhance food productivity, we should make the right diagnostic analysis, select the right treatment plan, and use the right pesticides and fertilizers in the right amount at the right time. To accomplish these objectives, farmers and growers should have the necessary knowledge at their fingertips when their needs arise. The computer-based APRIKS system was conceived some five years ago to meet this challenge. APRIKS is the acronym for Agricultural Productivity Improvement Knowledge System. This paper presents the design of APRIKS, which is a pilot knowledge-based expert system for applications in agriculture. The APRIKS enables the user (farmer, grower, or county agent) to interact with the computer in a conversational mode to obtain satisfactory answers to various questions and suggested solutions to specific problems. The APRIKS is a pilot system which demonstrates the use of a minicomputer in agricultural information browsing, knowledge transfer, diagnostic consultation, management recommendation, and science education. On the basis of a set of observations provided by the user, the APRIKS can determine the plant diseases, the damaging insects, or the planting instructions. It can recommend treatment plans and pest control procedures, and can provide useful information such as









life history, injury to crop, and injury threshold. The APRIKS is designed to respond to two types of input formats:

(a) Menu selection,

(b) Simple natural language.

Thus, the APRIKS performs three modes of operation:

(1) Interactive retrieval and browsing mode,

(2) Decision-making and consultation mode,

(3) Question-answering and diagnostic mode.

The design of the APRIKS is divided into two major tasks:

(1) APRIKS knowledge base generation,

(2) APRIKS knowledge seeking and utilization.

The knowledge base is generated by two modes of operation: the off-line batch mode and the interactive conversational mode. The knowledge seeking and utilization process is accomplished on the basis of knowledge-based pattern recognition and inference principles. It is our hope that the APRIKS design concept will provide a solution to the three major problems created by the proliferation of knowledge, the utilization of knowledge, and the unavailability of knowledge.



3.2 Design Concepts for Knowledge-Based Expert Systems

A knowledge-based expert system generally consists of two fundamental components: (1) a knowledge base and (2) a recognition/inference mechanism. The knowledge base is a structural and relational representation of knowledge. In a natural format, knowledge may be represented in an associative hierarchical structure. The most general









concepts are placed at the top of the tree and the most specific items at the bottom of the hierarchy. Knowledge of various levels of specificity is distributed throughout the hierarchy. The structure of the knowledge base is itself useful knowledge which may facilitate knowledge classification, acquisition of new knowledge by inserting it at the right location in the hierarchy, identification of possible problems, and efficient seeking of the appropriate knowledge in response to a query [51]-[53].

The recognition/inference mechanism interprets the user's query input, performs pattern matching, and generates recommendations, specific answers, or inferences. Its decision-making process makes use of discriminant analysis, Bayesian statistics, clustering analysis, feature extraction, syntactic rules, and production rules. The design of a knowledge-based expert system may be conducted by following two major approaches:

(1) Rule-based approach,

(2) Pattern-directed approach [29].

Sometimes a combination of both approaches should be cleverly employed. The rule-based approach makes use of a collection of "if-then" rules. A typical design based upon this approach is the MYCIN system [38]. The pattern-directed approach is based upon the construction of pyramid-like "networks" of know-how, which we refer to as a knowledge hierarchy. The MEDIKS system has been designed on the basis of this approach [35]. In this paper, we have developed the framework of a knowledge-based expert system for the extension, transfer









and utilization of knowledge. The organization of this knowledge-based expert system is illustrated in Figure 3.1.

Through the knowledge acquisition task, the system transfers experts' experience and know-how into the knowledge base. The system

extends its strategy through an inference mechanism. The system is

designed to perform the following three modes of operation: information retrieval and browsing, decision-making and consultation, and diagnostic

analysis and question-answering. This framework of a knowledge-based expert system is applied to productivity improvement in agriculture. The design of the APRIKS system is discussed in the following sections.

In the APRIKS knowledge base, strategic information items are characterized by feature patterns. The knowledge seeking process is

accomplished through the recognition of feature patterns which match most closely with query patterns formulated by APRIKS from the user's query. Two information patterns are said to be matched if a certain performance criterion is satisfied. The strategic information items are

linked to the detailed descriptions and working knowledge which are stored in various knowledge files.

In the APRIKS system, the knowledge seeking and utilization process involves six basic steps:

(1) Formulate the query information patterns from the user's query,

(2) Retrieve the feature patterns from the knowledge base,

(3) Modify the user's query to make it more specific,

(4) Recognize the associated feature patterns via knowledge-based pattern recognition and inference,




























Figure 3.1 Organization of a Knowledge-Based System
(Knowledge transfer from experts generates the knowledge base, which drives decision-making and consultation services through the system's inference mechanism.)









(5) Generate detailed information on the associated feature patterns,

(6) Suggest optimal treatment plans and alternative solutions.
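The matching at the heart of Steps (2) and (4) can be sketched with a simple weighted-overlap criterion. This is only an illustration of the idea, not the APRIKS matching criterion itself: patterns are reduced to `{descriptor: weight}` dictionaries, the score borrows the min-weight rule used later for common descriptors, and the threshold and sample entries are hypothetical.

```python
def match_score(query, feature_pattern):
    """Weighted overlap between a query pattern and a stored feature pattern,
    summing min(w, w') over descriptors shared by the two patterns."""
    shared = query.keys() & feature_pattern.keys()
    return sum(min(query[d], feature_pattern[d]) for d in shared)

def seek(query, knowledge_base, threshold=0.5):
    """Rank the stored items whose match with the query exceeds a criterion."""
    scored = [(match_score(query, fp), item)
              for item, fp in knowledge_base.items()]
    return sorted((s, i) for s, i in scored if s >= threshold)[::-1]
```

For instance, a query pattern mentioning green color and two prolegs scores highest against a stored soybean-looper feature pattern, so that item's linked knowledge files would be retrieved first.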



3.3 Knowledge Representation

In the APRIKS system, we represent knowledge in the knowledge base by hierarchical entity-attribute-value relations. The structural knowledge is semantically categorized into an associative tree. Each node of the relational hierarchy represents an entity associated with a set of attributes and values. An entity E_i is described by a conjunctive expression of the descriptors,

    {D_ij = (w_j, A_j, V_j)},    j = 1, 2, ..., N

where D_ij denotes the jth descriptor of entity E_i; N is the number of attributes used to characterize this entity; and A_j, V_j, and w_j denote the corresponding attribute, value of the attribute, and weighting factor, respectively. Attribute A_j is a characteristic used to classify an entity E_i. The weighting factor w_j is assigned to the jth attribute A_j to specify its relative importance in the characterization of the entity E_i. Weighting factor w_j may be a specific number determined by an expert's opinion or a value derived from conditional probability or fuzzy set theory. For example, in document retrieval, the weighting factor associated with a keyword may be determined by the frequency of occurrence of the keyword in the document.

The descriptor D_ij is an attribute-value pair with a weighting factor, D_ij = (w_j, A_j, V_j). The set of descriptors {D_ij = (w_j, A_j, V_j)}, j = 1, 2, ..., N, forms a cluster space to describe the entity E_i:









    {D_ij(w_j, A_j, V_j)} = {D_i1(w_1, A_1, V_1), ..., D_iN(w_N, A_N, V_N)}

For simplicity in writing, we use the short notation D_ij(w_j) by dropping the attribute name and value. Furthermore, it is assumed that the descriptors are independent, i.e.,

    D_i1(w_1) n D_i2(w_2) n ... n D_iN(w_N)

We make this assumption since the dependence aspect of the descriptors is not adequately understood. In fact, it has been recognized that classification decisions are robust with respect to the assumption of conditional independence [54].

Common descriptors exist between two entities E_i and E_j if the names or the meanings of corresponding members of the two descriptor sets {D_i,k(w_k)} and {D_j,k(w_k')} are the same. We denote the common descriptors by {D̂_k(ŵ_k)}, where the factor ŵ_k is defined by

    ŵ_k = min(w_k, w_k')

The common descriptors form a semantic link between two entities, and the most frequently used name is chosen as the master term.

Distinct descriptors between two entities E_i and E_j are expressed as {D°_k(w°_k)} if the names and meanings of the two descriptor sets are different. The distinct weighting factor is defined as

    w°_k = w_k if D_ik != D_jk,  and  w°_k = 0 if D_ik = D_jk
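Treating each entity's descriptor set as a `{descriptor: weight}` dictionary, the common and distinct descriptors can be computed directly. This is a sketch of the definitions above under the simplification that two descriptors match exactly when their names match; the sample weights are hypothetical.

```python
def common_descriptors(e_i, e_j):
    """Descriptors shared by two entities, each with weight min(w_k, w_k')."""
    return {d: min(e_i[d], e_j[d]) for d in e_i.keys() & e_j.keys()}

def distinct_descriptors(e_i, e_j):
    """Descriptors of E_i that E_j does not share keep their own weight."""
    return {d: w for d, w in e_i.items() if d not in e_j}

# Hypothetical descriptor sets for two insect entities.
looper = {"green body": 0.7, "looping walk": 0.9}
cloverworm = {"green body": 0.4, "brown stripe": 0.5}
```

Here the shared "green body" descriptor forms the semantic link with weight min(0.7, 0.4) = 0.4, while "looping walk" remains a distinct descriptor of the first entity.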

Two entities will be linked together if the names of the entities are the same. We call this link an entity semantic link, denoted by E(i,j). However, several individual descriptors may have common descriptions with different weights. The entity semantic link has the property that if there exist two entity semantic links E(i,j) and E(j,k), then a link E(i,k) exists. We denote a partial link by the notation E(i/j).

Shown in Figure 3.2 is the relational hierarchy for a portion of the knowledge representation in the APRIKS system. The top-level node is soybean, which is followed by five entities (insects, diseases, nematodes, weeds, and varieties). Furthermore, under the insect node are included soybean looper, green cloverworm, velvetbean caterpillar, corn earworm, and so on. The associative tree can readily be expanded systematically by acquiring more knowledge for the knowledge base. The entity soybean looper can be described by such physical characteristics as body color, body length, abdominal prolegs, etc., as illustrated in Figure 3.3. Some of the attributes are key characteristics of an entity. For instance, the two-pair abdominal proleg is a key characteristic of the soybean looper, and it carries a heavy weighting factor.

3.4 Knowledge Base Generation

Knowledge base generation is an essential step in the design of knowledge-based systems. In our approach, we divide the knowledge base into two components. Part One contains the knowledge outline or knowledge sketch, and Part Two stores the knowledge details. The knowledge sketch is represented as a relational hierarchy. Semantic links are used to associate the nodes of the hierarchy. The knowledge details constitute the data files and dictionaries of the knowledge base. To generate the relational hierarchy for the knowledge sketch, either the bottom-up approach










Figure 3.2 An Example of APRIKS System Hierarchy
(SOYBEAN at the top level, with subnodes Insect, Disease, Nematode, Weed, and Variety; under Insect: Soybean Looper, Green Cloverworm, Velvetbean Caterpillar, Corn Earworm, etc.; below these, control recommendation and life history nodes, with non-chemical and chemical controls such as the Bacillus thuringiensis products Dipel, Bactur, and Thuricide, each carrying formulation, A.I./acre, application, and toxicity attributes.)


Con P ecoiro


Non-chemtic
Control


Bac


Lc









Entity: Soybean Looper

Attributes                    Values
Body Length         --->      1/8 - 1 1/2 inches
Body Shape          --->      Head is smaller than tail
Body Color          --->      Green
Abdominal Prolegs   --->      2 pairs
Anal Prolegs        --->      Erect
Stripe Color        --->      White
Walk Status         --->      Loop-like

Figure 3.3 Attributes of Soybean Looper









or the top-down approach may be taken. In the bottom-up approach, it is assumed that a node in the hierarchy will contain all the characteristics of its subnodes. As an illustration, consider a

knowledge sketch shown in Figure 3.4. The knowledge base generation

rule for the bottom-up approach is

    [D_{i1.i2...i(m-1), j}(w_j)]_new = [D_{i1.i2...i(m-1), j}(w_j)]_old  u  ( u over i_m of [D_{i1.i2...i_m, j}(w_j)] )

where m = 2, ..., M; M = total number of levels in the hierarchy; i_m = subnode number in the mth level of the hierarchy; and i_1, i_2, ... = subnode numbers in the 1st, 2nd, ... levels of the hierarchy, respectively. The symbol u denotes the set union operator. The above knowledge base generation rule is applied repeatedly from the (M-1)th level to the top level, and the result is shown in Figure 3.5. The top-down approach is a process in the reverse order of the bottom-up approach. The knowledge base generation rule for the top-down approach is

    [D_{i1.i2...i_m, j}(w_j)]_new = [D_{i1.i2...i_m, j}(w_j)]_old  u  [D_{i1.i2...i(m-1), j}(w_j)],    where m = 2, ..., M.


This rule is applied iteratively from the 2nd level to the bottom level and the result is shown in Figure 3.6.
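The bottom-up generation rule can be sketched as a recursive union over a tree. This is illustrative only: the dotted node labels mirror the i1.i2...im indexing in the text, and the sample descriptors are hypothetical.

```python
def bottom_up_generate(tree, descriptors):
    """Propagate descriptor sets upward: each node's set becomes the union of
    its own set and its children's, applied from level M-1 up to the top.

    tree        -- {node: [child nodes]}
    descriptors -- {node: set of descriptor names}, updated in place
    """
    children = {c for kids in tree.values() for c in kids}
    roots = [n for n in tree if n not in children]

    def visit(node):
        merged = set(descriptors.get(node, set()))
        for child in tree.get(node, []):
            merged |= visit(child)      # union over subnodes i_m
        descriptors[node] = merged
        return merged

    for root in roots:
        visit(root)
    return descriptors
```

After generation, the top node 1 carries the union of every descriptor below it, matching the bottom-up assumption that a node contains all the characteristics of its subnodes.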

The selection of either the bottom-up approach or the top-down approach depends on the system design and knowledge-seeking







41


Figure 3.4 An Example of the Hierarchy for Knowledge Sketch
(Entities E_1, E_1.1, E_1.2, E_1.1.1, E_1.1.2, ..., each with its descriptor set {D(w_j)}.)

Figure 3.5 Bottom-Up Approach for KB Generation

Figure 3.6 Top-Down Approach for KB Generation




strategies. In the real-world situation, however, the associated descriptors at the top level are of a general nature and may be irrelevant to system requirements because of the need for specific characterization at the lower levels. In practice, the number of inherent descriptors at the top level is smaller than that at the lower levels. In our work, we propose a modified top-down approach to simplify knowledge base generation. From the relational hierarchy, we understand that each entity is actually described by its associated descriptor sets and its father-son relationship. We therefore consider the father-son relationship as a potential descriptor, calling it the cover. That is, the cover is a descriptor for the system's internal representation. Thus, the semantic linking rule for knowledge base generation is given by

    [D_{i1.i2...i_m, j}(w_j)]_new = [D_{i1.i2...i_m, j}(w_j)]_old  u  C_{i1.i2...i(m-1)}

where the cover

    C_{i1.i2...i(m-1)} = {E_{i1}, E_{i1.i2}, E_{i1.i2.i3}, ..., E_{i1.i2...i(m-1)}}

The modified top-down approach is illustrated in Figure 3.7.
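The cover rule reduces to computing each node's chain of ancestors and adding it to the descriptor set. The sketch below is a simplified illustration with dotted string labels standing in for the i1.i2...im indices; the "E" prefix marking ancestor entities is a naming convention invented here.

```python
def cover(node):
    """Cover of node i1.i2...im: its chain of ancestors i1, i1.i2, ..."""
    parts = node.split(".")
    return [".".join(parts[:k]) for k in range(1, len(parts))]

def modified_top_down(descriptors):
    """Augment each entity's descriptor set with its cover, treated as
    internal descriptors for the system representation."""
    return {node: set(desc) | {"E" + a for a in cover(node)}
            for node, desc in descriptors.items()}
```

For example, the cover of node 1.1.2 is {E_1, E_1.1}, so only the father-son chain, rather than every inherited descriptor, needs to be merged in.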

The two kinds of data input methods for knowledge base generation and updating are the batch mode and the interactive mode. The batch mode is suitable for accepting a batch of raw data and processing it at one time. The interactive mode is used for small amounts of data. In APRIKS, we use the batch mode for knowledge base generation and the interactive mode for













Figure 3.7 Modified Top-Down Approach for KB Generation
(Each entity's descriptor set is augmented with its cover, the chain of its ancestor entities; for example, the cover of E_1.1.1 contains E_1 and E_1.1.)








knowledge base updating. In order to speed up data collection, we use some pre-defined delimiters to identify the data characteristics. Each delimiter performs one generation function. The raw data are typed in compact form to save storage, then sorted into a readable form. Examples of the raw data and the delimiter functions are shown in Figures 3.8, 3.9, and 3.10. The simplified knowledge base generation scheme is illustrated in Figure 3.11.
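The first stage of such a delimiter-driven generator can be sketched as a tokenizer. This is a simplification, not the APRIKS batch program: it handles only the $, *, =, +, and : delimiters of Figure 3.9 (omitting the @, %, and attribute-synonym records), and the sample stream is a reconstructed fragment.

```python
import re

# Delimiter meanings following Figure 3.9.
KINDS = {"$": "category", "*": "entity", "=": "synonym",
         "+": "attribute", ":": "value"}

def parse_raw(data):
    """Split a compact delimiter-coded stream into (kind, text) tokens,
    breaking at each delimiter and tagging the text that follows it."""
    return [(KINDS[delim], text.strip())
            for delim, text in re.findall(r"([$*=+:])([^$*=+:]*)", data)]
```

Each token would then be dispatched to the corresponding generation function (entity file, synonym dictionary, attribute table, and so on), as in the scheme of Figure 3.11.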



3.5 Data Structure for the APRIKS System

The data structure for the APRIKS system is composed of a hierarchical node index table (NITA), entity dictionary (ED), attribute dictionary (AD), value dictionary (VD), unofficial entity dictionary (UED), unofficial attribute dictionary (UAD), entity-attribute table (EAT), information file (INF), name dictionary (ND), hash table (HT), and text file. The heart of the APRIKS system is the NITA, which represents an agricultural knowledge sketch base in an associative tree structure. Information flow in the APRIKS system is summarized in Figure 3.12, and the data formats for the various elements in the data structure are shown in Figure 3.13.

The hierarchy consists of eight generation codes, with one byte being used for each generation code. Thus, the eight-level tree can represent 256 subnodes for each node. The ED and AD are employed to interpret entity names and attribute names, respectively. The synonym dictionaries, UED and UAD, are created for interpreting the corresponding master terms. For instance, SL and VBC are the synonyms for soybean looper and velvetbean caterpillar, respectively.
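The synonym lookup can be sketched as a two-step dictionary resolution. This is illustrative only: the record numbers are hypothetical placeholders for the block and record pointers of the actual file structure.

```python
# Master entity dictionary (ED) keyed by the official name; the unofficial
# entity dictionary (UED) maps synonyms onto master terms. The record
# numbers stand in for the real file pointers.
ed = {"soybean looper": 101, "velvetbean caterpillar": 102}
ued = {"SL": "soybean looper", "VBC": "velvetbean caterpillar"}

def lookup(name):
    """Resolve an entity name, going through the synonym dictionary if needed;
    returns (master term, record) or None if the name is unknown."""
    master = name if name in ed else ued.get(name)
    return (master, ed[master]) if master in ed else None
```

Both the master term and any of its synonyms thus resolve to the same dictionary record before further processing.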













Figure 3.8 Raw Data for Batch Mode
(A compact, delimiter-coded character stream; it opens with the category codes and entity names for SOYBEAN, TOMATO, and TURF, then continues through the soybean looper subtree with its synonyms, injury-to-crop and threshold-of-injury records, and non-chemical and chemical control entries.)









(1) $ Category Code * Entity Name

(2) = Synonym Name of the Entity

(3) @ Number of Records, Length of Record, Picture File Name

(4) + Attribute [: Value]

(5) ( Synonym Name of the Attribute

(6) % ^ ... ^ Detailed Information (\\ for underline, @@ for negative image, ** for blinking)

(7) / End of Raw Data File

where [ ] is optional and ^ ... ^ is enclosed in a rectangular box


Figure 3.9 Delimiters for Raw Data












Figure 3.10 Raw Data After Sorting
(The Figure 3.8 stream expanded one item per line: category codes and entity names such as SOYBEAN, INSECT, and SOYBEAN LOOPER; synonyms such as SBL and PSEUDOPLUSIA INCLUDENS; and attributes such as INJURY TO CROP, THRESHOLD OF INJURY, and CONTROL RECOMMENDATION, each followed by its detailed information lines.)










Figure 3.11 The Simplified Knowledge Base Generation Scheme
(A dispatch loop over the raw data file: the identifiers $ and * invoke EGEN and CATGEN for entity file generation and NITA code generation and linking; = invokes UEDGEN for entity synonym generation; @ invokes PFGEN for picture file generation; + invokes ATTRIB for attribute file generation and entity-attribute linking and updating; ( invokes attribute synonym generation; and % invokes INFORM for detailed information generation.)










Figure 3.12 Information Flow in APRIKS System
(The interpreter mediates between the user interface — menu selection (R), question answering (Q), and decision making (D) — and the category file, the entity, attribute, value, and name dictionaries, the synonym dictionaries, the entity-attribute table, the hash table, the picture file, and the detailed information file.)

Figure 3.13 APRIKS Data Structure
(Record formats for the NITA — an eight-level category code with pointers to the ED, to subnodes, to the parent node, and to the next subnode with the same parent — and for the ED, EAT, AD, UED, UAD, ND, HT, INF, and TEXT records, which are linked to one another by block and record pointers into the name dictionary.)







To represent an m:n entity-attribute relation, a simple method is to create a two-dimensional array which stores the relations between entity occurrences and attribute occurrences. Figure 3.14 illustrates a typical example. However, the limited number of attributes associated with a particular entity makes the two-dimensional array very sparse and storage inefficient. In our design we create an EAT file to represent the entity-attribute relation in a compact form. The interrelations among the ED, AD, and EAT records are illustrated in Figure 3.15. By using the EAT table, we can list the attributes of an entity, or seek all of the entities which have a specific attribute. The ND contains the ASCII character strings which represent the names of entities or attributes in the knowledge base. Access to the ND is through the block pointer and the record pointer in the dictionary records. The format of the ND is a variable-length record, as shown in Figure 3.13.


3.6 Knowledge-Seeking Strategies

The APRIKS system has been designed to perform three types of operations (interactive retrieval and browsing, decision-making and consultation, question-answering and diagnostic analysis) in the agricultural field. To accomplish these operations, we have designed knowledge-seeking strategies. In the interactive retrieval task, the system allows the user to access any portion of the knowledge by navigating through the hierarchical structure. The system contains refreshing mechanisms which remove irrelevant information and keep the displays at a manageable size. A minimum amount of typing is required from










Figure 3.14 Array Representation of m:n Associated Pattern Vectors in Hierarchy
((a) An example hierarchy of entities E_1, E_1.1, E_1.2, E_1.1.1, E_1.1.2, E_1.1.1.1, each with its descriptor pairs; (b) the corresponding two-dimensional entity-versus-descriptor array, which is mostly empty apart from the weights w at the occupied positions.)

(Figure 3.15: each Entity Dictionary record points, through the Entity-Attribute Table, to the Attribute Dictionary records of its associated attributes, and vice versa, starting from the first associated entity/attribute.)

Figure 3.15 The Inter-relations Among the
ED, EAT and AD Records









the user to avoid misspellings due to fatigue. The system also provides the following three display symbols to enhance visualization: (1) a

rectangular box for a clear view, (2) underlining for emphasizing important messages, and (3) blinking as a warning signal.

For designing the diagnostic task, we drew on our experience from the design of the TBS system [55] and the MEDIKS system [35]. We have

simplified the diagnostic strategy and made it more user-friendly. The

system plays the role of an intelligent advisor with tremendous patience who serves all the user's requests and makes impartial recommendations. From the knowledge base, the system seeks the key features and forms a diagnostic scheme to help the user make accurate diagnosis and positive identification. The user answers the presorted problems provided by the diagnostic scheme. The diagnostic pattern

vector is automatically formulated and is sent to the decision-making and inference module. The system performs pattern matching, ranks the most probable causes, and displays the causes which exceed the predetermined threshold. At the user's request, the system is capable of explaining to the user the reasoning process behind its recommendations. User-system interaction may continue until a

satisfactory conclusion is reached.

Man-machine interface is accomplished by a menu-driven scheme or by

a question-answering format. In the menu-driven mode, each displayed

page is divided into three sections: title, contents, and control mechanism. The title is the name of the selected entity or

attributes. The content section contains a list of subnode entities,







57


associated attributes, or detailed knowledge for each category of the hierarchy. Sequence numbers are automatically assigned to the list

according to the brother sequence in the associative tree structure. This arrangement makes it easy for the user to select the desired items by a soft-key approach, and it facilitates growing the knowledge base or updating its contents without any programming change. The control mechanism provides selection control, which includes (1) select the desired item, (2) back-search to the previous page, (3) return to the root of the hierarchy, and (4) exit from the current mode. The functional

flowchart for the menu-selection mode (information retrieval and browsing mode) is illustrated in Figure 3.16.
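The four selection controls above can be sketched as a small navigation loop over a hierarchy. The tree contents and the command spellings here are illustrative, not the APRIKS implementation; the real system drives this from the ED subnode pointers.

```python
# Menu control sketch: numbered subnode selection plus
# back / root / exit controls over a small hypothetical hierarchy.
TREE = {
    "INSECT": ["SOYBEAN LOOPER", "VELVETBEAN CATERPILLAR"],
    "VELVETBEAN CATERPILLAR": ["CONTROL RECOMMENDATION", "LIFE HISTORY"],
}

def navigate(commands, root="INSECT"):
    path = [root]
    for cmd in commands:
        if cmd == "back":
            if len(path) > 1:
                path.pop()              # (2) back-search to previous page
        elif cmd == "root":
            path = [root]               # (3) return to root of hierarchy
        elif cmd == "exit":
            break                       # (4) exit from the current mode
        else:                           # (1) select item by sequence number
            children = TREE.get(path[-1], [])
            path.append(children[int(cmd) - 1])
    return path[-1]
```

For example, `navigate(["2", "1"])` walks from the root to the second subnode and then its first subnode, mirroring the soft-key selections of the menu mode.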

The decision-making mode is designed on the basis of similarity measures. Typical measures between two vectors x_i = [x_{i1}, x_{i2}, ..., x_{in}]^T and x_j = [x_{j1}, x_{j2}, ..., x_{jn}]^T are

S(x_i, x_j) = \sum_{k=1}^{n} x_{ik} x_{jk}

or

S(x_i, x_j) = \frac{\sum_{k=1}^{n} x_{ik} x_{jk}}{\sum_{k=1}^{n} x_{ik}^2 + \sum_{k=1}^{n} x_{jk}^2 - \sum_{k=1}^{n} x_{ik} x_{jk}}

In the APRIKS system, we consider the hierarchical tree as the clusters. The associated descriptor vector may be viewed as describing the centroid of the cluster in M-dimensional feature space. The three similarity computations used in the APRIKS system are S(E_i, E_j), the similarity between entities E_i and E_j; S(A_i, A_j), the similarity between attributes A_i and A_j; and S(Q_i, E_j), the similarity between query vector Q_i and entity E_j. The decision is made









Figure 3.16 Functional Flowchart of Menu Selection Mode







59


for the highest similarity measure if it exceeds the preset threshold. The operation flowchart of decision-making mode is shown in Figure 3.17.
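The two similarity measures used in this mode — a raw inner product and a Tanimoto-type normalized ratio — can be sketched in a few lines of Python. The binary descriptor vectors are illustrative.

```python
# Similarity measures between two n-dimensional feature vectors.
def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def similarity_inner(x, y):
    # plain inner product: sum_k x_ik * x_jk
    return dot(x, y)

def similarity_tanimoto(x, y):
    # normalized ratio: x.y / (|x|^2 + |y|^2 - x.y)
    return dot(x, y) / (dot(x, x) + dot(y, y) - dot(x, y))

q = [1, 0, 1, 1]   # illustrative query descriptor vector
e = [1, 1, 1, 0]   # illustrative entity descriptor vector
```

For binary vectors the Tanimoto form counts shared attributes over the union of attributes, so `similarity_tanimoto(q, e)` here is 2 / (3 + 3 - 2) = 0.5; the decision rule then keeps the candidate with the highest score above the preset threshold.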

The question-answering mode provides a friendly input/output interface. In the APRIKS system, the user's query can be semantically categorized into three types:

(a) Cases which require a Yes/No type answer.

(b) Specific questions which require a what/which/who/when/where ... type

of answer.

(c) Procedures or sequences of actions which require a how/why type of

answer.

Although numerous techniques for processing natural languages have been proposed in the literature, the problem of natural language understanding by machine still remains unsolved in the practical sense. In the APRIKS system we have introduced methods to handle simple natural language sentences with structural format common to most users.

In the question-answering mode, the APRIKS system is designed to analyze a simple sentence and to identify its five contextual components.

{LI} = {semantically logical operators} = {not, or, and}

{I} = {identifiers} = {when, how, why, where, what}

{CI} = {contextual information} = {father-son relationships in the hierarchy}

{C} = {characteristics}










Figure 3.17 Operation Flowchart of Decision-Making Mode






61


= {entity-attribute relationships}

{K} = {all keywords, including entities and attributes}

The identifiers are converted into the "what" canonical form,

{identifier} ---> {"what" canonical form}

when  ---> what + time
how   ---> what + procedure
why   ---> what + reason
where ---> what + place

A question is segmented into five contextual components:

{question} = {LI} + {I} + {K} + {CI} + {C}

Consider the question "How to control soybean looper and velvetbean caterpillar by Dipel?" The system performs contextual segmentation and determines the following components.

{LI} = {and}

{I} = {how} = what + {procedure}

{K} = {control, soybean looper, velvetbean caterpillar, Dipel}

{CI} = {(control, soybean looper), (control, velvetbean caterpillar), (Dipel, control)}

{C} = {(Dipel, treatment procedure)}

The associative tree is a "what" type structure, and the production rule is a "how" type structure. By embedding the production rules into the hierarchy, the APRIKS system answers questions in "how," "why," "when," and "where" as well as "what" format. In fact, our design of the APRIKS system combines the concepts of the pattern-directed approach and the rule-based approach. A simplified functional flowchart for the

question-answering mode is summarized in Figure 3.18.
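The contextual segmentation step for the {LI}, {I}, and {K} components can be sketched as below. The vocabulary sets are taken from the example above; in the actual system, keyword matching would be done against the ND dictionaries, so this is only an illustration.

```python
# Segment a simple question into logical operators {LI},
# identifiers {I}, and keywords {K} against a small vocabulary.
LOGICAL = {"not", "or", "and"}
IDENTIFIERS = {"when", "how", "why", "where", "what"}
KEYWORDS = {"control", "soybean looper", "velvetbean caterpillar", "dipel"}

def segment(question):
    text = question.lower().rstrip("?").strip()
    words = text.split()
    li = sorted(w for w in words if w in LOGICAL)
    ident = sorted(w for w in words if w in IDENTIFIERS)
    # keywords may be multi-word, so match them as substrings
    kw = sorted(k for k in KEYWORDS if k in text)
    return {"LI": li, "I": ident, "K": kw}

parts = segment(
    "How to control soybean looper and velvetbean caterpillar by Dipel?"
)
```

The {CI} and {C} components would then be derived from the hierarchy (father-son links) and the EAT relations for the matched keywords.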









Figure 3.18 The Operation Flowchart of Question-Answering Mode









3.7 Experimental Results

The APRIKS system software has been designed and implemented on a PDP-11/40 minicomputer under the RSX-11M operating system. The system performs three modes of operation. The current system is designed to respond to either menu-selection or simple natural language input formats. Computer printouts are given in Figures 3.19-3.21 to demonstrate the various modes of operation of this knowledge-based expert system.

3.8 Conclusion

In this chapter we have presented the design of a knowledge-based expert system for applications in agriculture. The fundamental components of the APRIKS system are a knowledge base and a recognition/inference mechanism.

To facilitate the design for knowledge base generation and

updating, our approach is to divide the knowledge base into knowledge sketch and knowledge details. A recognition/inference mechanism is used to perform knowledge seeking and utilization. On the basis of a set of

observations provided by the users, the system can determine the plant diseases, the damaging insects, or the planting instructions. It can

recommend plans for treatment and pest control and can provide useful information such as life history, injury to crop, and injury threshold. The APRIKS system is able to respond to menu-selection inputs as well as simple natural language inputs. The design

concepts for the APRIKS system are not limited to agricultural applications. Based upon the same principles, we are developing







INSECT

 1 SOYBEAN LOOPER
 2 GREEN CLOVERWORM
 3 VELVETBEAN CATERPILLAR
 4 CORN EARWORM
 5 FALL ARMY WORM
 6 BEET ARMY WORM
 7 LESSER CORNSTALK BORER
 8 GREEN STINK BUG
 9 MEXICAN BEAN BEETLE
10 BEAN LEAF BEETLE
11 THREE-CORNERED ALFALFA HOPPER
12 THE SOUTHERN RED-LEGGED GRASSHOPPER
13 THE LARGE AMERICAN GRASSHOPPER

TYPE: INDEX NO.  GET INFORMATION
      *          RETURN TO PREVIOUS INDEX
      A          RETURN TO BEGINNING
      E          EXIT

ENTER YOUR SELECTION : 3

VELVETBEAN CATERPILLAR

1 CONTROL RECOMMENDATION
2 LIFE HISTORY
3 INJURY TO CROP
4 THRESHOLD OF INJURY

TYPE: INDEX NO.  GET INFORMATION
      *          RETURN TO PREVIOUS INDEX
      A          RETURN TO BEGINNING
      E          EXIT

ENTER YOUR SELECTION : 4

THRESHOLD OF INJURY

1. 33% DEFOLIATION PRIOR TO BLOOM.
2. NO MORE THAN AN ADDITIONAL 10% AFTER BLOOM.
3. 10 WORMS 1/2" OR LONGER PER LINEAR FOOT OF ROW BEFORE BLOOM.
4. 4 WORMS 1/2" OR LONGER PER LINEAR FOOT OF ROW AFTER BLOOM.

Figure 3.19 An Example of Menu Selection Mode






INSECT

WHAT KIND OF INSECT ?

1 WORM, LOOPER, OR CATERPILLAR
2 BUG
3 BEETLE
4 HOPPER

ENTER YOUR SELECTION : 1

ANSWER THE FOLLOWING QUESTIONS (TYPE <CR> IF YOU CAN'T ANSWER)

HOW MANY ABDOMINAL PROLEGS ?
1 2 PAIRS
2 3 PAIRS

ENTER YOUR SELECTION : 1

WHAT IS THE BODY COLOR ?
1 GREEN
2 GRAY
3 BROWN
4 PINK
5 BLACK
6 YELLOW

ENTER YOUR SELECTION : 1

THE SUGGESTED INSECTS ARE RANKED:

1 SOYBEAN LOOPER
2 GREEN CLOVERWORM
3 VELVETBEAN CATERPILLAR
4 CORN EARWORM
5 BEET ARMY WORM
6 LESSER CORNSTALK BORER
7 THREE-CORNERED ALFALFA HOPPER

Figure 3.20 An Example of Decision-Making Mode






HOW TO CONTROL VBC BY USING BACTUR ?

AIR SPRAY

MINIMUM OF 3 GALLONS SPRAY PER ACRE. USE EARLY IN SEASON.

ANY OTHER QUESTION ? (Y/N) Y

ENTER YOUR QUESTION : (NO MORE THAN ONE LINE)

SOYBEAN INSECT ?

INSECT

 1 SOYBEAN LOOPER
 2 GREEN CLOVERWORM
 3 VELVETBEAN CATERPILLAR
 4 CORN EARWORM
 5 FALL ARMY WORM
 6 BEET ARMY WORM
 7 LESSER CORNSTALK BORER
 8 GREEN STINK BUG
 9 MEXICAN BEAN BEETLE
10 BEAN LEAF BEETLE
11 THREE-CORNERED ALFALFA HOPPER
12 THE SOUTHERN RED-LEGGED GRASSHOPPER
13 THE LARGE AMERICAN GRASSHOPPER

Figure 3.21 An Example of Question-Answering Mode









knowledge-based expert systems for production automation and computer-integrated manufacturing, for performing self-diagnosis and self-maintenance in industrial environments, and for the design of intelligent robots.

















CHAPTER 4
FALSIFIED DOCUMENT DETECTION AND FONT IDENTIFICATION



4.1 Introduction

Today, a high percentage of white-collar crimes involves the

falsification of typewritten documents. Falsified document

detection is currently performed by expert document examiners using manual techniques. However, the number of different type-fonts

currently in existence is over two thousand. Keeping track of these

fonts represents a nearly impossible data-processing task when manual techniques are used. The popularity of certain type-fonts has caused other manufacturers to produce look-alike type-fonts. These fonts are,

to the human eye, indistinguishable from the original manufacturer's product. Furthermore, the advent of interchangeable type elements, such as on the ubiquitous IBM Selectric, has caused additional complications. A single type element may be interchanged between different typewriters and even between typewriters from different manufacturers. As a result, the falsified document detection problem is

extremely difficult to solve if we do not make use of modern computer technology (e.g., expert system), image processing techniques, and pattern recognition theory.

The main objective of this chapter is to develop a knowledge-based system for detecting falsification in a document, identifying type-fonts











and typewriter manufacturers, and providing positive evidence of falsification.

To detect falsification in a document, the system should be capable of

(a) differentiating typewritten documents prepared by different

typewriters with the same single element ball,

(b) differentiating typewritten documents produced by different

single element balls of the same manufacturer and style placed

on the same typewriter, and

(c) differentiating typewritten documents prepared by the same

typewriter and the same element but at a different time.

To identify type-fonts and typewriter manufacturers, the system should be able to

(a) determine the similarities and differences between two type-fonts,

(b) recognize type-font manufacturers of the same and similar

typestyle, and

(c) determine the characteristics of typewriters made by different

manufacturers.

To provide positive evidence of falsification in a document, the system

should be able to make quantitative and minute comparisons and measurements of individual typewritten characters and their interrelationships.

In the following sections, we build upon the image

processing techniques investigated in Chapter 2 and develop or modify them to suit









the above system requirements. Finally, a knowledge-based system is

created to conduct and link all the functions using experts' knowledge.



4.2 Noise Background

The system involves minute feature comparison; thus, noise can be a very important factor in system reliability and accuracy. From our study, noise occurring in the system may be classified into

typewriter noise, mechanical scanner noise, electronic scanner noise, and quantization noise. Typewriter noise may be further subdivided into noise due to ribbons, paper, strike strength, and reproduction. Ribbon noise is caused by the type of ribbon (carbon or fabric), the weave, the

age, and the ink saturation. Paper noise is caused by texture, ink

absorption, and reflectivity. Strike strength noise is caused by the different force typists use on the keys on a manual typewriter, or the different force settings used on an electric typewriter. Reproductive

noise includes noise introduced by carbons or photostatic copying. Mechanical scanner noise consists of alignment noise and optical path irregularities. Finally, electronic scanner noise and quantization noise are standard problems in all image processing applications.

Scanner noise and quantization noise may largely be engineered to an acceptable level, which depends upon the design of the scanning device. Typewriter noise is inherent in the type samples and has a high ambient level. Since the system involves microscopic comparisons, noise reduction or removal techniques are employed.









4.3 Design Methodology

To meet the above objectives and goals, we propose a system architecture as shown in Figure 4.1, which consists of the subsystems of document data acquisition, typewritten character recognition, falsification detection, type-font/typewriter identification, and the type-font/typewriter database.

4.3.1 Document Data Acquisition

The document data acquisition subsystem reads the document page, removes the noise, converts characters into binary images, and isolates the characters for subsequent character recognition. Major operations

in this subsystem are briefly discussed as follows (see Figure 4.2).

4.3.1.1 Scanning, Reflection and Windowing

Scanning the document is the first step in this system. We use an Optronics Photoscan, which possesses a resolution of 100 μm in the reflection mode, to generate a digitized picture. A typical lower case

character of 1.8 mm x 1.8 mm is represented by an 18 pixel by 18 pixel digitized image. A typical upper case character is represented by a 30 pixel by 22 pixel digitized image.

Scanning creates a mirror image when it scans the printed pictures. We employ a "REFLECT" algorithm to eliminate the mirror effect, and a "WINDOW" operation tailors the portion of a document selected for examination.

4.3.1.2 Image Preprocessing

The goal of preprocessing is to convert an input gray-level picture into a noise-free binary image. Preprocessing includes smoothing,






























(Figure 4.1: the typewritten document is scanned, characters are isolated, and the results feed the character recognition, falsification detection, and type-font/typewriter identification subsystems backed by the type-font/typewriter database.)

Figure 4.1 System Architecture for Automatic Typewriter Identification

(Figure 4.2: scanning and windowing, histogram analysis, adaptive thresholding, smoothing, line shift angle detection, line/character spacing determination, and blocking, yielding isolated binary and gray-level characters together with character size (X,Y) and line/character numbers.)

Figure 4.2 The Data Acquisition Subsystem








histogram analysis, threshold determination, binary picture generation, spur removal, and gap filling. Smoothing removes the background

noise due to rough reflection from the paper. Several effective noise

smoothing methods have been discussed in Chapter 2. We employed a

median filter with a 3x3 window for smoothing because it suppresses noise while limiting edge blur. The smoothed picture is shown in Figure 4.3.
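A 3x3 median filter of the kind used here can be sketched as follows. This is a pure-Python illustration with borders left unchanged; a production system would use an optimized library routine.

```python
# 3x3 median filter: each interior pixel is replaced by the median
# of its 3x3 neighborhood, which suppresses impulse noise while
# blurring edges less than a mean filter would.
def median3x3(img):
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]       # border pixels kept as-is
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            window = sorted(img[i + di][j + dj]
                            for di in (-1, 0, 1) for dj in (-1, 0, 1))
            out[i][j] = window[4]       # median of the 9 values
    return out

noisy = [
    [10, 10, 10, 10],
    [10, 200, 10, 10],   # a single bright noise spike
    [10, 10, 10, 10],
    [10, 10, 10, 10],
]
clean = median3x3(noisy)
```

The isolated spike at (1, 1) is outvoted by its eight neighbors and replaced by the background level.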

4.3.1.3 Adaptive Threshold Determination and Binary Picture Generation

To generate a binary picture for the input document, we have developed an adaptive thresholding method based upon the pattern recognition concept of maximizing the interset variance [29]. The histogram of the gray-level image can be considered as a set of clusters, so threshold selection is equivalent to cluster separation. By

observation, the document shows dark objects (i.e., characters) on a light background (i.e., paper), and the embedded noise lies within the

bright region. Threshold selection is therefore simplified to a two-cluster separation: one cluster is the characters and the other is the background noise. The gray-level histogram is normalized and treated as a probability distribution, as shown in Figure 4.4. Assume there are L quantized gray levels (L = 256 in the system). The probability of

gray level i is P_i, where i = 1, 2, ..., L.

Let \theta be the desired threshold. Pixels with intensity less than or equal to \theta are considered as belonging to the background class C_0. Pixels with intensity greater than \theta are treated as belonging to the object class C_1. The probabilities for these two classes are P_0 and P_1, respectively, which are functions of \theta, given by


























(a) Before smoothing
(b) After smoothing

Figure 4.3 Smoothing Operation









(P_i versus gray level, with the background class C_0 below the threshold and the object class C_1 above it.)

Figure 4.4 Probability Density Function Derived from
Histogram of Gray-level Picture







P_0(\theta) = \sum_{i=1}^{\theta} P_i ,    P_1(\theta) = \sum_{i=\theta+1}^{L} P_i

Let the mean of the gray levels of the picture be

\mu = \sum_{i=1}^{L} i P_i

The mean of the gray levels of the background is

\mu_0(\theta) = \frac{1}{P_0(\theta)} \sum_{i=1}^{\theta} i P_i

and the mean of the gray levels of the object is

\mu_1(\theta) = \frac{1}{P_1(\theta)} \sum_{i=\theta+1}^{L} i P_i

The interset (between-class) variance is

\sigma^2(\theta) = P_0(\theta)[\mu_0(\theta) - \mu]^2 + P_1(\theta)[\mu_1(\theta) - \mu]^2
                 = P_0(\theta) P_1(\theta) [\mu_0(\theta) - \mu_1(\theta)]^2

The "optimal" threshold \theta^* is chosen to maximize the interset variance:

\sigma^2(\theta^*) = \max_{1 \le \theta < L} \sigma^2(\theta)





However, from the experimental results, we have found that if we shift the theoretical \theta value slightly toward the background side, we obtain a better visual binary picture. The shifted threshold is \theta' = C_T \theta, with C_T = 0.9.

To show how the optimal thresholding works, we use a windowed document as shown in Figure 4.5 and its gray-level picture is shown in Figure 4.6. The histogram of Figure 4.6 is shown in Figure 4.7 and its

binary picture is shown in Figure 4.8 with the optimal threshold

\theta = 116.
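The interset-variance threshold selection, including the C_T = 0.9 shift toward the background, can be sketched as below. Gray levels are indexed 1..L as in the derivation; the toy bimodal histogram is illustrative, not document data.

```python
# Interset-variance (Otsu-type) threshold: choose the theta that
# maximizes P0 * P1 * (mu0 - mu1)^2 over the normalized histogram,
# then shift it by c_t toward the background side.
def otsu_threshold(hist, c_t=0.9):
    total = float(sum(hist))
    p = [h / total for h in hist]               # P_i for levels i = 1..L
    best_theta, best_var = 1, -1.0
    for theta in range(1, len(p)):              # background = levels 1..theta
        p0 = sum(p[:theta])
        p1 = 1.0 - p0
        if p0 == 0 or p1 == 0:
            continue
        mu0 = sum((i + 1) * p[i] for i in range(theta)) / p0
        mu1 = sum((i + 1) * p[i] for i in range(theta, len(p))) / p1
        var = p0 * p1 * (mu0 - mu1) ** 2        # between-class variance
        if var > best_var:
            best_var, best_theta = var, theta
    return int(c_t * best_theta)

# bimodal toy histogram: dark text near level 4, paper near level 12
hist = [0, 0, 5, 20, 5, 0, 0, 0, 0, 0, 40, 80, 40, 0, 0, 0]
```

Any theta in the empty valley between the two modes gives the same (maximal) between-class variance; the loop keeps the first such theta, and the 0.9 factor then nudges the final threshold toward the dark side.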

4.3.1.4 Binary Filtering (Spur Removal and Gap Filling)

After binary image generation, there exist spurs, gaps, and

isolated noise in the binary picture. Determining which pixel is a

spur to remove and which pixel is a gap to fill is usually subjective and difficult for a computer to judge. From our study, we

define the spurs, gaps, and isolated noise as follows:

Definition of Spur

A pixel '1' is a spur if the sum of its 4-connected neighbors is

one.

Definition of Isolated Noise

A pixel '1' is isolated noise if the sum of its 4-connected

neighbors is zero.

Definition of Gap

A pixel '0' is a gap if the sum of its 4-connected neighbors is

larger than two.

For the 3 x 3 window,








This paper presents an approach to automatic detection of document falsification by making use of pattern recognition and image processing techniques. The proposed system consists of two parts: document data acquisition and falsification detection. The document data acquisition subsystem reads the document under examination, processes the input document data, and isolates suspicious characters for subsequent falsification detection.

Figure 4.5 An Example of Windowed Document





Figure 4.6 Gray-Level Picture of Figure 4.5










(Histogram of frequency of occurrence versus gray level; each * represents 24.80 pixels.)

Figure 4.7 Histogram of Figure 4.6




Figure 4.8 Binary Picture (Before Noise Removal)









            p_{i-1,j}
p_{i,j-1}   p_{i,j}   p_{i,j+1}
            p_{i+1,j}

the algorithm of the binary filtering is the following.

(1) Calculate the sum of the 4-connected neighbors:

    S = p_{i-1,j} + p_{i,j-1} + p_{i,j+1} + p_{i+1,j}

(2) If S > 2, set p_{i,j} = 1.

    If S < 2, set p_{i,j} = 0.

    If S = 2, p_{i,j} remains unchanged.

Figure 4.9 shows the results of binary filtering of Figure 4.8.
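The binary filtering rule above can be written directly from the definitions: S > 2 fills a gap, S < 2 removes a spur or an isolated pixel, and S = 2 leaves the pixel unchanged. Borders are kept as-is in this sketch.

```python
# One pass of the 4-connected binary filter described in the text.
def binary_filter(img):
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for i in range(1, h - 1):
        for j in range(1, w - 1):
            # sum of the 4-connected neighbors
            s = img[i-1][j] + img[i+1][j] + img[i][j-1] + img[i][j+1]
            if s > 2:
                out[i][j] = 1   # gap filling
            elif s < 2:
                out[i][j] = 0   # spur / isolated-noise removal
    return out

img = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],   # the 0 at (1, 1) has four 1-neighbors: a gap
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
filled = binary_filter(img)
```

Note that the same pass also deletes the 1 at (1, 2), whose 4-connected neighbor sum is zero, illustrating the isolated-noise rule.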

4.3.1.5 Character Isolation

The isolation procedure is accomplished by assigning a block number to each isolated character. The whole procedure involves three

operations. The first operation is blocking, or pyramid generation, which isolates the characters by enclosing them in rectangular blocks and

determines the coordinates of each block. The second operation is

labeling, which assigns each isolated character block a code number in scanning order. The third operation is re-sorting, which re-sorts all the blocks into text order (i.e., character by character in typing order).

(a) Blocking Algorithm

The essence of the blocking concept is using the top-down pyramid generation to find the corner coordinates of the circumscribing rectangular box of each character.

The pyramid generation algorithm is as follows (refer to Figure 4.10).






Figure 4.9 Binary Picture (After Noise Removal)














Figure 4.10 Pyramid Generation Algorithm









Step 1  j = 2.

Step 2  Read two consecutive data records (I_{j-1} and I_j) from the binary

image file and do steps 3 to 6.

Step 3  Compare two corresponding pixels I_{j-1}(k) and I_j(k), where k runs

from 1 to LREC and LREC is the length of the record.

If I_j(k) = 1, then I_j(k) <- 1 and go to step 6.

If I_{j-1}(k) = I_j(k) = 0, then I_j(k) <- 0 and go to step 6.

If I_{j-1}(k) = 1 and I_j(k) = 0, then go to step 4.

Step 4  Compare the two previous pixels I_{j-1}(k-1) and I_j(k-1).

If I_j(k-1) = 1, then I_j(k) <- 1 and go to step 6.

If I_{j-1}(k-1) = I_j(k-1) = 0, then go to step 5.

If I_{j-1}(k-1) = 1 and I_j(k-1) = 0, then

k-1 <- (k-1)-1; if k-1 >= 1, repeat step 4,

else go to step 6.

Step 5  Compare the two next pixels I_{j-1}(k+1) and I_j(k+1).

If I_j(k+1) = 1, then I_j(k) <- 1 and go to step 6.

If I_{j-1}(k+1) = I_j(k+1) = 0, then I_j(k) <- 0 and go to step 6.

If I_{j-1}(k+1) = 1 and I_j(k+1) = 0, then

k+1 <- (k+1)+1; if k+1 <= LREC, repeat step 5,

else go to step 6.

Step 6  j <- j+1; if j <= NREC (the number of records),

then go to step 2,

else end of algorithm.

To illustrate the pyramid generation algorithm, using Figure 4.9 as an example, we obtain a pyramid-like picture after pyramid generation (see Figure 4.11).




Figure 4.11 Pyramid Generation







Upon completion of the pyramid generation, we trace the coordinates of the circumscribing rectangular box of each character

from the pyramid information and assign a code number to each block. From the properties of the pyramid, we know that (1) the base of the pyramid possesses the maximum number of elements compared with the other

layers, and (2) the top element of the pyramid has support from every layer (see Figure 4.12). Based upon these two properties, we

find that the top element provides the starting y-coordinate, and the base provides the ending y-coordinate, the starting x-coordinate, and the ending x-coordinate. These x- and y-coordinates correspond to two corner coordinates of the circumscribing box.

The algorithm of the labeling is as follows.

Step 1  Scan the pyramid picture record by record and find a pixel with

value "1".

Step 2  Take that pixel as the top element of the pyramid and record

its y-coordinate (Y1).

Step 3  Using the x-coordinate as the guide, trace downward until

the base is reached (i.e., the last "1" before a "0" is detected).

Step 4  Record the y-coordinate as Y2, and take the x-coordinate of the

leftmost pixel of the base line as X1 and the x-coordinate of

the rightmost pixel of the base line as X2.

Step 5  Store (X1,Y1) and (X2,Y2) in the character coordinate file and

take the record number of the coordinate file as the block

number.







Figure 4.12 The Property of the Pyramid



Step 6  Delete the rectangular area with X1 <= x <= X2 and

Y1 <= y <= Y2 from the pyramid picture.

Step 7  Repeat steps 1 to 6 until no more "1" pixels exist.

The character coordinate file of Figure 4.9 is shown in the left 5 columns of Figure 4.13. The ordering of the block number shows the

scanning order of the characters.
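The blocking/labeling stage ultimately produces, for each character and in scanning order, the two corner coordinates (X1, Y1) and (X2, Y2) of its circumscribing box. The sketch below obtains the same boxes with a simple 4-connected flood fill rather than the top-down pyramid, purely to keep the example short; the pyramid method of the text is the space-efficient record-by-record variant.

```python
# Bounding boxes of connected "1" regions, found in scanning order.
def character_boxes(img):
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if img[y][x] and not seen[y][x]:
                stack = [(y, x)]
                seen[y][x] = True
                x1, y1, x2, y2 = x, y, x, y
                while stack:                      # flood fill one character
                    cy, cx = stack.pop()
                    x1, x2 = min(x1, cx), max(x2, cx)
                    y1, y2 = min(y1, cy), max(y2, cy)
                    for ny, nx in ((cy-1, cx), (cy+1, cx),
                                   (cy, cx-1), (cy, cx+1)):
                        if 0 <= ny < h and 0 <= nx < w \
                                and img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            stack.append((ny, nx))
                boxes.append((x1, y1, x2, y2))    # block number = list index + 1
    return boxes

page = [
    [1, 1, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
```

Each tuple plays the role of one record of the character coordinate file.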

(b) Character Classification

Before discussing the re-sorting algorithm, we classify the characters into five types according to their "height" and typing position.

Type 0: a, c, e, m, n, o, r, s, u, v, w, x, z.

Type 1: b, d, f, h, k, l, t, all uppercase letters, and all numbers.

Type 2: g, p, q, y.

Type 3: i, j, ;, :, ! (which have two-part characteristics).

Type 4: small punctuation marks (which have very small dimensions).

The y-dimension of Types 0 to 2 is listed in Table 4.1. In the

falsification detection task, Type 3 and Type 4 symbols are ignored except for i and j. The i and j symbols are handled without the dots on top. We have two reasons for ignoring Type 3 and Type 4 characters. First, they indicate very little about document falsification. Second,

due to the preprocessing they become degraded and cannot be easily differentiated from common noise. Furthermore, we classify the

characters i and j as Type 0 and Type 1, respectively, after deletion of the dot as a feature.









(Figure 4.13 lists, for each character block, BLOCK#, X1, Y1, X2, Y2 in the left five columns and LINE#, POS.#, DIFFY, NCHAR in the right columns.)

Figure 4.13 Character Coordinate File







Table 4.1 Type Criterion (IBM Selectric)

(Units: mm)

Type Font   Z0    Z1    Z2    Type 0 (Z0)   Type 1 (Z0+Z1)   Type 2 (Z0+Z2)
Gothic      1.9   0.7   0.7   1.9           2.6              2.6
Elite       1.6   0.8   0.8   1.6           2.4              2.4







93


The single line spacing between two typing lines can be measured by using the above character type classification (see Table 4.2). The line spacing information provides a criterion for the re-sorting algorithm.

(c) Re-sorting Algorithm

After the labeling of each character, we have isolated each character by assigning a distinct code number. However, for a document

examiner, it is not convenient to find the corresponding character by a code number because the code numbers are arranged in the scanning order. We therefore re-sort the characters into text order and estimate the

location of each typing line. The re-sorting algorithm is as follows.

Step 1  Pick the first ten character blocks from the character

coordinate file and find a candidate Type 1 or Type 2 character

whose y-dimension is H = max (Y2_i - Y1_i), i = 1, ..., 10.

Step 2  Take the threshold \theta_y = 0.4 H as the line-spacing separation

criterion and scan the coordinate file. If the spacing

(Y2)_{i+1} - (Y2)_i > \theta_y, a new typing line is

generated.

Step 3  Re-sort the character sequence in a typing line by increasing

order of the X1-coordinate, then assign the line number and

position number to each character.

Step 4  Count the total number of characters in each typing line.

Step 5  Estimate the location of each typing line by averaging the

Y2-coordinates of the characters in that typing line.

Step 6  Refine the typing-line estimate by dropping any Type 2

characters whose Y2-coordinates appear a

certain distance below the typing line.



middle portion due to the erasing problem. To quantitatively describe the intensity change, we define three parameters: I_b, denoting the peak intensity in the background region; I_o, denoting the peak intensity in the object region; and I_0, denoting the optimal threshold of the intensity histogram. Referring to Figure 4.29, the peak intensity shifts are given by

    ΔI_b = I_b − I_b2 ,  ΔI_o = I_o − I_o2 ,  ΔI_0 = I_0 − I_02

where the subscript 2 denotes the second (questioned) measurement; ΔI_b is due to erasing, ΔI_o is caused by ribbon aging, and ΔI_0 is due to the combination of both cases. We introduce an erasing threshold θ_e, a ribbon-aging threshold θ_r, and a threshold θ_t on the optimal-threshold change. The decision rule is (1) if ΔI_b > θ_e, the paper has been erased; (2) if ΔI_o > θ_r, the ribbon has been changed; (3) if ΔI_0 > θ_t, the character has been corrected (or erased and retyped). However, if the intensity histogram is unimodal (i.e., has only one peak), then I_0 cannot be extracted. I_0 is an important parameter for the detection of the intensity change; in our working model we use only I_0 in the intensity change analysis, which is shown in Figure 4.30. The I_0's of two characters are shown in Figure 4.31.
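The three-threshold decision rule above can be sketched directly. The argument and threshold names below follow the text but are illustrative, not from the original system:

```python
# Sketch of the intensity-change decision rule. The delta values are the
# peak/threshold shifts between a reference and a questioned character;
# threshold names (theta_e, theta_r, theta_t) follow the text.

def intensity_change_decision(d_Ib, d_Io, d_I0, theta_e, theta_r, theta_t):
    """Return the list of detected alterations for one character."""
    findings = []
    if d_Ib > theta_e:          # background peak shift -> paper erased
        findings.append("erased")
    if d_Io > theta_r:          # object peak shift -> ribbon changed
        findings.append("ribbon changed")
    if d_I0 > theta_t:          # optimal-threshold shift -> corrected/retyped
        findings.append("corrected")
    return findings
```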
4.3.4 Type-Font Identification
The identification of the type-font is a pattern recognition
problem. The number of type-fonts, including variations, exceeds three hundred. To achieve accurate identification, we have designed a type-font database which contains information on the type-font name,
features, manufacturers, year, spacing, number of characters per inch,
etc.
In our experimental study, we used an IBM Selectric II with


the third page and their corresponding images are shown in Table 5.12(a)
and (b) and Figure 5.21(a) and (b), respectively. Through the corner
feature and vertical line segment detection, we obtain the new pages 1
and 2 as shown in Table 5.13 and 5.14. After recognition routine is
performed, the images of the corresponding functional elements are shown
in Figure 5.22 and listed in Table 5.15.
5.4.4 Denotation Recognition
After the removal of the junction dots, the connecting line
segments and the functional elements, the binary image contains disjoint
denotations. The denotations of the circuit diagram are used to
describe physical properties or labels of the functional elements in a
schematic in terms of a character string. The denotation recognition
scheme is shown in Figure 5.23. The first step of the denotation
recognition is to recognize all the letters, numerals, and special characters such as Ω and μ, if they exist; Ω is a resistance unit and μ is a capacitance unit. This step is the same as that of the character
recognition technique given in Chapter 4. The second step is to
concatenate the character string to form a labeling of a functional
element. The denotation is always concatenated horizontally and is
similar to word recognition. Thus the criteria of concatenation are
(1) the Y-coordinate of the character string is almost the same (i.e., |ΔY| < 2);
(2) the X-coordinate of the character string is increasing (i.e., x_n > x_(n−1) > ... > x_2 > x_1 if n characters are concatenated);
(3) the spacing between two consecutive characters is limited to the width of the normal character x_c (i.e., x_i − x_(i−1) ≤ x_c).
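The three concatenation criteria can be sketched as a simple predicate. The character record fields (x, y) and the treatment of spacing as the difference of consecutive x-coordinates are assumptions for illustration:

```python
# Sketch of the three concatenation criteria for denotation strings.
# Character records (x, y) and the spacing measure are assumptions.

def can_concatenate(prev, curr, x_c, dy_max=2):
    """Decide whether curr continues the denotation started by prev."""
    same_row   = abs(curr["y"] - prev["y"]) <= dy_max   # criterion (1)
    increasing = curr["x"] > prev["x"]                  # criterion (2)
    close      = (curr["x"] - prev["x"]) <= x_c         # criterion (3)
    return same_row and increasing and close
```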




unrecognizable elements. We call this procedure a multi-pass pattern
extraction process. Why do we decompose the schematic diagram into five
pages? The reasons are
(1) Each pass performs one specific modular function. The task is
easy to replace and modify.
(2) We want to extract each component and connection and still preserve the nature of the schematics.
(3) By considering each functional element as a character, we can
further utilize the concept of blocking.
5.4 Multi-Pass Pattern Extraction
After adaptive thresholding, a binary image of the schematic is
generated. Cleaning of spurs is not necessary in the extraction phase
because the spurs are small and can be ignored during the blocking
procedure. The first pass of pattern extraction is to remove the
junction dots from the binary image and store them in the first page,
i.e., reserve the branch property. The second pass is to remove
horizontal line segments first in order to further isolate the
functional elements. The third pass is to isolate functional elements
which are considered as character blocking as discussed in Chapter 4.
The fourth pass, which is to recognize the denotations, is a process
similar to the character recognition. The fifth and last pass is to
remove streaks and noise then copy the unrecognizable elements to page
five. The procedures of the multi-pass pattern extraction are described
in the following sections.


using the concept of the associative network. Finally, we propose the
conversion rules to link the electronic recognition system with the
SPICE package to enhance the system's capability. This link
demonstrates that the pictorial knowledge-based system can be integrated
with current CAD machines to make diagnosis and reduce manpower.


Figure 3.12 Information Flow in APRIKS System


INSECT
WHAT KIND OF INSECT ?
1 WORM, LOOPER OR CATERPILLAR
2 BUG
3 BEETLE
4 HOPPER
ENTER YOUR SELECTION : 1
ANSWER THE FOLLOWING QUESTION (TYPE IF YOU CAN'T ANSWER)
HOW MANY ABDOMINAL PROLEGS ?
1 2 PAIRS
2 3 PAIRS
3 4 PAIRS
ENTER YOUR SELECTION : 1
WHAT IS THE BODY COLOR ?
1 GREEN
2 GRAY
3 BROWN
4 PINK
5 BLACK
6 YELLOW
ENTER YOUR SELECTION : 1
THE SUGGESTED INSECTS ARE RANKED
1 SOYBEAN LOOPER
2 GREEN CLOVERWORM
3 VELVETBEAN CATERPILLAR
4 CORN EARWORM
5 BEET ARMYWORM
6 LESSER CORNSTALK BORER
7 THREE-CORNERED ALFALFA HOPPER
Figure 3.20 An Example of Decision-Making Mode


Table 5.8 Contents of Page 2 for Window 1
(a) Before Resolving Ambiguities
The Coordinates of the Line Segments :
REC #   LINE #   POS1   POS2   TYPE
  1       30       84     98    H
  2       30      121    128    H
  3       31      120    163    H
  4       56       83     93    H
  5       57       83    105    H
  6       57      118    137    H
  7       75        4     12    H
  8       75       20     50    H
  9       76       29     50    H
 10       76       73     91    H
 11       88      125    137    H
 12       99       85     92    H
(b) After Resolving Ambiguities
The Coordinates of the Line Segments :
REC #   LINE #   POS1   POS2   TYPE
  1       30       84     98    H
  2       31      120    163    H
  3       56       83    105    H
  4       57      118    137    H
  5       75        4     12    H
  6       75       20     50    H
  7       76       73     91    H
  8       88      125    137    H
  9       99       85     92    H


Figure 4.16 Computing Results of Correlations for the First Line of the Document


The 4-connected neighborhood of pixel P_(i,j):

              P_(i-1,j)
    P_(i,j-1)  P_(i,j)  P_(i,j+1)
              P_(i+1,j)

The algorithm of the binary filtering is the following.
(1) Calculate the sum of the 4-connected neighbors

    S = P_(i-1,j) + P_(i,j+1) + P_(i+1,j) + P_(i,j-1)

(2) If S > 2, set P_(i,j) = 1.
    If S < 2, set P_(i,j) = 0.
    If S = 2, P_(i,j) remains unchanged.

Figure 4.9 shows the results of binary filtering of Figure 4.8.
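The binary filtering rule can be sketched on a 2-D array of 0/1 pixels; handling of the image border (left unchanged here) is an assumption:

```python
# Minimal sketch of the 4-neighbor binary filter described above,
# applied to a 2-D list of 0/1 pixels (border pixels are left unchanged).

def binary_filter(img):
    rows, cols = len(img), len(img[0])
    out = [row[:] for row in img]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            # Sum of the 4-connected neighbors.
            s = img[i-1][j] + img[i][j+1] + img[i+1][j] + img[i][j-1]
            if s > 2:
                out[i][j] = 1        # fill gaps
            elif s < 2:
                out[i][j] = 0        # remove spurs
            # s == 2: pixel remains unchanged
    return out
```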
4.3.1.5 Character Isolation
The isolation procedure is accomplished by assigning a block number
to each isolated character. The whole procedure involves three
operations. One operation is blocking, or pyramid generation, which isolates the characters by enclosing them in rectangular blocks and determines the coordinates of each block. The second operation is
labeling, which assigns each isolated character block a code number by
scanning order. The third operation is resorting, which resorts all the
blocks in text order (i.e. character by character in typing order).
(a) Blocking Algorithm
The essence of the blocking concept is using the top-down pyramid
generation to find the corner coordinates of the circumscribing
rectangular box of each character.
The pyramid generation algorithm is as follows (refer to
Figure 4.10).
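The text's own method is the top-down pyramid generation of Figure 4.10. As an illustration of the blocking *result* only — one circumscribing rectangle per character — the following sketch uses a plain connected-component sweep instead, which is an assumption and not the pyramid algorithm itself:

```python
# Illustrative sketch of blocking output: a circumscribing rectangle
# (x1, y1, x2, y2) for each 8-connected component of 1-pixels.
# This uses a flood fill, not the original pyramid generation.

def blocking(img):
    rows, cols = len(img), len(img[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for i in range(rows):
        for j in range(cols):
            if img[i][j] and not seen[i][j]:
                # Flood-fill one character and track its bounding box.
                stack, x1, y1, x2, y2 = [(i, j)], j, i, j, i
                seen[i][j] = True
                while stack:
                    r, c = stack.pop()
                    x1, x2 = min(x1, c), max(x2, c)
                    y1, y2 = min(y1, r), max(y2, r)
                    for dr in (-1, 0, 1):
                        for dc in (-1, 0, 1):
                            rr, cc = r + dr, c + dc
                            if 0 <= rr < rows and 0 <= cc < cols \
                               and img[rr][cc] and not seen[rr][cc]:
                                seen[rr][cc] = True
                                stack.append((rr, cc))
                boxes.append((x1, y1, x2, y2))
    return boxes
```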


Table 5.5 Contents of Page 1 for Window 1
(a) Before Resolving Ambiguities
The Coordinates of the Connection Components :
REC #   X1    Y1   X2    Y2   TYPE
  1    136    29  140    33   J-1
  2    137    29  141    33   J-2
  3     79    55   83    59   J-4
  4     80    55   84    59   J-1
  5    136    55  140    59   J-2
  6    137    55  141    59   J-3
  7     39    73   43    77   J-2
  8     40    73   44    77   J-2
  9     79    73   83    77   J-4
 10     80    73   84    77   J-3
 11     39    74   43    78   J-1
 12     40    74   44    78   J-1
 13     79    74   83    78   J-5
 14     80    74   84    78   J-5
(b) After Resolving Ambiguities
The Coordinates of the Connection Components :
REC #   X1    Y1   X2    Y2   TYPE
  1    137    29  141    33   J-2
  2     79    55   83    59   J-4
  3    137    55  141    59   J-3
  4     39    73   43    77   J-2
  5     79    74   83    78   J-5


Table 4.4 Results of Type-Font Identification
CHARACTER   LINE   POSITION   TYPE-FONT
[per-character entries, each identified as ELITE or GOTHIC; the table body is illegible in the scanned source]


(a) Before Smoothing  (b) After Smoothing
Figure 4.3 Smoothing Operation


Figure 5.6 Junction Dot Patterns


can be easily converted into SPICE input format to interface and to
perform network analysis.
For the fourth advantage, we use AND gate symbol generation to
illustrate it. The simplified associative network tree of AND gate
symbol is shown in Figure 5.33.
To illustrate the fifth advantage, we set the conversion rules in
the following:
(1) Replace $ sign by a node number as an input.
(2) Replace # sign by node number 0.
(3) Replace * sign by a node number as an output.
(4) Combine two consecutive junction dots into one and assign a node
number to it.
(5) Insert a node number to the branch in which two symbols are linked
directly.
(6) Replace active elements by their equivalent circuit which appears as
a subcircuit form.
Following the conversion rules, we can get the SPICE format of
Figure 5.31 as follows.
$ ← 1          J1 ← 3
# ← 0          J2, J3 ← 4
* ← 2          J4, J5 ← 5
between CR2 and R3 ← 6
AR1 ← SUBCKT of operational amplifier
CR1, CR2 ← SUBCKT of diode


I certify that I have read this study and that in my opinion it conforms
to acceptable standards of scholarly presentation and is fully adequate,
in scope and quality, as a dissertation for the degree of Doctor of
Philosophy.
Julius T. Tou, Chairman
Graduate Research Professor of
Electrical Engineering
I certify that I have read this study and that in my opinion it conforms
to acceptable standards of scholarly presentation and is fully adequate,
in scope and quality, as a dissertation for the degree of Doctor of
Philosophy.
John Staudhammer
Professor of Electrical Engineering
I certify that I have read this study and that in my opinion it conforms
to acceptable standards of scholarly presentation and is fully adequate,
in scope and quality, as a dissertation for the degree of Doctor of
Philosophy.


node expression : artificial code of terminal, rotation index, label
Figure 5.30 An Associative Tree of Figure 5.29


Thus
C1 1 3
R1 3 0
R2 3 4
R3 4 6
R4 5 2
CR1 3 5
CR2 5 6
AR1 3 0 5
5.6 Knowledge Base Configuration
The knowledge base system of the circuit diagram recognition
consists of three major parts. The first part of the system is the
library which stores the symbol and its features. The library includes
the Symbol Dictionary (SD), Symbol-to-Feature Table (SFT), Feature
Dictionary (FD), Synonym Dictionary (SYND), Primitive File (PF),
Rotation Index Table (RIT), and Constraint File (CF). The SD, SFT and
FD constitute the heart of the knowledge-based system. Primitive File
provides pictorial features to the Feature Dictionary according to the
rotation index information which is stored in the Rotation Index
Table. The Synonym Dictionary is used to store the different
representations of the same symbol according to different graphic
standards.
The second part of the knowledge-based system is the control
module. The control module executes the multi-pass pattern extraction,


4.3 Design Methodology
To meet the above objectives and goals, we propose a system
architecture as shown in Figure 4.1, which consists of the subsystems of
document data acquisition, typewritten character recognition,
falsification detection, type-font/typewriter identification and type-
font/typewriter database.
4.3.1 Document Data Acquisition
The document data acquisition subsystem reads the document page,
removes the noise, converts characters into binary images, and isolates
the characters for subsequent character recognition. Major operations
in this subsystem are briefly discussed as follows (see Figure 4.2).
4.3.1.1 Scanning, Reflection and Windowing
Scanning the document is the first step in this system. We use an
Optronics Photoscan, which possesses a resolution of 100 μm in the
reflection mode, to generate a digitized picture. A typical lower case
character of 1.8 mm x 1.8 mm is represented by an 18 pixel by 18 pixel
digitized image. A typical upper case character is represented by a
30 pixel by 22 pixel digitized image.
Scanning creates a mirror image when it scans the printed
pictures. We employ a "REFLECT" algorithm to eliminate the mirror
effect and a "WINDOW" operation tailors the portion of a document for
examination.
4.3.1.2 Image Preprocessing
The goal of preprocessing is to convert an input gray-level picture
into a noise-free binary image. Preprocessing includes smoothing,


(a) Line Segments Before Linking
(b) New Line Segment After Linking


(a) An Example of Hierarchy
(b) Array Representation of m : n Pattern Vectors
Figure 3.14 Associated Pattern Vectors in Hierarchy


For an operational amplifier, the feature vector will be
a) # of primitive -4': 2
b) # of primitive > : 1
c) # of terminal (input) : 2
d) # of terminal (output) : 1
e) # of repetition of 4 : 1
Curved shape elements such as AND and OR type logic symbols can be
simplified by line segments representation. The simplified symbols of
AND and OR gates are shown in Figure 5.19. The simplified configuration
is very close to the quantized symbol, so the simplification would not
affect the recognition of the real curved symbols. Furthermore, after
simplification, we can find that the intersections of line segments and
vertices can be extracted as the feature vectors. These feature vectors
will enhance the recognition scheme.
For the AND gate, the feature vector is
a) # of primitive d : 2
b) # of primitive r : 1
c) # of primitive ¡_ : 1
d) # of primitive : 1
e) # of primitive s : 1
f) # of primitive ^ : 1
g) # of primitive j : 1
h) # of primitive f : 1
i) # of terminal (input) : 2
j) # of terminal (output) : 1
k) # of repetition of 4 : 1

For the OR gate, the feature vector is
a) # of primitive 4 : 2
b) # of primitive v : 1
c) # of primitive : 1
d) # of primitive ^ : 1
e) # of primitive : 1
f) # of primitive : 1
g) # of primitive r : 1
h) # of primitive : 1
i) # of primitive : 1
j) # of primitive > : 1
k) # of terminal (input) : 2


Figure 4.19 Flowchart of Document Character Recognition and Grouping


Table 5.14 Contents of Page 1 and Page 2
(a) Contents of Page 1 After Extracting Corners (Window 2)
The Coordinates of the Connection Components :
REC #   X1    Y1   X2    Y2   TYPE
  1    137    24  141    28   J-1
  2    204    49  208    53   J-1
  3    137    84  141    88   J-2
  4     42    84   46    88   J-2
  5     89    84   93    88   J-2
  6    169    97  173   101   J-4
  7    130    98  134   102   J-2
  8    204    26  206    28   J-7
  9    137    42  139    44   J-9
 10    139    51  141    53   J-6
 11    205    73  207    75   J-9
 12    197    99  199   101   J-7
(b) Contents of Page 2 After Extracting Vertical Line Segments (Window 2)
The Coordinates of the Line Segments :
REC #   LINE #   POS1   POS2   TYPE
  1       26       5    203    H
  2       44       5     55    H
  3       44      78    136    H
  4       51     142    162    H
  5       51     171    216    H
  6       74     186    204    H
  7       86       5      9    H
  8       86      32     57    H
  9       86      80    103    H
 10       86     126    151    H
 11       99     186    196    H
 12      139       2     23    V
 13      206      29     48    V
 14      137      29     41    V
 15      139      54     82    V
 16      205      55     72    V
 17       44      90    100    V
 18       91      90    100    V
 19      198     102    114    V
 20       91     109    119    V
 21       45     110    119    V
 22      170     122    130    V


KNOWLEDGE BASE FOR CONSULTATION AND IMAGE INTERPRETATION
BY
MING-CHIEH CHENG
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
1983


11. F. W. Smith and M. H. Wright, "Automatic Ship Photo Interpretation
by the Method of Moments," IEEE Trans. Computers, Vol. C-20,
pp. 1089-1095, 1971.
12. S. A. Dudani, K. J. Breeding and R. B. McGhee, "Aircraft
Identification by Moment Invariants," IEEE Trans. Computers,
Vol. C-26, pp. 39-45, 1977.
13. P. H. Swain and S. M. Davis, (Eds.), Remote Sensing: The
Quantitative Approach, McGraw-Hill International, London, 1978.
14. D. A. Landgrebe, "Analysis Technology for Land Remote Sensing,"
Proc. IEEE, Vol. 69, pp. 628-642, 1981.
15. K. S. Fu and J. K. Mui, "A Survey on Image Segmentation," Pattern
Recognition, Vol. 13, pp. 3-16, 1981.
16. A. Rosenfeld and A. C. Kak, Digital Picture Processing, Academic
Press, New York, 1982.
17. B. Justusson, "Noise Reduction by Median Filtering," Proc. 4th
Int. Conf. Pattern Recognition, pp. 502-504, 1978.
18. M. Nagao and T. Matsuyama, "Edge Preserving Smoothing," Computer
Graphics and Image Processing, Vol. 10, pp. 394-407, 1979.
19. J. S. Weszka, R. N. Nagel, and A. Rosenfeld, "A Threshold
Selection Technique," IEEE Trans. Computers, Vol. C-23,
pp. 1322-1326, 1974.
20. C. K. Chow and T. Kaneko, "Automatic Boundary Detection of the
Left Ventricle from Cineangiograms," Computer Biomed. Res.,
Vol. 5, pp. 388-410, 1972.
21. N. Otsu, "A Threshold Selection Method from Gray-level Histogram,"
IEEE Trans. Syst. Man. Cybern., Vol. SMC-9, pp. 62-66, 1979.
22. M. K. Hu, "Visual Pattern Recognition by Moment Invariants," IRE
Trans. Information Theory, Vol. IT-8, pp. 179-187, 1962.
23. T. C. Hsia, "A Note on Invariant Moments in Image Processing,"
IEEE Trans. Syst. Man. Cybern., Vol. SMC-12, pp. 831-834, 1981.
24. H. Freeman, "Computer Processing of Line-Drawing Data," Computing
Surveys, Vol. 6, pp. 57-96, 1974.
25. C. T. Zahn and R. Z. Roskies, "Fourier Descriptors for Plane
Closed Curves," IEEE Trans. Computers, Vol. C-21, pp. 269-281,
1972.


Figure 4.6 Gray-Level Picture of Figure 4.5


    t_m = 2π ( Σ_{k=1..m} Δl_k ) / ( Σ_{k=1..M} Δl_k ),  m = 1, ..., M        (2.10)

and

    u(t_m) − u(t_(m−1)) = Δl_m exp(jφ_m),  m = 1, ..., M                      (2.11)

Substituting (2.10) and (2.11) into (2.7), we obtain

    a_n = (1/2πn) Σ_{m=1..M} Δl_m exp[ j( φ_m
          − 2πn Σ_{k=1..m} Δl_k / Σ_{k=1..M} Δl_k ) ],  n = 1, 2, ...         (2.12)

For the coefficient a_0,

    a_0 = (1/2π) ∫_0^{2π} u(t) dt
        ≈ (1/2π) Σ_{m=1..M} [ t_m − t_(m−1) ] u(t_m)                          (2.13)

The coefficient a_0 represents the position of the shape center of the contour C. Therefore equation (2.13) can be used to compute the shape center asymptotically. If we shift the shape center to the origin of the new coordinates, then the Fourier expression of the contour C can be rewritten as

    u(t) − a_0 = Σ_{n=1..∞} [ a_n exp(jnt) + a_(−n) exp(−jnt) ],  0 ≤ t ≤ 2π  (2.14)

To describe the shape features, we propose the following measures:


Figure 4.22 (continued)


Table 5.11 (continued)
(b) Window 2
BLOCK #   X1   Y1   X2   Y2
[table entries illegible in the scanned source]


    J = e^T W e                                                               (2.27)

Then the least-squares estimation of the coefficient vector A is given by

    A = (H^T W H)^(−1) H^T W Y                                                (2.28)

We note that equation (2.28) reduces to ordinary least squares when W = I, where I is the identity matrix. The predictive searching may help the continuity of the boundary instead of using interpolation or extrapolation.
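The weighted least-squares estimate of equation (2.28) can be sketched with NumPy; the function name is an illustration, not from the original system:

```python
# Sketch of the weighted least-squares estimate A = (H^T W H)^(-1) H^T W Y
# from equation (2.28). With W = I this reduces to ordinary least squares,
# as noted in the text.
import numpy as np

def weighted_least_squares(H, Y, W=None):
    H, Y = np.asarray(H, float), np.asarray(Y, float)
    if W is None:                     # W = I -> ordinary least squares
        W = np.eye(H.shape[0])
    HtW = H.T @ W
    # Solve (H^T W H) A = H^T W Y rather than forming the inverse explicitly.
    return np.linalg.solve(HtW @ H, HtW @ Y)
```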
2.4 Decision Criterion

During the classification process, the statistical pattern recognition techniques combined with production rules form a decision tree to guide the system to make a decision. There are two decision criteria used in pattern recognition: the time-domain classifier and the frequency-domain classifier. In the time domain, we consider that most random processes are governed by the Gaussian probability law; therefore, the Bayesian decision function [29] is given by

    d_i(X) = ln[P(ω_i)] − (1/2) ln|C_i|
             − (1/2) (X − μ_i)^T C_i^(−1) (X − μ_i),  i = 1, ..., M           (2.29)

where P(ω_i) is the a priori probability of occurrence of the i-th class ω_i; X = [x_1 x_2 ... x_m]^T is the pattern vector and x_1, ..., x_m are the selected features; M is the number of classes; and μ_i and C_i are the mean vector and covariance matrix of the i-th class. The pattern X is assigned to class ω_i if, for that pattern,

    d_i(X) > d_j(X)  for all j ≠ i                                            (2.30)
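Equations (2.29) and (2.30) amount to a Gaussian Bayes classifier, sketched below; the (prior, mean, covariance) class representation is an assumption for illustration:

```python
# Sketch of the Bayesian decision function of equation (2.29): the class
# with the largest d_i(X) wins, per equation (2.30).
import numpy as np

def bayes_classify(x, classes):
    """classes: list of (prior, mean, cov); returns index of the best class."""
    x = np.asarray(x, float)
    scores = []
    for prior, mu, cov in classes:
        mu, cov = np.asarray(mu, float), np.asarray(cov, float)
        diff = x - mu
        # d_i(X) = ln P(w_i) - (1/2) ln|C_i| - (1/2)(X-mu)^T C_i^{-1} (X-mu)
        d = (np.log(prior)
             - 0.5 * np.log(np.linalg.det(cov))
             - 0.5 * diff @ np.linalg.solve(cov, diff))
        scores.append(d)
    return int(np.argmax(scores))
```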
In the frequency domain (e.g., the Fourier descriptor), for similar reasons it is assumed that the distribution of certain descriptor


The size difference detection is very straightforward. We choose one character from a group of characters as the reference (ΔX_r, ΔY_r), perform size measurements on the rest of the characters, and compare them with the reference. If |ΔX_i − ΔX_r| > θ_x or |ΔY_i − ΔY_r| > θ_y, then a size difference is detected. We continue to test the other groups until no group is left. The functional diagram of size difference detection is shown in Figure 4.24.
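The size-difference test above can be sketched as follows; the record fields (dx, dy) for the measured width and height deltas are assumptions for illustration:

```python
# Sketch of the size-difference test: each character's width/height
# measurements are compared against a chosen reference character
# of the same group. Field names and thresholds are assumptions.

def size_differences(group, theta_x, theta_y):
    """Return indices of characters whose size deviates from the reference."""
    ref = group[0]                               # reference (dX_r, dY_r)
    flagged = []
    for idx, ch in enumerate(group[1:], start=1):
        if (abs(ch["dx"] - ref["dx"]) > theta_x or
                abs(ch["dy"] - ref["dy"]) > theta_y):
            flagged.append(idx)
    return flagged
```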
For type-font difference detection, we use the shape matching
technique. If two characters in the same group are identical with no
blurred edge, all the corresponding pixels will overlap when they are
superimposed. However, under practical situations, the typewritten
characters are contaminated by noise due to dust or ribbon defects.
These phenomena can cause the characters to shift relative to each other by a few pixels. We check the shift effect and evaluate whether they are the same type-font or not.
In a group, when character 1 is matched with character 2, all
matched pixels are cancelled. The unmatched pixels from both characters
are labeled 1 for pixels from characters 1 and 2 for those from
character 2, as illustrated in Figure 4.25. Let C_1 be the total number of pixels in character 1, C_2 be the total number of pixels in character 2, N_1 be the number of unmatched 1's, and N_2 be the number of unmatched 2's. We define a shape matching index in terms of C_1, C_2, and max(N_1, N_2).
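The matching step — superimpose, cancel matched pixels, count the remainder — can be sketched as below. Since the index formula itself is not fully legible in the source, only the counts C_1, C_2, N_1, N_2 are computed:

```python
# Sketch of the shape-matching step: superimpose two binary characters,
# cancel matched pixels, and count unmatched pixels N1 and N2 together
# with the pixel totals C1 and C2.

def shape_match_counts(a, b):
    """a, b: equal-size 2-D 0/1 lists. Returns (C1, C2, N1, N2)."""
    c1 = sum(map(sum, a))
    c2 = sum(map(sum, b))
    n1 = n2 = 0
    for ra, rb in zip(a, b):
        for pa, pb in zip(ra, rb):
            if pa and not pb:
                n1 += 1                  # unmatched pixel from character 1
            elif pb and not pa:
                n2 += 1                  # unmatched pixel from character 2
    return c1, c2, n1, n2
```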


    f_T(x,y) = 1  if f(x,y) > T
               0  if f(x,y) ≤ T
In the ideal case, the histogram of the image has a deep and sharp
valley between two peaks representing object and background,
respectively, so that the threshold can be chosen at the bottom of this
valley. However, for most real pictures, it is often difficult to
detect the valley bottom precisely, especially in cases where the valley
is flat and broad, imbued with noise, or when the two peaks are
extremely unequal in height, all of which cause the valley to be
untraceable. Some techniques have been proposed in order to overcome
these difficulties [19][21].
Using the clustering techniques, we develop a thresholding method
which chooses a threshold based upon the maximum interset variance
criterion. Our approach is similar to Otsu's method [21]. The
theoretical development of our approach is in Chapter 4.
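Threshold selection by maximizing the between-class ("interset") variance can be sketched as follows. This is a generic Otsu-style implementation for illustration, not the exact derivation given in Chapter 4:

```python
# Sketch of threshold selection by maximizing the between-class
# ("interset") variance over a gray-level histogram, in the spirit
# of Otsu's method [21].

def select_threshold(hist):
    """hist: list of pixel counts per gray level. Returns the best threshold."""
    total = sum(hist)
    total_mean = sum(g * h for g, h in enumerate(hist)) / total
    best_t, best_var = 0, -1.0
    w0 = m0 = 0.0                       # weight and mean sum of the low class
    for t, h in enumerate(hist[:-1]):
        w0 += h / total
        m0 += t * h / total
        w1 = 1.0 - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0, mu1 = m0 / w0, (total_mean - m0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t
```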
2.2.4 Gap Filling
We now wish to assure that the boundaries of all objects are indeed
closed in the binary image. This is done by connecting two pixels
within three pixels of each other (see Figure 2.2). Given two pixels at (x_o, y_o) and (x_i, y_i), the distance d_o and the angle θ_o are

    d_o = [ (x_i − x_o)^2 + (y_i − y_o)^2 ]^(1/2)
    θ_o = tan^(−1)[ (y_i − y_o) / (x_i − x_o) ]
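The gap test can be sketched directly from these formulas; the three-pixel limit comes from the text, while the function names are illustrative:

```python
# Sketch of the gap test from the passage above: two boundary pixels
# are connected when their Euclidean distance is at most three pixels.
import math

def gap_distance_angle(xo, yo, xi, yi):
    """Distance and direction from (xo, yo) to (xi, yi)."""
    d = math.hypot(xi - xo, yi - yo)
    theta = math.atan2(yi - yo, xi - xo)
    return d, theta

def should_connect(p, q, max_gap=3.0):
    return gap_distance_angle(*p, *q)[0] <= max_gap
```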


++++++ SHAPE ANALYSIS ++++++
(A) SMEARED CHARACTERS
CHARACTER   LINE #   POSITION
e           7        15
s           9        7
(B) SIZE DIFFERENCE
CHARACTER   LINE #   POSITION
NONE
(C) TYPE-FONT DIFFERENCE
CHARACTER   LINE #   POSITION
NONE
REMARKS : NO FALSIFICATION DETECTED
Figure 4.28 The Result of Shape Analysis


Figure 3.13 APRIKS Data Structure


Figure 4.4 Probability Density Function Derived from
Histogram of Gray-level Picture


Figure 4.13 Character Coordinate File
(columns: BLOCK, X1, Y1, X2, Y2, LINE, POS., DIFFY, NCHAR; entries illegible in the scanned source)


CHAPTER
Page
4 FALSIFIED DOCUMENT DETECTION AND FONT IDENTIFICATION 68
4.1 Introduction 68
4.2 Noise Background 70
4.3 Design Methodology 71
4.3.1 Document Data Acquisition 71
4.3.1.1 Scanning, Reflection and Windowing 71
4.3.1.2 Image Preprocessing 71
4.3.1.3 Adaptive Threshold Determination
and Binary Picture Generation 74
4.3.1.4 Binary Filtering (Spur Removal
and Gap Filling) 78
4.3.1.5 Character Isolation 83
4.3.2 Character Recognition and Grouping 95
4.3.2.1 Correlation Technique 95
4.3.2.2 Feature Pattern Matching 101
4.3.3 Falsification Detection 106
4.3.3.1 Alignment Analysis 110
4.3.3.2 Shape Analysis 114
4.3.3.3 Intensity Change Analysis 119
4.3.4 Type-Font Identification 125
4.3.5 Knowledge Base Design 133
4.4 Discussion 139
5 COMPUTER RECOGNITION OF ELECTRONIC CIRCUIT DIAGRAMS 142
5.1 Introduction 142
5.2 Analysis of Electronic Circuit Diagram 143
5.3 System Architecture 150
5.4 Multi-Pass Pattern Extraction 156
5.4.1 Extraction of Junction Dots 157
5.4.2 Extraction of Horizontal Connecting
Line Segments 169
5.4.3 Extraction of Functional Elements 173
5.4.4 Denotation Recognition 190
5.4.5 Reconstruction of Rectangular Shape Elements 198
5.4.6 Processing of Unrecognizable Page 200
5.5 Pictorial Manipulation Language 200
5.5.1 Symbol Description Language (SDL) 202
5.5.2 Picture Generation Language (PGL) 207
5.6 Knowledge Base Configuration 214
5.7 Discussion 215
6 CONCLUSION 218
6.1 Summary 218
6.2 Areas for Future Work 221
REFERENCES 223
BIOGRAPHICAL SKETCH 230


contents of the temporary file are shown in Table 5.7 and the image
after potential diode and capacitor symbols removal is shown in
Figure 5.13(a). The image after junction dots removal is shown in
Figure 5.13(b).
5.4.2 Extraction of Horizontal Connecting Line Segments
Observing the electronic diagrams, there are only horizontal and
vertical line segments connecting various circuit elements. In order to
isolate each functional element, the horizontal and vertical connecting
line segments must be removed before blocking. However, to remove the
vertical connecting line segments is very tedious and time-consuming
because image data are scanned horizontally and stored in a random
access file. The lack of array processing facilities does not permit us
to extract the vertical connecting line segments from the binary image
files. To compensate for this drawback, we consider vertical connecting
line segments as functional elements during the blocking routine and
they will be isolated with very thin rectangular blocks which are
different from the real functional element blocks.
Following the removal of the junction dots, the binary image
contains disjoint connecting line segments and functional symbols. The
second pass pattern extraction is to extract the horizontal connecting
line segments and put them in the second page. The functional diagram
of horizontal connecting line extraction is shown in Figure 5.14 and the
algorithm is the following:
(1) Trace and record the starting point and the length of each horizontal line segment in scanning order; a segment may end at a functional element or at the rim of the image.
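Step (1) is a run-length trace over the binary raster, which can be sketched as follows. The minimum length separating connecting lines from short symbol strokes is an assumption for illustration:

```python
# Sketch of step (1): run-length tracing of horizontal line segments in a
# binary raster, recording (row, start, length) in scanning order.
# The min_len cutoff is an assumption, not from the original system.

def trace_horizontal_segments(img, min_len=8):
    segments = []
    for r, row in enumerate(img):
        c = 0
        while c < len(row):
            if row[c]:
                start = c
                while c < len(row) and row[c]:
                    c += 1
                if c - start >= min_len:      # keep only long runs
                    segments.append((r, start, c - start))
            else:
                c += 1
    return segments
```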


Connection types: conducting crossing, corner, free end, conducting touching, nonconducting crossing.
Figure 5.28 Standard Configuration of Connections


valley in a bimodal histogram by taking the point of the maximum interset variance. This algorithm has also been applied successfully to unimodal histograms, such as lung tissue images and textural images. This adaptive threshold selection enables the system to perform the analysis stably in spite of changing photographic conditions.
(2) A direct method to compute the Fourier descriptor of the contour in
terms of its chain code sequences has been proposed. Four shape
measures (SF1 to SF4) are proposed to estimate the shape of an
object. These shape measures have the properties of invariance
under translation, rotation, and scaling and can be considered as
the distinguishing features for object recognition.
(3) A multi-pass pattern extraction has been proposed to recognize an
electronic circuit diagram. It separates a circuit diagram into
five pages according to the nature of the circuit. These five pages
will consist of the junction dots, line segments, functional
elements, denotations, and unrecognizable elements. Each page can
be processed individually. This approach simplifies the recognition
scheme.
(4) We used the pictorial primitives to extract the distinguishing
features for the classification. The symbol is converted into
structural features which can be described by a syntactic grammar
to perform syntactic pattern recognition. This approach links
statistical pattern recognition with syntactic pattern
recognition to make the recognition scheme more efficient.
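The adaptive threshold selection summarized in item (1) can be sketched as a search for the grey level maximizing the interclass (between-class) variance; this is an illustrative reimplementation of the criterion, not the system's code:

```python
def select_threshold(hist):
    """Return the grey level t that maximizes the between-class
    variance of the two populations split at t (levels <= t versus
    levels > t)."""
    total = sum(hist)
    grand_sum = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0      # pixel count of the low class
    sum0 = 0    # weighted grey-level sum of the low class
    for t, count in enumerate(hist):
        w0 += count
        sum0 += t * count
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue
        mu0 = sum0 / w0
        mu1 = (grand_sum - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t
```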


[Figure content: the diagram's labels include line spacing and
isolated character (grey level).]
Figure 4.21 Functional Block Diagram of Falsification Detection


[Figure content: flowchart with a looping branch.]
Figure 4.10 Pyramid Generation Algorithm


Table 5.4 Estimating Spacing Requirements

                                        Average Diagram Spacing
                                        (0.20-in. grid spaces)
Component
  Capacitors                            3-4
  Inductors                             4
  Resistor                              4
Diagram Items
  Transistor envelope diameter          4
  Resistor symbol length                3
  Capacitor symbol width                1-2
  Lettering height                      3/4
  Connection line spacing               1-1 1/2
  Spacing between groups of
    connection lines                    1-2


67. W. C. Lin and J. H. Pun, "Machine Recognition and Plotting of
Hand-Sketched Line Figures," IEEE Trans. Syst. Man. Cybern., Vol.
SMC-8, pp. 52-57, 1978.
68. S. Kakumoto, Y. Fujimoto, and J. Kawasaki, "Logic Diagram
Recognition by Divide and Synthesize Method," in J. C. Latombe
(Ed.), Artificial Intelligence and Pattern Recognition in Computer
Aided Design, North-Holland, Amsterdam, pp. 457-476, 1978.
69. B. Zavidovique and G. Stamon, "An Automated Process for
Electronics Scheme Analysis," Proc. 5th Int. Conf. Pattern
Recognition, pp. 248-250, 1980.
70. H. Bunke, "Experience with Several Methods for the Analysis of
Schematic Diagrams," Proc. 6th Int. Conf. Pattern Recognition,
pp. 710-712, 1982.
71. H. Bunke, "Computer Recognition of Circuit Diagrams," Purdue
University, TR-EE 80-54, 1980.
72. S. I. Shimizu, S. Nagata, A. Inoue, and M. Yoshida, "Logic Circuit
Diagram Processing System," Proc. 6th Int. Conf. Pattern
Recognition, pp. 717-719, 1982.
73. Y. Fukada, "Primary Algorithm for the Understanding of Logic
Circuit Diagrams," Proc. 6th Int. Conf. Pattern Recognition,
pp. 706-709, 1982.
74. T. Sato and A. Tojo, "Recognition and Understanding of Hand-Drawn
Diagrams," Proc. 6th Int. Conf. Pattern Recognition, pp. 674-677,
1982.
75. J. F. Jarvis, "The Line Drawing Editor: Schematic Diagram Editing
Using Pattern Recognition Techniques," Computer Graphics and Image
Processing, Vol. 6, pp. 452-484, 1977.
76. M. Ishii, M. Yamamoto, M. Iwasaki, and H. Shiraishi, "An
Experimental Input System of Hand-Drawn Logic Circuit Diagram for
LSI CAD," Proc. 16th Design Automation Conf., pp. 114-120, 1979.
77. J. M. Cheng, "Literature Review of Computer Recognition of
Schematic Diagrams," Center for Information Research, University
of Florida, Technical Report, 1982.
78. IEEE, Graphic Symbols for Electrical and Electronics Diagrams,
ANSI Y32.2-1975.
79. IEEE, Graphic Symbols for Logic Diagrams (Two-State Devices),
ANSI Y32.4-1973.


and typewriter manufacturers, and providing positive evidence of
falsification.
To detect falsification in a document, the system should be capable
of
(a) differentiating typewritten documents prepared by different
typewriters with the same single element ball,
(b) differentiating typewritten documents produced by different
single element balls of the same manufacturer and style placed
on the same typewriter, and
(c) differentiating typewritten documents prepared by the same
typewriter and the same element but at a different time.
To identify type-fonts and typewriter manufacturers, the system should
be able to
(a) determine the similarities and differences between two type-
fonts,
(b) recognize type-font manufacturers of the same and similar
typestyle, and
(c) determine the characteristics of typewriters made by different
manufacturers.
To provide positive evidence of falsification in a document, the system
should be able to make quantitative and minute comparisons and
measurements of individual typewritten characters and their
interrelationships.
In the following sections, we investigate the existing image
processing techniques in Chapter 2 and develop or modify them to suit


[Figure content: binary image of a typewritten paragraph, clipped at
the window edges.]
Figure 4.8 Binary Picture (Before Noise Removal)


multi-pass pattern extraction, the electronic diagram is segmented
according to the nature of the elements. Then the reading of electronic
symbols is treated as a "character" recognition problem discussed in
Chapter 4. The various procedures of multi-pass pattern extraction are
described in Sections 5.4.1-5.4.6. To describe the symbol
configurations and the inter-relationships among them, we introduce two
high-level pictorial manipulation languages, Symbol Description Language
(SDL) and Picture Generation Language (PGL). The PGL is an associative
network structure.
5.2 Analysis of Electronic Circuit Diagram
Electrical and logical schematics are line drawings which consist
of line segments and symbols representing circuits in graphic form.
They provide an important means of communication in
electrical, electronic and computer engineering. Standards for the
drawing of electric schematics can be found in several reference books
[78][80].
In general, electronic and logical schematics are characterized by
functional elements, connecting elements, and denotations. The
functional elements are represented by symbols consisting of two or more
terminals for connection to other elements. For instance, resistors,
diodes, and capacitors are two-terminal elements; transistors and
amplifiers are three-terminal elements; and flip-flops are four or more
terminal elements. Some basic symbols in electronic and logical
schematics are listed in Table 5.1. The connecting elements are


[Figure content: binary characters and prestored templates for the
Gothic and Prestige Elite type-fonts, compared over Quadrants I and
IV.]
Figure 4.37 Division of a Character into Quadrants


80. G. E. Rowbothan, Engineering and Industrial Graphics Handbook,
McGraw-Hill, New York, 1982.
81. R. O. Duda and P. E. Hart, "Use of the Hough Transformation to
Detect Lines and Curves in Pictures," Commun. ACM, Vol. 15,
pp. 11-15, 1972.
82. D. H. Ballard, "Generalizing the Hough Transform to Detect
Arbitrary Shapes," Pattern Recognition, Vol. 13, pp. 111-122,
1982.


Thus, it becomes obvious that FFT approach is much superior to the
direct-multiplication approach in both computing-time and reliability
considerations.
(d) Computing Algorithm for FFT Approach
In this subsection, we will briefly list the correlation-computing
algorithm of FFT approach which is primarily based on Equations (4.5),
(4.6), and (4.7).
Step 1  Input f(n1,n2) and g(n1,n2) with dimension (N1,N2).
Step 2  Normalize f(n1,n2) and g(n1,n2), i.e.,
        f̂(n1,n2) = f(n1,n2) / [ Σ(u=1..N1) Σ(v=1..N2) |f(u,v)|² ]^(1/2)
        ĝ(n1,n2) = g(n1,n2) / [ Σ(u=1..N1) Σ(v=1..N2) |g(u,v)|² ]^(1/2)
Step 3  Find the 2-D Fourier transforms F(k1,k2) and G(k1,k2) of the
        normalized arrays f̂(n1,n2) and ĝ(n1,n2), respectively.
Step 4  H(k1,k2) = F(k1,k2) G*(k1,k2) for all k1 and k2.
Step 5  Find the 2-D inverse Fourier transform h(n1,n2) of H(k1,k2).
Step 6  Find the maximum of the array h(n1,n2).
Figure 4.15 shows the block diagram of the above algorithm.
At this point, we need to determine the threshold θ. Because of
the scanning noise, the cross-correlation will not be the ideal value of
unity. If the threshold is set too low, errors might occur. If set too
high, there may be no samples reaching this high degree of
resemblance. The threshold is determined through experiment.
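Steps 1 to 6 can be sketched with NumPy's FFT routines. This is a minimal illustration: the function name is ours, and normalization by the Euclidean norm (so that the ideal correlation peak is unity) is assumed:

```python
import numpy as np

def max_cross_correlation(f, g):
    """Peak normalized cross-correlation of two equal-size arrays,
    computed via the FFT (Steps 1-6 of the text)."""
    # Step 2: normalize each array so self-correlation peaks at unity.
    f = f / np.sqrt(np.sum(np.abs(f) ** 2))
    g = g / np.sqrt(np.sum(np.abs(g) ** 2))
    # Steps 3-4: transform both arrays and multiply F by G-conjugate.
    H = np.fft.fft2(f) * np.conj(np.fft.fft2(g))
    # Steps 5-6: peak of the inverse transform over all shifts.
    h = np.fft.ifft2(H)
    return float(np.max(np.real(h)))
```

A character correlated with itself reaches the ideal value of unity; a scanned sample falls below it, which is why the threshold must be set experimentally.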


represented by junction dots, horizontal line segments, and vertical
line segments which connect functional elements. Various types of
connections are listed in Table 5.2. The denotations are used to
describe physical properties or labels of the functional elements in a
schematic in terms of a character string. For example, the string 10K
attached to a resistor means that the resistance of the resistor is
10,000 Ω. The string C1 attached to a functional element indicates that
the element is capacitor number one in the given schematic.
Denotations in a schematic will facilitate automatic recognition of the
functional elements.
From the electronic drafting handbook, some layout guidelines may
provide useful information for diagram recognition such as class
designation letters (Table 5.3) and estimating space requirements
(Table 5.4) [80]. Moreover, there exist some geometrical constraints of
logic symbols [67] (refer to Figure 5.1). They are as follows:
Class name: AND type gates
    f1 = (x - (x0 + a - b))²/b² + (y - y0)²/b² - 1
    f2 = (y - (y0 + b))/b
    f3 = (x - (x0 - a))/a
    f4 = (y - (y0 - b))/b
Class name: OR type gates
    f1 = 3((x - (x0 - a))/4a)² - ((y - (y0 - b))/2b)² - 1
    f2 = 3((x - (x0 - a))/4a)² + ((y - (y0 + b))/2b)² - 1
    f3 = ((x - (x0 - a - √3·b))/2b)² + ((y - y0)/2b)² - 1
Class name: amplifier type


BIOGRAPHICAL SKETCH
Ming-Chieh Cheng was born in Tainan, Taiwan, Republic of China
(R.O.C.), on April 2, 1948. He received his B.S. degree in engineering
science and his M.S. degree in electrical engineering from the National
Cheng Kung University, Taiwan, R.O.C., in 1970 and 1973, respectively.
He received an award from the Chinese Engineer Society in 1973 for
outstanding quality and content of his master's thesis. He served as an
officer in the Army R.O.T.C. in 1972. From 1973 to 1980, he was an
assistant scientist and worked on system configurations and special
hardware designs at the Chung Shan Institute of Science and Technology,
Taoyuan, Taiwan. Since March 1980, he has been working toward the Ph.D.
degree in electrical engineering at the University of Florida,
Gainesville, Florida. He is a research assistant in the Center for
Information Research, the University of Florida and a member of IEEE.
His research interests include pattern recognition, computer
vision, image processing, computer graphics, and mini- and microcomputer
applications.


[Figure content: chain-code listings of the original and the smoothed
figure, followed by the computed features of the figure (area,
centroid, moments, the angle THETA, and E1).]
(a) Elite Type-Font "h"
Figure 4.34 Chain-Coded Features of Character


[Figure content: a raw batch-mode input string; delimited keywords
cover soybean, tomato, and turf pests, the soybean looper
(Pseudoplusia includens), its injury and economic-threshold
descriptions, and non-chemical and chemical control recommendations
including Bacillus thuringiensis.]
Figure 3.8 Raw Data for Batch Mode


Figure 5.11 Binary Images


[Figure content: (a) a windowed document whose text describes the
test sequence (alignment test, size test, feature discrimination
tests, and a final intensity analysis test); (b) the filtered binary
image of the same window.]
Figure 4.38 An Example of a Document Which Consists of
Two Mixed Type-Fonts


= { entity-attribute relationships }
{K} = {all keywords, including entities and attributes}
The identifiers are converted into the "what" canonical form,
{identifier} → {"what" canonical form}
when  → what + time
how   → what + procedure
why   → what + reason
where → what + place
A question is segmented into five contextual components
{question} = {L} + {I} + {K} + {CI} + {C}
Consider the question "How to control soybean looper and velvetbean
caterpillar by Dipel?" The system performs contextual segmentation and
determines the following components.
{L}  = {and}
{I}  = {how} = what + {procedure}
{K}  = {control, soybean looper, velvetbean caterpillar, Dipel}
{CI} = {(control, soybean looper), (control, velvetbean
caterpillar), (Dipel, control)}
{C}  = {(Dipel, treatment procedure)}
The associative tree is a "what" type structure, and the production
rule is a "how" type structure. By embedding the production rule into
the hierarchy, the APRIKS system answers questions with "how," "why,"
"when," "where" as well as "what" format. In fact, our design of the
APRIKS system combines the concepts of pattern-directed approach and
rule-based approach. A simplified functional flowchart for the
question-answering mode is summarized in Figure 3.18.
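The contextual segmentation above can be sketched as simple set matching; the keyword lists and the matching rules here are illustrative assumptions, not the APRIKS implementation:

```python
LOGICAL_OPS = {"not", "or", "and"}                      # {L}
IDENTIFIERS = {"when", "how", "why", "where", "what"}   # {I}
KEYWORDS = {"control", "soybean looper",
            "velvetbean caterpillar", "dipel"}          # {K}

def segment_question(question):
    """Extract the {L}, {I}, and {K} components of a simple query."""
    text = question.lower().rstrip("?")
    words = text.split()
    L = {w for w in LOGICAL_OPS if w in words}
    I = {w for w in IDENTIFIERS if w in words}
    K = {k for k in KEYWORDS if k in text}
    return L, I, K
```

For "How to control soybean looper and velvetbean caterpillar by Dipel?" this yields {L} = {and}, {I} = {how}, and the four keywords of {K}.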


From this modification, we can apply the FD to any arbitrary curve
whether it is closed or not.
2.3.4 Expression of Fourier Coefficients in Terms of Chain Codes
In this section, we want to compute the FD of the contour in terms
of its chain code sequences. Let the grid spacing for chain code be
equivalent to the unit of the coordinate system and let g1, g2, ..., gM
be the chain code string of a contour C. This chain code string should
keep the interior region enclosed by C on the left (see Figure 2.3).
Let P be the perimeter of the contour C, and let 0 = t0 < t1 < ... <
t_{M-1} < tM = 2π be a partition of [0, 2π].
From equation (2.5), we get

    a_n = (1/2πnj) ∫(0..2π) exp(jnt) du(t)
        ≈ (1/2πnj) Σ(m=1..M) exp(jn t_m) [u(t_m) - u(t_{m-1})]

From equation (2.1), the chain code sequence g_k can be expressed
as
    {ℓ_k exp[j(π/4) g_k]}
where
    ℓ_k = 1   if g_k is even
    ℓ_k = √2  if g_k is odd,        k = 1, ..., M
Further, by approximation, we obtain

    P_m = Σ(k=1..m) ℓ_k                                        (2.8)

    P = Σ(k=1..M) ℓ_k                                          (2.9)
(2.9)


[Figure content: rows of sample characters AAAAABBBBBCCCCCDDDDD and
aaaaabbbbbcccccddddd in each font: (a) Letter Gothic, 12 pitch;
(b) Elite, 12 pitch; (c) Prestige Elite, 12 pitch; (d) Courier
Italic, 12 pitch.]
Figure 4.32 Four Type-Font Samples


[Figure content: line-printer intensity histograms of two characters
"h", one bar of asterisks per grey level. The printout reports a
reference threshold for each character and concludes
<<<<<< SAME INTENSITY >>>>>>.]
Figure 4.31 Intensity Histogram Comparison of Two Characters (h)


[Figure content: the descriptor sets of entities E_{1.1} and E_{1.2},
each inheriting the descriptors D_{1,j}(w_j) of the parent entity.]
Figure 3.6 Top-Down Approach for KB Generation


CHAPTER 4
FALSIFIED DOCUMENT DETECTION AND FONT IDENTIFICATION
4.1 Introduction
Today, a high percentage of white-collar crimes involves the
falsification of typewritten documents. Falsified document
detection is currently performed by expert document examiners using
manual techniques. However, the number of different type-fonts
currently in existence is over two thousand. Keeping track of these
fonts represents a nearly impossible data-processing task when manual
techniques are used. The popularity of certain type-fonts have caused
other manufacturers to produce look-alike type-fonts. These fonts are,
to the human eye, indistinguishable from the original manufacturer's
product. Furthermore, the advent of interchangeable type elements, such
as on the ubiquitous IBM Selectric, has caused additional
complications. A single type element may be interchanged between
different typewriters and even between typewriters from different
manufacturers. As a result, the falsified document detection problem is
extremely difficult to solve if we do not make use of modern computer
technology (e.g., expert system), image processing techniques, and
pattern recognition theory.
The main objective of this chapter is to develop a knowledge-based
system for detecting falsification in a document, identifying type-fonts


Figure 4.36 Key Feature of Type-Font


[Figure content: linkage from the ED record through the first
associated entity to the EAT and AD records.]
Figure 3.15 The Inter-relations Among the
ED, EAT and AD Records


Step 6 Delete the rectangular area, which has X1 ≤ x ≤ X2 and
Y1 ≤ y ≤ Y2, from the pyramid picture.
Step 7 Repeat steps 1 to 6 until no more "1" pixels exist.
The character coordinate file of Figure 4.9 is shown in the left 5
columns of Figure 4.13. The ordering of the block number shows the
scanning order of the characters.
(b) Character Classification
Before discussing the re-sorting algorithm, we would like to classify
the characters into five types according to their height and typing
position.
Type 0: a, c, e, m, n, o, r, s, u, v, w, x, z.
Type 1: b, d, f, h, k, l, t, all uppercase letters, and all numbers.
Type 2: g, p, q, y.
Type 3: i, j, ! (which have two-part characteristics).
Type 4: ', *, + (which have very small dimensions).
The y-dimension of Types 0 to 2 is listed in Table 4.1. In the
falsification detection task, Type 3 and Type 4 symbols are ignored
except for i and j. The i and j symbols appear without the dots on
top. We have two reasons for ignoring Type 3 and Type 4 characters.
First, they indicate very little about document falsification. Second,
due to the preprocessing they become degraded and cannot be easily
differentiated from common noise. Furthermore, we classify the
characters i and j as Type 0 and Type 1, respectively, after deletion of
the dot as a feature.
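The five-type classification reduces to a table lookup; a sketch following the listing above (with i and j reassigned after dot deletion):

```python
TYPE0 = set("acemnorsuvwxz")
TYPE1 = (set("bdfhklt")
         | set("ABCDEFGHIJKLMNOPQRSTUVWXYZ")
         | set("0123456789"))
TYPE2 = set("gpqy")

def char_type(ch):
    """Classify a character by height and typing position; returns
    None for the ignored Type 3 and Type 4 symbols."""
    if ch == "i":       # dotless i is treated as Type 0
        return 0
    if ch == "j":       # dotless j is treated as Type 1
        return 1
    if ch in TYPE0:
        return 0
    if ch in TYPE1:
        return 1
    if ch in TYPE2:
        return 2
    return None
```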


Figure 2.6 The Block Diagram of Pictorial Knowledge Base




strategies. However, in the real-world situation, the associated
descriptors at the top level are of a general nature and may be
irrelevant to system requirements because of the need for specific
characterization at the lower levels. In practice, the number of
inherent descriptors at the top level is smaller than that at the
lower levels.
In our work, we propose a modified top-down approach to simplify
knowledge base generation. From the relational hierarchy, we understand
that each entity is actually described by its associated descriptor sets
and its father-son relationship. We, therefore, consider the father-son
relationship as a potential descriptor, calling it the cover. That is,
the cover is a descriptor for the system internal representation. Thus,
the semantic linking rule for knowledge base generation is given by
[D_{i1.i2...im,j}(w_j)]_new = [D_{i1.i2...im,j}(w_j)]_old ∪ {C_{i1.i2...i(m-1)}}
where the cover
C_{i1.i2...i(m-1)} ∈ {E_{i1}, E_{i1.i2}, E_{i1.i2.i3}, ..., E_{i1.i2...i(m-1)}}
The modified top-down approach is illustrated in Figure 3.7.
The two kinds of data input method for knowledge base generation
and updating are batch mode and interactive mode. The batch mode is
suitable to accept a bunch of raw data and process at a time. The
interactive mode is used for small amounts of data. In APRIKS, we use
batch mode for knowledge base generation and interactive mode for


[Figure content: an entity hierarchy with descriptor sets {D(w_j)}
attached to its nodes.]
Figure 3.4 An Example of the Hierarchy for Knowledge Sketch


Figure 4.16 shows a computing result of cross-correlations of the
first line of the document shown in Figure 4.6. With a threshold
setting of θ = 0.9, we get the reprint of a document accurately (see
Figure 4.17).
4.3.2.2 Feature Pattern Matching
The feature pattern matching [62] is based on the character
profile. A character profile is generated by projecting a character on
two main axes; we call the projections the x-profile and y-profile. The
advantages of using profiles are (1) the profiles of the same character
are similar and independent of type-fonts, and (2) profile formulation
involves simple binary addition only, which is very fast and
efficient. By considering the x- and y-profiles as binary histograms,
and by dividing each histogram into three sections at the top (or left),
in the middle, or at the bottom (or right) portion of the character, we
can study further the histogram shapes. In order to quantitatively
describe the histogram shape, there are three estimated levels (1) no
maxima, (2) minor maxima (i.e., 0.5M* < M < 0.85M*, M* is the global
maximum), and (3) major maxima (i.e., M > 0.85M*). We encode three
levels by 0 for no maxima, 1 for minor maxima, and 2 for major maxima.
Using the above encoding, we can represent the x- and y-profiles by
pattern vectors V = [v1, v2, v3, v4, v5, v6]^T, where v1 to v3
represent the code of the x-profile at the left, middle, and right
portions, and v4 to v6 represent the code of the y-profile at the top,
middle, and bottom portions. Figures 4.18(a) and (b) illustrate the
profile pattern vectors of characters 'p' and 'e', respectively.
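The profile encoding can be sketched as follows; the binary character is a list of 0/1 rows, and splitting each profile into equal thirds is an illustrative assumption:

```python
def encode(profile, sections=3):
    """Encode each section of a profile histogram: 2 for a major
    maximum (> 0.85 of the global maximum M*), 1 for a minor maximum
    (0.5 to 0.85 of M*), 0 for no maximum."""
    M_star = max(profile)
    n = len(profile)
    codes = []
    for s in range(sections):
        part = profile[s * n // sections:(s + 1) * n // sections]
        M = max(part) if part else 0
        if M > 0.85 * M_star:
            codes.append(2)
        elif M > 0.5 * M_star:
            codes.append(1)
        else:
            codes.append(0)
    return codes

def profile_vector(image):
    """Pattern vector V = [v1..v6]: x-profile codes, then y-profile."""
    x_profile = [sum(col) for col in zip(*image)]   # column sums
    y_profile = [sum(row) for row in image]         # row sums
    return encode(x_profile) + encode(y_profile)
```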


(2) up-incline case if the deviations are monotonically increasing from
left to right (i.e., dy1 < dy2 < ... < dyn),
(3) jump-misalignment if the deviations change in a zig-zag, step-jump,
or sawtooth manner.
The threshold θJ of jump-misalignment is 2 pixels. If |dyj| > θJ,
then jump-misalignment of the jth character is detected.
For the different line spacing detection, we calculate the distance
(dVi) between two consecutive typing lines in one page. If the error of
the line spacing |dVi - dVi+1| is larger than the threshold θV, then a
line spacing difference is detected.
For horizontal spacing detection, we calculate the distance (dHi)
between two consecutive characters in the same typing line. A space is
considered as a character. If the distance difference |dHi - dHi+1| >
θH, then a horizontal spacing difference is detected. In our working
model, we set θH = θV = 2 pixels. Figure 4.23 shows the result of the
alignment analysis. The document which passes the alignment test may
proceed to shape analysis.
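The jump-misalignment and spacing tests reduce to threshold comparisons; a sketch with the 2-pixel threshold of the working model:

```python
THETA = 2  # pixels; threshold used for all three alignment tests

def jump_misaligned(deviations):
    """Indices j of characters whose baseline deviation dy_j exceeds
    the jump-misalignment threshold."""
    return [j for j, dy in enumerate(deviations) if abs(dy) > THETA]

def spacing_differs(distances):
    """True if two consecutive spacings (line spacing dV or character
    spacing dH) differ by more than the threshold."""
    return any(abs(d1 - d2) > THETA
               for d1, d2 in zip(distances, distances[1:]))
```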
4.3.3.2 Shape Analysis
The shape analysis has the two major functions of size difference
detection and type-font difference detection. If the sizes of two
characters which belong to the same group are different, then their
type-fonts shall be different. Thus we examine the size first. If the
characters can pass the size measurement test, they continue to take the
shape-matching test pairwise.


with the entity-attribute relationship is created. The upper level
entity is considered as an implicit attribute to speed up the
retrieval mode.
(3) The event driven control mechanism integrates a set of mutually
independent picture processing subsystems and controls the overall
process of the analysis. The control mechanism resolves the
ambiguities or conflicts among picture processing subsystems using
the predetermined thresholds which are stored in the knowledge base.
(4) The system has the feedback loops from the high-level processing
stage to the low-level processing stage to re-identify or re-examine
the unrecognizable object.
6.2 Areas for Future Work
Three examples, in Chapters 3, 4, and 5, have demonstrated that our
proposed knowledge-based system is able to perform an efficient and
reliable analysis of images. This study is a step toward handling
complex imagery such as aerial photography. Our blocking concept can
still be applied to complex imagery with the assistance of edge
detection and boundary tracing. However, the pictorial knowledge-based
system includes many picture processes: preprocessing, segmentation,
and feature extraction. These processes generally take too much time,
especially for large pictures. It will be necessary to design special
hardware architectures which will perform picture processing in
parallel. The current work has focused on 2-D objects only. To extend
to three-dimensional object recognition, we need to extract the partial


for the highest similarity measure if it exceeds the preset threshold.
The operation flowchart of decision-making mode is shown in Figure 3.17.
The question-answering mode provides a friendly input/output
interface. In the APRIKS system, the user's query can be semantically
categorized into three types:
(a) Cases which require a Yes/No type answer.
(b) Specific questions which require a what/which/who/when/where type
of answer.
(c) Procedures or sequences of actions which require a how/why type of
answer.
Although numerous techniques for processing natural languages have been
proposed in the literature, the problem of natural language
understanding by machine still remains unsolved in the practical
sense. In the APRIKS system we have introduced methods to handle simple
natural language sentences with structural format common to most users.
In the question-answering mode, the APRIKS system is designed to
analyze a simple sentence and to identify its five contextual
components.
{L}  = {semantically logical operators}
     = {not, or, and}
{I}  = {identifiers}
     = {when, how, why, where, what}
{CI} = {contextual information}
     = {father-son relationships in the hierarchy}
{C}  = {characteristics}


In the following sections, we discuss and modify some of the image
segmentation techniques which could be utilized in our design.
2.2.1 Edge Detection
The intent here is to enhance the image by increasing the values of
those pixels along the boundaries (edges) of the objects. Edges are
detected between regions of different intensity. To detect edges by
mask matching, we convolve a set of difference-operator-like masks, in
various orientations, with the picture. The mask giving the highest
value at a given point determines the edge orientation at that point,
and that value determines the edge strength. The masks used in our
system are the generalized Sobel operators [16] with constant factor 1/4
as shown below.
 1  2  1     2  1  0     1  0 -1     0 -1 -2
 0  0  0     1  0 -1     2  0 -2     1  0 -1
-1 -2 -1     0 -1 -2     1  0 -1     2  1  0

-1 -2 -1    -2 -1  0    -1  0  1     0  1  2
 0  0  0    -1  0  1    -2  0  2    -1  0  1
 1  2  1     0  1  2    -1  0  1    -2 -1  0
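Convolving the eight masks and keeping the strongest response can be sketched as follows; generating the masks as successive 45-degree rotations of the first one is our shortcut, and the 1/4 factor scales the output as in the text:

```python
# First directional mask; the other seven are 45-degree rotations.
BASE = [[1, 2, 1],
        [0, 0, 0],
        [-1, -2, -1]]

def rotate45(m):
    """Shift the ring of eight outer cells of a 3x3 mask by one
    position (a 45-degree rotation); the center stays put."""
    ring = [(0, 0), (0, 1), (0, 2), (1, 2),
            (2, 2), (2, 1), (2, 0), (1, 0)]
    out = [row[:] for row in m]
    for (r1, c1), (r2, c2) in zip(ring, ring[1:] + ring[:1]):
        out[r2][c2] = m[r1][c1]
    return out

MASKS = [BASE]
for _ in range(7):
    MASKS.append(rotate45(MASKS[-1]))

def edge_strength(win):
    """Maximum directional response of a 3x3 window, scaled by 1/4;
    the maximizing mask gives the edge orientation."""
    return max(sum(win[i][j] * mk[i][j]
                   for i in range(3) for j in range(3))
               for mk in MASKS) / 4.0
```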
2.2.2 Smoothing
The object of the smoothing is to "remove" isolated edges
corresponding to noise in the background, and to "insert" more edges
along the object boundaries. A powerful smoothing technique that does
not blur edges is median filtering [17], in which we replace the value
at a point by the median of the values in a neighborhood of the point.
For a 3x3 neighborhood, we use the fifth largest value. Since median
filtering does not blur edges, it can be iterated. However, a problem


Table 5.1 Some Basic Symbols in Electrical Schematics


[Figure content: the circuit diagram without junction dots and
connecting line segments, with functional elements blocked.]
Figure 5.17 Functional Element Extraction


26. G. H. Granlund, "Fourier Preprocessing for Hand Print Character
Recognition," IEEE Trans. Computers, Vol. C-21, pp. 195-201, 1972.
27. E. Persoon and K. S. Fu, "Shape Discrimination Using Fourier
Descriptors," IEEE Trans. Syst. Man. Cybern., Vol. SMC-7,
pp. 170-179, 1977.
28. Jianhua Xu and Julius T. Tou, "Predictive Searching for Chain
Encoding by Computers," Int. J. Computer and Information Sciences,
Vol. 11, pp. 213-229, 1982.
29. J. T. Tou and R. C. Gonzalez, Pattern Recognition Principles,
Addison Wesley Publishing Co., MA, 1974.
30. R. Davis, B. Buchanan, and E. Shortliffe, "Production Rules as a
Representation for a Knowledge-Based Consultation Program,"
Artificial Intelligence, Vol. 8, pp. 15-45, 1977.
31. R. S. Michalski, "Pattern Recognition as Ruled-Guided Inductive
Inference," IEEE Trans. Pattern Anal. and Mach. Intell. ,
Vol. PAMI-2, pp. 349-361, 1980.
32. R. A. Kowalski, Logic for Problem Solving, North Holland, New
York, 1979.
33. H. E. Pople, J. D. Myers, and R. A. Miller, "DIALOG: A Model of
Diagnostic Logic for Internal Medicine," Proc. 2nd Int. Joint
Conf. Artificial Intelligence, pp. 848-855, 1975.
34. H. E. Pople, "The Formation of Composite Hypotheses in Diagnostic
Problem Solving: An Exercise in Synthetic Reasoning," Proc. Fifth
Int. Joint Conf. Artificial Intelligence, pp. 1030-1037, 1977.
35. J. T. Tou, "Design of a Medical Knowledge System for Diagnostic
Consultation and Clinical Decision-Making," Proc. of the Int.
Computer Symposium, Vol. 1, pp. 80-99, 1978.
36. D. S. Nau, "Expert Computer Systems," Computer, Vol. 16,
pp. 63-85, 1983.
37. J. T. Tou, "Knowledge Engineering," Int. Journal of Computer and
Information Sciences, Vol. 9, pp. 275-285, 1980.
38. E. H. Shortliffe, Computer-Based Medical Consultations: MYCIN,
American Elsevier, New York, 1976.
39. S. M. Weiss, C. A. Kulikowski, S. Amarel, and A. Safir, "A Model-
Based Method for Computer-Aided Medical Decision-Making,"
Artificial Intelligence, Vol. 11, pp. 145-172, 1978.


Figure 5.18 An Example of Component Blocking


Figure 5.32 An Associative Tree of Figure 5.31


with two-dimensional median filtering is that it destroys thin lines as
well as isolated points, and it also "clips" corners.
Another edge-preserving smoothing method, based on discrete bar masks,
was proposed by Nagao and Matsuyama [18]. The discrete bar masks are shown
in Figure 2.1. The procedure of the edge-preserving smoothing is as
follows:
Step 1: Examine the neighborhoods of a point (X,Y) that are
covered by the discrete bar masks.
Step 2: Detect the position of the mask where its gray-level
variance is minimum.
Step 3: Give the average gray-level of the mask at the selected
position to the point (X,Y).
Step 4: Apply Steps 1-3 to all points in the picture.
Step 5: Iterate the above processes until the gray-levels of all
points in a picture do not change.
This smoothing method takes an average over a region, so it too will
destroy fine lines. Consequently, none of the smoothing techniques above
can be used for the character recognition in Chapter 4 or the circuit
diagram recognition in Chapter 5: if the characters or the line
segments are very thin, then these line segments will be lost.
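The masking procedure of Steps 1-4 can be sketched as follows. The four bar masks below (vertical, horizontal, and two diagonals, each of length 3) are a simplified stand-in for the mask set of Figure 2.1, which is not reproduced here.

```python
import statistics

# Simplified bar masks through the center pixel, given as (dy, dx) offsets.
# These are illustrative; Nagao and Matsuyama [18] use a larger mask set.
BAR_MASKS = [
    [(-1, 0), (0, 0), (1, 0)],   # vertical bar
    [(0, -1), (0, 0), (0, 1)],   # horizontal bar
    [(-1, -1), (0, 0), (1, 1)],  # diagonal bar
    [(-1, 1), (0, 0), (1, -1)],  # anti-diagonal bar
]

def smooth_once(img):
    """One pass of Steps 1-3 over all interior pixels (Step 4)."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            best = None
            for mask in BAR_MASKS:
                vals = [img[y + dy][x + dx] for dy, dx in mask]
                var = statistics.pvariance(vals)
                if best is None or var < best[0]:
                    best = (var, sum(vals) / len(vals))
            out[y][x] = best[1]  # Step 3: mean of the minimum-variance mask
    return out
```

Step 5 would iterate `smooth_once` until the picture no longer changes. Because the minimum-variance mask never straddles an edge, a sharp step is preserved while noise along the edge is averaged out.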
2.2.3 Automatic Threshold Selection
To generate the binary picture, we segment the gray-level image by
thresholding. That is, pixels whose gray-level is greater than or equal
to a threshold T are set to 1, and all other pixels are set to 0:


{D_i,j(w_j,A_j,V_j)} = {D_i,1(w_1,A_1,V_1), ..., D_i,n(w_n,A_n,V_n)}
For simplicity in writing, we use the short notation D_i,j(w_j) by
dropping the attribute name and value. Furthermore, it is assumed that
the descriptors D_i,1(w_1), D_i,2(w_2), ..., D_i,N(w_N) are independent.
We make this assumption since the dependence aspect of the descriptors
is not adequately understood. In fact, it has been recognized that
classification decisions are robust with respect to the assumption of
conditional independence [54].
Common descriptors exist between two entities E_i and E_j if the
name or the meaning between the two descriptor sets {D_i,k(w_k)} and
{D_j,l(w_l)} are the same. We denote the common descriptors by
{D_k(w_k)}, where the factor w_k is defined by
w_k = min(w_k, w_l)
The common descriptors form a semantic link between two entities, and
the most frequently used name is chosen as the master term.
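The common-descriptor rule above can be sketched as follows, modeling each descriptor set as a map from descriptor name to weighting factor. The names and weights in the example are illustrative, not taken from the actual knowledge files.

```python
# Descriptor sets are modeled as {name: weight} maps (illustrative).
def common_descriptors(d_i, d_j):
    """Common descriptors keep the minimum of the two weights."""
    return {name: min(d_i[name], d_j[name])
            for name in d_i.keys() & d_j.keys()}

def distinct_descriptors(d_i, d_j):
    """Descriptors appearing in only one set keep their own weight."""
    out = {n: w for n, w in d_i.items() if n not in d_j}
    out.update({n: w for n, w in d_j.items() if n not in d_i})
    return out
```

For two entities with descriptor sets `{"color": 0.9, "shape": 0.4}` and `{"color": 0.6, "size": 0.8}`, the common descriptor is `color` with weight 0.6, and `shape` and `size` are distinct.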
Distinct descriptors between two entities E_i and E_j are expressed
as {D^o(w^o)} if the name and meaning between the two descriptor sets are
different. The distinct weighting factor is defined as
w^o = w_k   if D^o = D_i,k
w^o = w_l   if D^o = D_j,l
Two entities will be linked together if the names of the entities
are the same. We call this link an entity semantic link, denoted by
E(i,j). However, several individual descriptors may have common
descriptions with different weights. The entity semantic link has the


However, due to preprocessing and noise effects, some pixels are
missing and some edges are blurred. These phenomena degrade the
profiles and make some characters indistinguishable. To solve this
problem, we use the gap characteristic as an additional feature. We encode
the gap features of a character by the location of the gaps, as shown in
Table 4.3. The number of elements of the pattern vector is thereby
increased to 8 (i.e., V = [v1, v2, v3, v4, v5, v6, v7, v8]^T). The algorithm
of the feature pattern matching is as follows.
Step 1 Determine character type for each character.
Step 2 Use the character type to reduce the search range in the
character file.
Step 3 Perform the pattern vector matching to find a candidate
character.
Using the above algorithm to test four different type-fonts (Letter
Gothic, Olympia Gothic, Elite, and Prestige Elite), 70-85%
of the characters are recognized correctly.
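The three steps can be sketched as follows, assuming a character file keyed by character type and a simple squared-distance measure between eight-element pattern vectors; both the file layout and the distance are illustrative assumptions.

```python
# character_file: {character type: {character: pattern vector}} (assumed layout)
def match_character(char_type, pattern_vector, character_file):
    # Steps 1-2: the character type reduces the search range.
    candidates = character_file.get(char_type, {})
    # Step 3: the nearest pattern vector gives the candidate character.
    def dist(v):
        return sum((a - b) ** 2 for a, b in zip(v, pattern_vector))
    return min(candidates, key=lambda ch: dist(candidates[ch]), default=None)
```

A character whose measured vector differs from the stored "b" vector in only one element is matched to "b"; an unknown character type yields no candidate.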
The grouping procedure is to classify the isolated characters into
alphabet or numeral categories. After character recognition, a
character-grouped file is generated. The flowchart of the character
recognition and grouping subsystem is shown in Figure 4.19. The
character-grouped file of Figure 4.9 is shown in Figure 4.20.
4.3.3 Falsification Detection
The falsification detection subsystem performs a sequence of
operations including size measurement, spacing measurement, alignment
analysis, shape template matching and gray-level intensity analysis.


Figure 4.25 An Example of Shape Matching


Table 5.11 Coordinate Files
(a) Window 1

BLOCK#    X1    Y1    X2    Y2
   1     100    19   106    23
   2     109    19   111    23
   3     113    19   117    23
   4      99    26   120    34
   5      81    30    83    54
   6     129    30   131    30
   7     133    30   133    30
   8     163    31   171   136
   9     134    34   143    54
  10     106    52   117    61
  11      94    56    96    56
  12      98    56    98    56
  13     128    58   128    58
  14     130    58   131    58
  15     133    58   135    58
  16      81    60    82    72
  17     138    60   140    88
  18       9    62    10    66
  19      19    62    21    66
  20      14    63    17    67
  21      53    63    59    68
  22      66    63    70    68
  23      62    64    64    67
  24      13    70    15    80
  25      17    70    19    90
  26      88    70   124   108
  27      51    72    72    79
  28     173    73   175    78
  29     177    73   179    77
  30     182    73   185    78
  31     191    73   195    78
  32     188    74   189    77
  33       1    75     3    75
  34      26    76    28    76
  35      38    79    46   136
  36      78    99    88   118
  37      49   103    55   108
  38      58   103    60   108
  39      62   103    66   108
  40     114   110   114   114
  41     116   110   117   114
  42     119   110   119   110
  43     102   111   112   114
  44     119   111   124   114
  45      80   120    86   120
  46      82   122    84   123


and utilization of knowledge. The organization of this knowledge-based
expert system is illustrated in Figure 3.1.
Through the knowledge acquisition task, the system transfers
experts' experience and know-how into the knowledge base. The system
extends its strategy through an inference mechanism. The system is
designed to perform the following three modes of operation: information
retrieval and browsing, decision-making and consultation, and diagnostic
analysis and question-answering. This framework of a knowledge-based
expert system is applied to productivity improvement in agriculture.
The design of APRIKS system is discussed in the following sections.
In the APRIKS knowledge base, strategic information items are
characterized by feature patterns. The knowledge seeking process is
accomplished through the recognition of feature patterns which match
most closely with query patterns formulated by APRIKS from the user's
query. Two information patterns are said to be matched if a certain
performance criterion is satisfied. The strategic information items are
linked to the detailed descriptions and working knowledge which are
stored in various knowledge files.
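The matching of a query pattern against the stored feature patterns can be sketched as follows. The patterns, the Jaccard similarity measure, and the threshold are illustrative assumptions, not the actual APRIKS performance criterion.

```python
# feature_patterns: {information item: list of feature terms} (illustrative)
def best_match(query, feature_patterns, threshold=0.5):
    """Return the item whose feature pattern matches the query most closely,
    provided the similarity criterion (here, a Jaccard score) is satisfied."""
    def similarity(p):
        common = len(set(query) & set(p))
        return common / max(len(set(query) | set(p)), 1)
    scored = [(similarity(p), item) for item, p in feature_patterns.items()]
    score, item = max(scored)
    return item if score >= threshold else None
```

A query that shares most of its terms with one stored pattern retrieves that item; a query matching nothing well enough returns no item, which is where step (3), making the query more specific, would come in.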
In the APRIKS system, the knowledge seeking and utilization process
involves six basic steps:
(1) Formulate the query information patterns from user's query,
(2) Retrieve the feature patterns from the knowledge base,
(3) Modify the user's query to make it more specific,
(4) Recognize the associated feature patterns via knowledge-based
pattern recognition and inference,


Figure 4.1 System Architecture for Automatic Typewriter Identification


Figure 5.5 The Primitives Used in the Circuit Diagram
Recognition System


Figure 3.18 The Operation Flowchart of Question-Answering Mode


For minute difference comparison, we propose to use local pictorial
features. The two approaches to extract pictorial features are
skeletonization [64] and quadrant division [62]. The skeletonization
provides topological features such as triple point, end point, and
connectedness for the font identification. The skeletons of the character
'A' in four different type-fonts are shown in Figure 4.35. Experienced
document examiners classify type-fonts by key features only. An example
of the key feature of some type-fonts is shown in Figure 4.36. For the
quadrant division, we divide a character into four quadrants and store
the significant quadrant(s) into the data base for partial matching
purpose. The significant quadrants of a character "a" for the Gothic
and the Prestige Elite type-fonts are shown in Figure 4.37. The system
performs the partial matching with the data base and selects the one
which has the highest score as a candidate type-font.
We illustrate the identification task using Figure 4.38 as an
example. Figure 4.38 consists of two mixed type-fonts. After
identification, the results are shown in Table 4.4.
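Quadrant division and partial matching can be sketched as follows. The stored significant quadrants and the pixel-agreement score are illustrative assumptions; the actual data base stores the quadrants of Figure 4.37.

```python
def quadrants(bitmap):
    """Split a 2-D binary bitmap into four quadrants (NW, NE, SW, SE)."""
    h2, w2 = len(bitmap) // 2, len(bitmap[0]) // 2
    return [
        [row[:w2] for row in bitmap[:h2]],  # NW
        [row[w2:] for row in bitmap[:h2]],  # NE
        [row[:w2] for row in bitmap[h2:]],  # SW
        [row[w2:] for row in bitmap[h2:]],  # SE
    ]

def partial_match_score(bitmap, stored):
    """Count agreeing pixels over a font's stored significant quadrants.
    stored: {quadrant index: reference quadrant bitmap} (assumed layout)."""
    qs = quadrants(bitmap)
    score = 0
    for idx, ref in stored.items():
        q = qs[idx]
        score += sum(a == b for ra, rb in zip(q, ref) for a, b in zip(ra, rb))
    return score
```

The system would compute this score against each candidate type-font's stored quadrants and select the font with the highest score.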
4.3.5 Knowledge Base Design
The fourth subsystem of the proposed system is the knowledge base
for falsification detection and font identification. Without a
knowledge base, the system will not be able to integrate the procedures,
pattern features, and detection criteria together to make a positive
identification. The knowledge base will be filled with complete
information on various type-fonts and typewriter manufacturers together
with the experience and know-how of expert document examiners. The
control module determines which analysis techniques will be performed


The third step is to label the possible functional element. The
labeling algorithm is illustrated in Figure 5.24.
Step 1: Construct the denotation window (e.g., 200KΩ).
Step 2: Move the window horizontally and vertically (e.g., directions 1,
2, 3, and 4). If any functional element is met, then record the
moving distance d_i (i = 1, 2, 3, 4).
Step 3: Compare the d_i and choose the functional element with minimum
distance as a candidate (e.g., the top resistor will be the
best candidate), store it in page 4, and mark it as
identified.
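The three labeling steps above can be sketched as follows. Elements and the window are modeled as boxes (x1, y1, x2, y2) with a name; the window geometry, step size, and search limit are illustrative assumptions.

```python
def overlaps(b, e):
    """Axis-aligned box overlap test."""
    return not (b[2] < e[0] or e[2] < b[0] or b[3] < e[1] or e[3] < b[1])

def nearest_element(window, elements, max_steps=50):
    """Step 2: move the window in four directions and record distances;
    Step 3: return (distance, name) of the minimum-distance element."""
    x1, y1, x2, y2 = window
    best = None
    for dx, dy in [(0, -1), (0, 1), (-1, 0), (1, 0)]:  # up, down, left, right
        for d in range(1, max_steps + 1):
            box = (x1 + dx * d, y1 + dy * d, x2 + dx * d, y2 + dy * d)
            hit = next((e for e in elements if overlaps(box, e)), None)
            if hit:
                if best is None or d < best[0]:
                    best = (d, hit[4])
                break
    return best
```

The denotation-checking rule described next would sit between the hit test and the final assignment, rejecting a minimum-distance candidate whose class letter contradicts the denotation text.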
However, an ambiguity may occur, such as assigning the label of a
capacitor to a resistor. To avoid this ambiguity, the knowledge base
will provide a reasonable interpretation for denotation checking. For
instance, a "K" or "Ω" inside the denotation window means that the
denotation belongs to a resistor. The system must then skip the candidate
even if it has the minimum distance and select the next one.
Furthermore, some denotations consist of only a single character.
This situation appears in vertical bar shape elements and provides the
useful pin information to help identify a functional element. These
denotations will be saved in page 4 until the rectangular box is
constructed.
5.4.5 Reconstruction of Rectangular Shape Elements
The rectangular shape element has been destroyed by the extraction
of horizontal connecting line segments. However, the features are
preserved in the separated files. The horizontal line segment pairs are


(c) Comparison Between Direct-Multiplication Approach and FFT
Approach
The correlation can be calculated with direct multiplication. With
one displacement, pixel-by-pixel corresponding multiplication must be
performed for the whole array. While using FFT approach, correlations
of all displacements, from (0,0) to (N^,^) which is the dimension of
the patterns, are computed at one time.
For 4-by-4 array patterns, this procedure has been tested using
PDP-11/40 minicomputer and shows that the computing time of one
displacement correlation is about one third of that of FFT approach
which computes 4x8=32 displacements. The comparison means that FFT
approach is at least ten times faster than the direct-multiplication
approach for 4-by-8 arrays.
For arrays of larger dimensions (e.g., 16 by 32 in our case),
the computing time will be much shorter for the FFT approach compared
with the direct approach, because the ratio of the computing time of the
FFT to that of the direct Fourier transform is proportional to
(log N)/N, where N is the dimension of the 1-D arrays.
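The two approaches can be sketched with a modern array library as follows. Circular (wrap-around) displacement is assumed here, and the array sizes are illustrative; the FFT call returns the correlations of all displacements in a single array.

```python
import numpy as np

def all_correlations_fft(a, b):
    """Circular cross-correlation of two equal-size 2-D arrays via the FFT:
    c[k] = sum_n a[n] * b[n - k], for every displacement k at once."""
    A = np.fft.fft2(a)
    B = np.fft.fft2(b)
    return np.real(np.fft.ifft2(A * np.conj(B)))

def one_correlation_direct(a, b, dy, dx):
    """Direct pixel-by-pixel multiplication at a single displacement."""
    return np.sum(a * np.roll(b, shift=(dy, dx), axis=(0, 1)))
```

For a 4-by-8 pattern the direct routine must be called once per displacement, while the FFT routine delivers all 32 at once, which is the source of the speed advantage discussed above.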
In general, it is not necessary to compute all the displacements.
However, because of uncertain noise during the scanning process,
there will always be one-pixel errors on each side of the characters.
Therefore, at least nine displacements must be considered in the direct-
multiplication approach, and there is no guarantee that these nine
displacements are enough.


Upon the completion of the pyramid generation, we start to trace
the coordinates of the circumscribing rectangular box of each character
from the pyramid information and assign a code number to each block.
From the properties of the pyramid, we know that (1) the base of the
pyramid possesses the maximum number of elements compared with the other
layers, (2) the top element of the pyramid has support from every layer
(see Figure 4.12). Furthermore, based upon the above two properties, we
find that the top element provides the starting y-coordinate and the
base provides the ending y-coordinate, starting x-coordinate, and ending
x-coordinate. The x- and y-coordinates above correspond to two corner
coordinates of the circumscribing box.
The algorithm of the labeling is as follows.
Step 1 Scan the pyramid picture record by record and find a pixel with
"1".
Step 2 Take that pixel as the top element of the pyramid and record
its y-coordinate (Y1).
Step 3 Use the x-coordinate as the guide and trace downward until
the base is reached (i.e., the last "1" before a "0" is detected).
Step 4 Record the y-coordinate as Y2, take the x-coordinate of the
leftmost pixel of the base line as X1 and the x-coordinate of
the rightmost pixel of the base line as X2.
Step 5 Store (X1,Y1) and (X2,Y2) in the character coordinate file and
take the record number of the coordinate file as the block
number.
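Steps 1-5 can be sketched on a small binary pyramid picture as follows; the 0/1 list-of-lists representation and the (row, column) scan convention are assumptions made for illustration.

```python
def label_pyramid(pic):
    """Return the two corners (X1, Y1) and (X2, Y2) of the
    circumscribing box of the first pyramid found in the picture."""
    for y, row in enumerate(pic):                      # Step 1: scan for a "1"
        for x, v in enumerate(row):
            if v == 1:
                top_x, y1 = x, y                       # Step 2: top element, Y1
                y2 = y1                                # Step 3: trace downward
                while y2 + 1 < len(pic) and pic[y2 + 1][top_x] == 1:
                    y2 += 1
                base = pic[y2]                         # Step 4: base extent
                x1 = min(i for i, b in enumerate(base) if b == 1)
                x2 = max(i for i, b in enumerate(base) if b == 1)
                return (x1, y1), (x2, y2)              # Step 5: coordinate file
    return None
```

On a four-row pyramid whose apex sits at row 1 and whose base fills row 3, the routine returns the top-row y-coordinate together with the leftmost and rightmost base x-coordinates, exactly the two corners of the circumscribing box.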


Figure 3.1 Organization of a Knowledge-Based System


knowledge-based expert systems for production automation and computer-
integrated manufacturing, for performing self-diagnosis and self-
maintenance in industrial environments, and for the design of
intelligent robots.


52. J. T. Tou, "Application of Pattern Recognition to Knowledge System
Design and Diagnostic Inference," in J. Kittler, K. S. Fu, and L.
F. Pau (Eds.), Pattern Recognition Theory and Applications,
D. Reidel Publishing Co., New York, pp. 413-429, 1982.
53. J. T. Tou, "Computer-Based Intelligent Information Systems," Proc.
of the IEEE 1978 COMPSAC Conference, Chicago, IL, pp. 735-740,
1978.
54. L. B. Lichtenstein, "Conditional Non-independence of Data in a
Practical Bayesian Decision Task," Organization Behavior Human
Performance, Vol. 8, pp. 21-25, 1972.
55. J. T. Tou, "Telebrowsing of Science Information Via a
Minicomputer," Current Research on Science and Technical
Information Transfer, Jeffrey Norton Publishers, New York, 1976.
56. R. G. Casey, "Moment Normalization of Hand Printed Characters,"
IBM J. Res. Develop., Vol. 14, pp. 548-557, 1970.
57. M. R. Teague, "Image Analysis Via the General Theory of Moments,"
J. Opt. Soc. Am., Vol. 70, pp. 920-930, 1980.
58. T. Pavlidis, Structural Pattern Recognition, Springer-Verlag, New
York, 1977.
59. K. S. Fu, Syntactic Pattern Recognition and Application, Prentice-
Hall, NJ, 1982.
60. R. C. Gonzalez and M. G. Thomason, Syntactic Pattern Recognition -
An Introduction, Addison-Wesley, MA, 1978.
61. J. D. Gaskill, Linear Systems, Fourier Transforms, and Optics,
John Wiley and Sons, New York, 1978.
62. J. T. Tou, J. M. Cheng, and D. Brzakovic, ATI Final Report, Center
for Information Research, University of Florida, 1983.
63. P. P. Lin, Object Extraction and Identification in Picture
Processing, Ph.D. Dissertation, University of Florida, 1972.
64. D. Rutovitz, "Pattern Recognition," J. of the Royal Statistical
Society, Vol. 129, pp. 504-530, 1966.
65. K. Y. Wong, R. G. Casey, and F. M. Wahl, "Document Analysis
System," IBM J. Res. and Develop., Vol. 26, pp. 647-656, 1982.
66. C. Y. Suen and T. Radhakrishnan, "Recognition of Hand-Drawn
Flowcharts," Proc. 3rd Int. Conf. Pattern Recognition,
pp. 424-428, 1976.


Figure 4.30 Intensity Change Analysis


Figure 5.3 Symbol Tree


2.6 Knowledge-Based System
What distinguishes such a knowledge-based system (or expert system)
from an ordinary application program is that in most expert systems, the
knowledge is explicitly in view as a separate entity, rather than
appearing only implicitly as part of the coding of the program.
Ordinary computer programs organize knowledge into two levels: data and
program. Most expert computer systems, however, organize knowledge on
three levels: data, knowledge base and control. For a general review,
see Nau [36].
We propose that our knowledge-based system incorporate the concept
of pattern recognition as the classification reference to compensate for
the lack of a suitable rule. There are three goals in our knowledge-
based system. The first is to structure the problem domain and to
develop analysis techniques. These techniques include algorithms,
heuristic rules, and the use of contextual information. The second is to
use the techniques developed in Sections 2.2 to 2.4 for analyzing visual
imagery. The third one is to develop analysis techniques for effective
object classification that would be less computationally expensive.
The knowledge-based approach to visual imagery involves formulating
and evaluating hypotheses about the objects observed in the imagery.
This is accomplished by extracting features from the imagery and then
associating those features with high-level models of possible objects
that are stored in the system's knowledge base. The basic system block
diagram is shown in Figure 2.6. It consists of four main parts:


Table 5.3 Class Designation Letters for Some
Electrical and Logic Components

Class Letter    Component Name
A               AND gate
AR              amplifier
C               capacitor
CR              diode
FF              flip-flop
L               inductor
NAND            NAND gate
NOR             NOR gate
NOT             inverter
OE              Exclusive-OR gate
OR              OR gate
Q               transistor
R               resistor


5.4.1 Extraction of Junction Dots
The junction dots consist of crossing dots and touching dots, as
shown in Figure 5.6, which are ordered from J1 to J5 according to the
frequency of occurrence in a schematic diagram. The black dots
represent the skeleton of a crossing or touching junction. These
patterns are part of the primitives stored in the knowledge base (refer
to Figure 5.5). The knowledge base will handle the ordering of J1 to J5
during the skeleton matching. The white dots denote the possible
conducting pixels. Due to poor paper quality and imperfect
preprocessing, the configuration of conducting pixels may vary from the
patterns shown in Figure 5.6.
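The skeleton matching against these stored patterns can be sketched as follows. The two 3x3 templates below (a crossing and a T-touching) are illustrative stand-ins for the actual primitives J1-J5 of Figure 5.5, which are not reproduced here.

```python
# Illustrative 3x3 skeleton templates (1 = skeleton pixel).
CROSS = ((0, 1, 0),
         (1, 1, 1),
         (0, 1, 0))
TEE = ((0, 0, 0),
       (1, 1, 1),
       (0, 1, 0))

def match_junction(img, y, x, templates):
    """Return the ID of the first template matching the 3x3 patch
    centered at (y, x); templates are tried in their stored order."""
    patch = tuple(tuple(img[y + dy][x + dx] for dx in (-1, 0, 1))
                  for dy in (-1, 0, 1))
    for ident, tmpl in templates:
        if patch == tmpl:
            return ident
    return None
```

Trying the templates in their stored order corresponds to the J1-to-J5 ordering by frequency of occurrence that the knowledge base maintains.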
The junction dots are extracted by using the following algorithm
(see Figure 5.7):
(1) Detect the skeleton pattern via a template matching process if the
number of pixels in the scanning line exceeds the threshold θ1
(where θ1 = 30). The order J1 to J5 is followed.
(2) Check the number of conducting pixels around the center pixels. If
this number is equal to the maximum or less than the maximum by one
or two pixels, a junction dot is extracted and the X-Y coordinates
of the center pixel are stored in the first page with an
identification code ID (i.e., J1 to J5). Remove that junction dot
from the binary image.
(3) Check the contents of the first page and resolve the ambiguities.
The ambiguities (like blurring) occur due to the sampling noise of
the scanner, inaccurate alignment of the drawing, and the poor


Figure 4.15 Block Diagram of Calculating
Correlation Using FFT Approach


Figure 4.2 The Data Acquisition Subsystem


A closed contour C in Figure 2.4 is expressed as a complex function
u(t), where u(t) = x(t) + jy(t). The FD of C is defined as
a_n = (1/L) ∫[0,L] u(t) exp[-jn·2πt/L] dt                         (2.3)
(L : the total length of the contour)
The curve u(t) can be expressed by the a_n as
u(t) = Σ[n=-∞,∞] a_n exp[jn·2πt/L]                                (2.4)
In short, we assume that L = 2π. The formulas (2.3) and (2.4) are
simplified as below:
a_n = (1/2π) ∫[0,2π] u(t) exp[-jnt] dt                            (2.5)
and
u(t) = Σ[n=-∞,∞] a_n exp[jnt]                                     (2.6)
OO
The FD has the properties of translation, rotation, and scale
invariance [26].
2.3.3 Extension of FD to a Non-Closed Curve
From the previous analysis, we know that the FD is efficient only
for a closed contour. But in the real world, a closed contour is not
easy to acquire, due to poor imagery or a partial view. Thus,
extending the FD to a non-closed curve is very important.
Consider a segment as a closed curve (refer to Figure 2.5) in the
following way:
(a) Take one end point of the segment as the starting point.
(b) Trace the segment in the counterclockwise direction to the other
end.
(c) Retrace the segment to the starting point in the clockwise
direction.
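Steps (a)-(c) and the descriptor of formula (2.5) can be sketched as follows, with the curve sampled at N discrete points in place of the continuous integral; the point list in the example is illustrative.

```python
import cmath

def close_segment(points):
    """Steps (a)-(c): trace the segment to the far end, then retrace
    back toward the starting point, forming a closed traversal."""
    return points + points[-2:0:-1]

def fourier_descriptor(points, n):
    """Discrete a_n of formula (2.5), sampling the curve at N points."""
    u = [complex(x, y) for x, y in points]
    N = len(u)
    return sum(u[k] * cmath.exp(-1j * n * 2 * cmath.pi * k / N)
               for k in range(N)) / N
```

The zeroth descriptor of the closed traversal is simply the centroid of the doubled curve, and the higher-order descriptors can then be used exactly as for a genuinely closed contour.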


Example 1: Consider the diagram in Figure 5.29; its associative
network structure is shown in Figure 5.30, where "$"
denotes input and "*" denotes output. Each node represents
the information of the SDL. The PGL can be expressed as
(I;$,1;R,1,0,1)
(B;R,1,0,2;J,1,0,1)
(B;J,1,0,3;R,3,1,2)
(B;R,3,1,1;C,1,1,2)
(0;C,1,1,1;*,1)
(B;J,1,0,2;R,2,0,1)
(0;R,2,0,2;*,2)
Example 2: Consider the complex diagram in Figure 5.31; its
associative network structure is shown in Figure 5.32,
and the PGL is the following:
(I;$,1;C,1,0,1)
(B;C,1,0,2;J,1,0,1)
(B;J,1,0,3;R,1,1,2)
(B;J,1,0,2;R,2,0,1)
(B;R,1,1,1;#,0)
(B;R,2,0,2;J,2,2,1)
(B;J,2,2,2;AR,1,0,1)
(B;J,2,2,3;J,3,1,1)
(B;AR,1,0,2;#,0)
(B;AR,1,0,3;J,5,2,1)
(B;J,5,2,2;R,4,0,1)
(B;J,5,2,3;J,4,3,1)
(0;R,4,0,2;*,1)
(B;J,4,3,2;CR,2,1,1)
(B;J,4,3,3;CR,1,0,2)
(B;CR,2,1,2;R,3,0,2)
(B;R,3,0,1;J,3,1,2)
(B;CR,1,0,1;J,3,1,3)
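Reading a PGL statement into its fields can be sketched as follows. The interpretation of the fields (statement type; source terminal description; destination terminal description) is inferred from the examples above and should be checked against the full PGL definition.

```python
def parse_pgl(line):
    """Split one PGL statement into its type and two terminal field lists."""
    body = line.strip().lstrip("(").rstrip(")")
    kind, src, dst = body.split(";")
    return kind, src.split(","), dst.split(",")
```

For instance, the input statement of Example 1 splits into the type "I", the input terminal fields, and the resistor terminal fields, from which a connection graph of the circuit can be assembled statement by statement.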
The advantages of the PGL are (1) it describes electronic and logic
schematics very easily and very fast, (2) from the description of PGL,
the schematics can be reconstructed systematically, (3) it is easy to
update or modify the circuit organization, (4) it can be applied to
generate a symbol itself as symbol generation language (SGL), and (5) it


horizontal or vertical). The chain code notation is shown in
Figure 2.3.
In mathematical form, the chain code can be expressed as
Δ_k exp[j(π/4)k]
where                                                             (2.1)
Δ_k = 1    for k = 0,2,4,6
Δ_k = √2   for k = 1,3,5,7
The chain codes can provide a compact representation of regions.
Freeman [24] proposed a correlation scheme for chain-coded curves. If
c_1, c_2, ..., c_n is one chain code, d_1, d_2, ..., d_m is another chain
code, and m < n, then the chain code correlation C(j) of {d_i} at
position j of {c_i} is
C(j) = Σ[i=1,m] cos[((d_i - c_{i+j}) mod 8)(π/4)]                 (2.2)
Chain-code matching is computationally efficient, but cannot be
considered a general tool for shape matching since it is not rotation
invariant, very sensitive to local changes in the number of chain
elements, and also quite sensitive to small global changes in scale. To
solve these drawbacks we combine the chain coding with the Fourier
descriptor for shape analysis.
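Formula (2.2) can be sketched directly as follows; the octal chains in the example are illustrative.

```python
import math

def chain_correlation(c, d, j):
    """Chain-code correlation C(j) of chain d against chain c at offset j,
    following formula (2.2)."""
    return sum(math.cos(((d[i] - c[i + j]) % 8) * math.pi / 4)
               for i in range(len(d)))
```

When the two chains agree element by element, every cosine term is cos(0) = 1 and the correlation equals the chain length; when every element is reversed (offset by 4, half of the octal cycle), every term is cos(π) = -1.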
2.3.2 Fourier Descriptor
Two types of Fourier descriptors (FD) are used for shape
description. One is the Fourier transform of a boundary expressed in
terms of tangent angle versus arc length [25]. The other uses the
complex function of a boundary [26],[27].


Figure 2.2 Spatial Configuration of Pairs of Points
and the Manner in Which They Are To Be Connected
(the shaded area is the gap-filling area; solid points are extent
points; triangles are points to be filled in)


and what the criterion is. Then, on the basis of the conclusion of the
analysis, the control module selects the next analysis task. Thus
the control module is an event-driven mechanism. The functional block
diagram of the knowledge base design is shown in Figure 4.39. The
hierarchical tree handles the type-font classification. The type-font
dictionary stores the type-font information and links with the chain-
coded picture file and the manufacturer's file. The type-font
dictionary also provides the corresponding feature vectors through the
type-font-to-feature table for font identification. The detection
criterion file provides the threshold setting for falsification
detection subsystem. The pictorial feature file stores the quadrant
pictorial features of the characters for font identification. A
questioned document passes through the image processing task linked with
the knowledge base, and falsification detection and font identification
are performed on it.
4.4 Discussion
There are several basic questions in falsified document detection
and font identification.
a) Is the document falsified?
b) In what typefont(s) is the document written?
c) Which manufacturer produced the machine used in the typing of
the document?
d) Which individual machine was used in typing the document?


we obtain the following cases:
(1) S1 ≤ θS1 : Test passes
(2) S1 > θS1 : Check shift effect
(3) θS1 < S1 < θS2 and S2 > θS3 : Smeared character may exist, but
    check if the reference is smeared; if yes, change it
(4) Otherwise : Test fails
The shift effect checking is to move the two characters relative to
each other in eight compass directions with a one-pixel offset and
recompute the shape matching indexes (S1 and S2) to see if the test passes
or not. The shift effect checking is illustrated in Figure 4.26. The
functional diagram for shape analysis is shown in Figure 4.27 and the
result is shown in Figure 4.28.
4.3.3.3 Intensity Change Analysis
A document may be falsified by the same typewriter at a later
date. In this case, the suspicious document may have passed all the
preceding tests. We make use of intensity change analysis to detect
this type of falsification which may be revealed from the change of
ribbon darkness or the change of paper brightness due to erasing. The
differences in ribbon darkness and paper brightness may be determined
from the intensity histogram plots, as illustrated in Figure 4.29. The
solid curve represents the intensity histogram of a character. The
dashed curve represents the intensity shift from both sides to the


different elements to type four different type-fonts (Letter Gothic,
Elite, Prestige Elite, and Courier Italic) as shown in Figure 4.32.
To facilitate the type-font identification task, hierarchical
pattern matching is proposed (see Figure 4.33). The hierarchical
pattern matching performs the three-level discriminations using size
information, global features and local features in sequence. The size
information can be found from the manufacturer's catalog.
The global features used in our system are some chain-coded
features as follows [63].
(1) Centroid : (x, y)
(2) Moment of inertia about the centroidal x-axis : (MOX)
(3) Moment of inertia about the centroidal y-axis : (MOY)
(4) Product of inertia about the centroidal x- and y-axes : (MXY)
(5) Direction angle of the major axis : (THETA)
(6) Moment about the major axis : (TEX)
(7) Moment about the minor axis : (TEY)
(8) Elongation index : (EI)
(9) Area : (A)
These nine features form a global feature vector to describe a
character. The similarity between two characters can be evaluated by
the distance measurement between two corresponding feature vectors.
Figures 4.34(a) and (b) show two chain-coded characters "h", which are
of different type-fonts. The chain-coded features are calculated to
show the differences. These two h's are easy to identify by the global
features. However, the chain-coded features may not detect the minute
differences between two similar type-fonts.
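The distance measurement between two global feature vectors can be sketched as follows; the feature values in the example are illustrative, and in practice the nine features would first be normalized to comparable scales.

```python
import math

def feature_distance(v1, v2):
    """Euclidean distance between two global feature vectors
    (centroid, MOX, MOY, MXY, THETA, TEX, TEY, EI, area)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(v1, v2)))

def closest_font(v, font_vectors):
    """Pick the type-font whose stored feature vector is nearest to v."""
    return min(font_vectors,
               key=lambda f: feature_distance(v, font_vectors[f]))
```

A measured vector close to the stored Elite vector is identified as Elite; when two fonts give nearly equal distances, the local-feature comparison of the next subsection is needed.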


Table 5.10 The Primitives of Functional Elements


elements, denotations, and vertical line segments. The extraction of
functional elements is conducted in the following three
steps: (1) blocking, (2) grouping, and (3) recognition (see
Figure 5.17). The blocking routine isolates a functional element by
inscribing it in a rectangular or square block and assigns a code
number to each isolated block in the order of the scanning sequence. The
blocking routine used here is the same as the character blocking concept
used in Chapter 4, if we consider the functional elements as
characters. Furthermore, the isolated blocks can be divided into two
types: circular ones and rectangular ones, as shown in Figure 5.18.
The circular symbols are extracted by using the circle detection
technique based upon a modified Hough transform [81],[82]. The circular
blocks may represent such active elements as transistors, junction FET,
and MOSFET as shown in Figure 5.2. The rectangular blocks may represent
such functional elements as resistors, inductors, capacitors, diodes,
amplifiers, logic gates, flip-flops, etc. The grouping routine is to
categorize the isolated blocks using the proposed symbol tree (see
Figure 5.3). Rectangular blocks are grouped into two classes, vertical
bar and horizontal bar, according to their aspect ratios. Resistor
and inductor blocks in the normal position are of horizontal bar shape.
Most VLSI blocks and logic element blocks are of vertical bar
shape, and some of them are of square shape. Here, if we
define the aspect ratio as ΔY/ΔX, then


Figure 4.27 Shape Analysis


HOW TO CONTROL VBC BY USING BACTUR ?
* AIR SPRAY
1. MINIMUM OF 3 GALLONS SPRAY PER ACRE.
2. USE EARLY IN SEASON.
ANY OTHER QUESTION ? (Y/N) Y
ENTER YOUR QUESTION (NO MORE THAN ONE LINE)
SOYBEAN INSECT ?
INSECT
1 SOYBEAN LOOPER
2 GREEN CLOVERWORM
3 VELVETBEAN CATERPILLAR
4 CORN EARWORM
5 FALL ARMY WORM
6 BEET ARMY WORM
7 LESSER CORNSTALK BORER
8 GREEN STINK BUG
9 MEXICAN BEAN BEETLE
10 BEAN LEAF BEETLE
11 THREE-CORNERED ALFALFA HOPPER
12 THE SOUTHERN RED-LEGGED GRASSHOPPER
13 THE LARGE AMERICAN GRASSHOPPER
Figure 3.21 An Example of Question-Answering Mode


knowledge base updating. In order to speed up data collection, we use
some pre-defined delimiters to identify the data characteristics. Each
delimiter will perform one generation function. The raw data are typed
in compact form to save storage, then sorted into a readable form.
Examples of raw data and delimiter functions are shown in Figures 3.8,
3.9, and 3.10. The simplified knowledge base generation scheme is
illustrated in Figure 3.11.
3.5 Data Structure for the APRIKS System
The data structure for the APRIKS system is composed of a
hierarchical node index table (NITA), entity dictionary (ED), attribute
dictionary (AD), value dictionary (VD), unofficial entity dictionary
(UED), unofficial attribute dictionary (UAD), entity-attribute table
(EAT), information file (INF), name dictionary (ND), hash table (HT),
and text file. The heart of the APRIKS system is the NITA which
represents an agricultural knowledge sketch base in an associative tree
structure. Information flow in the APRIKS system is summarized in
Figure 3.12, and the data formats for various elements in the data
structure are shown in Figure 3.13.
The hierarchy consists of eight generation codes, with one byte
being used for each generation code. Thus, the eight-level tree is used
to represent 256 subnodes for each node. ED and AD are employed to
interpret entity names and attribute names, respectively. The synonym
dictionaries, UED and UAD, are created for interpreting the
corresponding master terms. For instance, SL and VBC are the synonyms
for soybean looper and velvetbean caterpillar, respectively.
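The eight-level generation-code addressing described above can be sketched as follows. The node keys are modeled as tuples of up to eight one-byte codes; the node names and the dictionary-based storage are illustrative, not the actual NITA implementation.

```python
class NodeIndexTable:
    """A sketch of an associative tree addressed by generation codes:
    up to eight levels, each code one byte, so up to 256 subnodes per node."""

    def __init__(self):
        self.nodes = {}

    def add(self, codes, name):
        assert len(codes) <= 8 and all(0 <= c <= 255 for c in codes)
        self.nodes[tuple(codes)] = name

    def children(self, codes):
        """Direct subnodes of a node (up to 256 of them)."""
        prefix = tuple(codes)
        return {k: v for k, v in self.nodes.items()
                if len(k) == len(prefix) + 1 and k[:len(prefix)] == prefix}
```

Addressing a node by its code tuple makes every ancestor recoverable by truncation, which is what lets the sketch-base hierarchy be traversed without explicit parent pointers.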


Chain Code : 556667021001224443
Figure 2.3
Chain Code of a Contour C


Figure 5.2 Key Feature of Circular Elements


(2) Some impact may arise from unexpected interactions with other rules
due to a rule change or an addition of a rule.
The other way to represent knowledge is to take the conditional
probabilities into account. This representation is used to diagnose
entities such as diseases by repeatedly applying Bayes' theorem to
compute the probabilities that certain diseases are present, given that
certain symptoms have been observed.
Another way in which knowledge is sometimes organized is by static
descriptions of phenomena. This representation is used in INTERNIST
[33],[34] and MEDIKS [35].
We represent the knowledge in an entity-attribute relationship
associated with conditional probabilities. Initially, the conditional
probabilities P(E_i|A_j) and P(A_j|E_i) of the entity-attribute
relationship are determined by the knowledge base. That is, if there
are n possible entities associated with a particular attribute A_j,
then the same initial conditional probability is attached to each E_i,
i = 1, ..., n:

P(E_1|A_j) = P(E_2|A_j) = ... = P(E_n|A_j) = 1/n
However, using the frequency of occurrence in place of the
conditional probability is not accurate. This assignment blurs the
importance of the key feature and leads to a misclassification problem.
To compensate for this drawback, we have to manually replace the
conditional probability by a weight (or certainty factor). However,
experts will not feel comfortable deciding the weight, so the weight
assignment is updated until the system performs correct decision making.
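The initialization and manual override described above can be sketched as follows; the entity names are illustrative.

```python
# Sketch of the scheme above: every entity sharing an attribute starts
# with the same conditional probability 1/n, and an expert-supplied
# weight (certainty factor) may later replace it for a key feature.

def init_probabilities(entities):
    """P(E_i | A_j) = 1/n for the n entities linked to attribute A_j."""
    n = len(entities)
    return {e: 1.0 / n for e in entities}

p = init_probabilities(["looper", "cloverworm", "earworm", "caterpillar"])

def override(p, entity, weight):
    """Manually replace the uniform value by a certainty factor."""
    p = dict(p)
    p[entity] = weight
    return p

p2 = override(p, "looper", 0.9)   # expert marks "looper" as a key feature
```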


We can determine Â, the least-squares estimate of A, so that the
following criteria are satisfied:

(1) E{Â} = A
(2) E{ê^T ê} = E{(A − Â)^T (A − Â)} = minimum                (2.24)

The least-squares estimate of the coefficient vector A is given by

Â = (H^T H)^{-1} H^T Y

where

Â = [â_1  â_2  ...  â_L]^T                                   (2.25)

If (H^T H)^{-1} does not exist, we may use the pseudo-inverse matrix H^#
instead of (H^T H)^{-1} H^T. The pseudo-inverse matrix of H is

H^# = r C_r H^T / Tr(C_r B)

where B = H H^T; C_1 = I; C_{i+1} = (Tr(C_i B)/i) I − C_i B,
i = 1, 2, ..., r−1; and r is the rank of H H^T. Tr(C_r B) indicates the
trace of the matrix C_r B.
Consequently, the optimal predicted direction of the i-th chain code
is given by

α_p(i) = Σ_{j=1}^{L} â_j α_r(i−j),   D_p(i) = K(⌊α_p(i)⌋)

where K(·) takes the values 0, 1, 2, ..., 7 for octal chain codes
0, 1, 2, ..., 7, and ⌊x⌋ denotes the largest integer which is less than
the number x.
In [28], Xu and Tou chose L = 3 and M = 3 for predictive
searching. From equation (2.24), we know that the performance
measure is essentially based on the view that all the errors are equally
important. This is not necessarily so. We may know, for example, that
data taken later in the experiment were much more in error than data
taken early on, and it would seem reasonable to weight the errors
accordingly. Such a scheme is referred to as weighted least squares and
is based on a correspondingly weighted performance criterion.
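The weighted criterion itself is lost at a page break; the sketch below assumes the standard weighted form J = (Y − HA)^T W (Y − HA), whose minimizer is Â = (H^T W H)^{-1} H^T W Y, as an illustration of the idea.

```python
# Weighted least squares under the usual quadratic criterion
# J = (Y - H A)^T W (Y - H A), minimized by
# A_hat = (H^T W H)^{-1} H^T W Y.  (An assumed standard form, not the
# thesis's own criterion, which is cut off by a page break.)
import numpy as np

def weighted_lsq(H, Y, w):
    W = np.diag(w)                      # per-sample error weights
    return np.linalg.solve(H.T @ W @ H, H.T @ W @ Y)

# Fit y = a0 + a1*x, trusting early samples more than late ones.
x = np.array([0.0, 1.0, 2.0, 3.0])
H = np.column_stack([np.ones_like(x), x])
Y = np.array([1.0, 3.0, 5.0, 7.0])      # exactly y = 1 + 2x
w = np.array([4.0, 2.0, 1.0, 0.5])      # early data weighted heavily
a = weighted_lsq(H, Y, w)
```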


Figure 3.5 Bottom-Up Approach for KB Generation


in the temporary file, the blended corners are in the junction dot file,
and the vertical connecting line segments are in the line connecting
page. The reconstruction routine finds the compatible pairs and
merges them together if consistency is satisfied. Then the
denotation file is searched to find all the denotations inside the
rectangular box, and the knowledge base is called for functional
identification. The next step is to pick up each denotation window that
lies outside the rectangular box and perform a minimum-distance measure
for labeling.
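The minimum-distance labeling step can be sketched as follows; the element labels and coordinates are illustrative, not taken from the schematics.

```python
# Sketch of minimum-distance labeling: a denotation window lying
# outside every rectangular box is attached to the nearest functional
# element by centroid distance.  Labels and coordinates are invented.
import math

def nearest_element(denotation_xy, elements):
    """elements: {label: (x, y) centroid}; return the closest label."""
    return min(elements,
               key=lambda lbl: math.dist(denotation_xy, elements[lbl]))

elements = {"R1": (10.0, 5.0), "C1": (40.0, 5.0), "Q1": (25.0, 30.0)}
label = nearest_element((12.0, 8.0), elements)
```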
5.4.6 Processing of Unrecognizable Page
After the four pages are extracted, the remaining page contains noise
and unrecognizable elements, which may be new elements or deformed
symbols. This page provides very useful information for augmenting the
system capability. Via man/machine interaction, the system creates
new records for new elements or synonym records for existing
elements. Using Figures 5.16(a) and (b) as examples, after the
functional elements are removed and the denotation recognition is
skipped, the unrecognizable pages are shown in Figures 5.25(a) and (b),
respectively.
5.5 Pictorial Manipulation Language
To accomplish the symbol interpretation task and pictorial database
generation task, we introduce picture manipulation languages which
consist of the symbol description language (SDL) and the picture
generation language (PGL). The SDL and PGL are high level languages.


conditions are satisfied. Then the system stores that functional
element with label and rotation index in page 3 and removes it from the
binary image. The unrecognizable functional elements, due to missing
pixels, new symbol, or noise blurring problem, will remain in the binary
image.
However, some of the vertical bar shape elements may embed
redundant vertical connecting line segments after the blocking
routine. Thus the vertical line detection must be performed before
recognition routine execution. The functional elements and the detected
vertical connecting line segments will be stored in page 3 and 2,
respectively.
Furthermore, if some of the vertical bar shape elements contain
corner features, we may detect that breaking point and transfer it to
page 1.
Some ground symbols are blurred and some are distinct. A blurred
ground symbol may lose its proper configuration and become
unrecognizable. To compensate for this problem, the system triggers an
inference mechanism to trace the rest of the parts if only one simple
feature is detected.
After the recognition routine is performed, the functional elements
are replaced by their standard drawings.
Continuing the last examples and applying the blocking routine to
Figures 5.16(a) and (b), we obtain the block diagrams and their
corresponding coordinate files shown in Figures 5.20(a) and (b) and
Tables 5.11(a) and (b), respectively. After grouping, the contents of


extraction is performed, a clean binary image of the schematics is
generated. The binary image generation involves reflection, filtering,
noise removal, and adaptive thresholding. The reflection algorithm is
employed to correct the mirror effect. The filtering and noise removal
tasks are used to enhance the lines that have a certain width and to
suppress small spot-like noise and shading effects caused by non-uniform
illuminations on poor quality paper. Through adaptive thresholding, we
obtain a binary image with some spurs and gaps. In the training phase,
the spurs will be removed by noise removal operation and small gaps will
be filled by gap filling task before skeletonization is performed. The
skeletonization is employed to create thinned schematics. After careful
tailoring, the thinned symbols are decomposed into the primitives which
will be stored in the knowledge base as feature vectors. The primitives
currently used are shown in Figure 5.5. The new primitives of each
symbol may be generated and updated in an interactive manner. The
special relationships among the primitives of a symbol are extracted and
stored as feature vectors. Then, knowledge base is created during the
training phase.
To conduct the task of schematic symbols extraction we decompose a
schematic diagram into five sets of drawings, and each set is stored in
a dedicated file which we call a page. The five sets are junction dots,
connecting line segments, functional symbols, denotations, and
unrecognizable elements. The first page stores a drawing of the
junction dots, the second of connecting line segments, the third of
functional elements, the fourth of denotations, and the fifth of


Figure 4.12 The Property of the Pyramid


Figure 3.11 The Simplified Knowledge Base Generation Scheme


[Figure 3.2 residue: the tree branches SOYBEAN into non-chemical and
chemical control for the looper, cloverworm, caterpillar, and earworm;
control recommendations cover Bacillus thuringiensis (Dipel, Bactur,
Thuricide) and methomyl, with formulation, A.I./acre, application, and
toxicity (see label / air spray) attributes.]
Figure 3.2 An Example of APRIKS System Hierarchy


[Figure 4.34(b) residue: the original and smoothed chain-code plots are
omitted. Computed shape features: AREA OF THE FIGURE = 132.0 UNITS,
CENTROID = (7.4, 12.5), MOX = 9337.0745, TEX = 2779.9470,
THETA = 82.1420, MOY = 2895.3704, TEY = 9953.3979, EI = 0.3105,
MXY = 936.2037.]
(b) Courier Italic Type-Font "h"
Figure 4.34 (continued)


Figure 4.9 Binary Picture (After Noise Removal)


(a) Window 1
(b) Window 2
Figure 5.22 Functional Elements




Food productivity may be enhanced by taking several approaches:
(1) crop pest control, (2) plant disease control, and (3) fertilizer
management. The primary function of crop pest control is to make
optimum use of pesticides on plants. The main objective of plant
disease control is to prevent the spread of plant diseases. The primary
goal of fertilizer management is to provide an adequate amount of
nutrition to the plants. To enhance food productivity, we should make
the right diagnostic analysis, select the right treatment plan, and use
the right pesticides and fertilizers with the right amount at the right
time. To accomplish these objectives, farmers and growers should have
the necessary knowledge at their fingertips when their needs arise. The
computer-based APRIKS system was conceived some five years ago to meet
this challenge. APRIKS is the acronym for Agricultural PRoductivity
Improvement Knowledge System. This chapter presents the design of
APRIKS, which is a pilot knowledge-based expert system for applications
in agriculture. The APRIKS enables the user (farmer, grower, or county
agent) to interact with the computer in a conversational mode to obtain
satisfactory answers to various questions and suggestions of solutions
to specific problems. The APRIKS is a pilot system which demonstrates
the use of a minicomputer in agricultural information browsing,
knowledge transfer, diagnostic consultation, management recommendation,
and science education. On the basis of a set of observations provided
by the user, the APRIKS can determine the plant diseases, the damaging
insects, or the planting instructions. It can recommend treatment plans
and pest control procedures, and can provide useful information such as


[Figure 4.18(b) residue: the character-profile printout is omitted.
PROFILE FOR THE 16-TH CHARACTER: BLOCK NUMBER = 16, LINE NUMBER = 2,
CHARACTER POSITION = 2.]
(b) Character "e"
Figure 4.18 (continued)


Figure 5.16 Image After Horizontal Line Segments Removal


x'y'' − x''y' = Re[ j u'(t) conj(u''(t)) ]

             = Re[ Σ_{m=−∞}^{∞} Σ_{n=−∞}^{∞} m n² a_m conj(a_n) exp(j(m−n)t) ]

             = Re[ Σ_{n=−∞}^{∞} n³ |a_n|² ]   keeping only the m = n terms

Now, we define

SF4 = Σ_{n=−∞}^{∞} n³ |a_n|² / (|a_1|² − |a_{−1}|²)                   (2.18)

SF4 = 1 when C is a circle, and SF4 > 1 when C has more convexity.
2.3.5 Predictive Searching for Chain Encoding
To generate the chain codes from a geometrical configuration, the
search direction of successive codes is predetermined. Such a method
works very well when it is applied to simple geometrical
configurations. However, when the geometrical configuration contains
overlapping and touching points, chain-coding methods may become less
satisfactory. Xu and Tou [28] have proposed a predictive searching
method based upon past information: the subsequent chain code is
predicted from information on the preceding chain codes. The outline of
the predictive searching method is as follows.
Assume that at the i-th step of chain encoding, the predicted
chain-code direction is D_p(i), the real chain direction is D_r(i), the
predicted code-angle increment is α_p(i), and the real code-angle
increment is α_r(i). Then we have the following relationships:


e) Is the entire document typed on the same machine?
f) Is the document typed continuously?
The proposed system in this chapter has been implemented in the
Center for Information Research, the University of Florida, and answers
the above questions. The system is called the Automatic Typewriter
Identification (ATI) System. Due to the limited samples, questions
(c) and (d) are not fully tested. However, the growth of the knowledge
base will provide a positive answer to these two questions.


Figure 4.5 An Example of Windowed Document


KNOWLEDGE BASE FOR CONSULTATION AND IMAGE INTERPRETATION
BY
MING-CHIEH CHENG
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA
IN PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
1983

TO MY PARENTS

ACKNOWLEDGMENTS
The author expresses his deep appreciation to the chairman of his
supervisory committee, Dr. Julius T. Tou, for his guidance and
encouragement during the course of the work presented in this
dissertation. The author also thanks the other members of his
supervisory committee, Dr. John Staudhammer, Dr. Jack R. Smith,
Dr. Leslie H. Oliver, and Dr. Gerhard Ritter for their friendly help and
for their participation on the committee, with special thanks to
Dr. John Staudhammer for extra assistance and guidance.
The author gratefully acknowledges the stimulating discussions with
his colleagues and friends, Ms. Dragana Brzakovic and Mr. Malek
Adjouadi, and the great help of Ms. Patricia Lindsay in preparing this
dissertation. Thanks are also extended to Ms. Carole Boone for her
excellent typing work.
Special thanks are extended for the financial support of the Center
for Information Research under the research projects APRIKS from Kellogg
Foundation, KUTE from National Science Foundation, ATI from Florida
State, AUTORED from National Science Foundation, and the Center-of-
Excellence Program from the University of Florida.
Last, but not least, the author thanks his wife for her patience
and encouragement through his graduate school career.

TABLE OF CONTENTS
ACKNOWLEDGMENT iii
ABSTRACT vi
CHAPTER Page
1 INTRODUCTION 1
2 IMAGE PROCESSING TECHNIQUES AND
KNOWLEDGE REPRESENTATION 3
2.1 Introduction 3
2.2 Image Segmentation 3
2.2.1 Edge Detection 4
2.2.2 Smoothing 4
2.2.3 Automatic Threshold Selection 5
2.2.4 Gap Filling 7
2.3 Shape Analysis 9
2.3.1 Chain Coding 9
2.3.2 Fourier Descriptor 10
2.3.3 Extension of FD to a Non-Closed Curve 12
2.3.4 Expression of Fourier Coefficients
In Terms of Chain Codes 14
2.3.5 Predictive Searching for Chain Encoding 17
2.4 Decision Criterion 20
2.5 Knowledge Representation 21
2.6 Knowledge-Based System 24
3 DESIGN OF A KNOWLEDGE-BASED EXPERT SYSTEM FOR
APPLICATION IN AGRICULTURE 28
3.1 Introduction 28
3.2 Design Concepts for Knowledge-Based
Expert Systems 31
3.3 Knowledge Representation 35
3.4 Knowledge Base Generation 37
3.5 Data Structure for the APRIKS System 46
3.6 Knowledge-Seeking Strategies 53
3.7 Experimental Results 63
3.8 Conclusion 63

CHAPTER
Page
4 FALSIFIED DOCUMENT DETECTION AND FONT IDENTIFICATION 68
4.1 Introduction 68
4.2 Noise Background 70
4.3 Design Methodology 71
4.3.1 Document Data Acquisition 71
4.3.1.1 Scanning, Reflection and Windowing 71
4.3.1.2 Image Preprocessing 71
4.3.1.3 Adaptive Threshold Determination
and Binary Picture Generation 74
4.3.1.4 Binary Filtering (Spur Removal
and Gap Filling) 78
4.3.1.5 Character Isolation 83
4.3.2 Character Recognition and Grouping 95
4.3.2.1 Correlation Technique 95
4.3.2.2 Feature Pattern Matching 101
4.3.3 Falsification Detection 106
4.3.3.1 Alignment Analysis 110
4.3.3.2 Shape Analysis 114
4.3.3.3 Intensity Change Analysis 119
4.3.4 Type-Font Identification 125
4.3.5 Knowledge Base Design 133
4.4 Discussion 139
5 COMPUTER RECOGNITION OF ELECTRONIC CIRCUIT DIAGRAMS 142
5.1 Introduction 142
5.2 Analysis of Electronic Circuit Diagram 143
5.3 System Architecture 150
5.4 Multi-Pass Pattern Extraction 156
5.4.1 Extraction of Junction Dots 157
5.4.2 Extraction of Horizontal Connecting
Line Segments 169
5.4.3 Extraction of Functional Elements 173
5.4.4 Denotation Recognition 190
5.4.5 Reconstruction of Rectangular Shape Elements 198
5.4.6 Processing of Unrecognizable Page 200
5.5 Pictorial Manipulation Language 200
5.5.1 Symbol Description Language (SDL) 202
5.5.2 Picture Generation Language (PGL) 207
5.6 Knowledge Base Configuration 214
5.7 Discussion 215
6 CONCLUSION 218
6.1 Summary 218
6.2 Areas for Future Work 221
REFERENCES 223
BIOGRAPHICAL SKETCH 230

Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy
KNOWLEDGE BASE FOR CONSULTATION AND IMAGE INTERPRETATION
By
Ming-Chieh Cheng
December 1983
Chairman: Dr. Julius T. Tou
Major Department: Electrical Engineering
An entity-attribute relationship associated with a certainty factor
is proposed to represent knowledge. To conduct this knowledge
representation, we propose the modified top-down approach to generate
the knowledge-based system. In order to improve the user interface
problem, we further propose the three operation modes (information
retrieval, decision-making, and question answering) to access the system
through either the menu selection input or simple natural language
input. A working system APRIKS is established for agricultural pest
control and some other applications.
We further integrate image processing techniques and pattern
recognition principles with a knowledge-based system to form a pictorial
knowledge-based system to conduct falsified document detection and font
identification, and electronic circuit diagram recognition and
interpretation. To describe the interrelationship among the functional
elements of a circuit diagram, we propose two pictorial manipulation
languages (symbol description language and picture generation language)

using the concept of the associative network. Finally, we propose the
conversion rules to link the electronic recognition system with the
SPICE package to enhance the system's capability. This link
demonstrates that the pictorial knowledge-based system can be integrated
with current CAD machines to make diagnosis and reduce manpower.

CHAPTER 1
INTRODUCTION
There have been many researchers working on object recognition
within the past decade. Their typical applications are optical
character recognition (OCR) [1]-[4], industrial parts inspection and
assembly [5]-[9], target detection and identification [10]-[12], and
agricultural remote sensing [13], [14]. The techniques employed in this
research include statistical pattern recognition, chain code
correlation, moment invariants, Fourier descriptors, and syntactic
(structural) pattern recognition. However, all of these approaches are
in some way successful only in their respective domains because they
lack the assistance of a knowledge base. The knowledge-based system
integrates all these techniques and performs hypothesis tests to select
the necessary tasks automatically. Thus the knowledge base makes
the recognition system efficient and cost effective.
In Chapter 2, we discuss image processing techniques and knowledge
representation. The image processing techniques include image
segmentation and shape analysis. The image segmentation used consists
of edge detection, smoothing, threshold selection, and gap filling. For
shape analysis, the 2D object boundary is represented by chain coding
and linked with the Fourier descriptor to compute four shape measures
(SF1 to SF4) as extracted features. Then we investigate the knowledge
representation techniques and propose a method which represents the

knowledge by the entity-attribute relationship associated with a
certainty factor or a conditional probability.
In Chapter 3 we illustrate the knowledge base design by using the
APRIKS system as an example. The modified top-down approach is proposed
to generate the knowledge-based system. The APRIKS system will perform
three operation modes (information retrieval mode, decision-making mode,
and question-answering mode) and improve the user interface by menu
selection input or simple natural language input.
In Chapters 4 and 5 we integrate the knowledge-based system with
the image processing techniques to form a pictorial knowledge-based
system which performs falsified document detection and font
identification, and electronic circuit diagram recognition and
interpretation. Two pictorial manipulation languages, symbol
description language and picture generation language, are proposed to
interpret circuit diagrams. The conversion rules to the SPICE package
are proposed to show that the knowledge base can be integrated with the
current CAD machines to enhance the system's potential. Finally, in
Chapter 6, we present some concluding remarks and project future
directions of this important area of image understanding.

CHAPTER 2
IMAGE PROCESSING TECHNIQUES AND KNOWLEDGE REPRESENTATION
2.1 Introduction
In the first part of this chapter we investigate the existing image
processing techniques and tailor them to suit our system requirements.
The image processing techniques we adapted are image segmentation
and shape analysis. Secondly, we discuss the decision function of
pattern recognition theory. Finally, we study the knowledge
representation methods and integrate a knowledge-based system.
2.2 Image Segmentation
Image segmentation is the division of an image into different
regions, each having certain properties. It is the first step of image
analysis which aims at either a description of an image or a
classification of the image if a class label is meaningful. Moreover,
it is a critical component of an image recognition system because errors
in segmentation might propagate to the feature extraction and
classification stages.
During the past decade, many image segmentation techniques have
been proposed, which can be categorized into three classes: (1)
characteristic feature extraction or clustering, (2) edge detection, and
(3) region extraction. For a comprehensive discussion, see Fu and Mui
[15].

In the following sections, we discuss and modify some of the image
segmentation techniques which could be utilized in our design.
2.2.1 Edge Detection
The intent here is to enhance the image by increasing the values of
those pixels along the boundaries (edges) of the objects. Edges are
detected between regions of different intensity. To detect edges by
mask matching, we convolve a set of difference-operator-like masks, in
various orientations, with the picture. The mask giving the highest
value at a given point determines the edge orientation at that point,
and that value determines the edge strength. The masks used in our
system are the generalized Sobel operators [16] with constant factor
1/4, one mask for each of the eight orientations, as shown below.

 1  2  1     2  1  0     1  0 -1     0 -1 -2
 0  0  0     1  0 -1     2  0 -2     1  0 -1
-1 -2 -1     0 -1 -2     1  0 -1     2  1  0

-1 -2 -1    -2 -1  0    -1  0  1     0  1  2
 0  0  0    -1  0  1    -2  0  2    -1  0  1
 1  2  1     0  1  2    -1  0  1    -2 -1  0
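The mask-matching procedure can be sketched as follows: apply each directional mask to the picture and keep, per pixel, the largest response as the edge strength and the winning mask as the orientation. The small test image is illustrative.

```python
# Edge detection by mask matching: at each interior pixel, take the
# directional mask with the highest response; that mask gives the edge
# orientation and its response (scaled by the 1/4 factor) gives the
# edge strength.  Plain sketch on a tiny image.
import numpy as np

MASKS = [np.array(m) for m in [
    [[ 1, 2, 1], [ 0, 0, 0], [-1, -2, -1]],
    [[ 2, 1, 0], [ 1, 0, -1], [ 0, -1, -2]],
    [[ 1, 0, -1], [ 2, 0, -2], [ 1, 0, -1]],
    [[ 0, -1, -2], [ 1, 0, -1], [ 2, 1, 0]],
]]
MASKS += [-m for m in MASKS]            # the four opposite orientations

def edge_at(img, y, x):
    """Return (strength, orientation_index) at interior pixel (y, x)."""
    patch = img[y-1:y+2, x-1:x+2]
    responses = [0.25 * np.sum(m * patch) for m in MASKS]
    k = int(np.argmax(responses))
    return responses[k], k

# A vertical step edge: dark left half, bright right half.
img = np.array([[0, 0, 9, 9]] * 4, dtype=float)
strength, k = edge_at(img, 1, 1)
```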
2.2.2 Smoothing
The object of the smoothing is to "remove" isolated edges
corresponding to noise in the background, and to "insert" more edges
along the object boundaries. A powerful smoothing technique that does
not blur edges is median filtering [17], in which we replace the value
at a point by the median of the values in a neighborhood of the point.
For a 3x3 neighborhood, we use the fifth largest value. Since median
filtering does not blur edges, it can be iterated. However, a problem

with two-dimensional median filtering is that it destroys thin lines as
well as isolated points, and it also "clips" corners.
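A plain sketch of the 3x3 median filter described above (borders are left untouched):

```python
# 3x3 median filtering: replace each interior pixel by the fifth
# largest of the nine values in its neighborhood.

def median3x3(img):
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = sorted(img[y + dy][x + dx]
                            for dy in (-1, 0, 1) for dx in (-1, 0, 1))
            out[y][x] = window[4]        # the fifth largest value
    return out

# An isolated noise spike in a flat region is removed.
noisy = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]
clean = median3x3(noisy)
```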
Another edge-preserving smoothing using discrete bar masks was
proposed by Nagao and Matsuyama [18]. The discrete bar masks are shown
in Figure 2.1. The procedure of the edge-preserving smoothing is as
follows:
Step 1: Examine the set of neighborhoods of a point (X,Y) that are
covered by the discrete bar masks.
Step 2: Detect the position of the mask where its gray-level
variance is minimum.
Step 3: Give the average gray-level of the mask at the selected
position to the point (X,Y).
Step 4: Apply Steps 1-3 to all points in the picture.
Step 5: Iterate the above processes until the gray-levels of all
points in a picture do not change.
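Steps 1-5 can be sketched in miniature as follows. The exact nine-pixel bar masks of Figure 2.1 are not reproduced here; short three-pixel bars in four orientations stand in for them, and only one iteration is shown.

```python
# One pass of edge-preserving smoothing: for each pixel, examine several
# oriented neighborhoods, pick the one with minimum gray-level variance,
# and assign its mean to the pixel.  Three-pixel bars approximate the
# discrete bar masks of Figure 2.1.
import statistics

BARS = [[(0, -1), (0, 0), (0, 1)],     # horizontal
        [(-1, 0), (0, 0), (1, 0)],     # vertical
        [(-1, -1), (0, 0), (1, 1)],    # diagonal
        [(-1, 1), (0, 0), (1, -1)]]    # anti-diagonal

def smooth_once(img):
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            samples = [[img[y + dy][x + dx] for dy, dx in bar]
                       for bar in BARS]
            best = min(samples, key=statistics.pvariance)
            out[y][x] = statistics.mean(best)
    return out

# A step edge stays sharp: the minimum-variance bar never straddles it.
img = [[0, 0, 9, 9]] * 3
out = smooth_once(img)
```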
This smoothing method takes the average in a certain region, so it
will also destroy fine lines. The smoothing techniques above cannot be
used for the character recognition in Chapter 4 or the circuit diagram
recognition in Chapter 5 because, if the characters or the line
segments are very thin, these line segments will be lost.
2.2.3 Automatic Threshold Selection
To conduct the binary picture generation, we segment the gray-level
image by thresholding. That is, the pixels greater or equal to a
threshold T are set to 1, and all other pixels are set to 0:

[Figure 2.1 residue omitted: bar masks drawn about the center pixel at
orientations 0, 45, 90, 135, 180, 225, 270, and 315 degrees.]
Figure 2.1 The Discrete Bar Masks for the Edge-Preserving Smoothing

b(x,y) = 1  if f(x,y) ≥ T
b(x,y) = 0  if f(x,y) < T
In the ideal case, the histogram of the image has a deep and sharp
valley between two peaks representing object and background,
respectively, so that the threshold can be chosen at the bottom of this
valley. However, for most real pictures, it is often difficult to
detect the valley bottom precisely, especially in cases where the valley
is flat and broad, imbued with noise, or when the two peaks are
extremely unequal in height, all of which cause the valley to be
untraceable. Some techniques have been proposed to overcome these
difficulties [19]-[21].
Using the clustering techniques, we develop a thresholding method
which chooses a threshold based upon the maximum interset variance
criterion. Our approach is similar to Otsu's method [21]. The
theoretical development of our approach is in Chapter 4.
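A sketch of threshold selection by maximizing the between-class ("interset") variance over a gray-level histogram, in the spirit of Otsu's method [21]. Our own variant is developed in Chapter 4, so this is illustrative rather than the thesis's algorithm.

```python
# Threshold selection by maximum between-class variance: sweep T and
# keep the split that maximizes w0 * w1 * (mu0 - mu1)^2, where w0, w1
# are the class sizes and mu0, mu1 the class means.

def otsu_threshold(hist):
    """hist[g] = number of pixels with gray level g; return the best T."""
    total = sum(hist)
    total_sum = sum(g * h for g, h in enumerate(hist))
    best_t, best_var, w0, sum0 = 0, -1.0, 0, 0.0
    for t in range(len(hist)):
        w0 += hist[t]                    # background pixel count
        if w0 == 0 or w0 == total:
            continue
        sum0 += t * hist[t]
        w1 = total - w0
        mu0, mu1 = sum0 / w0, (total_sum - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Bimodal histogram: dark peak at levels 0-2, bright peak at 6-7.
hist = [10, 20, 10, 0, 0, 0, 15, 25]
T = otsu_threshold(hist)
```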
2.2.4 Gap Filling
We now wish to assure that the boundaries of all objects are indeed
closed in the binary image. This is done by connecting two pixels
within three pixels of each other (see Figure 2.2). Given two pixels at
(x_o, y_o) and (x_i, y_i), the distance d_o and the angle θ_o are

d_o = [(x_i − x_o)² + (y_i − y_o)²]^(1/2)
θ_o = tan⁻¹[(y_i − y_o)/(x_i − x_o)]

[Figure 2.2 residue omitted. Legend: shaded area = gap-filling area;
filled circle = extent point; triangle = point to be filled in.]
Figure 2.2 Spatial Configuration of Pairs of Points
and the Manner in Which They Are To Be Connected

If d_o ≤ 3, then the area which has y_o < y < y_i and lies within the
angular sector about θ_o will be filled in. That is, the pixels in the
gap-filling area are changed to "1." However, after gap filling, a side
effect is that the edges of the object become "fat." This phenomenon
deforms the shape of the object.
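A simplified gap-filling sketch: any two object pixels within three pixels of each other are connected. The sector-shaped fill area of Figure 2.2 is simplified here to the straight line between the two pixels.

```python
# Simplified gap filling: for every pair of "1" pixels closer than
# max_dist, set the pixels on the straight line between them to 1.
# (The sector-shaped area of Figure 2.2 is reduced to a line here.)
import math

def fill_gaps(img, max_dist=3):
    ones = [(y, x) for y, row in enumerate(img)
            for x, v in enumerate(row) if v == 1]
    for (y0, x0) in ones:
        for (y1, x1) in ones:
            d = math.hypot(y1 - y0, x1 - x0)
            if 1 < d <= max_dist:
                steps = int(d) + 1
                for s in range(1, steps):
                    t = s / steps
                    img[round(y0 + t * (y1 - y0))][round(x0 + t * (x1 - x0))] = 1
    return img

# A one-pixel gap in a horizontal line is closed.
img = [[0, 0, 0, 0, 0],
       [0, 1, 0, 1, 0],
       [0, 0, 0, 0, 0]]
out = fill_gaps(img)
```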
2.3 Shape Analysis
To recognize and interpret an object, we need to extract the
discriminatory features of the object and perform the detection and
classification. The common features are area, perimeter, compactness,
convexity measure, aspect ratio, gray-level statistics (mean and
variance), and invariant moments (M_1 to M_7) [22], [23]. However, some
features are highly desirable, in particular, invariance with respect to
size, orientation, and position. A spatial attribute which has these
properties is the concept of shape. Two regions are said to have the
same shape if they are congruent after applying the operations of
translation, rotation, and scale change. In this section, we discuss
the chain coding and focus on the shape analysis algorithm. The Fourier
descriptor is a candidate for shape analysis.
2.3.1 Chain Coding
Using the standard grid-intersection digitization scheme, a curve
(or boundary) can be represented by its chain code. In this
representation, a curve is specified, relative to a given starting
point, as a sequence of integers representing vectors whose slopes are
multiples of 45°, and whose lengths are √2 (if diagonal) and 1 (if

horizontal or vertical). The chain code notation is shown in
Figure 2.3.
In mathematical form, the chain code can be expressed as

λ_k exp[j(π/4)k]

where                                                        (2.1)

λ_k = 1 for k = 0, 2, 4, 6;  λ_k = √2 for k = 1, 3, 5, 7

The chain codes can provide a compact representation of regions.
Freeman [24] proposed a correlation scheme for chain-coded curves. If
c_1, c_2, ..., c_n is one chain code, d_1, d_2, ..., d_m is another
chain code, and m ≤ n, then the chain-code correlation C(j) of d at c is

C(j) = Σ_{i=1}^{m} cos[((d_i − c_{i+j}) mod 8)(π/4)]         (2.2)
Chain-code matching is computationally efficient, but cannot be
considered a general tool for shape matching since it is not rotation
invariant, very sensitive to local changes in the number of chain
elements, and also quite sensitive to small global changes in scale. To
solve these drawbacks we combine the chain coding with the Fourier
descriptor for shape analysis.
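Freeman's correlation (equation 2.2) can be sketched directly; the small square contour below is illustrative.

```python
# Chain-code correlation: slide the short code d along the long code c
# and sum, at each offset j, the cosine of the code-angle difference.
import math

def correlation(c, d, j):
    """C(j) for the m-element code d aligned at offset j into c."""
    return sum(math.cos(((d[i] - c[i + j]) % 8) * math.pi / 4)
               for i in range(len(d)))

c = [0, 0, 2, 2, 4, 4, 6, 6]          # a small square contour
d = [2, 2]                            # one vertical side
scores = [correlation(c, d, j) for j in range(len(c) - len(d) + 1)]
best = scores.index(max(scores))      # offset of the best match
```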
2.3.2 Fourier Descriptor
Two types of Fourier descriptors (FD) are used for shape
description. One is the Fourier transform of a boundary expressed in
terms of tangent angle versus arc length [25]. The other uses the
complex function of a boundary [26], [27].

Chain Code : 556667021001224443
Figure 2.3 Chain Code of a Contour C

A closed contour C in Figure 2.4 is expressed as a complex function
u(t), where u(t) = x(t) + jy(t). The FD of C is defined as

a_n = (1/L) ∫_0^L u(t) exp[−jn·2πt/L] dt                     (2.3)

(L : the total length of the contour)

The curve u(t) can be expressed by the a_n as

u(t) = Σ_{n=−∞}^{∞} a_n exp[jn·2πt/L]                        (2.4)

In short, we assume that L = 2π. Formulas (2.3) and (2.4) then
simplify to

a_n = (1/2π) ∫_0^{2π} u(t) exp[−jnt] dt                      (2.5)

and

u(t) = Σ_{n=−∞}^{∞} a_n exp[jnt]                             (2.6)

The FD has the properties of translation, rotation, and scale
invariance [26].
2.3.3 Extension of FD to a Non-Closed Curve
From the previous analysis, we know that the FD is efficient only
for a closed contour. But in the real world, the closed contour is not
easy to acquire due to the poor imagery, or partial view. Thus,
extending the FD to a non-closed curve is very important.
Consider a segment as a closed curve (refer to Figure 2.5) in the
following way:
(a) Take one end point of the segment as the starting point.
(b) Trace the segment in the counterclockwise direction to the other
end.
(c) Retrace the segment to the starting point in the clockwise
direction.

Figure 2.4  An Example of a Contour Function

Figure 2.5  An Example of the FD of a Curve Segment
(the starting point and end point coincide)

From this modification, we can apply the FD to any arbitrary curve
whether it is closed or not.
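Steps (a)-(c) can be sketched directly: the retrace emits each move in reverse order with its direction flipped, and in octal Freeman codes the opposite of direction c is (c + 4) mod 8. The helper names below are assumptions of this illustration:

```python
# Unit moves for octal Freeman codes 0..7: 0 = east, counterclockwise.
MOVES = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]

def close_segment(codes):
    """Trace the open segment forward, then retrace it back to the
    starting point; the opposite of octal direction c is (c + 4) mod 8."""
    return codes + [(c + 4) % 8 for c in reversed(codes)]

def displacement(codes):
    """Net (dx, dy) of a chain code string; (0, 0) means the traversal
    returns to its starting pixel, i.e. it is closed."""
    return (sum(MOVES[c][0] for c in codes),
            sum(MOVES[c][1] for c in codes))
```

Because every forward move is exactly undone on the retrace, the closed traversal always has zero net displacement, so the FD machinery for closed contours applies unchanged.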
2.3.4 Expression of Fourier Coefficients in Terms of Chain Codes
In this section, we want to compute the FD of the contour in terms
of its chain code sequences. Let the grid spacing for the chain code be
equivalent to the unit of the coordinate system, and let g_1, g_2, ..., g_M be the
chain code string of a contour C. This chain code string should keep
the interior region enclosed by C on the left (see Figure 2.3). Let P
be the perimeter of the contour C, and let 0 = t_0 < t_1 < ... <
t_M = 2\pi be a partition of [0, 2\pi].
From equation (2.5), integrating by parts over the closed contour, we get

    a_n = \frac{1}{2\pi n j} \int_0^{2\pi} \exp(-jnt)\, du(t)
        \approx \frac{1}{2\pi n j} \sum_{m=1}^{M} \exp(-jnt_m)\,[u(t_m) - u(t_{m-1})]        (2.7)

From equation (2.1), the chain code element g_k can be expressed as the
elementary displacement

    \{ \ell_k \exp[j \frac{\pi}{4} g_k] \}

where

    \ell_k = 1        if g_k is even
    \ell_k = \sqrt{2}   if g_k is odd,     k = 1, ..., M

Further, by approximation, we obtain the partial and total perimeters

    P_m = \sum_{k=1}^{m} \ell_k        (2.8)

    P = \sum_{k=1}^{M} \ell_k        (2.9)

    t_m = \frac{2\pi P_m}{P} = 2\pi \sum_{k=1}^{m} \ell_k \Big/ \sum_{k=1}^{M} \ell_k ,    m = 1, ..., M        (2.10)

and

    u(t_m) - u(t_{m-1}) = \ell_m \exp\left(j \frac{\pi}{4} g_m\right),    m = 1, ..., M        (2.11)

Substituting (2.10) and (2.11) into (2.7), we obtain

    a_n = \frac{1}{2\pi n j} \sum_{m=1}^{M} \ell_m \exp\left[ j \left( \frac{\pi}{4} g_m - \frac{2\pi n}{P} \sum_{k=1}^{m} \ell_k \right) \right],    n = \pm 1, \pm 2, ...        (2.12)
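Equation (2.12) is a direct summation over the chain code string, so a_n can be computed without ever rasterizing the contour. The following sketch is an illustration written for this discussion, not code from the system; it assumes octal codes with link length 1 for even codes and sqrt(2) for odd codes:

```python
import cmath
import math

def fd_from_chain(codes, n):
    """Approximate the Fourier descriptor a_n of a chain-coded contour
    by the summation of Eq. (2.12).  Link lengths: 1 for even octal
    codes, sqrt(2) for odd ones; t_m follows Eq. (2.10)."""
    lengths = [1.0 if g % 2 == 0 else math.sqrt(2.0) for g in codes]
    P = sum(lengths)                    # total perimeter, Eq. (2.9)
    a, partial = 0j, 0.0
    for g, l in zip(codes, lengths):
        partial += l                    # partial perimeter, Eq. (2.8)
        t_m = 2.0 * math.pi * partial / P
        a += l * cmath.exp(1j * (math.pi / 4.0 * g - n * t_m))
    return a / (2.0 * math.pi * n * 1j)
```

For the closed octagonal contour 0, 1, 2, ..., 7 traced counterclockwise, |a_1| dominates and a_{-1} vanishes, as one expects for a nearly circular shape.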
For the coefficient a_0,

    a_0 = \frac{1}{2\pi} \int_0^{2\pi} u(t)\, dt
        \approx \frac{1}{2\pi} \sum_{m=1}^{M} [t_m - t_{m-1}]\, u(t_m)
        = u(t_1) + \frac{1}{P} \sum_{m=2}^{M} \ell_m \sum_{k=1}^{m-1} \ell_k \exp\left[j \frac{\pi}{4} g_k\right]        (2.13)

The coefficient a_0 represents the position of the shape center of
the contour C. Therefore equation (2.13) can be used to compute the
shape center asymptotically. If we shift the shape center to the origin
of the new coordinates, then the Fourier expression of the contour C can
be rewritten as

    u(t) = \sum_{n=1}^{\infty} [a_n \exp(jnt) + a_{-n} \exp(-jnt)],    0 \le t \le 2\pi        (2.14)
To describe the shape features, we propose the following measures:

(a) Circularity

    SF1 = |a_1| \Big/ \sum_{n=1}^{\infty} (|a_n| + |a_{-n}|)        (2.15)

SF1 = 1 when C is a circle and 0 < SF1 < 1 otherwise.

(b) Elongatedness

    SF2 = \frac{\text{short semi-axis}}{\text{long semi-axis}} = \frac{|a_1| - |a_{-1}|}{|a_1| + |a_{-1}|}        (2.16)

SF2 = 1 when C is a circle and 0 < SF2 < 1 otherwise.

(c) Complexity

    SF3 = \left[ \sum_{k=1}^{M} \ell_k \right]^2 \Big/ 4\pi^2 \left[ \sum_{n=1}^{\infty} n (|a_n|^2 - |a_{-n}|^2) \right]        (2.17)

where

    Area( A ) = \text{Re}\left[ \frac{1}{2} \int_0^{2\pi} j\, u(t)\, \overline{u'(t)}\, dt \right] = \pi \sum_{n=-\infty}^{\infty} n |a_n|^2

    perimeter( P ) = \sum_{k=1}^{M} \ell_k

SF3 = 1 if C is a circle and SF3 > 1 otherwise.

(d) Convexity
The curvature of C can be written as

    \frac{d\theta}{ds} = \frac{x'y'' - x''y'}{[(x')^2 + (y')^2]^{3/2}} = (\text{const.})(x'y'' - x''y')

where t = \frac{2\pi}{P} s increases in the counterclockwise sense, and

    x'y'' - x''y' = \text{Re}[j\, u'(t)\, \overline{u''(t)}]
                  = \text{Re}\left[ \sum_{m=-\infty}^{\infty} \sum_{n=-\infty}^{\infty} n m^2 a_n \overline{a}_m \exp(j(n-m)t) \right]
                  = \sum_{n=-\infty}^{\infty} n^3 |a_n|^2    (keeping the terms with m = n)

Now, we define

    SF4 = \sum_{n=1}^{\infty} n^3 (|a_n|^2 - |a_{-n}|^2) \Big/ (|a_1|^2 - |a_{-1}|^2)        (2.18)

SF4 = 1 when C is a circle, and SF4 > 1 when C has more convexity.
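Given the FD coefficients, the measures above reduce to a few arithmetic operations. A minimal sketch for SF2 and SF4 follows; the mapping n -> a_n and the truncation of the infinite sum at harmonic N are assumptions of this illustration:

```python
def elongatedness(a):
    """SF2 of Eq. (2.16): ratio of the short to the long semi-axis of
    the best-fitting ellipse, from the coefficients a[1] and a[-1]."""
    return abs(abs(a[1]) - abs(a[-1])) / (abs(a[1]) + abs(a[-1]))

def convexity(a, N=10):
    """SF4 of Eq. (2.18), truncating the infinite sum at harmonic N.
    Equals 1 for a circle, where only a[1] is nonzero."""
    num = sum(n ** 3 * (abs(a.get(n, 0.0)) ** 2 - abs(a.get(-n, 0.0)) ** 2)
              for n in range(1, N + 1))
    return num / (abs(a[1]) ** 2 - abs(a[-1]) ** 2)
```

For a circle both measures are exactly 1; adding a counter-rotating component a[-1] lowers SF2 toward 0, reflecting elongation.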
2.3.5 Predictive Searching for Chain Encoding
To generate the chain codes from a geometrical configuration, the
search direction of successive codes is predetermined. Such a method
works very well when it is applied to simple geometrical
configurations. However, when the geometrical configuration contains
overlapping and touching points, chain-coding methods may become less
satisfactory. Xu and Tou [28] have proposed a predictive searching
method based upon the past information. That is, the subsequent chain
code is predicted from information on the preceding chain codes. The
outline of the predictive searching method is as follows.
Assume that at the i-th step of chain encoding, the predictive
chain-code direction is D_p(i), the real chain direction is D_r(i), the
predictive code-angle increment is \alpha_p(i), and the real code-angle
increment is \alpha_r(i). Then we have the following relationships:

    D_p(i) = D_r(i-1) + \alpha_p(i)        (2.19)

    \alpha_p(i) = \sum_{j=1}^{L} a_j\, \alpha_r(i-j)        (2.20)

    \alpha_r(k) = \alpha_p(k) + v(k)        (2.21)
where a_j, j = 1, 2, ..., L, are the weighting coefficients; L is the number of
angle increments employed in the prediction process; v(k) denotes
the difference in code-angle increment, k = i-1, i-2, ..., i-M; and M is
an integer equal to or larger than L. When the curve is very smooth
with small curvature, a large number is assigned to L. When the curve
has a larger curvature, a small number is assigned to L.
By considering the past M code-angle increments, we may express
equation (2.21) in vector form as

    Y = H A + V        (2.22)

where

    Y = [\alpha_r(i-1)\ \alpha_r(i-2)\ \cdots\ \alpha_r(i-M)]^T

is the real code-angle increment vector,

    H = \begin{bmatrix}
        \alpha_r(i-2)   & \alpha_r(i-3)   & \cdots & \alpha_r(i-1-L) \\
        \alpha_r(i-3)   & \alpha_r(i-4)   & \cdots & \alpha_r(i-2-L) \\
        \vdots          & \vdots          &        & \vdots          \\
        \alpha_r(i-M-1) & \alpha_r(i-M-2) & \cdots & \alpha_r(i-M-L)
        \end{bmatrix}

is an M x L matrix,

    A = [a_1\ a_2\ \cdots\ a_L]^T

is the coefficient vector, and

    V = [v(i-1)\ v(i-2)\ \cdots\ v(i-M)]^T

is the error vector.

We can determine \hat{A}, the least-square estimation of A, so that the
following criteria are satisfied:

    (1)  E\{\hat{A}\} = A
    (2)  E\{e^T e\} = E\{(\hat{A} - A)^T (\hat{A} - A)\} = \text{minimum}        (2.24)

The least-square estimation of the coefficient vector A is given by

    \hat{A} = (H^T H)^{-1} H^T Y        (2.25)

where

    \hat{A} = [\hat{a}_1\ \hat{a}_2\ \cdots\ \hat{a}_L]^T
If (H^T H)^{-1} does not exist, we may use the pseudo-inverse matrix H^{#} instead
of (H^T H)^{-1} H^T. The pseudo-inverse matrix of H is

    H^{#} = \frac{r}{\text{Tr}(C_r B)}\, H^T C_r

where B = H H^T; C_1 = I; C_{i+1} = \frac{1}{i}\text{Tr}(C_i B)\, I - C_i B, i = 1, 2, ..., r; and
r is the rank of H H^T. Tr(C_r B) indicates the trace of the matrix C_r B.
Consequently, the optimal prediction of the code-angle increment at the
i-th chain code is given by

    \alpha_p(i) = \left\lfloor \sum_{j=1}^{L} \hat{a}_j\, \alpha_r(i-j) \right\rfloor        (2.26)

where the code-angle increments K(\cdot) = 0, 1, 2, 3 correspond to the octal chain
codes 0, 1, 2, ..., 7, and \lfloor X \rfloor denotes the largest integer which is less
than the number X.
In [28], Xu and Tou chose L = 3 and M = 3 for predictive
searching. From equation (2.24), we know that the performance
measure is essentially based on the view that all the errors are equally
important. This is not necessarily so. We may know, for example, that
data taken later in the experiment were much more in error than data
taken early on and it would seem reasonable to weight the errors
accordingly. Such a scheme is referred to as weighted least squares and
is based on the performance criterion

    J = \epsilon^T W \epsilon        (2.27)
Then the least-square estimation of the coefficient vector A is given by

    \hat{A} = (H^T W H)^{-1} H^T W Y        (2.28)
We note that equation (2.28) reduces to ordinary least squares when
W = I, where I is the identity matrix. The predictive searching helps
preserve the continuity of the boundary without resorting to interpolation
or extrapolation.
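The estimate of Eq. (2.25), and its weighted variant of Eq. (2.28), needs only a small normal-equation solve when L is 2 or 3. The sketch below is an illustrative pure-Python implementation, not the system's code; `W` is a list of per-row weights standing in for the diagonal of the weighting matrix:

```python
def lstsq(H, Y, W=None):
    """Weighted least squares A = (H^T W H)^{-1} H^T W Y (Eqs. 2.25/2.28)
    for small systems, via normal equations and Gaussian elimination.
    W is a list of per-row weights; W = None gives ordinary least squares."""
    M, L = len(H), len(H[0])
    w = W or [1.0] * M
    # Normal equations G A = b, with G = H^T W H and b = H^T W Y.
    G = [[sum(w[r] * H[r][i] * H[r][j] for r in range(M)) for j in range(L)]
         for i in range(L)]
    b = [sum(w[r] * H[r][i] * Y[r] for r in range(M)) for i in range(L)]
    # Gaussian elimination with partial pivoting.
    for col in range(L):
        piv = max(range(col, L), key=lambda r: abs(G[r][col]))
        G[col], G[piv] = G[piv], G[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, L):
            f = G[r][col] / G[col][col]
            for c in range(col, L):
                G[r][c] -= f * G[col][c]
            b[r] -= f * b[col]
    A = [0.0] * L
    for i in reversed(range(L)):
        A[i] = (b[i] - sum(G[i][j] * A[j] for j in range(i + 1, L))) / G[i][i]
    return A

def predict_increment(history, A):
    """Predicted code-angle increment, Eq. (2.20): sum of a_j * alpha_r(i-j)."""
    return sum(a * inc for a, inc in zip(A, reversed(history[-len(A):])))
```

Fitting rows built from a Fibonacci-like increment history recovers a_1 = a_2 = 1, so the predicted next increment is just the sum of the last two observed ones.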
2.4 Decision Criterion
During the classification process, the statistical pattern
recognition techniques combined with production rules form a decision
tree to guide the system to make a decision. Two decision
criteria are used in pattern recognition: a time-domain classifier and a
frequency-domain classifier. In the time domain, we consider that
most random processes are governed by the Gaussian probability law;
therefore, the Bayesian decision function [29] is given by

    d_i(X) = \ln[P(\omega_i)] - \frac{1}{2}\ln|C_i| - \frac{1}{2}(X - \mu_i)^T C_i^{-1}(X - \mu_i),    i = 1, ..., M        (2.29)

where P(\omega_i) is the a priori probability of the occurrence of the i-th class
\omega_i; X = [x_1\ x_2\ \cdots\ x_n]^T is the pattern vector, x_1, ..., x_n are the n
selected features; M is the number of classes; and \mu_i and C_i are the mean
vector and covariance matrix of the i-th class. The pattern X is
assigned to class \omega_i if for that pattern

    d_i(X) > d_j(X)    for all j \ne i        (2.30)
In the frequency domain (e.g. the Fourier descriptor), for similar
reasons it is assumed that the distribution of certain descriptor
coefficients (f_n, n = 1, ..., N) of a class is also a two-dimensional
Gaussian distribution P_{f_n}(r_f, i_f), where r_f and i_f are the real and
imaginary components of a coefficient f_n:

    f_n(r_f) = (|a_n| + |a_{-n}|)\cos(nt)
    f_n(i_f) = (|a_n| - |a_{-n}|)\sin(nt)

To classify the patterns, a simple decision function is defined as

    D_i(r_f, i_f) = \sum_n \ln[P_{f_n}(r_f, i_f)],    i = 1, ..., M        (2.31)

where the summation is taken over all the f_n coefficients used for
classification. A pattern is assigned to class \omega_i if for that pattern

    D_i(r_f, i_f) > D_j(r_f, i_f)    for all j \ne i        (2.32)
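For a concrete, deliberately simplified illustration of Eqs. (2.29)-(2.30), the sketch below assumes a diagonal covariance matrix, so the determinant and the quadratic form reduce to sums over the features; the class-table layout is an assumption of this example, not the system's representation:

```python
import math

def bayes_discriminant(x, prior, mean, var):
    """Eq. (2.29) for a diagonal covariance matrix: d_i(X) = ln P(w_i)
    - 1/2 ln|C_i| - 1/2 (X - mu_i)^T C_i^{-1} (X - mu_i)."""
    log_det = sum(math.log(v) for v in var)
    maha = sum((xi - mi) ** 2 / v for xi, mi, v in zip(x, mean, var))
    return math.log(prior) - 0.5 * log_det - 0.5 * maha

def classify(x, classes):
    """Eq. (2.30): assign x to the class with the largest discriminant.
    classes maps class name -> (prior, mean vector, variance vector)."""
    return max(classes, key=lambda k: bayes_discriminant(x, *classes[k]))
```

With equal priors and equal covariances this reduces to nearest-mean classification, which is why the decision boundary between two such classes is the perpendicular bisector of their means.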
2.5 Knowledge Representation
One of the most common methods for representing knowledge is to use
condition-action rules or productions, e.g. Davis, Buchanan, and
Shortliffe [30]. The condition part of such rules is typically a
logical product of several conditions, and the action part describes a
decision, an action, or an assignment of values to variables that is to
be performed when a situation satisfies the condition part. Basically,
the production rules represent the how-type knowledge and take the
denotation of the form

    IF ... THEN

or

    Condition -> Action.

Using the variable-valued logic calculus VL_1 [31], we may represent
the production rule as

    [x_i # R] -> [D]

where # stands for one of the relational symbols =, \ge, >, \le, or <; R
denotes a subset of the value set of variable x_i; D denotes a term
describing the decision to be assigned when the condition part (on the
left of ->) is satisfied; and -> denotes the decision assignment operator.
It is easy to see that if the condition part has more than one
element, the condition part becomes a disjunction of the individual
conditions:

    [x_i # a,b,c,...] = [x_i # a] \vee [x_i # b] \vee [x_i # c] \vee ...
The LISP and Prolog programming languages [32] are designed to
perform the above logical statement.
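A minimal interpretation of the VL_1 selector and of a condition-action rule can be sketched as follows; the data layout and function names are assumptions of this illustration, not the system's implementation:

```python
import operator

# Relational operators usable in a VL1-style selector [x # R].
OPS = {"=": operator.eq, ">": operator.gt, ">=": operator.ge,
       "<": operator.lt, "<=": operator.le}

def selector(value, op, ref_set):
    """[x # R]: true when x stands in relation # to SOME element of the
    value set R -- the set form is a disjunction of single conditions."""
    return any(OPS[op](value, r) for r in ref_set)

def fire(rule, facts):
    """A production rule: a conjunction of selectors -> a decision.
    Returns the decision when every selector is satisfied, else None."""
    conds, decision = rule
    if all(selector(facts[var], op, refs) for var, op, refs in conds):
        return decision
    return None
```

The example rule below mirrors the soybean looper attributes used later in Chapter 3: a green body and two pairs of abdominal prolegs jointly trigger the decision.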
The advantages of a production system are
(1) Each production rule is completely modular and independent of the
other rules.
(2) The stylized nature of the production rules makes the coding easy to
examine.
(3) Each rule represents a small, isolated chunk of knowledge. Thus a
user familiar with the system may be able to formulate new rules if
necessary.
The disadvantages are
(1) Sometimes we cannot easily represent a piece of knowledge by a
production rule.

(2) Some impact may arise from unexpected interactions with other rules
due to a rule change or an addition of a rule.
The other way to represent knowledge is to take the conditional
probabilities into account. This representation is used to diagnose
entities such as diseases by repeatedly applying Bayes' theorem to
compute the probabilities that certain diseases are present, given that
certain symptoms have been observed.
Another way in which knowledge is sometimes organized is by static
descriptions of phenomena. This representation is used in INTERNIST
[33],[34] and MEDIKS [35].
We represent the knowledge in entity-attribute relationship
associated with conditional probabilities. Initially, the conditional
probabilities P(E_i|A_j) and P(A_j|E_i) of the entity-attribute relationship
are determined by the knowledge base. That is, if there are n possible
entities associated with a particular attribute A_j, then the same
initial conditional probability is attached to each E_i, i = 1, ..., n:

    P(E_1|A_j) = P(E_2|A_j) = ... = P(E_n|A_j) = \frac{1}{n}
However, using the frequency of occurrence in place of the
conditional probability is not accurate. This assignment blurs the
importance of the key features and leads to misclassification.
To compensate for this drawback, we have to manually replace the
conditional probability by a weight (or certainty factor). However,
experts may not feel comfortable deciding the weight. This weight
assignment is updated until the system performs correct decision making.
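The uniform initialization and the manual reweighting step described above can be illustrated with two small helpers; the function names and data layout are hypothetical, chosen only for this sketch:

```python
def init_conditional(entities):
    """Uniform initialization P(E_i | A_j) = 1/n over the n entities
    that share attribute A_j."""
    n = len(entities)
    return {e: 1.0 / n for e in entities}

def reweight(probs, key_entity, weight):
    """Manually boost the key entity's certainty factor and spread the
    remaining mass uniformly over the other entities, so the values
    still sum to one."""
    others = [e for e in probs if e != key_entity]
    out = {e: (1.0 - weight) / len(others) for e in others}
    out[key_entity] = weight
    return out
```

The boosted weight plays the role of the certainty factor attached to a key feature; iterating `reweight` until the classifier's decisions are correct corresponds to the update loop described above.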

2.6 Knowledge-Based System
What distinguishes such a knowledge-based system (or expert system)
from an ordinary application program is that in most expert systems, the
knowledge is explicitly in view as a separate entity, rather than
appearing only implicitly as part of the coding of the program.
Ordinary computer programs organize knowledge into two levels: data and
program. Most expert computer systems, however, organize knowledge on
three levels: data, knowledge base and control. For a general review,
see Nau [36].
We propose that our knowledge-based system incorporate the concept
of pattern recognition as the classification reference to compensate for
the lack of a suitable rule. There are three goals in our knowledge-
based system. The first is to structure the problem domain and to
develop analysis techniques. These techniques include algorithm,
heuristic rules and the use of contextual information. The second is to
use the techniques developed in Sections 2.2 to 2.4 for analyzing visual
imagery. The third one is to develop analysis techniques for effective
object classification that would be less computationally expensive.
The knowledge-based approach to visual imagery involves formulating
and evaluating hypotheses about the objects observed in the imagery.
This is accomplished by extracting features from the imagery and then
associating those features with high-level models of possible objects
that are stored in the system's knowledge base. The basic system block
diagram is shown in Figure 2.6. It consists of four main parts:

Figure 2.6 The Block Diagram of Pictorial Knowledge Base

(1) A data base which contains results about the images. This comprises
spatial information, possible alternatives, and different levels of
representation.
(2) A module which executes control; it decides which methods to apply,
which information to use, and how to access the results contained in
the knowledge base.
(3) A knowledge base which contains methods for preprocessing an image,
extraction of simple features, and classification.
(4) A data base which contains knowledge about structural properties of
images and possibly about the field of problems.
A typical processing sequence will include the following steps:
Step 1   Extract some strong major features.
Step 2   Use the available information (extracted features) to
         prune the list of possible object types and suggest
         correct hypotheses.
Step 3   Use the library (data base) information to predict other
         lower level features.
Step 4   Associate the predicted lower level features with the
         image.
Step 5   Iterate between Steps 2 to 4 until a classification
         decision is reached.
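The steps above amount to a hypothesize-and-prune loop. The sketch below is one possible rendering, with a made-up library format mapping each object type to its expected feature set and a caller-supplied `extract_feature` predicate standing in for the image-analysis routines:

```python
def classify_by_hypothesis(image_features, library, extract_feature):
    """Sketch of Steps 1-5: prune object hypotheses with the strong
    features already extracted, then iteratively predict and verify
    lower-level features from the library until the candidate set
    cannot be narrowed further."""
    # Step 2: keep objects consistent with the features seen so far.
    candidates = {obj for obj, feats in library.items()
                  if image_features <= feats}
    while len(candidates) > 1:
        # Step 3: predict a lower-level feature not yet examined.
        pending = set().union(*(library[c] for c in candidates)) - image_features
        if not pending:
            break
        f = sorted(pending)[0]
        # Step 4: associate the predicted feature with the image.
        if extract_feature(f):
            image_features.add(f)
            candidates = {c for c in candidates if f in library[c]}
        else:
            candidates = {c for c in candidates if f not in library[c]}
    return candidates  # Step 5 ends when one (or no) hypothesis remains
```

In the circuit-diagram setting of Chapter 5, for instance, a generic "two-terminal" feature leaves several symbol hypotheses open, and verifying or rejecting a zigzag stroke settles the choice.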
In the following chapters, we illustrate the knowledge-based system
design using the APRIKS system (Agricultural Productivity Improvement
Knowledge System). Then, incorporating the image processing techniques,
we develop the pictorial knowledge-based system (some use the term image
understanding system) to support falsified-document detection and font
identification in Chapter 4, and circuit-diagram recognition and
interpretation in Chapter 5.

CHAPTER 3
DESIGN OF A KNOWLEDGE-BASED EXPERT SYSTEM
FOR APPLICATIONS IN AGRICULTURE
3.1 Introduction
In a technology-intensive industrial environment, scientific and
technological knowledge is not only the most important element but also
a key factor in productivity improvement. The whole world today has
moved into a knowledge-based mode of operation. As a result, under the
present circumstances, we are confronted by three major problems in
dealing with knowledge [37]:
(1) proliferation of knowledge,
(2) utilization of knowledge,
(3) unavailability of knowledge.
The proliferation of knowledge has raised the question of how much
a person should selectively read and how much knowledge and information
he should acquire every day in order to meet his needs. The second
problem is concerned with the most efficient way of transferring the
knowledge to a user when his needs arise. The third problem is the fact
that what masters really know is normally not in the textbooks written
by the masters. The introduction of knowledge-based expert systems has
shown great promise of solving these problems. Knowledge-based expert
systems may be designed to provide selective material for reading and
viewing by an individual, depending upon the scope of his interest and
the nature of his problem. Knowledge-based expert systems may be
designed to bring the specific knowledge to the user when his needs
arise and to provide general consultation services to him. Knowledge-
based expert systems may be designed to transfer the experience and
know-how of masters into the knowledge base of the expert system. These
three may be considered as the important types of knowledge-based expert
systems which we need in the coming decades.
In recognizing the above mentioned problems, a great deal of
interest has been developed in recent years in the design of knowledge-
based expert systems for various applications, especially in medical
consultation. To conduct medical diagnosis, for instance, a knowledge-
based expert system may provide comprehensive analysis of possible
disorders, suggest suitable treatment plans, and serve as a ready source
of expertise. Among the well-known computer-based systems for medical
applications are MYCIN [30],[38] for antimicrobial therapy advice,
INTERNIST [33],[34] for diagnosis in internal medicine, CASNET [39] for
glaucoma, MEDIKS [35] for diagnosis of multiple disorders, and
MEDAS [40] for interactive diagnosis.
Other applications include DENDRAL [41] for determining molecular
structures of complex organic chemicals from mass spectrograms and
related data, PROSPECTOR [42] for consultation about potential mineral
deposits, PLANT [43] for diagnosis of soybean diseases, Hearsay-II [44]
for speech understanding, APRIKS [45] for knowledge transfer and
utilization in agriculture, and a number of applications in other fields
[46]-[50].

Food productivity may be enhanced by taking several approaches:
(1) crop pest control, (2) plant disease control, and (3) fertilizer
management. The primary function of crop pest control is to make
optimum use of pesticides on plants. The main objective of plant
disease control is to prevent the spread of plant diseases. The primary
goal of fertilizer management is to provide an adequate amount of
nutrition to the plants. To enhance food productivity, we should make
the right diagnostic analysis, select the right treatment plan, and use
the right pesticides and fertilizers with the right amount at the right
time. To accomplish these objectives, farmers and growers should have
the necessary knowledge at their fingertips when their needs arise. The
computer-based APRIKS system was conceived some five years ago to meet
this challenge. APRIKS is the acronym for Agricultural PRoductivity
Improvement Knowledge System. This paper presents the design of APRIKS
which is a pilot knowledge-based expert system for applications in
agriculture. The APRIKS enables the user (farmer, grower, or county
agent) to interact with the computer in a conversational mode to obtain
satisfactory answers to various questions and suggestions of solutions
to specific problems. The APRIKS is a pilot system which demonstrates
the use of a minicomputer in agricultural information browsing,
knowledge transfer, diagnostic consultation, management recommendation,
and science education. On the basis of a set of observations provided
by the user, the APRIKS can determine the plant diseases, the damaging
insects, or the planting instructions. It can recommend treatment plans
and pest control procedures, and can provide useful information such as

life history, injury to crop, and injury threshold. The APRIKS is
designed to respond to two types of input formats:
(a) Menu selection,
(b) Simple natural language.
Thus, the APRIKS performs three modes of operation:
(1) Interactive retrieval and browsing mode,
(2) Decision-making and consultation mode,
(3) Question-answering and diagnostic mode.
The design of the APRIKS is divided into two major tasks:
(1) APRIKS knowledge base generation,
(2) APRIKS knowledge seeking and utilization.
The knowledge base is generated by two modes of operation: the
off-line batch mode and the interactive conversational mode. The
knowledge seeking and utilization process is accomplished on the basis
of knowledge-based pattern recognition and inference principles. It is
our hope that the APRIKS design concept will provide a solution to the
three major problems created by proliferation of knowledge, utilization
of knowledge, and unavailability of knowledge.
3.2 Design Concepts for Knowledge-Based Expert Systems
A knowledge-based expert system generally consists of two
fundamental components: (1) a knowledge base and (2) a recognition/
inference mechanism. The knowledge base is a structural and relational
representation of knowledge. In a natural format, knowledge may be
represented in an associative hierarchical structure. The most general

concepts are placed at the top of the tree and the most specific items
at the bottom of the hierarchy. Knowledge of various levels of
specificity is distributed throughout the hierarchy. The structure of
the knowledge base is useful knowledge which may facilitate knowledge
classification, new knowledge acquisition by inserting it at the right
location in the hierarchy, identification of possible problems, and
efficient seeking of the appropriate knowledge in response to a query
[51]-[53].
The recognition/inference mechanism interprets the user's query
input, performs pattern matching, and generates recommendations,
specific answers or inferences. Its decision-making process makes use
of discriminant analysis, Bayesian statistics, clustering analysis,
feature extraction, syntactic rules, and production rules. The design
of a knowledge-based expert system may be conducted by following two
major approaches:
(1) Rule-based approach,
(2) Pattern-directed approach [29].
Sometimes a combination of both approaches should be cleverly
employed. The rule-based approach makes use of a collection of "if-
then" rules. A typical design based upon this approach is the MYCIN
system [38]. The pattern-directed approach is based upon the
construction of pyramid-like "networks" of know-hows, which we refer to
as a knowledge hierarchy. The MEDIKS system has been designed on the
basis of this approach [35]. In this paper, we have developed the
framework of a knowledge-based expert system for the extension, transfer

and utilization of knowledge. The organization of this knowledge-based
expert system is illustrated in Figure 3.1.
Through the knowledge acquisition task, the system transfers
experts' experience and know-how into the knowledge base. The system
extends its strategy through an inference mechanism. The system is
designed to perform the following three modes of operation: information
retrieval and browsing, decision-making and consultation, and diagnostic
analysis and question-answering. This framework of a knowledge-based
expert system is applied to productivity improvement in agriculture.
The design of APRIKS system is discussed in the following sections.
In the APRIKS knowledge base, strategic information items are
characterized by feature patterns. The knowledge seeking process is
accomplished through the recognition of feature patterns which match
most closely with query patterns formulated by APRIKS from the user's
query. Two information patterns are said to be matched if a certain
performance criterion is satisfied. The strategic information items are
linked to the detailed descriptions and working knowledge which are
stored in various knowledge files.
In the APRIKS system, the knowledge seeking and utilization process
involves six basic steps:
(1) Formulate the query information patterns from user's query,
(2) Retrieve the feature patterns from the knowledge base,
(3) Modify the user's query to make it more specific,
(4) Recognize the associated feature patterns via knowledge-based
pattern recognition and inference,

Figure 3.1  Organization of a Knowledge-Based System
(panels: Knowledge Transfer from Experts; Strategy Extension from Experts)

(5) Generate detailed information on the associated feature
patterns,
(6) Suggest optimal treatment plans and alternative solutions.
3.3 Knowledge Representation
In the APRIKS system, we represent knowledge in the knowledge base
by hierarchical entity-attribute-value relations. The structural
knowledge is semantically categorized into an associative tree. Each
node of the relational hierarchy represents an entity associated with a
set of attributes and values. An entity E_i is described by a
conjunctive expression of the descriptors,

    \{D_{i,j}(w_j, A_j, V_j)\},    j = 1, 2, ..., N

where D_{i,j} denotes the jth descriptor of entity E_i; N is the number of
attributes used to characterize this entity; and A_j, V_j, and w_j denote
the corresponding attribute, value of the attribute, and the weighting
factor, respectively. Attribute A_j is a characteristic used to classify
an entity E_i. The weighting factor w_j is assigned to the jth attribute
A_j to specify its relative importance in the characterization of the
entity E_i. Weighting factor w_j may be a specific number determined by
an expert's opinion or a value derived from conditional probability or
fuzzy set theory. For example, in document retrieval, the weighting
factor associated with a keyword may be determined by the frequency of
occurrence of the keyword which appears in the document.
The descriptor is an attribute-value pair with a weighting
factor, D_{i,j}(w_j, A_j, V_j). The set of descriptors \{D_{i,j}(w_j, A_j, V_j)\},
j = 1, 2, ..., N, form a cluster space to describe the entity E_i:

    \{D_{i,j}(w_j, A_j, V_j)\} = \{D_{i,1}(w_1, A_1, V_1), ..., D_{i,N}(w_N, A_N, V_N)\}
For simplicity in writing, we use the short notation D_{i,j}(w_j) by
dropping the attribute name and value. Furthermore, it is assumed that
the descriptors D_{i,1}(w_1), D_{i,2}(w_2), ..., D_{i,N}(w_N) are mutually
independent. We make this assumption since the dependence aspect of the descriptors
is not adequately understood. In fact, it has been recognized that
classification decisions are robust with respect to the assumption of
conditional independence [54].
Common descriptors exist between two entities E_i and E_j, if the
name or the meaning between the two descriptor sets \{D_{i,k}(w_k)\} and
\{D_{j,l}(w_l)\} are the same. We denote the common descriptors by
\{\bar{D}_{ik}(\bar{w}_k)\}, where the factor \bar{w}_k is defined by

    \bar{w}_k = \min(w_k, w_l)

The common descriptors form a semantic link between two entities, and
the most frequently used name is chosen as the master term.
Distinct descriptors between two entities E_i and E_j are expressed
as \{D^{o}(w^{o})\}, if the name and meaning between the two descriptor sets are
different. The distinct weighting factor is defined as

    w^{o} = w_k    if D^{o} = D_{i,k}
    w^{o} = w_l    if D^{o} = D_{j,l}
Two entities will be linked together, if the names of the entities
are the same. We call this link an entity semantic link, denoted by
\hat{E}(i,j). However, several individual descriptors may have common
descriptions with different weights. The entity semantic link has the
property that if there exist two entity semantic links \hat{E}(i,j) and \hat{E}(j,k),
then a link \hat{E}(i,k) exists. We denote a partial link by the
notation \hat{E}(i/j).
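Treating each entity's descriptor set as a mapping from attribute name to weighting factor, the common- and distinct-descriptor rules can be sketched directly; the dictionary layout is an illustrative assumption, not the system's data structure:

```python
def common_descriptors(d1, d2):
    """Common descriptors of two entities: shared attribute names, each
    carrying the combined weighting factor min(w_k, w_l)."""
    return {a: min(d1[a], d2[a]) for a in d1.keys() & d2.keys()}

def distinct_descriptors(d1, d2):
    """Distinct descriptors keep their own weighting factors."""
    out = {a: w for a, w in d1.items() if a not in d2}
    out.update({a: w for a, w in d2.items() if a not in d1})
    return out
```

The common descriptors returned here are exactly the ones that would form a semantic link between the two entities.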
Shown in Figure 3.2 is the relational hierarchy for a portion of
knowledge representation in APRIKS system. The top-level node is
soybean which is followed by five entities (insects, diseases,
nematodes, weeds and varieties). Furthermore, under the insect node are
included soybean looper, green cloverworm, velvetbean caterpillar, corn
earworm, and so on. The associative tree can readily be expanded
systematically by acquiring more knowledge for the knowledge base. The
entity soybean looper can be described by such physical characteristics
as body color, body length, abdominal proleg, etc. as illustrated in
Figure 3.3. Some of the attributes are key characteristics of an
entity. For instance, the two-pair abdominal proleg is a key
characteristic of soybean looper and it carries a heavy weighting
factor.
3.4 Knowledge Base Generation
Knowledge base generation is an essential step in the design of
knowledge-based systems. In our approach, we divide the knowledge base
into two components. Part One contains knowledge outline or knowledge
sketch, and Part Two stores knowledge details. Knowledge sketch is
represented as a relational hierarchy. Semantic links are used to
associate the nodes of the hierarchy. Knowledge details constitute the
data files and dictionaries of the knowledge base. To generate the
relational hierarchy for knowledge sketch either the bottom-up approach

Figure 3.2  An Example of APRIKS System Hierarchy
(SOYBEAN; insects: looper, cloverworm, caterpillar, earworm; control
recommendation; non-chemical control: Bacillus thuringiensis (Dipel,
Bactur, Thuricide); chemical control: methomyl; leaves: formulation,
A.I./acre, application, toxicity)

Entity: Soybean Looper

    Attributes            Values
    Body Length           1/8 - 1 1/2 inches
    Body Shape            Head is smaller than body
    Body Color            Green
    Abdominal Prolegs     2 pairs
    Anal Prolegs          Erect
    Stripe Color          White
    Walk Status           Loop-like

Figure 3.3  Attributes of Soybean Looper

or the top-down approach may be taken. In the bottom-up approach, it is
assumed that a node in the hierarchy will contain all the
characteristics of its subnodes. As an illustration, consider a
knowledge sketch shown in Figure 3.4. The knowledge base generation
rule for the bottom-up approach is
    [D_{i_1 i_2 \cdots i_{m-1},\, j}(w_j)]_{new} <- [D_{i_1 i_2 \cdots i_{m-1},\, j}(w_j)]_{old}
        \cup D_{i_1 i_2 \cdots i_{m-1} \alpha,\, j}(w_j) \cup D_{i_1 i_2 \cdots i_{m-1} \beta,\, j}(w_j) \cup \cdots

where m = 2, ..., M; M = total number of levels in the hierarchy; \alpha, \beta =
subnode numbers in the m-th level of the hierarchy; and i_1, i_2, ... =
subnode numbers in the 1st, 2nd, ... level of the hierarchy,
respectively. The symbol \cup denotes the union set operator. The above
knowledge base generation rule is applied repeatedly from the (M-1)th
level to the top level, and the result is shown in Figure 3.5. The top-
down approach is a process in the reverse order of the bottom-up
approach. The knowledge base generation rule for the top-down approach
is
    [D_{i_1 i_2 \cdots i_m,\, j}(w_j)]_{new} <- [D_{i_1 i_2 \cdots i_m,\, j}(w_j)]_{old}
        \cup D_{i_1 i_2 \cdots i_{m-1},\, j}(w_j)

where m = 2, ..., M.
This rule is applied iteratively from the 2nd level to the bottom level
and the result is shown in Figure 3.6.
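The bottom-up rule can be sketched as a post-order traversal in which every node absorbs the union of its subnodes' descriptor sets; the tree and descriptor layouts below are assumptions made for this illustration:

```python
def bottom_up(tree, descriptors, root):
    """Bottom-up KB generation: augment each node's descriptor set with
    the union of its subnodes' sets, applied from the lowest level up.
    `tree` maps node -> list of subnodes; `descriptors` maps node ->
    set of descriptor names (mutated in place)."""
    def visit(node):
        for child in tree.get(node, ()):
            descriptors[node] = descriptors.get(node, set()) | visit(child)
        return descriptors.get(node, set())
    visit(root)
    return descriptors
```

After the pass, a node such as "insect" carries all the characteristics of its subnodes, exactly as the bottom-up assumption requires; the top-down direction would instead push each node's descriptors down to its children.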
The selection of either the bottom-up approach or the top-down
approach depends on the system design and knowledge-seeking

Figure 3.4  An Example of the Hierarchy for Knowledge Sketch

Figure 3.5  Bottom-Up Approach for KB Generation

Figure 3.6  Top-Down Approach for KB Generation

strategies. However, in the real-world situation, the associated
descriptors at the top level are of general nature and may be irrelevant
to system requirements because of the need for specific characterization
at the lower level. However, in practice, the number of inherent
descriptors at the top level is smaller than that at the lower levels.
In our work, we propose a modified top-down approach to simplify
knowledge base generation. From the relational hierarchy, we understand
that each entity is actually described by its associated descriptor sets
and its father-son relationship. We, therefore, consider the father-son
relationship as a potential descriptor, calling it the cover. That is,
the cover is a descriptor for the system internal representation. Thus,
the semantic linking rule for knowledge base generation is given by
    [D_{i_1 i_2 i_3 \cdots i_m,\, j}(w_j)]_{new} <- [D_{i_1 i_2 i_3 \cdots i_m,\, j}(w_j)]_{old} \cup e_{i_1 i_2 i_3 \cdots i_{m-1}}

where the cover

    e_{i_1 i_2 i_3 \cdots i_{m-1}} = \{E_{i_1}, E_{i_1 i_2}, E_{i_1 i_2 i_3}, ..., E_{i_1 i_2 \cdots i_{m-1}}\}
The modified top-down approach is illustrated in Figure 3.7.
The two kinds of data input methods for knowledge base generation
and updating are the batch mode and the interactive mode. The batch mode is
suited to accepting a large collection of raw data and processing it at one
time. The interactive mode is used for small amounts of data. In APRIKS, we use
batch mode for knowledge base generation and interactive mode for

Ei O),
El,D1.2,j(j)>
-P~
Ui
where e
1 1 1 = ^E1,E1 iEi i l^*Gl 1 = ^E1,E1 1 }>anc* ei =
= E,
Figure 3.7 Modified Top-Down Approach for KB Generation

knowledge base updating. In order to speed up data collection, we use
some pre-defined delimiters to identify the data characteristics. Each
delimiter performs one generation function. The raw data are typed
in compact form to save storage, then sorted into a readable form.
Examples of raw data and delimiter functions are shown in Figures 3.8,
3.9, and 3.10. The simplified knowledge base generation scheme is
illustrated in Figure 3.11.
3.5 Data Structure for the APRIKS System
The data structure for the APRIKS system is composed of
hierarchical node index table (NITA), entity dictionary (ED), attribute
dictionary (AD), value dictionary (VD), unofficial entity dictionary
(UED), unofficial attribute dictionary (UAD), entity-attribute table
(EAT), information file (INF), name dictionary (ND), hash table (HT),
and text file. The heart of the APRIKS system is the NITA which
represents an agricultural knowledge sketch base in an associative tree
structure. Information flow in the APRIKS system is summarized in
Figure 3.12, and the data formats for various elements in the data
structure are shown in Figure 3.13.
The hierarchy consists of eight generation codes, with one byte
being used for each generation code. Thus, the eight-level tree is used
to represent 256 subnodes for each node. ED and AD are employed to
interpret entity names and attribute names, respectively. The synonym
dictionaries, UED and UAD, are created for interpreting the
corresponding master terms. For instance, SL and VBC are the synonyms
for soybean looper and velvetbean caterpillar, respectively.

[Compact-form raw data typed in batch mode; not legible in this reproduction. Figure 3.10 shows the same data after sorting.]
Figure 3.8 Raw Data for Batch Mode

(1) $   Category Code * Entity Name
(2) =   Synonym Name of the Entity
(3) @   Number of Record, Length of Record, Picture File Name
(4) +   Attribute [: Value]
(5) <   Synonym Name of the Attribute
(6) %   Detailed Information
        \ \   for Underline
        @ @   for Negative Image
        ^ ^   for Blinking
(7) //  (End of Raw Data File)
Where [ ] is optional and ^ ^ marks an enclosed rectangular box.
Figure 3.9 Delimiters for Raw Data

$1*SOYBEAN
$2*TOMATO
$3*TURF
$1.1*INSECT
=PEST
=PESTS
$1.2*DISEASE
$1.3*NEMATODE
$1.4*WEED
$1.5*VARIETY
$1.1.1*SOYBEAN LOOPER
=SBL
=PSEUDOPLUSIA INCLUDENS
+INJURY TO CROP : FOLIAGE
<DAMAGE TO CROP
<FOLIAGE FEEDER
<LEAF FEEDER
%^ 1. LARVAE ARE HEAVY FOLIAGE FEEDERS.
   2. LARVAE FEED ON UNDERSIDES OF LEAVES.
   3. LARVAE FEED IN DEEP CANOPY OF PLANT.
^  4. LARVAE FEED ON OLD LEAVES.
+THRESHOLD OF INJURY
<ECONOMIC THRESHOLD
<ECONOMIC INJURY LEVEL
%^ 1. 33% DEFOLIATION PRIOR TO BLOOM.
   2. NO MORE THAN AN ADDITIONAL \10%\ AFTER BLOOM.
   3. 10 WORMS 1/2" OR LONGER PER LINEAR FOOT OF
      ROW BEFORE BLOOM.
   4. 4 WORMS 1/2" OR LONGER PER LINEAR FOOT OF
^     ROW AFTER BLOOM.
$1.1.1.1*CONTROL RECOMMENDATION
=PEST TREATMENT
=TREATMENT PLAN
=PEST CONTROL
=CONTROL
=MANAGEMENT
$1.1.1.1.1*NON-CHEMICAL CONTROL
+BIOLOGICAL CONTROL
%^
PROMOTE BENEFICIAL ARTHROPOD POPULATIONS.
^
+CULTURAL CONTROL
<AGRICULTURAL TREATMENT
<AGRICULTURAL CONTROL
%^ 1. PLANT BEANS AS EARLY IN SEASON AS POSSIBLE.
^  2. WINTER PLOWING.
$1.1.1.1.2*CHEMICAL CONTROL
=CHEMICAL APPLICATION
=PESTICIDE APPLICATION
=INSECTICIDE APPLICATION
=CHEMICAL TREATMENT
=CHEMICAL SPRAYING
$1.1.1.1.2.1*BACILLUS THURINGIENSIS
$1.1.1.1.2.1.1*DIPEL
Figure 3.10 Raw Data After Sorting
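A minimal sketch of how such delimited records might be read in batch mode; only the $, *, =, and + delimiters are handled here, and the function and field names are assumptions, not the actual APRIKS reader:

```python
# Minimal sketch of a batch-mode reader for the delimiter format of
# Figures 3.8-3.10: $ starts a record with a category code, * introduces
# the entity name, = an entity synonym, + an attribute [: value].

def parse_records(lines):
    records, current = [], None
    for line in lines:
        if line.startswith("$"):                  # new entity record
            code, _, name = line[1:].partition("*")
            current = {"code": code, "name": name,
                       "synonyms": [], "attributes": []}
            records.append(current)
        elif line.startswith("=") and current:    # synonym of the entity
            current["synonyms"].append(line[1:])
        elif line.startswith("+") and current:    # attribute [: value]
            current["attributes"].append(line[1:])
    return records

sample = ["$1*SOYBEAN", "$1.1*INSECT", "=PEST",
          "$1.1.1*SOYBEAN LOOPER", "=SBL", "+INJURY TO CROP : FOLIAGE"]
recs = parse_records(sample)
```

Each delimiter maps directly to one generation function, which is why raw data can be typed compactly and sorted later.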

Figure 3.11 The Simplified Knowledge
Base Generation Scheme

Figure 3.12 Information Flow in APRIKS System

[Record formats for the NITA, ED, AD, VD, UED, UAD, EAT, HT, ND, and INF elements, including the eight-level category code, dictionary pointers, and collision pointers; the detailed field layout is not legible in this reproduction.]
Figure 3.13 APRIKS Data Structure

To represent m:n entity-attribute relation, a simple method is to
create a two-dimensional array which stores the relations between entity
occurrence and attribute occurrence. Figure 3.14 illustrates a typical
example. However, the limited number of attributes associated with a
particular entity makes the two-dimensional array very sparse and
storage inefficient. In our design we create an EAT file to represent
entity-attribute relation in a compact form. The interrelations among
the ED, AD, and EAT records are illustrated in Figure 3.15. By using
the EAT table, we can list the attributes of an entity, or seek all of
the entities which have a specific attribute. The ND contains the ASCII
character strings which represent the names of entities or attributes in
the knowledge base. Access to the ND is through the block pointer and
the record pointer in dictionary records. The format of the ND is
variable-length record as shown in Figure 3.13.
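The compact EAT idea can be sketched as chained records. The class and field names below are illustrative assumptions, not the actual record layout of Figure 3.13:

```python
# Hedged sketch of the EAT idea: instead of a sparse entity x attribute
# array, keep one EAT record per (entity, attribute) pair, chained both
# by entity and by attribute, so either direction can be enumerated.

class EATRecord:
    def __init__(self, entity, attribute, weight):
        self.entity, self.attribute, self.weight = entity, attribute, weight
        self.next_for_entity = None      # next attribute of the same entity
        self.next_for_attribute = None   # next entity with the same attribute

class EAT:
    def __init__(self):
        self.by_entity, self.by_attribute = {}, {}

    def add(self, entity, attribute, weight=1.0):
        rec = EATRecord(entity, attribute, weight)
        rec.next_for_entity = self.by_entity.get(entity)
        rec.next_for_attribute = self.by_attribute.get(attribute)
        self.by_entity[entity] = rec
        self.by_attribute[attribute] = rec

    def attributes_of(self, entity):
        rec, out = self.by_entity.get(entity), []
        while rec:
            out.append(rec.attribute)
            rec = rec.next_for_entity
        return out

    def entities_with(self, attribute):
        rec, out = self.by_attribute.get(attribute), []
        while rec:
            out.append(rec.entity)
            rec = rec.next_for_attribute
        return out

eat = EAT()
eat.add("soybean looper", "injury to crop")
eat.add("soybean looper", "threshold of injury")
eat.add("velvetbean caterpillar", "injury to crop")
```

Storage grows with the number of actual (entity, attribute) pairs rather than with the full m x n product.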
3.6 Knowledge-Seeking Strategies
The APRIKS system has been designed to perform three types of
operations (interactive retrieval and browsing, decision-making and
consultation, question-answering and diagnostic analysis) in the
agricultural field. To accomplish these operations, we have designed
knowledge-seeking strategies. For the interactive retrieval task, the
system allows the user to access any portion of the knowledge base by
navigating through the hierarchical structure. The system contains
refreshing mechanisms which remove irrelevant information and keep the
displays at a manageable size. A minimum amount of typing is required from

[(a) An Example of Hierarchy; (b) Array Representation of m : n Pattern Vectors. The detailed entries are not legible in this reproduction.]
Figure 3.14 Associated Pattern Vectors in Hierarchy

Figure 3.15 The Inter-relations Among the
ED, EAT and AD Records

the user to avoid misspelling due to fatigue. The system also provides
the following three display symbols to enhance visualization: (1) a
rectangular box for a clear view, (2) underlining for emphasizing important
messages, and (3) blinking as a warning signal.
For the design of the diagnostic task, we drew on our experience from
the design of the TB3 system [55] and the MEDIKS system [35]. We have
simplified the diagnostic strategy and made it more user-friendly. The
system plays the role of an intelligent advisor with tremendous patience
who serves all the user's requests and makes impartial
recommendations. From the knowledge base, the system seeks the key
features and forms a diagnostic scheme to help the user make an accurate
diagnosis and positive identification. The user answers the presorted
problems provided by the diagnostic scheme. The diagnostic pattern
vector is automatically formulated and is sent to the decision-making
and inference module. The system performs pattern matching, ranks the
most probable causes, and displays the causes which exceed the
predetermined threshold. At the user's request, the system is capable
of explaining to the user the reasoning process behind its
recommendations. User-system interaction may continue until a
satisfactory conclusion is reached.
The man-machine interface is accomplished by a menu-driven scheme or by
a question-answering format. In the menu-driven mode, each displayed
page is divided into three sections: title, contents, and control
mechanism. The title is the name of the selected entity or
attribute. The content section contains a list of subnode entities,

associated attributes, or detailed knowledge for each category of the
hierarchy. Sequence numbers are automatically assigned to the list
according to the brother sequence in the associative tree structure.
This arrangement makes it easy for the user to select the desired items
by a soft-key approach, and it facilitates growth of the knowledge base
and updating of the contents without any programming change. The control
mechanism provides selection control, which includes (1) selecting the
desired item, (2) back-searching to the previous page, (3) returning to
the root of the hierarchy, and (4) exiting from the current mode. The functional
flowchart for the menu-selection mode (information retrieval and
browsing mode) is illustrated in Figure 3.16.
The decision-making mode is designed on the basis of similarity
measures. The typical measures between two vectors $\mathbf{x}_i = [x_{i1}, x_{i2}, \ldots, x_{in}]^T$ and $\mathbf{x}_j = [x_{j1}, x_{j2}, \ldots, x_{jn}]^T$ may be

$$S_1(\mathbf{x}_i, \mathbf{x}_j) = \sum_{k=1}^{n} x_{ik} x_{jk}$$

or

$$S_2(\mathbf{x}_i, \mathbf{x}_j) = \frac{\sum_{k=1}^{n} x_{ik} x_{jk}}{\sum_{k=1}^{n} x_{ik}^2 + \sum_{k=1}^{n} x_{jk}^2 - \sum_{k=1}^{n} x_{ik} x_{jk}}$$
In the APRIKS system, we consider the hierarchical tree as defining the
clusters. The associated descriptor vector may be viewed as describing
the centroid of a cluster in M-dimensional feature space. The three
similarity computations used in the APRIKS system are $S(E_i, E_j)$, the
similarity between entities $E_i$ and $E_j$; $S(A_i, A_j)$, the
similarity between attributes $A_i$ and $A_j$; and $S(Q, E_j)$, the
similarity between the query vector $Q$ and entity $E_j$.
The decision is made

Figure 3.16 Functional Flowchart of Menu Selection Mode

for the highest similarity measure if it exceeds the preset threshold.
The operation flowchart of decision-making mode is shown in Figure 3.17.
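The two similarity measures and the threshold decision above can be sketched directly; $S_2$ is the Tanimoto-style coefficient as reconstructed from the text, and the entity names are illustrative:

```python
# Sketch of the two similarity measures: s1 is the inner product, s2 the
# normalized (Tanimoto-style) coefficient, and decide() picks the entity
# with the highest similarity provided it exceeds a preset threshold.

def s1(x, y):
    return sum(a * b for a, b in zip(x, y))

def s2(x, y):
    dot = s1(x, y)
    return dot / (s1(x, x) + s1(y, y) - dot)

def decide(query, entities, threshold=0.5):
    """Return the best-matching entity name, or None below threshold."""
    best = max(entities, key=lambda name: s2(query, entities[name]))
    return best if s2(query, entities[best]) > threshold else None

# Illustrative descriptor vectors (centroids) for two entity clusters.
entities = {"soybean looper": [1, 1, 0], "green stink bug": [0, 0, 1]}
answer = decide([1, 1, 0], entities)
```

Note that $S_2$ is bounded by 1, reached only when the two vectors are identical, which makes a fixed decision threshold meaningful.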
The question-answering mode provides a friendly input/output
interface. In the APRIKS system, the user's query can be semantically
categorized into three types:
(a) Cases which require a Yes/No type answer.
(b) Specific questions which require a what/which/who/when/where... type
of answer.
(c) Procedures or sequences of actions which require a how/why type of
answer.
Although numerous techniques for processing natural languages have been
proposed in the literature, the problem of natural language
understanding by machine still remains unsolved in the practical
sense. In the APRIKS system we have introduced methods to handle simple
natural language sentences with a structural format common to most users.
In the question-answering mode, the APRIKS system is designed to
analyze a simple sentence and to identify its five contextual
components.
{L} = {semantically logical operators}
    = {not, or, and}
{I} = {identifiers}
    = {when, how, why, where, what}
{CI} = {contextual information}
     = {father-son relationships in the hierarchy}
{C} = {characteristics}

Figure 3.17 Operation Flowchart of Decision-Making Mode

    = {entity-attribute relationships}
{K} = {all keywords, including entities and attributes}
The identifiers are converted into the "what" canonical form,
{identifier} -> {"what" canonical form}
    when  -> what + time
    how   -> what + procedure
    why   -> what + reason
    where -> what + place
A question is segmented into five contextual components:
{question} = {L} + {I} + {K} + {CI} + {C}
Consider the question "How to control soybean looper and velvetbean
caterpillar by Dipel?" The system performs contextual segmentation and
determines the following components.
{L} = {and}
{I} = {how} = what + {procedure}
{K} = {control, soybean looper, velvetbean caterpillar, Dipel}
{CI} = {(control, soybean looper), (control, velvetbean
       caterpillar), (Dipel, control)}
{C} = {(Dipel, treatment procedure)}
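The segmentation of operators, identifiers, and keywords can be sketched as below; the keyword list and the canonical-form mapping are assumptions for illustration, not the APRIKS lexicon:

```python
# Illustrative sketch of contextual segmentation: split a question into
# logical operators {L}, identifiers {I} mapped to the "what" canonical
# form, and keywords {K} matched against a known keyword list.

CANONICAL = {"when": "time", "how": "procedure",
             "why": "reason", "where": "place"}
OPERATORS = {"not", "or", "and"}
KEYWORDS = {"control", "soybean looper", "velvetbean caterpillar", "dipel"}

def segment(question):
    text = question.lower().rstrip("?").strip()
    words = text.split()
    return {
        "L": sorted(w for w in words if w in OPERATORS),
        "I": sorted("what + " + CANONICAL[w] for w in words if w in CANONICAL),
        "K": sorted(kw for kw in KEYWORDS if kw in text),
    }

parts = segment(
    "How to control soybean looper and velvetbean caterpillar by Dipel?")
```

Multi-word keywords such as "soybean looper" are matched as substrings of the whole question rather than as single tokens.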
The associative tree is a "what" type structure, and the production
rule is a "how" type structure. By embedding the production rule into
the hierarchy, the APRIKS system answers questions in "how," "why,"
"when," and "where" formats as well as the "what" format. In fact, our design of the
APRIKS system combines the concepts of pattern-directed approach and
rule-based approach. A simplified functional flowchart for the
question-answering mode is summarized in Figure 3.18.

Figure 3.18 The Operation Flowchart of Question-Answering Mode

3.7 Experimental Results
The APRIKS system software has been designed and implemented on a
PDP-11/40 minicomputer under the RSX-11M operating system. The system
performs three modes of operation. The current system is designed to
respond to input format types of either menu selection or simple natural
language input. Computer printouts are given in Figures 3.19-3.21 to
demonstrate the various modes of operation of this knowledge-based
expert system.
3.8 Conclusion
In this chapter we have presented the design of a knowledge-based
expert system for applications in agriculture. The fundamental
components of the APRIKS system are a knowledge base and a
recognition/inference mechanism.
To facilitate the design for knowledge base generation and
updating, our approach is to divide the knowledge base into a knowledge
sketch and knowledge details. A recognition/inference mechanism is used
to perform knowledge seeking and utilization. On the basis of a set of
observations provided by the users, the system can determine the plant
diseases, the damaging insects, or the planting instructions. It can
recommend plans for treatment and pest control and can provide useful
information such as life history, injury to crop, and injury
threshold. The APRIKS system is able to respond to menu-selection type
of inputs as well as simple natural language inputs. The design
concepts for the APRIKS system are not limited to agricultural
applications. Based upon the same principles, we are developing

INSECT
1 SOYBEAN LOOPER
2 GREEN CLOVERWORM
3 VELVETBEAN CATERPILLAR
4 CORN EARWORM
5 FALL ARMY WORM
6 BEET ARMY WORM
7 LESSER CORNSTALK BORER
8 GREEN STINK BUG
9 MEXICAN BEAN BEETLE
10 BEAN LEAF BEETLE
11 THREE-CORNERED ALFALFA HOPPER
12 THE SOUTHERN RED-LEGGED GRASSHOPPER
13 THE LARGE AMERICAN GRASSHOPPER
TYPE ...
A          RETURN TO PREVIOUS INDEX
B          RETURN TO BEGINNING
E          EXIT
INDEX NO.  GET INFORMATION
ENTER YOUR SELECTION : 3
VELVETBEAN CATERPILLAR
1 CONTROL RECOMMENDATION
2 LIFE HISTORY
3 INJURY TO CROP
4 THRESHOLD OF INJURY
TYPE ...
A          RETURN TO PREVIOUS INDEX
B          RETURN TO BEGINNING
E          EXIT
INDEX NO.  GET INFORMATION
ENTER YOUR SELECTION : 4
THRESHOLD OF INJURY
1. 33% DEFOLIATION PRIOR TO BLOOM.
2. NO MORE THAN AN ADDITIONAL 10% AFTER BLOOM.
3. 10 WORMS 1/2" OR LONGER PER LINEAR FOOT OF
   ROW BEFORE BLOOM.
4. 4 WORMS 1/2" OR LONGER PER LINEAR FOOT OF
   ROW AFTER BLOOM.
Figure 3.19 An Example of Menu Selection Mode

INSECT
WHAT KIND OF INSECT ?
1 WORM, LOOPER OR CATERPILLAR
2 BUG
3 BEETLE
4 HOPPER
ENTER YOUR SELECTION : 1
ANSWER THE FOLLOWING QUESTION (TYPE IF YOU CAN'T ANSWER)
HOW MANY ABDOMINAL PROLEGS ?
1  2 PAIRS
2  3 PAIRS
3  4 PAIRS
ENTER YOUR SELECTION : 1
WHAT IS THE BODY COLOR ?
1  GREEN
2  GRAY
3  BROWN
4  PINK
5  BLACK
6  YELLOW
ENTER YOUR SELECTION : 1
THE SUGGESTED INSECTS ARE RANKED
1 SOYBEAN LOOPER
2 GREEN CLOVERWORM
3 VELVETBEAN CATERPILLAR
4 CORN EARWORM
5 BEET ARMY WORM
6 LESSER CORNSTALK BORER
7 THREE-CORNERED ALFALFA HOPPER
Figure 3.20 An Example of Decision-Making Mode

HOW TO CONTROL VBC BY USING BACTUR ?
* AIR SPRAY
1. MINIMUM OF 3 GALLONS SPRAY PER ACRE.
2. USE EARLY IN SEASON.
ANY OTHER QUESTION ? (Y/N) Y
ENTER YOUR QUESTION (NO MORE THAN ONE LINE)
SOYBEAN INSECT ?
INSECT
1 SOYBEAN LOOPER
2 GREEN CLOVERWORM
3 VELVETBEAN CATERPILLAR
4 CORN EARWORM
5 FALL ARMY WORM
6 BEET ARMY WORM
7 LESSER CORNSTALK BORER
8 GREEN STINK BUG
9 MEXICAN BEAN BEETLE
10 BEAN LEAF BEETLE
11 THREE-CORNERED ALFALFA HOPPER
12 THE SOUTHERN RED-LEGGED GRASSHOPPER
13 THE LARGE AMERICAN GRASSHOPPER
Figure 3.21 An Example of Question-Answering Mode

knowledge-based expert systems for production automation and computer-
integrated manufacturing, for performing self-diagnosis and self-
maintenance in industrial environments, and for the design of
intelligent robots.

CHAPTER 4
FALSIFIED DOCUMENT DETECTION AND FONT IDENTIFICATION
4.1 Introduction
Today, a high percentage of white-collar crimes involves the
falsification of typewritten documents. Falsified document
detection is currently performed by expert document examiners using
manual techniques. However, the number of different type-fonts
currently in existence is over two thousand. Keeping track of these
fonts represents a nearly impossible data-processing task when manual
techniques are used. The popularity of certain type-fonts has caused
other manufacturers to produce look-alike type-fonts. These fonts are,
to the human eye, indistinguishable from the original manufacturer's
product. Furthermore, the advent of interchangeable type elements, such
as on the ubiquitous IBM Selectric, has caused additional
complications. A single type element may be interchanged between
different typewriters and even between typewriters from different
manufacturers. As a result, the falsified document detection problem is
extremely difficult to solve if we do not make use of modern computer
technology (e.g., expert systems), image processing techniques, and
pattern recognition theory.
The main objective of this chapter is to develop a knowledge-based
system for detecting falsification in a document, identifying type-fonts

and typewriter manufacturers, and providing positive evidence of
falsification.
To detect falsification in a document, the system should be capable
of
(a) differentiating typewritten documents prepared by different
typewriters with the same single element ball,
(b) differentiating typewritten documents produced by different
single element balls of the same manufacturer and style placed
on the same typewriter, and
(c) differentiating typewritten documents prepared by the same
typewriter and the same element but at a different time.
To identify type-fonts and typewriter manufacturers, the system should
be able to
(a) determine the similarities and differences between two type-
fonts,
(b) recognize type-font manufacturers of the same and similar
typestyle, and
(c) determine the characteristics of typewriters made by different
manufacturers.
To provide positive evidence of falsification in a document, the system
should be able to make quantitative and minute comparisons and
measurements of individual typewritten characters and their
interrelationships.
In the following sections, we draw on the existing image
processing techniques of Chapter 2 and develop or modify them to suit
the above system requirements. Finally, a knowledge-based system is
created to conduct and link all the functions using expert knowledge.
4.2 Noise Background
The system involves minute feature comparison; thus, noise can
be a very important factor in system reliability and accuracy. From
our study, noise occurring in the system may be classified into
typewriter noise, mechanical scanner noise, electronic scanner noise,
and quantization noise. Typewriter noise may be further subdivided into
noise due to ribbons, paper, strike strength, and reproduction. Ribbon
noise is caused by the type of ribbon (carbon or fabric), the weave, the
age, and the ink saturation. Paper noise is caused by texture, ink
absorption, and reflectivity. Strike strength noise is caused by the
different force typists use on the keys on a manual typewriter, or the
different force settings used on an electric typewriter. Reproductive
noise includes noise introduced by carbons or photostatic copying.
Mechanical scanner noise consists of alignment noise and optical path
irregularities. Finally, electronic scanner noise and quantization
noise are standard problems in all image processing applications.
Scanner noise and quantization noise may largely be engineered to
an acceptable level, which depends upon the design of the scanning
device. Typewriter noise is inherent in the type samples and has a high
ambient level. Since the system involves microscopic comparisons, noise
reduction or removal techniques are employed.

4.3 Design Methodology
To meet the above objectives and goals, we propose a system
architecture as shown in Figure 4.1, which consists of the subsystems of
document data acquisition, typewritten character recognition,
falsification detection, type-font/typewriter identification and type-
font/typewriter database.
4.3.1 Document Data Acquisition
The document data acquisition subsystem reads the document page,
removes the noise, converts characters into binary images, and isolates
the characters for subsequent character recognition. Major operations
in this subsystem are briefly discussed as follows (see Figure 4.2).
4.3.1.1 Scanning, Reflection and Windowing
Scanning the document is the first step in this system. We use an
Optronics Photoscan, which possesses a resolution of 100 μm in the
reflection mode, to generate a digitized picture. A typical lowercase
character of 1.8 mm x 1.8 mm is represented by an 18 pixel by 18 pixel
digitized image. A typical uppercase character is represented by a
30 pixel by 22 pixel digitized image.
Scanning creates a mirror image when it scans the printed
pictures. We employ a "REFLECT" algorithm to eliminate the mirror
effect and a "WINDOW" operation tailors the portion of a document for
examination.
4.3.1.2 Image Preprocessing
The goal of preprocessing is to convert an input gray-level picture
into a noise-free binary image. Preprocessing includes smoothing,

Figure 4.1 System Architecture for Automatic Typewriter Identification

Figure 4.2 The Data Acquisition Subsystem

histogram analysis, threshold determination, binary picture generation,
spur removal, and gap filling. Smoothing removes the background
noise due to rough reflection from the paper. Several effective noise
smoothing methods were discussed in Chapter 2. We employed a
median filter with a 3x3 window for smoothing because it suppresses
noise with little edge blur. The smoothed picture is shown in Figure 4.3.
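The 3x3 median smoothing step can be sketched as follows; this is a minimal NumPy illustration of the operation, not the system's implementation:

```python
import numpy as np

# Sketch of 3x3 median smoothing: each interior pixel is replaced by the
# median of its 3x3 neighborhood, which suppresses impulse noise while
# blurring edges less than mean filtering would.

def median3x3(img):
    img = np.asarray(img, dtype=float)
    out = img.copy()
    for i in range(1, img.shape[0] - 1):
        for j in range(1, img.shape[1] - 1):
            out[i, j] = np.median(img[i - 1:i + 2, j - 1:j + 2])
    return out

noisy = np.zeros((5, 5))
noisy[2, 2] = 255.0            # an isolated bright speck
smooth = median3x3(noisy)      # the speck's median neighborhood is zero
```

An isolated speck disappears because eight of the nine values in its neighborhood are background, so the median follows the background.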
4.3.1.3 Adaptive Threshold Determination and Binary Picture Generation
To generate a binary picture for the input document, we have
developed an adaptive thresholding method based upon the pattern
recognition concept of maximizing the interset variance [29]. The
histogram of the gray-level image can be considered as a set of
clusters, so threshold selection is equivalent to cluster separation.
By observation, the document shows dark objects (i.e., characters) on a
light background (i.e., paper), and the embedded noise lies within the
bright region. Threshold selection therefore simplifies to a two-cluster
separation: one cluster is the characters and the other is the
background noise. The gray-level histogram is normalized and treated
as a probability distribution as shown in Figure 4.4. Assume there are
$L$ quantized gray levels ($L = 256$ in the system). The probability of
gray level $i$ is $P_i$, where $i = 1, 2, \ldots, L$.
Let $\theta$ be the desired threshold. Pixels with intensity less than or
equal to $\theta$ are considered as belonging to the background class $C_0$.
Pixels with intensity greater than $\theta$ are treated as belonging to the
object class $C_1$. The probabilities for these two classes are $P_0$ and $P_1$,
respectively, which are functions of $\theta$, given by

(a) Before Smoothing
(b) After Smoothing
Figure 4.3
Smoothing Operation

Figure 4.4 Probability Density Function Derived from
Histogram of Gray-level Picture

$$P_0 = \Pr(C_0) = \sum_{i=1}^{\theta} P_i, \qquad P_1 = \Pr(C_1) = \sum_{i=\theta+1}^{L} P_i$$

Let the mean of the gray levels of the picture be

$$\mu = \sum_{i=1}^{L} i P_i$$

The mean of the gray levels of the background is

$$\mu_0 = \Big[\sum_{i=1}^{\theta} i P_i\Big] \big/ P_0$$

The mean of the gray levels of the object is

$$\mu_1 = \Big[\sum_{i=\theta+1}^{L} i P_i\Big] \big/ P_1$$

The threshold is optimal when the separation between $\mu_0$ and $\mu_1$ is
maximum. We use the interset distance between class $C_0$ and class $C_1$ as
a criterion for optimal thresholding. This distance function is
measured in terms of the interclass variance given by

$$V(\theta) = P_0[\mu_0 - \mu]^2 + P_1[\mu_1 - \mu]^2$$

The threshold $\hat{\theta}$ is optimal if $V(\theta)$ is maximized:

$$\hat{\theta} = \theta \ \text{ for } \ \max\{V(\theta)\}$$

Once the optimal threshold $\hat{\theta}$ is determined, we generate the binary
picture by assigning a "one" to pixels with intensity greater than $\hat{\theta}$ and
a "zero" to pixels with intensity less than or equal to $\hat{\theta}$. In
mathematical form, if $f(x,y)$ is the intensity at position $(x,y)$ of the
picture, then

$$f(x,y) = \begin{cases} 1 & f(x,y) > \hat{\theta} \\ 0 & f(x,y) \le \hat{\theta} \end{cases}$$

78
However, from the experimental results, we have found that if we shift
a
the theoretical 0 value a little bit toward the background side, we will
A
get a better visual binary picture. The shifted threshold 0^ = 0
and C,p = 0.9.
To show how the optimal thresholding works, we use a windowed
document as shown in Figure 4.5 and its gray-level picture is shown in
Figure 4.6. The histogram of Figure 4.6 is shown in Figure 4.7 and its
binary picture is shown in Figure 4.8 with the optimal threshold
0 = 116.
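The threshold search can be sketched directly from the formulas above; the synthetic two-level image below is an illustrative assumption, not the document data of Figure 4.6:

```python
import numpy as np

# Sketch of the adaptive threshold: choose theta maximizing the
# interclass variance V(theta) = P0*(mu0-mu)^2 + P1*(mu1-mu)^2.

def optimal_threshold(gray, levels=256):
    hist, _ = np.histogram(gray, bins=levels, range=(0, levels))
    p = hist / hist.sum()                 # normalized histogram P_i
    i = np.arange(levels)
    mu = (i * p).sum()                    # global mean
    best_theta, best_v = 0, -1.0
    for theta in range(1, levels - 1):
        p0 = p[:theta + 1].sum()
        p1 = 1.0 - p0
        if p0 == 0 or p1 == 0:            # one class empty: skip
            continue
        mu0 = (i[:theta + 1] * p[:theta + 1]).sum() / p0
        mu1 = (i[theta + 1:] * p[theta + 1:]).sum() / p1
        v = p0 * (mu0 - mu) ** 2 + p1 * (mu1 - mu) ** 2
        if v > best_v:
            best_v, best_theta = v, theta
    return best_theta

# Synthetic picture: 900 light "paper" pixels and 100 dark "ink" pixels.
gray = np.concatenate([np.full(900, 200), np.full(100, 40)])
theta = optimal_threshold(gray)
binary = (gray > theta).astype(int)       # "one" above threshold
```

For well-separated bimodal data, any threshold between the two modes gives the same variance; the search returns the first maximizer.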
4.3.1.4 Binary Filtering (Spur Removal and Gap Filling)
After binary image generation, spurs, gaps, and isolated noise remain
in the binary picture. Determining which pixel is a spur to remove and
which pixel is a gap to fill is usually ambiguous and difficult for a
computer to judge. From our study, we define spurs, gaps, and isolated
noise as follows:
Definition of Spur
A pixel '1' is a spur if the sum of its 4-connected neighbors is
one.
Definition of Isolated Noise
A pixel '1' is isolated noise if the sum of its 4-connected
neighbors is zero.
Definition of Gap
A pixel '0' is a gap if the sum of its 4-connected neighbors is
larger than two.
For the 3x3 window

This paper presents an approach to automatic
detection of document falsification by making use
of pattern recognition and image processing techni-
ques. The proposed system consists of two parts:
document data acquisition and falsification detec-
tion. The document data acquisition subsystem reads
the document under examination, processes input
document data and isolates suspicious characters
for subsequent falsification detection.
Figure 4.5 An Example of Windowed Document

[Gray-level scan of the windowed document; not legible in this reproduction.]
Figure 4.6 Gray-Level Picture of Figure 4.5

[Gray-level histogram (frequency of occurrence vs. gray level); the plot is not legible in this reproduction.]
Figure 4.7 Histogram of Figure 4.6

[Binary picture of the windowed document before noise removal; the pixel rendering is not reproducible here.]
Figure 4.8 Binary Picture (Before Noise Removal)

$$\begin{array}{ccc} & P_{i-1,j} & \\ P_{i,j-1} & P_{i,j} & P_{i,j+1} \\ & P_{i+1,j} & \end{array}$$

the algorithm of the binary filtering is the following.
(1) Calculate the sum of the 4-connected neighbors:

$$S = P_{i-1,j} + P_{i,j+1} + P_{i+1,j} + P_{i,j-1}$$

(2) If $S > 2$, set $P_{i,j} = 1$.
    If $S < 2$, set $P_{i,j} = 0$.
    If $S = 2$, $P_{i,j}$ remains unchanged.
Figure 4.9 shows the results of binary filtering of Figure 4.8.
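The two-step rule above can be sketched as a single pass over the binary image; the small test pattern is an illustrative assumption:

```python
import numpy as np

# Sketch of the binary filter: the sum S of the 4-connected neighbors
# decides each interior pixel (S > 2 -> 1, S < 2 -> 0, S = 2 -> keep),
# which fills gaps and removes spurs and isolated noise in one pass.

def binary_filter(img):
    img = np.asarray(img, dtype=int)
    out = img.copy()
    for i in range(1, img.shape[0] - 1):
        for j in range(1, img.shape[1] - 1):
            s = (img[i - 1, j] + img[i + 1, j] +
                 img[i, j - 1] + img[i, j + 1])
            if s > 2:
                out[i, j] = 1          # gap: fill in
            elif s < 2:
                out[i, j] = 0          # spur or isolated noise: remove
    return out

noise = np.zeros((5, 5), dtype=int)
noise[2, 2] = 1                        # isolated noise pixel (S = 0)
cleaned = binary_filter(noise)

gap = np.zeros((5, 5), dtype=int)
gap[1, 2] = gap[3, 2] = gap[2, 1] = 1  # '0' at center has S = 3
filled = binary_filter(gap)
```

The decisions are read from the original image rather than the output, so the pass is order-independent.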
4.3.1.5 Character Isolation
The isolation procedure is accomplished by assigning a block number
to each isolated character. The whole procedure involves three
operations. One operation is blocking, or pyramid generation, which
isolates the characters by inscribing them with rectangular blocks, and
determines the coordinates of each block. The second operation is
labeling, which assigns each isolated character block a code number by
scanning order. The third operation is resorting, which resorts all the
blocks in text order (i.e. character by character in typing order).
(a) Blocking Algorithm
The essence of the blocking concept is using the top-down pyramid
generation to find the corner coordinates of the circumscribing
rectangular box of each character.
The pyramid generation algorithm is as follows (refer to
Figure 4.10).

[Binary picture of the windowed document after noise removal; the pixel rendering is not reproducible here.]
Figure 4.9 Binary Picture (After Noise Removal)

Figure 4.10 Pyramid Generation Algorithm

Step 1: $j = 2$.
Step 2: Read two consecutive data records ($I_{j-1}$ and $I_j$) from the image file and do steps 3 to 6.
Step 3: Compare two corresponding pixels $I_{j-1}(k)$ and $I_j(k)$, where $k$ runs from 1 to LREC and LREC is the length of the record.
    If $I_j(k) = 1$, then $I_j(k) \leftarrow 1$ and go to step 6.
    If $I_{j-1}(k) = I_j(k) = 0$, then $I_j(k) \leftarrow 0$ and go to step 6.
    If $I_{j-1}(k) = 1$ and $I_j(k) = 0$, then go to step 4.
Step 4: Compare the two previous pixels $I_{j-1}(k-1)$ and $I_j(k-1)$.
    If $I_j(k-1) = 1$, then $I_j(k) \leftarrow 1$ and go to step 6.
    If $I_{j-1}(k-1) = I_j(k-1) = 0$, then go to step 5.
    If $I_{j-1}(k-1) = 1$ and $I_j(k-1) = 0$, then $k-1 \leftarrow (k-1)-1$; if $k-1 \ge 1$, repeat step 4, else go to step 6.
Step 5: Compare the two next pixels $I_{j-1}(k+1)$ and $I_j(k+1)$.
    If $I_j(k+1) = 1$, then $I_j(k) \leftarrow 1$ and go to step 6.
    If $I_{j-1}(k+1) = I_j(k+1) = 0$, then $I_j(k) \leftarrow 0$ and go to step 6.
    If $I_{j-1}(k+1) = 1$ and $I_j(k+1) = 0$, then $k+1 \leftarrow (k+1)+1$; if $k+1 \le$ LREC, repeat step 5, else go to step 6.
Step 6: $j \leftarrow j+1$; if $j \le$ NREC (the end of the file), go to step 2; else end of algorithm.
To illustrate the pyramid generation algorithm, using Figure 4.9
as an example, we get a pyramid-like picture after pyramid
generation (see Figure 4.11).

Figure 4.11 Pyramid Generation

Upon the completion of the pyramid generation, we start to trace
the coordinates of the circumscribing rectangular box of each character
from the pyramid information and assign a code number to each block.
From the properties of the pyramid, we know that (1) the base of the
pyramid possesses the maximum number of elements compared with the other
layers, (2) the top element of the pyramid has support from every layer
(see Figure 4.12). Furthermore, based upon the above two properties, we
find that the top element provides the starting y-coordinate and the
base provides the ending y-coordinate, starting x-coordinate, and ending
x-coordinate. The x- and y-coordinates above correspond to two corner
coordinates of the circumscribing box.
The algorithm of the labeling is as follows.
Step 1 Scan the pyramid picture record by record and find a pixel with
"1".
Step 2 Take that pixel as the top element of the pyramid and record
its y-coordinate (Y^).
Step 3 Use the x-coordinate as the guide and tracing downward until
the base is reached (i.e. the last "1" before "0" is detected).
Step 4 Record the y-coordinate as Y2, and take the x-coordinate of the
leftmost pixel of the base line as X1 and the x-coordinate of
the rightmost pixel of the base line as X2.
Step 5 Store (X1,Y1) and (X2,Y2) in the character coordinate file and
take the record number of the coordinate file as a block
number.

[Figure 4.12 shows a six-layer pyramid plotted against the y-axis; the number of elements per layer grows from 1 at the top (layer 1) through 1, 3, 4, and 6 to the maximum of 7 at the base (layer 6).]
Figure 4.12 The Property of the Pyramid

Step 6 Delete the rectangular area, which has X1 ≤ x ≤ X2 and
Y1 ≤ y ≤ Y2, from the pyramid picture.
Step 7 Repeat steps 1 to 6 until no more "1" pixels exist.
The character coordinate file of Figure 4.9 is shown in the left 5
columns of Figure 4.13. The ordering of the block number shows the
scanning order of the characters.
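The labeling steps above can be sketched in Python. This is a minimal sketch, assuming a list-of-rows 0/1 image representation; the name `label_blocks` is hypothetical, not the author's implementation:

```python
def label_blocks(pyramid):
    """Trace the circumscribing box of each pyramid (labeling steps 1-7).

    `pyramid` is a list of rows of 0/1 pixels, indexed [y][x]; returns a
    list of ((x1, y1), (x2, y2)) corner pairs, one per character block.
    """
    pic = [row[:] for row in pyramid]      # work on a copy of the picture
    boxes = []
    while True:
        top = next(((y, x) for y, row in enumerate(pic)
                    for x, p in enumerate(row) if p), None)
        if top is None:                    # step 7: no "1" pixels remain
            break
        y1, x = top                        # steps 1-2: pyramid top gives Y1
        y2 = y1                            # step 3: trace down to the base
        while y2 + 1 < len(pic) and pic[y2 + 1][x]:
            y2 += 1
        x1 = x2 = x                        # step 4: extremes of the base run
        while x1 > 0 and pic[y2][x1 - 1]:
            x1 -= 1
        while x2 + 1 < len(pic[y2]) and pic[y2][x2 + 1]:
            x2 += 1
        boxes.append(((x1, y1), (x2, y2)))  # step 5: record corner coordinates
        for yy in range(y1, y2 + 1):        # step 6: delete the rectangle
            for xx in range(x1, x2 + 1):
                pic[yy][xx] = 0
    return boxes
```

Each returned pair corresponds to the two corner coordinates stored in the character coordinate file; the list index plays the role of the block number.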
(b) Character Classification
Before discussing the re-sorting algorithm, we classify the
characters into five types according to their height and typing
position.
Type 0: a, c, e, m, n, o, r, s, u, v, w, x, z.
Type 1: b, d, f, h, k, l, t, all uppercase letters, and all numbers.
Type 2: g, p, q, y.
Type 3: i, j, ! (which have two-part characteristics).
Type 4: ', *, + (which have very small dimensions).
The y-dimension of types 0 to 2 is listed in Table 4.1. In the
falsification detection task, Type 3 and Type 4 symbols are ignored,
except for i and j; the i and j symbols appear without the dots on
top. We have two reasons for ignoring Type 3 and Type 4 characters.
First, they reveal very little about document falsification. Second,
due to the preprocessing they become degraded and cannot easily be
differentiated from common noise. Furthermore, after deletion of the
dot as a feature, we classify the characters i and j as Type 0 and
Type 1, respectively.

[Figure 4.13 tabulates the character coordinate file for the 50 character blocks of Figure 4.9; its columns are BLOCK, X1, Y1, X2, Y2, LINE, POS., DIFFY, and NCHAR.]
Figure 4.13 Character Coordinate File

Table 4.1 Types Criterion (IBM Selectric)

Font     h0    h1    h2    Type 0 (h0)   Type 1 (h0+h1)   Type 2 (h0+h2)
Gothic   1.9   0.7   0.7   1.9           2.6              2.6
Elite    1.6   0.8   0.8   1.6           2.4              2.4

(Units : mm)

The single line spacing between two typing lines can be measured by
using the above character type classification (see Table 4.2). The line
spacing information provides a criterion for the re-sorting algorithm.
(c) Re-sorting Algorithm
After the labeling of each character, we have isolated each
character by assigning a distinct code number. However, for a document
examiner, it is not convenient to find the corresponding character by a
code number because the code numbers are arranged in the scanning
order. We re-sort the characters by text order and estimate the
location of each typing line. The re-sorting algorithm is as follows.
Step 1 Pick up the first ten character blocks from the character
coordinate file and find a candidate Type 1 or 2 character
whose y-dimension is the maximum, H = max H_i, i = 1,...,10.
Step 2 Take the threshold θ_L = 0.4*H as the line-spacing separation
criterion to scan the coordinate file. If the spacing
|(Y2)_{i+1} - (Y2)_i| > θ_L, a new typing line is generated.
Step 3 Re-sort the character sequence in a typing line in increasing
order of the X1-coordinate, then assign a line number and
position number to each character.
Step 4 Count the total number of characters in each typing line.
Step 5 Estimate the location of each typing line by averaging the
Y2-coordinates of the characters in that typing line.
Step 6 Refine the typing line estimate by dropping any Type 2
characters whose Y2-coordinates fall a certain distance below
the typing line.

Table 4.2 A Measurement of Single Line Spacing between
Two Character Types (IBM Selectric)

Types    0/0   0/1   0/2   1/0   1/1   1/2   2/0   2/1   2/2
Gothic   2.3   1.6   2.3   2.3   1.6   2.3   1.6   0.9   1.6
Elite    2.6   1.8   2.6   2.6   1.8   2.6   1.8   1.0   1.8

The upper number of the slash denotes the character type in the 1st line.
The lower number of the slash denotes the character type in the 2nd line.
The unit of the spacing difference is mm.

Step 7 For each character, calculate the deviation away from the
estimated typing line.
Step 8 Repeat steps 2 to 7 until no more characters are left in the
coordinate file.
The result of re-sorting the character coordinate file is shown
in the right four columns of Figure 4.13.
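Steps 1 to 3 of the re-sorting algorithm can be sketched as follows; the block representation (x1, y1, x2, y2) and the name `resort_blocks` are assumptions for illustration, and the typing-line estimation of steps 5-8 is omitted:

```python
def resort_blocks(blocks, theta=0.4):
    """Re-sort character blocks into typing lines (re-sorting steps 1-3).

    Each block is (x1, y1, x2, y2) in scanning order. The line-spacing
    threshold is theta * H, where H is the maximum character height among
    the first ten blocks (step 1). Returns (line_no, position, block)
    tuples, lines in Y2 order and characters in X1 order within a line.
    """
    h = max(b[3] - b[1] for b in blocks[:10])    # step 1: candidate height H
    sep = theta * h                              # step 2: threshold 0.4 * H
    ordered = sorted(blocks, key=lambda b: b[3]) # scan in Y2 order
    lines, current = [], [ordered[0]]
    for prev, blk in zip(ordered, ordered[1:]):
        if blk[3] - prev[3] > sep:               # spacing jump: new typing line
            lines.append(current)
            current = []
        current.append(blk)
    lines.append(current)
    result = []
    for ln, line in enumerate(lines, start=1):   # step 3: sort by X1 and
        for pos, blk in enumerate(sorted(line), start=1):  # assign numbers
            result.append((ln, pos, blk))
    return result
```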
4.3.2 Character Recognition and Grouping
The character recognition and grouping task is to recognize the
characters in a document and group them into categories, e.g.,
A,B,C,...,a,b,c,...,1,2,3,...,9,0. The commonly used techniques are
template matching, correlation, Fourier descriptors [26][27], invariant
moments [56][57], and syntactic pattern recognition [58][60]. The
purpose of character recognition is to expedite retrieval of type-font
information from the database and to facilitate pairwise character
comparison in the falsification detection subsystem and in the
identification subsystem. To conduct character recognition, we use two
approaches: (1) the correlation by Fast Fourier Transform (FFT) and
(2) feature pattern matching.
4.3.2.1 Correlation Technique
The correlation approach is shown in Figure 4.14. To use this
approach, we first discuss some mathematical background with a minimum
of complexity. Then we compare the direct-multiplication approach with
the FFT approach. Finally, we list the computing algorithm and results.
A tutorial textbook [61] serves as a reference.

Figure 4.14 Character Recognition Block Diagram
(Correlation Approach)

(a) Two-Dimensional Cross-Correlation
The two-dimensional cross-correlation is defined as
R_fg(x,y) = ∫∫ f(u,v) g(u-x, v-y) du dv    (4.1)
with integration over the whole plane. The Fourier transform of the cross-correlation is
F[R_fg(x,y)] = F(ω1,ω2) G(-ω1,-ω2)    (4.2)
If g(x,y) is real, which is suitable for our case, we have
G(-ω1,-ω2) = G*(ω1,ω2)    (4.3)
Therefore,
F[R_fg(x,y)] = F(ω1,ω2) G*(ω1,ω2)    (4.4)
Taking the inverse Fourier transform of Equation (4.4),
R_fg(x,y) = F^{-1}[F(ω1,ω2) G*(ω1,ω2)]    (4.5)
(b) Properties of Cross-Correlation
The cross-correlation has the following important property
|R_fg(x,y)| ≤ R_ff(0,0)    (4.6)
when f(x,y) and g(x,y) are normalized in the sense that
∫∫ |f(x,y)|² dx dy = ∫∫ |g(x,y)|² dx dy = 1    (4.7)
This property can be proved easily using the Schwartz inequality. The
equality in Equation (4.6) holds only when x = y = 0 and g(x,y) = f(x,y) for
all x and y.
This property of cross-correlation constitutes the theoretical
support for our correlation technique for pattern recognition. Let us
suppose f(x,y) is a fixed function (or pattern in our case) and g(x,y)
is chosen from a set of sample patterns. When the correlation is
maximum, we can conclude from the inequality of Equation (4.6) that the
two functions or patterns f(x,y) and g(x,y) resemble each other.

(c) Comparison Between Direct-Multiplication Approach and FFT
Approach
The correlation can be calculated with direct multiplication. For
each displacement, a pixel-by-pixel multiplication must be performed
over the whole array. Using the FFT approach, the correlations of all
displacements, from (0,0) to (N1,N2), the dimension of the patterns,
are computed at one time.
For 4-by-8 array patterns, this procedure has been tested on a
PDP-11/40 minicomputer; the computing time of one displacement
correlation is about one third of that of the FFT approach, which
computes all 4x8 = 32 displacements. This comparison means that the
FFT approach is at least ten times faster than the direct-multiplication
approach for 4-by-8 arrays.
For arrays of larger dimensions (e.g., 16 by 32 used in our case),
the computing time will be much shorter for the FFT approach compared
with the direct approach, because the ratio of FFT computing time to
direct computation falls in proportion to (log N)/N, where N is
the dimension of the 1-D arrays.
In general, it is not necessary to compute all the displacements.
However, because of the uncertain noise in the scanning process,
there will always be one-pixel errors on each side of the characters.
Therefore, at least nine displacements must be considered in the direct-
multiplication approach, and there is no guarantee that these nine
displacements are enough.

Thus, it becomes obvious that the FFT approach is much superior to the
direct-multiplication approach in both computing-time and reliability
considerations.
(d) Computing Algorithm for FFT Approach
In this subsection, we briefly list the correlation-computing
algorithm of the FFT approach, which is primarily based on Equations (4.5),
(4.6), and (4.7).
Step 1 Input f(n1,n2) and g(n1,n2) with dimension (N1,N2).
Step 2 Normalize f(n1,n2) and g(n1,n2), i.e.,
f̂(n1,n2) = f(n1,n2) / [Σ_{u=1}^{N1} Σ_{v=1}^{N2} |f(u,v)|²]^{1/2}
ĝ(n1,n2) = g(n1,n2) / [Σ_{u=1}^{N1} Σ_{v=1}^{N2} |g(u,v)|²]^{1/2}
Step 3 Find the 2-D Fourier transforms F(k1,k2) and G(k1,k2) of the
normalized arrays f̂(n1,n2) and ĝ(n1,n2), respectively.
Step 4 H(k1,k2) = F(k1,k2) G*(k1,k2) for all k1 and k2.
Step 5 Find the 2-D inverse Fourier transform h(n1,n2) of H(k1,k2).
Step 6 Find the maximum of the array h(n1,n2).
Figure 4.15 shows the block diagram of the above algorithm.
At this point, we need to determine the threshold θ. Because of
the scanning noise, the cross-correlation will not reach the ideal value
of unity. If the threshold is set too low, errors might occur; if set
too high, there may be no samples reaching this high degree of
resemblance. The threshold is determined through experiment.
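Steps 1-6 can be sketched with NumPy's FFT routines. The function name `max_correlation` and the use of cyclic (circular) displacements are assumptions of this sketch, not the author's implementation:

```python
import numpy as np

def max_correlation(f, g):
    """Normalized cross-correlation via FFT (steps 1-6 of the algorithm).

    Returns the maximum of R_fg over all cyclic displacements. For
    normalized patterns the maximum is 1.0 exactly when g matches f
    (Equation 4.6).
    """
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    f = f / np.sqrt((f ** 2).sum())           # step 2: normalize (Eq. 4.7)
    g = g / np.sqrt((g ** 2).sum())
    F = np.fft.fft2(f)                        # step 3: 2-D transforms
    G = np.fft.fft2(g)
    H = F * np.conj(G)                        # step 4: H = F * G-conjugate
    h = np.fft.ifft2(H).real                  # step 5: inverse transform
    return h.max()                            # step 6: peak correlation
```

Because every displacement comes out of a single inverse transform, the nine-displacement limitation of the direct approach does not arise.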

Figure 4.15 Block Diagram of Calculating
Correlation Using FFT Approach
Figure 4.16 shows the computed cross-correlations for the
first line of the document shown in Figure 4.6. With a threshold
setting of θ = 0.9, we obtain an accurate reprint of the document (see
Figure 4.17).
4.3.2.2 Feature Pattern Matching
Feature pattern matching [62] is based on the character
profile. A character profile is generated by projecting a character on
the two main axes, giving what we call the x-profile and the y-profile. The
advantages of using profiles are (1) the profiles of the same character
are similar and independent of type-fonts, and (2) profile formulation
involves simple binary addition only, which is very fast and
efficient. By considering the x- and y-profiles as binary histograms,
and by dividing each histogram into three sections at the top (or left),
in the middle, and at the bottom (or right) portion of the character, we
can study the histogram shapes further. To describe the histogram
shape quantitatively, we estimate three levels: (1) no
maxima, (2) minor maxima (i.e., 0.5M* < M < 0.85M*, where M* is the global
maximum), and (3) major maxima (i.e., M > 0.85M*). We encode the three
levels by 0 for no maxima, 1 for minor maxima, and 2 for major maxima.
Using the above encoding, we can represent the x- and y-profiles by
pattern vectors V = [v1, v2, v3, v4, v5, v6]^T, where v1 to v3 represent the
code of the x-profile at the left, middle, and right portions, and v4
to v6 represent the code of the y-profile at the top, middle, and
bottom portions. Figures 4.18(a) and (b) illustrate the profile pattern
vectors of the characters 'p' and 'e', respectively.
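The profile encoding can be sketched as follows. The name `profile_vector`, the equal-thirds section boundaries, and the list-of-rows representation are assumptions of this sketch; the gap features v7 and v8 are omitted:

```python
def profile_vector(char):
    """Profile pattern vector sketch (v1..v6 only).

    `char` is a binary character as a list of rows. The x-profile is the
    column sums and the y-profile the row sums; each profile is split into
    three sections and coded 2 (major maximum, M >= 0.85*M_star),
    1 (minor maximum, 0.5*M_star <= M < 0.85*M_star), or 0 (no maximum),
    where M_star is the global maximum of that profile.
    """
    x_prof = [sum(col) for col in zip(*char)]
    y_prof = [sum(row) for row in char]

    def encode(profile):
        m_star = max(profile)
        third = len(profile) / 3.0
        codes = []
        for s in range(3):                  # left/top, middle, right/bottom
            lo, hi = int(s * third), int((s + 1) * third)
            m = max(profile[lo:hi] or [0])  # section maximum
            if m >= 0.85 * m_star:
                codes.append(2)             # major maximum
            elif m >= 0.5 * m_star:
                codes.append(1)             # minor maximum
            else:
                codes.append(0)             # no maximum
        return codes

    return encode(x_prof) + encode(y_prof)
```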

Figure 4.16 Computing Results of Correlations for the First Line of the Document

[Figure 4.17 reprints the content of the document recognized by the correlation technique.]
Figure 4.17 Reprint of Document

[Figure 4.18(a) plots the x- and y-profiles of the 5th character (block 5, line 1, position 5); its pattern vector is 2 2 0 3 0 1.]
(a) Character "p"
Figure 4.18 Profile Pattern Vector of Character

[Figure 4.18(b) plots the x- and y-profiles of the 16th character (block 16, line 2, position 2).]
(b) Character "e"
Figure 4.18 (continued)

However, due to the preprocessing and noise effects, some pixels are
missing and some edges are blurred. These phenomena degrade the
profiles and make some characters indistinguishable. To solve this
problem, we use the gap characteristic as an additional feature. We encode
the gap features of a character by the location of the gaps, as shown in
Table 4.3. The number of elements of the pattern vector is thus
increased to 8 (i.e., V = [v1, v2, v3, v4, v5, v6, v7, v8]^T). The algorithm of
the feature pattern matching is as follows.
Step 1 Determine the character type for each character.
Step 2 Use the character type to reduce the search range in the
character file.
Step 3 Perform the pattern vector matching to find a candidate
character.
Using the above algorithm to test four different type-fonts (Letter
Gothic, Olympia Gothic, Elite, and Prestige Elite), 70-85%
of the characters are recognized correctly.
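The three matching steps can be sketched as follows. The `reference_file` structure, the position-count vector distance, and all names are assumptions for illustration, not the author's implementation:

```python
def match_character(vector, ctype, reference_file):
    """Feature pattern matching sketch (steps 1-3).

    `reference_file` maps a character type to a list of
    (character, pattern_vector) references (step 2 restricts the search
    to one type); the candidate is the reference whose 8-element vector
    differs from `vector` in the fewest positions (step 3).
    """
    candidates = reference_file.get(ctype, [])     # step 2: restrict search
    best, best_dist = None, None
    for ch, ref in candidates:                     # step 3: vector matching
        dist = sum(a != b for a, b in zip(vector, ref))
        if best_dist is None or dist < best_dist:
            best, best_dist = ch, dist
    return best
```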
The grouping procedure is to classify the isolated characters into
alphabet or numeral categories. After character recognition, a
character-grouped file is generated. The flowchart of the character
recognition and grouping subsystem is shown in Figure 4.19. The
character-grouped file of Figure 4.9 is shown in Figure 4.20.
4.3.3 Falsification Detection
The falsification detection subsystem performs a sequence of
operations including size measurement, spacing measurement, alignment
analysis, shape template matching and gray-level intensity analysis.

Table 4.3 The Encoding of the Gap Feature

Gap Location                  Coded Information (v7 v8)   Example
left                          1 0                         a
right                         2 0                         e
left and right                3 0                         s
top                           0 1                         u
bottom                        0 2                         n
top and bottom                0 3                         H
left, right, top and bottom   3 3                         X
no gaps                       0 0                         O

Figure 4.19 Flowchart of Character Recognition
and Grouping

[Figure 4.20 lists the characters of Figure 4.9 grouped by category, e.g., runs of T, h, s, a, e, p, r, d, f, t, c, o, n, u, g, m, and b.]
Figure 4.20 Character-Grouped File
The suspicious characters are considered falsified if size
discrepancies, spacing deviations, misalignments, shape mismatching, or
intensity variations are detected from the characters. The functional
block diagram of falsification detection is shown in Figure 4.21. The
falsification detection subsystem first makes the alignment test on a
suspicious document. If the alignment test fails, the document is
regarded as falsified. If the alignment test is passed, the
suspicious document is subjected to the size test character by
character, group by group. If all characters pass the size test, they
take the shape matching test. If this test is also passed, they
take an intensity change test. If a suspicious document fails any of
the above tests, it is considered a falsified document.
4.3.3.1 Alignment Analysis
It may be assumed that characters in a normal document are well
aligned. Due to re-insertion of paper into a typewriter, some
characters in a falsified document may be out of alignment.
Discovery of misalignment signifies document falsification. In
alignment analysis, we detect jump misalignment and line spacing
alignment. The functional diagram of the alignment analysis is shown in
Figure 4.22.
From the character coordinate file, we get the information about
the deviation (dy) of each character off the typing line. From this
deviation, we can detect the following information.
(1) down-incline case if the deviations are monotonically decreasing
from left to right (i.e., dy1 > dy2 > ... > dyn),

Figure 4.21 Functional Block Diagram of Falsification Detection

Figure 4.22 The Functional Diagram of Alignment Analysis
(2) up-incline case if the deviations are monotonically increasing from
left to right (i.e., dy1 < dy2 < ... < dyn),
(3) jump-misalignment if the deviations show a zig-zag change, a
step jump, or a sawtooth-like change.
The threshold θ_J of jump-misalignment is 2 pixels. If |dy_j| > θ_J,
then jump-misalignment of the jth character is detected.
For the different line spacing detection, we calculate the distance
(dV_i) between two consecutive typing lines in one page. If the error of
the line spacing |dV_i - dV_j| is larger than the threshold θ_V, then a line
spacing difference is detected.
For horizontal spacing detection, we calculate the distance (dH_i)
between two consecutive characters in the same typing line. A space is
considered as a character. If the distance difference |dH_i - dH_{i+1}| >
θ_H, then a horizontal spacing difference is detected. In our working
model, we set θ_H = θ_V = 2 pixels. Figure 4.23 shows the result of the
alignment analysis. A document which passes the alignment test may
proceed to shape analysis.
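The three alignment checks can be sketched as follows, using the 2-pixel thresholds of the working model; the function names, and the reading of the spacing test as a comparison of consecutive gaps, are assumptions of this sketch:

```python
THETA_J = 2  # jump-misalignment threshold (pixels)
THETA_V = 2  # line-spacing threshold (pixels)
THETA_H = 2  # horizontal-spacing threshold (pixels)

def jump_misalignments(dy):
    """Return 1-based positions whose deviation off the typing line
    exceeds THETA_J."""
    return [j for j, d in enumerate(dy, start=1) if abs(d) > THETA_J]

def spacing_differences(positions, theta):
    """Flag consecutive spacing differences larger than theta.

    Works for typing-line positions (dV, with theta = THETA_V) and for
    character positions within one line (dH, with theta = THETA_H).
    """
    gaps = [b - a for a, b in zip(positions, positions[1:])]
    return [i for i, (g1, g2) in enumerate(zip(gaps, gaps[1:]), start=1)
            if abs(g1 - g2) > theta]
```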
4.3.3.2 Shape Analysis
The shape analysis has two major functions: size difference
detection and type-font difference detection. If the sizes of two
characters which belong to the same group are different, then their
type-fonts must be different. Thus we examine the size first. If the
characters pass the size measurement test, they continue to the
shape-matching test, performed pairwise.

[Figure 4.23 lists the alignment analysis output: jump misalignments by line and position, and the remark that the examined document exhibits different line spacings.]
Figure 4.23 Result of Alignment Analysis
The size difference detection is very straightforward. We choose
one character from a group of characters as the reference (ΔX_r, ΔY_r),
make size measurements on the rest of the characters, and compare them
with the reference. If |ΔX_i - ΔX_r| > θ_x or |ΔY_i - ΔY_r| > θ_y, then a size
difference is detected. We continue to test the other groups until
no group is left. The functional diagram of size difference detection
is shown in Figure 4.24.
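The size test can be sketched as follows; the name `size_differences`, the (dx, dy) size-pair representation, and the default 2-pixel thresholds are assumptions of this sketch:

```python
def size_differences(group, theta_x=2, theta_y=2):
    """Size difference detection sketch.

    `group` is a list of (dx, dy) bounding-box size pairs for one
    character group; the first entry is the reference. Returns the
    indexes of characters whose width or height differs from the
    reference by more than the thresholds.
    """
    dx_r, dy_r = group[0]                        # reference character
    return [i for i, (dx, dy) in enumerate(group[1:], start=1)
            if abs(dx - dx_r) > theta_x or abs(dy - dy_r) > theta_y]
```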
For type-font difference detection, we use the shape matching
technique. If two characters in the same group are identical with no
blurred edges, all the corresponding pixels will overlap when they are
superimposed. However, in practical situations, typewritten
characters are contaminated by noise due to dust or ribbon defects.
These phenomena can cause the characters to shift relative to each
other by a few pixels. We check for this shift effect and evaluate
whether they are of the same type-font.
In a group, when character 1 is matched with character 2, all
matched pixels are cancelled. The unmatched pixels from both characters
are labeled: 1 for pixels from character 1 and 2 for those from
character 2, as illustrated in Figure 4.25. Let C1 be the total number
of pixels in character 1, C2 the total number of pixels in character
2, N1 the number of unmatched 1's, and N2 the number of unmatched
2's. Defining shape matching indexes S1, from the total unmatched
count N1 + N2, and S2, from max(N1, N2), relative to the character
sizes C1 and C2,

Figure 4.24 Size Difference Detection

[Figure 4.25 superimposes two characters taken from different lines and positions; unmatched pixels from character 1 are printed as 1's and those from character 2 as 2's.]
Figure 4.25 An Example of Shape Matching

we obtain the following cases:
(1) S1 ≤ θ_S1 : the test passes.
(2) S1 > θ_S1 : check the shift effect.
(3) θ_S1 < S1 < θ_S2 and S2 > θ_S3 : a smeared character may exist,
but check whether the reference is smeared; if yes, change it.
(4) otherwise : the test fails.
The shift effect checking moves the two characters relative to
each other in the eight compass directions with a one-pixel offset and
re-computes the shape matching indexes (S1 and S2) to see whether the test
passes. The shift effect checking is illustrated in Figure 4.26. The
functional diagram for shape analysis is shown in Figure 4.27, and the
result is shown in Figure 4.28.
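The pixel cancellation and the shift-effect check can be sketched as follows; the names, the list-of-rows representation, and returning the raw counts (rather than the thresholded indexes) are assumptions of this sketch:

```python
def shape_match(c1, c2):
    """Superimpose two same-size binary characters and count the
    quantities of the text: C1, C2 (total pixels) and N1, N2
    (unmatched pixels from character 1 and character 2)."""
    rows, cols = len(c1), len(c1[0])
    C1 = sum(map(sum, c1))
    C2 = sum(map(sum, c2))
    N1 = sum(c1[y][x] and not c2[y][x]
             for y in range(rows) for x in range(cols))
    N2 = sum(c2[y][x] and not c1[y][x]
             for y in range(rows) for x in range(cols))
    return C1, C2, N1, N2

def check_shift(c1, c2):
    """Shift-effect check: offset c2 by one pixel in the eight compass
    directions (plus no shift) and return the smallest total unmatched
    count N1 + N2."""
    def shifted(c, dy, dx):
        rows, cols = len(c), len(c[0])
        return [[c[y - dy][x - dx]
                 if 0 <= y - dy < rows and 0 <= x - dx < cols else 0
                 for x in range(cols)] for y in range(rows)]
    best = None
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            _, _, n1, n2 = shape_match(c1, shifted(c2, dy, dx))
            best = n1 + n2 if best is None else min(best, n1 + n2)
    return best
```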
4.3.3.3 Intensity Change Analysis
A document may be falsified by the same typewriter at a later
date. In this case, the suspicious document may have passed all the
preceding tests. We make use of intensity change analysis to detect
this type of falsification which may be revealed from the change of
ribbon darkness or the change of paper brightness due to erasing. The
differences in ribbon darkness and paper brightness may be determined
from the intensity histogram plots, as illustrated in Figure 4.29. The
solid curve represents the intensity histogram of a character. The
dashed curve represents the intensity shift from both sides to the

Figure 4.26 The Shift Effect Checking

Figure 4.27 Shape Analysis

Figure 4.27 (continued)

++++++ SHAPE ANALYSIS ++++++
(A) SMEARED CHARACTERS: character e at line 7, position 15;
character s at line 9, position 7.
(B) SIZE DIFFERENCE: none.
(C) TYPE-FONT DIFFERENCE: none.
REMARKS: NO FALSIFICATION DETECTED
Figure 4.28 The Result of Shape Analysis

Figure 4.29 Intensity Histogram

middle portion due to the erasing problem. To describe the
intensity change quantitatively, we define three parameters: I_b,
denoting the peak intensity in the background region; I_o, denoting the peak
intensity in the object region; and I_q, denoting the optimal threshold
of the intensity histogram. Referring to Figure 4.29, the peak
intensity shifts between the two characters compared are
ΔI_b, ΔI_o, and ΔI_q,
where ΔI_b is due to erasing, ΔI_o is caused by ribbon aging, and ΔI_q is
due to the combination of both. We introduce an erasing threshold
θ_e, a ribbon-aging threshold θ_r, and a threshold θ_t for the change of the
optimal threshold. The decision rule is (1) if ΔI_b > θ_e, the paper has
been erased, (2) if ΔI_o > θ_r, the ribbon has been changed, (3) if
ΔI_q > θ_t, the character has been corrected (or erased and retyped).
However, if the intensity histogram is unimodal (i.e., has only one
peak), then I_o cannot be extracted; I_q then remains the important
parameter for the detection of the intensity change. In our working
model, we use only I_q in the intensity change analysis, which is shown in
Figure 4.30. The I_q's of two characters are shown in Figure 4.31.
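The three-part decision rule can be sketched directly; the parameter names and the idea of returning a list of findings are assumptions of this sketch:

```python
def intensity_decision(d_ib, d_io, d_iq, theta_e, theta_r, theta_t):
    """Decision rule of the intensity change analysis.

    d_ib, d_io, d_iq are the shifts of the background peak, object peak,
    and optimal threshold between two characters; theta_e, theta_r, and
    theta_t are the erasing, ribbon-aging, and threshold-change
    thresholds. Returns the list of detected indications.
    """
    findings = []
    if d_ib > theta_e:
        findings.append("paper erased")       # rule (1)
    if d_io > theta_r:
        findings.append("ribbon changed")     # rule (2)
    if d_iq > theta_t:
        findings.append("character corrected")  # rule (3)
    return findings
```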
Figure 4.30 Intensity Change Analysis

[Figure 4.31 compares the intensity histograms of two characters "h"; the extracted thresholds agree and the comparison concludes "SAME INTENSITY".]
Figure 4.31 Intensity Histogram Comparison of Two Characters (h)

4.3.4 Type-Font Identification
The identification of the type-font is a pattern recognition
problem. The number of type-fonts, including variations, exceeds three
hundred. To achieve accurate identification, we have designed a type-
font database which contains information on the type-font name,
features, manufacturers, year, spacing, number of characters per inch,
etc.
In our experimental study, we used an IBM Selectric II with
different elements to type four different type-fonts (Letter Gothic,
Elite, Prestige Elite, and Courier Italic), as shown in Figure 4.32.
To facilitate the type-font identification task, hierarchical
pattern matching is proposed (see Figure 4.33). The hierarchical
pattern matching performs three-level discrimination using size
information, global features, and local features in sequence. The size
information can be found from the manufacturer's catalog.
The global features used in our system are chain-coded
features as follows [63].
(1) Centroid : (x, y)
(2) Moment of inertia about the centroidal x-axis : (MOX)
(3) Moment of inertia about the centroidal y-axis : (MOY)
(4) Inertia product about the centroidal x- and y-axes : (MXY)
(5) Direction angle of the major axis : (THETA)
(6) Moment about the major axis : (TEX)
(7) Moment about the minor axis : (TEY)
(8) Elongation index : (EI)
(9) Area : (A)
These nine features form a global feature vector to describe a
character. The similarity between two characters can be evaluated by
the distance measurement between two corresponding feature vectors.
Figures 4.34(a) and (b) show two chain-coded characters "h", which are
of different type-fonts. The chain-coded features are calculated to
show the differences. These two h's are easy to identify by the global
features. However, the chain-coded features may not detect the minute
differences between two similar type-fonts.
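A few of these global features can be sketched as follows. This sketch works directly on pixels rather than on chain codes (an assumption, not the author's method), and `global_features` is a hypothetical name:

```python
import math

def global_features(char):
    """Compute (A, centroid, MOX, MOY, MXY, THETA) from a binary character.

    A is the area (pixel count); MOX and MOY are the second moments about
    the centroidal x- and y-axes; MXY is the inertia product; THETA is the
    direction angle of the major axis in degrees.
    """
    pts = [(x, y) for y, row in enumerate(char)
           for x, p in enumerate(row) if p]
    a = len(pts)                                    # area
    cx = sum(x for x, _ in pts) / a                 # centroid
    cy = sum(y for _, y in pts) / a
    mox = sum((y - cy) ** 2 for _, y in pts)        # moment about x-axis
    moy = sum((x - cx) ** 2 for x, _ in pts)        # moment about y-axis
    mxy = sum((x - cx) * (y - cy) for x, y in pts)  # inertia product
    # principal-axis angle of the pixel distribution
    theta = 0.5 * math.degrees(math.atan2(2 * mxy, moy - mox))
    return a, (cx, cy), mox, moy, mxy, theta
```

Two feature tuples computed this way can be compared with any vector distance, mirroring the similarity measurement described above.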

AAAAABBBBBCCCCCDDDDD
aaaaabbbbbcccccddddd
(a) Letter Gothic, 12 pitch

AAAAABBBBBCCCCCDDDDD
aaaaabbbbbcccccddddd
(b) Elite, 12 pitch

AAAAABBBBBCCCCCDDDDD
aaaaabbbbbcccccddddd
(c) Prestige Elite, 12 pitch

A A A A A B B B B B C C C C C D D D D D
a a a a a b b b b b c c c c c d d d d d
(d) Courier Italic, 12 pitch

Figure 4.32 Four Type-Font Samples

Figure 4.33 Three-Level Identification Hierarchy

[Figure 4.34(a) shows the original and smoothed chain-coded figures of an Elite "h" together with the computed features: AREA = -120.5 units, CENTROID = (8.2, 10.7), MOX = 6303.5046, MOY = 3292.7033, MXY = 1593.9303, TEX = 2537.2666, TEY = 6889.2463, THETA = 66.7235, EI = 0.3716.]
(a) Elite Type-Font "h"
Figure 4.34 Chain-Coded Features of Character

[Figure 4.34(b) shows the original and smoothed chain-coded figures of a Courier Italic "h" together with the computed features: AREA = -132.0 units, CENTROID = (7.4, 12.5), MOX = 9337.0745, MOY = 2895.3704, MXY = 936.2037, TEX = 2779.9470, TEY = 9953.3979, THETA = 82.1420, EI = 0.3105.]
(b) Courier Italic Type-Font "h"
Figure 4.34 (continued)

For minute difference comparison, we propose to use local pictorial
features. The two approaches to extract pictorial features are
skeletonization [64] and quadrant division [62]. The skeletonization
provides topological features such as triple point, end point, and
connectedness for the font identification. The skeletons of characters
'A' of four different type-fonts are shown in Figure 4.35. Experienced
document examiners classify type-fonts by key features only. An example
of the key feature of some type-fonts is shown in Figure 4.36. For the
quadrant division, we divide a character into four quadrants and store
the significant quadrant(s) into the data base for partial matching
purposes. The significant quadrants of a character "a" for the Gothic
and the Prestige Elite type-fonts are shown in Figure 4.37. The system
performs the partial matching with the data base and selects the one
which has the highest score as a candidate type-font.
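The quadrant-division matching described above can be sketched as a normalized pixel-agreement score against prestored templates. This is an illustrative modern sketch; the data layout, function names, and scoring rule are our assumptions, not the system's:

```python
# Sketch of quadrant-division partial matching: split a binary character
# bitmap into four quadrants and score a candidate font by the pixel
# agreement of a stored significant quadrant. Illustrative only.

def quadrants(bitmap):
    """Split a 2-D 0/1 bitmap (list of rows) into quadrants I..IV."""
    h, w = len(bitmap), len(bitmap[0])
    top, bot = bitmap[:h // 2], bitmap[h // 2:]
    return {
        "I":   [row[w // 2:] for row in top],   # upper right
        "II":  [row[:w // 2] for row in top],   # upper left
        "III": [row[:w // 2] for row in bot],   # lower left
        "IV":  [row[w // 2:] for row in bot],   # lower right
    }

def match_score(q1, q2):
    """Fraction of agreeing pixels between two equally sized quadrants."""
    total = agree = 0
    for r1, r2 in zip(q1, q2):
        for a, b in zip(r1, r2):
            total += 1
            agree += (a == b)
    return agree / total

char = [[0, 1, 1, 0],
        [1, 0, 0, 1],
        [1, 0, 0, 1],
        [0, 1, 1, 0]]
q = quadrants(char)
score = match_score(q["I"], q["IV"])     # dissimilar quadrants
perfect = match_score(q["I"], q["I"])    # identical quadrants
```

The candidate font is then simply the template with the highest such score.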
We illustrate the identification task using Figure 4.38 as an
example. Figure 4.38 consists of two mixed type-fonts. After
identification, the results are shown in Table 4.4.
4.3.5 Knowledge Base Design
The fourth subsystem of the proposed system is the knowledge base
for falsification detection and font identification. Without a
knowledge base, the system will not be able to integrate the procedures,
pattern features, and detection criteria together to make a positive
identification. The knowledge base will be filled with complete
information on various type-fonts and typewriter manufacturers together
with the experience and know-how of expert document examiners. The
control module determines which analysis techniques will be performed

134
(a) Letter Gothic
(b) Prestige Elite
(c) Elite
Figure 4.35 The Skeletons of the Character 'A'

135
Figure 4.36 Key Feature of Type-Font

136
Type-font         Binary Character   Prestored Templates
Gothic            [bitmap]           [templates] (Quadrants I and IV)
Prestige Elite    [bitmap]           [templates] (Quadrants I and IV)
Figure 4.37 Division of a Character into Quadrants

137
The system first makes the alignment test on suspicious characters.
If the characters fail the alignment test, they are regarded as a
falsified document. If the suspicious characters pass the alignment
test, they will automatically be subject to the size test. If this
test is also passed, they will take feature discrimination tests. If
this suspicious character can pass all these tests, the system will
make a final test, the intensity analysis test.
(a) Windowed Document
[Figure: filtered binary image showing partial lines of the text above]
(b) Filtered Binary Image
Figure 4.38 An Example of a Document Which Consists of
Two Mixed Type-Fonts

138
Table 4.4 Results of Type-Font Identification
(the CHARACTER glyph column is not reproduced)
LINE   POSITION   TYPE-FONT
1      3          ELITE
       16         GOTHIC
       13         ELITE
3      1          GOTHIC
       4          GOTHIC
       7          GOTHIC
       10         GOTHIC
       14         GOTHIC
5      7          GOTHIC
       8          GOTHIC
1      15         GOTHIC
       6          ELITE
       8          ELITE
2      2          GOTHIC
       7          GOTHIC
       11         GOTHIC
3      8          GOTHIC
       13         GOTHIC
5      3          GOTHIC
       6          GOTHIC
       9          GOTHIC
1      11         ELITE
2      16         GOTHIC
3      3          GOTHIC
4      15         GOTHIC
5      14         GOTHIC
1      12         ELITE
5      1          GOTHIC

139
and what the criterion is. Then, on the basis of the conclusion of the
analysis, the control module selects the next analysis task. Thus
the control module is an event-driven mechanism. The functional block
diagram of the knowledge base design is shown in Figure 4.39. The
hierarchical tree handles the type-font classification. The type-font
dictionary stores the type-font information and links with the chain-
coded picture file and the manufacturer's file. The type-font
dictionary also provides the corresponding feature vectors through the
type-font-to-feature table for font identification. The detection
criterion file provides the threshold settings for the falsification
detection subsystem. The pictorial feature file stores the quadrant
pictorial features of the characters for font identification. A
questioned document passes through the image processing tasks linked
with the knowledge base, and falsification detection and font
identification are performed on it.
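The event-driven control idea can be sketched as a dispatch over the document's test sequence (alignment, size, feature discrimination, intensity analysis). The test names follow the text; the dispatch table and stubbed test logic are our assumptions:

```python
# Sketch of the event-driven control module: each analysis task reports
# a result, and the control module picks the next task from it. The
# character record here is a plain dict; real tests would examine pixels.

def alignment_test(char):   return char.get("aligned", True)
def size_test(char):        return char.get("size_ok", True)
def feature_test(char):     return char.get("features_ok", True)
def intensity_test(char):   return char.get("intensity_ok", True)

# On a pass, control moves to the next test; any failure ends the run
# with a falsification verdict, mirroring Figure 4.38's windowed text.
PIPELINE = [alignment_test, size_test, feature_test, intensity_test]

def examine(char):
    for test in PIPELINE:
        if not test(char):
            return "falsified"
    return "genuine"

verdict = examine({"aligned": True, "size_ok": False})
clean = examine({})
```

The linear pipeline is a simplification: the actual control module selects the next task from the conclusion of the previous one, so a table keyed by (task, outcome) would be closer to the design.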
4.4 Discussion
There are several basic questions in falsified document detection
and font identification.
a) Is the document falsified?
b) In what type-font(s) is the document written?
c) Which manufacturer produced the machine used in the typing of
the document?
d) What actual individual machine was used in typing the document?

[Figure: block diagram — Document in, Results out; Chain-Code Picture File (CCPF), Manufacturers File (MF), Picture Feature File (PFF)]
Figure 4.39 Functional Block Diagram of Knowledge Base
for Font Identification
140

141
e) Is the entire document typed on the same machine?
f) Is the document typed continuously?
The proposed system in this chapter has been implemented in the
Center for Information Research, the University of Florida, and answers
the above questions. The system is called the Automatic Typewriter
Identification (ATI) System. Due to the limited samples, questions
(c) and (d) have not been fully tested. However, the growth of the
knowledge base will provide a positive answer to these two questions.

CHAPTER 5
COMPUTER RECOGNITION OF ELECTRONIC CIRCUIT DIAGRAMS
5.1 Introduction
Automated reading of diagrams and drawings has been recognized as
an important step in augmenting engineering design productivity. In
current commercially available CAD/CAM systems, data entry of circuit
designs is limited to the interactive editor mode through a special
input device such as a light pen and tablet. Data entry by human
operators is time-consuming and error-prone. An automated schematics
reading machine is needed in order to enhance CAD capabilities. In
recent years, several attempts have been made to develop computer
techniques for automatic recognition of flowcharts and logic circuit
diagrams [65]-[76]. Some of these approaches are only applicable to
circuit element symbols which are of simple geometrical shape such as a
rectangular type. Some approaches have even modified and represented
the symbols in linear segment form. For a comprehensive discussion, see
Cheng [77].
Our approach is not limited to symbols with rectangular shape.
Electrical and electronic symbols are taken from the standard drafting
handbooks [78]-[80]. The system design is based upon the concept of
multi-pass pattern extraction which we have developed. Each pass of
pattern extraction generates a page of information of one kind. Through
142

143
multi-pass pattern extraction, the electronic diagram is segmented
according to the nature of the elements. Then the reading of electronic
symbols is treated as a "character" recognition problem discussed in
Chapter 4. The various procedures of multi-pass pattern extraction are
described in Sections 5.4.1-5.4.6. To describe the symbol
configurations and the inter-relationships among them, we introduce two
high-level pictorial manipulation languages, Symbol Description Language
(SDL) and Picture Generation Language (PGL). The PGL is an associative
network structure.
5.2 Analysis of Electronic Circuit Diagram
Electrical and logical schematics are line drawings which consist
of line segments and symbols for the representation of the circuits in
graphics form. They provide the important means of communication in
electrical, electronic and computer engineering. Standards for the
drawing of electric schematics can be found in several reference books
[78]-[80].
In general, electronic and logical schematics are characterized by
functional elements, connecting elements, and denotations. The
functional elements are represented by symbols consisting of two or more
terminals for connection to other elements. For instance, resistors,
diodes, and capacitors are two-terminal elements; transistors and
amplifiers are three-terminal elements; and flip-flops are four or more
terminal elements. Some basic symbols in electronic and logical
schematics are listed in Table 5.1. The connecting elements are

144
Table 5.1 Some Basic Symbols in Electrical Schematics

145
represented by junction dots, horizontal line segments, and vertical
line segments which connect functional elements. Various types of
connections are listed in Table 5.2. The denotations are used to
describe physical properties or labels of the functional elements in a
schematic in terms of a character string. For example, the string 10K
attached to a resistor means that the resistance of the resistor is
10,000 Ω. The string C1 attached to a functional element indicates that
the element is capacitor number one in the given schematics.
Denotations in a schematic will facilitate automatic recognition of the
functional elements.
From the electronic drafting handbook, some layout guidelines may
provide useful information for diagram recognition such as class
designation letters (Table 5.3) and estimating space requirements
(Table 5.4) [80]. Moreover, there exist some geometrical constraints of
logic symbols [67] (refer to Figure 5.1). They are as follows:
Class name: AND type gates
f1 = ((x - (x̄ + a - b))/b)² + ((y - ȳ)/b)² - 1
f2 = (y - (ȳ + b))/b
f3 = (x - (x̄ - a))/a
f4 = (y - (ȳ - b))/b
Class name: OR type gates
f1 = 3((x - (x̄ - a))/4a)² + ((y - (ȳ - b))/2b)² - 1
f2 = 3((x - (x̄ - a))/4a)² + ((y - (ȳ + b))/2b)² - 1
f3 = ((x - (x̄ - a - √3 b))/2b)² + ((y - ȳ)/2b)² - 1
where (x̄, ȳ) denotes the reference point of the symbol and a, b are
its half-width and half-height.
Class name: amplifier type

146
Table 5.2 Configuration of Connections
Connection Type (graphic representations not reproduced):
Conducting Crossing, Corner, Free End, Conducting Touching,
Nonconducting Crossing

147
Table 5.3 Class Designation Letters for Some
Electrical and Logic Components
Class Letter    Component Name
A               AND gate
AR              amplifier
C               capacitor
CR              diode
FF              flip-flop
L               inductor
NAND            NAND gate
NOR             NOR gate
NOT             inverter
OE              exclusive OR gate
OR              OR gate
Q               transistor
R               resistor

148
Table 5.4 Estimating Spacing Requirements
                                    Average Diagram Spacing
                                    (0.20-in. grid spaces)
Component
Capacitors                          3-4
Inductors                           4
Resistors                           4
Diagram Items
Transistor envelope diameter        4
Resistor symbol length              3
Capacitor symbol width              1-2
Lettering height                    3/4
Connection line spacing             1-1 1/2
Spacing between groups of
  connection lines                  1-2

149
[Figure: outlines of AND gate, NAND gate, NOR gate, and amplifier]
Figure 5.1 Geometric Constraints for Logic Symbols

150
f1 = (x - (x̄ + a))/2a - (y - ȳ)/b
f2 = (x - (x̄ + a))/2a + (y - ȳ)/b
f3 = (x - (x̄ - a))/a
where b = √3 a is used for figure drawing.
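The minus signs in the amplifier constraints were lost in reproduction; under our reading (reference point (x̄, ȳ), triangle vertices at (x̄ − a, ȳ ± b) and apex (x̄ + a, ȳ), with b = √3·a an assumption), the constraints can be checked numerically — a point on an edge makes the corresponding f vanish:

```python
import math

# Numerical check of the amplifier triangle constraints as reconstructed
# above. The reference point (xc, yc) and b = sqrt(3)*a are our reading
# of the garbled text, not a confirmed part of the original.

def amplifier_constraints(x, y, xc, yc, a, b):
    f1 = (x - (xc + a)) / (2 * a) - (y - yc) / b   # lower slanted edge
    f2 = (x - (xc + a)) / (2 * a) + (y - yc) / b   # upper slanted edge
    f3 = (x - (xc - a)) / a                        # left vertical edge
    return f1, f2, f3

a = 1.0
b = math.sqrt(3) * a
xc = yc = 0.0

# The apex (xc + a, yc) lies on both slanted edges (f1 = f2 = 0):
f1, f2, f3 = amplifier_constraints(xc + a, yc, xc, yc, a, b)
# The upper-left vertex (xc - a, yc + b) lies on the upper edge (g2 = 0):
g1, g2, g3 = amplifier_constraints(xc - a, yc + b, xc, yc, a, b)
```

The same style of spot check applies to the AND and OR gate constraints: each arc or edge equation should vanish at the vertices it is supposed to pass through.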
In order to speed up the recognition task, we generate a symbol
tree and categorize the symbols into four categories (circular shape,
rectangular shape, curved shape, and line-segment shape). The circular
shape category can be further divided into transistor, junction FET, and
MOSFET types. These types can be distinguished using the information of
the arrow mark (see Figure 5.2). For the rectangular shape category, we
divide it into two parts. One part is vertical bar shapes which include
flip-flops, adders, multiplexers, and VLSI circuits. The other part is
horizontal bar shapes which contain components represented by the
European drawing form. The curved shape category includes AND type and
OR type logic symbols. Diode, resistor, capacitor, and amplifier are
classified into line-segment shape category. A symbol tree is
illustrated in Figure 5.3.
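The four-way symbol tree above can be sketched as a small lookup structure; the dictionary encoding and function name are ours, while the category contents follow the text:

```python
# Sketch of the symbol tree of Figure 5.3, used to narrow the search
# space before detailed matching. Encoding is illustrative.

SYMBOL_TREE = {
    "circular":     ["transistor", "junction FET", "MOSFET"],
    "rectangular":  {"vertical bar":   ["flip-flop", "adder",
                                        "multiplexer", "VLSI circuit"],
                     "horizontal bar": ["European-form components"]},
    "curved":       ["AND-type gate", "OR-type gate"],
    "line-segment": ["diode", "resistor", "capacitor", "amplifier"],
}

def category_of(symbol):
    """Return the top-level shape category containing the symbol name."""
    for cat, members in SYMBOL_TREE.items():
        pool = (sum(members.values(), []) if isinstance(members, dict)
                else members)
        if symbol in pool:
            return cat
    return None

cat = category_of("resistor")
sub = category_of("flip-flop")
```

Walking only one branch of this tree per isolated block is what speeds up the recognition task.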
5.3 System Architecture
The electronic circuit diagram recognition system is designed to
perform three tasks. Task I is extraction of schematic symbols. Task II
is interpretation of schematic symbols, and Task III is generation of a
pictorial database for the CAD interface. The architecture of the
recognition system is shown in Figure 5.4. The first step is to scan
the circuit diagram and generate a digitized image. Before symbol

151
Figure 5.2 Key Feature of Circular Elements

[Figure: symbol tree — SYMBOLS branching by shape category into element types]
Figure 5.3 Symbol Tree


154
extraction is performed, a clean binary image of the schematics is
generated. The binary image generation involves reflection, filtering,
noise removal, and adaptive thresholding. The reflection algorithm is
employed to correct the mirror effect. The filtering and noise removal
tasks are used to enhance the lines that have a certain width and to
suppress small spot-like noise and shading effects caused by non-uniform
illuminations on poor quality paper. Through adaptive thresholding, we
obtain a binary image with some spurs and gaps. In the training phase,
the spurs will be removed by the noise removal operation and small gaps
will be filled by the gap filling task before skeletonization is
performed. The skeletonization is employed to create thinned
schematics. After careful tailoring, the thinned symbols are decomposed
into the primitives which will be stored in the knowledge base as
feature vectors. The primitives currently used are shown in Figure 5.5.
The new primitives of each symbol may be generated and updated in an
interactive manner. The spatial relationships among the primitives of a
symbol are extracted and stored as feature vectors. Then, the knowledge
base is created during the training phase.
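The adaptive thresholding step can be sketched as binarization against a local mean, which is what suppresses the slow shading variations mentioned above. This is a simplified modern sketch; the window size and offset are illustrative choices, not the system's parameters:

```python
# Sketch of adaptive (local-mean) thresholding: a pixel is marked as ink
# when it is darker than the mean of its neighborhood, so a shaded
# background does not need one global threshold.

def adaptive_threshold(img, win=3, offset=0):
    """img: 2-D list of gray levels. Returns a 0/1 image (1 = dark ink)."""
    h, w = len(img), len(img[0])
    r = win // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # local mean over a win x win neighborhood, clipped at borders
            ys = range(max(0, y - r), min(h, y + r + 1))
            xs = range(max(0, x - r), min(w, x + r + 1))
            vals = [img[j][i] for j in ys for i in xs]
            mean = sum(vals) / len(vals)
            out[y][x] = 1 if img[y][x] < mean - offset else 0
    return out

# A dark stroke between two unevenly lit background rows still
# binarizes to a clean line.
gray = [[200, 200, 200],
        [ 40,  40,  40],
        [180, 180, 180]]
binary = adaptive_threshold(gray)
```

The resulting binary image is what the spur removal, gap filling, and skeletonization steps then operate on.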
To conduct the task of schematic symbols extraction we decompose a
schematic diagram into five sets of drawings, and each set is stored in
a dedicated file which we call a page. The five sets are junction dots,
connecting line segments, functional symbols, denotations, and
unrecognizable elements. The first page stores a drawing of the
junction dots, the second of connecting line segments, the third of
functional elements, the fourth of denotations, and the fifth of

155
[Figure: pixel-grid drawings of the stroke primitives; grid characters omitted]
Figure 5.5 The Primitives Used in the Circuit Diagram
Recognition System

156
unrecognizable elements. We call this procedure a multi-pass pattern
extraction process. Why do we decompose the schematic diagram into five
pages? The reasons are
(1) Each pass performs one specific modular function. The task is
easy to replace and modify.
(2) We want to extract each component and connection and still
preserve the nature of the schematics.
(3) By considering each functional element as a character, we can
further utilize the concept of blocking.
5.4 Multi-Pass Pattern Extraction
After adaptive thresholding, a binary image of the schematic is
generated. Cleaning of spurs is not necessary in the extraction phase
because the spurs are small and can be ignored during the blocking
procedure. The first pass of pattern extraction is to remove the
junction dots from the binary image and store them in the first page,
i.e., preserve the branch property. The second pass is to remove
horizontal line segments first in order to further isolate the
functional elements. The third pass is to isolate functional elements,
which are considered as character blocking as discussed in Chapter 4.
The fourth pass, which is to recognize the denotations, is a process
similar to the character recognition. The fifth and last pass is to
remove streaks and noise, then copy the unrecognizable elements to page
five. The procedures of the multi-pass pattern extraction are described
in the following sections.
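The five-pass decomposition can be sketched as a driver that threads the working image through one extractor per page. The pass functions below are stubs with made-up return values (the "10K" denotation echoes the earlier example); the real passes are the subject of Sections 5.4.1-5.4.6:

```python
# Sketch of the multi-pass extraction driver: each pass strips one kind
# of element from the working image and writes what it found to its own
# page. All five extractors here are illustrative stubs.

def extract_junction_dots(image):       return image, ["J-1 @ (10, 20)"]
def extract_horizontal_lines(image):    return image, ["H: x=30..98, y=84"]
def extract_functional_elements(image): return image, ["resistor #1"]
def extract_denotations(image):         return image, ["10K"]
def collect_unrecognized(image):        return image, []

PASSES = [extract_junction_dots, extract_horizontal_lines,
          extract_functional_elements, extract_denotations,
          collect_unrecognized]

def multi_pass(image):
    pages = []
    for extract in PASSES:
        image, page = extract(image)   # each pass removes what it found
        pages.append(page)             # one page per kind of element
    return pages

pages = multi_pass(object())
```

Because every pass returns the image with its elements removed, each later pass sees a progressively simpler drawing, which is the point of the design.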

157
5.4.1 Extraction of Junction Dots
The junction dots consist of crossing dots and touching dots, as
shown in Figure 5.6, which are ordered from J1 to J5 according to the
frequency of occurrence in a schematic diagram. The black dots
represent the skeleton of a crossing or touching junction. These
patterns are part of the primitives stored in the knowledge base (refer
to Figure 5.5). The knowledge base will handle the ordering of J1 to J5
during the skeleton matching. The white dots denote the possible
conducting pixels. Due to poor paper quality and imperfect
preprocessing, the configuration of conducting pixels may vary from the
patterns shown in Figure 5.6.
The junction dots are extracted by using the following algorithm
(see Figure 5.7)
(1) Detect the skeleton pattern via a template matching process if the
number of pixels in the scanning line exceeds the threshold θ
(where θ = 30). The order J1 to J5 is followed.
(2) Check the number of conducting pixels around the center pixels. If
this number is equal to the maximum or less than the maximum by one
or two pixels, a junction dot is extracted and the X-Y coordinates
of the center pixel are stored in the first page with an
identification code ID (i.e., J1 to J5). Remove that junction dot
from the binary image.
(3) Check the contents of the first page and resolve the ambiguities.
The ambiguities (like blurring) occur due to the sampling noise of
the scanner, inaccurate alignment of the drawing, and the poor

158
Figure 5.6 Junction Dot Patterns

159
[Figure: flow diagram — circuit diagram input; arrowheads scored in a temporary file]
Figure 5.7 Junction Dot Extraction

160
quality of the paper. These ambiguities will strongly affect diode,
capacitor, and ground symbols. To resolve these ambiguities, the
first step is to detect the possibility of diode and capacitor
symbols by detecting two opposite pattern pairs; delete them
from the first page and store them in the temporary file. In the
second step we consider if the X-Y coordinates of a junction dot
fall within a 5x5 window of other junction dots
(i.e., ΔX < 2 or ΔY < 2); then these junction dots have to merge into
one. Choose one of them as a junction dot from the majority of
occurrences and flag the rest of them. However, J1 has top
priority if J1 exists among the ambiguities.
(4) Re-check the contents of the first page to eliminate junction dots
for arrowheads. The skeleton patterns for arrowheads in a diode,
transistor-type element, or flowchart which are used as templates
for arrowhead identification, are shown in Figure 5.8.
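Steps (1) and (2) amount to template matching with a small tolerance. A sketch under our assumptions — the J1 crossing template drawn here and the 5x5 size are our illustration of the skeleton patterns, and the "maximum minus one or two" rule follows step (2):

```python
# Sketch of junction-dot template matching: slide a 5x5 skeleton
# template over the binary image and accept a junction dot when the
# set pixels match the template's maximum to within two.

J1 = ["00100",
      "00100",
      "11111",
      "00100",
      "00100"]   # '+' crossing skeleton (our drawing, illustrative)

def match_at(image, tx, ty, template=J1, tol=2):
    """True if nearly all template pixels are set at offset (tx, ty)."""
    need = hit = 0
    for dy, row in enumerate(template):
        for dx, c in enumerate(row):
            if c == "1":
                need += 1
                if image[ty + dy][tx + dx] == 1:
                    hit += 1
    return need - hit <= tol

# Build a small image containing a perfect crossing at its center.
img = [[0] * 7 for _ in range(7)]
for i in range(1, 6):
    img[3][i] = 1        # horizontal bar
    img[i][3] = 1        # vertical bar
found = match_at(img, 1, 1)
missing = match_at([[0] * 7 for _ in range(7)], 1, 1)
```

The tolerance is what lets the detector survive the imperfect conducting-pixel configurations mentioned for poor-quality paper.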
To illustrate the algorithm, we choose a circuit diagram
(Figure 5.9) as an example. After scanning, we cut two reflected,
positive picture blocks using the "WINDOW" task (see Figure 5.10(a) and
(b)). Through adaptive thresholding, we get two binary images shown
in Figure 5.11(a) and (b), respectively. Table 5.5(a) and (b) show
the contents of the first page of Figure 5.11(a) before and after
resolving the ambiguities respectively; and the image after junction
dots removal is shown in Figure 5.12. For Figure 5.11(b), the
contents of the first page before and after resolving the
ambiguities are shown in Table 5.6(a) and (b), respectively. The

161
Figure 5.8 Arrowhead Patterns

162
[Figure: windowed gray-level image; halftone residue omitted]
(a) Window 1
Figure 5.10 Windowed Gray-Level Images of Figure 5.9
163

[Figure: windowed gray-level image; halftone residue omitted]
(b) Window 2
Figure 5.10 (continued)

Figure 5.11 Binary Images

166
Table 5.5 Contents of Page 1 for Window 1
(a) Before Resolving Ambiguities
The Coordinates of the Connection Components:
REC #   X1    Y1    X2    Y2    TYPE
1       136   29    140   33    J-1
2       137   29    141   33    J-2
3       79    55    83    59    J-4
4       80    55    84    59    J-1
5       136   55    140   59    J-2
6       137   55    141   59    J-3
7       39    73    43    77    J-2
8       40    73    44    77    J-2
9       79    73    83    77    J-4
10      80    73    84    77    J-3
11      39    74    43    78    J-1
12      40    74    44    78    J-1
13      79    74    83    78    J-5
14      80    74    84    78    J-5
(b) After Resolving Ambiguities
The Coordinates of the Connection Components:
REC #   X1    Y1    X2    Y2    TYPE
1       137   29    141   33    J-2
2       79    55    83    59    J-4
3       137   55    141   59    J-3
4       39    73    43    77    J-2
5       79    74    83    78    J-5


168
Table 5.6 Contents of Page 1 for Window 2
(a) Before Resolving Ambiguities
The Coordinates of the Connection Components:
REC #   X1    Y1    X2    Y2    TYPE
1       137   24    141   28    J-1
2       204   49    208   53    J-1
3       205   49    209   53    J-1
4       204   50    208   54    J-1
5       205   50    209   54    J-1
6       137   83    141   87    J-4
7       138   83    142   87    J-3
8       41    84    45    88    J-2
9       43    84    47    88    J-2
10      43    84    47    88    J-2
11      89    84    93    88    J-2
12      90    84    94    88    J-2
13      137   84    141   88    J-1
14      138   84    142   88    J-1
15      42    85    46    89    J-1
16      89    85    93    89    J-4
17      90    85    94    89    J-3
18      169   97    173   101   J-4
19      129   98    133   102   J-2
20      130   98    134   102   J-2
21      131   98    135   102   J-2
22      129   99    133   103   J-5
23      130   99    134   103   J-5
24      131   99    135   103   J-5
(b) After Resolving Ambiguities
The Coordinates of the Connection Components:
REC #   X1    Y1    X2    Y2    TYPE
1       137   24    141   28    J-1
2       204   49    208   53    J-1
3       137   84    141   88    J-2
4       42    84    46    88    J-2
5       89    84    93    88    J-2
6       169   97    173   101   J-4
7       130   98    134   102   J-2

169
contents of the temporary file are shown in Table 5.7 and the image
after potential diode and capacitor symbols removal is shown in
Figure 5.13(a). The image after junction dots removal is shown in
Figure 5.13(b).
5.4.2 Extraction of Horizontal Connecting Line Segments
In electronic diagrams, there are only horizontal and
vertical line segments connecting various circuit elements. In order to
isolate each functional element, the horizontal and vertical connecting
line segments must be removed before blocking. However, to remove the
vertical connecting line segments is very tedious and time-consuming
because image data are scanned horizontally and stored in a random
access file. The lack of array processing facilities does not permit us
to extract the vertical connecting line segments from the binary image
files. To compensate for this drawback, we consider vertical connecting
line segments as functional elements during the blocking routine and
they will be isolated with very thin rectangular blocks which are
different from the real functional element blocks.
Following the removal of the junction dots, the binary image
contains disjoint connecting line segments and functional symbols. The
second pass pattern extraction is to extract the horizontal connecting
line segments and put them in the second page. The functional diagram
of horizontal connecting line extraction is shown in Figure 5.14 and the
algorithm is the following:
(1) Trace and record the starting point and the length of each
horizontal line segment in scanning order which may end at a
functional element or end at the rim of the image.

170
Table 5.7 Contents of Temporary File for Possible Capacitor
and Diode Symbols (Window 2)
The Coordinates of the Possible Components:
(After Extracting Horizontal Line Segments)
REC #   X1    Y1    X2    Y2    TYPE
1       39    101   49    109   H-0
2       86    101   97    108   H-0
3       166   109   175   121   V-0

171
Figure 5.13 Images After Capacitors, Diodes, and Junction Dots
Removal for Window 2

172
Circuit Diagram
Without
Junction Dots
Store in Temporary File
Figure 5.14 Horizontal Connecting Line Extraction

173
(2) Shrink the element line segment end back 3 pixels to make functional
elements more realistic, if the length of a line segment exceeds the
threshold θ (where θ = 8). Record the coordinates of both ends
of the line segment in page 2.
(3) Check and "flag" the records if the difference in Y-coordinate does
not exceed a threshold θy (where θy = 1), thus allowing slight
deviation from the horizontal position due to inaccurate alignment or
rough drawing. Then perform the linking using the longest line
segment as the main axis; the new line segment is generated and
stored in page 2. Our program can handle hand-drawn schematics as
long as the line segment does not deviate from the median by more
than 5° of skew angle. The linking of line segments is shown in
Figure 5.15.
(4) Detect the short parallel line pairs and select the candidate
capacitors or diodes. Then transfer the coordinates of candidate
elements to page 3.
(5) Delete the horizontal connecting line segments from the binary
image, and end the task of the extraction of horizontal line
segments.
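Step (1), tracing horizontal runs in scanning order, can be sketched as a run-length scan with the length threshold of step (2) applied (θ = 8 per the text; the function name and return format are ours):

```python
# Sketch of horizontal line-segment tracing: scan each row of a binary
# image and record runs of set pixels longer than the threshold as
# (row, x_start, x_end) segments.

def trace_horizontal(image, theta=8):
    """Return [(row, x_start, x_end)] for runs of 1s longer than theta."""
    segments = []
    for y, row in enumerate(image):
        x = 0
        while x < len(row):
            if row[x] == 1:
                start = x
                while x < len(row) and row[x] == 1:
                    x += 1                      # extend the current run
                if x - start > theta:
                    segments.append((y, start, x - 1))
            else:
                x += 1
    return segments

img = [[0] * 20 for _ in range(3)]
for x in range(2, 14):
    img[1][x] = 1                               # a 12-pixel horizontal run
segs = trace_horizontal(img)
```

Runs at or below the threshold are left in place, since short parallel pairs may be capacitor or diode strokes rather than connecting lines.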
Using Figure 5.11(a) and (b) as examples, the contents of the
second page before and after ambiguity resolution are
shown in Tables 5.8 and 5.9, respectively. The images after horizontal
line segments removal are shown in Figure 5.16(a) and (b).
5.4.3 Extraction of Functional Elements
Following the removal of the junction dots and the horizontal
connecting line segments, the binary image contains disjoint functional

[Figure: pixel grids over columns 1-26 — (a) Line Segments Before Linking; (b) New Line Segment After Linking]
Figure 5.15 Linking of Line Segments

175
Table 5.8 Contents of Page 2 for Window 1
(a) Before Resolving Ambiguities
The Coordinates of the Line Segments:
REC #   LINE #   POS1   POS2   TYPE
1       30       84     98     H
2       30       121    128    H
3       31       120    163    H
4       56       83     93     H
5       57       83     105    H
6       57       118    137    H
7       75       4      12     H
8       75       20     50     H
9       76       29     50     H
10      76       73     91     H
11      88       125    137    H
12      99       85     92     H
(b) After Resolving Ambiguities
The Coordinates of the Line Segments:
REC #   LINE #   POS1   POS2   TYPE
1       30       84     98     H
2       31       120    163    H
3       56       83     105    H
4       57       118    137    H
5       75       4      12     H
6       75       20     50     H
7       76       73     91     H
8       88       125    137    H
9       99       85     92     H

176
Table 5.9 Contents of Page 2 for Window 2
(a) Before Resolving Ambiguities
The Coordinates of the Line Segments:
REC #   LINE #   POS1   POS2   TYPE
1       26       5      203    H
2       27       5      34     H
3       27       96     104    H
4       27       112    119    H
5       27       186    190    H
6       44       5      55     H
7       44       78     136    H
8       51       142    162    H
9       51       171    216    H
10      52       142    162    H
11      52       171    216    H
12      74       186    204    H
13      75       186    204    H
14      86       5      9      H
15      86       32     57     H
16      86       80     103    H
17      86       126    151    H
18      99       134    140    H
19      99       186    196    H
20      101      130    132    H
21      114      130    132    H
(b) After Resolving Ambiguities
The Coordinates of the Line Segments:
REC #   LINE #   POS1   POS2   TYPE
1       26       5      203    H
2       44       5      55     H
3       44       78     136    H
4       51       142    162    H
5       51       171    216    H
6       74       186    204    H
7       86       5      9      H
8       86       32     57     H
9       86       80     103    H
10      86       126    151    H
11      99       186    196    H

Figure 5.16 Image After Horizontal Line Segments Removal

178
elements, denotations, and vertical line segments. The extraction of
functional elements is conducted in the following three
steps: (1) blocking, (2) grouping, and (3) recognition (see
Figure 5.17). The blocking routine isolates a functional element by
inscribing it with a rectangular or square block and assigns a code
number to each isolated block in the order of scanning sequence. The
blocking routine used here is the same as the character blocking concept
used in Chapter 4, if we consider the functional elements as
characters. Furthermore, the isolated blocks can be divided into two
types: circular ones and rectangular ones, as shown in Figure 5.18. The
circular symbols are extracted by using the circle detection
technique based upon a modified Hough transform [81],[82]. The circular
blocks may represent such active elements as transistors, junction FETs,
and MOSFETs as shown in Figure 5.2. The rectangular blocks may represent
such functional elements as resistors, inductors, capacitors, diodes,
amplifiers, logic gates, flip-flops, etc. The grouping routine is to
categorize the isolated blocks using the proposed symbol tree (see
Figure 5.3). Rectangular blocks are grouped into two classes, vertical
bar and horizontal bar, according to their aspect ratios. The resistor
and inductor blocks in normal position are of horizontal bar shape. Most
of the VLSI blocks and logic element blocks are in the form of vertical
bar shape and some of them are in the form of a square shape. Here, if
we define aspect ratio = ΔX/ΔY, then

179
[Figure: flow diagram — input: circuit diagram without junction dots and horizontal line segments]
Figure 5.17 Functional Element Extraction
180
Figure 5.18 An Example of Component Blocking

181
aspect ratio > 1 : horizontal bar shape
aspect ratio = 1 : square shape
aspect ratio < 1 : vertical bar shape
However, the real rectangular box elements which appear as vertical
bar shapes do not exist after the extraction of horizontal line
segments. These elements will be recovered using the information of
the vertical connecting line segments after the extraction of the
denotations. Thus the elements detected in this pass are vertical
connecting line segments, R/L/C categories, diodes, etc., besides real
rectangular box elements. Since a vertical line segment is
characterized by an extremely small aspect ratio, it can readily be
detected and is transferred to the second page.
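The grouping rule can be sketched directly from the aspect-ratio classification; the thin-bar cutoff for connecting line segments is an illustrative choice of ours, and the sample blocks are taken from Table 5.11(a):

```python
# Sketch of the grouping routine: classify an isolated block by its
# aspect ratio dx/dy. A vertical connecting line segment shows up as an
# extremely small ratio; the 0.125 cutoff is illustrative.

def classify_block(x1, y1, x2, y2, line_ratio=0.125):
    dx, dy = max(x2 - x1, 1), max(y2 - y1, 1)
    ratio = dx / dy
    if ratio < line_ratio:
        return "vertical connecting line"   # extremely small aspect ratio
    if ratio < 1:
        return "vertical bar"
    if ratio > 1:
        return "horizontal bar"
    return "square"

kind = classify_block(106, 52, 117, 61)   # block 10 of Table 5.11(a)
line = classify_block(81, 30, 83, 54)     # block 5: thin vertical segment
```

Blocks tagged as vertical connecting lines are rerouted to page 2 rather than passed on to the recognition routine.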
The recognition routine is to recognize the isolated functional
elements in each group and assign a label and rotation index to each
element. Template matching technique with pictorial features is used to
extract the pattern features of the isolated functional elements. Size,
strokes, symbol radicals (or primitives), and number of terminals are
the features which are used in the recognition process. For example, a
resistor can be described by the following feature vector:
a) # of primitive a : 3
b) # of primitive v : 3
c) # of terminal : 2
d) # of repetition of a : 2
e) # of repetition of v : 2
Definition of Repetition
Repetition is the property of a particular pattern appearing
repeatedly along one direction.

182
For an operational amplifier, the feature vector will be
a) # of primitive -4': 2
b) # of primitive > : 1
c) # of terminal (input) : 2
d) # of terminal (output) : 1
e) # of repetition of 4 : 1
Curved shape elements such as AND and OR type logic symbols can be
simplified by line segments representation. The simplified symbols of
AND and OR gates are shown in Figure 5.19. The simplified configuration
is very close to the quantized symbol, so the simplification would not
affect the recognition of the real curved symbols. Furthermore, after
simplification, we can find that the intersections of line segments and
vertices can be extracted as the feature vectors. These feature vectors
will enhance the recognition scheme.
For AND gate, the feature vector is
a) # of primitive d : 2
b) # of primitive r : 1
c) # of primitive ¡_ : 1
d) # of primitive 1 : 1
e) # of primitive s : 1
f) # of primitive ^ : 1
g) # of primitive j : 1
h) # of primitive f : 1
i) # of terminal (input) : > 2
j) # of terminal (output) : 1
k) # of repetition of 4 :
For OR Gate, the feature vector is
a) # of primitive 4 : 2
b) # of primitive v : 1
c) # of primitive : 1
d) # of primitive ^ : 1
e) # of primitive : 1
f) # of primitive : 1
g) # of primitive r : 1
h) # of primitive : 1
i) # of primitive : 1
j) # of primitive > : 1
k) # of terminal (input) : > 2

183
(a) Simplified AND Gate
(b) Simplified OR Gate
Figure 5.19 Simplified Logic Symbols

184
l) # of terminal (output) : 1
m) # of repetition of 4 : 1
The above feature vectors can be changed or modified using
interactive commands to update the knowledge base. The knowledge base
configuration will be discussed in Section 5.6. Some of the feature
vectors of common electronic elements are listed in Table 5.10.
After feature extraction of each functional element, the pattern
recognition technique is employed. To simplify the computation, the
similarity measure (S) is used:
        S_j = X_j^T Y = Σ (i = 1 to N) x_ji y_i
where X_j = (x_j1, x_j2, ..., x_jN)^T represents a feature vector of a
functional element j in the knowledge base, Y = (y_1, y_2, ..., y_N)^T
represents a feature vector of the tested element, T denotes the
transpose of a vector, and N is the total dimension of the feature
vectors used.
If some S_j's exceed the preset threshold θc and the ratio of the
difference between the top one and the second one, ΔS_r, also exceeds
the preset threshold θr, then the system will provide the top one as
the most probable functional element.
The ratio of the difference ΔS_r is defined as
        ΔS_r = (S_1 - S_2)/S_1
where S_1 is the measure score of the top one, and S_2 is the measure
score of the second one.
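The decision rule above can be sketched end to end. The resistor feature vector (3, 3, 2, 2, 2) comes from the document; the amplifier vector, the threshold values, and the dictionary layout of the knowledge base are illustrative assumptions:

```python
# Sketch of the similarity decision: dot-product score S_j against each
# stored feature vector, accepted when the best score clears theta_c and
# the relative margin over the runner-up clears theta_r.

def best_match(knowledge_base, y, theta_c=10.0, theta_r=0.2):
    scores = sorted(
        ((sum(a * b for a, b in zip(xj, y)), name)
         for name, xj in knowledge_base.items()),
        reverse=True)
    (s1, top), (s2, _) = scores[0], scores[1]
    ds_r = (s1 - s2) / s1              # ratio of the difference
    if s1 >= theta_c and ds_r >= theta_r:
        return top
    return None                        # would trigger the inference step

kb = {"resistor":  [3, 3, 2, 2, 2],   # vector from the text
      "amplifier": [2, 1, 2, 1, 1]}   # illustrative placeholder
element = best_match(kb, [3, 3, 2, 2, 2])
```

Returning `None` corresponds to the case below where the knowledge base triggers the inference mechanism to gather more information.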
If the above conditions are not satisfied, the knowledge base will
trigger the inference mechanism to get more information until the

Table 5.10 The Primitives of Functional Elements
[Table: binary primitive-occurrence matrix for common functional elements; entries not reliably reproduced]
185

186
conditions are satisfied. Then the system stores that functional
element with label and rotation index in page 3 and removes it from the
binary image. The unrecognizable functional elements, due to missing
pixels, new symbols, or noise blurring problems, will remain in the
binary image.
However, some vertical-bar-shaped elements may embed
redundant vertical connecting line segments after the blocking
routine. Thus vertical line detection must be performed before the
recognition routine is executed. The functional elements and the detected
vertical connecting line segments are stored in pages 3 and 2,
respectively.
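One simple way to pull such embedded vertical runs out of a block is a column-by-column run-length scan. This is a sketch under the assumption that the image is a 0/1 row-major grid; the length threshold is illustrative:

```python
def extract_vertical_lines(block, min_len=5):
    """Scan a binary block column by column and report runs of set pixels
    long enough to count as vertical connecting line segments.
    Returns (column, y_start, y_end) triples."""
    rows, cols = len(block), len(block[0])
    segments = []
    for x in range(cols):
        y = 0
        while y < rows:
            if block[y][x]:
                y0 = y
                while y < rows and block[y][x]:
                    y += 1
                if y - y0 >= min_len:
                    segments.append((x, y0, y - 1))
            else:
                y += 1
    return segments
```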
Furthermore, if a vertical-bar-shaped element contains
corner features, we can detect the breaking point and transfer it to
page 1.
Some ground symbols are blurred and some are distinct. A
blurred ground symbol may lose its proper configuration and become
unrecognizable. To compensate for this problem, the system triggers an
inference mechanism to trace the remaining parts if only one simple
feature is detected.
After the recognition routine is performed, the functional elements
are replaced by their standard drawings.
Continuing the previous examples and applying the blocking routine to
Figure 5.16(a) and (b), we obtain the block diagrams and their
corresponding coordinate files shown in Figure 5.20(a) and (b) and
Table 5.11(a) and (b), respectively. After grouping, the contents of

Figure 5.20 Blocking Results

Table 5.11 Coordinate Files
(a) Window 1
BLOCK#   X1    Y1    X2    Y2
   1    100    19   106    23
   2    109    19   111    23
   3    113    19   117    23
   4     99    26   120    34
   5     81    30    83    54
   6    129    30   131    30
   7    133    30   133    30
   8    163    31   171   136
   9    134    34   143    54
  10    106    52   117    61
  11     94    56    96    56
  12     98    56    98    56
  13    128    58   128    58
  14    130    58   131    58
  15    133    58   135    58
  16     81    60    82    72
  17    138    60   140    88
  18      9    62    10    66
  19     19    62    21    66
  20     14    63    17    67
  21     53    63    59    68
  22     66    63    70    68
  23     62    64    64    67
  24     13    70    15    80
  25     17    70    19    90
  26     88    70   124   108
  27     51    72    72    79
  28    173    73   175    78
  29    177    73   179    77
  30    182    73   185    78
  31    191    73   195    78
  32    188    74   189    77
  33      1    75     3    75
  34     26    76    28    76
  35     38    79    46   136
  36     78    99    88   118
  37     49   103    55   108
  38     58   103    60   108
  39     62   103    66   108
  40    114   110   114   114
  41    116   110   117   114
  42    119   110   119   110
  43    102   111   112   114
  44    119   111   124   114
  45     80   120    86   120
  46     82   122    84   123

Table 5.11 (continued)
(b) Window 2
BLOCK#   X1    Y1    X2    Y2
[The Window 2 entries were printed in two side-by-side columns; the
interleaved values are not reliably recoverable from the scan.]

the third page and their corresponding images are shown in Table 5.12(a)
and (b) and Figure 5.21(a) and (b), respectively. Through corner
feature and vertical line segment detection, we obtain the new pages 1
and 2 as shown in Tables 5.13 and 5.14. After the recognition routine is
performed, the images of the corresponding functional elements are shown
in Figure 5.22 and listed in Table 5.15.
5.4.4 Denotation Recognition
After the removal of the junction dots, the connecting line
segments, and the functional elements, the binary image contains only
disjoint denotations. The denotations of a circuit diagram describe
physical properties or labels of the functional elements in a
schematic in terms of a character string. The denotation recognition
scheme is shown in Figure 5.23. The first step of denotation
recognition is to recognize all the letters, numerals, and special
characters such as Ω and µ if they exist; Ω is a resistance unit and µ is
a capacitance unit (as in µF). This step uses the same character
recognition technique given in Chapter 4. The second step is to
concatenate the characters into a string that labels a functional
element. A denotation is always concatenated horizontally and is
similar to word recognition. Thus the criteria of concatenation are
(1) the Y-coordinates of the characters are almost the same (i.e., |Δy| < 2);
(2) the X-coordinate is increasing (i.e., x_n > x_{n-1} > ... > x_2 > x_1
if n characters are concatenated);
(3) the spacing between two consecutive characters is limited to the
width of a normal character x_c (i.e., x_{i+1} - x_i ≤ x_c).
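The concatenation criteria can be applied greedily, for example (a sketch; each recognized character is reduced to an (x, y, symbol) triple, and x_c stands for the normal character width):

```python
def concatenate(chars, x_c, dy_max=2):
    """Group recognized characters into horizontal denotation strings.
    Two consecutive characters are joined when their y-coordinates differ
    by at most dy_max, x is strictly increasing, and the horizontal gap
    is at most one normal character width x_c."""
    chars = sorted(chars)                 # left-to-right by x
    strings, used = [], [False] * len(chars)
    for i, (x0, y0, s0) in enumerate(chars):
        if used[i]:
            continue
        label, x, y = s0, x0, y0
        used[i] = True
        for j in range(i + 1, len(chars)):
            if used[j]:
                continue
            xj, yj, sj = chars[j]
            if abs(yj - y) <= dy_max and 0 < xj - x <= x_c:
                label += sj
                x, y = xj, yj
                used[j] = True
        strings.append(label)
    return strings
```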

Table 5.12 Contents of Page 3 After Grouping
(a) Window 1
The Coordinates of the Circuit Components :
REC#   X1    Y1    X2    Y2   TYPE
  1    99    26   120    34   H-0
  2   163    31   171   136   V-0
  3   134    34   143    54   V-0
  4   106    52   117    61   H-0
  5    13    70    19    80   V-0
  6    88    70   124   108   V-0
  7    51    72    72    79   H-0
  8    38    79    46   136   V-0
  9    78    99    88   118   V-0
(b) Window 2
The Coordinates of the Circuit Components :
REC#   X1    Y1    X2    Y2   TYPE
  1    39   101    49   109   H-0
  2    86   101    97   108   H-0
  3   166   109   175   121   V-0
  4   200    26   207    48   V-0
  5    56    40    77    48   H-0
  6   163    46   170    56   V-0
  7    10    82    31    90   H-0
  8    58    82    79    90   H-0
  9   103    82   125    90   H-0
 10   194   115   203   122   H-0
 11    87   120    97   126   H-0
 12    39   120    49   127   H-0
 13   166   131   175   138   H-0

(a) Window 1
(b) Window 2
Figure 5.21 Possible Functional Elements

Table 5.13 Contents of Page 1 and Page 2
(a) Contents of Page 1 After Extracting Corners (Window 1)
The Coordinates of the Connection Components :
REC#   X1    Y1    X2    Y2   TYPE
  1   137    29   141    33   J-2
  2    79    55    83    59   J-4
  3   137    55   141    59   J-3
  4    39    73    43    77   J-2
  5    79    74    83    78   J-5
  6    81    30    83    32   J-6
  7   164    31   166    33   J-7
  8   138    85   140    88   J-9
  9    81    99    83   101   J-6
(b) Contents of Page 2 After Extracting Vertical Line Segments (Window 1)
The Coordinates of the Line Segments :
REC#   LINE#   POS1   POS2   TYPE
  1      30     84     98     H
  2      31    120    163     H
  3      56     83    105     H
  4      57    118    137     H
  5      75      4     12     H
  6      75     20     50     H
  7      76     73     91     H
  8      88    125    137     H
  9      99     85     92     H
 10      81     33     54     V
 11     166     34     65     V
 12     167     86    136     V
 13      81     60     72     V
 14     139     60     85     V
 15      42     79     94     V
 16      42    117    136     V
 17      81    102    115     V

Table 5.14 Contents of Page 1 and Page 2
(a) Contents of Page 1 After Extracting Corners (Window 2)
The Coordinates of the Connection Components :
REC#   X1    Y1    X2    Y2   TYPE
  1   137    24   141    28   J-1
  2   204    49   208    53   J-1
  3   137    84   141    88   J-2
  4    42    84    46    88   J-2
  5    89    84    93    88   J-2
  6   169    97   173   101   J-4
  7   130    98   134   102   J-2
  8   204    26   206    28   J-7
  9   137    42   139    44   J-9
 10   139    51   141    53   J-6
 11   205    73   207    75   J-9
 12   197    99   199   101   J-7
(b) Contents of Page 2 After Extracting Vertical Line Segments (Window 2)
The Coordinates of the Line Segments :
REC#   LINE#   POS1   POS2   TYPE
  1      26      5    203     H
  2      44      5     55     H
  3      44     78    136     H
  4      51    142    162     H
  5      51    171    216     H
  6      74    186    204     H
  7      86      5      9     H
  8      86     32     57     H
  9      86     80    103     H
 10      86    126    151     H
 11      99    186    196     H
 12     139      2     23     V
 13     206     29     48     V
 14     137     29     41     V
 15     139     54     82     V
 16     205     55     72     V
 17      44     90    100     V
 18      91     90    100     V
 19     198    102    114     V
 20      91    109    119     V
 21      45    110    119     V
 22     170    122    130     V

(a) Window 1
(b) Window 2
Figure 5.22 Functional Elements

Table 5.15 Functional Elements of Page 3
(a) Window 1
The Coordinates of the Circuit Components :
REC#   X1    Y1    X2    Y2   TYPE
  1    99    26   120    34   R-0
  2   163    65   171    85   R-1
  3   134    38   143    50   CR-1
  4   106    52   117    61   CR-0
  5    13    70    19    80   C-0
  6    88    70   124   108   AR-0
  7    51    72    72    79   R-0
  8    38    95    46   116   R-1
  9    78   116    88   122   G-0
(b) Window 2
The Coordinates of the Circuit Components :
REC#   X1    Y1    X2    Y2   TYPE
  1    39   101    49   109   C-1
  2    86   101    97   108   C-1
  3   166   109   175   121   CR-3
  4    56    40    77    48   R-0
  5   163    46   170    56   C-0
  6   146    69   189   108   AR-3
  7    10    82    31    90   R-0
  8    58    82    79    90   R-0
  9   103    82   125    90   R-0
 10   194   115   203   122   G-0
 11    87   120    97   122   G-0
 12    39   120    49   127   G-0
 13   166   131   175   138   G-0

Circuit Diagram Without Junction Dots, Connecting Lines, Functional Symbols
Figure 5.23 Denotation Recognition

The third step is to label the possible functional element. The
labeling algorithm is illustrated in Figure 5.24.
Step 1: Construct the denotation window (e.g., 200KΩ).
Step 2: Move the window horizontally and vertically (e.g., directions 1,
2, 3, and 4). If any functional element is met, record the moving
distance d_i (i = 1, 2, 3, 4).
Step 3: Compare the d_i and choose the functional element with minimum
distance as the candidate (e.g., the top resistor will be the
best candidate), store it in page 4, and mark it as
identified.
However, an ambiguity may occur, such as assigning the label of a
capacitor to a resistor. To avoid this ambiguity, the knowledge base
provides a reasonable interpretation for denotation checking. For
instance, a "K" or "Ω" inside the denotation window means that it
belongs to a resistor. The system must then skip a candidate of the
wrong type, even if it has the minimum distance, and select the next one.
Furthermore, some denotations consist of only a single character.
This situation appears with vertical-bar-shaped elements and provides
useful pin information to help identify a functional element. These
denotations are saved in page 4 until the rectangular box is
constructed.
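Steps 1 through 3 plus the knowledge-base veto can be sketched as follows. This is a simplified illustration: the four moving distances are approximated here by axis-aligned gaps between bounding boxes, only two unit-character rules are shown, and all names are hypothetical:

```python
def axis_distance(a, b):
    """Gap between two boxes (x1, y1, x2, y2); 0 if they touch or overlap."""
    dx = max(b[0] - a[2], a[0] - b[2], 0)
    dy = max(b[1] - a[3], a[1] - b[3], 0)
    return dx + dy

def required_type(text):
    """Knowledge-base style check: a unit character implies an element type.
    (Two illustrative rules only; the actual rule set would be richer.)"""
    if "K" in text or "Ω" in text:
        return "R"          # resistance denotation
    if "F" in text:
        return "C"          # capacitance denotation
    return None

def label_element(window, text, elements):
    """Pick the nearest functional element consistent with the denotation.
    elements is a list of (box, type) pairs."""
    want = required_type(text)
    ranked = sorted(elements, key=lambda e: axis_distance(window, e[0]))
    for box, etype in ranked:
        if want is None or etype == want:
            return box, etype
    return None
```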
5.4.5 Reconstruction of Rectangular Shape Elements
The rectangular shape element has been destroyed by the extraction
of horizontal connecting line segments. However, the features are
preserved in the separated files. The horizontal line segment pairs are

Figure 5.24 An Example of Labeling

in the temporary file, the blended corners are in the junction dot file,
and the vertical connecting line segments are in the line connecting
page. The reconstruction routine finds the compatible pairs and
merges them together if consistency is satisfied. The
denotation file is then searched to find all the denotations inside the
rectangular box, and the knowledge base is called for functional
identification. The next step is to pick up any denotation window
outside the rectangular box and perform the minimum distance measure for
labeling.
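The pairing step can be sketched as follows. Compatibility is taken here to mean matching x-extents within a tolerance, and consistency to mean that both vertical sides of the box are present; the names and tolerance are mine, not the thesis's:

```python
def reconstruct_boxes(h_segments, v_segments, tol=2):
    """Merge compatible horizontal line-segment pairs into rectangles.
    h_segments: [(y, x1, x2), ...] horizontal segments from the temporary
    file; v_segments: [(x, y1, y2), ...] vertical connecting segments.
    A merge is kept only if vertical segments close both sides."""
    boxes = []
    hs = sorted(h_segments)
    for i, (y1, a1, b1) in enumerate(hs):
        for y2, a2, b2 in hs[i + 1:]:
            if abs(a1 - a2) <= tol and abs(b1 - b2) <= tol:
                left = any(abs(x - a1) <= tol and yl <= y1 and yr >= y2
                           for x, yl, yr in v_segments)
                right = any(abs(x - b1) <= tol and yl <= y1 and yr >= y2
                            for x, yl, yr in v_segments)
                if left and right:
                    boxes.append((a1, y1, b1, y2))
    return boxes
```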
5.4.6 Processing of Unrecognizable Page
After the extraction of the four pages, the remaining page contains noise
and unrecognizable elements, which may be new elements or deformed
symbols. This page provides very useful information for augmenting the
system's capability. Via man/machine interaction, the system creates
new records for new elements or synonym records for existing
elements. Using Figure 5.16(a) and (b) as examples, after the
functional elements are removed and the denotation recognition is
performed, the unrecognizable pages are shown in Figure 5.25(a) and (b),
respectively.
5.5 Pictorial Manipulation Language
To accomplish the symbol interpretation task and the pictorial database
generation task, we introduce picture manipulation languages, which
consist of the symbol description language (SDL) and the picture
generation language (PGL). Both SDL and PGL are high-level languages.

(a) Window 1
(b) Window 2
Figure 5.25 Unrecognizable Pages

5.5.1 Symbol Description Language (SDL)
To describe a functional element in electronic and logic
schematics, we propose a symbol description language (SDL). The SDL is
defined in terms of syntax structure as below.
<element> ::= (<ID Code>, <Label>, <Rotation index>)
<...> ::= ...
<ID Code> ::= A | AR | C | CR | FF | L | ... | R | ...