An electro-optical image algebra processing system for automatic target recognition


Material Information

An electro-optical image algebra processing system for automatic target recognition
Physical Description:
ix, 131 leaves : ill. ; 28 cm.
Coffield, Patrick C., 1944-
Publication Date:


bibliography   ( marcgt )
theses   ( marcgt )
non-fiction   ( marcgt )


Thesis (Ph. D.)--University of Florida, 1992.
Includes bibliographical references (leaves 125-130).
Statement of Responsibility:
by Patrick C. Coffield.
General Note:
General Note:

Record Information

Source Institution:
University of Florida
Rights Management:
All applicable rights reserved by the source institution and holding location.
Resource Identifier:
aleph - 001796799
notis - AJM0512
oclc - 27483484
System ID:

Full Text







I would like to thank all my committee members for providing the

constructive criticism that helped make my research for this dissertation an

enjoyable and rewarding experience. I would like to extend a special thank you to

my advisor, Dr. Gerhard Ritter, for his patience and for teaching me the value of

thorough research. I hope that our working relationship will continue for years to


I thank the members of the Advanced Guidance Division of the Wright

Laboratory Armament Directorate responsible for allowing me to pursue the

hardware development of the theoretical ideas expressed in this dissertation.

Finally, from the days of the the Air Force Armament Laboratory, I would

like to thank LtCol. David Laborde, USAF, and Dr. Sam Lambert for allowing me

to participate in the Air Force long term full time graduate program.








1.1 The Requirement ..
1.2 The ATR Problem .
1.3 Architectural Considerations.
1.4 Outline of the Dissertation .


2.1 Image Algebra Overview . .
2.1.1 Value Sets and Point Sets .
2.1.2 Images . .
2.1.3 Operations on Images . .
2.1.4 Generalized Templates .
2.1.5 Operations Between Images and Templates
2.1.6 Operations Between Generalized Templates
2.2 Digital Optical Cellular Image Processor (DOCIP) -
A Comparison ......
2.2.1 General Description . ..
2.2.2 Binary Image Algebra . .
2.2.3 Evaluation . .


3.1 Correlator Operation for ATR .
3.1.1 Spatial Light Modulation .
3.1.2 Optical System Throughput .
3.1.3 Composite Matched Filter .
3.1.4 Post Correlation Processing .
3.2 The Logical Architecture . .

. ii

S vii

. 12

S 13
S 15
. 16
S. 17
S 17
S 21
S. 26

. 26
S 26
S. 27
S. 30



3.3 Image Algebra Process Distribution . .. 52
3.3.1 Proof of Concept . 58
3.3.2 Proof of Phase-only Filter Application . 62
3.4 Performance Evaluation . .. 65
3.4.1 Example Operation . . 65
3.4.2 Comparison of Execution with
RISC Performance ................ 68
3.4.3 Algorithm Description and Evaluation .. 69 Description . 69 Evaluation . 73 Conclusions . 74
3.5 Binary Image Operations . .. 74
3.5.1 Functional Description . .. 75
3.5.2 Image Algebra Process Distribution . .. 76
3.6 Variant Template Operations . .. 78


4.1 System Description . . 81
4.1.1 Filter Generation. . .. 84
4.1.2 Filter Implementation . 87
4.2 Image Algebra Process Distribution . .. 88
4.2.1 Morphological Operation . .. 89
4.2.2 Negative Real Value Manipulation. . .. 91
4.2.3 Binary Operation. . .. 94
4.3 Conclusions . . .. .94


5.1 Background. . . 96
5.1.1 DSP Stack Filter Approach ... 9
5.1.2 Morphological Approach . .. 100
5.2 Stack Comparator .................. 107
5.2.1 Concept Description . .. 108
5.2.2 Proof of Concept ................. 111
5.2.3 Time Complexity . . 114
5.3 System Integration and Operation . .. 116


REFERENCES .. . ............ 125




ATR component structure, input/output diagram .

Example of a 2-D rectangular real-valued image .

Example of a 2-D template . .

Example of generalized convolution substitutions .

Example of a 2-D generalized backward convolution .

Dilation (Boolean) expressed in the language of BIA and image

Optical correlator for ATR .....

Cross sectional view of optical correlator .

Ferroelectric liquid crystal cellular geometry .

Spatial light modulator . .

Data flow diagram for implementing at t ...

Pipelining the processors .. .......

Optical convolver (analog) .....

Shift, weight, and accumulate .. ......

Cellular connection of processors .. .....

Comparator enhancement .....

Example of general convolution . .

Example of additive maximum .. ......

Example of instruction stack reordering .. ....

Binary image erosion and dilation .. .....

. 6

. 18

. 20

. 24

. 25

algebra 31

S .. 34

. 36

. 38

. 39

. 44

. 45

. 46

. 49

. 50

. 51

. 66

. 67

. 72

. 77

Multiple fixed filter sets . .

Vander Lugt matched filter generation .

Stack filter 3 window median filtering .

Stack filter 3 window asymmetric median filtering .

Example of morphological median filter .

A comparator circuit ..............

Comparator linking in stack structure .

Data flow of a conceptual stack comparator .

A generalized image-template product hypercube execution .

Stack comparator implementation of asymmetric median filter

. .. 83

. .. 86

. .. 98

. .. 103

. .. 106

... 109

. .. 110

. .. 112

. .. 122

. .. 123


1. Component design requirements . .. 47

2. Image-to-image operations . .. 53

3. Characteristic functions . .. 56

4. Image-to-template operations . .. 57

5. Estimated execution time for proposed design . .. 70

6. Estimated execution time for RISC design . .. 70

7. Stack filters which preserve only constant roots .. 101

8. Stack filters which preserve only increasing roots .. 101

9. Stack filters which preserve only decreasing roots .. 102

10. Stack filters which preserve median filter roots and/or other roots 102

Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy



Patrick C. Coffield

August 1992

Chairman: Dr. Gerhard X. Ritter
Major Department: Computer and Information Sciences

The proposed electro-optical image algebra processing system is designed

specifically for image processing and other related computations. The design is a

hybridization of an optical correlator and a massively paralleled, single instruction

multiple data, processor. The architecture of the design consists of three tightly

coupled components: a spatial configuration processor (the optical analog

portion), a weighting processor (digital), and an accumulation processor (digital).

The systolic flow of data and image processing operations are directed by a control

buffer and pipelined to each of the three processing components. The image

processing operations are defined in terms of basic operations of an image algebra

developed by the University of Florida. The algebra is capable of describing all

common image-to-image transformations. The merit of this architectural design

is how it implements the natural decomposition of algebraic functions into

spatially distributed, point wise operations. The effect of this particular

decomposition allows convolution type operations to be computed strictly as a

function of the number of elements in the template (mask, filter, etc.) instead of

the number of picture elements in the image. Thus, a substantial increase in

throughput is realized.

The implementation of the proposed design may be accomplished in many

ways. While a hybrid electro-optical implementation is of primary interest, the

benefits and design issues of an all digital implementation are also discussed. The

potential utility of this architectural design lies in its ability to control a large

variety of the arithmetic and logic operations of the image algebra's generalized

matrix product. The generalized matrix product is the most powerful fundamental

operation in the algebra, thus allowing a wide range of applications. No other

known device or design has made this claim of processing speed and general

implementation of a heterogeneous image algebra.


The objective of this dissertation is to define the logical configuration of a

hybrid electro-optical system designed specifically for image processing. The

architectural design of the system integrates the operation of an optical correlator

with a massively parallel digital processor. This combined system is designed to

optimize the operations and functionality of an image algebra developed by the

University of Florida under Air Force contract F08635-84-C-0295 [61]. It is in

the further interest of the Air Force that the proper implementation of this

concept may be useful for advanced guidance systems that provide Automatic

Target Recognition (ATR). This issue is also addressed in this dissertation.

There are many processor architectures which have been designed for image

processing applications [12,13,17,29,34,40]. In general, such applications have been

implemented as a limited set of specific processing routines. All image-to-image

transformations can be fully described by the unary, binary, and matrix operations

defined in the aforementioned image algebra. No prior architectural design has

ever been developed to directly support the functionality of the image algebra.

The low-level abstraction of the proposed computational system is founded

on the mathematical principle of discrete convolutions and their geometrical

decompositions. The geometric decompositions combined with single instruction

multiple data (SIMD) array processing requires redefining specific algebraic

operations and optimizing their order of execution. The logical data flow of such

an abstraction leads to a division of operations, those defined by pointwise

operations and those defined in terms of neighborhood operations.

The intended use of the electro-optical image processing system proposed in

this dissertation is that of an integral part of the advanced guidance system for an

airborne autonomous vehicle. The sensor for this type of guidance system can be a

synthetic aperture radar or a two- or three-dimensional infrared imaging device.

The proposed system provides a common modular hardware solution to a wide

class of problems associated with these sensing devices.

Considering other applications, the implementation can range in use from

the embedded form described in this dissertation to a general use coprocessor

configuration coupled with an image processing workstation [60]. Since the design

involves such a comprehensive set of operations, it is reasonable to assume that a

general use implementation would have a wide range of commercial applications

such as medical imaging, manufacturing robotics, etc., as well as ATR.

1.1 The Requirement

The 1991 war in the Persian Gulf was the first war in which smart weapons

were used. The term "smart weapon" was used mostly in the context of pilot cued

contrast tracking. The inertially navigated unmanned cruise missile was as close

to autonomous guidance as the state-of-the-art would allow.

As a result of the success and limitations observed in this generation of
smart weapons, the projection for the next generation is that we will have more

intelligent systems that require positive target identification and quick response in

a high electronic counter measure, low-observable environment. This high threat


demands a reduction in the need for pilot cuing, preferably a complete stand-off,

launch and leave capability.

The real-time computational rates for these next generation systems are

established to be well into the gigaops, or 109 operations per second, range. In

order to realize the necessary rates, technological advances must be made using

massively parallel computational methods.

1.2 The ATR Problem

With respect to any weapon delivery system, the need for target recognition
is fundamental for correct system performance. The rationale for automatic target

recognition impacts not only the system's performance but the tactics used in

delivering the weapon; i.e., pilot acquisition not required, launch and leave.

A conventional, blast-fragmentation type munition is typically designed to
be effective against an array of different targets. Another important advantage of

recognizing the target a priori is knowing when and how to detonate the munition.

If the size of an arbitrary target and the relative location of its most vulnerable

area are known, then the munition can be detonated in an optimal sequence.

Optimally, the warhead explosion is timed to affect the fragmentation distribution

(characterizing velocity, residual mass, and direction), to maximize the kill

potential. Such a level of sophistication in a weapon system can only be

accomplished as an automated sub-system function.

The automatic recognition and subsequent track point location of an object
that is contained in a digital image sequence is accomplished in an interrelated

series of programmed events executed per image frame. These events have been

formally defined by the Automatic Target Recognizer Working Group (ATRWG).

The ATRWG is a joint U.S. Department of Defense/Industry working group

sponsored by the Defense Advanced Research Projects Agency [7].

A depiction of their interrelationships taken from ATRWG report number
87-002 is shown in Figure 1. A summary of the ATRWG definitions is as follows:

(1) Preprocessor: signal conditioning, image enhancement, and/or image

transformation. The primary objective of this procedure is to condition or convert

the raw image data for the next event, usually the points of interest locator. Noise

reduction filtering, dynamic range of image intensity via histogram modification,
and linear/nonlinear transforms that globally modify the high spatial frequencies

of an image are some of the techniques that are typically used.
(2) Interest Point Locator: geometrical or statistical. This procedure is a

basic operation that identifies locations within the image that are potential object

points. Statistical data differences or values computed globally may be used to

isolate points, or specific geometric shapes can be isolated, automatically, using

template matching techniques.
(3) Segmentor: edge based, region based, or texture based. The objective

of this procedure is to partition the image into regions that are based on consistent

information. The information is normally a refinement of the input to the Feature


(4) Feature Extractor: geometrical or statistical. The computation is a

fine grain identification procedure similar to that of the Interest Point Locator.

Specific attributes, or feature values, are determined for each segmented region
and used to form a corresponding feature vector. This syntactic event forms the

input for the Classifier whose output is a semantic event.

(5) Classifier: statistical or syntactic. In some applications, distinct

distributions can be determined for classes of feature vectors. Statistical methods

are then used to associate, with a level of confidence for objects of interest, the

feature vectors that are output from the Feature Extractor with the appropriate

distributions or classes. Other applications can use subcomponents or pattern

primitives that, in a syntactic sense, relate regions to classes. The syntactic

method is favored for its execution speed; however, it is highly dependent upon

system resolution. Both methods output the necessary class assignment or object


(6) Tracker: pixel-to-pixel or object-to-object. Depending on the

sophistication of the system, the tracker can be driven by pixel information (e.g., if

the input to the Tracker is taken directly from either the Preprocessor or the

Interest Point Locator) or by object precedence (e.g., the input comes from the

Feature Extractor or Classifier). Both tracking methods are tightly coupled to a

system's guidance law, typically a rigid-body response feedback loop designed to

minimize a tracking error solution. Note, an optical correlator tracking system is

an object-to-object tracker that receives its input directly from the Preprocessor.

The ATR problem arises when the rate at which the system must respond

to changes in kinematic conditions, called the update rate, is higher than the

information processing rate. When this situation occurs, the tracking error cannot

be minimized and in the worst case the system loses its ability to track. For this

reason, streamlined information processing, current weapon systems are reduced to

some form of pilot cuing coupled with contrast tracking.

1.3 Architectural Considerations

In order to solve the ATR problem, the next generation of weapon system

guidance must incorporate computational methods which are capable of processing









Figure 1. ATR Component Structure, Input/Output Diagram.

information at rates that are several orders of magnitude higher than the
respective update rate. The solution is a time complexity issue.
The mathematical notation used in this dissertation to evaluate time
complexity is the O-notation [1]. The following is a brief review of the
significance of the notation.
For any arbitrary function An) the asymptotic upper bound is defined in
the O-notation as Jn) = 0(g(n)), if and only if there exist two positive constants
c and no such that IjAn) < c |g(n) for all n > no. The most common
references for computing times are

0(1) < 0(logn) < O(n) < 0(nlogn) < 0(n2) < 0(n3) < < 0(2n).

Whereas O-notation is used to express an upper bound, f(n) is used to
express a lower bound. It is defined as: jn) = l(g(n)), if and only if there exist
two positive constants c and no such that I An) | > c | g(n) I for all n > n0.
If g(n) is both an upper and lower bound on An), then 8 is given as the
notation. Formally, An) = 8(9(n)), if and only if there exist positive constants cl,
c2, and no such that for all n > no, c1 I g(n) I < I n) < c2 g(n)l.
The question that arises is, how can application algorithms, as summarized
above from the ATRWG, be processed with time complexity 0(n), 0(log n), or
even 0(1)? The approach taken in this dissertation is to couple the benefit of
massively parallel, digital computation with the benefit of optical computation to
provide a lowest upper bound.
The most important consideration of any algorithm is its mathematical
structure. Two questions associated with the mathematical structure are:

(1) Is the mathematical structure optimized for time complexity and the

kind of parallelism provided by the computer in use?

(2) What type of architecture provides the necessary execution speed on

the basic operations specified by the mathematical structure?

The architectural design proposed in this dissertation provides the

necessary speed at the operator level of the algebraic abstraction. This allows the

algorithm designer to be creative at the conceptual level without first knowing how

the parallel operations are physically mapped to the underlying architecture.

Many application algorithms involving image analysis and processing are

central processing unit (CPU) or input/ouput (I/O) bound, or both; i.e., an

algorithm may be less robust due to limitations on processor or data bandwidths.

Even considering the hardware speedup due to developments in very large scale

integrated (VLSI) components and complementary metal-oxide semiconductor

(CMOS) technology, many algorithm tasks still require throughputs that are

beyond current capabilities. Parallel computer architectures have the capability to

overcome many of these bandwidth limitations; however, the communication

among processors is a physical constraint that is not easily or inexpensively


Optical systems, on the other hand, have such a high space-bandwidth

product that these limitations appear insignificant. In the past, the inherent

inaccuracies of these analog systems have limited the use of such devices almost

exclusively to correlation operations. The interest in these correlator operations is

mainly the magnitude and location of the cross-correlation of an image scene with

a reference subimage (usually an object to be identified in the input image scene).

A large signal to noise ratio can be tolerated in such a system and still provide

acceptable autocorrelation position information.

Recent advances in spatial light modulator design have reduced optical

system inaccuracies and increased their switching speeds to the point where they

are now viable for the hybrid system described in this dissertation [18]. The

approach taken in this dissertation does use the matched filter/correlator principle.

However, it is simply taking advantage of the high space-bandwidth product of the

optics to provide a fundamental convolution procedure which quickly and

efficiently translates the input image while minimizing inter-processor


The benefit from coupling the fine grain, SIMD computational system with

the optical correlator can be summarized by
(1) The I/O for any electrically connected SIMD design is a major

bottleneck in terms of system throughput. Most implementations of these designs

that are directed towards image processing reduce the effect of this condition by

fixing the size of template, or convolution type, operations over images so that an

I/O pipeline can be conveniently configured. By allowing the optical analog to

perform the necessary image translations in free-space, this restriction is

eliminated. Thus, any template configuration can be algebraically processed at a

cost proportional to the number of elements in that template.

(2) The free-space distribution, combined with the algebraic decomposition

provided in this dissertation, is a simple solution to an otherwise complex
inter-processor communication control problem.

(3) The complete arithmetic and logical control the image algebra has over

the Preprocessor event, as defined by ATRWG, provides a more efficient and

flexible optical correlation system, a proposed solution to the ATR problem.

(4) The method of sorting, defined in Chapter 5, while generating the ith

rank-order for order statistic filtering is unique. It removes the combinatorial

constraint incurred with the use of large template configurations and either stack

filter or morphological operations. The order statistic filtering operation using the

proposed design is executed at essentially the same cost function as a single

convolution operation with the same template configuration.

(5) The insight gained from the development of this hybrid optical design

will lead to further research into an all digital design for more general applications.
Also, a coupling of the SIMD approach taken here with a holographic, free-space

information distribution method, sometimes referred to as an optical crossbar

switch, will provide an even greater reduction in the switching time among


1.4 Outline of the Dissertation

A historical perspective on the development of the image algebra by the

University of Florida and its relationship, similarities and differences, to other

image algebras (most notably binary image algebra [41]) is the discussion of

Chapter 2. Chapters 3 and 4 cover the conceptual descriptions of two hybrid

optical configurations. The reason for presenting two separate configurations is

to show applicability to a high resolution, holographic film implementation as well

as a solution to the ATR problem. The first concept is a programmable spatial
light modulator, optical correlator implementation. This configuration offers

greater flexibility but at lower resolution, which is adequate for a solution to the

ATR problem. The second design configuration is based on a Vander Lugt optical

correlation principle which uses preset templates resulting in a different approach


to the image algebra integration. Chapter 5 presents a discussion on the use of

order statistic filters, with respect to the ATR problem, and their integration with

the design configuration proposed in Chapter 3. Other methods of generating the

order statistics are investigated and compared to a proposed method that operates

at or nearly the same speed as a single convolution. Finally, Chapter 6 concludes

the dissertation with a discussion on the use of these findings for further research.


Research in the field of optical computing has increased considerably over
the last decade, primarily due to the inherent parallelism provided by optical
devices. The alluring idea of complex computations performed literally at the
speed of light has been perceived as a panacea for a wide range of problems.
To give a quantitative example of the speed advantage, consider the upper
bound on the speed of electronic devices to be on the order of picoseconds (10-12
seconds); optical devices are easily three orders of magnitude higher, femtoseconds

(10'15 seconds) [5]. Parallelism in the optical device is quantified by the
space-bandwidth product (SW) of the optical system. The SW is equivalent to
the maximum number of spatial degrees of freedom which is directly proportional
to the product of the areas of the input plane and the spatial frequency plane and
inversely proportional to the square of the product of the wavelength of the light
and the focal length of the system. More formally,

SW = a A / (Af)2,

where A is the wavelength of the light, f is the focal length of the system, a is
the area of the input image, and A is the area of the spatial frequency plane [5].
Considering the digital storage required for the equivalent space-bandwidth
product, it is clear that the optical device is superior. It is for these reasons that

the Air Force devotes considerable research funds to the further development of

optical computing methods that provide solutions to the ATR problem.

The optical method reported in this dissertation as a solution to the ATR
problem is a hybrid of a massively parallel digital system and an optical correlator.

The reasons for selecting this particular optical system are the maturity of the

technology and the precision of the target identification process. Also, proprietary

sources have demonstrated the capability of storing many thousands of frequency

transformed reference images in a crystal lattice composition. The crystal is then

installed in the correlator as a multiple reference matched filter.

The preprocessing step is critical for the proper operation of an optical

correlator with a multiple reference matched filter. The raw input image may,

according to the particular application, need to be dynamically enhanced, filtered,

rotated, scaled, and/or thresholded. Operations such as these are necessary to

ensure the highest probability of autocorrelation for the system; i.e., the highest

signal to noise ratio. They also need to be generally programmable and executable

without significantly degrading the speed of the system.

Massively parallel computation at nearly the speed of light is a reality.

What remains is the proper interface to programmatically control the application.

This dissertation presents that interface and its relationship to an algebra that is

specifically designed to express image processing operations.

2.1 Image Algebra Overview

In recent years, several image algebras have been developed [29,41,67,69].

These algebras are for the most part homogeneous, having a finite number of

operations for transforming one or more elements of a single set into another

element of the same set. The mathematics of image processing, in the most

general sense, forms a heterogeneous algebra because it involves many types of

operations on and between elements of different sets. If an algebra A properly

contains subalgebras that are isomorphic to some other algebras, then A is more

powerful than these algebras; i.e., A contains all the laws and rules of these

algebras plus more [62]. The algebra we are about to discuss is a heterogeneous

algebra specifically designed for image processing and is more powerful than other

image processing algebras, such as mathematical morphology, currently discussed

in the literature in that it contains these algebras as isomorphic subalgebras [22].

In 1984 Ritter et al., under joint sponsorship of the Air Force Armament

Laboratory (AFATL) and the Defense Advanced Research Projects Agency, began

the development of a unifying mathematical theory for concepts and operations

encountered in image and signal processing. The goal was to develop a complete,

unified algebraic structure that would provide a common mathematical

environment for image processing algorithm development, optimization,

comparison, coding, and performance evaluation [57]. The outgrowth of this effort

was the AFATL image algebra, which we shall simply refer to as the image


The image algebra is a heterogeneous algebra capable of expressing all

image processing transforms in terms of its many different operations over a family

of sets, its operands [63]. It manifests itself as a theory concerned with the

interrelationship of two broad mathematical areas, point sets subsetss of

topological spaces) and value sets subsetss of algebraic structures; e.g., groups,
rings, vector spaces, lattices, etc.). The algebraic manipulation and

transformation of geometric/spatial information is the basis for the

interrelationship. The transformation of geometric data may result in geometric,

statistical, or syntactic information represented by images. The point and value
sets along with images and templates form the basic types of operands for the
algebraic structure. Operations on and between images are the natural induced
operations of the algebraic system. Operations between images and templates and
between templates and templates are based on the definition of a generalized
product [56].
Isomorphisms have been proven to exist between subalgebras of the image
algebra and the homogeneous algebras previously mentioned [61]. Even more
noteworthy, subalgebras of the image algebra have been proven to be isomorphic
with standard Linear Algebra and Minimax Algebra [22,30]. Image algebra
operations, both unary and binary, are not only capable of defining commonly used
low level transforms, but higher levels of processing as well, these include neural
network methods, feature space manipulation, and others [61]. A complete
description of the image algebra can be found in Ritter et al. [63].

2.1.1 Value Sets and Point Sets
In image algebra, value sets are synonymous with homogeneous algebras.
For a definition of an algebra, refer to [56]. An arbitrary value set will be denoted
by the symbol F. Some of the more common examples of value sets used in image

processing are the sets of integers, real numbers, complex numbers, binary

numbers of fixed length k, and extended real numbers (which include one or both
of the symbols +w and -w). These value sets are denoted by 1, R, C, I k, R+ = R
U {+}, R = R U {-w}, and R i = R+, U R respectively. The operations on
and between elements of a given value set F E {1, R, C, 1 R+., R_ ,m } are the
usual arithmetic and lattice operations associated with F [61]. For example, the

operations on the elements of the value set F, where F = R, are the arithmetic and

logic operations of addition, multiplication, and maximum, and the

complementary operations of subtraction, division, and minimum. The operations

on subsets of F are the operations of union, intersection, set subtraction, choice

function and cardinality function, denoted by U, n, \, choice and card,

respectively. The choice function returns an arbitrary element of the subset of F
while the cardinality function returns the number of elements in the subset of F

In image algebra, the notion of a point set is equivalent with that of a

topological space. Point sets most applicable to present day image processing

tasks are subsets of n-dimensional Euclidean space IRn. These point sets provide

the flexibility of defining rectangular, hexagonal, toroidal, spherical, and parabolic

discrete arrays as well as infinite subsets of Rn [62]. The letters X, Y and W

denote point sets, and elements of point sets are denoted by bold lower case letters

to represent vectors in En. In particular, if x E X and X c Rn, then x = (x,

x2, *, xn), where zi E R V i = 1, 2, -* ,n. Image algebra operations acting on
subsets of point sets are the operations of U, n, \, choice and card. The operations

acting on or between elements of point sets are the usual operations between

coordinate points; i.e., vector addition, scalar and vector multiplication, dot

product, etc. [63].

2.1.2 Images

An image is the most fundamental operand in the image algebra and is

defined in terms of value sets and point sets. Given point and value sets X and

F, respectively, an F-valued image a on X is a function a: X --* F The data

structure associated with an F-valued image a on X is the graph of the function
a, which we again denote by a; i.e.,

a= (x,a(x)): xE X}, (1)

where a(x) E F (see Figure 2).
The set X is called the set of image points of a and the range of the
function a (which is a subset of F) is the set of image values of a. The pair

(x,a(x)) is called a picture element or pixel, where x is the pizel location of the
pixel value a(x). The set of all F-valued images on X is denoted by F' [63].

2.1.3 Operations on Images
The fundamental operations on and between F-valued images are the
naturally induced operations of the algebraic structure of the value set F; in
particular, if F = R, then the induced operations on 0x are the pixel-wise binary
and unary operations of addition, multiplication, maximum, minimum, absolute
value, exponentiation, etc. These pixel-wise operations will be executed by the
weighting and accumulation processors of the proposed architecture [61].

2.1.4 Generalized Templates
A template in Ritter's image algebra is a mathematical entity that
generalizes the concepts of templates, masks, windows and structuring elements,
thereby playing an important role in expressing image transformations. Templates
provide a tool for describing image operations where each pixel value in the output
image depends on the pixel values within some predefined configuration of the
input image [49, 63].






x= (x,y)

a(x) = 0

a = { (x,a(x)): xeX, a(x)F}

If the chosen value set F is equal to IR, then any given point set X

will define a real valued image a on X, a : X-*R where x

is the pixel location and a (x) is the pixel value at x. In this particular

illustration we choose a discrete rectangular point set X C C IRI

where Z2 ={(i, ) :i, j Z}.

Figure 2. Example of a 2-D rectangular real-valued image.

^ -- b


Let X and Y be point sets and F a value set. A generalized F-valued
template t from Y to X is a function t: Y -- Fx. Thus, a template is simply
an image whose values are images; i.e., a template from Y to X is an FX-valued
image on Y. If t E (FX)Y, then for each y E Y, t(y) is an F-valued image on X.
The set of all F-valued templates is denoted by (Fx)Y. The sets Y and X are
referred to as the target domain and the range space of t, respectively. For
notational convenience, the symbol ty is defined as t = t(y). The pixel location
y at which a template ty is evaluated is called the target point of t and the
values ty(x) are called the weights of t at y [22,63] (see Figure 3).
If t is a real or complex valued template from Y to X, then the support
of ty is defined by e(ty) = {x E X: t(x) # 0}. If t is an extended real valued
template, then the following supports at infinity are also defined:

S(ty) = {x E X: ty(X) # },

(ty) = {x EX : t,(x) -},
and ,((ty) = {x E X: ty(x) # *}).

Templates are classified as either translation invariant or translation
variant. Translation invariance is an important property because it allows image
processing algorithms and transformations that make use of invariant templates to
be easily mapped to a wide variety of computer architectures [17,34,37,49].
Suppose X = Rn or X = In. A template t E (FX) is said to be translation
invariant if and only if for each triple x, y, z E X, ty(x) = ty+,(x + z).
Translation invariant templates have fixed supports and fixed weights and can be
represented pictorially in a concise manner. A template which is not translation
invariant is called translation variant, or simply a variant template. These

a ema te t

iiii:: ::Ll::.II x ii

.......... ._ _..

.... "... ;.....
....:... ... ...

ty= (x,ty(x)): xeX,ty(x)IF}

Let X and Y be point sets and F a value set.

A generalized F -valued template t from Y to X is a function

t : Y- Fx [denoted t (IFX) ]. For each target point y Y

t = t(y)

where the values t (x) are the weights of the template t at y.

In this example, Y = XC 2 is a rectangular array, and IF = IR.
The support of t at target point y is shown in shaded form.

Figure 3. Example of a 2-D template.

templates have supports that can vary in shape and weights at different target
points [63,68].

2.1.5 Operations Between Images and Templates
Several common image-template operations as defined in the image algebra
are used to describe image transformations that make use of local and global
convolutions and to change the dimensionality or size and shape of images. Such
transformations include image rotation, zooming, image reduction, masked
extractions and matrix multiplication. In an effort to generalize the definition of
basic operations between images and templates, the global reduce operation and
the generalized product between an image and a template are introduced [63].
Let X c Rn be finite, where X = {xt, x,,- *, x}. If 7 is an associative
and commutative binary operation on the value set F, then the global reduce
operation F on Fx induced by 7 is defined by

ra = r a(x) = a(x,) 7 a(x2) 7 7 a(x), (2)

where a e Fx and F is the mapping function r : F -- F.
Image-template operations are defined by combining an image and a
template with the appropriate binary operation. This is accomplished by means of

the generalized product, specifically, the generalized backward and forward
convolution operations [22,63]. Let F F, and F be three value sets, Y C Rm,and
0: F1 x F2 -- F be a binary operation. If 7 is an associative and commutative
binary operation on F, a E Fx and t E (Fx)Y, then the generalized backward

convolution operation (the generalized forward convolution operation is omitted

here) of a with t is the binary operation 0: F x (F)-- F' defined by

a t- {(y,b(y)): b(y) = r a(x) 0 t(x), y Y}. (3)

Notice that the input image a is an F -valued image on the point set X
while the output image b is an F-valued image on the point set Y. Templates
can therefore be used to transform an image with range values defined over one
point set to an image on a completely different point set with entirely different
values from the original image [22,63].
A wide variety of image-template operations can be derived by substituting
various binary operations for 7 and 0 in the definitions stated above for the
generalized backward and forward template operations (see Figure 4) [63]. There
are, however, three basic image-template operations defined in the image algebra
that are sufficient for describing most image transformations encountered in
current areas of image processing and computer vision applications. The three
basic operations that are considered in this dissertation are generalized convolution,
additive maximum, and multiplicative maximum, and are denoted by A, 0, and
respectively. The operation is a linear operation that is defined for real and
complex valued images. The other two operations, M and @, are nonlinear
operations which are applied to extended real valued images [22,63].
Let X c In be finite and Y C Rm. If a E R and t E (R)Y, then the
backward generalized convolution, see Figure 5, (the forward direction is omitted
here) is defined as

aet {(y,c(y)): c(y)= a(x). t(x), y Y}. (4)

If a E R and t E (x )Y, then the backward additive maximum operation
is defined as

a t {(y,c(y)): c(y) = V a(x)+ty(x), yE Y}, (5)

where V a(x) + t(x)= max {a(x) + t(x) : x E X}.

If a E (R ) and t E ((R 0)1), then the backward multiplicative
maximum operation is defined as

a tt {(y,c(y)): c(y) = V a(x).ty(x), yE Y). (6)

The complementary operations of additive and multiplicative minimum ( M
and ) are defined in terms of the additive and multiplicative dual of images and
templates and the image-template operations of 1 and e [63]. The relationship
between the additive maximum and minimum is given in terms of lattice duality by

a t = (t* 3 a)*,

where the image a* is defined by a*(x) = [a(x)]*, and the dual of t E (R )Y is
the template t* E (Rh)x defined by t*(y) = [ty(x)*. It follows that the additive
dual can be simply defined as tx(y) = -ty(x). A similar correspondence for the
multiplicative dual is defined by Ritter et al. [63].

a(x) t (x), yeY

Resulting Operation:

and *

and n



and AND

Spatial Convolution or Correlation [62]

Set Theoretic Dilation [66]
(where IF is a set of sets)

Morphological Dilation (additive) [62]

Multiplicative Lattice Dilation [62]

Boolean Dilation [40]

Figure 4. Examples of generalized convolution substitutions.

xe X



xe X

and O


Xe X


xe X

xe X

xe X

a( t ={ (y,b(y)): b(y)


T Mlat t b a


b = a( t ={ (y,b(y)): b(y)=a a(x)ty(x),yeY}
xe X

Let X be a finite subset of 2, Y = 22 c IR, and choose F= IR.

Suppose a Fx and t (FX), then where ever the image ty intersects

the image a, the convolved value b (y) will be located at the target point y.

Figure 5. Example of a 2-D generalized backward convolution.

2.1.6 Operations Between Generalized Templates

The basic binary operations between generalized templates are the same

pointwise operations inherent in the value set F [22]. These are the same

operations defined between images.

The image-template operations of 0, Q, and generalize to operations

between templates. When used as template-template operations, these image

algebra operations enable large invariant templates to be decomposed into smaller

invariant templates thereby providing a means for mapping image transformations

to various computer architectures [68]. This process is called template

decomposition. Combining two or more templates with the operations of 9, M

and/or Q is called template composition.

2.2 Digital Optical Cellular Image Processor (DOCIP) A Comparison

The concept of optical image processing has been an item of considerable
research [5,26]. Particular interest has been paid to optical array methods which

remove the interconnection bottleneck common to conventional semiconductor

parallel architectures [29,40,41]. The reason for evaluating the DOCIP is to show

that although this highspeed optical image processor has a limited potential for use

in an ATR problem solution, it does not have the robust image algebra operational

capability of the designs presented in this dissertation.

2.2.1 General Description

The DOCIP, presented by Huang et al., is designed to provide parallel

image processing tasks using the inherent optical parallelism and 3-D free space

interconnection capabilities [41]. Both cellular array and cellular hypercube

designs are presented in [41]. Only the hypercube design, with its higher degree of
connectablity, is of interest here since it has the better execution time complexity,

O(logN) versus O(NlogN) for the cellular array [41].
The DOCIP is intrinsically a fine grained SIMD machine where each cell
operates with low hardware complexity. The hypercube architecture is used to
manipulate structuring element locations and configurations, either 4 or 8
connectedness in the cellular automata sense, with respect to each pixel in an NxN
image. The overall design of the architecture is tightly coupled to the underlying
binary image algebra constructs.

2.2.2 Binary Image Algebra

The programming language initially developed for the DOCIP was later
established as a coherent homogeneous algebra referred to as binary image algebra
(BIA) [29,41]. BIA was another attempt at a unifying theory. Its formulation was
based on a set of three specific fundamental operations taken from the structures of
mathematical morphology [67].

Mathematical morphology is used to determine structure in a subimage by

transforming underlying modeling sets. The modeling sets are sets of points
known as structuring elements. The structuring elements, defined in the image
algebra by the support of a template, are designed in a such a way as to reveal the
spatial characteristics of objects. Morphology used as an algebra for image

processing fails as a universal image algebra due to its set-theoretic formulation
[57]. Conceptually, in a Turing machine sense, it is possible to use these
set-theoretic operations in the linear domain, transformations between different
domains, and transformations between different value sets; however, the
practicality is beyond that which is reasonable. On the other hand, the image

algebra defined by Ritter et al. does not fail in this regard. Furthermore, it
includes all morphological operations, either binary or gray scale, as a subalgebra
Formally, the binary image algebra is defined by a set S and family of
operations F over that set. The set S is further defined as the power set of a
predefined universal image W, denoted P( W) [41]. The universal image is the set

W= { (x,y) : x E In' yE in } (7)

where n = { 0, *1, 2, .., n } and n is a positive integer. Unfortunately,
this definition does not directly convey the notion that this is an algebra over
binary images. Additional constraints are added which restrict the values at each
point (x,y) to 1 (foreground) and 0 (background).
The power set P( W) can be defined simply as the set of all images such
that the picture elements (pixels) of each image belong to the value set {0, 1}.
The image algebra defined by Ritter et al. defines any image belonging to this
power set as

a = {(x,a(x)): x X }, (8)

where a(x) E {1, 0} and x is any vector belonging to the point set 1 nx In In
particular, in Ritter's algebra the set where /1 = {0, 1} and X = 1Fnx ,n'
corresponds to Huang's universal set S of Boolean images on P( W).
The family of operations F is made up of three fundamental, non 0-ary
operations: dilation, union, and complement, denoted (* is used to avoid
confusion with the image algebra notation), U, and -, respectively. A subset of

P( W) containing five images referred to as elementary images is used to derive
the necessary structuring elements needed to perform all morphological functions
of the three fundamental operations [41]. The elementary images are denoted I,
A, A', B, B:1 Huang denotes his binary image algebra symbolically by

BIA = ( P( W); *, U, -, I, A, A;' B, B-' ). (9)

Again, we must point out that this notation does not specify the algebra
completely since the power set P( W) is not the same as the set of all Boolean
images on P W).
Two of the three fundamental operations, union and complement, are basic
operations of any useful set-theoretic algebra (BIA and the image algebra defined
in [63] are no exceptions) and need no further clarification. The dilation operation,
in the strict morphological sense, is the union of all input image translations by
the points in the structuring element [67]. BIA defines it as

X R = (10)
{(x + x y+ y ) 6 W: (x ,y ) E X, (x ,y ) E R} (X 0) A (R # 0)

where X is an input image, R is a reference image derived using the three
fundamental operations in combinations with one or more of the five elementary
images, 0 is the null image, and A means logical and. Once again, the definition
does not clearly express the intent of morphological dilation nor that the result of
the operation is a binary image. The underlying mathematical ring and lattice
structures are never specified, leaving the reader to ponder the action of the binary
operator +. If in a Boolean sense, the operator represents an or operation, then

the dilation notion is reasonable. The dilation required by BIA is formulated by
collectively or-ing a sequence of translations of the input image (translated with
respect to each element in the reference image) each being and-ed with the
reference image (refer to Figure 4).
The image algebra by Ritter et al., using the concepts of value and point
sets, readily defines morphological dilation in this instance. Substituting the

global reduce operation maximum, V, and the multiplicative binary associative
operator, *, in the generalized matrix product form, the binary morphological
dilation is then defined as

at {(y,c(y)): c(y) = V a(x)-tx), y X}, (11)

where input image a E (2 )', template (BIA reference image) t E (( 2)x1, and
the point set X = inx In defined above. Figure 6, an example taken from [41],
shows the dilation operation for input and reference images. Without further
proof, it is clear that for binary images the multiplicative maximum operation

defined in [63] is equivalent to the dilation operation defined for binary image

2.2.3 Evaluation

Since the three fundamental operations defined by the binary image algebra
have equivalent definitions in the image algebra by Ritter et al., and the sets
which define any input and reference images (templates) are also equivalent, BIA
can be considered to be a subalgebra of the image algebra in [63]; that is, the
image algebra [63] can perform any operation combination of the BIA. This has
been proven rigorously by Davidson [22].

Input Image
X BIA[40]
a IA1621
00111 10
0 0 011 1 1 0
000011 1 0


X G R BIA[40]
a t -IA[62]
0 1 1 1 1 1 1

Figure 6. Dilation (Boolean) expressed in the language of BIA and image algebra.

Reference Image
R BIA[40]
0 00000 0 0
00000 00

t -IA[621
~0 ~_0 ~_0 ~_0


As we shall show, the time complexity for binary only operation of the

optical system described in Section 3.5 is 0(1). Since it can perform all the
operations of the DOCIP (those defined in the BIA) which has operating time
complexity O(logN) [41], the binary only design configuration will run faster for

roughly the same cost function, verifying the superiority of the proposed design
over the DOCIP.


Optical correlators provide not only a solution to the ATR problem, but a
method for target identification as well. The complete action of target

identification is accomplished by cross-correlating the Fourier transform of an

image scene containing a target with a transformed image of the target. This

methodology, known as matched filtering, has a computational time complexity of

order 0(1) which is necessary for high response rate systems.

The correlator design described in this chapter is an integral part of a

proposed autonomous guidance system for a munition directed to defeat

high-valued, relocatable ground targets [19]. Although the correlator will be the

primary means of automatically recognizing the target, a large portion of a

system's guidance still depends upon high throughput image processing as shown

in Figure 7. The hybrid electro-optical design described in this dissertation offers

a solution to this image processing requirement. It uses the correlator as a means

of rapidly translating the input image over an array of fine grained SIMD digital

processors that allow for the execution of the arithmetic and logic operations of the

image algebra discussed in Chapter 2. Guidance algorithms that are composed

using the image algebra are capable of satisfying the guidance requirements prior
to and during the cross-correlation portion of the guidance solution. The benefit

here is a combined system, not two separate processing activities. This equates to

lower packaging volume and weight; these requirements are critical to the overall

design of any munition guidance system.


L c

.-3 \ u o

N\ 1 \
\a 0rn
-s 2

< aS a!

\ o
a ) .


CO ; ~

3.1 Correlator Operation for ATR

The cross-correlation/matched filter operation referred to in this

dissertation is accomplished using a conceptual correlator design based on a

conventional Fourier optical system as shown in Figure 8 [55]. The input image
information can be from any sensing device that produces a digital image, e.g.,

passive Forward-Looking-InfraRed (FLIR), Laser Radar (LADAR), Synthetic

Aperture Radar (SAR), etc.

3.1.1 Spatial Light Modulation

The most critical component of the optical correlator described in this

chapter is the spatial light modulator (SLM). A spatial light modulator is a

two-dimensional array of photonic switches that modulate the intensities, phase,

or polarizations of the passing light beams. The conceptual correlator design

discussed in this dissertation requires a SLM that can provide high resolution
(minimum 128x128 pixels), 0 255 pixel gray level values, and a high switching

frequency (40KHz is used as a baseline for this evaluation). Although the

state-of-the-art does not currently provide all the requirements in a single SLM,

research shows that individually, the requirements are met and materials are

available to provide the combination of these properties in a single SLM

The most promising type of SLM that can be expected to meet the
requirements of the proposed design is a ferroelectric liquid crystal (FLC) SLM.
The liquid crystal material is chiral smectic A [43]. In a smectic liquid crystal
material, the molecules are arranged parallel to each other so that diffraction

patterns can only be obtained in a single direction. In the chiral smectic A


0 i
SJ i


\ 0

o I /

S g)~

material, the molecules are oriented collinear to an axis that is orthogonal to the

plane containing the smectic material (see Figure 9). Application of an electric

field causes the molecular director, the null direction, to tilt in linear proportion to

the strength of the applied electrical field. This is referred to as an electroclinic

effect. This electroclinic effect or switch provides gray level response at the

switching frequency [8].

Power dissipation is a primary limitation of the switching speed of most

other SLMs. The power dissipation for FLCs in general is low; in fact, it is

inversely proportional to the switching speed [43].

Smectic A devices have demonstrated 100ns switching speeds, therefore,

assuming a naive line multiplex scheme, 128 frames (one line of a 128x128 image)

x 100ns equates to a 78KHz frame switching frequency [43]. As this technology

matures, submicrosecond smectic A devices have the potential for frame switching

frequencies at MHz rates.

3.1.2 Optical System Throughput

The incoming digital image information is multiplexed to the input SLM

arranged first in the correlator. The output of a single wavelength laser is

collimated and polarized to produce a plane wave of coherent visible light. As the

polarized plane wave passes through the elements of the SLM, portions of the

plane wave are retarded in phase as shown in Figure 10. The amount of

retardation corresponds to the gray level value of the digital image at each
respective pixel location. The emerging wave is no longer perfectly planar but

rather a phase modulated wave. The phase modulated wave is then passed

through a Fourier transform lens (FTL) where the power spectrum of the Fourier

transform of the image information is formed at the focal plane of the lens.

The spontaneous polarization P of the smectic material is perpendicular to the
perpendicular to the y-z plane (tilt plane). The application of an electric field
produces a torque through a first-order interaction with P such that the
average molecular lattice director D is displaced by c4. As the plane wave
of coherent light passes through the crystal lattice, any change in 4) will
produce a phase change in the plane wave (i.e., phase modulation).

Figure 9. Ferroelectric liquid crystal cellular geometry.





_4 :^sE


A second SLM is aligned with the focal plane of the FTL. Fourier
transformed phase information for the complex conjugate of the reference image
(target) is placed on the SLM, and the emerging wave is further modulated in
phase. The effect of this second SLM is multiplicative with respect to the energy
passing through the system. Successful cross-correlation is achieved using
phase-only filters. These are known as kinoforms, i.e., not relying on diffraction
by carrier fringes for phase modulation [39]. When used as a matched filter, the
filter in the Fourier plane need only have the same space-bandwidth product as
the input image [8].
The cross-correlated information is transformed back to the spatial domain
by passing through a second FTL located at the same focal length behind the filter
which performs the inverse Fourier transform. The entire cross-correlation process
is mathematically notated as

C = 9-{9 [f(x,y)].S'[h(x,y)]}
= S-1 F(u,v) H*(u,v) },

where % is the Fourier transform operation, f(x,y) represents the input image,
F(u,v) is the Fourier transform of f(x,y), h(x,y) represents the reference or target
image, H(u,v) is the Fourier transform of h(x,y), and is the complex conjugate.
Also, note that the autocorrelation, C occurs when H*(u,v) = F*(u,v).
The cross-correlation information is now passed through an analyzer,
consisting of a single polarizer, which regulates the intensity with respect to the
amount of phase retardation. Finally, the intensities are collected on a charge
coupled device (CCD) array where they are converted back to a digital form. If
there is a match, a correlation spike is formed at the CCD array. Because the

Fourier transform is linear shift invariant, the correlation spike can occur

anywhere in the input image where there is a match. Once the target is detected,

a position update is passed to the guidance and control unit. According to the

particular guidance law being used, the system is steered in such a manner that

the correlation spike moves to either the center position for zero error, pure

pursuit; or to an offset aim point for lead pursuit, selective aim point.

3.1.3 Composite Matched Filter

Two factors, rotation and scale, induce variation in the matched filter,

correlator operation. These variations are due to the geometrical distortions

inherent in viewing a dynamically changing 3-D scene. For this correlator

configuration, these variations are handled using a technique known as composite

matched filter (CMF, defined by Horner and Caulfield [38]).

One method of dealing with the above variations is to integrate a system of

different focal length lenses with separate filters that present views of the target

over a finite number of azimuths and elevations. Although this system would

provide satisfactory results, it is far too cumbersome to implement. Due to the

linearity of the correlator as a system, the same results can be obtained by

superimposing a sufficient number of filters into one composite matched filter.

This single filter is constructed as a sum of the Fourier transformed reference

images and is notated as
CMF= w -MF,
i=l 1

where k is the total number of matched filters (MF), and w is a weighting

function for each filter that modifies the output intensities to provide a constant

correlator output for each filter. When the reference images are combined before
being transformed, the result is referred to as a synthetic discriminant function


A sufficiently large set of composite matched filters may now define an
operational volume for the correlator (refer to Figure 7). The set of composite

matched filters is continuously cycled through the respective SLM until a

correlation occurs or, in the event of no correlation, a decision is made to switch to
a lower level of image processing.

3.1.4 Post Correlation Processing

Once a correlation is made, a correlation spike is formed on the CCD array.
The location of the spike must be quickly processed to determine its relative

position to either the center null position or any particular offset aim point. This
operation is ideally suited for the design configuration presented in this chapter.
Using the characteristic function X described in Section 2.1.3, the digital values

above the background noise are set to unity while all others are set to zero. This

allows for a rapid boolean readout of the location of all Is on an array of Os.

A more elegant, yet highly theoretical, design configuration is the notion of

on focal plane array digital computation. The fine grained computing elements
necessary for a SIMD design are formed using nanoelectronics, an integrated

components technology developed by the Central Research Laboratories of Texas

Instruments, Inc. [28]. The technology is based on heterostructure fabrication
methods that integrate low level operations (e.g., addition, multiplication,

comparison, etc.) with the output of each element of the CCD array located at the

element [28]. Although the hardware complexity at this level is very low, it is

sufficient to host a subset of image algebra operators to perform a wide range of

operations (specifically, X). The benefit here is that the steering information can

be interpreted on the CCD with computational time complexity 0(1).

3.2 The Logical Architecture

The logical computer architecture described in this section is specifically

designed for the image processing tasks required of an airborne vehicle guidance

system when the vehicle is outside the correlator's operational volume (pre/post

processing, refer to Figure 7).

The proposed design is a hybrid electro-optical architecture consisting of

three tightly coupled components: a spatial configuration processor, a weighting

processor, and an accumulation processor as shown in Figure 11. These

components are based on the fact, proven in Section 3.3.1, that a general

convolution of type a 0 t, for applicable 7 and 0, is equivalent to

r (a Dt(i)) O c.,
i=1 I

where n is the number of elements in the template, a t(i) is simply a shift, and

c is the scalar value or weight of the respective template element. The flow of

data and image processing operations are directed by a control buffer and pipelined

to each of the three processing components as shown in Figure 12. Table 1

presents a component level description for the operations required of each

processing element.

The spatial configuration processor is a stepwise optical convolution of the

original input image with each template location. Figure 13 shows, conceptually,

an optical system for controlling the input of each template element. A unit value


a t(i)

ri'%lI I VVior,
(a 0t (i))oci








Figure 11. Data flow diagram for implementing a t.



1 2 3 n

WWW ... W
1 2 3 n

A A2 A *** An
A1 2 3 n

Figure 12. Pipelining the processors.

Table 1. Component design requirements.

Design Requirement

Spatial configuration

Weighting processor

Accumulation processor

Control buffer

A Fourier optical correlator design using
binary and high resolution (8-bits)
SLM technology (chiral smectic A
liquid crystal FLC) and a charged
couple device (CCD array) to provide the
output image in digital form frame
switching frequency 40 KHz.

A parallel array processor with directly
connected input from the CCD array and
the capability to load a given scalar value
to each arithmetic logic unit (ALU). A
scalar value will be loaded into the
template register. Each ALU must have a
multiplier, an adder, and a modified
comparator (Figure 15). The output image
will be the result of combining the input
image with the scalar using one of the
following operations: +; .; <; =; >;
(and); and (or) (Figure 16) [61].
A parallel array processor with directly
connected input from the weighting
processor. Each ALU will be configured
identical to those in the weighting
processor with the addition of and an
accumulator (see Figure 16). The
accumulator can be loaded directly by the
analog, no shifting and no scalar operation.
The accumulated value (video out) will be
the result of combining the two images
using one of the seven operations described

This is the host controller (e.g., a
Motorola 68040 CPU) that directs all I/O
handling and data flow. A translator or
compiler must be designed or modified to
parse the specific instructions required to
interpret the algebraic syntax and direct
the execution of the other components.








" 0





is assigned to the template element for each convolution. Thus, a since function is

formed and is inherently multiplied in the frequency domain. The result of each

convolution is simply a shift of the input image, that is, each picture element,

pixel, is shifted by the displacement of the respective template element as shown

in Figure 14. If the operation being directed by the control buffer is a pointwise

image-to-image operation, then the input image is sent directly to the array

processor controlling the accumulation operation.

The output of the spatial configuration processor is combined pointwise

using the appropriate binary associative operator from the algebraic set of

operators with the value of the respective template element as shown in Figure 15.

This weighting procedure is performed on a digital array processor using the

binary associative operators assigned to each arithmetic logic unit shown in Figure

16. The output of the weighting processor is now accumulated pointwise. Again,

this procedure is performed using a digital array processor. Once the final

template element is processed in this fashion, the accumulator memory contains

the result of the generalized matrix product for the appropriate combine and

reduce operations. Recall that in our discussion, the generalized matrix product is

the mathematical formulation for operations such as: general

convolution/correlation, additive maximum/minimum, and multiplicative


The generalized matrix operations, or template-to-image operations, along

with the direct image-to-image unary and binary operations allow for the

formulation of all common image processing tasks. Hence, this architecture will

control the execution of all the necessary set operations of the underlying image









Figure 14. Shift, weight, and accumulate.

Digital Image Input
CCD a(xi)
ControlBuffer b(Xi)


* Scalar Input
From the
Control Buffer




Control Buffer
(Video Out)

Figure 15. Cellular connection of processors.


cc ci m RH M

A II V A I v
8^ 0 0


3.3 Image Algebra Process Distribution

All image-to-image operations are performed pointwise in one-to-one

correspondence with respective pixels. Table 2 shows each operation and the

corresponding data flow or process distribution.

Characteristic functions, as defined in the image algebra, apply their

respective conditional arguments to an image and output a binary image consisting

of Is where the condition is true and Os elsewhere. By properly tailoring the

Boolean output of a comparator's circuit, each conditional argument can be

processed as shown in Figure 16. Table 3 shows the algebraic notation and the

corresponding data flow or process distribution.

The global reduce operations E V A and the boolean or transform

an image to a scalar. Since the proposed architecture does not consist of a mesh or

any interconnection among pixels, these operations are best performed by a host


The image algebra basically defines templates in two ways: invariant,

where neither the template's support nor its values change with respect to the

target point; and variant, where the above constraint is not true for either or both

sets. The intent of the proposed design is to use invariant template operations.

The reason for this is that the throughput rate is maximized. Variant template

operation is discussed in Section 3.6. Table 4 shows the algebraic notation for

image-to-template operations and their corresponding data flow or process


The image algebra operations that have been shown to distribute over the

proposed architecture represent a substantial increase in throughput and efficiency.

Table 2. Image-to-image operations.

Image Algebra


k a

Process Distribution

parallel step 1: load the value
template register and image a
weighting processor ALU;
parallel step 2: accumulate 3
= k + a(x).

parallel step 1: load the value
template register and image a
weighting processor ALU;
parallel step 2: accumulate 3
= k. a(x).

k into the
into the


k into the
into the


parallel step 1: load the accumulator with
image a and the weighting processor ALU
with image b (no shift);
parallel step 2: accumulate 3 ACCUM(x)
= a(x) + b(x).

parallel step 1: load the accumulator with
image a and the weighting processor ALU
with image -b (no shift);
parallel step 2: accumulate 3 ACCUM(x)
= a(x) + (-b(x)).

parallel step 1: load the accumulator with
image a and the weighting processor ALU
with image b (no shift);
parallel step 2: accumulate 3 ACCUM(x)
= a(x) b(x).

parallel step 1: load the accumulator with
image a and the weighting processor ALU
with image b''(no shift);
parallel step 2: accumulate 3 ACCUM(x)
= a(x) b-'(x), where ACCUM(x) = 0
when b-(x) = 0.



a b


Table 2 continued

Image Algebra Process Distribution
a and b parallel step 1: load the accumulator with
binary image a and the weighting
processor ALU with binary image b
no shift);
parallel step 2: use the boolean output of
each comparator 3 ACCUM(x) =
a(x) and b(x).
a or b parallel step 1: load the accumulator with
binary image a and the weighting
processor ALU with binary image b
(no shift);
parallel step 2: use the boolean output of
each comparator 3 ACCUM(x) =
a(x) orb(x).
parallel step 1: load the accumulator with
image a and the weighting processor ALU
with image b (no shift);
parallel step 2: accumulate using the
comparator 3 if[a(x) > b(x)], ACCUM(x)
= a(x), else ACCUM(x) = b(x).
a A b parallel step 1: load the accumulator with
image a and the weighting processor ALU
with image b (no shift);
parallel step 2: accumulate using the
comparator 3 ifa(x) < b(x) ACCUM(x)
= a(x), else ACCUM(x) = b(x).

hardware control see Figure 16.

Table 2 continued

Image Algebra

Process Distribution

parallel step 1: load the value -1 into the
template register and image a into the
parallel step 2: accumulate 3 ACCUM(x)
= -1. a(x).

load the accumulator with image a, and
mask the appropriate sign control bit for
the accumulated value.

log b

cos a, sin a, tan a

cos-la, sin-la, tan-'a

t This operation may be processed more efficiently using a high-speed sequential
processor/math accelerator provided by the control buffer. If the hardware
complexity of the physical implementation is robust, then many of these higher
order operations can be included in the parallel computation.

Table 3. Characteristic functions.

Image Algebra Process Distribution

X>b(a) parallel step 1: load the accumulator with
image a and the weighting processor ALU
with image b (no shift, if instead of image
b a scalar value m is used, simply load
template register with m);
parallel step 2: accumulate the boolean
output of each comparator for the >
condition, w.r.t. b (or m).

X <(a) parallel step 1: load the accumulator with
image a and the weighting processor ALU
with image b (no shift, if instead of image
b a scalar value m is used, simply load
template register with m);
parallel step 2: accumulate the boolean
output of each comparator for the <
condition, w.r.t. b (or m).

X>b(a) accumulate 3 ACCUM = [X< (a)]c.

Xb(a) accumulate 3 ACCUM = [X>b(a)]c.

x (a) parallel step 1: load the accumulator with
image a and the weighting processor ALU
with image b (no shift, if instead of image
b a scalar value m is used, simply load
template register with m);
parallel step 2: accumulate the boolean
output of each comparator for the =

Xi (a) accumulate 3 ACCUM = [b (a)]c.

X[m,n](a) parallel step 1: accumulate image b =
X> (a) and store in image plane register B
parallel step 2: accumulate image 3
ACCUM = X processor ALU from B (b, no shift);
parallel step 3: accumulate 3 ACCUM =

Table 4. Image-to-template operations.

Image Algebra Process Distribution
a t parallel step 1: load image a into the
analog processor;
parallel step 2: load the spatial location of
template t into the analog processor and
the value of template t into the template
register of the weighting processor;
parallel step 3: combine the output of the
analog processor with the contents of the
template register using the operator;
parallel step 4: accumulate the output of
the weighting processor using the +
operator 3 ACCUM = ACCUM +
(a Dt(i));
parallel step 5: repeat steps 2 through 4
V i E the template configuration (pipeline).

a t same as a t, except the V operation
replaces the + operation in step 4.

a t same as a t, except the A operation
replaces the + operation in step 4.

a u t same as a t, except the + operation
replaces the operation in step 3, and the
V operation replaces the + operation in
step 4

a M t same as a S t, except the + operation
replaces the operation in step 3, and the
A operation replaces the + operation in
step 4

They also represent a very rich subset of the image algebra, powerful enough to
accomplish all common image-to-image transforms [58,59].

3.3.1 Proof of Concept
Recall the following image algebra definitions: the global reduce function;
Equation 2,

r a = r a(x) = a(x,) 7 a(x,) ... 7 a(x),

the generalized vector-matrix or image-template product; Equation 3,

att (y, b(y)): b(y) = xE a(x) Oty(x), yE Y

the general convolution; Equation 4,

aet- (y, b(y)): b(y) = 1 a(x) t,(x), y E Y ,

where X and Y are coordinate sets, a, b E F, F is an arbitrary value set, 7
and 0 are binary associative operators on F, and template t E (FY)Y
In support for the theorem that follows, suppose F E { R, C, 12k ), YE 62,
and t E (FY)Y denotes a translation invariant template with card( '(ty)) = n.
For some arbitrary point y E X, let { x (y), x (y),. *, (y) } = G(ty)
and for i = 1, 2,.. ,n, set ci = t(xi(y)).

For parameters {1, 2,- .,n}, define an F-valued parameterized template

1, ifx = x i(Y)
t(i)y(x) = o r
( 0, otherwise

Note that e(t(i)y) = { xi(y) } for i = 1, 2,. ., n.


a t = t (aet(i)) ci.


PROOF: In order to simplify notation we let ai = a 0 t(i) and show that
a9t = ai ci.

First note that

ai(y) =

-= a(x) t(i)y(x)
xE (t(i)y)

= a(x) t(i),(x)
xE xi}

= a(xi) t(i)y(xi)

= a(xi).

a(x) t(i)yx)

Thus, ai is simply a shift of image a, with a(y) being replaced by a(xi).
Let b = a $t and d = ai ci. In order to prove the theorem, we
need to show that b(y) = d(y), V y E X.

By definition, b(y) = a(x) ty(x)

= a(x) tx)
xe (ty)

= a(x) t,(x)
xE{xi, X2,' Xn

= a(xi) ty(xi)

i=la(xi) i

= ai(y) ci
i= 1

= d(y)


As an easy consequence of this theorem, the following corollary can be


Corollary I:

(1) If E { } and F
value set for the operation 0, then
a)t = P

(2) If E { }and F

value set for the operation C, then
ast = r

denotes the appropriate extended real

(a 0t(i)) + ci


denotes the appropriate extended real

(a t(i)) ci


PxOOF: The proof is identical to the proof of Theorem 1 with the

appropriate support at infinity replacing 1(ty)


Theorem 1 and Corollary I show that the general convolution operator

can be implemented by using simple shifts, global pixel-wise multiplication (or

addition) using the scalar ci, and accumulating the results. Figures 17 and 18 in

Section 3.4 illustrate this observation. We conclude this section with another

corollary of Theorem 1.

Corollary II: If is one of the operations of Theorem 1 or Corollary I, and

y = xj is an element of the support of ty, then

a t=(aOc ) 7 [ (a t(i))Oci .
i Ij


PROOF: Since t(j) is the identity template, aj = a 0 t(j) = a. Therefore,
a t(j) O cj = a 0 cj. The result now follows from the commutativity of 7.
Corollary II is important in that it saves one convolution if the condition y = xj
is satisfied.

3.3.2 Proof of Phase Only Filter Application
Phase only filter application is of special interest when spatial light
modulators are used to provide matched filtering in optical correlation. A SLM
can only provide amplitude or phase information unlike a holographic film plate
which can provide both simultaneously [39].
Aside from the correlation operation, the main objective of the optical
processor described in this dissertation is to simply shift the image. We must
show that the phase only operation of the SLM will be sufficient. In order to prove
sufficiency, let

{x1(y), (y),..., (y)}= (t).

For each k = 1, 2, *, n, there exists a unique vector

z(k) = (z,(k), z2(k)) E 12,

such that xk(y) + z(k) = y. Since t is translation invariant, then for any
arbitrary vector p E 12, xk(y + p) = xk(y) + p, where

{xt(y+p), x2(y+p),..., Xn(y+p)} = O(typ).


xk(y+p) + s(k) = (xk(y) + z(k)) + p = y + p. (16)

Equation 16 simply says that z(k) is not affected by translations.
For each k = 1, 2,- n, consider the unit impulse response function

(1 if x= z(k)
0 Oif x z(k)

Then 5k can be used to provide another representation of the one-point template
t(k) by observing that 6k(y-x) = t(k),(x), V y, x E 12. This follows from the
fact that

t1 if x= z(k)
0 if x # z(k)

and the observation that y x = z(k) t- x = xk(y).
Next, recall from the proof of Theorem 1 that if ak = a $ t(k), then ak(y)
= a(xk(y)). However, we also have that

a(y) k(y) = 1 a(x) (y-x) = a(xk(y)), (17)

where 0 denotes the convolution product. Therefore,

a k = a t(k)


or, equivalently,

ak = a 6. (19)

In the frequency domain we have, according to the convolution theorem [36], the
equivalent formulation

S{ak} = S{a 6} = S{a} -Sa{k}, (20)

where a denotes the Fourier transform operation.
Employing the notation 8k = {6k}, we have that

M-1 N--
k(u,v) = 1 6 k(z') e-2i(xu/M + yv/W) (21)
z=o y=o

-27ri(z(k)u/M + z2(k)v/N)

for all (u,v) E j2. Thus, ik has constant magnitude I (u,v) = 1 and phase

O(u,v) = e-2ri(zl(k)u/M + z2(k)v/N); i.e., only phase is a factor in the
multiplication of S{a} in Equation 20.

ak(y) = a(xk(y)) = a(y-z(k)), (22)

it also follows from Equations 19, 20, and 21 that

a(y-z(k)) i(uv) e-2i(zl(k)u/M + z2(k)v/W), (23)

where the double arrow is used to indicate the correspondence between a function
and its Fourier transform. Equation 23 expresses the well-known translation
property of the Fourier transform [36]. Therefore, multiplying the Fourier
transform of the image a by the exponential equivalent of 6 and then taking
the inverse Fourier transform, shifts the image by the vector z(k), that is, it
moves the origin of the image to (zl(k),z2(k)). Equation 23 also expresses the fact
that a shift of the image does not affect the magnitude of its Fourier transform.
As a final observation, note that Equations 18, 22, and 23 imply that

a t(k) i. k;

i.e., shifting the image in the spatial domain with the use of the image algebra
operation a t(k) is equivalent to the product a* k in the frequency domain.

3.4 Performance Evaluation

3.4.1 Example Operation
Figures 17 and 18 illustrate the processing steps involved in performing a
general convolution and an additive maximum, using a Von Neumann type
template, respectively. Notice however, that by proper selection of operation
pairing (i.e.; with A, and + with A) multiplicative minimum and additive
minimum are produced in the same fashion.

0 0 0 0 01
I0 0ol
10 01
0 o 0 I

a 0t(1)

0 0o 0 0
0o I
lo ooo

ACC = ACC + W2
00 0 000
0 6126 01
0131710 0O
0 6 61601
0 2 0 0 01

a ( t(4)

0 o0
0 0 0 0 0O
10 0 0 0 01


a t
0 814168
23 28 0
0 2 0 0 01

ACC= W1=a t(1)x 3
0I 0o o001
0 6126 0l
0996 01
0 0 01201
0 0 0 0 0l

a $ t(3)

0 0 000
0 0
0 0 0 8
100 1o
10 0 0 0

v = a @t(4)x 2
1 048401
000 8 01
0 2 0 0 01


t(1)= *

ACC =0

t(2)= 1 t(4)= 1

t(3)= [BM t(5)= ]

a 0 t(2)

0 0 0 0 1

0 06 4
0 01 0
0 2 0 0 o
a =(a t(3) X 2

0 0 0 0 8O

ACC = ACC + W4
0 4 8 4 0
0 8 6168
10328 80
0 2 0 0

S= a D t(2)x 2
10 0 0 0 0
0 8 4 00o

0 0 0 08


0 6161441
0 6 61681
0 2 0 01

a ( t(5)
Iooooo .
0 0 0 0 0
0 0oooo

.0 0 0 0
0~ 0 0 0 0l

v= a ( t(5) x 2

0 080 0
10 0 0 0 0


Figure 17. Example of general convolution.

0 814168



|0 0 0 0 0
00000 2 '

oI : :

a t
S5 6 6 6 4

a ] t(1) ACC= W1=a 0 t(1)+
00000 13333 31
IoooooI 3333"3
0l o0 3575 31
60 3 366531
10 0| 3 3 3 7 3
10 033 1 34333
0 000 3333 3

333331 000000|I
lassas! looooo
357531 100
36653 00
355731 0 1
34363 00 0
333331 10 0 0

a 6 t(4) v=4a D t(4)+ 2
0 124642-
I 1255421
0 0ol 1222621
S o l000 23222
1000001 1222221
0 0 0 0 2 222 2

3 a ( t(2)

0 0
0I0 01

v=a ( t(3)+ 2
22 2 2 6

3464 3 4
3 5 5 7 64

%= a et(5)+ 2

12 2 2221


Figure 18. Example of additive maximum.

v = a t(2)+ 2

133333 3

a t(5)
00-0 00O



t(l)= M

ACC =0

t(2) t(4) =
t(3)= EM t(5)= T]

3.4.2 Comparison of Execution with RISC Performance

There are currently two limiting factors concerning the proposed

architectural design. The major limitation is the switching speed of the spatial
light modulators. The best SLMs in the research community today will operate at

a 10-40 KHz switching frequency with a potential for MHz [5]. This is well below

the expected operating speed of the remaining digital components, a common 25

MHz clock frequency. The other limitation is the memory I/O speed which will be

necessary to load in parallel the SLMs and the digital array processors. High speed
circuits such as cross-bar switching, block memory transfer, and column parallel

loading networks do exist for this application, however, the physical connections
become a problem as the number of elements in the SLM increases. Faster optical

addressing techniques which tend to reduce this limitation are being developed

[20]. The 40 KHz switching frequency will be the anticipated limiting factor for

the rest of the discussion. It will determine the time spent in each section of the
pipeline (refer to Figure 12).
Since the proposed design is dependent upon the size of the template
configuration and independent of image size, the following equation can be used to

estimate the computation time for a convolution type operation:

time = (n + 1). (24)

where n is the number of points in the template support, and clock corresponds
to 40 KHz. The addition is the last stage of the pipeline, the beginning of the

pipeline bypasses the analog processor as noted at the end of Section 3.3.1. Table

5 show the results for 5 and 25 element templates.

The following equation can be used to estimate the computation time for a

naive calculation of a template convolution over an NxN image using a Reduced

Instruction Set sequential Computer (RISC):

time = NZ(2n-1) (25)

where n is the number of points in the template support (n multiplications plus

n-1 additions), the 2 in the denominator represents memory interleaving, clock is

the operating frequency (25 MHz), and 1 clock cycle per operation. Table 6 shows

the results for 5 and 25 elements over various images sizes.

3.4.3 Algorithm Description and Evaluation

The max-min sharpening transform is selected as a representative

algorithm. Such an algorithm is typically used as a preprocessing filter for the

subsequent correlation action. The transform is an image enhancement technique

that brings fuzzy gray level objects into sharp focus. This is an iterative technique

that compares the maximum and minimum gray level values of pixels contained in

a small neighborhood about each pixel in the image. The results of the comparison

establishes the respective pixel output. The benefit of its use as a preprocessing

step is that a sharper image, when presented to the correlator, results in a higher

signal to noise ratio on the correlation plane. Thus, such an image has a higher

probability of identification. Description. The image algebra formulation is defined as follows:

Let a E eR be the input image, and N(x) a predetermined

Table 5. Estimated execution time for proposed design.

Number of elements


1.5 x 10-04
6.5 x 10-04

Table 6. Estimated execution time for RISC design.

Number of elements

Time(sec.) per image size

7.4 x 10-4




neighborhood function of x E ,I and t: I'2 -- I be defined by

t0 if x N(x)
-m otherwise

The output image s is the result of the max-min sharpening transform
given by the following algorithm:

aM:= a3t

am := a t

b:= aM + am -2 a

S := X<0(b) aM + X>0(b) am,

where t is the additive dual of t. For this specific implementation, both the

input and the output images are defined on the point set X which corresponds to
the coordinate points of the CCD array. Therefore, points which lie outside X

are of no computational significance. Also, the support of t and t are

equivalent; i.e., t,(x) = ty(x), V x E <(ty). The algorithm is now iterated either

a predetermined fixed number of times or until s stabilizes, that is, until there is
no change in gray value from one iteration to the next.
Figure 19 depicts a typical input stream for the instruction to execute the
above algorithm. An instruction stack in prefix notation is used to illustrate the
instruction stream. This is a simplification of what would normally take place in a








NOTE: Image Planes are formed, point wise, using the
local memories at each accumulation processor
(refer to Figure 15).

Figure 19. Example of instruction stack reordering.

Store Result
in Image Plane
--- C

Store Result
in Image Plane
- D


>+ 0



Point wise

-- B

--. E








series of translations from the high level algebraic abstraction to the machine

instruction level.

After an initial scan of the stack, the stack operations can be separated into

general matrix production type operations and those that are strictly pointwise.

By sequentially ordering these classes of operations (general matrix operations

first) and further noting the number of repetitive operation substrings, the stack

can be reconfigured (refer to Figure 19) to execute with a minimal number of clock

cycles per operation. Evaluation. Let the switching frequency of the SLMs in the

correlator operate at 40KHz, use 10 clock cycles as an average for each digital

operation in both the weighting and accumulation processors, and use a common

25MHz clock frequency. Then, assuming the card(t) = 5 and using Equation 24,

the time to compute the two operations l and 19 is 0.0003 seconds. The

remaining eight pointwise operations are executed in 3.36x10-06 seconds. The

total time is now 0.00030336 seconds. Note, if the assumption of 10 clock cycles

per operation is raised to 100, the total time of execution becomes 0.0003336

seconds. This is a result of the 40KHz switching frequency which is much lower

than the common clock.

A comparison with a sequential RISC type machine is made in a similar

manner to the comparison in the previous section. An arbitrary image size of

256x256 pixels is used to facilitate the computation. Assuming once again a 25

MHz common clock and one operation per clock cycle, Equation 25 is used to

evaluate the timing for the two general matrix operations. The total time for both

operations is 0.0236 seconds. The remaining operations are evaluated by

modifying Equation 25, where the 2n-1 parameter is replaced by the number of

operations, in this case 8. The time for these eight operations (0.0105 seconds)

plus the general matrix operations equals a total execution time of 0.0341 seconds. Conclusions. The gain in performance, or the speedup, is measured

as the ratio of the best sequential execution time to the parallel execution time.

For this comparison the configuration design estimate to the RISC estimate is

approximately 112 to 1.

A more important measure is based on the requirement of the sensor. For

most applications the minimum frame rate for a system to update a guidance law

is 30 Hz; other higher performance systems are at a 100 Hz rate. Considering

the simplicity of this algorithm and the fact that it represents only a small portion

of what must be computed within a required frame rate. The comparison just

given makes an important point; the algorithm cannot be executed once within the

minimum frame rate of 30 Hz using sequential RISC technology. On the other

hand, the algorithm can be executed on the proposed design approximately 98

times per frame. Clearly there exists a substantial increase in throughput for the

proposed design. The proposed design just breaks the speed barrier necessary to

make it functional in an ATR system. The interesting speculation is how the

throughput would increase with better performance from the SLMs.

3.5 Binary Image Operation

The hybrid electro-optical system described thus far is also usable as a

binary image correlator/processor. The primary reason for doing so is to achieve a

significant increase in the processor throughput. The entire template configuration

is convolved using the optical process, thus eliminating any iterative solution. The

computational time complexity in this case is 0(1). The disadvantage is if the

input image information is restricted to binary images, then more sophisticated
algorithms that turn binary images into gray scale images cannot be utilized

efficiently. With regard to the ATR problem for specific applications that require

very high speed image processing, the advantage may outweigh the disadvantage.

3.5.1 Functional Description

The physical layout of the correlator may remain unchanged from the

description in Section 3.1 depending upon how the images obtained by the sensor

are used. The only change to the correlator is a replacement of the first 8-bit,

SLM with a two state or binary SLM. In this case the sensing device establishes

the necessary threshold to provide the binary input image. Otherwise, an initial

gray level image is processed by the proposed design allowing the X>0 operation

to provide the binary image which, in turn, is fed back to the first SLM. Again,

depending on the requirement, replacing the SLM will provide a higher


With the assumption that a binary image is passed as the input image, and

the weights in any arbitrary template configuration are all set to 1, then the phase

information for the complex conjugate of the entire template configuration is

placed on the second SLM and processed (refer to Figure 8). The intensity of the

energy emerging from the analyzer contains the results of the convolution. At this

time the convolved information is transformed by the CCD array into a gray level

image. The maximum gray level value is directly proportional to the number of

times 1 values overlapped in the convolution process, and it is bounded by the

total number of non zero elements in the template. Figure 20 depicts this


3.5.2 Image Algebra Process Distribution

The simplicity of operation of this design configuration stems from the way

the morphological operations are derived. In [63] Ritter et al. shows that a

dilation of a binary image a with a structuring element b is equivalent to a t,

where b = e(ty) and t,(x) = 1 for all x E et(ty). Similarly, a binary erosion is

equivalent to a a t. By using the convolution operator A, Davidson [22] proves

that the morphological operations of dilation and erosion for a binary image, a, are

derived by the following expressions:

dilation X>0(a t), (26)

a = =n(a t), (27)

where n = the number of elements in Y(ty) and a, adilation, and a are
Sdilation erosion
Boolean images.

The Boolean operations or and and are used interchangeably with the

set-theoretic operations union and intersection. The remaining set-theoretic

operation necessary for the binary image algebra defined by Huang et al. is the

complement operation. As notated in Table 2, this operation is easily

accomplished in hardware using a flip-flop at the appropriate circuit location, as

shown in Figure 14.

The operations just described are only valid when the weights of the

template are all set to unity. If template weights other than 1 are required, then

the convolution operations can be accomplished in the iterative manner as

described in Section 3.3. The use of a census template is an example of this

situation. Here the weights of the template are distinct powers of a prime number.






a t) =


(aOt) =

Figure 20. Binary image erosion and dilation.



When the template is convolved with a binary image the result is a unique number

for each distinct pattern of Os and Is [22]. This is a fundamental step in most

algorithms used for labelling objects in binary images.

The conclusion to draw here is that by allowing the manipulation of gray

level values formed by the convolution of binary images or binary image and gray

level template, Huang's binary image algebra described in Chapter 2 is extended

[41,67]. The extension is for both the set of values and the number of operations

over the set; the set {1,0} is replaced by I c R, and the characteristic operation X

and general convolution operation are included to provide erosion and dilation

operations more directly.

3.6 Variant Template Operations

Referring to Section 2.1.4, the definition of a variant template is one whose

configuration and/or values are allowed to vary with respect to the reference point

location. The Fourier and Gabor templates derived from the kernels of the

discrete Fourier and Gabor transforms are examples of variant templates. The

implementation of algorithms involving variant template operations using the

conceptual system described in Section 3.2 can result in significant reductions in

system throughput for some specific variant templates.

The generalized matrix product relative to parallel architecture

communication, as shown by Wilson [74], unifies the three concepts of

many-to-one, one-to-many, and one-to-one data transfers; i.e., reduction,

replication, and permutation respectively. The first two concepts are used to

explain the throughput differential.

The image-to-template operations of the proposed system as defined in
Table 4, are intended for invariant template operations, and can be interpreted as

the generalized left product of image a with template t. This is defined for all a,
b E F and t E (F)'x such that for any reference point y E X

b(y)= r (a(x) t,(y)).

In short, the image is shifted with respect to the location of each template element
in the support of t; such that, for each shift, each image pixel value is combined
under the O operation with the corresponding template value. This, according to
Wilson, is a one-to-many data transfer, or replication, operation [74]. As noted

earlier, the time complexity is O(N) where N = card(t); i.e., card(a) parallel
operations repeated N times.
The definition of the variant template is dependent on the reference point's
location relative to each point in the image, therefore, for the proposed system
only the generalized right product is interpreted. That is, for each reference point


b(y)= r (a(x)Oty(x)).

In this case the template is shifted with respect to the location of each pixel in the
image a such that, for each shift, each variant of the template description is
combined under the O operation with the corresponding pixel values. In other
words, a variant template is loaded as an image and the image pixel value
corresponding to this particular template is loaded in the template register (refer

to Figure 13). This, according to Wilson, is a many-to-one data transfer, or

reduction, operation [74]. The time complexity is O(N), where in this case
however, N = card(a); i.e., card( (ty)) parallel operations repeated N times
where card( (ty)) varies as y varies. This time difference is significant, two or
three orders of magnitude in most applications.
There are methods that can be applied to the variant templates themselves
that tend to increase the throughput. Gader has shown template decomposition
and localization methods for mesh architectures that result in faster computation
[30,31]. Also, Manseur [50] has proven that for certain variant template
definitions (e.g.; t, column rank one) there exists an invariant vertical template t1
and a (possibly variant) horizontal template t2 such that t = t1 t2. Thus, the
overall problem can be reduced to a linear combination of templates where at least
a portion of the computation can be executed at a higher rate.
Although the methods just mentioned result in an increase in throughput,
the conclusion for their use remains with the specific implementation. Assuming
the implementation is still the correlator interface for ATR; if the frame rate
(update rate) of the system is high (above 30 Hz), then the use of variant
templates in any acquisition algorithms should be avoided.


The hybrid electro-optical system described here is also based on the notion

of integrating Fourier optical correlation and digital image processing. The

imaging media, holographic film, makes this implementation unsuitable as a

solution to the ATR problem defined in Chapter 2. It does, however, have direct

application for many image processing tasks that use this high resolution imaging

media; e.g., photo reconnaissance, industrial robotics, blood analysis, etc.

Moreover, the reason for including it in this dissertation is that the image algebra

interface with a Vander Lugt optical system is noteworthy in general. The

interface affords additional methods for feature extraction which complement the

correlation function.

4.1 System Description

The difference between this configuration and the one described in Chapter

3 is the imaging media, spatial light modulators are replaced by holographic film

for both image and matched filter inputs. Also, the holographic film is capable of

storing both the magnitude and phase of the Fourier transform. This contrasts the

phase-only SLM implementation. The method of producing a matched filter on

holographic film is based on the work of Vander Lugt [72], hence the correlator

implementation is commonly referred to as a Vander Lugt system.

The advantages of using this method are: the capability to statistically and

repeatedly examine large numbers of images for the presence of specific

information content, the resolution of the film is higher than that of the SLM, and

the overall system cost is lower. The disadvantages are that the template

configuration and values must be known a priori, thereby eliminating program

controlled dynamic reconfiguration; the procedure for exchanging filters is

mechanical; and, the space bandwidth product tends to be large. The latter results

in an increase in the physical size of lenses, film plates, etc.

The image algebra algorithm suite for the kinds of processing tasks

mentioned can be executed using a limited number of predefined templates, or

filters. Predefined templates will allow for a system of selectable multiple space

and frequency multiplexed fixed filter sets similar to the one proposed by Casasent

and Schaefer [14,15]. The technique involves a finite number of templates stored

on film and multiplexed using a selectable laser diodes array (see Figure 21). By

selecting the proper diode, the focal point on the holographic film plane is

translated to the desired template and convolved in the manner described in

Chapter 3. The necessary composite matched filters or synthetic discriminant

functions are included on the film containing the templates [13,44]. The remaining

digital circuitry is the same as that defined in Section 3.2 with the only exception

being the absence of any feedback to the input image. This proposed system

configuration allows for some degree of template programmability and eliminates

any mechanical exchange action.

Typically, the number of non zero template elements for template

configurations used in image algebra operations tends to be small compared to the

number of non zero elements used by matched filter correlators. Therefore, the





\ 7XIQ7


space bandwidth product that is necessary for cross correlation is sufficient for the
convolution type operations of the image algebra.

4.1.1 Filter Generation
The 2-dimensional Fourier transform of a continuous function f(x,y) is
defined as

S{f(x,y)} = F(u,v) = ff f(x,y) e-2i xu+ y dx dy,
{i= F}

and in a similar manner the function h(x,y). The convolution and correlation of

f(x,y) and h(x,y) are further defined as

f(x,y) h(x,y) = F(u,v) H(u,v) and
f(x,y) O h(x,y) = F(u,v) H*(u,v),

where denotes the convolution, O denotes correlation, and represents the
complex conjugate [36].
If the reference image, h(x,y), is placed on a transparent film and
illuminated by a plane wave of coherent light, then its Fourier transform is formed
at the focal plane using a simple lens. However, recording film located at the focal
plane can only record the power spectrum of the Fourier transform [33]. The
Fourier transform, as opposed to the phase-only operation previously described, is
needed in this system to remove or attenuate specific frequency components in the
image. Therefore, a complex filter H(u,v) is needed to perform the necessary

filtering operations. Two pieces of information are required to produce such a
filter, namely the real and the imaginary parts of the waveform.
Vander Lugt proposed a heterodyning technique that incorporates
holographic film to generate and store the necessary complex waveform [72]. A
simple lens is used to transform the reference image that corresponds to the
template. The Fourier transform of the reference image, S{h(x,y)}, is brought to
focus on a photographic film plane. A plane wave is directed at the film plane at
an angle 0 and mixed with the transform at the film plane as shown in Figure 22
[12]. The resultant pattern caused by the constructive interference of the two
fields is

Ha(u,v) = H(u,v) + A e2iav,

where a =

is referred to as the spatial carrier frequency, A the wavelength of the light, and
A and e2irav are the amplitude and phase of the reference wave, respectively.
For most applications, A is of little interest and is set to unity. The film records
not the field itself but rather the square magnitude of the field, thus

Ha(u,v) = 1 + IH(u,v) 2

+ H(u,v) e2i7av + H*(u,v) e-2i'av

contains a constant, the power spectral density, and two terms due to the presence
of the spatial carrier frequency. The two spatially modulated terms contain the
Fourier transform information [72].


if :

Both the magnitude and phase of the Fourier transform are co-located on
the holographic film; typically a dichromated gelatin (DCG) which is a high
quality material for recording matched filters [10,64]. Computer generated
holograms written to bleached silver halide plates is another recording method

4.1.2 Filter Implementation
The film can now be placed at the Fourier transform lens focal plane and
illuminated with the Fourier transform of the test image, S{f(x,y)}. This
corresponds to the filter bank location as shown in Figure 21. The product of the
film transmittance and the illuminating Fourier transform is the function G(u,v),

G(u,v) = F(u,v) H (u,v)

= F(u,v) + F(u,v) |H(u,v)12

+ F(u,v) H(u,v) e2i7av

+ F(u,v) H*(u,v) e-2i av

Another lens, L2 in Figure 21, is used as an inverse Fourier transform of this
product to simultaneously obtain the convolution and correlation of the two
images [72], denoted by

g(x,y) = f(x,y) + [f(x,y) h(x,y)] h*(x,y)

+ f(x,y) h(x,y) 6(x,y+a)

+ f(x,y) h*(x,y) 6(x,y-a),

where 6(x,y-a) is simply a translation function and h*(x,y) = h(-x,-y).

4.2 Image Algebra Process Distribution

The distribution of image algebra operations for this implementation differs
significantly from the implementation defined in Chapter 3. The shifting
operation using the Fourier optics and a since function at the filter location is still
the same with the exception that the since function has non-unit amplitude. The
presence of the amplitude in the since function eliminates the need for the weighting
processor section of the digital side of the hybrid configuration. This allows the
complete template configuration to be included in the filter definition. Therefore,
general convolution operations are completed in one iteration. Since the
magnitude of the filter, which corresponds to the value of the template, is
inherently multiplied by the optical convolution properties, the morphological
convolution operations are restricted to the multiplicative maximum and
minimum as defined in Chapter 2. The accumulation processor section remains
the same with the exception that the template register is input directly, refer to
Figure 13. All pointwise operations are accomplished in the same manner as
described in Tables 2 and 3.

4.2.1 Morphological Operation
The objective in this section is to define an expression which allows the
multiplicative nonlinear convolution operations to be expressed in terms of linear
convolution operations. The expression is first established for the multiplicative
maximum, denoted as 0. Once the expression is defined for 0, the nonlinear
operation for multiplicative minimum a will automatically follow.
In support for the theorem that follows, suppose F { Rul, I U {.} } and

t E (F')x denotes a translation invariant template with card( #(ty)) = n.
For some arbitrary point y E X, let { x x x } = Y(t ) and for
i= 1, 2,...,n, set ci = t(xi).
Define a parameterized template s(i) from X to X with defined
parameter i E {1, 2,-..,n} defined by

ci, if x = xi
s(i),(x) =
0, otherwise.

Note that 1(s(i)y) = { xi } for i = 1, 2,- *, n.

THEOIEM 2: a t = V a s(i) (28)

PEOOF: For i = 1, 2, ..- n, let ai = a 0 s(i). Thus, for an arbitrary point y E X,
we have

a(y) = a(x) s(i),(x)

ai(y) = a(xi) ci


Now if b = a t, then for an arbitrary point y E X we have, by the definition of
the convolution operator that

b(y) = V a(x) t(x)

= V a(x).t(x)
xe (ty)

V a(x) t(x)

= V a(x) tx)
i= 1

= V a(x) *ci

xE 0(s(i))
XE QY(s(i) )

since W(s(i)y) = {xij. This shows that

a(x) s(i) (x)

= V a(y) (by Equation 29).

n n
Therefore, a t = V ai= V as(i)
i=1 i=l

As a corollary to THEOREM 2, the multiplicative minimum a is simply

aOt = A ass(i). (30)

PaOOF: The proof is identical to the proof of THEOREM 2 with the appropriate

global reduce function and corresponding induced operation


4.2.2 Negative Real Value Manipulation.

Some image processing algorithms require template definitions to contain

negative values. Edge masks using finite differences are examples of such

templates. The Vander Lugt optical system described above uses only positive

real values. The following method is presented as a solution to this condition.

Using the relationship previously developed for *, one can think of the

notion of an additional subjunction of positive and negative values which when

properly combined fulfill the stated requirement. Specifically, the generalized