COMPUTER-GENERATED HOLOGRAPHIC MATCHED FILTERS
By
STEVEN FRANK BUTLER
A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL
OF THE UNIVERSITY OF FLORIDA IN
PARTIAL FULFILLMENT OF THE REQUIREMENTS
FOR THE DEGREE OF DOCTOR OF PHILOSOPHY
UNIVERSITY OF FLORIDA
1985
ACKNOWLEDGMENTS
The author wishes to thank Dr. Henry Register and Mr. Jim
Kirkpatrick for their encouragement to continue graduate studies at
the University of Florida. Dr. Roland Anderson has tirelessly
provided counseling and guidance during the years of study,
experimentation, and writing. Dr. Ron Jones of the University of
North Carolina assisted greatly with the understanding of film
nonlinearity. Dr. S.S. Ballard provided the scholastic background and
the interest in optics throughout the author's scholastic career at
the University of Florida. The Air Force Office of Scientific
Research and the Air Force Armament Laboratory funded the laboratory
support for this effort. The University of Florida provided academic
and administrative support for the author's entire period of graduate
studies.
TABLE OF CONTENTS
Page
ACKNOWLEDGMENTS.................................................. ii
LIST OF TABLES...................................................
LIST OF FIGURES.................................................. vi
ABSTRACT......................................................... x
CHAPTER
I INTRODUCTION................................................... 1
    Machine Vision............................................... 2
    Optical Computers............................................ 5
    Contribution................................................. 7
II BACKGROUND.................................................... 9
    Communication Theory......................................... 9
    Vander Lugt Filtering........................................ 20
III COMPUTER-GENERATED HOLOGRAMS (CGH)........................... 24
    Continuous-Tone Holograms.................................... 25
    Binary Holograms............................................. 30
    Sampling and Space-Bandwidth Requirements.................... 39
IV OPTIMIZATION OF OPTICAL MATCHED FILTERS....................... 63
    Performance Criteria......................................... 63
    Frequency Emphasis........................................... 65
    Phase-Only Filters........................................... 72
    Phase-Modulation Materials................................... 76
V PATTERN RECOGNITION TECHNIQUES................................. 84
    Deformation-Invariant Optical Pattern Recognition............ 85
    Synthetic Discriminant Functions............................. 88
VI MATCHED FILTER LINEARITY...................................... 94
    Measurement of Film Characteristics.......................... 97
    Models for Film Nonlinearity................................. 102
    Computer Linearization of Filter Response.................... 112
VII SIMULATIONS.................................................. 133
    Techniques for Simulating Matched Filters.................... 134
    Simulation of a Continuous-Tone Hologram..................... 145
    Simulation of a Binary Hologram.............................. 151
    An Example Using an SDF as a Reference....................... 159
VIII OPTICAL IMPLEMENTATION...................................... 170
    Techniques for Optical Implementation........................ 170
    Examples of CGH Matched Filters.............................. 179
IX SUMMARY....................................................... 191
    Conclusions.................................................. 194
    Recommendation............................................... 195
BIBLIOGRAPHY..................................................... 197
BIOGRAPHICAL SKETCH.............................................. 201
LIST OF TABLES
TABLE Page
7.1 Signal-to-noise ratio and efficiency for an ideal 146
autocorrelation of a square.
7.2 Signal-to-noise ratio and efficiency for a 157
continuous-tone CGH.
7.3 Signal-to-noise ratio and efficiency for an 165
AK hologram of a square.
7.4 Signal-to-noise ratio and efficiency of an 169
AK hologram of an SDF correlating with members
of the training set.
LIST OF FIGURES
FIGURE Page
3.1 Brown and Lohmann CGH cell. 33
3.2 Complex plane showing four quadrature components. 36
3.3 Addressable amplitude and phase locations
using the GBCGH method. 38
3.4 Spectral content of an image hologram. 42
3.5 Spectral content of a Vander Lugt filter. 44
3.6 Spectral content of a Fourier Transform hologram. 50
3.7 Two-dimensional spectrum of the Fourier Transform
hologram. 51
3.8 Two-dimensional spectrum of the Vander Lugt filter. 53
3.9 Spectrum of a modified Vander Lugt filter. 55
3.10 Spectrum of the zero-mean Vander Lugt filter. 58
3.11 Output of a 50% aliased Vander Lugt filter with
absorption hologram. 60
4.1 High-frequency emphasis of a square and a disk. 67
4.2 Phase-only filtering of a square and a disk. 74
5.1 Training set for the creation of an SDF. 91
5.2 SDF created from the images in Figure 5.1. 92
6.1 Typical H & D curve. 96
6.2 Computer output of the polynomial fit routine. 111
6.3 H & D plot for Agfa 10E75 photographic plates. 113
6.4 Amplitude transmission vs. exposure for Agfa
10E75 plates. 114
6.5 Computer output of the polynomial fit routine for
8E75 plates. 115
6.6 H & D plot for Agfa 8E75 photographic plates. 116
6.7 Amplitude transmission vs. exposure for Agfa
8E75 plates. 117
6.8 Image and plot of a linear gradient used for a
test input. 120
6.9 Image and plot of the output transmission on
film from the gradient input. 121
6.10 Image and plot of the predistorted gradient
used for an input. 122
6.11 Image and plot of the output transmission with
predistorted input. 123
6.12 Image and plot of a sinusoidal grating pattern
used for input. 125
6.13 Image and plot of the output transmission with
the sinusoidal input. 126
6.14 Output spectrum for a sinusoidal input. 128
6.15 Image and plot of a predistorted sinusoidal
grating used as an input. 129
6.16 Image and plot of the output transmission for the
predistorted sinusoidal input. 130
6.17 Output spectrum for a predistorted grating input. 131
7.1 Computer simulation of an ideal correlation. 136
7.2 Fourier transform of a square. 139
7.3 Fourier transform of a square with high-frequency
emphasis. 140
7.4 Ideal autocorrelation of a square with no
pre-emphasis. 141
7.5 Ideal correlation of a square with
high-frequency emphasis. 142
7.6 Ideal correlation of a square using
phase-only filtering. 143
7.7 Flow chart for the continuous-tone hologram
simulation. 147
7.8 Continuous-tone CGH of a square. 150
7.9 Continuous-tone CGH of a square with
high-frequency emphasis. 152
7.10 Continuous-tone CGH of a square with
phase-only filtering. 153
7.11 Autocorrelation of a square using a continuous-tone
CGH.
7.12 Autocorrelation of a square using a continuous-tone
CGH with high-frequency emphasis.
7.13 Autocorrelation of a square using a continuous-tone
CGH with phase-only filtering.
7.14 Flow chart for the binary hologram simulation.
7.15 AK binary hologram of a square.
7.16 AK binary hologram using high-frequency
emphasis.
7.17 AK binary hologram of a square with
phase-only filtering.
7.18 Autocorrelation of a square using an AK binary
hologram with high-frequency emphasis.
7.19 Autocorrelation of a square using an AK binary
hologram with phase-only filtering.
7.20 AK binary hologram of the SDF using
high-frequency emphasis.
7.21 Correlation of a test image at 300 and the SDF
using an AK hologram with high-frequency emphasis.
8.1 Photo of an interferometrically produced optical
matched filter.
8.2 Cathode-ray tube and camera produced by the
Matrix Corporation.
8.3 Cathode-ray tube imaged onto a translation table
produced by the Aerodyne Corp.
8.4 Electron-beam writing system at Honeywell Inc.
8.5 Magnified views of a binary hologram produced on the
Honeywell E-beam writer. 180
8.6 AK CGH matched filters, using a square as a
reference, produced on the Honeywell E-beam writer. 181
8.7 Reconstruction from an AK CGH matched filter of a
square using no pre-emphasis. 183
8.8 Reconstruction from an AK CGH matched filter of a
square using high-frequency emphasis. 184
8.9 Reconstruction from an AK CGH matched filter of a
square using phase-only filtering. 185
8.10 AK CGH matched filter of the letters "AFATL"
using a) high-frequency emphasis and
b) phase-only filtering. 186
8.11 Reconstruction from an AK CGH matched filter of
the letters "AFATL" using high-frequency emphasis. 187
8.12 Reconstruction from an AK CGH matched filter of
the letters "AFATL" using phase-only filtering. 188
8.13 AK CGH matched filter of the SDF shown
in Figure 5.1. 189
8.14 Reconstruction of an AK CGH matched filter of
an SDF. 190
Abstract of Dissertation Presented to the Graduate School
of the University of Florida in Partial Fulfillment of the
Requirements for the Degree of Doctor of Philosophy
COMPUTER-GENERATED HOLOGRAPHIC MATCHED FILTERS
By
STEVEN FRANK BUTLER
December 1985
Chairman: Roland C. Anderson
Major Department: Engineering Sciences
This dissertation presents techniques for the use of computer-
generated holograms (CGH) for matched filtering. An overview of the
supporting technology is provided. Included are techniques for
modifying existing CGH algorithms to serve as matched filters in an
optical correlator. It shows that matched filters produced in this
fashion can be modified to improve the signal-to-noise ratio and
efficiency over those possible with conventional holography. The effect and
performance of these modifications are demonstrated. In addition, a
correction of film nonlinearity in continuous-tone filter production
is developed. Computer simulations provide quantitative and
qualitative demonstration of theoretical principles, with specific
examples validated in optical hardware. Conventional and synthetic
holograms, both bleached and unbleached, are compared.
CHAPTER I
INTRODUCTION
Human vision is a remarkable combination of high resolution
sensors and a powerful processing machine. This combination permits
understanding of the world through sensing and interpretation of
visual images. The faculty of vision is so natural and common that
few pause to think how marvelous it is to acquire such clear and
precise information about objects simply by virtue of the luminous
signals that enter the eyes. Without consciousness of the complicated
process, objects are recognized by the characteristic qualities of the
radiations they emit. With the help of memory and previous
experience, the sources of these manifestations are perceived. This
process is known as sight, perception or understanding.
Images and photographs have long been used to identify and locate
objects. By photographing an area, perhaps from afar, a scene could
be given detailed study. This study might disclose the presence of
objects of interest and determine their spatial location. Images from
satellites show weather, agriculture, geology and global actions.
Special images may contain additional scientific information including
object spectral characteristics, velocity, temperature, and the like.
The traditional medium of these images has been photographic film.
It is capable of high resolution and is sensitive to visible and near
visible wavelengths. Unfortunately, film-based methods are slow due
to exposure, processing, and analysis time. This time lag is not a
problem for many applications and so film is still the primary medium
for reconnaissance. Electronic imagery (TV, radar, etc.) is used for
those applications that require faster interpretation. These images
can be viewed, like film, by people trained to interpret the
particular images. Because of the electronic nature of the images,
electronic hardware and computers are used for manipulation of the
images.
Machine Vision
For very high speed retrieval and interpretation, machines must be
designed around the specific tasks. Machine interpretation is also
necessary when a human is not available. Unmanned robots work in
hazardous areas and perform many jobs more efficiently without the
encumbrance of human intervention. However, to function and carry out
their assigned job, the robots must have information about their
surroundings. The ability to interpret imagery from self-contained
sensors is necessary for the proper function of a robot. This image
interpretation includes guidance, obstacle avoidance, target
recognition, tracking, and closed loop control of robot action. For
robot action without human intervention, machine intelligence must
have the ability to make decisions based on scene content. Computer
image processing and recognition refer to techniques that have evolved
in this field in which the computer receives and uses visual
information.
Image processing techniques prepare or preserve an image for
viewing. This includes enhancement, restoration, and reconstruction.
Image enhancement techniques are designed to improve image quality for
human viewing. For example, correction of a geometrically distorted
image produces an obvious improvement in quality to a human observer.
Image restoration techniques compensate an image, which has been
degraded in some fashion, to restore it as nearly as possible to its
undegraded state. For example, an image which is blurred due to
camera motion may be improved using motion restoration. To perform
the difficult task of image interpretation, extraneous noise must be
separated from the desired signals. This may occur in several stages
of enhancement where each stage reduces the extraneous noise and
preserves the information crucial to object recognition. Image
enhancement may include contrast transformation, frame subtraction,
and spatial filtering. The goal of image enhancement is to reduce the
image complexity so that feature analysis is simplified.1
Once the scene has been enhanced, the job of interpretation is
simplified. The interpreter must now decide what the remaining
features represent. The features present a pattern to the interpreter
to be recognized. This pattern recognition problem may be quite
difficult when a large number of features are necessary to
differentiate between two possibilities. Most people have to look
closely to see any difference between identical twins. A computer
might have equal difficulty distinguishing a car from a house in a
low-resolution image.
Recognition involves an interpretation of an image. This includes
scene matching and understanding. Scene matching determines which
region in an image is similar to a pictorial description of a region
of another scene. A reference region or template is provided and
systematically compared to each region in a larger image. Here the
computer attempts to match models of known objects, such as cars,
buildings, or trees, to the scene description and thus determine what
is there. The model objects would be described in memory as having
certain characteristics, and the program would attempt to match these
against various parts of the image. Scene understanding involves a
more general recognition problem describing physical objects in a
scene based on images. For example, a scene may be divided into
regions that match various objects stored in memory such as a house,
tree, and road. Once the scene is divided into known regions, the
interrelationship between these regions provides information about the
scene as a whole.
When it is necessary to recognize specific objects, correlation
techniques are often used.2 A reference image of the desired object
is stored and compared to the test image electronically. When the
correlation coefficient is over a specified threshold, the computer
interprets the image as containing the object. The correlation
procedure may also provide the location of the object in the scene and
enable tracking. The correlation coefficient may be used in decision
making to determine robot action. Because even a single object may
present itself in many ways, correlation procedures are complicated by
the immense reference file that must be maintained.3 Special
correlation techniques may provide invariance to specific changes, but
a wide range of object conditions (i.e., temperature, color, shape,
etc.) make correlation recognition a complicated computer task.4 The
best computer vision systems now available have very primitive
capabilities. Vision is difficult for a computer for a number of
reasons. The images received by a sensing device do not contain
sufficient information to construct an unambiguous description of the
scene. Depth information is lost and objects frequently overlap.
Vision requires a large amount of memory and many computations. For
an image of 1000 x 1000 picture elements, even the simplest operation
may require 10^8 operations. The human retina, with 10^8 cells
operating at roughly 100 hertz, performs at least 10 billion
operations a second. Thus, to recognize objects at a rate even
closely resembling human vision, very special processor technologies
must be considered. One promising technology has emerged in the form
of optical computing.
Optical Computers
Optical computers permit the manipulation of every element of an
image at the same time. This parallel processing technique involves
many additions and multiplications occurring simultaneously. Most
digital processors must perform one operation at a time. Even though
the digital processors are very fast, the number of total operations
required to recognize patterns in an image is very large. Using
optical Fourier transformers, an optical processor can operate on the
image and its Fourier transform simultaneously. This permits many
standard image processing techniques, such as spatial filtering and
correlation, to be performed at tremendous rates.
The Fourier transform is formed optically by use of a lens. The
usual case that is considered in optical computing is when the
illuminating source is located at infinity (by use of an auxiliary
collimating lens) and the image transparency is located at a distance
equal to focal length from the transforming lens. The distribution in
the output plane located a focal length behind the transforming lens
is the exact Fourier transform of the input distribution. The Fourier
transform contains all of the information contained in the original
image. However, the information is now arranged according to spatial
frequency rather than spatial location. The advantage of such an
arrangement is that objects or signals of interest may overlap with
noise in the image domain but exist isolated in the frequency domain.
This permits the possible separation of signal from noise in the
frequency plane when it would have been impossible in the image plane.
The image can be transformed into frequency space, frequency filtered
and then transformed back into image space with the noise removed.
The frequency filter may be low-pass, high-pass, or band-pass, chosen
to optimize the filtering of a specific signal. This frequency plane
filter is the heart of the analog optical computer.
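The transform-filter-inverse-transform sequence just described can be illustrated numerically. The following Python/NumPy sketch (a modern digital illustration, not part of the original optical work) removes a spectrally isolated noise component with a low-pass frequency-plane filter, the digital analogue of an on-axis aperture:

```python
import numpy as np

# Signal of interest: a slow sinusoid; noise: a fast sinusoid that
# overlaps the signal everywhere in the image domain.
n = 256
x = np.arange(n)
signal = np.sin(2 * np.pi * 4 * x / n)        # 4 cycles across the aperture
noise = 0.5 * np.sin(2 * np.pi * 60 * x / n)  # 60 cycles: isolated in frequency

# Transform into frequency space (the role of the first lens).
spectrum = np.fft.fft(signal + noise)

# Low-pass "aperture": keep only frequencies below 10 cycles.
freqs = np.fft.fftfreq(n, d=1.0 / n)
spectrum[np.abs(freqs) > 10] = 0.0

# Transform back into image space (the role of the second lens).
recovered = np.fft.ifft(spectrum).real

print(np.max(np.abs(recovered - signal)))  # essentially zero
```

Because the noise occupies frequencies the signal does not, the separation is exact here; in practice the two spectra overlap partially and the filtering is a compromise.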
The frequency plane filter can be constructed in many ways. Low-pass
and high-pass filters are accomplished using simple apertures
mounted on axis in the frequency plane. More complicated filters are
produced optically using holographic techniques. These filters may
also be produced using computer-generated holography (CGH). The
computer is used to model the desired filter response, mathematically
represent the holographic filter, and create a physical filter using a
writing device. One of the important advantages of computer-generated
holography is that the reference need not exist physically, but only
mathematically. This permits mathematical manipulation of the
reference prior to creation of the filter for purposes of
optimization.
The advantage of an analog optical processor is that it may
operate at very high speeds. In addition, the processor typically is
smaller, lighter, and consumes considerably less power than an
equivalent digital processor.5,6 When coupled with the ability to
manipulate and optimize the frequency plane filter, the optical
processor becomes a useful tool. With considerable justification, it
has attracted great interest in the robotics community.
Contribution
This dissertation argues that CGH matched filters should be used
in an optical correlator to recognize patterns in a complex scene, and
describes how to create that filter. The CGH matched filter is
superior to interferometric filters due to the ability to preprocess
the filter function and control the production of the hologram. The
use of optical elements for high speed pattern recognition was first
proposed 20 years ago.7 The concept of using computers to define and
generate holograms came only two years later.8 Since that time,
considerable effort has been devoted to exploring the potential of
these CGH elements for reconstruction holography. Most of this effort
was devoted to optimizing the methods for encoding and writing the
holograms.8-13 More recently, interest has grown in the area of
efficiency improvement.14 The efficiency of a hologram for optical
correlation must be high in order to utilize low power, light weight
diode lasers. In separate but parallel efforts in artificial
intelligence, researchers have studied the effects of image
enhancement on pattern recognition.15 Though research in the various
fields is proceeding, a unified approach to the interrelation of
preprocessing, holographic encoding, and physical implementation is
lacking. Specifically, the research in CGH, to date, has only been
for display or reconstruction holography, not matched filtering. This
dissertation describes the steps necessary to create practical
matched filters using CGH.
The approach presented here ties many areas of research together
as they apply to CGH matched filters. Modifications to existing
encoding schemes which provide real valued filter patterns for use in
an optical correlator are explained in Chapter III. In addition,
Chapter III defines the space-bandwidth product (SBP) required for
holographic matched filtering rather than for display holography as is
presented in existing literature. This includes procedures for
minimizing the SBP required. Preprocessing methods which apply
specifically to matched filtering are presented along with rationale
for their use in Chapter IV. Techniques for the use of CGH matched
filters as a pattern recognizer are reviewed in Chapter V.
Linearization methods for writing on film are derived and evaluated in
Chapter VI.
These various considerations are not independent, but rather, are
interwoven in the production of CGH matched filters. These
interactions can be fully analyzed only with a complete model
incorporating all the parameters. Chapter VII describes such a model
created to analyze the preprocessing, encoding and writing techniques
used to produce optimal CGH matched filters. Now that the various
methods have been developed and the analytical tools demonstrated,
specific examples are presented and analyzed. Chapter VIII describes
approaches for physically producing a transparency including specific
examples taken from Chapter VII. Finally, conclusions based on the
analysis are offered in Chapter IX.
CHAPTER II
BACKGROUND
The background technology is reviewed here so that the operation of
an optical processor can be understood more fully. A number of different
types of optical processors are in use today. These include one
dimensional signal processors, twodimensional image processors and
multidimensional digital processors. Only twodimensional image
processors used for matched filtering are described here. A matched
filter optimizes the signaltonoise ratio at a specific point when
the characteristics of the input are known.16 Typically, the desired
pattern and the nature of the background or noise in the input image
are known. Specifically, the input consists of a known signal s(x,y)
and an additive noise n(x,y). The system is linear and space
invariant with impulse response h(x,y). The criterion of optimization
will be that the output signaltonoise power ratio be a maximum.
This optimum system will be called a matched filter for reasons that
will become clear as the derivation proceeds.
Communication Theory
A system is any unit that converts an input function I(x,y) into
an output function O(x,y). The system is described by its impulse
response: its output when the input is an impulse or delta function.
A linear system is one in which the output depends linearly on the
input and superposition holds. That is, if the input doubles, so does
the output. More precisely stated, let O1 be the output when I1 is
the input and O2 be the output when I2 is the input. Then the system
is linear when, if the input is aI1 + bI2, the output is aO1 + bO2. This
property of linearity leads to a vast simplification in the
mathematical description of phenomena and represents the foundation of
a mathematical structure known as linear system theory. When the
system is linear, the input and output may be decomposed into a linear
combination of elementary components.
Another mathematical tool of great use is the Fourier transform.
The Fourier transform is defined by
F(u,v) = ∫∫ f(x,y) exp[-j2π(ux+vy)] dx dy = F{f(x,y)}.         (2.1)

The double integral is taken over (-∞,∞) in both variables. The
transform is a complex-valued function of u and v, the spatial
frequencies in the image plane. The Fourier transform provides the
continuous coefficients of each frequency component of the image. The
Fourier transform is a reversible process, and the inverse Fourier
transform is defined by
f(x,y) = ∫∫ F(u,v) exp[+j2π(ux+vy)] du dv = F⁻¹{F(u,v)}.       (2.2)
The transform and inverse transform are very similar, differing only
in the sign of the exponent appearing in the integrand. The magnitude
squared of the Fourier transform is called the power spectral density
Φf(u,v) = |F(u,v)|² = F(u,v) F*(u,v).                          (2.3)
It is noteworthy that the phase information is lost from the Fourier
transform when the transform is squared and the image cannot, in
general, be reconstructed from the power spectral density. Several
useful properties of the Fourier transform are listed here.
Linearity Theorem
F{a f1(x,y) + b f2(x,y)} = a F{f1(x,y)} + b F{f2(x,y)}         (2.4)
The transform of the sum of two functions is simply the sum of
their individual transforms. The Fourier transform is a linear
operator or system.
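The linearity of the discrete transform is easy to verify numerically; the short Python/NumPy check below (an illustrative sketch, not part of the original text) confirms equation (2.4) for arbitrary inputs and constants:

```python
import numpy as np

rng = np.random.default_rng(0)
f1, f2 = rng.standard_normal(64), rng.standard_normal(64)
a, b = 2.0, -0.5

# Transform of the weighted sum versus the weighted sum of transforms.
lhs = np.fft.fft(a * f1 + b * f2)
rhs = a * np.fft.fft(f1) + b * np.fft.fft(f2)

print(np.max(np.abs(lhs - rhs)))  # essentially zero
```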
Similarity Theorem
F{f(ax,by)} = F(u/a, v/b)/|ab|, where F(u,v) = F{f(x,y)}       (2.5)
A scale change in the image domain results in an inverse scale
change in the frequency domain along with a change in the overall
amplitude of the spectrum.
Shift Theorem
F{f(x-a, y-b)} = F(u,v) exp[-j2π(ua+vb)]                       (2.6)
Translation of patterns in the image merely introduces a linear
phase shift in the frequency domain. The magnitude is invariant to
translation.
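The invariance of the transform magnitude under translation can be demonstrated with a small numerical sketch (Python/NumPy, illustrative only; the discrete analogue uses a circular shift):

```python
import numpy as np

n = 64
img = np.zeros((n, n))
img[10:20, 10:20] = 1.0                      # a small square
shifted = np.roll(img, (7, 5), axis=(0, 1))  # translate the square (circularly)

F1 = np.fft.fft2(img)
F2 = np.fft.fft2(shifted)

# Translation changes only the phase; the magnitude is unchanged.
print(np.max(np.abs(np.abs(F1) - np.abs(F2))))  # essentially zero
```

This magnitude invariance is what allows a frequency-plane matched filter to detect a pattern regardless of where it sits in the input scene.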
Parseval's Theorem
∫∫ |F(u,v)|² du dv = ∫∫ |f(x,y)|² dx dy                        (2.7)
The total energy in the image domain is exactly equal to the total
energy in the frequency domain.
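The discrete counterpart of Parseval's theorem carries a normalization factor of N because NumPy's DFT is unnormalized; the sketch below (illustrative, not from the original text) verifies the energy equality:

```python
import numpy as np

rng = np.random.default_rng(1)
f = rng.standard_normal(128)
F = np.fft.fft(f)

# Discrete Parseval: sum|F|^2 / N == sum|f|^2 for the unnormalized DFT.
energy_image = np.sum(np.abs(f) ** 2)
energy_freq = np.sum(np.abs(F) ** 2) / len(f)

print(abs(energy_image - energy_freq))  # essentially zero
```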
Convolution Theorem
F{f(x,y) g(x,y)} = ∫∫ F(u₀,v₀) G(u-u₀, v-v₀) du₀ dv₀           (2.8)
The Fourier transform of the product of two images is the
convolution of their associated individual transforms. Also the
Fourier transform of the convolution of two images is the product of
the individual transforms.
Correlation Theorem
Rfg(x,y) = ∫∫ f(x₀,y₀) g(x₀-x, y₀-y) dx₀ dy₀                   (2.9)
The correlation is very similar to the convolution except that
neither function is inverted.
Autocorrelation (WienerKhintchine) Theorem
Φff(u,v) = F{Rff(x,y)}                                         (2.10)
This special case of the convolution theorem shows that the
autocorrelation and the power spectral density are Fourier transform
pairs.
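The Wiener-Khintchine relation of equation (2.10) can be checked directly in the discrete case; the following Python/NumPy sketch (illustrative only; the discrete autocorrelation is circular) compares the inverse transform of the power spectral density with a direct lag-by-lag computation:

```python
import numpy as np

rng = np.random.default_rng(2)
f = rng.standard_normal(64)

# Power spectral density of f.
F = np.fft.fft(f)
psd = np.abs(F) ** 2

# Inverse transform of the PSD gives the (circular) autocorrelation ...
acorr_via_psd = np.fft.ifft(psd).real

# ... which matches the direct computation of R_ff at every lag k.
acorr_direct = np.array([np.sum(f * np.roll(f, -k)) for k in range(64)])

print(np.max(np.abs(acorr_via_psd - acorr_direct)))  # essentially zero
```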
Fourier Integral Theorem
f(x,y) = F⁻¹{F{f(x,y)}}                                        (2.11)
f(-x,-y) = F{F{f(x,y)}}
Successive transformation and inverse transformation yield the
original function again. If the forward Fourier transform is applied
twice in succession, the result is the original image with both
coordinates inverted, f(-x,-y).
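The coordinate inversion produced by two forward transforms can be seen numerically; in this Python/NumPy sketch (illustrative only) the unnormalized DFT applied twice returns the image flipped about the origin of its periodic grid:

```python
import numpy as np

n = 32
f = np.zeros((n, n))
f[2:6, 3:9] = 1.0  # an off-center rectangle

# Two forward transforms, divided by N^2 to undo the unnormalized DFT,
# return the image with inverted coordinates: f(-x, -y).
ff = np.fft.fft2(np.fft.fft2(f)).real / (n * n)

# f(-x, -y) on a periodic grid is a flip of both axes about the origin.
flipped = f[(-np.arange(n)) % n][:, (-np.arange(n)) % n]

print(np.max(np.abs(ff - flipped)))  # essentially zero
```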
It is also useful to define here the impulse function. Also known
as the Dirac delta function, it describes a function which is infinite
at the origin, zero elsewhere, and contains a volume equal to unity.
One definition of the Dirac delta function is
δ(x) = lim(a→∞) (a/√π) exp(-a²x²).                             (2.12)
The delta function possesses these fundamental properties:
δ(x) = 0 for x ≠ 0                                             (2.13)
∫ δ(x) dx = 1, with the integral taken over (-∞,∞) or any
interval containing the origin                                 (2.14)
δ(-x) = δ(x)                                                   (2.15)
δ(ax) = (1/|a|) δ(x), a ≠ 0                                    (2.16)
∫ f(x) δ(x-a) dx = f(a).                                       (2.17)
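These properties have direct discrete counterparts; the Python/NumPy sketch below (illustrative only, using a unit sample as the discrete stand-in for the Dirac impulse) checks that the impulse transforms to unity at every frequency and that convolution with a shifted impulse shifts the function, the sifting property of (2.17):

```python
import numpy as np

n = 64
delta = np.zeros(n)
delta[0] = 1.0  # discrete stand-in for the Dirac impulse

# The transform of the impulse is unity at all frequencies.
print(np.allclose(np.fft.fft(delta), np.ones(n)))  # True

# Sifting under convolution: convolving with a shifted impulse shifts f.
f = np.sin(2 * np.pi * np.arange(n) / n)
shifted_delta = np.roll(delta, 10)
out = np.fft.ifft(np.fft.fft(f) * np.fft.fft(shifted_delta)).real
print(np.allclose(out, np.roll(f, 10)))  # True
```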
The Fourier transform of the delta function is unity. This property
provides a useful tool when studying systems in which an output is
dependent on the input to the system. When an impulse is the input to
the system, the input spectrum is unity at all frequencies. The
spectrum of the output must then correspond to the gain or attenuation
of the system. This frequency response of the system is the Fourier
transform of the output when an impulse is the input. The output of
the system is the impulse response. Thus, the impulse response and
the frequency response of the system are Fourier transform pairs. To
determine the output of a system for a given input, multiply the
Fourier transform of the input by the frequency response of the system
and take the inverse Fourier transform of the result. The convolution
property shows an equivalent operation is to convolve the input with
the impulse response of the system.
O(u,v) = I(u,v) H(u,v)                                         (2.18)
o(x,y) = F⁻¹{O(u,v)} = F⁻¹{I(u,v) H(u,v)}                      (2.19)
       = ∫∫ i(x₀,y₀) h(x-x₀, y-y₀) dx₀ dy₀
       = i(x,y) ⊛ h(x,y),
where ⊛ denotes convolution.
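The equivalence of the two routes in equations (2.18)-(2.19), multiplying spectra versus convolving with the impulse response, can be confirmed numerically. In this Python/NumPy sketch (illustrative only; the discrete convolution is circular) the hypothetical system is a five-point averager:

```python
import numpy as np

rng = np.random.default_rng(3)
i = rng.standard_normal(128)  # input signal

h = np.zeros(128)
h[:5] = 0.2                   # impulse response: a 5-point average

# Route 1: multiply the input spectrum by the frequency response.
o_freq = np.fft.ifft(np.fft.fft(i) * np.fft.fft(h)).real

# Route 2: convolve the input with the impulse response (circularly).
# roll(h[::-1], k + 1)[m] equals h[(k - m) mod N], the convolution kernel.
o_conv = np.array([np.sum(i * np.roll(h[::-1], k + 1)) for k in range(128)])

print(np.max(np.abs(o_freq - o_conv)))  # essentially zero
```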
Consider the effect of an additive noise on the input of the
system. Although the exact form of the noise n(x,y) may not be known,
the noise statistics or power spectral density may be predictable.
Thus, the effect of the system on the input is determined by its
impulse response or frequency response. That is, when there is
knowledge of the input signal and noise, the output signal and noise
characteristics can be predicted. The relationship of the input and
output are expressed in the following diagram and equations. The
letters i and o indicate the input and output terms while the letters
s and n indicate the signal and noise portions.
Linear System

si(x,y) + ni(x,y) ---> [ h(x,y) ] ---> so(x,y) + no(x,y)

i(x,y) = si(x,y) + ni(x,y)                                     (2.20)
o(x,y) = so(x,y) + no(x,y)                                     (2.21)
O(u,v) = I(u,v) H(u,v)                                         (2.22)
So(u,v) = Si(u,v) H(u,v)                                       (2.23)
No(u,v) = Ni(u,v) H(u,v)                                       (2.24)
Now that the relationships between the input and output of a
linear system are known, such a system may be utilized to enhance the
input. For example, assume an image has been degraded by some
distorting function d(x,y). The original image was convolved with the
distorting function, and the spectral contents of the ideal image
Fi(u,v) were attenuated by the frequency response D(u,v) of the
distorting system. By multiplying the degraded image by the inverse
of the D(u,v), the original ideal image is obtained. Any distortion
which can be represented as a linear system might theoretically be
canceled out using the inverse filter. A photograph produced in a
camera with a frequency response which rolls off slowly could be
sharpened by Fourier transforming the image, multiplying by the
inverse filter, and then inverse transforming. In this case, the
inverse filter is one in which the low frequencies are attenuated and
the high frequencies are accentuated (high pass filter). Because the
high frequencies represent the edges in the image, the edges are
accentuated and the photo appears sharper.17 As indicated in the
following diagram, the image is distorted by the function D(u,v) but
in some cases can be restored by multiplying by 1/D(u,v).
fi(x,y) : Fi(u,v) x D(u,v)   ---> fd(x,y)  = blurred photograph
fd(x,y) : Fd(u,v) x 1/D(u,v) ---> f'd(x,y) = enhanced photograph
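When the distorting response D(u,v) is known and nowhere zero, the inverse filter cancels the degradation exactly. The Python/NumPy sketch below (an illustrative 1-D example with an invented, smooth low-pass D, not taken from the dissertation) distorts a signal and restores it by multiplying the degraded spectrum by 1/D:

```python
import numpy as np

n = 128
x = np.arange(n)
f = np.sin(2 * np.pi * 3 * x / n) + 0.5 * np.sin(2 * np.pi * 9 * x / n)

# A distorting system that rolls off high frequencies but never
# reaches zero, so every component survives at reduced amplitude.
freqs = np.fft.fftfreq(n, d=1.0 / n)
D = 1.0 / (1.0 + (freqs / 5.0) ** 2)
blurred = np.fft.ifft(np.fft.fft(f) * D).real

# Inverse filter: multiply the degraded spectrum by 1/D.
restored = np.fft.ifft(np.fft.fft(blurred) / D).real

print(np.max(np.abs(restored - f)))  # essentially zero
```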
The linear blur of a camera is another classic example. Consider
traveling through Europe on a train with your camera. Upon getting
home and receiving your trip pictures, you find that all of them are
streaked by the motion of the train past the scenes you photographed.
Each point in the scene streaked past the camera, causing a line to be
formed on the film rather than a sharp point. The impulse response is
a line, and the corresponding frequency response of the distorting
system is a sine function (sin u /u). To retrieve the European
photo collection, merely multiply the Fourier transform of the
pictures by u/sin u and reimage.
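The restoration just described can be sketched numerically. The following NumPy fragment is an illustrative sketch, not part of the original text; the regularization constant eps is an assumption added to guard the zeros of the sinc, where the ideal inverse u/sin u has poles. It blurs a 1-D scene with a line impulse response and restores it by division in the frequency domain.

```python
import numpy as np

def motion_blur_otf(n, blur_len):
    # Frequency response of a horizontal linear blur: the FFT of a
    # normalized line (boxcar) impulse response, a sampled sinc.
    h = np.zeros(n)
    h[:blur_len] = 1.0 / blur_len
    return np.fft.fft(h)

def inverse_filter(blurred, otf, eps=1e-3):
    # Divide by the blur response.  eps (an assumed guard value) avoids
    # the zeros of the sinc, where information is irretrievably lost.
    safe = np.where(np.abs(otf) > eps, otf, eps)
    return np.real(np.fft.ifft(np.fft.fft(blurred) / safe))

n, L = 256, 8
rng = np.random.default_rng(0)
scene = rng.random(n)                       # a 1-D "scene"
otf = motion_blur_otf(n, L)
blurred = np.real(np.fft.ifft(np.fft.fft(scene) * otf))
restored = inverse_filter(blurred, otf)
```

The components at the exact zeros of the sinc are not recovered, in line with the limitation discussed below; everywhere else the scene is restored.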
In the physical implementation of this process, there are several
practical problems. To multiply the image transform by the inverse
function, a transparency with the appropriate response is produced.
In general, a transparency can only attenuate the light striking it.
That is, the transparency can only represent nonnegative real values
less than one. Herein lies the problem. The inverse response
required to correct a specific distortion may, in fact, be complex.
In some cases, a combination of two transparencies can be combined to
provide complex values. One transparency is used for amplitude or
attenuation, and another phase transparency or phase plate is used to
provide the appropriate phase shift at each point. A phase
transparency can be produced by bleaching film with an appropriate
latent image induced in the emulsion. Chu, Fienup, and Goodman18
demonstrated a technique in color film which consists of three
emulsions. One emulsion was used as an amplitude transparency and
another emulsion was used as a phase plate. The appropriate patterns
were determined by a computer and the film was given the proper
exposure using colored filters.
Even with a two-transparency system, not all distortions are
possible to remove. Note that in the linear blur case, the inverse
response is u/sin u. The denominator goes to zero for specific values
of u, and the response has a pole at those values. The filter cannot
represent those values, and the practical filter is merely an
approximation to the ideal filter. It is worth noting that when the
distorting response reduces a frequency component to zero or below
some noise threshold, that component cannot be recovered. That is,
information is usually lost during the distorting process and inverse
filtering cannot recover it.
It is desirable to remove noise from a corrupted image. Although
it is not always possible to remove all of the noise, the
relationships between the input and output of a linear system are
known, and a linear system can be designed to remove as much of the
noise as possible. To optimize a system, the input must be specified,
the system design restrictions known, and a criterion of optimization
accepted.
The input may be a combination of known and random signals and noises.
The characteristics of the input such as the noise spectrum or
statistics must be available. The classes of systems are restricted
to those which are linear, space-invariant, and physically realizable.
The criterion of the optimization is dependent on the application.
The optimum filters include the least mean-square-error (Wiener)
filter and the matched filter. The Wiener filter minimizes the
mean-squared error between the output of the filter and the actual
signal input. The Wiener filter predicts the least mean-squared-error
estimate of the noise-corrupted input signal. Thus, the output of the
Wiener filter is an approximation to the input signal. The output of
the matched filter is not an approximation to the input signal but
rather a prediction of whether a specific input signal is present.
The matched filter does not preserve the input image. This is not the
objective. The objective is to distort the input image and filter the
noise so that at the sampling location (xo,yo), the output signal
level will be as large as possible with respect to the output noise.
The signal-to-noise ratio is useful in the evaluation of system
performance, particularly in linear systems. In the matched filter,
the criterion of optimization is that the output signal-to-noise power
be a maximum. The input consists of a known signal s(x,y) and an
additive random noise n(x,y). The system is linear and space-invariant
with impulse response h(x,y). To optimize the system or
filter, maximize the expression

Ro = so²(xo,yo) / E{no²(x,y)}                            (2.25)

where E{no²(x,y)} = ∫∫ no²(x,y) dx dy

at some point (xo,yo). The problem is then to find the system h(x,y)
that maximizes the output signal-to-noise ratio.
The output signal so(x,y) is

so(x,y) = ∫∫ si(xo,yo) h(x−xo, y−yo) dxo dyo             (2.26)

and the output noise no(x,y) power is

∫∫ |no(x,y)|² dx dy = ∫∫ |No(u,v)|² du dv
                    = ∫∫ |Ni(u,v)|² |H(u,v)|² du dv.     (2.27)

The signal-to-noise output power ratio becomes

Ro = |∫∫ si(xo,yo) h(x−xo, y−yo) dxo dyo|² / ∫∫ |Ni(u,v)|² |H(u,v)|² du dv.    (2.28)
Thus to complete the maximization with respect to h(x,y), the power
spectral density or some equivalent specification of the input noise
must be known. Once the input noise is specified, the filter function
h(x,y) is the only unknown. Equation (2.28) becomes
E{no²(xo,yo)} − a so²(xo,yo) ≥ 0

∫∫ |Ni(u,v)|² |H(u,v)|² du dv − a |∫∫ si(xo,yo) h(x−xo, y−yo) dxo dyo|² ≥ 0    (2.29)

where Ro max = 1/a. The maximum signal-to-noise ratio at the output is
obtained when H(u,v) is chosen such that equality is attained. This
occurs when

∫∫ ni²(x,y) h(x−xo, y−yo) dxo dyo = si(x,y).             (2.30)
Taking the Fourier transform of both sides and rearranging gives

H(u,v) = [Si*(u,v) / |Ni(u,v)|²] exp[−j(uxo+vyo)].       (2.31)
Thus in an intuitive sense, the matched filter emphasizes the signal
frequencies, with a phase shift, and attenuates the noise
frequencies. This becomes clear when the additive noise is white. In
this case the noise power is constant at all frequencies and thus has
a power spectral density of

|Ni(u,v)|² = N/2                                         (2.32)

where N is a constant. Substituting equation 2.32 into equation 2.31,
the form of the matched filter for the case of white noise is
(dropping the constant factor 2/N)

H(u,v) = Si*(u,v) exp[−j(uxo+vyo)]                       (2.33)
or

h(x,y) = s(−x,−y).                                       (2.34)

Equation 2.34 shows that the impulse response of the matched filter
(with white noise and xo = yo = 0) is simply the signal image in
reverse order (inverted and perverted). Thus, the filter is said to be
matched to the signal. Filtering with a matched filter is equivalent
to cross-correlating with the expected signal or pattern. That is,

o(x,y) = Rhs(x,y) = ∫∫ s(xo,yo) h(xo−x, yo−y) dxo dyo.   (2.35)

Also, it can be seen that the frequency response of the matched filter
is equivalent to that of the signal but with the phase negated so that
the output of the filter is real. That is, the matched filter removes
the phase variations and provides a real valued output.19
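The white-noise matched filter just derived can be sketched numerically. The following NumPy fragment is an illustrative sketch, not part of the original text: a small zero-mean pattern is embedded in a larger noisy image and detected by frequency-domain cross-correlation with H = S* (the filter of equation 2.33 with xo = yo = 0); the brightest output point marks the pattern location.

```python
import numpy as np

def matched_filter_output(image, signal):
    # White-noise matched filter, H(u,v) = S*(u,v), applied in the
    # frequency domain.  The output is the cross-correlation of the
    # image with the expected signal.
    S = np.fft.fft2(signal, s=image.shape)   # zero-pad signal to image size
    G = np.fft.fft2(image)
    return np.real(np.fft.ifft2(G * np.conj(S)))

rng = np.random.default_rng(1)
signal = rng.standard_normal((8, 8))         # zero-mean pattern to detect
image = 0.05 * rng.standard_normal((64, 64))
image[20:28, 30:38] += signal                # embed the pattern at (20, 30)
out = matched_filter_output(image, signal)
peak = np.unravel_index(np.argmax(out), out.shape)
```

The peak location tracks the position of the embedded pattern, illustrating the shift property exploited by the Vander Lugt filter described next.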
Matched filters are used extensively in radar signal
processing, seismic data processing, and communications. These
filters are implemented using electronic circuitry and digital
computers. For image processing, the need to process large
two-dimensional arrays places a large burden on conventional filtering
techniques. For these applications, optical processing techniques
provide the highest throughput speeds for matched filtering. One such
optical processing technique was proposed by Vander Lugt7 in 1969.
Vander Lugt Filtering
If an image is placed on a transparent sheet and illuminated by a
plane wave of coherent light, its Fourier transform is formed using a
simple lens.19 Once the Fourier transform is formed, specific
frequency components in the image can be removed or attenuated. The
result may then be inverse Fourier transformed to recreate the
modified image. The aperture, which may be replaced by a more
complicated filter, performs specific filtering operations, including
the Wiener or matched filter. Unfortunately, there are certain
limitations to the functions which can be physically implemented. A
normal transparency merely attenuates the light passing through it.
Its transmission is real and nonnegative. Thus, when a transparency
film is exposed to a waveform to be recorded, the phase information in
the waveform is lost. Two pieces of information, the real and
imaginary parts of the waveform, are recorded as only one value, their
magnitude. This loss of information can be corrected by taking
advantage of the redundancy in the wavefront and the use of additional
film space. Using the heterodyning technique proposed by Vander Lugt,
the complex waveform can be recorded on photographic film.
Vander Lugt proposed the use of holographic film to store the
filter response for a matched filter. A lens is used to Fourier
transform the reference and test images. Derivations of the Fourier
transforming capabilities of lenses can be found in the literature.10
The Fourier transform of the reference image is brought to focus on a
photographic film. Film is a nonlinear, timeintegrating medium and
thus only the magnitude of the Fourier transform or power spectral
density is recorded. The power spectral density does not contain all
of the original image information. Only the autocorrelation of the
original image can be obtained upon inverse transformation. Neither
the power spectral density nor the autocorrelation uniquely describe
the original image. If a plane wave is mixed with the Fourier
transform of the reference image at the film plane, the film will
record the interference pattern caused by the summation of the two
fields. The result on the film then is
H(u,v) = 1 + |F(u,v)|² + F(u,v)exp(−j2πav) + F*(u,v)exp(j2πav),    (2.35)
which contains a constant, the power spectral density, and two terms
due to a spatial carrier fringe formed due to interference with the
plane wave. The two spatially modulated terms contain the original
image and Fourier transform information. With this Fourier transform
recorded on the film, it is placed in the optical filter arrangement
and illuminated with the Fourier transform G(u,v) of the test image
g(x,y). The output of the film transparency is the product of its
transmittance and the illuminating Fourier transform.
O(u,v) = G(u,v) H(u,v)                                   (2.36)
       = G(u,v) + G(u,v)|F(u,v)|²
       + G(u,v)F(u,v)exp(−j2πav) + G(u,v)F*(u,v)exp(j2πav)
The product of the transforms from the reference and test images is
then Fourier transformed by another lens to obtain the correlation of
the two images.
o(x,y) = g(x,y) + g(x,y)*f(x,y)*f*(−x,−y)                (2.37)
       + g(x,y)*f(x,y)*δ(x, y−a)
       + g(x,y)*f*(−x,−y)*δ(x, y+a)
The first two terms are formed on axis or at the origin of the output
plane. The third term is the convolution of the reference and test
images and is centered off axis. The last term is the correlation of
the reference and test images and is located off-axis opposite the
convolution. This optical arrangement provides the entire convolution
and correlation images at once while a digital processor must compute
one point at a time. In addition to the convolution and correlation
processes, additional image plane and frequency plane filtering may be
accomplished simultaneously in the same optical arrangement. The
convolution, correlation and any additional linear filtering are
accomplished with a single absorbing mask.
When used as a matched filter, the transparency multiplies the
expected pattern by its complex conjugate, thereby rendering an
entirely real field. This matched transparency exactly cancels all
the curvature of the incident wavefront. When an input other than the
expected signal is present, the wavefront curvature will not be
canceled by the transparency and the transmitted light will not be
brought to a bright focus. Thus the expected pattern will be detected
by a bright point of light in the correlation plane. If the pattern
occurs in the input plane but is shifted, the bright point of light in
the correlation plane will shift accordingly. This provides for the
detection of specific patterns in a larger image. The detection and
location of specific objects in large complicated images is a job well
suited for the highspeed processing capability of the Vander Lugt
filter.
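The behavior of equations 2.35 through 2.37 can be simulated directly. The following 1-D NumPy sketch is illustrative only (the array size, pattern placement, and carrier frequency are arbitrary choices, not from the text): it records the interference intensity as the hologram, illuminates it with the transform of a shifted copy of the reference pattern, and locates the off-axis correlation peak, whose position encodes the shift.

```python
import numpy as np

n, a = 256, 64                        # samples and spatial carrier frequency
k = np.arange(n)
rng = np.random.default_rng(2)
f = np.zeros(n)
f[100:110] = rng.random(10)           # reference image
g = np.roll(f, 17)                    # test image: the pattern shifted 17 samples

F, G = np.fft.fft(f), np.fft.fft(g)
ref = np.exp(1j * 2 * np.pi * a * k / n)   # tilted plane-wave reference

# The film records the interference intensity: real and non-negative,
# containing the bias, |F|^2, and the two heterodyned carrier terms.
H = np.abs(F + ref) ** 2

# Illuminate the hologram with the test-image transform and retransform.
out = np.fft.ifft(G * H)

# The correlation term rides on the carrier and lands off-axis; its peak
# sits at (17 - a) mod n, encoding the 17-sample shift of the pattern.
corr_peak = int(np.argmax(np.abs(out[180:240]))) + 180
```

The on-axis terms, the convolution term, and the correlation term land in separate regions of the output, as the text describes, because the carrier frequency was chosen high enough to separate them.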
CHAPTER III
COMPUTER-GENERATED HOLOGRAMS
Vander Lugt described a technique by which the holographic matched
filter could be produced optically.7 At that time, no other
convenient method existed for the computation and creation of the
complicated filter function required. This limitation has faded away
with the availability of digital computers with large memories. Using
digital computers to determine the filter function and a
computer-driven writing device, a transparency with the appropriate filter
image can be produced. Using this technique, the computer determines
the appropriate value of the matched filter at each point and produces
a transparency with that absorption at each point. The resolution
required of the writing device depends on the application and, in some
cases, may be consistent with optically generated holograms.
Computergenerated holograms (CGH) have found applications in
optical information processing, interferometry, synthesis of novel
optical elements, laser scanning, and laser machining.20-23 CGHs can
implement computer-optimized pattern-recognition masks.24 The
computer writes the hologram by transferring the transmittance
function to an appropriate holographic medium. The computer drives
a plotter or scanner and writes the hologram one point at a time.
Typically, the primary limitation is writing resolution. A
conventional optical hologram may have a resolution of one-quarter of
a micron. A system using visible light to write holograms (plotters,
flying spot scanners, CRTs, etc.) cannot achieve resolutions much
better than several microns. Writing systems utilizing electron beams
are currently achieving better than 1-micron resolution. The electron
beam systems are typically binary and thus the transmittance function
must be quantized in some fashion into two levels, "on" or "off."
Binary holograms are attractive because binary computer-graphics
output devices are widely available and because problems with
nonlinearities in the display and recording medium are circumvented.12
When photographic emulsions are involved, granularity noise is
reduced.25
Continuous-Tone Holograms
When a hologram is produced optically or interferometrically, a
reference wave is superimposed with the wavefront to be recorded.
Typically, the reference wave is a tilted plane wave with constant
amplitude across the wavefront. The reference wave approaches at an
angle θ relative to the direction of the wavefront to be recorded.
The resultant field is

E(x,y) = f(x,y) + A exp(j2πay)                           (3.1)

where a = (sin θ)/λ and the amplitude of the reference wave is A. An
interference pattern is produced by the superposition of the waves.
The fringe spacing is dependent on the term a, known as the spatial
carrier frequency, and the details in the function f(x,y). A
photographic film placed into this field records not the field itself
but rather the square magnitude of the field. The pattern recorded on
the film is then

h(x,y) = |f(x,y) + A exp(j2πay)|²                        (3.2)
       = A² + |f(x,y)|² + A f(x,y)exp(−j2πay) + A f*(x,y)exp(j2πay).
The function recorded on the film contains a D.C. bias, A2, the base
band magnitude, If(x,y)12, and two terms heterodyned to plus and minus
a. These heterodyned terms contain the complex valued information
describing the input function f(x,y). If the spatial carrier
frequency is sufficiently high, the heterodyned terms are separable
and no aliasing exists. The original input function can be retrieved
with no distortion by reilluminating the film with the reference beam
and spatially filtering the output to separate the various terms.
To make the hologram of the Fourier transform of an image, the
same procedure is applied. That is, the Fourier transform of the
image f(x,y) is used as the input to the hologram. Now
h(u,v) = A² + |F(u,v)|² + A F(u,v)exp(−j2πau) + A F*(u,v)exp(j2πau)    (3.3)

where F(u,v) = Fourier transform of f(x,y) = F{f(x,y)},

A exp(j2πau) = the off-axis reference wave used to provide the spatial
carrier for the hologram, and

a = (sin θ)/λ = the filter spatial carrier frequency (θ = off-axis angle).

This filter contains the D.C. bias, A²; the power spectral density,
|F(u,v)|²; and two terms heterodyned to plus and minus a. These
heterodyned terms contain the complex valued information describing
the Fourier transform of the input f(x,y).
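The coding of equation 3.3 can be emulated on a computer. The following NumPy sketch is illustrative, not part of the original apparatus; it is shown 1-D in u for brevity, and the choice of the reference amplitude A as max|F| is an assumption made so the pattern stays well above zero. The result is a real, non-negative transmittance pattern suitable for film.

```python
import numpy as np

def encode_hologram(f, a):
    # Code the complex spectrum F(u) into a real, non-negative
    # transmittance following eq. 3.3 (1-D in u for brevity).
    n = len(f)
    u = np.arange(n)
    F = np.fft.fft(f)
    A = np.abs(F).max()                    # reference amplitude (a choice)
    carrier = np.exp(1j * 2 * np.pi * a * u / n)
    h = A**2 + np.abs(F)**2 \
        + A * F * np.conj(carrier) + A * np.conj(F) * carrier
    return np.real(h)                      # imaginary parts cancel exactly

rng = np.random.default_rng(3)
f = rng.random(64)
h = encode_hologram(f, a=16)               # real-valued, non-negative pattern
```

Algebraically h equals |F + A carrier|², which is why the result is guaranteed non-negative for any choice of A.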
These optically generated holograms are formed
interferometrically by combining a plane wave with the wavefront to be
recorded. The transmittance of the hologram is a real valued,
non-negative function of position on the plate. Recall that the input
F(u,v), which was used to create the hologram, is, in general,
complex. This conversion from a complex function to a pattern which
can be recorded on film is known as coding. The coding performed in
optical holography is a natural consequence of the action of the film.
Typically, the complex wavefront is coded to a real nonnegative
function which can be recorded as transmittance values on film.
Equation 2.35 describes a way in which film (a square law detector)
would encode the complex input image in an optically generated
hologram.
Once produced, the hologram and its interference fringes may be
inspected by microscope. The hologram can be copied on another plate
by contact printing. The hologram consists of real valued positive
transmittance varying across the face of the photographic plate. To
record the hologram on a computer, the transmittance across the
surface of the plate is sampled. If the samples are many and the
transmittance determined with accuracy, the hologram can be accurately
reproduced from the recorded samples. In this way the hologram can be
represented with some accuracy using digital numbers stored on some
electronic media. An electronic device writes the physical hologram.
The computer can thus electronically record and modify an optically
produced hologram, and then rewrite the holographic pattern onto
another plate.
The limitations to such a system include the ability to sample the
input hologram sufficiently often and accurately, the ability to store
the large number of sample values, and the ability to rewrite the
holographic pattern to film.
If the input wavefront is known, the optical step may be omitted
altogether. If the input wavefront can be accurately represented by
discrete samples stored electronically, the holographic pattern can be
computed. That is, the input is coded to create a function which can
be recorded on a transparency. In the case of the matched filter, the
Fourier transform of an image is recorded. The image is sampled and
stored on the computer, and equation 2.35 is used to determine the
holographic pattern. Note that the continuous variables are replaced
by discrete steps. At each sample point the actual value is
represented by a finite number. The value may be complex, but the
accuracy is limited by the sampling system. In any case the
holographic pattern is computed and written to the photographic plate.
The writing device is known as continuous-tone when the transmittance
of each point in the holographic plate can be controlled over a wide
range of transmittance values. That is, the transmittance varies
smoothly from clear to opaque, including gray scale values between.
These continuous-tone holograms most closely resemble the optically
generated holograms when the sampling is dense and many gray scale
values are available.
When continuous-tone holograms are written to the photographic
plate using equation 2.35 as the model, they include a D.C. term, a
square magnitude term, and the heterodyned terms due to the tilted
reference wave. Note that the first two terms are real valued and
that the sum of the last two terms is real valued. On the computer,
the film process is emulated using equation 2.35 or other coding
schemes for specific applications. The D.C. and square magnitude
terms need not be included in the computer-generated hologram as long
as the heterodyned terms are scaled and biased to provide results
between 0 and 1. The heterodyned terms contain the desired
information. Omission of the baseband terms has no adverse effect on
the hologram. The square magnitude term typically contains a large
dynamic range. Its omission from the coding algorithm helps reduce
the dynamic range of the hologram and, in most cases, improves the
hologram. Equation 3.3 can be replaced by the expressions
H(u,v) = 2|F(u,v)| + F(u,v)exp(−j2πau) + F*(u,v)exp(j2πau)    (3.4)

H(u,v) = A² + F(u,v)exp(−j2πau) + F*(u,v)exp(j2πau)           (3.5)
where each of these expressions includes the reference information, is
real valued, and is nonnegative.
The dynamic range in the hologram, defined as the largest value
divided by the smallest value, is limited by the writing device used
to create the hologram. Most films have dynamic ranges much less than
10,000. That is, the clearest portions of the film can transmit light
no better than 10,000 times better than the darkest portions. If the
coding scheme requires a dynamic range of over 10,000, the writing
device cannot faithfully reproduce the holographic pattern.
Unfortunately, the dynamic range of film is frequently much less than
10,000 and closer to 100. Additionally, the writing device also
limits the dynamic range. Most continuous-tone writing devices, which
are attached to computers, convert an integer value to an intensity on
a cathoderay tube or flying spot scanner. Due to physical
limitations in the writing intensity, the dynamic range is usually
much less than 1000. Most commercially available computer writing
devices are designed with a dynamic range of 256 or 8-bit writing
accuracy. The resultant transmittance on the film will have one of
256 quantized levels determined by an integer value provided by the
computer. Quantization occurs when all values in a specified range
are assigned to a quantized value representative of that range. If
the quantization steps become large, the quantized level may be a poor
estimate of the actual values. The estimate is equivalent to the
actual pattern with an additive noise called quantization noise.
Quantization noise occurs in computer-generated holograms because the
computer-graphic devices have limited gray levels and a limited number
of addressable locations in their outputs. Quantizing the holographic
pattern into 256 gray scale levels introduces quantizing noise which
may be considerable when the dynamic range of the pattern is large.
To minimize the quantizing error, the coding scheme must produce a
result with a dynamic range compatible with the writing system.
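The quantization noise described above is easy to exhibit. The following NumPy fragment is an illustrative sketch, not from the original text: a pattern scaled into [0, 1] is written at 256 uniformly spaced levels, as an 8-bit continuous-tone device would, and the resulting error is bounded by half a quantization step.

```python
import numpy as np

def quantize(pattern, levels=256):
    # Map values in [0, 1] onto uniformly spaced quantized levels,
    # as an 8-bit continuous-tone writing device would.
    return np.round(pattern * (levels - 1)) / (levels - 1)

rng = np.random.default_rng(4)
pattern = rng.random(10000)          # hologram values scaled into [0, 1]
q = quantize(pattern)
noise = q - pattern                  # the quantization noise
```

For uniformly distributed values the noise standard deviation is approximately one quantization step divided by the square root of 12, so scaling the coding scheme to fit the writer's range directly controls this error.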
Some writing systems are capable of only two quantized levels.
These binary devices are either on or off. Most metal etchers, ink
plotters, dot matrix printers, and lithographic printers are binary.
The mark they create is either completely on or completely off. To
represent the reference pattern on binary media accurately requires
specialized coding schemes.
Binary Holograms
Using the normal definition of dynamic range, binary
holograms have a dynamic range of 1. The transmittance at each point
is completely on or completely off. All gray scale effects must be
created by grouping light and dark areas together and averaging over
an area large enough to provide the needed dynamic range. In this
case the dynamic range is the averaging area. Thus, dynamic range is
exchanged for increased area to represent each point. This is similar
to Pulse Code Modulation (PCM) in electronic communication
systems.26 In PCM, each sample value is quantized to M levels. Then
each level is represented by a binary code requiring N=log2 M bits.
Rather than represent each point with a continuous variable with
sufficient dynamic range, N binary variables are used. Each variable
is either on or off, but N variables are required to provide
sufficient dynamic range. This exchanges dynamic range of the
variables for the number of variables required. In binary holograms,
the variables are not, in general, exponentially weighted as in PCM;
thus, M variables are required to represent M levels. It becomes very
important to code the hologram such that the number of variables M
needed to represent that dynamic range is reasonable.
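The trade described above can be stated numerically. This small illustrative Python fragment compares the exponentially weighted bits of PCM against the unweighted variables of a binary hologram for the same number of levels.

```python
import math

M = 256                                    # quantization levels required
pcm_bits = math.ceil(math.log2(M))         # PCM: exponentially weighted bits
binary_hologram_vars = M                   # unweighted binary subcells
print(pcm_bits, binary_hologram_vars)      # prints: 8 256
```

The gap between 8 and 256 is why the choice of coding scheme, and the dynamic range it demands, dominates the area cost of a binary hologram.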
One of the first practical binary coding schemes was introduced
when, in 1966, Brown and Lohmann8 devised a method for complex
spatial filtering using binary masks. They coded the Fourier
transform of an image f(x,y). When using this method, the complex
Fourier transform is sampled and represented at each point by an
amplitude and phase. To record a complex filter, both amplitude and
phase information are needed on the hologram. However, the hologram
transmittance is real-valued, non-negative, and in this case binary.
The amplitude can be recorded by opening or closing an appropriate
number of binary windows in the hologram, but the phase is not
correct. Brown and Lohmann proposed turning the hologram at an angle
to the incoming waveform. Thus, along the surface of the hologram, a
phase shift occurs. This phase shift is proportional to the position
along the hologram. Using this "tilted wave" technique, a phase shift
occurs as the aperture moves up and down the hologram causing the
total path length through that aperture to change. The further the
detour through the aperture, the larger the phase shift. Phase shift
induced by this technique is known as detour phase. Thus, in the
BrownLohmann hologram, an aperture is moved up and down to create the
appropriate phase shift. The size of the aperture is varied to allow
the appropriate amount of light through. To synthesize the complex
filter function F(u,v), a continuous function is sampled. The cells,
of size Δu by Δv, must be sufficiently small that the function F
will be effectively constant throughout the cell:

F(u,v) = F(nΔu, mΔv) = Fnm = Anm exp(jφnm)               (3.6)

where n and m are integers.
For each cell in the hologram, the amplitude and phase are determined
by the size and position of an aperture as shown in Figure 3.1. From
each cell a complex light amplitude Fnm will emerge. The tilted wave
must approach at an angle steep enough to allow for a full wavelength
of detour phase within one cell. The dynamic range of the amplitude
and phase is limited by the number of resolvable points within the
cell. If a cell has only 4 by 4 resolvable points, the dynamic range
of the amplitude or phase can be no better than 4. The granularity in
the amplitude and phase may cause distortion in the reconstructed
image. Many points are required to represent a transform with a large
dynamic range accurately.

[Figure 3.1 Brown and Lohmann CGH cell: the aperture's position within
the cell (x = nΔx, y = mΔy) provides the detour phase shift; the
aperture's size provides the amplitude.]
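The Brown-Lohmann cell construction can be sketched as follows. This NumPy fragment is an illustrative sketch, not the authors' implementation; the one-pixel slit width and the vertical centering of the aperture are simplifying assumptions. Each complex sample Fnm becomes one binary cell in which aperture height encodes the amplitude Anm and lateral offset encodes the detour phase φnm.

```python
import numpy as np

def lohmann_hologram(F, cell=16):
    # Binary CGH cells a la Brown and Lohmann: each sample
    # Fnm = Anm exp(j phi_nm) becomes one cell x cell block.  A
    # one-pixel slit is opened; its height encodes Anm and its lateral
    # offset within the cell encodes phi_nm as detour phase.
    Fmax = np.abs(F).max()
    rows, cols = F.shape
    holo = np.zeros((rows * cell, cols * cell), dtype=np.uint8)
    for n in range(rows):
        for m in range(cols):
            height = int(round(np.abs(F[n, m]) / Fmax * cell))
            shift = int(round(np.angle(F[n, m]) / (2 * np.pi) * cell)) % cell
            r0 = n * cell + (cell - height) // 2      # vertically centered
            holo[r0:r0 + height, m * cell + shift] = 1
    return holo

rng = np.random.default_rng(5)
F = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
holo = lohmann_hologram(F)           # a 64 x 64 binary (0/1) mask
```

With 16 resolvable points per cell side, both amplitude and detour phase are quantized to 16 levels, illustrating the granularity limitation discussed above.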
Lee9 proposed a method in 1970 which helped relieve some of
the phase granularity. The BrownLohmann technique represented each
cell with an amplitude and phase component. The complex value for
each cell may be represented by a magnitude and phase or by the sum of
inphase and outofphase terms. The Lee method represents each cell
with such a quadrature representation. For each cell the magnitude
and phase are converted to real and imaginary components. As in the
BrownLohmann method, the tilted wave is set to provide a wavelength
of delay across the cell. The cell is divided into four components
which represent the positive and negative real and imaginary axes.
Lee defined the functions as
|F(u,v)|exp[jφ(u,v)] = F1(u,v) + jF2(u,v) − F3(u,v) − jF4(u,v)    (3.7)

where

F1(u,v) = |F(u,v)|cos φ(u,v) if cos φ(u,v) > 0,
        = 0 otherwise,

F2(u,v) = |F(u,v)|sin φ(u,v) if sin φ(u,v) > 0,
        = 0 otherwise,

F3(u,v) = −|F(u,v)|cos φ(u,v) if cos φ(u,v) < 0,
        = 0 otherwise,

F4(u,v) = −|F(u,v)|sin φ(u,v) if sin φ(u,v) < 0,
        = 0 otherwise.
For any given complex value, two of the four components are zero.
Each of the components Fn(u,v) is real and nonnegative and can be
recorded on film. The Lee hologram uses four apertures for each cell,
corresponding to the quadrature components shown in Figure 3.2. Each
aperture is positioned to cause a quarter-wave phase shift by
increased path length (detour phase). The two
nonnegative quadrature terms are weighted to vector sum to the
appropriate magnitude and phase for each pixel. The two appropriate
apertures are opened according to their weight. The Lee method uses
continuous-tone variables to represent the two nonzero components. The
The phase is no longer quantized by the location of the aperture. The
phase is determined by the vector addition of the two nonzero
components. In a totally binary application of the Lee method, the
apertures are rectangles positioned to obtain the quarter-wave shift.
The area of each aperture is adjusted to determine the amplitude of
each component. Once again, in this binary case, the dynamic range is
limited by the number of resolution elements within one cell.
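The Lee decomposition of equation 3.7 is straightforward to compute. The following NumPy fragment is an illustrative sketch: it splits complex samples into the four non-negative quadrature components, of which at most two are nonzero at any sample, and checks that their vector sum recovers the original value.

```python
import numpy as np

def lee_components(F):
    # Decompose complex samples into the four non-negative quadrature
    # components of eq. 3.7; at most two are nonzero at each sample.
    re, im = np.real(F), np.imag(F)
    F1 = np.where(re > 0, re, 0.0)    # +Re axis
    F2 = np.where(im > 0, im, 0.0)    # +Im axis
    F3 = np.where(re < 0, -re, 0.0)   # -Re axis
    F4 = np.where(im < 0, -im, 0.0)   # -Im axis
    return F1, F2, F3, F4

rng = np.random.default_rng(6)
F = rng.standard_normal(100) + 1j * rng.standard_normal(100)
F1, F2, F3, F4 = lee_components(F)
recon = F1 - F3 + 1j * (F2 - F4)     # vector sum recovers the original value
```

Each component is real and non-negative, so each can be written as an aperture weight on film, which is the point of the decomposition.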
Burckhardt10 showed that while the Lee method decomposes the
complex-valued F(u,v) into four real and positive components, only
three components are required. Each cell can be represented by three
components 120° apart. Any point on the complex plane can be
represented as a sum of any two of these three components. As in the
Lee method, two nonnegative components are chosen to represent each
cell. Because only three instead of four components have to be
stored, the required memory size and plotter resolution are reduced.
Haskell11 describes a technique in which the hologram cell is divided
into N components equally spaced around the complex plane. It is
identical to the binary Lee (N=4) and the Burckhardt (N=3) where N may
take larger values. This Generalized Binary Computer-Generated
Hologram (GBCGH) uses N columns and K rows of subcells. Each subcell
can take a transmittance value of 1 or 0. The phase is delayed by 2π/N
to provide N unit vectors.

[Figure 3.2 Complex plane showing the four quadrature components
F1, F2, F3, F4.]

The K cells in each component are "opened"
or "closed" to provide the appropriate weights for each component.
The control over the amplitude and phase is not absolute with finite N
and K. The result at each cell is the vector sum of components with
integer length and fixed direction. Figure 3.3 shows that various
combinations of points turned on or off define an array of specific
points addressable in the complex plane. By increasing the number of
points N and K, the amplitude and phase can be more accurately
matched. When the total number of plotter dots is limited and more
subcells used for each cell, fewer cells can exist. Thus, with a
limited number of points, the hologram designer must choose between
space-bandwidth product (number of cells) and quantization noise.
The GBCGH allows more accurate determination of the amplitude and
phase of the cell by using more points. However, the complex sample
to be represented was taken at the center of the aperture. If N, the
number of points in the cell, is large, the outer pixel may have
noticeable error due to the offset in sample location. Allebach12
showed that the Lohmann hologram fell into a class of digital
holograms which sample the object spectrum at the center of each
hologram cell to determine the transmittance of the entire cell. The
Lee hologram fell into a class of digital holograms which sample the
object spectrum at the center of each aperture to determine its size.
He also described a new third class in which the object is sampled at
each resolvable spot to determine the transmittance at that spot.
Although the function to be recorded should be constant over the
entire cell, there is some phase shift across the cell dimensions. By
sampling the object spectrum at the center of each aperture rather
than at the center of each hologram cell, some of the false images in
the reconstruction are removed.

[Figure 3.3 Addressable amplitude and phase locations in the complex
plane using the GBCGH method (N = K = 3).]

By sampling the object spectrum at
the center of each resolvable spot in the hologram, the hologram noise
is further reduced. Allebach described an encoding technique in this
last category known as the Allebach-Keegan (AK) hologram.13 The AK
hologram encodes the complex-valued object spectrum by quadrature
components as does the Lee hologram. Unlike the Lee hologram, the AK
hologram compares subsamples within the aperture to an ordered dither
to determine whether each pixel is on or off. The input image is
padded to provide as many points in the FFT as there are resolvable
points. The FFT is decomposed into components spaced a quarter wave
apart (or more as in the GBCGH). Each point is then compared to a
threshold determined by the threshold matrix. The threshold values
are chosen to quantize the amplitude of each component. The threshold
values divide the range from zero to the spectrum maximum in steps
determined by the Max quantizer.27 The size of the dither matrix and
the corresponding points in the cell can increase as with the GBCGH
but the magnitude and phase are sampled at each pixel.
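The threshold-matrix comparison used by these dither-based encodings can be illustrated with a short numerical sketch. The code below is an illustration only, not the actual AK procedure: the function name, the 2x2 Bayer-style matrix, and the gray ramp are choices of convenience standing in for the Max-quantizer threshold values described above.

```python
import numpy as np

def ordered_dither(values, thresholds):
    """Binarize `values` by tiling a threshold (dither) matrix over
    them: a point turns on where its value exceeds the local threshold."""
    th, tw = thresholds.shape
    H, W = values.shape
    tiled = np.tile(thresholds, (H // th + 1, W // tw + 1))[:H, :W]
    return (values > tiled).astype(np.uint8)

# 2x2 Bayer-style threshold matrix scaled into (0, 1).
bayer = (np.array([[0, 2], [3, 1]]) + 0.5) / 4.0

# A horizontal gray ramp: the fraction of "on" points should track the
# local gray level, preserving the average transmittance.
ramp = np.linspace(0.0, 1.0, 8).reshape(1, 8).repeat(8, axis=0)
binary = ordered_dither(ramp, bayer)
print(binary.mean())   # 0.5, matching the mean gray level of the ramp
```

As with the AK hologram, a larger threshold matrix gives more distinguishable gray levels at the cost of spatial resolution.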
Sampling and Space-Bandwidth Requirements
To represent an image on a computer, the image must be sampled and
quantized into a set of numbers. To sample a continuous image or
function, the value of the function is determined at discrete points.
The values of a function f(x,y) are determined at regular intervals
separated by Δx and Δy. The continuous independent variables x and y
are replaced with discrete sample points denoted by mΔx and nΔy.
Here Δx and Δy are the fixed sample intervals and m and n are
integers. The sampling rate is u = 1/Δx in the x direction and v = 1/Δy
in the y direction. To convert the continuous function f(x,y) to a
sampled version f(mΔx,nΔy), multiply f(x,y) with a grid of narrow unit
pulses at intervals of Δx and Δy. This grid of narrow unit pulses is
defined as
s(x,y) = Σ_{m=−∞}^{∞} Σ_{n=−∞}^{∞} δ(x − mΔx, y − nΔy)   (3.8)
and the sampled image is
f_s(mΔx, nΔy) = f(x,y) s(x,y).   (3.9)
The sampled version is the product of the continuous image and the
sampling function s(x,y). The spectrum of the sampled version can be
determined using the convolution theorem (equation 2.8).
F_s(u,v) = F(u,v) ⊛ S(u,v)   (3.10)
where
F(u,v) is the Fourier transform of f(x,y)
and
S(u,v) is the Fourier transform of s(x,y),
S(u,v) = Σ_{m=−∞}^{∞} Σ_{n=−∞}^{∞} δ(u − mΔu, v − nΔv)
where Δu = 1/Δx and Δv = 1/Δy.
Thus F_s(u,v) = ∫∫ F(u − u₀, v − v₀) Σ_{m=−∞}^{∞} Σ_{n=−∞}^{∞} δ(u₀ − mΔu, v₀ − nΔv) du₀ dv₀   (3.11)
Upon changing the order of summation and integration and invoking the
sampling property of the delta function (equation 2.17), this becomes
F_s(u,v) = Σ_{m=−∞}^{∞} Σ_{n=−∞}^{∞} F(u − mΔu, v − nΔv).   (3.12)
The spectrum of the sampled image consists of the spectrum of the
ideal image repeated over the frequency plane on a grid of spacing (Δu, Δv).
If Au and Av are sufficiently large and the ideal function f(x,y) is
bandlimited, no overlap occurs in the frequency plane. A continuous
image is obtained from the sampled version by spatial filtering to
choose only one order m,n of the sum in equation 3.12. If the image is
undersampled and the frequency components overlap, then no filtering
can separate the different orders and the image is "aliased." To
prevent aliasing, the ideal image must be bandlimited and sampled at a
rate Δu > 2f_u and Δv > 2f_v. The ideal image is restored perfectly when
the sampled version is filtered to pass only the 0,0 order and the
sampling period is chosen such that the image cutoff frequencies lie
within a rectangular region defined by one-half the sampling
frequency. This required sampling rate is known as the Nyquist
criterion. In the image, the sampling period must be equal to, or
smaller than, one-half the period of the finest detail within the
image. This finest detail represents one cycle of the highest spatial
frequency contained in the image. Sampling rates above and below this
criterion are oversampling and undersampling, respectively. To
prevent corruption of the reconstructed image, no overlap of the
desired frequency components can occur.
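The Nyquist criterion is easy to verify numerically. The sketch below (an illustration assuming NumPy; the one-dimensional cosine stands in for image detail) samples a 10 cycle/unit component at an adequate and an inadequate rate:

```python
import numpy as np

def dominant_frequency(samples, dt):
    """Return the frequency (cycles/unit) of the strongest nonzero component."""
    spectrum = np.abs(np.fft.rfft(samples))
    freqs = np.fft.rfftfreq(len(samples), d=dt)
    spectrum[0] = 0.0                      # ignore the D.C. term
    return freqs[np.argmax(spectrum)]

f_signal = 10.0                            # highest frequency present

# Oversampled: 64 samples/unit > 2 * 10, so the component is recovered.
dt = 1.0 / 64.0
t = np.arange(0.0, 1.0, dt)
good = dominant_frequency(np.cos(2 * np.pi * f_signal * t), dt)

# Undersampled: 16 samples/unit < 2 * 10, so the component is aliased.
dt = 1.0 / 16.0
t = np.arange(0.0, 1.0, dt)
aliased = dominant_frequency(np.cos(2 * np.pi * f_signal * t), dt)

print(good, aliased)   # 10.0 recovered; the undersampled case aliases to 16 - 10 = 6
```

Once the component has aliased to 6 cycles/unit, no amount of filtering of the sampled data can recover the original 10 cycle/unit detail, exactly as argued above.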
Frequency overlap is also a problem in holography. Recall that in
equation 3.2 the ideal function f(x,y) was heterodyned to a spatial
carrier frequency by mixing with an off-axis reference beam, i.e.,
h(x,y) = A² + |f(x,y)|² + A f(x,y)e^{j2πay} + A f*(x,y)e^{−j2πay}   (3.13)
and that the spectrum (shown in Figure 3.4) of this recorded signal is
Figure 3.4 Spectral content of an image hologram.
H(u,v) = |A|² + F(u,v)⊛F*(u,v) + A F(u,v+a) + A F*(u,v−a)   (3.14)
where F(u,v) is the Fourier transform of f(x,y) and ⊛ denotes
convolution.
The first term is a delta function at (0,0). The second term is
centered on axis at (0,0) but has twice the width of the spectrum F(u,v).
The third and fourth terms are the Fourier transforms of the f(x,y)
but centered off axis at plus and minus a. To prevent frequency
overlap, the second term and the heterodyned terms must not overlap.
This requires that the spatial carrier frequency, a, used to
heterodyne the information must be sufficiently large. Specifically,
this carrier frequency must be larger than three times the onesided
bandwidth of the information spectrum.
In the case of the Vander Lugt filter and the subsequent
correlation, the output of the holographic matched filter has the form
o(x,y) = A²g(x,y) + g(x,y)⊛f(x,y)⊛f*(−x,−y)
         + g(x,y)⊛f(x,y)⊛δ(x,y+a)
         + g(x,y)⊛f*(−x,−y)⊛δ(x,y−a).   (3.15)
The output, shown in Figure 3.5, contains a replica of the test image
g(x,y) centered on-axis along with a term consisting of the test image
convolved with the autocorrelation of the reference image f(x,y).
This term consumes a width of twice the filter size plus the test
image size. In addition to the on-axis terms, there are two
heterodyned terms centered at plus and minus a. These heterodyned
terms have a width equal to the sum of the widths of the test image
g(x,y) and reference image f(x,y).
Figure 3.5 Spectral content of a Vander Lugt filter.
Again, to prevent overlap of the information terms in the output, a
spatial carrier of sufficiently high frequency is required to separate
the heterodyned terms from the on-axis terms. Assume as an example
that the test image and the reference image have the same size 2B.
The output positions of the various terms can then be shown
graphically. To prevent the information terms from overlapping with
the on-axis terms, the carrier frequency, a, must be chosen to center
the heterodyned terms at 5B or more. In
the general case, the reference image f(x,y) and g(x,y) may have
different sizes. Let 2Bf represent the size of the reference image
and 2Bg represent the size of the test image. Then the requirement on
the carrier frequency, a, to prevent aliasing is
a = 3Bf + 2Bg. (3.16)
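Equation 3.16 is simple enough to wrap in a helper; the function below is a bookkeeping sketch (names are my own) for the minimum carrier frequency given the one-sided bandwidths of the reference and test images:

```python
def min_carrier(Bf, Bg):
    """Minimum spatial carrier frequency a = 3*Bf + 2*Bg (equation 3.16)
    needed to keep the heterodyned terms clear of the on-axis terms.

    Bf: one-sided bandwidth of the reference image f(x,y)
    Bg: one-sided bandwidth of the test image g(x,y)
    """
    return 3 * Bf + 2 * Bg

# Equal-size images of one-sided bandwidth B: the heterodyned terms must
# be centered at 5B or more, as in the graphical argument above.
print(min_carrier(1.0, 1.0))   # 5.0
```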
Sampling and heterodyning cause aliasing when improperly
accomplished. The combination of the two in the CGH requires
specific attention to detail. To create a CGH from a continuous image
f(x,y), it must first be sampled and quantized. According to the
Nyquist criterion, there are two samples for the smallest details in
the image. The sampling rate is at least twice the highest spatial
frequency in the continuous image. If a limited number of sampling
points are available, the image should be low-pass filtered to limit
the highest frequency in the continuous image to half the number of
sampling points. This can be accomplished in an electronic sensor by
blurring the optics before the detector. When using a television
camera to digitize a transparency or film, the camera must be blurred
to match the detail in the continuous image to the number of points in
the digitizer. The detail required in the reference and test images
is determined by the pattern or target to be recognized. To detect
the presence of a desired target while an unwanted object could appear
in the test scene, sufficient detail to discriminate the two must be
included. To pick out cars from a scene which contains both cars and
trucks, the resolution must be adequate to resolve the differences
between the two. This resolution is typically chosen in an ad hoc
fashion using the human eye to determine what resolution is required.
Computer techniques have been used to quantify the resolution
required, but the results are usually not different than what a human
would have decided by eye. Although beyond the scope of this
dissertation, the bandwidth and specific frequencies best suited to
discriminate between targets and clutter can be determined with large
computers operating on adequate training sets.
The resolution must be adequate for target recognition. However,
oversampling beyond that resolution required will drive the CGH to
impractical limits. The resolution in the test image must match that
in the reference image yet the test image usually represents a much
larger area and larger total number of points. If the image already
exists in digital form, the resolution can be reduced by averaging the
image to produce an unaliased image of the appropriate number of
points. If an image is blurred or averaged to reduce the highest
spatial frequency, the detail above that maximum frequency is lost.
That is, all frequency components above the maximum are zero and lost.
Sampling the image properly (Nyquist criterion) permits the perfect
reconstruction of the averaged image, not the original image.
It is worthwhile to define the concept of space-bandwidth product
(SBP) here. The bandwidth of an image is the width of the spatial
frequency content to the highest spectral component. The space is the
physical length over which the image exists. For example, a piece of
film may have a maximum resolution of 100 points/mm with an image
which occupies 1 cm along the length of the film. In this case the
SBP is 100 points/mm X 10 mm = 1000 points. This is in one dimension.
For a square image, the number of points is 1,000,000. The SBP is the
number of resolution points in an image. The maximum SBP capability
of the film may not be utilized by an image recorded on the film, and
the actual SBP of the stored image will depend on the image itself.
In general, the bandwidth will be determined by the finest detail in
the image and the area of the total image. The area of the smallest
detail divided into the total image area defines the SBP. When a
continuous image is sampled at the Nyquist rate, one sample per
resolution point in the image is required. Thus, the SBP of the image
sampled at the Nyquist rate matches that of the continuous image. The
SBP in the sampled image is a very practical detail because each
sample must be stored in the computer memory. The number of
resolution elements in a 4" X 5" holographic film may exceed 10⁸. A
computer cannot practically store such a large number of values. With
a limited number of memory locations on the computer, the sampling
rate and SBP demand careful consideration.
A CGH is created using a digitized image. A continuous film image
may be sampled and quantized to create a nonaliased digital image.
Some imaging sensors output data in digital format with no further
digitizing required. Once the digital image is obtained, the image
values may be manipulated on a digital computer. If this digital
image is encoded on a continuous-tone CGH using equation 2.35 as a
model, a spatial carrier frequency on the Fourier transform of the
image must be induced. The image is encoded as f(mAx,nAy) with a SBP
of M x N where M and N are the number of points in the image in each
direction. If the Fast Fourier Transform (FFT) is applied to the
image, a digital representation of the Fourier transform of the image
is obtained. This transformed image F(mAu,nAv) contains the same
number of points as the image and obviously the same SBP. If the
image contained M points along the x direction, the highest spatial
frequency possible in this image would be M/2 cycles/frame. This
situation would exist when the pixels alternated between 0 and 1 at
every pixel. That is, the image consisted of {0,1,0,1, ...}. The
maximum frequency in the transform is M/2 cycles/frame in both the
positive and negative directions. The FFT algorithm provides the real
and imaginary weights of each frequency component ranging from −M/2+1
cycles/frame to +M/2 cycles/frame in one cycle/frame steps. This
provides M points in the u direction. The same is true for N points
in the v direction. Thus, the first point in the FFT matrix is
(−M/2+1, −N/2+1), the D.C. term is in the M/2 column and N/2 row, and
the last term in the FFT matrix is (M/2, N/2).
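This index layout can be checked against a standard FFT library. The sketch below uses NumPy, whose convention differs only in which endpoint carries the shared Nyquist component (−M/2 rather than +M/2); the count of M frequencies one cycle/frame apart, with the D.C. term at the center after the shift, is the same:

```python
import numpy as np

M = 8                                    # points along one image direction
freqs = np.fft.fftshift(np.fft.fftfreq(M, d=1.0 / M))

# After the center shift the M frequencies run from -M/2 to M/2 - 1 in
# one cycle/frame steps, with the D.C. term at index M//2.
print(freqs)                             # [-4. -3. -2. -1.  0.  1.  2.  3.]
print(freqs[M // 2])                     # 0.0, the D.C. column
```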
It is useful to point out that the FFT describes the frequency
components of the image f(x,y). The FFT pattern itself contains
structure which can also be represented by a Fourier series. That is,
the FFT pattern or image has specific frequency components. Because
the image and the FFT are Fourier transform pairs, the image describes
the frequencies in the FFT pattern. For example, a spike in the image
implies the FFT will be sinusoidal. A spike in the FFT implies the
image is sinusoidal. The existence of a nonzero value on the outer
edge of the image implies the FFT contains a frequency component at
the maximum frequency. A nonzero value on the corner of the image
implies the maximum frequency exists in the FFT pattern, which is M/2 in
the x direction and N/2 in the y direction.
To record the complex Fourier transform as a hologram, the
function F(mAu,nAv) must be heterodyned to a spatial carrier frequency
so as to create a real nonnegative pattern to record on film. To
prevent aliasing, the heterodyne frequency must be sufficiently high.
The frequency components in the hologram are shown in Figure 3.6 and
consist of the D.C. spike, the power spectral density of the function
F(u,v), and the two heterodyned terms. To record the function F(u,v)
on film without distortion from aliasing, the spatial carrier
frequency must be 3 times the highest frequency component of the FFT
pattern. This permits the power spectral density term to exist
adjacent to the heterodyned terms with no overlap. The frequencies in
the hologram extend to plus and minus 4B. Thus, the hologram has a
space-bandwidth product 4 times larger than the original image in the
heterodyne direction. When heterodyned in the v direction as implied
by equation 2.35, the resulting hologram matrix must be larger than
the original image by 4 times in the v direction and 2 times in the u
direction. The spectral content in two dimensions is shown in Figure
3.7. The space-bandwidth product is very large for this CGH to record
the information in H(u,v).
The requirement is even greater when the hologram is to be used as
a Vander Lugt filter. When used as a Vander Lugt filter, the CGH must
diffract the light sufficiently away from the origin and the
additional on-axis terms to prevent aliasing in the correlation plane.
Figure 3.6 Spectral content of a Fourier Transform hologram.
Figure 3.7 Two-dimensional spectrum of the Fourier Transform
hologram.
The output of the Vander Lugt filter is shown in equation 2.37 and the
spectral contents are plotted in Figure 3.5. These spectral components
are shown in two dimensions in Figure 3.8. Here the space-bandwidth
product is 7 times larger than the image in the v direction and 3
times larger than the image in the u direction. To produce a
correlation image without stretching, the samples in the u and v
directions should have the same spacing. Usually for convenience, the
hologram contains the same number of points in both directions, giving
a pattern which is 7B by 7B. The FFT algorithm used on most computers
requires the number of points to be a power of 2. This requires that
the hologram be 8B by 8B. For example, if the original images to be
correlated contain 128 by 128 points, the required continuous-tone CGH
contains 1024 by 1024 points. In a binary hologram, each
continuous-tone point or cell may require many binary points to record the entire
dynamic range of the image.
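The size bookkeeping in this paragraph can be mechanized. The helper below is a sketch (the function name is my own) that rounds the 7-fold spectral width up to the next power of two for the FFT:

```python
def cgh_side(n):
    """Smallest power-of-two side length for a continuous-tone CGH
    matched filter correlating n x n images: the filter-plane spectrum
    spans 7 times the image width, rounded up to a power of two."""
    side = 1
    while side < 7 * n:
        side *= 2
    return side

print(cgh_side(128))   # 1024, matching the example above
```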
This illuminates the key problem with CGH-matched filters. The
space-bandwidth product becomes large for even small images. Yet it
is the ability of optical processors to handle large images with many
points that makes them so attractive. Holograms created with
interferometric techniques contain a large amount of information or a
large space-bandwidth product. However, these optically generated
holograms lack the flexibility offered by CGH. Holographic filters
are produced by either optical or computer means prior to their actual use.
The filter imparts its required transfer function to the test image
without any further computation of the hologram pattern. Even if the
task is difficult, production of the filter is a one-time job. The
more information stored on the hologram, the greater the potential
Figure 3.8 Two-dimensional spectrum of the Vander Lugt filter.
processing capability of the Vander Lugt filter. To produce powerful
yet practical CGH filters, the space-bandwidth product and dynamic
range of the hologram must be understood and minimized within design
criteria.
One key to reducing the space-bandwidth product of the CGH is to
recognize that much of the spectrum is not useful information. The
terms in Figure 3.5 are described as the convolution of f and g, the
baseband terms A²g and g⊛f⊛f*, and the correlation of f and g. Only the
correlation term is useful for our purposes in the Vander Lugt filter,
but the other terms arrive as a byproduct of the square law nature of
the film. The two heterodyned terms which result in the convolution
and correlation of f and g must come as a pair. That is, when the real
part of the heterodyned information is recorded, the plus and minus
frequencies exist. The real part, cos θ, can be written as
(e^{jθ} + e^{−jθ})/2 using Euler's formula. The plus and minus exponents
give rise to the plus and minus frequency terms which become the
convolution and correlation terms. The convolution and correlation
terms are always present in a spatially modulated hologram.
A more efficient hologram is produced using equation 3.5. This
hologram consists of a D.C. term sufficiently large to produce only
nonnegative values and the heterodyned terms.
H(u,v) = A² + F(u,v)e^{j2πav} + F*(u,v)e^{−j2πav}   (3.17)
The output (shown in Figure 3.9) of the Vander Lugt filter using this
hologram is
O(u,v) = A²G(u,v) + F(u,v)G(u,v)e^{j2πav} + F*(u,v)G(u,v)e^{−j2πav}   (3.18)
Figure 3.9 Spectrum of a modified Vander Lugt filter.
or
o(x,y) = A²g(x,y) + f(x,y)⊛g(x,y)⊛δ(x,y+a) + f*(−x,−y)⊛g(x,y)⊛δ(x,y−a)
       = A²g(x,y) + f(x,y)⊛g(x,y)⊛δ(x,y+a) + R_fg(x,y)⊛δ(x,y−a)   (3.19)
which gives the spectrum shown in Figure 3.9 assuming Bf=Bg=B. Here
the spectrum extends to 5B rather than 7B and considerable space
saving is possible. However, the 5B is not a power of 2 and most
computer systems would still be forced to employ 8B points. The terms
in Figure 3.9 are the convolution term, the image term, and the
correlation term. The image term arises from the product of the D.C.
term with the test image g(x,y). In a normal absorption hologram, it
is not possible to eliminate the D.C. term. The image term takes up
the space from −B to B, forcing the spatial carrier frequency to 3B
and requiring 5B total space. If the absorption hologram is replaced
with a bleached hologram where the phase varies across the hologram,
the D.C. term may be eliminated.
As discussed in Chapter II, film may be bleached to produce a
phase modulation. This is accomplished at the expense of the
amplitude modulation. However, this phase hologram behaves much like
the original amplitude or absorption hologram. One advantage of the
bleaching process and the use of phase modulation is the opportunity
to eliminate the D.C. term (set it to zero) and reduce the
space-bandwidth product. Equation 3.17 is changed to
H(u,v) = F'(u,v)e^{j2πav} + F*'(u,v)e^{−j2πav}   (3.20)
where the prime mark (') indicates the function has been modified by
the bleaching process. There is no D.C. term, so the output of the
Vander Lugt filter is
O(u,v) = F'(u,v)G(u,v)e^{j2πav} + F*'(u,v)G(u,v)e^{−j2πav}   (3.21)
or
o(x,y) = f'(x,y)⊛g(x,y)⊛δ(x,y+a) + f*'(−x,−y)⊛g(x,y)⊛δ(x,y−a)   (3.22)
       = f'(x,y)⊛g(x,y)⊛δ(x,y+a) + R_f'g(x,y)⊛δ(x,y−a)
which gives the spectrum shown in Figure 3.10, assuming Bf=Bg=B.
This phase hologram reduces the number of points to 4B, a power of 2.
This is the smallest possible size in a spatially modulated hologram.
As will be shown later, the phase modulation process may significantly
affect the information, and the correlation obtained may be a poor
approximation to the ideal correlation.
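The benefit of a zero-mean filter can be imitated in a purely digital model. The sketch below is my own construction (assuming NumPy): the spatial carrier is omitted, since a digital simulation can hold complex values directly, and subtracting the reference mean plays the role of zeroing the D.C. term as in equation 3.20.

```python
import numpy as np

def matched_filter_correlate(g, f, zero_mean=True):
    """Correlate test scene g with reference f via the frequency domain.
    With zero_mean=True the mean of the reference is removed first,
    the digital analogue of the D.C.-free (bleached) hologram."""
    if zero_mean:
        f = f - f.mean()               # equivalent to zeroing F(0,0)
    F = np.fft.fft2(f, s=g.shape)
    return np.real(np.fft.ifft2(np.fft.fft2(g) * np.conj(F)))

rng = np.random.default_rng(0)
f = rng.random((16, 16))               # 16 x 16 reference
g = np.zeros((64, 64))                 # 64 x 64 test scene
g[24:40, 8:24] = f                     # embed the target at (24, 8)

corr = matched_filter_correlate(g, f)
peak = np.unravel_index(np.argmax(corr), corr.shape)
print(peak)                            # (24, 8), the embedded target
```

Note that this digital model captures only the sampling geometry; it does not reproduce the amplitude-to-phase distortion of actual bleaching discussed above.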
The Vander Lugt filter is typically used to detect the presence of
a small object in a large scene. This implies that Bf may be much
smaller than Bg. In any case, the least theoretical hologram size
using equation 3.20 is still twice the size of the reference image and
test image combined in the y direction. For example, a large scene
consisting of 1024 by 1024 points is to be searched for an object that
would occupy 32 by 32 points in that scene. The smallest
continuous-tone hologram to perform that correlation would contain 2112
points in the y direction (at least 1088 in the x direction). For most
practical applications, the absorption hologram illustrated in Figure
3.9 would be used. For the same example consisting of a 1024 by 1024
test scene and a 32 by 32 reference image, a square hologram would be
at least 2144 by 2144.
Another practical consideration provides some relief in the size
of the correlation plane. The correlation of two images creates a
Figure 3.10 Spectrum of the zero mean Vander Lugt filter.
correlation image whose size is the sum of the individual image sizes.
Nonzero correlation values can exist when any points in the two
images overlap. However, the number of points which overlap becomes
very small near the outer edge of the correlation plane. In a
practical system, a threshold is set to determine correlations which
are "targets" (above threshold) or "background" (below threshold).
When the target fills the reference image and is entirely present in
the test image, the autocorrelation condition exists and the
correlation can be normalized to one. When the target begins to fall
off the edge of the test image, correlations will still occur.
However, the correlation value will fall from unity by the ratio of
the target area present in the test image to the target area in the
reference image. A practical rule of thumb might be to ignore the
correlations when half of the target falls outside the test image in
any direction. This reduces the correlation plane to the size of the
test image, offering some relief to the required hologram size. If
the outer edge of the correlation plane is ignored, it does not matter
if that edge is aliased. This reduces the sampling and heterodyning
requirements in the filter hologram especially when the reference
contains many points. When using the absorption hologram with 50%
aliasing (shown in Figure 3.11), the spatial frequency is
a = Bg + Bf (3.23)
and the number of points in the hologram in the v direction (SBPv) is
SBPv = 2Bg + 3/2 Bf. (3.24)
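Equations 3.23 and 3.24 can be combined into a small helper; the sketch below (names my own) returns the relaxed carrier frequency and v-direction space-bandwidth product when 50% edge aliasing is accepted:

```python
def relaxed_carrier_and_sbp(Bg, Bf):
    """Carrier frequency and v-direction space-bandwidth product when
    50% aliasing of the outer correlation plane is tolerated."""
    a = Bg + Bf                  # equation 3.23
    sbp_v = 2 * Bg + 1.5 * Bf    # equation 3.24
    return a, sbp_v

# A 1024-point test scene (Bg = 512) with a 32-point reference (Bf = 16):
print(relaxed_carrier_and_sbp(512, 16))   # (528, 1048.0)
```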
Figure 3.11 Output of a 50% aliased Vander Lugt filter with
absorption hologram.
Phase encoding this hologram does not relieve the requirement on the
carrier frequency or the total number of points.
correlation plane will fall into the active correlation region if a or
SBPv is reduced from the values given in equation 3.23 and 3.24.
In review, the SBP of the hologram is determined by the
following criteria.
(1) The required resolution in the reference scene to recognize the
desired target.
(2) The size of the reference scene. This is not normally a
significant factor due to the small size of the reference compared to
the size of the test image.
(3) The size of the test scene. The potential advantage of optical
processing is to test a large scene for the presence of the reference
object. The test image must contain the same resolution as the
reference image but covers many times the image area. Thus, the SBP
of the test scene is very large and is the driving factor in the size
of the CGH-matched filter.
(4) Usually, aliasing can be tolerated at the edges. This depends on
the threshold and expected intensity of false targets. When 50%
imposed aliasing can be tolerated, the SBP reduces to an even multiple
of two.
(5) The dynamic range in the reference scene. The hologram must
adequately represent the dynamic range in the reference scene. In the
case of binary holograms, many binary points may be required for
adequate representation of each hologram cell.
(6) Hologram type. The type of CGH produced determines the encoding
scheme and number of points required to represent the SBP and dynamic
range of the reference while preventing aliasing of the active
correlation plane.
(7) Incorporate D.C. elimination when possible to minimize on-axis
terms.
By following these guidelines it is possible to determine the minimum
possible SBP needed in the CGH.
CHAPTER IV
OPTIMIZATION OF CGH-MATCHED FILTERS
The previous chapters describe the basic design techniques
employed to create CGH-matched filters. To determine the performance
of these filters, specific criteria must be established.
Performance Criteria
Because the matched filter is based on maximizing the signal-to-noise
ratio, that criterion is reasonable to apply to the result of the
CGH also. The matched filter created as a result of a CGH is only an
approximation of the ideal filter. The nonlinearities of the film,
along with the sampling, heterodyning, and quantizing of the CGH
process, cause the correlation to be less than ideal. The noise is
not just caused by background in the input image but also by artifacts
from the hologram. The matched filter was intended to recognize a
specific target in a clutter background, yet, in some cases, the
target will vary in size and orientation. There is a tradeoff between
using high resolution to discriminate against false targets and too
much sensitivity for target size and orientation. When modifying the
frequency content of the scene to best distinguish target from
background, the signaltonoise ratio may decrease from the ideal.
Another important property of the optical matched filter is the
efficiency or light throughput. In a practical system, the input
image is illuminated by a laser of limited size and power. Typically
the laser source could be an IR diode putting out 10 mW.6 Even if the
signaltonoise ratio is large, the energy reaching the correlation
plane may be too small to measure. The efficiency of the hologram,
the ratio of the power in the correlation to the power in the input
test image, is an important criterion in evaluating a practical
CGH-matched filter. Mathematically, it is given as
η_H = [∫∫ |g(x,y) ⊗ f*(x,y)|² dx dy] / [∫∫ |g(x,y)|² dx dy]   (4.1)
where η_H has been coined the Horner efficiency,28 f is the reference
scene, g is the test scene, and ⊗ denotes an ideal correlation. The
correlation derived from a Vander Lugt matched filter is not ideal.
To determine the Horner efficiency for a CGHmatched filter, equation
4.1 must include an accurate model of the encoding scheme. This
efficiency can be measured experimentally using a known input source
and calibrated detectors. Caulfield28 estimated that efficiencies for
certain matched filters could be as low as 10⁻⁶. Butler and Riggins29
used models of CGH filters to verify Caulfield's prediction and went
on to recommend techniques for improving the efficiency.
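Equation 4.1 can be evaluated for a modeled filter by replacing the integrals with discrete sums. The sketch below (my own construction, assuming NumPy) estimates the efficiency of an ideal digital correlation; the absolute number depends on normalization conventions, so it is best treated as a relative figure of merit between competing encodings:

```python
import numpy as np

def horner_efficiency(g, f):
    """Discrete analogue of equation 4.1: energy in the correlation of
    g with f divided by energy in the input test scene g."""
    F = np.fft.fft2(f, s=g.shape)
    corr = np.fft.ifft2(np.fft.fft2(g) * np.conj(F))
    return float(np.sum(np.abs(corr) ** 2) / np.sum(np.abs(g) ** 2))

rng = np.random.default_rng(1)
g = rng.random((64, 64))               # test scene
f = g[20:28, 20:28].copy()             # reference cut from the scene
print(horner_efficiency(g, f))
```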
The matched filter is used to determine the presence of a target
in a large scene. A test scene is correlated with a reference, and
the correlation plane is thresholded to indicate the target location.
Occasionally, the Vander Lugt filter will generate correlation values
above the threshold in areas where no target exists. Conversely, the
correlation of an actual target corrupted by noise may be lower than
the threshold. Due to the presence of noise, random and otherwise,
the performance of the filter must be measured in terms of the
probability of detection and the probability of false alarm. The
probability of detection, Pd, is defined as the probability that a
target will be indicated when there is, in fact, a target to be
detected. The probability of false alarm, Pfa, is defined as the
probability that a target will be indicated when there is, in fact, no
target to be detected. These two quantities are correlated by the
presence of noise. If the detection threshold at the correlation
plane is lowered, the probability of detection is increased, but the
probability of false alarm is also increased. As with the efficiency
measurements, determining Pd and Pfa for CGH-matched filters requires
accurate models or optical experiments.
Historically, efficiency was not a concern in laboratory
experiments because powerful lasers were available to overcome the
hologram loss. When attempts are made to improve the efficiency, the
signal-to-noise ratio may suffer. An efficient hologram is
impractical if the signal-to-noise ratio in the correlation plane is
so low that Pd goes down and Pfa goes up significantly. The
performance of matched filters is typically measured in terms of the
Pd and Pfa, but testing requires modeling the entire system and
providing realistic images. All of these measures must be considered
for the cases when the test target deviates from the reference.
Optimization criteria for optical matched filters depend on the
application. To improve the matched filter, modifications to the
filter design have been proposed. These modifications fall into areas
of frequency modification, phase filtering, and phase modulation.
Frequency Emphasis
High frequencies in an image correspond to the small details.
Most images contain large continuous areas bounded by sharp edges.
The large continuous areas contribute to the D.C. and low frequency
content of the image, while the edges contribute to the high
frequencies. If the high frequencies are removed from the image
through spatial filtering, the sharp edges disappear, the large
continuous areas blend together smoothly, and the resultant image
appears soft or blurred. A low-pass image may not provide sufficient
resolution to discriminate between two similar objects. If the low
frequencies are removed from an image, the continuous areas become
dark with only the edges remaining. The image appears sharp with
well-defined edges and detail. This high-pass image provides, to the
human eye, discrimination equal to or better than the original image.
That is, objects are identified and distinguished at least as well as
in the original image. For example, images containing a bright square
area and bright circular area are easily distinguished as a square and
circle. If the high frequencies are removed, both square and circle
appear as blobs with no distinct edges. However, if the low
frequencies are removed, the bright area in the center of the square
and circle disappears, leaving only a bright edge. Yet these bright
edges clearly indicate a square and a circle as shown in Figure 4.1.
Even if the square is not filled in, the edge clearly denotes the
square shape. The edge of the circular area still defines a circle.
The square and circle are easily distinguished in the high-pass
images. The information that distinguishes the square from the circle
is contained in the high frequencies.
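The square-versus-circle argument can be reproduced numerically. The sketch below (assuming NumPy; the shape, sizes, and cutoff are arbitrary choices of mine) high-pass filters a filled square and confirms that the flat interior is suppressed while the edges survive:

```python
import numpy as np

def high_pass(img, cutoff):
    """Zero every frequency component within `cutoff` cycles/frame of
    D.C., then transform back to the image domain."""
    spec = np.fft.fftshift(np.fft.fft2(img))
    M, N = img.shape
    v, u = np.mgrid[-(M // 2):M - M // 2, -(N // 2):N - N // 2]
    spec[u ** 2 + v ** 2 <= cutoff ** 2] = 0.0
    return np.real(np.fft.ifft2(np.fft.ifftshift(spec)))

square = np.zeros((64, 64))
square[16:48, 16:48] = 1.0             # bright filled square

hp = np.abs(high_pass(square, 4))
interior = hp[24:40, 24:40].mean()     # deep inside the square
edge = hp[:, 16].max()                 # along the left edge

# The continuous interior is attenuated; the edge remains prominent.
print(interior < edge)
```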
The traditional matched filter, as outlined in Chapter II, is
created from the complex conjugate of the Fourier transform of the
reference image. Filtering with such a filter is equivalent to
correlating the reference image with a test image. Because most
Figure 4.1 High-frequency emphasis of a square and a disk.
scenes contain large continuous areas with edges, they contain a large
D.C. and low frequency component. Most images have spectra where the
magnitude tends to drop off drastically with increasing frequency.
The energy in the low frequencies may be several orders of magnitude
larger than the high frequencies. However, it is the high frequencies
which contain the useful information in separating the desired target
from false targets. A practical problem with holography is the
dynamic range to be recorded. Film typically cannot record more than
two or three orders of magnitude of dynamic range. To record a
hologram of the Fourier transform, the film must accurately record the
entire dynamic range of the transformed image. If the dynamic range
of the transformed image is too large, the film cannot record the
Fourier transform linearly and the correlation is not ideal. The film
nonlinearity will emphasize some frequencies and attenuate others.
The correlation signaltonoise ratio will suffer if important
frequency components are attenuated. To reduce the dynamic range of
the transformed image and allow linear recording on the hologram, the
useless frequencies in the image should be eliminated. Because the
low frequencies contain most of the image energy but little of the
information, their omission considerably reduces the dynamic range
with little effect on the correlation except to reduce the overall
light through the hologram.
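This behavior can be illustrated with a short modern numerical sketch (the Gaussian test image and the 64 x 64 grid are illustrative assumptions, not from the text): nearly all of the spectral energy sits in the D.C. and low-frequency terms, and omitting them sharply reduces the peak magnitude the film must record.

```python
import numpy as np

# Assumed test image: a smooth Gaussian blob on a 64 x 64 grid.
n = 64
yy, xx = np.indices((n, n))
img = np.exp(-((xx - 32.0) ** 2 + (yy - 32.0) ** 2) / (2 * 6.0 ** 2))

F = np.fft.fftshift(np.fft.fft2(img))   # transform, D.C. at the center
r = np.hypot(xx - 32, yy - 32)          # radial frequency coordinate

energy = np.abs(F) ** 2
low_fraction = energy[r <= 4].sum() / energy.sum()

F_high = np.where(r > 4, F, 0.0)        # omit D.C. and low frequencies
range_before = np.abs(F).max()
range_after = np.abs(F_high).max()      # far smaller peak to record
```

For this image the low-frequency disk of radius 4 carries well over 90% of the spectral energy, and removing it lowers the largest magnitude on the hologram by more than an order of magnitude.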
Determining which frequencies are important in target
discrimination involves considerably more work than can be considered
here. In general, a set of target images and a set of nontarget
images can be compared on a large digital computer to determine which
frequencies appear most in the desired target. This requires a large
data base of true and false targets. Filtered images are correlated
and cross correlated to determine the most discriminating frequencies.
In practice, this process is too time consuming. Certain assumptions
are reasonable in spatial filtering. It is reasonable to assume that
the reference and test images do not have much more detail than is
absolutely necessary to distinguish the true target. To reduce the
number of points needed in the digital imagery, the original sampling
process was accomplished by limiting the spatial frequencies to those
required to recognize the target. Thus, the appropriate filter to
eliminate unnecessary frequency components will have the form of a
highpass filter. The nature of this highpass filter is dependent on
the application of the matched filter.
The matched filter is created for a specific target. If the
target is present, the correlation is larger than for areas of the
image where the target is absent. If the target changes slightly from
the reference stored on the filter, the correlation drops. In a
practical application, small changes in the expected target are the
rule rather than the exception. If the target grows in size, rotates,
or changes its appearance slightly, the correlation may drop below the
threshold. This topic will be discussed further in Chapter V, but it
is necessary to point out that the invariance of the filter to small
changes in the target depends heavily on the frequencies used in the
correlation. Using the previous example, recall that the highpass
images showing the edges allowed discrimination between the square and
circle. If the square were rotated slightly, the results would
change. The crosscorrelation between a square and a slightly rotated
square depends on the frequencies used in the correlation. If only
low frequencies are used, considerable rotation can occur with little
effect on correlation. If high frequencies are used, the cross
correlation drops quickly with rotation. Thus, a matched filter
created from a highpass image to discriminate against out-of-class
targets will not correlate well on in-class targets with small
changes. That is, as more high frequency emphasis is applied to the
matched filter, the discrimination sensitivity is increased. The
probability of false alarm is decreased, but the probability of
detection also drops. The high frequency emphasis is then tied to the
Pd and Pfa which must be specified for a particular application.
There is another advantage to the frequency emphasis of matched
filters. As seen in equation 2.35, the transmission of the hologram
at each point depends on the magnitude of the reference image Fourier
transform. Yet the hologram transmission cannot be greater than 1.
Depending on the dynamic range of the film, the transmission out at
the edge of the hologram corresponding to the high frequencies is very
low or zero. As the magnitude drops off for high frequencies, so does
the transmission of light through the holographic filter, and hence,
filter efficiency is low. However, if the high frequencies are
emphasized (boosted), the transmission at those points in a positive
hologram is likewise emphasized. This creates an overall increase in
the hologram transmission. In an absorption hologram, the light which
is not transmitted is absorbed and lost to the system. The throughput
or efficiency is highly dependent on the total transmission of the
hologram. Thus, by emphasizing the high frequencies, the efficiency
of the Vander Lugt filter is increased. Because the maximum
transmission is limited to 1 and the dynamic range is limited on the
film, the greatest efficiency occurs when most of the frequencies have
equal weighting and the transmission is close to 1 across the entire
hologram. This implies that the throughput of the hologram will be
largest when the image transform is nearly white.
The following procedures determine the choice of frequency
emphasis.
(1) Specify the Pd and Pfa for the particular application.
(2) Choose a highpass emphasis which satisfies the Pd and Pfa
requirements. Typical choices include gradient, exponential, and step
filters.
(3) Because the test image should be filtered in the same fashion as
the reference image, the frequency emphasis chosen should be squared
before inclusion in the hologram. This permits the preemphasis of
the test image without a separate stage of spatial filtering. That
is, the test image is spatially filtered for preemphasis with the
same hologram providing the correlation.
(4) The test image is typically much larger than the reference image
and can thus contain frequencies lower than any contained in the
reference. Since those frequency components can never contribute to
correlations, all frequencies below the lowest useful frequency in the
reference should be truncated to the value of the next smaller term.
(5) The frequency emphasis (squared) greatly reduces the dynamic
range of most scenes, simplifying the coding of the CGHmatched filter
and greatly improving the efficiency. The frequencyemphasized CGH
matched filter is created, as shown in Chapter III, but utilizes a
reference image whose frequency content is modified.
F'(u,v) = |P(u,v)|^2 F(u,v)                                   (4.2)
where F' is the modified image transform,
F is the original image transform,
and P(u,v) is the frequency emphasis chosen.
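The squared pre-emphasis of equation 4.2 can be sketched numerically as follows (the ramp emphasis and the square reference pattern are illustrative assumptions):

```python
import numpy as np

def preemphasized_transform(img, emphasis):
    """Apply eq. 4.2: F'(u,v) = |P(u,v)|^2 F(u,v)."""
    F = np.fft.fftshift(np.fft.fft2(img))
    return np.abs(emphasis) ** 2 * F

# A hypothetical gradient (ramp) emphasis P proportional to radius,
# squared on the hologram so the test image is also pre-emphasized
# once during correlation.
n = 64
vv, uu = np.indices((n, n))
P = np.hypot(uu - n // 2, vv - n // 2)

img = np.zeros((n, n))
img[24:40, 24:40] = 1.0                    # assumed reference pattern
F_mod = preemphasized_transform(img, P)    # D.C. term is now zero
```

Because P vanishes at the origin, the D.C. term of the modified transform is exactly zero, which is the low-frequency truncation described above.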
PhaseOnly Filters
The preceding section describes techniques in which the high
frequencies are emphasized. This emphasis usually improves the
discrimination against false targets and increases hologram
efficiency. Frequency emphasis involves the multiplication of the
image transform by a filter function which attenuates or amplifies the
appropriate frequency components. The filter function adjusts the
spectral magnitude of the image. In the Fourier representation of
images, spectral magnitude and phase tend to play different roles and,
in some situations, many of the important features of a signal are
preserved even when only the phase is retained. Oppenheim15 showed
that when the magnitude portion of an image Fourier Transform is set
to an arbitrary constant and the phase left intact, the reconstructed
image closely resembles the original. Features of an image are
clearly identifiable in a phaseonly image but not in a magnitudeonly
image. Statistical arguments by Tescher30 and by Pearlman and Gray31
have been applied to realpart, imaginarypart, and magnitudephase
encoding of the discrete Fourier transform of random sequences. They
conclude that, for equivalent distortion, the phase angle must be
encoded with 1.37 bits more than the magnitude. Kermisch32 analyzed
image reconstructions from kinoforms, a phaseonly hologram. He
developed an expansion of the phaseonly reconstructed image I(x,y) in
the form
I(x,y) = A [Io'(x,y) + 1/8 Io'(x,y) * Ro'(x,y)
     + 3/64 Io'(x,y) * Ro'(x,y) * Ro'(x,y) + . . .]           (4.3)
where Io'(x,y) is the normalized irradiance of the original object,
Ro'(x,y) is the two-dimensional autocorrelation function of Io'(x,y),
and * denotes convolution. The first term represents the desired
image, and the higher terms represent the degradation. Kermisch
showed that the first term contributed 78% to the total radiance in
the image, giving a ratio of 1.8 bits.
The phaseonly image typically emphasizes the edges as in the case
of the highpass filtering as shown in Figure 4.1. This phaseonly
filtering is closely related to the highpass filter. Most images
have spectra where the magnitude tends to drop off with frequency. In
the phase-only image, the magnitude of each frequency component is set
to unity. This is equivalent to multiplying each component by the
reciprocal of its magnitude. The Fourier transform tends to fall off at high
frequencies for most images, giving a moundshaped transform. Thus,
the phaseonly process applied to a moundshaped Fourier Transform is
highpass filtering. The phaseonly image has a highfrequency
emphasis which accentuates edges. The processing to obtain the phase
only image is highly nonlinear. Although the response 1/|F(u,v)|
generally emphasizes high frequencies over low frequencies, it will
have spectral details associated with it which could affect or
obliterate important features in the original. Oppenheim15 proposed
that if the Fourier transform is sufficiently smooth, then
intelligibility will be retained in the phaseonly reconstruction.
That is, if the transform magnitude is smooth and falls off at high
Figure 4.2 Phaseonly filtering of a square and a disk.
frequencies, then the principal effect of the whitening process is to
emphasize the high frequencies and therefore the edges in the image,
thereby retaining many of the recognizable features. In Figures 4.1
and 4.2 the phaseonly filter emphasizes edges more strongly than a
gradient filter for the examples shown.
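The whitening effect can be reproduced with a modern numerical sketch in the spirit of Oppenheim's demonstration (the square test image is an illustrative assumption):

```python
import numpy as np

# Phase-only reconstruction: keep the transform phase, set the
# magnitude to a constant, and invert.
img = np.zeros((64, 64))
img[24:39, 24:39] = 1.0                    # a 15 x 15 bright square

F = np.fft.fft2(img)
recon = np.real(np.fft.ifft2(np.exp(1j * np.angle(F))))

# As in Figure 4.2, the result is edge-emphasized: energy concentrates
# along the square's edges while the flat interior is suppressed.
edge_mean = np.abs(recon[24, 24:39]).mean()         # top edge row
interior_mean = np.abs(recon[28:35, 28:35]).mean()  # deep interior
```

The mean magnitude along an edge of the square is several times larger than in the interior, the edge emphasis described above.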
The advantage of using a phaseonly image or highpass image is
the increase in optical efficiency of the resultant matched filter.
As shown in equation 2.35, the transmission of each hologram element
depends on the magnitude of the reference image Fourier transform. As
the magnitude drops off for high frequencies, so does the transmission
of light through the holographic filter, and hence filter efficiency
is low. If the magnitude is set to unity (phaseonly filter) for all
frequencies, the overall efficiency increases dramatically. The image
transform is white and thus the throughput of the absorption hologram
is highest. Horner14 shows that the maximum throughput efficiency of
an ideal autocorrelation of a 2D rect function is only 44%, while the
autocorrelation using a phase-only filter achieves 100% efficiency.
The phase function φ(u,v) of an image Fourier transform is a
continuous function. To fabricate a phase-only filter for such an
image requires a linear process capable of faithfully reproducing the
whole range of values from 0 to 2π. If the phase is quantized so as
to permit only two values, typically 0 and pi, such a filter is known
as a biphase filter.
H'(u,v) = sgn [cos φ(u,v)] = +1 if Re [H(u,v)] > 0            (4.4)
                           = -1 otherwise
where H(u,v) is the Fourier transform of the filter impulse response
h(x,y), the sgn operator gives the sign of the argument, and H'(u,v)
is the biphase transform. This biphase information is an
approximation to the phaseonly information. In many cases,
reconstructions from this biphase information contain the same detail
as the ideal amplitude and phase information. This would indicate
that much of the information in an image is contained in the sign of
each pixel or where the zerocrossings occur.
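A small numerical sketch makes the binarization of equation 4.4 concrete (the centered square reference is an illustrative assumption); for a reference symmetric about the origin, the transform phase is already 0 or pi, so the biphase filter loses nothing relative to the full phase-only filter:

```python
import numpy as np

# Biphase filter of eq. 4.4: quantize the phase to 0 or pi by keeping
# only the sign of the real part of the transform.
ref = np.zeros((64, 64))
ref[25:40, 25:40] = 1.0                    # symmetric 15 x 15 square
F = np.fft.fft2(np.fft.ifftshift(ref))     # centering makes F real

biphase = np.where(np.real(F) >= 0.0, 1.0, -1.0)
phase_only = np.exp(1j * np.angle(F))

# For this symmetric reference the two filters coincide; for general
# references the biphase filter is only an approximation.
same = np.allclose(biphase, phase_only)
```

For asymmetric references the phase takes intermediate values and the quantization discards information, which is why the biphase filter is described as an approximation to the phase-only filter.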
In converting a complex wave, which contains continuous magnitude
and phase values, to binary values, much is thrown away. If the
reconstructions from the binary image transforms are similar to the
original image, then the biphase conversion reduces redundancy and
eliminates superfluous dynamic range. When this is accomplished in an
optical correlator without significant reduction in signaltonoise
ratio, the CGHmatched filter is greatly simplified. Most important
is the ability to use binary light modulators. A number of electronic
spatial light modulators are commercially available. Of these
modulators, several can be used to phasemodulate a light wave. These
include deformable paddles, liquid crystals, and magnetooptical
modulators. These can be used as bipolar phase modulators.33 If the
information in the reference image can be accurately represented using
only biphase information, binary phase modulators can be used as
realtime holographic filters. The ability to adapt the matched
filter in real time permits scanning the test image for various
targets with varying sizes and orientations. This technique is very
efficient because the light is phase shifted and not attenuated.
Phase Modulation Materials
Recall that spatially modulated holograms are needed for matched
filtering only because film cannot record a complex wavefront. Film
can record only real values. Film may be used to record, at baseband,
the magnitude of a wavefront, or it may be computerencoded and phase
modulated (bleached) to record the phase of a wavefront. Thus,
without using a spatially modulated hologram, the magnitude or phase
may be recorded. If only the phase information of the image is needed
to represent the reference image, a baseband hologram which records
the phase portion of the image transform can be used in the optical
correlator. This onaxis phase hologram, or kinoform, is recorded as
a relief pattern in which appropriate phase delays are induced in the
illuminating wavefront. To produce a Fourier transform kinoform, the
phase is restricted to a range from -pi to +pi. The arctangent of
the ratio of the imaginary and real parts yields such a result. The
film is exposed to a pattern, whose intensity is proportional to the
desired phase, and bleached to create a relief pattern.34 These
kinoforms cannot record the amplitude variation of the image transform
and thus, the filter formed is a phaseonly filter.
Several techniques have been proposed by which the phase could be
modified to introduce amplitude variation in the reconstructed
wavefront.35,36 Chu, Fienup, and Goodman18 used multiple layers of
film to represent both the phase and amplitude variation. Kodachrome
II, for color photography, contains three emulsions. The phase
variation was recorded on one emulsion and the amplitude on another.
The inventors named this technique Referenceless OnAxis Complex
Hologram (ROACH). To introduce amplitude variation to the
reconstructed wavefront, light must be removed from the wavefront,
resulting in a reduction in efficiency.
The reconstruction from the kinoform is formed onaxis and is a
phaseonly image. When the phase values are uniformly distributed
between -pi and +pi, the D.C. or average term is zero. However, if
the phase recording is unbalanced or the phase distribution is not
uniform, a D.C. term will exist in the hologram. When used as a
matched filter, the kinoform must be carefully phasebalanced to
prevent a large D.C. spike from occurring in the correlation plane.
Such a spike would be indistinguishable from an actual correlation.
If the phase hologram is produced using a "real time" holographic
device, the phase might be controlled using a feedback loop to
eliminate the D.C. term prior to correlation. To produce a
"permanent" hologram on film, the exposure and bleaching processes
must be carefully controlled.
Bleaching includes several processes which produce a phase
hologram from an exposed absorption hologram. The bleached hologram
retards the wavefront, causing a phase shift instead of attenuation.
The result is generally an increase in diffraction efficiency but
often with an accompanying decrease of signaltonoise ratio.37 There
are three basic types of bleaches. The "direct" or "rehalogenizing"
method converts metallic silver back into silver halide which has a
different index than that of the gelatin. "Reversal" or
"complementary" bleaches dissolve the metallic silver from an unfixed
hologram, leaving the undeveloped silver halide which varies the index
of refraction. The third process creates a surface relief by
shrinking the exposed portions of the hologram by removing the
metallic silver. When the emulsion is bleached properly, the
attenuation of the transparency can be reduced to the point that phase
modulation due to index changes dominates any residual amplitude
modulation. Phase modulators prove to be more efficient in terms of
the portion of incident illumination that is diffracted to form the
desired correlation. A sinusoidal hologram using absorption or
amplitude modulation can theoretically diffract only 6.25% of the
incident energy into an image. Experimentally, the number is about
4%.38 A phasemodulated hologram transmits all of the light (ignoring
the emulsion, substrate, and reflection losses). A sinusoidal phase
hologram can diffract as much as 33.9% of the incident light into the
first order.
The bleaching process converts the real function F(u,v), recorded
in silver on the film, to a phase delay.
H(u,v) = exp j[ F(u,v) ] (4.5)
To produce a kinoform, the film is exposed to the phase function
θ(u,v) of the image transform. Upon subsequent bleaching, the film
contains the response
H(u,v) = exp j[ θ(u,v) ].                                     (4.6)
The kinoform, produced in this fashion, records the phaseonly
information of the image transform. The bleaching process is not
restricted to phaseonly information. Rather, the absorption hologram
created from equation 2.35 can also be bleached.
H'(u,v) = exp j[ H(u,v) ]                                     (4.7)
= exp j[ 1 + |F(u,v)|^2 + F(u,v) exp(j2πav) + F*(u,v) exp(-j2πav) ]
where H'(u,v) is the bleached hologram response. The phaseonly
information and the phase modulation obtained through bleaching are
entirely independent of one another. That is, a phasemodulated
hologram can be created from an image whose amplitude and phase are
intact or from an image whose amplitude or phase are modified or
removed. Considerable confusion continues to exist in the literature
in which a phase modulation process seems to imply, by default,
phase-only information. Cathey attempted to clarify this confusion in 1970
by defining specific terms for each case.39 The holographic process,
which is independent of the recorded information, was described as (1)
phase holography when only phase modulation was present, (2) amplitude
holography when only amplitude modulation was present, and (3) complex
holography when both amplitude and phase modulation were present. In
an equivalent fashion, the information to be recorded on the hologram
can be described as (1) phaseonly information or (2) amplitudeonly
information when either the amplitude or phase portion of the complex
waveform are deleted. Thus, for example, an amplitude hologram can be
created from phaseonly information.
When an amplitude hologram is bleached, the density at each point
on the film is mapped to a phase delay. This mapping is linear when
the bleaching chemistry is correct. This new phase function on the
film is related to the original pattern on the film.
H(u,v) = exp j{F(u,v)} (4.8)
where H(u,v) is the complex function on the film after bleaching and
F(u,v) was the original transmission pattern recorded on the film.
The exponential expression in 4.5 can be expanded with a series
expression.
H(u,v) = 1 + jF(u,v) - (1/2)F^2(u,v) - j(1/6)F^3(u,v) + ...
       = SUM over n of [jF(u,v)]^n / n!                       (4.9)
When reconstructed, this hologram can be expressed as a series of
convolutions.
h(x,y) = δ(x,y) + jf(x,y) - (1/2)f(x,y)*f(x,y)
     - j(1/6)f(x,y)*f(x,y)*f(x,y) + ...
       = SUM over n of j^n f^(n)(x,y) / n!                    (4.10)
where f^(n)(x,y) = f(x,y)*f(x,y)* ... *f(x,y), n convolutions,
and f^(0)(x,y) = δ(x,y),
f^(1)(x,y) = f(x,y),
f^(2)(x,y) = f(x,y)*f(x,y),
and so on, with * denoting convolution.
Thus, the phase modulation technique is very nonlinear and the
resultant reconstruction is rich with harmonics. The reconstruction
from such a hologram is noisy due to the harmonic content. The higher
order correlations are broader, thus contributing less flux into the
reconstruction. Phase modulation in the form of bleached and
dichromated gelatin holograms has become the rule in display
holography due to the bright images. This fact indicates that the
noise is acceptable in many cases. In fact, the reconstruction of
such display holograms looks very good. Nevertheless, such an example
is deceiving because the repeated convolutions and correlations of
equation 4.10 become more detrimental for more complicated objects,
especially if the object has low contrast.32 The harmonics combine to
produce intermodulation terms within the bandpass of the desired
information, causing an increase in background noise. When used for
matched filtering, the decision to use phase modulation is a balance
between hologram efficiency and signaltonoise ratio.
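As a quick numerical check (a modern sketch, not from the dissertation), the series expansion of equation 4.9 can be verified directly: the bleached response exp jF equals the sum of the harmonic terms (jF)^n / n!.

```python
import numpy as np
from math import factorial

# Numerical check of eq. 4.9: the phase-modulated (bleached) hologram
# exp(jF) equals the sum of the harmonic terms (jF)^n / n!.
rng = np.random.default_rng(0)
F = rng.uniform(0.0, 1.0, size=(8, 8))   # assumed recorded real pattern

H_exact = np.exp(1j * F)
H_series = sum((1j * F) ** n / factorial(n) for n in range(12))
```

Twelve terms already reproduce the exponential to machine precision here; the terms with n >= 2 are the harmonic and intermodulation content that degrades the reconstruction.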
An interesting case occurs when a binary amplitude hologram is
converted to a phase modulation hologram. The bleaching process maps
an amplitude of zero and one to a phase shift of zero and pi.
This equates to an amplitude of minus one and plus one. For this
binary mapping, the transfer function is 2x - 1, which is a linear process. In
that sense, the binary hologram is inherently linear. The binary
hologram represents the continuoustone amplitude hologram by opening
more or fewer binary "windows". Through the use of many "windows,"
the amplitude can be accurately represented by the appropriate
combination of binary values. The subsequent bleaching of the binary
hologram is a linear process and thus no additional harmonics are
contributed. This provides a means by which high efficiency holograms
may be produced without sacrificing signal-to-noise ratio due to
nonlinearity. A sufficient number of points is necessary in the binary
hologram in order to minimize the nonlinearity of the binary CGH
mapping. When a computer and writing device are available to produce
such binary holograms, subsequent bleaching or phase modulation
greatly improves the efficiency without any adverse effect on
signal-to-noise ratio. This makes digital, phasemodulated holograms very
attractive for matched filtering.
CHAPTER V
PATTERN RECOGNITION TECHNIQUES
Coherent optical correlators have been used as a means of
performing 2D pattern recognition.4043 An optical correlator system
could scan a large scene for the presence of a specific pattern. The
input image is correlated with the impulse response of the matched
filter to determine the presence and position of the reference
pattern. Because the Fourier transform is shift invariant (equation
2.6), correlation can occur anywhere in the input image and multiple
targets can be recognized simultaneously. However, other changes in
the input pattern do affect the correlation function. Rotation, scale
changes, and geometrical distortions due to viewing a 3D scene from
various angles can lead to correlation degradation and a corresponding
loss in detectability.44 For example, to recognize a hammer in a box
of tools, the reference must be capable of correlating with the hammer
when it is lying in any orientation from 0 to 360 degrees. The hammer
could lie on either side so that both orientations would need to be included
in the reference image. If we were not sure of the distance from the
camera to the hammer, we would not be sure of its size in the image.
The fundamental difficulty in achieving a practical recognition
system lies in correlation of the reference image with a realtime
image which differs in scale, aspect, contrast, and even content when
sensed in a different spectral band or at a different time than the
reference image. Matched filter pattern recognition systems, both
optical and digital, tend to suffer from two types of difficulties.
They tend to be too sensitive to differences likely to occur in the
desired pattern. These differences are termed "withinclass
variations." Second, they tend to be too insensitive to differences
between real and false targets. These are "betweenclass variations."
While other deformations in the object condition are possible in
specific applications, translation, rotation, and scale are the most
common in pattern recognition whether it is accomplished optically or
digitally.
Deformation Invariant Optical Pattern Recognition
The basic operation performed in an optical processor is a two
dimensional Fourier transform. Matched spatial filters are used to
perform correlations between an input image and a reference pattern.
While the reference pattern may exist in the input image, it may be
deformed by scale, rotation or geometrical distortion. The Fourier
transform is invariant to shift in two dimensions (see equation 2.6).
It is not however invariant to scale or rotation, and a dramatic loss
in signal-to-noise ratio (3 to 30 dB) occurs for small scale changes
(2%) or small rotations (3.5 degrees).44
In some applications it is desirable to give up translation or
shift invariance in substitution for some other deformation
invariance. The technique described by Casasent and Psaltis45
involves a space variant coordinate transformation to convert the
deformation under consideration to a shift in the new coordinate
system. Because the optical processor performs twodimensional
transforms, it is insensitive to shifts in two dimensions. Thus, two
separate invariances can be accommodated. Scale can be converted to a
shift in one direction and the normal shift can be left in the other
dimension. This would provide scale invariance, but the resultant
correlation would yield the location of the target in only one
dimension (i.e., the x coordinate).
In another example, the scale can be converted to shift in one
dimension and rotation converted to shift in another dimension. Such
a twodimensional optical correlator could provide correlations on
rotated and scaled objects but would no longer predict the location of
the object. The twodimensional nature of the optical processor
allows the correlator to be invariant to both deformations. In order
to provide invariance to other deformations, two at a time, a
coordinate transformation is needed to convert that deformation to a
coordinate shift. The Mellin Transform is an excellent example of
such a transformation used to provide scale and rotation invariance.
The Fourier transform is invariant to translation shift in two
dimensions. To provide invariance to other deformations, a coordinate
transformation is needed to convert each deformation to a shift. To
provide scale invariance a logarithmic transformation is used. The
logarithmic transformation converts a multiplicative scale change to
an additive shift. This shifted version will correlate with the
logarithmically transformed reference pattern. To provide rotation
invariance, a transformation is performed to map the angle of each
point in the image to a theta coordinate. If an object rotates in
the test image, it is translated along the theta coordinate. Usually
the two transformations are combined into the log r, theta
transformation. The test image as well as the reference image is
converted to polar form to provide r and theta values for each pixel.
The image is transformed into a coordinate system where one axis is
log r and the other axis is theta. In this system, scale changes
shift the object along the log r axis and rotation shifts the object
along the theta axis. Because this transform, known as the Mellin
Fourier transform, is itself not shift invariant, it is normally
applied to the Fourier transform of the test image. This provides the
shift invariance but loses the location information in the test scene.
The cross correlation between the transformed test and reference
images no longer can provide the location of the object but does
determine the presence of the object, its size, and its rotation
relative to the reference pattern.
To perform the MellinFourier transform for shift, scale, and
rotation invariance, the input image is first Fourier transformed and
the resultant power spectral density recorded. This magnitude array
is converted to polar coordinates and the linear radial frequency is
converted to a logarithmic radial coordinate. The new coordinate
space (log r,theta) is used for crosscorrelation of the input image
with similarly transformed reference images. A high speed technique
is required to convert the image into log r, theta coordinates at a
speed compatible with the optical processor. This has been
demonstrated using holograms to perform geometrical
transformations.4650 To do this, the coordinate transforming
hologram must influence the light from each point and introduce a
specific deflection to the light incorporating such modifications as
local stretch, rotation, and translation.
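The coordinate transformation itself can be sketched with a simple nearest-neighbor resampler (a modern illustrative sketch; the ring test pattern is an assumption): a scale change of the input becomes a translation along the log r axis.

```python
import numpy as np

def logpolar_sample(img, n_r=32, n_theta=64):
    """Resample an image onto (log r, theta) coordinates about its
    center (nearest-neighbor; a sketch, not a production resampler)."""
    n = img.shape[0]
    c = n / 2.0
    log_r = np.linspace(0.0, np.log(n / 2.0), n_r)
    theta = np.linspace(0.0, 2.0 * np.pi, n_theta, endpoint=False)
    rr = np.exp(log_r)[:, None]                  # log-spaced radii
    y = np.clip(np.rint(c + rr * np.sin(theta)).astype(int), 0, n - 1)
    x = np.clip(np.rint(c + rr * np.cos(theta)).astype(int), 0, n - 1)
    return img[y, x]

# Two rings about the center, the second scaled by a factor of 2.
n = 128
yy, xx = np.indices((n, n))
r = np.hypot(xx - n / 2.0, yy - n / 2.0)
ring = ((r > 9) & (r < 11)).astype(float)
ring2 = ((r > 18) & (r < 22)).astype(float)

row1 = logpolar_sample(ring).sum(axis=1).argmax()
row2 = logpolar_sample(ring2).sum(axis=1).argmax()
# Doubling the scale shifts the pattern along log r by log(2).
shift = (row2 - row1) * np.log(64.0) / 31
```

The measured shift between the two rings along the log r axis is log 2, which is exactly the additive shift that a correlator in the transformed coordinates can accommodate.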
A practical correlator system might incorporate such an optical
transforming system or a sensor which collects data in the appropriate
format by the nature of its scan pattern. Whether accomplished by the
sensor scan or by a coordinate transformation, the logarithmic
coordinate transformation is equivalent to resampling an image at
logarithmically spaced intervals. An increase in space bandwidth
(number of samples) is caused by the oversampling which takes place at
small values of the input coordinate. This increased sampling at the
input is a cause for concern in a practical correlator design. In
such a system, the resolution required at the highest sampling rate
fixes the design of the entire system. This may cause the space
bandwidth product required for adequate correlation to exceed the
capability of the sensor. However, Anderson and Callary51 showed
that previous researchers52 had overestimated the spacebandwidth
requirement and that practical MellinFourier correlators were
possible.
Synthetic Discriminant Functions
Another technique for recognizing multiple orientations and
conditions is to crosscorrelate with many different reference images
in parallel. The test image can be transformed by many lenses at
once, with each Fourier transform falling on an array of reference
filters chosen to give reasonable correlation to all conditions. By
the proper choice and location of the inverse transform lens, the
correlations of all the filters can coincide in one common plane.
This parallel setup has been extensively studied by Leib et al.53
They showed that with a memory bank of 23 views of a tank, an optical
correlator could obtain a 98% probability of detection and 1.4% false
alarm rate in scenes exhibiting both scale and rotation variations.
Unfortunately, this parallel technique is somewhat cumbersome to
implement due to alignment of the multiple lenses and filters.
To avoid the need for multiple lenses and filters, it is possible
to combine several reference images into one filter. The use of
multiple lenses and filters superimposes the outputs of the individual
correlators. Because the Fourier transform and correlation are
linear, the superposition at the output is equivalent to superimposing
the individual filter functions into one filter. Likewise, this is
equivalent to superimposing the reference images in the creation of
the filter. Rather than create separate filters from many images, a
single filter is created from a sum of the images. This simplifies
the optical hardware. Caulfield et al.55 define a "composite matched
filter" (CMF) as a single filter which is a linear sum of ordinary
matched filters, MF.
CMF = SUM over k of wk MFk                                    (5.1)
These filters can be implemented by either multiple exposure optical
holography or computer holography. In the optical hologram, the
weights in the linear combination are obtained by varying the exposure
time. The latter approach is to use computers to generate the CMF
offline. In this way, the longdrawnout creation of the CMF is
performed on a digital computer where time is not critical. This
takes advantage of the control, dynamic range, and flexibility of
digital processors.
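The linear combination of equation 5.1 can be sketched numerically (the training views, equal weights, and helper names below are illustrative assumptions, not from the text):

```python
import numpy as np

# Composite matched filter: a weighted sum of ordinary matched filters
# for several training views of the target.
def matched_filter(ref):
    return np.conj(np.fft.fft2(ref))        # ordinary matched filter

def square_view(size, n=64):
    """Hypothetical training view: a centered square of a given size."""
    img = np.zeros((n, n))
    half = size // 2
    img[n // 2 - half:n // 2 + half, n // 2 - half:n // 2 + half] = 1.0
    return img

views = [square_view(s) for s in (12, 16, 20)]   # scale variations
weights = [1.0, 1.0, 1.0]                        # equal weighting
cmf = sum(w * matched_filter(v) for w, v in zip(weights, views))

# The single composite filter responds to any of the training scales;
# an aligned input correlates with a peak at zero shift.
scene = square_view(16)
corr = np.abs(np.fft.ifft2(np.fft.fft2(scene) * cmf))
peak = np.unravel_index(corr.argmax(), corr.shape)
```

Because correlation is linear, this single filter produces the same superimposed output that the bank of separate filters would, without the alignment burden of multiple lenses and filters.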
Once the CMF function is determined, an optical filter is
produced, tested, and optimized. It is then inserted in an optical
correlator to take advantage of its realtime processing. To
implement the CMF optically, two techniques can be used: (1) transform
the digital image to optical image via a high resolution CRT or
digitally addressed camera and produce a Vander Lugt Filter in the
conventional holographic manner, or (2) retain the image in a digital
format and produce the filter through computergenerated hologram
techniques. This latter technique has the advantage of using the
dynamic range of the digital computer until the final product is
produced. That is, if the CMF function is displayed and transformed
optically, the display will limit the dynamic range. By producing a
computergenerated holographic filter, the dynamic range is retained
till a later stage. In addition, complex filter functions and
frequency preemphasis can be easily incorporated.
However the CMF is implemented, the weights must be chosen for
optimal performance in a specific application. Hester and
Casasent56,57 developed what is called the Synthetic Discriminant
Function (SDF) which is a CMF that gives the same correlation output
intensity for each pattern in the training set. The weights required
to provide a constant correlation output for each pattern are not
unique. Additional constraints can be placed upon the SDF to reduce
the response to specific unwanted targets, to reduce dynamic range, or
to incorporate other desirable features. Starting with a training set
(Figure 5.1) which adequately describes the conditions in which the
desired target could be found, the SDF is formed as a linear
combination of all of the training images (Figure 5.2). The weights
are determined using matrix techniques which typically require
considerable time on a large computer.5863 The weights are adjusted
to provide a correlation with each member of the training set as close