AN ELECTRO-OPTICAL IMAGE ALGEBRA PROCESSING SYSTEM FOR AUTOMATIC TARGET RECOGNITION

By

PATRICK C. COFFIELD

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA
1992

ACKNOWLEDGEMENTS

I would like to thank all my committee members for providing the constructive criticism that helped make my research for this dissertation an enjoyable and rewarding experience. I would like to extend a special thank you to my advisor, Dr. Gerhard Ritter, for his patience and for teaching me the value of thorough research. I hope that our working relationship will continue for years to come. I thank the members of the Advanced Guidance Division of the Wright Laboratory Armament Directorate responsible for allowing me to pursue the hardware development of the theoretical ideas expressed in this dissertation. Finally, from the days of the Air Force Armament Laboratory, I would like to thank LtCol. David Laborde, USAF, and Dr. Sam Lambert for allowing me to participate in the Air Force long-term, full-time graduate program.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS
LIST OF FIGURES
LIST OF TABLES
ABSTRACT

CHAPTERS

1 INTRODUCTION
  1.1 The Requirement
  1.2 The ATR Problem
  1.3 Architectural Considerations
  1.4 Outline of the Dissertation

2 BACKGROUND
  2.1 Image Algebra Overview
    2.1.1 Value Sets and Point Sets
    2.1.2 Images
    2.1.3 Operations on Images
    2.1.4 Generalized Templates
    2.1.5 Operations Between Images and Templates
    2.1.6 Operations Between Generalized Templates
  2.2 Digital Optical Cellular Image Processor (DOCIP): A Comparison
    2.2.1 General Description
    2.2.2 Binary Image Algebra
    2.2.3 Evaluation

3 CONFIGURATION DESCRIPTION
  3.1 Correlator Operation for ATR
    3.1.1 Spatial Light Modulation
    3.1.2 Optical System Throughput
    3.1.3 Composite Matched Filter
    3.1.4 Post Correlation Processing
  3.2 The Logical Architecture
  3.3 Image Algebra Process Distribution
    3.3.1 Proof of Concept
    3.3.2 Proof of Phase-Only Filter Application
  3.4 Performance Evaluation
    3.4.1 Example Operation
    3.4.2 Comparison of Execution with RISC Performance
    3.4.3 Algorithm Description and Evaluation
      3.4.3.1 Description
      3.4.3.2 Evaluation
      3.4.3.3 Conclusions
  3.5 Binary Image Operations
    3.5.1 Functional Description
    3.5.2 Image Algebra Process Distribution
  3.6 Variant Template Operations

4 VANDER LUGT SYSTEM IMPLEMENTATION
  4.1 System Description
    4.1.1 Filter Generation
    4.1.2 Filter Implementation
  4.2 Image Algebra Process Distribution
    4.2.1 Morphological Operation
    4.2.2 Negative Real Value Manipulation
    4.2.3 Binary Operation
  4.3 Conclusions

5 ORDER STATISTIC FILTERING
  5.1 Background
    5.1.1 DSP Stack Filter Approach
    5.1.2 Morphological Approach
  5.2 Stack Comparator
    5.2.1 Concept Description
    5.2.2 Proof of Concept
    5.2.3 Time Complexity
  5.3 System Integration and Operation

6 CONCLUSIONS AND SUGGESTIONS FOR FURTHER RESEARCH

REFERENCES

BIOGRAPHICAL SKETCH

LIST OF FIGURES

ATR component structure, input/output diagram
Example of a 2-D rectangular real-valued image
Example of a 2-D template
Example of generalized convolution substitutions
Example of a 2-D generalized backward convolution
Dilation (Boolean) expressed in the language of BIA and image algebra
Optical correlator for ATR
Cross-sectional view of optical correlator
Ferroelectric liquid crystal cellular geometry
Spatial light modulator
Data flow diagram for implementing a ⊛ t
Pipelining the processors
Optical convolver (analog)
Shift, weight, and accumulate
Cellular connection of processors
Comparator enhancement
Example of general convolution
Example of additive maximum
Example of instruction stack reordering
Binary image erosion and dilation
Multiple fixed filter sets
Vander Lugt matched filter generation
Stack filter 3-window median filtering
Stack filter 3-window asymmetric median filtering
Example of morphological median filter
A comparator circuit
Comparator linking in stack structure
Data flow of a conceptual stack comparator
A generalized image-template product hypercube execution
Stack comparator implementation of asymmetric median filter

LIST OF TABLES

1. Component design requirements
2. Image-to-image operations
3. Characteristic functions
4. Image-to-template operations
5. Estimated execution time for proposed design
6. Estimated execution time for RISC design
7. Stack filters which preserve only constant roots
8. Stack filters which preserve only increasing roots
9. Stack filters which preserve only decreasing roots
10. Stack filters which preserve median filter roots and/or other roots

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

AN ELECTRO-OPTICAL IMAGE ALGEBRA PROCESSING SYSTEM FOR AUTOMATIC TARGET RECOGNITION

By Patrick C. Coffield

August 1992
Chairman: Dr. Gerhard X. Ritter
Major Department: Computer and Information Sciences

The proposed electro-optical image algebra processing system is designed specifically for image processing and other related computations. The design is a hybridization of an optical correlator and a massively parallel, single instruction multiple data processor. The architecture of the design consists of three tightly coupled components: a spatial configuration processor (the optical analog portion), a weighting processor (digital), and an accumulation processor (digital). The systolic flow of data and image processing operations is directed by a control buffer and pipelined to each of the three processing components. The image processing operations are defined in terms of basic operations of an image algebra developed by the University of Florida. The algebra is capable of describing all common image-to-image transformations. The merit of this architectural design is how it implements the natural decomposition of algebraic functions into spatially distributed, pointwise operations. The effect of this particular decomposition allows convolution-type operations to be computed strictly as a function of the number of elements in the template (mask, filter, etc.)
instead of the number of picture elements in the image. Thus, a substantial increase in throughput is realized. The implementation of the proposed design may be accomplished in many ways. While a hybrid electro-optical implementation is of primary interest, the benefits and design issues of an all-digital implementation are also discussed. The potential utility of this architectural design lies in its ability to control a large variety of the arithmetic and logic operations of the image algebra's generalized matrix product. The generalized matrix product is the most powerful fundamental operation in the algebra, thus allowing a wide range of applications. No other known device or design has made this claim of processing speed and general implementation of a heterogeneous image algebra.

CHAPTER 1
INTRODUCTION

The objective of this dissertation is to define the logical configuration of a hybrid electro-optical system designed specifically for image processing. The architectural design of the system integrates the operation of an optical correlator with a massively parallel digital processor. This combined system is designed to optimize the operations and functionality of an image algebra developed by the University of Florida under Air Force contract F08635-84-C-0295 [61]. It is in the further interest of the Air Force that the proper implementation of this concept may be useful for advanced guidance systems that provide Automatic Target Recognition (ATR). This issue is also addressed in this dissertation.

There are many processor architectures which have been designed for image processing applications [12,13,17,29,34,40]. In general, such applications have been implemented as a limited set of specific processing routines. All image-to-image transformations can be fully described by the unary, binary, and matrix operations defined in the aforementioned image algebra. No prior architectural design has ever been developed to directly support the functionality of the image algebra. The low-level abstraction of the proposed computational system is founded on the mathematical principle of discrete convolutions and their geometrical decompositions. The geometric decompositions, combined with single instruction multiple data (SIMD) array processing, require redefining specific algebraic operations and optimizing their order of execution. The logical data flow of such an abstraction leads to a division of operations: those defined by pointwise operations and those defined in terms of neighborhood operations.

The intended use of the electro-optical image processing system proposed in this dissertation is that of an integral part of the advanced guidance system for an airborne autonomous vehicle. The sensor for this type of guidance system can be a synthetic aperture radar or a two- or three-dimensional infrared imaging device. The proposed system provides a common modular hardware solution to a wide class of problems associated with these sensing devices. Considering other applications, the implementation can range in use from the embedded form described in this dissertation to a general-use coprocessor configuration coupled with an image processing workstation [60]. Since the design involves such a comprehensive set of operations, it is reasonable to assume that a general-use implementation would have a wide range of commercial applications such as medical imaging, manufacturing robotics, etc., as well as ATR.
1.1 The Requirement

The 1991 war in the Persian Gulf was the first war in which smart weapons were used. The term "smart weapon" was used mostly in the context of pilot-cued contrast tracking. The inertially navigated unmanned cruise missile was as close to autonomous guidance as the state-of-the-art would allow. As a result of the success and limitations observed in this generation of smart weapons, the projection for the next generation is that we will have more intelligent systems that require positive target identification and quick response in a high electronic countermeasure, low-observable environment. This high-threat situation demands a reduction in the need for pilot cuing, preferably a complete standoff, launch-and-leave capability. The real-time computational rates for these next-generation systems are established to be well into the gigaops, or 10^9 operations per second, range. In order to realize the necessary rates, technological advances must be made using massively parallel computational methods.

1.2 The ATR Problem

With respect to any weapon delivery system, the need for target recognition is fundamental for correct system performance. The rationale for automatic target recognition impacts not only the system's performance but the tactics used in delivering the weapon; i.e., pilot acquisition not required, launch and leave. A conventional, blast-fragmentation type munition is typically designed to be effective against an array of different targets. Another important advantage of recognizing the target a priori is knowing when and how to detonate the munition. If the size of an arbitrary target and the relative location of its most vulnerable area are known, then the munition can be detonated in an optimal sequence. Optimally, the warhead explosion is timed to affect the fragmentation distribution (characterizing velocity, residual mass, and direction) to maximize the kill potential. Such a level of sophistication in a weapon system can only be accomplished as an automated subsystem function.

The automatic recognition and subsequent track point location of an object that is contained in a digital image sequence is accomplished in an interrelated series of programmed events executed per image frame. These events have been formally defined by the Automatic Target Recognizer Working Group (ATRWG). The ATRWG is a joint U.S. Department of Defense/Industry working group sponsored by the Defense Advanced Research Projects Agency [7]. A depiction of their interrelationships taken from ATRWG report number 87002 is shown in Figure 1. A summary of the ATRWG definitions is as follows:

(1) Preprocessor: signal conditioning, image enhancement, and/or image transformation. The primary objective of this procedure is to condition or convert the raw image data for the next event, usually the interest point locator. Noise reduction filtering, adjustment of the dynamic range of image intensity via histogram modification, and linear/nonlinear transforms that globally modify the high spatial frequencies of an image are some of the techniques that are typically used.

(2) Interest Point Locator: geometrical or statistical. This procedure is a basic operation that identifies locations within the image that are potential object points. Statistical data differences or values computed globally may be used to isolate points, or specific geometric shapes can be isolated, automatically, using template matching techniques.

(3) Segmentor: edge based, region based, or texture based.
The objective of this procedure is to partition the image into regions that are based on consistent information. The information is normally a refinement of the input to the Feature Extractor.

(4) Feature Extractor: geometrical or statistical. The computation is a fine-grain identification procedure similar to that of the Interest Point Locator. Specific attributes, or feature values, are determined for each segmented region and used to form a corresponding feature vector. This syntactic event forms the input for the Classifier whose output is a semantic event.

(5) Classifier: statistical or syntactic. In some applications, distinct distributions can be determined for classes of feature vectors. Statistical methods are then used to associate, with a level of confidence for objects of interest, the feature vectors that are output from the Feature Extractor with the appropriate distributions or classes. Other applications can use subcomponents or pattern primitives that, in a syntactic sense, relate regions to classes. The syntactic method is favored for its execution speed; however, it is highly dependent upon system resolution. Both methods output the necessary class assignment or object labelling.

(6) Tracker: pixel-to-pixel or object-to-object. Depending on the sophistication of the system, the tracker can be driven by pixel information (e.g., if the input to the Tracker is taken directly from either the Preprocessor or the Interest Point Locator) or by object precedence (e.g., the input comes from the Feature Extractor or Classifier). Both tracking methods are tightly coupled to a system's guidance law, typically a rigid-body response feedback loop designed to minimize a tracking error solution. Note that an optical correlator tracking system is an object-to-object tracker that receives its input directly from the Preprocessor.

The ATR problem arises when the rate at which the system must respond to changes in kinematic conditions, called the update rate, is higher than the information processing rate. When this situation occurs, the tracking error cannot be minimized and in the worst case the system loses its ability to track. For this reason, and for lack of streamlined information processing, current weapon systems are reduced to some form of pilot cuing coupled with contrast tracking.

Figure 1. ATR component structure, input/output diagram.

1.3 Architectural Considerations

In order to solve the ATR problem, the next generation of weapon system guidance must incorporate computational methods which are capable of processing information at rates that are several orders of magnitude higher than the respective update rate. The solution is a time complexity issue. The mathematical notation used in this dissertation to evaluate time complexity is the O-notation [1]. The following is a brief review of the significance of the notation. For any arbitrary function f(n), the asymptotic upper bound is defined in the O-notation as f(n) = O(g(n)) if and only if there exist two positive constants c and n_0 such that |f(n)| ≤ c|g(n)| for all n > n_0. The most common reference computing times are O(1) < O(log n) < O(n) < O(n log n) < O(n^2) < O(n^3) < ... < O(2^n). Whereas O-notation is used to express an upper bound, Ω-notation is used to express a lower bound.
It is defined as follows: f(n) = Ω(g(n)) if and only if there exist two positive constants c and n_0 such that |f(n)| ≥ c|g(n)| for all n > n_0. If g(n) is both an upper and a lower bound on f(n), then the notation Θ is used. Formally, f(n) = Θ(g(n)) if and only if there exist positive constants c_1, c_2, and n_0 such that for all n > n_0, c_1|g(n)| ≤ |f(n)| ≤ c_2|g(n)|.

The question that arises is, how can application algorithms, as summarized above from the ATRWG, be processed with time complexity O(n), O(log n), or even O(1)? The approach taken in this dissertation is to couple the benefit of massively parallel, digital computation with the benefit of optical computation to provide a lowest upper bound. The most important consideration of any algorithm is its mathematical structure. Two questions associated with the mathematical structure are: (1) Is the mathematical structure optimized for time complexity and the kind of parallelism provided by the computer in use? (2) What type of architecture provides the necessary execution speed on the basic operations specified by the mathematical structure?

The architectural design proposed in this dissertation provides the necessary speed at the operator level of the algebraic abstraction. This allows the algorithm designer to be creative at the conceptual level without first knowing how the parallel operations are physically mapped to the underlying architecture. Many application algorithms involving image analysis and processing are central processing unit (CPU) or input/output (I/O) bound, or both; i.e., an algorithm may be less robust due to limitations on processor or data bandwidths. Even considering the hardware speedup due to developments in very large scale integrated (VLSI) components and complementary metal-oxide-semiconductor (CMOS) technology, many algorithm tasks still require throughputs that are beyond current capabilities. Parallel computer architectures have the capability to overcome many of these bandwidth limitations; however, the communication among processors is a physical constraint that is not easily or inexpensively overcome. Optical systems, on the other hand, have such a high space-bandwidth product that these limitations appear insignificant. In the past, the inherent inaccuracies of these analog systems have limited the use of such devices almost exclusively to correlation operations. The interest in these correlator operations is mainly the magnitude and location of the cross-correlation of an image scene with a reference subimage (usually an object to be identified in the input image scene). A large signal-to-noise ratio can be tolerated in such a system and still provide acceptable autocorrelation position information. Recent advances in spatial light modulator design have reduced optical system inaccuracies and increased their switching speeds to the point where they are now viable for the hybrid system described in this dissertation [18]. The approach taken in this dissertation does use the matched filter/correlator principle. However, it is simply taking advantage of the high space-bandwidth product of the optics to provide a fundamental convolution procedure which quickly and efficiently translates the input image while minimizing interprocessor communication.

The benefit from coupling the fine-grain SIMD computational system with the optical correlator can be summarized as follows:

(1) The I/O for any electrically connected SIMD design is a major bottleneck in terms of system throughput.
Most implementations of these designs that are directed towards image processing reduce the effect of this condition by fixing the size of template, or convolution-type, operations over images so that an I/O pipeline can be conveniently configured. By allowing the optical analog to perform the necessary image translations in free space, this restriction is eliminated. Thus, any template configuration can be algebraically processed at a cost proportional to the number of elements in that template.

(2) The free-space distribution, combined with the algebraic decomposition provided in this dissertation, is a simple solution to an otherwise complex interprocessor communication control problem.

(3) The complete arithmetic and logical control the image algebra has over the Preprocessor event, as defined by the ATRWG, provides a more efficient and flexible optical correlation system, a proposed solution to the ATR problem.

(4) The method of sorting while generating the ith rank order for order statistic filtering, defined in Chapter 5, is unique. It removes the combinatorial constraint incurred with the use of large template configurations and either stack filter or morphological operations. The order statistic filtering operation using the proposed design is executed at essentially the same cost function as a single convolution operation with the same template configuration.

(5) The insight gained from the development of this hybrid optical design will lead to further research into an all-digital design for more general applications. Also, a coupling of the SIMD approach taken here with a holographic, free-space information distribution method, sometimes referred to as an optical crossbar switch, will provide an even greater reduction in the switching time among processes.

1.4 Outline of the Dissertation

A historical perspective on the development of the image algebra by the University of Florida and its relationship, similarities and differences, to other image algebras (most notably binary image algebra [41]) is the discussion of Chapter 2. Chapters 3 and 4 cover the conceptual descriptions of two hybrid optical configurations. The reason for presenting two separate configurations is to show applicability to a high-resolution, holographic film implementation as well as a solution to the ATR problem. The first concept is a programmable spatial light modulator, optical correlator implementation. This configuration offers greater flexibility but at lower resolution, which is adequate for a solution to the ATR problem. The second design configuration is based on a Vander Lugt optical correlation principle which uses preset templates, resulting in a different approach to the image algebra integration. Chapter 5 presents a discussion on the use of order statistic filters, with respect to the ATR problem, and their integration with the design configuration proposed in Chapter 3. Other methods of generating the order statistics are investigated and compared to a proposed method that operates at or nearly the same speed as a single convolution. Finally, Chapter 6 concludes the dissertation with a discussion on the use of these findings for further research.

CHAPTER 2
BACKGROUND

Research in the field of optical computing has increased considerably over the last decade, primarily due to the inherent parallelism provided by optical devices. The alluring idea of complex computations performed literally at the speed of light has been perceived as a panacea for a wide range of problems.
To give a quantitative example of the speed advantage, consider the upper bound on the speed of electronic devices to be on the order of picoseconds (10^-12 seconds); optical devices are easily three orders of magnitude faster, femtoseconds (10^-15 seconds) [5]. Parallelism in the optical device is quantified by the space-bandwidth product (SW) of the optical system. The SW is equivalent to the maximum number of spatial degrees of freedom, which is directly proportional to the product of the areas of the input plane and the spatial frequency plane and inversely proportional to the square of the product of the wavelength of the light and the focal length of the system. More formally, SW = aA/(λf)^2, where λ is the wavelength of the light, f is the focal length of the system, a is the area of the input image, and A is the area of the spatial frequency plane [5]. Considering the digital storage required for the equivalent space-bandwidth product, it is clear that the optical device is superior. It is for these reasons that the Air Force devotes considerable research funds to the further development of optical computing methods that provide solutions to the ATR problem.

The optical method reported in this dissertation as a solution to the ATR problem is a hybrid of a massively parallel digital system and an optical correlator. The reasons for selecting this particular optical system are the maturity of the technology and the precision of the target identification process. Also, proprietary sources have demonstrated the capability of storing many thousands of frequency-transformed reference images in a crystal lattice composition. The crystal is then installed in the correlator as a multiple-reference matched filter. The preprocessing step is critical for the proper operation of an optical correlator with a multiple-reference matched filter. The raw input image may, according to the particular application, need to be dynamically enhanced, filtered, rotated, scaled, and/or thresholded. Operations such as these are necessary to ensure the highest probability of autocorrelation for the system; i.e., the highest signal-to-noise ratio. They also need to be generally programmable and executable without significantly degrading the speed of the system. Massively parallel computation at nearly the speed of light is a reality. What remains is the proper interface to programmatically control the application. This dissertation presents that interface and its relationship to an algebra that is specifically designed to express image processing operations.

2.1 Image Algebra Overview

In recent years, several image algebras have been developed [29,41,67,69]. These algebras are for the most part homogeneous, having a finite number of operations for transforming one or more elements of a single set into another element of the same set. The mathematics of image processing, in the most general sense, forms a heterogeneous algebra because it involves many types of operations on and between elements of different sets. If an algebra A properly contains subalgebras that are isomorphic to some other algebras, then A is more powerful than these algebras; i.e., A contains all the laws and rules of these algebras plus more [62]. The algebra we are about to discuss is a heterogeneous algebra specifically designed for image processing and is more powerful than other image processing algebras, such as mathematical morphology, currently discussed in the literature in that it contains these algebras as isomorphic subalgebras [22].
In 1984 Ritter et al., under joint sponsorship of the Air Force Armament Laboratory (AFATL) and the Defense Advanced Research Projects Agency, began the development of a unifying mathematical theory for concepts and operations encountered in image and signal processing. The goal was to develop a complete, unified algebraic structure that would provide a common mathematical environment for image processing algorithm development, optimization, comparison, coding, and performance evaluation [57]. The outgrowth of this effort was the AFATL image algebra, which we shall simply refer to as the image algebra. The image algebra is a heterogeneous algebra capable of expressing all image processing transforms in terms of its many different operations over a family of sets, its operands [63]. It manifests itself as a theory concerned with the interrelationship of two broad mathematical areas, point sets (subsets of topological spaces) and value sets (subsets of algebraic structures; e.g., groups, rings, vector spaces, lattices, etc.). The algebraic manipulation and transformation of geometric/spatial information is the basis for the interrelationship. The transformation of geometric data may result in geometric, statistical, or syntactic information represented by images. The point and value sets along with images and templates form the basic types of operands for the algebraic structure. Operations on and between images are the natural induced operations of the algebraic system. Operations between images and templates and between templates and templates are based on the definition of a generalized product [56]. Isomorphisms have been proven to exist between subalgebras of the image algebra and the homogeneous algebras previously mentioned [61]. Even more noteworthy, subalgebras of the image algebra have been proven to be isomorphic with standard Linear Algebra and Minimax Algebra [22,30]. Image algebra operations, both unary and binary, are capable of defining not only commonly used low-level transforms but higher levels of processing as well; these include neural network methods, feature space manipulation, and others [61]. A complete description of the image algebra can be found in Ritter et al. [63].

2.1.1 Value Sets and Point Sets

In image algebra, value sets are synonymous with homogeneous algebras. For a definition of an algebra, refer to [56]. An arbitrary value set will be denoted by the symbol F. Some of the more common examples of value sets used in image processing are the sets of integers, real numbers, complex numbers, binary numbers of fixed length k, and extended real numbers (which include one or both of the symbols +∞ and -∞). These value sets are denoted by Z, R, C, Z_2^k, R_+∞ = R ∪ {+∞}, R_-∞ = R ∪ {-∞}, and R_±∞ = R_+∞ ∪ R_-∞, respectively. The operations on and between elements of a given value set F ∈ {Z, R, C, Z_2^k, R_+∞, R_-∞, R_±∞} are the usual arithmetic and lattice operations associated with F [61]. For example, the operations on the elements of the value set F, where F = R, are the arithmetic and logic operations of addition, multiplication, and maximum, and the complementary operations of subtraction, division, and minimum. The operations on subsets of F are the operations of union, intersection, set subtraction, choice function, and cardinality function, denoted by ∪, ∩, \, choice, and card, respectively. The choice function returns an arbitrary element of the subset of F while the cardinality function returns the number of elements in the subset of F [63].
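The value-set operations just listed can be made concrete with a small sketch. The following Python fragment is illustrative only; the names and the dictionary of paired operations are ours, not part of any published image algebra implementation. It pairs the arithmetic and lattice operations on elements of F = R with the set operations of union, intersection, set subtraction, choice, and card on subsets of F.

```python
# Illustrative only: value-set operations for F = R and for subsets of F.
element_ops = {
    "+":   lambda r, s: r + s,   # addition        (complementary operation: subtraction)
    "*":   lambda r, s: r * s,   # multiplication  (complementary operation: division)
    "max": max,                  # maximum         (complementary operation: minimum)
}

def choice(subset):
    """Return an arbitrary element of a subset of F."""
    return next(iter(subset))

def card(subset):
    """Return the number of elements in a subset of F."""
    return len(subset)

A, B = {0.0, 1.5, 2.0}, {2.0, 3.0}
print(element_ops["max"](1.5, 3.0))    # 3.0
print(A | B, A & B, A - B)             # union, intersection, set subtraction
print(choice(A), card(A))              # some element of A, 3
```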
In image algebra, the notion of a point set is equivalent to that of a topological space. Point sets most applicable to present-day image processing tasks are subsets of n-dimensional Euclidean space R^n. These point sets provide the flexibility of defining rectangular, hexagonal, toroidal, spherical, and parabolic discrete arrays as well as infinite subsets of R^n [62]. The letters X, Y, and W denote point sets, and elements of point sets are denoted by bold lower case letters to represent vectors in R^n. In particular, if x ∈ X and X ⊂ R^n, then x = (x_1, x_2, ..., x_n), where x_i ∈ R for all i = 1, 2, ..., n. Image algebra operations acting on subsets of point sets are the operations of ∪, ∩, \, choice, and card. The operations acting on or between elements of point sets are the usual operations between coordinate points; i.e., vector addition, scalar and vector multiplication, dot product, etc. [63].

2.1.2 Images

An image is the most fundamental operand in the image algebra and is defined in terms of value sets and point sets. Given point and value sets X and F, respectively, an F-valued image a on X is a function a : X → F. The data structure associated with an F-valued image a on X is the graph of the function a, which we again denote by a; i.e.,

a = {(x, a(x)) : x ∈ X},   (1)

where a(x) ∈ F (see Figure 2). The set X is called the set of image points of a, and the range of the function a (which is a subset of F) is the set of image values of a. The pair (x, a(x)) is called a picture element or pixel, where x is the pixel location of the pixel value a(x). The set of all F-valued images on X is denoted by F^X [63].

2.1.3 Operations on Images

The fundamental operations on and between F-valued images are the naturally induced operations of the algebraic structure of the value set F; in particular, if F = R, then the induced operations on R^X are the pixelwise binary and unary operations of addition, multiplication, maximum, minimum, absolute value, exponentiation, etc. These pixelwise operations will be executed by the weighting and accumulation processors of the proposed architecture [61].

2.1.4 Generalized Templates

A template in Ritter's image algebra is a mathematical entity that generalizes the concepts of templates, masks, windows, and structuring elements, thereby playing an important role in expressing image transformations. Templates provide a tool for describing image operations where each pixel value in the output image depends on the pixel values within some predefined configuration of the input image [49,63].

Figure 2. Example of a 2-D rectangular real-valued image. (If the chosen value set F is equal to R, then a given point set X defines a real-valued image a on X, a : X → R, where x = (x, y) is the pixel location and a(x) is the pixel value at x; a = {(x, a(x)) : x ∈ X, a(x) ∈ F}. In this particular illustration, a discrete rectangular point set X ⊂ Z^2 ⊂ R^2 is chosen, where Z^2 = {(i, j) : i, j ∈ Z}.)

Let X and Y be point sets and F a value set. A generalized F-valued template t from Y to X is a function t : Y → F^X. Thus, a template is simply an image whose values are images; i.e., a template from Y to X is an F^X-valued image on Y. If t ∈ (F^X)^Y, then for each y ∈ Y, t(y) is an F-valued image on X. The set of all F-valued templates from Y to X is denoted by (F^X)^Y. The sets Y and X are referred to as the target domain and the range space of t, respectively. For notational convenience, the symbol t_y is defined as t_y = t(y).
The pixel location y at which a template t_y is evaluated is called the target point of t, and the values t_y(x) are called the weights of t at y [22,63] (see Figure 3). If t is a real or complex valued template from Y to X, then the support of t_y is defined by S(t_y) = {x ∈ X : t_y(x) ≠ 0}. If t is an extended real valued template, then the following supports at infinity are also defined: S_∞(t_y) = {x ∈ X : t_y(x) ≠ ∞}, S_-∞(t_y) = {x ∈ X : t_y(x) ≠ -∞}, and S_±∞(t_y) = {x ∈ X : t_y(x) ≠ ±∞}.

Templates are classified as either translation invariant or translation variant. Translation invariance is an important property because it allows image processing algorithms and transformations that make use of invariant templates to be easily mapped to a wide variety of computer architectures [17,34,37,49]. Suppose X = R^n or X = Z^n. A template t ∈ (F^X)^X is said to be translation invariant if and only if for each triple x, y, z ∈ X, t_y(x) = t_{y+z}(x + z). Translation invariant templates have fixed supports and fixed weights and can be represented pictorially in a concise manner. A template which is not translation invariant is called translation variant, or simply a variant template. These templates have supports that can vary in shape and weights at different target points [63,68].

Figure 3. Example of a 2-D template. (Let X and Y be point sets and F a value set. A generalized F-valued template t from Y to X is a function t : Y → F^X, denoted t ∈ (F^X)^Y. For each target point y ∈ Y, t_y = t(y), where the values t_y(x) are the weights of the template t at y; t_y = {(x, t_y(x)) : x ∈ X, t_y(x) ∈ F}. In this example, Y = X ⊂ Z^2 is a rectangular array and F = R. The support of t at target point y is shown in shaded form.)

2.1.5 Operations Between Images and Templates

Several common image-template operations as defined in the image algebra are used to describe image transformations that make use of local and global convolutions and to change the dimensionality or size and shape of images. Such transformations include image rotation, zooming, image reduction, masked extractions, and matrix multiplication. In an effort to generalize the definition of basic operations between images and templates, the global reduce operation and the generalized product between an image and a template are introduced [63]. Let X ⊂ R^n be finite, where X = {x_1, x_2, ..., x_n}. If γ is an associative and commutative binary operation on the value set F, then the global reduce operation Γ on F^X induced by γ is defined by

Γa = Γ_{x∈X} a(x) = a(x_1) γ a(x_2) γ ... γ a(x_n),   (2)

where a ∈ F^X and Γ is the mapping Γ : F^X → F.

Image-template operations are defined by combining an image and a template with the appropriate binary operation. This is accomplished by means of the generalized product, specifically, the generalized backward and forward convolution operations [22,63]. Let F_1, F_2, and F be three value sets, Y ⊂ R^m, and ○ : F_1 × F_2 → F be a binary operation. If γ is an associative and commutative binary operation on F, a ∈ F_1^X, and t ∈ (F_2^X)^Y, then the generalized backward convolution operation (the generalized forward convolution operation is omitted here) of a with t is the binary operation ⊛ : F_1^X × (F_2^X)^Y → F^Y defined by

a ⊛ t = {(y, b(y)) : b(y) = Γ_{x∈X} a(x) ○ t_y(x), y ∈ Y}.   (3)

Notice that the input image a is an F_1-valued image on the point set X while the output image b is an F-valued image on the point set Y.
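To make the global reduce and the generalized backward convolution of equations (2) and (3) concrete, the following Python sketch represents an image in F^X as a dictionary from points to values and a template as a function from target points to such images; it then reduces a(x) ○ t_y(x) over the points of t_y with an associative and commutative γ. The representation and the function names are ours, chosen only for clarity; they are not the data structures of any particular image algebra implementation.

```python
from functools import reduce

# An image a in F^X: a mapping from points x to values a(x).
# A template t in (F^X)^Y: a function y -> (image on X).

def backward_product(a, t, Y, gamma, circ):
    """Generalized backward product of equation (3):
    b(y) = GAMMA_{x in X} a(x) circ t_y(x), reduced over the points of t_y."""
    b = {}
    for y in Y:
        ty = t(y)                                   # the image t_y on X
        vals = [circ(a[x], w) for x, w in ty.items() if x in a]
        if vals:
            b[y] = reduce(gamma, vals)
    return b

# Example: X = Y = a 4x4 grid, a 3-point invariant template, gamma = +, circ = *
X = [(i, j) for i in range(4) for j in range(4)]
a = {x: float(x[0] + x[1]) for x in X}              # a simple real-valued image

def t(y):
    i, j = y
    # weights 1, 2, 1 at the target point y and its horizontal neighbours
    return {(i, j - 1): 1.0, (i, j): 2.0, (i, j + 1): 1.0}

b = backward_product(a, t, X, gamma=lambda u, v: u + v, circ=lambda u, v: u * v)
print(b[(2, 2)])   # 1*a(2,1) + 2*a(2,2) + 1*a(2,3) = 3 + 8 + 5 = 16.0
```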
Templates can therefore be used to transform an image with range values defined over one point set to an image on a completely different point set with entirely different values from the original image [22,63]. A wide variety of image-template operations can be derived by substituting various binary operations for γ and ○ in the definitions stated above for the generalized backward and forward template operations (see Figure 4) [63]. There are, however, three basic image-template operations defined in the image algebra that are sufficient for describing most image transformations encountered in current areas of image processing and computer vision applications. The three basic operations that are considered in this dissertation are generalized convolution, additive maximum, and multiplicative maximum, denoted here by ⊕, ⊠, and ⊗, respectively. The operation ⊕ is a linear operation that is defined for real and complex valued images. The other two operations, ⊠ and ⊗, are nonlinear operations which are applied to extended real valued images [22,63].

Let X ⊂ Z^n be finite and Y ⊂ R^m. If a ∈ R^X and t ∈ (R^X)^Y, then the backward generalized convolution, see Figure 5, (the forward direction is omitted here) is defined as

a ⊕ t = {(y, c(y)) : c(y) = Σ_{x∈X} a(x) · t_y(x), y ∈ Y}.   (4)

If a ∈ (R_-∞)^X and t ∈ ((R_-∞)^X)^Y, then the backward additive maximum operation is defined as

a ⊠ t = {(y, c(y)) : c(y) = ⋁_{x∈X} a(x) + t_y(x), y ∈ Y},   (5)

where ⋁_{x∈X} a(x) + t_y(x) = max {a(x) + t_y(x) : x ∈ X}. If a ∈ (R_-∞)^X and t ∈ ((R_-∞)^X)^Y, then the backward multiplicative maximum operation is defined as

a ⊗ t = {(y, c(y)) : c(y) = ⋁_{x∈X} a(x) · t_y(x), y ∈ Y}.   (6)

The complementary operations of additive and multiplicative minimum are defined in terms of the additive and multiplicative duals of images and templates and the image-template operations of ⊠ and ⊗ [63]. The relationship between the additive maximum and minimum is given in terms of lattice duality by a ⊠′ t = (t* ⊠ a)*, where ⊠′ denotes the additive minimum, the image a* is defined by a*(x) = [a(x)]*, and the dual of t ∈ ((R_-∞)^X)^Y is the template t* ∈ ((R_+∞)^Y)^X defined by t*_x(y) = [t_y(x)]*. It follows that the additive dual can be simply defined as t*_x(y) = -t_y(x). A similar correspondence for the multiplicative dual is defined by Ritter et al. [63].

Figure 4. Examples of generalized convolution substitutions. (In b(y) = Γ_{x∈X} a(x) ○ t_y(x), y ∈ Y, replacing the symbols Γ_{x∈X} and ○ with: Σ_{x∈X} and · gives spatial convolution or correlation [62]; ∪_{x∈X} and ∩ gives set-theoretic dilation [66], where F is a set of sets; ⋁_{x∈X} and + gives morphological dilation (additive) [62]; ⋁_{x∈X} and · gives multiplicative lattice dilation [62]; OR_{x∈X} and AND gives Boolean dilation [40].)

Figure 5. Example of a 2-D generalized backward convolution. (Let X be a finite subset of Z^2, Y = Z^2 ⊂ R^2, and choose F = R. Suppose a ∈ F^X and t ∈ (F^X)^Y; then wherever the image t_y intersects the image a, the convolved value b(y) = Σ_{x∈X} a(x) t_y(x) will be located at the target point y.)

2.1.6 Operations Between Generalized Templates

The basic binary operations between generalized templates are the same pointwise operations inherent in the value set F [22]. These are the same operations defined between images. The image-template operations of ⊕, ⊠, and ⊗ generalize to operations between templates.
When used as template-template operations, these image algebra operations enable large invariant templates to be decomposed into smaller invariant templates, thereby providing a means for mapping image transformations to various computer architectures [68]. This process is called template decomposition. Combining two or more templates with the operations of ⊕, ⊠, and/or ⊗ is called template composition.

2.2 Digital Optical Cellular Image Processor (DOCIP): A Comparison

The concept of optical image processing has been an item of considerable research [5,26]. Particular interest has been paid to optical array methods which remove the interconnection bottleneck common to conventional semiconductor parallel architectures [29,40,41]. The reason for evaluating the DOCIP is to show that although this high-speed optical image processor has a limited potential for use in an ATR problem solution, it does not have the robust image algebra operational capability of the designs presented in this dissertation.

2.2.1 General Description

The DOCIP, presented by Huang et al., is designed to provide parallel image processing tasks using the inherent optical parallelism and 3-D free-space interconnection capabilities [41]. Both cellular array and cellular hypercube designs are presented in [41]. Only the hypercube design, with its higher degree of connectability, is of interest here since it has the better execution time complexity, O(log N) versus O(N log N) for the cellular array [41]. The DOCIP is intrinsically a fine-grained SIMD machine where each cell operates with low hardware complexity. The hypercube architecture is used to manipulate structuring element locations and configurations, either 4- or 8-connectedness in the cellular automata sense, with respect to each pixel in an N×N image. The overall design of the architecture is tightly coupled to the underlying binary image algebra constructs.

2.2.2 Binary Image Algebra

The programming language initially developed for the DOCIP was later established as a coherent homogeneous algebra referred to as binary image algebra (BIA) [29,41]. BIA was another attempt at a unifying theory. Its formulation was based on a set of three specific fundamental operations taken from the structures of mathematical morphology [67]. Mathematical morphology is used to determine structure in a subimage by transforming underlying modeling sets. The modeling sets are sets of points known as structuring elements. The structuring elements, defined in the image algebra by the support of a template, are designed in such a way as to reveal the spatial characteristics of objects. Morphology used as an algebra for image processing fails as a universal image algebra due to its set-theoretic formulation [57]. Conceptually, in a Turing machine sense, it is possible to use these set-theoretic operations in the linear domain, transformations between different domains, and transformations between different value sets; however, the practicality is beyond that which is reasonable. On the other hand, the image algebra defined by Ritter et al. does not fail in this regard. Furthermore, it includes all morphological operations, either binary or gray scale, as a subalgebra [22].

Formally, the binary image algebra is defined by a set S and a family of operations F over that set. The set S is further defined as the power set of a predefined universal image W, denoted P(W) [41]. The universal image is the set

W = {(x, y) : x ∈ Z_n, y ∈ Z_n},   (7)

where Z_n = {0, ±1, ±2, ..., ±n} and n is a positive integer.
Unfortunately, this definition does not directly convey the notion that this is an algebra over binary images. Additional constraints are added which restrict the values at each point (x, y) to 1 (foreground) and 0 (background). The power set P(W) can be defined simply as the set of all images such that the picture elements (pixels) of each image belong to the value set {0, 1}. The image algebra defined by Ritter et al. defines any image belonging to this power set as

a = {(x, a(x)) : x ∈ X},   (8)

where a(x) ∈ {0, 1} and x is any vector belonging to the point set Z_n × Z_n. In particular, in Ritter's algebra the set Z_2^X, where Z_2 = {0, 1} and X = Z_n × Z_n, corresponds to Huang's universal set S of Boolean images on P(W).

The family of operations F is made up of three fundamental, non-0-ary operations: dilation, union, and complement, denoted ⊕* (the asterisk is used here to avoid confusion with the image algebra notation), ∪, and ¬, respectively. A subset of P(W) containing five images referred to as elementary images is used to derive the necessary structuring elements needed to perform all morphological functions of the three fundamental operations [41]. The elementary images are denoted I, A, A^-1, B, B^-1. Huang denotes his binary image algebra symbolically by

BIA = (P(W); ⊕*, ∪, ¬, I, A, A^-1, B, B^-1).   (9)

Again, we must point out that this notation does not specify the algebra completely since the power set P(W) is not the same as the set of all Boolean images on P(W). Two of the three fundamental operations, union and complement, are basic operations of any useful set-theoretic algebra (BIA and the image algebra defined in [63] are no exceptions) and need no further clarification. The dilation operation, in the strict morphological sense, is the union of all input image translations by the points in the structuring element [67]. BIA defines it as

X ⊕* R = {(x_1 + x_2, y_1 + y_2) ∈ W : (x_1, y_1) ∈ X, (x_2, y_2) ∈ R} if (X ≠ ∅) ∧ (R ≠ ∅), and X ⊕* R = ∅ otherwise,   (10)

where X is an input image, R is a reference image derived using the three fundamental operations in combination with one or more of the five elementary images, ∅ is the null image, and ∧ means logical and. Once again, the definition does not clearly express the intent of morphological dilation nor that the result of the operation is a binary image. The underlying mathematical ring and lattice structures are never specified, leaving the reader to ponder the action of the binary operator +. If, in a Boolean sense, the operator represents an or operation, then the dilation notion is reasonable. The dilation required by BIA is formulated by collectively or-ing a sequence of translations of the input image (translated with respect to each element in the reference image), each being and-ed with the reference image (refer to Figure 4).

The image algebra by Ritter et al., using the concepts of value and point sets, readily defines morphological dilation in this instance. Substituting the global reduce operation maximum, ⋁, and the multiplicative binary associative operator, ·, in the generalized matrix product form, the binary morphological dilation is then defined as

a ⊗ t = {(y, c(y)) : c(y) = ⋁_{x∈X} a(x) · t_y(x), y ∈ X},   (11)

where the input image a ∈ (Z_2)^X, the template (BIA reference image) t ∈ ((Z_2)^X)^X, and the point set X = Z_n × Z_n is as defined above. Figure 6, an example taken from [41], shows the dilation operation for input and reference images.
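The equivalence illustrated by Figure 6 can also be checked numerically. The sketch below, written with NumPy purely for illustration (the array-based representation and function names are our own simplification, not the notation of either algebra), computes c(y) = ⋁_x a(x) · t_y(x) for a translation-invariant binary template and compares it with a direct translate-and-OR dilation in the style of equation (10).

```python
import numpy as np

def dilate_mult_max(a, offsets):
    """Image algebra form (eq. (11)): c(y) = max_x a(x) * t_y(x), where the invariant
    template t_y has unit weights at the points {y - r : r in offsets}."""
    n, m = a.shape
    c = np.zeros_like(a)
    for y0 in range(n):
        for y1 in range(m):
            vals = [a[y0 - r0, y1 - r1]
                    for (r0, r1) in offsets
                    if 0 <= y0 - r0 < n and 0 <= y1 - r1 < m]
            c[y0, y1] = max(vals) if vals else 0
    return c

def dilate_bia(a, offsets):
    """BIA form (eq. (10)): OR together the translates of the input image by each
    point of the reference image."""
    n, m = a.shape
    out = np.zeros_like(a)
    for (r0, r1) in offsets:
        shifted = np.zeros_like(a)
        src = a[max(0, -r0):n - max(0, r0), max(0, -r1):m - max(0, r1)]
        shifted[max(0, r0):n - max(0, -r0), max(0, r1):m - max(0, -r1)] = src
        out |= shifted
    return out

a = np.zeros((7, 7), dtype=int)
a[2:5, 2:5] = np.array([[1, 1, 0], [0, 1, 1], [0, 0, 1]])
cross = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]     # a 3x3 cross reference image
print(np.array_equal(dilate_mult_max(a, cross), dilate_bia(a, cross)))   # True
```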
Without further proof, it is clear that for binary images the multiplicative maximum operation defined in [63] is equivalent to the dilation operation defined for binary image algebra.

2.2.3 Evaluation

Since the three fundamental operations defined by the binary image algebra have equivalent definitions in the image algebra by Ritter et al., and the sets which define any input and reference images (templates) are also equivalent, BIA can be considered to be a subalgebra of the image algebra in [63]; that is, the image algebra [63] can perform any operation combination of the BIA. This has been proven rigorously by Davidson [22].

Figure 6. Dilation (Boolean) expressed in the language of BIA and image algebra. (The figure shows, as 7 × 7 binary arrays, an input image X (image a in image algebra notation), a reference image R (template t), and the resulting dilation X ⊕* R (a ⊗ t).)

As we shall show, the time complexity for binary-only operation of the optical system described in Section 3.5 is O(1). Since it can perform all the operations of the DOCIP (those defined in the BIA), which has operating time complexity O(log N) [41], the binary-only design configuration will run faster for roughly the same cost function, verifying the superiority of the proposed design over the DOCIP.

CHAPTER 3
CONFIGURATION DESCRIPTION

Optical correlators provide not only a solution to the ATR problem, but a method for target identification as well. The complete action of target identification is accomplished by cross-correlating the Fourier transform of an image scene containing a target with a transformed image of the target. This methodology, known as matched filtering, has a computational time complexity of order O(1), which is necessary for high-response-rate systems. The correlator design described in this chapter is an integral part of a proposed autonomous guidance system for a munition directed to defeat high-valued, relocatable ground targets [19]. Although the correlator will be the primary means of automatically recognizing the target, a large portion of a system's guidance still depends upon high-throughput image processing as shown in Figure 7. The hybrid electro-optical design described in this dissertation offers a solution to this image processing requirement. It uses the correlator as a means of rapidly translating the input image over an array of fine-grained SIMD digital processors that allow for the execution of the arithmetic and logic operations of the image algebra discussed in Chapter 2. Guidance algorithms that are composed using the image algebra are capable of satisfying the guidance requirements prior to and during the cross-correlation portion of the guidance solution. The benefit here is a combined system, not two separate processing activities. This equates to lower packaging volume and weight; these requirements are critical to the overall design of any munition guidance system.

Figure 7. Optical correlator for ATR.

3.1 Correlator Operation for ATR

The cross-correlation/matched filter operation referred to in this dissertation is accomplished using a conceptual correlator design based on a conventional Fourier optical system as shown in Figure 8 [55].
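Although the correlator performs this operation optically, the underlying computation can be sketched digitally. The short NumPy program below forms C = ℱ^-1{F(u,v) H*(u,v)} for a synthetic scene and a reference patch and reports the location of the correlation peak. It is only a plain digital analogue of the matched filter operation described in the following subsections, not a model of the correlator hardware; the image sizes, the embedded target, and all names are arbitrary choices of ours.

```python
import numpy as np

def matched_filter_correlation(scene, reference):
    """C = IFFT( FFT(scene) * conj(FFT(reference)) ), zero-padding the reference
    to the scene size. Returns the correlation magnitude and the peak location."""
    h = np.zeros_like(scene)
    h[:reference.shape[0], :reference.shape[1]] = reference
    F = np.fft.fft2(scene)
    H = np.fft.fft2(h)
    C = np.fft.ifft2(F * np.conj(H))
    mag = np.abs(C)
    peak = np.unravel_index(np.argmax(mag), mag.shape)
    return mag, peak

rng = np.random.default_rng(0)
scene = rng.random((128, 128))
target = 1.0 + rng.random((16, 16))
scene[40:56, 72:88] = target            # embed the "target" at row 40, column 72
mag, peak = matched_filter_correlation(scene, target)
print(peak)                             # (40, 72): the correlation spike locates the target
```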
The input image information can be from any sensing device that produces a digital image, e.g., passive Forward-Looking InfraRed (FLIR), Laser Radar (LADAR), Synthetic Aperture Radar (SAR), etc.

3.1.1 Spatial Light Modulation

The most critical component of the optical correlator described in this chapter is the spatial light modulator (SLM). A spatial light modulator is a two-dimensional array of photonic switches that modulate the intensities, phase, or polarizations of the passing light beams. The conceptual correlator design discussed in this dissertation requires a SLM that can provide high resolution (minimum 128×128 pixels), 0-255 pixel gray-level values, and a high switching frequency (40 KHz is used as a baseline for this evaluation). Although the state-of-the-art does not currently provide all the requirements in a single SLM, research shows that, individually, the requirements are met and materials are available to provide the combination of these properties in a single SLM [4,8,20,25,53,65]. The most promising type of SLM that can be expected to meet the requirements of the proposed design is a ferroelectric liquid crystal (FLC) SLM. The liquid crystal material is chiral smectic A [43]. In a smectic liquid crystal material, the molecules are arranged parallel to each other so that diffraction patterns can only be obtained in a single direction. In the chiral smectic A material, the molecules are oriented collinear to an axis that is orthogonal to the plane containing the smectic material (see Figure 9). Application of an electric field causes the molecular director, the null direction, to tilt in linear proportion to the strength of the applied electrical field. This is referred to as an electroclinic effect. This electroclinic effect or switch provides gray-level response at the switching frequency [8]. Power dissipation is a primary limitation of the switching speed of most other SLMs. The power dissipation for FLCs in general is low; in fact, it is inversely proportional to the switching speed [43]. Smectic A devices have demonstrated 100 ns switching speeds; therefore, assuming a naive line multiplex scheme, 128 line transfers (one line of a 128×128 image per transfer) × 100 ns equates to a 78 KHz frame switching frequency [43]. As this technology matures, submicrosecond smectic A devices have the potential for frame switching frequencies at MHz rates.

3.1.2 Optical System Throughput

The incoming digital image information is multiplexed to the input SLM arranged first in the correlator. The output of a single-wavelength laser is collimated and polarized to produce a plane wave of coherent visible light. As the polarized plane wave passes through the elements of the SLM, portions of the plane wave are retarded in phase as shown in Figure 10. The amount of retardation corresponds to the gray-level value of the digital image at each respective pixel location. The emerging wave is no longer perfectly planar but rather a phase modulated wave. The phase modulated wave is then passed through a Fourier transform lens (FTL) where the power spectrum of the Fourier transform of the image information is formed at the focal plane of the lens. The spontaneous polarization P of the smectic material is perpendicular to the yz plane (tilt plane). The application of an electric field produces a torque through a first-order interaction with P such that the average molecular lattice director D is displaced by an angle φ.
As the plane wave of coherent light passes through the crystal lattice, any change in φ will produce a phase change in the plane wave (i.e., phase modulation).

Figure 9. Ferroelectric liquid crystal cellular geometry.

Figure 10. Spatial light modulator.

A second SLM is aligned with the focal plane of the FTL. Fourier transformed phase information for the complex conjugate of the reference image (target) is placed on the SLM, and the emerging wave is further modulated in phase. The effect of this second SLM is multiplicative with respect to the energy passing through the system. Successful cross-correlation is achieved using phase-only filters. These are known as kinoforms, i.e., not relying on diffraction by carrier fringes for phase modulation [39]. When used as a matched filter, the filter in the Fourier plane need only have the same space-bandwidth product as the input image [8]. The cross-correlated information is transformed back to the spatial domain by passing through a second FTL located at the same focal length behind the filter, which performs the inverse Fourier transform. The entire cross-correlation process is mathematically notated as

C = ℱ^-1{ℱ[f(x,y)] · ℱ*[h(x,y)]} = ℱ^-1{F(u,v) H*(u,v)},

where ℱ is the Fourier transform operation, f(x,y) represents the input image, F(u,v) is the Fourier transform of f(x,y), h(x,y) represents the reference or target image, H(u,v) is the Fourier transform of h(x,y), and * denotes the complex conjugate. Also, note that the autocorrelation C occurs when H*(u,v) = F*(u,v).

The cross-correlation information is now passed through an analyzer, consisting of a single polarizer, which regulates the intensity with respect to the amount of phase retardation. Finally, the intensities are collected on a charge coupled device (CCD) array where they are converted back to a digital form. If there is a match, a correlation spike is formed at the CCD array. Because the Fourier transform is linear and shift invariant, the correlation spike can occur anywhere in the input image where there is a match. Once the target is detected, a position update is passed to the guidance and control unit. According to the particular guidance law being used, the system is steered in such a manner that the correlation spike moves to either the center position for zero error, pure pursuit; or to an offset aim point for lead pursuit, selective aim point.

3.1.3 Composite Matched Filter

Two factors, rotation and scale, induce variation in the matched filter, correlator operation. These variations are due to the geometrical distortions inherent in viewing a dynamically changing 3-D scene. For this correlator configuration, these variations are handled using a technique known as the composite matched filter (CMF, defined by Horner and Caulfield [38]). One method of dealing with the above variations is to integrate a system of different focal length lenses with separate filters that present views of the target over a finite number of azimuths and elevations. Although this system would provide satisfactory results, it is far too cumbersome to implement. Due to the linearity of the correlator as a system, the same results can be obtained by superimposing a sufficient number of filters into one composite matched filter.
3.1.3 Composite Matched Filter

Two factors, rotation and scale, induce variation in matched filter correlator operation. These variations are due to the geometrical distortions inherent in viewing a dynamically changing 3-D scene. For this correlator configuration, these variations are handled using a technique known as the composite matched filter (CMF), defined by Horner and Caulfield [38]. One method of dealing with the above variations is to integrate a system of different focal length lenses with separate filters that present views of the target over a finite number of azimuths and elevations. Although this system would provide satisfactory results, it is far too cumbersome to implement. Due to the linearity of the correlator as a system, the same results can be obtained by superimposing a sufficient number of filters into one composite matched filter. This single filter is constructed as a sum of the Fourier transformed reference images and is notated as

    CMF = Σ_{i=1}^{k} w_i MF_i,

where k is the total number of matched filters (MF), and w_i is a weighting function for each filter that modifies the output intensities to provide a constant correlator output for each filter. When the reference images are combined before being transformed, the result is referred to as a synthetic discriminant function [38,44,46]. A sufficiently large set of composite matched filters may now define an operational volume for the correlator (refer to Figure 7). The set of composite matched filters is continuously cycled through the respective SLM until a correlation occurs or, in the event of no correlation, a decision is made to switch to a lower level of image processing.

3.1.4 Post Correlation Processing

Once a correlation is made, a correlation spike is formed on the CCD array. The location of the spike must be quickly processed to determine its position relative to either the center null position or any particular offset aim point. This operation is ideally suited for the design configuration presented in this chapter. Using the characteristic function χ described in Section 2.1.3, the digital values above the background noise are set to unity while all others are set to zero. This allows for a rapid Boolean readout of the location of all 1s on an array of 0s.

A more elegant, yet highly theoretical, design configuration is the notion of on-focal-plane-array digital computation. The fine grained computing elements necessary for a SIMD design are formed using nanoelectronics, an integrated components technology developed by the Central Research Laboratories of Texas Instruments, Inc. [28]. The technology is based on heterostructure fabrication methods that integrate low level operations (e.g., addition, multiplication, comparison, etc.) with the output of each element of the CCD array, located at the element [28]. Although the hardware complexity at this level is very low, it is sufficient to host a subset of image algebra operators to perform a wide range of operations (specifically, χ). The benefit here is that the steering information can be interpreted on the CCD with computational time complexity O(1).

3.2 The Logical Architecture

The logical computer architecture described in this section is specifically designed for the image processing tasks required of an airborne vehicle guidance system when the vehicle is outside the correlator's operational volume (pre/post processing, refer to Figure 7). The proposed design is a hybrid electro-optical architecture consisting of three tightly coupled components: a spatial configuration processor, a weighting processor, and an accumulation processor, as shown in Figure 11. These components are based on the fact, proven in Section 3.3.1, that a general image-template product of the form a ⊛ t, for applicable γ and ∘, is equivalent to

    Γ_{i=1}^{n} (a ⊕ t(i)) ∘ c_i,

where n is the number of elements in the template, a ⊕ t(i) is simply a shift, and c_i is the scalar value or weight of the respective template element. The flow of data and image processing operations is directed by a control buffer and pipelined to each of the three processing components as shown in Figure 12. Table 1 presents a component level description of the operations required of each processing element. The spatial configuration processor is a stepwise optical convolution of the original input image with each template location.
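The stepwise process just described is diagrammed in Figure 11 below. As a software analogue, the following minimal sketch (Python with NumPy; the function names and the toy template are illustrative, not from the original) models one pass through the three components for the general convolution a ⊕ t: the spatial configuration processor produces the shift a ⊕ t(i), the weighting processor applies the scalar c_i, and the accumulation processor reduces with +.

    import numpy as np

    def shift_image(a, dy, dx):
        """Spatial configuration processor: shift image a by the displacement of
        one template element (zero fill at the borders, matching the CCD point set)."""
        out = np.zeros_like(a)
        h, w = a.shape
        out[max(dy, 0):h + min(dy, 0), max(dx, 0):w + min(dx, 0)] = \
            a[max(-dy, 0):h + min(-dy, 0), max(-dx, 0):w + min(-dx, 0)]
        return out

    def pipeline_convolve(a, template):
        """template: list of ((dy, dx), c_i) pairs, one entry per support point."""
        accum = np.zeros_like(a, dtype=float)        # accumulation processor memory
        for (dy, dx), c in template:                 # control buffer loops over elements
            shifted = shift_image(a, dy, dx)         # S: optical shift, a (+) t(i)
            weighted = shifted * c                   # W: combine with scalar c_i
            accum += weighted                        # A: reduce with +
        return accum

    # Example: a von Neumann (4-neighbor plus center) template with unit weights.
    t = [((0, 0), 1), ((1, 0), 1), ((-1, 0), 1), ((0, 1), 1), ((0, -1), 1)]
    a = np.arange(25, dtype=float).reshape(5, 5)
    print(pipeline_convolve(a, t))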
Figure 11. Data flow diagram for implementing a ⊛ t. (The host processor directs I/O data flow and program control. For an input image a and template t, the loop over each template element i sends a through the spatial configuration processor S, which performs the incremental shift a ⊕ t(i); the weighting processor W, which combines the output of S with the value of the respective template element, (a ⊕ t(i)) ∘ c_i; and the accumulation processor A, which accumulates all output of W. The accumulated result is the video output.)

Figure 12. Pipelining the processors. (A uniform number of clock cycles per operation is assumed in the pipe: S_1 S_2 ... S_n, W_1 W_2 ... W_n, A_1 A_2 ... A_n.)

Table 1. Component design requirements.

Spatial configuration processor: A Fourier optical correlator design using binary and high resolution (8-bit) SLM technology (chiral smectic A liquid crystal, FLC) and a charge coupled device (CCD) array to provide the output image in digital form; frame switching frequency 40 KHz.

Weighting processor: A parallel array processor with directly connected input from the CCD array and the capability to load a given scalar value to each arithmetic logic unit (ALU). A scalar value will be loaded into the template register. Each ALU must have a multiplier, an adder, and a modified comparator (Figure 15). The output image will be the result of combining the input image with the scalar using one of the following operations: +, ·, <, =, >, ∧ (and), ∨ (or) (Figure 16) [61].

Accumulation processor: A parallel array processor with directly connected input from the weighting processor. Each ALU will be configured identically to those in the weighting processor with the addition of an accumulator (see Figure 16). The accumulator can be loaded directly by the analog (no shifting and no scalar) operation. The accumulated value (video out) will be the result of combining the two images using one of the seven operations described above.

Control buffer: The host controller (e.g., a Motorola 68040 CPU) that directs all I/O handling and data flow. A translator or compiler must be designed or modified to parse the specific instructions required to interpret the algebraic syntax and direct the execution of the other components.

Figure 13 shows, conceptually, an optical system for controlling the input of each template element. A unit value is assigned to the template element for each convolution. Thus, a sinc function is formed and is inherently multiplied in the frequency domain. The result of each convolution is simply a shift of the input image; that is, each picture element (pixel) is shifted by the displacement of the respective template element as shown in Figure 14. If the operation being directed by the control buffer is a pointwise image-to-image operation, then the input image is sent directly to the array processor controlling the accumulation operation. The output of the spatial configuration processor is combined pointwise with the value of the respective template element, using the appropriate binary associative operator from the algebraic set of operators, as shown in Figure 15. This weighting procedure is performed on a digital array processor using the binary associative operators assigned to each arithmetic logic unit shown in Figure 16. The output of the weighting processor is now accumulated pointwise. Again, this procedure is performed using a digital array processor.
Once the final template element is processed in this fashion, the accumulator memory contains the result of the generalized matrix product for the appropriate combine and reduce operations. Recall that, in our discussion, the generalized matrix product is the mathematical formulation for operations such as general convolution/correlation, additive maximum/minimum, and multiplicative maximum/minimum. The generalized matrix operations, or template-to-image operations, along with the direct image-to-image unary and binary operations, allow for the formulation of all common image processing tasks. Hence, this architecture will control the execution of all the necessary set operations of the underlying image algebra.

Figure 14. Shift, weight, and accumulate. (Each pixel of the original image is shifted and multiplied by the corresponding template element value, and the output image is accumulated.)

Figure 15. Cellular connection of processors. (Digital image input a(x_i) or b(x_i) arrives from the CCD or the control buffer; the weighting processor section receives a scalar input from the control buffer and provides add, multiply, and compare functions; the accumulation processor section feeds the control buffer as video out.)

3.3 Image Algebra Process Distribution

All image-to-image operations are performed pointwise in one-to-one correspondence with respective pixels. Table 2 shows each operation and the corresponding data flow or process distribution. Characteristic functions, as defined in the image algebra, apply their respective conditional arguments to an image and output a binary image consisting of 1s where the condition is true and 0s elsewhere. By properly tailoring the Boolean output of a comparator's circuit, each conditional argument can be processed as shown in Figure 16. Table 3 shows the algebraic notation and the corresponding data flow or process distribution. The global reduce operations Σ, ∨, and ∧, as well as the Boolean or, transform an image into a scalar. Since the proposed architecture does not consist of a mesh or any interconnection among pixels, these operations are best performed by a host computer.

The image algebra basically defines templates in two ways: invariant, where neither the template's support nor its values change with respect to the target point; and variant, where the above constraint is not true for either or both sets. The intent of the proposed design is to use invariant template operations. The reason for this is that the throughput rate is maximized. Variant template operation is discussed in Section 3.6. Table 4 shows the algebraic notation for image-to-template operations and their corresponding data flow or process distribution. The image algebra operations that have been shown to distribute over the proposed architecture represent a substantial increase in throughput and efficiency.

Table 2. Image-to-image operations.

k + a: parallel step 1: load the value k into the template register and image a into the weighting processor ALU; parallel step 2: accumulate such that ACCUM(x) = k + a(x).

k · a: parallel step 1: load the value k into the template register and image a into the weighting processor ALU; parallel step 2: accumulate such that ACCUM(x) = k · a(x).

a + b: parallel step 1: load the accumulator with image a and the weighting processor ALU with image b (no shift); parallel step 2: accumulate such that ACCUM(x) = a(x) + b(x).
a − b: parallel step 1: load the accumulator with image a and the weighting processor ALU with image b (no shift); parallel step 2: accumulate such that ACCUM(x) = a(x) + (−b(x)).

a · b: parallel step 1: load the accumulator with image a and the weighting processor ALU with image b (no shift); parallel step 2: accumulate such that ACCUM(x) = a(x) · b(x).

a / b: parallel step 1: load the accumulator with image a and the weighting processor ALU with image b⁻¹ (no shift); parallel step 2: accumulate such that ACCUM(x) = a(x) · b⁻¹(x), where ACCUM(x) = 0 when b(x) = 0.

Table 2 continued.

a and b: parallel step 1: load the accumulator with binary image a and the weighting processor ALU with binary image b (no shift); parallel step 2: use the Boolean output of each comparator such that ACCUM(x) = a(x) and b(x).

a or b: parallel step 1: load the accumulator with binary image a and the weighting processor ALU with binary image b (no shift); parallel step 2: use the Boolean output of each comparator such that ACCUM(x) = a(x) or b(x).

a ∨ b: parallel step 1: load the accumulator with image a and the weighting processor ALU with image b (no shift); parallel step 2: accumulate using the comparator such that if a(x) > b(x), ACCUM(x) = a(x), else ACCUM(x) = b(x).

a ∧ b: parallel step 1: load the accumulator with image a and the weighting processor ALU with image b (no shift); parallel step 2: accumulate using the comparator such that if a(x) < b(x), ACCUM(x) = a(x), else ACCUM(x) = b(x).

a^c (complement): hardware control; see Figure 16.

Table 2 continued.

1 · a: parallel step 1: load the value 1 into the template register and image a into the accumulator; parallel step 2: accumulate such that ACCUM(x) = 1 · a(x).

|a|: load the accumulator with image a, and mask the appropriate sign control bit for the accumulated value.

log_b a; cos a, sin a, tan a; cos⁻¹ a, sin⁻¹ a, tan⁻¹ a: these operations may be processed more efficiently using a high-speed sequential processor/math accelerator provided by the control buffer. If the hardware complexity of the physical implementation is robust, then many of these higher order operations can be included in the parallel computation.

Table 3. Characteristic functions.

χ_{>b}(a): parallel step 1: load the accumulator with image a and the weighting processor ALU with image b (no shift; if instead of image b a scalar value m is used, simply load the template register with m); parallel step 2: accumulate the Boolean output of each comparator for the > condition, with respect to b (or m).

χ_{<b}(a): parallel step 1: load the accumulator with image a and the weighting processor ALU with image b (no shift; if instead of image b a scalar value m is used, simply load the template register with m); parallel step 2: accumulate the Boolean output of each comparator for the < condition, with respect to b (or m).

χ_{≥b}(a): accumulate such that ACCUM = [χ_{<b}(a)]^c.

χ_{≤b}(a): accumulate such that ACCUM = [χ_{>b}(a)]^c.

χ_{=b}(a): parallel step 1: load the accumulator with image a and the weighting processor ALU with image b (no shift; if instead of image b a scalar value m is used, simply load the template register with m); parallel step 2: accumulate the Boolean output of each comparator for the = condition.

χ_{≠b}(a): accumulate such that ACCUM = [χ_{=b}(a)]^c.

χ_{[m,n]}(a): parallel step 1: accumulate image b = χ_{≥m}(a) and store in image plane register B; parallel step 2: accumulate image such that ACCUM = χ_{≤n}(a); parallel step 3: accumulate such that ACCUM = b ∧ ACCUM.
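The characteristic functions of Table 3 are simple comparator operations. A minimal software analogue (Python with NumPy; the function names are illustrative, not from the original) is shown below for the threshold and interval cases.

    import numpy as np

    def chi_ge(a, m):
        """Characteristic function: 1 where a(x) >= m, 0 elsewhere."""
        return (a >= m).astype(np.uint8)

    def chi_le(a, n):
        """Characteristic function: 1 where a(x) <= n, 0 elsewhere."""
        return (a <= n).astype(np.uint8)

    def chi_interval(a, m, n):
        """Interval characteristic: the pointwise AND of the two comparator outputs."""
        return chi_ge(a, m) & chi_le(a, n)

    a = np.array([[0, 50, 120], [200, 255, 90]], dtype=np.uint8)
    print(chi_interval(a, 60, 200))   # 1s mark pixels with 60 <= a(x) <= 200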
Table 4. Image-to-template operations.

a ⊕ t (general convolution): parallel step 1: load image a into the analog processor; parallel step 2: load the spatial location of template element t(i) into the analog processor and the value of the template element into the template register of the weighting processor; parallel step 3: combine the output of the analog processor with the contents of the template register using the · operator; parallel step 4: accumulate the output of the weighting processor using the + operator such that ACCUM = ACCUM + (a ⊕ t(i)) · c_i; parallel step 5: repeat steps 2 through 4 for all i in the template configuration (pipeline).

a ⊠ t (multiplicative maximum): same as a ⊕ t, except the ∨ operation replaces the + operation in step 4.

a ⊡ t (multiplicative minimum): same as a ⊕ t, except the ∧ operation replaces the + operation in step 4.

a ⊞ t (additive maximum): same as a ⊕ t, except the + operation replaces the · operation in step 3, and the ∨ operation replaces the + operation in step 4.

a ⊟ t (additive minimum): same as a ⊕ t, except the + operation replaces the · operation in step 3, and the ∧ operation replaces the + operation in step 4.

They also represent a very rich subset of the image algebra, powerful enough to accomplish all common image-to-image transforms [58,59].

3.3.1 Proof of Concept

Recall the following image algebra definitions: the global reduce function, Equation 2,

    Γ a = Γ_{x∈X} a(x) = a(x_1) γ a(x_2) γ ... γ a(x_n);

the generalized vector-matrix or image-template product, Equation 3,

    a ⊛ t = { (y, b(y)) : b(y) = Γ_{x∈X} a(x) ∘ t_y(x), y ∈ Y };

and the general convolution, Equation 4,

    a ⊕ t = { (y, b(y)) : b(y) = Σ_{x∈X} a(x) · t_y(x), y ∈ Y },

where X and Y are coordinate sets, a ∈ F^X, b ∈ F^Y, F is an arbitrary value set, γ and ∘ are binary associative operators on F, and the template t ∈ (F^X)^Y.

In support of the theorem that follows, suppose F ∈ { ℝ, ℂ, ℤ_{2^k} }, X ⊆ ℤ², and t ∈ (F^X)^Y denotes a translation invariant template with card(𝒮(t_y)) = n. For some arbitrary point y ∈ X, let { x_1(y), x_2(y), ..., x_n(y) } = 𝒮(t_y) and, for i = 1, 2, ..., n, set c_i = t_y(x_i(y)). For parameters i ∈ {1, 2, ..., n}, define an F-valued parameterized template t(i) by

    t(i)_y(x) = 1 if x = x_i(y), 0 otherwise.

Note that 𝒮(t(i)_y) = { x_i(y) } for i = 1, 2, ..., n.

THEOREM 1:

    a ⊛ t = Γ_{i=1}^{n} (a ⊕ t(i)) ∘ c_i.    (12)

PROOF: In order to simplify notation we let a_i = a ⊕ t(i) and show that

    a ⊛ t = Γ_{i=1}^{n} a_i ∘ c_i.

First note that

    a_i(y) = Σ_{x∈X} a(x) · t(i)_y(x)
           = Σ_{x∈𝒮(t(i)_y)} a(x) · t(i)_y(x)
           = Σ_{x∈{x_i}} a(x) · t(i)_y(x)
           = a(x_i) · t(i)_y(x_i)
           = a(x_i).

Thus, a_i is simply a shift of image a, with a(y) being replaced by a(x_i). Let b = a ⊛ t and d = Γ_{i=1}^{n} a_i ∘ c_i. In order to prove the theorem, we need to show that b(y) = d(y), ∀ y ∈ X. By definition,

    b(y) = Γ_{x∈X} a(x) ∘ t_y(x)
         = Γ_{x∈𝒮(t_y)} a(x) ∘ t_y(x)
         = Γ_{x∈{x_1, x_2, ..., x_n}} a(x) ∘ t_y(x)
         = Γ_{i=1}^{n} a(x_i) ∘ t_y(x_i)
         = Γ_{i=1}^{n} a(x_i) ∘ c_i
         = Γ_{i=1}^{n} a_i(y) ∘ c_i
         = d(y).    Q.E.D.

As an easy consequence of this theorem, the following corollary can be obtained.

COROLLARY I:

(1) If γ ∈ { ∨, ∧ } and F denotes the appropriate extended real value set for the operation +, then

    a ⊛ t = Γ_{i=1}^{n} (a ⊕ t(i)) + c_i.    (13)

(2) If γ ∈ { ∨, ∧ } and F denotes the appropriate extended real value set for the operation ·, then

    a ⊛ t = Γ_{i=1}^{n} (a ⊕ t(i)) · c_i.    (14)

PROOF: The proof is identical to the proof of Theorem 1 with the appropriate support at infinity replacing 𝒮(t_y). Q.E.D.

Theorem 1 and Corollary I show that the general convolution operator can be implemented by using simple shifts, global pixelwise multiplication (or addition) using the scalar c_i, and accumulation of the results. Figures 17 and 18 in Section 3.4 illustrate this observation.
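As an illustration of Theorem 1 and Corollary I, the short sketch below (Python with NumPy; not part of the original, and the 5-point neighborhood is only an example) computes an additive maximum a ⊞ t both directly from its pixel-by-pixel definition and by the shift-then-combine-then-reduce scheme the architecture uses; the two results agree.

    import numpy as np

    NEG_INF = -np.inf

    def shift(a, dy, dx, fill=NEG_INF):
        """Shift a so that out[y] = a[y - (dy, dx)]; out-of-range points get `fill`."""
        out = np.full_like(a, fill, dtype=float)
        h, w = a.shape
        out[max(dy, 0):h + min(dy, 0), max(dx, 0):w + min(dx, 0)] = \
            a[max(-dy, 0):h + min(-dy, 0), max(-dx, 0):w + min(-dx, 0)]
        return out

    def additive_max_direct(a, elems):
        """b(y) = max over the support of a(x) + t_y(x), computed point by point."""
        h, w = a.shape
        b = np.full((h, w), NEG_INF)
        for y in range(h):
            for x in range(w):
                for (dy, dx), c in elems:
                    sy, sx = y - dy, x - dx
                    if 0 <= sy < h and 0 <= sx < w:
                        b[y, x] = max(b[y, x], a[sy, sx] + c)
        return b

    def additive_max_decomposed(a, elems):
        """Corollary I: accumulate (a shifted by t(i)) + c_i under the max reduce."""
        acc = np.full(a.shape, NEG_INF)
        for (dy, dx), c in elems:
            acc = np.maximum(acc, shift(a, dy, dx) + c)   # W: add c_i, A: reduce with max
        return acc

    elems = [((0, 0), 0.0), ((1, 0), 1.0), ((-1, 0), 1.0), ((0, 1), 1.0), ((0, -1), 1.0)]
    a = np.random.default_rng(1).random((6, 6))
    print(np.allclose(additive_max_direct(a, elems), additive_max_decomposed(a, elems)))  # True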
We conclude this section with another corollary of Theorem 1.

COROLLARY II: If ∘ is one of the operations of Theorem 1 or Corollary I, and y = x_j is an element of the support of t_y, then

    a ⊛ t = (a ∘ c_j) γ [ Γ_{i≠j} (a ⊕ t(i)) ∘ c_i ].    (15)

PROOF: Since t(j) is the identity template, a_j = a ⊕ t(j) = a. Therefore, (a ⊕ t(j)) ∘ c_j = a ∘ c_j. The result now follows from the commutativity of γ. Q.E.D.

Corollary II is important in that it saves one convolution if the condition y = x_j is satisfied.

3.3.2 Proof of Phase-Only Filter Application

Phase-only filter application is of special interest when spatial light modulators are used to provide matched filtering in optical correlation. An SLM can only provide amplitude or phase information, unlike a holographic film plate which can provide both simultaneously [39]. Aside from the correlation operation, the main objective of the optical processor described in this dissertation is to simply shift the image. We must show that the phase-only operation of the SLM will be sufficient.

In order to prove sufficiency, let { x_1(y), x_2(y), ..., x_n(y) } = 𝒮(t_y). For each k = 1, 2, ..., n, there exists a unique vector z(k) = (z_1(k), z_2(k)) ∈ ℤ² such that x_k(y) + z(k) = y. Since t is translation invariant, then for any arbitrary vector p ∈ ℤ², x_k(y + p) = x_k(y) + p, where { x_1(y+p), x_2(y+p), ..., x_n(y+p) } = 𝒮(t_{y+p}). Thus,

    x_k(y+p) + z(k) = (x_k(y) + z(k)) + p = y + p.    (16)

Equation 16 simply says that z(k) is not affected by translations. For each k = 1, 2, ..., n, consider the unit impulse response function

    δ_k(x) = 1 if x = z(k), 0 if x ≠ z(k).

Then δ_k can be used to provide another representation of the one-point template t(k) by observing that δ_k(y−x) = t(k)_y(x), ∀ y, x ∈ ℤ². This follows from the fact that

    δ_k(y−x) = 1 if y−x = z(k), 0 if y−x ≠ z(k),

and the observation that y − x = z(k) ⟺ x = x_k(y). Next, recall from the proof of Theorem 1 that if a_k = a ⊕ t(k), then a_k(y) = a(x_k(y)). However, we also have that

    (a ∗ δ_k)(y) = Σ_{x∈X} a(x) δ_k(y−x) = a(x_k(y)),    (17)

where ∗ denotes the convolution product. Therefore,

    a ⊕ t(k) = a ∗ δ_k,    (18)

or, equivalently,

    a_k = a ∗ δ_k.    (19)

In the frequency domain we have, according to the convolution theorem [36], the equivalent formulation

    ℱ{a_k} = ℱ{a ∗ δ_k} = ℱ{a} · ℱ{δ_k},    (20)

where ℱ denotes the Fourier transform operation. Employing the notation δ̂_k = ℱ{δ_k}, we have that

    δ̂_k(u,v) = Σ_{x=0}^{M−1} Σ_{y=0}^{N−1} δ_k(x,y) e^{−2πi(xu/M + yv/N)} = e^{−2πi(z_1(k)u/M + z_2(k)v/N)}    (21)

for all (u,v) ∈ ℤ². Thus, δ̂_k has constant magnitude |δ̂_k(u,v)| = 1 and phase φ(u,v) = e^{−2πi(z_1(k)u/M + z_2(k)v/N)}; i.e., only phase is a factor in the multiplication of ℱ{a} in Equation 20. Since

    a_k(y) = a(x_k(y)) = a(y − z(k)),    (22)

it also follows from Equations 19, 20, and 21 that

    a(y − z(k)) ⟷ â(u,v) e^{−2πi(z_1(k)u/M + z_2(k)v/N)},    (23)

where the double arrow is used to indicate the correspondence between a function and its Fourier transform. Equation 23 expresses the well-known translation property of the Fourier transform [36]. Therefore, multiplying the Fourier transform of the image a by the complex exponential equivalent of δ_k and then taking the inverse Fourier transform shifts the image by the vector z(k); that is, it moves the origin of the image to (z_1(k), z_2(k)). Equation 23 also expresses the fact that a shift of the image does not affect the magnitude of its Fourier transform. As a final observation, note that Equations 18, 22, and 23 imply that a ⊕ t(k) ⟷ â · δ̂_k; i.e., shifting the image in the spatial domain with the use of the image algebra operation a ⊕ t(k) is equivalent to the product â · δ̂_k in the frequency domain.
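The shift property proved above is easy to demonstrate numerically. The sketch below (Python with NumPy; a hypothetical illustration, not part of the original) builds the phase-only factor of Equation 21 for a displacement z(k), and shows that multiplying it against the image spectrum and inverse transforming reproduces a circular shift of the image while leaving the spectrum magnitude unchanged.

    import numpy as np

    def shift_by_phase(a, z):
        """Shift image a by the vector z = (z1, z2) using only a phase factor."""
        M, N = a.shape
        u = np.arange(M).reshape(-1, 1)
        v = np.arange(N).reshape(1, -1)
        delta_hat = np.exp(-2j * np.pi * (z[0] * u / M + z[1] * v / N))  # Equation 21
        return np.fft.ifft2(np.fft.fft2(a) * delta_hat).real

    a = np.random.default_rng(2).random((8, 8))
    shifted = shift_by_phase(a, (2, 3))
    print(np.allclose(shifted, np.roll(a, (2, 3), axis=(0, 1))))                 # True
    print(np.allclose(np.abs(np.fft.fft2(a)), np.abs(np.fft.fft2(shifted))))     # True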
3.4 Performance Evaluation

3.4.1 Example Operation

Figures 17 and 18 illustrate the processing steps involved in performing a general convolution and an additive maximum, respectively, using a von Neumann type template. Notice, however, that by proper selection of the operation pairing (i.e., · with ∧, and + with ∧), the multiplicative minimum and additive minimum are produced in the same fashion.

Figure 17. Example of general convolution. (The input image is shifted over each of the five template elements t(1) through t(5), each shift is multiplied by the respective template weight, and the weighted shifts are accumulated by addition to form a ⊕ t.)

Figure 18. Example of additive maximum. (The same shift sequence is used, with the template weights added to each shift and the results accumulated under the ∨ operation to form a ⊞ t.)

3.4.2 Comparison of Execution with RISC Performance

There are currently two limiting factors concerning the proposed architectural design. The major limitation is the switching speed of the spatial light modulators. The best SLMs in the research community today will operate at a 10-40 KHz switching frequency, with a potential for MHz rates [5]. This is well below the expected operating speed of the remaining digital components, a common 25 MHz clock frequency. The other limitation is the memory I/O speed which will be necessary to load the SLMs and the digital array processors in parallel. High speed circuits such as crossbar switching, block memory transfer, and column parallel loading networks do exist for this application; however, the physical connections become a problem as the number of elements in the SLM increases. Faster optical addressing techniques which tend to reduce this limitation are being developed [20]. The 40 KHz switching frequency will be the anticipated limiting factor for the rest of the discussion. It will determine the time spent in each section of the pipeline (refer to Figure 12).
Since the proposed design is dependent upon the size of the template configuration and independent of the image size, the following equation can be used to estimate the computation time for a convolution type operation:

    time = (n + 1) / clock,    (24)

where n is the number of points in the template support and clock corresponds to 40 KHz. The additional cycle accounts for the last stage of the pipeline; the beginning of the pipeline bypasses the analog processor, as noted at the end of Section 3.3.1. Table 5 shows the results for 5 and 25 element templates.

Table 5. Estimated execution time for proposed design.

    Number of elements    Time (seconds)
    5                     1.5 x 10^-4
    25                    6.5 x 10^-4

The following equation can be used to estimate the computation time for a naive calculation of a template convolution over an NxN image using a Reduced Instruction Set sequential Computer (RISC):

    time = N^2 (2n − 1) / (2 · clock),    (25)

where n is the number of points in the template support (n multiplications plus n−1 additions), the 2 in the denominator represents memory interleaving, clock is the operating frequency (25 MHz), and 1 clock cycle per operation is assumed. Table 6 shows the results for 5 and 25 elements over various image sizes.

Table 6. Estimated execution time for RISC design.

    Number of elements    Image size     Time (seconds)
    5                     64x64          7.4 x 10^-4
    5                     128x128        0.003
    5                     256x256        0.012
    5                     512x512        0.047
    5                     1024x1024      0.189
    25                    64x64          0.004
    25                    128x128        0.016
    25                    256x256        0.064
    25                    512x512        0.257
    25                    1024x1024      1.028

3.4.3 Algorithm Description and Evaluation

The max-min sharpening transform is selected as a representative algorithm. Such an algorithm is typically used as a preprocessing filter for the subsequent correlation action. The transform is an image enhancement technique that brings fuzzy gray level objects into sharp focus. It is an iterative technique that compares the maximum and minimum gray level values of the pixels contained in a small neighborhood about each pixel in the image. The result of the comparison establishes the respective pixel output. The benefit of its use as a preprocessing step is that a sharper image, when presented to the correlator, results in a higher signal to noise ratio on the correlation plane. Thus, such an image has a higher probability of identification.

3.4.3.1 Description. The image algebra formulation is defined as follows: Let a ∈ ℝ^X be the input image, N(x) ⊂ ℤ² a predetermined neighborhood function of x ∈ X, and let the template t be defined by

    t_y(x) = 0 if x ∈ N(y), −∞ otherwise.

The output image s is the result of the max-min sharpening transform given by the following algorithm:

    a_M := a ⊞ t
    a_m := a ⊟ t′
    b := a_M + a_m − 2a
    s := χ_{<0}(b) · a_M + χ_{≥0}(b) · a_m,

where t′ is the additive dual of t. For this specific implementation, both the input and the output images are defined on the point set X, which corresponds to the coordinate points of the CCD array. Therefore, points which lie outside X are of no computational significance. Also, the supports of t and t′ are equivalent; i.e., t′_y(x) = t_y(x) for all x ∈ 𝒮(t_y). The algorithm is now iterated either a predetermined fixed number of times or until s stabilizes, that is, until there is no change in gray value from one iteration to the next.
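A compact software rendering of the sharpening step above is given below (Python with NumPy; a sketch assuming a 3x3 von Neumann neighborhood and replicated borders, neither of which is specified by the text). It follows the four image algebra statements directly: an additive maximum, an additive minimum, the comparison image b, and the characteristic-function selection.

    import numpy as np

    NEIGHBORHOOD = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]   # assumed von Neumann N(x)

    def neighborhood_stack(a):
        """Stack of copies of a, one per neighborhood offset, with edge replication."""
        padded = np.pad(a, 1, mode="edge")
        h, w = a.shape
        return np.stack([padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
                         for dy, dx in NEIGHBORHOOD])

    def maxmin_sharpen(a, iterations=1):
        s = a.astype(float)
        for _ in range(iterations):
            stack = neighborhood_stack(s)
            a_max = stack.max(axis=0)            # a_M := additive maximum over N(x)
            a_min = stack.min(axis=0)            # a_m := additive minimum over N(x)
            b = a_max + a_min - 2.0 * s
            s = np.where(b < 0, a_max, a_min)    # chi_{<0}(b)*a_M + chi_{>=0}(b)*a_m
        return s

    a = np.array([[0, 0, 0, 0], [0, 120, 130, 0], [0, 110, 140, 0], [0, 0, 0, 0]], dtype=float)
    print(maxmin_sharpen(a, iterations=2))       # interior pixels move toward the local extremes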
Figure 19 depicts a typical input stream for the instruction to execute the above algorithm. An instruction stack in prefix notation is used to illustrate the instruction stream.

Figure 19. Example of instruction stack reordering. (The combined instruction stack is separated into a convolved instruction stack and a pointwise instruction stack, with intermediate results stored in image planes B through E. Note: image planes are formed, pointwise, using the local memories at each accumulation processor; refer to Figure 15.)

This is a simplification of what would normally take place in a series of translations from the high level algebraic abstraction to the machine instruction level. After an initial scan of the stack, the stack operations can be separated into general matrix product type operations and those that are strictly pointwise. By sequentially ordering these classes of operations (general matrix operations first) and further noting the number of repetitive operation substrings, the stack can be reconfigured (refer to Figure 19) to execute with a minimal number of clock cycles per operation.

3.4.3.2 Evaluation. Let the switching frequency of the SLMs in the correlator be 40 KHz, use 10 clock cycles as an average for each digital operation in both the weighting and accumulation processors, and use a common 25 MHz clock frequency. Then, assuming card(t) = 5 and using Equation 24, the time to compute the two template operations a ⊞ t and a ⊟ t′ is 0.0003 seconds. The remaining eight pointwise operations are executed in 3.36x10^-6 seconds. The total time is now 0.00030336 seconds. Note that if the assumption of 10 clock cycles per operation is raised to 100, the total time of execution becomes only 0.0003336 seconds. This is a result of the 40 KHz switching frequency, which is much lower than the common clock.

A comparison with a sequential RISC type machine is made in a manner similar to the comparison in the previous section. An arbitrary image size of 256x256 pixels is used to facilitate the computation. Assuming once again a 25 MHz common clock and one operation per clock cycle, Equation 25 is used to evaluate the timing for the two general matrix operations. The total time for both operations is 0.0236 seconds. The remaining operations are evaluated by modifying Equation 25, where the 2n−1 parameter is replaced by the number of operations, in this case 8. The time for these eight operations (0.0105 seconds) plus the general matrix operations equals a total execution time of 0.0341 seconds.

3.4.3.3 Conclusions. The gain in performance, or speedup, is measured as the ratio of the best sequential execution time to the parallel execution time. For this comparison, the ratio of the configuration design estimate to the RISC estimate is approximately 112 to 1. A more important measure is based on the requirement of the sensor. For most applications the minimum frame rate for a system to update a guidance law is 30 Hz; other, higher performance systems operate at a 100 Hz rate. Considering the simplicity of this algorithm and the fact that it represents only a small portion of what must be computed within a required frame rate, the comparison just given makes an important point: the algorithm cannot be executed even once within the minimum frame rate of 30 Hz using sequential RISC technology. On the other hand, the algorithm can be executed on the proposed design approximately 98 times per frame. Clearly there exists a substantial increase in throughput for the proposed design. The proposed design just breaks the speed barrier necessary to make it functional in an ATR system. The interesting speculation is how the throughput would increase with better performance from the SLMs.
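The timing comparison in Section 3.4.3.2 can be reproduced with a few lines of arithmetic. The sketch below (Python; a back-of-the-envelope calculation following Equations 24 and 25 and the stated assumptions; the function names are illustrative) recomputes the execution-time estimates and the resulting speedup.

    def optical_time(n, slm_clock=40e3):
        """Equation 24: (n + 1) / clock for an n-element template."""
        return (n + 1) / slm_clock

    def risc_time(N, ops, clock=25e6):
        """Equation 25 with 2n-1 generalized to an arbitrary operation count."""
        return N * N * ops / (2 * clock)

    n = 5                                  # card(t) for the von Neumann template
    pointwise_ops = 8                      # operations in b and s of the sharpening transform
    cycles_per_op = 10

    proposed = 2 * optical_time(n) + pointwise_ops * cycles_per_op / 25e6
    risc = 2 * risc_time(256, 2 * n - 1) + risc_time(256, pointwise_ops)

    print(f"proposed design: {proposed:.6f} s")     # ~0.0003 s
    print(f"sequential RISC: {risc:.4f} s")         # ~0.034 s
    print(f"speedup: {risc / proposed:.0f} : 1")    # ~112 : 1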
3.5 Binary Image Operations

The hybrid electro-optical system described thus far is also usable as a binary image correlator/processor. The primary reason for doing so is to achieve a significant increase in processor throughput. The entire template configuration is convolved using the optical process, thus eliminating any iterative solution. The computational time complexity in this case is O(1). The disadvantage is that if the input image information is restricted to binary images, then more sophisticated algorithms that turn binary images into gray scale images cannot be utilized efficiently. With regard to the ATR problem, for specific applications that require very high speed image processing, the advantage may outweigh the disadvantage.

3.5.1 Functional Description

The physical layout of the correlator may remain unchanged from the description in Section 3.1, depending upon how the images obtained by the sensor are used. The only change to the correlator is a replacement of the first 8-bit SLM with a two state, or binary, SLM. In this case the sensing device establishes the necessary threshold to provide the binary input image. Otherwise, an initial gray level image is processed by the proposed design, allowing the χ_{>0} operation to provide the binary image which, in turn, is fed back to the first SLM. Again, depending on the requirement, replacing the SLM will provide a higher throughput. With the assumption that a binary image is passed as the input image, and the weights in any arbitrary template configuration are all set to 1, the phase information for the complex conjugate of the entire template configuration is placed on the second SLM and processed (refer to Figure 8). The intensity of the energy emerging from the analyzer contains the results of the convolution. At this time the convolved information is transformed by the CCD array into a gray level image. The maximum gray level value is directly proportional to the number of times 1 values overlapped in the convolution process, and it is bounded by the total number of nonzero elements in the template. Figure 20 depicts this operation.

Figure 20. Binary image erosion and dilation.

3.5.2 Image Algebra Process Distribution

The simplicity of operation of this design configuration stems from the way the morphological operations are derived. In [63], Ritter et al. show that a dilation of a binary image a with a structuring element b is equivalent to a ⊞ t, where b = 𝒮(t_y) and t_y(x) = 1 for all x ∈ 𝒮(t_y). Similarly, a binary erosion is equivalent to a ⊟ t. By using the convolution operator ⊕, Davidson [22] proves that the morphological operations of dilation and erosion for a binary image a are derived by the following expressions:

    a_dilation = χ_{>0}(a ⊕ t),    (26)

and

    a_erosion = χ_{=n}(a ⊕ t),    (27)

where n is the number of elements in 𝒮(t_y), and a, a_dilation, and a_erosion are Boolean images. The Boolean operations or and and are used interchangeably with the set-theoretic operations union and intersection. The remaining set-theoretic operation necessary for the binary image algebra defined by Huang et al. is the complement operation. As notated in Table 2, this operation is easily accomplished in hardware using a flip-flop at the appropriate circuit location, as shown in Figure 14.
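Equations 26 and 27 translate directly into code. The sketch below (Python with NumPy; an illustration assuming a cross-shaped structuring element, which is not specified by the text) forms the gray level overlap count produced by the binary convolution and applies the two characteristic functions to obtain dilation and erosion.

    import numpy as np

    ELEMENTS = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]   # assumed structuring element

    def overlap_count(a):
        """Gray level image counting, at each pixel, how many 1s the shifted copies of a
        contribute; this is the binary convolution a (+) t with unit weights."""
        h, w = a.shape
        count = np.zeros((h, w), dtype=int)
        for dy, dx in ELEMENTS:
            shifted = np.zeros_like(a)
            shifted[max(dy, 0):h + min(dy, 0), max(dx, 0):w + min(dx, 0)] = \
                a[max(-dy, 0):h + min(-dy, 0), max(-dx, 0):w + min(-dx, 0)]
            count += shifted
        return count

    def dilate(a):
        return (overlap_count(a) > 0).astype(np.uint8)                # Equation 26

    def erode(a):
        return (overlap_count(a) == len(ELEMENTS)).astype(np.uint8)   # Equation 27

    a = np.zeros((7, 7), dtype=np.uint8)
    a[2:5, 2:5] = 1                                   # a 3x3 block of 1s
    print(dilate(a).sum(), erode(a).sum())            # dilation grows the block, erosion shrinks it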
The operations just described are only valid when the weights of the template are all set to unity. If template weights other than 1 are required, then the convolution operations can be accomplished in the iterative manner described in Section 3.3. The use of a census template is an example of this situation. Here the weights of the template are distinct powers of a prime number. When the template is convolved with a binary image, the result is a unique number for each distinct pattern of 0s and 1s [22]. This is a fundamental step in most algorithms used for labelling objects in binary images. The conclusion to draw here is that, by allowing the manipulation of gray level values formed by the convolution of binary images, or of a binary image and a gray level template, Huang's binary image algebra described in Chapter 2 is extended [41,67]. The extension is for both the set of values and the number of operations over the set: the set {0,1} is replaced by I ⊂ ℝ, and the characteristic operation χ and the general convolution operation ⊕ are included to provide erosion and dilation operations more directly.

3.6 Variant Template Operations

Referring to Section 2.1.4, a variant template is one whose configuration and/or values are allowed to vary with respect to the reference point location. The Fourier and Gabor templates, derived from the kernels of the discrete Fourier and Gabor transforms, are examples of variant templates. The implementation of algorithms involving variant template operations using the conceptual system described in Section 3.2 can result in significant reductions in system throughput for some specific variant templates. The generalized matrix product relative to parallel architecture communication, as shown by Wilson [74], unifies the three concepts of many-to-one, one-to-many, and one-to-one data transfers; i.e., reduction, replication, and permutation, respectively. The first two concepts are used to explain the throughput differential. The image-to-template operations of the proposed system, as defined in Table 4, are intended for invariant template operations and can be interpreted as the generalized left product of image a with template t. This is defined for all a ∈ F^X, b ∈ F^Y, and t ∈ (F^X)^Y such that, for any reference point y ∈ X,

    b(y) = Γ_{x∈X} ( a(x) ∘ t_x(y) ).

In short, the image is shifted with respect to the location of each template element in the support of t, such that, for each shift, each image pixel value is combined under the ∘ operation with the corresponding template value. This, according to Wilson, is a one-to-many data transfer, or replication, operation [74]. As noted earlier, the time complexity is O(N) where N = card(t); i.e., card(a) parallel operations repeated N times.
There are methods that can be applied to the variant templates themselves that tend to increase the throughput. Gader has shown template decomposition and localization methods for mesh architectures that result in faster computation [30,31]. Also, Manseur [50] has proven that for certain variant template definitions (e.g.; t, column rank one) there exists an invariant vertical template t1 and a (possibly variant) horizontal template t2 such that t = t1 t2. Thus, the overall problem can be reduced to a linear combination of templates where at least a portion of the computation can be executed at a higher rate. Although the methods just mentioned result in an increase in throughput, the conclusion for their use remains with the specific implementation. Assuming the implementation is still the correlator interface for ATR; if the frame rate (update rate) of the system is high (above 30 Hz), then the use of variant templates in any acquisition algorithms should be avoided. CHAPTER 4 VANDER LUGT SYSTEM IMPLEMENTATION The hybrid electrooptical system described here is also based on the notion of integrating Fourier optical correlation and digital image processing. The imaging media, holographic film, makes this implementation unsuitable as a solution to the ATR problem defined in Chapter 2. It does, however, have direct application for many image processing tasks that use this high resolution imaging media; e.g., photo reconnaissance, industrial robotics, blood analysis, etc. Moreover, the reason for including it in this dissertation is that the image algebra interface with a Vander Lugt optical system is noteworthy in general. The interface affords additional methods for feature extraction which complement the correlation function. 4.1 System Description The difference between this configuration and the one described in Chapter 3 is the imaging media, spatial light modulators are replaced by holographic film for both image and matched filter inputs. Also, the holographic film is capable of storing both the magnitude and phase of the Fourier transform. This contrasts the phaseonly SLM implementation. The method of producing a matched filter on holographic film is based on the work of Vander Lugt [72], hence the correlator implementation is commonly referred to as a Vander Lugt system. The advantages of using this method are: the capability to statistically and repeatedly examine large numbers of images for the presence of specific information content, the resolution of the film is higher than that of the SLM, and the overall system cost is lower. The disadvantages are that the template configuration and values must be known a priori, thereby eliminating program controlled dynamic reconfiguration; the procedure for exchanging filters is mechanical; and, the space bandwidth product tends to be large. The latter results in an increase in the physical size of lenses, film plates, etc. The image algebra algorithm suite for the kinds of processing tasks mentioned can be executed using a limited number of predefined templates, or filters. Predefined templates will allow for a system of selectable multiple space and frequency multiplexed fixed filter sets similar to the one proposed by Casasent and Schaefer [14,15]. The technique involves a finite number of templates stored on film and multiplexed using a selectable laser diodes array (see Figure 21). 
By selecting the proper diode, the focal point on the holographic film plane is translated to the desired template and convolved in the manner described in Chapter 3. The necessary composite matched filters or synthetic discriminant functions are included on the film containing the templates [13,44]. The remaining digital circuitry is the same as that defined in Section 3.2, with the only exception being the absence of any feedback to the input image. This proposed system configuration allows for some degree of template programmability and eliminates any mechanical exchange action. Typically, the number of nonzero template elements for template configurations used in image algebra operations tends to be small compared to the number of nonzero elements used by matched filter correlators. Therefore, the space-bandwidth product that is necessary for cross-correlation is sufficient for the convolution type operations of the image algebra.

4.1.1 Filter Generation

The 2-dimensional Fourier transform of a continuous function f(x,y) is defined as

    ℱ{f(x,y)} = F(u,v) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x,y) e^{−2πi(xu + yv)} dx dy,

and in a similar manner for the function h(x,y). The convolution and correlation of f(x,y) and h(x,y) are further defined by the transform pairs

    f(x,y) ∗ h(x,y) ⟷ F(u,v) H(u,v)   and   f(x,y) ⋆ h(x,y) ⟷ F(u,v) H*(u,v),

where ∗ denotes convolution, ⋆ denotes correlation, and * denotes the complex conjugate [36]. If the reference image h(x,y) is placed on a transparent film and illuminated by a plane wave of coherent light, then its Fourier transform is formed at the focal plane using a simple lens. However, recording film located at the focal plane can only record the power spectrum of the Fourier transform [33]. The Fourier transform, as opposed to the phase-only operation previously described, is needed in this system to remove or attenuate specific frequency components in the image. Therefore, a complex filter H(u,v) is needed to perform the necessary filtering operations. Two pieces of information are required to produce such a filter, namely the real and the imaginary parts of the waveform. Vander Lugt proposed a heterodyning technique that incorporates holographic film to generate and store the necessary complex waveform [72]. A simple lens is used to transform the reference image that corresponds to the template. The Fourier transform of the reference image, ℱ{h(x,y)}, is brought to focus on a photographic film plane. A plane wave is directed at the film plane at an angle θ and mixed with the transform at the film plane as shown in Figure 22 [12]. The resultant pattern caused by the constructive interference of the two fields is

    H_a(u,v) = H(u,v) + A e^{2πiαv},

where α = sin θ / λ is referred to as the spatial carrier frequency, λ is the wavelength of the light, and A and e^{2πiαv} are the amplitude and phase of the reference wave, respectively.
Computer generated holograms written to bleached silver halide plates is another recording method [12]. 4.1.2 Filter Implementation The film can now be placed at the Fourier transform lens focal plane and illuminated with the Fourier transform of the test image, S{f(x,y)}. This corresponds to the filter bank location as shown in Figure 21. The product of the film transmittance and the illuminating Fourier transform is the function G(u,v), where G(u,v) = F(u,v) H (u,v) = F(u,v) + F(u,v) H(u,v)12 + F(u,v) H(u,v) e2i7av + F(u,v) H*(u,v) e2i av Another lens, L2 in Figure 21, is used as an inverse Fourier transform of this product to simultaneously obtain the convolution and correlation of the two images [72], denoted by g(x,y) = f(x,y) + [f(x,y) h(x,y)] h*(x,y) + f(x,y) h(x,y) 6(x,y+a) + f(x,y) h*(x,y) 6(x,ya), where 6(x,ya) is simply a translation function and h*(x,y) = h(x,y). 4.2 Image Algebra Process Distribution The distribution of image algebra operations for this implementation differs significantly from the implementation defined in Chapter 3. The shifting operation using the Fourier optics and a since function at the filter location is still the same with the exception that the since function has nonunit amplitude. The presence of the amplitude in the since function eliminates the need for the weighting processor section of the digital side of the hybrid configuration. This allows the complete template configuration to be included in the filter definition. Therefore, general convolution operations are completed in one iteration. Since the magnitude of the filter, which corresponds to the value of the template, is inherently multiplied by the optical convolution properties, the morphological convolution operations are restricted to the multiplicative maximum and minimum as defined in Chapter 2. The accumulation processor section remains the same with the exception that the template register is input directly, refer to Figure 13. All pointwise operations are accomplished in the same manner as described in Tables 2 and 3. 4.2.1 Morphological Operation The objective in this section is to define an expression which allows the multiplicative nonlinear convolution operations to be expressed in terms of linear convolution operations. The expression is first established for the multiplicative maximum, denoted as 0. Once the expression is defined for 0, the nonlinear operation for multiplicative minimum a will automatically follow. In support for the theorem that follows, suppose F { Rul, I U {.} } and t E (F')x denotes a translation invariant template with card( #(ty)) = n. For some arbitrary point y E X, let { x x x } = Y(t ) and for i= 1, 2,...,n, set ci = t(xi). Define a parameterized template s(i) from X to X with defined parameter i E {1, 2,..,n} defined by ci, if x = xi s(i),(x) = 0, otherwise. Note that 1(s(i)y) = { xi } for i = 1, 2, *, n. n THEOIEM 2: a t = V a s(i) (28) i=1 PEOOF: For i = 1, 2, .. n, let ai = a 0 s(i). Thus, for an arbitrary point y E X, we have a(y) = a(x) s(i),(x) xEI ai(y) = a(xi) ci (29) Now if b = a t, then for an arbitrary point y E X we have, by the definition of the convolution operator that b(y) = V a(x) t(x) xEI = V a(x).t(x) xe (ty) V a(x) t(x) = V a(x) tx) i= 1 n = V a(x) *ci i=1 xE 0(s(i)) XE QY(s(i) ) since W(s(i)y) = {xij. This shows that a(x) s(i) (x) n = V a(y) (by Equation 29). i=1 n n Therefore, a t = V ai= V as(i) i=1 i=l Q.E.D. As a corollary to THEOREM 2, the multiplicative minimum a is simply n aOt = A ass(i). 
As a corollary to Theorem 2, the multiplicative minimum is simply

    a ⊡ t = ∧_{i=1}^{n} a ⊕ s(i).    (30)

PROOF: The proof is identical to the proof of Theorem 2 with the appropriate global reduce function and corresponding induced operation. Q.E.D.

4.2.2 Negative Real Value Manipulation

Some image processing algorithms require template definitions to contain negative values. Edge masks using finite differences are examples of such templates. The Vander Lugt optical system described above uses only positive real values. The following method is presented as a solution to this condition. Using the relationship previously developed for ∗, one can think of the notion of an additional subjunction of positive and negative values which, when properly combined, fulfill the stated requirement. Specifically, the generalized