Citation |

- Permanent Link:
- http://ufdc.ufl.edu/AA00031463/00001
## Material Information
- Title:
- On the processing of compressed and encrypted signals and imagery, with applications in efficient, secure computation
- Creator:
- Schmalz, Mark Steven
- Publication Date:
- 1996
- Language:
- English
- Physical Description:
- x, 362 leaves : ; 29 cm.
## Subjects
- Subjects / Keywords:
- Algebra ( jstor )
- Approximation ( jstor )
- Cryptography ( jstor )
- Data encryption ( jstor )
- Fourier transformations ( jstor )
- Image compression ( jstor )
- Image processing ( jstor )
- Images of transformations ( jstor )
- Logical givens ( jstor )
- Pixels ( jstor )
- Coding theory ( lcsh )
- Computer and Information Science and Engineering thesis, Ph. D
- Dissertations, Academic -- Computer and Information Science and Engineering -- UF
- Image compression ( lcsh )
- Image processing -- Digital techniques ( lcsh )
- City of Gainesville ( local )
- Genre:
- bibliography ( marcgt )
- non-fiction ( marcgt )
## Notes
- Thesis:
- Thesis (Ph. D.)--University of Florida, 1996.
- Bibliography:
- Includes bibliographical references (leaves 356-361).
- General Note:
- Typescript.
- General Note:
- Vita.
- Statement of Responsibility:
- by Mark Steven Schmalz.
## Record Information
- Source Institution:
- University of Florida
- Holding Location:
- University of Florida
- Rights Management:
- The University of Florida George A. Smathers Libraries respect the intellectual property rights of others and do not claim any copyright interest in this item. This item may be protected by copyright but is made available here under a claim of fair use (17 U.S.C. §107) for non-profit research and educational purposes. Users of this work have responsibility for determining copyright status prior to reusing, publishing or reproducing this item for purposes other than what is allowed by fair use or other copyright exemptions. Any reuse of this item in excess of fair use or other copyright exemptions requires permission of the copyright holder. The Smathers Libraries would like to learn more about this item and invite individuals or organizations to contact the RDS coordinator (ufdissertations@uflib.ufl.edu) with any additional information they can provide.
- Resource Identifier:
- 023835227 ( ALEPH )
- 35822283 ( OCLC )
Full Text |

ON THE PROCESSING OF COMPRESSED AND ENCRYPTED SIGNALS AND IMAGERY, WITH APPLICATIONS IN EFFICIENT, SECURE COMPUTATION

By MARK STEVEN SCHMALZ

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA 1996

Copyright © 1996 by Mark S. Schmalz

ACKNOWLEDGMENTS

Many thanks are given to my family, for their loving support and encouragement, and to Drs. Ritter, Wilson, Laine, Rajasekaran, and Harris for their advisement of this work and associated publication efforts. Special gratitude goes to Gerhard Ritter, for his patient advice and encouragement in the development of unifying theory.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF FIGURES
LIST OF TABLES
ABSTRACT

1. INTRODUCTION
1.1. Study Overview
1.2. Previous Work
1.3. Technical Approach
1.4. Novel Claims
1.5. Implementational Advantages and Disadvantages

2. REVIEW OF NOTATION
2.1. Overview of the Image Algebra (IA) Subset
2.2. Study Notation

3. FUNDAMENTAL THEORY
3.1. Mathematical Concepts
3.2. Image Compression, Encryption, and Compressive Computation
3.3. Taxonomy of Image Transformations
3.4. Class-Specific Derivational Techniques

4. COMPRESSIVE PROCESSING - COMPUTATIONAL COMPLEXITY AND DATA SECURITY
4.1. Complexity of Image Algebra Operations
4.2. Complexity of Compressive Computation
4.3. Feasibility of Compressive Computation
4.4. Data Security
4.5. Feasibility of Encryptive Computation

5. COMPUTATIONAL ERROR AND INFORMATION LOSS
5.1. Theory of Error Propagation in Discrete Systems
5.2. Error Propagation in Discrete Image Algebra Operations
5.3. Theory of Error-Tolerant Computation
5.4. Feasibility of Error-Tolerant Computation
5.5. Information Theory and Error-Tolerant Computation

6. CLASS 1 TRANSFORMATIONS
6.1. Substitutional Ciphers
6.2. Transpositional Ciphers
6.3. Linear Transforms

7. CLASS 2 TRANSFORMATIONS
7.1. Generalized Class 2 Transform
7.2. Pixel Value Compression via Bit Slicing

8. CLASS 3 TRANSFORMATIONS
8.1. Affine Transformation
8.2. Spatial Transformation by Pixel Selection
8.3. Transpositional Ciphers

9. CLASS 4 TRANSFORMATIONS
9.1. Block Encoding
9.2. Sparse Matrix Processing
9.3. Transform Coding
9.4. JPEG
9.5. Block Truncation Coding and VPIC
9.6. Vector Quantization

10. APPLICATIONS OF COMPRESSIVE IMAGE PROCESSING
10.1. Compressive Smoothing
10.2. Compressive Edge Detection
10.3. VPIC Morphological Operations
10.4. High-Level Compressive Computations

11. APPLICATIONS IN PARALLEL COMPUTING
11.1. Effect of Domain Compression Ratio
11.2. Effect of Range Compression Ratio
11.3. Simplification of Operations
11.4. Partitioning Efficiency

12. CONCLUSIONS
12.1. Conclusions
12.2. Open Issues and Future Work

REFERENCES
BIOGRAPHICAL SKETCH

LIST OF FIGURES

1. Commutativity diagram for processing compressed or encrypted imagery with unary operations.

2. (a) Example source image $a$ that is transformed by $T$ to yield (b) image $a_c$. Note reversal of the middle 4x4-pixel neighborhood only.

3. Example of block fragmentation in pointwise compressive operations over non-identically tessellated domains: (a) representation of the y-th block in $a$ (solid line) and $b$ (dotted line), (b) resultant block decimation required to process the compressed images $c_c = a_c \, O' \, b_c$ such that corresponding sub-blocks are amenable to pointwise combination.

4. Commutativity diagram of a compressive computational system with multiple encoding transforms $T_i$ and multiple analogues $O'_i$ of image operation $O$.

5. Commutativity diagram of a compressive computational system with multiple encoding transforms $T_i$ and multiple analogues $O'_j$ of the $j$-th image operation $O_j$, where both index sets have two elements.

6. Information-theoretic model of communication.

7. Information-theoretic model of communication.

8. Arrangement of noise levels in a signal $a$ with entropy $H(a)$ that partially occupies a channel of bandwidth $B$ having $K$ noise bits.

9. Local averaging of a noisy, two-dimensional, 8-bit image transformed by $T(x) = (x + 128) \bmod 256$: (a-b) source image and template $a$ and $t$; (c-d) transformed image and template $a_c$ and $s$; (e) transformed result $b_c$; (f) image-domain result $b = T^{-1}(b_c)$.

10. Dual of pointwise multiplication over the range space of the pointwise linear transform $T(x) = cx + d$.

11. Recovery of source image $a$ from image sum of contrast-stretched imagery: (a-b) source images $a$ and $b$, (c-d) linear contrast-stretched images $a_c = T(a)$ and $b_c = T(b)$, (e) $c_c = a_c + b_c$, (f) recovery of $a$ from $c_c$ using $a = T^{-1}(c_c - T(b))$.

12. Pixel indexing and overlap scheme for information loss analysis of an affine transform: (a) indexing scheme and (b) example of pixel overlap.

13. Local averaging of a rotated Boolean bar chart using a rotated template: (a-b) source image $a$ and template $t$; (c-d) rotated image $a_c = T(a)$ and template $s$ formed by applying $T$ to the weight matrix of $t$, then restricting to the 3x3-pixel Moore configuration; (e) rotated locally-averaged image $a_c \oplus s$; (f) derotated image $T^*(a_c \oplus s)$.

14. Error images of local averaging using rotated and unrotated templates: (a-b) error images $e$ (Equation 307) and $f$ (Equation 308), taken from the central (uncropped) portion of domain($a$); (c-d) histograms of error images $e$ and $f$.

15. Error and efficiency measures associated with JPEG addition over natural scenes having 3 bits/pixel error.

16. Codebook growth and error as a function of codebook size and Methods 1-3 for VQ pointwise addition: (a) input and output codebook sizes $M$ and $N$, (b) input and output errors $\epsilon_i$ and $\epsilon_o$.

17. Block configuration for compressive local averaging, showing block neighborhoods (dashed boxes) on X returned by c, r, and j.

18. Example of compressive smoothing at $CR_d$ = 16:1: (a) source image, (b) locally averaged image using unitary von Neumann template, (c) compressed array of block means over which processing actually occurs, (d) decompressed result of compressive smoothing.

19. Example of VPIC coding of very low-resolution Boolean imagery: (a) source image, (b) VPIC representation of (a), where $M_x$ denotes a mean block of mean $x$.

20. Example of VPIC coding of very low-resolution Boolean imagery: (a) boundary detection of Figure 19(a), (b) VPIC edge detection of Figure 19(a).

21. Error analysis of the edge detector in Figure 20(b), in terms of erroneous source pixels per encoding block.

22. Greylevel edge detection with VPIC: (a) source image, (b) Sobel edge detection, (c) 4x4-pixel VPIC edge detection using a codebook similar to that given in Example 10.2.2.3, (d) noise and representational error as a function of VPIC blocksize (kxk pixels) for underwater and land-based imagery.

23. Target characterization using VPIC codebook exemplars with indices in {1,2,3}: (a) VPIC codebook, (b) source image with unitarily-valued target region, (c) compressed representation, where 1' denotes exemplar 1 rotated by 90 degrees.

24. Example of VPIC-based target recognition: (a) source image, (b) BTC-compressed image over a portion of which the target recognition algorithm (Equation 410) computes, (c) decompressed target location image.

25. Conversion of a SIMD-parallel mesh to pipelined computation using compressive processing: (a) input and compute over compressed image 1, (b) input compressed image 2 and compute over image 1, (c) input compressed image 3 and compute over images 1 and 2, (d) input compressed image 4, compute over compressed images 1-3, and output compressed image 1.

LIST OF TABLES

1. Costs involved in SIMD-parallel computation of a 2-pixel image-template convolution versus a 5-pixel LUT operation over VPIC-format imagery.

2. Processor cycles incurred by SIMD-parallel computation of a 2-pixel image-template convolution versus a 5-pixel LUT operation over VPIC-format imagery.

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

ON THE PROCESSING OF COMPRESSED AND ENCRYPTED SIGNALS AND IMAGERY, WITH APPLICATIONS IN EFFICIENT, SECURE COMPUTATION

By Mark Steven Schmalz

August 1996

Chairman: Gerhard X. Ritter
Major Department: Computer and Information Science and Engineering

In this dissertation, we develop theory and analyses pertinent to the processing of compressed or encrypted imagery, called compressive processing or encryptive processing. Encryptive processing, which is a long-standing goal of computer science, exploits the processing of secure (encrypted) operands to yield a secure computation. Unfortunately, development has been hindered by an obscure understanding of mathematical concepts fundamental to the encryption process. Similarly, the processing of compressed imagery can yield computational efficiencies as a result of fewer input data. However, such speedups are not uniform across commonly-used image compression formats and may not exist for certain operations over the range spaces of given transforms. Additionally, the formulation of compressive transforms can be quite involved, which intuitively suggests that the derivation of compressive computational operations is difficult in closed form.

We first discuss theory that is basic to an understanding of compressive and encryptive processing. Using image algebra, a rigorous set-theoretic and functional mapping notation, we derive novel algorithms for implementing selected pointwise arithmetic, neighborhood, and domain-specific operations on transformed data.
Our algorithms are both feasible and portable to numerous computers, since image algebra has been implemented on a variety of serial and parallel processors, including the Connection Machine, MasPar SIMD processor, ERIM's Cytocomputer, Alliant Tech's PREP, Inmos' Transputer architecture, and the Martin-Marietta GAPP-IV processor upon which the PAL architecture is based.

Subsequent implementational discussion emphasizes the feasibility of sequential and parallel compressive computation. We show that compressive processing methodologies can be mapped to a variety of well-known architectures, especially SIMD array processors. Analyses focus upon time complexity, cost, and error due to information loss. For example, we show that compressive processing can reduce the processor count in parallel architectures without compromising the computational speedup obtained through parallelism. A further advantage of selected compressive transforms is their ability to facilitate the mapping of costly algorithms such as edge detection and component labelling to compressive operations implemented in terms of lookup tables. Such lookup-table techniques require only I/O operations, and the tables can be stored in local memory in certain SIMD-parallel processors.

CHAPTER 1
INTRODUCTION

The progress of image and signal processing research has long been hindered by the insufficient computational bandwidth of available computing machinery. Due to the large data burden presented by high-resolution imagery, near-real-time processing of large images (e.g., surveillance or medical imagery) is especially costly. Recent research [1-6] has explored the possibility of processing compressed imagery, based on the observation that compressed images have fewer data, as well as the conjecture that fewer input data require fewer operations.
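The conjecture that fewer input data require fewer operations can be illustrated with a minimal sketch. It assumes a simple block-mean compression (one mean per 4x4-pixel block, a 16:1 reduction of the image domain) and pointwise addition; the function names and test images are hypothetical and are not taken from the dissertation. Because block averaging is linear, adding the compressed images gives the same result as compressing the sum of the source images, at one sixteenth of the additions.

```python
def block_means(image, k):
    """Compress a 2-D image (list of rows) by replacing each k x k block with its mean.

    Assumes the image height and width are multiples of k.
    """
    rows, cols = len(image), len(image[0])
    return [
        [
            sum(image[r + i][c + j] for i in range(k) for j in range(k)) / (k * k)
            for c in range(0, cols, k)
        ]
        for r in range(0, rows, k)
    ]


def add_images(a, b):
    """Pointwise addition; returns the sum image and the number of additions performed."""
    result = [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]
    return result, sum(len(row) for row in result)


# Two hypothetical 8 x 8 test images.
a = [[(r * 8 + c) % 7 for c in range(8)] for r in range(8)]
b = [[(r + 2 * c) % 5 for c in range(8)] for r in range(8)]

k = 4  # 4 x 4 blocks, i.e., a 16:1 domain compression ratio

# Conventional route: add the source images (64 additions).
source_sum, source_ops = add_images(a, b)

# Compressive route: add the block-mean images (4 additions).
compressed_sum, compressed_ops = add_images(block_means(a, k), block_means(b, k))

print(source_ops, compressed_ops)
```

The saving is exact here only because the operation (addition) commutes with the transform (block averaging); for nonlinear operations or other transforms, no such saving is guaranteed.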
In practice, we have found that certain image processing operations can be mapped to the range spaces of specific transformations, such that decompressing the processed compressed image yields an exact representation of (or an approximation to) the processed source image. We call such transform-regime operations analogues of the corresponding image operations over the given transform's range space. Computing an image operation, or a sequence of such operations, using compressed imagery is called compressive processing. In practice, certain formulations of analogous operations can yield a computational speedup, since: (a) the analogous operations may be more efficient than the corresponding image operation, or (b) fewer data may be processed by such operations.

Similar to the concept that computational speedup could accrue from the processing of compressed imagery, it has been observed that data security could be maintained throughout a computational system by appropriately processing encrypted data. In contrast to compressive transforms, which are extensively discussed in the literature, the formulation of numerous practical encryptive transforms remains obscure. Additionally, we note that many compressive transforms can be thought of as having weak encryption capabilities, due to the visual obscurity of compressed data formats.

1.1. Study Overview

In this dissertation, we address the following key questions:

1. Can compressive or encryptive transforms be classified in a manner that facilitates convenient derivation of analogues to common image processing operations?

2. Given the aforementioned classification scheme and a particular transform class, can methods for deriving analogues over the range spaces of multiple transforms in that class (or a relevant subclass) be elucidated? If not, then why not? If so, then how can one employ such methods effectively, and to what types of image operations or operands do they apply?

3. Given the presence of noise and computational error in imagery and image algorithms, respectively, what is the noise sensitivity of compressive processing? This is an important implementational issue, since image compression decreases redundancy in the compressed image. In the source image, noise in a given pixel is generally localized to the pixel's domain point. However, due to reduced redundancy in the compressed image, noise in a given pixel value of a compressed image could perturb multiple pixel values in the corresponding decompressed image.

4. Assuming that we can derive an analogue of an image operation over the range space of a compressive transform, what computational advantages can be obtained via image processing with such analogous operations? Additionally, is it possible to cascade compressive operations, thus facilitating algorithm development in the transform domain via composition of known analogues?

5. What additional insights, effects, or implementational advantages could accrue from compressive or encryptive processing? How would such concepts be useful in computer science?

Given the preceding questions, we conclude this chapter with a brief perspective on the small amount of previous work reported in the literature, as well as comments on methods and potential advantages of our approach. Chapter 2 begins with an overview of notation, and Chapters 3-5 summarize theory that supports the derivation of analogues of image processing operations. We choose several commonly-employed transforms from each class of our transform taxonomy. Given a transform $T$, we discuss the derivation of selected pointwise, global-reduce, and image-template operations over range($T$). In certain cases, we have discovered methods for deriving analogous operations that are applicable to a large subclass of transforms in the given class.

The latter portion of this dissertation contains an implementational discussion that emphasizes three topics.
First, we consider the effects of compressive image processing with three specific transforms (JPEG [7], Vector Quantization [8], and Visual Pattern Image Coding [9]). Second, we analyze the propagation of error through a discrete compressive processing system, in terms of the communications and computational errors manifested by given operations over the range space of each transform. Third, we discuss problems and pitfalls related to the implementation of such analogous operations on sequential and parallel processors. Implementational issues include space complexity as well as limits on computational and communications bandwidth. In particular, we discuss the reduction in parallelism that may be achieved via compressive processing on SIMD-parallel meshes. Such effects pertain primarily to medical and military applications in real-time image processing at high frame rates.

Conclusions and suggestions for future work emphasize the impact upon image processing that compressive computation may produce, under the assumption that error propagation can be predicted and controlled. An additional assumption pertains to the availability of analogous operations over the range spaces of compressive transformations that may be obtained from future development efforts. Open issues, such as the elucidation of general derivational methods for analogues over a transform class or subclass, are also considered.

1.2. Previous Work

The computational advantages of processing fewer data were long ago realized in the processing of reduced matrices [10,11]. Additionally, electrical engineering practice has long emphasized operations over parameter spaces that approximately characterize input signals. Such feature-space approximations usually implement one or more forms of data reduction, but can require exotic algorithms or architectures for computing simple operations [12].
Although of implementational interest to certain mission-specific efforts, such feature-space operations generally exhibit disadvantages of high space complexity as well as costly hardware and software development. Thus, such methods are not generally attractive for commercial applications. As a result, additional research is required in the processing of compressed data.

In order to acquaint the reader with the scope of this study and provide salient detail, we begin our literature review with a discussion of transforms that have been developed specifically for the purpose of data compression (Section 1.2.1). This overview supports the primary topic of this study, namely, compressive processing. We then review data encryption (Section 1.2.2), which provides background for a summary of encrypted data processing (Section 1.2.3). Such background leads naturally to a discussion of processing compressed signals and imagery, which is given in Section 1.2.4.

1.2.1. Data Compression

Research endeavor in signal and image compression has been extensive, due primarily to the economic benefits of exploiting communication channel bandwidth. In principle, compression is achieved by reducing redundancy in the source data. For example, an image of constant value $k$ can be represented by one number ($k$), regardless of image size. In contrast, an image consisting of randomly chosen pixel values cannot be compressed. Early on, it was recognized that the first derivatives of many signals exhibited lower information content than the source signal. This led to the development of primitive methods of image compression such as delta modulation [13] and differential pulse code modulation (DPCM). As a consequence of the need for accurate, efficient quantizers in PCM and DPCM, a variety of statistical compression techniques were developed, including adaptive DPCM, which were based upon adaptive or recursive quantization techniques [14].
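The observation that first differences often carry less information than the source signal can be checked with a short sketch. It assumes a lossless first-difference code (a simplification of DPCM with no quantizer) and uses the empirical zeroth-order entropy as the measure of information content; the test signal and function names are hypothetical.

```python
import math
from collections import Counter


def entropy_bits(values):
    """Empirical zeroth-order entropy of a sequence, in bits per sample."""
    counts = Counter(values)
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def delta_encode(signal):
    """Keep the first sample, then store successive differences."""
    return [signal[0]] + [signal[i] - signal[i - 1] for i in range(1, len(signal))]


def delta_decode(deltas):
    """Invert delta_encode by cumulative summation."""
    out = [deltas[0]]
    for d in deltas[1:]:
        out.append(out[-1] + d)
    return out


# Hypothetical slowly varying signal: a ramp that rises by 1 every two samples.
signal = [100 + i // 2 for i in range(64)]
deltas = delta_encode(signal)

print(entropy_bits(signal), entropy_bits(deltas))
```

For this ramp, the source samples take 32 equally likely values (5 bits per sample), whereas the differences are almost entirely 0 or 1, so their entropy is far lower. A signal with little sample-to-sample correlation would show no such gain.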
As digital images increased in size and thus required higher channel bandwidth and increased storage, data compression research began to emphasize two-dimensional transformations. For example, the concept of subdividing an image into blocks (generally of rectangular shape) gained popularity due in part to the limited memory then available on fast signal processors. Such methods, called block encoding (BE), tessellate the image domain into encoding blocks that exhibit greylevel configurations which can be represented in lossless form (e.g., via indexing) or with a lossy transform such as vector quantization (VQ). Since there are fewer block configurations than there are possible blocks, and a group of values is represented by an index or by an exemplar block (also called a vector), VQ produces the desired effect of image compression. By arranging the input data to achieve maximum intra-vector and minimum inter-vector correlation, the compression ratio of VQ can be increased, which partially offsets the effort required by the determination of the VQ codebook (a set of exemplar vectors) [15].

An alternative method of block encoding, called block truncation coding (BTC) [16], encodes regions of low greylevel variance in terms of a mean value and regions of high variance in terms of a mean, standard deviation, and a residual bitmap that denotes the positions of zero crossings. Unfortunately, BTC is expensive computationally, due to the adjustment of bits in the bitmap to effect reduced entropy. Since the cost of BTC increases exponentially with blocksize [9,16], and BTC has a compression ratio that is moderate by today's standards (typically 20:1 to 25:1 for images of natural scenes), BTC is not considered a compression transform of choice.
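The block truncation coding scheme described above can be sketched as follows. This is a minimal illustration that assumes the commonly described two-level, moment-preserving form of BTC, in which the bitmap marks pixels at or above the block mean; the variance threshold, function names, and test block are hypothetical, and the bitmap adjustment step mentioned in the text is omitted.

```python
import math


def btc_encode(block, var_threshold=1.0):
    """Encode a flattened block as (mean, std, bitmap).

    Low-variance blocks are stored as a mean only (bitmap is None).
    """
    n = len(block)
    mean = sum(block) / n
    var = sum((p - mean) ** 2 for p in block) / n
    if var <= var_threshold:
        return (mean, 0.0, None)
    bitmap = [1 if p >= mean else 0 for p in block]
    return (mean, math.sqrt(var), bitmap)


def btc_decode(code, n):
    """Reconstruct a block of n pixels with two levels that preserve mean and variance."""
    mean, std, bitmap = code
    if bitmap is None:
        return [mean] * n
    q = sum(bitmap)  # number of pixels at or above the mean (1 <= q <= n - 1)
    low = mean - std * math.sqrt(q / (n - q))
    high = mean + std * math.sqrt((n - q) / q)
    return [high if bit else low for bit in bitmap]


# Hypothetical 4 x 4 block, flattened row by row.
block = [2, 9, 12, 15, 2, 11, 11, 9, 2, 3, 12, 15, 3, 3, 4, 14]
decoded = btc_decode(btc_encode(block), len(block))
print(decoded)
```

The decoded block contains only two distinct greylevels, yet its mean and variance match those of the source block, which is the sense in which this form of BTC is moment-preserving.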
Additionally, by transforming the output of a block encoder with, for example, the Fourier or Cosine transforms, then selecting transform coefficients that are deemed significant a priori, one can further reduce the image data burden, although at the cost of information loss. Such methods are generally called transform coding [17], and feature prominently in compression schemes (e.g., JPEG, MPEG) that are currently in vogue for digital telephony applications [18]. By following the coefficient selection stage (i.e., quantization step) with a provably optimal, lossless compression transform such as Huffman encoding [19], one can obtain further data reduction and thus achieve higher compression ratios without incurring additional information loss.

More exotic methods of image coding are based upon the reduction of an image to significant eigenvalues, as in the Singular Value Decomposition (SVD) [20]. Although proven optimal for image compression, eigenvalue transforms such as the SVD and Karhunen-Loeve transform (KLT) are computationally burdensome, requiring $O(n^4)$ work for an $n \times n$-pixel image. Thus, the SVD and KLT are infrequently employed, despite the ease with which significant transform coefficients (eigenvalues) may be obtained from the transformed image.

An alternative method of compression by recursive decomposition, which is often based upon knowledge derived from observation of the human visual system (HVS), has been employed with some success but tends to be data-dependent. For example, early attempts at multifrequency decomposition, such as the Synthetic High technique [17], eventually led to Fourier-transform based methods currently known as wavelet-based compression [21]. Similarly, Barnsley et al. [22,23] have published extensively in the area of image compression based on recursive, VQ-like decompositions that derive their methodology from the collage transform and from concepts of fractal geometry [24].
Due to an obscure formulation and high cost of the forward transformation, fractal-based compression remains an experimental technique.

Recently-published research by Chen and Bovik [9] has reconsidered the problem of compression based upon HVS-based knowledge of exemplar patterns that may be used by the visual cortex to partition imagery. For example, the visual cortex is known to contain simple, complex, and hypercomplex cells [25] that mediate receptive fields which detect bar- or wedge-shaped gradients, as well as feature orientation and (possibly) shape. Chen has exploited this information to modify block truncation coding by replacing the costly bitmap construction step with a simple correlation step that finds the best match between the zero crossing configuration and a small set of exemplar blocks. This method, called visual pattern image coding (VPIC), yields high computational efficiency with respect to BTC or VQ and combines advantages of both methods. Given the appropriate set of exemplar patterns, Chen has demonstrated (on several standard images) high reconstruction fidelity at compression ratios of 20:1 to 40:1, which appears to be superior to JPEG's performance at such compression levels.

We next consider the allied topic of data encryption, from which area of study signal and image compression theory arose.

1.2.2. Data Encryption

The development of data encryption transforms has long been of interest due to applications in military science and statecraft [26]. More recently, business communications research has shown great interest in data encryption, as applied to the security of financial transactions. Beginning with monoalphabetic substitutions and transpositional ciphers [27], data encryption progressed to anagrammatic ciphers, as well as polyalphabetic encryptions based on the Vigenère cipher [28].
The vulnerability of such methods to attack based upon the ciphertext histogram and knowledge of plaintext statistics led to development of the Vernam cipher [29], which is based upon modulo-$n$ addition and subtraction. The Vernam cipher, although used extensively during WWI, can be compromised if an intermediate result is detected, or if a portion of the plaintext is known. In response to this situation, development centered on Hill's linear $k$-gram cipher [27] and rotor machines that were sophisticated electro-mechanical implementations of the Vigenère cipher. The Hill cipher requires solution of a system of $k$ linear equations, in which the alphabetic indices of $k$ elements of plaintext are multiplied by an encryption matrix to yield ciphertext. The product of an inverse of the encryption matrix and the ciphertext produces the plaintext. Although the Hill cipher was initially thought to be secure, it was quickly proven that linear-algebraic attack could compromise the encryption in $O(k^3)$ time [27]. Likewise, rotor machines incur the basic deficiencies of the Vigenère cipher, which are: (a) a finite number of alphabets that is small relative to customary message size, (b) resulting susceptibility to attack based upon known plaintext statistics and semantics, (c) insecure key distribution schemes, and (d) vulnerability to automated (i.e., computational) attack. In the latter category was much pathfinding research in pattern matching and enumerative attack using Turing machines, as embodied in the Atlas computer employed by the British during WWII in their cryptology center at Bletchley Park. In this context, it is interesting to note that the rotor machines used by the Axis powers in WWII were descended from a machine that was patented in the US in the 1920s [30]. Such narrative is well documented in the literature [26], and is thus not considered further in this discussion.
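The Hill cipher described above can be sketched for the case $k = 2$ over a 26-letter alphabet. This is an illustrative sketch only: the key matrix, function names, and message are hypothetical, and the modular inverse relies on Python's three-argument `pow` with a negative exponent, which requires Python 3.8 or later.

```python
def mat_vec_mod(m, v, n=26):
    """Multiply a 2 x 2 matrix by a 2-vector, modulo n."""
    return [
        (m[0][0] * v[0] + m[0][1] * v[1]) % n,
        (m[1][0] * v[0] + m[1][1] * v[1]) % n,
    ]


def inverse_2x2_mod(m, n=26):
    """Invert a 2 x 2 matrix modulo n; raises ValueError if it is not invertible."""
    det = (m[0][0] * m[1][1] - m[0][1] * m[1][0]) % n
    det_inv = pow(det, -1, n)  # modular inverse of the determinant
    return [
        [(m[1][1] * det_inv) % n, (-m[0][1] * det_inv) % n],
        [(-m[1][0] * det_inv) % n, (m[0][0] * det_inv) % n],
    ]


def hill_apply(matrix, text):
    """Apply a 2 x 2 Hill matrix to an even-length string of uppercase letters."""
    indices = [ord(ch) - ord("A") for ch in text]
    out = []
    for i in range(0, len(indices), 2):
        out.extend(mat_vec_mod(matrix, indices[i:i + 2]))
    return "".join(chr(x + ord("A")) for x in out)


KEY = [[3, 3], [2, 5]]  # hypothetical key; determinant 9 is coprime to 26

ciphertext = hill_apply(KEY, "HELP")
plaintext = hill_apply(inverse_2x2_mod(KEY), ciphertext)
print(ciphertext, plaintext)
```

Because encryption is a linear map, an attacker who obtains $k$ linearly independent plaintext and ciphertext blocks can solve for the key matrix by Gaussian elimination, which is consistent with the $O(k^3)$ cost cited in the text.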
Based upon cryptanalytic experience gained during WWII, Claude Shannon in 1949 published suggestions concerning features of cryptosystems that might be useful in the future, such as interleaving of substitutions and transpositions [30]. Throughout the 1960s, the IBM Corporation developed Shannon's ideas into a workable prototype cipher, called LUCIFER, which was improved in the early 1970s to yield the Data Encryption Standard [29]. In its original form, DES accepted a 56-bit Boolean key and a 64-bit Boolean plaintext, and returned a 64-bit Boolean ciphertext. In practice, the DES transformation consists of 18 component transformations that are composed to yield a sequence of substitution and transposition operations. The transpositions are based in part upon hitherto undisclosed functions called S-boxes that are implemented in publicly available lookup tables [30]. Due to the possibility of compromise via enumerative attack facilitated by increasingly powerful computers, it was proposed in the mid-1980s that the DES input be extended to 128 bits, and that the key be of similar size. Note that: (a) this extension has yet to be adopted, (b) the US Department of Defense (DoD) until recently continued to recommend the 56-bit DES keyspace as the standard for business communication, and (c) DES has not been employed by the DoD in military encryption. The latter situation was underscored by Diffie and Hellman's proposal [31] of an exhaustive attack on DES, in which a multiprocessor architecture enumerates the $2^{56}$ possible keys. Given the 1 ns cycle time of simple, fast, commercially available uniprocessors (GaAs or HEMT technology [32]), and assuming the existence of a $10^4$-processor array, $10^{13}$ keys could be generated per second. Since DES is inherently pipelined, and $2^{56} \approx 7 \times 10^{16}$, then approximately 7000 seconds, or about two hours, would be required for brute-force decryption.
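The brute-force estimate above can be checked with a few lines of arithmetic. The following is a minimal sketch that simply restates the figures quoted in the text (one key trial per processor per 1 ns cycle, a $10^4$-processor array); the function and constant names are illustrative and do not come from the cited references.

```python
# Back-of-the-envelope estimate of exhaustive DES key search.
# All figures are the assumptions quoted in the text, not measurements.
KEY_BITS = 56          # DES key length in bits
CYCLE_TIME_S = 1e-9    # assumed 1 ns per key trial per processor
PROCESSORS = 10**4     # assumed size of the processor array


def brute_force_seconds(key_bits, cycle_time_s, processors):
    """Worst-case time to enumerate every key, one key per cycle per processor."""
    keys = 2 ** key_bits
    keys_per_second = processors / cycle_time_s
    return keys / keys_per_second


seconds = brute_force_seconds(KEY_BITS, CYCLE_TIME_S, PROCESSORS)
print(f"{2 ** KEY_BITS:.2e} keys, {seconds:.0f} s, {seconds / 3600:.2f} h")
```

With the unrounded value of $2^{56}$ the estimate comes to roughly 7200 seconds, i.e., about two hours, which agrees with the rounded figure in the text to within a few percent.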
Additional methods of attacking DES include guessing of the key using techniques based upon genetic algorithms [33], which have proven efficient in attacks upon rotor machines [34]. From the preceding discussion, it should be obvious that there exist serious problems with cryptosystems that have only one key. In particular, extreme vulnerability occurs when the key is distributed to numerous communicating parties. For example, consider network communications, where a fully connected network with n nodes requires $O(n^2)$ key transmissions. Clearly, the probability of key seizure (or, at the very least, key monitoring) by unauthorized persons increases polynomially with the number of communicating parties. For example, if N = 4, then there are $4^2 - 4 = 12$ key transmissions required. However, if N = 8, then 56 keys are transmitted. If one assumes a network organized around a high-speed trunk or backbone line, then the second case (N = 8) has a probability of key transmission per encoded message that exceeds 4.5 times the probability of key transmission when N = 4.

As a partial solution to the key distribution problem, Diffie and Hellman [31] proposed that the key be partitioned into two halves, i.e., $k = (k_p, k_s)$, where the public key $k_p$ is the only key required for encryption, but $k_s$ is required for decryption. Implementationally, let the i-th user in an N-user system choose key $k_i = (k_p(i), k_s(i))$, and let a directory readable by all users list the public keys $k_p(i)$, i = 1..N. Let a plaintext message a be sent to user i as ciphertext $a_c$, which is defined as

$a_c = T(a, k_p(i))$, (1)

where $T : F^X \times K \to F^X$ denotes an encryption transform. Only user i has the secret portion $k_s(i)$ that is required to compute the decryption $T^* : F^X \times K' \to F^X$ as

$a = T^*(a_c, [k_p(i), k_s(i)])$. (2)

Thus, in public-key systems, only keys $k_p(i)$, i = 1..N, are distributed openly, versus the secret distribution of $O(N^2)$ keys required by private-key cryptosystems.
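The key-count comparison can be made concrete with a short calculation. This sketch assumes the $n^2 - n$ count of key transmissions used in the text for a fully connected private-key network, and one published key per user in the public-key case; the function names are illustrative.

```python
def private_key_transmissions(n):
    """Key transmissions in a fully connected n-node network (n^2 - n, as in the text)."""
    return n * n - n


def public_keys_published(n):
    """Public keys listed in the shared directory, one per user."""
    return n


for n in (4, 8):
    print(n, private_key_transmissions(n), public_keys_published(n))

# Ratio of key transmissions for N = 8 versus N = 4.
ratio = private_key_transmissions(8) / private_key_transmissions(4)
print(round(ratio, 2))
```

The ratio of 56 to 12 is about 4.67, consistent with the statement that the N = 8 case exceeds 4.5 times the N = 4 case.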
Additionally, public-key cryptography (PKC) can be used to authenticate transmissions when a message is accompanied by an encrypted signature, which may be appended automatically. The discussion of authentication, which is beyond the scope of this dissertation, is summarized in Reference 29.

Although attractive conceptually, PKC has two stringent requirements, which can be stated in terms of the following related problems: (a) given a sum of two numbers, determine the addend and augend; and (b) given a product of two numbers, determine the multiplicand and the multiplier. In practice, such requirements have been fulfilled by the following cryptosystems. Merkle and Hellman's knapsack cryptosystem [35] was the first public-key system proposed in the open literature. Concurrently, Graham and Shamir developed and later published [36] a slightly different approach to knapsack systems, which is distinguished by the construction of knapsack sets and by a modification to Merkle and Hellman's decryption step. Another approach to knapsack systems is the iterated knapsack, which uses multiple knapsack sets, similar in concept to the polyalphabetic substitution that employs multiple alphabets. Knapsack sets are often implemented in terms of superincreasing sets, which can be constructed from an indexed set F, as follows

$F_{si} = \{ f_i \in N : f_i > \sum_{j=1}^{i-1} f_j \} \subset F$. (3)

An unfortunate property of superincreasing sets, which defeats the security of knapsack encryption, is that the ratio of the number of elements to the number of bits required to encode the set is not greater than 1. The greater the radix r of a superincreasing set $F_{si} = \{ r^i \in R : r > 1 \text{ and } i \in N \cup \{0\} \}$, the further apart are such numbers on the number line. Unfortunately, such low-density sets yield knapsack encryptions that can be easily broken.
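To illustrate why a bare superincreasing set offers no security, the sketch below encodes a bit vector as a subset sum and recovers it with a greedy scan from the largest element down. It is a simplified illustration only: the modular disguise used in the Merkle-Hellman system is deliberately omitted, and the example set and function names are hypothetical.

```python
def is_superincreasing(seq):
    """True if every element exceeds the sum of all preceding elements."""
    total = 0
    for f in seq:
        if f <= total:
            return False
        total += f
    return True


def knapsack_encode(bits, seq):
    """Subset sum selected by the plaintext bits."""
    return sum(f for b, f in zip(bits, seq) if b)


def greedy_decode(target, seq):
    """Recover the bits by scanning from the largest element downward."""
    bits = [0] * len(seq)
    for i in range(len(seq) - 1, -1, -1):
        if seq[i] <= target:
            bits[i] = 1
            target -= seq[i]
    if target != 0:
        raise ValueError("target is not a subset sum of the sequence")
    return bits


knapsack_set = [2, 3, 7, 14, 30, 57]   # hypothetical superincreasing set
plaintext_bits = [1, 0, 1, 1, 0, 1]
ciphertext = knapsack_encode(plaintext_bits, knapsack_set)
recovered = greedy_decode(ciphertext, knapsack_set)
print(is_superincreasing(knapsack_set), ciphertext, recovered)
```

Because each element exceeds the sum of its predecessors, the greedy scan makes exactly one pass over the set, which is why such sets must be disguised before they can serve as a public key.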
However, by constructing hard knapsack sets, which are denser and yield encryptions that are difficult to break, it has been shown that a successful computational attack is at least NP-complete [30]. Methods of attack on knapsack systems have been proposed by Shamir [37], Lagarias and Odlyzko [38], as well as Brickell [39]. A summary of such methods is beyond the scope of this overview, but can be found in References 29 and 30, which review knapsack systems in detail.

Rivest, Shamir, and Adleman (RSA) proposed [40] that public-key cryptography be implemented in terms of the product of prime numbers. Their system currently remains secure, i.e., no successful comprehensive attack on RSA has been reported in the open literature. However, the RSA algorithm, as well as a feasible RSA decryption scheme, has been disclosed in the Pretty Good Privacy (PGP) encryption algorithm, which was made publicly available in 1994. As a result, RSA is generally considered insecure for purposes of military communications. With the development of efficient prime factoring algorithms, as well as the storage of precomputed prime factors, RSA has become susceptible to attack unless the number of digits k is very large. Furthermore, the processing of data encrypted by RSA requires decryption for numerous operations to be feasible implementationally. Although multiplication can be performed upon the encrypted result, the general equation y = a + bx cannot be solved for y given x, since RSA is based upon the multiplication of exponents. Thus, the utility of RSA in processing encrypted data appears to be limited to integral scaling problems. Additionally, efficient implementation of the RSA algorithm requires reasonably fast algorithms for determining numbers that are relatively prime in a system based upon modular arithmetic. Furthermore, RSA requires the efficient computation of inverses in a modular system, as well as the computation of large exponents.
Such algorithms have been published recently [41]. Methods of encryption that are currently in vogue exploit the geometry and nonlinearity of certain bivariate functions (such as conic sections) to produce two keys that satisfy constraints of the PKC scheme. Such methods yield a large search space, but can be compromised if the geometric figure upon which they are based can be guessed or approximated computationally [29]. An analysis of such implementational techniques is beyond the scope of this dissertation, but may be found in Reference 29.

1.2.3. Encryptive Processing

The processing of encrypted data, which we call encryptive processing, predated research in compressive processing by nearly two decades. Encryptive processing seeks to mathematically combine ciphertext and plaintext data in a manner that is secure from the site of encryption to the site where the data is combined, and remains secure to the decryption site. Such methods are useful for financial transactions, updating of personal (e.g., medical) records without compromising privacy, and processing of classified data such as military surveillance imagery. Our review of previously published work in encryptive processing begins with Rivest, Adleman, and Dertouzos' concept [42] of linear homomorphisms, which have been shown to be vulnerable to linear-algebraic attack, but remain interesting pedagogically. Additionally, we overview analyses by Abadi et al. [43] of the likelihood of encrypting polynomial-time and NP-complete functions, and discuss recent work by Ahituv [44]. The latter research is based upon the assumed combinatoric complexity of recursive encryption, and provides an in-depth analysis of cryptographic vulnerability in practical encryptive computational scenarios.
We also consider binary superposition operations over plaintext-ciphertext pairs formulated by Yu and Yu [45], which are feasible in practice, due primarily to an efficient formulation, deep recursive encryption, and large plaintext size. Although an intermediate result may be discovered, its decryption can be rendered infeasible in operational scenarios due to practical constraints upon an adversary's ability to enumerate the recursive encryption steps.

1.2.3.1. Linear privacy homomorphisms. Difficulties in processing encrypted data were recognized early on, and were discussed in Rivest et al.'s paper [42] of 1978. However, linear homomorphisms can be successfully attacked via the solution of a set of linear equations, as noted in Reference 44.

Given spaces S and S', as well as functions $f : S \to S$ and $f' : S' \to S'$, the transformation $T : S \to S'$ is a homomorphism if and only if $f'(T(g)) = T(f(g))$, $g \in S$. Additionally, if T is a linear transform, as well as a homomorphism, then T is called a linear homomorphism. For example, let a and b be n-element vectors over the alphabet F, and let T be a linear homomorphism. We thus have

$T(a + b) = T(a) + T(b)$ (4)

$T(m \cdot a) = m \cdot T(a) = T(a) \cdot m$, (5)

where m is a constant in F. Let encrypted data be denoted by $a_c = T(a)$ and $b_c = T(b)$. Assuming that T has an inverse $T^{-1} : S' \to S$, one updates $a_c$ with plaintext c as follows

STEP 1: Encryption: $c_c = T(c)$
STEP 2: Updating: $a_c^{new} = a_c + c_c$ (6)
STEP 3: Decryption: $a^{new} = T^{-1}(a_c^{new})$

Multiplication by a constant is similarly performed, per Equation 5. Linear homomorphic encryption systems are easily broken by solving n linear equations, as follows. Initially, one constructs a dictionary d consisting of n words.
In a representation such as the binary number system, where the k-th bit of the k-th word is unity and the word is zero elsewhere, we have

$d_1 = T(1, 0, 0, \ldots, 0)$
$d_2 = T(0, 1, 0, \ldots, 0)$
$\vdots$ (7)
$d_n = T(0, 0, 0, \ldots, 1)$

In practice, an n-bit encrypted block can be defined as the sum of rows of d, due to the linearity property of T. For example, if we choose an enciphered message

$a_c = (0, 1, 1, 0, 0, 1, 0, \ldots, 0) \in \{0, 1\}^n$

we have the linear equation

$a_c = d_2 + d_3 + d_6$,

or some linear combination of the rows of d. By defining n such messages, one obtains n linear equations. By knowing the solution (i.e., the plaintext arguments of T that produced d), the homomorphism can be broken via solving n linear equations in the customary manner. Implementationally, given the $O(n^3)$ multiplication overhead of matrix inversion, 1024-byte signals (n = 8192 bits, or 16 lines of 64 characters) could be inverted (with machines capable of 1 billion multiplies per second) in approximately 550 seconds ($8192^3$ multiplications divided by $10^9$ multiplications per second). As a result, such messages are not immune from attack with current computational hardware when the useful lifespan of a message is as short as several minutes. Thus, privacy homomorphisms remain of pedagogic interest primarily.

1.2.3.2. Abadi's analysis of encryptive processing. Given the apparent failure to secure simple operations such as addition and multiplication by a constant, one would naturally ask what functions are encryptable, and what level of security can thus be attained. Abadi, Feigenbaum, and Kilian [43] considered the problem of encryptive processing in terms of the following scenario, which is based upon an encryptable function f:

1. Party A knows a datum x in domain(f), but has insufficient resources to compute f(x).
2. As a result, A requests from party B the quantity f(y), where $y \in$ domain(f) and $y \neq x$.
Since B will compute the function f that is hard from A's perspective, B can successfully attack a cryptosystem based upon intractability assumptions. However, B may not guarantee the privacy of the encrypted data.
3. Via an encryption, A (with inferior resources) can request f(y) from B, but B cannot infer x from y.
4. However, A can infer f(x) from f(y), and does so, to obtain the desired result.

From item 3, one (correctly) concludes that A must be able to understand (and thus compromise) an encryption system that B cannot successfully attack. Therefore, A has greater encryption resources than B, but has fewer computational resources (as noted in item 1). Via a series of interesting theorems, the authors claim to

* Prove precise statements about what information the computational operation f hides and discloses, in the information-theoretic sense;
* Develop theory that proves an encryption $T : x \mapsto y$ hides or conceals at least the properties $\mathcal{H}(x)$ and discloses at least $\mathcal{D}(x)$;
* Give examples of natural encryptable functions; and
* Establish a strong negative result about the encryptability of NP-complete functions.

In particular, the authors show that the functions for which there exist efficient encryption schemes and disclose nothing are computable in expected polynomial time. For example, under certain operational assumptions, the arithmetic and logic functions can be secure, and might not disclose information concerning their operands. The authors' conclusions support the intuition that if f is hard to compute, then f is difficult to map to the range space of an encryption transform, and A cannot hide everything about f or f(x) from B. Unfortunately, the authors do not address in detail the well-known problem of disclosing the properties of encrypted vector-space operands, such as images.
For example, if a function f accepts an n-pixel image a and returns an n-pixel image b, then f discloses the size n of its input, which is an important clue for the cryptanalytic adversary. Similarly, a or b may contain subimages having spatial structure that is revealed in the ciphertext. Note that this study primarily concerns customary image processing functions that compute over an n-pixel image in $O(n^k)$ time, where O

In summary, Abadi et al. provide a useful description of the encryptability of functions in the complexity classes NP or co-NP, which forms a conceptual basis for our work. Unfortunately, Abadi's paper contributes little to our study of the implementational security of polynomial-time functions (i.e., the amount of input information disclosed by a computation under practical constraints). However, a useful result is presented in the observation that polynomial-time functions can be encrypted in polynomial time.

1.2.3.3. Key-update model of encryptive processing. In response to the shortcomings of Rivest's homomorphic functions, Ahituv, Lapid, and Neumann [44] present an incremental requirements analysis of computational security that begins with the concept of additive updates of ciphertext data with plaintext. We summarize the salient points of the analysis, as follows. Given key k, plaintext a, as well as the current and updated ciphertext $a_c$ and $a_c^{new}$, the authors discuss an update scheme given by the encryption

$a_c = (k + a) \bmod 2^L$, (8)

where L is chosen sufficiently large to prevent loss of data due to overflow. Ciphertext updates with plaintext denoted by $a^{new}$ are given by

$a_c^{new} = a_c + a^{new}$. (9)

After j consecutive updates, the plaintext equivalent of the current ciphertext value becomes

$a^{new}(j) = a_c^{new}(j) - k$. (10)

Unfortunately, the data is continually vulnerable to a one-time breaking of the key.
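The additive update scheme of Equations 8 through 10 can be sketched in a few lines. The word length, key, and data values below are arbitrary illustrations rather than parameters from Reference 44, and the function names are hypothetical.

```python
L = 16              # assumed word length in bits (illustrative)
MODULUS = 2 ** L


def encrypt(a, k):
    """Equation 8: additive encryption modulo 2^L."""
    return (k + a) % MODULUS


def update(a_c, a_new):
    """Equation 9: add a plaintext update directly to the ciphertext."""
    return (a_c + a_new) % MODULUS


def decrypt(a_c, k):
    """Equation 10: subtract the key to recover the accumulated plaintext."""
    return (a_c - k) % MODULUS


key = 40961                      # illustrative key
history = [encrypt(1200, key)]   # ciphertext states an observer might record
for plaintext_update in (35, 250, 15):
    history.append(update(history[-1], plaintext_update))

print(decrypt(history[-1], key))

# Differences between successive ciphertext states equal the plaintext updates,
# so an observer of intermediate states learns the updates without the key.
observed = [(b - a) % MODULUS for a, b in zip(history, history[1:])]
print(observed)
```

The second printout shows the weakness in concrete terms: every update is exposed by differencing stored ciphertext values, even though the key itself is never revealed.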
Additionally, the accumulation of knowledge concerning plaintext updates could provide an adversary with increasing amounts of information about the ciphertext. Since the sum $(k + a_c)$ could be discovered in an intermediate state, previous updates could be traced without discovering the key. Furthermore, the tagging of the data as plaintext or ciphertext that is required for correct decryption can provide useful information to an adversary.

The preceding situation can be improved if one employs a linear isomorphism $T : F^n \times K \to F^n$, such that $a_c = T(a, k)$ and $b_c = T(b, k)$. Ahituv states the obvious requirement that

$T^{-1}(a_c + b_c) = T^{-1}(T(a) + T(b)) = T^{-1}(T(a + b)) = a + b$, (11)

where b denotes plaintext to be added to a. The utility of such encryptive addition could be augmented by multiplication if the following distributive property was supported

$a_c(k) \cdot (a + b) = a_c(k) \cdot a + a_c(k) \cdot b$, (12)

where $a_c(k)$ denotes ciphertext produced with key k. Since each successive encryption would add the value of the key, it is required that after the n-th update, the most recent plaintext update is given by

$a^{new} = a_c - n \cdot k$. (13)

The system described by Equations 11-13 no longer distinguishes plaintext from ciphertext in the sum, thus reducing vulnerability. However, discovery of the key can compromise the system at any time. This limitation necessitates frequent key replacement for both plaintext updates and stored ciphertext data. Although linear homomorphisms can overcome the key generation problem, such encryptions are easily broken, as discussed in Section 1.2.3.1.

Due to key vulnerability, the authors propose a further improvement whereby different neighborhoods of the keyword can be utilized as update keys. Thus, if a certain key is broken, the adversary obtains one-time (temporary) information, but cannot derive the updates $a^{new}$ or the plaintext a.
The success of this method depends upon the generation of a sufficiently long random key, which is feasible with off-the-shelf technology [46]. For example, let $k_1$ and $k_2$ be successively generated keys. Denoting the update function with one or two keys as $U(k_1)$ or $U(k_1, k_2)$, and encryption requiring one or two keys as $T_{k_1}$ or $T_{k_1,k_2}$, Equations 11 and 12 become

$T_{k_1,k_2}(a, b) = T_{k_1,k_2}(a) + T_{k_1,k_2}(b) = a_c + b_c$ (14)
$U(k_1, k_2) \cdot (a + b) = (U(k_1) \cdot a) + (U(k_2) \cdot b)$,

where the decryption operation $T^{-1}$ is specified by

$T^{-1}_{k_1,k_2}[T_{k_1}(a) + T_{k_2}(b)] = T^{-1}_{k_1,k_2}[T_{k_1,k_2}(a + b)]$ (15)
$= T^{-1}_{k_1}[T_{k_1}(a)] + T^{-1}_{k_2}[T_{k_2}(b)] = a + b$.

Assume that n-1 updates have been processed and are contained in $U_{n-1}$. Let the n-th plaintext update $a_n$ be encrypted as $U_n$. The update proceeds as follows

$U_{n+1} = U_{n-1} + U_n$ (16)
$k_{n+1} = k_{n-1} + k_n$,

where decryption is given by

$a_{n+1} = U_{n+1} - k_{n+1}$. (17)

Further security can be provided by permuting blockwise partitions of the keys to change their relative locations. In this respect, Ahituv's method suggests possible derivation from the block transpositions inherent in the DES algorithm.

Unfortunately, the previous method requires that the protocol between the client (user) and server (secure machine) become more complicated and, therefore, more awkward. Additionally, a more complex key management policy is required, which can render the system more vulnerable. Implementationally, key protection may be insufficient to prevent determination of the plaintext from knowledge of successive ciphertext updates. Thus, we progress to Yu and Yu's cipher [45], as discussed in the following section.

1.2.3.4. Yu and Yu's time-reversal ciphers. Following Ahituv's analysis, Yu and Yu [45] proposed a method of recursive encryption T that supports the following superposition of plaintext a and b in terms of ciphertext $a_c = T(a)$ and $b_c = T(b)$

$c_c = (m_1 \cdot a_c) + (m_2 \cdot b_c)$, (18)

where $m_1$ and $m_2$ are constants.
The authors employ a time-reversal encryption, which is both computationally efficient and invertible, and is described as follows. Let $a \in F^n$ denote a plaintext tuple to be encrypted by a time-reversal transform $T : F^n \times K \to F^n$, such that $a_c = T(a, k)$, where the key $k \in K$. Now, $a_c(i)(t)$, the i-th element of the ciphertext vector $a_c$ at time t, is determined by a function of form

$a_c(i)(t+1) = (k(a_c(i)(t)) - a_c(i)(t-1)) \bmod q$, (19)

where q is an arbitrary modulus, and the key $k(a_c)$ is a function of $a_c(i)$ and its L neighboring elements. Time-reversal encryption can exhibit high data security, since the cardinality of a keyspace K constructed from L terms about $a_c(i)$ is equal to $q^{q^L}$ elements. For example, let q equal the usual value of 256 and L = 3, with a polynomially derived key

$k(i) = c_1 \cdot a_c(i-1) + c_2 \cdot a_c(i) + c_3 \cdot a_c(i+1)$, (20)

where $c_j$, j = 1..3, denote constants. Then $|K| = 256^{256^3} \approx 2^{1.34 \times 10^8}$ possible keys. As discussed by the authors, circulant boundary conditions should be employed, i.e., for n input data, the index i must be computed modulo n. Additionally, one must set certain initial conditions, such as

$a_c(i)(t=0) = b(i)$ and $a_c(i)(t=1) = a(i)$, (21)

where elements of the arbitrary background vector $b \in F^n$ are chosen pseudorandomly.

Decryption is obtained by reversing the encryption's temporal order, i.e.,

$a_c(i)(t-1) = (k(a_c(i)(t)) - a_c(i)(t+1)) \bmod q$. (22)

Thus, decryption reverses the order of key addition. Although the keyspace is large, time-reversal encryption exhibits two significant disadvantages, as follows:

1. From the definition of data superposition, T is a linear homomorphism, and can therefore be compromised by the method portrayed in Section 1.2.3.1. In particular, if L terms are used to construct a given key, the authors show that the key-generating function k can be determined by solving L+1 simultaneous linear equations if $a_c(t-1)$ is known.
However, the term $a_c(t-1)$ supposedly remains confidential, since only two consecutive encrypted texts are publicly available.
2. The term $a_c(t-1)$ could be plaintext, per the initial condition portrayed in Equation 21. However, deep concealment of $a_c(t-1)$ is feasible a priori via extensive iteration of the encryption function, as well as appropriate selection of a background vector.

An additional method of attack on the time-reversal cipher is termed data subtraction, which is especially attractive for decrypting linear homomorphisms. As an example of data subtraction, consider the Vernam cipher, which generates a ciphertext bit stream via the expression

$a_c = (k - a) \bmod 2$, (23)

where $a, a_c, k \in \{0, 1\}^n$ denote the plaintext, ciphertext, and key, respectively. Now, let us generate the updated ciphertext $a_c^{new} = (k - a^{new}) \bmod 2$, and subtract $a_c^{new}$ from $a_c$ to obtain

$a_c - a_c^{new} = (a^{new} - a) \bmod 2$, (24)

which is equivalent to the Vernam encryption of a using $a^{new}$ as the key. If a and $a^{new}$ are partially known, then the key can be obtained as follows

$k = (a_c + a) \bmod 2$. (25)

Since the time-reversal cipher is similar to the Vernam cipher, the authors note in passing that the difference of two ciphertexts $a_c$ and $b_c$ is given by

$a_c(t+1) - b_c(t+1) = (k(a_c(t)) - k(b_c(t))) - (a_c(t-1) + b_c(t-1))$. (26)

If the key difference $k(a_c(t)) - k(b_c(t))$ vanishes (which is claimed to be infrequent) and $a_c(t-1)$ and $b_c(t-1)$ are plaintext vectors (per item 2, above), then the encryption at time t+1 could be unsuccessful. Similarly, if the sum of the ciphertexts at time t-1 vanishes, then the key difference can be revealed in the left member of Equation 26. However, the authors claim that this deficiency can be remedied by subjecting the plaintext to several recursively applied encryption steps prior to computing the plaintext-ciphertext update.
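The data-subtraction attack on the Vernam cipher (Equations 23 through 25) can be illustrated with short bit vectors. This is a minimal sketch in which the key and plaintexts are arbitrary examples and the function names are hypothetical.

```python
def vernam(plaintext, key):
    """Equation 23: bitwise a_c = (k - a) mod 2."""
    return [(k - a) % 2 for a, k in zip(plaintext, key)]


def subtract_mod2(x, y):
    """Elementwise (x - y) mod 2."""
    return [(p - q) % 2 for p, q in zip(x, y)]


key   = [1, 0, 1, 1, 0, 0, 1, 0]   # illustrative key bits
a     = [0, 1, 1, 0, 1, 0, 0, 1]   # original plaintext
a_new = [1, 1, 0, 0, 1, 1, 0, 1]   # updated plaintext

a_c = vernam(a, key)
a_c_new = vernam(a_new, key)

# Equation 24: the key cancels when one ciphertext is subtracted from the other.
difference = subtract_mod2(a_c, a_c_new)
print(difference == subtract_mod2(a_new, a))

# Equation 25: known plaintext plus its ciphertext yields the key.
recovered_key = [(c + p) % 2 for c, p in zip(a_c, a)]
print(recovered_key == key)
```

Both comparisons hold for any choice of bit vectors, since addition and subtraction coincide modulo 2; the example simply makes the cancellation of the key visible.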
The authors further state (without proof) that there exists no simple cryptanalysis method by which one can determine either the plaintext or the keyspace from recursively encrypted ciphertext. As noted previously, superposition based upon time reversal is a linear homomorphism. Such transformations are notably susceptible to correlation-based attacks. That is, due to the linearity property, the ciphertext and plaintext exhibit high autocorrelation as well as high cross-correlation. Thus, successive updates can be cross-correlated to provide significant information about the keyspace and the source plaintext. Numerous reports of correlation-based cryptanalysis exist in the open literature [30], to which the reader is referred for a detailed explanation of technique.

1.2.4. Compressive Processing

Our survey of the compressive processing literature is brief, due to the recent and limited nature of work in this area. The allied (and more general) topic of processing transformed imagery is overviewed, beginning with the well-known and nearly trivial case of processing reduced matrices [10], which has been extensively researched. We then progress to Healy's work on edge detection over locally averaged and subsampled imagery, which furnished a portion of the early conceptual and analytical basis for this study. Jawerth's feature detection algorithm for wavelet-transformed imagery [47] is reviewed, as is Jolion's brief discussion of operations over pyramid-encoded images [48]. We conclude our overview with a summary of Cosman et al.'s recently reported work in pattern recognition and low-level processing over vector-quantized imagery [1].

1.2.4.1. Processing of reduced matrices. Let $a \in R^{m \times n}$ denote an m×n-pixel, real-valued matrix. Assume that range(a) contains values in a set S that are deemed insignificant, i.e., can be ignored.
When such values are removed from a, we can concatenate the remaining values of a together with a representation of the coordinates at which such values occur, to yield a reduced matrix $a_c$. In practice, $a_c$ can be implemented as an abstract data structure (ADS), e.g., a list $a_c$ whose elements are each tuples of form $(x, a(x))$, where x denotes a point in domain(a). Assume that we have an image operation $O : R \times R \to R$, such that $\tilde{a} = O(a)$. By processing the second coordinate of each element in $a_c$ with O, we can implement the pointwise operation O over the list $a_c$, thereby yielding the transformed ADS $\tilde{a}_c = O(a_c)$, which can be expressed as

$\tilde{a}_c = \bigcup_{x \in p_1(a_c)} (x, \tilde{a}_c(x)) = \bigcup_{x \in p_1(a_c)} (x, O(p_2[a_c(x)]))$, (27)

where $p_k$ denotes projection onto the k-th coordinate. An identical result would be obtained if O was applied to all elements of a to yield $\tilde{a}$, and the matrix reduction was subsequently conducted by removing O(S) from $\tilde{a}$ to yield $\tilde{a}'_c$.

For example, let a = (0, 1, 2, 5, 3, 0, 0, 0, 3), and let the negligible values S = {0, 1}. The list representation of the compressed image $a_c$ would be given by

$a_c = ((3, 2), (4, 5), (5, 3), (9, 3))$. (28)

Let the operation O(f) = 2f, where f is a value in range(a). Then, we have that $\tilde{a} = O(a) = (0, 2, 4, 10, 6, 0, 0, 0, 6)$ and

$\tilde{a}_c = ((3, O[2]), (4, O[5]), (5, O[3]), (9, O[3]))$ (29)
$= ((3, 4), (4, 10), (5, 6), (9, 6))$.

Now, O(S) = {O(0), O(1)} = {0, 2}. If we remove O(S) from $\tilde{a}$, we obtain the list representation

$\tilde{a}'_c = ((3, 4), (4, 10), (5, 6), (9, 6))$, (30)

which is given in Equation 29. Thus, the preceding example of matrix reduction can be loosely thought of as a homomorphism that preserves the operation O, which was defined previously. However, the operation O' that computes over $a_c$ is formulated as $O(p_2(a_c))$, per Equation 27. Thus, an operation over a reduced matrix can be derived from the corresponding operation over a sparse matrix. In Chapter 9, we show that numerous image processing operations can be computed efficiently over reduced matrices.
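The worked example of Equations 28 through 30 can be reproduced directly. The sketch below uses a list of (coordinate, value) tuples as the ADS and 1-based coordinates to match the text; the function names are illustrative and are not taken from the dissertation.

```python
def reduce_matrix(values, negligible):
    """Keep (coordinate, value) pairs whose value is not negligible (1-based coordinates)."""
    return [(x, v) for x, v in enumerate(values, start=1) if v not in negligible]


def apply_pointwise(reduced, op):
    """Apply op to the second coordinate of every tuple, as in Equation 27."""
    return [(x, op(v)) for x, v in reduced]


def double(f):
    """The example operation O(f) = 2f."""
    return 2 * f


a = (0, 1, 2, 5, 3, 0, 0, 0, 3)
S = {0, 1}

a_c = reduce_matrix(a, S)                       # Equation 28
processed = apply_pointwise(a_c, double)        # Equation 29

# Alternative route: process the full matrix, then remove O(S) (Equation 30).
full_result = tuple(double(v) for v in a)
reduced_after = reduce_matrix(full_result, {double(s) for s in S})

print(a_c)
print(processed)
print(processed == reduced_after)
```

In this instance the two routes agree while the reduced route touches four values instead of nine, which is the source of the computational saving discussed in the text.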
However, matrix reduction is not an isomorphism, since there exists no inverse (i.e., an exact matrix expansion) that can recover the information lost when the negligible value set S contains more than one value. In Chapter 9, we describe the approximate inversion of a reduced matrix $a_c$ in terms of an approximation to a source matrix a. We call such inversion operations approximate inverse transforms, and provide further theoretical development in Chapters 3-5. Additionally, we show how image operations over the reduced matrix $a_c$ propagate the error due to the previously mentioned information loss.

The chief advantage of processing reduced matrices is the decreased computational cost that accrues from the processing of fewer matrix elements. The compression ratios achievable with matrix reduction often range from 100:1 to 1000:1, which can yield computational speedups for pointwise operations that approach the compression ratio. As noted previously, the actual speedup is dependent upon the overhead of retrieving the source matrix values from the ADS that represents the reduced matrix. As an example, consider image and matrix operations (denoted by O, for purposes of brevity) such as edge detection, matrix inversion, and LU decomposition. Such operations usually require that the reduced matrix $a_c$ be represented in terms of a data structure that is based upon the dataflow structure of the operation O. For instance, assume that the application of O to the x-th element of the source matrix a requires retrieval of the nearest neighbors of the x-th element. It follows that the ADS that encodes the reduced matrix $a_c$ should be organized to facilitate optimal retrieval of the nearest neighbors of a(x). In practice, however, when one conducts heterogeneous operations over a given ADS, certain operations must be modified slightly to yield a more efficient computation over the given ADS.
In Chapter 9, we discuss the general problem of ADS derivation and the specific problem of adapting reduced-matrix operations to accept certain data structures. We next consider the processing of images that are compressed by combination of adjacent pixel values.

1.2.4.2. Edge detection over averaged imagery. In the early 1980s, Healy [49] developed two methods for edge detection over compressed imagery. The first approach was based on an image restoration technique that optimally reversed the image compression process. The second approach was based upon a maximum-likelihood estimation of the original full-sized edge, which could be deduced from patterns in the reduced imagery. Healy showed that, for a finite number of edge patterns over small neighborhoods, it was feasible to use a lookup table (LUT) to map the compressed edge pattern to the corresponding image neighborhood that produced the edge. Images were reduced by mapping 4x4-pixel source blocks into 2x2-pixel blocks via local averaging, selective averaging (i.e., without averaging across edges), periodic sampling, and gradient-preserving sampling. The reduced images were edge enhanced using Kirsch or Roberts operators. Note that passive FLIR imagery (of resolution 112 lines with 120 pixels per line) was employed, which generally exhibits large, bland regions bounded by nonlinear transitions, and is frequently degraded by noise resulting from the imaging process. Thus, error and noise quantification was a key study goal. In addition to edge quality comparisons, mean-squared error (MSE) was calculated for all images, especially those produced by decompressing reduced images via pixel replication. Note that the averaging of 2x2-pixel blocks is equivalent to representing each block by its local mean, thereby guaranteeing minimum MSE with respect to image intensity.
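The minimum-MSE property of block averaging can be seen on a small example. The following sketch reduces each 2x2 block to a single value, decompresses by pixel replication, and compares the resulting MSE against periodic sampling (here assumed to keep the top-left pixel of each block). The test image and function names are hypothetical and are not drawn from Healy's study.

```python
def block_average(img, r=2):
    """Replace each r x r block by its mean."""
    return [[sum(img[i + di][j + dj] for di in range(r) for dj in range(r)) / (r * r)
             for j in range(0, len(img[0]), r)]
            for i in range(0, len(img), r)]


def periodic_sample(img, r=2):
    """Keep the top-left pixel of each r x r block (assumed sampling phase)."""
    return [[img[i][j] for j in range(0, len(img[0]), r)]
            for i in range(0, len(img), r)]


def replicate(small, r=2):
    """Decompress by copying each reduced pixel into an r x r block."""
    return [[small[i // r][j // r] for j in range(len(small[0]) * r)]
            for i in range(len(small) * r)]


def mse(a, b):
    """Mean-squared error between two equally sized images."""
    count = len(a) * len(a[0])
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)) / count


image = [[10, 10, 50, 50],
         [10, 10, 50, 50],
         [10, 50, 20, 20],
         [50, 50, 20, 20]]

mse_average = mse(image, replicate(block_average(image)))
mse_sample = mse(image, replicate(periodic_sample(image)))
print(mse_average, mse_sample)
```

Only the lower-left block is non-uniform, so it alone contributes error; representing it by its mean gives a smaller MSE than representing it by any single sampled pixel.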
Healy states that the previously described simple averaging method showed the best overall performance in the edge image, both quantitatively (using computed measures of image clarity) and qualitatively (via visual assessment). Healy tested several additional image reduction techniques, all of which exhibited reduced MSE. However, only simple averaging produced results that consistently corresponded to reduced MSE in the edge image. When the Kirsch operator was applied to reconstructed (decompressed) imagery, edge detection was improved, as opposed to operating on the reduced images. For example, for the simple averaging method of image compression, MSE decreased from 1.38 to 0.75 in the edge image. However, application of the Kirsch detector to the compressed images did not fully extract the edge information. Healy's exploratory studies represent early attempts to perform and analyze image processing operations over images compressed by a well-known method such as block averaging. We further discuss the block averaging technique in Chapter 7. A relatively new method of edge detection, which processes wavelet-compressed imagery, is described in the following section.

1.2.4.3. Local enhancement of wavelet-compressed imagery. In 1993, Jawerth, Hilton, and Huntsberger [47] reported a simple focusing technique for wavelet decompositions. With such methods, one can select neighborhoods of interest within a source image according to region and scale parameters. Additionally, it is possible to obtain variable compression rates and enhance images via wavelet-based filtering of the compressed source image. In summary, beginning with a definition of wavelet series, Jawerth states the wavelet decomposition of an image a over domain X as

$a(x) = \sum_{i,\nu} \sum_{k} \gamma^{i}_{\nu k} \, \psi^{i}_{\nu k}(x), \quad x \in X$, (31)

where i denotes the order of the wavelet coefficients in $\gamma$, $\psi$ denotes a wavelet basis function, $\nu$ represents a scale parameter that corresponds to spatial frequency, and k denotes a positional parameter.
Each of the basis functions $\psi$ belongs to one of a finite number of families $\{\psi^{(i)}\}$, where $\psi^{(i)}_{\nu k}$ exists on the approximate scale $2^{-\nu}$ and is approximately located at $2^{-\nu}k$. Now, let $\omega = \{w^{(i)}_{\nu k} : i \in \mathbb{Z}_m\}$ denote a set of weight factors, which are nonnegative numbers. By multiplying the coefficients in a wavelet decomposition by the appropriate weight factors, we obtain the weighted coefficients $w^{(i)}_{\nu k}\,\gamma^{(i)}_{\nu k}$, with which one can emphasize the importance of certain scales and regions of the decomposed image. The authors show that, under statistical constraint, one can specify regions of an image by mean-squared error criteria and employ additional error measures that relate to the smoothness of a over a subset $\mathcal{R}$ of Euclidean space $\mathbb{R}^n$. Thus, by choosing weight factors that are relatively large at certain scales and locations, one can obtain better approximations to image details in those regions.

Image compression can be effected by analyzing the weighted coefficients $w^{(i)}_{\nu k}\,\gamma^{(i)}_{\nu k}$ of an image, rather than the image itself. As a result, bland areas of an image, which will be manifested as coefficients at large values of the spatial frequency parameter $\nu$, can be subtracted from the wavelet series, parameterized, and concatenated with the compressed image. Such methods are reminiscent of transform coding using sparse matrix reduction, which selects coefficients by their magnitudes. However, in wavelet-based image manipulation, one stores in the compressed image those weighted coefficients whose spatial frequencies are in the set of frequencies exhibited by key objects in the scene. Image enhancement is similarly performed, by increasing the magnitude of the wavelet coefficients whose spatial frequencies correspond to features of interest in the scene. A related hierarchical method, processing of pyramidal imagery, is next overviewed.

1.2.4.4. Processing of pyramidally compressed imagery.
In the late 1970s, Tanimoto proposed that pyramidal data structures be employed for object detection in natural scenes [50]. We herein overview methods by which Tanimoto's theory of operations over pyramidal data structures has been adapted to operations such as edge detection.

Let a denote an $n \times n$-pixel source image whose pixel values comprise the set F, where n is a power of some radix r. Additionally, let domain(a) be tessellated into adjacent blocks of size $r \times r$ pixels. Assuming that we can reduce each block (via a data compression transform T) to $k < r^2$ values in F, a compression ratio of $CR = r^2/k$ will be obtained for each block and, therefore, for a. Without loss of generality, assume that each k-pixel compressed block is represented as a terminal node of an r-ary tree. From our initial assumption that $n = r^m$, we have $\log_r(n) = m$, which implies an m-level r-ary tree representation of a called an image pyramid. In the uncompressed image a, the terminal nodes are $r \times r$-pixel blocks. Thus, in the compressed image $\mathbf{a}_c$, the terminal nodes are k-pixel compressed versions of the terminal nodes of a.

Further assume that T combines a local feature-detection function with the block reduction operation. Thus, a leaf of the tree that corresponds to the pyramid data structure would contain a compressed block of the source image, with edge information encoded in the compressed image or concatenated to the compressed block. Given such a structure, one can begin processing (for example, via edge detection) at the leaves. By traversing the tree in bottom-up fashion, the result (in this example, an edge map) is propagated to the root node. Thus, an edge detection operation over the pyramidal tree would process the entire tree in $O(n \cdot \log_r(n))$ work. With $n^2$-fold parallelism, $O(\log_r(n))$ time would be required.
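The bottom-up traversal described above can be sketched as follows. This is a minimal Python illustration under simplifying assumptions, not Tanimoto's formulation: each $r \times r$ block is reduced to a single value (k = 1), the reduction is a plain maximum or sum rather than a feature detector, and the names `reduce_level` and `build_pyramid` are hypothetical.

```python
def reduce_level(img, r, combine):
    """Map each r x r block of a square image to one value via combine."""
    n = len(img)
    return [
        [
            combine([img[i + di][j + dj] for di in range(r) for dj in range(r)])
            for j in range(0, n, r)
        ]
        for i in range(0, n, r)
    ]


def build_pyramid(img, r=2, combine=max):
    """Return all levels, from the full-resolution image up to the root.

    Assumes the side length n of img is a power of r, so that
    log_r(n) reductions reach a single root value.
    """
    levels = [img]
    while len(levels[-1]) > 1:
        levels.append(reduce_level(levels[-1], r, combine))
    return levels


# Hypothetical 4x4 image: n = 4, r = 2, hence m = log_2(4) = 2 reductions.
image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

max_pyramid = build_pyramid(image, r=2, combine=max)
sum_pyramid = build_pyramid(image, r=2, combine=sum)
print(max_pyramid[1], max_pyramid[2])
print(sum_pyramid[1], sum_pyramid[2])
```

In this toy case the root of each pyramid holds the exact global maximum or sum, because the reduction is lossless with respect to those quantities; with a lossy block transform the propagated result would in general be approximate.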
Additionally, operations such as image summation and maximum can be computed approximately over image pyramids in time proportional to the number of nodes at a given level [50]. With $n^2$-fold parallelism, pointwise operations would incur a speedup that is proportional to the number of pixels in the pyramid level at which processing occurs. In contrast, noncompressive processing requires $O(|\mathrm{domain}(\mathbf{a})|)$ operations. In Chapter 3, we will see that this result is key to achieving computational efficiencies at the level of compressive operations.

1.2.4.5. Low-level operations on vector-quantized imagery.

Let an F-valued source image a on domain X be denoted as $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$ (per the discussion of Section 2.1), and assume that a is segmented into k-pixel encoding blocks, also called vectors. A vector quantization transform $T: \mathbb{F}^{\mathbf{X}} \to G^{\mathbf{Y}} \times (\mathbb{F}^k)^G$ groups the vectors into classes that comprise a codebook of form $(\mathbb{F}^k)^G$, based on a representational error criterion. The error measure determines: (a) how closely the exemplar vectors or patterns in each class resemble each other, and (b) how well the patterns represent the partitions of the source image from which each pattern class was derived. Thus, the encoded image is merely a list (or array) of the indices in G of codebook exemplars.

At a high level, one can correctly assert that the VQ-compressed image is the result of applying a substitution to the source image. That is, given a codebook, the VQ transform substitutes for a given encoding block the index of the pattern class that best portrays the encoding block's spatial pattern under constraint of a prior error criterion. For example, if the image domain X is two-dimensional, and k-pixel encoding blocks are employed, then the VQ codebook will have no more than $\lceil |\mathbf{X}|/k \rceil$ pattern classes. Likewise, the source image will be compressed into $\lceil |\mathbf{X}|/k \rceil$ pattern indices, regardless of the number of pattern classes.
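The substitution view of VQ can be sketched in a few lines of Python. This is an illustrative toy under stated assumptions: the image is flattened to a one-dimensional signal whose length is a multiple of k, the codebook is given rather than trained, squared error serves as the representational error criterion, and all function names are hypothetical. The last function shows how a global quantity such as the image sum can be approximated from the index list alone, by weighting each exemplar's sum by the frequency of its index.

```python
from collections import Counter


def nearest_index(block, codebook):
    """Index of the exemplar with the smallest squared error to block."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((b - c) ** 2 for b, c in zip(block, codebook[i])),
    )


def vq_encode(signal, codebook):
    """Substitute each k-value block by the index of its best exemplar."""
    k = len(codebook[0])
    blocks = [signal[i:i + k] for i in range(0, len(signal), k)]
    return [nearest_index(b, codebook) for b in blocks]


def vq_decode(indices, codebook):
    """Expand each index back into its exemplar vector."""
    return [v for i in indices for v in codebook[i]]


def approximate_sum(indices, codebook):
    """Approximate the source sum from index frequencies and exemplar sums."""
    freq = Counter(indices)
    return sum(f * sum(codebook[i]) for i, f in freq.items())


# Hypothetical codebook with k = 2 and three pattern classes.
codebook = [(0, 0), (10, 10), (0, 10)]
signal = [1, 0, 9, 11, 0, 9, 10, 12]

indices = vq_encode(signal, codebook)
approx = approximate_sum(indices, codebook)
print(indices, approx, sum(signal))
```

Here eight source values are replaced by four indices, and the approximate sum is computed over the compressed representation without decoding; it equals the sum of the decoded image and differs from the true source sum by the accumulated quantization error.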
If one constrains the VQ process such that the maximum number of bits required to encode each pattern index is less than or equal to the number of bits required to encode each image value, then the compression ratio $CR \geq k$. Since the VQ transform can be thought of (at a high level) as a substitution, the processing of VQ imagery is, in principle, an easy matter. For example, an approximation to the sum of a source image may be obtained by summing the exemplar vector sums weighted by their frequency of occurrence in the compressed image. The substitutional model of VQ underlies much of the theory for processing VQ-encoded imagery, and has thus been employed in recent reports of VQ-based arithmetic [51], pattern recognition over VQ imagery [1] and, most recently, VQ-based template operations [53].

Cosman et al. have summarized their work in processing VQ-encoded imagery in Reference 1, which includes a variety of ingenious methods for substitution-based computing of the image histogram, image enhancement by histogram equalization, an approximation to edge detection, and several methods of graphics rendering. In each case, Cosman shows that implementationally significant computational efficiencies can be obtained. For example, consider a source image $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$, a VQ codebook $\mathbf{c} \in (\mathbb{F}^k)^G$, and the source image histogram $\mathbf{h} \in \mathbb{N}^{\mathbb{F}}$. One can approximate h by computing the histogram $\mathbf{h}_i$, $i \in G$, of each codebook vector, then weighting $\mathbf{h}_i$ by its frequency of occurrence $f_i$ in the VQ-compressed image. The histogram is approximated as

$$\mathbf{h} \approx \sum_{i \in G} f_i \cdot \mathbf{h}_i.$$

In Chapter 9, we discuss the processing of VQ imagery in detail, with emphasis upon the more difficult case of VQ-based image-template operations.

1.3. Technical Approach

In this dissertation, we summarize our previous work in the classification of image transforms [2], and discuss methods for deriving analogues that implement selected image processing operations over the range spaces of such transforms.
We provide specific examples that show techniques by which theoretical speedups in image and signal processing algorithms can be achieved. For example, assume that N compressed data are processed instead of M source data, where N < M and M/N approximates the compression ratio (CR). In certain cases, we claim that a computational speedup of order CR is possible. In support of such claims, we emphasize

1. Description of a transform taxonomy (classification scheme) that includes multiple subclasses for the detailed characterization of commonly employed compressive transforms.

2. Elucidation of methods for deriving analogues of common image processing operations over range spaces of multiple transforms in each taxonomic class.

3. Application of the methods obtained in Step 2 to the derivation of analogues of common image processing operations over the range spaces of two or three transforms in each class. In particular, we consider
   * a generic type of JPEG transform,
   * VQ with fixed blocksize, and
   * VPIC with a small exemplar set.
   Such transforms are shown to be practical for a variety of image compression tasks. Additionally, VPIC followed by Huffman coding is shown to perform well for fast compression applications at high compression ratios.

4. Analysis of the operations derived in Step 3 in terms of noise sensitivity, sequential speedup, and data security.

5. Illustration of various stages of the image compression and compressive computation process with images derived from practice, together with supporting performance data.

6. Feasibility analysis of compressive computational systems based upon error accumulation and computational efficiency.

7. Discussion of possible additional advantages or problems that may result from compressive or encryptive processing, which could be addressed in future research. For example, we discuss implementational issues inherent in the mapping of various compressive operations to a SIMD-parallel mesh architecture.
As a result of this study, we have achieved several advances in image processing technology that are described in the following section.

1.4. Novel Claims

We claim the following novel accomplishments:

* High-level theoretical unification of compressive and encryptive processing;
* Classification of image transformations in a rigorous, concise taxonomy, which is instrumental in facilitating the efficient design of compressive and encryptive processing algorithms;
* Computational speedup for many image processing operations that is achieved by computing with compressed data;
* Reduction in the degree of parallelism that may be possible for certain compressive operations on specific parallel architectures, due to the presence of fewer data; and
* Enhancement of data security, which may be possible by signal and image processing of data that have been compressed, then encrypted.

An implication of this dissertation is that the theory presented herein may furnish the basis for novel, fundamental advances in computational efficiency and data security not hitherto realized via existing image processing algorithms. Specific applications currently exist in areas of high-bandwidth (i.e., high frame rate) image processing, such as real-time automatic target recognition using high-speed imaging devices and real-time medical image processing in support of bone or tissue visualization for machine-assisted surgery. Additional applications include efficient search over compressed databases, as well as parallel algorithm designs that permit reduced parallelism but preserve the functionality of the corresponding image operation(s).

1.5. Implementational Advantages and Disadvantages

An interesting observation resulted from our early research in the derivation of an algorithm for performing connected component labelling on one-dimensional, n-pixel Boolean imagery compressed by runlength encoding [5].
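To make the setting concrete, the following Python sketch labels the connected components of a one-dimensional Boolean signal directly on its runlength code. It is only an illustration of why the runlength format suits this task and is not the algorithm of Reference 5; the function names are hypothetical. In one dimension each run of ones is exactly one component, so the labelling visits each run once rather than each pixel.

```python
def run_length_encode(bits):
    """Encode a Boolean sequence as a list of (value, length) runs."""
    runs = []
    for b in bits:
        if runs and runs[-1][0] == b:
            runs[-1][1] += 1
        else:
            runs.append([b, 1])
    return [(value, length) for value, length in runs]


def label_runs(runs):
    """Label components on the runs: each run of 1s gets a new label,
    and each run of 0s gets the background label 0."""
    labelled, next_label = [], 1
    for value, length in runs:
        if value:
            labelled.append((next_label, length))
            next_label += 1
        else:
            labelled.append((0, length))
    return labelled


def expand(labelled_runs):
    """Decompress labelled runs back to one label per pixel."""
    return [label for label, length in labelled_runs for _ in range(length)]


# Hypothetical 10-pixel Boolean signal.
bits = [0, 1, 1, 0, 0, 1, 0, 1, 1, 1]
runs = run_length_encode(bits)
labels = expand(label_runs(runs))
print(runs)
print(labels)
```

In this example the labelling step touches six runs instead of ten pixels; the saving grows with the average run length.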
We noted that the customary tree structure of connected component labelling [6] was transformed to a SIMD-parallel structure in certain stages of the compressive algorithm. However, an O(n) communications cost remained. We conjecture that similar effects may occur with certain compressive formats that encode image regions in terms of features which are key to a prespecified, tree-structured image processing algorithm. For example, let a transform T encode an image region based upon edge or boundary information. It would be reasonable to assume that an analogue over range(T) of a boundary detection algorithm could be derived that would be amenable to SIMD-parallel implementation, due to preservation of boundary information in the compressed image. Thus, it is reasonable to assume that a tree-structured boundary detection algorithm could be rendered SIMD-parallel via such methods of compressive algorithm design, since boundary information is assumed to be localized in each pixel of the compressed image.

In particular, one could design or select transforms that compress the source image so as to preserve features computed by such image operations (e.g., boundary detection). If such were the case, then it could be possible to simplify the design of parallel algorithms for certain image processing operations. Since parallel algorithm design is nontrivial but parallel computation has great promise for practical applications, such developments could possibly increase the speed and scope of parallel computer applications in image processing.

There are two disadvantages of compressive processing that pertain primarily to the use of a cascaded sequence of compressive operations, denoted by S. First, error due to lossy transformation of a source image a can accumulate as compressive computation proceeds through S(a), thus rendering the visual quality of the decompressed result unacceptable.
Additionally, this is a problem for automated target recognition conducted over decompressed images that were previously subject to compressive processing. Second, all operations in S may not be computable over a given compressive format with efficiencies that meet the design criteria for a given implementation. Thus, conversion between compressive formats may be required, which increases overhead and decreases resultant efficiency. We address the preceding issues throughout this dissertation, with particular emphasis on computational error in the applications portion.

CHAPTER 2
REVIEW OF NOTATION

Prior to developing supporting theory, we summarize the notation used in this study. In order to acquaint the reader with the majority of our notation, an overview of the image algebra subset employed in this study is presented in Section 2.1. In Section 2.2, we discuss general rules for, and deviations from, the list of symbols contained in the front portion of this document.

2.1. Overview of the Image Algebra (IA) Subset

Image algebra is a set-theoretical and functional notation that unifies linear and nonlinear mathematics in the image domain. The basic entities of IA are set elements, sets, and functional mappings. Sets and set operations are discussed in Section 2.1.1, while functions are subdivided into several classes, namely, images (Section 2.1.2), operations on images (Section 2.1.3), nonrecursive image-template functions (Section 2.1.4), and nonrecursive operations on templates (Section 2.1.5). Recursive operations are summarized in Section 2.1.6.

2.1.1. Sets and operations upon sets

In image algebra, set elements are generally defined implementationally as having any data type, such as integer, real, or complex. Additionally, set elements may be data structures such as sets, lists, or images, but such unusual structures tend to require processing operations that are not well-defined and thus are not used extensively in computational practice.
Elements are collected in sets, which may be partially or totally ordered. Value sets are denoted by open-face capital letters. For example, the customary value sets $\mathbb{Z}$, $\mathbb{R}$, and $\mathbb{C}$ denote the integer, real, and complex numbers, respectively. Additionally, $\mathbb{N} = \mathbb{Z}^+$ denotes the positive integers and $\mathbb{F}$ denotes a generalized value set. Point sets are denoted by bold upper-case letters from the tail of the alphabet (e.g., $\mathbf{X}$, $\mathbf{Y}$, $\mathbf{Z}$), and are customarily subsets of Euclidean n-space $\mathbb{R}^n$. Image algebra supports the customary set operations of union ($\cup$), intersection ($\cap$), set subtraction ($\setminus$), cardinality ($|\mathbb{F}|$), and choice. Set operations obey the relevant laws of identity, idempotence, associativity, distributivity, De Morgan's laws, etc.

2.1.2. Images

An image is a mapping of a point set (functional domain) to a value set (functional range). Images are usually denoted by bold, lower-case letters from the head of the alphabet, such as a, b, or c, as well as by emboldened strings, such as ci or ldf. Given a point set X and a value set F, the image $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$, called an F-valued image on X, is the function $\mathbf{a}: \mathbf{X} \to \mathbb{F}$. The functional domain of a, denoted by domain(a), is equal to X, while range(a) = F denotes the functional range of a. The graph of an image $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$, denoted by G(a), is expressed as

$$\mathbf{a} = G(\mathbf{a}) = \{(x, \mathbf{a}(x)) : x \in \mathbf{X},\ \mathbf{a}(x) \in \mathbb{F}\}.$$

Additionally, the point $x \in \mathbf{X}$, the value $\mathbf{a}(x) \in \mathbb{F}$, and the pixel $(x, \mathbf{a}(x))$ are important features of the image a. The more elementary image operations are defined as follows.

2.1.3. Operations upon images

Operations upon F-valued images are the naturally induced operations of the algebraic system F. For example, operations on real-valued images $\mathbf{a}, \mathbf{b} \in \mathbb{R}^{\mathbf{X}}$ are the elementary operations induced by the vector lattice (a vector space which is a lattice) $\mathbb{R}$. Thus, basic operations on $\mathbb{R}^{\mathbf{X}}$ reflect the corresponding arithmetic and logical operations on $\mathbb{R}$.
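The generalized unary operation $f: \mathbb{F} \to \mathbb{F}$ thus induces the generalized unary image operation $f: \mathbb{F}^{\mathbf{X}} \to \mathbb{F}^{\mathbf{X}}$, which is applied to the image $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$ to yield

$$\mathbf{b} = f(\mathbf{a}) = \{(x, \mathbf{b}(x)) : \mathbf{b}(x) = f(\mathbf{a}(x)),\ x \in \mathbf{X}\}. \qquad (32)$$

For example, if $\mathbf{a} \in \mathbb{R}^{\mathbf{X}}$, then $\sin(\mathbf{a}) \equiv \{(x, \mathbf{b}(x)) : \mathbf{b}(x) = \sin(\mathbf{a}(x)),\ x \in \mathbf{X}\}$. Additional unary operations upon the image $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$ are the image sum, denoted by $\Sigma\mathbf{a} = \sum_{x \in \mathbf{X}} \mathbf{a}(x)$, and the image maximum $\bigvee\mathbf{a} = \bigvee_{x \in \mathbf{X}} \mathbf{a}(x)$. Given an image $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$ and an associative, commutative operation $\gamma: \mathbb{F} \times \mathbb{F} \to \mathbb{F}$, the preceding operations can be generalized in terms of the global reduce operation

$$\Gamma\mathbf{a} = \Gamma_{x \in \mathbf{X}}\, \mathbf{a}(x) = \mathbf{a}(x_1)\,\gamma\,\mathbf{a}(x_2)\,\gamma \cdots \gamma\,\mathbf{a}(x_n), \quad x_1, x_2, \ldots, x_n \in \mathbf{X}.$$

Given images $\mathbf{a}, \mathbf{b} \in \mathbb{F}^{\mathbf{X}}$ and a binary operation $\circ: \mathbb{F} \times \mathbb{F} \to \mathbb{F}$, the binary image operation of Hadamard (pointwise) arithmetic $\circ: \mathbb{F}^{\mathbf{X}} \times \mathbb{F}^{\mathbf{X}} \to \mathbb{F}^{\mathbf{X}}$ is defined as

$$\mathbf{a} \circ \mathbf{b} \equiv \{(x, \mathbf{c}(x)) : \mathbf{c}(x) = \mathbf{a}(x) \circ \mathbf{b}(x),\ x \in \mathbf{X}\}.$$

Given the images $\mathbf{a}, \mathbf{b} \in \mathbb{R}^{\mathbf{X}}$, the usual arithmetic operations (+, −, *, /) as well as the Boolean logic operations are supported, as shown in the following examples:

$$\begin{aligned}
\mathbf{a} + \mathbf{b} &\equiv \{(x, \mathbf{c}(x)) : \mathbf{c}(x) = \mathbf{a}(x) + \mathbf{b}(x),\ x \in \mathbf{X}\} \\
\mathbf{a} * \mathbf{b} &\equiv \{(x, \mathbf{c}(x)) : \mathbf{c}(x) = \mathbf{a}(x) \cdot \mathbf{b}(x),\ x \in \mathbf{X}\} \\
\mathbf{a} \vee \mathbf{b} &\equiv \{(x, \mathbf{c}(x)) : \mathbf{c}(x) = \mathbf{a}(x) \vee \mathbf{b}(x),\ x \in \mathbf{X}\}
\end{aligned} \qquad (33)$$

where $\vee$ denotes maximum. Since complex numbers lack a natural lattice structure, only addition and multiplication are defined for complex-valued images. The remaining operations on $\mathbb{R}^{\mathbf{X}}$ can be expressed in terms of the arithmetic functions, or are induced on $\mathbb{R}^{\mathbf{X}}$ by the corresponding operations on $\mathbb{R}$. For example, exponentiation of the previously defined images a and b is given by

$$\mathbf{a}^{\mathbf{b}} \equiv \{(x, \mathbf{c}(x)) : \mathbf{c}(x) = \mathbf{a}(x)^{\mathbf{b}(x)} \text{ if } \mathbf{a}(x) \neq 0, \text{ otherwise } \mathbf{c}(x) = 0,\ x \in \mathbf{X}\}. \qquad (34)$$

Note that exponentiation is defined when $\mathbf{a}(x) = 0$. Computation of the logarithm, which is invalid for zero and the negative real values, is given by

$$\log_{\mathbf{a}}(\mathbf{b}) \equiv \{(x, \mathbf{c}(x)) : \mathbf{c}(x) = \log_{\mathbf{a}(x)}(\mathbf{b}(x)) \text{ if } \mathbf{a}(x), \mathbf{b}(x) > 0,\ x \in \mathbf{X}\}. \qquad (35)$$

Given the constant image $\mathbf{a} = \{(x, \mathbf{a}(x)) : \mathbf{a}(x) = k,\ k \in \mathbb{F},\ x \in \mathbf{X}\}$, we have that

$$\begin{aligned}
\mathbf{b}^k &\equiv \mathbf{b}^{\mathbf{a}} \quad\text{and}\quad k^{\mathbf{b}} \equiv \mathbf{a}^{\mathbf{b}}, \\
k \cdot \mathbf{b} &\equiv \mathbf{a} * \mathbf{b} \quad\text{and}\quad k + \mathbf{b} \equiv \mathbf{a} + \mathbf{b}.
\end{aligned} \qquad (36)$$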
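The induced operations above translate almost directly into code if an image is represented as a mapping from points to values. The following Python sketch is one such reading, assuming a dictionary keyed by points; the helper names are hypothetical and the sample images are illustrative only.

```python
from functools import reduce
import math
import operator


def induced_unary(f, a):
    """Apply f to every pixel value: b(x) = f(a(x))."""
    return {x: f(v) for x, v in a.items()}


def induced_binary(op, a, b):
    """Hadamard (pointwise) operation: c(x) = a(x) op b(x)."""
    return {x: op(a[x], b[x]) for x in a}


def global_reduce(gamma, a):
    """Combine all pixel values with an associative, commutative gamma."""
    return reduce(gamma, a.values())


def image_power(a, b):
    """Pointwise exponentiation with c(x) = 0 wherever a(x) = 0."""
    return {x: (a[x] ** b[x] if a[x] != 0 else 0) for x in a}


def constant_image(k, X):
    """The constant image with value k on point set X."""
    return {x: k for x in X}


# Hypothetical 2x2 point set and two real-valued images on it.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
a = dict(zip(X, [1, 2, 0, 4]))
b = dict(zip(X, [3, 1, 5, 2]))

sin_a = induced_unary(math.sin, a)
total = global_reduce(operator.add, a)      # image sum
peak = global_reduce(max, a)                # image maximum
a_plus_b = induced_binary(operator.add, a, b)
a_max_b = induced_binary(max, a, b)
a_pow_b = image_power(a, b)
two_b = induced_binary(operator.mul, constant_image(2, X), b)
print(total, peak, list(a_pow_b.values()), list(two_b.values()))
```

The last line of the example mirrors the constant-image identity $k \cdot \mathbf{b} \equiv \mathbf{a} * \mathbf{b}$: scaling by k = 2 is computed as a Hadamard product with the constant image.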
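Constant images that are of special importance in IA include the zero image $\mathbf{0} \equiv \{(x, 0) : x \in \mathbf{X}\}$ and the unitary image $\mathbf{1} \equiv \{(x, 1) : x \in \mathbf{X}\}$. For example, notational brevity is realized by defining the image sum as $\Sigma\mathbf{a} = \sum_{x \in \mathbf{X}} \mathbf{a}(x) = \mathbf{a} \bullet \mathbf{1}$, where the dot product $\mathbf{a} \bullet \mathbf{b} = \sum_{x \in \mathbf{X}} \mathbf{a}(x) \cdot \mathbf{b}(x)$.

Subtraction, division, and minimum are defined in terms of the basic operations and the constant image −1, where $-\mathbf{b} = -\mathbf{1} * \mathbf{b}$, and

$$\mathbf{a} - \mathbf{b} = \mathbf{a} + (-\mathbf{b}), \quad \mathbf{a}/\mathbf{b} = \mathbf{a} * \mathbf{b}^{-1}, \quad\text{and}\quad \mathbf{a} \wedge \mathbf{b} = -(-\mathbf{a} \vee -\mathbf{b}). \qquad (37)$$

The images 0 and 1 exhibit the identity properties $\mathbf{a} + \mathbf{0} = \mathbf{a}$ and $\mathbf{a} * \mathbf{1} = \mathbf{a}$. Although $\mathbf{b} * \mathbf{b}^{-1} * \mathbf{b} = \mathbf{b}$, we note that $\mathbf{b} * \mathbf{b}^{-1}$ is not necessarily equal to 1. Although this usage is infrequent, we occasionally call $\mathbf{b}^{-1}$ the pseudo-inverse of b, as discussed in Reference 54.

Given image $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$ and $S: \mathbf{X} \to 2^{\mathbb{F}}$, the generalized characteristic function

$$\chi_S(\mathbf{a}) = \left\{(x, \mathbf{c}(x)) : \mathbf{c}(x) = \begin{cases} 1 & \text{if } \mathbf{a}(x) \in S(x) \\ 0 & \text{otherwise} \end{cases},\ x \in \mathbf{X}\right\} \qquad (38)$$

can be employed in thresholding a at a value $T \in \mathbb{F}$, as follows:

$$\chi_{>T}(\mathbf{a}) = \left\{(x, \mathbf{c}(x)) : \mathbf{c}(x) = \begin{cases} 1 & \text{if } \mathbf{a}(x) > T \\ 0 & \text{otherwise} \end{cases},\ x \in \mathbf{X}\right\}. \qquad (39)$$

Characteristic functions may also be defined in terms of elementary IA operations, as shown in Reference 54.

The restriction of image $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$ to a subset W of X is denoted by $\mathbf{a}|_{\mathbf{W}} \equiv \{(x, \mathbf{a}(x)) : x \in \mathbf{W}\}$. The extension of image $\mathbf{b} \in \mathbb{F}^{\mathbf{W}}$ to the image $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$, where $\mathbf{W} \subset \mathbf{X}$, is given by

$$\mathbf{b}|^{\mathbf{a}} \equiv \left\{(x, \mathbf{c}(x)) : \mathbf{c}(x) = \begin{cases} \mathbf{a}(x) & \text{if } x \in \mathbf{X} \setminus \mathbf{W} \\ \mathbf{b}(x) & \text{if } x \in \mathbf{W} \end{cases}\right\}. \qquad (40)$$

Implementationally, restriction and extension can be thought of as cut and paste operations, respectively. Furthermore, a restriction of $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$ to a subset of a whose values are characterized by a property S of F is given by

$$\mathbf{a}\|_S \equiv \{(x, \mathbf{a}(x)) : \mathbf{a}(x) \in S,\ x \in \mathbf{X}\}. \qquad (41)$$

For example, given a threshold $T \in \mathbb{R}$, if $\mathbf{a} \in \mathbb{R}^{\mathbf{X}}$, then $\mathbf{a}\|_{>T} \equiv \{(x, \mathbf{a}(x)) : \mathbf{a}(x) > T,\ x \in \mathbf{X}\}$.

2.1.4. Nonrecursive templates and image-template operations

Templates are powerful constructs that map a point to an image. Templates unify the customary concepts of convolution masks, sampling windows, structuring elements (mathematical morphology), and neighborhood functions.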
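For example, let X and Y denote point sets, P denote a set of parameters, and F denote the value set. A generalized F-valued template t from Y to X with parameters in P is a function of form $\mathbf{t}: P \to (\mathbb{F}^{\mathbf{X}})^{\mathbf{Y}}$. For purposes of notational simplicity, let t(p) denote a function from Y to $\mathbb{F}^{\mathbf{X}}$, for each $p \in P$. The template values, of form $\mathbf{t}_y(p) \in \mathbb{F}^{\mathbf{X}}$, where the target point $y \in \mathbf{Y}$, are each an F-valued image on X. If the set P of parameters is unspecified or is implicit in the context of a discussion, then $\mathbf{t}_y$ denotes $\mathbf{t}(p)(y)$, and $(\mathbb{F}^{\mathbf{X}})^{\mathbf{Y}}$ denotes $P \to (\mathbb{F}^{\mathbf{X}})^{\mathbf{Y}}$. Given $\mathbf{t}_y \in \mathbb{F}^{\mathbf{X}}$, the weights of t are denoted by $\mathbf{t}_y(x) \in \mathbb{F}$, $x \in \mathbf{X}$.

If t is a real-valued template from Y to X, then the support of $\mathbf{t}_y$ is defined as $S(\mathbf{t}_y) = \{x \in \mathbf{X} : \mathbf{t}_y(x) \neq 0\}$. The set $S_{-\infty}(\mathbf{t}_y) = \{x \in \mathbf{X} : \mathbf{t}_y(x) \neq -\infty\}$ is referred to as the configuration of t at the point y.

If X is a space with operation +, then a template $\mathbf{t} \in (\mathbb{F}^{\mathbf{X}})^{\mathbf{X}}$ is called translation-invariant (with respect to the operation +) if and only if, for each triple $x, y, z \in \mathbf{X}$, we have $\mathbf{t}_y(x) = \mathbf{t}_{y+z}(x + z)$. A template that is not translation-invariant is called translation-variant. For purposes of brevity, we call translation-invariant templates invariant templates, and translation-variant templates variant templates.

Given an image $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$ and a template $\mathbf{t} \in (\mathbb{F}^{\mathbf{X}})^{\mathbf{Y}}$, as well as the associative, commutative functions $\gamma, \circ: \mathbb{F}^{\mathbf{X}} \times \mathbb{F}^{\mathbf{X}} \to \mathbb{F}^{\mathbf{X}}$, the backward image-template operation, also called the right product, is given by

$$\mathbf{b} = \mathbf{a} \mathbin{\text{\textcircled{$\gamma$}}} \mathbf{t} \equiv \left\{(y, \mathbf{b}(y)) : \mathbf{b}(y) = \Gamma_{x \in \mathbf{X}} \left(\mathbf{a}(x) \circ \mathbf{t}_y(x)\right),\ y \in \mathbf{Y}\right\}. \qquad (42)$$

When $\mathbb{F} \subset \mathbb{C}$, as is customary, the right product can be instantiated as the image-template convolution

$$\mathbf{b} = \mathbf{a} \oplus \mathbf{t} \equiv \left\{(y, \mathbf{b}(y)) : \mathbf{b}(y) = \sum_{x \in \mathbf{X}} \left(\mathbf{a}(x) \cdot \mathbf{t}_y(x)\right),\ y \in \mathbf{Y}\right\}, \qquad (43)$$

by setting $\gamma = +$ and $\circ = \cdot$. For $\mathbf{t} \in (\mathbb{F}^{\mathbf{Y}})^{\mathbf{X}}$, the forward image-template operation or left product, denoted by $\mathbf{t} \oplus \mathbf{a}$, is given by

$$\mathbf{t} \oplus \mathbf{a} \equiv \left\{(y, \mathbf{c}(y)) : \mathbf{c}(y) = \sum_{x \in \mathbf{X}} \mathbf{t}_x(y) \cdot \mathbf{a}(x),\ y \in \mathbf{Y}\right\}. \qquad (44)$$

In either case, $\mathbf{b} \in \mathbb{F}^{\mathbf{Y}}$.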
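A minimal Python sketch of the right product follows, with the reduce operation $\gamma$ and the pointwise operation $\circ$ passed as parameters. It assumes that an image is a dictionary from points to values and that a template is a function returning, for each target point y, a dictionary of weights over its stored support only; weights outside that support are taken as zero and skipped, which leaves the convolution sum unchanged. All names, the (1, 2, 1) template, and the test image are hypothetical.

```python
from functools import reduce
import operator


def right_product(a, t, Y, gamma, circ, empty):
    """Backward image-template product b(y) = Gamma_x (a(x) circ t_y(x)).

    a     : image as a dict {point: value}
    t     : template as a function y -> dict {point: weight}
    empty : value returned where the support of t_y misses the image domain
    """
    b = {}
    for y in Y:
        terms = [circ(a[x], w) for x, w in t(y).items() if x in a]
        b[y] = reduce(gamma, terms) if terms else empty
    return b


def smooth(y):
    """Hypothetical translation-invariant 1-D template with weights (1, 2, 1)."""
    return {y - 1: 1, y: 2, y + 1: 1}


# Hypothetical 1-D image on the point set {0, 1, 2, 3}.
a = {0: 1, 1: 2, 2: 3, 3: 4}
Y = list(a)

# Convolution: gamma is addition and circ is multiplication.
b = right_product(a, smooth, Y, operator.add, operator.mul, 0)
print(b)
```

Passing `max` for `gamma` with multiplication or addition for `circ` would yield lattice-style products from the same routine, which is the design reason for keeping the two operations as parameters.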
Ritter [55] showed that computation of the generalized convolution can be rendered more efficient by replacing X with $S(\mathbf{t}_y)$ in Equation 43, or with $S(\mathbf{t}_x)$ in Equation 45. For example, in Equation 45, $\sum_{x \in S(\mathbf{t}_x)} \mathbf{t}_x(y) \cdot \mathbf{a}(x) = 0$ whenever $S(\mathbf{t}_x) = \emptyset$.

The three elementary image-template operations employed in the transformation of real-valued images are called generalized convolution ($\oplus$), multiplicative maximum ($\text{\textcircled{$\vee$}}$, where $\gamma = \vee$ and $\circ = \cdot$), and additive maximum ($\boxed{\vee}$, where $\gamma = \vee$ and $\circ = +$). For example, if $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$ and $\mathbf{t} \in (\mathbb{F}^{\mathbf{X}})^{\mathbf{Y}}$ are defined in terms of the value sets known as the extended real numbers ($\mathbb{R}_{-\infty} = \mathbb{R} \cup \{-\infty\}$ and $\mathbb{R}_{+\infty} = \mathbb{R} \cup \{+\infty\}$), then we have the following nonlinear operations:

$$\mathbf{b} = \mathbf{a} \mathbin{\text{\textcircled{$\vee$}}} \mathbf{t} \equiv \left\{(y, \mathbf{b}(y)) : \mathbf{b}(y) = \bigvee_{x \in S(\mathbf{t}_y)} \left(\mathbf{a}(x) \cdot \mathbf{t}_y(x)\right),\ y \in \mathbf{Y}\right\}, \qquad (45)$$

and

$$\mathbf{b} = \mathbf{a} \mathbin{\boxed{\vee}} \mathbf{t} \equiv \left\{(y, \mathbf{b}(y)) : \mathbf{b}(y) = \bigvee_{x \in S(\mathbf{t}_y)} \left(\mathbf{a}(x) + \mathbf{t}_y(x)\right),\ y \in \mathbf{Y}\right\}. \qquad (46)$$

The functions of multiplicative and additive minimum ($\text{\textcircled{$\wedge$}}$ and $\boxed{\wedge}$, respectively) are defined symmetrically to Equations 46 and 47. The operations $\text{\textcircled{$\vee$}}$ and $\boxed{\vee}$, together with a larger class of generalized image-template operators, are discussed in detail in Reference 54.

Templates can implement mappings across heterogeneous point sets, and are thereby useful for coordinate transformations, such as image rotation, magnification, or changes in dimensionality. In the latter case, applications to multispectral imagery and sensor fusion have been extensive [56].

Key to the utilization of the nonlinear functions $\text{\textcircled{$\vee$}}$ and $\boxed{\vee}$ in applications of greater complexity is the concept of conjugacy, which is described as follows. Recall the preceding definitions of IA operations in terms of basic operations, as well as our previous mention of the pseudoinverse. Likewise, recall that operations on $\mathbb{F}^{\mathbf{X}}$ are induced by the corresponding operations on $\mathbb{F}$. Thus, one can see that the ring $(\mathbb{R}^{\mathbf{X}}, +, *)$ and the lattice $(\mathbb{R}^{\mathbf{X}}, \vee)$ behave much like the ring and lattice of real numbers. Similar observations hold for the extended real-valued images.
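Given the notion of an additive or multiplicative inverse, the concepts of additive or multiplicative conjugacy are suggested. For example, if $\mathbf{a} \in \mathbb{R}_{-\infty}^{\mathbf{X}}$ or $\mathbf{a} \in \mathbb{R}_{+\infty}^{\mathbf{X}}$, and we assume that $-(+\infty) = -\infty$ and $-(-\infty) = +\infty$, then the additive conjugate of a, denoted by $\mathbf{a}^*$, is defined as $\mathbf{a}^*(x) = -(\mathbf{a}(x))$, $x \in \mathbf{X}$. Thus, $(\mathbf{a}^*)^* = \mathbf{a}$, and if $\mathbf{a} \in \mathbb{R}_{-\infty}^{\mathbf{X}}$, then $\mathbf{a}^* \in \mathbb{R}_{+\infty}^{\mathbf{X}}$, which is the conjugate space.

Denote $\mathbb{R}^{\geq 0}_{\infty} = \mathbb{R}^+ \cup \{0, +\infty\}$. For $\mathbf{a} \in (\mathbb{R}^{\geq 0}_{\infty})^{\mathbf{X}}$, the multiplicative conjugate of a is denoted by $\bar{\mathbf{a}}$ and is defined as

$$\bar{\mathbf{a}}(x) = \begin{cases} 1/\mathbf{a}(x) & \text{if } \mathbf{a}(x) \neq 0 \text{ and } \mathbf{a}(x) \neq +\infty \\ +\infty & \text{if } \mathbf{a}(x) = 0 \\ 0 & \text{if } \mathbf{a}(x) = +\infty \end{cases}, \quad x \in \mathbf{X}. \qquad (47)$$

Likewise, several template inversions are key to the expression of image transformations. The transpose of a template $\mathbf{t} \in (\mathbb{F}^{\mathbf{X}})^{\mathbf{Y}}$ is the template $\mathbf{t}' \in (\mathbb{F}^{\mathbf{Y}})^{\mathbf{X}}$, which is defined in terms of its weights as $\mathbf{t}'_x(y) = \mathbf{t}_y(x)$, $x \in \mathbf{X}$, $y \in \mathbf{Y}$. The following conjugate templates are defined for the extended real-valued numbers. If $\mathbf{t} \in (\mathbb{R}_{-\infty}^{\mathbf{X}})^{\mathbf{Y}}$ (or $\mathbf{t} \in (\mathbb{R}_{+\infty}^{\mathbf{X}})^{\mathbf{Y}}$), then the additive conjugate of t is the template $\mathbf{t}^* \in (\mathbb{R}_{+\infty}^{\mathbf{Y}})^{\mathbf{X}}$ (or $\mathbf{t}^* \in (\mathbb{R}_{-\infty}^{\mathbf{Y}})^{\mathbf{X}}$), such that $\mathbf{t}^*_x(y) = -\mathbf{t}_y(x)$, $x \in \mathbf{X}$, $y \in \mathbf{Y}$. If $\mathbf{t} \in ((\mathbb{R}^{\geq 0}_{\infty})^{\mathbf{X}})^{\mathbf{Y}}$, then the multiplicative conjugate of t is the template $\bar{\mathbf{t}} \in ((\mathbb{R}^{\geq 0}_{\infty})^{\mathbf{Y}})^{\mathbf{X}}$, such that $\bar{\mathbf{t}}_x(y) = \overline{[\mathbf{t}_y(x)]}$, $x \in \mathbf{X}$, $y \in \mathbf{Y}$, after Equation 48.

With template inversion, the operations $\boxed{\vee}$ and $\text{\textcircled{$\vee$}}$ are employed in the definition of the conjugate operations in a manner that is reminiscent of De Morgan's Law, i.e.,

$$\mathbf{a} \mathbin{\boxed{\wedge}} \mathbf{t} \equiv (\mathbf{t}^* \mathbin{\boxed{\vee}} \mathbf{a}^*)^* \qquad (48)$$

and

$$\mathbf{a} \mathbin{\text{\textcircled{$\wedge$}}} \mathbf{t} \equiv \overline{(\bar{\mathbf{t}} \mathbin{\text{\textcircled{$\vee$}}} \bar{\mathbf{a}})}. \qquad (49)$$

2.1.5. Nonrecursive operations on templates

The elementary arithmetic operations on images given in Equation 34 also generalize to templates. If s and t are real-valued templates from Y to X, then pointwise template arithmetic is defined as follows:

$$\begin{aligned}
\mathbf{s} + \mathbf{t} &\text{ is defined by } (\mathbf{s} + \mathbf{t})_y \equiv \mathbf{s}_y + \mathbf{t}_y, \\
\mathbf{s} * \mathbf{t} &\text{ is defined by } (\mathbf{s} * \mathbf{t})_y \equiv \mathbf{s}_y * \mathbf{t}_y, \text{ and} \\
\mathbf{s} \vee \mathbf{t} &\text{ is defined by } (\mathbf{s} \vee \mathbf{t})_y \equiv \mathbf{s}_y \vee \mathbf{t}_y.
\end{aligned} \qquad (50)$$

Addition and multiplication are also defined for complex-valued templates. The arithmetic of extended real-valued templates is described in Reference 54.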
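The De Morgan-like duality between additive minimum and additive maximum under conjugation can be checked numerically. The Python sketch below is an illustration under stated assumptions: it reads the duality as "additive minimum of a with t equals the conjugate of the left additive maximum of the conjugate template with the conjugate image", it uses finite weights only, and it treats points outside a template's stored support as absent, which corresponds to weights of $+\infty$ or $-\infty$ that can never win the minimum or maximum. All names and values are hypothetical.

```python
def additive_min(a, t):
    """Right product: b(y) = min over x in the support of t_y of a(x) + t_y(x)."""
    return {y: min(a[x] + w for x, w in ty.items()) for y, ty in t.items()}


def left_additive_max(s, b):
    """Left product: c(y) = max over x of s_x(y) + b(x)."""
    c = {}
    for x, sx in s.items():
        for y, w in sx.items():
            v = w + b[x]
            c[y] = max(c[y], v) if y in c else v
    return c


def conjugate_image(a):
    """Additive conjugate of an image: a*(x) = -a(x)."""
    return {x: -v for x, v in a.items()}


def conjugate_template(t):
    """Additive conjugate of a template: t*_x(y) = -t_y(x)."""
    tc = {}
    for y, ty in t.items():
        for x, w in ty.items():
            tc.setdefault(x, {})[y] = -w
    return tc


# Hypothetical 1-D image and template, stored as t[y][x] = weight.
a = {0: 1, 1: 5, 2: 2}
t = {
    0: {0: 0, 1: -1},
    1: {0: -1, 1: 0, 2: -1},
    2: {1: -1, 2: 0},
}

direct = additive_min(a, t)
via_conjugates = conjugate_image(
    left_additive_max(conjugate_template(t), conjugate_image(a))
)
print(direct, via_conjugates)
```

Both routes give the same image for this example, so an implementation that provides only the additive maximum can obtain the additive minimum by conjugating its operands and result.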
Likewise, the operations of generalized convolution, as well as additive and multiplicative maximum, generalize to operations between templates. If t is a real- or complex-valued template from Y to X, and s is a real- or complex-valued template from X to W, the template $\mathbf{r} = \mathbf{s} \oplus \mathbf{t}$ from Y to W is defined by the image function $\mathbf{r}_y$, as follows:

$$\mathbf{r}_y(w) = \sum_{x \in \mathbf{X}} \mathbf{t}_y(x) \cdot \mathbf{s}_x(w), \quad\text{where } w \in \mathbf{W}. \qquad (51)$$

Computation of the weights $\mathbf{r}_y(w)$ can be accomplished by summing over a subset of X. For example, given a point $y \in \mathbf{Y}$, then for each $w \in \mathbf{W}$, we define the set $S(w) = \{x \in \mathbf{X} : x \in S(\mathbf{t}_y) \text{ and } w \in S(\mathbf{s}_x)\}$. Then, since $\mathbf{t}_y(x) \cdot \mathbf{s}_x(w) = 0$ if $x \notin S(w)$, we obtain

$$\mathbf{r}_y(w) = \sum_{x \in S(w)} \mathbf{t}_y(x) \cdot \mathbf{s}_x(w), \qquad (52)$$

where the definition $\sum_{x \in S(w)} \mathbf{t}_y(x) \cdot \mathbf{s}_x(w) = 0$ holds whenever $S(w) = \emptyset$. The operations of additive and multiplicative maximum are similarly defined, and are discussed in detail in Reference 54.

Template composition and decomposition are useful for algorithmic optimization, which is the primary motivation for introducing operations between generalized templates. For example, if r is the $3 \times 3$ template of unit weights, $\mathbf{s} = (1\ 1\ 1)$, and $\mathbf{t} = (1\ 1\ 1)'$, where the target point is the center weight, and $\mathbf{r} = \mathbf{s} \oplus \mathbf{t}$, then the computation of $\mathbf{a} \oplus \mathbf{r} = \mathbf{a} \oplus (\mathbf{s} \oplus \mathbf{t})$ by $(\mathbf{a} \oplus \mathbf{s}) \oplus \mathbf{t}$ uses six local multiplications instead of nine. It has been shown that, if r is an $n \times n$ template, and s and t are the decompositions of r into $1 \times n$ and $n \times 1$ templates, respectively, then the computation of $\mathbf{a} \oplus \mathbf{r}$ by $(\mathbf{a} \oplus \mathbf{s}) \oplus \mathbf{t}$ uses $2n$ multiplications, instead of $n^2$. This concept is further discussed in Reference 55.

2.1.6. Recursive image algebra

Our summary of recursive IA operations (after Li [57]) includes an overview of recursive templates (Section 2.1.6.1), operations between images and recursive templates (Section 2.1.6.2), and operations between recursive templates (Section 2.1.6.3).

2.1.6.1. Recursive templates.

A generalized recursive template is defined in terms of the generalized template of Section 2.1.3, together with a partial order imposed upon certain point sets.
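A partially ordered set, or poset, is a set P together with a binary relation $\prec$ that satisfies the axioms of reflexivity, antisymmetry, and transitivity. Let $(P, \prec)$ be a poset and $[n] = \{1, 2, \ldots, n\}$, where $n = |P|$. A linear extension of $(P, \prec)$ is a bijection $\sigma: P \to [n]$ such that $(x \prec y) \in P \implies (\sigma(x) \leq \sigma(y)) \in [n]$.

Denote value set F, point sets X and Y, where $\prec$ is a partial order imposed upon Y. A generalized F-valued recursive template from Y to X is a function $\mathbf{t} = (\mathbf{t}_{\nprec}, \mathbf{t}_{\prec}): \mathbf{Y} \to (\mathbb{F}^{\mathbf{X}}, \mathbb{F}^{\mathbf{Y}})$, which can be decomposed into a nonrecursive part $\mathbf{t}_{\nprec}: \mathbf{Y} \to \mathbb{F}^{\mathbf{X}}$ and a recursive part $\mathbf{t}_{\prec}: \mathbf{Y} \to \mathbb{F}^{\mathbf{Y}}$, such that (a) $y \notin S(\mathbf{t}_{\prec}(y))$, and (b) for each $z \in S(\mathbf{t}_{\prec}(y))$, $z \prec y$. Thus, the support $S(\mathbf{t}_{\prec}(y))$ is consistent with the partial order imposed on Y. As a result, for each $y \in \mathbf{Y}$, $\mathbf{t}_{\nprec}(y) \in \mathbb{F}^{\mathbf{X}}$ and $\mathbf{t}_{\prec}(y) \in \mathbb{F}^{\mathbf{Y}}$. For purposes of brevity, we denote $\mathbf{t}_{\nprec y} = \mathbf{t}_{\nprec}(y)$, $\mathbf{t}_{\prec y} = \mathbf{t}_{\prec}(y)$, and $\mathbf{t}_y = (\mathbf{t}_{\nprec y}, \mathbf{t}_{\prec y})$.

The consistency of $\mathbf{t}_{\prec}$ ensures the recursive computability of t. Since a recursive template can be consistent with many partial orders on Y or any of their linear extensions, a recursive template that is consistent with a partial order $\prec$ is said to admit the partial order $\prec$. Thus, the set of all F-valued recursive templates from Y to X that admits the partial order $\prec$ on Y is denoted by $(\mathbb{F}^{\mathbf{X}}, \mathbb{F}^{\mathbf{Y}})^{(\mathbf{Y}, \prec)}$. If $\mathbf{t} \in (\mathbb{F}^{\mathbf{X}}, \mathbb{F}^{\mathbf{Y}})^{(\mathbf{Y}, \prec)}$, then the support $S(\mathbf{t}_y) = (S(\mathbf{t}_{\nprec y}), S(\mathbf{t}_{\prec y}))$. If $S(\mathbf{t}$ [...]

A recursive template $\mathbf{t} \in (\mathbb{F}^{\mathbf{X}}, \mathbb{F}^{\mathbf{X}})^{\mathbf{X}}$ is translation-invariant if and only if, for each triple $x, y, z \in \mathbf{X}$ with $y + z \in \mathbf{X}$ and $x + z \in \mathbf{X}$, we have $\mathbf{t}_y(x) = \mathbf{t}_{y+z}(x + z)$, which is equivalent to $\mathbf{t}_{\nprec y}(x) = \mathbf{t}_{\nprec y+z}(x + z)$ and $\mathbf{t}_{\prec y}(x) = \mathbf{t}_{\prec y+z}(x + z)$.

2.1.6.2. Operations between images and recursive templates.

Recursive operations in image algebra are generalized in terms of the recursive right and left products, similar to Equations 43 and 45. Let $\Gamma$ denote the global reduce operation, as before, and let $\gamma: \mathbb{F} \times \mathbb{F} \to \mathbb{F}$ be associative and commutative. Let $\mathbb{F}_1$ and $\mathbb{F}_2$ be two value sets with operations $\circ_1, \circ_2: \mathbb{F}_1 \times \mathbb{F}_2 \to \mathbb{F}$.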
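If $\mathbf{a} \in (\mathbb{F}_1)^{\mathbf{X}}$ and $\mathbf{t} \in (\mathbb{F}_2^{\mathbf{X}}, \mathbb{F}_2^{\mathbf{Y}})^{(\mathbf{Y}, \prec)}$, then the generalized recursive right product is the binary operation $\text{\textcircled{$\gamma$}}_{\prec}: \mathbb{F}_1^{\mathbf{X}} \times (\mathbb{F}_2^{\mathbf{X}}, \mathbb{F}_2^{\mathbf{Y}})^{(\mathbf{Y}, \prec)} \to \mathbb{F}^{\mathbf{Y}}$, defined as [...], with the left product symmetrically defined. For example, if $\mathbb{F}, \mathbb{F}_1, \mathbb{F}_2 \subset \mathbb{R}$, $\gamma = +$, and $\circ_1, \circ_2 = \cdot$, then $\text{\textcircled{$\gamma$}}_{\prec} = \oplus_{\prec}$, the recursive convolution. Likewise, if $\mathbb{F}, \mathbb{F}_1, \mathbb{F}_2 = \mathbb{R}_{-\infty}$, $\gamma = \vee$, and $\circ_1, \circ_2 = +$, then $\text{\textcircled{$\gamma$}}_{\prec} = \boxed{\vee}_{\prec}$, the recursive additive maximum.

2.1.6.3. Operations between recursive templates.

It is well established that image algebra template composition and decomposition are useful for algorithmic optimization. For example, Li showed that for $\mathbf{a} \in \mathbb{R}^{\mathbf{X}}$ and for $\mathbf{s}, \mathbf{t} \in (\mathbb{R}^{\mathbf{X}})^{\mathbf{X}}$,

$$\begin{aligned}
\mathbf{a} \oplus (\mathbf{s} \oplus \mathbf{t}) &= (\mathbf{a} \oplus \mathbf{s}) \oplus \mathbf{t} \quad\text{and} \\
\mathbf{a} \oplus (\mathbf{s} + \mathbf{t}) &= (\mathbf{a} \oplus \mathbf{s}) + (\mathbf{a} \oplus \mathbf{t}).
\end{aligned} \qquad (54)$$

Additionally, if $\mathbf{a} \in \mathbb{R}_{-\infty}^{\mathbf{X}}$ and $\mathbf{s}, \mathbf{t} \in (\mathbb{R}_{-\infty}^{\mathbf{X}})^{\mathbf{X}}$, then

$$\begin{aligned}
\mathbf{a} \mathbin{\boxed{\vee}} (\mathbf{s} \mathbin{\boxed{\vee}} \mathbf{t}) &= (\mathbf{a} \mathbin{\boxed{\vee}} \mathbf{s}) \mathbin{\boxed{\vee}} \mathbf{t} \quad\text{and} \\
\mathbf{a} \mathbin{\boxed{\vee}} (\mathbf{s} \vee \mathbf{t}) &= (\mathbf{a} \mathbin{\boxed{\vee}} \mathbf{s}) \vee (\mathbf{a} \mathbin{\boxed{\vee}} \mathbf{t}).
\end{aligned} \qquad (55)$$

Li [57] extended template decomposition to recursive templates, and proved certain associative and distributive laws. We summarize salient theory, as follows.

Let the finite point sets $\mathbf{X}, \mathbf{W} \subset \mathbb{R}^n$ and let the recursive templates $\mathbf{t} \in (\mathbb{R}^{\mathbf{X}}, \mathbb{R}^{\mathbf{X}})^{\mathbf{X}}$ and $\mathbf{s} \in (\mathbb{R}^{\mathbf{W}}, \mathbb{R}^{\mathbf{X}})^{\mathbf{X}}$. The generalized recursive convolution of two recursive templates, denoted by $\mathbf{r} = \mathbf{s} \oplus_{\prec} \mathbf{t}$, is given by

$$\mathbf{r}_{\nprec} = \mathbf{s}_{\nprec} \oplus \mathbf{t}_{\nprec} \quad\text{and}\quad \mathbf{r}_{\prec} = \mathbf{1} - (\mathbf{1} - \mathbf{s}_{\prec}) \oplus (\mathbf{1} - \mathbf{t}_{\prec}), \qquad (56)$$

where $\mathbf{1} \in (\mathbb{R}^{\mathbf{X}})^{\mathbf{X}}$ denotes the unitary template, which is defined in terms of its weights as

$$\mathbf{1}_x(z) = \begin{cases} 1 & \text{if } z = x \\ 0 & \text{otherwise} \end{cases}, \quad x, z \in \mathbf{X}. \qquad (57)$$

Given the previously defined templates s and t, the recursive addition $\mathbf{r} = \mathbf{s} +_{\prec} \mathbf{t}$ is given by

$$\begin{aligned}
\mathbf{r}_{\nprec} &= \mathbf{s}_{\nprec} \oplus (\mathbf{1} - \mathbf{t}_{\prec}) + (\mathbf{1} - \mathbf{s}_{\prec}) \oplus \mathbf{t}_{\nprec}, \\
\mathbf{r}_{\prec} &= \mathbf{1} - (\mathbf{1} - \mathbf{s}_{\prec}) \oplus (\mathbf{1} - \mathbf{t}_{\prec}).
\end{aligned} \qquad (58)$$

Nonlinear recursive template composition and inversion are defined in Reference 57. Having concluded the discussion of image algebra, we next consider notation specific to this document.

2.2. Study Notation

In addition to image algebra, which comprises the majority of the notation employed in this study, we adopt the following conventions:

a) Image transformations are expressed in general form as $T: \mathbb{F}^{\mathbf{X}} \to \mathbb{G}^{\mathbf{Y}}$, where $\mathbb{F} \neq \mathbb{G}$ and $\mathbf{X} \neq \mathbf{Y}$ are possible.

b) Operations in image space, i.e., over domain(T), are denoted by $\mathcal{O}$, unless otherwise stated.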
c) Operations over range($T$) are usually denoted by $\bigcirc'$.
d) The images of primary concern are denoted by $\mathbf{a}$, the source image; $\mathbf{a}_c$, the transformed image $T(\mathbf{a})$; $\mathbf{b}$, the result of the operation $\bigcirc(\mathbf{a})$; and $\mathbf{b}_c$, the result of the operation $\bigcirc'(\mathbf{a}_c)$.
e) The $i$-th derivative of an image $\mathbf{a} \in \mathbb{R}^{\mathbf{X}}$ is denoted as
$$d^i(\mathbf{a}) \equiv \frac{d^i \mathbf{a}(\mathbf{x})}{d\mathbf{x}^i}, \quad \mathbf{x} \in \mathbf{X}, \tag{59}$$
provided that the operation $d$ is defined over $\mathbf{X}$. Assuming that $\mathbf{X} \subset \mathbb{R}^n$, the $i$-th partial derivative with respect to the projection $p(\mathbf{x})$ of a point $\mathbf{x} \in \mathbf{X}$ is denoted as
$$\partial^i(\mathbf{a}) \equiv \frac{\partial^i \mathbf{a}(\mathbf{x})}{(\partial p(\mathbf{x}))^i}. \tag{60}$$
f) In addition to the sets described in Section 2.1.1, which are denoted in open face ($\mathbb{F}$) and bold face ($\mathbf{X}$) type, sets may also be specified by upper-case letters in normal (S), italic ($G$), script ($\mathcal{P}$), or Gothic ($\mathfrak{D}$) faces.
g) The properties of a set $S$ are usually denoted by $\mathcal{P}(S)$, unless otherwise stated.
h) Set properties that are disclosed or concealed from the scrutiny of an oracle (defined in Chapter 5) are denoted by D(S) and R-(S), respectively.
i) Scalar variables are denoted in lower case by normal face type (e.g., x, y) or by italic type (e.g., $x$, $y$), unless otherwise stated.
j) The operator ($\cdot$) denotes scalar multiplication when applied to scalar variables. When ($\cdot$) is applied to one- or two-dimensional matrices, or to images defined on a subset of $\mathbb{R}$ or $\mathbb{R}^2$, then matrix multiplication is indicated. In contrast, Hadamard multiplication of matrices or images is denoted by ($*$).
k) The time complexity of an operation $f$ is denoted by T($f$), which is not to be confused with the transform notation $T$ (italic font). Similarly, the space complexity of $f$ is denoted by S($f$), and the work required to compute $f$, by W($f$). Note that the work W is not to be confused with $\mathbf{W}$, which denotes a point set. Additionally, the cost of computing $f$ is denoted by C($f$), which is not to be confused with C, an image transform that is used in Chapter 4.
When we compare the computation of $f$ over $n$ input data on various $N$-processor architectures $A$, then we occasionally elaborate T, S, W, and C as T$_A(f, N)(n)$.
l) The superscript asterisk (*) may denote template conjugation (e.g., $\mathbf{t}^*$), complex conjugation of an image (e.g., $\mathbf{a}^*$), or an approximation to an inverse transform (e.g., $T^*$), which is defined in the following chapter.

CHAPTER 3
FUNDAMENTAL THEORY

In this chapter, we develop theory that describes the processing of transformed data. Section 3.1 contains a review of the concepts of homomorphism, isomorphism, and heteroassociativity. In Section 3.2 we present the basic theory of computation over compressed or encrypted imagery, which is related to several problems of computer science, such as the transform properties problem. Section 3.3 contains a taxonomy of compressive and encryptive transformations that is organized in four classes. Such classification facilitates the development of high-level methods for deriving functions that compute over the range spaces of transforms in each class, as exemplified in Section 3.4.

3.1. Mathematical Concepts.

This study is based in part upon the concepts of commutativity diagrams, homomorphism, and isomorphism. Homomorphic transformations facilitate the processing of transformed imagery in a manner analogous to operations that process the source image. An isomorphism is a special case of a homomorphism. Isomorphisms are important in the direct mapping of operations over one domain to operations in another domain. Additionally, isomorphism is useful when discussing data security, as in Chapter 4. We begin with a summary of basic mathematical terms.

3.1.1. Basic Theory.

The concepts of elements, sets, and mappings are key to our theoretical development, and constitute the basic building blocks of image algebra.

3.1.1.1. Definition. The $n$-fold Cartesian product of a set $S$ is defined by
$$S^n \equiv \prod_{i=1}^{n} S = \{(s_1, s_2, \ldots, s_n) : s_1, s_2, \ldots, s_n \in S\}. \tag{61}$$

3.1.1.2. Definition.
A mapping $M$ from set $R$ to set $S$ is denoted by $M: R \to S$. The collection of all mappings $R \to S$ is denoted by $S^R$. Thus, if $M: R \to S$, then $M \in S^R$. We call $R$ the domain of $M$, written as $R = \mathrm{domain}(M)$. The range of $M$ is defined as $\mathrm{range}(M) = \{s \in S : s = M(r) \text{ for some } r \in R\}$.

3.1.1.3. Definition. An $m$-to-$k$-ary operation $\bigcirc$ from set $R$ to set $S$ is a mapping $\bigcirc: \prod_{i=1}^{m} R \to \prod_{j=1}^{k} S$, where it is hereafter understood that $m, k \geq 1$. When $k = 1$, we call $\bigcirc$ an $m$-ary operation, denoted by $\bigcirc_m$.

3.1.2. Groupoids, Operations, and Transformations.

We next state facts concerning groupoids, operations, and transformations, which lead to definitions of homomorphism and isomorphism. Groupoids combine the concepts of sets and operations, and lead to the consideration of transforms and their inverses. The definition of a transformation leads to discussion of homomorphism and isomorphism.

3.1.2.1. Definition. A groupoid $(S, \bigcirc)$ is comprised of the set $S$ and a binary operation $\bigcirc: S \times S \to S$. A groupoid whose binary operation is associative is called a semigroup.

3.1.2.2. Definition. We call $e$ the identity element of $S$ with respect to the operation $\bigcirc: S \times S \to S$, if and only if
$$f \bigcirc e = e \bigcirc f = f, \quad \forall f \in S. \tag{62}$$

3.1.2.3. Definition. Let $S$ be a set with identity $e$ with respect to the operation $\bigcirc: S \times S \to S$. If $f, g \in S$, then $g$ is called the inverse of $f \in S$ with respect to the operation $\bigcirc$, if and only if
$$f \bigcirc g = g \bigcirc f = e, \quad \forall f \in S. \tag{63}$$
If $f \in S$ and $g \in S$ is the inverse of $f$, then we denote $g$ by $f^{-1}$.

3.1.2.4. Definition. Suppose $T: S \to S'$ and $V: S' \to S$. Then $V$ is called the inverse of the transformation $T$ if and only if
$$(V \circ T)(f) = f, \quad \forall f \in S. \tag{64}$$
If $T: S \to S'$ and $V: S' \to S$ is the inverse of $T$, then we denote $V$ by $T^{-1}$. If we denote $T(f)$ by $f'$ (i.e., $f' = T(f)$), then it follows that $T^{-1}(f') = f$, $\forall f \in S$.

3.1.2.5. Definition. Let $T: S \to S'$ be a homomorphism.
If $S$ is a normed space with norm $\|\cdot\|$ and $\varepsilon \in \mathbb{R}$, then we say that a homomorphism $T^*: \mathbb{G}^{\mathbf{Y}} \to \mathbb{F}^{\mathbf{X}}$ is an $\varepsilon$-near inverse approximation of $T$ if and only if
$$\| T^*(T(\mathbf{a})) - \mathbf{a} \| < \varepsilon, \quad \forall \mathbf{a} \in \mathbb{F}^{\mathbf{X}}. \tag{65}$$
Note that every inverse is an $\varepsilon$-near inverse for all $\varepsilon > 0$.

3.1.2.6. Definition. Given groupoids $(S, \bigcirc)$ and $(S', \bigcirc')$, the transformation $T: S \to S'$ is a homomorphism from $S$ to $S'$ if and only if
$$T(f \bigcirc g) = T(f) \bigcirc' T(g), \quad \forall f, g \in S. \tag{66}$$

3.1.2.7. Definition. Let $(S, \bigcirc)$ and $(S', \bigcirc')$ be groupoids with homomorphism $T: S \to S'$. Then, $T^{-1}: S' \to S$ is an inverse homomorphism for $T$ if and only if $T^{-1}$ is the inverse transform of $T$ and $T^{-1}$ is a homomorphism.

3.1.2.8. Definition. If $T: S \to S'$ is a homomorphism and $T$ is a one-to-one and onto mapping, then $T$ is an isomorphism of $S$ onto $S'$.

3.1.2.9. Definition. A semigroup is a groupoid whose binary operation is associative.

3.1.2.10. Definition. A monoid is a semigroup with identity.

3.1.2.11. Theorem. Let $(S, \bigcirc)$ and $(S', \bigcirc')$ be monoids, and let $e$ denote the identity of $S$. If $T: S \to S'$ is an isomorphism, then the following statements hold:
(i) If $e'$ is the identity element of $S'$, then $e' = T(e)$, and
(ii) $T(f^{-1}) = (T(f))^{-1}$, $\forall f \in S$ which has an inverse $f^{-1} \in S$.
Proof. The proof is well known.

3.1.3. Preservation of Set Properties.

We next develop theory that describes the preservation of various properties of sets and mappings under homomorphism. The following theory is fundamental to the derivation of functions that compute over the range space of an image transformation.

3.1.3.1. Notation. Denote the properties of a collection of subsets of a set $S$ by $\mathcal{P}(S)$. If $A \subseteq S$ has property $q$ and $A$ is a singleton set (i.e., $A = \{u\}$), then we say that $u$ has property $q$.

3.1.3.2. Definition. We say that the transform $T: S \to S'$ preserves the properties $\mathcal{P}(S)$ if and only if for each property $q \in \mathcal{P}(S)$ and for all $A \subseteq S$ with property $q$, $T(A)$ also has property $q$. We illustrate this concept with the following examples.

3.1.3.3. Example.
Let $q \in \mathcal{P}(S)$ be the property of connectivity, and let $T: S \to S'$ preserve $q$. This means that for every connected set $A \subseteq S$, $T(A)$ must also be connected.

3.1.3.4. Example. Given an image $\mathbf{a} \in (\mathbb{Z}_n)^{\mathbf{X}}$, it is easily verified that the sorted histogram $h(\mathbf{a})$ is preserved under the transformation $A(\mathbf{a}) = n - \mathbf{a}$.

3.1.3.5. Example. The fact that associativity and commutativity are preserved under homomorphisms follows from the fact that the operations of algebraic structures are preserved under homomorphisms.

3.1.3.6. Definition. We say that a binary operation $\bigcirc: S \times S \to S$ preserves property $q \in \mathcal{P}(S)$ if and only if, for all pairs of subsets $A, B \subseteq S$ with property $q$, or for each pair of elements $f, g \in S$ with property $q$, $\bigcirc(A, B)$ or $f \bigcirc g$ exhibits property $q$.

3.1.3.7. Definition. A groupoid $(S, \bigcirc)$ is said to exhibit property $q$ if the operation $\bigcirc$ preserves property $q$.

3.1.3.8. Definition. An algebra $A = (\mathcal{F}, \mathcal{O})$ consists of a set $\mathcal{F}$ of operands and a set $\mathcal{O}$ of operations upon the operands in $\mathcal{F}$.

3.1.3.9. Example. The algebra consisting of the real numbers together with the operations of multiplication and addition is denoted by the tuple $(\mathbb{R}, \{+, \cdot\})$, which can be written as $(\mathbb{R}, +, \cdot)$ for purposes of simplicity.

3.1.4. Heteroassociativity.

We next discuss the property of heteroassociativity, which is employed in the derivation of operations over the range spaces of certain encryptive transforms.

3.1.4.1. Definition. An operation $\gamma: S \times S \to S$ is associative with respect to the operation $\bigcirc: S \times S \to S$, if and only if
$$(f \bigcirc g)\, \gamma\, h = f \bigcirc (g\, \gamma\, h), \quad \forall f, g, h \in S. \tag{67}$$

3.1.4.2. Definition. If an operation $\gamma$ is associative with respect to an operation $\bigcirc$, and $\bigcirc$ is associative with respect to $\gamma$, then we say that the operations $\gamma, \bigcirc: S \times S \to S$ are heteroassociative.

3.1.4.3. Definition. An algebra $(S, \bigcirc, \gamma)$ is called a heteroassociative algebra if and only if the operations $\gamma, \bigcirc: S \times S \to S$ are heteroassociative.

3.1.5. Heteroassociativity of Inverse Functions.
We next consider inverse operations and their heteroassociative properties.

3.1.5.1. Definition. Let $\bigcirc: S \times S \to S$ and let $S' = \{f \in S : f \text{ has an inverse}\}$. Then, $Q: S \times S' \to S$ is called the inverse operation (or inverse) of $\bigcirc$ if and only if
$$f \bigcirc g^{-1} = f \mathbin{Q} g, \quad \forall f \in S \text{ and } \forall g \in S', \tag{68}$$
where $g^{-1}$ denotes the inverse of $g$. If $Q$ is the inverse operation of $\bigcirc$, then we denote $Q$ by $\bigcirc^{-1}$.

3.1.5.2. Remark. In the preceding definition, $\bigcirc^{-1}$ does not mean the inverse function $S \to S \times S$. It should be clear from the context when we mean inverse operation versus inverse function.

3.1.5.3. Example. Referring to the preceding definition, note that if $S = \mathbb{R}$, then $S' = S$ for the operation of addition. Thus, subtraction is the inverse of addition over the real numbers. In contrast, although division is the inverse of multiplication over the real numbers, $S' = \mathbb{R} \setminus \{0\} \subset \mathbb{R}$, since $f/0$ remains undefined for all $f$ in $\mathbb{R}$.

3.1.5.4. Definition. Assume that $S$ is a normed space with norm $\|\cdot\|$. Let $\varepsilon \in \mathbb{R}^+$ and $S^* = S \times S$. An operation $\bigcirc: S^* \to S$ has an $\varepsilon$-near inverse approximation $\bigcirc^*$ if and only if
$$f \bigcirc g^{-1} = f \bigcirc^* g, \quad \forall g \in S', \tag{69}$$
where $S'$ was defined in Section 3.1.5.1 and
$$\| (f \bigcirc^* f) - e \| < \varepsilon, \quad \forall f \in S, \tag{70}$$
with $e$ denoting the identity of $S$.

3.1.5.5. Observation. Assume that $(S, \bigcirc)$ and $(S', \bigcirc')$ are two monoids, where operations $\bigcirc: S \times S \to S$ and $\bigcirc': S' \times S' \to S'$ each have an inverse denoted by $\bigcirc^{-1}: S \times S \to S$ and $(\bigcirc')^{-1}: S' \times S' \to S'$. If $T: S \to S'$ is a homomorphism, then $T(f \bigcirc f^{-1}) = T(f)\, (\bigcirc')^{-1}\, T(f)$, $\forall f^{-1} \in S$.

3.1.5.6. Definition. A monoid $(S, \bigcirc)$ is called a group if and only if each element $f \in S$ has an inverse $f^{-1} \in S$. Alternatively, we may define a group as a monoid with identity and an inverse, whose binary operation is associative. If $\bigcirc$ is commutative, then the group $(S, \bigcirc)$ is called an Abelian group.

3.1.5.7. Definition. A ring $(S, \bigcirc, \gamma)$ satisfies the following conditions for all $f, g, h \in S$:
(i) The functions $\bigcirc, \gamma: S \times S \to S$ are associative and commutative.
(ii) $\gamma$ has an identity $e \in S$ and there exists $\forall f \in S$ an inverse $f^{-1}$ such that $f\, \gamma\, f^{-1} = e$.
(iii) $\bigcirc$ is left- and right-distributive over $\gamma$, i.e., $f \bigcirc (g\, \gamma\, h) = (f \bigcirc g)\, \gamma\, (f \bigcirc h)$ and $(f\, \gamma\, g) \bigcirc h = (f \bigcirc h)\, \gamma\, (g \bigcirc h)$, $\forall f, g, h \in S$.

3.1.5.8. Definition. A semi-ring $(S, \bigcirc)$ is a commutative semigroup. Note that this is different from an Abelian group $(S, \bigcirc)$ that is present in a ring $(S, \bigcirc, \gamma)$.

3.1.5.9. Lemma. Let $(S, \bigcirc)$ be a group. If $\bigcirc$ is associative, then the following statements hold:
(i) $\bigcirc^{-1}$ is associative, and
(ii) $(f \bigcirc g) \bigcirc^{-1} h = f \bigcirc (g \bigcirc^{-1} h)$, $\forall f, g, h \in S$,
where $\bigcirc^{-1}$ denotes the inverse operation of $\bigcirc$.
Proof. The proof follows from the definitions of a group and an inverse function. $\square$

3.1.5.10. Definition. Suppose that $A = (S, \bigcirc, \gamma)$ and $A' = (S', \bigcirc', \gamma')$ are algebras, where $\bigcirc$ and $\gamma$ are binary operations over $S$, and $\bigcirc'$ and $\gamma'$ are binary operations over $S'$. A function $T: S \to S'$ is called a homomorphism from algebra $A$ to algebra $A'$ if and only if the following conditions are satisfied for all $f, g \in S$:
(i) $T(f \bigcirc g) = T(f) \bigcirc' T(g)$, and
(ii) $T(f\, \gamma\, g) = T(f)\, \gamma'\, T(g)$.
Note that the correspondence of operations $\bigcirc$ and $\bigcirc'$, as well as $\gamma$ and $\gamma'$, must be preserved. It follows that the cardinality of the set of operations in algebra $A$ must equal the cardinality of the set of operations in $A'$.

This concludes the definitions upon which our theory of processing compressed data is based. We introduce the core theory in the following section.

3.2. Image Compression, Encryption, and Compressive Computation.

A mathematical discussion of compressive computation requires the following definitions as background. Section 3.2.1 contains definitions of parameters that characterize compressive transforms. In Section 3.2.2, we define operations that process over the range space of such transforms.

3.2.1. Parameters of Compressive Transforms.

3.2.1.1. Definition.
Given a finite set $S \subset \mathbb{N}$ and $r \in \mathbb{N}$, the maximum field width of $S$, denoted by $\mathrm{siz}(S)$, is the maximum number of radix-$r$ digits required to encode a value in $S$.

3.2.1.2. Example. If finite $S \subset \mathbb{R}^+$ is quantized under an optimal space constraint, then the binary representation ($r = 2$) of $S$ has $\mathrm{siz}(S) = \lceil \log_2(|S|) \rceil$. However, optimality does not necessarily imply high accuracy, since error can result when elements of $S$ are quantized. The following example is illustrative.

3.2.1.3. Example. If finite $S \subset \mathbb{R}^+$ is quantized linearly into a radix-$r$ representation using a quantization interval $q \in \mathbb{R}^+$, then $\mathrm{siz}(S)$ is bounded as
$$\lceil \log_r |S| \rceil \leq \mathrm{siz}(S) \leq \bigvee \{y : y = \lceil \log_r(x/q) \rceil,\ x \in S\}, \tag{71}$$
where $q$ denotes quantization interval width and the inequality accounts for data that may not be normalized to the interval $[\bigwedge S, \bigvee S]$.

3.2.1.4. Definition. An encoding transform is a mapping $T: \mathbb{F}^{\mathbf{X}} \to \mathbb{G}^{\mathbf{Y}}$, where $\mathbf{X} \neq \mathbf{Y}$ and $\mathbb{F} \neq \mathbb{G}$ are possible.

3.2.1.5. Definition. An exact encoding $\mathbf{a}_c$ of an image $\mathbf{a}$ by a transform $T$ fulfills the following conditions:
(i) No additional information is introduced into domain($T$). Implementationally, this means that pixels are not added to $\mathbf{a}$ prior to applying $T$.
(ii) No information is lost in the encoding. That is, $T$ encodes all the information in $\mathbf{a}$ such that the information in $\mathbf{a}_c$ completely describes the information in $\mathbf{a}$.
(iii) No information is added to the encoded result. For example, $\mathbf{a}_c$ is not augmented with additional pixels to fulfill a given format specification.
Exact encodings are often called information-preserving or lossless transformations. If $T$ is not exact, then it is called inexact. Alternative terms are information-nonpreserving or lossy.

3.2.1.6. Definition. The domain compression ratio of a transform $T: \mathbb{F}^{\mathbf{X}} \to \mathbb{G}^{\mathbf{Y}}$ is given by
$$CR_D(T) = \frac{|\mathbf{X}|}{|\mathbf{Y}|}, \tag{72}$$
where $\mathbf{X}$ and $\mathbf{Y}$ are finite point sets.

3.2.1.7. Definition. The range compression ratio of a transform $T: \mathbb{F}^{\mathbf{X}} \to \mathbb{G}^{\mathbf{Y}}$ is given by
$$CR_R(T) = \frac{\mathrm{siz}(\mathbb{F})}{\mathrm{siz}(\mathbb{G})}, \tag{73}$$
where implementations of $\mathbb{F}$ and $\mathbb{G}$ would be finite sets.

3.2.1.8.
Definition. The compression ratio of a transform $T: \mathbb{F}^{\mathbf{X}} \to \mathbb{G}^{\mathbf{Y}}$ is given by
$$CR(T) = \frac{|\mathbf{X}| \cdot \mathrm{siz}(\mathbb{F})}{|\mathbf{Y}| \cdot \mathrm{siz}(\mathbb{G})}. \tag{74}$$
It follows from the preceding definitions that $CR = CR_R \cdot CR_D$. Thus, $CR$ describes compression at a digit (bit) level for radix-$r$ (binary) encodings, under the assumption of constant field width. This assumption is reasonable for machines with fixed-size registers, which include the majority of digital computers.

3.2.1.9. Definition. A compressive transform $T: \mathbb{F}^{\mathbf{X}} \to \mathbb{G}^{\mathbf{Y}}$ is an encoding transform whose compression ratio $CR(T) > 1$.

Note that a compressive transform $T$ may be a homomorphism or an isomorphism, which leads to the following definitions of operations over range($T$).

3.2.2. Transform-regime Analogues, Duals, and Computational Systems.

The concept of computing over compressed (encrypted) imagery, which we call compressive (encryptive) processing, can be illustrated by the commutativity diagram shown in Figure 1. Here, $\bigcirc$ denotes a unary image operation that is applied to image $\mathbf{a}$ to yield image $\mathbf{b}$. The transform $T$, which may be information-nonpreserving, maps image $\mathbf{a}$ to a compressed image $\mathbf{a}_c$, which is accepted by an operation $\bigcirc'$ that yields the image $\mathbf{b}_c$. The operation $\bigcirc'$ is called a transform-regime analogue or analogue of $\bigcirc$. Similarly, an operation $\widetilde{\bigcirc}$ that transforms $\mathbf{a}_c$ into $\mathbf{b}$ is called a dual of $\bigcirc$. If $T$ has an inverse $T^{-1}$ (or an approximation $T^*$ to its inverse, in the case of lossy $T$), then the image $\mathbf{b}_c$ can be transformed to yield an exact representation of (approximation to) $\mathbf{b}$.

Figure 1. Commutativity diagram for processing compressed or encrypted imagery with unary operations.

Since it is occasionally possible that dual operations can be derived in the presence of intractable analogues, duals are useful in practice. We later show how to use analogues and duals to formulate computational systems based upon homomorphisms and isomorphisms. This prepares the reader for further theoretical development in Parts II and III.

3.2.2.1. Assumption.
Let $(S, \bigcirc)$ and $(S', \bigcirc')$ be groups, and let $T: S \to S'$ be a homomorphism.

3.2.2.2. Definition. If Assumption 3.2.2.1 holds, then the operation $\bigcirc'$ is called an $\bigcirc$-property of $T$, or a transform-regime analogue of $\bigcirc$ over range($T$).

3.2.2.3. Remark. We say that the determination of a specific property of a given transform is called the Transform Properties Problem (TPP). A theoretical discussion of this problem is given later in this section, where we show how the TPP can be used to construct compressive computational systems.

3.2.2.4. Example. The Fourier transform $\mathcal{F}: \mathbb{C}^{\mathbf{X}} \to \mathbb{C}^{\mathbf{X}}$ has a convolution property (i.e., $\bigcirc = \oplus$) that is given by $\bigcirc' = *$, the operation of Hadamard multiplication. Alternatively, we say that $*$ is a transform-regime analogue of $\oplus$ over range($\mathcal{F}$).

3.2.2.5. Definition. If Assumption 3.2.2.1 holds and $\widetilde{\bigcirc}: S' \times S' \to S$ such that $\widetilde{\bigcirc}(T(f), T(g)) = \bigcirc(f, g)$, $\forall f, g \in S$, then $\widetilde{\bigcirc}$ is called a transform-regime dual of $\bigcirc$ over range($T$).

3.2.2.6. Definition. If Assumption 3.2.2.1 holds, $T$ has an $\varepsilon$-near approximation to its inverse, and $\widetilde{\bigcirc}: S' \times S' \to S$ such that $\| \widetilde{\bigcirc}(T(f), T(g)) - \bigcirc(f, g) \| < \varepsilon$, $\forall f, g \in S$, then $\widetilde{\bigcirc}$ is called an $\varepsilon$-near approximation to a transform-regime dual of $\bigcirc$ over range($T$). For purposes of brevity, we also call $\widetilde{\bigcirc}$ an approximate dual of $\bigcirc$ over range($T$).

3.2.2.7. Definition. If Assumption 3.2.2.1 holds, then the structure $((S, \bigcirc), T, (S', \bigcirc'))$ is called a homomorphic computational system (HCS).

3.2.2.8. Definition. If the transform $T$ in Assumption 3.2.2.1 is an isomorphism, then the structure $((S, \bigcirc), T, (S', \bigcirc'), T^{-1})$ is called an isomorphic computational system (ICS).

3.2.2.9. Definition. If Assumption 3.2.2.1 holds and $T: S \to S'$ has an $\varepsilon$-near approximation to its inverse, denoted by $T^*: S' \to S$, then the structure $((S, \bigcirc), T, (S', \bigcirc'), T^*)$ is called an $\varepsilon$-near isomorphic computational system ($\varepsilon$-ICS).

3.2.2.10. Definition. If Assumption 3.2.2.1 holds and $T$ is a compressive transform, then the corresponding HCS (ICS) is said to be a compressive HCS (ICS).
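The convolution property of Example 3.2.2.4 and the ICS structure of Definition 3.2.2.8 can be checked numerically on a small case. The following is a minimal sketch, not the implementation used in this study: it assumes a one-dimensional point set $\mathbf{X} = \mathbb{Z}_4$, takes circular convolution as the source-domain operation, and uses a naive discrete Fourier transform as $T$. The function names (`dft`, `idft`, `circular_convolve`, `hadamard`, `close`) and the sample images `a` and `b` are illustrative assumptions. It only verifies the homomorphism identity of Definition 3.1.2.6 and exact recovery through $T^{-1}$; it does not check the group axioms of Assumption 3.2.2.1.

```python
import cmath

def dft(a):
    """Naive discrete Fourier transform; plays the role of the transform T."""
    n = len(a)
    return [sum(a[x] * cmath.exp(-2j * cmath.pi * k * x / n) for x in range(n))
            for k in range(n)]

def idft(A):
    """Inverse DFT; plays the role of the inverse transform T^-1."""
    n = len(A)
    return [sum(A[k] * cmath.exp(2j * cmath.pi * k * x / n) for k in range(n)) / n
            for x in range(n)]

def circular_convolve(a, b):
    """Circular convolution over Z_n; the source-domain operation (assumed here)."""
    n = len(a)
    return [sum(a[j] * b[(x - j) % n] for j in range(n)) for x in range(n)]

def hadamard(A, B):
    """Pointwise (Hadamard) product; the transform-regime analogue."""
    return [u * v for u, v in zip(A, B)]

def close(u, v, tol=1e-9):
    """Elementwise comparison up to floating-point tolerance."""
    return len(u) == len(v) and all(abs(p - q) < tol for p, q in zip(u, v))

# Two small illustrative "images" on the point set Z_4.
a = [1.0, 2.0, 0.0, -1.0]
b = [0.5, 0.0, 3.0, 1.0]

# Homomorphism condition: T(a o b) = T(a) o' T(b).
lhs = dft(circular_convolve(a, b))
rhs = hadamard(dft(a), dft(b))
homomorphism_holds = close(lhs, rhs)

# ICS usage: compute in the transform regime, then invert to recover a o b.
recovered = idft(rhs)
recovery_holds = close(recovered, circular_convolve(a, b))

print(homomorphism_holds, recovery_holds)
```

The design point of the sketch is that the convolution is never evaluated in the source domain on the transform-regime path: only `hadamard` is applied to the transformed operands, and `idft` maps the result back.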
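We can extend the definition of an HCS or ICS to include operations with multiple operands of different type, or multiple operations that each have a corresponding analogue. In order to do this, we first define several terms pertaining to algebras.

3.2.2.11. Definition. Given the indexed sets $\mathcal{F} = \{F_\lambda\}$ and $\mathcal{G} = \{G_\mu\}$, an algebra $B = (\mathcal{G}, \mathcal{U})$ is a subalgebra of algebra $A = (\mathcal{F}, \mathcal{O})$ if and only if
(i) $\mathcal{U} \subseteq \mathcal{O}$, and
(ii) for each $G_\mu \in \mathcal{G}$, there exists an $F_\lambda \in \mathcal{F}$ such that $G_\mu \subseteq F_\lambda$.

3.2.2.12. Definition. An algebra $A = (\mathcal{F}, \mathcal{O})$ is a homogeneous algebra iff $|\mathcal{F}| = 1$ and is a heterogeneous algebra iff $|\mathcal{F}| > 1$.

3.2.2.13. Observation. Since a homomorphism $T$ preserves the operations of an algebra, an HCS may be formulated using a set of operations $\mathcal{O}$ rather than a single operation. For example, consider the structure $((S, \mathcal{O}), T, (S', \mathcal{O}'))$, where $\mathcal{O}'$ contains the transform-regime analogues of the corresponding operations in $\mathcal{O}$. The symmetric case holds for ICS.

Of primary interest to this study is the determination of the $\mathcal{O}$-properties of $T$, where $\mathcal{O}$ are common image operations and $T$ is a compressive transform. A more difficult problem, which we address only tangentially, concerns the derivation of $\mathcal{O}$-properties of encryptive transforms.

A key theoretical constraint encountered early in this study is implied by the fact that homomorphisms are defined for binary operations only. Thus, HCS were initially defined for only one type of image algebra operation, namely, binary pointwise operations. In practice, this was an unacceptable restriction, since image algebra contains unary and binary pointwise operations as well as unary global-reduce operations and binary operations with heterogeneous operands (e.g., image-template and template-template operations). We thus developed the following theory that unifies compressive computation over all image algebra operations by expressing the concept of an HCS in terms of a family of mappings.

3.2.2.14. Definition. Let algebras $A = \{A_1, A_2, \ldots, A_n; \mathcal{O}\}$ and $B = \{B_1, B_2, \ldots, B_n; \mathcal{O}'\}$, where $\mathcal{O}$ and $\mathcal{O}'$ denote sets of operations. For example, $A_1 = \mathbb{F}$, $A_2 = \mathbb{F}^{\mathbf{X}}$, $A_3 = (\mathbb{F}^{\mathbf{X}})^{\mathbf{X}}$, and $\mathcal{O} = \{+, \cdot, \oplus\}$. A family of mappings $\zeta = \{\zeta_1, \ldots, \zeta_n; \phi\}$, where $\zeta_i: A_i \to B_i$, for $i = 1..n$, is an algebraic homomorphism $\zeta: A \to B$ if and only if, for each $\bigcirc \in \mathcal{O}$, (a) there exists an analogous operation $\bigcirc' \in \mathcal{O}'$ over range($\zeta$) and (b) $\phi: \mathcal{O} \to \mathcal{O}'$, such that the following statement holds. If $\bigcirc: A_{i_1} \times A_{i_2} \times \cdots \times A_{i_m} \to A_{i_{m+1}}$, then $\phi(\bigcirc) = \bigcirc': B_{i_1} \times B_{i_2} \times \cdots \times B_{i_m} \to B_{i_{m+1}}$ such that
$$\phi(\bigcirc)[\zeta_{i_1}(a_{i_1}), \zeta_{i_2}(a_{i_2}), \ldots, \zeta_{i_m}(a_{i_m})] = \zeta_{i_{m+1}}[\bigcirc(a_{i_1}, a_{i_2}, \ldots, a_{i_m})]. \tag{75}$$

3.2.2.15. Remark. If Definition 3.2.2.14 holds and $\phi(\bigcirc) = \widetilde{\bigcirc}$ is a dual over range($\zeta$) of $\bigcirc$, then
$$\phi(\bigcirc)[\zeta_{i_1}(a_{i_1}), \zeta_{i_2}(a_{i_2}), \ldots, \zeta_{i_m}(a_{i_m})] = \bigcirc(a_{i_1}, a_{i_2}, \ldots, a_{i_m}). \tag{76}$$

3.2.2.16. Observation. Given Definition 3.2.2.14 and $T: \mathbb{F}^{\mathbf{X}} \to \mathbb{G}^{\mathbf{Y}}$, one encounters the following cases in nonrecursive image algebra:

Case 1. If $m = 1$, then the commutativity diagram of Figure 1 is satisfied, i.e., $\bigcirc'(T(\mathbf{a})) = T(\bigcirc(\mathbf{a}))$. For such unary operations, we form the following structure:
$$X_{\bigcirc} = (A, \zeta, B) = (\{\mathbb{F}^{\mathbf{X}}; \bigcirc\}, \{T, \phi\}, \{\mathbb{G}^{\mathbf{Y}}; \bigcirc'\}). \tag{77}$$
If $T$ is a compressive transform, then we call $X_{\bigcirc}$ a compressive computational system (CCS).

Case 2. If $m = 1$, then let $A_1 = \mathbb{F}$, $A_2 = \mathbb{F}^{\mathbf{X}}$, and $\zeta = \{T, \phi\}$ such that the global reduce operation $\Gamma: \mathbb{F}^{\mathbf{X}} \to \mathbb{F}$ has an analogue $\phi(\Gamma) = \Gamma': \mathbb{G}^{\mathbf{Y}} \to \mathbb{G}$ for which Equation 75 holds. The corresponding CCS is given by
$$X_{\Gamma} = (A, \zeta, B) = (\{\mathbb{F}^{\mathbf{X}}, \mathbb{F}; \Gamma\}, \{T, \phi\}, \{\mathbb{G}^{\mathbf{Y}}, \mathbb{G}; \Gamma'\}). \tag{78}$$

Case 3. If $m = 2$ and $A_1 = A_2 = \mathbb{F}^{\mathbf{X}}$, then Equation 75 reduces to the definition of a homomorphism $T$. For example, if $\bigcirc: \mathbb{F}^{\mathbf{X}} \times \mathbb{F}^{\mathbf{X}} \to \mathbb{F}^{\mathbf{X}}$, $\zeta = \{T, \phi\}$, and $\phi(\bigcirc) = \bigcirc': \mathbb{G}^{\mathbf{Y}} \times \mathbb{G}^{\mathbf{Y}} \to \mathbb{G}^{\mathbf{Y}}$, then the corresponding homomorphic computational system (HCS) is given by $X_{\bigcirc}$, which is a CCS if $CR(T) > 1$.

Case 4. If $m = 2$, then let $A_1 = \mathbb{F}^{\mathbf{X}}$, $A_2 = (\mathbb{F}^{\mathbf{X}})^{\mathbf{X}}$, $B_1 = \mathbb{G}^{\mathbf{Y}}$, $B_2 = (\mathbb{G}^{\mathbf{Y}})^{\mathbf{Y}}$, and $\bigcirc: \mathbb{F}^{\mathbf{X}} \times (\mathbb{F}^{\mathbf{X}})^{\mathbf{X}} \to \mathbb{F}^{\mathbf{X}}$. Let $\zeta = \{T, U; \phi\}$, where $U: (\mathbb{F}^{\mathbf{X}})^{\mathbf{X}} \to (\mathbb{G}^{\mathbf{Y}})^{\mathbf{Y}}$ incorporates the transformation process inherent in $T$ such that $\phi(\bigcirc) = \bigcirc': \mathbb{G}^{\mathbf{Y}} \times (\mathbb{G}^{\mathbf{Y}})^{\mathbf{Y}} \to \mathbb{G}^{\mathbf{Y}}$ is an analogue of $\bigcirc$ over range($T$). Given image $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$ and template $\mathbf{t} \in (\mathbb{F}^{\mathbf{X}})^{\mathbf{X}}$, if $\mathbf{a}_c = T(\mathbf{a})$ and $\mathbf{s} = U(\mathbf{t})$, then
$$\phi(\bigcirc)[T(\mathbf{a}), U(\mathbf{t})] = T(\bigcirc(\mathbf{a}, \mathbf{t})) = T(\mathbf{a} \bigcirc \mathbf{t}). \tag{79}$$
This implies the existence of the structure
$$X_{\bigcirc} = (A, \zeta, B) = (\{\mathbb{F}^{\mathbf{X}}, (\mathbb{F}^{\mathbf{X}})^{\mathbf{X}}; \bigcirc\}, \{T, U; \phi\}, \{\mathbb{G}^{\mathbf{Y}}, (\mathbb{G}^{\mathbf{Y}})^{\mathbf{Y}}; \bigcirc'\}), \tag{80}$$
which is a CCS if $CR(T) > 1$.

Case 5. If $m = 2$, then let $A = \{\mathbb{F}^{\mathbf{X}}, (\mathbb{F}^{\mathbf{X}})^{\mathbf{V}}, \mathbb{F}^{\mathbf{V}}; \bigcirc\}$, $B = \{\mathbb{G}^{\mathbf{Y}}, (\mathbb{G}^{\mathbf{Y}})^{\mathbf{W}}, \mathbb{G}^{\mathbf{W}}; \bigcirc'\}$, $\widetilde{T}: \mathbb{F}^{\mathbf{V}} \to \mathbb{G}^{\mathbf{W}}$, $U: (\mathbb{F}^{\mathbf{X}})^{\mathbf{V}} \to (\mathbb{G}^{\mathbf{Y}})^{\mathbf{W}}$, and $\zeta = \{T, \widetilde{T}, U; \phi\}$, where $\phi(\bigcirc) = \bigcirc'$ is an analogue over range($\zeta$) of $\bigcirc$. If $T$ and $\widetilde{T}$ are compressive transforms, then setting $X_{\bigcirc} = \{A, \zeta, B\}$ yields the CCS for the generalized image-template product over heterogeneous domains.

3.2.2.17. Remark. Definition 3.2.2.14 and Observation 3.2.2.16 provide a unifying formalism that supports the derivation of analogues of nonrecursive image algebra operations over range($T$). Note that Cases 4 and 5 of the preceding observation can describe operations between templates without loss of specificity, since $\mathbf{t}_{\mathbf{y}} \in \mathbb{F}^{\mathbf{X}}$ for all $\mathbf{y} \in \mathbf{X}$. Additionally, duals of $\bigcirc$ over range($\zeta$) are expressed symmetrically to Cases 1-5. For example, if $\Gamma$ has a dual $\widetilde{\Gamma}$ over range($\zeta$), then Case 2 would yield the dual CCS
$$\widetilde{X}_{\Gamma} = (\{\mathbb{F}^{\mathbf{X}}, \mathbb{F}; \Gamma\}, \{T, \phi\}, \{\mathbb{G}^{\mathbf{Y}}, \mathbb{F}; \widetilde{\Gamma}\}). \tag{81}$$

In passing, we note that compressive transforms are generally not based upon a scalar mapping of form $\mathbb{F} \to \mathbb{G}$, as are pointwise greylevel operations. Rather, image compression is generally achieved by mapping source image blocks to one or more exemplar values $g \in \mathbb{G}$. For example, $g$ could be an exemplar that approximately characterizes an encoding block, as in VQ. As a result, the global reduce operations usually do not have numerous analogues, since a single value in $\mathbb{G}$ represents many values in $\mathbb{F}$, but rarely one value. However, the block mapping $\mathbb{F}^{k \cdot l} \to \mathbb{G}$ that is often inherent in $T$ facilitates the construction of a mapping $g: \mathbb{G} \to \mathbb{F}^{k \cdot l}$ from which a dual $\widetilde{\Gamma}: \mathbb{G}^{\mathbf{Y}} \to \mathbb{F}$ can be derived via applying $\Gamma$ to range($g$). We thus have discovered few systems of form $X_{\Gamma}$, but have derived numerous CCS of form $\widetilde{X}_{\Gamma}$.

3.2.2.18. Observation.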
If all the elements of $\zeta$ except $\phi$ (i.e., $\zeta_1, \zeta_2, \ldots, \zeta_n$) are onto mappings that have inverses denoted by $\zeta^{-1} = \{\zeta_1^{-1}, \zeta_2^{-1}, \ldots\}$, then $\zeta$ is an algebraic isomorphism. If $\bigcirc$ is a binary operation and there is only one operand set, then $X_{\bigcirc}$ can be recast as an isomorphic computational system (ICS), as follows:
$$V_{\bigcirc} = (A, \zeta, B, \zeta^{-1}) = (\{\mathbb{F}^{\mathbf{X}}; \bigcirc\}, \{T, \phi\}, \{\mathbb{G}^{\mathbf{Y}}; \bigcirc'\}, T^{-1}). \tag{82}$$
It is well known that ICS are useful in cryptology, since isomorphisms are required for exact recovery of plaintext from ciphertext. If any of the elements of $\zeta$ except $\phi$ does not have an inverse, then we can construct $\varepsilon$-near approximations to the inverses of $\zeta_1, \zeta_2, \ldots, \zeta_n$ that are denoted by $\zeta^* = \{\zeta_1^*, \zeta_2^*, \ldots\}$. In such cases, we say that there exists an approximate ICS $V^* = (A, \zeta, B, \zeta^*)$.

We next overview applications of the preceding theory to high-level problems in computer science, in preparation for the more specific discussion of transformation and compressive processing given in Sections 3.3 and 3.4.

3.2.3. Relationship of Compressive Processing to Problems of Computer Science.

We previously mentioned that the derivation of specific solutions of the transform properties problem was central to the feasibility of compressive processing. The TPP provides a convenient, unifying formalism for expressing problems of computer science that pertain to imaging practice. In particular, portions of our research currently emphasize the semi-automatic derivation of partial functions that approximate forward and inverse transformations. For example, consider the problem of inverting an imaging system's point-spread function (PSF) that is approximately deduced from imagery. Inversion is but one step in deconvolving a PSF from camera imagery, in order to clarify details of interest. Similar inversion capabilities would be useful in cryptanalysis, where an adversary attempts to guess the key (or the mechanism) of an encryption transform.
In this section, we show that formal statements of the TPP and several related problems are useful for expressing homomorphic and isomorphic computational systems. Through such discussion, we hope to acquaint the reader with the larger, conceptual context of this study, prior to addressing applications of our theory in the remaining chapters. Eventually, the semi-automatic development of approximations to solutions of the TPP could facilitate further automation of burdensome computing tasks (e.g., derivation of compression and encryption algorithms) that are currently performed manually or semi-automatically. Thus, the sense and import of this section is not merely abstract, but has potential applications in computing practice that are being actively investigated in our ongoing research. We begin by defining Kleene closure in terms of functional composition.

3.2.3.1. Observation. In programming language theory, it is customary to define Kleene closure in terms of concatenation. For example, given the set of symbols $A$, let $L$, $L_1$, and $L_2$ denote sets of strings that are comprised of symbols in $A$. The concatenation of $L_1$ and $L_2$, denoted by $L_1 L_2$, is equal to the set $\{xy : x \in L_1 \text{ and } y \in L_2\}$. Let us further define the set $L^0 = \{\Lambda\}$, where $\Lambda$ denotes the empty symbol, and $L^i = L L^{i-1}$, for $i \geq 1$. Defined in terms of the concatenation relation, the Kleene closure of $L$, denoted by $L^*$, and the positive closure of $L$, denoted by $L^+$, are given by
$$L^* = \bigcup_{i=0}^{\infty} L^i \quad \text{and} \quad L^+ = \bigcup_{i=1}^{\infty} L^i. \tag{83}$$
A corresponding definition is given for tuple construction, as follows.

3.2.3.2. Definition. Given a set of operators $\mathcal{O}$, let $O$, $O_1$, and $O_2$ denote sequences of operators that are comprised of operators in $\mathcal{O}$. The construction of $O_1$ and $O_2$, denoted by $O_1 \sqcap O_2$, is equal to the set of tuples $\{(\bigcirc_1, \bigcirc_2) : \bigcirc_1 \in O_1 \text{ and } \bigcirc_2 \in O_2\}$. Further define $O^0 = E$, where $E$ denotes the null operator (which does nothing), and $O^i = O \sqcap O^{i-1}$, for $i \geq 1$. The Kleene closure of $O$ under the construction relation, denoted by $O^*$, is given by
$$O^* = \bigcup_{i=0}^{\infty} O^i. \tag{84}$$
For purposes of brevity (and to avoid typographic confusion among the sequence $O$ and the operators $\bigcirc$), we hereafter denote Kleene closure under the construction relation as $\mathcal{O}^* = O^*$. The construction relation $\sqcap$ can be likewise applied to operands in $\mathcal{F}$ to yield the Kleene closure $\mathcal{F}^*$.

3.2.3.3. Example. Let $\mathcal{O} = \{+, -\}$ and let $E$ denote the null operator. The elements of $\mathcal{O}^*$ are specified in canonical order as
$$\mathcal{O}^* = \{E, +, -, (E, E), (E, +), (E, -), (+, E), \ldots\}. \tag{85}$$
We thus have a convenient theoretical construct for enumerating sequences of operators. We call such a sequence an algorithm.

3.2.3.4. Notation. For purposes of discussion, we denote the set of transforms $\mathcal{T} \subset \mathcal{O}^*$.

3.2.3.5. Definition. The data transformation problem (DTP) can be described in simplified form by the mapping $M_{\mathrm{DTP}}: (S' \cup \{\bot\})^S \times A \to \mathcal{T}$, if the following conditions are satisfied:
(a) domain($M_{\mathrm{DTP}}$) consists of
(i) a set $D$ of ordered tuples $(f, f')$ that constitute a partial function, where $f \in S$ and $f' \in S'$, with $S, S' \in \mathcal{F}$ the domain and range spaces of a transform $T \in \mathcal{T}$, and $\bot$ denotes an unidentified entity;
(ii) an algebra $A = (\mathcal{G}, \mathcal{U})$, where $\mathcal{G} \subset \mathcal{F}$ denotes a set of data structures and $\mathcal{U} \subset \mathcal{O}$ denotes a set of operators;
(b) range($M_{\mathrm{DTP}}$) contains the transform $T: S \to S'$; and
(c) $M_{\mathrm{DTP}}: (D, (\mathcal{G}, \mathcal{U})) \mapsto T$.

3.2.3.6. Remark. For purposes of consistency, we currently assume that $M_{\mathrm{DTP}}$ expresses $T$ in terms of operators in $\mathcal{U}$ as well as operands (e.g., data structures) in $\mathcal{G}$. Since each transform $T$ can be written in various ways, the consideration of low-level formulations of $M_{\mathrm{DTP}}$ is presently tangential to our theoretical development. The reader should not confuse an abstract formulation of $M_{\mathrm{DTP}}$ with more concrete solutions to instances of the DTP. For example, we note that closed-form solutions to the DTP are not evident in the literature for sets $S, S'$ of practical interest.
However, Koza [58] has demonstrated methods of deriving exact or approximate solutions to restricted instances of the DTP over specific finite discrete sets S,S' via stochastic techniques. We next examine the problem of obtaining inverse transformations, which closely resembles the DTP. 68 3.2.3.7. Definition. The inverse transformation problem (ITP) can be described in simple form by the mapping MITI: T x A - T, if the following conditions are satisfied: (a) domain(MT,) consists of (i) a transform T: S - S' in T, where S, S' E F, the operands; (ii) an algebra A = (g,U), where g C F denotes a set of data structures that instantiate S and S', and U C 0 denotes a set of operators from which the transform in range(MIT) is comprised; (b) The inverse transform T : S' - S exists and is contained in range(M,); and (c) MIT, ( T, (G,U)) - T-. Solutions to instances of MIT, have been employed in cryptanalysis, e.g., the determination of a monoalphabetic substitutional cipher using an input comprised of unique symb6ls. Recently, there was reported a successful attack upon rotor machines via genetic algorithms (GAs) using sample plaintext, ciphertext, and a knowledge of key size [34]. Instead of deriving the inverse transformation in closed form from the forward transform, it may be more efficient to approximate the inverse transform computationally. For example, the forward transform can be represented discretely as a partial function, to which an approximate inverse can be sought. Such derivation of approximate inverse transforms is expressed similarly to the ITP, as shown in the following definition. 3.2.3.8. Definition. An instance of the approximate inverse transformation problem (AIP) can be described in simple form by the mapping M... 
: $(S' \cup \{\perp\})^S \times \mathcal{T} \times \mathbb{R} \times \mathcal{A} \to \mathcal{T}$, if the following conditions are satisfied:
(a) domain($M_{AIP}$) consists of
(i) a set $D$ of ordered tuples $(f, f')$ that constitute a partial function, where $f \in S$, $f' \in S'$, and $S, S' \in \mathcal{F}$ are the domain and range spaces of a transform $T \in \mathcal{T}$, with $\perp$ denoting an unidentified entity;
(ii) a transform $T: S \to S'$ in $\mathcal{T}$;
(iii) an error bound $\varepsilon \in \mathbb{R}$;
(iv) an algebra $A = (G, U)$, where $G \subset \mathcal{F}$ denotes a set of data structures that instantiate $S$ and $S'$, and $U \subset \mathcal{O}$ denotes a set of operators of which the transform in range($M_{AIP}$) is comprised;
(b) range($M_{AIP}$) contains an $\varepsilon$-near inverse approximation to $T$, denoted by $T^*: S' \to S$; and
(c) $M_{AIP}: (D, T, \varepsilon, (G, U)) \mapsto T^*$.

3.2.3.9. Observation. Inherent in the preceding definition is the assumption that $T^*$ can be constructed and applied to the input data in $D$ within the error bound $\varepsilon$. In practice, such may not be the case. For example, one may have to select some $Q \subset S$ and $Q' \subset S'$ in order for $T^*: Q' \to Q$ to satisfy $\varepsilon$. In such cases, it is reasonable to assume that range($T^*$) $\subset S$.

3.2.3.10. Example. Let a polynomial regression technique be applied to a dataset $D$. Assuming that the transform $T: S \to S'$ is also a polynomial, the coefficients of $T$ could seed the regression that would determine a polynomial $P$ which would approximate $T$. Then, one might invert $P$ (where possible), thus obtaining a polynomial $P^{-1}$ that could be employed to seed a polynomial regression over the set $D^* = \{(f', f) : (f, f') \in D\}$. Again applying regression to $D^*$ to yield a polynomial $T^*$, analysis of residuals in the output of $T^*$ could highlight errors that lie outside the interval $[-\varepsilon, \varepsilon]$. That is, $T^*$ would exhibit errors within the bounds $\pm\varepsilon$ only for some pairs $(f', f) \in D^*$. In such cases, we would rewrite an instance of Definition 3.2.3.8 to read $T^*: Q' \to Q$, where $Q \subset S$ and $Q' \subset S'$.

3.2.3.11. Remark. The preceding observation implies that formulations of $M_{AIP}$ may be constrained by prespecification of the error bound $\varepsilon$ in domain($M_{AIP}$).
This is a realistic constraint, since postspecification of $\varepsilon$ in range($M_{AIP}$) implies loss of precise control over the error bound, and such precision could be required in engineering applications. As a result of such considerations, we present Definition 3.2.3.8 as one of many instances of a more general problem. Additional issues of interest, e.g., whether $M_{AIP}$ reduces to an instance of $M_{ITP}$ when $\varepsilon = 0$, will be considered in future research. We next summarize the central problem of this study, namely, that of determining properties of a given transform.

3.2.3.12. Definition. The transform properties problem (TPP) can be specified in terms of the mapping $M_{TPP}: (\mathcal{F} \times \mathcal{O}) \times \mathcal{T} \times \mathcal{A} \to \mathcal{F} \times \mathcal{O}$, if the following conditions are satisfied:
(a) domain($M_{TPP}$) consists of
(i) a tuple $(S, O)$ comprised of a value set $S \in \mathcal{F}$ and an operation $O \in \mathcal{O}$;
(ii) a homomorphism $T$ in $\mathcal{T}$, from $S$ to $S'$;
(iii) an algebra $A = (G, U)$, where $G \subset \mathcal{F}$ denotes a set of data structures and $U \subset \mathcal{O}$ denotes a set of operators;
(b) range($M_{TPP}$) contains a tuple $(S', O')$, such that $S' = T(S)$ and $O' \in \mathcal{O}^*$ is an operation on $S'$; and
(c) $M_{TPP}: ((S, O), T, (G, U)) \mapsto (S', O')$.

3.2.3.13. Observation. As mentioned previously, this study emphasizes the derivation of solutions to restricted instances of the TPP. For purposes of discussion, we define the term feasible computation to mean an algorithm that satisfies implementational constraints of time, space, and error for a certain operational scenario. In this study, we concentrate on designing feasible computations over the range space of various transforms. In Section 3.2.2, we defined a computation over range($T$) in terms of computational systems whose derivation was based on an algebraic homomorphism. Given such definitions, we next observe that HCS and ICS can be constructed from solutions to the TPP, ITP, and AIP. The following theorems illustrate this concept and have simple proofs that are presented in outline form.

3.2.4.
Derivation of HCS and ICS via the DTP, ITP, and AIP

3.2.4.1. Theorem. If particular solutions to the DTP and TPP exist, a homomorphic computational system $X_H$ (reference Observation 3.2.2) can be constructed from (a) a set $D = \{(f, f') : f \in S, f' \in S'\}$, which is a partial function that is an element of $(S' \cup \{\perp\})^S$, and (b) an operation $O$ on $S$.

Proof. Assuming the existence of an algebra $A = (G, U)$, where $G \subset \mathcal{F}$ and $U \subset \mathcal{O}$, we outline the proof as follows:
Step 1. Choose $O$, an operation on $S$.
Step 2. Where possible, solve the data transformation problem as
$$T := M_{DTP}(D, (G, U)), \tag{86}$$
to obtain the transformation $T: S \to S'$.
Step 3. Where possible, solve the transform properties problem as
$$(S', O') := M_{TPP}((S, O), T, (G, U)), \tag{87}$$
to obtain the tuple $(S', O')$. By the definition of an HCS, $T$ is a homomorphism.
Step 4. Formulate $X_H = (\{S, O\}, \{T, \Theta\}, \{S', O'\})$, where an analogous operation $\Theta(O) = p_2[M_{TPP}((S, O), T, (G, U))]$. ∎

3.2.4.2. Theorem. If particular solutions to the inverse transformation problem exist, an isomorphic computational system $X_I$ can be derived from a homomorphic computational system $X_H$.

Proof. Assuming the existence of an algebra $A = (G, U)$, where $G \subset \mathcal{F}$ and $U \subset \mathcal{O}$, we outline the proof as follows:
Step 1. Choose an HCS $X_H = ((S, O), T, (S', O'))$, where the transformation $T: S \to S'$.
Step 2. Where possible, solve the inverse transformation problem as
$$T^{-1} := M_{ITP}(T, (G, U)), \tag{88}$$
to obtain the inverse transformation $T^{-1}: S' \to S$.
Step 3. Formulate $X_I = (\{S, O\}, \{T, \Theta\}, \{S', O'\}, T^{-1})$, where an analogous operation $\Theta(O) = p_2[M_{TPP}((S, O), T, (G, U))]$. ∎

3.2.4.3. Theorem. If particular solutions to the approximate inverse transformation problem exist, an approximate isomorphic computational system $X^*$ can be derived from a homomorphic computational system $X_H$.

Proof. Assume the existence of an algebra $A = (G, U)$, where $G \subset \mathcal{F}$ and $U \subset \mathcal{O}$, together with the dataset $D = \{(f, f') : f \in S, f' \in S'\}$, which is a partial function that is an element of $(S' \cup \{\perp\})^S$ with $S, S' \in \mathcal{F}$.
The proof is outlined as follows:
Step 1. Choose an HCS $X_H = ((S, O), T, (S', O'))$, where the homomorphism $T: S \to S'$.
Step 2. Choose an error bound $\varepsilon \in \mathbb{R}$ and, where possible, solve the AIP as
$$T^* := M_{AIP}(D, T, \varepsilon, (G, U)), \tag{89}$$
to obtain an $\varepsilon$-near approximation $T^*: S' \to S$ to the inverse transformation $T^{-1}: S' \to S$.
Step 3. Formulate $X^* = (\{S, O\}, \{T, \Theta\}, \{S', O'\}, T^*)$, where an analogous operation $\Theta(O) = p_2[M_{TPP}((S, O), T, (G, U))]$. ∎

We next present a taxonomy of image transformations that facilitates derivation of analogous and dual operations throughout the remainder of this dissertation.

3.3. Taxonomy of Image Transformations.

We begin by deriving a simple notation for transform classification and exemplify the computation of a transform parameter (e.g., compression ratio) for each class.

3.3.1. Taxonomic Classes.

From Chapter 2, recall that an image is a mapping of form $a: X \to F$ (i.e., $a \in F^X$) and that image transforms map images to images. In this study, image transforms, denoted by $T: F^X \to G^Y$, are classified via taxa that describe equality of domain or range spaces.

3.3.1.1. Assumption. Although the general form of $T: F^X \to G^Y$ is mathematically the same whether or not $X = Y$ or $F = G$, numerous transforms of practical interest exhibit different range and domain spaces. Thus, we assume (for implementational convenience) that $F^X$, $F^Y$, $G^X$, and $G^Y$ denote distinct spaces.

3.3.1.2. Definition. Under constraint of Assumption 3.3.1.1, encoding transforms are grouped as follows:
Class 1: $T_1: F^X \to F^X$
Class 2: $T_2: F^X \to G^X$
Class 3: $T_3: F^X \to F^Y$
Class 4: $T_4: F^X \to G^Y$

3.3.1.3. Example. Class 1 transforms are exemplified by the
(a) Linear inner-product transforms (e.g., Fourier, Walsh, Hadamard, and Cosine transformations);
(b) Unary arithmetic and logic functions;
(c) Delta modulation transform, derivative-based compressions, PCM, and certain instances of DPCM;
(d) Substitutional ciphers over $F$; as well as
(e) Ciphers such as DES and RSA, if the keyspace is not included in domain($T$).

3.3.1.4. Example.
Class 2 transforms include
(a) Local averaging operations;
(b) Thresholding and bit-slicing operations; and
(c) Image labelling operations such as connected component labelling.

3.3.1.5. Example. Class 3 transforms are represented by
(a) Spatial warping transforms (e.g., the affine transform);
(b) Image minification by subsampling or magnification by pixel replication; and
(c) Spatial superresolution transformations, such as sub-pixel interpolation.

3.3.1.6. Example. Class 4 contains many interesting transformations currently in use, which include
(a) Runlength encoding and sparse matrix reduction techniques;
(b) Fractal-based encryption via iterated function systems;
(c) HVS-based blockwise transforms such as VPIC;
(d) Blockwise sine/cosine transforms such as DCT, JPEG, and MPEG;
(e) Blockwise statistical transformations, including BTC and VQ;
(f) Adaptive or recursive encoding schemes, such as wavelet transformation, adaptive vector quantization, as well as adaptive Lempel-Ziv encoding; and
(g) Higher-level processes, such as the transformation of image components into adjacency graphs.

The foregoing transform classes each have different attributes (e.g., compression ratio). For example, computational complexity, data security, and information loss are discussed in Chapters 4 and 5. To demonstrate the utility of our taxonomy, we analyze the compressive properties of transforms in Classes 1-4, which provides background for the development of Parts II and III.

3.3.2. Compression Ratio.

3.3.2.1. Lemma. The compression ratio of a Class 1 transform $T_1$ is unity.

Proof. From Definitions 3.3.1.2 and 3.2.1.8, the Class 1 transform $T_1: F^X \to F^X$ exhibits the compression ratio
$$CR(T_1) = \frac{|\mathrm{domain}(\mathrm{choice}(\mathrm{domain}(T_1)))| \cdot \mathrm{siz}[\mathrm{range}(\mathrm{choice}(\mathrm{domain}(T_1)))]}{|\mathrm{domain}(\mathrm{choice}(\mathrm{range}(T_1)))| \cdot \mathrm{siz}[\mathrm{range}(\mathrm{choice}(\mathrm{range}(T_1)))]} = \frac{|X| \cdot \mathrm{siz}(F)}{|X| \cdot \mathrm{siz}(F)} = 1. \tag{90}$$
As a result, Class 1 transforms alone are not interesting for compression applications.
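The ratio evaluated in the proof above can also be checked numerically. The following is a minimal sketch, assuming that siz(.) is measured in bits per pixel and that domain sizes are pixel counts; the function name `compression_ratio` and the image sizes are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def compression_ratio(n_x, siz_f, n_y, siz_g):
    """Ratio of source size |X|*siz(F) to encoded size |Y|*siz(G).

    n_x, n_y     : number of points in the source and encoded domains
    siz_f, siz_g : storage per value (assumed here to be bits per pixel)
    """
    return Fraction(n_x * siz_f, n_y * siz_g)

# Class 1 (F^X -> F^X): same domain, same value set.
cr_class1 = compression_ratio(512 * 512, 8, 512 * 512, 8)

# Class 2 (F^X -> G^X): e.g., thresholding 8-bit pixels to 1-bit pixels.
cr_class2 = compression_ratio(512 * 512, 8, 512 * 512, 1)

# Class 3 (F^X -> F^Y): e.g., subsampling by two along each axis.
cr_class3 = compression_ratio(512 * 512, 8, 256 * 256, 8)

print(cr_class1, cr_class2, cr_class3)
```

Exact rational arithmetic is used so that a ratio of unity is reported as exactly 1 rather than as a rounded floating-point value.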
∎

The following lemmas are similarly proven.

3.3.2.2. Lemma. The compression ratio of Class 2 transforms is given by $CR(T_2) = \mathrm{siz}(F)/\mathrm{siz}(G)$.

3.3.2.3. Lemma. The compression ratio of Class 3 transforms is given by $CR(T_3) = |X|/|Y|$.

3.3.2.4. Lemma. The compression ratio of Class 4 transforms is given by
$$CR(T_4) = \frac{|X| \cdot \mathrm{siz}(F)}{|Y| \cdot \mathrm{siz}(G)},$$
which is the general case.

For purposes of clarity, we next present a notation for decomposing transform mappings, which helps us visualize the transform mappings implementationally.

3.3.3. Formulaic Granularity and Decomposition.

We begin by showing how a given transform $T$ can be expressed at three levels of granularity (coarse, medium, and fine). The multigranular representation of $T$ does not imply that each level of granularity has the same formulation. Rather, we decompose $T$ to concisely acquaint the reader with $T$'s structure and function. From such decompositions, one can further understand the format in which information is represented in range($T$).

3.3.3.1. Definition. An image transform is described at coarse granularity as a mapping between image spaces of form $F^X$. Coarse granularity is denoted by presuperscript C. For example, we write sparse matrix reduction at coarse granularity as ${}^{C}T_{SM}: F^X \to G^Y$, which is a Class 4 transform.

3.3.3.2. Definition. A transform can be described at medium granularity via its data structures. Medium granularity is denoted by presuperscript M. For example, given $Y \subset \mathbb{N}$, an instance of sparse-matrix reduction is denoted as ${}^{M}T_{SM}: F^X \to (X \times F)^Y$. That is, an $F$-valued image on $X$ is reduced to a list on $Y$ of tuples of form $(x, f)$, where a non-negligible value $f \in F$ is grouped with its domain point $x \in X$.

3.3.3.3. Definition. The fine granular level of transform description can specify operations on one or more source data. Such operations may be represented as scalar (as opposed to vector-space) mappings. Fine granularity is denoted by presuperscript F.
For example, given the preceding definition, we can write
$${}^{F}T_{SM}: (x, a(x)) \mapsto (h(x, a(x)), (x, a(x))), \tag{92}$$
where $h$ denotes an indexing function that is described in greater detail in Chapter 9.

3.3.3.4. Notation. We denote the formulaic decomposition of a transform $T$ from coarse- to fine-granular form as ${}^{C}T \Rightarrow {}^{M}T \Rightarrow {}^{F}T$. Two examples of this technique follow, the first of which is applied to runlength encoding (RLE).

3.3.3.5. Example. Let the Class 4 mapping $T_4: F^X \to G^Y$ describe runlength encoding (RLE). The RLE transform $T_{RLE}$ maps a Boolean image on $X$ to an integer-valued image on $Y$, where $Y \subset \mathbb{N}$ is usual. Let $n$ denote the maximum source runlength in domain($T_{RLE}$), and let a run (i.e., a contiguous region) of $k \le n$ zero-valued (unitarily-valued) pixels be mapped to the integer $-k$ ($+k$). If $k > n$, then we partition the run into $\lfloor k/n \rfloor$ runs of length $n$ and a residual run of length $k \bmod n$ pixels. For example, if range(choice(domain($T_{RLE}$))) $= \mathbb{Z}_{256}$, and we want the encoding alphabet (which includes a marker symbol) to be both positively and negatively valued, then $n = \lfloor (256 - 1)/2 \rfloor = 127$ pixels. Since we will be accumulating the detected runs using a counter, we let $Y = \mathbb{N}$. Denoting $\mathbb{Z}_{\pm l} = \{i : i \in [-(l-1), l-1] \subset \mathbb{Z}\}$, we denote range(choice(range($T_{RLE}$))) as $\mathbb{Z}_{\pm l}$. Thus, we write ${}^{M}T_{RLE}: \mathbb{Z}_2^X \to \mathbb{Z}_{\pm l}^{\mathbb{N}}$. Per the preceding discussion (and as shown in Chapter 9), ${}^{F}T_{RLE}: \{f\}^k \mapsto k \cdot (2f - 1)$, where $\{f\}^k$ denotes a zero- or unitarily-valued run of $k$ pixels, and $k \le n$.

3.3.3.6. Example. The Class 4 transforms, which can be complex mathematically as well as notationally, can be further classified as follows:
(i) Class 4-A contains fixed-blocksize substitutions, denoted by $T_{4f}$, where
$${}^{M}T_{4f}: F^k \to G, \qquad {}^{F}T_{4f}: (f_1, f_2, \ldots, f_k) \mapsto g, \qquad f_1, f_2, \ldots, f_k \in F,\; g \in G. \tag{93}$$
Thus, the encoding block size $k$ is invariant to the points $x \in X$ or $y \in Y$.
(ii) Class 4-B contains variable-block substitutions, denoted by $T_{4v}$, where the medium-granular transform appears as in the previous equation, but the fine-granular transform exhibits space-variant encoding blocksize, i.e.,
$${}^{F}T_{4v}: (y, (f_1, f_2, \ldots, f_{k(y)})) \mapsto (y, g), \qquad f_1, f_2, \ldots, f_{k(y)} \in F,\; g \in G,\; y \in Y. \tag{94}$$

It is easily verified that Class 4-B portrays substitutions whose blocksize varies with $x \in X$, for example, partitioning systems customarily employed in fractal-based compression. Given $Y$, the indexing function $h: X \to Y$, and an input image $a \in F^X$ in domain($T_4$), Equation 94 can be rewritten to portray a value-dependent blocksize. In such cases, $T_4$ would portray an encoding with a blocksize that is dependent upon the values of $a$, since $a$ is a function from $X$ to $F$.

The medium- and fine-granular transform descriptions are especially useful when analyzing a mapping's computational complexity and data security, as shown in Chapter 4. Chapter 5, which pertains to error analysis, employs the fine-granular formulation as an initial step in characterizing error propagation through various transform stages. The derivations and implementational analyses of Chapters 6-9 are also based partially upon multi-granular decomposition. Thus, our notation is used not only for classification, but also facilitates theoretical derivation, analysis, and algorithmic design. We next discuss high-level methods for the derivation of operations that compute over the range spaces of Class 1-4 transforms.

3.4. Class-Specific Derivational Techniques.

We have thus far presented theory that is basic to the analysis of compressive and encryptive transformations, and have proposed a taxonomy by which numerous image transforms may be classified. In this section, we describe high-level techniques for analyzing transforms in taxonomic Classes 1-4 and for determining operations over the range spaces of such transforms.
We begin by summarizing a few simple algebraic methods that are useful analytically (Section 3.4.1), then progress to high-level derivations of pointwise operations (Section 3.4.2), global reduce, and image-template operations (Sections 3.4.3 and 3.4.4, respectively). Given such basic theory, we present more detailed descriptions of high-level methods for deriving analogues and duals of operations over imagery transformed by Class 1-4 mappings (Sections 3.4.5 through 3.4.8).

3.4.1. Basic Algebraic Methods.

3.4.1.1. Assumption. Hereafter assume that $f, f^{-1}, g, g^{-1}, h \in S$, where $f^{-1}$ and $g^{-1}$ denote the inverses of $f$ and $g$. Additionally, we assume the operations $O, \gamma: S \times S \to S$ and $O', \gamma': S' \times S' \to S'$, that the transform $T: S \to S'$ is a homomorphism, and that $e$ denotes the identity of $S$ with respect to $O$. In the following expressions, $O$ and $\gamma$ are written in infix form.

3.4.1.2. Observation. Let Assumption 3.4.1.1 hold. One can employ the following simple techniques for the reduction of algebraic expressions:
(a) If $O$ is commutative, then one can transpose the arguments of $O$ as
$$f \mathbin{O} (g \mathbin{O} h) = f \mathbin{O} (h \mathbin{O} g). \tag{95}$$
(b) If $O$ is associative, then one can regroup terms as
$$(f \mathbin{O} g) \mathbin{O} h = f \mathbin{O} (g \mathbin{O} h). \tag{96}$$
(c) If $\gamma$ is associative with respect to $O$, then one can regroup terms to yield
$$(f \mathbin{O} g) \mathbin{\gamma} h = f \mathbin{O} (g \mathbin{\gamma} h). \tag{97}$$
(d) If $O$ is associative and $g$ has an inverse $g^{-1} \in S$, then one cancels terms as
$$(f \mathbin{O} g) \mathbin{O} g^{-1} = f \mathbin{O} (g \mathbin{O} g^{-1}) = f \mathbin{O} e = f. \tag{98}$$
(e) If $O$ is associative and has an inverse $O^{-1}$ that is associative with respect to $O$, then one can eliminate terms, i.e.,
$$(f \mathbin{O} g) \mathbin{O^{-1}} g = f \mathbin{O} (g \mathbin{O^{-1}} g) = f \mathbin{O} e = f. \tag{99}$$

3.4.1.3. Example. Let $g = f^{-1} \mathbin{\gamma} h$. Via substitution, $f \mathbin{O} g$ becomes $f \mathbin{O} (f^{-1} \mathbin{\gamma} h)$. If $\gamma$ is associative with respect to $O$, then we have
$$f \mathbin{O} (f^{-1} \mathbin{\gamma} h) = (f \mathbin{O} f^{-1}) \mathbin{\gamma} h = e \mathbin{\gamma} h = h \mathbin{\gamma} e = h. \tag{100}$$

3.4.2. Pointwise Operations.

We first examine the simple case of image transforms that are bijections, then continue with spatial transforms and non-invertible (lossy) m-ary mappings.
We conclude our development with a discussion of fixed- and variable-blocksize encodings, which comprise the majority of compression transforms in common use.

3.4.2.1. Assumption. Let $X_I = (\{F^X, O\}, \{T, \Theta\}, \{F^X, O'\}, T^{-1})$ denote an ICS such that
(i) $T: F^X \to F^X$ is implemented in terms of a bijection $g: F \to F$, and
(ii) $\Theta(O)(T(a)) = T(O(a))$, $\forall a \in F^X$.

3.4.2.2. Technique. If Assumption 3.4.2.1 holds, then for finite $F$ and $X$, we observe that $\Theta$, and hence $O' = \Theta(O)$, can be implemented in terms of a discrete data structure such as a lookup table, subject to space constraints dictated by $|F|$. Alternatively, if $T$ is expressed symbolically (i.e., as an analytic function) and we wish to express $O'$ analytically, then we solve 3.4.2.1(ii) to obtain $O'$ under constraints of tractability. The following example is illustrative.

3.4.2.3. Example. Let Assumption 3.4.2.1 hold, and let $T = \log$, which implies that $F = \mathbb{R}^+$. If $O$ is multiplication, then one can solve the following equation analytically
$$O'(\log f, \log g) = \log(O(f, g)) = \log(f \cdot g), \quad \forall f, g \in \mathbb{R}^+ \tag{101}$$
to obtain $O' = +$, since $\log(f \cdot g) = \log f + \log g$.

In order to better understand our treatment of neighborhood-based mappings, we next consider the simplistic case of an m-ary partial function that is a bijection over the domain of the image operation $O$.

3.4.2.4. Assumption. Let $X_I = (\{F^X, O\}, \{T, \Theta\}, \{F^X, O'\}, T^{-1})$ denote an ICS such that
(i) $T: F^X \to F^X$ is based on a pointwise m-ary function $g: F^m \to F$,
(ii) $T^{-1}: F \to F^m$ exists only for those $|F|$ neighborhood configurations in domain($T$), and
(iii) $\Theta(O)(T(a)) = T(O(a))$ for those $a \in F^X$ for which condition (ii) holds.

3.4.2.5. Observation. Note that there are $|F|^m$ possible neighborhood configurations in $F^m$, but only $|F|$ mappings possible under constraint of the bijection that constitutes the partial function $T$. That is, if $T$ is an onto mapping, then the inverse $T^{-1}: F \to F^m$, if it exists, would be an into mapping.

3.4.2.6. Technique.
The equation in Assumption 3.4.2.4(iii) can be solved for $O'$, subject to the given constraints, after the method of Section 3.4.2.3. However, care must be taken in qualifying the constraints under which the solution holds. The following example is illustrative.

3.4.2.7. Example. Let Assumption 3.4.2.4 hold, and let $F = \mathbb{Z}_9$ with $X$ a 2x4-pixel array. Let the source image $a \in F^X$ shown in Figure 2a be transformed by $T: F^4 \to F$ to yield $a_c \in F^X$, as shown in Figure 2b. Define $T$ as follows, using the form $F^{2 \times 2} \to F^{2 \times 2}$ as applicable, for purposes of brevity:
(i) constituent maps on the 2x2-pixel arrays that contain a single nonzero value $f \in F$, where $f$ is a nonzero value in the 2x2-pixel array that is input to $T$;
(ii) constituent maps on the 2x2-pixel arrays that contain two nonzero values $f, g \in F$;
(iii) a constituent map on the 2x2-pixel arrays whose values $e, f, g, h \in F$ are all nonzero; and
(iv) $T$ is undefined elsewhere, for purposes of simplicity.

$$\text{(a)}\; a = \begin{pmatrix} 1 & 2 & 7 & 5 \\ 3 & 4 & 8 & 6 \end{pmatrix} \qquad \text{(b)}\; a_c = \begin{pmatrix} 1 & 4 & 8 & 5 \\ 3 & 2 & 7 & 6 \end{pmatrix} \tag{102}$$

Figure 2. (a) Example source image $a$ that is transformed by $T$ to yield (b) image $a_c$. Note reversal of the middle 2x2-pixel neighborhood only.

Let $O: F^X \to F^X$ be given by $O(a) = (a - 1) \bmod 9$. For example, $O(8) = 7$, $O(1) = 0$, etc. For the formulations of $T$, $O$, and $a$ given in Figure 2 and items (i)-(iv), it is easily verified by inspection of Figure 2 that $O$ preserves the ordering of pixel values in $a$ as constrained by the ordering of $X$. That is, $O$ does not transpose values in $a$. As a result, it is sufficient to state that $O' = O$ in this case. However, if $O(a) = (a - 6) \bmod 9$, then the right-hand two columns of pixels in $a_c$ will be $\begin{pmatrix} 2 & 8 \\ 1 & 0 \end{pmatrix}$ instead of $\begin{pmatrix} 8 & 5 \\ 7 & 6 \end{pmatrix}$, and the constituent map of item (iii), which requires nonzero values, can no longer be inverted to produce $a$ from $a_c$ under constraint of the existing definition of $T$. In such cases, $T^{-1}$ does not exist.

3.4.2.8. Remark. It is this difficulty with inverting constituent maps of $T$ that renders neighborhood-based bijections difficult (and generally impractical) for image compression.
In practice, a collection of neighborhoods (also called vectors) is often represented in terms of an exemplar that portrays a group of vectors within a given error criterion. Note that the preceding example shows why the determination of analogues to lossless (i.e., bijective) neighborhood transforms that are partial functions is dependent on the formulation of $T$, $O$, and $a$.

3.4.2.9. Observation. Note that the preceding example specifies $T$ as a space-variant spatial transform and $O$ as a grey-level transform. There, the processing sequence $O'(T(a))$ would yield a greylevel transform ($O'$) of the neighborhood-based spatial transform ($T$). This is similar in concept to the Data Encryption Standard, where greylevel and spatial transformations are alternated as application of the XOR and S-box sub-stages of each DES transform stage. However, we have noted in the preceding example that, unless $T$ and $O$ are chosen carefully in relation to $a$, the transformed image $a_c$ could have undefined values. Hence, $T^{-1}$ or $O'$ might not exist, or $T^{-1}(O'(T(a)))$ might be undefined. Noninvertibility would render $T$ useless for cryptographic applications, where isomorphisms are required.

However, we have found that this observation can be exploited to form the basis for a neighborhood-based transposition similar to that shown in Section 3.4.2.6(iii). By choosing $T$ and $O$ carefully, one can construct an encryption $T(O(a))$ that reliably encrypts prespecified sequences of symbols in the plaintext $a$ but discards (or irreversibly transforms) other sequences. Here, we say that $T$ irreversibly transforms a given plaintext $a$ because the inverse transform $T^{-1}(T(a))$ could exist outside the mappings that comprise $T$. Thus, a highly selective DES-like transform could be devised that would obliterate certain words or transform them into undefined values (also called noise values).
Alternatively, consider that the preceding concept can be applied inversely to the decryption process, since Assumption 3.4.2.4 defines $X_I$ to be an ICS. In other words, instead of obliterating certain symbolic sequences, the partial function $T$ and the operation $O$ could accept ciphertext and obliterate partial decryptions that did not meet prespecified constraints. Although we have not fully explored the potential of such operations, they appear to have greater utility in cryptanalysis than in cryptography. This is likely due to the fact that in cryptanalysis, one searches for a given set of plaintext sequences while attempting to process (decrypt) the ciphertext. Although the ciphertext sequences may be accurately surmised, the corresponding plaintext is usually guessed in a sketchy fashion only. Since this study does not include the derivation of novel cryptologic transformations, we defer further discussion of this technique to future research. We next consider the case of spatial warping transforms (Class 3 of our taxonomy).

3.4.2.10. Assumption. Let $X_I = (\{F^X, O\}, \{T, \Theta\}, \{F^Y, O'\}, T^{-1})$ denote an ICS such that
(i) $T: F^X \to F^Y$ is based on the spatial transform $g: Y \to X$ such that $T(a) = a \circ g$, $\forall a \in F^X$, where $a \circ g = \{(y, a(g(y))) : y \in Y\}$;
(ii) $T^{-1}: F^Y \to F^X$ exists and is based upon the spatial transform $g^{-1}: X \to Y$ such that $a = T(a) \circ g^{-1}$, $\forall a \in F^X$; and
(iii) $\Theta(O)(T(a)) = T(O(a))$.

3.4.2.11. Theorem. If Assumption 3.4.2.10 holds, then $O'$, the analogue of $O$ over range($T$), is given by
$$O' = O \circ g. \tag{103}$$

Proof. Assuming that the givens hold, if $g$ has an inverse $g^{-1}: X \to Y$, then $T^{-1}: F^Y \to F^X$ exists, such that
$$a = T^{-1}(a_c) = a_c \circ g^{-1} = a \circ g \circ g^{-1}. \tag{104}$$
Given a unary image operation $O: F^X \to F^X$ (chosen for purposes of simplicity), the analogous operation over range($T$), denoted by $O': F^Y \to F^Y$, can be derived from the preceding equation as
$$O'(a_c) = T(O(T^{-1}(a_c))) = (O(a_c \circ g^{-1})) \circ g = O(a) \circ g = (O \circ g)(a), \tag{105}$$
since $a_c = a \circ g$ and composition is associative. ∎

3.4.2.12. Remark.
The preceding theorem holds for spatial warping transforms $g$ that have an inverse $g^{-1}$. However, Equation 105 also shows us that $g^{-1}$ is not required in order to derive $O'$ from $O$. Since the spatial warping transforms constitute an important class of image processing transformations that includes the well-known affine transform, this is a useful result. Additionally, since the affine transform forms a basis for fractal-like encoding of images (a Class 4 transform), this result is pertinent to imaging practice.

3.4.2.13. Observation. The spatial warping transforms present a deceptively easy example of deriving analogues for Class 3 transforms. Consider the more difficult case, where $T$ is as defined in Assumption 3.4.2.10 but $O$ is also a spatial transform. In particular, let $O(a) = a \circ e$, where $e: Y \to X$ is not necessarily the inverse of $g$ and may perturb domain($a$) nonlinearly or noninvertibly. Since composition is not commutative, the commutativity diagram of Figure 1, which illustrates $O'(T(a)) = T(O(a))$, may not hold, since $O'(a \circ g) \neq (a \circ e) \circ g$ is possible for some $e$. That is, the equality $O'(a_c) = O'(a \circ g) = (O \circ g)(a)$ may not hold because composition is not commutative. Thus, only certain types of spatial transforms are admitted by Theorem 3.4.2.11, namely, those that are commutative and invertible, as stated in the following theorem.

3.4.2.14. Theorem. Let Assumption 3.4.2.10 hold and let $a \in F^X$ and $e: Y \to X$ be a spatial warping transform such that $O(a) = a \circ e$. If $e$ commutes with $T$'s basis mapping $g$ and both are invertible, then $O$ has an analogue $O' = O \circ g$.

Proof. The proof follows directly from the proof of Theorem 3.4.2.11 and the preceding discussion. ∎

3.4.2.15. Remark. The non-commutativity that prevents Theorem 3.4.2.11 from holding for all (discrete as well as continuous) domains $X$ is exemplified by the case of an affine transform $g$ that is first applied to discrete imagery which is then transformed by $O(a) = a \circ e$.
In such cases, quantization and interpolation error dictates that $g \circ e \neq e \circ g$, although $g$ and $e$ may be linear and invertible within a prespecified error constraint $\varepsilon$. Thus, we prefer to depict discrete spatial transforms that involve interpolation and quantization as being lossy and having $\varepsilon$-near approximations to their inverses.

The preceding discussion of a lossy transform leads naturally to a discussion of lossy neighborhood mappings. Such mappings can be found in Classes 1, 3, and 4 of our taxonomy, and comprise the majority of useful transforms in Class 4.

3.4.2.16. Definition. A lossy neighborhood transform $T_{LN}: F^X \to G^Y$ is a noninvertible, m-ary onto mapping, where $Y = X$ and $G = F$ are possible. $T_{LN}$ is expressed at medium granularity as ${}^{M}T_{LN}: F^m \to G$.

3.4.2.17. Example. A local averaging transform that incorporates thresholding of the averaged result is an example of a lossy neighborhood transform. Other common examples include median filtering and neighborhood-based contrast enhancement. In such cases, the source image cannot be exactly retrieved from the compressed image that has been operated upon by an analogous operation.

3.4.2.18. Observation. Since $T_{LN}$ is an onto mapping but is many-to-one, it is not invertible. As a result, we cannot construct an ICS based on $T_{LN}$, and we must instead use an approximate ICS based on an $\varepsilon$-near approximation to the inverse of $T_{LN}$, as follows.

3.4.2.19. Assumption. Let the $\varepsilon$-near approximate ICS
$$X^* = (\{F^X, O\}, \{T_{LN}, \Theta\}, \{G^Y, O'\}, T_{LN}^*) \tag{106}$$
be such that $T_{LN}$ conforms to Definition 3.4.2.16.

3.4.2.20. Technique. When deriving $O'$ over range($T_{LN}$), we primarily consider the following cases:
(i) Linear or nonlinear operations may comprise $T_{LN}$, such that at a given encoding block's target point $x \in X$, the transform output approximates the source image value $a(x)$.
(ii) $T_{LN}$ may be primarily comprised of a spatial transform $g: Y \to X$, with greylevel manipulations that only slightly perturb the transform output values with respect to the corresponding source image values. For example, consider image rotation with interpolation. In such cases, the transformed value $a_c(y)$, where $y \in Y$, closely approximates the source image value $a(g(y))$.
(iii) Occasionally, a subset of range($a$) is exactly represented in $a_c = T_{LN}(a)$. For example, a constant-valued encoding block may be encountered by a compression transform that encodes blocks according to their mean values (e.g., BTC or VPIC).

If Case (i) or (ii) holds, then it is possible that the analogue $O'$ of an image operation $O$ may be approximated by $O$, due to the approximation of values in domain($T_{LN}$) by corresponding values in range($T_{LN}$). In Case (iii), the equality $O' = O$ holds only for neighborhoods in $a$ and $a_c$ that have values such that $a|_{h^{-1}(y)}$ can be retrieved from $a_c(y)$, where $y \in Y$. From such simple cases, we progress to the more realistic case of neighborhoods that are represented by an exemplar, under constraint of a prespecified error criterion.

3.4.2.21. Technique. Let a neighborhood transform $T_{LN}: F^X \to G^Y$ be expressed at medium granularity as ${}^{M}T_{LN}: F^m \to G$, and let $T_{LN}$ generate a codebook $c \in (F^m)^G$. Given a set $B$ of encoding blocks in $F^m$, the codebook is constructed by a clustering algorithm that groups $B$ into sets $B(g)$, for each $g \in G$ for which an exemplar $c(g)$ is to be specified. The extent of this process can be constrained by a prespecified limit $\varepsilon$ on the error with which vectors in $B(g)$ are represented by $c(g)$. An additional constraint is the codebook size $|G|$. For example, $\varepsilon$ could be an MSE criterion, such that
$$\| b - c(g) \| < \varepsilon, \quad \forall b \in B(g). \tag{107}$$
In practice, $T_{LN}$ is often called a vector quantization transform.
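The codebook construction described above can be sketched in a few lines. This is only an illustration under stated assumptions: the source does not specify a clustering algorithm, so a greedy "leader" clustering is substituted here, the Euclidean norm stands in for the error criterion of Equation 107, and the names `build_codebook`, `blocks`, and `eps` are hypothetical.

```python
import math

def build_codebook(blocks, eps):
    """Greedy leader clustering (an assumed stand-in for the clustering step).

    Each block b joins the first exemplar c(g) with ||b - c(g)|| < eps;
    otherwise b becomes a new exemplar. Returns the codebook c (a list
    indexed by g) and the index list a_c (one index g per block position y).
    """
    codebook = []   # c: index g -> exemplar in F^m
    indices = []    # a_c: block position y -> index g
    for b in blocks:
        for g, c_g in enumerate(codebook):
            if math.dist(b, c_g) < eps:      # error criterion of Equation 107
                indices.append(g)
                break
        else:
            codebook.append(tuple(b))        # b is its own exemplar (error 0)
            indices.append(len(codebook) - 1)
    return codebook, indices

# Four encoding blocks with m = 4 values each (illustrative data only).
blocks = [(0, 0, 0, 0), (1, 0, 0, 0), (10, 10, 10, 10), (10, 11, 10, 10)]
codebook, indices = build_codebook(blocks, eps=2.0)
print(codebook, indices)
```

Because every exemplar is itself a member of the cluster it represents, each block lies within `eps` of its exemplar by construction (for `eps` greater than zero), at the cost of a codebook whose size is not bounded in advance.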
Since $c(g)$ represents encoding blocks in $B(g)$ within an error $\varepsilon$, we say that $T_{LN}$ has an $\varepsilon$-near approximation to its inverse, denoted at medium granularity by ${}^{M}T_{LN}^*: G \to F^m$. As a result, there are two strategies that can be employed in processing an image $a$ compressed by $T_{LN}$ to yield $a_c$:

Method 1. Decode the y-th block in $a_c$ by applying $T_{LN}^*$ to the block index $g \in G$ to yield the y-th source block $b$. Apply $O$ to $b$, then use customary tree-based search strategies to find the exemplar $c(h)$ that best represents $O(b)$. Encode the processed image $\tilde{a}_c = O'(a_c)$ as $\tilde{a}_c(y) = h$.

Method 2. Alternatively, one can apply $O$ to each codebook exemplar in $c$ to yield a new codebook $d$. Then, $a_c$ references $d$ instead of $c$.

3.4.2.22. Observation. With respect to the preceding technique, note the following salient advantages and disadvantages:

Method 1.
Advantage: The codebook is not modified, and thus retains the representation of the training set upon which $T_{LN}$'s design was based.
Disadvantage: The transformed image must be inverse-transformed blockwise prior to processing by an analogous or dual operation, which contradicts the key premise of compressive processing, namely, the processing of imagery in compressed form.
Disadvantage: Since $|Y| < |X|/m$ is typical for compressive applications, a given exemplar is likely to be processed by $O'$ more than once, which is inefficient.
Disadvantage: The block configuration represented by $O(b)$ may not be represented by any exemplar in $c$ within the error limit $\varepsilon$ upon which the design and construction of $c$ was based.

Method 2.
Advantage: Each exemplar is processed only once, requiring $O(|Y|)$ work.
Advantage: The image need not be decompressed. This maintains such data security as may be inherent in the compressed format, since $O'$ does not require an association between range($a_c$) and domain($c$). In fact, $O'$ never sees the placement of source image values in $a_c$, since $O'$ inputs values in range($a_c$) only as indices of $c$.
Advantage: The new codebook represents blocks produced by $O(T_{LN}^{-1}(a_c))$, instead of approximating the output of $O'$ with an existing exemplar.
Disadvantage: A new codebook is produced, whose statistics may vary from those of $c$, and which may not represent statistics of the training set from which $c$ was abstracted.

3.4.2.23. Remark. The advantages of Method 2 of Technique 3.4.2.21 outweigh those of Method 1, with the exception that the codebook statistics can be altered by Method 2. One could argue that a solution to this problem involves unioning the codebooks $c$ and $d$ within a given error constraint on exemplar similarity. However, codebook size is thus increased, which is clearly disadvantageous for compressive processing, when the net efficiency $\eta = O(CR_d)$. This problem is further addressed in Chapter 9, when we discuss VQ-based processing, together with the associated codebook representation error and the resultant reconstruction error that appears after a compressive operation's output is decompressed. This is an important implementational issue, due to the accrued error that results from cascaded compressive operations.

We next consider problems of coordinate-set manipulation that occur during the processing of block-encoded imagery using analogues or duals of binary pointwise operations.

3.4.2.24. Observation. Let Assumption 3.4.2.19 hold, and let the lossy neighborhood transform $T_{LN}$ tessellate two images $a, b \in F^X$, thereby producing the corresponding compressed images $a_c, b_c \in G^Y$. For each $y \in Y$, $a_c(y)$ references a block in $a$ (and likewise for $b_c$ and $b$). If $T_{LN}$ tessellates $a$ and $b$ identically, then it is possible that $O'$ can process over $a_c$ and $b_c$ in $O(|Y|)$ time, since the $y$-th blocks of $a_c$ and $b_c$ can be input to $O'$. Thus, only $|Y|$ invocations of $O'$ are required. However, this condition occurs only when $T_{LN}$ employs an indexing function $h: X \to Y$ that identically tessellates the domains of $a$ and $b$.
If h produces encoding block domains that are space-invariant, we call this fixed-block encoding. If domain(a) is tessellated isomorphically to domain(b), we have isomorphic encoding, which is usually employed in conjunction with fixed-block encoding. |
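Method 2 of Technique 3.4.2.21 can be illustrated for a fixed-block, vector-quantized image and a unary pointwise operation. The sketch below is an assumption-laden toy rather than the dissertation's implementation: the codebook, the index image, the operation `negate`, and all function names are hypothetical. It applies the operation once per exemplar to obtain a new codebook `d`, and the compressed image (a list of indices) is left untouched.

```python
def decode(indices, codebook):
    """Approximate inverse: replace each block index by its exemplar."""
    return [codebook[g] for g in indices]

def apply_pointwise(op, block):
    """Apply a unary pointwise operation to every pixel of one block."""
    return tuple(op(v) for v in block)

def negate(v):
    """Example operation O: greylevel negation of an 8-bit pixel."""
    return 255 - v

# Hypothetical codebook c of three 4-pixel exemplars and a compressed
# image a_c holding six block indices.
c = [(0, 0, 0, 0), (10, 20, 30, 40), (255, 255, 0, 0)]
a_c = [0, 1, 1, 2, 0, 1]

# Method 2: process each exemplar once; a_c now references d instead of c.
d = [apply_pointwise(negate, exemplar) for exemplar in c]
result_method2 = decode(a_c, d)

# Reference: decode every block first, then process each decoded block.
result_reference = [apply_pointwise(negate, blk) for blk in decode(a_c, c)]
```

For a pointwise operation the two results coincide, while Method 2 invokes the operation on 3 exemplars rather than on 6 decoded blocks. The statistics of `d` differ from those of `c`, which is the disadvantage noted in Observation 3.4.2.22.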

Full Text |

ON THE PROCESSING OF COMPRESSED AND ENCRYPTED SIGNALS AND IMAGERY, WITH APPLICATIONS IN EFFICIENT, SECURE COMPUTATION

By MARK STEVEN SCHMALZ

A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

UNIVERSITY OF FLORIDA 1996

Copyright © 1996 by Mark S. Schmalz

ACKNOWLEDGMENTS

Many thanks are given to my family, for their loving support and encouragement, and to Drs. Ritter, Wilson, Laine, Rajasekaran, and Harris for their advisement of this work and associated publication efforts. Special gratitude goes to Gerhard Ritter, for his patient advice and encouragement in the development of unifying theory.

TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF FIGURES
LIST OF TABLES
ABSTRACT
1. INTRODUCTION
1.1. Study Overview
1.2. Previous Work
1.3. Technical Approach
1.4. Novel Claims
1.5. Implementational Advantages and Disadvantages
2. REVIEW OF NOTATION
2.1. Overview of the Image Algebra (IA) Subset
2.2. Study Notation
3. FUNDAMENTAL THEORY
3.1. Mathematical Concepts
3.2. Image Compression, Encryption, and Compressive Computation
3.3. Taxonomy of Image Transformations
3.4. Class-Specific Derivational Techniques
4. COMPRESSIVE PROCESSING - COMPUTATIONAL COMPLEXITY AND DATA SECURITY
4.1. Complexity of Image Algebra Operations
4.2. Complexity of Compressive Computation
4.3. Feasibility of Compressive Computation
4.4. Data Security
4.5. Feasibility of Encryptive Computation
5. COMPUTATIONAL ERROR AND INFORMATION LOSS
5.1. Theory of Error Propagation in Discrete Systems
5.2. Error Propagation in Discrete Image Algebra Operations
5.3. Theory of Error-Tolerant Computation
5.4. Feasibility of Error-Tolerant Computation
5.5. Information Theory and Error-Tolerant Computation
6. CLASS 1 TRANSFORMATIONS
6.1. Substitutional Ciphers
6.2. Transpositional Ciphers
6.3. Linear Transforms
7. CLASS 2 TRANSFORMATIONS
7.1. Generalized Class 2 Transform
7.2. Pixel Value Compression via Bit Slicing
8. CLASS 3 TRANSFORMATIONS
8.1. Affine Transformation
8.2. Spatial Transformation by Pixel Selection
8.3. Transpositional Ciphers
9. CLASS 4 TRANSFORMATIONS
9.1. Block Encoding
9.2. Sparse Matrix Processing
9.3. Transform Coding
9.4. JPEG
9.5. Block Truncation Coding and VPIC
9.6. Vector Quantization
10. APPLICATIONS OF COMPRESSIVE IMAGE PROCESSING
10.1. Compressive Smoothing
10.2. Compressive Edge Detection
10.3. VPIC Morphological Operations
10.4. High-Level Compressive Computations
11. APPLICATIONS IN PARALLEL COMPUTING
11.1. Effect of Domain Compression Ratio
11.2. Effect of Range Compression Ratio
11.3. Simplification of Operations
11.4. Partitioning Efficiency
12. CONCLUSIONS
12.1. Conclusions
12.2. Open Issues and Future Work
REFERENCES
BIOGRAPHICAL SKETCH

LIST OF FIGURES

1. Commutativity diagram for processing compressed or encrypted imagery with unary operations.
2. (a) Example source image $a$ that is transformed by $T$ to yield (b) image $a_c$. Note reversal of the middle 4 x 4-pixel neighborhood only.
3. Example of block fragmentation in pointwise compressive operations over non-identically tessellated domains: (a) representation of the $y$-th block in $a$ (solid line) and $b$ (dotted line), (b) resultant block decimation required to process the compressed images $c_c = a_c \, O' \, b_c$ such that corresponding sub-blocks are amenable to pointwise combination.
4. Commutativity diagram of a compressive computational system with multiple encoding transforms $T_i$ and multiple analogues $O'_i$ of image operation $O$.
5. Commutativity diagram of a compressive computational system with multiple encoding transforms $T_i$, $i \in \mathbb{Z}_C$, and multiple analogues $O'_{j,i}$, $j \in \mathbb{Z}_O$, of the $j$-th image operation $O_j$, where $C = 2$ and $O = 2$.
6. Information-theoretic model of communication.
7. Information-theoretic model of communication.
8. Arrangement of noise levels in a signal $a$ with entropy $H(a)$ that partially occupies a channel of bandwidth $B$ having $K$ noise bits.
9. Local averaging of a noisy, two-dimensional, 8-bit image transformed by $T(x) = (x + 128) \bmod 256$: (a-b) source image and template $a$ and $t$; (c-d) transformed image and template $a_c$ and $s$; (e) transformed result $b_c$; (f) image-domain result $b = T^{-1}(b_c)$.
10. Dual of pointwise multiplication over the range space of the pointwise linear transform $T(x) = cx + d$.
11. Recovery of source image $a$ from image sum of contrast-stretched imagery: (a-b) source images $a$ and $b$, (c-d) linear contrast-stretched images $a_c = T(a)$ and $b_c = T(b)$, (e) $c_c = a_c + b_c$, (f) recovery of $a$ from $c_c$ using $a = T^{-1}(c_c - T(b))$.
12. Pixel indexing and overlap scheme for information loss analysis of an affine transform: (a) indexing scheme and (b) example of pixel overlap.
13. Local averaging of a rotated Boolean bar chart using a rotated template: (a-b) source image $a$ and template $t$; (c-d) rotated image $a_c = T(a)$ and template $s$ formed by applying $T$ to the weight matrix of $t$, then restricting to the 3 x 3-pixel Moore configuration; (e) rotated locally-averaged image $a_c \oplus s$; (f) derotated image $T^{-1}(a_c \oplus s)$.
14. Error images of local averaging using rotated and unrotated templates: (a-b) error images $e$ (Equation 307) and $f$ (Equation 308), taken from the central (uncropped) portion of domain($a$); (c-d) histograms of error images $e$ and $f$.
15. Error and efficiency measures associated with JPEG addition over natural scenes having 3 bits/pixel error.
16. Codebook growth and error as a function of codebook size and Methods 1-3 for VQ pointwise addition: (a) input and output codebook sizes $M$ and $N$, (b) input and output errors.
17. Block configuration for compressive local averaging, showing block neighborhoods (dashed boxes) on $X$ returned by $c$, $r$, and $j$.
18. Example of compressive smoothing at $CR_d$ = 16:1: (a) source image, (b) locally averaged image using unitary von Neumann template, (c) compressed array of block means over which processing actually occurs, (d) decompressed result of compressive smoothing.
19. Example of VPIC coding of very low-resolution Boolean imagery: (a) source image, (b) VPIC representation of (a), where $M_x$ denotes a mean block of mean $x$.
20. Example of VPIC coding of very low-resolution Boolean imagery: (a) boundary detection of Figure 19(a), (b) VPIC edge detection of Figure 19(a).
21. Error analysis of the edge detector in Figure 20(b), in terms of erroneous source pixels per encoding block.
22. Greylevel edge detection with VPIC: (a) source image, (b) Sobel edge detection, (c) 4 x 4-pixel VPIC edge detection using a codebook similar to that given in Example 10.2.2.3, (d) noise and representational error as a function of VPIC blocksize ($k \times k$ pixels) for underwater and land-based imagery.
23. Target characterization using VPIC codebook exemplars with indices in {1,2,3}: (a) VPIC codebook, (b) source image with unitarily-valued target region, (c) compressed representation, where 1' denotes exemplar 1 rotated by 90 degrees.
24. Example of VPIC-based target recognition: (a) source image, (b) BTC-compressed image over a portion of which the target recognition algorithm (Equation 410) computes, (c) decompressed target location image.
25. Conversion of a SIMD-parallel mesh to pipelined computation using compressive processing: (a) input and compute over compressed image 1, (b) input compressed image 2 and compute over image 1, (c) input compressed image 3 and compute over images 1 and 2, (d) input compressed image 4, compute over compressed images 1-3, and output compressed image 1.

LIST OF TABLES

1. Costs involved in SIMD-parallel computation of a 2-pixel image-template convolution versus a 5-pixel LUT operation over VPIC-format imagery.
2. Processor cycles incurred by SIMD-parallel computation of a 2-pixel image-template convolution versus a 5-pixel LUT operation over VPIC-format imagery.

Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

ON THE PROCESSING OF COMPRESSED AND ENCRYPTED SIGNALS AND IMAGERY, WITH APPLICATIONS IN EFFICIENT, SECURE COMPUTATION

By Mark Steven Schmalz

August 1996

Chairman: Gerhard X. Ritter
Major Department: Computer and Information Science and Engineering

In this dissertation, we develop theory and analyses pertinent to the processing of compressed or encrypted imagery, called compressive processing or encryptive processing. Encryptive processing, which is a long-standing goal of computer science, exploits the processing of secure (encrypted) operands to yield a secure computation. Unfortunately, development has been hindered by an obscure understanding of mathematical concepts fundamental to the encryption process. Similarly, the processing of compressed imagery can yield computational efficiencies as a result of fewer input data. However, such speedups are not uniform across commonly-used image compression formats and may not exist for certain operations over the range spaces of given transforms.
Additionally, the formulation of compressive transforms can be quite involved, which intuitively suggests that the derivation of compressive computational operations is difficult in closed form.

We first discuss theory that is basic to an understanding of compressive and encryptive processing. Using image algebra, a rigorous set-theoretic and functional mapping notation, we derive novel algorithms for implementing selected pointwise arithmetic, neighborhood, and domain-specific operations on transformed data. Our algorithms are both feasible and portable to numerous computers, since image algebra has been implemented on a variety of serial and parallel processors, including the Connection Machine, MasPar SIMD processor, ERIM's Cytocomputer, Alliant Tech's PREP, Inmos' Transputer architecture, and the Martin-Marietta GAPP-IV processor upon which the PAL architecture is based.

Subsequent implementational discussion emphasizes the feasibility of sequential and parallel compressive computation. We show that compressive processing methodologies can be mapped to a variety of well-known architectures, especially SIMD array processors. Analyses focus upon time complexity, cost, and error due to information loss. For example, we show that compressive processing can reduce the processor count in parallel architectures without compromising the computational speedup obtained through parallelism. A further advantage of selected compressive transforms is their ability to facilitate the mapping of costly algorithms such as edge detection and component labelling to compressive operations implemented in terms of lookup tables. Such techniques require only I/O operations and can be stored in local memory in certain SIMD-parallel processors.

CHAPTER 1 INTRODUCTION

The progress of image and signal processing research has long been hindered by the insufficient computational bandwidth of available computing machinery.
Due to the large data burden presented by high-resolution imagery, near-real-time processing of large images (e.g., surveillance or medical imagery) is especially costly. Recent research [1-6] has explored the possibility of processing compressed imagery, based on the observation that compressed images have fewer data, as well as the conjecture that fewer input data require fewer operations. In practice, we have found that certain image processing operations can be mapped to the range spaces of specific transformations, such that decompressing the processed compressed image yields an exact representation (or an approximation) of the processed source image. We call such transform-regime operations analogues of the corresponding image operations over the given transform's range space. Computing an image operation, or a sequence of such operations, using compressed imagery is called compressive processing. In practice, it is possible that certain formulations of analogous operations can yield a computational speedup since: (a) the analogous operations may be more efficient than the corresponding image operation, or (b) fewer data may be processed by such operations.

Similar to the concept that computational speedup could accrue from the processing of compressed imagery, it has been observed that data security could be maintained throughout a computational system by appropriately processing encrypted data. In contrast to compressive transforms, which are extensively discussed in the literature, the formulation of numerous practical encryptive transforms remains obscure. Additionally, we note that many compressive transforms can be thought of as having weak encryption capabilities, due to the visual obscurity of compressed data formats.

1.1. Study Overview

In this dissertation, we address the following key questions:

1. Can compressive or encryptive transforms be classified in a manner that facilitates convenient derivation of analogues to common image processing operations?

2. Given the aforementioned classification scheme and a particular transform class, can methods for deriving analogues over the range spaces of multiple transforms in that class (or a relevant subclass) be elucidated? If not, then why not? If so, then how can one employ such methods effectively, and to what types of image operations or operands do they apply?

3. Given the presence of noise and computational error in imagery and image algorithms, respectively, what is the noise sensitivity of compressive processing? This is an important implementational issue, since image compression decreases redundancy in the compressed image. In the source image, noise in a given pixel is generally localized to the pixel's domain point. However, due to reduced redundancy in the compressed image, noise in a given pixel value of a compressed image could cause perturbation of multiple pixel values in the corresponding decompressed image.

4. Assuming that we can derive an analogue of an image operation over the range space of a compressive transform, what computational advantages can be obtained via image processing with such analogous operations? Additionally, is it possible to cascade compressive operations, thus facilitating algorithm development in the transform domain via composition of known analogues?

5. What additional insights, effects, or implementational advantages could accrue from compressive or encryptive processing? How would such concepts be useful in computer science?

Given the preceding questions, we conclude this chapter with a brief perspective on the small amount of previous work reported in the literature, as well as comments on methods and potential advantages of our approach.
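The notion of an analogue introduced in this chapter can be illustrated with a deliberately simple case. The following sketch is an illustration under stated assumptions, not an example taken from the dissertation: the transform is taken to be a block-mean encoding of a one-dimensional signal, the image operation is pointwise addition, and the names `block_means`, `add`, and the block size `k` are hypothetical. For this pair, adding the compressed signals gives exactly the compressed form of the sum, while touching a factor of `k` fewer values.

```python
from fractions import Fraction

def block_means(signal, k):
    """Assumed transform T: replace each k-sample block by its mean.

    Fractions keep the arithmetic exact so equality can be checked.
    """
    return [Fraction(sum(signal[i:i + k]), k) for i in range(0, len(signal), k)]

def add(u, v):
    """Pointwise addition; used both as the image operation O and,
    over block means, as its analogue O'."""
    return [x + y for x, y in zip(u, v)]

a = [1, 3, 5, 7, 10, 10, 12, 12]
b = [2, 2, 2, 2, 0, 4, 0, 4]
k = 4

a_c = block_means(a, k)                     # T(a): 2 values instead of 8
b_c = block_means(b, k)                     # T(b)
compressive_result = add(a_c, b_c)          # O'(T(a), T(b))
reference_result = block_means(add(a, b), k)  # T(O(a, b))
```

Here the analogue performs 2 additions where the image-domain operation performs 8, which corresponds to case (b) above: fewer data are processed. Block-mean encoding is lossy, so decompressing the result approximates the processed source signal rather than reproducing it.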
Chapter 2 begins with an overview of notation, and Chapters 3-5 summarize theory that supports the derivation of analogues of image processing operations. We choose several commonly-employed transforms from each class of our transform taxonomy. Given a transform $T$, we discuss the derivation of selected pointwise, global-reduce, and image-template operations over range($T$). In certain cases, we have discovered methods for deriving analogous operations that are applicable to a large subclass of transforms in the given class.

The latter portion of this dissertation contains an implementational discussion that emphasizes three topics. First, we consider the effects of compressive image processing with three specific transforms (JPEG [7], Vector Quantization [8], and Visual Pattern Image Coding [9]). Second, we analyze the propagation of error through a discrete compressive processing system, in terms of the communications and computational errors manifested by given operations over the range space of each transform. Third, we discuss problems and pitfalls related to the implementation of such analogous operations on sequential and parallel processors. Implementational issues include space complexity as well as limits on computational and communications bandwidth. In particular, we discuss the reduction in parallelism that may be achieved via compressive processing on SIMD-parallel meshes. Such effects pertain primarily to medical and military applications in real-time image processing at high frame rates.

Conclusions and suggestions for future work emphasize the impact upon image processing that compressive computation may produce, under the assumption that error propagation can be predicted and controlled. An additional assumption pertains to the availability of analogous operations over the range spaces of compressive transformations that may be obtained from future development efforts.
Open issues, such as the elucidation of general derivational methods for analogues over a transform class or subclass, are also considered.

1.2. Previous Work

The computational advantages of processing fewer data were long ago realized in the processing of reduced matrices [10,11]. Additionally, electrical engineering practice has long emphasized operations over parameter spaces that approximately characterize input signals. Such feature-space approximations usually implement one or more forms of data reduction, but can require exotic algorithms or architectures for computing simple operations [12]. Although of implementational interest to certain mission-specific efforts, such feature-space operations generally exhibit disadvantages of high space complexity as well as costly hardware and software development. Thus, such methods are not generally attractive for commercial applications. As a result, additional research is required in the processing of compressed data.

In order to acquaint the reader with the scope of this study and provide salient detail, we begin our literature review with a discussion of transforms that have been developed specifically for the purpose of data compression (Section 1.2.1). This overview supports the primary topic of this study, namely, compressive processing. We then review data encryption (Section 1.2.2), which provides background for a summary of encrypted data processing (Section 1.2.3). Such background leads naturally to a discussion of processing compressed signals and imagery, which is given in Section 1.2.4.

1.2.1. Data Compression

Research endeavor in signal and image compression has been extensive, due primarily to the economic benefits of exploiting communication channel bandwidth. In principle, compression is achieved by reducing redundancy in the source data. For example, an image of constant value $k$ can be represented by one number ($k$), regardless of image size.
In contrast, an image consisting of randomly chosen pixel values cannot be compressed. Early on, it was recognized that the first derivatives of many signals exhibited lower information content than the source signal. This led to the development of primitive methods of image compression such as delta modulation [13] and differential pulse code modulation (DPCM). As a consequence of the need for accurate, efficient quantizers in PCM and DPCM, a variety of statistical compression techniques were developed, including adaptive DPCM, which were based upon adaptive or recursive quantization techniques [14].

As digital images increased in size and thus required higher channel bandwidth and increased storage, data compression research began to emphasize two-dimensional transformations. For example, the concept of subdividing an image into blocks (generally of rectangular shape) gained popularity due in part to the limited memory then available on fast signal processors. Such methods, called block encoding (BE), tessellate the image domain into encoding blocks that exhibit greylevel configurations which can be represented in lossless form (e.g., via indexing) or with a lossy transform such as vector quantization (VQ). Since there are fewer block configurations than there are possible blocks, and a group of values is represented by an index or by an exemplar block (also called a vector), VQ produces the desired effect of image compression. By arranging the input data to achieve maximum intra-vector and minimum inter-vector correlation, the compression ratio of VQ can be increased, which partially offsets the effort required by the determination of the VQ codebook (a set of exemplar vectors) [15].
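The differencing idea behind delta modulation and DPCM mentioned above can be sketched in a few lines. This is only the lossless skeleton, offered as an assumption-laden illustration: practical DPCM quantizes the prediction error and may use a more elaborate predictor, and the function names here are hypothetical. The point is that the residuals of a slowly varying signal span a much smaller range than the samples themselves, so they can be coded with fewer bits.

```python
def dpcm_encode(samples):
    """Replace each sample by its difference from the previous sample.

    The predictor is simply 'previous sample' (0 before the first one),
    and no quantizer is applied, so the code is exactly invertible.
    """
    residuals = []
    previous = 0
    for s in samples:
        residuals.append(s - previous)
        previous = s
    return residuals

def dpcm_decode(residuals):
    """Rebuild the samples by accumulating the residuals."""
    samples = []
    previous = 0
    for r in residuals:
        previous += r
        samples.append(previous)
    return samples

scanline = [100, 102, 103, 103, 105, 108, 110]
residuals = dpcm_encode(scanline)
```

After the first value, every residual of this hypothetical scanline has magnitude at most 3, whereas the samples themselves need about 7 bits each.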
An alternative method of block encoding, called block truncation coding (BTC) [16], encodes regions of low greylevel variance in terms of a mean value and regions of high variance in terms of a mean, standard deviation, and a residual bitmap that denotes the positions of zero crossings. Unfortunately, BTC is expensive computationally, due to the adjustment of bits in the bitmap to effect reduced entropy. Since the cost of BTC increases exponentially with blocksize [9,16], and BTC has a compression ratio that is moderate by today's standards (typically 20:1 to 25:1 for images of natural scenes), BTC is not considered a compression transform of choice.

Additionally, by transforming the output of a block encoder with, for example, the Fourier or Cosine transforms, then selecting transform coefficients that are deemed significant a priori, one can further reduce the image data burden, although at the cost of information loss. Such methods are generally called transform coding [17], and feature prominently in compression schemes (e.g., JPEG, MPEG) that are currently in vogue for digital telephony applications [18]. By following the coefficient selection stage (i.e., quantization step) with a provably optimal, lossless compression transform such as Huffman encoding [19], one can obtain further data reduction and thus achieve higher compression ratios without incurring information loss.

More exotic methods of image coding are based upon the reduction of an image to significant eigenvalues, as in the Singular Value Decomposition (SVD) [20]. Although proven optimal for image compression, eigenvalue transforms such as the SVD and Karhunen-Loeve transform (KLT) are computationally burdensome, requiring $\Theta(n^4)$ work for an $n \times n$-pixel image. Thus, the SVD and KLT are infrequently employed, despite the ease with which significant transform coefficients (eigenvalues) may be obtained from the transformed image.
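Returning to block truncation coding as described at the start of this passage, a single block can be encoded and decoded roughly as follows. This is a hedged sketch of the textbook moment-preserving form of BTC, not necessarily the variant of Reference 16: it omits the bitmap adjustment step mentioned above, and the threshold `flat_threshold` and all function names are assumptions made for illustration.

```python
import math

def btc_encode(block, flat_threshold=1.0):
    """Encode one block (a flat list of greylevels).

    Low-variance blocks are stored as a mean only; high-variance blocks
    as mean, standard deviation, and a bitmap marking pixels >= mean.
    """
    m = len(block)
    mean = sum(block) / m
    std = math.sqrt(sum((v - mean) ** 2 for v in block) / m)
    if std <= flat_threshold:
        return ("flat", mean, m)
    bitmap = [1 if v >= mean else 0 for v in block]
    return ("edge", mean, std, bitmap)

def btc_decode(code):
    """Rebuild a two-level block whose mean and variance match the source."""
    if code[0] == "flat":
        _, mean, m = code
        return [mean] * m
    _, mean, std, bitmap = code
    m, q = len(bitmap), sum(bitmap)   # q = number of pixels at the high level
    low = mean - std * math.sqrt(q / (m - q))
    high = mean + std * math.sqrt((m - q) / q)
    return [high if bit else low for bit in bitmap]

# A hypothetical 4 x 4 block containing a bright region on a dark background.
edge_block = [10, 10, 12, 200, 210, 205, 11, 9,
              10, 198, 202, 10, 12, 11, 10, 201]
decoded = btc_decode(btc_encode(edge_block))
flat_code = btc_encode([50] * 16)
```

The two reconstruction levels are chosen so that the decoded block has the same mean and variance as the source block; a constant block collapses to its mean, which is the situation in which a block is represented exactly.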
An alternative method of compression by recursive decomposition, which is often based upon knowledge derived from observation of the human visual system (HVS), has been employed with some success but tends to be data-dependent. For example, early attempts at multifrequency decomposition, such as the Synthetic High technique [17], eventually led to Fourier-transform based methods currently known as wavelet-based compression [21]. Similarly, Barnsley et al. [22,23] have published extensively in the area of image compression based on recursive, VQ-like decompositions that derive their methodology from the collage transform and from concepts of fractal geometry [24]. Due to an obscure formulation and the high cost of the forward transformation, fractal-based compression remains an experimental technique.

Recently-published research by Chen and Bovik [9] has reconsidered the problem of compression based upon HVS-based knowledge of exemplar patterns that may be used by the visual cortex to partition imagery. For example, the visual cortex is known to contain simple, complex, and hypercomplex cells [25] that mediate receptive fields which detect bar- or wedge-shaped gradients, as well as feature orientation and (possibly) shape. Chen has exploited this information to modify block truncation coding by replacing the costly bitmap construction step with a simple correlation step that finds the best match between the zero crossing configuration and a small set of exemplar blocks. This method, called visual pattern image coding (VPIC), yields high computational efficiency with respect to BTC or VQ and combines advantages of both methods. Given the appropriate set of exemplar patterns, Chen has demonstrated (on several standard images) high reconstruction fidelity at compression ratios of 20:1 to 40:1, which appears to be superior to JPEG's performance at such compression levels.
We next consider the allied topic of data encryption, from which area of study signal and image compression theory arose.

1.2.2. Data Encryption

The development of data encryption transforms has long been of interest due to applications in military science and statecraft [26]. More recently, business communications research has shown great interest in data encryption, as applied to the security of financial transactions. Beginning with monoalphabetic substitutions and transpositional ciphers [27], data encryption progressed to anagrammatic ciphers, as well as polyalphabetic encryptions based on the Vigenere cipher [28]. The vulnerability of such methods to attack based upon the ciphertext histogram and knowledge of plaintext statistics led to development of the Vernam cipher [29], which is based upon modulo-n addition and subtraction. The Vernam cipher, although used extensively during WWI, can be compromised if an intermediate result is detected, or if a portion of the plaintext is known.

In response to this situation, development centered on Hill's linear k-gram cipher [27] and rotor machines that were sophisticated electro-mechanical implementations of the Vigenere cipher. The Hill cipher requires solution of a system of $k$ linear equations, in which the alphabetic indices of $k$ elements of plaintext are multiplied by an encryption matrix to yield ciphertext. The product of an inverse of the encryption matrix and the ciphertext produces the plaintext. Although the Hill cipher was initially thought to be secure, it was quickly proven that linear-algebraic attack could compromise the encryption in time polynomial in $k$ [27].
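The Hill cipher and its linear-algebraic weakness can be made concrete for $k = 2$. The sketch below is illustrative only: the key matrix is a common textbook choice rather than one taken from the dissertation, the alphabet is assumed to be A-Z with indices 0-25, and the function names are hypothetical. It encrypts digraphs by matrix multiplication modulo 26, decrypts with the inverse matrix, and then recovers the key from two known plaintext/ciphertext digraphs.

```python
def inverse_2x2_mod26(m):
    """Inverse of a 2 x 2 matrix modulo 26 (raises ValueError if none exists)."""
    det = (m[0][0] * m[1][1] - m[0][1] * m[1][0]) % 26
    det_inv = pow(det, -1, 26)          # modular inverse, Python 3.8+
    return [[(m[1][1] * det_inv) % 26, (-m[0][1] * det_inv) % 26],
            [(-m[1][0] * det_inv) % 26, (m[0][0] * det_inv) % 26]]

def matmul_mod26(a, b):
    """Product of two 2 x 2 matrices modulo 26."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) % 26 for j in range(2)]
            for i in range(2)]

def hill_apply(text, m):
    """Multiply each digraph (as a column vector of letter indices) by m."""
    nums = [ord(ch) - ord("A") for ch in text]
    out = []
    for i in range(0, len(nums), 2):
        x, y = nums[i], nums[i + 1]
        out.append((m[0][0] * x + m[0][1] * y) % 26)
        out.append((m[1][0] * x + m[1][1] * y) % 26)
    return "".join(chr(v + ord("A")) for v in out)

KEY = [[3, 3], [2, 5]]                  # illustrative key, determinant 9
ciphertext = hill_apply("HELP", KEY)
plaintext = hill_apply(ciphertext, inverse_2x2_mod26(KEY))

# Known-plaintext attack: with digraphs as matrix columns, C = KEY * P,
# so KEY = C * P^{-1} (mod 26) whenever P is invertible modulo 26.
P = [[7, 11], [4, 15]]                  # columns "HE" and "LP"
C = [[7, 0], [8, 19]]                   # columns "HI" and "AT"
recovered_key = matmul_mod26(C, inverse_2x2_mod26(P))
```

Two known digraphs suffice here because the cipher is linear; for general $k$, the same attack requires $k$ suitable plaintext blocks and one matrix inversion modulo 26.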
Likewise, rotor machines incur the basic deficiencies of the Vigenere cipher, which are: (a) a finite number of alphabets that is small relative to customary message size, (b) resulting susceptibility to attack based upon known plaintext statistics and semantics, (c) insecure key distribution schemes, and (d) vulnerability to automated (i.e., computational) attack. In the latter category was much pathfinding research in pattern matching and enumerative attack using Turing machines, as embodied in the Atlas computer employed by the British during WWII in their cryptology center at Bletchley Park. In this context, it is interesting to note that the rotor machines used by the Axis powers in WWII were descended from a machine that was patented in the US in the 1920s [30]. Such narrative is well documented in the literature [26], and is thus not considered further in this discussion.

Based upon cryptanalytic experience gained during WWII, Claude Shannon in 1949 published suggestions concerning features of cryptosystems that might be useful in the future, such as interleaving of substitutions and transpositions [30]. Throughout the 1960s, the IBM Corporation developed Shannon's ideas into a workable prototype cipher, called LUCIFER, which was improved in the early 1970s to yield the Data Encryption Standard (DES) [29]. In its original form, DES accepted a 56-bit Boolean key and a 64-bit Boolean plaintext, and returned a 64-bit Boolean ciphertext. In practice, the DES transformation consists of 18 component transformations that are composed to yield a sequence of substitution and transposition operations. The substitutions are based in part upon hitherto undisclosed functions called S-boxes that are implemented in publicly available lookup tables [30].
Due to the possibility of compromise via enumerative attack facilitated by increasingly powerful computers, it was proposed in the mid-1980s that the DES input be extended to 128 bits, and that the key be of similar size. Note that: (a) this extension has yet to be adopted, (b) the US Department of Defense (DoD) until recently continued to recommend the 56-bit DES keyspace as the standard for business communication, and (c) DES has not been employed by the DoD in military encryption. The latter situation was underscored by Diffie and Hellman's proposal [31] of an exhaustive attack on DES, in which a multiprocessor architecture enumerates the $2^{56}$ possible keys. Given the 1 ns cycle time of simple, fast, commercially available uniprocessors (GaAs or HEMT technology [32]), and assuming the existence of a $10^4$-processor array, $10^{13}$ keys could be generated per second. Since DES is inherently pipelined, and $2^{56} \approx 7 \times 10^{16}$, approximately 7000 seconds, or less than two hours, would be required for brute-force decryption. Additional methods of attacking DES include guessing of the key using techniques based upon genetic algorithms [33], which have proven efficient in attacks upon rotor machines [34].

From the preceding discussion, it should be obvious that there exist serious problems with cryptosystems that have only one key. In particular, extreme vulnerability occurs when the key is distributed to numerous communicating parties. For example, consider network communications, where a fully connected network with $n$ nodes requires $O(n^2)$ key transmissions. Clearly, the probability of key seizure (or, at the very least, key monitoring) by unauthorized persons increases polynomially with the number of communicating parties. For example, if $N = 4$, then there are $4^2 - 4 = 12$ key transmissions required. However, if $N = 8$, then 56 keys are transmitted.
If one assumes a network organized around a high-speed trunk or backbone line, then the second case ($N = 8$) has a probability of key transmission per encoded message that exceeds 4.5 times the probability of key transmission when $N = 4$.

As a partial solution to the key distribution problem, Diffie and Hellman [31] proposed that the key be partitioned into two halves, i.e., $k = (k_p, k_s)$, where the public key $k_p$ is the only key required for encryption, but the secret key $k_s$ is required for decryption. Implementationally, let the i-th user in an N-user system choose key $k_i = (k_p(i), k_s(i))$, and let a directory readable by all users list the public keys $k_p(i)$, $i = 1..N$. Let a plaintext message $a$ be sent to user i as ciphertext $a_c$, which is defined as

$$a_c = T(a, k_p(i)) , \qquad (1)$$

where $T$ denotes an encryption transform that maps a plaintext and a key in the keyspace $K$ to a ciphertext. Only user i has the secret portion $k_s(i)$ that is required to compute the decryption $T^{-1}$ as

$$a = T^{-1}(a_c, [k_p(i), k_s(i)]) . \qquad (2)$$

Thus, in public-key systems, only keys $k_p(i)$, $i = 1..N$, are distributed openly, versus the secret distribution of $O(N^2)$ keys required by private-key cryptosystems. Additionally, public-key cryptography (PKC) can be used to authenticate transmissions when a message is accompanied by an encrypted signature, which may be appended automatically. The discussion of authentication, which is beyond the scope of this dissertation, is summarized in Reference 29.

Although attractive conceptually, PKC has two stringent requirements, which can be stated in terms of the following related problems:

a) Given a sum of two numbers, determine the addend and augend; and
b) Given a product of two numbers, determine the multiplicand and the multiplier. (3)

In practice, such requirements have been fulfilled by the following cryptosystems. Merkle and Hellman's knapsack cryptosystem [35] was the first public-key system proposed in the open literature.
Concurrently, Graham and Shamir developed and later published [36] a slightly different approach to knapsack systems, which is distinguished by the construction of knapsack sets and by a modification to Merkle and Hellman's decryption step. Another approach to knapsack systems is the iterated knapsack, which uses multiple knapsack sets, similar in concept to the polyalphabetic substitution that employs multiple alphabets. Knapsack sets are often implemented in terms of superincreasing sets, which can be constructed from an indexed set F. An unfortunate property of superincreasing sets, which defeats the security of knapsack encryption, is that the ratio of the number of elements to the number of bits required to encode the set is not greater than 1. The greater the radix r of a superincreasing set $F_{si} = \{ r^i \in \mathbf{R} : r > 1 \text{ and } i \in \mathbf{N} \cup \{0\} \}$, the further apart are such numbers on the number line. Unfortunately, such low-density sets yield knapsack encryptions that can be easily broken. However, by constructing hard knapsack sets, which are denser and yield encryptions that are difficult to break, it has been shown that a successful computational attack is at least NP-complete [30]. Methods of attack on knapsack systems have been proposed by Shamir [37], Lagarias and Odlyzko [38], as well as Brickell [39]. A summary of such methods is beyond the scope of this overview, but can be found in References 29 and 30, which review knapsack systems in detail.

The second of the two problems stated above, namely, that of determining the multiplicand and the multiplier given a product of two numbers, underlies the following system. Rivest, Shamir, and Adleman (RSA) proposed [40] that public-key cryptography be implemented in terms of the product of prime numbers. Their system currently remains secure, i.e., no successful comprehensive attack on RSA has been reported in the open literature.
However, the RSA algorithm, as well as a feasible RSA decryption scheme, has been disclosed in the Pretty Good Privacy (PGP) encryption algorithm, which was made publicly available in 1994. As a result, RSA is generally considered insecure for purposes of military communications. With the development of efficient prime factoring algorithms, as well as the storage of precomputed prime factors, RSA has become susceptible to attack unless the number of digits k is very large. Furthermore, the processing of data encrypted by RSA requires decryption for numerous operations to be feasible implementationally. Although multiplication can be performed upon the encrypted result, the general equation $y = a + bx$ cannot be solved for y given x, since RSA is based upon the multiplication of exponents. Thus, the utility of RSA in processing encrypted data appears to be limited to integral scaling problems. Additionally, efficient implementation of the RSA algorithm requires reasonably fast algorithms for determining numbers that are relatively prime in a system based upon modular arithmetic. Furthermore, RSA requires the efficient computation of inverses in a modular system, as well as the computation of large exponents. Such algorithms have been published recently [41].

Methods of encryption that are currently in vogue exploit the geometry and nonlinearity of certain bivariate functions (such as conic sections) to produce two keys that satisfy constraints of the PKC scheme. Such methods yield a large search space, but can be compromised if the geometric figure upon which they are based can be guessed or approximated computationally [29]. An analysis of such implementational techniques is beyond the scope of this dissertation, but may be found in Reference 29.

1.2.3. Encryptive Processing

The processing of encrypted data, which we call encryptive processing, predated research in compressive processing by nearly two decades.
Encryptive processing seeks to mathematically combine ciphertext and plaintext data in a manner that is secure from the site of encryption to the site where the data is combined, and remains secure to the decryption site. Such methods are useful for financial transactions, updating of personal (e.g., medical) records without compromising privacy, and processing of classified data such as military surveillance imagery. Our review of previously published work in encryptive processing begins with Rivest, Adleman, and Dertouzos' concept [42] of linear homomorphisms, which have been shown to be vulnerable to linear-algebraic attack, but remain interesting pedagogically. Additionally, we overview analyses by Abadi et al. [43] of the likelihood of encrypting polynomial-time and NP-complete functions, and discuss recent work by Ahituv [44]. The latter research is based upon the assumed combinatoric complexity of recursive encryption, and provides an in-depth analysis of cryptographic vulnerability in practical encryptive computational scenarios. We also consider binary superposition operations over plaintext-ciphertext pairs formulated by Yu and Yu [45], which are feasible in practice, due primarily to an efficient formulation, deep recursive encryption, and large plaintext size. Although an intermediate result may be discovered, its decryption can be rendered infeasible in operational scenarios due to practical constraints upon an adversary's ability to enumerate the recursive encryption steps.

1.2.3.1. Linear privacy homomorphisms. Difficulties in processing encrypted data were recognized early on, and were discussed in Rivest et al.'s paper [42] of 1978. However, linear homomorphisms can be successfully attacked via the solution of a set of linear equations, as noted in Reference 44. Given spaces $S$ and $S'$, as well as functions $f : S \to S$ and $f' : S' \to S'$, the transformation $T : S \to S'$ is a homomorphism if and only if $f'(T(g)) = T(f(g))$ for all $g \in S$.
Additionally, if $T$ is a linear transform, as well as a homomorphism, then $T$ is called a linear homomorphism. For example, let a and b be n-element vectors over the alphabet F, and let $T$ be a linear homomorphism. We thus have

$$T(a + b) = T(a) + T(b) \qquad (4)$$
$$T(m \cdot a) = m \cdot T(a) = T(a) \cdot m , \qquad (5)$$

where m is a constant in F. Let encrypted data be denoted by $a_c = T(a)$ and $b_c = T(b)$. Assuming that $T$ has an inverse $T^{-1} : S' \to S$, one updates $a_c$ with plaintext c as follows:

STEP 1 : Encryption : $c_c = T(c)$
STEP 2 : Updating : $a_c^{new} = a_c + c_c$ (6)
STEP 3 : Decryption : $a^{new} = T^{-1}(a_c^{new})$ .

Multiplication by a constant is similarly performed, per Equation 5.

Linear homomorphic encryption systems are easily broken by solving n linear equations, as follows. Initially, one constructs a dictionary d consisting of n words. In a representation such as the binary number system, where the k-th bit of the k-th plaintext word is unity and the word is zero elsewhere, we have

$$d_1 = T(1,0,0,\ldots,0)$$
$$d_2 = T(0,1,0,\ldots,0)$$
$$\vdots$$
$$d_n = T(0,0,0,\ldots,1) .$$

In practice, an n-bit encrypted block can be defined as the sum of rows of d, due to the linearity property of $T$. For example, if we choose an enciphered message $a_c = (0,1,1,0,0,1,0,\ldots,0) \in \{0,1\}^n$, we have the linear equation $a_c = d_2 + d_3 + d_6$, or some linear combination of the rows of d. By defining n such messages, one obtains n linear equations. By knowing the solution (i.e., the plaintext arguments of $T$ that produced d), the homomorphism can be broken via solving n linear equations in the customary manner.

Implementationally, given the $O(n^3)$ multiplication overhead of matrix inversion, 1024-byte signals (n = 8192 bits, or 16 lines of 64 characters) could be inverted (with machines capable of 1 billion multiplies per second) in approximately 550 seconds, i.e., $8192^3$ multiplications divided by $10^9$ multiplications per second. As a result, such messages are not immune from attack with current computational hardware unless the useful lifespan of a message is as short as several minutes.
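The dictionary attack described above can be illustrated with a minimal Python sketch. It is not taken from Reference 42 or 44: the secret transform is assumed to be an invertible matrix acting modulo a small prime P, and all names (`encrypt`, `solve_mod`, `recover_plaintext`) are hypothetical. The adversary encrypts the n unit vectors to obtain the dictionary d, then solves n linear equations to recover an unknown plaintext.

```python
import random

P = 251  # small prime modulus; an illustrative assumption, not from the source

def encrypt(secret, a):
    """Toy linear homomorphism T(a) = secret * a (mod P)."""
    n = len(a)
    return [sum(secret[i][j] * a[j] for j in range(n)) % P for i in range(n)]

def solve_mod(matrix, rhs):
    """Solve matrix * x = rhs (mod P) by Gauss-Jordan elimination; None if singular."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] % P), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        inv = pow(aug[col][col], -1, P)          # modular inverse (Python 3.8+)
        aug[col] = [(v * inv) % P for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col]:
                factor = aug[r][col]
                aug[r] = [(v - factor * w) % P for v, w in zip(aug[r], aug[col])]
    return [aug[i][n] for i in range(n)]

def random_invertible(n, rng):
    """Draw random matrices until one is invertible modulo P."""
    while True:
        m = [[rng.randrange(P) for _ in range(n)] for _ in range(n)]
        if solve_mod(m, [0] * n) is not None:
            return m

def recover_plaintext(oracle, ciphertext):
    """Encrypt the n unit vectors (the dictionary d), then solve n linear equations."""
    n = len(ciphertext)
    dictionary = [oracle([1 if j == k else 0 for j in range(n)]) for k in range(n)]
    # dictionary[k] is column k of the unknown transform, so transpose it.
    transform = [[dictionary[k][i] for k in range(n)] for i in range(n)]
    return solve_mod(transform, ciphertext)

rng = random.Random(0)
n = 8
secret = random_invertible(n, rng)
oracle = lambda vec: encrypt(secret, vec)

a = [rng.randrange(P) for _ in range(n)]
b = [rng.randrange(P) for _ in range(n)]
a_c, b_c = oracle(a), oracle(b)

# Equation 4: T(a + b) = T(a) + T(b), here modulo P.
additive = oracle([(x + y) % P for x, y in zip(a, b)]) == [(x + y) % P for x, y in zip(a_c, b_c)]
recovered = recover_plaintext(oracle, a_c)

# Cost estimate from the text: n^3 multiplications at 10^9 multiplications per second.
attack_seconds = 8192 ** 3 / 10 ** 9
print(additive, recovered == a, round(attack_seconds))
```

The sketch uses a chosen-plaintext oracle for brevity; the text's formulation, in which n known plaintext-ciphertext pairs yield n equations, leads to the same linear system.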
Thus, privacy homomorphisms remain of pedagogic interest primarily.

1.2.3.2. Abadi's analysis of encryptive processing. Given the apparent failure to secure simple operations such as addition and multiplication by a constant, one would naturally ask what functions are encryptable, and what level of security can thus be attained. Abadi, Feigenbaum, and Kilian [43] considered the problem of encryptive processing in terms of the following scenario, which is based upon an encryptable function f:

1. Party A knows a datum x in domain(f), but has insufficient resources to compute f(x).
2. As a result, A requests from party B the quantity f(y), where $y \in domain(f)$ and $y \neq x$. Since B will compute the function f that is hard from A's perspective, B can successfully attack a cryptosystem based upon intractability assumptions. However, B may not guarantee the privacy of the encrypted data.
3. Via an encryption, A (with inferior resources) can request f(y) from B, but B cannot infer x from y.
4. However, A can infer f(x) from f(y), and does so, to obtain the desired result.

From Item 3, one (correctly) concludes that A must be able to understand (and thus compromise) an encryption system that B cannot successfully attack. Therefore, A has greater encryption resources than B, but has fewer computational resources (as noted in Item 1). Via a series of interesting theorems, the authors claim to:

- Prove precise statements about what information the computational operation f hides and discloses, in the information-theoretic sense;
- Develop theory that proves an encryption $T : x \mapsto y$ hides or conceals at least the properties $H(x)$ and discloses at least $I'(x)$;
- Give examples of natural encryptable functions; and
- Establish a strong negative result about the encryptability of NP-complete functions.

In particular, the authors show that the functions for which there exist efficient encryption schemes that disclose nothing are computable in expected polynomial time.
For example, under certain operational assumptions, the arithmetic and logic functions can be secure, and might not disclose information concerning their operands. The authors' conclusions support the intuition that if f is hard to compute, then f is difficult to map to the range space of an encryption transform, and A cannot hide everything about f or f(x) from B. Unfortunately, the authors do not address in detail the well-known problem of disclosing the properties of encrypted vector-space operands, such as images. For example, if a function f accepts an n-pixel image a and returns an n-pixel image b, then f discloses the size n of its input, which is an important clue for the cryptanalytic adversary. Similarly, a or b may contain subimages having spatial structure that is revealed in the ciphertext. Note that this study primarily concerns customary image processing functions that compute over an n-pixel image in $O(n^{k})$ time, where 0 …

In summary, Abadi et al. provide a useful description of the encryptability of functions in the complexity classes NP or co-NP, which forms a conceptual basis for our work. Unfortunately, Abadi's paper contributes little to our study of the implementational security of polynomial-time functions (i.e., the amount of input information disclosed by a computation under practical constraints). However, a useful result is presented in the observation that polynomial-time functions can be encrypted in polynomial time.

1.2.3.3. Key-update model of encryptive processing. In response to the shortcomings of Rivest's homomorphic functions, Ahituv, Lapid, and Neumann [44] present an incremental requirements analysis of computational security that begins with the concept of additive updates of ciphertext data with plaintext. We summarize the salient points of the analysis, as follows.
Given key k, plaintext a, as well as the current and updated ciphertext $a_c$ and $a_c^{new}$, the authors discuss an update scheme given by the encryption

$$a_c = (k + a) \bmod 2^{L} , \qquad (8)$$

where the modulus $2^{L}$ is chosen sufficiently large, in order to prevent loss of data due to overflow. Ciphertext updates with plaintext denoted by $a^{upd}$ are given by

$$a_c^{new} = a_c + a^{upd} . \qquad (9)$$

After j consecutive updates, the plaintext equivalent of the current ciphertext value becomes

$$a^{new} = a_c(j) - k . \qquad (10)$$

Unfortunately, the data is continually vulnerable to a one-time breaking of the key. Additionally, the accumulation of knowledge concerning plaintext updates could provide an adversary with increasing amounts of information about the ciphertext. Since the sum $(k + a_c)$ could be discovered in an intermediate state, previous updates could be traced without discovering the key. Furthermore, the tagging of the data as plaintext or ciphertext that is required for correct decryption can provide useful information to an adversary.

The preceding situation can be improved if one employs a linear isomorphism $T : F^n \times K \to F^n$, such that $a_c = T(a, k)$ and $b_c = T(b, k)$. Ahituv states the obvious requirement that

$$T^{-1}(a_c + b_c) = T^{-1}(T(a) + T(b)) = T^{-1}(T(a + b)) = a + b , \qquad (11)$$

where b denotes plaintext to be added to a. The utility of such encryptive addition could be augmented by multiplication if the following distributive property were supported:

$$a_c(k) \cdot (a + b) = a_c(k) \cdot a + a_c(k) \cdot b , \qquad (12)$$

where $a_c(k)$ denotes ciphertext produced with key k. Since each successive encryption would add the value of the key, it is required that after the n-th update, the most recent plaintext is given by

$$a^{new} = a_c^{new} - n \cdot k . \qquad (13)$$

The system described by Equations 11-13 no longer distinguishes plaintext from ciphertext in the sum, thus reducing vulnerability. However, discovery of the key can compromise the system at any time.
This limitation necessitates frequent key replacement for both plaintext updates and stored ciphertext data. Although linear homomorphisms can overcome the key generation problem, such encryptions are easily broken, as discussed in Section 1.2.3.1. Due to key vulnerability, the authors propose a further improvement whereby different neighborhoods of the keyword can be utilized as update keys. Thus, if a certain key is broken, the adversary obtains one-time (temporary) information, but cannot derive the updates $a^{upd}$ or the plaintext a. The success of this method depends upon the generation of a sufficiently long random key, which is feasible with off-the-shelf technology [46]. For example, let $k_1$ and $k_2$ be successively generated keys. Denoting the update function with one or two keys as $U(k_1)$ or $U(k_1, k_2)$, and encryption requiring one or two keys as $T_{k_1}$ or $T_{k_1,k_2}$, Equations 11 and 12 become

$$T_{k_1,k_2}(a, b) = T_{k_1}(a) + T_{k_2}(b) = a_c + b_c \qquad (14)$$
$$U(k_1, k_2) \cdot (a + b) = (U(k_1) \cdot a) + (U(k_2) \cdot b) ,$$

where the decryption operation is specified by

$$T^{-1}_{k_1,k_2}[T_{k_1}(a) + T_{k_2}(b)] = T^{-1}_{k_1,k_2}[T_{k_1,k_2}(a + b)] = T^{-1}_{k_1}[T_{k_1}(a)] + T^{-1}_{k_2}[T_{k_2}(b)] = a + b . \qquad (15)$$

Assume that n-1 updates have been processed and are contained in $U_{n-1}$. Let the n-th plaintext update $a_n$ be encrypted as $U_n$. The update proceeds as follows:

$$U_{n+1} = U_{n-1} + U_n \qquad (16)$$
$$k_{n+1} = k_{n-1} + k_n ,$$

where decryption is given by

$$a_{n+1} = U_{n+1} - k_{n+1} . \qquad (17)$$

Further security can be provided by permuting blockwise partitions of the keys to change their relative locations. In this respect, Ahituv's method suggests possible derivation from the block transpositions inherent in the DES algorithm. Unfortunately, the previous method requires that the protocol between the client (user) and server (secure machine) become more complicated and, therefore, more awkward. Additionally, a more complex key management policy is required, which can render the system more vulnerable.
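The additive key-update model summarized above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the authors' implementation: scalars stand in for data vectors, the word length L = 16 and all key and data values are arbitrary, and the names are hypothetical. The sketch shows a single-key update followed by decryption, the way in which observed intermediate ciphertexts leak the plaintext updates, and the two-key variant in which ciphertexts and keys are accumulated in parallel.

```python
L = 16                 # word length in bits (illustrative assumption)
MOD = 2 ** L

def encrypt(a, k):
    """Additive encryption: a_c = (k + a) mod 2^L."""
    return (k + a) % MOD

def decrypt(a_c, k):
    """Inverse of the additive encryption: a = (a_c - k) mod 2^L."""
    return (a_c - k) % MOD

# Single-key scheme: ciphertext is updated directly with plaintext increments.
k = 40000
a = 1000
a_c = encrypt(a, k)
history = [a_c]
for upd in (25, 300):
    a_c = (a_c + upd) % MOD
    history.append(a_c)
a_new = decrypt(a_c, k)            # plaintext equivalent after the updates

# Weakness noted in the text: differences of successive ciphertexts reveal the
# plaintext updates without any knowledge of the key.
leaked = [(later - earlier) % MOD for earlier, later in zip(history, history[1:])]

# Two-key variant: the update is itself encrypted, and the keys are summed.
k1, k2 = 40000, 12345
u_prev = encrypt(a, k1)            # stored ciphertext
u_n = encrypt(25, k2)              # encrypted update
u_next = (u_prev + u_n) % MOD      # accumulated ciphertext
k_next = (k1 + k2) % MOD           # accumulated key
a_next = decrypt(u_next, k_next)

print(a_new, leaked, a_next)
```

In the two-key variant, an adversary who observes `u_prev` and `u_next` learns only the encrypted update `u_n`, not the update itself, which is the improvement the text attributes to using distinct update keys.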
Implementationally, key protection may be insufficient to prevent determination of the plaintext from knowledge of successive ciphertext updates. Thus, we progress to Yu and Yu's cipher [45], as discussed in the following section.

1.2.3.4. Yu and Yu's time-reversal ciphers. Following Ahituv's analysis, Yu and Yu [45] proposed a method of recursive encryption $T$ that supports the following superposition of plaintext a and b in terms of ciphertext $a_c = T(a)$ and $b_c = T(b)$:

$$c_c = (m_1 \cdot a_c) + (m_2 \cdot b_c) , \qquad (18)$$

where $m_1$ and $m_2$ are constants. The authors employ a time-reversal encryption, which is both computationally efficient and invertible, and is described as follows. Let $a \in F^n$ denote a plaintext tuple to be encrypted by a time-reversal transform $T : F^n \times K \to F^n$, such that $a_c = T(a, k)$, where the key $k \in K$. Now, the i-th element of the ciphertext vector $a_c$ at each time step is determined by a function of form

$$a_c(i)_{(t+1)} = \left( k(a_c(i)_{(t)}) - a_c(i)_{(t-1)} \right) \bmod q , \qquad (19)$$

where q is an arbitrary modulus, and the key $k(a_c)$ is a function of $a_c(i)$ and its L neighboring elements. Time-reversal encryption can exhibit high data security, since the cardinality of a keyspace K constructed from L terms about $a_c(i)$ is equal to $q^{q^{L}}$ elements. For example, let q equal the usual value of 256 and L = 3, with a polynomially derived key

$$k(i) = c_1 \cdot a_c(i-1) + c_2 \cdot a_c(i) + c_3 \cdot a_c(i+1) , \qquad (20)$$

where $c_j$, j = 1..3, denote constants. Then $|K| = 256^{256^{3}} \approx 2^{1.34 \times 10^{8}}$ possible keys. As discussed by the authors, circulant boundary conditions should be employed, i.e., for n input data, the index i must be computed modulo n. Additionally, one must set certain initial conditions, such as

$$a_c(i)_{(t=0)} = b(i) \quad \text{and} \quad a_c(i)_{(t=1)} = a(i) , \qquad (21)$$

where elements of the arbitrary background vector $b \in F^n$ are chosen pseudorandomly. Decryption is obtained by reversing the encryption's temporal order, i.e., by solving Equation 19 for $a_c(i)_{(t-1)}$ given the ciphertext at times t and t+1 (Equation 22). Thus, decryption reverses the order of key addition.
Although the keyspace is large, time-reversal encryption exhibits two significant disadvantages, as follows:

1. From the definition of data superposition, $T$ is a linear homomorphism, and can therefore be compromised by the method portrayed in Section 1.2.3.1. In particular, if L terms are used to construct a given key, the authors show that the key-generating function k can be determined by solving L+1 simultaneous linear equations if $a_c(t-1)$ is known. However, the term $a_c(t-1)$ supposedly remains confidential, since only two consecutive encrypted texts are publicly available.
2. The term $a_c(t-1)$ could be plaintext, per the initial condition portrayed in Equation 21. However, deep concealment of $a_c(t-1)$ is feasible a priori via extensive iteration of the encryption function, as well as appropriate selection of a background vector.

An additional method of attack on the time-reversal cipher is termed data subtraction, which is especially attractive for decrypting linear homomorphisms. As an example of data subtraction, consider the Vernam cipher, which generates a ciphertext bit stream via the expression

$$a_c = (k + a) \bmod 2 , \qquad (23)$$

where $a, a_c, k \in \{0,1\}^n$ denote the plaintext, ciphertext, and key, respectively. Now, let us generate the updated ciphertext $a_c^{new} = (k + a^{upd}) \bmod 2$, and subtract $a_c^{new}$ from $a_c$ to obtain

$$a_c - a_c^{new} = (a^{upd} - a) \bmod 2 , \qquad (24)$$

which is equivalent to the Vernam encryption of a using $a^{upd}$ as the key. If a and $a^{upd}$ are partially known, then the key can be obtained as follows:

$$k = (a_c + a) \bmod 2 . \qquad (25)$$

Since the time-reversal cipher is similar to the Vernam cipher, the authors note in passing that the difference of two ciphertexts $a_c$ and $b_c$ is given by

$$a_c(t+1) - b_c(t+1) = \left( k(a_c(t)) - k(b_c(t)) \right) - \left( a_c(t-1) + b_c(t-1) \right) . \qquad (26)$$

If the key difference $k(a_c(t)) - k(b_c(t))$ vanishes (which is claimed to be infrequent) and $a_c(t-1)$ and $b_c(t-1)$ are plaintext vectors (per Item 2, above), then the encryption at time t+1 could be unsuccessful.
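The data-subtraction attack on the Vernam cipher (Equations 23-25) can be illustrated with the following Python sketch. The bit vectors are arbitrary illustrative values and the function name `vernam` is hypothetical. The sketch shows that the difference of two ciphertexts produced under the same key no longer depends on the key, and that knowledge of one plaintext then yields the key directly.

```python
def vernam(bits, key):
    """Vernam encryption: bitwise addition of key and plaintext modulo 2."""
    return [(k + a) % 2 for k, a in zip(key, bits)]

key   = [1, 0, 1, 1, 0, 0, 1, 0]   # secret key k
a     = [0, 1, 1, 0, 1, 0, 0, 1]   # original plaintext a
a_upd = [1, 1, 0, 0, 1, 0, 1, 0]   # plaintext of the updated message

a_c     = vernam(a, key)           # Equation 23
a_c_new = vernam(a_upd, key)       # updated ciphertext, same key

# Equation 24: subtracting the ciphertexts cancels the key.
difference = [(x - y) % 2 for x, y in zip(a_c, a_c_new)]
key_free   = [(u - v) % 2 for u, v in zip(a_upd, a)]

# Equation 25: a known plaintext reveals the key.
recovered_key = [(c + p) % 2 for c, p in zip(a_c, a)]

print(difference == key_free, recovered_key == key)
```

Because arithmetic is modulo 2, subtraction and addition coincide, so the ciphertext difference is simply the bitwise exclusive-or of the two plaintexts.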
Similarly, if the sum of the ciphertexts at time t-1 vanishes, then the key difference can be revealed in the left member of Equation 26. However, the authors claim that this deficiency can be remedied by subjecting the plaintext to several recursively applied encryption steps prior to computing the plaintext-ciphertext update. The authors further state (without proof) that there exists no simple cryptanalysis method by which one can determine either the plaintext or the keyspace from recursively encrypted ciphertext.

As noted previously, superposition based upon time reversal is a linear homomorphism. Such transformations are notably susceptible to correlation-based attacks. That is, due to the linearity property, the ciphertext and plaintext exhibit high autocorrelation as well as high cross-correlation. Thus, successive updates can be cross-correlated to provide significant information about the keyspace and the source plaintext. Numerous reports of correlation-based cryptanalysis exist in the open literature [30], to which the reader is referred for a detailed explanation of technique.

1.2.4. Compressive Processing

Our survey of the compressive processing literature is brief, due to the recent and limited nature of work in this area. The allied (and more general) topic of processing transformed imagery is overviewed, beginning with the well-known and nearly trivial case of processing reduced matrices [10], which has been extensively researched. We then progress to Healy's work on edge detection over locally averaged and subsampled imagery, which furnished a portion of the early conceptual and analytical basis for this study. Jawerth's feature detection algorithm for wavelet-transformed imagery [47] is reviewed, as is Jolion's brief discussion of operations over pyramid-encoded images [48]. We conclude our overview with a summary of Cosman et al.'s recently reported work in pattern recognition and low-level processing over vector-quantized imagery [1].
1.2.4.1. Processing of reduced matrices. Let $a \in \mathbf{R}^{m \times n}$ denote an m×n-pixel, real-valued matrix. Assume that range(a) contains values in a set S that are deemed insignificant, i.e., can be ignored. When such values are removed from a, we can concatenate the remaining values of a together with a representation of the coordinates at which such values occur, to yield a reduced matrix $a_c$. In practice, $a_c$ can be implemented as an abstract data structure (ADS), e.g., a list $a_c$ whose elements are each tuples of form (x, a(x)), where x denotes a point in domain(a). Assume that we have an image operation $\mathcal{O} : \mathbf{R}^{X} \to \mathbf{R}^{X}$, where X = domain(a), such that $a' = \mathcal{O}(a)$. By processing the second coordinate of each element in $a_c$ with $\mathcal{O}$, we can implement the pointwise operation $\mathcal{O}$ over the list $a_c$, thereby yielding the transformed ADS $a'_c = \mathcal{O}(a_c)$, which can be expressed as

$$a'_c = \bigcup_{x \in p_1(a_c)} (x, a'_c(x)) = \bigcup_{x \in p_1(a_c)} \left( x, \mathcal{O}(p_2[a_c(x)]) \right) , \qquad (27)$$

where $p_k$ denotes projection onto the k-th coordinate. An identical result would be obtained if $\mathcal{O}$ was applied to all elements of a to yield $a'$, and the matrix reduction was subsequently conducted by removing $\mathcal{O}(S)$ from $a'$ to yield $a'_c$.

For example, let a = (0,1,2,5,3,0,0,0,3), and let the negligible values S = {0,1}. The list representation of the compressed image $a_c$ would be given by

$$a_c = ((3,2), (4,5), (5,3), (9,3)) . \qquad (28)$$

Let the operation $\mathcal{O}(f) = 2f$, where f is a value in range(a). Then, we have that $a' = \mathcal{O}(a) = (0,2,4,10,6,0,0,0,6)$ and

$$a'_c = ((3,\mathcal{O}[2]), (4,\mathcal{O}[5]), (5,\mathcal{O}[3]), (9,\mathcal{O}[3])) = ((3,4), (4,10), (5,6), (9,6)) . \qquad (29)$$

Now, $\mathcal{O}(S) = \{\mathcal{O}(0), \mathcal{O}(1)\} = \{0, 2\}$. If we remove $\mathcal{O}(S)$ from $a'$, we obtain the list representation

$$a'_c = ((3,4), (4,10), (5,6), (9,6)) , \qquad (30)$$

which is given in Equation 29. Thus, the preceding example of matrix reduction can be loosely thought of as a homomorphism that preserves the operation $\mathcal{O}$, which was defined previously. However, the operation $\mathcal{O}'$ that computes over $a_c$ is formulated as $\mathcal{O}(p_2(a_c))$, per Equation 27.
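The worked example above (Equations 28-30) can be reproduced with a short Python sketch. The helper names `reduce_matrix` and `apply_pointwise` are hypothetical, a list of (coordinate, value) tuples stands in for the ADS, and coordinates are 1-based to match the text.

```python
def reduce_matrix(values, negligible):
    """Keep (coordinate, value) pairs whose value is not in the negligible set S."""
    return [(x, v) for x, v in enumerate(values, start=1) if v not in negligible]

def apply_pointwise(reduced, op):
    """Apply a pointwise operation to the second coordinate of each list element."""
    return [(x, op(v)) for x, v in reduced]

def double(f):
    """The example operation O(f) = 2f."""
    return 2 * f

a = (0, 1, 2, 5, 3, 0, 0, 0, 3)
S = {0, 1}

a_c = reduce_matrix(a, S)                              # Equation 28
reduce_then_operate = apply_pointwise(a_c, double)     # Equation 29

a_prime = [double(v) for v in a]                       # O applied to the full matrix
operate_then_reduce = reduce_matrix(a_prime, {double(s) for s in S})  # Equation 30

print(a_c)
print(reduce_then_operate)
print(reduce_then_operate == operate_then_reduce)
```

Note that the two paths agree in this example because the operation maps no retained value into $\mathcal{O}(S)$; the pointwise path touches four list elements instead of nine matrix elements.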
Thus, an operation over a reduced matrix can be derived from the corresponding operation over a sparse matrix. In Chapter 9, we show that numerous image processing operations can be computed efficiently over reduced matrices. However, matrix reduction is not an isomorphism, since there exists no inverse (i.e., an exact matrix expansion) that can recover the information lost when the negligible value set S contains more than one value. In Chapter 9, we describe the approximate inversion of a reduced matrix $a_c$ in terms of an approximation to a source matrix a. We call such inversion operations approximate inverse transforms, and provide further theoretical development in Chapters 3-5. Additionally, we show how image operations over the reduced matrix propagate the error due to the previously mentioned information loss.

The chief advantage of processing reduced matrices is the decreased computational cost that accrues from the processing of fewer matrix elements. The compression ratios achievable with matrix reduction often range from 100:1 to 1000:1, which can yield computational speedups for pointwise operations that approach the compression ratio. As noted previously, the actual speedup is dependent upon the overhead of retrieving the source matrix values from the ADS that represents the reduced matrix. As an example, consider image and matrix operations (denoted by $\mathcal{O}$, for purposes of brevity) such as edge detection, matrix inversion, and LU decomposition. Such operations usually require that the reduced matrix $a_c$ be represented in terms of a data structure that is based upon the dataflow structure of the operation $\mathcal{O}$. For instance, assume that the application of $\mathcal{O}$ to the x-th element of the source matrix a requires retrieval of the nearest neighbors of the x-th element. It follows that the ADS that encodes the reduced matrix $a_c$ should be organized to facilitate optimal retrieval of the nearest neighbors of a(x).
In practice, however, when one conducts heterogeneous operations over a given ADS, certain operations must be modified slightly to yield a more efficient computation over the given ADS. In Chapter 9, we discuss the general problem of ADS derivation and the specific problem of adapting reduced-matrix operations to accept certain data structures. We next consider the processing of images that are compressed by combination of adjacent pixel values.

1.2.4.2. Edge detection over averaged imagery. In the early 1980s, Healy [49] developed two methods for edge detection over compressed imagery. The first approach was based on an image restoration technique that optimally reversed the image compression process. The second approach was based upon a maximum-likelihood estimation of the original full-sized edge, which could be deduced from patterns in the reduced imagery. Healy showed that, for a finite number of edge patterns over small neighborhoods, it was feasible to use a lookup table (LUT) to map the compressed edge pattern to the corresponding image neighborhood that produced the edge. Images were reduced by mapping 4 x 4-pixel source blocks into 2 x 2-pixel blocks via local averaging, selective averaging (i.e., without averaging across edges), periodic sampling, and gradient-preserving sampling. The reduced images were edge enhanced using Kirsch or Roberts operators. Note that passive FLIR imagery (of resolution 112 lines with 120 pixels per line) was employed, which generally exhibits large, bland regions bounded by nonlinear transitions, and is frequently degraded by noise resulting from the imaging process. Thus, error and noise quantification was a key study goal.
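As an illustration of two of the reduction methods named above, the following Python sketch reduces a 4 x 4-pixel block to 2 x 2 pixels by local averaging and by periodic sampling, decompresses each result by pixel replication, and compares the mean-squared error against the source block. It is a minimal sketch and not Healy's procedure: the pixel values are arbitrary, and the function names are hypothetical.

```python
def block_average(img, b=2):
    """Replace each b x b block by its mean (local averaging)."""
    return [[sum(img[r + i][c + j] for i in range(b) for j in range(b)) / (b * b)
             for c in range(0, len(img[0]), b)]
            for r in range(0, len(img), b)]

def periodic_sample(img, b=2):
    """Keep the upper-left pixel of each b x b block (periodic sampling)."""
    return [[img[r][c] for c in range(0, len(img[0]), b)]
            for r in range(0, len(img), b)]

def replicate(small, b=2):
    """Decompress by pixel replication: each value fills a b x b block."""
    return [[small[r // b][c // b] for c in range(len(small[0]) * b)]
            for r in range(len(small) * b)]

def mse(x, y):
    """Mean-squared error between two images of equal size."""
    n = len(x) * len(x[0])
    return sum((p - q) ** 2 for rx, ry in zip(x, y) for p, q in zip(rx, ry)) / n

source = [[10, 12, 80, 82],
          [11, 13, 81, 83],
          [10, 10, 90, 90],
          [12, 12, 90, 94]]

averaged = block_average(source)
sampled = periodic_sample(source)
mse_averaged = mse(source, replicate(averaged))
mse_sampled = mse(source, replicate(sampled))

print(averaged)
print(mse_averaged, mse_sampled)
```

Because the mean of a block minimizes the sum of squared deviations from any single representative value, the averaged reconstruction never has a larger intensity MSE than the sampled one.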
In addition to edge quality comparisons, mean-squared error (MSE) was calculated for all images, especially those produced by decompressing reduced images via pixel replication. Note that the averaging of 2 x 2-pixel blocks is equivalent to representing each block by its local mean, thereby guaranteeing minimum MSE with respect to image intensity. Healy states that the previously described simple averaging method showed the best overall performance in the edge image, both quantitatively (using computed measures of image clarity) and qualitatively (via visual assessment). Healy tested several additional image reduction techniques, all of which exhibited reduced MSE. However, only simple averaging produced results that consistently corresponded to reduced MSE in the edge image. When the Kirsch operator was applied to reconstructed (decompressed) imagery, edge detection was improved, as opposed to operating on the reduced images. For example, for the simple averaging method of image compression, MSE decreased from 1.38 to 0.75 in the edge image. However, application of the Kirsch detector to the compressed images did not fully extract the edge information. Healy's exploratory studies represent early attempts to perform and analyze image processing operations over images compressed by a well-known method such as block averaging. We further discuss the block averaging technique in Chapter 7. A relatively new method of edge detection, which processes wavelet-compressed imagery, is described in the following section.

1.2.4.3. Local enhancement of wavelet-compressed imagery. In 1993, Jawerth, Hilton, and Huntsberger [47] reported a simple focusing technique for wavelet decompositions. With such methods, one can select neighborhoods of interest within a source image according to region and scale parameters. Additionally, it is possible to obtain variable compression rates and enhance images via wavelet-based filtering of the compressed source image.
In summary, beginning with a definition of wavelet series, Jawerth states the wavelet decomposition of an image a over domain X as

$$a(x) = \sum_{i} \sum_{\nu} \sum_{k} \gamma^{[i]}_{\nu k} \, \psi^{[i]}_{\nu k}(x) , \quad x \in X , \qquad (31)$$

where i denotes the order of the wavelet coefficients in $\gamma$, $\psi$ denotes a wavelet basis function, $\nu$ represents a scale parameter that corresponds to spatial frequency, and k denotes a positional parameter. Each of the basis functions $\psi$ belongs to one of a finite number of families $\{ \psi^{[i]}_{\nu k} \}$, where $\psi^{[i]}_{\nu k}$ exists on the approximate scale $2^{-\nu}$ and is approximately located at $2^{-\nu}k$. Now, let $S = \{ w^{[i]}_{\nu k} : i \in \mathbf{Z}_m \}$ denote a set of weight factors, which are nonnegative numbers. By multiplying the coefficients in a wavelet decomposition by the appropriate weight factors, we obtain the weighted coefficients $w^{[i]}_{\nu k} \gamma^{[i]}_{\nu k}$, with which one can emphasize the importance of certain scales and regions of the decomposed image. The authors show that under statistical constraint, one can specify regions of an image by mean-squared error criteria and employ additional error measures that relate to the smoothness of a over a subset U of Euclidean space $\mathbf{R}^n$. Thus, by choosing weight factors that are relatively large at certain scales and locations, one can obtain better approximations to image details in those regions.

Image compression can be effected by analyzing the weighted coefficients $w^{[i]}_{\nu k} \gamma^{[i]}_{\nu k}$ of an image, rather than the image itself. As a result, bland areas of an image, which will be manifested as coefficients at large values of the spatial frequency parameter $\nu$, can be subtracted from the wavelet series, parameterized, and concatenated with the compressed image. Such methods are reminiscent of transform coding using sparse matrix reduction, which selects coefficients by their magnitudes. However, in wavelet-based image manipulation, one stores in the compressed image those weighted coefficients whose spatial frequencies are in the set of frequencies exhibited by key objects in the scene.
Image enhancement is similarly performed, by increasing the magnitude of the wavelet coefficients whose spatial frequencies correspond to features of interest in the scene. A related hierarchical method, processing of pyramidal imagery, is next overviewed.

1.2.4.4. Processing of pyramidally compressed imagery. In the late 1970s, Tanimoto proposed that pyramidal data structures be employed for object detection in natural scenes [50]. We herein overview methods by which Tanimoto's theory of operations over pyramidal data structures has been adapted to operations such as edge detection.

Let $\mathbf{a}$ denote an $n \times n$-pixel source image whose pixel values comprise the set $\mathbb{F}$, where $n$ is a power of some radix $r$. Additionally, let $\mathrm{domain}(\mathbf{a})$ be tessellated into adjacent blocks of size $r \times r$ pixels. Assuming that we can reduce each block (via a data compression transform $T$) to $k < r^2$ values in $\mathbb{F}$, a compression ratio of $CR = r^2/k$ will be obtained for each block and, therefore, for $\mathbf{a}$. Without loss of generality, assume that each $k$-pixel compressed block is represented as a terminal node of an $r$-ary tree. From our initial assumption that $n = r^m$, we have $\log_r(n) = m$, which implies an $m$-level $r$-ary tree representation of $\mathbf{a}$ called an image pyramid. In the uncompressed image $\mathbf{a}$, the terminal nodes are $r \times r$-pixel blocks. Thus, in the compressed image $\mathbf{a}_c$, the terminal nodes are $k$-pixel compressed versions of the terminal nodes of $\mathbf{a}$. Further assume that $T$ combines a local feature-detection function with the block reduction operation. Thus, a leaf of the tree that corresponds to the pyramid data structure would contain a compressed block of the source image, with edge information encoded in the compressed image or concatenated to the compressed block. Given such a structure, one can begin processing (for example, via edge detection) at the leaves. By traversing the tree in bottom-up fashion, the result (in this example, an edge map) is propagated to the root node.
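The bottom-up propagation just described can be sketched as follows. This is a minimal illustration rather than Tanimoto's method: it assumes a square image stored as a list of rows, a side length that is a power of r, and a block reduction to a single value (k = 1); the function names and the choice of reduction are hypothetical.

```python
def reduce_blocks(image, r, combine):
    """Replace each r x r block of a square image by combine(block values)."""
    m = len(image) // r
    return [[combine([image[i * r + di][j * r + dj]
                      for di in range(r) for dj in range(r)])
             for j in range(m)]
            for i in range(m)]


def build_pyramid(image, r=2, combine=max):
    """Return pyramid levels from the source image (finest) to the root (1 x 1).

    Assumes len(image) is a power of r. With combine=max on a Boolean edge
    map, a node is 1 whenever any pixel beneath it is an edge pixel.
    """
    levels = [image]
    while len(levels[-1]) > 1:
        levels.append(reduce_blocks(levels[-1], r, combine))
    return levels


edge_map = [[0, 0, 0, 0],
            [0, 1, 0, 0],
            [0, 0, 0, 0],
            [0, 0, 0, 1]]
pyramid = build_pyramid(edge_map, r=2, combine=max)
# pyramid[1] summarizes each 2 x 2 block; pyramid[2] is the root node.
```

Passing combine=sum instead propagates block sums upward, so the root holds the image sum.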
Thus, an edge detection operation over the pyramidal tree would process the entire tree in $O(n \cdot \log_r(n))$ work. With $n^2$-fold parallelism, $O(\log_r n)$ time would be required. Additionally, operations such as image summation and maximum can be computed approximately over image pyramids in time proportional to the number of nodes at a given level [50]. With $n^2$-fold parallelism, pointwise operations would incur a speedup that is proportional to the number of pixels in the pyramid level at which processing occurs. In contrast, noncompressive processing requires $O(|\mathrm{domain}(\mathbf{a})|)$ operations. In Chapter 3, we will see that this result is key to achieving computational efficiencies at compressive operation level.

1.2.4.5. Low-level operations on vector-quantized imagery. Let an $\mathbb{F}$-valued source image $\mathbf{a}$ on domain $\mathbf{X}$ be denoted as $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$ (per the discussion of Section 2.1), and assume that $\mathbf{a}$ is segmented into $k$-pixel encoding blocks, also called vectors. A vector quantization transform $T$ on $\mathbb{F}^{\mathbf{X}}$ groups the vectors into classes that comprise a codebook of form $(\mathbb{F}^k)^G$, based on a representational error criterion. The error measure determines: (a) how closely the exemplar vectors or patterns in each class resemble each other, and (b) how well the patterns represent the partitions of the source image from which each pattern class was derived. Thus, the encoded image is merely a list (or array) of the indices in $G$ of codebook exemplars.

At a high level, one can correctly assert that the VQ-compressed image is the result of applying a substitution to the source image. That is, given a codebook, the VQ transform substitutes for a given encoding block the index of the pattern class that best portrays the encoding block's spatial pattern under constraint of a prior error criterion. For example, if the image domain $\mathbf{X}$ is two-dimensional, and $k$-pixel encoding blocks are employed, then the VQ codebook will have no more than $\lceil |\mathbf{X}|/k \rceil$ pattern classes.
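The substitutional view of VQ can be made concrete with a small sketch. It assumes a codebook that is already given and a squared-error criterion; the text does not fix the error measure, and the function names are hypothetical.

```python
def squared_error(u, v):
    """Squared Euclidean distance between two k-pixel vectors (assumed criterion)."""
    return sum((p - q) ** 2 for p, q in zip(u, v))


def vq_encode(blocks, codebook):
    """Substitute for each encoding block the index of its nearest exemplar."""
    return [min(range(len(codebook)),
                key=lambda i: squared_error(block, codebook[i]))
            for block in blocks]


def vq_decode(indices, codebook):
    """Substitute each index by its exemplar vector (lossy reconstruction)."""
    return [codebook[i] for i in indices]


# An 8-pixel signal segmented into k = 2 pixel blocks yields 8 / 2 = 4 indices.
codebook = [(0, 0), (10, 10), (0, 10)]
blocks = [(1, 0), (9, 11), (0, 9), (10, 10)]
indices = vq_encode(blocks, codebook)
reconstruction = vq_decode(indices, codebook)
```

Because the encoded image is only a list of indices, any operation that can be precomputed per exemplar can be applied by table lookup over the indices.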
Likewise, the source image will be compressed into $\lceil |\mathbf{X}|/k \rceil$ pattern indices, regardless of the number of pattern classes. If one constrains the VQ process such that the maximum number of bits required to encode each pattern index is less than or equal to the number of bits required to encode each image value, then the compression ratio $CR \geq k$.

Since the VQ transform can be thought of (at a high level) as a substitution, the processing of VQ imagery is, in principle, an easy matter. For example, an approximation to the sum of a source image may be obtained by summing the exemplar vector sums weighted by their frequency of occurrence in the compressed image. The substitutional model of VQ underlies much of the theory for processing VQ-encoded imagery, and has thus been employed in recent reports of VQ-based arithmetic [51], pattern recognition over VQ imagery [1] and, most recently, VQ-based template operations [53].

Cosman et al. have summarized their work in processing VQ-encoded imagery in Reference 1, which includes a variety of ingenious methods for using substitution-based computing of the image histogram, image enhancement by histogram equalization, an approximation to edge detection, and several methods of graphics rendering. In each case, Cosman shows that implementationally significant computational efficiencies can be obtained. For example, let source image $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$, a VQ codebook $\mathbf{c} \in (\mathbb{F}^k)^G$, and the source image histogram $\mathbf{h} \in \mathbb{N}^{\mathbb{F}}$. One can approximate $\mathbf{h}$ by computing the histogram $\mathbf{h}_i$, $i \in G$, of each codebook vector, then weighting $\mathbf{h}_i$ by its frequency of occurrence $f_i$ in the VQ-compressed image. The histogram is approximated as $\mathbf{h} \approx \sum_{i \in G} f_i \cdot \mathbf{h}_i$. In Chapter 9, we discuss the processing of VQ imagery in detail, with emphasis upon the more difficult case of VQ-based image-template operations.

1.3.
Technical Approach

In this dissertation, we summarize our previous work in the classification of image transforms [2], and discuss methods for deriving analogues that implement selected image processing operations over the range spaces of such transforms. We provide specific examples that show techniques by which theoretical speedups in image and signal processing algorithms can be achieved. For example, assume that N compressed data are processed instead of M source data, where N < M and M/N approximates the compression ratio (CR). In certain cases, we claim that a computational speedup of order CR is possible. In support of such claims, we emphasize

1. Description of a transform taxonomy (classification scheme) that includes multiple subclasses for the detailed characterization of commonly-employed compressive transforms.
2. Elucidation of methods for deriving analogues of common image processing operations over range spaces of multiple transforms in each taxonomic class.
3. Application of the methods obtained in Step 2 to the derivation of analogues of common image processing operations over the range spaces of two or three transforms in each class. In particular, we consider
   • a generic type of JPEG transform,
   • VQ with fixed blocksize, and
   • VPIC with a small exemplar set.
   Such transforms are shown to be practical for a variety of image compression tasks. Additionally, VPIC followed by Huffman coding is shown to perform well for fast compression applications at high compression ratios.
4. Analysis of the operations derived in Step 3 in terms of noise sensitivity, sequential speedup, and data security.
5. Illustration of various stages of the image compression and compressive computation process with images derived from practice, together with supporting performance data.
6. Feasibility analysis of compressive computational systems based upon error accumulation and computational efficiency.
7.
Discussion of possible additional advantages or problems that may result from compressive or encryptive processing, which could be addressed in future research. For example, we discuss implementational issues inherent in the mapping of various compressive operations to a SIMD-parallel mesh architecture.

As a result of this study, we have achieved several advances in image processing technology that are described in the following section.

1.4. Novel Claims

We claim the following novel accomplishments:

• High-level theoretical unification of compressive and encryptive processing;
• Classification of image transformations in a rigorous, concise taxonomy, which is instrumental in facilitating the efficient design of compressive and encryptive processing algorithms;
• Computational speedup for many image processing operations that is achieved by computing with compressed data;
• Reduction in the degree of parallelism that may be possible for certain compressive operations on specific parallel architectures, due to the presence of fewer data; and
• Enhancement of data security, which may be possible by signal and image processing of data that have been compressed, then encrypted.

An implication of this dissertation is that the theory presented herein may furnish the basis for novel, fundamental advances in computational efficiency and data security not hitherto realized via existing image processing algorithms. Specific applications currently exist in areas of high-bandwidth (i.e., high frame rate) image processing, such as real-time automatic target recognition using high-speed imaging devices and real-time medical image processing in support of bone or tissue visualization for machine-assisted surgery. Additional applications include efficient search over compressed databases, as well as parallel algorithm designs that permit reduced parallelism but preserve the functionality of the corresponding image operation(s).

1.5.
Implementational Advantages and Disadvantages

An interesting observation resulted from our early research in the derivation of an algorithm for performing connected component labelling on one-dimensional, n-pixel Boolean imagery compressed by runlength encoding [5]. We noted that the customary tree structure of connected component labelling [6] was transformed to a SIMD-parallel structure in certain stages of the compressive algorithm. However, an $O(n)$ communications cost remained. We conjecture that similar effects may occur with certain compressive formats that encode image regions in terms of features which are key to a prespecified, tree-structured image processing algorithm. For example, let a transform $T$ encode an image region based upon edge or boundary information. It would be reasonable to assume that an analogue over $\mathrm{range}(T)$ of a boundary detection algorithm could be derived that would be amenable to SIMD-parallel implementation, due to preservation of boundary information in the compressed image. Thus, it is reasonable to assume that a tree-structured boundary detection algorithm could be rendered SIMD-parallel via such methods of compressive algorithm design, since boundary information is assumed to be localized in each pixel of the compressed image.

In particular, one could design or select transforms that compress the source image so as to preserve features computed by such image operations (e.g., boundary detection). If such were the case, then it could be possible to simplify the design of parallel algorithms for certain image processing operations. Since parallel algorithm design is nontrivial but parallel computation has great promise for practical applications, such developments could possibly increase the speed and scope of parallel computer applications in image processing.

There are two disadvantages of compressive processing that pertain primarily to the use of a cascaded sequence of compressive operations, denoted by $S$.
First, error due to lossy transformation of a source image $\mathbf{a}$ can accumulate as compressive computation proceeds through $S(\mathbf{a})$, thus rendering the visual quality of the decompressed result unacceptable. Additionally, this is a problem for automated target recognition conducted over decompressed images that were previously subject to compressive processing. Second, all operations in $S$ may not be computable over a given compressive format with efficiencies that meet the design criteria for a given implementation. Thus, conversion between compressive formats may be required, which increases overhead and decreases resultant efficiency. We address the preceding issues throughout this dissertation, with particular emphasis on computational error in the applications portion.

CHAPTER 2
REVIEW OF NOTATION

Prior to developing supporting theory, we summarize study notation. In order to acquaint the reader with the majority of our notation, an overview of the image algebra subset employed in this study is presented in Section 2.1. In Section 2.2, we discuss general rules for, and deviations from, the list of symbols contained in the front portion of this document.

2.1. Overview of the Image Algebra (IA) Subset

Image algebra is a set-theoretical and functional notation that unifies linear and nonlinear mathematics in the image domain. The basic entities of IA are set elements, sets, and functional mappings. Sets and set operations are discussed in Section 2.1.1, while functions are subdivided into several classes, namely, images (Section 2.1.2), operations on images (Section 2.1.3), nonrecursive image-template functions (Section 2.1.4), and nonrecursive operations on templates (Section 2.1.5). Recursive operations are summarized in Section 2.1.6.

2.1.1. Sets and operations upon sets

In image algebra, set elements are generally defined implementationally as having any data type, such as integer, real, or complex.
Additionally, set elements may be data structures such as sets, lists, or images, but such unusual structures tend to require processing operations that are not well-defined and thus are not used extensively in computational practice. Elements are collected in sets, which may be partially or totally ordered. Value sets are denoted by open-face capital letters. For example, the customary value sets $\mathbb{Z}$, $\mathbb{R}$, and $\mathbb{C}$ denote the integer, real, and complex numbers, respectively. Additionally, $\mathbb{N} = \mathbb{Z}^+$ denotes the positive integers and $\mathbb{F}$ denotes a generalized value set. Point sets are denoted by bold upper-case letters from the tail of the alphabet (e.g., $\mathbf{X}$, $\mathbf{Y}$, $\mathbf{Z}$), and are customarily subsets of Euclidean n-space $\mathbb{R}^n$. Image algebra supports the customary set operations of union ($\cup$), intersection ($\cap$), set subtraction ($\setminus$), cardinality ($|\mathbb{F}|$), and choice. Set operations obey the relevant laws of identity, idempotence, associativity, distributivity, DeMorgan's Law, etc.

2.1.2. Images

An image is a mapping of a point set (functional domain) to a value set (functional range). Images are usually denoted by bold, lower-case letters from the head of the alphabet, such as $\mathbf{a}$, $\mathbf{b}$, or $\mathbf{c}$, as well as by emboldened strings, such as ci or Idf. Given a point set $\mathbf{X}$ and a value set $\mathbb{F}$, the image $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$, called an $\mathbb{F}$-valued image on $\mathbf{X}$, is the function $\mathbf{a}: \mathbf{X} \to \mathbb{F}$. The functional domain of $\mathbf{a}$, denoted by $\mathrm{domain}(\mathbf{a})$, is equal to $\mathbf{X}$, while $\mathrm{range}(\mathbf{a}) = \mathbb{F}$ denotes the functional range of $\mathbf{a}$. The graph of an image $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$, denoted by $G(\mathbf{a})$, is expressed as

$$\mathbf{a} = G(\mathbf{a}) = \{(\mathbf{x}, \mathbf{a}(\mathbf{x})) : \mathbf{x} \in \mathbf{X},\ \mathbf{a}(\mathbf{x}) \in \mathbb{F}\}.$$

Additionally, the point $\mathbf{x} \in \mathbf{X}$, the value $\mathbf{a}(\mathbf{x}) \in \mathbb{F}$, and the pixel $(\mathbf{x}, \mathbf{a}(\mathbf{x}))$ are important features of the image $\mathbf{a}$. The more elementary image operations are defined as follows.

2.1.3. Operations upon images

Operations upon $\mathbb{F}$-valued images are the naturally-induced operations of the algebraic system $\mathbb{F}$.
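For example, operations on real-valued images $\mathbf{a}, \mathbf{b} \in \mathbb{R}^{\mathbf{X}}$ are the elementary operations induced by the vector lattice (a vector space which is a lattice) $\mathbb{R}$. Thus, basic operations on $\mathbb{R}^{\mathbf{X}}$ reflect the corresponding arithmetic and logical operations on $\mathbb{R}$. The generalized unary operation $f : \mathbb{F} \to \mathbb{F}$ thus induces the generalized unary image operation $f : \mathbb{F}^{\mathbf{X}} \to \mathbb{F}^{\mathbf{X}}$, which is applied to the image $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$ to yield

$$\mathbf{b} = f(\mathbf{a}) = \{(\mathbf{x}, \mathbf{b}(\mathbf{x})) : \mathbf{b}(\mathbf{x}) = f(\mathbf{a}(\mathbf{x})),\ \mathbf{x} \in \mathbf{X}\}. \qquad (32)$$

For example, if $\mathbf{a} \in \mathbb{R}^{\mathbf{X}}$, then $\sin(\mathbf{a}) = \{(\mathbf{x}, \mathbf{b}(\mathbf{x})) : \mathbf{b}(\mathbf{x}) = \sin(\mathbf{a}(\mathbf{x})),\ \mathbf{x} \in \mathbf{X}\}$.

Additional unary operations upon the image $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$ are the image sum, denoted by $\Sigma\mathbf{a} = \sum_{\mathbf{x} \in \mathbf{X}} \mathbf{a}(\mathbf{x})$, and the image maximum $\bigvee\mathbf{a} = \bigvee_{\mathbf{x} \in \mathbf{X}} \mathbf{a}(\mathbf{x})$. Given an image $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$ and an associative, commutative operation $\gamma : \mathbb{F} \times \mathbb{F} \to \mathbb{F}$, the preceding operations can be generalized in terms of the global reduce operation

$$\Gamma\mathbf{a} = \Gamma_{\mathbf{x} \in \mathbf{X}}\, \mathbf{a}(\mathbf{x}) = \mathbf{a}(\mathbf{x}_1)\, \gamma\, \mathbf{a}(\mathbf{x}_2)\, \gamma \cdots \gamma\, \mathbf{a}(\mathbf{x}_n), \quad \mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n \in \mathbf{X}.$$

Given images $\mathbf{a}, \mathbf{b} \in \mathbb{F}^{\mathbf{X}}$ and a binary operation $\circ : \mathbb{F} \times \mathbb{F} \to \mathbb{F}$, the binary image operation of Hadamard (pointwise) arithmetic $\mathcal{O} : \mathbb{F}^{\mathbf{X}} \times \mathbb{F}^{\mathbf{X}} \to \mathbb{F}^{\mathbf{X}}$ is defined as

$$\mathbf{a} \circ \mathbf{b} = \{(\mathbf{x}, \mathbf{c}(\mathbf{x})) : \mathbf{c}(\mathbf{x}) = \mathbf{a}(\mathbf{x}) \circ \mathbf{b}(\mathbf{x}),\ \mathbf{x} \in \mathbf{X}\}.$$

Given the images $\mathbf{a}, \mathbf{b} \in \mathbb{R}^{\mathbf{X}}$, the usual arithmetic operations $(+, -, \cdot, /)$ as well as the Boolean logic operations are supported, as shown in the following examples:

$$\begin{aligned}
\mathbf{a} + \mathbf{b} &= \{(\mathbf{x}, \mathbf{c}(\mathbf{x})) : \mathbf{c}(\mathbf{x}) = \mathbf{a}(\mathbf{x}) + \mathbf{b}(\mathbf{x}),\ \mathbf{x} \in \mathbf{X}\} \\
\mathbf{a} * \mathbf{b} &= \{(\mathbf{x}, \mathbf{c}(\mathbf{x})) : \mathbf{c}(\mathbf{x}) = \mathbf{a}(\mathbf{x}) \cdot \mathbf{b}(\mathbf{x}),\ \mathbf{x} \in \mathbf{X}\} \\
\mathbf{a} \vee \mathbf{b} &= \{(\mathbf{x}, \mathbf{c}(\mathbf{x})) : \mathbf{c}(\mathbf{x}) = \mathbf{a}(\mathbf{x}) \vee \mathbf{b}(\mathbf{x}),\ \mathbf{x} \in \mathbf{X}\}
\end{aligned} \qquad (33)$$

where $\vee$ denotes maximum. Since complex numbers lack a natural lattice structure, only addition and multiplication are defined for complex-valued images.

The remaining operations on $\mathbb{R}^{\mathbf{X}}$ can be expressed in terms of the arithmetic functions, or are induced on $\mathbb{R}^{\mathbf{X}}$ by the corresponding operations on $\mathbb{R}$. For example, exponentiation of the previously defined images $\mathbf{a}$ and $\mathbf{b}$ is given by

$$\mathbf{a}^{\mathbf{b}} = \{(\mathbf{x}, \mathbf{c}(\mathbf{x})) : \mathbf{c}(\mathbf{x}) = \mathbf{a}(\mathbf{x})^{\mathbf{b}(\mathbf{x})} \text{ if } \mathbf{a}(\mathbf{x}) \neq 0, \text{ otherwise } \mathbf{c}(\mathbf{x}) = 0,\ \mathbf{x} \in \mathbf{X}\}. \qquad (34)$$

Note that exponentiation is defined when $\mathbf{a}(\mathbf{x}) = 0$.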
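The induced operations above translate directly into code when an image is represented by its graph. The following minimal sketch assumes a Python dictionary that maps points to values; the function names are hypothetical and are not part of image algebra.

```python
import math
from functools import reduce


def induced_unary(f, a):
    """Apply f : F -> F at every pixel, as in Equation 32."""
    return {x: f(v) for x, v in a.items()}


def induced_binary(op, a, b):
    """Pointwise (Hadamard) operation on two images over one point set."""
    if a.keys() != b.keys():
        raise ValueError("images must share one point set")
    return {x: op(a[x], b[x]) for x in a}


def global_reduce(gamma, a):
    """Combine all pixel values with an associative, commutative gamma."""
    return reduce(gamma, a.values())


def image_power(a, b):
    """Exponentiation with c(x) = 0 wherever a(x) = 0, as in Equation 34."""
    return induced_binary(lambda u, v: u ** v if u != 0 else 0, a, b)


a = {(0, 0): 2, (0, 1): 0}
b = {(0, 0): 3, (0, 1): 5}
pointwise_sum = induced_binary(lambda u, v: u + v, a, b)
pointwise_max = induced_binary(max, a, b)
image_sum = global_reduce(lambda u, v: u + v, a)
image_maximum = global_reduce(max, b)
sine_image = induced_unary(math.sin, a)
powers = image_power(a, b)
```

The dictionary form mirrors the graph $\{(\mathbf{x}, \mathbf{a}(\mathbf{x}))\}$ and makes no assumption that the point set is a rectangular grid.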
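Computation of the logarithm, which is invalid for zero and the negative real values, is given by

$$\log_{\mathbf{a}}(\mathbf{b}) = \{(\mathbf{x}, \mathbf{c}(\mathbf{x})) : \mathbf{c}(\mathbf{x}) = \log_{\mathbf{a}(\mathbf{x})}(\mathbf{b}(\mathbf{x})) \text{ if } \mathbf{a}(\mathbf{x}), \mathbf{b}(\mathbf{x}) > 0,\ \mathbf{x} \in \mathbf{X}\}. \qquad (35)$$

Given the constant image $\mathbf{a} = \{(\mathbf{x}, \mathbf{a}(\mathbf{x})) : \mathbf{a}(\mathbf{x}) = k,\ k \in \mathbb{F},\ \mathbf{x} \in \mathbf{X}\}$, we have that

$$\begin{aligned}
\mathbf{b}^k &= \mathbf{b}^{\mathbf{a}} \quad\text{and}\quad k^{\mathbf{b}} = \mathbf{a}^{\mathbf{b}} \\
k\mathbf{b} &= \mathbf{a} * \mathbf{b} \quad\text{and}\quad k + \mathbf{b} = \mathbf{a} + \mathbf{b}.
\end{aligned} \qquad (36)$$

Constant images which are of special importance in IA include the zero image $\mathbf{0} = \{(\mathbf{x}, 0) : \mathbf{x} \in \mathbf{X}\}$ and the unitary image $\mathbf{1} = \{(\mathbf{x}, 1) : \mathbf{x} \in \mathbf{X}\}$. For example, notational brevity is realized by defining the image sum as $\Sigma\mathbf{a} = \sum_{\mathbf{x} \in \mathbf{X}} \mathbf{a}(\mathbf{x}) = \mathbf{a} \bullet \mathbf{1}$, where the dot product $\mathbf{a} \bullet \mathbf{b} = \sum_{\mathbf{x} \in \mathbf{X}} \mathbf{a}(\mathbf{x}) \cdot \mathbf{b}(\mathbf{x})$.

Subtraction, division, and minimum are defined in terms of the basic operations and the constant image $-\mathbf{1}$, where $-\mathbf{b} = -\mathbf{1} * \mathbf{b}$, and

$$\mathbf{a} - \mathbf{b} = \mathbf{a} + (-\mathbf{b}), \quad \mathbf{a}/\mathbf{b} = \mathbf{a} * \mathbf{b}^{-1}, \quad\text{and}\quad \mathbf{a} \wedge \mathbf{b} = -(-\mathbf{a} \vee -\mathbf{b}). \qquad (37)$$

The images $\mathbf{0}$ and $\mathbf{1}$ exhibit the identity properties $\mathbf{a} + \mathbf{0} = \mathbf{a}$ and $\mathbf{a} * \mathbf{1} = \mathbf{a}$. Although $\mathbf{b} * \mathbf{b}^{-1} * \mathbf{b} = \mathbf{b}$, we note that $\mathbf{b} * \mathbf{b}^{-1}$ is not necessarily equal to $\mathbf{1}$. Although this usage is infrequent, we occasionally call $\mathbf{b}^{-1}$ the pseudo-inverse of $\mathbf{b}$, as discussed in Reference 54.

Given image $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$ and $S : \mathbf{X} \to 2^{\mathbb{F}}$, the generalized characteristic function

$$\chi_S(\mathbf{a}) = \left\{(\mathbf{x}, \mathbf{c}(\mathbf{x})) : \mathbf{c}(\mathbf{x}) = \begin{cases} 1 & \text{if } \mathbf{a}(\mathbf{x}) \in S(\mathbf{x}) \\ 0 & \text{otherwise} \end{cases},\ \mathbf{x} \in \mathbf{X}\right\} \qquad (38)$$

can be employed in thresholding $\mathbf{a}$ at a value $T \in \mathbb{F}$, as follows:

$$\chi_{>T}(\mathbf{a}) = \left\{(\mathbf{x}, \mathbf{c}(\mathbf{x})) : \mathbf{c}(\mathbf{x}) = \begin{cases} 1 & \text{if } \mathbf{a}(\mathbf{x}) > T \\ 0 & \text{otherwise} \end{cases},\ \mathbf{x} \in \mathbf{X}\right\}. \qquad (39)$$

Characteristic functions may also be defined in terms of elementary IA operations, as shown in Reference 54.

The restriction of image $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$ to a subset $\mathbf{W}$ of $\mathbf{X}$ is denoted by $\mathbf{a}|_{\mathbf{W}} = \{(\mathbf{x}, \mathbf{a}(\mathbf{x})) : \mathbf{x} \in \mathbf{W}\}$. The extension of image $\mathbf{b} \in \mathbb{F}^{\mathbf{W}}$ to the image $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$, where $\mathbf{W} \subset \mathbf{X}$, is given by

$$\mathbf{b}|^{\mathbf{a}} = \left\{(\mathbf{x}, \mathbf{c}(\mathbf{x})) : \mathbf{c}(\mathbf{x}) = \begin{cases} \mathbf{a}(\mathbf{x}) & \text{if } \mathbf{x} \in \mathbf{X} \setminus \mathbf{W} \\ \mathbf{b}(\mathbf{x}) & \text{if } \mathbf{x} \in \mathbf{W} \end{cases}\right\}. \qquad (40)$$

Implementationally, restriction and extension can be thought of as cut and paste operations, respectively. Furthermore, a restriction of $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$ to a subset of $\mathbf{a}$ whose values are characterized by a property $S$ of $\mathbb{F}$ is given by

$$\mathbf{a}\|_S = \{(\mathbf{x}, \mathbf{a}(\mathbf{x})) : \mathbf{a}(\mathbf{x}) \in S,\ \mathbf{x} \in \mathbf{X}\}. \qquad (41)$$

For example, given a threshold $T \in \mathbb{R}$, if $\mathbf{a} \in \mathbb{R}^{\mathbf{X}}$, then $\mathbf{a}\|_{>T} = \{(\mathbf{x}, \mathbf{a}(\mathbf{x})) : \mathbf{a}(\mathbf{x}) > T,\ \mathbf{x} \in \mathbf{X}\}$.

2.1.4.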
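Nonrecursive templates and image-template operations

Templates are powerful constructs that map a point to an image. Templates unify the customary concepts of convolution masks, sampling windows, structuring elements (mathematical morphology), and neighborhood functions. For example, let $\mathbf{X}$ and $\mathbf{Y}$ denote point sets, $P$ denote a set of parameters, and $\mathbb{F}$ denote the value set. A generalized $\mathbb{F}$-valued template $\mathbf{t}$ from $\mathbf{Y}$ to $\mathbf{X}$ is a function $\mathbf{t} : \mathbf{Y} \to \mathbb{F}^{\mathbf{X}}$, so that $\mathbf{t} \in (\mathbb{F}^{\mathbf{X}})^{\mathbf{Y}}$. For each target point $\mathbf{y} \in \mathbf{Y}$, the image $\mathbf{t}_{\mathbf{y}} = \mathbf{t}(\mathbf{y}) \in \mathbb{F}^{\mathbf{X}}$ assigns the weight $\mathbf{t}_{\mathbf{y}}(\mathbf{x})$ to each point $\mathbf{x} \in \mathbf{X}$.

If $\mathbf{t}$ is a real-valued template from $\mathbf{Y}$ to $\mathbf{X}$, then the support of $\mathbf{t}_{\mathbf{y}}$ is defined as $S(\mathbf{t}_{\mathbf{y}}) = \{\mathbf{x} \in \mathbf{X} : \mathbf{t}_{\mathbf{y}}(\mathbf{x}) \neq 0\}$. The set $S_{-\infty}(\mathbf{t}_{\mathbf{y}}) = \{\mathbf{x} \in \mathbf{X} : \mathbf{t}_{\mathbf{y}}(\mathbf{x}) \neq -\infty\}$ is referred to as the configuration of $\mathbf{t}$ at the point $\mathbf{y}$.

If $\mathbf{X}$ is a space with operation $+$, then a template $\mathbf{t} \in (\mathbb{F}^{\mathbf{X}})^{\mathbf{X}}$ is called translation-invariant (with respect to the operation $+$) if and only if, for each triple $\mathbf{x}, \mathbf{y}, \mathbf{z} \in \mathbf{X}$, $\mathbf{t}_{\mathbf{y}}(\mathbf{x}) = \mathbf{t}_{\mathbf{y}+\mathbf{z}}(\mathbf{x} + \mathbf{z})$. If a template is translation-variant, then it is not translation-invariant. For purposes of brevity we call translation-invariant templates invariant templates, and translation-variant templates are called variant templates.

Given an image $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$ and a template $\mathbf{t} \in (\mathbb{F}^{\mathbf{X}})^{\mathbf{Y}}$, as well as the associative, commutative functions $\gamma, \circ : \mathbb{F} \times \mathbb{F} \to \mathbb{F}$, the backward image-template operation, also called the right product, is given by

$$\mathbf{b} = \mathbf{a} \mathbin{\textcircled{$\gamma$}} \mathbf{t} = \left\{(\mathbf{y}, \mathbf{b}(\mathbf{y})) : \mathbf{b}(\mathbf{y}) = \Gamma_{\mathbf{x} \in \mathbf{X}} \left(\mathbf{a}(\mathbf{x}) \circ \mathbf{t}_{\mathbf{y}}(\mathbf{x})\right),\ \mathbf{y} \in \mathbf{Y}\right\}. \qquad (42)$$

When $\mathbb{F} \subset \mathbb{C}$, as is customary, the right product can be instantiated as the image-template convolution

$$\mathbf{b} = \mathbf{a} \oplus \mathbf{t} = \left\{(\mathbf{y}, \mathbf{b}(\mathbf{y})) : \mathbf{b}(\mathbf{y}) = \sum_{\mathbf{x} \in \mathbf{X}} \left(\mathbf{a}(\mathbf{x}) \cdot \mathbf{t}_{\mathbf{y}}(\mathbf{x})\right),\ \mathbf{y} \in \mathbf{Y}\right\} \qquad (43)$$

by setting $\gamma = +$ and $\circ = \cdot$. For $\mathbf{t} \in (\mathbb{F}^{\mathbf{Y}})^{\mathbf{X}}$, the forward image-template operation or left product, denoted by $\mathbf{t} \oplus \mathbf{a}$, is given by

$$\mathbf{t} \oplus \mathbf{a} = \left\{(\mathbf{y}, \mathbf{c}(\mathbf{y})) : \mathbf{c}(\mathbf{y}) = \sum_{\mathbf{x} \in \mathbf{X}} \mathbf{t}_{\mathbf{x}}(\mathbf{y}) \cdot \mathbf{a}(\mathbf{x}),\ \mathbf{y} \in \mathbf{Y}\right\}. \qquad (44)$$

In either case, $\mathbf{b} \in \mathbb{F}^{\mathbf{Y}}$. Ritter [55] showed that computation of the generalized convolution can be rendered more efficient by replacing $\mathbf{X}$ with $S(\mathbf{t}_{\mathbf{y}})$ in Equation 42, or with $S(\mathbf{t}_{\mathbf{x}})$ in Equation 44. For example, in Equation 44, $\sum_{\mathbf{x} \in S(\mathbf{t}_{\mathbf{x}})} \mathbf{t}_{\mathbf{x}}(\mathbf{y}) \cdot \mathbf{a}(\mathbf{x}) = 0$ whenever $S(\mathbf{t}_{\mathbf{x}}) = \emptyset$.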
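The right product restricted to the template support can be sketched as follows. The representation is an assumption made for illustration: an image is a dictionary from points to values, and a template is a function that returns, for a target point y, a dictionary holding only the weights on its support. The names and the explicit identity argument are hypothetical.

```python
import operator


def right_product(a, t, targets, gamma, circ, identity):
    """Backward image-template operation over the support of each t(y).

    a        : image as {point: value}
    t        : function y -> {x: weight} listing only the support of t at y
    gamma    : associative, commutative reduction, with `identity` as its unit
    circ     : pixelwise combination of an image value and a template weight
    """
    b = {}
    for y in targets:
        acc = identity
        for x, w in t(y).items():
            if x in a:  # points outside the image domain contribute nothing
                acc = gamma(acc, circ(a[x], w))
        b[y] = acc
    return b


def moving_sum_template(y):
    """Translation-invariant 1-D template with unit weights at y-1, y, y+1."""
    return {y - 1: 1, y: 1, y + 1: 1}


a = {0: 1, 1: 2, 2: 3, 3: 4, 4: 5}
# gamma = +, circ = *, identity = 0 gives the image-template convolution.
b = right_product(a, moving_sum_template, range(5),
                  operator.add, operator.mul, 0)
```

Only three products are formed per target point here, rather than one per point of the image domain, which is the saving obtained by summing over the support.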
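The three elementary image-template operations employed in the transformation of real-valued images are called generalized convolution ($\oplus$), multiplicative maximum ($\varovee$, where $\gamma = \vee$ and $\circ = \cdot$), and additive maximum ($\boxed{\vee}$, where $\gamma = \vee$ and $\circ = +$). For example, if $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$ and $\mathbf{t} \in (\mathbb{F}^{\mathbf{X}})^{\mathbf{Y}}$ are defined in terms of the value sets known as the extended real numbers ($\mathbb{R}_{-\infty} = \mathbb{R} \cup \{-\infty\}$ and $\mathbb{R}_{+\infty} = \mathbb{R} \cup \{+\infty\}$), then we have the following nonlinear operations

$$\mathbf{a} \varovee \mathbf{t} = \left\{(\mathbf{y}, \mathbf{b}(\mathbf{y})) : \mathbf{b}(\mathbf{y}) = \bigvee_{\mathbf{x} \in \mathbf{X}} \mathbf{a}(\mathbf{x}) \cdot \mathbf{t}_{\mathbf{y}}(\mathbf{x}),\ \mathbf{y} \in \mathbf{Y}\right\} \qquad (45)$$

and

$$\mathbf{a}\, \boxed{\vee}\, \mathbf{t} = \left\{(\mathbf{y}, \mathbf{b}(\mathbf{y})) : \mathbf{b}(\mathbf{y}) = \bigvee_{\mathbf{x} \in \mathbf{X}} \mathbf{a}(\mathbf{x}) + \mathbf{t}_{\mathbf{y}}(\mathbf{x}),\ \mathbf{y} \in \mathbf{Y}\right\}. \qquad (46)$$

The functions of multiplicative and additive minimum ($\varowedge$ and $\boxed{\wedge}$, respectively) are defined symmetrically to Equations 45 and 46. The operations $\varovee$ and $\boxed{\vee}$, together with a larger class of generalized image-template operators, are discussed in detail in Reference 54.

Templates can implement mappings across heterogeneous point sets, and are thereby useful for coordinate transformations, such as image rotation, magnification, or changes in dimensionality. In the latter case, applications to multispectral imagery and sensor fusion have been extensive [56].

Key to the utilization of the nonlinear functions $\varovee$ and $\boxed{\vee}$ in applications of greater complexity is the concept of conjugacy, which is described as follows. Recall the preceding definitions of IA operations in terms of basic operations, as well as our previous mention of the pseudo-inverse. Likewise, recall that operations on $\mathbb{F}^{\mathbf{X}}$ are induced by the corresponding operations on $\mathbb{F}$. Thus, one can see that the ring $(\mathbb{R}^{\mathbf{X}}, +, *)$ and the lattice $(\mathbb{R}^{\mathbf{X}}, \vee)$ behave much like the ring and lattice of real numbers. Similar observations hold for the extended real-valued images. Given the notion of an additive or multiplicative inverse, the concepts of additive or multiplicative conjugacy are suggested. For example, if $\mathbf{a} \in \mathbb{R}^{\mathbf{X}}_{-\infty}$ or $\mathbf{a} \in \mathbb{R}^{\mathbf{X}}_{+\infty}$, and we assume that $-(+\infty) = -\infty$ and $-(-\infty) = +\infty$, then the additive conjugate of $\mathbf{a}$, denoted by $\mathbf{a}^*$, is defined as $\mathbf{a}^*(\mathbf{x}) = -(\mathbf{a}(\mathbf{x}))$, $\mathbf{x} \in \mathbf{X}$. Thus, $(\mathbf{a}^*)^* = \mathbf{a}$, and if $\mathbf{a} \in \mathbb{R}^{\mathbf{X}}_{-\infty}$, then $\mathbf{a}^* \in \mathbb{R}^{\mathbf{X}}_{+\infty}$, which is the conjugate space.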
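A minimal sketch of the additive maximum and the additive conjugate follows, under the same illustrative assumptions as before: images are dictionaries from points to extended real values, and a template is a function returning its finite weights at a target point. The names are hypothetical.

```python
NEG_INF = float("-inf")


def additive_max(a, t, targets):
    """Additive maximum: b(y) = max over x of a(x) + t_y(x).

    t(y) lists only the points where the weight is not -infinity; a target
    with no such point inside the image domain receives -infinity.
    """
    return {y: max((a[x] + w for x, w in t(y).items() if x in a),
                   default=NEG_INF)
            for y in targets}


def additive_conjugate(a):
    """Additive conjugate a*(x) = -a(x), with -(+inf) = -inf and vice versa."""
    return {x: -v for x, v in a.items()}


def flat_neighborhood(y):
    """1-D template with zero weights at y-1, y, y+1 (a local maximum filter)."""
    return {y - 1: 0, y: 0, y + 1: 0}


a = {0: 1, 1: 5, 2: 2, 3: NEG_INF}
local_max = additive_max(a, flat_neighborhood, range(4))
conjugate = additive_conjugate(a)            # maps -inf to +inf
twice = additive_conjugate(conjugate)        # recovers a
```

With zero weights the operation reduces to a local maximum, which is the flat grayscale dilation of mathematical morphology.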
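Denote $\mathbb{R}^{\geq 0}_{+\infty} = \mathbb{R}^+ \cup \{0, +\infty\}$. For $\mathbf{a} \in (\mathbb{R}^{\geq 0}_{+\infty})^{\mathbf{X}}$, the multiplicative conjugate of $\mathbf{a}$ is denoted by $\bar{\mathbf{a}}$ and is defined as

$$\bar{\mathbf{a}}(\mathbf{x}) = \begin{cases} 1/\mathbf{a}(\mathbf{x}) & \text{if } \mathbf{a}(\mathbf{x}) \neq 0 \text{ and } \mathbf{a}(\mathbf{x}) \neq +\infty \\ +\infty & \text{if } \mathbf{a}(\mathbf{x}) = 0 \\ 0 & \text{if } \mathbf{a}(\mathbf{x}) = +\infty \end{cases},\quad \mathbf{x} \in \mathbf{X}. \qquad (47)$$

Likewise, several template inversions are key to the expression of image transformations. The transpose of a template $\mathbf{t} \in (\mathbb{F}^{\mathbf{X}})^{\mathbf{Y}}$ is the template $\mathbf{t}' \in (\mathbb{F}^{\mathbf{Y}})^{\mathbf{X}}$, which is defined in terms of its weights as $\mathbf{t}'_{\mathbf{x}}(\mathbf{y}) = \mathbf{t}_{\mathbf{y}}(\mathbf{x})$, $\mathbf{x} \in \mathbf{X}$, $\mathbf{y} \in \mathbf{Y}$. The following conjugate templates are defined for the extended real-valued numbers. If $\mathbf{t} \in (\mathbb{R}^{\mathbf{X}}_{-\infty})^{\mathbf{Y}}$ (or $\mathbf{t} \in (\mathbb{R}^{\mathbf{X}}_{+\infty})^{\mathbf{Y}}$), then the additive conjugate of $\mathbf{t}$ is the template $\mathbf{t}^* \in (\mathbb{R}^{\mathbf{Y}}_{+\infty})^{\mathbf{X}}$ (or $\mathbf{t}^* \in (\mathbb{R}^{\mathbf{Y}}_{-\infty})^{\mathbf{X}}$), such that $\mathbf{t}^*_{\mathbf{x}}(\mathbf{y}) = -\mathbf{t}_{\mathbf{y}}(\mathbf{x})$, $\mathbf{x} \in \mathbf{X}$, $\mathbf{y} \in \mathbf{Y}$. If $\mathbf{t} \in ((\mathbb{R}^{\geq 0}_{+\infty})^{\mathbf{X}})^{\mathbf{Y}}$, then the multiplicative conjugate of $\mathbf{t}$ is the template $\bar{\mathbf{t}} \in ((\mathbb{R}^{\geq 0}_{+\infty})^{\mathbf{Y}})^{\mathbf{X}}$, such that $\bar{\mathbf{t}}_{\mathbf{x}}(\mathbf{y}) = \overline{[\mathbf{t}_{\mathbf{y}}(\mathbf{x})]}$, $\mathbf{x} \in \mathbf{X}$, $\mathbf{y} \in \mathbf{Y}$, after Equation 47.

With template inversion, the operations $\varovee$ and $\boxed{\vee}$ are employed in the definition of the conjugate operations in a manner that is reminiscent of DeMorgan's Law, i.e.,

$$\mathbf{a}\, \boxed{\wedge}\, \mathbf{t} = (\mathbf{t}^*\, \boxed{\vee}\, \mathbf{a}^*)^* \qquad (48)$$

and

$$\mathbf{a} \varowedge \mathbf{t} = \overline{(\bar{\mathbf{t}} \varovee \bar{\mathbf{a}})}. \qquad (49)$$

2.1.5. Nonrecursive operations on templates

The elementary arithmetic operations on images given in Equation 33 also generalize to templates. If $\mathbf{s}$ and $\mathbf{t}$ are real-valued templates from $\mathbf{Y}$ to $\mathbf{X}$, then pointwise template arithmetic is defined as follows:

$$\begin{aligned}
\mathbf{s} + \mathbf{t} &\text{ is defined by } (\mathbf{s} + \mathbf{t})_{\mathbf{y}} = \mathbf{s}_{\mathbf{y}} + \mathbf{t}_{\mathbf{y}}, \\
\mathbf{s} * \mathbf{t} &\text{ is defined by } (\mathbf{s} * \mathbf{t})_{\mathbf{y}} = \mathbf{s}_{\mathbf{y}} * \mathbf{t}_{\mathbf{y}}, \text{ and} \\
\mathbf{s} \vee \mathbf{t} &\text{ is defined by } (\mathbf{s} \vee \mathbf{t})_{\mathbf{y}} = \mathbf{s}_{\mathbf{y}} \vee \mathbf{t}_{\mathbf{y}}.
\end{aligned} \qquad (50)$$

Addition and multiplication are also defined for complex-valued templates. The arithmetic of extended real-valued templates is described in Reference 54.

Likewise, the operations of generalized convolution, as well as additive and multiplicative maximum, generalize to operations between templates. If $\mathbf{t}$ is a real- or complex-valued template from $\mathbf{Y}$ to $\mathbf{X}$, and $\mathbf{s}$ is a real- or complex-valued template from $\mathbf{X}$ to $\mathbf{W}$, the template $\mathbf{r} = \mathbf{s} \oplus \mathbf{t}$ from $\mathbf{Y}$ to $\mathbf{W}$ is defined by the image function $\mathbf{r}_{\mathbf{y}}$, as follows:

$$\mathbf{r}_{\mathbf{y}}(\mathbf{w}) = \sum_{\mathbf{x} \in \mathbf{X}} \mathbf{t}_{\mathbf{y}}(\mathbf{x}) \cdot \mathbf{s}_{\mathbf{x}}(\mathbf{w}), \text{ where } \mathbf{w} \in \mathbf{W}.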
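(51)$$

Computation of the weights $\mathbf{r}_{\mathbf{y}}(\mathbf{w})$ can be accomplished by summing over a subset of $\mathbf{X}$. For example, given a point $\mathbf{y}$ in $\mathbf{Y}$, then for each $\mathbf{w} \in \mathbf{W}$, we define the set $S(\mathbf{w}) = \{\mathbf{x} \in \mathbf{X} : \mathbf{x} \in S(\mathbf{t}_{\mathbf{y}}) \text{ and } \mathbf{w} \in S(\mathbf{s}_{\mathbf{x}})\}$. Then, since $\mathbf{t}_{\mathbf{y}}(\mathbf{x}) \cdot \mathbf{s}_{\mathbf{x}}(\mathbf{w}) = 0$ if $\mathbf{x} \notin S(\mathbf{w})$, we obtain

$$\mathbf{r}_{\mathbf{y}}(\mathbf{w}) = \sum_{\mathbf{x} \in S(\mathbf{w})} \mathbf{t}_{\mathbf{y}}(\mathbf{x}) \cdot \mathbf{s}_{\mathbf{x}}(\mathbf{w}), \qquad (52)$$

where the definition $\sum_{\mathbf{x} \in S(\mathbf{w})} \mathbf{t}_{\mathbf{y}}(\mathbf{x}) \cdot \mathbf{s}_{\mathbf{x}}(\mathbf{w}) = 0$ holds whenever $S(\mathbf{w}) = \emptyset$. The operations of additive and multiplicative maximum are similarly defined, and are discussed in detail in Reference 54.

Template composition and decomposition are useful for algorithmic optimization, which is the primary motivation for introducing operations between generalized templates. For example, if $\mathbf{r}$ is the $3 \times 3$ template of unit weights, $\mathbf{s} = (1\ 1\ 1)$, and $\mathbf{t} = (1\ 1\ 1)'$, where the target point is the center weight in each case, and $\mathbf{r} = \mathbf{s} \oplus \mathbf{t}$, then the computation of $\mathbf{a} \oplus \mathbf{r} = \mathbf{a} \oplus (\mathbf{s} \oplus \mathbf{t})$ by $(\mathbf{a} \oplus \mathbf{s}) \oplus \mathbf{t}$ uses six local multiplications instead of nine. It has been shown that, if $\mathbf{r}$ is an $n \times n$ template, and $\mathbf{s}$ and $\mathbf{t}$ are the decompositions of $\mathbf{r}$ into $1 \times n$ and $n \times 1$ templates, respectively, then the computation of $\mathbf{a} \oplus \mathbf{r}$ by $(\mathbf{a} \oplus \mathbf{s}) \oplus \mathbf{t}$ uses $2n$ multiplications, instead of $n^2$. This concept is further discussed in Reference 55.

2.1.6. Recursive image algebra

Our summary of recursive IA operations (after Li [57]) includes an overview of recursive templates (Section 2.1.6.1), operations between images and recursive templates (Section 2.1.6.2), and operations between recursive templates (Section 2.1.6.3).

2.1.6.1. Recursive templates. A generalized recursive template is defined in terms of the generalized template of Section 2.1.4, together with a partial order imposed upon certain point sets. A partially ordered set or poset $P$ together with a binary relation $\prec$ satisfies the axioms of reflexivity, antisymmetry, and transitivity. Let $(P, \prec)$ be a poset and $[n] = \{1, 2, \ldots, n\}$, where $n = |P|$. A linear extension of $(P, \prec)$ is a bijection $\sigma : P \to [n]$ such that $(x \prec y) \in P \Rightarrow (\sigma(x) < \sigma(y)) \in [n]$.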
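A linear extension can be computed by topological sorting, as in the following minimal sketch. It assumes the order is supplied as a set of pairs (x, y) with x strictly preceding y, and it uses the standard-library `graphlib` module (Python 3.9 or later); the function name and the example order are hypothetical.

```python
from graphlib import TopologicalSorter


def linear_extension(elements, precedes):
    """Return sigma : P -> {1, ..., n} with sigma(x) < sigma(y) when x precedes y.

    elements : iterable of hashable points
    precedes : set of pairs (x, y), x != y, meaning x comes before y
    """
    predecessors = {e: set() for e in elements}
    for x, y in precedes:
        predecessors[y].add(x)
    order = TopologicalSorter(predecessors).static_order()
    return {e: i for i, e in enumerate(order, start=1)}


# Componentwise order on a 2 x 2 grid of pixels: p precedes q when p != q and
# both coordinates of p are no larger than those of q.
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
relation = {(p, q) for p in points for q in points
            if p != q and p[0] <= q[0] and p[1] <= q[1]}
sigma = linear_extension(points, relation)
```

A raster scan is one linear extension of this componentwise order; the sorter may return a different one, since linear extensions are generally not unique.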
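Let $\mathbb{F}$ denote a value set and let $\mathbf{X}$ and $\mathbf{Y}$ denote point sets, where $\prec$ is a partial order imposed upon $\mathbf{Y}$. A generalized $\mathbb{F}$-valued recursive template from $\mathbf{Y}$ to $\mathbf{X}$ is a pair $\mathbf{t} = (\mathbf{t}_{<}, \mathbf{t}_{\prec}) \in (\mathbb{F}^{\mathbf{X}}, \mathbb{F}^{\mathbf{Y}})^{\mathbf{Y}}$, whose components at a target point $\mathbf{y}$ are written $\mathbf{t}_{<\mathbf{y}}$ and $\mathbf{t}_{\prec\mathbf{y}}$.

A recursive template $\mathbf{t} \in (\mathbb{F}^{\mathbf{X}}, \mathbb{F}^{\mathbf{X}})^{\mathbf{X}}$ is translation-invariant if and only if, for each triple $\mathbf{x}, \mathbf{y}, \mathbf{z} \in \mathbf{X}$ with $\mathbf{y} + \mathbf{z} \in \mathbf{X}$ and $\mathbf{x} + \mathbf{z} \in \mathbf{X}$, we have $\mathbf{t}_{\mathbf{y}}(\mathbf{x}) = \mathbf{t}_{\mathbf{y}+\mathbf{z}}(\mathbf{x} + \mathbf{z})$, which is equivalent to $\mathbf{t}_{<\mathbf{y}}(\mathbf{x}) = \mathbf{t}_{<\mathbf{y}+\mathbf{z}}(\mathbf{x} + \mathbf{z})$ and $\mathbf{t}_{\prec\mathbf{y}}(\mathbf{x}) = \mathbf{t}_{\prec\mathbf{y}+\mathbf{z}}(\mathbf{x} + \mathbf{z})$.

2.1.6.2. Operations between images and recursive templates. Recursive operations in image algebra are generalized in terms of the recursive right and left products, similar to Equations 42 and 44. Let $\Gamma$ denote the global reduce operation, as before, and let $\gamma : \mathbb{F} \times \mathbb{F} \to \mathbb{F}$ be associative and commutative. Let $\mathbb{F}_1$ and $\mathbb{F}_2$ be two value sets with operations $\circ_1, \circ_2 : \mathbb{F}_1 \times \mathbb{F}_2 \to \mathbb{F}$. If $\mathbf{a} \in (\mathbb{F}_1)^{\mathbf{X}}$ and $\mathbf{t} \in (\mathbb{F}_2^{\mathbf{X}}, \mathbb{F}_2^{\mathbf{Y}})^{\mathbf{Y}}$, then the generalized recursive right product is the binary operation $\mathbb{F}_1^{\mathbf{X}} \times (\mathbb{F}_2^{\mathbf{X}}, \mathbb{F}_2^{\mathbf{Y}})^{\mathbf{Y}} \to \mathbb{F}^{\mathbf{Y}}$, defined as

$$\mathbf{a} \mathbin{\textcircled{$\gamma$}}_{\prec} \mathbf{t} = \left\{(\mathbf{y}, \mathbf{c}(\mathbf{y})) : \mathbf{c}(\mathbf{y}) = \left(\Gamma_{\mathbf{x} \in \mathbf{X}}\, \mathbf{a}(\mathbf{x}) \circ_1 \mathbf{t}_{<\mathbf{y}}(\mathbf{x})\right) \gamma \left(\Gamma_{\mathbf{z} \in \mathbf{Y}}\, \mathbf{c}(\mathbf{z}) \circ_2 \mathbf{t}_{\prec\mathbf{y}}(\mathbf{z})\right),\ \mathbf{y} \in \mathbf{Y}\right\}, \qquad (53)$$

with the left product symmetrically defined. For example, if $\mathbb{F}, \mathbb{F}_1, \mathbb{F}_2 \subset \mathbb{R}$, $\gamma = +$, and $\circ_1, \circ_2 = \cdot$, then the recursive right product becomes $\oplus_{\prec}$, the recursive convolution. Likewise, if $\mathbb{F}, \mathbb{F}_1, \mathbb{F}_2 = \mathbb{R}_{-\infty}$, $\gamma = \vee$, and $\circ_1, \circ_2 = +$, then the recursive right product becomes $\boxed{\vee}_{\prec}$, the recursive additive maximum.

2.1.6.3. Operations between recursive templates. It is well established that image algebra template composition and decomposition are useful for algorithmic optimization. For example, Li showed that for $\mathbf{a} \in \mathbb{R}^{\mathbf{X}}$ and for $\mathbf{s}, \mathbf{t} \in (\mathbb{R}^{\mathbf{X}})^{\mathbf{X}}$,

$$\begin{aligned}
\mathbf{a} \oplus (\mathbf{s} \oplus \mathbf{t}) &= (\mathbf{a} \oplus \mathbf{s}) \oplus \mathbf{t} \quad\text{and} \\
\mathbf{a} \oplus (\mathbf{s} + \mathbf{t}) &= (\mathbf{a} \oplus \mathbf{s}) + (\mathbf{a} \oplus \mathbf{t}).
\end{aligned} \qquad (54)$$

Additionally, if $\mathbf{a} \in \mathbb{R}^{\mathbf{X}}_{-\infty}$ and $\mathbf{s}, \mathbf{t} \in (\mathbb{R}^{\mathbf{X}}_{-\infty})^{\mathbf{X}}$, then

$$\begin{aligned}
\mathbf{a}\, \boxed{\vee}\, (\mathbf{s}\, \boxed{\vee}\, \mathbf{t}) &= (\mathbf{a}\, \boxed{\vee}\, \mathbf{s})\, \boxed{\vee}\, \mathbf{t} \quad\text{and} \\
\mathbf{a}\, \boxed{\vee}\, (\mathbf{s} \vee \mathbf{t}) &= (\mathbf{a}\, \boxed{\vee}\, \mathbf{s}) \vee (\mathbf{a}\, \boxed{\vee}\, \mathbf{t}).
\end{aligned} \qquad (55)$$

Li [57] extended template decomposition to recursive templates, and proved certain associative and distributive laws. We summarize salient theory, as follows.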
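Let $\mathbf{X}, \mathbf{W} \subset \mathbb{R}^n$ be finite point sets, and let $\mathbf{t}$ and $\mathbf{s}$ be real-valued recursive templates. The generalized recursive convolution of two recursive templates, denoted by $\mathbf{r} = \mathbf{s} \oplus_{\prec} \mathbf{t}$, where $\mathbf{r} = (\mathbf{r}_{<}, \mathbf{r}_{\prec})$, is defined as

$$\mathbf{r}_{<} = \mathbf{s}_{<} \oplus \mathbf{t}_{<} \quad\text{and}\quad \mathbf{r}_{\prec} = \mathbf{1} - (\mathbf{1} - \mathbf{s}_{\prec}) \oplus (\mathbf{1} - \mathbf{t}_{\prec}), \qquad (56)$$

where $\mathbf{1} \in (\mathbb{R}^{\mathbf{X}})^{\mathbf{X}}$ denotes the unitary template, which is defined in terms of its weights as

$$\mathbf{1}_{\mathbf{x}}(\mathbf{z}) = \begin{cases} 1 & \text{if } \mathbf{z} = \mathbf{x} \\ 0 & \text{otherwise} \end{cases},\quad \mathbf{x}, \mathbf{z} \in \mathbf{X}. \qquad (57)$$

Given the previously-defined templates $\mathbf{s}$ and $\mathbf{t}$, the recursive addition $\mathbf{r} = \mathbf{s} +_{\prec} \mathbf{t}$ is given by

$$\begin{aligned}
\mathbf{r}_{<} &= \mathbf{s}_{<} \oplus (\mathbf{1} - \mathbf{t}_{\prec}) + (\mathbf{1} - \mathbf{s}_{\prec}) \oplus \mathbf{t}_{<} \\
\mathbf{r}_{\prec} &= \mathbf{1} - (\mathbf{1} - \mathbf{s}_{\prec}) \oplus (\mathbf{1} - \mathbf{t}_{\prec}).
\end{aligned} \qquad (58)$$

Nonlinear recursive template composition and inversion are defined in Reference 57. Having concluded the discussion of image algebra, we next consider notation specific to this document.

2.2. Study Notation

In addition to image algebra, which comprises the majority of the notation employed in this study, we adopt the following conventions:

a) Image transformations are expressed in general form as $T : \mathbb{F}^{\mathbf{X}} \to \mathbb{G}^{\mathbf{Y}}$, where $\mathbb{F} \neq \mathbb{G}$ and $\mathbf{X} \neq \mathbf{Y}$ are possible.
b) Operations in image space, i.e., over $\mathrm{domain}(T)$, are denoted by $\mathcal{O}$, unless otherwise stated.
c) Operations over $\mathrm{range}(T)$ are usually denoted by $\mathcal{O}'$.
d) The images of primary concern are the source image $\mathbf{a}$; the transformed image $\mathbf{a}_c = T(\mathbf{a})$; the result of the operation $\mathcal{O}(\mathbf{a})$; and the result of the operation $\mathcal{O}'(\mathbf{a}_c)$.

The i-th derivative of an image $\mathbf{a} \in \mathbb{R}^{\mathbf{X}}$ is denoted as

$$\frac{d^i \mathbf{a}}{d\mathbf{x}^i},\quad \mathbf{x} \in \mathbf{X}, \qquad (59)$$

provided that the operation $d$ is defined over $\mathbf{X}$. Assuming that $\mathbf{X} \subset \mathbb{R}^n$, the i-th partial derivative with respect to the projection of a point $\mathbf{x} \in \mathbf{X}$ is denoted as

$$\frac{\partial^i \mathbf{a}}{\partial x_j^i}. \qquad (60)$$

In addition to the sets described in Section 2.1.1, which are denoted in open face ($\mathbb{F}$) and bold face ($\mathbf{X}$) type, sets may also be specified by upper-case letters in normal (S), italic ($G$), script ($\mathcal{V}$), or Gothic ($\mathfrak{O}$) faces. The properties of a set S are usually denoted by $\mathcal{P}(S)$, unless otherwise stated.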
Set properties that are disclosed to or concealed from the scrutiny of an oracle $\mathfrak{O}$ (defined in Chapter 5) are denoted by corresponding script letters, for example $\mathcal{D}(S)$ for disclosed properties. Scalar variables are denoted in lower case by normal face type (e.g., x, y) or by italic type (e.g., $x$, $y$), unless otherwise stated. The operator ($\cdot$) denotes scalar multiplication when applied to scalar variables. When ($\cdot$) is applied to one- or two-dimensional matrices, or to images defined on a subset of $\mathbb{R}$ or $\mathbb{R}^2$, then matrix multiplication is indicated. In contrast, Hadamard multiplication of matrices or images is denoted by ($*$).

The time complexity of an operation $f$ is denoted by $\mathrm{T}(f)$, which is not to be confused with the transform notation $T$ (italic font). Similarly, the space complexity of $f$ is denoted by $\mathrm{S}(f)$, and the work required to compute $f$, by $\mathrm{W}(f)$. Note that the work $\mathrm{W}$ is not to be confused with $\mathbf{W}$, which denotes a point set. Additionally, the cost of computing $f$ is denoted by $\mathrm{C}(f)$, which is not to be confused with $C$, an image transform that is used in Chapter 4. When we compare the computation of $f$ over $n$ input data on various N-processor architectures $A$, then we occasionally elaborate $\mathrm{T}$, $\mathrm{S}$, $\mathrm{W}$, and $\mathrm{C}$ as, for example, $\mathrm{T}_A(f, n)$.

The superscript asterisk ($*$) may denote template conjugation (e.g., $\mathbf{t}^*$), complex conjugation of an image (e.g., $\mathbf{a}^*$), or may denote an approximation to an inverse transform (e.g., $T^*$), which is defined in the following chapter.

CHAPTER 3
FUNDAMENTAL THEORY

In this chapter, we develop theory that describes the processing of transformed data. Section 3.1 contains a review of the concepts of homomorphism, isomorphism, and heteroassociativity. In Section 3.2 we present the basic theory of computation over compressed or encrypted imagery, which is related to several problems of computer science, such as the transform properties problem. Section 3.3 contains a taxonomy of compressive and encryptive transformations that is organized in four classes.
Such classification facilitates the development of high-level methods for deriving functions that compute over the range spaces of transforms in each class, as exemplified in Section 3.4.

3.1. Mathematical Concepts.

This study is based in part upon the concepts of commutativity diagrams, homomorphism, and isomorphism. Homomorphic transformations facilitate the processing of transformed imagery in a manner analogous to operations that process the source image. An isomorphism is a special case of a homomorphism. Isomorphisms are important in the direct mapping of operations over one domain to operations in another domain. Additionally, isomorphism is useful when discussing data security, as in Chapter 4. We begin with a summary of basic mathematical terms.

3.1.1. Basic Theory.

The concepts of elements, sets, and mappings are key to our theoretical development, and constitute the basic building blocks of image algebra.

3.1.1.1. Definition. The $n$-fold Cartesian product of a set $S$ is defined by

$$S^n = \prod_{i=1}^{n} S = \{(s_1, s_2, \ldots, s_n) : s_1, s_2, \ldots, s_n \in S\}. \tag{61}$$

3.1.1.2. Definition. A mapping $M$ from set $R$ to set $S$ is denoted by $M : R \to S$. The collection of all mappings $R \to S$ is denoted by $S^R$. Thus, if $M : R \to S$, then $M \in S^R$. We call $R$ the domain of $M$, written as $R = domain(M)$. The range of $M$ is defined as $range(M) = \{s \in S : s = M(r) \text{ for some } r \in R\}$.

3.1.1.3. Definition. An $m$-to-$k$-ary operation $\bigcirc$ from set $R$ to set $S$ is a mapping $\bigcirc : \prod_{i=1}^{m} R \to \prod_{j=1}^{k} S$, where it is hereafter understood that $m, k \geq 1$. When $k = 1$, we call $\bigcirc$ an $m$-ary operation, denoted by $\bigcirc_m$.

3.1.2. Groupoids, Operations, and Transformations.

We next state facts concerning groupoids, operations, and transformations, which lead to definitions of homomorphism and isomorphism. Groupoids combine the concepts of sets and operations, and lead to the consideration of transforms and their inverses. The definition of a transformation leads to discussion of homomorphism and isomorphism.

3.1.2.1. Definition.
A groupoid $(S, \bigcirc)$ consists of the set $S$ and a binary operation $\bigcirc : S \times S \to S$. A groupoid whose binary operation is associative is called a semigroup.

3.1.2.2. Definition. We call $e$ the identity element of $S$ with respect to the operation $\bigcirc : S \times S \to S$ if and only if

$$f \bigcirc e = e \bigcirc f = f, \quad \forall f \in S. \tag{62}$$

3.1.2.3. Definition. Let $S$ be a set with identity $e$ with respect to the operation $\bigcirc : S \times S \to S$. If $f, g \in S$, then $g$ is called the inverse of $f$ with respect to the operation $\bigcirc$ if and only if

$$f \bigcirc g = g \bigcirc f = e. \tag{63}$$

If $f \in S$ and $g \in S$ is the inverse of $f$, then we denote $g$ by $f^{-1}$.

3.1.2.4. Definition. Suppose $T : S \to S'$ and $V : S' \to S$. Then $V$ is called the inverse of the transformation $T$ if and only if

$$(V \circ T)(f) = f, \quad \forall f \in S. \tag{64}$$

If $T : S \to S'$ and $V : S' \to S$ is the inverse of $T$, then we denote $V$ by $T^{-1}$. If we denote $T(f)$ by $f'$ (i.e., $f' = T(f)$), then it follows that $T^{-1}(f') = f$, $\forall f \in S$.

3.1.2.5. Definition. Let $T : S \to S'$ be a homomorphism. If $S$ is a normed space with norm $\| \cdot \|$ and $\varepsilon \in \mathbb{R}$, then we say that a homomorphism $T^* : S' \to S$ is an $\varepsilon$-near inverse approximation of $T$ if and only if

$$\| T^*(T(a)) - a \| < \varepsilon, \quad \forall a \in S. \tag{65}$$

Note that every inverse is an $\varepsilon$-near inverse for all $\varepsilon > 0$.

3.1.2.6. Definition. Given groupoids $(S, \bigcirc)$ and $(S', \bigcirc')$, the transformation $T : S \to S'$ is a homomorphism from $S$ to $S'$ if and only if

$$T(f \bigcirc g) = T(f) \bigcirc' T(g), \quad \forall f, g \in S. \tag{66}$$

3.1.2.7. Definition. Let $(S, \bigcirc)$ and $(S', \bigcirc')$ be groupoids with homomorphism $T : S \to S'$. Then, $T^{-1} : S' \to S$ is an inverse homomorphism for $T$ if and only if $T^{-1}$ is the inverse transform of $T$ and $T^{-1}$ is a homomorphism.

3.1.2.8. Definition. If $T : S \to S'$ is a homomorphism and $T$ is a one-to-one and onto mapping, then $T$ is an isomorphism of $S$ onto $S'$.

3.1.2.9. Definition. A semigroup is a groupoid whose binary operation is associative.

3.1.2.10. Definition. A monoid is a semigroup with identity.

3.1.2.11. Theorem. Let $(S, \bigcirc)$ and $(S', \bigcirc')$ be monoids, and let $e$ denote the identity of $S$.
If $T : S \to S'$ is an isomorphism, then the following statements hold:
(i) If $e'$ is the identity element of $S'$, then $e' = T(e)$, and
(ii) $T(f^{-1}) = (T(f))^{-1}$, $\forall f \in S$ which has an inverse $f^{-1} \in S$.
Proof. The proof is well known.

3.1.3. Preservation of Set Properties.

We next develop theory that describes the preservation of various properties of sets and mappings under homomorphism. The following theory is fundamental to the derivation of functions that compute over the range space of an image transformation.

3.1.3.1. Notation. Denote the properties of a collection of subsets of a set $S$ by $\mathcal{P}(S)$. If $A \subset S$ has property $q$ and $A$ is a singleton set (i.e., $A = \{u\}$), then we say that $u$ has property $q$.

3.1.3.2. Definition. We say that the transform $T : S \to S'$ preserves the properties $\mathcal{P}(S)$ if and only if for each property $q \in \mathcal{P}(S)$ and for all $A \subset S$ with property $q$, $T(A)$ also has property $q$. We illustrate this concept with the following examples.

3.1.3.3. Example. Let $q \in \mathcal{P}(S)$ be the property of connectivity, and let $T : S \to S'$ preserve $q$. This means that for every connected set $A \subset S$, $T(A)$ must also be connected.

3.1.3.4. Example. Given an image $\mathbf{a} \in (\mathbb{Z}_n)^{\mathbf{X}}$, it is easily verified that the sorted histogram $h(\mathbf{a})$ is preserved under the transformation $f(\mathbf{a}) = n - \mathbf{a}$.

3.1.3.5. Example. The fact that associativity and commutativity are preserved under homomorphisms follows from the fact that the operations of algebraic structures are preserved under homomorphisms.

3.1.3.6. Definition. We say that a binary operation $\bigcirc : S \times S \to S$ preserves property $q \in \mathcal{P}(S)$ if and only if, for all pairs of subsets $A, B \subset S$ with property $q$, or for each pair of elements $f, g \in S$ with property $q$, $\bigcirc(A, B)$ or $f \bigcirc g$ exhibits property $q$.

3.1.3.7. Definition. A groupoid $(S, \bigcirc)$ is said to exhibit property $q$ if the operation $\bigcirc$ preserves property $q$.

3.1.3.8. Definition. An algebra $A = (\mathcal{F}, \mathcal{O})$ consists of a set $\mathcal{F}$ of operands and a set $\mathcal{O}$ of operations upon the operands in $\mathcal{F}$.

3.1.3.9. Example.
The algebra consisting of the real numbers together with the operations of multiplication and addition is denoted by the tuple $(\mathbb{R}, \{+, \cdot\})$, which can be written as $(\mathbb{R}, +, \cdot)$ for purposes of simplicity.

3.1.4. Heteroassociativity.

We next discuss the property of heteroassociativity, which is employed in the derivation of operations over the range spaces of certain encryptive transforms.

3.1.4.1. Definition. An operation $\gamma : S \times S \to S$ is associative with respect to the operation $\bigcirc : S \times S \to S$ if and only if

$$(f \bigcirc g)\, \gamma\, h = f \bigcirc (g\, \gamma\, h), \quad \forall f, g, h \in S. \tag{67}$$

3.1.4.2. Definition. If an operation $\gamma$ is associative with respect to an operation $\bigcirc$ and $\bigcirc$ is associative with respect to $\gamma$, then we say that the operations $\gamma, \bigcirc : S \times S \to S$ are heteroassociative.

3.1.4.3. Definition. An algebra $(S, \bigcirc, \gamma)$ is called a heteroassociative algebra if and only if the operations $\gamma, \bigcirc : S \times S \to S$ are heteroassociative.

3.1.5. Heteroassociativity of Inverse Functions.

We next consider inverse operations and their heteroassociative properties.

3.1.5.1. Definition. Let $\bigcirc : S \times S \to S$ and let $S' = \{f \in S : f \text{ has an inverse}\}$. Then, $\Omega : S \times S' \to S$ is called the inverse operation (or inverse) of $\bigcirc$ if and only if

$$f \bigcirc g^{-1} = f\, \Omega\, g, \quad \forall f \in S \text{ and } \forall g \in S', \tag{68}$$

where $g^{-1}$ denotes the inverse of $g$. If $\Omega$ is the inverse operation of $\bigcirc$, then we denote $\Omega$ by $\bigcirc^{-1}$.

3.1.5.2. Remark. In the preceding definition, $\bigcirc^{-1}$ does not mean the inverse function $S \to S \times S$. It should be clear from the context when we mean inverse operation versus inverse function.

3.1.5.3. Example. Referring to the preceding definition, note that if $S = \mathbb{R}$, then $S' = S$ for the operation of addition. Thus, subtraction is the inverse of addition over the real numbers. In contrast, although division is the inverse of multiplication over the real numbers, $S' = \mathbb{R} \setminus \{0\} \subset \mathbb{R}$, since $f/0$ remains undefined for all $f$ in $\mathbb{R}$.

3.1.5.4. Definition. Assume that $S$ is a normed space with norm $\| \cdot \|$. Let $\varepsilon \in \mathbb{R}^+$ and $S^* = S \times S$.
An operation $\bigcirc : S^* \to S$ has an $\varepsilon$-near inverse approximation $\bigcirc^*$ if and only if

$$f \bigcirc g^{-1} = f \bigcirc^{*} g, \quad \forall g \in S', \tag{69}$$

where $S'$ was defined in Section 3.1.5.1 and

$$\| (f \bigcirc^{*} f) - e \| < \varepsilon, \quad \forall f \in S, \tag{70}$$

with $e$ denoting the identity of $S$.

3.1.5.5. Observation. Assume that $(S, \bigcirc)$ and $(S', \bigcirc')$ are two monoids, where operations $\bigcirc : S \times S \to S$ and $\bigcirc' : S' \times S' \to S'$ each have an inverse denoted by $\bigcirc^{-1} : S \times S \to S$ and $(\bigcirc')^{-1} : S' \times S' \to S'$. If $T : S \to S'$ is a homomorphism, then $T(f \bigcirc f^{-1}) = T(f)\, (\bigcirc')^{-1}\, T(f)$, $\forall f^{-1} \in S$.

3.1.5.6. Definition. A monoid $(S, \bigcirc)$ is called a group if and only if each element $f \in S$ has an inverse $f^{-1} \in S$. Equivalently, a group is a set with an associative binary operation, an identity, and an inverse for each element. If $\bigcirc$ is commutative, then the group $(S, \bigcirc)$ is called an Abelian group.

3.1.5.7. Definition. A ring $(S, \bigcirc, \gamma)$ satisfies the following conditions for all $f, g, h \in S$:
(i) The functions $\bigcirc, \gamma : S \times S \to S$ are associative and commutative.
(ii) $\gamma$ has an identity $e \in S$, and for every $f \in S$ there exists an inverse $f^{-1}$ such that $f\, \gamma\, f^{-1} = e$.
(iii) $\bigcirc$ is left- and right-distributive over $\gamma$, i.e., $f \bigcirc (g\, \gamma\, h) = (f \bigcirc g)\, \gamma\, (f \bigcirc h)$ and $(f\, \gamma\, g) \bigcirc h = (f \bigcirc h)\, \gamma\, (g \bigcirc h)$, $\forall f, g, h \in S$.

3.1.5.8. Definition. A semi-ring $(S, \bigcirc)$ is a commutative semigroup. Note that this is different from an Abelian group $(S, \bigcirc)$ that is present in a ring $(S, \bigcirc, \gamma)$.

3.1.5.9. Lemma. Let $(S, \bigcirc)$ be a group. If $\bigcirc$ is associative, then the following statements hold:
(i) $\bigcirc^{-1}$ is associative, and
(ii) $(f \bigcirc g) \bigcirc^{-1} h = f \bigcirc (g \bigcirc^{-1} h)$, $\forall f, g, h \in S$,
where $\bigcirc^{-1}$ denotes the inverse operation of $\bigcirc$.
Proof. The proof follows from the definitions of a group and an inverse function.

3.1.5.10. Definition. Suppose that $A = (S, \bigcirc, \gamma)$ and $A' = (S', \bigcirc', \gamma')$ are algebras, where $\bigcirc$ and $\gamma$ are binary operations over $S$, and $\bigcirc'$ and $\gamma'$ are binary operations over $S'$. A function $T : S \to S'$ is called a homomorphism from algebra $A$ to algebra $A'$ if and only if the following conditions are satisfied for all $f, g \in S$:
(i) $T(f \bigcirc g) = T(f) \bigcirc' T(g)$, and
(ii) $T(f\, \gamma\, g) = T(f)\, \gamma'\, T(g)$.
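Conditions (i) and (ii) of Definition 3.1.5.10 can be checked numerically on a small instance. The following Python sketch assumes an example that does not appear in the text: the source algebra is $(\mathbb{Z}, +, \cdot)$, the target algebra is $(\mathbb{Z}_7, +, \cdot)$ with arithmetic modulo 7, and $T(f) = f \bmod 7$. All function names are hypothetical, and testing on a finite sample illustrates the definition rather than proving it.

```python
from itertools import product

N = 7  # modulus, chosen only for illustration (an assumption, not from the text)


def reduce_mod(f):
    """Candidate homomorphism T : Z -> Z_N (reduction modulo N)."""
    return f % N


def square(f):
    """A map that does not preserve addition, used as a counterexample."""
    return (f * f) % N


def is_homomorphism(T, samples):
    """Check conditions (i) and (ii) of Definition 3.1.5.10 on all sample pairs.

    Source algebra: (Z, +, *).  Target algebra: (Z_N, + mod N, * mod N).
    """
    for f, g in product(samples, repeat=2):
        preserves_add = T(f + g) == (T(f) + T(g)) % N  # condition (i)
        preserves_mul = T(f * g) == (T(f) * T(g)) % N  # condition (ii)
        if not (preserves_add and preserves_mul):
            return False
    return True


samples = range(-20, 21)
print(is_homomorphism(reduce_mod, samples))  # reduction mod N passes both checks
print(is_homomorphism(square, samples))      # squaring fails condition (i)
```

Both operations must be checked against their own counterparts in the target algebra; a map that respects only one of them, such as `square`, is rejected.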
Note that the correspondence of operations $\bigcirc$ and $\bigcirc'$, as well as $\gamma$ and $\gamma'$, must be preserved. It follows that the cardinality of the set of operations in algebra $A$ must equal the cardinality of the set of operations in $A'$.

This concludes the definitions upon which our theory of processing compressed data is based. We introduce the core theory in the following section.

3.2. Image Compression, Encryption, and Compressive Computation.

A mathematical discussion of compressive computation requires the following definitions as background. Section 3.2.1 contains definitions of parameters that characterize compressive transforms. In Section 3.2.2, we define operations that process over the range space of such transforms.

3.2.1. Parameters of Compressive Transforms.

3.2.1.1. Definition. Given a finite set $S \subset \mathbb{N}$ and $r \in \mathbb{N}$, the maximum field width of $S$, denoted by $siz(S)$, is the maximum number of radix-$r$ digits required to encode a value in $S$.

3.2.1.2. Example. If finite $S \subset \mathbb{R}^+$ is quantized under an optimal space constraint, then the binary representation ($r = 2$) of $S$ has $siz(S) = \lceil \log_2(|S|) \rceil$. However, optimality does not necessarily imply high accuracy, since error can result when elements of $S$ are quantized. The following example is illustrative.

3.2.1.3. Example. If finite $S \subset \mathbb{R}^+$ is quantized linearly into a radix-$r$ representation using a quantization interval $q \in \mathbb{R}^+$, then $siz(S)$ is bounded as

$$\lceil \log_r |S| \rceil \leq siz(S) \leq \bigvee \{ y : y = \lceil \log_r(x/q) \rceil,\ x \in S \}, \tag{71}$$

where $q$ denotes quantization interval width and the inequality accounts for data that may not be normalized to the interval $[\bigwedge S, \bigvee S]$.

3.2.1.4. Definition. An encoding transform is a mapping $T : \mathbb{F}^{\mathbf{X}} \to \mathbb{G}^{\mathbf{Y}}$, where $\mathbf{X} \neq \mathbf{Y}$ and $\mathbb{F} \neq \mathbb{G}$ are possible.

3.2.1.5. Definition. An exact encoding $\mathbf{a}_c$ of an image $\mathbf{a}$ by a transform $T$ fulfills the following conditions:
(i) No additional information is introduced into $domain(T)$. Implementationally, this means that pixels are not added to $\mathbf{a}$ prior to applying $T$.
(ii) No information is lost in the encoding. That is, $T$ encodes all the information in $\mathbf{a}$ such that the information in $\mathbf{a}_c$ completely describes the information in $\mathbf{a}$.
(iii) No information is added to the encoded result. For example, $\mathbf{a}_c$ is not augmented with additional pixels to fulfill a given format specification.

Exact encodings are often called information-preserving or lossless transformations. If $T$ is not exact, then it is called inexact. Alternative terms are information-nonpreserving or lossy.

3.2.1.6. Definition. The domain compression ratio of a transform $T : \mathbb{F}^{\mathbf{X}} \to \mathbb{G}^{\mathbf{Y}}$ is given by

$$CR_D(T) = \frac{|\mathbf{X}|}{|\mathbf{Y}|}, \tag{72}$$

where $\mathbf{X}$ and $\mathbf{Y}$ are finite point sets.

3.2.1.7. Definition. The range compression ratio of a transform $T : \mathbb{F}^{\mathbf{X}} \to \mathbb{G}^{\mathbf{Y}}$ is given by

$$CR_R(T) = \frac{siz(\mathbb{F})}{siz(\mathbb{G})}, \tag{73}$$

where implementations of $\mathbb{F}$ and $\mathbb{G}$ would be finite sets.

3.2.1.8. Definition. The compression ratio of a transform $T : \mathbb{F}^{\mathbf{X}} \to \mathbb{G}^{\mathbf{Y}}$ is given by

$$CR(T) = \frac{|\mathbf{X}| \cdot siz(\mathbb{F})}{|\mathbf{Y}| \cdot siz(\mathbb{G})}. \tag{74}$$

It follows from the preceding definitions that $CR = CR_R \cdot CR_D$. Thus, $CR$ describes compression at a digit (bit) level for radix-$r$ (binary) encodings, under the assumption of constant field width. This assumption is reasonable for machines with fixed-size registers, which include the majority of digital computers.

3.2.1.9. Definition. A compressive transform $T : \mathbb{F}^{\mathbf{X}} \to \mathbb{G}^{\mathbf{Y}}$ is an encoding transform whose compression ratio $CR(T) > 1$. Note that a compressive transform $T$ may be a homomorphism or an isomorphism, which leads to the following definitions of operations over $range(T)$.

3.2.2. Transform-regime Analogues, Duals, and Computational Systems.

The concept of computing over compressed (encrypted) imagery, which we call compressive (encryptive) processing, can be illustrated by the commutativity diagram shown in Figure 1. Here, $\bigcirc$ denotes a unary image operation that is applied to image $\mathbf{a}$ to yield image $\mathbf{b}$. The transform $T$, which may be information-nonpreserving, maps image $\mathbf{a}$ to a compressed image $\mathbf{a}_c$, which is accepted by an operation $\bigcirc'$ that yields the image $\mathbf{b}_c$.
The operation $\bigcirc'$ is called a transform-regime analogue or analogue of $\bigcirc$. Similarly, an operation $\Omega$ that transforms $\mathbf{a}_c$ into $\mathbf{b}$ is called a dual of $\bigcirc$. If $T$ has an inverse $T^{-1}$ (or an approximation $T^*$ to its inverse, in the case of lossy $T$), then the image $\mathbf{b}_c$ can be transformed to yield an exact representation of (approximation to) $\mathbf{b}$. Since duals can occasionally be derived even when the corresponding analogues are intractable, duals are useful in practice. We later show how to use analogues and duals to formulate computational systems based upon homomorphisms and isomorphisms. This prepares the reader for further theoretical development in Parts II and III.

3.2.2.1. Assumption. Let $(S, \bigcirc)$ and $(S', \bigcirc')$ be groups, and let $T : S \to S'$ be a homomorphism.

3.2.2.2. Definition. If Assumption 3.2.2.1 holds, then the operation $\bigcirc'$ is called an $\bigcirc$-property of $T$, or a transform-regime analogue of $\bigcirc$ over $range(T)$.

Figure 1. Commutativity diagram for processing compressed or encrypted imagery with unary operations.

3.2.2.3. Remark. We call the determination of a specific property of a given transform the Transform Properties Problem (TPP). A theoretical discussion of this problem is given later in this section, where we show how the TPP can be used to construct compressive computational systems.

3.2.2.4. Example. The Fourier transform $\mathfrak{F}$ has a convolution property (i.e., $\bigcirc = \oplus$) that is given by $\bigcirc' = *$, the operation of Hadamard multiplication. Alternatively, we say that $*$ is a transform-regime analogue of $\oplus$ over $range(\mathfrak{F})$.

3.2.2.5. Definition. If Assumption 3.2.2.1 holds and $\Omega : S' \times S' \to S$ such that $\Omega(T(f), T(g)) = \bigcirc(f, g)$, $\forall f, g \in S$, then $\Omega$ is called a transform-regime dual of $\bigcirc$ over $range(T)$.

3.2.2.6. Definition. If Assumption 3.2.2.1 holds, $T$ has an $\varepsilon$-near approximation to its inverse, and $\Omega : S' \times S' \to S$ such that $\| \Omega(T(f), T(g)) - \bigcirc(f, g) \| < \varepsilon$, $\forall f, g \in S$, then $\Omega$ is called an $\varepsilon$-near approximation to a transform-regime dual of $\bigcirc$ over $range(T)$.
For purposes of brevity, we also call $\Omega$ an approximate dual of $\bigcirc$ over $range(T)$.

3.2.2.7. Definition. If Assumption 3.2.2.1 holds, then the structure $((S, \bigcirc), T, (S', \bigcirc'))$ is called a homomorphic computational system (HCS).

3.2.2.8. Definition. If the transform $T$ in Assumption 3.2.2.1 is an isomorphism, then the structure $((S, \bigcirc), T, (S', \bigcirc'), T^{-1})$ is called an isomorphic computational system (ICS).

3.2.2.9. Definition. If Assumption 3.2.2.1 holds and $T : S \to S'$ has an $\varepsilon$-near approximation to its inverse, denoted by $T^* : S' \to S$, then the structure $((S, \bigcirc), T, (S', \bigcirc'), T^*)$ is called an $\varepsilon$-near isomorphic computational system ($\varepsilon$-ICS).

3.2.2.10. Definition. If Assumption 3.2.2.1 holds and $T$ is a compressive transform, then the corresponding HCS (ICS) is said to be a compressive HCS (ICS).

We can extend the definition of an HCS or ICS to include operations with multiple operands of different type, or multiple operations that each have a corresponding analogue. In order to do this, we first define several terms pertaining to algebras.

3.2.2.11. Definition. Given the indexed sets $\mathcal{F} = \{F_\lambda\}$ and $\mathcal{G} = \{G_\mu\}$, an algebra $B = (\mathcal{G}, \mathcal{U})$ is a subalgebra of algebra $A = (\mathcal{F}, \mathcal{O})$ if and only if
(i) $\mathcal{U} \subset \mathcal{O}$, and
(ii) for each $G_\mu \in \mathcal{G}$, there exists an $F_\lambda \in \mathcal{F}$ such that $G_\mu \subset F_\lambda$.

3.2.2.12. Definition. An algebra $A = (\mathcal{F}, \mathcal{O})$ is a homogeneous algebra iff $|\mathcal{F}| = 1$ and is a heterogeneous algebra iff $|\mathcal{F}| > 1$.

3.2.2.13. Observation. Since a homomorphism $T$ preserves the operations of an algebra, an HCS may be formulated using a set of operations $\mathcal{O}$ rather than a single operation. For example, consider the structure $((S, \mathcal{O}), T, (S', \mathcal{O}'))$, where $\mathcal{O}'$ contains the transform-regime analogues of the corresponding operations in $\mathcal{O}$. The symmetric case holds for ICS. Of primary interest to this study is the determination of the $\mathcal{O}$-properties of $T$, where $\mathcal{O}$ are common image operations and $T$ is a compressive transform.
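The Fourier example (3.2.2.4) gives a concrete homomorphic computational system in the sense of Definition 3.2.2.7. The Python sketch below is a minimal illustration under assumptions that the text does not state: signals are one-dimensional, the image-space operation is circular convolution, $T$ is the unnormalized discrete Fourier transform, and the analogue is Hadamard (pointwise) multiplication. The function names are hypothetical.

```python
import cmath


def dft(a):
    """Unnormalized discrete Fourier transform of a finite sequence (plays the role of T)."""
    n = len(a)
    return [
        sum(a[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
        for j in range(n)
    ]


def circular_convolve(a, b):
    """Circular convolution of two equal-length sequences (the image-space operation)."""
    n = len(a)
    return [sum(a[k] * b[(j - k) % n] for k in range(n)) for j in range(n)]


def hadamard(a, b):
    """Pointwise product (the assumed transform-regime analogue)."""
    return [x * y for x, y in zip(a, b)]


def max_abs_diff(u, v):
    """Largest pointwise magnitude of the difference between two sequences."""
    return max(abs(x - y) for x, y in zip(u, v))


a = [1.0, 2.0, 0.0, -1.0]
b = [0.5, 0.0, 0.25, 0.0]

lhs = dft(circular_convolve(a, b))  # transform of the image-space result
rhs = hadamard(dft(a), dft(b))      # analogue applied to the transformed operands

# The two sides should agree up to floating-point rounding error.
print(max_abs_diff(lhs, rhs) < 1e-9)
```

Here `lhs` follows the path "operate, then transform" and `rhs` follows "transform, then apply the analogue"; their agreement is the binary-operation counterpart of the commutativity shown in Figure 1.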
A more difficult problem, which we address only tangentially, concerns the derivation of $\mathcal{O}$-properties of encryptive transforms.

A key theoretical constraint encountered early in this study is implied by the fact that homomorphisms are defined for binary operations only. Thus, HCS were initially defined for only one type of image algebra operation, namely, binary pointwise operations. In practice, this was an unacceptable restriction, since image algebra contains unary and binary pointwise operations as well as unary global-reduce operations and binary operations with heterogeneous operands (e.g., image-template and template-template operations). We thus developed the following theory that unifies compressive computation over all image algebra operations by expressing the concept of an HCS in terms of a family of mappings.

3.2.2.14. Definition. Let algebras $A = \{A_1, A_2, \ldots, A_n; \mathcal{O}\}$ and $B = \{B_1, B_2, \ldots, B_n; \mathcal{O}'\}$, where $\mathcal{O}$ and $\mathcal{O}'$ denote sets of operations. For example, $A_1 = \mathbb{F}$, $A_2 = \mathbb{F}^{\mathbf{X}}$, $A_3 = (\mathbb{F}^{\mathbf{X}})^{\mathbf{Y}}$, and $\mathcal{O} = \{+, \cdot, \oplus\}$. A family of mappings $\zeta = \{\zeta_1, \zeta_2, \ldots, \zeta_n; \theta\}$, where $\zeta_i : A_i \to B_i$ for $i = 1..n$, is an algebraic homomorphism $\zeta : A \to B$ if and only if, for each $\bigcirc \in \mathcal{O}$, (a) there exists an analogous operation $\bigcirc' \in \mathcal{O}'$ over $range(\zeta)$ and (b) $\theta : \mathcal{O} \to \mathcal{O}'$, such that the following statement holds. If $\bigcirc : A_{i_1} \times A_{i_2} \times \cdots \times A_{i_m} \to A_j$, then $\theta(\bigcirc) = \bigcirc' : B_{i_1} \times B_{i_2} \times \cdots \times B_{i_m} \to B_j$ such that

$$\theta(\bigcirc)[\zeta_{i_1}(a_{i_1}), \zeta_{i_2}(a_{i_2}), \ldots, \zeta_{i_m}(a_{i_m})] = \zeta_j[\bigcirc(a_{i_1}, a_{i_2}, \ldots, a_{i_m})]. \tag{75}$$

3.2.2.15. Remark. If Definition 3.2.2.14 holds and $\theta(\bigcirc) = \Omega$ is a dual over $range(\zeta)$ of $\bigcirc$, then

$$\theta(\bigcirc)[\zeta_{i_1}(a_{i_1}), \zeta_{i_2}(a_{i_2}), \ldots, \zeta_{i_m}(a_{i_m})] = \bigcirc(a_{i_1}, a_{i_2}, \ldots, a_{i_m}). \tag{76}$$

3.2.2.16. Observation. Given Definition 3.2.2.14 and $T : \mathbb{F}^{\mathbf{X}} \to \mathbb{G}^{\mathbf{Y}}$, one encounters the following cases in nonrecursive image algebra:

Case 1. If $m = 1$, then the commutativity diagram of Figure 1 is satisfied, i.e., $\bigcirc'(T(\mathbf{a})) = T(\bigcirc(\mathbf{a}))$. For such unary operations, we form the following structure: $\mathcal{X}_{\bigcirc} = (A, \zeta, B) = (\{\mathbb{F}^{\mathbf{X}}; \bigcirc\}, \{T; \theta\}, \{\mathbb{G}^{\mathbf{Y}}; \bigcirc'\})$.
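(77) If $T$ is a compressive transform, then we call $\mathcal{X}_{\bigcirc}$ a compressive computational system (CCS).

Case 2. If $m = 1$, then let $A_1 = \mathbb{F}$, $A_2 = \mathbb{F}^{\mathbf{X}}$, and $\zeta = \{T; \theta\}$ such that the global reduce operation $\Gamma : \mathbb{F}^{\mathbf{X}} \to \mathbb{F}$ has an analogue $\theta(\Gamma) = \Gamma' : \mathbb{G}^{\mathbf{Y}} \to \mathbb{G}$ for which Equation 75 holds. The corresponding CCS is given by

$$\mathcal{X}_{\Gamma} = (A, \zeta, B) = (\{\mathbb{F}^{\mathbf{X}}, \mathbb{F}; \Gamma\}, \{T; \theta\}, \{\mathbb{G}^{\mathbf{Y}}, \mathbb{G}; \Gamma'\}). \tag{78}$$

Case 3. If $m = 2$ and $A_1 = A_2 = \mathbb{F}^{\mathbf{X}}$, then Equation 75 reduces to the definition of a homomorphism $T$. For example, if $\bigcirc : \mathbb{F}^{\mathbf{X}} \times \mathbb{F}^{\mathbf{X}} \to \mathbb{F}^{\mathbf{X}}$, $\zeta = \{T; \theta\}$, and $\theta(\bigcirc) = \bigcirc' : \mathbb{G}^{\mathbf{Y}} \times \mathbb{G}^{\mathbf{Y}} \to \mathbb{G}^{\mathbf{Y}}$, then the corresponding homomorphic computational system (HCS) is given by $\mathcal{X}_{\bigcirc}$, which is a CCS if $CR(T) > 1$.

Case 4. If $m = 2$, then let $A_1 = \mathbb{F}^{\mathbf{X}}$, $A_2 = (\mathbb{F}^{\mathbf{X}})^{\mathbf{X}}$, $B_1 = \mathbb{G}^{\mathbf{Y}}$, $B_2 = (\mathbb{G}^{\mathbf{Y}})^{\mathbf{Y}}$, and $\bigcirc = \circledcirc : \mathbb{F}^{\mathbf{X}} \times (\mathbb{F}^{\mathbf{X}})^{\mathbf{X}} \to \mathbb{F}^{\mathbf{X}}$. Let $\zeta = \{T, U; \theta\}$, where $U : (\mathbb{F}^{\mathbf{X}})^{\mathbf{X}} \to (\mathbb{G}^{\mathbf{Y}})^{\mathbf{Y}}$ incorporates the transformation process inherent in $T$ such that $\theta(\bigcirc) = \bigcirc' : \mathbb{G}^{\mathbf{Y}} \times (\mathbb{G}^{\mathbf{Y}})^{\mathbf{Y}} \to \mathbb{G}^{\mathbf{Y}}$ is an analogue of $\bigcirc$ over $range(T)$. Given image $\mathbf{a} \in \mathbb{F}^{\mathbf{X}}$ and template $\mathbf{t} \in (\mathbb{F}^{\mathbf{X}})^{\mathbf{X}}$, if $\mathbf{a}_c = T(\mathbf{a})$ and $\mathbf{s} = U(\mathbf{t})$, then

$$\theta(\bigcirc)[T(\mathbf{a}), U(\mathbf{t})] = T(\bigcirc(\mathbf{a}, \mathbf{t})) = T(\mathbf{a} \circledcirc \mathbf{t}). \tag{79}$$

This implies the existence of the structure

$$\mathcal{X}_{\bigcirc} = (A, \zeta, B) = (\{\mathbb{F}^{\mathbf{X}}, (\mathbb{F}^{\mathbf{X}})^{\mathbf{X}}; \circledcirc\}, \{T, U; \theta\}, \{\mathbb{G}^{\mathbf{Y}}, (\mathbb{G}^{\mathbf{Y}})^{\mathbf{Y}}; \bigcirc'\}), \tag{80}$$

which is a CCS if $CR(T) > 1$.

Case 5. If $m = 2$, then let $A = \{\mathbb{F}^{\mathbf{X}}, (\mathbb{F}^{\mathbf{X}})^{\mathbf{Y}}, \mathbb{F}^{\mathbf{Y}}; \circledcirc\}$, $B = \{\mathbb{G}^{\mathbf{Y}}, (\mathbb{G}^{\mathbf{Y}})^{\mathbf{W}}, \mathbb{G}^{\mathbf{W}}; \bigcirc'\}$, $\hat{T} : \mathbb{F}^{\mathbf{Y}} \to \mathbb{G}^{\mathbf{W}}$, $U : (\mathbb{F}^{\mathbf{X}})^{\mathbf{Y}} \to (\mathbb{G}^{\mathbf{Y}})^{\mathbf{W}}$, and $\zeta = \{T, \hat{T}, U; \theta\}$, where $\theta(\circledcirc) = \bigcirc'$ is an analogue over $range(\zeta)$ of $\circledcirc$. If $T$ and $\hat{T}$ are compressive transforms, then setting $\mathcal{X} = (A, \zeta, B)$ yields the CCS for the generalized image-template product over heterogeneous domains.

3.2.2.17. Remark. Definition 3.2.2.14 and Observation 3.2.2.16 provide a unifying formalism that supports the derivation of analogues of nonrecursive image algebra operations over $range(T)$. Note that Cases 4 and 5 of the preceding observation can describe operations between templates without loss of specificity, since $\mathbf{t}_{\mathbf{y}} \in \mathbb{F}^{\mathbf{X}}$ for all $\mathbf{y} \in \mathbf{X}$. Additionally, duals of $\bigcirc$ over $range(\zeta)$ are expressed symmetrically to Cases 1-5. For example, if $\Gamma$ has a dual $\Gamma'$ over $range(\zeta)$, then Case 2 would yield the dual CCS $\mathcal{X}_{\Gamma} = (\{\mathbb{F}^{\mathbf{X}}, \mathbb{F}; \Gamma\}, \{T; \theta\}, \{\mathbb{G}^{\mathbf{Y}}, \mathbb{F}; \Gamma'\})$.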
(81) PAGE 74 64 In passing, we note that compressive transforms are generally not based upon a scalar mapping of form F G, as are pointwise greylevel operations. Rather, image compression is generally achieved by mapping source image blocks to one or more exemplar values g GG. For example, g could be an exemplar that approximately characterizes an encoding block, as in VQ. As a result, the global reduce operations usually do not have numerous analogues, since a single value in G represents many values in F, but rarely one value. However, the block mapping F''' G that is often inherent in T facilitates the construction of a mapping g -.G F^' from which a dual T' : G^ F can be derived via applying T to range(g). We thus have discovered few systems of form Xt, but have derived numerous CCS of form Xp. 3.2.2.18. Observation. If all the elements of ( except 0 (i.e., Ci) Cii Â• Â• Â• t Cn) are onto mappings that have inverses denoted by ,^ = {Ci~\Cr^' Â• -'Cn^}' ^^^^ C is an algebraic isomorphism. If Q is a binary operation and there is only one operand set, then Xq can be recast as an isomorphic computational system (ICS), as follows: Fo=(AC,5,0= ({F^,O},{r,0},{G^O'},7^-')(82) It is well known that ICS are useful in cryptology, since isomorphisms are required for exact recovery of plaintext from ciphertext. If any of the elements of ( except 0 does not have an inverse, then we can construct e-near approximations to the inverses of CiÂ»C2,---,Cn that are denoted by C = {CrÂ»C2' Â• Â• -'Cn}^ such cases, we say that there exists an approximate ICS V^= (A, CB,nWe next overview applications of the preceding theory to high-level problems in computer science, in preparation for the more specific discussion of transformation and compressive processing given in Sections 3.3 and 3.4. PAGE 75 65 3.2.3. Relationship of Compressive Processing to Problems of Computer Science. 
We previously mentioned that the derivation of specific solutions of the transform properties problem was central to the feasibility of compressive processing. The TPP provides a convenient, unifying formalism for expressing problems of computer science that pertain to imaging practice. In particular, portions of our research currently emphasize the semi-automatic derivation of partial functions that approximate forward and inverse transformations. For example, consider the problem of inverting an imaging system's pointspread function (PSF) that is approximately deduced from imagery. Inversion is but one step in deconvolving a PSF from camera imagery, in order to clarify details of interest. Similar inversion capabilities would be useful in cryptanalysis, where an adversary attempts to guess the key (or the mechanism) of an encryption transform.

In this section, we show that formal statements of the TPP and several related problems are useful for expressing homomorphic and isomorphic computational systems. Through such discussion, we hope to acquaint the reader with the larger, conceptual context of this study, prior to addressing applications of our theory in the remaining chapters. Eventually, the semi-automatic development of approximations to solutions of the TPP could facilitate further automation of burdensome computing tasks (e.g., derivation of compression and encryption algorithms) that are currently performed manually or semi-automatically. Thus, the sense and import of this section is not merely abstract, but has potential applications in computing practice that are being actively investigated in our ongoing research. We begin by defining Kleene closure in terms of functional composition.

3.2.3.1. Observation. In programming language theory, it is customary to define Kleene closure in terms of concatenation. For example, given the set of symbols $A$, let $L$, $L_1$, and $L_2$ denote sets of strings that are comprised of symbols in $A$.
The concatenation of $L_1$ and $L_2$, denoted by $L_1 L_2$, is equal to the set $\{xy : x \in L_1 \text{ and } y \in L_2\}$. Let us further define the set $L^0 = \{\Lambda\}$, where $\Lambda$ denotes the empty symbol, and $L^i = L L^{i-1}$, for $i \geq 1$. Defined in terms of the concatenation relation, the Kleene closure of $L$, denoted by $L^*$, and the positive closure of $L$, denoted by $L^+$, are given by

$$L^* = \bigcup_{i=0}^{\infty} L^i \quad \text{and} \quad L^+ = \bigcup_{i=1}^{\infty} L^i. \tag{83}$$

A corresponding definition is given for tuple construction, as follows.

3.2.3.2. Definition. Given a set of operators $\mathcal{O}$, let $\mathbf{O}$, $\mathbf{O}_1$, and $\mathbf{O}_2$ denote sequences of operators that are comprised of operators in $\mathcal{O}$. The construction of $\mathbf{O}_1$ and $\mathbf{O}_2$, denoted by $\mathbf{O}_1\, r\, \mathbf{O}_2$, is equal to the set of tuples $\{(\bigcirc_1, \bigcirc_2) : \bigcirc_1 \in \mathbf{O}_1 \text{ and } \bigcirc_2 \in \mathbf{O}_2\}$. Further define $\mathbf{O}^0 = \Xi$, where $\Xi$ denotes the null operator (which does nothing), and $\mathbf{O}^i = \mathbf{O}\, r\, \mathbf{O}^{i-1}$, for $i \geq 1$. The Kleene closure of $\mathbf{O}$ under the construction relation, denoted by $\mathbf{O}^{*r}$, is given by

$$\mathbf{O}^{*r} = \bigcup_{i=0}^{\infty} \mathbf{O}^i. \tag{84}$$

For purposes of brevity (and to avoid typographic confusion among the sequence $\mathbf{O}$ and the operators $\bigcirc$ and $\circ$), we hereafter denote Kleene closure under the construction relation as $\mathcal{O}^* = \mathbf{O}^{*r}$. The construction relation $r$ can be likewise applied to operands in $\mathcal{F}$ to yield the Kleene closure $\mathcal{F}^*$.

3.2.3.3. Example. Let $\mathcal{O} = \{+, \cdot\}$ and let $\Xi$ denote the null operator. The elements of $\mathcal{O}^*$ are specified in canonical order as

$$\mathcal{O}^* = \{\Xi, +, \cdot, (\Xi, \Xi), (\Xi, +), (\Xi, \cdot), \ldots, (\Xi, \Xi, \Xi), \ldots\}. \tag{85}$$

We thus have a convenient theoretical construct for enumerating sequences of operators. We call such a sequence an algorithm.

3.2.3.4. Notation. For purposes of discussion, we denote the set of transforms $\mathcal{T} \subset \mathcal{O}^*$.

3.2.3.5. Definition.
The data transformation problem (DTP) can be described in simplified form by the mapping $M_{DTP} : (S' \cup \{\perp\})^{S} \times \mathfrak{A} \to \mathcal{T}$, if the following conditions are satisfied:
(a) $domain(M_{DTP})$ consists of
   (i) a set $D$ of ordered tuples $(f, f')$ that constitute a partial function, where $f \in S$ and $f' \in S'$, with $S, S' \in \mathcal{F}$ the domain and range spaces of a transform $T \in \mathcal{T}$, and $\perp$ denotes an unidentified entity;
   (ii) an algebra $A = (\mathcal{G}, \mathcal{U})$, where $\mathcal{G} \subset \mathcal{F}$ denotes a set of data structures and $\mathcal{U} \subset \mathcal{O}$ denotes a set of operators;
(b) $range(M_{DTP})$ contains the transform $T : S \to S'$; and
(c) $M_{DTP} : (D, (\mathcal{G}, \mathcal{U})) \mapsto T$.

3.2.3.6. Remark. For purposes of consistency, we currently assume that $M_{DTP}$ expresses $T$ in terms of operators in $\mathcal{U}$ as well as operands (e.g., data structures) in $\mathcal{G}$. Since each transform $T$ can be written in various ways, the consideration of low-level formulations of $M_{DTP}$ is presently tangential to our theoretical development. The reader should not confuse an abstract formulation of $M_{DTP}$ with more concrete solutions to instances of the DTP. For example, we note that closed-form solutions to the DTP are not evident in the literature for sets $S, S'$ of practical interest. However, Koza [58] has demonstrated methods of deriving exact or approximate solutions to restricted instances of the DTP over specific finite discrete sets $S, S'$ via stochastic techniques.

We next examine the problem of obtaining inverse transformations, which closely resembles the DTP.

3.2.3.7. Definition.
The inverse transformation problem (ITP) can be described in simple form by the mapping $M_{ITP} : \mathcal{T} \times \mathfrak{A} \to \mathcal{T}$, if the following conditions are satisfied:
(a) $domain(M_{ITP})$ consists of
   (i) a transform $T : S \to S'$ in $\mathcal{T}$, where $S, S' \in \mathcal{F}$, the operands;
   (ii) an algebra $A = (\mathcal{G}, \mathcal{U})$, where $\mathcal{G} \subset \mathcal{F}$ denotes a set of data structures that instantiate $S$ and $S'$, and $\mathcal{U} \subset \mathcal{O}$ denotes a set of operators from which the transform in $range(M_{ITP})$ is comprised;
(b) the inverse transform $T^{-1} : S' \to S$ exists and is contained in $range(M_{ITP})$; and
(c) $M_{ITP} : (T, (\mathcal{G}, \mathcal{U})) \mapsto T^{-1}$.

Solutions to instances of $M_{ITP}$ have been employed in cryptanalysis, e.g., the determination of a monoalphabetic substitutional cipher using an input comprised of unique symbols. Recently, a successful attack upon rotor machines via genetic algorithms (GAs) was reported, using sample plaintext, ciphertext, and a knowledge of key size [34]. Instead of deriving the inverse transformation in closed form from the forward transform, it may be more efficient to approximate the inverse transform computationally. For example, the forward transform can be represented discretely as a partial function, to which an approximate inverse can be sought. Such derivation of approximate inverse transforms is expressed similarly to the ITP, as shown in the following definition.

3.2.3.8. Definition. An instance of the approximate inverse transformation problem (AIP) can be described in simple form by the mapping $M_{AIP} : (S' \cup \{\perp\})^{S} \times \mathcal{T} \times \mathbb{R} \times \mathfrak{A} \to \mathcal{T}$, if the following conditions are satisfied:
(a) $domain(M_{AIP})$ consists of
   (i) a set $D$ of ordered tuples $(f, f')$ that constitute a partial function, where $f \in S$, $f' \in S'$, and $S, S' \in \mathcal{F}$ are the domain and range spaces of a transform $T \in \mathcal{T}$, with $\perp$
denoting an unidentified entity;
   (ii) a transform $T : S \to S'$ in $\mathcal{T}$;
   (iii) an error bound $\varepsilon \in \mathbb{R}$;
   (iv) an algebra $A = (\mathcal{G}, \mathcal{U})$, where $\mathcal{G} \subset \mathcal{F}$ denotes a set of data structures that instantiate $S$ and $S'$, and $\mathcal{U} \subset \mathcal{O}$ denotes a set of operators of which the transform in $range(M_{AIP})$ is comprised;
(b) $range(M_{AIP})$ contains an $\varepsilon$-near inverse approximation to $T$, denoted by $T^* : S' \to S$; and
(c) $M_{AIP} : (D, T, \varepsilon, (\mathcal{G}, \mathcal{U})) \mapsto T^*$.

3.2.3.9. Observation. Inherent in the preceding definition is the assumption that $T^*$ can be constructed and applied to the input data in $D$ within the error bound $\varepsilon$. In practice, such may not be the case. For example, one may have to select some $Q \subset S$ and $Q' \subset S'$ in order for $T^* : Q' \to Q$ to satisfy $\varepsilon$. In such cases, it is reasonable to assume that $range(T^*) \subset S$.

3.2.3.10. Example. Let a polynomial regression technique be applied to a dataset $D$. Assuming that the transform $T : S \to S'$ is also a polynomial, the coefficients of $T$ could seed the regression that would determine a polynomial $P$ which would approximate $T$. Then, one might invert $P$ (where possible), thus obtaining a polynomial $P^{-1}$ that could be employed to seed a polynomial regression over the set $D^* = \{(f', f) : (f, f') \in D\}$. Again applying regression to $D^*$ to yield a polynomial $T^*$, analysis of residuals in the output of $T^*$ could highlight errors that lie outside the interval $[-\varepsilon, \varepsilon]$. That is, $T^*$ would exhibit errors within the bounds $\pm\varepsilon$ only for some pairs $(f', f) \in D^*$. In such cases, we would rewrite an instance of Definition 3.2.3.8 to read $T^* : Q' \to Q$, where $Q \subset S$ and $Q' \subset S'$.

3.2.3.11. Remark. The preceding observation implies that formulations of $M_{AIP}$ may be constrained by prespecification of the error bound $\varepsilon$ in $domain(M_{AIP})$. This is a realistic constraint, since postspecification of $\varepsilon$ in $range(M_{AIP})$ implies loss of precise control over the error bound, which precision could be required in engineering applications.
As a result of such considerations, we present Definition 3.2.3.8 as one of many instances of a more general problem. Additional issues of interest, e.g., whether $M_{aip}$ reduces to an instance of $M_{itp}$ when $\epsilon = 0$, will be considered in future research. We next summarize the central problem of this study, namely, that of determining properties of a given transform.

3.2.3.12. Definition. The transform properties problem (TPP) can be specified in terms of the mapping $M_{tpp} : (\mathcal{F} \times \mathcal{O}) \times \mathcal{T} \times \mathcal{A} \to \mathcal{F} \times \mathcal{O}$, if the following conditions are satisfied:
(a) $domain(M_{tpp})$ consists of
(i) a tuple $(S, \bigcirc)$ comprised of a value set $S \in \mathcal{F}$ and an operation $\bigcirc \in \mathcal{O}$;
(ii) a homomorphism $T$ in $\mathcal{T}$, from $S$ to $S'$;
(iii) an algebra $A = (\mathcal{G}, \mathcal{U})$, where $\mathcal{G} \subset \mathcal{F}$ denotes a set of data structures and $\mathcal{U} \subset \mathcal{O}$ denotes a set of operators;
(b) $range(M_{tpp})$ contains a tuple $(S', \bigcirc')$, such that $S' = T(S)$ and $\bigcirc' \in \mathcal{O}$ is an operation on $S'$; and
(c) $M_{tpp} : ((S, \bigcirc), T, (\mathcal{G}, \mathcal{U})) \mapsto (S', \bigcirc')$.

3.2.3.13. Observation. As mentioned previously, this study emphasizes the derivation of solutions to restricted instances of the TPP. For purposes of discussion, we define the term feasible computation to mean an algorithm that satisfies implementational constraints of time, space, and error for a certain operational scenario. In this study, we concentrate on designing feasible computations over the range space of various transforms. In Section 3.2.2, we defined a computation over $range(T)$ in terms of computational systems whose derivation was based on an algebraic homomorphism. Given such definitions, we next observe that HCS and ICS can be constructed from solutions to the TPP, ITP, and AIP. The following theorems illustrate this concept and have simple proofs that are presented in outline form.

3.2.4. Derivation of HCS and ICS via the DTP, ITP, and AIP

3.2.4.1. Theorem.
If particular solutions to the DTP and TPP exist, a homomorphic computational system $\mathcal{X}_H$ (reference Observation 3.2.2) can be constructed from (a) a set $D = \{(f, f') : f \in S, f' \in S'\}$, which is a partial function that is an element of $(S' \cup \{\perp\})^{S}$, and (b) an operation $\bigcirc$ on $S$.

Proof. Assuming the existence of an algebra $A = (\mathcal{G}, \mathcal{U})$, where $\mathcal{G} \subset \mathcal{F}$ and $\mathcal{U} \subset \mathcal{O}$, we outline the proof as follows:
Step 1. Choose $\bigcirc$, an operation on $S$.
Step 2. Where possible, solve the data transformation problem as
$T := M_{dtp}(D, (\mathcal{G}, \mathcal{U}))$, (86)
to obtain the transformation $T : S \to S'$.
Step 3. Where possible, solve the transform property problem as
$(S', \bigcirc') := M_{tpp}((S, \bigcirc), T, (\mathcal{G}, \mathcal{U}))$, (87)
to obtain the tuple $(S', \bigcirc')$. By the definition of an HCS, $T$ is a homomorphism.
Step 4. Formulate $\mathcal{X}_H = (\{S, \bigcirc\}, \{T, \Theta\}, \{S', \bigcirc'\})$, where an analogous operation $\Theta(\bigcirc) = p_2[M_{tpp}((S, \bigcirc), T, (\mathcal{G}, \mathcal{U}))]$.

3.2.4.2. Theorem. If particular solutions to the inverse transformation problem exist, an isomorphic computational system $\mathcal{Y}$ can be derived from a homomorphic computational system $\mathcal{X}_H$.

Proof. Assuming the existence of an algebra $A = (\mathcal{G}, \mathcal{U})$, where $\mathcal{G} \subset \mathcal{F}$ and $\mathcal{U} \subset \mathcal{O}$, we outline the proof as follows:
Step 1. Choose an HCS $\mathcal{X}_H = ((S, \bigcirc), T, (S', \bigcirc'))$, where the transformation $T : S \to S'$.
Step 2. Where possible, solve the inverse transformation problem as follows:
$T^{-1} := M_{itp}(T, (\mathcal{G}, \mathcal{U}))$, (88)
to obtain the inverse transformation $T^{-1} : S' \to S$.
Step 3. Formulate $\mathcal{Y} = (\{S, \bigcirc\}, \{T, \Theta\}, \{S', \bigcirc'\}, T^{-1})$, where an analogous operation $\Theta(\bigcirc) = p_2[M_{tpp}((S, \bigcirc), T, (\mathcal{G}, \mathcal{U}))]$.

3.2.4.3. Theorem. If particular solutions to the approximate inverse transformation problem exist, an approximate isomorphic computational system $\mathcal{Y}^*$ can be derived from a homomorphic computational system $\mathcal{X}_H$.

Proof. Assume the existence of an algebra $A = (\mathcal{G}, \mathcal{U})$, where $\mathcal{G} \subset \mathcal{F}$ and $\mathcal{U} \subset \mathcal{O}$, together with the dataset $D = \{(f, f') : f \in S, f' \in S'\}$, which is a partial function that is an element of $(S' \cup \{\perp\})^{S}$ with $S, S' \in \mathcal{F}$. The proof is outlined as follows:
Step 1.
Choose an HCS $\mathcal{X}_H = ((S, \bigcirc), T, (S', \bigcirc'))$, where the homomorphism $T : S \to S'$.
Step 2. Choose an error bound $\epsilon \in \mathbb{R}$ and, where possible, solve the AIP as
$T^* := M_{aip}(D, T, \epsilon, (\mathcal{G}, \mathcal{U}))$, (89)
to obtain an $\epsilon$-near approximation $T^* : S' \to S$ to the inverse transformation $T^{-1} : S' \to S$.
Step 3. Formulate $\mathcal{Y}^* = (\{S, \bigcirc\}, \{T, \Theta\}, \{S', \bigcirc'\}, T^*)$, where an analogous operation $\Theta(\bigcirc) = p_2[M_{tpp}((S, \bigcirc), T, (\mathcal{G}, \mathcal{U}))]$.

3.3. Taxonomy of Image Transformations.

We next present a taxonomy of image transformations that facilitates derivation of analogous and dual operations throughout the remainder of this dissertation. We begin by deriving a simple notation for transform classification and exemplify the computation of a transform parameter (e.g., compression ratio) for each class.

3.3.1. Taxonomic Classes.

From Chapter 2, recall that an image is a mapping of the form $X \to F$, i.e., an element of $F^X$, and image transforms map images to images. In this study, image transforms, denoted by $T : F^X \to G^Y$, are classified via taxa that describe equality of domain or range spaces.

3.3.1.1. Assumption. Although the general form of $T : F^X \to G^Y$ is mathematically the same whether or not $X = Y$ or $F = G$, numerous transforms of practical interest exhibit different range and domain spaces. Thus, we assume (for implementational convenience) that $F^X$, $F^Y$, $G^X$, and $G^Y$ denote distinct spaces.

3.3.1.2. Definition. Under constraint of Assumption 3.3.1.1, encoding transforms are grouped as follows:
Class 1 $T_1 : F^X \to F^X$
Class 2 $T_2 : F^X \to G^X$
Class 3 $T_3 : F^X \to F^Y$
Class 4 $T_4 : F^X \to G^Y$

3.3.1.3. Example. Class 1 transforms are exemplified by the
(a) Linear inner-product transforms (e.g., Fourier, Walsh, Hadamard, and Cosine transformations);
(b) Unary arithmetic and logic functions;
(c) Delta modulation transform, derivative-based compressions, PCM, and certain instances of DPCM;
(d) Substitutional ciphers over $F$; as well as
(e) Ciphers such as DES and RSA, if the keyspace is not included in $domain(T)$.

3.3.1.4. Example.
Class 2 transforms include
(a) Local averaging operations;
(b) Thresholding and bit-slicing operations; and
(c) Image labelling operations such as connected component labelling.

3.3.1.5. Example. Class 3 transforms are represented by
(a) Spatial warping transforms (e.g., the affine transform);
(b) Image minification by subsampling or magnification by pixel replication; and
(c) Spatial superresolution transformations, such as sub-pixel interpolation.

3.3.1.6. Example. Class 4 contains many interesting transformations currently in use, which include
(a) Runlength encoding and sparse matrix reduction techniques;
(b) Fractal-based encryption via iterated function systems;
(c) HVS-based blockwise transforms such as VPIC;
(d) Blockwise sine/cosine transforms such as DCT, JPEG, and MPEG;
(e) Blockwise statistical transformations, including BTC and VQ;
(f) Adaptive or recursive encoding schemes, such as wavelet transformation, adaptive vector quantization, as well as adaptive Lempel-Ziv encoding; and
(g) Higher-level processes, such as the transformation of image components into adjacency graphs.

The foregoing transform classes each have different attributes (e.g., compression ratio). For example, computational complexity, data security, and information loss are discussed in Chapters 4 and 5. To demonstrate the utility of our taxonomy, we analyze the compressive properties of transforms in Classes 1-4, which provides background for the development of Parts II and III.

3.3.2. Compression Ratio.

3.3.2.1. Lemma. The compression ratio of a Class 1 transform $T_1$ is unity.

Proof. From Definitions 3.3.1.2 and 3.2.1.8, the Class 1 transform $T_1 : F^X \to F^X$ exhibits the compression ratio
$CR(T_1) = \frac{|domain(choice(domain(T_1)))| \cdot siz[range(choice(domain(T_1)))]}{|domain(choice(range(T_1)))| \cdot siz[range(choice(range(T_1)))]} = \frac{|X| \cdot siz(F)}{|X| \cdot siz(F)} = 1$. (90)
As a result, Class 1 transforms alone are not interesting for compression applications. The following lemmas are similarly proven.

3.3.2.2. Lemma.
The compression ratio of Class 2 transforms is given by $CR(T_2) = siz(F)/siz(G)$.

3.3.2.3. Lemma. The compression ratio of Class 3 transforms is given by $CR(T_3) = |X|/|Y|$.

3.3.2.4. Lemma. The compression ratio of Class 4 transforms is given by
$CR(T_4) = \frac{|X| \cdot siz(F)}{|Y| \cdot siz(G)}$, (91)
which is the general case.

For purposes of clarity, we next present a notation for decomposing transform mappings, which helps us visualize the transform mappings implementationally.

3.3.3. Formulaic Granularity and Decomposition.

We begin by showing how a given transform $T$ can be expressed at three levels of granularity (coarse, medium, and fine). The multigranular representation of $T$ does not imply that each level of granularity has the same formulation. Rather, we decompose $T$ to concisely acquaint the reader with $T$'s structure and function. From such decompositions, one can further understand the format in which information is represented in $range(T)$.

3.3.3.1. Definition. An image transform is described at coarse granularity as a mapping between image spaces of form $F^X$. Coarse granularity is denoted by presuperscript C. For example, we write sparse matrix reduction at coarse granularity as $^{C}T_{sm} : F^X \to G^Y$, which is a Class 4 transform.

3.3.3.2. Definition. A transform can be described at medium granularity via its data structures. Medium granularity is denoted by presuperscript M. For example, given $Y \subset \mathbb{N}$, an instance of sparse-matrix reduction is denoted as $^{M}T_{sm} : F^X \to (X \times F)^Y$. That is, an $F$-valued image on $X$ is reduced to a list on $Y$ of tuples of form $(x, f)$, where a non-negligible value $f \in F$ is grouped with its domain point $x \in X$.

3.3.3.3. Definition. The fine granular level of transform description can specify operations on one or more source data. Such operations may be represented as scalar (as opposed to vector-space) mappings. Fine granularity is denoted by presuperscript F.
For example, given the preceding definition, we can write
$^{F}T_{sm} : (x, \mathbf{a}(x)) \mapsto (h(x, \mathbf{a}(x)), (x, \mathbf{a}(x)))$, (92)
where $h$ denotes an indexing function that is described in greater detail in Chapter 9.

3.3.3.4. Notation. We denote the formulaic decomposition of a transform $T$ from coarse- to fine-granular form as $^{C}T \to {}^{M}T \to {}^{F}T$. Two examples of this technique follow, which are applied to runlength encoding (RLE).

3.3.3.5. Example. Let the Class 4 mapping $T_4 : F^X \to G^Y$ describe runlength encoding (RLE). The RLE transform $T_{rle}$ maps a Boolean image on $X$ to an integer-valued image on $Y$, where $Y \subset \mathbb{N}$ is usual. Let $n$ denote the maximum source runlength in $domain(T_{rle})$, and let a run (i.e., a contiguous region) of $k \le n$ zero-valued (unitarily-valued) pixels be mapped to the integer $-k$ ($+k$). If $k > n$, then we partition the run into $\lfloor k/n \rfloor$ runs of length $n$, and a residual run of length $k \bmod n$ pixels. For example, if $range(choice(domain(T_{rle}))) = \mathbb{Z}_{256}$, and we want the encoding alphabet (which includes a marker symbol) to be both positively and negatively valued, then $n = \lfloor (256 - 1)/2 \rfloor = 127$ pixels. Since we will be accumulating the detected runs using a counter, we let $Y = \mathbb{N}$. Denoting $\mathbb{Z}_l = \{i : i \in [-(l-1), l-1] \subset \mathbb{Z}\}$, we denote $range(choice(range(T_{rle})))$ as $\mathbb{Z}_{n+1}$. Thus, we write $^{M}T_{rle} : \mathbb{Z}_2^X \to (\mathbb{Z}_{n+1})^{\mathbb{N}}$. Per the preceding discussion (and as shown in Chapter 9), $^{F}T_{rle} : \{f\}^k \mapsto k \cdot (2f - 1)$, where $\{f\}^k$ denotes a zero- or unitarily-valued run of $k$ pixels, and $k \le n$.

3.3.3.6. Example. The Class 4 transforms, which can be complex mathematically as well as notationally, can be further classified as follows:
(i) Class 4-A contains fixed-blocksize substitutions, denoted by $T_{4f}$ and specified in Equation 93. Thus, the encoding block size $k$ is invariant to the points $x \in X$ or $y \in Y$.
$^{M}T_{4f} : F^k \to G, \quad {}^{F}T_{4f} : (f_1, f_2, \ldots, f_k) \mapsto g, \quad f_1, f_2, \ldots, f_k \in F, \; g \in G$. (93)
(ii) Class 4-B contains variable-block substitutions, denoted by $T_{4v}$, where the medium-granular transform appears as in the previous equation, but the fine-granular transform exhibits space-variant encoding blocksize, i.e.,
$^{F}T_{4v} : (y, (f_1, f_2, \ldots, f_{k(y)})) \mapsto (y, g), \quad f_1, f_2, \ldots, f_{k(y)} \in F, \; g \in G, \; y \in Y$. (94)
It is easily verified that Class 4-B portrays substitutions whose blocksize varies with $x \in X$, for example, partitioning systems customarily employed in fractal-based compression. Given $Y$, the indexing function $h : X \to Y$, and an input image $\mathbf{a} \in F^X$ in $domain(T_4)$, Equation 94 can be rewritten to portray a value-dependent blocksize. In such cases, $T_4$ would portray an encoding with a blocksize that is dependent upon the values of $\mathbf{a}$, since $\mathbf{a}$ is a function from $X$ to $F$.

The medium- and fine-granular transform descriptions are especially useful when analyzing a mapping's computational complexity and data security, as shown in Chapter 4. Chapter 5, which pertains to error analysis, employs the fine-granular formulation as an initial step in characterizing error propagation through various transform stages. The derivations and implementational analyses of Chapters 6-9 are also based partially upon multi-granular decomposition. Thus, our notation is used not only for classification, but also facilitates theoretical derivation, analysis, and algorithmic design. We next discuss high-level methods for the derivation of operations that compute over the range spaces of Class 1-4 transforms.

3.4. Class-Specific Derivational Techniques.

We have thus far presented theory that is basic to the analysis of compressive and encryptive transformations, and have proposed a taxonomy by which numerous image transforms may be classified. In this section, we describe high-level techniques for analyzing transforms in taxonomic Classes 1-4 and for determining operations over the range spaces of such transforms.
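As a concrete reference point for the taxonomy, the runlength encoding of Example 3.3.3.5 (a Class 4 transform) can be sketched in a few lines of Python. The sketch assumes a one-dimensional Boolean image stored as a list of 0/1 integers; the function names and the default n = 127 are illustrative choices rather than part of the source. A run of k pixels of value f is mapped to k(2f - 1), and a run longer than n is split into full runs of length n plus a residual run of length k mod n.

```python
def rle_encode(bits, n=127):
    """Map each run of k equal pixels f in {0, 1} to the signed integer k * (2f - 1).

    Runs longer than n are split into floor(k / n) runs of length n and,
    if nonzero, one residual run of length k mod n.
    """
    codes = []
    i = 0
    while i < len(bits):
        f = bits[i]
        k = 1
        while i + k < len(bits) and bits[i + k] == f:
            k += 1
        sign = 2 * f - 1          # -1 for zero-valued runs, +1 for unit-valued runs
        full, rest = divmod(k, n)
        codes.extend([sign * n] * full)
        if rest:
            codes.append(sign * rest)
        i += k
    return codes


def rle_decode(codes):
    """Invert rle_encode: a code c expands to |c| pixels of value 1 if c > 0, else 0."""
    bits = []
    for c in codes:
        bits.extend([1 if c > 0 else 0] * abs(c))
    return bits
```

Because the encoded list is indexed by run number rather than by pixel position, both the domain and the value set change under the transform, which is what places RLE in Class 4.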
We begin by summarizing a few simple algebraic methods that are useful analytically (Section 3.4.1), then progress to high-level derivations of pointwise operations (Section 3.4.2), global reduce, and image-template operations (Sections 3.4.3 and 3.4.4, respectively). Given such basic theory, we present more detailed descriptions of high-level methods for deriving analogues and duals of operations over imagery transformed by Class 1-4 mappings (Sections 3.4.5 through 3.4.8).

3.4.1. Basic Algebraic Methods.

3.4.1.1. Assumption. Hereafter assume that $f, f^{-1}, g, g^{-1}, h \in S$, where $f^{-1}$ and $g^{-1}$ denote the inverses of $f$ and $g$. Additionally, we assume that the operations $\bigcirc, \gamma : S \times S \to S$ and $\bigcirc', \gamma' : S' \times S' \to S'$, the transform $T : S \to S'$ is a homomorphism, and $e$ denotes the identity of $S$ with respect to $\bigcirc$.

3.4.1.2. Observation. Let Assumption 3.4.1.1 hold. One can employ the following simple techniques for the reduction of algebraic expressions:
(a) If $\bigcirc$ is commutative, then one can transpose the arguments of $\bigcirc$ as
$f \,\gamma\, (g \bigcirc h) = f \,\gamma\, (h \bigcirc g)$, (95)
(b) If $\bigcirc$ is associative, then one can regroup terms as
$(f \bigcirc g) \bigcirc h = f \bigcirc (g \bigcirc h)$, (96)
(c) If $\gamma$ is associative with respect to $\bigcirc$, then one can regroup terms to yield
$(f \bigcirc g) \,\gamma\, h = f \bigcirc (g \,\gamma\, h)$, (97)
(d) If $\bigcirc$ is associative and $g$ has an inverse $g^{-1} \in S$, then one cancels terms as
$(f \bigcirc g) \bigcirc g^{-1} = f \bigcirc (g \bigcirc g^{-1}) = f \bigcirc e = f$, (98)
(e) If $\bigcirc$ is associative and has an inverse $\bigcirc^{-1}$ that is associative with respect to $\bigcirc$, then one can eliminate terms, i.e.,
$(f \bigcirc g) \bigcirc^{-1} g = f \bigcirc (g \bigcirc^{-1} g) = f \bigcirc e = f$. (99)

3.4.1.3. Example. Let $g = f^{-1} \,\gamma\, h$. Via substitution, $f \bigcirc g$ becomes $f \bigcirc (f^{-1} \,\gamma\, h)$. If $\gamma$ is associative with respect to $\bigcirc$, then we have
$f \bigcirc (f^{-1} \,\gamma\, h) = (f \bigcirc f^{-1}) \,\gamma\, h = e \,\gamma\, h = h \,\gamma\, e = h$. (100)

3.4.2. Pointwise Operations.

We first examine the simple case of image transforms that are bijections, then continue with spatial transforms, and non-invertible (lossy) m-ary mappings.
We conclude our development with a discussion of fixed- and variable-blocksize encodings, which comprise the majority of compression transforms in common use.

3.4.2.1. Assumption. Let $\mathcal{Y} = (\{F^X, \bigcirc\}, \{T, \Theta\}, \{F^X, \bigcirc'\}, T^{-1})$ denote an ICS such that
(i) $T : F^X \to F^X$ is implemented in terms of a bijection $g : F \to F$, and
(ii) $\Theta(\bigcirc)(T(\mathbf{a})) = T(\bigcirc(\mathbf{a})), \; \forall \mathbf{a} \in F^X$.

3.4.2.2. Technique. If Assumption 3.4.2.1 holds, then for finite $F$ and $X$, we observe that $\Theta$ and hence, $\bigcirc' = \Theta(\bigcirc)$, can be implemented in terms of a discrete data structure such as a lookup table, subject to space constraints dictated by $|F|$. Alternatively, if $T$ is expressed symbolically (i.e., an analytic function) and we wish to express $\bigcirc'$ analytically, then we solve 3.4.2.1(ii) to obtain $\bigcirc'$ under constraints of tractability. The following example is illustrative.

3.4.2.3. Example. Let Assumption 3.4.2.1 hold, and let $T = \log$, which implies that $F = \mathbb{R}^+$. If $\bigcirc = \cdot$, then one can solve the following equation analytically
$\bigcirc'(\log f, \log g) = \log(\bigcirc(f, g)) = \log(f \cdot g), \; \forall f, g \in \mathbb{R}^+$ (101)
to obtain $\bigcirc' = +$.

In order to better understand our treatment of neighborhood-based mappings, we next consider the simplistic case of an m-ary partial function that is a bijection over the domain of the image operation $\bigcirc$.

3.4.2.4. Assumption. Let $\mathcal{Y} = (\{F^X, \bigcirc\}, \{T, \Theta\}, \{F^X, \bigcirc'\}, T^{-1})$ denote an ICS such that
(i) $T : F^X \to F^X$ is based on a pointwise m-ary function $g : F^m \to F$,
(ii) $T^{-1} : F \to F^m$ exists only for those $|F|$ neighborhood configurations in $domain(T)$, and
(iii) $\Theta(\bigcirc)(T(\mathbf{a})) = T(\bigcirc(\mathbf{a}))$ for those $\mathbf{a} \in F^X$ for which condition (ii) holds.

3.4.2.5. Observation. Note that there are $|F|^m$ possible neighborhood configurations in $F^m$, but only $|F|$ mappings possible under constraint of the bijection that constitutes the partial function $T$. That is, if $T$ is an onto mapping, then the inverse $T^{-1} : F \to F^m$, if it exists, would be an into mapping.

3.4.2.6. Technique. The equation in Assumption 3.4.2.4(iii) can be solved for $\bigcirc'$,
subject to the given constraints, after the method of Example 3.4.2.3. However, care must be taken in qualifying the constraints under which the solution holds. The following example is illustrative.

3.4.2.7. Example. Let Assumption 3.4.2.4 hold, and let $F = \mathbb{Z}_9$ with $X$ a 2 × 4-pixel array. Let the source image $\mathbf{a} \in F^X$ shown in Figure 2a be transformed by $T : F^X \to F^X$ to yield $\mathbf{a}_c \in F^X$, as shown in Figure 2b. Define $T$ as follows, using the form $F^{2 \times 2} \to F^{2 \times 2}$ as applicable, for purposes of brevity:
(i)-(iii) [three constituent maps on the 2 × 2-pixel array that is input to $T$, the last defined for nonzero $e, f, g, h \in F$; the array entries are not recoverable from the scan]; and
(iv) $T$ is undefined elsewhere, for purposes of simplicity. (102)

(a)
1 2 7 5
3 4 8 6
(b)
1 4 8 5
3 2 7 6
Figure 2. (a) Example source image $\mathbf{a}$ that is transformed by $T$ to yield (b) image $\mathbf{a}_c$. Note reversal of the middle 2 × 2-pixel neighborhood only.

Let $\bigcirc : F^X \to F^X$ be given by $\bigcirc(a) = (a - 1) \bmod 9$. For example, $\bigcirc(8) = 7$, $\bigcirc(1) = 0$, etc. For the formulations of $T$, $\bigcirc$, and $\mathbf{a}$ given in Figure 2 and items (i)-(iv), it is easily verified by inspection of Figure 2 that $\bigcirc$ preserves the ordering of pixel values in $\mathbf{a}$ as constrained by the ordering of $X$. That is, $\bigcirc$ does not transpose values in $\mathbf{a}$. As a result, it is sufficient to state that $\bigcirc' = \bigcirc$ in this case. However, if $\bigcirc(a) = (a - 6) \bmod 9$, then the right-hand two columns of pixels in $\mathbf{a}_c$ will be changed [the two 2 × 2 arrays given here are not recoverable from the scan], and the constituent map of item (iii) can no longer be inverted to produce $\mathbf{a}$ from $\mathbf{a}_c$ under constraint of the existing definition of $T$. In such cases, $T^{-1}$ does not exist.

3.4.2.8. Remark. It is this difficulty with inverting constituent maps of $T$ that renders neighborhood-based bijections difficult (and generally impractical) for image compression. In practice, a collection of neighborhoods (also called vectors) is often represented in terms of an exemplar that portrays a group of vectors within a given error criterion.
Note that the preceding example shows why the determination of analogues to lossless (i.e., bijective) neighborhood transforms that are partial functions is dependent on the formulation of $T$, $\bigcirc$, and $\mathbf{a}$.

3.4.2.9. Observation. Note that the preceding example specifies $T$ as a space-variant spatial transform and $\bigcirc$ as a greylevel transform. There, the processing sequence $\bigcirc'(T(\mathbf{a}))$ would yield a greylevel transform ($\bigcirc'$) of the neighborhood-based spatial transform ($T$). This is similar in concept to the Data Encryption Standard, where greylevel and spatial transformations are alternated as application of the XOR and S-box sub-stages of each DES transform stage. However, we noted in the preceding example that, unless $T$ and $\bigcirc$ are chosen carefully in relation to $\mathbf{a}$, the transformed image $\mathbf{a}_c$ could have undefined values. Hence, $T^{-1}$ or $\bigcirc'$ might not exist, or $T^{-1}(\bigcirc'(T(\mathbf{a})))$ might be undefined. Noninvertibility would render $T$ useless for cryptographic applications, where isomorphisms are required.

However, we have found that this observation can be exploited to form the basis for a neighborhood-based transposition similar to that shown in Example 3.4.2.7(iii). By choosing $T$ and $\bigcirc$ carefully, one can construct an encryption $T(\bigcirc(\mathbf{a}))$ that reliably encrypts prespecified sequences of symbols in the plaintext $\mathbf{a}$ but discards (or irreversibly transforms) other sequences. Here, we say that $T$ irreversibly transforms a given plaintext $\mathbf{a}$ because the inverse transform $T^{-1}(T(\mathbf{a}))$ could exist outside the mappings that comprise $T$. Thus, a highly selective DES-like transform could be devised that would obliterate certain words or transform them into undefined values (also called noise values).

Alternatively, consider that the preceding concept can be applied inversely to the decryption process, since Assumption 3.4.2.4 defines $\mathcal{Y}$ to be an ICS. In other words, instead
In other words, instead PAGE 94 84 ol obliterating certain symbolic sequences, the partial function T and the operation Q could accept ciphertext and obliterate partial decryptions that did not meet prespecified constraints. Although we have not fully explored the potential of such operations, they appear to have greater utility in cryptanalysis than in cryptography. This is likely due to the fact that in cryptanalysis, one searches for a given set of plaintext sequences while attempting to process (decrypt) the ciphertext. Although the ciphertext sequences may be accurately surmised, the corresponding plaintext is usually guessed in a sketchy fashion only. Since this study does not include the derivation of novel cryptologic transformations, we defer further discussion of this technique to future research. We next consider the case of spatial warping transforms (Class 3 of our taxonomy). 3.4.2.10. Assumption. Let = ({fX,0}, {T,Q}, {F^,0'}, T''') denote an ICS such that (i) T :F^-^F^ is based on the spatial transform g : Y X such that r(a) = a 0 g , Va e F^, where a o gr = {(y, a{g(y))) : y G Y}. (ii) T~^: exists and is based upon the spatial transform : X Y such that a= r(a) o g-^ , Va e F^ ; (iii) 0(O)(na))= r(0(a)). 3.4.2.11. Theorem. If Assumption 3.4.2.10 holds, then Q', the analogue of Q over mngeCT), is given by 0'= 0 0 9(103) PAGE 95 85 Proof. Assuming that the givens hold, if g has an inverse : X Â— Y, then : ^ exists, such that a= r~Vac)= He 0 = a 0 5f 0 gf-' . (104) Given a unary image operation Q ^ F-'' -* F^ (chosen for purposes of simplicity), the analogous operation over range{ T), denoted by 0' Â• ~" can be derived from the preceding equation, as 0'(ac)= MO{T^'{a,))) = {0{^cOg-'))og (105) = 0(a)ofir= (Ooi7)(a), since ac = a 0 g and composition is associative. 3.4.2.12. Remark. The preceding theorem holds for spatial warping transforms g that have an inverse g~^. 
However, Equation 105 also shows us that $g^{-1}$ is not required in order to derive $\bigcirc'$ from $\bigcirc$. Since the spatial warping transforms constitute an important class of image processing transformations that includes the well-known affine transform, this is a useful result. Additionally, since the affine transform forms a basis for fractal-like encoding of images (a Class 4 transform), this result is pertinent to imaging practice.

3.4.2.13. Observation. The spatial warping transforms present a deceptively easy example of deriving analogues for Class 3 transforms. Consider the more difficult case, where $T$ is as defined in Assumption 3.4.2.10 but $\bigcirc$ is also a spatial transform. In particular, let $\bigcirc(\mathbf{a}) = \mathbf{a} \circ e$, where $e : Y \to X$ is not necessarily the inverse of $g$ and may perturb $domain(\mathbf{a})$ nonlinearly or noninvertibly. Since composition is not commutative, the commutativity diagram of Figure 1, which illustrates $\bigcirc'(T(\mathbf{a})) = T(\bigcirc(\mathbf{a}))$, may not hold since $\bigcirc'(\mathbf{a} \circ g) \ne (\mathbf{a} \circ e) \circ g$ is possible for some $e$. That is, the equality $\bigcirc'(\mathbf{a}_c) = \bigcirc'(\mathbf{a} \circ g) = (\bigcirc \circ g)(\mathbf{a})$ may not hold because composition is not commutative. Thus, only certain types of spatial transforms are admitted by Theorem 3.4.2.11, namely, those that are commutative and invertible, as stated in the following theorem.

3.4.2.14. Theorem. Let Assumption 3.4.2.10 hold and let $\mathbf{a} \in F^X$ and $e : Y \to X$ be a spatial warping transform such that $\bigcirc(\mathbf{a}) = \mathbf{a} \circ e$. If $e$ commutes with $T$'s basis mapping $g$ and both are invertible, then $\bigcirc$ has an analogue $\bigcirc' = \bigcirc \circ g$.

Proof. The proof follows directly from the proof of Theorem 3.4.2.11 and the preceding discussion.

3.4.2.15. Remark. The non-commutativity that prevents Theorem 3.4.2.11 from holding for all (discrete as well as continuous) domains $X$ is exemplified by the case of an affine transform $g$ that is first applied to discrete imagery which is then transformed by $\bigcirc(\mathbf{a}) = \mathbf{a} \circ e$.
In such cases, quantization and interpolation error dictates that $g \circ e \ne e \circ g$, although $g$ and $e$ may be linear and invertible within a prespecified error constraint $\epsilon$. Thus, we prefer to depict discrete spatial transforms that involve interpolation and quantization as being lossy and having $\epsilon$-near approximations to their inverses.

The preceding discussion of a lossy transform leads naturally to a discussion of lossy neighborhood mappings. Such mappings can be found in Classes 1, 3, and 4 of our taxonomy, but comprise the majority of useful transforms in Class 4.

3.4.2.16. Definition. A lossy neighborhood transform $T_{LN} : F^X \to G^Y$ is a noninvertible, m-ary onto mapping, where $Y = X$ and $G = F$ are possible. $T_{LN}$ is expressed at medium granularity as $^{M}T_{LN} : F^m \to G$.

3.4.2.17. Example. A local averaging transform that incorporates thresholding of the averaged result is an example of a lossy neighborhood transform. Other common examples include median filtering and neighborhood-based contrast enhancement. In such cases, the source image cannot be exactly retrieved from the compressed image that has been operated upon by an analogous operation.

3.4.2.18. Observation. Since $T_{LN}$ is an onto mapping but is many-to-one, it is not invertible. As a result, we cannot construct an ICS based on $T_{LN}$, and we must instead use an approximate ICS based on an $\epsilon$-near approximation to the inverse of $T_{LN}$, as follows.

3.4.2.19. Assumption. Let the $\epsilon$-near approximate ICS be given by
$\mathcal{Y}_\epsilon = (\{F^X, \bigcirc\}, \{T_{LN}, \Theta\}, \{G^Y, \bigcirc'\}, T^*_{LN})$ (106)
such that $T_{LN}$ conforms to Definition 3.4.2.16.

3.4.2.20. Technique. When deriving $\bigcirc'$ over $range(T_{LN})$, we primarily consider the following cases:
(i) Linear or nonlinear operations may comprise $T_{LN}$, such that at a given encoding block's target point $x \in X$, the transform output approximates the source image value $\mathbf{a}(x)$.
(ii) $T_{LN}$ may be primarily comprised of a spatial transform $g : Y \to X$, with greylevel manipulations that only slightly perturb the transform output values with respect to the corresponding source image values. For example, consider image rotation with interpolation. In such cases, the transformed value $\mathbf{a}_c(y)$, where $y \in Y$, closely approximates the source image value $\mathbf{a}(g(y))$.
(iii) Occasionally, a subset of $range(\mathbf{a})$ is exactly represented in $\mathbf{a}_c = T_{LN}(\mathbf{a})$. For example, a constant-valued encoding block may be encountered by a compression transform that encodes blocks according to their mean values (e.g., BTC or VPIC).

If Case (i) or (ii) holds, then it is possible that the analogue $\bigcirc'$ of an image operation $\bigcirc$ may be approximated by $\bigcirc$ due to the approximation of values in $domain(T_{LN})$ by corresponding values in $range(T_{LN})$. In Case (iii) the equality $\bigcirc' = \bigcirc$ holds only for neighborhoods in $\mathbf{a}$ and $\mathbf{a}_c$ that have values such that $\mathbf{a}|_{h^*(y)}$ (the restriction of $\mathbf{a}$ to the domain of the y-th encoding block) can be retrieved from $\mathbf{a}_c(y)$, where $y \in Y$. From such simple cases, we progress to the more realistic case of neighborhoods that are represented by an exemplar, under constraint of a prespecified error criterion.

3.4.2.21. Technique. Let a neighborhood transform $T_{LN} : F^X \to G^Y$ be expressed at medium granularity as $^{M}T_{LN} : F^m \to G$, and let $T_{LN}$ generate a codebook $\mathbf{c} \in (F^m)^G$. Given a set $B$ of encoding blocks in $F^m$, the codebook is constructed by a clustering algorithm that groups $B$ into sets $B(g)$, for each $g \in G$ for which an exemplar $\mathbf{c}(g)$ is to be specified. The extent of this process can be constrained by a prespecified limit $\epsilon$ on the error with which vectors in $B(g)$ are represented by $\mathbf{c}(g)$. An additional constraint is the codebook size $|G|$. For example, $\epsilon$ could be an MSE criterion, such that
$\|\mathbf{b} - \mathbf{c}(g)\| < \epsilon, \; \forall \mathbf{b} \in B(g)$. (107)
In practice, $T_{LN}$ is often called a vector quantization transform.
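The codebook representation and the error criterion of Equation 107 can be sketched as follows. This is a minimal Python illustration that assumes the codebook has already been constructed (the clustering step is omitted), uses the Euclidean norm for the error, and stores blocks as tuples of pixel values; the function names and the toy data are hypothetical.

```python
def dist(b, c):
    """Euclidean distance between an encoding block b and an exemplar c."""
    return sum((x - y) ** 2 for x, y in zip(b, c)) ** 0.5


def encode_block(block, codebook):
    """Return the index g of the exemplar c(g) nearest to the block."""
    return min(range(len(codebook)), key=lambda g: dist(block, codebook[g]))


def encode(blocks, codebook, eps):
    """Encode every block as a codebook index and report whether each block
    is represented by its exemplar within the error limit eps."""
    indices = [encode_block(b, codebook) for b in blocks]
    within = all(dist(b, codebook[g]) < eps for b, g in zip(blocks, indices))
    return indices, within


# Toy data: two exemplars (|G| = 2) and three 4-pixel encoding blocks.
codebook = [(0, 0, 0, 0), (8, 8, 8, 8)]
blocks = [(0, 1, 0, 0), (8, 7, 8, 8), (1, 0, 0, 0)]
indices, within = encode(blocks, codebook, eps=1.5)
```

In this toy case every block lies at distance 1 from its nearest exemplar, so the criterion is met for eps = 1.5 and fails for eps = 0.5; a real codebook design would trade the bound against the codebook size.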
Since $\mathbf{c}(g)$ represents encoding blocks in $B(g)$ within an error $\epsilon$, we say that $T_{LN}$ has an $\epsilon$-near approximation to its inverse, denoted at medium granularity by $^{M}T^*_{LN} : G \to F^m$. As a result, there are two strategies that can be employed in processing an image $\mathbf{a}$ compressed by $T_{LN}$ to yield $\mathbf{a}_c$:
Method 1. Decode the y-th block in $\mathbf{a}_c$ by applying $T^*_{LN}$ to the block index $g \in G$ to yield the y-th source block $\mathbf{b}$. Apply $\bigcirc$ to $\mathbf{b}$, then use customary tree-based search strategies to find the exemplar $\mathbf{c}(h)$ that best represents the result. Encode the processed image $\mathbf{a}_c' = \bigcirc'(\mathbf{a}_c)$ as $\mathbf{a}_c'(y) = h$.
Method 2. Alternatively, one can apply $\bigcirc$ to each codebook exemplar in $\mathbf{c}$ to yield a new codebook $\mathbf{d}$. Then, $\mathbf{a}_c$ references $\mathbf{d}$ instead of $\mathbf{c}$.

3.4.2.22. Observation. With respect to the preceding technique, note the following salient advantages and disadvantages:
Method 1.
Advantage: The codebook is not modified, and thus retains the representation of the training set upon which $T_{LN}$'s design was based.
Disadvantage: The transformed image must be inverse-transformed blockwise prior to processing by an analogous or dual operation, which contradicts the key premise of compressive processing, namely, the processing of imagery in compressed form.
Disadvantage: Since $|Y| < |X|/m$ is typical for compressive applications, a given exemplar is likely to be processed by $\bigcirc'$ more than once, which is inefficient.
Disadvantage: The block configuration represented by $\bigcirc(h)$ may not be represented by any exemplar in $\mathbf{c}$ within the error limit $\epsilon$ upon which the design and construction of $\mathbf{c}$ was based.
Method 2.
Advantage: Each exemplar is processed only once, requiring $O(|Y|)$ work.
Advantage: The image need not be decompressed. This maintains such data security as may be inherent in the compressed format, since $\bigcirc'$ does not require an association between $range(\mathbf{a}_c)$ and $domain(\mathbf{c})$. In fact, $\bigcirc'$ never sees the placement of source image values in $\mathbf{a}_c$, since $\bigcirc'$ inputs values in $range(\mathbf{a}_c)$ only as indices of $\mathbf{c}$.
Advantage: The new codebook represents blocks produced by $\bigcirc'(T_{LN}(\mathbf{a}))$, instead of approximating the output of $\bigcirc'$ with an existing exemplar.
Disadvantage: A new codebook is produced, whose statistics may vary from those of $\mathbf{c}$, and which may not represent statistics of the training set from which $\mathbf{a}$ was abstracted.

3.4.2.23. Remark. The advantages of Method 2 of Technique 3.4.2.21 outweigh those of Method 1, with the exception that the codebook statistics can be altered by Method 2. One could argue that a solution to this problem involves unioning the codebooks $\mathbf{c}$ and $\mathbf{d}$ within a given error constraint on exemplar similarity. However, codebook size is thus increased, which is clearly disadvantageous for compressive processing, when the net efficiency $\eta = O(CR_d)$. This problem is further addressed in Chapter 9, when we discuss VQ-based processing, together with the associated codebook representation error and resultant reconstruction error that appears after a compressive operation's output is decompressed. This is an important implementational issue, due to the accrued error that results from cascaded compressive operations. We next consider problems of coordinate-set manipulation that occur during the processing of block-encoded imagery using analogues or duals of binary pointwise operations.

3.4.2.24. Observation. Let Assumption 3.4.2.19 hold, and let the lossy neighborhood transform $T_{LN}$ tessellate two images $\mathbf{a}, \mathbf{b} \in F^X$, thereby producing the corresponding compressed images $\mathbf{a}_c, \mathbf{b}_c \in G^Y$. For each $y \in Y$, $\mathbf{a}_c(y)$ references a block in $\mathbf{a}$ (and likewise for $\mathbf{b}_c$ and $\mathbf{b}$). If $T_{LN}$ tessellates $\mathbf{a}$ and $\mathbf{b}$ identically, then it is possible that $\bigcirc'$ can process over $\mathbf{a}_c$ and $\mathbf{b}_c$ in $O(|Y|)$ time since the y-th blocks of $\mathbf{a}_c$ and $\mathbf{b}_c$ can be input to $\bigcirc'$. Thus, only $|Y|$ invocations of $\bigcirc'$ are required. However, this condition occurs only when $T_{LN}$ employs an indexing function $h : X \to Y$ that identically tessellates the domains of $\mathbf{a}$ and $\mathbf{b}$.
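The identically tessellated case can be illustrated with a deliberately simple stand-in for the lossy neighborhood transform: a block-mean transform over fixed blocks of m pixels (an assumption made for this sketch, in the spirit of mean-based block encodings). Because the mean is linear, pointwise addition of two source images has an exact analogue, namely addition of the corresponding block means, and that analogue is invoked once per block. The images and function names below are hypothetical.

```python
def block_means(image, m):
    """Compress a 1-D image by replacing each fixed block of m pixels with its mean."""
    assert len(image) % m == 0, "image length must be a multiple of the block size"
    return [sum(image[i:i + m]) / m for i in range(0, len(image), m)]


def add_images(a, b):
    """Pointwise addition of two images defined on the same point set."""
    return [x + y for x, y in zip(a, b)]


m = 2
a = [1, 3, 5, 7, 2, 2, 4, 4]
b = [0, 2, 4, 6, 1, 1, 3, 3]

a_c = block_means(a, m)                  # |Y| = 4 block values
b_c = block_means(b, m)
sum_c = add_images(a_c, b_c)             # |Y| additions over the compressed images
reference = block_means(add_images(a, b), m)   # compress the uncompressed sum
```

Here `sum_c` equals `reference` while using 4 additions instead of 8. If the two images were partitioned differently, the y-th entries of `a_c` and `b_c` would no longer describe the same pixels and this one-to-one pairing would not be available.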
If h produces encoding block domains that are space-invariant, we call this fixed-block encoding. If domain(a) is tessellated isomorphically to domain(b), we have isomorphic encoding, which is usually employed in conjunction with fixed-block encoding.

3.4.2.25. Observation. In contrast to the preceding case, if source images $a, b \in F^X$ are tessellated differently, then the indexing functions $h : X \to Y$ and $h^{*} : Y \to 2^{X}$ vary with the source image and are thus data-dependent. As a result, the y-th blocks in the corresponding compressed images $a_c, b_c \in G^Y$ may or may not have partially or totally overlapping domains. We call this non-isomorphic encoding, which is usually employed in conjunction with variable blocksize. Depending on the shape or size of the encoding blocks and whether or not they overlap, the following salient cases pertain to the design of operations over $a_c$ and $b_c$:

Case 1. Let the y-th blocks in a and b have domains that are subsets of X, denoted by $h^{*}(y)$, where $y \in Y$. Let such domains be disjoint or partially overlapping. Given $a_c(y)$, one must search $b_c$ to find one or more blocks $b_c(z)$, $z \in Y$, that contain information about $b|_{h^{*}(y)}$.

Case 2. If h and $h^{*}$ are invariant to a and b, then Observation 3.4.2.24 applies.

3.4.2.26. Remark. If Case 1 of Observation 3.4.2.25 holds, then search over $b_c$ requires between one and $|Y| - 1$ search operations. If search occurs for each block indexed by h(X), then $O(|Y|^2)$ search overhead is associated with a binary operation over imagery compressed by $T_{LN}$. Recall that if $W_{\mathcal{O}'}(Y) < W_{\mathcal{O}}(X)$, then $\eta > 1$ and a computational efficiency accrues from compressive processing. Implementationally, this means that if an average of M blocks must be searched for each of $|Y|$ invocations of a binary operation, then $\eta > 1$ if and only if $M \cdot |Y| < |X|$. From the definition of $CR_d$, this means that $M < CR_d$ is required.
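The bound on M can be written out as a short worked derivation. It assumes, for illustration only, that $\eta$ is the ratio of the work over the source domain to the work over the compressed domain, that one unit of work is spent per source pixel and per searched block, and that $CR_d = |X|/|Y|$.

```latex
\eta \;=\; \frac{W_{\mathcal{O}}(X)}{W_{\mathcal{O}'}(Y)}
     \;=\; \frac{|X|}{M \cdot |Y|}
     \;=\; \frac{CR_d}{M},
\qquad\text{so that}\qquad
\eta > 1 \;\iff\; M \cdot |Y| < |X| \;\iff\; M < CR_d .
```

Under these assumptions, a 16:1 domain compression ratio tolerates an average search depth of at most 15 blocks per invocation before the compressive operation loses its advantage.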
When a proportionality constant c that depicts the difference in work required by operations over Y (versus operations over X) is included, we have $M c \cdot |Y| < |X|$, which implies that $M < CR_d / c$ if $\eta > 1$ is to be achieved.

3.4.2.27. Remark. The preceding discussion has practical applications, since variable-blocksize transforms are found in the literature (e.g., runlength encoding, variable-blocksize VQ, fractal-based encoding). Since encoding block configurations can vary widely and can be nonrectangular, the analysis of computational efficiency can be involved. Thus, we reserve discussion of specific algorithms for block search and variable-blocksize encoding until Parts II and III of this dissertation. We next consider the case of global reduce operations over compressed imagery, which can be simpler than the binary pointwise operations.

3.4.3. Global Reduce Operations. We begin with observations about several straightforward cases.

3.4.3.1. Assumption. Let a CCS $\mathcal{X}_{\Gamma} = (\{F^X, F; \Gamma\}, \{T, I; \emptyset\}, \{G^Y, F; \Gamma'\})$ be based upon a global reduce operation $\Gamma : F^X \to F$ that has a dual $\Gamma' : G^Y \to F$ over the range space of $T : F^X \to G^Y$.

3.4.3.2. Observation. Implicit in Assumption 3.4.3.1 is the fact that not all values in F may be represented by corresponding values in G. For example, if $a \in F^X$, then $\Sigma a$ might be in F but not in G for one or more of the following implementational reasons:
(i) Each value in G may be a scalar that represents a vector or encoding block constructed of typical source pixel values in F. For example, since $\Sigma a$ is a scalar, $\Sigma a$ might not be represented by a value in G.
(ii) Values in G may represent a subset of values in F, but G may not represent all possible source vector configurations (e.g., as in vector quantization). For example, $\Sigma a$ may not be represented in G.
Additionally, we could have the trivial cases where $\Gamma a \notin F$ (e.g., if F = {0,1} and $\Gamma = \Sigma$) or where an implementation of $\Gamma : F^X \to F$ is not known.

3.4.3.3. Technique.
Let Assumption 3.4.3.1 hold and assume (per Observation 3.4.3.2) that a value in range($\Gamma$) may not be represented in G. If $\Gamma \in \{\Sigma, \Pi, \vee, \wedge\}$, then we state the following simple cases:

Case 1. If $T : F^k \to G$ is an exact block encoding, then each block in domain(T) has one and only one label in G. As a result, we can build a dual of $\Gamma$ by applying $\Gamma$ to each encoding block to produce a value in F. We then construct a many-to-one map $g : F \to G$ that encodes the result of $\Gamma$ as a value or group of values in G. From the output of g, we can build a lookup table that implements $\Gamma'$ over values in G.

Case 2. Case 1 may apply where T is a lossy encoding, such as VQ. In such cases, g and a dual $\Gamma'$ can be constructed. However, $\Gamma'(T(a))$ may not equal $\Gamma(a)$, i.e., $\Gamma'$ may be an approximate dual of $\Gamma$, since the values in range(T(a)) over which $\Gamma'$ operates are approximations of blocks in domain(T).

Case 3. Let T be a spatial warping transform that isomorphically minifies or magnifies the source image by a factor K. If $\Gamma \in \{\Sigma, \vee, \wedge\}$, then $\Gamma'(T(a)) = \Gamma(a)/K$. Likewise, if $\Gamma = \Pi$ and domain(a) is m-dimensional, then $\Gamma'(T(a)) = \Gamma(a)/K^{m}$. In the former case, we multiply the output of $\Gamma'$ by K to obtain $\Gamma(a)$, and the latter case is stated symmetrically.

Case 4. In a few rare cases, T can preserve certain $\Gamma(a)$ in the compressed image T(a). For example, T could have a quantization step that finds the maximum $\vee a$ of an image in domain(T), then clamps a range of values less than $\vee a$ to the image maximum. In such cases, T could preserve $\vee a$ in range(T(a)). Then, the dual of $\Gamma = \vee$ would entail locating $\vee a$ as it is represented in range(T(a)).

The following examples are illustrative.

3.4.3.4. Example. Let Assumption 3.4.3.1 hold, and let T form a representation of a source encoding block in image a such that the block mean is preserved in range(T(a)). If $\Gamma \in \{\Sigma, \Pi, \vee, \wedge\}$, then the approximate dual $\Gamma' = \Gamma$ can be applied to T(a) to obtain an approximation to $\Gamma(a)$.
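As a minimal sketch of Example 3.4.3.4, the following code uses a hypothetical one-dimensional transform that keeps only the mean of each fixed-size block. The function names, the sample image, and the rescaling of the sum by the block size k are assumptions made for illustration; they are not taken from the source.

```python
def block_means(image, k):
    """T: replace each run of k source pixels by its mean (1-D, fixed blocks)."""
    if len(image) % k != 0:
        raise ValueError("image length must be a multiple of the block size k")
    return [sum(image[i:i + k]) / k for i in range(0, len(image), k)]


def dual_sum(compressed, k):
    """Dual of the global sum: each block mean stands for k source pixels."""
    return k * sum(compressed)


def approx_dual_max(compressed):
    """Approximate dual of the global maximum: apply max to the block means."""
    return max(compressed)


a = [3, 5, 4, 8, 1, 1, 9, 7]   # hypothetical source image
a_c = block_means(a, k=2)      # one mean per two-pixel block

total = dual_sum(a_c, k=2)     # recovers the source sum for this transform
peak = approx_dual_max(a_c)    # never exceeds the true maximum of a
```

For this mean-preserving transform the sum is recovered exactly after rescaling, whereas the maximum of the block means only bounds the true image maximum from below, which is the sense in which the dual is approximate.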
We utilize this method in our development of operations over the range spaces of BTC, VQ, and VPIC transformations.

3.4.3.5. Example. Let Assumption 3.4.3.1 hold, and let T form a representation of a source encoding block $b \in F^k$ in image a such that one or more such blocks are represented by an exemplar in a codebook $c \in (F^k)^U$, where U is an indexing set. Given a compressed image $a_c = T(a) \in G^Y$ whose histogram is given by $h \in \mathbb{N}^G$, the following statements hold:

(i) If $\Gamma = \Sigma$, then
$$\Gamma'(a_c) = \sum_{g \in G} h(g) \cdot \Sigma c(g) \approx \Sigma a. \tag{108}$$
(ii) If $\Gamma = \vee$, then
$$\Gamma'(a_c) = \bigvee c \approx \bigvee a. \tag{109}$$
A symmetric case holds for image minimum.
(iii) If $\Gamma = \Pi$, then
$$\Gamma'(a_c) = \prod_{g \in G} \left[\Pi c(g)\right]^{h(g)} \approx \Pi a. \tag{110}$$

The preceding technique is useful for deriving operations that process over VQ-compressed imagery.

3.4.3.6. Observation. In the preceding example, if the codebook has M exemplars of size k pixels, then Mk operations are required to compute the approximate dual of $\Gamma(a)$. If M is small (customarily less than 256 exemplars, in the case of high-compression VQ), then $M < |Y|$. For example, if 4x4-pixel encoding blocks are employed, then k = 16 and Mk < 4096 operations. If |domain(a)| = 1M, then |Y| = 64K, which far exceeds M. In such cases, the work required would be far less than |Y| operations, and the computational efficiency would greatly exceed $CR_d$.

3.4.3.7. Remark. The preceding techniques cover the majority of instances where we derive analogues or duals of global reduce operations over the range spaces of transforms in Classes 1-4. In this study, the following exceptions hold:
(i) Nonlinear inner product transforms, which generally do not preserve a representation of $\Sigma a$ in T(a);
(ii) Encryptions that unpredictably represent the number of unitary bits present in a.
From the cryptographic perspective, such encodings are useful since the correlation between $\Sigma a$ and the ciphertext T(a) is thus reduced, thereby reducing disclosure of the salient properties of a to an adversary; and
(iii) Transformations (mainly encryptions) that represent numbers in a with information loss as values in T(a) whose correspondence with source values is difficult to trace. Such ciphers belong to a class of encryptions called steganography (i.e., hidden writing) and are not considered in this study due to their complexity.

We next consider techniques for deriving analogues and duals of image-template operations.

3.4.4. Image-Template Operations. There are four primary methods with which image-template operations can be simulated over the range space of a compressive or encryptive transform. For example, one can use the image-domain template in compressed or uncompressed form. Additionally, the analogous or dual operation can be derived analytically or, in certain cases, can be encoded in a lookup table.

3.4.4.1. Assumption. Let the CCSs
$$\mathcal{X}_{\circledast} = (\{F^X, (F^X)^X; \circledast\}, \{T, U; \emptyset\}, \{G^Y, (G^Y)^Y; \circledast'\})$$
$$\mathcal{Y}_{\circledast} = (\{F^X, (F^X)^X; \circledast\}, \{T, I; \emptyset\}, \{G^Y, (F^X)^X; \circledast'\}),$$
where $T : F^X \to G^Y$, the template transform $U : (F^X)^X \to (G^Y)^Y$, I denotes the identity transform, and $\circledast'$ is the operation over range(T) that corresponds to $\circledast : F^X \times (F^X)^X \to F^X$.

3.4.4.2. Assumption. Let Assumption 3.4.4.1 hold, and let there be given a source image $a \in F^X$, a template $t \in (F^X)^X$, the compressed (encrypted) image $a_c = T(a)$, and the computed image $a \circledast t$.

3.4.4.3. Technique. Let Assumptions 3.4.4.1 and 3.4.4.2 hold. As mentioned in the introductory comments to this section, the following four cases are pertinent to the derivation of analogues or duals of $\circledast$ over range(T):

Case 1. If U exists, then the operation $\circledast$ can be simulated over range(T) by $\circledast'$, which accepts $a_c$ and $s = U(t)$ and produces $T(a \circledast t)$ or an approximation thereof. This is accomplished with the CCS $\mathcal{X}_{\circledast}$.
Note that $\circledast'$ can be implemented in terms of a lookup table that accepts values in $a_c$ within the support of s, and produces the value in $T(a \circledast t)$ that occurs at the target point of s, or an approximation to that value. Problems and pitfalls associated with this approach are discussed below.

Case 2. The preceding case applies, but $\circledast'$ is a dual of $\circledast$, i.e., it computes $a \circledast t$ given $a_c$ and s.

Case 3. If U does not exist, then the operation $\circledast$ can be simulated over range(T) by $\circledast'$, which accepts $a_c$ and t, and produces $T(a \circledast t)$ or an approximation thereof. This is accomplished with the CCS $\mathcal{Y}_{\circledast}$, and is particularly useful for block-structured operations over block-encoded imagery (e.g., block averaging, smoothing, or edge detection). In such cases, source blocks are preserved either in $a_c$ or, more usually, in a codebook. For example, in the case of a codebook c, template t can be applied to exemplars in c that are images on a subset of X.

Case 4. The preceding case applies, but $\circledast'$ is a dual of $\circledast$, i.e., it computes $a \circledast t$ given $a_c$ and t.

Although there are additional methods of simulating image-template operations over range(T), the preceding four cases cover the majority of derivations encountered in this study. For purposes of illustration, we discuss one instance each of Case 1 and Case 3 in the following observations.

3.4.4.4. Observation. Consider the problem of formulating the template transformation U in Case 1 of the preceding section. The obvious approach is to assume that t is spatially invariant, then transform $t_x$, the weight matrix of t, where x is a source domain point in X. We thus produce the weight matrix of the compressive template s as $s_{h(x)} = T(t_x)$. Several practical problems are associated with this approach. We first assume, for purposes of convenience, that T is a blockwise transform that employs constant block shape and size.
Additionally, we assume (for purposes of notational convenience) that the analogous operation $\circledast'$ is comprised of two associative, commutative operations $\bigcirc', \gamma' : G \times G \to G$. If $S(t_x)$ subtends an integral number N of encoding blocks, then U employs T to transform $t_x$ into an array or list of N values in G, each of which represents an encoding block. Typically, no further manipulation is required. Unfortunately, there are numerous cases where $S(t_x)$ does not completely subtend N encoding blocks. In such cases, padding of $t_x$ is required. For example, if $W = h(S(t_x))$, and b is a constant image on W whose pixel values equal the zero of $\bigcirc'$, then the corresponding weight matrix of s would be given by
$$s_{h(x)} = T\left( (t_x|^{b})\,|_{h^{*}(W)} \right), \quad x \in X, \tag{112}$$
where $h^{*}$ was defined previously.

There are several problems with this approach. First, by zero-padding $t_x$, one may cause T to distort the representation of encoding blocks in $t_x$. For example, let T characterize an encoding block of size k pixels by its mean, and assume that a given block partition of $t_x$, having size n pixels, subtends a portion of the y-th encoding block, where y is a point in the compressed domain Y. We call such a partition of $t_x$ an incomplete block partition. In such cases, zero-padding reduces the y-th block mean to a fraction n/k of the value computed over the n pixels alone. In practice, the n-pixel mean portrays the given (incomplete) block partition of $t_x$ more accurately.

3.4.4.5. Remark. When we consider the problem of template matching over compressed images, we shall see that the preceding issue is not fanciful, but has direct impact upon the accuracy with which partitions that comprise targets of interest can be recognized in compressed imagery. For example, BTC, VPIC, and block-normalized VQ characterize encoding blocks in terms of the block mean.
Thus, we can either pad incomplete block partitions of $t_x$ that characterize targets of interest with typical (e.g., mean) background values, or we can truncate incomplete partitions from $t_x$ prior to its transformation into the weight matrix of s. Background padding, although attractive conceptually, has the disadvantage of rendering t specific to one type of target-background configuration. Likewise, truncation may incur inaccuracies in representing target shape or statistics. We discuss these methods briefly in the final portion of this dissertation and plan to explore alternative techniques in future research.

3.4.4.6. Complexity. The method of Case 1 requires $O(|Y|^2)$ work, since both the source image and the template are compressed. Given that $O(|X|^2)$ work is required for noncompressive image-template operations (over X), this represents a net efficiency of $O(CR_d^{2})$, which exceeds the study goal of $O(CR_d)$ efficiency. Unfortunately, if t is space-variant, then this efficiency is severely degraded by the overhead incurred by transformation of the weight matrix $t_x$ at each point $x \in X$. If we assume, for purposes of convenience, that t is invariant within each encoding block, then $|Y|$ such transformations are required.

3.4.4.7. Observation. An alternative to Case 1 is given by Case 3, where we could combine the source template t with a codebook exemplar or compressed block representation. Since the codebook processing approach is more typical in current practice, we let Assumptions 3.4.4.1 and 3.4.4.2 hold, and denote the codebook as $c \in (F^k)^U$, where each exemplar is a k-pixel, F-valued block that contains the values of an image on a subset of X. Given an exemplar index u in the codebook domain U, we produce a new exemplar d(u) by applying t to c(u) as
$$d(u) = c(u) \circledast t. \tag{113}$$
By thus processing $|U|$ exemplars, we produce a new codebook d from c. If the compressed image $a_c \in U^Y$, then processing of $a_c$ is not required, since Equation 113 redefines the codebook only.
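The codebook update of Equation 113 can be sketched as follows, assuming a linear image-template product with an invariant, odd-sized weight matrix and treating pixels outside an exemplar as zero. The function names and the sample codebook are hypothetical and serve only to show that the compressed image itself is left untouched.

```python
def apply_template(block, weights):
    """Linear image-template product of a 2-D block with an odd-sized
    weight matrix; pixels outside the block contribute zero."""
    rows, cols = len(block), len(block[0])
    half_r, half_c = len(weights) // 2, len(weights[0]) // 2
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for i, weight_row in enumerate(weights):
                for j, w in enumerate(weight_row):
                    rr, cc = r + i - half_r, c + j - half_c
                    if 0 <= rr < rows and 0 <= cc < cols:
                        acc += w * block[rr][cc]
            out[r][c] = acc
    return out


def process_codebook(codebook, weights):
    """Equation 113 applied to every exemplar: d(u) = c(u) (*) t."""
    return {u: apply_template(exemplar, weights)
            for u, exemplar in codebook.items()}


# Hypothetical codebook of two 2x2 exemplars and a 3x3 box (summing) template.
codebook = {0: [[1, 2], [3, 4]], 1: [[0, 0], [0, 1]]}
box = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]

a_c = [[0, 1], [1, 0]]                  # compressed image: codebook indices
new_codebook = process_codebook(codebook, box)
# a_c is reused as-is; it now refers to exemplars in new_codebook.
```

The work is proportional to the number of exemplars rather than to the number of image blocks, and the zero-valued surround used here is the boundary effect that the expansion steps discussed next are meant to reduce.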
The problem of boundary conditions that occur in Equation 113 can be remedied as follows:

Step 1. Expand in the i-th dimension of X the corresponding boundaries of each codebook exemplar in c by $m_i$ pixels, where $m_i$ denotes the halfwidth of t in that dimension.
Step 2. Apply Equation 113 to each extended exemplar in c to yield d.
Step 3. For each exemplar in d, remove the pixels added in Step 1, thereby yielding exemplars of size identical to those in c.

Although there is residual error at the exemplar boundaries, such error is much less than that which would be present if the boundaries were not expanded. That is, without expansion, whenever the target point of t coincided with a boundary pixel of a given exemplar, the support of t would overlie pixels that are artificially valued with the zero of $\bigcirc : F \times F \to F$, a constituent operation of $\circledast$.

3.4.4.8. Complexity. Given a codebook of M k-pixel exemplars and an N-pixel source template, Equation 113 requires O(MNk) operations, as opposed to $O(N \cdot |X|)$ operations in the noncompressive case. Since $M < |Y|$ and $Mk < |Y|$ are typical, supra-unitary computational efficiencies can be achieved.

3.4.4.9. Example. Let a codebook have M = 256 exemplars of 4x4 = 16 pixels each. Let a source image be linearly convolved (i.e., $\circledast = \oplus$) with a source template having 25 pixels, which is zero-padded to 8x8 pixels prior to transformation of its weight matrix (per Observation 3.4.4.4). Thus, the size of the uncompressed template is given by N = 25 pixels, while the compressed template has L = 2x2 = 4 pixels. Assuming that X is a 1024 x 1024-pixel array and that $N \cdot |X|$ multiplications are required for noncompressive convolution, and assuming that the noncompressive multiplication operation can be used in the compressive convolution, the computational efficiency is given by
$$\eta = \frac{N \cdot |X|}{M \cdot k \cdot L} = \frac{25 \cdot 1024^{2}}{256 \cdot 16 \cdot 4} = 1600, \tag{114}$$
which is nontrivial.
Even if the analogue of convolution requires operations that are 16 times more costly than the noncompressive multiplications, a speedup of two orders of magnitude would remain, which greatly exceeds the domain compression ratio of 16:1.

3.4.4.10. Observation. An alternative to combining a compressed image $a_c$ and a compressed template s on domain($a_c$) using $\circledast'$ can be obtained via a lookup table (LUT). Due to current constraints on memory size, this method is useful only for small templates convolved over images whose greylevels are quantized into an interval A whose size is small. In practice, we prefer to use invariant templates to avoid LUT reconfiguration. In such cases, we merely configure the LUT domain to accept values in $S(s_y)$, where y is a point in domain($a_c$). The LUT then outputs the template value at the target point y.

3.4.4.11. Complexity. Let $a_c$ be produced by BTC, such that the y-th compressed image value contains the mean $\mu$, standard deviation $\sigma$, and a zero-crossing bitmap d of the y-th encoding block, where y is a point in domain($a_c$). Let the compressed template s be the unitary von Neumann template, for purposes of simplicity. For purposes of brevity, we assume that s operates upon the block mean only, and let $\mu$ be quantized into four bits, which is permissible in BTC of noisy imagery. Since $|S(s_y)| = 5$, the LUT has a domain size of 20 bits, which implies storage of 1M elements. Since s operates upon the mean only, s returns a mean value (4 bits), which implies a total LUT size of approximately 500KB, which is feasible implementationally. In Chapter 9, we show that local averaging can be simulated over BTC- and VPIC-compressed imagery by such methods, with a consequent cost of $O(|Y|)$ operations, where the proportionality constant slightly exceeds $|S(s_y)|$ due to I/O overhead and data structural manipulations.
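The arithmetic behind Equation 114 and the lookup-table sizing of Section 3.4.4.11 can be checked with a few lines of code. The variable names are illustrative only; the figures are those quoted in the text, and the byte count assumes that 4-bit outputs are packed two per byte.

```python
# Equation 114: efficiency of codebook-based convolution.
N = 25                     # pixels in the uncompressed template
X_SIZE = 1024 * 1024       # |X|, pixels in the source image
M = 256                    # exemplars in the codebook
K = 16                     # pixels per exemplar (4x4)
L = 4                      # pixels in the compressed template (2x2)

efficiency = (N * X_SIZE) / (M * K * L)

# Same efficiency when each compressive operation costs 16 times as much.
efficiency_costly = efficiency / 16

# Section 3.4.4.11: LUT indexed by five 4-bit block means.
SUPPORT = 5                # |S(s_y)| for the von Neumann template
BITS_PER_MEAN = 4          # quantization of the block mean

address_bits = SUPPORT * BITS_PER_MEAN
entries = 2 ** address_bits
lut_bytes = entries * BITS_PER_MEAN // 8   # packed 4-bit outputs
```

The packed table occupies 512 KiB, which agrees with the figure of roughly 500KB given above.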
Given the preceding theoretical background, we next consider the more specialized cases of operations over the range space of specific transform classes in our taxonomy.

3.4.5. Analogues over the Range of Class 1 Transforms. We describe two techniques for Class 1 transforms, which are (a) the definition of $\mathcal{O}'$ in terms of $\mathcal{O}$, and (b) the use of isomorphisms between $\mathcal{O}$ and $\mathcal{O}'$.

3.4.5.1. Techniques.

Method 1. Expressing $\mathcal{O}'$ in terms of $\mathcal{O}$. Let Assumption 3.4.1.1 hold, and let $T_1$ be a homomorphism from S to S. From Assumption 3.4.1.1, it follows that $\mathcal{O}, \mathcal{O}' : S \times S \to S$. Since $\mathcal{O}$ and $\mathcal{O}'$ have the same range and domain, it is reasonable to assume that $\mathcal{O}'$ could be expressed in terms of $\mathcal{O}$ or (in restricted instances) that $\mathcal{O}' = \mathcal{O}$. For example, in the case of substitutional ciphers that are bijections, $\mathcal{O}'$ can be expressed in terms of $\mathcal{O}$.

Method 2. Deriving $\mathcal{O}'$ analytically from $\mathcal{O}$ and T. Assume that $T_1$ is a homomorphism, $\mathcal{O}', \mathcal{O} : S \times S \to S$, and source images $a, b \in F^X$. From the definition of a homomorphism, $T_1(a\,\mathcal{O}\,b) = T_1(a)\,\mathcal{O}'\,T_1(b)$. Since T, a, b, and $\mathcal{O}$ are known, instances of the preceding equation may be soluble for $\mathcal{O}'$. For example, the solution is generally easier for customary local (e.g., pointwise) transformations than for global (e.g., inner-product) transforms.

The following examples are illustrative.

3.4.5.2. Example: Method 1. Let $T_1 : \mathbb{R}^X \to \mathbb{R}^X$, such that $T_1(a) = 2 * a$, a pointwise transformation. Let $\mathcal{O} : \mathbb{R}^X \times \mathbb{R}^X \to \mathbb{R}^X$ be defined as $\mathcal{O}(a, b) = a + b$. Then, $\mathcal{O}' = \mathcal{O}$, since multiplication distributes over addition.

3.4.5.3. Example: Method 2. Let the homomorphism $T_1 : (\mathbb{Z}_2)^X \to (\mathbb{Z}_2)^X$, with $\mathcal{O}, \mathcal{O}' : (\mathbb{Z}_2)^X \times (\mathbb{Z}_2)^X \to (\mathbb{Z}_2)^X$. Define the transform $T_1(a) = \neg(a) = (a + 1) \bmod 2$ and let the image operation $\mathcal{O}(a, b) = a \;\mathrm{xor}\; b = (a + b) \bmod 2$. Given DeMorgan's laws and the definition of homomorphism, we can rewrite the expression $T_1(a\,\mathcal{O}\,b) = T_1(a)\,\mathcal{O}'\,T_1(b)$ as
$$\neg(a)\,\mathcal{O}'\,\neg(b) = \neg(a \;\mathrm{xor}\; b) = \neg(a) \;\mathrm{xnor}\; \neg(b), \tag{115}$$
where $a \;\mathrm{xnor}\; b = (a + b + 1) \bmod 2$ is the complement of xor. Thus, $\mathcal{O}'$ = XNOR.
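Method 2 can be illustrated by exhaustive search when the value set is small. The sketch below, whose function and variable names are hypothetical, enumerates every binary operation on {0, 1} and keeps those that satisfy the homomorphism condition for a given transform and source operation. It also spot-checks the distributive identity used in Example 3.4.5.2.

```python
from itertools import product


def solve_for_analogue(transform, operation, values=(0, 1)):
    """Return every binary operation O' on `values`, as a truth table,
    with transform(operation(a, b)) == O'(transform(a), transform(b))."""
    pairs = list(product(values, repeat=2))
    solutions = []
    for outputs in product(values, repeat=len(pairs)):
        table = dict(zip(pairs, outputs))
        if all(table[(transform(a), transform(b))] == transform(operation(a, b))
               for a, b in pairs):
            solutions.append(table)
    return solutions


def negate(a):
    return (a + 1) % 2


def xor(a, b):
    return (a + b) % 2


solutions = solve_for_analogue(negate, xor)
xnor_table = {(a, b): (a + b + 1) % 2 for a, b in product((0, 1), repeat=2)}

# Example 3.4.5.2: doubling commutes with addition, so O' = O there.
doubling_ok = all(2 * (a + b) == 2 * a + 2 * b
                  for a in range(-3, 4) for b in range(-3, 4))
```

For pointwise negation paired with xor, the search returns exactly one operation, the complement of xor (XNOR), because negation is a bijection on {0, 1} and therefore fixes every entry of the truth table.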
Given Definitions 3.2.2.7 and 3.2.3.12, together with the Boolean operators $\mathcal{W}_{\mathbb{Z}_2}$, we therefore have the following solution to an instance of the TPP
$$\mathcal{M}_{TPP} : \left( ((\mathbb{Z}_2)^X, \mathrm{xor}),\; \neg,\; ((\mathbb{Z}_2)^X, \mathcal{W}_{\mathbb{Z}_2}) \right) \mapsto ((\mathbb{Z}_2)^X, \mathrm{xnor}), \tag{116}$$
which yields the computational system
$$\mathcal{X}_{\mathcal{O}} = \left( \{(\mathbb{Z}_2)^X; \mathrm{xor}\},\; \{\neg\},\; \{(\mathbb{Z}_2)^X; \mathrm{xnor}\} \right). \tag{117}$$

In practice, many Class 1 transforms are amenable to such simple derivational methods. The derivation of an analogous or dual operation is nearly trivial when $T_1$ is a linear transform and $\mathcal{O}$ is a linear operator. However, when $\mathcal{O}$ is nonlinear, an approximation is generally required (per Method 2 of Section 3.4.5.1). Similar methods are employed for global-reduce and image-template operations, as shown in the derivation of TPP solutions for Class 1 transforms (reference Chapter 6). We next consider the determination of analogous operations over the outputs of Class 2 transforms.

3.4.6. Analogues over the Range of Class 2 Transforms. Class 2 transforms modify the range space of an image but not the image domain, and can be thought of implementationally as a subset of the greyscale operations.

3.4.6.1. Observation. Note that certain operations over domain($T_2$) can be transformed into operations over range($T_2$) by the application of a greyscale transform. For example, bijective transpositions over a uniquely valued image are isomorphic to value substitutions. Alternatively, $T_2$ can be a many-to-one mapping such as a local neighborhood transform (e.g., local averaging or median filtering followed by bit-slicing). The Class 2 transform $T_2 : F^X \to G^X$ can be thought of as a scalar transform $T_2 : F^n \to G$. Note that this scalar form admits n-ary kernel functions that compress data by grouping several neighborhood values in terms of a value in G, similar to vector quantization. Similarly, at fine granularity, we think of a Class 2 transform as a grey-level mapping $g : (f_1, f_2, \ldots, f_n) \mapsto g$, where $f_1, f_2, \ldots, f_n \in F$ and $g \in G$.
When $T_2$ is not invertible, a closed-form solution may not be obtained by solving the definition of a homomorphism (using $T_2$ and $\mathcal{O}$) for the analogous operation $\mathcal{O}'$. Note that when n is large, search over $|F|^n$ incurs exponentially growing cost that becomes prohibitively large for practical values of n. As a result, neighborhood-based formulations of $T_2$ usually have no $\epsilon$-near approximation to the inverse transform that can be obtained by efficient search over $|F|^n$. This fact has adverse implications for cryptanalysis. Similarly, if a VQ-based approach to solving for $\mathcal{O}'$ is employed (by attempting to map clusters of values in $|F|^n$ to corresponding exemplars), then if n is large, $|F|^n$ may be so sparsely populated that quantization yields a prohibitive distance (or error) between an exemplar and the data it represents. As a result of the foregoing problems, we concentrate (for purposes of brevity) on tractable pointwise instances of $T_2$, and defer discussion of approximations of $\mathcal{O}'$ to subsequent development.

3.4.6.2. Technique. If $T_2$ is a homomorphism, then the statement $T_2(a\,\mathcal{O}\,b) = T_2(a)\,\mathcal{O}'\,T_2(b)$ can be written as $g(a\,\mathcal{O}\,b) = g(a)\,\mathcal{O}'\,g(b)$. If g and $\mathcal{O}$ can be expressed analytically, then the preceding equation can be solved for $\mathcal{O}'$, within constraints of tractability.

When $T_2$ is a bijection, it is convenient to implement an analogue $\mathcal{O}'$ of image operation $\mathcal{O}$ in terms of the inverse mapping $g^{-1} : G \to F$. For example, given a unary image operation $\mathcal{O} : F \to F$, the commutativity diagram of Figure 1 implies that $\mathcal{O}'(\mathrm{g}) = \mathcal{O}(g^{-1}(\mathrm{g}))$, $\mathrm{g} \in G$. As discussed in Chapter 4, such formulations exhibit the disadvantage of decreased data security, since a bijection is an isomorphism and $g^{-1}$ can disclose information about F. The following simple example illustrates the concept of deriving an operation over range($T_2$). More detailed examples are given in Chapter 7.

3.4.6.3. Example. Let the homomorphism $T_2 : (\mathbb{R}^{+})^X \to \mathbb{R}^X$, such that $T_2(a) = \sin(a)$.
Let $\mathcal{O} : (\mathbb{R}^{+})^X \times (\mathbb{R}^{+})^X \to (\mathbb{R}^{+})^X$ be defined as $\mathcal{O}(a, b) = a + b$. An analogous operation $\mathcal{O}'(T(a), T(b)) = T(\mathcal{O}(a, b)) = \sin(a + b)$ is given by
$$\mathcal{O}'(\sin a, \sin b) = \sin a \cdot \cos b + \sin b \cdot \cos a = \sin a \cdot \sin\left(b + \frac{\pi}{2}\right) + \sin b \cdot \sin\left(a + \frac{\pi}{2}\right), \tag{118}$$
which follows from basic trigonometry.

We next provide an example of deriving an operation $\mathcal{O}'$ over the range space of a many-to-one Class 2 transform, which operation is not an analogue of the corresponding image operation $\mathcal{O}$. In particular, $\mathcal{O}$ yields an image, certain pixels of which form a predictable subset of the pixels in the output of $\mathcal{O}'$.

3.4.6.4. Example. Consider the reduction cipher $T_r : F^X \to G^X$, where $|F| > |G|$. For example, a message in English plaintext (where |F| = 26) could map with two-fold size reduction to a message in a language where |G| = 13 (e.g., G = {a,e,i,o,u,h,k,l,m,n,p,w,y}). By choosing the mapping appropriately, a particular plaintext word or phrase could be translated into legible ciphertext, while the remainder of the ciphertext could be gibberish. This method is similar to that of steganography, which encodes text messages in pictures formed from ASCII characters. Alternatively, one could pair frequently occurring characters with infrequent characters, in order to render the frequency distribution of encoding symbols more uniform.

In a simple case, if $T_r$ does not perturb the message ordering, then an operation that preserves message ordering and does not perturb symbol values (e.g., concatenation) would be preserved over range($T_r$). In a more complicated case, if an N-pixel plaintext template were encoded using $T_r$, where N is an integral multiple of the number of pixels in each of the input values of $T_r$, then the operation of template matching would not be preserved over range($T_r$), since $T_r$ is a many-to-one mapping.

Let us examine the latter assertion by setting F = {A,B,C,D} and G = {0,1}, for purposes of simplicity. Suppose that $T_r$ maps A or B to 0 and C or D to 1.
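The mapping just defined can be sketched in a few lines, together with the two properties claimed for it: concatenation is preserved over the range of the cipher, while distinct plaintexts can collide. The names `REDUCTION_MAP` and `reduce_cipher` are hypothetical.

```python
# Many-to-one reduction cipher: A or B maps to 0, C or D maps to 1.
REDUCTION_MAP = {"A": "0", "B": "0", "C": "1", "D": "1"}


def reduce_cipher(message):
    """Encode a plaintext string over {A,B,C,D} symbol by symbol."""
    return "".join(REDUCTION_MAP[symbol] for symbol in message)


# Concatenation is preserved because symbol order is not perturbed.
left, right = "AB", "CD"
concat_preserved = (reduce_cipher(left + right)
                    == reduce_cipher(left) + reduce_cipher(right))

# Distinct plaintexts collide, so matching on ciphertext can over-report.
collision = reduce_cipher("AB") == reduce_cipher("BA")
```

The collision is what allows a ciphertext template to match at positions where the plaintext template does not.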
Given the message ABACAD, $T_r$ would produce the ciphertext 000101. Let a template $t \in (F^X)^X$, and let a template matching operation $\mathcal{O} : F^X \times (F^X)^X \to F^X$ produce D at y whenever $(a|_{S(t_y)})(x) = t_y(x)$, $x, y \in X$, and A otherwise. Assume that t can be encoded as a template $s \in (G^X)^X$ by the application of $T_r$ to the weight matrix $t_y$, such that s is an exact encoding of t. Let $\mathcal{O}' : G^X \times (G^X)^X \to G^X$ match s to $a_c = T_r(a)$, producing 1 at y whenever $(a_c|_{S(s_y)})(x) = s_y(x)$, $x, y \in X$, and zero otherwise. Since $T_r$ is a many-to-one mapping, $T_r(\mathcal{O}(a, t)) \neq \mathcal{O}'(a_c, s)$ and $\mathcal{O}'$ is not an analogue of $\mathcal{O}$. However, $T_r(\mathcal{O}(a, t))\,||_{\{1\}} \subseteq \mathcal{O}'(a_c, s)$. This can be verified by setting a = (A,B,B,A,B,C,A,C,D,A) and t = (A,B), where the target pixel of t is its second pixel. Then $a_c$ = (0,0,0,0,0,1,0,1,1,0) and s = (0,0). Thus,
$$\mathcal{O}(a, t) = (A,D,A,A,D,A,A,A,A,A), \qquad T_r(\mathcal{O}(a, t)) = (0,1,0,0,1,0,0,0,0,0), \tag{119}$$
and
$$\mathcal{O}'(a_c, s) = (0,1,1,1,1,0,0,0,0,0) \neq T_r(\mathcal{O}(a, t)).$$

As a result, analogues of many image operations often do not exist over the range spaces of many-to-one mappings in Class 2. However, as shown in Chapter 7 and outlined in the following section, it is possible in certain cases to perturb the original image operation, or to utilize the concept of an $\epsilon$-near approximation to the inverse transform, to obtain an analogue over range($T_2$). We next consider processing over the range spaces of the Class 3 transforms, which was summarized in Section 3.4.2.11.

3.4.7. Analogues over the Range of Class 3 Transforms. In the Class 3 transforms (denoted by $T_3$), the domains of the input and output images are unequal. Thus, we can think of the Class 3 transforms as domain transformations. If the image domain is a subset of $\mathbb{R}^n$, then such mappings are frequently called spatial transforms. As a result of the conceptual and formulaic simplicity of $T_3$, the derivation of analogues over the Class 3 transforms can be accomplished with relative ease.

3.4.7.1. Observation.
The Class 3 transforms, of form $T_3 : F^X \to F^Y$, can be thought of as being based upon a spatial mapping $f : Y \to X$. The inverse spatial transform $f^{-1} : X \to Y$, if it exists, can be the basis for a transform-regime analogue of an image operation over range($T_3$). This concept is similar to our use (for Class 2 transforms) of the inverse $g^{-1} : G \to F$ of the grey-level transform g.

3.4.7.2. Technique. Assume that the Class 3 transforms are represented generally by a spatial transform that is a homomorphism. That is, given $a \in F^X$ and the spatial transform $f : Y \to X$, the transformed image is $a_c = T(a) = a \circ f$, which composition was defined previously. If f has an inverse $f^{-1} : X \to Y$, let the inverse homomorphism $T^{-1} : F^Y \to F^X$ exist, such that
$$a = T^{-1}(a_c) = a_c \circ f^{-1} = a \circ f \circ f^{-1}. \tag{120}$$
As shown in Section 3.4.2.11, the analogous operation over range(T), denoted by $\mathcal{O}' : F^Y \to F^Y$, can be derived from the preceding equation as
$$\mathcal{O}'(a_c) = \mathcal{O}(a) \circ f = (\mathcal{O} \circ f)(a), \tag{121}$$
since $a_c = a \circ f$ and composition is associative. Thus, if T is a homomorphism and $\mathcal{O}$ is specified, then $\mathcal{O}' = \mathcal{O} \circ f$. A symmetric case holds for binary operations.

3.4.7.3. Remark. Let $a, b \in F^X$, assume that T is as given in the preceding section, and let $a_c = T(a)$ and $b_c = T(b)$. Without loss of generality, the preceding technique applies to binary pointwise operations as
$$\mathcal{O}'(a_c, b_c) = \mathcal{O}(a, b) \circ f = (\mathcal{O} \circ f)(a, b), \tag{122}$$
where f was defined previously. Similarly, given a template $t \in (F^X)^X$, the operation $a \circledast t$ has an analogue $\circledast'$ over range(T) that is comprised of the basis operations
$$\bigcirc' = \bigcirc \circ f \quad \text{and} \quad \gamma' = \gamma \circ f, \tag{123}$$
which are the analogues over range(T) of $\bigcirc, \gamma : F \times F \to F$, the basis operations of $\circledast$.

3.4.7.4. Observation. We advance the following comments concerning global reduce and image-template operations over the range spaces of spatial warping transformations, which have obvious proofs.

1.
If the spatial transform f is continuous and isomorphic and does not crop or magnify the source image a, then the global reduce functions are analogues of themselves.

2. Let f be discrete and isomorphic, without cropping a, but let f magnify a by a factor of r > 0 in each of the m dimensions of domain(a). Neglecting quantization error, the following statements hold:
• $\Sigma' a$, the dual of $\Sigma a$, is given by $\Sigma'(a) = r^{m} \cdot \Sigma a$;
• Image maximum and minimum are self-analogues; and
• If a is not unitarily-valued, then $\Pi' a$, the dual of $\Pi a$, is given by $\Pi'(a) = (\Pi a)^{r^{m}}$.

3. If f is anisomorphic, then analogues or duals of the global reduce functions cannot be predicted globally, but must be computed over small neighborhoods of range(f), then combined to yield the functional result.

3.4.7.5. Observation. Image-template operations are computed over range($T_3$) in a manner that is symmetric to the Class 3 pointwise operations. Assuming an invariant template t, for purposes of convenience, the source image a and the template weight matrix are first transformed by $T_3$ to yield an image $a_c$ and the weight matrix of a new template s that is defined over domain($a_c$). Within the constraints of quantization and interpolation error, the operation $\circledast$ is a self-analogue. If f is anisomorphic, then t (and, hence, s) must be defined as space-variant, such that the weight matrix of s must be computed at each target point y in domain($a_c$) prior to application of $\circledast$. This is a key drawback of computing over imagery that is perturbed by spatial warping transforms. We address this issue in some detail in Chapter 8.

In the development of Chapter 9, we present examples of the foregoing technique, and derive various functions of spatially-transformed imagery. We also discuss other types of transformations in Class 3, including those mappings that discard pixels in the transform domain. In such cases, the preceding theory holds for source pixels that remain in the compressed image.
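A minimal one-dimensional sketch of these Class 3 relationships follows, assuming magnification by pixel replication and minification by decimation. The helper names, the sample image, and the choice of squaring as the pointwise operation are illustrative assumptions only.

```python
from math import prod


def warp(image, f, out_size):
    """Class 3 transform T(a) = a o f: output pixel y takes the value a[f(y)]."""
    return [image[f(y)] for y in range(out_size)]


def pointwise_square(image):
    """A sample pointwise operation applied to every pixel."""
    return [v * v for v in image]


a = [2, 3, 5, 7]          # hypothetical source image on X = {0, 1, 2, 3}
r = 2                     # magnification factor (one dimension, so m = 1)


def magnify(y):
    """f : Y -> X for pixel replication by a factor r."""
    return y // r


def decimate(y):
    """f : Y -> X that keeps every second source pixel."""
    return 2 * y


a_mag = warp(a, magnify, r * len(a))
a_dec = warp(a, decimate, len(a) // 2)

# Pointwise operations commute with the spatial map: O(a o f) = O(a) o f.
commutes_mag = pointwise_square(a_mag) == warp(pointwise_square(a), magnify, r * len(a))
commutes_dec = pointwise_square(a_dec) == warp(pointwise_square(a), decimate, len(a) // 2)

# Global reduce results over the magnified image.
sum_scaled = sum(a_mag) == r * sum(a)
max_same = max(a_mag) == max(a)
prod_power = prod(a_mag) == prod(a) ** r
```

The decimated image still commutes with the pointwise operation on the pixels it retains, but its maximum is 5 rather than 7, a small instance of discarded source information.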
However, information loss results, due to the discarding of information from the source image. We next discuss processing over the range spaces of Class 4 transforms.

3.4.8. Analogues over the Range of Class 4 Transforms.

Class 4 contains many interesting and complex compression and encryption transforms. As a result of the formulaic complexity of the transform, analogues of image operations are not as easily derived as for Classes 1-3. However, via the use of substitutions, we can derive general formulations for Class 4 analogues of certain image operations. Additional operations can be formulated via approximation methods.

3.4.8.1. Observation. Recall that, in Chapter 3, we exemplified the multigranular decomposition by specializing Class 4 transforms into two sub-classes. In particular, Class 4-A contained fixed-block encodings that were portrayed as substitutions, and Class 4-B contained variable-block encodings. Note that (a) the medium-granular formulation $T_4 : (F^k)^Y \to G^Y$ generalizes subclasses 4-A and 4-B and (b) we can represent $T_4$ in scalar form by the grey-level operation $g : (f_1, f_2, \ldots, f_k) \mapsto g$, where $f_1, f_2, \ldots, f_k \in F$, $g \in G$, and $k$ may vary with $y \in Y$. This situation is similar to the Class 2 transform, where $g : F \to G$. As a result, pointwise image operations and global reduce operations can map easily to the range spaces of Class 4 substitutions, as follows.

3.4.8.2. Technique: Global Reduce Operations. Let the Class 4 transform $T_4 : F^X \to G^Y$ be decomposed into the medium-granular form $T_4 : (F^k)^Y \to G^Y$, which is in turn specified in fine-granular form as the function $g : F^k \to G$. For purposes of development, assume that there exists an inverse (or an $\epsilon$-near approximation to the inverse) $g^{-1} : G \to F^k$. Let the source image $a \in F^X$ be operated upon by a global reduce operation $\Gamma : F^X \to F$. Additionally, let the transformed image $a_c = T_4(a)$, and assume that the encoding blocks denoted by $F^k$ partition $X$ completely into $|X|/k$ disjoint blocks. Then $\Gamma' : G^Y \to F$, the transform-regime dual of $\Gamma$, is given by

$$\Gamma'(a_c) = \mathop{\Gamma}_{y \in Y} \bigl( g^{-1}(a_c(y)) \bigr). \qquad (124)$$

The use of a transform-regime dual, instead of an analogue, is appropriate here, since $\Gamma$ returns one value, rather than an image on $X$. In practice, the output of $\Gamma$ may not be in $G$ if $|X|$ or $k$ is large, since $G$ is designed to represent single pixel values, whereas $F$ (by definition of $\Gamma$) contains values in $\mathrm{range}(\Gamma)$.

3.4.8.3. Example. Under constraint of the preceding technique, if $\Gamma = \Sigma$ and $g = \Sigma$ (i.e., $T_4$ denotes block summation), then $g^{-1}$ does not exist, but $\Gamma' = \Sigma$ (where summation is applied to the block sums in $\mathrm{range}(T_4)$ that are produced by $g$), since addition is associative.

We next discuss the more challenging task of deriving analogues over $\mathrm{range}(T_4)$ of the pointwise image operations. For purposes of simplicity, we begin with the transform-regime dual.

3.4.8.4. Technique: Dual pointwise operations. Let $T_4$ be specified in fine-granular form as the scalar operation $g : F^k \to G$. Assume that there exists an inverse operation $g^{-1} : G \to F^k$ or an $\epsilon$-near approximation to the inverse denoted by $g^* : G \to F^k$, and let $h^* : Y \to 2^X$ denote an indexing function that accepts a point $y \in Y$ and returns the domain $D \subset X$ of the $y$-th encoding block. Let $f : F^k \times 2^X \to 2^{X \times F}$ accept a tuple from the output of $g^{-1}$ and a subset of $X$ from the output of $h^*$, and return the subimage on $F^X$ whose domain is specified by the output of $h^*$. Given source images $a, b \in F^X$ that are operated upon by $\mathcal{O} : F^X \times F^X \to F^X$, we outline the derivation of $\Omega : G^{Y_1} \times G^{Y_2} \to F^X$, where $Y_1, Y_2 \subset Y$, which is the dual of $\mathcal{O}$ over $\mathrm{range}(T_4)$, as follows.

Step 1. Assume that $a_c \in G^{Y_1}$ and $b_c \in G^{Y_2}$ are computed as $a_c = T_4(a)$ and $b_c = T_4(b)$.

Step 2. Then $c = a \,\mathcal{O}\, b$ is computed with the dual operation $\Omega$ as

$$c = \Omega(a_c, b_c) = \bigcup_{y \in Y_1} \bigcup_{z \in Y_2} f\bigl(g^{-1}[a_c(y)], h^*(y)\bigr)\big|_{h^*(y) \cap h^*(z)} \;\mathcal{O}\; f\bigl(g^{-1}[b_c(z)], h^*(z)\bigr)\big|_{h^*(y) \cap h^*(z)}. \qquad (125)$$

Thus, $c = a \,\mathcal{O}\, b = a_c \,\Omega\, b_c$.
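The double union in Equation 125 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in this work: the block encoding `g` is taken to be a lossless identity on tuples, each block list plays the role of the indexing function $h^*$, and the names `encode`, `decode_block`, and `dual_pointwise` are hypothetical.

```python
from itertools import product

def g(block):            # assumed lossless block encoding F^k -> G (identity on tuples)
    return tuple(block)

def g_inv(code):         # its exact inverse G -> F^k
    return tuple(code)

def encode(image, blocks):
    """T4: replace each block D = h*(y) by the single code g(a restricted to D)."""
    return {y: g(image[x] for x in sorted(D)) for y, D in enumerate(blocks)}

def decode_block(code, D):
    """f(g^{-1}[code], h*(y)): rebuild the subimage on block D as {point: value}."""
    return dict(zip(sorted(D), g_inv(code)))

def dual_pointwise(ac, bc, blocks_a, blocks_b, op):
    """Omega(ac, bc): union over all block pairs, restricted to block intersections."""
    c, pairs = {}, 0
    for (y, Da), (z, Db) in product(enumerate(blocks_a), enumerate(blocks_b)):
        pairs += 1                       # one block intersection test per (y, z) pair
        common = Da & Db
        if common:
            sa, sb = decode_block(ac[y], Da), decode_block(bc[z], Db)
            for x in common:
                c[x] = op(sa[x], sb[x])
    return c, pairs

# Two images over X = {0..7}, partitioned differently (hypothetical data).
a = [1, 2, 3, 4, 5, 6, 7, 8]
b = [10, 20, 30, 40, 50, 60, 70, 80]
blocks_a = [frozenset({0, 1, 2, 3}), frozenset({4, 5, 6, 7})]
blocks_b = [frozenset({0, 1}), frozenset({2, 3, 4, 5}), frozenset({6, 7})]

ac, bc = encode(a, blocks_a), encode(b, blocks_b)
c, pairs = dual_pointwise(ac, bc, blocks_a, blocks_b, lambda u, v: u + v)
```

Here `c` reproduces the pointwise sum of the source images, while `pairs` counts one intersection test for every pair of blocks, which is the source of the quadratic intersection cost when the two partitions differ.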
The double union operation in Equation 125 is important implementationally, as discussed in Section 3.4.8.6. We briefly digress to consider an alternative formulation.

3.4.8.5. Technique: Analogous pointwise operations. Given $f$ and $h^*$, assume that there exists a transform $V_4 : 2^{X \times F} \to G$ that accepts the output of $f$ and produces a value in $\mathrm{range}(\mathrm{range}(T_4))$. Further assume that a tuple constructor $\mathcal{T}$ can accumulate values in $G$ to form a transformed image in $G^Y$. Under constraint of the technique shown in Section 3.4.8.4, and given $V_4$ and $\mathcal{T}$, a dual operation (reference Equation 125) can be converted to an analogue $\mathcal{O}' : G^{Y_1} \times G^{Y_2} \to G^{Y_3}$ of $\mathcal{O}$ over $\mathrm{range}(T_4)$, where $Y_1, Y_2, Y_3 \subset Y$. The analogous operation is specified as

$$c_c = \mathcal{O}'(a_c, b_c) = \mathop{\mathcal{T}}_{y \in Y_1,\, z \in Y_2} V_4\Bigl[ f\bigl(g^{-1}[a_c(y)], h^*(y)\bigr)\big|_{h^*(y) \cap h^*(z)} \;\mathcal{O}\; f\bigl(g^{-1}[b_c(z)], h^*(z)\bigr)\big|_{h^*(y) \cap h^*(z)} \Bigr], \qquad (126)$$

where $c_c$ equals or approximates $T(\mathcal{O}(a, b))$. Here, the tuple constructor is employed only once, versus the double union operations in Equation 125, to avoid constructing a tuple of tuples. If we employed double constructors $\mathcal{T}$, then the resultant image would have two levels of structure. That is, each high-level tuple would contain a list that would contain one or more lists of pixels in the transformed image $c_c$. This image would then be tuple-valued, instead of $G$-valued.

3.4.8.6. Remark. The preceding techniques yield two important observations about the behavior of dual and analogous pointwise operations over $\mathrm{range}(T_4)$, namely,

1. The double union of values under restriction of the block intersection $h^*(y) \cap h^*(z)$ (per Equation 125) implies that dual pointwise operations over subsets of $Y$ that are not identically partitioned blockwise can incur work of $O(|Y|^2)$ block intersections.

2. If the source images $a$ and $b$ are partitioned differently, then the block intersection operation can lead to fragmentation of blocks encoded in $a_c$ and $b_c$.
Such fragmentation can also occur in analogous pointwise operations, as described by the restriction in Equation 126.

In Figure 3, we portray fragmentation in the compressed image $c_c = a_c \,\mathcal{O}'\, b_c$ that would be produced by an analogous pointwise operation. Fragmentation can increase the space complexity of $c_c$ and thus decrease $\mathrm{CR}_d$. This adversely affects the compression ratio and decreases computational efficiencies that depend upon $\mathrm{CR}_d$. In a cascade of such compressive operations, it is possible that CR could become sub-unitary, thus obviating the chief goal of compressive processing, namely, reduced storage and computational costs. In order to minimize intersection overhead and block fragmentation, the images $a$ and $b$ should be identically partitioned. If identical partitioning is employed, then the simplification given in Section 3.4.8.7 holds.

Figure 3. Example of block fragmentation in pointwise compressive operations over non-identically tesselated domains (two blocks fragment to produce seven blocks): (a) representation of the $y$-th block in $a$ (solid line) and $b$ (dotted line), (b) resultant block decimation required to process the compressed images $c_c = a_c \,\mathcal{O}'\, b_c$ such that corresponding sub-blocks are amenable to pointwise combination.

3.4.8.7. Technique: Analogous pointwise operations with corresponding blocks. Given source images $a, b \in F^X$ that are operated upon by $\mathcal{O} : F^X \times F^X \to F^X$, and $T_4 : F^X \to G^Y$ (implemented in terms of a block encoding $g : F^k \to G$), we derive $\mathcal{O}' : G^{Y_1} \times G^{Y_2} \to G^{Y_3}$, where $Y_1, Y_2, Y_3 \subset Y$, an analogue of $\mathcal{O}$ over $\mathrm{range}(T_4)$, as follows.

Step 1. Assume that $a_c \in G^{Y_1}$ and $b_c \in G^{Y_2}$ denote the compressed images $a_c = T_4(a)$ and $b_c = T_4(b)$.

Step 2. Assuming that $g$ has an inverse $g^{-1} : G \to F^k$, the compressed image $c_c = T_4(a \,\mathcal{O}\, b)$ is expressed in terms of the analogous operation $\mathcal{O}'$ as

$$c_c = \mathcal{O}'(a_c, b_c) = \bigcup_{y \in Y} g\bigl( \mathcal{O}( g^{-1}[a_c(y)],\, g^{-1}[b_c(y)] ) \bigr). \qquad (127)$$

If $\mathcal{O}'$ were an $\epsilon$-near analogue of $\mathcal{O}$, then the equality in the preceding expression would be replaced by an approximation relation, and $g$ would have an approximate inverse $g^* : G \to F^k$.

3.4.8.8. Remark. Although image-template operations can be derived via composition of the global reduce and pointwise operations, such formulations may not be tractable analytically. For example, one could incur high computational cost attempting to compute an image-template operation couched in terms of Equation 126, due to the block intersection operation. Thus, we next discuss image-template operations in terms of manipulating the substitutions inherent in the transform classes 4-A and 4-B.

3.4.8.9. Observation. Let the Class 4 transform $T_4 : F^X \to G^Y$, expressed at medium granularity as $T_4 : (F^k)^Y \to G^Y$, be a homomorphism that is implemented at fine granularity in terms of a substitution $g : F^k \to G$. If the source image $a \in F^X$, and the indexing function $h^* : Y \to 2^X$, then the encoded image $a_c$ is given by

$$a_c = T_4(a) \equiv \bigcup_{y \in Y} \bigl( y,\, g(a|_{h^*(y)}) \bigr). \qquad (128)$$

Here, the equivalence relation is used, since the right member of Equation 128 denotes a set that contains the graph of the compressed image $a_c$. Let the template $t \in (F^X)^Y$, and assume that the image-template operation $\mathcal{O} : F^X \times (F^X)^Y \to F^Y$ is applied to yield $b = \mathcal{O}(a, t) = a \circledast t$. An analogue of $\mathcal{O}$ can be formulated in one of two ways:

Method 1. If $\mathcal{O}' : G^Y \times (F^X)^Y \to G^Y$, then from the definition of homomorphism, $T_4(b) = \mathcal{O}'(a_c, t)$. Thus, $a_c$ is combined with $t$. However, $\mathcal{O}'$ must implement at least the maps $g^{-1} : G \to F^k$ or $g^* : G \to F^k$ as well as $h^*$, in order to respectively convert values in $G$ to a block of $k$ values in $F$, and points in $Y$ to a subset of $X$.

Method 2. Alternatively, if $\mathcal{O}' : G^Y \times (G^Y)^Y \to G^Y$, then one may define a template $s \in (G^Y)^Y$, such that $T_4(b) = \mathcal{O}'(a_c, s) = a_c \circledast' s$. If the basis operations $\bigcirc', \gamma' : G \times G \to G$ of $\circledast'$ differ from the basis operations $\bigcirc, \gamma : F \times F \to F$ of $\circledast$, then it is possible for $\circledast'$ to differ from $\circledast$, the operation upon which $\mathcal{O}$ is based.
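For identically partitioned images, the block encoding of Equation 128 and the analogous pointwise operation of Equation 127 can be sketched as follows. This is a minimal illustration under assumptions that are not part of the preceding development: a fixed block size `K`, 4-bit source pixels, and a lossless substitution `g` that packs one block into a single integer code. All function names are hypothetical.

```python
K = 4        # assumed fixed block size (a Class 4-A style encoding)
BITS = 4     # assumed bits per source pixel, so F = {0, ..., 15}

def g(block):
    """Lossless substitution g: F^k -> G, packing K pixels into one integer."""
    code = 0
    for v in block:
        code = (code << BITS) | v
    return code

def g_inv(code):
    """Exact inverse g^{-1}: G -> F^k."""
    vals = []
    for _ in range(K):
        vals.append(code & ((1 << BITS) - 1))
        code >>= BITS
    return tuple(reversed(vals))

def h_star(y):
    """Indexing function h*: the source points covered by the y-th block."""
    return range(y * K, (y + 1) * K)

def T4(a):
    """Equation 128: the graph {(y, g(a restricted to h*(y)))} as a dict."""
    return {y: g(a[x] for x in h_star(y)) for y in range(len(a) // K)}

def analogue(op, ac, bc):
    """Equation 127: decode corresponding blocks, apply op pointwise, re-encode."""
    return {y: g(op(u, v) for u, v in zip(g_inv(ac[y]), g_inv(bc[y]))) for y in ac}

a = [3, 7, 1, 15, 0, 2, 9, 4]      # hypothetical source images over X = {0..7}
b = [5, 6, 8, 1, 12, 2, 3, 14]
ac, bc = T4(a), T4(b)

cc = analogue(max, ac, bc)                          # O'(ac, bc) with O = pointwise max
direct = T4([max(u, v) for u, v in zip(a, b)])      # T4(O(a, b))
```

Because the two partitions coincide, only one block pair is visited per point of $Y$ and no block intersections are needed; `cc` and `direct` agree because `g` is a bijection on the values used here.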
The formulation of $\mathcal{O}'$ shown in Method 1 can be readily implemented if $g^{-1}$ or $g^*$ and $h^*$ are known. In practice, templates such as $s$ may be easily specified at a high level, but may not be amenable to efficient implementation via an analogous image-template operation $\circledast'$. We briefly discuss such issues in the subsequent technique.

3.4.8.10. Technique: Special case of image-template operations. Assume that the template $s \in (G^Y)^Y$ is applied to the compressed image $a_c \in G^Y$. From the previous discussion of transform $T_4$, we know that each point $y \in Y$ corresponds to an encoding block domain $D \subset X$. Considering the usual case where $h^*$ is a bijection, we see that the image operation $\mathcal{O}(a, t) = a \circledast t$ is space-variant, such that $\mathcal{S}(t_u) = h^*(y)$, $u \in X$, $\forall y \in Y$, and the values in $a|_{h^*(y)}$ are preserved in $a_c(y)$, $\forall y \in Y$. Since $\circledast$ is implemented in terms of the associative, commutative operations $\bigcirc, \gamma : F \times F \to F$, the template $s$ is defined by its weights as

[Equation 129 is not legible in the source text.]

where $e$ denotes the identity of $\bigcirc$, and $h$ denotes a value that is the zero of $\bigcirc$ as well as the identity of $\gamma$, if $h$ exists. Since composition is preserved under homomorphism, an analogue $\circledast' : G^Y \times (G^Y)^Y \to G^Y$ of $\circledast$ over $\mathrm{range}(T)$ can be defined in terms of $\bigcirc'$ and $\gamma'$ in a manner that is symmetric to the definition of $\circledast$, i.e.,

$$(a_c \circledast' s)(v) = \mathop{\Gamma'}_{y \in \mathcal{S}(s_v)} a_c(y) \,\bigcirc'\, s_v(y), \quad v \in Y. \qquad (130)$$

From $\circledast'$ alone, we compute $\mathcal{O}'(a_c, s) = a_c \circledast' s$.

3.4.8.11. Observation. The preceding case can be highly restrictive in practice, for the following reasons. First, the block encoding kernel $g : F^k \to G$ is a bijection in the case of lossless encoding only. However, most Class 4 transforms implement inherently lossy forms of block encoding (e.g., VQ and JPEG). In such cases, $T_4$ (and, therefore, the basis function $g$) would have an approximate inverse only. Thus, $\mathcal{O}'$, stated in Equation 130, might not exist for lossy encoding, since $T_4$ might not be a homomorphism. Instead, we would treat $T_4$ as having an $\epsilon$-near approximation $T_4^*$ to its inverse.
Second, even if we have a lossless version of $T_4$, if analogues of $\bigcirc$ and $\gamma$ over $\mathrm{range}(T_4)$ are unknown or intractable, then $\circledast'$ (and, therefore, $\mathcal{O}'$) cannot be derived as shown in the preceding section. Instead, approximations to $\bigcirc'$ and $\gamma'$ must be derived, then composed to form $\circledast'$. Third, even if $T_4$ is lossless and analogues of $\bigcirc$ and $\gamma$ exist, the formulation of the compressive template $s$ may be difficult due to lossy transformation of the boundary descriptions of adjacent encoding blocks. Thus, we prefer to consider each transform in Class 4 as a special case of the image-template formulations given previously.

3.4.8.12. Remark. In practice, it is often useful to approximate image-template operations by exploiting some feature of the source image that is preserved in the compressed image. For example, if $T$ preserves boundary information in the compressed image, then such information may be employed by a compressive edge detection operation that filters out non-boundary information. We further illustrate this process in Chapter 9.

We next discuss the complexity of operations over transforms in Classes 1-4, and present a high-level model for compressive computation.

CHAPTER 4
COMPRESSIVE PROCESSING: COMPUTATIONAL COMPLEXITY AND DATA SECURITY

In Chapter 1, we stated that computational efficiency is a key premise of processing compressed data, due to the reduced number of operations that may be required by fewer input data. The realization of such efficiency assumes that the transform properties problem can be solved to yield one or more transform-regime analogues of image operations. Implementationally, we assume that such analogues exhibit computational cost less than or equal to that incurred by the corresponding image operations. However, such assumptions may not hold universally in practice.

In the previous chapter, we proposed high-level techniques for the determination of analogous operations.
In this chapter, we present theory that supports implementational analysis of compressive processing. We first discuss the complexity of image algebra operations, then derive expressions for the complexity of analogous operations over transformed imagery (Section 4.1). We next examine the complexity of computations over the range space of generalized transforms in Classes 1-4 of our transform taxonomy (Section 4.2). The resulting theory is used to describe preliminary optimization procedures for compressive computational systems. A discussion of information theory is given in Section 4.3, which leads naturally to a consideration of data security. Similar to the assertion of computational speedup due to compressive processing, we state that the data security of computational operations can be increased by processing encrypted data. Thus, Section 4.4 overviews concepts and theory of cryptography and cryptanalysis, while Section 4.5 discusses theory and feasibility of encryptive processing.

4.1. Complexity of Image Algebra Operations.

In Chapter 2, we stated that image algebra consists of pointwise, global-reduce, and image-template operations [54,55]. In the latter category are included the image-image convolutions, as well as operations between templates. In this section, we analyze the work required to compute nonrecursive image algebra operations in an implementational context, beginning with pointwise operations.

4.1.1. Complexity of Pointwise Image Operations.

4.1.1.1. Lemma. Each computation of the nonconstant function $\mathcal{O} : F^m \to F^k$ incurs minimum work of (i) one input operation, (ii) one computation, and (iii) one output operation.

Proof. The proof follows trivially from the assumption that each tuple that comprises an element of $F^m$ or $F^k$ can be encoded as one storage element whose input or output requires one pair of I/O operations (read and write).
Since the pointwise operation $\mathcal{O} : (F^X)^m \to F^X$ is induced over $X$ by the scalar function $\mathcal{O} : F^m \to F$, one can use the preceding lemma to predict the work required by $\mathcal{O}$.

4.1.1.2. Example. Denote $B = \{0, 1\}$, for purposes of brevity. A Boolean logic operation $\mathcal{O} : B \times B \to B$ over the image $a \in B^X$ induces the function $\mathcal{O} : B^X \times B^X \to B^X$, which requires $|X|$ invocations of the scalar operation. Thus, if an element in the domain of the scalar operation is encoded as a two-bit integer in $B^2$, or as a one-bit encoding of a signed integer, then $2 \cdot |X|$ I/O operations will be required. In other words, one operation is required to input an element in $B^2$, and another for output. Otherwise, the I/O overhead would increase to $3 \cdot |X|$ operations.

4.1.2. Complexity of Global Reduce Operations.

4.1.2.1. Lemma. If $\Gamma : F^X \to F$ is a global-reduce operation that is based upon the associative, commutative function $\gamma : F \times F \to F$, then one computation of $\Gamma$ requires minimum work of (i) $|X|$ input operations, (ii) $|X| - 1$ computations of $\gamma$, and (iii) one output operation.

Proof. The proof follows trivially from Lemma 4.1.1.1 and the assumption that the computational cost includes I/O overhead.

4.1.2.2. Example. Let $X$ be an $m \times n$-pixel array and source image $a \in \mathbb{R}^X$, and let the associative, commutative function $\gamma = \vee$. The corresponding global reduce function $\Gamma(a) = \vee a$ requires work of at least $|X| + 1$ I/O operations and $|X| - 1$ comparisons.

4.1.3. Complexity of Image-template Operations.

4.1.3.1. Lemma. If image $a \in F^X$ and template $t \in (F^X)^Y$, then the right product $b = a \circledast t \in F^Y$ exhibits the following minimum work requirement:

(i) $N_I$, the number of input operations, and $N_C$, the number of computations of $\bigcirc : F \times F \to F$, are bounded as

$$|Y| \le N_I, N_C \le \sum_{y \in Y} |\mathcal{S}(t_y)|; \qquad (131)$$

(ii) $N_\gamma$, the number of computations of $\gamma : F \times F \to F$, is bounded as

[Equation 132, a bound involving a sum over $y \in Y$, is not legible in the source text.]

(iii) $N_O$, the number of output operations, is bounded as

$$1 \le N_O \le |Y|. \qquad (133)$$

Proof. The proof is outlined as follows:

(i) $N_I$ follows from the definition of template support, and the fact that a minimum of $|Y|$ inputs must be examined by $t$. $N_C$ follows from $N_I$, the definition of $\circledast$, and Lemma 4.1.1.1.

(ii) $N_\gamma$ follows from $N_I$, the definition of $\circledast$, Lemma 4.1.1.1, and Example 4.1.2.2.

(iii) The lower bound on $N_O$ follows from the implicit assumption that the output image $b$ is initialized to an image $b_0$, and that it is possible for $(a \circledast t)(y) = b_0(y)$, $y \in Y$, which would obviate the necessity of output to $b(y)$. If all values in $b$ were replaced by different values, then $|Y|$ write operations would be required.

The image-template left product presents a symmetric case, and operations between templates are similarly analyzed.

4.1.4. Complexity of Operations over Transformed Data

In this section, we derive expressions for the complexity of computations over the range space of a generalized image transform $T : F^X \to G^Y$. We mirror the preceding development by discussing pointwise, global-reduce, and image-template operations.

4.1.4.1. Theorem. Given a homomorphic computational system $\mathcal{X}_H = (\{F^X, \mathcal{O}\}, \{T, \emptyset\}, \{G^Y, \mathcal{O}'\})$ with pointwise operation $\mathcal{O}$, the computation of a pointwise analogue $\mathcal{O}'$ requires work that is bounded as (i) $|Y| \le N_I \le 2 \cdot |Y|$ input operations, (ii) $1 \le N_C \le |Y|$ computations of the scalar operation $\mathcal{O}'$, and (iii) $N_O \ge |Y|$ output operations.

Proof. Statements i) through iii) follow from Lemma 4.1.1.1 and the definition of a transform-regime analogue.

4.1.4.2. Remark. It follows from the preceding theorem that the computation of pointwise operations over $\mathrm{range}(T)$ could be achieved in $O(|Y|)$ time, as stated in the introduction. However, if domain fragmentation occurs (for example, due to unequal tesselation of the inputs of $\mathcal{O}'$), then the domain of the output of $\mathcal{O}'$ may be larger than $|Y|$. That is why we state that $N_O \ge |Y|$ in Theorem 4.1.4.1(iii).
Alternatively, $Y$ may be preserved by $\mathcal{O}'$, but the output of $\mathcal{O}'$ may have multiple values per point in $Y$ that cannot be written in one I/O operation.

4.1.4.3. Theorem. Given $T : F^X \to G^Y$, let the global reduce operation $\Gamma : F^X \to F$ have an analogue $\Gamma' : G^Y \to G$ over $\mathrm{range}(T)$. If $\Gamma$ is implemented in terms of an associative, commutative function $\gamma : F \times F \to F$ that has an analogue $\gamma' : G \times G \to G$, then the computation of $\Gamma'$ requires work that is bounded as (i) $|Y|$ input operations, (ii) $N_C \ge |Y| - 1$ computations of $\gamma'$, and (iii) $N_O \ge 1$ output operations.

Proof. Statements i) through iii) follow directly from Lemma 4.1.1.1 and the givens. Note that the bound on $N_C$ is constrained by the implicit assumption that $\gamma'$ is a pointwise analogue of $\gamma$.

4.1.4.4. Remark. Consider the implications of the preceding theorems for parallel computation. Referring to Theorem 4.1.4.1, assume (a) that the analogue $\mathcal{O}'$ is a pointwise operation that requires $|Y| < |X|$ computations of the scalar operation $\mathcal{O}' : G \times G \to G$, (b) that $\mathcal{O}'$ is computed with degree of parallelism $N$, and (c) that the speed of computation is invariant to the ratio $r = \mathrm{siz}(F)/\mathrm{siz}(G)$, for purposes of convenience. In such cases, the compression ratio reduces to $\mathrm{CR} = r \cdot \mathrm{CR}_d = r \cdot |X|/|Y|$. Neglecting I/O cost and interprocessor communication overhead, it is easily verified that the speedup achievable via compressive parallel processing, as opposed to noncompressive sequential processing, is given by $\eta(\mathcal{O}', N) \propto N \cdot \mathrm{CR}_d$. In practice, the proportionality constant would likely be nonlinear in $N$ due to the usual communications and control overhead. Thus, unless otherwise stated, we assume that computational cost predominates.

Per Lemma 4.1.2.1, assume that the global reduce operation $\Gamma$ requires $|X| - 1$ invocations of its basis function $\gamma$. With sufficient parallelism and exclusive of I/O and control overhead, it is possible to compute $\Gamma$ in $\lceil \log_2 |X| \rceil \cdot \Delta t_\gamma$ time, where $\Delta t_\gamma$ denotes the computational delay incurred by $\gamma$.
Likewise, if $\Gamma'$, the analogue of $\Gamma$ over $\mathrm{range}(T)$, requires $|Y| - 1$ invocations of the function $\gamma'$, then with sufficient parallelism, it is possible to compute $\Gamma'$ in $\lceil \log_2 |Y| \rceil \cdot \Delta t_{\gamma'}$ time, exclusive of I/O and control overhead. Letting $T_{(x,y)}(n)$ denote the time complexity of operation $x$ on an architecture of $y$ processors with $n$ input data, the computational speedup associated with $\Gamma'$ (versus $\Gamma$) would be given by

$$\eta = \frac{T_{(\Gamma, |X|/2)}(|X|)}{T_{(\Gamma', |Y|/2)}(|Y|)} = \frac{\lceil \log |X| \rceil}{\lceil \log |Y| \rceil}. \qquad (134)$$

4.1.4.5. Remark. Initially, the preceding equation may not appear to yield a remarkable increase in processing efficiency. For example, rather than exhibiting the parallel efficiency of $O(\mathrm{CR}_d)$ that is characteristic of analogous pointwise operations, the analogous global reduce operations exhibit a parallel efficiency of $O(\log|X| / \log|Y|)$, which is less than $\log(\mathrm{CR}_d) = \log(|X|) - \log(|Y|)$. For example, if $|X| = 1\mathrm{M} = 2^{20}$ and $|Y| = 256\mathrm{K} = 2^{18}$, then $\mathrm{CR}_d = 4$, and $\log_2|X| / \log_2|Y| = 20/18 \approx 1.11$. In contrast, $\log_2(\mathrm{CR}_d) = 2$. However, this apparently low efficiency is compensated for by the decreased cost of processing, which is computed as $C = N \cdot T$, where $T$ denotes time complexity. In the second member of Equation 134, note that $\Gamma$ requires $|X|/2$ processors to achieve logarithmic computation time, while $|Y|/2$ processors are required for computing $\Gamma'$ under the same constraint. Thus, the total processing efficiency is given by

$$\eta_{cost} = \frac{|X| \cdot T_{(\Gamma, |X|/2)}(|X|)}{|Y| \cdot T_{(\Gamma', |Y|/2)}(|Y|)} = \mathrm{CR}_d \cdot \frac{\lceil \log |X| \rceil}{\lceil \log |Y| \rceil}. \qquad (135)$$

Thus, the analogous global reduce function can produce a computational efficiency proportional to the domain compression ratio.

We next consider computation of $\circledast'$, the analogue of the image-template operation $\circledast$ over $\mathrm{range}(T)$. For purposes of simplicity, we assume that there exists a space-invariant template $t \in (F^X)^X$ for which an analogous template $s \in (G^Y)^Y$ exists, per the discussion in Section 3.2.2 of HCS and ICS that embody image-template operations.

4.1.4.6. Theorem. Let a source image $a \in F^X$ be convolved with template $t \in (F^X)^X$ to yield the operation $\mathcal{O}(a, t) = a \circledast t$, where $\circledast$ is comprised of operations $\bigcirc, \gamma : F \times F \to F$. Assume that $T : F^X \to G^Y$ is a homomorphism, and that $a_c = T(a)$. Additionally, let template $s \in (G^Y)^Y$, and let $\mathcal{O} = \circledast$ have an analogue $\circledast'$ over $\mathrm{range}(T)$, such that $\mathcal{O}'(a_c, s) = a_c \circledast' s = T(a \circledast t)$, where $\circledast'$ is comprised of operations $\bigcirc', \gamma' : G \times G \to G$ that are the analogues of $\bigcirc$ and $\gamma$. The computation of $\mathcal{O}'$ requires work that is bounded as

(i) $|Y| \le N_I \le \sum_{y \in Y} |\mathcal{S}(s_y)|$ input operations,

(ii) $|Y| \le N_C \le \sum_{y \in Y} |\mathcal{S}(s_y)|$ computations of $\bigcirc'$ and at least $|Y| - 1$ computations of $\gamma'$ [the upper bound on the number of computations of $\gamma'$, which involves $\sum_{y \in Y} |\mathcal{S}(s_y)|$, is not fully legible in the source text], as well as

(iii) $1 \le N_O \le |Y|$ output operations.

Proof. Statements i) through iii) follow directly from Theorems 4.1.4.1 and 4.1.4.3. When $t \in (F^X)^X$ is space-invariant, the left product is identically characterized.

4.1.4.7. Remark. As we showed for pointwise and global reduce operations, an analogue to the image-template product can, in principle, be computed in $O(|Y|)$ work. However, a more realistic estimate is $O\bigl(|Y| \cdot \sum_{y \in Y} |\mathcal{S}(s_y)|\bigr)$ work. The latter estimate is generally bounded above by $O(|Y|^3)$, since the usual upper bound of $O(|Y|^2)$ associated with convolution is multiplied by a maximum of $O(|Y|)$ search operations that could be incurred in locating the point $y \in Y$ corresponding to a given source point $x \in X$. Such conditions hold especially if the values of $a_c$ encode one or more points in $X$ that cannot be predicted from $y$ using one invocation of an indexing function. In Chapter 11, we discuss this problem as it pertains to the special case of SIMD parallel compressive computation and show how certain instances of this problem can be solved implementationally via lookup tables.

4.1.4.8. Summary. We have seen that $|Y|$, the number of pixels in the encoded image, is the order of work required by compressive pointwise analogues to pointwise image operations.
When such analogues are composed to yield an analogue to the image-template product, a speedup of $O(\mathrm{CR}_d)$ can be achieved under two constraints. First, the compressed domain must not be enlarged due to fragmentation by the analogous operations. Second, the work required by an analogous operation with respect to the corresponding image operation should be a linear function of the operand size at worst, and a constant function at best. Given the fact that $|Y|$ is a key parameter of the compression ratio CR, it is reasonable to expect that the work required for computation of a given analogous operation can be predicted in terms of CR. Thus, in the following section, we express the computational speedup in terms of a limit called the critical compression ratio, above which supra-unitary computational speedup occurs.

4.2. Complexity of Compressive Computation.

In Chapter 3, we defined a compressive transform and its compression ratio. In this section, we first elaborate the efficiency $\eta$ due to computation over the range space of a transform $T$. Second, we derive the critical compression ratio $\mathrm{CR}_c$. When the compression ratio $\mathrm{CR} > \mathrm{CR}_c$, then $\eta > 1$, i.e., a computational speedup occurs with respect to noncompressive processing.

4.2.1. Basic Theory.

Per the discussion of Sections 4.1.4.4 and 4.1.4.5, we define the concept of computational speedup in a compressive homomorphic computational system, then relate the speedup to the compression ratio.

4.2.1.1. Definition. Given a homomorphic computational system $\mathcal{X}_H = (\{F^X, \mathcal{O}\}, \{T, \emptyset\}, \{G^Y, \mathcal{O}'\})$, the computational efficiency $\eta$ due to processing on an $N$-processor architecture $A$ with operation $\mathcal{O}'$ (the analogue over $\mathrm{range}(T)$ of an image operation $\mathcal{O}$) is given in general form as

$$\eta_{A,N}(\mathcal{X}_H) = \frac{W_{A(\mathcal{O},N)}(|X|)}{W_{A(\mathcal{O}',N)}(|Y|)}. \qquad (136)$$

For purposes of simplicity, we generally assume sequential processing unless otherwise stated and omit the subscripts of $\eta$ as well as $A$ and $N$, which can be inferred from context.
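The efficiency ratio of Definition 4.2.1.1 can be evaluated numerically for the global reduce case discussed in Sections 4.1.4.4 and 4.1.4.5. The following sketch assumes idealized work models (sequential work of $n - 1$ invocations of $\gamma$, and a parallel tree reduction of depth $\lceil \log_2 n \rceil$ with unit delay), and the function names are hypothetical.

```python
from math import ceil, log2

def efficiency(work_src, work_ana, nx, ny):
    """eta = W_O(|X|) / W_O'(|Y|): source-domain work over transform-domain work."""
    return work_src(nx) / work_ana(ny)

def sequential_reduce(n):
    """Sequential global reduce: n - 1 invocations of gamma (Lemma 4.1.2.1)."""
    return n - 1

def parallel_reduce_depth(n):
    """Idealized tree reduction on n/2 processors: ceil(log2 n) steps."""
    return ceil(log2(n))

nx, ny = 2**20, 2**18          # |X| = 1M source pixels, |Y| = 256K encoded pixels
cr_d = nx / ny                 # domain compression ratio, here 4.0

seq_eff = efficiency(sequential_reduce, sequential_reduce, nx, ny)
depth_speedup = efficiency(parallel_reduce_depth, parallel_reduce_depth, nx, ny)
cost_eff = cr_d * depth_speedup    # processors-times-time cost ratio

print(f"sequential efficiency  ~ {seq_eff:.3f}")
print(f"parallel depth speedup = {depth_speedup:.3f}")   # 20/18
print(f"cost efficiency        = {cost_eff:.3f}")
```

Under these assumptions the sequential efficiency is close to $\mathrm{CR}_d$, the parallel depth speedup is only $20/18$, and the cost-based efficiency is again roughly proportional to $\mathrm{CR}_d$.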
We next present an assumption that simplifies the relationship between $\eta$ and CR.

4.2.1.2. Assumption. Let homomorphism $T : F^X \to G^Y$, image operation $\mathcal{O} : (F^X)^m \to F^X$, and let $\mathcal{O}' : (G^Y)^m \to G^Y$ denote the analogue of $\mathcal{O}$ over $\mathrm{range}(T)$.

(i) Assume that the work $W_{\mathcal{O},N}(n)$ can be expressed as a tractable, continuous function that has an inverse denoted by $W^{-1}_{\mathcal{O},N}(n)$.

(ii) When work incurred by $\mathcal{O}$ is written as $W_{\mathcal{O},N}(n)$ or is instantiated as the sequential time complexity $T_{seq(\mathcal{O},N)}(|X|) = W_{(\mathcal{O},1)}(|X|)$, we assume that the effect of $\mathrm{siz}(F)$ upon $W$ is accounted for in the formulation or expression of $W$, unless stated otherwise.

4.2.1.3. Remark. Assumption 4.2.1.2(i) is justified in practice, since continuous, invertible functions can generally be fitted to data that describe the performance of an operation on a specific architecture with a given number of inputs. It is beyond the scope of this chapter to discuss the complex implementational issues pertaining to the accuracy with which the complexity function $W$ represents the performance of a given operation. Rather, we assume for purposes of convenience that the representation is accurate, and present implementational analyses in Parts II and III.

4.2.1.4. Lemma. Let a homomorphic computational system $\mathcal{X}_H = (\{F^X, \mathcal{O}\}, \{T, \emptyset\}, \{G^Y, \mathcal{O}'\})$ and let the ratio

$$r_{(A,N)}(\mathcal{O}, \mathcal{O}', n) = \frac{W_{\mathcal{O}'}(n)}{W_{\mathcal{O}}(n)}. \qquad (137)$$

Given the simplistic assumption that $W_{\mathcal{O}}(n) \propto W_{\mathcal{O}'}(n) \propto n$ for all $n \le |X|$, the efficiency due to processing over the range space of a compressive transform $T$ is given by

$$\eta(\mathcal{X}_H) = \frac{W_{\mathcal{O},N}(\mathrm{CR}_d \cdot |Y|)}{r(n) \cdot W_{\mathcal{O},N}(|Y|)}. \qquad (138)$$

Proof. From the definition of computational speedup and the definition of an HCS, we have the more general expression that

[Equation 139 is not legible in the source text.]

If $W_{\mathcal{O}}(n) \propto W_{\mathcal{O}'}(n) \propto n$, then we can (a) separate the effects of domain cardinality and value set space requirement, (b) substitute $|X| = |Y| \cdot \mathrm{CR}_d$, and (c) introduce $r$ into the denominator, thus obtaining Equation 138.

4.2.1.5. Remark. The assumption of linearity between $W_{\mathcal{O}}$ and $W_{\mathcal{O}'}$ is not fanciful, but holds frequently in practice. In particular, consider the case when the range spaces of $T$'s input and output conform to the relationship $\mathrm{siz}(F) \propto \mathrm{siz}(G)$, and observe that $W$ is usually proportional to $\mathrm{siz}(F)$ but not to $|F|$.

Given the foregoing relationship between $\eta$ and $\mathrm{CR}_d$, we can derive the critical compression ratio $\mathrm{CR}_c$. For purposes of simplicity, we continue to denote $W_{\mathcal{O}}(n) = W_{A(\mathcal{O},N)}(n)$ and denote $r(n) = r_{(A,N)}(\mathcal{O}, \mathcal{O}', n)$ as well as $\eta = \eta(\mathcal{X}_H)$.

4.2.1.6. Definition. Given a homomorphic computational system $\mathcal{X}_H$, if the compression ratio $\mathrm{CR}(T)$ is greater than the critical compression ratio $\mathrm{CR}_c$, then the efficiency $\eta(\mathcal{X}_H) > 1$.

4.2.1.7. Theorem. If a homomorphic computational system $\mathcal{X}_H = (\{F^X, \mathcal{O}\}, \{T, \emptyset\}, \{G^Y, \mathcal{O}'\})$ and $W_{\mathcal{O}}(n) \propto W_{\mathcal{O}'}(n) \propto n$, then the critical compression ratio is given by

$$\mathrm{CR}_c = \frac{\mathrm{CR}_r}{|Y|} \cdot W_{\mathcal{O}}^{-1}\bigl( r(n) \cdot W_{\mathcal{O}}(|Y|) \bigr). \qquad (140)$$

Proof. Per Assumption 4.2.1.2, the complexity function $W_{\mathcal{O}}$ has an inverse $W_{\mathcal{O}}^{-1}$ that accepts a measure of time and returns the number of elements upon which $\mathcal{O}$ operates in the given time. Since $W_{\mathcal{O}}(n) \propto W_{\mathcal{O}'}(n) \propto n$, the effects of domain cardinality and siz of the value set are separable. Regrouping the terms of Equation 138, and applying the definitions of $\mathrm{CR}_d$ and $\mathrm{CR}_r$, we apply $W_{\mathcal{O}}^{-1}$ to obtain the derivation

$$\begin{aligned}
\eta \cdot r(n) \cdot W_{\mathcal{O}}(|Y|) &= W_{\mathcal{O}}(\mathrm{CR} \cdot |Y| / \mathrm{CR}_r) \\
W_{\mathcal{O}}^{-1}\bigl[ \eta \cdot r(n) \cdot W_{\mathcal{O}}(|Y|) \bigr] &= W_{\mathcal{O}}^{-1}\bigl[ W_{\mathcal{O}}(\mathrm{CR} \cdot |Y| / \mathrm{CR}_r) \bigr] \\
\mathrm{CR} \cdot |Y| / \mathrm{CR}_r &= W_{\mathcal{O}}^{-1}\bigl[ \eta \cdot r(n) \cdot W_{\mathcal{O}}(|Y|) \bigr].
\end{aligned} \qquad (141)$$

By constraining the preceding equation such that $\eta = 1$ when $\mathrm{CR} = \mathrm{CR}_c$, we can regroup terms to obtain Equation 140. If $\mathrm{CR} > \mathrm{CR}_c$, then $\eta > 1$.

4.2.1.8. Remark. The preceding theorem is useful implementationally, since speedup and complexity as well as critical compression ratio can be predicted given $T$, $\mathcal{O}$, $\mathcal{O}'$, and knowledge of a given architecture $A$. Note that Equation 140 does not explicitly require knowledge of $\mathrm{domain}(T)$. As discussed in Sections 4.4 and 4.5, data security may thus be enhanced due to the concealment of certain properties of $\mathrm{domain}(T)$.
Inherent in the preceding statement is the tacit assumption that one would conceal from routine usage the details of how $r$, $\mathrm{CR}_d$, or $\mathrm{CR}_r$ are computed.

In the following sections, we analyze the computational complexity of each transform class in a compressive computational scenario. It is shown that the complexity that results from processing over a compressive transform's range space is typically bounded below by $O(|Y|)$ work, which implies $O(\mathrm{CR}_d)$ efficiency. Since $\mathrm{CR} \propto \mathrm{CR}_d$, the computational speedup that results from compressive processing is claimed to be $O(\mathrm{CR})$ in the typical case.

4.2.2. Complexity of Class 1 Compressive Processing

Class 1 presents a useful introductory case, since transforms in this class can be viewed as greylevel functions that may be one-to-one (pointwise bijections) or many-to-one (pointwise or neighborhood operations).

4.2.2.1. Lemma. Let $T_1 : F^X \to G^Y$ be a homomorphism and $\mathcal{O} : F^X \times F^X \to F^X$ be a pointwise image operation that has an analogue $\mathcal{O}' : G^Y \times G^Y \to G^Y$ over $\mathrm{range}(T_1)$. If $W_{\mathcal{O}}(n) \propto W_{\mathcal{O}'}(n) \propto n$, then the computational efficiency $\eta$ incurred by the HCS $\mathcal{X}_H = (\{F^X, \mathcal{O}\}, \{T, \emptyset\}, \{G^Y, \mathcal{O}'\})$ is given by

$$\eta(\mathcal{X}_H) = \bigl( r(\mathcal{O}, \mathcal{O}', |X|) \bigr)^{-1}, \qquad (142)$$

where the inverse is multiplicative.

Proof. In Section 3.4.2, we showed that $\mathrm{CR}(T_1) = 1$. By the definitions of $\eta$ and $r$, we have

$$\eta = \frac{W_{\mathcal{O}}(|X|)}{W_{\mathcal{O}'}(|X|)} = \frac{1}{r(\mathcal{O}, \mathcal{O}', |X|)}, \qquad (143)$$

and the lemma holds.

As a result, pointwise computation over $\mathrm{range}(T_1)$ derives its efficiency from the choice of $\mathcal{O}'$ versus $\mathcal{O}$. This effect is not due to data compression, which is nonexistent in Class 1 transforms. The following example illustrates this concept.

4.2.2.2. Example. If the Fourier transform $T_1 = \mathcal{F} : \mathbb{R}^X \to \mathbb{C}^X$, the transform-regime analogue of convolution $\mathcal{O} = \oplus$ of images or templates having circulant domains is given by $\mathcal{O}' = *$, which denotes Hadamard multiplication. Now, let $X$ be an $m \times n$-pixel array. Note that $\mathcal{O}$ requires work $W_{\mathcal{O}}(m, n) = O(m^2 n^2)$ multiplications, while the work associated with Hadamard multiplication is given by $W_{\mathcal{O}'}(m, n) = O(mn)$ multiplications.
Neglecting the overhead due to Fourier transformation, the resultant efficiency is computed as

$$\eta = \frac{W_O}{W_{O'}} = \frac{O(m^2 n^2)}{O(mn)} = O(mn). \qquad (144)$$

Thus, the effect $\eta > 1$ is due to the performance ratio r(O, O', |X|), as predicted by the preceding lemma. However, the analysis of efficiency is more complicated when the space requirement is introduced in addition to a computational speedup, as shown in the following example.

4.2.2.3. Example. Suppose that we have the computational system $X = (\{R^X, \oplus\}, \{\mathcal{F}, \circ\}, \{C^X, *\})$, as in the previous example. Given $a \in R^X$ with circulant $X \subset R^n$, a spatially-invariant template $t \in (R^X)^X$, and |X|-fold parallelism, an image domain convolution $b = a \oplus t$ requires $|S(t_y)| - 1$ I/O and addition operations on a SIMD-parallel mesh, where $y \in X$. In contrast, given $a_c = \mathcal{F}(a)$ and $s = \mathcal{F}(t_y)$, the computation of $b_c = a_c * s$ requires one multiplication using |X| processors. The resulting efficiency is given in terms of computational cost by [...], where $\Delta t_a$ and $\Delta t_m$ denote the computational delay incurred by addition and multiplication. If X is an m×m-pixel array, then the preceding equation reduces to $\eta(X) = (|$ [...]

4.2.3.2. Lemma. Under constraint of Assumption 4.2.3.1, the computational efficiency $\eta$ incurred by processing over the range space of $T_2 : F^X \to G^X$ with operation $O' : (G^X)^m \to G^X$, as opposed to processing over domain($T_2$) with operation $O : (F^X)^m \to F^X$, is given by [...] (146)

Proof. We previously showed that CR($T_2$) = siz(F)/siz(G). By the definitions of $\eta$ and r, we have

$$\eta = \frac{W_O(|X|)}{r(O, O', |X|) \cdot W_O(|X|)} \cdot \frac{W_O(siz(F))}{W_O(siz(G))}. \qquad (147)$$

From Assumption 4.2.3.1, we have $W_O(n) \propto n$. Thus, $\eta \propto siz(F)/siz(G)$, and the lemma holds.

As a result, computation over range($T_2$) derives its efficiency from the choice of the encoding alphabet G, as well as from the work required by O' versus O. This implies the possibility of achieving a computational efficiency $\eta > r$ that would be obtained via shortening the word length.
The following example is illustrative.

4.2.3.3. Example. Assume that $T_2 : F^X \to G^X$ is a word truncation transform, expressed at medium granularity as $T_2 : (Z_8)^X \to (Z_4)^X$. Let $a, b \in (Z_8)^X$ and $a_c, b_c \in (Z_4)^X$, such that $a_c = T_2(a) = \lfloor a/2 \rfloor$. Let $O = +_8$ denote addition modulo eight. Since $T_2(O(a,b)) = O'(a_c, b_c)$, we have that $O'(a_c, b_c) = \lfloor \tfrac{1}{2}(a +_8 b) \rfloor$. Thus, $O' = +_4$, addition modulo four. From Equation 147, if W is linear in the number of input bits, with the same proportionality constant for O and O' (for purposes of convenience), then $\eta = 1.5$, since $\eta \propto siz(F)/siz(G)$, where siz(F) = log(8) = 3 bits and siz(G) = log(4) = 2 bits.

4.2.4. Class 3 Transforms.

The complexity associated with transformations in Class 3 depends primarily upon the domain size of the input and output images. Per the definition of Class 3 given in Section 3.4, we assume that the wordlength of image pixel values, denoted by the siz function, remains unchanged.

4.2.4.1. Lemma. Under constraint of Assumption 4.2.3.1, the computational efficiency $\eta$ incurred by processing over the range space of the Class 3 transform $T_3 : F^X \to F^Y$ with operation $O' : (F^Y)^m \to F^Y$, as opposed to processing over domain($T_3$) with operation $O : (F^X)^m \to F^X$, is given by

$$\eta = \frac{W_O(|Y| \cdot CR)}{W_{O'}(|Y|)}. \qquad (148)$$

Proof. The proof follows trivially from the definition of $\eta$ and the fact that CR($T_3$) = |X|/|Y|.

4.2.4.2. Remark. Since the value set of domain($T_3$) is preserved in range($T_3$), it may be feasible to set the analogue O' = O. In such cases, the performance ratio r(O, O') is unitary, and $\eta = CR$. It is easily verified that if $W_O$ is linear, then $\eta > CR$ is possible. Additionally, the preceding statement holds if $W_O$ and $W_{O'}$ are linear and sublinear in their input size, respectively.

The preceding complexity analyses of Classes 1-3 intuitively suggest that if both the value set and the point set of a source image were compressed, significant speedup could be realized.
In practice, the speedup could exceed that achieved by point-set or value-set compression alone. Furthermore, under constraint of the preceding remark, $\eta > CR$ could be possible. Such is the case with certain Class 4 transforms, as we show in the next section.

4.2.5. Class 4 Transforms.

We begin by assuming a decomposition of image operations and analogues over range($T_4$) in terms of a greyscale function g and a spatial transform f. Then, we show how knowledge about the complexity of f and g can be combined to yield a generalized complexity function for Class 4 transforms.

4.2.5.1. Assumption. Recall (from Section 3.3) the Class 4 transforms $T_4 : F^X \to G^Y$. Assume the special case where a unary image operation $O : F^X \to F^X$ is comprised of a greyscale function $g : F \to F$ and a spatial transform $f : X \to X$, such that $O = g \circ f$. That is, if $a \in F^X$, then $O(a) = g(a) \circ f$.

Although the preceding decomposition provides a convenient vehicle for explaining methods of optimizing computational cost, the decomposition itself may not exhibit optimal complexity. Thus, further optimization of the functions f and g may be required to achieve optimal performance.

4.2.5.2. Observation. Under constraint of Assumption 4.2.3.1, since composition is preserved under homomorphism, if $T_4$ is a homomorphism and $O = g \circ f$, then it is possible that O', the analogue of O over range($T_4$), could be decomposed as $O' = g' \circ f'$. Here, $g' : G \to G$ and $f' : Y \to Y$ denote the analogues of g and f over range($T_4$), assuming that such analogues exist.

Given the preceding decomposition, we state the complexity of operations over range($T_4$) in the following theorem.

4.2.5.3. Theorem. Under constraint of Assumption 4.2.3.1, the computational speedup of a Class 4 transform $T_4 : F^X \to G^Y$ is given by

$$\eta = \frac{W_g(siz(F))}{W_{g'}(siz(G))} \cdot \frac{W_f(|X|)}{W_{f'}(|Y|)}. \qquad (149)$$

Proof.
The proof follows directly from Lemma 4.2.4.1 and the given assumptions, since

$$\frac{W_g(siz(F))}{W_{g'}(siz(G))} \cdot \frac{W_f(|X|)}{W_{f'}(|Y|)} \propto \frac{siz(F)}{siz(G)} \cdot \frac{|X|}{|Y|} = CR_r \cdot CR_d. \qquad (150)$$

Thus, the theorem holds.

Under the assumption that O and O' can be decomposed as stated, Equation 149 describes the general case of image transforms in Classes 1-4. However, the assumed decompositions may not exist, or may be intractable. Therefore, we provide proof when we employ the preceding theorem. A related comment pertaining to data security follows.

4.2.5.4. Remark. Assume that $q = CR_d$ and $p = CR_r$ are specific to a secure transform T, and would be concealed in practice. Further assume that the term data security means disclosing in range(T) as few of the properties of domain(T) as possible under theoretical or implementational constraints. For purposes of data security, we can write the numerator of Equation 149 as $W_g(p \cdot siz(G)) \cdot W_f(q \cdot |Y|)$. Since Class 4 can be thought of as generalizing Classes 1-3, calculation of the computational speedup for any transform T in our taxonomy does not require knowledge of the properties of domain(T), such as |F| or |X|. As discussed in Section 4.4, the disclosure of such properties may be crucial to data security.

We next present a brief discussion of salient issues pertaining to the implementational feasibility of computation over compressed data.

4.3. Feasibility of Compressive Computation.

We begin our analysis of compressive processing in an implementational scenario by considering a simple commutativity diagram that describes an isomorphic computational system with one transform and one unary operation (Section 4.3.1). In Section 4.3.2, we progress to an analysis of multiple image formats, and Section 4.3.3 expands our computational scenario to multiple operations and multiple images. Bounds upon the performance of such systems are given, together with a discussion of how such systems could be useful in practice.

4.3.1. Single Encoding Transform.
Assume that the primary goal of compressive processing is computational speedup. Now, reference the commutativity diagram of Figure 1, which schematically depicts an isomorphic computational system $X = (\{S, O\}, \{T, \circ\}, \{S', O'\}, T^{-1})$ with unary operations O and O'.

4.3.1.1. Assumption. Let the transform $T : F^X \to G^Y$ have an inverse $T^{-1} : G^Y \to F^X$. Let the unary image operation $O : F^X \to F^X$ have an analogue $O' : G^Y \to G^Y$, and let X as well as the images $a$, $a_c$, $b$, and $b_c$ be as defined previously. Restricting the arity of O to m = 1 for purposes of simplicity, we further assume that $T^{-1}(O'(T(a))) = O(a)$.

Thus, we have the following lemma that describes the complexity inherent in the system portrayed in Figure 1.

4.3.1.2. Lemma. Under constraint of Assumption 4.3.1.1, given work $W_O(|X|)$, the operation $T^{-1}(O'(T(a)))$ exhibits an efficiency relative to the equivalent operation $O(a)$ if and only if the following condition is satisfied:

$$W_O(|X|) > W_T(|X|) + W_{O'}(|Y|) + W_{T^{-1}}(|Y|). \qquad (151)$$

Proof. The proof is obvious if one relabels the edges of the graph shown in Figure 1 with the corresponding complexity functions.

We next discuss the more involved case of multiple encoding formats.

4.3.2. Processing with Multiple Encoding Transforms.

Implementationally, one may have a computational system where the group (S, O) is replaced by a heterogeneous algebra. In such an algebra, we assume that the various operand sets would each be expressed in terms of different compressive formats.

4.3.2.1. Assumptions. Let there exist $\zeta$ transforms denoted by $T_i : F^X \to (G_i)^{Y_i}$, $i \in Z_\zeta$, where the sets $G_i$ or $Y_i$ may or may not be disjoint. Let each $T_i$ have an inverse $T_i^{-1} : (G_i)^{Y_i} \to F^X$, $i \in Z_\zeta$. For a given image operation $O : F^X \to F^X$, let a known analogue $O'_i$ compute over range($T_i$), where k is an indexing function into $Z_\zeta$. If we restrict the arity of each operation to m = 1, for purposes of simplicity, such a system can be described by Figure 4.

Figure 4.
Commutativity diagram of a compressive computational system with multiple encoding transforms $T_i$ and multiple analogues $O'_i$ of image operation O.

4.3.2.2. Lemma. Under constraint of Assumption 4.3.2.1, given the work $W_O(|X|)$, the operation $T_i^{-1}(O'_i(T_i(a)))$, $i \in Z_\zeta$, exhibits a speedup relative to $O(a)$ if and only if the following condition is satisfied:

$$W_O(|X|) > W_{T_i}(|X|) + W_{O'_i}(|Y_i|) + W_{T_i^{-1}}(|Y_i|), \quad i \in Z_\zeta. \qquad (152)$$

Proof. The proof is obvious if one relabels the operations in Figure 4 with the corresponding complexity functions, then considers the commutativity diagram as a weighted graph. By the definition of computational speedup $\eta$, the i-th path from $a$ to $b$, via the edge labelled $O'_i$, must satisfy Equation 152 for a speedup to occur.

4.3.3. Processing with Multiple Transforms and Multiple Operations.

The structure depicted in the graph of Figure 4 can be extended to include multiple image operations O, as well as the multiple compression formats depicted schematically in Figure 4. We formalize this concept as follows.

4.3.3.1. Assumption. Let an image processing algorithm be comprised of $\psi$ image operations $O_j : F^X \to F^X$, $j \in Z_\psi$. Here, unary operations are employed for purposes of brevity but without loss of generality. Let there exist $\zeta$ transforms $T_i : F^X \to (G_i)^{Y_i}$, $i \in Z_\zeta$, where the sets $G_i$ may or may not be disjoint, and likewise for $Y_i$. Let each $T_i$ have an inverse $T_i^{-1} : (G_i)^{Y_i} \to F^X$, $i \in Z_\zeta$. Further assume that, for each image operation $O_j$, there exist $\zeta$ known analogues of form $O'_{j,i} : (G_i)^{Y_i} \to (G_i)^{Y_i}$ that compute over range($T_i$), where k denotes an indexing function. Additionally, denote a transform that converts from the l-th to the m-th format by $T_{l,m}$, $l, m \in Z_\zeta$. Letting $\psi = 2$ and $\zeta = 2$ for purposes of simplicity, we obtain the computational system diagrammed schematically in Figure 5.

Figure 5.
Commutativity diagram of a compressive computational system with multiple encoding transforms $T_i$, $i \in Z_\zeta$, and multiple analogues $O'_{j,i}$, $j \in Z_\psi$, of the j-th image operation $O_j$, where $\zeta = 2$ and $\psi = 2$.

4.3.3.2. Notation. Assuming the existence of an indexing function $k : Z_\psi \to Z_\zeta$, we denote the work associated with each edge in the graph of Figure 5, as follows:

$W_{O_j}(|X|)$: the j-th image operation
$W_{T_{k(j)}}(|X|)$: the j-th encoding transform that yields the k(j)-th encoding format
$W_{O'_{j,k(j)}}(|Y_{j,k(j)}|)$: an analogue to the j-th image operation over range($T_{k(j)}$)
$W_{T_{l,m}}(|X|)$: the j-th conversion $T_{l,m}(a_l) = a_m$, $l, m \in Z_\zeta$

Such formalisms, together with the graph of Figure 5, intuitively suggest the following theorem.

4.3.3.3. Theorem. Under constraint of Assumption 4.3.3.1, the k-th sequence of transform-regime analogues, denoted by

$$\{O'_{0,k(0)},\ O'_{1,k(1)},\ \ldots,\ O'_{j,k(j)},\ \ldots,\ O'_{\psi-1,k(\psi-1)}\}, \quad j \in Z_\psi, \qquad (153)$$

exhibits a computational efficiency, relative to the corresponding sequence of image operations $S_\psi = \{O_0, O_1, \ldots, O_{\psi-1}\}$, if there exists a mapping $k : Z_\psi \to Z_\zeta$ such that the computational work incurred in traversing a path from image $a$ to image $b$ is less than the work incurred by computing the sequence $S_\psi$.

Proof. The proof follows trivially from the existence of the shortest path algorithm, from which one obtains the function k.

4.3.3.4. Remark. It is well known that the cost of computing the shortest path over a weighted digraph of N edges is $O(N^2)$. The graph depicted in Figure 5 has $\psi$ operations and $\zeta$ transforms, which yields the following edge count:

1. $\psi$ operations yield $\psi$ edges on the spine of the graph;
2. For each of $\psi$ operations and each of $\zeta$ transforms, there are three edges, namely, $T$, $O'$, and $T^{-1}$, yielding $3\psi\zeta$ edges; and
3. For $\psi - 1$ operations, there are $\zeta^2 - \zeta$ edges involved in the format conversion transforms, since conversion between like formats merely requires an identity.
Thus, the total number of edges in the graph is given by

$$\begin{aligned}
N(\psi,\zeta) &= \psi + (\psi - 1)(\zeta^2 - \zeta) + 3\psi\zeta \\
&= \psi \cdot (\zeta^2 - \zeta + 3\zeta + 1) - \zeta^2 + \zeta \qquad (154) \\
&= (\psi - 1) \cdot \zeta^2 + (2\psi + 1) \cdot \zeta + \psi \\
&= O(\psi \cdot \zeta^2).
\end{aligned}$$

As a result, application of the shortest-path algorithm to the graph of Figure 5 incurs work of $O(\psi^2 \zeta^4)$ operations. Although the number of available compression transforms is relatively small, the cost may be impractical for an algorithm which has a large number of steps $\psi$ or a system that processes over many formats $\zeta$. Note also that the preceding theorem presupposes that $|Y_j|$ is known a priori for all $j \in Z_\psi$. When such static domains are impractical, as is often the case when processing is restricted to areas of interest whose domains are data-dependent, we make the following provision for a greedy optimization procedure.

4.3.3.5. Theorem. Under constraint of Assumption 4.3.3.1, further assume that the compressed domains exhibit sizes denoted by $|Y_j|$, $j \in Z_\psi$, which are not known a priori. Then, the work $W(\psi,\zeta)$ associated with a simple path along the graph of Figure 5, which path may exhibit minimum computational cost with respect to the sequence of image operations $S_\psi = \{O_0, O_1, \ldots, O_{\psi-1}\}$, is computed via the following greedy algorithm. For purposes of simplicity, let each operation (e.g., $T$, $O'$, $T^{-1}$) denote the computational work associated with that operation (e.g., $W_T$, $W_{O'}$, $W_{T^{-1}}$).

Step 1. Compute the minimum work component of the first edge of Figure 5, corresponding to $O_0$, as

$$W_0(\psi,\zeta) = O_0(a) \wedge \bigwedge_{i \in Z_\zeta} \Big[ T_i(|X|) + O'_{0,i}(|Y_{0,i}|) + \Big( \bigwedge_{m \in Z_\zeta} T_m^{-1}(|Y_{0,i}|) \wedge \bigwedge_{l \in Z_\zeta} T_{i,l}(|Y_{0,i}|) \Big) \Big]. \qquad (155)$$

Step 2. Compute the minimum work component of each edge of Figure 5 that corresponds to $O_1$ through $O_{\psi-2}$ as

$$W_j(\psi,\zeta) = O_j(s^j) \wedge \bigwedge_{i \in Z_\zeta} \Big[ O'_{j,i}(|Y_{j,i}|) + \Big( \bigwedge_{m \in Z_\zeta} T_m^{-1}(|Y_{j,i}|) \wedge \bigwedge_{l \in Z_\zeta} T_{i,l}(|Y_{j,i}|) \Big) \Big]. \qquad (156)$$

Step 3. Compute the minimum work incurred by the edge in Figure 5 that corresponds to $O_{\psi-1}$ as

$$W_{\psi-1}(\psi,\zeta) = O_{\psi-1}(s^{\psi-1}) \wedge \bigwedge_{i \in Z_\zeta} \Big[ O'_{\psi-1,i}(|Y_{\psi-1,i}|) + \Big( \bigwedge_{m \in Z_\zeta} T_m^{-1}(|Y_{\psi-1,i}|) \wedge \bigwedge_{l \in Z_\zeta} T_{i,l}(|Y_{\psi-1,i}|) \Big) \Big]. \qquad (157)$$

Step 4.
Sum the preceding work to obtain the total work, which is given by

$$W(\psi,\zeta) = W_0(\psi,\zeta) + \sum_{j=1}^{\psi-2} W_j(\psi,\zeta) + W_{\psi-1}(\psi,\zeta). \qquad (158)$$

Proof. The proof follows directly from the construction of Figure 5 and the development of Sections 4.3.1-4.3.3.

4.3.3.6. Remark. Assume that we have appropriate indexing functions, with which we compute the indices of the operations $T$, $O'$, and $T^{-1}$ in Steps 1-3 of the preceding algorithm. For purposes of argument, assume that a computation that incurs minimum work would correspond to the shortest path on the graph of Figure 5. One can thus construct a sequence of operations that may exhibit lower computational cost than the sequence of image operations $S_\psi$. In practice, we must qualify the foregoing assertion by stating the well-known observation that greedy optimization will not necessarily yield a minimum-work sequence of compressive (or noncompressive) operations which computes $b$ (or an approximation thereof) from $a$. Additionally, the overhead incurred by the greedy optimization procedure is computed similarly to $W(\psi,\zeta)$, and is of order $\psi\zeta^2$. This estimate can be verified by an inspection of Theorem 4.3.3.5. As expected, the greedy algorithm exhibits lower cost than the shortest-path approach, by a factor of $O(\psi\zeta^2)$.

Various greedy optimization procedures have been developed that exhibit a higher probability of an optimal solution as the lookahead increases [65]. However, such greedy algorithms have as a common deficit the inability to consistently reach a global optimum. As a result, the greedy optimization outlined in Theorem 4.3.3.5, while exhibiting lower overhead than the shortest-path algorithm, will not necessarily yield the optimal computational path through the graph depicted in Figure 5. The subject of optimal computation over such networks will be investigated in detail in future research.
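The contrast between the shortest-path and greedy procedures of Section 4.3.3 can be illustrated with a small numerical sketch. The model below is a simplification and is not the graph of Figure 5: state 0 stands for the image domain, states 1 and above stand for compressed formats, `conv_cost[s][t]` is an assumed cost of moving from state s to state t (encoding, decoding, or format conversion, with zero cost for staying put), and `op_cost[j][s]` is an assumed cost of performing the j-th operation in state s. All names and numbers are hypothetical.

```python
def optimal_work(op_cost, conv_cost):
    """Least total work over all state sequences (dynamic programming)."""
    n = len(conv_cost)
    inf = float("inf")
    # best[s] = least work needed to finish the operations so far in state s
    best = [0.0] + [inf] * (n - 1)          # processing starts in the image domain
    for costs in op_cost:
        best = [
            min(best[s] + conv_cost[s][t] for s in range(n)) + costs[t]
            for t in range(n)
        ]
    # the result must be returned to the image domain (state 0)
    return min(best[s] + conv_cost[s][0] for s in range(n))


def greedy_work(op_cost, conv_cost):
    """Total work when each step takes the cheapest immediate move."""
    n = len(conv_cost)
    state, total = 0, 0.0
    for costs in op_cost:
        nxt = min(range(n), key=lambda t: conv_cost[state][t] + costs[t])
        total += conv_cost[state][nxt] + costs[nxt]
        state = nxt
    return total + conv_cost[state][0]


# Hypothetical costs: one compressed format, five operations.
conv_demo = [[0, 5], [5, 0]]                # encode and decode each cost 5
op_demo = [[4, 1]] * 5                      # each operation: 4 uncompressed, 1 compressed

print(optimal_work(op_demo, conv_demo))     # 15.0
print(greedy_work(op_demo, conv_demo))      # 20.0
```

With these assumed costs the greedy procedure never pays the encoding cost and therefore stays in the image domain (total 20), whereas the exhaustive procedure encodes once, performs all five operations over the compressed format, and decodes (total 15). This is one instance of the qualification made in Remark 4.3.3.6.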
We next consider the issue of data security, which is another potential advantage of compressive processing, and is key to the design of computational systems that process encrypted data.

4.4. Data Security.

This section reviews data security theory, which we express in terms of properties of mappings. We show how each transform class in the taxonomy of Section 3.3 conceals or discloses properties of the transform domain. We then apply our data security theory, together with theory developed in Chapter 3 and in the preceding part of Chapter 4, to describe computation over encrypted data. Throughout this chapter, we assume the following implementational scenario.

First, given an encryptive transform T, we define the data security of T at a high level in terms of information about a plaintext $a$ that is obtainable from the corresponding ciphertext $a_c = T(a)$. If no properties of $a$ can be inferred from $a_c$ by a given cryptanalytic method, then we say that T is secure with respect to that method. Additionally, if one cannot infer operationally salient properties of $a$ from $a_c$, then we say that T is operationally secure. A key problem, however, is determining which properties are salient with respect to a given operational scenario.

Second, we assume that an image processing system is used by non-privileged users and is maintained by privileged users (e.g., programmers and systems administrators). The non-privileged users can only specify image processing algorithms in the image domain. For example, one would construct an algorithm from operations such as the image sum, Sobel edge operator, etc., given a library of permissible operations. If so required, non-privileged users can be prevented from viewing imagery in either encrypted or decrypted form.
Thus, non-privileged users think they are working with image operations, when they actually may be computing over encrypted data (i.e., with operations that are transform-regime analogues of the image operations). Additionally, non-privileged users can be prevented from becoming security threats by not allowing them to see the imagery they are processing. For the purposes of this development, such constraints are assumed to exist where appropriate. For purposes of simplicity, we further assume that such constraints cannot be altered by any class of users. Therefore, we assume that the non-privileged user's only salient capability is the specification and invocation of algorithms constructed from a list of permissible operations.

In contrast, we assume that the privileged users are given a library of plaintext operations, as well as a library containing the corresponding analogous operations over ciphertext. Using such operations, the privileged users specify and construct the building blocks to which the non-privileged users are allowed access. Now, the privileged users may not know the internal workings of either the encryption transform(s) or the ciphertext operations, but may use such mappings arbitrarily. Thus, the system is vulnerable if the privileged users are adversaries and can employ one or more encryptive analogues O' to exercise the output of a given encryption T and thus attack T. This technique will be discussed subsequently.

Third, we assume that the privileged users may apply T and O' to various input images, which would enable them to guess T or O' by analysis of the resultant output. In particular, we assume that if privileged users are adversaries, then T may be guessed at via cryptanalytic techniques.

Fourth, we assume that the implementations of all transforms T and ciphertext operations O' are maintained at levels of medium and fine granularity (i.e., data structures and functionality, respectively) by trustworthy personnel.
Thus, only privileged users are admitted to the role of adversary.

Our implementational goal is encryptive processing with the highest level of security possible under the foregoing constraints. That is, no user should be able to determine the encryption T by using one or more analogous operations O', or by analysis of transform inputs and outputs. In this somewhat artificial scenario, we do not consider specific implementational problems such as key distribution or tapping of data distribution lines. Such implementational issues are outside the core issue of the security of T and O'. Instead, since this study is primarily theoretical, we prefer to concentrate upon the security of mappings inherent in T and O'. We thus direct the reader to Reference 60 for an excellent implementational discussion of computer system security.

In this study, the key questions pertaining to data security are as follows:

1. How secure is a given encryption transform T, i.e., how easily can T be guessed, and under what conditions?
2. How secure are the operations O' that process over range(T), i.e., what properties of $a$ are disclosed in $O'(T(a))$?

In order to answer such questions, we develop several concepts. First, we assume that an encryption transform or encryption conceals properties of the plaintext input in the ciphertext output. For the purposes of this study, an encryption is deemed insecure if salient properties of the transform's input (plaintext) are disclosed in the transform's output (ciphertext). We generalize the preceding convention theoretically in terms of a transform's domain and range spaces.

Accordingly, in this section we develop theory that describes the properties of a mapping, beginning with cardinality. Then, we describe properties of sets and mappings that pertain to analyses presented in Chapters 6-9. We close with a brief discussion of the propagation of transform domain properties into the transform's range space.

4.4.1. Cardinality of Mappings.
The success of a cryptanalytic attack depends upon a knowledge of the work involved in attacking a discrete mapping. In the case of finite discrete images, a knowledge of the image domain and range spaces is key to furnishing useful clues to the plaintext contents or the keyspace. At each step of the cryptanalytic process, we thus try to guess salient properties of the mappings that may be involved in the encryption. The first step in this guessing process is the determination of the domain cardinality (number of pixels or symbols) and range cardinality (length of the alphabet), where possible. We present relevant theory, as follows.

4.4.1.1. Definition. The Cartesian product $S^m = \prod_{i=1}^{m} S$ exhibits cardinality $|S^m| = |S|^m$.

We state the following definition, in order to make plain to the reader when cardinality is defined.

4.4.1.2. Definition. If M is a mapping, then one of the following conditions holds:

(i) If domain(M) and range(M) are finite, then M and |M| are finite.
(ii) If domain(M) or range(M) are countably infinite, then M and |M| are infinite but countable.
(iii) If domain(M) or range(M) are uncountable, then M and |M| are undefined.

4.4.1.3. Definition. The cardinality of a finite mapping M is given by $|M| = |range(M)|^{|domain(M)|}$.

4.4.1.5. Example. If a finite mapping M : [...], then

$$|M| = |range(M)|^{|domain(M)|}. \qquad (159)$$

It follows that [...].

4.4.2. Circulant Sets and Mappings.

Circulant sets and mappings are important in cryptography, for example, in modulo arithmetic. We therefore present the following definitions.

4.4.2.1. Definition. Let S be an indexed set, where $f : S \to A$ is an indexing function, and the index set A is totally ordered, e.g., $A \subseteq N$. We customarily say that an element $\mathbf{g} \in S$ has

(i) an immediate predecessor $\mathbf{f} \in S$ iff $f(\mathbf{f})$ precedes $f(\mathbf{g})$ in the order inherent in A, and
(ii) an immediate successor $\mathbf{h} \in S$ iff $f(\mathbf{g})$ precedes $f(\mathbf{h})$ in the order inherent in A.

4.4.2.2. Definition.
A circulant set is an indexed set S whose index set A satisfies the following conditions:

(i) If $A = Z_{|S|}$ with n = |A|, then $f(s_{n-1})$ is the immediate predecessor of $f(s_0)$ in the ordered set S, and
(ii) If $A = Z_{|S|}$ with n = |A|, then $f(s_0)$ is the immediate successor of $f(s_{n-1})$ in the ordered set S.

Thus, in a circulant set, every element is a predecessor of every other element. This concept has important implications for modular arithmetic, as shown in Chapters 6-9. We next discuss two important properties of circulant mappings.

4.4.2.3. Definition. A circulant mapping has range and domain spaces that are circulant sets.

4.4.2.4. Definition. A circulant function is a circulant mapping of arity m-to-1.

In Chapter 3, we discussed the preservation of certain properties of sets. In the following definition and example, we elaborate the concept of properties in order to facilitate the subsequent definition of oracles.

4.4.2.5. Definition. For purposes of data security, a property of a mapping is defined by the statement $\alpha\ \beta$, where $\alpha$ denotes an action or state of being, and $\beta$ denotes a property of a set that constitutes the mapping.

4.4.2.6. Example. If a transform T is a homomorphism, then we express the property of homomorphism as $\alpha$ = is and $\beta$ = homomorphism. Additionally, given the transform class $T_1$, we say that $T_1$ preserves the size of an input image, such that |domain(choice(range($T_1$)))| = |domain(choice(domain($T_1$)))|. Here, $\alpha$ = preserves and $\beta$ = size(domain($T_1$)) = |domain(choice(domain($T_1$)))|.

4.4.3. Graphs and Matrix Representations.

Properties of a given mapping M can be described in terms of the graph of M, together with the adjacency matrix of the graph. For example, such representations are especially useful for cryptanalysis of a mapping M that employs synthetic arithmetic, where an adversary would attempt to discover properties such as "is associative" or "is commutative" by scrutinizing some representation of the inputs and outputs of M.

4.4.3.1. 
Definition. The graph G of a mapping M is given by

$$G(M) = \{(x, y) : x \in domain(M),\ y = M(x)\}. \qquad (160)$$

For those readers who are unfamiliar with the concept of graphs, we note that the graph G(M) = (V, E) such that the vertex set V of G is given, per the preceding definition, by $V = domain(M) \cup S$, where $S \subseteq range(M)$ and the edge set $E \subseteq V \times V$.

4.4.3.2. Definition. Let G = (V, E) denote the graph of mapping M. Let the isomorphism $f : V \to A$ index the elements of G such that $G' \cong (V', E') = (A, A \times A)$, where ($\cong$) signifies isomorphism and A denotes an arbitrary indexing set. Then $a \in \{0,1\}^{A \times A}$, the reachability image of G, is defined as [...] (161)

We can similarly represent the mapping M as a weighted digraph G(M), from which we derive a matrix representation of M, as follows.

4.4.3.3. Definition. Let G = (V, E) be the graph of g, a binary function over a subset of V, and let the indexing function $f : V \to A$. Denote $\hat{f} : V \times V \to A \times A$, such that $\hat{f}(e) = (f(p_1(e)), f(p_2(e)))$, $e \in E \subseteq V \times V$. Then, the matrix representation of g, denoted by $\hat{g} \in A^{A \times A}$, is given by

$$\hat{g} = \{(x, g(x)) : x \in E,\ g(x) \in range(M) \subseteq V\}. \qquad (162)$$

Such matrix representations (MRs) of a function are useful constructs for testing properties such as commutativity and associativity. Given g, the discrete computation of the MR of g can facilitate the determination of numerous properties and effects that are useful in cryptanalysis. As previously mentioned, this technique is especially applicable to the analysis of operations over synthetic number systems. We give the following lemmas as examples.

4.4.3.4. Lemma. The matrix representation of a commutative function over S is symmetric about its major diagonal.

Proof. Assume, for purposes of convenience, that the operation $O : S \times S \to S$ is commutative, and that the indexing function $f : S \to N$. Denote $i = f(\mathbf{f})$, $j = f(\mathbf{g})$, and $k = f(\mathbf{h})$, and let the matrix representation of O be given by $a(i,j) = f(\mathbf{f} \mathbin{O} \mathbf{g})$, $\forall\, \mathbf{f}, \mathbf{g} \in S$.
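We then write

$$\mathbf{f} \mathbin{O} \mathbf{g} = \mathbf{g} \mathbin{O} \mathbf{f} \ \Rightarrow\ f(\mathbf{f} \mathbin{O} \mathbf{g}) = f(\mathbf{g} \mathbin{O} \mathbf{f}) \ \Rightarrow\ a(i,j) = a(j,i), \quad \forall\, \mathbf{f}, \mathbf{g} \in S. \qquad (163)$$

Thus, the matrix representation a is symmetric about its major diagonal.

4.4.3.5. Lemma. Let the operation $O : S \times S \to S$ be associative, and let the indexing function $f : S \to A \subseteq N$. The matrix representation a of O satisfies

$$a(a(i,j), k) = a(i, a(j,k)) \quad \forall\, i, j, k \in f(S). \qquad (164)$$

Proof. Assume that (a) the operation $O : S \times S \to S$ is associative, (b) the indexing function $f : S \to A \subseteq N$ is a bijection, and (c) that $a \in A^{A \times A}$. Further assume that $a(a(i,j), k) = a(i, a(j,k))$ $\forall\, i, j, k \in f(S)$. Denote $i = f(\mathbf{f})$, $j = f(\mathbf{g})$, and $k = f(\mathbf{h})$. By applying $f^{-1}(a(i,j)) = O(f^{-1}(i), f^{-1}(j))$ to Equation 164, we obtain $f^{-1}(a(a(i,j),k)) = f^{-1}(a(i,a(j,k)))$, which implies that

$$\begin{aligned}
(\mathbf{f} \mathbin{O} \mathbf{g}) \mathbin{O} \mathbf{h} &= O\big(O(f^{-1}(i), f^{-1}(j)),\ f^{-1}(k)\big) \\
&= O\big(f^{-1}(a(i,j)),\ f^{-1}(k)\big) \\
&= O\big(f^{-1}(i),\ f^{-1}(a(j,k))\big) \qquad (165) \\
&= O\big(f^{-1}(i),\ O(f^{-1}(j), f^{-1}(k))\big) \\
&= O(\mathbf{f}, O(\mathbf{g}, \mathbf{h})) = \mathbf{f} \mathbin{O} (\mathbf{g} \mathbin{O} \mathbf{h}).
\end{aligned}$$

Thus, the function O is associative and the lemma holds.

4.4.3.6. Lemma. Let the operations $O, \gamma : S \times S \to S$, where O distributes over $\gamma$, and let the indexing function $f : S \to A \subseteq N$. The matrix representations of O and $\gamma$, denoted by $a \in A^{A \times A}$ and $b \in A^{A \times A}$, respectively, satisfy

$$a(i, b(j,k)) = b(a(i,j), a(i,k)) \quad \forall\, i, j, k \in f(S). \qquad (166)$$

Proof. Assuming that the givens hold, the proof consists of transforming both sides of the distributivity law $\mathbf{f} \mathbin{O} (\mathbf{g} \mathbin{\gamma} \mathbf{h}) = (\mathbf{f} \mathbin{O} \mathbf{g}) \mathbin{\gamma} (\mathbf{f} \mathbin{O} \mathbf{h})$, $\forall\, \mathbf{f}, \mathbf{g}, \mathbf{h} \in S$, by applying $f^{-1}$ in a manner symmetric to that of Lemma 4.4.3.5.

4.4.3.7. Lemma. Let the operations $O, \gamma : S \times S \to S$ be heteroassociative, and let the indexing function $f : S \to A \subseteq N$. The matrix representations of O and $\gamma$, denoted by $a \in A^{A \times A}$ and $b \in A^{A \times A}$, respectively, satisfy

$$b(a(i,j), k) = a(i, b(j,k)) \quad \forall\, i, j, k \in f(S). \qquad (167)$$

Proof. Assuming that the givens hold, the proof consists of transforming both sides of the definition of heteroassociativity $\mathbf{f} \mathbin{O} (\mathbf{g} \mathbin{\gamma} \mathbf{h}) = (\mathbf{f} \mathbin{O} \mathbf{g}) \mathbin{\gamma} \mathbf{h}$, $\forall\, \mathbf{f}, \mathbf{g}, \mathbf{h} \in S$, by applying $f^{-1}$ in a manner symmetric to that of Lemma 4.4.3.5.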
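The matrix tests stated in Lemmas 4.4.3.4 through 4.4.3.6 lend themselves to direct computation. The following is a minimal sketch, assuming the set S is already indexed as 0, 1, ..., n-1 (so the indexing function is the identity) and using arithmetic modulo eight as illustrative operations; the function names are hypothetical.

```python
from itertools import product


def matrix_representation(op, n):
    """a[i][j] = index of (element i) op (element j) for elements 0..n-1."""
    return [[op(i, j) for j in range(n)] for i in range(n)]


def is_commutative(a):
    """Lemma 4.4.3.4: the matrix is symmetric about its major diagonal."""
    n = len(a)
    return all(a[i][j] == a[j][i] for i, j in product(range(n), repeat=2))


def is_associative(a):
    """Equation 164: a(a(i,j),k) == a(i,a(j,k)) for all i, j, k."""
    n = len(a)
    return all(
        a[a[i][j]][k] == a[i][a[j][k]]
        for i, j, k in product(range(n), repeat=3)
    )


def distributes_over(a, b):
    """Equation 166: a(i,b(j,k)) == b(a(i,j),a(i,k)) for all i, j, k."""
    n = len(a)
    return all(
        a[i][b[j][k]] == b[a[i][j]][a[i][k]]
        for i, j, k in product(range(n), repeat=3)
    )


add8 = matrix_representation(lambda i, j: (i + j) % 8, 8)
mul8 = matrix_representation(lambda i, j: (i * j) % 8, 8)
sub8 = matrix_representation(lambda i, j: (i - j) % 8, 8)

print(is_commutative(add8), is_associative(add8))   # True True
print(is_commutative(sub8), is_associative(sub8))   # False False
print(distributes_over(mul8, add8))                 # True
print(distributes_over(add8, mul8))                 # False
```

Each test is exhaustive over the index set, so its cost grows as $n^2$ or $n^3$. That is acceptable for small alphabets of the kind considered here, but would need sampling for large ones.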
There are certain abstract properties of mappings that are useful in cryptanalysis, such as the histogram of an image, input size, etc., which are easily computed. However, another interesting property of two mappings is their similarity. For example, assume that one has a known encryption $T_a(a,k)$, where k denotes the key, that produces a result that is similar to the encryption $T_b(a,k)$ whose internal operation is unknown. It is possible to apply $T_a$ and $T_b$ to identical inputs, and compare the outputs via methods such as correlation. If this process can be pursued to exhaustion, then one has a complete comparison of the mappings inherent in $T_a$ and $T_b$ with respect to a similarity measure such as the correlation coefficient. Otherwise, $T_a$ and $T_b$ can be characterized as partial functions and can be correlated to obtain a partial similarity measure. Given such descriptions, one can construct the graphs of $T_a$ and $T_b$, which can be compared using graph similarity functions to provide information on the resemblance of the unknown mapping to the known. We provide theoretical support and an example, as follows.

4.4.3.8. Definition. Let M be a mapping, and let $\xi : domain(M) \times range(M) \to R$ denote a similarity function that returns a measure of similarity between domain(M) and range(M). Let $\tau \in R$ denote a similarity threshold. Either one of the following statements holds, or both (i) and (ii) hold:

(i) M is termed a correlated mapping (or is simply called correlated) if $\xi(S, M(S)) > \tau$, $\forall\, S \in domain(M)$,
(ii) M(S) is said to correlate with S if $\xi(S, M(S)) > \tau$, for some $S \in domain(M)$, or
(iii) M is uncorrelated.

It is easily verified that the preceding definition can include comparisons of data structures such as graphs. However, for purposes of simplicity, we couch our expression of the similarity measure between mappings in terms of the familiar operation of correlation.
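The comparison described above can be sketched with the correlation coefficient as the similarity function. The example below is illustrative only: it assumes images are flat lists of integers, takes the Pearson coefficient as $\xi$, and uses hypothetical affine substitutions modulo 16 in place of real encryptions.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(vx * vy)


def mapping_similarity(t_a, t_b, inputs):
    """Correlate the outputs of two mappings applied to identical inputs."""
    out_a = [v for image in inputs for v in t_a(image)]
    out_b = [v for image in inputs for v in t_b(image)]
    return pearson(out_a, out_b)


def is_correlated(m, inputs, tau):
    """Condition (i) of Definition 4.4.3.8, restricted to the supplied inputs."""
    return all(pearson(s, m(s)) > tau for s in inputs)


# Hypothetical test images and mappings.
images = [[0, 1, 2, 3], [3, 1, 0, 2]]
known = lambda img: [(5 * v + 3) % 16 for v in img]
unknown = lambda img: [(5 * v + 3) % 16 for v in img]    # happens to equal `known`
scaled = lambda img: [2 * v + 1 for v in img]
reversed_levels = lambda img: [3 - v for v in img]

print(abs(mapping_similarity(known, unknown, images) - 1.0) < 1e-9)   # True
print(is_correlated(scaled, images, tau=0.9))                          # True
print(is_correlated(reversed_levels, images, tau=0.9))                 # False
```

When the input set is not exhaustive, the value returned by `mapping_similarity` is a partial similarity measure in the sense of the preceding paragraph.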
In Parts II and III, we further discuss the concept of similarity between mappings, and next define a simple model of data security.

4.4.4. A Set-Theoretic Convention for Data Security

For purposes of discussion, assume that we have a "magic" computer program that can determine the properties of a mapping, given a set of operational constraints. We call such a program an oracle. The concept of oracles underlies cryptanalytic theory, but is rarely discussed openly, since specification of an oracle could reveal one's encryption capabilities to an adversary. In this section, we formulate a simple oracle for purposes of defining a model of data security that is based upon properties of a mapping. In this discussion, our oracles are not meant to be comprehensive implementationally, but are kept simple to illustrate technique.

We begin by asserting that an encryption $T$ may disclose information about a plaintext $\mathbf{a}$ in the ciphertext $T(\mathbf{a})$. We then show how an oracle can be applied to a mapping (which implies the mapping's input and output) to deduce properties of $T$ that would be useful to a cryptanalyst. We further extend our model to discuss the security of compressive or encryptive computational systems.

4.4.4.1. Definition. Let a simple oracle be denoted by an operation $o : \mathcal{A} \times \mathcal{T} \to (Q \times \mathbb{R}^2)^n$ that is not omniscient, and consists of
(i) an algebra $A = (\mathcal{T}, \mathcal{O})$ that contains operands in $\mathcal{T}$ and operations in $\mathcal{O}$,
(ii) a mapping $T$ in $\mathcal{T}$, where $range(T), domain(T) \in$ [illegible in the source], and
(iii) a tuple of $n$ elements, where each element consists of a property $q \in Q = \mathcal{P}(T)$ together with two real numbers that specify a measure such as the probability of correctly determining $q$, as well as the computational error (or inaccuracy in the error estimate) associated with that determination (where applicable).
For the purposes of this section's discussion, we restrict the preceding definition to constrain the oracle to a scrutiny of properties of $T$ that are known to exist in $domain(T)$, and may or may not be evident in $range(T)$.

4.4.4.2. Assumption. Given a cryptanalytic system that is characterized by an algebra $A$ in $\mathcal{A}$, an encryption $T$ in $\mathcal{T}$, and an oracle $o \in \mathcal{O}$, we assume for purposes of discussion that $o(A,T)$ returns a list of properties $q_i$, $i = 1..n$, of $T$ obtained from the scrutiny of $range(T)$, given knowledge about $domain(T)$.

4.4.4.3. Example. Let an oracle $o \in \mathcal{O}$ have in its domain the algebra [illegible in the source], accept a bijection $T$ in $\mathcal{T}$, and return a tuple $((q_1,p_1,e_1), (q_2,p_2,e_2))$. For example, $q_1$ could denote *preserves input histogram*. Here, the probability $p_1 = 1$ and error $e_1 = 0$, since the histogram of an integer-valued image of finite size can be computed exactly. Since $T$ is an isomorphism, if $\mathbf{a} \in (\mathbb{Z}_m)^X$, then

$$h(T(\mathbf{a}))(i) = h(\mathbf{a})(T^{-1}(i)), \quad i \in \mathbb{Z}_m, \ \forall\, \mathbf{a} \in domain(T). \qquad (168)$$

Continuing with property $q_2$, which we assume denotes *does not preserve mean*, the mean $\mu$ of an input image $\mathbf{a}$ would not necessarily be preserved, since $\mu(T(\mathbf{a})) \neq \mu(\mathbf{a})$ for certain substitutions $T$. For example, if $range(\mathbf{a}) = \{0,1\}$, and $\mathbf{a}$ contains unequal numbers of zeroes and ones, then the transform $T(\mathbf{a}) = |\mathbf{a} - 1|$ is a bijection but does not preserve the mean. Thus, the probability $p_2 = 1$ and error $e_2 = 0$, since it is certain that *does not preserve mean* is a property of $T$.

4.4.4.4. Remark. We have thus far discussed only a very restricted type of oracle, which could determine properties such as the disclosure of cardinality, order, moments, or the histogram of an image in $domain(T)$ via scrutiny of $T$'s input and output. In practice, a simple oracle, such as that given in Definition 4.4.4.1, could be used to determine the preservation of such properties by an encryption transform.
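Example 4.4.4.3 can be traced numerically. The sketch below is an illustration under stated assumptions: the five-pixel binary image and the `simple_oracle` function (including its property strings and tuple layout) are hypothetical stand-ins for the oracle output $((q_1,p_1,e_1),(q_2,p_2,e_2))$, while the transform $T(\mathbf{a}) = |\mathbf{a} - 1|$ is taken from the example.

```python
from collections import Counter

def histogram(image):
    """Count the occurrences of each pixel value."""
    return Counter(image)

def mean(image):
    return sum(image) / len(image)

def T(image):
    """The bijective substitution T(a) = |a - 1| applied to every pixel."""
    return [abs(v - 1) for v in image]

def simple_oracle(image):
    """Hypothetical oracle: returns (property, probability, error) tuples."""
    h_in, h_out = histogram(image), histogram(T(image))
    # The output histogram is the input histogram with relabelled bins.
    relabelled = all(h_out[abs(i - 1)] == h_in[i] for i in (0, 1))
    result = []
    if relabelled:
        result.append(("preserves input histogram", 1.0, 0.0))
    if mean(T(image)) != mean(image):
        result.append(("does not preserve mean", 1.0, 0.0))
    return tuple(result)

a = [0, 0, 0, 1, 1]           # unequal numbers of zeroes and ones
print(histogram(a), histogram(T(a)))
print(mean(a), mean(T(a)))    # 0.4 versus 0.6
print(simple_oracle(a))
```

Both properties are reported with probability 1 and error 0 because, for a finite integer-valued image, the histogram and the mean are computed exactly.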
In contrast, a more capable oracle might accept clues about plaintext $\mathbf{a}$ (e.g., semantics, word order, letter n-gram frequency) and deduce further information about the corresponding properties of the ciphertext $T(\mathbf{a})$ (e.g., how $T$ affects plaintext semantics and word order). In keeping with our previous assumptions, we shall employ the simple oracle to define the concepts of disclosure and concealment of properties of $domain(T)$ in $range(T)$. More complex oracles will be discussed in Parts II and III.

4.4.4.5. Definition. Let an oracle $o \in \mathcal{O}$ employ an algebra $A \in \mathcal{A}$ that contains operators and operands employed in cryptanalysis, and let $o$ accept an encryption $T \in \mathcal{T}$. The properties of $T$ that are disclosed or leaked to $o$'s scrutiny of $T$ are expressed as $\mathcal{D}(T) = p_1(o(A,T))$. Here, the existence of $o$ is implied in the notation $\mathcal{D}(T)$, unless otherwise stated. For example, $T$'s security could be enhanced by minimizing $|\mathcal{D}(T)|$.

4.4.4.6. Definition. Properties of $T$ that are hidden or concealed from $o$'s scrutiny of $T$ are given by $\mathcal{H}(T) = \mathcal{P}(T) \setminus \mathcal{D}(T)$.

4.4.4.7. Remark. The preceding definitions are convenient for two reasons. First, we properly limit the capabilities of oracles to non-omniscience. Since an omniscient oracle would be able to solve all problems, but Gödel has shown that not all functions are computable, there is a contradiction when omniscience is attributed to finite systems. Even if $o$ were omniscient only over the limited applications domain of transform properties, then probability and computational error would not be required in $range(o)$, since all outcomes would be certain and, therefore, accurate. Second, we define data security in terms of the information that an oracle can determine with nonzero probability. Thus, we avoid the problem of determining what an oracle does not know, which is undecidable over infinite sets, and may be undecidable over certain finite sets.
Since numerous encryptions have a keyspace that is computed from the real or complex numbers, such considerations are germane, but detailed discussion is beyond the scope of this dissertation. Therefore, instead of concentrating upon oracular failure, we emphasize the more manageable problem of whether salient properties of $domain(T)$ are leaked to $range(T)$. Given the following definitions of saliency, we introduce our discussion of data security with a convenient definition of security.

4.4.4.8. Definition. Per Definition 4.4.4.1, assume that an oracle $o \in \mathcal{O}$ is applied to a transform $T$ and outputs $n$ properties, where each property $q(i)$, $i = 1..n$, has associated with it a probability $p(i)$ and a computational error $e(i)$. A property $q(i)$ is deemed operationally salient if $q(i) \in \Theta \subset \mathcal{P}(T)$, where $\Theta$ denotes a predetermined set of operationally salient properties. Additional criteria for determining saliency may include the restriction that property $q(i)$ in the output of oracle $o$ must exhibit a probability $p(i)$ that exceeds some threshold $t$. A similar argument can be made for the associated error measure $e(i)$.

4.4.4.9. Example. Suppose a transform $T$ leaks the nonconstant histogram $h$ of its input. If an adversary has plaintext n-gram statistics at his disposal, then *leaks h* is an operationally salient property.

4.4.4.10. Definition. A transform $T$ is secure if and only if an oracle $o \in \mathcal{O}$ determines that no properties of $domain(T)$ are leaked to $range(T)$. Given an algebra $A \in \mathcal{A}$ and a secure transform $T \in \mathcal{T}$, it follows that

$$p_1(o(A,T)) \cap \mathcal{P}(domain(T)) = \emptyset. \qquad (169)$$

4.4.4.11. Remark. The preceding definition illustrates another reason why oracles cannot be omniscient. Namely, if an oracle were omniscient, then $T$ could not hide any property of its input. As a result there would be (a) no information in the ciphertext and (b) no encryption transforms. This observation leads to a more realistic definition of data security, as follows.

4.4.4.12. Definition.
A transform $T$ is operationally secure if and only if the salient properties of $domain(T)$ in $\Theta \subset \mathcal{P}(T)$ are not disclosed to an oracle $o \in \mathcal{O}$ that employs algebra $A$. In other words, if $T$ is operationally secure, then $p_1(o(A,T)) \cap \Theta = \emptyset$.

4.4.4.13. Observation. It follows that, if $T$ is an operationally secure mapping, then neither $\mathcal{H}(T)$ nor $\mathcal{D}(T)$ is necessarily empty. For example, $\mathcal{D}(T)$ may contain properties of $T$ that are not viewed as operationally salient, such as the onto-ness of $T$. Additionally, consider scenarios where enumeration-based attacks upon $T$ are thought to be infeasible under operational constraints such as available computational bandwidth or storage. In such cases, it would be reasonable for $\Theta$ not to contain the property *preserves input size*.

4.4.4.14. Example. Consider a compression transform $T$ whose output contains $(siz(F) \cdot |X|)/CR$ bits. If one has an approximate knowledge of $CR$ and a range of probable values for $siz(F)$, then one can guess probable values for the input size $|X|$. However, $|X|$ tells us nothing about the input's structure or about properties that can be computed from the input.

Each transform class exhibits properties that have various implications for data security. We thus present the following theorems that pertain to the data security of transforms in Classes 1-4 of our transformation taxonomy (reference Section 3.3). For the purposes of this discussion, we assume that $F$, $G$, $X$, and $Y$ are finite sets, and that the taxonomic class to which a given transform belongs would be known a priori to an adversary. Per our introduction to the previous section, we assume that each transform class is analyzed independent of implementational concerns that are peripheral to the transform mapping (e.g., key distribution, privilege levels, and keyspace configuration).
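The set-theoretic definitions of disclosure, concealment, security, and operational security reduce to a few set operations once an oracle's output is available. The following minimal sketch assumes that properties are represented as strings and that the oracle output is a list of (property, probability, error) tuples; the property names and the sample output are invented for illustration.

```python
def disclosed(oracle_output):
    """D(T): the first projection of the oracle's output tuples."""
    return {q for (q, p, e) in oracle_output}

def hidden(all_properties, oracle_output):
    """H(T) = P(T) \\ D(T): properties concealed from the oracle."""
    return set(all_properties) - disclosed(oracle_output)

def is_secure(oracle_output, domain_properties):
    """Definition 4.4.4.10: no property of the domain is disclosed."""
    return disclosed(oracle_output).isdisjoint(domain_properties)

def is_operationally_secure(oracle_output, salient):
    """Definition 4.4.4.12: no salient property is disclosed."""
    return disclosed(oracle_output).isdisjoint(salient)

# Hypothetical property names and oracle output.
P = {"histogram", "mean", "input size", "onto"}
theta = {"histogram", "mean"}                      # operationally salient
output = [("input size", 0.9, 0.05), ("onto", 1.0, 0.0)]

print(disclosed(output))
print(hidden(P, output))
print(is_secure(output, P), is_operationally_secure(output, theta))
```

In this sample the transform is operationally secure but not secure, and both the disclosed and hidden sets are nonempty, which is the situation described in Observation 4.4.4.13.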
In summary, our eventual objective is to analyze the security of each mapping $T$ in transform classes 1-4, when $T$ is considered only in the context of the given transform class. In this dissertation, we will provide several illustrative examples.

In the analysis of finite discrete encryptions, the length of the plaintext and the size of the plaintext alphabet provide sufficient information to determine an upper bound of complexity upon an enumerative attack.

4.4.4.15. Theorem. The following statements hold:
(i) Class 1 and Class 2 transforms are insecure, since they disclose the plaintext size in $range(T)$, and
(ii) Class 1 and Class 3 transforms are insecure, since they disclose the alphabet size in $range(T)$.

Proof. Assume that Class 1-3 transforms are as defined in Section 3.4. Then, (i) follows from the fact that $domain(choice(range(T_i))) = domain(choice(domain(T_i)))$, where $i \in \{1,2\}$, and (ii) follows from the fact that $range(choice(range(T_i))) = range(choice(domain(T_i)))$, where $i \in \{1,3\}$.

4.4.4.16. Observation. Class 4 mappings can be compromised as follows. Given $T_4 : F^X \to G^Y$, $f : X \to Y$, and $g : F \to G$, let $T_4 = f \circ g$. For purposes of discussion, assume that the latter equation is known to the adversary, but $f$ and $g$ may not be known. Now, assume that subsets $f_o$ and $g_o$ of $f$ and $g$, which constitute partial functions, are known. Then, a subset (or partial function) $T_o$ of $T_4$ could be synthesized by composing $f_o$ with $g_o$. Although one may not have an oracle that can attack $T_4$, one may be able to apply the appropriate oracle to $T_o$ in order to disclose properties of $domain(T_o)$ in $range(T_o)$, since $f_o$ and $g_o$ are known.
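The synthesis of a partial transform from known fragments of the spatial map and the value map can be sketched as follows. This is one possible reading, stated as an assumption: an image is modelled as a dictionary from locations to values, and the composed transform places the substituted value of the pixel at location x at the mapped location, that is, output[f(x)] = g(a(x)). The function name and the sample fragments are hypothetical.

```python
def synthesize_partial_transform(f_o, g_o):
    """Build T_o from a known partial spatial map f_o and partial value map g_o.

    Pixels whose location or value is not covered by the known fragments
    are left undefined (omitted from the output), so T_o is a partial function.
    """
    def t_o(image):
        out = {}
        for x, v in image.items():
            if x in f_o and v in g_o:
                out[f_o[x]] = g_o[v]
        return out
    return t_o

# Hypothetical known fragments of f : X -> Y and g : F -> G.
f_o = {0: 2, 1: 0}            # location 0 moves to 2, location 1 moves to 0
g_o = {"a": "q", "b": "z"}    # value 'a' becomes 'q', value 'b' becomes 'z'

t_o = synthesize_partial_transform(f_o, g_o)
plaintext = {0: "a", 1: "b", 2: "c"}
print(t_o(plaintext))         # location 2 of the plaintext stays unresolved
```

As the adversary learns more pairs of the spatial and value maps, the dictionaries grow and the synthesized partial transform covers more of the full transform.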
Given the preceding scenario, assume that the oracle's output facilitates further guesses about $T_4$, i.e., helps the adversary enlarge his knowledge of $f_o$ and $g_o$. In the presence of a fitness function that evaluates the content of the reconstructed transform $T_4$, such methods can be iterated in order to achieve a reconstruction of $T_4$ that may converge. Such methods are amenable to implementation in genetic algorithms [66] or relaxation algorithms [67].

We next discuss the preservation of set properties, with the image sum and histogram given as an example.

4.4.4.17. Theorem. A bijection (Class 1 transform) preserves the histogram of its input.

Proof. A bijection is an isomorphism. Therefore, the histogram of the plaintext input is discovered in the transform output by relabelling the histogram domain.

4.4.4.18. Observation. The image sum is preserved by linear transforms of form $T(x) = a \cdot x + b$, due to the linearity property, as well as distributivity of multiplication over addition. However, the sum may be scaled or offset by a constant value.

4.4.4.19. Examples.
1. Let the linear transform $T(\mathbf{a}) = (3 \cdot \mathbf{a}) + 4$. If $N = |domain(\mathbf{a})|$, then $\Sigma(T(\mathbf{a})) = \Sigma((3 \cdot \mathbf{a}) + 4) = 3 \cdot \Sigma\mathbf{a} + 4N$.
2. Truncation of image grey levels by $k$ bits can be viewed as a linear transform whose scale factor equals $2^{-k}$.

With respect to data security, we described (in Chapter 1) how a linear homomorphism of size $n$ could be compromised by solving a system of $n$ linear equations.

4.4.4.20. Theorem. Spatial transforms (Class 3) preserve an $\epsilon$-near approximation to the input sum and histogram, where $\epsilon = 0$ for continuous domains that are not truncated by the spatial transform.

Proof. This holds for continuous spatial transforms, since $range(range(T_3)) = range(domain(T_3))$.
In the case of discrete transforms, $T_3$ may include interpolation of values, which usually renders $\Sigma(T_3(\mathbf{a}))$ an approximation to $\Sigma\mathbf{a}$. $\Box$

The preceding theorems are key to the description of constraints upon cryptanalytic attack via combinatoric means, as applied to compressive or encryptive transformations. The difficulty (or ease) of inverting such transforms dictates the security of computations over encrypted sets. We discuss encryptive computation in the following section.

4.5. Feasibility of Encryptive Computation

We begin by stating several conditions that influence the security of computations over encrypted data. Then, we present an abstract framework for cryptanalytic attack upon encryptive transforms, which is employed in the evaluations of transform security that are discussed primarily in Chapter 8.

4.5.1. Security and $\epsilon$-Security of Homomorphisms

Since we can determine the security of a transform using an oracle, we can extend such statements to computational systems embodying such transforms.

4.5.1.1. Recall. In Chapter 3, we defined the preservation of set properties under homomorphism. As we have shown in Section 4.4, properties that are preserved under transformation $T$ are not limited merely to analogues of image operations, but can include plaintext length and alphabet size. If $T$ is a homomorphism, then it is obviously correct to say that such properties are preserved under homomorphism. Furthermore, if such properties define the security of a transform, then it is reasonable to define secure homomorphisms as follows.

4.5.1.2. Definition. A secure homomorphic computational system is a homomorphic computational system in which the constituent homomorphism(s) is (are) secure according to Definition 4.4.4.10.

4.5.1.3. Definition. An operationally secure homomorphic computational system is a homomorphic computational system whose constituent homomorphism(s) is (are) secure with respect to a set of properties $\Theta$, per Definition 4.4.4.12.
From the development of Chapter 3, it should be understood that the preceding definitions apply to isomorphic and $\epsilon$-near approximations to homomorphic computational systems. Assuming that we have a secure HCS $\mathcal{X}_H$, a key issue in encryptive computation concerns the compromise of $\mathcal{X}_H$'s security by one or more analogues of an image operation, as discussed in the following theorem.

4.5.1.4. Theorem. If a homomorphic computational system $\mathcal{X}_H = (\{F^X, \circ\}, \{T, \phi\}, \{G^Y, \circ'\})$ is secure, then the analogous operation $\circ'$ over $range(T)$ does not compromise the security of $T$.

Proof. By definition of a secure HCS, $T$ is assumed to be secure. For purposes of argument, assume that the definition of security given in Definition 4.4.4.10 holds, i.e., no properties of $domain(T)$ can be detected in $range(T)$ by an oracle $o \in \mathcal{O}$. Implied in this definition is the assumption that no other operation is as capable as the oracle. (If such were the case, then that operation would, by definition, become the oracle.) Thus, $\circ'$, which operates over $range(T)$, cannot produce any property of $domain(T)$ in $range(\circ')$. Since $\circ'$ lacks the oracle's capabilities, $\circ'$ is said to have no information about $domain(T)$ at its input or output. Thus, $\circ'$ does not compromise the security of $T$.

4.5.1.5. Remark. The preceding theorem may initially seem somewhat fanciful, since we know that no finite discrete encryption is safe from attack via enumeration of possible plaintexts. However, per Remark 4.4.4.11, it follows from inspection of Theorem 4.5.1.4 that an insecure homomorphism may have whatever security it possesses further compromised by operations conducted over $range(T)$. With the exception of the trivial case of attack by enumeration, there are very few finite discrete homomorphisms that are not provably insecure. For example, the Fourier transform, which is often the basis of linear homomorphisms, can easily yield to attack via correlation, due to the correlation property of the FT.
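The remark on the Fourier transform can be illustrated with the cross-correlation theorem of the discrete Fourier transform, which is presumably the correlation property meant here. The sketch below assumes real-valued signals and a naive $O(n^2)$ DFT written with the standard library; the function names and sample signals are chosen for illustration. It shows that an operation carried out only on the transformed data (a pointwise conjugate product) yields the circular correlation of the untransformed signals.

```python
import cmath

def dft(x):
    """Naive discrete Fourier transform."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft(X):
    """Naive inverse discrete Fourier transform."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]

def circular_correlation(a, b):
    """Direct circular cross-correlation: c[s] = sum_t a[t] * b[(t + s) mod n]."""
    n = len(a)
    return [sum(a[t] * b[(t + s) % n] for t in range(n)) for s in range(n)]

def correlation_via_dft(A, B):
    """Correlation of the original real signals, computed from their DFTs only."""
    product = [ak.conjugate() * bk for ak, bk in zip(A, B)]
    return [v.real for v in idft(product)]

a = [1.0, 2.0, 3.0, 4.0]
b = [0.0, 1.0, 0.0, 2.0]
direct = circular_correlation(a, b)
recovered = correlation_via_dft(dft(a), dft(b))
print(direct)
print([round(v, 9) for v in recovered])
```

An observer holding only the transformed data can therefore compute a similarity measure between the underlying signals, which is the kind of preserved property that an oracle could exploit.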
It follows that a transform $T$ can succumb to an attack which is based upon one or more properties $Q$ of $T$ that involve the action of preservation. It is easily verified that such cases occur especially when one can determine an analogue $Q'$ of $Q$.

Since secure transforms appear to be scarce, but operationally secure transforms can appear to be plentiful (depending upon the mapping and the set of operationally salient properties), we prefer to couch the security of operations over encrypted data in terms of the following convention that derives from our previous discussion of operational security.

4.5.1.6. Assumption. Assume that an oracle $o \in \mathcal{O}$ is applied to a transform $T$. Given a three-tuple $(q,p,e)$ in $range(o)$, assume that the probability measure $p$ for a given property $q$ denotes the certainty (in the information-theoretic sense) with which $T$ discloses $q$ to $o$'s scrutiny of $range(T)$.

4.5.1.7. Definition. Given an oracle $o \in \mathcal{O}$ with an algebra $A$, a set of salient properties $\Theta$, as well as probability and error thresholds $\epsilon_p$ and $\epsilon_e$, a transform $T$ is said to be operationally $\epsilon$-secure if and only if, for all tuples $(q,p,e)$ in $W$, the following condition holds:

$$W = \{(q,p,e) \in range(o(A,T)) : q \in \Theta,\ p < \epsilon_p,\ \text{and}\ e < \epsilon_e\}. \qquad (170)$$

We can use this concept of $\epsilon$-security for various purposes, as shown in the following observation and example.

4.5.1.8. Observation. Under constraint of Assumption 4.5.1.6, it follows that $1-p$ denotes the uncertainty with which $T$ discloses $q$ to $o$ and the certainty with which $T$ conceals $q$ from $o$. Thus, $p$ denotes the uncertainty with which $T$ conceals $q$, a property of $domain(T)$, from $o$'s scrutiny of $range(T)$. As a result, $\epsilon$-security can be thought of as a link between data security, information theory, and cryptanalysis. The utility of this concept will become apparent in subsequent discussion concerning information-theoretic analysis of data security. We present the following example for purposes of illustration.

4.5.1.9. Example.
Let Definition 4.5.1.7 hold. Given a probability threshold $\epsilon_p$ in $(0,1]$ and an oracle $o \in \mathcal{O}$ with algebra $A$, a mapping $M$ is called operationally mean $\epsilon$-secure if and only if

[Equation 171 is illegible in the source.]

where $W$ was defined in terms of $o(A,T)$ in Equation 170. Similarly, operational RMS $\epsilon$-security could be defined as

[Equation 172 is illegible in the source.]

Prior to the second example of a probabilistic oracle, we expand upon Observation 4.5.1.8 by advancing the following observations concerning the $\epsilon$-security of transforms, which we express in terms of an ensemble of properties.

4.5.1.10. Observation. It can be easily verified by inspection of Definition 4.4.4.12 that a transform $T$ which is operationally secure may not be secure in the sense of Definition 4.4.4.10. That is, one or more properties in $\Theta$ may be manifest in both the range and domain of $T$, albeit with different probabilities of existence. This difference in probabilities is due to the fact that the probability with which $T$ conceals a given property $q$ of $domain(T)$ from an oracle's scrutiny of $range(T)$ can vary with $q$. If an oracle is not omniscient (as we assume in this study), then its knowledge will be limited, and the oracle might not be able to deduce certain properties of a given encryption.

4.5.1.11. Example. Assume (for purposes of simplicity) that $q$ and $s$ are the only properties in $\Theta$ that are known properties of $domain(T)$, and let an oracle $o \in \mathcal{O}$ employ algebra $A$ and accept $T$. For purposes of simplicity, further assume that $p_1(o(A,T))$ is a set containing properties $q$ and $s$, whose probability of detection in $range(T)$ is given by $p(q) = p_2(o(A,T)|_{\{(m,p,e)\,:\,m = q\}})$, and symmetrically for $s$. Since $q$ and $s$ are known to be properties of $domain(T)$, we say that $p(q) = 1$. Now, suppose that $T$ is a homomorphism upon which an HCS $\mathcal{X}_H$ is based. Let a threshold of probability be denoted by $\epsilon$. Then, $\mathcal{X}_H$ would be $\epsilon$-secure with respect to $\Theta \setminus \{q\}$, under the scrutiny of $o$, if and only if $\epsilon > \vee(p(q),p(s))$.
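The threshold tests for $\epsilon$-security can be sketched in code. The following is one reading of Definition 4.5.1.7 and Example 4.5.1.11, offered as an assumption: every salient tuple must fall strictly below both thresholds, and the single-threshold test compares $\epsilon$ with the maximum disclosure probability over the salient properties. The oracle output, property names, and function names are hypothetical.

```python
def salient_tuples(oracle_output, salient):
    """Restrict the oracle's (q, p, e) tuples to the salient properties."""
    return [(q, p, e) for (q, p, e) in oracle_output if q in salient]

def is_operationally_eps_secure(oracle_output, salient, eps_p, eps_e):
    """Every salient tuple has probability below eps_p and error below eps_e."""
    return all(p < eps_p and e < eps_e
               for (q, p, e) in salient_tuples(oracle_output, salient))

def exceeds_max_probability(oracle_output, salient, eps):
    """Threshold test of Example 4.5.1.11: eps > max of the salient probabilities."""
    probs = [p for (q, p, e) in salient_tuples(oracle_output, salient)]
    return eps > max(probs, default=0.0)

# Hypothetical oracle output: properties q and s are salient, 'onto' is not.
output = [("q", 0.2, 0.01), ("s", 0.6, 0.05), ("onto", 1.0, 0.0)]
theta = {"q", "s"}

print(is_operationally_eps_secure(output, theta, eps_p=0.7, eps_e=0.1))
print(is_operationally_eps_secure(output, theta, eps_p=0.5, eps_e=0.1))
print(exceeds_max_probability(output, theta, eps=0.7))
```

The non-salient property is disclosed with certainty yet does not affect either test, since only tuples whose property lies in the salient set are examined.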
It is easily verified that a criterion for $\epsilon$ could be thusly derived from practical knowledge of a mapping and its properties. We next formalize the concept of operational security for homomorphic computational systems, which generalize all computational systems, as discussed in Chapter 3.

4.5.2. Operational Security of HCS

The definition of security criteria for homomorphic computational systems is key to our discussion of secure computation, since HCS and ICS form the basis for our theoretical discussion.

4.5.2.1. Definition. Given a set of operational properties $\Theta$ and an oracle $o \in \mathcal{O}$, an operationally $\epsilon$-secure homomorphic computational system $\mathcal{X}_H = (\{S,\circ\}, \{T,\phi\}, \{S',\circ'\})$ is an HCS that satisfies the following conditions:
(i) The homomorphism $T : S \to S'$ is an operationally $\epsilon$-secure mapping, in the context of $\Theta$ and $o$; and
(ii) The operation $\circ' : S' \times S' \to S'$, an analogue of the operation $\circ : S \times S \to S$, is an operationally $\epsilon$-secure mapping.

4.5.2.2. Theorem. Given a set of operational properties $\Theta$ and an oracle $o \in \mathcal{O}$, an isomorphic computational system $\mathcal{X}_I = (\{S,\circ\}, \{T,\phi\}, \{S',\circ'\}, T^{-1})$ is operationally $\epsilon$-secure if the following three conditions are satisfied:
(i) The HCS $\mathcal{X}_H = (\{S,\circ\}, \{T,\phi\}, \{S',\circ'\})$ is $\epsilon'$-secure in the context of $\Theta$ and $o$;
(ii) The mapping $\phi : \circ \mapsto \circ'$ has an inverse $\phi^{-1} : \circ' \mapsto \circ$ and the HCS $\mathcal{X}'_H = (\{S',\circ'\}, \{T^{-1},\phi^{-1}\}, \{S,\circ\})$ is $\epsilon''$-secure in the context of $\Theta$ and $o$; and
(iii) $\epsilon = \vee(\epsilon',\epsilon'')$.

Proof. The proof is outlined as follows. From Theorem 4.5.2.2 and Section 3.2, we see that a secure ICS can be constructed from two HCS $\mathcal{X}_H$ and $\mathcal{X}'_H$, where $\mathcal{X}_H$ is based upon the homomorphism $T$ and $\mathcal{X}'_H$ is based upon the inverse homomorphism $T^{-1}$. If the insecurity levels of $\mathcal{X}_H$ and $\mathcal{X}'_H$ are denoted by $\epsilon'$ and $\epsilon''$, respectively, then from Observation 4.5.1.8, the net insecurity is equal to $\epsilon = \vee(\epsilon', \epsilon'')$. The maximum function is employed since we seek to determine the maximum probability with which a property of $domain(T)$ is manifested in $range(T)$.

4.5.2.3. Remark.
We emphasize that the preceding comments hold primarily for finite sets or finite subsets of infinite sets. Since all properties of all sets are not computable, we have shown that it is sufficient to speak of security in terms of non-omniscient oracles that determine only those properties that are known to pertain to a given operational scenario. When deriving an $\epsilon$-secure HCS, one must first consider the specification of the properties $\Theta$ that constrain the definition of security. Then, one can proceed with the design of an oracle, which (if it exists) would be a special case of a more general oracle $o \in \mathcal{O}$. For example, if one knows that preservation of the histogram is a salient property, then one can proceed with the design of an oracle that deduces the certainty with which the plaintext histogram is preserved in ciphertext. We next discuss computational error and information loss.

CHAPTER 5
COMPUTATIONAL ERROR AND INFORMATION LOSS

Computations that employ discrete approximations of real numbers produce errors which can degrade the accuracy of operations over the range spaces of numerous transforms. In Section 5.1, we analyze error propagation in discrete computational systems. Given the error functions associated with basic arithmetic and transcendental operations, we then derive error predictors for the image algebra operations employed in this study (Section 5.2). In Section 5.3, we present a theory of error-tolerant computation and discuss the feasibility of such erroneous computations in Section 5.4.

5.1. Theory of Error Propagation in Discrete Systems

We begin our discussion of error analysis by considering the errors of roundoff and series truncation inherent in digital computation, after Chern [64]. Then, we analyze the error that accrues from the discrete implementation of arithmetic and transcendental functions.

5.1.1. Basic Error Measures

We first establish several notational conventions in terms of an idealized computation.

5.1.1.1.
Assumption. Let $S$ be a normed space with norm $\|\cdot\|$. Assume that one defines a function $f : S \to S$ symbolically as $f(x) = g(f,x)$, $x \in \mathbb{R}$, where the ideal implementation $g : (S \cup \{\perp\})^S \times S \to S$ accepts a partial function $f_o \subset f$ with $\perp$ denoting an undefined symbol, together with a value in $S$, and produces $f(x) \in S$.

5.1.1.2. Remark. The preceding formulation is abstract by design, and thus says nothing about the implementation of $f$. For example, $f$ could be defined by a programmer in terms of the functional composition $g = f \circ x$, where $x$ could denote a reflexive function that maps an element $\mathbf{f} \in S$ onto itself.

5.1.1.3. Definition. In principle, let $f$ have an exact solution that is denoted by $f_e$ and an algorithmic solution $f_a$ that could be solved exactly, with sufficient resources. Additionally, let the computation of $f_a$ on an N-processor discrete architecture $A$ with finite precision be denoted by $f_c$. For purposes of completeness, we denote the initial value $f_0$ of $f(x)$ for some $x = x_0$.

5.1.1.4. Example. Given $S = \mathbb{R}$, if $f(x) = \sin^{-1}(x)$, $\forall\, x \in S$, then $f_a$ could be the Taylor-series approximation to the symbolic expression $f$. The computation of $f_a$ on a sequential architecture with 32-bit precision would be given by $f_c$.

5.1.1.5. Definition. Under constraint of Assumption 5.1.1.1 and Definition 5.1.1.3, the accumulated roundoff error of $f$ when computed on $A$ as $f_c$ (as opposed to being computed as $f_a$) is given by:

$$\epsilon_r(x) = f_c(x) - f_a(x). \qquad (173)$$

5.1.1.6. Definition. Under constraint of Assumption 5.1.1.1 and Definition 5.1.1.3, the accumulated truncation error associated with computing $f$ with infinite precision using a finite (truncated) expression (series) $f_a$ is given by:

$$\epsilon_t(x) = f_a(x) - f_e(x). \qquad (174)$$

5.1.1.7. Definition. Under constraint of Assumption 5.1.1.1 and Definition 5.1.1.3, the total accumulated error of $f$ when computed on $A$ is given by:

$$\epsilon_{tot} = \epsilon_t + \epsilon_r = f_c - f_e. \qquad (175)$$

5.1.1.8. Remark.
For purposes of brevity, we express the total error $\epsilon_{tot}$ of a function $f$ as the deviation $\delta f$. Similarly, we express the error of an operand $\mathbf{f} \in S$, where $S$ is a normed space, by $\delta\mathbf{f}$. In this dissertation, $\delta f$ will be a symbolic entity primarily, while $\delta\mathbf{f}$ generally denotes the scalar error or error distribution associated with a scalar variable or image values. The context will be clear from the form of a given expression. This notational convenience enables us to express the following error functions concisely.

5.1.2. Error Propagation in Arithmetic and Transcendental Functions

We begin by assuming that the basis functions for discrete computing are the arithmetic functions of addition ($+$) and multiplication ($\cdot$), as well as their inverses. We express the transcendental functions and their inverses (log, exp, sine, cosine) in terms of Taylor series expansions, which can be computed in terms of addition and multiplication.

5.1.2.1. Theorem. Let $\mathbf{f},\mathbf{g},\mathbf{h} \in \mathbb{R}$, and let the nonzero sum $\mathbf{h} = f(\mathbf{f},\mathbf{g}) = \mathbf{f} + \mathbf{g}$, where $\delta\mathbf{f}, \delta\mathbf{g} \in \mathbb{R}$ denote the errors in $\mathbf{f}$ and $\mathbf{g}$. The absolute error $|\delta\mathbf{h}|$ in the nonzero sum $\mathbf{h} = \mathbf{f} + \mathbf{g} \neq 0$ is bounded as $|\delta\mathbf{h}| \le |\delta\mathbf{f}| + |\delta\mathbf{g}|$.

Proof. The proof is given in Reference 60, and is well known.

5.1.2.2. Theorem. Let $\mathbf{f},\mathbf{g},\mathbf{h} \in \mathbb{R}$, and let the nonzero product $\mathbf{h} = f(\mathbf{f},\mathbf{g}) = \mathbf{f} \cdot \mathbf{g}$, where $\delta\mathbf{f}, \delta\mathbf{g} \in \mathbb{R}$ denote the errors in $\mathbf{f}$ and $\mathbf{g}$. The absolute deviation $|\delta\mathbf{h}|$ in the nonzero product $\mathbf{h} = \mathbf{f} \cdot \mathbf{g} \neq 0$ is bounded as

[Equation 176 is illegible in the source.]

The rightmost term is usually of negligible magnitude, and can be ignored.

Proof. The proof is given in Reference 60, and is well known.

5.1.2.3. Theorem. Let $g, h \in \mathbb{R}$, and let the exponent $h = f(g) = e^g$, where $\delta g \in \mathbb{R}$ denotes the error in $g$. If the series expansion of $f$ is given by the algorithm

$$f_a(g) = 1 + \sum_{i=1}^{n-1} \frac{g^i}{i!}, \qquad (177)$$

where $n$ is finite and $0! = 1$, then the absolute error $|\delta h|$ in $f_a$ is bounded as

[Equation 178 is illegible in the source.]

The rightmost term ($|\delta i|/i!$) is usually zero, since $i$ is encoded as an exact (integer) quantity in most computers.

Proof.
Assuming that the givens hold, we have from Theorems 5.1.2.1 and 5.1.2.2 that:

[Equation 179 is illegible in the source.]

since the error $\delta i$ in an integer $i$ equals zero. This result can be obtained from the right-hand member of the preceding equation by assuming that $\delta(i!) = 0$, then expanding $\delta(g^i)$ according to the product rule given in Theorem 5.1.2.2.

5.1.2.4. Theorem. Let $h = f(g) = g^k$, $\forall\, g \in \mathbb{R}$ and $k \in \mathbb{Z}$, where $\delta g \in \mathbb{R}$ denotes the error in $g$. If $f$ is computed in terms of the algorithm

$$f_a(g) = \prod_{i=1}^{k} g, \qquad (180)$$

where $k$ is assumed to be finite, then the absolute error $|\delta h|$ in $f_a$ is bounded as:

[Equation 181 is illegible in the source.]

Proof. The proof follows straightforwardly from Theorem 5.1.2.2.

5.1.2.5. Theorem. Let $h = f(g) = \ln(g)$, $\forall\, g, h \in \mathbb{R}$, such that $g > 0$, where $\delta g \in \mathbb{R}$ denotes the error in $g$. If the series expansion of $f$ is given by the algorithm

[Equation 182 is illegible in the source.]

where $n$ is assumed to be finite, then the absolute error $|\delta h|$ in $f_a$ is bounded as:

[Equation 183 is illegible in the source.]

with $\epsilon_t$ denoting the total computational error in the quotient expression $1/(2i-1)$.

Proof. Assuming that the givens hold, we have from Theorems 5.1.2.1 and 5.1.2.2 that

[Equation 184 is illegible in the source.]

Since $|\delta h| \le n \cdot \epsilon_t + (n+1) \cdot$ [illegible], the theorem holds.

5.1.2.6. Theorem. Let $h = f(g) = \sin(g)$, $\forall\, g \in \mathbb{R}$, where $\delta g \in \mathbb{R}$ denotes the error in $g$. If the series expansion of $f$ is given by the algorithm

$$f_a(g) = \sum_{i=1}^{n} (-1)^{i-1} \cdot \frac{g^{2i-1}}{(2i-1)!}, \qquad (185)$$

where $n$ is assumed to be finite, then the absolute error $|\delta h|$ in $f_a$ is bounded as

[Equation 186 is illegible in the source.]

with $\epsilon_t$ denoting the total error inherent in computing the quotient $1/(2i-1)!$.

Proof.
Assuming that the givens hold, and noting that sign inversion carries no computational error, we further assume that the computation of $1/(2i-1)!$ incurs total error $\epsilon_t$. Then, we have the following derivation:

[Equation 187 is illegible in the source.]

which states the bound, where the inequality follows from the triangle inequality. $\Box$

The following method of computing the sine function yields greater accuracy and computational efficiency.

5.1.2.7. Theorem. Let $h = f(g) = \sin(g)$, $\forall\, g \in \mathbb{R}$, where $\delta g \in \mathbb{R}$ denotes the error in $g$. If the series expansion of $f$ is given by the algorithm

$$f_a(g) = g \cdot \prod_{i=1}^{n} \left(1 - \frac{g^2}{i^2 \pi^2}\right), \qquad (188)$$

where $n$ is finite and the error inherent in computing $\pi$ is denoted by $\epsilon_\pi$, then the absolute error $|\delta h|$ in $f_a$ is bounded as:

[Equation 189 is illegible in the source.]

with $\epsilon_t$ denoting the machine error inherent in computing the quotient $1/(2i-1)!$.

Proof. Assuming that the givens hold, we have from Theorems 5.1.2.1 and 5.1.2.2 that:

[Equation 190 is illegible in the source.]

which restates the bound. The inequality follows from the triangle inequality, after the proof of the preceding theorem.

5.1.2.8. Observation. Method 2 (Theorem 5.1.2.7) of computing the sine function exhibits an error which is linear in the fractional error $|\delta g / g|$, while Method 1 (Theorem 5.1.2.6) is quadratic in $|\delta g / g|$. If $|\delta g / g| < 1$ (which is usual), then Method 1 is more accurate, provided that $\pi$ is computed with sufficient precision. However, if $|\delta g| > |g|$ then Method 2 yields greater precision.

5.1.2.9. Remark. Since the cosine function differs from the sine only by a phase constant, the error in the cosine should be determined in a manner similar to that of the sine. Since proofs of the following assertions can be arrived at via inspection of Theorems 5.1.2.6 and 5.1.2.7, we state the error functions in brief form only. Let $g,h \in \mathbb{R}$ exhibit deviations $\delta g, \delta h \in \mathbb{R}$.
If the cosine of $g$ is given by

$$h = f(g) = \cos(g) \approx f_a(g) = \sum_{i=0}^{n} (-1)^i \cdot \frac{g^{2i}}{(2i)!}, \tag{191}$$

then the deviation in $h$ is given by Equation 192, whose right-hand member is illegible in the source.

Alternatively, if the cosine is given by the product expansion

$$h = f(g) = \cos(g) \approx f_a(g) = \prod_{i=1}^{n} \left( 1 - \frac{4g^2}{(2i-1)^2 \cdot \pi^2} \right), \tag{193}$$

then the deviation in $h$ is bounded by Equation 194 (right-hand member illegible in the source), which follows directly from Theorem 5.1.2.2. As before, $\varepsilon_\pi$ is taken $O(n^2)$ times.

5.1.2.10. Theorem. Let $g, h \in \mathbb{R}$ exhibit deviations $\delta g, \delta h \in \mathbb{R}$. If the inverse sine function is given by the series expansion $h = f(g) = \sin^{-1}(g) \approx f_a(g)$ (Equation 195, summand illegible in the source), then the deviation $\delta h$ is given by an expression in $\delta g / g$ (Equations 196 and 197, illegible in the source).

Proof. The proof follows directly from Theorems 5.1.2.1, 5.1.2.2, and 5.1.2.6.

5.1.2.11. Remark. The error in the inverse cosine expansion $h = f(g) = \cos^{-1}(g) \approx f_a(g)$ (expansion illegible in the source) is equal to the error in the inverse sine function plus $\varepsilon_\pi$, the error inherent in computing $\pi$. This is due to the phase difference of $\pi/2$ radians between the sine and cosine functions.

We next present theorems that describe the error in univariate and multivariate functions. The proofs, which are given in References 60 and 61, are well known.

5.1.3. Error in Univariate and Multivariate Functions.

It is useful to determine the error inherent in symbolic functional specifications, which are frequently encountered in certain programming languages (such as Lisp). We begin with an analysis of the general univariate function, then progress to multivariate functions and functional composition.

5.1.3.1. Theorem. Let a univariate function $f : \mathbf{X} \to \mathbf{F}$ be differentiable. The error inherent in $f(x)$ is given in terms of the deviation $\delta x$ as

$$|\delta f|(x) = \frac{df(x)}{dx} \cdot \delta x, \quad x \in \mathbf{X}, \tag{198}$$

where we assume that $\delta x$ is taken in the same sense as $dx$.

5.1.3.2. Theorem. Let a multivariate function $f : \mathbf{X}^m \to \mathbf{F}$ be differentiable. The error inherent in $f(x_1, x_2, \ldots, x_m)$, where $(x_1, x_2, \ldots, x_m) \in \mathbf{X}^m$, is given by

$$\delta f = \sum_{i=1}^{m} \frac{\partial f}{\partial x_i} \cdot \delta x_i \tag{199}$$

if the following conditions are satisfied:

(i) the errors $\delta x_1, \delta x_2, \ldots, \delta x_m \in \mathbb{R}$ in $x_1, x_2, \ldots, x_m \in \mathbb{R}$ are independent and random; and

(ii) $\delta f$ is bounded as $|\delta f| \le \sum_{i=1}^{m} (\cdot)$, where the summand is illegible in the source.

The following theorem exhibits an obvious proof, and is a useful introduction to the discussion of error propagation in computational systems.

5.1.3.3. Theorem. Let the composition of $n$ functions of form $g_i : \mathbf{S} \to \mathbf{S}$, where $i = 1..n$, be given by

$$h = f(g_1, g_2, \ldots, g_n) = \bigcirc_{i=1}^{n}\, g_i. \tag{200}$$

The computational error is given by

$$\delta h = f(\varepsilon_{g_1}, \varepsilon_{g_2}, \ldots, \varepsilon_{g_n}) = \left( \bigcirc_{i=1}^{n}\, \varepsilon_{g_i} \right)(\delta g), \tag{201}$$

where $\varepsilon_{g_i}$ denotes the error function of $g_i$ and the deviation $\delta g$ denotes the input error.

We thus have the basis for determining error propagation in image algebra operations, as discussed in the following section.

5.2. Error Propagation in Discrete Image Algebra Operations

Since the algorithms of this study are expressed in terms of image algebra (IA), we present the following analysis of error propagation through discrete IA operations. Because pointwise image operations obey the error propagation rules stated in Section 5.1, we begin with the global reduce functions of image sum, product, and maximum. We then analyze error propagation through the floor and ceiling functions, as well as the image histogram, median, and moments (central and noncentral). We conclude with a brief discussion of errors inherent in computing the image-template product.

Unless otherwise specified, we make the usual assumption that the specification of pixel locations in the image domain $\mathbf{X}$ is exact. Thus, we adopt the convention that errors occur as a result of perturbations in image pixel values.

5.2.1. Global Reduce Function

We begin our discussion of error propagation in global reduce functions with the common case of image summation. It is easily verified that the mean error in the image sum is bounded above by the sum of the mean pixel errors, as shown in the following theorem.

5.2.1.1. Theorem.
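Let image $\mathbf{a} \in \mathbb{R}^{\mathbf{X}}$ and let the absolute mean deviation per pixel be denoted by $\delta_r = \frac{1}{|\mathbf{X}|} \cdot \sum_{\mathbf{x} \in \mathbf{X}} |\delta(\mathbf{a}(\mathbf{x}))|$. Then, the mean error $\delta h \in \mathbb{R}$ that results from the image sum $h = f(\mathbf{a}) = \Sigma\mathbf{a}$ is bounded as $|\delta h| \le |\mathbf{X}| \cdot \delta_r$.

Proof. Assuming that the givens hold, then from Theorem 5.1.2.1, we have that

$$|\delta h| = \left| \sum_{\mathbf{x} \in \mathbf{X}} \delta(\mathbf{a}(\mathbf{x})) \right| = |\mathbf{X}| \cdot \left| \frac{1}{|\mathbf{X}|} \cdot \sum_{\mathbf{x} \in \mathbf{X}} \delta(\mathbf{a}(\mathbf{x})) \right| \le |\mathbf{X}| \cdot \delta_r, \tag{202}$$

where the inequality follows from the triangle inequality.

5.2.1.2. Example. Let $\mathbf{a} \in (\mathbb{Z}_{2^{16}})^{\mathbf{X}}$ and let $\delta_r$ represent an error of ±1 least significant bit (LSB), i.e., $|\delta_r| = 1$. Assuming uniformly distributed image values, the average value of each pixel will equal $2^{16}/2 = 2^{15}$. If $|\mathbf{X}| = 1$M pixels, then $\Sigma\mathbf{a} = |\mathbf{X}| \cdot 2^{15} = 2^{20} \cdot 2^{15} = 2^{35} \approx 3.43 \times 10^{10}$. Thus, the mean error per pixel in the sum is bounded as $\delta(\Sigma\mathbf{a}) \le \Sigma\mathbf{a}/|\mathbf{X}| = 2^{15} \approx 3.28 \times 10^{4}$.

5.2.1.3. Theorem. Let image $\mathbf{a} \in \mathbb{R}^{\mathbf{X}}$ and let the mean deviation per pixel be denoted by $\delta_r \in \mathbb{R}$. If the image product is $h = f(\mathbf{a}) = \Pi\mathbf{a}$, then the error $\delta h$ in $h$ is bounded as:

$$|\delta h| \le \Pi\mathbf{a} \cdot \Sigma(\delta_r / \mathbf{a}). \tag{203}$$

Proof. The result follows directly from Theorem 5.1.2.2.

From the preceding two theorems, it is obvious that the pixelwise error is a significant measure. In particular, it is useful to express the expected error in a space-variant fashion, since many imaging devices exhibit spatially variant SNR and PSF effects. Additionally, the following convention can simplify the formulation of error propagation equations.

5.2.1.4. Definition. An error image $\mathbf{e} \in \mathbb{R}^{\mathbf{X}}$ of an image $\mathbf{a} \in \mathbb{R}^{\mathbf{X}}$ is defined as $\mathbf{e}(\mathbf{x}) = \delta(\mathbf{a}(\mathbf{x}))$, $\mathbf{x} \in \mathbf{X}$.

5.2.1.5. Example. From Theorem 5.1.2.1, the error in the image sum is given by $|\delta(\Sigma\mathbf{a})| \le \Sigma\mathbf{e}$. From Theorem 5.1.2.2, the error in the image product is given by $|\delta(\Pi\mathbf{a})| \le \Sigma(\mathbf{e}/\mathbf{a}) \cdot \Pi\mathbf{a}$.

We conclude our discussion of the global reduce function with the image maximum $\vee\mathbf{a} = \bigvee_{\mathbf{x} \in \mathbf{X}} \mathbf{a}(\mathbf{x})$, which can be analyzed in terms of the binary function $\vee : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$.

5.2.1.6. Theorem. Let $\mathrm{f}, \mathrm{g}, \mathrm{h} \in \mathbb{R}$, with respective deviations $\delta\mathrm{f}, \delta\mathrm{g}, \delta\mathrm{h} \in \mathbb{R}$.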
If the maximum is expressed in terms of addition, i.e.,

$$\mathrm{h} = f(\mathrm{f}, \mathrm{g}) = \vee(\mathrm{f}, \mathrm{g}) = \begin{cases} \mathrm{f} & \text{if } (\mathrm{f} - \mathrm{g}) > 0 \\ \mathrm{g} & \text{otherwise,} \end{cases} \tag{204}$$

then the error $\delta\mathrm{h}$ in $\mathrm{h}$ is bounded as $|\delta\mathrm{h}| \le |\delta\mathrm{f}| + |\delta\mathrm{g}|$.

Proof. The proof follows directly from Theorem 5.1.2.1, since the operation of maximum is defined in terms of addition.

5.2.1.7. Observation. It follows from the preceding theorem that the global reduce function of maximum has an error function that is identical to that of the image sum, since both functions are expressed in terms of addition.

5.2.2. Additional Image Functions

We now progress to the analysis of the nonlinear operations of floor and ceiling, then derive the errors inherent in higher-level image functions, such as the histogram, median, and mean.

5.2.2.1. Theorem. Let $g, h \in \mathbb{R}$ exhibit error $\delta g, \delta h \in \mathbb{R}$. If the ceiling function is computed exactly as $h = f(g) = \lceil g \rceil$ and is implemented in terms of the algorithm $h \approx f_a(g) = i \in \mathbb{Z}$ if $i - 1$ [the remainder of Theorem 5.2.2.1 and all of Theorem 5.2.2.2 are missing from the source].

5.2.2.3. Theorem. Let $\mathbf{F}$ denote a finite subset of $\mathbb{R}$ and $\mathbf{a} \in \mathbf{F}^{\mathbf{X}}$. Denote the histogram $\mathbf{h} \in \mathbb{N}^{\mathbf{F}}$, such that $\mathbf{h}(r) = \Sigma\chi_{=r}(\mathbf{a})$, where $r \in range(\mathbf{a})$. If $\mathbf{e} \in \mathbb{R}^{\mathbf{X}}$ is the absolute error image of $\mathbf{a}$, and the values in $range(\mathbf{a})$ are sorted in ascending order as $(r_1, r_2, \ldots, r_i, \ldots)$, then the counting error specific to the $i$-th histogram bin is bounded as:

$$\delta(\mathbf{h}(r_i)) \le \Sigma \vee \left( \chi_{>s(i,0)}(\mathbf{e}),\; \chi_{>s(i,-1)}(\mathbf{e}) \right), \tag{207}$$

where $s(i,j) = r_{i+1+j} - r_{i+j}$.

Proof. The proof is outlined as follows. Assuming that the givens hold, let $r_{i-1}, r_i, r_{i+1} \in \mathbb{R}$ denote pixel values in image $\mathbf{a} \in \mathbf{F}^{\mathbf{X}}$ that are adjacent domain points of the histogram $\mathbf{h} \in \mathbb{N}^{\mathbf{F}}$. Assume that histogram bin boundaries are situated midway between each pair of adjacent values in $\mathbf{F}$, as is usual. We have the following two cases:

Case 1: If $\mathbf{e}(\mathbf{x}) = |\delta(\mathbf{a}(\mathbf{x}))| > r_{i+1} - r_i$ and $\mathbf{a}(\mathbf{x}) = r_i + |\mathbf{e}(\mathbf{x})|$, then $\mathbf{a}(\mathbf{x})$ would not be classified in the $i$-th bin of $\mathbf{h}$, but would be classified in the $j$-th bin, where $j > i$. The counting error would thus be less than or equal to $\Sigma\chi_{>s(i,0)}(\mathbf{e})$.
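Case 2: If $\mathbf{e}(\mathbf{x}) = |\delta(\mathbf{a}(\mathbf{x}))| > r_i - r_{i-1}$ and $\mathbf{a}(\mathbf{x}) = r_i - |\mathbf{e}(\mathbf{x})|$, then $\mathbf{a}(\mathbf{x})$ would be misclassified in the $j$-th bin of $\mathbf{h}$, where $j < i$. The counting error would thus equal $\Sigma\chi_{>s(i,-1)}(\mathbf{e})$.

We use the relation ($>$) together with the characteristic function, since $\mathbf{e}$ denotes the absolute error. If the error were not absolute, then the second case would pertain to negatively-signed error. Additionally, in order to provide an upper bound, we use the maximum function. It is not known a priori whether Case 1 or Case 2 holds for each pixel, so we would compute both in practice, and employ the worst case as an error bound.

5.2.2.4. Theorem. Let $\mathbf{F}$ denote a finite subset of $\mathbb{R}$, let $\mathbf{a} \in \mathbf{F}^{\mathbf{X}}$, and let the histogram $\mathbf{h} \in \mathbb{N}^{\mathbf{F}}$ have an error image $\mathbf{e} \in \mathbb{R}^{\mathbf{F}}$ and a cumulative histogram $\mathbf{c} \in \mathbb{N}^{\mathbf{F}}$. Assume that values in $range(\mathbf{a})$ are sorted in ascending order as $r_1, r_2, \ldots, r_{i-1}, r_i, r_{i+1}, \ldots, r_n$. If the median $q \in \mathbb{R}$ of $\mathbf{a}$ is defined as:

$$q = f(\mathbf{a}) = \wedge\, domain\left( \mathbf{h}\, \|_{> \Sigma\mathbf{h}/2} \right), \tag{208}$$

then the error $\delta q$ in computing the median is bounded as:

$$\delta q \le |r_i - r_j| \quad \text{where} \quad \mathbf{c}(r_j) - \mathbf{e}(r_j) < \Sigma\mathbf{h}/2 < \mathbf{c}(r_j) + \mathbf{e}(r_j). \tag{209}$$

Proof. The proof is outlined as follows. Assume that the givens hold and note that the computation of $\mathbf{c}$ is an integer operation, which incurs no error. However, if the histogram exhibits error $\mathbf{e}$, then error can accrue in the cumulative histogram. In particular, $\mathbf{e}$ cannot be absolute, since $\Sigma\mathbf{h} + \Sigma\mathbf{e}$ would exceed $|\mathbf{X}|$. If $e$ denotes the average (expected) error in a given bin of $\mathbf{h}$, then we have two cases:

Case 1: If $\mathbf{c}(r_i) - e < \Sigma\mathbf{h}/2 < \mathbf{c}(r_i) + e$, then $q = r_i$ typically, and there is no counting error at the $i$-th bin.

Case 2: Else, $\mathbf{c}(r_j) < \Sigma\mathbf{h}/2 < \mathbf{c}(r_{j+1})$ and there is an error that equals $|r_i - r_j|$.

Thus, the theorem holds.

5.2.2.5. Theorem. Let $\mathbf{X}$ denote an $m \times n$-pixel array of points, and let $\mathbf{a} \in \mathbb{R}^{\mathbf{X}}$. Let the noncentral moment $m_{pq}$ of order $pq$ be defined in IA as:

$$m_{pq}(\mathbf{a}) = \Sigma(\mathbf{i}^p * \mathbf{j}^q * \mathbf{a}), \tag{210}$$

where $\mathbf{i} = \{(\mathbf{x}, \mathbf{i}(\mathbf{x})) : \mathbf{i}(\mathbf{x}) = p_1(\mathbf{x}),\ \mathbf{x} \in \mathbf{X}\}$ and $\mathbf{j} = \{(\mathbf{x}, \mathbf{j}(\mathbf{x})) : \mathbf{j}(\mathbf{x}) = p_2(\mathbf{x}),\ \mathbf{x} \in \mathbf{X}\}$.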
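The error function of the noncentral moment is given by:

$$\delta(m_{pq}(\mathbf{a})) = \delta(\Sigma\mathbf{a}), \tag{211}$$

where $\delta(\Sigma\mathbf{a})$ was specified in Theorem 5.1.2.1.

Proof. Assume that the givens hold, and that $\mathbf{i}$ and $\mathbf{j}$ exhibit no error. Since neither $\mathbf{i}^p * \mathbf{j}^q$ nor its summation exhibits error, Equation 210 indicates that the error in $m_{pq}$ is given by $\delta(\Sigma\mathbf{a})$, which is described in Theorem 5.1.2.1.

5.2.2.6. Theorem. Let $\mathbf{X}$ denote an $m \times n$-pixel array of points and $\mathbf{a} \in \mathbb{R}^{\mathbf{X}}$. Letting $\mathbf{i} = \{(\mathbf{x}, \mathbf{i}(\mathbf{x})) : \mathbf{i}(\mathbf{x}) = p_1(\mathbf{x}),\ \mathbf{x} \in \mathbf{X}\}$ and $\mathbf{j} = \{(\mathbf{x}, \mathbf{j}(\mathbf{x})) : \mathbf{j}(\mathbf{x}) = p_2(\mathbf{x}),\ \mathbf{x} \in \mathbf{X}\}$, the mean images $\bar{\mathbf{i}}$ and $\bar{\mathbf{j}}$ are defined as

$$\bar{\mathbf{i}} = (\mathbf{i} * \mathbf{a}) / \Sigma\mathbf{a} \quad \text{and} \quad \bar{\mathbf{j}} = (\mathbf{j} * \mathbf{a}) / \Sigma\mathbf{a}. \tag{212}$$

Let the nonzero central moment $\mu_{pq}$ of order $pq$ be given by

$$\mu_{pq}(\mathbf{a}) = \Sigma\left( (\mathbf{i} - \bar{\mathbf{i}})^p * (\mathbf{j} - \bar{\mathbf{j}})^q * \mathbf{a} \right). \tag{213}$$

If $\mathbf{e} \in \mathbb{R}^{\mathbf{X}}$ denotes the error image of $\mathbf{a}$, and $\mathbf{e}$ is nowhere zero-valued, then the error in $\mu_{pq}$ is bounded as

$$|\delta(\mu_{pq}(\mathbf{a}))| \le \left( (p + q + 1) \cdot \vee(\mathbf{e}/\mathbf{a}) + (p + q) \cdot (\Sigma\mathbf{e}/\Sigma\mathbf{a}) \right) \cdot \mu_{pq}(\mathbf{a}), \tag{214}$$

and the average error is given by

$$\delta(\mu_{pq}(\mathbf{a})) \le \mu_{pq}(\mathbf{a}) \cdot \left( \bar{e} + (p + q) \cdot (\Sigma\mathbf{e}/\Sigma\mathbf{a}) \right), \tag{215}$$

where $\bar{e} = \Sigma(\mathbf{e}/(\mathbf{a} * |\mathbf{X}|))$.

Proof. The proof is outlined as follows. Assume that the givens hold and assume that $\mathbf{i}$ and $\mathbf{j}$ are not erroneous. Then $\delta(\bar{\mathbf{i}})$ is given by Equation 216 (illegible in the source), and $\delta(\bar{\mathbf{j}})$ is specified symmetrically. Thus, $\delta(\mu_{pq}(\mathbf{a}))$ is bounded by $\mu_{pq}(\mathbf{a})$ times a sum of three terms, the first two of which are weighted by $p$ and $q$ (Equation 217; the individual terms are illegible in the source). Let $\mathbf{e} = \delta\mathbf{a}$ denote the error image of $\mathbf{a}$. By substitution of Equation 216 into Equation 217, we have Equation 214. The average error is proven symmetrically.

5.2.3. Image-Template Product.

The image-template operation $\circledast$ is the most powerful operation in IA, and has the greatest potential for error propagation, since $\circledast$ computes over image and template values that may be erroneous. We begin by analyzing the error in the common case of $\circledast = \oplus$, then discuss higher-level methods for error analysis of $\circledast$.

5.2.3.1. Theorem. Let $\mathbf{X}$ denote an $m \times n$-pixel array of points, and let $\mathbf{a} \in \mathbb{R}^{\mathbf{X}}$. Let template $\mathbf{t} \in (\mathbb{R}^{\mathbf{X}})^{\mathbf{Y}}$ with $\mathbf{e} \in \mathbb{R}^{\mathbf{X}}$ and $\mathbf{f} \in \mathbb{R}^{\mathbf{Y}}$ denoting the error images of $\mathbf{a}$ and $\mathbf{t}$, respectively, such that

$$\mathbf{f}(\mathbf{y}) = \sum_{\mathbf{x} \in S(\mathbf{t}_{\mathbf{y}})} \delta(\mathbf{t}_{\mathbf{y}}(\mathbf{x})), \quad \mathbf{y} \in \mathbf{Y}, \tag{218}$$

where absolute error is assumed.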
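If $\mathbf{c} = \mathbf{a} \oplus \mathbf{t}$, then the error $\delta\mathbf{c}$ in the result is bounded by Equation 219, whose right-hand member (an expression in $\Sigma\mathbf{e}$ and $\mathbf{f}(\mathbf{y})$) is illegible in the source.

Proof. Assuming that the givens hold, a pixel value $\mathbf{c}(\mathbf{y})$ in the convolved image $\mathbf{c} = \mathbf{a} \oplus \mathbf{t}$ is given by:

$$\mathbf{c}(\mathbf{y}) = \sum_{\mathbf{x} \in \mathbf{X}} \mathbf{a}(\mathbf{x}) \cdot \mathbf{t}_{\mathbf{y}}(\mathbf{x}). \tag{220}$$

The error $\delta(\mathbf{c}(\mathbf{y}))$ is derived as

$$\delta(\mathbf{c}(\mathbf{y})) = \delta\left( \sum_{\mathbf{x} \in \mathbf{X}} \mathbf{a}(\mathbf{x}) \cdot \mathbf{t}_{\mathbf{y}}(\mathbf{x}) \right) = \sum_{\mathbf{x} \in \mathbf{X}} \delta\left( \mathbf{a}(\mathbf{x}) \cdot \mathbf{t}_{\mathbf{y}}(\mathbf{x}) \right), \tag{221}$$

where the remaining steps of Equation 221, which apply the law of products to each term, are illegible in the source. The result restates the theorem. The error propagation function for the right product $\mathbf{c} = \mathbf{t} \oplus \mathbf{a}$ is symmetrically defined.

5.2.3.2. Remark. The error in the generalized image-template product $\circledast$ is computed similarly to the method of Theorem 5.2.3.1. Assuming that $\mathbf{a} \in \mathbf{F}^{\mathbf{X}}$ and $\mathbf{t} \in (\mathbf{F}^{\mathbf{X}})^{\mathbf{Y}}$, we let $\mathbf{c}(\mathbf{y}) = \Gamma_{\mathbf{x} \in \mathbf{X}}\, \mathbf{a}(\mathbf{x}) \circ \mathbf{t}_{\mathbf{y}}(\mathbf{x})$, $\mathbf{y} \in \mathbf{Y}$, where $\circ, \gamma : \mathbf{F} \times \mathbf{F} \to \mathbf{F}$ denote associative, commutative functions. We compute the deviation in $\mathbf{c}(\mathbf{y})$ by first applying the error propagation function for $\gamma$. For example, if $\gamma = +$, then the law of sums (Theorem 5.1.2.1) applies. We next apply the error function specific to the operation $\circ$. For example, if $\circ = \times$, then the law of products (Theorem 5.1.2.2) is applied. Since we have defined error propagation in Section 5.1 for all arithmetic operations that are presently employed in IA image-template products, the construction of the composite error function is primarily a matter of algebraic manipulation. As before, the error functions for the right and left products are symmetrically defined.

5.3. Theory of Error-Tolerant Computation

In Chapter 4, we discussed the optimization of computational networks comprised of operations and transforms, where the optimization criterion was time complexity. In this section, we discuss a similar situation, namely, optimization based upon a criterion of the computational error that results from a given image processing algorithm.

5.3.1. Basic Concepts.

5.3.1.1. Observation. Recall, from Theorem 5.1.3.3, that composition of functions yields an isomorphic composition of error functions.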
That is, if $g = f_1 \circ f_2 \circ \cdots \circ f_n$ and $\varepsilon_{f_i}$ denotes the $i$-th error function, then the net error function is given by

$$\varepsilon_g = \left( \bigcirc_{i=1}^{n-1}\, \varepsilon_{f_i} \right) \circ \varepsilon_{f_n}(\delta x),$$

where $\delta x$ symbolizes the input error.

5.3.1.2. Assumption. Let an approximate isomorphic computational system over $\{\mathbf{S}, \mathcal{O}\}$ and $\{\mathbf{S}', \mathcal{O}'\}$ be given, where the homomorphism $T : \mathbf{S} \to \mathbf{S}'$ has an $\epsilon$-near approximate inverse $T^* : \mathbf{S}' \to \mathbf{S}$. Given that $\mathcal{O}(\mathbf{f}, \mathbf{g}) \approx T^*(\mathcal{O}'(T(\mathbf{f}), T(\mathbf{g})))$, $\forall\, \mathbf{f}, \mathbf{g} \in \mathbf{S}$, denote the error function associated with $\mathcal{O}$ as $\varepsilon_{\mathcal{O}}(\delta\mathbf{f}, \delta\mathbf{g})$, and likewise denote $\varepsilon_T$, $\varepsilon_{T^*}$, and $\varepsilon_{\mathcal{O}'}$. Further assume that computation of $T^*(\mathcal{O}'(T(\mathbf{f}), T(\mathbf{g})))$ is more efficient, for all $\mathbf{f}, \mathbf{g} \in \mathbf{S}$, than the computation of $\mathcal{O}(\mathbf{f}, \mathbf{g})$.

5.3.1.3. Observation. Under constraint of Assumption 5.3.1.2, if $n$ denotes the input size, $n' = n/CR_d$, and we assume that $n'' \le n$, then from Section 4.4, the work

$$W_{\mathcal{O}}(n) > W_T(n) + W_{\mathcal{O}'}(n') + W_{T^*}(n''). \tag{222}$$

Similarly, if $\varepsilon_{\mathcal{O}} > \varepsilon_{T^*} \circ \varepsilon_{\mathcal{O}'} \circ \varepsilon_T$, then (under constraint of reasonable work and error bounds), the computation $T^*(\mathcal{O}'(T(\mathbf{f}), T(\mathbf{g})))$ may be less accurate than computing $\mathcal{O}$.

Assume that a given function $f$ exhibits time complexity $T_f$, space complexity $S_f$, and has an error function $\varepsilon_f$. In order to design a computational system that implements $f$ under the aforementioned constraints, one would employ the usual time-space-error bandwidth product, denoted by $\nu_f = (T_f \cdot S_f \cdot \varepsilon_f)^{-1}$. Alternatively, a method of stochastic search, such as genetic algorithms, might be used to find the computationally optimal solution to the graph shown in Figure 4. In practice, the function $\nu$, $T$, $W$, $S$, or $\varepsilon$ can be substituted for the time complexity in the equations of Section 4.4. Since the dataflow graph of the performance function $\nu$ is isomorphic to the dataflow graph of the computational system that $\nu$ represents, the optimization theory given in Chapter 4 can be applied straightforwardly to error optimization.

5.3.2. Multivariate Error Functions.
In the analysis of error, one encounters the realistic case of multivariate functions, whose errors are modelled as follows.

5.3.2.1. Theorem. Under constraint of Assumption 5.3.1.2 and Observation 5.3.1.3, if $\mathcal{O}$ and $\mathcal{O}'$ are $m$-ary functions over $\mathbf{S}$ and $\mathbf{S}'$, respectively, with arguments $\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_i, \ldots, \mathbf{f}_m \in \mathbf{S}$, then the computation of the right member of

$$\mathcal{O}(\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_i, \ldots, \mathbf{f}_m) \approx T^*\left( \mathcal{O}'\left( T(\mathbf{f}_1), T(\mathbf{f}_2), \ldots, T(\mathbf{f}_i), \ldots, T(\mathbf{f}_m) \right) \right) \tag{223}$$

incurs the error given by Equation 224 (illegible in the source), where $\delta\mathbf{f}_i$ denotes the deviation in $\mathbf{f}_i$.

Proof. Assuming that $\partial\mathcal{O}' / \partial\mathbf{f}_i'$ can be taken, and that $\partial\mathbf{f}_i' = \partial T(\mathbf{f}_i) / \partial\mathbf{f}_i$, the proof follows directly from Theorems 5.1.2.1 and 5.1.2.2, and the chain rule of derivatives. Note that the error function for $T^*$ may require error parameters in addition to the one provided in Equation 224.

5.3.2.2. Observation. In a system that has more than one operation $\mathcal{O}$ or more than one compressive format, the error function $\varepsilon_T$ would be replaced by the error inherent in the conversion function $C_{kl}(\mathbf{a}_c) \approx T_l(T_k^*(\mathbf{a}_c))$, which we denote as $\varepsilon_{C_{kl}}$.

As a result of the preceding theory, we can predict the error inherent in a computational system. The optimization theory of Chapter 4 can be applied to Equation 224, yielding an error-optimized computation. We next discuss practical aspects of error optimization, as well as implementational issues of error-tolerant computation.

5.4. Feasibility of Error-Tolerant Computation

In the previous section, we derived error functions for approximate isomorphic computational systems that are based on transforms that have $\epsilon$-near approximations to their inverses. Additionally, we showed that the temporally-constrained optimization described in Section 4.4 can be applied to error optimization. In this section, we discuss the feasibility of error detection and measurement, as well as the utility of such methods in image processing.

5.4.1. Automatic Error Profiling.
It is easily verified that the error functions described in Sections 5.1 and 5.2 could be incorporated in computer software, such as a compiler. During execution, the accumulated error would be computed and subsequently reported to the algorithm designer or user. Such software has been designed and implemented in prototype form by the author [62]. The advantages of this technique, called error profiling, include:

1. Interactive error display in interpretive software can aid algorithm development via selection of functions that yield near-optimal error figures;

2. The prediction of accrued error for given input data would greatly aid the performance and reliability analysis of algorithms employed in field scenarios, such as automated target recognition applications; and

3. Error profiling would furnish useful performance data for the validation of computational accuracy in processors and compilers, which is currently lacking in many software testing efforts.

An example of item 1 can be found in the development of medical image processing software. In such circumstances, a physician using an image enhancement package may want to know the amount of information loss encountered at various spatial frequencies. Such reports are key to the determination of image feature detectability, as in cases of incipient tumors, early signs of vascular aneurysm, and the microhemorrhages commonly manifested in diabetic retinopathy [63]. Via theory given in the following section, one can determine the effect of a given error in terms of information-theoretic measures such as entropy and uncertainty. We present several related applications in Chapters 6-10.

In the field of cryptology, the analysis of certain message formats can be fraught with error when the encryption transform operates near the machine's error limit. Similarly, the analysis of airborne or spaceborne reconnaissance imagery requires knowledge of the error produced by image enhancement procedures.
In particular, the derivation of lexical or semantic information from data processed with such inverse transforms may be erroneous, due to errors inherent in the transform inversion process, such as division by small magnitudes in the inverse Fourier transform. Per item 2, above, the design of cryptanalytic or enhancement procedures could be greatly aided by error prediction at each function stage. However, the computational work incurred by the previously-given error functions can exceed the cost of computing the function whose error is being estimated. For example, estimation of the product error (Theorem 5.1.2.2) requires two divisions, two multiplications, and one addition operation. In this case, $W_\varepsilon > 4 \cdot W_\times$. Due to their high cost, such computations would likely be useful primarily for software design and testing.

An open research question pertains to the errors inherent in digital image acquisition. Although sampling and quantization errors have been extensively analyzed, published reports have infrequently focused on the comparison of focal-plane irradiance with the digital output of the image acquisition hardware and software. Unfortunately, focal-plane radiometry, which would be used to measure detector input, is a difficult procedure in a research environment and more so in the field. Thus, it is appropriate to consider the supporting technologies that may be required for accurate quantification of device errors in support of error profiling. Given the significance of input errors in the formulae of Sections 6.1 and 6.2, such analyses should be available to the algorithm designer in a form that can be easily manipulated. Thus, much work remains in the development of accurate, user-friendly calibration and error measurement techniques for digital imaging.

The remainder of this section contains a summary of error measures, input errors, and examples of propagation errors.
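The error-profiling idea of Section 5.4.1 can be illustrated with a minimal sketch, in which each value carries a bound on its absolute error and the bound is updated by the law of sums (Theorem 5.1.2.1) and the law of products (Theorem 5.1.2.2, taken here in the first-order form $|\delta(ab)| \le (|\delta a / a| + |\delta b / b|) \cdot |ab|$ implied by Example 5.2.1.5). The class name `Tracked` and its fields are hypothetical and are not taken from the prototype cited as [62]; nonzero operands are assumed for the product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tracked:
    """Hypothetical helper: a value paired with a bound on its absolute error."""
    value: float
    err: float  # non-negative bound on |delta value|

    def __add__(self, other: "Tracked") -> "Tracked":
        # Law of sums: |d(a + b)| <= |da| + |db|.
        return Tracked(self.value + other.value, self.err + other.err)

    def __mul__(self, other: "Tracked") -> "Tracked":
        # Law of products, first-order form (assumes nonzero operands):
        # |d(a * b)| <= (|da / a| + |db / b|) * |a * b|.
        # The estimate costs two divisions, one addition, and two
        # multiplications, which matches the operation count in the text.
        product = self.value * other.value
        frac = abs(self.err / self.value) + abs(other.err / other.value)
        return Tracked(product, frac * abs(product))


a = Tracked(4.0, 0.1)
b = Tracked(5.0, 0.2)
total = a + b      # value 9.0, error bound about 0.3
scaled = a * b     # value 20.0, error bound about 1.3
```

The sketch also makes the cost argument concrete: the product itself needs one multiplication, while its error estimate needs five arithmetic operations.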
We emphasize the representation of error in terms of a joint magnitude and error distribution. Via such representations, we can support accurate computation of the error functions given in Sections 5.1 and 5.2 when the error distribution is computed over an entire image.

5.4.2. Joint Magnitude/Error Distributions.

We begin with a summary of error measures, then define the joint magnitude-error distribution (JMED) of an image. Subsequent remarks pertain to the use of JMED in the analysis of error propagation in discrete systems.

5.4.2.1. Definition. For purposes of discussion, let a source image $\mathbf{a} \in \mathbb{R}^{\mathbf{X}}$ be derived from a hypothetical image acquisition device that yields an error image $\mathbf{e} \in \mathbb{R}^{\mathbf{X}}$. Salient error measures can be computed from $\mathbf{e}$, as follows:

• Sum-squared error: $\varepsilon_{ss}(\mathbf{a}) = \Sigma\mathbf{e}^2$
• Average or mean error: $\Sigma\mathbf{e} / |\mathbf{X}|$
• RMS error: $(\Sigma\mathbf{e}^2 / |\mathbf{X}|)^{1/2}$
• Error histogram: $\mathbf{h}(\mathbf{e})(r) = \Sigma\chi_{=r}(\mathbf{e})$, $r \in range(\mathbf{e})$

5.4.2.2. Definition. Let the joint magnitude-error distribution (JMED) of image $\mathbf{a} \in \mathbb{R}^{\mathbf{X}}$ be denoted by $e(\mathbf{a}) \in \mathbb{R}^{\mathbf{U} \times \mathbf{V}}$, where $\mathbf{U} = f(range(\mathbf{a}))$, $\mathbf{V} = h(range(\mathbf{e}))$, with $f$ and $h$ denoting indexing functions. Let $f$ and $h$ have inverses $f^{-1}$ and $h^{-1}$, such that $e(i,j)$ represents the probability of a value $f^{-1}(i)$ whose error measure is given by $h^{-1}(j)$.

5.4.2.3. Remark. It is easily verified that the functions $f$ and $h$ are practical for values in a finite mapping based upon $\mathbb{Z}_m$, where $m$ is small. For example, assume the normal case of $m = 256$. Under the assumption of a maximum error of ±100 percent in any given value, $h$ would range over the interval $[-256, +512]$. Thus, the space complexity of $e$ would be $m(3m+1)$, or $O(m^2)$, which is not burdensome for $m = 256$. Now suppose that the error distributions contained in the rows of $e$ could be expressed in terms of fractional error. Thus, if a sampling interval exceeding $1/2m$ was employed as the increment for indexing $\mathbf{V}$, then the space requirement specific to the second coordinate of $domain(e)$ would be reduced.
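The error measures of Definition 5.4.2.1 and the JMED of Definition 5.4.2.2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's implementation: images are flat Python lists, the function names `error_measures` and `jmed` are hypothetical, and identity indexing stands in for the indexing functions $f$ and $h$ (each pixel value and its error are used directly as keys).

```python
import math
from collections import Counter


def error_measures(e):
    """Summary measures of a flat list of pixel errors e."""
    n = len(e)
    sse = sum(v * v for v in e)    # sum-squared error
    mean = sum(e) / n              # average (mean) error
    rms = math.sqrt(sse / n)       # RMS error
    hist = Counter(e)              # error histogram: error value -> count
    return sse, mean, rms, hist


def jmed(a, e):
    """Joint magnitude-error distribution as {(value, error): probability}.

    Assumption: identity indexing replaces the indexing functions f and h.
    """
    n = len(a)
    counts = Counter(zip(a, e))    # joint occurrence counts over all pixels
    return {key: c / n for key, c in counts.items()}


pixels = [10, 10, 20, 20]
errors = [1, -1, 1, 1]
sse, mean, rms, hist = error_measures(errors)
dist = jmed(pixels, errors)        # e.g. dist[(20, 1)] is 0.5
```

A dictionary keyed on observed (value, error) pairs stores only the occupied cells of the $m(3m+1)$ table discussed in Remark 5.4.2.3, which is one simple way to keep the space requirement small.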
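We previously (Section 5.4.1) alluded to errors in image acquisition, which are assumed to consist mainly of pixel magnitude errors and quantization errors, described as follows.

5.4.2.4. Remark. Let a camera project irradiance reflected or emitted from a scene comprised of three-dimensional objects to an $m \times n$-pixel focal plane $\mathbf{X}$, thus forming an image $\mathbf{a} \in (\mathbb{Z}_m)^{\mathbf{X}}$. We assume a magnitude quantization error of ±1 LSB, which will equal $\pm 1/m$. Suppose that the camera exhibits an error distribution of $\Pr(\varepsilon_r)$, where $\varepsilon_r$ denotes the error associated with a given value $r \in range(\mathbf{a})$. Then, from Definition 5.4.2.2, the JMED of $\mathbf{a}$ is given by:

$$e(f(r), h(z)) = \Pr(\varepsilon_r), \quad r \in range(\mathbf{a}), \quad z \in range(\varepsilon_r), \tag{225}$$

which implies that $\varepsilon_r$ is a real-valued function.

The JMED is a convenient construct for analyzing error propagation in discrete systems, as follows.

5.4.2.5. Definition. Let images $\mathbf{a}, \mathbf{b} \in (\mathbb{Z}_m)^{\mathbf{X}}$ exhibit the corresponding JMEDs $e(\mathbf{a}), e(\mathbf{b}) \in \mathbb{R}^{\mathbf{U} \times \mathbf{V}}$, where $\mathbf{U} = \mathbb{Z}_m$ and $\mathbf{V}$ is comprised of error fractions on the real interval $[0,1]$. If the scalar operation $\mathcal{O} : \mathbb{Z}_m \times \mathbb{Z}_m \to \mathbb{Z}_m$ exhibits the JMED error function $\varepsilon_{\mathcal{O}} : \mathbb{R}^{\mathbf{U} \times \mathbf{V}} \times \mathbb{R}^{\mathbf{U} \times \mathbf{V}} \times (\mathbb{Z}_m)^{\mathbf{X}} \to \mathbb{R}^{\mathbf{U} \times \mathbf{V}}$, then assuming that $\mathbf{c} = \mathbf{a}\, \mathcal{O}\, \mathbf{b}$ is available, the resulting JMED is given by $e(\mathbf{c}) = \varepsilon_{\mathcal{O}}(e(\mathbf{a}), e(\mathbf{b}), \mathbf{c})$.

5.4.2.6. Example. Let Definition 5.4.2.2 hold for $\mathcal{O} = +$ as well as indexing functions $f$ and $h$. From Theorem 5.1.2.1, if $\mathbf{c} = \mathbf{a} + \mathbf{b}$ is defined on $\mathbb{R}^{\mathbf{X}}$, then $\delta\mathbf{c}(\mathbf{x}) \le \delta\mathbf{a}(\mathbf{x}) + \delta\mathbf{b}(\mathbf{x})$, $\mathbf{x} \in \mathbf{X}$. Let $r$ denote a function that extracts an element in $p_2(domain(e))$, $s_a = \mathbf{a}(\mathbf{x}) \cdot r(\mathbf{a}(\mathbf{x}))$, and $s_b = \mathbf{b}(\mathbf{x}) \cdot r(\mathbf{b}(\mathbf{x}))$. The resulting JMED $e(\mathbf{c})$ is given by

$$e(\mathbf{c})\left( f(\mathbf{c}(\mathbf{x})),\, h(s_a + s_b) \right) = e(\mathbf{a})\left( f(\mathbf{a}(\mathbf{x})),\, h(s_a) \right) + e(\mathbf{b})\left( f(\mathbf{b}(\mathbf{x})),\, h(s_b) \right). \tag{226}$$

It is assumed that $h$ contains a renormalizing function, in order to maintain the error fraction within the range $[0,1]$. The error predictors defined in Sections 5.1 through 5.3 are similarly expressed in terms of the JMED. We next discuss information-theoretic issues associated with error propagation.

5.5. Information Theory and Error-Tolerant Computation.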
The information-theoretic description of computational error is scantily reported in the image processing literature. We thus present a summary of salient theory, with examples. A brief digression on applications of information theory to cryptography concludes this section.

5.5.1. Models of Information Transmission.

We begin with Shannon's information-theoretic model of communication, which we modify to be isomorphic to the commutativity diagram of Figure 1.

5.5.1.1. Observation. Per Shannon's seminal paper [68], we have the following diagram:

Source → Transmitter → Channel → Receiver → Sink

Figure 6. Information-theoretic model of communication,

which can be modified by adding an encoder and decoder, to yield the following model:

Source - - - - - - - - - - - - - - - - - - - - - - → Sink
  ↓                                                   ↑
Encoder → Transmitter → Channel → Receiver → Decoder

Figure 7. Information-theoretic model of communication, where the dashed line denotes an idealized path.

5.5.1.2. Observation. An analogy can be drawn between the preceding diagram and the commutativity diagram of an isomorphic computational system, shown in Figure 1. For example, denote the source as an image $\mathbf{a}$; the sink as $\hat{\mathbf{a}}$; the encoder and decoder as a transform $T$ and its inverse $T^{-1}$ (or an $\epsilon$-near approximation $T^*$ to its inverse), respectively; the encoder output as $\mathbf{a}_c = T(\mathbf{a})$; and the receiver input as $\hat{\mathbf{a}}_c$. Letting an image operation $\mathcal{O}$ represent the idealized channel in Figure 7, we see that $\mathcal{O}'$, the analogue of $\mathcal{O}$ over $range(T)$, can represent the corresponding real communication channel, which may include effects due to noise, system errors, etc.

5.5.1.3. Remark. The model described in the preceding observation is intuitively attractive for the analysis of both compressive and encryptive computation, and provides insights into certain aspects of cryptanalysis. In order to gain a unified understanding of such problems, we briefly address several related mathematical issues, as follows.

5.5.2. Basic Information Theory.
We begin with several observations from information theory, which prepare us for an information-theoretic description of image compression.

5.5.2.1. Observation. Assume that processes $P_1$ and $P_2$ have $n_1$ and $n_2$ equiprobable outcomes, respectively. If the outputs of $P_1$ and $P_2$ are combined to yield the output of a process $P_3$, then $P_3$ has $n_3 = n_1 \cdot n_2$ equiprobable outcomes. The computation of $n_3$ is described as follows. Let $P_1, P_2, P_3 : \mathbf{S} \to \mathbf{S}$, and let $n_1 = |range(P_1)|$ with $n_2 = |range(P_2)|$. If $P_3$ is obtained from some function of the output of $P_1$ and $P_2$, then $domain(P_3) \subseteq range(P_1) \times range(P_2)$. However, we defined $range(P_i) = domain(P_i)$, for $i = 1..3$. Thus, the preceding expression can be rewritten as

$$|range(P_3)| \le |range(P_1)| \times |range(P_2)|. \tag{227}$$

Assuming the limiting case, where the preceding relation is the equality relation, we have that $n_3 = n_1 \cdot n_2$.

5.5.2.2. Observation. The amount of information output by a continually productive source is a strictly increasing function of time.

5.5.2.3. Definition. Let $H$ denote the information content (or entropy) of a process, and let $\mathcal{H}$ denote the operation that determines $H$. If $H_i = \mathcal{H}(P_i)$, for $i = 1..3$, then $H_3 = H_1 + H_2$.

5.5.2.4. Assumption. Let the number of equiprobable outcomes of a process $P : \mathbf{S} \to \mathbf{S}$ be determined by a function $N : (\mathbf{S} \to \mathbf{S}) \to \mathbb{N}$. Assume that a function $f$ combines two processes of form $\mathbf{S} \to \mathbf{S}$ to yield a process of form $\mathbf{S} \to \mathbf{S}$. What analytically tractable function can embody the concept of $N$?

5.5.2.5. Lemma. Under constraint of Assumption 5.5.2.4, the entropy $\mathcal{H}(P_3)$ of a process $P_3 = f(P_1, P_2)$ is given by $\mathcal{H}(P_3) = \log(n_3) = \log(N(P_3))$ if the following conditions are satisfied:

(i) $P_1, P_2 : \mathbf{S} \to \mathbf{S}$ are two processes having equiprobable outcomes denoted by $n_i = N(P_i)$, for $i = 1..2$.

(ii) $P_3$ has $n_3 = n_1 \cdot n_2$ equiprobable outcomes.

(iii) If $H_i = \mathcal{H}(P_i)$, for $i = 1..3$, then $H_3 = H_1 + H_2$.

Proof. Assume that the givens and conditions i) through iii) hold.
Note that $\mathcal{H} = \log$ is the only function that satisfies i)-iii) simultaneously, since $n_3 = n_1 \cdot n_2$, $\mathcal{H}(P_3) = \log(n_3) = \log(N(P_3))$, and

$$\log(n_3) = \log(n_1 \cdot n_2) = \log(n_1) + \log(n_2). \tag{228}$$

Thus, if $\mathcal{H} = \log$, then $H_3 = H_1 + H_2$, and the lemma holds.

5.5.2.6. Remark. Suppose we have two processes $P_1$ and $P_2$ with numbers of equiprobable outcomes $n_1$ and $n_2$. Now, let probabilities $p_1$ and $p_2$ be associated with $n_1$ and $n_2$, such that $n = n_1 + n_2$ and

$$p_1 = \frac{n_1}{n} \quad \text{and} \quad p_2 = \frac{n_2}{n}. \tag{229}$$

Recall that the entropy associated with one message among $n$ equally likely outcomes is denoted by $\mathcal{H}(n) = \log(n)$. However, for $n_1/n$ of the time, $\mathcal{H}(n_1) = \log(n_1)$. Symmetrically, $\mathcal{H}(n_2) = \log(n_2)$. Thus, the net entropy is given by

$$H = \log(n) - (n_1/n) \cdot \log(n_1) - (n_2/n) \cdot \log(n_2) = -p_1 \cdot \log(p_1) - p_2 \cdot \log(p_2). \tag{230}$$

Since $p_1, p_2 < 1$, the logarithms are negative, and $H > 0$. Via Equation 230, we can describe information content in terms of a probability distribution $\{p_i \in [0,1] : i = 1..n\}$, as shown in the following theorem.

5.5.2.7. Theorem. Let a process $P$ have $n$ possible outcomes that exhibit individual probabilities $p_i$, where $i = 1..n$. The entropy of $P$, denoted by $\mathcal{H}(P)$, is given by:

$$\mathcal{H}(P) = -\sum_{i=1}^{n} p_i \cdot \log(p_i). \tag{231}$$

Proof. The proof, which is well known, follows from Remark 5.5.2.6 and induction on $n$.

We next restate the law of additive combination of information, with proof given in References 30, 69, and 70.

5.5.2.8. Theorem. If a process $P$ consists of two statistically independent processes $P_1$ and $P_2$, then the entropy of $P$ is given by

$$\mathcal{H}(P) = \mathcal{H}(P_1) + \mathcal{H}(P_2). \tag{232}$$

5.5.2.9. Theorem. If a process $P$ consists of two processes $P_1$ and $P_2$ that are not statistically independent, then the entropy of $P$ is given by

$$\mathcal{H}(P) < \mathcal{H}(P_1) + \mathcal{H}(P_2). \tag{233}$$

The following theorem of maximum information is stated without proof, which is given in Reference 69.

5.5.2.10. Theorem.
If a discrete process $P$ is a source that has $n$ probable outcomes that exhibit individual probabilities $p_i$, where $i = 1..n$, then $P$ yields maximum information when $p_i = 1/n$, for all $1 \le i \le n$.

The following examples suffice to show the utility of the maximum-information theorem.

5.5.2.11. Example. The modern English alphabet, denoted by $\mathbf{F}$, has 26 symbols. Thus, $\mathcal{H}(\mathbf{F}) = \log(26) \approx 4.7$ bits/symbol. If the probability of each letter is taken into account and we apply Equation 233, then $\mathcal{H}(\mathbf{F}) < 4$ bits/symbol. The actual information content of English has been shown to be approximately one bit per symbol [30], since the nonuniform occurrence of n-grams implies statistical dependence among symbols.

5.5.2.12. Example. Assume that the substitution cipher $T_s : \mathbf{F} \times \mathbf{K}^{|\mathbf{F}|} \to \mathbf{F}$ is one-to-one and onto. Let $|\mathbf{F}| = 26$, as is customary, and let the keyspace $\mathbf{K} = \mathbf{F}$. Thus, there are $26!$ (i.e., $|\mathbf{F}|!$) possible keys of length 26 symbols. Let us construct a 26-character key that encrypts a 40-character text. The information content of the key is given by $H_k = \log(|\mathbf{F}|!) = \log(26!)$. If the text has entropy $H_t$, then the substitution has entropy $\mathcal{H}(T_s) \le H_t + H_k$. From Theorem 5.5.2.7, a 40-character text has maximum entropy $\vee(H_t) = 40 \cdot \log(26)$. Since we cannot determine a priori whether the plaintext and key are statistically independent, we use Theorems 5.5.2.7 and 5.5.2.9 to derive the inequality

$$\vee(\mathcal{H}(T_s)) \ge H_t + H_k = H_t + \log(26!). \tag{234}$$

Substituting the maximum entropy for $H_t$, we obtain:

$$H_t \le \vee(H_t) - H_k = 40 \cdot \log(26) - \log(26!) \approx 100 \text{ bits}. \tag{235}$$

At 100 bits per 40 characters, we have 2.5 bits/symbol, which is considerably less than the maximum information content of text computed in Example 5.5.2.11.

5.5.3. Information Theory and Data Compression

The preceding examples suggest that one could construct transformations that compress the information content of a given source.
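The arithmetic of Examples 5.5.2.11 and 5.5.2.12 can be checked with a short sketch, given here as an illustration only. Base-2 logarithms are assumed, since the results are quoted in bits, and the helper name `entropy_bits` is hypothetical. Under these assumptions, the uniform 26-symbol alphabet gives about 4.70 bits/symbol, and $40 \cdot \log_2(26) - \log_2(26!)$ evaluates to roughly 99.6 bits, in agreement with the figure of approximately 100 bits in Equation 235.

```python
import math


def entropy_bits(probs):
    """Shannon entropy in bits: H = -sum(p * log2(p)) over nonzero p."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Uniform 26-letter alphabet: the maximum-information case (p_i = 1/n).
h_uniform = entropy_bits([1 / 26] * 26)      # equals log2(26)

# Entropy of a key drawn uniformly from the 26! substitution keys.
h_key = math.log2(math.factorial(26))

# Maximum entropy of a 40-character text, less the key entropy.
h_residual = 40 * math.log2(26) - h_key

# Per-symbol figure for the 40-character text.
h_per_symbol = h_residual / 40
```

Any nonuniform distribution passed to `entropy_bits` yields a smaller value than the uniform case with the same number of outcomes, which is the content of the maximum-information theorem.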
We next show that a compressive transform can exploit existing communication channel bandwidth to increase a channel's information-carrying ability.

5.5.3.1. Definition. An ideal channel transmits a finite number of symbols without error, given the required transmission time of $t_i$ seconds per symbol $s_i$.

5.5.3.2. Definition. Channel capacity is the maximum rate (in bits/second) at which one can transmit a signal over a channel using the fastest possible source.

5.5.3.3. Remark. Channel capacity may be effectively increased by placing an encoder between the source and the channel. We initially assume that the encoder is lossless (i.e., is an invertible transformation). In the modification of Figure 7 discussed in Section 5.5.1, such an encoder would correspond to the transformation $T$.

5.5.3.4. Definition. A statistical encoder outputs symbols $s_i \in F$, where each symbol has probability $p_i = \Pr(s_i)$. If the output $\mathbf{a}_c$ of a statistical encoder consists of $n$ symbols, then $\mathbf{a}_c$ is called an n-gram.

5.5.3.5. Definition. The compression ratio of a statistical encoder that outputs an n-gram is computed as:

$$CR = \left[ \frac{1}{n} \sum_{i=1}^{n} p_i \cdot siz(s_i) \right]^{-1},$$ (236)

where $siz$ was defined in Chapter 3.

5.5.3.6. Example. Assume that a source has an alphabet $F = \{a, b\}$, where the probabilities $\Pr(a) = 0.7$ and $\Pr(b) = 0.3$ are assumed to be statistically independent, for purposes of convenience. The source output rate is assumed to be 80 symbols per minute. The entropy of this source is given by:

$$H = -0.7 \cdot \log(0.7) - 0.3 \cdot \log(0.3) = 0.779 \text{ bits/symbol},$$ (237)

and the information rate is given by:

$$\frac{H}{t} = 0.779\ \frac{\text{bits}}{\text{symbol}} \cdot \frac{80 \text{ symbols/min}}{60 \text{ seconds/min}} = 1.038 \text{ bits/sec}.$$ (238)

However, we prefer the channel to transmit $F$-valued messages at 0.95 bits/sec, which is less than the source output of 1.038 bits/sec. Thus, we derive an encoder that constrains the source to exhibit an output bandwidth that is less than or equal to the channel bandwidth.
Since $\Pr(a) \ne \Pr(b)$, we can employ a digraphic encoder $T_{dg} : F^2 \to \{0,1\}^{*}$ to describe data via unique prefix codes. For example, $T_{dg}$ could implement the following map:

aa -> 0 with probability 0.49 = 0.7^2
ab -> 10 with probability 0.21 = 0.7 x 0.3
ba -> 110 with probability 0.21 = 0.3 x 0.7
bb -> 111 with probability 0.09 = 0.3^2

From Definition 5.5.3.5, the inverse compression ratio of $T_{dg}$ is given by

$$CR^{-1} = \frac{1}{m} \sum_{i} p_i \cdot d_i = 0.5 \cdot (0.49 \cdot 1 + 0.21 \cdot 2 + 0.21 \cdot 3 + 0.09 \cdot 3) = 0.5 \cdot 1.81 = 0.905 \text{ encoding symbols/source symbol},$$ (239)

which implies $CR = 1.105$. The source-encoder combination will now produce the output rate $(H/t)/CR = 1.038 \cdot 0.905 = 0.939$ bits/sec. Thus, the channel, with a capacity of only 0.95 bits per second, can transmit the output of the combined source and encoder, which has a rate less than 0.94 bits per second.

We next discuss issues related to the transmission of encoded signals sent across realistic channels.

5.5.4. Information Loss and Computational Error.

In practice, it is useful to express the effect of noise on encoded signals in terms of information-theoretic measures such as entropy. Such measures depend upon noise statistics and correlation of noise sources with encoded data. Additionally, information-theoretic analyses that describe channel errors can describe computational error.

5.5.4.1. Observation. Let a message or image $\mathbf{a} \in F^X$ be specified as $\mathbf{a} = (s_1, s_2, \ldots, s_n)$. In the preceding section, we compressed a signal using a statistical encoder to send $N$ bits per second of information across a channel with bandwidth $M < N$. In realistic channels, noise often corrupts $K < M$ bits of $\mathbf{a}$. For example, the noise may be correlated with $\mathbf{a}$ or the encoded image $\mathbf{a}_c$. Additionally, the noise may be nonuniformly distributed in time or space.

5.5.4.2. Remark.
When such noise is modelled as being uniformly distributed (for purposes of convenience) and is random (and thus is irreducible), we can say that the entropy $H(\mathbf{a})$ is biased or increased in the communication channel by $K$ bits per symbol. Using the formalism derived in Section 5.5.1, if a signal $\mathbf{a}$ is compressed to yield $\mathbf{a}_c$ at the channel input, then the entropy at the receiver input is given by

$$H(\mathbf{a}_c) + K = H(\mathbf{a})/CR + K,$$ (240)

where $CR$ denotes the compression ratio. If the symbol probabilities are not independent, then we replace equality with "<", per Theorem 5.5.2.9.

When the source rate $N$ is less than the channel bandwidth $M$ but the source and noise rates together satisfy $(N + K) > M$ and $K < N$, portions of the source will be replaced by noise. As a result, we have the following theorem.

5.5.4.3. Theorem. Let a signal $\mathbf{a}$ be transmitted along a lossy channel of bandwidth $B > H(\mathbf{a}) = N$. If the channel has $K < B$ irreducible noise rate and $K < N$, then the following statements hold:

(i) $L = B - N$ bits per symbol represent noise that does not necessarily corrupt $\mathbf{a}$;
(ii) $B - K$ bits of $\mathbf{a}$ are not necessarily corrupted by noise; and
(iii) $K - L$ bits of $\mathbf{a}$ represent noise.

Proof. Let the givens hold and be portrayed by Figure 8, from which i) through iii) follow directly.

Figure 8. Arrangement of noise levels in a signal $\mathbf{a}$ with entropy $H(\mathbf{a})$ that partially occupies a channel of bandwidth $B$ having $K$ noise bits.

5.5.4.4. Example. In Example 5.5.3.6, an encoder compressed a 1.04 bits/pixel (bpp) signal $\mathbf{a}$ into a 0.94 bpp signal $\mathbf{a}_c$, which could then be transmitted along a 0.95 bpp channel. If the channel has noise of $K = 0.1$ bpp, then $H(\mathbf{a}_c)$ is increased by $K$ if the noise is additive, or 0.1 bpp of $\mathbf{a}_c$ are replaced by noise (i.e., become irreversibly noise-corrupted).

5.5.4.5. Observation. Theorem 5.5.4.3 says nothing about the spatial effects of noise on compressed imagery.
For example, assume an image $\mathbf{a}$ is transmitted along a noisy channel in uncompressed form. Let errors in the source image be highly correlated with bit errors in the compressed image $\mathbf{a}_c$. Thus, the distortions in $\mathbf{a}_c$ will represent the effect that such distortions would have on $\mathbf{a}$. However, the majority of commonly used compression schemes decorrelate $\mathbf{a}$ and $\mathbf{a}_c$ in order to achieve high compression. Unfortunately, if noise statistics and ergodicity are not well known a priori, unpredictable (and possibly serious) image distortions can result. The following example is illustrative.

5.5.4.6. Example. Consider that eight contiguous bits of a VPIC-compressed image could be occupied by (a) one flag bit indicating the presence of an edge or mean block, (b) four bits describing the block mean, and (c) three bits describing the block gradient intensity if the block is an edge block. Otherwise, the last three bits would commence with the flag bit of the next block. Since mean and edge blocks have different lengths, an error in the flag bit could erroneously perturb the remainder of the decoding process. As a result, we recommend complete error protection for the flag bit, as well as high error protection for the more significant bits of the mean, gradient intensity and orientation, and full error protection for the VPIC exemplar index. We further discuss such requirements in Chapters 9 and 10.

The preceding formalisms can be used to describe computational error in terms of SNR.

5.5.4.7. Observation. If we replace the realistic channel of Figure 7 with an operation $\mathcal{O}'$ that computes over the range space of an encoding transform $T : S \to S'$, and $\mathcal{O}'$ is the analogue of an operation $\mathcal{O}$ over $domain(T)$, then we can model error inherent in a realistic computational system $\mathcal{X} = (\{S, \mathcal{O}\}, \{S', \mathcal{O}'\}, T)$, per concepts developed earlier in this chapter. In such cases,
Theorem 5.5.4.3 describes information loss in $\mathcal{X}$, but does not describe the spatial phenomena associated with such error.

Since it is customary to discuss error in terms of SNR, we present the following definition.

5.5.4.8. Definition. Let a real-valued source image with $m$-bit pixels be corrupted by $k < m$ bits of noise per pixel. If such error $\varepsilon$ occurs in the less significant bits and is uniformly distributed, then the associated SNR is given by

$$SNR_{\varepsilon} = \frac{2^m}{2^k} = 2^{m-k}.$$ (241)

This concludes our discussion of information loss.

CHAPTER 6
CLASS 1 TRANSFORMATIONS

This chapter contains derivations of analogues to the image algebra pointwise, global reduce, and image-template operations over the range spaces of substitutional ciphers (Section 6.1), transpositional ciphers (Section 6.2), and linear transforms (Section 6.3).

6.1. Substitutional Ciphers.

The substitutional cipher $T_{1s} : F^X \to F^X$ is a bijection and, therefore, an isomorphism. As a result, operations over $range(T_{1s})$ are arrived at merely by relabelling the operands of $T_{1s}$. We first present high-level theory, then examine the derivation of transform-regime analogues and duals over $range(T_{1s})$.

6.1.1. Basic Theory.

Since $T_{1s}$ is a bijection, the derivation of operations over the transform's range space is relatively simple, and furnishes a useful introduction to the organization of our derivational techniques.

6.1.1.1. Definition. The substitutional cipher $T_{1s} : F^X \to F^X$ is a bijection that is based upon the scalar grey-level mapping

$$g : F \to F$$ (242)

and can be decomposed as

$$[\ldots]$$ (243)

Since $T_{1s}$ is a bijection, it is an isomorphism with inverse $T_{1s}^{-1} : F^X \to F^X$, which is implemented at fine granularity in terms of $g^{-1} : F \to F$.

6.1.1.2. Assumption. Let $\mathbf{a}, \mathbf{b} \in F^X$ denote two source images that are transformed by $T_{1s}$ to yield the corresponding images $\mathbf{a}_c, \mathbf{b}_c \in F^X$, and let the pointwise image operation $\mathcal{O} : F^X \times F^X \to F^X$.

6.1.1.3. Lemma.
Under constraint of Definition 6.1.1.1 and Assumption 6.1.1.2, a dual $\mathcal{O}' : F^X \times F^X \to F^X$ of $\mathcal{O}$ over $range(T_{1s})$ is given by:

$$\mathbf{c} = \mathcal{O}'(\mathbf{a}_c, \mathbf{b}_c) = \mathcal{O}(g^{-1}(\mathbf{a}_c), g^{-1}(\mathbf{b}_c)).$$ (244)

In other words, $\mathcal{O}' = \mathcal{O} \circ g^{-1}$.

Proof. The proof follows directly from the fact that $T_{1s}$ has a bijection $g$ as its basis function, and $T_{1s}$ is thus a bijection.

6.1.1.4. Example. Let $\mathbf{a} = (1,2,0)$ and let $T_{1s}(x) = x + 1 \bmod 3$. Thus, $\mathbf{a}_c = T_{1s}(\mathbf{a}) = (2,0,1)$. Similarly, if $\mathbf{b} = (2,1,0)$, then $\mathbf{b}_c = (0,2,1)$. Let $\mathcal{O} = +_3$ (i.e., addition modulo 3) such that $\mathcal{O}(\mathbf{a},\mathbf{b}) = \mathbf{a} +_3 \mathbf{b} = (0,0,0)$. We can express $\mathcal{O}$ in tabular form as

O | 0 1 2
0 | 0 1 2
1 | 1 2 0
2 | 2 0 1

Similarly, $T_{1s}$ has as its basis function $g$, which has an inverse $g^{-1}$. Both operations are depicted in tabular form as

g | 0 1 2
x | 1 2 0

g^-1 | 0 1 2
x | 2 1 0

Applying $g^{-1}$ to $\mathcal{O}$ (per Equation 244), we express the tabular representation of $\mathcal{O}'$ as

O' | 0 1 2
0 | 1 0 2
1 | 0 2 1
2 | 2 1 0

which is obtained by rotating the tabular representation of $\mathcal{O}$ about its minor diagonal.

6.1.1.5. Lemma. Under constraint of Definition 6.1.1.1 and Assumption 6.1.1.2, the analogue $\mathcal{O}' : F^X \times F^X \to F^X$ of $\mathcal{O}$ over $range(T_{1s})$ is given by:

$$\mathbf{c}_c = T_{1s}(\mathcal{O}'(\mathbf{a}_c, \mathbf{b}_c)) = g(\mathcal{O}(g^{-1}(\mathbf{a}_c), g^{-1}(\mathbf{b}_c))).$$ (245)

That is, $\mathcal{O}' = g \circ \mathcal{O} \circ g^{-1}$.

Proof. The proof follows directly from the proof of Lemma 6.1.1.3.

We next describe techniques for deriving pointwise duals and analogues over $range(T_{1s})$.

6.1.2. Technique: Binary Pointwise Operations.

Assuming that $T_{1s} : F^X \to F^X$ is a finite mapping, the technique for deriving duals (and, therefore, analogues) over $range(T_{1s})$ proceeds as follows.

Step 1. Assume that $\mathcal{O}$, a binary pointwise operation over $range(T_{1s})$, is specified in terms of its matrix representation $\mathbf{d}$, as discussed in Section 4.4. Additionally, we assume that $g$ is the basis function of $T_{1s}$.

Step 2. Via an indexing bijection $h : F \to A$, where $A = [1, |F|]$, we set $range(choice(domain(T_{1s}))) = range(choice(range(T_{1s}))) = A$. We similarly transform $g : F \to F$ into $f : A \to A$.

Step 3.
Applying $f$ to each projection of the coordinates of matrix representation $\mathbf{d}$, we obtain the matrix representation of $\mathcal{O}'$, the dual of $\mathcal{O}$.

Thus, we use the indexing function $h$ to map a grey-level transformation (normally applied to an image) into a spatial transform of the matrix representation of $\mathcal{O}$.

6.1.2.1. Lemma. Under constraint of Definition 6.1.1.1 and Assumption 6.1.1.2, let $h : F \to A$, where $A = [1, |F|] \subset \mathbb{N}$, transform the basis function $g : F \to F$ of $T_{1s}$ into $f : A \to A$. If $\mathbf{d} \in A^{A \times A}$ is the matrix representation of $\mathcal{O}$, then $\mathbf{g}$, the matrix representation of $\mathcal{O}'$ (which denotes the dual of $\mathcal{O}$ over $range(T_{1s})$), is given by:

$$\mathbf{g} = \mathbf{d} \circ f^{-1}.$$ (246)

It follows that

$$\mathcal{O}'(T_{1s}(f), T_{1s}(g)) = h^{-1}(\mathbf{g}(h(f), h(g))), \quad \forall f, g \in F.$$ (247)

Proof. Assuming that the givens hold, $f$ and $h$ have inverses because they are bijections. If $\mathbf{g}$ is the matrix representation of $\mathcal{O}'$, then from the definition of $f$,

$$\mathbf{g} = \mathbf{d} \circ f^{-1} = \{(\mathbf{w}, \mathbf{d}[f^{-1}(\mathbf{w})]) : \mathbf{w} \in A \times A\}.$$ (248)

The definition of $\mathcal{O}'$ follows directly from $\mathbf{g}$ and $h$.

6.1.2.2. Theorem. Let Lemma 6.1.2.1 hold. If $\mathbf{g} \in A^{A \times A}$ denotes the matrix representation of the dual of $\mathcal{O}$, then $\mathcal{O}'$, the analogue of $\mathcal{O}$ over $range(T_{1s})$, is given by:

$$\mathcal{O}'(T_{1s}(f), T_{1s}(g)) = g(h^{-1}[\mathbf{g}(h(f), h(g))]), \quad \forall f, g \in F.$$ (249)

Proof. The proof follows directly from Lemma 6.1.2.1, the definitions of transform-regime dual and analogue operations, and the definition of $h$ and $g$.

6.1.2.3. Complexity. In the preceding lemmas, $\mathbf{d}$ can be obtained computationally in $|F|^2$ computations of $\mathcal{O}$. Application of the spatial transform $f^{-1}$ yields the dual $\mathcal{O}'$ in $|F|^2$ I/O operations. The analogue is obtained from the dual by $|F|^2$ applications of $h^{-1}$ and $g$. If $\mathbf{d}$ is stored in memory, then a one-time computation of $\mathcal{O}'$ is required.

6.1.3. Technique: Global Reduce Operations.

Given the preceding lemma and theorem, the global reduce operations are simply derived.

6.1.3.1. Theorem. If $\Gamma : F^X \to F$ is based upon the associative, commutative function $\gamma :$
$F \times F \to F$ and Theorem 6.1.2.2 holds, then $\Gamma' : F^X \to F$, an analogue of $\Gamma$ over $range(T_{1s})$, is given by one of the following statements:

(i) $\Gamma' = g \circ \Gamma$; or
(ii) If $\gamma' : F \times F \to F$ denotes the analogue of $\gamma$ over $range(T_{1s})$, then $\gamma'$ is the basis function of $\Gamma'$.

Proof. Assuming that the givens hold, (i) follows directly from the definitions of $\Gamma$ and $g$ as well as the commutativity diagram of Figure 1. Statement (ii) follows directly from the definition of an analogue, as well as from the definition of a global reduce operation.

6.1.3.2. Example. Let $T(x) = x + 2 \bmod k$, where $x, k \in \mathbb{Z}^{+}$ and $k > 2$. Let $\mathbf{a} \in \mathbb{Z}_k^X$ be transformed to yield $\mathbf{a}_c = T(\mathbf{a})$, and let $\gamma = +_k$ denote addition modulo $k$. For example, if $\mathbf{a} = (0,1,2,3,2,3,1,1,3)$ and $k = 4$, then $\mathbf{a}_c = (2,3,0,1,0,1,3,3,1)$. Taking the image sum yields $\Sigma \mathbf{a} = 16$ and $\Sigma \mathbf{a}_c = 14$. However, if $\gamma$ is the basis of $\Gamma$, then $\Gamma(\mathbf{a}) = 0$. In order to obtain $\gamma'$, the analogue of $\gamma$ over $range(T)$, we apply the technique of Section 6.1.1.3 to yield the following tabular representations:

gamma | 0 1 2 3
0 | 0 1 2 3
1 | 1 2 3 0
2 | 2 3 0 1
3 | 3 0 1 2

gamma' | 2 3 0 1
2 | 0 1 2 3
3 | 1 2 3 0
0 | 2 3 0 1
1 | 3 0 1 2

which means that $\gamma' = \gamma$, since the preceding matrices are block-symmetric. Thus,

$$\Gamma'(\mathbf{a}_c) = 2\ \gamma'\ 3\ \gamma'\ 0\ \gamma' \cdots \gamma'\ 1 = 2.$$ (250)

As a result, we have $\Gamma'(\mathbf{a}_c) = T(\Gamma(\mathbf{a}))$, which verifies that $\gamma'$ is the analogue of $\gamma$ over $range(T)$. However, as we shall see in Chapter 7, if $\gamma = +_m$, where $m \ne k$, then $\gamma'$ may not exist, due to the induction of a many-to-one mapping in $T(\Gamma(\mathbf{a}))$.

Rather than deducing the analogue from the dual, we provide the reader with the converse method.

6.1.3.3. Theorem. Let the preceding lemmas hold, and let $\Gamma''$ denote an analogue of $\Gamma$ over $range(T_{1s})$. Then, $\Gamma' : F^X \to F$, a dual of $\Gamma$ over $range(T_{1s})$, is given by $\Gamma' = g^{-1} \circ \Gamma''$.

Proof. Assuming that the givens hold, the proof follows directly from the fact that $g$ is a bijection.

6.1.3.4. Complexity. The associative, commutative function $\gamma'$ can be obtained in $|F|^2$ computations of $g$ and $\gamma$.
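The relationship between a global reduce operation and its analogue, as discussed in Example 6.1.3.2, can be exercised with a short Python sketch. It assumes that the analogue of the basis function is built as in Equation 245, that is, by decoding both operands, applying the operation, and re-encoding the result. The helper names (`make_analogue`, `global_reduce`) and the nine-pixel test image are illustrative choices, not part of the source.

```python
from functools import reduce

K = 4  # modulus k


def g(x):
    """Basis function of T: x -> (x + 2) mod k."""
    return (x + 2) % K


def g_inv(x):
    """Inverse basis function: x -> (x - 2) mod k."""
    return (x - 2) % K


def gamma(u, v):
    """gamma = addition modulo k."""
    return (u + v) % K


def make_analogue(op):
    """Analogue of a binary operation: decode operands, apply op, re-encode."""
    return lambda u, v: g(op(g_inv(u), g_inv(v)))


def global_reduce(op, image):
    """Fold an associative, commutative binary operation over all pixels."""
    return reduce(op, image)


gamma_prime = make_analogue(gamma)

a = [0, 1, 2, 3, 2, 3, 1, 1, 3]  # illustrative plaintext image over Z_4
a_c = [g(x) for x in a]          # ciphertext image T(a)

plain_result = global_reduce(gamma, a)           # Gamma(a)
cipher_result = global_reduce(gamma_prime, a_c)  # Gamma'(a_c)

# The analogue computed on ciphertext equals the encoded plaintext result.
assert cipher_result == g(plain_result)
print(a_c, plain_result, cipher_result)
```

Because the analogue is obtained by conjugating the basis function with $g$, the final assertion holds for images of any length; it does not depend on the particular pixel values chosen here.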
If $\gamma'$ is stored, then this is a one-time computation. The preceding example is illustrative.

6.1.4. Technique: Image-Template Operations

Given the dual (analogous) operations of pointwise arithmetic and global reduce functions, the construction of various analogues of the image-template operations over $range(T_{1s})$ is a trivial matter of functional composition.

6.1.4.1. Assumption. Under constraint of Definition 6.1.1.1 and Assumption 6.1.1.2, let the operation $\mathcal{O} : F^X \times (F^X)^X \to F^X$ combine a source image $\mathbf{a} \in F^X$ and a template $\mathbf{t} \in (F^X)^X$ to yield the image $\mathbf{b} = \mathcal{O}(\mathbf{a}, \mathbf{t}) = \mathbf{a} \circledast \mathbf{t}$. Assume that $\circledast$ consists of two associative, commutative operations $\circ, \gamma : F \times F \to F$, the latter of which is the basis function for the global reduce function $\Gamma$. Let $\circ' = g(\circ)$ and $\gamma' = g(\gamma)$ denote the duals (analogues) of $\circ$ and $\gamma$ over $range(T_{1s})$.

6.1.4.2. Theorem. Under constraint of Assumption 6.1.4.1, if $\mathcal{O}' : F^X \times (F^X)^X \to F^X$, the analogue of $\mathcal{O}$ over $range(T_{1s})$, combines a ciphertext image $\mathbf{a}_c = T_{1s}(\mathbf{a})$ and a template $\mathbf{s} \in (F^X)^X$ to yield the image $\mathbf{b}_c = \mathcal{O}'(\mathbf{a}_c, \mathbf{s}) = T_{1s}(\mathbf{a} \circledast \mathbf{t})$, then $\mathcal{O}'$ is given by:

$$\mathbf{b}_c = \mathcal{O}'(\mathbf{a}_c, \mathbf{s}) \equiv \mathbf{a}_c \circledast' \mathbf{s},$$ (251)

where $\mathbf{s}$ is defined by its weights as $\mathbf{s}_y(x) = g(\mathbf{t}_y(x))$, $\forall x, y \in X$. More simply, $\mathbf{s} = g(\mathbf{t})$.

Proof. The proof follows directly from the discussion of image-template operations in Section 3.4. Note that the equivalence operation is employed, since Equation 251 forms the graph of $\mathbf{b}_c$.

6.1.4.3. Complexity. Assume that the operations $\circ'$ and $\gamma'$ need not be computed for each invocation of Equation 251 (i.e., are stored in LUTs or are available on an encryptive processor). Then, it follows from our preceding discussion of complexity (Chapter 4) that the overhead associated with Equation 251 is bounded as $2 \cdot |X| \le W_{\mathcal{O}'} \le |X| \cdot (|X| + 1)$ invocations of $g$ associated with template conversion. Otherwise, the overhead increases to $3 \cdot W_{\mathcal{O}'}$ (i.e., one instance of $g$ per $\circ'$, $\Gamma'$, and the previously-given conversion of $\mathbf{t}$ to $\mathbf{s}$).
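Theorem 6.1.4.2 can be illustrated on a one-dimensional signal. The sketch below is a hypothetical Python rendering that assumes arithmetic modulo 256, an additive shift $g(x) = (x + 128) \bmod 256$ as the basis function, and a three-tap, space-invariant template whose support is clipped at the image boundary. All function names, the test signal, and the template weights are illustrative assumptions rather than the author's implementation.

```python
from functools import reduce

M = 256  # 8-bit value set Z_256


def g(v):
    """Assumed basis function of the substitution: v -> (v + 128) mod 256."""
    return (v + 128) % M


def g_inv(v):
    return (v - 128) % M


def add(u, v):
    """gamma: addition modulo 256."""
    return (u + v) % M


def mul(u, v):
    """The combining operation: multiplication modulo 256."""
    return (u * v) % M


def analogue(op):
    """Transform-regime analogue of a binary operation (decode, apply, encode)."""
    return lambda u, v: g(op(g_inv(u), g_inv(v)))


def template_product(image, weights, combine, fold):
    """1-D image-template product with a centred, space-invariant template.

    Template positions that fall outside the image are skipped."""
    half = len(weights) // 2
    out = []
    for y in range(len(image)):
        terms = [combine(image[y + k - half], w)
                 for k, w in enumerate(weights)
                 if 0 <= y + k - half < len(image)]
        out.append(reduce(fold, terms))
    return out


a = [10, 200, 37, 255, 0, 128]  # illustrative plaintext signal
t = [1, 2, 1]                   # illustrative template weights

a_c = [g(v) for v in a]         # ciphertext image
s = [g(w) for w in t]           # converted template, s = g(t)

b = template_product(a, t, mul, add)                          # plaintext result
b_c = template_product(a_c, s, analogue(mul), analogue(add))  # ciphertext result

# The ciphertext-domain product equals the encoded plaintext-domain product.
assert b_c == [g(v) for v in b]
```

In a practical setting the two analogue operations would be stored as look-up tables, as the complexity discussion suggests, rather than recomputed from $g$ and $g^{-1}$ at every pixel.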
Since the transform is not compressive, a sub-unitary speedup would be achieved by encryptive processing if $\circ'$ and $\Gamma'$ were not encoded in terms of one or more LUTs.

6.1.4.4. Example. Let $\mathbf{a} \in \mathbb{Z}_{256}^X$, as shown in Figure 9a, be convolved with a von Neumann template $\mathbf{t}$ shown in Figure 9b. Let the transform $T(x) = (x + 128) \bmod 256$ be applied to $\mathbf{a}$ to yield $\mathbf{a}_c \in \mathbb{Z}_{256}^X$, as shown in Figure 9c. The transformed template $\mathbf{s}$, produced by the method of Equation 251, is shown in Figure 9d. Let the operation $\circledast' : \mathbb{Z}_{256}^X \times (\mathbb{Z}_{256}^X)^X \to \mathbb{Z}_{256}^X$ be derived from analogues of addition and multiplication, as shown in Section 6.1.2. Applying $T^{-1}(x) = (x - 128) \bmod 256$ to the result $\mathbf{b}_c = \mathbf{a}_c \circledast' \mathbf{s}$ (shown in Figure 9e) yields the image of Figure 9f, which portrays the image operation $\mathbf{b} = \mathbf{a} \oplus \mathbf{t}$.

Figure 9. Local averaging of a noisy, two-dimensional, 8-bit image transformed by $T(x) = (x + 128) \bmod 256$: (a-b) source image $\mathbf{a}$ and template $\mathbf{t}$; (c-d) transformed image $\mathbf{a}_c$ and template $\mathbf{s}$; (e) transformed result $\mathbf{b}_c$; (f) image-domain result $\mathbf{b} = T^{-1}(\mathbf{b}_c)$.

6.1.5. Data Security and Information Loss.

The substitutional cipher is insecure due to high input-output correlation and preservation of the input histogram. Since $T_{1s}$ is a finite bijection, no information loss occurs. Under constraint of Definition 6.1.1.1 and Assumption 6.1.1.2, we present the following cryptanalytic methods.

6.1.5.1. Observation. Assume that plaintext and ciphertext are available to a cryptanalyst. Since $T_{1s}$ merely relabels the plaintext symbols in $\mathbf{a}$, one needs only to formulate an indexing function $h : F \to A$, where $A = [1, |F|] \subset \mathbb{N}$, as well as the trial plaintext $\mathbf{a} = \{(x, f) : x = h(f),\ f \in F\}$. Application of $\mathbf{a}_c = T_{1s}(\mathbf{a})$ enables construction of the bijection $g : \mathbf{a}(x) \mapsto \mathbf{a}_c(x)$, $\forall x \in X$, which is implemented in $T_{1s}$.

6.1.5.2. Observation. Assume the more difficult case, where the plaintext $\mathbf{a}$ is unknown, i.e., only the ciphertext $\mathbf{a}_c$ is available. Let $h(\mathbf{a})$ and $h(\mathbf{a}_c)$ denote the plaintext and ciphertext histograms.
We use the following algorithm, which is a slight modification of a procedure presented in Sinkov [27]:

Step 1. For each $f \in F$, compute the probabilities

$$\mathbf{b}(f) = [\ldots]$$ (252)

Step 2. Given the probability distribution $\Pr(f)$, $f \in F$, accumulate for each $f$ the set

$$A_f = domain\left(\chi_{[\Pr(f) - \epsilon,\ \Pr(f) + \epsilon]}(\mathbf{b})\right),$$ (253)

which are the more probable candidates for substitution.

Step 3. Starting with the first letter $f$ of the most frequent words (in colloquial English: a or the), mark the positions $D = domain(\mathbf{a}_c\|_{A_f})$.

Step 4. Interrogate each unmarked successor value $g = \mathbf{a}_c(x + 1)$ of each marked value $\mathbf{a}_c(x)$, where $x \in D$, to determine if $\mathbf{b}(g)$ matches (or nearly matches) the probabilities of any successors of $f$ in a table of digram frequencies. If there is a match (near-match), then the decoded value $\mathbf{d}(x)$ is assigned the value $f$ determined in Step 3.

Step 5. Repeat Steps 3 and 4 iteratively until the number of decoded words that correspond to known plaintext words (obtained from a lexicon) is maximized. As a result, $\mathbf{d}$ will contain a best-match candidate solution to $\mathbf{a}_c$.

This method can be implemented automatically for many natural languages. Convergence efficiency tends to increase when $\Pr(f)$ accurately portrays the plaintext statistics.

In the previous section, we introduced the reader to our derivation methodology. For purposes of brevity, we henceforth present more concise derivations, with the preceding methodology understood.

6.2. Transpositional Ciphers.

As discussed in the introduction, a transpositional cipher $T_{1t} : F^X \to F^X$ is based upon a bijection $f : X \to X$. Such spatial bijections preserve image values and thus preserve both global reduce and pointwise operations. However, m-ary pointwise and image-template operations over $range(T_{1t})$ require that the same transposition be applied to the domain of each input (plaintext) image.

6.2.1. Technique: Image-Template Operations.
Since pointwise and global reduce operations are not interesting cases of the pointwise transpositions, we provide a summary lemma, then briefly discuss image-template operations.

6.2.1.1. Definition. A transpositional cipher $T_{1t} : F^X \to F^X$ is based upon a spatial bijection $f : X \to X$, such that $\mathbf{a}_c = T_{1t}(\mathbf{a}) = \mathbf{a} \circ f$.

6.2.1.2. Lemma. Under constraint of Definition 6.2.1.1, the following statements hold:

(i) A pointwise image operation $\mathcal{O} : (F^X)^m \to F^X$ exhibits a dual (analogous) operation over $range(T_{1t})$ that is given by $\mathcal{O}' = \mathcal{O}$. Here, we assume that the operands of pointwise binary operations are each transformed by the same instance of $T_{1t}$.
(ii) A global reduce operation $\Gamma : F^X \to F$ exhibits a dual (analogous) operation over $range(T_{1t})$ that is given by $\Gamma' = \Gamma$.
(iii) The matrix representations of dual and analogous operations of a given pointwise or global reduce operation are not equal.

Proof. Statements (i) and (ii) follow directly from our preceding discussion of lossy neighborhood operations (Section 3.4). Statement (iii) follows from the fact that input values are preserved in $range(T_{1t})$, but their order is perturbed by $T_{1t}$.

Since the pointwise and global reduce operations are preserved over $range(T_{1t})$, one needs only to apply the spatial transform $f : X \to X$ to the template domain to produce a template that can be convolved over the ciphertext $\mathbf{a}_c = T_{1t}(\mathbf{a})$.

6.2.1.3. Theorem. Let the operation $\mathcal{O} : F^X \times (F^X)^X \to F^X$ combine a source image $\mathbf{a} \in F^X$ and a template $\mathbf{t} \in (F^X)^X$ to yield the image $\mathbf{b} = \mathcal{O}(\mathbf{a}, \mathbf{t}) = \mathbf{a} \circledast \mathbf{t}$. Under constraint of Definition 6.2.1.1 and the assumption that $\mathbf{a}_c = T_{1t}(\mathbf{a})$, the operation $\mathcal{O}' : F^X \times (F^X)^X \to F^X$, a dual (analogue) over $range(T_{1t})$ of $\mathcal{O}$, is given by:

$$\mathbf{b}_c = \mathcal{O}'(\mathbf{a}_c, \mathbf{t} \circ f) \equiv \mathbf{a}_c \circledast (\mathbf{t} \circ f) = \bigcup_{y \in X} \left( y,\ \Gamma_{x \in X}\ \mathbf{a}_c(x) \circ (\mathbf{t}_y \circ f)(x) \right),$$ (254)

where $\mathbf{t} \circ f$ implies the existence of a weight matrix transformation $\mathbf{t}_y \circ f$ for each $y$ in $X$.

Proof. Assuming that the givens hold, it is easily verified that the template $\mathbf{t}$, which is configured for the source domain $X$, is not configured for the transformed domain $f(X)$.
Since $\mathbf{t}$ may be space-variant, we need to transform the domain of each image $\mathbf{t}_y \in F^X$ by composition with $f$. Note that the equivalence relation is used since Equation 254 describes construction of the graph of $\mathbf{b}_c$.

6.2.1.4. Complexity. Equation 254 requires overhead of $|X|$ compositions of $f$ with $domain(\mathbf{t}_y)$ in addition to the overhead of $\circledast$. Given the discussion of complexity in Section 4.3, the resultant work is bounded as $2 \cdot |X| \le W_{\mathcal{O}'} \le |X| \cdot (|X| + 1)$ I/O operations.

6.2.2. Data Security and Information Loss.

Since image values are preserved in $range(T_{1t})$, there is no information loss. In the introduction, we briefly mentioned that the data security of spatial transformations could be compromised via histogram-based attack, which holds particularly for spatial bijections, since the probability distributions of plaintext and ciphertext image values are identical.

6.2.2.1. Observation. Cryptanalysis of transpositional ciphers is trivial, and includes the following two cases:

1. If $T_{1t}$ is available to the adversary, then one merely formulates the image $\mathbf{a} = \{(x, x) : x \in X\}$, and performs the operation $\mathbf{a}_c = T_{1t}(\mathbf{a})$. The graph of the spatial bijection $f$, upon which $T_{1t}$ is based, is given by $G(f) = \{(\mathbf{a}(x), \mathbf{a}_c(x)) : x \in X\}$.
2. If $T_{1t}$ is not available to the adversary, then cryptanalysis is still trivial, since $range(T_{1t})$ contains all plaintext image values. Thus, histogram- or correlation-based attack can yield the plaintext [27,30].

6.2.2.2. Remark. Regarding correlation-based attack, since $T_{1t}$ is a bijection, the location of a neighborhood in the source image $\mathbf{a}$ is highly correlated with the location of the corresponding transformed neighborhood in $\mathbf{a}_c$. Thus, one has merely to use different neighborhood domains to detect neighborhoods in $\mathbf{a}_c$ that correlate well with each other. This is especially easy in the case of frequently-occurring digrams and trigrams, such as the English determiners an and the.
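The preservation properties of Lemma 6.2.1.2 and the first case of Observation 6.2.2.1 can be sketched in a few lines of Python. The sketch assumes a one-dimensional domain $X = \{0, \ldots, n-1\}$ and represents the spatial bijection $f$ as a list of indices; the function name `transpose`, the random seed, and the test data are illustrative assumptions.

```python
import random
from collections import Counter


def transpose(image, f):
    """Transpositional cipher a_c = a o f: output pixel x takes the value a(f(x))."""
    return [image[f[x]] for x in range(len(image))]


n = 8
rng = random.Random(0)

f = list(range(n))  # spatial bijection f : X -> X, stored as a list
rng.shuffle(f)

a = [rng.randrange(256) for _ in range(n)]  # illustrative plaintext image
a_c = transpose(a, f)

# Pixel values are preserved, so the histogram and image sum are unchanged.
assert Counter(a) == Counter(a_c)
assert sum(a) == sum(a_c)

# Chosen-plaintext attack: encrypt the coordinate image a(x) = x.
probe = list(range(n))
recovered = transpose(probe, f)  # recovered[x] == f(x)
assert recovered == f

# Invert the recovered bijection and decrypt.
f_inv = [0] * n
for x, fx in enumerate(recovered):
    f_inv[fx] = x
assert transpose(a_c, f_inv) == a
```

The probe image reveals the graph of $f$ in a single encryption, which is why access to the cipher is sufficient to break it.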
However, given the success of histogram-based attack, which is simpler computationally, correlation-based attacks are of pedagogic interest only.

6.2.2.3. Observation. The case of image transpositions differs significantly from message encryption via transposition. For example, levels of message complexity can be characterized as (0) symbol, (1) word, (2) phrase or clause, (3) sentence, (4) topic, and (5) document. In contrast, the structure of imagery is not apparent beyond the pixel level, where maximum likelihood estimates can be formulated over von Neumann neighborhoods [71]. Similarly, the co-occurrence of symbols in messages can be described in terms of digram and trigram statistics, which are well known for a variety of natural languages. In contrast, the form of symbols and higher-level structures in imagery is not well understood. For example, at a low level are pixel values, whose co-occurrence has not been extensively studied [71]. At higher levels of abstraction, imagery can be treated in terms of multiresolution constructs (e.g., image pyramids) that are derived from the source image mathematically, using simple operations. The application of n-gram analyses to multiresolution imagery is a topic of current interest to us, and has not been extensively reported in the literature.

In summary, the application of transpositions to imagery can yield more secure encryption, in principle, due to the current lack of knowledge about pixel value co-occurrence. Such concepts have been implemented in ciphers that use extensive transposition and permutation of bits (e.g., DES).

We next progress to the somewhat more interesting case of linear transformations, which are widely employed in signal and image processing and frequently form the basis for image compression via transform coding.

6.3. Linear Transforms.

Due to the linearity property, pointwise and global reduce operations over the range space of linear transforms can be readily derived.
However, duals or analogues of image-template operations are less easily derived.

6.3.1. Basic Theory.

We briefly review the concept of a linear transform and its decomposition.

6.3.1.1. Definition. Given an image $\mathbf{a} \in \mathbb{C}^X$, a linear transformation $T_{1l} : \mathbb{C}^X \to \mathbb{C}^X$ is a mapping that exhibits the following linearity property:

$$T_{1l}((k_1 \cdot \mathbf{a}) + k_2) = (k_1 \cdot T_{1l}(\mathbf{a})) + k_2, \quad k_1, k_2 \in \mathbb{C}.$$ (255)

6.3.1.2. Observation. For purposes of convenience, we classify linear transforms as pointwise or neighborhood transformations. A pointwise linear transform operates upon a single value in each of its input images (e.g., $T(\mathbf{a}, \mathbf{b}) = \mathbf{a} + \mathbf{b}/k$). A neighborhood transform is a linear combination of values located in a subset of the domain(s) of one or more operand images. For example, linear inner product transforms are included in the neighborhood transforms. We further note that, due to their many-to-one basis maps, linear neighborhood transforms cannot be inverted as can their pointwise counterparts.

Additionally, if a linear transform preserves the sum of its input in the transform output, then the global reduce function of image sum is readily derived. Similarly, addition and multiplication by a constant are preserved over the range space of a linear transform. In contrast, the preservation of nonlinear pointwise (and global reduce) operations is not supported by the linearity property. We thus begin our discussion with the simple case of linear operations over linear pointwise transformations.

6.3.2. Technique: Pointwise Operations.

We consider linear and nonlinear pointwise operations, which do not have elaborate analogues over the output of pointwise linear transforms.

6.3.2.1. Lemma. If $\mathcal{O}'$ denotes a linear operation over the range space of a linear transform $T$, then $\mathcal{O}' \circ T$ is a linear operation.

Proof. The proof follows from the fact that the composition of linear operations yields a linear operation.

6.3.2.2. Example. Let $\mathbf{a}, \mathbf{b} \in \mathbb{R}^X$ and let $T$ denote a linear transform.
Then, the analogue of the pointwise binary operation $\mathcal{O} = +$ is given by $\mathcal{O}' = +$, which follows directly from the linearity property of $T$. A similar case holds for multiplication by a constant.

We next derive a dual of Hadamard multiplication over the range space of a pointwise linear transform.

6.3.2.3. Lemma. Given Lemma 6.3.2.1, let images $\mathbf{a}, \mathbf{a}_c, \mathbf{b}, \mathbf{b}_c \in \mathbb{R}^X$, where $\mathbf{a}_c = T_{1l}(\mathbf{a}) = c_1 \cdot \mathbf{a} + d_1$ and $\mathbf{b}_c = T_{1l}(\mathbf{b}) = c_2 \cdot \mathbf{b} + d_2$. If the image operation $\mathcal{O} = *$ such that $\mathbf{c} = \mathbf{a} * \mathbf{b}$, then the dual of Hadamard multiplication is given by:

$$\mathbf{c} = \mathcal{O}'(\mathbf{a}_c, \mathbf{b}_c) = \frac{\mathbf{a}_c * \mathbf{b}_c - d_1 \cdot \mathbf{b}_c - d_2 \cdot \mathbf{a}_c + d_1 d_2}{c_1 \cdot c_2}.$$ (256)

Proof. From the givens, we have

$$\mathbf{a}_c * \mathbf{b}_c = c_1 c_2 \cdot \mathbf{a} * \mathbf{b} + c_1 d_2 \cdot \mathbf{a} + c_2 d_1 \cdot \mathbf{b} + d_1 d_2,$$ (257)

which can be solved to yield

$$\mathbf{a} * \mathbf{b} = \frac{\mathbf{a}_c * \mathbf{b}_c - d_1 c_2 \cdot \mathbf{b} - d_2 c_1 \cdot \mathbf{a} - d_1 d_2}{c_1 \cdot c_2}.$$ (258)

The preceding equation can be rendered nonrecursive by noting that

$$\mathbf{a} = \frac{\mathbf{a}_c - d_1}{c_1} \quad \text{and} \quad \mathbf{b} = \frac{\mathbf{b}_c - d_2}{c_2}.$$ (259)

Substituting Equation 259 into Equation 258 yields Equation 256.

6.3.2.4. Complexity. Equation 256 requires five additions and multiplications per pixel. If $\mathbf{a}, \mathbf{b} \in \mathbb{R}^X$, then the total work required by $\mathcal{O}'$ equals $4 \cdot |X|$ multiplications and $5 \cdot |X|$ additions. Given $4 \cdot |X|$ processors, $\mathcal{O}'$ can be computed on a pipelined parallel processor in the time required for two multiplications and two additions, as shown in Figure 10.

Figure 10. Dual of pointwise multiplication over the range space of the pointwise linear transform $T(x) = cx + d$.

6.3.2.5. Remark. The preceding lemma is not fanciful, but is useful for recovering the product of two images that have undergone linear contrast stretching. If one of the images is known (or approximately known) the other can be retrieved (approximated), as shown in Figure 11.

The analogue of pointwise multiplication over the range space of a linear pointwise transform $T(x) = cx + d$ follows from Lemma 6.3.2.3. The following lemma is illustrative.

Figure 11.
Recovery of source image $\mathbf{a}$ from image sum of contrast-stretched imagery: (a-b) source images $\mathbf{a}$ and $\mathbf{b}$, (c-d) linear contrast-stretched images $\mathbf{a}_c = T(\mathbf{a})$ and $\mathbf{b}_c = T(\mathbf{b})$, (e) $\mathbf{c}_c = \mathbf{a}_c + \mathbf{b}_c$, (f) recovery of $\mathbf{a}$ from $\mathbf{c}_c$ using $\mathbf{a} = T^{-1}(\mathbf{c}_c - T(\mathbf{b}))$.

6.3.2.6. Lemma. Given Lemma 6.3.2.3, let images $\mathbf{a}, \mathbf{a}_c, \mathbf{b}, \mathbf{b}_c \in \mathbb{R}^X$, where $\mathbf{a}_c = T_{1l}(\mathbf{a}) = c \cdot \mathbf{a} + d$ and $\mathbf{b}_c = T_{1l}(\mathbf{b})$. If the image operation $\mathcal{O} = *$ such that $\mathbf{c} = \mathbf{a} * \mathbf{b}$, then the dual of Hadamard multiplication is given by:

$$\mathbf{c} = \mathcal{O}'(\mathbf{a}_c, \mathbf{b}_c) = \frac{\mathbf{a}_c * \mathbf{b}_c - d \cdot (\mathbf{a}_c + \mathbf{b}_c - d)}{c^2}.$$ (260)

Proof. The proof follows directly from the proof of Lemma 6.3.2.3 by substituting $c$ for $c_1$ and $c_2$, as well as substituting $d$ for $d_1$ and $d_2$.

This simple result can be expanded to derive an analog of pointwise multiplication over the range spaces of all linear transforms, via the following decomposition.

6.3.2.7. Definition. Let the step function $\sigma$ be given by

$$\sigma(y)(x) = \begin{cases} 1 & \text{if } x \ge y \\ 0 & \text{otherwise.} \end{cases}$$ (261)

Alternatively, we can write $\sigma$ as $\sigma(y) = \left(1|_{\{x : x \ge y,\ y \in X\}}\right)|^{0}$.

6.3.2.8. Lemma. Given the step function $\sigma$, the additive step function decomposition of a one-dimensional image $\mathbf{a} \in F^X$ is given by

$$\mathbf{a} = \sum_{x \in X} (\mathbf{a} \oplus \mathbf{t})(x) \cdot \sigma(x),$$ (262)

where $\mathbf{t} = (-1\ \ \mathit{1})$, with the target pixel italicized.

Proof. The proof sketch is embodied without loss of generality in the following example. Given the definition of $\sigma$ and letting $\mathbf{a} = (c, d, e)$, we note that $\mathbf{b} = \mathbf{a} \oplus \mathbf{t} = (c, d - c, e - d)$. Thus,

$$\mathbf{a} = (c, c, c) + (0, d - c, d - c) + (0, 0, e - d) = (c,\ c + d - c,\ c + d - c + e - d) = (c, d, e),$$ (263)

and the lemma holds.

The preceding lemma is not limited to one-dimensional images. For example, $\sigma$ could be a step function with the same dimensionality as $X$. However, one must define the maximum operation over $X$ prior to implementing $\sigma$, which definition depends upon scanning order. For purposes of simplicity, we continue with the one-dimensional case.

6.3.2.9. Assumption. Let a linear transform $T : \mathbb{R}^X \to \mathbb{R}^X$ be applied to an image $\mathbf{a} \in \mathbb{R}^X$ as $\mathbf{a}_c = \mathbf{a} \oplus \mathbf{r}$, where $\mathbf{r} \in (\mathbb{R}^X)^X$.
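Assume that the inverse transformation $T^{-1} : \mathbb{R}^X \to \mathbb{R}^X$ is given by $\mathbf{a} = \mathbf{a}_c \oplus \mathbf{r}^{-1}$ such that $\mathbf{a} = \mathbf{a} \oplus \mathbf{r} \oplus \mathbf{r}^{-1}$.

6.3.2.10. Theorem. Under constraint of Definition 6.3.2.7, Assumption 6.3.2.9, and Lemma 6.3.2.8, let images $\mathbf{a}, \mathbf{a}_c, \mathbf{b}, \mathbf{b}_c \in \mathbb{R}^X$, where $\mathbf{a}_c$ and $\mathbf{b}_c$ result from application of a linear transform $T$ to $\mathbf{a}$ and $\mathbf{b}$. Let $T$ be implemented as $T(\mathbf{a}) = \mathbf{a} \oplus \mathbf{r}$, where $\mathbf{r} \in (\mathbb{R}^X)^X$, and let $\mathbf{s} = \mathbf{r}^{-1} \oplus \mathbf{t}$, where $\mathbf{t} = (-1 \;\; 1)$, with the target pixel italicized. An analogue $\mathcal{O}' : \mathbb{R}^X \times \mathbb{R}^X \to \mathbb{R}^X$ of Hadamard multiplication ($\mathcal{O} = *$) over $range(T)$ is given by

$$T(\mathbf{a} * \mathbf{b}) = \mathcal{O}'(\mathbf{a}_c, \mathbf{b}_c) = \sum_{x \in X} \sum_{y \in X} (\mathbf{a}_c \oplus \mathbf{s})(x) \cdot (\mathbf{b}_c \oplus \mathbf{s})(y) \cdot T(\sigma[\vee(x, y)]) , \qquad (264)$$

where we assume that the maximum operation ($\vee$) is defined over $X$.

Proof. From Lemma 6.3.2.8 and the givens, $\mathbf{a} = \sum_{y \in X} (\mathbf{a} \oplus \mathbf{t})(y) \cdot \sigma(y)$, and symmetrically for $\mathbf{b}$. Thus, the product

$$\mathbf{a} * \mathbf{b} = \sum_{x \in X} \sum_{y \in X} (\mathbf{a} \oplus \mathbf{t})(x) \cdot (\mathbf{b} \oplus \mathbf{t})(y) \cdot \sigma(x) \cdot \sigma(y) . \qquad (265)$$

Since $\sigma(x) \cdot \sigma(y) = \sigma(\vee(x, y))$, if maximum is defined over $X$ and

$$T(\mathbf{a}) = \sum_{y \in X} T[(\mathbf{a} \oplus \mathbf{t})(y) \cdot \sigma(y)] = \sum_{y \in X} (\mathbf{a} \oplus \mathbf{t})(y) \cdot T[\sigma(y)] , \qquad (266)$$

then Equation 264 follows directly from the givens and Assumption 6.3.2.9. Thus, the theorem holds.

6.3.2.11. Complexity. With slight manipulation, Equation 266 can be implemented efficiently on parallel architectures. For example, let $X = \{1, 2, \ldots, n\}$, and let $A$ be an $n \times n$-processor mesh. If $\mathbf{a}$ is a $\mathbb{Z}_m$-valued image, then the $y$-th column of processing elements (PEs) can store the $m$ partial products $f \cdot T(\sigma(x))(y)$, $\forall f \in \mathbb{Z}_m$, where $x, y \in X$. As a result, each PE stores $mn \cdot \lceil \log m \rceil$ bits. For example, if $n = 512$ and $m = 256$, as is usual, then each PE would store 1M bits. Given today's processors with large local memories, the requirement of 128KB storage per PE is not burdensome.

6.3.2.12. Remark. The preceding approach is advantageous, since a discrete linear transform $T(\mathbf{a})$ can be computed in $\log(n)$ addition delays, for a total cost of $C(n) = S(n) \cdot T(n) = n^2 \cdot O(\log(n)) = O(n^2 \log(n))$ .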
(267)

In space complexity, $C$ exceeds by a factor of $n$ the cost of computing a fast version of $T$ (e.g., the Fast Fourier Transform) on an $n$-processor butterfly architecture. However, in Equation 266, $n$ is not limited to a power of 2. Additionally, if partial products are stored locally, then Equation 266 requires only addition operations, which can be integer additions. In contrast, the fast linear inner product transforms computed on butterfly architectures usually require real or complex local multiplications [72]. In practice, this approach would be useful for production transformation of images and signals where the image size and transform type are fixed, but time constraints predominate over space complexity. Given current advances in circuit miniaturization, it is not unreasonable to expect the availability of large, inexpensive SIMD meshes with large local memories in the foreseeable future. Thus, the preceding approach is realistic.

6.3.2.13. Data Security and Information Loss. The analogues of $+$ and $*$ over $range(T_1)$ do not compromise the security of the transform, which is insecure per the discussion of Sections 4.5-4.6. Such computations can suffer from information loss due to errors in $\mathbf{b}_c$ incurred by lossy or erroneous $T$ as well as input errors in $\mathbf{a}$ or $\mathbf{b}$. Propagation of such errors is governed by the product rule given in Section 5.1.2.2. Assuming (for purposes of convenience) that the error operator $\delta$ is a linear function, if $\mathcal{O}'$ denotes the analogue over $range(T_1)$ of $\mathcal{O} = *$ (per Equation 264), then the error inherent in $\mathcal{O}'$ is given by

$$\varepsilon_{\mathcal{O}'} = \varepsilon_\Sigma(\varepsilon_\Sigma(\varepsilon_\times(\varepsilon_T, \varepsilon_\times(\varepsilon_\oplus(\varepsilon_T[\varepsilon_{\mathbf{a}}], \varepsilon_{\mathbf{s}}), \varepsilon_\oplus(\varepsilon_T[\varepsilon_{\mathbf{b}}], \varepsilon_{\mathbf{s}}))))) , \qquad (268)$$

where $\varepsilon_T(\mathbf{a}) = \varepsilon_\oplus(\varepsilon_{\mathbf{a}}, \varepsilon_{\mathbf{r}})$, and $\varepsilon_{\mathbf{r}}$ denotes a representational error in the template $\mathbf{r}$, per Assumption 6.3.2.9.

6.3.2.14. Remark. There currently exists no general closed-form derivation of an analogue for nonlinear pointwise image operations (e.g., maximum or minimum) over the range space of a linear neighborhood transform $T$.
Since neighborhood transforms are based upon noninvertible (many-to-one) mappings, it is difficult to predict the manner in which transform output represents individual image values. For example, in the simple case of pointwise transformations of form $T(x) = cx + d$, if (a) $c, x, d > 0$ or (b) $c, d < 0$ and $x > 0$, then the analogues of maximum and minimum are preserved in case (a) and are the minimum and maximum operations, respectively, in case (b). Alternatively, in the case of inner-product transformations such as the Fourier transform, analogues of nonlinear operations are currently not apparent. We next discuss the derivation of global reduce operations over the range spaces of linear transforms.

6.3.3. Technique: Global Reduce Operations. Since linear operations over linear transforms are themselves linear operations, the (linear) global reduce operation of summation can be formulated straightforwardly. However, the nonlinear operations are less obvious, and must be restricted to special cases, which we discuss as follows.

6.3.3.1. Assumption. Let the pointwise linear transformation $T_1 : F^X \to F^X$ be applied to an image $\mathbf{a} \in F^X$ to yield the transformed image $\mathbf{a}_c = (k_1 \cdot \mathbf{a}) + k_2$, where $k_1, k_2 \in \mathbb{C}$.

6.3.3.2. Lemma. Under constraint of Assumption 6.3.3.1, the dual (analogous) operation over $range(T_1)$ of the global reduce operation $\Gamma : F^X \to F$ of image summation is given by:

$$\Gamma'(\mathbf{a}_c) = (k_1 \cdot \Sigma \mathbf{a}) + k_2 \cdot |X| . \qquad (269)$$

Proof. The proof follows directly from Lemma 6.3.2.1 and the definition of a linear transform. Additionally, the offset $k_2$ is replicated $|X|$ times, regardless of the values of $\mathbf{a}$.

6.3.3.3. Lemma. Under constraint of Assumption 6.3.3.1, the dual (analogous) operation over $range(T_1)$ of the global reduce operation $\Gamma : F^X \to F$ of image maximum or minimum is given by:

$$\Gamma'(\mathbf{a}_c) = (k_1 \cdot \vee \mathbf{a}) + k_2 \quad \text{or} \quad \Gamma'(\mathbf{a}_c) = (k_1 \cdot \wedge \mathbf{a}) + k_2 . \qquad (270)$$

Proof. The proof follows directly from Lemma 6.3.2.1 and the linearity property.

6.3.3.4. Observation.
The ease with which image sum can be retrieved from neighborhood (e.g., inner-product) transform representations varies with the transform. For example, in the case of the Fourier transform, the zero-order coefficient go generally approximates the image mean. Thus, one can approximate the image sum Ea by forming the product of go and \domain(si)\. If a is block-encoded over Y, then Sa Â« 5] go(y) Â• k/, (271) y6Y where k/ denotes blocksize. 6.3.4. Technique: ImageTemplate Operations. We next discuss derivation of analogues of the image-template operations over the range space of linear transforms. 6.3.4.1. Observation. Theorem 6.3.2.10 covers the case of analogous image-template operations over X, since r(a) = a @ r, where r G (R-'^)'''. That is, if q G (R''^)^ and a = 0(*Â» PAGE 232 222 Proof. Let Assumption 6.3.4.2 hold, and let a= 0(*'^)~^ Â®^ ^^'^ T{a)= a @ r, where q,r G (R''^)^From the proof of Theorem 6.3.2.10, we have a 0q = ac 0 @ q, which leads to the following decomposition: T{a)= J2 n(aces)(y)-^(y)] (273) = J] (ac es)(y) Â• T[a{y)]. yex Thus, the theorem holds. We next examine the case of Class 2 transforms. PAGE 233 CHAPTER 7 CLASS 2 TRANSFORMATIONS Recall (from Section 3.3) that the Class 2 transforms, denoted by T2, modify only the grey levels of an image without transforming the image domain. Thus, the Class 2 transforms may be modelled as a greylevel function. The derivation of analogues and duals for the generalized Class 2 transform is discussed in Section 7.1. Pixel value compression via bit slicing is a common image compression method that has easily-derived analogues, as discussed in Section 7.2. 7.1. Generalized Class 2 Transform. We present a generalization of the Class 2 transforms that can be used as a template for many of the commonly-used greyscale functions. 7.1.1. Technique: Pointwise and Global Reduce Operations. We begin with a decomposition of T-^, then note several interesting properties of the Class 2 transforms. 7.1.1.1. 
Observation. Per Section 3.3, the generalized Class 2 transform can be decomposed as ^T^-.F^^G^ ^^Ti-.F'^-^G ^ : 9{i).i eF , (274) where the greyscale function g is assumed to be a finite mapping. We present the following definition and lemma, followed by several comments that cover the derivations given in the remaining sections of this chapter. Â•223 PAGE 234 224 7.1.1.2. Definition. If an image value consists of a vector of n bits (bo, bi, . . .,bk_i, . . . ,bn_k, . . . , bn_i), tiien the k-th least significant bit is denoted by bk-i and the k-th most significant bit is denoted by bn-k7.1.1.3. Lemma. Assume that Observation 7.1.1.1 holds, is based on a bijection g, and 0 Â• F X F ^ F denotes an image operation. If O' G ^ G denotes the analogue of 0 over rangÂ€{T2), then 0'(r2(f), r2(g))= ffo 0(f,g),f,gGF. (275) Proof. The proof follows directly from Observation 7.1.1.1 and the definition of homomorphism. 7.1.1.4. Remark. As shown previously, we can implement Equation 7.1.1.3 without knowing the bijection g at each invocation of Q', merely by encoding Q' in a precomputed LUT. The customary LUT quantization error would hold for such an implementation. 7.1.1.5. Observation. Assume that Lemma 7.1.1.3 holds, where F is a finite subset of R. Further assume that g is not a bijection, but preserves the k-th most significant bit of each pixel in T^'s input. It follows that T2 is a lossy transform. Thus, Q', an analogue of a binary greyscale operation Q over range(T2) can be applied to range{T2) to yield the following e-near approximation to QII 0'{T2{i\Ug))0(f,g)|| < 26,Vf,gGF. (276) The upper limit of 2e occurs in the case of addition, per the additive rule of error propagation. PAGE 235 225 7. 1.1 .6. Information Loss. 
In Observation 7.1.1.1, if Q = +, then where 2'' < Â€ < This statement follows directly from the sum rule for error propagation (given in Chapter .5), and the fact that the truncation ofk-l high-order bits (with the modification of as many as n-k lower-order bits possible) can simulate division by a factor of 2*^ < e < 2'^"'"'. Given an analogue of a pointwise binary operation the analogous global reduce function is readily derived, as follows. 7.1.1.7. Theorem. Let T2 : F-'^ ^ be applied to a source image a G F-'' to yield an encoded image a^. The global reduce operation F : F^ Â— > F has an analogue T' : Â— G over range{T2), if the following conditions are satisfied: (i) r is based upon the associative, commutative operation 7 : F x F Â— ^ F; (ii) 7 has an analogue 7' over range^T^):, and (iii) 7' is the associative, commutative function upon which V is based. Proof. The proof follows directly from the definitions of the global reduce operation and an analogous operation. We offer the following two examples as illustrations of an approximation to the global reduce operation, as well as a common pitfall in deriving analogues to Class 2 transforms. 7.1.1.8. Example. Suppose that T2 is a mapping from a finite subset of the reals F = {1.1, 2.2, 3.3, 4.4} to an index set G. Let a = (1.1, 3.3, 2.2), and let Tiix) = [22;], where [ ] denotes rounding to the nearest integer. Thus, ac = ^2(3) = (2, 7, 4), and Ea = 6.6 while Sb = 12. Since is a nonlinear approximation to a linear transform (i.e., T{x) = 2a:), and can therefore be a lossy transform, the operation of addition does not have an analogue over range{ T). As a result, r'(r2(a)), where T' = S, is an f-near approximation to r2(I)a), where e > i(12 (2 Â• 6.6)) = 0.4 if RMS error computation is employed. 
Given that the predicted absolute error per pixel in $\mathbf{a}_c$ is bounded by the interval $[-0.5, 0.5)$ or $(-0.5, 0.5]$ (i.e., the quantization error inherent in the rounding function), we have that each invocation of $\gamma'$ incurs a maximum absolute error of $1.0 = 0.5 + 0.5$ (i.e., in the case of addition). Since $\Gamma'$ is comprised of two cascaded invocations of $\gamma'$, the maximum absolute error is given by $1.5 = (0.5 + 0.5) + 0.5$. In this instance, the computed error condition $\varepsilon > 0.4$ is satisfied.

7.1.1.9. Example. Since we assume that $T_2$ is a finite homomorphism, it would appear that analogues could be formulated over non-numeric sets merely by indexing the value sets of $T_2$'s input and output images (assuming that such value sets are unequal). However, this does not hold universally, especially in the case of modular arithmetic with heterogeneous moduli. For example, let $\mathbf{a}, \mathbf{b} = (0, 1, 2)$, let $d$ be defined in terms of its graph $G(d) = \{(0, \alpha), (1, \beta)\}$, and let $T_2(x) = d(x \bmod 2)$. Thus, $\mathbf{a}_c = T_2(\mathbf{a}) = (\alpha, \beta, \alpha)$. Letting $\mathcal{O} = +_3$ (addition modulo 3), we have $\mathbf{c} = \mathbf{a} +_3 \mathbf{b} = (0, 2, 1)$. It is easily verified that $\mathbf{c}_c = T_2(\mathbf{c}) = (\alpha, \alpha, \beta)$, since $\mathcal{O}(0, 0) = 0$ and $\mathcal{O}(2, 2) = 1$. However, the arguments 0 and 2 each have the same representation in $range(T_2)$, namely, $\alpha$. An analogue would therefore have to map the pair $(\alpha, \alpha)$ to both $\alpha$ and $\beta$. Thus, $\mathcal{O} = +_3$ has no analogous operation over the range space of this particular instance of $T_2$. Therefore, care must be exercised to inspect the underlying mappings that constitute the encoding, lest they comprise one or more of the cases for which analogues or duals cannot be derived by known means.

7.1.2. Image-Template Operations. It follows from the preceding development, as well as the development of Chapter 5, that a lossy encoding transform induces error propagation not only through the pointwise and global reduce transforms, but through the image-template operations as well. We previously discussed cases where the transform $T_2$ is implemented in terms of a finite bijection and is thus not lossy.
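The contrast between a bijective encoding and the pitfall of Example 7.1.1.9 can be checked by brute force. The Python sketch below is illustrative only: `find_analogue` is a hypothetical helper that attempts to tabulate an analogue $\mathcal{O}'$ as a lookup table (in the spirit of Remark 7.1.1.4) and reports failure when two argument pairs with the same encoding require different encoded results. The labels "alpha" and "beta" stand for the two non-numeric symbols of the example, and the relabelling into "p", "q", "r" is an assumed bijection introduced for comparison.

```python
from itertools import product


def find_analogue(values, op, encode):
    """Tabulate O' with O'(encode(f), encode(g)) = encode(op(f, g)).

    Returns the lookup table as a dict, or None when the encoding makes
    the table inconsistent (i.e., no analogue exists for this op).
    """
    table = {}
    for f, g in product(values, repeat=2):
        key = (encode(f), encode(g))
        out = encode(op(f, g))
        if table.setdefault(key, out) != out:
            return None
    return table


def add_mod3(f, g):
    """The operation +_3 of Example 7.1.1.9."""
    return (f + g) % 3


def parity_code(x):
    """T2(x) = d(x mod 2), with d = {(0, alpha), (1, beta)}."""
    return "alpha" if x % 2 == 0 else "beta"


def relabel(x):
    """An assumed bijective encoding of {0, 1, 2}, for comparison."""
    return "pqr"[x]


values = [0, 1, 2]
lossy_table = find_analogue(values, add_mod3, parity_code)
bijective_table = find_analogue(values, add_mod3, relabel)
```

Under these assumptions `lossy_table` is `None`, because the pairs (1, 0) and (1, 2) share the encoding (beta, alpha) yet require different results, whereas `bijective_table` holds all nine entries of the analogue.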
Given such theory, it is relatively easy to formulate image-template operations over $range(T_2)$.

7.1.2.1. Assumption. Let a source image $\mathbf{a} \in F^X$ be combined with a template $\mathbf{t} \in (F^X)^X$ to yield $\mathcal{O}(\mathbf{a}, \mathbf{t}) = \mathbf{a} \circledast \mathbf{t}$. Let $T_2 : F^X \to G^X$ be based on a mapping $g : F \to G$, such that $\mathbf{a}_c = T_2(\mathbf{a})$, and let the template $\mathbf{s} \in (G^X)^X$ be defined by its weights as

$$\mathbf{s}_y(x) = T_2(\mathbf{t}_y(x)), \quad x, y \in X . \qquad (277)$$

7.1.2.2. Observation. Under constraint of Assumption 7.1.2.1, if $g : F \to G$ is a bijection, then $\mathcal{O}' : G^X \times (G^X)^X \to G^X$, the analogue of $\mathcal{O}$ over $range(T_2)$, is given by $\mathcal{O}'(\mathbf{a}_c, \mathbf{s}) = \mathbf{a}_c \circledast' \mathbf{s}$, where $\circledast' : G^X \times (G^X)^X \to G^X$ is expressed as

$$(\mathbf{a}_c \circledast' \mathbf{s})(y) = \Gamma'_{x \in X} \; \mathbf{a}_c(x) \circ' \mathbf{s}_y(x), \quad y \in X , \qquad (278)$$

with $\gamma'$ (the basis operation of $\Gamma'$) and $\circ'$ denoting analogues of the operations $\gamma$ and $\circ$ that comprise $\circledast$.

7.1.2.3. Theorem. Under constraint of Assumption 7.1.2.1, if $T_2$ has an $\varepsilon$-near approximate inverse $T_2^{*} : G^X \to F^X$ that is based upon the $\varepsilon'$-near approximate inverse $g' : G \to F$ of $g : F \to G$, then $\mathcal{O}' : G^X \times (G^X)^X \to G^X$, the analogue of $\mathcal{O}$ over $range(T_2)$, would be expressed as $\mathcal{O}'(\mathbf{a}_c, \mathbf{s}) = \mathbf{a}_c \circledast' \mathbf{s}$, where $\circledast' : G^X \times (G^X)^X \to G^X$ is given by

$$(\mathbf{a}_c \circledast' \mathbf{s})(y) = \Gamma'_{x \in X} \; \mathbf{a}_c(x) \circ' \mathbf{s}_y(x), \quad y \in X , \qquad (279)$$

with $\Gamma'$ and $\circ'$ denoting analogues of the operations $\Gamma$ and $\circ$ that constitute the operation $\circledast$.

Proof. The proof follows directly from the definition of homomorphism and an $\varepsilon$-near approximate inverse.

7.2. Pixel Value Compression via Bit Slicing. The technique of bit-plane elimination, otherwise called bit slicing, has long been known as a simple method of image compression. The customary method of bit slicing eliminates low-order bits, which may (in practice) represent noise. Other methods of bit slicing may remove higher-order bits that are thought not to have a significant effect on the visual representation of an image. However, the latter method requires intensive computation, in order to determine which bit planes are statistically less significant.
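The customary low-order form of bit slicing can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the formulation developed in this chapter: pixel values are taken to be unsigned $m$-bit integers, the $l = m - k$ low-order bits are discarded by a right shift, and an approximate inverse refills them with zeroes by a left shift. The function names `bit_slice` and `approx_inverse` are hypothetical.

```python
def bit_slice(values, m, k):
    """Keep the k high-order bits of each m-bit value (drop l = m - k low-order bits)."""
    l = m - k
    return [v >> l for v in values]


def approx_inverse(codes, m, k):
    """Approximate inverse: restore the word width by appending l zero bits."""
    l = m - k
    return [c << l for c in codes]


# Usage: 8-bit pixels compressed to 5 bits, then approximately restored.
a = [0b01100101, 0b11111111, 0b00000111]
ac = bit_slice(a, 8, 5)
restored = approx_inverse(ac, 8, 5)

# Per-pixel information loss is at most 2**l - 1 grey levels.
errors = [v - r for v, r in zip(a, restored)]
```

With $m = 8$ and $k = 5$, every entry of `errors` lies in the interval $[0, 2^l - 1] = [0, 7]$, which is the loss that any analogue computed over the compressed values must tolerate.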
We first describe the bit slicing transform, then derive analogues of basic lA operations over the transform's range space. 7.2.1. Basic Concepts. We begin by analyzing a generalized formulation of the bit-slicing transform in terms of complexity, security, and information loss. 7.2.1.1. Assumption. Assume that B = {0, 1} and that the greyscale function g : B"" ^ B*', where k < m. 7.2.1.2. Definition. The bit slicing transform, denoted at coarse granularity as '~'T2h'G-'^, is decomposed as signed integers be compressed by right-shifting the bits of each image value by / = m-k bits, to effect division by 2'. Then, the basis function g: BÂ™ ^ B'' of T2h is implemented as fif(a(x))= a(x)2"' , x G X. (280) It follows that ~ g. 7.2.1.3. Example. Let a source image a Â£ (B"*)^ whose image values are encoded as PAGE 239 229 7.2. 1.4. Compression Ratio. Since reduces the word width of its m-bit domain (determined by the function siz) by / bits, the compression ratio is given by 7.2. 1.5. Complexity. Assume that the image domain is denoted by X. If T^h is based upon g : BÂ™ B^, which is computed sequentially using an m-dimensional LUT, then the time complejcity T(|X|) Â— 0(|X|), where the number of LUT invocations is independent of m. Alternatively, if N processors are employed to compute T-2h in 0(|X|/N) time, then each processor must have a datapath of width V(m, n) bits, which may be impractical for large m. 7.2.1.6. Data Security. For purposes of discussion, let Tab implement a linear transformation (e.g., multiplication by the factor 2"'). Thus, if T^h is an encryptive homomorphism, then T2h may be compromised by the method of linear-algebraic attack summarized in Ahituv [44]. It is interesting to note that a cipher based on nonlinear bit slicing would be more secure than T2hFor example, a message characterized by the eight-bit word {01 1 0 0 I 0 1) could be encoded by removing the italicized bits, thereby yielding the 5-bit code (10 1 0 1). 
Since the input size is disclosed by $T_{2b}$, one merely has to determine $m$ and $k$, then perform a brute-force attack requiring work $W(m, k) \le |X| \cdot \binom{m}{k}$ invocations of $g$. For customary values of $8 \le m \le 16$ bits and possible values of $k$, the attack is not burdensome over time scales of 30-60 minutes using current technology. For example, the required work is clearly bounded above by $|X| \cdot (m-1)! \approx 1.2 \cdot 10^{11} \cdot |X|$ invocations of $g$. Let the input contain $|X| \le 3600$ $m$-bit symbols (i.e., 60 lines of 60 characters, or approximately one page of ASCII text). Assuming $|X|$-fold parallelism and one machine cycle per invocation of $g$, one divides the preceding estimate by 1800 seconds to obtain a bandwidth requirement per processor of 66.7MHz, which is available commercially.

7.2.1.7. Information Loss. Let $domain(T_{2b})$ be the $m$-bit integers, and let $range(T_{2b})$ be the $k$-bit integers, as implied by $g : \mathbb{B}^m \to \mathbb{B}^k$. Denoting entropy as $H$, the information loss $\Delta H$ is given by:

$$\Delta H = H(domain(T_{2b})) - H(range(T_{2b})) = siz(domain(T_{2b})) - siz(range(T_{2b})) = m - k . \qquad (282)$$

For example, if the word truncation portrayed in Example 7.2.1.3 is employed, then the information loss is clearly $l$ bits per image pixel value, and $l \cdot |X|$ bits per image.

7.2.1.8. Remark. Since $T_{2b}$ is a lossy transform, the inverse transform $T_{2b}^{-1}$ does not exist, and $T_{2b}$ is not an isomorphism. However, given the definition of $g$ contained in Example 7.2.1.3, it is possible to specify an $\varepsilon$-near approximation to $T_{2b}^{-1}$, which is denoted by $T_{2b}^{*} : (\mathbb{B}^k)^X \to (\mathbb{B}^m)^X$ and is based upon the mapping $g^{*} : \mathbb{B}^k \to \mathbb{B}^m$. The implementation is specified at fine granularity as $g^{*}(f) = f \cdot 2^l$, where $l = m - k$. In this case, the error associated with the image decompression operation $T_{2b}^{*}$ is simply the information loss incurred by $T_{2b}$, which was determined in Equation 282. For purposes of development, we assume that the bit slicing transform eliminates $l$ lower-order bits, as shown in Example 7.2.1.3.
This assumption is realistic, since it simulates clipping of low-order (noise) bits, and renders derivation of pointwise analogues over $range(T_{2b})$ more straightforward.

7.2.2. Technique: Pointwise Operations. Pointwise operations over $range(T_{2b})$ are derived similarly to the pointwise operations over the range space of the generalized linear transform $T_1$.

7.2.2.1. Lemma. Let source images $\mathbf{a}, \mathbf{b} \in (\mathbb{B}^m)^X$ be transformed by $T_{2b} : (\mathbb{B}^m)^X \to (\mathbb{B}^k)^X$, which is based on the function $g : \mathbb{B}^m \to \mathbb{B}^k$ (per Definition 7.2.1.2), to yield the compressed images $\mathbf{a}_c$ and $\mathbf{b}_c$. Further assume that $T_{2b}$ has an $\varepsilon$-near approximation to the inverse transform, denoted by $T_{2b}^{*} : (\mathbb{B}^k)^X \to (\mathbb{B}^m)^X$. If the image operation $\mathcal{O} : \mathbb{B}^m \times \mathbb{B}^m \to \mathbb{B}^m$ has an analogue $\mathcal{O}' : \mathbb{B}^k \times \mathbb{B}^k \to \mathbb{B}^k$ over $range(T_{2b})$, then the following statements hold:

(i) $\mathcal{O}' = \mathcal{O}$, e.g., if $\mathcal{O} = +$, then $\mathcal{O}' = +$; and

(ii) $\| T_{2b}(\mathcal{O}(\mathbf{a}, \mathbf{b})) - T_{2b}^{*}(\mathcal{O}'(\mathbf{a}_c, \mathbf{b}_c)) \| \le \varepsilon'$, where $\varepsilon' = \varepsilon_{\mathcal{O}}(2^l, 2^l)$, with $\varepsilon_{\mathcal{O}}$ the error function of $\mathcal{O}$.

Proof. Assuming that the givens hold, (i) follows from the definition of a linear transform, which $T_{2b}$ is assumed to be. Statement (ii) follows from the definition of an $\varepsilon$-near approximation to the inverse transform, and the expression for $\varepsilon'$ follows from the definition of an error function of a binary operation, as well as the analysis of information loss given in Section 7.2.1.7.

7.2.2.2. Remark. The $\varepsilon$-near approximation to the dual $\Omega : (\mathbb{B}^k)^X \times (\mathbb{B}^k)^X \to (\mathbb{B}^m)^X$ of the pointwise operation $\mathcal{O}$ is similarly derived, by setting $\Omega(\mathbf{a}_c, \mathbf{b}_c) = 2^l \cdot \mathcal{O}(\mathbf{a}_c, \mathbf{b}_c)$, which merely implements $\Omega = T_{2b}^{*} \circ \mathcal{O}'$, where the analogue $\mathcal{O}'$ was defined in the preceding lemma.

7.2.2.3. Complexity. Assume that $\mathcal{O}' = \mathcal{O}$, where $\mathcal{O}$ and $\mathcal{O}'$ compute over $m$-bit and $k$-bit integers, respectively. Further assume that the time complexities $T_{\mathcal{O}}(m) = O(m)$ and $T_{\mathcal{O}'}(k) = O(k)$. As shown in Section 4.2.3, the resultant speedup due to processing over $range(T_{2b})$ is given by $O(m/k)$.

7.2.2.4. Data Security and Information Loss. For purposes of discussion, assume that $T_{2b}$ is a lossy encryption with key $l = m - k$.
If $\mathcal{O}' = \mathcal{O}$, then the analogue $\mathcal{O}'$ does not compromise the data security of $T_{2b}$, since no information is added to the inputs $\mathbf{a}_c$ and $\mathbf{b}_c$ of $\mathcal{O}'$. The preceding statement can be easily verified by noting that $\mathbf{b}_c$ cannot contain the $l$ lower-order bits of $\mathbf{a}$ that are truncated by $T_{2b}$, since $l$ lower-order bits of $\mathbf{b}$ are also truncated by $T_{2b}$. Furthermore, the approximate inverse $T_{2b}^{*}$ does not compromise the data security of $T_{2b}$, since the replacement of $l$ lower-order bits of $\mathbf{a}$ with zeroes (which is performed by the operation $T_{2b}^{*}(T_{2b}(\mathbf{a}))$) does not necessarily restore the information lost by $T_{2b}$.

However, attack upon $T_{2b}$ is quite easy, for two reasons. First, $T_{2b}$ preserves the plaintext domain size and ordering. Thus, if $T_{2b}$ is available, the basis function $g$ can be easily determined. Second, $T_{2b}$ partially preserves the histogram of its input, which assists the cryptanalyst in discovering the plaintext, whether or not $T_{2b}$ is available. In particular, if the adversary knows the frequency distribution of the $l$ lower-order bits of each $k$-bit integer in $range(T_{2b})$, then it is possible to reconstruct a probabilistic version of the plaintext histogram. With the domain ordering information that $T_{2b}$ discloses, the adversary can construct a relaxation algorithm that guesses at $n$ adjacent characters, based upon $n$-gram frequency tables. For purposes of efficiency, one would initially set $n = 2$, then increment $n$ until recognizable words appear in the decrypted plaintext. Thereafter, lexical and relaxation-based methods would be combined to recognize probable words and their collocated candidate words [27].

7.2.3. Technique: Global Reduce Operations. Given the pointwise analogue, formulating an analogue of the global reduce function is a straightforward process.

7.2.3.1. Theorem. Let source images $\mathbf{a}, \mathbf{b} \in (\mathbb{B}^m)^X$ be transformed by $T_{2b} : (\mathbb{B}^m)^X \to (\mathbb{B}^k)^X$, which is based on the function $g : \mathbb{B}^m \to \mathbb{B}^k$ (per Definition 7.2.1.2), to yield the compressed images $\mathbf{a}_c$ and $\mathbf{b}_c$.
Further assume that $T_{2b}$ has an $\varepsilon$-near approximation to the inverse transform, denoted by $T_{2b}^{*} : (\mathbb{B}^k)^X \to (\mathbb{B}^m)^X$. If the global reduce operation $\Gamma : (\mathbb{B}^m)^X \to \mathbb{B}^m$ has an analogue $\Gamma' : (\mathbb{B}^k)^X \to \mathbb{B}^k$ over $range(T_{2b})$, then the following statements hold:

(i) $\Gamma' = \Gamma$, e.g., if $\gamma = +$ is the basis function of $\Gamma$, then $\gamma' = +$ is the basis function of $\Gamma'$; and

(ii) $\| T_{2b}(\Gamma \mathbf{a}) - T_{2b}^{*}(\Gamma'(\mathbf{a}_c)) \| \le \varepsilon'$, where $\varepsilon' = \varepsilon_\gamma(2^l, 2^l)$ and $\varepsilon_\gamma$ denotes the error function of $\gamma$.

Proof. The proof is outlined as follows. Assuming that the givens hold, (i) follows from Lemma 7.2.2.1 and the assumption that $T_{2b}$ is a linear transform. Statement (ii) follows from the definition of an $\varepsilon$-near approximation to the inverse transform.

7.2.3.2. Complexity and Data Security. Since $\Gamma' = \Gamma$, the complexity of $\Gamma$ would initially appear to be preserved over $range(T_{2b})$. However, if the time complexity of $\gamma$ is a function of the input word width, then it follows that the computational speedup achieved by processing over $range(T_{2b})$ using an analogue $\Gamma'$ is given by

$$\frac{T_\Gamma(|X|)}{T_{\Gamma'}(|X|)} = \frac{|X| \cdot T_\gamma(siz(F))}{|X| \cdot T_{\gamma'}(siz(G))} = \frac{T_\gamma(m)}{T_{\gamma'}(k)} = \frac{m}{k} . \qquad (283)$$

The global reduce functions over $range(T_{2b})$ are not themselves secure, but do not compromise the transform's security. Note that the input image's arithmetic mean can be deduced from the image sum, and the image product discloses only the geometric mean of the input image. Similarly, the image maximum and minimum disclose no additional information about the input. The dual operation is symmetrically defined, and has a similar error function. We next consider the pointwise image-template operations over $range(T_{2b})$.

7.2.4. Technique: Image-Template Operations. Given the analogues of the pointwise and global reduce operations, the formulation of the image-template operation can proceed straightforwardly, per the methods of Section 3.4.
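The pointwise and global reduce analogues referred to above (Lemma 7.2.2.1 and Theorem 7.2.3.1) can be illustrated numerically before proceeding. The Python sketch below is not part of the original development. It assumes unsigned integer pixel values, $\mathcal{O} = \gamma = +$, and $l = 3$ discarded low-order bits; the helper names are hypothetical, and results are allowed to exceed $k$ bits rather than wrap, which sidesteps overflow handling.

```python
def bit_slice(values, l):
    """T_2b: drop the l low-order bits of each value."""
    return [v >> l for v in values]


def restore(values, l):
    """Approximate inverse: multiply each value by 2**l."""
    return [v << l for v in values]


def analogue_add(ac, bc):
    """Pointwise analogue O' = + applied to compressed images."""
    return [p + q for p, q in zip(ac, bc)]


def analogue_sum(ac):
    """Global reduce analogue (image sum) over the compressed image."""
    return sum(ac)


l = 3
a = [101, 200, 37]
b = [14, 55, 90]
ac, bc = bit_slice(a, l), bit_slice(b, l)

# Pointwise addition over the range space, then approximate decompression.
approx_add = restore(analogue_add(ac, bc), l)
exact_add = [p + q for p, q in zip(a, b)]
add_errors = [e - r for e, r in zip(exact_add, approx_add)]

# Image sum over the range space, then approximate decompression.
approx_sum = analogue_sum(ac) << l
sum_error = sum(a) - approx_sum
```

In this instance each pointwise error is below $2 \cdot 2^l = 16$, since each operand loses fewer than $2^l$ grey levels, and the error in the image sum is below $|X| \cdot 2^l = 24$. Both observations are consistent with the additive rule of error propagation cited in the text, although they are properties of this example rather than a restatement of the bound $\varepsilon'$.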
In particular, since the bit-slicing transform affects only the values of the compressed image and not the image domain, the composition of 7 and o to form Â®', the analogue of @ over range( T^b), is generally straightforward. However, if we apply to the output of Â®', the result will be an approximation of the corresponding image operation, as discussed in the following sections, 7.2.4.1. Assumption. Let source image a 6 and let template t Â£ {^^) be spacevariant. Let ac = T2b(a), and let s G (G-'^)''' be defined by its weights as Sy(x) = T'2b(ty(x)), x,y Â£ X. Let a= a @t, and let 7 and Oi the basis operations of Â®, have analogues 7' and o' over range{T2h), as described previously. Further assume that T2b is an isomorphism with an e-near approximation to its inverse denoted by Tj^,. 7.2.4.2. Theorem. Under constraint of Assumption 7.2.4.1, Â®': X (G^)^ G^, the analogue of Â® over range{T2h) is given by: (acÂ®'s)(y)= I ac(x)osy(x),y e Y, (284) such that i|7'2b(acÂ®'s)-(aÂ®t)||< e', (285) where e' < e, the error inherent in ^2*1,(3). Proof. Assuming that the givens hold. Equation 284 follows directly from the definition of Â® and the definition of a homomorphism. Similarly, Equation 285 follows from the definition of an e-near approximation to the inverse transform. The condition e' < Â€ follows directly from the fact that Â®' introduces computational error in addition to that incurred by the operation T2^^{a). PAGE 245 235 7.2.4.3. Complexity. Under constraint of Assumption 7.2.4.1 and Theorem 7.2.4.2, if the analogous operations 7' and o' incur identical computational cost as the corresponding image operations 7 and o, then the image-template operation ac Â®'s over range{ T2h) incurs computational cost identical to the operation a Â®t. Since 7' and o' compute over their operands in the same fashion as 7 and o, it is reasonable to assume (for practical purposes) that 7' = 7 and o' = OThat is, if 7 = +, then 7' = +, and likewise for O'. 
As a result, it would appear that no computational advantage accrues from processing over range{ T-^h)However, let us assume that T-y(m) = 0(m) and symmetrically for o and k. For purposes of simplicity, let |5(t)| = |5(t)-l| and note that |Â»S(ty)|= |5(sy)|, Vy Â€ Y. If the value sets of r2b's domain and range spaces respectively consist of mand k-bit numbers, then the speedup achieved by the image-template operation over range{ T^h) is given by: T@(t, m) ^^^^""^ + ^Â®'('^-^-^^^)= f^:(-k) = (Ty(k) |5(sy)|-l) + (T,,(k) . |5(sy)|) ' ^^S^) ^ yex which becomes V^'it s m k)|X|-(T,(m) + To(m) -15(1^)1) _ /mN _ (t, s,m, k)|x|.(T,(k) + To(k). |5(sy)|) " " ^^^^^^ ^^^^^ Thus, in practice, one could process bit-sliced imagery with a speedup proportional to the compression ratio. PAGE 246 CHAPTER 8 CLASS 3 TRANSFORMATIONS In this chapter, we overview Class 3 transforms of interest to image processing and cryptography. The affine transform (Section 8.1) is a component of fractal-based compression algorithms, whereas pixel selection (Section 8.2) is often employed in the reduction of sparse matrices. We conclude with the simple case of a transpositional cipher (Section 8.3). 8.1. Affine Transformation. The affine transform derives from linear algebra and supports various transformations employed in computer graphics. 8.1.1. Basic Concepts. 8.1.1.1. Definition. Let domains X, Y C and let the source image a G F-'^ be deformed by T : F-'^ Â—> F^, which is based on the spatial transformation g : Y Â— > X, where each coordinate x= (xi,X2) G X is transformed as y = gix)= (A -x + B) hA (288) a2i J ' \X2 J ^ \h2 It is customary to call g an affine transformation and to express A as (-r2^shi^(92 r2C0^22)' ^^^'^^ (ri.^i) and (r2,^2 + |) correspond to the points (aii,ai2) and (a2i,a22) expressed in polar form. Scaling is specified by r, rotation by 0, and the coefficient matrix B (reference Equation 288) specifies shift or spatial offset. 
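The polar parameterization of the affine coefficient matrix in Definition 8.1.1.1 can be sketched as follows. This Python fragment is an illustration under stated assumptions, not the author's implementation: it takes the first row of $A$ to be $(r_1 \cos\theta_1, \; r_1 \sin\theta_1)$ and the second row to be $(-r_2 \sin\theta_2, \; r_2 \cos\theta_2)$, so that the rows correspond to the polar points $(r_1, \theta_1)$ and $(r_2, \theta_2 + \pi/2)$, and it applies $g(x) = A \cdot x + B$ to a single coordinate. The names `affine_matrix` and `affine_map` are hypothetical.

```python
import math


def affine_matrix(r1, theta1, r2, theta2):
    """Build the 2x2 matrix A from its polar parameters (scale r, rotation theta)."""
    return (
        (r1 * math.cos(theta1), r1 * math.sin(theta1)),
        (-r2 * math.sin(theta2), r2 * math.cos(theta2)),
    )


def affine_map(x, A, B):
    """Apply g(x) = A.x + B to one coordinate x = (x1, x2)."""
    x1, x2 = x
    return (
        A[0][0] * x1 + A[0][1] * x2 + B[0],
        A[1][0] * x1 + A[1][1] * x2 + B[1],
    )


# Usage: unit scaling, rotation parameter pi/2, and a shift of (2, 3).
A = affine_matrix(1.0, math.pi / 2, 1.0, math.pi / 2)
y = affine_map((1.0, 0.0), A, (2.0, 3.0))
```

When $r_1 = r_2$ and $\theta_1 = \theta_2$ the matrix reduces to a uniformly scaled rotation; in the usage above the point $(1, 0)$ is carried to approximately $(2, 2)$, up to floating-point rounding in the cosine of $\pi/2$.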
The image algebra formulation of $T$ is given by

$$\mathbf{a}_c = T(\mathbf{a}) = \mathbf{a} \circ g = \{(y, \mathbf{a}(g(y))) : y \in Y\} . \qquad (289)$$

8.1.1.2. Observation. It is possible to redefine $T : F^X \to F^X$ in terms of a basis map $g : X \to X$ that incorporates restriction to $X$. If $X$ is finite, then the discrete affine transform is usually implemented in terms of an interpolation algorithm that improves the visual appearance of the transformed image. Unfortunately, such algorithms generally render the output of $T$ a lossy approximation to the output of the corresponding affine transform over continuous $X$. In Section 8.1.1.8, we present a general formulation for such losses. Additionally, we note that an affine transform can be space-variant, where $A$ or $B$ is a function of $x \in X$. However, we emphasize the space-invariant affine transform for purposes of simplicity.

8.1.1.3. Complexity. The space-invariant affine transform requires four multiplications and six additions per point in $X$, yielding the time complexity

$$T_T(|X|) = |X| \cdot (4 \, \Delta t_m + 6 \, \Delta t_a) . \qquad (290)$$

If $T$ is space-variant, then $A$ and $B$ must be computed at each point in $X$. For example, if the polar form given in Section 8.1.1.1 is employed, then four multiplications and invocations of the sine function are incurred per point in $X$, in addition to the overhead shown in Equation 290.

8.1.1.4. Information Loss. The average computational error inherent in Equation 288 is bounded as $|\varepsilon_T| \le 0.5 \cdot (q_1 + q_2)/2 = 0.25 \cdot (q_1 + q_2)$, where $q_1$ and $q_2$ denote the quantization intervals specific to the first and second coordinate of $X$. This statement can be verified by noting that the maximum error in quantizing a real-valued number into an integer ($q = 1$) is $\pm q/2$. More precise quantification of information loss in a spatial affine transformation requires discussion of the errors inherent in an interpolation step that might be present.
Rather than specialize our discussion to customary methods only, such as bilinear or bicubic-spline interpolation, we present a general discussion and formulation that will be useful in subsequent development.

8.1.1.5. Observation. Consider the case of image rotation in the absence of scaling. Let a pixel neighborhood $N(x)$, where $x \in X$, which is rectangular since $X \subset \mathbb{R}^2$, be oriented arbitrarily at the center of a 3x3-pixel grid as shown in Figure 12a. It is well known that such a configuration cannot produce a rotated pixel that overlaps more than six pixels, inclusive of the central pixel, as shown in Figure 12b. The labelling scheme of Figure 12a intuitively suggests the following representation for error inherent in an affine transform of a rotated image.

Figure 12. Pixel indexing and overlap scheme for information loss analysis of an affine transform: (a) indexing scheme, in which the rows of the 3x3 grid are labelled 4 3 2, 5 0 1, and 6 7 8, and (b) example of pixel overlap.

8.1.1.6. Definition. Let the discrete domain $X \subset \mathbb{Z}^2$ and let the unscaled rotated image $\mathbf{a}_c = T(\mathbf{a})$, where $T$ was given in Section 8.1.1.1. Given Figure 12, we can define $\mathbf{a}_c$ in terms of an $n$-pixel tessellation of a scaled, rotated, or shifted source pixel using a family of indexing functions $H = \{h_i(y) \in X : i \in \mathbb{Z}_{n+1} \text{ and } y \in Y\}$. Additionally, we adopt the notational convenience that $f_i(h(y)) = f(h_i(y))$, since an implementation of the function $f$ could pass $i$ as an argument to $h$. Thus, we can write

$$\mathbf{a}_c = \{(y, [f_1(h(y)), f_2(h(y)), \ldots, f_n(h(y))]) : y \in Y\} , \qquad (291)$$

where $f_i$, $i = 1..n$, denotes the fraction of the source pixel value $\mathbf{a}(h(y))$ that is contained in $\mathbf{a}_c(y)$. For unscaled rotation restricted to $X$, we can set $Y = X$ and the preceding convention reduces to $n = 8$, per Figure 12.

8.1.1.7. Observation. Equation 291 facilitates specification of a discrete spatial transform $T$ in terms of an interpolation that distributes fractions $f_0$ through $f_8$ of a pixel value spatially.
By denoting the exact interpolation as $\hat{a}_c$ and its attendant fractions as $\hat{f}$, we are able to specify the interpolation error as shown in the following definition.

8.1.1.8. Definition. Given Definition 8.1.1.6 and the preceding observation, the error $\epsilon_{a_c}$ inherent in $a_c = T(a)$ is given by

$\epsilon_{a_c}(y) = \hat{a}_c(y) - a_c(y) = \sum_{i=0}^{n} [\ldots]$, (292)

where $\hat{f}$ was defined previously.

8.1.1.9. Remark. Implementationally, the approximation $a_c$ can be computed via line intersection and trapezoidal area formulae. Thus, Equation 292 provides an accurate theoretical expression for the error inherent in a space-invariant affine transformation T. Since $f$ and $\hat{f}$ vary spatially, this method is also useful for characterizing the error inherent in space-variant T.

8.1.1.10. Data Security. The space-invariant affine transformation shown in Equation 288 has little utility in data encryption, since it merely shifts, rotates, or scales an image, which does not render the image appearance obscure. As a result, correlation-based attack using multiple scaled and rotated exemplars is feasible. For example, one could employ high-bandwidth optical architectures based on Fourier-transform correlation [73]. In contrast, a space-variant transform can decorrelate the pixel ordering inherent in the source domain such that the transformed image may appear visually cryptic. As we have seen, such spatial transformation can approximate a transposition. Thus, the histogram of the transformed image approximates the histogram of the source image. As a result, an affine transform T is generally not considered useful for encryption, since T discloses the histogram of its input and might not obscure its input visually.

8.1.2. Derivation of Pointwise Analogues over range(T)

We begin by expressing the pointwise operations over $a_c$, then analyze such operations in terms of the representation given in Equation 292.

8.1.2.1. Lemma.
If Definition 8.1.1.1 holds and the system $\mathcal{X}_O = (\{F^X, O\}, \{T, \Theta\}, \{F^Y, O'\})$ has the following salient components: (i) O is a unary (binary) operation over $F^X$, (ii) $T : F^X \to F^Y$ is a space-invariant affine transform, and (iii) O' is a unary (binary) operation over $F^Y$, then $O' = \Theta(O) = O$.

Proof. The proof follows directly from the fact that an affine transform over continuous domains does not modify pixel values in T's input. The nearly trivial result is that O' = O.

8.1.2.2. Remark. It follows from the preceding Lemma that the global-reduce operations and the image-template operation $\circledast$ are preserved over range(T), when T is continuous and space-invariant, provided that: (i) T does not scale the source image when $\Gamma \in \{\Sigma, \Pi\}$, and (ii) if T(a) is restricted to X, such restriction does not incur information loss, i.e., $X \subset Y$ is required. Condition (i) is easily verified, since scaling by a factor of k would scale the image sum by k and the image product by $k^{|X|/|Y|}$. Condition (ii) prohibits cropping of a by T, which would prohibit $\Gamma$ and $\circledast$ from having analogues, regardless of whether or not X is continuous. We next discuss the more interesting case of analogues over the range of a discrete affine transform, using the representation given in Equation 291.

8.1.2.3. Assumption. Let Definition 8.1.1.1 hold, where $X, Y \subset \mathbb{Z}^2$, and let Equation 292 describe the error inherent in an affine transform T.

8.1.2.4. Observation. Since X and Y are discrete and the error in T is nonzero, T is a lossy transformation. Thus, we say in practice that T has an $\epsilon$-near approximation to its inverse, denoted by $T^* : F^Y \to F^X$, such that, for a given pair of images $a \in F^X$ and $a_c = T(a)$,

$\|T^*(T(a)) - a\| < \epsilon_{a_c} + \epsilon_A$, (293)

where $\epsilon_A \in \mathbb{R}^+$ denotes implementational error (e.g., due to architecture-dependent precision).

8.1.2.5. Assumption.
Let the system $\mathcal{V}_O = (\{F^X, O\}, \{T, \Theta\}, \{F^Y, O'\}, T^*)$ have salient components that are specified as follows: (i) O is a unary operation over $F^X$, (ii) $T : F^X \to F^Y$ is an affine transform that has an $\epsilon$-near approximation to its inverse given by $T^* : F^Y \to F^X$, such that $|\epsilon_{T^*}| = |\epsilon_T(\epsilon_{a_c})| < 2\epsilon_{a_c}$ (for purposes of convenience), where $\epsilon_{a_c}$ is given in Equation 292; and (iii) O' is a unary operation over $F^Y$.

8.1.2.6. Theorem. If Assumption 8.1.2.5 and Definitions 8.1.1.1, 8.1.1.6, and 8.1.1.8 hold and we have $O' = \Theta(O) = O$, then $|T^*(O'(T(a))) - a|$ [...]

8.1.2.7. Theorem. If Assumption 8.1.2.5 and Definitions 8.1.1.1, 8.1.1.6, and 8.1.1.8 hold, then

$a_c(y) = O'(T(a))(y) = \Theta(O) = \sum_{i=1}^{n} O(a(h_i(y))) \cdot f_i(h_i(y))$, $y \in Y$, (295)

within the error bound of Theorem 8.1.2.6, where h was defined in Section 8.1.1.6.

Proof. Let the givens hold, and recall the commutativity diagram of Figure 1. Given the construction of $a_c(y)$ shown in Definition 8.1.1.6, we obtain Equation 295 directly from the decomposition of $a_c$. Since this decomposition varies from the exact decomposition $\hat{a}_c$ by the amount shown in Equation 292, the error bound of Theorem 8.1.2.6 holds. The case of an analogue of a binary pointwise operation is symmetrically defined.

8.1.2.8. Observation. In Theorem 8.1.2.7, if O is a linear operation, then

$a_c(y) = O'(T(a))(y) = \Theta(O) = \sum_{i=1}^{n} O(a(h_i(y))) \cdot f_i(h_i(y)) = \sum_{i=1}^{n} O(a(h_i(y)) \cdot f_i(h_i(y)))$, $y \in Y$. (296)

This formulation may be slightly more efficient computationally on SIMD-parallel processor meshes that have linear I/O cost and a small operator set. For example, assume that O is implemented in terms of a LUT or instruction code that must be loaded prior to each invocation of O. It would be more efficient to interleave (via pipelining) the LUT loading procedure with computation of the product in Equation 296, then input the result to O, than to load the LUT, then compute $O(a(h_i(y)))$, then compute the product, as in the first form of the sum.

8.1.2.9. Complexity.
Computing $O'(a_c) = O(a_c) \approx O(a)$ requires only |Y| invocations of O, which implies a speedup of $\eta = |X|/|Y| = CR_d$ that exceeds unity for minified images. Computation of Equation 295 requires $n \cdot |Y|$ multiplications and invocations of O', together with $(n - 1) \cdot |Y|$ additions. Thus, $\eta = |X|/(n \cdot |Y|)$, which implies that computational efficiency is not realized if $n > CR_d$. Since T is insecure, O'(T(a)) is insecure. Thus, data security remains a moot issue. We next progress to a more detailed discussion of information loss.

8.1.2.10. Information loss. Since $a_c = T(a)$ may be a scaled, rotated, or shifted version of a that is erroneous due to interpolation errors, we can further exploit the fractional representation f of Equation 292 by expressing the error in a given interpolation fraction $f_i$ in terms of the error $\epsilon_{f_i} = \hat{f}_i - f_i$, where $\hat{f}_i$ denotes the exact value of $f_i$. The error in a unary pointwise operation O' formulated per Equation 295 is given by:

$\epsilon_{O'}(a_c(y)) = \sum_{i=1}^{n} [\ldots]$. (297)

This provides us with an implementational expression that can be analyzed as follows.

8.1.2.11. Bit error rate. Note that $\epsilon_{O'}$ predicted from Equation 297 is linear in f, which linearly comprises $a_c$ and, therefore, $\epsilon_{a_c}$. Thus, $\epsilon_{O'}$ correlates with $\epsilon_{a_c}$, and the likelihood $E(\epsilon_{O'}) = E(\epsilon_{a_c})$. As a result, O' does not compromise the bit-error rate of $a_c$. As discussed previously, this issue is separate from degradation of bit-error rate by computational error due to the approximation inherent in T's interpolation scheme. We next consider the simpler case of global reduce operations over range(T).

8.1.3. Global Reduce Operations

The derivation of analogous global reduce operations over the range space of an affine transform can be partitioned according to the additive or multiplicative form employed in error analysis.

8.1.3.1. Theorem.
If Definition 8.1.1.1 holds and $\epsilon_{a_c}$ is computed per Equation 292, then the CCS $\mathcal{X}_\Gamma = (\{F^X, F; \Gamma\}, \{T, I; \Theta\}, \{F^Y, F; \Gamma'\})$ has the following elements: (i) $T : F^X \to F^Y$ is a space-invariant affine transform; (ii) $I : F \to F$ is the identity transform; and (iii) a dual of $\Gamma$ is specified within the following error tolerance:

- If $\Gamma = \Sigma$, $\vee$, or $\wedge$, then $\Gamma' = \Theta(\Gamma) = \Gamma$ such that

  $|\Gamma'(a_c) - \Gamma(a)| < \Sigma\,\epsilon_{a_c}$. (298)

- If $\Gamma = \Pi$, then $\Gamma' = \Theta(\Gamma) = \Gamma$ such that

  $|\Gamma'(a_c) - \Gamma(a)| < \Gamma'(a_c) \cdot \Sigma(\epsilon_{a_c}/a_c)$. (299)

Proof. The proof follows directly from the givens, as well as the sum and product laws of error propagation. We derive a dual rather than an analogue, since the latter would require that T = I, which is a trivial case.

8.1.3.2. Observation. An alternative to Theorem 8.1.3.1 can be derived by setting $O' = \gamma$, the basis function of $\Gamma$, then composing O' (reference Equation 295) |X| - 1 times over $a_c$ to yield an approximation to $\Gamma(a)$. The resultant error will be given by Equation 299.

8.1.3.3. Complexity. When Theorem 8.1.3.1 characterizes a dual $\Gamma'$ of a global-reduce function $\Gamma$, the work required by $\Gamma'$ equals |Y| - 1 invocations of $\gamma$, the basis function of $\Gamma$. If the method of Observation 8.1.3.2 is employed in formulating $\gamma'$, then |Y| - 1 invocations of Equation 295 are required, which incurs $n \cdot (|Y| - 1)$ multiplications and $(n - 1) \cdot (|Y| - 1)$ additions. Information loss was discussed in Theorem 8.1.3.1. We next consider image-template operations over range(T).

8.1.4. Image-Template Operations.

We begin by assuming that the template is spatially invariant, for purposes of simplicity.

8.1.4.1. Assumption. Let Definition 8.1.1.1 hold, and let a source image $a \in F^X$ be convolved with a spatially invariant template $t \in (F^X)^X$ as $b = a \circledast t$. Let t be represented by its weight matrix $c = t_x$, $x \in X$, which can be transformed by T to yield $c_c \in F^Y$, the weight matrix of a template $s \in (F^Y)^Y$.

8.1.4.2. Theorem.
If Assumption 8.1.4.1 holds and $\epsilon_{a_c}$ is computed per Equation 292, then the CCS $\mathcal{X}_\circledast = (\{F^X, (F^X)^X; \circledast\}, \{T, U; \Theta\}, \{F^Y, (F^Y)^Y; \circledast'\})$ has the following salient elements: (i) $T : F^X \to F^Y$ is a space-invariant affine transform; (ii) $U : (F^X)^X \to (F^Y)^Y$ produces $s \in (F^Y)^Y$ given $t \in (F^X)^X$, as discussed in Assumption 8.1.4.1; and (iii) an analogue $\circledast' = \Theta(\circledast) = \circledast$ of $\circledast$ incurs computational error $\epsilon_{\circledast'}$ such that

$\epsilon_{\circledast'} = |(a_c \circledast s) - T(a \circledast t)| < 4\epsilon_{a_c}$. (300)

Proof. The proof is outlined as follows. Assuming that the givens hold, observe that $\circledast' = \circledast$ combines each pixel in $a_c$ with a pixel in s using the operation $\circ : F \times F \to F$. If $\circ, \gamma \in \{+, \vee, \wedge\}$, then the additive law of error propagation applies and $\epsilon_\circ \approx \epsilon_{a_c} + \epsilon_s \le 2\epsilon_{a_c}$. Here, we assume that $\epsilon_s$ derives from information loss inherent in T, since $a_c = T(a)$ and the template s has weight matrix $c_c = T(c)$, where c denotes the weight matrix of t. Thus, $\epsilon_{\circledast'} < 4 \cdot \epsilon_{a_c}$, which yields Equation 300. If $\circ' = \circ = \cdot$ combines the values $a_c(v)$ and $s_y(v)$, then the product law of error propagation applies and

$\epsilon_\circ < a_c(v) \cdot s_y(v) \cdot (|\ldots| + |\ldots|)$, $v \in S(s_y)$, $y \in Y$, (301)

where we assume $\epsilon_s \approx \epsilon_{a_c}$, as before. If we represent $a_c(v)$ and $s_y(v)$ as p and q, then we can bound $\epsilon_\circ < 2pq \cdot \epsilon_{a_c}$. Since we do not utilize the case where $\circ, \gamma = \cdot$, we have only to consider the remaining image-template products, where $\epsilon_\circ < 2\epsilon_{a_c}$ due to the additive law of error propagation. Thus, Equation 300 holds, and the theorem holds.

8.1.4.3. Observation. An alternative formulation of an analogue $\circledast'$ of $\circledast$ can be obtained by composing O', an analogue of $\circ : F \times F \to F$ over range(T), with $\gamma'$, the basis function of $\Gamma'$. For example, let $a, b \in F^X$ and let Theorem 8.1.2.7 hold for binary operations. Then, Equation 295 becomes

$O'(T(a)(y), T(b)(y)) = \Theta(O) = \sum_{i=1}^{n} O(a(h_i(y)), b(h_i(y))) \cdot f_i(h_i(y))$, $y \in Y$,

where h and f were defined previously. Given $a_c = T(a)$ and $s = U(t)$ (from Theorem 8.1.4.2), we can formulate $\circledast' : F^Y \times (F^Y)^Y \to F^Y$ as follows:

$(a_c \circledast' s)(y) = \Gamma'_{v \in S(s_y)}\; a_c(v) \circ' s_y(v)$, $y \in Y$. (303)

8.1.4.4. Remark.
The work required by Equation 303 is given by $\sum_{y \in Y} |S(s_y)|$ invocations of $\circ'$ and $\sum_{y \in Y} (|S(s_y)| - 1)$ invocations of $\gamma'$. However, each invocation of $\circ'$ requires n multiplications and invocations of $O : F \times F \to F$, as well as n - 1 additions, where n denotes the size of the interpolation neighborhood about a point in Y. Thus, the actual work is given by

$W_{\circledast'}(|Y|, n) = n \sum_{y \in Y} |S(s_y)| \cdot (W_{\cdot}(1) + W_{\circ}(1)) + n \cdot \sum_{y \in Y} (|S(s_y)| - 1) \cdot W_{+}(1)$. (304)

Note that the sequential time complexity $T_{\circledast'}(|Y|, n)$ is obtained by relabelling W in Equation 304 with T.

8.1.4.5. Information loss. From Chapter 5, we see that the information loss due to composition of functions $\gamma'$ and $\circ'$ is given by the composition of the error functions $\epsilon_{\gamma'}$ and $\epsilon_{\circ'}$, which were defined previously. In particular, for each combination of k results of $\circ'$ by the customary operations $\gamma' \in \{+, \vee, \wedge\}$, the absolute error is bounded above by $\epsilon_{\circledast'} < k \cdot \epsilon_{\circ'}$. We next consider the case of pixel selection processes that preserve only a subset of the source image in T's output.

8.2. Spatial Transformation by Pixel Selection

We begin with a general overview of the pixel selection technique, then progress to derivations of analogues or dual operations.

8.2.1. Basic Concepts.

Pixel selection is often employed in transform coding, where negligible transform coefficients are discarded, leaving a compressed transform representation.

8.2.1.1. Definition. Let $a \in F^X$ and let $S \subset F$ contain negligible values. Let $T : F^X \to F^Y$ transform a to yield

$a_c = \{(y, a(x)) : y = g(x, a(x)),\ a(x) \notin S,\ x \in X\}$, (305)

where g is an indexing function that belongs to a family of functions $G : X \times F \to Y$.

8.2.1.2. Example. Consider that T could select coefficients of a Fourier-transformed image to yield a list of non-negligible coefficients. Additionally, Equation 305 can be modified to recast T as a sparse-matrix reduction transform, by replacing the pixel value a(x) with the tuple (x, a(x)). Such development is given in Chapter 9.

8.2.1.3.
Observation. Another way to describe T is to assume that the range space of $a \in F^X$ is reconfigured by T acting as a value-dependent transposition based on $h : X \times F \to X$, not as a value substitution of form $F \to F$. We can thus express T as a parameterized spatial transform

$T(a) = a \circ h(a)$, (306)

where $h : F \times Y \to X$ accepts a value in a and a location in Y, and produces a source point in X. Given this simplification of T, we can derive operations over range(T) by relabelling the domain points of a, as follows.

8.2.2. Derivation of Analogues over range(T).

8.2.2.1. Theorem. If Observation 8.2.1.3 holds, $X, Y \subset \mathbb{R}$, and the CCS $\mathcal{X}_O = (\{F^X, O\}, \{T, \Theta\}, \{F^Y, O'\})$ has the following salient components: (i) O is a unary pointwise operation over $F^X$, (ii) $T : F^X \to F^Y$ is a pixel selection transform; and (iii) O' is a unary pointwise operation over $F^Y$; then O' is given by O' = O, which incurs no additional information loss.

Proof. The proof follows directly from the givens and the fact that values in a are mapped directly to values in $a_c$ without interpolation. Thus, the operation O' = O introduces no computational error beyond that produced by applying O to a given value of a. Analogues of binary operations over range(T) are symmetrically defined, if corresponding values in a and b are preserved in $a_c$ and $b_c$. Otherwise, when one operand of a binary pointwise operation is missing, it must be replaced with the identity element of that operation or with a value prespecified as characterizing the missing values.

8.2.2.2. Observation. Since values are eliminated from a, image sum and product have neither an analogue nor a dual over range(T), unless one considers T to have an $\epsilon$-near approximation to its inverse, denoted by $T^* : F^Y \to F^X$. For example, $T^*$ could approximately reconstruct a from T(a), given knowledge of the statistics of a and S, as well as the indexing function g.

8.2.2.3. Theorem.
If Observation 8.2.2.2 holds and the image maximum (minimum) is not in the set of image values discarded by T, then the global reduce operations $\Gamma$ of image maximum (minimum) are self-analogues over range(T).

Proof. For purposes of argument, assume that the image maximum (minimum), denoted by f, is in S. Then, f cannot be in range($a_c$) and the corresponding analogous operation of image maximum (minimum) applied to $a_c$ cannot satisfy the definition of homomorphism. Thus, the theorem holds. We next consider the case of the nonlinear image-template products, assuming that image maximum and minimum are preserved in range(T).

8.2.2.4. Assumption. Let $t \in (F^X)^X$ be a space-invariant template. Assume that $U : (F^X)^X \to (F^Y)^Y$ transforms t into s by applying T to the weight matrix of t to yield the weight matrix of s.

8.2.2.5. Theorem. If Assumption 8.2.2.4 holds, $\gamma \in \{\vee, \wedge\}$, and the CCS $\mathcal{X}_\circledast = (\{F^X, (F^X)^X; \circledast\}, \{T, U; \Theta\}, \{F^Y, (F^Y)^Y; \circledast'\})$ has the following elements: (i) $T : F^X \to F^Y$ is a pixel selection transform; (ii) $U : (F^X)^X \to (F^Y)^Y$ produces $s \in (F^Y)^Y$ from $t \in (F^X)^X$, per Assumption 8.2.2.4; then an analogue of $\circledast$ over range(T) is given by $\circledast' = \circledast$.

Proof. Assuming the givens hold, $a_c$ and s are spatial transforms of a and t, with certain values removed. Recall that Theorems 8.2.2.1 and 8.2.2.3 respectively state that $\circ : F \times F \to F$ and $\Gamma \in \{\vee, \wedge\}$ are self-analogues over range(T). Within the limits of information loss incurred by $\circ'$ operating over reduced values in a, we have $\circledast' = \circledast$, and the theorem holds.

8.2.2.6. Example. Let a two-dimensional source image $a \in [0, 1]^X$ (e.g., the bar chart of Figure 13a) be rotated by an affine transform T 30 degrees counterclockwise about the image domain's geometric center to form $a_c$, shown in Figure 13c. Let template $t \in (\mathbb{R}^X)^X$ (shown in Figure 13b) be rotated similarly to form template $s \in (\mathbb{R}^X)^X$, per Figure 13d. The locally averaged image $a \oplus t$ is shown in Figure 13e, with $T^*(a_c \oplus s)$ shown in Figure 13f. The error image that depicts blurring with the unrotated template t is shown in Figure 14a and is given by

$e = (T^*(a_c \oplus t))|_X - (a \oplus t)$, (307)

whereas blurring of $a_c$ with the rotated template s generates less error, as shown in Figure 14b, which depicts

$f = (T^*(a_c \oplus s))|_X - (a \oplus t)$. (308)

The histogram of error intensities that corresponds to e (or f) is given in Figure 14c (d).

Template weights of Figure 13b (template t):

    0 1 0
    1 2 1
    0 1 0

Template weights of Figure 13d (rotated template s):

    0.261 0.802 0.261
    0.802 1.662 0.802
    0.261 0.802 0.261

Figure 13. Local averaging of a rotated Boolean bar chart using a rotated template: (a-b) source image a and template t; (c-d) rotated image $a_c = T(a)$ and template s formed by applying T to the weight matrix of t, then restricting to the 3 x 3-pixel Moore configuration; (e) rotated locally-averaged image $a_c \oplus s$; (f) derotated image $T^{-1}(a_c \oplus s)$.

[Figure 14d panel: "Error Histogram: Image Rotation, Rotated Template"; horizontal axis from 0.200 to 0.900; legible statistics: median 0.468, mean 0.45, skew -0.0030, mode 0.4651, standard deviation 0.0508, kurtosis -1.4993.]

Figure 14. Error images of local averaging using rotated and unrotated templates: (a-b) error images e (Equation 307) and f (Equation 308), taken from the central (uncropped) portion of domain(a); (c-d) histograms of error images e and f.

8.2.2.7. Remark. The preceding example demonstrates the utility of transforming the template as well as the image, when image-template operations are employed over the range space of an image transform. Note that the rotated template not only produces reduced error, but the errors are more closely clustered near the modes. When the derotated images (e.g., Figure 13f) are normalized to the interval [0,1], the mean error about one standard deviation in Figure 14c (d) is given by 0.516 ± 0.176 (0.45 ± 0.05).
Thus, using a rotated template reduces error more than threefold, from a standard deviation of 0.176 to 0.051.

8.3. Transpositional Ciphers.

The transpositional ciphers are a special case of the generalized Class 3 transform analyzed in Section 3.4. Since the transposition is a bijection f on the source domain X, one need only apply f to the domain of an image operation O to produce the corresponding analogue over range(T). Salient theory is given in Theorem 3.4.2.11.

8.3.1. Properties of Transpositional Ciphers.

We begin with a few observations about transpositions and oracles.

8.3.1.1. Observation. In Section 4.4.4 we discussed oracles, and in Theorem 4.4.4.20 we showed that Class 3 transforms preserve an $\epsilon$-near approximation to the image histogram and sum. We also showed that if $\epsilon = 0$, then the image histogram and sum are preserved. Note that $\epsilon = 0$ for transpositional ciphers, which are usually considered as bijections over the source image domain.

8.3.1.2. Definition. A transpositional cipher $T : F^X \to F^X$ is based upon a spatial bijection $f : X \to X$, such that $a_c = a \circ f$, which was defined previously.

8.3.2. Cryptanalysis of Transpositional Ciphers.

We begin with a brief discussion of attack in the presence of the cipher, then progress to ciphertext-only attack.

8.3.2.1. Observation. If Definition 8.3.1.2 holds and the transpositional cipher T is available (with absence of the key understood), then plaintext attack is achieved as follows:

Step 1. Let $i = \{(x, x) : x \in X\}$.
Step 2. Since values in i are unique, the graph $G(T) = \{(i, T(i))\}$.

8.3.2.2. Remark. Recall the probabilistic oracle given in Definition 4.4.4.1, which yields a measure of probability that a given result of applying the oracle discloses a property of a specified cipher's input (plaintext). Additionally, Definition 4.4.4.1 discussed the assessment of error associated with such a result.
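The two-step plaintext attack of Observation 8.3.2.1 can be sketched in a few lines. The code below treats the cipher as a black box over a one-dimensional domain, feeds it the identity image i(x) = x, and reads the bijection off the output. The function names, the seed, and the toy message are hypothetical; a one-dimensional domain is assumed only to keep the example short.

```python
import random

def make_transposition(n, seed=0):
    """A hypothetical secret bijection f on the domain {0, ..., n-1}."""
    perm = list(range(n))
    random.Random(seed).shuffle(perm)
    return perm

def encipher(image, perm):
    """a_c = a o f: the output at location y is the input value at f(y)."""
    return [image[perm[y]] for y in range(len(image))]

def recover_graph(cipher, n):
    """Step 1: build the identity image. Step 2: its ciphertext lists f."""
    identity = list(range(n))
    return cipher(identity)

n = 12
secret = make_transposition(n, seed=7)
black_box = lambda img: encipher(img, secret)   # the key itself stays hidden

recovered = recover_graph(black_box, n)

# Invert the recovered bijection and use it to decrypt a message.
inverse = [0] * n
for y, x in enumerate(recovered):
    inverse[x] = y
message = list("HELLO WORLD!")
ciphertext = black_box(message)
plaintext = [ciphertext[inverse[x]] for x in range(n)]

print(recovered == secret)
print("".join(plaintext))
print(sorted(ciphertext) == sorted(message))    # histogram is preserved
```

The last check illustrates Observation 8.3.1.1: a transposition only rearranges values, so the ciphertext and plaintext histograms coincide.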
Note that the preceding algorithm constitutes an oracle for plaintext attack on transpositional ciphers that yields the graph of the transposition with unitary probability and zero error. The property of the cipher that is preserved is the statement $G(T) = \{(i, T(i))\}$.

8.3.2.3. Observation. If Definition 8.3.1.2 holds, then the cardinality of the search space for enumerative ciphertext-only attack on a transpositional cipher is |X|!. In principle, this is computationally prohibitive. In practice, the transposition is easily compromised by the method of histogram sorting and correlation, as follows.

8.3.2.4. Technique. Let Definition 8.3.1.2 hold and let a known plaintext corpus over the alphabet F have a histogram of i-grams $h_i$, where i > 1. A transpositional cipher T is vulnerable to attack given ciphertext $a_c$ only, via the following algorithm:

Step 1. Find the plaintext and ciphertext digram and trigram

$(f, g) = domain(\vee h_2)$ and $(f, g, h) = domain(\vee h_3|_{\{(f, g, h) : h \in F\}})$, (309)

where $f, g \in F$.
Step 2. Locate all occurrences of f, g, and h in $a_c$ and collect their domain points in X in the sets F, G, and H, respectively.
Step 3. Construct the domain tuples $A(h) = \{(x, y) : x \in F, y \in G\}$ of all possible candidate pairs (f, g).
Step 4. For each a in A(h), collect in B(h) all possible triples (x, y, z), where z is a point in H.
Step 5. Repeat Steps 1-4 for other trigrams beginning with letters other than h.
Step 6. A set of candidate digrams is found by the following intersection:

$C = \bigcap_{h \in F} \bigcap_{e \in F} A(h) \cap \{(p_1(b), p_2(b))\}$, $b \in B(e)$. (310)

By minimizing |C| through careful selection of trigrams in Step 5, one can force convergence of the algorithm.

8.3.2.5. Complexity. Given m digrams having a frequency of occurrence that is deemed significant, and n significant trigrams, the preceding algorithm requires $O(mn|F|^2)$ intersection operations, which is burdensome for large F.
Such comparisons are usually implemented as integer subtraction and comparison with zero, performed on the ciphertext coordinates in F, G, and H. If the plaintext is large, the overhead can be prohibitive. Thus, the following algorithm can be employed for purposes of efficiency.

8.3.2.6. Algorithm. In practice, the preceding technique can be implemented more efficiently using positional differences, as follows.

Step 1. Find the most frequent digram (f, g) and compute positional differences between all f's and g's in $a_c$.

Example. Let AN be the most frequent plaintext digram. Suppose a fragment of $a_c$ is indexed as follows:

    Ciphertext values:   . . .sdljyAdlNthdoAjMkNusdsAeiNt. . .
    Positions (units):   123456789012345678901234567890
    Positions (decade):  1 at position 10, 2 at position 20, 3 at position 30

Step 2. Extend candidate digrams: Given all instances of AN, we find the most frequent trigrams that begin with AN, and see if there are any consistent occurrences.

Example. For purposes of illustration, let AND be the most frequent plaintext trigram, followed by ANY and ANT. Continuing with our previous example, we have for the word AND:

    Ciphertext values:   . . .sdljyAdlNthdoAjMkNusdsAeiNt. . .
    Markers:             # *|#|*#| | #| |*
    Positions (units):   123456789012345678901234567890
    Positions (decade):  1 at position 10, 2 at position 20, 3 at position 30

where the positions of symbols T and Y are denoted by an asterisk and D's positions are denoted by (#). We then construct the positional differences:

    Positional differences (A-N, N-x)
    Trigram   1      2      3       4      5       6
    AND       3,-7   3,-2   3,3     3,13   5,-7    5,-17, ...
    ANY       3,-4   3,11   5,-14   5,1    3,-7
    ANT       3,1    3,19   5,-9    5,9    3,-17   3,1

where x denotes D, Y, or T. The only trigram that repeats is ANT, with a difference of (3,1) that occurs twice.

Step 3. Compare candidate trigrams: Examine the trigram data to locate coincidences of the first two letters in any two trigrams (e.g., the pairs (f, g) in our previous example). This increases the probability of finding a correct transpositional map for digrams that form the first two letters of the candidate trigram. This comparison
This comparison PAGE 267 can also establish a set of candidate maps for the third letter(s) of trigrams that have more than one occurrence of a given set of positional differences. Repeat Steps 1-3 for different trigrams until a more complete candidate transposition emerges. Example. Continuing with the preceding example, we see from the table in the preceding step that only one trigram (ANT) repeats is positional difference (3,1). Thus, we can assume that A **NT is a reasonable first estimate of ANT in ciphertext, where (*) denotes ciphertext letters. We also note that in ANY and AND there is a preponderance of difference configurations of form (3,x). Thus, there is corroborative evidence that the transposition T separates adjacent letters A and A'' in plaintext via two intervening characters in ciphertext. For example, if T(ANT) has positional differences (3,1), then NT has a unitary inter-symbol distance. Step 4. Trial inversion: Given a candidate transposition, which will likely be a partial function over the message domain, invert the transposition and apply it to the ciphertext. This is possible, since Tis a bijection. Look for common, recognizable words in the ciphertext, and attempt to fill in the blank symbols based on known plaintext statistics. Iterate Steps 1-4 until convergence is achieved. 8.3.2.7. Remark. The preceding algorithm constitutes an oracle that can determine if a given word (e.g., ANT) is in a given ciphertext. At convergence, the probability of finding a prespecified word is either one, or indeterminate. That is, without complete decryption, an oracle cannot accurately determine the probability with which a given word w would be found in ciphertext. As shown in Sinkov [27] and Patterson [30], one can employ Bayesian techniques to estimate such probabilities from n-gram distributions and knowledge of plaintext statistics. If the preceding algorithm does not converge, then no useful information is provided by the oracle. 
When the algorithm fails to converge, the oracle's error measure is maximized and the probability is indeterminate.

We next consider the Class 4 transforms.

CHAPTER 9
CLASS 4 TRANSFORMATIONS

This chapter contains derivations of analogues to the image algebra pointwise, global reduce, and image-template operations over the range spaces of several commonly-used Class 4 transforms. We begin with block encoding (Section 9.1), then progress to the processing of reduced sparse matrices (Section 9.2). Transform coding (Section 9.3), which employs matrix reduction, is instantiated in terms of JPEG compression (Section 9.4). Block truncation coding and VPIC are grouped together in Section 9.5, due to the similarity of their approaches. Having introduced the reader to codebook-based compression with VPIC, we then proceed to vector quantization (Section 9.6).

9.1. Block Encoding.

We discuss generalized block encoding as an introduction to the Class 4 transforms, which are primarily block-structured.

9.1.1. Basic Concepts.

Various transformations accept encoding blocks and index the blocks according to block, subblock, or pixel properties. The indexing scheme is generally couched in terms of a codebook, which is a list of exemplars. Reconstruction (decompression) primarily emphasizes substitution of exemplars and, occasionally, normalization. Alternatively, a block can be decomposed into mean and variance measures associated with a zero-crossing bitmap. Decompression involves reconstructing the block using the mean perturbed by the bitmap, which is scaled by a measure of block variance.

9.1.1.1. Definition. Let X be an M x N-pixel array and let the source image $a \in F^X$ be tessellated by an indexing function $h : X \to Y$, which maps a point $x \in X$ in the y-th encoding block to y. Additionally, let $h^* : Y \to 2^X$ accept a point y in the compressed domain and return the corresponding encoding block domain.
Further let $f : 2^X \to X$ return a reference point in the encoding block (e.g., the block centroid or upper left-hand corner).

9.1.1.2. Definition. Given Definition 9.1.1.1, an encoding block b(y), $y \in Y$, is defined as

$b(y) = a|_{h^*(y)}$. (311)

9.1.1.3. Algorithm. If Definitions 9.1.1.1 and 9.1.1.2 hold, then for each source image block b(y), $y \in Y$, a codebook-based block encoding transform T is computed as follows:

Step 1. Determine the block properties $Q = \{q_1, q_2, \ldots, q_n\}$.
Step 2. From the block properties determine a block index i that corresponds to a best-match exemplar pattern in a codebook c (a list of block exemplars), when the match satisfies a prespecified error criterion $\epsilon$.
Step 3. Depict the source block b(y) in the compressed image as

$a_c(y) = (\Gamma_{\lambda \in \Lambda}\, q_\lambda(y),\ i(y))$, (312)

where $\Lambda$ indexes Q and $\Gamma$ denotes a tuple constructor.

Decompression involves reconstruction of each encoding block as

$b = f(g(p_1(a_c)), c(p_2(a_c)))$, (313)

where the second argument of f is a codebook exemplar. In cases where a codebook is not employed, $p_2(a_c)$ is null.

9.1.1.4. Remark. An additional block encoding transform, called transform coding, applies an operation such as the DCT to a block, then selects significant coefficients for further compression based on prespecified criteria. Since only a partial block transform is retained, the decompressed image only approximates the source image.

9.1.1.5. Observation. In order for T to be compressive, the set Q and the bounds on i must be chosen such that

$|domain(a_c)| \cdot size(range(a_c)) < |domain(a)| \cdot size(range(a))$. (314)

This implies a priori knowledge of image properties and optimal or near-optimal choice of Q for efficient compression. However, the properties that yield minimum entropy in the compressed image do not necessarily facilitate operations over range(T). The derivation of pointwise, global reduce, and image-template operations over block-structured compression transforms was discussed in Section 3.4.
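A minimal sketch of the codebook-based scheme in Algorithm 9.1.1.3 follows. It assumes one block property (the block mean), a tiny hand-made codebook of mean-removed 2 x 2 patterns, and squared error as the match criterion; all of these, together with the function names, are illustrative choices rather than the specific transforms treated later in this chapter. The error threshold of Step 2 is omitted for brevity.

```python
def blocks(image, k):
    """Tessellate a 2-D list into k x k blocks keyed by block index y."""
    rows, cols = len(image), len(image[0])   # assumed divisible by k
    out = {}
    for r in range(0, rows, k):
        for c in range(0, cols, k):
            out[(r // k, c // k)] = [image[r + i][c + j]
                                     for i in range(k) for j in range(k)]
    return out

def encode_block(block, codebook):
    """Steps 1-3: property (mean) plus index of the best-match exemplar."""
    mean = sum(block) / len(block)
    residual = [v - mean for v in block]
    def sq_err(exemplar):
        return sum((r - e) ** 2 for r, e in zip(residual, exemplar))
    index = min(range(len(codebook)), key=lambda i: sq_err(codebook[i]))
    return (mean, index)

def decode_block(code, codebook):
    """Reconstruct a block as the mean perturbed by the chosen exemplar."""
    mean, index = code
    return [mean + e for e in codebook[index]]

# Hypothetical codebook of mean-removed 2 x 2 patterns (row-major order).
codebook = [[0, 0, 0, 0],      # flat
            [-1, -1, 1, 1],    # horizontal edge
            [-1, 1, -1, 1]]    # vertical edge

image = [[1, 1, 0, 2],
         [1, 1, 0, 2],
         [0, 0, 5, 5],
         [2, 2, 5, 5]]

source = blocks(image, 2)
compressed = {y: encode_block(b, codebook) for y, b in source.items()}
restored = {y: decode_block(code, codebook) for y, code in compressed.items()}
print(compressed)
print(restored == source)
```

Each four-pixel block is represented by two numbers here, which is the kind of reduction that Equation 314 requires; with a codebook this small, reconstruction is exact only because the toy image was built from the exemplars.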
We next consider the processing of reduced sparse matrices.

9.2. Sparse Matrix Processing.

Sparse matrix reduction is fundamental to the JPEG transform and to the majority of transform coding techniques. Since matrix reduction generally preserves the relationship between selected image pixel values and their domain points, a given image operation is (at a high level) generally a self-analogue over a given reduced-matrix data structure. However, the image operation must be modified slightly (at a low level) to accept the reduced-matrix format.

9.2.1. Basic Concepts.

We begin by defining the sparse matrix reduction transform $T_{SM}$, then form the basis for derivation of operations over range($T_{SM}$).

9.2.1.1. Definition. Given a source image $a \in F^X$ and a set of negligible values $S \subset F$, let a be transformed by the sparse matrix reduction transform $T_{SM} : F^X \to (X \times F)^Y$ to yield a compressed image:

$a_c = \{(y, (x, a(x))) : y = h(x, a(x)) \text{ and } a(x) \notin S,\ x \in X\}$, (315)

where h is derived from the family of indexing functions $H : X \times F \to Y$. We call $a_c$ a reduced-matrix representation of a.

9.2.1.2. Complexity. The computation of $T_{SM}$ clearly requires O(|X|) comparison operations to determine if a(x) is in S, together with O(|Y|) I/O operations to enlist the non-negligible values in the graph representation of a (similar to Equation 315) together with their domain points.

9.2.1.3. Information loss. Since values in a are discarded, $T_{SM}$ is a lossy transformation. In particular, if $h \in \mathbb{N}^F$ denotes the histogram of a, then the error with which $a_c$ represents a is computed as follows. First, let $\hat{a} \in (F \setminus S)^X$ denote an image reconstructed from $a_c$. Letting the error image $e = \hat{a} - a$, we can compute the mean error per pixel $\bar{\epsilon}_{SM} = \Sigma e / |X|$ from h as follows:

[...] (316)

since it is precisely the values in S that are discarded.

9.2.2. Derivation of Operations over range($T_{SM}$).

9.2.2.1. Theorem.
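If $\mathcal{T}_{SM}$ is given by Definition 9.2.1.1, then unary image operations over $domain(\mathcal{T}_{SM})$ are self-analogues over $range(\mathcal{T}_{SM})$, with slight refinement to accommodate data structures in $range(\mathcal{T}_{SM})$.

Proof. Let the givens hold and let $\mathbf{a} \in \mathbf{F}^{\mathbf{X}}$ with $\mathcal{O} : \mathbf{F}^{\mathbf{X}} \to \mathbf{F}^{\mathbf{X}}$. If $\mathcal{T}_{SM} : \mathbf{F}^{\mathbf{X}} \to (\mathbf{X} \times \mathbf{F})^{\mathbf{Y}}$, then let the refinement

$$\mathcal{O}'(p_2[\mathcal{T}_{SM}(\mathbf{x}, \mathbf{a}(\mathbf{x}))]) = \mathcal{O}(\mathbf{a}(\mathbf{x})) \qquad (317)$$

define an analogue of $\mathcal{O}$ over $range(\mathcal{T}_{SM})$. Thus, $\mathcal{O}$ fulfills the definition of self-analogue in the context of the preceding refinement.

9.2.2.2. Theorem. If $\mathcal{T}_{SM}$ is given by Definition 9.2.1.1, then binary image operations over $domain(\mathcal{T}_{SM})$ are self-analogues over $range(\mathcal{T}_{SM})$, provided that one of the following conditions holds:

(i) Let an image operation $\mathcal{O} : \mathbf{F}^{\mathbf{X}} \times \mathbf{F}^{\mathbf{X}} \to \mathbf{F}^{\mathbf{X}}$ that has scalar arguments denoted by $f, g \in \mathbf{F}$ be applied to images $\mathbf{a}$ and $\mathbf{b}$. Assume that $(\mathbf{x}, f)$, $\mathbf{x} \in \mathbf{X}$, is the pixel value of $\mathbf{a}_c(\mathbf{y})$, where $\mathbf{y} \in \mathbf{Y}$ is a domain point of $\mathbf{a}_c = \mathcal{T}(\mathbf{a})$. Symmetrically, let $(\mathbf{x}, g) = \mathbf{b}_c(\mathbf{z})$, where $\mathbf{z} \in \mathbf{Y}$. Then,

$$\mathcal{O}'(\mathbf{a}_c(\mathbf{y}), \mathbf{b}_c(\mathbf{z})) = \mathcal{O}(p_2(\mathbf{a}_c(\mathbf{y})), p_2(\mathbf{b}_c(\mathbf{z}))) = \mathcal{O}(f, g). \qquad (318)$$

(ii) If $(\mathbf{x}, f) \in range(\mathbf{a}_c)$ has no corresponding $(\mathbf{x}, g)$ in $range(\mathbf{b}_c)$, then

$$\mathcal{O}'(\mathbf{a}_c(\mathbf{y}), \mathbf{b}_c(\mathbf{z})) = \mathcal{O}(p_2(\mathbf{a}_c(\mathbf{y})), e) = \mathcal{O}(f, e) = f, \qquad (319)$$

where $e$ denotes the identity of $\mathcal{O}$. Otherwise, $\mathcal{O}'(\mathbf{a}_c(\mathbf{y}), \mathbf{b}_c(\mathbf{z}))$ is undefined.

Proof. If one considers Equations 318 and 319 to be refinements of $\mathcal{O}$, then the proof is symmetric to that of the preceding theorem.

9.2.2.3. Complexity. If $\mathbf{Y}$ is indexed by $h$ per Definition 9.2.1.1, then $h$ has a dual formulation $h^* : \mathbf{Y} \to \mathbf{X}$, since $h$ is one-to-one and onto. The latter map can be implemented in a lookup table for constant-time execution or in a mintree, which requires a minimum of $\lceil \log(|\mathbf{Y}|) \rceil$ comparisons. Unfortunately, the use of $h^*$ in searching for a source point in $\mathbf{X}$ generally requires search over $\mathbf{Y}$. If Case (i) of Theorem 9.2.2.2 pertains, then searching for corresponding source points could require $O(|\mathbf{Y}|^2)$ overhead in the worst case.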
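The reduction of Definition 9.2.1.1 and the refinements of Theorems 9.2.2.1 and 9.2.2.2 can be sketched as follows. This is only an illustration under stated assumptions: an image is held as a Python dictionary from domain points to values, the index set is simply the position in a list, and the function names are hypothetical. The nested loop in `binary_op` is the naive pairing search that leads to the quadratic worst case noted above, and only the direction covered by Equation 319 (a value of the first operand with no partner in the second) is handled.

```python
def reduce_sparse(image, negligible):
    """Keep (point, value) pairs whose value is not negligible."""
    return [(x, v) for x, v in sorted(image.items()) if v not in negligible]


def unary_op(op, reduced):
    """Apply a unary operation to the value coordinate only (Eq. 317)."""
    return [(x, op(v)) for x, v in reduced]


def binary_op(op, a_c, b_c, identity=None):
    """Combine two reduced images by searching for matching source points."""
    out = []
    for x, f in a_c:
        found = False
        for z, g in b_c:                 # linear search over the index set
            if z == x:
                out.append((x, op(f, g)))   # case (i), Eq. 318
                found = True
                break
        if not found and identity is not None:
            out.append((x, op(f, identity)))  # case (ii), Eq. 319
    return out


a = {(0, 0): 0, (0, 1): 3, (1, 0): 0, (1, 1): 5}
b = {(0, 0): 2, (0, 1): 4, (1, 0): 0, (1, 1): 0}
a_c = reduce_sparse(a, negligible={0})
b_c = reduce_sparse(b, negligible={0})
total = binary_op(lambda f, g: f + g, a_c, b_c, identity=0)
```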
Alternatively, $h^*$ can be inverted to yield $\hat{h} : \mathbf{X} \to \mathbf{Y} \cup \{\perp\}$, which returns the location $\mathbf{y} \in \mathbf{Y}$ if a source point is preserved in a given reduced-matrix representation, and the undefined symbol $\perp$ if it is not. The complexity of $\hat{h}$ is $O(|\mathbf{X}|)$ (or $O(|\mathbf{Y}|)$ if only those source points in the reduced matrix are indexed). Symmetrically, the minimum search overhead for a tree-based implementation of $\hat{h}$ would be $O(\log(|\mathbf{X}|))$ or $O(\log(|\mathbf{Y}|))$.

9.2.2.4. Information loss. We consider only Theorem 9.2.2.2(i), since (ii) has unpredictable error due to the replacement of $\mathcal{O}(f, g)$ with $f$ when $g$ is undefined or unavailable in the reduced matrix. In (i), $\mathcal{O}'$ is computed only when the corresponding operation $\mathcal{O}$ has operands $f$ and $g$ present. Given a set of negligible values $S$, if either $f \in S$ or $g \in S$, then $\mathcal{O}(f, g)$ is not computed. Let $\mathbf{a}$ and $\mathbf{b}$ be two source images transformed by $\mathcal{T}_{SM}$ to yield $\mathbf{a}_c$ and $\mathbf{b}_c$, where $\mathbf{h}$ denotes the histogram

$$\mathbf{h}(f, g) = \Sigma\, \chi_f(\mathbf{a}) \cdot \chi_g(\mathbf{b}), \quad f, g \in range(\mathbf{a}) \cup range(\mathbf{b}). \qquad (320)$$

Per Equation 316, the information loss per pixel in $\mathcal{O}'(\mathbf{a}_c, \mathbf{b}_c)$ due to value elimination in $\mathcal{T}_{SM}$ is given by

$$\varepsilon = \frac{1}{|\mathbf{X}|} \cdot \sum_{f \in S} \sum_{g \in S} \mathbf{h}(f, g) \cdot \mathcal{O}(f, g). \qquad (321)$$

9.2.2.5. Observation. Assume that $\mathcal{O}' = \mathcal{O}$ at a high level (i.e., at coarse granularity) such that a slight refinement in $\mathcal{O}$'s data access process(es) is required to form $\mathcal{O}'$. Since a binary operation is employed in global-reduce and image-template operations, Theorem 9.2.2.2 holds in such cases. As a result, if corresponding source points are found in two reduced matrices (images), then $\mathcal{O}$ can be applied to yield $\mathcal{O}'$. Otherwise, $e$ (the identity of $\mathcal{O}$) can be introduced to let $\mathcal{O}(f, e) = f$ (per Equation 319), or we assume that $\mathcal{O}$ is undefined. We next consider transform coding, an application of $\mathcal{T}_{SM}$.

9.3. Transform Coding.

Transform coding applies an image transformation to a source image (usually tessellated blockwise), followed by sparse matrix reduction.
For purposes of discussion, transform coefficients are treated in principle as a sparse matrix and are selected in practice based on various criteria (entropy, coefficient magnitude, preselected feature attributes, etc.). In the preceding section, we showed that sparse matrices facilitate self-analogues of image operations with slight refinement to accommodate data structures. Thus, derivation of an analogue of an image operation $\mathcal{O}$ over transform-coded imagery is relatively straightforward, provided that the $\mathcal{O}$-property of the given transform is known.

9.3.1. Basic Concepts.

We begin with a general definition of transform coding that is simplified for purposes of clarity.

9.3.1.1. Definition. Let $\mathbf{X}$ be a finite discrete domain and let $\mathbf{a} \in \mathbf{F}^{\mathbf{X}}$ be transformed by $R : \mathbf{F}^{\mathbf{X}} \to \mathbf{H}^{\mathbf{W}}$, where $range(R) = domain(R)$ is possible. Transform coding, denoted by $\mathcal{T}_{TC} : \mathbf{F}^{\mathbf{X}} \to (\mathbf{W} \times \mathbf{H})^{\mathbf{Y}}$, is given by:

$$\mathcal{T}_{TC}(\mathbf{a}) = \mathcal{T}_{SM}(R(\mathbf{a}), S), \qquad (322)$$

where $S$ denotes one or more criteria (e.g., a threshold) that define negligible values in the sparse matrix reduction transform $\mathcal{T}_{SM}$.

9.3.1.2. Complexity. The work required by $\mathcal{T}_{TC}$ is given by

$$W_{TC}(\mathbf{a}) = W_R(\mathbf{a}) + W_{SM}(\mathbf{H}, \mathbf{W}), \qquad (323)$$

where $W_{SM}$ was discussed previously, and we assume that the work requirement of image transformation ($W_R$) is known a priori.

9.3.1.3. Information loss. $\mathcal{T}_{TC}$ incurs information loss that can be expressed in terms of the error function

$$\varepsilon_{TC}(\mathbf{a}) = \varepsilon_{SM}[\varepsilon_R(\varepsilon_{\mathbf{a}})], \qquad (324)$$

where $\varepsilon_{SM}$ was given previously, and we assume that $\varepsilon_R$ is known a priori.

9.3.2. Deriving Operations over $range(\mathcal{T}_{TC})$.

Derivation of operations over $range(\mathcal{T}_{TC})$ can be straightforward, requiring only a knowledge of properties of the image transformation $R$, in addition to previously developed theory.

9.3.2.1. Theorem. Let Definition 9.3.1.1 hold and let $\Omega$ denote an analogue over $range(R)$ of the image operation $\mathcal{O} : \mathbf{F}^{\mathbf{X}} \times \mathbf{F}^{\mathbf{X}} \to \mathbf{F}^{\mathbf{X}}$. If $\Omega'$ denotes an analogue of $\Omega$ over $range(\mathcal{T}_{SM})$, then $\mathcal{O}'$, an analogue of $\mathcal{O}$ over $range(\mathcal{T}_{TC})$, is defined in terms of images $\mathbf{a}, \mathbf{b} \in \mathbf{F}^{\mathbf{X}}$ as

$$\mathcal{O}'(\mathcal{T}_{TC}(\mathbf{a}), \mathcal{T}_{TC}(\mathbf{b})) = \mathcal{T}_{TC}(\mathcal{O}(\mathbf{a}, \mathbf{b})) = \Omega'(R(\mathbf{a}), R(\mathbf{b})). \qquad (325)$$

Proof. If the givens hold, then by definition of an analogue, $\Omega(R(\mathbf{a}), R(\mathbf{b})) \in \mathbf{H}^{\mathbf{W}}$. Similarly, $\Omega'(R(\mathbf{a}), R(\mathbf{b})) \in (\mathbf{W} \times \mathbf{H})^{\mathbf{Y}} = range(\mathcal{T}_{TC})$, and the theorem holds.

9.3.2.2. Remark. Since it is couched in terms of a binary operation, the preceding theorem can be thought of as generalizing unary and binary pointwise, global reduce, and image-template operations over $range(\mathcal{T}_{TC})$.

9.3.2.3. Information loss. Assume that $\Omega'$ is an analogue over $range(\mathcal{T}_{SM})$ of the reduced-matrix operation $\Omega$ that incurs error $\varepsilon_\Omega$. Let an operation over $range(\mathcal{T}_{SM})$ be a self-analogue, with slight refinement to accommodate the reduced-matrix data structure, and let $\mathbf{a}, \mathbf{b} \in \mathbf{F}^{\mathbf{X}}$ with $\mathbf{a}_c = \mathcal{T}_{TC}(\mathbf{a})$ and symmetrically for $\mathbf{b}$. Then, $\mathcal{O}'$ incurs information loss that can be expressed in terms of the error function

$$\varepsilon_{\mathcal{O}'}(\mathbf{a}_c, \mathbf{b}_c) = \varepsilon_{SM}(\varepsilon_\Omega[\varepsilon_R(\varepsilon_{\mathbf{a}}), \varepsilon_R(\varepsilon_{\mathbf{b}})]), \qquad (326)$$

where we assume that $\varepsilon_\Omega$ is known a priori, and $\varepsilon_{SM}$ was given previously. This result follows directly from the composition of functions implied in Theorem 9.3.2.1.

9.3.2.4. Remark. The information loss incurred in processing transform-coded imagery can be problematic when cascading compressive operations. However, with error reporting and control incorporated into a given algorithm, one could predict which transform coefficients are more likely to be perturbed. For example, if one simulates a low-pass filter over Fourier transform-coded imagery, the high-frequency coefficients would be rejected, but the mid-range portion of the spectrum would be moderately attenuated. Computational error is associated with this attenuation, particularly as one approaches the high frequencies, where coefficient magnitudes are small and SNR is more noise- or error-sensitive.
Such computational errors can induce significant error in the inverse Fourier transform, yielding mid- to high-frequency artifacts in the decompressed image. An instance of transform coding that is currently in vogue is JPEG compression, which is discussed in general form as follows.

9.4. JPEG.

JPEG is an image compression transform that exploits spatial properties of source imagery to yield moderate compression ratios (typically, 15:1 to 30:1 in images of natural scenes). JPEG has many variants, each of which generally has proprietary parameters or processes. Thus, for purposes of simplicity, we present JPEG in terms of a general formulation, which is a composition of three well-known transforms: (1) the Discrete Cosine Transform, (2) the sparse-matrix reduction transform, and (3) the Huffman encoding transform. The former two transforms are lossy, whereas Huffman transformation is lossless.

9.4.1. Background.

Due to the requirements of a finite-state automaton (FSA) and a training-set-specific codebook for decoding Huffman-encoded imagery, we prefer to discuss JPEG in terms of its Huffman-decoded form. We thus avoid issues of Huffman codebook construction that have been discussed elsewhere. We begin by deriving operations that process over the sparse-matrix reduction of discrete cosine transform coefficients.

9.4.1.1. Definition. The discrete cosine transform expresses an image $\mathbf{a} \in \mathbf{R}^{\mathbf{Z}_n}$ as a weighted sum of cosine functions that is given by

$$[\mathcal{T}_{DCT}(\mathbf{a})](u) = \frac{2 \cdot c(u)}{n} \sum_{x=0}^{n-1} \mathbf{a}(x) \cdot \cos\left(\frac{(2x+1)u\pi}{2n}\right), \quad u = 0, 1, \ldots, n-1, \qquad (327)$$

where

$$c(u) = \begin{cases} 1/\sqrt{2} & \text{if } u = 0 \\ 1 & \text{otherwise.} \end{cases} \qquad (328)$$

The inverse 1-D discrete cosine transform $\mathcal{T}^{-1}_{DCT}(\hat{\mathbf{a}})$ is given by

$$[\mathcal{T}^{-1}_{DCT}(\hat{\mathbf{a}})](x) = \sum_{u=0}^{n-1} c(u) \cdot \hat{\mathbf{a}}(u) \cdot \cos\left(\frac{(2x+1)u\pi}{2n}\right). \qquad (329)$$

Given a discrete image $\mathbf{a} \in \mathbf{R}^{\mathbf{Z}_n \times \mathbf{Z}_n}$, the two-dimensional cosine transform and its inverse are given by

$$[\mathcal{T}(\mathbf{a})](u, v) = \frac{2 \cdot c(u) \cdot c(v)}{n} \sum_{x=0}^{n-1} \sum_{y=0}^{n-1} \mathbf{a}(x, y) \cdot \cos\left(\frac{(2x+1)u\pi}{2n}\right) \cdot \cos\left(\frac{(2y+1)v\pi}{2n}\right) \qquad (330)$$

and

$$[\mathcal{T}^{-1}(\hat{\mathbf{a}})](x, y) = \frac{2}{n} \sum_{u=0}^{n-1} \sum_{v=0}^{n-1} c(u) \cdot c(v) \cdot \hat{\mathbf{a}}(u, v) \cdot \cos\left(\frac{(2x+1)u\pi}{2n}\right) \cdot \cos\left(\frac{(2y+1)v\pi}{2n}\right), \qquad (331)$$

where $c$ was defined previously.
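A direct (non-fast) evaluation of the 1-D transform pair can serve as a numerical check on the definition. The sketch below assumes the customary weighting $c(0) = 1/\sqrt{2}$ and $c(u) = 1$ otherwise, with the factor $2c(u)/n$ in the forward transform; the function names `dct_1d` and `idct_1d` are hypothetical, and the code makes no attempt at the separable fast formulation.

```python
import math


def c(u: int) -> float:
    """Weighting function, assumed to be 1/sqrt(2) at u = 0 and 1 elsewhere."""
    return 1.0 / math.sqrt(2.0) if u == 0 else 1.0


def dct_1d(a):
    """Forward 1-D DCT of a sequence of n samples, evaluated directly."""
    n = len(a)
    return [
        (2.0 * c(u) / n)
        * sum(a[x] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
              for x in range(n))
        for u in range(n)
    ]


def idct_1d(coeffs):
    """Inverse 1-D DCT matching dct_1d above."""
    n = len(coeffs)
    return [
        sum(c(u) * coeffs[u] * math.cos((2 * x + 1) * u * math.pi / (2 * n))
            for u in range(n))
        for x in range(n)
    ]


signal = [1.0, 2.0, 3.0, 4.0]
recovered = idct_1d(dct_1d(signal))
```

Under these assumptions the round trip reproduces the input to within floating-point error, and a constant signal yields a single nonzero (zero-frequency) coefficient.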
We say that $\mathcal{T}_{DCT}$ is a block DCT when the transform is applied to individual encoding blocks in a source image.

9.4.1.2. Complexity and computational error. Over an $M \times N$-pixel array $\mathbf{X}$, the separable fast formulation of the block DCT [85] applied to $k \times l$-pixel blocks has complexity given by

$$W_{DCT}(M, N, k, l) = \left\lceil \frac{M}{k} \right\rceil \cdot \left\lceil \frac{N}{l} \right\rceil \cdot (l \cdot \log k + k \cdot \log l)\ \text{real multiplications}. \qquad (332)$$

Similar to the separable Fourier transform, the separable block DCT is comprised of $\lceil \log k \rceil + \lceil \log l \rceil$ stages where the principal operations are real addition and multiplication. Since the dataflow graph of a function $f$ is isomorphic to that of its error function $\varepsilon_f$ [13], the computational error in the DCT is given by Equation 333, which expresses $\varepsilon_{DCT}$ as a composition over the stages $i$ of terms $2 \cdot \varepsilon_x(i)$, where $\varepsilon_x(i)$ is defined piecewise [the piecewise definition is illegible in the source] and $\varepsilon_{\mathbf{a}}$ denotes the error function of the source image. Here, multiplication of $\varepsilon_x$ by two implements the additive error propagation law. Although we assume (for purposes of simplicity) that the error function of each block resembles [illegible in the source], such ergodicity generally does not occur in imagery that is obtained from a realistic sensor whose noise figure is spatially variant. In such cases, one must calculate $\varepsilon_{DCT}$ at each location in each block of the source image, which is prohibitive for production processing. Thus, we are currently investigating the design and implementation of error protocols that would characterize error throughout an image, or in partitions (e.g., quadrants) of an image. Results will be reported in a future publication.

9.4.1.3. Definition. The generalized JPEG transformation that we employ in this study is the composition of the discrete cosine transform, the sparse matrix reduction transform, and the Huffman transform. Given an image $\mathbf{a} \in \mathbf{R}^{\mathbf{X}}$,

$$\mathcal{T}_{JPEG}(\mathbf{a}) = [\mathcal{T}_{DCT} \circ \mathcal{T}_{SM} \circ \mathcal{T}_H](\mathbf{a}), \qquad (334)$$

with associated parameters such as block dimensions $k, l$ and negligible values in $S \subset \mathbf{R}$ understood.

9.4.1.4. Observation.
Huffman encoding, the third component of JPEG, is a lossless transform that is codebook-based. Similar to VQ, the Huffman codebook must be computed for each training set and is thus data-dependent. Additionally, it is difficult to process over the range space of the Huffman transform $\mathcal{T}_H$ (whose encoded blocks lie in $\mathbf{F}^m$), due to (a) variable blocksize $m$ and (b) the requirement of a finite-state automaton for decoding the unique prefix codes that comprise exemplars in the Huffman codebook. As a result, we assume that the Huffman-encoded data is Huffman-decoded prior to JPEG processing. Note that proprietary Huffman codebooks that are a modification of the standard Huffman codebook are employed in much commercial JPEG compression software and hardware. The development of compressive operations over the formats produced by such transforms is therefore a difficult goal, since the Huffman encoding is obscure. This gives further justification for using the generalized JPEG transform.

9.4.1.5. Complexity and information loss. The JPEG transform exhibits computational cost that is the sum of the work requirements of the DCT, sparse-matrix reduction, and Huffman transforms. Since we are concerned primarily with the former two transforms, we write $W_{JPEG}$, the work required by the separable fast DCT version of $\mathcal{T}_{JPEG}$, as

$$W_{JPEG}(|\mathbf{X}|, k, l) = W_{DCT}(|\mathbf{X}|, k, l) + W_{SM}(|\mathbf{X}|) = \left\lceil \frac{|p_1(\mathbf{X})|}{k} \right\rceil \left\lceil \frac{|p_2(\mathbf{X})|}{l} \right\rceil \cdot kl(\log k + \log l)\ \text{multiplications and cosines} + 2 \cdot |\mathbf{X}|\ \text{comparisons}, \qquad (335)$$

which can be rendered more efficient implementationally by encoding the cosine functions in a lookup table. Assume that a source image $\mathbf{a} \in \mathbf{F}^{\mathbf{X}}$ and that the set of negligible values $S$ is a subset of $range(choice(range(\mathcal{T}_{JPEG}))) = range(\mathbf{a})$. From Equation 334 and the introductory discussion, we express $\varepsilon_{JPEG}$, the error associated with JPEG, as:

$$\varepsilon_{JPEG}(\mathbf{a}) = \varepsilon_{SM}(S, \mathcal{T}_{DCT}(\mathbf{a}), \varepsilon_{DCT}(\varepsilon_{\mathbf{a}})), \qquad (336)$$

where $\varepsilon_{SM}$ was given in Equation 316 and $\varepsilon_{DCT}$ was given in Equation 333.

9.4.2. Pointwise Arithmetic.

We next summarize JPEG-based pointwise operations.

9.4.2.1. Theorem.
An analogous operation $\mathcal{O}'$ over JPEG-compressed imagery is an $\mathcal{O}$-property of the DCT, provided that $\mathcal{O}'$ is refined to accommodate data structures that encode the reduced matrix of DCT coefficients.

Proof. Let the givens hold and let the JPEG transform be as stated in Equation 334, i.e., $\mathcal{T}_{JPEG}(\mathbf{a}) = \mathcal{T}_{SM}(\mathcal{T}_{DCT}(\mathbf{a}))$. From the definitions of an analogue and the TPP, if $\mathcal{O}'$ is an $\mathcal{O}$-property of the DCT, then $\mathcal{O}'$ is an analogue of $\mathcal{O}$ over $range(\mathcal{T}_{DCT})$. From Theorem 9.2.2.1, if $\mathcal{O}'$ is refined to accommodate $\mathcal{T}_{SM}$'s data structures, then the theorem holds.

9.4.2.2. Complexity and information loss. Given the preceding theorem, the complexity of the analogous JPEG operation is merely the complexity of the analogous DCT operation $\mathcal{O}'$ together with the overhead required for manipulation of reduced-matrix data structures. Similarly, given source images $\mathbf{a}, \mathbf{b} \in \mathbf{F}^{\mathbf{X}}$ and compressed images $\mathbf{a}_c = \mathcal{T}_{JPEG}(\mathbf{a})$ and $\mathbf{b}_c = \mathcal{T}_{JPEG}(\mathbf{b})$, with $\varepsilon_{\mathcal{O}'}$ denoting the error function of the analogous DCT operation $\mathcal{O}'$, the total error in the resulting JPEG operation is given by

$$\varepsilon_{\mathcal{O}'_{JPEG}} = \varepsilon_{SM}(\varepsilon_{\mathcal{O}'}[\varepsilon_{DCT}(\varepsilon_{\mathbf{a}}), \varepsilon_{DCT}(\varepsilon_{\mathbf{b}})]), \qquad (337)$$

where $\varepsilon_{\mathbf{a}}$ and $\varepsilon_{\mathbf{b}}$ denote source image errors, with $\varepsilon_{SM}$ and $\varepsilon_{DCT}$ defined previously. The practical formulation of $\varepsilon_{\mathcal{O}'}$ depends upon parameters of the JPEG implementation (e.g., quality, smoothing, sampling area). For example, if the quality setting (whose value varies with the implementation and does not indicate percent information retained) is set at a high level, then the compression ratio and representational error will be reduced. Similarly, if the smoothing parameter (appropriate for monospectral images) is set high, then local averaging tends to degrade much high-frequency data.

9.4.3. Pixel-Level Operations over JPEG-compressed Imagery.

9.4.3.1. Observation. The JPEG quantizer may be implemented either as a division process [7] or as value selection by sparse-matrix reduction.
Since low-frequency coefficients generally exhibit greater magnitude than high-frequency coefficients in cosine transformations of natural scenes, it is reasonable to assume that low-order coefficients of the DCT are more likely to be preserved. Thus, it is reasonable to expect that image smoothing, which can also be portrayed as low-pass filtering, can be accomplished by filtering the output of the JPEG transform to reject all high-frequency information.

9.4.3.2. Theorem. Let Observation 9.4.3.1 hold, and assume that $\mathbf{a}_c = \mathcal{T}_{JPEG}(\mathbf{a})$ has high-frequency coefficients stored as pixel values $\mathbf{a}(\mathbf{x})$. If the domain points $\mathbf{x}$ at which this information is located in the DCT transform representation are collected in set $D$, then the JPEG filtering algorithm that corresponds to the image operation $\mathcal{O} : \mathbf{F}^{\mathbf{X}} \to \mathbf{F}^{\mathbf{X}}$ has an analogue $\mathcal{O}' : (\mathbf{X} \times \mathbf{F})^{\mathbf{Y}} \to (\mathbf{X} \times \mathbf{F})^{\mathbf{Y}}$, which is denoted by:

$$\mathcal{O}'(\mathbf{a}_c) = \mathbf{a}_c \|_{\{(\mathbf{x}, \mathbf{a}(\mathbf{x})) : \mathbf{x} \in D\}}. \qquad (338)$$

Proof. Assuming that the givens hold, $D$ represents the domain selection criteria for high-frequency filtering. Equation 338 implements the restriction of $\mathbf{a}_c$ to domain points in $D$. Thus, the theorem holds.

9.4.3.3. Complexity. Assume, for purposes of realism, that smoothing is computed via convolution of a source image $\mathbf{a} \in \mathbf{F}^{\mathbf{X}}$ with a template $\mathbf{t} \in (\mathbf{F}^{\mathbf{X}})^{\mathbf{X}}$ using an image operation $\mathcal{O} : \mathbf{F}^{\mathbf{X}} \times (\mathbf{F}^{\mathbf{X}})^{\mathbf{X}} \to \mathbf{F}^{\mathbf{X}}$. If the compressive analogue $\mathcal{O}' : (\mathbf{X} \times \mathbf{F})^{\mathbf{Y}} \to (\mathbf{X} \times \mathbf{F})^{\mathbf{Y}}$ over $range(\mathcal{T}_{JPEG})$ is given by Equation 338, then the preceding theorem gives rise to a CCS

$$\mathcal{X}_{\mathcal{O}} = (A, \mathcal{T}, B) = (\{\mathbf{F}^{\mathbf{X}}, (\mathbf{F}^{\mathbf{X}})^{\mathbf{X}}, \mathcal{O}\}, \{\mathcal{T}, U; \Theta\}, \{(\mathbf{X} \times \mathbf{F})^{\mathbf{Y}}, (\mathbf{F}^{\mathbf{X}})^{\mathbf{X}}, \mathcal{O}'\}), \qquad (339)$$

where $\mathcal{O}' = \Theta(\mathcal{O})$ and $U$ is the null transform. That is, the template used in the image domain is present in the compressed domain merely as a placeholder, such that the elements of algebras $A$ and $B$ exhibit a one-to-one correspondence in number and type. Given that $\mathbf{X}$ is $m$-dimensional, Equation 338 requires work of at most $m \cdot |\mathbf{Y}|$ comparison operations and $(m - 1) \cdot |\mathbf{Y}|$ combinations of the comparison results (e.g., logical and operations).
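The restriction in Equation 338 and its comparison count can be illustrated with a short sketch. The assumptions here are not from the text: the reduced matrix is a list of (point, value) pairs, the set $D$ is described implicitly as a box of coefficient indices below per-coordinate bounds, and `filter_reduced` is a hypothetical name. Each retained or rejected entry costs at most $m$ coordinate comparisons, combined by logical and.

```python
def filter_reduced(a_c, bounds):
    """Keep entries whose domain point lies inside the box given by bounds.

    a_c    : list of (point, value) pairs, point being an m-tuple of indices
    bounds : m-tuple; a point is kept when every coordinate is below its bound
    """
    kept = []
    for point, value in a_c:
        # At most m comparisons per entry, combined by logical and.
        if all(coord < bound for coord, bound in zip(point, bounds)):
            kept.append((point, value))
    return kept


# Reduced matrix of DCT coefficients for one 4x4 block (illustrative values).
a_c = [((0, 0), 10.0), ((0, 3), 1.5), ((2, 1), -2.0), ((3, 3), 0.5)]
low_pass = filter_reduced(a_c, bounds=(2, 2))
```

Choosing the bounds so that the box contains only low-order indices gives a smoothing filter; a high-pass selection would instead use the complement of such a box as $D$.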
If $m < CR_d$, then the computational efficiency associated with $\mathcal{X}$ is given by

$$\eta = \frac{W_{\mathcal{O}}(|\mathbf{X}|)}{W_{\mathcal{O}'}(|\mathbf{Y}|)} \approx \frac{|\mathbf{X}| \cdot |S(\mathbf{t}_{\mathbf{x}})|}{2m \cdot |\mathbf{Y}|} = \frac{|S(\mathbf{t}_{\mathbf{x}})|}{2m} \cdot CR_d, \quad \mathbf{x} \in \mathbf{X}. \qquad (340)$$

For example, if $\mathbf{t}$ is a space-invariant template such as the von Neumann template, then $|S(\mathbf{t}_{\mathbf{x}})| = 5$. If the encoding blocks are four pixels square, then $m = 2$ and $CR_d = 16$. Under the artificial assumption that the image-template operations of multiplication and addition constituent to $\mathcal{O}$ require work equal to that of the comparison operations present in $\mathcal{O}'$, we have $\eta = (5/4) \cdot 16 = 1.25 \cdot 16 = 20$. However, the image-template operations are usually more burdensome than the comparisons, so $\eta > 20$ would be more likely in this case.

9.4.3.4. Information loss. The method of Equation 338 can be analyzed similarly, since values of pixels in $p_2(\mathbf{a}_c(\mathbf{Y}))$ that correspond to preselected domain points in $\mathbf{X}$ are discarded during filtering. Denoting the domain points of discarded pixels as $B = p_1(\mathbf{a}_c(\mathbf{Y})) \setminus D$, the information discarded by JPEG filtering is given by

$$\varepsilon_{JPF}(D, \mathbf{a}_c) = \frac{1}{|\mathbf{X}|} \cdot \sum_{\mathbf{x} \in B} p_2(\mathbf{a}_c(\mathbf{x})). \qquad (341)$$

9.4.3.5. Example. Given an image database of outdoor scenes containing small manmade targets that were acquired using an intensified camera with approximately three bits per pixel error, we computed the error and performance measures shown in Figure 15 for pointwise image addition. This operation, although simple mathematically, was useful for averaging images over a multispectral frame containing six monospectral images. The abscissas of Figure 15 depict removal of the lower first through fourth magnitude quintiles of the DCT coefficients in the matrix reduction process. The zero-th quintile merely represents the control case, i.e., where all transform coefficients were preserved. Error is expressed on the left-hand ordinate in terms of bits per pixel, and computational efficiency is expressed on the right-hand ordinate in terms of the factor by which the speed of noncompressive addition was increased.
Note that for the zero-th quintile, the efficiency is slightly less than unity, due to the overhead incurred by accessing reduced-matrix data structures. This overhead is further reflected in the fourth-quintile efficiency, which is less than the theoretical maximum of 5 = 1/(1 - 0.8).

[Figure 15: plot of error (o) and efficiency (x) versus quintile of DCT coefficients discarded (0 through 4).]

Figure 15. Error and efficiency measures associated with JPEG addition over natural scenes having 3 bits/pixel error.

We consider this example to describe a limiting case of computational error, since the imagery is already severely noise-degraded, and the error upper bound is described by additive error propagation. We believe that the variance from linearity in the error result is due to two factors. First, the DCT coefficients that were discarded represent much high-frequency information, such as the texture of grass, which was prevalent as background. Additionally, there were patches of sand among the grass that contributed high- and midrange-frequency information, which was obliterated as the third and fourth quintiles were neglected. Second, various multispectral bands of the imagery had different modulation transfer functions (MTFs). Thus, the DCT coefficients that were discarded perturbed the representational error of different monospectral band images in different ways. This result has important implications for the accuracy of compressive ATR filters, and will be investigated in future research.

9.4.3.6. Observation. JPEG-based filtering in the spatial frequency domain was described in Equation 338. Thus, JPEG-based edge detection, which can be implemented as highpass filtering, would be computed using Equation 338, where $D$ would contain the domain points of high-frequency coefficients of the DCT.

9.5. Block Truncation Coding and VPIC.
We begin by reviewing the BTC and VPIC transforms, then derive operations over BTC- and VPIC-compressed imagery.

9.5.1. Basic Concepts of BTC.

Block truncation coding [16] stores the mean of an encoding block. If the block standard deviation exceeds a prespecified threshold, an edge block is said to be present, and the standard deviation is stored, together with a bitmap of the block crossings about the mean.

9.5.1.1. Algorithm. If Definitions 9.1.1.1 and 9.1.1.2 hold, then for each source image block $\mathbf{b}(\mathbf{y})$, $\mathbf{y} \in \mathbf{Y}$, the BTC transform is computed as follows:

Step 1. Determine the block mean $\mu(\mathbf{y})$ and standard deviation $\sigma(\mathbf{y})$.

Step 2. Assuming that the compressed image $\mathbf{a}_c$ is initialized to some constant $f$, where $f < \bigwedge range(\mathbf{a})$ or $f > \bigvee range(\mathbf{a})$, and given variance threshold $T_{min}$, interrogate the standard deviation as follows:

$$\mathbf{a}_c(\mathbf{y}) = \mu \quad \text{if } \sigma(\mathbf{y}) < T_{min}. \qquad (342)$$

If $\mathbf{a}_c(\mathbf{y}) = f$, then an edge block is present, and we proceed to Step 3. Otherwise, a mean block is present, and we apply Step 1 to the $(\mathbf{y}+1)$-th block.

Step 3. Compute the bitmap $\mathbf{d}$ of the zero crossings of $\mathbf{b}(\mathbf{y})$ about its mean, as follows:

$$\mathbf{d} = \chi_{>0}(\mathbf{b}(\mathbf{y}) - \mu). \qquad (343)$$

Step 4. Via bitmap manipulations that vary with the implementation of the BTC transform, reduce the entropy of $\mathbf{d}$ by eliminating single bits in rows or columns, to yield clusters of four or more bits if $k, l > 2$. Such reductions can be performed in several ways. For example, given a bit pattern $\mathbf{d}$ such as that of Equation 344 [the bit pattern is illegible in the source], the leftmost group of two unitary bits, as well as the single unitary bit in the bottom row, would be zeroed to form the manipulated bitmaps

$$\mathbf{d}_1 = \begin{pmatrix} 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix} \quad \text{or} \quad \mathbf{d}_2 = \begin{pmatrix} 0 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}. \qquad (345)$$

Note that the zeroes present in the left-hand half of $\mathbf{d}$ have been interchanged with neighboring unitary values to produce $\mathbf{d}_1$, which can be thought of as being comprised of two $2 \times 2$-pixel sub-blocks [the sub-block form is illegible in the source]. In contrast, $\mathbf{d}_2$ is comprised of two overlapping unitarily-valued $2 \times 2$-pixel clusters.
Such construction can facilitate more efficient bitmap coding in terms of preselected exemplar patterns or hierarchical indexing.

Step 5. Store the following parameters in the compressed representation of the $\mathbf{y}$-th edge block:

$$\mathbf{a}_c(\mathbf{y}) = (\mu, \sigma, \mathbf{d}). \qquad (346)$$

Subsequently, encoded blocks are binarized to effect maximum bit packing. For example, the mean value may be quantized to 5 bits and the standard deviation to 3 bits. Decompression involves the block reconstruction step

$$\mathbf{b} = \sigma \cdot (2\mathbf{d} - 1) + \mu, \qquad (347)$$

which requires one multiplication and $2kl$ additions per edge block as well as $kl$ I/O operations per block.

9.5.1.2. Complexity. Per $k \times l$-pixel block, BTC requires one comparison operation to threshold the standard deviation, $kl - 1$ additions and one multiplication to compute the block mean, and $2kl$ subtractions and multiplications, as well as one square root, to compute the standard deviation. Per block, one I/O operation is required for storing the mean value. Per edge block, one I/O operation is employed to store the standard deviation and at least $O(kl)$ comparison operations are required for block manipulation. Neglecting I/O overhead and letting $|\mathbf{Y}| = \lceil M/k \rceil \cdot \lceil N/l \rceil$ denote the number of encoding blocks, the total computational cost of BTC is bounded as

$$W_{BTC}(|\mathbf{Y}|, k, l) \le |\mathbf{Y}| \cdot (3kl - 1\ \text{additions} + 2kl\ \text{multiplications} + 1\ \text{sqrt} + O(kl)\ \text{comparisons}), \qquad (348)$$

where sqrt denotes the square root operation.

9.5.1.3. Information loss. Due to (a) the representation of a "smooth" block by the block mean, as well as (b) the representation of an edge block by the block mean, standard deviation $\sigma$, and manipulated bitmap $\mathbf{d}$, decompression of the BTC transform output approximates the source image. Observe that BTC is designed to represent a source image $\mathbf{a} \in \mathbf{R}^{\mathbf{X}}$ in terms of a reconstructed image $\tilde{\mathbf{a}} \in \mathbf{R}^{\mathbf{X}}$, within a prespecified error limit $\varepsilon_{\mathbf{a}} = \|\mathbf{a} - \tilde{\mathbf{a}}\|$ that can be expressed in terms of the block standard deviation $\sigma$ under the assumption of normality.
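A more convenient method is to let the location of the $i$-th point within the $\mathbf{y}$-th encoding block be given by $\mathbf{x} = h^*(h(\mathbf{y}))(i) \in \mathbf{X}$. The probability of a value $\mathbf{e}(\mathbf{x})$ in the error image $\mathbf{e} = \mathbf{a} - \tilde{\mathbf{a}}$, given the corresponding source value $\mathbf{a}(\mathbf{x})$, is given by

$$Pr(\mathbf{e}(\mathbf{x})) = \begin{cases} 0.683 & \text{if } \mathbf{a}(\mathbf{x}) \in [\mu - \sigma, \mu + \sigma] \\ \dfrac{1}{\sigma\sqrt{2\pi}} \cdot e^{-(\mathbf{a}(\mathbf{x}) - \mu)^2 / 2\sigma^2} & \text{otherwise,} \end{cases} \qquad (349)$$

where the value 0.683 is the probability of error at one standard deviation. Since the probability distribution of source values in a reconstructed block is distorted by quantizing block values to the interval $[\mu - \sigma, \mu + \sigma]$, nothing more can be said about the associated error. Given $\mu$ and [the remainder of this sentence is missing from the source].

9.5.2. Visual Pattern Image Coding.

We next describe visual pattern image coding, which resembles BTC since the block mean and a representation of block variance are included in the compressed image. Bitmap manipulation is not a feature of VPIC. Instead, VPIC matches a zero-crossing bitmap to a small number of codebook exemplars and stores the index of the best-match exemplar in the compressed image, whereas BTC simplifies and stores the zero-crossing bitmap itself.

9.5.2.1. Algorithm. If Definitions 9.1.1.1 and 9.1.1.2 hold, then for each block $\mathbf{b}(\mathbf{y})$, $\mathbf{y} \in \mathbf{Y}$, the VPIC transform is computed as follows:

Step 1. Determine the block mean $\mu(\mathbf{y})$ and partial gradients $\Delta_x$, $\Delta_y$ as follows:

$$\mu(\mathbf{y}) = \frac{\Sigma\mathbf{b}(\mathbf{y})}{|h^*(\mathbf{y})|}, \quad \Delta_y = \frac{\Sigma\mathbf{b}_t}{|domain(\mathbf{b}_t)|} - \frac{\Sigma\mathbf{b}_b}{|domain(\mathbf{b}_b)|}, \quad \Delta_x = \frac{\Sigma\mathbf{b}_l}{|domain(\mathbf{b}_l)|} - \frac{\Sigma\mathbf{b}_r}{|domain(\mathbf{b}_r)|}, \qquad (350)$$

where $\mathbf{b}_t$, $\mathbf{b}_b$, $\mathbf{b}_l$, and $\mathbf{b}_r$ denote the top, bottom, left, and right halves of $\mathbf{b}(\mathbf{y})$. If $\mathbf{b}$ has an odd number of rows, then the first $\lceil l/2 \rceil$ pixels of $\mathbf{b}$'s middle row are contained in $\mathbf{b}_t$, and symmetrically for the $k$ columns of $\mathbf{b}_l$. The gradient intensity $g(\mathbf{y})$ and orientation $\theta(\mathbf{y})$ are given by

$$g(\mathbf{y}) = \sqrt{\Delta_x^2 + \Delta_y^2} \quad \text{and} \quad \theta(\mathbf{y}) = \tan^{-1}(\Delta_y / \Delta_x). \qquad (351)$$

Step 2. Given that the compressed image $\mathbf{a}_c$ is initialized to a constant value $f$, where $f$ [the remainder of Step 2, including Equation 352, is missing from the source].

Step 3. Compute the bitmap $\mathbf{d}$ of the zero crossings of $\mathbf{b}(\mathbf{y})$ about its mean, as

$$\mathbf{d} = [2 \cdot \chi_{>0}(\mathbf{b}(\mathbf{y}) - \mu)] - 1, \qquad (353)$$

which can be binarized by substituting 0 for -1.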
Step 4. Let $\mathbf{d}$ be correlated with bitmap exemplars in the codebook $\mathbf{c} \in (\{0, 1\}^{\mathbf{Z}_k \times \mathbf{Z}_l})^{U}$, where finite $U \subset \mathbf{N}$. Implementationally, we assume that correlation is assisted by a LUT which restricts the choice of exemplars in $\mathbf{c}$ to orientations near $\theta$. For example, the bitmap $\mathbf{d}$ portrayed in Equation 344 could be matched to an exemplar such as

$$\mathbf{c}(u) = \begin{pmatrix} 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad u \in U. \qquad (354)$$

The exemplar index $u$ is stored with the edge block parameters in the compressed image as

$$\mathbf{a}_c(\mathbf{y}) = (\mu, g, \theta, u). \qquad (355)$$

After all blocks are encoded, $\mathbf{a}_c$ is binarized to achieve maximum bit packing. Decompression involves scaling the $u$-th exemplar pattern for mean $\mu$ and gradient $g$, where the latter can be thought of as being similar to the standard deviation in Equation 347. The decompression step therefore requires one multiplication and $kl$ additions per edge block, as well as $kl$ I/O operations per mean block.

9.5.2.2. Observation. Horizontal and vertical block partitions (Step 1) are recommended [9], which is likely due to the implications of psychophysical research pertaining to the vertical and horizontal primary and secondary directions of gaze [74,75]. We are currently investigating additional methods of block partitioning that may facilitate optimization of VPIC's compression ratio. For example, overlapping elliptical or hexagonal blocks would respectively be appropriate as a consequence of overlapping visual fields or close-packing of retinal photoreceptors. Additionally, we note that the preceding algorithm can be computed in SIMD-parallel fashion, since $\mu$, $g$, $\theta$, and $u$ can be stored as images on $\mathbf{Y}$. When $g < T_{min}$, the unused values ($g$, $\theta$, and $u$) in Equation 355 would be ignored in the binarization step, which could be computed in SIMD-parallel fashion, followed by a merging step that would optimally pack $\mathbf{a}_c$. This issue is further discussed in Chapter 11.

9.5.2.3. Remark.
Our preliminary results indicate that following VPIC with Huffman encoding may yield near-optimal compression. However, the Huffman codebook must be recomputed for each VPIC encoding of an image training set, due to the non-ergodic distribution of mean and standard deviation values in various test sets. For example, training imagery may have Gaussian-distributed means, while test-set imagery acquired using a different sensor may have block means that conform to a Poisson distribution. Recall that it is not convenient to process over Huffman-encoded imagery, due to nonuniform blocksize and the implementational requirement of a finite-state automaton (FSA) in the decoding step. In Section 3.4.8, we showed that nonuniform blocksize could induce block fragmentation in compressive binary pointwise and image-template operations. FSA-based decoding is not necessarily useful for other types of compression formats, and would represent additional complexity useful only for Huffman decoding. If Huffman encoding follows VPIC, then it is reasonable in this study to assume Huffman decoding prior to processing VPIC-encoded imagery.

9.5.2.4. Complexity. Chen and Bovik [9] showed that VPIC requires the following operations:

• Per block, $kl - 1$ additions and one multiplication for each computation of the block mean, and one subtraction and two comparisons to compute the gradient intensity $g$, since (a) $\Sigma\mathbf{b}_t$ and $\Sigma\mathbf{b}_b$ can be added to obtain $\Sigma\mathbf{b}$ for computing the mean and (b) division is not required to compute non-normalized $g$ for purposes of comparison with $T_{min}$.

• Per edge block, VPIC requires $4kl + 2$ additions, $kl$ multiplications, one square root and one inverse tangent operation to compute the gradient orientation $\theta$. Depending on the block matching scheme, if the codebook has $|U|$ exemplars, the matching overhead is bounded above by $2kl \cdot |U|$ subtractions.
Additionally, $2 \cdot |U|$ multiplications and $|U|$ square root operations are required for rms computation of matching scores.

Letting $|\mathbf{Y}| = \lceil M/k \rceil \cdot \lceil N/l \rceil$ denote the number of encoding blocks and [symbol illegible in the source] denote the number of edge blocks, the total computational cost $W_{VPIC}(|\mathbf{X}|, k, l, |U|)$ is bounded by Equation 356, which is only partly legible in the source. Its recoverable terms are a count of additions proportional to $|\mathbf{X}|$ (with a leading constant of 5 and a term in $|U| \cdot 2kl$), a count of multiplications proportional to $|\mathbf{X}|$ (with a term in $|U|$), $(|\mathbf{X}|/kl) \cdot |U|$ roots, and $|\mathbf{X}|/kl$ inverse tangents.

If $|U| < |\mathbf{X}|/kl$, which is customary, and if $kl$ is sufficiently large, then $W_{VPIC}$ can be approximated in practice by six additions and one multiplication per pixel of the source image, which is a small cost. For example, if $|\mathbf{X}|$ = 1M, then 6M additions and 1M multiplications would be required.

9.5.2.5. Information loss. As in the case of BTC, due to (a) the representation of a "smooth" block by the block mean, as well as (b) the representation of an edge block by the block mean, gradient intensity $g$, orientation $\theta$, and exemplar index $u$, VPIC approximately represents the source image and is therefore lossy. In this study, we assume that the gradient $g$ approximates the standard deviation, since both are computed from block means in RMS fashion. As a result, the error of estimation under which Equation 349 applies depends primarily upon (a) the codebook exemplars, (b) codebook size constraints, and (c) the frequency of exemplar use, the latter of which depends upon encoding block configurations in the training set.

9.5.2.6. Remark. We further note that VPIC error analysis may be considered a moot point for image display. This conclusion is supported by the fact that VPIC is designed to produce a visually pleasing (versus a mathematically accurate) representation of the source image. As previously mentioned, this effect results from a prespecified codebook whose exemplars are designed to conform to knowledge about patterns that stimulate cortical receptive fields [74,75]. As a result, VPIC is not designed to minimize a criterion such as mean-squared error (MSE), which is the case for most compression transforms (e.g., BTC, JPEG, or VQ). Thus, VPIC appears to be more appropriate for visual analysis of image content, versus ATR applications that emphasize computerized image analysis under uniform error constraints.

9.5.3. Pointwise Arithmetic.

We have found that operations of addition, multiplication, and maximum can be implemented over the range spaces of BTC- and VPIC-encoded imagery. VQ-based arithmetic is more involved, due to issues of codebook growth and information loss. Salient discussion follows.

9.5.3.1. Assumption. Let a source image $\mathbf{a} \in \mathbf{R}^{\mathbf{X}}$ be compressed by a BTC or VPIC transform $\mathcal{T} : \mathbf{R}^{\mathbf{X}} \to \mathbf{G}^{\mathbf{Y}}$ to yield $\mathbf{a}_c \in \mathbf{G}^{\mathbf{Y}}$, then be decompressed to yield a reconstructed image $\tilde{\mathbf{a}} \in \mathbf{R}^{\mathbf{X}}$. Let the bipolar transform $\delta(f) = 2f - 1$, $f \in \{0, 1\}$. Assume that a pointwise image operation $\mathcal{O} : \mathbf{R}^{\mathbf{X}} \times \mathbf{R}^{\mathbf{X}} \to \mathbf{R}^{\mathbf{X}}$ has an analogue $\mathcal{O}'$ over $range(\mathcal{T})$, which comprises a CCS $\mathcal{X}_{\mathcal{O}} = (\{\mathbf{R}^{\mathbf{X}}, \mathcal{O}\}, \{\mathcal{T}, \Theta\}, \{\mathbf{G}^{\mathbf{Y}}, \mathcal{O}'\})$, where $\mathcal{O}' = \Theta(\mathcal{O})$.

9.5.3.2. Theorem. If Assumption 9.5.3.1 holds, $\mathcal{T}$ = BTC, $\mathcal{O} = +$, $g = \mathbf{a}_c(\mathbf{y})$, and $h = \mathbf{a}_c(\mathbf{z})$, where $\mathbf{y}, \mathbf{z} \in \mathbf{Y}$, then $\mathcal{O}'$ is given by:

$$\mathcal{O}'(g, h) = \begin{cases} \sum_{f \in \{g, h\}} p_1(f) + p_2(f) \cdot \delta(p_3(f)) & \text{if } g \text{ and } h \text{ represent edge blocks} \\ p_1(g) + p_1(h) & \text{otherwise,} \end{cases} \qquad (357)$$

where $p_i$ denotes projection to the $i$-th coordinate, $p_1$ denotes block mean $\mu$, $p_2$ denotes block standard deviation $\sigma$, and $p_3$ denotes the block bitmap. Similarly, if $\mathcal{O} = \cdot$, then

$$\mathcal{O}'(g, h) = \begin{cases} \prod_{f \in \{g, h\}} p_1(f) + p_2(f) \cdot \delta(p_3(f)) & \text{if } g \text{ and } h \text{ represent edge blocks} \\ p_1(g) \cdot p_1(h) & \text{otherwise,} \end{cases} \qquad (358)$$

and if $\mathcal{O} = \vee$, then

$$\mathcal{O}'(g, h) = \begin{cases} \bigvee_{f \in \{g, h\}} p_1(f) + p_2(f) \cdot \delta(p_3(f)) & \text{if } g \text{ and } h \text{ represent edge blocks} \\ p_1(g) \vee p_1(h) & \text{otherwise.} \end{cases} \qquad (359)$$

The operation of pointwise minimum is symmetrically defined.

Proof. The proof is outlined as follows. Assuming that the givens hold, denote the arguments of $\mathcal{O}'$ as $g$ and $h$. Equation 357 merely denotes the summation of reconstructed blocks corresponding to $g = \mathbf{a}_c(\mathbf{y})$ and $h = \mathbf{a}_c(\mathbf{z})$, where $\mathbf{y}, \mathbf{z} \in \mathbf{Y}$. The proof is symmetric for $\mathcal{O} = \vee$, and the theorem holds.

9.5.3.3. Observation.
Equation 357 can be characterized in terms of a map $f : (\mathbb{R}^2 \times \mathbb{N})^2 \to \mathbb{R}^2 \times \mathbb{N}$, where $p_3(range(f))$ denotes the index of a manipulated bitmap. Implementationally, the mean and standard deviation (or gradient, in the case of VPIC) would be indexed and collected together with the exemplar index to yield $f : \mathbb{N}^3 \times \mathbb{N}^3 \to \mathbb{N}^3$. Such implementation is appropriate for BTC, since $\mu$ and $\sigma$ are quantized and the number of manipulated bitmap configurations $M$ is small in relation to the number ($2^{k\ell}$) of possible bitmaps. In practice, we usually assume $M \le 32$ for $k,\ell = 4$, since manipulation yields only those blocks that have unitarily-valued components of size 2x2 pixels or larger, such that $\log(M) = 5$ bits. Assuming that $size(range(\mathbf{a})) = 8$ bits, as well as 5 and 2 bits required for $\mu$ and $\sigma$, respectively, we have 12 bits = 5 + 5 + 2 bits required for each operand. Thus, the total LUT domain size is 24 bits ($2^{24}$ entries), or 16 MB.

In VPIC-based processing, a codebook of size M = 3 is typical for two possible angular orientations. Given 3 bits for the mean and 2 bits for the block gradient, we have 9 bits per edge block (not including the block i.d. flag, which is not germane). This implies that $domain(f)$ would have 2(9) = 18 bits. Since each value in $range(f)$ would have 9 bits, the LUT implementation of $f$ would have total size 294,912 bytes, which is feasible implementationally. This observation holds for pointwise arithmetic and logical operations.

9.5.3.4. Complexity. Equation 357 requires $3k\ell$ additions and $2k\ell$ sign manipulations per edge block, as well as one addition per mean block. Implementationally, it is reasonable to assume that the sign manipulations incur negligible cost in relationship to the additions. If a fraction $r$ of the encoding blocks are edge blocks, then the work required by BTC pointwise addition is given by

$$W_{\mathcal{O}'}(|\mathbf{X}|) = |\mathbf{Y}| \cdot (3r \cdot k\ell + r - 1)\ \text{additions}. \tag{360}$$

For example, if $r = 0.4$ and $k\ell = 16$, then $18.6 \cdot |\mathbf{Y}|$ additions are required.
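The arithmetic of Theorem 9.5.3.2 and of the two sizing examples above can be checked with a short sketch. This is an illustration under stated assumptions, not the implementation used in this study: the tuple layout `(mean, sigma, bitmap)`, the use of `None` to flag a mean block, and all function names are hypothetical. The work estimate reproduces Equation 360 exactly as printed.

```python
def beta(bit):
    """Bipolar transform of Assumption 9.5.3.1: maps {0, 1} to {-1, +1}."""
    return 2 * bit - 1


def btc_decode(block):
    """Reconstruct a block from (mean, sigma, bitmap).

    Hypothetical layout: bitmap is None for a mean ("smooth") block,
    in which case the block is represented by its mean alone.
    """
    mean, sigma, bitmap = block
    if bitmap is None:
        return mean
    return [[mean + sigma * beta(bit) for bit in row] for row in bitmap]


def btc_add(g, h):
    """Equation 357 as stated: sum the reconstructed blocks when both
    operands are edge blocks, otherwise add the block means."""
    if g[2] is not None and h[2] is not None:
        bg, bh = btc_decode(g), btc_decode(h)
        return [[x + y for x, y in zip(rg, rh)] for rg, rh in zip(bg, bh)]
    return g[0] + h[0]


def btc_add_work(r, kl):
    """Additions per encoding block, following Equation 360 as printed."""
    return 3 * r * kl + r - 1


def lut_bytes(bits_per_operand, bits_per_result):
    """Size of a two-operand LUT: 2**(2 * operand bits) entries,
    each holding bits_per_result bits."""
    return (2 ** (2 * bits_per_operand)) * bits_per_result // 8


edge_g = (10, 2, [[1, 0], [0, 1]])
edge_h = (20, 3, [[1, 1], [0, 0]])
print(btc_add(edge_g, edge_h))            # blockwise sum of two edge blocks
print(btc_add((10, 0, None), edge_h))     # falls back to the sum of means
print(btc_add_work(0.4, 16))              # additions per block for r = 0.4
print(lut_bytes(9, 9))                    # VPIC LUT size in bytes
```

With 9-bit operands and 9-bit results, `lut_bytes` gives the 294,912-byte figure quoted for VPIC, and `btc_add_work(0.4, 16)` gives the 18.6 additions per block of the example.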
This implies that $CR_d > 18.6$ if computational efficiency $\eta > 1$. However, this is a contradiction, since $CR_d = k\ell = 16$. We resolve this apparent conflict by observing that the condition $(3rk\ell + r - 1) < CR_d$ must be satisfied for a computational speedup to occur. We note that if $k\ell = 16$, then $CR_d > 49r - 1$ is required for $\eta > 1$.

In contrast, if the LUT described in Section 9.5.3.3 is employed, then $r \cdot |\mathbf{Y}|$ applications of the LUT are required (one for each edge block), and one I/O operation is required per mean block. Neglecting I/O overhead, we have

$$\eta = \frac{W_{\mathcal{O}}(|\mathbf{X}|)}{W_{\mathcal{O}'}(|\mathbf{Y}|)} = \frac{|\mathbf{X}|\ \text{additions}}{r \cdot |\mathbf{Y}|\ \text{LUT accesses}}, \tag{361}$$

which reduces to $CR_d / r$ if (a) the addition operation is performed in a LUT or (b) additions and LUT accesses require equal work. If $W$ is replaced in the preceding equation by the time complexity $T$, then $\eta$ denotes sequential computational speedup. Since $0 < r < 1$, processing at efficiencies greater than $CR_d$ is feasible.

9.5.3.5. Information loss. Given the operands $g$ and $h$ of $\mathcal{O}'$, if $\epsilon_g$ denotes the error in $g$ and $\epsilon_h$ denotes the error in $h$, then by the additive law of error propagation, the error in $\mathcal{O}'$ is given by:

$$\epsilon_{\mathcal{O}'} = \epsilon_g + \epsilon_h \quad \text{if } \mathcal{O} \in \{+, \vee\}. \tag{362}$$

Since we assume that the operations of maximum or minimum are implemented by subtraction or comparison, the additive error function applies. In the case of multiplication, we have the exact expression:

$$\epsilon_{\mathcal{O}'} = gh \cdot \left( \frac{\epsilon_g}{g} + \frac{\epsilon_h}{h} \right) + \epsilon_g \cdot \epsilon_h \quad \text{if } \mathcal{O} = \cdot, \tag{363}$$

which is customarily approximated by the first term of the preceding outer sum, since the product of $\epsilon_g$ and $\epsilon_h$ is small.

9.5.3.6. Observation. Pointwise operations for VPIC-compressed imagery are symmetrically derived. Since VPIC lacks a direct measure of standard deviation, we prefer to think of the block gradient $g$ as a measure of deviation.
This is justified by noting that the gradient of a source encoding block $\mathbf{b}$ is given by:

$$g(\mathbf{b}) = \sqrt{\Delta_x^2 + \Delta_y^2} = \sqrt{(\bar{\mathbf{b}}_t - \bar{\mathbf{b}}_b)^2 + (\bar{\mathbf{b}}_l - \bar{\mathbf{b}}_r)^2}, \tag{364}$$

where $\mathbf{b}_t$, $\mathbf{b}_b$, $\mathbf{b}_l$, and $\mathbf{b}_r$ denote the top, bottom, left, and right halves of $\mathbf{b}$ and the overline denotes the arithmetic mean. By way of comparison, the computation of $g$ has the same general form as the standard deviation

$$\sigma(\mathbf{b}) = \sqrt{\frac{1}{|domain(\mathbf{b})|} \sum_{\mathbf{x} \in domain(\mathbf{b})} \left( \mathbf{b}(\mathbf{x}) - \mu(\mathbf{b}) \right)^2}, \tag{365}$$

where $\mu(\mathbf{b})$ denotes the mean of $\mathbf{b}$. The chief exception to equality between $g$ and $\sigma$ is that the hemi-block means $\bar{\mathbf{b}}_t$, $\bar{\mathbf{b}}_b$, etc., are taken in VPIC within block partitions prior to differencing.

In practice, we have found that $g$ is often a useful estimate of $\sigma$, within an error of ±20 percent. Since many of the operations presented herein quantize values in $\mathbf{b}$ to an interval $[\mu - n\sigma, \mu + n$ …

… be quantized into the number of bits allotted for the block mean and standard deviation. For example, letting $\beta(x) = 2x - 1$, where $x$ is a Boolean value, the sum $\mathbf{a} + \mathbf{b}$ of two blocks with bitmaps $\mathbf{B}_a$ and $\mathbf{B}_b$ becomes

$$\mathbf{a} + \mathbf{b} = \mu_a + \sigma_a \cdot \beta(\mathbf{B}_a) + \mu_b + \sigma_b \cdot \beta(\mathbf{B}_b) = \mu_a + \mu_b + \left[ \sigma_a \cdot \beta(\mathbf{B}_a) + \sigma_b \cdot \beta(\mathbf{B}_b) \right], \tag{367}$$

where each entry of the bracketed block takes one of the four values $\sigma_a + \sigma_b$, $\sigma_a - \sigma_b$, $-(\sigma_a - \sigma_b)$, or $-(\sigma_a + \sigma_b)$, according to the two bitmap values at that pixel. Thus, the block standard deviation must be quantized from the four-valued block shown in the right member of the last line of Equation 367.

9.5.3.10. Observation. The preceding problem can be remedied by quantizing the four-valued standard deviation sum to the range $[p,q]$, where $p = \wedge[-(\sigma_a +$ …

If I/O overhead is negligible, the efficiency resulting from Equation 367 is given by

$$\eta = \frac{W_{\mathcal{O}}(|\mathbf{X}|)}{W_{\mathcal{O}'}(|\mathbf{Y}|)} \approx \frac{|\mathbf{X}|\ \text{additions}}{(|\mathbf{Y}| + 2)\ \text{additions} + |\mathbf{Y}|\ \text{LUTs}}, \tag{370}$$

which approximates the speedup exhibited by Equation 366.

9.5.3.12. Observation. BTC- or VPIC-based pointwise multiplication is similarly derived. Given the product law of error propagation, we model the block standard deviations in the pointwise block product $\mathbf{c} = \mathbf{a} \cdot \mathbf{b}$ as errors.
Thus, we have $\mu_c = \mu_a \cdot \mu_b$ and …

9.5.4.3. Theorem. If Assumption 9.5.4.2 holds and the following conditions hold: (i) $\mathcal{T}$ = BTC, (ii) $\Gamma = \Sigma$, (iii) $\mathbf{h} \in \mathbb{N}^{\mathbb{Z}}$ denotes the histogram of $\mathbf{a} \in \mathbb{Z}^{\mathbf{X}}$, (iv) the compressed image $\mathbf{a}_c = \mathcal{T}(\mathbf{a})$, and (v) the error image $\mathbf{e} = \mathbf{a} - \hat{\mathbf{a}}$; then the error $\epsilon_\Sigma$ inherent in $\Gamma'(\mathbf{a}_c) = \Sigma\, p_1(\mathbf{a}_c)$ is bounded as

$$\epsilon_\Sigma \le \Sigma \mathbf{e}, \tag{372}$$

where $p_i$, $i = 1..3$, were defined previously. Similarly, the error $\epsilon_\Pi$ in the image product $\Gamma = \Pi$, which has an analogue $\Gamma'(\mathbf{a}_c) = \Pi\, p_1(\mathbf{a}_c)$, is bounded as

$$\epsilon_\Pi \le \Pi \mathbf{a} + \ldots \tag{373}$$

Proof. The proof is summarized as follows. Assuming that the givens hold, let $\Gamma' = \Sigma$ be expressed as

$$\Gamma'(\mathbf{a}_c) = \sum_{y \in \mathbf{Y}} p_1(\mathbf{a}_c(y)), \tag{374}$$

which is the sum of block means $\mu$. The error $\epsilon_\Sigma$ inherent in $\Gamma'$ is given by the summation over all $y \in \mathbf{Y}$ of the error with which $\mu(y)$ represents values in the $y$-th block. Given the error image $\mathbf{e} = \mathbf{a} - \hat{\mathbf{a}}$, $\Sigma \mathbf{e}$ denotes the representational error in the sum (from the additive law of error propagation) since (a) the within-block error due to block summation is given by $\epsilon_b(y) = \Sigma\, \mathbf{e}|_{h^*(y)}$, where $h^*$ was defined previously, and (b) $\epsilon_\Sigma = \sum_{y \in \mathbf{Y}} \epsilon_b(y)$. Equation 373 follows symmetrically from the expression for error propagation in the product of two real numbers $g$ and $h$.

In deriving BTC or VPIC image-template operations, one encounters the same challenges as described in Section 3.4 for block-encoded imagery. For example, if one has a von Neumann source template, then one must obtain information from the von Neumann neighbors of the current encoding block, as well as from the block itself. Assuming that $n$ bits are required to encode an edge block, and that the von Neumann topology characterizes the image domain, one has the requirement of $5n$ bits for the LUT domain size and $n(2^{5n})$ bits for the total LUT size. If $n > 4$, however, the LUT size will be prohibitive for current memory capacities.
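To make the growth of the neighborhood LUT concrete, the following sketch evaluates the $n(2^{5n})$-bit size quoted above for a few descriptor widths. The function name and the `neighbors` parameter are illustrative assumptions; only the formula itself comes from the text.

```python
def neighborhood_lut_bits(n, neighbors=5):
    """Total LUT size in bits for an n-bit block descriptor.

    The LUT domain concatenates the descriptors of the current block and
    its von Neumann neighbors (5 blocks in all, so 5n address bits), and
    each entry stores one n-bit result.
    """
    domain_bits = neighbors * n
    return n * 2 ** domain_bits


for n in (2, 3, 4, 5, 6):
    size_bytes = neighborhood_lut_bits(n) // 8
    print(f"n = {n}: {size_bytes:,} bytes")
```

Under these assumptions the table occupies 512 KiB at $n = 4$, 20 MiB at $n = 5$, and 768 MiB at $n = 6$, which illustrates why the text treats $n > 4$ as impractical.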
Rather than producing large lookup tables that may be impractical to implement, we currently prefer to use application-specific strategies for deriving neighborhood operations over BTC- or VPIC-encoded imagery. For example, in Chapter 10, we present algorithms for a VPIC edge detector, as well as for morphological operations such as erosion and dilation. These algorithms operate efficiently, and provide insights into the design of algorithms for additional applications. We are currently investigating the possibility of separable lookup tables that would partition the $n$-bit block descriptor such that several cascaded LUTs could be employed for image-template operations, instead of one large LUT.

9.6. Vector Quantization. We begin with a brief review of VQ methodology, then progress to pointwise, global-reduce, and image-template operations over VQ-compressed imagery.

9.6.1. Background. It has been frequently observed that images are composed of repeated patterns called texels [76,77,78]. In an extreme case, a brick wall can be reconstructed from two patterns (brick and mortar texture). Less obvious are the pseudorandom but structured patterns inherent in natural objects such as a carefully mowed lawn or recently tended golf course green, leaves on trees, and ripples on the surface of a pond.

By exploiting spatial redundancy in such patterned imagery, one can generally describe the majority of an image's appearance in terms of a list of exemplar patterns (the codebook) and an array of exemplar indices (the compressed image). The compressed image is merely an array or list of pattern indices, while the decompressed image is obtained by substituting the indexed exemplar patterns according to the map described in the compressed image. In practice, exemplar patterns can be of different size and shape, and may be rotated or scaled anisomorphically.
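The substitution step described above can be sketched in a few lines. This is a minimal illustration that assumes fixed-size rectangular exemplars stored as nested lists; the names `vq_decode`, `index_image`, and `codebook` are hypothetical, and variable-size, rotated, or scaled exemplars are not handled.

```python
def vq_decode(index_image, codebook):
    """Rebuild an image by pasting the indexed exemplar into each block.

    index_image: 2-D list of exemplar indices (the compressed image).
    codebook:    list of k-by-l exemplar blocks (lists of rows).
    """
    k = len(codebook[0])         # exemplar height
    l = len(codebook[0][0])      # exemplar width
    rows, cols = len(index_image), len(index_image[0])
    out = [[0] * (cols * l) for _ in range(rows * k)]
    for by, index_row in enumerate(index_image):
        for bx, u in enumerate(index_row):
            block = codebook[u]
            for i in range(k):
                for j in range(l):
                    out[by * k + i][bx * l + j] = block[i][j]
    return out


# Two exemplars stand in for the "brick" and "mortar" patterns.
codebook = [[[0, 0], [0, 0]],
            [[1, 2], [3, 4]]]
index_image = [[0, 1],
               [1, 0]]
for row in vq_decode(index_image, codebook):
    print(row)
```

Each output pixel is written exactly once, so the cost is one I/O operation per pixel of the reconstructed image.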
Certain image compression schemes, such as iterated function systems (IFS) based on concepts derived from fractal geometry, extensively exploit recurrent image structure to achieve generally high compression ratios, albeit with representational error [14]. The processing of imagery compressed by IFS-based systems, which are also called self-VQ, will be considered in future research.

Returning to more conventional VQ, we note that the exemplar list, also called a codebook, is usually constructed under constraint of an error criterion $\epsilon$. As a result, error inherent in the VQ reconstruction step is codebook-dependent. Since the codebook is derived from a given image training set, input nonstationarities may perturb the decompressed image, thereby resulting in an image representation whose error exceeds $\epsilon$. This lack of determinacy in the reconstruction error can severely impact the accuracy of VQ compressive computation, especially when cascaded operations are employed.

9.6.1.1. Definition. Let $\mathbf{X}$ denote an $M \times N$-pixel array and let $\mathbf{a} \in \mathbf{F}^{\mathbf{X}}$ denote an image that is tessellated into $k \times \ell$-pixel encoding blocks $\mathbf{b}(h(\mathbf{x}))$, $\mathbf{x} \in \mathbf{X}$, whose domain is given by $h^*(h(\mathbf{x}))$, where $h^*$ and $h$ were specified in Definition 9.1.1.1. Given a source image $\mathbf{a}$, the VQ transform $\mathcal{T}_{VQ} : \mathbf{F}^{\mathbf{X}} \to U^{\mathbf{Y}} \times (\mathbf{F}^{k\ell})^U$ clusters the encoding blocks of form $\mathbf{F}^{k\ell}$ into a codebook $\mathbf{c} \in (\mathbf{F}^{k\ell})^U$, where $U \subset \mathbb{N}$ is customary. The resultant compressed image on $\mathbf{Y}$ is $U$-valued, i.e., references codebook exemplars by their indices.

In this simple formulation of VQ, decompression is merely a matter of substituting exemplars based on indices encoded in $\mathbf{a}_c$. This step requires only $|\mathbf{X}|$ I/O operations, and is thus trivial.

9.6.1.2. Alternative formulation. If the image blocks are normalized to a prespecified mean and standard deviation, then codebook size can be reduced and decompression requires further overhead of $|\mathbf{X}|$ addition operations.
For example, let $\mathbf{a} \in \mathbf{F}^{\mathbf{X}}$ denote an image that is tessellated as described in Definition 9.1.1.2. The following steps are performed for each $y \in \mathbf{Y}$:

Step 1. Compute the mean $\mu$ and standard deviation $\sigma$ of the block $\mathbf{b}(y)$.

Step 2. Given $\mathbf{x} \in domain(\mathbf{b})$, let $\mathbf{b}$ be thresholded as

$$\mathbf{b}(\mathbf{x}) = \begin{cases} \mu - 2\sigma & \text{if } \mathbf{b}(\mathbf{x}) < \mu - 2\sigma \\ \mu + 2\sigma & \text{if } \mathbf{b}(\mathbf{x}) > \mu + 2\sigma \\ \mathbf{b}(\mathbf{x}) & \text{otherwise,} \end{cases} \tag{375}$$

which clamps the extrema of $\mathbf{b}$ to the interval $[\mu - 2\sigma, \mu + 2\sigma]$.

Step 3. Subtracting the mean, let the resulting block $\mathbf{b}_\mu$, normalized to the interval $[-L, L]$, be correlated with exemplars in a codebook $\mathbf{c} \in (\mathbf{F}^{k\ell})^U$ to yield a best-match exemplar index $u \in U$.

Step 4. Store the mean, standard deviation, and exemplar index to yield the following tuple-valued compressed image pixel

$$\mathbf{a}_c(y) = (\mu, \sigma, u). \tag{376}$$

Thus, if $\mathbf{F} = \mathbb{R}$, then $\mathbf{a}_c \in (\mathbb{R} \times \mathbb{R}^+ \times U)^{\mathbf{Y}}$.

In practice, the algorithm operates as follows. Under the assumption of normality, Steps 1 and 2 quantize $\mathbf{b}$ into the real interval $A = [\mu - n\sigma, \mu + n\sigma]$, which contains 95.4 percent of the block values when $n = 2$. We set $n = 2$ customarily, since the normal distribution has an associated probability of 0.477 between the mean and two standard deviations. Thus, 2(0.477) or approximately 95.4 percent of possible block values are included in $A$. Step 3 normalizes the block to a prespecified interval, for efficient comparison with codebook exemplars, and Step 4 constructs the compressed image. Decompression involves rescaling the $u$-th exemplar stored in $\mathbf{a}_c(y)$ to the interval $[\mu - 2\sigma, \mu + 2\sigma]$, such that the rescaled block has mean $\mu$. The manipulated block is then substituted into the reconstructed image.

9.6.1.3. Complexity. Step 1 requires work of $|\mathbf{Y}| \cdot (k\ell - 1)$ additions and $|\mathbf{Y}|$ multiplications to compute the mean, as well as $|\mathbf{Y}| \cdot (2k\ell - 1)$ additions and $|\mathbf{Y}|$ multiplications and square roots to compute the standard deviation. Step 2 requires $|\mathbf{Y}| \cdot 2k\ell$ comparison operations, or $|\mathbf{Y}| \cdot k\ell$ operations if $|range(\mathbf{a})|$ is sufficiently small that table lookup can be employed to compute Equation 375. Steps 3 and 4 require $|\mathbf{Y}|$ additions and I/O operations.

9.6.1.4. Observation.
VQ has the following key deficiencies:

1. Codebook generation is difficult computationally, often requiring a number of multiplications that is polynomial in the block count $|\mathbf{X}|/k\ell$ if the generalized Lloyd clustering algorithm (GLA) is employed [1,12]. The GLA is a descent algorithm with a monotonically decreasing error function that converges by iteratively updating the codebook while attempting to satisfy various centroid and nearest-neighbor conditions. Since the error function is nonconvex and usually contains numerous local minima, the resulting codebook is frequently nonoptimal for a given training set.

2. The VQ codebook must be regenerated for each image test set and is sensitive to input nonstationarities. In contrast, VPIC's codebook is data-invariant [9], being dependent only on knowledge about the human visual system (HVS).

3. VQ's codebook is customarily computed under a mean-square or Hamming-distance error criterion that may result in obscuration of image features that are useful for ATR. In particular, VQ has no ability to discriminate between spatial attributes of various image features, which can lead to significant errors when examining images for the presence of small features (i.e., whose spatial frequency is less than $k/M$ or $\ell/N$).

4. Codebook search overhead is large, typically requiring a minimum of $|U| \cdot \log(|U|)$ comparison operations.

Alternatives to the GLA abound in the literature. For example, Kohonen implemented a neural-network based algorithm that approximates least-mean-squares (LMS) clustering, with dynamic update [79]. Kohonen's LMS net circumvents the problem of convergence to a local minimum via network design that purportedly ensures convergence to a global minimum. A further advantage of Kohonen's method is the admission of input nonstationarities via dynamic codebook updating as each source vector is presented. Unfortunately, training times for the neural net are typically long.
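For reference, the basic GLA iteration named in the first deficiency above (alternating a nearest-neighbor assignment with a centroid update until the distortion stops decreasing) can be sketched as follows. This is a generic textbook form, not the codebook generator used in this study; blocks are assumed to be flattened into equal-length vectors, and the function names, the `max_iter` cap, and the stopping rule are illustrative assumptions.

```python
def sq_dist(v, w):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(v, w))


def nearest(v, codebook):
    """Index of the exemplar closest to v (nearest-neighbor condition)."""
    return min(range(len(codebook)), key=lambda u: sq_dist(v, codebook[u]))


def gla(vectors, initial_codebook, max_iter=50):
    """Generalized Lloyd iteration from a caller-supplied initial codebook."""
    codebook = [list(c) for c in initial_codebook]
    previous = float("inf")
    for _ in range(max_iter):
        labels = [nearest(v, codebook) for v in vectors]
        distortion = sum(sq_dist(v, codebook[u]) for v, u in zip(vectors, labels))
        if distortion >= previous:      # descent has stalled
            break
        previous = distortion
        for u in range(len(codebook)):  # centroid condition
            members = [v for v, lab in zip(vectors, labels) if lab == u]
            if members:
                codebook[u] = [sum(col) / len(members) for col in zip(*members)]
    labels = [nearest(v, codebook) for v in vectors]
    return codebook, labels


vectors = [[0, 0], [0, 2], [10, 10], [10, 12]]
codebook, labels = gla(vectors, [[0, 0], [10, 10]])
print(codebook, labels)
```

Because the result depends on the initial codebook, different starting exemplars can converge to different local minima, which is the nonoptimality noted in the text.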
Additional variants of the Lloyd algorithm have been proposed that utilize stochastic clustering techniques to produce a slightly more compact codebook than that which results from Lloyd's greedy algorithm [80]. Codebook size can also be varied by increasing the error constraint, albeit at the cost of visual fidelity in the decompressed image.

As a result of the foregoing deficiencies, we present the following discussions of information loss and computational error inherent in VQ compression.

9.6.1.5. Information loss. By design, vector quantization produces a codebook whose exemplars represent source image blocks under an MSE criterion $\epsilon$. Using notation employed in the VPIC transform description of Section 9.5.2, each source block $\mathbf{b}(y)$, $y \in \mathbf{Y}$, maps to its respective exemplar $\mathbf{c}(u)$, where $u \in U$, under the constraint

$$\| \mathbf{b}(y) - \mathbf{c}(u) \| \le \epsilon, \tag{377}$$

which ensures that the blockwise error remains within the prespecified bound $\pm\epsilon$. This means that the total absolute error $\epsilon_t$ in representing the source image will be bounded as $\epsilon_t \le |\mathbf{Y}| \cdot \epsilon$. As a result, the mean error per pixel in the reconstructed source image on $\mathbf{X}$ is given by

$$\epsilon_{VQ} = \frac{\epsilon_t}{|\mathbf{X}|} \le \frac{|\mathbf{Y}| \cdot \epsilon}{|\mathbf{X}|} = \frac{\epsilon}{CR_d}. \tag{378}$$

9.6.1.6. Remark. Although Equation 378 initially appears to state that the per-pixel information loss $\epsilon_{VQ}$ decreases as the compression ratio increases, this is not the case, since the block representation error $\epsilon$ increases with $CR_d$ as fast as (or faster than) $\epsilon_{VQ}$ would decrease. If this were not so, then there would (in principle) exist some compression ratio at which $\epsilon_{VQ}$ would asymptotically approach zero. Unfortunately, the converse occurs in practice, i.e., $\epsilon_{VQ}$ grows with CR due to codebook compression. In the presence of static encoding block size, $CR_d$ remains unchanged.

9.6.2. Pointwise Arithmetic. Pointwise VQ operations can be quite simple, but raise questions concerning tradeoffs between the propagation of representational error and codebook growth.

9.6.2.1. Observation.
We have found that analogues of pointwise arithmetic operations $\mathcal{O} : \mathbf{F}^{\mathbf{X}} \times \mathbf{F}^{\mathbf{X}} \to \mathbf{F}^{\mathbf{X}}$ over imagery compressed by a VQ transform $\mathcal{T}_{VQ} : \mathbf{F}^{\mathbf{X}} \to U^{\mathbf{Y}} \times (\mathbf{F}^{k\ell})^U$ can be derived merely by applying $\mathcal{O}$ to codebook exemplars indexed in $U^{\mathbf{Y}}$. However, this observation requires that one of the following three methods is employed in deriving $\mathcal{O}' : \mathbf{G}^{\mathbf{Y}} \times \mathbf{G}^{\mathbf{Y}} \to \mathbf{G}^{\mathbf{Y}}$, the analogue of $\mathcal{O}$ over $range(\mathcal{T}_{VQ})$:

Method 1. After Cosman et al. [1,12], if $\mathbf{b}$ and $\mathbf{d}$ are two encoding blocks in $domain(\mathcal{O})$ and $\mathbf{c}_b$, $\mathbf{c}_d$ are the corresponding codebooks in $p_2(range(\mathcal{T}_{VQ}))$, then we first locate the indices $u_b$ and $u_d$ in $U$ of the exemplars that match $\mathbf{b}$ and $\mathbf{d}$ within a prespecified error threshold $\epsilon$. Such indices are found in the compressed images $\mathbf{b}_c = \mathcal{T}_{VQ}(\mathbf{b})$ and $\mathbf{d}_c = \mathcal{T}_{VQ}(\mathbf{d})$. By forming a lookup table $L$ whose domain consists of the cross product of exemplars in $\mathbf{c}_b$ and $\mathbf{c}_d$, we obtain a new codebook $\mathbf{c}_e$, which has an entry that corresponds to the block $\mathbf{e} = \mathcal{O}(\mathbf{b},\mathbf{d})$.

Method 2. We have modified Method 1 to reduce codebook growth, as follows. Letting operands be defined as in Method 1, we have some codebook $\mathbf{c}_e$ that is the union of $\mathbf{c}_b$ and $\mathbf{c}_d$. We compute $\mathbf{e} = \mathcal{O}(\mathbf{b},\mathbf{d})$, then find the exemplar in $\mathbf{c}_e$ that best matches $\mathbf{e}$, although the resultant error may exceed $\epsilon$. The index $u_e$ of this exemplar is the corresponding value in the compressed image $\mathbf{e}_c = \mathcal{O}'(\mathbf{b}_c,\mathbf{d}_c)$.

Method 3. A further modification of Method 1 yields constant codebook size. Letting operands be defined as in Method 1, we assume that only one codebook $\mathbf{c}$ exists whose exemplars encode blocks $\mathbf{b}$ and $\mathbf{d}$ as indices $u_b$ and $u_d$ in $U$. Then, the computed block $\mathbf{e} = \mathcal{O}(\mathbf{b},\mathbf{d})$ is encoded in terms of the best-match exemplar in $\mathbf{c}$ whose index $u_e$ is substituted in $\mathbf{e}_c$.

9.6.2.2. Remark. The preceding methods have the following implementational advantages and disadvantages.

Method 1. Advantage. The error with which the result $\mathbf{e}$ is represented is minimized, since a large codebook $\mathbf{c}_e$ is constructed from the exemplars in $\mathbf{c}_b$ and $\mathbf{c}_d$. Thus, all combinations of the source exemplars are accounted for. Disadvantage.
Codebook size is large, and codebook growth is exponential in the number of cascaded operations. For example, if there are $M$ exemplars in $\mathbf{c}_b$ and $\mathbf{c}_d$, then $\mathbf{c}_e$ can have $M^2$ exemplars. Given $L$ cascaded operations, the number of possible exemplars in the $L$-th codebook is a power of $M$ that grows with $L$. This is clearly unacceptable for practical compression applications.

Method 2. Advantage. Codebook size is reduced from $|\mathbf{c}_b| \cdot |\mathbf{c}_d|$ to $|\mathbf{c}_b \cup \mathbf{c}_d|$, which is a significant savings. That is, if $\mathbf{c}_b$ and $\mathbf{c}_d$ are derived from similar training sets, then they are likely to have similar exemplars. In practice, codebook growth is usually sublinear in $L$ for images of natural scenes containing a small number of manufactured objects. Disadvantages. The reduction in codebook size implies a compromise that yields increased error, since the operation $\mathbf{e}_c = \mathcal{O}'(\mathbf{b}_c,\mathbf{d}_c)$ is no longer represented uniquely in $\mathbf{c}_e$. Additionally, the codebook still continues to grow with the number of cascaded VQ-based operations.

Method 3. Advantage. Codebook size remains constant as a function of the number of operands of $\mathcal{O}$ and the number of operations $L$. Disadvantage. Error accumulation can be unacceptably high, since the single codebook $\mathbf{c}$ is likely not equal to $\mathbf{c}_b \cup \mathbf{c}_d$, and thus does not contain all exemplars in $\mathbf{c}_b$ and $\mathbf{c}_d$. As a result, the error with which $\mathbf{b}_c$ and $\mathbf{d}_c$ represent $\mathbf{b}$ and $\mathbf{d}$ is higher than in Methods 1 or 2. When such error is propagated through $\mathcal{O}$, and the result of $\mathcal{O}$ is then quantized by representation with an exemplar in $\mathbf{c}$, higher error results. This is due to the fact that the error $\epsilon_{VQ}$ that is input to $\epsilon_{\mathcal{O}}$ is larger than in Methods 1 or 2.

We discuss issues of computational error and information loss, as follows.

9.6.2.3. Information loss. Let Observation 9.6.2.1 hold.
In Methods 1-3, there are three sources of error (in principle), namely, (1) the encoding error $\epsilon_{VQ}$ with which $\mathbf{b}$ and $\mathbf{d}$ are represented by $\mathbf{b}_c$ and $\mathbf{d}_c$; (2) the computational error inherent in the pointwise operation $\mathcal{O}$, which is denoted by $\epsilon_{\mathcal{O}}$; and (3) the error with which the result $\mathbf{e} = \mathcal{O}(\mathbf{b},\mathbf{d})$ is encoded in the compressed result $\mathbf{e}_c$. The resultant error specific to an analogous operation $\mathcal{O}'$, which we denote by $\epsilon_{\mathcal{O}'}$, has the following general formulation:

$$\epsilon_{\mathcal{O}'} \le \epsilon_{VQ}\left( \epsilon_{\mathcal{O}}\left[ \epsilon_{VQ}(\epsilon_b), \epsilon_{VQ}(\epsilon_d) \right] \right). \tag{379}$$

If $\mathcal{O} = +$, then $\mathcal{O}' = +$ and $\epsilon_{\mathcal{O}'}$ is given by

$$\epsilon_{\mathcal{O}'} \le \epsilon_{VQ}\left( \epsilon_{VQ}(\epsilon_b) + \epsilon_{VQ}(\epsilon_d) \right) \approx 3 \cdot \epsilon_{VQ}, \tag{380}$$

where the approximation is formulated under the realistic assumption that $\epsilon_{VQ}$ is constant for codebooks $\mathbf{c}_b$, $\mathbf{c}_d$, and $\mathbf{c}_e$ if Method 1 is employed.

Since Method 1 forms the codebook $\mathbf{c}_e$ from the operand codebooks $\mathbf{c}_b$ and $\mathbf{c}_d$, representational error is less than in Methods 2 and 3. However, this assumes that the statistics of $\mathbf{e}$ upon which $\mathbf{c}_e$ is based are preserved when $\mathcal{O}$ is applied to $\mathbf{c}_b$ and $\mathbf{c}_d$. With linear operations $\mathcal{O}$, input statistics (of $\mathbf{b}$ and $\mathbf{d}$) are preserved in $\mathbf{e}$, due to the linearity property of expected values. However, such is not the case for nonlinear operations such as $\log_{\mathbf{b}}(\mathbf{d})$.

Since Method 2 quantizes the result $\mathbf{e}$ into an exemplar in $\mathbf{c}_b \cup \mathbf{c}_d$, if the statistics of $\mathbf{b}$ and $\mathbf{d}$ are preserved in $\mathbf{e}$, then there is no additional representation error beyond that incurred in Method 1. In practice, we have found that, for most pointwise operations (particularly transcendental functions), $\mathbf{c}_b \cup \mathbf{c}_d$ may poorly represent blocks that have values near the extrema of $range(\mathbf{e})$. The reasons for this behavior, which are being considered in ongoing research, likely derive from the LMS clustering used in the generalized Lloyd algorithm with which $\mathbf{c}_b$ and $\mathbf{c}_d$ are formed. That is, the LMS algorithm appears to produce clusters less accurately when cluster size is small, as occurs near range extrema that occur infrequently in training set imagery.
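As an illustration of the constant-codebook approach (Method 3), the sketch below precomputes a table over index pairs so that a pointwise operation on two index images reduces to lookups. It rests on assumptions not made in the text: blocks are flattened vectors, the operation is applied to the exemplars themselves (in the spirit of Observation 9.6.2.1) instead of to the source blocks, best match is by squared Euclidean distance, and all names are hypothetical.

```python
def sq_dist(v, w):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(v, w))


def best_match(block, codebook):
    """Index of the exemplar that best matches block."""
    return min(range(len(codebook)), key=lambda u: sq_dist(block, codebook[u]))


def build_method3_lut(codebook, op):
    """lut[ub][ud] = index of the exemplar nearest to op(c[ub], c[ud]).

    The same codebook encodes both operands and the result, so the
    codebook size stays constant at the cost of requantization error.
    """
    size = len(codebook)
    lut = []
    for ub in range(size):
        row = []
        for ud in range(size):
            e = [op(x, y) for x, y in zip(codebook[ub], codebook[ud])]
            row.append(best_match(e, codebook))
        lut.append(row)
    return lut


def apply_lut(b_c, d_c, lut):
    """Pointwise analogue over two index sequences of equal length."""
    return [lut[ub][ud] for ub, ud in zip(b_c, d_c)]


codebook = [[0, 0], [1, 1], [2, 2], [4, 4]]
lut = build_method3_lut(codebook, lambda x, y: x + y)
print(apply_lut([1, 2, 3], [1, 2, 3], lut))
```

In this toy codebook the sum of exemplar 3 with itself, (8, 8), has no close exemplar and is mapped back to (4, 4), a small instance of the poor representation near the extrema of the result range discussed above.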
In Method 3, a single codebook $\mathbf{c}$ is employed without regard to differences in statistics among $\mathbf{b}$, $\mathbf{d}$, and $\mathbf{e}$. As a result, all $\epsilon_{VQ}$ could be higher than in Methods 1 or 2, thereby yielding high $\epsilon_{\mathcal{O}'}$. As a result, we prefer to employ Method 3 only when constant codebook size is required. Method 2 presents a useful compromise between codebook growth and error accumulation.

9.6.2.4. Example. We computed codebook growth and error for VQ codebooks derived from over 100 natural images of outdoor scenes containing small, man-made targets. Figure 16a (b) illustrates the differences in codebook size (error) for the original images and the images subjected to addition using Methods 1-3. Note that the codebook grows maximally for Method 1, and grows least for Method 2, as predicted by theory. Constant codebook size is exhibited by Method 3. Note that the error growth of Method 1 is typical for addition, due to the additive error propagation law. Method 3 did not lend itself to zero codebook error, hence that data point is not represented in Figure 16b. Note that the results shown in the following figure vary with (a) the type of operation employed, (b) the image data, and (c) the type of codebook clustering algorithm employed (in this case, a textbook LMS clustering algorithm).

Figure 16. Codebook growth and error as a function of codebook size and Methods 1-3 for VQ pointwise addition: (a) input and output codebook sizes M and N (axis: log(M), bits), (b) input and output errors $\epsilon_{VQ}$ and $\epsilon_{\mathcal{O}}$ (axis: bits/pixel).

9.6.3. Global Reduce Operations.

9.6.3.1. Observation. VQ imagery has attractive properties for computing global-reduce operations. For example, if one has the sum of each exemplar stored in an array of partial sums, then the VQ image histogram, which describes the probability of a given exemplar, also describes the frequency-of-occurrence of each partial sum.
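The histogram-weighted reduction described in this observation can be sketched as follows, assuming flattened exemplars and a flat list of exemplar indices; the function names are hypothetical. The maximum is taken only over exemplars that actually occur in the index image, which is a refinement assumed here and not stated in the text.

```python
from collections import Counter


def vq_global_sum(indices, codebook):
    """Image sum from exemplar partial sums weighted by the index histogram."""
    histogram = Counter(indices)                          # h(u)
    partial_sums = [sum(exemplar) for exemplar in codebook]
    return sum(count * partial_sums[u] for u, count in histogram.items())


def vq_global_max(indices, codebook):
    """Image maximum as the maximum of the maxima of the exemplars in use."""
    return max(max(codebook[u]) for u in set(indices))


codebook = [[0, 0, 0, 0], [1, 2, 3, 4], [5, 5, 5, 5]]
indices = [0, 1, 1, 2, 1]
print(vq_global_sum(indices, codebook))   # sum over the decoded image
print(vq_global_max(indices, codebook))   # maximum over the decoded image
```

The exemplar sums are computed once per codebook, so the per-image cost is one histogram pass over the indices plus one multiply-add per distinct exemplar, with no pixel-level work.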
It is then a simple matter to additively combine the partial sums, weighted by their frequency-of-occurrence. Similarly, the image maximum is the maximum of all exemplar maxima, and symmetrically for the minimum. The following theorem is illustrative.

9.6.3.2. Theorem. If Assumption 9.5.4.2 holds and $\mathcal{T}$ is a vector quantization transform that produces a compressed image $\mathbf{a}_c \in U^{\mathbf{Y}}$ and a codebook $\mathbf{c} \in (\mathbf{F}^{k\ell})^U$ from a source image $\mathbf{a} \in \mathbf{F}^{\mathbf{X}}$ (per Definition 9.6.1.1), then duals of the global reduce operations $\Gamma \in \{\Sigma, \vee, \Pi\}$ are given by:

$$\Gamma(\mathbf{a}) \approx \mathop{\Gamma}_{y \in \mathbf{Y}} \Gamma(\mathbf{c}(\mathbf{a}_c(y))). \tag{381}$$

Additionally, if $\mathbf{h} \in \mathbb{N}^U$ denotes the histogram of a compressed image $\mathbf{a}_c \in U^{\mathbf{Y}}$, then the following dual formulations hold:

$$\Sigma\mathbf{a} \approx \sum_{u \in domain(\mathbf{c})} \mathbf{h}(u) \cdot \Sigma\mathbf{c}(u), \qquad \vee\mathbf{a} \approx \bigvee_{u \in domain(\mathbf{c})} \vee\mathbf{c}(u). \tag{382}$$

Proof. The proof is outlined as follows. Assuming that the givens hold, the proof follows directly from the fact that the binary operation $\gamma$, the basis function of $\Gamma$, is associative and commutative. Additionally, the approximation results from the information loss inherent in VQ.

Image-template operations over VQ imagery have associated implementational difficulties that are similar to those encountered in processing BTC- and VPIC-encoded imagery. Per the comments at the end of the preceding section, we are currently investigating VQ edge detection and morphological erosion operations, and discuss such methods at a high level in the following chapter.

CHAPTER 10
APPLICATIONS OF COMPRESSIVE IMAGE PROCESSING

Having discussed the BTC, VPIC, VQ, and JPEG transformations, we next describe several compressive operations that are useful in target recognition practice. We begin with smoothing via local averaging.

10.1. Compressive Smoothing

Local averaging is customarily employed in reducing the visual effect of noise, to increase the definition of targets against natural backgrounds.
Such averaging also helps reduce effects such as variance that results from pseudorandom textures (e.g., grass and leaves) in images of natural scenes.

10.1.1. Compressive Local Averaging. For purposes of discussion, we consider local averaging of imagery using a unitarily-valued Moore or von Neumann template.

10.1.1.1. Concept. Image smoothing can be approximated by compressive smoothing within an encoding block via replicating the block mean at each pixel of a given block. Then, we convolve a blur function (e.g., a unitarily-valued von Neumann template) with the compressed image to obtain smoothed block boundaries. A key postprocessing step involves applying a horizontal (vertical) one-dimensional unitary von Neumann template to smooth vertical (horizontal) block boundaries, followed by a two-dimensional unitary von Neumann template to smooth pixels, where the source point of such a pixel is the intersection of adjacent encoding blocks.

10.1.1.2. Algorithm. Let a source image $\mathbf{a} \in \mathbf{F}^{\mathbf{X}}$ be encoded by a transform $\mathcal{T} : \mathbf{F}^{\mathbf{X}} \to \mathbf{G}^{\mathbf{Y}}$ that tessellates $\mathbf{a}$ into $k \times \ell$-pixel blocks $\mathbf{b}(y) \in \mathbf{F}^{k\ell}$, $y \in \mathbf{Y}$. Further assume that $\mathcal{T}$ preserves in the compressed image $\mathbf{a}_c = \mathcal{T}(\mathbf{a})$ the block mean $\mu$ as $p_1(\mathbf{a}_c)$. Let the spatially invariant templates $\mathbf{r},\mathbf{s},\mathbf{t} \in (\mathbf{F}^{\mathbf{X}})^{\mathbf{X}}$ be defined by their weights as follows:

$$\mathbf{r} = (1\ \ \mathit{1}\ \ 1), \quad \mathbf{s} = \mathbf{r}', \quad \text{and} \quad \mathbf{t} = \begin{pmatrix} & 1 & \\ 1 & \mathit{1} & 1 \\ & 1 & \end{pmatrix}, \tag{383}$$

where prime (') denotes transpose and the target point is italicized. Let $\mathbf{r}_c,\mathbf{s}_c,\mathbf{t}_c \in (\mathbf{F}^{\mathbf{Y}})^{\mathbf{Y}}$ be compressive templates whose weight matrices are identical to those of $\mathbf{r}$, $\mathbf{s}$, and $\mathbf{t}$, respectively. Note that the preceding templates need not be limited in size to an $m \times n$-pixel domain, where $m,n = 3$. Rather, $m$ or $n$ can be as large as $\wedge(\lfloor k/2 \rfloor, \lfloor \ell/2 \rfloor)$, since the block mean can be assigned only to the central region of the source block domain.

We simulate the action of $\mathcal{O} : \mathbf{F}^{\mathbf{X}} \times (\mathbf{F}^{\mathbf{X}})^{\mathbf{X}} \to \mathbf{F}^{\mathbf{X}}$ over $range(\mathcal{T})$ using a dual compressive operation $\Omega$ into $\mathbf{F}^{\mathbf{X}}$ such that

$$\mathbf{b} = \Omega(\mathbf{a}_c,\mathbf{r}_c,\mathbf{s}_c,\mathbf{t}_c) \approx \mathcal{O}(\mathbf{a},\mathbf{t}), \tag{384}$$

which is computed via the following algorithm.
Let the indexing functions $h$ and $h^*$ be as given in Definition 9.1.1.1, and let the indexing functions $c, r, j : \mathbf{Y} \to 2^{\mathbf{X}}$ respectively return the source neighborhood associated with the column, row, and row-column intersection boundaries of the $y$-th encoding block.

Step 1. Let the column averages be stored in $\mathbf{d}_c = p_1(\mathbf{a}_c) \oplus \mathbf{r}_c$.

Step 2. Similarly, the row averages are stored in $\mathbf{d}_r = p_1(\mathbf{a}_c) \oplus \mathbf{s}_c$.

Step 3. Store the row-column intersection averages in $\mathbf{d}_j = p_1(\mathbf{a}_c) \oplus \mathbf{t}_c$.

Step 4. For each $y \in \mathbf{Y}$ do:

$$\begin{aligned} \mathbf{b}|_{h^*(y)} &= p_1(\mathbf{a}_c(y)) \\ \mathbf{b}|_{c(y)} &= \mathbf{d}_c(y) \\ \mathbf{b}|_{r(y)} &= \mathbf{d}_r(y) \\ \mathbf{b}|_{j(y)} &= \mathbf{d}_j(y). \end{aligned} \tag{385}$$

The algorithm operates as follows. First, the local averages are taken at block boundaries. This can be done via the compressive convolution shown in Steps 1-3. In Step 4, we merely input the block means to the non-boundary areas of each block domain $h^*(y)$, then input boundary values at points given by $c$, $r$, and $j$.

Figure 17. Block configuration for compressive local averaging, showing a 3x3-block area of $\mathbf{X}$ and the block neighborhoods (dashed boxes) on $\mathbf{X}$ returned by $c$, $r$, and $j$.

10.1.1.3. Observation. As shown in Part 1, if we have an image smoothing operation $\mathcal{O}$ and its compressive dual $\Omega$, we can define a dual CCS

$$\mathcal{X} = \left( \{\mathbf{F}^{\mathbf{X}}, (\mathbf{F}^{\mathbf{X}})^{\mathbf{X}}; \mathcal{O}\}, \{\mathcal{T}, U; \emptyset\}, \{\mathbf{G}^{\mathbf{Y}}, (\mathbf{F}^{\mathbf{Y}})^{\mathbf{Y}}; \Omega\} \right), \tag{386}$$

where $\mathcal{T}$ was given in the preceding section and $U$ is an identity transform on the weight matrices of templates $\mathbf{r},\mathbf{s},\mathbf{t} \in (\mathbf{F}^{\mathbf{X}})^{\mathbf{X}}$ that produces templates $\mathbf{r}_c,\mathbf{s}_c,\mathbf{t}_c \in (\mathbf{F}^{\mathbf{Y}})^{\mathbf{Y}}$.

10.1.1.4. Complexity and information loss. Steps 1-3 require $O(|\mathbf{Y}|)$ multiplications and additions or, more precisely, $11 \cdot |\mathbf{Y}|$ multiplications and $8 \cdot |\mathbf{Y}|$ additions. The definition of $h$ can be modified slightly, such that Step 4 requires $|\mathbf{X}|$ I/O operations when block means are placed in non-boundary regions only. Neglecting I/O cost, the total complexity of compressive smoothing is given by

$$W_{smooth}(\mathbf{Y}) = |\mathbf{Y}| \cdot \left[ \left( |\mathcal{S}(\mathbf{r}_c(y))| + |\mathcal{S}(\mathbf{s}_c(y))| + |\mathcal{S}(\mathbf{t}_c(y))| \right) \text{multiplications} + \left( |\mathcal{S}(\mathbf{r}_c(y))| + |\mathcal{S}(\mathbf{s}_c(y))| + |\mathcal{S}(\mathbf{t}_c(y))| - 3 \right) \text{additions} \right], \quad \mathbf{x} \in \mathbf{X},\ y \in \mathbf{Y}.$$
(387)

Neglecting possible differences between the number of multiplications and additions per block, the corresponding sequential computational speedup inherent in $\mathcal{X}$ can be approximated as

\[ \eta \approx \frac{|\mathbf{X}| \cdot |\mathcal{S}(\mathbf{t}_x)| \cdot (\Delta t_m + \Delta t_a)}{|\mathbf{Y}| \cdot \big(|\mathcal{S}(\mathbf{r}_c(y))| + |\mathcal{S}(\mathbf{s}_c(y))| + |\mathcal{S}(\mathbf{t}_c(y))|\big) \cdot (\Delta t_m + \Delta t_a)} = CR_d \cdot \frac{|\mathcal{S}(\mathbf{t}_x)|}{|\mathcal{S}(\mathbf{r}_c(y))| + |\mathcal{S}(\mathbf{s}_c(y))| + |\mathcal{S}(\mathbf{t}_c(y))|}, \tag{388} \]

where $\Delta t_m$ and $\Delta t_a$ denote the delay incurred by a multiplication or addition operation. Thus, $\mathcal{X}$ exhibits a speedup of $O(CR_d)$ or greater if the ratio of the total size of the compressive templates to the size of the uncompressed template is less than unity. For example, in Section 10.1.1.2, $\eta = CR_d \cdot 8/11$. If we take the convenient case of a 4x4-pixel encoding block, where $CR_d = 16$, then $\eta = 11.6$, a nontrivial speedup.

10.1.1.5. Information loss. Note that $\mathcal{O}'$ represents the convolution of $\mathbf{a}$ with $\mathbf{t}$ by assigning block means $\mu$ to the central region of each block and local averages to the boundary regions. Thus, $\mathcal{O}'$ erroneously represents the local average in the center of the block if $\mu$ does not represent the local average. Similarly, the boundary averages are taken across the block means, and thus erroneously represent boundary values. The representational error depends on the relationship between template size and weight, on the blocksize, and on the template configuration (i.e., von Neumann or Moore templates). For example, if $\mathbf{t}$ is 5 pixels square and $k, l = 5$, then there will be a 2-pixel wide region on either side of the encoding block boundaries that is portrayed incorrectly if the size of $\mathbf{t}_c$ is less than 5x5 pixels. If $\mathbf{t}$ is a unitarily-valued Moore template, then the mean of an encoding block that is the same size as $\mathbf{t}$ clearly portrays the local average at the block center. In the remaining cases, which appear to predominate in practice, the block mean does not portray the local average at the center of the encoding block.
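The two cases distinguished above can be checked numerically. The following is a minimal Python sketch, not the dissertation's implementation: the helper names and the pixel values are hypothetical, and the averages are normalized by the template size (5 for von Neumann, 9 for Moore), which is an assumption made here so that the results are directly comparable with the block mean.

```python
def block_mean(block):
    """Mean of all pixels in a rectangular block given as a list of rows."""
    total = sum(sum(row) for row in block)
    return total / (len(block) * len(block[0]))


def von_neumann_average(img, i, j):
    """Unit-weight 5-point (von Neumann) average centred on interior pixel (i, j)."""
    return (img[i][j] + img[i - 1][j] + img[i + 1][j]
            + img[i][j - 1] + img[i][j + 1]) / 5


def moore_average(img, i, j):
    """Unit-weight 3x3 (Moore) average centred on interior pixel (i, j)."""
    return sum(img[i + di][j + dj]
               for di in (-1, 0, 1) for dj in (-1, 0, 1)) / 9


# Hypothetical 4x4 encoding block containing one bright pixel.
block4 = [[0, 0, 0, 0],
          [0, 8, 0, 0],
          [0, 0, 0, 0],
          [0, 0, 0, 0]]
mu4 = block_mean(block4)                   # 8 / 16
avg4 = von_neumann_average(block4, 1, 1)   # 8 / 5: differs from the block mean

# Hypothetical 3x3 block with a 3x3 Moore template (same size as the block).
block3 = [[1, 2, 3],
          [4, 5, 6],
          [7, 8, 18]]
mu3 = block_mean(block3)                   # 54 / 9
avg3 = moore_average(block3, 1, 1)         # identical to the block mean
```

In the first case the block mean understates the local average at the bright pixel, which is the representational error described in this section; in the second case the two quantities coincide because the template support is exactly the block.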
Thus, we specify $U$ as the identity transform applied to the weight matrices of $\mathbf{r}$, $\mathbf{s}$, and $\mathbf{t}$, in order to preserve the size of $\mathbf{t}$ in $\mathbf{t}_c$.

10.1.1.6. Observation. An additional problem with compressive smoothing is that the weights of $\mathbf{r}$, $\mathbf{s}$, and $\mathbf{t}$ must be preserved in $\mathbf{r}_c$, $\mathbf{s}_c$, and $\mathbf{t}_c$ so that the boundary averages can be computed over the compressed image in the same manner as averaging would be performed in the image smoothing operation. The formulation of $U$ as an identity transformation ensures this result. We next provide an example of compressive smoothing applied to multi-target imagery taken against a grassy background.

10.1.1.7. Example. Let the source image $\mathbf{a}$ shown in Figure 18a (i.e., a field of small targets) be convolved with the unitary von Neumann template $\mathbf{t}$ to yield the image of Figure 18b. Let $\mathbf{a}$ be compressed by BTC (4x4-pixel blocks) to yield the mean image shown in Figure 18c. After the compressive smoothing algorithm is applied to Figure 18c, the image of Figure 18d results. As can be seen from the figure, compressive smoothing of high-resolution imagery yields slightly increased blur with respect to noncompressive smoothing when the blocksize is larger than the template dimension. This effect can be corrected by using a larger template, which decreases computational efficiency. In this example, a nontrivial speedup of $\eta = 10.6$ was achieved on a Sun Sparcstation-10 running Sun FORTRAN-77 on SunOS Version 4.1. This efficiency is clearly less than $CR_d$, due to the overhead (in addition to the $O(CR_d)$ cost of local averaging) incurred by the boundary-smoothing templates. When considering the more involved operation of VPIC-based edge detection, we shall see that computational efficiencies that approach $CR_d$ are achievable in practice.

Figure 18.
Example of compressive smoothing at $CR_d$ = 16:1: (a) source image, (b) locally averaged image using unitary von Neumann template, (c) compressed array of block means over which processing actually occurs, (d) decompressed result of compressive smoothing.

We next consider the problem of smoothing VQ-compressed imagery.

10.1.2. VQ-based Compressive Smoothing. We begin by noting several useful properties of the VQ codebook exemplars.

10.1.2.1. Observation. Recall that the VQ codebook $\mathbf{c}$ contains exemplars that represent one or more encoding blocks of the source image. This implies two strategies for simulating image smoothing over VQ-compressed imagery. First, one can alter the codebook by taking the block mean of each exemplar, then reconstruct the VQ-compressed image and apply boundary smoothing operations similar to those described in Section 10.1.1.2. The boundary smoothing ameliorates the "blocking effect" that is frequently found to be objectionable in VQ reconstructions. A second method is merely to smooth the codebook exemplars and substitute them into the reconstructed image. This has the advantage of preserving smoothed within-block features, but has the disadvantage of exaggerating the blocking effect, because the local (blockwise) smoothing operation introduces boundary effects. Given a codebook of size $M$, both methods have $O(M)$ overhead in codebook processing. However, the first method has an additional $O(|\mathbf{Y}|)$ overhead, where $\mathbf{Y}$ denotes the compressed domain. Given the preceding disadvantages, we developed a third method of VQ-based smoothing, described in the following section, which combines the best features of the preceding methods.

10.1.2.2. Algorithm. Given a source image $\mathbf{a}$ on $\mathbf{X}$ that is transformed by a VQ transform $T$ to yield a compressed image $\mathbf{a}_c$ on $\mathbf{Y}$ and codebook $\mathbf{c}$, we perform the following steps to compute an improved dual of compressive smoothing:

Step 1.
Apply a smoothing operation directly to each exemplar in $\mathbf{c}$, thereby producing a new codebook $\mathbf{d}$ that contains smoothed exemplar blocks. Thus, $\mathbf{a}_c$ taken together with $\mathbf{d}$ portrays a smoothed version of $\mathbf{a}$, but exhibits blocking effects due to the boundary effects of blockwise smoothing.
Step 2. Implement the source image reconstruction (decompression) step to produce image $\mathbf{b}$ on $domain(\mathbf{a})$ by substituting the exemplars from $\mathbf{d}$ that are indexed in $\mathbf{a}_c$.
Step 3. Apply the boundary smoothing templates $\mathbf{r}_c$, $\mathbf{s}_c$, and $\mathbf{t}_c$ defined in Section 10.1.1.2 to the encoding block boundaries of $\mathbf{b}$ to reduce boundary and blocking effects.

In contrast to the preceding method of computing block means, within-block features are preserved and between-block variance is smoothed.

10.1.2.3. Complexity and information loss. Given a codebook of size $M$ and a smoothing template $\mathbf{t}$, Step 1 of the preceding algorithm requires $M \cdot |\mathcal{S}(\mathbf{t})|$ multiplications and $M \cdot (|\mathcal{S}(\mathbf{t})| - 1)$ additions. Letting $\mathbf{X} = domain(\mathbf{a})$, Step 2 requires $|\mathbf{X}|$ I/O operations. Step 3 requires one application each of $\mathbf{r}_c$ and $\mathbf{s}_c$ as well as two applications of $\mathbf{t}_c$ per pixel in $\mathbf{Y}$. The total work of compressive smoothing using VQ imagery is therefore given by:

\[ W_{\mathrm{VQsm}}(\mathbf{Y}, M, k, l) = Mkl \cdot |\mathcal{S}(\mathbf{t}_x)| + |\mathbf{Y}| \cdot \big(|\mathcal{S}(\mathbf{r}_c(y))| + |\mathcal{S}(\mathbf{s}_c(y))| + \cdots \]

[The remainder of this equation and of the section is missing from the source at the page break.]

10.2. Compressive Edge Detection. Various properties of compressed image formats lend themselves to exploitation in image segmentation tasks. In this case, we examine techniques by which the edge-preserving properties of BTC, VPIC, JPEG, and VQ transforms can be employed to simulate edge detection over compressed imagery. This approach is useful for variance-based target classification [62,81].

10.2.1. BTC-based Edge Detection.

10.2.1.1. Observation. The BTC transform discussed in Chapter 9 computes the mean $\mu$ and standard deviation $\sigma$ (gradient $g$) of an encoding block as a measure of block variance.
The rationale for this computation is that blocks which contain more edge information (i.e., stronger edge transitions) exhibit greater variance. By thresholding the compressed image based on $\sigma$ or $g$, then combining stored bitmap patterns with $\sigma$ or $g$, BTC- and VPIC-compressed imagery can be efficiently manipulated to approximate edge imagery in the source domain. We thus specify the following dual operations in blockwise fashion.

10.2.1.2. Algorithm. Let an image $\mathbf{a}$ on $\mathbf{X}$ be compressed by $T$ = BTC to yield a compressed image $\mathbf{a}_c$ on $\mathbf{Y}$. The compressed pixel values have the form $(\mu, \sigma, \mathbf{d})$, which denote the mean, standard deviation, and zero-crossing bitmap of the corresponding encoding block in $\mathbf{a}$. If a mean block is portrayed by a compressed pixel value $\mathbf{a}_c(y)$, where $y$ is a point in $\mathbf{Y}$, then $\sigma$ and $\mathbf{d}$ are zero-valued.

Compressive edge detection using BTC imagery requires that the bitmap $\mathbf{d}$ be transformed using an edge detection operation $e : \{0,1\}^{k \times l} \to \{0,1\}^{k \times l}$ that computes edge detection over bits in the interior of the block and conditionally alters bits on block boundaries. For example, the mapping

e : [4x4 input bitmap] -> [4x4 output bitmap; matrix entries illegible in the source] (391)

inverts the bit in row 3, column 2, which does not touch the block boundary. The reason for this discrimination between interior and boundary information is that the upper boundary of a given block (called $\mathbf{b}_1$, for convenience) may be adjacent to a block $\mathbf{b}_2$ that has unitary bits which are adjacent to unitary bits in $\mathbf{b}_1$. In such cases, an edge may not be present at a block boundary that is subtended by a unitarily-valued region. For example, consider the following juxtaposition of blocks and the corresponding edge detection:

e : [two vertically adjacent bitmaps] -> [edge-detected bitmaps; matrix entries illegible in the source] (392)

Note that the block boundary (denoted by a dashed line) is intersected by a horizontal region or run of three unitary bits.
Due to the edge detection algorithm for this type of block, which is based (in this case) upon Moore connectivity, the boundary-crossing region is reconfigured by $e$ [the before and after bit patterns are illegible in the source]. In contrast, given von Neumann connectivity, the run of pixels that spans the block boundary would be preserved. We also note that, following edge detection, $e$ performs the bipolar transform on all bits of the edge-detected bitmap $\mathbf{d}$, such that $e(\mathbf{d}) = 2\mathbf{d} - 1$. The design of $e$ for VPIC-compressed imagery is discussed later in this chapter.

Given the preceding operations, compressive edge detection over BTC imagery is achieved by performing the following steps:

Step 1. Initialize a decompressed image on $\mathbf{X}$ denoted by $\mathbf{b} = 0$.
Step 2. For each $y \in \mathbf{Y}$, do:

    if ($p_2(\mathbf{a}_c(y)) = 0$) then
        $\mathbf{b}|_{h^*(y)} = 0$
    else
        $\mathbf{b}|_{h^*(y)} = p_2(\mathbf{a}_c(y)) \cdot e[p_3(\mathbf{a}_c(y))]$        (393)
    endif,

where $h^*$ was defined previously. Thus, mean blocks are zeroed while edge blocks are treated as edge maps that are multiplied by the edge magnitude, which we previously showed is approximated by the standard deviation $\sigma$.

10.2.1.3. Complexity. Given a total of $|\mathbf{Y}|$ blocks, of which a fraction $r$ are edge blocks, the preceding BTC-based edge detection algorithm requires $(1 - r) \cdot |\mathbf{Y}| \cdot kl$ I/O operations to set mean blocks to zero, $r \cdot |\mathbf{Y}|$ invocations of $e$, and $r \cdot |\mathbf{Y}| \cdot kl$ sign changes to set the edge magnitudes to $\pm\sigma$. Assuming that the I/O overhead and sign change (a one-bit operation) can be neglected, the overhead of edge detection is primarily due to the $r \cdot |\mathbf{Y}|$ invocations of $e$. Our current implementation of $e$ requires at least $2kl$ comparison operations for each block. Thus, the cost of BTC-based edge detection is given by:

\[ W_{\mathrm{BTCed}}(\mathbf{Y}, r) \geq 2r \cdot |\mathbf{Y}| \cdot CR_d = 2r \cdot |\mathbf{X}| \ \text{comparisons}. \tag{394} \]

By way of comparison, computation of the image gradient using a Prewitt operator [82], which requires a three-pixel von Neumann template, incurs work of $3 \cdot |\mathbf{X}|$ addition operations.
This represents one of the more efficient (but not necessarily the most effective) methods of edge detection in the image domain. Thus, the efficiency that results from BTC-based edge detection is given by:

\[ \eta = \frac{W_{\mathrm{ed}}(\mathbf{X})}{W_{\mathrm{BTCed}}(\mathbf{Y}, r)} \approx \frac{3 \cdot |\mathbf{X}| \ \text{additions}}{2r \cdot |\mathbf{X}| \ \text{comparisons}} \approx \frac{1.5}{r}. \tag{395} \]

In edge-intensive imagery (e.g., leaves, woodland scenes, rushing water, views of buildings, or text images) we have found that $0.5 < r < 0.8$ is typical. Thus, in the preceding equation the efficiency would range from two- to three-fold. However, recall that this is the minimum efficiency achievable, since our cost analysis is based on the Prewitt operator. Given Sobel and Kirsch edge detectors [83,84], which are more burdensome to compute, we have found in practice that tenfold decreases in the computational cost are not unusual with compressive edge detection.

10.2.1.4. Information loss. Since the edge detection operation $e$ discards block means and bits from the zero-crossing bitmap, much information about object structure is lost. Unlike a standard edge image, the image resulting from BTC-based edge detection cannot necessarily be inverted to yield a close approximation to the source image. This is due to (a) possible loss of gradient magnitude in the BTC block representation and (b) the losses inherent in quantizing a greyscale block into a block mean and bitmap. Furthermore, $e$ yields only an approximation to the edge detection of the bitmaps, and its design is thus key to producing an accurate approximation of the resultant edge image. As a result, we are currently researching methods of configuring $e$ to achieve a close approximation to results obtained via image-domain edge detection. Related examples are given in Section 10 for VPIC edge detection and morphological operations.

10.2.2. VPIC-based Edge Detection. The following description of a VPIC edge detection algorithm is similar to BTC-based edge detection, but requires that the bitmap be retrieved from a codebook.
Since the codebook is small, the edge detection operation $e$ can be implemented in a lookup table that accepts codebook indices taken from the nearest neighbors of the compressed pixel of interest. We provide examples from Boolean and greyscale imagery.

10.2.2.1. Algorithm. Let an image $\mathbf{a}$ on $\mathbf{X}$ be compressed by $T$ = VPIC to yield a compressed image on $\mathbf{Y}$ whose pixel values are of the form $(\mu, g, \theta, j)$, which denote the corresponding source block mean $\mu$, gradient intensity $g$, gradient orientation $\theta$, and index $j$ of an exemplar in the VPIC codebook $\mathbf{c}$. We assume that the exemplar is rotated through angle $\theta$ to yield a best match with the block's zero-crossing bitmap. If a mean block is portrayed by a compressed pixel value $\mathbf{a}_c(y)$, where $y$ is a point in $\mathbf{Y}$, then the remainder of the parameters in $\mathbf{a}_c(y)$ are zero-valued.

Compressive edge detection using VPIC-compressed imagery involves an edge transform $e$ that has a similar high-level description to the transform $e$ previously employed for BTC-based edge detection. We also employ a rotation transform $rot : \mathbf{R} \times \{0,1\}^{k \times l} \to \{0,1\}^{k \times l}$ that accepts $\theta$ and an exemplar $\mathbf{c}(j)$, and outputs the exemplar rotated through angle $\theta$.

Since VPIC quantizes the gradient angle $\theta$ into a small number of bins and employs only a few exemplars that are deemed visually significant, the actual codebook size is quite small, rarely exceeding eight exemplars, which can be indexed using three bits. As a result, rather than developing an involved analytical representation of $e$, we can employ a lookup table $\mathbf{g}$ to compute $e$. For example, let $\mathbf{g}$ accept the $y$-th encoding block and its nearest neighbor blocks, and output the $y$-th edge-detected block. If $\mathbf{X}$ has von Neumann connectivity, then a given block has four neighbors, and $|domain(\mathbf{g})| = (4 + 1)(3) = 15$ bits, or 32K elements. The corresponding LUT has size $2^{15} \cdot kl$ bits.
Thus, if $k, l = 4$, then each element occupies two bytes, and the LUT size is 64K bytes, which is feasible for some currently available parallel processors. For Moore connectivity, each block has eight neighbors and the LUT domain has 30 bits = 1G elements, which is prohibitively large. Thus, we prefer von Neumann connectivity for our compressive processing operations.

Compressive edge detection over VPIC-compressed imagery is achieved by performing the following steps:

Step 1. Initialize $\mathbf{b}$, an image on $\mathbf{X}$, to zero.
Step 2. For each $y \in \mathbf{Y}$, do:

    if ($p_2(\mathbf{a}_c(y)) = 0$) then
        $\mathbf{b}|_{h^*(y)} = 0$
    else
        $\mathbf{b}|_{h^*(y)} = p_2(\mathbf{a}_c(y)) * rot\big(p_3(\mathbf{a}_c(y)),\ e[\mathbf{c}(p_4(\mathbf{a}_c(y)))]\big)$        (396)
    endif.

Apart from block rotation, the treatment of edge and mean blocks is highly similar to BTC-based edge detection.

10.2.2.2. Complexity. Given a total of $|\mathbf{Y}|$ blocks, of which a fraction $r$ are edge blocks, the preceding VPIC-based edge detection algorithm requires $(1 - r) \cdot |\mathbf{Y}| \cdot kl$ I/O operations to set mean blocks to zero. Since edge detection is performed over a small codebook of $M$ exemplars, the cost of $M$ invocations of $e$ is negligible. The only significant processing overhead is therefore $r \cdot |\mathbf{Y}| \cdot kl$ sign changes (a one-bit operation). VPIC-based edge detection is thus very fast, and requires work of only $r \cdot |\mathbf{X}|$ one-bit operations. The resultant efficiency is given by

\[ \eta = \frac{W_{\mathrm{ed}}(\mathbf{X})}{W_{\mathrm{VPICed}}(\mathbf{Y}, r)} \approx \frac{3 \cdot |\mathbf{X}| \ \text{additions}}{r \cdot |\mathbf{X}| \ \text{comparisons}} \approx \frac{3 \cdot \Delta t_a}{r}, \tag{397} \]

where $\Delta t_a$ denotes the processing time (or work) required for addition. In edge-intensive imagery, we have found that $0.5 < r < 0.8$ is typical. Thus, in the preceding equation, $3.75 < \eta < 6$, or roughly four- to six-fold. However, recall that this is the minimum efficiency achievable. Compared with Sobel and Kirsch edge detectors, which are more burdensome to compute, we have found that twenty-fold efficiencies can be realized using VPIC-based compressive edge detection.
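As a check on the ranges quoted for BTC-based and VPIC-based edge detection, the bounds on $r$ can be substituted into the two efficiency expressions. This worked arithmetic assumes, for illustration only, that an addition and a comparison (or one-bit operation) are charged the same unit cost, so that the ratios reduce to pure numbers:

```latex
\begin{align*}
\eta_{\mathrm{BTC}}  &\approx \frac{1.5}{r}: & r = 0.8 &\;\Rightarrow\; \eta \approx 1.9, & r = 0.5 &\;\Rightarrow\; \eta = 3, \\
\eta_{\mathrm{VPIC}} &\approx \frac{3}{r}:   & r = 0.8 &\;\Rightarrow\; \eta = 3.75,      & r = 0.5 &\;\Rightarrow\; \eta = 6.
\end{align*}
```

Under this assumption the VPIC figure is exactly twice the BTC figure at every $r$, which reflects the fact that the VPIC method replaces the $2kl$ comparisons per edge block with a single one-bit operation per pixel.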
The edge thickness, which is a primary distinction between various image-domain edge detectors, can be regulated in VPIC by configuring the codebook exemplars to have a thicker edge representation.

10.2.2.3. Example. We present the following example of VPIC edge detector configuration, expressed in terms of the exemplar lookup table. This example holds equally well for greyscale or Boolean imagery. However, one may want to adjust the edge-block mean to account for the intensity difference between the current block (pixel of the compressed image) and its neighbors. Observe that the 4x4-pixel VPIC codebook

\[ \mathbf{c}(3) = \begin{pmatrix} 1&1&1&1 \\ 1&1&1&0 \\ 1&1&0&0 \\ 1&0&0&0 \end{pmatrix}, \qquad \mathbf{c}(4) = \begin{pmatrix} 1&1&0&0 \\ 1&0&0&0 \\ 0&0&0&0 \\ 0&0&0&0 \end{pmatrix}, \tag{398} \]

together with exemplars $\mathbf{c}(1)$ and $\mathbf{c}(2)$ [entries illegible in the source], can be used to represent the following derivation of an edge block

[source block and its edge-detected bitmap; entries illegible in the source] (399)

in terms of the additional edge representation exemplars

\[ \mathbf{c}(5) = \begin{pmatrix} 0&0&0&0 \\ 1&1&1&1 \\ 0&0&0&0 \\ 0&0&0&0 \end{pmatrix}, \quad \mathbf{c}(6) = \begin{pmatrix} 0&0&1&1 \\ 0&1&1&0 \\ 1&1&0&0 \\ 1&0&0&0 \end{pmatrix}, \quad \mathbf{c}(7) = \begin{pmatrix} 0&1&0&0 \\ 1&1&0&0 \\ 0&0&0&0 \\ 0&0&0&0 \end{pmatrix}, \quad \mathbf{c}(8) = \begin{pmatrix} 1&1&1&1 \\ 1&0&0&1 \\ 0&0&0&0 \\ 0&0&0&0 \end{pmatrix}, \tag{400} \]

and the following derivation, in which a block with exemplar index 2 whose neighbors carry the indices $(1|2|3)''$, $1|2|3$, and $(1|2|3)'$ [neighbor layout illegible in the source] maps to exemplar 5, (401)

where $(1|2|3)'$ denotes the transpose of exemplar $\mathbf{c}(1)$, $\mathbf{c}(2)$, or $\mathbf{c}(3)$. The preceding derivation represents an efficiency of order $CR_d$, since one LUT operation could be substituted implementationally for sixteen pixel operations. The following examples are illustrative.

10.2.2.4. Complexity. Given the VPIC codebook index set $U = \{1, 2, \ldots, 4\}$, the VPIC Boolean edge detector requires only $|\mathbf{Y}| = 6(7) = 42$ invocations of the lookup table. In contrast, a Prewitt-like edge detector that employs a template which is defined by its weights as

[template weights illegible in the source] (402)

requires

\[ W_{\mathcal{O}}(|\mathbf{X}|) = 3 \cdot |\mathbf{X}| = 3 \cdot 28 \cdot 24 \ \text{multiplications} = 2{,}016 \ \text{multiplications}. \tag{403} \]
Given a ratio $r$ of multiplication delay to lookup table delay, this represents a speedup of $\eta = r \cdot 2016/42 = 48r$, if I/O overhead is neglected. If $r = 6.5$, as is the case for integer multiplication on the Sun Sparcstation-10 on which this algorithm was tested, then $\eta = 312$, which represents a nontrivial speedup that far exceeds $CR_d$.

10.2.2.5. Example. Let the source image $\mathbf{a}$ shown in Figure 19a be subjected to an edge detector that preserves the outer boundary, to yield the boundary image of Figure 20a. The VPIC representation of $\mathbf{a}$ is given in Figure 19b. Such a detector could be implemented in terms of the Prewitt detector [82], or could be realized via the subtraction of a morphologically eroded version of $\mathbf{a}$ from $\mathbf{a}$. Applying the preceding derivational method to VPIC edge detection of $\mathbf{a}$, we obtain the image of Figure 20b. Note the encoding errors near the vertices of the airplane's tail and wings. This type of error is typical in low-resolution imagery, and represents the lower limit of VPIC's coding accuracy.

[Figure 19a: source image, 28x24 pixels; bitmap not recoverable from the source.]

Figure 19b, VPIC image:

    Row \ Column   1     2     3     4     5     6      7
    1              M,0   M,0   M,0   M,0   M,0   M,0    M,0
    2              M,0   M,0   M,0   3'    1'    2'     M,0
    3              M,0   4'    -1    M,1   M,1   -1'''  M,0
    4              M,0   4''   2     M,1   M,1   2'     M,0
    5              M,0   M,0   M,0   4''   1'    4'''   M,0
    6              M,0   M,0   M,0   M,0   M,0   M,0    M,0

Figure 19. Example of VPIC coding of very low-resolution Boolean imagery: (a) source image, (b) VPIC representation of (a), where M,x denotes a mean block of mean x.

[Figure 20 panels: "Edge image of source" and "4x4 VPIC edge decompression"; bitmaps not recoverable from the source.]

Figure 20. Example of VPIC coding of very low-resolution Boolean imagery: (a) boundary detection of Figure 19a, (b) VPIC edge detection of Figure 19a.

10.2.2.6. Information loss. We have configured the VPIC edge detector such that the matching of exemplars with the zero-crossing bitmaps incurs less than 25 percent erroneous pixels per block. Given the ratio $N_e/|\mathbf{Y}|$, where $N_e$ denotes the number of edge blocks, we have a total error cross-section of $0.25 (N_e/|\mathbf{Y}|) \times 100$ percent of erroneous pixels in the compressed image. In the preceding example, there were 21 erroneous pixels of 672 = 28x24 pixels, as shown in the following figure. This represents an error rate of 0.03125 = 21/672, or 3.1 percent probability of error per pixel. From Figure 19b, we see that there are 12 edge blocks, for a total maximum predicted error rate of 0.0714 = (0.25)12/42. Assuming that the mean rate is half the maximum, we have a mean predicted error rate of 0.0357, or 3.6 percent of source pixels. The measured error rate of 3.1 percent of source pixels varies from the predicted rate by 3.36 pixels = (0.036 - 0.031)(24x28 pixels), or 0.5 percent of the source domain size. In practice, we find that the predicted mean error rate consistently approximates the actual rate to within 10 percent of the image size when a 25 percent error cross-section is assumed in the VPIC match.

[Figure 21 table, indexed by row and column of the VPIC representation; entries not recoverable from the source.]

Figure 21. Error analysis of the edge detector in Figure 20b, in terms of erroneous source pixels per encoding block.

10.2.2.7. Example. Greylevel VPIC edge detection is illustrated in the following figure, in which a noisy, slightly blurred 720x480-pixel underwater (UW) image was subjected to Sobel edge detection as well as VPIC edge detection.
An error image was obtained by differencing the outputs of the Sobel and VPIC edge detectors. Note that the error graph contrasts the UW case with terrestrial imagery of small targets against a grassy background. In both cases, error increases as small features are obliterated. However, in the land-based imagery, which has much lower spatial frequency content over the periods ranging from five to eight pixels, the combined camera noise and VPIC representational error did not exceed 3.2 bpp for blocksizes as large as 8 pixels square.

Figure 22. Greylevel edge detection with VPIC: (a) source image, (b) Sobel edge detection, (c) 4x4-pixel VPIC edge detection using a codebook similar to that given in Example 10.2.2.3, (d) noise and representational error as a function of VPIC blocksize (kxk pixels) for underwater and land-based imagery.

10.2.3. VQ-based Edge Detection. We begin by noting several useful properties of the VQ exemplars.

10.2.3.1. Observation. Cosman et al. [1,12] have shown that VQ-compressed imagery can be thresholded according to a minimum-entropy criterion to achieve a very approximate method of edge detection. Unfortunately, certain exemplars that have no edges are frequently juxtaposed with those that contain high-intensity edge information (i.e., high variance). The resulting "blocking effect" at block boundaries can lead to visually apparent noise in the edge image if the blocksize is large in relation to spatial features of interest. As discussed previously, edge detection that is local to a given block produces boundary effects that can be mistaken for linear features.

10.2.3.2. Algorithm. As an improvement on Cosman's method, we have modified the method employed in VQ smoothing to facilitate edge detection by applying a given edge detector to the exemplars in the VQ codebook $\mathbf{c}$. Given a codebook of $M$ exemplars, this requires $O(M)$ invocations of the edge detector, and $O(Mkl)$ pixels are processed.
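The codebook-side strategy (run the edge detector once per exemplar, then rebuild the image by index substitution) can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, the 2x2 codebook and index image are made-up data, and the forward-difference detector is a stand-in chosen for brevity, not the detector used in the dissertation.

```python
def edge_detect_block(block):
    """Stand-in edge detector: sum of absolute forward differences.

    Neighbours that would fall outside the block are replaced by the
    border pixel itself, so all processing stays local to the exemplar.
    """
    rows, cols = len(block), len(block[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            right = block[i][min(j + 1, cols - 1)]
            down = block[min(i + 1, rows - 1)][j]
            out[i][j] = abs(right - block[i][j]) + abs(down - block[i][j])
    return out


def reconstruct(index_image, codebook):
    """Decompress by substituting the indexed exemplar for each compressed pixel."""
    block_rows = len(codebook[0])
    image = []
    for index_row in index_image:
        for i in range(block_rows):
            row = []
            for idx in index_row:
                row.extend(codebook[idx][i])
            image.append(row)
    return image


# Hypothetical two-exemplar codebook of 2x2 blocks.
codebook = [
    [[0, 0], [0, 0]],   # flat block
    [[0, 9], [0, 9]],   # block containing a vertical edge
]

# One detector call per exemplar: O(M) calls, independent of image size.
edge_codebook = [edge_detect_block(c) for c in codebook]

# Hypothetical 2x2 compressed image of codebook indices.
index_image = [[0, 1],
               [1, 0]]
edge_image = reconstruct(index_image, edge_codebook)
```

The detector cost depends only on the codebook size, while reconstruction is pure index substitution. Because each exemplar is processed in isolation, edges that lie exactly on a block boundary are not seen by this sketch, which is the boundary effect the surrounding text addresses.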
Upon image reconstruction, the exemplars are substituted into the reconstructed image, and an approximation to the edge-detected source image results. Blocking effects can be reduced to a visually acceptable level by extending the boundaries of each codebook exemplar outward by $m$ rows or columns of pixels with randomly chosen values, where $m$ denotes the halfwidth of the edge detector's sampling window. The block extensions are discarded after edge detection occurs, thus reducing boundary effects. Since the VQ codebook is specific to a given image training set, this method is more useful than a specific instance of BTC- or JPEG-based edge detection. However, VQ-based edge detection is particular to a given training set, as opposed to VPIC-based filtering, which is based on a data-independent codebook.

10.3. VPIC Morphological Operations. VPIC morphological operations are similar implementationally to VPIC edge detection, and exhibit corresponding advantages of speed and simplicity. For purposes of brevity, we consider Boolean erosion and dilation with the von Neumann template.

10.3.1. Concept. We begin with a brief overview of erosion and dilation.

10.3.1.1. Observation. Ritter [54] has shown that morphological erosion and dilation with the von Neumann template can be implemented in terms of the image algebra additive maximum and minimum. For example, given finite $\mathbf{X} \subset \mathbf{Z}^2$, the erosion of a Boolean source image $\mathbf{a} \in \{0,1\}^{\mathbf{X}}$ by a von Neumann template

\[ \mathbf{t} = \begin{pmatrix} & 1 & \\ 1 & 1 & 1 \\ & 1 & \end{pmatrix} \]

is expressed as $\mathbf{b} = \mathbf{a} \owedge \mathbf{t}$.

10.3.1.2. Example. The erosion of a source image by the von Neumann template $\mathbf{t}$ is given by

\[ \begin{pmatrix} 0&0&0&0&0 \\ 0&1&1&1&0 \\ 0&1&1&1&0 \\ 0&1&1&1&0 \\ 0&0&0&0&0 \end{pmatrix} \tag{404} \]

\[ \longrightarrow \begin{pmatrix} 0&0&0&0&0 \\ 0&0&0&0&0 \\ 0&0&1&0&0 \\ 0&0&0&0&0 \\ 0&0&0&0&0 \end{pmatrix}. \tag{405} \]

10.3.1.3. Remark. Whereas erosion removes pixels from a unitarily-valued boundary, dilation adds pixels to the boundary. In the preceding example, the dilation of $\mathbf{b}$ (Equation 405) by $\mathbf{t}$ yields the image $\mathbf{a}$ (Equation 404). We thus write $\mathbf{a} = \mathbf{b} \ovee \mathbf{t}$.

10.3.1.4.
Complexity. If multiplications are employed for combination of an image with $\mathbf{t}$ using $\ovee$ or $\owedge$, then a total of five multiplications and four maximum operations are required per pixel in $\mathbf{X}$. Otherwise, a lookup table may be employed that has five-bit input and one-bit output, for a total space requirement of 32 bits and a total cost of $|\mathbf{X}|$ LUT invocations and $6 \cdot |\mathbf{X}|$ I/O operations. Since there is no information loss in Boolean erosion/dilation, we proceed to a discussion of VPIC erosion.

10.3.2. VPIC Erosion or Dilation. VPIC-based erosion has the advantage of requiring I/O operations only, if the implementation uses a lookup table similar to that required for VPIC edge detection.

10.3.2.1. Observation. Recall that VPIC represents an edge block of a Boolean image (i.e., a zero-crossing map that has both ones and zeroes) in terms of zero block mean and standard deviation, with a given exemplar index. Thus, we can derive a codebook specific to morphological erosion, similar to the codebook we derived for edge detection. The following example is illustrative.

10.3.2.2. Example. Observe that the VPIC codebook $\mathbf{c}$ given in Example 10.2.2.3 can be used to express the following derivation of an eroded block exemplar

\[ [\text{source block and neighbors; entries illegible in the source}] \longrightarrow \begin{pmatrix} 1&1&1&1 \\ 1&0&0&1 \\ 0&0&0&0 \\ 0&0&0&0 \end{pmatrix} \tag{406} \]

as the mapping of a block with exemplar index 2, whose neighbors carry the indices $(1|2|3)''$, $(1|2|3)'''$, $(1|2|3)'$, and $1|2|3$ [neighbor layout illegible in the source], to exemplar 8, (407)

where $(1|2|3)'$ was defined in Section 10.2.2.3.

10.3.2.3. Example. Similarly, $\mathbf{c}$ can be used to express the following dilation of a VPIC block exemplar

\[ [\text{source block and neighbors; entries illegible in the source}] \longrightarrow \begin{pmatrix} 1&1&1&1 \\ 1&1&1&1 \\ 1&1&1&1 \\ 1&0&0&1 \end{pmatrix} \tag{408} \]

as the mapping of a block with exemplar index 2, whose neighbors carry the indices $(1|2|3)''$, $(1|2|3)'''$, $(1|2|3)'$, and $1|2|3$ [neighbor layout illegible in the source], to $\neg 1''$, (409)

where $\neg 1''$ denotes the Boolean negation of $\mathbf{c}(1)$ transposed twice. Note that the Hamming distance between the result of Equation 408 and the exemplar $\neg(\mathbf{c}(1))''$ is two (12.5 percent of the sixteen block pixels), which is within the error limit of 25 percent of total pixels discussed in Section 10.2.2.6.

10.3.2.4. Remark. The discussion of complexity and information loss given in Section 10.2.2 holds for VPIC morphological operations as well. Preliminary tests on Boolean text imagery indicate that VPIC dilation incurs greater error than VPIC erosion. This is a reasonable observation, since dilation fills the von Neumann neighborhood of a pixel with unitary values, while erosion may remove one or more unitary values from the neighborhood of a given pixel. That is, erosion and dilation are not inverses of each other, and should be expected to exhibit differing error figures.

10.3.2.5. Computational efficiency. VPIC morphological operations that employ the von Neumann template require six I/O operations (five inputs, one output) and one LUT invocation. Assuming that a VPIC erosion/dilation LUT invocation incurs identical overhead to that incurred by an erosion/dilation LUT over source imagery (per Section 10.3.1.4), the cost per compressed pixel of the preceding method of VPIC erosion would be identical to the cost per source pixel of noncompressive erosion. A symmetric observation holds for dilation. However, VPIC morphological operations with the von Neumann template affect $kl$ pixels (the encoding block size) per invocation, versus one pixel with image erosion. Thus, the resultant efficiency is $\eta = CR_d$.

We are currently investigating the configuration of VPIC codebooks for multi-pixel erosion with von Neumann and non-von Neumann templates. For example, in the absence of boundary information, two-pixel erosion would convert $\neg\mathbf{c}(1)$ to $\mathbf{c}(1)$. An additional topic of interest is the description of VPIC block formation rules (per Equation 409) in terms of a grammatical representation.

10.4. High-Level Compressive Computations. Practical image processing of multi-target fields can benefit from efficient methods of target classification, as well as methods of identifying target shapes.
In this section, we present an algorithm for target classification based on BTC- or VPIC-compressed imagery (Section 10.4.1) and VPIC-based connected component labelling (Section 10.4.2).

10.4.1. Target classification. We begin by observing that targets can be classified by their statistical measures, which are encoded in VPIC and BTC imagery. This observation is elaborated to include detection of pixel-level target features.

10.4.1.1. Observation. Naturally occurring objects or groups of objects tend to be irregularly shaped and have relatively rough textures (e.g., rocks, a field of grass, or tree bark). In contrast, manufactured objects are generally regularly shaped and tend to have smooth surfaces due to the requirements of drawing, forming, and stamping processes. It is well known that, given a multispectral image, object color can be roughly assessed using local averages taken across image partitions in each band whose size approximates the expected size of one or more targets of interest. We have elsewhere noted [62,81] that surface roughness or texture can be approximated by the image gradient or standard deviation. As a result, manufactured objects can be detected in natural scenes by first selecting candidate image areas according to a local mean criterion, then thresholding the candidate neighborhoods based on local variance criteria. Since mean and variance information are present in BTC- and VPIC-compressed imagery, it follows that, by thresholding such imagery as previously mentioned, one could classify targets of interest.

10.4.1.2. Algorithm. Let a multi-target image $\mathbf{a} \in \mathbf{F}^{\mathbf{X}}$ be compressed by a transform $T$ = BTC to yield an image $\mathbf{a}_c \in \mathbf{G}^{\mathbf{Y}}$. Recall that each $k \times l$-pixel encoding block $\mathbf{b}$ in $\mathbf{a}$ is represented by the block mean $\mu$, standard deviation $\sigma$, and zero-crossing bitmap $\mathbf{d} \in \{0,1\}^{k \times l}$.
Given probable target intensities in $T \subset \mathrm{range}(a)$, as well as probable target textures represented by standard deviations in V, the set of probable target locations $D \subset X$ is given by:

$$D = h^{*}\left[\mathrm{domain}\left(\chi_{\in T}(p_1[a_c(Y)]) * \chi_{\in V}(p_2[a_c(Y)])\right)\right], \qquad (410)$$

where the indexing function h* was defined previously. In a more efficient linear combination scheme, we could substitute addition for Hadamard multiplication, denoted by (*) in the preceding equation.

10.4.1.3. Complexity. If T and V are each specified in terms of an interval on range(a), then the preceding algorithm requires 4·|Y| comparisons and 3·|Y| combination operations, which can be implemented via an addition or logical and operation. As a result, processing requires O(|Y|) time. By way of comparison, signature determination over X requires 2·|X| comparisons per spectral band. The computational efficiency of BTC-based compressive target recognition is clearly O(CRd), and usually exceeds CRd due to the burdensome operations required for local averaging and computation of the standard deviation over X.

10.4.1.4. Observation. In the case of VPIC-compressed imagery, a target is characterized by a variance measure that is derived from the encoding block gradient intensity. One thresholds the VPIC image according to mean and variance in order to obtain candidate target points in the compressed domain Y. Thus, the VPIC gradient orientation and exemplar index can be ignored in this simple formulation. Since the mean and an approximation to the standard deviation are given, one can implement filters such as the Double-Gated Filter [81] using a reduced template representation. The problem of implementing ATR filters over VPIC-encoded imagery will be addressed in a future publication.

10.4.1.5. Algorithm. In more involved implementations of target classification, we employ the VPIC exemplars, as follows:

Step 1.
Given target intensities in T and variance parameters in V, compute Equation 410 to yield probable target locations in D.

Step 2. Compute C = h(D), which are the target locations in Y, where the indexing function h was defined previously.

Step 3. Assuming that target size and shape are known, arrange the VPIC codebook exemplars to yield an approximate representation of target shape, as illustrated in Figure 23.

Figure 23. Target characterization using VPIC codebook exemplars with indices in {1,2,3}: (a) VPIC codebook, (b) source image with unitarily-valued target region, (c) compressed representation, where 1' denotes exemplar 1 rotated by 90 degrees.

Step 4. Perform Step 3 for blocks whose indices are in C by matching VPIC exemplars to the combined threshold results (e.g., the product in Equation 410) restricted to h*(C).

Step 5. Compare the results of Step 3 for each contiguous group of blocks found in Step 2 to yield a measure of target presence.

The algorithm operates as follows. Steps 1 and 2 compute indices in C of candidate target blocks in a. Step 3 produces a probable target template. Step 4 correlates the target template developed in Step 3 with the VPIC compressed image's exemplar indices to determine target presence. Additionally, the known data about target mean and standard deviation are used to restrict the source image to candidate target locations. When these two partial results are combined (usually via addition and thresholding), one obtains an image of candidate target locations where the block mean, gradient intensity, orientation, and exemplar indices are specific to blocks whose indices are in C. If a match is found, then target presence is indicated. At this point, one can check the uncompressed source image to achieve greater accuracy in target discrimination.

10.4.1.6.
Complexity. Steps 1 and 2 of Algorithm 10.4.1.5 require one computation of Equation 410 and |D| applications of h*, respectively. Step 3 depends upon the target size and the blocksize used in VPIC-compressing the target representation shown in Figure 23b. Step 4 can be obtained with no significant effort by restricting the compressed image to points in C. Step 5, which requires at most O(|C|²) comparisons, is the costliest step.

10.4.1.7. Example. Figure 24a shows a multi-target image that contains four targets of varying sizes and visibility, all of which are partially resolved (5 to 100 pixels in size). Figure 24b shows the compressed target field image that resulted from VPIC applied to (a), and Figure 24c illustrates the output of Equation 410 following decompression. All four targets are correctly detected, although there are several false target pixels above the bottom left-hand target, which are due to misidentification of noise pixels in that area and can be removed by morphological erosion and dilation.

Figure 24. Example of VPIC-based target recognition: (a) source image, (b) BTC-compressed image over a portion of which the target recognition algorithm (Equation 410) computes, (c) decompressed target location image.

10.4.2. VPIC Connected Component Labelling. We conclude our presentation of compressive processing with an overview of a more difficult problem, namely connected component labelling (CCL). Here, we threshold an image such as that shown in Figure 24c and label each contiguous unitarily-valued region with a unique number. For purposes of simplicity, we do not require sequential CCL, which can nevertheless be implemented by relabelling the result of VPIC CCL.

10.4.2.1. Observation. VPIC encodes a Boolean image by assuming that encoding block contents are zero-crossing bitmaps of encoding blocks that have zero mean and unit gradient intensity.
Each block is represented in the VPIC image by a best-match exemplar from a small codebook (typically fewer than eight exemplars). Since the codebook exemplars are Boolean images that are known a priori, we can tag each codebook exemplar with a tag vector (N,E,W,S), where N = 1 if the North (top) row of a given exemplar is unitarily valued or has a majority of one-valued pixels, and N = 0 otherwise. Symmetric operations are applied to the E, W, and S boundaries. This facilitates the determination of adjacency between blocks, which greatly simplifies CCL.

10.4.2.2. Assumption. Assume that a compressed image $a_c \in G^Y$ contains the values of encoding blocks in a source image $a \in F^X$, and is obtained by applying a transform $T_{VPIC}$ to a. Further assume that we perform connected component labelling to yield an image c = CCL(a). Let Y be two-dimensional, and let a compressed block representation whose domain point is $y \in Y$ lie in the row directly beneath, and in the same column as, a block whose domain point is denoted by z. If the N-tag of the exemplar indexed by $p_4(a_c(y))$ and the S-tag of the exemplar indexed by $p_4(a_c(z))$ are both unitarily-valued, then we say the blocks are adjacent. Symmetric cases hold for the E, W, and S tags. From such interrogation, which requires a maximum of O(|Y|) time, where Y denotes the compressed domain, an adjacency graph

$$G = (V,E) \in (Y,\, Y \times Y) \qquad (411)$$

is constructed, whose edge set $E \subset Y \times Y$. This process is possible using $a_c$ only, since we assume that the block means which represent targets are greater than zero.

10.4.2.3. Algorithm. Given the adjacency graph G, we next assign unique positive integers to pixels in an image e on Y. The image f that contains the indices of connected components on Y is computed as follows:

Step 1. For each point $y \in Y$, interrogate the points $z \in N(y)$, where N denotes the von Neumann neighborhood of y, such that if E(y,z) exists in G, then

$$e(y) = \vee(e(y), e(z)). \qquad (412)$$

Step 2.
Apply the zero-valued, two-dimensional von Neumann template t to e as follows:

    $m = \chi_{>0}(p_1(a_c))$
    repeat {
        $f = m + (e \,\boxed{\vee}\, t)$    (413)
        if f = e then stop, else e = f endif
    }

The algorithm operates as follows. Step 1 ensures that a large value in e is placed on a given component in $a_c$, thereby facilitating the subsequent maximum operation. Step 2 propagates the maximum value of each component throughout the component to yield an image f of uniquely labelled components.

10.4.2.4. Complexity. In practice, given a compressed image $a_c$, the formation of G requires 4·|Y| comparisons. Given convex components of maximum radius r, Ritter [7] has shown that the labelling algorithm of Step 2 requires at least r/2 iterations of the inner loop, which equates to 2.5r·|Y| additions and 2r·|Y| maximum operations, since the von Neumann template has five pixels. Assuming that the addition, maximum, and comparison operations require identical work, the computational efficiency of CCL over VPIC-compressed imagery, as opposed to applying Equation 413 to uncompressed imagery containing convex hulls, is given by:

$$\eta = \frac{W_{CCL}(X)}{W_{VCCL}(Y)} = \frac{4.5 \cdot |X|\ \text{operations}}{8.5 \cdot |Y|\ \text{operations}} \approx 0.53 \cdot CR_d. \qquad (414)$$

If k = l = 4 (which equates to CRd = 16), then $\eta$ = 8.47, under constraint of the preceding assumptions. Therefore, compressive processing with O(CRd) efficiency is feasible.

CHAPTER 11
APPLICATIONS IN PARALLEL COMPUTING

In this chapter, we consider applications of the theory developed in previous chapters, particularly concerning implementation on SIMD-parallel meshes. We begin with a brief discussion of the effect of domain compression ratio (Section 11.1), then progress to range compression ratio effects (Section 11.2). We conclude with a discussion of mapping image operations to more efficient operations over the corresponding compressed format (Section 11.3) and image partitioning efficiency in compressive processing (Section 11.4).

11.1. Effect of Domain Compression Ratio.
We have previously shown that the domain compression ratio CRd can be supraunitary, which indicates a reduced number of pixels in the compressed image representation versus the source domain. For example, if k×l-pixel encoding blocks are employed in VPIC, then each VPIC exemplar referenced in a given compressed pixel is associated with kl source pixels. This seemingly obvious fact leads to the following three important consequences for parallel implementation of compressive operations:

1. The number of processors required to compute an image operation can be reduced by O(CRd). However, such reduction is achieved in the absence of computational speedup due to fewer pixels. In practice, reduced parallelism is achieved when the number of compressed pixels per processing element (PE) is greater than or equal to the number of pixels per PE for the uncompressed image. Any additional speedup derives from a reduced word width in the compressed pixel, if such effects are present (reference Section 11.2). In this chapter, we assume that one pixel is stored per PE, with nearest-neighbor pixel values input and stored on-board a given PE in registers, as required.

2. I/O requirements can be correspondingly decreased for compressed imagery. This is a direct consequence of the reduced data burden incurred by CRd > 1. If the source imagery is tessellated blockwise (e.g., k×l-pixel rectangular blocks) and an n×m-processor SIMD mesh with n-channel parallel I/O is required for an image operation, then it is possible that the mesh size can be reduced to n/k by m/l PEs, with only n/k-channel parallel I/O required. Alternatively, the n-fold I/O parallelism can be employed to speed up I/O by a factor of k, using input of multiple image rows. The rows would then be shifted across the mesh to their appropriate positions, which would require additional I/O overhead.
Depending upon n, m, k, l, and the requirements of computations that can be pipelined with or follow the I/O operations, this additional overhead may or may not be burdensome.

3. If the topology of the compressed image is isomorphic to (a) the topology required by a given compressive operation and (b) the inter-processor connection scheme, then near-optimal mapping of the compressive operation to the target architecture is possible. A case in point is VPIC edge detection with the von Neumann topology, square encoding blocks, and a von Neumann template or subset thereof. Recall that we have shown (in Section 10.2.2) that nearest-neighbor (NN) data dependencies are required for VPIC edge detection. For example, consider image-template operations with the von Neumann template. Assuming von Neumann source and compressed domain topologies that are isomorphic to the processor mesh connectivity, we have isomorphism between the source, compressed, and mesh domains. Thus, we can map an algorithm with NN communication in the image domain to an algorithm with NN communication requirements in the compressed domain, which can then be implemented on a processor with native NN communication. This is clearly an optimal situation for I/O efficiency, provided that additional I/O overhead is not required by the compressive operation beyond that needed for the noncompressive operation. In other words, we do not want the VPIC edge detector to require NN communication in the Moore sense, versus the von Neumann connectivity supported in hardware on many 2-D meshes. That is, Moore NN communication would require two shifting operations per pixel to read pixel values in diagonal NNs.

As an illustration of Item 2 above, consider the following simple example. Let a 4×4-processor mesh be employed for image-domain operations. Assume that a 4×4-pixel source image (having 16 pixels) can be compressed at CRd = 4 to yield a 2×2-pixel image, as shown in Figure 25.
Further assume that the image and compressive operations fulfill the criteria of Item 3, above. Since only 25 percent of the mesh is now required for image processing, we can construct a pipelined SIMD mesh (similar to a systolic processor) using no additional hardware other than the n parallel I/O channels and the n² PEs previously available. Considering the effect of pipeline setup, the throughput is increased nearly four times relative to noncompressive processing. If operations 1-4 performed in each quadrant of the mesh require similar computational delay (e.g., are implemented using lookup tables, similar to VPIC edge detection and morphological operations), then near-optimal throughput could be achieved.

The potential impact of the preceding concepts on the design of image processing circuitry could be significant. For example, we currently load an image onto a SIMD mesh, process the image, then (frequently) store the intermediate result to disk. This requires considerable I/O overhead. In contrast, assume that we can consistently reduce imagery to an easily-manipulated compressive format (e.g., VPIC or VQ) where compressive operations have template configurations or inter-pixel communication requirements that can be efficiently accommodated by the processor mesh. Further assume that local memories will increase in capacity such that LUT-based processing (similar to VPIC edge detection or CCL) can be accommodated on-board with no additional I/O overhead due to LUT loading/unloading. (This is a reasonable assumption, given current developments in memory technology.)

Figure 25.
Conversion of a SIMD-parallel mesh to pipelined computation using compressive processing: (a) input and compute over compressed image 1, (b) input compressed image 2 and compute over image 1, (c) input compressed image 3 and compute over images 1 and 2, (d) input compressed image 4, compute over compressed images 1-3, and output compressed image 1.

Now, suppose that we plan to perform production image processing in a manner similar to that shown in Figure 25. If a given architecture could be optimally or near-optimally configured by allocating mesh partitions for various algorithm steps in a manner that balanced image partitioning efficiency with pipeline throughput, then this would represent a significant advance in SIMD-parallel processing efficiency. In particular, intermediate results could be stored on (and circulated around) the various mesh partitions until output to disk storage. Additionally, if the pipeline partitions were allocated correctly, intermediate results could be migrated to other mesh partitions during processing, thereby obviating the need for I/O of some intermediate results from/to external storage (a time-consuming process).

11.2. Effect of Range Compression Ratio. A second effect of interest to parallel computing is that source image values can be compressed to yield fewer bits per pixel. For example, consider BTC or VPIC, where a k×k block of m-bit pixels (k²m bits) is reduced to n bits (e.g., a block mean and variance measure), where n [...] potentially faster computation due to the compounding of domain and range compression effects, and (c) possible noise averaging and estimation due to the computation of block mean and variance measures, respectively. The latter advantage is especially important when ATR filter performance depends upon the image noise figure [62,81].

11.3. Simplification of Operations.
As a final note, we summarize several issues pertaining to the replacement of burdensome operations with more efficient operations.

11.3.1. Optimizing the Cost of an Operation. Recall the case of VPIC edge detection (Section 10.2), where we replaced a costly n<6 pixel convolution over |X| source pixels with a simple lookup table that computed over |Y| < |X| pixels. The convolution required at least two multiplications and one addition, whereas the LUT required only six I/O operations (five inputs and one output). The following table illustrates the cost of SIMD-parallel computation of such an operation.

Table 1. Costs involved in SIMD-parallel computation of a 2-pixel image-template convolution versus a 5-pixel LUT operation over VPIC-format imagery.

Algorithm Step (Processor Cycle)                 Uncompressed     Compressive
1. Load image & template values (LUT inputs)     2n inputs        5 inputs
2. Multiply values (apply LUT)                   n multiplies     1 LUT access
3. Combine partial products                      n-1 additions    -
4. Output result                                 1 output         1 output

11.3.1.1. I/O cost. Assuming that all I/O operations are the same, we have that an n<6 pixel von Neumann template convolution requires 2n+1 I/O operations per pixel in X, versus 6 I/O operations and one memory operation per pixel in Y for the compressive operation. Inter-processor communication is nearest-neighbor only, in the von Neumann sense, for both operations.

11.3.1.2. Computational cost. The noncompressive operation requires n multiplies and n-1 additions per pixel in X, while the compressive operation requires only one LUT access (usually faster than an inter-processor I/O operation). Thus, if we assume that the compressive operation requires seven I/O operations (one for the LUT invocation and six for values), then the noncompressive operation will match this I/O overhead when n=3. In addition, the noncompressive operation requires multiplications and additions, which are drastically slower than I/O operations.
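The per-pixel counts of Table 1 and the break-even template size quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch; the function names and the `lut_cost_in_io_ops` parameter are hypothetical, and the LUT access is charged as one I/O-equivalent operation only because the text makes that assumption.

```python
def uncompressed_ops(n):
    """Per-pixel operation counts for an n-pixel template convolution (Table 1)."""
    return {"io": 2 * n + 1, "multiplies": n, "additions": n - 1}

def compressive_ops():
    """Per-pixel operation counts for the 5-input LUT operation (Table 1)."""
    return {"io": 6, "lut_accesses": 1}

def io_break_even(lut_cost_in_io_ops=1):
    """Smallest template size n whose I/O count reaches the compressive cost."""
    target = compressive_ops()["io"] + lut_cost_in_io_ops
    n = 1
    while uncompressed_ops(n)["io"] < target:
        n += 1
    return n

if __name__ == "__main__":
    print(uncompressed_ops(3))  # {'io': 7, 'multiplies': 3, 'additions': 2}
    print(io_break_even())      # 3
```

With the LUT access charged as one I/O operation, the noncompressive I/O count 2n+1 reaches seven at n=3, in agreement with the text; the arithmetic operations of the convolution are then additional cost on top of matched I/O.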
As a result, it is reasonable to have as a future research goal the implementation of compressive image operations in terms of lookup tables. If such techniques are feasible for SIMD-parallel processors, this strategy (coupled with the reduction in I/O cost achievable through reduced parallelism) could yield drastic increases in image processing throughput.

11.3.1.3. Remark. However, the preceding analysis assumes that all multiplications and additions can be completed in one major processor cycle. This is certainly not the case with a synchronous SIMD mesh. We thus revise Table 1 to yield the following cost budget.

Table 2. Processor cycles incurred by SIMD-parallel computation of a 2-pixel image-template convolution versus a 5-pixel LUT operation over VPIC-format imagery.

Algorithm Step                                   Uncompressed                          Compressive
1. Load image & template values (LUT inputs)     2n cycles                             5 cycles
2. Multiply values (apply LUT)                   $n \cdot c_\times$ cycles             1 cycle
3. Combine partial products                      $(n-1) \cdot c_+$ cycles              -
4. Output result                                 1 cycle                               1 cycle
Total Cycles Predicted                           $n(c_\times + c_+ + 2) - c_+ + 1$     5+1+1 = 7

For purposes of illustration, if one multiplication requires 20 cycles and an addition requires 8 cycles, we would have 30n - 7 cycles for one invocation of the noncompressive operation and seven cycles for the compressive operation. Such comparisons provide a clear illustration of the advantages of compressive processing in situations where isomorphism exists between the topological/connectivity requirements of the source image tessellation, image operation, compressive operation, and mesh.

11.4. Partitioning Efficiency. We have noted elsewhere [6] that compressive processing facilitates the restructuring of individual operations, by modifying the connectivity of dataflow graphs of certain parallel algorithms such as connected component labelling.
An additional advantage of compressive processing that is important for SIMD-parallel computation is the increase in efficiency with which an image can be partitioned.

11.4.1. Basic Concepts. The optimal partitioning of imagery is a key challenge in parallel image processing, due (for example) to the I/O overhead required to swap image partitions on SIMD-parallel arrays.

11.4.1.1. Observation. Given a parallel processor such as a SIMD-parallel mesh with N processors, consider the partitioning of |X| source data to be optimally allocated to the N-processor mesh. For example, in pointwise operations, we could divide the source data into $\lceil |X|/N \rceil$ partitions, each of which would be loaded and processed independently. If global-reduce operations are employed, then a global-reduce result from the i-th partition would be stored as intermediate data and would be accumulated with the global-reduce result of the subsequent ((i+1)-th) partition.

11.4.1.2. Remark. Image partitioning for pointwise and global-reduce operations is relatively straightforward, and generally requires only non-overlapping partitions. However, when image-template operations are employed, template boundary conditions require overlapping partitions. For example, given a k×l-pixel template with rectangular support, a two-dimensional |X|-pixel source image would require partitions that had an overlap of $\lfloor k/2 \rfloor$ pixels at the top and bottom of the partition and $\lfloor l/2 \rfloor$ pixels at the sides of the partition. As a result, for odd-dimensioned rectangular templates, the partition size would be increased by k-1 pixels in the vertical direction and l-1 pixels in the horizontal direction. Since only N processors are available, this implies an increase in the number of partitions. Consequently, the number of processing stages can increase, thereby decreasing throughput.
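The growth in partition count caused by template overlap can be illustrated numerically. The sketch below assumes a rectangular mesh that holds one pixel per PE and a simple rectangular tiling in which each loaded partition carries its own overlap border; the function names and the tiling scheme are illustrative assumptions, not a method given in the text.

```python
import math

def partitions_pointwise(height, width, mesh_rows, mesh_cols):
    """Number of non-overlapping mesh-sized tiles needed to cover the image."""
    return math.ceil(height / mesh_rows) * math.ceil(width / mesh_cols)

def partitions_with_template(height, width, mesh_rows, mesh_cols, k, l):
    """Tiles needed when each tile carries floor(k/2) and floor(l/2) overlap borders.

    Only the interior of each loaded tile yields valid template results,
    so the usable tile shrinks by 2*floor(k/2) rows and 2*floor(l/2) columns.
    """
    core_rows = mesh_rows - 2 * (k // 2)
    core_cols = mesh_cols - 2 * (l // 2)
    if core_rows <= 0 or core_cols <= 0:
        raise ValueError("template is too large for the mesh")
    return math.ceil(height / core_rows) * math.ceil(width / core_cols)

if __name__ == "__main__":
    print(partitions_pointwise(512, 512, 32, 32))            # 256
    print(partitions_with_template(512, 512, 32, 32, 3, 3))  # 324
    print(partitions_with_template(512, 512, 32, 32, 5, 5))  # 361
```

For a 512×512 image on a 32×32 mesh, a 3×3 template raises the partition count from 256 to 324 under these assumptions, and a 5×5 template raises it to 361, which is the throughput penalty the remark describes.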
This effect is pronounced for large or non-rectangular templates, and can significantly degrade performance on small meshes, due to the repeated loading of boundary pixels.

11.4.1.3. Observation. An additional problem in image-template operations is the storage of boundary conditions for recursive neighborhood operations. Although we do not extensively discuss recursive image algebra in this study, we note that recursive image-template operations have the additional requirement that partitions be processed sequentially. In such cases, out-of-order execution may be less applicable as a means of achieving near-optimal functional parallelism. The impact upon data parallelism is twofold: (1) in-order partitioning is required, and (2) intermediate results must be stored in boundary pixels (or PEs) on the mesh or registers associated with the mesh. The requirement of within-PE storage can further decrease effective partition size and can thus decrease algorithm throughput by increasing the costs associated with I/O and partition swapping.

11.4.2. Partitioning Efficiency in Compressive Processing. The advantage of compressive processing in reducing partition swapping overhead results primarily from block compression, where source pixels are grouped into rectangular blocks that are characterized by few parameters.

11.4.2.1. Observation. Given a supra-unitary domain compression ratio CRd and a block tessellation scheme that is indexed isomorphically to the image domain coordinate system, it is possible that pointwise and global-reduce operations can be partitioned such that $\lceil |Y|/N \rceil < \lceil |X|/N \rceil$ partitions are required, where X and Y denote the source and compressed domains, respectively. Further decreases in parallelism can be effected by codebook processing in block-encoded imagery, versus processing the |Y| encoding blocks.

11.4.2.2. Example.
Consider VQ-compressed imagery, where M k×l-pixel exemplars in codebook c represent encoding blocks directly (i.e., are not scaled or normalized). When computing a blockwise operation over the codebook, a total space requirement of Mkl pixels is present, instead of |Y| > Mkl or |X| > |Y| pixels.

11.4.2.3. Example. Consider the blockwise operation of image addition. Instead of computing |X| operations in the image domain, we can approximate the effect of VQ codebook addition by adding the exemplars d(i,j) = c(i) + c(j), for all codebook indices $1 \le i \le j \le M$. By eliminating the lower triangular portion of domain(d), the new codebook d will have (M² + M)/2 exemplars. For example, if M = 64, then d would have 2,080 exemplars. Given the source domain size |X| = 1024×1024, and 8×8-pixel encoding blocks, the compressed domains of the addend and augend images would each have 16K exemplar indices. Thus, 16K block addition operations are required for compressive addition over Y. However, codebook addition requires only 2,080 block additions, which yields an efficiency of $\eta$ = 16,384/2,080 ≈ 7.88. Thus, parallelism could be reduced nearly eightfold by such VQ-based compressive processing.

11.4.2.4. Example. Consider the VPIC target classification algorithm given in Chapter 10. Here, VPIC exemplar indices are matched to an exemplar-index template. If the template subtends L encoding blocks, then L comparisons and L-1 combination operations (addition, multiplication, or maximum) are required per block position. Thus, given tessellation of an M×N-pixel image into k×l-pixel encoding blocks, a parallel space savings of CRd = MN/kl is realized, with a computational efficiency of O(CRd/L).
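The exemplar-index matching described in this example can be sketched as a sliding comparison over the compressed image. The code below is an illustrative sketch under stated assumptions: the function names are hypothetical, the template is rectangular, partial results are combined by addition (one of the combination operations the text allows), and a position counts as a match only when all L indices agree.

```python
import numpy as np

def match_index_template(index_image, template):
    """For each block position, count how many exemplar indices equal the template's."""
    th, tw = template.shape
    ih, iw = index_image.shape
    out_h, out_w = ih - th + 1, iw - tw + 1
    scores = np.zeros((out_h, out_w), dtype=int)
    # One comparison per template block (L in total), combined by addition.
    for dy in range(th):
        for dx in range(tw):
            window = index_image[dy:dy + out_h, dx:dx + out_w]
            scores += (window == template[dy, dx])
    return scores

def target_positions(index_image, template):
    """Top-left block coordinates where every template index matches."""
    scores = match_index_template(index_image, template)
    return np.argwhere(scores == template.size)

if __name__ == "__main__":
    template = np.array([[1, 2],
                         [3, 1]])
    index_image = np.zeros((4, 4), dtype=int)
    index_image[1:3, 2:4] = template          # embed one target
    print(target_positions(index_image, template))  # [[1 2]]
```

Because the comparison runs over exemplar indices rather than source pixels, each position costs work proportional to the template size L in blocks, independent of the encoding block size.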
However, if the parallel (blockwise) classification operations are a factor of r more efficient than the sequential target recognition operations over each block, then the computational efficiency becomes $\eta$ = O(r·CRd/L), and the parallel efficiency $\eta_{\parallel}$ = O(r), neglecting partitioning overhead.

11.4.2.5. Observation. For purposes of illustration, let us now suppose that the classification algorithm discussed in the preceding section incurs K partitions over a P-processor architecture when processing is conducted over the source domain X. Further assume that an r-fold increase in sequential efficiency is realized per block. Since we can effectively reduce K by a factor of CRd via compressive processing, it would appear that $\eta_{\parallel}$ = O(r·CRd). However, the template matching algorithm requires that L compressed pixels be processed concurrently. Thus, there is an average overlap of $\sqrt{L}/2$ pixels per partition. As a result, the P-pixel partition that was realized through pointwise operations and isomorphic tessellation now incurs an overhead of $2\sqrt{L}$ pixels. Given K' = K/CRd nonoverlapping partitions of Y, each of size P pixels, the overlap requirement of $2 \cdot \sqrt{L}$ pixels per partition will induce $2K' \cdot \sqrt{L}/P$ extra partitions. This possible degradation in efficiency due to partitioning overhead in compressive target recognition is given by

[Equation 415: expression for $\Delta\eta$; not recoverable from the source text]

Note that $\Delta\eta$ is large when L and P are small and CRd is large. Thus, the maximization of blocksize, which increases CRd, also tends to decrease P. As a result, there is a design tradeoff between blocksize and compression ratio. Additionally, one must balance template size L with partition size P in order to achieve sufficient resolution (and, therefore, accuracy) in the template representation. Our current research emphasizes the derivation of optimization expressions for $\Delta\eta$.
In future research, we plan to investigate partitioning requirements for a variety of template configurations, including non-rectangular (i.e., L-shaped or serpentine) and sparse templates.

CHAPTER 12
CONCLUSIONS

We present conclusions in Section 12.1 as well as open research issues and suggestions for future work in Section 12.2.

12.1. Conclusions. This study supports the concept that one can derive operations which simulate image processing operations over the range spaces of various compressive transformations. In many cases of unary pointwise and global-reduce operations, the computational speedup equals or exceeds the domain compression ratio. However, the speedup obtained for image-template operations varies with the template configuration and image specification. For example, given an uncompressed domain X and a compressed domain Y, we have shown that morphological erosion and dilation over the range space of one-dimensional Boolean run-length encoded (RLE) imagery requires O(|Y|) additions, versus O(|X|) multiplications and comparisons. Similar claims hold for edge detection over VPIC-, BTC-, and JPEG-compressed 2-D imagery.

In contrast, binary pointwise operations over nonuniformly tessellated real-valued imagery that is compressed blockwise require O(|Y|²) time in the worst case, due to the overhead of O(|Y|) search operations incurred per point in Y. The search operations are required to find values in the compressed image that correspond to known points in the uncompressed image, to which the binary operations are referenced. This overhead also pertains to certain image-template operations, especially those with space-variant templates. However, the preceding disadvantage does not hold universally.
Instead, it appears that analytical derivations based upon $\epsilon$-near approximations to transforms or their inverses can be substituted for the actual transformation over whose range space an analogous or dual image-template operation is computationally burdensome. Using such approximations, we find that:

1. Certain image-template operations (notably, edge detection) can be approximated over a wide range of compression transformations (e.g., BTC, VPIC, JPEG, and RLE) by exploiting inherent properties of the compressed image, versus deriving an edge detection operation in closed form. This is an advantageous result, which incurs only O(|Y|) work.

2. Over the range spaces of transformations that preserve edge and boundary information (e.g., RLE and VPIC), component labelling is possible in O(|Y|) or O(|Y| log|Y|) work. This stands in contrast to the known minimum sequential bound of O(|X|) work required for sequential connected component labelling of convex components.

Congruent with Cosman's work, we find that the processing of vector-quantized imagery can often be performed efficiently on sequential or parallel architectures using codebook processing only, i.e., by manipulating the codebook exemplars with the source image operation. The chief advantage of this approach is the small amount of work required. For example, if the codebook size is M exemplars, O(M) work is required, versus O(|X|) work for the uncompressed image and a minimum of O(|Y|) work for the compressed image. Since

$$M < |Y| \qquad (416)$$

is usual, such operations can exhibit computational efficiency that exceeds the compression ratio. Unfortunately, the codebook manipulations required for many arithmetic operations tend to generate a codebook whose statistics poorly characterize the image training set from which the original codebook or codebooks were developed.
As the number of cascaded operations on the codebook(s) increases, the gradual departure of codebook statistics from image statistics may adversely impact the characterization of other images in the training set by processed codebooks.

A key question of this study pertained to the parallel processing of compressed imagery. Given CRa = |X|/|Y|, we have shown that the O(|CRa|) efficiencies realized on sequential architectures can, in certain cases, be replicated on parallel architectures, provided that the following constraints are met:

• For optimal efficiency, the image tessellation inherent in the compression or encryption transform should correspond to (i.e., be isomorphic to) the multiprocessor indexing scheme. For example, it has been found that SIMD-parallel processor meshes are well-suited to this type of processing, since their Euclidean coordinate system is isomorphic to the rectangular block structure of the encoding block grid employed by many customary compression and encryption transforms.

• The processing elements must be equipped with a set of operators that permits straightforward, efficient implementation of the analogous or dual operations. For example, if an analogous operation requires m-bit addition, that operation should be available on the processor array. Otherwise, the operation must be synthesized from native operations, and the synthesized operation may be inefficient. If the synthesized operation is I/O-intensive, then it is often the case that the efficiency of a given parallel analogous or dual operation will decrease per processor. This performance reduction factor generally degrades the performance of parallel compressive processing at least sublinearly, if not linearly.

• Compressive or encryptive processing operations should not require prohibitive I/O overhead to achieve a nominal reduction in computational (versus I/O) operations.
Since current parallel architectures have limited interprocessor connectivity, I/O requirements tend to consume processor cycles in work that may not be directly related to computation. If such I/O cost causes bottlenecks in interprocessor communication (and therefore, in computation), the performance gains that might otherwise be realized by compressive processing will not be apparent.

• When processing encrypted data, interprocessor communication channels (data buses) must be secure, and the data format produced by the encryption transform must not be altered by I/O operations across the processor mesh. For example, in the simple case of transposition ciphers, permutation across the mesh can inadvertently cause total or partial decryption of image neighborhoods. If the processor links are insecure, such partial plaintext can easily be made vulnerable to an adversary.

An additional problem that appears to affect much of practical compressive processing pertains to the accumulation of error in cascaded discrete operations over the range spaces of lossy transforms. Since such transforms have only approximations to their inverses (or, in certain cases, have no inverses), input error (manifested as information loss in the transform) can propagate through the compressive operations, gradually reducing the probability of correctly determining the greylevel of a given pixel. Fortunately, the problem of information loss does not customarily occur in encryptive transforms, because cryptosystems are usually isomorphic computational systems. As a result, analogues over the range spaces of encryptive transforms generally incur less information loss than their counterparts over compressive transformations. However, the preceding comments do not mean that encryptive computation is without information loss. In fact, the computational error associated with encryptive processing could, in extreme circumstances, preclude correct decryption.
For example, this unpleasant consequence could result from the application of a very large number of multi-stage operations (e.g., processing over the range space of DES transforms) when emulating the functionality of lengthy image processing algorithms. The accrued computational error could, in principle, exceed the precision required for discriminating character indices (i.e., ±0.5 when the index set is integral). In practice, such problems occur when analog machines (e.g., optical processors) are used to compute cryptographic or cryptanalytic transformations. The instability and error inherent in such processors are well known, and could be of sufficient magnitude to propagate through encryptive processing operations in the aforementioned manner.

As a result of the foregoing observations, we have several suggestions for open research issues and future work.

12.2. Open Issues and Future Work

The following high-level research questions present open issues resulting from the current study:

1. Are there basic mathematical structures that support a general formulation of image compression? For example, are there properties of image neighborhoods that facilitate compression by pixel-level processing? In 2-D RLE imagery, we know that a constant-valued region which accommodates a rectangle within its boundary would be preserved. If the region shape were isomorphic to the encoding block shape (customarily rectangular), then such regions would be amenable to compression. If such theory provided useful insights into a unification of image compression operations, then it might be possible to develop a rigorous, predictive technique for better matching compressive transforms to source imagery. Such knowledge would also be useful in deriving efficient compressive operations.

2.
Is the derivation of compressive operations over the range spaces of block encoding transforms (the predominant paradigm) limited to customary (i.e., rectangular) block geometries when the imagery has a Euclidean coordinate system? Or, are there other block geometries (e.g., interlocking arbitrarily-shaped blocks) that provide efficient tessellation and processing configurations for certain types of images? This issue pertains directly to image processing over hexagonal lattices, which is useful in the simulation of natural visual systems.

3. Can variable-blocksize encoding schemes (e.g., fractal-based encoding) be regularized at a resolution level related to a measure derived from the blocksize distribution? By regularization, we mean converting the variable-blocksize tessellation scheme into a uniform-blocksize tessellation. If this is possible, then the performance of a wide variety of pointwise and image-template operations over RLE- and IFS (fractal)-compressed imagery could be drastically improved. Whereas the computational cost is currently O(|Y|^2), it would be possible to achieve O(|Y|) cost, albeit with additional error due to the approximations involved (for example) in grouping small blocks into larger blocks.

4. Are there novel methods for image coding that minimize (or optimize with respect to operational constraints) the propagation of lossy transformation error through various compressive operations? If so, what classes or subclasses of our transform taxonomy appear to provide improved results? This concept would be especially useful in the processing of radiological imagery (e.g., detection of small precancerous lesions in mammograms or MRI scans). In such cases, accuracy of object location can be constrained to pixel or sub-pixel resolution. However, image display errors alone often preclude such discrimination, and these errors can increase when computational error due to image processing is factored into the error budget.
Hence, without improved error prediction methods, this remains an unsettled question.

5. What interprocessor connection schemes tend to render encrypted imagery more vulnerable, due to the inadvertent inversion of transpositions that may be effected by the encryption transform? This issue will become more important as interprocessor connectivity schemes become more flexible and such connectivity increases. For example, in a butterfly (FFT) processor, the generation of permutation sequences is nearly trivial. Could the presence of such permutations defeat encryptive processing on highly connected processor networks? We plan to address combinatorial aspects of this question in future research.

Issue 1, the elucidation of a mathematically tractable, unifying structure for image compression, is analogous to much ongoing work in the mathematics of encryption. Although the latter has been well supported by the defense community, fundamental research in image compression has, until recently, lagged considerably behind encryption research. With the increasing use of wide-area networks for image transmission, compression theory is receiving more attention in the literature. Nevertheless, current trends in development appear to support the determination of minimum image information content, rather than theory that would predict image compressibility in relation to physical constraints (e.g., visual picture quality).

Ongoing and future work in the area of processing compressed imagery is expected to emphasize the following practical issues:

• Implementation of compressive processing algorithms that were developed in this study on the PAL-I and PAL-II architectures. PAL is a novel, high-bandwidth SIMD-parallel architecture that is based upon Lockheed-Martin's GAPP-IV processor and directly implements a large subset of nonrecursive image algebra.
Although PAL was scheduled to be available for performance testing during the term of this study, various delays in hardware and software development made PAL implementation infeasible within the allotted time. Thus, we expect to begin early PAL implementations in late 1996.

• Further computation of error figures and simulation of error propagation in compressive processing. Apart from the issue of rendering image-template operations more efficient, the adverse effects of error propagation through lossy transforms and cascaded compressive operations remain the chief disadvantage of compressive processing. This effect must be researched in detail at various resolution and computational precision levels, in order to determine limits of detectability for known objects in compressively processed imagery. With such results, we could better determine the feasibility of compressive processing for on-board target recognition applications using high-bandwidth processors (such as PAL) at high frame rates.

• Investigation of template decomposition procedures for more efficient compressive image-template operations. The difficulty with which complicated space-variant template configurations are implemented over the range space of certain compressive transforms leads naturally to the conjecture that compressive processing could benefit from the large amount of work performed in template decomposition. This appears to be particularly true in the case of template decompositions for block-encoded imagery, due to problems associated with representing template blocks in terms of complete encoding blocks. Such representational problems are especially important for template specification in target recognition applications, where the preservation of template information can be crucial to the compressive detection of partially resolved targets (several pixels to hundreds of pixels on target).

JPEG compressive operations present many unresolved issues.
For example, computation of the quality and smoothing parameters varies with JPEG implementation and is occasionally proprietary. How are such parameters related to the information content of the DCT coefficients that are retained or discarded? What, if any, spatial information (e.g., source image MTF) is accounted for in such measures? (The MTF is useful as a measure of spatial frequencies discarded by the DCT coefficient quantization step.) Additionally, the computation of JPEG's Huffman codebook has not been extensively reported for commercial JPEG implementations. Although we do not consider the Huffman encoding aspect of JPEG in this study, it is certainly important to know how such encoding occurs in a practical implementation, in order to access the reduced matrix of DCT coefficients. Thus, we plan to survey JPEG algorithms and their manufacturers to obtain several commercial JPEG algorithms whose output can be inverted piecewise.

We also plan to investigate information loss associated with VPIC edge detection and morphological operations for a variety of textual (Boolean) or natural imagery. The results of such research are expected to provide useful information for the design of improved VPIC edge and erosion/dilation codebooks. Based upon our current research results, we further expect that an in-depth understanding of VPIC's error behavior for simple (i.e., von Neumann) edge detection and morphological structuring elements would facilitate further improvements in VPIC morphological operations with multi-pixel and non-von Neumann templates.

The issue of error statistics is important and cannot be overlooked (or overstressed) in ATR applications. This issue is of particular interest when the ATR algorithm perturbs the input noise distribution in an unexpected manner. For example, we have presented various error figures for VPIC, JPEG, and image rotation (Chapter 8).
In each case, although the input image was generally perturbed with Gaussian noise produced by the imaging system, the error statistics were markedly non-Gaussian, and included multimodal distributions. We are interested in determining not only the magnitude and probability of such errors, but also the form of their distributions, where possible. From such research, we eventually hope to gain a better (and possibly, an analytical or formulaic) understanding of how error propagates through ATR algorithms and their compressive analogues.

For example, consider that the gradient magnitude distribution of a natural image is frequently Lorentzian [86]. Regardless of whether the image is perturbed with Poisson or Gaussian noise, the gradient distribution remains Lorentzian. The reasons for this behavior are not apparent. From the definition of an analogue, we expect that such effects are produced by a given compressive analogue of a gradient operation. However, comprehensive testing of the preceding hypothesis remains to be completed. Now, suppose we perturb the analogous gradient operation such that significant computational error (approximating the input noise magnitude) is introduced into the gradient output. Will the gradient distribution remain Lorentzian? Such questions have not been answered in the literature, and represent a large, open area of research in the statistics of image processing that would have crucial impact on error-dependent processes. Medical image understanding (e.g., recognition of known tumor types) and ATR of small targets represent key applications of such analyses, and could benefit from improvements in computational accuracy. The results of such research could enhance and expand analytical knowledge that is often lacking when error analyses of complex algorithms are attempted.

REFERENCES

[1] Cosman, P.C., K.L. Oehler, E.A. Riskin, and R.M. Gray. "Using vector quantization for image processing", Proceedings of the IEEE 81:1325-1341 (1993).
[2] Schmalz, M.S. "General theory for the processing of compressed and encrypted imagery, with taxonomic analysis", Proceedings SPIE 1702:250-263 (1992).
[3] Schmalz, M.S. "On the processing of compressed and encrypted imagery: Complexity analyses, with application to novel regimes of efficient computation", in Proceedings SPIE 1700 (1992).
[4] Schmalz, M.S. "On the processing of compressed data: Application to morphological and grey-scale computations over runlength-, block-, transform-, and derivative-encoded imagery", Proceedings SPIE 1702:60-75 (1992).
[5] Schmalz, M.S. "On the processing of runlength-encoded Boolean imagery", in Proceedings SPIE 1955 (1993).
[6] Schmalz, M.S. "On the parallel processing of runlength-encoded Boolean imagery", in Proceedings SPIE 1955 (1993).
[7] Leger, A., T. Omachi, and G.K. Wallace. "JPEG still picture compression algorithm", Optical Engineering 30:947-954 (1991).
[8] Modestino, J.W. and Y.H. Kim. "Adaptive entropy-coded predictive vector quantization of images", IEEE Transactions on Signal Processing 40:633-644 (1992).
[9] Chen, D. and A.C. Bovik. "Visual pattern image coding", IEEE Transactions on Communications 38:2137-2146 (1990).
[10] Duff, I.S., R.G. Grimes, and J.G. Lewis. "Sparse matrix test problems", ACM Transactions on Mathematical Software 15:1-14 (1989).
[11] Liu, J.W.H. "The multifrontal method for sparse matrix solution: theory and practice", SIAM Review 34:82-109 (1992).
[12] Oehler, K.L., P.C. Cosman, R.M. Gray, and J. May. "Classification using vector quantization", in Proceedings of the 25th Annual Asilomar Conference on Signals, Systems, and Computers (Pacific Grove, CA, Nov. 1991), pp. 439-445 (1991).
[13] Netravali, A.N. and J.O. Limb. "Picture coding: A review", Proceedings of the IEEE 68:366-406 (1980).
[14] Zschunke, W. "DPCM picture coding with adaptive prediction", IEEE Transactions on Communications COM-25(11):1295-1302 (1977).
[15] Li, W. and Y.Q. Zhang. "Vector-based signal processing and quantization for image and video compression", Proceedings of the IEEE 83:317-335 (1995).
[16] Healy, D.J. and O.R. Mitchell. "Digital video bandwidth compression using block truncation encoding", IEEE Transactions on Communications COM-29(12):1809-1817 (1981).
[17] Kunt, M., M. Bernard, and R. Leonardi. "Recent results in high-compression image coding", IEEE Transactions on Circuits and Systems CAS-34(11):1306-1336 (1987).
[18] Baron, S. and W.R. Wilson. "MPEG overview", SMPTE Journal 103:391-394 (1994).
[19] Felician, L. and A. Gentili. "A nearly-optimal Huffman encoding technique in the microcomputer environment", Information Systems (UK) 12(4):371-373 (1987).
[20] Shoaff, W.D. "The Singular Value Decomposition Implemented in Image Algebra", M.S. Thesis, Department of Computer and Information Sciences, University of Florida (1986).
[21] Ohta, M. and S. Nogaki. "Hybrid picture coding with wavelet transform and overlapped motion-compensated interframe prediction coding", IEEE Transactions on Signal Processing 41:3416-3424 (1993).
[22] Dettmer, R. "Form and function: Fractal-based image compression", IEE Review 38:323-327 (1992).
[23] Bani-Eqbal, B. "Enhancing the speed of fractal image compression", Optical Engineering 34:1705-1710 (1995).
[24] Mandelbrot, B.B. The Fractal Geometry of Nature, 2nd Edition, New York: W.H. Freeman (1983).
[25] Barlow, H.B. and P. Fatt, Eds. Vertebrate Photoreception, New York: Academic Press (1976).
[26] Kahn, D. The Codebreakers, New York: Macmillan (1967).
[27] Sinkov, A. Elementary Cryptanalysis: A Mathematical Approach, Washington, DC: Mathematical Association of America (1966).
[28] Koblitz, N. A Course in Number Theory and Cryptography, 2nd Edition, New York: Springer-Verlag (1994).
[29] Meyer, C.H.
Cryptography: A New Dimension in Computer Security, New York: Wiley (1982).
[30] Patterson, W. Mathematical Cryptology for Computer Scientists and Mathematicians, Totowa, NJ: Rowman and Littlefield (1987).
[31] Nechvatal, J. Public-Key Cryptography, Gaithersburg, MD: US Department of Commerce, National Institute of Standards and Technology (1991).
[32] Fisher, D. and I. Bahl, Eds. Gallium Arsenide IC Applications Handbook, San Diego: Academic Press (1995).
[33] Goldberg, D.E. Genetic Algorithms in Search, Optimization, and Machine Learning, Reading, MA: Addison-Wesley (1989).
[34] van der Bank, D.J. "The Use of Genetic Algorithms for Cryptanalysis", Ph.D. Dissertation, University of Pretoria, SA (1992).
[35] Karnin, E.D. "Probabilistic and Computational Methods in Cryptography", Ph.D. Dissertation, Stanford University (1984).
[36] Chor, B.Z. Two Issues in Public-Key Cryptography: RSA Bit Security and a New Knapsack Type System, Cambridge, MA: MIT Press (1986).
[37] Cobb, S.E. Public-Key Cryptography and the Zero-One Knapsack Problem, M.Sc. Thesis, Carleton University, Canada (1983).
[38] Jan, J.K. and S.J. Wang. "A dynamic access control scheme based upon the Knapsack problem", Computers and Mathematics with Applications 26:75-86 (1993).
[39] Amirazizi, H.R. "Time-Memory Processor Tradeoffs (Cryptography, Multiple Encryption, Knapsack Problem)", Ph.D. Dissertation, Stanford University (1986).
[40] Evaluated in Hartman, W.J. "A Critique of Some Public-Key Cryptosystems", Report 126-D-3, U.S. Department of Commerce, National Telecommunications and Information Administration (1980).
[41] Takagi, N. "A radix-4 modular multiplication hardware algorithm for modular exponentiation", IEEE Transactions on Computers 41:949-956 (1992).
[42] Rivest, R.L., L. Adleman, and M.L. Dertouzos. "On data banks and privacy homomorphisms", in Foundations of Secure Computation, R.A. DeMillo, D.P. Dobkin, A.K. Jones, and R.J. Lipton, Eds., New York: Academic Press, pp.
169-179 (1978).
[43] Abadi, M., J. Feigenbaum, and J. Kilian. "On hiding information from an oracle", Journal of Computer and System Sciences 39:21-50 (1989).
[44] Ahituv, N., Y. Lapid, and S. Neumann. "Processing encrypted data", Communications of the ACM 30:777-779 (1987).
[45] Yu, K.W. and T.L. Yu. "Superimposing encrypted data", Communications of the ACM 34:49-54 (1991).
[46] Nechvatal, J. Public-Key Cryptography, NIST Special Publication 800-2, Gaithersburg, MD: Department of Commerce (1991).
[47] Jawerth, B. and W. Sweldens. "An overview of wavelet based multiresolution analyses", SIAM Review 36:377-412 (1994).
[48] Jolion, J.M. and A. Montanvert. "The adaptive pyramid: A framework for 2D image analysis", CVGIP: Image Understanding 55:339-348 (1992).
[49] Healy, D.J. [On edge detection over locally averaged images]. Unpublished report for AFOSR Summer Research Grant (1978).
[50] Evaluated in Burt, P.J. and E.H. Adelson. "The Laplacian pyramid as a compact image code", IEEE Transactions on Communications COM-31:532-540 (1983).
[51] Schmalz, M.S. "On the processing of compressed imagery. 2. Compressive operations with VPIC-, BTC-, VQ-, and JPEG-compressed imagery", in Proceedings SPIE 2751 (1996).
[52] Schmalz, M.S. "Automated target recognition in compressed stereo images", to appear in Proceedings SPIE 1996 Technical Symposium, Denver, CO, August 1996.
[53] Schmalz, M.S. "Automated detection and recognition of small targets in compressed imagery. 1. Background and theory", in Proceedings SPIE 2765 (1996).
[54] Ritter, G.X. and J.N. Wilson. Handbook of Image Processing Algorithms in Image Algebra, Boca Raton, FL: CRC Press (1996).
[55] Ritter, G.X. Image Algebra (In preparation, partial manuscript available via anonymous FTP through jnw@cis.ufl.edu).
[56] Schmalz, M.S. and G.X. Ritter. "Image-algebraic design of multispectral target recognition algorithms", in Proceedings SPIE 2300:213-228 (1994).
[57] Li, D. "Recursive Operations in the Image Algebra and their Applications to Image Processing", Ph.D. Dissertation, Department of Computer and Information Sciences, University of Florida, Gainesville, FL (1990).
[58] Koza, J. Genetic Programming II: Automatic Discovery of Reusable Programs, Cambridge, MA: MIT Press (1994).
[59] National Research Council (USA). Computers at Risk: Safe Computing in the Information Age, Washington, DC: National Academy Press (1991).
[60] Bevington, P.R. Data Reduction and Analysis for the Physical Sciences, New York: McGraw-Hill (1969).
[61] Clifford, A.A. Multivariate Error Analysis: A Handbook of Error Propagation and Calculation in Many-Parameter Systems, London: Applied Science Publishers (1973).
[62] Schmalz, M.S., G.X. Ritter, C.T. Yang, and W.C. Hu. "Center-surround filters for the detection of small targets in cluttered multispectral imagery. 2. Analysis of errors and filter performance", in Proceedings SPIE 2496 (1995).
[63] Sigelman, J. Retinal Diseases: Pathogenesis, Laser Therapy, and Surgery, Boston, MA: Little, Brown (1984).
[64] Chern, W.S. "Error Propagation in Discrete Processes", Ph.D. Dissertation, Department of Electrical Engineering, University of Florida, Gainesville, FL (1970).
[65] Jiang, L.M., T.F. Lee, and T.T. Hwang. "Performance-driven interconnection optimization for microarchitecture synthesis", IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 13:137-149 (1994).
[66] Pardalos, P.M., Ed. Advances in Optimization and Parallel Computing, New York: North-Holland (1992).
[67] Kim, W.Y. "3-D object recognition using bipartite matching embedded in discrete relaxation", IEEE Transactions on Pattern Analysis and Machine Intelligence 13:224-251 (1994).
[68] Shannon, C. "Communication theory of secrecy systems", Bell System Technical Journal 28:656-715 (1949).
[69] Ash, R.B. Information Theory, New York: Interscience (1965).
[70] Abramson, N.
Information Theory and Coding, New York: McGraw-Hill (1963).
[71] Kersten, D. "Predictability and redundancy of natural images", Journal of the Optical Society of America A 4:2395-2400 (1987).
[72] Akl, S. The Design and Analysis of Parallel Algorithms, Englewood Cliffs, NJ: Prentice Hall (1989).
[73] Silvera, E. "Adaptive pattern recognition with rotation, scale, and shift invariance", Applied Optics 34:1891-1900 (1995).
[74] Cornsweet, T. Visual Perception, New York: Academic Press (1970).
[75] Zrenner, E. Physiology of the Human Eye and Visual System, Hagerstown, MD: Harper & Row (1979).
[76] Haralick, R., K. Shanmugam, and I. Dinstein. "Textural features for image classification", IEEE Transactions on Systems, Man, and Cybernetics SMC-3:610-621 (1973).
[77] Haralick, R. "Statistical and structural approaches to textures", Proceedings of the IEEE 67:786-804 (1979).
[78] Weszka, J., C. Dyer, and A. Rosenfeld. "A comparative study of texture measures for terrain classification", IEEE Transactions on Systems, Man, and Cybernetics SMC-6:269-285 (1976).
[79] Pal, N.R., J.C. Bezdek, and E.C.K. Tsao. "Generalized clustering networks and Kohonen's self-organizing scheme", IEEE Transactions on Neural Networks 4:549-557 (1993).
[80] Kim, Y.K. and J.B. Ra. "Adaptive learning method in self-organizing map for edge preserving vector quantization", IEEE Transactions on Neural Networks 6:278-280 (1995).
[81] Yang, C., W.C. Hu, M.S. Schmalz, and G.X. Ritter. "Center-surround filters for the detection of small targets in cluttered multispectral imagery. 1. Background and filter design", in Proceedings SPIE 2496 (1995).
[82] Prewitt, J.M.S. "Object enhancement and extraction", in B.S. Lipkin and A. Rosenfeld, Eds., Picture Processing and Psychopictorics, New York: Academic Press (1970).
[83] Sobel, I.E. Camera Models and Machine Perception, Ph.D. Dissertation, Stanford University, Stanford, CA (1970).
[84] Kirsch, R.A. "Computer determination of the constituent structure of biological images", Computers and Biomedical Research 4(3):315-328 (1971).
[85] Gonzalez, R.C. and P. Wintz. Digital Image Processing, Reading, MA: Addison-Wesley (1987).
[86] Lettington, A. and Q.H. Hong. "Interpolator for infrared images", Optical Engineering 33:725-729 (1994).

BIOGRAPHICAL SKETCH

Mark Schmalz was awarded the baccalaureate degree in applied physics (minor in electrical engineering) from Michigan Technological University in 1974. Following conferral of the O.D. degree in optometry (minor in optics) from Pacific University in 1979, Dr. Schmalz maintained an active clinical practice from 1980 to 1984. From 1982 through 1988, the author conducted research at the Harbor Branch Foundation and the Harbor Branch Oceanographic Institution (Florida). Since 1989, he has conducted research at the Department of Computer and Information Science and Engineering, University of Florida, Gainesville, where he received the M.S. in Computer and Information Sciences in 1991. Dr. Schmalz has published over 60 research papers in automated target recognition, optical propagation theory, image compression and restoration, parallel computation, and cryptography.

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
Gerhard X. Ritter, Chairman
Professor of Computer and Information Science and Engineering

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
Joseph N.
Wilson
Assistant Professor of Computer and Information Science and Engineering

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
Andrew F. Laine
Associate Professor of Computer and Information Science and Engineering

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
Sanguthevar Rajasekaran
Associate Professor of Computer and Information Science and Engineering

I certify that I have read this study and that in my opinion it conforms to acceptable standards of scholarly presentation and is fully adequate, in scope and quality, as a dissertation for the degree of Doctor of Philosophy.
John Harris
Associate Professor of Electrical and Computer Engineering

This dissertation was submitted to the Graduate Faculty of the College of Engineering and to the Graduate School and was accepted as partial fulfillment of the requirements for the degree of Doctor of Philosophy.
August 1996

Winfred M. Phillips
Dean, College of Engineering

Karen M. Holbrook
Dean, Graduate School |